Arize Phoenix

Arize Phoenix is an open-source observability and evaluation tool for LLM applications, built on OpenTelemetry and the OpenInference tracing standard. Run it locally in a notebook or self-host it to capture the full trace of a run, evaluate outputs (including with LLM-as-judge evaluators), and debug RAG and agent runs. Because it is open source and runs locally or self-hosted, your traces never have to leave your environment.

It is aimed at engineers who want vendor-neutral observability they can spin up in a notebook during development and self-host in production. Phoenix is the open-source companion to Arize's commercial platform, so you can start free and graduate to the managed product if you outgrow it.

Highlights

OpenTelemetry-native tracing — instrument with open standards (OpenInference), avoiding lock-in to one vendor's SDK.
Run anywhere — launch locally in a notebook for dev, or self-host for team/production use.
Built-in evals — LLM-as-judge and other evaluators for relevance, hallucination, and RAG quality.
RAG & agent debugging — inspect retrieval steps, tool calls, and the full span tree behind an answer.
Framework-agnostic — works across common LLM and orchestration stacks via auto-instrumentation.
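
To make "the full span tree behind an answer" concrete, here is a minimal, standard-library-only sketch of what a trace looks like as a tree. The Span class, its fields, and the example run are hypothetical illustrations, not Phoenix's data model; the kind labels are loosely modeled on OpenInference span kinds.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Span:
    """One step in a run (hypothetical shape, not Phoenix's data model)."""
    name: str
    kind: str          # e.g. "CHAIN", "RETRIEVER", "TOOL", "LLM"
    latency_ms: float
    children: list[Span] = field(default_factory=list)


def render(span: Span, depth: int = 0) -> list[str]:
    """Return one indented line per span, walking the tree depth-first."""
    lines = [f"{'  ' * depth}{span.kind:<9} {span.name} ({span.latency_ms:.0f} ms)"]
    for child in span.children:
        lines.extend(render(child, depth + 1))
    return lines


def leaves(span: Span):
    """Yield every span that has no children."""
    if not span.children:
        yield span
    for child in span.children:
        yield from leaves(child)


def slowest_leaf(span: Span) -> Span:
    """Pick the leaf span with the highest latency."""
    return max(leaves(span), key=lambda s: s.latency_ms)


# A made-up RAG run: one retrieval step, one tool call, one generation step.
root = Span("rag_query", "CHAIN", 1200, [
    Span("vector_search", "RETRIEVER", 180),
    Span("calculator", "TOOL", 40),
    Span("generate_answer", "LLM", 950),
])

print("\n".join(render(root)))
print("slowest step:", slowest_leaf(root).name)
```

Reading a run this way is what makes debugging practical: a bad answer can usually be traced to one child span, such as a retrieval step that returned the wrong documents or a slow tool call.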

In an AI-assisted workflow

import phoenix as px
px.launch_app()          # local UI for traces + evals
# auto-instrument your LLM/agent calls, then inspect spans and run evaluators

TIP

Because Phoenix speaks OpenTelemetry, the instrumentation you add is portable — you can ship the same traces to another OTel-compatible backend later without re-instrumenting.
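
As a rough illustration of that portability, the sketch below resolves where traces would be sent using the endpoint environment variables defined by the OpenTelemetry specification. The helper name traces_endpoint is hypothetical, and the default assumes a Phoenix instance running locally on port 6006; real OTLP exporters read these variables themselves, so treat this as a model of the behavior rather than code you need to write.

```python
import os

# Assumption: a Phoenix instance running locally on its default port.
DEFAULT_BASE = "http://localhost:6006"


def traces_endpoint(env=None) -> str:
    """Resolve the OTLP/HTTP traces URL from OpenTelemetry env variables.

    A signal-specific variable is used as-is; the generic one is treated
    as a base URL and gets the /v1/traces path appended.
    """
    env = os.environ if env is None else env
    explicit = env.get("OTEL_EXPORTER_OTLP_TRACES_ENDPOINT")
    if explicit:
        return explicit
    base = env.get("OTEL_EXPORTER_OTLP_ENDPOINT", DEFAULT_BASE)
    return base.rstrip("/") + "/v1/traces"


# Same application code, different backend: only the environment changes.
print(traces_endpoint({}))
print(traces_endpoint({"OTEL_EXPORTER_OTLP_ENDPOINT": "https://collector.example.com"}))
```

The point of the design is that the destination lives in configuration, so moving to another OTel-compatible backend does not touch the instrumented code.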

Good to know

Phoenix is open source and free to self-host; you bring an LLM provider for judge-based evals. Arize also offers a managed platform for teams that want hosted scale and support. For a hosted-first open-source option, compare Langfuse; for the commercial LangChain-native option, LangSmith.
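
"You bring an LLM provider" means the judge is just a model call you supply. The sketch below shows the general shape of an LLM-as-judge hallucination check using only the standard library. The prompt wording, the judge_answer function, and the stand-in model are all hypothetical; this is not Phoenix's evaluator API, only an illustration of the pattern.

```python
from typing import Callable, Optional

# Hypothetical judge prompt; real evaluators ship their own templates.
PROMPT = (
    "You are grading an answer produced by a RAG system.\n"
    "Question: {question}\n"
    "Retrieved context: {context}\n"
    "Answer: {answer}\n"
    "Reply with exactly one word: factual or hallucinated."
)

VALID_LABELS = {"factual", "hallucinated"}


def judge_answer(
    question: str,
    context: str,
    answer: str,
    call_llm: Callable[[str], str],
) -> dict:
    """Ask a judge model for a label and map it to a numeric score."""
    prompt = PROMPT.format(question=question, context=context, answer=answer)
    raw = call_llm(prompt).strip().lower()
    label = raw if raw in VALID_LABELS else "unparseable"
    score: Optional[float] = None
    if label == "factual":
        score = 1.0
    elif label == "hallucinated":
        score = 0.0
    return {"label": label, "score": score}


def stand_in_llm(prompt: str) -> str:
    """Placeholder for a real provider call; always answers 'factual'."""
    return "Factual\n"


result = judge_answer(
    question="What port does the service use?",
    context="The service listens on port 8080.",
    answer="It uses port 8080.",
    call_llm=stand_in_llm,
)
print(result)
```

Passing the model in as a plain callable keeps the evaluation logic independent of any one provider, and handling unparseable replies explicitly avoids silently scoring a malformed judge response.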

Frequently asked questions

What is Arize Phoenix?

Arize Phoenix is an open-source observability and evaluation tool for LLM applications, built on OpenTelemetry and the OpenInference tracing standard. It captures the full trace of a run, runs evals (including LLM-as-judge) for relevance, hallucination, and RAG quality, and lets you debug RAG and agent runs by inspecting the span tree behind an answer.

Is Arize Phoenix free?

Yes — Phoenix is open source and free to self-host, and because it runs locally or self-hosted, your traces never have to leave your environment. You bring an LLM provider for judge-based evals, and Arize offers a managed commercial platform if you outgrow it.

How do I use Arize Phoenix?

Launch it locally with import phoenix as px; px.launch_app(), auto-instrument your LLM or agent calls, then inspect spans and run evaluators in the local UI. Because Phoenix speaks OpenTelemetry, the instrumentation is portable — you can ship the same traces to another OTel-compatible backend later without re-instrumenting.

How does Arize Phoenix compare to Langfuse?

Both are open-source LLM observability tools. Phoenix is OpenTelemetry-native and built to run anywhere — in a notebook during development or self-hosted in production — while Langfuse is the hosted-first open-source option; LangSmith is the commercial LangChain-native alternative.
