# Pipecat

> An open-source Python framework for real-time voice and multimodal conversational AI — it orchestrates streaming STT, LLM, and TTS into composable pipelines.

Website: https://pipecat.ai

Pipecat is an open-source Python framework for **real-time voice and multimodal conversational AI**. It solves the hard, generic part of a [voice agent](/guides/voice/build-a-voice-agent): orchestrating the streaming STT → LLM → TTS loop, managing the audio transport, and handling turn-taking — all as composable pipelines. You assemble a pipeline from provider-backed components and Pipecat runs the real-time hand-offs, so you focus on the agent's behavior rather than the streaming infrastructure.

It's aimed at developers building custom voice agents who want best-of-breed providers per stage instead of a single bundled API — and the control over latency, cost, and model choice that brings.

## Highlights

- **Composable real-time pipeline** — wire streaming STT, LLM, and TTS into one low-latency loop.
- **Broad integrations** — works with dozens of STT/LLM/TTS providers (Deepgram, ElevenLabs, OpenAI, Anthropic, and many more).
- **Transports built in** — WebRTC and WebSocket for browser, phone, and app audio.
- **Turn-taking & interruptions** — voice-activity detection, endpointing, and barge-in handled in the framework.
- **Single or multi-agent** — compose one agent or coordinate several with handoff and parallel processing.
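
To make the turn-taking bullet concrete, here is a minimal conceptual sketch of barge-in using only `asyncio`. It does not use Pipecat's API: `speak` and `bot_turn_with_barge_in` are hypothetical stand-ins, and voice-activity detection is simulated with a fixed number of event-loop turns. The idea it illustrates is that bot playback runs as a cancellable task, so detected user speech can cut it off mid-utterance.

```python
import asyncio


async def speak(words, spoken):
    """Stand-in for TTS playback: 'plays' one word per event-loop turn."""
    for word in words:
        spoken.append(word)
        await asyncio.sleep(0)  # yield control so a cancellation can land


async def bot_turn_with_barge_in(words, user_speaks_after):
    """Start speaking, then cancel playback once the (simulated) user talks."""
    spoken = []
    playback = asyncio.create_task(speak(words, spoken))
    # Simulated voice-activity detection (an assumption for this sketch):
    # the user starts talking after `user_speaks_after` event-loop turns.
    for _ in range(user_speaks_after):
        await asyncio.sleep(0)
    playback.cancel()  # barge-in: stop the bot mid-utterance
    try:
        await playback
    except asyncio.CancelledError:
        pass  # expected: playback was interrupted on purpose
    return spoken


reply = ["Sure,", "here", "is", "a", "long", "answer"]
spoken = asyncio.run(bot_turn_with_barge_in(reply, user_speaks_after=2))
print(spoken)  # a prefix of `reply`, cut short by the interruption
```

In a real agent the interruption signal would come from live audio rather than a counter, and the framework handles that wiring so you do not have to write it yourself.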

## In an AI-assisted workflow

```python
# A Pipecat pipeline wires the streaming stages into one real-time loop.
from pipecat.pipeline.pipeline import Pipeline

# transport, stt, llm, and tts are provider-backed components created elsewhere.
pipeline = Pipeline([transport.input(), stt, llm, tts, transport.output()])
```

Swap any stage's provider without rewriting the loop — the pipeline structure stays the same.
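
Why swapping is cheap can be sketched with plain Python async generators. This is an illustration of the pattern only, not Pipecat's classes: `mic`, `fake_stt`, `echo_llm`, `shout_llm`, `fake_tts`, and `run_pipeline` are all hypothetical names, and the "audio" is just strings. Each stage consumes one stream and yields another, so replacing a stage means changing a single list entry.

```python
import asyncio


async def mic(chunks):
    """Hypothetical audio source: yields fake 'audio' chunks."""
    for chunk in chunks:
        yield chunk


async def fake_stt(stream):
    """Stand-in STT: strips the 'audio:' prefix to produce a transcript."""
    async for chunk in stream:
        yield chunk.split(":", 1)[1]


async def echo_llm(stream):
    """Stand-in LLM provider A: echoes the transcript back."""
    async for text in stream:
        yield f"You said {text}"


async def shout_llm(stream):
    """Stand-in LLM provider B: same interface, different behavior."""
    async for text in stream:
        yield text.upper()


async def fake_tts(stream):
    """Stand-in TTS: tags the text as synthesized speech."""
    async for text in stream:
        yield f"speech:{text}"


def run_pipeline(source, stages):
    """Chain each stage onto the previous stream and collect the output."""
    async def _run():
        stream = source
        for stage in stages:
            stream = stage(stream)
        return [item async for item in stream]
    return asyncio.run(_run())


# Same pipeline shape, different LLM stage.
out_a = run_pipeline(mic(["audio:hi"]), [fake_stt, echo_llm, fake_tts])
out_b = run_pipeline(mic(["audio:hi"]), [fake_stt, shout_llm, fake_tts])
print(out_a, out_b)
```

Only the middle entry of the stage list differs between the two runs; the source, the other stages, and the runner are untouched.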

> [!TIP]
> Pipecat shines when you want to mix providers — e.g. Deepgram for STT, your own LLM via a gateway, and ElevenLabs for TTS — and still get tuned turn-taking and barge-in for free. For a single-vendor prototype, a bundled voice-agent API is faster to start.

## Good to know

Pipecat is open source (BSD-2-Clause) and free; you pay the underlying STT/LLM/TTS providers for usage. It's a Python framework you run yourself (locally or in the cloud), with WebRTC/WebSocket transports for getting audio in and out. To pick the providers it orchestrates, compare [Deepgram](/tools/deepgram) (STT) and [ElevenLabs](/tools/elevenlabs) (TTS); the [voice-agent-engineer](/agents/data-ai/voice-agent-engineer) builds and tunes the whole pipeline.

---

_Source: https://agentscamp.com/tools/pipecat — Tool on AgentsCamp._