Chroma
An open-source, Python-first vector database that runs in-process — the fastest path from pip install to a working retrieval prototype.
Chroma is an open-source, Python-first vector database that runs embedded in your process: pip install, create a collection, add documents, and query, often without wiring up an embedding model yourself. It is a common default for prototypes and notebooks, with a client-server mode and Chroma Cloud for when you outgrow embedded use.
Chroma is an open-source vector database designed for developer experience. It runs in-process by default, with no server to start, so you can go from pip install chromadb to a working retrieval loop in a handful of lines. It also ships a default embedding function, so you don't have to wire up an embedding provider to get started. That low-friction path makes Chroma a common first vector store in prototypes, notebooks, and demos.
It is aimed at developers building and iterating on retrieval who want to move fast and add infrastructure only when they need it. When you outgrow embedded, Chroma also runs as a client-server deployment, and Chroma Cloud offers a managed, serverless option with the same API.
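As a rough illustration of moving from embedded to client-server without touching the rest of the code, the sketch below picks a client by deployment mode. The function name `make_client`, the mode names, and the `CHROMA_PATH`, `CHROMA_HOST`, and `CHROMA_PORT` environment variables are hypothetical conventions for this example, not part of Chroma. Chroma Cloud is left out here.

```python
import os


def make_client(mode: str = "embedded"):
    """Return a Chroma client for the given deployment mode.

    The mode names and CHROMA_* environment variables are hypothetical
    conventions for this sketch, not something Chroma defines.
    """
    if mode not in ("embedded", "server"):
        raise ValueError(f"unknown mode: {mode!r}")

    # Imported lazily so the mode check above runs even where chromadb
    # is not installed.
    import chromadb

    if mode == "embedded":
        # In-process client that persists to a local directory.
        return chromadb.PersistentClient(
            path=os.environ.get("CHROMA_PATH", "./chroma")
        )

    # Client-server: talks over HTTP to a Chroma server you run separately.
    return chromadb.HttpClient(
        host=os.environ.get("CHROMA_HOST", "localhost"),
        port=int(os.environ.get("CHROMA_PORT", "8000")),
    )
```

Either client exposes the same collection calls, so something like `make_client("server").get_or_create_collection("docs")` should behave like its embedded counterpart, assuming a server is reachable at the configured host and port.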
Highlights
- Embedded by default — runs inside your application process against local persistence; no separate service to deploy.
- Batteries-included DX — collections, a default embedding function, and metadata filtering in a small, friendly Python (and JS) API.
- Metadata filtering — attach metadata to documents and filter on it (where clauses) at query time.
- Grows with you — the same API runs in-process, as a client-server backend, or on managed Chroma Cloud.
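To make the where clauses mentioned above a little more concrete, here is a minimal sketch of a helper that builds a filter from equality conditions. The helper name `build_where` and the `lang` metadata key are hypothetical. The sketch assumes Chroma's operator syntax, in which a single condition can use `$eq` and several conditions are combined under `$and`.

```python
from typing import Any, Dict, Optional


def build_where(**equals: Any) -> Optional[Dict[str, Any]]:
    """Build a Chroma-style `where` filter from equality conditions.

    Returns None when no conditions are given, meaning "no filter".
    """
    conditions = [{key: {"$eq": value}} for key, value in equals.items()]
    if not conditions:
        return None
    if len(conditions) == 1:
        # A single condition can be passed on its own.
        return conditions[0]
    # Several conditions are wrapped in one top-level $and operator.
    return {"$and": conditions}


# Hypothetical metadata keys, for illustration only.
billing_only = build_where(product="billing")
billing_in_english = build_where(product="billing", lang="en")
```

The result can be passed as the `where` argument of a collection query. Range operators such as `$gte` follow the same shape but are not covered by this helper.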
In an AI-assisted workflow
Create a collection, add documents (Chroma can embed them for you), and query:
import chromadb

# Embedded client that persists to a local directory
client = chromadb.PersistentClient(path="./chroma")
docs = client.get_or_create_collection("docs")

# No embeddings passed: the collection's embedding function handles them
docs.add(
    ids=["doc-1"],
    documents=["How to rotate API keys..."],
    metadatas=[{"product": "billing"}],
)

res = docs.query(
    query_texts=["How do I rotate API keys?"],
    n_results=20,  # over-retrieve, then rerank
    where={"product": "billing"},
)
TIP
Chroma's default embedding function is convenient for prototyping, but for production retrieval choose a retrieval-tuned model deliberately — see Choosing Embeddings in 2026. Switching the embedding model later means re-adding (re-embedding) your documents.
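The re-adding step in the tip can be sketched as a paged copy from the old collection into a new one that was created with the embedding model you chose. This is a sketch under assumptions, not a definitive migration: the function name `reembed` and the batch size are hypothetical, every record is assumed to have stored document text, and `get` paging with `limit` and `offset` is assumed to return records in a stable order.

```python
from typing import Any


def reembed(old: Any, new: Any, batch_size: int = 500) -> int:
    """Copy every document from `old` into `new`, letting `new` embed them.

    Both arguments are expected to behave like Chroma collections:
    `get` accepting limit/offset/include, and `add` accepting
    ids/documents/metadatas. Returns the number of documents copied.
    """
    copied = 0
    while True:
        page = old.get(
            limit=batch_size,
            offset=copied,
            include=["documents", "metadatas"],
        )
        ids = page["ids"]
        if not ids:
            return copied
        # Old embeddings are deliberately not copied: `new` computes fresh
        # ones with its own embedding function when documents are added.
        new.add(
            ids=ids,
            documents=page["documents"],
            metadatas=page["metadatas"],
        )
        copied += len(ids)
```

In practice `new` would come from something like `client.get_or_create_collection(...)` with an `embedding_function` argument set to the model you selected, and queries would switch to the new collection once the copy finishes.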
Good to know
Chroma is free and open source under the Apache-2.0 license. It is the fastest store to start with. For embedded use at larger scale on local disk or object storage, compare it with LanceDB; for a server you operate yourself, compare it with Qdrant. See Best Vector Database in 2026 for where each fits.
Related
- Best Vector Database in 2026: pgvector vs Pinecone vs Qdrant vs Weaviate vs Milvus vs Chroma vs LanceDB
  A decision guide to vector databases: embedded, server, or managed; whether you already run Postgres; and which fits your scale, filtering, and RAG needs.
- LanceDB
  An open-source embedded vector database built on the Lance columnar format: serverless, multimodal, and designed to scale on local disk or object storage.
- How RAG Actually Works: Ingestion, Chunking, Retrieval & Reranking
  A clear, practical walkthrough of the retrieval-augmented generation pipeline: what each stage does, where it fails, and how the pieces fit together.
- Vector Search Engineer
  Use this agent to design, build, and tune the vector-database layer of a search or RAG system (schema and index design with HNSW/IVF + quantization, metadata/payload filtering, hybrid dense + sparse search, and ingestion/upsert pipelines), sized to a real latency, recall, and cost budget. Examples: "set up pgvector for our docs with HNSW and filtered search", "our Qdrant queries are slow and recall dropped after quantization", "add metadata filtering so search only returns the current tenant's documents".