Vector Database
A vector database stores embeddings and answers the query "which stored vectors are closest to this one?" fast enough for production — the retrieval layer beneath RAG and semantic search.
The hard problem it solves is scale. Exact nearest-neighbor search means comparing the query against every stored vector: fine at ten thousand, hopeless at a hundred million. Vector databases use approximate nearest neighbor (ANN) indexes, most commonly HNSW graphs, to answer queries in low milliseconds at a small, tunable recall cost. Around that core they layer the production necessities: metadata filtering ("only docs from this tenant"), hybrid keyword+vector search, quantization to shrink memory, and replication.
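To make the cost of exact search concrete, here is a minimal sketch of a brute-force scan with a metadata filter, in plain Python. The record layout, the `tenant` field, and the function names are hypothetical, chosen only for illustration; no particular engine works exactly this way.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def exact_search(query, records, k=2, tenant=None):
    """Score every candidate against the query: O(N) work per query."""
    # Metadata filter: keep only records for the requested tenant, if any.
    candidates = [
        r for r in records
        if tenant is None or r["tenant"] == tenant
    ]
    # Rank by similarity, highest first, and keep the top k ids.
    scored = sorted(
        candidates,
        key=lambda r: cosine_similarity(query, r["vector"]),
        reverse=True,
    )
    return [r["id"] for r in scored[:k]]

# Hypothetical toy data; real embeddings have hundreds or thousands of dimensions.
records = [
    {"id": "a", "tenant": "acme", "vector": [1.0, 0.0]},
    {"id": "b", "tenant": "acme", "vector": [0.0, 1.0]},
    {"id": "c", "tenant": "globex", "vector": [0.9, 0.1]},
    {"id": "d", "tenant": "acme", "vector": [0.6, 0.8]},
]

print(exact_search([1.0, 0.0], records, k=2))                 # ['a', 'c']
print(exact_search([1.0, 0.0], records, k=2, tenant="acme"))  # ['a', 'd']
```

Every query touches every candidate, so cost grows linearly with the collection. An ANN index exists to avoid that full scan, and efficient filtering is harder there because the filter has to cooperate with the index rather than run before a scan.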
The market splits three ways: Postgres-native (pgvector) riding your existing database, open-source engines (Qdrant, Weaviate, Milvus, Chroma, LanceDB), and managed services (Pinecone). The honest decision guide — including when plain pgvector is the right answer — is Best Vector Database in 2026; tuning the index you pick is the embedding-index-tuner skill's job.
Frequently asked questions
- Do I need a dedicated vector database?
- Not always. pgvector adds vector search to the Postgres you already run, and at small-to-medium scale it's often the pragmatic choice. Dedicated engines (Qdrant, Pinecone, Weaviate, Milvus) earn their place with scale, filtering performance, hybrid search, and operational features — the decision tree is in our vector database guide.
- What is HNSW?
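- Hierarchical Navigable Small World, the dominant approximate-nearest-neighbor index. It builds a layered graph over the vectors so a query hops from node to node toward its neighbors, visiting a number of nodes that grows roughly logarithmically with collection size instead of scanning everything. That trades a little recall for orders-of-magnitude speed. Its parameters are the main tuning knobs: M (links per node), efConstruction (candidate-list size while building), and efSearch (candidate-list size while querying).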
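The "hop toward neighbors" idea can be sketched as a greedy walk over a neighbor graph. This is a deliberately simplified, single-layer illustration and not HNSW itself: real HNSW adds a hierarchy of layers and keeps a beam of candidates (sized by efSearch) rather than a single current node. The toy graph below, with hand-built short and long links on a 1-D line, is an assumption standing in for what index construction would produce.

```python
import math

def greedy_graph_search(query, vectors, neighbors, entry):
    """Walk from `entry`, always moving to the neighbor nearest the query.

    Stops at a local minimum (no neighbor is closer than the current node).
    Returns (node index, number of distance computations performed).
    """
    current = entry
    current_dist = math.dist(query, vectors[current])
    computed = 1
    while True:
        best, best_dist = current, current_dist
        for n in neighbors[current]:
            d = math.dist(query, vectors[n])
            computed += 1
            if d < best_dist:
                best, best_dist = n, d
        if best == current:
            return current, computed
        current, current_dist = best, best_dist

# Toy index: 1000 points on a line. Each node links to its immediate
# neighbors (short links) and to nodes 32 steps away (long links, a crude
# stand-in for the sparse upper layers of a real HNSW graph).
n = 1000
vectors = [(float(i),) for i in range(n)]
neighbors = {
    i: [j for j in (i - 1, i + 1, i - 32, i + 32) if 0 <= j < n]
    for i in range(n)
}

node, computed = greedy_graph_search((700.3,), vectors, neighbors, entry=0)
print(node, computed)  # finds node 700 with far fewer than 1000 comparisons
```

The long links cover distance quickly and the short links refine the answer, which is the intuition behind the layered structure. A greedy walk can stall in a local minimum on less regular data, which is why real indexes track several candidates at once and why recall is approximate rather than guaranteed.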
Related
- Best Vector Database in 2026: pgvector vs Pinecone vs Qdrant vs Weaviate vs Milvus vs Chroma vs LanceDB
  A decision guide to vector databases: embedded, server, or managed; whether you already run Postgres; and which fits your scale, filtering, and RAG needs.
- Embedding
  An embedding is a vector of numbers representing text's meaning, placed so similar texts land close together. It is the foundation of semantic search and RAG.
- Semantic Search
  Semantic search retrieves results by meaning rather than keyword overlap, embedding queries and documents in one vector space and matching by similarity.
- RAG (Retrieval-Augmented Generation)
  RAG retrieves relevant documents from your own data and injects them into an LLM's prompt at query time, grounding answers in facts the model wasn't trained on.
- pgvector
  An open-source Postgres extension that adds a vector type and HNSW/IVFFlat indexes for similarity search inside your existing database.
- Qdrant
  An open-source vector database written in Rust, built for low-latency similarity search at scale.
- Embedding Index Tuner
  Tune a vector index (HNSW graph parameters and quantization) to hit a recall target at the lowest latency and memory, by sweeping settings against a fixed query set instead of trusting defaults. Use when vector search is slow or memory-hungry, when recall dropped after enabling quantization, or when standing up an index and you need defensible parameters.
- Graphrag Scaffolder
  Stand up a GraphRAG experiment the disciplined way: audit whether your failed queries are actually connection-shaped, scope a minimal entity/relationship ontology, build extraction → graph → community-summary indexing on a corpus slice, and measure against vector-RAG baselines before committing. Use when multi-hop or whole-corpus questions keep failing plain RAG.
- Qdrant vs Pinecone: Which Vector Database? (2026)
  Qdrant vs Pinecone compared: open-source control vs fully managed serverless, filtering and hybrid search, cost shape, and which fits your RAG stack.
- GraphRAG Explained: When Knowledge Graphs Beat Vector Search
  What GraphRAG is, how graph-based retrieval differs from vector RAG, the query shapes where it wins, and the honest costs before you build one.
- Cosine Similarity
  Cosine similarity measures how alike two embeddings are by the angle between them. It is the standard relevance score behind semantic search and RAG retrieval.
- Embedding Dimension
  Embedding dimension is the length of an embedding vector (how many numbers represent each text), trading capacity against storage and search cost.
- Hybrid Search
  Hybrid search runs keyword (BM25) and semantic (vector) retrieval together and merges the results, catching both exact terms and paraphrases.