Hybrid Search
Hybrid search runs keyword (BM25) and semantic (vector) retrieval together and merges the results — catching both exact terms and paraphrases.
Hybrid search retrieves with two engines at once, lexical keyword search (BM25) and semantic vector search, and merges their results, so a query can match documents both by exact terms and by meaning.
It exists because neither half suffices alone. Pure vector retrieval has a well-known blind spot: exact strings such as error codes, function names, and part numbers, where "semantically similar" is precisely wrong. Pure keyword search has the inverse weakness: it cannot bridge a vocabulary mismatch between the asker's wording and the document's. Production systems receive both query types, so production retrieval runs both engines. The two result lists are usually merged by Reciprocal Rank Fusion (RRF), which works on rank positions and is therefore unaffected by the engines' incomparable score scales, and the merged pool is then re-sorted by a reranker.
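A minimal sketch of Reciprocal Rank Fusion, assuming each engine returns a best-first list of document IDs. The function name, the sample IDs, and the constant k = 60 (a commonly used default) are illustrative choices, not taken from any particular database's API.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several best-first lists of document IDs into one ranking.

    Each document earns 1 / (k + rank) from every list it appears in,
    so only rank positions matter, never the engines' raw scores.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical results from the two retrieval legs.
bm25_hits = ["doc_err42", "doc_a", "doc_b"]
vector_hits = ["doc_a", "doc_c", "doc_err42"]

fused = reciprocal_rank_fusion([bm25_hits, vector_hits])
print(fused)
```

Documents that both engines rank highly rise to the top, while a document found by only one engine still stays in the pool for the reranker to judge.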
Adoption is now mostly a checkbox: Qdrant and Weaviate ship hybrid retrieval natively, and pgvector-based stacks typically get it by pairing vector search with PostgreSQL full-text search. The judgment that remains is tuning: setting fusion weights per corpus and measuring whether the lexical leg actually helps your queries. Both are covered, along with the full recall-to-precision architecture, in Hybrid Search & Reranking. When RAG misses queries containing exact identifiers, hybrid search is the first fix on the debugging checklist.
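One way to measure whether the lexical leg helps is to compare recall@k for vector-only and hybrid retrieval on a small labeled query set. The sketch below assumes you already have relevance judgments per query; the queries, document IDs, and function names are hypothetical.

```python
def recall_at_k(retrieved, relevant, k):
    """Fraction of the relevant doc IDs that appear in the top-k results."""
    if not relevant:
        return 0.0
    return len(set(retrieved[:k]) & set(relevant)) / len(relevant)


def mean_recall(runs, judgments, k):
    """Average recall@k over every judged query."""
    total = sum(recall_at_k(runs[q], judgments[q], k) for q in judgments)
    return total / len(judgments)


# Hypothetical relevance judgments: query -> set of relevant doc IDs.
judgments = {
    "E1042 timeout": {"d1"},
    "laptop won't boot": {"d2", "d3"},
}
# Hypothetical best-first results from each retrieval setup.
vector_only = {
    "E1042 timeout": ["d9", "d8", "d1"],
    "laptop won't boot": ["d2", "d3", "d7"],
}
hybrid = {
    "E1042 timeout": ["d1", "d9", "d8"],
    "laptop won't boot": ["d2", "d7", "d3"],
}

vector_recall = mean_recall(vector_only, judgments, k=2)
hybrid_recall = mean_recall(hybrid, judgments, k=2)
print(f"vector-only recall@2: {vector_recall:.2f}")
print(f"hybrid recall@2:      {hybrid_recall:.2f}")
```

Breaking the numbers down by query type (identifier-heavy versus paraphrase-heavy) shows where the lexical leg earns its keep and where it only adds noise.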
Frequently asked questions
- Why combine keyword and vector search?
  Their failure modes are complementary. Vector search finds paraphrases ("laptop won't boot" → "system fails to start") but blurs exact tokens; keyword search nails identifiers, error codes, and SKUs but misses rephrasings. Run both and each covers the other's blind spot, which often makes hybrid search one of the highest-ROI retrieval upgrades available for a corpus.
- How are the two result lists merged?
  Most commonly with Reciprocal Rank Fusion (RRF), which combines results by rank position rather than raw score and so sidesteps the incomparable-score-scales problem. The alternative is a weighted score blend tuned per corpus. Typically a reranker then re-sorts the merged pool. Most vector databases now ship hybrid search built in.
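The weighted score blend can be sketched as follows, assuming each engine returns a mapping from document ID to raw score. Min-max normalization is one common way to put the two scales on equal footing; the function names, the `alpha` parameter, and the sample scores are illustrative assumptions.

```python
def min_max_normalize(scores):
    """Rescale raw scores to [0, 1] so the two engines become comparable."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc_id: 1.0 for doc_id in scores}
    return {doc_id: (s - lo) / (hi - lo) for doc_id, s in scores.items()}


def weighted_blend(lexical, semantic, alpha=0.5):
    """Blend normalized scores: alpha weights the semantic leg,
    (1 - alpha) the lexical leg. A doc missing from one leg scores 0 there."""
    lex = min_max_normalize(lexical)
    sem = min_max_normalize(semantic)
    blended = {
        doc_id: alpha * sem.get(doc_id, 0.0) + (1 - alpha) * lex.get(doc_id, 0.0)
        for doc_id in set(lex) | set(sem)
    }
    # Best blended score first; ties broken by ID for determinism.
    return sorted(blended.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical raw scores on very different scales.
bm25_scores = {"doc_a": 12.0, "doc_b": 7.0, "doc_c": 2.0}
cosine_scores = {"doc_b": 0.90, "doc_c": 0.80, "doc_d": 0.50}

ranked = weighted_blend(bm25_scores, cosine_scores, alpha=0.5)
for doc_id, score in ranked:
    print(doc_id, round(score, 3))
```

Unlike RRF, this approach has a weight to tune and depends on the score distribution of each query's result set, which is why it is usually calibrated per corpus.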
Related
- Hybrid Search & Reranking: From Top-50 Recall to Top-5 Precision
  How production RAG combines dense and sparse search, fuses with RRF, and reranks, turning a wide candidate set into the few passages that actually answer.
- Semantic Search
  Semantic search retrieves results by meaning rather than keyword overlap, embedding queries and documents in one vector space and matching by similarity.
- Reranking
  Reranking is a second-pass scoring step: a cross-encoder model re-orders the top results from fast retrieval so the truly relevant few rise to the top.
- RAG (Retrieval-Augmented Generation)
  RAG retrieves relevant documents from your own data and injects them into an LLM's prompt at query time, grounding answers in facts the model wasn't trained on.
- Vector Database
  A vector database stores embeddings and answers nearest-neighbor queries fast. It is the retrieval layer under RAG and semantic search, using ANN indexes like HNSW.