Hybrid RAG: Dense + Sparse + RRF

Pure vector search misses domain terms. Hybrid retrieval (dense plus sparse, fused with RRF) is a correctness fix, not an optimization. Dense-only RAG leaves recall on the table.

Pure vector search fails on domain-specific terminology. In a fertility corpus, the embedding for "AMH" carries less signal than the embedding for "anti-Müllerian hormone," often too little to retrieve the right chunk. The fix is hybrid retrieval: run a dense (vector) search and a sparse (BM25) search in parallel, then fuse the two ranked lists with Reciprocal Rank Fusion (RRF).
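As a rough illustration of the "in parallel" step, here is a minimal Python sketch. The two retriever functions, the chunk IDs they return, and the name `retrieve_both` are hypothetical stand-ins rather than any particular library's API; a real system would call a vector index and a BM25 index in their place.

```python
from concurrent.futures import ThreadPoolExecutor


# Hypothetical stand-in for a dense retriever. A real one would embed the
# query and search a vector index. Returns chunk IDs, best match first.
def dense_search(query: str, top_k: int = 5) -> list[str]:
    canned = ["chunk_ovarian_reserve", "chunk_hormone_panel", "chunk_amh_reference"]
    return canned[:top_k]


# Hypothetical stand-in for a sparse retriever. A real one would score
# chunks with BM25 over an inverted index. Returns chunk IDs, best first.
def sparse_search(query: str, top_k: int = 5) -> list[str]:
    canned = ["chunk_amh_reference", "chunk_amh_dosing"]
    return canned[:top_k]


def retrieve_both(query: str, top_k: int = 5) -> tuple[list[str], list[str]]:
    """Run both retrievers concurrently and return their ranked lists."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        dense_future = pool.submit(dense_search, query, top_k)
        sparse_future = pool.submit(sparse_search, query, top_k)
        return dense_future.result(), sparse_future.result()


dense_hits, sparse_hits = retrieve_both("normal AMH range")
```

The two searches are independent of each other, so running them concurrently means the hybrid query takes roughly as long as the slower retriever instead of the sum of both.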

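Sparse retrieval catches the exact-term matches that dense misses. Dense catches the semantic matches that sparse misses. RRF balances them using only each result's rank in the two lists, so BM25 scores and vector similarities never have to be put on a common scale, and no weights need tuning per query.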
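To make the fusion step concrete: RRF gives each chunk a score of 1 / (k + rank) for every list it appears in, sums those contributions, and sorts by the total. The sketch below is a minimal version of that idea, not a definitive implementation. The function name `rrf_fuse` and the chunk IDs are made up for illustration, and k = 60 is a commonly used default that a given system may choose to change.

```python
from collections import defaultdict


def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Fuse ranked lists of chunk IDs with Reciprocal Rank Fusion.

    Each input list is ordered best-first. Only rank positions are used,
    so the retrievers' raw scores never need to be comparable.
    """
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by chunk ID for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Illustrative inputs: "c3" is only third in the dense list but first in
# the sparse list, so it gets a contribution from both.
dense_ranking = ["c1", "c2", "c3"]
sparse_ranking = ["c3", "c4"]

fused = rrf_fuse([dense_ranking, sparse_ranking])
fused_ids = [chunk_id for chunk_id, _ in fused]
```

In this toy input, the chunk that appears in both lists ends up ranked first, ahead of the chunk the dense retriever alone put on top. That is the behavior the hybrid setup relies on: an exact-term hit from the sparse side can lift a chunk that dense search ranked low.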

This is a correctness fix, not an optimization. Any production RAG system in a technical domain that runs dense-only is leaving recall on the table, and the missing chunks are precisely the ones the user asked about by name.