What Is Hybrid Fusion (RRF)? Sparse and Dense Rank Merging
By Tharindu Gunawardana | SearchMinistry Media
Reciprocal Rank Fusion (RRF) is a rank aggregation algorithm that merges multiple ranked lists into a single unified ranking by summing 1/(k + rank) for each document across all lists. Hybrid retrieval combines BM25 sparse retrieval and dense vector retrieval, then applies RRF to merge their ranked outputs.
The RRF Formula
RRF assigns each document a fusion score by summing 1/(k + rank) for every ranked list the document appears in, where rank starts at 1 for the top result. The constant k is typically 60. A document ranked 1st in both BM25 and dense retrieval scores 1/61 + 1/61 ≈ 0.0328. A document ranked 1st in BM25 only scores 1/61 ≈ 0.0164. A document that holds similar positions in both lists therefore receives roughly double the score of a document that appears at that position in only one list.
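The formula above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name rrf_fuse, the document IDs, and the two example rankings are all hypothetical, and each input list is assumed to be ordered best-first.

```python
from collections import defaultdict


def rrf_fuse(ranked_lists, k=60):
    """Fuse ranked lists of document IDs with Reciprocal Rank Fusion.

    Each list is assumed to be ordered best-first. Returns (doc_id, score)
    pairs sorted by descending fusion score.
    """
    scores = defaultdict(float)
    for ranking in ranked_lists:
        # Ranks are 1-based, so the top result contributes 1 / (k + 1).
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical outputs from a BM25 retriever and a dense retriever.
bm25_ranking = ["doc_a", "doc_b", "doc_c"]
dense_ranking = ["doc_a", "doc_d", "doc_b"]

fused = rrf_fuse([bm25_ranking, dense_ranking])
for doc_id, score in fused:
    print(f"{doc_id}: {score:.4f}")
```

Here doc_a is ranked 1st in both lists and scores 2/61, while doc_d and doc_c each appear in only one list and end up at the bottom of the fused ranking.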
Why RRF Uses Ranks Instead of Scores
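BM25 and cosine similarity scores operate on different scales: BM25 scores are unbounded and depend on corpus statistics and query length, while cosine similarity is bounded between -1 and 1. Combining the raw scores directly requires careful calibration that is sensitive to the specific models and query distribution used. RRF avoids this by discarding the scores and using only each system's rank order, so both retrievers contribute on equal terms. This stability, together with having only one parameter to set, makes RRF a common default for production hybrid retrieval, and it is often reported to match or outperform tuned or learned fusion methods.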
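The scale problem can be illustrated with a small sketch. The scores below are invented for illustration, and the helper names to_ranking and rrf_scores are hypothetical. Summing raw scores lets the larger-scaled BM25 values decide the order, whereas the rank-based fusion is unchanged even when the BM25 scores are rescaled.

```python
def to_ranking(scored):
    """Return document IDs ordered by descending score."""
    return [doc for doc, _ in sorted(scored.items(), key=lambda kv: kv[1], reverse=True)]


def rrf_scores(rankings, k=60):
    """Sum 1 / (k + rank) per document over all rankings (1-based ranks)."""
    scores = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return scores


# Invented scores: BM25 is unbounded, cosine similarity lies in [-1, 1].
bm25_scores = {"doc_a": 12.4, "doc_b": 9.1, "doc_c": 3.3}
cosine_scores = {"doc_c": 0.83, "doc_a": 0.79, "doc_b": 0.41}

# Naive fusion: adding raw scores simply reproduces the BM25 order.
naive_order = to_ranking({d: bm25_scores[d] + cosine_scores[d] for d in bm25_scores})

# Rank-based fusion: each retriever contributes through rank order only.
fused_order = to_ranking(
    rrf_scores([to_ranking(bm25_scores), to_ranking(cosine_scores)])
)

# Rescaling BM25 by any positive factor leaves the RRF result unchanged.
scaled_bm25 = {d: s * 100 for d, s in bm25_scores.items()}
fused_order_scaled = to_ranking(
    rrf_scores([to_ranking(scaled_bm25), to_ranking(cosine_scores)])
)

print(naive_order)
print(fused_order)
print(fused_order_scaled)
```

In this toy case the dense retriever's top result, doc_c, moves above doc_b under RRF, but it stays last under the naive sum because its BM25 score is small.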
SEO Implications
Appearing in both sparse and dense retrieval results roughly doubles your RRF score relative to appearing at a similar rank in only one of them. Content optimised for both signals therefore benefits from hybrid retrieval amplification. Precise terminology supports sparse retrieval coverage, because BM25 depends on matching the query's exact terms; rich semantic variation supports dense retrieval coverage, because embeddings match on meaning rather than wording. Both goals align with good, comprehensive content writing, so hybrid-optimised content also tends to be better for human readers.
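A quick arithmetic check shows where the "roughly double" rule holds. This sketch assumes k = 60 and uses a hypothetical helper, rrf_contribution, for the score one list contributes at a given rank.

```python
K = 60  # the typical RRF constant


def rrf_contribution(rank, k=K):
    """Score contributed by a single list at a given 1-based rank."""
    return 1.0 / (k + rank)


top_in_one_list = rrf_contribution(1)      # 1/61
top_in_both = 2 * rrf_contribution(1)      # 2/61, exactly double
rank_50_in_both = 2 * rrf_contribution(50)    # 2/110
rank_100_in_both = 2 * rrf_contribution(100)  # 2/160

print(f"1st in one list:  {top_in_one_list:.4f}")
print(f"1st in both:      {top_in_both:.4f}")
print(f"50th in both:     {rank_50_in_both:.4f}")
print(f"100th in both:    {rank_100_in_both:.4f}")
```

Under these assumptions, a document ranked 50th in both lists still edges out one ranked 1st in a single list, while a document ranked 100th in both does not. The doubling applies when the ranks being compared are the same. Presence in both lists helps most when the positions are also reasonably high.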