008-bm25

BM25

Source: https://arpitbhayani.me/blogs/bm25 Date: 2026-03-04

There is a particular kind of respect reserved in engineering for algorithms that outlive their era, and BM25 is one of them. It was born out of information retrieval research in the 1970s and 1980s, polished over decades, and eventually adopted as the default ranking function in Elasticsearch, Solr, and Lucene.

What makes BM25 worth understanding is not just that it works. It is that it works for knowable reasons.

Every part of the formula has a clear interpretation. When a result is surprising, you can trace why. When you need to tune for your domain, the parameters give you meaningful handles to turn. The interpretability is genuinely valuable.

What BM25 Was Built To Solve

TF is Linear

Okapi and the TREC years

How BM25 Works

Saturating Term Frequency

Normalizing Document Length

The IDF Component

The Complete Formula

Worked Example

Tuning `k1` and `b`

`k1`

`b`

What BM25 Cannot Do

When To Use BM25 vs Alternatives

BM25 in Elasticsearch

Footnote
