TLDR: BM25
Date: 2026-03-04 Source: https://arpitbhayani.me/blogs/bm25
Overview
There is a particular kind of respect reserved in engineering for the algorithm that outlives its era. BM25 was born out of information retrieval research in the 1970s and 1980s, polished over decades, and eventually adopted as the default ranking function in Elasticsearch, Solr, and Lucene.
Key Points
- What BM25 Was Built To Solve: The simplest possible retrieval system is Boolean keyword matching: a document is relevant if it contains the query terms, and irrelevant if it does not.
- Okapi and the TREC years: BM25 emerged from work done on the Okapi system at City University London.
- How BM25 Works: Rather than presenting the formula and explaining it, it is more useful to build it up from the two problems TF-IDF could not solve.
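
To make the last point concrete, here is a minimal sketch of BM25 scoring. This summary does not reproduce the formula, so the code uses the commonly cited Okapi form with a Lucene-style smoothed IDF. The function name `bm25_scores`, the toy corpus, and the defaults `k1=1.2` and `b=0.75` are illustrative assumptions, not taken from the source. The two ideas usually credited with separating BM25 from plain TF-IDF appear as comments: term-frequency saturation (controlled by `k1`) and document-length normalization (controlled by `b`).

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score each tokenized document against a tokenized query.

    Assumes `docs` is a non-empty list of non-empty token lists.
    k1 and b are common default values, chosen here as an assumption.
    """
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs

    # Document frequency: in how many documents does each term appear?
    df = Counter()
    for d in docs:
        df.update(set(d))

    scores = []
    for d in docs:
        tf = Counter(d)
        # Length normalization: longer-than-average documents are penalized,
        # with b = 0 disabling the effect and b = 1 applying it fully.
        norm = 1 - b + b * len(d) / avg_len
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            # Smoothed IDF (Lucene-style variant, always non-negative).
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Saturation: this fraction approaches (k1 + 1) as tf grows,
            # so repeated occurrences give diminishing returns.
            score += idf * tf[term] * (k1 + 1) / (tf[term] + k1 * norm)
        scores.append(score)
    return scores


corpus = [
    ["cat", "sat"],
    ["cat", "cat", "cat", "sat"],
    ["dog", "barked"],
]
print(bm25_scores(["cat"], corpus))
```

In this toy corpus the second document scores higher than the first for the query `cat`, but by much less than a factor of three, even though the term appears three times as often. A document that lacks the term scores zero.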