TLDR: Bloom Filters
Date: 2025-12-29 Source: https://arpitbhayani.me/blogs/bloom-filters
Overview
A Bloom filter is a probabilistic data structure that answers a very specific question - have I seen this thing before? - while using almost no memory.
Key Points
- A Bloom filter is a probabilistic data structure that answers a very specific question - have I seen this thing before? - while using almost no memory.
- Why Bloom Filters Matter: Say, you are building a recommendation engine for a content platform with millions of users.
- The Fundamental Structure: A Bloom filter consists of two components: a bit array of m bits, all initially set to zero, and a collection of k independent hash functions.
- Mathematics of False Positives: After inserting n elements using k hash functions into a filter with m bits, the probability that a specific bit remains zero is: If you have m bits in your array and a hash function that outputs random positions, the probability of hitting any specific bit is 1/m.