TLDR: Consistent Hashing - What It Is and How to Implement It
Date: 2020-05-24 Source: https://arpitbhayani.me/blogs/consistent-hashing
Overview
Consistent hashing is a hashing technique that performs well in dynamic environments, where a distributed system scales up and down frequently.
Key Points
- A good hash function is computationally efficient, and the values it generates are easy to look up
- For most general use cases, the hash function behaves like a pseudorandom generator that spreads data out evenly without any noticeable correlation
- Keep the hash function independent of the number of storage nodes, so that adding or removing a node does not change where most keys hash
- Hash Function in Consistent Hashing: total_slots is defined as the size of the entire hash space, typically on the order of 2^256. The hash function can be implemented by taking SHA-256 of the key followed by a mod total_slots.
- Adding a new node in the system: When there is a need to scale up and add a new node to the system (in our case, a new Storage Node), only a small set of files is affected: those that hash to a location to the left of the position where the new node fits in, and that are currently associated with the node to its right.
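
The points above can be sketched in Python. This is a minimal illustration, not the original article's code: the names HashRing, add_node, and get_node are hypothetical, the ring has no virtual nodes or replication, and a file is assumed to belong to the first node at or to the right of its hash position, wrapping around at the end of the hash space.

```python
import bisect
import hashlib

# Size of the hash space, as described in the text (assumed to be exactly 2^256).
TOTAL_SLOTS = 2 ** 256


def hash_fn(key: str, total_slots: int = TOTAL_SLOTS) -> int:
    """SHA-256 of the key, followed by a mod total_slots."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % total_slots


class HashRing:
    """Hypothetical minimal ring: one slot per node, no virtual nodes."""

    def __init__(self, total_slots: int = TOTAL_SLOTS):
        self.total_slots = total_slots
        self._positions = []  # sorted slot positions of the nodes
        self._nodes = []      # node names, parallel to _positions

    def add_node(self, name: str) -> int:
        """Place a node on the ring at the hash of its name; return its slot."""
        pos = hash_fn(name, self.total_slots)
        idx = bisect.bisect_left(self._positions, pos)
        if idx < len(self._positions) and self._positions[idx] == pos:
            raise ValueError(f"slot collision for node {name!r}")
        self._positions.insert(idx, pos)
        self._nodes.insert(idx, name)
        return pos

    def get_node(self, key: str) -> str:
        """Return the first node at or to the right of the key's slot."""
        if not self._positions:
            raise LookupError("ring has no nodes")
        pos = hash_fn(key, self.total_slots)
        # The modulo wraps keys past the last node around to the first one.
        idx = bisect.bisect_left(self._positions, pos) % len(self._positions)
        return self._nodes[idx]
```

With this sketch, adding a node via add_node only changes the result of get_node for keys that hash between the new node's left neighbor and the new node itself; all of those keys previously belonged to the single node to its right, and every other key keeps its owner.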