Rohit Modi

System Design : Load Balancing And Consistent Hashing - Part 2

Updated: Dec 18, 2022

Hi, today's article is a continuation of "System Design : Load Balancing And Consistent Hashing". Here we will explore consistent hashing, which is very important for large applications that run on containers backed by multiple servers.


In the previous article we saw how an application loses its cached data and how adding or removing a server can impact the whole application to a large extent, something that is almost unavoidable.


Consistent hashing solves these problems to a great extent: it not only minimizes the impact of adding or removing a server, it also preserves the cached data.


Let's see, through the diagram below, how much the addition of a server impacts the whole application under normal circumstances (a non-consistent-hashing implementation).





So we currently have 4 servers, and each server owns 25 percent of the search space. Let's assume that at some point the number of requests increases and all 4 servers become overloaded. To avoid overloading the existing servers, the team decides to add one more server to distribute the load.


Now each server will own 20 percent of the search space. Server 1 loses 5 percent of its space, and this 5 percent is taken by server 2, which in turn loses 10 percent; server 3 takes that 10 percent and loses 15 percent of its own; server 4 takes that 15 percent and loses 20 percent; and that final 20 percent is picked up by server 5. Counting every loss and gain, the total change is (5+5+10+10+15+15+20+20) = 100, meaning half of the entire search space changes owner. That is a huge amount of churn, and we lose all the cached data as well, since the cached entries no longer live on the servers that now receive those requests.
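To make the arithmetic concrete, here is a minimal sketch (in Java, with hypothetical names, assuming a 100-unit search space split into equal contiguous ranges) that counts how many units change owner when the fifth server is added:

public class RangeChurn {
    public static void main(String[] args) {
        int moved = 0;
        for (int point = 0; point < 100; point++) {
            int ownerBefore = point / 25; // 4 servers, 25 units each
            int ownerAfter = point / 20;  // 5 servers, 20 units each
            if (ownerBefore != ownerAfter) moved++;
        }
        System.out.println(moved + " of 100 units changed owner"); // prints 50
    }
}

Half of the search space changes hands, which matches the 100 units of combined loss and gain counted above.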

To overcome this, we use the consistent hashing technique. Let's see how it works through the following diagram.



In the above diagram we can see that requests flow in a clockwise direction: a request that lands in a bucket between server 0 and server 1 is served by server 1, a request that lands between server 1 and server 2 is served by server 2, and so on.


The servers take these spots based on their hash values: in consistent hashing we hash the server id as well, and the obtained hash value decides where on the circle the server is placed. The calculation is done as follows.


Let's say the total search space is 50:

hash(S0) = 2
spot = 2 % 50 = 2

hash(S1) = 15
spot = 15 % 50 = 15

hash(S2) = 67
spot = 67 % 50 = 17

hash(S3) = 90
spot = 90 % 50 = 40


The range of a hash function is quite large, so instead of wasting memory on the full range we pick a smaller number that works effectively as the search space. This number is generally decided based on the request flow and the number of servers used by the application, and it varies from project to project. For the example above we took a range of 50, and we apply the modulus operator to the server's hash value and the search space size to find the spot on the circle where the server is placed.
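As a sketch of how this placement and the clockwise lookup could be coded (assuming Java and a sorted map as the ring; all names here are hypothetical, and hashes are assumed non-negative as in the example above):

import java.util.TreeMap;

public class ConsistentHashRing {
    static final int SEARCH_SPACE = 50;
    // sorted map: spot on the circle -> server id
    final TreeMap<Integer, String> ring = new TreeMap<>();

    void addServer(String serverId, int hash) {
        ring.put(hash % SEARCH_SPACE, serverId); // spot = hash % search space
    }

    String route(int requestHash) {
        int spot = requestHash % SEARCH_SPACE;
        // clockwise: the first server at or after this spot, wrapping to the start
        Integer key = ring.ceilingKey(spot);
        return ring.get(key != null ? key : ring.firstKey());
    }
}

With the spots computed above (S0 at 2, S1 at 15, S2 at 17, S3 at 40), a request hashing to spot 20 travels clockwise and is served by S3 at spot 40, while a request hashing to spot 45 wraps around the circle to S0 at spot 2.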


This kind of setup helps distribute requests properly. Now suppose the same scenario as before: the number of requests has increased and all 4 servers are overloaded, so a new server is added to reduce the load on the existing ones. Let's see how this is handled under consistent hashing.


Let's add another server, server 4 (S4); its placement is again decided by the hash function and the modulus operator. Here we can see that it has not impacted the application much: the few requests that were being served by server 3 (S3) are now served by server 4 (S4), while the remaining servers are unaffected. The change is minimal.
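Continuing the hypothetical sketch above (assuming hashRing is an instance of the ConsistentHashRing class), suppose hash(S4) = 85, so its spot is 85 % 50 = 35, which lands between S2 at 17 and S3 at 40:

// Requests landing at spots 18..35 used to travel clockwise to S3 at spot 40;
// they now stop at S4 at spot 35. Every other request is routed exactly as
// before, so only that one small arc of cached data has to be rebuilt.
hashRing.addServer("S4", 85);

Compare this with the non-consistent setup earlier, where half of the search space changed owner.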


So far so good, but there is still one issue we can face. Let's talk about that.


Now imagine a scenario where many requests land between server 0 (S0) and server 1 (S1). Server 1 will carry more load than the others, even though we have distributed the load quite efficiently.

In this case, should we add a few more servers? Not exactly: each of our servers is already powerful, and adding more would be costly. Doing so would increase the organization's bill and leave us with underutilized servers that could actually serve many more requests.


So what can be done to overcome this scenario? Nothing unusual: we can use virtual servers, or virtual nodes, which are not actual servers but multiple replicas of the actual servers. Let's look at the following diagram to understand this.



Here we have used multiple (k) hash functions for each server to generate multiple spots in the search space. In other words, one server occupies several points on the circle, as illustrated in the above diagram, where the same server appears in multiple places. Since each server is replicated around the circle, incoming requests are spread across all of the servers, and there is little to no chance of overwhelming any particular server with tons of requests.


To generate multiple replicas of a server, we can either use k hash functions or generate k replica ids for each server id, and then place the server accordingly in the search space.
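A minimal sketch of the replica-id approach, added to the hypothetical ConsistentHashRing class above (the "#" naming scheme is an assumption for illustration, not a standard):

// k replicas per server: hash derived ids such as "S0#0", "S0#1", ...
void addServerWithReplicas(String serverId, int k) {
    for (int i = 0; i < k; i++) {
        String replicaId = serverId + "#" + i;
        // floorMod keeps the spot non-negative even if hashCode() is negative
        int spot = Math.floorMod(replicaId.hashCode(), SEARCH_SPACE);
        ring.put(spot, serverId); // every replica spot maps back to the real server
    }
}

A nice side effect: when a server is removed, its k spots disappear from the circle and its load is shared among several different neighbors instead of falling entirely on one.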


This is how consistent hashing works: by hashing the servers onto a circle and adding multiple replicas of each, it solves a major problem of today's large systems and helps them serve millions of requests.


That's all, folks, for this article. Please let me know your inputs, suggestions, and thoughts in the comment section, and keep visiting for more articles.


A big thank you for checking out my article; this really encourages me to come up with more topics.
