Understanding Load Balancer

A load balancer is an important component of any distributed system. It helps to distribute the client requests within a cluster of servers to improve the responsiveness and availability of applications or websites.

It distributes workloads evenly across servers or other compute resources to optimize network efficiency, reliability and capacity. Load balancing is performed by an appliance, either physical or virtual, that identifies in real time which server (or pod, in the case of Kubernetes) in a pool can best meet a given client request, while ensuring that heavy network traffic doesn't overwhelm any single server or pod. Another important task of a load balancer is to carry out continuous health checks on servers or pods to ensure they can handle requests. By balancing user requests it makes better use of system resources and improves service availability, although no load balancer can by itself guarantee 100% availability.
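The health-check idea can be sketched in a few lines of Python. This is only an illustrative sketch, not how any particular product does it: the function names tcp_check and healthy_servers are made up for this example, and a plain TCP connect is assumed to be a good enough signal that a server can take requests. Real load balancers usually run such checks on a schedule and often use HTTP probes as well.

```python
import socket


def tcp_check(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) succeeds in time."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS failures.
        return False


def healthy_servers(servers, check=tcp_check):
    """Keep only the (host, port) pairs that pass the health check.

    `check` is injectable so a different probe (or a fake in tests)
    can be swapped in; this parameter is an assumption of the sketch.
    """
    return [(host, port) for host, port in servers if check(host, port)]
```

The load balancer would then pick a target only from the list returned by healthy_servers, so a failed server stops receiving traffic until it passes the check again.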

Reverse Proxy/Load Balancer Communication Flow

During system design, horizontal scaling is a very common strategy for scaling a system whose user base is very large. It also improves the overall throughput of the application or website. Latency spikes should occur less often because requests are not blocked behind a single busy server, and users do not need to wait as long for their requests to be processed.

Availability is a key characteristic of any distributed system. If a server fails completely, the impact on the user experience is minimal because the load balancer simply sends client requests to a healthy server. Instead of a single resource taking a heavy load, the load balancer ensures that several resources each perform a manageable amount of work.

Categories of Load Balancer

Layer 4 Category Load Balancer

Layer 4 load balancers distribute traffic based on transport-layer data, such as IP addresses and Transmission Control Protocol (TCP) port numbers, without inspecting the content of the request. Examples: Network Load Balancer in AWS and Internal Load Balancer in GCP.

Layer 7 Category Load Balancer

Layer 7 load balancers make routing decisions based on application-level characteristics, such as HTTP header information or the actual contents of the message, for example URLs and cookies. Examples: Application Load Balancer in AWS and Global Load Balancer in GCP.
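As a rough illustration of a Layer 7 decision, the sketch below chooses a backend pool from the URL path and a cookie. Everything named here is hypothetical (the pool names, the "beta" cookie and the choose_pool function); it only shows the kind of information an application-layer balancer can look at, which a Layer 4 balancer cannot.

```python
# Hypothetical routing table: URL path prefix -> backend pool name.
ROUTES = [
    ("/api/", "api-pool"),
    ("/static/", "static-pool"),
]
DEFAULT_POOL = "web-pool"


def choose_pool(path, cookies=None):
    """Pick a backend pool from the request path and cookies."""
    cookies = cookies or {}
    # Cookie-based rule: opted-in users go to a separate pool.
    if cookies.get("beta") == "1":
        return "beta-pool"
    # Path-based rule: the first matching prefix wins.
    for prefix, pool in ROUTES:
        if path.startswith(prefix):
            return pool
    return DEFAULT_POOL
```

Once a pool is chosen, any of the balancing techniques described later can be used to pick a server inside that pool.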

Types of Load Balancing

Hardware Load Balancing Type

Vendors of hardware-based solutions load proprietary software onto the machine they provide, which often uses specialized components or resources. To handle increasing traffic to the application or website, one has to buy additional hardware from the vendor. Example: F5 load balancer from F5 Networks.

Software Load Balancing Type

Software solutions generally run on regular hardware, making them economical and more flexible. You can install the software on the hardware of your choice or in cloud environments like AWS, GCP, Azure etc.

Load Balancing Techniques

There are various load balancing methods, and each one uses a different algorithm to distribute requests. Here is a list of load balancing techniques:

Random Selection

As the name says, servers are selected randomly. No other factors are considered when selecting a server. This method can cause a problem where some servers get overloaded with requests while others sit idle.
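A minimal sketch of random selection in Python is shown here. The function names and the simulate helper are assumptions made for illustration. Running the simulation with a small number of requests shows how uneven the per-server counts can be.

```python
import random


def pick_random(servers, rng=random):
    """Return one server chosen uniformly at random, ignoring load."""
    return rng.choice(servers)


def simulate(servers, requests, seed=None):
    """Count how many of `requests` simulated requests each server gets."""
    rng = random.Random(seed)  # a seed makes the run repeatable
    counts = {server: 0 for server in servers}
    for _ in range(requests):
        counts[pick_random(servers, rng)] += 1
    return counts
```

For example, simulate(["App Server 1", "App Server 2", "App Server 3"], 12) returns a dictionary of per-server counts that will often not be an even 4/4/4 split.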

Round Robin

This is one of the most commonly used load balancing methods. The load balancer distributes incoming traffic across a set of servers in a fixed order. As in the diagram above, with 3 application servers, the first request goes to App Server 1, the second goes to App Server 2, and so on. When the load balancer reaches the end of the server list, it starts over from the beginning, that is, from App Server 1. This balances the traffic almost evenly between the servers. All servers need to have the same specification for this method to work well. Otherwise, a low-specification server receives the same load as a server with high processing capacity.
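The rotation can be sketched in Python as follows. The class name RoundRobinBalancer is made up for this example, and the sketch ignores health checks and concurrency.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hand out servers in a fixed order, wrapping around at the end."""

    def __init__(self, servers):
        servers = list(servers)
        if not servers:
            raise ValueError("at least one server is required")
        self._servers = cycle(servers)

    def next_server(self):
        """Return the server that should receive the next request."""
        return next(self._servers)
```

With three servers, four consecutive calls to next_server() return App Server 1, App Server 2, App Server 3 and then App Server 1 again.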

Weighted Round Robin

This is a bit more complex than Round Robin, as it is designed to handle servers with different characteristics. A weight is assigned to each server in the configuration. The weight can be an integer value that varies according to the specifications of the server. Higher-specification servers get a larger weight and therefore receive a proportionally larger share of the traffic.
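One simple way to sketch weighted round robin is to give each server as many slots in the rotation as its weight. This is an assumption of the example, not a description of any specific product; real implementations typically interleave the picks more smoothly instead of sending consecutive requests to the same server. The class name and server names are hypothetical.

```python
from itertools import cycle


class WeightedRoundRobinBalancer:
    """Round robin where a server with weight w gets w slots per cycle."""

    def __init__(self, weights):
        # weights: mapping of server name -> positive integer weight
        slots = []
        for server, weight in weights.items():
            if not isinstance(weight, int) or weight < 1:
                raise ValueError(f"weight for {server!r} must be an integer >= 1")
            slots.extend([server] * weight)
        if not slots:
            raise ValueError("at least one server is required")
        self._slots = cycle(slots)

    def next_server(self):
        """Return the server that should receive the next request."""
        return next(self._slots)
```

With weights {"big": 3, "medium": 2, "small": 1}, every six requests are split three, two and one between the servers.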

Least Response Time

This algorithm sends client requests to the server with the fewest active connections and the lowest average response time. In effect, the backend server that is currently responding the fastest receives the next request.
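How the two signals are combined differs between products, so the sketch below makes an explicit assumption: it scores each server as (active connections + 1) multiplied by its average response time and picks the lowest score. The class name, the scoring formula and the choice to try unmeasured servers first are all illustrative.

```python
class LeastResponseTimeBalancer:
    """Pick the server with the lowest (active + 1) * average-ms score."""

    def __init__(self, servers):
        self.active = {s: 0 for s in servers}      # open connections
        self.total_ms = {s: 0.0 for s in servers}  # sum of response times
        self.samples = {s: 0 for s in servers}     # completed requests

    def _score(self, server):
        n = self.samples[server]
        # Servers with no measurements score 0, so they are tried first.
        avg_ms = self.total_ms[server] / n if n else 0.0
        return (self.active[server] + 1) * avg_ms

    def acquire(self):
        """Choose a server for a new request and count the connection."""
        server = min(self.active, key=self._score)
        self.active[server] += 1
        return server

    def release(self, server, elapsed_ms):
        """Record that a request finished and how long it took."""
        self.active[server] -= 1
        self.total_ms[server] += elapsed_ms
        self.samples[server] += 1
```

A fast server keeps receiving requests until its open connections push its score above that of a slower but idle server.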

Least Connections

In this method, each new request is sent to the server that currently has the fewest active connections.
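A minimal Python sketch of least connections follows. The class and method names are made up for this example; it assumes the balancer is told when a connection opens (acquire) and when it closes (release).

```python
class LeastConnectionsBalancer:
    """Send each new request to the server with the fewest open connections."""

    def __init__(self, servers):
        self.active = {server: 0 for server in servers}

    def acquire(self):
        """Choose a server for a new connection and count it."""
        # Ties go to the server listed first.
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server):
        """Record that a connection to `server` has closed."""
        if self.active[server] > 0:
            self.active[server] -= 1
```

Unlike Round Robin, this adapts when some requests hold their connections much longer than others.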

IP Hash

In this method, a hash of the client's source IP address is generated and used to select a server. Once a server is allocated, the same server is used for that client's subsequent requests. This works like a sticky session: a client's requests are sent to the same server regardless of how busy that server is. In some use cases, for example when a server caches per-client data, this method is very handy and can even improve performance.
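The mapping from IP address to server can be sketched as below. The function name pick_by_ip is hypothetical, and SHA-256 with a modulo is just one assumed way to do it. A stable hash from hashlib is used rather than Python's built-in hash(), because the built-in one is randomized per process for strings.

```python
import hashlib


def pick_by_ip(client_ip, servers):
    """Map a client IP address to a server, always the same one."""
    if not servers:
        raise ValueError("at least one server is required")
    digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an integer index.
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]
```

Note that with this simple modulo scheme, adding or removing a server changes the mapping for many clients.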

Conclusion

Availability is a key characteristic of a distributed system. If one server fails, the end-user experience is largely unaffected because the load balancer simply sends client requests to another healthy server.

While designing a distributed system, one of the important tasks is to choose a load balancing strategy that fits the requirements of the application or website.

HAProxy (High Availability Proxy) is open source proxy and load balancing server software. It provides high availability at the network (TCP) and application (HTTP/S) layers, improving speed and performance by distributing workload across multiple servers.

Nginx is a very efficient HTTP load balancer to distribute traffic to several application servers and to improve performance, scalability and reliability of web applications.

The Bhargav Journal

Monday, July 19, 2021