
Monday, July 19, 2021

Understanding Load Balancer

[Image: Load balancer. Photo by Jon Flobrant]

A load balancer is an important component of any distributed system. It helps to distribute the client requests within a cluster of servers to improve the responsiveness and availability of applications or websites.

It distributes workloads uniformly across servers or other compute resources to optimize network efficiency, reliability and capacity. Load balancing is performed by an appliance, either physical or virtual, that identifies in real time which server (or pod, in the case of Kubernetes) in a pool can best serve a given client request, while ensuring that heavy network traffic doesn't overwhelm any single server or pod. Another important task of a load balancer is to carry out continuous health checks on servers (or pods) to ensure they can handle requests. It makes better use of system resources by balancing user requests and helps keep the service highly available.
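The health-check idea can be sketched in a few lines of Python. This is a minimal illustration, not a production implementation: the server addresses and the /health endpoint name are assumptions, and real load balancers run these probes on a schedule with retry and ejection policies.

```python
import urllib.request

# Hypothetical backend pool; /health is an assumed endpoint name.
SERVERS = ["http://10.0.0.1:8080", "http://10.0.0.2:8080"]

def healthy(base_url: str, timeout: float = 2.0) -> bool:
    """Return True if the server answers its health endpoint with HTTP 200."""
    try:
        with urllib.request.urlopen(base_url + "/health", timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # Connection refused, DNS failure, timeout: treat the server as down.
        return False

def healthy_pool(servers):
    """Keep only the servers that currently pass the health check."""
    return [s for s in servers if healthy(s)]
```

Requests are then routed only to the servers returned by `healthy_pool`, so a failed backend drops out of rotation automatically.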

[Image: Reverse proxy/load balancer communication flow]

During system design, horizontal scaling is a very common strategy for scaling a system whose user base is large. It also improves the overall throughput of the application or website. Latencies occur less often because requests are not blocked, and users need not wait for their requests to be processed.

Availability is a key characteristic of any distributed system. In case of a full server failure, there won't be any impact on the user experience, as the load balancer will simply send the client request to a healthy server. Instead of a single resource taking the whole load, the load balancer ensures that each of several resources performs a bearable amount of work.


Categories of Load Balancer

Layer 4 Category Load Balancer

These load balancers distribute traffic based on transport-layer data, such as IP addresses and Transmission Control Protocol (TCP) port numbers. Examples - Network Load Balancer in AWS and Internal Load Balancer in GCP.

Layer 7 Category Load Balancer

These load balancers make routing decisions based on application-layer characteristics, such as HTTP header information or the actual contents of the message (URLs, cookies, etc.). Examples - Application Load Balancer in AWS and Global Load Balancer in GCP.


Types of Load Balancing

Hardware Load Balancing Type

Vendors of hardware-based solutions load proprietary software onto the machines they provide, which often use specialized components or resources. To handle increasing traffic to the application or website, one has to buy additional hardware from the vendor. Example - F5 load balancers from F5 Networks.

Software Load Balancing Type

Software solutions generally run on regular hardware, making them economical and more flexible. You can install the software on the hardware of your choice or in cloud environments like AWS, GCP, Azure etc.


Load Balancing Techniques

There are various types of load balancing methods and every type uses different algorithms for distributing the requests. Here is a list of load balancing techniques:

Random Selection

As the name suggests, the servers are selected at random; no other factors are considered in the selection. This method can cause a problem where some servers get overloaded with requests while others sit idle.
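Random selection is a one-liner in most languages. A minimal Python sketch (the server names are placeholders):

```python
import random

servers = ["app-server-1", "app-server-2", "app-server-3"]  # hypothetical pool

def pick_random(pool):
    """Random selection: each request picks a server uniformly at random."""
    return random.choice(pool)
```

Over many requests the distribution evens out statistically, but nothing prevents short runs of requests landing on the same server.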

Round Robin

One of the most commonly used load balancing methods. The load balancer redirects incoming traffic across a set of servers in a fixed order. For example, with 3 application servers, the first request goes to App Server 1, the second goes to App Server 2, and so on. When the load balancer reaches the end of the server list, it starts over from the beginning, i.e. from App Server 1. It balances the traffic almost evenly between the servers. All servers need to be of the same specification for this method to work well; otherwise, a low-specification server may receive the same load as a server with high processing capacity.
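The rotation described above can be sketched with a cyclic iterator in Python (server names are placeholders):

```python
import itertools

servers = ["app-server-1", "app-server-2", "app-server-3"]  # hypothetical pool
rr = itertools.cycle(servers)

def next_server():
    """Round robin: hand out servers in order, wrapping back to the start."""
    return next(rr)

# First four requests go to:
# app-server-1, app-server-2, app-server-3, app-server-1
```

A real balancer would guard the iterator with a lock or an atomic counter, since many requests advance it concurrently.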

Weighted Round Robin

It's a bit more complex than Round Robin, as this method is designed to handle servers with different characteristics. A weight is assigned to each server in the configuration, typically an integer value that varies according to the server's specifications. Higher-specification servers get more weight, and the weight is the key parameter for traffic redirection.
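One simple way to realize weighted round robin is to repeat each server in the cycle in proportion to its weight. This is only an illustrative sketch with hypothetical names and weights; production balancers such as Nginx use a smoother interleaving, but the proportions are the same.

```python
import itertools

# Hypothetical weights: app-big can take twice the traffic of app-small.
weighted_pool = [("app-big", 2), ("app-small", 1)]

# Repeat each server `weight` times, then round robin over the expanded list.
expanded = [name for name, weight in weighted_pool for _ in range(weight)]
wrr = itertools.cycle(expanded)

# Sequence: app-big, app-big, app-small, app-big, app-big, app-small, ...
```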

Least Response Time

This algorithm sends the client requests to the server with the least active connections and the lowest average response time. The backend server that responds the fastest receives the next request.

Least Connections

In this method, the traffic redirection happens based on the server with the least number of active connections.
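The selection step reduces to a minimum over live counters. A minimal sketch, assuming the balancer already tracks active connections per server (names and counts here are hypothetical):

```python
# Hypothetical live connection counts maintained by the balancer.
active_connections = {"app-server-1": 12, "app-server-2": 4, "app-server-3": 9}

def least_connections(conn_counts):
    """Pick the server with the fewest active connections."""
    return min(conn_counts, key=conn_counts.get)
```

With the counts above, `least_connections(active_connections)` selects `app-server-2`; the counter is incremented when a connection opens and decremented when it closes.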

IP Hash

In this method, a hash of the source (client) IP address is generated and used to select a server. Once a server is allocated, the same server will be used for that client's subsequent requests. This makes the assignment sticky: a client's requests are always sent to the same server, irrespective of how busy that server is. In some use cases, such as when servers hold per-client session state or caches, this method comes in very handy and can even improve performance.
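The stickiness follows from the hash being deterministic: the same IP always maps to the same pool index. A sketch in Python (the choice of SHA-256 and the server names are illustrative; real balancers use various hash functions):

```python
import hashlib

servers = ["app-server-1", "app-server-2", "app-server-3"]  # hypothetical pool

def server_for(client_ip: str, pool):
    """IP hash: deterministically map a client IP to one server in the pool."""
    digest = hashlib.sha256(client_ip.encode()).digest()
    index = int.from_bytes(digest[:4], "big") % len(pool)
    return pool[index]
```

Note that with this simple modulo scheme, changing the pool size remaps most clients; consistent hashing is the usual remedy when servers come and go frequently.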


Conclusion

Availability is a key characteristic of a distributed system. If one server fails, the end-user experience is not affected, as the load balancer simply sends the client request to another healthy server.

While designing a distributed system, one of the important tasks is to choose the load balancing strategy according to the application's or website's requirements.

HAProxy (High Availability Proxy) is open-source proxy and load balancing server software. It provides high availability at the network (TCP) and application (HTTP/S) layers, improving speed and performance by distributing workload across multiple servers.

Nginx is a very efficient HTTP load balancer to distribute traffic to several application servers and to improve performance, scalability and reliability of web applications.
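As an illustration, a minimal Nginx configuration sketch that ties together several of the techniques above (the backend addresses are hypothetical; round robin is Nginx's default, while the `least_conn` or `ip_hash` directives select those methods instead):

```nginx
# Hypothetical backend pool; round robin by default.
upstream app_backend {
    least_conn;                      # or: ip_hash;
    server 10.0.0.1:8080 weight=2;   # higher-spec server gets more traffic
    server 10.0.0.2:8080;
}

server {
    listen 80;
    location / {
        proxy_pass http://app_backend;
    }
}
```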


Thursday, July 8, 2021

Understanding Forward and Reverse Proxy

[Image: Forward and reverse proxy servers. Photo courtesy Unsplash]

What is Forward Proxy Server?

A forward proxy, often called a proxy server or web proxy, is a server that sits in front of a group of machines on the same network. When these computers make requests to sites and services external to the network or on the Internet, the proxy server intercepts those requests and then communicates with the web servers on behalf of those clients, like a middleman. By doing so, it can control traffic according to custom policies, convert and mask client IP addresses, enforce security protocols, and block unknown traffic.

[Image: Direct communication flow]

The diagram above shows the standard communication flow, where a laptop or phone reaches out directly to website AZ: the client sends requests to the server, and the server responds to them.

[Image: Forward proxy communication flow]

When we have a forward proxy (FP) in place, the laptop, phone or desktop will instead send requests to FP, which will then forward them to website AZ. Website AZ will send its response to FP, which will in turn forward the response back to the client.

Reasons for Forward Proxy:

Restricted Browsing

Governments, schools and organizations use proxy servers to give their users access to a limited version of the Internet. A forward proxy can be used to implement these restrictions, as it lets user requests go through the proxy rather than directly to the websites.

To Block Specific Content

Proxies can also be set up to block a group of users from accessing certain websites. An office network might be configured to connect to the web through a proxy that enables content filtering, blocking requests to social media or online streaming sites.


What is Reverse Proxy Server?

A reverse proxy is a server that sits in front of one or more web servers, intercepting requests from clients. This is different from a forward proxy, where the proxy sits in front of the clients. With a reverse proxy, when clients send requests to the origin server of a website, those requests are intercepted at the network edge by the reverse proxy server. The reverse proxy server will then send requests to and receive responses from the origin server. Unlike a traditional proxy server, which is used to protect clients, a reverse proxy is used to protect servers.

[Image: Reverse proxy communication flow]

A reverse proxy effectively serves as a gateway between clients, users, and application servers. It handles all the access policy management and traffic routing, and it protects the identity of the server that actually processes the request.

Reasons for Reverse Proxy:

Load Balancing

A reverse proxy server can act as a traffic cop, sitting in front of your backend servers and distributing client requests across a group of servers in a manner that maximizes speed and capacity utilization while ensuring no one server is overloaded, which can degrade performance. In the event that a server fails completely, other servers can step up to handle the traffic.

Global Server Load Balancing

In this form of load balancing, a website can be distributed on several servers around the globe and the reverse proxy will send clients to the server that’s geographically closest to them. This decreases the distances that requests and responses need to travel, minimizing load times.

SSL Encryption

Encrypting and decrypting SSL (or TLS) communications for each client can be computationally expensive for an origin server. A reverse proxy can be configured to decrypt all incoming requests and encrypt all outgoing responses, freeing up valuable resources on the origin server.

Protection from Attacks

With a reverse proxy in place, a website or service never needs to reveal the identity of its origin servers. It also acts as an additional line of defense against security attacks, making it much harder for attackers to mount a targeted attack against those servers, such as a DDoS attack.

Caching

Reverse proxies can compress inbound and outbound data, as well as cache commonly requested content, which boosts the performance of traffic between clients and servers.


Superior Compression

Server responses use up a lot of bandwidth. Compressing server responses (e.g. with gzip) before sending them to the client can reduce the amount of bandwidth required, speeding up server responses over the network.
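The bandwidth saving is easy to see with Python's standard gzip module; this sketch simply compresses a repetitive hypothetical HTML body the way a proxy would before sending it to a client that advertises gzip support:

```python
import gzip

def compress_response(body: bytes) -> bytes:
    """Gzip a response body before sending it to the client."""
    return gzip.compress(body)

# A hypothetical, highly repetitive HTML response.
html = b"<html>" + b"hello world " * 500 + b"</html>"
compressed = compress_response(html)
# Repetitive markup compresses to a small fraction of its original size.
```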


Monitoring and Logging Traffic

A reverse proxy captures any requests that go through it. Hence, you can use them as a central hub to monitor and log traffic. Even if you use multiple web servers to host all your website’s components, using a reverse proxy will make it easier to monitor all the incoming and outgoing data from your site.

Cloudflare, Amazon CloudFront, Akamai, StackPath, DDoS-Guard, CDNetworks, etc. are some of the well-known reverse proxies available in the market.
