Wednesday, March 24, 2021

Latency and Fault Tolerance for Distributed Systems

[Image: Hystrix by Netflix]

Microservices

Also known as the microservices architecture, it is an architectural style that structures an application as a collection of services that are:

  • Highly Maintainable and Testable
  • Loosely Coupled
  • Independently Deployable
  • Organized around Business Capabilities
  • Owned by a Small Team

The microservices architecture enables the rapid, frequent and reliable delivery of large, complex applications. It also enables an organization to evolve its technology stack.

The decentralization of business logic increases flexibility and, most importantly, decouples the dependencies between components. This is one of the major reasons why many companies are moving from a monolithic architecture to a microservices architecture.

What's Hystrix?

Hystrix is designed to do the following:

  • Give protection from and control over latency and failure from dependencies accessed (typically over the network) via third-party client libraries.
  • Stop cascading failures in a complex distributed system.
  • Fail fast and rapidly recover.
  • Fallback and gracefully degrade when possible.
  • Enable near real-time monitoring, alerting, and operational control.

In simple terms, it isolates external dependencies so that a failure in one cannot affect the others.

Hystrix provides two isolation strategies, thread-based and semaphore-based isolation; which one works best depends on your requirements.

Thread Based Isolation

In this isolation method, the external call is executed on a separate thread (a non-application thread), so that the application thread is not affected by anything that goes wrong with the external call.

Hystrix uses the bulkhead pattern. In general, the goal of the bulkhead pattern is to prevent faults in one part of a system from taking the entire system down. The term comes from shipbuilding, where a ship is divided into separate watertight compartments so that a single hull breach cannot flood the entire ship; it will only flood one compartment.

[Image: Hystrix Thread Isolation]

Hystrix uses a thread pool per dependency for isolation. For every call, only threads from the pool of the corresponding dependency are used. This ensures that only the pool of the failing dependency gets exhausted, leaving the others unaffected. The maximum number of concurrent requests allowed is determined by the thread-pool size.

It’s very important to set the right thread-pool size for each dependency. A very small pool may send requests into fallback even though the downstream is responding in time. On the other hand, a very large pool may cause all the threads to block if the downstream is running with high latencies, degrading the application’s performance.

How to find the thread-pool size?

As mentioned in the Hystrix documentation, the correct thread-pool size can be computed with the following formula:

RPS = requests per second at which we are calling the downstream.
P99 = the 99th-percentile latency of the downstream (in seconds).
Pool Size = (RPS * P99) + some breathing room to cater to spikes in RPS.

Let’s take an example: the cart service depends on the product service to get product details. Assume cart calls the product service at 30 RPS, and the product service’s P99 latency is 200 ms, P99.5 is 300 ms, and the median is 50 ms.

30 RPS x 0.2 seconds = 6 + breathing room = 10

In the above case the thread-pool size is defined as 10. It is sized at 10 to handle a burst of requests at 99th-percentile latency, but when everything is healthy this pool will typically have only 1 or 2 active threads at any given time, serving mostly 50 ms median calls. It also ensures that at most 10 requests are processed concurrently at any point in time; any concurrent request beyond the 10th goes directly into the fallback mechanism.

Hystrix also provides an option to configure the queue size, so that requests are queued up to a certain number after the thread pool is filled. The queued requests are processed as soon as threads become free. If a request arrives when the queue is also full, it goes directly into the fallback mechanism.
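As a minimal sketch of how these limits might be configured with the Hystrix Javanica annotations the article already uses: the class, method, endpoint, and pool key below are hypothetical, and the sizes mirror the cart/product example above rather than being recommendations.

import com.netflix.hystrix.contrib.javanica.annotation.HystrixCommand;
import com.netflix.hystrix.contrib.javanica.annotation.HystrixProperty;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Service;
import org.springframework.web.client.RestTemplate;

@Service
public class ProductClient {

    @Autowired
    private RestTemplate restTemplate;

    @HystrixCommand(
            fallbackMethod = "getProductFallback",
            threadPoolKey = "productServicePool", // dedicated pool for this dependency
            threadPoolProperties = {
                    @HystrixProperty(name = "coreSize", value = "10"),    // max concurrent calls
                    @HystrixProperty(name = "maxQueueSize", value = "5"), // queued once the pool is full
                    @HystrixProperty(name = "queueSizeRejectionThreshold", value = "5")
            })
    public String getProduct(String productId) {
        // beyond 10 concurrent calls plus 5 queued requests,
        // further requests are rejected and served by the fallback below
        return restTemplate.getForObject(
                "http://localhost:9191/api/products/{id}", String.class, productId);
    }

    public String getProductFallback(String productId) {
        return "Product details are temporarily unavailable for " + productId;
    }
}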

Thread-based isolation also gives added protection through timeouts. If a thread in the Hystrix thread pool waits for a response for more than a specified time (the timeout), the request corresponding to that thread is served by the fallback mechanism. So thread-based isolation provides two layers of protection: concurrency and timeout.

To summarise, the concurrency limit is defined such that if incoming requests exceed the limit, it implies the downstream is not in a healthy state and is suffering from high latencies. Likewise, if requests take longer than the timeout to respond, it is again an indication that the downstream is unhealthy because its latency is high.

Semaphore Based Isolation

In this method, the external call runs on the application thread, and the number of concurrent calls is limited by the semaphore count defined in the configuration.

The timeout functionality is not available with this method, since the external call runs on the application thread and Hystrix will not interfere with a thread pool that doesn’t belong to it. Hence, there is only one layer of protection, i.e., the concurrency limit.

You might wonder, what is the advantage of using semaphore based isolation?

Limiting concurrency is simply a way of throttling load when a downstream is running with high latencies, and semaphore-based isolation gives us exactly that. Please note that thread-based isolation adds an inherent computational overhead: each command execution involves the queueing, scheduling, and context switching required to run the command on a separate thread. None of this is required with semaphores, making them computationally much lighter than threads.

Generally, semaphore isolation should be used when the external call rate is so high that the overhead of thread scheduling becomes very costly, or when the downstream is trusted to respond or fail quickly, so that our application threads do not get stuck. The logic for sizing the semaphore is the same as that used for sizing the thread pool, but the overhead with semaphores is much lower and results in faster execution.
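For comparison, here is a minimal sketch of switching a command to semaphore isolation, reusing the hypothetical ProductClient from the earlier sketch; the limit of 10 mirrors the earlier pool size and is purely illustrative.

@HystrixCommand(
        fallbackMethod = "getProductFallback",
        commandProperties = {
                // run on the calling (application) thread instead of a Hystrix pool
                @HystrixProperty(name = "execution.isolation.strategy", value = "SEMAPHORE"),
                // at most 10 concurrent executions; further requests go to the fallback
                @HystrixProperty(name = "execution.isolation.semaphore.maxConcurrentRequests", value = "10")
        })
public String getProduct(String productId) {
    return restTemplate.getForObject(
            "http://localhost:9191/api/products/{id}", String.class, productId);
}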

The Catch with Thread Isolation

As you know by now, if a downstream call takes longer than the defined timeout to return a result, the caller is served by the defined fallback. Please note that this fallback is not served by the Hystrix thread that was created for the external call; in fact, that thread keeps waiting until it gets a response or an exception from the downstream application. We can’t force the latent thread to stop its work; the best Hystrix can do is throw an InterruptedException. If the implementation wrapped by Hystrix doesn’t respect InterruptedException, the thread will continue its work even though the caller has already received the fallback response.

We can handle this situation by setting the read timeout as close as possible to the Hystrix thread timeout. Once the read timeout passes, the external call receives an exception, which marks the Hystrix thread’s task as completed and returns the thread to the pool.

To validate the above scenarios, I have created two microservices, one named rest-producer and the other rest-consumer.

rest-producer microservice

This service exposes a simple API which takes a username as a path variable, prefixes "Welcome" to it, and returns it to the caller.

import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
@RequestMapping("/api/users")
public class UserController {

    @GetMapping("/{username}")
    public String getWelcomeMessage(@PathVariable("username") String userName) {
        try {
            // intentionally delay the response by 4 seconds to simulate a slow downstream
            Thread.sleep(4000);
        } catch (Exception ex) {
            ex.printStackTrace();
        }
        StringBuilder stringBuilder = new StringBuilder()
                .append("Welcome ").append(userName).append(" !\n");
        return stringBuilder.toString();
    }
}

As usual, you can find the source on GitHub for the rest-producer service.

rest-consumer microservice

This service exposes a simple API /api/greet/{username} which internally calls a service that invokes the rest-producer API /api/users/{username} using RestTemplate.

import lombok.extern.slf4j.Slf4j;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

@Slf4j
@RestController
@RequestMapping("/api/greet")
public class GreetingController {

    @Autowired
    private GreetingService greetingService;

    @GetMapping("/{username}")
    public String getGreetingMessage(@PathVariable("username") String userName) {
        try {
            // log the application (servlet) thread before the Hystrix-wrapped call
            log.warn("Main Thread: {} ({})",
                    Thread.currentThread().getName(),
                    Thread.currentThread().getId());
            return greetingService.getMessage(userName);
        } finally {
            // and again after the call returns (or falls back)
            log.warn("Main Thread: {} ({})",
                    Thread.currentThread().getName(),
                    Thread.currentThread().getId());
        }
    }
}

RestTemplate Bean

@Bean
public RestTemplate restTemplate() {
    HttpComponentsClientHttpRequestFactory requestFactory
            = new HttpComponentsClientHttpRequestFactory();
    requestFactory.setConnectTimeout(5000); // 5 s to establish the connection
    requestFactory.setReadTimeout(5000);    // 5 s to wait for the response
    return new RestTemplate(requestFactory);
}

GreetingService Implementation

import com.netflix.hystrix.contrib.javanica.annotation.HystrixCommand;
import com.netflix.hystrix.contrib.javanica.annotation.HystrixProperty;
import lombok.extern.slf4j.Slf4j;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.core.ParameterizedTypeReference;
import org.springframework.http.HttpMethod;
import org.springframework.stereotype.Service;
import org.springframework.web.client.RestTemplate;

@Slf4j
@Service
public class GreetingService {

    @Autowired
    private RestTemplate restTemplate;

    // the Hystrix thread timeout (3 s) is deliberately lower than the
    // RestTemplate read timeout (5 s) to demonstrate the catch described above
    @HystrixCommand(fallbackMethod = "fallback_getMessage",
            commandProperties = {
            @HystrixProperty(name = "execution.isolation.thread.timeoutInMilliseconds", value = "3000")
    })
    public String getMessage(String userName) {
        // this runs on a Hystrix thread, not the servlet thread
        log.warn("Hystrix Thread: {} ({})",
                Thread.currentThread().getName(),
                Thread.currentThread().getId());
        String greetMessage = restTemplate.exchange("http://localhost:9191/api/users/{username}",
                HttpMethod.GET, null,
                new ParameterizedTypeReference<String>() {}, userName).getBody();
        log.warn("Hystrix Thread: {} ({}), Msg: {}",
                Thread.currentThread().getName(),
                Thread.currentThread().getId(), greetMessage);
        return greetMessage;
    }

    public String fallback_getMessage(String userName) {
        // served when getMessage times out or fails
        log.warn("Hystrix Fallback Thread: {} ({})",
                Thread.currentThread().getName(),
                Thread.currentThread().getId());
        return "Fallback Message " + userName + " !";
    }
}

If you notice, I have set the connection timeout to 5 seconds, the read timeout to 5 seconds, and the Hystrix thread timeout to 3 seconds. The downstream API has intentionally been made to respond after 4 seconds, which will demonstrate that the call from rest-consumer is served by the fallback, because the Hystrix thread timeout is 3 seconds. It’s time to hit the rest-consumer API with cURL or Postman.

curl -X GET 'http://localhost:9090/api/greet/Rogers'

You should see the following response:

Fallback Message Rogers !

Please take a look at the log messages on the rest-consumer service.

2021-02-13 12:12:34.945  WARN 28673 --- [nio-9090-exec-1] c.b.h.controller.GreetingController      : Main Thread: http-nio-9090-exec-1 (21)
2021-02-13 12:12:35.284  WARN 28673 --- [GreetingService-1] c.b.hystrix.service.GreetingService      : Hystrix Thread: hystrix-GreetingService-1 (42)
2021-02-13 12:12:38.279  WARN 28673 --- [ HystrixTimer-1] c.b.hystrix.service.GreetingService      : Hystrix Fallback Thread: HystrixTimer-1 (41)
2021-02-13 12:12:38.283  WARN 28673 --- [nio-9090-exec-1] c.b.h.controller.GreetingController      : Main Thread: http-nio-9090-exec-1 (21)
2021-02-13 12:12:39.394  WARN 28673 --- [GreetingService-1] c.b.hystrix.service.GreetingService      : Hystrix Thread: hystrix-GreetingService-1 (42), Msg: Welcome Rogers !

As you can see, the main thread with ID 21 is working in the controller, and Hystrix spawns a thread with ID 42 to make the external call to the downstream service. Thread 42 then waits until it gets a response from the downstream service. But the thread timeout is set to 3 seconds, so Hystrix launches another thread with ID 41, which supplies the fallback message to the controller; you can see that the main thread with ID 21 returns the fallback response to the caller.

Observe the last log statement carefully: it is printed after 4 seconds, when thread 42, which was in a waiting state, finally receives the actual response from the downstream and logs it. But it is of no use, as the original request has already been answered with the fallback message. This proves that the Hystrix thread used to make an external call stays blocked until it gets a response or an exception from the downstream service. Hence it is very important to set the HTTP read timeout and the Hystrix thread timeout to the same value, so that we avoid blocking the Hystrix thread and release it back to the pool; otherwise the Hystrix thread pool will be exhausted under high traffic.
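A minimal sketch of that fix, keeping the 3-second Hystrix timeout from the example and lowering the RestTemplate read timeout to match:

@Bean
public RestTemplate restTemplate() {
    HttpComponentsClientHttpRequestFactory requestFactory
            = new HttpComponentsClientHttpRequestFactory();
    requestFactory.setConnectTimeout(5000);
    // align the read timeout with execution.isolation.thread.timeoutInMilliseconds (3000 ms),
    // so the waiting thread receives a timeout exception and returns to the pool
    requestFactory.setReadTimeout(3000);
    return new RestTemplate(requestFactory);
}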

As usual, you can find the source on GitHub for the rest-consumer service.

With great decentralization comes a great need for resilient fault tolerance. As Netflix says, fault tolerance is not a feature, it’s a requirement.

Conclusion

Simply wrapping Hystrix around external calls does not guarantee effective fault tolerance. It is often observed that the Hystrix settings we discussed above, like thread-pool size, isolation strategy, maximum concurrency limit, and read timeout, are not configured explicitly and are left at their default values. As explained, configuring them per your requirements is very important to ensure effective fault tolerance for an application.

Configuration Reference

Alternative to Hystrix


Monday, March 8, 2021

JSON Web Token (JWT)

[Image: JSON Web Token]

What is JWT? How does it work? How can it keep our applications secure?

JSON Web Tokens have become the favourite choice of modern developers implementing user authentication. Let’s understand what JWT is and how it works, specifically in the context of securing web applications.

There is an open industry standard, RFC 7519, that outlines how a JWT should be structured and how to use it for exchanging information between parties as JSON objects.

Authentication is basically what happens when users sign in: we check the user’s identity based on credentials like username and password.

Authorization, on the other hand, checks whether the validated user is allowed to access specific modules.

There are multiple ways web applications can manage sessions, and two of the popular ones are based on tokens.

Session Tokens

In this mechanism, the server creates a session for the user after the user is successfully authenticated. The session has a unique identifier, which is stored as a cookie in the user’s browser. While the user stays logged in, the cookie is sent along with every subsequent request.

The server parses the cookie, compares the session ID against the session information stored in memory or in a data store to verify the user’s identity, and provides the user context to the application.
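A minimal sketch of this flow in a Spring MVC controller (the endpoints and attribute names are illustrative, and the credential check is omitted):

import javax.servlet.http.HttpSession;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class SessionAuthController {

    @PostMapping("/login")
    public String login(@RequestParam String username, HttpSession session) {
        // after verifying credentials (omitted), the server-side session stores the user;
        // the container sends the session ID to the browser as a JSESSIONID cookie
        session.setAttribute("user", username);
        return "Logged in";
    }

    @GetMapping("/profile")
    public String profile(HttpSession session) {
        // on each request the cookie is parsed and the session ID is matched
        // against the session store to recover the user's identity
        Object user = session.getAttribute("user");
        return user != null ? "Hello " + user : "Not authenticated";
    }
}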

The biggest problem with this approach is that it assumes there is always just a single monolithic web application server. That was typically the case in the past, but it no longer holds now that we live in a microservices world.

There are multiple servers sharing the load behind a load balancer. When a request comes in, the load balancer decides which server to route it to. The user’s login request could be routed to one server, but the next request goes through the load balancer and may land on a different server, which has no idea about the previous interaction.

Being techies, we may find a solution: introduce a shared cache that all these servers use to persist and look up user session information. That solves the problem, but at the cost of extra infrastructure and a cache lookup on every request.

JSON Web Tokens

In this mechanism, the authentication server authenticates the user and generates a JWT containing all the required information, which is sent back to the client for later use. This is a more scalable solution because JWT is stateless: the user’s state is never stored on the server but is stored inside the token itself.

When the user makes a subsequent request to the application, the JWT needs to be sent along with the request. The application server is configured to check whether the incoming JWT is exactly what the authentication server created.
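Typically the token is sent in the Authorization header using the Bearer scheme; the endpoint below is illustrative:

curl -X GET 'http://localhost:8080/api/orders' -H 'Authorization: Bearer <JWT>'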

Let’s look at the format of the JWT to understand it better. A JSON Web Token consists of three sections separated by periods: header, payload, and signature.

[Image: JWT Representation]

Header

The header section typically contains two details: the type of token (JWT in this case) and the signing algorithm used, such as HMAC SHA256 (HS256) or RSA. The default algorithm is HS256.

Payload

The payload section contains the actual data pertaining to a user, which we call claims. Claims can be of three types:

Reserved Claims

These are some pre-defined claims which are not mandatory but are recommended as a best practice. They help the application judge the authenticity of the token. A few examples are iss (issuer), sub (subject), and exp (expiration time).

Public Claims

These can be defined as required by those using JWTs. Since they are public, they should be defined in the IANA JSON Web Token Registry to avoid collisions.

Private Claims

These are custom claims created to share information between parties that agree on using them, for example employment type or department name.

If you are interested in reading more about claims, you can do so here.

Signature

The signature is the most important part of a JSON Web Token. It is calculated by encoding the header and payload using Base64URL encoding, concatenating them with a period as separator, and running the result through the cryptographic algorithm. Remember that whenever the header or payload changes, the signature has to be calculated again.

// Signature Algorithm
jwtData = base64urlEncode(header) + "." + base64urlEncode(payload)
signature = HMAC-SHA256(jwtData, secret_salt)

// Token Generation
token = base64urlEncode(header) + "." + base64urlEncode(payload) + "." + base64urlEncode(signature)

eyJhbGciOiJIUzI1NiJ9.eyJzdWIiOiJ1c2VyIiwiaXNzIjoiQmhhcmdhdiBJbmMuIiwiZXhwIjoxNjEzOTM4Mzg3LCJpYXQiOjE2MTM5MjAzODd9.XblnkCqOUdtjLIg2pJcN_7gXUc7nSHIuXnBwin8hSeQ
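As an illustration, here is a minimal sketch of generating and verifying such a token in Java, assuming the jjwt library; the claim values mirror the sample token above, and the secret is made up.

import io.jsonwebtoken.Jwts;
import io.jsonwebtoken.SignatureAlgorithm;
import java.util.Date;

public class JwtDemo {
    public static void main(String[] args) {
        byte[] secret = "secret_salt".getBytes(); // shared secret; never hard-code in real apps

        // builds the {"alg":"HS256"} header, base64url-encodes header and payload,
        // and signs them with HMAC-SHA256
        String token = Jwts.builder()
                .setSubject("user")
                .setIssuer("Bhargav Inc.")
                .setIssuedAt(new Date())
                .setExpiration(new Date(System.currentTimeMillis() + 5 * 60 * 60 * 1000))
                .signWith(SignatureAlgorithm.HS256, secret)
                .compact();

        // parsing with the same key recomputes the signature and throws
        // an exception for tampered or expired tokens
        String subject = Jwts.parser()
                .setSigningKey(secret)
                .parseClaimsJws(token)
                .getBody()
                .getSubject();
        System.out.println(subject); // prints "user"
    }
}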


What’s next ?

Shhh! Let me tell you a secret. Go to the jwt.io website, copy the above JWT, and paste it into the encoded section of the online debugger. Voilà, you can see all the data stored in the token. Now you will have a question on your mind: what the heck, how is this secure? 🤔

Please note that JWTs are encoded, not encrypted. A JWT is a mechanism by which you can verify that the data has not been tampered with and has come from a trusted source.

The two open industry standards that describe the security features of JWT are RFC 7515 for JSON Web Signature and RFC 7516 for JSON Web Encryption.

JSON Web Signature

The purpose of a signature is to allow one or more parties to establish the authenticity of the JWT. As you may remember, the signature is basically the encoded header and payload concatenated with a period and run through a hashing algorithm with a secret key.

The signature attached at the end helps us determine whether the JWT has been tampered with, because any change in the data changes the signature. A signature, however, does not prevent third parties from reading the contents of the JWT.

JSON Web Encryption

JWS lets us establish the authenticity of the JWT’s contents, whereas JWE provides a way to keep those contents unreadable to third parties.

An encrypted JWT can use one of two cryptographic schemes: a shared-secret scheme or a public/private-key scheme.

Conclusion

JWT is a modern and robust solution for authenticating and authorising users and sharing sensitive information without maintaining state. A JWT is made of three parts: header, payload, and signature. Sending JWTs in cookies instead of in the header, shortening their expiration time, and using refresh tokens to issue new access tokens are some of the measures we can take to improve the security of our application, its users, and their data.
