Microservices
Microservices, also known as the microservices architecture, is an architectural style that structures an application as a collection of services that are:
- Highly Maintainable and Testable
- Loosely Coupled
- Independently Deployable
- Organized around Business Capabilities
- Owned by a Small Team
The microservices architecture enables the rapid, frequent and reliable delivery of large, complex applications. It also enables an organization to evolve its technology stack.
Decentralizing business logic increases flexibility and, most importantly, decouples components from one another. This is one of the major reasons why many companies are moving from a monolithic architecture to a microservices architecture.
What's Hystrix?
Hystrix is designed to do the following:
- Give protection from and control over latency and failure from dependencies accessed (typically over the network) via third-party client libraries.
- Stop cascading failures in a complex distributed system.
- Fail fast and rapidly recover.
- Fallback and gracefully degrade when possible.
- Enable near real-time monitoring, alerting, and operational control.
In simple terms, it isolates external dependencies from one another so that a failure in one does not affect the others.
Hystrix provides two isolation strategies, thread-based and semaphore-based. Which one to use depends on your requirements.
Thread Based Isolation
In this isolation method, the external call is executed on a separate thread (not the application thread), so that the application thread is not affected by anything that goes wrong with the external call.
Hystrix uses the bulkhead pattern. In general, the goal of the bulkhead pattern is to prevent a fault in one part of a system from taking the entire system down. The term comes from ships, where the hull is divided into separate watertight compartments so that a single breach does not flood the entire ship; only the breached compartment floods.
Hystrix uses a per-dependency thread pool for isolation, so every call uses only threads from the pool of the corresponding dependency. This ensures that only the pool of the failing dependency gets exhausted, leaving the others unaffected. The maximum number of concurrent requests allowed is determined by the configured thread-pool size.
It’s very important to set the right thread-pool size for each dependency. A very small pool may send requests into the fallback even though the downstream is responding in time. On the other hand, a very large pool may leave many threads blocked when the downstream is running with high latencies, thereby degrading application performance.
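To make the per-dependency pool idea concrete, here is a minimal sketch in plain Java. It is not how Hystrix is implemented internally; the class name Bulkheads, the method tryRun and the dependency names used with it are made up for illustration. Each dependency gets its own fixed-size pool with no queue, so a stuck dependency can only exhaust its own threads.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Illustrative only: one small thread pool per downstream dependency.
final class Bulkheads {
    private final Map<String, ExecutorService> pools = new ConcurrentHashMap<>();
    private final int poolSize;

    Bulkheads(int poolSize) {
        this.poolSize = poolSize;
    }

    /** Runs the task on the pool owned by the dependency; false means it was rejected. */
    boolean tryRun(String dependency, Runnable task) {
        ExecutorService pool = pools.computeIfAbsent(dependency, name ->
                // Fixed number of threads and no queue: a busy pool rejects immediately.
                new ThreadPoolExecutor(poolSize, poolSize, 0L, TimeUnit.MILLISECONDS,
                        new SynchronousQueue<>()));
        try {
            pool.execute(task);
            return true;
        } catch (RejectedExecutionException e) {
            // Pool exhausted: this is where a fallback would be served.
            return false;
        }
    }

    void shutdown() {
        pools.values().forEach(ExecutorService::shutdownNow);
    }
}
```

With a pool size of 2, two stuck calls to a hypothetical "product" dependency cause the third call to be rejected, while a call to "inventory" is still accepted because it runs on a different pool.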
How to find the thread-pool size?
As mentioned in the Hystrix documentation, a suitable thread-pool size can be estimated with the following formula:
RPS = Requests per second at which we call the downstream (at peak, when healthy).
P99 = The 99th percentile latency of the downstream, in seconds.
Pool Size = (RPS * P99) + some breathing room to cater for spikes in RPS.
Let’s take an example: the cart service depends on the product service to get product details. Assume the cart service calls the product service at 30 RPS, and the product service latencies are P99 = 200 ms, P99.5 = 300 ms and median = 50 ms.
30 RPS x 0.2 seconds = 6, plus breathing room = 10
In the above case the thread-pool size is defined as 10. It is sized at 10 to handle a burst of 99th percentile requests, but when everything is healthy this thread pool will typically have only 1 or 2 active threads at any given time, serving mostly 50 ms median calls. It also ensures that at most 10 requests are processed concurrently at any point in time; any concurrent request beyond the 10th goes directly into the fallback mechanism.
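The arithmetic above can be written as a tiny helper. This is only a sketch: the class and method names are made up, and the breathing room of 4 threads is simply the value that takes the example from 6 to 10.

```java
// Illustrative helper for the sizing formula: (RPS x P99) + breathing room.
final class PoolSizing {

    /**
     * @param requestsPerSecond rate at which the downstream is called
     * @param p99Millis         99th percentile latency of the downstream, in milliseconds
     * @param breathingRoom     extra threads kept for spikes in RPS
     */
    static int threadPoolSize(int requestsPerSecond, int p99Millis, int breathingRoom) {
        // Requests in flight at P99 latency, rounded up to a whole thread.
        long inFlight = ((long) requestsPerSecond * p99Millis + 999) / 1000;
        return (int) inFlight + breathingRoom;
    }

    public static void main(String[] args) {
        // Cart -> product service: 30 RPS, P99 = 200 ms, 4 threads of breathing room.
        System.out.println(threadPoolSize(30, 200, 4)); // prints 10
    }
}
```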
Hystrix also provides an option to configure a queue size, so that requests are queued up to a certain number once the thread pool is full. The queued requests are processed as soon as threads become free. If the number of waiting requests exceeds the queue size, further requests go directly into the fallback mechanism.
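The interplay of pool size and queue size can be tried out with a plain JDK ThreadPoolExecutor. The sketch below is an illustration under that assumption, not Hystrix code, and the names are hypothetical. Every request blocks until the end of the experiment, so requests beyond pool size plus queue size are rejected, which is the point where a fallback would be served.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Illustrative only: a bounded pool with a bounded queue in front of it.
final class BoundedPoolDemo {

    /** Fires the given number of simultaneous slow requests and counts the rejected ones. */
    static int countFallbacks(int poolSize, int queueSize, int requests) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                poolSize, poolSize, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(queueSize));
        CountDownLatch release = new CountDownLatch(1);
        int fallbacks = 0;
        try {
            for (int i = 0; i < requests; i++) {
                try {
                    // Each request simulates a latent downstream by waiting on the latch.
                    pool.execute(() -> {
                        try {
                            release.await();
                        } catch (InterruptedException e) {
                            Thread.currentThread().interrupt();
                        }
                    });
                } catch (RejectedExecutionException e) {
                    // Threads and queue are both full: serve the fallback instead.
                    fallbacks++;
                }
            }
        } finally {
            release.countDown();
            pool.shutdown();
        }
        return fallbacks;
    }
}
```

For example, countFallbacks(10, 5, 18) returns 3: ten requests occupy the threads, five wait in the queue, and the remaining three are rejected.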
Thread-based isolation also gives added protection through timeouts. If a thread in the Hystrix thread pool waits for a response for longer than a specified time (the timeout configured for the command), the request corresponding to that thread is served by the fallback mechanism. So thread-based isolation provides 2 layers of protection: concurrency and timeout.
To summarise, the concurrency limit is defined such that, if the incoming requests exceed it, the downstream is probably not in a healthy state and is suffering from high latencies. Likewise, if requests take longer than the timeout to respond, that is again an indication that the downstream is not healthy because its latency is high.
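The two layers of protection can be sketched with a plain ExecutorService. This is an assumption-laden illustration rather than Hystrix's implementation; TimeoutGuard and callWithTimeout are hypothetical names. A rejected submission stands for the concurrency limit, and a timed-out Future stands for the timeout.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Illustrative only: run a call on another thread and fall back on overload or timeout.
final class TimeoutGuard {

    static String callWithTimeout(ExecutorService pool, Callable<String> call,
                                  long timeoutMillis, String fallback) {
        Future<String> future;
        try {
            future = pool.submit(call);
        } catch (RejectedExecutionException e) {
            return fallback; // layer 1: the pool is saturated
        }
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            future.cancel(true); // layer 2: too slow, interrupt the worker and move on
            return fallback;
        } catch (ExecutionException e) {
            return fallback; // the call itself failed
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return fallback;
        }
    }
}
```

The caller's thread only ever waits for timeoutMillis, regardless of how long the wrapped call takes.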
Semaphore Based Isolation
In this method, the external call runs on the application thread, and the number of concurrent calls is limited by the semaphore count defined in the configuration.
The timeout functionality is not available in this method, since the external call runs on the application thread and Hystrix will not interfere with a thread pool that doesn’t belong to it. Hence, there is only one level of protection, i.e., the concurrency limit.
You might wonder, what is the advantage of using semaphore based isolation?
Limiting concurrency is simply a way of throttling load when a downstream is running with high latencies, and semaphore-based isolation gives us that. Thread-based isolation, however, adds an inherent computational overhead: each command execution involves the queueing, scheduling and context switching needed to run a command on a separate thread. None of this is required with semaphores, which makes them computationally much lighter than threads.
Generally, semaphore isolation should be used when the external call rate is so high that the overhead of running each call on a separate thread becomes too costly, or when the downstream is trusted to respond or fail quickly, so that our application threads do not get stuck. The logic for sizing the semaphore is the same as the one used for sizing the thread pool, but the overhead with semaphores is much lower and results in faster execution.
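A semaphore-based guard is small enough to sketch in a few lines of plain Java. Again, this only illustrates the idea and is not Hystrix's own code; SemaphoreGuard and its call method are hypothetical names. The work runs on the caller's thread, and a call that cannot get a permit is sent straight to the fallback.

```java
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

// Illustrative only: limit concurrency on the calling thread with a semaphore.
final class SemaphoreGuard {
    private final Semaphore permits;

    SemaphoreGuard(int maxConcurrentCalls) {
        this.permits = new Semaphore(maxConcurrentCalls);
    }

    String call(Supplier<String> work, Supplier<String> fallback) {
        // tryAcquire never blocks: no permit means the limit is already reached.
        if (!permits.tryAcquire()) {
            return fallback.get();
        }
        try {
            // Runs on the caller's thread, so there is no hand-off and no timeout.
            return work.get();
        } finally {
            permits.release();
        }
    }
}
```

Note that nothing here can cut a slow call short: if work.get() hangs, the calling thread hangs with it, which is why this approach suits only downstreams that are trusted to respond or fail quickly.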
Catch with Thread Isolation
As you know by now, if a downstream call takes longer than the defined timeout to return its result, the caller is served by the defined fallback. Please note that this fallback is not served by the Hystrix thread that was created for the external call; in fact, that thread keeps waiting until it gets a response or an exception from the downstream application. We can’t force the latent thread to stop its work; the best Hystrix can do is to throw an InterruptedException at it. If the implementation wrapped by Hystrix doesn’t respect InterruptedException, the thread will continue its work even though the caller has already received the response from the fallback.
We can handle this situation by setting the read timeout of the external call as close as possible to the Hystrix thread timeout. Once the read timeout has passed, the external call fails with an exception, which marks the Hystrix thread's task as completed, and the thread is returned to the thread pool.
To validate the above scenarios, I have created two microservices: one named rest-producer and the other rest-consumer.
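The difference between work that respects interruption and work that ignores it can be reproduced without Hystrix. The sketch below is a plain JDK experiment with hypothetical names: a simulated 10 second downstream call is abandoned by the caller after 100 ms, and we check whether the worker thread was actually freed within the next second.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicBoolean;

// Illustrative only: what happens to the worker after the caller has timed out.
final class InterruptDemo {

    static boolean workerFreedAfterTimeout(boolean respectsInterrupt) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        CountDownLatch finished = new CountDownLatch(1);
        AtomicBoolean stop = new AtomicBoolean(false);
        Future<?> future = pool.submit(() -> {
            long deadline = System.nanoTime() + TimeUnit.SECONDS.toNanos(10);
            while (!stop.get() && System.nanoTime() < deadline) {
                try {
                    Thread.sleep(20); // stands in for waiting on the downstream
                } catch (InterruptedException e) {
                    if (respectsInterrupt) {
                        break; // well-behaved: give the thread back
                    }
                    // badly behaved: swallow the interrupt and keep working
                }
            }
            finished.countDown();
        });
        try {
            future.get(100, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            future.cancel(true); // the caller would be served the fallback here
        } catch (ExecutionException e) {
            // not expected in this sketch
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        try {
            return finished.await(1, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            stop.set(true); // end the experiment even for the stubborn worker
            pool.shutdownNow();
        }
    }
}
```

workerFreedAfterTimeout(true) returns true, while workerFreedAfterTimeout(false) returns false: the stubborn worker is still occupying its pool thread a full second after the caller has moved on.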
rest-producer microservice
This service exposes a simple API that takes a username as a path variable, prefixes it with "Welcome" and returns the result to the caller.
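import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
@RequestMapping("/api/users")
public class UserController {

    @GetMapping("/{username}")
    public String getWelcomeMessage(@PathVariable("username") String userName) {
        try {
            // Simulate a slow downstream: hold every request for 4 seconds.
            Thread.sleep(4000);
        } catch (InterruptedException ex) {
            // Restore the interrupt flag instead of swallowing it.
            Thread.currentThread().interrupt();
        }
        return "Welcome " + userName + " !\n";
    }
}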
As usual, you can find the source for the rest-producer service on GitHub.
rest-consumer microservice
This service exposes a simple API, /api/greet/{username}, which internally calls a service that invokes the rest-producer API /api/users/{username} using RestTemplate.
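import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
@RequestMapping("/api/greet")
public class GreetingController {

    private static final Logger log = LoggerFactory.getLogger(GreetingController.class);

    @Autowired
    private GreetingService greetingService;

    @GetMapping("/{username}")
    public String getGreetingMessage(@PathVariable("username") String userName) {
        try {
            // Log the request-handling thread before the downstream call...
            log.warn("Main Thread: {} ({})",
                    Thread.currentThread().getName(),
                    Thread.currentThread().getId());
            return greetingService.getMessage(userName);
        } finally {
            // ...and again once the call (or its fallback) has returned.
            log.warn("Main Thread: {} ({})",
                    Thread.currentThread().getName(),
                    Thread.currentThread().getId());
        }
    }
}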
RestTemplate Bean
@Bean
public RestTemplate restTemplate() {
    HttpComponentsClientHttpRequestFactory requestFactory
            = new HttpComponentsClientHttpRequestFactory();
    // Give up if a connection cannot be established within 5 seconds.
    requestFactory.setConnectTimeout(5000);
    // Give up if the producer sends no data for 5 seconds.
    requestFactory.setReadTimeout(5000);
    return new RestTemplate(requestFactory);
}