API Rate Limiting
Rate limiting is a strategy to limit the access to APIs. It restricts the number of API calls that a client can make within any given timeframe. This helps to defend the API against abuse, both unintentional and malicious scripts.
Rate limits are often applied to an API by tracking the IP address, API keys or access tokens, etc. As an API developers, we can choose to respond in several different ways when a client reaches the limit.
- Queueing the request until the remaining time period has elapsed.
- Allowing the request immediately but charging extra for this request.
- Most common one is rejecting the request (HTTP 429 Too Many Requests)
Sliding Log Algorithm
Sliding Log rate limiting involves tracking a time stamped log for each consumer request. These logs are usually stored in a hash set or table that is sorted by time. Logs with timestamps beyond a threshold are discarded. When a new request comes in, we calculate the sum of logs to determine the request rate. If the request would exceed the threshold rate, then it is held.
The advantage of this algorithm is that it does not suffer from the boundary conditions of fixed windows. The rate limit will be enforced precisely and because the sliding log is tracked for each consumer, you don’t have the rush effect that challenges fixed windows. However, it can be very expensive to store an unlimited number of logs for every request. It’s also expensive to compute because each request requires calculating a summation over the consumers prior requests, potentially across a cluster of servers. As a result, it does not scale well to handle large bursts of traffic or denial of service attacks.
Please refer to the Understanding Rate Limiting Algorithms blog where the Sliding Log and other algorithms have been explained in detail.
Building a Springboot Application with API Rate Limiter
Create a new spring boot application from Spring Initializr with dependency on spring web module.
Unzip the downloaded project and import to your IDE. We are going to implement a simple calculator REST APIs that can do operations like add and subtract.
@RestController
@RequestMapping(value = "/api/calculator")
public class CalculatorController {
@GetMapping(value = "/add")
public ResponseEntity<Calculator> add(@RequestParam int left, @RequestParam int right) {
return ResponseEntity.ok(Calculator.builder()
.operation("add").answer(left + right).build());
}
@GetMapping(value = "/subtract")
public ResponseEntity<Calculator> subtract(@RequestParam int left, @RequestParam int right) {
return ResponseEntity.ok(Calculator.builder()
.operation("subtract").answer(left - right).build());
}
}
Let’s ensure that our above APIs are up and running as expected. You can use the cURL or PostMan to make an API call.
curl -X GET -H "Content-Type: application/json" 'http://localhost:9090/api/calculator/add?left=20&right=30'{"operation":"add","answer":50}
Now that we have APIs ready to consume, next let’s introduce some subscription plans with rate limits. Let’s assume that we have the following subscription plans for our clients:
Free Subscription allows 2 requests per 60 seconds.
Basic Subscription allows 10 requests per 60 seconds.
Professional Subscription allows 20 requests per 60 seconds.
Each API client gets a unique API key that they must send along with each request. This would help us identify the client and subscription plan linked.
public enum SubscriptionPlan {
SUBSCRIPTION_FREE(2, 60),
SUBSCRIPTION_BASIC(10, 60),
SUBSCRIPTION_PROFESSIONAL(20, 60);
private final int requestLimit;
private final int windowTime;
SubscriptionPlan(int requestLimit, int windowTime) {
this.requestLimit = requestLimit;
this.windowTime = windowTime;
}
public int getRequestLimit() {
return this.requestLimit;
}
public int getWindowTime() {
return this.windowTime;
}
}
Next we create a subscription service which will store the references for each of the API client in a memory.
@Service
public class SubscriptionService {
private final Map<String, UserRequestData>
subscriptionCacheMap = new ConcurrentHashMap<>();
public UserRequestData resolveSubscribedUserData(String subscriptionKey) {
return subscriptionCacheMap.computeIfAbsent(
subscriptionKey, this::resolveUser);
}
private UserRequestData resolveUser(String subscriptionKey) {
if (subscriptionCacheMap.containsKey(subscriptionKey)) {
return subscriptionCacheMap.get(subscriptionKey);
}
return buildUserLog(
resolveSubscriptionPlanByKey(subscriptionKey));
}
private UserRequestData buildUserLog(SubscriptionPlan subscriptionPlan) {
return new UserRequestData(
subscriptionPlan.getRequestLimit(),
subscriptionPlan.getWindowTime());
}
private SubscriptionPlan resolveSubscriptionPlanByKey(String subscriptionKey) {
if (subscriptionKey.startsWith("PS1129-")) {
return SubscriptionPlan.SUBSCRIPTION_PROFESSIONAL;
} else if (subscriptionKey.startsWith("BS1129-")) {
return SubscriptionPlan.SUBSCRIPTION_BASIC;
}
return SubscriptionPlan.SUBSCRIPTION_FREE;
}
}
Let’s understand the implementation. The API client sends an API key with the X-Subscription-Key request header. We use the SubscriptionService to get the user reference for the API key and check whether the request is allowed or not with the help of methods.
In order to enhance the client experience of the API, we will add the following additional response headers to send information about the rate limit.
- X-Rate-Limit-Remaining — number of tokens remaining in the current time window.
- X-Rate-Limit-Retry-After-Seconds — remaining time in seconds until the bucket is refilled with new tokens.
We can call UserRequestData methods getRequestWaitTime and getRemainingRequests, to get the count of the remaining requests and the time remaining until the next sliding log respectively. The implementation provided in this class is self explanatory and easy to understand the same.
public class UserRequestData {
private int requestLimit;
private int windowTimeInSec;
private Queue<Long> requestTimeStamps;
public UserRequestData(
int requestLimit, int windowTimeInSec) {
this.requestLimit = requestLimit;
this.windowTimeInSec = windowTimeInSec;
this.requestTimeStamps =
new ConcurrentLinkedDeque<Long>();
}
public int getRemainingRequests() {
return requestLimit - requestTimeStamps.size();
}
public int getRequestWaitTime() {
long currentTimeStamp =
System.currentTimeMillis() / 1000;
int initialElapsedTime =
(int) (currentTimeStamp - requestTimeStamps.peek());
return initialElapsedTime > windowTimeInSec
? 0 : windowTimeInSec - initialElapsedTime;
}
public boolean isServiceCallAllowed() {
long currentTimeStamp =
System.currentTimeMillis() / 1000;
evictOlderRequestTimeStamps(currentTimeStamp);
if (requestTimeStamps.size() >= this.requestLimit) {
return false;
}
requestTimeStamps.add(currentTimeStamp);
return true;
}
public void evictOlderRequestTimeStamps(long currentTimeStamp) {
while (requestTimeStamps.size() != 0 &&
(currentTimeStamp - requestTimeStamps.peek()
> windowTimeInSec)) {
requestTimeStamps.remove();
}
}
}
Here is the implementation of the Interceptor to validate the request with rate limiter to see whether we accept or reject the request.
@Component
public class RateLimiterInterceptor implements HandlerInterceptor {
private static final String
HEADER_SUBSCRIPTION_KEY = "X-Subscription-Key";
private static final String
HEADER_LIMIT_REMAINING = "X-Rate-Limit-Remaining";
private static final String
HEADER_RETRY_AFTER = "X-Rate-Limit-Retry-After-Seconds";
private static final String
SUBSCRIPTION_QUOTA_EXHAUSTED =
"You've exhausted your API Request Quota. " +
"Please upgrade your subscription plan.";
@Autowired
private SubscriptionService subscriptionService;
@Override
public boolean preHandle(HttpServletRequest request,
HttpServletResponse response,
Object handler) throws Exception {
String subscriptionKey =
request.getHeader(HEADER_SUBSCRIPTION_KEY);
if (StringUtils.isEmpty(subscriptionKey)) {
response.sendError(HttpStatus.BAD_REQUEST.value(),
"Missing Request Header: " +
HEADER_SUBSCRIPTION_KEY);
return false;
}
UserRequestData userRequestData = subscriptionService
.resolveSubscribedUserData(subscriptionKey);
if (!userRequestData.isServiceCallAllowed()) {
int waitTime = userRequestData.getRequestWaitTime();
response.addHeader(HEADER_RETRY_AFTER,
String.valueOf(waitTime));
response.setContentType(
MediaType.APPLICATION_JSON_VALUE);
response.sendError(
HttpStatus.TOO_MANY_REQUESTS.value(),
SUBSCRIPTION_QUOTA_EXHAUSTED);
return false;
}
response.addHeader(HEADER_LIMIT_REMAINING,
String.valueOf(
userRequestData.getRemainingRequests()));
return true;
}
}
Finally, let’s add the interceptor to the InterceptorRegistry of Springboot so that the RateLimitInterceptor intercepts each request to our calculator API endpoints.
@SpringBootApplication
public class SlidingWindowApplication implements WebMvcConfigurer {
@Autowired
@Lazy
private RateLimiterInterceptor interceptor;
public void addInterceptors(InterceptorRegistry registry) {
registry.addInterceptor(interceptor)
.addPathPatterns("/api/calculator/**");
}
public static void main(String[] args) {
SpringApplication.run(SlidingWindowApplication.class, args);
}
}
Let invoke calculator API to see the behaviour.
curl -X GET 'http://localhost:9090/api/calculator/add?left=20&right=30'
{"timestamp":"2021-01-03T12:56:20.047+0000","status":400,"error":"Bad Request","message":"Missing Request Header: X-Subscription-Key","path":"/api/calculator/add"}
The client has to send the API key within the http header otherwise the interceptor will not process the request. Let’s add the API key to the header and make the call.
curl -v -X GET -H "X-subscription-key:A1129-12" 'http://localhost:9090/api/calculator/subtract?left=20&right=30'
* Connected to localhost (::1) port 9090 (#0)
> GET /api/calculator/subtract?left=20&right=30 HTTP/1.1
> Host: localhost:9090
> User-Agent: curl/7.64.1
> Accept: */*
> X-subscription-key:A1129-12
>
< HTTP/1.1 200
< X-Rate-Limit-Remaining: 1
< Content-Type: application/json
< Transfer-Encoding: chunked
< Date: Sun, 03 Jan 2021 12:57:09 GMT
<
* Connection #0 to host localhost left intact
{"operation":"subtract","answer":-10}
* Closing connection 0
You can see the API key is added in the header, the API responds to our request and also it has added response header which shows how many rate is remaining for the API key.
Let’s make 2 more calls then we should see that we exhausted our rate for the free plan and returns 429 as response.
curl -v -X GET -H "X-subscription-key:A1129-12" 'http://localhost:9090/api/calculator/subtract?left=20&right=30'
* Connected to localhost (::1) port 9090 (#0)
> GET /api/calculator/subtract?left=20&right=30 HTTP/1.1
> Host: localhost:9090
> User-Agent: curl/7.64.1
> Accept: */*
> X-subscription-key:A1129-12
>
< HTTP/1.1 429
< X-Rate-Limit-Retry-After-Seconds: 24
< Content-Type: application/json
< Transfer-Encoding: chunked
< Date: Sun, 03 Jan 2021 12:58:58 GMT
<
* Connection #0 to host localhost left intact
{"timestamp":"2021-01-03T12:58:58.176+0000","status":429,"error":"Too Many Requests","message":"You've exhausted your API Request Quota. Please upgrade your subscription plan.","path":"/api/calculator/subtract"}
* Closing connection 0
It looks like we have successfully implemented the rate limiter using the Sliding Log algorithm. We can keep adding endpoints and the interceptor would apply the rate limit for each request.
As usual, the source code for the above spring boot implementation is available over on GitHub.