1. Overview

In this tutorial, we'll learn how to use Bucket4j to rate limit a Spring REST API. We'll explore API rate limiting, learn about Bucket4j, and work through a few ways of rate-limiting REST APIs in a Spring application.

2. API Rate Limiting

Rate limiting is a strategy to limit access to APIs. It restricts the number of API calls that a client can make within a certain timeframe. This helps defend the API against overuse, both unintentional and malicious.

Rate limits are often applied to an API by tracking the client's IP address, or in a more business-specific way, such as by API key or access token. As API developers, we can choose to respond in several different ways when a client reaches the limit:

  • Queueing the request until the remaining time period has elapsed
  • Allowing the request immediately but charging extra for this request
  • Or, most commonly, rejecting the request (HTTP 429 Too Many Requests)

3. Bucket4j Rate Limiting Library

3.1. What Is Bucket4j?

Bucket4j is a Java rate-limiting library based on the token-bucket algorithm. Bucket4j is a thread-safe library that can be used in either a standalone JVM application or a clustered environment. It also supports in-memory or distributed caching via the JCache (JSR107) specification.

3.2. Token-bucket Algorithm

Let's look at the algorithm intuitively, in the context of API rate limiting.

Say that we have a bucket whose capacity is defined as the number of tokens that it can hold. Whenever a consumer wants to access an API endpoint, it must get a token from the bucket. We remove a token from the bucket if it's available and accept the request. On the other hand, we reject a request if the bucket doesn't have any tokens.

As requests are consuming tokens, we are also replenishing them at some fixed rate, such that we never exceed the capacity of the bucket.

Let's consider an API that has a rate limit of 100 requests per minute. We can create a bucket with a capacity of 100, and a refill rate of 100 tokens per minute.

If we receive 70 requests in a given minute, which is fewer than the available tokens, 30 tokens remain in the bucket, so we would add only 70 more tokens at the start of the next minute to bring the bucket back up to capacity. On the other hand, if we exhaust all the tokens in 40 seconds, we would wait for 20 seconds for the bucket to be refilled.
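To make the algorithm concrete, here is a minimal sketch of a token bucket in plain Java. It is an illustration only and not how Bucket4j is implemented: the class name SimpleTokenBucket and its methods are hypothetical, the refill happens once per whole period, and the caller passes in the current time so that the behavior is deterministic.

```java
import java.util.concurrent.TimeUnit;

// Illustrative only: a hand-rolled bucket with interval refill, not Bucket4j's implementation.
final class SimpleTokenBucket {

    private final long capacity;
    private final long refillTokens;
    private final long refillPeriodNanos;
    private long availableTokens;
    private long lastRefillNanos;

    SimpleTokenBucket(long capacity, long refillTokens, long refillPeriodNanos, long nowNanos) {
        this.capacity = capacity;
        this.refillTokens = refillTokens;
        this.refillPeriodNanos = refillPeriodNanos;
        this.availableTokens = capacity; // the bucket starts full
        this.lastRefillNanos = nowNanos;
    }

    // The caller supplies the current time, which keeps the sketch deterministic.
    synchronized boolean tryConsume(long nowNanos) {
        refill(nowNanos);
        if (availableTokens == 0) {
            return false; // no token left: reject the request
        }
        availableTokens--;
        return true;
    }

    synchronized long getAvailableTokens() {
        return availableTokens;
    }

    // Adds tokens for every whole period that has elapsed, never exceeding the capacity.
    private void refill(long nowNanos) {
        long elapsedPeriods = (nowNanos - lastRefillNanos) / refillPeriodNanos;
        if (elapsedPeriods > 0) {
            availableTokens = Math.min(capacity, availableTokens + elapsedPeriods * refillTokens);
            lastRefillNanos += elapsedPeriods * refillPeriodNanos;
        }
    }
}

class SimpleTokenBucketDemo {
    public static void main(String[] args) {
        long minute = TimeUnit.MINUTES.toNanos(1);
        SimpleTokenBucket bucket = new SimpleTokenBucket(100, 100, minute, 0L);

        for (int i = 0; i < 70; i++) {
            bucket.tryConsume(0L); // 70 requests in the first minute
        }
        System.out.println(bucket.getAvailableTokens()); // prints 30

        bucket.tryConsume(minute); // the first request of the next minute triggers the refill
        System.out.println(bucket.getAvailableTokens()); // prints 99
    }
}
```

The Math.min call is what keeps the bucket from ever holding more than its capacity, no matter how long the API has been idle.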

4. Getting Started with Bucket4j

4.1. Maven Configuration

Let's begin by adding the bucket4j dependency to our pom.xml:

<dependency>
    <groupId>com.github.vladimir-bukhtoyarov</groupId>
    <artifactId>bucket4j-core</artifactId>
    <version>4.10.0</version>
</dependency>

4.2. Terminology

Before we look at how we can use Bucket4j, let's briefly discuss some of the core classes, and how they represent the different elements in the formal model of the token-bucket algorithm.

The Bucket interface represents the token bucket with a maximum capacity. It provides methods such as tryConsume and tryConsumeAndReturnRemaining for consuming tokens. The tryConsume method returns true if the request conforms with the limits and the token was consumed, while tryConsumeAndReturnRemaining reports the same result through a ConsumptionProbe.

The Bandwidth class is the key building block of a bucket – it defines the limits of the bucket. We use Bandwidth to configure the capacity of the bucket and the rate of refill.

The Refill class is used to define the fixed rate at which tokens are added to the bucket. We can configure the rate as the number of tokens that would be added in a given time period, for example, 10 tokens per second or 200 tokens per 5 minutes.

The tryConsumeAndReturnRemaining method in Bucket returns ConsumptionProbe. ConsumptionProbe contains, along with the result of consumption, the status of the bucket such as the tokens remaining, or the time remaining until the requested tokens are available in the bucket again.

4.3. Basic Usage

Let's test some basic rate limit patterns.

For a rate limit of 10 requests per minute, we'll create a bucket with capacity 10 and a refill rate of 10 tokens per minute:

Refill refill = Refill.intervally(10, Duration.ofMinutes(1));
Bandwidth limit = Bandwidth.classic(10, refill);
Bucket bucket = Bucket4j.builder()
    .addLimit(limit)
    .build();

for (int i = 1; i <= 10; i++) {
    assertTrue(bucket.tryConsume(1));
}
assertFalse(bucket.tryConsume(1));

Refill.intervally refills the bucket at the beginning of the time window – in this case, 10 tokens at the start of the minute.
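Bucket4j also offers Refill.greedy, which we'll use in section 5.2. As we understand the library's documentation, a greedy refill adds tokens as soon as possible, spread evenly over the period, whereas an interval refill waits for the whole period to elapse. The following sketch shows only the arithmetic behind the two styles. RefillMath is a hypothetical helper, not part of Bucket4j, and it ignores the bucket capacity:

```java
import java.time.Duration;

// Hypothetical helper: plain arithmetic to contrast the two refill styles, not Bucket4j code.
final class RefillMath {

    // Interval style: nothing is added until a whole period has elapsed.
    static long intervalTokens(long tokensPerPeriod, Duration period, Duration elapsed) {
        long wholePeriods = elapsed.toNanos() / period.toNanos();
        return wholePeriods * tokensPerPeriod;
    }

    // Greedy style: tokens trickle in evenly across the period.
    static long greedyTokens(long tokensPerPeriod, Duration period, Duration elapsed) {
        return elapsed.toNanos() * tokensPerPeriod / period.toNanos();
    }
}

class RefillMathDemo {
    public static void main(String[] args) {
        Duration minute = Duration.ofMinutes(1);
        Duration halfMinute = Duration.ofSeconds(30);

        // 10 tokens per minute, observed 30 seconds after the bucket was emptied
        System.out.println(RefillMath.intervalTokens(10, minute, halfMinute)); // prints 0
        System.out.println(RefillMath.greedyTokens(10, minute, halfMinute));   // prints 5
    }
}
```

With an interval refill, a client that exhausts the bucket must wait for the next window. With a greedy refill, it regains one token every six seconds in this example.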

Next, let's see refill in action.

We'll set a refill rate of 1 token per 2 seconds, and throttle our requests to honor the rate limit:

Bandwidth limit = Bandwidth.classic(1, Refill.intervally(1, Duration.ofSeconds(2)));
Bucket bucket = Bucket4j.builder()
    .addLimit(limit)
    .build();
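assertTrue(bucket.tryConsume(1));     // first request

// schedule another request for 2 seconds later and wait for its result;
// an assertion made inside the scheduled task would fail silently on the worker thread
ScheduledExecutorService executor = Executors.newSingleThreadScheduledExecutor();
ScheduledFuture<Boolean> secondRequest = executor.schedule(() -> bucket.tryConsume(1), 2, TimeUnit.SECONDS);
assertTrue(secondRequest.get());      // get() throws checked exceptions, so the test method must declare them
executor.shutdown();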

Suppose we have a rate limit of 10 requests per minute. At the same time, we may wish to avoid spikes that would exhaust all the tokens in the first 5 seconds. Bucket4j allows us to set multiple limits (Bandwidth) on the same bucket. Let's add another limit that allows only 5 requests in a 20-second time window:

Bucket bucket = Bucket4j.builder()
    .addLimit(Bandwidth.classic(10, Refill.intervally(10, Duration.ofMinutes(1))))
    .addLimit(Bandwidth.classic(5, Refill.intervally(5, Duration.ofSeconds(20))))
    .build();

for (int i = 1; i <= 5; i++) {
    assertTrue(bucket.tryConsume(1));
}
assertFalse(bucket.tryConsume(1));
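To see why the sixth request fails even though the one-minute limit still has tokens, it can help to model the rule that a request must satisfy every limit at once. The sketch below is a simplified model under our own assumptions, not Bucket4j code: MultiLimitBucket is a hypothetical class, each limit refills to its full capacity once per period, and the caller supplies the current time.

```java
import java.util.concurrent.TimeUnit;

// Simplified model: a request is accepted only if every limit still has a token.
final class MultiLimitBucket {

    private final long[] capacity;
    private final long[] periodNanos;
    private final long[] available;
    private final long[] windowStart;

    MultiLimitBucket(long[] capacity, long[] periodNanos) {
        this.capacity = capacity.clone();
        this.periodNanos = periodNanos.clone();
        this.available = capacity.clone();            // every limit starts full
        this.windowStart = new long[capacity.length]; // all windows start at time 0
    }

    synchronized boolean tryConsume(long nowNanos) {
        // Refill each limit whose window has fully elapsed.
        for (int i = 0; i < capacity.length; i++) {
            long periods = (nowNanos - windowStart[i]) / periodNanos[i];
            if (periods > 0) {
                available[i] = capacity[i];
                windowStart[i] += periods * periodNanos[i];
            }
        }
        // Reject if any single limit is exhausted.
        for (long tokens : available) {
            if (tokens == 0) {
                return false;
            }
        }
        // Otherwise take one token from every limit.
        for (int i = 0; i < available.length; i++) {
            available[i]--;
        }
        return true;
    }
}

class MultiLimitBucketDemo {
    public static void main(String[] args) {
        long second = TimeUnit.SECONDS.toNanos(1);
        MultiLimitBucket bucket = new MultiLimitBucket(
            new long[] {10, 5}, new long[] {60 * second, 20 * second});

        for (int i = 0; i < 5; i++) {
            bucket.tryConsume(0L); // uses up the 20-second limit
        }
        System.out.println(bucket.tryConsume(0L));          // prints false
        System.out.println(bucket.tryConsume(20 * second)); // prints true
    }
}
```

In this model, the stricter short-window limit smooths out bursts, while the one-minute limit still caps the total.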

5. Rate Limiting a Spring API Using Bucket4j

Let's use Bucket4j to apply a rate limit in a Spring REST API.

5.1. Area Calculator API

We're going to implement a simple, but extremely popular, area calculator REST API. Currently, it calculates and returns the area of a rectangle given its dimensions:

@RestController
class AreaCalculationController {

    @PostMapping(value = "/api/v1/area/rectangle")
    public ResponseEntity<AreaV1> rectangle(@RequestBody RectangleDimensionsV1 dimensions) {
        return ResponseEntity.ok(new AreaV1("rectangle", dimensions.getLength() * dimensions.getWidth()));
    }
}

Let's ensure that our API is up and running:

$ curl -X POST http://localhost:9001/api/v1/area/rectangle \
    -H "Content-Type: application/json" \
    -d '{ "length": 10, "width": 12 }'

{ "shape":"rectangle","area":120.0 }

5.2. Applying Rate Limit

Now, we'll introduce a naive rate limit – the API allows 20 requests per minute. In other words, the API rejects a request if it has already received 20 requests, in a time window of 1 minute.

Let's modify our Controller to create a Bucket and add the limit (Bandwidth):

@RestController
class AreaCalculationController {

    private final Bucket bucket;

    public AreaCalculationController() {
        Bandwidth limit = Bandwidth.classic(20, Refill.greedy(20, Duration.ofMinutes(1)));
        this.bucket = Bucket4j.builder()
            .addLimit(limit)
            .build();
    }
    //..
}

In this API, we can check whether the request is allowed by consuming a token from the bucket, using the method tryConsume. If we have reached the limit, we can reject the request by responding with an HTTP 429 Too Many Requests status:

public ResponseEntity<AreaV1> rectangle(@RequestBody RectangleDimensionsV1 dimensions) {
    if (bucket.tryConsume(1)) {
        return ResponseEntity.ok(new AreaV1("rectangle", dimensions.getLength() * dimensions.getWidth()));
    }

    return ResponseEntity.status(HttpStatus.TOO_MANY_REQUESTS).build();
}

Once the bucket is empty, the API rejects the next request:

# 21st request within 1 minute
$ curl -v -X POST http://localhost:9001/api/v1/area/rectangle \
    -H "Content-Type: application/json" \
    -d '{ "length": 10, "width": 12 }'

< HTTP/1.1 429

5.3. API Clients and Pricing Plan

We now have a naive rate limit that can throttle the API requests. Next, let's introduce pricing plans for more business-centered rate limits.

Pricing plans help us monetize our API. Let's assume that we have the following plans for our API clients:

  • Free: 20 requests per hour per API client
  • Basic: 40 requests per hour per API client
  • Professional: 100 requests per hour per API client

Each API client gets a unique API key that they must send along with each request. This would help us identify the pricing plan linked with the API client.

Let's define the rate limit (Bandwidth) for each pricing plan:

enum PricingPlan {
    FREE {
        Bandwidth getLimit() {
            return Bandwidth.classic(20, Refill.intervally(20, Duration.ofHours(1)));
        }
    },
    BASIC {
        Bandwidth getLimit() {
            return Bandwidth.classic(40, Refill.intervally(40, Duration.ofHours(1)));
        }
    },
    PROFESSIONAL {
        Bandwidth getLimit() {
            return Bandwidth.classic(100, Refill.intervally(100, Duration.ofHours(1)));
        }
    };
    //..
}

Next, let's add a method to resolve the pricing plan from the given API key:

enum PricingPlan {
    
    static PricingPlan resolvePlanFromApiKey(String apiKey) {
        if (apiKey == null || apiKey.isEmpty()) {
            return FREE;
        } else if (apiKey.startsWith("PX001-")) {
            return PROFESSIONAL;
        } else if (apiKey.startsWith("BX001-")) {
            return BASIC;
        }
        return FREE;
    }
    //..
}

Next, we need to store the Bucket for each API key and retrieve the Bucket for rate limiting:

class PricingPlanService {

    private final Map<String, Bucket> cache = new ConcurrentHashMap<>();

    public Bucket resolveBucket(String apiKey) {
        return cache.computeIfAbsent(apiKey, this::newBucket);
    }

    private Bucket newBucket(String apiKey) {
        PricingPlan pricingPlan = PricingPlan.resolvePlanFromApiKey(apiKey);
        return Bucket4j.builder()
            .addLimit(pricingPlan.getLimit())
            .build();
    }
}
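The important property here is that computeIfAbsent creates the per-key state exactly once and then hands back the same instance on every later call, so all requests with the same API key share one quota. The following plain-Java sketch shows that behavior with a simple counter instead of a Bucket. PerKeyQuota is a hypothetical class, it reuses the key prefixes from above, and it never refills:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for PricingPlanService: one counter per API key, no refill.
final class PerKeyQuota {

    private final Map<String, AtomicLong> remainingByKey = new ConcurrentHashMap<>();

    static long quotaFor(String apiKey) {
        if (apiKey.startsWith("PX001-")) {
            return 100;
        }
        if (apiKey.startsWith("BX001-")) {
            return 40;
        }
        return 20;
    }

    boolean tryConsume(String apiKey) {
        // Created on first use; later calls with the same key get the same counter.
        AtomicLong remaining =
            remainingByKey.computeIfAbsent(apiKey, key -> new AtomicLong(quotaFor(key)));
        // Atomically take one token unless the counter is already at zero.
        return remaining.getAndUpdate(tokens -> tokens > 0 ? tokens - 1 : 0) > 0;
    }
}

class PerKeyQuotaDemo {
    public static void main(String[] args) {
        PerKeyQuota quota = new PerKeyQuota();
        for (int i = 0; i < 20; i++) {
            quota.tryConsume("FX001-99999");
        }
        System.out.println(quota.tryConsume("FX001-99999")); // prints false
        System.out.println(quota.tryConsume("FX001-11111")); // prints true
    }
}
```

Note that ConcurrentHashMap does not accept null keys, so computeIfAbsent throws a NullPointerException for a null API key. The caller has to validate the key before looking up the bucket.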

So, we now have an in-memory store of buckets per API key. Let's modify our Controller to use the PricingPlanService:

@RestController
class AreaCalculationController {

    private final PricingPlanService pricingPlanService;

    // constructor injection; assumes PricingPlanService is registered as a Spring bean
    AreaCalculationController(PricingPlanService pricingPlanService) {
        this.pricingPlanService = pricingPlanService;
    }

    @PostMapping(value = "/api/v1/area/rectangle")
    public ResponseEntity<AreaV1> rectangle(@RequestHeader(value = "X-api-key") String apiKey,
        @RequestBody RectangleDimensionsV1 dimensions) {

        Bucket bucket = pricingPlanService.resolveBucket(apiKey);
        ConsumptionProbe probe = bucket.tryConsumeAndReturnRemaining(1);
        if (probe.isConsumed()) {
            return ResponseEntity.ok()
                .header("X-Rate-Limit-Remaining", Long.toString(probe.getRemainingTokens()))
                .body(new AreaV1("rectangle", dimensions.getLength() * dimensions.getWidth()));
        }

        long waitForRefill = probe.getNanosToWaitForRefill() / 1_000_000_000;
        return ResponseEntity.status(HttpStatus.TOO_MANY_REQUESTS)
            .header("X-Rate-Limit-Retry-After-Seconds", String.valueOf(waitForRefill))
            .build();
    }
}

Let's walk through the changes. The API client sends the API key with the X-api-key request header. We use the PricingPlanService to get the bucket for this API key and check whether the request is allowed by consuming a token from the bucket.

In order to enhance the client experience of the API, we'll use the following additional response headers to send information about the rate limit:

  • X-Rate-Limit-Remaining: number of tokens remaining in the current time window
  • X-Rate-Limit-Retry-After-Seconds: remaining time, in seconds, until the bucket is refilled

We can call ConsumptionProbe methods getRemainingTokens and getNanosToWaitForRefill, to get the count of the remaining tokens in the bucket and the time remaining until the next refill, respectively. The getNanosToWaitForRefill method returns 0 if we are able to consume the token successfully.
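One detail worth keeping in mind is that the integer division by 1_000_000_000 truncates, so a wait of less than a second is reported to the client as 0. The sketch below compares truncation with rounding up. RetryAfter is a hypothetical helper written for this illustration and is not part of Bucket4j:

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper for turning a wait time in nanoseconds into a header value in seconds.
final class RetryAfter {

    private static final long NANOS_PER_SECOND = TimeUnit.SECONDS.toNanos(1);

    // Truncating division, as in the controller: 0.4 s becomes 0.
    static long truncatedSeconds(long nanosToWait) {
        return nanosToWait / NANOS_PER_SECOND;
    }

    // Rounds up, so that any positive wait is reported as at least 1 second.
    static long ceilSeconds(long nanosToWait) {
        if (nanosToWait <= 0) {
            return 0;
        }
        return (nanosToWait - 1) / NANOS_PER_SECOND + 1;
    }
}

class RetryAfterDemo {
    public static void main(String[] args) {
        long fourHundredMillis = TimeUnit.MILLISECONDS.toNanos(400);
        System.out.println(RetryAfter.truncatedSeconds(fourHundredMillis)); // prints 0
        System.out.println(RetryAfter.ceilSeconds(fourHundredMillis));      // prints 1
    }
}
```

Whether to round up is a design choice. Rounding up avoids telling a throttled client to retry after 0 seconds, at the cost of sometimes asking it to wait slightly longer than necessary.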

Let's call the API:

## successful request
$ curl -v -X POST http://localhost:9001/api/v1/area/rectangle \
    -H "Content-Type: application/json" -H "X-api-key:FX001-99999" \
    -d '{ "length": 10, "width": 12 }'

< HTTP/1.1 200
< X-Rate-Limit-Remaining: 11
{"shape":"rectangle","area":120.0}

## rejected request
$ curl -v -X POST http://localhost:9001/api/v1/area/rectangle \
    -H "Content-Type: application/json" -H "X-api-key:FX001-99999" \
    -d '{ "length": 10, "width": 12 }'

< HTTP/1.1 429
< X-Rate-Limit-Retry-After-Seconds: 583

5.4. Using Spring MVC Interceptor

So far, so good! Suppose we now have to add a new API endpoint that calculates and returns the area of a triangle given its height and base:

@PostMapping(value = "/api/v1/area/triangle")
public ResponseEntity<AreaV1> triangle(@RequestBody TriangleDimensionsV1 dimensions) {
    return ResponseEntity.ok(new AreaV1("triangle", 0.5d * dimensions.getHeight() * dimensions.getBase()));
}

As it turns out, we need to rate-limit our new endpoint as well. We can simply copy and paste the rate limit code from our previous endpoint. Or, we can use Spring MVC's HandlerInterceptor to decouple the rate limit code from the business code.

Let's create a RateLimitInterceptor and implement the rate limit code in the preHandle method:

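@Component
public class RateLimitInterceptor implements HandlerInterceptor {

    private final PricingPlanService pricingPlanService;

    // constructor injection; assumes PricingPlanService is registered as a Spring bean
    public RateLimitInterceptor(PricingPlanService pricingPlanService) {
        this.pricingPlanService = pricingPlanService;
    }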

    @Override
    public boolean preHandle(HttpServletRequest request, HttpServletResponse response, Object handler) 
      throws Exception {
        String apiKey = request.getHeader("X-api-key");
        if (apiKey == null || apiKey.isEmpty()) {
            response.sendError(HttpStatus.BAD_REQUEST.value(), "Missing Header: X-api-key");
            return false;
        }

        Bucket tokenBucket = pricingPlanService.resolveBucket(apiKey);
        ConsumptionProbe probe = tokenBucket.tryConsumeAndReturnRemaining(1);
        if (probe.isConsumed()) {
            response.addHeader("X-Rate-Limit-Remaining", String.valueOf(probe.getRemainingTokens()));
            return true;
        } else {
            long waitForRefill = probe.getNanosToWaitForRefill() / 1_000_000_000;
            response.addHeader("X-Rate-Limit-Retry-After-Seconds", String.valueOf(waitForRefill));
            response.sendError(HttpStatus.TOO_MANY_REQUESTS.value(),
              "You have exhausted your API Request Quota"); 
            return false;
        }
    }
}

Finally, we must add the interceptor to the InterceptorRegistry:

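@Configuration
public class AppConfig implements WebMvcConfigurer {

    private final RateLimitInterceptor interceptor;

    // constructor injection; assumes RateLimitInterceptor is registered as a Spring bean
    public AppConfig(RateLimitInterceptor interceptor) {
        this.interceptor = interceptor;
    }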

    @Override
    public void addInterceptors(InterceptorRegistry registry) {
        registry.addInterceptor(interceptor)
            .addPathPatterns("/api/v1/area/**");
    }
}

The RateLimitInterceptor intercepts each request to our area calculation API endpoints.

Let's try our new endpoint out:

## successful request
$ curl -v -X POST http://localhost:9001/api/v1/area/triangle \
    -H "Content-Type: application/json" -H "X-api-key:FX001-99999" \
    -d '{ "height": 15, "base": 8 }'

< HTTP/1.1 200
< X-Rate-Limit-Remaining: 9
{"shape":"triangle","area":60.0}

## rejected request
$ curl -v -X POST http://localhost:9001/api/v1/area/triangle \
    -H "Content-Type: application/json" -H "X-api-key:FX001-99999" \
    -d '{ "height": 15, "base": 8 }'

< HTTP/1.1 429
< X-Rate-Limit-Retry-After-Seconds: 299
{ "status": 429, "error": "Too Many Requests", "message": "You have exhausted your API Request Quota" }

It looks like we're done! We can keep adding endpoints, and the interceptor will apply the rate limit to each request.

6. Bucket4j Spring Boot Starter

Let's look at another way of using Bucket4j in a Spring application. The Bucket4j Spring Boot Starter provides auto-configuration for Bucket4j that helps us achieve API rate limiting via Spring Boot application properties or configuration.

Once we integrate the Bucket4j starter into our application, we'll have a completely declarative API rate limiting implementation, without any application code.

6.1. Rate Limit Filters

In our example, we've used the value of the request header X-api-key as the key for identifying and applying the rate limits.

The Bucket4j Spring Boot Starter provides several predefined configurations for defining our rate limit key:

  • a naive rate limit filter, which is the default
  • filter by IP Address
  • expression-based filters

Expression-based filters use the Spring Expression Language (SpEL). SpEL provides access to root objects such as HttpServletRequest that can be used to build filter expressions on the IP Address (getRemoteAddr()), request headers (getHeader('X-api-key')), and so on.

The library also supports custom classes in the filter expressions, which is discussed in the documentation.

6.2. Maven Configuration

Let's begin by adding the bucket4j-spring-boot-starter dependency to our pom.xml:

<dependency>
    <groupId>com.giffing.bucket4j.spring.boot.starter</groupId>
    <artifactId>bucket4j-spring-boot-starter</artifactId>
    <version>0.2.0</version>
</dependency>

We used an in-memory Map to store the Bucket per API key (consumer) in our earlier implementation. Here, we can use Spring's caching abstraction to configure an in-memory store such as Caffeine or Guava.

Let's add the caching dependencies:

<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-cache</artifactId>
</dependency>
<dependency>
    <groupId>javax.cache</groupId>
    <artifactId>cache-api</artifactId>
</dependency>
<dependency>
    <groupId>com.github.ben-manes.caffeine</groupId>
    <artifactId>caffeine</artifactId>
    <version>2.8.2</version>
</dependency>
<dependency>
    <groupId>com.github.ben-manes.caffeine</groupId>
    <artifactId>jcache</artifactId>
    <version>2.8.2</version>
</dependency>

Note: We have added the jcache dependencies as well, to conform with Bucket4j's caching support.

6.3. Application Configuration

Let's configure our application to use the Bucket4j starter library. First, we'll configure Caffeine caching to store the API key and Bucket in-memory:

spring:
  cache:
    cache-names:
    - rate-limit-buckets
    caffeine:
      spec: maximumSize=100000,expireAfterAccess=3600s

Next, let's configure Bucket4j:

bucket4j:
  enabled: true
  filters:
  - cache-name: rate-limit-buckets
    url: /api/v1/area.*
    strategy: first
    http-response-body: "{ \"status\": 429, \"error\": \"Too Many Requests\", \"message\": \"You have exhausted your API Request Quota\" }"
    rate-limits:
    - expression: "getHeader('X-api-key')"
      execute-condition: "getHeader('X-api-key').startsWith('PX001-')"
      bandwidths:
      - capacity: 100
        time: 1
        unit: hours
    - expression: "getHeader('X-api-key')"
      execute-condition: "getHeader('X-api-key').startsWith('BX001-')"
      bandwidths:
      - capacity: 40
        time: 1
        unit: hours
    - expression: "getHeader('X-api-key')"
      bandwidths:
      - capacity: 20
        time: 1
        unit: hours

So, what did we just configure?

  • bucket4j.enabled=true – enables Bucket4j auto-configuration
  • bucket4j.filters.cache-name – gets the Bucket for an API key from the cache
  • bucket4j.filters.url – indicates the path expression for applying rate limit
  • bucket4j.filters.strategy=first – stops at the first matching rate limit configuration
  • bucket4j.filters.rate-limits.expression – retrieves the key using Spring Expression Language (SpEL)
  • bucket4j.filters.rate-limits.execute-condition – decides whether to execute the rate limit or not, using SpEL
  • bucket4j.filters.rate-limits.bandwidths – defines the Bucket4j rate limit parameters

We've replaced the PricingPlanService and the RateLimitInterceptor with a list of rate limit configurations that are evaluated sequentially.
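Because the list is evaluated in order and the first strategy stops at the first match, the order of the entries matters: the catch-all entry without an execute-condition has to come last. The plain-Java sketch below models that evaluation as we read it from the configuration above. It is not the starter's implementation, and RateLimitRule and FirstMatchStrategy are hypothetical names:

```java
import java.util.Arrays;
import java.util.List;
import java.util.OptionalLong;
import java.util.function.Predicate;

// Hypothetical model of one rate-limits entry: a condition plus an hourly capacity.
final class RateLimitRule {

    final Predicate<String> executeCondition;
    final long capacityPerHour;

    RateLimitRule(Predicate<String> executeCondition, long capacityPerHour) {
        this.executeCondition = executeCondition;
        this.capacityPerHour = capacityPerHour;
    }
}

final class FirstMatchStrategy {

    // Walks the rules in order and stops at the first one whose condition holds.
    static OptionalLong capacityFor(List<RateLimitRule> rules, String apiKey) {
        for (RateLimitRule rule : rules) {
            if (rule.executeCondition.test(apiKey)) {
                return OptionalLong.of(rule.capacityPerHour);
            }
        }
        return OptionalLong.empty();
    }

    // Mirrors the three entries of the YAML configuration, in the same order.
    static List<RateLimitRule> pricingRules() {
        return Arrays.asList(
            new RateLimitRule(key -> key.startsWith("PX001-"), 100),
            new RateLimitRule(key -> key.startsWith("BX001-"), 40),
            new RateLimitRule(key -> true, 20)); // no condition: always applies
    }
}

class FirstMatchStrategyDemo {
    public static void main(String[] args) {
        List<RateLimitRule> rules = FirstMatchStrategy.pricingRules();
        System.out.println(FirstMatchStrategy.capacityFor(rules, "PX001-12345").getAsLong()); // prints 100
        System.out.println(FirstMatchStrategy.capacityFor(rules, "FX001-99999").getAsLong()); // prints 20
    }
}
```

If the catch-all rule were listed first in this model, every key would be given the free-plan capacity of 20.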

Let's try it out:

## successful request
$ curl -v -X POST http://localhost:9000/api/v1/area/triangle \
    -H "Content-Type: application/json" -H "X-api-key:FX001-99999" \
    -d '{ "height": 20, "base": 7 }'

< HTTP/1.1 200
< X-Rate-Limit-Remaining: 7
{"shape":"triangle","area":70.0}

## rejected request
$ curl -v -X POST http://localhost:9000/api/v1/area/triangle \
    -H "Content-Type: application/json" -H "X-api-key:FX001-99999" \
    -d '{ "height": 7, "base": 20 }'

< HTTP/1.1 429
< X-Rate-Limit-Retry-After-Seconds: 212
{ "status": 429, "error": "Too Many Requests", "message": "You have exhausted your API Request Quota" }

7. Conclusion

In this tutorial, we've looked at several different approaches using Bucket4j for rate-limiting Spring APIs. Be sure to check out the official documentation to learn more.

As usual, the source code for all the examples is available over on GitHub.
