1. Introduction

In this tutorial, we’ll see different ways of limiting the number of requests per second with Spring 5 WebClient.

While we usually want to take advantage of its non-blocking nature, some scenarios might force us to add delays. We’ll learn about some of these scenarios while using a few Project Reactor features to control a stream of requests to a server.

2. Initial Setup

A typical case where we’d need to limit our requests per second is to avoid overwhelming the server. Also, some web services have a maximum number of requests allowed per hour. Likewise, some control the number of concurrent requests per client.

2.1. Writing a Simple Web Service

To explore this scenario, we’ll start with a simple @RestController that serves random numbers from a fixed range:

@RestController
@RequestMapping("/random")
public class RandomController {

    @GetMapping
    Integer getRandom() {
        return new Random().nextInt(50);
    }
}

Next, we’ll simulate an expensive operation and limit the number of concurrent requests.

2.2. Rate Limiting Our Server

Before seeing solutions, let’s change our service to simulate a more realistic scenario.

Firstly, we’ll limit the number of concurrent requests our server can take, throwing an exception when the limit is reached.

Secondly, we’ll add a delay to process our response, simulating an expensive operation. While there are more robust solutions available, we’ll do this just for illustration purposes:

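import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.IntSupplier;

public class Concurrency {

    public static final int MAX_CONCURRENT = 5;
    // Number of requests currently being processed
    static final AtomicInteger CONCURRENT_REQUESTS = new AtomicInteger();

    public static int protect(IntSupplier supplier) {
        try {
            if (CONCURRENT_REQUESTS.incrementAndGet() > MAX_CONCURRENT) {
                throw new UnsupportedOperationException("max concurrent requests reached");
            }

            // Simulate an expensive operation
            TimeUnit.SECONDS.sleep(2);
            return supplier.getAsInt();
        } catch (InterruptedException e) {
            // sleep() throws a checked exception, so restore the flag and fail the request
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while processing", e);
        } finally {
            CONCURRENT_REQUESTS.decrementAndGet();
        }
    }
}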

Finally, let’s change our endpoint to use it:

@GetMapping
Integer getRandom() {
    return Concurrency.protect(() -> new Random().nextInt(50));
}

Now, our endpoint refuses to process requests when we’re over MAX_CONCURRENT requests, returning an error to the client.
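To see the guard's behavior without starting the server, here's a minimal, dependency-free sketch. The InFlightGuard class and its nested() helper are hypothetical names used only for illustration and aren't part of the article's project. Nesting protect() calls keeps several calls "in flight" at once on a single thread, and the simulated two-second delay is left out:

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.IntSupplier;

// Hypothetical stand-in for the Concurrency class, without the sleep.
final class InFlightGuard {

    private final int maxConcurrent;
    private final AtomicInteger inFlight = new AtomicInteger();

    InFlightGuard(int maxConcurrent) {
        this.maxConcurrent = maxConcurrent;
    }

    int protect(IntSupplier supplier) {
        try {
            // Count this call first, then reject it if it pushed us over the limit.
            if (inFlight.incrementAndGet() > maxConcurrent) {
                throw new UnsupportedOperationException("max concurrent requests reached");
            }
            return supplier.getAsInt();
        } finally {
            // Runs on success and on rejection, so the counter never leaks.
            inFlight.decrementAndGet();
        }
    }

    int inFlight() {
        return inFlight.get();
    }

    // Keeps `depth` calls in flight at once by nesting them on one thread.
    int nested(int depth) {
        return depth == 0 ? 42 : protect(() -> nested(depth - 1));
    }

    public static void main(String[] args) {
        InFlightGuard guard = new InFlightGuard(5);
        System.out.println(guard.nested(5)); // five overlapping calls are allowed
        try {
            guard.nested(6); // the sixth overlapping call is rejected
        } catch (UnsupportedOperationException e) {
            System.out.println(e.getMessage());
        }
        System.out.println(guard.inFlight()); // the counter is back at zero
    }
}
```

Because the decrement sits in the finally block, a rejected call gives its slot back too, so the counter returns to zero once all calls finish.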

2.3. Writing a Simple Client

All examples will follow this pattern to generate a Flux of n requests and make a GET request to our service:

Flux.range(1, n)
  .flatMap(i -> {
    // GET request
  });

To reduce the boilerplate, let’s implement the request part in a method we can reuse in all examples. We’ll receive a WebClient, call get(), and retrieve() the response body with generics using ParameterizedTypeReference:

public interface RandomConsumer {

    static <T> Mono<T> get(WebClient client) {
        return client.get()
          .retrieve()
          .bodyToMono(new ParameterizedTypeReference<T>() {});
    }
}

Now we’re ready to see some approaches.

3. Delaying With zipWith(Flux.interval())

Our first example combines our requests with a fixed delay using zipWith():

public class ZipWithInterval {

    public static Flux<Integer> fetch(
      WebClient client, int requests, int delay) {
        return Flux.range(1, requests)
          .zipWith(Flux.interval(Duration.ofMillis(delay)))
          .flatMap(i -> RandomConsumer.get(client));
    }
}

As a result, requests are sent delay milliseconds apart, since each element of Flux.range() is paired with one tick of Flux.interval(). Note that this delay applies before the request is sent.

4. Delaying With Flux.delayElements()

Flux has a more straightforward way to delay its elements:

public class DelayElements {

    public static Flux<Integer> fetch(
      WebClient client, int requests, int delay) {
        return Flux.range(1, requests)
          .delayElements(Duration.ofMillis(delay))
          .flatMap(i -> RandomConsumer.get(client));
    }
}

With delayElements(), the delay applies directly to Subscriber.onNext() signals. In other words, it delays each element from Flux.range(). Therefore, the function passed into flatMap() will be affected, taking longer to start. For instance, if the delay value is 1000, there will be a one-second delay before our request starts.

4.1. Adapting Our Solution

Consequently, if we don’t provide a long enough delay, we’ll get an error:

@Test
void givenSmallDelay_whenDelayElements_thenExceptionThrown() {
    int delay = 100;

    int requests = 10;
    assertThrows(InternalServerError.class, () -> {
      DelayElements.fetch(client, requests, delay)
        .blockLast();
    });
}

That’s because we’re waiting 100 milliseconds per request, but each request takes two seconds to complete on the server side. So, rapidly, our concurrent requests limit is reached, and we get a 500 error.

We can stay under the request limit if we add enough delay. But then we'd have another problem: we'd wait longer than necessary.
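The arithmetic behind this trade-off can be sketched with a small, idealized model. It assumes request i is sent at i * delay milliseconds, occupies the server for exactly two seconds, and that network latency is zero. The DelayModel class and its method names are illustrative assumptions, not part of the article's project:

```java
// Idealized model: request i (1-based) is sent at i * delayMs and occupies
// the server for processingMs. Network latency is ignored.
final class DelayModel {

    // Largest number of requests that overlap on the server at any instant.
    static int peakInFlight(int requests, long delayMs, long processingMs) {
        int peak = 0;
        for (int i = 1; i <= requests; i++) {
            long start = i * delayMs;
            int overlapping = 0;
            for (int j = 1; j <= i; j++) {
                // Request j is still running when request i arrives.
                if (j * delayMs + processingMs > start) {
                    overlapping++;
                }
            }
            peak = Math.max(peak, overlapping);
        }
        return peak;
    }

    // Smallest delay that keeps the peak at or below the server limit.
    static long minimumDelayMs(long processingMs, int maxConcurrent) {
        return (processingMs + maxConcurrent - 1) / maxConcurrent; // ceiling division
    }

    // Time until the last response arrives under this model.
    static long totalMs(int requests, long delayMs, long processingMs) {
        return requests * delayMs + processingMs;
    }

    public static void main(String[] args) {
        System.out.println(peakInFlight(10, 100, 2000)); // all ten requests overlap
        System.out.println(minimumDelayMs(2000, 5));     // delay needed for a limit of five
        System.out.println(peakInFlight(10, 400, 2000)); // peak with that delay
        System.out.println(totalMs(10, 400, 2000));      // total time with that delay
    }
}
```

Under these assumptions, a 100 ms delay lets all ten requests overlap, while a limit of five needs at least 400 ms between requests, which stretches the whole run to about six seconds. In practice, latency and jitter mean the real threshold is less clear-cut.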

Depending on our use case, waiting too much might significantly impact performance. So, next, let’s check a more appropriate way to handle this since we know the limitations of our server.

5. Concurrency Control With flatMap()

Given the limitations of our service, our best option is to send at most Concurrency.MAX_CONCURRENT requests in parallel. To do this, we can pass one more argument to flatMap(): the maximum number of inner publishers it subscribes to concurrently:

public class LimitConcurrency {

    public static Flux<Integer> fetch(
      WebClient client, int requests, int concurrency) {
        return Flux.range(1, requests)
          .flatMap(i -> RandomConsumer.get(client), concurrency);
    }
}

This parameter guarantees the maximum number of concurrent requests doesn’t exceed concurrency and that our processing won’t be delayed more than necessary:

@Test
void givenLimitInsideServerRange_whenLimitedConcurrency_thenNoExceptionThrown() {
    int limit = Concurrency.MAX_CONCURRENT;

    int requests = 10;
    assertDoesNotThrow(() -> {
      LimitConcurrency.fetch(client, requests, limit)
        .blockLast();
    });
}
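For intuition, the same idea can be sketched with plain JDK classes instead of Reactor. This is only a loose analogue of what the concurrency argument does, not how flatMap() is implemented. A Semaphore hands out a fixed number of slots, and a new simulated call starts only when a slot is free. The BoundedInFlight class and the sleep that stands in for the HTTP call are assumptions made for this example:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;

final class BoundedInFlight {

    // Runs `requests` simulated calls, never more than `concurrency` at once,
    // and returns the highest number of calls observed in flight.
    static int peakInFlight(int requests, int concurrency, long workMillis) {
        Semaphore slots = new Semaphore(concurrency);
        AtomicInteger inFlight = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        ExecutorService pool = Executors.newCachedThreadPool();
        try {
            List<CompletableFuture<Void>> pending = new ArrayList<>();
            for (int i = 0; i < requests; i++) {
                slots.acquireUninterruptibly(); // wait until a slot is free
                pending.add(CompletableFuture.runAsync(() -> {
                    try {
                        peak.accumulateAndGet(inFlight.incrementAndGet(), Math::max);
                        Thread.sleep(workMillis); // stands in for the HTTP call
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    } finally {
                        inFlight.decrementAndGet();
                        slots.release(); // a finished call frees its slot
                    }
                }, pool));
            }
            pending.forEach(CompletableFuture::join); // wait for every call to finish
            return peak.get();
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // Ten simulated calls, at most five at a time
        System.out.println("peak in flight: " + peakInFlight(10, 5, 50));
    }
}
```

Unlike a fixed delay, the next call starts as soon as an earlier one finishes, so no time is spent waiting beyond what the limit itself requires.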

Still, a few other options are worth discussing, depending on our scenario and preference. Let’s go over some of them.

6. Using Resilience4j RateLimiter

Resilience4j is a versatile library designed for dealing with fault tolerance in applications. We'll use it to limit the number of requests sent within an interval and include a timeout.

Let’s start by adding the resilience4j-reactor and resilience4j-ratelimiter dependencies:

<dependency>
    <groupId>io.github.resilience4j</groupId>
    <artifactId>resilience4j-reactor</artifactId>
    <version>1.7.1</version>
</dependency>
<dependency>
    <groupId>io.github.resilience4j</groupId>
    <artifactId>resilience4j-ratelimiter</artifactId>
    <version>1.7.1</version>
</dependency>

Then we build our rate limiter with RateLimiter.of(), providing a name, an interval for sending new requests, a limit on requests per interval (our concurrency argument), and a timeout:

public class Resilience4jRateLimit {

    public static Flux<Integer> fetch(
      WebClient client, int requests, int concurrency, int interval) {
        RateLimiter limiter = RateLimiter.of("my-rate-limiter", RateLimiterConfig.custom()
          .limitRefreshPeriod(Duration.ofMillis(interval))
          .limitForPeriod(concurrency)
          .timeoutDuration(Duration.ofMillis(interval * concurrency))
          .build());

        // ...
    }
}

Now we include it in our Flux with transformDeferred(), so it controls our GET requests rate:

return Flux.range(1, requests)
  .flatMap(i -> RandomConsumer.get(client)
    .transformDeferred(RateLimiterOperator.of(limiter))
  );

Note that we can still overload the server if we set the interval too low, since the limiter caps how many requests start per interval, not how many are in flight. Still, this approach is helpful if we need to share a rate limiter specification with other operations.
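To make the per-interval accounting concrete, here's a simplified fixed-window model with an explicit clock. It's a sketch for illustration only and isn't Resilience4j's implementation. In particular, it refuses a call immediately instead of letting it wait up to the timeout, and the FixedWindowModel name is an assumption:

```java
// Simplified fixed-window model with an explicit clock.
// An illustration only, not Resilience4j's implementation.
final class FixedWindowModel {

    private final long refreshPeriodMs;
    private final int limitForPeriod;
    private long windowStartMs = 0;
    private int usedInWindow = 0;

    FixedWindowModel(long refreshPeriodMs, int limitForPeriod) {
        this.refreshPeriodMs = refreshPeriodMs;
        this.limitForPeriod = limitForPeriod;
    }

    // Grants a permit if the window containing nowMs still has one left.
    boolean tryAcquire(long nowMs) {
        long window = (nowMs / refreshPeriodMs) * refreshPeriodMs;
        if (window != windowStartMs) {
            // A new period has started, so the permits are refreshed.
            windowStartMs = window;
            usedInWindow = 0;
        }
        if (usedInWindow < limitForPeriod) {
            usedInWindow++;
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        FixedWindowModel limiter = new FixedWindowModel(100, 5);
        int granted = 0;
        for (int i = 0; i < 7; i++) {
            if (limiter.tryAcquire(0)) {
                granted++;
            }
        }
        System.out.println(granted);                 // permits granted in the first period
        System.out.println(limiter.tryAcquire(99));  // same period, no permits left
        System.out.println(limiter.tryAcquire(100)); // next period, permits refreshed
    }
}
```

With a 100 ms period and five permits, this model admits up to five new requests every 100 ms. Since each request holds a server slot for two seconds, many more than five can end up overlapping.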

7. Precise Throttling With Guava

Guava has a general-purpose rate limiter that works well for our scenario. Furthermore, since it uses the token bucket algorithm, it’ll only block when necessary instead of every time, unlike Flux.delayElements().

First, we need to add guava to our pom.xml:

<dependency>
    <groupId>com.google.guava</groupId>
    <artifactId>guava</artifactId>
    <version>33.2.1-jre</version>
</dependency>

To use it, we call RateLimiter.create() and pass it the maximum number of requests per second we want to send. Then, we call acquire() on the limiter before sending our request to throttle execution when necessary:

public class GuavaRateLimit {

    public static Flux<Integer> fetch(
      WebClient client, int requests, int requestsPerSecond) {
        RateLimiter limiter = RateLimiter.create(requestsPerSecond);

        return Flux.range(1, requests)
          .flatMap(i -> {
            limiter.acquire();

            return RandomConsumer.get(client);
          });
    }
}

This solution works well and stays simple: it doesn't make our code block longer than necessary. For instance, if a request takes longer than expected, the next one won't wait to execute. However, this is the case only while we're within the rate set by requestsPerSecond.
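The "only block when necessary" behavior can be illustrated with a textbook token bucket driven by an explicit clock. This is a sketch under stated assumptions and isn't Guava's implementation, whose accounting differs in its details. The TokenBucketModel class, its burst parameter, and the choice to start with one permit available are all assumptions made for this example:

```java
// Textbook token bucket with an explicit clock, using integer milliseconds.
// An illustration only, not Guava's implementation.
final class TokenBucketModel {

    private final long intervalMs;  // time needed to earn one permit
    private final long maxCreditMs; // cap on saved-up time, i.e. the burst size
    private long creditMs;          // saved-up time not yet spent on permits
    private long lastMs = 0;        // instant up to which credit is accounted

    // Assumes permitsPerSecond divides 1000 evenly.
    TokenBucketModel(int permitsPerSecond, int burst) {
        this.intervalMs = 1000L / permitsPerSecond;
        this.maxCreditMs = burst * intervalMs;
        this.creditMs = intervalMs; // assumption: one permit is available at the start
    }

    // Returns how many milliseconds a caller arriving at nowMs has to wait.
    long acquire(long nowMs) {
        // Earlier callers may already have booked the bucket into the future.
        long t = Math.max(nowMs, lastMs);
        creditMs = Math.min(maxCreditMs, creditMs + (t - lastMs));
        lastMs = t;
        if (creditMs < intervalMs) {
            // Not enough credit: wait until one full permit has accrued.
            lastMs += intervalMs - creditMs;
            creditMs = intervalMs;
        }
        creditMs -= intervalMs; // spend one permit
        return lastMs - nowMs;
    }

    public static void main(String[] args) {
        TokenBucketModel bucket = new TokenBucketModel(5, 5);
        System.out.println(bucket.acquire(0));    // first caller: no wait
        System.out.println(bucket.acquire(0));    // second caller: one interval
        System.out.println(bucket.acquire(0));    // third caller: two intervals
        System.out.println(bucket.acquire(2000)); // after an idle period: no wait
    }
}
```

In this model, callers that arrive faster than the configured rate are spaced one interval apart, while callers that arrive after an idle period pass straight through until the saved-up permits run out.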

8. Conclusion

In this article, we simulated a rate-limited web service and saw how it affected our code and tests. Then, we explored a few approaches to rate limit our WebClient, using Project Reactor and a few libraries to achieve the same goal in different ways.

The code backing this article is available on GitHub.