Introduction to Spring Cloud Load Balancer

1. Introduction

As microservice architectures grow in popularity, it’s increasingly common to run multiple services distributed across different servers. In this quick tutorial, we’ll look at using Spring Cloud Load Balancer to create more fault-tolerant applications.

2. What Is Load Balancing?

Load balancing is the process of distributing traffic among different instances of the same application.

To create a fault-tolerant system, it’s common to run multiple instances of each application. Thus, whenever one service needs to communicate with another, it needs to pick a particular instance to send its request to.

There are many load balancing algorithms, including:

Random selection: Choosing an instance randomly
Round-robin: Cycling through the instances in the same fixed order
Least connections: Choosing the instance with the fewest current connections
Weighted metric: Using a weighted metric to choose the best instance (for example, CPU or memory usage)
IP hash: Using the hash of the client IP to map to an instance

These are just a few examples of load balancing algorithms, and each has its pros and cons.

Random selection and round-robin are easy to implement but may not use services optimally. Conversely, least connections and weighted metrics are more complex but generally achieve better service utilization. IP hash works well when server stickiness is important, but it isn’t very fault-tolerant.
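
To make the trade-offs above more concrete, here is a minimal plain-Java sketch of four of these strategies. The SelectionStrategies class and its method names are hypothetical and exist only for illustration; instances are represented as simple host:port strings, and nothing here is part of Spring Cloud Load Balancer:

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper for illustration only; not a Spring Cloud Load Balancer class.
final class SelectionStrategies {

    // Shared counter so that successive round-robin calls advance through the list.
    private final AtomicInteger position = new AtomicInteger();

    // Round-robin: cycle through the list in a fixed order.
    String roundRobin(List<String> instances) {
        int index = Math.floorMod(position.getAndIncrement(), instances.size());
        return instances.get(index);
    }

    // Random selection: every instance is equally likely on each call.
    static String random(List<String> instances) {
        return instances.get(ThreadLocalRandom.current().nextInt(instances.size()));
    }

    // Least connections: pick the instance with the fewest open connections.
    static String leastConnections(Map<String, Integer> openConnections) {
        return openConnections.entrySet().stream()
          .min(Map.Entry.comparingByValue())
          .orElseThrow(() -> new IllegalStateException("no instances available"))
          .getKey();
    }

    // IP hash: the same client IP maps to the same instance,
    // but only as long as the instance list does not change.
    static String ipHash(String clientIp, List<String> instances) {
        return instances.get(Math.floorMod(clientIp.hashCode(), instances.size()));
    }
}
```

Note how ipHash depends on the size of the instance list: when an instance disappears, most clients are remapped, which is one reason this approach is less fault-tolerant than the others.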

3. Introduction to Spring Cloud Load Balancer

The Spring Cloud Load Balancer library allows us to create applications that communicate with other applications in a load-balanced fashion. Using any algorithm we want, we can easily implement load balancing when making remote service calls.

To illustrate, let’s look at some example code. We’ll start with a simple server application. The server will have a single HTTP endpoint and can be run as multiple instances.

Then, we’ll create a client application that uses Spring Cloud Load Balancer to alternate requests between different instances of the server.

3.1. Example Server

For our example server, we start with a simple Spring Boot application:

@SpringBootApplication
@RestController
public class ServerApplication {

    public static void main(String[] args) {
        SpringApplication.run(ServerApplication.class, args);
    }

    @Value("${server.instance.id}")
    String instanceId;

    @GetMapping("/hello")
    public String hello() {
        return String.format("Hello from instance %s", instanceId);
    }
}

We start by injecting a configurable property into the instanceId field. This allows us to differentiate between multiple running instances. Next, we add a single HTTP GET endpoint that returns a greeting containing the instance ID.

With server.instance.id set to 1 in the application's configuration, the default instance runs on port 8080 with an ID of 1. To run a second instance, we just need to add a couple of program arguments:

--server.instance.id=2 --server.port=8081

3.2. Example Client

Now, let’s look at the client code. This is where we use Spring Cloud Load Balancer, so let’s start by including it in our application:

<dependency>
    <groupId>org.springframework.cloud</groupId>
    <artifactId>spring-cloud-starter-loadbalancer</artifactId>
</dependency>

Next, we create an implementation of ServiceInstanceListSupplier. This is one of the key interfaces in Spring Cloud Load Balancer. It defines how we find available service instances.

For our sample application, we’ll hard-code two different instances of our example server. They run on the same machine but use different ports:

class DemoInstanceSupplier implements ServiceInstanceListSupplier {
    private final String serviceId;

    public DemoInstanceSupplier(String serviceId) {
        this.serviceId = serviceId;
    }

    @Override
    public String getServiceId() {
        return serviceId;
    }

    @Override
    public Flux<List<ServiceInstance>> get() {
        return Flux.just(Arrays.asList(
          new DefaultServiceInstance(serviceId + "1", serviceId, "localhost", 8080, false),
          new DefaultServiceInstance(serviceId + "2", serviceId, "localhost", 8081, false)));
    }
}

In a real-world system, we would want to use an implementation that does not hard-code service addresses. We’ll look at this a little more later on.

Now, let’s create a WebClientConfig class:

@Configuration
@LoadBalancerClient(name = "example-service", configuration = DemoServerInstanceConfiguration.class)
class WebClientConfig {
    @LoadBalanced
    @Bean
    WebClient.Builder webClientBuilder() {
        return WebClient.builder();
    }
}

This class has one role: create a load-balanced WebClient builder to make remote requests. Notice that our annotation uses a pseudo name for the service.

This is because we likely won’t know the actual hostnames and ports for running instances ahead of time. So, we use a pseudo name as a placeholder, and the framework will substitute real values when it picks a running instance.
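
The substitution described above can be pictured with a small plain-Java sketch. The PseudoNameResolver class is hypothetical and only illustrates the idea of swapping the pseudo name for a concrete host and port; the library performs its own URI reconstruction internally:

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical helper for illustration only; not how the library is implemented.
final class PseudoNameResolver {

    // Replaces the logical service name in the URI with a concrete host and port,
    // keeping the scheme, path, query, and fragment unchanged.
    static URI resolve(URI original, String host, int port) {
        try {
            return new URI(original.getScheme(), original.getUserInfo(), host, port,
              original.getPath(), original.getQuery(), original.getFragment());
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("Cannot rebuild URI for " + host + ":" + port, e);
        }
    }

    public static void main(String[] args) {
        URI logical = URI.create("http://example-service/hello");
        // Prints http://localhost:8081/hello
        System.out.println(resolve(logical, "localhost", 8081));
    }
}
```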

Next, let’s create a Configuration class that instantiates our service instance supplier. Notice that we use the same pseudo name as above:

@Configuration
class DemoServerInstanceConfiguration {
    @Bean
    ServiceInstanceListSupplier serviceInstanceListSupplier() {
        return new DemoInstanceSupplier("example-service");
    }
}

Now, we can create the actual client application. Let’s use the WebClient.Builder bean from above to build a WebClient and send ten requests to the example server:

@SpringBootApplication
public class ClientApplication {

    public static void main(String[] args) {

        ConfigurableApplicationContext ctx = new SpringApplicationBuilder(ClientApplication.class)
          .web(WebApplicationType.NONE)
          .run(args);

        WebClient loadBalancedClient = ctx.getBean(WebClient.Builder.class).build();

        for (int i = 1; i <= 10; i++) {
            String response =
              loadBalancedClient.get().uri("http://example-service/hello")
                .retrieve().toEntity(String.class)
                .block().getBody();
            System.out.println(response);
        }
    }
}

Looking at the output, we can confirm that we’re load balancing between two different instances:

Hello from instance 2
Hello from instance 1
Hello from instance 2
Hello from instance 1
Hello from instance 2
Hello from instance 1
Hello from instance 2
Hello from instance 1
Hello from instance 2
Hello from instance 1

4. Other Features

The example server and client show a very simple use of Spring Cloud Load Balancer. But other library features are worth mentioning.

For starters, the example client used the default RoundRobinLoadBalancer policy. The library also provides a RandomLoadBalancer class. We could also create our own implementation of ReactorServiceInstanceLoadBalancer with any algorithm we want.

Additionally, the library provides a way to discover service instances dynamically. We do this using the DiscoveryClientServiceInstanceListSupplier class. This is useful for integrating with service discovery systems such as Eureka or ZooKeeper.

In addition to different load balancing and service discovery features, the library also offers a basic retry capability. Under the hood, it ultimately relies on the Spring Retry library. This allows us to retry failed requests, possibly using the same instance after some waiting period.
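
As a rough picture of what a retry does, here is a plain-Java sketch that is not the library's mechanism. The RetrySketch class, its parameters, and the fixed wait between attempts are all assumptions made for illustration. It tries the instances in turn, so with a single-element list it retries the same instance after each wait:

```java
import java.util.List;
import java.util.function.Function;

// Hypothetical helper for illustration only; in a real application,
// retries are configured through the library instead of hand-written.
final class RetrySketch {

    // Calls the request function against the instances in turn until one call
    // succeeds or maxAttempts is exhausted, waiting waitMillis between attempts.
    static <T> T callWithRetry(List<String> instances, Function<String, T> request,
      int maxAttempts, long waitMillis) {
        if (instances.isEmpty() || maxAttempts < 1) {
            throw new IllegalArgumentException("need at least one instance and one attempt");
        }
        RuntimeException lastFailure = null;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            String instance = instances.get(attempt % instances.size());
            try {
                return request.apply(instance);
            } catch (RuntimeException e) {
                lastFailure = e; // remember the failure and try again
            }
            if (attempt < maxAttempts - 1) {
                pause(waitMillis);
            }
        }
        throw lastFailure;
    }

    private static void pause(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting to retry", e);
        }
    }
}
```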

Another built-in feature is metrics, built on top of the Micrometer library. Out of the box, we get basic service-level metrics for each instance, but we can also add our own.

Finally, the Spring Cloud Load Balancer library provides a way to cache service instances using the LoadBalancerCacheManager interface. This is important because, in reality, looking up available service instances likely involves a remote call. This means it can be expensive to look up data that doesn’t change often, and the lookup also represents a possible failure point in the application. By using a cache of service instances, our applications can work around some of these shortcomings.
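
To illustrate why caching helps, here is a minimal plain-Java sketch of a time-based cache around an instance lookup. The CachedInstanceLookup class is hypothetical and is not the library's LoadBalancerCacheManager. Serving a stale list when a refresh fails is a design choice assumed here for illustration:

```java
import java.time.Duration;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical helper for illustration only; not the library's cache implementation.
final class CachedInstanceLookup {

    private final Supplier<List<String>> remoteLookup;
    private final long ttlNanos;
    private List<String> cached;
    private long loadedAt;

    CachedInstanceLookup(Supplier<List<String>> remoteLookup, Duration ttl) {
        this.remoteLookup = remoteLookup;
        this.ttlNanos = ttl.toNanos();
    }

    // Returns the cached list while it is fresh; otherwise refreshes it.
    // If the refresh fails and an older list exists, the stale list is served.
    synchronized List<String> get() {
        long now = System.nanoTime();
        if (cached != null && now - loadedAt < ttlNanos) {
            return cached;
        }
        try {
            cached = List.copyOf(remoteLookup.get());
            loadedAt = now;
        } catch (RuntimeException e) {
            if (cached == null) {
                throw e; // nothing to fall back to
            }
        }
        return cached;
    }
}
```

With a five-minute time-to-live, for example, repeated calls to get() trigger only one remote lookup until the entry expires.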

5. Conclusion

Load balancing is an essential part of building modern, fault-tolerant systems. Using Spring Cloud Load Balancer, we can easily create applications that use various load balancing techniques to distribute requests to different service instances.

The code backing this article is available on GitHub.
