1. Overview

Modern applications are increasingly integrating with Large Language Models (LLMs) to build intelligent solutions. While a single LLM can handle multiple tasks, relying on just one model isn’t always the optimal approach.

Different models specialize in different capabilities: some excel at technical analysis, while others perform better at creative writing. Additionally, we might prefer lighter, more cost-effective models for simple tasks while reserving powerful models for complex ones.

In this tutorial, we’ll explore integrating multiple LLMs in a Spring Boot application using Spring AI.

We’ll configure models from different providers as well as multiple models from the same provider. Then, we’ll build upon this configuration to implement a resilient chatbot capable of automatically switching between models when failures occur.

2. Configuring LLMs of Different Providers

Let’s start by configuring two LLMs from different providers in our application.

For our demonstration, we’ll use OpenAI and Anthropic as our AI model providers.

2.1. Configuring a Primary LLM

We’ll begin by configuring an OpenAI model as our primary LLM.

First, let’s add the necessary dependency to our project’s pom.xml file:

<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-starter-model-openai</artifactId>
    <version>1.0.2</version>
</dependency>

The OpenAI starter dependency is a wrapper around OpenAI’s Chat Completions API, enabling us to interact with OpenAI models in our application.

Next, let’s configure our OpenAI API key and chat model in the application.yaml file:

spring:
  ai:
    open-ai:
      api-key: ${OPENAI_API_KEY}
      chat:
        options:
          model: ${PRIMARY_LLM}
          temperature: 1

We use the ${} property placeholder to load the values of our properties from environment variables. Additionally, we set the temperature to 1 since the newer OpenAI models only accept this default value.

Once we configure the above properties, Spring AI automatically creates a bean of type OpenAiChatModel. Let’s use it to define a ChatClient bean, which serves as the main entry point for interacting with our LLM:

@Configuration
class ChatbotConfiguration {

    @Bean
    @Primary
    ChatClient primaryChatClient(OpenAiChatModel chatModel) {
        return ChatClient.create(chatModel);
    }
}

In our ChatbotConfiguration class, we use the OpenAiChatModel bean to create a ChatClient bean for our primary LLM.

We annotate this bean with @Primary. Spring Boot will automatically inject it into our components whenever we autowire a ChatClient bean without specifying a qualifier.
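For instance, a hypothetical service (the class name is purely illustrative) can declare a plain ChatClient dependency and receive the primary client:

@Service
class SummaryService {

    private final ChatClient chatClient;

    // No qualifier, so Spring injects the @Primary (OpenAI-backed) ChatClient
    SummaryService(ChatClient chatClient) {
        this.chatClient = chatClient;
    }
}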

2.2. Configuring a Secondary LLM

Now, let’s configure a model from Anthropic to act as our secondary LLM.

First, let’s add the Anthropic starter dependency to our pom.xml file:

<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-starter-model-anthropic</artifactId>
    <version>1.0.2</version>
</dependency>

This dependency is a wrapper around the Anthropic Message API and provides us with the necessary classes to establish a connection and interact with Anthropic models.

Next, let’s define the configuration properties for our secondary model:

spring:
  ai:
    anthropic:
      api-key: ${ANTHROPIC_API_KEY}
      chat:
        options:
          model: ${SECONDARY_LLM}

Similar to our primary LLM configuration, we load the Anthropic API key and the model ID from environment variables.

Finally, let’s create a dedicated ChatClient bean for our secondary model:

@Bean
ChatClient secondaryChatClient(AnthropicChatModel chatModel) {
    return ChatClient.create(chatModel);
}

Here, we create a secondaryChatClient bean using the AnthropicChatModel bean that Spring AI auto-configures for us.

3. Configuring LLMs of the Same Provider

More often than not, the LLMs we need to configure belong to the same AI provider.

Spring AI does not natively support this scenario, since the auto-configuration creates only one ChatModel bean per provider. We’ll need to manually define a ChatModel bean for the additional model(s).

Let’s explore this process and configure a second Anthropic model in our application:

spring:
  ai:
    anthropic:
      chat:
        options:
          tertiary-model: ${TERTIARY_LLM}

In our application.yaml, under the Anthropic configuration, we’ve added a custom property to hold the model name for the tertiary LLM.

Next, let’s define the necessary beans for our tertiary LLM:

@Bean
ChatModel tertiaryChatModel(
    AnthropicApi anthropicApi,
    AnthropicChatModel anthropicChatModel,
    @Value("${spring.ai.anthropic.chat.options.tertiary-model}") String tertiaryModelName
) {
    AnthropicChatOptions chatOptions = anthropicChatModel.getDefaultOptions().copy();
    chatOptions.setModel(tertiaryModelName);
    return AnthropicChatModel.builder()
      .anthropicApi(anthropicApi)
      .defaultOptions(chatOptions)
      .build();
}

@Bean
ChatClient tertiaryChatClient(@Qualifier("tertiaryChatModel") ChatModel tertiaryChatModel) {
    return ChatClient.create(tertiaryChatModel);
}

First, to create our custom ChatModel bean, we inject the auto-configured AnthropicApi bean, the default AnthropicChatModel bean which we used to create our secondary LLM’s ChatClient, and the tertiary model name property using @Value.

We copy the default options from the existing AnthropicChatModel bean and simply override the model name.

This setup assumes that both Anthropic models share the same API key and other configurations. If we need to specify different properties, we can customize the AnthropicChatOptions further.
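For instance, if we wanted the tertiary model to use its own temperature and token limit, a minimal sketch could build the options from scratch rather than copying the defaults (the values here are purely illustrative):

AnthropicChatOptions chatOptions = AnthropicChatOptions.builder()
  .model(tertiaryModelName)
  .temperature(0.2) // illustrative: a more deterministic setting
  .maxTokens(512)   // illustrative: a tighter response budget
  .build();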

Finally, we use the custom tertiaryChatModel to create a third ChatClient bean in our configuration class.

4. Exploring a Practical Use Case

With our multi-model configuration in place, let’s implement a practical use case. We’ll build a resilient chatbot that automatically falls back to alternative models in sequence when the primary one fails.

4.1. Building a Resilient Chatbot

To implement the fallback logic, we’ll use Spring Retry.
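In case spring-retry and AOP support aren’t already on the classpath, we’d add them to our pom.xml first (the spring-retry version here is an assumption; any recent 2.x release should work):

<dependency>
    <groupId>org.springframework.retry</groupId>
    <artifactId>spring-retry</artifactId>
    <version>2.0.5</version>
</dependency>
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-aop</artifactId>
</dependency>

We also need to enable retry processing, for example, by annotating our configuration class with @EnableRetry:

@EnableRetry
@Configuration
class ChatbotConfiguration {
    // ... the ChatClient beans we defined earlier
}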

Let’s create a new ChatbotService class and autowire the three ChatClient beans we’ve defined.
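Since only the primary bean can be injected without a qualifier, we reference the other two by their bean names. A minimal skeleton (the logger setup is assumed) could look like this:

@Service
class ChatbotService {

    private static final Logger logger = LoggerFactory.getLogger(ChatbotService.class);

    private final ChatClient primaryChatClient;
    private final ChatClient secondaryChatClient;
    private final ChatClient tertiaryChatClient;

    ChatbotService(ChatClient primaryChatClient,
      @Qualifier("secondaryChatClient") ChatClient secondaryChatClient,
      @Qualifier("tertiaryChatClient") ChatClient tertiaryChatClient) {
        this.primaryChatClient = primaryChatClient;
        this.secondaryChatClient = secondaryChatClient;
        this.tertiaryChatClient = tertiaryChatClient;
    }
}

Then, let’s define the entry point for our chatbot that uses the primary LLM: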

@Retryable(retryFor = Exception.class, maxAttempts = 3)
String chat(String prompt) {
    logger.debug("Attempting to process prompt '{}' with primary LLM. Attempt #{}",
        prompt, RetrySynchronizationManager.getContext().getRetryCount() + 1);
    return primaryChatClient
      .prompt(prompt)
      .call()
      .content();
}

Here, we create a chat() method that uses the primaryChatClient bean. We annotate this method with @Retryable, configuring it to make up to three attempts in case of any Exception.

Next, let’s define a recovery method:

@Recover
String chat(Exception exception, String prompt) {
    logger.warn("Primary LLM failure. Error received: {}", exception.getMessage());
    logger.debug("Attempting to process prompt '{}' with secondary LLM", prompt);
    try {
        return secondaryChatClient
          .prompt(prompt)
          .call()
          .content();
    } catch (Exception e) {
        logger.warn("Secondary LLM failure: {}", e.getMessage());
        logger.debug("Attempting to process prompt '{}' with tertiary LLM", prompt);
        return tertiaryChatClient
          .prompt(prompt)
          .call()
          .content();
    }
}

The @Recover annotation marks our overloaded chat() method as the fallback if the original chat() method fails and exhausts the configured retries.

We first attempt to get a response from the secondaryChatClient. If that fails as well, we make a final attempt with the tertiaryChatClient bean.

We use this rudimentary try-catch implementation since Spring Retry only allows one recovery method per method signature. However, in a production application, we should consider a more sophisticated solution like Resilience4j.

Now that we’ve implemented our service layer, let’s expose a REST API on top of it:

@PostMapping("/api/chatbot/chat")
ChatResponse chat(@RequestBody ChatRequest request) {
    String response = chatbotService.chat(request.prompt());
    return new ChatResponse(response);
}

record ChatRequest(String prompt) {}
record ChatResponse(String response) {}

Here, we define a POST /api/chatbot/chat endpoint that accepts a prompt, passes it to the service layer, and finally wraps and returns the response in a ChatResponse record.

4.2. Testing Our Chatbot

Finally, let’s test our chatbot to verify that the fallback mechanism works correctly.

Let’s start our application with environment variables that specify invalid model names for our primary and secondary LLMs, but a valid one for the tertiary LLM:

OPENAI_API_KEY=.... \
ANTHROPIC_API_KEY=.... \
PRIMARY_LLM=gpt-100 \
SECONDARY_LLM=claude-opus-200 \
TERTIARY_LLM=claude-3-haiku-20240307 \
mvn spring-boot:run

In our command, gpt-100 and claude-opus-200 are invalid model names that will cause API errors, while claude-3-haiku-20240307 is a valid model from Anthropic.

Next, let’s use the HTTPie CLI to invoke the API endpoint and interact with our chatbot:

http POST :8080/api/chatbot/chat prompt="What is the capital of France?"

Here, we send a simple prompt to the chatbot. Let’s see what we receive as a response:

{
    "response": "The capital of France is Paris."
}

As we can see, despite configuring invalid primary and secondary LLMs, our chatbot still provides a correct response, confirming that the system falls back to the tertiary LLM.

To see the fallback logic in action, let’s also examine our application logs:

[2025-09-30 12:56:03] [DEBUG] [com.baeldung.multillm.ChatbotService] - Attempting to process prompt 'What is the capital of France?' with primary LLM. Attempt #1
[2025-09-30 12:56:05] [DEBUG] [com.baeldung.multillm.ChatbotService] - Attempting to process prompt 'What is the capital of France?' with primary LLM. Attempt #2
[2025-09-30 12:56:06] [DEBUG] [com.baeldung.multillm.ChatbotService] - Attempting to process prompt 'What is the capital of France?' with primary LLM. Attempt #3
[2025-09-30 12:56:07] [WARN] [com.baeldung.multillm.ChatbotService] - Primary LLM failure. Error received: HTTP 404 - {
    "error": {
        "message": "The model `gpt-100` does not exist or you do not have access to it.",
        "type": "invalid_request_error",
        "param": null,
        "code": "model_not_found"
    }
}
[2025-09-30 12:56:07] [DEBUG] [com.baeldung.multillm.ChatbotService] - Attempting to process prompt 'What is the capital of France?' with secondary LLM
[2025-09-30 12:56:07] [WARN] [com.baeldung.multillm.ChatbotService] - Secondary LLM failure: HTTP 404 - {"type":"error","error":{"type":"not_found_error","message":"model: claude-opus-200"},"request_id":"req_011CTeBrAY8rstsSPiJyv3sj"}
[2025-09-30 12:56:07] [DEBUG] [com.baeldung.multillm.ChatbotService] - Attempting to process prompt 'What is the capital of France?' with tertiary LLM

The logs clearly illustrate our request’s execution flow.

We see three failed attempts with the primary LLM. Then, our service attempts to use the secondary LLM, which also fails. Finally, it invokes the tertiary LLM that processes the prompt and returns the response we saw.

This demonstrates that our fallback mechanism works exactly as designed, ensuring our chatbot remains available even when multiple LLMs fail.

5. Conclusion

In this article, we’ve explored integrating multiple LLMs within a single Spring AI application.

First, we demonstrated how Spring AI’s abstraction layer simplifies configuring models from different providers like OpenAI and Anthropic.

Then, we tackled a more complex scenario of configuring multiple models from the same provider, creating custom beans when Spring AI’s auto-configuration isn’t sufficient.

Finally, we used this multi-model configuration to build a resilient and highly available chatbot. Using Spring Retry, we configured a cascading fallback pattern that automatically switches between LLMs in case of failure.

The code backing this article is available on GitHub. Once you're logged in as a Baeldung Pro Member, start learning and coding on the project.