1. Overview

We often expect human-like interactions when conversing with an AI application. To achieve this, the application must maintain the conversation history with the LLM, and Spring AI addresses this through its chat memory functionality.

In this tutorial, we'll explore the chat memory options that Spring AI provides and show how to integrate chat memory with the chat client.

2. Chat Memory

Large language models (LLMs) are stateless: each prompt is treated as an isolated query, and the model does not remember any previous message.

In AI applications, keeping previous conversations is crucial for the LLM to produce meaningful responses. This is where chat memory comes in to fill the gap, providing:

  • Contextual Understanding – This allows the LLM to produce responses based on the whole conversation.
  • Personalization – This facilitates providing personalized responses based on the chat memory.
  • Persistence – Based on the implementation, the chat memory could persist across multiple sessions.
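To make the idea concrete, here's a toy illustration (plain Java, not Spring AI code) of what a chat memory fundamentally is: a per-conversation list of messages that can be replayed to the model on each new prompt. The class and method names are ours, chosen for illustration only:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Toy sketch: without memory, each prompt stands alone; with a per-conversation
// message list, the full context can be replayed to the model on every call.
public class ToyChatMemory {
    private final Map<String, List<String>> store = new ConcurrentHashMap<>();

    public void add(String conversationId, String message) {
        store.computeIfAbsent(conversationId, id -> new ArrayList<>()).add(message);
    }

    public List<String> get(String conversationId) {
        return store.getOrDefault(conversationId, List.of());
    }

    public static void main(String[] args) {
        ToyChatMemory memory = new ToyChatMemory();
        memory.add("conv-1", "user: Tell me a joke");
        memory.add("conv-1", "assistant: Why did the scarecrow win an award?");
        memory.add("conv-1", "user: Tell me another one");
        // The whole history would accompany the next prompt:
        System.out.println(memory.get("conv-1").size());
    }
}
```

This mirrors, in miniature, what Spring AI's in-memory repository does for us, as we'll see below.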

3. Chat Memory Repositories

Spring AI provides a ChatMemory interface and some off-the-shelf implementations to help us easily integrate chat memory into our application.

First, let’s add the Maven spring-ai-starter-model-openai dependency to enable OpenAI integration. This dependency will transitively import the Spring AI core libraries as well:

<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-starter-model-openai</artifactId>
    <version>1.0.0</version>
</dependency>

When we create the chat memory, we have to provide an implementation of ChatMemoryRepository, which is responsible for persisting the chat messages to the store:

ChatMemoryRepository chatMemoryRepository;

ChatMemory chatMemory = MessageWindowChatMemory.builder()
  .chatMemoryRepository(chatMemoryRepository)
  .maxMessages(10)
  .build();

Spring AI provides different types of chat memory repositories that we can choose from depending on our project's tech stack. We'll discuss two of them here.
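The maxMessages setting above caps how much history is kept. The eviction idea behind a message window can be sketched in plain Java (this is our own simplified sketch, not the actual MessageWindowChatMemory implementation):

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the windowing idea: once the list exceeds maxMessages,
// the oldest entries are dropped so the prompt sent to the model stays bounded.
public class MessageWindow {
    private final int maxMessages;
    private final List<String> messages = new ArrayList<>();

    public MessageWindow(int maxMessages) {
        this.maxMessages = maxMessages;
    }

    public void add(String message) {
        messages.add(message);
        while (messages.size() > maxMessages) {
            messages.remove(0); // evict the oldest message first
        }
    }

    public List<String> window() {
        return List.copyOf(messages);
    }

    public static void main(String[] args) {
        MessageWindow window = new MessageWindow(3);
        for (int i = 1; i <= 5; i++) {
            window.add("message-" + i);
        }
        System.out.println(window.window()); // only the 3 most recent survive
    }
}
```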

3.1. In-Memory Repository

If we don't explicitly define a chat memory repository, Spring AI uses an in-memory store by default. It stores the chat messages internally in a ConcurrentHashMap, where the conversation ID is the key and the value is the list of messages in that conversation:

public final class InMemoryChatMemoryRepository implements ChatMemoryRepository {
    Map<String, List<Message>> chatMemoryStore = new ConcurrentHashMap<>();

    // other methods
}

The in-memory repository is very simple and works well when we don't need long-term persistence. If we do, we need to pick one of the persistent repositories instead.

3.2. JDBC Repository

The JDBC repository persists chat messages in a relational database. Spring AI provides built-in support for several relational databases, including MySQL, PostgreSQL, SQL Server, and HSQLDB.

If we want to store the chat memory in a relational database, we’ll need to include the Maven dependency to support it:

<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-model-chat-memory-repository-jdbc</artifactId>
    <version>1.0.0</version>
</dependency>

Each of the built-in supported databases has its own dialect implementation that provides the SQL statements for CRUD operations on the chat memory table. We need to supply the dialect when we initialize the JdbcChatMemoryRepository:

JdbcChatMemoryRepositoryDialect dialect = ...; // the chosen repository dialect
ChatMemoryRepository repository = JdbcChatMemoryRepository.builder()
  .jdbcTemplate(jdbcTemplate)
  .dialect(dialect)
  .build();

For databases that do not have built-in support, we have to implement the JdbcChatMemoryRepositoryDialect interface and provide the SQL statements for each CRUD operation:

public interface JdbcChatMemoryRepositoryDialect {
    String getSelectMessagesSql();

    String getInsertMessageSql();

    String getSelectConversationIdsSql();

    String getDeleteMessagesSql();
}

The dialects that ship with Spring AI use standard SQL and don't rely on vendor-specific features. Therefore, we can simply use a provided implementation, such as MysqlChatMemoryRepositoryDialect, without writing a custom dialect.
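If we ever do need a custom dialect, the shape looks roughly like this. Note that this is a sketch: the interface is reproduced locally so the example is self-contained (the real one lives in Spring AI), and the table and column names are hypothetical, so we should check the actual schema script for our database before using them:

```java
// The interface is re-declared here only to keep the sketch self-contained.
interface JdbcChatMemoryRepositoryDialect {
    String getSelectMessagesSql();
    String getInsertMessageSql();
    String getSelectConversationIdsSql();
    String getDeleteMessagesSql();
}

// Hypothetical SQLite-style dialect; table/column names are illustrative only.
public class SqliteChatMemoryRepositoryDialect implements JdbcChatMemoryRepositoryDialect {
    @Override
    public String getSelectMessagesSql() {
        return "SELECT content, type FROM spring_ai_chat_memory WHERE conversation_id = ?";
    }

    @Override
    public String getInsertMessageSql() {
        return "INSERT INTO spring_ai_chat_memory (conversation_id, content, type) VALUES (?, ?, ?)";
    }

    @Override
    public String getSelectConversationIdsSql() {
        return "SELECT DISTINCT conversation_id FROM spring_ai_chat_memory";
    }

    @Override
    public String getDeleteMessagesSql() {
        return "DELETE FROM spring_ai_chat_memory WHERE conversation_id = ?";
    }

    public static void main(String[] args) {
        JdbcChatMemoryRepositoryDialect dialect = new SqliteChatMemoryRepositoryDialect();
        System.out.println(dialect.getSelectConversationIdsSql());
    }
}
```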

We need to initialize the database schema before using the repository. For supported dialects, Spring AI also provides schema creation scripts, which we can find under classpath:org/springframework/ai/chat/memory/repository/jdbc.

4. Apply Chat Memory to Chat Client

Spring AI provides auto-configuration for chat memory in ChatMemoryAutoConfiguration. If we choose an in-memory repository, we don’t need to define anything explicitly, as this is the default.

However, if we want to use the JDBC repository instead, we need to provide our own ChatMemoryRepository bean to override the default in-memory one:

@Configuration
public class ChatConfig {
    @Bean
    public ChatMemoryRepository getChatMemoryRepository(JdbcTemplate jdbcTemplate) {
        return JdbcChatMemoryRepository.builder()
          .jdbcTemplate(jdbcTemplate)
          .dialect(new HsqldbChatMemoryRepositoryDialect())
          .build();
    }
}

Note that we don’t need to define the bean method of ChatMemory explicitly, as it’s already defined in the ChatMemoryAutoConfiguration.

Let’s create a ChatService in Spring Boot:

@Component
@SessionScope
public class ChatService {
    private final ChatClient chatClient;
    private final String conversationId;

    public ChatService(ChatModel chatModel, ChatMemory chatMemory) {
        this.chatClient = ChatClient.builder(chatModel)
          .defaultAdvisors(MessageChatMemoryAdvisor.builder(chatMemory).build())
          .build();
        this.conversationId = UUID.randomUUID().toString();
    }

    public String chat(String prompt) {
        return chatClient.prompt()
          .user(userMessage -> userMessage.text(prompt))
          .advisors(a -> a.param(ChatMemory.CONVERSATION_ID, conversationId))
          .call()
          .content();
    }
}

In the constructor, Spring Boot automatically injects the ChatMemory implementation, and we initialize the ChatClient with it via MessageChatMemoryAdvisor.

We define the chat method to accept the prompt and send the message to the chat model. Additionally, we add a conversation ID as the chat advisor parameter to uniquely identify the conversation based on the current session.

It's important to note that we must annotate the service with @SessionScope so that the same instance, and hence the same conversation ID, is reused across multiple requests within a session.

5. Integration with OpenAI

In our demonstration, we'll integrate chat memory with OpenAI, examine how Spring AI calls the OpenAI API, and adopt an in-memory HSQLDB instance as the persistence store.

Let's add properties to application.yml to set the OpenAI API key, configure the database connection, and initialize the schema during application startup:

spring:
  ai:
    openai:
      api-key: "<YOUR-API-KEY>"

  datasource:
    url: jdbc:hsqldb:mem:chatdb
    driver-class-name: org.hsqldb.jdbc.JDBCDriver
    username: sa
    password:

  sql:
    init:
      mode: always
      schema-locations: classpath:org/springframework/ai/chat/memory/repository/jdbc/schema-hsqldb.sql

Now, the configuration is all set. Let’s create a REST endpoint so that we can call the ChatService that we defined earlier:

@RestController
public class ChatController {
    private final ChatService chatService;

    public ChatController(ChatService chatService) {
        this.chatService = chatService;
    }

    @PostMapping("/chat")
    public ResponseEntity<String> chat(@RequestBody @Valid ChatRequest request) {
        String response = chatService.chat(request.getPrompt());
        return ResponseEntity.ok(response);
    }
}

ChatRequest is a simple DTO that contains the prompt as a String:

public class ChatRequest {
    @NotNull
    private String prompt;

    // getter and setter
}

6. Test Run

We're now ready to exercise the REST endpoint. We'll use Postman to send requests, and HTTP Toolkit to intercept the HTTP traffic between our Spring Boot application and OpenAI, so we can see how things work together.

6.1. First Request

Let’s make a call to ask for a joke in Postman and check the response:

[Screenshot: chat memory – first request in Postman]

When we observe the intercepted request in the HTTP toolkit, we’ll see the HTTP request to OpenAI:

{
  "messages": [
    {
      "content": "Tell me a joke",
      "role": "user"
    }
  ],
  "model": "gpt-4o-mini",
  "stream": false,
  "temperature": 0.7
}

This is a fairly simple request that sends only our prompt, using the user role.

6.2. Second Request

Now, let’s make another request to compare the difference:

[Screenshot: chat memory – second request in Postman]

When we inspect the intercepted HTTP request to OpenAI this time, we see that Spring AI sends not only our new prompt, but also the previous prompt and response:

{
  "messages": [
    {
      "content": "Tell me a joke",
      "role": "user"
    },
    {
      "content": "Why did the scarecrow win an award? \n\nBecause he was outstanding in his field!",
      "role": "assistant"
    },
    {
      "content": "Tell me another one",
      "role": "user"
    }
  ],
  "model": "gpt-4o-mini",
  "stream": false,
  "temperature": 0.7
}

In this example, we observe that Spring AI sends the entire chat history to the chat model. This helps the model maintain the context of the whole conversation and makes the interaction feel more natural.
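One consequence worth noting: with full-history replay, the payload grows with every turn. On turn n, the request carries the n user prompts plus the n - 1 previous assistant replies, i.e. 2n - 1 messages, which matches what we intercepted above (1 message on the first request, 3 on the second). A quick arithmetic check:

```java
// With full-history replay, turn n sends n user prompts plus n-1 assistant
// replies: 2n - 1 messages in total.
public class HistoryGrowth {
    static int messagesSentOnTurn(int turn) {
        return 2 * turn - 1;
    }

    public static void main(String[] args) {
        System.out.println(messagesSentOnTurn(1));  // first request above: 1 message
        System.out.println(messagesSentOnTurn(2));  // second request above: 3 messages
        System.out.println(messagesSentOnTurn(10)); // grows linearly with the turn count
    }
}
```

This linear growth is exactly why a bounded window such as MessageWindowChatMemory, with its maxMessages setting, is useful for long conversations.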

7. Conclusion

In this article, we’ve learned how Spring AI enhances the conversational experience by maintaining the chat history across multiple chat requests via chat memory.

We explored different memory repositories and illustrated how to integrate chat memory with Spring AI and OpenAI. We also examined how the Spring AI chat memory works with OpenAI behind the scenes.

As usual, the full source code is available over on GitHub.
