1. Introduction
We use Spring AI with OpenAI’s Moderation model to detect harmful or sensitive content in text. The moderation model analyzes input and flags categories like self-harm, violence, hate, or sexual content.
In this tutorial, we’ll learn how to build a moderation service and integrate it with the moderation model.
2. Dependencies
Let’s add the spring-ai-starter-model-openai dependency:
<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-starter-model-openai</artifactId>
</dependency>
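Since we left the version out, it has to be managed elsewhere. One common option is importing the Spring AI BOM (the ${spring-ai.version} property here is a placeholder for the release we're using):

```xml
<dependencyManagement>
    <dependencies>
        <dependency>
            <groupId>org.springframework.ai</groupId>
            <artifactId>spring-ai-bom</artifactId>
            <version>${spring-ai.version}</version>
            <type>pom</type>
            <scope>import</scope>
        </dependency>
    </dependencies>
</dependencyManagement>
```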
This starter auto-configures the OpenAI clients, including the moderation model we'll use for our moderation requests.
3. Configuration
Next, we configure our Spring AI client:
spring:
  ai:
    openai:
      api-key: ${OPEN_AI_API_KEY}
      moderation:
        options:
          model: omni-moderation-latest
We’ve specified the API key and moderation model name. Now, we can start using the moderation API.
4. Moderation Categories
Let’s review the moderation categories we can use:
- Hate: content that expresses or promotes hate based on protected traits.
- Hate/Threatening: hateful content that also includes threats of violence or serious harm.
- Harassment: language that harasses, bullies, or targets an individual or group.
- Harassment/Threatening: harassment that includes explicit threats or intent to cause harm.
- Self-harm: content that promotes or depicts self-harm behaviors.
- Self-harm/Intent: content in which someone expresses an intent to self-harm.
- Self-harm/Instructions: content that gives instructions, methods, or encouragement to self-harm.
- Sexual: explicit sexual content or promotion of sexual services.
- Sexual/Minors: any sexual content involving minors, which is strictly disallowed.
- Violence: content that depicts or describes death, violence, or physical injury.
- Violence/Graphic: vivid or graphic depictions of injury, death, or severe harm.
- Illicit: advice, instructions, or promotion of illegal activities.
- Illicit/Violent: illicit content that also includes elements of violence.
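Under the hood, these categories come back from the moderation endpoint as boolean flags accompanied by per-category confidence scores. A trimmed, illustrative sketch of the raw response shape (the values below are made up, and most categories are omitted for brevity):

```json
{
  "model": "omni-moderation-latest",
  "results": [
    {
      "flagged": true,
      "categories": {
        "harassment": true,
        "harassment/threatening": false,
        "violence": false
      },
      "category_scores": {
        "harassment": 0.91,
        "harassment/threatening": 0.02,
        "violence": 0.003
      }
    }
  ]
}
```

Spring AI maps this structure onto typed objects, so we'll read the flags through getters rather than parsing JSON ourselves.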
5. Building the Moderation Service
Now, let's build the moderation service. It will consume user input messages and validate them against the categories above using the moderation model.
5.1. TextModerationService
Let’s start by building the TextModerationService:
@Service
public class TextModerationService {

    private final OpenAiModerationModel openAiModerationModel;

    @Autowired
    public TextModerationService(OpenAiModerationModel openAiModerationModel) {
        this.openAiModerationModel = openAiModerationModel;
    }

    public String moderate(String text) {
        ModerationPrompt moderationRequest = new ModerationPrompt(text);
        ModerationResponse response = openAiModerationModel.call(moderationRequest);
        Moderation output = response.getResult().getOutput();
        return output.getResults().stream()
          .map(this::buildModerationResult)
          .collect(Collectors.joining("\n"));
    }
}
Here, we’ve used the OpenAiModerationModel. We send the ModerationPrompt with the text we want to moderate and build the result from the model’s response. Now, let’s create the buildModerationResult() method:
private String buildModerationResult(ModerationResult moderationResult) {
    Categories categories = moderationResult.getCategories();
    String violations = Stream.of(
        Map.entry("Sexual", categories.isSexual()),
        Map.entry("Hate", categories.isHate()),
        Map.entry("Harassment", categories.isHarassment()),
        Map.entry("Self-Harm", categories.isSelfHarm()),
        Map.entry("Sexual/Minors", categories.isSexualMinors()),
        Map.entry("Hate/Threatening", categories.isHateThreatening()),
        Map.entry("Violence/Graphic", categories.isViolenceGraphic()),
        Map.entry("Self-Harm/Intent", categories.isSelfHarmIntent()),
        Map.entry("Self-Harm/Instructions", categories.isSelfHarmInstructions()),
        Map.entry("Harassment/Threatening", categories.isHarassmentThreatening()),
        Map.entry("Violence", categories.isViolence()))
      .filter(entry -> Boolean.TRUE.equals(entry.getValue()))
      .map(Map.Entry::getKey)
      .collect(Collectors.joining(", "));
    return violations.isEmpty()
      ? "No category violations detected."
      : "Violated categories: " + violations;
}
Here, we read the category flags from the moderation result, pair each category name with its flag, keep only the violated categories, and join them into a single message. If no categories are violated, we return a default text response instead.
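The entry-stream pattern above doesn't depend on Spring AI at all, so we can sketch it standalone. In this illustrative snippet, the ViolationJoiner class and its hardcoded three categories are our own inventions, standing in for the full category list:

```java
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class ViolationJoiner {

    // Pair each category name with its boolean flag, keep the flagged ones, and join them.
    public static String join(boolean harassment, boolean violence, boolean hate) {
        String violations = Stream.of(
            Map.entry("Harassment", harassment),
            Map.entry("Violence", violence),
            Map.entry("Hate", hate))
          .filter(entry -> Boolean.TRUE.equals(entry.getValue()))
          .map(Map.Entry::getKey)
          .collect(Collectors.joining(", "));
        return violations.isEmpty()
          ? "No category violations detected."
          : "Violated categories: " + violations;
    }

    public static void main(String[] args) {
        // Two flagged categories are joined in declaration order.
        System.out.println(join(true, true, false));
        // No flags set falls back to the default message.
        System.out.println(join(false, false, false));
    }
}
```

Because Map.entry() preserves the order we declare the entries in, the joined output always lists violations in a stable, predictable order.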
5.2. TextModerationController
Before building the controller, let’s create the ModerateRequest, which we’ll use to send the text for moderation:
public class ModerateRequest {

    private String text;

    // getters and setters
}
Now, let’s create the TextModerationController:
@RestController
public class TextModerationController {

    private final TextModerationService service;

    @Autowired
    public TextModerationController(TextModerationService service) {
        this.service = service;
    }

    @PostMapping("/moderate")
    public ResponseEntity<String> moderate(@RequestBody ModerateRequest request) {
        return ResponseEntity.ok(service.moderate(request.getText()));
    }
}
Here, we extract the text from the ModerateRequest and pass it to our TextModerationService.
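Assuming the application is running locally, a request and a possible response look like this (the response body depends on the model's verdict for the submitted text):

```http
POST /moderate HTTP/1.1
Content-Type: application/json

{"text": "Please review me"}

HTTP/1.1 200 OK

No category violations detected.
```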
5.3. Test the Behavior
Finally, let’s test our moderation service:
@AutoConfigureMockMvc
@SpringBootTest
@ActiveProfiles("moderation")
class ModerationApplicationLiveTest {

    @Autowired
    private MockMvc mockMvc;

    @Test
    void givenTextWithoutViolation_whenModerating_thenNoCategoryViolationsDetected() throws Exception {
        String moderationResponse = mockMvc.perform(post("/moderate")
            .contentType(MediaType.APPLICATION_JSON)
            .content("{\"text\": \"Please review me\"}"))
          .andExpect(status().isOk())
          .andReturn()
          .getResponse()
          .getContentAsString();

        assertThat(moderationResponse).contains("No category violations detected");
    }
}
We sent a text that doesn’t violate any categories and verified that the service confirms it. Now, let’s test the behavior when some categories are violated:
@Test
void givenHarassingText_whenModerating_thenHarassmentCategoryShouldBeFlagged() throws Exception {
    String moderationResponse = mockMvc.perform(post("/moderate")
        .contentType(MediaType.APPLICATION_JSON)
        .content("{\"text\": \"You're really Bad Person! I don't like you!\"}"))
      .andExpect(status().isOk())
      .andReturn()
      .getResponse()
      .getContentAsString();

    assertThat(moderationResponse).contains("Violated categories: Harassment");
}
As we can see, the Harassment category was violated as expected. Now, let’s check if our service can moderate multiple violations:
@Test
void givenTextViolatingMultipleCategories_whenModerating_thenAllCategoriesShouldBeFlagged() throws Exception {
    String moderationResponse = mockMvc.perform(post("/moderate")
        .contentType(MediaType.APPLICATION_JSON)
        .content("{\"text\": \"I hate you and I will hurt you!\"}"))
      .andExpect(status().isOk())
      .andReturn()
      .getResponse()
      .getContentAsString();

    assertThat(moderationResponse).contains("Violated categories: Harassment, Harassment/Threatening, Violence");
}
We sent a text that contains multiple violations. As we can see, the service response confirms that three categories were violated.
6. Conclusion
In this article, we reviewed the OpenAI Moderation Model integration with Spring AI. We explored the moderation categories and built a service to moderate incoming text. This service can be part of a larger system that works with user content. For example, we can attach it to a chat moderation bot that helps us control the quality of conversations under our articles.
As always, the code is available over on GitHub.