1. Overview

In today’s fast-paced world of AI service development, there is a vast array of pre-trained AI models and numerous model-serving platforms, such as TensorFlow Serving and NVIDIA Triton, available to host them. In the Java ecosystem, the primary focus is on integrating applications with model-serving servers, enabling efficient communication with AI models.

Apache Camel’s KServe component offers a streamlined solution for AI model inference by facilitating interaction with KServe-compliant model servers. This component simplifies the communication between Java applications and the model servers that host the trained AI models.

In this tutorial, we’ll explore the tools necessary for enabling Java applications to interact with AI models. Additionally, we’ll walk through a practical example of a Java service that uses AI to analyze user-provided sentences and determine whether they express positive or negative sentiment.

2. High-Level Overview of the Tools

To integrate our Java application with an AI model, we require a trained AI model, a host to serve the model, and our Java application with a framework that facilitates easy integration with the host.

The model can be in any format, such as TensorFlow, PyTorch, ONNX, or OpenVINO. The host can be any model server that supports the KServe Open Inference Protocol (OIP) V2, such as recent versions of NVIDIA Triton.

Next, let’s have a quick look at the exact tools, protocols, and frameworks that we’re going to use in our application:

  • KServe (formerly KFServing) is a Kubernetes-based tool for serving machine learning models. It’s designed to simplify deploying and managing ML models on Kubernetes, with features like autoscaling, rolling updates, and model versioning. KServe integrates well with tools like TensorFlow, PyTorch, and scikit-learn, making it easier to deploy these models at scale.
  • Open Inference Protocol V2 is a standardized communication protocol. It was first introduced and used in the KServe project. Now, the protocol is supported by multiple backends, not just KServe.
  • Apache Camel is an open-source integration framework that enables the integration of different systems using a variety of protocols and technologies. Camel offers a large number of components, so it can work with almost any technology we throw at it: HTTP, JMS, databases, and more. More specifically, Apache Camel’s KServe component enables integration with systems that support the KServe Open Inference Protocol (OIP) V2.
  • NVIDIA Triton is an open-source model-serving platform designed for high-performance AI inference. It supports a variety of popular frameworks such as TensorFlow, PyTorch, and ONNX, allowing organizations to deploy models at scale with flexibility.
  • Hugging Face is a leading platform in the natural language processing (NLP) space. It offers a wide range of pre-trained models and tools for NLP tasks like text classification, translation, and sentiment analysis. This is where we’ll get our trained AI model for our service as well.

3. Sentiment Service Using a Pre-Trained AI Model

For our demonstration, let’s create a practical application that accepts a sentence and predicts if the sentiment of the sentence is good or bad. All we need is a pre-trained AI model, a host for the model, and a Java application that will offer a more user-friendly experience.

3.1. Setting Up the Pre-Trained Sentiment Model

Creating and training an AI model is outside the scope of this article. Instead, we’ll use an existing model from Hugging Face; specifically, we’ll download the pre-trained model pjxcharya/onnx-sentiment-model. This model accepts a sentence, converted to the proper tokens, and responds with two numbers, one negative and one positive. If the negative number’s absolute value is larger, the sentiment is bad; otherwise, it’s good.

Before moving on, we only need to understand the model’s inputs and outputs. We’ll need them later, when we use Apache Camel’s KServe component to interact with the server. One easy way to inspect them is the online tool netron.app, which displays the model’s graph and properties:

sentiments model, graph properties

From the picture, we see that our inputs are input_ids and attention_mask, both of type int64. The output is logits of type float32.
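To make the decision rule concrete, here’s a minimal plain-Java sketch; the decide() helper and the sample logit values are our own illustration, not part of the model’s API:

```java
public class SentimentDecision {

    // The model returns two logits: index 0 is the negative score,
    // index 1 is the positive score.
    static String decide(float negativeLogit, float positiveLogit) {
        // If the positive logit outweighs the negative one's absolute value,
        // the sentiment is good; otherwise, it's bad
        return Math.abs(negativeLogit) < positiveLogit ? "good" : "bad";
    }
}
```

For example, logits of (-0.5, 2.0) map to good, while (-3.0, 1.0) map to bad.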

3.2. Setting Up the Triton Inference Server

NVIDIA’s Triton Inference Server can host our ONNX model. First, we put the ONNX model file at models/sentiment/1/model.onnx, and then we create the configuration file models/sentiment/config.pbtxt:

name: "sentiment"
platform: "onnxruntime_onnx"
max_batch_size: 8

input [
  {
    name: "input_ids"
    data_type: TYPE_INT64
    dims: [ -1 ]
  },
  {
    name: "attention_mask"
    data_type: TYPE_INT64
    dims: [ -1 ]
  }
]

output [
  {
    name: "logits"
    data_type: TYPE_FP32
    dims: [ 2 ]
  }
]

Here, we set the model’s name, along with the names and types of its inputs and output.

Finally, we create a Dockerfile to make it easily deployable:

FROM nvcr.io/nvidia/tritonserver:25.02-py3

# Copy the model repository into the container
COPY models/ /models/

# Expose default Triton ports
EXPOSE 8000 8001 8002

# Set entrypoint to run Triton with your model repo
CMD ["tritonserver", "--model-repository=/models"]

All we need to do is copy the models folder into the container and set it as the model-repository.

3.3. The Web Service Using Apache Camel’s KServe Component

Apache Camel can communicate with a model server using plain HTTP or gRPC. However, Apache Camel’s KServe component makes the interaction more straightforward and works with any model server that supports the KServe V2 protocol.

Let’s start by setting up the dependencies for the KServe component. Note that the component is available from version 4.10.0 onward:

<dependency>
    <groupId>org.apache.camel</groupId>
    <artifactId>camel-main</artifactId>
    <version>4.13.0</version>
</dependency>
<dependency>
    <groupId>org.apache.camel</groupId>
    <artifactId>camel-kserve</artifactId>
    <version>4.13.0</version>
</dependency>

Next, we proceed by creating the Apache Camel Kserve route, which takes the request (in our case, it’s an HTTP request) and routes it to a call to the Triton Server:

public class SentimentsRoute extends RouteBuilder {
    @Override
    public void configure() {
        // (code that configures the REST incoming call)

        // Main route
        from("direct:classify")
          .routeId("sentiment-inference")
          .setBody(this::createRequest)
          .setHeader("Content-Type", constant("application/json"))
          .to("kserve:infer?modelName=sentiment&target=localhost:8001")
          .process(this::postProcess);
    }

    // ...
}

Assuming we have set the configuration to accept the incoming REST call and route it to “direct:classify”, we set up the route to integrate with the Triton Server. The createRequest() method creates the body of the request, and postProcess() handles the response, as we’ll see later in this tutorial. The endpoint for the KServe protocol starts with “kserve:”, followed by the operation, which is usually infer for Triton. The modelName and target properties point to the correct model and URL (note that in our case, using Docker Compose, we’ll need to set the host to host.docker.internal).
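Since the endpoint is just a URI, we can assemble it from configuration values. The helper below is a small sketch of our own (not part of the Camel API) that makes the modelName and target parameters explicit:

```java
public class KServeEndpointUri {

    // Build the Camel KServe endpoint URI for the infer operation
    static String inferUri(String modelName, String target) {
        return "kserve:infer?modelName=" + modelName + "&target=" + target;
    }
}
```

With this, switching between a local run and Docker Compose only means swapping the target, e.g. localhost:8001 versus host.docker.internal:8001.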

3.4. Handling Apache Camel’s KServe Component Input

Apache Camel provides a rich library for creating requests for our routes. The KServe component provides the InferTensorContents class for building request objects. It allows us to set each tensor’s name, data type, shape, contents, and more:

private ModelInferRequest createRequest(Exchange exchange) {
    String sentence = exchange.getIn().getHeader("sentence", String.class);
    Encoding encoding = tokenizer.encode(sentence);
    List<Long> inputIds = Arrays.stream(encoding.getIds()).boxed().collect(Collectors.toList());
    List<Long> attentionMask = Arrays.stream(encoding.getAttentionMask()).boxed().collect(Collectors.toList());

    var content0 = InferTensorContents.newBuilder().addAllInt64Contents(inputIds);
    var input0 = ModelInferRequest.InferInputTensor.newBuilder()
      .setName("input_ids").setDatatype("INT64").addShape(1).addShape(inputIds.size())
      .setContents(content0);

    var content1 = InferTensorContents.newBuilder().addAllInt64Contents(attentionMask);
    var input1 = ModelInferRequest.InferInputTensor.newBuilder()
      .setName("attention_mask").setDatatype("INT64").addShape(1).addShape(attentionMask.size())
      .setContents(content1);

    ModelInferRequest requestBody = ModelInferRequest.newBuilder()
      .addInputs(0, input0).addInputs(1, input1)
      .build();

    return requestBody;
}

As noted earlier, we need to create a request with two inputs. The tokenizer object comes from the ai.djl.huggingface:tokenizers dependency and helps us encode the human-readable String into the token IDs and attention mask that our model accepts. This is all we need for the request to the Triton Server.

3.5. Handling Apache Camel’s KServe Component Output

Similarly, we can use the ModelInferResponse class to receive the response from the model serving server:

private void postProcess(Exchange exchange) {
    ModelInferResponse response = exchange.getMessage().getBody(ModelInferResponse.class);

    List<List<Float>> logits = response.getRawOutputContentsList().stream()
      .map(ByteString::asReadOnlyByteBuffer)
      .map(buf -> buf.order(ByteOrder.LITTLE_ENDIAN).asFloatBuffer())
      .map(buf -> {
          List<Float> floats = new ArrayList<>(buf.remaining());
          while (buf.hasRemaining()) {
              floats.add(buf.get());
          }
          return floats;
      })
      .toList();

    String result = Math.abs(logits.getFirst().getFirst()) < logits.getFirst().getLast() ? "good" : "bad";

    exchange.getMessage().setBody(result);
}

Here, we use the getRawOutputContentsList() method to translate the raw bytes into float numbers, and then compare the negative logit to the positive one to determine whether the predicted sentiment is good or bad.
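The byte-to-float translation itself is plain java.nio; here’s a standalone sketch of that decoding step (decode() is our own helper for illustration, mirroring the mapping inside postProcess()):

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.ArrayList;
import java.util.List;

public class LogitsDecoder {

    // Decode one raw FP32 output tensor, sent as little-endian bytes,
    // into a list of floats
    static List<Float> decode(byte[] raw) {
        var buf = ByteBuffer.wrap(raw)
          .order(ByteOrder.LITTLE_ENDIAN)
          .asFloatBuffer();
        List<Float> floats = new ArrayList<>(buf.remaining());
        while (buf.hasRemaining()) {
            floats.add(buf.get());
        }
        return floats;
    }
}
```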

3.6. Demonstration Using Docker Compose

Let’s create a docker-compose.yml file to easily spin up both the Triton Server and our Java applications:

services:
  triton-server:
    build: ./triton-server
    environment:
      - NVIDIA_VISIBLE_DEVICES=all
    ports:
      - "8000:8000"  # HTTP
      - "8001:8001"  # gRPC
      - "8002:8002"  # Metrics
  sentiment-service:
    build: ./sentiment-service
    ports:
      - "8080:8080"

After we spin up the services, we can request our Java service to test the bad sentiment:

bad-sentiment

And the good sentiment:

good-sentiment

We see in the pictures that the model works as expected through our web service.

4. Conclusion

In this article, we demonstrated how to perform AI model inference from a Java application using Apache Camel’s KServe component. We used Apache Camel to easily integrate with the Triton model-serving host, which we pre-loaded with an AI model that predicts whether the sentiment of a provided sentence is good or bad.

As always, all the source code used in the examples is available over on GitHub.
