1. Overview

Spring Data Repositories offer flexible ways to query large chunks of data in a Collection or in a Stream. In this tutorial, we’ll learn about querying the data in a List and a Stream and when to use them.

2. List vs. Stream

As we know, social media sites have the details of millions of users. Let’s define a situation where there’s a need to find all users whose age is greater than 20. In this section, we’ll learn to solve this problem using queries that return List and Stream. We’ll also understand the ways both queries work.

Since we’ll use some code examples, there are some prerequisites for running them. We’ve used the H2 database. User is our Entity that has firstName, lastName, and age as its attributes. We’re persisting some users in the setup method of the test class.

We’ve used the Java Faker library to generate random data for this entity.

2.1. List

List is a Java interface for an ordered collection of elements, with implementations such as ArrayList and LinkedList.

In the following example, we’ll write a Spring Data JPA test that loads all users in a List and asserts that all the users in the result are older than 20.

Spring Data offers multiple ways to create queries. Here we’ll use the query method to form our query.

We’ll use this query method to load users in a List:

List<User> findByAgeGreaterThan(int age);

Now, let’s write a test case to see how it works:

@Test
public void whenAgeIs20_thenItShouldReturnAllUsersWhoseAgeIsGreaterThan20InAList() {
  List<User> users = userRepository.findByAgeGreaterThan(20);
  assertThat(users.stream().map(User::getAge).allMatch(age -> age > 20)).isTrue();
}

The above test case queries users and asserts that all of them are older than 20. Here, the client gets all the users at once, and the underlying database resources are closed once the users have been fetched, unless we keep them open.

2.2. Stream

A Stream is a pipeline through which data flows. Some intermediate methods that it supports perform operations on the data as it flows.

Although querying in List is the common way to fetch collections, there are some caveats about using it as a database result that we’ll discuss in the next section. For now, let’s understand how to query data in Stream.

We’ll use this query method this time to load users in a Stream:

Stream<User> findAllByAgeGreaterThan(int age);

Now, let’s write a test case:

@Test
public void whenAgeIs20_thenItShouldReturnAllUsersWhoseAgeIsGreaterThan20InAStream() {
  Stream<User> users = userRepository.findAllByAgeGreaterThan(20);
  assertThat(users.map(User::getAge).allMatch(age -> age > 20)).isTrue();
}

We can clearly see that by getting the results in a Stream, we can operate on them directly. As soon as the first user arrives, the client can act on it, and the underlying database resources remain open while all the users in the stream are processed.
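To make the difference in timing concrete, here’s a minimal plain-Java sketch that doesn’t touch Spring Data at all. The rows method and its counter are hypothetical stand-ins for a database cursor and the number of rows it has handed over. How many rows a real query fetches up front depends on the JPA provider and the JDBC driver, so this is an analogy rather than a measurement:

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class ListVsStreamSketch {

    // Hypothetical row source: each element stands for one row handed over
    // by the database; the counter records how many rows were produced.
    static Stream<Integer> rows(AtomicInteger fetched, int total) {
        return Stream.iterate(21, age -> age + 1)
                .limit(total)
                .peek(age -> fetched.incrementAndGet());
    }

    // List style: every row is materialized before the caller sees the first one.
    static int fetchedWhenCollectingToList(int total) {
        AtomicInteger fetched = new AtomicInteger();
        List<Integer> all = rows(fetched, total).collect(Collectors.toList());
        all.get(0); // the first element is only available after the full load
        return fetched.get();
    }

    // Stream style: rows are pulled on demand, so taking the first one
    // doesn't force the rest to be produced.
    static int fetchedWhenTakingFirst(int total) {
        AtomicInteger fetched = new AtomicInteger();
        rows(fetched, total).findFirst().orElseThrow(IllegalStateException::new);
        return fetched.get();
    }

    public static void main(String[] args) {
        System.out.println("List:   " + fetchedWhenCollectingToList(1000));
        System.out.println("Stream: " + fetchedWhenTakingFirst(1000));
    }
}
```

With 1,000 simulated rows, the List variant produces all 1,000 before the client can touch the first one, while the Stream variant produces only one.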

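To ensure the EntityManager doesn’t close until all results in the Stream are processed, the method that queries and consumes the Stream must run inside a transaction, typically by annotating it with @Transactional. It’s also good practice to wrap the Stream query in try-with-resources so that the Stream, and the database resources behind it, are closed once processing finishes.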
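The effect of try-with-resources on a Stream can be seen with the JDK alone. In the sketch below, openUserAges is a hypothetical stand-in for a repository method, and the AtomicBoolean flag plays the role of the underlying database resource. It only illustrates the closing behavior of java.util.stream.Stream and isn’t a Spring Data test:

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.stream.Stream;

class StreamCloseSketch {

    // Hypothetical stand-in for a repository method: "opens" a resource and
    // registers a close handler that releases it when the stream is closed.
    static Stream<Integer> openUserAges(AtomicBoolean resourceOpen) {
        resourceOpen.set(true);
        return Stream.of(21, 35, 48).onClose(() -> resourceOpen.set(false));
    }

    // try-with-resources calls close() on the stream, which runs the handler.
    static boolean resourceOpenAfterTryWithResources() {
        AtomicBoolean resourceOpen = new AtomicBoolean();
        try (Stream<Integer> ages = openUserAges(resourceOpen)) {
            ages.allMatch(age -> age > 20);
        }
        return resourceOpen.get();
    }

    // A terminal operation alone doesn't close the stream, so the handler never runs.
    static boolean resourceOpenWithoutClose() {
        AtomicBoolean resourceOpen = new AtomicBoolean();
        openUserAges(resourceOpen).allMatch(age -> age > 20);
        return resourceOpen.get();
    }

    public static void main(String[] args) {
        System.out.println("open after try-with-resources: " + resourceOpenAfterTryWithResources());
        System.out.println("open without close:            " + resourceOpenWithoutClose());
    }
}
```

The first method reports the resource as released, while the second leaves it open, which is why a terminal operation on its own isn’t enough.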

Now that we know how to use each of them, in the next section we’ll explore when it’s best to use each one.

3. When to Use

It’s important to use Stream and List in the appropriate context, as using them in situations where they aren’t the best choice may lead to issues such as poor performance or unexpected behavior. It’s always good to evaluate alternatives and choose the one that’s most suitable for the problem.

A List is ideal for small result sets where all records are needed at once. A Stream is better suited to large result sets that can be processed one record at a time, and to cases where the client needs a Stream rather than a Collection.

When querying data in a Stream, we should prefer expressing the logic in the database query rather than in intermediate Stream methods if both can produce the same result, because rows that the query excludes never have to be transferred to the application.
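As a rough illustration of why the placement of the filter matters, the following plain-Java sketch simulates both approaches over an in-memory table. The TABLE contents, the findAll stand-in, and the transferred counter are all assumptions made for this example. A real database would apply the condition in SQL rather than in Java:

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class FilterPlacementSketch {

    // Hypothetical in-memory "table" of user ages.
    static final List<Integer> TABLE = Arrays.asList(15, 18, 20, 21, 34, 52);

    // Stand-in for a query like findAllByAgeGreaterThan(20): the condition is
    // applied at the source, so only matching rows are counted as transferred.
    static Stream<Integer> findAllByAgeGreaterThan(int minAge, AtomicInteger transferred) {
        return TABLE.stream()
                .filter(age -> age > minAge)
                .peek(age -> transferred.incrementAndGet());
    }

    // Stand-in for an unfiltered query: every row is counted as transferred.
    static Stream<Integer> findAll(AtomicInteger transferred) {
        return TABLE.stream().peek(age -> transferred.incrementAndGet());
    }

    static int transferredWithDatabaseFilter() {
        AtomicInteger transferred = new AtomicInteger();
        findAllByAgeGreaterThan(20, transferred).collect(Collectors.toList());
        return transferred.get();
    }

    static int transferredWithStreamFilter() {
        AtomicInteger transferred = new AtomicInteger();
        // Same result as above, but the filtering happens after the transfer.
        findAll(transferred).filter(age -> age > 20).collect(Collectors.toList());
        return transferred.get();
    }

    public static void main(String[] args) {
        System.out.println("database filter: " + transferredWithDatabaseFilter());
        System.out.println("stream filter:   " + transferredWithStreamFilter());
    }
}
```

Both variants end up with the same three users, but the simulated database filter transfers three rows while the Stream filter transfers all six.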

4. Conclusion

In this article, we learned how to use List and Stream when working with Spring Data Repositories.

We also understood that List is used when a client needs all results at once while in the case of Stream, the client can start working as soon as it gets the first result. We also discussed the effect on underlying database resources and when it’s best to use them.

All the code examples used in this article are available over on GitHub.
