Returning Stream vs. Collection

1. Overview

The Java 8 Stream API offers an alternative to Java Collections for returning or processing a result set. However, deciding which one to use in a given situation is a common dilemma.

In this article, we’ll explore Stream and Collection and discuss various scenarios that suit their respective uses.

2. Collection vs. Stream

Java Collections offer efficient mechanisms to store and process the data by providing data structures like List, Set, and Map.

The Stream API, on the other hand, performs operations on the data without needing intermediate storage. A Stream doesn’t hold elements itself; it pulls them on demand from an underlying source such as a collection or an I/O resource.

Additionally, collections are primarily concerned with providing access to the data and ways to modify it. Streams, in contrast, are concerned with describing the computations to perform on the data.

Although Java allows easy conversion from Collection to Stream and vice-versa, it’s handy to know which is the best possible mechanism to render/process a result set.

For instance, we can convert a Collection into a Stream using the stream and parallelStream methods:

public static Stream<String> userNames() {
    ArrayList<String> userNameSource = new ArrayList<>();
    userNameSource.add("john");
    userNameSource.add("smith");
    userNameSource.add("tom");
    // Stream over the list we just populated
    return userNameSource.stream();
}

public static List<String> userNameList() {
    return userNames().collect(Collectors.toList());
}

Here, the userNames method converts a Collection into a Stream, and the userNameList method converts that Stream back into a List using the Collectors.toList() method. Similarly, we can convert a Stream into a Set or into a Map:

public static Set<String> userNameSet() {
    return userNames().collect(Collectors.toSet());
}

public static Map<String, String> userNameMap() {
    return userNames().collect(Collectors.toMap(u1 -> u1.toString(), u1 -> u1.toString()));
}

3. When to Return a Stream?

3.1. High Materialization Cost

The Stream API offers lazy execution and on-the-fly filtering of results. Both are effective ways to lower the materialization cost, that is, the cost of computing every element and holding it in memory before returning.

For instance, the readAllLines method in the Java NIO Files class renders all the lines of a file, for which the JVM has to hold the entire file contents in memory. So, this method has a high materialization cost involved in returning the list of lines.

However, the Files class also provides the lines method, which returns a Stream. We can use it to read all the lines or, even better, restrict the size of the result set using the limit method, both with lazy execution:

Files.lines(path).limit(10).collect(toList());
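One detail worth noting is that the Stream returned by Files.lines holds an open file handle until it’s closed, so it’s usually wrapped in a try-with-resources block. The following is a minimal sketch of that pattern; the FirstLinesReader class, its method names, and the temporary file are hypothetical and exist only for illustration:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class FirstLinesReader {

    // Reads at most maxLines lines. The stream, and the file handle
    // behind it, is closed when the try block exits.
    public static List<String> firstLines(Path path, int maxLines) throws IOException {
        try (Stream<String> lines = Files.lines(path)) {
            return lines.limit(maxLines).collect(Collectors.toList());
        }
    }

    // Hypothetical demo: writes four lines to a temp file and reads back two.
    public static List<String> demo() {
        try {
            Path tempFile = Files.createTempFile("lines-demo", ".txt");
            try {
                Files.write(tempFile, Arrays.asList("a", "b", "c", "d"));
                return firstLines(tempFile, 2);
            } finally {
                Files.deleteIfExists(tempFile);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [a, b]
    }
}
```

Because limit is a short-circuiting operation, the pipeline stops pulling lines once it has the requested number instead of loading the whole file.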

Also, a Stream doesn’t perform the intermediate operations until we invoke terminal operations like forEach over it:

userNames().filter(i -> i.length() >= 4).forEach(System.out::println);

Therefore, a Stream avoids the costs associated with premature materialization.
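To make the lazy behavior visible, the sketch below records which elements the filter has inspected before and after a terminal operation runs. The LazyEvaluationDemo class and the side effect inside the predicate are assumptions added purely for demonstration; production predicates should normally stay free of side effects:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class LazyEvaluationDemo {

    // Returns how many elements the filter has inspected
    // before and after the terminal operation runs.
    public static int[] inspectedBeforeAndAfter() {
        List<String> inspected = new ArrayList<>();

        Stream<String> pipeline = Arrays.asList("john", "smith", "tom").stream()
            .filter(name -> {
                inspected.add(name); // side effect only to make evaluation visible
                return name.length() >= 4;
            });

        int before = inspected.size();          // pipeline is only declared so far
        pipeline.collect(Collectors.toList());  // terminal operation triggers the filter
        int after = inspected.size();

        return new int[] { before, after };
    }

    public static void main(String[] args) {
        int[] counts = inspectedBeforeAndAfter();
        System.out.println(counts[0] + " -> " + counts[1]); // prints "0 -> 3"
    }
}
```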

3.2. Large or Infinite Result

Streams are well suited to large or infinite results because they produce elements on demand instead of holding the entire result in memory. Therefore, a Stream is usually the better choice for such a use case.

Also, in the case of infinite results, we usually don’t process the entire result set. So, Stream API’s built-in features like filter and limit prove handy in processing the desired result set, making the Stream a preferable choice.
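As a small illustration, the following sketch returns an infinite Stream of even numbers built with Stream.iterate and lets the caller decide how much of it to consume. The InfiniteStreamDemo class and its method names are hypothetical:

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class InfiniteStreamDemo {

    // An unbounded source: 0, 2, 4, 6, ...
    // This could never be returned as a Collection.
    public static Stream<Integer> evenNumbers() {
        return Stream.iterate(0, n -> n + 2);
    }

    // The consumer narrows the infinite result down to what it needs.
    public static List<Integer> firstEvenNumbersAbove(int threshold, int count) {
        return evenNumbers()
            .filter(n -> n > threshold)
            .limit(count) // without a short-circuiting step the pipeline would never finish
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(firstEvenNumbersAbove(10, 3)); // prints [12, 14, 16]
    }
}
```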

3.3. Flexibility

Streams let the consumer process the results in whatever form or order it needs.

A Stream is a natural choice when we don’t want to impose a fixed result set on the consumer and would rather let it decide how to filter, order, or limit the results.

For instance, we can filter/order/limit the results using various operations available on the Stream API:

public static Stream<String> filterUserNames() {
    return userNames().filter(i -> i.length() >= 4);
}

public static Stream<String> sortUserNames() {
    return userNames().sorted();
}

public static Stream<String> limitUserNames() {
    return userNames().limit(3);
}

3.4. Functional Behavior

A Stream is functional in nature: its operations produce a result without modifying the source. Therefore, it’s a preferred choice when we don’t want the consumer’s processing to alter the underlying data.

For instance, let’s filter and limit a set of results received from the primary Stream:

userNames().filter(i -> i.length() >= 4).limit(3).forEach(System.out::println);

Here, operations like filter and limit return a new Stream every time and don’t modify the data behind the Stream provided by the userNames method.
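The sketch below checks this behavior on a plain List: after filtering and limiting, the source still contains all of its original elements. The SourceUnchangedDemo class and the longNames method are hypothetical names used only for this example:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class SourceUnchangedDemo {

    // Builds a new list from the stream pipeline; the source list is only read.
    public static List<String> longNames(List<String> source) {
        return source.stream()
            .filter(name -> name.length() >= 4)
            .limit(3)
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> source = new ArrayList<>(Arrays.asList("john", "smith", "tom"));

        System.out.println(longNames(source)); // prints [john, smith]
        System.out.println(source);            // prints [john, smith, tom]
    }
}
```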

4. When to Return a Collection?

4.1. Low Materialization Cost

We can choose collections over streams when the results are cheap to materialize.

A Collection is built eagerly: all of its elements are computed up front and held in memory. With a large result set, this puts a lot of pressure on the heap memory.

Therefore, we should consider a Collection for a result set whose materialization doesn’t put much pressure on the heap memory.

4.2. Fixed Format

We can use a Collection to enforce a consistent result set for the user. For instance, sorted implementations like TreeSet and TreeMap keep their elements (or keys) in natural order by default.

In other words, with the use of the Collection, we can ensure each consumer receives and processes the same result set in identical order.
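As a brief sketch of this idea, returning a SortedSet backed by a TreeSet guarantees that every consumer iterates the names in the same natural order, regardless of insertion order. The SortedUserNames class is a hypothetical example:

```java
import java.util.Arrays;
import java.util.SortedSet;
import java.util.TreeSet;

public class SortedUserNames {

    // The SortedSet return type tells the caller that ordering is part of the contract.
    public static SortedSet<String> sortedUserNames() {
        return new TreeSet<>(Arrays.asList("tom", "john", "smith"));
    }

    public static void main(String[] args) {
        System.out.println(sortedUserNames()); // prints [john, smith, tom]
    }
}
```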

4.3. Reusable Result

When a result is returned in the form of a Collection, it can be easily traversed multiple times. However, a Stream is considered consumed once traversed and throws IllegalStateException when reused:

public static void tryStreamTraversal() {
    Stream<String> userNameStream = userNames();
    userNameStream.forEach(System.out::println);
    
    try {
        userNameStream.forEach(System.out::println);
    } catch(IllegalStateException e) {
        System.out.println("stream has already been operated upon or closed");
    }
}

Therefore, returning a Collection is a better choice when it’s obvious that a consumer will traverse the result multiple times.

4.4. Modification

A Collection, unlike a Stream, allows modifications such as adding or removing elements. Hence, we can return a collection when we want the consumer to be able to modify the result set.

For example, we can modify an ArrayList using the add/remove methods. Since Collectors.toList() makes no guarantee about the mutability of the list it returns, we first copy the result into an ArrayList and then modify that single instance:

List<String> userNameList = new ArrayList<>(userNameList());
userNameList.add("bob");
userNameList.add("pepper");
userNameList.remove(2);

Similarly, methods like put and remove allow modification on a map:

Map<String, String> userNameMap = new HashMap<>(userNameMap());
userNameMap.put("bob", "bob");
userNameMap.remove("alfred");

4.5. In-Memory Result

Additionally, it’s an obvious choice to use a Collection when a materialized result in the form of the collection is already present in memory.
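For instance, a class that already keeps its results in a List can simply hand that list out instead of wrapping it in a Stream. The sketch below is only an illustration: the UserDirectory class is hypothetical, and returning an unmodifiable view is one possible design choice that protects the internal state from callers; it isn’t the only option.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class UserDirectory {

    // The result is already materialized in memory.
    private final List<String> userNames =
        new ArrayList<>(Arrays.asList("john", "smith", "tom"));

    // No copying and no stream pipeline: callers get a read-only view of the existing list.
    public List<String> getUserNames() {
        return Collections.unmodifiableList(userNames);
    }

    public static void main(String[] args) {
        UserDirectory directory = new UserDirectory();
        System.out.println(directory.getUserNames()); // prints [john, smith, tom]
    }
}
```

A caller that still wants stream-style processing can call stream() on the returned list.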

5. Conclusion

In this article, we compared Stream vs. Collection and examined various scenarios that suit them.

We can conclude that a Stream is a great candidate to render large or infinite result sets, with benefits like lazy evaluation, flexibility for the consumer, and functional behavior.

However, when we require a consistent form of the results, or when the materialization cost is low, we should choose a Collection over a Stream.

As usual, the source code is available over on GitHub.
