1. Introduction

In this tutorial, we’ll compare Java streams and for-loops in depth. Both tools play a vital role in data processing for every Java developer. Although they differ in many ways, as we’ll see in the rest of the article, they have very similar use cases and are often interchangeable.

Streams, introduced in Java 8, offer a functional and declarative approach, while for-loops provide a traditional imperative method. By the end of the article, we’ll be able to choose the most suitable one for our programming tasks.

2. Performance

When comparing solutions to a particular programming problem, we often have to talk about performance, and this case is no different. Since both streams and for-loops are used to process large amounts of data, performance can be important in choosing the right solution.

Let’s walk through a benchmarking example to understand the performance differences between for-loops and streams. We’ll compare the execution times of an operation that filters, maps, and sums, using both for-loops and streams. For this purpose, we’ll use the Java Microbenchmark Harness (JMH), a tool designed specifically for benchmarking Java code.

2.1. Getting Started

We start by defining the dependencies:

<dependency>
    <groupId>org.openjdk.jmh</groupId>
    <artifactId>jmh-core</artifactId>
    <version>1.37</version>
</dependency>
<dependency>
    <groupId>org.openjdk.jmh</groupId>
    <artifactId>jmh-generator-annprocess</artifactId>
    <version>1.37</version>
</dependency>

We can always find the latest versions of JMH Core and JMH Annotation Processor on Maven Central.

2.2. Setting Up the Benchmark

In our benchmark, we’ll use a list of integers ranging from 0 to 999,999. We want to keep only the even numbers, square them, and then calculate their sum. To ensure fairness, both implementations will read the same data from a shared state class:

@State(Scope.Thread)
public static class MyState {
    List<Integer> numbers;

    @Setup(Level.Trial)
    public void setUp() {
        numbers = new ArrayList<>();
        for (int i = 0; i < 1_000_000; i++) {
            numbers.add(i);
        }
    }
}

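JMH passes an instance of this state class to each benchmark method. Because setUp() is annotated with @Setup(Level.Trial), it runs once per benchmark trial, before the warmup and measurement iterations, so building the list isn’t part of the measured time.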

2.3. Benchmarking with For-Loops

Our for-loop implementation involves iterating through the list of numbers, checking for evenness, squaring them, and accumulating the sum in a variable:

@Benchmark
public int forLoopBenchmark(MyState state) {
    int sum = 0;
    for (int number : state.numbers) {
        if (number % 2 == 0) {
            sum = sum + (number * number);
        }
    }
    return sum;
}
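
One caveat about the arithmetic above: squares of values near 999,999 exceed Integer.MAX_VALUE, so the int sum silently wraps around. Both benchmark implementations in this article wrap in the same way, so the timing comparison stays like-for-like, but the returned value isn’t the true sum. The following is a minimal sketch of an overflow-safe variant, assuming we’re free to widen the result to long. The class and method names are illustrative and aren’t part of the original benchmark, and we haven’t measured this variant, so the figures reported in this article apply only to the int version:

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Hypothetical helper class, not part of the article's benchmark
class OverflowSafeSum {

    // Loop variant: widen to long before multiplying so the square can't wrap
    static long forLoopSum(List<Integer> numbers) {
        long sum = 0L;
        for (int number : numbers) {
            if (number % 2 == 0) {
                sum += (long) number * number;
            }
        }
        return sum;
    }

    // Stream variant: mapToLong() switches to a LongStream, whose sum() returns a long
    static long streamSum(List<Integer> numbers) {
        return numbers.stream()
          .filter(number -> number % 2 == 0)
          .mapToLong(number -> (long) number * number)
          .sum();
    }

    public static void main(String[] args) {
        List<Integer> numbers = IntStream.range(0, 1_000_000)
          .boxed()
          .collect(Collectors.toList());
        // Both variants should agree on the same input
        System.out.println(forLoopSum(numbers) == streamSum(numbers));
    }
}
```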

2.4. Benchmarking with Streams

Next, we’ll implement the same operations using Java streams. We filter the even numbers, map them to their squares, and finally compute the sum:

@Benchmark
public int streamBenchMark(MyState state) {
    return state.numbers.stream()
      .filter(number -> number % 2 == 0)
      .map(number -> number * number)
      .reduce(0, Integer::sum);
}

We use the terminal operation reduce() to compute the sum of the numbers. This is only one of several options: for example, we could also convert to an IntStream with mapToInt() and call sum().
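
As a rough illustration of those other ways to compute a sum, the sketch below puts three standard options side by side on a small list. The class and method names are made up for this example, and none of these variants were part of the benchmark above:

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper class for illustration only
class SumAlternatives {

    // reduce() on a Stream<Integer>, as used in the benchmark
    static int withReduce(List<Integer> numbers) {
        return numbers.stream().reduce(0, Integer::sum);
    }

    // mapToInt() unboxes to an IntStream, which has a built-in sum()
    static int withIntStream(List<Integer> numbers) {
        return numbers.stream().mapToInt(Integer::intValue).sum();
    }

    // Collectors.summingInt() performs the same summation as a collector
    static int withCollector(List<Integer> numbers) {
        return numbers.stream().collect(Collectors.summingInt(Integer::intValue));
    }

    public static void main(String[] args) {
        List<Integer> numbers = Arrays.asList(1, 2, 3, 4);
        System.out.println(withReduce(numbers));    // 10
        System.out.println(withIntStream(numbers)); // 10
        System.out.println(withCollector(numbers)); // 10
    }
}
```

All three return the same value. Which one reads best is largely a matter of taste, although the IntStream version avoids boxing the intermediate results.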

2.5. Running the Benchmark

With our benchmark methods in place, we’ll run the benchmark using JMH. We’ll execute the benchmark multiple times to ensure accurate results and measure the average execution time. To do this, we’ll add the following annotations to our class:

@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
@Warmup(iterations = 3, time = 1, timeUnit = TimeUnit.SECONDS)
@Measurement(iterations = 5, time = 1, timeUnit = TimeUnit.SECONDS)

With these settings, JMH runs three one-second warmup iterations followed by five one-second measurement iterations, and reports the average time per operation across the measured iterations. The warmup gives the JIT compiler time to optimize the code before measurement begins. Now, we can run the main method to see the results:

public static void main(String[] args) throws RunnerException {
    Options options = new OptionsBuilder()
      .include(PerformanceBenchmark.class.getSimpleName())
      .build();
    new Runner(options).run();
}

2.6. Analyzing the Results

Once we run the benchmark, JMH will provide us with average execution times for both the for-loop and stream implementations:

Benchmark                              Mode  Cnt         Score         Error  Units
PerformanceBenchmark.forLoopBenchmark  avgt    5   3386660.051 ± 1375112.505  ns/op
PerformanceBenchmark.streamBenchMark   avgt    5  12231480.518 ± 1609933.324  ns/op

In this run, the for-loop averaged about 3.4 ms per operation against roughly 12.2 ms for the stream, making it around 3.6 times faster. The error margins are wide, though, and the numbers depend on the JVM and hardware, so they’re best read as an indication rather than a precise ratio. The picture can also change in other scenarios, especially with parallel streams.

3. Syntax and Readability

Readability matters to us as programmers, so it’s an important factor when choosing the best solution for a problem.

Let’s start with the syntax and readability of streams. Streams promote a more concise and expressive style of coding. This is evident when filtering and counting data:

List<String> fruits = Arrays.asList("apple", "banana", "orange", "grape");
long count = fruits.stream()
  .filter(fruit -> fruit.length() > 5)
  .count();

The stream code reads as a single fluent chain of operations, with the filtering condition and the count clearly expressed. Furthermore, streams often result in code that’s easier to read due to their declarative nature. The code focuses more on what needs to be done than how to do it.

In contrast, let’s look at the syntax and readability of for-loops. For-loops provide a more traditional and imperative style of coding:

List<String> fruits = Arrays.asList("apple", "banana", "orange", "grape");
long count = 0;
for (String fruit : fruits) {
    if (fruit.length() > 5) {
        count++;
    }
}

Here, the code involves explicit iteration and conditional statements. While this approach is well-understood by most developers, it can sometimes lead to more verbose code, making it potentially harder to read, especially for complex operations.
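
To see how the difference grows with the number of steps, here is a small sketch that extends the fruit example: we keep the names longer than five characters, convert them to uppercase, and sort them. The class and method names are hypothetical and exist only for this comparison:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper class for illustration only
class LongFruitNames {

    // Declarative: each step of the pipeline is one line
    static List<String> withStream(List<String> fruits) {
        return fruits.stream()
          .filter(fruit -> fruit.length() > 5)
          .map(String::toUpperCase)
          .sorted()
          .collect(Collectors.toList());
    }

    // Imperative: the same steps, with explicit iteration and a result list
    static List<String> withLoop(List<String> fruits) {
        List<String> result = new ArrayList<>();
        for (String fruit : fruits) {
            if (fruit.length() > 5) {
                result.add(fruit.toUpperCase());
            }
        }
        Collections.sort(result);
        return result;
    }

    public static void main(String[] args) {
        List<String> fruits = Arrays.asList("apple", "banana", "orange", "grape");
        System.out.println(withStream(fruits)); // [BANANA, ORANGE]
        System.out.println(withLoop(fruits));   // [BANANA, ORANGE]
    }
}
```

Both methods produce the same list. The stream version states each step once, while the loop version needs a temporary list and a separate sorting call.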

4. Parallelism and Concurrency

Parallelism and concurrency are crucial aspects to consider when comparing streams and for-loops in Java. Both approaches offer different capabilities and challenges when utilizing multi-core processors and managing concurrent operations.

Streams are designed to make parallel processing more accessible. Java 8 introduced parallel streams, which split the work across multiple threads (by default, those of the common ForkJoinPool) to take advantage of multi-core processors. We can easily rewrite the benchmark from the previous section to compute the sum in parallel:

@Benchmark
public int parallelStreamBenchMark(MyState state) {
    return state.numbers.parallelStream()
      .filter(number -> number % 2 == 0)
      .map(number -> number * number)
      .reduce(0, Integer::sum);
}

The only change needed to parallelize the pipeline is replacing stream() with parallelStream(). On the other hand, rewriting the for-loop to compute the sum in parallel is more complicated:

@Benchmark
public int concurrentForLoopBenchmark(MyState state) throws InterruptedException, ExecutionException {
    int numThreads = Runtime.getRuntime().availableProcessors();
    ExecutorService executorService = Executors.newFixedThreadPool(numThreads);
    try {
        List<Callable<Integer>> tasks = new ArrayList<>();
        int chunkSize = state.numbers.size() / numThreads;

        // Split the index range into one contiguous chunk per thread
        for (int i = 0; i < numThreads; i++) {
            final int start = i * chunkSize;
            // The last chunk also takes the remainder of the division
            final int end = (i == numThreads - 1) ? state.numbers.size() : (i + 1) * chunkSize;
            tasks.add(() -> {
                int sum = 0;
                for (int j = start; j < end; j++) {
                    int number = state.numbers.get(j);
                    if (number % 2 == 0) {
                        sum = sum + (number * number);
                    }
                }
                return sum;
            });
        }

        // invokeAll() blocks until every task has completed
        int totalSum = 0;
        for (Future<Integer> result : executorService.invokeAll(tasks)) {
            totalSum += result.get();
        }
        return totalSum;
    } finally {
        // Release the pool's threads even if a task fails
        executorService.shutdown();
    }
}

Here, we use Java’s concurrency utilities, such as ExecutorService, to split the list into chunks and process them concurrently on a thread pool. Note that this version creates and shuts down a pool on every invocation, which adds overhead that the parallel stream avoids by reusing the common pool.

When deciding between streams and for-loops for parallelism and concurrency, we should consider the complexity of our task. Streams offer a more straightforward way to enable parallel processing for tasks that can be parallelized easily. On the other hand, for-loops with manual concurrency control are suitable for more complex scenarios that require custom thread management and coordination.
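
Whichever approach we pick, the parallel version should return the same value as the sequential one. The sketch below is a quick sanity check on a smaller list (0 to 999, an arbitrary size chosen so that the int sum doesn’t overflow). It relies on the fact that reduce(0, Integer::sum) uses an associative function with an identity value, so the order in which partial sums are combined doesn’t matter. The class and method names are hypothetical:

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Hypothetical helper class for illustration only
class ParallelSanityCheck {

    // Sequential reference implementation
    static int loopSum(List<Integer> numbers) {
        int sum = 0;
        for (int number : numbers) {
            if (number % 2 == 0) {
                sum = sum + (number * number);
            }
        }
        return sum;
    }

    // Same pipeline as the parallel benchmark
    static int parallelSum(List<Integer> numbers) {
        return numbers.parallelStream()
          .filter(number -> number % 2 == 0)
          .map(number -> number * number)
          .reduce(0, Integer::sum);
    }

    public static void main(String[] args) {
        List<Integer> numbers = IntStream.range(0, 1_000)
          .boxed()
          .collect(Collectors.toList());
        System.out.println(loopSum(numbers) == parallelSum(numbers)); // true
    }
}
```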

5. Mutability

Now, let’s explore mutability and how streams and for-loops differ in handling it. Understanding how each of them treats mutable data is essential for making informed choices.

First, streams promote immutability by design. Stream operations don’t modify the elements of the source collection directly. Instead, intermediate operations return new streams, and terminal operations such as collect() produce new results:

List<String> fruits = new ArrayList<>(Arrays.asList("apple", "banana", "orange"));
List<String> upperCaseFruits = fruits.stream()
  .map(fruit -> fruit.toUpperCase())
  .collect(Collectors.toList());

In this stream operation, the original list remains unchanged. The map() operation produces a new stream where each fruit is transformed to uppercase, and the collect() operation collects these transformed elements into a new list.

In contrast, for-loops can operate on mutable data structures directly:

List<String> fruits = new ArrayList<>(Arrays.asList("apple", "banana", "orange"));
for (int i = 0; i < fruits.size(); i++) {
    fruits.set(i, fruits.get(i).toUpperCase());
}

In this for-loop, we directly modify the original list, replacing every element with its uppercase equivalent. This can be advantageous when we need to modify existing data in place, but it also necessitates careful handling to avoid unintended consequences.
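
One such consequence is that any other code holding a reference to the same list sees the change. The sketch below contrasts the two approaches on the same input. The class and method names are hypothetical and only serve this illustration:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper class for illustration only
class MutabilityDemo {

    // Stream approach: returns a new list and leaves the argument untouched
    static List<String> upperCaseCopy(List<String> fruits) {
        return fruits.stream()
          .map(String::toUpperCase)
          .collect(Collectors.toList());
    }

    // Loop approach: overwrites each element of the argument in place
    static void upperCaseInPlace(List<String> fruits) {
        for (int i = 0; i < fruits.size(); i++) {
            fruits.set(i, fruits.get(i).toUpperCase());
        }
    }

    public static void main(String[] args) {
        List<String> fruits = new ArrayList<>(Arrays.asList("apple", "banana", "orange"));

        List<String> copy = upperCaseCopy(fruits);
        System.out.println(fruits); // [apple, banana, orange]
        System.out.println(copy);   // [APPLE, BANANA, ORANGE]

        upperCaseInPlace(fruits);
        System.out.println(fruits); // [APPLE, BANANA, ORANGE]
    }
}
```

After the in-place call, the caller’s list no longer contains the lowercase values, which is fine when that’s the intent and surprising when it isn’t.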

6. Conclusion

Both streams and loops have their strengths and weaknesses. Streams offer a more functional and declarative approach, enhancing code readability and often leading to concise and elegant solutions. On the other hand, loops provide a familiar and explicit control structure, making them suitable for scenarios where precise execution order or mutability control is critical.

The complete source code and all code snippets for this article are over on GitHub.
