Java Top

I just announced the new Learn Spring course, focused on the fundamentals of Spring 5 and Spring Boot 2:

>> CHECK OUT THE COURSE

1. Introduction

There are several options to iterate over a collection in Java. In this short tutorial, we’ll look at two similar looking approaches — Collection.stream().forEach() and Collection.forEach().

In most cases, both will yield the same results, however, there are some subtle differences we’ll look at.

2. Overview

First, let’s create a list to iterate over:

List<String> list = Arrays.asList("A", "B", "C", "D");

The most straightforward way is using the enhanced for-loop:

for(String s : list) {
    //do something with s
}

If we want to use functional-style Java, we can also use the forEach(). We can do so directly on the collection:

Consumer<String> consumer = s -> { System.out::println }; 
list.forEach(consumer);

Or, we can call forEach() on the collection’s stream:

list.stream().forEach(consumer);

Both versions will iterate over the list and print all elements:

ABCD ABCD

In this simple case, it doesn’t make a difference which forEach() we use.

3. Execution Order

Collection.forEach() uses the collection’s iterator (if one is specified). That means that the processing order of the items is defined. In contrast, the processing order of Collection.stream().forEach() is undefined.

In most cases, it doesn’t make a difference which of the two we choose.

3.1. Parallel Streams

Parallel streams allow us to execute the stream in multiple threads, and in such situations, the execution order is undefined. Java only requires all threads to finish before any terminal operation, such as Collectors.toList(), is called.

Let’s look at an example where we first call forEach() directly on the collection, and second, on a parallel stream:

list.forEach(System.out::print);
System.out.print(" ");
list.parallelStream().forEach(System.out::print);

If we run the code several times, we see that list.forEach() processes the items in insertion order, while list.parallelStream().forEach() produces a different result at each run.

One possible output is:

ABCD CDBA

Another one is:

ABCD DBCA

3.2. Custom Iterators

Let’s define a list with a custom iterator to iterate over the collection in reversed order:

class ReverseList extends ArrayList<String> {

    @Override
    public Iterator<String> iterator() {

        int startIndex = this.size() - 1;
        List<String> list = this;

        Iterator<String> it = new Iterator<String>() {

            private int currentIndex = startIndex;

            @Override
            public boolean hasNext() {
                return currentIndex >= 0;
            }

            @Override
            public String next() {
                String next = list.get(currentIndex);
                currentIndex--;
                return next;
             }

             @Override
             public void remove() {
                 throw new UnsupportedOperationException();
             }
         };
         return it;
    }
}

When we iterate over the list, again with forEach() directly on the collection and then on the stream:

List<String> myList = new ReverseList();
myList.addAll(list);

myList.forEach(System.out::print); 
System.out.print(" "); 
myList.stream().forEach(System.out::print);

We get different results:

DCBA ABCD

The reason for the different results is that forEach() used directly on the list uses the custom iterator, while stream().forEach() simply takes elements one by one from the list, ignoring the iterator.

4. Modification of the Collection

Many collections (e.g., ArrayList or HashSet) shouldn’t be structurally modified while iterating over them. If an element is removed or added during an iteration, we’ll get a ConcurrentModification exception.

Furthermore, collections are designed to fail-fast, which means that the exception is thrown as soon as there’s a modification.

Similarly, we’ll get a ConcurrentModification exception when we add or remove an element during the execution of the stream pipeline. However, the exception will be thrown later.

Another subtle difference between the two forEach() methods is that Java explicitly allows modifying elements using the iterator. Streams, in contrast, should be non-interfering.

Let’s look at removing and modifying elements in more detail.

4.1. Removing an Element

Let’s define an operation that removes the last element (“D”) of our list:

Consumer<String> removeElement = s -> {
    System.out.println(s + " " + list.size());
    if (s != null && s.equals("A")) {
        list.remove("D");
    }
};

When we iterate over the list, the last element is removed after the first element (“A”) is printed:

list.forEach(removeElement);

Since forEach() is fail-fast, we stop iterating and see an exception before the next element is processed:

A 4
Exception in thread "main" java.util.ConcurrentModificationException
	at java.util.ArrayList.forEach(ArrayList.java:1252)
	at ReverseList.main(ReverseList.java:1)

Let’s see what happens if we use stream().forEach() instead:

list.stream().forEach(removeElement);

Here, we continue iterating over the whole list before we see an exception:

A 4
B 3
C 3
null 3
Exception in thread "main" java.util.ConcurrentModificationException
	at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1380)
	at java.util.stream.ReferencePipeline$Head.forEach(ReferencePipeline.java:580)
	at ReverseList.main(ReverseList.java:1)

However, Java does not guarantee that a ConcurrentModificationException is thrown at all. That means we should never write a program that depends on this exception.

4.2. Changing Elements

We can change an element while iterating over a list:

list.forEach(e -> {
    list.set(3, "E");
});

However, while there is no problem with doing this using either Collection.forEach() or stream().forEach(), Java requires an operation on a stream to be non-interfering. This means that elements shouldn’t be modified during the execution of the stream pipeline.

The reason behind this is that the stream should facilitate parallel execution. Here, modifying elements of a stream could lead to unexpected behavior.

5. Conclusion

In this article, we saw some examples that show the subtle differences between Collection.forEach() and Collection.stream().forEach().

However, it’s important to note that all the examples shown above are trivial and are merely meant to compare the two ways of iterating over a collection. We shouldn’t write code whose correctness relies on the shown behavior.

If we don’t require a stream but only want to iterate over a collection, the first choice should be using forEach() directly on the collection.

The source code for the examples in this article is available over on GitHub.

Java bottom

I just announced the new Learn Spring course, focused on the fundamentals of Spring 5 and Spring Boot 2:

>> CHECK OUT THE COURSE

Leave a Reply

avatar
  Subscribe  
Notify of