1. Overview

In this tutorial, we’ll compare different ways of filtering Java Streams. First, we’ll see which solution leads to more readable code. Then we’ll compare the solutions from a performance point of view.

2. Readability

Let’s start by comparing the two solutions, multiple chained filters versus a single filter with a complex condition, from a readability perspective. For the code examples in this section, we’ll use the Student class:

public class Student {

    private String name;
    private int year;
    private List<Integer> marks;
    private Profile profile;

    // constructor, getters, and setters

}
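
The filters later in this article call getMarksAverage() and refer to Student.Profile, neither of which appears in the snippet above. A minimal sketch of how they might look, assuming the average is a plain arithmetic mean that falls back to 0 for an empty list; the class name StudentSketch and the MATH profile value are illustrative assumptions, not taken from the original class:

```java
import java.util.List;

// Hypothetical stand-in for the article's Student class; only the parts
// needed by the filtering examples are shown.
class StudentSketch {

    enum Profile { PHYSICS, MATH } // MATH is an assumed second value

    private final String name;
    private final int year;
    private final List<Integer> marks;
    private final Profile profile;

    StudentSketch(String name, int year, List<Integer> marks, Profile profile) {
        this.name = name;
        this.year = year;
        this.marks = marks;
        this.profile = profile;
    }

    List<Integer> getMarks() {
        return marks;
    }

    Profile getProfile() {
        return profile;
    }

    // Assumption: arithmetic mean of the marks, 0.0 when there are none.
    double getMarksAverage() {
        return marks.stream()
          .mapToInt(Integer::intValue)
          .average()
          .orElse(0.0);
    }
}
```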

Our goal is to filter a Stream of Students based on the following three rules:

  • the profile must not be Profile.PHYSICS
  • the count of the marks must be greater than 3
  • the average mark must be greater than 50

2.1. Multiple Filters

The Stream API allows chaining multiple filters. We can leverage this to satisfy the complex filtering criteria described. We can also use Predicate.not(), available since Java 11, if we want to negate conditions.

This approach will lead to clean and easy-to-understand code:

@Test
public void whenUsingMultipleFilters_dataShouldBeFiltered() {
    List<Student> filteredStream = students.stream()
      .filter(s -> s.getMarksAverage() > 50)
      .filter(s -> s.getMarks().size() > 3)
      .filter(not(s -> s.getProfile() == Student.Profile.PHYSICS))
      .collect(Collectors.toList());

    assertThat(filteredStream).containsExactly(mathStudent);
}
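
The test above assumes that Predicate.not() has been statically imported. A small self-contained sketch of how it behaves, using plain strings instead of the Student class (the class name and sample data are made up for illustration):

```java
import static java.util.function.Predicate.not;

import java.util.List;
import java.util.stream.Collectors;

class PredicateNotDemo {

    // Keeps only the non-blank strings by negating a method reference.
    static List<String> nonBlank(List<String> values) {
        return values.stream()
          .filter(not(String::isBlank))
          .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // prints [physics, math]
        System.out.println(nonBlank(List.of("physics", " ", "math", "")));
    }
}
```

On Java versions before 11, a lambda that negates the condition directly gives the same result.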

2.2. Single Filter With Complex Condition

The alternative would be to use a single filter with a more complex condition.

Unfortunately, the resulting code will be a bit harder to read:

@Test
public void whenUsingSingleComplexFilter_dataShouldBeFiltered() {
    List<Student> filteredStream = students.stream()
      .filter(s -> s.getMarksAverage() > 50 
        && s.getMarks().size() > 3 
        && s.getProfile() != Student.Profile.PHYSICS)
      .collect(Collectors.toList());

    assertThat(filteredStream).containsExactly(mathStudent);
}

However, we can improve it by extracting the conditions into a separate method:

public boolean isEligibleForScholarship() {
    return getMarksAverage() > 50
      && marks.size() > 3
      && profile != Profile.PHYSICS;
}

As a result, we’ll hide the complex condition and give more meaning to the filtering criteria:

@Test
public void whenUsingSingleComplexFilterExtracted_dataShouldBeFiltered() {
    List<Student> filteredStream = students.stream()
      .filter(Student::isEligibleForScholarship)
      .collect(Collectors.toList());

    assertThat(filteredStream).containsExactly(mathStudent);
}

This would be a good solution, especially when we can encapsulate the filter logic inside our model. 
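
When the logic can't live in the model, for instance because the class comes from a library, one possible alternative is to keep the rules as named Predicate constants and combine them with Predicate.and(). The sketch below is only an illustration on plain integers; the class and constant names are hypothetical:

```java
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

class NamedPredicatesDemo {

    // Illustrative rules on plain integers; the names are hypothetical.
    static final Predicate<Integer> IS_EVEN = n -> n % 2 == 0;
    static final Predicate<Integer> IS_POSITIVE = n -> n > 0;

    // and() short-circuits like &&, so the stream still has a single filter.
    static final Predicate<Integer> IS_EVEN_AND_POSITIVE = IS_EVEN.and(IS_POSITIVE);

    static List<Integer> filter(List<Integer> numbers) {
        return numbers.stream()
          .filter(IS_EVEN_AND_POSITIVE)
          .collect(Collectors.toList());
    }
}
```

Each rule keeps a descriptive name, while the pipeline itself stays as short as the method-reference version.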

3. Performance

We’ve seen that using multiple filters can improve the readability of our code. On the other hand, each additional filter adds another stage to the stream pipeline, so more objects are created, which can lead to a loss in performance. To demonstrate this, we’ll filter Streams of different sizes and perform multiple checks on their elements.

After this, we’ll calculate the total processing time in milliseconds, and compare the two solutions. Additionally, our tests will include parallel streams and a plain old for loop:

[Figure: stream filter size comparison]

As we can see, using a complex condition will result in a performance gain. 

However, for small sample sizes, the difference might not be noticeable.
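
The benchmark code itself isn't shown here. As a rough illustration of the two variants being compared, the sketch below times them with System.nanoTime(). This is an assumption about how such a comparison could be set up, not the measurement behind the figure; the class name, the conditions, and the sample size are made up, and a naive timing like this has no warm-up, so a harness such as JMH would give more reliable numbers:

```java
import java.util.stream.IntStream;

class FilterTimingSketch {

    // Three chained filters, one condition each.
    static long multipleFilters(int size) {
        return IntStream.range(0, size).boxed()
          .filter(n -> n % 2 == 0)
          .filter(n -> n % 3 == 0)
          .filter(n -> n % 5 == 0)
          .count();
    }

    // One filter with the same three conditions combined.
    static long singleFilter(int size) {
        return IntStream.range(0, size).boxed()
          .filter(n -> n % 2 == 0 && n % 3 == 0 && n % 5 == 0)
          .count();
    }

    // Naive wall-clock timing in milliseconds; not a rigorous benchmark.
    static long timeMillis(Runnable task) {
        long start = System.nanoTime();
        task.run();
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        int size = 1_000_000; // arbitrary sample size
        System.out.println("multiple filters: " + timeMillis(() -> multipleFilters(size)) + " ms");
        System.out.println("single filter:    " + timeMillis(() -> singleFilter(size)) + " ms");
    }
}
```

Both methods return the same count, so any difference in the printed times comes from how the conditions are arranged rather than from the result.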

4. The Order of the Conditions

Regardless of whether we’re using a single filter or multiple filters, the filtering can cause a performance drop if the checks aren’t executed in the optimal order.

4.1. Conditions Which Are Filtering out Many Elements

Let’s assume we have a stream of 100 integer numbers, and we want to find the even numbers smaller than 20.

If we first check the parity of the number, we’ll end up with a total of 150 checks. This is because the first condition will be evaluated for all 100 numbers, while the second condition will be evaluated only for the 50 even numbers.

@Test
public void givenWrongFilterOrder_whenUsingMultipleFilters_shouldEvaluateManyConditions() {
    long filteredStreamSize = IntStream.range(0, 100).boxed()
      .filter(this::isEvenNumber)
      .filter(this::isSmallerThanTwenty)
      .count();

    assertThat(filteredStreamSize).isEqualTo(10);
    assertThat(numberOfOperations).hasValue(150);
}

On the other hand, if we reverse the order of the filters, we’ll only need a total of 120 checks to properly filter the stream: 100 for the first condition and 20 for the second. Consequently, the conditions that filter out the majority of the elements should be evaluated first.
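
The test relies on helpers that aren’t shown. A minimal sketch of how they might be written, assuming numberOfOperations is an AtomicInteger that each condition increments when it runs; the class and the two static methods are hypothetical wrappers added so that both orders can be compared side by side:

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.IntStream;

class FilterOrderDemo {

    // Counts how many times any condition is evaluated.
    private final AtomicInteger numberOfOperations = new AtomicInteger();

    private boolean isEvenNumber(Integer i) {
        numberOfOperations.incrementAndGet();
        return i % 2 == 0;
    }

    private boolean isSmallerThanTwenty(Integer i) {
        numberOfOperations.incrementAndGet();
        return i < 20;
    }

    // Parity first: 100 parity checks + 50 range checks on the even numbers.
    static int checksWithParityFirst() {
        FilterOrderDemo demo = new FilterOrderDemo();
        IntStream.range(0, 100).boxed()
          .filter(demo::isEvenNumber)
          .filter(demo::isSmallerThanTwenty)
          .count();
        return demo.numberOfOperations.get();
    }

    // Range first: 100 range checks + 20 parity checks on the numbers below 20.
    static int checksWithRangeFirst() {
        FilterOrderDemo demo = new FilterOrderDemo();
        IntStream.range(0, 100).boxed()
          .filter(demo::isSmallerThanTwenty)
          .filter(demo::isEvenNumber)
          .count();
        return demo.numberOfOperations.get();
    }

    public static void main(String[] args) {
        System.out.println(checksWithParityFirst()); // prints 150
        System.out.println(checksWithRangeFirst());  // prints 120
    }
}
```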

4.2. Slow or Heavy Conditions

Some conditions can be slow, for example when a filter requires executing heavy logic or making an external call over the network. For better performance, we’ll try to evaluate these conditions as few times as possible. In other words, we’ll evaluate them only if all the other conditions are met.
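
To make this concrete, the sketch below counts how often a slow condition runs depending on where it sits in the chain. The slow check is only simulated with a counter; the class name, method names, and the conditions themselves are hypothetical:

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.IntStream;

class SlowConditionDemo {

    // Counts how many times the slow condition is evaluated.
    private final AtomicInteger slowCalls = new AtomicInteger();

    // Stand-in for heavy logic or a remote call (hypothetical).
    private boolean passesSlowCheck(Integer i) {
        slowCalls.incrementAndGet();
        return i % 2 == 0;
    }

    // Cheap, local condition.
    private boolean isSmallerThanTwenty(Integer i) {
        return i < 20;
    }

    // Cheap check first: the slow check only sees the 20 numbers below 20.
    static int slowCallsWhenCheapFirst() {
        SlowConditionDemo demo = new SlowConditionDemo();
        IntStream.range(0, 100).boxed()
          .filter(demo::isSmallerThanTwenty)
          .filter(demo::passesSlowCheck)
          .count();
        return demo.slowCalls.get();
    }

    // Slow check first: it runs for all 100 numbers.
    static int slowCallsWhenSlowFirst() {
        SlowConditionDemo demo = new SlowConditionDemo();
        IntStream.range(0, 100).boxed()
          .filter(demo::passesSlowCheck)
          .filter(demo::isSmallerThanTwenty)
          .count();
        return demo.slowCalls.get();
    }

    public static void main(String[] args) {
        System.out.println(slowCallsWhenCheapFirst()); // prints 20
        System.out.println(slowCallsWhenSlowFirst());  // prints 100
    }
}
```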

5. Conclusion

In this article, we analyzed different ways of filtering Java Streams. First, we compared the two approaches from a readability point of view. We discovered that multiple filters provide a more comprehensible filtering condition.

Next, we compared the solutions from a performance perspective. We learned that using a complex condition, and therefore creating fewer objects, will lead to better overall performance.

As always, the source code is available over on GitHub.
