1. Overview

In this article, we'll compare different ways of filtering Java Streams. Initially, we'll see which solution leads to more readable code. After that, we'll compare the solutions from a performance point of view.

2. Readability

First, we'll compare the two solutions, chaining multiple filters versus using a single filter with a complex condition, from a readability perspective. For the code examples in this section, we'll use the Student class:

public class Student {

    private String name;
    private int year;
    private List<Integer> marks;
    private Profile profile;

    // constructor, getters and setters

}

Our goal is to filter a Stream of Students based on the following three rules:

  • the profile must not be Profile.PHYSICS
  • the count of the marks should be greater than 3
  • the average mark should be greater than 50
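The tests in this section call getMarksAverage() and refer to Student.Profile, neither of which appears in the class listing above. Here is a minimal sketch of how they might look, assuming Profile is a nested enum and the average is computed from the marks list. The MATH constant, the constructor shape, and the 0.0 result for an empty list are assumptions made for illustration:

```java
import java.util.List;

// Hypothetical sketch of the Student model used by the examples in this article.
class Student {

    // Assumed nested enum; only PHYSICS is named in the article, MATH is illustrative.
    enum Profile { PHYSICS, MATH }

    private final String name;
    private final int year;
    private final List<Integer> marks;
    private final Profile profile;

    Student(String name, int year, List<Integer> marks, Profile profile) {
        this.name = name;
        this.year = year;
        this.marks = marks;
        this.profile = profile;
    }

    String getName() { return name; }

    int getYear() { return year; }

    List<Integer> getMarks() { return marks; }

    Profile getProfile() { return profile; }

    // Assumed helper: arithmetic mean of the marks, 0.0 when there are none.
    double getMarksAverage() {
        return marks.stream()
          .mapToInt(Integer::intValue)
          .average()
          .orElse(0.0);
    }
}
```

Returning 0.0 for a student without marks keeps the filtering lambdas free of null or Optional handling, at the cost of not distinguishing "no marks" from "average of zero".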

2.1. Multiple Filters

The Stream API allows chaining multiple filters. We can leverage this to satisfy the complex filtering criteria described above. In addition, we can use Predicate.not (available since Java 11) to negate a condition.

This approach leads to clean, easy-to-understand code:

@Test
public void whenUsingMultipleFilters_dataShouldBeFiltered() {
    List<Student> filteredStream = students.stream()
      .filter(s -> s.getMarksAverage() > 50)
      .filter(s -> s.getMarks().size() > 3)
      .filter(not(s -> s.getProfile() == Student.Profile.PHYSICS))
      .collect(Collectors.toList());

    assertThat(filteredStream).containsExactly(mathStudent);
}
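To look at negation in isolation, the sketch below filters a list of strings in two equivalent ways: with the static Predicate.not (Java 11+), which works directly on a method reference, and with the older negate() default method, which needs a typed Predicate variable first. The NotPredicateDemo class and its method names are illustrative only:

```java
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Illustrative helper class; not part of the article's code base.
class NotPredicateDemo {

    // Predicate.not wraps the method reference directly.
    static List<String> nonBlank(List<String> values) {
        return values.stream()
          .filter(Predicate.not(String::isBlank))
          .collect(Collectors.toList());
    }

    // Before Java 11, the method reference had to be assigned to a
    // Predicate variable before negate() could be called on it.
    static List<String> nonBlankWithNegate(List<String> values) {
        Predicate<String> isBlank = String::isBlank;
        return values.stream()
          .filter(isBlank.negate())
          .collect(Collectors.toList());
    }
}
```

Both methods keep only the strings that contain at least one non-whitespace character.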

2.2. Single Filter With Complex Condition

The alternative would be to use a single filter with a more complex condition.

Unfortunately, the resulting code will be a bit harder to read:

@Test
public void whenUsingSingleComplexFilter_dataShouldBeFiltered() {
    List<Student> filteredStream = students.stream()
      .filter(s -> s.getMarksAverage() > 50 
        && s.getMarks().size() > 3 
        && s.getProfile() != Student.Profile.PHYSICS)
      .collect(Collectors.toList());

    assertThat(filteredStream).containsExactly(mathStudent);
}

However, we can improve this by extracting the conditions into a separate method:

public boolean isEligibleForScholarship() {
    return getMarksAverage() > 50
      && marks.size() > 3
      && profile != Profile.PHYSICS;
}

As a result, we hide the complex condition and give the filtering criteria a meaningful name:

@Test
public void whenUsingSingleComplexFilterExtracted_dataShouldBeFiltered() {
    List<Student> filteredStream = students.stream()
      .filter(Student::isEligibleForScholarship)
      .collect(Collectors.toList());

    assertThat(filteredStream).containsExactly(mathStudent);
}

This is a good solution, especially when we can encapsulate the filter logic inside our model.

3. Performance

We have seen that using multiple filters can improve the readability of our code. On the other hand, each filter call adds another stage to the stream pipeline, so more objects are created, and this can lead to a loss in performance. To demonstrate this, we'll filter Streams of different sizes and perform multiple checks on their elements.

After this, we'll calculate the total processing time in milliseconds and compare the two solutions. Additionally, we'll include parallel streams and the plain old for loop in our tests.

[Benchmark results: total processing time in milliseconds for each approach]

The results show that using a single complex condition brings a performance gain. For small sample sizes, however, the difference might not be noticeable.
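A comparison along these lines could be set up roughly as follows. This is only a sketch, not the benchmark used for this article: the FilterTimingSketch class, the three numeric checks, and the input size are assumptions, and timing a single run with System.nanoTime() ignores JIT warm-up, so a harness such as JMH would be needed for trustworthy numbers:

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Illustrative timing sketch; the checks and sizes are arbitrary assumptions.
class FilterTimingSketch {

    // Three chained filters, one check per pipeline stage.
    static long multipleFilters(List<Integer> numbers) {
        return numbers.stream()
          .filter(n -> n % 2 == 0)
          .filter(n -> n % 3 == 0)
          .filter(n -> n > 1_000)
          .count();
    }

    // The same three checks combined in a single filter.
    static long singleFilter(List<Integer> numbers) {
        return numbers.stream()
          .filter(n -> n % 2 == 0 && n % 3 == 0 && n > 1_000)
          .count();
    }

    // The single-filter variant on a parallel stream.
    static long parallelSingleFilter(List<Integer> numbers) {
        return numbers.parallelStream()
          .filter(n -> n % 2 == 0 && n % 3 == 0 && n > 1_000)
          .count();
    }

    // The plain for loop, with no stream objects at all.
    static long forLoop(List<Integer> numbers) {
        long count = 0;
        for (int n : numbers) {
            if (n % 2 == 0 && n % 3 == 0 && n > 1_000) {
                count++;
            }
        }
        return count;
    }

    // Wall-clock time of one run, in milliseconds.
    static long elapsedMillis(Runnable task) {
        long start = System.nanoTime();
        task.run();
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        List<Integer> numbers = IntStream.range(0, 1_000_000)
          .boxed()
          .collect(Collectors.toList());

        System.out.println("multiple filters: " + elapsedMillis(() -> multipleFilters(numbers)) + " ms");
        System.out.println("single filter:    " + elapsedMillis(() -> singleFilter(numbers)) + " ms");
        System.out.println("parallel stream:  " + elapsedMillis(() -> parallelSingleFilter(numbers)) + " ms");
        System.out.println("for loop:         " + elapsedMillis(() -> forLoop(numbers)) + " ms");
    }
}
```

All four variants return the same count, so only the elapsed time differs between them. The printed timings will vary from machine to machine and from run to run.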

4. The Order of the Conditions

Regardless of whether we use a single filter or multiple filters, filtering can cause a performance drop if the checks are not executed in the optimal order.

4.1. Conditions That Filter Out Many Elements

Let's assume we have a stream of 100 integer numbers and we want to find the even numbers smaller than 20.

If we first check the parity of the number, we'll end up with a total of 150 checks. This is because the first condition is evaluated for all 100 numbers, while the second condition is evaluated only for the 50 even numbers:

@Test
public void givenWrongFilterOrder_whenUsingMultipleFilters_shouldEvaluateManyConditions() {
    long filteredStreamSize = IntStream.range(0, 100).boxed()
      .filter(this::isEvenNumber)
      .filter(this::isSmallerThanTwenty)
      .count();

    assertThat(filteredStreamSize).isEqualTo(10);
    assertThat(numberOfOperations).hasValue(150);
}

On the other hand, if we reverse the order of the filters, we'll only need a total of 120 checks to properly filter the stream: 100 for the first condition and 20 for the numbers that pass it. Consequently, the conditions that filter out the majority of the elements should be evaluated first.
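Both counts, 150 and 120, can be reproduced with a small counter. The sketch below assumes that isEvenNumber and isSmallerThanTwenty each increment a shared AtomicInteger before returning their result. The bodies of these helpers are not shown in this article, so this is one plausible shape for them, and the FilterOrderDemo class is illustrative:

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.IntStream;

// Illustrative class; the helper bodies are assumptions.
class FilterOrderDemo {

    private final AtomicInteger numberOfOperations = new AtomicInteger();

    // Counts every evaluation, then checks the parity.
    boolean isEvenNumber(int i) {
        numberOfOperations.incrementAndGet();
        return i % 2 == 0;
    }

    // Counts every evaluation, then checks the upper bound.
    boolean isSmallerThanTwenty(int i) {
        numberOfOperations.incrementAndGet();
        return i < 20;
    }

    // Parity first: 100 parity checks + 50 range checks.
    int checksWithParityFirst() {
        numberOfOperations.set(0);
        IntStream.range(0, 100).boxed()
          .filter(this::isEvenNumber)
          .filter(this::isSmallerThanTwenty)
          .count();
        return numberOfOperations.get();
    }

    // Range first: 100 range checks + 20 parity checks.
    int checksWithRangeFirst() {
        numberOfOperations.set(0);
        IntStream.range(0, 100).boxed()
          .filter(this::isSmallerThanTwenty)
          .filter(this::isEvenNumber)
          .count();
        return numberOfOperations.get();
    }
}
```

The same reasoning applies to a single filter with &&, because the right-hand operand is only evaluated when the left-hand one is true.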

4.2. Slow or Heavy Conditions

Some conditions can be slow, for instance, when a filter requires executing heavy logic or making an external call over the network. For better performance, we should evaluate these conditions as few times as possible. Therefore, we'll evaluate them only after all other conditions have been met.
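As a rough illustration, the sketch below places a cheap length check before a check that stands in for an expensive call. The isApprovedRemotely method is hypothetical: it only counts its invocations and applies an arbitrary rule, whereas a real implementation might query a remote service:

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;

// Illustrative class; the "slow" check is a hypothetical stand-in.
class SlowConditionDemo {

    final AtomicInteger slowCalls = new AtomicInteger();

    // Pretend this is expensive, e.g. a network call. Here it only
    // records that it was invoked and applies an arbitrary rule.
    boolean isApprovedRemotely(String name) {
        slowCalls.incrementAndGet();
        return !name.endsWith("x");
    }

    // The cheap check runs for every element; the slow one runs
    // only for the elements that pass the cheap check.
    List<String> cheapCheckFirst(List<String> names) {
        return names.stream()
          .filter(n -> n.length() > 3)
          .filter(this::isApprovedRemotely)
          .collect(Collectors.toList());
    }
}
```

With six input names of which three are longer than three characters, the slow check runs three times instead of six.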

5. Conclusion

In this article, we have analyzed different ways of filtering Java Streams. Firstly, we have compared the two approaches from a readability point of view. We discovered that multiple filters provide a more comprehensible filtering condition.

After that, we have compared the solutions from a performance perspective. We learned that using a complex condition and, therefore, creating fewer objects will lead to better overall performance.

As always, the source code is available over on GitHub.
