Java Top

I just announced the new Spring Boot 2 material, coming in REST With Spring:

>> CHECK OUT THE COURSE

1. Overview

In this quick tutorial, we’ll have a look at how to compute the intersection between two Integer arrays ‘a’ and ‘b’.

We’ll also focus on how to handle duplicate entries.

For the implementation, we’ll use Streams.

2. Membership Predicate for an Array

The intersection of two sets is by definition a set with all values from one, which are also part of the second set.

Therefore we need a Function or rather a Predicate to decide the membership in the second array. Since List provides such a method out of the box, we’ll transform this to a List:

Predicate isContainedInB = Arrays.asList(b)::contains;

3. Building the Intersection

To build up the resulting array, we’ll consider the elements of the first set sequentially and verify if they’re also contained in the second array. Then we’ll create a new array based on this.

The Stream API provides us with the needed methods. First, we’ll create a Stream, then filter with the membership-Predicate and finally we’ll create a new array:

public static Integer[] intersectionSimple(Integer[] a, Integer[] b){
    return Stream.of(a)
      .filter(Arrays.asList(b)::contains)
      .toArray(Integer[]::new);
}

4. Duplicate Entries

Since arrays in Java are no Set implementation, we face the issue of duplicate entries in the input and then in the result. Notice that the number of occurrences in the result depends on the occurrences in the first parameter.

But for sets, elements must not occur multiple times. We can archive this by using the distinct() method:

public static Integer[] intersectionSet(Integer[] a, Integer[] b){
    return Stream.of(a)
      .filter(Arrays.asList(b)::contain)
      .distinct()
      .toArray(Integer[]::new);
}

So the length of the intersection no longer depends on the parameter order.

However, the intersection of an array with itself may not be the array again since we remove double entries.

5. Multiset Intersection

A more general notion, which allows multiple equal entries, are multisets. For them, the intersection is then defined by the minimal number of input occurrences. So our membership-Predicate must keep score how often we add an element to the result.

The remove() method can be used for this, which returns the membership and consumes the elements. So after all equal elements in ‘b’ are consumed, no more equal elements are added to the result:

public static Integer[] intersectionSet(Integer[] a, Integer[] b){
    return Stream.of(a)
      .filter(new LinkedList<>(Arrays.asList(b))::remove)
      .toArray(Integer[]::new);
}

Since the Arrays API only returns an immutable List, we have to generate a dedicate mutable one.

6. Conclusion

In this article, we’ve seen how to use the contains and remove methods to implement an intersection for two arrays in Java.

All the implementation, code snippets, and tests can be found in our Github repository – this is a Maven-based project, so it should be easy to import and run as it is.

Java bottom

I just announced the new Spring Boot 2 material, coming in REST With Spring:

>> CHECK OUT THE LESSONS

4
Leave a Reply

avatar
1 Comment threads
3 Thread replies
0 Followers
 
Most reacted comment
Hottest comment thread
3 Comment authors
Jan HauerEricMoe Recent comment authors
  Subscribe  
newest oldest most voted
Notify of
Moe
Guest
Moe

Isn’t contains for Lists an O(n) operation? So if array ‘a’ has n elements and array ‘b’ has m elements we are doing an O(m) operation n times. I believe this could be written as O(mn) operation. Wouldn’t it be better to index one of the arrays first in a hash set and then check the other one against that? I believe this is O(m+n) operation. (add and contains are O(1) for HashSets). The only time that this is worse than the O(mn) is if at least one of the n or m are equal to 1. Something like this:… Read more »

Eric Martin
Member
Eric Martin

You are correct that more optimal solutions exist for this problem. This article offers a few different ways to solve the problem with Java Streams.

If this type of operation is a key performance bottleneck in your application, then extra care needs to be taken when choosing a solution.