Java Top

Get started with Spring 5 and Spring Boot 2, through the Learn Spring course:

>> CHECK OUT THE COURSE

1. Overview

In this article, we'll learn how HashMap internally manages key-value pairs and how to write custom key implementations.

2. Key Management

2.1. Internal Structure

Maps are used to store values that are assigned to keys. The key is used to identify the value in the Map and to detect duplicates.

While TreeMap uses the Comparable#compareTo(Object) method to sort keys (and also to identify equality), HashMap uses a hash-based structure that can be more easily explained using a quick sketch:

A Map doesn't allow duplicate keys, so the keys are compared to each other using the Object#equals(Object) method. Because this method has poor performance, invocations should be avoided as much as possible. This is achieved through the Object#hashCode() method. This method allows sorting objects by their hash values, and then the Object#equals method only needs to be invoked when objects share the same hash value.

This kind of key management is also applied to the HashSet class, whose implementation uses a HashMap internally.

2.2. Inserting and Finding a Key-Value Pair

Let's create a HashMap example of a simple shop that manages the number of stock items (Integer) by an article id (String). There, we put in a sample value:

Map<String, Integer> items = new HashMap<>();
// insert
items.put("158-865-A", 56);
// find
Integer count = items.get("158-865-A");

The algorithm to insert the key-value pair:

  1. calls “158-865-A”.hashCode() to get the hash value
  2. looks for the list of existing keys that share the same hash value
  3. compares any key of the list with “158-865-A”.equals(key)
    1. The first equality is identified as already existing, and the new one replaces the assigned value.
    2. If no equality occurs, the key-value pair is inserted as a new entry.

To find a value, the algorithm is the same, except that no value is replaced or inserted.

3. Custom Key Classes

We can conclude that to use a custom class for a key, it is necessary that hashCode() and equals() are implemented correctly. To put it simply, we have to ensure that the hashCode() method returns:

  • the same value for the object as long as the state doesn't change (Internal Consistency)
  • the same value for objects that are equal (Equals Consistency)
  • as many different values as possible for objects that are not equal.

We can commonly say that hashCode() and equals() should consider the same fields in their calculation, and we must override both or neither of them. We can easily achieve this by using Lombok or our IDE's generator.

Another important point is: Do not change the hash code of an object while the object is being used as a key. A simple solution is to design the key class to be immutable, but this is not necessary as long as we can ensure that manipulation cannot take place at the key.

Immutability has an advantage here: The hash value can be calculated once on object instantiation, which could increase performance, especially for complex objects.

3.1. Good Example

As an example, we'll design a Coordinate class, consisting of an x and y value, and use it as a key in a HashMap:

Map<Coordinate, Color> pixels = new HashMap<>();
Coordinate coord = new Coordinate(1, 2);
pixels.put(coord, Color.CYAN);
// read the color
Color color = pixels.get(new Coordinate(1, 2));

Let's implement our Coordinate class:

public class Coordinate {
    private final int x;
    private final int y;
    private int hashCode;

    public Coordinate(int x, int y) {
        this.x = x;
        this.y = y;
        this.hashCode = Objects.hash(x, y);
    }

    public int getX() {
        return x;
    }

    public int getY() {
        return y;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o)
            return true;
        if (o == null || getClass() != o.getClass())
            return false;
        Coordinate that = (Coordinate) o;
        return x == that.x && y == that.y;
    }

    @Override
    public int hashCode() {
        return this.hashCode;
    }
}

As an alternative, we could make our class even shorter by using Lombok:

@RequiredArgsConstructor
@Getter
// no calculation in the constructor, but
// since Lombok 1.18.16, we can cache the hash code
@EqualsAndHashCode(cacheStrategy = CacheStrategy.LAZY)
public class Coordinate {
    private final int x;
    private final int y;
}

The optimal internal structure would be:

3.2. Bad Example: Static Hash Value

If we implement the Coordinate class by using a static hash value for all instances, the HashMap will work correctly, but the performance will drop significantly:

public class Coordinate {

    ...

    @Override
    public int hashCode() {
        return 1; // return same hash value for all instances
    }
}

The hash structure then looks like this:

That negates the advantage of hash values completely.

3.3. Bad Example: Modifiable Hash Value

If we make the key class mutable, we should ensure that the state of the instance never changes while it is used as a key:

Map<Coordinate, Color> pixels = new HashMap<>();
Coordinate coord = new Coordinate(1, 2); // x=1, y=2
pixels.put(coord, Color.CYAN);
coord.setX(3); // x=3, y=2

Because the Coordinate is stored under the old hash value, it cannot be found under the new one. So, the line below would lead to a null value:

Color color = pixels.get(coord);

And the following line would result in the object being stored twice within the Map:

pixels.put(coord, Color.CYAN);

4. Conclusion

In this article, we have clarified that implementing a custom key class for a HashMap is a matter of implementing equals() and hashCode() correctly. We've seen how the hash value is used internally and how this would be affected in both good and bad ways.

As always, the example code is available over on GitHub.

Java bottom

Get started with Spring 5 and Spring Boot 2, through the Learn Spring course:

>> CHECK OUT THE COURSE
Generic footer banner
guest
0 Comments
Inline Feedbacks
View all comments