In this tutorial, we’ll learn about linear probing – a collision resolution technique for searching the location of an element in a hash table.
Hash tables are auxiliary data structures that map indexes to keys. However, hashing these keys may result in collisions, meaning different keys generate the same index in the hash table.
We’ll demonstrate how linear probing helps us insert values into a table despite all collisions that may occur during the process. Also, we’ll look at how linear probing works for search operations.
2. Linear Probing
Linear probing is one of many algorithms designed to find the correct position of a key in a hash table. When inserting keys, we mitigate collisions by scanning the cells in the table sequentially. Once we find the next available cell, we insert the key. Similarly, to find an element in a hash table, we linearly scan the cells until we find the key or all positions have been scanned.
Before diving into the details of the algorithm, let’s demonstrate different scenarios of hashing keys and collisions. First, let’s assume we have the following set of keys and an arbitrary hash function that yields the following:
To use the linear probing algorithm, we must traverse all cells in the hash table sequentially. Inserting or searching for keys could result in a collision with a previously inserted key. Yet, with linear probing, we overcome this by searching linearly for the next available cell.
Let’s walk through the algorithm using as the input set and a hash table with the following initial state:
As we saw previously, inserting 2 into the hash table results in a collision at index 3. Now, we must visit every cell on the table from left to right, looking for an empty space. Let’s look at this process closely:
When trying to insert 2 into the table, we encounter three collisions during the process. The first collision occurs at index 3, which is the hash value generated by the function for this particular key. However, the cell at index 3 is not empty, and we continue to the next cell. Cells #4 and #5 are also occupied by other keys, and we reached the end of the hash table.
Ultimately, because we have not checked the cells preceding index 3, we must loop all the way around to the beginning of the hash table. Consequently, the key 2 is inserted at index 0, given that it was the next available space.
There is one more case we still need to consider. That is, what if the hash table has no capacity? Then, the linear probing algorithm must have a way to terminate the loop scanning the cells. We’ll demonstrate this in the pseudocode section.
Similarly, instead of inserting, searching for 2 leads to the same iterations we covered when inserting the key.
Let’s look at the pseudocode for linear probing. For simplicity’s sake, we’ll use two different functions to determine whether a key can be inserted or found in the hash table. Let’s start with the insert operation.
The insert implementation entails a function that returns a boolean value, indicating whether we can find a cell to insert our key:
First, we must track the index to insert our key. Also, because we might have to wrap around the hash table, a first_scan variable will tell whether we have scanned the same indexes twice and terminate the loop if so.
We use an auxiliary isEmpty function to determine whether the cell at the current index is empty. If the cell is empty, we store the key in the current cell and return true.
Otherwise, we adjust the index to scan the next cell in the hash table. For this, we use the mod operator to go back to the beginning of the cell when necessary. Additionally, if index equals the hash value for the second time, return false, since there’s no space to insert the key.
To search an element in a hash table using linear probing, we use a similar approach to the insert operation. However, this time we return the index at which we find the key, or -1 when it’s not in the hash table. Also, the loop will continue if the element at the current index is not equal to the key.
Let’s look at the pseudocode for search:
5. Time Complexity
A well-designed hash function and a hash table of size n increase the probability of inserting and searching a key in constant time. However, no combination between the two can guarantee a operation.
Therefore, a collision causes linear probing to search linearly for the next available cell. Given that our table size is n, we can infer that the maximum number of iterations is also n. Therefore, we can conclude that the time complexity for linear probing is
In this article, we learned about the linear probing technique. We use this algorithm to resolve collisions when multiple keys have the same hash value. Further, we covered all algorithm steps and analyzed their runtime complexity.
Despite other alternatives to minimize collisions in a hash table, linear probing provides a solution and a simple implementation to code.