Java Top

I just announced the new Learn Spring course, focused on the fundamentals of Spring 5 and Spring Boot 2:

>> CHECK OUT THE COURSE

1. Overview

General-purpose sorting algorithms like Merge Sort make no assumption about the input, so they can't beat the O(n log n) in the worst case. Counting Sort, on the contrary, has an assumption about the input which makes it a linear time sorting algorithm.

In this tutorial, we're going to get acquainted with the mechanics of the Counting Sort and then implement it in Java.

2. Counting Sort

Counting sort, as opposed to most classic sorting algorithms, does not sort the given input by comparing the elements. Instead, it assumes that the input elements are n integers in the range [0, k]. When k = O(n), then the counting sort will run in O(n) time.

Please note, then, that we can't use the counting sort as a general-purpose sorting algorithm. However, when the input is aligned with this assumption, it's pretty fast!

2.1. Frequency Array

Let's suppose we're going to sort an input array with values in the [0, 5] range:

First, we should count the occurrence of each number in the input array. If we represent the countings with array C, then C[i] represents the frequency of number in the input array:

For example, since 5 appears 3 times in the input array, the value for the index 5 is equal to 3.

Now given the array C, we should determine how many elements are less than or equal to each input element. For example:

  • One element is less than or equal to zero, or in other words, there is only one zero value, which is equal to C[0]
  • Two elements are less than or equal to one, which is equal to C[0] + C[1]
  • Four values are less than or equal to two, which is equal to C[0] + C[1] + C[2]

So if we keep computing the summation of consecutive elements in C, we can know how many elements are less than or equal to number n-1 in the input array. Anyway, by applying this simple formula we can update the as the following:

2.2. The Algorithm

Now we can use the auxiliary array to sort the input array. Here's how the counting sort works:

  • It iterates the input array in reverse
  • For each element i, C[i] – 1 represents the location of number i in the sorted array. This is because of the fact that there are C[i] elements less than or equal to i
  • Then, it decrements the C[i] at the end of each round

In order to sort the sample input array, we should first start with the number 5, since it's the last element. According to C[5], there are 11 elements are less than or equal to the number 5.

So, 5 should be the 11th element in the sorted array, hence the index 10:

Since we moved 5 to the sorted array, we should decrement the C[5]. Next element in the reversed order is 2. Since there are 4 elements less than or equal to 2, this number should be the 4th element in the sorted array:

Similarly, we can find the right spot for the next element which is 0:

If we keep iterating in reverse and move each element appropriately, we would end up with something like:

3. Counting Sort – Java Implementation

3.1. Computing the Frequency Array

First off, given an input array of elements and the k, we should compute the array C:

int[] countElements(int[] input, int k) {
    int[] c = new int[k + 1];
    Arrays.fill(c, 0);

    for (int i : input) {
        c[i] += 1;
    }

    for (int i = 1; i < c.length; i++) {
	c[i] += c[i - 1];
    }

    return c;
}

Let's breakdown the method signature:

  • input represents an array of numbers we're going to sort
  • The input array is an array of integers in the range of [0, k] – so represents the maximum number in the input
  • The return type is an array of integers representing the array

And here's how the countElements method works:

  • First, we initialized the array. As the [0, k] range contains k+1 numbers, we're creating an array capable of containing k+1 numbers
  • Then for each number in the input, we're computing the frequency of that number
  • And finally, we're adding consecutive elements together to know how many elements are less than or equal to a particular number

Also, we can verify that the countElements method works as expected:

@Test
void countElements_GivenAnArray_ShouldCalculateTheFrequencyArrayAsExpected() {
    int k = 5;
    int[] input = { 4, 3, 2, 5, 4, 3, 5, 1, 0, 2, 5 };
    
    int[] c = CountingSort.countElements(input, k);
    int[] expected = { 1, 2, 4, 6, 8, 11 };
    assertArrayEquals(expected, c);
}

3.2. Sorting the Input Array

Now that we can calculate the frequency array, we should be able to sort any given set of numbers:

int[] sort(int[] input, int k) {
    int[] c = countElements(input, k);

    int[] sorted = new int[input.length];
    for (int i = input.length - 1; i >= 0; i--) {
        int current = input[i];
	sorted[c[current] - 1] = current;
	c[current] -= 1;
    }

    return sorted;
}

Here's how the sort method works:

  • First, it computes the array
  • Then, it iterates the input array in reverse and for each element in the input, finds its correct spot in the sorted array. The ith element in the input should be the C[i]th element in the sorted array. Since Java arrays are zero-indexed, the C[i]-1 entry is the C[i]th element – for example, sorted[5] is the sixth element in the sorted array
  • Each time we find a match, it decrements the corresponding C[i] value

Similarly, we can verify that the sort method works as expected:

@Test
void sort_GivenAnArray_ShouldSortTheInputAsExpected() {
    int k = 5;
    int[] input = { 4, 3, 2, 5, 4, 3, 5, 1, 0, 2, 5 };

    int[] sorted = CountingSort.sort(input, k);

    // Our sorting algorithm and Java's should return the same result
    Arrays.sort(input);
    assertArrayEquals(input, sorted);
}

4. Revisiting the Counting Sort Algorithm

4.1. Complexity Analysis

Most classic sorting algorithms, like merge sort, sort any given input by just comparing the input elements to each other. These type of sorting algorithms are known as comparison sorts. In the worst case, comparison sorts should take at least O(n log n) to sort elements.

Counting Sort, on the other hand, does not sort the input by comparing the input elements, so it's clearly not a comparison sort algorithm.

Let's see how much time it consumes to sort the input:

  • It computes the array in O(n+k) time: It once iterates an input array with size in O(n) and then iterates the in O(k) – so that would be O(n+k) in total
  • After computing the C, it sorts the input by iterating the input array and performing a few primitive operations in each iteration. So, the actual sort operation takes O(n)

In total, counting sort takes O(n+k) time to run:

O(n + k) + O(n) = O(2n + k) = O(n + k)

If we assume k=O(n), then counting sort algorithm sorts the input in linear time. As opposed to general-purpose sorting algorithms, counting sorts makes an assumption about the input and takes less than the O(n log n) lower bound to execute.

 4.2. Stability

A few moments ago, we laid a few peculiar rules about the mechanics of counting sort but never cleared the reason behind them. To be more specific:

  • Why should we iterate the input array in reverse?
  • Why we decrement the C[i] each time we're using it?

Let's iterate from the beginning to better understand the first rule. Suppose we're going to sort a simple array of integers like the following:

In the first iteration, we should find the sorted location for the first 1:

So the first occurrence of number 1 gets the last index in the sorted array. Skipping the number 0, let's see what happens to the second occurrence of number 1:

The appearance order of elements with the same value is different in the input and sorted array, so the algorithm is not stable when we're iterating from the beginning.

What happens if we don't decrement the C[i] value after each use? Let's see:

Both occurrences of number 1 are getting the last place in the sorted array. So if we don't decrement the C[i] value after each use, we could potentially lose a few numbers while sorting them!

5. Conclusion

In this tutorial, first, we learned how the Counting Sort works internally. Then we implemented this sorting algorithm in Java and wrote a few tests to verify its behavior. And finally, we proved that the algorithm is a stable sorting algorithm with linear time complexity.

As usual, the sample codes are available on our GitHub project, so make sure to check it out!

Java bottom

I just announced the new Learn Spring course, focused on the fundamentals of Spring 5 and Spring Boot 2:

>> CHECK OUT THE COURSE
newest oldest most voted
Notify of
Fred
Guest
Fred

Very interesting algorithm. I never saw that one before. Thanks for letting us know about it! I am going to try it with number that are not in the 0..n range and sparse, so I’ll have to use some sort of mapping between the values and the array indexes, kind of a sparse array.

Comments are closed on this article!