1. Overview
Heapsort is an elegant and robust algorithm for sorting. It’s optimal since it takes O(n log n) time to sort n items, which is the best we can do with comparison-based sorting algorithms.
In this tutorial, we’ll intuitively explain Heapsort’s essential steps and then present the complete algorithm.
2. Heaps
A heap is a special kind of binary tree with the following properties:
- It’s a complete binary tree
- The value at each node is greater than or equal to the values of its child nodes
Here’s an example:
Given any complete binary tree, we can convert it into a heap using the procedure max-heapify in O(n) time.
A complete binary tree of n nodes can be represented compactly by an array of size n. The children of the node at the i-th position are at positions 2i+1 and 2i+2. For example, the children of node 1 are nodes 3 and 4. This is how we store the above heap in an array:
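To make the indexing concrete, here’s a small Python sketch. The values 42, 30, 2, 5, and 11 match the walkthrough in the next section, while the other numbers (50, 26, 20, 22, 25) are hypothetical placeholders chosen only so the array forms a valid heap:

```python
def children(i):
    """Indices of the left and right children of the node at position i."""
    return 2 * i + 1, 2 * i + 2

def is_max_heap(a):
    """Check that every node's value is >= the values of its existing children."""
    n = len(a)
    for i in range(n):
        for child in children(i):
            if child < n and a[i] < a[child]:
                return False
    return True

# A 10-element heap stored as an array (some values hypothetical, see above).
heap = [50, 42, 30, 26, 20, 22, 25, 2, 5, 11]

print(children(1))        # (3, 4) -- node 1's children are nodes 3 and 4
print(is_max_heap(heap))  # True
```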
3. Sorting with Heaps
Let’s see how to sort this array with Heapsort. We’ll first show how to move the maximum element to its correct position.
3.1. Placing the Maximum at the Right Position
A heap’s maximum element is the one at position 0. So, if we want to sort our array in non-decreasing order, we need to put its maximal element at the end. Therefore, we swap a[0] with a[n-1].
Our job isn’t done since the remaining elements are unsorted. If we found the maximum of the subarray a[0..n-2] and placed it at the second-to-last position, we would have the last two elements in their correct positions. So, we can repeat these steps until we get a fully sorted array.
For instance, here’s how the first step unfolds with our previous example:
However, there’s a problem! When we put 11 at position 0, we violated the heap property since 11 is not greater than its children 42 and 30. Therefore, we need to correct the heap so that the new node 0 does indeed contain the next maximum element (which is 42).
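As a minimal sketch using the hypothetical array from above, the first step looks like this:

```python
heap = [50, 42, 30, 26, 20, 22, 25, 2, 5, 11]  # hypothetical heap from before

# Swap the maximum (position 0) with the last element (position n - 1).
heap[0], heap[-1] = heap[-1], heap[0]

print(heap)  # [11, 42, 30, 26, 20, 22, 25, 2, 5, 50] -- the old maximum is in its final place
# The heap property is now violated at the root: 11 < 42 and 11 < 30.
```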
3.2. Correcting the Heap With Sift-Down
To correct the error caused by replacing the root node, we use the procedure Sift-Down. We start at the root and work our way down to a leaf node. Our objective is to look at a node and its two children and swap the node with whichever child is greater (if any):
Sift-Down starts at the root of a subtree, indicated by start, and ends at stop. During each pass, it checks whether either of the root’s two children is greater than the root. If so, it swaps the root with the greater child and sets start to the former location of the greater child.
If neither of the two children has a value greater than its parent’s, the heap property holds at that node, and the procedure stops.
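Here’s one possible Python sketch of Sift-Down following this description; it assumes start is the index of the subtree’s root and stop is the index just past the last heap element:

```python
def sift_down(a, start, stop):
    """Restore the max-heap property in a[start:stop], assuming only the
    subtree's root at index start may violate it."""
    root = start
    while True:
        left, right = 2 * root + 1, 2 * root + 2
        largest = root
        # Find the greater of the two children, if they exist.
        if left < stop and a[left] > a[largest]:
            largest = left
        if right < stop and a[right] > a[largest]:
            largest = right
        if largest == root:
            return  # neither child is greater: the heap property holds again
        # Swap with the greater child and continue from its former location.
        a[root], a[largest] = a[largest], a[root]
        root = largest
```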
3.3. An Example of Sift-Down
In the following diagrams, blue triangles mark the comparisons of a node to its children.
After placing the maximal element at its final position in the above example, node 1 became greater than node 0. So, we interchange their values, as shown by the double-headed blue arrow:
Now, we compare node 1 to its children, nodes 3 and 4. We exchange the values in nodes 1 and 3 since node 3 is the greater child:
In the final step, we check nodes 3, 7, and 8. No swapping is required here, as 11 is already greater than both 2 and 5:
We’ve now obtained a new heap of 9 elements. This heap is correct, as each node’s value is greater than or equal to those of its children.
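Running the sift_down sketch from above on the hypothetical array reproduces exactly these swaps:

```python
a = [11, 42, 30, 26, 20, 22, 25, 2, 5, 50]  # state right after the first swap

sift_down(a, 0, 9)  # re-heapify the 9 unsorted elements

print(a)  # [42, 26, 30, 11, 20, 22, 25, 2, 5, 50] -- 42 is back at the root
```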
Our next step is to move the largest element (42) to its correct place:
That results in a sorted array of size 2 (purple) and an unsorted array of size 8. To re-heapify the remainder, we apply Sift-Down to the subarray a[0..7]. Afterward, we repeat the process until we sort the entire array.
3.4. Completing the Sorting Process
We’ll illustrate the remaining process using arrays. Green squares denote the heap, which shrinks in size from 8 to 0, while purple squares indicate the sorted array, which increases in size to 10. The arrows indicate the transfer of the max element to its proper location. We’ve omitted other entries for clarity:
3.5. Heapsort – the Complete Pseudocode
Here’s the complete pseudocode of Heapsort:
The algorithm repeatedly removes the maximal elements and re-heapifies the remainder with Sift-Down.
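Here’s a minimal Python sketch of the same idea, built on the sift_down function from above; the first loop is the max-heapify step that builds the initial heap bottom-up:

```python
def heapsort(a):
    """Sort the list a in non-decreasing order, in place."""
    n = len(a)

    # max-heapify: build the initial heap bottom-up, starting at the last internal node.
    for i in range(n // 2 - 1, -1, -1):
        sift_down(a, i, n)

    # Repeatedly move the current maximum to the end and re-heapify the remainder.
    for end in range(n - 1, 0, -1):
        a[0], a[end] = a[end], a[0]
        sift_down(a, 0, end)

# Example run on a hypothetical input:
data = [2, 42, 5, 30, 11, 25, 50, 22, 26, 20]
heapsort(data)
print(data)  # [2, 5, 11, 20, 22, 25, 26, 30, 42, 50]
```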
4. Complexity of Heapsort
We start our Heapsort by creating a heap using max-heapify with complexity O(n).
After this, we run Sift-Down n - 1 times. This procedure moves from the tree’s root to a leaf node and takes at most O(log n) steps since we deal with complete binary trees, which have at most ⌊log₂ n⌋ + 1 levels. So, the total time for the sifts is O(n log n).
The time for max-heapify, O(n), is masked by the time for the sifts. Therefore, the complexity of Heapsort is O(n log n).
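Putting both phases together for n elements:

T(n) = O(n) + (n − 1) · O(log n) = O(n log n)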
5. Heapsort vs. Selection Sort
It’s interesting to compare Heapsort with Selection Sort. Both pick and move the maximum element to its proper place at each step.
However, Selection Sort makes n - i comparisons in the i-th iteration, which results in O(n²) complexity. In the worst case, Heapsort makes O(log n) comparisons and swaps per iteration, which is why its complexity is O(n log n).
Heapsort uses the clever heap approach to find and place the maximal element, which is why we can consider Heapsort a refinement of Selection Sort.
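For contrast, here’s a minimal selection-sort sketch written to mirror the Heapsort loop above, moving the maximum of the unsorted prefix to its final position; the inner linear scan is what makes it quadratic:

```python
def selection_sort(a):
    """Sort a in non-decreasing order by repeatedly selecting the maximum
    of the unsorted prefix and moving it to its final position."""
    n = len(a)
    for end in range(n - 1, 0, -1):
        largest = 0
        # Finding the maximum needs a full scan, unlike Heapsort's O(log n) sift.
        for i in range(1, end + 1):
            if a[i] > a[largest]:
                largest = i
        a[largest], a[end] = a[end], a[largest]
```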
6. Conclusion
In this article, we explained the Heapsort algorithm for sorting an array in non-decreasing order. It uses the fact that a Max-Heap’s maximum element is at its root.
It’s simple to modify this algorithm to use a Min-Heap, leading to sorting in non-ascending order. The complexity of Heapsort is O(n log n), which is the best possible for sorting algorithms that use comparisons.