1. Introduction

A sorting algorithm is stable if two records with equal keys remain in the same order in which they were in the unsorted array.

Stable sorting algorithms are crucial in many important applications, such as radix sort.

In this article, we’ll show that heapsort, a powerful sorting algorithm with O(n\log n) time complexity, is not stable.

2. Understanding Stability

Suppose we had the following input array of tuples:

    \[[(72, bob), (89, jim), (77,alice), (90,tom), (63,sam), (55,jill), (89,emma)]\]

Each tuple could represent, for example, (grade, name). Now let’s see how they would be sorted on the key grade using a stable sort:

    \[[(55,jill),  (63,sam), (72,bob), (77,alice), (89,jim), (89,emma), (90,tom)]\]

We can immediately see that (89,jim) and (89,emma) are in the same order in which they occurred in the original unsorted array, even though one might consider the order (89,emma) and (89,jim) to be more obvious in a lexicographic sense.

This is because we’re sorting only on the key grade, and key name does not enter into consideration.

3. Heapsort

Let’s quickly review the essential aspects of heapsort.

  • An input, unsorted array of records of size n is given.
  • A max-heap is created. This is a complete binary tree with the property that each node has key \boldsymbol{\geq} the keys of its two children.
  • The maximum value in the input array is at the root of the tree.

3.1. Sorting With a Heap

We now repeat the following steps n times:

  1. Swap the root with the last element of the array.
  2. Correct the modified max-heap (of size n-1) so that it still follows the heap property. This is done using the procedure sift-down.

At the end of this process, the max-heap will be sorted in increasing order.

All the above steps can be executed on the original input array — a separate binary tree is not required.

4. Heapsort Is Not Stable

Let’s demonstrate, using a counterexample, that heapsorts are not stable.

We’ll use the same array of tuples that we presented above. In the following the input data is in an array A with indexes 1\ldots 7:

heapsort example

4.1. Making a Max-Heap

We can construct a max-heap corresponding to this array, as shown below. We can satisfy ourselves that each node has a key value \geq its children.

We can also see that the maximum element in the array \mathbf{(90,tom)} sits in the root:

heapsort intermediate state 1

We can now interchange the root element (index 1) with the last element (index 7) in the array. This puts (90,tom) in its final and correct position, i.e. in index 7. The purple line indicates that the sub-array composed only of A[7] is correctly sorted:

heapsort intermediate state 2

4.2. Correcting the Heap Using Sift-Down

We observe that the tree no longer satisfies the max-heap property, since (55,jill) at index i has a smaller key than its two children, which have keys 89 and 72.

We can correct this problem by a sequence of interchanges that move the maximum element in A[1\ldots6] to the root and adjusts the remaining nodes appropriately. The procedure to do this is called sift-down:

heapsort intermediate state 3

The nodes 1\ldots 6 form a proper max-heap, with the maximum element (89,jim) at the root. We can now interchange the elements at node 1 with the element at n-1=6. We can see (as indicated by the purple line), that elements A[6..7] are properly sorted:

heapsort intermediate state 4

4.3. Continuing the Correction

The tree made up of nodes 1..5 is again not a proper max-tree, so we’ll apply the correction procedure sift-down to obtain:

heapsort intermediate state 5

We can now exchange node 1 (89,emma) and node 5 (55,jill) to put (89,emma) in its final location in the sorted array:

heapsort intermediate state 6

4.4. Where Do We Stand Now?

We could continue this process until the entire array was sorted. However, at the moment we see that A[5,6,7] are (89,emma), (89,jim) and (90,tom), respectively.

These are the final positions of the tuples, and we notice that (89,emma) precedes (89,jim). In the original array (89,jim) precedes (89,emma). Thus, heapsort is not stable.

5. Conclusion

Stable sorting algorithms have many important applications. Heapsort, though it is an efficient (\mathbf{O\boldsymbol{(}n\  \boldsymbol{\log}\ n\boldsymbol{)}}) algorithm, is not stable, as we have demonstrated with a counterexample. To evaluate the stability of a sorting algorithm, we need to think in terms of records with a specific sorting key, as we have done with tuples. Merely thinking in terms of numbers does not give us much insight into the stability issue.

Comments are open for 30 days after publishing a post. For any issues past this date, use the Contact form on the site.