Most of the developers and people who’re involved in Computer Science heard about Quicksort. Many of them implemented this algorithm in one form or the other. This is a ubiquitous algorithm, but is it so good?
A different sorting algorithm is used as a default in many major languages. This algorithm has interesting characteristics, yet most people haven’t heard about Timsort.
This tutorial will dive deeper into the difference between these two algorithms and discuss their pros and cons. However, as the goal is to highlight the differences, this article won’t concentrate on the implementation details of these algorithms.
2. The History of Quicksort and Timsort
Let’s briefly review the history of these algorithms and when they were introduced to get more context.
Let’s refresh our memory about the Quicksort algorithms. Tony Hoare published this algorithm in 1961, and since then, it has been widely used. There are many different approaches and implementations of this algorithm that can improve its optimality in particular situations. However, it’s fascinating that the algorithm, introduced more than sixty years ago, is widely used nowadays, and many programming languages use it as the main to sort data.
Timsort is a young algorithm compared to Quicksort, although it’s more than twenty years old. Tim Peters introduced this algorithm in 2002. It was based on the techniques from Peter McIlroy’s 1993 paper “Optimistic Sorting and Information Theoretic Complexity.” Timsort is a highly optimized fusion of Mergesort and Insertion sort and outperforms both of them.
3. Complexity Comparison
In most cases, the most interesting part of the algorithms is their time complexity metrics, which show how fast the algorithm would work on huge amounts of data. Let’s compare the performance of these two algorithms before reviewing other differences:
As we can see, Timsort has a better worst time (which happens not so often) and a very good best time. However, time complexity is only one part of the entire picture, as often we have a trade-off between time and space. Let’s compare the space complexity of these algorithms:
4. Sorting Characteristics
Sorting algorithms have the following characteristics: adaptive/non-adaptive, stable/non-stable, recursive/non-recursive, online/offline, serial/parallel, and external/internal. We’ll concentrate on stability and adaptivity in this article. Other parameters are irrelevant, in our case, as they’re heavily implementation specific.
Is bubble sort always the wrong way to go? One of the most interesting characteristics of sorting algorithms and adaptivity. It tells us how well the algorithms can adapt to the given data and perform their best. Do we need to sort the collection which already sorted? Should we spend less time sorting the collection with only a few elements out of order? These are reasonable questions, and some of the algorithms, especially naive implementations, might not use the structure of the input data to improve the performance.
The simplistic version of a Quicksort isn’t adaptive and will try to sort even an already sorted collection. Bubble sort or Insertion sort would perform better than Quicksort in the setup where only a few elements are out of order. Because these algorithms are adaptive, they can use the benefits of collections being presorted. However, on random sets of data, these algorithms may drive the entire system to a halt.
However, in the examples above, all the algorithms that have the benefits of adaptivity have a massive impediment of being slow on random data. Timsort resolves this issue, and additionally, to being optimal, this algorithm also can exploit sorted or presorted data. Because real-world data are usually not entirely random and often contain sorted parts, Timsort can dramatically improve performance. For example, adaptive Timsort greatly outperforms non-adaptive Mergesort on partially sorted data. This doesn’t mean that Mergesort or Quicksort cannot be adaptive, but it provides a vivid picture of the benefits of adaptive algorithms.
Another key characteristic that differs between Quicksort and Timsort is stability. In Layman’s terms, stability ensures that equal (equal in terms of comparison parameter) elements will be in the same order as before sorting. For example, imagine we have a list of people sorted alphabetically and then decide to sort them by age. In this case, stability would ensure that all the people of the same age would still be in alphabetical order. This characteristic is especially valuable when sorting non-primitive values. That’s why, for example, in Java, Quicksort is used for primitive data and Timsort for objects.
5. Pros and Cons
As we saw previously, Timsort is a great algorithm with many benefits and a good performance overall. The first thing that can be considered a negative point is space complexity. Another issue with Timsort is that it’s much more complex than Quicksort. Even a well-tuned implementation of a Quicksort can be written down in twenty or so lines of code. At the same time, Timsort is a hybrid of Mergesort and Insertion sort and has many performance-improving techniques at its core.
However, it doesn’t mean we don’t use Timsort daily. As mentioned previously, many major languages use this algorithm as the primary sorting method. Although the implementation is hidden from the users, we get all the benefits of Timsort in our code.
In this article, we learned about the differences between Timsort and Quicksort. Quicksort is a perfect algorithm that is easy to use and simple to implement. Timsort, on the other hand, has a more complex implementation and can significantly improve performance. However, Timsort has more moving parts and is quite complex compared to a Quicksort. It’s an excellent challenge to spend some time on implementing it as this would provide a nice introspection into several computer science concepts.