Powersort is a modern sorting algorithm (proposed by Munro and Wild in 2018) that keeps the O(n log n) worst-case guarantee of classical algorithms like Mergesort, but runs faster on "easy" inputs. This raises two questions we need to answer first: what exactly do we mean by easy and hard, and how can we exploit these ideas reliably without spending much computation on them?

Even though all the following can be applied to any comparable type of object (e.g. strings), we use integers for easy visualization. For an array of length n ∈ ℕ, write A = (a₀, a₁, …, aₙ₋₁), where aᵢ denotes the element at index i; we index arrays from 0 to n − 1. First off, we consider the arguably "easiest" input possible for sorting: an ascending array, i.e. an array that is already sorted. In this case, nothing needs to be done:
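Recognizing this case costs only a single linear scan. A minimal sketch in Python (the function name is my choice):

```python
def is_ascending(a):
    """Return True if a is already sorted, i.e. forms a single ascending run."""
    return all(a[i] <= a[i + 1] for i in range(len(a) - 1))
```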

Next up, let's think about hard inputs. A very straightforward idea is to take the opposite of the previous array, one that is fully descending:

Just from visual intuition, we can sense that this should not be considered hard either, as we can simply flip it horizontally, i.e. reverse the array in a single linear pass.
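A sketch of that flip, assuming we may modify the array in place:

```python
def flip(a):
    """Reverse a in place with two pointers: one linear pass, no extra memory."""
    i, j = 0, len(a) - 1
    while i < j:
        a[i], a[j] = a[j], a[i]
        i += 1
        j -= 1
```

(Python's built-in list.reverse() does the same; the explicit loop just makes the linear cost visible.)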

After a bit of testing, a promising hard candidate might look somewhat like the following. The bars can be swapped by drag-and-drop, so try out other arrangements.

A way to quantify why this one is harder than the previous ones is counting the subarrays that are already monotone. For indices 0 ≤ i ≤ j < n, we write A[i..j] = (aᵢ, aᵢ₊₁, …, aⱼ) for the contiguous subarray of A from position i to j. From now on, these monotone subarrays will be called runs.

More precisely, an ascending run of A is a subarray A[i..j] with 0 ≤ i ≤ j < n such that aᵢ ≤ aᵢ₊₁ ≤ … ≤ aⱼ. A descending run is defined analogously. In this article, we use the word run for both ascending and descending runs.
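To make this concrete, here is one way to detect maximal runs in a single pass, roughly mirroring what Timsort-style implementations do (the function name and return format are my choice). Descending runs are flipped on the spot, so afterwards every run is ascending:

```python
def find_runs(a):
    """Split a into maximal runs, returned as (start, end) pairs with
    end exclusive. Descending runs are reversed in place, so after this
    pass every run is ascending."""
    runs = []
    i, n = 0, len(a)
    while i < n:
        j = i + 1
        if j < n and a[i] <= a[j]:            # ascending run: extend it
            while j < n and a[j - 1] <= a[j]:
                j += 1
        else:                                  # descending run: extend, then flip
            while j < n and a[j - 1] > a[j]:   # strictly descending, which keeps later merges stable
                j += 1
            a[i:j] = a[i:j][::-1]
        runs.append((i, j))
        i = j
    return runs
```

For example, find_runs on [1, 2, 5, 3, 2, 4] reports the runs (0, 3), (3, 5) and (5, 6), reversing the descending middle run from (3, 2) to (2, 3) along the way.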

The runs are color-coded in the figures so that two neighboring runs have different colors. We can now state more precisely that the arrays shown at the start of the post each consist of a single run, while this new hard input decomposes into several shorter runs.

To use these runs productively for sorting, we can borrow the merge operation of Mergesort. It takes two runs that are next to each other and merges them into one run that contains all the elements of the two input runs. Let k be the number of runs at the very start; then exactly k − 1 merge operations are needed to sort the whole array, as each merge reduces the number of runs by 1, and the sorted array is just one big run.
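A sketch of one such merge, for two adjacent ascending runs a[lo:mid] and a[mid:hi] (a textbook two-way merge; the signature is my choice):

```python
def merge(a, lo, mid, hi):
    """Merge the adjacent ascending runs a[lo:mid] and a[mid:hi] in place."""
    left, right = a[lo:mid], a[mid:hi]    # temporary copies of both runs
    i = j = 0
    k = lo
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:           # <= keeps equal elements in order (stability)
            a[k] = left[i]; i += 1
        else:
            a[k] = right[j]; j += 1
        k += 1
    a[k:hi] = left[i:] if i < len(left) else right[j:]   # copy the leftover tail
```

Every element is moved a constant number of times and each comparison consumes one element, so the work of one merge is proportional to the combined length of the two runs; that cost model is what the next paragraphs build on.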

So far, we have not specified in which order the runs should be merged. By just playing around with the array above, we can see that any order of merges results in a correctly sorted array; the difference is that different merge orders generally take different amounts of time. The cost of a merge is proportional to the total length of the two runs being merged. Let's consider the array below, in which this effect is particularly easy to see:

There are 3 runs, the left one being by far the longest and the middle and right ones being short. Before reading further, try acting out the sorting algorithm by hand (it won't take long, as there are only two choices for the merge order).


The two ways this can play out are explained below:


[Figure: left-to-right merge order. R1 (len 7) and R2 (len 2) merge into R1+R2 (len 9), which then merges with R3 (len 1) into the sorted array (len 10).]

Total Cost: (len(R1) + len(R2)) + (len(R1+R2) + len(R3)) = (7 + 2) + (9 + 1) = 9 + 10 = 19

A naive left-to-right merge order is very inefficient here, creating an "unbalanced" sequence of operations: the long run R1 takes part in both merges, so the intermediate run R1+R2 is unnecessarily big.


[Figure: smallest-first merge order. R2 (len 2) and R3 (len 1) merge into R2+R3 (len 3), which then merges with R1 (len 7) into the sorted array (len 10).]

Total Cost: (len(R2) + len(R3)) + (len(R1) + len(R2+R3)) = (2 + 1) + (7 + 3) = 3 + 10 = 13

Merging the two smaller runs R2 and R3 first is much cheaper. The reason appears to be that R2 and R3 have a similar size, while R1 is much bigger, so merging R1 with any other run barely changes its size; since R1's length dominates the cost of every merge it takes part in, it should take part in as few merges as possible.
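We can verify both totals mechanically. The sketch below simulates merges of adjacent runs on the run lengths alone and sums the costs; the order encoding (each entry is the index of the left run of the next merge, within the current list of runs) is my choice:

```python
def total_cost(run_lengths, order):
    """Sum the costs of a sequence of adjacent-run merges, where the
    cost of one merge is the combined length of the two runs."""
    runs = list(run_lengths)
    cost = 0
    for i in order:
        merged = runs[i] + runs[i + 1]    # merge runs i and i+1
        cost += merged
        runs[i:i + 2] = [merged]
    return cost

runs = [7, 2, 1]                  # R1, R2, R3 from the figures above
print(total_cost(runs, [0, 0]))   # left-to-right order: 19
print(total_cost(runs, [1, 0]))   # smaller runs first:  13
```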


TODO: Introduce power and powerloop