Mergesort Investigation 16 - sorting reverse sorted lists
There’s one more item I discovered during my mergesort investigation that bothers me. My code finishes with a low-to-high list of data values. Starting Wikipedia’s bottom-up algorithm with a “reverse sorted” (high-to-low data values) list does not show the abrupt performance drops that a list of randomly-chosen data values does, but it does show bumps in comparison count at list lengths of 2^N + 1.
Why is this?
I collected new data for this post rather than re-using data from previous mergesort posts, because I wanted to make sure I hadn’t made mistakes the previous time around. I ran the benchmarking data collection on my Dell R530.
Count of Comparisons to sort a list
I previously counted comparisons needed to sort a list. Each comparison looks something like `if p.Data < q.Data {` in the code.
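The counting can be sketched like this. This is my own minimal reconstruction, not the original benchmark code: the `Node` type, the `merge` function shape, and the `comparisons` counter are assumptions, but the comparison itself matches the `p.Data < q.Data` test above.

```go
package main

import "fmt"

// Node is a minimal singly-linked list node. The field name Data
// matches the comparison shown above; the rest is an assumption.
type Node struct {
	Data int
	Next *Node
}

var comparisons int // incremented once per p.Data < q.Data test

// merge splices two sorted lists into one, counting comparisons.
func merge(p, q *Node) *Node {
	var head Node
	t := &head
	for p != nil && q != nil {
		comparisons++
		n := &q
		if p.Data < q.Data {
			n = &p
		}
		t.Next = *n
		*n = (*n).Next
		t = t.Next
	}
	// Append whichever list still has nodes; no comparisons needed.
	if p != nil {
		t.Next = p
	} else {
		t.Next = q
	}
	return head.Next
}

func main() {
	a := &Node{1, &Node{3, &Node{5, nil}}}
	b := &Node{2, &Node{4, &Node{6, nil}}}
	for n := merge(a, b); n != nil; n = n.Next {
		fmt.Print(n.Data, " ")
	}
	fmt.Println("\ncomparisons:", comparisons)
}
```

Merging two 3-node lists this way takes 5 comparisons: once one list empties, the remainder of the other is appended without any further comparing.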

Above, a graph showing how many comparisons it took to:
- (purple) sort a list with randomly-chosen integer value data
- (green) sort a list with initially sorted data values
- (blue) sort a list with initially high-to-low sorted data values
There are 5 jumps in the number of comparisons easily visible. The list lengths where the jumps occur are slightly different for randomly-chosen data and for initially reverse-sorted data.
| List length | N | 2^N | Jump size, green | Jump size, purple |
|---|---|---|---|---|
| 261000 | 18 | 262144 | 23880 | |
| 263000 | | | | 289155 |
| 525000 | 19 | 524288 | 554240 | |
| 539000 | | | 27742 | |
| 1047000 | 20 | 1048576 | 27456 | |
| 1049000 | | | | 1082631 |
| 1557000 | 20.57 | 1556635 | 26032 | 50544 |
| 2089000 | | | 29056 | |
| 2099000 | 21 | 2097152 | | 2117729 |
I ran the comparison counter program with an increment of 2,000 nodes between sortings, so the list length at each “jump” is not exactly 2^N + 1.
Randomly-chosen data values cause a lot more comparisons than starting with a reverse-sorted list. The above graph is a little deceptive: the Y-axis scale is a factor of 10 larger than the X-axis scale.
List Sorting Performance

That’s the actual sorting performance. I had my program sort much larger lists than for the comparison count graph above. The performance drops at smaller list lengths aren’t as visible because of the noise in the performance measurements. I ran 50 sorts at each list length; the average sorting time is graphed above.
There’s almost no difference in performance between sorting an already sorted list, and sorting a reverse-sorted list.
My previous conclusion was that the extra comparisons the bottom-up mergesort requires at lists of length 2^N + 1 caused the abrupt performance drops. Sorting a reverse-sorted list very definitely causes extra comparisons at the same list lengths, but there’s no detectable performance drop when sorting reverse-sorted lists of length 2^N + 1.
Effect of averaging multiple sorts
I thought that possibly the noise from varying performance was washing out the performance drops on the chart, so I ran my benchmarking program with 1, 2, 5, 10, 20 and 50 sorts per list length. If the drops exist but are hidden by noise, averaging more sorts per list length should reveal them.

Above, an animation. The frames show sorting performance for an initially reverse-sorted (high-to-low) list, with 1, 2, 5, 10, 20 and 50 sorts averaged per list length. As you can see, the more sorts averaged, the smoother the graph becomes. I don’t think there are any drops in performance, even visually-indistinct drops.
I have to conclude that performance drops are not getting lost in noisy measurements.
Branch prediction
The code to merge two sorted linked lists looks like this:
1 for p != nil && q != nil {
2     n := &q
3     if p.Data < q.Data {
4         n = &p
5     }
6     t.Next = *n
7     *n = (*n).Next
8     t = t.Next
9 }
Line 2 sets the variable n to the address of the head of the (remaining) right hand linked list. The comparison counted in the other tests takes place at line 3. Line 4 gets executed if the data value at the head of the left hand list (variable p) is numerically less than the data value at the head of the right hand list (variable q). Line 4 is “extra” work done only when the comparison is true. Could the CPU’s branch prediction, or that extra work itself, be causing the performance problems?
I put in extra counters to keep track of how many times the extra assignment takes place. It turns out that when the sorting algorithm starts with an already sorted list, the extra assignment takes place every time. When it starts with a reverse-sorted list, the extra assignment never takes place.
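Here’s a small self-contained sketch of that instrumentation, my reconstruction rather than the original counters. When bottom-up mergesort works on an already sorted list, every element of the left run is below every element of the right run; on a reverse-sorted list, every element of the right run is below every element of the left run. The two `merge` calls below reproduce those two cases.

```go
package main

import "fmt"

type Node struct {
	Data int
	Next *Node
}

// Counters around the comparison (line 3 of the fragment above) and
// the "extra" assignment (line 4).
var comparisons, extraAssignments int

func merge(p, q *Node) *Node {
	var head Node
	t := &head
	for p != nil && q != nil {
		n := &q
		comparisons++
		if p.Data < q.Data {
			extraAssignments++ // line 4 executed
			n = &p
		}
		t.Next = *n
		*n = (*n).Next
		t = t.Next
	}
	if p != nil {
		t.Next = p
	} else {
		t.Next = q
	}
	return head.Next
}

// fromSlice builds a linked list holding vals in order.
func fromSlice(vals []int) *Node {
	var head Node
	t := &head
	for _, v := range vals {
		t.Next = &Node{Data: v}
		t = t.Next
	}
	return head.Next
}

func main() {
	// Runs as produced from an already sorted list: left entirely below right.
	comparisons, extraAssignments = 0, 0
	merge(fromSlice([]int{1, 2}), fromSlice([]int{3, 4}))
	fmt.Printf("sorted runs:  %d of %d comparisons took the extra assignment\n",
		extraAssignments, comparisons)

	// Runs as produced from a reverse-sorted list: right entirely below left.
	comparisons, extraAssignments = 0, 0
	merge(fromSlice([]int{3, 4}), fromSlice([]int{1, 2}))
	fmt.Printf("reverse runs: %d of %d comparisons took the extra assignment\n",
		extraAssignments, comparisons)
}
```

In the first call every comparison is true, so the extra assignment runs every time; in the second, every comparison is false and the extra assignment never runs.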
This makes complete sense. In the code fragment above, I chose to assign the right hand list’s head node to n, and, if the comparison (line 3) is true, re-assign the left hand list’s head node to n. When sorting a linked list with high-to-low data values, the right hand sub-list’s head node always has a lower data value than the left hand sub-list’s head node.
I benchmarked mergesorts with a “switched” merge function:
1 for p != nil && q != nil {
2     n := &p
3     if q.Data < p.Data {
4         n = &q
5     }
6     t.Next = *n
7     *n = (*n).Next
8     t = t.Next
9 }

Above, benchmarking exactly like the sorting performance graph above, except that the merging chooses the left hand linked list by default, and does the extra assignment when the right hand linked list’s data value is lower.
The two graphs are astonishingly close. The extra assignment, line 4 in each of the two merge function fragments above, has no performance impact.