Mergesort Investigation 14 - count of comparisons
My recursive mergesort algorithm, and the wikipedia bottom-up) algorithm read and write the same list nodes in the same order, for list lengths that are powers of two. Something other than mere data reading and pointer writing causes abrupt performance drops at linked list lengths of 2N+1 nodes.
I decided to count the number of comparisons used in merging lists. I had to write a different program so as to use an unsorted list with data in the same randomly chosen order for each of the three algorithms.
Merging two sorted lists into a combined, sorted list works something like this:
1 func merge(p *Node, q *Node) *Node {
2 if p == nil {
3 return q // list p is empty
4 }
5 if q == nil {
6 return p // list q is empty
7 }
8
9 x := &q
10 if p.Data < q.Data { // choose lowest-data-valued head node
11 x = &p
12 }
13
14 h, t := *x, *x // h, t are at head and tail of merged list
15 *x = (*x).Next
16
17 for p != nil && q != nil {
18 n := &q
19 if p.Data < q.Data { // choose lowest-data-valued node remaining
20 n = &p
21 }
22 t.Next = *n // append lowest-data-valued node to merged list
23 *n = (*n).Next
24 t = t.Next
25 }
26
27 t.Next = p
28 if q != nil {
29 t.Next = q
30 }
31
32 return h
33 }
Lines 10 and 19 above contain “less than” operations on the .Data
elements of two linked list nodes.
Those are the comparisons I’m counting in this post.
Two groups of lines, 9-15 and 18-24, comprise similar functionality. This code does the actual sorting that makes a “mergesort”. Given two linked lists, the groups of lines choose the smallest-data-valued head node of the two lists, and append it to a new, merged list. This happens repeatedly, until one list is entirely consumed. The remaining unmerged list gets appended to the merged list (lines 27-30), because every data value in the remaining list is greater than the data value of the tail of the merged list.
The two initial lists are sorted from smallest data value to greatest,
The merged list, traversed by .Next
pointers, will have .Data
elements
from least to greatest.
All of the algorithms I benchmarked have this merging in one form or another.
I’m proud of lines 18-24.
Only one if
without an else
clause to pick the smallest
data value of the two lists to merge, and append it to the merged list.
I thought this was a clever use of pointer-to-a-pointer.
It costs a little in performance.
Count of Comparisons
My counted the number of “less than” comparisons each algorithm made
while sorting a randomly-chosen-data-value list.
The list node have a slightly different composition than in other benchmarking.
They carry an extra .Reset
pointer that holds the same address
as .Next
pointers before the sort.
After sorting and counting comparisons,
my code uses the .Reset
pointers to put the sorted lists’ nodes back
into the original unsorted statee.
The code I wrote looks like this:
for j := 0; j < 10; j++ {
head := listCreation(listLength)
nl := recursiveMergeSort(head)
checkSorted(nl, listLength, "recursive")
resetList(head, listLength, "recursive")
nl = buMergesort(head)
checkSorted(nl, listLength, "bottom up")
resetList(head, listLength, "bottom up")
nl = mergesort(head)
checkSorted(nl, listLength, "iterative")
}
For each of 10 iterations,
the code creates a linked list with randomly chosen data values,
and sort that list 3 times.
Between each sort, the code verifies each algorithm sorted the list
(calls to function checkSorted()
,
then returns the list to its original order
by following .Reset
pointers (calls to function resetList()
).
Above, a graph of the number of comparisons 2 of the 3 algorithms (recursive, wikipedia bottom up ) versus the length of the linked list (in nodes). I ran each algorithm 10 times on the same list for each list length, then took the mean of the number of comparisons. My July 2021 iterative gave exactly the same count of comparisons as the bottom up algorithm, so I didn’t include it on the graph above.
I needed to run each algorithm more than once for each list length.
The randomly chosen arrangement of data values
causes the number of comparisons made to differ from run to run.
If one to-be-merged-list has a lot of smaller data values,
the merge code uses up that list first, leaving the other list,
Lines 27-30 in func merge
append whole pieces of remaining lists
without doing a comparison - the other list is already merged.
Depending on the arrangement of data in the to-be-merged lists,
any particular merging can incur more or less comparisons.
I had gnuplot
draw lines, instead of using big “points”
because points make the graphs look fuzzy and visually hide any differences
in comparison counts.
The graph shows the performance drops: at 2N+1 nodes, the number of comparisons goes up. At least some of the comparisons will involve changing pointers during list merges, so the amount of work the algorithm does also goes up.
You can even see the little extra pops between list lenghts of 2N which my amateur anomaly detection almost kind of pointed out, but I chose to ignore.
Visualization
I made a plot of the difference between comparison counts made by recursive and bottom up algorithms. The comparison counts of bottom up and my own July 2021 iterative algorithm are identical.
The purple spikes on the graph above represent the difference between the means of 10 sorts each for recursive and wikipedia bottom up algorithms. The big spikes occur around list lengths of powers of 2. The small, but still prominent, spikes are obviously at half and quarter of the way to the next power of 2 length.
Explanation
My previous investigation
of memory access patterns
incorrectly showed me that my recursive algorithm
and the wikipedia bottom up algorithm merged the same sublists
in the same length patterns.
During a shower, I realized that I had missed the “roll up” merges.
The bottom up algorithm leaves an array of lists,
where each non-nil array element at index i
value is 2i nodes long.
When the original unsorted list is 2N nodes long,
the only non-nil array element is array[N]
.
The comparison count of bottom up and recursive are identical
when the unsorted list is 2N nodes long.
Adding 1 more node to the original list
causes the bottom up algorithm to put a long sorted list at array[N]
and a single-node list at array[0]
.
Merging that single node list with a list of length 2N
requires on average 2N/2 comparisons.
As N gets larger, so, on average do the number of comparisons this requires.
In contrast, the recursive algorithm doesn’t deal in sublists of 2N. The recursive algorithm counts off half the list it receives as a formal argument, then divides the formal argument list into two sublists. An odd number of nodes just means that the right sublist has one more node than the left sublist.
The difference between a count of comparisons for bottom up and recursive algorithms, is the extra comparisons made by the bottom up algorithm The extra comparisons should be approximately 2N/2 at list lengths of 2N+1. My comparison counting program didn’t use exact powers of 2 length lists, so I’ll have to approximate to do some arithmetical checks.
list length | Closest 2N | extra comparisons | 2N/2 |
---|---|---|---|
281000 | 262144 | 187018 | 131072 |
561000 | 524288 | 377460 | 262144 |
1081000 | 1048576 | 864435 | 524288 |
2121000 | 2097152 | 1930245 | 1048576 |
4201000 | 4194304 | 4128230 | 2097152 |
The extra comparisons and 2N/2 columns above should be close, but they’re not terribly. If I run my comparison counter program exactly at 2N+1 I get closer agreement.
list length | N | 2N | extra comparisons | 2N/2 |
---|---|---|---|---|
1048575 | -152 | |||
1048576 | 20 | 1048576 | 0 | |
1048577 | 474739 | 524288 | ||
2097151 | -577 | |||
2097152 | 21 | 2097152 | 0 | |
2097153 | 1310732 | 1048576 |
There, that’s much closer.
It’s good that the number of extra comparisons is 0 (zero)
at list lengths of exactly 2N.
The bottom up algorithm main part would finish with exactly
one non-nil element at array[N]
,
so it does no further work merging a single-element list.
The smaller “cliffs” of extra comparisons in the graph above occur at list lengths of 6321000, 10521000, 12601000, 14681000 nodes. Looking at those lengths as binary will explain.
N | As binary |
---|---|
6321000 | 11000000111001101101000 |
10521000 | 101000001000100110101000 |
12601000 | 110000000100011010101000 |
14681000 | 111000000000001110101000 |
In my incorrect conclusion,
I noted that the way the main part of the bottom up algorithm
filled in the array
variable could be viewed as a binary representation
of the length of the original, unsorted list.
The binary representations of list lengths above
show that the bottom up algorithm’s array
variable had a few slots
pointing to long lists (the “11” or “101” or “111” prefixes of the binary numbers)
and mostly nil slots, no smaller length lists.
The extra comparisons arise from the circumstance of a single node at the end
of the original unsorted list getting merged into a long list.