Mergesort Investigation 14 - count of comparisons

My recursive mergesort algorithm and the wikipedia bottom-up algorithm read and write the same list nodes in the same order, for list lengths that are powers of two. Something other than mere data reading and pointer writing causes abrupt performance drops at linked list lengths of 2^N+1 nodes.

I decided to count the number of comparisons used in merging lists. I had to write a different program so that each of the three algorithms sorted an unsorted list with data in the same randomly chosen order.

Merging two sorted lists into a combined, sorted list works something like this:

     1	func merge(p *Node, q *Node) *Node {
     2		if p == nil {
     3			return q // list p is empty
     4		}
     5		if q == nil {
     6			return p // list q is empty
     7		}
     8	
     9		x := &q
    10		if p.Data < q.Data { // choose lowest-data-valued head node
    11			x = &p
    12		}
    13	
    14		h, t := *x, *x // h, t are at head and tail of merged list
    15		*x = (*x).Next
    16	
    17		for p != nil && q != nil {
    18			n := &q
    19			if p.Data < q.Data { // choose lowest-data-valued node remaining
    20				n = &p
    21			}
    22			t.Next = *n // append lowest-data-valued node to merged list
    23			*n = (*n).Next
    24			t = t.Next
    25		}
    26	
    27		t.Next = p
    28		if q != nil {
    29			t.Next = q
    30		}
    31	
    32		return h
    33	}

Lines 10 and 19 above contain “less than” operations on the .Data elements of two linked list nodes. Those are the comparisons I’m counting in this post.
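
In the counting program, a wrapper along these lines can stand in for those two tests. This is my illustration, not the actual code - the comparisons variable and less helper are names I made up:

        var comparisons int64 // incremented at every "less than" test

        // less wraps the .Data comparison so that every call gets
        // counted before the result is returned
        func less(a, b *Node) bool {
            comparisons++
            return a.Data < b.Data
        }

Lines 10 and 19 would then read if less(p, q) instead of comparing .Data directly.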

Two groups of lines, 9-15 and 18-24, implement similar functionality. This code does the actual sorting that makes a “mergesort”. Given two sorted linked lists, each group chooses the lower-data-valued head node of the two lists and appends it to a new, merged list. This happens repeatedly, until one list is entirely consumed. The remaining unmerged list gets appended to the merged list (lines 27-30), because every data value in the remaining list is greater than the data value of the tail of the merged list.

The two initial lists are sorted from smallest data value to greatest. The merged list, traversed by .Next pointers, likewise has .Data elements ordered from least to greatest. All of the algorithms I benchmarked have this merging in one form or another.

I’m proud of lines 18-24: a single if, without an else clause, picks the smaller data value of the two lists and appends that node to the merged list. I thought this was a clever use of pointer-to-a-pointer, although it costs a little in performance.
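
For comparison, a conventional version of the merge loop body uses an explicit else clause instead of the pointer-to-a-pointer. This is a sketch, not code from any of the benchmarked programs:

        for p != nil && q != nil {
            if p.Data < q.Data {
                t.Next = p // p's head node is smaller; append it
                p = p.Next
            } else {
                t.Next = q // otherwise append q's head node
                q = q.Next
            }
            t = t.Next
        }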

Count of Comparisons

My code counted the number of “less than” comparisons each algorithm made while sorting a randomly-chosen-data-value list. The list nodes have a slightly different composition than in my other benchmarking. They carry an extra .Reset pointer that holds the same address the .Next pointer held before the sort. After sorting and counting comparisons, my code uses the .Reset pointers to put the sorted list’s nodes back into their original unsorted state.
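
A minimal sketch of such a node type and the reset walk, assuming a resetList with the signature my driver code calls below (the length check is my addition):

        type Node struct {
            Data  int
            Next  *Node
            Reset *Node // copy of the pre-sort .Next pointer
        }

        // resetList follows .Reset pointers, restoring every .Next
        // pointer to its original, unsorted-order value
        func resetList(head *Node, length int, name string) {
            count := 0
            for n := head; n != nil; n = n.Reset {
                n.Next = n.Reset
                count++
            }
            if count != length {
                panic(name + ": reset walked wrong number of nodes")
            }
        }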

The code I wrote looks like this:

        for j := 0; j < 10; j++ {
            head := listCreation(listLength)

            nl := recursiveMergeSort(head)
            checkSorted(nl, listLength, "recursive")
            resetList(head, listLength, "recursive")

            nl = buMergesort(head)
            checkSorted(nl, listLength, "bottom up")
            resetList(head, listLength, "bottom up")

            nl = mergesort(head)
            checkSorted(nl, listLength, "iterative")
        }

For each of 10 iterations, the code creates a linked list with randomly chosen data values, and sorts that list 3 times. Between sorts, the code verifies that each algorithm sorted the list (calls to function checkSorted()), then returns the list to its original order by following .Reset pointers (calls to function resetList()).

[graph of linked list length vs comparison count for 3 algorithms]

Above, a graph of the number of comparisons made by 2 of the 3 algorithms (recursive, wikipedia bottom up) versus the length of the linked list (in nodes). I ran each algorithm 10 times on the same list for each list length, then took the mean of the number of comparisons. My July 2021 iterative algorithm gave exactly the same count of comparisons as the bottom up algorithm, so I didn’t include it on the graph above.

I needed to run each algorithm more than once for each list length, because the randomly chosen arrangement of data values causes the number of comparisons to differ from run to run. If one to-be-merged list has a lot of the smaller data values, the merge code uses up that list first, leaving nodes on the other list. Lines 27-30 in func merge append that whole remainder without doing a comparison - it is already in sorted order. For example, merging [1,2,3,4] with [5,6,7,8] takes only 4 comparisons, while merging the interleaved [1,3,5,7] with [2,4,6,8] takes 7. Depending on the arrangement of data in the to-be-merged lists, any particular merging can incur more or fewer comparisons.

I had gnuplot draw lines, instead of using big “points” because points make the graphs look fuzzy and visually hide any differences in comparison counts.

The graph shows the performance drops: at 2^N+1 nodes, the number of comparisons goes up. At least some of the comparisons involve changing pointers during list merges, so the amount of work the algorithm does also goes up.

You can even see the little extra pops between list lengths of 2^N, which my amateur anomaly detection almost kind of pointed out, but which I chose to ignore.

Visualization

I made a plot of the difference between the comparison counts of the recursive and bottom up algorithms. The comparison counts of bottom up and my own July 2021 iterative algorithm are identical.

[comparison count difference versus list length chart]

The purple spikes on the graph above represent the difference between the means of 10 sorts each for the recursive and wikipedia bottom up algorithms. The big spikes occur at list lengths around powers of 2. The smaller, but still prominent, spikes sit at list lengths halfway and a quarter of the way to the next power of 2.

Explanation

My previous investigation of memory access patterns incorrectly showed me that my recursive algorithm and the wikipedia bottom up algorithm merged the same sublists in the same length patterns. During a shower, I realized that I had missed the “roll up” merges. The bottom up algorithm’s main loop leaves an array of lists, where each non-nil array element at index i is 2^i nodes long. When the original unsorted list is 2^N nodes long, the only non-nil array element is array[N]. The comparison counts of bottom up and recursive are identical when the unsorted list is 2^N nodes long. Adding 1 more node to the original list causes the bottom up algorithm to put a long sorted list at array[N] and a single-node list at array[0]. Merging that single-node list with a list of length 2^N requires on average 2^N/2 comparisons. As N gets larger, so, on average, does the number of comparisons this merge requires.
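
A sketch of that roll up phase, along the lines of the wikipedia bottom-up listing (arr is my placeholder name for the algorithm’s array variable):

        // after the main loop, arr[i] is either nil or a sorted
        // sublist of 2^i nodes; "roll up" the non-nil slots, from
        // shortest to longest, into a single sorted list
        var result *Node
        for i := 0; i < len(arr); i++ {
            if arr[i] != nil {
                result = merge(arr[i], result)
            }
        }

For a list of 2^N+1 nodes, the roll up merges the single node at arr[0] into the 2^N-node list at arr[N], which is where the extra comparisons come from.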

In contrast, the recursive algorithm doesn’t deal in sublists of 2^N nodes. It counts off half the list it receives as a formal argument, then divides that list into two sublists. An odd number of nodes just means that the right sublist has one more node than the left sublist.
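
A sketch of that splitting (my actual recursiveMergeSort may differ in detail):

        func recursiveMergeSort(head *Node) *Node {
            if head == nil || head.Next == nil {
                return head // 0- or 1-node lists are already sorted
            }
            length := 0
            for n := head; n != nil; n = n.Next {
                length++
            }
            // walk to the last node of the left half, then cut the
            // list there; for odd lengths the right sublist gets
            // the extra node
            t := head
            for i := 1; i < length/2; i++ {
                t = t.Next
            }
            left, right := head, t.Next
            t.Next = nil
            return merge(recursiveMergeSort(left), recursiveMergeSort(right))
        }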

The difference between the comparison counts for the bottom up and recursive algorithms is the extra comparisons made by the bottom up algorithm. The extra comparisons should be approximately 2^N/2 at list lengths of 2^N+1. My comparison counting program didn’t use exact power-of-2 length lists, so I’ll have to approximate to do some arithmetical checks.

list length    closest 2^N    extra comparisons      2^N/2
     281000         262144               187018     131072
     561000         524288               377460     262144
    1081000        1048576               864435     524288
    2121000        2097152              1930245    1048576
    4201000        4194304              4128230    2097152

The extra comparisons and 2^N/2 columns above should be close, but they’re not terribly close. If I run my comparison counter program at list lengths of exactly 2^N-1, 2^N, and 2^N+1, I get closer agreement.

list length     N        2^N    extra comparisons      2^N/2
    1048575                                  -152
    1048576    20    1048576                     0
    1048577                                474739     524288
    2097151                                  -577
    2097152    21    2097152                     0
    2097153                               1310732    1048576

There, that’s much closer.

It’s good that the number of extra comparisons is 0 (zero) at list lengths of exactly 2^N. The bottom up algorithm’s main part finishes with exactly one non-nil element, at array[N], so the roll up does no further merging work.

The smaller “cliffs” of extra comparisons in the graph above occur at list lengths of 6321000, 10521000, 12601000 and 14681000 nodes. Looking at those lengths in binary explains why.

list length    as binary
    6321000     11000000111001101101000
   10521000    101000001000100110101000
   12601000    110000000100011010101000
   14681000    111000000000001110101000
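
Those binary forms are easy to reproduce with the standard library:

        package main

        import (
            "fmt"
            "strconv"
        )

        func main() {
            // print each list length alongside its base-2 representation
            for _, n := range []int64{6321000, 10521000, 12601000, 14681000} {
                fmt.Printf("%9d  %s\n", n, strconv.FormatInt(n, 2))
            }
        }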

In my incorrect conclusion, I noted that the way the main part of the bottom up algorithm filled in the array variable could be viewed as a binary representation of the length of the original, unsorted list. The binary representations of the list lengths above show that the bottom up algorithm’s array variable had a few slots pointing to long lists (the “11” or “101” or “111” prefixes of the binary numbers) and mostly nil slots in between - few smaller-length lists. The extra comparisons arise from a single node at the end of the original unsorted list getting merged into a long list.