Sorting a Linked List

One of the favorite pastimes of computer scientists is to devise sorting algorithms. Any good textbook will describe several popular algorithms: bubble sort, Shell sort, heapsort, quicksort, and more.

However, most texts describe these algorithms in terms of arrays. It is not always obvious how to apply them to a linked list. In addition, some algorithms (such as quicksort) are recursive, and hence awkward to implement in COBOL.

These pages describe two approaches which are well suited to linked lists in COBOL. The insertion sort is easier to implement; the merge sort is more complicated but also more efficient for long lists.

Insertion Sort

The simplest way to sort a linked list is to build it sorted in the first place. Add the nodes one at a time. As you add each one, scan the list to find the right position. Once you find the node which should follow the new node, you have to back up to find the node which should precede the new node.

With doubly linked lists it's easy to back up. With singly-linked lists it's a bit more awkward. You can start over from the beginning, searching for the node which points to the node you already found. However, it's easy to avoid this second pass. On the first pass, maintain two pointers: one pointing to the node you're comparing to the new node, and another pointing to the node just behind that one.

If the list already exists in an unsorted state, then temporarily transfer it all to a second list -- you can perform this transfer just by exchanging a few pointers. Then build it back, one node at a time.

For an insertion sort, the number of key comparisons is roughly proportional to the square of the number of nodes. For a modest number of nodes, that's probably as efficient as it needs to be. For a large number of nodes, consider using a more efficient algorithm such as a merge sort.

Merge Sort

A merge sort uses the same kind of match-merge logic we have come to know and love in ordinary batch processing. You've probably coded it at least a dozen times. You read two files sorted by the same key, at each step reading whichever file is behind the other one, or both if they match. When you reach end-of-file on one file, you keep reading the other one.

In ordinary batch processing, of course, both files have to start out already sorted. In a merge sort we start from scratch. We slosh the data back and forth among several lists. At each pass, we merge small chunks of sorted data into larger chunks. When there's only one chunk left, we're done. The number of comparisons is proportional to n * log n, where n is the number of nodes.

This approach is described in greater detail elsewhere.

Other Approaches


[trap]Pointer Traps [crane]Data Structures [books]Further Reading
[home]COBOL Home