The following description is based on a classical algorithm called the three tape sort (because you can use it with three tape drives). At first we will concentrate on explaining how it works. Then we will mention some simple tricks to make it more efficient.
Now we merge the two lists, taking two nodes at a time, one from each
list. We compare each pair of nodes and add them to a third list
in the appropriate sequence. The resulting list is a series of
two-node chunks, each of which is sorted:
There is one node left over, without a partner. Such leftovers may
occur at any stage, and the logic must accommodate them.
Now we divide this list back into two separate lists. Again we march though the list, alternately transferring nodes to one list or the other. This time, however, we transfer two nodes at a time, keeping the chunks which we created in the previous merge sort:
Again we merge the two lists back into one. This time, instead of
merging two nodes at a time, we merge two pairs of nodes,
keeping each output foursome in order. It is important that the
match-merge logic respect the boundaries between successive pairs.
The resulting list is a series of four-node chunks, each one in order:
You can imagine the tape drives spinning as we continue with the next
stage, casting alternate foursomes into one list or the other:
Again we merge the lists. This time we merge groups of four into
groups of eight:
Once more we split into two lists:
After one more merge, our list is completely sorted:
We repeat this cycle as many times as we need to. The number of
passes is roughly the base-2 logarithm of the number of nodes. The
length of each pass is equal to the number of nodes. Hence the
running time is proportional to n * log n, where n is the number of
nodes.
If we already know how long the list is, we can divide by two and count off that many nodes. With this approach we don't even have to look at the latter half of the list until we start merging.
Even if we have to traverse the entire list to determine its length, and then traverse it again to the middle, we may reduce the total amount of pointer-juggling.
This approach reduces the number of times we visit each node. In a virtual memory system it may reduce the number of page faults.
If we had a fourth tape drive we could apply the same trick to tape files, thereby reducing the total amount of I/O. For that matter, if we had twelve tape drives, we could match-merge six files at a time, reducing the total I/O even further. For linked lists the extra complexity of merging multiple lists wouldn't buy us much.