General Reference Material | The Shell Sort Algorithm

The Shell Sort Algorithm

This sort was invented by Donald Shell, and first appeared in a 1959 paper. Since then, much research has gone into the algorithm, and especially in the effort to determine good "gap sizes" for use in the algorithm, but a definitive analysis of the algorithm's complexity in all cases continues to elude investigators.

Shell Sort: Basic Idea

The idea of this sort is to divide the sequence to be sorted up into some number of subsequences, sort each of the subsequences using insertion sort, then divide the sequence again, this time into a smaller number of subsequences, sort these new subsequences again using insertion sort, and so on, until the number of subsequences being sorted is one. At this last step, then, we are, in effect, just using insertion sort to sort the entire sequence.

Shell Sort: Name, Input and Output

Name: SortByShell

Input:

A sequence of values, all of a given type, which is called s
Two positions in the sequence, to determine the range of values within the sequence that will be sorted, which are called firstPosition and lastPosition, and which satisfy firstPosition <= lastPosition

Output:

The sequence of values s, sorted in ascending order between firstPosition and lastPosition, inclusive

Shell Sort: Pseudocode

Before presenting the pseudocode of the Shell sort there is a question that arises quite naturally out of the "basic idea" of the sort given above, and we need to answer it: Just how do we choose the subsequences from the original sequence. There are many ways that we might choose to do this, but for simplicity we choose a particularly straightforward method. We begin by choosing a "gap size", which determines how far apart the values in a subsequence are located. For example, if the starting gap size is 6, the first subsequence would be values in positions 1, 7, 13, 19 and so on, while those in the second subsequence would be in positions 2, 8, 14, 20, and so on. The third sequence would start in position 3, the fourth in position 4, and so on, and in this case there would only be six subsequences, since by the time we got to element 7 as a starting value for the next subsequence we would be overlapping with the first subsequence. After a "pass of the outer loop" in the Shell sort, then, we will have sorted all the subsequences determined by the current gap size. For the next pass, we simply divide the gap size by 2 (via integer division, of course) and repeat the process. This ensures that eventually the gap size is reduced to one, the point at which we are sorting the entire sequence.

Thus because of the "intimate" way it uses insertion sort, the Shell sort pseudocode is especially short and simple:

Algorithm SortByShell(s, firstPosition, lastPosition)
-----------------------------------------------------
Set gap size to (size of sequence)/2
while gap size > 0
    Sort all subsequences determined by gap size by insertion sort
    Set gap size to (gap size) / 2
endwhile

Shell Sort: Performance

The Shell sort uses insertion sort in a very straightforward fashion, and insertion sort is a "well known" and long-studied sorting method whose complexity is also well known, and quite easily established, to be O(n²). Because of this, you might expect the complexity of the Shell sort to be easy to find, and therefore well known as well. Quite the opposite is the case. In fact, though considerable effort over thirty years or so has been expended in the study of this sort, no one yet knows just what the best possible sequence of gap sizes is, for large values of n. What investigation has been done suggests a complexity of something like O(n^3/2), which is somewhere between linear and quadratic.