Heaps
A heap is a complete binary tree, each of whose nodes contains a key which is greater than or equal to the key in each of its children. Actually, this is technically a "maximum heap"; if we replace "greater than or equal to" with "less than or equal to", we get the definition of a "minimum heap".
Note that this use of the term heap has absolutely nothing to do with the other meaning of the word heap, which in that context was another term for "free store". Two words that have the same spelling or pronunciation but different meanings are called homonyms. Here are some other examples in which the sound is again the same, but at least the spelling is different: so and sew; do, dew, due; grate and great.
Thus we may say that a heap satisfies two properties:
while not finished (while heap not empty) Remove root element and put it in its place Re-heap the remaining elementsThis is a O(n*log n) algorithm, and, unlike quicksort, it is guaranteed not to degenerate to O(n2). However, the algorithm begs a couple of questions:
In order to deal with this question we introduce a new way of representing binary trees.
Study the binary tree shown below, and the array (or vector) that immediately follows it and contains the same values:
Note that although this binary tree happens to be a heap, it could be any kind of binary tree. That is, we could use this kind of representation for any of our binary trees; it's just that it turns out to be particularly convenient for heaps. You should make the following observations:
The children of the value at index 0 are at indices 1 and 2. The children of the value at index 1 are at indices 3 and 4. The children of the value at index 2 are at indices 5 and 6. The children of the value at index 3 are at indices 7 and 8. ... and so on, or, in general ... The children of value at index i are at indices 2i+1 and 2i+2.
And, going the other way ...
The parent of the value at index k is at index (k-1)/2.
Now that we know what a heap is, and how we are going to represent it, it's time for the usual questions:
Though it seems natural to ask the above questions in the order given, it turns out to be convenient to answer them in the opposite order. These pictures illustrate the heap algorithms we need.
The way we perform deletion is to first overwrite the root value with the "most remote" value in the tree (last value in the vector). This retains the shape property, but destroys the order property. So, we have to move this value down through the tree by exchanging it with one of its children until we reach a point where the order property, and hence "heapness", has been restored. This process is called "re-heaping down", and its pseudocode looks like this:
Algorithm ReHeapDown -------------------- if currentNode is not a leaf Set maxChild to index of child of currentNode with larger value if value at currentNode < value at maxChild Swap value at currentNode with value at maxChild ReHeapDown starting at maxChild
Algorithm ReHeapUp ------------------ if value at currentNode > value at parent node Set parentNode to index of parent of currentNode Swap value at currentNode with value at parentNode ReHeapUp starting at parentNode
Algorithm BuildHeap ------------------- for each index from first non-leaf back up to root (in reverse-level-order order) ReHeapDown starting at that index