Notes on Binary Search Trees
A binary search tree is a binary tree in which each node contains a key, all keys are unique (for simplicity) and for which the following criterion is satisfied:
For each node in the tree, all keys in the nodes of its left subtree must be less than the key in the node, and all keys in the nodes of its right subtree must be greater than the key in the node.
This is called the binary search tree property, and is a powerful and useful property for a tree to have. For example, it turns out (and this should be clear if you think about the property for a moment), that an inorder traversal of a binary search tree visits the nodes in ascending key order.
Here is one possibility for a suitable structure that we can use for a "nested" class in Java, if we limit our binary search trees to contain characters (i.e., our binary search tree class will not be generic):
private static class TreeNode { private char data; //The data in this node. private TreeNode left; //Reference to left subtree. private TreeNode right; //Reference to right subtree. /** * Constructor. * Creates a node containing the input character as data. * References to the left and right subnodes will be null by default. */ public TreeNode(char c) { data = c; } }
Rather than go to the trouble of creating an entire BinarySearchTree class, our demo program will just start with an empty tree (a private static root node variable set to null) and use whatever methods are required to build a tree on that root and perform the necessary tasks.
In implementing this operation we will
This method can be implemented recursively, like most of the others we want for our tree, but let's go with an iterative version, for which we have the following pseudocode:
if tree is empty Create a new node at the root, put the data into it and return while (true) //A potentially infinite loop but not really since you're //going to look down the tree and must come to the end if newItem < data in the current node if left child is null Create a new node in the left child, put the data into it and return else Move down to the root of the left subtree else //newItem > data in the current node (equality not possible) if right child is null Create a new node in the right child, put the data into it and return else Move down to the root of the right subtree
There are several things to consider in implementing this
operation:
Preorder, inorder and postorder we know all about, but what about
"levelorder, left to right"? Here's an idea for implementing that
option:
Rather than recursion, use an iterative approach that employs a queue as an auxiliary structure and whose pseudocode goes something like this:
Put the root node of the tree into the queue while the queue is not empty Remove the first item from the queue and display its value Put left child of node just removed from queue into queue Put right child of node just removed from queue into queue
Because in our (simplified) BST there is nothing in a node except the key value, a simple search operation like this is not of much practical use (for retrieving additional information associated with a key, for example), and serves only as a way of determining whether a particular value is, or is not, in the tree. As usual, the actual "retrieving" will be done by an auxiliary helper function. Here is some pseudocode for this function, which, you will note, bears a marked resemblance to the binary search algorithm):
if this node is empty found is false else if the value sought is < value in this node Retrieve the value sought from the left subtree else if the value sought is > value in this node Retrieve the value from the right subtree else found is true
if the tree is empty return 0 else return 1 + left-subtree count + right-subtree count
if the tree is empty return -1 else return 1 + max(left-subtree depth, right-subtree depth)This operation to find the height of the tree (or, equivalently, the depth of the tree, since both are the maximum length of any path from the root to a leaf) is a little more interesting. Note, in this connection, that the height of any node is one more than the maximum of the heights of the two subtrees of that node. In this case you will find it convenient to give an empty tree a "height" of -1, so that a tree with one node in it will have a height of 0.
This is a case in which, when writing the algorithm, it seems better to write the general part of the recursion first, and deduce from it what you want the stopping condition to be, since if you start by trying to deduce the stopping condition there may be a tendency to leap to the conclusion that the value should be zero, which is not what you want. This is a bit unusual, since writing recursive routines more often proceeds the other way.
As we know, a binary search tree may or may not be balanced, and in the worst case may be nothing more than a linked list with each node having an extra (redundant) link in it. However, it is always possible to convert a binary search tree that is not balanced into one that is. There are many different schemes for doing this, some of them very complicated and quite difficult to implement correctly.
At the highest level, as you know, the reason we want "balance" or "bushiness" in our binary search trees is to guarantee O(log n) search times. We present one simple (but not necessarily efficient) algorithm for achieving balance. Some others are mentioned briefly in the following section.
Here is some pseudocode for building a balanced binary search tree:
Place all values into a linear structure Sort values in the linear structure Insert middle value from linear structure into BST as root (at level 0) Insert middle values of two remaining linear sub-structures as children of the level 0 node (root) Insert the middle values of the four remaining sub-structures as children of the level 1 nodes Continue until all values have been inserted
Note that although this pseudocode is not stated in "standard" recursive form, you should see that it is clearly recursive and is best implemented that way.
Note that this algorithm involves placing all the values into an auxiliary container and performing a sort on them before beginning to build the BST. Once the tree is built, though, we have a balanced BST and can be assured of O(log n) searches, as long as there are no deletions from, or additional insertions into, the tree.
But suppose that, in addition to searching, we are also inserting into and deleting from, the tree. Then we can expect, over time, that the shape of the tree might well become "unbalanced" to a greater or lesser degree, which might, in turn, result in degradation in search times. How can we remedy this problem? That is, how can we re-balance a BST if necessary?
Well, it is, after all, a BST. So, at any time we can delete all the values and place them back into a vector. If we do this via an inorder traversal, the keys will automatically be in ascending order. Thus we won't have to sort, but we can then use the rest of the above algorithm to build a new BST which will again be balanced.