Why we care about balanced trees
Why we care about balanced trees
Other than representing information with binary relationships, an important reason why we use binary trees is because, if we’ve organized information in just the right way, searching for the information is fast.  For example, suppose we have the following set of numbers: { 0, 1, 2, 3, 4, 5, 6, 7 }.
We could build the following tree:
0
 \
  1
   \
    2
     \
      3
       \
        4
         \
          5
           \
            6
             \
              7
While this is a binary tree, it’s not a very good one.  Computer scientists often like to ask: how much work do we need to do?  Suppose we want to find the node storing the number 7.  To find it, we’d have to do at least 7 traversal operations.  Is 7 == 0?  No. Is 7 < 0?  No.  Go right.  Is 7 == 1?  No.  Is 7 < 1?  No. Go right. And so on… until we get to the last node, which is equal to 7.
In fact, you might be thinking “this tree looks awfully like a list”, which you already know has undesirable worst-case properties for searching: O(n), where n is the size of the list.  Indeed, when a tree “degenerates” this way, that’s also the worst case for a tree.  The time it takes to find an arbitrary element in the worst case is O(n), because in that case, a tree is essentially a list.
Contrast the worst case with the following tree:
           4
         /   \
        2     6
       / \   / \
      1   3 5   7
     /
    0
For this tree, we have to do at most 3 traversal operations to find the deepest node, 0.  Clearly, if we have to store the numbers 0 through 7 in a tree, this is a better tree, because for any given node, we’ll spend less time looking than in the worst-case tree above.  In fact, while you may reorganize the nodes in this better tree (e.g., suppose 2 was at the root), there is no more “efficient” binary tree than this one.  How do I know?  Because it’s balanced.
Tree “height” and your little brother
We often refer to the length of the longest path from the root to any leaf in the tree as tree height. If this bothers you (because, let’s face it… the tree is upside-down), just flip the tree rightside-up in your head. You’ll get used to it eventually.
I’ve referred to height in a hand-wavy way up to this point, but let’s be a little more formal. The height of a tree can be defined recursively:
nodeHeight(tree) =
- 0if the tree is empty, or
- max(height(tree.left), height(tree.right)) + 1
This makes intuitive sense.  If your little brother pointed at nothing and asked you how tall it was, you’d either tell him that there’s something seriously wrong with him, or you’d realize that he’s probably a mathematician and you’d answer “zero, dummy.”  But if your same little brother pointed at an actual tree and asked you the same question, you’d break out your tape measure, and assuming that you weren’t afraid of heights, you’d measure the distance from the root to the tallest part of the tree.  That’s all the definition above for nodeHeight does.
Hang on, hang on, this is a totally different definition for height than the one you showed us in class.  Either I am confused or you are wrong.
Yes, I apologize. In fact, you are confused because I made a mistake. I am sorry.
Note that the above definition is for the height of the tree in terms of nodes. There’s another way to measure tree height: in terms of edges.
For example, the following tree:
       2
     /   \
    1     3
Has a height of 2 in terms of node height (the nodeHeight definition above).  But it has a height of 1 in terms of edge height.  The definition for edge height is the following:
edgeHeight(tree) =
- -1if the tree is empty, or
- max(height(tree.left), height(tree.right)) + 1
For edge height, think of the base case: the tree is empty.  It has a height of -1.  That’s pretty weird, right?  Fortunately, it all makes sense once you look at a non-empty tree.  Suppose you have the following tree:
2
In other words, you have a tree with just a root–no children.  What is the height of the left subtree?  By the edgeHeight definition, it is -1.  The same is true for the right subtree: -1.  So what is the height of the node 2?  max(-1,-1) + 1 = 0.  This make sense because it has no edges.
The nodeHeight definition counts a little differently, because it’s not counting edges, it’s counting nodes.  nodeHeight equals zero when you have a totally empty tree, in other words, no nodes.
I am telling you all this because the cute little theorem we worked out in class for the height of a tree in terms of , i.e.:
To store nodes,
uses the edge height definition. But the tree height code I showed you for the tree balance-checking algorithm used the node height definition.
In fact, the tree balance-checking algorithm can utilize either definition for tree height. Use whichever definition you like better. But remember that the minimum tree size formula is in terms of edge height.
I am going to leave both definitions on here for posterity.  But going forward, we’re going to use the edgeHeight definition.
Tree “balance” and your little brother’s annoying hobby
Suppose your little brother is really quite the avid tree enthusiast, and he cares deeply about your trees. He’s worried about insects eating the trees. He wants you to help him hang little insect traps on every tree branch. Sadly, you wore the wrong pants today, and your pocket can only hold one trap. So that means that you need to climb up the tree from the bottom for every branch on the tree in order to hang a trap. Grr. Little brothers.
There are two trees. They both have the same number of branches (edges). But one of the trees is taller. If you want to minimize your work, which tree do you climb? Let’s use the trees we mentioned before.
For the first tree, which grows really straight and tall, you need to climb, from the bottom, 8 times. But the amount of work is (climb 0 branches to get to 0) + (climb 1 branch to get to 1, climb down) + (climb 2 branches to get to 2, climb down) + (climb 3 branches to get to 3, climb down) + (climb 4 branches to get to 4, climb down) + (climb 5 branches to get to 5, climb down) + (climb 6 branches to get to 6, climb down) + (climb 7 branches to get to 7, climb down) = 28 branches to climb. Phew!
For the second tree, you also need to climb from the bottom 8 times. But this tree is a little more well-rounded. You need to (climb 0 branches to get to 0) + (climb 1 branch to get to 2) + (climb 1 branch to get to 6) + (climb 2 branches to get to 1) + (climb 2 branches to get to 3) + (climb 2 branches to get to 6) + (climb 2 branches to get to 7) + (climb 3 branches to get to 0) = 13 branches to climb. Your brother may be a mathematician, but as a budding computer scientist you definitely know how to count and you know that 13 < 28. You pick the second tree.
But as a budding computer scientist it also makes you wonder… is there a way to hang 8 insect traps and be able to climb fewer than 13 branches? In this insane, make-believe world, each branch splits into only two other branches of course (i.e., the tree is binary). So the answer is “no”, because the tree is “balanced.” It’s already as short as it can be. You aren’t just a lazy older sibling, you’re optimally lazy, which as a budding computer scientist makes you quite proud of yourself.
So how do we know this? We know because the second tree is balanced. And again, since we’ve talked up until this point about “balance” in a hand-wavy kind of way, let’s get a little more formal. You will not be surprised that there’s an elegant recursive definition for balance:
isBalanced(tree) = is true if and only if
- case 1: the tree is empty, or
- case 2:
    - isBalanced(tree.left)is- trueand
- isBalanced(tree.right)is- trueand
- | height(tree.left) - height(tree.right) | <= 1
 
Sure enough, we can walk through computing this for the second tree and see that it is balanced. Give it a try!
It turns out that the amount of time it takes to climb to a branch in the worst case is always if a tree is balanced. For big trees, is a lot smaller than . The overview in the Wikipedia article on self-balancing binary search trees has a nice, clear derivation of this logarithmic time complexity. Don’t worry– you are not required to learn self-balancing binary search trees just yet. You’ll likely learn how those work in your data structures course.
OK, OK, enough of this stupid story. Where’s the code?
Fine.  Here’s an example.  And since I can’t give you a binary tree implementation just yet, (because implementing one is a part of Assignment #7), suppose you have access to a BinaryTree<T> that can do the following operations.