1. Overview

In this tutorial, we’ll introduce a balanced Binary Search Tree. Moreover, we’ll learn about what is the height of a tree and show that in a balanced tree with n nodes it is O(log(n)).

We’ll also use Big-O notation to approximate the tree height. We assume having a basic knowledge of Binary and Binary Search Trees.

2. Tree Height

The height of a tree is the longest downward path from its root to any reachable leaf.

Let’s look at an example:

There is a height value in each node in the above tree. Notice that the longest path from the root node to the furthest leaf, colored in red, is 4. Hence the height of this tree is 4. To compute this, we can follow a simple algorithm.

We can traverse the tree with post-order traversal to compute the height of each node. At each algorithm step, we calculate current node’s height by adding 1 to max(h_{left}, h_{right}), where h_{left} and h_{right} are the height of the left and right subtree respectively. The height of a leaf is considered to be 0.

The time complexity of operations on Binary Search Trees (BST) are usually based on its height. Thus, a BST with n nodes can be built such that the search operation will take O(n) time. However, it’s considered to be inefficient, because BSTs can perform much better.

A BST is an ordered data structure. The search and delete operations cost O(h) time, where h is the height of the tree. The best case is when h = O(log(n)). However, the worst case is h = O(n). Further, we’ll see that in a balanced BST, h is always O(log(n)).

3. Balanced Binary Tree

A binary tree is balanced, if, for every node, the heights of its left and right children differ by at most 1. If a node doesn’t have any of the children, then the height of the absent children is -1. Let’s have a look at these two trees:

In the tree on the left, nodes of a height 2, marked in red, make this binary tree unbalanced. The height of their left child, which is absent, is  -1. However, the height of their right child is 1. The difference is |-1 - 1| = 2, which is greater than 1. However, we can reorder the nodes. The right tree is balanced, in case, for every node, the difference between its children’s height is at most 1.

The example of a balanced BST is a Red-Black-Tree. The Red-Black-Tree is self-balancing. Balanced BSTs are also implemented in several Java Collections. For instance, TreeMap and TreeSet in Java have a Red-Black-Tree as a backbone.

3.1. Height of a Balanced Binary Tree

Balanced BST allows keeping the operations efficient. In a balanced BST of \boldsymbol{n} nodes, we may be sure, that the search operation will take \boldsymbol{O(log(n))} time. Actually, the search operation would take O(h) time if h is the height of a Balanced BST. Moreover, we can prove that its height is O(log(n)).

4. Lower Bound of a Balanced Tree Size

To prove that h = O(log(n)), we can prove the inequality, which gives us the lower bound of the balanced binary tree size given a height. Let \boldsymbol{\Omega_{h}} be the minimum number of nodes in a tree of height \boldsymbol{h}.

For a balanced binary tree of height \boldsymbol{h}, the minimum number of nodes is greater, than \boldsymbol{2^{h / 2 - 1}}}: \boldsymbol{\Omega_{h} > 2^{h / 2 - 1}}. Let’s prove this statement. To do it, we’ll use mathematical induction on h.

4.1. Mathematical Induction Base Case

For the base induction case h = 1 a balanced binary tree of height 1 has at least 2 nodes. It has a root and either its left or right child is present. Thus, we have 2 > 2^{1/2 - 1}, which is correct.

Similarly, for the case h = 2 a balanced binary tree has at least 4 nodes. Let’s see an example:

We have 4 > 2^{2/2 - 1} = 1, which is also correct.

4.2. Inductive Step

Now, let’s prove the statement for the case h \geq 3. With the induction technique, we assume the statement \boldsymbol{\Omega_{h} > 2^{h / 2 - 1}} holds for every value in the range 1, 2, …, h – 1. Our task is to prove it holds for \boldsymbol{h}.

Below, we use a tree of h for the tree of height h.

So, a balanced binary tree of h with the minimum number of nodes \Omega_{h} has a root and two subtrees. Since it has the minimum number of nodes, one subtree has a height equal to h - 1. It’s true because we compute the height of the root as 1 + max(h_{left}, h_{right}). Therefore, one of the root’s subtrees must be of h - 1.

Let’s look at the picture below:

Since that tree has the minimum number of nodes, the other subtree has a height equal to h - 2. It can’t be shorter, because the entire binary tree is balanced.

As a result, we have \Omega_{h} = 1 + \Omega_{h - 1} + \Omega_{h - 2}. It’s also clear the smaller the height, the fewer nodes the tree has and \Omega_{h - 1} > \Omega_{h - 2}. Thus, \Omega_{h} = 1 + \Omega_{h - 1} + \Omega_{h - 2} > 1 + 2 * \Omega_{h - 2} > 2 * \Omega_{h - 2}. By the induction hypothesis \Omega_{h - 2} > 2^{(h - 2) / 2 - 1}. So, \Omega_{h} > 2 * \Omega_{h - 2} > 2 * 2^{(h - 2) / 2 - 1}. Finally, 2 * 2^{(h - 2) / 2 - 1} = 2^{1 + (h - 2)/ 2 - 1} = 2^{h/2 - 1} and \Omega_{h} > 2^{h/2 - 1}.

We proved that the minimum number of nodes \Omega_{h} in a balanced binary tree of h is always greater than 2^{h/2 - 1}. We’ll use this in the next section.

5. Balanced Tree Height with Big-O Notation

Now we’ll show the height \boldsymbol{h} of a balanced binary tree with \boldsymbol{n} nodes is \boldsymbol{O(log(n))}. Firstly, let’s show h < 2 * log(\Omega_{h}) + 2. Previously, we’ve proved \Omega_{h} > 2^{h/2 - 1}. Taking logarithm on both sides, we get log(\Omega_{h}) >h/2 - 1 or h / 2 < log(\Omega_{h}) + 1. Thus, h < 2 * log(\Omega_{h}) + 2.

Is important to remember, \Omega_{h} is the minimum number of nodes in a balanced binary tree of h. It means, \Omega_{h} \leq n, where n is the number of nodes of any balanced binary tree of h. Similarly,  log(\Omega_{h}) \leq log(n). So, h < 2 * log(\Omega_{h}) + 2 < 2 * log(n) + 2.

Finally, we’ve proved that h < 2 * log(n) + 2 for any balanced binary tree of h. Therefore, \boldsymbol{h} is \boldsymbol{O(log(n))}.

6. Conclusion

In this article, we’ve discussed balanced binary trees and proved that their height is O(log(n)). Moreover, we’ve learned the way of proving theorems with mathematical induction. Finally, we’ve noticed, that the time complexity of some operations on a binary tree is based on its height. Therefore, a balanced binary tree is more efficient.

guest
0 Comments
Inline Feedbacks
View all comments