Red-black Trees

red-black trees: A red-black tree is a binary search tree with one extra bit of storage per node: its color, which can be either RED or BLACK. By constraining the way nodes can be colored on any path from the root to a leaf, red-black trees ensure that no such path is more than twice as long as any other, so that the tree is approximately balanced.

Each node of the tree now contains the fields color, key, left, right, and p. If a child or the parent of a node does not exist, the corresponding pointer field of the node contains the value NIL. We shall regard these NIL's as being pointers to external nodes (leaves) of the binary search tree and the normal, key-bearing nodes as being internal nodes of the tree.

A binary search tree is a red-black tree if it satisfies the following red-black properties:

Every node is either red or black.
The root is black.
Every leaf (NIL) is black.
If a node is red, then both its children are black.
For each node, all paths from the node to descendant leaves contain the same number of black nodes.

Figure 13.1(a) shows an example of a red-black tree.

Figure 13.1: A red-black tree with black nodes darkened and red nodes shaded. Every node in a red-black tree is either red or black, the children of a red node are both black, and every simple path from a node to a descendant leaf contains the same number of black nodes. (a) Every leaf, shown as a NIL, is black. Each non-NIL node is marked with its black-height; NIL's have black-height 0. (b) The same red-black tree but with each NIL replaced by the single sentinel nil[T], which is always black, and with black-heights omitted. The root's parent is also the sentinel. (c) The same red-black tree but with leaves and the root's parent omitted entirely. We shall use this drawing style in the remainder of this chapter.

As a matter of convenience in dealing with boundary conditions in red-black tree code, we use a single sentinel to represent NIL (see page 206). For a red-black tree T, the sentinel nil[T] is an object with the same fields as an ordinary node in the tree. Its color field is BLACK, and its other fields-p, left, right, and key-can be set to arbitrary values. As Figure 13.1(b) shows, all pointers to NIL are replaced by pointers to the sentinel nil[T].
We use the sentinel so that we can treat a NIL child of a node x as an ordinary node whose parent is x. Although we instead could add a distinct sentinel node for each NIL in the tree, so that the parent of each NIL is well defined, that approach would waste space. Instead, we use the one sentinel nil[T] to represent all the NIL's-all leaves and the root's parent. The values of the fields p, left, right, and key of the sentinel are immaterial, although we may set them during the course of a procedure for our convenience.
We generally confine our interest to the internal nodes of a red-black tree, since they hold the key values. In the remainder of this chapter, we omit the leaves when we draw red-black trees, as shown in Figure 13.1(c).
We call the number of black nodes on any path from, but not including, a node x down to a leaf the black-height of the node, denoted bh(x). By property 5, the notion of black-height is well defined, since all descending paths from the node have the same number of black nodes. We define the black-height of a red-black tree to be the black-height of its root.

The following lemma shows why red-black trees make good search trees.

Lemma 13.1: A red-black tree with n internal nodes has height at most 2 lg(n 1).

Proof We start by showing that the subtree rooted at any node x contains at least 2^bh(x) - 1 internal nodes. We prove this claim by induction on the height of x. If the height of x is 0, then x must be a leaf (nil[T]), and the subtree rooted at x indeed contains at least 2^bh(x) - 1 = 2⁰ - 1 = 0 internal nodes. For the inductive step, consider a node x that has positive height and is an internal node with two children. Each child has a black-height of either bh(x) or bh(x) - 1, depending on whether its color is red or black, respectively. Since the height of a child of x is less than the height of x itself, we can apply the inductive hypothesis to conclude that each child has at least 2^bh(x)-1 -1 internal nodes. Thus, the subtree rooted at x contains at least (2^bh(x)-1 - 1) (2^bh(x)-1 - 1) 1 = 2^bh(x) - 1 internal nodes, which proves the claim.

To complete the proof of the lemma, let h be the height of the tree. According to property 4, at least half the nodes on any simple path from the root to a leaf, not including the root, must be black. Consequently, the black-height of the root must be at least h/2; thus,

n ≥ 2^h/2 - 1.

Moving the 1 to the left-hand side and taking logarithms on both sides yields lg(n 1) ≥ h/2, or h ≤ 2 lg(n 1).

SKEDSOFT