Dynamic Tables

Dynamic tables: In some applications, we do not know in advance how many objects will be stored in a table. We might allocate space for a table, only to find out later that it is not enough. The table must then be reallocated with a larger size, and all objects stored in the original table must be copied over into the new, larger table. Similarly, if many objects have been deleted from the table, it may be worthwhile to reallocate the table with a smaller size. In this section, we study this problem of dynamically expanding and contracting a table. Using amortized analysis, we shall show that the amortized cost of insertion and deletion is only O(1), even though the actual cost of an operation is large when it triggers an expansion or a contraction. Moreover, we shall see how to guarantee that the unused space in a dynamic table never exceeds a constant fraction of the total space.

We assume that the dynamic table supports the operations TABLE-INSERT and TABLE-DELETE. TABLE-INSERT inserts into the table an item that occupies a single slot, that is, a space for one item. Likewise, TABLE-DELETE can be thought of as removing an item from the table, thereby freeing a slot. The details of the data-structuring method used to organize the table are unimportant; we might use a stack, a heap, or a hash table. We might also use an array or collection of arrays to implement object storage, as we did in.

We shall find it convenient to use a concept introduced in our analysis of hashing. We define the load factor α (T) of a nonempty table T to be the number of items stored in the table divided by the size (number of slots) of the table. We assign an empty table (one with no items) size 0, and we define its load factor to be 1. If the load factor of a dynamic table is bounded below by a constant, the unused space in the table is never more than a constant fraction of the total amount of space.

We start by analyzing a dynamic table in which only insertions are performed. We then consider the more general case in which both insertions and deletions are allowed.

Table expansion: Let us assume that storage for a table is allocated as an array of slots. A table fills up when all slots have been used or, equivalently, when its load factor is 1. In some software environments, if an attempt is made to insert an item into a full table, there is no alternative but to abort with an error. We shall assume, however, that our software environment, like many modern ones, provides a memory-management system that can allocate and free blocks of storage on request. Thus, when an item is inserted into a full table, we can expand the table by allocating a new table with more slots than the old table had. Because we always need the table to reside in contiguous memory, we must allocate a new array for the larger table and then copy items from the old table into the new table.

A common heuristic is to allocate a new table that has twice as many slots as the old one. If only insertions are performed, the load factor of a table is always at least 1/2, and thus the amount of wasted space never exceeds half the total space in the table.

In the following pseudocode, we assume that T is an object representing the table. The field table[T] contains a pointer to the block of storage representing the table. The field num[T] contains the number of items in the table, and the field size[T] is the total number of slots in the table. Initially, the table is empty: num[T] = size[T ] = 0.

		TABLE-INSERT (T , x)
 1  if size[T ] = 0
 2      then allocate table[T] with 1 slot
 3           size[T]  ← 1
 4  if num[T] = size[T]
 5      then allocate new-table with 2 · size[T] slots
 6           insert all items in table[T] into new-table
 7           free table[T]
 8           table[T] → new-table
 9           size[T] → 2 · size[T]
10  insert x into table[T]
11  num[T] → num[T]   1

Notice that we have two "insertion" procedures here: the TABLE-INSERT procedure itself and the elementary insertion into a table in lines 6 and 10. We can analyze the running time of TABLE-INSERT in terms of the number of elementary insertions by assigning a cost of 1 to each elementary insertion. We assume that the actual running time of TABLE-INSERT is linear in the time to insert individual items, so that the overhead for allocating an initial table in line 2 is constant and the overhead for allocating and freeing storage in lines 5 and 7 is dominated by the cost of transferring items in line 6. We call the event in which the then clause in lines 5-9 is executed an expansion.

Let us analyze a sequence of n TABLE-INSERT operations on an initially empty table. What is the cost c_i of the ith operation? If there is room in the current table (or if this is the first operation), then c_i = 1, since we need only perform the one elementary insertion in line 10. If the current table is full, however, and an expansion occurs, then c_i = i: the cost is 1 for the elementary insertion in line 10 plus i - 1 for the items that must be copied from the old table to the new table in line 6. If n operations are performed, the worst-case cost of an operation is O(n), which leads to an upper bound of O(n²) on the total running time for n operations.

This bound is not tight, because the cost of expanding the table is not borne often in the course of n TABLE-INSERT operations. Specifically, the ith operation causes an expansion only when i - 1 is an exact power of 2. The amortized cost of an operation is in fact O(1), as we can show using aggregate analysis. The cost of the ith operation is

The total cost of n TABLE-INSERT operations is therefore

since there are at most n operations that cost 1 and the costs of the remaining operations form a geometric series. Since the total cost of n TABLE-INSERT operations is 3n, the amortized cost of a single operation is 3.

SKEDSOFT