SKEDSOFT

Data Mining & Data Warehousing

Introduction:

Recall the reason that we are interested in pre-computing data cubes: Data cubes facilitate fast on-line analytical processing (OLAP) in a multidimensional data space. However, a full data cube of high dimensionality needs massive storage space and unrealistic computation time.

Iceberg cubes provide a more feasible alternative, as we have seen wherein the iceberg condition is used to specify the computation of only a subset of the full cube’s cells. However, although an iceberg cube is smaller and requires less computation time than its corresponding full cube, it is not an ultimate solution. For one, the computation and storage of the iceberg cube can still be costly. For example, if the

Algorithm: Star-Cubing. Compute iceberg cubes by Star-Cubing.

Input:

R: a relational table

min support: minimum support threshold for the iceberg condition (taking count as the measure).

Output: The computed iceberg cube.

Method: Each star-tree corresponds to one cuboid tree node, and vice versa.

BEGIN

scan R twice, create star-table S and star-tree T;

output count of T.root;

call starcubing(T, T.root);

END

procedure starcubing(T, cnode)// cnode: current node

{

1.       for each non-null child C of T’s cuboid tree

2.       insert or aggregate cnode to the corresponding

3.       position or node in C’s star-tree;

4.       if (cnode.count  min support) then f

5.       if (cnode 6= root) then

6.       output cnode.count;

7.       if (cnode is a leaf) then

8.       output cnode.count;

9.       else { // initiate a new cuboid tree

10.   create CC as a child of T’s cuboid tree;

11.   let TC be CC’s star-tree;

12.   TC:root0s count = cnode:count;

13.   }

14.   }

15.   if (cnode is not a leaf) then

16.   starcubing(T, cnode.first child);

17.   if (CC is not null) then f

18.   starcubing(TC;TC:root);

19.   remove CC from T’s cuboid tree; g

20.   if (cnode has sibling) then

21.   starcubing(T, cnode.sibling);

22.   remove T;

23.   }