Clustering indexes

A clustering index determines how rows are physically ordered (clustered) in a table space. Clustering indexes provide significant performance advantages in some operations, particularly those that involve many records. Examples of operations that benefit from clustering indexes include grouping operations, ordering operations, and comparisons other than equal.

Any index, except for an expression-based index or an XML index, can be a clustering index. You can define only one clustering index on a table.

You can define a clustering index on a partitioned table space or on a segmented table space. On a partitioned table space, a clustering index can be a partitioning index or a secondary index. If a clustering index on a partitioned table is not a partitioning index, the rows are ordered in cluster sequence within each data partition instead of spanning partitions. (Prior to Version 8 of DB2® UDB for z/OS®, the partitioning index was required to be the clustering index.)

Restriction: Start of changeAn expression based index or an XML index cannot be a clustering index.End of change

When a table has a clustering index, an INSERT statement causes DB2 to insert the records as nearly as possible in the order of their index values. The first index that you define on the table serves implicitly as the clustering index unless you explicitly specify CLUSTER when you create or alter another index. For example, if you first define a unique index on the EMPNO column of the EMP table, DB2 inserts rows into the EMP table in the order of the employee identification number unless you explicitly define another index to be the clustering index.

Although a table can have several indexes, only one index can be a clustering index. If you do not define a clustering index for a table, DB2 recognizes the first index that is created on the table as the implicit clustering index when it orders data rows.

Tip:
  • Always define a clustering index. Otherwise, DB2 might not choose the key that you would prefer for the index.
  • Define the sequence of a clustering index to support high-volume processing of data.

You use the CLUSTER clause of the CREATE INDEX or ALTER INDEX statement to define a clustering index.

Begin general-use programming interface information.
Example: Assume that you often need to gather employee information by department. In the EMP table, you can create a clustering index on the DEPTNO column.
CREATE INDEX DEPT_IX
  ON EMP
    (DEPTNO ASC)
     CLUSTER;

As a result, all rows for the same department are probably close together. DB2 can generally access all the rows for that department in a single read. (Using a clustering index does not guarantee that all rows for the same department are stored on the same page. The actual storage of rows depends on the size of the rows, the number of rows, and the amount of available free space. Likewise, some pages may contain rows for more than one department.)

End general-use programming interface information.

The following figure shows a clustering index on the DEPT column of the EMP table; only a subset of the rows is shown.

Figure 1. A clustering index on the EMP table
Begin figure description. This figure shows a clustering index on the DEPT column of the EMP table; only a subset of the rows is shown. End figure description.

Suppose that you subsequently create a clustering index on the same table. In this case, DB2 identifies it as the clustering index but does not rearrange the data that is already in the table. The organization of the data remains as it was with the original nonclustering index that you created. However, when the REORG utility reorganizes the table space, DB2 clusters the data according to the sequence of the new clustering index. Therefore, if you know that you want a clustering index, you should define the clustering index before you load the table. If that is not possible, you must define the index and then reorganize the table. If you create or drop and re-create a clustering index after loading the table, those changes take effect after a subsequent reorganization.