DB2 10.5 for Linux, UNIX, and Windows

Space requirements for indexes

When designing indexes, you must be aware of their space requirements. For compressed indexes, the estimates you derive from the formulas in this topic can be used as an upper bound, however, it will likely be much smaller.

Space requirements for uncompressed indexes

For each uncompressed index, the space needed can be estimated as:

(average index key size + index key overhead) × number of rows × 2

where:

The average index key size is the byte count of each column in the index key. When estimating the average column size for VARCHAR and VARGRAPHIC columns, use an average of the current data size, plus two bytes.

The index key overhead depends on the type of table on which the index is created:

Table 1. Index key overhead for different tables
Type of table space	Table type	Index type	Index key overhead (bytes)
Any	Any	XML paths or regions	11
Regular	Nonpartitioned	Any	7
	Partitioned	Partitioned	9
	Partitioned	Nonpartitioned	11
Regular	Nonpartitioned	Any	9
	Partitioned	Partitioned	11
	Partitioned	Nonpartitioned	13

The number of rows is the number of rows in a table or the number of rows in a given data partition. Using the number of rows in the entire table in this calculation will give you an estimate the size for the index (for a nonpartitioned index) or for all index partitions combined (for a partitioned index). Using the number of rows in a data partition will give you an estimate of the size for the index partition.
The factor of "2" is for overhead, such as non-leaf pages and free space.

Note:

For every column that allows null values, add one extra byte for the null indicator.
For block indexes created internally for multidimensional clustering (MDC) or insert time clustering (ITC) tables, the "number of rows" would be replaced by the "number of blocks".
For every column that has a random ordering, add two extra bytes.

Space requirements for XML indexes

For each index on an XML column, the space needed can be estimated as:

(average index key + index key overhead) × number of indexed nodes × 2

where:

The average index key is the sum of the key parts that make up the index. The XML index is made up of several XML key parts plus a value (sql-data-type):
- 14 + variable overhead + byte count of sql-data-type
where:
- 14 represents the number of bytes of fixed overhead
- The variable overhead is the average depth of the indexed node plus 4 bytes.
- The byte count of sql-data-type follows the same rules as SQL.
The number of indexed nodes is the number of documents to be inserted multiplied by the number of nodes in a sample document that satisfy the XML pattern expression (XMLPATTERN) in the index definition. The number of indexed nodes could be the number of nodes in a partition or the entire table.

Temporary space requirements for index creation

Temporary space is required when creating the index. The maximum amount of temporary space required during index creation can be estimated as:

(average index key size + index key overhead) × number of rows × 3.2

For those indexes for which there could be more than one index key per row, such as spatial indexes, indexes on XML columns and internal XML regions indexes, the temporary space required can be estimated as:

(average index key size + index key overhead) × number of indexed nodes × 3.2

where the factor of "3.2" is for index overhead, and space required for sorting during index creation. The number of rows or the number of indexed nodes is the number in an entire table or in a given data partition.

Note: In the case of non-unique indexes, only one copy of a given duplicate key entry is stored on any given leaf node. For indexes on tables in LARGE table spaces the size for duplicate keys is 9 for nonpartitioned indexes, 7 for partitioned indexes and indexes on nonpartitioned tables. For indexes on tables in REGULAR table spaces these values are 7 for nonpartitioned indexes, 5 for partitioned indexes and indexes on nonpartitioned tables. The only exception to these rules are XML paths and XML regions indexes where the size of duplicate keys is always 7.The estimate shown previously assumes no duplicates. The space required to store an index might be over-estimated by the formula shown previously.

Temporary space is required when inserting if the number of index nodes exceeds 64 KB of data. The amount of temporary space can be estimated as:

average index key size × number of indexed nodes × 1.2

Estimating the number of keys per leaf page

The following two formulas can be used to estimate the number of keys per index leaf page (the second provides a more accurate estimate). The accuracy of these estimates depends largely on how well the averages reflect the actual data.

Note: For SMS table spaces, the minimum required space for leaf pages is three times the page size. For DMS table spaces, the minimum is an extent.

A rough estimate of the average number of keys per leaf page is:
- ((.9 * (U - (M×2))) × (D + 1)) ÷ (K + 7 + (Ds × D))
where:
- U, the usable space on a page, is approximately equal to the page size minus 100. For example, with a page size of 4096, U would be 3996.
- M = U ÷ (9 + minimumKeySize)
- Ds = duplicateKeySize (See the note under "Temporary space requirements for index creation".)
- D = average number of duplicates per key value
- K = averageKeySize
Note: The minimumKeySize and averageKeysize must include an additional:
- byte for each nullable key part.
- 2 bytes for the length of each variable length key part,
- 2 bytes for each column that is defined with a random ordering.
If there are include columns, they should be accounted for in minimumKeySize and averageKeySize.
The minimum key size is the sum of the key parts that make up the index:
- fixed overhead + variable overhead + byte count of sql-data-type
where:
- The fixed overhead is 13 bytes.
- The variable overhead is the minimum depth of the indexed node plus 4 bytes.
- The byte count of sql-data-type value follows the same rules as SQL.
The .9 can be replaced by any (100 - pctfree)/100 value, if a percent free value other than the default value of ten percent is specified during index creation.
A more accurate estimate of the average number of keys per leaf page is:
- number of leaf pages = x / (avg number of keys on leaf page)
where x is the total number of rows in the table or partition.
For the index on an XML column, x is the total number of indexed nodes in the column.
You can estimate the original size of an index as:
- (L + 2L/(average number of keys on leaf page)) × pagesize
For DMS table spaces, add the sizes of all indexes on a table and round up to a multiple of the extent size for the table space on which the index resides.

You should provide additional space for index growth due to INSERT/UPDATE activity, from which page splits might result.
Use the following calculation to obtain a more accurate estimate of the original index size, as well as an estimate of the number of levels in the index. (This might be of particular interest if include columns are being used in the index definition.) The average number of keys per non-leaf page is roughly:
- ((.9 × (U - (M × 2))) × (D + 1))÷(K + 13 + (9 * D))
where:
- U, the usable space on a page, is approximately equal to the page size minus 100. For a page size of 4096, U is 3996.
- D is the average number of duplicates per key value on non-leaf pages (this will be much smaller than on leaf pages, and you might want to simplify the calculation by setting the value to 0).
- M = U ÷ (9 + minimumKeySize for non-leaf pages)
- K = averageKeySize for non-leaf pages
The minimumKeySize and the averageKeySize for non-leaf pages will be the same as for leaf pages, except when there are include columns. Include columns are not stored on non-leaf pages.

You should not replace .9 with (100 - pctfree)÷100, unless this value is greater than .9, because a maximum of 10 percent free space will be left on non-leaf pages during index creation.
The number of non-leaf pages can be estimated as follows:
```
   if L > 1 then {P++; Z++}
   While (Y > 1)
   { 
      P = P + Y
      Y = Y / N
     Z++
   }
```
where:
- P is the number of pages (0 initially).
- L is the number of leaf pages.
- N is the number of keys for each non-leaf page.
- Y = L ÷ N
- Z is the number of levels in the index tree (1 initially).
Note: The previous calculation applies to single, nonpartitioned indexes, or to a single index partition for partitioned indexes.
Total number of pages is:
- T = (L + P + 2) × 1.0002
The additional 0.02% (1.0002) is for overhead, including space map pages.
The amount of space required to create the index is estimated as:
- T × page size