DB2 Version 10.1 for Linux, UNIX, and Windows

Table-level compression dictionary creation

Table-level compression dictionaries for tables that you enable for adaptive or classic row compression can be built automatically or manually. Tables that you enable for adaptive compression include page-level data dictionaries, which are always automatically created.

Automatic dictionary creation

Starting with DB2® Version 9.5, a table-level compression dictionary is created automatically if each of the following conditions is met:
  • You set the COMPRESS attribute for the table by using the CREATE TABLE or ALTER TABLE statement with the COMPRESS YES ADAPTIVE or COMPRESS YES STATIC clause.
  • A table-level compression dictionary does not already exist for the table.
  • The table contains sufficient data for constructing a dictionary of repeated data.
Data that you move into the table after the dictionary is created is compressed using the dictionary if compression remains enabled.
The following diagram shows the process by which the compression dictionary is automatically created:
Automatic dictionary creation
The sequence of events illustrated in the diagram is as follows:
  1. A compression dictionary is not yet created, because the table is empty.
  2. Data is inserted into the table by using insert or load operations and remains uncompressed.
  3. As more data is inserted or loaded into the table, the data remains uncompressed.
  4. After a threshold is reached, dictionary creation is triggered automatically if the COMPRESS attribute is set to YES ADAPTIVE or YES STATIC.
  5. The dictionary is created.
  6. The dictionary is appended to the table.
  7. From this point forward, table-level compression is enabled, and the rows that are later inserted or added are compressed by the table-level compression dictionary.
Important: The rows that existed in a table before the dictionary was created remain uncompressed unless you change them or manually rebuild the dictionary.
If you create a table with DB2 Version 9.7 or later and the table contains at least one column of type XML, a second compression dictionary is created. This dictionary is used to compress the XML data that is stored in the default XML storage object that is associated with the table. Compression dictionary creation for XML data occurs automatically if each of the following conditions is met:
  • You set the COMPRESS attribute on the table to YES ADAPTIVE or YES STATIC.
  • A compression dictionary does not exist within that XML storage object.
  • There is sufficient data in the XML storage object.
Restriction: You cannot compress data in XML columns that you created with DB2 Version 9.5 or DB2 Version 9.1. However, you can compress inline XML columns that you add to a table using DB2 Version 9.7 or later, provided the table was created without XML columns in an earlier release of the product. If a table that you created in an earlier release already has one or more XML columns and you want to add a compressed XML column by using DB2 Version 9.7 or later, you must use the ADMIN_MOVE_TABLE stored procedure to migrate the table before you can use compression.

The mechanism for creating table-level compression dictionaries for temporary tables is similar to the mechanism that is used for permanent tables. However, the database manager automatically makes the determination whether to use classic row compression for temporary tables, based on factors such as query complexity and the size of the result set.

Manual dictionary creation

Although dictionaries are created automatically when compression-enabled tables grow to a sufficient size, you can also force a table-level compression dictionary to be created if none exists by using the REORG TABLE command with the RESETDICTIONARY parameter. This command forces the creation of a compression dictionary if there is at least one row of data in the table. Table reorganization is an offline operation; one benefit of using automatic dictionary creation is that the table remains online as the dictionary is built.

Instead of using the REORG TABLE command to force the creation of a new dictionary, you can also use the INSPECT command with the ROWCOMPESTIMATE parameter. This command creates a compression dictionary if the table does not already have one. The advantage of this approach over performing a table reorganization is that the table remains online. Rows that you add later are subject to compression; however, rows that existed before you ran the INSPECT command remain uncompressed until you perform a table reorganization. However, if compression is enabled, automatic dictionary creation will usually take place shortly after you activate compression, likely before you even have a chance to use the INSPECT command.

Resetting compression dictionaries

Whether a table-level compression dictionary is created automatically or manually, the dictionary is static; after it is built, it does not change. As you add or update rows, they are compressed based on the data that exists in the compression dictionary. For many situations, this behavior is appropriate. Consider, for example, a table in a database that is used for maintaining customer accounts for a city water utility. Such a table might have columns such as STREET_ADDRESS, CITY, PROVINCE, TELEPHONE_NUM, POSTAL_CODE, and ACCOUNT_TYPE. If a compression dictionary is built with data from a such table, even if it is only a modestly sized table, there is likely sufficient repetitive information for classic row compression to yield significant space savings. Much of the data might be common from customer to customer, for example, the values of the CITY, POSTAL_CODE, or PROVINCE column or portions of the value in the STREET_ADDRESS or TELEPHONE_NUM column.

However, other tables might change significantly over time. Consider a table that is used for retail sales data as follows:
  • A master table is used to accumulate data on a month-by-month basis.
  • Each month, a new set of records is loaded into the table.
In this case, a compression dictionary created in, for example, April might not reflect repeating data from sales in later parts of the year. In situations where data in a table changes significantly over time, you might want to reset your compression dictionaries by using the REORG TABLE command with the RESETDICTIONARY parameter. The advantage of resetting the compression dictionary is that data from the entire table is considered when the dictionary is built.