Table-level compression dictionaries
for tables that you enable for adaptive or classic row compression
can be built automatically or manually. Tables that you enable for
adaptive compression include page-level data dictionaries, which are
always automatically created.
Automatic dictionary creation
Starting with DB2® Version
9.5, a table-level compression dictionary is created automatically
if each of the following conditions is met:
- You set the COMPRESS attribute for the table by using the CREATE
TABLE or ALTER TABLE statement with the COMPRESS YES ADAPTIVE or COMPRESS
YES STATIC clause.
- A table-level compression dictionary does not already exist for
the table.
- The table contains sufficient data for constructing a dictionary
of repeated data.
Data that you move into the table after the dictionary is created
is compressed using the dictionary if compression remains enabled.
The
following diagram shows the process by which the compression dictionary
is automatically created:
The sequence
of events illustrated in the diagram is as follows:
- A compression dictionary is not yet created, because the table
is empty.
- Data is inserted into the table by using insert or load operations
and remains uncompressed.
- As more data is inserted or loaded into the table, the data remains
uncompressed.
- After a threshold is reached, dictionary creation is triggered
automatically if the COMPRESS attribute is set to YES ADAPTIVE or
YES STATIC.
- The dictionary is created.
- The dictionary is appended to the table.
- From this point forward,
table-level compression is enabled, and the rows that are later inserted
or added are compressed by the table-level compression dictionary.
Important: The rows that existed in a table before
the dictionary was created remain uncompressed unless you change them
or manually rebuild the dictionary.
If
you create a table with DB2 Version
9.7 or later and the table contains at least one column of type XML,
a second compression dictionary is created. This dictionary is used
to compress the XML data that is stored in the default XML storage
object that is associated with the table. Compression dictionary creation
for XML data occurs automatically if each of the following conditions
is met:
- You set the COMPRESS attribute on the table to YES ADAPTIVE or
YES STATIC.
- A compression dictionary does not exist within that XML storage
object.
- There is sufficient data in the XML storage object.
Restriction: You cannot compress
data in XML columns that you created with DB2 Version 9.5 or DB2 Version
9.1. However, you can compress inline XML columns that you add to
a table using DB2 Version 9.7
or later, provided the table was created without XML columns in an
earlier release of the product. If a table that you created in an
earlier release already has one or more XML columns and you want to
add a compressed XML column by using DB2 Version
9.7 or later, you must use the ADMIN_MOVE_TABLE stored procedure to
migrate the table before you can use compression.
The
mechanism for creating table-level compression dictionaries for temporary
tables is similar to the mechanism that is used for permanent tables. However,
the database manager automatically makes the determination whether
to use classic row compression for temporary tables, based on factors
such as query complexity and the size of the result set.
Manual dictionary creation
Although
dictionaries are created automatically when compression-enabled tables
grow to a sufficient size, you can also force a table-level compression
dictionary to be created if none exists by using the REORG
TABLE command with the RESETDICTIONARY parameter.
This command forces the creation of a compression dictionary if there
is at least one row of data in the table. Table reorganization is
an offline operation; one benefit of using automatic dictionary creation
is that the table remains online as the dictionary is built.
Instead
of using the REORG TABLE command to force the creation
of a new dictionary, you can also use the INSPECT command
with the ROWCOMPESTIMATE parameter. This command creates a
compression dictionary if the table does not already have one. The
advantage of this approach over performing a table reorganization
is that the table remains online. Rows that you add later are subject
to compression; however, rows that existed before you ran the INSPECT command
remain uncompressed until you perform a table reorganization. However, if compression is enabled, automatic dictionary
creation will usually take place shortly after you activate compression,
likely before you even have a chance to use the INSPECT command.
Resetting compression dictionaries
Whether
a table-level compression dictionary is created automatically or manually,
the dictionary is static; after it is built, it does not change. As
you add or update rows, they are compressed based on the data that
exists in the compression dictionary. For many situations, this behavior
is appropriate. Consider, for example, a table in a database that
is used for maintaining customer accounts for a city water utility.
Such a table might have columns such as STREET_ADDRESS, CITY, PROVINCE,
TELEPHONE_NUM, POSTAL_CODE, and ACCOUNT_TYPE. If a compression dictionary
is built with data from a such table, even if it is only a modestly
sized table, there is likely sufficient repetitive information for
classic row compression to yield significant space savings. Much of
the data might be common from customer to customer, for example, the
values of the CITY, POSTAL_CODE, or PROVINCE column or portions of
the value in the STREET_ADDRESS or TELEPHONE_NUM column.
However, other tables might change significantly
over time. Consider a table that is used for retail sales data as
follows:
- A master table is used to accumulate data on a month-by-month
basis.
- Each month, a new set of records is loaded into the table.
In this case, a compression dictionary created in, for example,
April might not reflect repeating data from sales in later parts of
the year. In situations where data in a table changes significantly
over time, you might want to reset your compression dictionaries by
using the
REORG TABLE command with the
RESETDICTIONARY parameter.
The advantage of resetting the compression dictionary is that data
from the entire table is considered when the dictionary is built.