DB2 10.5 for Linux, UNIX, and Windows

Column compression dictionaries after a load or insert operation

The load utility analyzes the input data to determine the best encoding schemes for building column compression dictionaries.

Load operations can apply both column-level and page-level compression to all the input data because those dictionaries are built during the initial load operation.

If an insert operation is used instead of a load operation, automatic creation of the column compression dictionaries is not initiated until a threshold number of values is inserted. This design ensures that a large enough sample of values exists to build dictionaries that yield an adequate compression ratio. Values that are inserted before the column compression dictionaries are populated are not compressed by those dictionaries.

As of DB2® Cancun Release 10.5.0.4, page-level compression for insert operations is enabled by default. Such operations, including import and ingest operations, can take advantage of higher compression ratios and data clustering at the page level for improved insert performance. Values can be compressed by page-level compression dictionaries if the space savings from encoded values outweigh the cost of storing the page-level compression dictionaries in the data pages.

In general, you obtain a better compression ratio if you load a set of data rather than insert the same data. To help build the best dictionaries and get the maximum compression, use a good set of representative data during the initial load operation. This approach is preferable to loading a small initial subset of data that contains only a few distinct values and then appending the rest of the data by using multiple load operations.

If the compression ratio decreases significantly because new values are not being compressed, you can unload the table and then reload it to rebuild the dictionaries and help improve compression.