Compressing data by using the LOAD utility

You can use LOAD to compress data in a table space or partition that is defined with COMPRESS YES. LOAD first builds a compression dictionary. After the dictionary is built, the data is compressed as it is loaded. If a compression dictionary already exists, you can specify that LOAD not build a new one. Alternatively, you can run the REORG TABLESPACE utility to compress your data.

Before you begin

Your table space or partition in a partitioned table space must be defined with COMPRESS YES before you can compress the data by using LOAD.

About this task

LOAD creates a compression dictionary while records are loaded. The data is not compressed until the dictionary is built. To save processing costs, the initial LOAD does not go back to compress the records that were used to build the dictionary. After the dictionary is built, the rest of the data is compressed as it is loaded.

The number of records that are required to build a dictionary depends on the frequency of patterns in the data. For large data sets, a small percentage of the total number of rows is used to build the dictionary. For the best compression results, build a new dictionary whenever you load the data.

For partition-by-growth table spaces, the utility builds one dictionary and the same dictionary page is populated through all partitions.

For XML table spaces that are defined with COMPRESS YES, compression does not occur until the first time that the REORG TABLESPACE utility is run on that table space.

Procedure

To compress data by using the LOAD utility:

  • If a compression dictionary does not already exist or you want to build a new one, specify the REPLACE, RESUME NO, or RESUME YES options in the LOAD statement as follows:
    Start of change
    REPLACE
    For simple table spaces, you must specify LOAD REPLACE to build new compression dictionaries. If you specify RESUME YES or RESUME NO for a simple table space, LOAD keeps the existing dictionary if one exists. Alternatively, you can run REORG TABLESPACE.
    RESUME NO
    If you specify RESUME NO, the table space must be empty.
    RESUME YES
    If you specify RESUME YES and SHRLEVEL NONE is explicitly or implicitly specified, LOAD builds a dictionary if the table space is empty. If you specify RESUME YES SHRLEVEL CHANGE, LOAD builds a dictionary when the amount of data in the table space reaches a threshold that is determined by DB2.
    End of change
  • If you want to use an existing compression dictionary to compress the data, specify the KEEPDICTIONARY option in the LOAD statement.

    Alternatively, you can run REORG TABLESPACE with the KEEPDICTIONARY option to compress the data. REORG TABLESPACE with KEEPDICTIONARY is efficient, because the data is not decompressed in the process. However, REORG TABLESPACE with KEEPDICTIONARY does not generate a compression report. To get that information, run RUNSTATS to update the catalog statistics and then query the catalog columns yourself.

    One advantage of reusing an existing compression dictionary is that you can eliminate the utility processing time of building the dictionary. Consider specifying KEEPDICTIONARY in the LOAD statement in the following situations:

    • If you are satisfied with the compression that is obtained from an existing dictionary.
    • If the last dictionary was built by REORG TABLESPACE. The REORG TABLESPACE sampling method can yield more representative dictionaries than LOAD and can thus result in a better compression.
    • If the data is not changed significantly since the last dictionary was built.

    You can also specify KEEPDICTIONARY for specific partitions of a partitioned table space. In this case, each partition has its own dictionary.

    The following example LOAD statement reuses an existing compression dictionary:

    LOAD DATA
      REPLACE KEEPDICTIONARY
      INTO TABLE DSN8A10.DEPT
      ( DEPTNO    POSITION (1)     CHAR(3),
        DEPTNAME  POSITION (5)     VARCHAR,
        MGRNO     POSITION (37)    CHAR(6),
        ADMRDEPT  POSITION (44)    CHAR(3),
        LOCATION  POSITION (48)    CHAR(16) )
      ENFORCE NO
  • If you want to copy another compression dictionary from an existing partition into another empty partition, specify the COPYDICTIONARY option in the LOAD statement and use a dummy input data set. In this case, the data that is inserted into the partition is compressed.

What to do next

For XML table spaces, run REORG TABLESPACE on the table space.