DB2 10.5 for Linux, UNIX, and Windows

Loading data into column-organized tables

Although loading data into column-organized tables is very similar to loading data into row-organized tables, you should be aware of the few exceptions and configure your database to handle the additional resources that are required.

Before you begin

You must have an established connection to the database or be able to implicitly connect to the database into which you want to load the data.

About this task

When data is being loaded into a column-organized table, the first phase is the ANALYZE phase, which is unique to column-organized tables. The column compression dictionary is built during the ANALYZE phase. This phase is followed by the LOAD, BUILD, and DELETE phases. The INDEX COPY phase applies to row-organized tables only.

Restrictions

Check the restrictions of the LOAD command as they also apply to row-organized tables.

Procedure

To load data in column-organized tables:

Set the blocknonlogged (block creation of tables that allow non-logged activity) database configuration parameter to NO before you load data into a column-organized table. If this parameter is not set, the error message SQL2032N is returned.
Ensure that the util_heap_sz (utility heap size) database configuration parameter is set to at least 1,000,000 pages and AUTOMATIC to address the resource needs of the LOAD command.
If the database server has at least 128 GB of memory, set util_heap_sz 4,000,000 pages. If concurrent load utilities are running, increase the util_heap_sz value to accommodate higher memory requirements. If memory is scarce, the util_heap_sz value can be increased dynamically only when a load operation is running.

If you set DB2_WORKLOAD to ANALYTICS before the creation of your database, the util_heap_sz value is automatically configured during database creation.
Ensure that the path for load temporary files has sufficient storage space (equivalent to the raw size of the input data) to build the column compression dictionary.
If a column compression dictionary must be built, the input data source (such as a pipe or a socket) is processed twice. If the input source can be reopened, it is read twice. If the input source cannot be reopened, its contents are temporarily cached in the load temporary file directory.

The default path for load temporary files is located under the instance directory, or in a location that is specified by the TEMPFILES PATH option on the LOAD command.
Ensure that you have enough memory.
Memory requirements temporarily increase when the column compression dictionary is being created.

For optimal load performance, additional cache memory is required to write column-organized data in extent-sized amounts, rather than one page at a time, therefore reducing I/O costs.