Although loading data into column-organized tables
is very similar to loading data into row-organized tables,
you should be aware of the few exceptions and configure your database
to handle the additional resources that are required.
Before you begin
- You must have an established connection to the database or be
able to implicitly connect to the database into which you want to
load the data.
About this task
When data is being loaded into a column-organized table,
the first phase is the ANALYZE phase, which is unique to column-organized tables.
The column compression dictionary is built during the ANALYZE phase.
This phase is followed by the LOAD, BUILD, and DELETE phases. The
INDEX COPY phase applies to row-organized tables
only.
Restrictions
- Check the restrictions of the LOAD command
as they also apply to row-organized tables.
Procedure
To load data in column-organized tables:
- Set the blocknonlogged (block creation
of tables that allow non-logged activity) database configuration parameter
to NO before you load data into a column-organized table.
If this parameter is not set, the error message SQL2032N is returned.
- Ensure that the util_heap_sz (utility
heap size) database configuration parameter is set to at least 1,000,000
pages and AUTOMATIC to address the resource needs
of the LOAD command.
If the database server has at
least 128 GB of memory, set util_heap_sz 4,000,000
pages. If concurrent load utilities are running, increase the util_heap_sz value
to accommodate higher memory requirements. If memory is scarce, the util_heap_sz value
can be increased dynamically only when a load operation is running.
If
you set DB2_WORKLOAD to ANALYTICS before
the creation of your database, the util_heap_sz value
is automatically configured during database creation.
- Ensure that the path for load temporary files has sufficient
storage space (equivalent to the raw size of the input data) to build
the column compression dictionary.
If a column compression
dictionary must be built, the input data source (such as a pipe or
a socket) is processed twice. If the input source can be reopened,
it is read twice. If the input source cannot be reopened, its contents
are temporarily cached in the load temporary file directory.
The
default path for load temporary files is located under the instance
directory, or in a location that is specified by the TEMPFILES
PATH option on the LOAD command.
- Ensure that you have enough memory.
Memory
requirements temporarily increase when the column compression dictionary
is being created.
For optimal load performance, additional cache
memory is required to write column-organized data in extent-sized
amounts, rather than one page at a time, therefore reducing I/O costs.