DB2 Version 9.7 for Linux, UNIX, and Windows

Data organization schemes in DB2 and Informix databases

Table partitioning is a data organization scheme in which table data is divided across multiple storage objects called data partitions or ranges according to values in one or more table columns. Each data partition is stored separately. These storage objects can be in different table spaces, in the same table space, or a combination of both.

Table data is partitioned as specified in the PARTITION BY clause of the CREATE TABLE statement. The columns used in this definition are referred to as the table partitioning key columns. DB2® table partitioning maps to the data fragmentation approach to data organization offered by Informix® Dynamic Server and Informix Extended Parallel Server.

The Informix approach

Informix supports several data organization schemes, which are called fragmentation in the Informix products. One of the more commonly used types of fragmentation is FRAGMENT BY EXPRESSION. This type of fragmentation works much like a CASE statement, where there is an expression associated with each fragment of the table. These expressions are checked in order to determine where to place a row.

An Informix and DB2 database system comparison

DB2 database provides a rich set of complementary features that map directly to the Informix data organization schemes, making it relatively easy for customers to convert from the Informix syntax to the DB2 syntax. The DB2 database manager handles complicated Informix schemes using a combination of generated columns and the PARTITION BY RANGE clause of the CREATE TABLE statement. Table 1 compares data organizations schemes used in Informix and DB2 database products.

Table 1. A mapping of all Informix and DB2 data organization schemes
Data organization scheme	Informix syntax	DB2 Version 9.1 syntax
Informix: expression-based DB2: table partitioning	FRAGMENT BY EXPRESSION	PARTITION BY RANGE
Informix: round-robin DB2: default	FRAGMENT BY ROUND ROBIN	No syntax: DB2 database manager automatically spreads data among containers
Informix: range distribution DB2: table partitioning	FRAGMENT BY RANGE	PARTITION BY RANGE
Informix: system defined-hash DB2: database partitioning	FRAGMENT BY HASH	DISTRIBUTE BY HASH
Informix: HYBRID DB2: database partitioning with table partitioning	FRAGMENT BY HYBRID	DISTRIBUTE BY HASH, PARTITION BY RANGE
Informix: n/a DB2: Multidimensional clustering	n/a	ORGANIZE BY DIMENSION

Examples

The following examples provide details on how to accomplish DB2 database equivalent outcomes for any Informix fragment by expression scheme.

Example 1: The following basic create table statement shows Informix fragmentation and the equivalent table partitioning syntax for a DB2 database system:

Informix syntax:

		CREATE TABLE demo(a INT) FRAGMENT BY EXPRESSION
			a = 1 IN db1,
			a = 2 IN db2,
			a = 3 IN db3;

DB2 syntax:

		CREATE TABLE demo(a INT) PARTITION BY RANGE(a)
			(STARTING(1) IN db1,
			STARTING(2) IN db2,
			STARTING(3) ENDING(3) IN db3);

Informix XPS supports a two-level fragmentation scheme known as hybrid where data is spread across co-servers with one expression and within the co-server with a second expression. This allows all co-servers to be active on a query (that is, there is data on all co-servers) as well as allowing the query to take advantage of data partition elimination.

The DB2 database system achieves the equivalent organization scheme to the Informix hybrid using a combination of the DISTRIBUTE BY and PARTITION BY clauses of the CREATE TABLE statement.

Example 2:The following example shows the syntax for the combined clauses:

Informix syntax

		CREATE TABLE demo(a INT, b INT) FRAGMENT BY HYBRID HASH(a)
			EXPRESSION	b = 1 IN dbsl1,
					b = 2 IN dbsl2;

DB2 syntax

		CREATE TABLE demo(a INT, b INT) IN dbsl1, dbsl2
			DISTRIBUTE BY HASH(a),
			PARTITION BY RANGE(b) (STARTING 1 ENDING 2 EVERY 1);

In addition, you can use multidimensional clustering to gain an extra level of data organization:

		CREATE TABLE demo(a INT, b INT, c INT) IN dbsl1, dbsl2
			DISTRIBUTE BY HASH(a),
			PARTITION BY RANGE(b) (STARTING 1 ENDING 2 EVERY 1)
			ORGANIZE BY DIMENSIONS(c);

Thus, all rows with the same value of column a are in the same database partition. All rows with the same value of column b are in the same table space. For a given value of a and b, all rows with the same value c are clustered together on disk. This approach is ideal for OLAP-type drill-down operations, because only one or several extents (blocks)in a single table space on a single database partition must be scanned to satisfy this type of query.

Table partitioning applied to common application problems

The following sections discuss how to apply the various features of DB2 table partitioning to common application problems. In each section, particular attention is given to best practices for mapping various Informix fragmentation schemes into equivalent DB2 table partitioning schemes.

Considerations for creating simple data partition ranges

One of the most common applications of table partitioning is to partition a large fact table based on a date key. If you need to create uniformly sized ranges of dates, consider using the automatically generated form of the CREATE TABLE syntax.

Examples

Example 1: The following example shows the automatically generated form of the syntax:

CREATE TABLE orders
(
	l_orderkey		DECIMAL(10,0) NOT NULL,
	l_partkey		INTEGER,
	l_suppkey		INTEGER,
	l_linenumber		INTEGER,
	l_quantity		DECIMAL(12,2),
	l_extendedprice		DECIMAL(12,2),
	l_discount		DECIMAL(12,2),
	l_tax			DECIMAL(12,2),
	l_returnflag		CHAR(1),
	l_linestatus		CHAR(1),
	l_shipdate		DATE,
	l_commitdate		DATE,
	l_receiptdate		DATE,
	l_shipinstruct		CHAR(25),
	l_shipmode		CHAR(10),
	l_comment		VARCHAR(44)) 
		PARTITION BY RANGE(l_shipdate)
		(STARTING '1/1/1992' ENDING '12/31/1993' EVERY 1 MONTH);

This creates 24 ranges, one for each month in 1992-1993. Attempting to insert a row with l_shipdate outside of that range results in an error.

Example 2: Compare the preceding example to the following Informix syntax:

create table orders
(
	l_orderkey		decimal(10,0) not null,
	l_partkey		integer,
	l_suppkey		integer,
	l_linenumber	integer,
	l_quantity		decimal(12,2),
	l_extendedprice	decimal(12,2),
	l_discount		decimal(12,2),
	l_tax			decimal(12,2),
	l_returnflag	char(1),
	l_linestatus	char(1),
	l_shipdate		date,
	l_commitdate	date,
	l_receiptdate	date,
	l_shipinstruct	char(25),
	l_shipmode		char(10),
	l_comment		varchar(44)
) fragment by expression
l_shipdate < '1992-02-01' in ldbs1,
l_shipdate >= '1992-02-01' and l_shipdate < '1992-03-01' in ldbs2,
l_shipdate >= '1992-03-01' and l_shipdate < '1992-04-01' in ldbs3,
l_shipdate >= '1992-04-01' and l_shipdate < '1992-05-01' in ldbs4,
l_shipdate >= '1992-05-01' and l_shipdate < '1992-06-01' in ldbs5,
l_shipdate >= '1992-06-01' and l_shipdate < '1992-07-01' in ldbs6,
l_shipdate >= '1992-07-01' and l_shipdate < '1992-08-01' in ldbs7,
l_shipdate >= '1992-08-01' and l_shipdate < '1992-09-01' in ldbs8,
l_shipdate >= '1992-09-01' and l_shipdate < '1992-10-01' in ldbs9,
l_shipdate >= '1992-10-01' and l_shipdate < '1992-11-01' in ldbs10,
l_shipdate >= '1992-11-01' and l_shipdate < '1992-12-01' in ldbs11,
l_shipdate >= '1992-12-01' and l_shipdate < '1993-01-01' in ldbs12,
l_shipdate >= '1993-01-01' and l_shipdate < '1993-02-01' in ldbs13,
l_shipdate >= '1993-02-01' and l_shipdate < '1993-03-01' in ldbs14,
l_shipdate >= '1993-03-01' and l_shipdate < '1993-04-01' in ldbs15,
l_shipdate >= '1993-04-01' and l_shipdate < '1993-05-01' in ldbs16,
l_shipdate >= '1993-05-01' and l_shipdate < '1993-06-01' in ldbs17,
l_shipdate >= '1993-06-01' and l_shipdate < '1993-07-01' in ldbs18,
l_shipdate >= '1993-07-01' and l_shipdate < '1993-08-01' in ldbs19,
l_shipdate >= '1993-08-01' and l_shipdate < '1993-09-01' in ldbs20,
l_shipdate >= '1993-09-01' and l_shipdate < '1993-10-01' in ldbs21,
l_shipdate >= '1993-10-01' and l_shipdate < '1993-11-01' in ldbs22,
l_shipdate >= '1993-11-01' and l_shipdate < '1993-12-01' in ldbs23,
l_shipdate >= '1993-12-01' and l_shipdate < '1994-01-01' in ldbs24,
l_shipdate >= '1994-01-01' in ldbs25;

Notice that the Informix syntax provides an open ended range at the top and bottom to catch dates that are not in the expected range. The DB2 syntax can be modified to match the Informix syntax by adding ranges that make use of MINVALUE and MAXVALUE.

Example 3: The following example modifies Example 1 to mirror the Informix syntax::

CREATE TABLE orders
(
	l_orderkey		DECIMAL(10,0) NOT NULL,
	l_partkey		INTEGER,
	l_suppkey		INTEGER,
	l_linenumber	INTEGER,
	l_quantity		DECIMAL(12,2),
	l_extendedprice	DECIMAL(12,2),
	l_discount		DECIMAL(12,2),
	l_tax			DECIMAL(12,2),
	l_returnflag	CHAR(1),
	l_linestatus	CHAR(1),
	l_shipdate		DATE,
	l_commitdate	DATE,
	l_receiptdate	DATE,
	l_shipinstruct	CHAR(25),
	l_shipmode		CHAR(10),
	l_comment		VARCHAR(44)
) PARTITION BY RANGE(l_shipdate)
	(STARTING MINVALUE,
	 STARTING '1/1/1992' ENDING '12/31/1993' EVERY 1 MONTH,
	 ENDING MAXVALUE);

This technique allows any date to be inserted into the table.

Partition by expression using generated columns

Although DB2 database does not directly support partitioning by expression, partitioning on a generated column is supported, making it possible to achieve the same result.

Consider the following usage guidelines before deciding whether to use this approach:

The generated column is a real column that occupies physical disk space. Tables that make use of a generated column can be slightly larger.
Altering the generated column expression for the column on which a partitioned table is partitioned is not supported. Attempting to do so will result in the message SQL0190. Adding a new data partition to a table that uses generated columns in the manner described in the next section generally requires you to alter the expression that defines the generated column. Altering the expression that defines a generated column is not currently supported.
There are limitations on when you can apply data partition elimination when a table uses generated columns.

Examples

Example 1: The following uses the Informix syntax, where it is appropriate to use generated columns. In this example, the column to be partitioned on holds Canadian provinces and territories. Because the list of provinces is unlikely to change, the generated column expression is unlikely to change.

	CREATE TABLE customer (
		cust_id		INT,
		cust_prov	CHAR(2))
		FRAGMENT BY EXPRESSION
			cust_prov = "AB" IN dbspace_ab
			cust_prov = "BC" IN dbspace_bc
			cust_prov = "MB" IN dbspace_mb
			...
			cust_prov = "YT" IN dbspace_yt
			REMAINDER IN dbspace_remainder;

Example 2: In this example, the DB2 table is partitioned using a generated column:

	CREATE TABLE customer (
		cust_id     INT,
		cust_prov   CHAR(2),
		cust_prov_gen GENERATED ALWAYS AS (CASE
			WHEN cust_prov = 'AB' THEN 1
			WHEN cust_prov = 'BC' THEN 2
			WHEN cust_prov = 'MB' THEN 3
			...
			WHEN cust_prov = 'YT' THEN 13
			ELSE 14 END))
		IN tbspace_ab, tbspace_bc, tbspace_mb, .... tbspace_remainder
		PARTITION BY RANGE (cust_prov_gen)
			(STARTING 1 ENDING 14 EVERY 1);

Here the expressions within the case statement match the corresponding expressions in the FRAGMENT BY EXPRESSION clause. The case statement maps each original expression to a number, which is stored in the generated column (cust_prov_gen in this example). This column is a real column stored on disk, so the table could occupy slightly more space than would be necessary if DB2 supported partition by expression directly. This example uses the short form of the syntax. Therefore, the table spaces in which to place the data partitions must be listed in the IN clause of the CREATE TABLE statement. Using the long form of the syntax requires a separate IN clause for each data partition.

Note: This technique can be applied to any FRAGMENT BY EXPRESSION clause.