DB2 10.5 for Linux, UNIX, and Windows

Statistics used from column group statistics on statistical views

In complex scenarios where statistics need to be gathered for a query, creating a statistical view or gathering column group statistics on a table is not enough. You might need to combine the two to collect column group statistics on a statistical view. This combination of statistical view and column group statistics can help the DB2® optimizer to generate better access plans.

The use of column groups statistics on statistical views usually occurs when additional correlation exists on top of a typical situation that statistical views would normally handle.

For example, consider the following query and statistical view:

select * from T1,T2 where T1.C1=T2.D1 and T1.C2=5 and T2.D3=10;
create view SV2 as (select * from T1,T2 where T1.C1=T2.D1);
alter view SV2 enable query optimization;
runstats on table db2.SV2 on all columns with distribution;

This query might run slowly and the cardinality estimate can be inaccurate. If you check the access plan, you might find that there is a strong correlation between T1.C2 and T2.D3 although the cardinality estimate has been adjusted by the statistical view. Therefore, the cardinality estimate is inaccurate.

To resolve this situation, you can collect column group statistics of the view SV2 by issuing the following command:

runstats on table db2.SV2 on all columns and columns((C2,D3)) with distribution;

These additional statistics help improve the cardinality estimates which might result in better access plans.

Note: Collecting distribution statistics for a list of columns in a statistical view is not supported.

Collecting column group statistics on statistical views can also be used to compute the number of distinct groupings, or the grouping key cardinality, for queries that require data to be grouped in a certain way. A grouping requirement can result from operations such as the GROUP BY or DISTINCT operations.

For example, consider the following query and statistical view:

select T1.C1, T1.C2 from T1,T2 where T1.C3=T2.C3 group by T1.C1, T1.C2;
create view SV2 as (select T1.C1, T1.C2 from T1,T2 where T1.C3=T2.C3);
alter view SV2 enable query optimization;

Collecting column group statistics on the statistical view covering the join predicate helps the optimizer estimate the grouping key cardinality more accurately. Issue the following command to collect the column group statistics:

runstats on table db2.SV2 on columns((C1,C2));