You can run certain DB2®
utilities to collect access path statistics for your database objects. Accurate statistics are an
essential component of access path selection.
Before you begin
- Consider identifying statistics to collect by using a query optimization tool such as
IBM® Data Studio or IBM Data Server
Manager.
About this task
You can use DB2 online utilities to
collect statistics for database objects. The purpose of the RUNSTATS utility is to collect
statistics for database objects. However, you can also collect inline statistics when
you run certain other DB2 utilities.
Procedure
To collect statistics for database objects, use any of the following
approaches:
- Run the RUNSTATS utility. The RUNSTATS utility collects the most complete and accurate statistics.
- Specify the STATISTICS keyword to specify the collection of inline statistics when you
run one of the following utilities:
- LOAD
- REBUILD INDEX
- REORG INDEX
- REORG TABLESPACE
You might be able to avoid the cost of running the RUNSTATS utility by collecting
inline statistics. However, certain limitations apply to inline
statistics. For example:
- You cannot
collect column group statistics with the STATISTICS keyword. You must run the
RUNSTATS utility to collect column group statistics.
- You cannot use statistics profiles with inline
statistics.
For details of the limitations, see the information for each utility.
Important: Statistics
that are collected with inline statistics are likely to differ from statistics that are
collected by the RUNSTATS utility. Certain resources and values that RUNSTATS uses might
be unavailable in the context of a utility that collects inline statistics. Estimations
must be used in place of these missing values or uncertainties, and the resulting
statistics might be less exact. Consequently, you might need to evaluate whether the
inline statistics are suitable to support access path selection for your query
workload.
For example:
- If the DISCARDDN option is specified when you collect inline statistics with the
LOAD utility, the statistics are collected before the rows are discarded. If the
number of discarded rows is large enough, the inaccuracy of the resulting statistics
might be significant. As a general rule, if the number of discarded rows exceeds 20
percent of the total number of rows in the table, run the RUNSTATS separately, after
running the LOAD utility, to collect accurate statistics.
What to do next
Consider taking the following actions to standardize and automate
statistics collection: