The runstats utility collects statistics on tables, indexes,
and statistical views to provide the optimizer with accurate information
for access plan selection.
Use the runstats utility to collect statistics in the following
situations:
- After data has been loaded into a table and appropriate indexes
have been created
- After creating a new index on a table
- After a table has been reorganized with the reorg utility
- After a table and its indexes have been significantly modified
through update, insert, or delete operations
- Before binding application programs whose performance is critical
- When you want to compare current and previous statistics
- When the prefetch value has been changed
- After executing the REDISTRIBUTE DATABASE PARTITION GROUP command
- When you have XML columns. When runstats is used to collect statistics
for XML columns only, existing statistics for non-XML columns that
were collected during a load operation or a previous runstats operation
are retained. If statistics on some XML columns were collected previously,
those statistics are either replaced or dropped if the current runstats
operation does not include those columns.
To improve runstats performance and save disk space used to store
statistics, consider specifying only those columns for which data
distribution statistics should be collected.
You should rebind application programs after executing runstats.
The query optimizer might choose different access plans if new statistics
are available.
If a full set of statistics cannot be collected at one time, use
the runstats utility on subsets of the objects. If inconsistencies
occur as a result of ongoing activity against those objects, a warning
message (SQL0437W, reason code 6) is returned during query optimization.
If this occurs, use runstats again to update the distribution statistics.
To ensure that index statistics are synchronized with the corresponding
table, collect both table and index statistics at the same time. If
a table has been modified extensively since the last time that statistics
were gathered, updating only the index statistics for that table will
leave the two sets of statistics out of synchronization with each
other.
Using the runstats utility on a production system might negatively
impact workload performance. The utility now supports a throttling
option that can be used to limit the performance impact of runstats
execution during high levels of database activity.
When you collect statistics for a table in a partitioned database
environment, runstats only operates on the database partition from
which the utility is executed. The results from this database partition
are extrapolated to the other database partitions. If this database
partition does not contain a required portion of the table, the request
is sent to the first database partition in the database partition
group that contains the required data.
Statistics for a statistical view are collected on all database
partitions containing base tables that are referenced by the view.
Consider the following tips to improve the efficiency of runstats
and the usefulness of the statistics:
- Collect statistics only for columns that are used to join tables
or for columns that are referenced in the WHERE, GROUP BY, or similar
clauses of queries. If the columns are indexed, you can specify these
columns with the ONLY ON KEY COLUMNS clause on the RUNSTATS command.
- Customize the values of the num_freqvalues and num_quantiles database
configuration parameters for specific tables and columns.
- Collect detailed index statistics with the SAMPLE DETAILED clause
to reduce the amount of background calculation that is performed for
detailed index statistics. The SAMPLE DETAILED clause reduces the
time that is required to collect statistics, and produces adequate
precision in most cases.
- When you create an index for a populated table, use the COLLECT
STATISTICS clause to create statistics as the index is created.
- When significant numbers of table rows are added or removed, or
if data in columns for which you collect statistics is updated, use
runstats again to update the statistics.
- Because runstats collects statistics on only a single database
partition, the statistics will be less accurate if the data is not
distributed consistently across all database partitions. If you suspect
that there is skewed data distribution, consider redistributing the
data across database partitions by using the REDISTRIBUTE
DATABASE PARTITION GROUP command before using the runstats
utility.
- For DB2® V9.7
Fix Pack 1 and later releases, distribution statistics can be collected
on an XML column. Distribution statistics are collected for each index
over XML data specified on the XML column. By default, a maximum of
250 quantiles are used for distribution statistics for each index
over XML data.
When collecting distribution statistics on an XML
column, you can change maximum number of quantiles. You can lower
the maximum number of quantiles to reduce the space requirements for
XML distribution statistics based on your particular data size, or
you can increase the maximum number of quantiles if 250 quantiles
is not sufficient to capture the distribution statistics of the data
set for an index over XML data.