Query and application performance analysis

Performance analysis for queries and applications begins by answering some basic questions.

To improve query performance and application performance, you need to answer some basic questions to determine how well your queries and applications perform.

Are the catalog statistics up-to-date?

Keeping object statistics current is an important activity. DB2® needs those statistics to choose an optimal access path to data.

Start of changeDB2 utilities can collect statistics about DB2 objects. The RUNSTATS utility is designed for collecting statistics, but you can also collect inline statistics when you run certain other utilities, such as the LOAD, REORG INDEX, REORG TABLESPACE, and REBUILD INDEX utilities. These statistics are stored in the DB2 catalog. DB2 uses this information during the bind process to choose an access path. If you do not collect statistics and then rebind your packages or plans, DB2 does not have the information that it needs to choose the most efficient access path. Lack of statistical information can result in unnecessary I/O operations and excessive processor consumption. End of change

Recommendation: Collect statistics at least once for each table and its associated indexes. How often you rerun the utility depends on how current you need the catalog data to be. If data characteristics of a table vary significantly over time, you must keep the catalog current with those changes. Collecting statistics is most beneficial for the following objects:
  • Table spaces that contain frequently accessed tables
  • Tables that are involved in sort operations
  • Tables with many rows
  • Tables that are queried by SELECT statements that include many search arguments
  • Tables with indexes

Does the estimated filter factor match the filtering at run time?

DB2 uses estimated filter factors to choose efficient access paths. When the estimated and actual filtering differs, DB2 might choose an access path that performs poorly because the cost estimates are inaccurate. When the estimated and actual factors differ, you might collect more statistics or reoptimize statements at run time to improve access path selection.

Is the query coded as simply and efficiently as possible?

Ensure that the SQL query is coded as simply and efficiently as possible. Make sure that:

  • Unused columns are not selected.
  • No unneeded ORDER BY or GROUP BY clauses are in the query.
  • No unneeded predicates are in the query.

Are you using materialized query tables?

Define materialized query tables to improve the performance of dynamic queries that operate on large amounts of data and that involve multiple joins. DB2 generates the results of all or parts of the queries in advance and stores the results in materialized query tables. DB2 determines when using precomputed results is likely to optimize the performance of dynamic queries.

Is access through an index?

An index provides efficient access to data. DB2 uses different types of index scans, each of which affects performance differently. Sometimes DB2 can avoid a sort by using an index.

If a query is satisfied by using only the index, DB2 uses a method called index-only access.

  • For a SELECT operation, if all the columns that are needed for the query can be found in the index, DB2 does not need to access the table.
  • For an UPDATE or DELETE operation, an index-only scan can be performed to search for qualifying rows to update or delete. After the qualifying rows are identified, DB2 must retrieve those rows from the table space before they are updated or deleted.

Other types of index scans that DB2 might use are matching or nonmatching index scans.

  • In a matching index scan, the query uses predicates that match the index columns. Predicates provide filtering; DB2 needs to access only specific index and data pages.
  • In a nonmatching index scan, DB2 reads all index keys and their rows of data. This type of scan is less likely to provide an efficient access path than a matching index scan.

In addition to providing selective access to data, indexes can also order data, and sometimes they eliminate the need for sorting. You can avoid some sorts if index keys are in the order that is needed by ORDER BY, GROUP BY, a join operation, or DISTINCT in an aggregate function. When you want to prevent a sort, consider creating an index on the columns that are necessary to provide that ordering.

Is a table space scan used?

When index access is not possible, DB2 uses a table space scan. DB2 typically uses the sequential prefetch method to scan table spaces.

Begin general-use programming interface information.
Example: Assume that table T has no index on column C1. DB2 uses a table space scan in the following example:
SELECT * FROM T WHERE C1 = VALUE;

In this case, every row in table T must be examined to determine if the value of C1 matches the given value.

End general-use programming interface information.

A table space scan on a partitioned table space can be more efficient than a scan on a nonpartitioned table space. DB2 can take advantage of the partitions by limiting the scan of data in a partitioned table space to one or more partitions.

Are sorts performed?

Minimizing the need for DB2 to use sorts to process a query can result in better performance. In general, try to create indexes that match the predicates in your queries before trying to avoid sorts in your queries.
Related information:

Is data accessed or processed in parallel?

Parallel processing applies to read-only queries. DB2 can use parallel I/O and CPU operations to improve performance. For example, DB2 can use multiple parallel operations to access data from a table or index. The response time for data-intensive or processor-intensive queries can be reduced.

Are host variables used?

When you specify the bind option REOPT(VARS), DB2 determines the access paths at both bind time and run time for statements that contain one or more host variables, parameter markers, or special registers. At run time, DB2 uses the values in those variables to determine access paths.

DB2 spends extra time determining the access path for statements at run time. But if DB2 finds a better access path using the variable values, you might see an overall performance improvement.

For static SQL applications with host variables, if you specify REOPT(VARS), DB2 determines the access path at bind time and again at run time, using the values of input variables.

For static SQL applications with no host variables, DB2 determines the access path when you bind the plan or package. This situation yields the best performance because the access path is already determined when the program runs.

For applications that contain dynamic SQL statements with host variables, using REOPT(VARS) is the recommended approach for binding.

Are dynamic SQL statements used?

For dynamic SQL statements, DB2 determines the access path at run time, when the statement is prepared.

When an application performs a commit operation, it must issue another PREPARE statement if that SQL statement is to be executed again. For a SELECT statement, the ability to declare a cursor WITH HOLD provides some relief but requires that the cursor be open at the commit point. Using the WITH HOLD option also causes some locks to be held for any objects that the prepared statement depends on. Also, the WITH HOLD option offers no relief for SQL statements that are not SELECT statements.

You can use the dynamic statement cache to decrease the number of times that those dynamic statements must be prepared. Using the dynamic statement cache is useful when you execute the same SQL statement often.

DB2 can save prepared dynamic statements in a cache. The cache is a DB2-wide cache that all application processes can use to store and retrieve prepared dynamic statements. After an SQL statement is prepared and is automatically stored in the cache, subsequent prepare requests for that same SQL statement can use the statement in the cache to avoid the costly preparation process. Different threads, plans, or packages can share cached statements.

The SELECT, UPDATE, INSERT, and DELETE statements are eligible for caching.
Related information: