Single table optimization

At run time, the optimizer chooses an optimal access method for a query by calculating an implementation cost based on the current state of the database. The optimizer uses two costs in its decision: an I/O cost and a CPU cost. The goal of the optimizer is to minimize both I/O and CPU cost.

Improved query optimization I/O cost estimates

The time it takes to perform an disk I/O operation can vary according to the connecting infrastructure, the external or internal nature of the media and media type, spinning disk or Solid State Disk. Consequently, the total I/O cost associated with a particular query access method may vary from system to system.

In order to more accurately estimate these costs, the optimizer considers the performance of each disk unit individually. It does this by measuring the time it takes for read operations to complete across a sample of pages across the disk. This analysis is done at each IPL for disks in the system and user ASPs and at vary-on time for independent ASPs. With this information and with the additional knowledge about how database objects are spread across various disk units, the optimizer can make a reasonable estimate about the time it takes to perform I/O against a given database object. This means that no matter where your data resides, and even as it moves around, the optimizer can choose the most efficient plan to execute your queries.

Optimizing Access to each table

The optimizer uses a general set of guidelines to choose the best method for accessing data in each table. The optimizer:

  • Determines the default filter factor for each predicate in the selection clause.
  • Determines the true filter factor of the predicates by key range estimate when the selection predicates match the index left-most keys, or by available column statistics.
  • Determines the cost of table scan processing if an index is not required.
  • Determines the cost of creating an index over a table if an index is required. This index is created by performing either a table scan or creating an index-from-index.
  • Determines the cost of using a sort routine or hashing method if appropriate.
  • Determines the cost of using existing indexes using Index Probe or Index Scan
    • Orders the indexes. For SQE, the indexes are ordered in general such that the indexes that access the smallest number of entries are examined first. For CQE, the indexes are ordered from mostly recently created to oldest.
    • For each index available, the optimizer does the following:
      • Determines if the index meets the selection criteria.
      • Determines the cost of using the index by estimating the number of I/Os and CPU needed to Index Probe or Index Scan, and possible Table Probes.
      • Compares the cost of using this index with the previous cost (current best).
      • Picks the cheaper one.
      • Continues to search for best index until the optimizer decides to look at no more indexes.

      SQE orders the indexes so that the best indexes are examined first. Once an index is found that is more expensive than the previously chosen best index, the search is ended.

      For CQE, the time limit controls how much time the optimizer spends choosing an implementation. The time limit is based on how much time was spent so far and the current best implementation cost found. The idea is to prevent the optimizer from spending more time optimizing the query than it takes to actually execute the query. Dynamic SQL queries are subject to the optimizer time restrictions. Static SQL query optimization time is not limited. For OPNQRYF, if you specify OPTALLAP(*YES), the optimization time is not limited.

      For small tables, the query optimizer spends little time in query optimization. For large tables, the query optimizer considers more indexes. For CQE, the optimizer generally considers five or six indexes for each table of a join before running out of optimization time. Because of this processing, it is normal for the optimizer to spend longer lengths of time analyzing queries against the tables.

  • Determines the cost of using a temporary bitmap
    • Order the indexes that can be used for bit mapping. In general the indexes that select the smallest number of entries are examined first.
    • Determine the cost of using this index for bit mapping and the cost of merging this bitmap with any previously generated bitmaps.
    • If the cost of this bitmap plan is cheaper than the previous bitmap plan, continue searching for bitmap plans.
  • After examining the possible methods of access the data for the table, the optimizer chooses the best plan from all the plans examined.