IBM Support

Performance monitor data collection and reports using db2mon

Question & Answer


Question

How to interpret monitor data collected by db2mon CLP scripts?

Answer

db2mon is a set of CLP scripts that use Db2 lightweight in-memory 'mon_get' monitoring interfaces to collect monitoring data for a specific period. The mon_get interfaces return data that is incrementing from the time of database activation. To find activity during a particular period, each metric must be collected twice (at the beginning and at the end) and the difference calculated (using stored procedures).

Db2 11.1

Performance monitor data collection and reports using db2mon

db2mon is a set of CLP scripts that use Db2 lightweight in-memory 'mon_get' monitoring interfaces to collect monitoring data for a specific period. The mon_get interfaces return data that is incrementing from the time of database activation. To find activity during a particular period, each metric must be collected twice (at the beginning and at the end) and the difference is calculated (using stored procedures).

Note: It is advised that you collect data for a maximum of 300 seconds (5 minutes) to avoid wrapping some of the counters. For monitoring longer periods, collecting successive reports is preferred.

The following files are provided:

• db2mon.sh


• db2mon.sql (concatenated form of db2monBefore.sql, db2monInterval.sql, and db2monAfter.sql)
• db2mon_import.sql
• db2mon_export.sql
• db2mon_report.sql

Before running


The CLP scripts require a connection to the database. db2mon.sql also requires the database to have a user temporary tablespace. Many systems already have such a tablespace: if not, a new one can be created e.g.:

db2 create user temporary tablespace utemp

db2mon.sh creates and drops a user temporary tablespace.

Note: For db2mon.sql and db2mon_export.sql to properly collect monitoring data, monitoring must be enabled at the database configuration level. MON_ACT_METRICS and MON_REQ_METRICS must be set to at least BASE, which is the default value. MON_OBJ_METRICS defaults to EXTENDED which gives full monitor information on tables and indexes, and is ideal for db2mon.

Both the existence of a user temporary tablespace and the minimum required MON_x_METRICS settings are checked by db2mon itself in a prerequisites section at the top of the report.

Collecting and analyzing monitor data online

db2mon is run most commonly in online mode, where it runs to completion and produces a report on stdout on the system it is monitoring. Therefore, all the SQL to find the delta values in the metrics and produce the report runs will also be present in the report. db2mon.sh automatically connects to the database, creates a user temporary tablespace and a small private bufferpool (to minimize impact to the system), runs db2mon.sql, and then disconnects.

db2mon.sh MYDBNAME > db2mon.out

Many errors can be reported in db2mon.out when the script attempts to drop tables that don't yet exist. These errors can be ignored.


• Other types of errors may indicate that a user temporary tablespace doesn't exist, or that the script was generated for the wrong version of Db2.
• The report section begins after the text "STARTS", easily found by searching forward from the start of the output.
• The report can be wide so it is best to use a tool like less(1) or vim(1) where you can truncate lines if you like.
db2mon can be run online by using your current database connection, as follows.

db2 -tvf db2mon.sql > db2mon.out

Important: If you run on your current connection and interrupt db2mon.sql during execution (for example with Ctrl-C), CURRENT SCHEMA is likely to be set to SESSION, which could affect subsequent SQL executed on that connection. db2mon uses CURRENT SCHEMA to resolve its table references to the DGTTs used during online collection in db2mon.sql, or to the correct schema for permanent tables produced by db2mon_import.sql and db2mon_report.sql.

Using db2mon.sh also avoids any issue with CURRENT SCHEMA since it uses a separate, temporary connection.

Collecting and analyzing monitor data offline

Separate SQL files are provided for export, import and report generation. If the monitored system is performance sensitive, the performance data can be collected using the export script and transferred to another system for import and reporting.

Step 1

db2 –tvf db2mon_export.sql > db2mon_export.out

db2mon_export.sql exports the contents of all mon_get functions that db2mon uses to IXF files – twice – once for the ‘start’ data and once for the ‘end’. The overhead of this is low. These files can be transferred to another system for import into another Db2 database for analysis. Note: the ‘report’ can be run using any platform or version of Db2 – it does not have to match the level or architecture where the data was collected.

Step 2

On the analysis system, from the directory where the IXF files were restored / unpacked.

db2 –tvf db2mon_import.sql > db2mon_import.out

db2mon_import.sql uses the DB2 IMPORT utility to reconstitute the mon_get data back into tables in Db2 for analysis. IMPORT is used because the tables are created automatically (you don't have to reproduce a CREATE TABLE to match the source system's version of mon_get_workload, for example). IMPORT may be slower than LOAD but does not require the table definitions to be recorded separately.


The imported tables are permanent tables, not DGTTs. They are created in the current default schema and tablespace. The user must have the authority to create permanent tables and indexes. CURRENT SCHEMA can be set to a unique value to represent this set of monitoring data, allowing multiple data sets to stored.


Actual analysis and reporting are then done in the next and final step.

Step 3

db2 -tvf db2mon_report.sql > db2mon_report.out

The report that is generated by db2mon_report.sql is shorter than the output of db2mon.sql because it does not have all the DGTT declarations and delta calculations.

Note:Export provides ALL columns in almost all the monitoring table functions – not just the ones that are listed for querying in db2mon.pl. Exporting all columns can be helpful when reference to the original data is needed for analysis. Ad-hoc queries from the command line or altered versions of db2mon_report.sql can take advantage of the extra metrics when using the exported data set.

db2mon Report content

The report includes many useful queries, roughly grouped into sections.

• Point-in-time data (such as currently executing SQL/utilities, lock waits) are collected at the beginning and end of the interval.


• Cumulative data, measured over the whole interval.
– Database-level, tablespace-level, bufferpool-level, table-level, query-level, connection-level, and so on.
– Different deployment types - “regular” Db2 ESE, Db2 pureScale, Db2 with BLU, and so on.
– Environment information – database and instance configuration, registry variables, CPU counts and utilization, memory utilization

Examples of db2mon report output

1) Find the database throughput section near the beginning to identify work done by the system


======================================
Throughput metrics at database level
======================================

select min(ts_delta) ts_delta, member, decimal((sum(act_completed_total) / float(min(ts_delta))), 10, 1) as act

TS_DELTA    MEMBER ACT_PER_S    CMT_PER_S    RB_PER_S     DDLCK_PER_S  SEL_P_S      UID_P_S      ROWS_INS_P_S
----------- ------ ------------ ------------ ------------ ------------ ------------ ------------ ------------
      35        0    22629.7         2361.1          0.0          0.0      13089.6       9540.0       4364.0
      35        1    24331.0         2525.0          0.0          0.0      14064.1      10266.8       4638.2
      35        2    27331.5         2842.1          0.0          0.0      15804.4      11527.1       5204.6
      35        3    25674.2         2682.0          0.0          0.0      14859.5      10814.6       4878.8
 4 record(s) selected.




The "Throughput metrics" report shows four members, each at about 2400 tx/s or 25,000 SQL/s, plus metrics, which are measured over 35 seconds. The process of collecting data can sometimes increase the planned interval (for example, 30 seconds in this case) slightly.

Abbreviated names are common in the report.

• ACT_PER_S: Activities per second


• CMT_PER_S: Commits per second
• RB_PER_S: Rollbacks per second
• DDLCK_PER_S: Deadlocks per second
• SEL_P_S: Select statements per second
• UID_P_S: Update/Insert/Delete statements per second
• ROWS_INS_P_S: Rows inserted per second


2) Analysis of processing time for the database.

Spotting high compile times (over 15%) on a transactional system can be a sign of an application that is not following the best practice of using parameter markers and is instead using literals.

======================================================
Time breakdown at database level (wait + processing)
======================================================

select member, integer(sum(total_rqst_time)) as total_rqst_tm, decimal(sum(total_compile_time) / float(sum

MEMBER TOTAL_RQST_TM PCT_COMPILE PCT_SECTION PCT_SORT PCT_COL PCT_COL_SYNOP PCT_COMMIT PCT_RBACK PCT_CONN
------ ------------- ----------- ----------- -------- ------- ------------- ---------- --------- --------
    0       1035374        0.04       78.59     0.74    0.00          0.00       9.38      0.00     0.10
    1        438810        0.01       75.85     1.02    0.00          0.00      15.15      0.00     0.12
    2        492605        0.01       74.57     1.06    0.00          0.00      16.66      0.00     0.09
    3        482646        0.01       76.32     1.04    0.00          0.00      14.73      0.00     0.11
 4 record(s) selected.




Most time is spent executing sections (PCT_SECTION includes processing and wait times for SQL), with a chunk that is also spent in commit processing – which is considered normal.


3) Analysis of time spent waiting by the database. This report is often the most important top-level information on performance health.

==============================
Wait times at database level
==============================
select w.member, integer(sum(total_rqst_time)) as total_rqst_tm, integer(sum(total_wait_time)) as total_wait

MEMBER TOTAL_RQST_TM TOTAL_WAIT_TM PCT_RQST_WAIT PCT_LOCK PCT_GLB_LOCK PCT_LTCH PCT_LG_DSK PCT_RCLM PCT_CF
------ ------------- ------------- ------------- -------- ------------ -------- ---------- -------- -------
    0       1035374        732450         70.74     1.31         0.57     0.89       4.28     1.07   56.26
    1        438810        291659         66.46     2.29         1.51     0.57       8.03     1.91   46.21
    2        492605        320535         65.06     1.50         1.13     0.92       9.37     1.67   44.36
    3        482646        319718         66.24     1.39         0.86     1.05       7.74     1.87   46.23
 4 record(s) se1ected.


In this example, each member spends 70% of each request waiting (PCT_RQST_WAIT). The majority of this wait time is spent performing CF communication (PCT_CF), followed by log writes (PCT_LG_DST) and lock wait (PCT_LOCK).


Note: most of these report tables are wider than the space available – the report tables were trimmed down on the right for this document.


4) Identify the busiest statements.

Note: Various sections of the report shows statement data from the package cache – which means statement executions that finished execution during the monitoring interval. Statements still executing when monitoring finished are shown in the point-in-time section at the beginning of the report.

The top 100 statements by total activity time are listed. Different views are shown in tables, with focus on basic metrics, wait time, sorting, IO, etc.

Comparing the database-level wait time table with the information on the busiest statements can identify whether high wait times are caused by just a few statements or a system-wide issue affecting most/all statements. System-wide issues may be improved by configuration changes or changes in hardware. Statement performance may be improved by altering SQL, adding or removing indexes, refreshing statistics (e.g. RUNSTATS), etc.

======================================
Top SQL statements by execution time
======================================

select member, integer(num_exec_with_metrics) as num_exec, coord_stmt_exec_time, decimal(coord_stmt_exec_time / double(num_exec_with_metrics), 10, 2) as avg_coord_exec_time, decimal( (coord_stmt_exec_time / double(

MEMBER NUM_EXEC    COORD_STMT_EXEC_TIME AVG_COORD_EXEC_TIME PCT_COORD_STMT_EXEC_TIME TOTAL_CPU_TIME       AVG_CPU_TIME         PCT_WAIT_TIME AVG_SECT_TIME AVG_COL_TIME STMT_TEXT
------ ----------- -------------------- ------------------- ------------------------ -------------------- -------------------- ------------- ------------- ------------ ----------------------------------------------
    0       61729               301979                4.89                    15.09             29928189                  484         53.67          4.89         0.00 Se1ect Count(Distinct S_I_ID) from STOCK, ORDE
    3       69026               150782                2.18                     7.53             33255990                  481         40.93          2.18         0.00 Se1ect Count(Distinct S_I_ID) from STOCK, ORDE
    2       72832               148618                2.04                     7.42             34689087                  476         36.56          2.04         0.00 Se1ect Count(Distinct S_I_ID) from STOCK, ORDE
    1       64562               137670                2.13                     6.88             3121S907                  483         41.59          2.13         0.00 Se1ect Count(Distinct S_I_ID) from STOCK, ORDE
    0      124839               112563                0.90                     5.62              4468845                   35         89.65          0.90         0.00 Update STOCK set S_QUANTITY = ?, S_YTD = ?, S_
    0      124833               111863                0.89                     5.59              3686860                   29         90.54          0.89         0.00 Se1ect S_QUANTITY, S_DIST_01, S_DIST_02, S_DIS
    0      124836                58906                0.46                     2.91              2807094                   22         85.45          0.46         0.00 Insert into ORDER_LINE va1ues (?, ?, ?, ?, ?,
    0       11350                57011                5.02                     2.84              1997313                  175         93.18          5.02         0.00 Update ORDER_LINE set OL_DELIVERY_D = ? where
    2      147141                46613                0.31                     2.33              4830165                   32         84.75          0.31         0.00 Se1ect S_QUANTITY, S_DIST_01, S_DIST_02, S_DIS
    3      138286                45568                0.32                     2.27              4618735                   33         85.28          0.32         0.00 Se1ect S_QUANTITY, S_DIST_01, S_DIST_02, S_DIS
    3      138285                40548                0.29                     2.02              5767918                   41         83.84          0.29         0.00 Update STOCK set S_QUANTITY = ?, S_YTD = ?, S_
    2      147137                39975                0.27                     1.99              6031554                   40         82.78          0.27         0.00 Update STOCK set S_QUANTITY = ?, S_YTD = ?, S_
    1      131525                39818                0.30                     1.99              4457211                   33         86.20          0.30         0.00 Se1ect S_QUANTITY, S_DIST_01, S_DIST_02, S_DIS


==============================================================
Wait time breakdown for top SQL statements by execution time
==============================================================

select member, decimal((total_act_wait_time / double(total_act_time)) * 100, 5, 2) as pct_wait, decimal((log_disk_wait_time / double(total_act_time)) * 100, 5, 2) as pct_lg_dsk, decimal((log_buffer_wai

MEMBER PCT_WAIT PCT_LG_DSK PCT_LG_BUF PCT_LOCK PCT_GL_LK PCT_LTCH PCT_RCLM PCT_CF  PCT_PFTCH PCT_DIAG PCT_POOL_RD PCT_DIR_WR PCT_DIR_RD STMT_TEXT
------ -------- ---------- ---------- -------- --------- -------- -------- ------- --------- -------- ----------- ---------- ---------- -----------------------------------------------------------------
   0     53.67       0.00       0.00     2.37      0.36     2.34     0.00   47.55      0.00     0.00        1.40       0.00       0.00 Se1ect Count(Distinct S_I_ID) from STOCK, ORDER_LINE where (S_W_I
   3     40.93       0.00       0.00     2.29      1.02     2.57     0.00   30.97      0.00     0.00        5.06       0.00       0.00 Se1ect Count(Distinct S_I_ID) from STOCK, ORDER_LINE where (S_W_I
   2     36.56       0.00       0.00     1.42      0.65     1.88     0.00   29.37      0.00     0.00        3.87       0.00       0.00 Se1ect Count(Distinct S_I_ID) from STOCK, ORDER_LINE where (S_W_I
   1     41.59       0.00       0.00     5.04      3.21     1.05     0.00   31.96      0.00     0.00        3.52       0.00       0.00 Se1ect Count(Distinct S_I_ID) from STOCK, ORDER_LINE where (S_W_I
   0     89.65       0.16       0.00     0.00      0.00     0.07     0.35   89.06      0.00     0.00        0.00       0.00       0.00 Update STOCK set S_QUANTITY = ?, S_YTD = ?, S_ORDER_CNT = ?, S_RE
   0     90.54       0.00       0.00     2.32      2.29     0.00     3.59   75.30      0.00     0.00        9.27       0.00       0.00 Se1ect S_QUANTITY, S_DIST_01, S_DIST_02, S_DIST_03, S_DIST_04, S_
   0     85.45       4.61       0.00     0.00      0.00     0.71     3.77   75.19      0.00     0.00        0.49       0.00       0.00 Insert into ORDER_LINE va1ues (?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
   0     93.18       0.00       0.00     0.00      0.00     0.02     0.05   88.41      0.00     0.00        4.67       0.00       0.00 Update ORDER_LINE set OL_DELIVERY_D = ? where OL_W_ID = ? and OL_
   2     84.75       0.00       0.00     5.16      5.12     0.00     4.94   63.97      0.00     0.00       10.32       0.00       0.00 Se1ect S_QUANTITY, S_DIST_01, S_DIST_02, S_DIST_03, S_DIST_04, S_
   3     85.28       0.00       0.00     0.23      0.19     0.00     7.03   67.87      0.00     0.00       10.12       0.00       0.00 Se1ect S_QUANTITY, S_DIST_01, S_DIST_02, S_DIST_03, S_DIST_04, S_
   3     83.84       0.19       0.00     0.00      0.00     0.01     1.50   82.12      0.00     0.00        0.00       0.00       0.00 Update STOCK set S_QUANTITY = ?, S_YTD = ?, S_ORDER_CNT = ?, S_RE
   2     82.78       0.20       0.00     0.00      0.00     0.01     0.71   81.85      0.00     0.00        0.00       0.00       0.00 Update STOCK set S_QUANTITY = ?, S_YTD = ?, S_ORDER_CNT = ?, S_RE
   1     86.20       0.00       0.00     0.00      0.33     0.00     4.85   70.23      0.00     0.00       10.76       0.00       0.00 Se1ect S_QUANTITY, S_DIST_01, S_DIST_02, S_DIST_03, S_DIST_04, S_
   1     83.34       0.22       0.00     0.00      0.00     0.00     0.90   82.20      0.00     0.00        0.00       0.00       0.00 Update STOCK set S_QUANTITY = ?, S_YTD = ?, S_ORDER_CNT = ?, S_RE

Tip: The "Statement & Plan Identifiers" section shows the executable_id for each of the 100 statements, which you can use to query deeper into mon_get_pkg_cache_stmt, or get the explain plan.


db2 "call explain_from_section(x'<executable id>','M',NULL,0,'<current user>',?,?,?,?,?)"


"Top SQL statements by execution time, aggregated by PLANID" is similar to the "Top SQL statement by execution time". All statements that differ by literal values (and so have the same PLANID hash value) are aggregated to show the total costs. Useful when a transactional application is driving lots of SQL with literal values. Available in Db2 10.5 and later.




The following table shows buffer pool activity for the top statements.

==========================================================
IO statistics per stmt - top statements by execution time
==========================================================

select member, integer(num_exec_with_metrics) num_exec, decimal(pool_data_l_reads/double(num_exec_with_metrics), 16,1) as avg_d_lrd, decimal(pool_data_p_reads/double(num_exec_with_metrics), 10,1)

MEMBER NUM_EXEC    AVG_D_LRD          AVG_D_PRD    AVG_I_LRD          AVG_I_PRD    AVG_TD_PRD   AVG_TI_PRD   AVG_COL_LRD        AVG_COL_PRD  AVG_DIR_R_RQS AVG_DIR_W_RQS STMT_TEXT
------ ----------- ------------------ ------------ ------------------ ------------ ------------ ------------ ------------------ ------------ ------------- ------------- --------------------------
    0       61729                0.0          0.0              607.4          0.1          0.0          0.0                0.0          0.0           0.0           0.0 Se1ect Count(Distinct S_I_]
    3       69026                0.0          0.0              607.4          0.4          0.0          0.0                0.0          0.0           0.0           0.0 Se1ect Count(Distinct S_I_]
    2       72832                0.0          0.0              607.3          0.2          0.0          0.0                0.0          0.0           0.0           0.0 Se1ect Count(Distinct S_I_]
    1       64562                0.0          0.0              607.3          0.2          0.0          0.0                0.0          0.0           0.0           0.0 Se1ect Count(Distinct S_I_]
    0      124839                2.0          0.0                4.0          0.0          0.0          0.0                0.0          0.0           0.0           0.0 Update STOCK set S_QUANTITY
    0      124833                1.1          0.1                3.0          0.0          0.0          0.0                0.0          0.0           0.0           0.0 Se1ect S_QUANTITY, S_DIST_0
    0      124836                1.0          0.0                4.1          0.0          0.0          0.0                0.0          0.0           0.0           0.0 Insert into ORDER_LINE valu
    0       11350               20.3          0.4                4.1          0.0          0.0          0.0                0.0          0.0           0.0           0.0 Update ORDER_LINE set OL_DE
    2      147141                1.1          0.1                3.0          0.0          0.0          0.0                0.0          0.0           0.0           0.0 Se1ect S_QUANTITY, S_DIST_0
    3      138286                1.1          0.1                3.0          0.0          0.0          0.0                0.0          0.0           0.0           0.0 Se1ect S_QUANTITY, S_DIST_0
    3      138285                2.0          0.0                4.0          0.0          0.0          0.0                0.0          0.0           0.0           0.0 Update STOCK set S_QUANTITY     2      147137                2.0          0.0                4.0          0.0          0.0          0.0                0.0          0.0           0.0           0.0 Update STOCK set S_QUANTITY
   1      131525                1.1          0.1                3.0          0.0          0.0          0.0                0.0          0.0           0.0           0.0 Se1ect S_QUANTITY, S_DIST_0
    1      131523                2.0          0.0                4.0          0.0          0.0          0.0                0.0          0.0           0.0           0.0 Update STOCK set S_QUANTITY
    0      124837                0.0          0.0                2.0          0.0          0.0          0.0                0.0          0.0           0.0           0.0 Se1ect I_NAME, I_PRICE, I_D
    0       11343                0.0          0.0                4.1          0.0          0.0          0.0                0.0          0.0           0.0           0.0 Se1ect sum(OL_AMOUNT) from
    2      147135                1.0          0.0                4.1          0.0          0.0          0.0                0.0          0.0           0.0           0.0 Insert into ORDER_LINE valu
    3      138285                1.0          0.0                4.1          0.0          0.0          0.0                0.0          0.0           0.0           0.0 Insert into ORDER_LINE valu     1      131522                1.0          0.0                4.1          0.0          0.0          0.0                0.0          0.0           0.0           0.0 Insert into ORDER_LINE valu
    0       12533                1.0          0.0                6.0          0.0          0.0          0.0                0.0          0.0           0.0           0.0 Insert into ORDERS va1ues (
    2       13568               20.4          0.4                4.1          0.0          0
.0 0.0 0.0 0.0 0.0 0.0 Update ORDER_LINE set OL_DE




I/O information can indicate whether a statement is using an index or not. AVG_I_LRD is average index logical reads per execution. In this example, the first four entries (the same statement on each of four members) have high AVG_I_LRD values and zero AVG_D_LRD (average data logical reads) – implying an index-based plan. If an expensive statement has high data logical reads and low index logical reads, changes to the indexes may improve performance.


5) Examining Long Disk Performance

One or a few very busy statements may suggest a missing index or a bad plan. If that does not provide a solution, compare the highest wait times from the database and statement levels. This information should identify the largest contributors to wait time and then consult the appropriate report sections to provide more detailed information.

In db2mon, the individual wait time percentages (log disk wait time, for example) are calculated as a percent of total request time, not as a percent of total wait time. The latter approach (where everything adds up to 100% wait time) is what monreport.dbsummary does.

A high percentage of log disk wait time (PCT_LG_DSK in db2mon output) is common for transactional applications.:

==========================  
Database log write times  
==========================

select member, num_log_write_io, case when ts_delta > 0 then decimal( double(num_log_write_io) / ts_delta, 10, 4 ) else null end as
MEMBER NUM_LOG_WRITE_IO     LOG_WRITE_IO_PER_S LOG_WRITE_MB_PER_S LOG_WRITE_TIME       LOG_WRITE_TIME_PER_IO_MS NUM_LOG_BUFFER_FULL
------ -------------------- ------------------ ------------------ -------------------- ------------------------ --------------------
    0                57237          1788.6562             7.0000                25306                   0.4421                    0
    1                68286          2133.9375             8.3437                26345                   0.3858                    0
    2                69373          2167.9062             8.5312                26669                   0.3894                    0
    3                69286          2165.1875             8.4687                26741                   0.3859                    0

 4 record(s) selected.




In this example, log write time per I/O is 0.4 milliseconds and the percentage of log disk wait time is low. "Good" values will vary from system to system, but higher Log Wait percentage and write times above 4 milliseconds may suggest changes in storage configuration (cache, RAID, number of LUNs per filesystem, etc).




6) Buffer pool read time

High percentage bufferpool read time (PCT_POOL_R) is a common bottleneck, especially when the database is warming up (e.g. shortly after database activation or when a new application starts making connections). Average pool read times (AVG_READ_TIME) are reported at bufferpool and tablespace levels. Similarly, a high percentage for direct I/O time (for Large Objects) can be tracked in AVG_DRCT_READ_TIME and AVG_DRCT_WRITE_TIME.




Note: the abbreviations here include:

• POOL_DATA_L_READS: bufferpool data page logical reads (i.e. base table ORGANIZE BY ROW access)


• POOL_DATA_P_READS: bufferpool data page physical reads
• POOL_INDEX_L_READS: bufferpool index page logical reads (i.e. index access)
• POOL_INDEX_P_READS: bufferpool index page physical reads
• POOL_COL_L_READS: bufferpool column-organized data page logical reads (i.e. base table ORGANIZED BY COLUMN access)
• POOL_COL_P_READS: bufferpool column-organized data page physical reads

============================  
Bufferpool read statistics  
============================

select member, substr(bp_name,1,20) as bp_name, pool_data_l_reads, pool_data_p_reads, pool_index_l_reads, pool_index_p_reads, pool_col_l_reads, pool_col_p_reads, pool_read_time, case when

MEMBER BP_NAME              POOL_DATA_L_READS    POOL_DATA_P_READS    POOL_INDEX_L_READS   POOL_INDEX_P_READS   POOL_COL_L_READS     POOL_COL_P_READS     POOL_READ_TIME       AVG_READ_TIME
------ -------------------- -------------------- -------------------- -------------------- -------------------- -------------------- -------------------- -------------------- -------------
    2 IBMDEFAULTBP                      3037899                44159             45043859                27427                    0                    0                19619          0.27
    3 IBMDEFAULTBP                      2841993                41020             42586533                34728                    0                    0                21063          0.27
    1 IBMDEFAULTBP                      2696917                38268             40018451                23941                    0                    0                17030          0.27
    0 IBMDEFAULTBP                      2518978                38610             37512769                10298                    0                    0                28304          0.57

    4 record(s) selected.


===============================  
Disk read and write I/O times  
===============================

select member, substr(tbsp_name,1,20) as tbsp_name, (pool_data_p_reads + pool_index_p_reads+ pool_col_p_reads) as num_reads, case when ((pool_data_p_reads + pool_index_p_reads+ pool

MEMBER TBSP_NAME            NUM_READS            AVG_READ_TIME DIRECT_READ_REQS     AVG_DRCT_READ_TIME NUM_WRITES           AVG_WRITE_TIME DIRECT_WRITE_REQS    AVG_DRCT_WRITE_TIME
------ -------------------- -------------------- ------------- -------------------- ------------------ -------------------- -------------- -------------------- -------------------
    3 TBS_OM                              29235          0.27                    0                  -                    0              -                    0                   -
    2 TBS_OM                              22195          0.26                    0                  -                    0              -                    0                   -
    2 TBS_CST                             21312          0.27                    0                  -                    0              -                    0                   -
    3 TBS_CST                             19700          0.28                    0                  -                    0              -                    0                   -
    1 TBS_OM                              19201          0.26                    0                  -                    0              -                    0                   -
    1 TBS_CST                             18485          0.27                    0                  -                    0              -                    0                   -
    0 TBS_CST                             17705          0.56                    0                  -                    0              -                    0                   -
    2 TBS_STK                             16568          0.27                    0                  -                    0              -                    0                   -
    0 TBS_STK                             16127          0.59                    0                  -                    0              -                    0                   -
    3 TBS_STK                             15559          0.27                    0                  -                    0              -                    0                   -



7) Round-trip time for messages between the members and CFs

For reports that show high percentage CF wait time (PCT_CF), the following report shows performance data on messages exchanged between the Db2 member and the CFs.

===================================================================  
Round-trip CF command execution counts and average response times  ===================================================================

select member, id, cast(substr(cf_cmd_name,1,30) as varchar(30)) as cf_cmd_name, total_cf_requests, decimal( doub

MEMBER ID     CF_CMD_NAME                    TOTAL_CF_REQUESTS    AVG_CF_REQUEST_TIME_MICRO
------ ------ ------------------------------ -------------------- -------------------------
    2    128 SetLockState                                1016554                     60.11
    3    128 SetLockState                                 955295                     65.82
    1    128 SetLockState                                 905310                     62.04
    0    128 SetLockState                                 851689                    223.05
    3    128 ReadAndRegister                              682169                     88._8
    2    128 ReadAndRegister                              671832                     83.39
    1    128 ReadAndRegister                              662450                     84.51
    0    128 ReadAndRegister                              627922                    282.94
    2    128 SetLockStateMultiple                         413551                     70.50
    3    128 SetLockStateMultiple                         387168                     77.93
    1    128 SetLockStateMultiple                         367427                     73.74
    0    128 SetLockStateMultiple                         345954                    252.11




8) Reclaims of Index and Data pages

If reclaim wait time (PCT_RCLM) is above zero, the following data shows which table has the most reclaims, and whether the table or an index is being reclaimed:

===============================================  
Page reclaim metrics for index and data pages  
===============================================

select member, substr(tabschema,1,20) as tabschema, substr(tabname,1,40) as tabname, substr(objtype,1,10) as objtype, data_partition_id, iid, (page

MEMBER TABSCHEMA            TABNAME                                  OBJTYPE    DATA_PARTITION_ID IID    PAGE_RECLAIMS        RECLAIM_WAIT_TIME  
------ -------------------- ---------------------------------------- ---------- ----------------- ------ -------------------- --------------------
    0 DTW                  ORDERS                                   INDEX                      -      -                 4334                 3357
    2 DTW                  ORDER_LINE                               INDEX                      -      -                 3870                 3096
    2 DTW                  ORDERS                                   INDEX                      -      -                 3652                 2732
    3 DTW                  ORDERS                                   INDEX                      -      -                 3638                 2724
    1 DTW                  ORDER_LINE                               INDEX                      -      -                 3634                 2666
    3 DTW                  ORDER_LINE                               INDEX                      -      -                 3577                 2391
    1 DTW                  ORDERS                                   INDEX                      -      -                 3502                 2326
    0 DTW                  ORDER_LINE                               INDEX                      -      -                 4169                 2286
    0 DTW                  STOCK_501_750                            INDEX                      -      -                  165                  983
    0 DTW                  STOCK_751_1000                           INDEX                      -      -                  199                  952
    1 DTW                  NEW_ORDER                                INDEX                      -      -                  441                  880




This example shows highest reclaim activity on indexes on ORDERS, and reclaim wait time is low at the database level. High reclaims of index pages are the most common, and can be improved by using RANDOM indexes, CURRENT MEMBER partitioning, range partitioning or other approaches.




9) Reclaims of SMP pages

The following example shows reclaims of SMP pages, which are often due to heavy inserts into a table with a too-small tablespace extent size.

====================================  
Page reclaim metrics for SMP pages  
====================================

select member, substr(tabschema,1,20) as tabschema, substr(tabname,1,40) as tabname, substr(objtype,1,10) as objtype, data_partition_id, iid, (spacemapp

MEMBER TABSCHEMA            TABNAME                                  OBJTYPE    DATA_PARTITION_ID IID    SMP_PAGE_RECLAIMS    SMP_PAGE_RECLAIM_WAIT_TIME
------ -------------------- ---------------------------------------- ---------- ----------------- ------ -------------------- --------------------------
    3 DTW                  ORDER_LINE                               TABLE                      -      -                    9                        254
    2 DTW                  ORDER_LINE                               INDEX                      -      -                   57                        241
    0 DTW                  ORDER_LINE                               INDEX                      -      -                   90                        192
    0 DTW                  ORDER_LINE                               TABLE                      -      -                    9                        170
    3 DTW                  ORDER_LINE                               INDEX                      -      -                   67                        169
    1 DTW                  ORDER_LINE                               INDEX                      -      -                   57                         76
    1 DTW                  ORDER_LINE                               TABLE                      -      -                    3                         10
    2 DTW                  ORDER_LINE                               TABLE                      -      -                    4                          3
    3 DTW                  NEW_ORDER                                TABLE                      -      -                    2                          3
    1 DTW                  NEW_ORDER                                TABLE                      -      -                    1                          2

 10 record(s) se1ected.




10) Latch wait time

A high percentage (more than 15-20%) of latch wait time (PCT_LTCH) can result from various causes. The following example shows latch waits by type.

====================  Latch wait metrics  ====================

select member, substr(latch_name,1,60) as latch_name, total_extended_latch_wait_time as tot_ext_latch_wait_time_ms, total_extended_latch_waits

MEMBER LATCH_NAME                                                   TOT_EXT_LATCH_WAIT_TIME_MS TOT_EXT_LATCH_WAITS  TIME_PER_LATCH_WAIT_MS
------ ------------------------------------------------------------ -------------------------- -------------------- ----------------------
 0 SQLO_LT_SQLB_BPD_bpdLatch_SX                                                       7182                 4454                   1.61  
 3 SQLO_LT_SQLB_BPD_bpdLatch_SX                                                       4085                 1965                   2.07  
 2 SQLO_LT_SQLB_BPD_bpdLatch_SX                                                       3086                 2088                   1.47  
 1 SQLO_LT_SQLB_BPD_bpdLatch_SX                                                       1764                 1724                   1.02  
 0 SQLO_LT_SQLB_BPD_WARLatch                                                          1272                 1783                   0.71  
 2 SQLO_LT_SQLB_BPD_WARLatch                                                          1168                 2637                   0.44  
 3 SQLO_LT_SQLB_BPD_WARLatch                                                           706                 2110                   0.33  
 1 SQLO_LT_SQLB_BPD_WARLatch                                                           509                 1870                   0.27  
 0 SQLO_LT_SQLP_LHSH_hshlatch                                                          263                  758                   0.34  
 3 SQLO_LT_SQLI_INX_PAGE_CACHE_ipcLatch                                                240                   36                   6.66  
 2 SQLO_LT_SQLI_INX_PAGE_CACHE_ipcLatch                                                221                   45                   4.91  
 1 SQLO_LT_SQLI_INX_PAGE_CACHE_ipcLatch                                                193                   21                   9.19  
 0 SQLO_LT_SQLI_INX_PAGE_CACHE_ipcLatch                                                175                   25                   7.00




This example shows no significant latching issue. For scenarios with more significant latch waits:

• In Db2 pureScale, there is a strong correlation between latch wait time and page reclaim time. Even 2-5% of reclaim wait time can end up causing 20-40% of latch wait time in some scenarios. Examining the reclaims report may indicate the table or index which needs alteration, especially if both latch and reclaim wait time are high on the same SQL statements.


• Significant improvements in latching were made in Db2 v11.1. If the hottest latch looks like SQLB_HASH_BUCKET_GROUP_HEADER then upgrading to v11 may reduce the time spent holding this latch.
• The SQLB_BPD__bpdLatch type is associated with a (some) hot buffer pool page. Identify the table involved by looking at statements with high latch wait time. One approach to reducing latch contention is range partitioning the table and using partitioned indexes.

[{"Product":{"code":"SSEPGG","label":"Db2 for Linux, UNIX and Windows"},"Business Unit":{"code":"BU058","label":"IBM Infrastructure w\/TPS"},"Component":"High Availability - PureScale","Platform":[{"code":"PF002","label":"AIX"},{"code":"PF016","label":"Linux"}],"Version":"11.1","Edition":"","Line of Business":{"code":"LOB10","label":"Data and AI"}}]

Document Information

Modified date:
16 June 2018

UID

swg22014557