DB2 Version 10.1 for Linux, UNIX, and Windows

Histograms in workload management

A histogram is a collection of bins, which are containers for collecting discrete ranges of data. Histograms are useful for a variety of workload analysis and performance-tuning tasks.

DB2® workload management histograms have a fixed number of 41 bins. The 40th bin contains the highest defined value for the histogram, and the 41st bin is for values that are beyond the highest defined value. Each bin represents a specific range of values, and the bin ranges follow a logarithmic scale, with each bin representing a progressively larger range as one progresses from the 1st to 40th bin. The following figure shows a histogram of activity lifetimes that are plotted using a bar chart.
Figure 1. Histogram of activity lifetimes that are plotted using a bar chart
Histogram of activity lifetimes that are plotted using a bar chart
The activity lifetime histogram corresponds to the following data. Each count represents the number of activities whose lifetimes (in milliseconds) are within the range of the low bin value to the high bin value. For example, 156 activities had a lifetime in the range of 68 milliseconds to 103 milliseconds.
 Low Bin  High Bin Count
       0         1     0
       1         2     0
       2         3     0
       3         5     0
       5         8     0
       8        12     0
      12        19     0
      19        29    10
      29        44    15
      44        68    45
      68       103   156
     103       158    65
     158       241    23
     241       369     0
     369       562     0
     562       858     0
     858      1309     0
    1309      1997     0
    1997      3046     0
    3046      4647     0
    4647      7089     0
    7089     10813     0
   10813     16493     0
   16493     25157     0
   25157     38373     0
   38373     58532     0
   58532     89280     0
   89280    136181     0
  136181    207720     0
  207720    316840     0
  316840    483283     3
  483283    737162     0
  737162   1124409     0
 1124409   1715085     0
 1715085   2616055     0
 2616055   3990325     0
 3990325   6086529     0
 6086529   9283913     0
 9283913  14160950     0
14160950  21600000     0
21600000  Infinity     0

You can use histograms for a number of different purposes. For example, you can use them to see the distribution of values, use them to identify outlying values, or use them to compute averages and standard deviations. See Scenario: Tuning a DB2 workload management configuration when capacity planning information is unavailable and Example: Computing averages and a standard deviation from histograms in a DB2 workload management configuration for examples of how to use histograms to better understand and characterize your workload.

In a multimember database environment, histograms are collected on each member. Histogram bins have the same range of values on all database members, with specific counts per bin per member. You can use the bins to analyze information on a per-member basis. You can also combine the histograms from all database members by adding the counts in the corresponding bins and use this single histogram to obtain a global view of the data, which you can then use for tasks such as calculating the global average and standard deviation.

Histograms are available for service subclasses, workloads, and work classes, through work actions. Histograms are collected for these objects when you specify one of the COLLECT AGGREGATE ACTIVITY DATA, COLLECT AGGREGATE REQUEST DATA, or COLLECT AGGREGATE UNIT OF WORK DATA clauses when creating or altering the objects. For work classes, histograms are also collected if you apply a COLLECT AGGREGATE ACTIVITY DATA work action to the work class. The following histograms are available:
  • Non-nested coordinator activity lifetime, when you specify AGGREGATE ACTIVITY DATA BASE or AGGREGATE ACTIVITY DATA EXTENDED for a service subclass, for a workload, or for a work action applied to a work class
  • Non-nested coordinator activity execution time, when you specify AGGREGATE ACTIVITY DATA BASE or AGGREGATE ACTIVITY DATA EXTENDED for a service subclass, for a workload, or for a work action applied to a work class
  • Non-nested coordinator activity queue time, when you specify AGGREGATE ACTIVITY DATA BASE or AGGREGATE ACTIVITY DATA EXTENDED for a service subclass, for a workload, or for a work action applied to a work class
  • Request execution time, when you specify AGGREGATE REQUEST DATA BASE for a service subclass; this histogram does not apply to workloads or work classes
  • Non-nested activity interarrival time histogram, when you specify AGGREGATE ACTIVITY DATA EXTENDED for a service subclass, for a workload, or for a work action applied to a work class
  • Non-nested DML activity estimated cost, when you specify AGGREGATE ACTIVITY DATA EXTENDED for a service subclass, for a workload, or for a work action applied to a work class
  • Unit of work lifetime, when you specify AGGREGATE UNIT OF WORK DATA BASE for a service class

All activity-related histograms contain information about activities that are completed, are cancelled, or are rejected.

Histogram templates

You can optionally specify a histogram template that is used to determine what a particular histogram looks like, including the high bin value. A histogram template is a unitless object, meaning that there is no predefined measurement unit assigned to it. A measurement unit, which depends on the context in which the histogram template is used, is assigned to the histogram when a service class, workload, or work action is created or altered. Time-based histograms, such as those specified with the ACTIVITY LIFETIME HISTOGRAM clause as an example, use millisecond units, while cost-based histograms, such as those specified with the ACTIVITY ESTIMATED COST HISTOGRAM clause as an example, use timeron units.

You can create a histogram template by using the CREATE HISTOGRAM TEMPLATE statement, specifying the maximum high bin value. All other bin values are automatically defined as exponentially increasing values that approach the high bin value. For example, to create a histogram template with a high bin value of 3 000 000, issue a statement such as the following one:
CREATE HISTOGRAM TEMPLATE TEMPLATE1 HIGH BIN VALUE 3000000
This statement creates a histogram template with the following bin values:
Low Bin	High Bin
      0        1
      1        2
      2        3
      3        4
      4        6
      6        9
      9       13
     13       19
     19       28
     28       41
     41       60
     60       87
     87      127
    127      184
    184      268
    268      389
    389      565
    565      821
    821     1192
   1192     1732
   1732     2514
   2514     3651
   3651     5300
   5300     7696
   7696    11173
  11173    16222
  16222    23553
  23553    34196
  34196    49649
  49649    72084
  72084   104657
 104657   151948
 151948   220609
 220609   320297
 320297   465030
 465030   675163
 675163   980250
 980250  1423197
1423197  2066299
2066299  3000000
3000000 Infinity

You apply a histogram template by using the appropriate HISTOGRAM TEMPLATE keyword when creating or altering service subclasses, workloads, or work actions. If you do not specify a histogram template, the default template, SYSDEFAULTHISTOGRAM, is used. If you do not enable AGGREGATE ACTIVITY DATA collection for an object, the histogram template is ignored.

For example, to use the TEMPLATE1 histogram template for the existing activity lifetime histogram of service subclass MYSUBCLASS under the service superclass MYSUPERCLASS, issue the following statement:
ALTER SERVICE CLASS MYSUBCLASS UNDER MYSUPERCLASS
ACTIVITY LIFETIME HISTOGRAM TEMPLATE TEMPLATE1

After you commit the ALTER SERVICE CLASS statement, the activity lifetime histogram that is collected for the MYSUBCLASS service subclass has high bin values that are determined by the TEMPLATE1 histogram template instead of by the SYSDEFAULTHISTOGRAM histogram template.

If you change a service class or a workload to use a different histogram template or change a histogram template, the change does not take effect until a statistics reset occurs.

You can drop a histogram template by using the DROP HISTOGRAM TEMPLATE statement.

You can view the histogram templates by querying the SYSCAT.HISTOGRAMTEMPLATES view and view the corresponding histogram template high bin values by querying the SYSCAT.HISTOGRAMTEMPLATEBINS view. The low bin value is always 0 for the first bin; for any other bins, the low bin value is the high bin value from the preceding bin.

Example

The following example creates a table function to compute the CoordActLifetime, CoordActExecTime, CoordActQueueTime, or CoordActEstCost histogram for a service superclass as a whole by summing across the subclasses. Summing across subclasses is useful when activities are remapped to different service subclasses under the same service superclass during execution, as can occur under a priority aging scenario where service class tiers and specialized thresholds are used to control resources for activities dynamically. This example does not apply to the CoordActInterArrivalTime histogram because the weighted averages computed do not account for the fact that the CoordActInterArrivalTime histogram of a subclass measures the time between the arrival of a query in that subclass and the next query but the CoordActInterArrivalTime histogram of a superclass measures the time between the arrival of a query in any of its subclasses and the next query.

CONNECT TO SAMPLE

DROP FUNCTION histsuper

CREATE FUNCTION histsuper(superclass varchar(128),
                          histogram_type varchar(24))
RETURNS TABLE (statistics_timestamp timestamp,
               bin_top integer,
               number_in_bin integer,
               graph varchar(60))
LANGUAGE SQL
READS SQL DATA
NO EXTERNAL ACTION
DETERMINISTIC
RETURN WITH HISTOGRAMS AS
       (SELECT HISTOGRAM_TYPE,
              substr(PARENTSERVICECLASSNAME,1,26) as SUPERCLASS,
              STATISTICS_TIMESTAMP,
              TOP as BIN_TOP,
              sum(NUMBER_IN_BIN) as NUMBER_IN_BIN
       FROM HISTOGRAMBIN_DB2STATISTICS H,
            SYSCAT.SERVICECLASSES S
       WHERE H.SERVICE_CLASS_ID = S.SERVICECLASSID
         AND PARENTSERVICECLASSNAME = histsuper.superclass
         AND HISTOGRAM_TYPE = histsuper.histogram_type
         AND HISTOGRAM_TYPE IN ('CoordActLifetime', 'CoordActExecTime',
	    'CoordActQueueTime', 'CoordActEstCost')
       GROUP BY HISTOGRAM_TYPE, PARENTSERVICECLASSNAME, STATISTICS_TIMESTAMP, TOP)
       SELECT STATISTICS_TIMESTAMP,
              BIN_TOP,
              NUMBER_IN_BIN,
              substr(repeat('#', cast(NUMBER_IN_BIN * 60 /
                (SELECT CASE WHEN MAX(NUMBER_IN_BIN) = 0 THEN 1
		   ELSE MAX(NUMBER_IN_BIN) END FROM HISTOGRAMS) AS INTEGER)),1,60)
		   AS GRAPH FROM HISTOGRAMS

CONNECT RESET

The output looks as follows:

STATISTICS_TIMESTAMP       BIN_TOP     NUMBER_IN_BIN GRAPH
-------------------------- ----------- ------------- ------------------------------------------------------------
2008-11-06-14.47.08.833188 -1          0
2008-11-06-14.47.08.833188 1           1
2008-11-06-14.47.08.833188 2           1
2008-11-06-14.47.08.833188 3           2
2008-11-06-14.47.08.833188 5           4
2008-11-06-14.47.08.833188 8           7
2008-11-06-14.47.08.833188 12          15
2008-11-06-14.47.08.833188 19          29            #
2008-11-06-14.47.08.833188 29          41            #
2008-11-06-14.47.08.833188 44          67            ##
2008-11-06-14.47.08.833188 68          112           ###
2008-11-06-14.47.08.833188 103         228           #####
2008-11-06-14.47.08.833188 158         335           ########
2008-11-06-14.47.08.833188 241         723           #################
2008-11-06-14.47.08.833188 369         1289          ###############################
2008-11-06-14.47.08.833188 562         1890          #############################################
2008-11-06-14.47.08.833188 858         2484          ############################################################
2008-11-06-14.47.08.833188 1309        1943          ###############################################
2008-11-06-14.47.08.833188 1997        478           ###########
2008-11-06-14.47.08.833188 3046        221           #####
2008-11-06-14.47.08.833188 4647        29            #
2008-11-06-14.47.08.833188 7089        7
2008-11-06-14.47.08.833188 10813       0
2008-11-06-14.47.08.833188 16493       2
2008-11-06-14.47.08.833188 25157       0
2008-11-06-14.47.08.833188 38373       1
2008-11-06-14.47.08.833188 58532       0
2008-11-06-14.47.08.833188 89280       0
2008-11-06-14.47.08.833188 136181      0
2008-11-06-14.47.08.833188 207720      0
2008-11-06-14.47.08.833188 316840      0
2008-11-06-14.47.08.833188 483283      0
2008-11-06-14.47.08.833188 737162      0
2008-11-06-14.47.08.833188 1124409     0
2008-11-06-14.47.08.833188 1715085     0
2008-11-06-14.47.08.833188 2616055     0
2008-11-06-14.47.08.833188 3990325     0
2008-11-06-14.47.08.833188 6086529     0
2008-11-06-14.47.08.833188 9283913     0
2008-11-06-14.47.08.833188 14160950    0
2008-11-06-14.47.08.833188 21600000    0

41 record(s) selected.