DB2 10.5 for Linux, UNIX, and Windows

Example: Computing averages and a standard deviation from histograms in a DB2 workload management configuration

One use for histograms is for obtaining the standard deviation for activity lifetimes. The example in this topic shows how bins are used for the calculation of this statistic.

A calculation of the average lifetime for each activity is a useful piece of information. However, the average alone does not accurately describe the user experience. If the variability in activity lifetime is large, the users whom you are supporting might see queries run fast at some times (which is fine) and slow at others (which might not be acceptable). When you define a goal for activity lifetimes, not only is the average lifetime of the activities important but also the standard deviation of the activity lifetime. You need to both understand and control variability to ensure that your users actually experience the observed average.

In a DB2® workload management configuration, statistics are collected on each member. The following example shows how to obtain the average activity lifetime for a single member.

Suppose that you have a single-member environment and histogram with the following bins. There are more bins in the real histograms, but this example is limited to eight bins to make the example simpler.
Bin 1 - 0 to 2 seconds
Bin 2 - 2 to 4 seconds
Bin 3 - 4 to 8 seconds
Bin 4 - 8 to 16 seconds
Bin 5 - 16 to 32 seconds
Bin 6 - 32 to 64 seconds
Bin 7 - 64 to 128 seconds
Bin 8 - 128 seconds to infinity
You can compute an approximation of the average by assuming that the average response time for a query that falls into a bin with the range x to y is (x + y)/2. You can then multiply this number by the number of queries that fell into the bin, sum across all bins, then divide the sum by the total count. For the preceding example, assume that the average response time for each bin is:
Bin 1 average lifetime = (0+2)/2 = 1
Bin 2 average lifetime = (2+4)/2 = 3
Bin 3 average lifetime = (4+8)/2 = 6
Bin 4 average lifetime = (8+16)/2 = 12
Bin 5 average lifetime = (16+32)/2 = 24
Bin 6 average lifetime = (32+64)/2 = 48
Bin 7 average lifetime = (64+128)/2 = 96
Assume that the following histogram was collected during the measurement period:
Bin 1     Bin 2     Bin 3    Bin 4     Bin 5     Bin 6     Bin 7     Bin 8
count     count     count    count     count     count     count     count
   20        30        80       10         5         3         2         0

To calculate average lifetime, bin 8 must be empty. Bin 8 only exists to let you know when you need to change the upper boundary of your range. For this reason, you must specify the upper bound for the range.

You can approximate the average lifetime for member 1 as follows:
average lifetime = (20 x 1 + 30 x 3 + 80 x 6 + 10 x 12 + 5 x 24 + 3 x 48 + 2 x 96) / 150
                 = (20 + 90 + 480 + 120 + 120 + 144 + 192) / 150
                 = 1166 / 150
                 = 7.77 seconds
You can approximate the lifetime standard deviation as follows:
Standard deviation = [(20 x (1 - 7.77)2 + 30 x (3 - 7.77)2 + ... ) / 150]1/2

For multimember database environments, averages and standard deviations can be computed by first computing a combined histogram across all database members by adding the counts of each bin across the database members.

For example, assume that the database has two members, the histogram bin sizes are as described previously, and the histogram has the following data:
Database  Bin 1     Bin 2     Bin 3    Bin 4     Bin 5      Bin 6     Bin 7     Bin 8
  member  count     count     count    count     count      count     count     count
        1    20        30        80       10         5          3         2         0
        2     1         5        20       20         4          0         0         0
Because the bin sizes are the same across all database members, the overall histogram is easy to compute:
Bin 1     Bin 2     Bin 3     Bin 4     Bin 5     Bin 6     Bin 7     Bin 8
count     count     count     count     count     count     count     count
   21        35       100        30         9         3         2         0
From the combined histogram, you can calculate the overall lifetime average and standard deviation in a similar way to how they were computed for a single-member environment:
Average lifetime = (21 x 1 + 35 x 3 + 100 x 6 + 30 x 12 + 9 x 24 + 3 x 48 + 2 x 96) / 200
                 = (21 + 105 + 600 + 360 + 216 + 144 + 192) / 200
                 = 1638 / 200
                 = 8.19 seconds

Standard deviation = [(21 x (1 - 8.19)2 + 35 x (3 - 7.77)2 + ... ) / 200]1/2