DB2 Version 9.7 for Linux, UNIX, and Windows

Monitoring and troubleshooting using db2pd command

The db2pd command is used for troubleshooting because it can return quick and immediate information from the DB2® memory sets.

Overview

The tool collects information without acquiring any latches or using any engine resources. It is therefore possible (and expected) to retrieve information that is changing while db2pd is collecting information; hence the data might not be completely accurate. If changing memory pointers are encountered, a signal handler is used to prevent db2pd from ending abnormally. This can result in messages such as "Changing data structure forced command termination" to appear in the output. Nonetheless, the tool can be helpful for troubleshooting. Two benefits to collecting information without latching include faster retrieval and no competition for engine resources.

If you want to capture information about the database management system when a specific SQLCODE, ZRC code or ECF code occurs, this can be accomplished using the db2pdcfg -catch command. When the errors are caught, the db2cos (callout script) is launched. The db2cos script can be dynamically altered to run any db2pd command, operating system command, or any other command needed to resolve the problems. The template db2cos script file is located in sqllib/bin on UNIX and Linux. On the Windows operating system, db2cos is located in the $DB2PATH\bin directory.

When adding a new node, you can monitor the progress of the operation on the database partition server, that is adding the node, using the db2pd -addnode command with the optional oldviewapps and detail parameters for more detailed information.

If you require a list of event monitors that are currently active or have been, for some reason, deactivated, run the db2pd -gfw command. This command also returns statistics and information about the targets, into which event monitors write data, for each fast writer EDU.

Examples

Example 1: Diagnosing a lockwait

If you run db2pd -db databasename -locks -transactions -applications -dynamic, the results are similar to the following ones:
Locks:
Address            TranHdl Lockname                   Type Mode Sts Owner Dur HldCnt Att    ReleaseFlg
0x07800000202E5238 3       00020002000000040000000052 Row  ..X  G   3     1   0      0x0000 0x40000000
0x07800000202E4668 2       00020002000000040000000052 Row  ..X  W*  2     1   0      0x0000 0x40000000
For the database that you specified using the -db database name option, the first results show the locks for that database. The results show that TranHdl 2 is waiting on a lock held by TranHdl 3.
Transactions:
Address            AppHandl [nod-index] TranHdl Locks State Tflag      Tflag2     Firstlsn       Lastlsn        LogSpace SpaceReserved TID            AxRegCnt GXID
0x0780000020251B80 11       [000-00011] 2       4     READ  0x00000000 0x00000000 0x000000000000 0x000000000000 0        0             0x0000000000B7 1        0
0x0780000020252900 12       [000-00012] 3       4     WRITE 0x00000000 0x00000000 0x000000FA000C 0x000000FA000C 113      154           0x0000000000B8 1        0
We can see that TranHdl 2 is associated with AppHandl 11 and TranHdl 3 is associated with AppHandl 12.
Applications:
Address            AppHandl [nod-index] NumAgents CoorPid Status        C-AnchID C-StmtUID L-AnchID L-StmtUID Appid

0x07800000006879E0 12       [000-00012] 1         1073336 UOW-Waiting   0        0         17       1         *LOCAL.burford.060303225602
0x0780000000685E80 11       [000-00011] 1         1040570 UOW-Executing 17       1         94       1         *LOCAL.burford.060303225601
We can see that AppHandl 12 last ran dynamic statement 17, 1. ApplHandl 11 is currently running dynamic statement 17, 1 and last ran statement 94, 1.
Dynamic SQL Statements:
Address            AnchID StmtUID NumEnv NumVar NumRef NumExe Text
0x07800000209FD800 17     1       1      1      2      2      update pdtest set c1 = 5
0x07800000209FCCC0 94     1       1      1      2      2      set lock mode to wait 1
We can see that the text column shows the SQL statements that are associated with the lock timeout.

Example 2: Using the -wlocks parameter to capture all the locks being waited on

If you run db2pd -wlocks -db pdtest, results similar to the following ones are generated. They show that the first application (AppHandl 47) is performing an insert on a table and that the second application (AppHandl 46) is performing a select on that table:
venus@boson:/home/venus =>db2pd -wlocks -db pdtest

Database Partition 0 -- Database PDTEST -- Active -- Up 0 days 00:01:22

Locks being waited on :
AppHandl [nod-index] TranHdl    Lockname                   Type    Mode Conv Sts CoorEDU    AppName  AuthID   AppID
47       [000-00047] 8          00020004000000000840000652 Row     ..X       G   5160       db2bp    VENUS    *LOCAL.venus.071207213730
46       [000-00046] 2          00020004000000000840000652 Row     .NS       W   5913       db2bp    VENUS    *LOCAL.venus.071207213658

Example 3: Using the -apinfo parameter to capture detailed runtime information about the lock owner and the lock waiter

The following sample output was generated under the same conditions as those for Example 2:
venus@boson:/home/venus =>db2pd -apinfo 47 -db pdtest

Database Partition 0 -- Database PDTEST -- Active -- Up 0 days 00:01:30

Application :
  Address :                0x0780000001676480
  AppHandl [nod-index] :   47       [000-00047]
  Application PID :        876558
  Application Node Name :  boson
  IP Address:              n/a
  Connection Start Time :  (1197063450)Fri Dec  7 16:37:30 2007
  Client User ID :         venus
  System Auth ID :         VENUS
  Coordinator EDU ID :     5160
  Coordinator Partition :  0
  Number of Agents :       1
  Locks timeout value :    4294967294 seconds
  Locks Escalation :       No
  Workload ID :            1
  Workload Occurrence ID : 2
  Trusted Context :        n/a
  Connection Trust Type :  non trusted
  Role Inherited :         n/a
  Application Status :     UOW-Waiting
  Application Name :       db2bp
  Application ID :         *LOCAL.venus.071207213730

  ClientUserID :           n/a
  ClientWrkstnName :       n/a
  ClientApplName :         n/a
  ClientAccntng :          n/a

  List of inactive statements of current UOW :
    UOW-ID :          2
    Activity ID :     1
    Package Schema :  NULLID
    Package Name :    SQLC2G13
    Package Version :
    Section Number :  203
    SQL Type :        Dynamic
    Isolation :       CS
    Statement Type :  DML, Insert/Update/Delete
    Statement :       insert into pdtest values 99


venus@boson:/home/venus =>db2pd -apinfo 46 -db pdtest

Database Partition 0 -- Database PDTEST -- Active -- Up 0 days 00:01:39

Application :
  Address :                0x0780000000D77A60
  AppHandl [nod-index] :   46       [000-00046]
  Application PID :        881102
  Application Node Name :  boson
  IP Address:              n/a
  Connection Start Time :  (1197063418)Fri Dec  7 16:36:58 2007
  Client User ID :         venus
  System Auth ID :         VENUS
  Coordinator EDU ID :     5913
  Coordinator Partition :  0
  Number of Agents :       1
  Locks timeout value :    4294967294 seconds
  Locks Escalation :       No
  Workload ID :            1
  Workload Occurrence ID : 1
  Trusted Context :        n/a
  Connection Trust Type :  non trusted
  Role Inherited :         n/a
  Application Status :     Lock-wait
  Application Name :       db2bp
  Application ID :         *LOCAL.venus.071207213658

  ClientUserID :           n/a
  ClientWrkstnName :       n/a
  ClientApplName :         n/a
  ClientAccntng :          n/a

  List of active statements :
   *UOW-ID :          3
    Activity ID :     1
    Package Schema :  NULLID
    Package Name :    SQLC2G13
    Package Version :
    Section Number :  201
    SQL Type :        Dynamic
    Isolation :       CS
    Statement Type :  DML, Select (blockable)
    Statement :       select * from pdtest

Example 4: Using the callout scripts when considering a locking problem

To use the callout scripts, find the db2cos output files. The location of the files is controlled by the database manager configuration parameter diagpath. The contents of the output files will differ depending on what commands you enter in the db2cos script file. An example of the output provided when the db2cos script file contains a db2pd -db sample -locks command is as follows:
Lock Timeout Caught
Thu Feb 17 01:40:04 EST 2006
Instance DB2
Database: SAMPLE
Partition Number: 0
PID: 940
TID: 2136
Function: sqlplnfd
Component: lock manager
Probe: 999
Timestamp: 2006-02-17-01.40.04.106000
AppID: *LOCAL.DB2...
AppHdl:
...
Database Partition 0 -- Database SAMPLE -- Active -- Up 0 days 00:06:53
Locks:
Address    TranHdl Lockname                   Type Mode Sts Owner Dur HldCnt Att Rlse
0x402C6B30 3       00020003000000040000000052 Row  ..X  W*  3     1   0      0   0x40

In the output, W* indicates the lock that experienced the timeout. In this case, a lockwait has occurred. A lock timeout can also occur when a lock is being converted to a higher mode. This is indicated by C* in the output.

You can map the results to a transaction, an application, an agent, or even an SQL statement with the output provided by other db2pd commands in the db2cos file. You can narrow down the output or use other commands to collect the information that you need. For example, you can use the db2pd -locks wait parameters to print only locks with a wait status. You can also use the -app and -agent parameters.

Example 5: Mapping an application to a dynamic SQL statement

The command db2pd -applications -dynamic reports the current and last anchor ID and statement unique ID for dynamic SQL statements. This allows direct mapping from an application to a dynamic SQL statement.

Applications:
Address            AppHandl [nod-index] NumAgents  CoorPid  Status
0x00000002006D2120 780      [000-00780] 1          10615    UOW-Executing

C-AnchID C-StmtUID  L-AnchID L-StmtUID  Appid
163      1          110      1          *LOCAL.burford.050202200412

Dynamic SQL Statements:
Address            AnchID StmtUID NumEnv NumVar NumRef NumExe Text
0x0000000220A02760 163    1       2      2      2      1      CREATE VIEW MYVIEW
0x0000000220A0B460 110    1       2      2      2      1      CREATE VIEW YOURVIEW

Example 6: Monitoring memory usage

The db2pd -memblock command can be useful when you are trying to understand memory usage, as shown in the following sample output:

All memory blocks in DBMS set.

Address            PoolID   PoolName   BlockAge   Size(Bytes) I LOC   File
0x0780000000740068 62       resynch    2          112         1 1746  1583816485
0x0780000000725688 62       resynch    1          108864      1 127   1599127346
0x07800000001F4348 57       ostrack    6          5160048     1 3047  698130716
0x07800000001B5608 57       ostrack    5          240048      1 3034  698130716
0x07800000001A0068 57       ostrack    1          80          1 2970  698130716
0x07800000001A00E8 57       ostrack    2          240         1 2983  698130716
0x07800000001A0208 57       ostrack    3          80          1 2999  698130716
0x07800000001A0288 57       ostrack    4          80          1 3009  698130716
0x0780000000700068 70       apmh       1          360         1 1024  3878879032
0x07800000007001E8 70       apmh       2          48          1 914   1937674139
0x0780000000700248 70       apmh       3          32          1 1000  1937674139
...

This is followed by the sorted 'per-pool' output:

Memory blocks sorted by size for ostrack pool:
PoolID     PoolName   TotalSize(Bytes)     TotalCount LOC   File
57         ostrack    5160048              1          3047  698130716
57         ostrack    240048               1          3034  698130716
57         ostrack    240                  1          2983  698130716
57         ostrack    80                   1          2999  698130716
57         ostrack    80                   1          2970  698130716
57         ostrack    80                   1          3009  698130716
Total size for ostrack pool: 5400576 bytes

Memory blocks sorted by size for apmh pool:
PoolID     PoolName   TotalSize(Bytes)     TotalCount LOC   File
70         apmh       40200                2          121   2986298236
70         apmh       10016                1          308   1586829889
70         apmh       6096                 2          4014  1312473490
70         apmh       2516                 1          294   1586829889
70         apmh       496                  1          2192  1953793439
70         apmh       360                  1          1024  3878879032
70         apmh       176                  1          1608  1953793439
70         apmh       152                  1          2623  1583816485
70         apmh       48                   1          914   1937674139
70         apmh       32                   1          1000  1937674139
Total size for apmh pool: 60092 bytes
...

The final section of output sorts the consumers of memory for the entire memory set:

All memory consumers in DBMS memory set:
PoolID     PoolName   TotalSize(Bytes)     %Bytes TotalCount %Count LOC   File
57         ostrack    5160048              71.90  1          0.07   3047  698130716
50         sqlch      778496               10.85  1          0.07   202   2576467555
50         sqlch      271784               3.79   1          0.07   260   2576467555
57         ostrack    240048               3.34   1          0.07   3034  698130716
50         sqlch      144464               2.01   1          0.07   217   2576467555
62         resynch    108864               1.52   1          0.07   127   1599127346
72         eduah      108048               1.51   1          0.07   174   4210081592
69         krcbh      73640                1.03   5          0.36   547   4210081592
50         sqlch      43752                0.61   1          0.07   274   2576467555
70         apmh       40200                0.56   2          0.14   121   2986298236
69         krcbh      32992                0.46   1          0.07   838   698130716
50         sqlch      31000                0.43   31         2.20   633   3966224537
50         sqlch      25456                0.35   31         2.20   930   3966224537
52         kerh       15376                0.21   1          0.07   157   1193352763
50         sqlch      14697                0.20   1          0.07   345   2576467555
...

You can also report memory blocks for private memory on UNIX and Linux operating systems. For example, if you run db2pd -memb pid=159770, results similar to the following ones are generated:

All memory blocks in Private set. 

Address            PoolID     PoolName     BlockAge   Size(Bytes) I LOC   File 
0x0000000110469068 88         private      1          2488        1 172   4283993058
0x0000000110469A48 88         private      2          1608        1 172   4283993058
0x000000011046A0A8 88         private      3          4928        1 172   4283993058
0x000000011046B408 88         private      4          7336        1 172   4283993058
0x000000011046D0C8 88         private      5          32          1 172   4283993058
0x000000011046D108 88         private      6          6728        1 172   4283993058
0x000000011046EB68 88         private      7          168         1 172   4283993058
0x000000011046EC28 88         private      8          24          1 172   4283993058
0x000000011046EC68 88         private      9          408         1 172   4283993058
0x000000011046EE28 88         private      10         1072        1 172   4283993058
0x000000011046F288 88         private      11         3464        1 172   4283993058
0x0000000110470028 88         private      12         80          1 172   4283993058
0x00000001104700A8 88         private      13         480         1 1534  862348285 
0x00000001104702A8 88         private      14         480         1 1939  862348285 
0x0000000110499FA8 88         private      80         65551       1 1779  4231792244
Total set size: 94847 bytes

Memory blocks sorted by size:
PoolID     PoolName   TotalSize(Bytes)     TotalCount LOC   File      
88         private    65551                1          1779  4231792244
88         private    28336                12         172   4283993058
88         private    480                  1          1939  862348285 
88         private    480                  1          1534  862348285 
Total set size: 94847 bytes

Example 7: Determine which application is using up your table space

Using db2pd -tcbstats, you can identify the number of inserts for a table. The following is sample information for a user-defined global temporary table called TEMP1:

TCB Table Information:
Address            TbspaceID TableID PartID MasterTbs MasterTab TableName  SchemaNm ObjClass DataSize  LfSize LobSize XMLSize
0x0780000020B62AB0 3         2       n/a    3         2         TEMP1      SESSION  Temp     966       0      0       0 

TCB Table Stats:
Address            TableName   Scans   UDI   PgReorgs   NoChgUpdts   Reads   FscrUpdates Inserts   Updates   Deletes   OvFlReads OvFlCrtes 
0x0780000020B62AB0 TEMP1       0       0     0          0            0       0           43968     0         0          0        0         

You can then obtain the information for table space 3 by using the db2pd -tablespaces command. Sample output is as follows:

Tablespace 3 Configuration:
Address            Type Content PageSz ExtentSz Auto Prefetch BufID BufIDDisk FSC NumCntrs MaxStripe  LastConsecPg Name
0x0780000020B1B5A0 DMS  UsrTmp  4096   32       Yes  32       1     1         On  1        0          31           TEMPSPACE2

Tablespace 3 Statistics:
Address            TotalPgs   UsablePgs  UsedPgs    PndFreePgs FreePgs    HWM        State      MinRecTime NQuiescers
0x0780000020B1B5A0 5000       4960       1088       0          3872       1088       0x00000000 0          0          

Tablespace 3 Autoresize Statistics:
Address            AS  AR  InitSize    IncSize     IIP MaxSize     LastResize      LRF
0x0780000020B1B5A0 No  No  0           0           No  0           None            No  

Containers:
Address            ContainNum Type    TotalPgs   UseablePgs StripeSet  Container 
0x0780000020B1DCC0 0          File    5000       4960       0          /home/db2inst1/tempspace2a

The FreePgs column shows that space is filling up. As the free pages value decreases, there is less space available. Notice also that the value for FreePgs plus the value for UsedPgs equals the value of UsablePgs.

Once this is known, you can identify the dynamic SQL statement that is using the table TEMP1 by running the db2pd -db sample -dyn:

Database Partition 0 -- Database SAMPLE -- Active -- Up 0 days 00:13:06

Dynamic Cache:
Current Memory Used           1022197
Total Heap Size               1271398
Cache Overflow Flag           0
Number of References          237
Number of Statement Inserts   32
Number of Statement Deletes   13
Number of Variation Inserts   21
Number of Statements          19

Dynamic SQL Statements:
Address            AnchID StmtUID  NumEnv  NumVar  NumRef  NumExe  Text
0x0000000220A08C40 78     1        2       2       3       2       declare global temporary table temp1 (c1 char(6)) not logged
0x0000000220A8D960 253    1        1       1       24      24      insert into session.temp1 values('TEST')

Finally, you can map the information from the preceding output to the applications output to identify the application by running db2pd -db sample -app.

Applications:
Address            AppHandl [nod-index] NumAgents  CoorPid Status
0x0000000200661840 501      [000-00501] 1          11246   UOW-Waiting

C-AnchID C-StmtUID  L-AnchID L-StmtUID  Appid
0        0          253      1          *LOCAL.db2inst1.050202160426

You can use the anchor ID (AnchID) value that identified the dynamic SQL statement to identify the associated application. The results show that the last anchor ID (L-AnchID) value is the same as the anchor ID (AnchID) value. You use the results from one run of db2pd in the next run of db2pd.

The output from db2pd -agent shows the number of rows read (in the Rowsread column) and rows written (in the Rowswrtn column) by the application. These values give you an idea of what the application has completed and what the application still has to complete, as shown in the following sample output:.
Address            AppHandl [nod-index] AgentPid  Priority  Type	DBName
0x0000000200698080 501      [000-00501] 11246     0         Coord	SAMPLE

State       ClientPid  Userid   ClientNm Rowsread   Rowswrtn   LkTmOt
Inst-Active 26377      db2inst1 db2bp    22         9588       NotSet

You can map the values for AppHandl and AgentPid resulting from running the db2pd -agent command to the corresponding values for AppHandl and CoorPid resulting from running the db2pd -app command.

The steps are slightly different if you suspect that an internal temporary table is filling up the table space. You still use db2pd -tcbstats to identify tables with large numbers of inserts, however. Following is sample information for an implicit temporary table:

TCB Table Information:
Address            TbspaceID TableID PartID MasterTbs MasterTab TableName          SchemaNm ObjClass    DataSize  ...     
0x0780000020CC0D30 1         2       n/a    1         2         TEMP (00001,00002) <30>     <JMC Temp   2470      ...
0x0780000020CC14B0 1         3       n/a    1         3         TEMP (00001,00003) <31>     <JMC Temp   2367      ...
0x0780000020CC21B0 1         4       n/a    1         4         TEMP (00001,00004) <30>     <JMC Temp   1872      ...

TCB Table Stats:
Address            TableName          Scans   UDI   PgReorgs   NoChgUpdts Reads   FscrUpdates Inserts ... 
0x0780000020CC0D30 TEMP (00001,00002) 0       0     0          0          0       0           43219   ...
0x0780000020CC14B0 TEMP (00001,00003) 0       0     0          0          0       0           42485   ...
0x0780000020CC21B0 TEMP (00001,00004) 0       0     0          0          0       0           0       ...

In this example, there are a large number of inserts for tables with the naming convention TEMP (TbspaceID, TableID). These are implicit temporary tables. The values in the SchemaNm column have a naming convention of the value for AppHandl concatenated with the value for SchemaNm, which makes it possible to identify the application doing the work.

You can then map that information to the output from db2pd -tablespaces to see the used space for table space 1. Take note of the relationship between the UsedPgs and UsablePgs values in the table space statistics in the following output:

Tablespace Configuration:
Address            Id    Type Content PageSz ExtentSz Auto Prefetch BufID BufIDDisk FSC NumCntrs MaxStripe LastConsecPg Name
0x07800000203FB5A0 1     SMS  SysTmp  4096   32       Yes  320      1     1         On  10       0         31           TEMPSPACE1

Tablespace Statistics:
Address            Id    TotalPgs   UsablePgs  UsedPgs    PndFreePgs FreePgs    HWM    State      MinRecTime NQuiescers
0x07800000203FB5A0 1     6516       6516       6516       0          0          0      0x00000000 0          0

Tablespace Autoresize Statistics:
Address            Id    AS  AR  InitSize    IncSize     IIP MaxSize     LastResize  LRF
0x07800000203FB5A0 1     No  No  0           0           No  0           None        No

Containers:
...

You can then identify application handles 30 and 31 (because you saw them in the -tcbstats output) by using the command db2pd -app:

Applications:
Address            AppHandl [nod-index] NumAgents  CoorPid    Status          C-AnchID C-StmtUID  L-AnchID   L-StmtUID  Appid
0x07800000006FB880 31       [000-00031] 1          4784182    UOW-Waiting     0        0          107        1          *LOCAL.db2inst1.051215214142
0x07800000006F9CE0 30       [000-00030] 1          8966270    UOW-Executing   107      1          107        1          *LOCAL.db2inst1.051215214013

Finally, map the information from the preceding output to the Dynamic SQL output obtained by running the db2pd -dyn command:

Dynamic SQL Statements:
Address            AnchID StmtUID    NumEnv     NumVar     NumRef     NumExe     Text
0x0780000020B296C0 107    1          1          1          43         43         select c1, c2 from test group by c1,c2

Example 8: Monitoring recovery

If you run the command db2pd -recovery, the output shows several counters that you can use to verify that recovery is progressing, as shown in the following sample output. The Current Log and Current LSN values provide the log position. The CompletedWork value is the number of bytes completed thus far.

Recovery:
Recovery Status     0x00000401
Current Log         S0000005.LOG
Current LSN         000002551BEA
Job Type            ROLLFORWARD RECOVERY
Job ID              7
Job Start Time      (1107380474) Wed Feb  2 16:41:14 2005
Job Description     Database Rollforward Recovery
Invoker Type        User
Total Phases        2
Current Phase       1

Progress:
Address            PhaseNum Description StartTime                CompletedWork  TotalWork
0x0000000200667160 1        Forward     Wed Feb  2 16:41:14 2005 2268098 bytes  Unknown
0x0000000200667258 2        Backward    NotStarted               0 bytes        Unknown

Example 9: Determining the amount of resources a transaction is using

If you run the command db2pd -transactions, the output shows the number of locks, the first log sequence number (LSN), the last LSN, the log space used, and the space reserved, as shown in the following sample output. This can be useful for understanding the behavior of a transaction.

Transactions:
Address            AppHandl [nod-index] TranHdl  Locks  State  Tflag
0x000000022026D980 797      [000-00797] 2        108    WRITE  0x00000000
0x000000022026E600 806      [000-00806] 3        157    WRITE  0x00000000
0x000000022026F280 807      [000-00807] 4        90     WRITE  0x00000000

Tflag2     Firstlsn       Lastlsn        LogSpace   SpaceReserved
0x00000000 0x000001072262 0x0000010B2C8C 4518       95450
0x00000000 0x000001057574 0x0000010B3340 6576       139670
0x00000000 0x00000107CF0C 0x0000010B2FDE 3762       79266

TID            AxRegCnt   GXID
0x000000000451 1          0
0x0000000003E0 1          0
0x000000000472 1          0

Example 10: Monitoring log usage

The command db2pd -logs is useful for monitoring log usage for a database. By using thePages Written value, as shown in the following sample output, you can determine whether the log usage is increasing:

Logs:
Current Log Number            0
Pages Written                 0
Cur Commit Disk Log Reads     0
Cur Commit Total Log Reads    0
Method 1 Archive Status       n/a
Method 1 Next Log to Archive  n/a
Method 1 First Failure        n/a
Method 2 Archive Status       n/a
Method 2 Next Log to Archive  n/a
Method 2 First Failure        n/a
Log Chain ID                  0
Current LSN                   0x0000000001F40010

Address            StartLSN        State      Size      Pages      Filename
0x00002B75E9E3D2D0 0000000001F40010 0x00000000 1000      1000      S0000000.LOG
0x00002B75E9E53D70 0000000002328010 0x00000000 1000      1000      S0000001.LOG
0x00002B75E9E545D0 0000000002710010 0x00000000 1000      1000      S0000002.LOG
You can identify two types of problems by using this output:
  • If the most recent log archive fails, Archive Status is set to a value of Failure. If there is an ongoing archive failure, preventing logs from being archived at all, Archive Status is set to a value of First Failure.
  • If log archiving is proceeding very slowly, the Next Log to Archive value is lower than the Current Log Number value. If archiving is very slow, space for active logs might run out, which in turn might prevent any data changes from occurring in the database.

Example 11: Viewing the sysplex list

Without the db2pd -sysplex command showing the following sample output, the only other way to report the sysplex list is by using a DB2 trace.

Sysplex List:
Alias:         HOST
Location Name: HOST1
Count:         1

IP Address      Port       Priority   Connections Status     PRDID
 1.2.34.56      400        1          0           0

Example 12: Generating stack traces

You can use the db2pd -stack all command for Windows operating systems or the -stack command for UNIX operating systems to produce stack traces for all processes in the current database partition. You might want to use this command iteratively when you suspect that a process or thread is looping or hanging.

You can obtain the current call stack for a particular engine dispatchable unit (EDU) by issuing the command db2pd -stack eduid, as shown in the following example:

  Attempting to dump stack trace for eduid 137.
  See current DIAGPATH for trapfile.

If the call stacks for all of the DB2 processes are desired, use the command db2pd -stack all, for example (on Windows operating systems):

  Attempting to dump all stack traces for instance.
  See current DIAGPATH for trapfiles.

If you are using a partitioned database environment with multiple physical nodes, you can obtain the information from all of the partitions by using the command db2_all "; db2pd -stack all". If the partitions are all logical partitions on the same machine, however, a faster method is to use db2pd -alldbp -stacks.

You can also redirect the output of the db2pdb -stacks command for db2sysc processes to a specific directory path with the dumpdir parameter. The output can be redirected for a specific duration only with the timeout parameter. For example, to redirect the output of stack traces for all EDUs in db2sysc processes to /home/waleed/mydir for 30 seconds, issue the following command:

db2pd -alldbp -stack all dumpdir=/home/waleed/mydir timeout=30

Example 13: Viewing memory statistics for a database partition

The db2pd -dbptnmem command shows how much memory the DB2 server is currently consuming and, at a high level, which areas of the server are using that memory.

Following is an example of the output from running db2pd -dbptnmem on an AIX® machine:

Database Partition Memory Controller Statistics

Controller Automatic: Y
Memory Limit:   122931408 KB
Current usage:     651008 KB
HWM usage:         651008 KB
Cached memory:     231296 KB
The descriptions of these data fields and columns are as follows:
Controller Automatic
Indicates the memory controller setting. It shows the value "Y" if the instance_memory configuration parameter is set to AUTOMATIC. This means that database manager automatically determines the upper boundary of memory consumption.
Memory Limit
If an instance memory limit is enforced, the value of the instance_memory configuration parameter is the upper bound limit of DB2 server memory that can be consumed.
Current usage
The amount of memory the server is currently consuming.
HWM usage
The high water mark (HWM) or peak memory usage that has been consumed since the activation of the database partition (when the db2start command was run).
Cached memory
The amount of the current usage that is not currently being used but is cached for performance reasons for future memory requests.

Following is the continuation of the sample output from running db2pd -dbptnmem on an AIX operating system:

Individual Memory Consumers:
Name           Mem Used (KB)   HWM Used (KB)   Cached (KB)
===========================================================
APPL-DBONE       160000          160000        159616 
DBMS-name         38528           38528          3776 
FMP_RESOURCES     22528           22528             0 
PRIVATE           13120           13120           740 
FCM_RESOURCES     10048           10048             0 
LCL-p606416         128             128             0 
DB-DBONE         406656          406656         67200
All registered "consumers" of memory within the DB2 server are listed with the amount of the total memory they are consuming. The column descriptions are as follows:
Name
A short, distinguishing name of a consumer of memory, such as the following:
APPL-dbname
Application memory consumed for database dbname
DBMS-name
Global database manager memory requirements
FMP_RESOURCES
Memory required to communicate with db2fmps
PRIVATE
Miscellaneous private memory requirements
FCM_RESOURCES
Fast Communication Manager resources
LCL-pid
The memory segment used to communicate with local applications
DB-dbname
Database memory consumed for database dbname
Mem Used (KB)
The amount of memory that is currently allotted to the consumer
HWM Used (KB)
The high-water mark (HWM) of the memory, or the peak memory, that the consumer has used
Cached (KB)
Of the Mem Used (KB), the amount of memory that is not currently being used but is immediately available for future memory allocations

Example 14: Monitoring the progress of index reorganization

In DB2 Version 9.7 Fix Pack 2 and later fix packs, the progress report of an index reorganization has the following characteristics:
  • The db2pd -reorgs index command reports index reorg progress for partitioned indexes (Fix Pack 1 introduced support for only non-partitioned indexes).
  • The db2pd -reorgs index command supports the monitoring of index reorg at the partition level (that is, during reorganization of a single partition).
  • The reorg progress for non-partitioned and partitioned indexes is reported in separate outputs. One output shows the reorg progress for non-partitioned indexes, and the following outputs show the reorg progress for partitioned indexes on each table partition; the index reorg statistics of only one partition is reported in each output.
  • Non-partitioned indexes are processed first, followed by partitioned indexes in serial fashion.
  • The db2pd -reorgs index command displays the following additional information fields in the output for partitioned indexes:
    • MaxPartition - Total number of partitions for the table being processed. For partition-level reorg, MaxPartition will always have a value of 1 since only a single partition is being reorganized.
    • PartitionID - The data partition identifier for the partition being processed.
The following is an example of output obtained using the db2pd -reorgs index command which reports the index reorg progress for a range-partitioned table with 2 partitions.
Note: The first output reports the Index Reorg Stats of the non-partitioned indexes. The following outputs report the Index Reorg Stats of the partitioned indexes on each partition.
Index Reorg Stats:
Retrieval Time: 02/08/2010 23:04:21
TbspaceID: -6       TableID: -32768
Schema: ZORAN    TableName: BIGRPT
Access: Allow none
Status: Completed
Start Time: 02/08/2010 23:03:55   End Time: 02/08/2010 23:04:04
Total Duration: 00:00:08
Prev Index Duration:  -
Cur Index Start: -
Cur Index: 0            Max Index: 2            Index ID: 0
Cur Phase: 0          ( -     )   Max Phase: 0
Cur Count: 0                      Max Count: 0
Total Row Count: 750000
Retrieval Time: 02/08/2010 23:04:21
TbspaceID: 2        TableID: 5
Schema: ZORAN    TableName: BIGRPT
PartitionID: 0      MaxPartition: 2
Access: Allow none
Status: Completed
Start Time: 02/08/2010 23:04:04   End Time: 02/08/2010 23:04:08
Total Duration: 00:00:04
Prev Index Duration:  -
Cur Index Start: -
Cur Index: 0            Max Index: 2            Index ID: 0
Cur Phase: 0          ( -     )   Max Phase: 0
Cur Count: 0                      Max Count: 0
Total Row Count: 375000
Retrieval Time: 02/08/2010 23:04:21
TbspaceID: 2        TableID: 6
Schema: ZORAN    TableName: BIGRPT
PartitionID: 1      MaxPartition: 2
Access: Allow none
Status: Completed
Start Time: 02/08/2010 23:04:08   End Time: 02/08/2010 23:04:12
Total Duration: 00:00:04
Prev Index Duration:  -
Cur Index Start: -
Cur Index: 0            Max Index: 2            Index ID: 0
Cur Phase: 0          ( -     )   Max Phase: 0
Cur Count: 0                      Max Count: 0
Total Row Count: 375000

Example 15: Displaying the top EDUs by processor time consumption and displaying EDU stack information

If you issue the db2pd command with the -edus parameter option, the output lists all engine dispatchable units (EDUs). Output for EDUs can be returned at the level of granularity you specify, such as at the instance level or at the member. On Linux and UNIX operating systems only, you can also specify the interval parameter suboption so that two snapshots of all EDUs are taken, separated by an interval you specify. When the interval parameter is specified, two additional columns in the output indicate the delta of processor user time (USR DELTA column) and the delta of processor system time (SYS DELTA column) across the interval.

In the following example, the deltas for processor user time and processor system time are given across a five-second interval:

$ db2pd -edus interval=5

Database Partition 0 -- Active -- Up 0 days 00:53:29 -- Date 06/04/2010 03:34:59

List of all EDUs for database partition 0

db2sysc PID: 1249522
db2wdog PID: 2068678

EDU ID    TID            Kernel TID      EDU Name                            USR           SYS         USR DELTA    SYS DELTA
=============================================================================================================================
6957      6957           13889683        db2agntdp (SAMPLE  ) 0              58.238506     0.820466    1.160726     0.014721
6700      6700           11542589        db2agent (SAMPLE) 0                 52.856696     0.754420    1.114821     0.015007
5675      5675           4559055         db2agntdp (SAMPLE  ) 0              60.386779     0.854234    0.609233     0.014304
3088      3088           13951225        db2agntdp (SAMPLE  ) 0              80.073489     2.249843    0.499766     0.006247
3615      3615           2887875         db2loggw (SAMPLE) 0                 0.939891      0.410493    0.011694     0.004204
4900      4900           6344925         db2pfchr (SAMPLE) 0                 1.748413      0.014378    0.014343     0.000103
7986      7986           13701145        db2agntdp (SAMPLE  ) 0              1.410225      0.025900    0.003636     0.000074
2571      2571           8503329         db2ipccm 0                          0.251349      0.083787    0.002551     0.000857
7729      7729           14168193        db2agntdp (SAMPLE  ) 0              1.717323      0.029477    0.000998     0.000038
7472      7472           11853991        db2agnta (SAMPLE) 0                 1.860115      0.032926    0.000860     0.000012
3358      3358           2347127         db2loggr (SAMPLE) 0                 0.151042      0.184726    0.000387     0.000458
515       515            13820091        db2aiothr 0                         0.405538      0.312007    0.000189     0.000178
7215      7215           2539753         db2agntdp (SAMPLE  ) 0              1.165350      0.019466    0.000291     0.000008
6185      6185           2322517         db2wlmd (SAMPLE) 0                  0.061674      0.034093    0.000169     0.000100
6442      6442           2756793         db2evmli (DB2DETAILDEADLOCK) 0      0.072142      0.052436    0.000092     0.000063
4129      4129           15900799        db2glock (SAMPLE) 0                 0.013239      0.000741    0.000064     0.000001
2         2              11739383        db2alarm 0                          0.036904      0.028367    0.000009     0.000009
4386      4386           13361367        db2dlock (SAMPLE) 0                 0.015653      0.001281    0.000014     0.000003
1029      1029           15040579        db2fcms 0                           0.041929      0.016598    0.000010     0.000004
5414      5414           14471309        db2pfchr (SAMPLE) 0                 0.000093      0.000002    0.000000     0.000000
258       258            13656311        db2sysc 0                           8.369967      0.263539    0.000000     0.000000
5157      5157           7934145         db2pfchr (SAMPLE) 0                 0.027598      0.000177    0.000000     0.000000
1543      1543           2670647         db2fcmr 0                           0.004191      0.000079    0.000000     0.000000
1286      1286           8417339         db2extev 0                          0.000312      0.000043    0.000000     0.000000
2314      2314           14360813        db2licc 0                           0.000371      0.000051    0.000000     0.000000
5928      5928           3137537         db2taskd (SAMPLE) 0                 0.004903      0.000572    0.000000     0.000000
3872      3872           2310357         db2lfr (SAMPLE) 0                   0.000126      0.000007    0.000000     0.000000
4643      4643           11694287        db2pclnr (SAMPLE) 0                 0.000094      0.000002    0.000000     0.000000
1800      1800           5800175         db2extev 0                          0.001212      0.002137    0.000000     0.000000
772       772            7925817         db2thcln 0                          0.000429      0.000072    0.000000     0.000000
2057      2057           6868993         db2pdbc 0                           0.002423      0.001603    0.000000     0.000000
2828      2828           10866809        db2resync 0                         0.016764      0.003098    0.000000     0.000000

To provide information only about the EDUs that are the top consumers of processor time and to reduce the amount of output returned, you can further include the top parameter option. In the following example, only the top five EDUs are returned, across an interval of 5 seconds. Stack information is also returned, and can be found stored separately in the directory path specified by DUMPDIR, which defaults to diagpath.

$ db2pd -edus interval=5 top=5 stacks

Database Partition 0 -- Active -- Up 0 days 00:54:00 -- Date 06/04/2010 03:35:30

List of all EDUs for database partition 0

db2sysc PID: 1249522
db2wdog PID: 2068678

EDU ID    TID            Kernel TID     EDU Name                USR          SYS         USR DELTA    SYS DELTA
===============================================================================================================
3358      3358           2347127        db2loggr (SAMPLE) 0     0.154906     0.189223    0.001087     0.001363
3615      3615           2887875        db2loggw (SAMPLE) 0     0.962744     0.419617    0.001779     0.000481
515       515            13820091       db2aiothr 0             0.408039     0.314045    0.000658     0.000543
258       258            13656311       db2sysc 0               8.371388     0.264812    0.000653     0.000474
6700      6700           11542589       db2agent (SAMPLE) 0     54.814420    0.783323    0.000455     0.000310


$ ls -ltr
total 552
drwxrwxr-t    2 vbmithun build         256 05-31 09:59 events/
drwxrwxr-t    2 vbmithun build         256 06-04 03:17 stmmlog/
-rw-r--r--    1 vbmithun build       46413 06-04 03:35 1249522.3358.000.stack.txt
-rw-r--r--    1 vbmithun build       22819 06-04 03:35 1249522.3615.000.stack.txt
-rw-r--r--    1 vbmithun build       20387 06-04 03:35 1249522.515.000.stack.txt
-rw-r--r--    1 vbmithun build       50426 06-04 03:35 1249522.258.000.stack.txt
-rw-r--r--    1 vbmithun build      314596 06-04 03:35 1249522.6700.000.stack.txt
-rw-r--r--    1 vbmithun build       94913 06-04 03:35 1249522.000.processObj.txt

Example 16: Displaying agent event metrics

The db2pd command supports returning event metrics for agents. If you need to determine whether an agent changed state during a specific period of time, use the event option together with the -agents parameter. The AGENT_STATE_LAST_UPDATE_TIME(Tick Value) column that is returned shows the last time that the event being processed by the agent was changed. Together with a previously obtained value for AGENT_STATE_LAST_UPDATE_TIME(Tick Value), you can determine whether an agent has moved on to a new task or whether it continues to process the same task over an extended period of time.

db2pd -agents event
Database Partition 0 -- Active -- Up 0 days 03:18:52 -- Date 06/27/2011 11:47:10

Agents:
Current agents:      12
Idle agents:         0
Active coord agents: 10
Active agents total: 10
Pooled coord agents: 2
Pooled agents total: 2

AGENT_STATE_LAST_UPDATE_TIME(Tick Value)         EVENT_STATE  EVENT_TYPE  EVENT_OBJECT  EVENT_OBJECT_NAME
2011-06-27-14.44.38.859785(5622972377924968075)  IDLE         WAIT        REQUEST       n/a