Detects and analyzes possible DB2® database application hangs by using various metrics that are gathered from the db2pd command. The db2_hang_analyze script is available only on Linux and UNIX operating systems.
The db2_hang_analyze script is a Perl script that runs indefinitely. On each iteration it gathers metrics for every application and checks whether the application has been active over a certain time interval (the default value is 300 seconds). If the application has not been active over that interval, it is flagged as hanging. If a potential hang is detected, a list of the affected applications is written to a report file. You can terminate the script by pressing Ctrl-C or Ctrl-Z.
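The flagging rule described above can be illustrated with a minimal sketch. This is not the script's actual implementation: the real script derives its metrics from the db2pd command, whereas the three-column input used here (timestamp in seconds, application handle, activity counter) and the function name flag_idle_apps are hypothetical and chosen only for illustration. It is also an assumption that an application is flagged as soon as its idle time reaches the limit.

```shell
#!/bin/sh
# Hypothetical sketch: read "timestamp apphdl counter" samples on stdin and
# print each application handle whose counter has not changed for at least
# the timer limit ($1, in seconds; 300 mirrors the documented default).
flag_idle_apps() {
    awk -v limit="${1:-300}" '
        {
            t = $1; app = $2; metric = $3
            if (!(app in last) || last[app] != metric) {
                last[app] = metric   # activity seen: restart the idle clock
                since[app] = t
            } else if (t - since[app] >= limit && !(app in flagged)) {
                flagged[app] = 1     # report each application only once
                print app
            }
        }
    '
}

# Application 7 never changes; application 9 keeps making progress.
flag_idle_apps 120 <<'EOF'
0 7 100
0 9 500
60 7 100
60 9 510
120 7 100
120 9 520
EOF
# prints: 7
```

Tracking the time of the last observed change for each application, rather than comparing only adjacent samples, lets the polling interval and the idle limit be chosen independently.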
The db2_hang_analyze script is in the sqllib/samples/pd/ directory.
You require one of the following authorities:
>>-db2_hang_analyze--db--dbname--+--------------------------+--><
                                 +-member--member-number----+
                                 +-timerlimit -seconds------+
                                 +-sleeptime -seconds-------+
                                 +-retrylimit -attempts-----+
                                 +-path--directory----------+
                                 +-cputhreshold -percentage-+
                                 +-exec -script_path--------+
                                 +-log----------------------+
                                 +-sql----------------------+
                                 +-list---------------------+
                                 '-h------------------------'
-sql: Specify yes to print the most recent SQL statement that was issued by the hanging application, if the data exists. The default value is no.
In the following example, the script monitors the SAMPLE database to detect any possible application hangs. Various metrics from the db2pd command are collected for every application every 60 seconds. If an application is determined to be hanging, the script writes a report and then exits.
$HOME/sqllib/samples/pd/db2_hang_analyze -db sample -log
Invoked: /home/hotel32/shenli/sqllib/samples/pd/db2_hang_analyze -db sample -log
APPLICATION HANG DETECTION: Started on Fri Jan 25 14:41:38 EST 2013
Sleeptime : 60 seconds
Timer Limit : 300 seconds
Node Member : default
Retry Limit : 3
Log : yes
SQL : no
Event Metrics Available : yes
Logfile : db2_hang_analyze.20130125.14.41.38.10297.log
Script PID : 10297
CPU Threshold : 0.1%
Post Detection Script : none
Path : /home/hotel32/shenli/sqllib/db2dump
Press CTRL-C or CTRL-Z to terminate script
Pre-loop setup...
Iteration 1: No hang found.
Iteration 2: No hang found.
Iteration 3: No hang found.
Iteration 4: No hang found.
Iteration 5: POSSIBLE HANG DETECTED!
Logfile : /home/hotel32/shenli/sqllib/db2dump/db2_hang_analyze.20130125.14.41.38.10297.log
VIEW REPORT AT : /home/hotel32/shenli/sqllib/db2dump/db2_hang_analyze.20130125.14.41.38.10297.report
APPLICATION HANG DETECTION: Ended on Fri Jan 25 14:46:47 EST 2013
If a hang or multiple hangs are detected, then a report file is generated that lists the hanging applications:
cat /home/hotel32/shenli/sqllib/db2dump/db2_hang_analyze.20130125.14.41.38.10297.report
APPLICATION HANG DETECTION: Started on Fri Jan 25 14:41:38 EST 2013
Sleeptime : 60 seconds
Timer Limit : 300 seconds
Node Member : default
Retry Limit : 3
Log : yes
SQL : no
Event Metrics Available : yes
Logfile : db2_hang_analyze.20130125.14.41.38.10297.log
Script PID : 10297
CPU Threshold : 0.1%
Post Detection Script : none
Path : /home/hotel32/shenli/sqllib/db2dump/
POTENTIAL APPLICATIONS HANGING: 3 application(s).
Apphdl : 7
Status : CommitActive
AgentEDUID : 16
Apphdl : 9
Status : CommitActive
AgentEDUID : 28
Apphdl : 19
Status : CommitActive
AgentEDUID : 37
APPLICATION HANG DETECTION: Ended on Fri Jan 25 14:46:47 EST 2013
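If the flagged applications need to be passed to another tool, the report can be reduced to one line per application. The following is a minimal sketch that assumes the "Field : value" layout shown in the sample report above; the function name list_hung_apps is hypothetical, and the real report may differ in whitespace or contain additional fields.

```shell
#!/bin/sh
# Hypothetical helper: read a db2_hang_analyze report on stdin and print
# "apphdl status agent_eduid" for each application entry it contains.
list_hung_apps() {
    awk -F'[ \t]*:[ \t]*' '
        $1 ~ /^[ \t]*Apphdl$/     { hdl = $2 }
        $1 ~ /^[ \t]*Status$/     { status = $2 }
        $1 ~ /^[ \t]*AgentEDUID$/ { print hdl, status, $2 }
    '
}

# Example with one entry copied from the sample report.
list_hung_apps <<'EOF'
Apphdl : 7
Status : CommitActive
AgentEDUID : 16
EOF
# prints: 7 CommitActive 16
```

To process a real report, redirect the file into the function, for example `list_hung_apps < report_file`, where report_file is the path shown on the VIEW REPORT AT line.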
You can control how the script detects possibly hanging applications by changing some of its options:
$HOME/sqllib/samples/pd/db2_hang_analyze -db sample -member 0 -log -timerlimit 30 -sleeptime 15 -sql -path /TMP/
Invoked: /home/hotel32/shenli/sqllib/samples/pd/db2_hang_analyze -db sample -member 0 -log -timerlimit 30 -sleeptime 15 -sql -path /TMP/
APPLICATION HANG DETECTION: Started on Fri Jan 25 15:38:27 EST 2013
Sleeptime : 15 seconds
Timer Limit : 30 seconds
Node Member : 0
Retry Limit : 3
Log : yes
SQL : yes
Event Metrics Available : yes
Logfile : db2_hang_analyze.20130125.15.38.27.10189.log
Script PID : 10189
CPU Threshold : 0.1%
Post Detection Script : none
Path : /TMP
Press CTRL-C or CTRL-Z to terminate script
Pre-loop setup...
Iteration 1: No hang found.
Iteration 2: POSSIBLE HANG DETECTED!
Logfile : /TMP/db2_hang_analyze.20130125.15.38.27.10189.log
VIEW REPORT AT : /TMP/db2_hang_analyze.20130125.15.38.27.10189.report
APPLICATION HANG DETECTION: Ended on Fri Jan 25 15:39:03 EST 2013
In this example, the user wants to check whether any applications are hanging on member 0. The db2pd command metrics are gathered every 15 seconds, instead of the default 60 seconds. An application is identified as hanging if its metrics do not change within 30 seconds, instead of the default of 300 seconds. After a hang is detected, the latest SQL statement that relates to the hanging application is printed in the report file. The report and log files are written to the /TMP/ directory.
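In both sample runs, the hang is reported at the iteration whose number equals the timer limit divided by the sleep time (300 / 60 = 5 and 30 / 15 = 2). The sketch below reproduces that arithmetic. It is an observation drawn from the sample output, not documented behavior of the script, and the function name iterations_to_flag is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: number of polling iterations needed to cover one
# timer-limit window, rounded up when the values do not divide evenly.
iterations_to_flag() {
    timerlimit=$1
    sleeptime=$2
    echo $(( (timerlimit + sleeptime - 1) / sleeptime ))
}

iterations_to_flag 300 60   # defaults used in the first run, prints 5
iterations_to_flag 30 15    # values used in the second run, prints 2
```

A shorter sleep time with the same timer limit means more db2pd samples are taken before an application can be flagged, at the cost of more frequent polling.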
The following report file displays the results. One application is possibly hanging.
cat /TMP/db2_hang_analyze.20130125.15.38.27.10189.report
APPLICATION HANG DETECTION: Started on Fri Jan 25 15:38:27 EST 2013
Sleeptime : 15 seconds
Timer Limit : 30 seconds
Node Member : 0
Retry Limit : 3
Log : yes
SQL : yes
Event Metrics Available : yes
Logfile : db2_hang_analyze.20130125.15.38.27.10189.log
Script PID : 10189
CPU Threshold : 0.1%
Post Detection Script : none
Path : /TMP
POTENTIAL APPLICATIONS HANGING: 1 application(s).
Apphdl : 51
Status : UOW-Executing
AgentEDUID : 44
Current query : select * from staff
Last query: : none
APPLICATION HANG DETECTION: Ended on Fri Jan 25 15:39:03 EST 2013
The script does not consider applications that are in a lock wait state to be hanging.
The script requires Perl v5.6.0 or higher.
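One way to confirm the Perl requirement before running the script is to ask the interpreter itself. This sketch relies on the Perl `require VERSION` form, which fails if the interpreter is older than the stated version; the function name perl_meets_minimum is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the given perl executable
# (default: the one found on PATH) is version 5.6.0 or later.
perl_meets_minimum() {
    "${1:-perl}" -e 'require 5.006;' >/dev/null 2>&1
}

if perl_meets_minimum; then
    echo "Perl is new enough for db2_hang_analyze"
else
    echo "Perl 5.6.0 or later was not found"
fi
```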