DB2 Version 10.1 for Linux, UNIX, and Windows

Scenario: Identifying activities that are taking too long to complete

Workload management table functions simplify the task of identifying a specific activity inside the data server and, if necessary, canceling it without having to end the entire application.

Identifying an activity that is taking too long to complete

Following is an example of identifying a long-running query. Assume that a user from the Sales department who is running the SalesReport application complains that the application is taking too long to complete.

After identifying the application handle, use the WLM_GET_WORKLOAD_OCCURRENCE_ACTIVITIES table function to look up all activities currently running in this application. For example, if the application handle is 1, your query might resemble the following one:
SELECT SUBSTR(CHAR(COORD_MEMBER),1,5) AS COORD,
       SUBSTR(CHAR(MEMBER),1,4) AS MEMB,
       SUBSTR(CHAR(UOW_ID),1,5) AS UOWID,
       SUBSTR(CHAR(ACTIVITY_ID),1,5) AS ACTID,
       SUBSTR(CHAR(PARENT_UOW_ID),1,8) AS PARUOWID,
       SUBSTR(CHAR(PARENT_ACTIVITY_ID),1,8) AS PARACTID,
       SUBSTR(ACTIVITY_TYPE,1,8) AS ACTTYPE,
       SUBSTR(CHAR(NESTING_LEVEL),1,7) AS NESTING
   FROM TABLE(WLM_GET_WORKLOAD_OCCURRENCE_ACTIVITIES(1, -2)) 
       AS WLOACTS
   ORDER BY MEMB, UOWID, ACTID

COORD MEMB UOWID ACTID PARUOWID PARACTID ACTTYPE NESTING
----- ---- ----- ----- -------- -------- -------- -------
    0    0     2     3        -        -     CALL       0
    0    0     2     5        2        3 READ_DML       1
The activity is identified as having a unit of work ID of 2 and an activity ID of 5. You can then use the WLM_GET_SERVICE_CLASS_AGENTS table function to discover what the agents that work on this activity are doing:
SELECT APPLICATION_HANDLE, UOW_ID, ACTIVITY_ID,
       SUBSTR(REQUEST_TYPE,1,8) AS REQUEST_TYPE,
       SUBSTR(EVENT_TYPE,1,8) AS EVENT_TYPE,
       SUBSTR(EVENT_OBJECT,1,8) AS EVENT_OBJECT
   FROM TABLE(WLM_GET_SERVICE_CLASS_AGENTS('', '', CAST(NULL AS BIGINT),-2))
       AS AGENTS
   WHERE APPLICATION_HANDLE = 1
   AND UOW_ID = 2
   AND ACTIVITY_ID = 5
For example, the activity might be queued, executing, or waiting on a lock. If the activity were queued, the result would be:
APPLICATION_HANDLE UOW_ID ACTIVITY_ID REQUEST_TYPE EVENT_TYPE EVENT_OBJECT
------------------ ------ ----------- ------------ ---------- ------------
                 1      2           5         OPEN       WAIT    WLM_QUEUE
If the activity were executing, the result would be:
APPLICATION_HANDLE UOW_ID ACTIVITY_ID REQUEST_TYPE EVENT_TYPE EVENT_OBJECT
------------------ ------ ----------- ------------ ---------- ------------
                 1      2           5          OPEN   PROCESS      REQUEST
If the activity were waiting on a lock, the result would be:
APPLICATION_HANDLE UOW_ID ACTIVITY_ID REQUEST_TYPE EVENT_TYPE EVENT_OBJECT
------------------ ------ ----------- ------------ ---------- ------------
                 1      2           5         OPEN    ACQUIRE         LOCK
When you know what the activity is doing, you can proceed appropriately:
  • If the activity is queued, if the user indicates that the query is taking so long that they no longer care about the results, or you think the query is consuming too many resources, you can cancel it.
  • If the activity is important and it is queued, consider cancelling some other less important work that is currently running (reducing the concurrency so that activities leave queue), or maybe the user will be satisfied to know that work is not hanging and is just waiting for chance to run.
  • If the activity is waiting for a lock, you can use the snapshot monitor to investigate which locks the application is waiting for.
  • If the activity is waiting for a lock held by lower priority activity, consider cancelling that activity.

You might also find it useful to know the DML statement that activity 5 is running. Assuming that you have an active activities event monitor, you can run the WLM_CAPTURE_ACTIVITY_IN_PROGRESS procedure to capture information about the DML statement and other information about activity 5 while it is running. Unlike the statement event monitor, the WLM_CAPTURE_ACTIVITY_IN_PROGRESS procedure permits you to capture information about a specific query, as opposed to every statement running at the time. You can also obtain the statement text by using MON_GET_ACTIVITY_DETAILS.

If you decide that you must cancel the activity, you can use the WLM_CANCEL_ACTIVITY routine to cancel the activity without having to end the application that issued it:
CALL WLM_CANCEL_ACTIVITY (1, 2, 5)

The application that issued the activity receives an SQL4725N error. Any application that handles negative SQL codes is able to handle this SQL code.

Identifying an activity hang caused by lock contention

Assume that you have a situation in which a user is complaining about an application that is taking too long. Also assume that you have either the application name or the authorization ID of the long-running application. With this information, you can use the LIST APPLICATIONS command to obtain the application handle. Assuming that application handle returned by the LIST APPLICATIONS command is 2, you can use the WLM_GET_SERVICE_CLASS_AGENTS table function to determine which agents are working on this activity. Your query might resemble the following one:

SELECT SUBSTR(CHAR(APPLICATION_HANDLE),1,7) AS APPHANDLE,
       SUBSTR(CHAR(MEMBER),1,4) AS MEMB,
       SUBSTR(CHAR(AGENT_TID),1,9) AS AGENT_TID,
       SUBSTR(AGENT_TYPE,1,11) AS AGENTTYPE,
       SUBSTR(EVENT_OBJECT,1,11) AS EVENTOBJECT,
       SUBSTR(REQUEST_TYPE,1,7) AS REQTYPE,
       SUBSTR(CHAR(UOW_ID),1,6) AS UOW_ID,
       SUBSTR(CHAR(ACTIVITY_ID),1,6) AS ACT_ID
   FROM TABLE(WLM_GET_SERVICE_CLASS_AGENTS('', '', 2, -2)) AS SCDETAILS
   ORDER BY APPHANDLE, MEMB, AGENT_TID

APPHANDLE MEMB AGENT_TID AGENTTYPE   EVENTOBJECT  REQTYPE UOW_ID ACT_ID
--------- ---- --------- ----------- -----------  ------- ------ ------
2         0    1         COORDINATOR REQUEST      OPEN    2      1
2         1    3         SUBAGENT    LOCK         -       2      1

The results indicate that agent 1 is waiting on a remote response. Looking at the agent on the remote member that is working on the same activity, the EVENTOBJECT field indicates that the agent is waiting to obtain a lock.

The next step is to determine who owns the lock. You can obtain this information by turning on the monitor switches and using the snapshot monitor table function, as shown in the following example:
SELECT AGENT_ID AS WAITING_FOR_LOCK,
       SUBSTR(APPL_ID_HOLDING_LK,1,40) AS HOLDING_LOCK,
       CAST(LOCK_MODE_REQUESTED AS SMALLINT) AS WANTED,
       CAST(LOCK_MODE AS SMALLINT) AS HELD
   FROM TABLE(SNAPSHOT_LOCKWAIT('SAMPLE',-1)) AS SLW

WAITING_FOR_LOCK     HOLDING_LOCK                             WANTED HELD
-------------------- ---------------------------------------- ------ ------
                   2 *LOCAL.DB2.060131021547                       9      5
You can also determine the lock owner by using the following sequence of commands:
db2pd -db database alias -locks
db2pd -db database alias -transactions

If you want to cancel the long-running activity, you can use the WLM_CANCEL_ACTIVITY procedure. If the successful completion of the long-running application is more important than the successful completion of the lock-owning application, you can force the lock-owning application.