IBM Support

Collecting data: DB2 Stored Procedures Problems

Question & Answer


Question

MustGather information for DB2® for Linux, UNIX®, and Windows® routine (stored procedure or user defined function) related problems. Gathering this information before calling IBM® support helps familiarize you with the troubleshooting process and saves you time.

Cause

This section lists questions regarding the conditions under which the problem occurred.

  • Are you able to reproduce this behavior from command line? If so, can a testcase be provided?
  • Have there been any recent changes to the file system?
  • Have there been any recent changes to the instance or the database ?
  • Has it worked before and suddenly stop working or you just newly installed the stored procedure?
  • Is the problem intermittent?
  • What type of stored procedure it is ? The answer includes C stored procedure, Java stored procedure, COBOL stored procedure, SQL stored procedure and build-in stored procedure...
  • Is it possible to provide the definition of stored procedure and binary library if there is any ?

Answer

Since stored procedure is running on server side, it is always recommended to test the stored procedure from command line on server side first. Successfully running on server side rules out stored procedure problem.


Please collect the following data based on the different types of stored procedure problem you are encountering:


Stored procedure returns error code

1. DB2 trace on server side even if the stored procedure is invoked from client side.

    1. db2trc on -t -f server.dmp  
    2. <recreate the problem>
    3. db2trc off
    4. db2trc fmt server.dmp server.fmt; db2trc flw -t server.dmp server.flw

1a. If regular trace is not workable due to large size, try trace the return code

    1. db2trc on -t -rc <return code>
    2. <recreate the problem>
    3. db2trc dmp server.dmp
    4. db2trc off
    5. db2trc fmt server.dmp server.fmt; db2trc -t flw server.dmp server.flw


2. Db2support information from server side

    • db2support . -g -s -a


3. Definition of stored procedure (CREATE PROCEDURE...statement)

4. Additional information related to external stored procedure

Java stored procedure

  • $JDK_PATH/jre/bin/Java -version  #JDK_PATH is a parameter defined in DBM CFG
    • set >set.out
    • ls -lr /usr/lib  (Linux only)
    • set DIAGLEVEL to 4, some JVM related information will be dumped to db2diag.log.


C stored procedure

  • library for stored procedure
    • C compiler version


Cobol stored procedure

  • Name and version of COBOL compiler
    • Compiling and linking option


Note: It is recommended to recreate the problem in DEV/TEST system if possible. Depending on how busy the system, the trace can become huge. Try to take trace at non-peak time to reduce both the performance overhead and the trace file size.


SQL stored procedure performance issue (instrumentation)


SP_SELECTCOUNT accepts an input value and inserts it into a table called t1 then returns the number of rows in the table after the insert.

DEBUG.RTN_STMT_PERF uses MON_GET_PKG_CACHE_STMT table function to return statistics regarding individual statements within the routine. The query can be modified to include additional metrics like LOCK_ESCALS, ROWS_MODIFIED, ROWS_READ, etc... Refer to docs for MON_GET_PKG_CACHE_STMT along with the units returned for each metric, for example STMT_EXEC_TIME is in milliseconds.

Need to set the following DBM CFG parameters in order for debug.rtn_stmt_perf to collect diagnostics:

mon_rtn_execlist = ON
mon_rtn_data = BASE


sample.db2
-- To execute: db2 -td@ -vf sample.db2

CREATE PROCEDURE sp_selectcount(IN p_val INT, OUT p_rows INT)
LANGUAGE SQL
P1: BEGIN

insert into t1 values(p_val);
select count(*) into p_rows from t1;

END P1 @

-- Display diagnostics for individual statements inside SQL SP
-- INPUTS:
-- p_rtnschema: Schema for stored procedure/udf
-- p_rtnname: Name for stored procedure/udf

CREATE PROCEDURE debug.rtn_stmt_perf(IN p_rtnschema VARCHAR(50), IN p_rtnname VARCHAR(80))
LANGUAGE SQL
result sets 1
P1: BEGIN

-- Display SQL statements within SQL routine
-- SECTION_TYPE = S for static SQL
-- BTYPE = K for SQL routine

declare c cursor with return for
select substr(t.package_name,1,20) as pkgname, section_number, substr(stmt_text,1,500) as stmt, total_cpu_time, num_executions, stmt_exec_time, lock_wait_time, rows_read, ROWS_MODIFIED, TOTAL_SECTION_SORT_TIME, DEADLOCKS, LOG_DISK_WAIT_TIME
from table(mon_get_pkg_cache_stmt(null, null, null, -2)) as t, syscat.routinedep rd, syscat.routines r
where section_type='S'
and t.package_name = rd.bname
and r.specificname=rd.specificname
and r.routineschema=rd.routineschema
and rd.btype='K'
and r.routineschema = p_rtnschema
and r.routinename = p_rtnname
order by total_cpu_time desc;

open c;
END P1 @


Sample Output

call sp_selectcount(100,?)

Value of output parameters
--------------------------
Parameter Name : P_ROWS
Parameter Value : 12

Return Status = 0

call debug.rtn_stmt_perf('DB2INST1','SP_SELECTCOUNT')

Result set 1
--------------

PKGNAME SECTION_NUMBER
-------------------- --------------------
P138716953 1
P138716953 2

STMT TOTAL_CPU_TIME
-------------------------------------------------- --------------------
insert into T1 values(:HV00008 :HI00008 ) 933
select count(*) into :HV00009 :HI00009 from T1 13


NUM_EXECUTIONS STMT_EXEC_TIME LOCK_WAIT_TIME
-------------------- -------------------- --------------------
1 2 0
1 0 0

2 record(s) selected.

Return Status = 0



Notes: In some cases nothing will be returned if the SQL statement has already been removed from DB2's package cache.

Another alternative is to add adding debug instrumentation to the routine.

sptest.db2
-- Script to create stored procedure.
-- Execute: db2 -td@ -vf sptest.db2

CONNECT TO SAMPLE @
CREATE TABLE DEBUG(DESC CHAR(30), TIME TIMESTAMP)@

DROP PROCEDURE sp_dummy @

CREATE PROCEDURE SPTEST()
LANGUAGE SQL
BEGIN

-- Record timestamp when routine starts
INSERT INTO DEBUG VALUES('SPTEST - Entry', current timestamp);

INSERT INTO T1 VALUES(38);

-- Record timestamp when routine exits
INSERT INTO DEBUG VALUES('SPTEST - Exit', current timestamp);

RETURN 0;
END @

TERMINATE @


DBMS_OUTPUT module

This allows the routine to write output to the console. Useful when trying to debug invalid data used inside the stored procedure because variable values can be output. Refer to docs for more details.

SQL stored procedure performance issue (no error code)

1. Take the SQL statement out and run it from CLP, is it slow ?

2. Is statistics updated? If not, rebind package or recreate the stored procedure after REORG, RUNSTATS. Consider using REOPT ONCE/ALWAYS on rebind packages / recreate of the stored procedure. To find the corresponding package name for a SQL stored procedure, see How to map stored procedure and user defined function (UDF) names to packages

3. DB2 perfcount trace

    1. db2trc on -perfcount -t 
    2. <recreate the problem>
    3. db2trc dmp server.dmp
    4. db2trc off
    5. db2trc perffmt server.dmp server.perfmtrc


DB2 perfcount trace will count the frequency of each function and add up the CPU time spend on each function. It will be less useful if it is not running in a dedicated system.

4. DB2 trace

    1. db2trc on -t -f trace.dmp 
    2. <recreate the problem>
    3. db2trc off
    4. db2trc fmt server.dmp server.fmt; db2trc flw -t server.dmp server.flw

5. Collect stack information

    • db2pd -stack all -repeat <interval>
    • db2pd -stack <pid> -repeat <interval>  
  • #if specific PID is found


6. db2support . -g -s -a 

7. Collect new access plan after rebind and recompile.  For instructions see Collecting explain data for SQL stored procedures in DB2 or Collecting explain data for SQL stored procedures in Related URL section.



Non-SQL stored procedure performance issue (hang, slow performance, high CPU, high Memory)

Java routine tracing

JDBC driver trace can be enabled if needed. It will provide timestamps for each operation.

  1. db2set DB2_JVM_STARTARGS="-Ddb2.jcc.override.traceLevel=-1 -Ddb2.jcc.override.traceFile=c:\tmp\sptrace.txt -Ddb2.jcc.override.traceFileAppend=true"
  2. Restart DB2 instance for change to take effect, unless DBM CFG parameter KEEPFENCED=NO


Run the attached script collect_db2fmp.sh.gz to automatically collect the needed diagnostics or perform manual collect further below.
This script does not enable the JDBC driver trace above.


Automated Collection (easier)

$ ./collect_db2fmp.sh v11 /home/db2inst1/test7

Starting script to collect diagnostics for all db2fmp processes
Detected Linux operating system

DATABASE: v11
Diagnostics directory: /home/db2inst1/test7
Created output directory /home/db2inst1/test7

Collecing diags for db2fmp PID: 2289
Collecing diags for db2fmp PID: 2291
Collecing diags for db2fmp PID: 2292
Collecing diags for db2fmp PID: 2293
Collecing diags for db2fmp PID: 2351
Collecing diags for db2fmp PID: 3085
Collecing diags for db2fmp PID: 3092
Process 3092 not found.
Collecting general DB2 fmp diags
Attempting to produce all stack traces for database partition.
See /home/db2inst1/test7/ directory for stack trace file.
================================
Finished script
Diagnostics in current directory
================================


collect_db2fmp.sh.gzcollect_db2fmp.sh.gz

Notes:

  • Script does not execute "kill -3" to generate javacores since db2fmp is owned by db2fenc1 instead of db2inst1
  • Script works on AIX, Linux and Solaris. Not been tested on HP-UX


Manual Collection
An easy method to collect everything output to the console without redirecting the output to a file is to run the "script" command. The output is then written to a file called "typescript". For this example, we assume everything is captured by the "script" command.

Example:
$ script
Script command is started. The file is typescript.

(Execute commands from steps #1 to #3)

$ (Ctrl-D when done)
Script command is complete. The file is typescript.


1. Collect thread level stack information


  • AIX: procstack <pid_fmp>
    Linux: gstack <pid_fmp>
    Solaris and HP-UX: pstack <pid_fmp>

    AIX: procmap -F <pid_fmp>
    Linux: pmap -x <pid_fmp>

    Note: The commands above may need to be run as DB2 fenced id or root.

    Wait a few seconds after running the command above, then run the same command again. If re-directing to a file, remember to use a different filename or the append ">>" character.

    For all platforms running Java routines run the command below to create files in /tmp/javacore*. Normally the db2fmp process will be called "db2fmp (C)" for C routines and "db2fmp (Java)" for Java routines.

    kill -3 <pid_fmp>

    Optional: Collect two corefiles of process several seconds apart.

    AIX: gencore <pid_fmp> <output file>
    Linux: gcore -o <output file> <pid_fmp>




  •  


2. Collect thread level CPU usage

AIX: ps -mp <pid_fmp> -o THREAD,rssize,vsz
Solaris: ps -Le -o pid,user,s,lwp,pcpu,pmem,vsz,args
Linux: ps -Lf <pid_fmp>
HP-UX: Please check with your system administrator

All platforms:
ps -elf
vmstat 1 5




3. Collect FMP memory information

    1. db2pd -fmp -applications -alldbs -dynamic -edus-repeat <interval>
    2. db2pd -memblocks fmp -repeat 2 3
    3. db2pd -memblocks pid=<pid_fm> -repeat 2 3
    4. ps aux |grep <pid_fmp>
    5. AIX: svmon -P <pid_fmp>
    6. If running "script" it can be stopped on this step since the diagnostics collected here will be re-directed to the file fmpHist.txt.  The command below works only on V9.5 FP5 and higher or V9.7 FP1 higher.
  • $ db2pd -fmpexechistory n=128 genquery>query.db2

    Or to get execution for specific db2fmp process
    $ db2pd -fmpexechistory pid=<pid_fmp> n=128 genquery>query.db2
    • Open query.db2
    • Remove the first line of the output which should begin with the words “Database Partition”
    • Add a “@” symbol to the end of the last line in the file which should end with the words “R.TID, R.RTNTIME”.
    Modified query.db2
    Database Member 0 -- Active -- Up 0 days 00:11:00 -- Date 2014-12-05-11.48.32.224658

    WITH RTNHIST ( PID, TID, RTNID, RTNTIME) AS
    ( VALUES ( 20447370 , 772 , 65932 , TIMESTAMP('2014-12-05-11.44.41.088546')),
    ( 20447370 , 772 , 65822 , TIMESTAMP('2014-12-05-11.42.43.807698')),
    ( 20447370 , 772 , 65822 , TIMESTAMP('2014-12-05-11.42.43.335626')),
    ( 20447370 , 772 , 65822 , TIMESTAMP('2014-12-05-11.42.43.191981')),
    ( 20447370 , 772 , 65822 , TIMESTAMP('2014-12-05-11.42.43.089549')),
    ( 20447370 , 772 , 65822 , TIMESTAMP('2014-12-05-11.42.41.841561')),
    ( 20447370 , 1029 , 65822 , TIMESTAMP('2014-12-05-11.47.41.175732'))
    )
    SELECT R.PID, R.TID, R.RTNTIME, ROUTINESCHEMA, ROUTINENAME, SPECIFICNAME, ROUTINEID
    FROM syscat.routines, RTNHIST as R
    where ROUTINEID = R.RTNID
    ORDER BY R.PID, R.TID, R.RTNTIME@

    Once the output file has been modified the script can be run by connecting to the database and issuing the command.

    db2 -td@ -vf query.db2 >fmpHist.txt

    The fmpHist.txt should contain output similar to the output below:

    PID TID RTNTIME ROUTINESCHEMA ROUTINENAME SPECIFICNAME ROUTINEID

    20447370 772 2014-12-05-11.42.41.841561 SYSPROC SYSINSTALLOBJECTS SYSINSTALLOBJECTS 65822


4. This creates db2support.zip in the current directory: db2support . -g -s



Non-SQL stored procedure performance issue (abnormal termination)

These are normally caused by signals such as Abort or Segmentation fault (Signal #11) which causes the routine to abnormally terminate.

Java routines

Collect /tmp/javacore* files and send to IBM DB2 support.

Sometimes Java routine crashes are caused by defects within the IBM JDK. One common issue is an incompatibility with older versions of the IBM JDK with AIX Power7 processors. More details in the link "db2setup or db2fmp (Java) abnormally terminated on Power7".

C routines

The db2fmp process's ulimit is inherited from the id which starts the DB2 instance. This is typically the DB2 instance owner, but in some customer configurations they may use a different id such as root.

The ulimit command can be used to verify coredump=unlimited

$ ulimit -a


time(seconds) unlimited
file(blocks) unlimited
data(kbytes) unlimited
stack(kbytes) 4194304
memory(kbytes) unlimited
coredump(blocks) unlimited


Modify the instance user id's .profile or the id used to start the DB2 instance so that it it executes the command below:

ulimit -c unlimited

Set DB2 registry variable to enable core dumps for routines.

db2set DB2_ROUTINE_DUMPCORE=ON

This causes core files to be placed in ~/sqllib/db2dump/core*

Restart the DB2 instance if the ulimit was modified

Once a core file has been created, ask the author of the routine to review. The corefile can be loaded into a debugger. For example using the GDB debugger point it to the location of the executable db2fmp and the core file.

GNU Debugger
gdb -e /home/db2inst1/sqllib/adm/db2fmp -c /home/db2inst1/sqllib/db2dump/core123456

/* This logs gdb output to the file gdb.txt */
gdb> set logging on

/* List all the threads inside the process */
gdb> info threads

/* Display backtrace/stacks for all threads */
gdb> thread apply all bt

gdb> set logging off

gdb> quit


DBX Debugger

/* Write anything which appears on console to dbx.out */
script dbx.out

dbx /home/db2inst1/sqllib/adm/db2fmp /home/db2inst1/sqllib/db2dump/core123456

/* List all threads inside the process */
(dbx) thread

/* For all thread numbers appearing */
(dbx) thread current <thread number>
(dbx) where

(dbx) quit

/* End script command */
Ctrl-D

 
Infosphere Federation db2fmp hang

1) Get the Coordinator pid/thread or CoordEDUID associated with the
application

/* Look for Coordinator pid/thread */

db2 "list applications show detail"

OR

/* Look for CoordEDUID */
db2pd -applications -db <db name>


2) Find the CoordEDUID in the output to find the PID of the db2fmp.  If CoordEDUID from step #1 above is 8382, then the operating system PID of db2fmp hanging is 6784

db2pd -fmp

(truncated output)

   Active Threads:
  Address    FmpPid     EduPid     ThreadId
  012332123   6784     8382


3.0) Get function stacks for the Federated db2fmp process

3.1) This command saves all console output to a file called "typescript"

$ script
Script command is started. The file is typescript.

3.2) Get stacks. The output can be redirected to a text file as well instead of outputting to console.

AIX: procstack <pid_fmp>
Linux: gstack <pid_fmp>
Solaris and HP-UX: pstack <pid_fmp>

AIX: procmap -F <pid_fmp>
Linux: pmap -x <pid_fmp>

3.3) Wait few seconds and collect same output in step 3.2.  Don't over write the file in 3.2 if redirecting output to a file.

3.4) Get thread level CPU usage for the db2fmp PID

AIX: ps -mp <pid_fmp> -o THREAD,rssize,vsz
Solaris: ps -Le -o pid,user,s,lwp,pcpu,pmem,vsz,args
Linux: ps -Lf <pid_fmp>
HP-UX: Please check with your system administrator

3.5) Ctrl-D
(Saves all output to the file called typscript)

$
Script command is complete. The file is typescript.

What to do next
Once you have collected the preceding information, submit the diagnostic information to IBM support.


Miscellaneous: Attaching debugger or trace

This should only be performed if instructed by DB2 Support.

Window #1
1) db2stop force

2) This step is optional since it just changes the fenced userid to the
same as the instance owner to make debugging easeri:
cd sqllib/adm

mv .fenced fenced.bkup
touch .fenced

3) db2start

4) db2 "connect to sample"
(or any db where the java routine has been cataloged and jar installed)


db2 "get snapshot for all applications">appsnap.txt

5)
Java routines:
db2trc on -t -f trace.dmp -crash "DB2.SQLO.sqloJVMstart.entry" -suspend

C routines:
db2trc on -t -f trace.dmp -crash "DB2.SQLER.sqlerSendFmpStart.entry" -suspend


6) Call the UDF, it should be suspended when it hits the code point in
#5 above.
<move to window 2>


Window 2
7) use db2pd -fmp to  find the pid of the new fmp


D:\db2v97\BIN>db2pd -fmp

Database Partition 0 -- Active -- Up 0 days 06:13:39 -- Date 11/21/2011
14:12:34


FMP:
Pool Size:       0
Max Pool Size:   200 ( Automatic )
Keep FMP:        YES
Initialized:     YES
Trusted Path:    D:\db2v97\function\unfenced
Fenced User:     DB2

Shared Memory:   0xFDFEBFE0
IPC Pool:        0xFDFE9FF8

FMP Process:
Address    FmpPid     Bit   Flags      ActiveThrd PooledThrd ForcedThrd
Active I
PCList
0xFE861E58 5068       32    0x00000002 0          1          0
Yes    0
xFE9ADFE8

  Active Threads:
  Address    FmpPid     EduPid     ThreadId
  012332123   6784     8382

6784 is the PID of the db2fmp process.  If there are multiple processes
find the corresponding EDUPID=8382 in the application snapshots
(appsnap.txt) to get more information about this application and to
confirm whether it is executing your routine

appsnap.txt
Current database partition number          = 0
Coordinator agent process or thread ID     = 8382
Current Workload ID                        = 0
...
(SQL statement will be at the bottom)




Use a debugger to attach to the PID of the db2fmp process to find out what code it is executing when it crashes.

Another alternative is to attach ano operating system trace or debugger to the PID of the db2fmp process to determine if any O/S calls fail.

AIX:     truss -ealf -o output.file -p <PID of db2fmp process> &
Linux: strace -f -o trace.txt -p <PID of db2fmp process> &
Solaris: truss -fae -o trace.txt -p <PID of db2fmp process> &
Others: Check O/S documentation for traces which can trace o/s API calls.

 
Example (Linux):

strace -f -o trace.txt -p <PID of db2fmp process> &

9) Continue executing the stored procedure.  This should unhang Window 1

db2trc chg -resume
Trace has been changed
(Execution will continue in Window #1)

db2trc off
10) Stop strace

kill -9 <PID of strace>
db2trc flw -t trace.dmp trace.flw
db2trc fmt trace.dmp trace.fmt
db2trc fmt -c trace.dmp trace_drda.fmt


Miscellaneous: Additional DB2 registry variables

This should only be performed if instructed by DB2 Support.

DB2_ROUTINE_STDFILEIO=Y

If the routine writes to standard out/error, this registry variable will place the output into the files ~/sqllib/db2dump/db2fmp<PID>.stdout and db2fmp<PID>.stderr respectively.

[{"Business Unit":{"code":"BU058","label":"IBM Infrastructure w\/TPS"},"Product":{"code":"SSEPGG","label":"Db2 for Linux, UNIX and Windows"},"ARM Category":[{"code":"a8m500000008PmjAAE","label":"Routines - Stored Procedures and UDF->Java\/Non-SQL Routines"},{"code":"a8m500000008PmkAAE","label":"Routines - Stored Procedures and UDF->SQL Routines"}],"ARM Case Number":"","Platform":[{"code":"PF002","label":"AIX"},{"code":"PF010","label":"HP-UX"},{"code":"PF016","label":"Linux"},{"code":"PF027","label":"Solaris"},{"code":"PF033","label":"Windows"}],"Version":"All Version(s)","Line of Business":{"code":"LOB10","label":"Data and AI"}}]

Document Information

Modified date:
30 July 2020

UID

swg21383251