IBM Support

Using stackit to gather debugging data on UNIX and Linux systems

Troubleshooting


Problem

You have a process which is hanging or using too much CPU, or which has crashed and left behind a core dump, and you need help determining the cause of the problem.

Resolving The Problem

The stackit script provides an easy way to examine running processes and core dumps on AIX, HP-UX, Linux and Solaris systems. Although stackit was written by the IBM MQ support team, it can be used to gather information about any process or core file.
 


 

Using stackit

In order to use stackit, you must first download the script to your system and make it executable, for example by running: chmod a+x stackit
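The setup step can be sketched as follows. This is a minimal example which assumes the script was saved as ./stackit in the current directory; the helper name prepare_stackit is illustrative and is not part of stackit.

```shell
#!/bin/sh
# prepare_stackit: make a downloaded copy of the script executable and
# confirm that the change took effect. The helper name is hypothetical.
prepare_stackit() {
    script=$1
    if [ ! -f "$script" ]; then
        echo "not found: $script" >&2
        return 1
    fi
    chmod a+x "$script" || return 1
    [ -x "$script" ]
}

# Assumed download location; adjust it to wherever you saved the script.
if prepare_stackit ./stackit; then
    echo "stackit is ready to run"
fi
```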

   

Syntax

 
stackit -?

stackit [-f File] [-o Options] {-m Match | -n Name | -p Pid}...

stackit [-f File] [-o Options] -c Core -e Executable
 

General options

-f File

Write the stackit output to a new file, which is important if you need to send it to IBM. Without this option, stackit will display its output to the screen only.


-o Options

Control what information stackit gathers. The default options are sufficient in most cases, but IBM may ask for custom options when diagnosing certain problems:


 Default: Usually stack,cred,map,ldd
     All: All possible data

   stack: Stack trace for all threads        All platforms
    cred: Security credentials               All platforms
     map: Library map information            All but some HP-UX
   smaps: Detailed address space map         Linux only
     ldd: Shared library dependencies        All platforms
  status: Process status information         All but HP-UX
   files: File descriptor usage              All but some HP-UX
    regs: Machine register contents          All platforms
     asm: Assembler instructions             AIX only
   locks: Mutexes and condition variables    AIX only
  thread: Detailed thread data               AIX only
coreinfo: Core file details                  AIX only
 coremap: Core file address space map        AIX only
     sig: Disposition for signals            All but Linux & HP-UX
    safe: Safe mode (avoids debuggers)       All platforms
 

Process selection flags

-m Match

Analyze processes whose command line contains a matching pattern. For example, if you use "-m FRED", then stackit will capture any process with the string "FRED" in its command line arguments. Regular expressions are permitted when matching patterns.


-n Name

Analyze processes by the executable program name. For example, if you use "-n runmqlsr" then stackit will capture any process named runmqlsr.


-p Pid

Analyze a process by its process identifier (pid). For example, if you use "-p 12345" then stackit will capture pid 12345, if it exists.

 

Core file flags

-c Core

The name of the core dump file stackit should analyze. If the core dump is called 'core', you should rename it first to prevent another process from overwriting it.
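One way to do the rename is sketched below. The helper name preserve_core and the naming scheme (core, followed by a timestamp and the shell's pid) are assumptions made for illustration; any unique name will do.

```shell
#!/bin/sh
# preserve_core: rename a file called "core" in the given directory so a
# later dump cannot overwrite it, then print the new name.
# The helper and its naming scheme are illustrative assumptions.
preserve_core() {
    dir=${1:-.}
    if [ ! -f "$dir/core" ]; then
        echo "no core file in $dir" >&2
        return 1
    fi
    new="$dir/core.$(date +%Y%m%d%H%M%S).$$"
    mv "$dir/core" "$new" && printf '%s\n' "$new"
}

# Rename ./core if it exists, then print the stackit command to run next.
# /path/to/program is a placeholder for the executable that dumped core.
if saved=$(preserve_core . 2>/dev/null); then
    echo "stackit -f coreinfo.txt -c $saved -e /path/to/program"
fi
```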


-e Executable

The path to the executable which created the core file. Please be sure to identify the right program, particularly if there are multiple versions on your system, or the stackit analysis will not be valid.

   

Usage Notes

The stackit script provides several features to make it easier to use at the command line.

 

Combining options

You can repeat the -o flag or provide a comma-separated list, or both, when selecting options. For example, the following commands are equivalent:

sh> stackit -f debug.txt -o stack -o locks -o asm -p 29723

sh> stackit -f debug.txt -o stack,locks -o asm -p 29723

sh> stackit -f debug.txt -o stack,locks,asm -p 29723
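The equivalence can be modelled with a short shell function. This sketch is not stackit's own option parser; merge_options is a hypothetical helper which only shows that the three spellings above reduce to the same list.

```shell
#!/bin/sh
# merge_options: combine the values of several -o flags into a single
# comma-separated list. Illustration only, not stackit's parsing code.
merge_options() {
    merged=
    for value in "$@"; do
        old_ifs=$IFS
        IFS=,
        for opt in $value; do          # split one -o value on commas
            merged=${merged:+$merged,}$opt
        done
        IFS=$old_ifs
    done
    printf '%s\n' "$merged"
}

merge_options stack locks asm      # as in: -o stack -o locks -o asm
merge_options stack,locks asm      # as in: -o stack,locks -o asm
merge_options stack,locks,asm      # as in: -o stack,locks,asm
```

Each of the three calls prints stack,locks,asm.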
 

Repeating process selection flags

You can use the -m, -n and -p flags together, and repeat them as many times as needed, in order to select all the processes you want stackit to analyze. However, a process will be analyzed only once, regardless of how many times it was matched.

Implicit process selection flags

You can even skip the -m and -p flags when selecting processes. When stackit sees extra arguments, it treats numerical values as process identifiers and other values as patterns to match. For example, the following commands are equivalent:

sh> stackit -f ibm.txt -m FRED -p 498932 -p 997106 -m WILMA

sh> stackit -f ibm.txt -m FRED -p 498932 997106 WILMA

sh> stackit -f ibm.txt FRED 498932 997106 WILMA
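The rule for extra arguments can be illustrated with a small sketch. The functions classify_arg and explicit_flags are hypothetical helpers which restate the documented behavior (numerical values become -p, everything else becomes -m); they are not taken from the stackit source.

```shell
#!/bin/sh
# classify_arg: print the explicit flag that corresponds to one extra
# argument, following the rule described above.
classify_arg() {
    case $1 in
        ''|*[!0-9]*) printf '%s\n' "-m $1" ;;   # not purely digits: a pattern
        *)           printf '%s\n' "-p $1" ;;   # digits only: a pid
    esac
}

# explicit_flags: rewrite a list of extra arguments in explicit form.
explicit_flags() {
    out=
    for arg in "$@"; do
        out="${out:+$out }$(classify_arg "$arg")"
    done
    printf '%s\n' "$out"
}

explicit_flags FRED 498932 997106 WILMA
```

The final call prints -m FRED -p 498932 -p 997106 -m WILMA, which matches the first command in the example above.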
   

Example 1

Read detailed help for stackit, customized for your system:

sh> stackit -?

 

Example 2

Generate default output to the file stackit.out from all processes whose command line matches the string TEST.QMGR:

sh> stackit -f stackit.out -m TEST.QMGR
 

Example 3

Generate stack and map data from processes with "mq" in their name to the file mqstackdata.txt:

sh> stackit -f mqstackdata.txt -o stack,map -n mq

 

Example 4

Generate default output and registers from processes 7029, 10737 and 41824 to the file regs.txt:

sh> stackit -f regs.txt -o default,regs 7029 10737 41824

 

Example 5

Generate stack data in safe mode to the file stack.txt from all processes with "db2" in their name, all processes with "AppSrv01" on their command line, and process 9297367:

sh> stackit -f stack.txt -o stack,safe -n db2 -m AppSrv01 -p 9297367

 

Example 6

Generate default data from a core file called core.11424 generated by the program /usr/local/bin/pmrouter and write it to the file coreinfo.txt:

sh> stackit -f coreinfo.txt -c core.11424 -e /usr/local/bin/pmrouter
   

Security

You must have authority to examine a process in order for stackit to succeed. In most cases you can analyze only the processes you started. However, some programs use the UNIX setuid/setgid bits to run as another user, and in most cases you must be root to examine such processes.


For example, IBM MQ uses the setuid/setgid bits to run its processes as the mqm user. If you were logged in as mqm when you started an MQ queue manager, the mqm user should be able to run stackit against the queue manager processes; otherwise, you must run stackit as root.
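Before running stackit, you can compare the owner of a process with your own user name. The sketch below is a rough check only: may_analyze is a hypothetical helper, and it does not detect setuid/setgid programs, which usually require root as described above.

```shell
#!/bin/sh
# may_analyze: succeed if user $2 is root or is the owner ($1) of the
# process. Hypothetical helper; it ignores the setuid/setgid case.
may_analyze() {
    p_owner=$1
    p_user=$2
    [ "$p_user" = root ] || [ "$p_user" = "$p_owner" ]
}

pid=$$                                   # example: this shell's own pid
owner=$(ps -o user= -p "$pid" 2>/dev/null | tr -d ' ')
me=$(id -un)

if [ -z "$owner" ]; then
    echo "process $pid not found"
elif may_analyze "$owner" "$me"; then
    echo "$me should be able to analyze process $pid"
else
    echo "run stackit as $owner or root to analyze process $pid"
fi
```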


If stackit cannot run against a process, it will tell you who can analyze it. Be sure to look for such messages before sending stackit output to IBM for review. For example:
 
stackit: Success rate: 0%
stackit: Stackit failed to analyze processes which belong to another user.
         Run stackit as scotty to analyze those processes.
 
The root user can examine any process, so run stackit as root if you are having difficulty.
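A quick way to look for such messages is to extract the success rate from the output file before you send it. This sketch assumes that the summary lines shown above are also written to the file named with -f, and it uses stackit.out as an example file name; success_rate is a hypothetical helper.

```shell
#!/bin/sh
# success_rate: print the percentage from the last "Success rate" line
# in a stackit output file. Hypothetical helper for illustration.
success_rate() {
    sed -n 's/^stackit: Success rate: \([0-9][0-9]*\)%.*$/\1/p' "$1" | tail -1
}

report=stackit.out                       # assumed output file name
if [ -f "$report" ]; then
    rate=$(success_rate "$report")
    if [ "${rate:-0}" -lt 100 ]; then
        echo "Only ${rate:-0}% of the selected processes were analyzed:"
        # Show which user stackit recommends, if it said so.
        grep 'Run stackit as' "$report" || :
    fi
fi
```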
   

Debuggers

Stackit gathers information using the tools provided by your operating system. One important tool stackit calls is a debugger, which it may use to examine live processes and always uses to examine core files. The debuggers supported by stackit on each operating system are:

 
  • AIX: The dbx debugger is in the bos.adt.debug LPP which is part of AIX.
  • HP-UX: The Wildebeest Debugger (WDB) is an HPE-supported implementation of the GNU Debugger and is available for free download from HP for Itanium systems.
  • Linux: The GNU Debugger (gdb) is available in the gdb RPM on many Linux distributions.
  • Solaris: The modular debugger (mdb) is part of the Solaris SUNWmdb package.
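To check whether the debugger for your platform is installed, you could use a sketch like the one below. The mapping follows the list above; debugger_for is a hypothetical helper, the command name gdb for WDB on HP-UX is an assumption, and the check only searches your PATH, which may differ from where stackit looks.

```shell
#!/bin/sh
# debugger_for: print the debugger command for a given `uname -s` value,
# following the list above. Hypothetical helper for illustration.
debugger_for() {
    case $1 in
        AIX)    echo dbx ;;
        HP-UX)  echo gdb ;;    # assumption: WDB is invoked as gdb
        Linux)  echo gdb ;;
        SunOS)  echo mdb ;;    # uname -s reports SunOS on Solaris
        *)      return 1 ;;
    esac
}

os=$(uname -s)
if dbg=$(debugger_for "$os"); then
    if command -v "$dbg" >/dev/null 2>&1; then
        echo "$os: found $dbg"
    else
        echo "$os: $dbg is not in PATH; stackit may gather less data"
    fi
else
    echo "$os is not one of the platforms listed above"
fi
```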


Stackit goes to great lengths to ensure that it does not interrupt or terminate processes when using a debugger, even if you kill or cancel stackit (e.g. by using Ctrl-C). However, there is a small chance that a fault in the debugger or in stackit itself could terminate a process. To minimize this risk, run stackit only when you are diagnosing a problem.

You can also add the 'safe' option to the stackit command line, preventing stackit from using debuggers against processes on your system. Stackit may not be able to gather all the data you requested, especially on HP-UX and Linux, but the safe option ensures it cannot terminate any processes by accident. Example 5 demonstrates the use of the 'safe' option.
   

Sample Output

This sample output shows the kind of information stackit can gather. In this example, stackit was able to analyze one process successfully, but lacked authority to analyze another:

  sh> stackit -o stack,asm runmqlsr inetd
stackit: V5.1 running on AIX 7.3 (powerpc) with arguments: -o stack,asm
         runmqlsr inetd
         

Analyzing process 9896164                       1 Mar 2024 at 16:46:10 GMT
==========================================================================

     PID     PPID  STARTED  EUSER  EGROUP  COMMAND
 9896164 10420476   Mar 01    mqm     mqm  runmqlsr -m V8QM -t TCP -p 1607


  Thread Stacks:

    9896164: /usr/mqm/bin/runmqlsr -m V8QM -t TCP -p 1607
    ---------- tid# 36700257 (pthread ID:      1) ----------
    0x0900000000112394  naccept(??, ??, ??) + 0xb4
    0x09000000117357e0  cciTcpListenConv() + 0xd00
    0x090000000818f894  ccxListenConv() + 0x274
    0x00000001000004dc  WaitForConnectLoop() + 0x7c
    0x00000001000011a8  main() + 0x888
    0x0000000100000288  __start() + 0x90
    ---------- tid# 38011017 (pthread ID:    772) ----------
    0x0900000000154834  __fd_poll(??, ??, ??) + 0xb4
    0x09000000076547a8  xcsWaitFd() + 0x9c8
    0x0900000007653d84  xcsWaitSocket() + 0x44
    0x0900000007fcde94  cccJobMonitor() + 0x11d4
    0x0900000007601c30  ThreadMain() + 0x15d0
    0x09000000004f4d30  _pthread_body(??) + 0xf0
    ---------- tid# 37945479 (pthread ID:    515) ----------
    0x0900000000507584  _event_sleep(??, ??, ??, ??, ??, ??) + 0x5a4
    0x0900000000508018  _event_wait(??, ??) + 0x2b8
    0x09000000005162c4  _cond_wait_local(??, ??, ??) + 0x4e4
    0x090000000051689c  _cond_wait(??, ??, ??) + 0xbc
    0x0900000000517508  pthread_cond_wait(??, ??) + 0x1a8
    0x090000000762d278  xtmTimerThread() + 0x378
    0x0900000007601c30  ThreadMain() + 0x15d0
    0x09000000004f4d30  _pthread_body(??) + 0xf0
    ---------- tid# 37486713 (pthread ID:    258) ----------
    0x0900000000507584  _event_sleep(??, ??, ??, ??, ??, ??) + 0x5a4
    0x090000000050c044  _p_sigtimedwait(??, ??, ??) + 0x4a4
    0x090000000762368c  xehAsySignalMonitor() + 0x7ec
    0x0900000007601c30  ThreadMain() + 0x15d0
    0x09000000004f4d30  _pthread_body(??) + 0xf0


  Assembler Instructions

    0x900000000507560 (_event_sleep+0x580)  beq  0x9000000005075f8
    0x900000000507564 (_event_sleep+0x584)   li  r6,0x0
    0x900000000507568 (_event_sleep+0x588)  ori  r5,r29,0x0
    0x90000000050756c (_event_sleep+0x58c) cmpi  cr4,0x1,r0,0x0
    0x900000000507570 (_event_sleep+0x590)  ori  r0,r0,0x0
    0x900000000507574 (_event_sleep+0x594)  ori  r0,r0,0x0
    0x900000000507578 (_event_sleep+0x598)  ori  r0,r0,0x0
    0x90000000050757c (_event_sleep+0x59c)  ori  r1,r1,0x0
    0x900000000507580 (_event_sleep+0x5a0)   bl  0x900000000508710
    0x900000000507584 (_event_sleep+0x5a4)   ld  r2,0x28(r1)
    0x900000000507588 (_event_sleep+0x5a8)  sli  r0,r3,0x0
    0x90000000050758c (_event_sleep+0x5ac)  stw  r3,0x70(r1)
    0x900000000507590 (_event_sleep+0x5b0) addi  r4,0x78(r1)
    0x900000000507594 (_event_sleep+0x5b4)   ld  r3,0x30(r31)
    0x900000000507598 (_event_sleep+0x5b8)  ori  r5,r31,0x0
    0x90000000050759c (_event_sleep+0x5bc)  ori  r6,r30,0x0
    0x9000000005075a0 (_event_sleep+0x5c0) cmpi  cr0,0x1,r3,0x0
    0x9000000005075a4 (_event_sleep+0x5c4) addi  r7,0x70(r1)
    0x9000000005075a8 (_event_sleep+0x5c8)  beq  0x90000000050744c
    0x9000000005075ac (_event_sleep+0x5cc)   bl  0x900000000507960
    0x9000000005075b0 (_event_sleep+0x5d0) cmpi  cr0,0x0,r3,0x0

Analyzing process 4194432                       1 Mar 2024 at 16:46:10 GMT
==========================================================================

     PID     PPID  STARTED  EUSER  EGROUP  COMMAND
 4194432  3866762   Jan 30   root  system  /usr/sbin/inetd


stackit: You must run stackit as root to analyze process 4194432.
stackit: Analysis failed for process 4194432.




Summary of results                              1 Mar 2024 at 16:46:11 GMT
==========================================================================
stackit: Success rate: 50%
stackit: Stackit failed to analyze processes which belong to another user.
         Run stackit as root to analyze those processes.    

DISCLAIMER: All source code and/or binaries attached to this document are referred to here as "the Program". IBM is not providing program services of any kind for the Program. IBM is providing the Program on an "AS IS" basis without warranty of any kind. IBM WILL NOT BE LIABLE FOR ANY ACTUAL, DIRECT, SPECIAL, INCIDENTAL, OR INDIRECT DAMAGES OR FOR ANY ECONOMIC CONSEQUENTIAL DAMAGES (INCLUDING LOST PROFITS OR SAVINGS), EVEN IF IBM, OR ITS RESELLER, HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.

Document Information

Modified date:
19 March 2024

UID

swg21179404