IBM Support

Determining which Java thread is consuming CPU cycles on Solaris systems

Troubleshooting


Problem

This technotes shows how to determine which thread(s) is consuming the CPU cycles when a Java process has high CPU usage for an extended period of time on a Solaris system. This technote uses the data gathered from technote # 1115625, titled: "MustGather: 100% CPU Usage on Solaris". For educational purposes, the data that is used in this particular example is included at the end of the document.

Cause

If the IBM WebSphere Application Server causes a spike in CPU utilization for a period of greater than several minutes, this technote will show you techniques to assist you in identifying the Java code that is causing the problem.

Resolving The Problem

Using the files attached to this technote, the following three steps demonstrate how to find the problem thread and the corresponding Java code.

  1. Analyze the prstat information to determine the LWPID (light weight process) that is consuming a significant amount of CPU.

    The following prstat was generated with the following command:

  2. prstat -mvL  1  1

    Use of different prstat parameters will give different output but can be used in a similar fashion as described below.

    ID USRNAME USR SYS TRP TFL DFL LCK SLP LAT VCX ICX SCL SIG PROC/LWPID 
    16365 root      36  53 0.0 0.0 0.0 0.0  18 0.0   0 178 13K   0 prstat/1
    16310 root    
     46 0.0 0.1 0.0 0.0 4.3  45 5.0   0  58   8   0 java/30
    16310 root     5.8 0.2 0.0 0.0 0.0 0.0  94 0.0   0  11  56   0 java/11
    16310 root     2.6 0.1 0.0 0.0 0.0 0.0  97 0.0   0   6  18   0 java/5
    5158  root     0.5 0.1 0.0 0.0 0.0 0.0  99 0.0  49  30  1K  19   .netscape.bi/1
    16310 root     0.3 0.0 0.0 0.0 0.0 0.0 100 0.0   2   1  11   0 java/23
    16310 root     0.3 0.0 0.0 0.0 0.0 0.0 100 0.0   5   1   7   0 java/13
    16310 root     0.3 0.0 0.0 0.0 0.0 0.0 100 0.0   1   0   2   0 java/3
    16310 root     0.3 0.0 0.0 0.0 0.0 0.0 100 0.0   0   0   2   0 java/17
    16310 root     0.2 0.0 0.0 0.0 0.0 0.0 100 0.0   4   2   6   0 java/16
    16310 root     0.2 0.0 0.0 0.0 0.0 0.0 100 0.0   0   1   1   0 java/18
    16310 root     0.1 0.1 0.0 0.0 0.0 0.0  99 0.5  99   0  87   0 java/1
    16310 root     0.2 0.0 0.0 0.0 0.0 0.0 100 0.0   3   0  12   0 java/28
    16310 root     0.2 0.0 0.0 0.0 0.0 0.0 100 0.0   0   0   1   0 java/19
      290 root     0.1 0.1 0.0 0.0 0.0 0.0 100 0.0  21   0 132   0 Xsun/1
    Total: 90 processes, 233 lwps, load averages: 1.00, 0.84, 0.66

    In the preceding prstat output, the third and fourth columns provide the amount of time the process has spent in user and system mode. Ignoring the prstat command the most CPU time is being consumed by LWPID=30.

  3. Use the LWPID of 30 to find the thread in the pstack output. The following is a snippet from the pstack output:

  4. ---------------------  lwp# 30 / thread# 50  -----------------------
    fe753aa8 ???????? (0, 5265c00, fa, 43a85d79, fe74e170, e04803fc)
    fb1025b4 ???????? (f60748, e04819b8, fe74e170, f60748, fe763ca0, 16)
    .....

    fe505288 _start   (fe74e170, e1ef5d10, 0, 5, 1, fe401000) + 20
    ff36b6e0
    _thread_start (f25078, 0, 0, 0, 0, 0) + 40
    ----------------------------------------------------------

    Look for _thread_start at the bottom of the thread stack. The first number inside the ( ) is the threadID or "tid". In this case the tid, in hex, is f25078.

  5. Finally, search for f25078 in the Java thread dump output. This output, usually gathered from issuing a kill -3 on the process in question, is located in the native_stdout.log file. In the following snippet of the thread dump you will see the tid of f25078:

  6. "Servlet.Engine.Transports : 0" daemon prio=5 tid=0xf25078 nid=0x32 runnable [0xe0480000..0xe04819d8]
    at java.lang.System.currentTimeMillis(Native Method)
    at java.util.Date.<init>(Date.java:161)
    at org.apache.jsp._wtime._jspService(_wtime.java:96)
    at com.ibm.ws.webcontainer.jsp.runtime. HttpJspBase.service(HttpJspBase.java:89)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:853)
    at com.ibm.ws.webcontainer.jsp.servlet. JspServlet$JspServletWrapper.service
    ...

    Alternatively, you may use the thread# 50 from the pstack output (from the command run on the system core file "/usr/proc/bin/pstack [core] >pstack.out") to correlate the prstat and the thread dump. This is the decimal representation of the thread's "nid" in the thread dump. Using the preceding example, since decimal 50 is 0x32, thread# 50 can be used to find the thread by locating "nid=0x32"in the thread dumps. This is the only method that works if the application is using the alternate threading libraries.

Analyzing the thread stack for this tid will help lead you to the code causing the high CPU usage. In this case the issue most likely comes from the JSP:

_wtime._jspService(_wtime.java:96)

native_stdout.logprstat.logpstack.logserver1.pid

[{"Product":{"code":"SSEQTP","label":"WebSphere Application Server"},"Business Unit":{"code":"BU053","label":"Cloud & Data Platform"},"Component":"100% CPU Usage","Platform":[{"code":"PF027","label":"Solaris"}],"Version":"8.5;8.0;7.0;6.1;6.0;5.1","Edition":"Base;Network Deployment","Line of Business":{"code":"LOB45","label":"Automation"}},{"Product":{"code":"SSNVBF","label":"Runtimes for Java Technology"},"Business Unit":{"code":"BU059","label":"IBM Software w\/o TPS"},"Component":"Java SDK","Platform":[{"code":"","label":""}],"Version":"","Edition":"","Line of Business":{"code":"LOB36","label":"IBM Automation"}}]

Document Information

Modified date:
15 June 2018

UID

swg21162381