MustGather: Performance, hang, or high CPU issues on Windows

Technote (troubleshooting)


Problem(Abstract)

If you are experiencing performance degradation, hang, no response, hung threads, CPU starvation, high CPU utilization, network delays, or deadlocks, this MustGather will assist you in collecting the critical data that is needed to troubleshoot your issue.

Resolving the problem


To improve the accuracy of complete data collection, IBM recommends you use the automated data collectors within IBM Support Assistant. Not only will the automated collector gather the equivalent of the manual process, it will also provide a secure file transfer of the collection to IBM.

Collecting data using the IBM Support Assistant Data Collector

    Beginning with WebSphere Application Server V8.0.0.6, the IBM Support Assistant Data Collector (ISADC) tool is bundled with the product and is automatically installed. As a result, you can run the ISADC tool directly from the app_server_root/bin directory. Note: ISADC is not bundled with the WebSphere Application Server Liberty Profile. For more details, see Using the IBM Support Assistant Data Collector.


  1. Using the ISA Data Collector:
    • To run ISADC from command line, go to your app_server_root/bin directory and run isadc.[sh|bat]

      • To download, install and run ISADC locally:
      • Obtain the IBM Support Assistant Data Collector from the online ISA Data Collector site for WebSphere Application Server.
      • Select the option to collect from this or another system using a downloadable utility.
      • Download and extract the zip file to your WAS_HOME directory. From a command line, run isadc.[sh|bat] or launch index.html to use the web interface.

  2. Select the JDK > Hang / High CPU / Performance Problem collector and click Start.

  3. Follow the prompts to automatically submit the collected data to IBM Support.

Collecting data manually

The verboseGC data is critical to analyze a performance problem. If you have not already done so, enable verboseGC and restart the server.

Important note: Step 2 below involves installing the chosen CPU data collection tool. Read that step and complete the installation of the tool on the problem server before the problem occurs.

At the time of the problem:

  1. Capture the output of the netstat command to record the state of the TCP/IP sockets:

    netstat -an > netstat_before.out

  2. If you are seeing high CPU usage: Start collecting the CPU data. In most cases, the TPROF For Windows tool gives complete and granular CPU data, so it is our preferred tool. Follow the steps given in TPROF For Windows tool to start collecting the CPU data.

    However, if it is not possible to use the preceding tool, then here are the other tools to collect CPU data:
    Perfmon (Windows XP / Windows 2003)
    Perfmon (Windows 2008 / Windows 7)
    Pslist


  3. Download the file windows_hang.py and copy it to your <PROFILE_ROOT>\bin directory. If the file is instead copied to <WAS_HOME>\bin, wsadmin.bat connects to the default server, which may be the deployment manager (dmgr).


    NOTE: This script works for WebSphere Application Server 6.1 and higher.
    windows_hang.py

    If you are looking for the older windows_hang.bat that works with older releases of WebSphere Application Server, see the FAQ section.


    To launch the script to produce the default three javacores spaced 2 minutes apart, run this command:

    wsadmin -lang jython -f windows_hang.py -j -s SERVER_NAME

    Replace SERVER_NAME with your server's name.

    If you need to connect to a specific SOAP port:

    wsadmin -port PORT -lang jython -f windows_hang.py -j -s SERVER_NAME

    If you need to connect to a specific hostname and SOAP port:

    wsadmin -host HOST_NAME -port PORT -lang jython -f windows_hang.py -j -s SERVER_NAME

    Where HOST_NAME is either localhost or a valid hostname or IP address, and PORT is the SOAP port defined for the application server or deployment manager that you are connecting to.



This script cannot be used while the application server is starting up (i.e. before the "e-business" message is seen in the SystemOut.log). This is due to the requirement that an active SOAP connection has to be established through wsadmin.
Alternative steps include collecting raw core dumps using userdump.exe or, on Windows Vista/2008 or later, opening Task Manager, right-clicking the java process, and selecting Create Dump File from the context menu. See the manual steps (and FAQ) in the Crash MustGather to properly configure full core dumps and to learn how to process any raw core dumps.


Further information about the script:
  • To adjust the quantity of javacores produced, use the option "-r".
  • To adjust the time between javacores, use the option "-i" and provide the number of seconds.
    If you want the script to run for a set period of time, you need to work out -i and -r together. For example, to run the script for 900 seconds with a javacore generated every 3 minutes (180 seconds), divide 900 by 180 to get the setting for -r (the iterations, or number of javacores to produce). In this case the result is 5, and the command would be:

    wsadmin -host HOST_NAME -port SOAP_PORT -lang jython -f windows_hang.py -j -i 180 -r 5 -s SERVER_NAME
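
The calculation above can be sketched in Python. This is only an illustration and is not part of windows_hang.py: the helper name build_hang_command and the sample values (server1, localhost, 8880) are assumptions made for the example. It derives -r by dividing the total run time by the -i interval, then assembles the command with the wsadmin options before -f and the script options after it.

```python
def build_hang_command(server_name, duration_s, interval_s, host=None, port=None):
    """Return the wsadmin argument list for a run of about duration_s seconds.

    Hypothetical helper: it only builds the command line, it does not run it.
    """
    if interval_s <= 0 or duration_s < interval_s:
        raise ValueError("duration must be at least one positive interval")
    iterations = duration_s // interval_s  # value for -r (number of javacores)

    cmd = ["wsadmin"]
    # Options placed before -f are consumed by wsadmin itself.
    if host is not None:
        cmd += ["-host", host]
    if port is not None:
        cmd += ["-port", str(port)]
    cmd += ["-lang", "jython", "-f", "windows_hang.py"]
    # Options placed after -f are passed to the script.
    cmd += ["-j", "-i", str(interval_s), "-r", str(iterations), "-s", server_name]
    return cmd


if __name__ == "__main__":
    # 900 seconds in total, one javacore every 180 seconds.
    print(" ".join(build_hang_command("server1", 900, 180, host="localhost", port=8880)))
```

Integer division is used so that a run time that is not an exact multiple of the interval rounds down rather than running longer than requested.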


All the arguments below are added after the -f windows_hang.py option. Any arguments added before -f are reserved for wsadmin.bat (such as -lang, -host, and/or -port).

Arguments          Default Value   Required                    Description
--serverName, -s   -               YES                         The problematic application server name. This is not the same as the profile name or the host name of the physical machine. Case-sensitive.
--nodeName, -n     -               Optional                    The problematic application server's node. This is not the same as the profile name or the host name of the physical machine. Case-sensitive. Use if multiple nodes are defined or when running the script against the dmgr.
--javacore, -j     disabled        YES, to capture javacores   Enables the generation of multiple javacores.
--interval, -i     120 (seconds)   No                          Javacore generations are spaced apart by the number of seconds defined here.
--iterations, -r   3               No                          Defines the quantity of javacores to produce.
--heapdump, -d     disabled        No                          Enables the generation of a single heapdump.
--multiple, -m     disabled        No                          Enables the generation of multiple heapdumps. Use -r to control the quantity.
--help             -               No                          Displays a help page. Note the two dashes.




  4. Follow the steps given in TPROF For Windows tool (or the other tool you chose in step 2) to stop collecting the CPU data.

  5. Take the final output of the netstat command to get information about TCP/IP sockets:

    netstat -an > netstat_after.out
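
To get a quick view of what changed between the two netstat captures in steps 1 and 5, a short Python sketch such as the following may help. It is an illustration only and is not part of this MustGather: the function names are made up for the example, and it assumes the usual four-column TCP rows of Windows netstat -an output (Proto, Local Address, Foreign Address, State).

```python
from collections import Counter


def count_tcp_states(netstat_text):
    """Count TCP connections per state in the text of a `netstat -an` capture."""
    counts = Counter()
    for line in netstat_text.splitlines():
        fields = line.split()
        # TCP rows have four columns; UDP rows have no State column and are skipped.
        if len(fields) == 4 and fields[0].upper() == "TCP":
            counts[fields[3]] += 1
    return counts


def state_deltas(before_text, after_text):
    """Return {state: count_after - count_before} for every state seen."""
    before = count_tcp_states(before_text)
    after = count_tcp_states(after_text)
    return {state: after[state] - before[state]
            for state in sorted(set(before) | set(after))}


if __name__ == "__main__":
    # Small made-up samples; in practice read netstat_before.out and netstat_after.out.
    before = "  TCP    0.0.0.0:9080     0.0.0.0:0        LISTENING\n"
    after = before + "  TCP    10.0.0.5:9080    10.0.0.9:50001   CLOSE_WAIT\n"
    print(state_deltas(before, after))
```

A steadily growing count of one state between the two captures is the kind of detail worth pointing out when you submit the data.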

Submitting required data:
Zip all the output and log files:
  • netstat output (per #1 and #5 above)
  • CPU data (per #2 and #4 above)
  • All the generated javacores (per #3 above)
  • Server logs from the server having problems (<PROFILE_ROOT>\logs\<MY_SERVER>\)
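
If you prefer to script the packaging, the following Python sketch shows one way to zip the items listed above. It is a hedged example, not an IBM-provided tool: the function name zip_mustgather and the paths in the usage section are placeholders that you would replace with your own files and directories.

```python
import os
import zipfile


def zip_mustgather(zip_path, paths):
    """Add each file, or every file under each directory, in paths to zip_path.

    Paths that do not exist are skipped. Returns the names stored in the archive.
    """
    added = []
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in paths:
            path = os.path.abspath(path)
            if os.path.isdir(path):
                parent = os.path.dirname(path)
                for dirpath, _dirnames, filenames in os.walk(path):
                    for name in filenames:
                        full = os.path.join(dirpath, name)
                        # Keep the directory name (for example the server name) in the archive.
                        arcname = os.path.relpath(full, parent)
                        archive.write(full, arcname)
                        added.append(arcname)
            elif os.path.isfile(path):
                arcname = os.path.basename(path)
                archive.write(path, arcname)
                added.append(arcname)
    return added


if __name__ == "__main__":
    # Placeholder names; substitute your own output files and log directory.
    print(zip_mustgather("mustgather.zip", ["netstat_before.out", "netstat_after.out"]))
```

Write the zip file to a location outside the directories being collected so that the archive does not try to include itself.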

Send the results to IBM Support.


Frequently Asked Questions (FAQs):
  • What is the impact of enabling verboseGC?
    VerboseGC data is critical to diagnosing these issues. This can be enabled on production systems because it has a negligible impact on performance (< 2%).

  • What are 'javacores' and where do I find them?
    Javacores are snapshots of the Java™ Virtual Machine activity and are essential to troubleshooting these issues. These files are usually found in the profile_root directory; otherwise, search the entire system for "*javacore*".
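
    The search can also be scripted. The sketch below is only an illustration: the function name find_javacores and the starting directory in the usage section are assumptions, and it simply walks a directory tree looking for file names that match "*javacore*".

```python
import fnmatch
import os


def find_javacores(root_dir):
    """Return the sorted paths of all files under root_dir whose name matches *javacore*."""
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root_dir):
        for name in filenames:
            # Lower-case the name so the match does not depend on case.
            if fnmatch.fnmatch(name.lower(), "*javacore*"):
                matches.append(os.path.join(dirpath, name))
    return sorted(matches)


if __name__ == "__main__":
    # Placeholder path; point this at your profile_root or at a drive root.
    for path in find_javacores(r"C:\path\to\profile_root"):
        print(path)
```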

  • How do I check the SOAP port of the server?
    Check the value of SOAP_CONNECTOR_ADDRESS in the serverindex.xml file under <PROFILE_ROOT>\config\cells\cell_name\nodes\node_name
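
    As an illustration, the lookup can be done with a few lines of Python. The sample XML below is a simplified, made-up stand-in for serverindex.xml (the element and attribute names are what this sketch assumes the file contains, and the namespace URI is a placeholder), so compare it with your own file before relying on it.

```python
import xml.etree.ElementTree as ET

# Simplified, hypothetical sample; a real serverindex.xml has many more attributes.
SAMPLE = """<serverindex:ServerIndex xmlns:serverindex="http://example.invalid/serverindex" hostName="myhost">
  <serverEntries serverName="server1">
    <specialEndpoints endPointName="WC_defaulthost">
      <endPoint host="*" port="9080"/>
    </specialEndpoints>
    <specialEndpoints endPointName="SOAP_CONNECTOR_ADDRESS">
      <endPoint host="myhost" port="8880"/>
    </specialEndpoints>
  </serverEntries>
</serverindex:ServerIndex>"""


def soap_ports(serverindex_xml):
    """Return {serverName: SOAP port} for each server entry in the XML text."""
    root = ET.fromstring(serverindex_xml)
    ports = {}
    for entry in root.iter("serverEntries"):
        for endpoint in entry.iter("specialEndpoints"):
            if endpoint.get("endPointName") != "SOAP_CONNECTOR_ADDRESS":
                continue
            end_point = endpoint.find("endPoint")
            if end_point is not None:
                ports[entry.get("serverName")] = int(end_point.get("port"))
    return ports


if __name__ == "__main__":
    print(soap_ports(SAMPLE))
```

    To use it on a real file, read the file contents into a string and pass that string to soap_ports.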

  • If either script fails, can I still collect javacores manually via wsadmin?
    Follow these manual steps to collect the javacores:
    1. From the command prompt, enter the command to get a wsadmin command prompt :

      <WAS_HOME>\bin\wsadmin.bat

      If security is enabled or the default SOAP ports have been changed, you will need to pass additional parameters to the batch file in order to get a wsadmin prompt. For example:

      wsadmin.bat [-host host_name] [-port port_number] [-user username [-password password]]

      Note: You can connect wsadmin to any of the server JVMs in the cell. After you run the wsadmin command, it displays the server process to which it has attached. The process that it has attached to determines which JVMs you can get thread dumps for. If wsadmin is connected to the deployment manager, you can get thread dumps for any JVM in that cell. If it is attached to a node agent, you can get thread dumps for any JVM in that node. If it is attached to a server, you can get thread dumps only for that server.

    2. Get a handle to the problem application server.

      Note: The contents in brackets "[.....]", along with the brackets themselves, are not optional. They must be entered to set the jvm object. Also, note that there is a space between the words "completeObjectName" and "type":

      wsadmin> set jvm [$AdminControl completeObjectName type=JVM,process=problemServerName,*]

      Where problemServerName is the name of the application server that does not respond (or is hung). If wsadmin is connected to a deployment manager and the server names in the cell are not unique, you can qualify the JVM with the node attribute in addition to process:

      ,node=nodeName,*

    3. Generate multiple javacores by issuing the following command every 2 minutes for 3 iterations:

      wsadmin> $AdminControl invoke $jvm dumpThreads

  • Is there another way to gather the required data?
  • How do I analyze the Java thread dumps?
    Download the IBM Thread and Monitor Dump Analyzer for Java Technology.

    ThreadAnalyzer is a technology preview tool that can analyze thread dumps from WebSphere Application Server. It is useful for identifying deadlocks, contention, bottlenecks, and to summarize the state of threads within WebSphere Application Server.

  • Where is the old windows_hang.bat?
    The old script is located here, although it has a limitation: you must run it against the individual application server. Running this script with wsadmin connected through the dmgr might cause it to fail.

    Download the attached script (windows_hang.bat) to the <PROFILE_ROOT>\bin folder.



    This script automatically generates 3 javacores at 2-minute intervals. Before running the script, check the following:
    • Name of the problematic server(s)

    • If admin security is enabled, get the username/password.

    • Check which SOAP port is in use, as you will be required to enter it interactively when running the script.

    For each of the problematic server(s) open a command prompt and go to profile_root\bin. Enter the following command to start the script:

    windows_hang.bat [problem servername]

    The script will prompt for the admin security and the SOAP port. It will then generate 3 javacores, 2 minutes apart. Once done, you should see the following message and 3 javacores in the <profile_root> directory:

    "MustGather>> Last javacore generation Successful. Script will now exit"

  • How do I change the default time interval for javacore generation in the older windows_hang.bat script?
    Edit the TIME_SLEEP variable in the batch file. This variable accepts the time in milliseconds.

  • What if I am using WebSphere Application Server 5.1? Where are the server logs?
    For WebSphere Application Server 5.1 the server logs will be here:

    install_root\logs\server_name\*


If asked to do so:
The preceding data is used to troubleshoot most of these issues; however, in certain situations Support may need additional data. Only collect the following data if asked to do so by IBM Support.

Userdumps
Follow instructions in MustGather: Getting user.dmp when hangs/performance degradation prevents generating a javacore to produce a set of three user.dmp files taken at 2 minute intervals.

For a listing of all technotes, downloads, and educational materials specific to a hang or performance degradation, search the WebSphere Application Server support site.

Related information
How to enable verbosegc for WebSphere
IBM Thread and Monitor Dump Analyzer
Steps to getting support for WebSphere Application Server
Submitting information to IBM support
MustGather: Read first for WebSphere Application Server
Troubleshooting guide for WebSphere Application Server
Not getting javacores? Instructions to get user.dmp.

Exchanging data with IBM Support

To diagnose or identify a problem, it is sometimes necessary to provide Technical Support with data and information from your system. In addition, Technical Support might also need to provide you with tools or utilities to be used in problem determination. You can submit files using one of the following methods to help speed problem diagnosis:


Read first and related MustGathers

Related information

MustGather: Generating Javacores and Userdumps Manually

Cross reference information
Segment               Product                                  Component                       Platform   Version              Edition
Application Servers   WebSphere Application Server - Express   Hangs/performance degradation   Windows    7.0, 6.1, 6.0, 5.1
Application Servers   Runtimes for Java Technology             Java SDK                        Windows

Document information


More support for:

WebSphere Application Server
Hangs/Performance Degradation

Software version:

5.1, 6.0, 6.1, 7.0, 8.0, 8.5, 8.5.5

Operating system(s):

Windows

Software edition:

Base, Express, Network Deployment

Reference #:

1111364

Modified date:

2012-04-15
