IBM Support

MustGather: Performance, Hang, or High CPU issues with ELM applications

Troubleshooting


Problem

This document helps you collect the data needed to diagnose and resolve performance, hang, or high CPU issues on the application server or database server with the IBM Engineering Lifecycle Management (ELM) products, including IBM Jazz Team Server (JTS), IBM Engineering Workflow Management (EWM), and IBM Engineering Test Management (ETM).

Symptom

Performance, hang, or high CPU issues can occur when the ELM products are installed in a distributed environment. Each issue can contribute to a variety of symptoms and degraded behavior.

Cause

This MustGather helps you collect the data necessary to diagnose and resolve the issue. If you are unable to determine the root cause from the information collected, open a case with IBM Support for further investigation and provide the collected data.

Resolving The Problem


You can use the IBM Support Assistant Lite (ISA Lite) Data Collector tool to quickly collect diagnostic files, such as log files and configuration files, or to run traces. This tool is bundled with ELM. ISA Lite collects information about your Jazz Team Server environment and stores it in a .zip archive file. If you need to open a case with IBM Support for further assistance, you can send the archive file so that IBM Support can help diagnose and fix the problem.


The information below should be gathered in addition to the normal information and log gathering done by ISA Lite.

Note: Further information to help troubleshoot performance issues with your ELM products is also available in the Performance troubleshooting section of the Deployment wiki on jazz.net.


Business Impact

Includes:

  • What effect is this having on the business?
  • Is this a production or test environment?
  • How many users are affected?

Unexpected Behavior

Details include:

  • Problem description
  • Steps to re-create
  • Exact date and time of the issue. If the issue happened multiple times, provide all times when the issue happened
  • Were there any recent changes made to the environment?
  • Were new users able to log in to any ELM applications?
  • What behavior did the logged-in users encounter? Provide a screen capture or record a video of the issue
  • Any errors or exceptions logged at the time the incident occurred
  • If you use a proxy server, check whether the issue also happens when you bypass it
  • What operations/applications/processes/users DO HAVE the issue. What operations/applications/processes/users DO NOT HAVE the issue

Topology

Description of the Topology includes:

  • Is this a stand-alone or distributed environment?
  • The version of the ELM applications including all applied interim fixes and hot fixes
  • Are the ELM applications deployed on Apache Tomcat or IBM WebSphere?

    If WebSphere, provide the output of:

    <WebSphere_Install_Root>\bin\versionInfo -maintenancePackages

     
  • The Operating System the ELM applications are installed on
    • Include the number of CPUs
    • How much memory is available
    • Available disk space
    • Size of indices on disk, for example, conf\<app>\indices
  • The Database vendor and version being used.

    If multiple database instances are being used, provide details on which ELM application is using which database instance.
  • JVM parameters provided at server startup (heap sizes and config cache tuning). If the ISA tool is used, this information is already available.
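
Several of the host details listed above (CPU count, free disk space, index size) can be collected with a short script. The following is a minimal Python sketch and is not part of ISA Lite or any IBM tooling. The function names are hypothetical, and the paths you pass in are your own install root and index directory. Available memory is not covered, because the Python standard library has no portable call for it.

```python
import os
import platform
import shutil


def directory_size_bytes(root):
    """Sum the sizes of all regular files under root (symlinks are skipped)."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue
            try:
                total += os.path.getsize(path)
            except OSError:
                # The file disappeared or is unreadable; skip it.
                pass
    return total


def collect_host_facts(install_root, indices_dir):
    """Return OS, CPU, disk, and index-size facts as a dictionary.

    install_root: any path on the volume where the ELM applications live.
    indices_dir:  the index directory of one application.
    """
    usage = shutil.disk_usage(install_root)
    return {
        "os": platform.platform(),
        "cpu_count": os.cpu_count(),
        "disk_total_bytes": usage.total,
        "disk_free_bytes": usage.free,
        "indices_bytes": directory_size_bytes(indices_dir),
    }
```

Calling collect_host_facts once per application, with that application's index directory, returns a dictionary whose values can be pasted into the case.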

Javacore log files and Application Logs

Because the ELM applications are distributed and interconnected, we need, at a minimum, data from the JTS server and from the affected ELM application, collected at the time the incident occurs. If possible, provide data from all JVMs.

  • Enable verbose gc and javacore creation
    • WebSphere instructions:
      • Instructions to enable verbose gc are located here
      • To add javacore creation, add the following parameters to Generic JVM arguments in the WebSphere Admin Console:
        -Xdump:java:events=user
        -Xdump:heap+java:events=excessivegc,range=1..1,request=exclusive+prepwalk+preempt
    • Tomcat instructions:
      • Add the following lines to the server.startup script, or to the Java Options in the Tomcat Control Panel if running as a Windows Service (ensure that the - character is present at the beginning of each line)
        -verbose:gc
        -Xdump:java:events=user
        -Xdump:heap+java:events=excessivegc,range=1..1,request=exclusive+prepwalk+preempt
      • Restart the Application Server for the changes to take effect.
    • Liberty Instructions:
      • Windows instructions:
        1. Open <CLM_Install_Root>\server\server.startup.bat file.
        2. Add the following lines in set JAVA_OPTS section:
          set JAVA_OPTS=%JAVA_OPTS% -Xdump:java:events=user
          set JAVA_OPTS=%JAVA_OPTS% -Xdump:heap+java:events=excessivegc,range=1..1,request=exclusive+prepwalk+preempt
        3. Save the file and restart the server
      • Linux instructions:
        1. Open <CLM_Install_Root>/server/server.startup file.
        2. Add the following lines in #Enable verbose GC logging for serviceability section:
          JAVA_OPTS="$JAVA_OPTS -Xdump:java:events=user"
          JAVA_OPTS="$JAVA_OPTS -Xdump:heap+java:events=excessivegc,range=1..1,request=exclusive+prepwalk+preempt"
        3. Save the file and restart the server.
  • Gather 4-6 Java Cores at 30-second intervals, following the document How to gather Java cores for different application servers in Engineering Lifecycle Management applications.
    NOTE: Java Cores have to be gathered at the moment when the issue is taking place, especially BEFORE server restart. Do not gather Java Cores when the issue is gone.
  • Include all the ELM logs including ETL logs located in:

    WebSphere location: <WebSphere_Install_Root>\profiles\(profile_name)\logs directory
    Liberty Profile and Tomcat location: <CLM_Install_Root>\server\logs directory
    NOTE: These logs are gathered automatically by the ISA DC tool. If you use the ISA DC tool to gather log files, you do not have to attach them separately.
  • If ELM is running on WebSphere, provide the WebSphere Performance, hang, or high CPU mustgathers for the operating system ELM is installed on (in addition to those listed above)
  • Provide the access.log, error.log, and http_plugin.log files if you use IHS. These files are sometimes rotated quickly or cleaned up by customer tools when the server is restarted. Make sure that the logs contain entries from the time when the issue happened, not only from after the server restart.
  • Provide the report from a monitoring tool such as Splunk or an APM tool, if you have configured one to gather data from the server.
    The most significant usage parameters in the report are the following: processor, memory, Java heap, thread connections pool, garbage collection time, active services number, disk and network IO, number of expensive scenarios. Two charts are required: one covering the last week and one covering the last day.
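
On Linux and AIX, the repeated Java core requests described in this section can be scripted. The following is a minimal Python sketch under two assumptions: the JVM was started with -Xdump:java:events=user, and it treats SIGQUIT (the signal sent by kill -3) as the user event, which is the usual behavior of IBM JVMs on those platforms. The function name and its parameters are hypothetical. The sketch does not apply to Windows, and the document referenced above remains the authoritative procedure for each application server.

```python
import os
import signal
import time


def request_javacores(pid, count=5, interval_seconds=30,
                      send_signal=os.kill, sleep=time.sleep):
    """Send SIGQUIT to `pid` `count` times, pausing between requests.

    Assumption: `pid` is the application server JVM and it handles SIGQUIT
    by writing a javacore. Sending SIGQUIT to any other process may
    terminate it, so verify the process ID first.

    `send_signal` and `sleep` are injectable so the loop can be exercised
    without touching a real process.
    """
    if count < 1:
        raise ValueError("count must be at least 1")
    sent = 0
    for i in range(count):
        send_signal(pid, signal.SIGQUIT)
        sent += 1
        # No pause is needed after the final request.
        if i < count - 1:
            sleep(interval_seconds)
    return sent
```

With the defaults, the function requests five Java cores 30 seconds apart, which falls within the 4-6 range asked for above. Run it while the issue is occurring and before any server restart.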

Database usage logs

Gather an Oracle AWR report, or the corresponding report for your database vendor.

Additional Information

  • Gather a screen capture of the following page from the web UI and save the content as Full HTML. Ensure that you gather a screen capture for each application (for example: jts, ccm, jazz, and others): https://<hostname>/<context root>/service/com.ibm.team.repository.service.internal.counters.ICounterContentService
  • Gather a screen capture of the list of active services: https://<hostname>/<context root>/admin#action=com.ibm.team.repository.admin.activeServices
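
Because both pages must be captured for every application, it can help to generate the full list of URLs up front. The following is a minimal Python sketch that only builds the URLs from the two patterns above. The function name is hypothetical, and elm.example.com in the usage note is a placeholder host. The sketch does not log in or download anything; open each URL in a browser session that is already authenticated.

```python
COUNTERS_PATH = (
    "/service/com.ibm.team.repository.service.internal.counters."
    "ICounterContentService"
)
ACTIVE_SERVICES_PATH = (
    "/admin#action=com.ibm.team.repository.admin.activeServices"
)


def diagnostic_urls(hostname, context_roots):
    """Map each context root to its counters and active-services URLs."""
    urls = {}
    for root in context_roots:
        base = f"https://{hostname}/{root}"
        urls[root] = {
            "counters": base + COUNTERS_PATH,
            "active_services": base + ACTIVE_SERVICES_PATH,
        }
    return urls
```

For example, diagnostic_urls("elm.example.com", ["jts", "ccm"]) returns two URLs for jts and two for ccm, which can be used as a checklist while capturing the pages.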

IBM DOORS Next issues

For IBM DOORS Next issues, use the following document: IBM DOORS Next V7.X Performance MustGather

Lifecycle Query Engine and Link Indexer issues

For Lifecycle Query Engine (LQE) and Link Indexer (LDX) issues use the following document: Mustgather: Investigating LQE/LDX performance


Document Information

Modified date:
20 June 2023

UID

swg21607533