IBM Support

Failure of CLM Data Collection Jobs for a Large Data Set on Windows 2003 Server

Troubleshooting


Problem

When CLM Reporting Jobs are run against a very large repository data set (more than 500,000 records) in a 32-bit Windows environment, the jobs may fail after a few hours. This technote describes a solution to this problem, which occurs on 32-bit Windows systems because of limited system resources.

Cause

Limited availability of system resources.

Environment

CLM build installed in the following environment:

  • Operating System (OS): Windows 2003 Server (32 bit) with 16 GB RAM
  • Application Server: WebSphere Application Server (WAS) version 7.00.11
  • Database Server: DB2 version 9.5.500 Fix Pack FP5 on Windows 2003 Server (32 bit) with 16 GB RAM

Diagnosing The Problem

Execute Data Collection Jobs

  1. Log on to jts/admin with an admin account (or qm/admin if you are only going to run QM jobs) and navigate to Reports -> Data Collection Jobs.
  2. Click the link "Run all data warehouse collection jobs for all the applications" from the JTS Admin page to run all the ETL jobs (or “Run all data warehouse collection jobs” from the QM Admin page to just run the QM ETL jobs).
  3. Monitor the status of the Data Collection Jobs. If the problem is present, the jobs fail after a few hours.

Check Log Files
To diagnose the problem, check the application log files and the DB2 diagnostic log to see whether the failure is due to insufficient system resources.

Steps

  1. Go to <CLM Install Dir>/server/logs/
  2. Check the application log files (qm-etl.log, jts-etl.log, ccm-etl.log) and see if they contain the following exception trace:
    com.ibm.db2.jcc.am.yn: [jcc][t4][2030][11211][3.57.82] A communication error occurred during operations on the connection's underlying socket, socket input stream,or socket output stream.  Error location: Reply.fill().  Message: Connection reset. ERRORCODE=-4499, SQLSTATE=08001 at com.ibm.db2.jcc.am.bd.a(bd.java:319)
  3. Go to <DB2 Install Dir>, locate the DB2 diagnostic log file db2diag.log, and check whether it contains an error similar to the one shown below:
    OSERR   : 1450 "Insufficient system resources exist to complete the requested service."
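
The two log checks above can be sketched as a small script. This is a minimal illustration, not an IBM-supplied tool: the function names are hypothetical, and the search patterns are simply the error code and OS error number quoted in the excerpts above.

```python
import re
from pathlib import Path

# Signatures taken from the log excerpts in this technote.
PATTERNS = {
    "jdbc_connection_reset": re.compile(r"ERRORCODE=-4499"),
    "db2_oserr_1450": re.compile(r"OSERR\s*:\s*1450"),
}


def scan_text(text):
    """Return the names of the signatures found in a block of log text."""
    return [name for name, pattern in PATTERNS.items() if pattern.search(text)]


def scan_files(paths):
    """Map each existing log file to the list of signatures it contains."""
    results = {}
    for path in map(Path, paths):
        if path.is_file():
            text = path.read_text(encoding="utf-8", errors="replace")
            results[str(path)] = scan_text(text)
    return results
```

Pass scan_files the full paths of qm-etl.log, jts-etl.log, ccm-etl.log and db2diag.log on your own server. A file that reports both signatures across the application and DB2 logs matches the pattern described in this technote.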

Resolving The Problem

The problem depends on the deployment environment configuration and is primarily associated with the availability of memory and system resources.
Steps

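  1. A possible cause of this error is the /3GB switch in the boot.ini file, which appears on the operating system entry alongside switches such as /fastdetect. The /3GB switch enlarges the user-mode address space at the expense of the kernel address space, which can leave the operating system short of kernel resources. Check your server's boot.ini file. If the /3GB switch is present, remove it and save the file.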
  2. Check your DB2 server's DBM CFG for the INSTANCE_MEMORY setting. By default it is set to AUTOMATIC. For example,
    Size of instance shared memory (4KB) (INSTANCE_MEMORY) = AUTOMATIC(1000000)
    Set the instance shared memory to a size that is appropriate for your system. The value is expressed in 4 KB pages. For example, for the system configuration mentioned above, set INSTANCE_MEMORY to 524288 (2 GB) or to 262144 (1 GB). Consult your DBA or database support team to determine the right configuration for your server.
  3. Change the DB2 DW Database buffer pool settings to enable 'Self Tuning'.
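
Because INSTANCE_MEMORY in step 2 is expressed in 4 KB pages, the values quoted there can be checked with simple arithmetic. The sketch below uses hypothetical helper names. The command string it prints uses the standard DB2 command line processor syntax for updating the database manager configuration, but the value to use should be confirmed with your DBA.

```python
PAGE_SIZE_KB = 4  # INSTANCE_MEMORY is expressed in 4 KB pages


def gb_to_pages(gigabytes):
    """Convert a memory size in GB to a count of 4 KB pages."""
    return int(gigabytes * 1024 * 1024 // PAGE_SIZE_KB)


def update_command(gigabytes):
    """Build the DB2 CLP command that sets INSTANCE_MEMORY to the given size."""
    return f"db2 update dbm cfg using INSTANCE_MEMORY {gb_to_pages(gigabytes)}"


print(gb_to_pages(1))     # 262144
print(gb_to_pages(2))     # 524288
print(update_command(2))  # db2 update dbm cfg using INSTANCE_MEMORY 524288
```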

After applying all of the above changes, restart the Windows 2003 server and run the ETL jobs again. This should resolve the problem.
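
The boot.ini check in step 1 can also be sketched in code. This is an illustration only: the function names are hypothetical, and the sample entry is an invented boot.ini line, not one taken from the server described here. Back up boot.ini before changing it.

```python
import re

# Matches the /3GB switch (any letter case) plus the spaces or tabs before it.
_SWITCH_3GB = re.compile(r"[ \t]*/3GB\b", re.IGNORECASE)


def has_3gb_switch(boot_ini_text):
    """Return True if the boot.ini text contains the /3GB switch."""
    return bool(_SWITCH_3GB.search(boot_ini_text))


def remove_3gb_switch(boot_ini_text):
    """Return the boot.ini text with every /3GB switch removed."""
    return _SWITCH_3GB.sub("", boot_ini_text)


# Hypothetical boot entry, for illustration only.
sample = r'multi(0)disk(0)rdisk(0)partition(1)\WINDOWS="Windows Server 2003" /fastdetect /3GB'
print(has_3gb_switch(sample))
print(remove_3gb_switch(sample))
```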

Other recommendations
  • CLM requires a separate database for the Jazz Team Server (JTS), the Quality Management (QM) application, the Change and Configuration Management (CCM) application, and the data warehouse. It is recommended that you install each database on its own machine.
  • Upgrade the DB2 machines to 64-bit if they are running the Data Collection Jobs under a large, enterprise load.

Product: IBM Engineering Test Management (SSUVV6)
Business Unit: IBM Software w/o TPS (BU059)
Component: Reports
Platform: Windows (PF033)
Version: 3.0.1
Edition: Standard
Line of Business: Sustainability Software (LOB59)

Product Synonym

CALM;Rational Quality Manager

Document Information

Modified date:
16 June 2018

UID

swg21503427