PM79172: EXCESSIVE LOADING OF JOB STATUS OBJECTS BY SCHEDULER LEADS TO OUTOFMEMORYERROR AND OTHER ISSUES UPON ENDPOINT SERVER RESTART.

A fix is available

8.5.0.2: WebSphere Application Server V8.5 Fix Pack 2

APAR status

Closed as program error.

Error description

Excessive loading of job status objects by scheduler leads to
OutOfMemoryError and other issues upon endpoint server restart.

Local fix

Problem summary

****************************************************************
* USERS AFFECTED:  Users of the Java batch function in IBM     *
*                  WebSphere Application Server V8.5           *
****************************************************************
* PROBLEM DESCRIPTION: Excessive heap memory is consumed in    *
*                      the scheduler, especially when a CG     *
*                      endpoint server is started or           *
*                      restarted, possibly leading to an       *
*                      OutOfMemoryError. There is also a       *
*                      timing issue when flowing job logs      *
*                      back to wsgrid clients that can lead    *
*                      to problems dispatching a job and       *
*                      reporting its correct status.           *
****************************************************************
* RECOMMENDATION:                                              *
****************************************************************
When an endpoint server is started or restarted, the scheduler
server queries its internal database tables for information
about job executions associated with the endpoint being
started.  This query was unnecessarily loading too much
data into memory. In cases where the tables contained a large
number of completed jobs, this could lead to excessive
resource usage, processing slowdown, and even an
OutOfMemoryError.
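
As a rough illustration of the first problem, the following Java sketch contrasts loading every job status row for an endpoint with loading only the rows for jobs that have not completed. It is not the product's code: the class, the JobState values (with ENDED standing in for "completed"), and the in-memory list standing in for the scheduler's database tables are all hypothetical names chosen for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; every name below is hypothetical.
class EndpointRestartQuerySketch {

    // Hypothetical job states; ENDED stands in for "completed".
    enum JobState { SUBMITTED, EXECUTING, RESTARTABLE, ENDED }

    // Hypothetical in-memory stand-in for one row of a job status table.
    static final class JobStatus {
        final String jobId;
        final String endpoint;
        final JobState state;

        JobStatus(String jobId, String endpoint, JobState state) {
            this.jobId = jobId;
            this.endpoint = endpoint;
            this.state = state;
        }
    }

    // Loads every row for the endpoint, including completed jobs.
    static List<JobStatus> loadAllForEndpoint(List<JobStatus> table, String endpoint) {
        List<JobStatus> result = new ArrayList<>();
        for (JobStatus row : table) {
            if (row.endpoint.equals(endpoint)) {
                result.add(row);
            }
        }
        return result;
    }

    // Loads only the rows for the endpoint whose job has not completed.
    static List<JobStatus> loadActiveForEndpoint(List<JobStatus> table, String endpoint) {
        List<JobStatus> result = new ArrayList<>();
        for (JobStatus row : table) {
            if (row.endpoint.equals(endpoint) && row.state != JobState.ENDED) {
                result.add(row);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<JobStatus> table = new ArrayList<>();
        table.add(new JobStatus("job1", "endpointA", JobState.ENDED));
        table.add(new JobStatus("job2", "endpointA", JobState.ENDED));
        table.add(new JobStatus("job3", "endpointA", JobState.EXECUTING));
        table.add(new JobStatus("job4", "endpointB", JobState.SUBMITTED));

        // Number of rows held in memory under each approach.
        System.out.println(loadAllForEndpoint(table, "endpointA").size());
        System.out.println(loadActiveForEndpoint(table, "endpointA").size());
    }
}
```

With a real database the filter would normally be part of the query itself, so that completed rows are never read into the heap; the in-memory loop here only shows which rows are selected.
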
For the second problem, a job coming in from the WSGrid
interface could fail to dispatch properly if a timing window
was hit. The problem also involved the mechanism by which the
scheduler requests job log parts from the dispatch endpoint
to stream back to the WSGrid client. This could lead to the
job not being dispatched and to the job status not being
maintained correctly, possibly leaving the job appearing to
be "stuck" in submitted state rather than moving to the
restartable state upon failure. This tended to happen when
there was a slowdown (e.g. a GC cycle) on the endpoint server
to which the job was dispatched, right after the scheduler
had initiated the dispatch.

Problem conclusion

The database queries upon endpoint start were refined to
ignore completed jobs. For the second problem, the
WSGrid processing was modified to short-circuit the job
log streaming from the endpoint as long as the job is still
in "submitted" state.
The fix for this APAR is currently targeted for inclusion in
fix pack 8.5.0.2.  Please refer to the Recommended Updates
page for delivery information:
http://www.ibm.com/support/docview.wss?rs=180&uid=swg27004980
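
The short-circuit described in the conclusion above can be pictured with a small Java sketch. This is an assumption-laden illustration, not the actual WSGrid code: JobLogStreamingSketch, EndpointLogSource, fetchLogParts, and the JobState values are hypothetical names, and the real scheduler's interfaces may look quite different.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Illustrative sketch only; every name below is hypothetical.
class JobLogStreamingSketch {

    // Hypothetical job states.
    enum JobState { SUBMITTED, EXECUTING, RESTARTABLE, ENDED }

    // Hypothetical stand-in for the request that asks the endpoint for job log parts.
    interface EndpointLogSource {
        List<String> fetchLogParts(String jobId);
    }

    // Returns the next log parts to stream back to the client.
    // While the job is still in SUBMITTED state the endpoint is not contacted at all.
    static List<String> nextLogParts(String jobId, JobState state, EndpointLogSource endpoint) {
        if (state == JobState.SUBMITTED) {
            return Collections.emptyList(); // short-circuit: nothing to stream yet
        }
        return endpoint.fetchLogParts(jobId);
    }

    public static void main(String[] args) {
        EndpointLogSource endpoint =
                jobId -> Arrays.asList(jobId + ": part 1", jobId + ": part 2");

        System.out.println(nextLogParts("job1", JobState.SUBMITTED, endpoint));
        System.out.println(nextLogParts("job1", JobState.EXECUTING, endpoint));
    }
}
```

The point of the check is that no log request reaches the endpoint while the job is still in submitted state, so log streaming does not run inside the dispatch timing window described in the problem summary.
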

Temporary fix

Comments

APAR Information

APAR number
PM79172
Reported component name
WEBS APP SERV N
Reported component ID
5724H8800
Reported release
850
Status
CLOSED PER
PE
NoPE
HIPER
NoHIPER
Special Attention
NoSpecatt
Submitted date
2012-12-14
Closed date
2012-12-17
Last modified date
2012-12-17

APAR is sysrouted FROM one or more of the following:

PM70434
APAR is sysrouted TO one or more of the following:

Fix information

Fixed component name
WEBS APP SERV N
Fixed component ID
5724H8800

Applicable component levels

R850 PSY
UP

Document Information

Modified date:
02 November 2021
