A fix is available
APAR status
Closed as program error.
Error description
Excessive loading of job status objects by scheduler leads to OutOfMemoryError and other issues upon endpoint server restart.
Local fix
Problem summary
USERS AFFECTED: Users of the Java batch function in IBM WebSphere Application Server V8.5
PROBLEM DESCRIPTION: Excessive heap memory is consumed in the scheduler, especially when a CG endpoint server is started or restarted, possibly leading to an OutOfMemoryError. There is also a timing issue in flowing job logs back to wsgrid clients that can lead to a problem dispatching a job as well as reporting its correct status.
RECOMMENDATION:
When an endpoint server is started or restarted, the scheduler server queries its internal database tables for information about job executions associated with the endpoint being started. This query was unnecessarily loading too much data into memory, and when the tables held a large number of completed jobs, this could lead to excessive resource usage, processing slowdown, and even an OutOfMemoryError.
For the second problem, a job coming in from the WSGrid interface could fail to dispatch properly if a timing window was hit. The problem also involved the mechanism by which the scheduler requests job log parts from the dispatch endpoint to stream back to the WSGrid client. This could lead to the job not being dispatched and to the job status not being maintained correctly, so that the job could appear to be "stuck" in the submitted state rather than moving to the restartable state upon failure. This tended to happen if there was a slowdown (for example, a GC cycle) on the endpoint server to which the job was dispatched, right after the scheduler had initiated the dispatch.
Problem conclusion
The database queries issued upon endpoint start were refined to ignore completed jobs. For the second problem, the WSGrid processing was modified to short-circuit the job log streaming from the endpoint as long as the job is still in the "submitted" state.
The fix for this APAR is currently targeted for inclusion in fix pack 8.5.0.2. Please refer to the Recommended Updates page for delivery information: http://www.ibm.com/support/docview.wss?rs=180&uid=swg27004980
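The two changes described above can be illustrated with a minimal sketch. This is not the product code: the class, method, and state names (SchedulerSketch, JobStatus, JobState, loadActiveJobs, shouldRequestJobLog) are hypothetical, and an in-memory list stands in for the scheduler's database tables, whose schema this APAR does not describe. It only shows the general idea of excluding completed jobs when an endpoint starts and of skipping job log requests while a job is still submitted.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SchedulerSketch {

    // Hypothetical job states; the real state model is not given in this APAR.
    enum JobState { SUBMITTED, EXECUTING, RESTARTABLE, ENDED }

    // Hypothetical stand-in for one row of the scheduler's job status tables.
    static final class JobStatus {
        final String jobId;
        final String endpoint;
        final JobState state;

        JobStatus(String jobId, String endpoint, JobState state) {
            this.jobId = jobId;
            this.endpoint = endpoint;
            this.state = state;
        }
    }

    // On endpoint start, keep only the jobs for that endpoint that have not
    // completed. Against a real database this filter would belong in the
    // query itself, so that completed rows are never loaded into the heap.
    static List<JobStatus> loadActiveJobs(List<JobStatus> table, String endpoint) {
        List<JobStatus> active = new ArrayList<>();
        for (JobStatus row : table) {
            if (row.endpoint.equals(endpoint) && row.state != JobState.ENDED) {
                active.add(row);
            }
        }
        return active;
    }

    // Short-circuit: do not request job log parts from the endpoint while
    // the job is still in the submitted state.
    static boolean shouldRequestJobLog(JobStatus job) {
        return job.state != JobState.SUBMITTED;
    }

    public static void main(String[] args) {
        List<JobStatus> table = Arrays.asList(
            new JobStatus("job1", "endpointA", JobState.ENDED),
            new JobStatus("job2", "endpointA", JobState.SUBMITTED),
            new JobStatus("job3", "endpointA", JobState.EXECUTING),
            new JobStatus("job4", "endpointB", JobState.EXECUTING));

        for (JobStatus job : loadActiveJobs(table, "endpointA")) {
            System.out.println(job.jobId + " state=" + job.state
                + " requestLog=" + shouldRequestJobLog(job));
        }
    }
}
```

In this sketch the completed job is never returned for the restarted endpoint, and the submitted job is returned but no job log request is made for it until its state changes.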
Temporary fix
Comments
APAR Information
APAR number
PM79172
Reported component name
WEBS APP SERV N
Reported component ID
5724H8800
Reported release
850
Status
CLOSED PER
PE
NoPE
HIPER
NoHIPER
Special Attention
NoSpecatt
Submitted date
2012-12-14
Closed date
2012-12-17
Last modified date
2012-12-17
APAR is sysrouted FROM one or more of the following:
APAR is sysrouted TO one or more of the following:
Fix information
Fixed component name
WEBS APP SERV N
Fixed component ID
5724H8800
Applicable component levels
R850 PSY
UP
Document Information
Modified date:
02 November 2021