A fix is available
APAR status
Closed as program error.
Error description
After a storage shortage occurred, Enterprise Extender (EE) PABs stopped dispatching. This exposed an EE liveness timer issue that resulted in an EE timer PAB scheduling loop.
Local fix
Problem summary
****************************************************************
* USERS AFFECTED: All using Enterprise Extender connections.   *
****************************************************************
* PROBLEM DESCRIPTION: EE timer management does not pace the   *
*                      number of connections allowed to        *
*                      be scheduled for liveness processing.   *
****************************************************************
* RECOMMENDATION:                                              *
****************************************************************
NOTE: In general, this APAR addresses large enterprise EE environments which consist of thousands of EE endpoints. This problem was reported in an EE configuration with over ten thousand EE endpoints to a single z/OS CS image.

The problem is summarized as follows:
1) A CSA/ECSA storage shortage occurs.
2) The EE timer management SRB runs to schedule EE liveness processing. EE liveness processing runs at approximately six second intervals.
3) EE liveness timer management logic does not adequately pace the liveness processes that it schedules at a given time.
4) Currently, hundreds or thousands of EE lines may be scheduled at once to perform liveness (keep-alive) processing. This can put unnecessary strain on the system in terms of CPU and ECSA spikes.
5) In the reported problem, the CRA4 and CRA8 buffer pools were defined with small base size pools.
6) These buffers are required for dispatching services to dispatch the scheduled processes (PABs).
7) In this case, the EE lines were scheduled but did not dispatch due to the lack of ECSA (CRAs). The following messages may be issued:
   IST561I STORAGE UNAVAILABLE: CRA4 BUFFER POOL
   IST561I STORAGE UNAVAILABLE: CRA8 BUFFER POOL
8) Because these processes did not dispatch, the next time EE timer management ran for liveness processing, it scheduled thousands of processes, including ones that had already been scheduled but had not yet run.
9) At each six second interval, the EE timer management services repeatedly scheduled these liveness processes.

Externally, this appears as a VTAM hang caused by a TPSCHED loop. For small scale EE environments, this inefficiency should not be of consequence.
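The growth pattern in the sequence above can be illustrated with a toy model. This is only a sketch, not VTAM internals: the function name `simulate_unpaced`, its parameters, and the use of a plain list as the scheduling queue are all hypothetical. It assumes that an unpaced timer schedules every connection at each liveness interval, whether or not an earlier request for that connection is still waiting to dispatch.

```python
def simulate_unpaced(connections, intervals, can_dispatch):
    """Return the scheduling-queue depth after each liveness interval.

    Hypothetical toy model: 'queue' stands in for scheduled-but-not-yet-
    dispatched liveness processes; it is not a VTAM data structure.
    """
    queue = []
    depths = []
    for _ in range(intervals):
        # Unpaced timer: schedule every connection, even if it is already queued.
        queue.extend(range(connections))
        if can_dispatch:
            # Buffers are available, so all scheduled work runs and drains.
            queue.clear()
        depths.append(len(queue))
    return depths


# With dispatching blocked (as in the storage shortage), the backlog grows
# by the full connection count at every interval.
print(simulate_unpaced(10_000, 3, can_dispatch=False))  # [10000, 20000, 30000]
print(simulate_unpaced(10_000, 3, can_dispatch=True))   # [0, 0, 0]
```

In this model the backlog never shrinks while dispatching is blocked, which mirrors the repeated rescheduling described in steps 8 and 9.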
Problem conclusion
The overall solution is for EE liveness timer services to pace both the number of EE test requests scheduled and the number outstanding. The current design allows no more than 500 outstanding tests at a time. This ensures that processes are dispatching in a timely fashion before more work is scheduled.

For EE networks consisting of fewer than 1000 active EE connections, VTAM will process one twelfth of the liveness timer queue per 500 ms interval. This allows EE timer services to process the EE liveness queue in approximately 6 seconds.

For large scale EE networks consisting of more than 1000 active EE connections, VTAM will process one twentieth of the liveness timer queue per 500 ms interval. This allows EE timer services to process the EE liveness queue in approximately 10 seconds.

In either size environment, only timers which have expired will actually have processes scheduled. Also, as the scheduled liveness processes dispatch, EE timer management will detect this and allow more processes to be scheduled at the next 500 ms interval. This EE liveness pacing algorithm has been designed to minimize CPU cost while preserving the accuracy of the timers being managed.
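The pacing rules described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the shipped VTAM logic: the class `LivenessPacer`, its methods, and the helper functions are hypothetical names. Treating exactly 1000 connections as the small case is an assumption, since the text only covers fewer than and more than 1000. The sketch also simplifies by letting connections skipped because of the 500-test cap wait for the next sweep.

```python
import math

TICK_MS = 500            # pacing interval taken from the text
MAX_OUTSTANDING = 500    # cap on outstanding EE tests taken from the text


def slices_for(active_connections):
    # 12 slices for small networks, 20 for large ones.
    # Assumption: exactly 1000 connections is treated as "small".
    return 12 if active_connections <= 1000 else 20


def sweep_time_seconds(active_connections):
    # One slice per tick, so a full sweep takes slices * 500 ms.
    return slices_for(active_connections) * TICK_MS / 1000


class LivenessPacer:
    """Hypothetical model of paced liveness scheduling."""

    def __init__(self, connections):
        self.connections = list(connections)
        self.slices = slices_for(len(self.connections))
        self.slice_size = math.ceil(len(self.connections) / self.slices)
        self.cursor = 0
        self.outstanding = set()   # scheduled but not yet dispatched

    def tick(self, expired):
        """Process one slice; return the connections newly scheduled."""
        start = self.cursor * self.slice_size
        batch = self.connections[start:start + self.slice_size]
        self.cursor = (self.cursor + 1) % self.slices
        scheduled = []
        for conn in batch:
            if len(self.outstanding) >= MAX_OUTSTANDING:
                break  # cap reached: wait for earlier work to dispatch
            # Only expired timers are scheduled, and never twice.
            if conn in expired and conn not in self.outstanding:
                self.outstanding.add(conn)
                scheduled.append(conn)
        return scheduled

    def dispatched(self, conn):
        """Record that a scheduled liveness process has run."""
        self.outstanding.discard(conn)


print(sweep_time_seconds(800))     # 6.0
print(sweep_time_seconds(12_000))  # 10.0

pacer = LivenessPacer(range(12_000))
all_expired = set(range(12_000))
first = pacer.tick(all_expired)
print(len(first))                    # 500, limited by the cap
print(len(pacer.tick(all_expired)))  # 0, nothing has dispatched yet
```

Because a connection already in the outstanding set is never rescheduled, and the cap stops new work until earlier work dispatches, the backlog in this model cannot exceed 500 even when dispatching stalls.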
Temporary fix
*********
* HIPER *
*********
Comments
APAR Information
APAR number
OA40662
Reported component name
VTAM V4 MVS/ESA
Reported component ID
569511701
Reported release
1D0
Status
CLOSED PER
PE
NoPE
HIPER
YesHIPER
Special Attention
NoSpecatt / Xsystem
Submitted date
2012-10-23
Closed date
2012-11-11
Last modified date
2013-01-23
APAR is sysrouted FROM one or more of the following:
APAR is sysrouted TO one or more of the following:
UA67242 OA41281
Modules/Macros
ISTAUCAL ISTAUCDL ISTAUCEH ISTAUCFT ISTAUCLA ISTAUCTM ISTAUCUL ISTAUCVR ISTAUNCB ISTCLW ISTINCIC ISTINCON ISTINM01 ISTRAFME ISTRAFM1
Fix information
Fixed component name
VTAM V4 MVS/ESA
Fixed component ID
569511701
Applicable component levels
R1D0 PSY UA67242
UP12/12/21 P F212
Fix is available
Select the PTF appropriate for your component level. You will be required to sign in. Distribution on physical media is not available in all countries.
Document Information
Modified date:
23 January 2013