IBM Support

OA40662: ENTERPRISE EXTENDER LIVENESS TIMER EE PU PAB LOOP AFTER ECSA STORAGE SHORTAGE

A fix is available


APAR status

  • Closed as program error.

Error description

  • After a storage shortage occurred, Enterprise Extender (EE) PABs
    stopped dispatching.  This exposed an EE liveness timer problem
    that resulted in an EE timer PAB scheduling loop.
    

Local fix

Problem summary

  • ****************************************************************
    * USERS AFFECTED: All using Enterprise Extender connections.   *
    ****************************************************************
    * PROBLEM DESCRIPTION: EE timer management does not pace the   *
    *                      number of connections allowed to        *
    *                      be scheduled for liveness processing.   *
    ****************************************************************
    * RECOMMENDATION:                                              *
    ****************************************************************
    NOTE: In general, this APAR addresses large enterprise
          EE environments that consist of thousands of EE endpoints.
          This problem was reported in an EE configuration with over
          ten thousand EE endpoints to a single z/OS CS image.
    
    The problem is summarized as follows:
    
    1) CSA/ECSA storage shortage occurs.
    2) EE timer management SRB runs to schedule EE liveness
       processing.  EE liveness processing runs at approximately
       six second intervals.
    3) EE liveness timer management logic does not adequately pace
       the liveness processes that it schedules at a given time.
    4) Currently, hundreds or thousands of EE lines may be
       scheduled at once to perform liveness (keep-alive)
       processing.  This can put unnecessary strain on the
       system in terms of CPU and ECSA spikes.
    5) In the reported problem, the CRA4 and CRA8 buffer
       pools were defined with small base pool sizes.
    6) These buffers are required for dispatching services
       to dispatch the scheduled processes (PABs).
    7) In this case, the EE lines were scheduled but did not
       dispatch due to the lack of ECSA (CRAs).
       The following messages may be issued:
       IST561I STORAGE UNAVAILABLE: CRA4 BUFFER POOL
       IST561I STORAGE UNAVAILABLE: CRA8 BUFFER POOL
    8) Because these processes did not dispatch, the next
       time the EE timer management ran for liveness
       processing, it scheduled thousands of processes
       including ones that had already been scheduled,
       but had not run yet.
    9) At each six second interval, the EE timer
       management services repeatedly scheduled these
       liveness processes.  Externally, this appears as a VTAM
       hang caused by a TPSCHED loop.
       For small scale EE environments, this inefficiency should
       not be of consequence.
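
The effect described in steps 8 and 9 can be illustrated with a toy model. This is a minimal sketch, not VTAM code: the function name `unpaced_backlog` and its parameters are hypothetical, and it assumes that every connection is rescheduled at each liveness interval whether or not its earlier request has dispatched.

```python
def unpaced_backlog(connections, intervals, dispatch_capacity):
    """Return the number of pending schedule requests after each interval.

    Hypothetical model: with no pacing, every connection is scheduled again
    at each liveness interval, including ones that are already pending.
    dispatch_capacity is how many requests can actually run per interval;
    0 models the case where no CRA buffers are available.
    """
    pending = 0
    history = []
    for _ in range(intervals):
        pending += connections                        # duplicates included
        pending -= min(pending, dispatch_capacity)    # work that did dispatch
        history.append(pending)
    return history


# With 10,000 endpoints and nothing dispatching, the backlog grows each interval.
print(unpaced_backlog(10_000, 3, 0))        # [10000, 20000, 30000]
# When dispatching keeps up, the backlog stays empty.
print(unpaced_backlog(10_000, 3, 10_000))   # [0, 0, 0]
```

In this model the backlog grows without bound only while dispatching is blocked, which matches the note that small EE environments should not see a meaningful effect.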
    

Problem conclusion

  • The overall solution is for EE liveness timer services to
    pace both the number of scheduled EE test requests and the
    number of outstanding EE test requests.  The design allows no
    more than 500 outstanding tests at a time.  This ensures that
    processes are dispatching in a timely fashion before more
    work is scheduled.
    
    For EE networks consisting of fewer than 1000 active
    EE connections, VTAM will process one twelfth of the liveness
    timer queue per 500ms interval.  This will allow EE timer
    services to process the EE liveness queue in approximately
    6 seconds.
    
    For large scale EE networks consisting of more than 1000 active
    EE connections, VTAM will process one twentieth of the liveness
    timer queue per 500ms interval.  This will allow EE timer
    services to process the EE liveness queue in approximately
    10 seconds.
    
    In either size environment, only timers which have expired will
    actually have processes scheduled.  Also, as the scheduled
    liveness processes dispatch, EE timer management will detect
    this and allow more processes to be scheduled at the next
    500ms interval.
    
    This EE liveness pacing algorithm has been designed in a way to
    minimize the CPU costs while preserving the accuracy of the
    timers being managed.
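
The pacing rules above can be sketched as follows. This is an illustrative model under stated assumptions, not the VTAM implementation: the class `LivenessPacer`, its methods, and the round-robin cursor are hypothetical, and a network of exactly 1000 connections (which the text does not cover) is treated here as a small network. Only the 500-test limit, the 500ms interval, and the one-twelfth and one-twentieth fractions come from the description above.

```python
MAX_OUTSTANDING = 500   # limit on outstanding tests, from the APAR text
TICK_MS = 500           # pacing interval, from the APAR text


def slices_per_pass(active_connections):
    """12 slices for small networks, 20 for large ones (assumption: 1000 is small)."""
    return 20 if active_connections > 1000 else 12


def pass_duration_seconds(active_connections):
    """Approximate time to walk the whole liveness queue once."""
    return slices_per_pass(active_connections) * TICK_MS / 1000


class LivenessPacer:
    """Hypothetical model of paced liveness scheduling."""

    def __init__(self, expiry_ms):
        self.expiry_ms = list(expiry_ms)   # per-connection timer expiry time
        self.outstanding = set()           # scheduled but not yet dispatched
        self.cursor = 0                    # next queue position to examine

    def tick(self, now_ms):
        """Examine one slice of the queue; return connections newly scheduled."""
        n = len(self.expiry_ms)
        if n == 0:
            return []
        slice_len = -(-n // slices_per_pass(n))   # ceiling division
        newly_scheduled = []
        for _ in range(slice_len):
            if len(self.outstanding) >= MAX_OUTSTANDING:
                break                             # wait for work to dispatch
            i = self.cursor
            self.cursor = (i + 1) % n
            # Only expired timers that are not already outstanding are scheduled.
            if i not in self.outstanding and self.expiry_ms[i] <= now_ms:
                self.outstanding.add(i)
                newly_scheduled.append(i)
        return newly_scheduled

    def dispatched(self, i, next_expiry_ms):
        """Record that connection i ran its liveness test and rearm its timer."""
        self.outstanding.discard(i)
        self.expiry_ms[i] = next_expiry_ms


print(pass_duration_seconds(500))      # 6.0
print(pass_duration_seconds(12_000))   # 10.0
```

In this sketch a stalled dispatcher causes `tick` to return an empty list once 500 tests are outstanding, rather than rescheduling the same connections, and scheduling resumes at the next interval after `dispatched` is called.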
    

Temporary fix

  • *********
    * HIPER *
    *********
    

Comments

APAR Information

  • APAR number

    OA40662

  • Reported component name

    VTAM V4 MVS/ESA

  • Reported component ID

    569511701

  • Reported release

    1D0

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    YesHIPER

  • Special Attention

    NoSpecatt / Xsystem

  • Submitted date

    2012-10-23

  • Closed date

    2012-11-11

  • Last modified date

    2013-01-23

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

    UA67242 OA41281

Modules/Macros

  • ISTAUCAL ISTAUCDL ISTAUCEH ISTAUCFT ISTAUCLA
    ISTAUCTM ISTAUCUL ISTAUCVR ISTAUNCB ISTCLW   ISTINCIC ISTINCON
    ISTINM01 ISTRAFME ISTRAFM1
    

Fix information

  • Fixed component name

    VTAM V4 MVS/ESA

  • Fixed component ID

    569511701

Applicable component levels

  • R1D0 PSY UA67242

       UP12/12/21 P F212



Document Information

Modified date:
23 January 2013