PM74875: CMAS NETWORK CONNECTION TIMEOUTS AND REPOSITORY SYNCHRONIZATION FAILURES INCLUDING ISOLATION EVENTS DURING PERIODS OF HIGH CPU

A fix is available

APAR status

  • Closed as program error.

Error description

  • During periods of high CPU utilization, CMAS network
    connections may time out and fail repository synchronization,
    possibly leading to an eventual CMAS isolation event.
    In this specific instance, the isolation events occurred during
    the evenings when CPU was highly utilized by higher priority
    batch service classes; however, any slowdown or logic error
    where a CICS MAS might acquire a global lock could also cause
    a CMAS isolation event.
    The specific symptoms for this APAR show up as follows:
    - During the high CPU period the FS (First-Speaker) CMAS begins
    communication with the QS (Quiet-Side) CMAS.
    - Each CMAS connection requires a pair of send/receive link
    tasks to be attached in both the FS and QS CMASes.
    - The FS CMAS initiated communication with the QS CMAS.  The QS
    CMAS attached a send link task and receive link task.
    - The receive link task spawns a network lock task under
    transid LPLK in the QS CMAS.  Due to the high CPU, the LPLK
    task enters an XSWX wait for exclusive access to the network
    lock.
    - While the receive link task in the QS CMAS is waiting for
    exclusive control of the network lock, the send link task in
    the QS CMAS is timed out, as indicated in the QS CMAS EYULOG
    by message:
    EYUCS0204W A timeout has occurred with CMAS xyz (where xyz is
    the FS CMAS).
    - The abnormal termination of the send link task due to the
    timeout cleans up and resets the link descriptor for the
    connection to the FS CMAS.
    - During this time the initial receive link task in the QS CMAS
    is still waiting for exclusive access to the network lock so it
    can generate new DLSTs and TLSTs for the direct connection to
    the FS CMAS.
    - The FS CMAS now attempts to re-establish the connection by
    contacting the QS CMAS again.  This results in a new receive
    link task being attached in the QS CMAS as it finds the link in
    RESET from the previous send link task timeout termination.
    - The timing issue occurs when the new receive link task starts
    while the link is in RESET.  The previous receive link task
    then obtains exclusive access to the network lock, calls CPDG
    and CPAG to generate DLSTs and TLSTs for the direct connection
    to the FS CMAS, and is then purged by CSLT with AZI2 for MRO
    because it is in CICS communications.
    - At this point the communications control structures indicate
    a direct connection however there is no send link task in the
    QS CMAS.
    - The FS CMAS continues to attempt contact every 2 minutes with
    the QS CMAS until it finally is isolated after almost 1 hour.
    - The QS CMAS issues the following message every 2 minutes:
    EYUCL0113E Receive Link Task terminated abnormally for MRO
    Network connection with CMAS (FS CMAS).
    - The FS CMAS issues the following message every 2 minutes:
    EYUCS0009I Message received from CMAS (QS CMAS) : EYUCS0206W
    Exception trace issued: Point Id= 28 ,Debug text= ConExist .
    - The FS CMAS eventually issues message:
    EYUCP0023E CPQCM08 A communications failure has occurred while
    performing Repository Synchronization with CMAS (QS CMAS).
    That CMAS is being isolated.
    - The QS CMAS issues corresponding message:
    EYUCP0024E A communications failure has occurred while
    performing Repository Synchronization.  This CMAS is being
    isolated at the request of CMAS (FS CMAS).
    Additional Symptom(s) Search Keyword(s): KIXREVWJB
    - EYUCP0205S Repository Synchronization with CMAS (QS CMAS)
    failed.
    - EYUCP0203I Repository Synchronization started with (QS CMAS).
    - EYUXM0505E Method = CLMU Debug = CLMUXMSM Comp = 08
    SComp = L MsgID = 0000 PtID (XMSM) = 0C
    -
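
    The message IDs listed above can be counted in a saved copy of
    a CMAS EYULOG to check whether a log matches this symptom
    pattern.  The following is a minimal sketch only, not an IBM
    tool.  The function names and the sample lines are
    illustrative assumptions; the sample reuses message text
    quoted in this APAR and assumes nothing about the timestamp or
    applid prefix of a real EYULOG line.

```python
import re
from collections import Counter

# Message IDs quoted in the symptom list of this APAR.
APAR_MESSAGE_IDS = {
    "EYUCS0204W", "EYUCL0113E", "EYUCS0009I", "EYUCS0206W",
    "EYUCP0023E", "EYUCP0024E", "EYUCP0205S", "EYUCP0203I",
    "EYUXM0505E",
}

# Shape of the IDs above: "EYU", two letters, four digits, one severity letter.
MESSAGE_ID = re.compile(r"\bEYU[A-Z]{2}\d{4}[A-Z]\b")


def count_apar_messages(lines):
    """Count occurrences of the APAR's message IDs in an iterable of log lines."""
    counts = Counter()
    for line in lines:
        for msg_id in MESSAGE_ID.findall(line):
            if msg_id in APAR_MESSAGE_IDS:
                counts[msg_id] += 1
    return counts


def isolation_reported(counts):
    """True if either isolation message (EYUCP0023E or EYUCP0024E) was seen."""
    return counts["EYUCP0023E"] > 0 or counts["EYUCP0024E"] > 0


# Hypothetical sample built from message text quoted in this APAR.
sample_log = [
    "EYUCS0204W A timeout has occurred with CMAS xyz",
    "EYUCL0113E Receive Link Task terminated abnormally for MRO",
    "EYUCL0113E Receive Link Task terminated abnormally for MRO",
    "EYUCP0024E A communications failure has occurred while",
]

if __name__ == "__main__":
    counts = count_apar_messages(sample_log)
    for msg_id, n in sorted(counts.items()):
        print(msg_id, n)
    print("isolation reported:", isolation_reported(counts))
```

    Repeated EYUCL0113E entries followed by EYUCP0023E or
    EYUCP0024E are consistent with the sequence described above,
    but a match on message IDs alone does not confirm this APAR.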
    

Local fix

Problem summary

  • ****************************************************************
    * USERS AFFECTED: All CICSPlex SM V4R1M0 and V4R2M0 Users      *
    ****************************************************************
    * PROBLEM DESCRIPTION: -  During periods of high-CPU           *
    *                         utilization CMAS network connections *
    *                         may time out and may fail repository *
    *                         synchronization, possibly leading to *
    *                         an eventual CMAS isolation event.    *
    *                                                              *
    *                         Either CMAS partner may receive      *
    *                         multiple EYUCL0113E messages         *
    *                         indicating that a SEND and/or        *
    *                         RECEIVE link task has terminated     *
    *                         abnormally.  If isolation occurs,    *
    *                         one partner will issue message       *
    *                         EYUCP0023E while the other partner   *
    *                         will issue message EYUCP0024E.       *
    *                                                              *
    *                         Additionally, one partner may issue  *
    *                         multiple instances of EYULOG message *
    *                         EYUCS0009I.  The text of this        *
    *                         message will be similar to the       *
    *                         following:                           *
    *                                                              *
    *                           EYUCS0009I Message received from   *
    *                           CMAS (<partner cmasname>) :        *
    *                           EYUCS0206W Exception trace issued: *
    *                           Point Id= 28 ,Debug text=          *
    *                           ConExist .                         *
    *                                                              *
    *                      -  If a send task for a CMAS to CMAS    *
    *                         connection does not respond to a     *
    *                         terminate request during CMAS        *
    *                         termination and is forcepurged, an   *
    *                         abend ASRA/S0C4 may occur in method  *
    *                         EYU0CLMS (CLMS - MRO connection) or  *
    *                         EYU0CLCS (CLCS - LU62 connection).   *
    ****************************************************************
    * RECOMMENDATION: After applying the PTF that resolves this    *
    *                 APAR, all CMASes must be restarted.  Note    *
    *                 that the restarts do not need to occur at    *
    *                 the same time.                               *
    ****************************************************************
    -  When a connection between two CMASes is timed out because one
       of the partners is not responding, the CMAS with the lower
       alphabetic SYSID will attempt to re-acquire the connection.
       If during the new connection attempt, the Receive task in the
       partner CMAS hangs waiting to acquire the CPSM communications
       network lock, it is possible that the new connection attempt
       will time out also.  When that occurs, the SEND link task
       will be timed out prior to the RECEIVE link task.  If the
       Receive task in the partner CMAS acquires the lock after the
       Send task terminates but before the Receive task itself is
       timed out, then
       method EYU0CLMU (CLMU - MRO connection) or EYU0CLCU (CLCU -
       LU62 connection) does not verify that the Send task is still
       active, and completes the Receive link side of the
       connection.  However, since the Send task has already
       terminated, the connection does not fully complete, leading
       to the errors described above.
    
    -  When a CMAS to CMAS connection send task is forcepurged, a
       recovery routine is given control in method CLMS or CLCS,
       where control block pointers are restored and link cleanup is
       performed.  If the CMAS is in termination, not all control
       block pointers are restored properly, which can result in
       abend ASRA/S0C4 when CLMS or CLCS performs link cleanup.
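
    The first item above (the connection race) depends only on the
    order of two events in the partner CMAS.  The following is a
    minimal sketch that replays both orderings against a toy
    model of the pre-fix behavior.  The class, its fields, and
    the event names are hypothetical and do not correspond to
    real CPSM control blocks or methods.

```python
from dataclasses import dataclass


@dataclass
class QuietSideLink:
    """Toy model of one CMAS-to-CMAS link as seen by the partner CMAS."""
    send_task_active: bool = False
    marked_connected: bool = False  # control structures show a connection

    def attach_link_tasks(self) -> None:
        # A connection attempt attaches a send and a receive link task.
        self.send_task_active = True

    def send_task_times_out(self) -> None:
        # The timed-out send task cleans up and resets the link.
        self.send_task_active = False
        self.marked_connected = False

    def receive_task_completes(self) -> None:
        # Pre-fix behavior: the receive side completes the connection
        # without checking whether the send task is still active.
        self.marked_connected = True

    @property
    def half_open(self) -> bool:
        # Marked as connected, but no send task exists to service it.
        return self.marked_connected and not self.send_task_active


def replay(events):
    """Attach the link tasks, then apply the named events in order."""
    link = QuietSideLink()
    link.attach_link_tasks()
    for event in events:
        getattr(link, event)()
    return link


# Lock obtained promptly: the receive side completes while send is active.
healthy = replay(["receive_task_completes"])

# Lock obtained late: the send task has already timed out.
stuck = replay(["send_task_times_out", "receive_task_completes"])

if __name__ == "__main__":
    print("healthy half_open:", healthy.half_open)
    print("stuck half_open:", stuck.half_open)
```

    In the second ordering the model ends up marked as connected
    with no active send task, which corresponds to the incomplete
    connection described in the first item.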
    

Problem conclusion

  • -  Methods CLMU and CLCU have been updated to verify that the
       Send task is still active after acquiring the CPSM
       communications network lock.  If the Send task is no longer
       active, CLMU/CLCU cleans up the link and terminates.
    
    -  Methods CLMS and CLCS have been updated to ensure that all
       control block pointers are restored if a send task is
       forcepurged during CMAS termination.
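
    The CLMU/CLCU change in the first item follows a general
    pattern: state that was valid before a lock wait is verified
    again once the lock is held.  The following is a minimal
    sketch of that pattern using Python threads.  It is not the
    CPSM implementation; the Link class, the function names, and
    the use of threading.Lock as a stand-in for the
    communications network lock are all assumptions made for
    illustration.

```python
import threading


class Link:
    """Hypothetical link state shared by the send and receive tasks."""

    def __init__(self):
        self.send_task_active = True
        self.connected = False


def receive_link_task(link, network_lock, outcome):
    """Wait for the lock, then re-check the send task before completing."""
    with network_lock:
        if not link.send_task_active:
            # The send task went away during the wait: clean up and stop.
            link.connected = False
            outcome.append("cleaned_up")
            return
        link.connected = True
        outcome.append("connected")


def run_scenario(send_times_out_during_wait):
    """Run one receive task while the lock is initially held elsewhere."""
    link = Link()
    network_lock = threading.Lock()
    outcome = []

    network_lock.acquire()  # simulate another task holding the lock
    task = threading.Thread(
        target=receive_link_task, args=(link, network_lock, outcome)
    )
    task.start()  # the receive task now blocks on the lock

    if send_times_out_during_wait:
        link.send_task_active = False  # send task times out meanwhile

    network_lock.release()  # the receive task can now proceed
    task.join()
    return link, outcome[0]


if __name__ == "__main__":
    for timed_out in (False, True):
        link, result = run_scenario(timed_out)
        print(timed_out, result, link.connected)
```

    Because the check runs only after the lock has been obtained,
    a send task that timed out during the wait is always seen, and
    the link is not left marked as connected.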
    

Temporary fix

  •             *********
                * HIPER *
                *********
    FIX AVAILABLE BY PTF ONLY
    

Comments

APAR Information

  • APAR number

    PM74875

  • Reported component name

    CICS TS Z/OS V4

  • Reported component ID

    5655S9700

  • Reported release

    70M

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    YesHIPER

  • Special Attention

    NoSpecatt

  • Submitted date

    2012-10-11

  • Closed date

    2012-11-29

  • Last modified date

    2013-01-02

  • APAR is sysrouted FROM one or more of the following:

    PM74407

  • APAR is sysrouted TO one or more of the following:

    PM77989 UK83861 UK83862

Modules/Macros

  • EYU0CLCS EYU0CLCT EYU0CLCU EYU0CLMS EYU0CLMT EYU0CLMU
    

Fix information

  • Fixed component name

    CICS TS Z/OS V4

  • Fixed component ID

    5655S9700

Applicable component levels

  • R60M PSY UK83861

       UP12/12/04 P F212

  • R70M PSY UK83862

       UP12/12/04 P F212

Fix is available

  • Select the PTF appropriate for your component level. You will be required to sign in. Distribution on physical media is not available in all countries.



Document information


More support for:

CICS Transaction Server

Software version:

4.2

Reference #:

PM74875

Modified date:

2013-01-02
