II13538: DEALING WITH HUNG COUPLING FACILITY CONNECTIONS IXLCONN REASON CODE 02010C27 0C27 02010C09

APAR status

  • Closed as canceled.

Error description

  • DEALING WITH HUNG COUPLING FACILITY CONNECTIONS
    -----------------------------------------------
    When a DB2 member abnormally terminates, its connections
    to the coupling facility structures are put into a
    FAILING state by cross-system extended services for z/OS
    (XES). The FAILING DB2 member remains in this state until
    all surviving members of the group have responded to the
    XES Disconnected/Failed Connection (DiscFailConn) event for
    each structure. XES sends this event to each surviving
    member of the group so that the necessary recovery actions
    can be taken in response to the failed member.
    
    After all surviving members of the group perform the
    necessary recovery actions and provide a response to XES
    for the DiscFailConn event for a given CF structure, XES
    changes the failed DB2 member's connection status for that
    CF structure from FAILING to FAILED PERSISTENT. The DB2
    member can reconnect to the CF structure on restart when the
    member's status is FAILED PERSISTENT.
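
    The transition described above can be pictured with a small
    model. This is an illustrative sketch only: the class, its
    methods, and the member name DB2_DB2N are hypothetical and do
    not correspond to any XES or DB2 interface.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class FailedConnection:
    """Toy model of one failed connector to one CF structure (hypothetical)."""
    structure: str
    connector: str
    survivors: Set[str]                       # surviving members that owe a response
    responded: Set[str] = field(default_factory=set)

    def respond(self, member: str) -> None:
        """Record a surviving member's DiscFailConn response."""
        if member not in self.survivors:
            raise ValueError(f"{member} is not a surviving member")
        self.responded.add(member)

    def owing(self) -> Set[str]:
        """Members whose DiscFailConn response is still outstanding."""
        return self.survivors - self.responded

    @property
    def state(self) -> str:
        """FAILING until every survivor has responded, then FAILED PERSISTENT."""
        return "FAILING" if self.owing() else "FAILED PERSISTENT"


conn = FailedConnection("DB2GR0W_SCA", "DB2_DB2V", {"DB2_DB2M", "DB2_DB2N"})
conn.respond("DB2_DB2M")
print(conn.state, sorted(conn.owing()))   # one response is still owed
```

    In this model a single unresponsive survivor is enough to keep
    the connection in the FAILING state, which is the hung
    condition that the rest of this APAR deals with.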
    
    When you restart the DB2 member immediately following a
    connection failure, it can attempt to reconnect to a CF
    structure while its connection is still in a FAILING state.
    If this occurs, XES denies the reconnect request with a 0C27
    reason code. DB2 responds to this by entering a connection
    retry loop until the connection succeeds or until it reaches
    the maximum retry count.
    
    For the lock structure, a message similar to the following
    message is issued when the member is restarted before cleanup
    of the old structure connection has completed:
    
    IXL013I IXLCONN REQUEST FOR STRUCTURE DSNxxxx_LOCK1 FAILED.
      JOBNAME: xxxxIRLM ASID: 0055 CONNECTOR NAME: DXRDBPG$$xxxx002
      IXLCONN RETURN CODE: 0000000C,  REASON CODE: 02010C27
    
    For the SCA, the maximum retry count is 200 times with a 3
    second interval between each attempt. For the GBPs, the
    maximum retry count is 5 times with a 10 second interval
    between each attempt.
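
    As a rough guide to how long a restarting member keeps
    retrying, the limits above can be turned into an approximate
    time window. This is a back-of-the-envelope sketch: the names
    are hypothetical, and the time taken by each IXLCONN attempt
    itself is ignored.

```python
# Retry limits quoted in the text: (maximum attempts, seconds between attempts).
RETRY_LIMITS = {
    "SCA": (200, 3),
    "GBP": (5, 10),
}


def max_retry_window_seconds(structure_type: str) -> int:
    """Approximate upper bound on the time spent in the retry loop."""
    attempts, interval = RETRY_LIMITS[structure_type]
    return attempts * interval


for kind in RETRY_LIMITS:
    print(kind, max_retry_window_seconds(kind), "seconds")
```

    With the figures quoted above, this gives roughly 600 seconds
    (10 minutes) for the SCA and 50 seconds for a GBP.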
    
    When joining the data sharing group, IRLM connects using the
    currently established naming protocol, which allows it to reuse
    a failed persistent connection. Because recovery for the failed
    connection might not be complete, IRLM can receive a 0C27
    reason code on the CONNECT.
    
    IRLM tolerates the 0C27 reason code on the first CONNECT
    attempt only. It changes the connection name slightly and
    tries the CONNECT again. Because the join to XCF has already
    completed successfully, IRLM does not disconnect from XCF when
    it changes the name. As a result, the XCF group name no
    longer matches the lock structure group name.
    
    If the second attempt to join the group also receives a 0C27
    reason code from XES, IRLM stops the group initialization
    process and denies the DBMS identify.
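
    The two-attempt behavior described above can be sketched as
    follows. This is a model for illustration, not IRLM code:
    try_connect is a hypothetical callable that returns 0 on
    success or the low-order halfword of the IXLCONN reason code
    on failure, and the alternate name is passed in because the
    text does not say how IRLM derives it.

```python
from typing import Callable, Tuple

RSN_FAILING = 0x0C27  # low-order halfword of reason code 02010C27


def irlm_connect(try_connect: Callable[[str], int],
                 original_name: str,
                 alternate_name: str) -> Tuple[bool, str]:
    """Return (connected, connection name used on the last attempt)."""
    rsn = try_connect(original_name)
    if rsn == 0:
        return True, original_name
    if rsn != RSN_FAILING:
        return False, original_name       # other failures are outside this sketch
    # The first 0C27 is tolerated: retry once under a slightly different name.
    rsn = try_connect(alternate_name)
    if rsn == 0:
        return True, alternate_name       # original name stays failed persistent
    return False, alternate_name          # group initialization stops


# Hypothetical stand-in for XES: the original name is still FAILING.
print(irlm_connect(lambda name: RSN_FAILING if name == "ORIG" else 0,
                   "ORIG", "ALT"))
```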
    
    **IMPORTANT NOTE**
    If IRLM connects to the structure under the changed name after
    receiving RSN0C27, a FAILED PERSISTENT connection remains for
    this member under the ORIGINAL CONNECTION ID. A new IRLM
    joining the group deletes failed persistent connections that
    have no RLEs associated with them and are no longer needed.
    Users should ** NOT ** delete these FAILED PERSISTENT
    connections.
    
    The following messages can be issued when an IRLM joins or
    leaves the group:
    
     IXL030I CONNECTOR STATISTICS
     IXL031I CONNECTOR CLEANUP ... FOR CONNECTOR n HAS
          COMPLETED
    
    You may notice a message similar to the following message,
    which indicates a failed connection attempt:
    
    
    IXL013I IXLCONN REQUEST FOR STRUCTURE DB2GR0W_SCA FAILED.
         JOBNAME: DB2VMSTR ASID: 05E1 CONNECTION NAME: DB2_DB2V
         IXLCONN RETURN CODE: 0000000C,   REASON CODE: 02010C27
    
    The preceding message might be displayed multiple times
    while DB2 is in a connection retry loop. This is normal.
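
    When scanning a system log for this condition, the IXL013I
    fields can be pulled out with a regular expression. The
    sketch below is based only on the two message samples shown
    in this APAR (it accepts both CONNECTOR NAME and CONNECTION
    NAME); the function names are hypothetical and the real
    message layout might differ on your system.

```python
import re

IXL013I = re.compile(
    r"IXL013I IXLCONN REQUEST FOR STRUCTURE (?P<structure>\S+) FAILED\.\s+"
    r"JOBNAME: (?P<jobname>\S+)\s+ASID: (?P<asid>[0-9A-F]{4})\s+"
    r"CONNECT(?:OR|ION) NAME: (?P<connector>\S+)\s+"
    r"IXLCONN RETURN CODE: (?P<rc>[0-9A-F]{8}),\s+"
    r"REASON CODE: (?P<rsn>[0-9A-F]{8})"
)


def parse_ixl013i(text: str):
    """Return the message fields as a dict, or None if the text does not match."""
    match = IXL013I.search(text)
    return match.groupdict() if match else None


def is_failing_connection(fields: dict) -> bool:
    """True for return code 0C with a reason code ending in 0C27."""
    return fields["rc"] == "0000000C" and fields["rsn"].endswith("0C27")


sample = """IXL013I IXLCONN REQUEST FOR STRUCTURE DB2GR0W_SCA FAILED.
     JOBNAME: DB2VMSTR ASID: 05E1 CONNECTION NAME: DB2_DB2V
     IXLCONN RETURN CODE: 0000000C,   REASON CODE: 02010C27"""
fields = parse_ixl013i(sample)
print(fields["structure"], fields["connector"], is_failing_connection(fields))
```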
    
    In rare cases, one or more of the surviving members of a
    group will encounter difficulties in providing the
    DiscFailConn response to XES for a given CF structure. XES
    issues a message similar to the following message for each
    DB2 member that it does not receive a response from within
    two minutes:
    
    IXL041I CONNECTOR NAME:DB2_DB2M, JOBNAME:DB2MMSTR, ASID:0086
      HAS NOT RESPONDED TO THE DISCONNECTED/FAILED CONNECTION
      EVENT FOR SUBJECT CONNECTION: DB2_DB2V.
      DISCONNECT/FAILURE PROCESS FOR STRUCTURE DB2GR0W_SCA CANNOT
      CONTINUE.
      MONITORING FOR RESPONSE STARTED: 08/08/2002 23:50:23.
      DIAG: 0000 0000 00000000
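
    To see at a glance which members owe a response, IXL041I
    messages collected from the log can be grouped by structure
    and failed connector. The sketch below assumes the message
    layout shown in the sample above; the function name is
    hypothetical.

```python
import re
from collections import defaultdict

IXL041I = re.compile(
    r"IXL041I CONNECTOR NAME:\s*(?P<connector>[^,\s]+),\s*"
    r"JOBNAME:\s*(?P<jobname>[^,\s]+),\s*ASID:\s*(?P<asid>[0-9A-F]{4})\s+"
    r"HAS NOT RESPONDED TO THE\s+DISCONNECTED/FAILED\s+CONNECTION\s+"
    r"EVENT\s+FOR\s+SUBJECT\s+CONNECTION:\s*(?P<subject>[^.\s]+)\.\s+"
    r"DISCONNECT/FAILURE PROCESS FOR STRUCTURE\s+(?P<structure>\S+)"
)


def owed_responses(log_text: str) -> dict:
    """Map (structure, failed connector) to the connectors that owe a response."""
    owed = defaultdict(set)
    for match in IXL041I.finditer(log_text):
        owed[(match["structure"], match["subject"])].add(match["connector"])
    return {key: sorted(names) for key, names in owed.items()}


sample = """IXL041I CONNECTOR NAME:DB2_DB2M, JOBNAME:DB2MMSTR, ASID:0086
  HAS NOT RESPONDED TO THE DISCONNECTED/FAILED CONNECTION
  EVENT FOR SUBJECT CONNECTION: DB2_DB2V.
  DISCONNECT/FAILURE PROCESS FOR STRUCTURE DB2GR0W_SCA CANNOT
  CONTINUE."""
print(owed_responses(sample))
```

    The connectors listed for a structure are the candidates for
    the cancel/recycle action in the operator procedure.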
    
    In extreme cases, the maximum number of connection retries
    will be reached. If encountered for the SCA, this prevents
    the failed member from restarting and DB2 issues a message
    similar to the following message:
    
    DSN7506A  -DB2V DSN7LSTK
    CONNECTION TO THE SCA STRUCTURE DB2GR0W_SCA FAILED.
     MVS IXLCONN RETURN CODE = 0000000C,
     MVS IXLCONN REASON CODE = 02010C27.
    
    IRLM retries the connection repeatedly and issues messages
    similar to the following message:
    
     DXR133I xxxx002 TIMEOUT DURING GLOBAL INITIALIZATION WAITING
            FOR aaaa
    
    Eventually, IRLM terminates abnormally with a message similar
    to the following message:
    
      DXR122E xxxx013 ABEND UNDER IRLM TCB/SRB IN MODULE DXRRL732
             ABEND CODE=U2025 ( U2025 )
    
    Operator actions for dealing with hung CF connections
    -------------------------------------------------------
    Perform the following actions to recover from hung CF
    structure connections.
    
    1. Take a dump of all DB2 and IRLM members, specifying
       SDATA=(COUPLE,XESDATA), so that IBM Software Support can
       determine what is causing the hung connections. See
       informational APAR II10850 for more information.
    
    2. Attempt a rebuild of the lock structure. Sometimes
       the SCA rebuild process is suspended on an IRLM
       lock request, and a rebuild of the lock structure might
       free the stalled lock request and clear the condition
       that is causing the DiscFailConn response to hang. If the
       rebuild of the lock structure works, XES issues a message
       similar to the following message for each group member as
       it provides the required DiscFailConn response:
    
    IXL043I CONNECTION NAME: DB2_DB2M, JOBNAME: DB2MMSTR,
       ASID: 0086 HAS PROVIDED THE REQUIRED RESPONSE. THE
       REQUIRED RESPONSE FOR THE DISCONNECTED/FAILED CONNECTION
       EVENT FOR SUBJECT CONNECTION DB2_DB2V,
       STRUCTURE DB2GR0W_SCA IS NO LONGER EXPECTED
    
       If the Rebuild does not work, proceed to step 3.
    
    3. Issue the following command for the structure/connector
       that is in the FAILING state:
          D XCF,STR,STRNM=<strname>,CONNM=<conname>
       Alternatively, issue the following command:
          D XCF,STR,STRNM=<strname>,CONNM=ALL
    
       If this command identifies the unresponsive members, skip
       to Step 6. If it does not identify the unresponsive
       members, proceed to Step 4.
    
    4. Attempt a structure Rebuild for the affected structure,
       if you have not already done this.
    
    5. If the Rebuild hangs, issue the D XCF,STR,STRNM=<strname>
       command to identify the unresponsive connector.
    
       This will identify the members that are unresponsive to
       the Rebuild. These members are probably the same members
       that are unresponsive to the DiscFailConn event.
    
    6. Cancel/recycle the unresponsive members. The STOP DB2
       command might not work because internal DB2 processes
       are hung, so issue the MODIFY irlmproc,ABEND command to
       bring down IRLM, or cancel IRLM and DB2 MSTR.
    
       As each member terminates, ensure that XES issues message
       IXL043I to indicate that it no longer expects a
       DiscFailConn response from that member. When all members
       that owe responses have been stopped, all connections to
       the SCA go away.
    
    7. Issue the D XCF,STR,STRNM=<sca>,CONNM=ALL command to
       verify the status of the connections to SCA.
    
    8. Restart all DB2 members with FAILED PERSISTENT
       connections.
       As each member successfully reconnects to the SCA, XES
       issues message IXL014I. If a problem still exists,
       proceed to Step 9.
    
    9. Take down/restart the systems on which the unresponsive
       members are running. If restarting the system does not
       fix the unresponsive members, proceed to Step 10.
    
    10. Cancel/recycle all connections to CF structures. If a
        problem still exists, proceed to Step 11.
    
    11. Take down/restart all systems.
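
    For sites that script or document these steps, the operator
    commands named in the procedure can be assembled as plain
    strings. This is a sketch with hypothetical helper names and
    a hypothetical IRLM procedure name (DB2VIRLM). The D XCF and
    MODIFY forms come from the steps above; the SETXCF
    START,REBUILD form is not quoted in this APAR and is the
    standard z/OS command for starting a structure rebuild, so
    verify it against your system commands reference before use.

```python
def display_structure(strname: str, conname: str = "") -> str:
    """D XCF command from steps 3, 5, and 7; conname can be 'ALL'."""
    command = f"D XCF,STR,STRNM={strname}"
    if conname:
        command += f",CONNM={conname}"
    return command


def rebuild_structure(strname: str) -> str:
    """Rebuild request for steps 2 and 4 (command form assumed, see lead-in)."""
    return f"SETXCF START,REBUILD,STRNAME={strname}"


def abend_irlm(irlmproc: str) -> str:
    """IRLM abend request from step 6."""
    return f"MODIFY {irlmproc},ABEND"


# Commands for the SCA example used throughout this APAR.
for command in (
    display_structure("DB2GR0W_SCA", "DB2_DB2V"),
    display_structure("DB2GR0W_SCA", "ALL"),
    rebuild_structure("DB2GR0W_SCA"),
    display_structure("DB2GR0W_SCA"),
    abend_irlm("DB2VIRLM"),
):
    print(command)
```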
    
    
    Refer to z/OS Recovery and Reconfiguration Guide for more
    information.
    Other references:
    - MVS APAR OW46531: the closing text gives additional
      information on the proper handling of 'FAILING' CF structure
      connections.
    - MVS doc APAR OW29300.
    

Local fix

Problem summary

Problem conclusion

Temporary fix

Comments

  • can
    Informational apar
    

APAR Information

  • APAR number

    II13538

  • Reported component name

    PB LIB INFO ITE

  • Reported component ID

    INFOPBLIB

  • Reported release

    001

  • Status

    CLOSED CAN

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt

  • Submitted date

    2003-03-20

  • Closed date

    2005-01-05

  • Last modified date

    2006-04-26

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

Fix information

Applicable component levels



Document information


More support for:

z/OS family

Software version:

001

Operating system(s):

MVS

Reference #:

II13538

Modified date:

2006-04-26
