PM74532: UPD DB STOP(QUIESCE) FAILS, IMS HALDB NO LONGER ACCESSIBLE

A fix is available

APAR status

  • Closed as program error.

Error description

  • A database quiesce fails because an IMS goes down, leaving the
    databases stuck in a database quiesce state. The only way to
    get them out of that state is to recycle the RMs and IMSs.
    The V12 base code will be retrofitted to V11.
    

Local fix

Problem summary

  • ****************************************************************
    * USERS AFFECTED: IMS V11 users of IMSplex-wide                *
    *                 processes using RM, such as database         *
    *                 quiesce, global online change, and           *
    *                 UPD IMS SET(PLEXPARM) command                *
    ****************************************************************
    * PROBLEM DESCRIPTION: An UPDATE DB STOP(QUIESCE) command      *
    *                      fails with reason code 4124 after a     *
    *                      previous UPDATE DB START(QUIESCE)       *
    *                      command failed because an IMS           *
    *                      came down.                              *
    ****************************************************************
    * RECOMMENDATION: INSTALL CORRECTIVE SERVICE FOR APAR/PTF      *
    ****************************************************************
    If an RM client participating in a global process step
    terminates normally or abnormally, or its SCI goes down
    in the middle of the process step, the process step hangs
    until it times out.  IMS functions that use RM to coordinate
    IMSplex-wide processes are never returned a response.
    The IMS functions that use the RM IMSplex-wide process
    function include:
    - database quiesce (UPD DB START or STOP QUIESCE command)
    - global and ACBLIB member online change (INITIATE OLC command)
    - UPD IMS SET(PLEXPARM) command
    The process step timeout value is set to be the same as the
    timeout value specified on the command that initiated the
    RM IMSplex-wide process, so the process step could time out
    long after the RM client or its SCI went down.
    
    In the case of database quiesce, IMS databases are left in
    a database quiesce state, with no way to remove the quiesce
    state other than recycling the RMs and IMSs and scratching
    the resource structure.
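
The timing consequence described above can be illustrated with a small model. The following Python sketch is not IMS or RM code: the `Participant` fields and both functions are hypothetical names chosen for illustration. It compares when a process step ends if RM waits for the step timeout with when it ends if RM answers on behalf of a failed client as soon as the failure is detected.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Participant:
    """Hypothetical model of one RM client taking part in a process step."""
    name: str
    responds_at: Optional[float] = None  # seconds after step start; None = no response
    fails_at: Optional[float] = None     # seconds after step start; None = stays up


def step_end_waiting_for_timeout(participants: List[Participant], timeout: float) -> float:
    """The step ends when every client has responded, otherwise at the timeout."""
    latest = 0.0
    for p in participants:
        if p.responds_at is None or p.responds_at > timeout:
            return timeout  # a missing response holds the step until it times out
        latest = max(latest, p.responds_at)
    return latest


def step_end_with_failure_detection(participants: List[Participant], timeout: float) -> float:
    """A failed client is answered for as soon as its failure is detected."""
    latest = 0.0
    for p in participants:
        events = [t for t in (p.responds_at, p.fails_at) if t is not None]
        resolved_at = min(events) if events else timeout
        latest = max(latest, min(resolved_at, timeout))
    return latest


# IMS1 responds after 2 seconds; IMS2 goes down after 5 seconds without responding.
members = [Participant("IMS1", responds_at=2.0), Participant("IMS2", fails_at=5.0)]
slow = step_end_waiting_for_timeout(members, timeout=300.0)
fast = step_end_with_failure_detection(members, timeout=300.0)
print(slow, fast)
```

With a command timeout of 300 seconds, the first function holds the step for the full 300 seconds, while the second ends it at the 5-second mark when IMS2 is detected as down.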
    

Problem conclusion

  • RM needs to be changed to be more usable in this situation.
    RM should return a non-zero completion code for the RM client
    that went down right away, rather than waiting for the process
    step to time out.  This would enable the RM client to take
    action, such as terminating the IMSplex-wide process or
    cleaning up on behalf of the failed client.
    This would also improve IMS availability, because the command
    would fail as soon as an IMS, RM, or SCI failed, freeing
    IMS up to do other work.
    
    Database quiesce needs to be changed to detect that the
    command failed because an IMS or RM terminated or is not
    reachable through SCI, and terminate the database quiesce
    process and remove databases from the database quiesce
    state.
    
    This fix does not address the case where none of the RMs
    in the IMSplex are available after the database quiesce
    process is started.
    
    This fix does not address the case where the database quiesce
    command master goes down in the middle of a database quiesce
    command.
    
    CSLRAWDR adds a new AWE function for client not reachable.
    
    CSLRAWXI adds equates for the new AWE process functions to
    CSLRPR20, to clean up processes for a terminated member that
    is involved in IMSplex-wide processes.
    
    CSLRCODE adds a client not reachable RM code.
    
    CSLRPRSR adds bits to indicate member terminated normally,
    member terminated abnormally, and member not reachable through
    SCI.
    
    CSLRPR20 adds logic to handle new functions for member
    terminated normally, member terminated abnormally, and member
    not reachable through SCI, by providing a response to an
    IMSplex-wide process that is in progress, rather than waiting
    for the process to time out.
    
    CSLRPR30 adds logic to set new completion codes for the cases
    where the member terminated normally, the member terminated
    abnormally, and the member is not reachable.
    
    CSLRREG0 adds code to enqueue an AWE to CSLRPR20 to check
    if a terminated member is participating in an IMSplex-wide
    process, so that a response can be issued on that client's
    behalf.  This enables the client, such as global online change
    or database quiesce, to take action and clean up.
    
    CSLRRR adds new reason codes for member abended, member
    shutdown, and member not reachable through SCI.
    
    CSLRTM10 is recompiled for the CSLRAWDR change.
    
    CSLRTM20 is recompiled for the CSLRAWDR change.
    
    CSLRXNF0 adds a check for member not reachable, in order
    to set a unique function code for CSLRREG0 to differentiate
    between a member that terminated normally, a member that
    terminated abnormally, and a member that is not reachable
    through SCI.
    
    DFSCCTX0 adds new completion code text for member abended,
    member shutdown, and member is not reachable through SCI.
    
    DFSCMDRR adds new IMS completion codes for member abended,
    member shutdown, and member is not reachable through SCI.
    
    DFSDBQ00 adds logic to handle the new RM reason codes by
    cleaning up the database quiesce process and converting the
    new RM reason codes into IMS reason codes.
    
    DFSGPM10 adds logic to handle the new RM reason codes by
    converting them into IMS reason codes.
    
    DFSOLC00 adds logic to handle the new RM completion codes by
    converting them to IMS reason codes.
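
The handling described for DFSDBQ00 (clean up the quiesce process and convert the RM-reported condition into an IMS code) can be sketched as follows. This is an illustrative Python model, not the module's logic: `MemberFailure`, `COMPLETION_CODE`, and `fail_quiesce` are hypothetical names, the actual RM reason code values are not given in this APAR, and the pairing of each condition with 14A, 14B, or 14C is inferred from the completion code descriptions in the documentation changes.

```python
from enum import Enum
from typing import Dict, Set, Tuple


class MemberFailure(Enum):
    """Hypothetical labels for the three conditions RM now reports."""
    NOT_REACHABLE = "member not reachable through SCI"
    SHUTDOWN = "member terminated normally"
    ABENDED = "member terminated abnormally"


# Assumed pairing, based on the completion code text in this APAR.
COMPLETION_CODE: Dict[MemberFailure, str] = {
    MemberFailure.NOT_REACHABLE: "14A",
    MemberFailure.SHUTDOWN: "14B",
    MemberFailure.ABENDED: "14C",
}


def fail_quiesce(
    quiesced: Set[str], failures: Dict[str, MemberFailure]
) -> Tuple[Set[str], Dict[str, str]]:
    """Return the databases still in quiesce state and a code per failed member.

    If any member failed, the quiesce process is terminated and every
    database is removed from the quiesce state.
    """
    codes = {member: COMPLETION_CODE[failure] for member, failure in failures.items()}
    remaining = set() if failures else set(quiesced)
    return remaining, codes


remaining, codes = fail_quiesce({"DBHAL01", "DBHAL02"}, {"IMS2": MemberFailure.ABENDED})
print(remaining, codes)
```

The point of the sketch is that cleanup and code conversion happen together, so no database is left in quiesce state once a participant failure has been reported.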
    
    **********************
    * DOCUMENTATION      *
    **********************
    These documentation changes are needed in subsequent IMS
    releases as well.
    
    IMS V11 Commands, Volume 1: IMS Commands A - M
    SC19-2430-02
    INITIATE OLC command
      Add the following completion codes:
      14A Command failed for the IMS because its SCI is not active
          so that the IMS is not reachable.
    
      14B Command failed for the IMS because it shut down normally
          during command processing.
    
      14C Command failed for the IMS because it terminated
          abnormally during command processing.
    
    IMS V11 Commands, Volume 2: IMS Commands N - V
    SC19-2431-02
    TERMINATE OLC command
      Add the following completion codes:
      14A Command failed for the IMS because its SCI is not active
          so that the IMS is not reachable.
    
      14B Command failed for the IMS because it shut down normally
          during command processing.
    
      14C Command failed for the IMS because it terminated
          abnormally during command processing.
    
    UPDATE DB command
      Add the following usage notes:
      When a nonzero return code is received for the UPDATE DB
      START(QUIESCE) with OPTION(HOLD) command or the UPDATE DB
      STOP(QUIESCE) command, you may need to correct the problem
      reported by the IMS reason code/completion code and
      try the command again.
    
      Add the following completion codes:
      14A Command failed for the IMS because its SCI is not active
          so that the IMS is not reachable.
    
      14B Command failed for the IMS because it shut down normally
          during database quiesce processing.
    
      14C Command failed for the IMS because it terminated
          abnormally during database quiesce processing.
    
    UPDATE IMS command
      Add the following completion codes:
      14A Command failed for the IMS because its SCI is not active
          so that the IMS is not reachable.
    
      14B Command failed for the IMS because it shut down normally
          during command processing.
    
      14C Command failed for the IMS because it terminated
          abnormally during command processing.
    
    **********************
    * AUTOMATION         *
    **********************
    If the UPD DB START(QUIESCE), INITIATE OLC, TERMINATE OLC,
    or UPD IMS SET(PLEXPARM()) command is issued and
    one of the participants goes down or its SCI goes down,
    the command will now fail with new completion codes
    14A, 14B, or 14C, rather than the command timing out.
    Users may want to change their automation to handle
    these new completion codes.
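
As one possible shape for such an automation change, the Python sketch below classifies per-IMS completion codes and chooses a follow-up action. It is a minimal illustration under stated assumptions: `plan_action` and `plan_actions` are hypothetical helpers, the action strings are placeholders, a completion code of 0 is assumed to mean success, and how the automation obtains the codes from the command response is not shown.

```python
from typing import Dict

# Completion codes added by this APAR, with meanings taken from the
# documentation changes listed above.
MEMBER_DOWN_CODES: Dict[str, str] = {
    "14A": "SCI not active, IMS not reachable",
    "14B": "IMS shut down normally during command processing",
    "14C": "IMS terminated abnormally during command processing",
}


def plan_action(completion_code: str) -> str:
    """Map one completion code to a placeholder automation action."""
    code = completion_code.strip().upper()
    if code == "0":  # assumption: 0 means the command succeeded for this IMS
        return "none"
    if code in MEMBER_DOWN_CODES:
        # A participant or its SCI went down: correct the problem, then
        # try the command again.
        return "check member and reissue command"
    return "escalate"


def plan_actions(results: Dict[str, str]) -> Dict[str, str]:
    """Apply plan_action to a mapping of IMS member name to completion code."""
    return {member: plan_action(code) for member, code in results.items()}


actions = plan_actions({"IMS1": "0", "IMS2": "14c", "IMS3": "90"})
print(actions)
```

Treating the three new codes as one "member unavailable" class keeps the automation rule simple. A site that needs to distinguish a normal shutdown from an abend can branch on the individual code instead.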
    

Temporary fix

  • *********
    * HIPER *
    *********
    

Comments

APAR Information

  • APAR number

    PM74532

  • Reported component name

    IMS V11

  • Reported component ID

    5635A0200

  • Reported release

    100

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    YesHIPER

  • Special Attention

    NoSpecatt

  • Submitted date

    2012-10-08

  • Closed date

    2012-12-03

  • Last modified date

    2013-01-02

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

    UK83958

Modules/Macros

  •    CSLRCODE CSLRPR20 CSLRPR30 CSLRREG0 CSLRRR
    CSLRTM10 CSLRTM20 CSLRXNF0 DFSCCTX0 DFSCMDRR DFSDBQ00 DFSGPM10
    DFSOLC00
    

Publications Referenced
SC19243002 SC19243102      

Fix information

  • Fixed component name

    IMS V11

  • Fixed component ID

    5635A0200

Applicable component levels

  • R100 PSY UK83958

       UP12/12/06 P F212

Fix is available

  • Select the PTF appropriate for your component level. You will be required to sign in. Distribution on physical media is not available in all countries.


