IBM Support

PH37290: WMQ START CHINIT HANGS

A fix is available

APAR status

  • Closed as program error.

Error description

  • MQ Development describes the following sequence of events. A
    START CHINIT command was issued from a CSQINP2 dataset. The
    channel initiator address space subsequently started, but its
    initialisation did not complete until some time later.
    Meanwhile, automation issued a second START CHINIT command.
    This was before the first CHINIT instance had successfully
    connected to the QMGR, and this timing window allowed for a
    second channel initiator address space to start.
    The first CHINIT instance then went on to connect to the QMGR,
    subsequently placing its ASID and xGwa address in the QMGR's
    MGBL control block. While processing the second START CHINIT
    command, the command processor proceeded to overwrite the saved
    CHINIT ASID in the QMGR MGBL with the ASID of the second CHINIT
    instance.
    The second CHINIT instance then tried to connect to the QMGR
    and detected that another CHINIT was already connected. This
    resulted in the second CHINIT instance failing to start with
    message CSQX007E with reason MQRC_DUPLICATE_RECOV_COORD. This
    left the QMGR's MGBL control block in an inconsistent state:
    the xGwa address correctly pointed to the xGwa in the original
    CHINIT address space, but the saved ASID was for a channel
    initiator address space that had since ended. This ASID was
    later claimed by a different job. The QMGR checks if the CHINIT
    address space is connected in various ways, but one of the
    first checks is that the ASCB associated with the ASID saved in
    the MGBL has the correct CHINIT jobname. If this check fails,
    then the QMGR assumes that the CHINIT isn't started.
    Later attempts to start the CHINIT can result in new CHINIT
    instances starting, because the QMGR does not consider the
    CHINIT to be started. These instances subsequently end because
    they fail to connect to the QMGR, with message CSQX007E.
    -
    This APAR investigates improving the serialisation between
    multiple CHINIT instances that are starting, to guard against
    this inconsistent state.
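
The sequence above can be replayed with a small toy model. This is only an illustrative sketch: the class, its fields, the ASID values (0x41, 0x52) and the job names are hypothetical stand-ins, not the real MGBL layout or any MQ interface. It assumes that the second START CHINIT command overwrites the saved ASID after the first instance has connected, as the description states.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

MQRC_DUPLICATE_RECOV_COORD = 2163  # reason code reported with CSQX007E


@dataclass
class ToyQmgr:
    """Hypothetical model of the QMGR state discussed in this APAR."""
    chinit_jobname: str
    saved_asid: Optional[int] = None  # models the CHINIT ASID saved in the MGBL
    xgwa_owner: Optional[int] = None  # models which address space owns the xGwa
    ascb_jobname: Dict[int, str] = field(default_factory=dict)  # ASID -> jobname

    def start_address_space(self, asid: int, jobname: str) -> None:
        self.ascb_jobname[asid] = jobname

    def end_address_space(self, asid: int) -> None:
        self.ascb_jobname.pop(asid, None)

    def command_saves_asid(self, asid: int) -> None:
        # Pre-fix behaviour (assumed): the command saves the ASID unconditionally.
        self.saved_asid = asid

    def connect(self, asid: int) -> int:
        # Only one CHINIT may be connected at a time.
        if self.xgwa_owner is not None:
            return MQRC_DUPLICATE_RECOV_COORD
        self.xgwa_owner = asid
        self.saved_asid = asid
        return 0

    def chinit_appears_started(self) -> bool:
        # First check described above: the jobname found via the saved ASID.
        return self.ascb_jobname.get(self.saved_asid) == self.chinit_jobname


def replay_sequence():
    qmgr = ToyQmgr(chinit_jobname="MQ01CHIN")
    qmgr.start_address_space(0x41, "MQ01CHIN")  # first START CHINIT (CSQINP2)
    qmgr.start_address_space(0x52, "MQ01CHIN")  # second START CHINIT (automation)
    rc_first = qmgr.connect(0x41)               # first instance connects
    qmgr.command_saves_asid(0x52)               # second command overwrites the ASID
    rc_second = qmgr.connect(0x52)              # second instance is rejected
    qmgr.end_address_space(0x52)                # second instance ends
    qmgr.start_address_space(0x52, "OTHERJOB")  # ASID later claimed by another job
    return qmgr, rc_first, rc_second
```

After the replay, the model's xGwa owner is still the first address space, but the saved ASID refers to an unrelated job, so `chinit_appears_started()` reports that the CHINIT is not started even though it is connected.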
    

Local fix

  • Allow enough time between START CHINIT commands for the channel
    initiator address space to finish initialising before another
    START CHINIT command is issued (either manually or via
    automation).
    

Problem summary

  • ****************************************************************
    * USERS AFFECTED: All users of IBM MQ for z/OS Version 9       *
    *                 Release 1 Modification 0 and Release 2       *
    *                 Modification 0.                              *
    ****************************************************************
    * PROBLEM DESCRIPTION: Message "CSQM131I +xxxx CSQMDCST        *
    *                      CHANNEL INITIATOR NOT ACTIVE, CLUSTER   *
    *                      AND CHANNEL COMMANDS INHIBITED" is      *
    *                      issued when the channel initiator is    *
    *                      active.                                 *
    ****************************************************************
    If multiple attempts to start the channel initiator occur during
    a short interval, a timing window exists where multiple
    xxxxCHIN address spaces are started.
    One of these will successfully connect to the queue manager, and
    the others will fail after issuing
    "CSQX007E  +xxxx CSQXADPI Unable to connect to queue manager
     xxxx MQCC=2 MQRC=2163 (MQRC_DUPLICATE_RECOV_COORD)".
    Depending on which instance connects, and when the associated
    START CHINIT command ran, it is possible for the wrong ASID to
    be stored in the MGBL.
    When subsequent commands (e.g. DIS CHSTATUS, STOP CHINIT)
    check whether the channel initiator is running, this incorrect
    ASID value causes the command to determine, wrongly, that the
    channel initiator is not running, which prevents the command
    from being executed.
    

Problem conclusion

  • Channel Initiator startup processing is changed to ensure that
    the ASID is stored only for a Channel Initiator address space
    that has successfully connected to the queue manager.
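
The APAR does not describe how the change is implemented. As a rough illustration of the stated outcome only, the toy model below records the ASID solely inside a successful connect, under a lock. All names (`ToyQmgrFixed`, `process_start_command`, `connect`) and the ASID values are hypothetical and do not represent MQ internals.

```python
import threading
from typing import Optional

MQRC_DUPLICATE_RECOV_COORD = 2163  # reason code reported with CSQX007E


class ToyQmgrFixed:
    """Hypothetical model: the ASID is saved only on a successful connect."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self.saved_asid: Optional[int] = None
        self.xgwa_owner: Optional[int] = None

    def process_start_command(self, asid: int) -> None:
        # Starting an address space deliberately leaves saved_asid untouched.
        pass

    def connect(self, asid: int) -> int:
        # Check-and-set is done under one lock so both fields stay consistent.
        with self._lock:
            if self.xgwa_owner is not None:
                return MQRC_DUPLICATE_RECOV_COORD
            self.xgwa_owner = asid
            self.saved_asid = asid
            return 0
```

In this model a second START CHINIT that loses the race cannot disturb the saved ASID, because only the instance that connects ever writes it.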
    

Temporary fix

Comments

APAR Information

  • APAR number

    PH37290

  • Reported component name

    IBM MQ Z/OS V9

  • Reported component ID

    5655MQ900

  • Reported release

    100

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    YesHIPER

  • Special Attention

    NoSpecatt / Xsystem

  • Submitted date

    2021-05-17

  • Closed date

    2022-02-16

  • Last modified date

    2022-04-01

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

    UI79379 UI79380

Modules/Macros

  • CSQMCCHT CSQMSCHI
    

Fix information

  • Fixed component name

    IBM MQ Z/OS V9

  • Fixed component ID

    5655MQ900

Applicable component levels

  • R100 PSY UI79380

       UP22/03/11 P F203

  • R200 PSY UI79379

       UP22/03/11 P F203

Fix is available

  • Select the PTF appropriate for your component level. You will be required to sign in. Distribution on physical media is not available in all countries.

Document Information

Modified date:
02 April 2022