IBM Support

PM86913: Z/OS WMQ V7.1, MQCLOSE IS HANGING WHEN THE STRUCTURE HAS A FAILED STATUS.

A fix is available


APAR status

  • Closed as program error.

Error description

  • MQCLOSE hangs when the coupling facility structure has a
    failed status.
    
    The hang occurs because tens of EBs (threads) are waiting
    to acquire a latch.
    .
    Batch Job XXXXXXX has initiated the close of shared queue
    ANY.CFS.SHARE.QUEUE.
    CSQMCLOS calls CSQMCLS2, which acquires an exclusive latch
    (DMCSEGAL) on the queue object.
    Control passes to CSQVEUS1, which creates another thread to
    run CSQECLOS to close and disconnect from CF structure
    WXYZ (which is in FAIL status).
    CSQVEUS1 suspends and waits for CSQECLOS to return, but
    CSQECLOS never returns, so CSQVEUS1 hangs.
    .
    As part of MQCLOSE processing, CSQEOCRQ queues a work request
    to the CF structure task to close the shared queue and
    waits for the close to complete.
    The task hangs. The structure failure event is being processed
    in CSQESTE/CSQESTFA/CSQIORF1/CSQILOC2, but CSQILOC2 has
    requested the DMCTREE latch, which is held by the
    RESET QSTATS command.
    .
    This is a deadly embrace: the RESET QSTATS command holds
    latch DMCTREE and is requesting latch DMCSEGAL for queue
    object ANY.CFS.SHARE.QUEUE, which is held by the MQCLOSE.
    .
    The hung EBs (threads) waiting to acquire a latch could
    cause a storage shortage.
    .
    Additional keywords:
    GRS wait for SYSZCSQE CSQERECOVER CF STRUCTURE TASK ENQMQGT
    

Local fix

  • n/a
    

Problem summary

  • ****************************************************************
    * USERS AFFECTED: All users of WebSphere MQ for z/OS Version 7 *
    *                 Release 1 Modification 0.                    *
    ****************************************************************
    * PROBLEM DESCRIPTION: A coupling facility structure failure   *
    *                      during the processing of a RESET QSTATS *
    *                      command may result in a deadly embrace  *
    *                      scenario. This results in the queue     *
    *                      manager becoming unresponsive.          *
    ****************************************************************
    * RECOMMENDATION:                                              *
    ****************************************************************
    If a coupling facility structure associated with a shared queue
    fails while the RESET QSTATS command is being processed, the
    queue manager may become unresponsive. The command and the
    structure failure processing both attempt to obtain latches on
    the same resources. Under some timing conditions a deadly
    embrace occurs, in which each process waits for a resource held
    by the other.
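
The wait chain in the error description can be modelled as a small wait-for graph. The sketch below is illustrative only and is not MQ code: the latch and task names come from this APAR, while the dictionaries and the helpers build_wait_for and find_cycle are hypothetical. It assumes each waiter waits on exactly one thing.

```python
# Hypothetical model of the state described in this APAR (not MQ code).
# Which task currently holds each latch:
holders = {"DMCTREE": "RESET QSTATS", "DMCSEGAL": "MQCLOSE"}
# Which latch each blocked task is requesting:
latch_requests = {"RESET QSTATS": "DMCSEGAL", "CF structure task": "DMCTREE"}
# Tasks that wait directly on another task rather than on a latch:
task_waits = {"MQCLOSE": "CF structure task"}


def build_wait_for(holders, latch_requests, task_waits):
    """Return a map from each waiter to the task it is waiting on."""
    graph = dict(task_waits)
    for waiter, latch in latch_requests.items():
        graph[waiter] = holders[latch]
    return graph


def find_cycle(graph, start):
    """Follow the wait chain from start; return the cycle found, or None."""
    path = [start]
    seen = {start}
    node = start
    while node in graph:
        node = graph[node]
        if node in seen:
            return path[path.index(node):] + [node]
        path.append(node)
        seen.add(node)
    return None


graph = build_wait_for(holders, latch_requests, task_waits)
print(" -> ".join(find_cycle(graph, "RESET QSTATS")))
# prints: RESET QSTATS -> MQCLOSE -> CF structure task -> RESET QSTATS
```

A chain that returns to its starting task is the deadly embrace: no task in the cycle can proceed until another task in the same cycle releases what it holds.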
    

Problem conclusion

  • The RESET QSTATS command has been altered to output zero values
    for any shared queue that is currently involved with structure
    failure processing. This removes the indefinite wait for the
    resource in this scenario.
    100Y
    CSQM1RQS
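
The behaviour described in this conclusion can be sketched as follows. This is a minimal illustration under stated assumptions, not the CSQM1RQS implementation: the SharedQueue class, the structure_failing flag, the statistic names, and the use of threading.Lock as a stand-in for the queue latch are all hypothetical. Leaving the stored counters of a skipped queue untouched is also an assumption, since the APAR states only that zero values are output.

```python
import threading
from dataclasses import dataclass, field

# Hypothetical statistic names; the real command output may differ.
ZERO_STATS = {"high_depth": 0, "msgs_in": 0, "msgs_out": 0}


@dataclass
class SharedQueue:
    name: str
    stats: dict
    structure_failing: bool = False  # assumed flag set during failure processing
    latch: threading.Lock = field(default_factory=threading.Lock)


def reset_qstats(queues):
    """Report and reset statistics without waiting on queues in failure processing."""
    report = {}
    for q in queues:
        if q.structure_failing:
            # Do not request the queue latch; output zero values instead.
            report[q.name] = dict(ZERO_STATS)
            continue
        with q.latch:
            report[q.name] = dict(q.stats)
            q.stats = dict(ZERO_STATS)
    return report
```

Because the command never requests the latch of a queue whose structure is being failed, it cannot become the waiting half of the embrace, at the cost of reporting zeros for that queue.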
    

Temporary fix

  • *********
    * HIPER *
    *********
    

Comments

  • ***** PE13/10/01 PTF IN ERROR. SEE APAR PM98197 FOR DESCRIPTION
    

APAR Information

  • APAR number

    PM86913

  • Reported component name

    WMQ Z/OS V7

  • Reported component ID

    5655R3600

  • Reported release

    100

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    YesHIPER

  • Special Attention

    NoSpecatt / Xsystem

  • Submitted date

    2013-04-12

  • Closed date

    2013-06-25

  • Last modified date

    2013-10-03

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

    UK95383

Modules/Macros

  • CSQM1RQS
    

Fix information

  • Fixed component name

    WMQ Z/OS V7

  • Fixed component ID

    5655R3600

Applicable component levels

  • R100 PSY UK95383

       UP13/08/23 P F308

Fix is available

  • Select the PTF appropriate for your component level. You will be required to sign in. Distribution on physical media is not available in all countries.


Document Information

Modified date:
03 October 2013