IBM Support

PI05847: WMQ Z/OS: THE QUEUE MANAGER BECAME UNRESPONSIVE TO COMMANDS AFTER AN "EMPTY QUEUE" COMMAND FOR A SHARED QUEUE

A fix is available

APAR status

  • Closed as program error.

Error description

  • The queue manager became unresponsive to commands after a
    CSQUTIL EMPTY QUEUE for a shared queue holding large messages.
    
    Command processing was queued up behind commit processing for
    the emptied queue, which originally had thousands of messages.
    During commit processing, the BLOBs for those messages must be
    deleted from DB2, which was taking a long time to complete.
    
    The deletes of the BLOBs were performed while holding the
    ETHR.ethr_chain_latch latch of the ETHR control block
    associated with the empty. However, checkpoint processing was
    waiting for this latch while holding the EANC.Ethr_chain_latch
    latch. Other threads were suspended for this latch.
    
    In a dump, the internal %MQTHR clist showed that the command
    server task RTSSRV01 was waiting for a latch.  The owner of
    that BMXL1/CFMXL1 class 15 latch was the RCRSC
    checkpoint/restart thread. That thread in turn was waiting for
    a latch owned by a "spawned" thread associated with the CSQUTIL
    EMPTY batch job.
    
    The program stack for the CSQUTIL job was CSQMCCMT -> CSQRUC01
    -> CSQVEUS1 -> CSQSFBK.
    
    The program stack for the latch owner (the EB spawned by the
    CSQUTIL job) was CSQVEUS2 -> CSQRUCA3 -> CSQRUC02 -> CSQECMT2
    -> CSQESYNC -> CSQ5LMSG  (DB2 Manager BLOB services module).
    
    Trace entries for the spawned thread included CSQ5MOBJ entries
    called by CSQESYNC.
    
    
    Additional Symptom(s) Search Keyword(s):
    slow performance hang hung wait waiting
    CLEAR QLOCAL
    OFFLOAD SMDS
    BMXL1 CFMXL
    DB2SRVxx Fstatus2 = TASK INDB2
    

Local fix

Problem summary

  • ****************************************************************
    * USERS AFFECTED: All users of WebSphere MQ for z/OS Version 7 *
    *                 Release 1 Modification 0.                    *
    ****************************************************************
    * PROBLEM DESCRIPTION: When a CSQUTIL EMPTY command is used to *
    *                      empty a shared queue the queue manager  *
    *                      does not accept commands, queue manager *
    *                      appears to hang as no new threads can   *
    *                      be started.                             *
    ****************************************************************
    * RECOMMENDATION:                                              *
    ****************************************************************
    A customer issues a CSQUTIL EMPTY on a shared queue for which
    there are BLOBs in DB2. CSQESYNC holds the ETHR.THREAD_CHAIN
    latch while it looks through all the TROPs for the thread and
    deletes the BLOBs in Commit_one_TROP. If there are many BLOBs,
    the DB2 calls to delete them can take a significant length of
    time. During this time another thread, for example one taking
    a checkpoint during ARCHIVE LOG, can obtain the
    EANC.Ethr_chain_latch and then wait for the ETHR.THREAD_CHAIN
    latch, which is already held. All other commands to the queue
    manager then wait, and any new threads hang, until the empty
    command has completed.
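
The wait chain described above can be pictured as a small wait-for graph. The sketch below is illustrative only: the thread labels, the dictionaries and the root_blocker helper are hypothetical and are not MQ interfaces. It also assumes that the command server is suspended on the EANC latch, which the summary implies but does not state outright.

```python
# Hypothetical model of the wait chain; the names are labels, not MQ APIs.

# latch -> thread currently holding it
holders = {
    "ETHR.THREAD_CHAIN": "empty commit",          # CSQESYNC deleting BLOBs
    "EANC.Ethr_chain_latch": "checkpoint",        # checkpoint during ARCHIVE LOG
}

# thread -> latch it is suspended on
# (assumption: the command server waits on the EANC latch)
waiting_for = {
    "checkpoint": "ETHR.THREAD_CHAIN",
    "command server": "EANC.Ethr_chain_latch",
}


def root_blocker(thread):
    """Follow the wait-for chain and return (final holder, threads passed)."""
    path = []
    while thread in waiting_for:
        if thread in path:
            raise RuntimeError("cycle in wait-for graph: true deadlock")
        path.append(thread)
        thread = holders[waiting_for[thread]]
    return thread, path


blocker, chain = root_blocker("command server")
print(" -> ".join(chain + [blocker]))
```

In this model the chain ends at the empty commit, which is not itself waiting on a latch. That matches the behaviour reported here: a long stall that clears once the BLOB deletes finish, rather than a permanent deadlock.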
    

Problem conclusion

  • CSQESYNC no longer deletes the BLOBs under latch. Instead, it
    calls CSQETHDP to do this work after the latch on
    ETHR.THREAD_CHAIN has been released.
    100Y
    CSQESYNC
    CSQETHDP
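
The general shape of this change can be sketched as follows. This is a minimal illustration of deferring slow work until after a lock is released, not the actual CSQESYNC or CSQETHDP logic. The ThreadChain class, both commit functions and the delete_blob callback are hypothetical names, and threading.Lock stands in for the ETHR.THREAD_CHAIN latch.

```python
import threading


class ThreadChain:
    """Toy stand-in for a latched chain of pending operations (hypothetical)."""

    def __init__(self, blob_ids):
        self.latch = threading.Lock()
        self.pending = list(blob_ids)


def commit_under_latch(chain, delete_blob):
    """Before the fix: every slow delete runs while the latch is held."""
    with chain.latch:
        while chain.pending:
            delete_blob(chain.pending.pop())


def commit_deferred(chain, delete_blob):
    """After the fix: detach the work under the latch, delete after release."""
    with chain.latch:
        # Only a cheap list swap happens while the latch is held.
        to_delete, chain.pending = chain.pending, []
    for blob_id in to_delete:
        delete_blob(blob_id)


if __name__ == "__main__":
    chain = ThreadChain(range(3))
    # Report whether the latch is held at the moment each delete is issued.
    commit_deferred(chain, lambda b: print(b, "latch held:", chain.latch.locked()))
```

With the deferred form, the time the latch is held no longer grows with the cost of each delete, so a waiter such as a checkpoint is delayed only for the list swap.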
    

Temporary fix

  • *********
    * HIPER *
    *********
    

Comments

APAR Information

  • APAR number

    PI05847

  • Reported component name

    WMQ Z/OS V7

  • Reported component ID

    5655R3600

  • Reported release

    100

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    YesHIPER

  • Special Attention

    NoSpecatt / Xsystem

  • Submitted date

    2013-11-08

  • Closed date

    2014-03-18

  • Last modified date

    2014-05-02

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

    UI16127

Modules/Macros

  • CSQESYNC CSQETHDP
    

Fix information

  • Fixed component name

    WMQ Z/OS V7

  • Fixed component ID

    5655R3600

Applicable component levels

  • R100 PSY UI16127

       UP14/04/08 P F404

Fix is available

  • Select the PTF appropriate for your component level. You will be required to sign in. Distribution on physical media is not available in all countries.

Document Information

Modified date:
02 May 2014