IBM Support

PI21198: WMQ Z/OS LOG ARCHIVE HANGS 15/05/14 PTF PECHANGE

A fix is available

APAR status

  • Closed as program error.

Error description

  • In a test that added new logs up to the limit of 31 log copies,
    queue manager CSQ1 had 7 dual LOGCOPY data sets, and logs 8 to
    32 were added. On the 33rd log, the expected error was
    displayed:
    CSQJ143I $CSQ1 BSDS active log data set record is full.
    Several manual ARCHIVE LOG commands were then entered to ensure
    that the new logs were being used. Four of the new logs were
    archived successfully. For the rest of the logs, MQ appears to
    write the logs to tape, but then does not release the drive or
    catalogue the data set.
    As a result, this MQ subsystem allocates all of the tape
    drives, writes the logs, and causes the log archive process to
    hang. The queue manager could not be shut down with the STOP
    QMGR command and had to be cancelled. After MQ is restarted, it
    does exactly the same thing: it allocates the tape drives,
    writes the logs, but does not terminate the log archive.
    *
    Additional Symptom(s) Search Keyword(s):
    If a SET ARCHIVE command is issued at the same time as an
    offload is in progress, the following sequence may occur:
    .
    - The first log archive completes normally.
    .
    - The next log archive fails:
    .
    CSQJ073E LOG ARCHIVE UNIT ALLOCATION FAILED, REASON CODE=0008.
    ALLOCATION OR OFFLOAD OF ARCHIVE LOG DATA SET MAY FAIL
    .
    CSQJ103E CSQJDS01 LOG ALLOCATION ERROR
    DSNAME=<garbage name>
    .
    CSQJ115E OFFLOAD FAILED, COULD NOT ALLOCATE AN ARCHIVE DATA SET
    .
    CSQJ139I LOG OFFLOAD TASK ENDED
    .
    CSQV086E QUEUE MANAGER ABNORMAL TERMINATION REASON=00D10251
    .
    IEA794I SVC DUMP HAS CAPTURED:
    DUMPID=nnn REQUESTED BY JOB (ssidMSTR)
    DUMP TITLE=ssid,ABN=5C6-00D10269,U=SYSOPR  ,C=R3600.710.RLMC-BF
                R-WRT ,M=CSQJW008,LOC=CSQJL002.CSQJW107+0000217E
    .
    <garbage name> in the CSQJ103E message starts with the value of
    the UNIT parameter from CSQ6ARVP with the first letter stripped
    off, and continues with the ARCPFX1 value and other parameters
    from CSQ6ARVP.
    .
    - When the support center reviews the dump, the copy of the
    CSQ6ARVP parameters pointed to by LMBNARVP looks corrupted or
    overlaid.  The control block appears to be shifted right by two
    words, which in the reported case contained zeroes.
    What should start with 005D011A  C1D9E5D7, e.g.
     005D011A  C1D9E5D7  00007000  000010E0  | .)..ARVP.......\ |
    instead is:
     00000000  00000000  005D011A  C1D9E5D7  | .........)..ARVP |
    .
    When UI19842 caused an offset change of 8 bytes to CSQDDPRM,
    CSQJOFF1 was not recompiled to pick up this change.  It
    therefore copies ARVP information from the wrong offset within
    the DPRM to the address pointed to by LMBNARVP.
    .
    A recycle of the queue manager will clear this symptom.
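
The two-word shift described above can be reproduced with a small sketch. The layout here is hypothetical and is not the real CSQDDPRM mapping: it only assumes that 8 zero bytes were added ahead of the ARVP block and that one copy routine still uses the old offset. The names OLD_ARVP_OFFSET, NEW_ARVP_OFFSET, build_dprm and copy_arvp are invented for illustration.

```python
# Illustrative only: this layout is hypothetical, not the real CSQDDPRM mapping.
ARVP = bytes.fromhex("005D011A" "C1D9E5D7" "00007000" "000010E0")

OLD_ARVP_OFFSET = 16                    # assumed offset before the layout change
NEW_ARVP_OFFSET = OLD_ARVP_OFFSET + 8   # the change moved the block 8 bytes further in


def build_dprm() -> bytes:
    """Build a stand-in parameter area that uses the new layout."""
    header = bytes(range(1, OLD_ARVP_OFFSET + 1))  # arbitrary non-zero filler
    new_field = bytes(8)                           # 8 added bytes, zero in the reported case
    return header + new_field + ARVP


def copy_arvp(dprm: bytes, offset: int) -> bytes:
    """Copy len(ARVP) bytes starting at the given offset."""
    return dprm[offset:offset + len(ARVP)]


def words(block: bytes) -> str:
    """Format a block as 4-byte hex words, like one line of a dump."""
    return "  ".join(block[i:i + 4].hex().upper() for i in range(0, len(block), 4))


dprm = build_dprm()
good = copy_arvp(dprm, NEW_ARVP_OFFSET)   # routine rebuilt against the new layout
stale = copy_arvp(dprm, OLD_ARVP_OFFSET)  # routine still using the old offset

print(words(good))                 # 005D011A  C1D9E5D7  00007000  000010E0
print(words(stale))                # 00000000  00000000  005D011A  C1D9E5D7
print(good[4:8].decode("cp037"))   # ARVP (the eyecatcher is EBCDIC)
```

The stale copy starts 8 bytes too early, so the eyecatcher appears in the third word instead of the second, which matches the pattern seen in the dump.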
    

Local fix

  • N/A
    

Problem summary

  • ****************************************************************
    * USERS AFFECTED: All users of WebSphere MQ for z/OS Version 7 *
    *                 Release 1 Modification 0.                    *
    ****************************************************************
    * PROBLEM DESCRIPTION: Queue manager hangs when it archives a  *
    *                      log to tape while all tape units are    *
    *                      in use. The queue manager does not      *
    *                      recover from the hang when the offload  *
    *                      tasks have completed.                   *
    ****************************************************************
    * RECOMMENDATION:                                              *
    ****************************************************************
    The offload service task, CSQJOFF1, calls CSQJDS01 to issue SVC
    99 to allocate a data set on a tape unit. However, all tape
    units are in use, so it waits. When the CSQJOFF7 offload tasks
    for previous archives complete, they post LOAEECB to resume
    CSQJOFF1, but CSQJOFF1 cannot resume because it is still
    waiting in CSQJDS01. The result is a deadlock (deadly embrace).
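
The deadly embrace can be pictured as a cycle in a wait-for graph. The sketch below is a simplified model, not the queue manager's logic. It assumes that a tape unit stays held until CSQJOFF1 has acted on the LOAEECB post, which is inferred from the symptom that drives are not released. The node names "tape unit" and "CSQJOFF7 cleanup" and the function find_cycle are invented for illustration.

```python
# Hypothetical wait-for graph: each key waits on the parties in its list.
WAITS_FOR = {
    "CSQJOFF1": ["tape unit"],          # blocked in CSQJDS01 on the SVC 99 allocation
    "tape unit": ["CSQJOFF7 cleanup"],  # assumed: the drive stays held until cleanup runs
    "CSQJOFF7 cleanup": ["CSQJOFF1"],   # LOAEECB is posted, but CSQJOFF1 must resume to act
}


def find_cycle(graph, start):
    """Follow wait-for edges depth-first; return a cycle through `start`, or None."""
    path = [start]

    def visit(node):
        for nxt in graph.get(node, []):
            if nxt == start:
                return path + [start]
            if nxt not in path:
                path.append(nxt)
                found = visit(nxt)
                if found:
                    return found
                path.pop()
        return None

    return visit(start)


cycle = find_cycle(WAITS_FOR, "CSQJOFF1")
print(" -> ".join(cycle))
# CSQJOFF1 -> tape unit -> CSQJOFF7 cleanup -> CSQJOFF1
```

If a tape unit is free, CSQJOFF1 has nothing to wait on, the graph has no edge out of it, and find_cycle returns None.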
    

Problem conclusion

  • A new CSQZPARM parameter, MAXCNOFF, has been introduced to limit
    the number of CSQJOFF7 offload tasks that can be run in
    parallel. This will allow a queue manager or queue managers to
    be tuned such that they will not use all the available tape
    units. Instead the queue manager will wait until a CSQJOFF7
    offload task has completed before trying to allocate any new
    archive datasets.
    100Y
    CMQCFA
    CMQCFC
    CMQCFP
    CSQJC00A
    CSQJOFF1
    CSQJOFF2
    CSQJOFF6
    CSQJS001
    CSQZMSTR
    CSQ6LOGP
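
The effect of capping parallel offload tasks, as MAXCNOFF does, can be sketched with a toy controller loop. This is a model under stated assumptions, not the product's implementation: each offload task holds one tape unit, and a unit is only freed when the controller handles that task's completion. The function run_offloads and its max_parallel argument are hypothetical stand-ins for the MAXCNOFF behaviour.

```python
from typing import Optional


def run_offloads(logs: int, tape_units: int, max_parallel: Optional[int]) -> str:
    """Step a toy controller through `logs` archive requests.

    Assumed model: each offload task holds one tape unit, and the unit is
    only freed when the controller handles that task's completion post.
    max_parallel=None means no cap on parallel offload tasks.
    """
    if max_parallel is not None and max_parallel < 1:
        raise ValueError("max_parallel must be at least 1")

    pending, running, archived = logs, 0, 0
    free_units = tape_units
    while pending or running:
        at_cap = max_parallel is not None and running >= max_parallel
        if pending and not at_cap:
            if free_units == 0:
                # The controller is now stuck inside allocation and can never
                # handle completion posts, so no unit is ever freed.
                return f"hang after {archived} archived, {running} tasks holding units"
            free_units -= 1
            running += 1
            pending -= 1
        else:
            # The controller waits for a completion post and handles it,
            # which releases the tape unit.
            running -= 1
            free_units += 1
            archived += 1
    return f"completed {archived} archives"


print(run_offloads(5, 3, None))  # hang after 0 archived, 3 tasks holding units
print(run_offloads(5, 3, 2))     # completed 5 archives
print(run_offloads(5, 3, 4))     # hang after 0 archived, 3 tasks holding units
```

In this model the cap only helps when it is no larger than the number of tape units the queue manager can obtain, which is why the third call still hangs.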
    

Temporary fix

  • *********
    * HIPER *
    *********
    

Comments

APAR Information

  • APAR number

    PI21198

  • Reported component name

    WMQ Z/OS V7

  • Reported component ID

    5655R3600

  • Reported release

    100

  • Status

    CLOSED PER

  • PE

    YesPE

  • HIPER

    YesHIPER

  • Special Attention

    NoSpecatt / Xsystem

  • Submitted date

    2014-07-01

  • Closed date

    2014-11-11

  • Last modified date

    2015-05-14

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

    PI27066 PI28014 UI22931

Modules/Macros

  • CMQCFA   CMQCFC   CMQCFP   CSQJC00A CSQJOFF1 CSQJOFF2 CSQJOFF6
    CSQJS001 CSQZMSTR CSQ6LOGP
    

Fix information

  • Fixed component name

    WMQ Z/OS V7

  • Fixed component ID

    5655R3600

Applicable component levels

  • R100 PSY UI22931

       UP14/12/06 P F412

Document Information

Modified date:
14 May 2015