IBM Support

PI36869: WMQ HUNG AND USING CONSIDERABLE AMOUNT CPU DUE TO ARCHIVE LOG FILLING UP WITH CSQJ003I +SSID FULL ARCHIVE LOG VOLUME

A fix is available

APAR status

  • Closed as program error.

Error description

  • The customer experienced an outage on a production
    WMQ region. WMQ was using a considerable amount of CPU
    and having problems with the archive log files.
    The MSTR log contained many
    CSQJ003I +SSID FULL ARCHIVE LOG VOLUME messages.
    There was a large number of expired messages on the
    queue. The client channels had issued MQGETs against
    the queue, and each MQGET was finding and deleting
    expired messages.
    The messages on the queue were persistent, so log
    records were written for the expired messages,
    which caused the active log data sets to fill up.
    An SVC dump was taken, which showed over 1 million messages
    on the queue, most of which had expired or were about
    to expire.
    To make matters worse, the queue manager had a relatively
    low value for LOGLOAD (10,000), which meant that the
    queue manager was checkpointing very frequently because
    of the log records written for the expired messages.
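The interaction between LOGLOAD and the expired messages can be estimated with simple arithmetic, because LOGLOAD is the number of log records written between one checkpoint and the next. The sketch below is illustrative only: it assumes one log record per expired persistent message (the real number per message is not stated in this APAR), and the 500,000 comparison value is a hypothetical higher setting, not a recommendation.

```python
def checkpoints_triggered(log_records: int, logload: int) -> int:
    """Return how many checkpoints are triggered by writing log_records,
    given one checkpoint for every logload log records."""
    if logload <= 0:
        raise ValueError("logload must be positive")
    if log_records < 0:
        raise ValueError("log_records must be non-negative")
    return log_records // logload


expired_messages = 1_000_000   # "over 1 million messages" in the dump
records_per_message = 1        # ASSUMPTION: not stated in the APAR

total_records = expired_messages * records_per_message

# LOGLOAD value reported for the affected queue manager.
print(checkpoints_triggered(total_records, 10_000))    # 100
# Hypothetical higher LOGLOAD, for comparison only.
print(checkpoints_triggered(total_records, 500_000))   # 2
```

Under these assumptions, the expiry processing alone accounts for about 100 checkpoints at LOGLOAD 10,000, before any extra records written by shunting are counted.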
    .
    Additional Symptom(s) Search Keyword(s):
    

Local fix

Problem summary

  • ****************************************************************
    * USERS AFFECTED: All users of WebSphere MQ for z/OS Version 8 *
    *                 Release 0 Modification 0.                    *
    ****************************************************************
    * PROBLEM DESCRIPTION: Shunted units of work contain           *
    *                      unnecessary log records, leading to an  *
    *                      increase in log usage and checkpoint    *
    *                      frequency.                              *
    *                      In rare cases, if LOGLOAD is set to a   *
    *                      low value, the additional logging for   *
    *                      large units of work requiring shunting  *
    *                      can trigger further shunting, resulting *
    *                      in the queue manager being unable to    *
    *                      perform further non-shunt related       *
    *                      logging.                                *
    ****************************************************************
    * RECOMMENDATION:                                              *
    ****************************************************************
    While processing long-running units of work, CSQRSHUN
    'shunts' the log records associated with those units of work to
    the current active log, so that if undo processing is required,
    it does not need to go back to the original archived log
    records.
    However, there are situations where log records in a unit of
    work are not required for undo processing, for example, when
    out-of-syncpoint work has taken place on the same task, or
    when expired messages are identified during browse or get
    processing. In these cases a compensating log record is
    written to identify which records are not needed for undo
    processing, but these records are still shunted by
    CSQRSHUN.
    These records increase the number of records that are written
    each time the unit of work is shunted. In some cases, where
    the number of these records is very high (for example, when
    getting from a queue with a large number of expired messages),
    the amount of log space required to shunt these records leads
    to an increase in checkpoint frequency and performance
    degradation.
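The way shunting can crowd out other logging when LOGLOAD is low can be illustrated with a toy model. This is a simplification, not a description of CSQRSHUN internals: it assumes the unit of work is re-shunted once per checkpoint interval and that every shunted record counts toward LOGLOAD. The function name and the figure of 50 records still needed for undo are hypothetical.

```python
def shunt_log_budget(logload: int, shunted_records: int) -> int:
    """Log records left for ordinary (non-shunt) work in one checkpoint interval.

    Toy model: assumes the unit of work is re-shunted once per checkpoint
    interval and that every shunted record counts toward LOGLOAD.
    """
    if logload <= 0 or shunted_records < 0:
        raise ValueError("logload must be positive, shunted_records non-negative")
    return max(0, logload - shunted_records)


total_records = 1_000_000  # records in the unit of work, mostly expired-message records
needed_for_undo = 50       # ASSUMPTION: illustrative figure only

for logload in (10_000, 2_000_000):
    shunt_all = shunt_log_budget(logload, total_records)
    shunt_needed = shunt_log_budget(logload, needed_for_undo)
    print(f"LOGLOAD={logload:,}: shunt all -> {shunt_all:,} left, "
          f"shunt needed only -> {shunt_needed:,} left")
```

In this model, a budget of zero corresponds to the case described above, where shunting consumes the whole checkpoint interval and leaves no room for non-shunt logging.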
    

Problem conclusion

  • CSQRSHUN is changed to no longer shunt log records for which a
    compensating log record exists, and which consequently are not
    needed for undo processing.
    000Y
    CSQRSHUN
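The changed behavior can be pictured as a filter applied before shunting. The following is a minimal sketch under stated assumptions, not the CSQRSHUN implementation: the LogRecord type, its fields, and the select_records_to_shunt function are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class LogRecord:
    sequence: int      # hypothetical position of the record within the unit of work
    compensated: bool  # True if a compensating log record exists for this record


def select_records_to_shunt(records: Iterable[LogRecord]) -> List[LogRecord]:
    """Keep only records still needed for undo processing.

    Records for which a compensating log record exists are skipped,
    mirroring the change described in the problem conclusion.
    """
    return [r for r in records if not r.compensated]


# Hypothetical unit of work: 1,000 records, 9 out of every 10 compensated
# (for example, records written for expired messages).
uow = [LogRecord(sequence=i, compensated=(i % 10 != 0)) for i in range(1_000)]
kept = select_records_to_shunt(uow)
print(len(uow), len(kept))  # 1000 100
```

In this example only 100 of the 1,000 records would be rewritten on each shunt, which is the kind of reduction in log usage the fix is intended to provide.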
    

Temporary fix

  • *********
    * HIPER *
    *********
    

Comments

APAR Information

  • APAR number

    PI36869

  • Reported component name

    WMQ Z/OS 8

  • Reported component ID

    5655W9700

  • Reported release

    000

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    YesHIPER

  • Special Attention

    NoSpecatt / Xsystem

  • Submitted date

    2015-03-12

  • Closed date

    2015-03-20

  • Last modified date

    2015-05-04

  • APAR is sysrouted FROM one or more of the following:

    PI28301

  • APAR is sysrouted TO one or more of the following:

    UI26136

Modules/Macros

  • CSQRSHUN
    

Fix information

  • Fixed component name

    WMQ Z/OS 8

  • Fixed component ID

    5655W9700

Applicable component levels

  • R000 PSY UI26136

       UP15/04/16 P F504

Fix is available

  • Select the PTF appropriate for your component level. You will be required to sign in. Distribution on physical media is not available in all countries.

Document Information

Modified date:
04 May 2015