IBM Support

IT15737: IBM MQ APPLIANCE INTERMITTENTLY REBOOTS WITH NO ERRORS REPORTED

APAR status

  • Closed as program error.

Error description

  • Some users deploying an MQ Appliance have reported
    intermittent reboots of a running appliance.
    No errors are reported in the system logs when the appliance
    restarts.
    Queue managers on the affected appliance that are part of an HA
    pair fail over to the standby instance as expected during the
    shutdown phase of the reboot.
    

Local fix

  • The watchdog timer can be disabled by running a remote ipmitool
    request to the affected appliance:
    
    ipmitool -L operator -I lanplus -H <ipmi-channel-IP> -U
    <ipmi_user> -P <ipmi_password> mc watchdog off
    
    
    When run successfully, the command output will report:
    Watchdog Timer Shutoff successful -- timer stopped
    
    
    Please note that if the watchdog timer is disabled, then the IBM
    MQ Appliance firmware may falsely report an intrusion detection
    warning if the connection is lost between the low-level firmware
    and the rest of the system.
    
    
    
    The watchdog timer can be re-enabled by running the following
    remote command:
    
    ipmitool -L operator -I lanplus -H <ipmi-channel-IP> -U
    <ipmi_user> -P <ipmi_password> mc reset warm
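
The disable step above can be scripted so that the success message is checked rather than read by eye. The following is a minimal sketch only: the function names and the IPMI_HOST, IPMI_USER and IPMI_PASSWORD environment variables are hypothetical and are not part of the APAR. The ipmitool arguments and the expected message are those quoted above.

```shell
#!/bin/sh
# Sketch only: helper names and environment variables are assumptions,
# not part of the IBM MQ Appliance or of this APAR.

# Returns 0 if the supplied ipmitool output contains the success
# message quoted in the local fix, 1 otherwise.
watchdog_off_succeeded() {
    case "$1" in
        *"Watchdog Timer Shutoff successful"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Sends the "mc watchdog off" request to the appliance's IPMI channel
# and verifies the reported result.
disable_watchdog() {
    output=$(ipmitool -L operator -I lanplus -H "$IPMI_HOST" \
        -U "$IPMI_USER" -P "$IPMI_PASSWORD" mc watchdog off 2>&1) || {
        echo "ipmitool failed: $output" >&2
        return 1
    }
    if watchdog_off_succeeded "$output"; then
        echo "Watchdog timer disabled on $IPMI_HOST"
    else
        echo "Unexpected ipmitool output: $output" >&2
        return 1
    fi
}

# Example invocation (not run automatically):
#   IPMI_HOST=<ipmi-channel-IP> IPMI_USER=<ipmi_user> \
#   IPMI_PASSWORD=<ipmi_password> disable_watchdog
```

Keeping the output check in its own function means it can be exercised without access to an appliance. Passing the password with -P exposes it in the process list, so on a shared host it may be preferable to use ipmitool's -E option, which reads the password from the IPMI_PASSWORD environment variable.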
    

Problem summary

  • ****************************************************************
    USERS AFFECTED:
    Users of the IBM MQ Appliance
    
    
    Platforms affected:
    MultiPlatform
    
    ****************************************************************
    PROBLEM DESCRIPTION:
    The IBM MQ Appliance low-level firmware makes use of a "watchdog
    timer" to monitor the state of the system, and restart the
    appliance if an unrecoverable system state is detected.
    
    An error in the low-level firmware meant that it was possible
    for the low-level firmware's watchdog timer to lose
    communication with the rest of the system, and incorrectly
    determine that the system had entered an unrecoverable state,
    triggering a reboot of the appliance.
    

Problem conclusion

  • The low-level firmware within the MQ appliance has been updated
    to prevent this loss of communication, in order to maintain
    normal operation of the watchdog timer.
    
    ---------------------------------------------------------------
    The fix is targeted for delivery in the following PTFs:
    
    Version    Maintenance Level
    v8.0       8.0.0.5
    
    The latest available maintenance can be obtained from
    'WebSphere MQ Recommended Fixes'
    http://www-1.ibm.com/support/docview.wss?rs=171&uid=swg27006037
    
    If the maintenance level is not yet available, information on
    its planned availability can be found in 'WebSphere MQ
    Planned Maintenance Release Dates'
    http://www-1.ibm.com/support/docview.wss?rs=171&uid=swg27006309
    ---------------------------------------------------------------
    

Temporary fix

Comments

APAR Information

  • APAR number

    IT15737

  • Reported component name

    IBM MQ APPL M20

  • Reported component ID

    5725S1400

  • Reported release

    800

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    YesHIPER

  • Special Attention

    NoSpecatt / Xsystem

  • Submitted date

    2016-06-15

  • Closed date

    2016-07-14

  • Last modified date

    2017-12-04

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

Fix information

  • Fixed component name

    IBM MQ APPL M20

  • Fixed component ID

    5725S1400

Applicable component levels

  • R800 PSY

       UP

Document Information

Modified date:
04 December 2017