IV25030: WEBSPHERE MQ CLUSTER FAILS, POSSIBLY GENERATING FDC FILES WITH PROBE IDS RM296000 OR OTHER rrcE_REPOSITORY_ERROR
Fixes are available
WebSphere MQ V6.0 Fix Pack 22.214.171.124
WebSphere MQ V6.0 for iSeries Fix Pack 126.96.36.199
WebSphere MQ V7.5 Fix Pack 188.8.131.52
WebSphere MQ V7.0 Fix Pack 184.108.40.206
WebSphere MQ V7.0.1 for i5/OS Fix Pack 220.127.116.11
Fix Pack 18.104.22.168 for WebSphere MQ V7.1
WebSphere MQ 6.0 for HP OpenVMS Alpha and Itanium - Fix Pack 22.214.171.124
Closed as program error.
WebSphere MQ cluster stops working. FDCs may include, but are not limited to (see problem summary), error code rrcE_REPOSITORY_ERROR and Probe ID RM296000. The repository manager process amqrrmfa ends abruptly. Other symptoms include a corrupted cluster name on WMQ cluster objects, for example: CLUSTER( ￗ￢￱@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@). If amqrfdm output is viewed in binary mode and the first bytes of the CLUSTER name are the hex values of EBCDIC characters, the cluster name is not being converted properly. This issue may be related to APAR PM84108 on the z/OS platform.
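The EBCDIC check described above can be approximated with a small heuristic. The sketch below is illustrative only: the function name, the choice of code page cp037, and the way the raw name bytes are obtained from the amqrfdm output are all assumptions, not part of the APAR. It relies on the fact that the EBCDIC blank is byte 0x40, which displays as '@' when read as ASCII, consistent with the run of '@' characters in the example.

```python
import string

# Characters permitted in MQ object names: letters, digits, and . / _ %
MQ_NAME_CHARS = set(string.ascii_letters + string.digits + "./_%")


def looks_like_ebcdic_name(raw: bytes, codec: str = "cp037") -> bool:
    """Heuristic: True if `raw` is not a valid ASCII MQ name but becomes
    one when decoded as EBCDIC. `codec` (cp037) is an assumed code page."""
    if not raw:
        return False

    # Read as single-byte text and drop blank/NUL padding.
    ascii_text = raw.decode("latin-1").rstrip(" \x00")
    if ascii_text and all(c in MQ_NAME_CHARS for c in ascii_text):
        return False  # already a plausible, correctly converted name

    # Decode as EBCDIC; the EBCDIC blank (0x40) becomes an ordinary space.
    ebcdic_text = raw.decode(codec).rstrip(" \x00")
    return bool(ebcdic_text) and all(c in MQ_NAME_CHARS for c in ebcdic_text)
```

For example, a name field holding the bytes D7 E2 F1 followed by 0x40 padding would be flagged, while the same name stored in ASCII and padded with blanks would not. A positive result only suggests a conversion problem; it does not confirm that this APAR is the cause.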
****************************************************************
USERS AFFECTED:
All users of WebSphere MQ clusters are potentially affected by this error. The issue is significantly more likely to occur in clusters with very large numbers of queue managers or objects (queues and topics), or where object definitions are modified frequently.

Platforms affected:
All Distributed (iSeries, all Unix and Windows)
****************************************************************
PROBLEM SUMMARY:
When objects in the cluster repository cache are modified (for example, changing an attribute on a cluster queue), the details for that object are republished to the cluster. Previous records for the object may persist for some time in the cluster cache, so that applications currently using them (for instance, those that have opened the queue for output) can continue processing without interruption.

Periodically, the repository process attempts to 'garbage collect' these older records, checking whether they are still in use. Where multiple such records exist for a particular cluster queue manager object (the record in the cache which stores information about the channel definition to reach a remote queue manager), and these are held in use for a prolonged period, an error in the logic means that the storage for parts of these queue manager records can be reused (for example, overwritten to hold another object) while it is still required.

This can lead to a variety of errors depending on the precise nature of the reuse, ranging from no external symptom to complete failure of the cluster repository process. Because this affects only the cluster repository cache, message data is unlikely to be lost, but corrupted records may lead to MQ API calls failing (for example with MQRC_CLUSTER_RESOLUTION_ERROR), messages may be put to the dead-letter queue, or channels may have to stop processing when a message cannot be correctly routed.
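The rule that the garbage collection is meant to enforce can be shown with a toy model. This is a sketch under stated assumptions, not the WebSphere MQ implementation: the `QmgrRecord` class, its fields, and the `collect` function are hypothetical names invented for illustration. The point it demonstrates is that a superseded record's chained storage is released only once no handles remain on it.

```python
from dataclasses import dataclass, field


@dataclass
class QmgrRecord:
    """Hypothetical stand-in for one version of a cluster queue manager record."""
    version: int
    chained: list = field(default_factory=list)  # stands in for chained storage areas
    handles: int = 0                             # handles still held by applications
    freed: bool = False


def collect(records):
    """Return the records that must be kept; free superseded, unused ones."""
    if not records:
        return []
    newest = max(r.version for r in records)
    kept = []
    for r in records:
        if r.version < newest and r.handles == 0:
            # Safe to release: superseded and no application still holds it.
            r.chained.clear()
            r.freed = True
        else:
            # Either the current record, or an old one that is still in use.
            kept.append(r)
    return kept
```

In this model, an old record with `handles == 1` survives collection with its chained data intact. The defect described above corresponds to releasing such storage while a handle is still outstanding.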
The garbage collection logic in the cluster repository process is modified to correctly ensure that all handles on 'old' cluster queue manager records are released before freeing certain chained areas from the record.

Users should perform the following command on repositories where they see incorrect cluster data:

REFRESH CLUSTER(*) REPOS(YES)

---------------------------------------------------------------
The fix is targeted for delivery in the following PTFs:

                   v6.0
Platform           Fix Pack 126.96.36.199
--------           --------------------
Windows            U200331
AIX                U842074
HP-UX (PA-RISC)    U842208
HP-UX (Itanium)    U842213
Solaris (SPARC)    U842209
Solaris (x86-64)   U842216
iSeries            tbc_p600_0_2_12
Linux (x86)        U842210
Linux (x86-64)     U842215
Linux (zSeries)    U842211
Linux (Power)      U842212
Linux (s390x)      U842214

                   v7.0
Platform           Fix Pack 188.8.131.52
--------           --------------------
Windows            U200352
AIX                U853055
HP-UX (PA-RISC)    U853082
HP-UX (Itanium)    U853087
Solaris (SPARC)    U853083
Solaris (x86-64)   U853089
iSeries            184.108.40.206
Linux (x86)        U853084
Linux (x86-64)     U853088
Linux (zSeries)    U853085
Linux (Power)      U853086

                   v7.1
Platform           Fix Pack 220.127.116.11
--------           --------------------
Windows            18.104.22.168
AIX                22.214.171.124
HP-UX (Itanium)    126.96.36.199
Solaris (SPARC)    188.8.131.52
Solaris (x86-64)   184.108.40.206
iSeries            220.127.116.11
Linux (x86)        18.104.22.168
Linux (x86-64)     22.214.171.124
Linux (zSeries)    126.96.36.199
Linux (Power)      188.8.131.52

Platform           v7.5
--------           --------------------
Multiplatforms     184.108.40.206

The latest available maintenance can be obtained from 'WebSphere MQ Recommended Fixes'
http://www-1.ibm.com/support/docview.wss?rs=171&uid=swg27006037

If the maintenance level is not yet available, information on its planned availability can be found in 'WebSphere MQ Planned Maintenance Release Dates'
http://www-1.ibm.com/support/docview.wss?rs=171&uid=swg27006309
---------------------------------------------------------------
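The REFRESH CLUSTER command given in the problem conclusion can be entered interactively in runmqsc. Where it needs to be scripted, one possible approach is to pipe the MQSC text to runmqsc on standard input. The following is a minimal sketch: the helper names and the queue manager name in the usage note are hypothetical, and it assumes runmqsc is on the PATH and that the caller is authorized to administer the queue manager.

```python
import subprocess


def build_refresh_command(cluster: str = "*", repos: bool = True) -> str:
    """Build the MQSC text for the refresh; defaults match the APAR's command."""
    return f"REFRESH CLUSTER({cluster}) REPOS({'YES' if repos else 'NO'})"


def run_mqsc(qmgr: str, mqsc: str) -> subprocess.CompletedProcess:
    """Send one MQSC command to `runmqsc <qmgr>` on standard input.

    Assumes runmqsc is installed and on the PATH (not verified here).
    """
    return subprocess.run(
        ["runmqsc", qmgr],
        input=mqsc + "\n",
        text=True,
        capture_output=True,
        check=False,  # the caller inspects returncode and stdout
    )
```

A call such as `run_mqsc("QM1", build_refresh_command())` would send the command to a queue manager named QM1 (a placeholder). Check the returned `stdout` and `returncode` before assuming the refresh was accepted, and review the product documentation for REFRESH CLUSTER before running it in a large cluster.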
Reported component name
WMQ LIN ZSERIEX
Reported component ID
Last modified date
APAR is sysrouted FROM one or more of the following:
APAR is sysrouted TO one or more of the following:
Fixed component name
WMQ LIN ZSERIEX
Fixed component ID
Applicable component levels