IBM Support

PM74655: READ TIMEOUTS CAN CAUSE A CLASSCASTEXCEPTION WHEN USING IPC AS THE LOCAL CONNECTOR.

Fixes are available

8.5.0.2: WebSphere Application Server V8.5 Fix Pack 2
8.0.0.6: WebSphere Application Server V8.0 Fix Pack 6
7.0.0.29: WebSphere Application Server V7.0 Fix Pack 29
8.0.0.7: WebSphere Application Server V8.0 Fix Pack 7
8.0.0.8: WebSphere Application Server V8.0 Fix Pack 8
7.0.0.31: WebSphere Application Server V7.0 Fix Pack 31
7.0.0.33: WebSphere Application Server V7.0 Fix Pack 33
8.0.0.9: WebSphere Application Server V8.0 Fix Pack 9
7.0.0.35: WebSphere Application Server V7.0 Fix Pack 35
8.0.0.10: WebSphere Application Server V8.0 Fix Pack 10
7.0.0.37: WebSphere Application Server V7.0 Fix Pack 37
8.0.0.11: WebSphere Application Server V8.0 Fix Pack 11
7.0.0.39: WebSphere Application Server V7.0 Fix Pack 39
8.0.0.12: WebSphere Application Server V8.0 Fix Pack 12
7.0.0.41: WebSphere Application Server V7.0 Fix Pack 41
8.0.0.13: WebSphere Application Server V8.0 Fix Pack 13
7.0.0.43: WebSphere Application Server V7.0 Fix Pack 43
8.0.0.14: WebSphere Application Server V8.0 Fix Pack 14
7.0.0.45: WebSphere Application Server V7.0 Fix Pack 45
8.0.0.15: WebSphere Application Server V8.0 Fix Pack 15
7.0.0.29: Java SDK 1.6 SR13 FP2 Cumulative Fix for WebSphere Application Server
7.0.0.45: Java SDK 1.6 SR16 FP60 Cumulative Fix for WebSphere Application Server
7.0.0.31: Java SDK 1.6 SR15 Cumulative Fix for WebSphere Application Server
7.0.0.35: Java SDK 1.6 SR16 FP1 Cumulative Fix for WebSphere Application Server
7.0.0.37: Java SDK 1.6 SR16 FP3 Cumulative Fix for WebSphere Application Server
7.0.0.39: Java SDK 1.6 SR16 FP7 Cumulative Fix for WebSphere Application Server
7.0.0.41: Java SDK 1.6 SR16 FP20 Cumulative Fix for WebSphere Application Server
7.0.0.43: Java SDK 1.6 SR16 FP41 Cumulative Fix for WebSphere Application Server
Obtain the fix for this APAR.

APAR status

  • Closed as program error.

Error description

  • A request is sent from the DMGR to the local Node agent using
    the IPC connector.
    When the response is not received within the configured timeout
    (default 600 seconds), the READ times out and the DMGR
    closes the connection. The connection is put back into the pool
    and is eventually reused for a new connection.
    When the Node agent finally sends the response to the DMGR, it
    is put on the inbound message queue of the current connection,
    and the READ results in a ClassCastException.
    
    Here is an example flow:
    The DMGR sends a request to the Node agent (NA) at 09:50:50.361.
    Let's call this REQ1.
    After sending this request, the DMGR eventually timed out on a
    READ request while waiting for the response, as seen in this
    trace entry:
    
    Trace: 2012/10/05 10:00:50.366 01 t=8B0580 c=UNK key=S2 (13007002)
       ThreadId: 0000001b
       FunctionName: com.ibm.ws.management.connector.ipc.ClientAccessor
       SourceId: com.ibm.ws.management.connector.ipc.ClientAccessor
       Category: FINEST
       ExtendedMessage: Exception while receiving or de-serializing
       response; java.io.IOException: ZAioTCPChannel: sync read
       operation timed out.
        at com.ibm.ws.tcp.channel.impl.ZAioTCPReadRequestContextImpl.read(ZAioTCPReadRequestContextImpl.java:227)
     ...... rest of stack trace omitted.
    
    
    The connection from the DMGR to the NA was re-established at
    10:00:50.381. The DMGR drove a new request into the NA (REQ2) and
    issued a synchronous READ to wait for the response (RSP2).
    The original request finally finished and drove a response
    (RSP1) at 10:03:42.940.
    The DMGR was expecting RSP2 on its inbound message queue and
    received RSP1 instead, causing the ClassCastException.
    The DMGR is left with an additional response in its pipe, and
    the status of the node is not reflected correctly after
    that, because the isAlive request from the DMGR to the
    Node agent continues to fail after the exception.
    
    Restarting the DMGR and Node agent restores normal operation
    only for a while.
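
The REQ1/REQ2 flow above can be modelled with a minimal sketch. This is not WebSphere code: the class names (StaleResponseDemo, NodeSyncResult, IsAliveResult) and the single shared inbound queue are assumptions made purely for illustration. It only shows how a late RSP1 that lands ahead of RSP2 makes the reader cast the wrong object, and why one extra response is left behind afterwards.

```java
import java.util.ArrayDeque;
import java.util.Queue;

/** Hypothetical model of the flow above; none of these names come from WebSphere. */
class StaleResponseDemo {

    /** Reply type the sender of REQ1 would have expected (assumed). */
    static final class NodeSyncResult {
        final String detail;
        NodeSyncResult(String detail) { this.detail = detail; }
    }

    /** Reply type the sender of REQ2 expects (assumed). */
    static final class IsAliveResult {
        final boolean alive;
        IsAliveResult(boolean alive) { this.alive = alive; }
    }

    /** Inbound message queue seen by both the old and the new use of the connection. */
    static final Queue<Object> inbound = new ArrayDeque<>();

    /** Takes the next reply and casts it to the type the caller expects. */
    static <T> T readReply(Class<T> expected) {
        Object reply = inbound.remove();
        return expected.cast(reply); // throws ClassCastException on a type mismatch
    }

    /** Replays the REQ1/REQ2 sequence and reports what the second READ saw. */
    static String replay() {
        inbound.clear();
        // REQ1 was sent and its READ timed out, so RSP1 was never consumed.
        // REQ2 is sent next; RSP1 then arrives late and is queued ahead of RSP2.
        inbound.add(new NodeSyncResult("late RSP1"));
        inbound.add(new IsAliveResult(true)); // RSP2
        try {
            readReply(IsAliveResult.class);
            return "ok";
        } catch (ClassCastException e) {
            return "ClassCastException";
        }
    }

    public static void main(String[] args) {
        System.out.println(replay());       // prints "ClassCastException"
        System.out.println(inbound.size()); // prints 1: RSP2 is still queued
    }
}
```

In this simplified model the mismatch surfaces on the very next READ. The Problem summary notes that in practice several requests may first appear to succeed with the wrong reply before the exception occurs.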
    

Local fix

  • Use SOAP instead of IPC as the local connector.
    

Problem summary

  • ****************************************************************
    * USERS AFFECTED:  All users of IBM WebSphere Application      *
    *                  Server V7.0, V8.0, and V8.5                 *
    ****************************************************************
    * PROBLEM DESCRIPTION: WebSphere Application Server for z/OS   *
    *                      local communication received stale      *
    *                      data                                    *
    ****************************************************************
    * RECOMMENDATION:                                              *
    ****************************************************************
    A timing window in the local communication code can lead to the
    successful sending of data using a stale connection.  The
    delivery of this data across a stale connection can cause
    unexpected behavior on the receiving side of a new connection.
    It has been observed, on the Admin IPC channel over local
    communication, that this data was mistakenly treated as valid
    reply data to new requests.  This in turn led to a
    ClassCastException after a number of erroneous "successful"
    requests.
    

Problem conclusion

  • Additional code has been added in the local communication path
    to detect and reject attempts to send data over a stale
    connection.
    
    APAR PM74655 is currently targeted for inclusion in Fix Packs
    7.0.0.29, 8.0.0.6, and 8.5.0.2 of WebSphere Application Server.
    
    Please refer to URL:
    //www.ibm.com/support/docview.wss?rs=404&uid=swg27006970
    for Fix Pack availability.
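
The APAR does not show how the stale-connection check is implemented. One common way to detect and reject sends over a stale connection is a generation counter, sketched below under that assumption. The names StaleConnectionGuard, Handle, open, closeAndRecycle, and send are hypothetical and are not taken from the product.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch of stale-connection detection; not the actual fix. */
class StaleConnectionGuard {

    /** Generation of the logical connection; bumped each time it is closed and recycled. */
    private final AtomicLong generation = new AtomicLong();

    /** Handle given to a sender when it obtains the connection. */
    static final class Handle {
        final long generation;
        Handle(long generation) { this.generation = generation; }
    }

    /** Returns a handle stamped with the current generation. */
    Handle open() {
        return new Handle(generation.get());
    }

    /** Called when a READ times out and the connection is closed and returned to the pool. */
    void closeAndRecycle() {
        generation.incrementAndGet();
    }

    /** Accepts data only from a handle that belongs to the current generation. */
    boolean send(Handle handle, Object payload) {
        if (handle.generation != generation.get()) {
            return false; // stale connection: reject instead of delivering
        }
        // A real implementation would write the payload to the peer here.
        return true;
    }

    public static void main(String[] args) {
        StaleConnectionGuard guard = new StaleConnectionGuard();
        Handle old = guard.open();     // connection used for REQ1
        guard.closeAndRecycle();       // DMGR READ timed out
        Handle fresh = guard.open();   // connection used for REQ2
        System.out.println(guard.send(old, "RSP1"));   // prints false
        System.out.println(guard.send(fresh, "RSP2")); // prints true
    }
}
```

The sketch is single-threaded. A concurrent version would need the generation check and the write to happen atomically, for example under a lock, to avoid reopening the same timing window.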
    

Temporary fix

Comments

APAR Information

  • APAR number

    PM74655

  • Reported component name

    WEBSPHERE FOR Z

  • Reported component ID

    5655I3500

  • Reported release

    700

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt

  • Submitted date

    2012-10-09

  • Closed date

    2012-12-11

  • Last modified date

    2013-07-03

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

Fix information

  • Fixed component name

    WEBSPHERE FOR Z

  • Fixed component ID

    5655I3500

Applicable component levels

  • R700 PSY UK94926

       UP13/06/20 P F306

Fix is available

  • Select the PTF appropriate for your component level. You will be required to sign in. Distribution on physical media is not available in all countries.



Document information

More support for: WebSphere Application Server for z/OS
General

Software version: 7.0

Reference #: PM74655

Modified date: 03 July 2013