IBM Support

PM91941: THREADS HANG IN SOCKETREAD() WHEN DATABASE IS BROUGHT DOWN WHILE STATEMENTS ARE RUNNING

Fixes are available

8.0.0.8: WebSphere Application Server V8.0 Fix Pack 8
7.0.0.31: WebSphere Application Server V7.0 Fix Pack 31
8.5.5.2: WebSphere Application Server V8.5.5 Fix Pack 2
7.0.0.33: WebSphere Application Server V7.0 Fix Pack 33
8.0.0.9: WebSphere Application Server V8.0 Fix Pack 9
8.5.5.3: WebSphere Application Server V8.5.5 Fix Pack 3
7.0.0.35: WebSphere Application Server V7.0 Fix Pack 35
8.5.5.4: WebSphere Application Server V8.5.5 Fix Pack 4
8.0.0.10: WebSphere Application Server V8.0 Fix Pack 10
7.0.0.37: WebSphere Application Server V7.0 Fix Pack 37
8.5.5.5: WebSphere Application Server V8.5.5 Fix Pack 5
8.5.5.6: WebSphere Application Server V8.5.5 Fix Pack 6
8.0.0.11: WebSphere Application Server V8.0 Fix Pack 11
8.5.5.7: WebSphere Application Server V8.5.5 Fix Pack 7
7.0.0.39: WebSphere Application Server V7.0 Fix Pack 39
8.5.5.8: WebSphere Application Server V8.5.5 Fix Pack 8
8.0.0.12: WebSphere Application Server V8.0 Fix Pack 12
8.5.5.9: WebSphere Application Server V8.5.5 Fix Pack 9
7.0.0.41: WebSphere Application Server V7.0 Fix Pack 41
8.5.5.10: WebSphere Application Server V8.5.5 Fix Pack 10
8.5.5.11: WebSphere Application Server V8.5.5 Fix Pack 11
8.0.0.13: WebSphere Application Server V8.0 Fix Pack 13
7.0.0.43: WebSphere Application Server V7.0 Fix Pack 43
8.5.5.12: WebSphere Application Server V8.5.5 Fix Pack 12
8.0.0.14: WebSphere Application Server V8.0 Fix Pack 14
8.5.5.13: WebSphere Application Server V8.5.5 Fix Pack 13
7.0.0.45: WebSphere Application Server V7.0 Fix Pack 45
8.0.0.15: WebSphere Application Server V8.0 Fix Pack 15
7.0.0.45: Java SDK 1.6 SR16 FP60 Cumulative Fix for WebSphere Application Server
7.0.0.31: Java SDK 1.6 SR15 Cumulative Fix for WebSphere Application Server
7.0.0.35: Java SDK 1.6 SR16 FP1 Cumulative Fix for WebSphere Application Server
7.0.0.37: Java SDK 1.6 SR16 FP3 Cumulative Fix for WebSphere Application Server
7.0.0.39: Java SDK 1.6 SR16 FP7 Cumulative Fix for WebSphere Application Server
7.0.0.41: Java SDK 1.6 SR16 FP20 Cumulative Fix for WebSphere Application Server
7.0.0.43: Java SDK 1.6 SR16 FP41 Cumulative Fix for WebSphere Application Server
8.5.5.14: WebSphere Application Server V8.5.5 Fix Pack 14

APAR status

  • Closed as program error.

Error description

  • When an application running in WebSphere Application Server
    connects to a database and the database is brought down
    or crashes while SQL statements (queries or updates) are
    executing, the threads on which the statements are executing
    can hang in socketRead() indefinitely, waiting for a
    response from the database that will never arrive.  For
    this problem to occur, the SQL statements must run long
    enough that they are still waiting for a response from the
    database when it goes down.

    A Javacore will show the threads hung with a stack similar to
    the following:

    at java.net.SocketInputStream.socketRead0(Native Method)
    at java.net.SocketInputStream.read(SocketInputStream.java:140)
    at oracle.net.ns.Packet.receive(Packet.java:283)
    at oracle.net.ns.DataPacket.receive(DataPacket.java:103)
    at oracle.net.ns.NetInputStream.getNextPacket(NetInputStream.java:230)
    at oracle.net.ns.NetInputStream.read(NetInputStream.java:175)
    at oracle.net.ns.NetInputStream.read(NetInputStream.java:100)
    at oracle.net.ns.NetInputStream.read(NetInputStream.java:85)
    at oracle.jdbc.driver.T4CSocketInputStreamWrapper.readNextPacket(T4CSocketInputStreamWrapper.java:123)
    at oracle.jdbc.driver.T4CSocketInputStreamWrapper.read(T4CSocketInputStreamWrapper.java:79)
    at oracle.jdbc.driver.T4CMAREngine.unmarshalUB1(T4CMAREngine.java:1122)
    at oracle.jdbc.driver.T4CMAREngine.unmarshalSB1(T4CMAREngine.java:1099)
    at oracle.jdbc.driver.T4CTTIfun.receive(T4CTTIfun.java:288)
    at oracle.jdbc.driver.T4CTTIfun.doRPC(T4CTTIfun.java:191)
    at oracle.jdbc.driver.T4C8Oall.doOALL(T4C8Oall.java:523)
    at oracle.jdbc.driver.T4CPreparedStatement.doOall8(T4CPreparedStatement.java:207)
    at oracle.jdbc.driver.T4CPreparedStatement.executeForRows(T4CPreparedStatement.java:1010)
    at oracle.jdbc.driver.OracleStatement.doExecuteWithTimeout(OracleStatement.java:1315)
    at oracle.jdbc.driver.OraclePreparedStatement.executeInternal(OraclePreparedStatement.java:3576)
    at oracle.jdbc.driver.OraclePreparedStatement.executeUpdate(OraclePreparedStatement.java:3657)
    at oracle.jdbc.driver.OraclePreparedStatementWrapper.executeUpdate(OraclePreparedStatementWrapper.java:1350)
    at com.ibm.ws.rsadapter.cci.WSResourceAdapterBase.pmiExecuteUpdate(WSResourceAdapterBase.java:667)
    at com.ibm.ws.rsadapter.cci.WSResourceAdapterBase.executeUpdate(WSResourceAdapterBase.java:535)
    ...
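
    As an illustration only, the top-of-stack check that a reader
    performs on the Javacore can also be sketched in Java against
    the live JVM.  The class and method names SocketReadScan,
    isInSocketRead and threadsInSocketRead are hypothetical and
    are not part of WebSphere Application Server.  The frame
    name is taken from the stack above; other JDK levels may
    report a different frame for a blocked socket read.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class SocketReadScan {
    // Frame shown at the top of the hung stack in the Javacore excerpt.
    // Assumption: other JDK levels may use a different class/method here.
    static final String CLASS_NAME = "java.net.SocketInputStream";
    static final String METHOD_NAME = "socketRead0";

    /** Returns true if the top frame of the given stack is the native socket read. */
    public static boolean isInSocketRead(StackTraceElement[] stack) {
        return stack.length > 0
                && CLASS_NAME.equals(stack[0].getClassName())
                && METHOD_NAME.equals(stack[0].getMethodName());
    }

    /** Names of live threads in this JVM whose top frame is the native socket read. */
    public static List<String> threadsInSocketRead() {
        List<String> names = new ArrayList<>();
        for (Map.Entry<Thread, StackTraceElement[]> e : Thread.getAllStackTraces().entrySet()) {
            if (isInSocketRead(e.getValue())) {
                names.add(e.getKey().getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println("Threads in socketRead0: " + threadsInSocketRead());
    }
}
```

    A thread that appears in this list once is not necessarily
    hung; it is only suspect if it stays in the same read across
    samples taken some time apart.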
    

Local fix

  • Set the oracle.jdbc.ReadTimeout property to cause the statement
    to time out after the specified number of milliseconds.  It can
    be set to a value greater than the WebSphere Application Server
    total transaction lifetime timeout to ensure that statements
    that normally run long are not affected.  The property is set
    by creating a new custom property on an Oracle data source.  In
    this example it is set to 3 minutes:
    Name = connectionProperties
    Value = oracle.jdbc.ReadTimeout=180000
    Type = String
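
    For a standalone Java program (outside the application
    server), the same driver property could be passed as a
    connection property.  The following is a minimal sketch, not
    the WebSphere configuration itself: the class name
    ReadTimeoutExample, the helper withReadTimeout, the
    credentials and the JDBC URL in the comment are all
    hypothetical, and the Oracle driver is assumed to be on the
    classpath when a real connection is opened.

```java
import java.util.Properties;

public class ReadTimeoutExample {
    /** Builds connection properties that carry the read timeout in milliseconds. */
    public static Properties withReadTimeout(String user, String password, long timeoutMillis) {
        if (timeoutMillis < 0) {
            throw new IllegalArgumentException("timeoutMillis must be >= 0");
        }
        Properties props = new Properties();
        props.setProperty("user", user);
        props.setProperty("password", password);
        // Same property name and value as the data source custom property above.
        props.setProperty("oracle.jdbc.ReadTimeout", Long.toString(timeoutMillis));
        return props;
    }

    public static void main(String[] args) {
        // 3 minutes, matching the example value of 180000 milliseconds.
        Properties props = withReadTimeout("appuser", "secret", 180_000L);
        // Hypothetical usage with the Oracle driver available:
        // Connection c = DriverManager.getConnection(
        //         "jdbc:oracle:thin:@//dbhost:1521/SERVICE", props);
        System.out.println(props.getProperty("oracle.jdbc.ReadTimeout"));
    }
}
```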
    

Problem summary

  • ****************************************************************
    * USERS AFFECTED:  All users of IBM WebSphere Application      *
    *                  Server                                      *
    ****************************************************************
    * PROBLEM DESCRIPTION: Threads hang in socketRead() when       *
    *                      database is brought down while          *
    *                      statements are running                  *
    ****************************************************************
    * RECOMMENDATION:                                              *
    ****************************************************************
    In WebSphere Application Server, threads can appear hung when
    connections obtained from a JDBC driver are used to run
    statements that have not completed when the database crashes
    or is forcibly brought down.  Because the database goes down,
    the threads that are running a statement and waiting for a
    response from the database remain hung in socketRead()
    indefinitely.  The same problem can be reproduced in a simple
    Java program that is run outside of an application server.  If
    a long-running query is simulated and the database is brought
    down before the query finishes running, the Java program
    can hang indefinitely.  The problem was reproduced on AIX with
    version 11.2.0.2 of the Oracle JDBC driver.  Other JDBC
    drivers and platforms can experience the same problem.
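
    The blocking read itself can be illustrated without a
    database.  In the sketch below, which is an assumption-laden
    stand-in rather than the reproduction described above, a
    local listener accepts a connection and then never sends
    anything, playing the role of a database that will not
    respond.  The names SilentPeerDemo and readTimesOut are
    hypothetical.  A socket timeout is set so that the read gives
    up; with a timeout of 0 the same read would block
    indefinitely.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

public class SilentPeerDemo {
    /**
     * Connects to a local listener that never sends data and attempts one read.
     * Returns true if the read ended with a SocketTimeoutException.
     */
    public static boolean readTimesOut(int timeoutMillis) {
        if (timeoutMillis <= 0) {
            throw new IllegalArgumentException("timeoutMillis must be > 0; 0 would block forever");
        }
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket listener = new ServerSocket(0, 1, loopback);
             Socket client = new Socket(loopback, listener.getLocalPort());
             Socket accepted = listener.accept()) {
            // The accepted side stays open but silent until this block exits.
            client.setSoTimeout(timeoutMillis);
            try {
                client.getInputStream().read(); // no data will ever arrive
                return false;
            } catch (SocketTimeoutException e) {
                return true;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("read timed out: " + readTimesOut(500));
    }
}
```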
    While the threads using the connections are hung, the number
    of new connections that can be created is reduced by the
    number of connections held by the hung threads, because those
    connections still count toward the maximum number of
    connections the pool is allowed to have.  When many
    connections are held by hung threads, the connection pool can
    be exhausted quickly and the application's new connection
    requests will time out, which can lead to the application
    server JVM entering a hung state.
    TCP/IP traces captured after the database was brought down
    showed several connections that never received a connection
    reset.  Such connections stay alive either indefinitely or
    until the application probes the connection and determines
    that it is dead.  The operating system parameter tcp_keepidle
    controls how long a connection must be idle before keepalive
    probes are sent to determine whether it is still alive; the
    default value is 2 hours (for AIX).  Because changing
    tcp_keepidle affects all applications, changing this
    parameter is often not a viable solution.
    The solution is for WebSphere Application Server to support
    configuration of socket keepalive parameters that are set on
    the socket opened by the JDBC driver, so that the operating
    system defaults can be overridden and the parameters tuned
    for the application environment.  The socket keepalive
    parameters socketIdleTime, socketIntervalTime, and
    socketProbeCount can be configured as datasource custom
    properties.  When any of these custom properties is defined
    on the datasource, the WebSphere relational resource adapter
    component sets the corresponding keepalive properties on the
    socket before the socket is returned to the JDBC driver.
    Because the behavior is controlled through datasource custom
    properties, it can be enabled for each datasource as needed.
    a) socketIdleTime - The length of time, in seconds, that the
    socket can be idle before a keepalive probe is sent.
    b) socketIntervalTime - The interval, in seconds, between
    successive keepalive probes.
    c) socketProbeCount - The maximum number of keepalive probes
    sent to check whether the socket connection is still
    active.
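
    To make the three parameters concrete, the sketch below sets
    the equivalent per-socket options with the standard Java
    socket API.  This is not how the relational resource adapter
    is implemented; it assumes Java 11 or later, where
    jdk.net.ExtendedSocketOptions provides TCP_KEEPIDLE,
    TCP_KEEPINTERVAL and TCP_KEEPCOUNT, and it assumes a platform
    that supports them.  KeepAliveSketch, applyKeepAlive and
    roundTrip are hypothetical names.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.Socket;
import java.net.SocketOption;
import java.net.StandardSocketOptions;
import java.util.Set;
import jdk.net.ExtendedSocketOptions;

public class KeepAliveSketch {
    /**
     * Enables keepalive and, where the platform supports it, tunes it per socket.
     * Returns false if the per-socket tuning options are not supported.
     */
    public static boolean applyKeepAlive(Socket socket, int idleSeconds,
                                         int intervalSeconds, int probeCount) throws IOException {
        socket.setOption(StandardSocketOptions.SO_KEEPALIVE, true);
        Set<SocketOption<?>> supported = socket.supportedOptions();
        if (!supported.contains(ExtendedSocketOptions.TCP_KEEPIDLE)
                || !supported.contains(ExtendedSocketOptions.TCP_KEEPINTERVAL)
                || !supported.contains(ExtendedSocketOptions.TCP_KEEPCOUNT)) {
            return false; // only the operating system defaults apply
        }
        socket.setOption(ExtendedSocketOptions.TCP_KEEPIDLE, idleSeconds);         // cf. socketIdleTime
        socket.setOption(ExtendedSocketOptions.TCP_KEEPINTERVAL, intervalSeconds); // cf. socketIntervalTime
        socket.setOption(ExtendedSocketOptions.TCP_KEEPCOUNT, probeCount);         // cf. socketProbeCount
        return true;
    }

    /** Applies the settings to an unconnected socket and reads them back; null if unsupported. */
    public static int[] roundTrip(int idleSeconds, int intervalSeconds, int probeCount) {
        try (Socket socket = new Socket()) {
            if (!applyKeepAlive(socket, idleSeconds, intervalSeconds, probeCount)) {
                return null;
            }
            return new int[] {
                socket.getOption(ExtendedSocketOptions.TCP_KEEPIDLE),
                socket.getOption(ExtendedSocketOptions.TCP_KEEPINTERVAL),
                socket.getOption(ExtendedSocketOptions.TCP_KEEPCOUNT)
            };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] values = roundTrip(10, 5, 10);
        System.out.println(values == null ? "per-socket keepalive tuning not supported"
                : "idle=" + values[0] + " interval=" + values[1] + " count=" + values[2]);
    }
}
```

    Setting the options per socket, rather than changing an
    operating system default, limits the effect to the
    connections that need it.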
    Prerequisites:
    Support for configuring the socket keepalive parameters is
    available in the following and later versions of WebSphere
    Application Server and IBM SDK for Java:
    a) WebSphere Application Server Version 7.0.0.31 + IBM SDK
    for Java 6.0 SR15
    b) WebSphere Application Server Version 8.0.0.8 + IBM SDK for
    Java 6.0.1 SR7
    c) WebSphere Application Server Version 8.5.5.2 + IBM SDK for
    Java 7.0 SR6 FP1 / IBM SDK for Java 6.0.1 SR7 FP1
    Note:
    - For Version 7.0.0.31, the required Java SDK is not installed
    as part of the 7.0.0.31 fix pack installation, so IBM SDK
    for Java 6.0 SR15 must be installed separately.
    - For Version 8.0.0.8 and 8.5.5.2, the required Java SDK is
    installed along with the fix pack installation.
    - For APAR PM91941 to work, the IBM SDK for Java
    level must include the JDK APAR IV47758
    (http://www-01.ibm.com/support/docview.wss?uid=swg1IV47758)
    Procedure:
    1) Upgrade WebSphere Application Server to the fix pack
    level listed in the prerequisites above.
    2) Ensure that IBM SDK for Java is also upgraded to the
    level listed in the prerequisites above.
    3) To configure the datasource with the socket keepalive
    parameters, add any or all of the following properties as
    datasource custom properties of type String:
    socketIdleTime
    socketIntervalTime
    socketProbeCount
    Example:
    In the administrative console, navigate to JDBC providers >
    (JDBC Provider Name) > Data sources > (datasource name) >
    Custom properties
    Create the following custom properties with values set to any
    integer that is zero or greater.  For example:
    socketIdleTime = 10
    socketIntervalTime = 5
    socketProbeCount = 10
    - If any parameter is set to a value that is not allowed, a
    warning is logged in the trace file and the parameter is
    ignored.
    - If the Java SDK level is not correct, a warning is logged
    in the trace file and any of these properties that are set
    are ignored.
    4) Restart WebSphere Application Server for the custom
    property values to take effect.
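
    As a rough guide to choosing values, a dead connection is
    detected after approximately socketIdleTime +
    (socketIntervalTime x socketProbeCount) seconds, assuming the
    usual TCP keepalive behavior of one idle period followed by
    evenly spaced unanswered probes.  The exact timing depends on
    the operating system.  The sketch below applies that
    arithmetic to the example values and mirrors the rule that
    only integers of zero or greater are accepted.
    KeepAliveTiming and its methods are hypothetical names, not
    part of the product.

```java
public class KeepAliveTiming {
    /**
     * Parses a custom property value. Returns -1, after printing a warning,
     * when the value is not an integer that is zero or greater.
     */
    public static int parseSetting(String name, String value) {
        try {
            if (value != null) {
                int parsed = Integer.parseInt(value.trim());
                if (parsed >= 0) {
                    return parsed;
                }
            }
        } catch (NumberFormatException e) {
            // fall through to the warning below
        }
        System.err.println("Warning: ignoring " + name + "=" + value);
        return -1;
    }

    /** Approximate seconds until an unresponsive peer is detected. */
    public static int approxDetectionSeconds(int idleSeconds, int intervalSeconds, int probeCount) {
        return idleSeconds + intervalSeconds * probeCount;
    }

    public static void main(String[] args) {
        int idle = parseSetting("socketIdleTime", "10");
        int interval = parseSetting("socketIntervalTime", "5");
        int probes = parseSetting("socketProbeCount", "10");
        // 10 + 5 * 10 = 60 seconds for the example values
        System.out.println(approxDetectionSeconds(idle, interval, probes));
    }
}
```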
    

Problem conclusion

  • The WebSphere relational resource adapter component (RRA) has
    been updated to support the socket keepalive parameter
    configuration to avoid threads hanging in socketRead()
    when the database crashes or is forcibly shut down.
    
    The fix for this APAR is currently targeted for inclusion in
    fix packs 7.0.0.31, 8.0.0.8 and 8.5.5.2.  Please refer to the
    Recommended Updates page for delivery information:
    http://www.ibm.com/support/docview.wss?rs=180&uid=swg27004980
    

Temporary fix

Comments

APAR Information

  • APAR number

    PM91941

  • Reported component name

    WEBS APP SERV N

  • Reported component ID

    5724H8800

  • Reported release

    700

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt

  • Submitted date

    2013-06-26

  • Closed date

    2013-12-12

  • Last modified date

    2014-07-14

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

Fix information

  • Fixed component name

    WEBS APP SERV N

  • Fixed component ID

    5724H8800

Applicable component levels

  • R700 PSY

       UP

  • R800 PSY

       UP

  • R850 PSY

       UP



Document information

More support for: WebSphere Application Server
General

Software version: 7.0

Reference #: PM91941

Modified date: 14 July 2014