IBM Support

PM91220: Deadlock in near cache invalidation logic during server shutdown.


APAR status

  • Closed as program error.

Error description

  • When using client invalidation, a container may deadlock in XIO
    code when adding or removing a near cache client subscriber.
    
    Add case:
    
    thread
    XIOPrimaryPool : 42:
    Blocked on:
    Owned by: "com.ibm.ws.xs.xio.flowcontrol.server.impl.FlowControlSchedulerSchedulePool : 0"
    
    com/ibm/ws/xs/xio/flowcontrol/server/impl/ContainerFlowControlImpl.informListenersAboutSlowClients(ContainerFlowControlImpl.java:212)
    
    com/ibm/ws/xs/xio/flowcontrol/server/impl/Prober.run(Prober.java:479(Compiled Code))
    
    com/ibm/ws/xs/xio/flowcontrol/server/impl/Prober.receiveClockRequestMessage(Prober.java:354)
    
    com/ibm/ws/xs/xio/flowcontrol/server/impl/Prober.receive(Prober.java:201)
    
    thread
    "com.ibm.ws.xs.xio.flowcontrol.server.impl.FlowControlSchedulerSchedulePool : 0"
    
    Blocked on:
    Owned by: "SubscriptionProcessPool : 1"
    
    com/ibm/ws/xs/pubsub/publication/Publisher.internalPublish(Publisher.java:716(Compiled Code))
    
    com/ibm/ws/xs/pubsub/publication/Publisher.publishPendingMessages(Publisher.java:797(Compiled Code))
    
    com/ibm/ws/xs/xio/flowcontrol/server/impl/ContainerFlowControlImpl.run(ContainerFlowControlImpl.java:271(Compiled Code))
    
    thread
    SubscriptionProcessPool : 1
    Blocked on: java/util/HashMap@0x000000000C602458
    Owned by: "XIOPrimaryPool : 42"
    
    com/ibm/ws/xs/xio/flowcontrol/server/impl/ContainerFlowControlImpl.addXSClient(ContainerFlowControlImpl.java:121(Compiled Code))
    
    com/ibm/ws/xs/pubsub/publication/Publisher.addSubscriber(Publisher.java:517(Compiled Code))
    
    com/ibm/ws/xs/pubsub/publication/Publisher.subscribe(Publisher.java:439(Compiled Code))
    
    
    Remove case:
    
    A similar deadlock can occur on the remove path, with the
    following frames on the stack:
    
    com/ibm/ws/xs/xio/flowcontrol/server/impl/ContainerFlowControlImpl.removeXSClient(ContainerFlowControlImpl.java:148)
    
    com/ibm/ws/xs/pubsub/publication/Publisher.unsubscribe(Publisher.java:620)
    
    com/ibm/ws/xs/pubsub/publication/Publisher.dispatchException(Publisher.java:771(Compiled Code))
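
    The "Blocked on" / "Owned by" relationships in the thread listings
    above form a cycle between monitors. As a rough sketch only, a
    cycle of this kind can also be detected from inside a running JVM
    (or over a JMX connection) with the standard java.lang.management
    API. The class name MonitorDeadlockCheck and the report format are
    illustrative assumptions and are not part of the product.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of WebSphere eXtreme Scale.
public class MonitorDeadlockCheck {

    /** Returns one line per thread caught in a monitor deadlock; empty if none. */
    public static List<String> describeDeadlocks() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        // Returns null when no threads are deadlocked on object monitors.
        long[] ids = mx.findMonitorDeadlockedThreads();
        List<String> report = new ArrayList<>();
        if (ids == null) {
            return report;
        }
        for (ThreadInfo info : mx.getThreadInfo(ids)) {
            if (info == null) {
                continue; // the thread ended between the two calls
            }
            report.add(info.getThreadName()
                    + " blocked on " + info.getLockName()
                    + " owned by " + info.getLockOwnerName());
        }
        return report;
    }

    public static void main(String[] args) {
        List<String> report = describeDeadlocks();
        if (report.isEmpty()) {
            System.out.println("No monitor deadlock found.");
        } else {
            report.forEach(System.out::println);
        }
    }
}
```

    The check only sees the JVM it runs in, so it would have to be
    invoked inside the container process or through a remote
    MBean connection to be useful for this problem.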
    

Local fix

  • Disable the near cache.
    

Problem summary

  • ****************************************************************
    * USERS AFFECTED:  Users of the near-cache invalidation.       *
    ****************************************************************
    * PROBLEM DESCRIPTION: During shutdown, a timing window        *
    *                      exists that can trigger a deadlock      *
    *                      condition.  This deadlock  prevents     *
    *                      proper server shutdown.                 *
    ****************************************************************
    * RECOMMENDATION:                                              *
    ****************************************************************
    Locks are not acquired in the correct order, resulting in a
    deadlock.  To confirm that the deadlock exists, look in the
    javacore file for the process for a deadlock section that
    names the FlowControlSchedulerSchedulePool thread:
    1LKDEADLOCK    Deadlock detected !!!
    NULL           ---------------------
    NULL
    2LKDEADLOCKTHR  Thread "XIOPrimaryPool : 42" (0x00000000616A7D00)
    3LKDEADLOCKWTR    is waiting for:
    4LKDEADLOCKMON      sys_mon_t:0x00007F3E06C78D80 infl_mon_t: 0x00007F3E06C78DF8:
    4LKDEADLOCKOBJ      java/util/concurrent/ConcurrentHashMap@0x000000000C5FE800
    3LKDEADLOCKOWN    which is owned by:
    2LKDEADLOCKTHR  Thread "com.ibm.ws.xs.xio.flowcontrol.server.impl.FlowControlSchedulerSchedulePool : 0" (0x0000000060C4F800)
    3LKDEADLOCKWTR    which is waiting for:
    4LKDEADLOCKMON      sys_mon_t:0x00007F3D980522A8 infl_mon_t: 0x00007F3D98052320:
    4LKDEADLOCKOBJ      java/util/LinkedList@0x000000000B37C228
    3LKDEADLOCKOWN    which is owned by:
    2LKDEADLOCKTHR  Thread "SubscriptionProcessPool : 1" (0x0000000060D06800)
    3LKDEADLOCKWTR    which is waiting for:
    4LKDEADLOCKMON      sys_mon_t:0x00007F3D98063CB8 infl_mon_t: 0x00007F3D98063D30:
    4LKDEADLOCKOBJ      java/util/HashMap@0x000000000C602458
    3LKDEADLOCKOWN    which is owned by:
    2LKDEADLOCKTHR  Thread "XIOPrimaryPool : 42" (0x00000000616A7D00)
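
    The manual check described above can be automated. The following
    is a minimal sketch, assuming each 2LKDEADLOCKTHR record sits on
    a single line in the real javacore file (the excerpt in this
    APAR is wrapped for display). The class name JavacoreDeadlockScan
    and its methods are hypothetical and are not an IBM tool.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for scanning a javacore file; not an IBM tool.
public class JavacoreDeadlockScan {

    // Captures the quoted thread name on a 2LKDEADLOCKTHR line.
    private static final Pattern QUOTED = Pattern.compile("\"([^\"]+)\"");

    /** True if the javacore contains the "Deadlock detected" banner record. */
    public static boolean hasDeadlock(List<String> lines) {
        for (String line : lines) {
            if (line.startsWith("1LKDEADLOCK")) {
                return true;
            }
        }
        return false;
    }

    /** Distinct thread names from the deadlock section, in first-seen order. */
    public static Set<String> deadlockedThreads(List<String> lines) {
        Set<String> names = new LinkedHashSet<>();
        for (String line : lines) {
            if (!line.startsWith("2LKDEADLOCKTHR")) {
                continue;
            }
            Matcher m = QUOTED.matcher(line);
            if (m.find()) {
                names.add(m.group(1));
            }
        }
        return names;
    }

    public static void main(String[] args) throws IOException {
        // ISO-8859-1 accepts any byte sequence, so odd characters do not abort the read.
        List<String> lines =
                Files.readAllLines(Paths.get(args[0]), StandardCharsets.ISO_8859_1);
        Set<String> names = deadlockedThreads(lines);
        boolean matchesApar = hasDeadlock(lines) && names.stream()
                .anyMatch(n -> n.contains("FlowControlSchedulerSchedulePool"));
        System.out.println("Deadlocked threads: " + names);
        System.out.println("Involves FlowControlSchedulerSchedulePool: " + matchesApar);
    }
}
```

    Run it with the javacore path as the only argument. A match on
    the thread name alone does not prove this APAR applies; the
    stack frames should still be compared with the error description.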
    

Problem conclusion

  • The lock acquisition order is corrected to avoid this
    deadlock condition.
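
    This APAR does not show the actual code change. As a generic
    illustration of consistent lock ordering only, the sketch below
    uses two hypothetical monitors (a client map and a pending
    message list) and has every code path take them in the same
    order, so no cycle of waiting threads can form. All names here
    are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: hypothetical stand-ins, not the product's classes.
public class OrderedLockingSketch {

    private final Map<String, String> clients = new HashMap<>(); // lock level 1
    private final List<String> pending = new ArrayList<>();      // lock level 2

    /** Subscribe path: takes clients first, then pending. */
    public void addClient(String id) {
        synchronized (clients) {
            synchronized (pending) {
                clients.put(id, "subscribed");
                pending.add("welcome:" + id);
            }
        }
    }

    /** Unsubscribe path: same order as addClient. */
    public void removeClient(String id) {
        synchronized (clients) {
            synchronized (pending) {
                clients.remove(id);
                pending.remove("welcome:" + id);
            }
        }
    }

    /**
     * Publish path: also takes clients before pending. Acquiring them in
     * the opposite order here is the kind of inversion that lets two
     * threads each hold one monitor while waiting for the other.
     */
    public int publishPending() {
        synchronized (clients) {
            synchronized (pending) {
                int delivered = clients.isEmpty() ? 0 : pending.size();
                pending.clear();
                return delivered;
            }
        }
    }
}
```

    Holding both monitors for the whole operation is the simplest
    form of the technique; narrowing the critical sections is a
    separate tuning decision.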
    

Temporary fix

Comments

APAR Information

  • APAR number

    PM91220

  • Reported component name

    WS EXTREME SCAL

  • Reported component ID

    5724X6702

  • Reported release

    860

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt

  • Submitted date

    2013-06-17

  • Closed date

    2013-06-21

  • Last modified date

    2013-06-21

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

Fix information

  • Fixed component name

    WS EXTREME SCAL

  • Fixed component ID

    5724X6702

Applicable component levels

  • R860 PSY

       UP


Document Information

Modified date:
21 June 2013