IBM Support

PI16178: Failures in WXS container reconnect code which can lead to duplicate containers.

APAR status

  • Closed as program error.

Error description

  • During network issues, the WXS container reconnect code can
    fail to restart or tear down containers properly.
    
    In some of these cases, duplicate containers can arise. They
    are visible in the showPlacement output as multiple entries
    for the same container with different atomic numbers.
    
    
    FFDCs such as the ones below may be seen during the reconnects:
    
    FFDC Exception:org.omg.CORBA.TRANSIENT SourceId:com.ibm.ws.objectgrid.catalog.wrapper.LocationServiceWrapper.resetRemote
    
    org.omg.CORBA.TRANSIENT: java.net.SocketTimeoutException: connect timed out
    at com.ibm.CORBA.transport.TransportConnectionBase.connect(TransportConnectionBase.java:435)
    .
    .
    at com.ibm.ws.objectgrid.ServerImpl.reconnectContainers(ServerImpl.java:1536)
    
    FFDC Exception:org.omg.CORBA.OBJ_ADAPTER SourceId:class com.ibm.ws.objectgrid.thread.XSThreadPool$XSUncaughtExceptionHandler.uncaughtException ProbeId:43
    
    org.omg.CORBA.OBJ_ADAPTER: com.ibm.websphere.objectgrid.ObjectGridRuntimeException: The server <container_name> has not been bound into naming (registerWithWAS)
    .
    .
    at com.ibm.ws.objectgrid.catalog.wrapper.PlacementServiceWrapper.reconnectWithWAS(PlacementServiceWrapper.java:664)
    at com.ibm.ws.objectgrid.ServerImpl.reconnectContainers(ServerImpl.java:1549)
    
    
    FFDC Exception:com.ibm.websphere.objectgrid.TransactionException SourceId:com.ibm.ws.objectgrid.ObjectGridImpl.activate ProbeId:1090
    
    at com.ibm.ws.objectgrid.container.ObjectGridContainerImpl.teardown(ObjectGridContainerImpl.java:1417)
    at com.ibm.ws.objectgrid.ServerImpl.reconnectContainers(ServerImpl.java:1588)
    
    
    FFDC Exception:org.omg.CORBA.NO_RESPONSE SourceId:class com.ibm.ws.objectgrid.thread.XSThreadPool$XSUncaughtExceptionHandler.uncaughtException ProbeId:43
    
    org.omg.CORBA.NO_RESPONSE: Request timed out
    at com.ibm.ws.objectgrid.container.ObjectGridContainerImpl.<init>(ObjectGridContainerImpl.java:408)
    at com.ibm.ws.objectgrid.ServerImpl.reconnectContainers(ServerImpl.java:1592)
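
Every FFDC excerpt above has a frame in com.ibm.ws.objectgrid.ServerImpl.reconnectContainers. The following is a minimal sketch, not an IBM tool, of how FFDC text could be scanned for entries that involve that frame. It assumes each entry starts with an "FFDC Exception:<class> SourceId:<id>" header, as in the excerpts above; the function name is hypothetical. Whitespace is removed first so that identifiers split by fixed-width wrapping still match.

```python
import re

# Frame that appears in every FFDC stack trace quoted in this APAR.
RECONNECT_FRAME = "com.ibm.ws.objectgrid.ServerImpl.reconnectContainers"


def reconnect_ffdc_exceptions(report: str) -> list[str]:
    """Return the exception class of each FFDC entry that includes the
    reconnectContainers frame.

    Assumption: entries begin with "FFDC Exception:<class> SourceId:<id>",
    the header layout seen in the excerpts of this APAR.
    """
    found = []
    # The text before the first header is not an entry, so skip it.
    for entry in report.split("FFDC Exception:")[1:]:
        # Drop all whitespace so wrapped identifiers become whole again.
        compact = re.sub(r"\s+", "", entry)
        if RECONNECT_FRAME not in compact:
            continue
        # The exception class is everything before the SourceId field.
        found.append(compact.split("SourceId:", 1)[0])
    return found
```

Entries that do not pass through reconnectContainers are ignored, so the result lists only the exception types tied to the reconnect path.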
    

Local fix

Problem summary

  • ****************************************************************
    * USERS AFFECTED:  All users of WebSphere eXtreme Scale        *
    *                  that are running in WebSphere Application   *
    *                  Server.                                     *
    ****************************************************************
    * PROBLEM DESCRIPTION: Extra containers appear in the          *
    *                      placement output after a container      *
    *                      reconnect occurs. The extra containers  *
    *                      can be seen when running the            *
    *                      showPlacement command (or other         *
    *                      equivalent placement commands) with the *
    *                      XSCMD utility.                          *
    ****************************************************************
    * RECOMMENDATION:                                              *
    ****************************************************************
    If an error occurs during a container reconnect, new containers
    might not be disposed of properly and the server might not save
    the handles to those containers, resulting in "ghost"
    containers.
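
The "ghost" containers described above show up in placement output as several containers for one server that differ only in atomic number. The sketch below shows one way such duplicates could be flagged from captured showPlacement text. It assumes container names follow the pattern "<serverName>_C-<atomicNumber>"; that pattern and the function name are assumptions for illustration and should be checked against actual output from your environment.

```python
import re
from collections import defaultdict

# Assumed container naming pattern: "<serverName>_C-<atomicNumber>".
CONTAINER_NAME = re.compile(r"(?P<server>[\w.\-]+?)_C-(?P<atomic>\d+)")


def find_duplicate_containers(placement_output: str) -> dict[str, list[int]]:
    """Map each server to its atomic numbers when more than one is present."""
    atomic_numbers = defaultdict(set)
    for match in CONTAINER_NAME.finditer(placement_output):
        # A set is used because the same container can be listed many times.
        atomic_numbers[match["server"]].add(int(match["atomic"]))
    return {
        server: sorted(numbers)
        for server, numbers in atomic_numbers.items()
        if len(numbers) > 1
    }
```

A server that maps to more than one atomic number is a candidate for the duplicate-container condition described in this APAR; an empty result means no duplicates were found under the assumed naming pattern.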
    

Problem conclusion

  • The code was updated to ensure that errors encountered during
    container reconnect do not result in the loss of container
    handles.
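
The general idea of not losing a handle when an error occurs can be illustrated with a short sketch. This is not the product code and does not reflect how the fix is implemented; the class and all names in it are hypothetical. It only shows the pattern of recording a handle before any step that can fail, and tearing the container down if a later step raises.

```python
class ContainerRegistry:
    """Hypothetical stand-in for server-side bookkeeping of containers."""

    def __init__(self):
        self.handles = []

    def reconnect(self, create, initialize, teardown):
        """Create a container and keep its handle only if initialization
        succeeds. All three arguments are caller-supplied callables."""
        handle = create()
        # Record the handle before any step that can fail, so an error
        # cannot leave a live container that nothing refers to.
        self.handles.append(handle)
        try:
            initialize(handle)
        except Exception:
            # Dispose of the half-started container, then drop its handle.
            # If teardown itself raises, the handle stays recorded.
            teardown(handle)
            self.handles.remove(handle)
            raise
        return handle
```

With this ordering, a failure during initialization results either in a container that was torn down or in a handle that is still tracked, never in an untracked container.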
    

Temporary fix

Comments

APAR Information

  • APAR number

    PI16178

  • Reported component name

    WS EXTREME SCAL

  • Reported component ID

    5724X6702

  • Reported release

    850

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt

  • Submitted date

    2014-04-16

  • Closed date

    2014-05-21

  • Last modified date

    2014-05-21

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

Fix information

  • Fixed component name

    WS EXTREME SCAL

  • Fixed component ID

    5724X6702

Applicable component levels

  • R711 PSY UP

  • R850 PSY UP

  • R860 PSY UP

Document Information

Modified date:
21 May 2014