IBM Support

PI60922: Container server fails to start during concurrent catalog and container server starts.

APAR status

  • Closed as program error.

Error description

  • When container and catalog servers start concurrently, a timing
    problem can cause a container server start process to fail with
    
    When this occurs, the container server will log a MessageTimeOut
    
    For example,
    
    Exception = com.ibm.ws.xsspi.xio.exception.MessageTimeOutExcepti
    Source = com.ibm.ws.objectgrid.naming.XIOLocationServiceClient.b
    probeid = 108
    Stack Dump = com.ibm.ws.xsspi.xio.exception.MessageTimeOutExcept
        at com.ibm.ws.xsspi.xio.dispatch.MessageInfo.throwException(
        at com.ibm.ws.xsspi.xio.dispatch.MessageInfo.getMessage(Mess
        at com.ibm.ws.xsspi.xio.dispatch.MessageInfo.getMessage(Mess
        at com.ibm.ws.objectgrid.catalog.wrapper.xio.XIOServiceMessa
        at com.ibm.ws.objectgrid.naming.XIOLocationServiceClient.bin
        at com.ibm.ws.objectgrid.catalog.wrapper.LocationServiceWrap
        at com.ibm.ws.objectgrid.server.impl.ServerImpl.<init>(Serve
    
    Then the container server will fail to start and log messages in
    
    [3/31/16 18:55:54:484 CEST] 0000009c com.ibm.ws.logging.internal
    
    The catalog server will have a hung thread around the same time
    
    Example from catalog server JVM log.
    [3/31/16 18:55:41:780 CEST] 0000003a com.ibm.ws.objectgrid.threa
    Stack Trace:
        java.lang.Object.wait(Native Method)
        java.lang.Object.wait(Object.java:201)
        com.ibm.ws.objectgrid.server.naming.CommonLocationService.ge
        com.ibm.ws.objectgrid.server.naming.CommonLocationService.bi
        com.ibm.ws.objectgrid.server.naming.XIOLocationService.bind(
        com.ibm.ws.objectgrid.server.naming.XIOLocationService.bindR
        com.ibm.ws.objectgrid.server.naming.XIOLocationService.recei
        com.ibm.ws.xs.xio.actor.impl.XIOReferableImpl.dispatch(XIORe
        com.ibm.ws.xsspi.xio.actor.XIORegistry.sendToTarget(XIORegis
        com.ibm.ws.xs.xio.transport.channel.XIORegistryRunnable.run(
        java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPool
        java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoo
        com.ibm.ws.objectgrid.thread.XSThreadPool$Worker.run(XSThrea
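
To check whether a container server log excerpt matches the signature described above, a small helper along these lines may be useful. This is only a sketch: the class `BindTimeoutLogCheck` and its method are hypothetical and not part of WebSphere eXtreme Scale, and the two match strings are the truncated prefixes exactly as they appear in the excerpt, because the full names are not shown there.

```java
import java.util.Arrays;
import java.util.List;

public class BindTimeoutLogCheck {

    // Prefixes copied from the truncated excerpt in this APAR (assumption:
    // the real log lines begin with these same prefixes).
    static final String EXCEPTION_PREFIX =
            "com.ibm.ws.xsspi.xio.exception.MessageTimeOutExcept";
    static final String SOURCE_PREFIX =
            "com.ibm.ws.objectgrid.naming.XIOLocationServiceClient";

    /**
     * Returns true when the lines mention both the message timeout
     * exception and the location service client as its source.
     */
    public static boolean looksLikeBindTimeout(List<String> lines) {
        boolean sawException = false;
        boolean sawSource = false;
        for (String line : lines) {
            if (line.contains(EXCEPTION_PREFIX)) {
                sawException = true;
            }
            if (line.contains(SOURCE_PREFIX)) {
                sawSource = true;
            }
        }
        return sawException && sawSource;
    }

    public static void main(String[] args) {
        // Sample lines taken from the excerpt above.
        List<String> sample = Arrays.asList(
                "Exception = com.ibm.ws.xsspi.xio.exception.MessageTimeOutExcepti",
                "Source = com.ibm.ws.objectgrid.naming.XIOLocationServiceClient.b",
                "probeid = 108");
        System.out.println(looksLikeBindTimeout(sample));
    }
}
```

A match only suggests that the log resembles this APAR; the hung thread in the catalog server log around the same time is the second indicator to look for.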
    

Local fix

Problem summary

  • ****************************************************************
    * USERS AFFECTED:  WebSphere eXtreme Scale users starting      *
    *                  catalog and container servers concurrently  *
    *                  where 2 or more catalog servers are         *
    *                  defined.                                    *
    ****************************************************************
    * PROBLEM DESCRIPTION: Container server fails to start during  *
    *                      concurrent catalog and container server *
    *                      starts.                                 *
    ****************************************************************
    * RECOMMENDATION:                                              *
    ****************************************************************
    A timing condition during startup causes a transport timeout
    during the server name bind request. This prevents the server
    from correctly binding its name and starting.
    

Problem conclusion

  • The bind exception is now handled, so the server can start.
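
The APAR does not say how the exception is handled internally. As a general illustration of tolerating a timeout on a bind-style call rather than failing startup, here is a minimal sketch. Everything in it is an assumption for illustration: `BindRetrySketch`, `bindWithRetry`, and the simulated bind action are hypothetical, and the standard `java.util.concurrent.TimeoutException` stands in for the product's own exception type.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.TimeoutException;

public class BindRetrySketch {

    /**
     * Runs the (hypothetical) bind action, retrying when it times out.
     * Returns the number of attempts used, or rethrows the last timeout
     * once maxAttempts is exhausted.
     */
    public static int bindWithRetry(Callable<Void> bindAction,
                                    int maxAttempts,
                                    long pauseMillis) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        TimeoutException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                bindAction.call();
                return attempt; // bind succeeded
            } catch (TimeoutException e) {
                last = e; // remember the failure and try again
                if (attempt < maxAttempts) {
                    Thread.sleep(pauseMillis);
                }
            }
        }
        throw last;
    }

    public static void main(String[] args) throws Exception {
        // Simulated bind that times out twice, then succeeds.
        int[] calls = {0};
        int attempts = bindWithRetry(() -> {
            if (++calls[0] < 3) {
                throw new TimeoutException("simulated bind timeout");
            }
            return null;
        }, 5, 10L);
        System.out.println("bound after " + attempts + " attempts");
    }
}
```

The sketch retries only on timeouts and lets any other exception propagate, so unrelated startup failures are not masked.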
    

Temporary fix

Comments

APAR Information

  • APAR number

    PI60922

  • Reported component name

    WS EXTREME SCAL

  • Reported component ID

    5724X6702

  • Reported release

    860

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt / Xsystem

  • Submitted date

    2016-04-18

  • Closed date

    2016-04-25

  • Last modified date

    2016-04-25

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

Fix information

  • Fixed component name

    WS EXTREME SCAL

  • Fixed component ID

    5724X6702

Applicable component levels

  • R860 PSY

       UP

Document Information

Modified date:
25 April 2016