PI18503: Problems occur when you restart catalog servers and quorum is enabled.

APAR status

  • Closed as program error.

Error description

  • A timing window exists when catalog servers restart with quorum
    enabled. The catalog cluster HAM/DCS view reaches quorum level,
    but the catalog static replication code is not yet fully wired
    when a primary shard that is being promoted attempts to write to
    the balance data grid. That write fails with a
    ReplicationVotedToRollbackTransactionException, and an FFDC
    similar to the following example is logged:
    
    FFDC Exception:com.ibm.ws.xsspi.xio.exception.TransportException$Internal
    SourceId:com.ibm.ws.objectgrid.server.catalog.placement.CatalogServiceCommon.activate ProbeId:1547
    Reporter:com.ibm.ws.objectgrid.server.catalog.placement.CatalogServiceCommon@1114fc5
    com.ibm.ws.xsspi.xio.exception.TransportException$Internal [originating=10.193.8.92:10001;exid=0]: CWOBJ1688E: Unable to bind OBJECTGRID_PLACEMENT_SERVICE: rolling back transaction, see caused by exception
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:525)
        at com.ibm.ws.xsspi.xio.exception.XIOExceptionFactory.createException(XIOExceptionFactory.java:58)
        at com.ibm.ws.objectgrid.server.naming.CommonLocationService.setPlacementService(CommonLocationService.java:671)
        at com.ibm.ws.objectgrid.server.naming.XIOLocationService.setPlacementService(XIOLocationService.java:375)
        at com.ibm.ws.objectgrid.server.catalog.placement.CatalogServiceCommon.doActivate(CatalogServiceCommon.java:961)
        at com.ibm.ws.objectgrid.server.catalog.placement.CatalogServiceCommon$1.run(CatalogServiceCommon.java:871)
        at java.lang.Thread.run(Thread.java:722)
    Caused by: com.ibm.websphere.objectgrid.TransactionException: rolling back transaction, see caused by exception
        at com.ibm.ws.objectgrid.SessionImpl.rollbackPMapChanges(SessionImpl.java:2548)
        at com.ibm.ws.objectgrid.SessionImpl.commit(SessionImpl.java:2160)
        at com.ibm.ws.objectgrid.server.naming.CommonLocationService.setPlacementService(CommonLocationService.java:653)
        ... 4 more
    Caused by: com.ibm.websphere.objectgrid.ReplicationVotedToRollbackTransactionException: BalanceGrid:ENTITY_MAPSET:0: Only 0 replicas voted to commit the transaction. Number of replicas voting: 0. Minimum required to commit: 3. Domain: null.  A possible reason is a LifecycleFailedException during shard activation, check server logs for CWOBJ1209 messages
        at com.ibm.ws.objectgrid.SessionImpl.commit(SessionImpl.java:2041)
        ... 5 more
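
When scanning FFDC files for this signature, it can help to pull the replica vote counts out of the ReplicationVotedToRollbackTransactionException message. The following is a minimal sketch that assumes the message wording shown in the example above; the class name ReplicaVoteParser and its members are hypothetical helpers for illustration, not part of WebSphere eXtreme Scale.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ReplicaVoteParser {

    /** Vote counts taken from the exception message (hypothetical holder type). */
    public static final class Votes {
        public final int committed; // "Only N replicas voted to commit"
        public final int voting;    // "Number of replicas voting: N"
        public final int required;  // "Minimum required to commit: N"

        Votes(int committed, int voting, int required) {
            this.committed = committed;
            this.voting = voting;
            this.required = required;
        }

        /** True when fewer replicas voted to commit than the stated minimum. */
        public boolean belowMinimum() {
            return committed < required;
        }
    }

    // \s+ between words tolerates the hard line wraps that appear in FFDC output.
    private static final Pattern VOTES = Pattern.compile(
        "Only\\s+(\\d+)\\s+replicas\\s+voted\\s+to\\s+commit\\s+the\\s+transaction\\.\\s+"
        + "Number\\s+of\\s+replicas\\s+voting:\\s+(\\d+)\\.\\s+"
        + "Minimum\\s+required\\s+to\\s+commit:\\s+(\\d+)\\.");

    /** Returns the vote counts if the text contains the expected wording. */
    public static Optional<Votes> parse(String text) {
        Matcher m = VOTES.matcher(text);
        if (!m.find()) {
            return Optional.empty();
        }
        return Optional.of(new Votes(
            Integer.parseInt(m.group(1)),
            Integer.parseInt(m.group(2)),
            Integer.parseInt(m.group(3))));
    }

    public static void main(String[] args) {
        String message = "BalanceGrid:ENTITY_MAPSET:0: Only 0 replicas voted to commit the "
            + "transaction. Number of replicas voting: 0. Minimum required to commit: 3. "
            + "Domain: null.";
        parse(message).ifPresent(v -> System.out.println(
            "committed=" + v.committed + " voting=" + v.voting
            + " required=" + v.required + " belowMinimum=" + v.belowMinimum()));
    }
}
```

In the example FFDC, zero replicas voted while three were required, which is consistent with the replication path not yet being wired at the time of the write.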
    

Local fix

Problem summary

  • ****************************************************************
    * USERS AFFECTED:  All WebSphere eXtreme Scale customers who   *
    *                  use quorum.                                 *
    ****************************************************************
    * PROBLEM DESCRIPTION: Problems occur when you restart         *
    *                      catalog servers and quorum is enabled.  *
    ****************************************************************
    * RECOMMENDATION:                                              *
    ****************************************************************
    A timing window exists when catalog servers restart with quorum
    enabled. The catalog cluster HAM/DCS view reaches quorum level,
    but the catalog static replication code is not yet fully wired
    when a primary shard that is being promoted attempts to write to
    the balance data grid. That write fails with a
    ReplicationVotedToRollbackTransactionException, and an FFDC
    similar to the example in the Error description section is
    logged.
    

Problem conclusion

  • A code fix was delivered that accommodates the timing window and
    retries the operation.
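
The APAR does not describe how the delivered fix is implemented. As a general illustration of retrying an operation across a short timing window, the following sketch retries a callable a bounded number of times with a fixed delay. Everything in it is an assumption made for illustration: RetryOnRollback, TransientRollbackException, and the retry parameters are hypothetical and are not eXtreme Scale classes or settings.

```java
import java.util.concurrent.Callable;

public class RetryOnRollback {

    /** Hypothetical stand-in for a transient rollback; not the eXtreme Scale exception class. */
    public static class TransientRollbackException extends Exception {
        public TransientRollbackException(String message) {
            super(message);
        }
    }

    /**
     * Runs op up to maxAttempts times, sleeping delayMillis between attempts.
     * Only TransientRollbackException triggers a retry; other exceptions propagate at once.
     */
    public static <T> T retry(Callable<T> op, int maxAttempts, long delayMillis) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        TransientRollbackException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return op.call();
            } catch (TransientRollbackException e) {
                last = e;
                if (attempt < maxAttempts) {
                    Thread.sleep(delayMillis); // give the dependency time to become ready
                }
            }
        }
        throw last; // every attempt failed; surface the most recent failure
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        // Simulated operation: fails twice, then succeeds on the third call.
        String result = retry(() -> {
            if (++calls[0] < 3) {
                throw new TransientRollbackException("replication not wired yet");
            }
            return "bound";
        }, 5, 10L);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

Bounding the number of attempts keeps a persistent failure, such as a genuine loss of quorum, from being retried indefinitely.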
    

Temporary fix

Comments

APAR Information

  • APAR number

    PI18503

  • Reported component name

    WS EXTREME SCAL

  • Reported component ID

    5724X6702

  • Reported release

    860

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt

  • Submitted date

    2014-05-22

  • Closed date

    2014-05-23

  • Last modified date

    2014-05-23

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

Fix information

  • Fixed component name

    WS EXTREME SCAL

  • Fixed component ID

    5724X6702

Applicable component levels

  • R850 PSY

       UP

Document Information

Modified date:
23 May 2014