IBM Support

Database locks and rollbacks occur during deployment and the BPEL application fails to start on some cluster members for WebSphere Process Server (WPS)

Troubleshooting


Problem

When you attempt to start a BPEL application, one or more of the AppTarget Java virtual machines (JVMs) in the cluster do not start the module, and transaction rollbacks are seen in the AppTarget (cluster member) log files.

Symptom

You see the following information in the AppTarget (cluster member) log files:

BpelEngine E com.ibm.db2.jcc.am.SqlTransactionRollbackException: DB2 SQL Error: SQLCODE=-911, SQLSTATE=40001, SQLERRMC=2, DRIVER=3.61.86

You see the following enterprise archive (EAR) expansion error for the node agent:

AppDataMgr A ADMA7101E: An unexpected error occurred for the EAR expansion process. The enterprise archive (EAR) file




After the BPEL application is redeployed, the application fails to start on some cluster members in a large clustered environment. Also, database locks and rollbacks are seen. You see messages on the node agent that are similar to the following text:

AppDataMgr A ADMA7101E: An unexpected error occurred for the EAR expansion process......

BPE tracing from at least one of the node agents displays a stack trace such as the following:

TraceBPE 3 com.ibm.bpe.management.application.process.SCDLProcessComponentModifiedSyncTask.syncUpdates(SCDLProcessComponentModifiedSyncTask.java:189) About to sync updates for Business Process application: 'myAppName', process: myCell.myCluster.myNode
TraceBPE 3 com.ibm.bpe.management.application.process.SCDLProcessComponentModifiedSyncTask.syncUpdates(SCDLProcessComponentModifiedSyncTask.java:273) Exception thrown in RequiredModelMBean while trying to invoke operation updateApplication
javax.management.MBeanException: Exception thrown in RequiredModelMBean while trying to invoke operation updateApplication
at javax.management.modelmbean.RequiredModelMBean.invokeMethod(RequiredModelMBean.java:1112)
at javax.management.modelmbean.RequiredModelMBean.invoke(RequiredModelMBean.java:966)
at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:848)
at com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(JmxMBeanServer.java:773)
at com.ibm.ws.management.AdminServiceImpl$1.run(AdminServiceImpl.java:1331)
at com.ibm.ws.security.util.AccessController.doPrivileged(AccessController.java:118)
at com.ibm.ws.management.AdminServiceImpl.invoke(AdminServiceImpl.java:1224)
at com.ibm.ws.management.connector.AdminServiceDelegator.invoke(AdminServiceDelegator.java:181)
at com.ibm.ws.management.connector.ipc.CallRouter.route(CallRouter.java:242)
at com.ibm.ws.management.connector.ipc.IPCConnectorInboundLink.doWork(IPCConnectorInboundLink.java:353)
at com.ibm.ws.management.connector.ipc.IPCConnectorInboundLink.ready(IPCConnectorInboundLink.java:132)
at com.ibm.ws.ssl.channel.impl.SSLConnectionLink.determineNextChannel(SSLConnectionLink.java:1049)
at com.ibm.ws.ssl.channel.impl.SSLConnectionLink$MyReadCompletedCallback.complete(SSLConnectionLink.java:643)
at com.ibm.ws.ssl.channel.impl.SSLReadServiceContext$SSLReadCompletedCallback.complete(SSLReadServiceContext.java:1784)
at com.ibm.ws.tcp.channel.impl.AioReadCompletionListener.futureCompleted(AioReadCompletionListener.java:165)
at com.ibm.io.async.AbstractAsyncFuture.invokeCallback(AbstractAsyncFuture.java:217)
at com.ibm.io.async.AsyncChannelFuture.fireCompletionActions(AsyncChannelFuture.java:161)
at com.ibm.io.async.AsyncFuture.completed(AsyncFuture.java:138)
at com.ibm.io.async.ResultHandler.complete(ResultHandler.java:204)
at com.ibm.io.async.ResultHandler.runEventProcessingLoop(ResultHandler.java:775)
at com.ibm.io.async.ResultHandler$2.run(ResultHandler.java:905)
at com.ibm.ws.util.ThreadPool$Worker.run(ThreadPool.java:1646)
Caused by: com.ibm.websphere.management.exception.AdminException: com.ibm.bpe.api.UnexpectedFailureException: CWWBA0010E: Unexpected exception during execution.
at com.ibm.bpe.management.application.process.ProcessApplicationManager.checkIfProcessTemplatesRemovable(ProcessApplicationManager.java:598)
at com.ibm.bpe.management.application.process.ProcessApplicationManager.updateApplication(ProcessApplicationManager.java:321)
at com.ibm.bpe.framework.ProcessContainer.updateApplication(ProcessContainer.java:1970)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:60)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:37)
at java.lang.reflect.Method.invoke(Method.java:611)
at sun.reflect.misc.Trampoline.invoke(MethodUtil.java:49)
at sun.reflect.GeneratedMethodAccessor51.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:37)
at java.lang.reflect.Method.invoke(Method.java:611)
at sun.reflect.misc.MethodUtil.invoke(MethodUtil.java:256)
at javax.management.modelmbean.RequiredModelMBean.invokeMethod(RequiredModelMBean.java:1085)
... 21 more
Caused by: com.ibm.bpe.api.UnexpectedFailureException: CWWBA0010E: Unexpected exception during execution.
at com.ibm.bpe.admin.AdminService.getNotStoppedTemplateNamesB(AdminService.java:1134)
at com.ibm.bpe.management.application.process.ProcessApplicationManager.checkIfProcessTemplatesRemovable(ProcessApplicationManager.java:551)
... 33 more
Caused by: com.ibm.bpe.database.TomSQLException: com.ibm.db2.jcc.am.SqlTransactionRollbackException: DB2 SQL Error: SQLCODE=-911, SQLSTATE=40001, SQLERRMC=2, DRIVER=3.64.82
at com.ibm.bpe.database.ProcessCellMap.selectDbByPTID(ProcessCellMap.java:242)
at com.ibm.bpe.database.Tom.getProcessCellMaps(Tom.java:5472)
at com.ibm.bpe.admin.AdminService.filterNonMultiCellTemplateNames(AdminService.java:4040)
at com.ibm.bpe.admin.AdminService.getNotStoppedTemplateNamesB(AdminService.java:1128)
... 34 more


Note: The exact BPE methods can vary. However, the updateApplication path is common.
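To check several log files for the two symptoms quoted above, a small scan can help. The following is a minimal sketch in Python; the function name find_symptoms is hypothetical, and the patterns are taken only from the messages shown in this document. In DB2, SQLERRMC carries the reason code for SQLCODE -911: 2 indicates a deadlock and 68 indicates a lock timeout.

```python
import re

# Patterns copied from the messages quoted in this document.
SQL_ROLLBACK = re.compile(r"SQLCODE=-911, SQLSTATE=40001, SQLERRMC=(\d+)")
EAR_EXPANSION = "ADMA7101E"


def find_symptoms(lines):
    """Return (rollback_reason_codes, ear_expansion_error_count) for log lines."""
    reason_codes = []
    ear_errors = 0
    for line in lines:
        match = SQL_ROLLBACK.search(line)
        if match:
            # Keep the DB2 reason code (SQLERRMC) as a string, for example "2".
            reason_codes.append(match.group(1))
        if EAR_EXPANSION in line:
            ear_errors += 1
    return reason_codes, ear_errors
```

Pass the function any iterable of lines, for example an open log file. Rollbacks in the cluster member logs together with ADMA7101E in the node agent logs match the symptoms described here.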

Cause

The lack of synchronization of WebSphere Application Server nodes between the uninstall and the reinstall of custom applications causes the update path to be used. This scenario occurs because WebSphere Application Server optimizes the uninstall and reinstall calls into a single update call. For BPEL applications, this scenario is not desirable because of a restriction in the current design, which applies especially to large clusters. The following information explains why this scenario is not desirable:

  • The first consideration is the current product design that is specific to the deployment of BPEL applications. The installation and uninstallation of business process and human task templates is a two-step process. First, applications are installed using the WebSphere Administrative Console and the applications are written into the WebSphere configuration. Next, the configuration is synchronized with the nodes. During the node synchronization process, the templates are written into or deleted from the Business Process Choreographer database (BPEDB). For these changes to occur, at least one of the cluster members has to be running. To ensure consistency between the installed applications and the templates that are in the database, it is very important that the nodes are correctly synchronized.

  • In WebSphere Application Server, the synchronization of the nodes returns after the configuration is synchronized down to the nodes. The expansion of the binaries on the node and the deployment or deletion of the templates on the nodes is started asynchronously. If the expansion and deployment takes a long time, it is possible that the node synchronization API returns before the templates are written into or deleted from the database.

  • The lack of synchronization between the uninstall and the install also leads to an optimization at the application server level. At this level, the application is updated rather than uninstalled and then installed. Your scripts might request an uninstall action and an install action, but without an immediate synchronization between them, the two actions are merged into one update.

  • The resulting unintended update path, combined with high levels of parallel expansion on the nodes (for example, in an environment with 6 nodes), can expose a rare timing issue in which database lock contention causes a rollback. Usually this scenario is not noticed because a retry completes successfully. However, in even rarer circumstances, the retry also fails, which leads to the expansion failure.
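Because the node synchronization call can return before the expansion on the nodes is finished, a deployment script might poll for readiness before it continues. The following is a minimal Python sketch that assumes the wsadmin AdminApp object is passed in as admin_app; the function name, timeout values, and the sleep parameter are illustrative assumptions. AdminApp.isAppReady reports whether the application is distributed and ready to run. This document does not state whether that check also covers the template changes in BPEDB, so treat it as an approximation.

```python
import time


def wait_until_app_ready(admin_app, app_name, timeout_seconds=600,
                         poll_seconds=10, sleep=time.sleep):
    """Poll AdminApp.isAppReady until it reports 'true' or the timeout elapses."""
    waited = 0
    while True:
        # wsadmin returns the string 'true' or 'false', not a boolean.
        if str(admin_app.isAppReady(app_name)).lower() == "true":
            return True
        if waited >= timeout_seconds:
            return False
        sleep(poll_seconds)
        waited += poll_seconds
```

Inside a wsadmin Jython session, the AdminApp global would be passed as admin_app. Older Jython levels might need small syntax adjustments, for example replacing True and False with 1 and 0.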

Resolving The Problem

A best practice is to add an additional synchronization between the uninstall and the install, which is step 3 in the following outline. The following steps are a sample outline of a custom deployment script:

  1. Stop the module.
  2. Uninstall the module.
  3. Save and synchronize.
  4. Install the module.
  5. Save.
  6. Synchronize.
  7. Start the module.
  8. Validate that the module was started successfully.
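The outline above could be sketched as follows. This is a minimal Python sketch, not a definitive script. It assumes the wsadmin objects AdminApp, AdminConfig, and AdminControl are passed in as parameters, and that the caller supplies stop_module, start_module, and is_started, which are hypothetical callables because stopping, starting, and validating across a cluster depend on the environment. The function names redeploy and sync_active_nodes are also illustrative.

```python
def sync_active_nodes(admin_control, node_names):
    """Run the NodeSync MBean 'sync' operation on each node with a running node agent."""
    synced = []
    for node in node_names:
        # completeObjectName returns an empty string when no matching MBean is active.
        sync_mbean = admin_control.completeObjectName("type=NodeSync,node=%s,*" % node)
        if sync_mbean:
            admin_control.invoke(sync_mbean, "sync")
            synced.append(node)
    return synced


def redeploy(admin_app, admin_config, admin_control, app_name, ear_path,
             install_options, node_names, stop_module, start_module, is_started):
    """Follow the eight-step outline; return True if the module is reported as started."""
    stop_module(app_name)                          # 1. Stop the module.
    admin_app.uninstall(app_name)                  # 2. Uninstall the module.
    admin_config.save()                            # 3. Save and synchronize, so the
    sync_active_nodes(admin_control, node_names)   #    uninstall reaches the nodes first.
    admin_app.install(ear_path, install_options)   # 4. Install the module.
    admin_config.save()                            # 5. Save.
    sync_active_nodes(admin_control, node_names)   # 6. Synchronize.
    start_module(app_name)                         # 7. Start the module.
    return bool(is_started(app_name))              # 8. Validate the start.
```

The save and synchronize in step 3 is what keeps the uninstall and the install from being merged into a single update. Inside wsadmin, the AdminApp, AdminConfig, and AdminControl globals would be passed as the first three arguments, and install_options would be an option string such as '[-appname myAppName]'.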

Product: WebSphere Process Server
Component: Business Process Choreographer
Platform: AIX, HP-UX, Linux, Solaris, Windows, z/OS
Version: 7.0; 6.2

Product Synonym

WPS

Document Information

Modified date:
15 June 2018

UID

swg21647617