Transaction troubleshooting tips

Use these tips to help you troubleshoot problems with the WebSphere® Application Server transaction service.

For messaging problems specific to WebSphere Application Server nodes, see other topics in the documentation, such as the topic about messaging troubleshooting tips, and the WebSphere Application Server Support web page.

Peer recovery fails to acquire a lock

If peer recovery of a transaction fails to acquire a file lock that is needed for recovery processing, messages such as the following might be logged:
[10/26/04 8:41:38:887 CDT] 00000029 CoordinationL A   CWWTR0100_GENERIC_ERROR
[10/26/04 8:41:39:100 CDT] 00000029 RecoveryHandl A   CWWTR0100E: An attempt to 
acquire a file lock needed to perform recovery processing failed. Either the 
target server is active or the recovery log configuration is incorrect
....
[10/26/04 8:42:34:921 CDT] 00000027 HAGroupImpl   I   CWRHA0130I: The local 
member of group GN_PS=fwsitkaCell01\fwwsaix1Node01\GriffinServer3,
IBM_hc=GriffinCluster,type =WAS_TRANSACTIONS has indicated that is it not 
alive. The JVM will be terminated.
[10/26/04 8:42:34:927 CDT] 00000027 SystemOut     O Panic:component requested 
panic from isAlive
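When a log is large, it can help to search it for the message IDs shown in the excerpt above. The following is a minimal sketch, not a supported tool. It assumes the log line layout seen in the excerpt (bracketed timestamp, eight-digit thread ID, component, one-letter level, message). The function name `find_lock_failures` is hypothetical.

```python
import re

# Message IDs taken from the log excerpt above.
LOCK_FAILURE_IDS = ("CWWTR0100E", "CWRHA0130I")

# Assumed layout: [timestamp] threadId component level message
LOG_LINE = re.compile(
    r"^\[(?P<timestamp>[^\]]+)\]\s+(?P<thread>[0-9a-fA-F]{8})\s+"
    r"(?P<component>\S+)\s+(?P<level>[A-Z])\s+(?P<message>.*)$"
)


def find_lock_failures(lines):
    """Return (timestamp, message_id, component) for each matching log line."""
    hits = []
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # continuation lines have no header and are skipped
        message = match.group("message")
        for message_id in LOCK_FAILURE_IDS:
            if message.startswith(message_id + ":"):
                hits.append(
                    (match.group("timestamp"), message_id, match.group("component"))
                )
    return hits
```

To use the sketch, pass it an open log file, for example `find_lock_failures(open(path))`, where `path` is the location of the server log on your system.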
To troubleshoot the cause of failure to acquire the file lock, check the following factors:
  • If you have enabled failover of transaction log recovery on the server cluster and are using a NAS device for the transaction logs, check that the DFS level on your machine is compatible with the NAS DFS level. If the two levels are not compatible, the transaction logs cannot be accessed.
  • If you are running as non-root, check that the ID numbers of the non-root user and group match on all machines involved with peer recovery.
  • If you have a policy defined for transactions, review the policy to ensure that it gives control to the correct servers. You might have to add servers to the preferred server list, or reorder the list.
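For the non-root check in the list above, one approach is to record the numeric user and group IDs on each machine and compare them. The following is a minimal sketch under stated assumptions: it runs on a UNIX-like system, you run `local_identity` on each machine as the user that runs the application server, and you gather the results yourself. Both function names are hypothetical.

```python
import os
import socket


def local_identity():
    """Return (hostname, uid, gid) for the user running this script."""
    return socket.gethostname(), os.getuid(), os.getgid()


def mismatched_hosts(identities):
    """Given {hostname: (uid, gid)}, return the hosts whose IDs differ
    from those of the first entry."""
    if not identities:
        return []
    reference = next(iter(identities.values()))
    return sorted(host for host, ids in identities.items() if ids != reference)
```

An empty list from `mismatched_hosts` means that the collected IDs agree on every machine that you included.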

Client requests and web services transaction protocol messages are not routed to the appropriate server

When the client is not part of the same administrative cell as the target service, and you require transaction affinity or transaction high availability, you can use the WebSphere Application Server proxy server topology to route client requests and web services transaction protocol messages to the appropriate server. In this topology, the client communicates with a WebSphere Application Server proxy server, which dynamically routes the client requests and web services transaction protocol messages to the appropriate server in a WebSphere Application Server cluster. For this scenario to work, the proxy server must be configured in the same administrative cell as the target service.
Avoid trouble: WebSphere Application Server does not provide on demand router (ODR) support for this scenario. Only the WebSphere Application Server proxy server can act as a proxy for web service transaction endpoints.

XAER_NOTA exception logged after server fails

If an application server fails before the end transaction record is written to disk, the transaction log might or might not contain the end record when the server restarts.

WebSphere Application Server does not force the end record to the log, so the operating system or network file system decides when to write the record to disk. The record is forced if the server is shut down cleanly. The transaction service is designed to cope with the case where the end record is never written to disk. In that case, it receives an XAER_NOTA return code from the database and logs a message such as the following:
[date time] 00000057 WSRdbXaResour E   CWWRA0302E:  XAException occurred.  
Error code is: XAER_NOTA (-4).  Exception is: XAER_NOTA

If there is a transaction without an end record in the transaction log, the transaction service tries to check with the database. If the transaction has completed, the database indicates that there is nothing to complete (XAER_NOTA). This behavior is normal, and is not an error.
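The behavior described above can be illustrated with a small simulation. This is a conceptual sketch only, not the transaction service implementation. `FakeResource`, its `complete` method, and `complete_without_end_record` are hypothetical stand-ins. The value -4 for XAER_NOTA is the one shown in the log message.

```python
XAER_NOTA = -4  # error code shown in the CWWRA0302E message above


class XAException(Exception):
    """Minimal stand-in for an XA exception that carries an error code."""

    def __init__(self, error_code):
        super().__init__(f"XA error {error_code}")
        self.error_code = error_code


class FakeResource:
    """Hypothetical resource manager that knows only unfinished transactions."""

    def __init__(self, pending_xids):
        self.pending_xids = set(pending_xids)

    def complete(self, xid):
        if xid not in self.pending_xids:
            # The resource has no record of the transaction.
            raise XAException(XAER_NOTA)
        self.pending_xids.remove(xid)


def complete_without_end_record(resource, xid):
    """Try to finish a transaction that has no end record in the log."""
    try:
        resource.complete(xid)
        return "completed"
    except XAException as exc:
        if exc.error_code == XAER_NOTA:
            # Nothing left to complete: treated as normal, not as an error.
            return "already-complete"
        raise  # any other XA error is a real problem
```

In the sketch, only XAER_NOTA is treated as a normal outcome. Any other error code is raised to the caller.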

Clean shutdown message is not in the message log

When an application server shuts down, any active transactions are rolled back. If all transactions complete successfully, message CWWTR0105I is logged, indicating a clean shutdown of the transaction service, and the next server restart does not need any recovery activity. If an application server shuts down and message CWWTR0105I is not logged, the absence of the message does not indicate a problem, but it does mean that recovery activity is required when the server restarts.
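To decide whether the next restart involves recovery activity, you can check the log lines from the most recent shutdown for message CWWTR0105I. The following is a minimal sketch. The function name is hypothetical, and it assumes that you pass only the lines written during the shutdown in question, because an older CWWTR0105I entry would otherwise give a misleading result.

```python
CLEAN_SHUTDOWN_ID = "CWWTR0105I"  # clean shutdown of the transaction service


def recovery_needed_on_restart(shutdown_log_lines):
    """Return True when no clean-shutdown message appears in the given lines."""
    return not any(CLEAN_SHUTDOWN_ID in line for line in shutdown_log_lines)
```

A result of `True` means only that recovery runs at the next restart. It does not indicate a problem.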

Before you uninstall the product, shut down all application servers cleanly to avoid data integrity problems.

[z/OS] Ensure that recovery from an RRS or XA resource perspective is not needed
On the z/OS operating system, the clean shutdown message CWWTR0105I is never logged. To ensure that recovery from an RRS or XA resource perspective is not needed, you can restart the application server in recovery mode, in the system in which it is configured. In recovery mode, if there are any outstanding units of recovery (URs), the application server completes the URs, then shuts down. If there are no outstanding URs, the application server starts, then shuts down normally. Therefore, to ensure that all recovery has occurred, restart the server in recovery mode and wait until a normal shutdown.
[z/OS]

Hung servers following the failover of large cross-cluster or cross-node global transactions in a high availability environment

If a failover occurs, for example because of an LPAR failure, some of the surviving application servers can become unresponsive.

To resolve this problem, cancel and restart the application servers. If necessary, force restart the application servers.