Troubleshooting lock timeout exceptions for a multi-partition transaction
The following scenario is an example of a multi-partition transaction that causes a lock timeout exception. The solutions show how you can resolve the problem manually, depending on the state of the transaction.
Before you begin
The following exception displays as a result:
Caused by: com.ibm.websphere.objectgrid.LockTimeoutException:
Local-40000139-DEF8-05EA-E000-64A856931719 timed out waiting
for lock mode S to be granted for map name: TS2_MapP, key: key12
granted = X
lock request queue
->[WXS-40000139-DEF6-FA84-E000-1CB456931719, state = Granted, requested
73423 milli-seconds ago, marked to keep current mode false,
snapshot mode 0, mode = X, thread name = xIOReplicationWorkerThreadPool : 29]
->[Local-40000139-DEF8-05EA-E000-64A856931719, state
= Waiting for 5000 milli-seconds, marked to keep current mode false,
snapshot mode 0, mode = S, thread name = xIOWorkerThreadPool : 28]
dump of all locks for WXS-40000139-DEF6-FA84-E000-1CB456931719
Key: key12, map: TS2_MapP
strongest currently granted mode for key is X
->[WXS-40000139-DEF6-FA84-E000-1CB456931719, state = Granted,
requested 73423 milli-seconds ago, marked to keep current mode false,
snapshot mode 0, mode = X, thread name = xIOReplicationWorkerThreadPool : 29]
dump of all locks for Local-40000139-DEF8-05EA-E000-64A856931719
This message represents the string that is passed as a parameter when the exception is created and thrown.
Procedure
Problem: You see a lock timeout exception and the holder of the lock is a
multi-partition transaction, or the log folder is filling up with log messages.
Diagnosis:
Log messages such as the following repeatedly fill up your log folder:
00000099 TransactionLog I CWOBJ8705I:
Automatic resolution of transaction
WXS-40000139-DF01-216D-E002-1CB456931719
at RM:TestGrid:TestSet2:20 is still waiting for a decision.
Another attempt to resolve the transaction will occur in 30 seconds.
Determine what type of transaction is holding the lock. If the prefix on the transaction identifier is WXS-,
then it indicates a multi-partition transaction. If the prefix on the transaction identifier is
Local-, then it indicates a single-partition transaction.
Cause: The application is likely holding the lock because a commit or rollback did not occur.
Solution: Determine the state of the transaction and how long it has been in that state. Use either the xscmd -c listindoubts command with the -d option (for detailed output) or the transaction MBean.
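The prefix check from the diagnosis can also be applied to the exception text itself. The following is a minimal sketch, not part of the product: the class name LockHolderInspector, its methods, and the regular expression are all hypothetical, and the sketch assumes that the message keeps the layout shown in the example exception, where each lock request queue entry starts with ->[ followed by the transaction identifier and its state. It collects the identifiers of transactions whose lock is in the Granted state and labels each one by its prefix.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not a WebSphere eXtreme Scale API.
public class LockHolderInspector {

    // Assumed entry layout, taken from the example exception text:
    //   ->[WXS-40000139-..., state = Granted, requested ...
    // \s* also tolerates the line breaks that appear inside an entry.
    private static final Pattern GRANTED_ENTRY = Pattern.compile(
        "->\\[((?:WXS|Local)-[0-9A-Fa-f-]+),\\s*state\\s*=\\s*Granted");

    // Labels a transaction identifier by its prefix.
    static String classify(String txId) {
        if (txId.startsWith("WXS-")) {
            return "multi-partition";
        }
        if (txId.startsWith("Local-")) {
            return "single-partition";
        }
        return "unknown";
    }

    // Returns each distinct transaction identifier that holds a granted lock.
    static List<String> grantedHolders(String message) {
        List<String> ids = new ArrayList<>();
        Matcher m = GRANTED_ENTRY.matcher(message);
        while (m.find()) {
            String id = m.group(1);
            if (!ids.contains(id)) {
                ids.add(id);
            }
        }
        return ids;
    }

    public static void main(String[] args) {
        // Excerpt of the example exception message from this topic.
        String message =
            "lock request queue\n"
            + "->[WXS-40000139-DEF6-FA84-E000-1CB456931719, state = Granted, requested\n"
            + "73423 milli-seconds ago, marked to keep current mode false,\n"
            + "snapshot mode 0, mode = X, thread name = xIOReplicationWorkerThreadPool : 29]\n"
            + "->[Local-40000139-DEF8-05EA-E000-64A856931719, state\n"
            + "= Waiting for 5000 milli-seconds, marked to keep current mode false,\n"
            + "snapshot mode 0, mode = S, thread name = xIOWorkerThreadPool : 28]\n";

        for (String id : grantedHolders(message)) {
            System.out.println(id + " -> " + classify(id));
        }
    }
}
```

In an application, the input would typically be the string returned by getMessage() on the caught LockTimeoutException. If the message layout differs in your release, the pattern needs to be adjusted.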