Using Connection information in WebSphere trace files to troubleshoot J2CA0045E and J2CA0020E or connection wait time-out problems

Troubleshooting

Problem

J2CA0045E and J2CA0020E errors can be caused by many problems. They are showing a time-out condition where a resource or a managed connection is not available to fulfill a connection request. In this technote we will use connection information in WebSphere Application Server trace files to troubleshoot J2CA0045E and J2CA0020E or connection wait time-out problems.

Cause

In a trace file with WAS.j2c=all=enabled:RRA=all=enabled:WAS.database=all=enabled, you will get dumps of the connection pooling that look like the following:

[7/20/05 23:40:01:450 EDT] 6a8068fa d UOW= source=com.ibm.ejs.j2c.poolmanager.PoolManager org=IBM prod=WebSphere component=Application Server reserve(), Pool contents ==> PoolManager name:jdbc/xxxxxx
PoolManager object:1424371533
Total number of connections: 45 (max/min 45/20)
Pool Bucket configuration
Shared Buckets: 200
UnShared Buckets: 45
The waiter count is 38
The mcWrappers in waiter queue []
Shared Connection information

Unshare Connection information

Figure 1

Resolving The Problem

1) See if you have Waiters:
Waiters are connection requests waiting on a managed connection. If a connection request is made and a managed connection is not allocated for that request, the request will timeout and throw an error combination like below in your SystemOut.log and or your trace file:

J2CA0045E: Connection not available while invoking method queueRequest for resource jdbc/xxxxxx.
J2CA0020E: The Connection Pool Manager could not allocate a Managed Connection: com.ibm.websphere.ce.j2c.ConnectionWaitTimeoutException: Connection not available, Timed out waiting for 10988

One way to use the connection pool section in the trace file to troubleshoot these kinds of errors is to enter the trace file and search for the word "waiter". As you move down through the file, you will expect to see the waiter count to increase and decrease if in fact you do have waiters. If you do not have any waiters, the waiter count entry will not appear in the trace for a particular connection object listing and you can be assured that your connection pooling is sized properly and you do not have contention for resources. If you do see waiters, you have connection requests waiting on managed connections.

2) Discover if the waiters you have is on one or more PoolManager Object:
Another important point to make at this point is that there can be more than one PoolManager Object. There is one pool manager object per data source that is configured for the application server and the number assigned to that PoolManager object will remain the same for the duration of the application server run from startup to stop of the same.

Once you discover you have waiters, you must find out if they are for one or more PoolManager Objects. You can determine how many pool manager objects you have by searching on the keyword PoolManager object then note the numbers that follow. In the case of figure 1, the number is: 1424371533.

Once you have determined how many PoolManager Objects you have, you must determine if the waiters are for one Object or many. You do this by going through each of the PoolManager Objects to see if in fact that Object has the waiters. You can usually tell after going through a few of them or try the first one in the file and the last one in the file.

3) Assess the damage:
Look in the area called Shared Connection information that is below all the PoolManger Object information. Pull that information out into a separate file and then sort it by thread ID or manually move the same thread ID lines together. You can see problems as such in some cases:

A) J2CA0086W errors can be found to look as follows in the Shared Connection information section:

com.ibm.ws.LocalTransaction.LocalTranCoordImpl@4423ef2d MCWrapper id 3705a8c8 Managed connection com.ibm.ws.rsadapter.spi.WSRdbManagedConnectionImpl@2612ef7a State:STATE_TRAN_WRAPPER_INUSE Thread Id: 2a1368fa Thread Name: Servlet.Engine.Transports : 39 Handle count Start time inuse Wed Jul 20 23:24:32 EDT 2005 Time inuse 929 (seconds)

com.ibm.ws.LocalTransaction.LocalTranCoordImpl@4423ef2d MCWrapper id 67cc68ce Managed connection com.ibm.ws.rsadapter.spi.WSRdbManagedConnectionImpl@6f4128ce State:STATE_TRAN_WRAPPER_INUSE Thread Id: 2a1368fa Thread Name: Servlet.Engine.Transports : 39 Handle count Start time inuse Wed Jul 20 23:24:32 EDT 2005 Time inuse 929 (seconds)

What do we see here? We see that the transaction type is local for both entries. We see the thread ID is indicating the same thread. We also see that the managed connection is a different connection. Keep in mind that we are looking in the shared connection area. So, what we have here is two different connections that are held by the same thread and the same local transaction. The result of this is that these two separate connections will be held until the local transaction is complete. If our max connections for the data source is 10, we have just decreased it to 8 for the duration of the local transaction above.

B) Nested Local transactions:
In the information below that is taken from the Shared Connection Information section as well, you may see the following:

com.ibm.ws.LocalTransaction.LocalTranCoordImpl@1a3fef31 MCWrapper id 1fea8bc Managed connection com.ibm.ws.rsadapter.spi.WSRdbManagedConnectionImpl@195ee8e4 State:STATE_TRAN_WRAPPER_INUSE Thread Id: 2d9228fa Thread Name: Servlet.Engine.Transports : 40 Handle count Start time inuse Wed Jul 20 23:26:30 EDT 2005 Time inuse 811 (seconds)

com.ibm.ws.LocalTransaction.LocalTranCoordImpl@5f08288c MCWrapper id 21d428c1 Managed connection com.ibm.ws.rsadapter.spi.WSRdbManagedConnectionImpl@3a40e8d2 State:STATE_TRAN_WRAPPER_INUSE Thread Id: 2d9228fa Thread Name: Servlet.Engine.Transports : 40 Handle count Start time inuse Wed Jul 20 23:26:28 EDT 2005 Time inuse 812 (seconds)

What do we see here? We see the same thread ID with two shared connections that are of different connection IDs and two separate transaction IDs. In this case, these are nested local transactions in the same thread.

Because shareable connections are used, even if the app called close() on the first connection it will remain allocated, and when the thread enters a nested LTC and gets a connection, it cannot re-use the first connection because of the differing LTC contexts. Each connection is released as the associated LTC comes to an end.

These kind of situations are cause when you pass control of processing in the thread from something like a servlet to another servlet or EJB. Could be several things. Again, we have taken up two connections out of our pool.

Note! If the customer’s code must have this type of nested architecture, they should use unshared connections to fix the problem. Yes, they will see more connections but, the connections will release as soon as the work with them is complete.

Nested Global and Local Transactions:

com.ibm.ws.Transaction.JTA.TransactionImpl@7ebdafd2#tid=510632 MCWrapper id 421fa8c1 Managed connection com.ibm.ws.rsadapter.spi.WSRdbManagedConnectionImpl@1a5d68c8 State:STATE_TRAN_WRAPPER_INUSE Thread Id: 32c4e8fa Thread Name: Servlet.Engine.Transports : 381 Handle count Start time inuse Wed Jul 20 23:36:09 EDT 2005 Time inuse 231 (seconds)

com.ibm.ws.LocalTransaction.LocalTranCoordImpl@69132f2d MCWrapper id 16df28df Managed connection com.ibm.ws.rsadapter.spi.WSRdbManagedConnectionImpl@d5fe8e4 State:STATE_TRAN_WRAPPER_INUSE Thread Id: 32c4e8fa Thread Name: Servlet.Engine.Transports : 381 Handle count Start time inuse Wed Jul 20 23:36:05 EDT 2005 Time inuse 235 (seconds)

What do we see here? We have a local transaction that spawned a global transaction. The same thread number is identified for both. We can tell which one did the spawning by the Start time inuse entry in the entries above. They both have different managed connection numbers so that means is that each of them has a shared connection but not the same one. There is two more connections that are basically dependant on each other out of our max number of connections.

Long Running Transactions:
com.ibm.ws.LocalTransaction.LocalTranCoordImpl@31f36fda MCWrapper id 1e64a8af Managed connection com.ibm.ws.rsadapter.spi.WSRdbManagedConnectionImpl@3d5328a8 State:STATE_TRAN_WRAPPER_INUSE Thread Id: 7b76e8fa Thread Name: Servlet.Engine.Transports : 35 Handle count Start time inuse Wed Jul 20 23:23:17 EDT 2005 Time inuse 1003 (seconds)

What do we see here? We see this transaction in the Shared Connections section is showing us that it has been in use for 1003 seconds. That is a bit much. If you see entries such as these and the customer’s transaction time out is much less than this, you can expect that there is something holding up this transaction like the database or JDBC. Also, note the Handle count is null so it is waiting on something to complete.

Note: Transaction timeout does not kill a transaction when it times out. The transaction timeout will not be enforced until the transaction comes back from the database or what ever it is working into. Therefore, if the transaction time-out is reached, all that means is that when the transaction does come back, it will be rolled back or handled in a non-positive way depending on how the customer has his transactions set up in the deployment descriptor but, the default is to roll the transaction back. Further, if a database, JMS or something else does not perform the tasks requested of it in a transaction before the end of the transaction timeout setting, the transaction timeout. WebSphere Application Server will rollback the transaction. If the transaction does not come back or is not released by the database, JMS or whatever, it will sit there and hold the resources.

Follow on:
If you look in the Shared Connection information or the Unshared Connection information and you see an entry no shared connections or no unshared connections respectively, this does not mean that there are no connections left. This means there are none in use.

Both shared and unshared connections count toward the maximum number of connections set in your data source.

If you see the free connection count, you have to remember the minimum connections. If connections time-out, they will go away down to the minimum connections number.

Other possible causes of the ConnectionWaitTimeout:
The Web container thread pool settings.
1. Setting the isGrowable flag on in the WebContainer thread pool can cause this. If you have too many consumers for the size of the connection pool.
2. Similarly, a Web container thread pool whose maximum is too great for the connection pool, then it can cause this too.

Related Information

Using Local Transactions

J2CA0086W fix

[{"Product":{"code":"SSEQTP","label":"WebSphere Application Server"},"Business Unit":{"code":"BU053","label":"Cloud & Data Platform"},"Component":"Java 2 Connectivity (J2C)","Platform":[{"code":"PF002","label":"AIX"},{"code":"PF010","label":"HP-UX"},{"code":"PF016","label":"Linux"},{"code":"PF027","label":"Solaris"},{"code":"PF033","label":"Windows"}],"Version":"8.5.5;8.5;8.0;7.0;6.1","Edition":"","Line of Business":{"code":"LOB45","label":"Automation"}},{"Product":{"code":"SSNVBF","label":"Runtimes for Java Technology"},"Business Unit":{"code":"BU059","label":"IBM Software w\/o TPS"},"Component":"Java SDK","Platform":[{"code":"","label":""}],"Version":"","Edition":"","Line of Business":{"code":"LOB36","label":"IBM Automation"}}]

Tips

Using Connection information in WebSphere trace files to troubleshoot J2CA0045E and J2CA0020E or connection wait time-out problems

Troubleshooting

Problem

Cause

Resolving The Problem

Related Information

Was this topic helpful?

Document Information

UID

Share your feedback

Need support?