Troubleshooting JPA deadlocks and transaction timeouts

Database deadlocks and transaction timeouts are the result of contention between two or more clients attempting to access the same database resource. Deadlock is a special case where a circular blocking condition between two or more clients, each blocked by the others, but no one can proceed. Usually these phenomena are not programming errors. They are caused by business logic accessing database resources in a complex and inter-depending fashions.

About this task

By using the following best practices and strategies, these conditions can be minimized.

Procedure

  • Identify the affected database resources that caused the error.
    1. Examine the exceptions (if any) when the error occurs can give you clues on the failing entities that caused in the failing condition.
    2. If the WebSphere® Application Server default JPA provider is used, you can enable the product trace group "openjpa.jdbc.SQL=all" to collect the SQL statements and thread information of the database resources that are in question.
      By collating information, you can find which clients and entities are the artifacts that caused the failing condition.
    3. For more complex usage scenario, you can consult the database documentation for any specialized tool or techniques that can help identify the objects and transaction that are causing the data contention issues.
  • When the problems are identified, examine the business logic that uses these resources to determine that their usage does not cause any prolong contention.
    There is no one prescribed solution to resolve deadlock and timeout problems. It highly depends on the complexity of the business logic and entity relationships. The basic concept to resolve deadlocks and timeouts is to minimize the time of resources being held more than it must and allow other clients to access the same resource. If contention is unavoidable, application must establish means for recovery when these failing conditions occur. The following are strategies that might help you minimize the problem conditions and determine if you can apply one or more to resolve the problem:
    1. Make sure that the business logic does not lock entities more than it must.
      A JPA application that are mostly read-only should use optimistic lock semantics offered by JPA by default. JPA 2.0 introduces pessimistic locking functions that allow applications to explicitly lock entity pessimistically on demand. You should optimize the number of pessimistic locks that are applied at any one time. Over locking increases the chance of deadlocks and timeouts, and degrades application concurrency and throughput.
    2. Avoid setting data source connection isolation level more restrictive than it requires.
      Isolation level has the same affect that pessimistic lock semantics do at the connection level. JPA requires the minimum of "read-committed" isolation level to achieve basic data integrity objective. WebSphere Application Server connection manager default isolation level is database-specific. See the Programming interfaces topic for the JDBCConnectionSpec API specification information.
    3. Minimize the duration of any active transaction.
      Extended active transaction has the potential of holding out resources that are required by other clients. Commit any transaction as soon as it has all the resources completed.
    4. Optimize business logic to avoid entity inter-dependency.
      If two sets of business logic uses similar resources, consider updating the resources in the same order to void the classic deadlock circular dependency. This depends on the isolation level being used. It has the same effect as applying implicit lock to the resources as describe in the next strategy.
    5. Use pessimistic lock to synchronize concurrent access to common resource.
      When the previous strategies do not apply to your situation, pessimistic locking is an alternative to synchronize access to common resources used by different concurrent clients. The application can use JPA 2.0 pessimistic locking functions to block other clients from accessing the common resource until the transaction is committed or rolled back. Increase of pessimistic lock usage can affect concurrency and throughput. WebSphere Application Server JPA Access Intent also provides another alternative to lock data resources based on EJB invocation or user-defined task name.
    6. Handle and recover from deadlock and timeout exception.
      Most database servers have a built-in deadlock prevention mechanism. When it detects a deadlock condition, the typical strategy used by the database is to cancel one of the requests and let the other one succeed. However the request that is terminated is database-specific and non-deterministic. If an application allows, it is a good practice to recover the deadlock condition and attempt to retry the operation.

Results

For more information, review the Apache OpenJPA User Guide.