Clustered environment considerations for timer service
In a single-server environment, it is clear which server instance invokes the timeout method of a given bean. In a multi-server clustered environment, additional considerations determine which instance does so.
WebSphere® Application Server implements the Enterprise JavaBeans (EJB) Timer Service. Based on your business needs, you can use persistent timers or non-persistent timers. Persistent timers are helpful when a time-based event requires a timer that outlives the server life cycle and survives server shutdowns and restarts. Persistent timers require a database instance, and previously started persistent timers start automatically when your server starts.
Non-persistent timers do not use a data store and are canceled when the application server is stopped or fails to remain in an active state. Non-persistent timers exist only on the server where they are created. In a clustered environment, if your EJB application automatically creates a non-persistent timer and the application is deployed on multiple servers, each server has its own non-persistent timer that runs within that server environment. A programmatically created non-persistent timer runs only in the cluster member in which it was created.
When configuring a persistent timer in a multi-server clustered environment, consider the following options, which determine the server instance that invokes the timeout method on a given bean:
- Separate timer service database per server process or cluster member. This is the default configuration. Only the server instance or cluster member that created the Timer can access the Timer and run the timeout method of the bean. If the server instance is unavailable, the Timer does not run at the specified time, and does not run until the server is restarted. Also, if an enterprise bean calls the getTimers() method, only those timers created on the server instance are found. This can cause unexpected behavior if the enterprise bean attempts to cancel all timers associated with it; for example, when the enterprise bean is removed. This configuration is NOT recommended for production level systems.
- Shared or common timer service database for the cluster. Timers can be created and accessed on any server process or cluster member. Timers created in one server process are found by the getTimers() method on other server processes in the cluster. When an entity bean is removed, all timers are canceled, regardless of where they were created. However, all timers are executed on a single server in the cluster; that is, the timeout method of the bean is run for all timers on a single server. Which server executes the timers depends on which server process obtains a lock on the common database tables. If the server executing timers becomes unavailable, another server or cluster member takes over and begins executing all timers at their scheduled time. This is the recommended configuration for all production level systems.
Avoid trouble: When using the EJB Timer service in an application that uses multi-threaded database access, the application flow can introduce deadlock problems. To avoid this, use the wsPessimisticUpdate access intent. This access intent causes the finder method in your application to run a select for update statement instead of a generic select. This in turn prevents the lock escalation deadlock that occurs when multiple threads try to escalate their locks to perform an update.
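The difference between the two database configurations described above can be illustrated with a small in-memory model. This is a minimal sketch and not WebSphere code: `TimerStore`, `Server`, and `tryAcquireLease` are hypothetical names, the "database" is a plain list, and the single-runner rule is modeled as a time-limited lease, which is an assumption made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class TimerStoreSketch {

    /** Hypothetical stand-in for a timer service database: timer names plus one lease. */
    static class TimerStore {
        final List<String> timers = new ArrayList<>();
        String leaseOwner;     // server currently allowed to run timers (assumed model)
        long leaseExpiresAt;   // simulated clock value, in milliseconds

        /** Grants or renews the lease if it is free, expired, or already held by the caller. */
        boolean tryAcquireLease(String server, long now, long leaseTimeMs) {
            if (leaseOwner == null || now >= leaseExpiresAt || leaseOwner.equals(server)) {
                leaseOwner = server;
                leaseExpiresAt = now + leaseTimeMs;
                return true;
            }
            return false;
        }
    }

    /** Hypothetical cluster member bound to one store: its own, or one shared with peers. */
    static class Server {
        final String name;
        final TimerStore store;

        Server(String name, TimerStore store) {
            this.name = name;
            this.store = store;
        }

        void createTimer(String timerName) {
            store.timers.add(timerName);
        }

        /** A server sees only the timers held in the store it is bound to. */
        List<String> getTimers() {
            return new ArrayList<>(store.timers);
        }
    }

    public static void main(String[] args) {
        // Separate database per cluster member: timers are invisible to peers.
        Server a = new Server("A", new TimerStore());
        Server b = new Server("B", new TimerStore());
        a.createTimer("nightly-report");
        System.out.println("separate stores, B sees: " + b.getTimers());

        // Shared database: every member sees every timer, but one member runs them.
        TimerStore shared = new TimerStore();
        Server c = new Server("C", shared);
        Server d = new Server("D", shared);
        c.createTimer("nightly-report");
        System.out.println("shared store, D sees: " + d.getTimers());

        long leaseTimeMs = 60_000;
        System.out.println("C acquires at t=0: " + shared.tryAcquireLease(c.name, 0, leaseTimeMs));
        System.out.println("D acquires at t=40000: " + shared.tryAcquireLease(d.name, 40_000, leaseTimeMs));
        // C becomes unavailable and stops renewing, so its lease lapses at t=60000.
        System.out.println("D acquires at t=80000: " + shared.tryAcquireLease(d.name, 80_000, leaseTimeMs));
    }
}
```

In this model, a server bound to its own store never sees timers created elsewhere, while servers bound to the shared store see all timers and take over execution only after the current holder stops renewing.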
Controlling WebSphere Application Server scheduler lease behavior
You can configure the following WebSphere variables to control the scheduler lease behavior. Set these variables in the administrative console.
Click Environment > WebSphere variables > WebSphere_variable_name
Name | Definition | Default Value | Example Value |
---|---|---|---|
scheduler.lease.timems | Length of time (in milliseconds) that the lease remains valid | 60000 (60 seconds) | scheduler/sched1=300000 (5 minutes) |
scheduler.lease.alarmintervalms | Number of milliseconds allowed between attempts to acquire the lease | 40000 (40 seconds) | scheduler/sched1=240000 (4 minutes) |
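As a rough aid for reasoning about the two values in the table, the sketch below parses the example values and computes how much of the lease remains when the next acquisition attempt is due. It rests on two assumptions that the table does not state outright: that an example value consists of a scheduler name, an equals sign, and a millisecond count, and that the alarm interval is meant to be shorter than the lease time (as it is in both the default and the example values). `LeaseSettingsSketch`, `Setting`, `parse`, and `renewalMarginMs` are hypothetical names.

```java
public class LeaseSettingsSketch {

    /** One parsed value: a scheduler name and a duration in milliseconds. */
    static final class Setting {
        final String scheduler;
        final long millis;

        Setting(String scheduler, long millis) {
            this.scheduler = scheduler;
            this.millis = millis;
        }
    }

    /** Parses a value such as "scheduler/sched1=300000" (format assumed from the table). */
    static Setting parse(String value) {
        int eq = value.lastIndexOf('=');
        if (eq <= 0 || eq == value.length() - 1) {
            throw new IllegalArgumentException("expected name=milliseconds: " + value);
        }
        long millis = Long.parseLong(value.substring(eq + 1).trim());
        if (millis <= 0) {
            throw new IllegalArgumentException("duration must be positive: " + value);
        }
        return new Setting(value.substring(0, eq).trim(), millis);
    }

    /** Lease time remaining when the next acquisition attempt is due; negative means a gap. */
    static long renewalMarginMs(long leaseTimeMs, long alarmIntervalMs) {
        return leaseTimeMs - alarmIntervalMs;
    }

    public static void main(String[] args) {
        Setting lease = parse("scheduler/sched1=300000");
        Setting alarm = parse("scheduler/sched1=240000");

        System.out.println(lease.scheduler + " lease time in minutes: " + lease.millis / 60_000);
        System.out.println("margin with default values (ms): " + renewalMarginMs(60_000, 40_000));
        System.out.println("margin with example values (ms): " + renewalMarginMs(lease.millis, alarm.millis));
    }
}
```

With the default values the margin is 20000 ms, and with the example values it is 60000 ms. A zero or negative margin would mean, under the assumptions above, that the lease could lapse before the next attempt to acquire it.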