Restarting an application server in recovery mode

When an application server instance with active transactions in progress restarts after a failure, the transaction service uses recovery logs to complete the recovery process. These logs, which each transactional resource maintains, are used to rerun any InDoubt transactions and return the overall system to a self-consistent state.

[z/OS]

Before you begin

If you are migrating from a previous version of the product, make sure that the REC parameter is included on the JCL procedure statement for the controller as either REC=N or REC=Y. If the JCL procedure does not specify either REC=N or REC=Y, the server does not restart in recovery mode even if you specify the -recovery option.

If the JCL procedure includes REC=N, the setting automatically changes to REC=Y if you specify -recovery when you restart the server. REC=N is automatically included on the JCL procedure if you did not migrate from a previous version of the product. The following is an example of what your updated PROC statement might look like:

//BBO6ACR  PROC PARMS=' ',REC=N,Z=BBO6ACRZ       
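As a rough illustration of the check described above, the following shell sketch tests whether the text of a PROC statement carries REC=N or REC=Y. The function name has_rec_parameter is hypothetical, and the sketch assumes the whole statement is available as a single line of text (it does not handle JCL continuation lines).

```shell
#!/bin/sh
# Sketch only: report whether a JCL PROC statement includes REC=N or REC=Y.
# has_rec_parameter is a hypothetical helper, not part of the product.

has_rec_parameter() {
    # $1 = text of the PROC statement (assumed to be on one line)
    printf '%s\n' "$1" | grep -Eq '(^|[ ,])REC=[NY]([ ,]|$)'
}

proc_line="//BBO6ACR  PROC PARMS=' ',REC=N,Z=BBO6ACRZ"

if has_rec_parameter "$proc_line"; then
    echo "REC parameter present"
else
    echo "REC parameter missing: the -recovery option would have no effect"
fi
```

A statement without the REC parameter makes the function return a non-zero status, which is the case where the server would not restart in recovery mode.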

About this task

When you restart an application server in recovery mode:
  • Transactional resources complete the actions in their recovery logs and then shut down. This action frees up any resource locks that the application server held prior to the failure.
  • During the recovery period, only the subset of application server functions that are necessary for transactional recovery to proceed are available.
  • The application server does not accept new work during the recovery process.
  • The application server shuts down when the recovery is complete.

This recovery process begins as soon as all of the necessary subsystems within the application server are available. If the application server is not restarted in recovery mode, the application server can start accepting new work as soon as the server is ready, which might occur before the recovery work has completed.

Normally, this process is not a problem. However, situations exist when your operating procedures might not be compatible with supporting recovery work and new work simultaneously. For example, you might have a high availability environment where the work handled by the application server that failed is immediately moved to another application server. This backup application server then exclusively processes the work from the application server that failed until recovery has completed on the failed application server and the two application servers can be re-synchronized. In this situation, you might want the failing application server to only perform its transactional recovery process and then shut down. You might not want this application server to start accepting new work while the recovery process is taking place.

To prevent the assignment of new work to an application server that is going through its transaction recovery process, restart the application server in recovery mode.

When you restart a failed application server, the node agent for the node on which the failed application server resides must be running before you can restart that application server.

[z/OS]Avoid trouble: When an application server stops as part of normal shutdown processing, message WSVR0024I: Server xxxxxxxx PROCESS xxxxxxxx stopped is sent to the system log file. If the server user IDs have ALTER access to the appropriate MVSADMIN.* profiles in the facility class, the resource manager registration entry that is associated with this instance of the application server is removed from the RRS logs. However, if the server user IDs do not have ALTER access to those profiles, the registration entry is not removed from the RRS logs.

If the resource manager registration entry was deleted from the RRS logs, on a subsequent application server start, a cold start is performed. However, you cannot perform a cold start with RRS if you are starting the application server in recovery mode.

[z/OS]With this service release, you can cold start the server in recovery mode only on the system where the server was configured.

If you want to be able to restart an application server in recovery mode, you must perform the following steps before a failure occurs, and then restart the application server to enable your configuration changes:

Procedure

  • If the server is monitored by a node agent, you must clear the Automatic restart option for that server.
    Clearing this option prevents the node agent from automatically restarting the server in normal mode, before you have a chance to start it in recovery mode.
    1. In the administrative console, click Servers > Server Types > WebSphere application servers > server_name.
    2. In the Server Infrastructure section, click Java and process management > Monitoring Policy.
    3. Clear the Automatic restart option.
  • If a catastrophic failure occurs that leaves InDoubt transactions, issue the startServer server_name -recovery command from the command line.
    This command restarts the server in recovery mode. You must issue the command from the profile_root/bin directory for the profile with which the server is associated.
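The restart step can be sketched as a small shell wrapper. This is an illustration under stated assumptions, not a product-supplied script: the PROFILE_ROOT and SERVER_NAME values and the default path are placeholders, the function names are hypothetical, and the sketch assumes a UNIX-style shell where the command is invoked as startServer.sh. By default it only prints the command; set DRY_RUN=0 to run it.

```shell
#!/bin/sh
# Sketch only: restart a server in recovery mode from its profile's bin directory.
# PROFILE_ROOT, SERVER_NAME, and the default path are placeholders for your environment.

recovery_command() {
    # $1 = profile root, $2 = server name
    printf '%s/bin/startServer.sh %s -recovery\n' "$1" "$2"
}

restart_in_recovery_mode() {
    cmd=$(recovery_command "$1" "$2")
    if [ "${DRY_RUN:-1}" = "1" ]; then
        # Default: show what would be run without touching the server.
        echo "Would run: $cmd"
    else
        # Issue the command from profile_root/bin, as the procedure requires.
        (cd "$1/bin" && ./startServer.sh "$2" -recovery)
    fi
}

restart_in_recovery_mode \
    "${PROFILE_ROOT:-/opt/IBM/WebSphere/AppServer/profiles/AppSrv01}" \
    "${SERVER_NAME:-server1}"
```

Keeping the dry run as the default makes it harder to start a server by accident while you are still confirming the profile path and server name.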

Results

The application server restarts in recovery mode, performs transactional recovery, and shuts down. Any resource locks that the application server held prior to the failure are released.

What to do next

Configure the integrated high availability support for the transaction service subcomponent for peer recovery of transactions.