To allow the product to restart on an alternate system,
the following prerequisites must be installed on every system (your
original system as well as any systems intended for recovery) before
reconfiguring the ARM policies to enable peer restart and recovery.
Before you begin
Deprecated feature: Peer Restart and Recovery
(PRR) functionality is deprecated. You should use the integrated high
availability support for the transaction service subcomponent, instead
of Peer Restart and Recovery for transaction recovery. See the topic Transaction
support in WebSphere Application Server for
more information about the integrated high availability support for
the transaction service subcomponent and how to configure it for peer
recovery of transactions being processed on an application server that
fails.
You must also make sure that all of the systems on which
you might need to perform a restart are part of the same RRS log group.
- z/OS® Version 1.2 or higher
- BCP APAR OA01584
- RRS APARs OA02556 and OA2556
- WebSphere® Application Server Version
5 or higher
Installing the prerequisite service updates on all of these
systems will not hinder your current running environment if you want
to continue to only restart in place. However, if this service is
not installed, there is a possibility that the controller will not
be able to move back. OTS will attempt to restart on the alternate
system and fail. If there are any URs that are unresolved with RRS
once this happens, the controller will not be allowed to restart on
the home system until RRS is cancelled on the alternate system. For
more information on OTS and RRS, see z/OS MVS™ Programming:
Resource Recovery.
If you do not plan to use peer restart
and recovery, you do not need to abide by these functional prerequisites.
Your system will instead use the restart-in-place function.
The
following products all support RRS. Individually, they also support
peer restart and recovery, provided that the previously listed prerequisites
are all properly installed:
- DB2® Version 7 or higher
- IMS Version 8 or higher
- CICS® Version 1.3 or higher
- MQSeries® Version 5.2 or higher
In addition to the preceding products,
many JTA XAResource Managers can participate in a peer restart and
recovery of the product. Consult your JTA XAResource Manager's
documentation to determine if it supports restarting on an alternate
system.
Avoid trouble: When setting up the
ARM policy for a sysplex, make sure that both systems have the same
level of the Application Server installed. For example, you cannot
use an application server that is running WebSphere Application Server Version 5.1 to
perform peer restart and recovery for an application server that is
running WebSphere Application Server Version
6.0.1.
Prior to using peer restart and recovery:
- You must ensure that the location service daemon and node agent
are already running on all systems that might be used for recovery.
Otherwise, the recovering server might attempt to recover on a system
that is not running the location service daemon and node agent. If
this happens, the server will fail to start, and recovery will fail.
Clients will see a performance impact if the systems
are running at capacity. In an attempt to minimize the memory and
CPU impact on the alternate system, the enterprise bean and web containers
are not restarted for servers running in peer-restart mode. This means
that application servers that are in the state of being recovered
will not be able to accept any inbound work.
About this task
After the prerequisites are installed, starting a server
on a system to which it was not configured implicitly places the server
into peer restart and recovery mode. If you configured your XA partner
log to write to a non-shared HFS, or if you are using a JTA XAResource
Manager, you need to perform the following steps before starting a
server:
Procedure
- (Required only if you are using a non-shared HFS.) Enable
non-shared HFS support.
When using a non-shared HFS, the
configuration settings must be replicated across the different systems
in the sysplex. This is done automatically by the deployment manager
and node agent. To enable this support, each node agent in your configuration
must be set as a recovery node. This change is made in the administrative
console:
- In the administrative console navigation, select .
- Select a node agent from the list.
- In the Additional Properties section, select File
Synchronization Service.
- In the Additional Properties section, select .
- Select .
- Enter recoveryNode for Name, and
true for Value. The Description field can remain
blank.
- Repeat steps 3-7 for each node agent in your configuration.
- Save your configuration.
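The administrative console steps above could also be scripted. The following is a minimal sketch in wsadmin Jython style, not a definitive implementation. It assumes that the File Synchronization Service of each node agent is stored under the configuration type ConfigSynchronizationService and that custom properties are Property objects created beneath it. The function names has_property and set_recovery_node are hypothetical. The AdminConfig object is passed in as a parameter so that the logic can be tried against a stand-in object before it is run against a real cell.

```python
def has_property(admin_config, parent_id, name):
    """Return True if parent_id already has a custom property called name."""
    for prop_id in admin_config.list('Property', parent_id).splitlines():
        prop_id = prop_id.strip()
        if prop_id and admin_config.showAttribute(prop_id, 'name') == name:
            return True
    return False


def set_recovery_node(admin_config):
    """Add recoveryNode=true to each File Synchronization Service entry.

    admin_config is the wsadmin AdminConfig object, or any object that
    offers the same list, showAttribute, create, and save methods.
    Returns the configuration IDs that were updated.
    """
    updated = []
    # Assumption: ConfigSynchronizationService is the configuration type
    # behind the File Synchronization Service of a node agent.
    service_ids = admin_config.list('ConfigSynchronizationService')
    for sync_id in service_ids.splitlines():
        sync_id = sync_id.strip()
        if not sync_id:
            continue
        # Leave services alone that already carry the property.
        if has_property(admin_config, sync_id, 'recoveryNode'):
            continue
        admin_config.create('Property', sync_id,
                            [['name', 'recoveryNode'], ['value', 'true']])
        updated.append(sync_id)
    # Save only when something changed.
    if updated:
        admin_config.save()
    return updated
```

Inside a wsadmin session that is connected to the deployment manager, the sketch would be invoked as set_recovery_node(AdminConfig). Verify the result in the administrative console before relying on it.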
- (Required only if you are using JTA XAResource Managers.)
Make sure that the appropriate logs and classes are available on the alternate system.
If you plan to use peer restart and recovery, and your applications
access JTA XAResource Managers, you must ensure that the appropriate
logs and classes are available on the alternate system.
- Point the product variable TRANLOG_ROOT to a shared
HFS.
The TRANLOG_ROOT variable must point to a shared
HFS, to which all systems in the cell can write. The XA partner log
is stored here, and the alternate system must be able to read and
update this log.
- In the administrative console, click server_name.
- Under Container Services, click Transaction Service.
- Enter the directory of the shared HFS in the Transaction
log directory field.
- Store the driver (for example, a JDBC driver, JMS provider, or
JCA resource adapter) for each JTA XAResource Manager in an
HFS that is readable by all systems in the cell.
For example,
if your connector is a JDBC driver for a database, the driver would
likely be stored in a read-only HFS that is accessible by all systems
in the sysplex. This allows the alternate system to read the saved
classpath for the resource, and reconstruct it during a restart.
If
the connector used to access a JTA XAResource Manager is not stored
in an HFS that is readable by all systems that might be used for recovery,
when an application server restarts on an alternate system, it will
either appear that there is no XA recovery work to do, or it will
be impossible to load the classes necessary to communicate with the
JTA XAResource Manager.
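The Transaction log directory setting described earlier in this step could also be applied with a script instead of the administrative console. The following is a minimal sketch in wsadmin Jython style under stated assumptions: the Transaction Service of a server is a TransactionService configuration object, and its transactionLogDirectory attribute corresponds to the Transaction log directory field. The function name set_tranlog_directory and all cell, node, server, and directory names are hypothetical.

```python
def set_tranlog_directory(admin_config, cell, node, server, directory):
    """Point the transaction log directory of one server at a shared HFS.

    admin_config is the wsadmin AdminConfig object, or any object that
    offers the same getid, list, modify, and save methods.
    Returns the configuration ID of the Transaction Service that changed.
    """
    path = '/Cell:%s/Node:%s/Server:%s/' % (cell, node, server)
    server_id = admin_config.getid(path)
    if not server_id:
        raise ValueError('server not found: ' + path)

    # Assumption: each server has one TransactionService object.
    found = admin_config.list('TransactionService', server_id).splitlines()
    service_ids = [line.strip() for line in found if line.strip()]
    if not service_ids:
        raise ValueError('no Transaction Service found for: ' + path)

    ts_id = service_ids[0]
    # Assumption: transactionLogDirectory backs the console field
    # "Transaction log directory".
    admin_config.modify(ts_id, [['transactionLogDirectory', directory]])
    admin_config.save()
    return ts_id
```

Inside a wsadmin session, a call might look like set_tranlog_directory(AdminConfig, 'cell1', 'nodeA', 'server1', '/shared/tranlog'), where every argument after AdminConfig is a placeholder. The script only records the directory in the configuration. It does not check that the directory is on a shared HFS or that every system in the cell can write to it.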
- Resolve InDoubt units.
During a recovery,
there will be instances when manual intervention is required to resolve
InDoubt units. You will need to use RRS
panels for this manual intervention.