DB2 Version 9.7 for Linux, UNIX, and Windows

DB2® High Availability Disaster Recovery (HADR) standby database states

At any time, the standby database is in one of five states: local catchup, remote catchup pending, remote catchup, peer, or disconnected peer. The state of the standby database determines which operations it can perform. You can use the GET SNAPSHOT command to see which state your standby database is in.

Figure 1. States of the standby database


Database startup, local catchup, and remote catchup pending

With the high availability disaster recovery (HADR) feature, when the standby database is started, it enters local catchup state and attempts to read the log files in its local log path. If it does not find a log file in the local log path and a log archiving method has been specified, the log file is retrieved using the specified method. After the log files are read, they are replayed on the standby database. During this time, a connection to the primary database is not required; however, if a connection does not exist, the standby database tries to connect to the primary database. When the end of local log files is reached, the standby database enters remote catchup pending state.

If local log files become available after the standby database enters remote catchup pending state, you can shut down the standby database and restart it to cause it to re-enter local catchup state. You might do this if local access to such log files on the standby database is more efficient than allowing HADR to copy the files over the network from the primary database.

Remote catchup pending, remote catchup, peer

The standby database remains in remote catchup pending state until a connection to the primary database is established, at which time the standby database enters remote catchup state. During this time, the primary database reads log data from its log path or by way of a log archiving method and sends the log files to the standby database. The standby database receives and replays the log data. The primary and standby databases enter peer state when the standby database receives all of the log files that are on the disk of the primary database machine.

When the databases are in peer state, log pages are shipped to the standby database whenever the primary database flushes a log page to disk. The log pages are written to the local log files on the standby database to ensure that the primary and standby databases have identical log file sequences. The log pages can then be replayed on the standby database.

If the connection between the primary and standby databases is lost while the databases are in remote catchup state, the standby database enters remote catchup pending state. If the connection is lost while the databases are in peer state, the outcome depends on the HADR_PEER_WINDOW database configuration parameter: if the parameter is not set (or is set to zero), the standby database enters remote catchup pending state; if it is set to a non-zero value, the standby database enters disconnected peer state.
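The transitions described in the preceding sections can be summarized as a small state machine. The following Python sketch is illustrative only and is not part of DB2; the names State, Event, and next_state are assumptions introduced for this example. It covers only the transitions stated above and leaves the exit from disconnected peer state unmodeled.

```python
from enum import Enum


class State(Enum):
    LOCAL_CATCHUP = "local catchup"
    REMOTE_CATCHUP_PENDING = "remote catchup pending"
    REMOTE_CATCHUP = "remote catchup"
    PEER = "peer"
    DISCONNECTED_PEER = "disconnected peer"


class Event(Enum):
    END_OF_LOCAL_LOGS = "end of local log files reached"
    CONNECTED = "connection to the primary established"
    CAUGHT_UP = "all log files on the primary disk received"
    CONNECTION_LOST = "connection to the primary lost"


def next_state(state, event, peer_window=0):
    """Return the standby state after `event`.

    `peer_window` stands in for the HADR_PEER_WINDOW value in seconds.
    Events that do not apply to the current state leave it unchanged.
    """
    if state is State.LOCAL_CATCHUP and event is Event.END_OF_LOCAL_LOGS:
        return State.REMOTE_CATCHUP_PENDING
    if state is State.REMOTE_CATCHUP_PENDING and event is Event.CONNECTED:
        return State.REMOTE_CATCHUP
    if state is State.REMOTE_CATCHUP and event is Event.CAUGHT_UP:
        return State.PEER
    if state is State.REMOTE_CATCHUP and event is Event.CONNECTION_LOST:
        return State.REMOTE_CATCHUP_PENDING
    if state is State.PEER and event is Event.CONNECTION_LOST:
        # A non-zero peer window keeps the pair in disconnected peer state.
        if peer_window > 0:
            return State.DISCONNECTED_PEER
        return State.REMOTE_CATCHUP_PENDING
    return state
```

For example, next_state(State.PEER, Event.CONNECTION_LOST, peer_window=120) yields State.DISCONNECTED_PEER, whereas the same call with the default peer_window of 0 yields State.REMOTE_CATCHUP_PENDING.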

Disconnected peer

If you set the HADR_PEER_WINDOW database configuration parameter to a time value that is greater than zero and the primary database loses its connection with the standby database, the primary database continues to behave as though the primary and standby databases were in peer state for the configured amount of time. When the primary database and standby database are disconnected, but behaving as though in peer state, this state is called disconnected peer. The period of time for which the primary database remains in disconnected peer state after losing connection with the standby database is called the peer window. When the connection to the standby database is restored or the peer window expires, the standby database leaves the disconnected peer state.

The advantage of configuring a peer window is a lower risk of transaction loss during multiple or cascading failures. Without the peer window, when the primary database loses connection with the standby database, the primary database moves out of peer state. Once disconnected, it processes transactions independently of the standby database. If the primary database fails while it is out of peer state in this way, transactions could be lost because they have not been replicated to the standby database. With the peer window configured, the primary database does not consider a transaction committed until it has received acknowledgement from the standby database that the logs have been written to main memory on the standby system, or that the logs have been written to log files on the standby database (depending on the HADR synchronization mode).

The disadvantage of configuring a peer window is that transactions on the primary database will take longer or even time out while the primary database is in the peer window waiting for the connection with the standby database to be restored or for the peer window to expire.
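The timing behind this trade-off can be illustrated with a short calculation. The following Python sketch is a simplified model, not DB2 code; the function name disconnected_peer_end and its parameters are assumptions introduced for this example. It returns the time at which disconnected peer state ends, which bounds how long a transaction on the primary database might wait.

```python
def disconnected_peer_end(lost_at, peer_window, restored_at=None):
    """Return the time, in seconds, at which disconnected peer state ends.

    lost_at     -- time at which the connection to the standby was lost
    peer_window -- stand-in for the HADR_PEER_WINDOW value, in seconds
    restored_at -- time at which the connection was restored, if it was
    """
    if peer_window <= 0:
        # No peer window: disconnected peer state is never entered.
        return lost_at
    expiry = lost_at + peer_window
    if restored_at is not None and restored_at < expiry:
        # The connection came back before the peer window expired.
        return restored_at
    return expiry
```

With a peer window of 120 seconds and a connection lost at time 100, the state ends at time 220 if the connection is never restored, or at time 150 if the connection is restored at that point.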

You can determine the peer window size, which is the value of the HADR_PEER_WINDOW database configuration parameter, using the GET SNAPSHOT command or the db2pd utility with the -hadr parameter.
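As a rough sketch of how these two commands might be scripted, the following Python example builds their argument lists and parses snapshot-style "label = value" lines. The helper names and the database name SAMPLE are assumptions introduced for this example, and the sample text is illustrative rather than captured output; the exact field labels and layout can differ by release, so check the output of your own system before relying on them.

```python
def hadr_status_commands(db_name):
    """Return argument lists for the two commands named above."""
    return [
        ["db2", "get", "snapshot", "for", "database", "on", db_name],
        ["db2pd", "-db", db_name, "-hadr"],
    ]


def parse_key_values(text):
    """Collect 'label = value' lines into a dict, splitting on the first '='."""
    fields = {}
    for line in text.splitlines():
        label, sep, value = line.partition("=")
        if sep and label.strip():
            fields[label.strip()] = value.strip()
    return fields


# Illustrative text only; the labels are assumptions, not captured output.
SAMPLE_OUTPUT = """\
HADR Status
  Role                   = Standby
  State                  = Peer
  Peer window (seconds)  = 120
"""

fields = parse_key_values(SAMPLE_OUTPUT)
print(fields["State"], fields["Peer window (seconds)"])
```

The argument lists can be passed to subprocess.run on a machine where the DB2 command line tools are installed. The parser applies only to output laid out as "label = value" lines.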

Implications and restrictions of these standby database states for synchronizing the primary and standby databases

One method for synchronizing the primary and standby databases is to manually copy the primary database log files into the standby database log path to be used for local catchup. If you use this method, you must copy the primary log files before you start the standby database, for the following reasons:

When the end of the local log files is reached, the standby database will enter remote catchup pending state and will not try to access the local log files again until the standby database is restarted.
If the standby database enters remote catchup state, copying log files into its log path could interfere with the writing of local log files by the standby database.
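The ordering constraint above can be enforced in a copy script. The following Python sketch is one possible approach and not an IBM-provided tool; the function name copy_primary_logs, the caller-supplied standby_started flag, and the *.LOG file name pattern are assumptions introduced for this example.

```python
import shutil
from pathlib import Path


def copy_primary_logs(primary_log_dir, standby_log_path, standby_started):
    """Copy log files into the standby log path, but only before the
    standby database is started.

    standby_started is supplied by the caller; this sketch does not
    query DB2 to find out whether the standby database is running.
    """
    if standby_started:
        raise RuntimeError(
            "standby database already started; stop it before copying log files"
        )
    dest = Path(standby_log_path)
    dest.mkdir(parents=True, exist_ok=True)
    copied = []
    # Assumed naming pattern for log files; adjust to match your system.
    for src in sorted(Path(primary_log_dir).glob("*.LOG")):
        shutil.copy2(src, dest / src.name)
        copied.append(src.name)
    return copied
```

Refusing to copy once the standby database has started reflects both reasons listed above: the files would not be read until a restart, and copying could interfere with log files that the standby database is writing.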