DB2 10.5 for Linux, UNIX, and Windows

Detecting an unplanned outage

Before you can respond to the failure of a component, you must detect that the component failed. DB2® Data Server has several tools for monitoring the health of a database, or otherwise detecting that a database has failed. You can configure these tools to notify you or take predefined actions when they detect a failure.

Procedure

You can use the following tools to detect when a failure has occurred in some part of your DB2 database solution:

DB2 fault monitor facility

The DB2 fault monitor facility keeps DB2 database instances up and running. When the DB2 database instance to which a DB2 fault monitor is attached terminates unexpectedly, the DB2 fault monitor restarts the instance. If your database solution is implemented in a cluster, configure the cluster managing software, rather than the DB2 fault monitor, to restart failed database instances.
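The restart-on-unexpected-termination behavior described above can be illustrated with a minimal watchdog sketch. This is not the DB2 fault monitor's implementation or interface. The function name `supervise` and its parameters `max_restarts` and `backoff_seconds` are hypothetical, and the sketch assumes that a nonzero exit status means the supervised process terminated unexpectedly.

```python
import subprocess
import sys
import time


def supervise(command, max_restarts=3, backoff_seconds=0.0):
    """Run `command` and restart it each time it exits with a nonzero status.

    Assumption: a zero exit status is a deliberate shutdown, and any other
    status is an unexpected termination that warrants a restart.
    Returns the number of restarts that were performed.
    """
    restarts = 0
    while True:
        process = subprocess.Popen(command)
        exit_code = process.wait()
        if exit_code == 0:
            # Clean shutdown: leave the process stopped.
            return restarts
        if restarts >= max_restarts:
            # Restart budget exhausted: give up and report.
            return restarts
        restarts += 1
        time.sleep(backoff_seconds)


if __name__ == "__main__":
    # Hypothetical child process that always terminates with an error.
    failing_command = [sys.executable, "-c", "import sys; sys.exit(1)"]
    print("restarts:", supervise(failing_command, max_restarts=2))
```

Capping the number of restarts keeps a process that fails immediately on every start from being relaunched indefinitely. Whether and how a real monitor applies such a limit depends on its configuration.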

Heartbeat monitoring in clustered environments

Cluster managing software uses heartbeat messages between the nodes of a cluster to monitor the health of the nodes. The cluster manager detects that a node has failed when the node stops responding or stops sending messages.
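The idea of heartbeat-based failure detection can be sketched as follows. This is a simplified illustration, not the algorithm of any particular cluster manager. The class `HeartbeatMonitor`, its methods, and the `timeout_seconds` parameter are hypothetical names, and the sketch assumes a node is considered failed once no heartbeat has arrived from it within the timeout.

```python
import time


class HeartbeatMonitor:
    """Track the last heartbeat from each node and flag nodes that go silent."""

    def __init__(self, timeout_seconds, clock=time.monotonic):
        self.timeout_seconds = timeout_seconds
        self.clock = clock      # injectable so the logic can be tested
        self.last_seen = {}     # node name -> time of last heartbeat

    def record_heartbeat(self, node):
        """Note that a heartbeat message from `node` has just arrived."""
        self.last_seen[node] = self.clock()

    def failed_nodes(self):
        """Return the nodes whose last heartbeat is older than the timeout."""
        now = self.clock()
        return sorted(
            node
            for node, seen in self.last_seen.items()
            if now - seen > self.timeout_seconds
        )


if __name__ == "__main__":
    monitor = HeartbeatMonitor(timeout_seconds=0.2)
    monitor.record_heartbeat("node-a")
    monitor.record_heartbeat("node-b")
    time.sleep(0.3)
    monitor.record_heartbeat("node-a")   # node-b stays silent
    print("failed:", monitor.failed_nodes())
```

A monotonic clock is used so that adjustments to the system time do not cause false failure reports.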

Monitoring DB2 High Availability Disaster Recovery (HADR) databases

The HADR feature has its own heartbeat monitor. The primary database and the standby database each expect heartbeat messages from the other at regular intervals.