Cluster Resource Group APIs

The information provided here includes:

Cluster Resource Group APIs--Introduction
Cluster Resource Group API List

Cluster Resource Group APIs--Introduction

The Cluster Resource Group (CRG) function within a cluster is to:

maintain operationally identical cluster resource groups (CRGs) on every node of the cluster resource group recovery domain.
call the Cluster Resource Group exit program for most cluster resource group APIs
coordinate activities performed whenever access points for cluster resource groups are changed from one node to another.

Any cluster resource group API may be called on any node in the cluster. Most cluster resource group APIs have an asynchronous behavior.

The majority of the cluster resource group APIs require that Cluster Resource Services be active. This is necessary to ensure consistency of cluster resource groups across the cluster. Each API indicates whether or not Cluster Resource Services needs to be active for the API to complete successfully.

Cluster Resource Services maintains synchronous copies of cluster resource groups (perceptively and operationally identical) on all nodes in the group's recovery domain. When a node joins the cluster or when a cluster partition is resolved, the cluster resource group object is reconciled. This may mean copying the cluster resource group object from some node already in the cluster to the joining node or from the primary partition to nodes in the secondary partition. See Partition Rules for details on primary and secondary partitions.

Types of Cluster Resource Groups

There are two models of cluster resource groups.

Primary-backup model. All cluster resource groups of this model define nodes in the recovery domain with a specific role: either primary, backup or replicate. The primary and backup nodes are available to be the access point for the cluster resource. However only one node will be the active access point at a given point in time. This node will be the primary node. Replicate nodes are not available to be an access point. A node role can be changed by assigning the replicate node a role of backup. Examples of cluster resource groups of this model are data, device and application.
Peer model. All cluster resource groups of this model define nodes in the recovery domain with a role of peer or replicate. The peer nodes are available to be the access point for the cluster resource group. All nodes defined as peer will be the access point at the same time when the cluster resource group is started. Replicate nodes are not available to be an access point. This can be changed by assigning the replicate node a role of peer. Example of cluster resource groups of this model are peer cluster resource groups.

Cluster resource group objects are either data resiliency, application resiliency, device resiliency or peer resiliency. Data resiliency represents multiple copies of data maintained on more than one node in a cluster. Application resiliency enables an application (program) to be restarted on either the same or a different node in the cluster. This is made possible by a Takeover IP Address. Device resiliency allows devices such as auxiliary storage pools to be switched from one node in a cluster to another node. Peer resiliency represents resources being accessed by mutliple clients.

Recovery Comain

Cluster resource groups contain a recovery domain. A Recovery Domain is that set of cluster nodes which, for a particular cluster resource group, describes the access points of the cluster resource. Each node in the recovery domain is assigned a role that reflects its point of access:

Primary Node	The cluster node that is the point of access for the resilient cluster resource. For a replicated resource, the primary node also contains the principal copy of a resource. If this node fails, all cluster resource group objects having this node as the primary access point will failover to a backup node. This role is allowed for primary-backup model cluster resource groups.
Backup Nodes	Cluster nodes that will take over the role of primary access if the present primary node fails. For a replicated cluster resource, this cluster node contains a copy of that resource. In the case of a data cluster resource group, copies of the data are kept current with replication. This role is allowed for primary-backup model cluster resource groups.
Replicate Nodes	Cluster nodes that has copies of cluster resources. The node it is unable to assume the role of primary, backup, or peer.
Peer Nodes	All nodes have the same copy of the cluster resources. A node defined with this role is available to be the active point of access for the cluster resources. This role is only supported for peer cluster resource groups.

Some Cluster Control APIs cause cluster resource group actions to be taken. For example, an End Cluster Node (QcstEndClusterNode) API will cause the active cluster resource groups on that node to be ended and the Cluster Resource Group Exit Program to be called. In these instances, the success indicator returned by the exit program will be ignored. The operations will always be considered successful.

A cluster resource group has a recovery domain of one or more cluster nodes. Each cluster node within the recovery domain has two roles: preferred and current. The two node roles need not be the same. When a cluster resource group is initially created, the preferred and the current roles are the same. When a cluster resource group is created, a cluster resource group job is started on each active node in the cluster and a *CRG object will be created on each recovery domain node.

The current role of a node in the recovery domain is changed as a result of operations occurring within the cluster (for example nodes ending, nodes starting, and nodes failing).

For primary-backup model cluster resource groups

When the recovery domain is obtained by the List Cluster Resource Group Information (QcstListClusterResourceGroupIn) API or when it is passed to the exit program, it is always presented as an array with the primary node first, followed by backup nodes, and finally replicate nodes. If the cluster resource group is active, backup nodes in the recovery domain are ordered so that active nodes appear before nodes that are inactive or partitioned. APIs or cluster events that affect a node's membership status in the recovery domain also cause the order of the backup nodes to change for an active cluster resource group.
If a cluster resource group is not active, APIs can cause changes to the order of the recovery domain but cluster events such as nodes failing or rejoining the cluster do not. This is done to keep the current recovery domain in the order last requested by the user or the last order when the cluster resource group was ended during times when node failures or rejoins are not important. However, when the cluster resource group becomes active such as with the Start Cluster Resource Group (QcstStartClusterResourceGroup) API, the recovery domain will be reordered if necessary to put active backup nodes before inactive or partitioned backup nodes.
When the Initiate Switchover (QcstInitiateSwitchOver) API is used on an active cluster resource group, the first active backup node becomes the new primary and the old primary becomes the last active backup node.
When the primary node fails for an active cluster resource group, the first active backup node becomes the new primary and the old primary becomes the last inactive backup.

For peer model cluster resource groups:

When the recovery domain is obtained by the List Cluster Resource Group Information (QcstListClusterResourceGroupIn) API or when it is passed to the exit program, it is always presented as an array with the peer nodes first followed by replicate nodes.
The Initiate Switchover (QcstInitiateSwitchOver) API is not allowed.
The nodes are not reordered when nodes fail for an active cluster resource group.

The preferred role of a node in the cluster is changed only by running the following APIs:

Add Node to Recovery Domain
Remove Node from Recovery Domain
Change Cluster Resource Group

Changes to the node roles are done independently. The role specified for a node in any of these APIs will be assigned to both the current and preferred roles of the node.

Example of node roles for a primary-backup model cluster resource group.

For example, the recovery domain of a primary-backup model cluster resource group object has preferred roles of N1-primary, N2-backup1, and N3-backup2, but the current roles are N1-backup2, N2-primary, and N3-backup1. N4 is being added as backup2. Therefore, the preferred roles of the nodes are N1-primary, N2-backup1, N3-backup3, and N4-backup2, and the current roles are N1-backup3, N2-primary, N3-backup1, and N4-backup2.

Example of node roles

Example of node roles for a peer model cluster resource group.

In this example, the recovery domain of a peer model cluster resource group object preferred roles are N1-peer, and N2-replicate, but the current roles are N1-peer, and N2-replicate. N3 is being added as peer. Therefore, the preferred roles of the nodes are N1-peer, N2-replicate, and N3-peer, and the current roles are N1-peer, N2-replicate, and N3-peer. Notice that the recovery domain is reordered when another node is added to the recovery domain with a role of peer.

Example of node roles

Exit program

Every data or application cluster resource group has an associated exit program. A device cluster resource group can also have an exit program but one is not required. This exit program will be called for each of the different action codes listed under the Cluster Resource Group exit program. The exit program is called from a separate job using the user profile supplied when the cluster resource group is created. See Cluster Resource Group Exit Program for a description of the conditions that cause the exit program to be called.

The user exit program will be restricted from calling some of the APIs. Each API specifies the user exit program restrictions.

Application takeover IP address

An application takeover IP address is a high availability mechanism used to insulate clients from application server outages. The concept is to use IP address aliasing (multihoming) to define a "floating IP address" associated with multiple application servers or hosts. When one application server in a cluster fails, another cluster node can assume the responsibilities of the application server without requiring the user to reconfigure the clients.

To support address aliasing, application groups contain an IP address resource and a recovery domain. When the application or the node running the application fails, Cluster Resource Services initiates a failover of the group using the IP address to the node assigned the current role of first backup.

The address specified for the takeover IP address must not be used for any other purposes. Cluster Resource Services will not allow certain API operations to complete successfully if the IP address is in use. This restriction ensures that the structures being created will provide application resilience.

Server takeover IP address

A server takeover IP address is just like the application takeover IP address for an application CRG, except it is used for servers associated with the relational database name in the device description for an auxiliary storage pool. The address can only be specified for a primary auxiliary storage pool. Only one IP address can be specified per primary auxiliary storage pool. The address must be unique, and must not be used for any other purpose.

The user is responsible for configuring and managing the server takeover IP address. The IP address must be added on all nodes in the recovery domain prior to starting the cluster resource group. Starting of a device cluster resource group will not start the server IP address or vary on the device. That is the user's responsibility. Cluster Resource Service manages the IP address only during a switchover or failover.

On switchover or failover, clustering will end the IP address on the current primary, and uses the value in the "configuration object online" field to determine what action should occur on the new primary node. Based on the value in the "configuration object online" field it will either start the IP address and vary on the device or do nothing to the IP address and device.

Failover message queue

A failover message queue allows a user to control what happens at failover time. A failover policy could be:

failover continues to act like it did for V5R1M0 and prior.
failover sends an inquiry message to the failover message queue and waits the specific amount of time specified by the user.

A failover message queue may be specifed when a cluster resource group is created. A message will be placed on the queue when the primary node of the active cluster resource group either ends or fails, forcing the cluster resource group to fail over to a new primary. In the case of a node failure, each cluster resource group will enqueue a separate message to its failover message queue if one is defined. No message will be enqueued if the primary node is removed from the cluster.

The message will be placed on the message queue on the new primary node before the call to the exit program. This gives the user the option of continuing the failover to the new primary, or cancelling the failover. If the failover is cancelled, the primary node will not be changed, and the cluster resource group will become Inactive. The exit program will be called with an action code of Failover Cancelled.

There are two associated parameters with the qualified failover message queue. The failover wait time allows the user to specify how long Cluster Resource Services should wait for a reply to the failover message. The user can choose to wait forever, proceed with failover without waiting for a reply, or wait a specified number of minutes. The failover default action allows the user to choose whether to continue or cancel failover if a reply to the failover message is not received within the time limit specified in the failover wait time parameter or if the message cannot be enqueued for some reason.

Site name and data port IP addresses

Site name and data port IP addresses are associated with a recovery domain node for a device CRG, applicable only to cross-site mirroring. Both must be specified together for a recovery domain node. That is, a node which has a site name must also have at least one data port IP address specified.

Geographic mirroring, which is a subfunction of cross-site mirroring, supports two physical copies of auxiliary storage pool, one on each site. Only two sites are supported. A site primary node is the node which has the highest node role ranking for that site. A production site primary node, which is also the primary node for a device CRG, owns a production copy of the auxiliary storage pool. A mirror site primary node, which is the backup node which has the highest node role ranking at the mirror site, owns a mirror copy of the auxiliary storage pool.

A site may contain one or more recovery domain nodes at the same physical location. All nodes at a site must have access to the same physical copy of auxiliary storage pool. If there is only one node at a site, the auxiliary storage pool on that site does not need to be switchable. A node which belongs to more than one device CRG may or may not have the same site name.

Geographic mirroring is performed by sending updates from a production site primary node to a mirror site primary node via data port IP addresses. Each recovery domain node could have up to four data port IP addresses. They must be unique across all recovery domain nodes and CRGs.

User is responsible for configuring and managing data port IP addresses. They must already exist on all nodes in the recovery domain prior to starting a device CRG. Clustering will not start or end data port IP addresses under any circumstances, including starting and ending of a cluster resource group, switchover and failover. User must start the data port IP addresses before geographic mirroring can be performed. It is recommended that data port IP addresses are dedicated for geographic mirroring use only. It is also recommended that multiple data port IP addresses on each recovery domain node map to different adapters. This will help to avoid a single point of failure on the adapter and also improve performance of geographic mirroring.

Summary of cluster resource group status

Each cluster resource group has a status associated with it. The status of the cluster resource group may govern the behavior of a particular API call. In the following list of values, an indication of what happens when the exit program completes successfully applies only to a cluster resource group which has an exit program. If no exit program was specified, the same action occurs as for a successful completion. The possible values are:

10 Active. The resources managed by the cluster resource group are currently resilient.

20 Inactive. The resources managed by the cluster resource group are currently not resilient.

30 Indoubt. The information contained within the cluster resource group object may not be accurate. This status occurs when an exit program is called with an action of Undo and fails to complete successfully.

40 Restored. The cluster resource group object was restored on this node and has not been copied to the other nodes in the recovery domain. When Cluster Resource Services is started on this node, the cluster resource group will be synchronized with the other nodes in the recovery domain and its status set to Inactive.

500 Add Node Pending. A new node is in the process of being added to the recovery domain of a cluster resource group. If the exit program is successful the status is reset to its value at the time the API was called. If the exit program fails and the original state cannot be recovered, the status is set to Indoubt.

510 Delete Pending. Cluster resource group object is in the process of being deleted. When the exit program completes the cluster resource group is deleted from all nodes in the recovery domain.

520 Change Pending. The cluster resource group is in the process of being changed. If the exit program is successful the status is reset to the value at the time the API was called. If the exit program fails and the original state cannot be recovered, status is set to Indoubt.

530 End Cluster Resource Group Pending. Resiliency for the cluster resource group is in the process of ending. If the exit program is successful, the status is set to Inactive. If the exit program fails and the original state cannot be recovered, the status is set to Indoubt.

540 Initialize Pending. A cluster resource group is being created and is in the process of being initialized. If the exit program is successful, the status is set to Inactive. If the exit program fails, the cluster resource group will be deleted from all nodes.

550 Remove Node Pending. A node is in the process of being removed from the recovery domain of the cluster resource group. If the exit program is successful, the status is reset to the value at the time the API was called. If the exit program fails and the original state cannot be recovered, the status is set to Indoubt.

560 Start Cluster Resource Group Pending. Resiliency is in the process of starting for the cluster resource group. If the exit program is successful, the status is set to Active. If the exit program fails and the original state cannot be recovered, the status is set to Indoubt. For peer model cluster resource groups all nodes defined with a role of peer are active access points for the cluster resources.

570 Switchover Pending. The Initiate Switchover API was called, a failure of a cluster resource group occurred, or a node failed, causing a switchover or failover to begin. The first backup node is in the process of becoming the primary node. If the exit program is successful, the status is set to Active. If the exit program fails and the original state cannot be recovered, the status is set to Indoubt. While the switchover function is not valid for a peer cluster resource group, users may see the status "switchover pending" during a node failure.

580 Delete Command Pending. Cluster resource group object is being deleted by the Delete Cluster Resource Group (DLTCRG) command. The Cluster resource group object is only removed from the node running the command. This is not a distributed request. At the completion of the command, the cluster resource group is deleted from the node.

590 Add Device Entry Pending. A device entry is being added to a cluster resource group. If the exit program is successful, the status is reset to its value at the time the API was called. If the exit program fails and the original state cannot be recovered, the status is set to Indoubt.

600 Remove Device Entry Pending. A device entry is being removed from a cluster resource group. If the exit program is successful, the status is reset to its value at the time the API was called. If the exit program fails and the original state cannot be recovered, the status is set to Indoubt.

610 Change Device Entry Pending. A device entry is being changed in a cluster resource group. If the exit program is successful, the status is reset to its value at the time the API was called. If the exit program fails and the original state cannot be recovered, the status is set to Indoubt.

620 Change Node Status Pending. The status of a node in the cluster resource group"s current recovery domain is being changed. If the change is successful, the status is reset to its value at the time the Change Cluster Node Entry API was called. Failure of the exit program causes the status of the cluster resource group to be set to Indoubt. If a backup node is reassigned as the primary node for a resilient device cluster resource group and the ownership of the device cannot be transferred to the new primary node, the status is set to Indoubt.

The relationship between the cluster resource group status and the cluster resource group APIs is summarized in the following table. See the cluster resource group APIs for additional details on the cluster resource group status.

Summary of cluster resource group statuses for affected Cluster Resource Services API

Cluster Resource Services API	Original status	Status while exit program running	Action Code	Status - exit program successful	Status - exit program failure on Undo
Add Cluster Resource Group Device Entry	Active Inactive Indoubt Restored - ERROR Any pending status - ERROR	Add Device Entry Pending	17 - Add Device Entry	original status	Indoubt
Add Node to Recovery Domain	Active Adding primary - Error Adding backup, replicate or peer Inactive Indoubt Restored - ERROR Any pending status - ERROR	Add Node Pending	11 - Add Node	original status	Indoubt
Change Cluster Node Entry	When changing node status: Active Inactive Indoubt Restored - ERROR Any pending status - ERROR	Change Node Status Pending	20 - Change Node Status	original status	Indoubt ** Indoubt if device ownership cannot be transferred for a resilient device cluster resource group.
Change Cluster Resource Group	If changing node to primary or changing takeover IP address: Active -- ERROR Inactive Indoubt Restored - ERROR Any pending status - ERROR All other changes: Active Inactive Indoubt Restored - ERROR Any pending status - ERROR	Change Pending	13 - Change Note: Only call exit program for changing node role in recovery domain.	original status	Indoubt
Change Cluster Resource Group Device Entry	Active Inactive Indoubt Restored - ERROR Any pending status - ERROR	Change Device Entry Pending	19 - Change Device Entry	original status	Indoubt
Create Cluster Resource Group	N/A	Initialize Pending	1 - Initialize	Inactive	*CRG deleted
Delete Cluster Resource Group	Active -- ERROR Inactive Indoubt Restored Any pending status - ERROR	Delete Pending	5 - Verification Phase 7 - Delete	*CRG deleted	orginal status (if during Verification Phase, no undo, CRG not deleted) CRG deleted (if during Delete) ** Indoubt if Cluster Resource Services fails
End Cluster Resource Group	Active Inactive -- ERROR Indoubt Restored -- ERROR Any pending status - ERROR	End Cluster Resource Group Pending	4 - End	Inactive	Indoubt
End Cluster Node	Active Inactive Indoubt Restored Any pending status	Switchover Pending	16 - End Node for the node ending 9 - Failover for other nodes	original status	Indoubt
Initiate Switchover	Active Inactive -- ERROR Indoubt -- ERROR Restored -- ERROR Any pending status - ERROR Any peer cluster resource group - ERROR	Switchover Pending	10 - Switchover Note: If application cluster resource group, exit program called again with action Start.	Active	Indoubt
Remove Cluster Node Entry	Active Inactive Indoubt Restored Any pending status	Switchover Pending	12 - Remove Node for the node being removed 9 - Failover for other nodes	original status	Indoubt
Remove Cluster Resource Group Device Entry	Active - ERROR if last device entry removed Inactive Indoubt Restored - ERROR Any pending status - ERROR	Remove Device Entry Pending	18 - Remove Device Entry	original status	Indoubt
Remove Node From Recovery Domain	Active Removing primary - ERROR Removing backup, replicate or peer Inactive Indoubt Restored -- ERROR Any pending status - ERROR	Remove Node Pending	12 - Remove Node	original status	Indoubt
Start Cluster Node Entry	Active Inactive Indoubt Restored Any pending status	No pending value used	8 - Rejoin	original status	Indoubt
Start Cluster Resource Group	Active -- ERROR Inactive Indoubt Restored -- ERROR Any pending status - ERROR	Start Cluster Resource Group Pending	2 - Start	Active	Indoubt

Partition rules

When a partition is detected, each partition is designated as a primary or secondary partition for each cluster resource group defined in the cluster.

For primary-backup model cluster resource groups:

The primary partition contains the node that has the current node role of primary. All other partitions are secondary. The primary partition may not be the same for all cluster resource groups.

For peer model cluster resource groups:

If the recovery domain nodes are fully contained within one partition, it will be the primary partition.
If the recovery domain nodes span a partition, there will be no primary partion. Both partitions will be secondary partitions.
If the cluster resource group is active and there are no peer nodes in the given partition, the cluster resource group will be ended.
Operational changes are allowed in a secondary partition as long as the restrictions for the operational changes are met.
No configuration changes are allowed in a secondary partition.

The restrictions for each API when in a partition state are:

Add Cluster Resource Group Device Entry: Allowed only in a primary partition and all nodes in the cluster resource group"s recovery domain must be active in the primary partition.
Add Node to Recovery Domain: Allowed only in a primary partition.
Change Cluster Resource Group: Allowed only in a primary partition.
Change Cluster Resource Group Device Entry: Allowed only in a primary partition.
Create Cluster Resource Group: Not allowed in any partition.
Delete Cluster Restore Group: Allowed in any partition, but only affects partition running the API.
Distribute Information: Allowed in any partition, but only affects partition running the API.
End Cluster Resource Group: Allowed only in a primary partition. Allowed in all partitions for peer cluster resource groups, but only affects the partition running the API.
Initiate Switchover: Allowed only in a primary partition.
List Cluster Resource Groups: Allowed in any partition.
List Cluster Resource Group Information: Allowed in any partition.
Remove Cluster Resource Group Device Entry: Allowed only in a primary partition.
Remove Node from Recovery Domain: Allowed only in a primary partition.
Start Cluster Resource Group.: Allowed only in a primary partition. Allowed in all partitions for peer cluster resource groups, but only affects the partition running the API.

By applying these restrictions, cluster resource groups can be resynchronized when the cluster is no longer partitioned. As nodes rejoin the cluster from a partitioned status, the version of the cluster resource group in the primary partition will be copied to nodes in a secondary partition. When merging two secondary partitions for peer-model, the partition which has cluster resource group with status of Active will override the other partition. If both partitions have the same status for cluster resource group, the partition which contains the first active node listed in the cluster resource group recovery domain will be copied to all nodes in the recovery domain. The version of the cluster resource group in the winning partition will be copied to nodes in the overridden partition.

On occasion, a partition condition may be reported incorrectly and one or more nodes may have actually failed. If one of these failed nodes has the current role of primary for a cluster resource group, special recovery actions are required in order to assign the primary node role to a node in a secondary partition.

After these actions have been taken, returning the failed nodes to the cluster becomes much more difficult. Thus, these actions should be taken only when the failed node will be unavailable for an extended period of time. An example of when to do this would be the loss of a primary site.

The Change Cluster Node Entry API may be used to tell Cluster Resource Services that a node has really failed rather than partitioned. Once all nodes have been identified as failing, the List Cluster Resource Group Information API can be used to determine if the recovery domain has been reordered as the situation requires, and the Start Cluster Resource Group API can be used to restart the cluster resource group.

See Change Cluster Node Entry (QcstChangeClusterNodeEntry) API for additional information.

Cluster resource group API list

The cluster resource group APIs are:

Add Cluster Resource Group Device Entry (QcstAddClusterResourceGroupDev) adds a new device entry to a cluster resource group.
Add Node To Recovery Domain (QcstAddNodeToRcvyDomain) adds a new node to the recovery domain of an existing cluster resource group.
Change Cluster Resource Group (QcstChangeClusterResourceGroup) changes some of the attributes of a cluster resource group.
Change Cluster Resource Group Device Entry (QcstChgClusterResourceGroupDev) changes a device entry in a cluster resource group.
Create Cluster Resource Group (QcstCreateClusterResourceGroup) creates a cluster resource group object.
Delete Cluster Resource Group (QcstDeleteClusterResourceGroup) deletes a cluster resource group.
Distribute Information (QcstDistributeInformation) delivers information from a node in the recovery domain to other nodes in the recovery domain.
End Cluster Resource Group (QcstEndClusterResourceGroup) calls the Cluster Resource Group Exit Program to disable the resilience of the objects or application.
Initiate Switchover (QcstInitiateSwitchOver) changes the current recovery domain of a cluster resource group by making the primary node the last backup node and first backup node the primary node.
List Cluster Resource Group Information (QcstListClusterResourceGroupIn) returns the contents of a cluster resource group object.
List Cluster Resource Groups (QcstListClusterResourceGroups) generates a list of cluster resource groups and descriptive information about them.
Remove Cluster Resource Group Device Entry (QcstRmvClusterResourceGroupDev) removes a device entry from a cluster resource group.
Remove Node From Recovery Domain (QcstRemoveNodeFromRcvyDomain) removes a node from the recovery domain of an existing cluster resource group.
Start Cluster Resource Group (QcstStartClusterResourceGroup) calls the Cluster Resource Group Exit Program to enable resilience for the objects or application.

[ Back to top | Cluster APIs | APIs by category ]