Initiate Switchover (QcstInitiateSwitchOver) API

Required Parameter Group:

1	Request handle	Output	Char(16)
2	Cluster name	Input	Char(10)
3	Cluster resource group name	Input	Char(10)
4	Exit program data	Input	Char(256)
5	Results information	Input	Char(30)
6	Error code	I/O	Char(*)

  Service Program: QCSTCRG2

  Default Public Authority: *EXCLUDE

  Threadsafe: Yes

The Initiate Switchover (QcstInitiateSwitchOver) API changes the current roles of nodes in the recovery domain of a cluster resource group:

The current primary node is assigned the role of last active backup.
The current first backup is assigned the role of primary.

If a backup node does not exist in the recovery domain, the switchover will fail. If the first backup is not the desired primary, first use the Change Cluster Resource Group (QcstChangeClusterResourceGroup) API to arrange the backup nodes in the recovery domain in the desired order.
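The role change described above can be sketched as a rotation of an ordered node list. This is a simplified illustration only: the function name `switch_roles` and the list representation are hypothetical, and the model ignores inactive backups and any other node roles.

```python
def switch_roles(active_nodes):
    """Return the node order after a switchover (illustrative model only).

    active_nodes: node names ordered by current role, primary first,
    then first backup, second backup, and so on.
    """
    if len(active_nodes) < 2:
        # The text states that the switchover fails without a backup node.
        raise ValueError("no backup node in the recovery domain")
    primary, *backups = active_nodes
    # The first backup becomes primary; the old primary becomes
    # the last active backup.
    return backups + [primary]


print(switch_roles(["NODE1", "NODE2", "NODE3"]))
# ['NODE2', 'NODE3', 'NODE1']
```

If NODE3 rather than NODE2 should become primary, the backup order would have to be changed before the switchover, as the paragraph above notes.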

This API will do the following for all cluster resource group types:

Set the cluster resource group status to Switchover Pending (570).
Call the cluster resource group exit program on all active nodes in the recovery domain with an action code of Switchover (10), if an exit program is specified for the cluster resource group.
Set the cluster resource group status to Active (10) if the exit program completes successfully.
Set the cluster resource group status to Indoubt (30) if the exit program is unsuccessful and the original state of the cluster resource group cannot be recovered.

This API will do the following for resilient application cluster resource groups:

Cancel the cluster resource group exit program job with a Cancel Job Immediate on the current primary.
End the takeover IP interface on the current primary.
Start the takeover IP interface on the new primary.
Start the cluster resource group exit program on the new primary.
Note: The application and exit program code should provide cancel handlers to clean up the job if it is cancelled.
Set the cluster resource group status to Active (10) if the takeover IP address and the cluster resource group exit program job are started.
Set the cluster resource group status to Indoubt (30) if either the takeover IP address or the cluster resource group exit program job is not started.

This API will do the following for resilient device cluster resource groups:

The configuration objects must exist on all active nodes in the recovery domain and the resource names in the configuration objects must be the same on all active nodes.
The current primary node must own the IOPs or high-speed link I/O bridges for the devices configured in the cluster resource group.
The new primary node must be able to access the IOPs or high-speed link I/O bridges for the devices configured in the cluster resource group. This requirement does not apply to cross-site mirroring if the new primary node is at a different site than the current primary node.
For geographic mirroring, if the new primary node is at a different site than the current primary node, a role swap of auxiliary storage pools will occur where a production copy on the current primary node becomes a mirror copy, and a mirror copy on the new primary node becomes a production copy. If there is another active backup node at the same site as the current primary node, the auxiliary storage pools are moved from the current primary node to that site backup node.
On the current primary node, if the cluster resource group is active, the configuration objects specified in the cluster resource group are varied off and the server takeover IP addresses are ended. The devices are moved to the new primary or the roles of the auxiliary storage pools are swapped (for geographic mirroring where the new primary node is at a different site than the current primary node), before the exit program is called on the current primary. If any of the devices in the cluster resource group is a primary auxiliary storage pool, all members of the auxiliary storage pool group will be varied off. Before varying the devices off, cluster resource services will attempt to end all jobs that are using auxiliary storage pools configured in the cluster resource group. There are some system server jobs which will not be cancelled. If those server jobs are performing long-running operations against data on an auxiliary storage pool, the devices may not vary off and the switchover will fail.
For the configuration objects specified in the cluster resource group, vary the configuration objects on and start the server takeover IP address on the new primary node if the entry in the cluster resource group indicates the configuration object is to be varied on. If any of the devices in the cluster resource group is a primary auxiliary storage pool, all members of the auxiliary storage pool group will be varied on if the primary specifies the vary on value. The exit program is called on the new primary after the devices are moved to the new primary and varied on.
Cluster Resource Services varies the configuration objects in parallel based on the object type. All configuration objects of the type must either complete the vary (successfully or not) or be in vary pending before configuration objects of the next type are varied. The vary off sequence is *NWSD configuration objects first, followed by *DEVD, *CTLD, and *LIND. The vary on sequence is the reverse order: *LIND, *CTLD, *DEVD, and lastly *NWSD configuration objects.
Cluster Resource Services submits a batch job for each configuration object in the device list to vary the object on or off. The job is submitted to the job queue defined in the job description associated with the API's requesting user profile. The batch subsystem should be defined to allow these batch jobs to run concurrently in order to make switchover as fast as possible.
Set the cluster resource group status to Active (10) if the switchover to the new primary node is successful.
If the devices cannot be switched to the new primary, then the switchover fails. The exit program will be called with an action code of Undo (15) and the devices will be moved back to the original primary node and/or the roles of the auxiliary storage pools will be swapped back as before (for geographic mirroring where the new primary node is at a different site than the current primary node).
If the device entry in the cluster resource group indicates the device should be varied on and the vary on or the start of the server takeover IP address fails for some reason, the switchover will not complete successfully. The devices remain on the new primary, and the exit program will be called with an action code dependent data of Configuration Object Online Failure (16). Configuration objects that had successfully varied on will remain varied on.
Set the cluster resource group status to Indoubt (30) if the devices cannot be successfully switched to the new primary node and cannot be returned to the same state on the old primary node, or if there were any vary on failures.
If there are one or more external storage devices:
- If the current storage source and target sites are the same as the current CRG's production and mirror sites, then the switchover continues and the sites will be switched if needed.
- If the current storage source and target sites are the same as the new production and mirror sites, then the switchover continues without the sites being switched.
- If the current storage source and target sites are not the same as the current or new production and mirror sites, then the switchover fails and the CRG status is changed to Indoubt (30).
- There is no backout if changing the sites fails, and the CRG status is changed to Indoubt (30).
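The external storage rules above can be read as a small decision function. The following is a sketch under stated assumptions, not the logic Cluster Resource Services actually runs: the function name, the tuple representation of sites, and the returned strings are all hypothetical, and "switched if needed" is interpreted here as "switched only when the new production and mirror sites differ from the current ones".

```python
def external_storage_outcome(storage_sites, current_sites, new_sites):
    """Decide how a switchover proceeds for external storage devices.

    Each argument is a (source_or_production_site, target_or_mirror_site)
    tuple. All names here are illustrative.
    """
    if storage_sites == current_sites:
        # Storage matches the CRG's current production and mirror sites:
        # continue, switching sites only if the roles actually change
        # (assumed meaning of "if needed").
        if current_sites != new_sites:
            return "continue, switch sites"
        return "continue"
    if storage_sites == new_sites:
        # Storage already matches the new production and mirror sites.
        return "continue"
    # Storage matches neither arrangement: the switchover fails.
    return "fail, status Indoubt (30)"


print(external_storage_outcome(("SITEA", "SITEB"), ("SITEA", "SITEB"), ("SITEB", "SITEA")))
# continue, switch sites
```

The sketch does not model the final rule in the list: if changing the sites itself fails, there is no backout and the status also becomes Indoubt (30).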

When switching over cluster resource groups of different types, the order of switchover is important. Device cluster resource groups should be switched over first, followed by data cluster resource groups, and finally application cluster resource groups.
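A caller that switches over several cluster resource groups could sort them by type before issuing the requests. The sketch below assumes each cluster resource group is represented as a hypothetical (name, type) pair with the type strings "device", "data", and "application"; neither the representation nor the function name comes from this API.

```python
# Recommended switchover order by cluster resource group type.
SWITCHOVER_ORDER = {"device": 0, "data": 1, "application": 2}


def order_for_switchover(crgs):
    """Sort (name, type) pairs into the recommended switchover order.

    sorted() is stable, so groups of the same type keep their
    original relative order.
    """
    return sorted(crgs, key=lambda crg: SWITCHOVER_ORDER[crg[1]])


crgs = [("APPCRG", "application"), ("DEVCRG", "device"), ("DATACRG", "data")]
print([name for name, _ in order_for_switchover(crgs)])
# ['DEVCRG', 'DATACRG', 'APPCRG']
```

Because the API is asynchronous, a real caller would presumably also wait for each request's completion message before starting the next switchover; that step is not shown here.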

If a cluster resource group has a status of Indoubt (30), the Start Cluster Resource Group API can be used to change the status to Active (10). See Start Cluster Resource Group (QcstStartClusterResourceGroup) API for more information.

This API requires:

Cluster Resource Services started on the node processing the request.
Cluster resource group status of Active (10).

This API operates in an asynchronous mode. See Behavior of Cluster Resource Services APIs for more information.

Restrictions:

This API cannot be called from a cluster resource group exit program.
This API is not allowed for peer cluster resource groups.

Authorities and Locks

The program that calls this API must be running under a user profile with *IOSYSCFG special authority.

Cluster Resource Group Authority: *CHANGE
Cluster Resource Group Library Authority: *EXECUTE
Cluster Resource Group Lock: *EXCL
Exit Program Authority (applies to user profile calling the API and user profile to run the exit program): *EXECUTE
Exit Program Library Authority (applies to user profile calling the API and user profile to run the exit program): *EXECUTE
User Profile Authority (applies to user profile to run the exit program): *USE
Request Information User Queue Authority: *OBJOPR, *ADD
Request Information User Queue Library Authority: *EXECUTE
Request Information User Queue Lock: *EXCLRD
Configuration Object Authority: *USE and *OBJMGT
Vary Configuration (VRYCFG) Command: *USE

Required Parameter Group

Request handle

OUTPUT; CHAR(16)

A unique string or handle that identifies this API call. It is used to associate this call to any responses placed on the user queue specified in the results information parameter.

Cluster name

INPUT; CHAR(10)

The name of the cluster containing the cluster resource group.

Cluster resource group name

INPUT; CHAR(10)

The name of the cluster resource group.

Exit program data

INPUT; CHAR(256)

256 bytes of data that is passed to the cluster resource group exit program when it is called. This parameter may contain any scalar data except pointers. For example, it can be used to provide state information. This data will be stored with the specified cluster resource group and copied to all nodes in the recovery domain. Pointers in this area will not resolve correctly on all nodes and should not be placed in the data. See Cluster Resource Group Exit Program for information about the cluster resource group exit program. The data specified will replace the existing exit program data stored with the cluster resource group. If blanks are specified, then the exit program data stored with the cluster resource group will be cleared. This parameter must be set to *SAME if no exit program is specified for the cluster resource group. The following special value can be used:

*SAME

The exit program data already stored with the specified cluster resource group will be passed to the exit program. The special value must be left justified.
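As an illustration of the 256-byte layout, the following sketch builds a buffer for this parameter. It rests on assumptions that are not stated in this text: that character data is encoded in EBCDIC code page 037 (the job CCSID on a real system may differ), and that unused trailing bytes are padded with blanks. The helper name `exit_program_data` is hypothetical.

```python
def exit_program_data(data=None, encoding="cp037"):
    """Build a 256-byte exit program data buffer (illustrative only).

    data: bytes to pass to the exit program, or None to request *SAME.
    encoding: assumed EBCDIC code page for character data.
    """
    blank = " ".encode(encoding)
    if data is None:
        # *SAME must be left justified; blank padding is an assumption.
        return "*SAME".ljust(256).encode(encoding)
    if len(data) > 256:
        raise ValueError("exit program data is limited to 256 bytes")
    # Pad to the fixed length with blanks (assumption).
    return data + blank * (256 - len(data))


buffer = exit_program_data()
print(len(buffer))
# 256
```

Passing a buffer that consists only of blanks would, according to the description above, clear the exit program data stored with the cluster resource group.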

Results information

INPUT; CHAR(30)

This parameter identifies a qualified user queue field and is followed by a reserved field.

Qualified user queue: Completion information is returned to this user queue, which exists on the node from which the API was called, after the function has completed. See the Usage Notes section of this API for a description of the data that is placed on this queue. This is a 20-character field. The first 10 characters contain the user queue name, and the second 10 characters contain the user queue library name. No special values are supported. QTEMP, *LIBL, and *CURLIB are not valid library names. The user queue must be a keyed queue.

Reserved: The last 10 characters of the 30-character results information are reserved. Each character in this field must be set to hexadecimal zero.
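The 30-byte layout of this parameter can be sketched as follows. The helper name `results_information` is hypothetical, and two details are assumptions rather than statements from this text: that names are padded on the right with blanks, and that they are encoded in EBCDIC code page 037.

```python
INVALID_LIBRARIES = {"QTEMP", "*LIBL", "*CURLIB"}


def results_information(queue_name, library_name, encoding="cp037"):
    """Build the 30-byte results information field (illustrative only)."""
    if library_name.upper() in INVALID_LIBRARIES:
        raise ValueError(f"{library_name} is not a valid library name here")
    for name in (queue_name, library_name):
        if not 1 <= len(name) <= 10:
            raise ValueError("names must be 1 to 10 characters long")
    # Bytes 1-10: user queue name; bytes 11-20: user queue library name.
    qualified = (queue_name.ljust(10) + library_name.ljust(10)).encode(encoding)
    # Bytes 21-30: reserved, each byte set to hexadecimal zero.
    return qualified + b"\x00" * 10


field = results_information("SWITCHQ", "MYLIB")
print(len(field))
# 30
```

The queue and library names SWITCHQ and MYLIB are placeholders; the queue itself must already exist on the calling node as a keyed user queue.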

Error code

I/O; CHAR(*)

The structure in which to return error information. For the format of the structure, see Error code parameter.

Usage Notes

Results Information User Queue

Asynchronous results are returned to a user queue specified by the Results Information parameter of the API. See Cluster APIs Use of User Queues and Using Results Information for details on how to create the results information user queue, the format of the entries, and how to use the data placed on the queue. The data is sent to the user queue in the form of a message identifier and the substitution data for the message (if any exists). The following identifies the data sent to the user queue (excluding the message text).

Message ID	Message Text
CPCBB01 C	Cluster Resource Services API &1 completed.
CPF18BA D	Error occurred with subsystem.
CPF2204 D	User profile &1 not found.
CPF26B6	Initialization program has ended with a hard error.
CPF2640	Vary command not processed.
CPF2659	Vary command may not have completed.
CPF3CF2 D	Error(s) occurred during running of &1 API.
CPF9801 D	Object &2 in library &3 not found.
CPF9802 D	Not authorized to object &2 in &3.
CPF9803 D	Cannot allocate object &2 in library &3.
CPF9804 D	Object &2 in library &3 damaged.
CPF9810 D	Library &1 not found.
CPFBB09 D	Cluster node &1 does not exist in cluster &2.
CPFBB0A D	Cluster node &1 in cluster &2 not active.
CPFBB0F D	Cluster resource group &1 does not exist in cluster &2.
CPFBB17 D	&1 API cannot be processed in cluster &2.
CPFBB18 D	Request &1 not allowed for cluster resource group &2.
CPFBB1E D	A switchover cannot be done for cluster resource group &1.
CPFBB2C D	Attributes for exit program &1 in library &2 are not valid.
CPFBB2D D	Timeout detected while waiting for a response.
CPFBB2E D	Job submission failed for cluster resource group &1 in cluster &2.
CPFBB38 D	Value &1 not allowed for library name.
CPFBB39 D	Current user does not have IOSYSCFG special authority.
CPFBB46 D	Cluster resource service internal error.
CPFBB5B D	Resource name &1 incorrect for configuration object &2 on node &3.
CPFBB66 D	Request failed for device cluster resource group &3.
CPFBB67 D	Ownership of hardware associated with configuration object &1 cannot be changed.
CPFBB69 A	Primary node &1 not current owner of hardware resource &2.
CPFBB6A D	Primary node &1 not current owner of specified devices.
CPFBB6C D	Hardware configuration is not complete for configuration objects in cluster resource group &1.
CPFBB6E E	Exit program data cannot be specified.
CPFBB7B D	Device type not correct for configuration object &1 on node &2.
CPFBB80 D	Request failed for device cluster resource group &3.
CPFBB90 D	Request failed for device cluster resource group &3.
CPFBB92 D	Hardware resource &1 not owned by node &3 or node &4.
CPFBB98 D	Hardware resource &1 not switchable.
CPFBB99 D	Request failed for device cluster resource group &3.
CPIBB10 D	Cluster resource group exit program &1 in library &2 on node &3 failed.
TCP1B01 D	Unable to determine if &1 interface started.
TCP1B02 D	Cannot determine if &1 interface started.
TCP1B05 D	&2 interface not started. Reason &1.
TCP1B10 D	&2 interface not started.
TCP1B11 D	&1 interface not started. Tried to exceed maximum number of active interfaces allowed.
TCP1B12 D	&1 interface not started. &1 interface already active.
TCP1B13 D	&1 interface not started. &1 interface not defined in the TCP/IP configuration.
TCP1B14 D	&1 interface not started. Line description &2 not found.
TCP1B15 D	Line description &2 unusable. Internal errors encountered.
TCP1B16 D	&2 interface not started.
TCP1B25 D	&1 interface not started.
TCP265F D	INTNETADR parameter value &2 not valid.
TCP1B61 D	Unable to determine if &1 interface ended.
TCP1B62 D	Cannot determine if &1 interface ended.
TCP1B65 D	&2 interface not ended. Reason &1.
TCP1B72 D	&1 interface not ended. &1 interface is not active.
TCP1B73 D	&1 interface not ended. &1 interface not defined in TCP/IP configuration.
TCP1B74 D	&1 interface not ended. Line description &2 not found.
TCP1B85 D	&1 interface not ended.
TCP3210 D	Connection verification statistics: &1 of &2 successful (&3).
TCP9999 D	Internal system error in program &1.

Error Messages

Messages that are delivered through the error code parameter are listed here. The data (messages) sent to the results information user queue are listed in the Usage Notes above.

Message ID	Error Message Text
CPF2113 E	Cannot allocate library &1.
CPF24B4 E	Severe error while addressing parameter list.
CPF3C1E E	Required parameter &1 omitted.
CPF3C39 E	Value for reserved field not valid.
CPF3CF1 E	Error code parameter not valid.
CPF3CF2 E	Error(s) occurred during running of &1 API.
CPF9801 E	Object &2 in library &3 not found.
CPF9802 E	Not authorized to object &2 in &3.
CPF9803 E	Cannot allocate object &2 in library &3.
CPF9804 E	Object &2 in library &3 damaged.
CPF980C E	Object &1 in library &2 cannot be in an independent auxiliary storage pool.
CPF9810 E	Library &1 not found.
CPF9820 E	Not authorized to use library &1.
CPFBB02 E	Cluster &1 does not exist.
CPFBB09 E	Cluster node &1 does not exist in cluster &2.
CPFBB0A E	Cluster node &1 in cluster &2 not active.
CPFBB0F E	Cluster resource group &1 does not exist in cluster &2.
CPFBB1E E	A switchover cannot be done for cluster resource group &1.
CPFBB26 E	Cluster Resource Services not active or not responding.
CPFBB2C E	Attributes for exit program &1 in library &2 are not valid.
CPFBB32 E	Attributes of user queue &1 in library &2 are not valid.
CPFBB39 E	Current user does not have IOSYSCFG special authority.
CPFBB44 E	&1 API cannot be called from a cluster resource group exit program.
CPFBB6E E	Exit program data cannot be specified.
CPFBB3B E	Request not allowed for cluster resource group type &1.

API introduced: V4R4
