Core group administration considerations

Core group configuration information is stored in a CoreGroup configuration object that is backed by a coregroup.xml document. Process-specific configuration information for each core group member is stored in a HAManagerService configuration object that is backed by a hamanagerservice.xml document.

The coregroup.xml document is a cell-scoped document. The master copy of this document is stored in the configuration repository for the deployment manager. A copy of this document is stored on every node in the cell. The coregroup.xml document includes the following configuration information:

  • The list of core group members
  • The high availability policies for the core group
  • The core group coordinator configuration information
  • The core group transport configuration information, including memory buffer size settings
  • Discovery and failure detection protocol configuration settings

The core group member process-specific configuration information stored in the hamanagerservice.xml document includes:

  • Whether the high availability manager is enabled.
  • The name of the core group to which the member belongs.
  • How frequently the high availability manager checks the health of highly available singletons running on the member, if a length of time is in effect for this function.

Core group configuration document

For transitioning users: Prior to Version 6.1.0.27, every core group had to contain at least one administrative process. In this version of the product, an administrative process is required only if one of the following conditions exists:
  • Your core group only contains members of one cluster.
  • You are running in a mixed cell environment. In a mixed cell environment, every core group that contains any Version 6.x members must also contain an administrative process.

The master copy of the core group configuration document is directly modified when direct attributes, such as the coordinator configuration, are modified. The master copy is implicitly modified when a server is created or deleted, or when a node is added or removed. In these implicit cases, the list of core group members is updated to reflect which processes are added or removed.

The set of core group members for which the View Synchrony Protocol is established is commonly referred to as a view. Whenever a view is installed, one of the core group members is elected to send its current configuration to all other members of the view. This processing ensures that all members of the view are running with a consistent core group configuration. This processing also means that inconsistencies in a high availability policy or coordinator configuration are tolerated. However, inconsistencies in the list of core group members or the core group transport are not tolerated.
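The distinction between tolerated and non-tolerated inconsistencies can be illustrated with a small sketch. This is not product code: the dictionary keys (`policies`, `coordinator`, `members`, `transport`) and the sample values are hypothetical stand-ins for the corresponding parts of the core group configuration, chosen only to show how the two categories described above differ.

```python
# Hypothetical keys standing in for sections of the core group configuration.
TOLERATED = {"policies", "coordinator"}      # differences here are tolerated
NOT_TOLERATED = {"members", "transport"}     # differences here are not tolerated


def classify_differences(local, remote):
    """Return (tolerated, not_tolerated) lists of sections that differ."""
    differing = {
        key
        for key in TOLERATED | NOT_TOLERATED
        if local.get(key) != remote.get(key)
    }
    return sorted(differing & TOLERATED), sorted(differing & NOT_TOLERATED)


# Illustrative configurations for two members (all values are made up).
local_config = {
    "members": ["server1", "server2"],
    "transport": "channel-framework",
    "policies": ["policyA"],
    "coordinator": {"count": 1},
}
remote_config = {
    "members": ["server1"],
    "transport": "channel-framework",
    "policies": ["policyA"],
    "coordinator": {"count": 2},
}

tolerated, not_tolerated = classify_differences(local_config, remote_config)
```

In this example the differing coordinator setting falls into the tolerated category, whereas the differing member list falls into the category that is not tolerated.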

If you modify a list of core group members, do not start a member of that core group until you are sure that the change is fully synchronized to all nodes in the cell. If a node agent is down when the configuration change is made, you must manually synchronize the configuration change before any processes are started on that node. If you do not manually synchronize the change, the process that is starting cannot establish the View Synchrony Protocol with the other core group members.

This failure occurs because, when a core group member starts, it reads the core group configuration information from the repository on the local node. It then opens connections to other core group members and attempts to establish the View Synchrony Protocol with them. If the local copy of the coregroup.xml document is not synchronized with the master core group configuration document, problems occur. For example, if the running processes have already dynamically reloaded the updated configuration, the configuration for the process that just started is out of sync with the configurations of the other core group members. If the update changed the list of core group members, the list is now inconsistent across the nodes in the cell, and any attempt to establish view synchrony fails because of these inconsistent member lists. When this condition is detected, an error message similar to the following message is logged:

DCSV8022I: DCS Stack {0} at Member {1}: Inconsistency of configured defined set 
with that of another member. Inconsistent member is {2}. The list of members only
in the local defined set is {3}, whereas the list of members only in the defined 
set at the inconsistent member is {4}.
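
The two lists reported at the end of that message can be read as the two sides of a set difference between the member lists known to each process. The following sketch shows that comparison under stated assumptions: the function name and the member names are hypothetical, and the product's actual comparison logic is not documented here.

```python
def defined_set_mismatch(local_members, remote_members):
    """Return (only_local, only_remote): members known to just one side."""
    local, remote = set(local_members), set(remote_members)
    return sorted(local - remote), sorted(remote - local)


# Hypothetical member names. The local node has a stale copy of the
# configuration that is missing a newly created server.
local_defined_set = ["node1/nodeagent", "node1/server1"]
remote_defined_set = ["node1/nodeagent", "node1/server1", "node2/server2"]

only_local, only_remote = defined_set_mismatch(local_defined_set, remote_defined_set)

# The member lists are consistent only when both differences are empty.
consistent = not only_local and not only_remote
```

A non-empty result on either side indicates that at least one node has not yet received the updated member list.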

When a process detects an inconsistent core group membership condition, the process attempts to reread the core group configuration several times. It is possible that the configuration document is in the process of being synchronized to the node. In such a case, rereading the configuration document can resolve the inconsistency. However, if the process cannot resolve the inconsistency after several attempts to reread the configuration, it stops trying. To recover from this situation, you must resynchronize the configuration and restart the process.
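
The reread behavior resembles a bounded retry loop. The sketch below is only an illustration of that pattern: the function name, the callback, and the values for `attempts` and `delay_seconds` are assumptions, because the text above states only that the configuration is reread "several times".

```python
import time


def resolve_membership(read_local_members, expected_members,
                       attempts=3, delay_seconds=0.0):
    """Reread the local member list up to `attempts` times.

    Returns True as soon as the local list matches the expected list,
    or False if every attempt still shows an inconsistency.
    """
    expected = set(expected_members)
    for attempt in range(attempts):
        if set(read_local_members()) == expected:
            return True
        if attempt < attempts - 1:
            time.sleep(delay_seconds)  # give synchronization time to finish
    return False


# Simulated rereads: synchronization completes before the third read.
snapshots = iter([["server1"], ["server1"], ["server1", "server2"]])
resolved = resolve_membership(lambda: next(snapshots), ["server1", "server2"])

# Simulated rereads where the node is never synchronized.
stuck = resolve_membership(lambda: ["server1"], ["server1", "server2"])
```

The second case corresponds to the situation in which the process gives up and the configuration must be resynchronized manually before the process is restarted.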

Core group process-specific configuration document

Unlike the cell-scoped core group configuration information that is contained in the coregroup.xml document, the process-specific configuration information for each core group member that is contained in the hamanagerservice.xml document cannot be dynamically reloaded. You must restart a process before core group process-specific configuration changes take effect.