Core group scaling considerations

The amount of system resources, such as CPU and memory, that the high availability manager consumes does not increase linearly as the size of a core group increases. For example, the View Synchrony Protocol that the high availability manager uses requires a large amount of these resources to maintain a tight coupling over the core group members. Therefore, the amount of resources that a large core group consumes might become significant.

View Synchrony Protocol resource requirements can vary considerably for different core groups of the same size. The amount of resources that the View Synchrony Protocol uses for a core group is determined by:
  • The number of applications that are running.
  • The type of applications that are running.
  • The high availability manager services that are used.

When setting up core group scalability, you must ensure that:

  • All of the processes within the cell are distributed properly into core groups of appropriate sizes. Properly distributing these processes limits the amount of resources that the View Synchrony Protocol consumes.
  • All of the processes within a given core group are properly configured to support the high availability services that are used within the core group.

Consider implementing one or more of the following scalability techniques to scale the high availability manager in large cells, even if your system is operating properly. The two most basic techniques are:

  • Disabling the high availability manager if it is not required.
  • Distributing the processes over a number of core groups and using a core group bridge to connect the core groups as required.

Adjusting the size of a core group

Core group size directly affects three aspects of high availability manager processing that impact resource usage:
  • The first and most significant aspect is the establishment of the View Synchrony Protocol over a set of active core group members. This activity is commonly referred to as a view change.
  • The second aspect is the regularly scheduled discovery and failure detection tasks that the high availability manager runs in the background.
  • The third aspect is the resource usage that results when other product components use high availability manager-provided services.

View changes

The View Synchrony Protocol creates a new view whenever it detects a change in the set of active core group members. A view change typically occurs whenever a core group member starts or stops. When a core group member starts, it opens a connection to all of the other running core group members. When a core group member stops, the other core group members detect that their open connections to the stopped member are closed. In either case, the View Synchrony Protocol must account for the change: for a newly started member, it must establish a view that includes the new member; for a stopped member, it must establish a new view for the surviving core group members that excludes the stopped member.

Establishing a new view is an important activity but uses a lot of system resources, especially for large core groups.
  • Each running core group member must communicate its current state to other core group members, including information about the messages it has sent or received in the current view.
  • All messages sent in a given view must be received and acknowledged by all recipients before a new view can be installed. Under normal operating conditions, acknowledgments can be delayed; completing all outstanding messages at a view change boundary in a timely fashion therefore requires aggressive acknowledgment and retransmission.
  • All core group members must transmit data regarding their current state, such as the set of other core group members to which they can actively communicate.

As the number of active members grows, installing a new view causes a larger, temporary, nonlinear increase in high availability manager CPU usage. It is significantly more expensive to add or remove a single member when 50 other core group members exist than it is when 20 other members exist.

Installing a new view also triggers state changes in the product components that use the high availability manager. For example, routing tables might need to be updated to reflect the started or stopped member, or a singleton service might need to be restarted on a new member.

The end result is that installing a new view results in a significant, transient spike in CPU usage. If core group sizes become too large, degenerate network timing conditions occur at the view change boundary. These conditions usually result in a failure during an attempt to install a new view. Recovery from such a failure is also CPU intensive. When insufficient CPU is available, or paging occurs, failures can quickly multiply.

Background tasks

The high availability manager periodically runs a number of background tasks, such as checking the health of the highly available singleton services that it is managing. Most of these background tasks consume trivial amounts of CPU. The exceptions are the regularly scheduled Discovery Protocol and Failure Detection Protocol.

The Discovery Protocol attempts to establish communications among core group members that are not currently connected, including processes that are not running. For a given core group that contains N core group members, of which M are currently running, each discovery period results in roughly M x (N – M) discovery messages. Therefore, creating a large number of processes that never start adversely affects the Discovery Protocol CPU usage.
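The discovery-message estimate above can be sketched directly as arithmetic. The function name below is illustrative only and is not part of the product:

```python
def discovery_messages(total_members: int, running_members: int) -> int:
    """Approximate discovery messages per discovery period for a core
    group with N defined members, M of which are running: M x (N - M)."""
    return running_members * (total_members - running_members)

# A core group with 50 defined members, of which only 20 are running,
# generates roughly 20 x 30 = 600 discovery messages per period.
print(discovery_messages(50, 20))  # 600

# If every defined member is running, no discovery messages are needed.
print(discovery_messages(50, 50))  # 0
```

Note that the cost is driven by the product of running and non-running members, which is why defining many processes that never start inflates Discovery Protocol CPU usage.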

Similarly, when the Failure Detection Protocol runs, each core group member sends heartbeats to all of its established connections to other core group members. For M active members, M x (M-1) heartbeat messages are sent. If aggressive failure detection is required, the size of the core group can adversely affect the amount of CPU usage that heartbeating between core group members consumes.

Smaller core groups positively affect the amount of CPU usage these two protocols consume. For example, if a core group contains 100 active members, 9900 heartbeat messages are sent during every failure detection period. Splitting the 100-member core group into five smaller core groups of 20 members reduces this number of messages to 1900, which is a significant reduction.
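The heartbeat arithmetic above can be checked with a short sketch (the function name is illustrative, not a product API); because the message count is quadratic in the number of active members, splitting one large core group into several smaller ones reduces it sharply:

```python
def heartbeat_messages(active_members: int) -> int:
    """Heartbeat messages per failure detection period for M active
    core group members: M x (M - 1)."""
    return active_members * (active_members - 1)

# One core group of 100 active members:
print(heartbeat_messages(100))     # 9900

# The same 100 members split into five core groups of 20:
print(5 * heartbeat_messages(20))  # 1900
```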

External usage
Other product components, such as workload management (WLM) and on demand configuration, use high availability manager-provided services, such as the live server state exchange, to maintain routing information. The amount of CPU that these components consume is linked to core group size. For example, the cost of using the live server state exchange to build highly available routing information grows with the size of the core group.

Distributing processes among multiple core groups

You can use two basic techniques to minimize the amount of resources that the view synchrony and related protocols consume:
  • You can disable the high availability manager on processes where the services that the high availability manager provides are not used.
  • You can keep core group sizes small.

The key to limiting the high availability manager CPU usage is to limit the size of the core group. Multiple small core groups are much better than one large core group. If you have large cells, create multiple core groups.

The hardware on which you are running the product is also a factor in determining the core group size that is appropriate for your environment.

Split large core groups into multiple, smaller core groups. If the resulting core groups need to share routing information, you can use core group bridges to bridge the core groups together.

Core group sizes

You can reasonably determine core group sizes provided you initially perform the following tuning activities:
  • Setting the IBM_CS_WIRE_FORMAT_VERSION core group custom property to a value of 6.1.0 enables core group internal protocol improvements. These improvements are available only in Version 6.1 and later.
  • Setting the IBM_CS_HAM_PROTOCOL_VERSION core group custom property to a value of 6.0.2.31 provides significant improvements in the memory utilization and failover characteristics of core group bridges.
  • You can adjust transport memory settings. There are two memory or buffer size settings associated with the core group transport. The default values for these settings are sufficient for small core groups of 50 members or less. For core groups of over 50 members, these settings must be increased from the default values.
    Note: Increasing the value of these transport memory settings does not directly translate into more statically allocated memory or long-term memory usage by the high availability manager.
Provided that WebSphere Application Server Network Deployment runs on relatively modern hardware, CPU and memory are not constrained, applications are well behaved, the network is stable, and factors affecting core group stability, such as network and memory issues, are addressed, the following core group sizes can be considered:
  • Core groups of up to 100 members should work without issue.
  • Core groups containing more than 100 members should also work without issue in many topologies. However, core groups of more than 200 members are not recommended.

Adjusting individual core groups based on the application mix and services used

You might need to further adjust individual core groups based on the application mix and the high availability services that the core group members use.

  • Adjust how frequently the default Discovery Protocol and the default Failure Detection Protocol run if the default settings are not appropriate.
  • Configure the core group coordinator to run on a specific process or set of processes.
  • Partition the coordinator across multiple instances if the consumption of resources by the coordinator process is noticeable.
  • Configure the amount of memory that is available to the distribution and consistency services (DCS) and reliable multicast messaging (RMM) components for sending network messages when congestion is detected. Congestion can occur under some conditions, even though memory-to-memory replication is not used.

Adjusting ephemeral port ranges

The number of sockets that a core group uses is usually not a major concern. Each core group member must establish a connection with every other member of that core group. Therefore, the number of connections grows quadratically (on the order of n-squared), and each connection requires two sockets, one on each end of the connection. Because multiple machines are typically involved, normally you do not have to be concerned about the number of sockets that a core group uses. However, if you have an abnormally large number of core group members that are running on a single machine, you might have to adjust the operating system parameters that are related to ephemeral port ranges. Default ephemeral port ranges differ among operating systems.
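As a rough sketch of why co-locating many members matters, each member holds one socket per connection to each of the other n - 1 members, so k members on one machine hold roughly k x (n - 1) sockets for core group traffic. The function name below is illustrative, not part of the product:

```python
def core_group_sockets(members_on_machine: int, core_group_size: int) -> int:
    """Approximate sockets held on one machine for core group
    connections: each of the k co-located members keeps one socket
    per connection to each of the other (n - 1) members."""
    return members_on_machine * (core_group_size - 1)

# 2 members of a 50-member core group on one machine: a modest count.
print(core_group_sockets(2, 50))   # 98

# 40 members of the same core group packed onto one machine:
print(core_group_sockets(40, 50))  # 1960
```

A count in the low thousands on a single machine is the kind of situation where ephemeral port range tuning can become necessary.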

Best practice: Although defining ports in the ephemeral range is not an incorrect configuration, doing so might cause unpredictable behavior. Each operating system has its own ephemeral port range, which you can configure at the operating system level. Because of these individual operating system configurations, it is not possible to document specific port numbers.