Configuring distributed DVIPAs — sysplex distributor

A distributed DVIPA exists on several stacks, but is advertised outside the sysplex by only one stack. This stack receives all incoming connection requests and routes each of them to one of the stacks in the distribution list for processing. This distributes the workload of incoming requests and provides additional protection if a server fails.

You can distribute connections destined for a dynamic VIPA (DVIPA) by adding a VIPADISTRIBUTE configuration statement for a previously defined dynamic VIPA. The order of the statements is important. The VIPA is first defined with the VIPADEFINE statement and then included on a VIPADISTRIBUTE statement. Another TCP/IP stack can act as a backup for the distributed DVIPA if it has a properly coded VIPABACKUP statement; the backup performs the routing function if the primary stack fails. The options specified on a VIPADISTRIBUTE statement are inherited by a backup stack unless the backup stack has its own VIPADISTRIBUTE statement for that DVIPA, in which case it uses its own VIPADISTRIBUTE statement for distributing. You can also code a VIPADISTRIBUTE statement only on the stack that has the VIPABACKUP statement, and not on the stack that has the VIPADEFINE statement. In that case, workload is distributed only during an outage of the primary stack.
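As a minimal sketch of the backup arrangement described above, the following profile fragment for a second stack backs up the DVIPA 9.67.240.02 and codes its own distribution. The rank of 100 and the shortened target list are assumptions chosen only for illustration; lines that begin with a semicolon are comments.

```text
; Backup stack profile (sketch; rank and target list are assumed values)
VIPADYNAMIC
  ; Back up the DVIPA that the primary stack defines with VIPADEFINE
  VIPABACKUP 100 9.67.240.02
  ; Optional: this stack's own distribution, used instead of the
  ; options inherited from the primary stack after a takeover
  VIPADISTRIBUTE DEFINE 9.67.240.02 PORT 20 21 8000 9000 DESTIP
          193.9.200.2
          193.9.200.6
ENDVIPADYNAMIC
```

If the VIPADISTRIBUTE statement is omitted from this fragment, the backup stack inherits the distribution options of the primary stack. If it is coded only here and not on the primary stack, connections are distributed only while the backup stack owns the DVIPA.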

You can change the distribution of a DVIPA after a backup stack has activated it. However, if the backup stack did not have its own distribution defined by a VIPADISTRIBUTE statement before it activated the DVIPA, any distribution changes made while the DVIPA is active on the backup stack are temporary. Those changes remain in effect while the DVIPA is active on the backup stack, but are not remembered if this stack takes over the DVIPA again in the future.

The following example is a properly coded distributed DVIPA:

; Register with WLM and define the dynamic XCF addresses
IPCONFIG SYSPLEXROUTING DYNAMICXCF 193.9.200.4 255.255.255.240 1
IPCONFIG6 DYNAMICXCF 2000::93:9:200:4
VIPADYNAMIC
  ; Define the IPv4 DVIPA first, then distribute it
  VIPADEFINE 255.255.255.192 9.67.240.02
  VIPADISTRIBUTE DEFINE 9.67.240.02 PORT 20 21 8000 9000 DESTIP
          193.9.200.2
          193.9.200.4
          193.9.200.6
  ; Define the IPv6 DVIPA by interface name, then distribute it
  VIPADEFINE V6DVIPA1 2000::9:67:240:2
  VIPADISTRIBUTE DEFINE V6DVIPA1 PORT 20 21 8000 9000 DESTIP
          2000::93:9:200:2
          2000::93:9:200:4
          2000::93:9:200:6
ENDVIPADYNAMIC

Prior to z/OS® V1R6 Communications Server, the TCP/IP stack that was configured as a distributor of dynamic VIPAs was required to enable IP forwarding by using the DATAGRAMFWD parameter on the IPCONFIG (or IPCONFIG6) TCP/IP profile statement. This is no longer a requirement for distributing dynamic VIPAs, which benefits installations that do not want to configure their TCP/IP stack as a forwarding node. However, if your installation is configured such that target TCP/IP stacks have only XCF connectivity, datagram forwarding must still be configured on the distributor, because all packets originating from the targets are forwarded by the distributor.
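For the XCF-only case described above, a distributor profile might enable datagram forwarding as in the following sketch. It assumes the dynamic XCF addresses used in the example in this topic and is not a complete profile.

```text
; Distributor profile (sketch): target stacks have only XCF connectivity,
; so the distributor must forward the packets that the targets send
IPCONFIG DATAGRAMFWD SYSPLEXROUTING DYNAMICXCF 193.9.200.4 255.255.255.240 1
IPCONFIG6 DATAGRAMFWD DYNAMICXCF 2000::93:9:200:4
```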

Several configuration changes can affect the method that the distributing stack uses to forward connections to the target stacks. In each of the following items, the term "all participating stacks" refers to the distributing stack and all target stacks.

WLM-based forwarding based on target system workload
If the DISTMethod BASEWLM parameter is specified on the respective VIPADISTRIBUTE statement, or if the DISTMethod parameter is not specified, this distribution method is enabled. This is the default distribution method. To enable the distributing stack to forward connections based upon the workload of each of the target stacks, specify SYSPLEXROUTING on the IPCONFIG statement in all participating stacks. This registers all participating stacks with WLM and enables the distributing stack to request workload information from WLM.

The WLM workload information is based on a comparison of available general CPU capacity for each target system. If the application uses System z® Application Assist Processor (zAAP) capacity or System z Integrated Information Processor (zIIP) capacity, you can configure the VIPADISTRIBUTE statement so that available zAAP CPU capacity and zIIP CPU capacity are also considered. For these additional processor types to be considered, the distributor and all target systems used by this application must be at z/OS V1R9 Communications Server or later. If you need to consider zAAP and zIIP CPU capacity, evaluate whether you can use SERVERWLM distribution as an alternative to BASEWLM distribution for this application. SERVERWLM distribution has the advantage that processor proportions are automatically determined and dynamically updated by WLM based on the actual CPU usage of the application. If you need BASEWLM distribution, determine the processor proportions to configure by studying the application's use of assist processors, for example by analyzing SMF records or reports from performance monitors such as RMF™.

WLM-based forwarding based on server-specific workload
If the DISTMethod SERVERWLM parameter is specified on the respective VIPADISTRIBUTE statement, the distributing stack selects from the available servers for a DVIPA/port and forwards connections based on a WLM recommendation indicating how well each server is executing on its system. To enable the distributing stack to forward connections based on server-specific workload, specify SYSPLEXROUTING on the IPCONFIG statement in all participating stacks.

If the server uses System z Application Assist Processor (zAAP) capacity or System z Integrated Information Processor (zIIP) capacity, processor proportions are automatically determined and dynamically updated by WLM, based on the actual CPU usage of the application; however, you can influence the WLM server-specific recommendation with configuration options on the VIPADISTRIBUTE statement. You can use the PROCXCOST parameter on the VIPADISTRIBUTE statement so that the WLM recommendation favors servers with available zAAP or zIIP capacity over servers on which work targeted for the specialty processors might instead run on the conventional processor. You can also use the ILWEIGHTING parameter on the VIPADISTRIBUTE statement to influence how aggressively the WLM recommendation favors servers on systems with displaceable capacity at lower importance levels over servers on systems with displaceable capacity at higher importance levels. For WLM to consider these additional factors, all systems must be running z/OS V1R11 Communications Server or later.

WLM/QoS-based forwarding
Regardless of whether BASEWLM or SERVERWLM weights are being used, to enable the distributing stack to forward connections based upon a combination of workload information and network performance information (TCP retransmissions and time-outs), specify SYSPLEXROUTING on the IPCONFIG statement in all participating stacks, and also define a sysplex distributor performance policy on the target stacks. For information on configuring these policies, see Sysplex distributor policy example.

Round-robin forwarding
If the DISTMethod ROUNDROBIN parameter is specified on the respective VIPADISTRIBUTE statement, the distributing stack uses a round-robin mechanism to select one of the DVIPA/port targets for each connection.

Weighted active forwarding
If the DISTMethod WEIGHTEDActive parameter is specified on the respective VIPADISTRIBUTE statement, the distribution of incoming TCP connection requests is balanced across the targets, such that the number of active connections on each target is proportionally equivalent to a configured active connection weight for each target. However, server-specific abnormal completion information, server-specific health information, and the TSR value are used to reduce the active connection weight when these indicators are not optimal. To enable the distributing stack to use server-specific abnormal completion and health information to affect the active connection weight, specify SYSPLEXROUTING on the IPCONFIG statement for all participating stacks.

Hot-standby forwarding
If the DISTMethod HOTSTANDBY parameter is specified on the respective VIPADISTRIBUTE statement, one preferred target server and one or more backup (hot-standby) target servers are configured. The distributing stack does not perform load balancing of new connection requests across multiple targets; instead, the preferred target server with an active listener receives all new incoming connection requests, and the hot-standby target servers, which typically also have a ready listener application, do not receive any new connection requests. If the preferred target server becomes unavailable, then the highest ranked backup server becomes the active target server and receives all new connection requests. To enable the distributing stack to use server-specific abnormal completion and health information to affect the availability of the preferred target, specify SYSPLEXROUTING on the IPCONFIG statement for all participating stacks.

Target-controlled forwarding
If the DISTMethod TARGCONTROLLED parameter is specified on the respective VIPADISTRIBUTE statement, distribution is controlled using weights that are received from the targets. This distribution method can be used with only non-z/OS tier 1 targets, such as DataPower® appliances. For more information, see Sysplex distribution with DataPower.

Result: If you have not made the changes needed to enable WLM-based forwarding (SYSPLEXROUTING has not been specified for all participating stacks), and BASEWLM or SERVERWLM is specified, the distributing stack will use round-robin forwarding to distribute connections.

Tip: The weights received from WLM are returned based on the first 24 characters of the HOSTNAME value. The HOSTNAME value is determined by the search path at initialization time. To ensure correct distribution, verify that the first 24 characters of each HOSTNAME value are unique for every system.
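The distribution methods listed above are selected with the DISTMethod parameter on the VIPADISTRIBUTE statement. The following fragment sketches several alternatives for the DVIPA 9.67.240.02; code only one of them for a given DVIPA and port. All numeric values are assumptions chosen only for illustration, and the PROCTYPE, WEIGHT, PREFERRED, and BACKUP sub-parameters do not appear in this topic, so verify their exact syntax and value ranges in z/OS Communications Server: IP Configuration Reference before use.

```text
; Alternative VIPADISTRIBUTE statements (sketch; values are assumed).
; Each belongs between VIPADYNAMIC and ENDVIPADYNAMIC, after the
; VIPADEFINE statement for 9.67.240.02.

; Target system workload, also weighing zAAP and zIIP capacity
VIPADISTRIBUTE DEFINE DISTMETHOD BASEWLM PROCTYPE CP 20 ZAAP 40 ZIIP 40
          9.67.240.02 PORT 8000 DESTIP 193.9.200.2 193.9.200.6

; Server-specific workload, with crossover cost and importance weighting
VIPADISTRIBUTE DEFINE DISTMETHOD SERVERWLM PROCXCOST ZAAP 20 ZIIP 100
          ILWEIGHTING 1
          9.67.240.02 PORT 8000 DESTIP 193.9.200.2 193.9.200.6

; Round-robin selection of targets
VIPADISTRIBUTE DEFINE DISTMETHOD ROUNDROBIN
          9.67.240.02 PORT 8000 DESTIP 193.9.200.2 193.9.200.6

; Weighted active: the second target carries twice the active connections
VIPADISTRIBUTE DEFINE DISTMETHOD WEIGHTEDACTIVE
          9.67.240.02 PORT 8000
          DESTIP 193.9.200.2 WEIGHT 20 193.9.200.6 WEIGHT 40

; Hot standby: one preferred target and one ranked backup target
VIPADISTRIBUTE DEFINE DISTMETHOD HOTSTANDBY
          9.67.240.02 PORT 8000
          DESTIP 193.9.200.2 PREFERRED 193.9.200.6 BACKUP 50
```

For the BASEWLM and SERVERWLM alternatives to take effect, SYSPLEXROUTING must also be specified on the IPCONFIG statement in all participating stacks; otherwise the distributing stack falls back to round-robin forwarding, as the Result note states.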

Regardless of the distribution method used, sysplex distributor routing policies can further affect the distribution of connections. Sysplex distributor routing policies, configured on the distributing stack, are used to specify a set of target stacks for a given set of traffic. For example, all traffic destined to a given port/DVIPA from a specified subnet can be assigned one group of target stacks, while traffic for the same port/DVIPA from another subnet can be assigned to a different group of target stacks. For more information on configuring these types of policies, see Sysplex distributor policy example.

Distribution of connections to target servers can also be affected by the responsiveness of the target stacks or target servers. The sysplex distributor stack monitors how well target servers respond to connection setup requests, calculating a target server responsiveness (TSR) value for each server.

For WLM-based forwarding based on target system workload, WLM/QoS-based forwarding, WLM-based forwarding based on server-specific workload, or weighted active forwarding, new connection setup requests are diverted away from target servers that are handling connection setup requests relatively poorly. For round-robin forwarding, if the sysplex distributor determines that a target server is not successfully accepting connection setup requests at all, that target server is bypassed. Periodically, the distributor sends a new connection request to a target with a TSR of 0, to check whether the responsiveness of that target has improved.

By default, the sysplex distributor updates the status of each target server at 1-minute intervals.

The default 1-minute interval is sufficient for most workloads. However, in some environments, particularly when the load on each target system will be close to 100% capacity and when the workload consists of a high volume of short-lived connections, you might want to use a shorter interval so that the distributor reacts faster to changes in a target server's status. You can change the interval by using the SYSPLEXWLMPOLL parameter on the GLOBALCONFIG statement.
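A shorter polling interval might be coded as in the following sketch. The value of 30 seconds is an assumption chosen only for illustration; select a value that suits your workload.

```text
; Poll for target server status every 30 seconds instead of every
; 60 seconds (30 is an assumed value, not a recommendation)
GLOBALCONFIG SYSPLEXWLMPOLL 30
```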

Each distributing stack and each target stack must have an IPv4 or IPv6 DYNAMICXCF address, or both. This address is used by other distributing stacks as a destination point. When using sysplex distributor, do not define an IUTSAMEH link. These links will be created automatically from the DYNAMICXCF statement. See z/OS Communications Server: IP Configuration Reference for directions for coding DYNAMICXCF on the IPCONFIG or IPCONFIG6 statements. For more information on additional configuration parameters required, also see the usage notes related to the DYNAMICXCF parameter under the IPCONFIG or IPCONFIG6 statements in z/OS Communications Server: IP Configuration Reference.

The VIPADISTRIBUTE statement specifies how new connection requests are routed to a set of candidate target stacks. The DVIPA on a VIPADISTRIBUTE statement can be followed by up to 64 ports. The distributed DVIPA example earlier in this topic shows the well-known ports for FTP (20 and 21) and the ports for a custom application (8000 and 9000).

Up to 32 target TCP/IP stacks follow the DESTIP keyword and are identified by their respective dynamic XCF IP addresses. Alternatively, the VIPADISTRIBUTE statement can specify DESTIP ALL, in which case all current and future stacks with activated dynamic XCF can participate in the distribution as candidate target stacks. When an application begins listening on one of the specified ports on a listed TCP/IP stack, the routing stack begins to forward connections to that stack.
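As a brief sketch, the IPv4 portion of the example in this topic could use DESTIP ALL instead of an explicit list of dynamic XCF addresses. The DVIPA, mask, and ports are reused from that example.

```text
VIPADYNAMIC
  VIPADEFINE 255.255.255.192 9.67.240.02
  ; Every current and future stack with activated dynamic XCF
  ; is a candidate target for these ports
  VIPADISTRIBUTE DEFINE 9.67.240.02 PORT 20 21 8000 9000 DESTIP ALL
ENDVIPADYNAMIC
```

With this form, a stack that joins the sysplex later becomes a candidate target without a profile change on the distributor, whereas an explicit list limits distribution to the named stacks.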

Guideline: Code a static route that represents the shared IP address in the attached network router to maintain connectivity if all of the following conditions are true: you are using OMPROUTE for connectivity to a dynamic VIPA; the LPAR with the distributing stack that owns that dynamic VIPA is the only target stack (no remote target stacks are available); and a HiperSockets™ (iQDIO) interface is not configured. Without the static route, because a HiperSockets interface is not configured, OMPROUTE does not advertise the routing information representing the shared IP address for the dynamic XCF interfaces in that LPAR, and the address might become unreachable because the interfaces to the remote target stacks are deleted or marked inactive. Unlike IUTSAMEH and DXCF interfaces, a HiperSockets interface that shares an IP address can remain active when no more remote target stacks are available, and OMPROUTE advertises the routing information to the neighboring network routers (for example, Cisco) for connectivity to the shared IP address.

For more information about sysplex distributor, see IBM® z/OS V1R13 Communications Server TCP/IP Implementation, Volume 3: High Availability, Scalability, and Performance, SG24-7998.