Sysplex problem detection and recovery

Each sysplex member monitors itself and can automatically leave the sysplex if it determines that it is no longer able to function correctly as a router, target, backup, or owner of a DVIPA. Through a variety of methods, it monitors:

As long as the TCP/IP stack is a member of a TCP/IP sysplex group, the sysplex monitor gets control periodically. The interval is set by the TIMERSECS value specified on the SYSPLEXMONITOR parameter of the GLOBALCONFIG statement in PROFILE.TCPIP. The default TIMERSECS value is 60 seconds.
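
As a minimal sketch, the monitoring interval could be changed in PROFILE.TCPIP as follows. The value 30 is illustrative only and is not a recommendation; check the permitted range in the IP Configuration Reference before coding it.

```text
; Illustrative value: give the sysplex monitor control every 30 seconds
; instead of the default 60 seconds.
GLOBALCONFIG SYSPLEXMONITOR TIMERSECS 30
```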

After a problem is detected, further actions depend on whether RECOVERY or NORECOVERY was specified on the SYSPLEXMONITOR parameter of the GLOBALCONFIG statement. NORECOVERY is the default value.
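
Because NORECOVERY is the default, recovery has to be requested explicitly. A minimal sketch of a profile statement that does so, assuming no other SYSPLEXMONITOR options are needed on this stack:

```text
; Override the NORECOVERY default for this TCP/IP stack.
GLOBALCONFIG SYSPLEXMONITOR RECOVERY
```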

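Joining the TCP/IP sysplex group can be delayed until certain network routing and connectivity availability conditions are met. Any combination of the following conditions can be required:

- OMPROUTE is active.
- Monitored interfaces are present.
- Dynamic routes over the monitored interfaces are present.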

The first condition is activated using the DELAYJOIN option on the SYSPLEXMONITOR parameter of the GLOBALCONFIG statement. The other two conditions are activated using the MONINTERFACE or MONINTERFACE DYNROUTE options on the same parameter. No sysplex-related definitions within the TCP/IP profile (that is, VIPADYNAMIC and DYNAMICXCF statements) are processed until the sysplex group is joined.
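
The options named above can be combined on one statement. The following is a sketch only, assuming a stack on which all three conditions should hold before the sysplex group is joined; which interfaces count as monitored is defined elsewhere in the profile and is not shown here.

```text
; Assumed example; adjust to your configuration.
; DELAYJOIN              activates the first condition
; MONINTERFACE           activates monitoring of designated interfaces
; MONINTERFACE DYNROUTE  also monitors dynamic routes over those interfaces
GLOBALCONFIG SYSPLEXMONITOR DELAYJOIN MONINTERFACE DYNROUTE
```

With a statement like this, VIPADYNAMIC and DYNAMICXCF definitions in the same profile would not be processed until the stack joins the sysplex group.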

Tips:

During a planned or unplanned outage, the DVIPAs and distributed DVIPAs for a TCP/IP stack are taken over by backup TCP/IP stacks. When the primary TCP/IP stack is restarted, it takes the DVIPAs and distributed DVIPAs back from the backup TCP/IP stacks. If dynamic routing is used to advertise routes to these DVIPAs, and the specified network routing and connectivity availability conditions are not yet met on the primary TCP/IP stack, existing connections to these DVIPAs might be reset and new connection requests to these DVIPAs might fail. By coding the DELAYJOIN and MONINTERFACE DYNROUTE options on the SYSPLEXMONITOR parameter of the GLOBALCONFIG statement in the TCP/IP profile of the primary TCP/IP stack, you can delay taking back the DVIPAs and distributed DVIPAs until those conditions are met. New and existing connections continue to be serviced by the backup TCP/IP stacks until OMPROUTE is active and the monitored interfaces, and dynamic routes over those monitored interfaces, are present on the primary TCP/IP stack.

For more information about the GLOBALCONFIG statement and its parameters, see z/OS Communications Server: IP Configuration Reference.