Troubleshooting EE problems

Table 1 lists some common EE problems that you might encounter, with suggested remedies.

Table 1. Troubleshooting EE problems
Problem indication	Avoidance method or suggested remedy
Line activation failure: Pending act link PGAIN failure	Even when all of the rules for determining the local static VIPA address (or addresses) for EE are followed, EE line activation can still fail for one of the following reasons: An incorrect TCP/IP stack name was specified with the TCPNAME VTAM® start option. An incorrect source VIPA address was specified with the IPADDR VTAM start option, or on the XCA GROUP definition statement. An incorrect source VIPA address was resolved from the host name specified as the HOSTNAME VTAM start option, or on the XCA GROUP definition statement. A host name could not be resolved for the specified source VIPA address. The IPv4 or IPv6 same host device is not defined or is not started by the appropriate TCP/IP stack. If the activation fails, perform one of the following recovery actions: Either stop TCP/IP and activate the correct TCP/IP stack; LINE (and thus PORT) activation should then complete normally. Alternatively, individually deactivate (force) the LINEs. After the last LINE is deactivated, the PORT deactivates, and therefore the XCA major node is also deactivated. Then change the TCPNAME, IPADDR, or HOSTNAME start option to the correct value, or update the XCA major node definition to specify the correct IPADDR or HOSTNAME value, and reactivate the XCA major node.
Activation failure with the following messages: `IST1411I INOP GENERATED FOR linename` `IST1430I REASON FOR INOP IS XID OR LDLC COMMAND TIMEOUT` `IST314I END`	This message group is issued when VTAM is not receiving responses to XID requests during activation. It indicates that either the partner is not responding to the request or there are connectivity problems within the IP infrastructure. Some common setup problems that cause this are: IP connectivity has been lost within your network. EE UDP ports are not defined with consistent values across the network. EE ports should be defined in the range of 12000 - 12004. EE is not enabled on the remote endpoint. For example, lines and switched PUs are not defined, not activated, or both. The EE connection path traverses one or more firewalls that do not allow UDP traffic to flow for EE ports 12000 - 12004. If NAT is used in the EE connection path, adhere to these rules: Avoid NAPT. EE does not support NAPT. When a one-to-one address translation function is performed, ensure that the name-to-address resolution mapping for the host name yields the correct NAT address. If connection network is being used with NAT, you must use HOSTNAME definitions when defining your virtual routing node. To diagnose the problem, use the D NET,EEDIAG,TEST=YES command. Other helpful commands are PING and TRACERTE.
LU 6.2 sessions do not stay up over EE; your session unexpectedly ends	This problem usually indicates that a limited resource is in use somewhere along the session path. For predefined EE connections, use DISCNT=NO (the default). For EE VRN-based dynamic connections, consider coding a DYNTYPE=VN model with DISCNT=NO or a delay value of 60 seconds or more. Important note for CICS® LU 6.2 users: Specifying DISCNT=NO prevents CICS from terminating its sessions at the end of every transaction.
Active EE connection unexpectedly fails with the messages: `IST1411I INOP GENERATED FOR linename` `IST1430I REASON FOR INOP IS XID OR LDLC COMMAND TIMEOUT` `IST314I END`	The EE connection was deactivated because of an LDLC timeout. EE periodically tests the EE partner to verify IP connectivity and that the partner is still there. When the tests are unanswered, the EE connection ends with these messages. Some common causes are: The partner unexpectedly ended. IP connectivity has been lost within your network. There are OMPROUTE problems.
Diagnosis of connection failures or performance considerations	Use the TCP/IP packet trace to analyze Enterprise Extender-related packets flowing to and from a TCP/IP stack on a z/OS® Communications Server host. You can use the PKTTRACE statement to copy IP packets as they enter or leave TCP/IP, and then examine the contents of the copied packets. The method of capturing and formatting this trace is described in z/OS Communications Server: IP Diagnosis Guide.
Dial collision problems	See Dial usability - DWACT, DWINOP, KEEPACT, REDIAL, and REDDELAY for more information about dial collisions.
Poor throughput when using PSRETRY	After each path switch, HPR resets its sending rate to the initial value, so frequent path switches can lead to reduced throughput. In particular, setting PSWEIGHT to EQUAL or SAMEROUT can lead to an excessive number of path switches. See z/OS Communications Server: SNA Resource Definition Reference for information about the START option.
Poor HPR throughput over EE with multipath enabled	If MULTIPATH is enabled on the TCP/IP stack, the VTAM start option MULTPATH is set to TCPVALUE, and multiple equal-cost routes exist to the partner EE node, then TCP/IP distributes batches of EE packets across these routes in round-robin fashion. If one of these routes cannot reach the partner EE node, then EE might not activate, or if it does, there can be significant performance impacts. See z/OS Communications Server: SNA Diagnosis Vol 1, Techniques and Procedures for information about DISPLAY commands. Code the MULTPATH start option as NO, or let it default to NO. This disables multipath routing in the TCP/IP stack for EE traffic only. The value can be updated dynamically using the MODIFY VTAMOPTS command.
High CPU utilization in a branch environment (with lots of EE connections always active)	If you are using many Enterprise Extender connections and these connections are kept indefinitely active (the switched PU associated with the EE connection has DISCNT=NO, either specified or defaulted), the following functions can reduce the CPU utilization by VTAM: HPR alive timer optimization - This function is enabled by default and is controlled by the VTAM start option HPREELIV. For full details of this function, see HPR ALIVE timer optimization for Enterprise Extender. LDLC Keep-Alive reduction - This function requires you to specify an operand for the LIVTIME Enterprise Extender PORT option. For full details of this function, see Enterprise Extender LDLC keep-alive reduction.
EE connections through the connection network are not rerouting to an alternate path	If the EE connection network path has the lowest weight of any available path to the partner node, any attempt to redial the partner node will continue to try the path over this particular VRN. This is likely to result in failures until the underlying problem with the path is corrected. EE connection network reachability awareness is designed to detect the dial failure or connection INOP for the connection over an Enterprise Extender connection network and prevent that specific path to the partner node from being used for a period of time. Use the EE connection network reachability awareness function to indicate that the path to a partner node over an Enterprise Extender VRN should not be used for route selection for a period of time after the initial dial failure or connection INOP, providing time for the underlying connection problem to be corrected. You can enable this function in either of the following ways: Specify the UNRCHTIM start option. Specify the UNRCHTIM operand on either the EE XCA major node PORT or GROUP definition statements.
A new EE connection is established between you and a partner company but sessions cannot be established.	This is generally because the firewalls are not allowing UDP traffic on all EE ports. The firewall must allow UDP traffic both inbound and outbound on all five EE ports (12000 - 12004). To assist in diagnosing new EE connection problems, use this command: D NET,EEDIAG,TEST=YES. See z/OS Communications Server: SNA Operation for more information.
A new EE connection fails with the EE health verification failure message: `IST2330I EE HEALTH VERIFICATION FAILED FOR puname AT time`	This is generally because the firewalls or intermediate routers are not allowing UDP traffic on all EE ports. Any firewall or intermediate router must allow UDP traffic both inbound and outbound on all five EE ports, typically 12000 - 12004. To assist in diagnosing new EE health verification failures, use the following command: `D NET,EEDIAG,TEST=YES` See z/OS Communications Server: SNA Operation for more information. Review the Enterprise Extender Connectivity Test output for any unsuccessful test results. See DISPLAY EEDIAG,TEST=YES in z/OS Communications Server: SNA Diagnosis Vol 1, Techniques and Procedures for information about analyzing the test output.
EE health verification fails on active connections. The following eventual action message is issued on the console: `IST2323E EE HEALTH VERIFICATION FAILED FOR ONE OR MORE CONNECTIONS`	Use the DISPLAY NET,EE,LIST=EEVERIFY command to determine which EE connections are experiencing EE health verification failures. Message IST2325I is displayed for each line or PU that failed health verification on the most recent LDLC probe to its remote partner. Use the DISPLAY NET,ID=linename or puname command to get more information, including the local and remote IP addresses. Determine the network connectivity problems between this node and the remote partners. Use the DISPLAY NET,EEDIAG,TEST=YES command to determine the reason for the failure. See DISPLAY EEDIAG,TEST=YES in z/OS Communications Server: SNA Diagnosis Vol 1, Techniques and Procedures for more information.
A new EE connection comes up with the EE health verification not supported message: `IST2342I EE HEALTH VERIFICATION NOT SUPPORTED BY puname`	During the activation of the EE connection, VTAM sends Logical Data Link Control (LDLC) probes to the remote partner to determine whether all five ports are accessible. VTAM does not receive a response to any of the LDLC probe requests. VTAM continues with the activation of the EE connection between this node and the remote partner. Because VTAM receives no replies to its LDLC probe requests, VTAM determines that the remote partner does not support EE health verification. If EE health verification is required for this PU, contact the remote PU owner about upgrading the PU to support EE health verification probes. If you think that EE health verification is supported by the remote partner, use the following command: `D NET,EEDIAG,TEST=YES` See z/OS Communications Server: SNA Operation for more information. Review the Enterprise Extender Connectivity Test output for any unsuccessful test results. See DISPLAY EEDIAG,TEST=YES in z/OS Communications Server: SNA Diagnosis Vol 1, Techniques and Procedures for information about analyzing the test output.
The EE connection link terminates because of XID or LDLC timeout.	Consider tuning the LDLC parameters as described in When does the EE connection go away?. Also consider using the D NET,EEDIAG,SRQRETRY command.
The RTP pipe fails to successfully path switch even though an alternate link is available	Because of a problem with the EE connection, an HPR pipe attempts a path switch but fails to connect, with a message indicating that no alternate routes are available. Ensure that values in the HPRPST start option are all greater than the EE link inoptime. See When does the EE connection go away? for more information.
Excessive path switch messages (IST1494I) flooding the system console log during a large network outage.	Enable the HPR path switch message reduction function with the HPRPSMSG start option. For more information about the HPRPSMSG start option, see z/OS Communications Server: SNA Resource Definition Reference.
You cannot determine the APPNCOS name associated with an RTP puname that unexpectedly deactivates.	Enhance the HPR activation and deactivation messages by setting the HPRITMSG start option to the value ENHANCED. Now, when an RTP is deactivated, you can locate the IST1488I (deactivation) message group on the system console log. You can then find the associated APPNCOS in messages IST1962I, IST1963I, IST1964I, or IST1965I. For more information about the HPRITMSG start option, see z/OS Communications Server: SNA Resource Definition Reference.
RTP transmission stalls repeatedly for pipes that use Enterprise Extender. The following message is displayed: `IST2245I XMIT STALL DETECTED FOR RTP puname TO cpname` If the stall persists, VTAM issues the following message every 30 seconds: `IST2246I XMIT STALL CONTINUES FOR RTP puname TO cpname` If the transmission stall clears, VTAM issues the following message: `IST2247I XMIT STALL ALLEVIATED FOR RTP puname TO cpname` If the transmission stall extends beyond the time limit specified by the HPRSTALL VTAM start option, VTAM automatically initiates termination of the HPR pipe and issues the following message: `IST2253I HPRSTALL TIME EXCEEDED FOR RTP puname TO cpname`	If path MTU discovery is enabled for IPv4 or IPv6 Enterprise Extender connections, and firewalls are used in the configuration, verify that the firewalls are configured to allow ICMP errors to flow on all hops of the connection. If problems persist, consider disabling path MTU discovery for IPv4 Enterprise Extender connections. To do so, specify PMTUD=NO in the appropriate ATCSTRxx VTAM start list or on the VTAM START command. Optionally, when VTAM is active, issue the MODIFY procname,VTAMOPTS,PMTUD=NO command. If path MTU discovery is not enabled for IPv4 Enterprise Extender connections, but you still suspect an MTU issue, consider limiting the maximum packet size that Enterprise Extender transmits. Consider specifying the MTU operand available on these major nodes: For EE connection networks, this parameter can be defined on the connection network GROUP definition statements in the EE XCA major node. For dial-in Enterprise Extender connections that have their associated PUs dynamically created, this parameter can be defined on the model major node (DYNTYPE=EE) PU definition statement. For predefined Enterprise Extender connections, this parameter can be defined on the PU definition statement in the switched major node.
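
The message IDs in Table 1 can be used as a first-pass lookup when reading a console log. The following Python sketch is illustrative only and is not part of VTAM or z/OS Communications Server: the function name `hints_for_log`, the `HINTS` dictionary, and the sample log lines are assumptions made for this example, and each hint is a paraphrase of the corresponding Table 1 row rather than official message text.

```python
import re

# Hypothetical lookup table: the message IDs come from Table 1,
# the hint wording is a paraphrase of the matching table row.
HINTS = {
    "IST1430I": "INOP with XID or LDLC command timeout: check IP connectivity, "
                "consistent EE UDP ports (12000-12004), firewalls, and NAT; "
                "run D NET,EEDIAG,TEST=YES.",
    "IST2330I": "EE health verification failed for a new connection: confirm that "
                "firewalls and routers pass UDP on all five EE ports; "
                "run D NET,EEDIAG,TEST=YES.",
    "IST2323E": "Health verification failed on active connections: "
                "run DISPLAY NET,EE,LIST=EEVERIFY, then D NET,EEDIAG,TEST=YES.",
    "IST2342I": "No replies to LDLC probes: VTAM treats EE health verification "
                "as unsupported by the partner.",
    "IST1494I": "Path switch messages: consider the HPRPSMSG start option "
                "to reduce message volume.",
    "IST2245I": "RTP transmit stall: if path MTU discovery is enabled, verify that "
                "firewalls allow ICMP errors; consider PMTUD=NO or the MTU operand.",
    "IST2253I": "HPRSTALL time exceeded: the HPR pipe is being terminated; "
                "review the MTU guidance for transmit stalls.",
}

# VTAM message IDs in Table 1 look like IST314I, IST1430I, or IST2323E.
MSG_ID = re.compile(r"\b(IST\d{3,4}[A-Z])\b")


def hints_for_log(lines):
    """Return (message_id, hint) pairs in first-seen order, without duplicates."""
    seen = []
    for line in lines:
        for msg_id in MSG_ID.findall(line):
            if msg_id in HINTS and msg_id not in seen:
                seen.append(msg_id)
    return [(msg_id, HINTS[msg_id]) for msg_id in seen]


if __name__ == "__main__":
    # Made-up log excerpt; LN1 and PU1 are placeholder resource names.
    sample = [
        "IST1411I INOP GENERATED FOR LN1",
        "IST1430I REASON FOR INOP IS XID OR LDLC COMMAND TIMEOUT",
        "IST314I END",
        "IST2330I EE HEALTH VERIFICATION FAILED FOR PU1 AT 12:00:00",
    ]
    for msg_id, hint in hints_for_log(sample):
        print(f"{msg_id}: {hint}")
```

A lookup like this only points to the relevant row of Table 1. The remedy itself still depends on the DISPLAY and EEDIAG output described in that row.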