A fix is available
APAR status
Closed as program error.
Error description
Various components in VTAM fail to get storage because the UTILPVT storage pool is full. The pool has allocated x'FFFF' 4K pages, which is the maximum number of pages allowed for the pool, so all subsequent storage requests fail. A review of the documentation shows that many TDUs are queued on the congestion matrix for the TP.
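For a sense of scale, the pool limit quoted above can be converted to bytes. This is only a worked arithmetic sketch based on the two figures in the description (x'FFFF' pages of 4K each); the variable names are illustrative and do not correspond to any VTAM field.

```python
# Figures taken from the error description; names are illustrative only.
PAGE_SIZE = 4 * 1024   # one 4K page, in bytes
MAX_PAGES = 0xFFFF     # pool page limit, x'FFFF'

max_pool_bytes = MAX_PAGES * PAGE_SIZE

print(MAX_PAGES)                       # 65535
print(max_pool_bytes)                  # 268431360
print(max_pool_bytes / (1024 * 1024))  # 255.99609375 (one page short of 256 MiB)
```

In other words, the pool had grown to just under 256 MiB before requests started to fail.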
Local fix
Problem summary
****************************************************************
* USERS AFFECTED: All using APPN networks.                     *
****************************************************************
* PROBLEM DESCRIPTION: Storage growth in UTILPVTS              *
*                      GETBLK storage pool. Might eventually   *
*                      lead to message IST565I or IST2266I.    *
*                      In this case, the storage was full of   *
*                      TDU (x'12C2') GDS variables being       *
*                      exchanged between two NNs over the      *
*                      CP-CP sessions.                         *
****************************************************************
* RECOMMENDATION:                                              *
****************************************************************
The problem is summarized as follows:
1) There exists a congestion control mechanism which paces the topology data exchanged over the CP-CP sessions. This mechanism allows the TRS TP (transaction program) to tell the TRS (Topology) component that the session to a specific CP partner is congested.
2) In this specific case, the Connection Network Reachability Awareness (CNRA) function was enabled and at some point there was a disruption to the IP network (used by Enterprise Extender).
3) This IP disruption caused many nodes to report routing failures to the VRN. This is indicated by messages:
   IST1903I FAILURE OVER VRN xxxx.xxxx TO CP xxxx.xxxx
   IST2050I THIS PATH WILL NOT BE SELECTED FOR UNRCHTIM = xx
4) At some point in time, the CP-CP session between two network nodes became congested for topology traffic.
5) Large amounts of topology information were being reported, which included ISTV4Cs (TG Unreachable Partner control vectors).
6) The large number of topology flows caused the TP to congest, and TRS was notified. At this point, TRS suspended traffic to this specific TP partner.
7) The specific CP-CP (CPSVCMG) RTP pipe also detected network connectivity problems and path switched to another route.
8) Upon completing the path switch, a CP_Status IPS was sent to notify TRS that this CP-CP RTP pipe had path switched successfully.
9) TRS processed the CP_Status signal and reset the ANO_IS_CONGEST (ANN_IS_CONGEST) bit. This bit is what tells TRS whether or not the TP is congested. Since it was now off, TRS resumed sending TDUs to this partner CP.
10) In reality, the TP (XPRT PAB) was still congested and could not send data to the partner. As a result, data accumulated on the congestion matrix (ISTCGHDR).
As a result of this error, TRS can overrun the TRS TP to this partner, which causes UTILPVTS storage (which holds the TDU vectors) to accumulate much faster than the TDUs can be delivered to the partner. Other components may then suffer from storage allocation problems.
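The sequence above can be illustrated with a toy model. This is a deliberately simplified sketch, not VTAM logic: the function name, the tick-based loop, and the rate of 10 TDUs per tick are all assumptions made for illustration. It models only the one point the summary makes, namely that clearing the congestion flag on a path switch while the TP is still congested lets the queue grow without bound.

```python
def queued_tdus(ticks, reset_on_path_switch, switch_tick=3, tdus_per_tick=10):
    """Return the number of TDUs left queued after `ticks` time steps.

    Hypothetical model: the TP stays congested for the whole run, so
    anything TRS sends simply piles up on the congestion queue.
    """
    trs_thinks_congested = True  # stands in for the ANO_IS_CONGEST bit
    queue_depth = 0              # TDUs waiting on the congestion matrix

    for tick in range(ticks):
        # Path switch completes: the pre-fix behaviour clears the flag here.
        if tick == switch_tick and reset_on_path_switch:
            trs_thinks_congested = False
        # TRS only sends while it believes the TP is not congested.
        if not trs_thinks_congested:
            queue_depth += tdus_per_tick
        # The TP never drains in this model, so nothing is removed.
    return queue_depth


print(queued_tdus(10, reset_on_path_switch=True))   # 70
print(queued_tdus(10, reset_on_path_switch=False))  # 0
```

With the reset in place, the queue grows by the full send rate on every tick after the path switch; without it, TRS stays suspended and nothing accumulates.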
Problem conclusion
ISTBRPRM - Defined BRP_Congestion_Notification_Count.
ISTCGHDR - Defined CGH_Congestion_Notification_Count.
ISTTRDAT - Defined TRD_CROSSPOINT_COUNT_HIGH_WATER to hold the high water mark of the CGH_CROSSPOINT_COUNT field. This field may be updated by any CGID when TRS/TP congestion hits and a crosspoint count has reached a new high.
ISTTRBCN - Made the following changes to the TRS congestion notification routine: modified it to define three new input parameters (Que_Depth, Notified_Count and Congestion_Min), and changed the mainline code to add backup congestion notification when TRS has already been notified previously.
ISTTRPCS - Two small changes have been made with regard to resetting the ANO_IS_CONGEST bit. The first change is to not reset the ANO_IS_CONGEST bit when processing a CP_Status for a path switch. For this case, the TRS TP already sends a node update signal with NDU_IS_CONGEST set to off to notify TRS that congestion is relieved and that it may resume sending TDUs to this partner. The second change is to reset the ANO_IS_CONGEST bit only when processing an inactive CP_Status for the conwinner session. Previously, the ANO_IS_CONGEST bit was being reset four times when the CP-CP sessions went down and then reactivated. This opened some timing windows which allowed TRS to get out of sync with the TP services.
ISTXPBRD - Modified the APPN broadcast processor to make two unique calls to the BRP_Congestion_Notifier (one for DS and one for TRS). The TRS congestion notification routine is now invoked once when the congestion maximum queue depth is reached, and again whenever the congestion queue depth reaches a value that is a multiple of 100 (that is, 100, 200, 300, and so on).
ISTXPBRF - Modified the APPN broadcast feedback processor to make two unique calls to the BRP_Congestion_Notifier (one for DS and one for TRS).
ISTXPFPF - Modified the APPN broadcast feedback controller to set the BRP_Congestion_Notification_Count from the CGH_Congestion_Notification_Count field. This is done prior to the call to ISTXPBRF. Upon return, the BRP_Congestion_Notification_Count is cleared.
ISTXPSSR - Modified the storage shortage recovery routine to set the BRP_Congestion_Notification_Count from the CGH_Congestion_Notification_Count field. This is done prior to the call to ISTXPBRF. Upon return, the BRP_Congestion_Notification_Count is cleared.
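The two behavioural rules described for ISTXPBRD and ISTTRPCS can be written down as small predicates. This is a minimal sketch of one reading of the text, not the actual module logic: the function names, the string values used for signals and session roles, and the literal interpretation of "multiple of 100" (any positive multiple, regardless of the configured maximum) are all assumptions.

```python
def should_notify_trs(queue_depth, congestion_max, interval=100):
    """Hypothetical reading of the ISTXPBRD notification rule.

    Notify once when the queue reaches the congestion maximum, and
    again (backup notification) at every positive multiple of `interval`.
    """
    if queue_depth == congestion_max:
        return True
    return queue_depth > 0 and queue_depth % interval == 0


def should_reset_congest_bit(cp_status, session_role=None):
    """Hypothetical reading of the ISTTRPCS reset rule after the fix.

    cp_status:    'path_switch', 'active' or 'inactive' (illustrative values)
    session_role: 'conwinner' or 'conloser' (illustrative values)
    """
    # A path switch no longer clears the bit; only an inactive CP_Status
    # for the conwinner session does.
    return cp_status == "inactive" and session_role == "conwinner"


# Queue depths at which TRS would be notified, assuming a maximum of 50.
notified_at = [d for d in range(1, 301) if should_notify_trs(d, congestion_max=50)]
print(notified_at)  # [50, 100, 200, 300]

print(should_reset_congest_bit("path_switch"))            # False
print(should_reset_congest_bit("inactive", "conloser"))   # False
print(should_reset_congest_bit("inactive", "conwinner"))  # True
```

The repeated notification gives TRS further chances to learn of congestion if an earlier notification was lost or overridden, and restricting the reset to a single event removes the extra resets that the text says opened timing windows.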
Temporary fix
*********
* HIPER *
*********
Comments
APAR Information
APAR number
OA31085
Reported component name
VTAM V4 MVS/ESA
Reported component ID
569511701
Reported release
190
Status
CLOSED PER
PE
NoPE
HIPER
YesHIPER
Special Attention
NoSpecatt
Submitted date
2009-11-12
Closed date
2009-12-09
Last modified date
2010-02-01
APAR is sysrouted FROM one or more of the following:
APAR is sysrouted TO one or more of the following:
UA51701 UA51702 UA51703
Modules/Macros
ISTBRPRM ISTCGHDR ISTTRBCN ISTTRDAT ISTTRPCS ISTXPBRD ISTXPBRF ISTXPFPF ISTXPSSR
Fix information
Fixed component name
VTAM V4 MVS/ESA
Fixed component ID
569511701
Applicable component levels
R1A0 PSY UA51701
UP10/01/23 P F001
R1B0 PSY UA51702
UP10/01/23 P F001
R190 PSY UA51703
UP10/01/23 P F001
Fix is available
Select the PTF appropriate for your component level. You will be required to sign in. Distribution on physical media is not available in all countries.
Document Information
Modified date:
01 February 2010