A fix is available
APAR status
Closed as program error.
Error description
A series of VSWITCH controller stalls may result in a deadlock that blocks network activity for that VSWITCH. Problems with the OSA hardware or CP system resources may cause VSWITCH controller stalls (on users DTCVSW*), which are normally resolved automatically. In some cases, the stall recovery hangs and prevents the controller from restoring connectivity for this VSWITCH.
Local fix
Apply PTF
Problem summary
****************************************************************
* USERS AFFECTED: All customers using a z/VM VSwitch for guest *
*                 network connectivity.                        *
*                                                              *
****************************************************************
* PROBLEM DESCRIPTION:                                         *
****************************************************************
* RECOMMENDATION: APPLY PTF                                    *
****************************************************************
z/VM may become unresponsive, requiring an IPL of the system for recovery, when a VSwitch is used for virtual machine network connectivity and multiple VSwitch controller stalls occur. Problems with the OSA hardware or CP system resources may cause a VSWITCH controller stall (on users DTCVSW*); such stalls are normally resolved automatically. In some cases, the stall recovery can hang, preventing the controller from restoring external network connectivity for all VSwitches. Once this issue is encountered, any virtual machine device configuration change, including LOGON/LOGOFF, will cause the virtual machine to hang.
Problem conclusion
z/VM's recovery for a VSwitch Controller Stall automatically detaches all networking devices from the stalled controller and attaches them to another functional controller. When multiple controllers stall, an excessive number of ATTACH/DETACH operations occur concurrently. This activity exposes a Network and I/O Lock hierarchy issue, resulting in a deadlock that requires a system IPL to recover. The VSwitch recovery logic executing the DETACH/ATTACH operations does not use the Console Function Mode (CFM) serialization that the I/O Subsystem requires for this processing. This results in a deadlock between the I/O VM Configuration Lock and the networking Switch Eligible Table Lock (SLMSWLCK). The VSwitch recovery logic is modified to use CFM serialization when performing both DETACH and ATTACH operations.
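The lock hierarchy problem described above can be illustrated with a minimal Python sketch. This is not z/VM code: the names `config_lock`, `switch_lock`, `cfm_gate`, `io_path`, and `recovery_path` are hypothetical stand-ins, and the sketch assumes that the two code paths take the two locks in opposite order and that CFM serialization behaves like a single outer gate held for the whole operation. Under those assumptions, routing both paths through the same gate means the inner locks can never be held in conflicting order at the same time.

```python
import threading

# Hypothetical stand-ins for the locks named in the APAR text (not real CP interfaces).
config_lock = threading.Lock()  # models the I/O VM Configuration Lock
switch_lock = threading.Lock()  # models the networking lock (SLMSWLCK)
cfm_gate = threading.Lock()     # models CFM serialization (assumed: one holder at a time)

completed = []  # records which operations finished


def io_path(n):
    """Configuration change: takes config_lock first, then switch_lock."""
    with cfm_gate:
        with config_lock:
            with switch_lock:
                completed.append(("io", n))


def recovery_path(n):
    """Stall recovery DETACH/ATTACH: takes the locks in the opposite order.

    Without cfm_gate this inversion could deadlock against io_path.
    With the gate, only one path is inside the inner locks at any time.
    """
    with cfm_gate:
        with switch_lock:
            with config_lock:
                completed.append(("recovery", n))


def run(workers=8):
    """Start both kinds of operation concurrently; return True if all finish."""
    completed.clear()
    threads = []
    for n in range(workers):
        threads.append(threading.Thread(target=io_path, args=(n,)))
        threads.append(threading.Thread(target=recovery_path, args=(n,)))
    for t in threads:
        t.start()
    for t in threads:
        t.join(timeout=5)
    return all(not t.is_alive() for t in threads)


if __name__ == "__main__":
    print("all operations finished:", run())
```

Removing `with cfm_gate:` from `recovery_path` alone reproduces the shape of the defect in this model: the recovery thread can hold `switch_lock` while waiting for `config_lock`, and a configuration change can hold `config_lock` while waiting for `switch_lock`.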
Temporary fix
FOR RELEASE VM/ESA CP/ESA R640 :
PREREQ: VM65918 VM65925 VM66280 VM66332 VM66105
CO-REQ: NONE
IF-REQ: NONE
FOR RELEASE VM/ESA CP/ESA R710 :
PREREQ: VM66302 VM66219 VM66280 VM66332
CO-REQ: NONE
IF-REQ: NONE
Comments
×**** AE21/03/29 FIX IN ERROR. SEE APAR VM66509 FOR DESCRIPTION
APAR Information
APAR number
VM66357
Reported component name
VM CP
Reported component ID
568411202
Reported release
640
Status
CLOSED PER
PE
NoPE
HIPER
YesHIPER
Special Attention
NoSpecatt / Xsystem
Submitted date
2020-01-08
Closed date
2020-03-10
Last modified date
2021-06-29
APAR is sysrouted FROM one or more of the following:
APAR is sysrouted TO one or more of the following:
UM35609 UM35610 UM35611
Modules/Macros
HCPDTD HCPIQR HCPLAN HCPMES HCPMESA HCPMESB HCPMXRBK HCPSWC HCPSWI HCP2832E
Fix information
Fixed component name
VM CP
Fixed component ID
568411202
Applicable component levels
RA64 PSY UM35812
UP21/02/17 I 1000 ¢
R640 PSY UM35610
UP20/03/19 I 1000 ¢
R710 PSY UM35611
UP20/03/19 P 2101 ¢
Fix is available
Select the PTF appropriate for your component level. You will be required to sign in. Distribution on physical media is not available in all countries.
Document Information
Modified date:
30 June 2021