A fix is available
APAR status
Closed as program error.
Error description
In rare cases, Linux error recovery for I/O to EDEVICEs leads to one I/O being placed on the deferred work queue twice, resulting in an SZC001 abend. When Linux attempts to recover from failed I/O to an EDEVICE, it issues a clear subchannel (CSCH) to the EDEVICE in order to clear the problem I/O. When the z/VM SCSI driver attempts to purge the I/O as requested, the I/O gets placed on the deferred work queue multiple times. When it comes time to perform completion tasks for the deferred work queue, the driver ends up in what is essentially an infinite loop, calling iodone() for the same I/O over and over. The first call results in an SZC001 soft abend. While that soft abend processing is occurring, the second call results in a second SZC001, which gets translated into a hard abend, bringing down the system.
Local fix
N/A
Problem summary
****************************************************************
* USERS AFFECTED: All users of z/VM EDEVICEs                   *
****************************************************************
* PROBLEM DESCRIPTION:                                         *
****************************************************************
* RECOMMENDATION: APPLY PTF                                    *
****************************************************************
In rare cases, when Linux attempts to recover from I/O errors on z/VM EDEVICEs, z/VM will abend with an SZC001. The problem stems from deferred work queue processing that occurs in the z/VM SCSI driver during clear subchannel processing. In some instances, an element may be placed on the deferred work queue multiple times, which leads z/VM to attempt I/O completion processing on the same I/O multiple times. The first time, things work as expected. The second time, an SZC001 (soft) abend occurs. The third time, another SZC001 abend occurs, which is converted to a hard abend because the first soft abend has not yet completed processing.
Problem conclusion
Code has been added to ensure an element does not already exist on the deferred work queue before adding it to the queue.
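The general technique can be illustrated with a minimal sketch in C. This is not the z/VM CP code, which is not shown in this APAR; the names io_req, deferred_queue, dwq_enqueue, dwq_drain and io_done are hypothetical, and the sketch uses a per-element "queued" flag as one possible way to detect that an element is already on the queue (scanning the queue before adding would be another).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical I/O request; all names here are illustrative only. */
struct io_req {
    struct io_req *next;  /* link in the deferred work queue */
    bool queued;          /* true while the request is on the queue */
    int done_calls;       /* number of times completion ran for it */
};

/* Hypothetical singly linked deferred work queue. */
struct deferred_queue {
    struct io_req *head;
    struct io_req *tail;
};

/* Add req to the queue unless it is already there.
 * Returns true if the request was added, false if it was a duplicate. */
static bool dwq_enqueue(struct deferred_queue *q, struct io_req *req)
{
    if (req->queued)
        return false;     /* already on the queue: do not add it again */
    req->next = NULL;
    req->queued = true;
    if (q->tail)
        q->tail->next = req;
    else
        q->head = req;
    q->tail = req;
    return true;
}

/* Stand-in for I/O completion processing. */
static void io_done(struct io_req *req)
{
    req->done_calls++;
}

/* Run completion once for every queued request.
 * Returns the number of requests processed. */
static int dwq_drain(struct deferred_queue *q)
{
    int processed = 0;
    while (q->head) {
        struct io_req *req = q->head;
        q->head = req->next;
        if (!q->head)
            q->tail = NULL;
        assert(req->queued);  /* every element on the queue is marked */
        req->next = NULL;
        req->queued = false;  /* may be queued again after completion */
        io_done(req);
        processed++;
    }
    return processed;
}
```

In this sketch, a second dwq_enqueue() call for a request that is still queued returns false and leaves the queue unchanged, so dwq_drain() runs completion for that request exactly once. Without the check, re-adding the element at the tail of a list like this one would link it to itself, and the drain loop would never terminate.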
Temporary fix
Comments
APAR Information
APAR number
VM66545
Reported component name
VM CP CP
Reported component ID
568411202
Reported release
710
Status
CLOSED PER
PE
NoPE
HIPER
NoHIPER
Special Attention
NoSpecatt / Xsystem
Submitted date
2021-06-18
Closed date
2021-07-29
Last modified date
2022-12-13
APAR is sysrouted FROM one or more of the following:
APAR is sysrouted TO one or more of the following:
UM35889 UM35890
Modules/Macros
HCWPA2
Fix information
Fixed component name
VM CP CP
Fixed component ID
568411202
Applicable component levels
Fix is available
Select the PTF appropriate for your component level. You will be required to sign in. Distribution on physical media is not available in all countries.
Document Information
Modified date:
13 December 2022