APAR status
Closed as program error.
Error description
When an I/O workload such as an NSD read is running, the TCP connection may be incorrectly reset because an invalid parameter is passed to a system call. The reset triggers the reconnect logic, and if all connections to the peer node are reset, the node is expelled. The following message is shown in mmfs.log:
Close connection to 192.168.116.63 ps7n13 <c0n2>:[0] (Connection reset by peer). Attempting reconnect.
Reported in: 5.1.2.x
Known Impact: Frequent node expels
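To check whether a node is affected, the mmfs.log message quoted above can be counted per peer. The following Python is a minimal sketch, not an IBM-provided tool: the function names `count_resets` and `count_resets_in_file` are hypothetical, the pattern is derived only from the single sample message in this APAR, and any timestamp or severity prefix on real log lines is assumed to precede the message text.

```python
import re
from collections import Counter

# Pattern built from the sample message in this APAR. It is searched rather
# than anchored, so any prefix on the log line (assumed, not shown here) is ignored.
RESET_PATTERN = re.compile(
    r"Close connection to (?P<ip>\S+) (?P<host>\S+) <(?P<node>[^>]+)>(?::\[\d+\])? "
    r"\(Connection reset by peer\)\. Attempting reconnect\."
)


def count_resets(lines):
    """Return a Counter mapping (ip, host) to the number of reset messages."""
    counts = Counter()
    for line in lines:
        match = RESET_PATTERN.search(line)
        if match:
            counts[(match.group("ip"), match.group("host"))] += 1
    return counts


def count_resets_in_file(path):
    """Count reset messages in a log file; the caller supplies the log path."""
    with open(path, errors="replace") as log:
        return count_resets(log)
```

Calling `count_resets_in_file` with the path to the node's mmfs.log returns the per-peer totals. A peer whose count keeps growing during I/O load, followed by an expel of that node, is consistent with the behavior described here, although other network faults can produce the same message.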
Local fix
Problem summary
When an I/O workload such as an NSD read is running on general GPFS, ECE, or ESS, the TCP connection may be incorrectly reset because an invalid parameter is passed to a system call. The reset triggers the reconnect logic, and if all connections to the peer node are reset, the node is expelled.
Problem conclusion
The potential TCP connection reset and the resulting node expel are fixed.
Temporary fix
Comments
APAR Information
APAR number
IJ35941
Reported component name
SPEC SCALE STD
Reported component ID
5737F33AP
Reported release
511
Status
CLOSED PER
PE
NoPE
HIPER
NoHIPER
Special Attention
NoSpecatt / Xsystem
Submitted date
2021-11-02
Closed date
2022-01-12
Last modified date
2022-01-12
APAR is sysrouted FROM one or more of the following:
APAR is sysrouted TO one or more of the following:
Fix information
Fixed component name
SPEC SCALE STD
Fixed component ID
5737F33AP
Applicable component levels
[{"Business Unit":{"code":"BU058","label":"IBM Infrastructure w\/TPS"},"Product":{"code":"STXKQY"},"Platform":[{"code":"PF025","label":"Platform Independent"}],"Version":"511","Line of Business":{"code":"LOB26","label":"Storage"}}]
Document Information
Modified date:
13 January 2022