A fix is available
APAR status
Closed as program error.
Error description
In certain cases, when FCA_ERR6 decodes to "No free cmd to respond to unsolicited ELS", some processes may hang. The hang is caused by a timing window in which a sleeping thread is never woken up. When pdump.sh is run against the hung thread, the stack trace might look like this:
[000D78F0]e_block_thread+000290 ()
[000D8548]e_sleep_thread+0000E8 (??, ??, ??)
[00014F50].kernel_add_gate_cstack+000030 ()
[043B3B3C]efsc_send_sequence+0007BC (??, ??, ??, ??, ??, ??, ??, ??)
[04355C00]efsc_send_payld+0004E0 (??, ??, ??)
[04362DC4]efsc_ioctl+000564 (??, ??, ??, ??, ??, ??)
[0056DA80]rdevioctl+0000C0 (??, ??, ??, ??, ??, ??)
[00746480]spec_ioctl+000080 (??, ??, ??, ??, ??, ??)
[0059E750]vnop_ioctl+000050 (??, ??, ??, ??, ??, ??)
[005B40DC]vno_ioctl+00009C (??, ??, ??, ??, ??)
[00688C78]common_ioctl+0000F8 (??, ??, ??, ??)
This hang occurs on LPARs that have the Emulex adapter. As a side effect, LPM might sometimes hang.
Local fix
There is no workaround other than rebooting the LPAR.
Problem summary
1) errpt reports FCA_ERR6: "No free cmd to respond to unsolicited ELS".
2) Condition 1 occurs because the active cmd linked list has head = null and tail = non-null.
3) Condition 2 occurs because the linked list is corrupted by a command that was issued twice.
4) Condition 3 occurs because of the timing window.
As a result, ioctls to the efc driver may hang with the following stack:
pvthread+015900 STACK:
000D78F0 e_block_thread+000290 ()
000D8548 e_sleep_thread+0000E8 (??, ??, ??)
00014F50 .kernel_add_gate_cstack+000030 ()
043B3B3C efsc_send_sequence+0007BC
04355C00 efsc_send_payld+0004E0 (??, ??, ??)
04362DC4 efsc_ioctl+000564 (??, ??, ??, ??, ??, ??)
0056DA80 rdevioctl+0000C0 (??, ??, ??, ??, ??, ??)
00746480 spec_ioctl+000080 (??, ??, ??, ??, ??, ??)
0059E750 vnop_ioctl+000050 (??, ??, ??, ??, ??, ??)
005B40DC vno_ioctl+00009C (??, ??, ??, ??, ??)
00688C78 common_ioctl+0000F8 (??, ??, ??, ??)
00003850 ovlya_addr_sc_flih_main+000130 ()
kdb_get_virtual_memory no real storage @ 2FF221C8
D0130EF4 D0130EF4 ()
kdb_read_mem no real storage @ FFFFFFFFFFF91B0
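The list state named in the summary (head = null, tail = non-null after a command is issued twice) can be reproduced with a small model. The following Python sketch is illustrative only: it is not the efc driver code, and the names Cmd, ActiveList, issue and complete are hypothetical. It assumes a doubly linked active list whose unlink routine expects each command to be linked exactly once.

```python
class Cmd:
    """A command node on a doubly linked active list (hypothetical model)."""

    def __init__(self, name):
        self.name = name
        self.prev = None
        self.next = None


class ActiveList:
    """Active command list with head and tail pointers (hypothetical model)."""

    def __init__(self):
        self.head = None
        self.tail = None

    def issue(self, cmd):
        # Append cmd at the tail. Nothing checks whether cmd is already linked.
        cmd.prev = self.tail
        cmd.next = None
        if self.tail is None:
            self.head = cmd
        else:
            self.tail.next = cmd
        self.tail = cmd

    def complete(self, cmd):
        # Unlink cmd, assuming it was linked exactly once.
        if cmd is self.head:
            self.head = cmd.next
        else:
            cmd.prev.next = cmd.next
        if cmd is self.tail:
            self.tail = cmd.prev
        else:
            cmd.next.prev = cmd.prev
        cmd.prev = cmd.next = None


def demo():
    active = ActiveList()
    a, b = Cmd("A"), Cmd("B")
    active.issue(a)
    active.issue(b)
    active.issue(a)      # the same command is issued a second time
    active.complete(a)   # A is now both head and tail, so both pointers move
    return active, b


active, b = demo()
print("head:", active.head)           # head: None
print("tail:", active.tail.name)      # tail: B
```

In this model, command B is still outstanding but can no longer be reached by walking the list from head, so any code that looks for a command starting at head sees an empty list.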
Problem conclusion
Code corrected to avoid the timing hole.
Temporary fix
Comments
6100-08 - use AIX APAR IV60834
6100-09 - use AIX APAR IV60904
7100-02 - use AIX APAR IV64215
7100-03 - use AIX APAR IV60908
7100-04 - use AIX APAR IV61120
APAR Information
APAR number
IV64215
Reported component name
AIX V7.1
Reported component ID
5765H4000
Reported release
710
Status
CLOSED PER
PE
NoPE
HIPER
NoHIPER
Submitted date
2014-08-29
Closed date
2014-08-29
Last modified date
2016-05-11
APAR is sysrouted FROM one or more of the following:
APAR is sysrouted TO one or more of the following:
010PC2
Fix information
Fixed component name
AIX V7.1
Fixed component ID
5765H4000
Applicable component levels
R710 PSY U867211
UP15/01/19 I 1000
PTF to Fileset Mapping
U867211 devices.pci.df1000f7.com 7.1.2.20
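As a rough aid for reading the mapping above, the sketch below compares an installed fileset level against the fixed level 7.1.2.20. It is a minimal illustration, not an IBM tool: the helper names parse_level and has_fix are hypothetical, the installed level is assumed to have been obtained separately (for example from lslpp output), and the comparison assumes that later 7.1.2.x levels of the fileset include earlier fixes. Levels from other releases return None, because the Comments section lists different APARs for those technology levels.

```python
FIXED_FILESET = "devices.pci.df1000f7.com"
FIXED_LEVEL = "7.1.2.20"


def parse_level(level):
    """Turn a dotted fileset level such as '7.1.2.20' into a tuple of ints."""
    return tuple(int(part) for part in level.split("."))


def has_fix(installed_level):
    """Return True or False within 7.1.2.x, or None for any other release."""
    installed = parse_level(installed_level)
    fixed = parse_level(FIXED_LEVEL)
    if installed[:3] != fixed[:3]:
        # A different technology level: this mapping does not apply.
        return None
    return installed >= fixed


for level in ("7.1.2.19", "7.1.2.20", "7.1.3.0"):
    print(FIXED_FILESET, level, has_fix(level))
```

Comparing integer tuples rather than strings matters here, because a string comparison would rank 7.1.2.3 above 7.1.2.20.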
Document Information
Modified date:
11 May 2016