IBM Support

IV91511: POTENTIAL DATA LOSS USING VIRTUAL FC WITH NUM_CMD_ELEMS > 256 APPLIES TO AIX 7200-01

A fix is available


APAR status

  • Closed as program error.

Error description

  • **************************************************************
    * USERS AFFECTED:
    * Systems running the AIX 7200-01 Technology Level
    * with devices.vdevice.IBM.vfc-client.rte at the 7.2.1.0
    * level.
    **************************************************************
    * PROBLEM DESCRIPTION:
    * On AIX LPARs using Virtual Fibre Channel (NPIV)
    * adapters, if num_cmd_elems is set above 256,
    * DMA errors may appear in the error report, and
    * undetected data loss may occur for applications
    * accessing disks through the affected adapter.
    *
    * The num_cmd_elems attribute can be viewed
    * using lsattr, for example:
    * # lsattr -E -l fcs0 -a num_cmd_elems
    *
    * With the fix applied, it is safe to run with
    * num_cmd_elems > 256.
    * The fix will require a reboot.
    *
    * If the LPAR is configured with multiple Virtual FC
    * adapters and the disks use multipathing through
    * redundant adapters, then num_cmd_elems can be
    * lowered to 256 or below on the fly, without rebooting.
    * This is done by bringing down and modifying
    * one adapter at a time, for example:
    * - First confirm the disks have multiple paths through
    * different Virtual FC adapters
    * # lspath
    * - Next, for each Virtual FC adapter that needs
    * modification, find the protocol device, and confirm
    * that any disks using it have a redundant path that
    * they can use while bringing this one down.
    * # lsdev -p fcs0
    * - Once confirmed that it is safe to do so, move
    * that device and its children to the Defined state
    * # rmdev -Rl fscsi0
    * - Modify the num_cmd_elems for the adapter
    * # chdev -l fcs0 -a num_cmd_elems=256
    * - Restore the Defined devices to the Available state
    * # cfgmgr
    * - Confirm all paths to the disks are Enabled
    * # lspath
    * - Repeat this process for the other Virtual FC adapter(s)
    **************************************************************
    * RECOMMENDATION:
    * Install APAR IV91511.
    * Until the fix is available, an interim fix can be
    * obtained from either
    * ftp://aix.software.ibm.com/aix/ifixes/iv91511/ or
    * https://aix.software.ibm.com/aix/ifixes/iv91511/
    * The ifix can be installed using Live Update (LU).
    * If LU is not used, installation of the ifix requires a
    * reboot.
    **************************************************************
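The check described above can be applied to every adapter at once. The following is a minimal sketch, not an IBM-supplied tool: the function names `over_limit` and `list_num_cmd_elems` are hypothetical, and it assumes that adapters are named fcs* and that `lsattr -E` prints the attribute value in its second column. It lists physical and virtual FC adapters alike, so confirm which of the reported adapters are Virtual FC client adapters.

```shell
#!/bin/sh
# Sketch: report FC adapters whose num_cmd_elems exceeds 256.
LIMIT=256

# Reads "adapter value" pairs on stdin; prints adapters whose value > LIMIT.
over_limit() {
    awk -v limit="$LIMIT" '$2 + 0 > limit { print $1 }'
}

# Emits "adapter value" for every fcs* adapter (AIX only).
list_num_cmd_elems() {
    lsdev -C -c adapter -F name | grep '^fcs' | while read -r fcs; do
        # Assumed lsattr -E layout: attribute value description user_settable
        val=$(lsattr -E -l "$fcs" -a num_cmd_elems | awk '{ print $2 }')
        echo "$fcs $val"
    done
}

# Only query devices when actually running on AIX.
if [ "$(uname -s)" = "AIX" ]; then
    list_num_cmd_elems | over_limit
fi
```

Any adapter name printed by the script has num_cmd_elems above 256 and is a candidate for the fix or for the procedure described above.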
    

Local fix

  • Reduce num_cmd_elems to 256 or less.
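As an illustration of the local fix, the per-adapter procedure from the error description can be written out as a dry-run plan. This is a hedged sketch: `plan_lower` is a hypothetical helper that only prints the commands and does not run them, and the adapter and protocol device names (fcs0, fscsi0) are examples. Confirm with lspath and lsdev that a redundant path exists before running any of the printed commands.

```shell
#!/bin/sh
# Sketch: print (do not run) the commands that lower num_cmd_elems
# on one Virtual FC adapter, in the order given in the error description.
# $1 = adapter (e.g. fcs0), $2 = its protocol device (e.g. fscsi0),
# $3 = new value (defaults to 256)
plan_lower() {
    fcs=$1
    fscsi=$2
    val=${3:-256}
    if [ "$val" -gt 256 ]; then
        echo "refusing: $val is above 256" >&2
        return 1
    fi
    echo "lspath"                                # confirm redundant paths
    echo "rmdev -Rl $fscsi"                      # device and children to Defined
    echo "chdev -l $fcs -a num_cmd_elems=$val"   # modify the adapter
    echo "cfgmgr"                                # restore the devices
    echo "lspath"                                # confirm paths are Enabled
}

plan_lower fcs0 fscsi0
```

Printing the plan first allows the commands to be reviewed for each adapter before one adapter at a time is taken down.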
    

Problem summary

  • The message queue used between the VIOS and the client can
    hold up to 256 commands.
    When the client adapter has num_cmd_elems set higher than 256,
    it can send more than 256 commands to the target storage.
    If the queue is full and the client tries to send another
    command before the VIOS has processed the queue, the client
    gets an H_DROPPED error. The client then unmaps the mapped
    memory and puts the command back on the pending queue to be
    sent later. During that unmap, it does not reset some fields
    related to DMA. As a result, when the command later completes
    and its memory is unmapped again, the unmap either returns an
    error or unmaps some other address.
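The arithmetic behind the 256 limit can be sketched as follows. This is a simplified model, not the driver logic: it assumes the VIOS processes nothing while the client issues a burst, and `rejected_in_burst` is a hypothetical name used only for illustration.

```shell
#!/bin/sh
# Simplified model (an assumption): the VIOS drains nothing during the burst.
QUEUE_SLOTS=256   # queue capacity stated in the problem summary

# Prints how many commands of a burst of size $1 would not fit in the queue.
rejected_in_burst() {
    burst=$1
    if [ "$burst" -gt "$QUEUE_SLOTS" ]; then
        echo $((burst - QUEUE_SLOTS))
    else
        echo 0
    fi
}

for n in 200 256 1024; do
    echo "num_cmd_elems=$n rejected=$(rejected_in_burst "$n")"
done
```

Under this model, a value of 256 or less never overfills the queue, which is why the H_DROPPED requeue path is not reached at those settings.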
    

Problem conclusion

  • Reset the metadata associated with DMA buffers so that it
    is safe to run with num_cmd_elems > 256.
    

Temporary fix

  •   *********
      * HIPER *
      *********
    

Comments

APAR Information

  • APAR number

    IV91511

  • Reported component name

    AIX V7.2

  • Reported component ID

    5765CD200

  • Reported release

    720

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    YesHIPER

  • Submitted date

    2016-12-14

  • Closed date

    2016-12-14

  • Last modified date

    2017-04-11

  • APAR is sysrouted FROM one or more of the following:

    IV90915

  • APAR is sysrouted TO one or more of the following:

Fix information

  • Fixed component name

    AIX V7.2

  • Fixed component ID

    5765CD200

Applicable component levels

  • R720 PSY U872713

       UP17/04/06 I 1000

PTF to Fileset Mapping

Document Information

Modified date:
11 April 2017