IBM Support

IV52981: THE OFED DEVICE INTERFACE MAY BE GONE AFTER A REBOOT. APPLIES TO AIX 6100-09

A fix is available

APAR status

  • Closed as program error.

Error description

  • When a system is restarted or the device is reconfigured, the
    OFED device may be missing afterward. This happens because
    the provider's registration with OFED has failed.
    

Local fix

Problem summary

  • When a system is restarted or the device is reconfigured, the
    OFED device may be missing afterward. This happens because
    the provider's registration with OFED has failed.
    

Problem conclusion

  • When the provider calls ofed_register(), the call initializes
    the MAD buffers. If MAD messages arrive while the initialization
    is in progress, they are received and processed, and their
    receive buffers are replenished.
    
    However, if two instances of the replenish code run at the same
    time (one in the initialization path and one in the receive
    completion path, which reposts the buffer), both of them try to
    repost buffers up to the limit. Because the code is implemented
    as a "do-while" loop, each instance posts before it checks the
    limit, so when two threads execute in parallel there is one more
    post than the QP receive depth (capacity). This causes the
    provider to return a failure from ib_post_receive().
    
    The receive completion code can handle that error, but the MAD
    initialization code cannot and bails out. As a result,
    ofed_register() fails and the provider is unable to register
    the device with OFED.
    
    The fix is to implement the loop as a "while-do" loop, so that
    the extra receive buffer is not posted in this path.
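
The difference between the two loop forms can be illustrated with a small sketch. This is not the AIX source: the queue structure, the depth of 4, and the names post_receive(), replenish_do_while() and replenish_while() are all hypothetical stand-ins chosen for illustration. The concurrent case is modeled sequentially, by running a second replenish pass on a queue that the first pass has already filled.

```c
/* Hypothetical receive queue depth; the real QP depth is not given in the APAR. */
#define QP_RECV_DEPTH 4

/* Hypothetical model of a receive queue. */
struct recv_queue {
    int posted;        /* buffers currently posted */
    int failed_posts;  /* posts rejected because the queue was full */
};

/* Stand-in for the provider's post-receive call: rejects a post on a full queue. */
static int post_receive(struct recv_queue *q)
{
    if (q->posted >= QP_RECV_DEPTH) {
        q->failed_posts++;
        return -1;
    }
    q->posted++;
    return 0;
}

/* Post first, check afterwards: always attempts at least one post. */
static int replenish_do_while(struct recv_queue *q)
{
    do {
        if (post_receive(q) != 0)
            return -1;  /* models the initialization path bailing out */
    } while (q->posted < QP_RECV_DEPTH);
    return 0;
}

/* Check first, post afterwards: does nothing when the queue is already full. */
static int replenish_while(struct recv_queue *q)
{
    while (q->posted < QP_RECV_DEPTH) {
        if (post_receive(q) != 0)
            return -1;
    }
    return 0;
}
```

In this model, calling replenish_do_while() a second time on a full queue attempts one extra post and returns -1, whereas a second call to replenish_while() posts nothing and returns 0.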
    

Temporary fix

Comments

  • 6100-09 - use AIX APAR IV52981
    7100-03 - use AIX APAR IV53250
    

APAR Information

  • APAR number

    IV52981

  • Reported component name

    AIX 610 STD EDI

  • Reported component ID

    5765G6200

  • Reported release

    610

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Submitted date

    2013-12-09

  • Closed date

    2013-12-09

  • Last modified date

    2014-10-28

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

    IV53250

Fix information

  • Fixed component name

    AIX 610 STD EDI

  • Fixed component ID

    5765G6200

Applicable component levels

  • R610 PSY U861129

       UP14/10/28 I 1000

PTF to Fileset Mapping



Document information

More support for: AIX Standard Edition

Software version: 610

Operating system(s): AIX

Reference #: IV52981

Modified date: 28 October 2014