IBM Support

IV56803: CONNECTION MGR OPERATIONS MAY FAIL DUE TO EXHAUSTED SEND BUFF APPLIES TO AIX 6100-09

A fix is available

APAR status

  • Closed as program error.

Error description

  • When the connection manager (CM) retransmits messages (a DREQ,
    for example), a thundering herd condition can occur. If many
    DREQs were transmitted and they time out because the remote
    side is no longer present, the code retransmits them all at the
    same time, in a loop, on the ib_mad thread. This is the same
    thread that handles the MAD completions, which replenish the
    MAD send queue. No replenishment therefore takes place during
    the retries, and the send queue can be momentarily exhausted.
    While it is exhausted, CM operations (for example, sends from
    other threads) can fail.
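
The failure mode described above can be illustrated with a small model. This is only a sketch, not the AIX implementation: the function name, the burst size of 200, the four sends from other threads, and the pre-fix queue size of 128 are all hypothetical values chosen for illustration. Only the size 512 comes from this APAR. The model assumes that every retry consumes one send-queue slot and that no slot is returned until the retry loop ends, because the same thread handles the completions.

```python
def simulate_retry_burst(queue_size, timed_out_dreqs, other_sends):
    """Model one retransmission burst on a single MAD thread.

    Returns (failed_retries, failed_other_sends, free_slots_after).
    All parameters are illustrative, not values taken from AIX.
    """
    free_slots = queue_size
    in_flight = 0

    # The MAD thread retransmits every timed-out DREQ in one loop.
    # It cannot process completions while looping, so the slots used
    # here are not replenished until the loop has finished.
    failed_retries = 0
    for _ in range(timed_out_dreqs):
        if free_slots > 0:
            free_slots -= 1
            in_flight += 1
        else:
            failed_retries += 1

    # Sends from other threads arrive before any completion is handled.
    failed_other = 0
    for _ in range(other_sends):
        if free_slots > 0:
            free_slots -= 1
            in_flight += 1
        else:
            failed_other += 1

    # Only now are completions handled and the send queue replenished.
    free_slots += in_flight
    return failed_retries, failed_other, free_slots


# Hypothetical small queue: the burst exhausts it and other sends fail.
print(simulate_retry_burst(128, 200, 4))  # (72, 4, 128)
# Queue size 512: the same burst fits with slots to spare.
print(simulate_retry_burst(512, 200, 4))  # (0, 0, 512)
```

In this model the exhaustion is momentary: once the loop ends and completions are processed, every slot is free again. A larger queue does not remove the burst. It only leaves enough headroom for the burst and concurrent sends to fit.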
    

Local fix

Problem summary

  • When the connection manager (CM) retransmits messages (a DREQ,
    for example), a thundering herd condition can occur. If many
    DREQs were transmitted and they time out because the remote
    side is no longer present, the code retransmits them all at the
    same time, in a loop, on the ib_mad thread. This is the same
    thread that handles the MAD completions, which replenish the
    MAD send queue. No replenishment therefore takes place during
    the retries, and the send queue can be momentarily exhausted.
    While it is exhausted, CM operations (for example, sends from
    other threads) can fail.
    

Problem conclusion

  • Increase the MAD buffer send queue size to 512
    

Temporary fix

Comments

  • 6100-09 - use AIX APAR IV56803
    7100-03 - use AIX APAR IV56851
    7100-04 - use AIX APAR IV56899
    

APAR Information

  • APAR number

    IV56803

  • Reported component name

    AIX 610 STD EDI

  • Reported component ID

    5765G6200

  • Reported release

    610

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Submitted date

    2014-03-18

  • Closed date

    2014-03-18

  • Last modified date

    2016-05-10

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

    IV56851 IV56899

Fix information

  • Fixed component name

    AIX 610 STD EDI

  • Fixed component ID

    5765G6200

Applicable component levels

  • R610 PSY U861129

       UP14/10/28 I 1000

PTF to Fileset Mapping



Document information

More support for: AIX Standard Edition

Software version: 610

Operating system(s): AIX

Reference #: IV56803

Modified date: 10 May 2016