IBM Support

LI77634: SUBOPTIMAL BRANCHLESS CODE FOR HI !=0


APAR status

  • Closed as program error.

Error description

  • For the test case below, the current sequence
    of instructions is not optimal:
    ==> addic+addze+neg+extsw+and
    
    It can be improved to the following:
    ==> addic+subfe+and
    
    
    ===== TESTCASE:
    $ cat test.cpp
    #include <builtins.h>
    
    #define Uint unsigned long long
    #define Uint64 unsigned long long
    #define Sint64 long long
    #define ossCountLeadingZeros64 __cntlz8
    
    
    
    Uint ossCountLeadingZeros128(Uint64 hi, Uint64 lo)
    {
        Uint64 clzHi = ossCountLeadingZeros64(hi);
        Uint64 clzLo = ossCountLeadingZeros64(lo);
        Uint64 mask = -(hi != 0);
        return clzHi + (clzLo & mask);
    }
    $
    
    
    ===== ACTUAL OUTPUT:
     CCR's set/used:   ---- ----
         | 000000                           PDEF      ossCountLeadingZeros128(unsigned long long, unsigned long long)
         | 000000                           AKA       ossCountLeadingZeros128__FULT1
       10|                                  PROC      hi,lo,gr3,gr4
       14| 000000 addic    3003FFFF   1     ADDC      gr0,ca=gr3,-1
       14| 000004 addi     38A00000   1     LI        gr5=0
       13| 000008 cntlzd   7C840074   1     CNTLZ8    gr4=gr4
       14| 00000C addze    7C050194   1     ADDE      gr0,ca=gr5,0,ca
       14| 000010 neg      7CA000D0   1     COMP      gr5=gr0
       14| 000014 extsw    7CA007B4   1     EXTS4     gr0=gr5
       12| 000018 cntlzd   7C650074   1     CNTLZ8    gr5=gr3
       15| 00001C and      7C002038   1     N         gr0=gr0,gr4
       15| 000020 add      7C602A14   1     A         gr3=gr0,gr5
       16| 000024 bclr     4E800020   1     BA        lr
         |               Tag Table
    
    
    ===== EXPECTED OUTPUT:
    addic+subfe+and
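To make the listing easier to follow, here is a minimal sketch that emulates the mask computation from ACTUAL OUTPUT (addic, addze, neg, extsw) in portable C++ and compares it with the source expression -(hi != 0). The names AddResult, addWithCarry, maskFromListing and maskFromSource are hypothetical and exist only for this illustration. The carry (CA) bit is modeled as a plain integer. The shorter addic+subfe+and sequence is not modeled, because the APAR does not give its operands.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: a 64-bit add that also reports the carry out,
// mirroring how addic/addze update the CA bit.
struct AddResult {
    std::uint64_t value;
    unsigned carry;
};

static AddResult addWithCarry(std::uint64_t a, std::uint64_t b, unsigned carryIn) {
    std::uint64_t partial = a + b;
    unsigned carry = partial < a;            // carry out of a + b
    std::uint64_t sum = partial + carryIn;
    carry |= (sum < partial);                // carry out of adding carryIn
    return {sum, carry};
}

// Emulates the mask part of the listing, one instruction per step.
std::uint64_t maskFromListing(std::uint64_t hi) {
    // addic gr0,gr3,-1 : only the carry matters here, CA = (hi != 0)
    AddResult addic = addWithCarry(hi, ~std::uint64_t{0}, 0);
    // li gr5,0 ; addze gr0,gr5 : gr0 = 0 + CA, so gr0 is 0 or 1
    AddResult addze = addWithCarry(0, 0, addic.carry);
    // neg gr5,gr0 : 0 stays 0, 1 becomes all ones
    std::uint64_t neg = std::uint64_t{0} - addze.value;
    // extsw gr0,gr5 : sign-extend the low 32 bits; the value is already
    // 0 or all ones, so this step does not change it
    std::int32_t low = static_cast<std::int32_t>(static_cast<std::uint32_t>(neg));
    return static_cast<std::uint64_t>(static_cast<std::int64_t>(low));
}

// The mask as written in test.cpp: all ones when hi is nonzero, else 0.
std::uint64_t maskFromSource(std::uint64_t hi) {
    return std::uint64_t{0} - static_cast<std::uint64_t>(hi != 0);
}

// Spot-check that both forms agree on a few boundary values.
void checkMasksAgree() {
    const std::uint64_t samples[] = {0, 1, 0x80000000ULL, 0x8000000000000000ULL,
                                     ~std::uint64_t{0}};
    for (std::uint64_t hi : samples) {
        assert(maskFromListing(hi) == maskFromSource(hi));
    }
}
```

Calling checkMasksAgree() from a test driver exercises the boundary values. The sketch only shows that the listed chain computes a 0 or all-ones mask; it makes no claim about which instructions the fix removed.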
    

Local fix

  • N/A
    

Problem summary

  • USERS AFFECTED:
    Users that have code sequences similar to the following:
    -(hi==0)
    
    PROBLEM DESCRIPTION:
    The compiler generates inefficient code for branchless
    expressions such as -(hi==0).
    As a result, programs execute two unnecessary instructions,
    which degrades run-time performance.
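For readers unfamiliar with the idiom, the following is a minimal sketch of the kind of branchless code the summary refers to, next to its branching equivalent. The function names keepIfZeroBranchless and keepIfZeroBranching are hypothetical, and the example assumes 64-bit unsigned operands as in the test case.

```cpp
#include <cstdint>

// Branchless form: negating the 0/1 result of the comparison yields a
// mask that is all ones when hi == 0 and zero otherwise.
std::uint64_t keepIfZeroBranchless(std::uint64_t hi, std::uint64_t value) {
    std::uint64_t mask = std::uint64_t{0} - static_cast<std::uint64_t>(hi == 0);
    return value & mask;
}

// Equivalent form written with a conditional expression.
std::uint64_t keepIfZeroBranching(std::uint64_t hi, std::uint64_t value) {
    return hi == 0 ? value : 0;
}
```

Both functions return the same result for every input; the first form is the pattern for which the compiler is expected to emit a short carry-based sequence.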
    

Problem conclusion

  • The compiler has been fixed: two new opcodes were added, and
    the unnecessary instructions have been eliminated.
    

Temporary fix

Comments

APAR Information

  • APAR number

    LI77634

  • Reported component name

    XL C/C++ FOR LI

  • Reported component ID

    5725C7300

  • Reported release

    C10

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt

  • Submitted date

    2013-10-28

  • Closed date

    2013-10-28

  • Last modified date

    2013-10-28

  • APAR is sysrouted FROM one or more of the following:

    IV43118

  • APAR is sysrouted TO one or more of the following:

Fix information

  • Fixed component name

    XL C/C++ FOR LI

  • Fixed component ID

    5725C7300

Applicable component levels

  • Business Unit

    IBM Infrastructure w/TPS (BU058)

  • Product

    XL C/C++ for Linux (SSXVZZ)

  • Platform

    Platform Independent (PF025)

  • Version

    12.1

  • Line of Business

    Power (LOB57)

Document Information

Modified date:
14 October 2021