IBM Support

LI78452: MISSED RLDIMI OPTIMIZATION FOR __BPERMD AND __POPCNT8

APAR status

  • Closed as program error.

Error description

  • When bitwise operations follow the __bpermd or __popcnt8
    built-ins, the compiler does not recognize the shift-and-or
    pattern and does not replace the rldicr/or pair with a single
    rldimi instruction:
    
    
       === TEST CASE ===
    
    unsigned long long mm(unsigned long long *p)
    {
       unsigned long long a = __bpermd(0x0001, p[0]);
       unsigned long long b = __bpermd(0x0001, p[1]);
       return (a << 8) | b;
    }
    
    
    The generated assembly uses rldicr followed by or instead of a
    single combined rldimi:
    
      12| 00002C bpermd   7C8001F8   1     BPERMD   gr0=gr0,gr4
      13| 000030 bpermd   7C8319F8   1     BPERMD   gr3=gr3,gr4
      14| 000034 rldicr   780045E4   1     SLL8     gr0=gr0,8
      14| 000038 or       7C031B78   1     O        gr3=gr0,gr3
      15| 00003C bclr     4E800020   1     BA        lr
    
    For other built-ins, such as __cntlz8, the compiler does
    generate rldimi:
    
    unsigned long long mm3(unsigned long long *p)
    {
       unsigned long long a = __cntlz8(p[0]);
       unsigned long long b = __cntlz8(p[1]);
       return (a << 8) | b;
    }
    
      26| 000088 cntlzd   7C000074   1     CNTLZ8   gr0=gr0
      27| 00008C cntlzd   7C630074   1     CNTLZ8   gr3=gr3
      28| 000090 rldimi   7803402C   1     RI8      gr3=gr0,8,gr3,0xFFFFFF00
      29| 000094 bclr     4E800020   1     BA       lr
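
    One reading of why the merge is valid (an assumption, not stated
    in this APAR): rldimi with a shift of 8 keeps only the low 8 bits
    of the target register, so it matches (a << 8) | b only when b
    fits in 8 bits. bpermd produces an 8-bit result, and popcntd and
    cntlzd produce values no larger than 64, so the condition holds
    for all three. The portable C sketch below models this. The helper
    names insert_shifted8 and shift_or8 are hypothetical, and the
    insert mask is assumed to cover every bit above the low 8.

```c
#include <stdint.h>

/* Hypothetical model of a rotate-left-then-insert with shift 8.
 * Assumption: the insert mask covers all bits above the low 8,
 * so the low 8 bits of b are preserved and the rest come from a. */
static uint64_t insert_shifted8(uint64_t a, uint64_t b)
{
    const uint64_t mask = ~UINT64_C(0xFF);     /* bits 8..63 */
    uint64_t rotated = (a << 8) | (a >> 56);   /* rotate left by 8 */
    return (rotated & mask) | (b & ~mask);     /* insert under mask */
}

/* The source-level pattern from the test case. */
static uint64_t shift_or8(uint64_t a, uint64_t b)
{
    return (a << 8) | b;
}
```

    The two helpers agree whenever b is below 256 and can differ
    otherwise (for example a = 1, b = 0x200), which is why the
    rewrite depends on knowing the value range of the built-in's
    result.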
    

Local fix

Problem summary

  • PROBLEM DESCRIPTION:
    When bitwise operations follow the __bpermd or __popcnt8
    built-ins, the compiler generates a separate rldicr and or
    instead of a single rldimi instruction.
    
    USERS AFFECTED:
    Users whose code applies bitwise operations to the results of
    the __bpermd or __popcnt8 built-ins.
    

Problem conclusion

  • With the fix, the compiler generates more efficient code when
    bitwise operations follow the __bpermd or __popcnt8
    built-ins.
    

Temporary fix

Comments

APAR Information

  • APAR number

    LI78452

  • Reported component name

    XL C/C++ FOR LI

  • Reported component ID

    5725C7300

  • Reported release

    D10

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt

  • Submitted date

    2015-02-25

  • Closed date

    2015-02-25

  • Last modified date

    2015-02-25

  • APAR is sysrouted FROM one or more of the following:

    IV62255

  • APAR is sysrouted TO one or more of the following:

Fix information

  • Fixed component name

    XL C/C++ FOR LI

  • Fixed component ID

    5725C7300

Applicable component levels

  • Business Unit

    IBM Infrastructure w/TPS (BU058)

  • Product

    XL C/C++ for Linux (SSXVZZ)

  • Platform

    Platform Independent (PF025)

  • Version

    13.1

  • Line of Business

    Power (LOB57)

Document Information

Modified date:
17 October 2021