IBM Support

LI77360: SUBOPTIMAL CODE FOR VECTOR < (0,0) COMPARISON

APAR status

  • Closed as program error.

Error description

  • For the following test case:
    
    #include <builtins.h>
    
    extern "C" bool lessZero64(vector signed long long in)
    {
      vector signed long long vn = { 0, 0 };
      return vec_any_lt(in, vn);
    }
    
    xlC -q64 -O2 -qarch=pwr7 -c -qlist -qaltivec vecLessZero.C
    
    The xlC compiler generates the following assembly code:
    
    Output:
         | 000000                           PDEF     lessZero64
       10|                                  PROC      in,vs34
       13| 000030 xxlxor   F00004D7   1     VXOR      vs32=vs32,vs32
       13| 000034 addi     38600001   1     LI        gr3=1
       13| 000038 vcmpequw 10201086   1     VCMPEQUW  vs33=vs32,vs34
       13| 00003C vcmpgtuw 10601286   1     VCMPGTUW  vs35=vs32,vs34
       13| 000040 vcmpgtsw 10401386   1     VCMPGTSW  vs34=vs32,vs34
       13| 000044 xxsldwi  F0031916   1     VSLDWI    vs0=vs35,vs35,1
       13| 000048 xxland   F0000C12   1     VAND      vs0=vs0,vs33
       13| 00004C xxlor    F0001492   1     VOR       vs0=vs0,vs34
       13| 000050 xxspltw  F0200290   1     VSPLTW    vs1=vs0,0
       13| 000054 xxspltw  F0020290   1     VSPLTW    vs0=vs0,2
       13| 000058 xxpermdi F0210051   1     VMRGHD    vs33=vs1,vs0
       13| 00005C vcmpgtuw 10010686   1     VCMPGTUW_ vs32,cr6=vs33,vs32
       13| 000060 bclr     4C9A0020   1     BF        CL.4,cr6,0x4/eq,taken=50%(0,0)
       13| 000064 addi     38600000   1     LI        gr3=0
       14|                              CL.4:
       14| 000068 bclr     4E800020   1     BA        lr
    
    Instead, the compiler could generate the following shorter code
    sequence for better performance:
    
       19| 000080 xxlxor   F00004D7   1     VXOR      vs32=vs32,vs32
       20| 000084 addi     38600001   1     LI        gr3=1
       19| 000088 xvcpsgnd F0220787   1     VCPSGNFL  vs33=vs34,vs32
       20| 00008C vcmpgtsw 10000F86   1     VCMPGTSW_ vs32,cr6=vs32,vs33
       20| 000090 bclr     4C9A0020   1     BF        CL.5,cr6,0x4/eq,taken=50%(0,0)
       20| 000094 addi     38600000   1     LI        gr3=0
       21|                              CL.5:
       21| 000098 bclr     4E800020   1     BA        lr
         |               Tag Table
    

Local fix

  • Use the copy-sign operation to copy just the sign bit of each
    element onto the zero vector (0,0). A word-based comparison can
    then check whether any of the elements is less than zero.
    
    The better code sequence can be generated from the following source:
    extern "C" bool lessZero64_opt(vector signed long long in)
    {
      vector signed long long vn = { 0, 0 };
      vector signed int x =
        (vector signed int)vec_cpsgn((vector double)in, (vector double)vn);
      return vec_any_lt(x,(vector signed int)vn);
    }
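
    The reasoning behind this workaround can be checked without VSX
    hardware. The following is a minimal scalar sketch, not the code the
    compiler generates: the type Lanes64 and both function names are
    hypothetical, and the two 64-bit vector elements are modeled as a
    plain array. It keeps only the sign bit of each element, reinterprets
    the 16 bytes as four signed 32-bit words, and tests whether any word
    is negative. This should agree with the direct 64-bit comparison on
    either byte order, because each element contributes exactly one word
    that carries its sign bit and one word that is zero.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical scalar model of a "vector signed long long" (two 64-bit lanes).
using Lanes64 = std::array<std::int64_t, 2>;

// Reference result: what vec_any_lt(in, {0, 0}) means for 64-bit lanes.
bool anyLessZeroReference(const Lanes64& in) {
    return in[0] < 0 || in[1] < 0;
}

// Model of the workaround: copy only the sign bit of each lane onto zero,
// then run a signed "any less than zero" test over the four 32-bit words.
bool anyLessZeroViaSignBit(const Lanes64& in) {
    const std::uint64_t kSignMask = 0x8000000000000000ULL;

    // Copy-sign onto a zero magnitude leaves just the sign bit of each lane.
    std::array<std::uint64_t, 2> signOnly{};
    for (std::size_t i = 0; i < in.size(); ++i) {
        std::uint64_t bits;
        std::memcpy(&bits, &in[i], sizeof bits);
        signOnly[i] = bits & kSignMask;
    }

    // Reinterpret the same 16 bytes as four signed 32-bit words.
    std::array<std::int32_t, 4> words{};
    static_assert(sizeof words == sizeof signOnly, "both views must be 16 bytes");
    std::memcpy(words.data(), signOnly.data(), sizeof words);

    // A word is negative only if it holds the sign bit of a negative lane.
    for (std::int32_t w : words) {
        if (w < 0) {
            return true;
        }
    }
    return false;
}
```

    Without the sign-bit masking step, the low word of a non-negative
    element (for example 0x0000000080000000) could look negative to the
    word comparison, which is why the copy-sign step comes first.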
    

Problem summary

  • PROBLEM DESCRIPTION: Suboptimal code is generated when
    comparing a vector of long long against a zero vector.
    
    USERS AFFECTED: Users who use the vec_any_lt built-in to compare
    against a zero vector.
    

Problem conclusion

  • Added code to recognize that the comparison in the function call
    is a less-than comparison against the zero vector, and to
    generate optimal code for it.
    

Temporary fix

Comments

APAR Information

  • APAR number

    LI77360

  • Reported component name

    XL C/C++ FOR LI

  • Reported component ID

    5725C7300

  • Reported release

    C10

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt

  • Submitted date

    2013-04-27

  • Closed date

    2013-04-27

  • Last modified date

    2013-04-27

  • APAR is sysrouted FROM one or more of the following:

    IV36100

  • APAR is sysrouted TO one or more of the following:

Fix information

  • Fixed component name

    XL C/C++ FOR LI

  • Fixed component ID

    5725C7300

Applicable component levels

  • RC10 PSY

       UP

[{"Business Unit":{"code":"BU058","label":"IBM Infrastructure w\/TPS"},"Product":{"code":"SSXVZZ","label":"XL C\/C++ for Linux"},"Platform":[{"code":"PF025","label":"Platform Independent"}],"Version":"12.1","Line of Business":{"code":"LOB57","label":"Power"}}]

Document Information

Modified date:
14 October 2021