IBM Support

LI77661: MISSING XSCVDPUX INSTRUCTION FOR PWR7

APAR status

  • Closed as program error.

Error description

  • The V12.1 compiler does not use the Power7 xscvdpux instruction,
    as the test case below shows.
    
    $ cat toUint.C
    unsigned long long toUint64(double d)
    {
      return (unsigned long long)d;
    }
    
    xlC -qarch=pwr7 -qtune=pwr7 -qlist -c toUint.C -O2 -qaltivec -q64
    
    The compiler currently outputs:
    
         | 000000                           PDEF      toUint64(double)
         | 000000                           AKA       toUint64__Fd
        3|                                  PROC      d,vs1
        5| 000000 xxlxor   F00004D6   1     VXOR      vs0=vs32,vs32
        5| 000004 ld       E8620008   1     L8        gr3=.+CONSTANT_AREA(gr2,0)
        5| 000008 fsel     FC41006E   1     FSEL      vs2=vs1,vs0,vs1
        5| 00000C lfs      C0230000   2     LFS       vs1=+CONSTANT_AREA(gr3,0)
        5| 000010 xsadddp  F0220900   1     AFL       vs1=vs2,vs1,fcr
        5| 000014 fsel     FC41106E   1     FSEL      vs2=vs1,vs2,vs1
        5| 000018 xscmpudp F0010118   1     CFL       cr0=vs1,vs0
        5| 00001C xscvdpsx F0001560   1     FCTIDZ    vs0=vs2
        5| 000020 stfd     D801FFF0   1     STFL      #MX_CONVF1_0(gr1,-16)=vs0
        5| 000024 ld       E861FFF0   1     L8        gr3=#MX_CONVF1_0(gr1,-16)
        5| 000028 bclr     4D800020   1     BT        CL.4,cr0,0x20/flt,taken=50%(0,0)
        5| 00002C addi     38000001   1     LI        gr0=1
        5| 000030 rldicr   7800F806   1     SLL8      gr0=gr0,63
        5| 000034 add      7C630214   1     A         gr3=gr3,gr0
        6|                              CL.4:
        6| 000038 bclr     4E800020   1     BA        lr
         |               Tag Table
         | 00003C        00000000 00092200 00000000 0000003C
         |               Instruction count           15
         |               Straight-line exec time     16
    
    
    This sequence takes 15 instructions and does not use the
    xscvdpux instruction available on pwr7.
    
    Optimal code can be obtained by calling the __fctudz built-in,
    as in the following source:
    
    unsigned long long toUint64_opt(double d)
    {
      union {
        double d;
        unsigned long long ul;
      } cnv;
      cnv.d = __fctudz(d);
      return cnv.ul;
    }
    
    Optimal sequence:
         | 000000                           PDEF      toUint64_opt(double)
         | 000000                           AKA       toUint64_opt__Fd
        8|                                  PROC      d,vs1
       14| 000050 xscvdpux F0000D20   1     FCTUDZ    vs0=vs1
       14| 000054 stfd     D801FFF0   1     STFL      cnv(gr1,-16)=vs0
       15| 000058 ld       E861FFF0   1     L8        gr3=cnv(gr1,-16)
       16| 00005C bclr     4E800020   1     BA        lr
         |               Tag Table
         | 000060        00000000 00092200 00000000 00000010
         |               Instruction count            4
         |               Straight-line exec time      4
    
    Only 4 instructions are needed when xscvdpux is used.
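
As a reading aid, the 15-instruction sequence above can be paraphrased in C++ roughly as follows. This is only a sketch, not compiler output: the helper name toUint64_emulated is hypothetical, and the listing does not show the value of the constant loaded by lfs, so -2^63 is an assumption. Inputs are assumed to be below 2^64; larger values would overflow the signed conversion, which is undefined in C++.

```cpp
#include <cstdint>

// Hypothetical helper: a C++ paraphrase of the 15-instruction listing.
// ASSUMPTION: the constant loaded by lfs is -2^63 (not visible in the listing).
// ASSUMPTION: d < 2^64; larger inputs are undefined behaviour in this sketch.
std::uint64_t toUint64_emulated(double d)
{
    const double two63 = 9223372036854775808.0;            // 2^63

    // First fsel: negative inputs (and NaN) are replaced by 0.
    double clamped = (d >= 0.0) ? d : 0.0;

    // lfs + xsadddp: bias the value down by 2^63.
    double shifted = clamped - two63;

    // Second fsel: use the biased value only when it is non-negative.
    double selected = (shifted >= 0.0) ? shifted : clamped;

    // xscvdpsx, stfd, ld: signed conversion, moved to a GPR through memory.
    std::int64_t s = static_cast<std::int64_t>(selected);
    std::uint64_t result = static_cast<std::uint64_t>(s);

    // xscmpudp + bclr: return early when the value was below 2^63.
    if (shifted < 0.0)
        return result;

    // addi, rldicr, add: add 2^63 back in the integer domain.
    return result + (std::uint64_t{1} << 63);
}
```

Read this way, the extra instructions exist only to route an unsigned conversion through the signed xscvdpsx instruction, which is the work a single xscvdpux makes unnecessary.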
    

Local fix

  • Use the alternative code sequence shown above, which calls the
    __fctudz built-in.
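
One possible way to keep this workaround contained is to guard the built-in behind preprocessor checks, so the same source still builds with other compilers or targets. This is a minimal sketch under stated assumptions: the wrapper name toUint64_portable is hypothetical, and using __IBMCPP__ together with _ARCH_PWR7 to detect XL C++ with -qarch=pwr7 is an assumption that should be checked against the compiler's predefined-macro documentation.

```cpp
// Hypothetical wrapper around the workaround from the error description.
// ASSUMPTION: __IBMCPP__ and _ARCH_PWR7 are both defined when compiling
// with XL C++ and -qarch=pwr7; verify this for the compiler level in use.
unsigned long long toUint64_portable(double d)
{
#if defined(__IBMCPP__) && defined(_ARCH_PWR7)
    // Same pattern as toUint64_opt in the APAR: __fctudz leaves the
    // converted integer bits in a floating-point register, and the union
    // reinterprets those bits as an unsigned 64-bit value.
    union {
        double d;
        unsigned long long ul;
    } cnv;
    cnv.d = __fctudz(d);
    return cnv.ul;
#else
    // Fallback: the ordinary cast, as in the original toUint64 test case.
    return (unsigned long long)d;
#endif
}
```

Once a compiler level containing the fix is installed, the guarded branch could be dropped and the plain cast used everywhere.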
    

Problem summary

  • PROBLEM DESCRIPTION: Inefficient code generation for conversion
    from double to unsigned types for -qarch=pwr7
    
    USERS AFFECTED: C/C++/Fortran users generating code with
    -qarch=pwr7.
    

Problem conclusion

  • Enabled generation of the xscvdpux instruction for improved
    conversion code generation.
    

Temporary fix

Comments

APAR Information

  • APAR number

    LI77661

  • Reported component name

    XL C/C++ FOR LI

  • Reported component ID

    5725C7300

  • Reported release

    C10

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt

  • Submitted date

    2013-10-28

  • Closed date

    2013-10-28

  • Last modified date

    2013-10-28

  • APAR is sysrouted FROM one or more of the following:

    IV42697

  • APAR is sysrouted TO one or more of the following:

Fix information

  • Fixed component name

    XL C/C++ FOR LI

  • Fixed component ID

    5725C7300

Applicable component levels

  • Product: XL C/C++ for Linux
  • Version: 12.1
  • Platform: Platform Independent
  • Business unit: IBM Infrastructure w/TPS
  • Line of business: Power

Document Information

Modified date:
14 October 2021