LI77662: SUBOPTIMAL CODE FOR VECTOR LONG LONG SUBTRACT

Fixes are available

XL C/C++ for Linux Fix Pack 8 (October 2015 Update) for 12.1
XL C/C++ for Linux Fix Pack 4 (October 2013 Update) for 12.1
XL C/C++ for Linux Fix Pack 5 (December 2013 Update) for 12.1
XL C/C++ for Linux Fix Pack 6 (February 2014 Update) for 12.1
XL C/C++ for Linux Fix Pack 7 (May 2014 Update) for 12.1

APAR status

Closed as program error.

Error description

The compiler generates suboptimal code for the following test
case.

$ cat vecadd64.C
extern "C" vector unsigned long long sub64(vector unsigned long long a,
                                           vector unsigned long long b)
{
  return a - b;
}

Command line:
xlC -q64 vecadd64.C -O2 -qarch=pwr7 -qaltivec -qlist

Actual compiler output:
     | 000000                           PDEF     sub64
   33|                                  PROC      a,b,vs34,vs35
   35| 0013D0 xxlnor   F0031D17   1     VNOR      vs32=vs35,vs35
   35| 0013D4 ld       E8620010   1     L8        gr3=.+CONSTANT_AREA(gr2,0)
   35| 0013D8 addi     38000080   1     LI        gr0=128
   35| 0013DC addi     38800088   1     LI        gr4=136
   35| 0013E0 lxvdsx   7C230299   1     VLDS      vs33=+CONSTANT_AREA(gr3,gr0,0)
   35| 0013E4 lxvdsx   7C032298   1     VLDS      vs0=+CONSTANT_AREA(gr3,gr4,0)
   35| 0013E8 vadduwm  10600880   1     VADDUWM   vs35=vs32,vs33
   35| 0013EC vaddcuw  10000980   1     VADDCUW   vs32=vs32,vs33
   35| 0013F0 xxsldwi  F0000117   1     VSLDWI    vs32=vs32,vs32,1
   35| 0013F4 vadduwm  10001880   1     VADDUWM   vs32=vs32,vs35
   35| 0013F8 vadduwm  10201080   1     VADDUWM   vs33=vs32,vs34
   35| 0013FC vaddcuw  10001180   1     VADDCUW   vs32=vs32,vs34
   35| 001400 xxsldwi  F0200116   1     VSLDWI    vs1=vs32,vs32,1
   35| 001404 xxland   F0000C11   1     VAND      vs32=vs0,vs1
   35| 001408 vadduwm  10400880   1     VADDUWM   vs34=vs32,vs33
   36| 00140C bclr     4E800020   1     BA        lr
     |               Tag Table
     | 001410        00000000 00092200 00000000 00000040
     |               Instruction count           16
     |               Straight-line exec time     16
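
The listing above appears to compute a - b as a + (~b + 1): it complements b (VNOR), adds a constant, and after each word-wise add (VADDUWM) it forms the carry-out (VADDCUW) and shifts it into the neighbouring word (VSLDWI). The following is a minimal scalar sketch of that reading for one 64-bit element. The listing does not show the values loaded from the constant area, so the sketch assumes they are a per-doubleword 1 and a mask that keeps only the carry into the high word. The names add64_via_words, sub64_via_negate and check_sub64_via_negate are hypothetical and do not come from the APAR.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not from the APAR): 64-bit add built from 32-bit
// word adds, with the low word's carry-out moved into the high word.
static std::uint64_t add64_via_words(std::uint64_t x, std::uint64_t y)
{
    const std::uint32_t xl = static_cast<std::uint32_t>(x);
    const std::uint32_t xh = static_cast<std::uint32_t>(x >> 32);
    const std::uint32_t yl = static_cast<std::uint32_t>(y);
    const std::uint32_t yh = static_cast<std::uint32_t>(y >> 32);

    const std::uint32_t lo = xl + yl;                 // modulo add, like VADDUWM
    const std::uint32_t carry = (lo < xl) ? 1u : 0u;  // carry-out, like VADDCUW
    const std::uint32_t hi = xh + yh + carry;         // carry applied to the high word

    return (static_cast<std::uint64_t>(hi) << 32) | lo;
}

// Assumed reading of the original sequence: a - b == a + (~b + 1).
// The constant 1 is an assumption; the listing does not show it.
static std::uint64_t sub64_via_negate(std::uint64_t a, std::uint64_t b)
{
    const std::uint64_t neg_b = add64_via_words(~b, 1u);  // two's complement of b
    return add64_via_words(a, neg_b);
}

// Spot checks against the native 64-bit subtraction.
static void check_sub64_via_negate()
{
    assert(sub64_via_negate(5u, 3u) == 2u);
    assert(sub64_via_negate(0x100000000ull, 1u) == 0xFFFFFFFFull);
    assert(sub64_via_negate(0u, 1u) == 0xFFFFFFFFFFFFFFFFull);
    assert(sub64_via_negate(7u, 0u) == 7u);
}
```

Under this reading, the two chained additions each need their own carry propagation, which would account for most of the 16 instructions.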


The compiler could instead generate the following sequence, which
saves 7 instructions (9 instead of 16):

     | 000000                           PDEF     sub64_opt
   38|                                  PROC      a,b,vs34,vs35
   42| 001420 vcmpgtuw 10231286   1     VCMPGTUW  vs33=vs35,vs34
   46| 001424 addi     38000090   1     LI        gr0=144
   43| 001428 vsubuwm  10621C80   1     VSUBUWM   vs35=vs34,vs35
   46| 00142C ld       E8620010   1     L8        gr3=.+CONSTANT_AREA(gr2,0)
   46| 001430 xxlxor   F00004D7   1     VXOR      vs32=vs32,vs32
   46| 001434 lxvd2x   7C430699   1     VLQD      vs34=+CONSTANT_AREA(gr3,gr0,0)
   46| 001438 vperm    100100AB   1     VPERM     vs32=vs33,vs32,vs34
   47| 00143C vadduwm  10401880   1     VADDUWM   vs34=vs32,vs35
   49| 001440 bclr     4E800020   1     BA        lr
     |               Tag Table
     | 001444        00000000 00092000 00000000 00000024
     |               Instruction count            9
     |               Straight-line exec time      9

Local fix

Rewriting the subtraction as follows produces the shorter
instruction sequence shown above:

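extern "C" vector unsigned long long sub64_opt(vector unsigned long long a,
                                               vector unsigned long long b)
{
  // Reinterpret each 64-bit element as two 32-bit words.
  vector unsigned int ai = (vector unsigned int)a;
  vector unsigned int bi = (vector unsigned int)b;
  // All ones in every word where b's word is greater than a's word.
  vector unsigned int ov = (vector unsigned int)vec_cmpgt(bi,ai);
  // Word-wise difference, without borrow between words.
  vector unsigned int diff = ai - bi;
  vector unsigned int vn = { 0, 0, 0, 0 };
  // Permute control: words 0 and 2 replicate bytes 7 and 15 of ov;
  // words 1 and 3 take byte 15 of vn, which is zero.
  vector unsigned char vp = {0x07,0x07,0x07,0x07,0x1F,0x1F,0x1F,0x1F,
                             0x0F,0x0F,0x0F,0x0F,0x1F,0x1F,0x1F,0x1F};
  ov = vec_perm(ov,vn,vp);
  // Adding an all-ones word subtracts 1, which applies the borrow.
  diff = diff + ov;
  return (vector unsigned long long)diff;
}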
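
The borrow handling in sub64_opt can be checked without AltiVec hardware. The following is a minimal scalar sketch of one 64-bit element, assuming the big-endian element order in which byte 7 of a doubleword belongs to its low-order word. The names sub64_borrow_model and check_sub64_borrow_model are hypothetical and do not come from the APAR.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical scalar model of one 64-bit element of sub64_opt.
static std::uint64_t sub64_borrow_model(std::uint64_t a, std::uint64_t b)
{
    const std::uint32_t al = static_cast<std::uint32_t>(a);
    const std::uint32_t ah = static_cast<std::uint32_t>(a >> 32);
    const std::uint32_t bl = static_cast<std::uint32_t>(b);
    const std::uint32_t bh = static_cast<std::uint32_t>(b >> 32);

    // vec_cmpgt(bi, ai) on the low word: all ones when a borrow is needed.
    const std::uint32_t ov_lo = (bl > al) ? 0xFFFFFFFFu : 0u;

    // ai - bi: independent word-wise modulo subtraction.
    const std::uint32_t dl = al - bl;
    std::uint32_t dh = ah - bh;

    // vec_perm moves the low word's mask onto the high word (the low word
    // receives zero); adding 0xFFFFFFFF is subtracting 1 modulo 2^32.
    dh += ov_lo;

    return (static_cast<std::uint64_t>(dh) << 32) | dl;
}

// Spot checks against the native 64-bit subtraction.
static void check_sub64_borrow_model()
{
    assert(sub64_borrow_model(5u, 3u) == 2u);
    assert(sub64_borrow_model(0x100000000ull, 1u) == 0xFFFFFFFFull);
    assert(sub64_borrow_model(0u, 1u) == 0xFFFFFFFFFFFFFFFFull);
    assert(sub64_borrow_model(7u, 0u) == 7u);
}
```

Because the compare mask is already -1 or 0, the borrow is applied with a single add and no separate carry instruction.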

Problem summary

PROBLEM DESCRIPTION: Inefficient code is generated for vector
unsigned 64-bit subtract.

USERS AFFECTED: Users of V12/14 who compile with -qarch=pwr7 or
higher and -qaltivec, and whose code uses vec_sub on vector
unsigned long long types.

Problem conclusion

Code generation for vector subtraction is improved by reducing
the number of generated instructions. Apply the provided service.

Temporary fix

Comments

APAR Information

APAR number
LI77662
Reported component name
XL C/C++ FOR LI
Reported component ID
5725C7300
Reported release
C10
Status
CLOSED PER
PE
NoPE
HIPER
NoHIPER
Special Attention
NoSpecatt
Submitted date
2013-10-28
Closed date
2013-10-28
Last modified date
2013-10-28

APAR is sysrouted FROM one or more of the following:

IV37235
APAR is sysrouted TO one or more of the following:

Fix information

Fixed component name
XL C/C++ FOR LI
Fixed component ID
5725C7300

Applicable component levels

RC10 PSN IV37235
UP06/09/13


Document Information

Modified date:
14 October 2021
