IBM Support

LI75608: NO RUNTIME PERFORMANCE IMPROVEMENTS WITH C99 RESTRICT.

APAR status

  • Closed as program error.

Error description

  • The C99 restrict keyword lets the compiler assume that the
    qualified pointers do not alias. The compiler is expected to
    exploit it with -qalias=restrict (which should be the default)
    together with -O3 -qhot.
    
    However, the compiler does not flag the main inner loop as
    independent unless the source adds #pragma disjoint or
    #pragma ibm independent_loop, or, as an alternative, the
    program is compiled with -qipa=level=2.
    
    The loop in question is the innermost loop (region=12) of the
    routine calc_pot in distance.c. The listing file should report
    it as independent, but it does not.
    
    The problem is likely related to the handling of the "restrict"
    keyword for variables r and r2.
    
    $ cat distance.c
    #include <stdio.h>
    #include <stdlib.h>
    #include <math.h>
    
    #define SQR(a) ((a)*(a))
    typedef struct {
    
      double x;
      double y;
      double z;
    
    } Vector3d;
    
    #ifdef __64BIT__
    typedef unsigned long long index;
    #else
    typedef unsigned int index;
    #endif
    
    static Vector3d * restrict p;
    static double * restrict r2;
    static double * restrict r;
    
    //static double *  r2;
    //static double *  r;
    
    
    #pragma alloca
    double calc_pot(index natoms)
    {
      double  pot = 0.0;
      index i,j;
    
    //  #pragma disjoint(*r,*r2,*p)
    
      for (i=0;i < natoms-1;i++)
        {
     //           #pragma ibm independent_loop
          for (j=0;j < natoms-i-1;j++)
            {
              Vector3d d;
    
              d.x = p[i].x - p[j+i+1].x;
              d.y = p[i].y - p[j+i+1].y;
              d.z = p[i].z - p[j+i+1].z;
              r2[j] = SQR(d.x) + SQR(d.y) + SQR(d.z);
              r[j] = __frsqrte(r2[j]);
              r[j] = 0.5*r[j]*(3.0 - r2[j]*r[j]*r[j]);
            }
        }
      return pot;
    }
    
    int main(void)
    {
      index natoms = 1000;
      index nsweep = 10000;
      double pot;
      index i;
    
      p = (Vector3d *)malloc(sizeof(Vector3d)*natoms);
      r = (double *)malloc(sizeof(double)*natoms);
      r2 = (double *)malloc(sizeof(double)*natoms);
    
      for (i=0;i < natoms;i++) {
        p[i].x = i;
        p[i].y = i;
        p[i].z = i;
      }
    
      pot = 0.0;
      for (i=0;i < nsweep;i++)
        pot += calc_pot(natoms);
    
      printf("pot = %f\n",pot);
    
      return 0;
    }
    

Local fix

  • n/a
    

Problem summary

  • USERS AFFECTED:
    Users of IPA/HOT and some built-in functions.
    
    PROBLEM DESCRIPTION:
    The inner loop was not marked as independent, which degraded
    the performance of code that uses some built-in functions.
    

Problem conclusion

  • The compiler now marks the inner loops as independent so that
    the optimizer can better optimize them.
    

Temporary fix

Comments

APAR Information

  • APAR number

    LI75608

  • Reported component name

    XL C/C++ SLES10

  • Reported component ID

    5724U8300

  • Reported release

    A10

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt

  • Submitted date

    2010-07-28

  • Closed date

    2010-07-28

  • Last modified date

    2010-07-28

  • APAR is sysrouted FROM one or more of the following:

    IZ64217

  • APAR is sysrouted TO one or more of the following:

Fix information

  • Fixed component name

    XL C/C++ SLES10

  • Fixed component ID

    5724U8300

Document Information

Modified date:
15 October 2021