-qunroll

Category

Optimization and tuning

Pragma equivalent

#pragma options [no]unroll[= yes|no|auto|n]

#pragma unroll

Purpose

Controls loop unrolling, for improved performance.

When unroll is in effect, the optimizer determines and applies the best unrolling factor for each loop; in some cases, the loop control may be modified to avoid unnecessary branching. The compiler remains the final arbiter of whether the loop is actually unrolled. You can use the #pragma unroll directive to gain more control over unrolling.

Syntax

Option syntax

                     .-auto-.     
        .-unroll--=--+-yes--+-.   
        |            +-no---+ |   
        |            '-n----' |   
>>- -q--+-nounroll------------+--------------------------------><

Pragma syntax

>>-#--pragma--+-nounroll------------+--------------------------><
              '-unroll--+---------+-'   
                        '-(--n--)-'     

Defaults

-qunroll=auto

Parameters

auto (option only)
Instructs the compiler to perform basic loop unrolling.
yes (option only)
Instructs the compiler to search for more loop unrolling opportunities than it does with auto. In general, this suboption is more likely than auto to increase compile time or program size, but it may also improve your application's performance.
no (option only)
Instructs the compiler to not unroll loops.
n
Instructs the compiler to unroll loops by a factor of n. In other words, the body of a loop is replicated to create n copies, and the number of iterations is reduced by a factor of n (to roughly 1/n of the original count). The -qunroll=n option specifies a global unroll factor that affects all loops that do not already have an unroll pragma. The value of n must be a positive integer.
Specifying #pragma unroll(1) or -qunroll=1 disables loop unrolling, and is equivalent to specifying #pragma nounroll or -qnounroll. If n is not specified and if -qhot, -qsmp, -O4, or -O5 is specified, the optimizer determines an appropriate unrolling factor for each nested loop.
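
The effect of an unroll factor on the iteration count can be sketched by unrolling a loop by hand. This is an illustration only, not the code the compiler generates; the function names mul_rolled and mul_unrolled4 and the factor of 4 are assumptions chosen for the example. For 10 elements, the unrolled loop runs 2 iterations and a remainder loop handles the last 2 elements.

```c
/* Hypothetical reference version: one element per iteration. */
void mul_rolled(double *a, const double *b, const double *c, int count)
{
    int i;
    for (i = 0; i < count; i++) {
        a[i] = b[i] * c[i];
    }
}

/* Hypothetical hand-unrolled version with a factor of 4.
   Returns how many times the unrolled loop body ran. */
int mul_unrolled4(double *a, const double *b, const double *c, int count)
{
    int i = 0;
    int main_iterations = 0;

    /* Main loop: four copies of the body per iteration. */
    for (; i + 4 <= count; i += 4) {
        a[i]     = b[i]     * c[i];
        a[i + 1] = b[i + 1] * c[i + 1];
        a[i + 2] = b[i + 2] * c[i + 2];
        a[i + 3] = b[i + 3] * c[i + 3];
        main_iterations++;
    }

    /* Remainder loop: the count % 4 elements left over. */
    for (; i < count; i++) {
        a[i] = b[i] * c[i];
    }
    return main_iterations;
}
```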

Specifying -qunroll without any suboptions is equivalent to -qunroll=yes.

-qnounroll is equivalent to -qunroll=no.

Usage

The pragma overrides the option setting for a designated loop. However, even if #pragma unroll is specified for a given loop, the compiler remains the final arbiter of whether the loop is actually unrolled.
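
As a minimal sketch of how the pragma and the option interact, consider a file compiled with -qunroll=4. The function names scale and sum are hypothetical. The first loop carries its own pragma, so it is a candidate for a factor of 2; the second loop has no pragma and falls back to the option's factor. In both cases the compiler may still decide not to unroll.

```c
/* Hypothetical example: the pragma requests a factor of 2 for this
   loop, regardless of the -qunroll setting. */
void scale(double *a, const double *b, int count, double k)
{
    int i;
#pragma unroll(2)
    for (i = 0; i < count; i++) {
        a[i] = k * b[i];
    }
}

/* Hypothetical example: no pragma, so this loop follows the option
   in effect (a factor of 4 under -qunroll=4). */
double sum(const double *b, int count)
{
    int i;
    double total = 0.0;
    for (i = 0; i < count; i++) {
        total += b[i];
    }
    return total;
}
```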

Only one pragma may be specified on a loop. The pragma must appear immediately before the loop or the #pragma block_loop directive to have effect.

The pragma affects only the loop that follows it. An inner nested loop requires its own #pragma unroll directive immediately before it if the desired unrolling strategy differs from that of the prevailing option.
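
The following sketch shows a pragma placed on an inner loop only. The function name add_matrices, the DIM macro, and the factor of 4 are assumptions made for illustration. The outer loop has no pragma and therefore follows whatever -qunroll setting is in effect.

```c
#define DIM 8   /* hypothetical matrix dimension */

void add_matrices(double out[DIM][DIM], double x[DIM][DIM], double y[DIM][DIM])
{
    int i, j;

    /* Outer loop: no pragma, governed by the prevailing option. */
    for (i = 0; i < DIM; i++) {
        /* Inner loop: the directive applies to this loop alone. */
#pragma unroll(4)
        for (j = 0; j < DIM; j++) {
            out[i][j] = x[i][j] + y[i][j];
        }
    }
}
```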

The #pragma unroll and #pragma nounroll directives can be used only on for loops or #pragma block_loop directives. They cannot be applied to do-while or while loops.
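
One way to work within this restriction is to rewrite a counted while loop as a for loop. The sketch below assumes hypothetical function names (count_positive_while, count_positive_for) and an arbitrary factor of 4; both functions compute the same result, but only the second has a loop form that accepts the directive.

```c
/* while form: the unroll directives cannot be applied here. */
int count_positive_while(const int *v, int count)
{
    int i = 0;
    int hits = 0;
    while (i < count) {
        if (v[i] > 0) {
            hits++;
        }
        i++;
    }
    return hits;
}

/* Equivalent for form: the directive may precede this loop. */
int count_positive_for(const int *v, int count)
{
    int i;
    int hits = 0;
#pragma unroll(4)
    for (i = 0; i < count; i++) {
        if (v[i] > 0) {
            hits++;
        }
    }
    return hits;
}
```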

The loop structure must meet the following conditions:
  • There must be only one loop counter variable, one increment point for that variable, and one termination variable. These cannot be altered at any point in the loop nest.
  • Loops cannot have multiple entry and exit points. The loop termination must be the only means to exit the loop.
  • Dependencies in the loop must not be "backwards-looking". For example, a statement such as A[i][j] = A[i -1][j + 1] + 4 must not appear within the loop.
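
The dependency condition can be illustrated with two loop nests. The function names add_four and shift_diagonal and the ROWS and COLS macros are hypothetical. In the first nest, each element of a is computed from b alone, so the iterations are independent. The second nest uses the statement from the condition above: each row reads values written while processing the previous row, which is the kind of dependency that does not meet the conditions, so no directive is placed on it.

```c
#define ROWS 4   /* hypothetical array dimensions */
#define COLS 4

/* Independent iterations: a[i][j] depends only on b[i][j]. */
void add_four(int a[ROWS][COLS], int b[ROWS][COLS])
{
    int i, j;
    for (i = 0; i < ROWS; i++) {
#pragma unroll(2)
        for (j = 0; j < COLS; j++) {
            a[i][j] = b[i][j] + 4;
        }
    }
}

/* Backwards-looking dependency: row i reads elements of row i-1
   that were written earlier in the same loop nest. */
void shift_diagonal(int a[ROWS][COLS])
{
    int i, j;
    for (i = 1; i < ROWS; i++) {
        for (j = 0; j < COLS - 1; j++) {
            a[i][j] = a[i - 1][j + 1] + 4;
        }
    }
}
```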

Predefined macros

None.

Examples

In the following example, the #pragma unroll(3) directive on the first for loop asks the compiler to replicate the body of the loop three times. The #pragma unroll directive on the second for loop lets the compiler choose the unrolling factor and decide whether to unroll at all.
#pragma unroll(3)
for (i = 0; i < n; i++)
{
      a[i] = b[i] * c[i];
}

#pragma unroll
for (j = 0; j < n; j++)
{
      a[j] = b[j] * c[j];
}
In this example, the #pragma unroll(3) directive on the first loop produces code equivalent to the following:
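i=0;
if (i>n-2) goto remainder;    /* too few iterations for the unrolled loop */
for (; i<n-2; i+=3) {         /* main loop: three copies of the body */
  a[i]=b[i] * c[i];
  a[i+1]=b[i+1] * c[i+1];
  a[i+2]=b[i+2] * c[i+2];
}
if (i<n) {
  remainder:                  /* leftover iterations, at most two */
  for (; i<n; i++) {
    a[i]=b[i] * c[i];
  }
}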

Related information