#pragma stream_unroll

Category

Optimization and tuning

Purpose

When optimization is enabled, breaks a stream contained in a for loop into multiple streams.

Syntax

>>-#--pragma--stream_unroll--+--------------+------------------><
                             '-(--number--)-'   

Parameters

number
A loop unrolling factor. The value of number is a positive integral constant expression.

An unroll factor of 1 disables unrolling.

If number is not specified, the optimizer determines an appropriate unrolling factor for each nested loop.
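The three cases described above can be sketched as follows. This is a minimal illustration, not taken from the compiler documentation: the function names scale_by, add_arrays, and copy_array are hypothetical, and the pragma only has an effect when the compiler recognizes it and the required optimization options are in effect. Other compilers typically ignore it, possibly with a warning.

```c
#include <assert.h>

/* Hypothetical helper functions for illustration only. */

/* No factor given: the optimizer picks the unrolling factor. */
void scale_by(int *dst, const int *src, int k, int n)
{
    int i;
    assert(n >= 0);
#pragma stream_unroll
    for (i = 0; i < n; i++) {
        dst[i] = k * src[i];
    }
}

/* Explicit factor: ask for the loop to be split into four streams. */
void add_arrays(int *dst, const int *x, const int *y, int n)
{
    int i;
    assert(n >= 0);
#pragma stream_unroll(4)
    for (i = 0; i < n; i++) {
        dst[i] = x[i] + y[i];
    }
}

/* Factor of 1: stream unrolling is disabled for this loop. */
void copy_array(int *dst, const int *src, int n)
{
    int i;
    assert(n >= 0);
#pragma stream_unroll(1)
    for (i = 0; i < n; i++) {
        dst[i] = src[i];
    }
}
```

In each function the pragma is placed immediately before the for loop it applies to, and the results are the same whether or not the pragma is honored.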

Usage

To enable stream unrolling, you must specify -qhot and -qnostrict, or -qsmp, or use optimization level -O4 or higher. If -qstrict is in effect, no stream unrolling takes place.

For stream unrolling to occur, the #pragma stream_unroll directive must be the last pragma specified preceding a for loop. Specifying #pragma stream_unroll more than once for the same for loop or combining it with other loop unrolling pragmas (#pragma unroll, #pragma nounroll, #pragma unrollandfuse, #pragma nounrollandfuse) results in a warning.

Examples

The following example shows how #pragma stream_unroll can increase performance.
int i, m, n;
int a[1000];
int b[1000];
int c[1000];

....

#pragma stream_unroll(4)
for (i=0; i<n; i++) {
  a[i] = b[i] * c[i];
}
Assuming n is a multiple of 4, the unroll factor of 4 reduces the number of iterations from n to n/4, and each statement in the loop body works on its own quarter of the arrays, as follows:
m = n/4;

for (i=0; i<m; i++){
  a[i] = b[i] * c[i];
  a[i+m] = b[i+m] * c[i+m];
  a[i+2*m] = b[i+2*m] * c[i+2*m];
  a[i+3*m] = b[i+3*m] * c[i+3*m];
}
The increased number of read and store operations is distributed among a number of streams determined by the compiler, which reduces computation time and increases performance.
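To check that a four-stream form computes the same values as the original loop, the transformation can be written by hand and compared against the single-stream version. The sketch below is an illustration only: the function names multiply_one_stream, multiply_four_streams, and streams_match are hypothetical, and the cleanup loop for lengths that are not a multiple of 4 is an addition made here. The code the compiler actually generates may differ.

```c
#include <assert.h>

/* Hypothetical helper names; none of these come from the compiler. */

/* The original loop: one stream over a, b and c. */
void multiply_one_stream(int *a, const int *b, const int *c, int n)
{
    int i;
    for (i = 0; i < n; i++) {
        a[i] = b[i] * c[i];
    }
}

/* Hand-written four-stream form: each statement walks its own quarter
   of the arrays. A cleanup loop (an addition for this sketch) handles
   the last n % 4 elements. */
void multiply_four_streams(int *a, const int *b, const int *c, int n)
{
    int i;
    int m = n / 4;
    for (i = 0; i < m; i++) {
        a[i]         = b[i]         * c[i];
        a[i + m]     = b[i + m]     * c[i + m];
        a[i + 2 * m] = b[i + 2 * m] * c[i + 2 * m];
        a[i + 3 * m] = b[i + 3 * m] * c[i + 3 * m];
    }
    for (i = 4 * m; i < n; i++) {
        a[i] = b[i] * c[i];
    }
}

/* Returns 1 if both forms fill the output identically for length n. */
int streams_match(int n)
{
    static int b[1000], c[1000], one[1000], four[1000];
    int i;
    assert(n >= 0 && n <= 1000);
    for (i = 0; i < n; i++) {
        b[i] = i + 1;
        c[i] = 2 * i - 7;
        one[i] = 0;
        four[i] = -1;
    }
    multiply_one_stream(one, b, c, n);
    multiply_four_streams(four, b, c, n);
    for (i = 0; i < n; i++) {
        if (one[i] != four[i]) return 0;
    }
    return 1;
}
```

Calling streams_match with lengths such as 7 or 1000 exercises both the four-stream body and the cleanup loop.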
