UNROLL_AND_FUSE

Purpose

The UNROLL_AND_FUSE directive instructs the compiler to attempt a loop unroll and fuse where applicable. Loop unrolling replicates the body of multiple DO loops and combines the necessary iterations into a single unrolled loop. Using a fused loop can reduce the required number of loop iterations and the frequency of cache misses. Applying the UNROLL_AND_FUSE directive to a loop with dependencies between its iterations can produce unexpected results.
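To make the warning about dependencies concrete, the following sketch applies the transformation by hand to a loop nest in which each iteration reads a value written by an earlier one. The program name DEPDEMO, the size N, and the arrays REF and FUSED are illustrative assumptions, and the second nest only imitates what an unroll factor of 2 could produce; it is not actual compiler output.

```fortran
      PROGRAM DEPDEMO
      IMPLICIT NONE
      INTEGER, PARAMETER :: N = 5
      INTEGER, DIMENSION(N, N) :: REF, FUSED
      INTEGER :: I, J

      REF = 0
      FUSED = 0

!     Original nest: iteration (I,J) reads the element that
!     iteration (I-1,J+1) wrote earlier.
      DO I = 2, N
         DO J = 1, N - 1
            REF(J,I) = REF(J+1,I-1) + 1
         END DO
      END DO

!     Hand-applied unroll and fuse by 2 (the I loop runs N - 1 = 4
!     times, so no remainder iterations are needed). The second
!     statement reads FUSED(J+1,I) before iteration J+1 writes it.
      DO I = 2, N, 2
         DO J = 1, N - 1
            FUSED(J,I) = FUSED(J+1,I-1) + 1
            FUSED(J,I+1) = FUSED(J+1,I) + 1
         END DO
      END DO

      IF (ALL(REF == FUSED)) THEN
         PRINT *, 'results match'
      ELSE
         PRINT *, 'results differ'
      END IF
      END PROGRAM DEPDEMO
```

In this sketch the two nests disagree as early as element (1,3): the original order computes 2, while the fused order reads a value that has not been updated yet and computes 1.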

Syntax

>>-+-UNROLL_AND_FUSE--+---------------------+-+----------------><
   |                  '-(--unroll_factor--)-' |   
   '-NOUNROLL_AND_FUSE------------------------'   

unroll_factor
The unroll_factor must be a positive scalar integer constant expression. An unroll_factor of 1 disables loop unrolling. If you do not specify an unroll_factor, the compiler determines the unroll factor.
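The forms allowed by the syntax diagram can be sketched as follows. The program name UAFFORMS, the size N, and the loop bodies are illustrative assumptions; only the directive lines follow the syntax above. The comment on NOUNROLL_AND_FUSE reflects the usual meaning of a negative directive form and is an assumption, since this section does not define it.

```fortran
      PROGRAM UAFFORMS
      IMPLICIT NONE
      INTEGER, PARAMETER :: N = 100
      INTEGER, DIMENSION(N, N) :: A, B
      INTEGER :: I, J

      B = 1

!     Explicit factor: request four copies of the outer loop body.
!IBM* UNROLL_AND_FUSE(4)
      DO I = 1, N
         DO J = 1, N
            A(J,I) = B(I,J) + 1
         END DO
      END DO

!     No factor: the compiler chooses the unroll factor.
!IBM* UNROLL_AND_FUSE
      DO I = 1, N
         DO J = 1, N
            A(J,I) = A(J,I) + B(I,J)
         END DO
      END DO

!     Negative form (assumed meaning): do not unroll and fuse.
!IBM* NOUNROLL_AND_FUSE
      DO I = 1, N
         DO J = 1, N
            A(J,I) = A(J,I) * 2
         END DO
      END DO

      PRINT *, A(1,1)
      END PROGRAM UAFFORMS
```

Each directive line begins with a comment character, so a compiler that does not recognize the directive treats it as an ordinary comment and compiles the loops unchanged.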

Rules

You must specify one of the following compiler options to enable loop unrolling:
  • -O3 or higher optimization level
  • -qhot compiler option
  • -qsmp compiler option
Note that if the -qstrict option is in effect, no loop unrolling will occur. If you want to enable loop unrolling with the -qhot option alone, you must also specify -qnostrict.

The UNROLL_AND_FUSE directive must immediately precede a DO loop.

You must not specify the UNROLL_AND_FUSE directive for the innermost DO loop.

You must not specify the UNROLL_AND_FUSE directive more than once, or combine the directive with NOUNROLL_AND_FUSE, NOUNROLL, UNROLL, or STREAM_UNROLL directives for the same DO construct.

You must not specify the UNROLL_AND_FUSE directive for a DO WHILE loop or an infinite DO loop.

Examples

Example 1: In the following example, the UNROLL_AND_FUSE directive replicates and fuses the body of the loop. This reduces the number of cache misses for array B.
      INTEGER, DIMENSION(1000, 1000) :: A, B, C
!IBM* UNROLL_AND_FUSE(2)
      DO I = 1, 1000
         DO J = 1, 1000
            A(J,I) = B(I,J) * C(J,I)
         END DO
      END DO
      END
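The DO loop below shows a possible result of applying the UNROLL_AND_FUSE directive. Because Fortran stores arrays in column-major order, B(I,J) and B(I+1,J) are adjacent in memory, so each pass through the fused loop body uses two neighboring elements of B instead of one.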
      DO I = 1, 1000, 2
         DO J = 1, 1000
            A(J,I) = B(I,J) * C(J,I)
            A(J,I+1) = B(I+1,J) * C(J,I+1)
         END DO
      END DO
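The result shown above works out evenly because the trip count of the I loop (1000) is a multiple of the unroll factor (2). As a hedged illustration of what is needed otherwise, the sketch below applies the same transformation by hand to an odd trip count and adds a remainder loop. How the compiler itself handles leftover iterations is not described in this section; the program name UAFREM, the size N, the input values, and the names REF, FUSED, and ILAST are assumptions made for the example.

```fortran
      PROGRAM UAFREM
      IMPLICIT NONE
      INTEGER, PARAMETER :: N = 7
      INTEGER, DIMENSION(N, N) :: REF, FUSED, B, C
      INTEGER :: I, J, ILAST

!     Fill the inputs with distinct values.
      DO I = 1, N
         DO J = 1, N
            B(J,I) = J + N * I
            C(J,I) = J - I
         END DO
      END DO

!     Reference: the original loop nest.
      DO I = 1, N
         DO J = 1, N
            REF(J,I) = B(I,J) * C(J,I)
         END DO
      END DO

!     Unrolled and fused by 2 over the largest even number of
!     I iterations.
      ILAST = N - MOD(N, 2)
      DO I = 1, ILAST, 2
         DO J = 1, N
            FUSED(J,I) = B(I,J) * C(J,I)
            FUSED(J,I+1) = B(I+1,J) * C(J,I+1)
         END DO
      END DO

!     Remainder loop: the iterations the unrolled loop skipped.
      DO I = ILAST + 1, N
         DO J = 1, N
            FUSED(J,I) = B(I,J) * C(J,I)
         END DO
      END DO

      IF (ALL(REF == FUSED)) THEN
         PRINT *, 'results match'
      ELSE
         PRINT *, 'results differ'
      END IF
      END PROGRAM UAFREM
```

With N = 7 the unrolled loop covers columns 1 through 6 and the remainder loop covers column 7, so every element of FUSED is assigned exactly once and the comparison reports a match.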
Example 2: The following example uses multiple UNROLL_AND_FUSE directives:
      INTEGER, DIMENSION(1000, 1000) :: A, B, C, D, H
!IBM* UNROLL_AND_FUSE(4)
      DO I = 1, 1000
!IBM* UNROLL_AND_FUSE(2)
         DO J = 1, 1000
            DO K = 1, 1000
               A(J,I) = B(I,J) * C(J,I) + D(J,K) * H(I,K)
            END DO
         END DO
      END DO
      END
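Example 2 does not show the transformed loops. One conceivable outcome, written out by hand, is sketched below: the I loop is unrolled by 4, the J loop by 2, and all eight copies of the statement are fused into a single K loop. This is an assumption about the shape of the result, not documented compiler output. The program name UAFNEST, the reduced size N = 8, the input values, and the array REF are also illustrative.

```fortran
      PROGRAM UAFNEST
      IMPLICIT NONE
      INTEGER, PARAMETER :: N = 8
      INTEGER, DIMENSION(N, N) :: A, REF, B, C, D, H
      INTEGER :: I, J, K

!     Fill the inputs with distinct values.
      DO I = 1, N
         DO J = 1, N
            B(J,I) = I + J
            C(J,I) = I - J
            D(J,I) = 2 * I + J
            H(J,I) = I * J
         END DO
      END DO

!     Reference: the loop nest of Example 2 without directives.
      DO I = 1, N
         DO J = 1, N
            DO K = 1, N
               REF(J,I) = B(I,J)*C(J,I) + D(J,K)*H(I,K)
            END DO
         END DO
      END DO

!     I unrolled by 4 and J by 2; all copies share one K loop.
      DO I = 1, N, 4
         DO J = 1, N, 2
            DO K = 1, N
               A(J,I) = B(I,J)*C(J,I) + D(J,K)*H(I,K)
               A(J+1,I) = B(I,J+1)*C(J+1,I) + D(J+1,K)*H(I,K)
               A(J,I+1) = B(I+1,J)*C(J,I+1) + D(J,K)*H(I+1,K)
               A(J+1,I+1) = B(I+1,J+1)*C(J+1,I+1) + D(J+1,K)*H(I+1,K)
               A(J,I+2) = B(I+2,J)*C(J,I+2) + D(J,K)*H(I+2,K)
               A(J+1,I+2) = B(I+2,J+1)*C(J+1,I+2) + D(J+1,K)*H(I+2,K)
               A(J,I+3) = B(I+3,J)*C(J,I+3) + D(J,K)*H(I+3,K)
               A(J+1,I+3) = B(I+3,J+1)*C(J+1,I+3) + D(J+1,K)*H(I+3,K)
            END DO
         END DO
      END DO

      IF (ALL(REF == A)) THEN
         PRINT *, 'results match'
      ELSE
         PRINT *, 'results differ'
      END IF
      END PROGRAM UAFNEST
```

N is chosen as a multiple of both unroll factors so that no remainder loops are needed. Within one pass of the K loop, D(J,K) and D(J+1,K) are each used four times and H(I,K) through H(I+3,K) twice each, which is where the reuse from fusing comes from.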
