The UNROLL_AND_FUSE directive instructs the compiler to attempt to unroll and fuse a loop nest where applicable. Unrolling and fusing replicates the body of the DO loop that the directive precedes and fuses the resulting copies of the inner loop into a single loop. A fused loop can reduce the number of loop iterations required and the frequency of cache misses. Applying the UNROLL_AND_FUSE directive to a loop with dependencies can produce unexpected results.
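The following sketch illustrates why a loop with dependencies is unsafe to unroll and fuse. It is not compiler output: the second loop nest is a hand-written transformation with an unroll factor of 2, and the program name, the array shapes, and the small trip count N = 4 are assumptions chosen only for illustration.

```fortran
PROGRAM DEPENDENCE_SKETCH
  IMPLICIT NONE
  INTEGER, PARAMETER :: N = 4          ! small, even trip count (assumption)
  INTEGER :: A(N+1, 0:N), B(N+1, 0:N)  ! extra row and column 0 hold boundary values
  INTEGER :: I, J

  A = 0
  B = 0

  ! Original order: all J iterations for column I-1 finish before column I starts,
  ! so A(J+1,I-1) has already been updated when it is read.
  DO I = 1, N
    DO J = 1, N
      A(J,I) = A(J+1,I-1) + 1
    END DO
  END DO

  ! Hand-written unroll and fuse by 2 (an illustration, not what any
  ! particular compiler is guaranteed to generate).
  DO I = 1, N, 2
    DO J = 1, N
      B(J,I)   = B(J+1,I-1) + 1
      B(J,I+1) = B(J+1,I)   + 1   ! reads B(J+1,I) before iteration J+1 has written it
    END DO
  END DO

  PRINT *, 'Results differ: ', ANY(A /= B)
END PROGRAM DEPENDENCE_SKETCH
```

Because the fused body reads B(J+1,I) one J iteration before that element is assigned, the two arrays end up with different contents and the program reports that the results differ.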
The UNROLL_AND_FUSE directive must immediately precede a DO loop.
You must not specify the UNROLL_AND_FUSE directive for the innermost DO loop.
You must not specify the UNROLL_AND_FUSE directive more than once, or combine the directive with NOUNROLL_AND_FUSE, NOUNROLL, UNROLL, or STREAM_UNROLL directives for the same DO construct.
You must not specify the UNROLL_AND_FUSE directive for a DO WHILE loop or an infinite DO loop.
INTEGER, DIMENSION(1000, 1000) :: A, B, C
!IBM* UNROLL_AND_FUSE(2)
DO I = 1, 1000
  DO J = 1, 1000
    A(J,I) = B(I,J) * C(J,I)
  END DO
END DO
END
The DO loop below shows a possible result of applying the UNROLL_AND_FUSE directive.
DO I = 1, 1000, 2
  DO J = 1, 1000
    A(J,I) = B(I,J) * C(J,I)
    A(J,I+1) = B(I+1,J) * C(J,I+1)
  END DO
END DO
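As a rough check of the transformation shown above, the sketch below runs the original loop nest and a hand-unrolled version side by side and compares the results. The program name, the second result array A2, the reduced size N = 100, and the fill values for B and C are assumptions made for illustration; they do not come from the compiler or its documentation.

```fortran
PROGRAM UNROLL_AND_FUSE_CHECK
  IMPLICIT NONE
  INTEGER, PARAMETER :: N = 100   ! even, so a factor of 2 leaves no remainder iteration
  INTEGER, DIMENSION(N, N) :: A, A2, B, C
  INTEGER :: I, J

  ! Fill the inputs with arbitrary, non-symmetric values.
  DO I = 1, N
    DO J = 1, N
      B(J,I) = J + 2*I
      C(J,I) = J - I
    END DO
  END DO

  ! Original loop nest.
  DO I = 1, N
    DO J = 1, N
      A(J,I) = B(I,J) * C(J,I)
    END DO
  END DO

  ! Hand-written equivalent of unrolling the I loop by 2 and fusing the J loops.
  DO I = 1, N, 2
    DO J = 1, N
      A2(J,I)   = B(I,J)   * C(J,I)
      A2(J,I+1) = B(I+1,J) * C(J,I+1)
    END DO
  END DO

  PRINT *, 'Results match: ', ALL(A == A2)
END PROGRAM UNROLL_AND_FUSE_CHECK
```

Each element of A depends only on B and C, so no iteration reads a value written by another and the two versions agree. In a hand-written version, a trip count that is not a multiple of the unroll factor would also need a remainder iteration after the unrolled loop; the example above avoids this by using an even N.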
INTEGER, DIMENSION(1000, 1000) :: A, B, C, D, H
!IBM* UNROLL_AND_FUSE(4)
DO I = 1, 1000
  !IBM* UNROLL_AND_FUSE(2)
  DO J = 1, 1000
    DO K = 1, 1000
      A(J,I) = B(I,J) * C(J,I) + D(J,K)*H(I,K)
    END DO
  END DO
END DO
END
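No result is shown for the nested example above. The sketch below is one plausible hand-written interpretation, assuming the I loop is unrolled by 4, the J loop by 2, and the eight resulting copies of the K loop body are fused. The compiler may produce something different; the program name, the array A2, the size N = 8, and the fill values are assumptions for illustration.

```fortran
PROGRAM NESTED_UNROLL_SKETCH
  IMPLICIT NONE
  INTEGER, PARAMETER :: N = 8   ! multiple of both unroll factors (4 and 2)
  INTEGER, DIMENSION(N, N) :: A, A2, B, C, D, H
  INTEGER :: I, J, K

  ! Fill the inputs with arbitrary values.
  DO I = 1, N
    DO J = 1, N
      B(J,I) = J + I
      C(J,I) = J - I
      D(J,I) = 2*J + I
      H(J,I) = J * I
    END DO
  END DO

  ! Original loop nest.
  DO I = 1, N
    DO J = 1, N
      DO K = 1, N
        A(J,I) = B(I,J) * C(J,I) + D(J,K)*H(I,K)
      END DO
    END DO
  END DO

  ! Hand-written sketch: I unrolled by 4, J unrolled by 2, K loops fused.
  DO I = 1, N, 4
    DO J = 1, N, 2
      DO K = 1, N
        A2(J,I)     = B(I,J)     * C(J,I)     + D(J,K)*H(I,K)
        A2(J+1,I)   = B(I,J+1)   * C(J+1,I)   + D(J+1,K)*H(I,K)
        A2(J,I+1)   = B(I+1,J)   * C(J,I+1)   + D(J,K)*H(I+1,K)
        A2(J+1,I+1) = B(I+1,J+1) * C(J+1,I+1) + D(J+1,K)*H(I+1,K)
        A2(J,I+2)   = B(I+2,J)   * C(J,I+2)   + D(J,K)*H(I+2,K)
        A2(J+1,I+2) = B(I+2,J+1) * C(J+1,I+2) + D(J+1,K)*H(I+2,K)
        A2(J,I+3)   = B(I+3,J)   * C(J,I+3)   + D(J,K)*H(I+3,K)
        A2(J+1,I+3) = B(I+3,J+1) * C(J+1,I+3) + D(J+1,K)*H(I+3,K)
      END DO
    END DO
  END DO

  PRINT *, 'Results match: ', ALL(A == A2)
END PROGRAM NESTED_UNROLL_SKETCH
```

In the fused body, each D(J,K) and H(I,K) value is used by several statements in the same K iteration, which is the kind of reuse that can reduce cache misses.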