Optimization and tuning

You can control the optimization and tuning process, which can improve the run-time performance of your application, using the options in the following table. Not all options benefit all applications: optimization can come at the cost of longer compile times and reduced debugging capability, and you need to weigh those costs against the performance gained. In addition to the option descriptions in this section, consult the XL Fortran Optimization and Programming Guide for details on the optimization and tuning process and on writing optimization-friendly source code.

Some of the options in Floating-point and integer control can also improve performance, but you must use them with care to ensure your application retains the floating-point semantics it requires.

Table 1. Optimization and tuning options
Option name @PROCESS directive Description
-qalias ALIAS(argument_list)

Indicates whether a program contains certain categories of aliasing or does not conform to Fortran standard aliasing rules. The compiler limits the scope of some optimizations when there is a possibility that different names are aliases for the same storage location.

-qarch None.

Specifies the processor architecture, or family of architectures, where the code may run. This allows the compiler to take maximum advantage of the machine instructions specific to an architecture, or common to a family of architectures.

-qassert ASSERT

Provides information about the characteristics of your code that can help the compiler fine-tune optimizations.

-qcache None.

Specifies the cache configuration for a specific execution machine.

-qcompact COMPACT

Avoids optimizations that increase code size.

-qdirectstorage None.

Informs the compiler that a given compilation unit may reference write-through-enabled or cache-inhibited storage.

-qessl None.

Allows the compiler to substitute the Engineering and Scientific Subroutine Library (ESSL) routines in place of Fortran 90 intrinsic procedures.

-qfdpr None.

Provides object files with information that the IBM Feedback Directed Program Restructuring (FDPR®) performance-tuning utility needs to optimize the resulting executable file.

-qhot HOT(suboptions)

Performs high-order loop analysis and transformations (HOT) during optimization.

-qinline None.

Attempts to inline procedures instead of generating calls to those procedures, for improved performance.

-qipa None.

Enables or customizes a class of optimizations known as interprocedural analysis (IPA).

-qlargepage None.

Takes advantage of large pages provided on POWER4 and higher systems, for applications designed to execute in a large page memory environment.

-qlibansi None.

Assumes that all functions with the name of an ANSI C library function are, in fact, the library functions and not user functions with different semantics.

-qlibessl None.

Assumes that all functions with the name of an ESSL library function are, in fact, the library functions and not user functions with different semantics.

-qlibmpi None.

Asserts that all functions with Message Passing Interface (MPI) names are, in fact, MPI functions and not user functions with different semantics.

-qlibposix None.

Assumes that all functions with the name of a POSIX 1003.1 library function are, in fact, the system functions and not user functions with different semantics.

-qmaxmem MAXMEM

Limits the amount of memory that the compiler allocates while performing specific, memory-intensive optimizations to the specified number of kilobytes.

-qminimaltoc None.

Minimizes the number of entries in the global entity table of contents (TOC).

-O OPTIMIZE

Specifies whether to optimize code during compilation and, if so, at which level.

-p None.

Prepares the object files produced by the compiler for profiling.

-qpdf1, -qpdf2 None.

Tunes optimizations through profile-directed feedback (PDF), where results from sample program execution are used to improve optimization near conditional branches and in frequently executed code sections.

-qprefetch None.

Inserts prefetch instructions automatically where there are opportunities to improve code performance.

-qshowpdf None.

When used with -qpdf1 and a minimum optimization level of -O2 at compile and link steps, creates a PDF map file that contains additional profiling information for all procedures in your application.

-qsimd None.

Controls whether the compiler can automatically take advantage of vector instructions for processors that support them.

-qsmallstack None.

Minimizes stack usage where possible.

-qsmp None.

Enables parallelization of program code.

-qstacktemp None.

Determines where to allocate certain XL Fortran compiler temporaries at run time.

-qstrict STRICT

Ensures that optimizations done by default at the -O3 and higher optimization levels, and optionally at -O2, do not alter certain program semantics, mostly those related to strict IEEE floating-point conformance.

-qstrict_induction None.

Prevents the compiler from performing induction (loop counter) variable optimizations. These optimizations may be unsafe (may alter the semantics of your program) when there are integer overflow operations involving the induction variables.

-qtune None.

Tunes instruction selection, scheduling, and other architecture-dependent performance enhancements to run best on a specific hardware architecture. Allows specification of a target SMT mode to direct optimizations for best performance in that mode.

-qunroll None.

Specifies whether unrolling DO loops is allowed in a program. Unrolling is allowed on outer and inner DO loops.

-qunwind None.

Specifies that the compiler will preserve the default behavior for saves and restores to volatile registers during a procedure call.
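
The -qalias entry in the table refers to programs that do not conform to Fortran standard aliasing rules. As a minimal sketch of what such a program can look like, the following example passes the same array for two dummy arguments while the procedure modifies one of them, which the Fortran standard does not permit. The program and procedure names (alias_demo, pair_sum) are hypothetical and chosen only for illustration; they do not come from the XL Fortran documentation.

```fortran
! Hypothetical illustration of argument aliasing; not taken from the
! XL Fortran documentation.
program alias_demo
  implicit none
  integer, parameter :: n = 4
  real :: x(n), y(n)

  x = (/ 1.0, 2.0, 3.0, 4.0 /)
  y = 0.0

  ! Conforming call: dst and src are distinct arrays, so the compiler
  ! may safely assume that stores to dst never change src.
  call pair_sum(y, x, n)
  print *, 'distinct arguments:', y

  ! Nonconforming call: the same array is associated with both dummy
  ! arguments and the procedure defines dst. An optimizer that assumes
  ! standard aliasing rules may reorder or vectorize the loop, so the
  ! values printed here are not guaranteed.
  call pair_sum(x, x, n)
  print *, 'aliased arguments: ', x

contains

  ! Stores src(i-1) + src(i) into dst(i); dst(1) is a copy of src(1).
  subroutine pair_sum(dst, src, n)
    integer, intent(in) :: n
    real, intent(inout) :: dst(n)
    real, intent(in)    :: src(n)
    integer :: i

    dst(1) = src(1)
    do i = 2, n
      ! If dst and src share storage, src(i-1) was already overwritten
      ! by the previous iteration's store to dst(i-1).
      dst(i) = src(i-1) + src(i)
    end do
  end subroutine pair_sum

end program alias_demo
```

In the first call the result depends only on the input values. In the second call the result depends on the order in which the compiler performs the loads and stores, so it can change between optimization levels. If code like this cannot be rewritten, consult the -qalias option description for how to inform the compiler about the aliasing.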