-qtune

Category

Optimization and tuning

@PROCESS

None.

Purpose

Tunes instruction selection, scheduling, and other architecture-dependent performance enhancements to run best on a specific hardware architecture. Allows specification of a target SMT mode to direct optimizations for best performance in that mode.

Syntax

                 .-balanced-.                        
>>- -q--tune--=--+-auto-----+--+-----------------+-------------><
                 +-pwr5-----+  |    .-st-------. |   
                 +-pwr6-----+  '-:--+-balanced-+-'   
                 +-pwr7-----+       +-smt2-----+     
                 '-pwr8-----'       +-smt4-----+     
                                    '-smt8-----'     
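
As an informal reading of the diagram, the following shell sketch checks whether a value has the form cpu or cpu:smt, with each part drawn from the suboptions shown. The function name is_qtune_value is hypothetical and is not part of the compiler. It checks syntax only, not whether the combination is acceptable for a given -qarch setting.

```shell
# Hypothetical helper, not part of XL Fortran: checks a -qtune value
# against the syntax diagram (CPU suboption, optional :SMT suboption).
is_qtune_value() {
    value=$1
    cpu=${value%%:*}                # text before the first colon
    case $cpu in
        auto|balanced|pwr5|pwr6|pwr7|pwr8) ;;
        *) return 1 ;;              # unknown CPU suboption
    esac
    case $value in
        *:*)
            smt=${value#*:}         # text after the first colon
            case $smt in
                st|balanced|smt2|smt4|smt8) ;;
                *) return 1 ;;      # unknown or empty SMT suboption
            esac ;;
    esac
    return 0
}

is_qtune_value pwr8:smt4  && echo "pwr8:smt4 matches the diagram"
is_qtune_value pwr8:smt16 || echo "pwr8:smt16 does not match the diagram"
```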

Defaults

-qtune=balanced:balanced when no valid -qarch setting is in effect. Otherwise, the default depends on the effective -qarch setting. See Table 1 for details.

Parameters for CPU suboptions

The following -qtune CPU suboptions allow you to specify a particular architecture for the compiler to target for best performance:

auto
Optimizations are tuned for the platform on which the application is compiled.
balanced
Optimizations are tuned across a selected range of recent hardware.
pwr5
Optimizations are tuned for the POWER5 hardware platforms.
pwr6
Optimizations are tuned for the POWER6® hardware platforms.
pwr7
Optimizations are tuned for the POWER7® or POWER7+™ hardware platforms.
pwr8
Optimizations are tuned for the POWER8™ hardware platforms.

Parameters for SMT suboptions

The following -qtune simultaneous multithreading (SMT) suboptions allow you to optionally specify an execution mode for the compiler to target for best performance:

balanced
Optimizations are tuned for performance across various SMT modes for a selected range of recent hardware.
st
Optimizations are tuned for single-threaded execution.
smt2
Optimizations are tuned for SMT2 execution mode (two threads).
smt4
Optimizations are tuned for SMT4 execution mode (four threads).
smt8
Optimizations are tuned for SMT8 execution mode (eight threads).

Usage

If you want your program to run on more than one architecture but be tuned for a particular one, use the -qarch and -qtune options together. These options are primarily of benefit for floating-point intensive programs.

By arranging (scheduling) the generated machine instructions to take maximum advantage of hardware features such as cache size and pipelining, -qtune can improve performance. -qtune has an effect only when it is used together with options that enable optimization.
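
The points above can be sketched as a single invocation. The file names are the ones used elsewhere in this section, and the choice of -O3, -qarch=pwr6, and -qtune=pwr7 is only an assumed example: -O3 enables optimization so that -qtune can take effect, -qarch=pwr6 selects the target instruction set, and -qtune=pwr7 tunes scheduling for POWER7. The FLAGS variable is a hypothetical convenience. The command is echoed rather than run, so remove the echo to invoke the compiler.

```shell
# Assumed example flags; any option that enables optimization would do for -O3.
FLAGS="-O3 -qarch=pwr6 -qtune=pwr7"

# Print the command line; remove "echo" to run the compiler (requires xlf).
echo xlf $FLAGS -o testing myprogram.f
```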

A particular SMT suboption is valid if the effective -qarch option supports the specified SMT mode. The acceptable combinations of the -qarch and SMT tune options are listed in Table 1. The compiler ignores any invalid -qarch/-qtune SMT combination.

Although changing the -qtune setting may affect the performance of the resulting executable, it has no effect on whether the executable can be executed correctly on a particular hardware platform.

Acceptable combinations of -qarch and -qtune are shown in the following table.

Table 1. Acceptable -qarch/-qtune combinations

-qarch option  Default -qtune setting  Available -qtune CPU settings                Available -qtune SMT settings
-------------  ----------------------  -------------------------------------------  -----------------------------
ppc            balanced:balanced       auto | pwr5 | pwr6 | pwr7 | pwr8 | balanced  balanced | st
ppcgr          balanced:balanced       auto | pwr5 | pwr6 | pwr7 | pwr8 | balanced  balanced | st
ppc64          balanced:balanced       auto | pwr5 | pwr6 | pwr7 | pwr8 | balanced  balanced | st
ppc64gr        balanced:balanced       auto | pwr5 | pwr6 | pwr7 | pwr8 | balanced  balanced | st
ppc64grsq      balanced:balanced       auto | pwr5 | pwr6 | pwr7 | pwr8 | balanced  balanced | st
ppc64v         balanced:balanced       auto | pwr6 | pwr7 | pwr8 | balanced         balanced | st
pwr5           pwr5:st                 auto | pwr5 | pwr6 | pwr7 | pwr8 | balanced  balanced | st
pwr5x          pwr5:st                 auto | pwr5 | pwr6 | pwr7 | pwr8 | balanced  balanced | st | smt2
pwr6           pwr6:st                 auto | pwr6 | pwr7 | pwr8 | balanced         balanced | st | smt2
pwr6e          pwr6:st                 auto | pwr6 | balanced                       balanced | st
pwr7           pwr7:st                 auto | pwr7 | pwr8 | balanced                balanced | st | smt2 | smt4
pwr8           pwr8:st                 auto | pwr8 | balanced                       balanced | st | smt2 | smt4 | smt8
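
For build scripts, the default and SMT columns of Table 1 can be transcribed into a small lookup. The following is a minimal sketch only: the function names default_qtune and smt_allowed are hypothetical and not part of the compiler, and the values are copied from the table as it appears here.

```shell
# Hypothetical helpers transcribing Table 1; not part of XL Fortran.

# Print the default -qtune setting for a given -qarch value.
default_qtune() {
    case $1 in
        pwr5|pwr5x) echo pwr5:st ;;
        pwr6|pwr6e) echo pwr6:st ;;
        pwr7)       echo pwr7:st ;;
        pwr8)       echo pwr8:st ;;
        *)          echo balanced:balanced ;;  # ppc* rows, or no valid -qarch
    esac
}

# Succeed if the SMT suboption ($2) is listed for the -qarch value ($1).
smt_allowed() {
    case $1 in
        pwr8)       modes="balanced st smt2 smt4 smt8" ;;
        pwr7)       modes="balanced st smt2 smt4" ;;
        pwr5x|pwr6) modes="balanced st smt2" ;;
        ppc|ppcgr|ppc64|ppc64gr|ppc64grsq|ppc64v|pwr5|pwr6e)
                    modes="balanced st" ;;
        *)          return 1 ;;                # -qarch value not in Table 1
    esac
    case " $modes " in
        *" $2 "*) return 0 ;;
        *)        return 1 ;;
    esac
}

default_qtune pwr7
smt_allowed pwr7 smt8 || echo "smt8 is not listed for -qarch=pwr7"
```

Because the compiler ignores an invalid -qarch/-qtune SMT combination, a check of this kind can catch a mistyped or unsupported SMT mode before the build runs.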

Examples

To specify that the executable program testing compiled from myprogram.f is to be optimized for a POWER7 hardware platform, enter:
xlf -o testing myprogram.f -qtune=pwr7
To specify that the executable program testing compiled from myprogram.f is to be optimized for a POWER8 hardware platform configured for SMT4 mode, enter:
xlf -o testing myprogram.f -qtune=pwr8:smt4
