-qcache

Category

Optimization and tuning

@PROCESS

None.

Purpose

Specifies the cache configuration for a specific execution machine.

The compiler uses this information to tune program performance, especially for loop operations that can be structured (or blocked) to process only the amount of data that can fit into the data cache.
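
To make "the amount of data that can fit into the data cache" concrete, the following sketch works through the arithmetic for an assumed cache. The 8 KB size and 64-byte line match the first entry in the Examples section; the 8-byte element size is an assumption (for instance, double-precision reals), not something the compiler requires.

```shell
# Assumed values: 8 KB cache, 64-byte lines, 8-byte array elements.
size_kb=8
line_bytes=64
elem_bytes=8

total_bytes=$(( size_kb * 1024 ))             # cache capacity in bytes
elems=$(( total_bytes / elem_bytes ))         # elements that fit in the cache
lines=$(( total_bytes / line_bytes ))         # number of cache lines
elems_per_line=$(( line_bytes / elem_bytes )) # elements fetched per line

echo "elements that fit: $elems"              # prints "elements that fit: 1024"
echo "cache lines: $lines"                    # prints "cache lines: 128"
echo "elements per line: $elems_per_line"     # prints "elements per line: 8"
```

A blocked loop over such data would aim to touch no more than roughly this many elements per block.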

Syntax

                  .-:------------------------.   
                  V                          |   
>>- -q--cache--=----+-assoc--=--+-0------+-+-+-----------------><
                    |           +-1------+ |     
                    |           '-number-' |     
                    +-auto-----------------+     
                    +-cost--=--cycles------+     
                    +-level--=--+-1-+------+     
                    |           +-2-+      |     
                    |           '-3-'      |     
                    +-line--=--bytes-------+     
                    +-size--=--Kbytes------+     
                    '-type--=--+-C-+-------'     
                               +-c-+             
                               +-D-+             
                               +-d-+             
                               +-I-+             
                               '-i-'             

Defaults

Not applicable.

Parameters

assoc=number
Specifies the set associativity of the cache:
  • 0: Direct-mapped cache
  • 1: Fully associative cache
  • n > 1: n-way set-associative cache
auto
Automatically detects the cache configuration of the compiling machine and assumes that the execution environment is the same as the compilation environment.
cost=cycles
Specifies the performance penalty that results from a cache miss so that the compiler can decide whether to perform an optimization that might result in extra cache misses.
level=level
Specifies which level of cache is affected:
  • 1: Basic cache
  • 2: Level-2 cache, or the translation lookaside buffer (TLB) if the machine has no level-2 cache
  • 3: TLB in a machine that does have a level-2 cache
Other levels are possible but are currently undefined. If a system has more than one level of cache, use a separate -qcache option to describe each level.
line=bytes
Specifies the line size of the cache.
size=Kbytes
Specifies the total size of this cache.
type={C|c|D|d|I|i}
Specifies the type of cache that the settings apply to, as follows:
  • C or c for a combined data and instruction cache
  • D or d for the data cache
  • I or i for the instruction cache
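
As the syntax diagram shows, suboptions are joined with colons inside a single -qcache option. The sketch below assembles one such option from shell variables and prints the resulting command line rather than running it. Every value is hypothetical (in particular the cost of 20 cycles), and file.f is a placeholder source file.

```shell
# Hypothetical cache description; none of these values come from a real machine.
type=D       # data cache
level=2      # level-2 cache
size=512     # total size in KB
line=128     # line size in bytes
assoc=8      # 8-way set-associative
cost=20      # assumed cache-miss penalty in cycles

# Suboptions are separated by colons inside one -qcache option.
qcache="-qcache=type=${type}:level=${level}:size=${size}:line=${line}:assoc=${assoc}:cost=${cost}"

# Print the command instead of invoking the compiler.
echo "xlf95 -O3 -qhot ${qcache} file.f"
```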

Usage

If you know exactly what type of system a program will run on, and that system's instruction or data cache is configured differently from the default case (as governed by the -qtune setting), you can specify the exact characteristics of the cache. This lets the compiler compute the benefits of particular cache-related optimizations more precisely.

For the -qcache option to have any effect, you must include the level and type suboptions and specify the -qhot option or an option that implies -qhot.
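
The requirement above can be checked mechanically before a build. The helper below, qcache_is_effective, is a hypothetical shell function (not part of the compiler) that inspects a list of options and reports whether every -qcache option names both level and type and whether -qhot appears. It looks only for a literal -qhot, so options that imply -qhot are not recognized, and it does not treat -qcache=auto specially.

```shell
# Hypothetical helper, not a compiler feature: returns 0 if the options
# contain a literal -qhot and every -qcache option names level and type.
qcache_is_effective() {
  has_hot=no
  has_cache=no
  for opt in "$@"; do
    case $opt in
      -qhot|-qhot=*) has_hot=yes ;;
      -qcache=*)
        has_cache=yes
        # Wrap the suboptions in colons so each one can be matched whole.
        subopts=":${opt#-qcache=}:"
        case $subopts in *:level=*) ;; *) return 1 ;; esac
        case $subopts in *:type=*) ;; *) return 1 ;; esac
        ;;
    esac
  done
  [ "$has_hot" = yes ] && [ "$has_cache" = yes ]
}

# Complete description with -qhot: accepted.
if qcache_is_effective -O3 -qhot -qcache=type=c:level=1:size=8:line=64:assoc=2; then
  echo "cache description is complete"
fi

# Missing level, type, and -qhot: rejected.
if ! qcache_is_effective -O3 -qcache=size=8:line=64; then
  echo "cache description would have no effect"
fi
```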

  • If you know some but not all of the values, specify the ones you do know.
  • If a system has more than one level of cache, use a separate -qcache option to describe each level.
  • If you have limited time to spend experimenting with this option, it is more important to specify the characteristics of the data cache than of the instruction cache.
  • If you are not sure of the exact cache sizes of the target systems, use relatively small estimated values. It is better to have some cache memory that is not used than to have cache misses or page faults from specifying a cache that is larger than the target system has.
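
One way to apply the last point when a program will run on several machines is to describe the smallest cache among them. The sketch below picks the minimum from a hypothetical list of level-1 data cache sizes (in KB) and builds a -qcache option that contains only the values that are known, leaving line size and associativity unspecified.

```shell
# Hypothetical level-1 data cache sizes (KB) of the target systems.
target_sizes_kb="64 32 128"

# Choose the smallest size as the conservative estimate.
min_kb=""
for kb in $target_sizes_kb; do
  if [ -z "$min_kb" ] || [ "$kb" -lt "$min_kb" ]; then
    min_kb=$kb
  fi
done

# Only the known suboptions are specified.
qcache="-qcache=type=D:level=1:size=${min_kb}"
echo "$qcache"   # prints "-qcache=type=D:level=1:size=32"
```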

If you specify the wrong values for the cache configuration or run the program on a machine with a different configuration, the program might not run as fast as possible, but it still works correctly. If you are not sure of the exact values for cache sizes, use a conservative estimate.

Examples

To tune performance for a system with a combined instruction and data level-1 cache where the cache is two-way associative, 8 KB in size, and has 64-byte cache lines:
  xlf95 -O3 -qhot -qcache=type=c:level=1:size=8:line=64:assoc=2 file.f
To tune performance for a system with two levels of data cache, use two -qcache options:
  xlf95 -O3 -qhot -qcache=type=D:level=1:size=256:line=256:assoc=4 \
        -qcache=type=D:level=2:size=512:line=256:assoc=2 file.f
To tune performance for a system with two types of cache, again use two -qcache options:
  xlf95 -O3 -qhot -qcache=type=D:level=1:size=256:line=256:assoc=4 \
        -qcache=type=I:level=1:size=512:line=256:assoc=2 file.f

Related information