Optimizing at level 2

Benefits at level 2

Eliminates redundant code
Performs basic loop optimization
Structures code to take advantage of -mcpu and -mtune settings

After you have successfully compiled, executed, and debugged your application at -O0, recompiling at -O2 opens your application to a comprehensive set of low-level transformations that apply at subprogram or compilation unit scope and can include some inlining. Optimizations at -O2 balance increased performance against the impact on compilation time and system resources. You can increase the memory available to some of the -O2 optimizations by providing a larger value for the -qmaxmem option. Specifying -qmaxmem=-1 lets the optimizer use as much memory as it needs without checking for limits, but it does not change the transformations the optimizer applies to your application at -O2.

In C, compile with -qlibansi unless your application defines functions with names identical to those of library functions. If you encounter problems with -O2, consider using -qalias=noansi rather than turning off optimization.

Also, ensure that pointers in your C code follow these type restrictions:

Generic pointers can be char* or void*.
Mark all shared variables and pointers to shared variables volatile.
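
The pointer restrictions above can be illustrated with a short C sketch. It is not taken from the compiler documentation: the names byte_sum, float_bits, wait_for_flag, and ready_flag are hypothetical, and the example assumes that float and uint32_t have the same size. It shows a character pointer used as a generic pointer, memcpy used instead of casting a float* to an integer pointer type, and a shared variable marked volatile.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical flag shared with another execution context (for example
   a signal handler). volatile forces a fresh load on every read. */
volatile int ready_flag = 0;

/* Generic access: any object may be inspected through a character
   pointer, so this does not conflict with type-based aliasing. */
unsigned int byte_sum(const void *obj, size_t len)
{
    assert(obj != NULL || len == 0);
    const unsigned char *p = (const unsigned char *)obj;
    unsigned int sum = 0;
    for (size_t i = 0; i < len; i++)
        sum += p[i];
    return sum;
}

/* Reads the bit pattern of a float without dereferencing a uint32_t*
   that points at a float object, which the aliasing rules do not allow. */
uint32_t float_bits(float f)
{
    assert(sizeof(float) == sizeof(uint32_t)); /* assumption of this sketch */
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return bits;
}

/* Polls the shared flag at most max_polls times; returns 1 if it was set.
   Without volatile, an optimizer could legally read the flag only once. */
int wait_for_flag(int max_polls)
{
    for (int i = 0; i < max_polls; i++) {
        if (ready_flag)
            return 1;
    }
    return 0;
}
```

Code that instead wrote *(uint32_t *)&f would be the kind of construct that can misbehave at -O2 and that -qalias=noansi is meant to tolerate.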

Starting to tune at -O2

Choosing the right hardware architecture target or family of targets becomes even more important at -O2 and higher. When you target the proper hardware, the optimizer can make the best use of the hardware facilities available. If you choose a family of hardware targets, the -mtune option can direct the compiler to emit code that is consistent with the architecture choice and that runs best on the chosen tuning target. With this option, you can compile for a general set of targets and still have the code run best on one particular target.

For details on the -mcpu and -mtune options, see the Tuning for your system architecture section.

The -O2 option can perform the following additional optimizations:

Common subexpression elimination: Eliminates redundant instructions.
Constant propagation: Evaluates constant expressions at compile-time.
Dead code elimination: Eliminates instructions that a particular control flow does not reach, or that generate an unused result.
Dead store elimination: Eliminates unnecessary variable assignments.
Global register allocation: Globally assigns user variables to registers.
Value numbering: Simplifies algebraic expressions by eliminating redundant computations.
Instruction scheduling for the target machine.
Loop unrolling and software pipelining.
Moving loop-invariant code out of loops.
Simplifying control flow.
Strength reduction and effective use of addressing modes.
Widening, which merges adjacent loads and stores and other operations.
Pointer aliasing improvements to enhance other optimizations.
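
To make a few of the listed transformations concrete, the following C sketch pairs two straightforward functions with hand-rewritten versions that show roughly what moving loop-invariant code, strength reduction, common subexpression elimination, and dead store elimination could produce. All function names are hypothetical. The rewritten forms only illustrate the techniques at source level; the code the compiler actually generates at -O2 depends on the target and is not shown here.

```c
#include <assert.h>
#include <stddef.h>

/* As written: scale * offset is recomputed on every iteration, and the
   index expression i * stride implies one multiply per element. */
long sum_naive(const long *data, size_t n, size_t stride,
               long scale, long offset)
{
    assert(data != NULL || n == 0);
    long total = 0;
    for (size_t i = 0; i < n; i++) {
        long bias = scale * offset;           /* loop invariant */
        total += data[i * stride] * scale + bias;
    }
    return total;
}

/* Hand-transformed equivalent of sum_naive. */
long sum_transformed(const long *data, size_t n, size_t stride,
                     long scale, long offset)
{
    assert(data != NULL || n == 0);
    const long bias = scale * offset;         /* moved out of the loop */
    size_t idx = 0;                           /* strength reduction: an add
                                                 replaces i * stride */
    long total = 0;
    for (size_t i = 0; i < n; i++) {
        total += data[idx] * scale + bias;
        idx += stride;
    }
    return total;
}

/* As written: w + h is computed twice and the first store to result
   is never read. */
int metric_naive(int w, int h)
{
    int result = 0;                           /* dead store */
    int perimeter = 2 * (w + h);
    int half = w + h;                         /* common subexpression */
    result = perimeter * half;
    return result;
}

/* Hand-transformed equivalent of metric_naive. */
int metric_transformed(int w, int h)
{
    const int s = w + h;                      /* computed once */
    return 2 * s * s;
}
```

Each pair returns the same value for the same arguments, which is the property an optimizer has to preserve when it applies these transformations.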

Even with -O2 optimizations, some useful information about your source code is made available to the debugger if you specify -g. Using a higher -g level increases the information provided to the debugger but reduces the optimization that can be done. Conversely, higher optimization levels can transform code to the extent that debugging information is no longer accurate.
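
As an illustration of why debugging information can become inaccurate under optimization, consider the C sketch below. The function average and its local variables are hypothetical. At -O0 each local typically has its own storage location that a debugger can inspect. At -O2 the optimizer may keep sum only in a register or fold count_as_double and result into the final division, so a debugger might report these variables as unavailable or show stale values. What you actually see depends on the compiler version and the -g level, so treat the comments as possibilities rather than guarantees.

```c
#include <assert.h>
#include <stddef.h>

/* Returns the arithmetic mean of n values, or 0.0 for an empty input. */
double average(const double *values, size_t n)
{
    assert(values != NULL || n == 0);
    if (n == 0)
        return 0.0;

    double sum = 0.0;                     /* may live only in a register */
    for (size_t i = 0; i < n; i++)
        sum += values[i];

    double count_as_double = (double)n;   /* may be folded into the divide */
    double result = sum / count_as_double;
    return result;                        /* may never be stored to memory */
}
```

Compiling the same file once with -O0 -g and once with -O2 -g, then stepping through average in a debugger, is one way to observe the difference on your own system.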
