Getting the most out of -qhot
Here are some suggestions for using -qhot:
- Try using -qhot along with -O3 for all of your code. It is designed to have a neutral effect when no opportunities for transformation exist. However, it increases compilation time and might have little benefit if the program has no loops that process vectors or arrays. In that case, using -O3 -qnohot might be better.
- If the runtime performance of your code can significantly benefit from automatic inlining and memory locality optimizations, try using -O4 with -qhot=level=0 or -qhot=novector.
- If compilation time is unacceptably long (which can happen with complex loop nests), try -qhot=level=0 or -qnohot.
- If your code size is unacceptably large, try reducing the inlining level or using -qcompact along with -qhot.
- You can compile some source files with the -qhot option and some files without the -qhot option, allowing the compiler to improve only the parts of your code that need optimization.
- Use -qreport along with -qsimd=auto to generate a loop transformation listing. The listing file identifies how loops are transformed in a section marked LOOP TRANSFORMATION SECTION. Use the listing information as feedback about how the loops in your program are being transformed. Based on this information, you might want to adjust your code so that the compiler can transform loops more effectively. For example, you can use this section of the listing to identify non-stride-one references that might prevent loop vectorization.
- Use -qreport along with -qhot or any optimization option that implies -qhot to generate information about nested loops in the LOOP TRANSFORMATION SECTION of the listing file. In addition, when you use -qprefetch=assistthread to generate prefetching assist threads, the message "Assist thread for data prefetching was generated" is also displayed in this section of the report.
- If you specify -qassert=refalign, you promise the compiler that all pointers inside the compilation unit only point to data that is naturally aligned with respect to the length of the pointer types. With this assertion, the compiler might generate more efficient code. This assertion is particularly useful when you target a SIMD architecture with -qhot=level=0 or -qhot=level=1 together with the -qsimd=auto option.
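
To make the remark about non-stride-one references concrete, the following is a minimal C sketch, not taken from the compiler documentation. The function names, the array size N, and the choice of a summation loop are all illustrative assumptions. Both functions compute the same result, but the first walks a row-major array column by column, so consecutive inner-loop accesses are N elements apart, while the second walks it row by row with stride one. Whether the compiler interchanges or vectorizes either loop on its own depends on the compiler level and options; the LOOP TRANSFORMATION SECTION of the listing is the place to check.

```c
#include <assert.h>
#include <stddef.h>

#define N 64  /* illustrative size, chosen only for this sketch */

/* Non-stride-one: the inner loop varies the row index i, so each
   access to a[i][j] is N doubles away from the previous one. */
double sum_column_order(double a[N][N])
{
    assert(a != NULL);
    double total = 0.0;
    for (size_t j = 0; j < N; j++) {
        for (size_t i = 0; i < N; i++) {
            total += a[i][j];
        }
    }
    return total;
}

/* Stride-one: the inner loop varies the column index j, so accesses
   to a[i][j] touch adjacent doubles in memory. */
double sum_row_order(double a[N][N])
{
    assert(a != NULL);
    double total = 0.0;
    for (size_t i = 0; i < N; i++) {
        for (size_t j = 0; j < N; j++) {
            total += a[i][j];
        }
    }
    return total;
}
```

Compiling a file that contains both functions with -O3 -qhot -qsimd=auto -qreport and comparing how each loop nest appears in the listing is one way to see the feedback described above. Rewriting a hot loop from the first form to the second is the kind of source adjustment the listing can suggest.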