-qpdf1, -qpdf2

Category

Optimization and tuning

Pragma equivalent

None.

Purpose

Tunes optimizations through profile-directed feedback (PDF), where results from sample program execution are used to improve optimization near conditional branches and in frequently executed code sections.

PDF is a two-step process. You first compile and link the application with -qpdf1 and an optimization level of at least -O2. You then run the resulting application with a typical data set. During the test run, profile data is written to a profile file. By default, the profile file is named ._pdf and is saved in the current working directory, or in the directory named by the PDFDIR environment variable, if it is set. You then recompile and link (or relink) the application with -qpdf2 and the same optimization level that you used with -qpdf1. This second compilation fine-tunes the optimizations according to the profile data collected during program execution.

You can reuse old profiling information. In previous releases, if you modified the source file or compiler options and then compiled with -qpdf2, the compilation stopped with an error. As of IBM XL C/C++ for AIX, V11.1, the compiler issues a list of warnings but compilation does not stop. However, if you use different compiler options in the two PDF stages, you do not gain any benefit from PDF.
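
Because mismatched options between the two stages produce only warnings, it can help to define the shared flags in one place. The following is a minimal sketch, not a script shipped with the compiler. The names pdf_cmd, PDF_FLAGS, and PDF_SOURCES, and the source file names, are hypothetical. The function only prints the command line for each stage, so it can be inspected before anything is compiled.

```shell
#!/bin/sh
# Hypothetical helper: builds the command line for one PDF stage so that
# both stages share exactly the same optimization flags.
PDF_FLAGS="-O3"                        # assumed shared flags; PDF needs at least -O2
PDF_SOURCES="file1.c file2.c file3.c"  # placeholder source files

pdf_cmd() {
    # $1 is the stage: 1 (instrument) or 2 (optimize using the profile).
    case "$1" in
        1|2) echo "xlc -qpdf$1 $PDF_FLAGS $PDF_SOURCES" ;;
        *)   echo "usage: pdf_cmd 1|2" >&2; return 1 ;;
    esac
}

pdf_cmd 1   # prints the stage 1 command line
pdf_cmd 2   # prints the stage 2 command line; only the -qpdf suboption differs
```

To execute a stage instead of printing it, the output could be passed to eval, for example eval "$(pdf_cmd 1)".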

PDF is intended to be used after other debugging and tuning is finished, as one of the last steps before putting the application into production.

Syntax

        .-nopdf2-----------------------------.   
        +-nopdf1-----------------------------+   
>>- -q--+-pdf1--+--------------------------+-+-----------------><
        |       +-=--pdfname--=--file_path-+ |   
        |       +-=--exename---------------+ |   
        |       +-=--defname---------------+ |   
        |       '-=--level--=--0--1--2-----' |   
        '-pdf2--+--------------------------+-'   
                +-=--pdfname--=--file_path-+     
                +-=--exename---------------+     
                +-=--defname---------------+     
                '-=--level--=--0--1--2-----'     

Defaults

-qnopdf1, -qnopdf2

Parameters

defname
Reverts the PDF file to its default file name.
exename
Generates the name of the PDF file based on what you specify with the -o option. For example, you can use -qpdf1=exename -o foo foo.c to generate a PDF file called .foo_pdf.
level=0 | 1 | 2
Selects the level of profiling instrumentation, which supports multiple-pass profiling, cache-miss profiling, block-counter profiling, call-counter profiling, and extended value profiling. You can compile your application with -qpdf1=level=0|1|2 to generate profiling data with different levels of instrumentation. Note that -qpdf1=level=0 and -qpdf1=level=1 support single-pass profiling, whereas -qpdf1=level=2 supports multiple-pass profiling. The levels are described as follows:
  • 0 is the basic compiler instrumentation that generates lower overhead than -qpdf1=level=1.
  • 1 is the default compiler instrumentation that is equivalent to -qpdf1 in previous releases.
  • 2 is a more aggressive compiler instrumentation. Cache-miss profiling is enabled on AIX® in addition to basic block counter and value profiling performed at -qpdf1=level=1. This suboption is supported at all optimization levels where PDF is enabled. You can use -qpdf1=level=2 to aggressively gather profile information, execute applications with typical input data, and use -qpdf2 to optimize the executable.
    Note:
    • Cache-miss profiling (and future performance counter profiling) is supported at -qpdf1=level=2 only.
    • You can set the environment variable PDF_PM_EVENT to gather cache-miss profiling information for different cache levels.
    • You can set the environment variable PDF_BIND_PROCESSOR to bind your application to the specified processor for cache-miss profiling. Cache-miss profiling information is only available on the POWER5™, POWER6™, and POWER7 processors.
    • These features are to be used with link-time PDF and do not apply to compile-time PDF (that is, -qpdf2 -qnoipa).
pdfname= file_path
Specifies the path to the file that will hold the profile data. By default, the file name is ._pdf, and it is placed in the current working directory or in the directory named by the PDFDIR environment variable. The pdfname suboption lets you run multiple executables simultaneously using the same PDF directory. This is especially useful when tuning with PDF on dynamic libraries.
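
The naming rules for the defname, exename, and pdfname suboptions can be summarized in a small shell function. This is a sketch of the documented behavior, not a tool shipped with the compiler. The function name pdf_profile_path is hypothetical, and the sketch assumes that a file named through exename is placed in the same directory as the default file.

```shell
#!/bin/sh
# Sketch of the profile file naming rules described for the suboptions.
pdf_profile_path() {
    # $1: suboption kind: defname | exename | pdfname
    # $2: executable name given to -o (exename) or file path (pdfname)
    _pdf_dir="${PDFDIR:-$PWD}"   # PDFDIR if set, else current directory
    case "$1" in
        defname) echo "$_pdf_dir/._pdf" ;;
        exename) echo "$_pdf_dir/.${2}_pdf" ;;  # assumption: same directory as default
        pdfname) echo "$2" ;;                   # path is used as given
        *)       echo "unknown suboption: $1" >&2; return 1 ;;
    esac
}

pdf_profile_path defname       # default profile file
pdf_profile_path exename foo   # profile file for -qpdf1=exename -o foo
```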

Usage

You must compile the main program with PDF for profiling information to be collected at run time.

If you do not want the optimized object files to be relinked during the second step, specify -qpdf2 -qnoipa.

If you want to specify an alternate path and file name for the profile file, use the pdfname suboption. Alternatively, you can use the PDFDIR environment variable to specify the absolute path name for the directory. To derive the PDF file name from the name you specify with the -o option, use the exename suboption. To revert the PDF file to its default name, use the defname suboption. Do not compile or run two different applications that use the same profiling directory at the same time, unless you have used the pdfname suboption to distinguish the sets of profiling information. For examples, see Optimizing your applications.

When you compile with -qprefetch=assistthread to generate data prefetching assist threads, the compiler uses delinquent load information to perform its analysis and generate the threads. The delinquent load information can be gathered from dynamic profiling using -qpdf1=level=2. For more information, see -qprefetch.

You can also use the following option with -qpdf1:
-qshowpdf
Provides additional information, such as block and function call counts, to the profile file. See -qshowpdf for more information.

For recommended procedures of using PDF, see Using profile-directed feedback.

The following utility programs, found in /usr/vacpp/bin/, are available for managing the directory to which profile data is written:
cleanpdf
>>-cleanpdf--+----------------+--------------------------------><
             '-directory_path-'   

Removes all profiling information from the directory specified by directory_path; or if directory_path is not specified, from the directory set by the PDFDIR environment variable; or if PDFDIR is not set, from the current directory. Removing profiling information reduces runtime overhead if you change the program and then go through the PDF process again.

Run cleanpdf only when you are finished with the PDF process for a particular application. Otherwise, if you want to resume using PDF with that application, you will need to recompile all of the files again with -qpdf1.
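
Because cleanpdf discards data that is costly to regenerate, it can be worth confirming which directory it will act on. The lookup order described above can be sketched as follows. The function name pdf_target_dir is hypothetical, and the function only reports the directory; it does not invoke cleanpdf.

```shell
#!/bin/sh
# Sketch of the directory lookup order documented for cleanpdf:
# explicit argument first, then PDFDIR, then the current directory.
pdf_target_dir() {
    if [ -n "${1:-}" ]; then
        echo "$1"
    elif [ -n "${PDFDIR:-}" ]; then
        echo "$PDFDIR"
    else
        pwd
    fi
}

# Report the directory that cleanpdf would act on when run without arguments.
pdf_target_dir
```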

mergepdf
             .-------------------------.                                  
             V                         |                                  
>>-mergepdf----+--------------+--input-+-- -o--output--+-----+--+-----+-><
               '- -r--scaling-'                        '- -n-'  '- -v-'   

Merges two or more PDF records into a single PDF output record.

-r scaling
Specifies the scaling ratio for the PDF record file. This value must be greater than zero and can be either an integer or floating point value. If not specified, a ratio of 1.0 is assumed.
input
Specifies the name of a PDF input record file, or a directory that contains PDF record files.
-o output
Specifies the name of the PDF output record file, or a directory to which the merged output will be written.
-n
If specified, PDF record files are not normalized. If not specified, mergepdf normalizes records based on an internally-calculated ratio before applying any user-defined scaling factor.
-v
Specifies verbose mode, and causes internal and user-specified scaling ratios to be displayed to standard output.
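
As an illustration of the syntax above, the following sketch assembles a mergepdf command line from ratio:input pairs and rejects scaling ratios that are not greater than zero. The wrapper name build_mergepdf and the record names run1, run2, run3, and merged are hypothetical. The function prints the command instead of running it.

```shell
#!/bin/sh
# Hypothetical wrapper: builds a mergepdf command from "ratio:input" pairs.
build_mergepdf() {
    # $1: output record or directory; remaining arguments: ratio:input pairs
    _out="$1"; shift
    _cmd="mergepdf"
    for _pair in "$@"; do
        _ratio="${_pair%%:*}"   # text before the first colon
        _input="${_pair#*:}"    # text after the first colon
        # The scaling ratio must be greater than zero (integer or floating point).
        if ! awk -v r="$_ratio" 'BEGIN { exit !(r + 0 > 0) }'; then
            echo "invalid scaling ratio: $_ratio" >&2
            return 1
        fi
        _cmd="$_cmd -r $_ratio $_input"
    done
    echo "$_cmd -o $_out"
}

# Weight three profile records 53:32:15 and merge them into one output record.
build_mergepdf merged 53:run1 32:run2 15:run3
```

Add -v to the generated command to see the internal and user-specified scaling ratios that mergepdf applies.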
resetpdf
>>-resetpdf--+----------------+--------------------------------><
             '-directory_path-'   

Same as cleanpdf, described above.

showpdf
>>-showpdf--+----------------+--+-----+--+-----------+---------><
            '-directory_path-'  '- -f-'  '-file_path-'   

Displays the function call and block counts that were written to the profile file during a program run. Use the -f option to specify the profile file. To use this command, you must first compile your application with both the -qpdf1 and -qshowpdf compiler options.

Predefined macros

None.

Examples

Here is a simple example:
// Compile all files with -qpdf1.
xlc -qpdf1 -O3 file1.c file2.c file3.c

// Run with one set of input data.             
./a.out < sample.data 

// Recompile all files with -qpdf2.
xlc -qpdf2 -O3 file1.c file2.c file3.c

// The program should now run faster than 
// without PDF if the sample data is typical.   

Here is a more elaborate example:
// Set the PDFDIR variable.                     
export PDFDIR=$HOME/project_dir

// Compile most of the files with -qpdf1.       
xlc -qpdf1 -O3 -c file1.c file2.c file3.c

// This file is not so important to optimize.
xlc -c file4.c

// Non-PDF object files such as file4.o can be linked in.  
xlc -qpdf1 -O3 file1.o file2.o file3.o file4.o

// Run several times with different input data.           
./a.out < polar_orbit.data
./a.out < elliptical_orbit.data
./a.out < geosynchronous_orbit.data

// No need to recompile the source of non-PDF object files (file4.c).
xlc -qpdf2 -O3 -c file1.c file2.c file3.c

// Link all the object files into the final application.
xlc -qpdf2 -O3 file1.o file2.o file3.o file4.o

Here is an example that bypasses recompiling the source with -qpdf2:

// Compile source with -qpdf1.
xlc -O3 -qpdf1 -c file.c

// Link in object file.
xlc -O3 -qpdf1 file.o

// Run with one set of input data.
./a.out < sample.data

// Link in object file from qpdf1 pass.
// (Bypass source recompilation with -qpdf2.)
xlc -O3 -qpdf2 file.o

Here is an example of using pdf1 and pdf2 objects:
// Compile source with -qpdf1.
xlc -c -qpdf1 -O3 file1.c file2.c

// Link in object files.
xlc -qpdf1 -O3 file1.o file2.o

// Run with one set of input data.
./a.out < sample.data

// Recompile one source file with -qpdf2 to produce a pdf2 object.
xlc -c -qpdf2 -O3 file2.c

// Link in the mix of pdf1 and pdf2 objects.
xlc -qpdf2 -O3 file1.o file2.o

Here is an example that creates PDF-optimized object files without relinking into an executable:
// Compile source with -qpdf1.
xlc -c -O3 -qpdf1 file1.c file2.c file3.c  

// Link in object files.
xlc -O3 -qpdf1 file1.o file2.o file3.o   

// Run with one set of input data.
./a.out < sample.data

// Recompile the instrumented source files 
xlc -c -O3 -qpdf2 -qnoipa file1.c file2.c file3.c    

Here is an example that reduces possible runtime instrumentation overhead:

//Compile all files with -qpdf1=level=0.
xlc -qpdf1=level=0 -O3 file1.c file2.c file3.c

//Run with one set of input data.
./a.out < sample.data 

//Recompile all files with -qpdf2.
xlc -qpdf2 -O3 file1.c file2.c file3.c

//The program should now run faster than
//without PDF if the sample data is typical.

Here is an example that gathers cache miss profiling information (including block counter profiling and value profiling information):

//Compile all files with -qpdf1=level=2.
xlc -qpdf1=level=2 -O5 file1.c file2.c file3.c

//Set PDF_PM_EVENT=L2MISS to gather L2 cache misses.
//Run with one set of input data.

export PDF_PM_EVENT=L2MISS

./a.out < sample.data 

//Recompile all files with -qpdf2.
xlc -qpdf2 -O5 file1.c file2.c file3.c

//The program should now run faster than
//without PDF if the sample data is typical.

Here is an example of multiple pass profiling:

//Compile all files with -qpdf1=level=2.
xlc -qpdf1=level=2 -O5 file1.c file2.c file3.c

//Run with one set of input data. The profiling information is recorded
//in ._pdf by default.
./a.out < sample.data 

//Recompile all files with -qpdf1=level=2 again.
//The compiler will read the previous profile data,
//refine instrumentation and generate a new instrumented executable.

xlc -qpdf1=level=2 -O5 file1.c file2.c file3.c

//Run it again. The profiling information is recorded in
//._pdf.1

./a.out < sample.data

//Recompile all files with -qpdf2

xlc -qpdf2 -O5 file1.c file2.c file3.c

//The program should now run faster than
//without PDF if the sample data is typical.

Here is an example that uses -qpdf[1|2]=exename:

//Compile all files with -qpdf1=exename.
xlc -qpdf1=exename -O5 -o final file1.c file2.c file3.c

//Run executable with sample input data.

./final < typical.data 

//List the content of the directory.
ls -lrta

 -rw-r--r-- 1 user staff 50 Dec 05 13:18 file1.c
 -rw-r--r-- 1 user staff 50 Dec 05 13:18 file2.c
 -rw-r--r-- 1 user staff 50 Dec 05 13:18 file3.c
 -rwxr-xr-x 1 user staff 12243 Dec 05 17:00 final
 -rwxr-Sr-- 1 user staff 762 Dec 05 17:03 .final_pdf

//Recompile all files with -qpdf2=exename.

xlc -qpdf2=exename -O5 -o final file1.c file2.c file3.c

//The program is now optimized using PDF information.
 

Related information