-qpdf1, -qpdf2

Pragma equivalent

None.

Purpose

Tunes optimizations through profile-directed feedback (PDF), where results from sample program execution are used to improve optimization near conditional branches and in frequently executed code sections.

Optimizes an application for a typical usage scenario based on an analysis of how often branches are taken and blocks of code are run.

Syntax

        .-nopdf2-----------------------------.   
        +-nopdf1-----------------------------+   
>>- -q--+-pdf1--+--------------------------+-+-----------------><
        |       +-=--pdfname--=--file_path-+ |   
        |       +-=--unique----------------+ |   
        |       +-=--nounique--------------+ |   
        |       +-=--exename---------------+ |   
        |       +-=--defname---------------+ |   
        |       '-=--level--=--+-0-+-------' |   
        |                      +-1-+         |   
        |                      '-2-'         |   
        '-pdf2--+--------------------------+-'   
                +-=--pdfname--=--file_path-+     
                +-=--exename---------------+     
                '-=--defname---------------'     

Defaults

-qnopdf1, -qnopdf2

Parameters

defname
Reverts a PDF file to its default file name.
exename
Specifies the name of the generated PDF file according to the output file name specified by the -o option. For example, you can use -qpdf1=exename -o func func.c to generate a PDF file called .func_pdf.
level=0 | 1 | 2
Specifies different levels of profiling information to be generated by the resulting application. The following table shows the type of profiling information supported on each level. The plus sign (+) indicates that the profiling type is supported.
Table 1. Profiling type supported on each -qpdf1 level

Profiling type             Level 0   Level 1   Level 2
Block-counter profiling       +         +         +
Call-counter profiling        +         +         +
Single-pass profiling         +         +
Value profiling                         +         +
Multiple-pass profiling                           +
Cache-miss profiling                              +

-qpdf1=level=1 is the default level. It is equivalent to -qpdf1. Higher PDF levels profile more optimization opportunities but have a larger overhead.

Notes:
  • Only one application compiled with the -qpdf1=level=2 option can be run at a time on a particular computer.
  • Cache-miss profiling information has several levels. If you want to gather different levels of cache-miss profiling information, set the PDF_PM_EVENT environment variable to L1MISS, L2MISS, or L3MISS (if applicable) accordingly. Only one level of cache-miss profiling information can be instrumented at a time. L2 cache-miss is the default level.
  • If you want to bind your application to the specified processor for cache-miss profiling, set the PDF_BIND_PROCESSOR environment variable. Processor 0 is set by default.
pdfname= file_path
Specifies the directories and names for the PDF files and any existing PDF map files. By default, if the PDFDIR environment variable is set, the compiler places the PDF and PDF map files in the directory specified by PDFDIR. If PDFDIR is not set, the compiler places these files in the current working directory. If PDFDIR is set but the specified directory does not exist, the compiler issues a warning message. If the -qpdf1=unique option is not specified, the name of the PDF map file is derived from the name of the PDF file. For example, if you specify the -qpdf1=pdfname=/home/joe/func option, the generated PDF file is called func, and the PDF map file is called func_map. Both files are placed in the /home/joe directory. You can use the pdfname suboption to run multiple executable applications simultaneously in the same directory. It is especially useful when you tune dynamic libraries with the PDF process.
unique | nounique
Specifies whether a unique PDF file is created for each process during run time. You can use the -qpdf1=unique option to avoid locking a single PDF file when multiple processes write to the same PDF file in the PDF training step. The PDF file name is <pdf_file_name>.<pid>. <pdf_file_name> is ._pdf by default, or the name specified by other -qpdf1 suboptions, which include pdfname, exename, and defname. <pid> is the ID of the running process in the PDF training step. For example, if you specify the -qpdf1=unique:pdfname=abc option, and there are two processes for PDF training with the IDs 12345678 and 87654321, two PDF files, abc.12345678 and abc.87654321, are generated.
Note:
  • When -qpdf1=unique is specified, only one PDF map file is generated. The default name of the PDF map file is ._pdf_map.
  • When -qpdf1=unique is specified, multiple PDF files with process IDs as suffixes are generated. You must use the mergepdf program to merge all these PDF files into one after the PDF training step.
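
The naming rule for unique PDF files and the follow-up merge can be sketched in shell. This is a minimal illustration, not part of the compiler: the function names unique_pdf_name and merge_command are hypothetical, and the second function only prints a mergepdf command line rather than running it. Writing the merged result to the base name (abc here) is an assumption made for the example.

```shell
# Hypothetical helper: builds the per-process PDF file name <pdf_file_name>.<pid>.
# $1: PDF file name (falls back to the default ._pdf when empty)
# $2: process ID of the training run
unique_pdf_name() {
    base=${1:-._pdf}
    pid=$2
    printf '%s.%s\n' "$base" "$pid"
}

# Hypothetical helper: prints (does not run) a mergepdf command that
# combines the listed per-process files into one output file.
# Usage: merge_command OUTPUT INPUT...
merge_command() {
    output=$1
    shift
    printf 'mergepdf %s -o %s\n' "$*" "$output"
}

# Names from the -qpdf1=unique:pdfname=abc example above
unique_pdf_name abc 12345678
merge_command abc abc.12345678 abc.87654321
```

The first call prints abc.12345678, matching the example in the unique suboption description. The second prints a command line that follows the mergepdf syntax given later in this topic.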

Usage

The PDF process consists of the following three steps:
  1. Compile your program with the -qpdf1 option and a minimum optimization level of -O2. A PDF map file named ._pdf_map by default and a resulting application are generated.
  2. Run the resulting application with a typical data set. Profiling information is written to a PDF file named ._pdf by default. This step is called the PDF training step.
  3. Recompile and link or relink the program with the -qpdf2 option and the optimization level used for the -qpdf1 option. The -qpdf2 process fine-tunes the optimizations according to the profiling information collected when the resulting application is run.
Notes:
  • The showpdf utility uses the PDF map file to display part of the profiling information in text or XML format. For details, see Viewing profiling information with showpdf. If you do not need to view the profiling information, specify the -qnoshowpdf option during the -qpdf1 phase so that the PDF map file is not generated. For details of -qnoshowpdf, see -qshowpdf.
  • When option -O4, -O5, or any level of option -qipa is in effect, and you specify the -qpdf1 or -qpdf2 option at the link step but not at the compile step, the compiler issues a warning message. The message indicates that you must recompile your program to get all the profiling information.
  • When the -qpdf1=pdfname option is used during the -qpdf1 phase, you must use the -qpdf2=pdfname option during the -qpdf2 phase for the compiler to recognize the correct PDF file. This rule also applies to the -qpdf[1|2]=exename option.

The compiler issues an information message with a number in the range of 0 - 100 during the -qpdf2 phase. If you have not changed your program between the -qpdf1 and -qpdf2 phases, the number is 100, which means that all the profiling information can be used to optimize the program. If the number is 0, it means that the profiling information is completely outdated, and the compiler cannot take advantage of any information. When the number is less than 100, you can choose to recompile your program with the -qpdf1 option and regenerate the profiling information.
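
As a rough aid for build scripts, the interpretation of that number can be written as a small shell function. This is a sketch under assumptions: the function name pdf_relevance is hypothetical, and the number is passed in by hand because the exact wording of the compiler message is not given here.

```shell
# Hypothetical helper: classifies the 0 - 100 number reported during -qpdf2.
# $1: the number taken from the compiler's information message
pdf_relevance() {
    n=${1:-}
    # Reject empty or non-numeric input
    case $n in
        ''|*[!0-9]*) echo "invalid"; return 1 ;;
    esac
    if [ "$n" -gt 100 ]; then
        echo "invalid"
        return 1
    elif [ "$n" -eq 100 ]; then
        echo "current: all profiling information can be used"
    elif [ "$n" -eq 0 ]; then
        echo "outdated: no profiling information can be used"
    else
        echo "partial: consider recompiling with -qpdf1 and retraining"
    fi
}

pdf_relevance 100
pdf_relevance 42
```

Whether a partial match justifies a new -qpdf1 build and training run is a judgment call; the sketch only labels the three cases described above.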

Single-pass profiling

Single-pass profiling is supported on levels 0 and 1 of the -qpdf1 phase. If you recompile your program with either the -qpdf1=level=0 or the -qpdf1=level=1 option, the compiler removes the existing PDF file and any existing PDF map file before generating a new application.

Multiple-pass profiling

Multiple-pass profiling is supported on level 2 of the -qpdf1 phase. After you compile a program with the -qpdf1=level=2 option and train the resulting application, you can recompile the program with the -qpdf1=level=2 option. The profiling information gathered previously is used to guide further instrumentation. When you train the resulting application again, the profiling information is written to a new PDF file named ._pdf.1 by default. If you repeat this compiling and PDF training several times, new PDF files are generated for up to five passes (._pdf.1 to ._pdf.5). If the compiler detects that all the PDF file names have been used, it issues a warning message and overwrites the last PDF file, ._pdf.5. If the compiler cannot read any PDF files when compiling a program with the -qpdf1=level=2 option, it issues a warning message to indicate that PDF files are not found. You can get initial profiling information by using the -qpdf1=level=0 or -qpdf1=level=1 option, and then use the -qpdf1=level=2 option for more profiling information.
Notes:
  • If you have not specified the -qnoshowpdf option, PDF map files that correspond to the PDF files are also generated, with the default names ._pdf_map, ._pdf.1_map, and so on up to ._pdf.5_map.
  • If you use the -qpdf2=pdfname option to specify a PDF file, specify a file name that does not end with a numeric suffix from .1 to .5. Otherwise, the compiler looks for the wrong files. For example, if you specify the -qpdf2=pdfname=func.2 option during the -qpdf2 phase, the compiler looks for PDF files named func.2, func.2.1, func.2.2, func.2.3, and so on, which might not exist. If you specify the -qpdf2=pdfname=func option without the numeric suffix, the compiler looks for func, func.1, func.2, func.3, and so on.
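
The naming convention for multiple-pass PDF files, and the suffix check recommended in the note above, can be sketched in shell. The function names multipass_pdf_names and has_pass_suffix are hypothetical. The sketch only lists the names that follow from the convention in this section (the base name plus .1 to .5); it makes no claim about which files the compiler actually opens.

```shell
# Hypothetical helper: lists the base PDF file name followed by the
# multiple-pass names <base>.1 through <base>.5, one per line.
# $1: PDF file name (falls back to the default ._pdf when empty)
multipass_pdf_names() {
    base=${1:-._pdf}
    echo "$base"
    i=1
    while [ "$i" -le 5 ]; do
        echo "$base.$i"
        i=$((i + 1))
    done
}

# Hypothetical helper: succeeds when a name ends with a suffix from .1 to .5,
# which the note above advises against for -qpdf2=pdfname.
has_pass_suffix() {
    case ${1:-} in
        *.[1-5]) return 0 ;;
        *) return 1 ;;
    esac
}

multipass_pdf_names func
if has_pass_suffix func.2; then
    echo "func.2 ends with a multiple-pass suffix; choose another pdfname"
fi
```

A build script could call has_pass_suffix on the chosen pdfname before the -qpdf2 phase and stop early if the name would be ambiguous.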

Other related options

You can use the following options with the -qpdf1 option:
-qprefetch
When you specify the -qprefetch=assistthread option to generate data prefetching assist threads, the compiler uses delinquent load information to perform its analysis and generate the threads. The delinquent load information can be gathered from dynamic profiling by using the -qpdf1=level=2 option. For more information, see -qprefetch.
-qshowpdf
Provides additional information to the profile file. See -qshowpdf for more information.

For recommended procedures of using PDF, see Using profile-directed feedback.

The following utility programs, found in /opt/ibm/xlC/13.1.1/bin/, are available for managing the directory to which profiling information is written:
cleanpdf
>>-cleanpdf--+--------+--+-----+--+--------------+-------------><
             '-pdfdir-'  '- -u-'  '- -f--pdfname-'   

Removes all PDF files or the specified PDF files, including PDF files with process ID suffixes. Removing profiling information reduces runtime overhead if you change the program and then go through the PDF process again.

pdfdir
Specifies the directory that contains the PDF files to be removed. If pdfdir is not specified, the directory is set by the PDFDIR environment variable; if PDFDIR is not set, the directory is the current directory.
-f pdfname
Specifies the name of the PDF file to be removed. When specified, files with the naming convention pdfname.<multiple_pass_profiling_times>, if applicable, are also removed. <multiple_pass_profiling_times> is a numeric suffix from 1 to 5.

If -f pdfname is not specified, ._pdf and files with the naming convention ._pdf.<multiple_pass_profiling_times>, if applicable, are removed.

-u
Removes the PDF file that is specified by pdfname and files with the following naming convention when applicable:
  • pdfname.<pid>, where <pid> is the ID of the running process in the PDF training step
  • pdfname.<multiple_pass_profiling_times>.<pid>
If -f pdfname is not specified, removes ._pdf and files with the following naming convention when applicable:
  • ._pdf.<pid>
  • ._pdf.<multiple_pass_profiling_times>.<pid>

Run cleanpdf only when you have finished the PDF process for a particular application. If you run it earlier and then want to resume the PDF process with that application, you must compile all of the files again with -qpdf1.
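
The directory that cleanpdf works on when pdfdir is omitted follows the lookup order described for the pdfdir parameter. A minimal shell sketch of that order is shown here; the function name resolve_pdf_dir is hypothetical and is not part of the cleanpdf utility.

```shell
# Hypothetical helper: mirrors the lookup order described for pdfdir.
# 1. the pdfdir argument, if given
# 2. the PDFDIR environment variable, if set
# 3. the current directory
resolve_pdf_dir() {
    if [ -n "${1:-}" ]; then
        echo "$1"
    elif [ -n "${PDFDIR:-}" ]; then
        echo "$PDFDIR"
    else
        pwd
    fi
}

# Show which directory a cleanpdf call without pdfdir would target
resolve_pdf_dir
```

Printing the resolved directory before running cleanpdf can help avoid removing profiling data from an unintended location.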

mergepdf
             .-------------------------.                                  
             V                         |                                  
>>-mergepdf----+--------------+--input-+-- -o--output--+-----+--+-----+-><
               '- -r--scaling-'                        '- -n-'  '- -v-'   

Merges two or more PDF files into a single PDF file.

-r scaling
Specifies the scaling ratio for the PDF file. This value must be greater than zero and can be either an integer or a floating-point value. If not specified, a ratio of 1.0 is assumed.
input
Specifies the name of a PDF input file, or a directory that contains PDF files.
-o output
Specifies the name of the PDF output file, or a directory to which the merged output is written.
-n
If specified, PDF files are not normalized. If not specified, mergepdf normalizes files based on an internally calculated ratio before applying any user-defined scaling factor.
-v
Specifies verbose mode, and causes internal and user-specified scaling ratios to be displayed to standard output.
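
The syntax diagram suggests that each input file can be preceded by its own -r scaling ratio. Under that reading, a weighted merge command line can be assembled as in the following sketch. The function name weighted_merge and the RATIO:INPUT argument format are hypothetical conveniences for this example; the function prints the command instead of running it.

```shell
# Hypothetical helper: prints a mergepdf command with one -r ratio per input.
# Usage: weighted_merge OUTPUT RATIO:INPUT...
weighted_merge() {
    output=$1
    shift
    cmd="mergepdf"
    for pair in "$@"; do
        ratio=${pair%%:*}   # text before the first colon
        input=${pair#*:}    # text after the first colon
        cmd="$cmd -r $ratio $input"
    done
    echo "$cmd -o $output"
}

# Give the profile from run1 twice the weight of the profile from run2
weighted_merge merged 2:run1 1:run2
```

The call prints mergepdf -r 2 run1 -r 1 run2 -o merged. Adding -v to the real command displays the internal and user-specified scaling ratios.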
resetpdf
>>-resetpdf--+--------+--+-----+--+--------------+-------------><
             '-pdfdir-'  '- -u-'  '- -f--pdfname-'   

Same as cleanpdf.

showpdf

Displays part of the profiling information written to PDF and PDF map files. To use this command, you must first compile your program and use the -qpdf1 option. See Viewing profiling information with showpdf for more information.

Predefined macros

None.

Examples

The following example uses the -qpdf1=level=0 option to reduce possible runtime instrumentation overhead:
#Compile all the files with -qpdf1=level=0
xlc -qpdf1=level=0 -O3 file1.c file2.c file3.c

#Run with one set of input data
./a.out < sample.data 

#Recompile all the files with -qpdf2
xlc -qpdf2 -O3 file1.c file2.c file3.c

#If the sample data is typical, the program 
#can now run faster than without the PDF process
The following example uses the -qpdf1=level=1 option:
#Compile all the files with -qpdf1
xlc -qpdf1 -O3 file1.c file2.c file3.c

#Run with one set of input data             
./a.out < sample.data 

#Recompile all the files with -qpdf2
xlc -qpdf2 -O3 file1.c file2.c file3.c

#If the sample data is typical, the program 
#can now run faster than without the PDF process  
The following example uses the -qpdf1=level=2 option to gather cache-miss profiling information:
#Compile all the files with -qpdf1=level=2
xlc -qpdf1=level=2 -O3 file1.c file2.c file3.c

#Set PDF_PM_EVENT=L2MISS to gather L2 cache-miss profiling 
#information
export PDF_PM_EVENT=L2MISS

#Run with one set of input data
./a.out < sample.data 

#Recompile all the files with -qpdf2
xlc -qpdf2 -O3 file1.c file2.c file3.c

#If the sample data is typical, the program 
#can now run faster than without the PDF process
The following example uses the -qpdf1=level=2 option with multiple runs to gather cache-miss profiling information at different cache levels:
#Compile all the files with -qpdf1=level=2
xlc -qpdf1=level=2 -O3 file1.c file2.c file3.c

#Set PDF_PM_EVENT=L1MISS to gather L1 cache-miss profiling 
#information
export PDF_PM_EVENT=L1MISS

#Run with one set of input data
./a.out < sample.data 

#Set PDF_PM_EVENT=L2MISS to gather L2 cache-miss profiling 
#information
export PDF_PM_EVENT=L2MISS

#Run with one set of input data
./a.out < sample.data 

#Recompile all the files with -qpdf2
xlc -qpdf2 -O3 file1.c file2.c file3.c

#If the sample data is typical, the program 
#can now run faster than without the PDF process
The following example demonstrates the process of multiple-pass profiling:
#Compile all the files with -qpdf1=level=2. The static profiling 
#information is recorded in a file named ._pdf_map by default
xlc -qpdf1=level=2 -O3 file1.c file2.c file3.c

#Run with one set of input data. The profiling information 
#is recorded in a file named ._pdf by default
./a.out < sample.data 

#Recompile all the files with -qpdf1=level=2 again
#The compiler reads the previous profiling information, refines
#instrumentation, and generates a new instrumented 
#executable. The static profiling information 
#is recorded in ._pdf.1_map
xlc -qpdf1=level=2 -O3 file1.c file2.c file3.c

#Run it again. The profiling information is recorded in 
#._pdf.1
./a.out < sample.data

#Recompile all the files with -qpdf2
xlc -qpdf2 -O3 file1.c file2.c file3.c

#If the sample data is typical, the program 
#can now run faster than without the PDF process
The following example demonstrates the use of the PDF_BIND_PROCESSOR environment variable:
#Compile all the files with -qpdf1=level=1
xlc -qpdf1=level=1 -O3 file1.c file2.c file3.c

#Set PDF_BIND_PROCESSOR environment variable so that 
#all processes for this executable are run on Processor 1
export PDF_BIND_PROCESSOR=1

#Run executable with sample input data
./a.out < sample.data 

#Recompile all the files with -qpdf2
xlc -qpdf2 -O3 file1.c file2.c file3.c

#If the sample data is typical, the program 
#can now run faster than without the PDF process
The following example demonstrates the use of the -qpdf[1|2]=exename option:
#Compile all the files with -qpdf1=exename
xlc -qpdf1=exename -O3 -o final file1.c file2.c file3.c

#Run executable with sample input data
./final < typical.data 

#List the content of the directory
 >ls -lrta

 -rw-r--r-- 1 user staff 50 Dec 05 13:18 file1.c
 -rw-r--r-- 1 user staff 50 Dec 05 13:18 file2.c
 -rw-r--r-- 1 user staff 50 Dec 05 13:18 file3.c
 -rwxr-xr-x 1 user staff 12243 Dec 05 17:00 final
 -rwxr-Sr-- 1 user staff 762 Dec 05 17:03 .final_pdf

#Recompile all the files with -qpdf2=exename
xlc -qpdf2=exename -O3 -o final file1.c file2.c file3.c

#The program is now optimized using PDF information 
The following example demonstrates the use of the -qpdf[1|2]=pdfname option:
#Compile all the files with -qpdf1=pdfname. The static profiling 
#information is recorded in a file named final_map
xlc -qpdf1=pdfname=final -O3 file1.c file2.c file3.c

#Run executable with sample input data. The profiling 
#information is recorded in a file named final
./a.out < typical.data 

#List the content of the directory
 >ls -lrta

 -rw-r--r-- 1 user staff 50 Dec 05 13:18 file1.c
 -rw-r--r-- 1 user staff 50 Dec 05 13:18 file2.c
 -rw-r--r-- 1 user staff 50 Dec 05 13:18 file3.c
 -rwxr-xr-x 1 user staff 12243 Dec 05 18:30 a.out
 -rwxr-Sr-- 1 user staff 762 Dec 05 18:32 final

#Recompile all the files with -qpdf2=pdfname
xlc -qpdf2=pdfname=final -O3 file1.c file2.c file3.c

#The program is now optimized using PDF information 

