None.
Tunes optimizations through profile-directed feedback (PDF), where results from sample program execution are used to improve optimization near conditional branches and in frequently executed code sections.
PDF is a two-step process. You first compile and link the application with -qpdf1 and an optimization level of at least -O2. You then run the resulting application with a typical data set. During the test run, profile data is written to a profile file. By default, the profile file is named ._pdf and is saved in the current working directory, or in the directory named by the PDFDIR environment variable, if it is set. You then recompile and link, or relink, the application with -qpdf2 and the same optimization level that you used with -qpdf1. This second step fine-tunes the optimizations according to the profile data collected during program execution.
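The lookup rule for the default profile file can be sketched as a small shell helper. This is only an illustration of the rule described above: the function name pdf_profile_path is hypothetical, and the sketch assumes that an empty PDFDIR is treated the same as an unset one.

```shell
#!/bin/sh
# Hypothetical helper: print where the default profile file (._pdf)
# is expected to be written.
pdf_profile_path() {
    # Use PDFDIR when it is set and non-empty (assumption: empty counts
    # as unset); otherwise fall back to the current working directory.
    printf '%s/._pdf\n' "${PDFDIR:-$PWD}"
}

pdf_profile_path
```

Running the helper before the training run shows which directory must be writable for the profile data to be saved.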
You can reuse old profiling information. In previous releases, if you modified the source files or compiler options and then compiled with -qpdf2, compilation stopped with an error. As of IBM XL C/C++ for AIX, V11.1, the compiler issues a list of warnings but compilation does not stop. However, if the compiler options differ between the PDF stages, you gain no benefit from using PDF.
PDF is intended to be used after other debugging and tuning is finished, as one of the last steps before putting the application into production.
        .-nopdf2-----------------------------.
        +-nopdf1-----------------------------+
>>- -q--+-pdf1--+--------------------------+-+-----------------><
        |       +-=--pdfname--=--file_path-+ |
        |       +-=--exename---------------+ |
        |       +-=--defname---------------+ |
        |       '-=--level--=--0--1--2-----' |
        '-pdf2--+--------------------------+-'
                +-=--pdfname--=--file_path-+
                +-=--exename---------------+
                +-=--defname---------------+
                '-=--level--=--0--1--2-----'
-qnopdf1, -qnopdf2
You must compile the main program with PDF for profiling information to be collected at run time.
If you do not want the optimized object files to be relinked during the second step, specify -qpdf2 -qnoipa.
If you want to specify an alternate path and file name for the profile file, use the pdfname suboption. Alternatively, you can use the PDFDIR environment variable to specify the absolute path name of the profiling directory. To derive the name of the profile file from the output file name that you specify with -o, use the exename suboption. To revert the profile file to its default name, use the defname suboption. Do not compile or run two different applications that use the same profiling directory at the same time, unless you have used the pdfname suboption to distinguish the sets of profiling information. For examples, see Optimizing your applications.
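As a rough illustration of keeping two applications apart in one profiling directory, the following dry-run sketch prints the commands for each application with its own pdfname value. The helper pdf_commands, the application names, the source file names, and the profile paths are all hypothetical; only the -qpdf1=pdfname=file_path and -qpdf2=pdfname=file_path forms come from the syntax diagram.

```shell
#!/bin/sh
# Dry run: print (do not execute) the PDF build commands for one application.
# $1 = output name, $2 = profile file path (both hypothetical), rest = sources.
pdf_commands() {
    exe=$1
    profile=$2
    shift 2
    # Step 1: instrumented build that writes to this application's own profile file.
    echo "xlc -qpdf1=pdfname=$profile -O3 -o $exe $*"
    # Training run with representative input.
    echo "./$exe < sample.data"
    # Step 2: optimized build that reads the same profile file.
    echo "xlc -qpdf2=pdfname=$profile -O3 -o $exe $*"
}

pdf_commands app_a /tmp/profiles/app_a_pdf a1.c a2.c
pdf_commands app_b /tmp/profiles/app_b_pdf b1.c b2.c
```

Because each application names a different profile file, the two sets of profiling information should not overwrite each other even when both builds share a directory.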
When you specify -qprefetch=assistthread to generate data prefetching assist threads, the compiler uses delinquent load information to perform its analysis and generate the threads. You can gather the delinquent load information through dynamic profiling with -qpdf1=level=2. For more information, see -qprefetch.
For the recommended procedure for using PDF, see Using profile-directed feedback.
>>-cleanpdf--+----------------+--------------------------------><
             '-directory_path-'
Removes all profiling information from the directory specified by directory_path; or if directory_path is not specified, from the directory set by the PDFDIR environment variable; or if PDFDIR is not set, from the current directory. Removing profiling information reduces runtime overhead if you change the program and then go through the PDF process again.
Run cleanpdf only when you are finished with the PDF process for a particular application. Otherwise, if you want to resume using PDF with that application, you will need to recompile all of the files again with -qpdf1.
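The order in which cleanpdf chooses its target directory can be sketched as follows. This is an illustrative helper only: the name cleanpdf_target is hypothetical, the script does not call cleanpdf itself, and it assumes an empty PDFDIR is treated as unset.

```shell
#!/bin/sh
# Hypothetical helper: print the directory that cleanpdf would act on,
# following the precedence described above.
cleanpdf_target() {
    if [ -n "${1:-}" ]; then
        # 1. An explicit directory_path argument wins.
        printf '%s\n' "$1"
    elif [ -n "${PDFDIR:-}" ]; then
        # 2. Otherwise use PDFDIR (assumption: empty counts as unset).
        printf '%s\n' "$PDFDIR"
    else
        # 3. Otherwise use the current directory.
        printf '%s\n' "$PWD"
    fi
}

cleanpdf_target
```

Checking the target first is a cheap safeguard, since removing the profile data means recompiling everything with -qpdf1 before PDF can be resumed.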
               .-----------------------.
               V                       |
>>-mergepdf----+--------------+--input-+-- -o--output--+-----+--+-----+-><
               '- -r--scaling-'                        '- -n-'  '- -v-'
Merges two or more PDF records into a single PDF output record.
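A command line that follows the mergepdf syntax diagram can be assembled as in the dry-run sketch below. The helper mergepdf_command and the file names are hypothetical, and the sketch assumes that each -r scaling value acts as a relative weight for the input that follows it; this section does not describe the scaling semantics, so treat that reading as an assumption.

```shell
#!/bin/sh
# Dry run: print a mergepdf command line that follows the syntax diagram.
# $1 = output record; remaining arguments are "scaling input" pairs
# (all values here are hypothetical).
mergepdf_command() {
    output=$1
    shift
    cmd="mergepdf"
    # Append one "-r scaling input" group per pair of arguments.
    while [ "$#" -ge 2 ]; do
        cmd="$cmd -r $1 $2"
        shift 2
    done
    echo "$cmd -o $output"
}

mergepdf_command merged_pdf 2 run1_pdf 1 run2_pdf
```

Printing the command instead of running it makes it easy to review the input order and scaling values before any profile data is combined.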
>>-resetpdf--+----------------+--------------------------------><
             '-directory_path-'
Same as cleanpdf, described above.
>>-showpdf--+----------------+--+-----+--+-----------+---------><
            '-directory_path-'  '- -f-'  '-file_path-'
Displays the function call and block counts that were written to the profile file during a program run. Use the -f option to specify the profile file. To use this command, you must first compile your application with both the -qpdf1 and -qshowpdf compiler options specified on the command line.
None.
// Compile all files with -qpdf1.
xlc -qpdf1 -O3 file1.c file2.c file3.c
// Run with one set of input data.
./a.out < sample.data
// Recompile all files with -qpdf2.
xlc -qpdf2 -O3 file1.c file2.c file3.c
// The program should now run faster than
// without PDF if the sample data is typical.
// Set the PDFDIR variable.
export PDFDIR=$HOME/project_dir
// Compile most of the files with -qpdf1.
xlc -qpdf1 -O3 -c file1.c file2.c file3.c
// This file is not so important to optimize.
xlc -c file4.c
// Non-PDF object files such as file4.o can be linked in.
xlc -qpdf1 -O3 file1.o file2.o file3.o file4.o
// Run several times with different input data.
./a.out < polar_orbit.data
./a.out < elliptical_orbit.data
./a.out < geosynchronous_orbit.data
// No need to recompile the source of non-PDF object files (file4.c).
xlc -qpdf2 -O3 -c file1.c file2.c file3.c
// Link all the object files into the final application.
xlc -qpdf2 -O3 file1.o file2.o file3.o file4.o
Here is an example that bypasses recompiling the source with -qpdf2:
// Compile source with -qpdf1.
xlc -O3 -qpdf1 -c file.c
// Link in object file.
xlc -O3 -qpdf1 file.o
// Run with one set of input data.
./a.out < sample.data
// Link in object file from qpdf1 pass.
// (Bypass source recompilation with -qpdf2.)
xlc -O3 -qpdf2 file.o
// Compile source with -qpdf1.
xlc -c -qpdf1 -O3 file1.c file2.c
// Link in object files.
xlc -qpdf1 -O3 file1.o file2.o
// Run with one set of input data.
./a.out < sample.data
// Link in the mix of pdf1 and pdf2 objects.
xlc -qpdf2 -O3 file1.o file2.o
// Compile source with -qpdf1.
xlc -c -O3 -qpdf1 file1.c file2.c file3.c
// Link in object files.
xlc -O3 -qpdf1 file1.o file2.o file3.o
// Run with one set of input data.
./a.out < sample.data
// Recompile the instrumented source files.
xlc -c -O3 -qpdf2 -qnoipa file1.c file2.c file3.c
Here is an example that reduces possible runtime instrumentation overhead:
//Compile all files with -qpdf1=level=0.
xlc -qpdf1=level=0 -O3 file1.c file2.c file3.c
//Run with one set of input data.
./a.out < sample.data
//Recompile all files with -qpdf2.
xlc -qpdf2 -O3 file1.c file2.c file3.c
//The program should now run faster than
//without PDF if the sample data is typical.
Here is an example that gathers cache miss profiling information (including block counter profiling and value profiling information):
//Compile all files with -qpdf1=level=2.
xlc -qpdf1=level=2 -O5 file1.c file2.c file3.c
//Set PDF_PM_EVENT=L2MISS to gather L2 cache misses.
export PDF_PM_EVENT=L2MISS
//Run with one set of input data.
./a.out < sample.data
//Recompile all files with -qpdf2.
xlc -qpdf2 -O5 file1.c file2.c file3.c
//The program should now run faster than
//without PDF if the sample data is typical.
Here is an example of multiple pass profiling:
//Compile all files with -qpdf1=level=2.
xlc -qpdf1=level=2 -O5 file1.c file2.c file3.c
//Run with one set of input data, the profiling information is recorded
//in ._pdf by default.
./a.out < sample.data
//Recompile all files with -qpdf1=level=2 again.
//The compiler will read the previous profile data,
//refine instrumentation and generate a new instrumented executable.
xlc -qpdf1=level=2 -O5 file1.c file2.c file3.c
//Run it again, the profiling information is recorded in
//._pdf.1
./a.out < sample.data
//Recompile all files with -qpdf2.
xlc -qpdf2 -O5 file1.c file2.c file3.c
//The program should now run faster than
//without PDF if the sample data is typical.
Here is an example that uses -qpdf[1|2]=exename:
//Compile all files with -qpdf1=exename.
xlc -qpdf1=exename -O5 -o final file1.c file2.c file3.c
//Run executable with sample input data.
./final < typical.data
//List the content of the directory.
>ls -lrta
-rw-r--r-- 1 user staff 50 Dec 05 13:18 file1.c
-rw-r--r-- 1 user staff 50 Dec 05 13:18 file2.c
-rw-r--r-- 1 user staff 50 Dec 05 13:18 file3.c
-rwxr-xr-x 1 user staff 12243 Dec 05 17:00 final
-rwxr-Sr-- 1 user staff 762 Dec 05 17:03 .final_pdf
//Recompile all files with -qpdf2=exename.
xlc -qpdf2=exename -O5 -o final file1.c file2.c file3.c
//The program is now optimized using PDF information.