Documentation errata for IBM XL C/C++ for Linux, V11.1

Preventive Service Planning


Abstract

This page contains corrections and additions to the product documentation shipped with IBM XL C/C++ for Linux, V11.1.

Content

Getting Started

The following corrections and additions apply to the IBM XL C/C++ for Linux, V11.1 Getting Started:

Chapter: What's new for IBM XL C/C++ for Linux, V11.1

Section: Performance and optimization

The three new -qpdf suboptions are:

level
Supports multiple-pass profiling, block counter profiling, call counter profiling, and extended value profiling. You can compile your application with -qpdf1=level=0|1|2 to generate profiling data with different levels of optimization.

should read:

These new suboptions are as follows:

level
Supports multiple-pass profiling, single-pass profiling, cache-miss profiling, value profiling, block-counter profiling, and call-counter profiling. You can compile your program with -qpdf1=level=0|1|2 to specify the type of profiling information to be generated by the resulting application.

Chapter: Enhancements added in previous versions

Section: Enhancements added in Version 11.1

Topic: New or changed compiler options and directives

The following row should be deleted:

-qipa You can generate relinkable objects while preserving IPA information by specifying -r -qipa=relink.


Installation Guide
The following corrections and additions apply to the IBM XL C/C++ for Linux, V11.1 Installation Guide:

Chapter: After installing IBM XL C/C++ for Linux, V11.1

Section: Accessing the local documentation

Topic: Viewing the HTML documentation

This information center can be installed on any RHEL 5.5 or SLES 10 SP2 system.

should read:

This information center can be installed on any operating system supported by XL C/C++ for Linux, V11.1.


Language Reference
The following corrections and additions apply to the IBM XL C/C++ for Linux, V11.1 Language Reference:

Chapter: Data objects and declarations

Section: Type specifiers

Topic: User-defined types

Structures and unions

Member declarations

A structure or union member may be of any type except:
  • any variably modified type
  • C only a function
  • any incomplete type

should read:

A structure or union member may be of any type except:
  • any variably modified type
  • void type
  • C only a function
  • any incomplete type

Flexible array members should read as follows:

A flexible array member is an unbounded array that occurs within a structure. It is a C99 feature and can be used to access a variable-length object. A flexible array member is permitted as the last member of a structure, provided that the structure has more than one named member. It is declared with an empty index as follows:

array_identifier [ ];

For example, b is a flexible array member of structure f.

Because a flexible array member has an incomplete type, you cannot apply the sizeof operator to a flexible array. In this example, the statement sizeof(f) returns the same result as sizeof(f.a), which is the size of an integer. The statement sizeof(f.b) is not allowed, because b is a flexible array member that has an incomplete type.

Any structure containing a flexible array member cannot be a member of another structure or an element of an array, for example:


IBM extension To be compatible with GNU C, XL C/C++ extends Standard C and C++ to ease the restrictions on flexible array members and allows the following situations:
  • Flexible array members can be declared in any part of a structure, not just as the last member. C++ only The type of any member that follows the flexible array member must be compatible with the type of the flexible array member. C only The type of any member that follows the flexible array member is not required to be compatible with the type of the flexible array member; however, a warning is issued when a flexible array member is followed by members of an incompatible type. The following example demonstrates this:
  • Structures containing flexible array members can be members of other structures.
  • Flexible array members can be statically initialized only if either of the following two conditions is true:
    • The flexible array member is the last member of the structure, for example:
    • Flexible array members are contained in the outermost structure of nested structures. Members of inner structures cannot be statically initialized, for example:

End IBM extension

Zero-extent array members (IBM extension) should read as follows:

Zero-extent arrays are provided for GNU C/C++ compatibility, and can be used to access a variable-length object.

A zero-extent array is an array with an explicit zero specified as its dimension.

array_identifier [0]

For example, b is a zero-extent array member of structure f.

The sizeof operator can be applied to a zero-extent array, and the value returned is 0. In this example, the statement sizeof(f) returns the same result as sizeof(f.a), which is the size of an integer. The statement sizeof(f.b) returns 0.

A structure containing a zero-extent array can be an element of an array, for example:


A zero-extent array can only be statically initialized with an empty set {}. Otherwise, it must be initialized as a dynamically allocated array. For example:

If a zero-extent array is not initialized, no static zero filling occurs, because a zero-extent array is defined to have no members. The following example demonstrates this:

In this example, the two printf statements produce the same output:


A zero-extent array can be declared in any part of a structure, not just as the last member. The type of any member following the zero-extent array is not required to be compatible with the type of the zero-extent array; however, a warning is issued when a zero-extent array is followed by members of incompatible type. For example:


Chapter: Expressions and Operators

Section: Unary expressions

Topic: The __real__ and __imag__ operators (C only) (IBM extension)

should be:

The __real__ and __imag__ operators (IBM extension)


Compiler Reference
The following corrections and additions apply to the IBM XL C/C++ for Linux, V11.1 Compiler Reference:

Chapter: Configuring compiler defaults

Section: Setting environment variables

Topic: Runtime environment variables

The following variable should be added:

PDF_PM_EVENT
When you run an application compiled with -qpdf1=level=2, you can set the value of the environment variable PDF_PM_EVENT to L1MISS, L2MISS or L3MISS to gather cache-miss profiling information at the specified cache level.

Chapter: Compiler options reference

Section: Individual option descriptions

Topic: -qinline

Defaults
  • -qnoinline
  • At an optimization level of -O0, the default is -qinline=noauto
  • At optimization levels of -O2 and higher, the default is -qinline=auto
  • -qinline=auto:level=5 is the default suboption of -qinline

should read:
  • -qnoinline
  • At optimization levels of -O2 and higher, the default is -qinline=noauto
  • -qinline=auto:level=5 is the default suboption of -qinline

Topic: -qpdf1, -qpdf2

Syntax should read:

        .-nopdf2-----------------------------.
        +-nopdf1-----------------------------+
>>- -q--+-pdf1--+--------------------------+-+-----------------><
        |       +-=--pdfname--=--file_path-+ |
        |       +-=--exename---------------+ |
        |       +-=--defname---------------+ |
        |       '-=--level--=--0 | 1 | 2---' |
        '-pdf2--+--------------------------+-'
                +-=--pdfname--=--file_path-+
                +-=--exename---------------+
                '-=--defname---------------'


exename
Generates the name of the PDF file based on what you specify with the -o option. For example, you can use -qpdf1=exename -o foo foo.f to generate a PDF file called .foo_pdf.

should read:

exename
Sets the name of the generated PDF file based on what you specify with the -o option. For example, you can use -qpdf1=exename -o foo foo.c to generate a PDF file called .foo_pdf.

level=0 | 1 | 2 should read:

level=0 | 1 | 2

Specifies different levels of profiling information to be generated by the resulting application. The following table provides information about the type of profiling supported on each level (The symbol + indicates that the profiling type is supported):

Profiling type supported on each -qpdf1 level

Profiling type             Level 0   Level 1   Level 2
block-counter profiling    +         +         +
call-counter profiling     +         +         +
single-pass profiling      +         +
value profiling                      +         +
multiple-pass profiling                        +
cache-miss profiling                           +

  • -qpdf1=level=0 is the basic compiler instrumentation that results in smaller file size and faster compilation than -qpdf1=level=1.
  • -qpdf1=level=1 is the default compiler instrumentation. It is equivalent to -qpdf1 in the releases before IBM® XL C/C++ for Linux®, V11.1.
  • -qpdf1=level=2 is a more aggressive compiler instrumentation than -qpdf1=level=0 and -qpdf1=level=1. It is supported at all optimization levels where PDF is enabled.
    Notes:
    • Cache-miss profiling is enabled on SLES11 SP1.
    • You can set the value of the environment variable PDF_PM_EVENT to L1MISS, L2MISS or L3MISS (if applicable) to gather different levels of cache-miss profiling information.
    • You can set the environment variable PDF_BIND_PROCESSOR to bind your application to the specified processor for cache-miss profiling. Cache-miss profiling information is only available on the POWER5™, POWER6®, and POWER7™ processors.

The following example should be added for -qpdf1, -qpdf2:

Here is an example that gathers cache-miss profiling information. Block-counter profiling and value profiling information are also gathered:

# Compile all files with -qpdf1=level=2.
xlc -qpdf1=level=2 -O5 file1.c file2.c file3.c

# Set PDF_PM_EVENT=L2MISS to gather L2 cache misses.
export PDF_PM_EVENT=L2MISS

# Run with one set of input data.
./a.out < sample.data

# Recompile all files with -qpdf2.
xlc -qpdf2 -O5 file1.c file2.c file3.c

# The program should now run faster than
# without PDF if the sample data is typical.

Topic: -qtune

The following rows in the Acceptable -qarch/-qtune combinations table:

-qarch option   Default -qtune setting   Available -qtune settings
pwr4            pwr4                     auto | pwr4 | pwr5 | pwr7 | ppc970 | balanced
pwr5            pwr5                     auto | pwr5 | pwr7 | balanced
pwr5x           pwr5                     auto | pwr5 | pwr7 | balanced

Should read:

-qarch option   Default -qtune setting   Available -qtune settings
pwr4            pwr4                     auto | pwr4 | pwr5 | pwr6 | pwr7 | ppc970 | balanced
pwr5            pwr5                     auto | pwr5 | pwr6 | pwr7 | balanced
pwr5x           pwr5                     auto | pwr5 | pwr6 | pwr7 | balanced


Chapter: Compiler pragmas reference

Section: Individual pragma descriptions

Topic: #pragma reg_killed_by

fs
    Floating-point and status control register

Should read:

fsr
    Floating-point and status control register

Topic: #pragma stream_unroll

Examples

Should read:

The following example shows how #pragma stream_unroll can increase performance.

int i, m, n;
int a[1000];
int b[1000];
int c[1000];

....

#pragma stream_unroll(4)
for (i=0; i<n; i++) {
 a[i] = b[i] * c[i];
}


The unroll factor of 4 reduces the number of iterations from n to n/4, as follows:

m = n/4;

for (i=0; i<n/4; i++){
 a[i] = b[i] * c[i];
 a[i+m] = b[i+m] * c[i+m];
 a[i+2*m] = b[i+2*m] * c[i+2*m];
 a[i+3*m] = b[i+3*m] * c[i+3*m];
}


The increased number of read and store operations is distributed among a number of streams determined by the compiler, which reduces computation time and increases performance.

Chapter: Configuring compiler defaults

Section: Setting environment variables

Topic: Environment variables for parallel processing

In "XLSMPOPTS", the description of stack=num should read:

stack=num
    Specifies the largest amount of space in bytes (num) that a thread's stack needs. The default value for num is 4194304.

    Set num so that it is within the acceptable upper limit. num can be up to the limit imposed by system resources or the stack size ulimit, whichever is smaller. An application that exceeds the upper limit may cause a segmentation fault.

Chapter: Compiler options reference

Section: Individual option descriptions

Topic: -qlanglvl

Default:

C++ only The suboptions and their default settings for different language levels (compat366, strict98, extended (C++), and extended0x) are listed in Table 1.

should read:

C++ only The suboptions and their default settings for different language levels (compat366, extended (C++), and extended0x) are listed in Table 1.

In Table 1. Default Settings of suboptions for different language levels, the strict98 column should be removed.


Optimization and Programming Guide
The following corrections and additions apply to the IBM XL C/C++ for Linux, V11.1 Optimization and Programming Guide:

Chapter: Using C++ templates

Section: Using the -qtemplateregistry compiler option

The following two paragraphs should be deleted:

If you want to use either -qtemplateregistry or -qtempinc to compile your programs, you must organize your source code with -qtempinc. See the examples described in the Example of -qtempinc section for more information.

If you also want to compile your program with -qtempinc, you must organize your source code so that it can be compiled with and without -qtempinc.

Chapter: Optimizing your applications

Section: Using profile-directed feedback

To use PDF, follow these steps:

2. Run the program all the way through using data that is representative of the data that is used during a normal run of your finished program. The program records profiling information when it finishes. You can run the program multiple times with different data sets, and the profiling information is accumulated to provide a count of how often branches are taken and blocks of code are executed, based on the input data used. When the application exits, by default, it writes profiling information to the PDF file in the current working directory or the directory specified by the PDFDIR environment variable. The default name for the instrumentation file is ._pdf . To override the defaults, use -qpdf1=pdfname or -qpdf2=pdfname.

should read:

2. Run the program all the way through using data that is representative of the data that is used during a normal run of your finished program. The program records profiling information when it finishes. You can run the program multiple times with different data sets, and the profiling information is accumulated to provide a count of how often branches are taken and blocks of code are executed, based on the input data used. When the application exits, by default, it writes profiling information to the PDF file in the current working directory or the directory specified by the PDFDIR environment variable. The default name for the instrumentation file is ._pdf . To override the defaults, use -qpdf1=pdfname or -qpdf1=exename.

You can take more control of the PDF file generation, as follows:

3. Change the PDF file location specified by the PDFDIR environment variable or the -qipa=pdfname option to produce a PDF file in a different location.

should read:

3. Change the PDF file location specified by the PDFDIR environment variable or the -qpdf1=pdfname option to produce a PDF file in a different location.

Topic: Viewing profiling information with showpdf

3. Run the showpdf utility to display the call and block counts for the executable file. If you used the -qipa=pdfname option during compilation, use the -f option to indicate the instrumentation file.

should read:

3. Run the showpdf utility to display the call and block counts for the executable file. If you used the -qpdf[1|2]=pdfname option during compilation, use the -f option to indicate the instrumentation file.

Chapter: Debugging optimized code

Section: Using -qoptdebug to help debug optimized programs

Example 2 and Example 3 should read:

Example 2: gdb debugger listing



Example 3: Stepping through optimized source


Related information

README updates for XL C/C++ for Linux V11.1

Document information

More support for: XL C/C++ for Linux, Documentation
Software version: 11.1
Operating system(s): Linux
Reference #: 1431755
Modified date: 2013-11-20
