IBM Support

Documentation errata for IBM XL C/C++ for Linux, V11.1

Preventive Service Planning


Abstract

This page contains corrections and additions to the product documentation shipped with IBM XL C/C++ for Linux, V11.1.

Content

Getting Started
The following corrections and additions apply to the IBM XL C/C++ for Linux, V11.1 Getting Started:

Topic location: What's new for IBM XL C/C++ for Linux, V11.1 > Performance and optimization

The three new -qpdf suboptions are:

level
Supports multiple-pass profiling, block counter profiling, call counter profiling, and extended value profiling. You can compile your application with -qpdf1=level=0|1|2 to generate profiling data with different levels of optimization.

Should read:

These new suboptions are as follows:

level
Supports multiple-pass profiling, single-pass profiling, cache-miss profiling, value profiling, block-counter profiling, and call-counter profiling. You can compile your program with -qpdf1=level=0|1|2 to specify the type of profiling information to be generated by the resulting application.

Chapter: Enhancements added in previous versions

Section: Enhancements added in Version 11.1

Topic: New or changed compiler options and directives

The following row should be deleted:

-qipa
You can generate relinkable objects while preserving IPA information by specifying -r -qipa=relink.


Installation Guide
The following corrections and additions apply to the IBM XL C/C++ for Linux, V11.1 Installation Guide:

Topic location: After installing IBM XL C/C++ for Linux, V11.1 > Accessing the local documentation > Viewing the HTML documentation

This information center can be installed on any RHEL 5.5 or SLES 10 SP2 system.

Should read:

This information center can be installed on any operating system supported by XL C/C++ for Linux, V11.1.


Language Reference
The following corrections and additions apply to the IBM XL C/C++ for Linux, V11.1 Language Reference:

Topic location: Data objects and declarations > Type specifiers > User-defined types > Structures and unions

Member declarations

A structure or union member may be of any type except:
  • any variably modified type
  • C only: a function
  • any incomplete type

Should read:

A structure or union member may be of any type except:
  • any variably modified type
  • void type
  • C only: a function
  • any incomplete type

Flexible array members should read as follows:

A flexible array member is an unbounded array that occurs within a structure. It is a C99 feature and can be used to access a variable-length object. A flexible array member is permitted as the last member of a structure, provided that the structure has more than one named member. It is declared with an empty index as follows:

array_identifier [ ];

For example, b is a flexible array member of structure f.

Because a flexible array member has an incomplete type, you cannot apply the sizeof operator to a flexible array. In this example, the statement sizeof(f) returns the same result as sizeof(f.a), which is the size of an integer. The statement sizeof(f.b) is not allowed, because b is a flexible array member that has an incomplete type.
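
The declaration and the sizeof behavior described above can be sketched as follows. This is an illustrative sketch, not the listing from the Language Reference: the structure f and its members a and b follow the text, while the helper make_f and the sample values are hypothetical. It assumes a target on which the structure needs no trailing padding.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* b is a flexible array member of structure f (C99). */
struct f {
    int a;
    int b[];
};

/* Hypothetical helper: allocates a struct f with room for n elements
   in b. The caller releases the object with free(). */
struct f *make_f(size_t n)
{
    struct f *p;

    /* Guard against overflow in the size computation below. */
    assert(n <= (SIZE_MAX - sizeof(struct f)) / sizeof(int));

    p = malloc(sizeof(struct f) + n * sizeof(int));
    if (p != NULL) {
        p->a = (int)n;                  /* record the element count */
        for (size_t i = 0; i < n; i++)
            p->b[i] = (int)i * 10;      /* fill b with sample values */
    }
    return p;
}
```

Under the stated assumption, sizeof(struct f) equals sizeof(int), because the flexible array member contributes no storage of its own. Applying sizeof to the member b itself is rejected at compile time.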

A structure that contains a flexible array member cannot be a member of another structure or an element of an array, for example:


IBM extension To be compatible with GNU C, XL C/C++ extends Standard C and C++ to ease the restrictions on flexible array members and allow the following situations:
  • Flexible array members can be declared in any part of a structure, not just as the last member.
    • C++ only The type of any member that follows the flexible array member must be compatible with the type of the flexible array member.
    • C only The type of any member that follows the flexible array member is not required to be compatible with the type of the flexible array member; however, a warning is issued when a flexible array member is followed by members of an incompatible type. The following example demonstrates this:
  • Structures containing flexible array members can be members of other structures.
  • Flexible array members can be statically initialized only if either of the following two conditions is true:
    • The flexible array member is the last member of the structure, for example:
    • Flexible array members are contained in the outermost structure of nested structures. Members of inner structures cannot be statically initialized, for example:

End IBM extension

Zero-extent array members (IBM extension) should read as follows:

Zero-extent arrays are provided for GNU C/C++ compatibility, and can be used to access a variable-length object.

A zero-extent array is an array with an explicit zero specified as its dimension.

array_identifier [0]

For example, b is a zero-extent array member of structure f.

The sizeof operator can be applied to a zero-extent array, and the value returned is 0. In this example, the statement sizeof(f) returns the same result as sizeof(f.a), which is the size of an integer. The statement sizeof(f.b) returns 0.
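
The sizeof results described above can be sketched as follows. This is an illustrative sketch rather than the listing from the Language Reference: the structure f and its members follow the text, the function size_of_b is hypothetical, and the code relies on the zero-extent array extension, so a compiler in a strict standard mode may reject or warn about it.

```c
#include <assert.h>
#include <stddef.h>

/* b is a zero-extent array member of structure f (extension). */
struct f {
    int a;
    int b[0];
};

/* Hypothetical helper: returns the size reported for the member b. */
size_t size_of_b(void)
{
    struct f obj;

    /* The zero-extent member adds no storage to the structure. */
    assert(sizeof(struct f) == sizeof obj.a);

    return sizeof obj.b;
}
```

With a compiler that supports the extension, size_of_b returns 0.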

A structure containing a zero-extent array can be an element of an array, for example:


A zero-extent array can only be statically initialized with an empty set {}. Otherwise, it must be initialized as a dynamically allocated array. For example:

If a zero-extent array is not initialized, no static zero filling occurs, because a zero-extent array is defined to have no members. The following example demonstrates this:

In this example, the two printf statements produce the same output:


A zero-extent array can be declared in any part of a structure, not just as the last member. The type of any member following the zero-extent array is not required to be compatible with the type of the zero-extent array; however, a warning is issued when a zero-extent array is followed by members of incompatible type. For example:



Topic location: Declarators > Pointers > Type-based aliasing

In the example:



Should read:



The compiler determines that the result of f += 1.0; is never used subsequently. Thus, the optimizer may discard the statement from the generated code.

Should read:

The compiler determines that the result of f += 1.0 does not affect the value of *p. Thus, the optimizer might move the assignment after the printf statement.
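
The corrected statement concerns code that reads a double object through a pointer to int, which type-based aliasing lets the optimizer treat as unrelated accesses. The following sketch is not the example from the Language Reference. It shows one commonly used way to inspect the representation of a double without such a cast, by copying the bytes with memcpy. The function names double_bits and bits_after_increment are hypothetical, and the code assumes that double is 64 bits wide.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns the object representation of d as a
   64-bit integer. memcpy reads the bytes of d directly, so the
   compiler must treat this read as depending on earlier writes to d. */
uint64_t double_bits(double d)
{
    uint64_t bits;

    assert(sizeof bits == sizeof d);   /* assumption: 64-bit double */
    memcpy(&bits, &d, sizeof bits);
    return bits;
}

/* Mirrors the shape of the documented case: update f, then look at
   its representation. */
uint64_t bits_after_increment(double f)
{
    f += 1.0;
    return double_bits(f);
}
```

Because no int pointer aliases f here, the value observed by double_bits always reflects the completed assignment.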

Topic location: Declarators > Initializers > Initialization of structures and unions

You do not have to initialize all members of a structure or union; the initial value of uninitialized structure members depends on the storage class associated with the structure or union variable. In a structure declared as static, any members that are not initialized are implicitly initialized to zero of the appropriate type; the members of a structure with automatic storage have no default initialization. The default initializer for a union with static storage is the default for the first component; a union with automatic storage has no default initialization.

The following definition shows a partially initialized structure:


struct address {
                int street_no;
                char *street_name;
                char *city;
                char *prov;
                char *postal_code;
              };
struct address temp_address =
              { 44, "Knyvet Ave.", "Hamilton", "Ontario" };


The values of temp_address are:

Member                     Value
temp_address.street_no     44
temp_address.street_name   address of string "Knyvet Ave."
temp_address.city          address of string "Hamilton"
temp_address.prov          address of string "Ontario"
temp_address.postal_code   Depends on the storage class of the temp_address variable; if it is static, the value would be NULL.

Should read:

You do not have to initialize all members of structure variables. If a structure variable does not have an initializer, the initial values of the structure members depend on the storage class associated with the structure variable:
  • If a structure variable has static storage, its members are implicitly initialized to zero of the appropriate type.
  • If a structure variable has automatic storage, its members have no default initialization.

If a structure variable is partially initialized, all the uninitialized structure members are implicitly initialized to zero no matter what the storage class of the structure variable is. See the following example:

struct one {
   int a;
   int b;
   int c;
};
int main(void){
   struct one z1;               // Members in z1 do not have default initial values.
   static struct one z2;        // z2.a=0, z2.b=0, and z2.c=0.
   struct one z3 = {1};         // z3.a=1, z3.b=0, and z3.c=0.
}


In this example, structure variable z1 has automatic storage and no initializer, so none of the members in z1 have default initial values. Structure variable z2 has static storage, and all its members are implicitly initialized to zero. Structure variable z3 is partially initialized, so all its uninitialized members are implicitly initialized to zero.

You do not have to initialize all members of a union. The default initializer for a union with static storage is the default for the first component. A union with automatic storage has no default initialization.
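
The union rule can be sketched in the same way. This is an illustrative sketch: the union number, the variables u1 and u2, and the function check_union_defaults are hypothetical names chosen for the example.

```c
#include <assert.h>

union number {
    int i;      /* first member: receives the default initialization */
    float x;
};

static union number u1;           /* static storage: u1.i is 0 */
static union number u2 = { 5 };   /* a brace initializer sets the first member, i */

/* Hypothetical check: aborts if the initial values differ from the
   description above. */
void check_union_defaults(void)
{
    assert(u1.i == 0);
    assert(u2.i == 5);
}
```

A union declared with automatic storage and no initializer would hold an indeterminate value, so the sketch does not read one.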

Topic location: Expressions and Operators > Unary expressions > The __real__ and __imag__ operators (C only) (IBM extension)

The topic title should be:

The __real__ and __imag__ operators (IBM extension)


Compiler Reference
The following corrections and additions apply to the IBM XL C/C++ for Linux, V11.1 Compiler Reference:

Topic location: Configuring compiler defaults > Setting environment variables > Runtime environment variables

The following variable should be added:

PDF_PM_EVENT
When you run an application compiled with -qpdf1=level=2, you can set the value of the environment variable PDF_PM_EVENT to L1MISS, L2MISS or L3MISS to gather cache-miss profiling information at the specified cache level.


Topic location: Configuring compiler defaults > Setting environment variables > Environment variables for parallel processing > OpenMP environment variables for parallel processing > OMP_STACKSIZE environment variable

The default value for 32-bit mode is 256M. For 64-bit mode, the default is up to the limit imposed by system resources.

Should read:

The default value is 4194304B. The maximum value for 32-bit mode is 256M. For 64-bit mode, the maximum is up to the limit imposed by system resources.


Topic location: Compiler options reference > Individual option descriptions > -qalign

Defaults
-qalign=power
linuxppc

Should read:

Defaults
-qalign=linuxppc


Topic location: Compiler options reference > Individual option descriptions > -qinline

Defaults
  • -qnoinline
  • At an optimization level of -O0, the default is -qinline=noauto
  • At optimization levels of -O2 and higher, the default is -qinline=auto
  • -qinline=auto:level=5 is the default suboption of -qinline

Should read:
  • -qnoinline
  • At optimization levels of -O2 and higher, the default is -qinline=noauto
  • -qinline=auto:level=5 is the default suboption of -qinline


Topic location: Compiler options reference > Individual option descriptions > -qlanglvl

Default:

C++ only The suboptions and their default settings for different language levels (compat366, strict98, extended (C++), and extended0x) are listed in Table 1.

Should read:

C++ only The suboptions and their default settings for different language levels (compat366, extended (C++), and extended0x) are listed in Table 1.

In Table 1. Default Settings of suboptions for different language levels, the strict98 column should be removed.


Topic location: Compiler options reference > Individual option descriptions > -O, -qoptimize

The following statement should be added under "Usage":

If optimization level -O3 or higher is specified on the command line, the -qhot and -qipa options that are set by the optimization level cannot be overridden by #pragma option_override(identifier, "opt(level, 0)") or #pragma option_override(identifier, "opt(level, 2)").


Topic location: Compiler options reference > Individual option descriptions > -qpdf1, -qpdf2

Syntax should read:

        .-nopdf2-----------------------------.
        +-nopdf1-----------------------------+
>>- -q--+-pdf1--+--------------------------+-+-----------------><
        |       +-=--pdfname--=--file_path-+ |
        |       +-=--exename---------------+ |
        |       +-=--defname---------------+ |
        |       '-=--level--=--0--1--2-----' |
        '-pdf2--+--------------------------+-'
                +-=--pdfname--=--file_path-+
                +-=--exename---------------+
                '-=--defname---------------'


exename
Generates the name of the PDF file based on what you specify with the -o option. For example, you can use -qpdf1=exename -o foo foo.f to generate a PDF file called .foo_pdf.

Should read:

exename
Sets the name of the generated PDF file based on what you specify with the -o option. For example, you can use -qpdf1=exename -o foo foo.c to generate a PDF file called .foo_pdf.

level=0 | 1 | 2 should read:

level=0 | 1 | 2

Specifies different levels of profiling information to be generated by the resulting application. The following table provides information about the type of profiling supported on each level (The symbol + indicates that the profiling type is supported):

Profiling type supported on each -qpdf1 level

Profiling type             Level 0   Level 1   Level 2
block-counter profiling    +         +         +
call-counter profiling     +         +         +
single-pass profiling      +         +
value profiling                      +         +
multiple-pass profiling                        +
cache-miss profiling                           +

  • -qpdf1=level=0 is the basic compiler instrumentation that results in smaller file size and faster compilation than -qpdf1=level=1.
  • -qpdf1=level=1 is the default compiler instrumentation. It is equivalent to -qpdf1 in the releases before IBM® XL C/C++ for Linux®, V11.1.
  • -qpdf1=level=2 is a more aggressive compiler instrumentation than -qpdf1=level=0 and -qpdf1=level=1. It is supported at all optimization levels where PDF is enabled.
    Notes:
    • Cache-miss profiling is enabled on SLES11 SP1.
    • You can set the value of the environment variable PDF_PM_EVENT to L1MISS, L2MISS or L3MISS (if applicable) to gather different levels of cache-miss profiling information.
    • You can set the environment variable PDF_BIND_PROCESSOR to bind your application to the specified processor for cache-miss profiling. Cache-miss profiling information is only available on the POWER5™, POWER6®, and POWER7™ processors.

The following example should be added for -qpdf1, -qpdf2:

Here is an example that gathers cache-miss profiling information. Block-counter profiling and value profiling information are also gathered:

# Compile all files with -qpdf1=level=2.
xlc -qpdf1=level=2 -O5 file1.c file2.c file3.c

# Set PDF_PM_EVENT=L2MISS to gather L2 cache-miss information.
export PDF_PM_EVENT=L2MISS

# Run with one set of input data.
./a.out < sample.data

# Recompile all files with -qpdf2.
xlc -qpdf2 -O5 file1.c file2.c file3.c

# The program should now run faster than
# without PDF if the sample data is typical.


Topic location: Compiler options reference > Individual option descriptions > -qtune

The following rows in the Acceptable -qarch/-qtune combinations table:

-qarch option   Default -qtune setting   Available -qtune settings
pwr4            pwr4                     auto | pwr4 | pwr5 | pwr7 | ppc970 | balanced
pwr5            pwr5                     auto | pwr5 | pwr7 | balanced
pwr5x           pwr5                     auto | pwr5 | pwr7 | balanced

Should read:

-qarch option   Default -qtune setting   Available -qtune settings
pwr4            pwr4                     auto | pwr4 | pwr5 | pwr6 | pwr7 | ppc970 | balanced
pwr5            pwr5                     auto | pwr5 | pwr6 | pwr7 | balanced
pwr5x           pwr5                     auto | pwr5 | pwr6 | pwr7 | balanced


Topic location: Compiler pragmas reference > Individual pragma descriptions > #pragma option_override

Syntax



Should read:



Parameters

#pragma option_override value   Equivalent compiler option
level, 0                        -O
level, 2                        -O2
level, 3                        -O3
level, 4                        -O4
level, 5                        -O5
registerspillsize, size         -qspill=size
size                            -qcompact
size, yes                       -qcompact
size, no                        -qnocompact
strict, all                     -qstrict, -qstrict=all
strict, no, none                -qnostrict
strict, suboption_list          -qstrict=suboption_list

Should read:

#pragma option_override value   Equivalent compiler option
level, 0                        -O (see note 1)
level, 2                        -O2 (see note 1)
level, 3                        -O3 (see note 2)
registerspillsize, size         -qspill=size
size                            -qcompact
size, yes                       -qcompact
size, no                        -qnocompact
strict                          -qstrict, -qstrict=all
strict, yes                     -qstrict, -qstrict=all
strict, no                      -qnostrict
strict, suboption_list          -qstrict=suboption_list

Notes:
1. If optimization level -O3 or higher is specified on the command line, #pragma option_override(identifier, "opt(level, 0)") or #pragma option_override(identifier, "opt(level, 2)") does not turn off the implication of the -qhot and -qipa options.
2. Specifying -O3 implies -qhot=level=0. However, specifying #pragma option_override(identifier, "opt(level, 3)") in source code does not imply -qhot=level=0.
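
As an illustration of the directive syntax used in these notes, the following sketch applies opt(level, 0) to a single function. The function name checksum and its body are hypothetical, and placing the pragma ahead of the function definition is simply the arrangement chosen for this example. Compilers other than XL C/C++ typically ignore the unrecognized pragma, possibly with a warning.

```c
#include <assert.h>
#include <stddef.h>

/* Apply opt(level, 0) to checksum only; other functions in the file
   keep the optimization level given on the command line. */
#pragma option_override(checksum, "opt(level, 0)")

/* Hypothetical function compiled at the overridden level. */
unsigned checksum(const unsigned char *buf, size_t len)
{
    unsigned sum = 0;

    assert(buf != NULL || len == 0);
    for (size_t i = 0; i < len; i++)
        sum = sum * 31u + buf[i];
    return sum;
}
```

If the file is compiled with -O3 or higher, note 1 applies: the override lowers the optimization level for checksum but does not turn off the -qhot and -qipa options implied by the command line.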

Topic location: Compiler pragmas reference > Individual pragma descriptions > #pragma reg_killed_by

fs
    Floating-point and status control register

Should read:

fsr
    Floating-point and status control register

Topic location: Compiler pragmas reference > Individual pragma descriptions > #pragma stream_unroll

Examples

Should read:

The following example shows how #pragma stream_unroll can increase performance.

int i, m, n;
int a[1000];
int b[1000];
int c[1000];

....

#pragma stream_unroll(4)
for (i=0; i<n; i++) {
 a[i] = b[i] * c[i];
}


The unroll factor of 4 reduces the number of iterations from n to n/4, as follows:

m = n/4;

for (i=0; i<n/4; i++){
 a[i] = b[i] * c[i];
 a[i+m] = b[i+m] * c[i+m];
 a[i+2*m] = b[i+2*m] * c[i+2*m];
 a[i+3*m] = b[i+3*m] * c[i+3*m];
}


The increased number of read and store operations is distributed among a number of streams determined by the compiler, which reduces computation time and increases performance.
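
The equivalence between the original loop and the four-stream form can be checked with a small hand-written sketch. The function names multiply and multiply_streams are hypothetical, and the sketch assumes that n is a multiple of 4, so no remainder loop is shown. The transformation that the compiler actually performs may differ.

```c
#include <assert.h>

/* Original loop: one stream of element-wise products. */
void multiply(int *a, const int *b, const int *c, int n)
{
    for (int i = 0; i < n; i++)
        a[i] = b[i] * c[i];
}

/* Hand-written equivalent of a stream unroll by 4: each iteration
   works on four streams that start at 0, m, 2*m, and 3*m.
   Assumption: n is a multiple of 4. */
void multiply_streams(int *a, const int *b, const int *c, int n)
{
    int m = n / 4;

    assert(n % 4 == 0);
    for (int i = 0; i < m; i++) {
        a[i]       = b[i]       * c[i];
        a[i + m]   = b[i + m]   * c[i + m];
        a[i + 2*m] = b[i + 2*m] * c[i + 2*m];
        a[i + 3*m] = b[i + 3*m] * c[i + 3*m];
    }
}
```

Both functions write the same values to a; only the order in which the elements are visited changes.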

Topic location: Configuring compiler defaults > Setting environment variables > Environment variables for parallel processing

In "XLSMPOPTS", the description of stack=num should read:

stack=num
    Specifies the largest amount of space in bytes (num) that a thread's stack needs. The default value for num is 4194304.

    Set num so it is within the acceptable upper limit. num can be up to the limit imposed by system resources or the stack size ulimit, whichever is smaller. An application that exceeds the upper limit may cause a segmentation fault.


Topic location: Compiler built-in functions > Synchronization and atomic built-in functions > Synchronization functions > __lwsync, __iospace_lwsync

Load Word Synchronize

Should read:

Lightweight Synchronize


Optimization and Programming Guide
The following corrections and additions apply to the IBM XL C/C++ for Linux, V11.1 Optimization and Programming Guide:

Topic location: Using C++ templates > Using the -qtemplateregistry compiler option

The following two paragraphs should be deleted:

If you want to use either -qtemplateregistry or -qtempinc to compile your programs, you must organize your source code with -qtempinc. See the examples described in the Example of -qtempinc section for more information.

If you also want to compile your program with -qtempinc, you must organize your source code so that it can be compiled with and without -qtempinc.

Topic location: Optimizing your applications > Using profile-directed feedback

To use PDF, follow these steps:

2. Run the program all the way through using data that is representative of the data that is used during a normal run of your finished program. The program records profiling information when it finishes. You can run the program multiple times with different data sets, and the profiling information is accumulated to provide a count of how often branches are taken and blocks of code are executed, based on the input data used. When the application exits, by default, it writes profiling information to the PDF file in the current working directory or the directory specified by the PDFDIR environment variable. The default name for the instrumentation file is ._pdf . To override the defaults, use -qpdf1=pdfname or -qpdf2=pdfname.

Should read:

2. Run the program all the way through using data that is representative of the data that is used during a normal run of your finished program. The program records profiling information when it finishes. You can run the program multiple times with different data sets, and the profiling information is accumulated to provide a count of how often branches are taken and blocks of code are executed, based on the input data used. When the application exits, by default, it writes profiling information to the PDF file in the current working directory or the directory specified by the PDFDIR environment variable. The default name for the instrumentation file is ._pdf . To override the defaults, use -qpdf1=pdfname or -qpdf1=exename.

You can take more control of the PDF file generation, as follows:

3. Change the PDF file location specified by the PDFDIR environment variable or the -qipa=pdfname option to produce a PDF file in a different location.

Should read:

3. Change the PDF file location specified by the PDFDIR environment variable or the -qpdf1=pdfname option to produce a PDF file in a different location.

Topic location: Optimizing your applications > Using profile-directed feedback > Viewing profiling information with showpdf

3. Run the showpdf utility to display the call and block counts for the executable file. If you used the -qipa=pdfname option during compilation, use the -f option to indicate the instrumentation file.

Should read:

3. Run the showpdf utility to display the call and block counts for the executable file. If you used the -qpdf[1|2]=pdfname option during compilation, use the -f option to indicate the instrumentation file.

Topic location: Debugging optimized code > Using -qoptdebug to help debug optimized programs

Example 2 and Example 3 should read:

Example 2: gdb debugger listing



Example 3: Stepping through optimized source


Document Information

Modified date:
08 August 2018

UID

swg21431755