Features and benefits
IBM's latest products, XL C/C++ for Multicore Acceleration for Linux on x86 Systems, V10.1 and XL C/C++ for Multicore Acceleration for Linux on Power Systems, V10.1, introduce single-source compilation support using OpenMP directives and deliver increased optimization capabilities.
At a glance - What's new for V10.1
XL C/C++ for Multicore Acceleration for Linux, V10.1 offers the following enhancements over its V9.0 predecessor:
Delivers improved performance with new optimizations, including exploitation of the IBM PowerXCell 8i processor
Supports OpenMP single-source compiler technology to simplify program development
Enhances the -qstrict option with new suboptions that allow additional control of optimizations
Includes new and enhanced compiler options and directives for increased programming flexibility
Supports Red Hat Enterprise Linux 5.2 (RHEL 5.2), the latest level of the standard Linux distribution
Supports IBM Software Development Kit for Multicore Acceleration, V3.1, a suite of tools, libraries, frameworks and examples, to help improve programming and developer productivity, and to enable greater performance of applications
Program optimization
XL C/C++ delivers several compiler options that allow you to:
Select different levels of compiler optimizations
Control optimizations for loops, floating-point, and other types of operations
XL C/C++ also includes specific optimization features tailored to exploit the unique performance capabilities of Cell Broadband Engine processors, including specialized data types and highly optimized built-in functions.
Optimizing transformations can give your application better overall execution performance. XL C/C++ provides a portfolio of optimizing transformations tailored to various supported hardware. These transformations offer the following benefits:
Reducing the number of instructions executed for critical operations
Restructuring generated object code to make optimal use of the Cell Broadband Engine architecture
Improving the usage of the memory subsystem
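As a rough illustration of the last benefit, the sketch below shows by hand the kind of loop restructuring an optimizer may apply to improve memory access. The function names and the flat row-major layout are assumptions made for this example. They are not taken from the XL C/C++ documentation, and the transformations the compiler actually performs may differ.

```c
#include <stddef.h>

/* Hypothetical example: sum a rows x cols matrix stored in row-major order. */

/* Before: the inner loop walks down a column, so consecutive
 * accesses are cols elements apart in memory. */
long sum_column_order(const int *m, size_t rows, size_t cols)
{
    long sum = 0;
    for (size_t j = 0; j < cols; j++)
        for (size_t i = 0; i < rows; i++)
            sum += m[i * cols + j];
    return sum;
}

/* After loop interchange: the inner loop walks adjacent elements,
 * which makes better use of each block of memory that is fetched. */
long sum_row_order(const int *m, size_t rows, size_t cols)
{
    long sum = 0;
    for (size_t i = 0; i < rows; i++)
        for (size_t j = 0; j < cols; j++)
            sum += m[i * cols + j];
    return sum;
}
```

Both functions return the same result. Only the order in which memory is visited changes.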
Cross-compilation
XL C/C++ for Multicore Acceleration for Linux on x86 Systems, V10.1 and XL C/C++ for Multicore Acceleration for Linux on Power Systems, V10.1 are cross-compilers: you build on an x86 or Power Systems host, and the completed applications run on BladeCenter servers that contain processors built on the Cell Broadband Engine Architecture, such as IBM BladeCenter QS21 and IBM BladeCenter QS22.
OpenMP single source compilation
XL C/C++ introduces an additional compiler invocation that compiles and links Power Processor Unit (PPU) and Synergistic Processor Unit (SPU) code segments in a single step. This single-source compilation technology reduces the development effort needed to obtain parallelism in OpenMP programs.
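To give a sense of what single-source OpenMP code looks like, here is a minimal sketch of a loop parallelized with a standard OpenMP directive. The function name dot_product is hypothetical, and the compiler invocation and options needed to build it for PPU and SPU targets are not shown because this overview does not name them.

```c
/* Hypothetical helper: dot product of two float arrays of length n. */
float dot_product(const float *a, const float *b, int n)
{
    float sum = 0.0f;

    /* Standard OpenMP directive: iterations are divided among threads,
     * and each thread's partial sum is combined into sum at the end. */
    #pragma omp parallel for reduction(+:sum)
    for (int i = 0; i < n; i++)
        sum += a[i] * b[i];

    return sum;
}
```

The source contains no target-specific code: the directive alone marks the loop as parallel. A compiler without OpenMP support enabled ignores the pragma and runs the loop serially.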
Mathematical Acceleration Subsystem (MASS)
XL C/C++ for Multicore Acceleration for Linux includes the Mathematical Acceleration Subsystem (MASS). MASS consists of libraries of tuned mathematical intrinsic functions that offer improved performance over the standard mathematical library routines, are thread-safe, and support C, C++, and Fortran applications.
Automatic code overlay
-qipa=overlay lets developers create SPU programs that would otherwise be too large to fit in the local memory store of the SPUs. The option tells the compiler to automatically generate code overlays, which allow two or more code segments to be loaded at the same physical address at different times.
Automatic SIMD vectorization of program code
When compiler option -qhot=simd is in effect, certain operations that are performed in a loop on successive elements of an array are converted into a call to a vector instruction. This call calculates several results at one time, which is faster than calculating each result sequentially.
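The following sketch shows the general shape of loop that is a candidate for this kind of conversion: a simple counted loop that applies the same operation to successive array elements. The function name scale_and_add is hypothetical, and whether a given compiler actually vectorizes the loop depends on its own analysis. The restrict qualifiers (C99) state that the arrays do not overlap, which is an assumption of this example.

```c
/* Hypothetical example: y[i] = alpha * x[i] + y[i] for i in [0, n). */
void scale_and_add(int n, float alpha,
                   const float *restrict x, float *restrict y)
{
    /* Each iteration is independent and touches consecutive elements,
     * so several iterations can be computed by one vector operation. */
    for (int i = 0; i < n; i++)
        y[i] = alpha * x[i] + y[i];
}
```

Written this way, the loop needs no vector data types or built-in functions in the source. The conversion is left to the compiler.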
Interprocedural Analysis (IPA)
Interprocedural Analysis (IPA) can result in significant performance improvements. IPA can be specified on the compile step only, or on both the compile and link steps, which is known as whole program mode. Whole program mode expands the scope of optimization to an entire program unit, which can be an executable or shared object.
