Coded character set considerations with locale functions

Each EBCDIC coded character set consists of a mapping of all the available glyphs to their respective hex encodings and unique Graphic Character Global Identifiers (GCGIDs). A GCGID is an IBM-defined identifier for a graphic character; it stays the same regardless of which coded character set the character appears in. A glyph is the printed appearance of a character. Each coded character set serves one linguistic environment.

There is wide variation among coded character sets: many glyphs do not appear in all coded character sets, and the hexadecimal encodings for some glyphs differ from one coded character set to another. You may therefore encounter problems when exporting a file from a system running in one coded character set to a system running in another. For example, a left bracket ([) entered under the APL-293 or Open Systems IBM-1047 coded character set appears as the capital Y-acute (Ý) when the file is viewed under such common coded character sets as International 500, France 297, Germany 273, and US or Canada 037.
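The bracket example can be sketched as a small lookup table in C. This is only an illustration, not part of the compiler or run-time library: the structure `ebcdic_entry` and the function `glyph_for` are hypothetical names, the table holds just four code points from IBM-1047 and IBM-037, and the glyphs are spelled as UTF-8 strings for convenience.

```c
#include <stddef.h>
#include <string.h>

/* One row per (coded character set, byte) pair known to this sketch. */
struct ebcdic_entry {
    const char   *code_page;  /* coded character set name, e.g. "IBM-1047" */
    unsigned char byte;       /* hex encoding within that coded character set */
    const char   *glyph;      /* glyph, spelled here as a UTF-8 string */
};

/* Deliberately partial: only the code points needed for the bracket example. */
static const struct ebcdic_entry table[] = {
    { "IBM-1047", 0xAD, "["        },
    { "IBM-1047", 0xBA, "\xC3\x9D" },  /* capital Y-acute */
    { "IBM-037",  0xAD, "\xC3\x9D" },  /* capital Y-acute */
    { "IBM-037",  0xBA, "["        },
};

/* Return the glyph that `byte` represents under `code_page`,
 * or NULL if this partial table has no entry for the pair. */
const char *glyph_for(const char *code_page, unsigned char byte)
{
    size_t i;

    for (i = 0; i < sizeof table / sizeof table[0]; i++) {
        if (table[i].byte == byte &&
            strcmp(table[i].code_page, code_page) == 0) {
            return table[i].glyph;
        }
    }
    return NULL;
}
```

Calling `glyph_for("IBM-1047", 0xAD)` yields the left bracket, while `glyph_for("IBM-037", 0xAD)` yields the capital Y-acute: the byte in the file is unchanged, and only its interpretation differs.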

z/OS® XL C/C++ provides the following extensions to help prevent such problems:

The #pragma filetag directive allows you to specify the coded character set that was used when entering the source files. See The pragma filetag directive for details on this pragma.
The LOCALE compiler option enables you to tell the compiler what locale to use at compile time. See Converting coded character sets at compile time for details on this compiler option.
The CONVLIT compiler option enables you to change the assumed code page for string literals. See CONVLIT compiler option for details on this compiler option.
The #pragma convert directive allows you to change the assumed code page for string literals. It has the advantage of allowing more than one character encoding to be used for string literals in a single compilation unit. For more information, see convert in z/OS XL C/C++ Language Reference.
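As a minimal sketch of how the two pragmas might be combined in one source file, consider the following. It assumes the file was entered under IBM-1047 and that some literals must be emitted in ISO8859-1; the variable and function names are hypothetical, and the exact directive forms, including `#pragma convert(pop)`, should be checked against the z/OS XL C/C++ Language Reference. The filetag directive is placed at the very top of the file, before any other statement. Compilers that do not recognize these pragmas typically ignore them, in which case both literals use the same encoding.

```c
#pragma filetag("IBM-1047")  /* assumption: this source was entered under IBM-1047 */

#include <string.h>

/* Literal left in the default code page of the compilation unit. */
static const char native_greeting[] = "Hello [world]";

/* Literals between the two directives below are intended to be
 * converted to ISO8859-1 (an assumed target code page). */
#pragma convert("ISO8859-1")
static const char wire_greeting[] = "Hello [world]";
#pragma convert(pop)         /* restore the previous code page setting */

/* Both code pages here are single-byte, so the length in bytes
 * is the same whichever literal is chosen. */
size_t greeting_length(int for_wire)
{
    return strlen(for_wire ? wire_greeting : native_greeting);
}
```

Because the conversion is scoped by the pragma pair, a single compilation unit can hold literals in more than one encoding, which the CONVLIT option alone cannot do.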

With these facilities, the compiler interprets your source according to the code page you specify. You can therefore enter source code using the characters as they appear to you, and the compiler will recognize them as the characters you intended.

The rest of this topic discusses other ways to work efficiently in different locales.