Multibyte characters

The compiler recognizes and supports additional characters (the extended character set) that you can meaningfully use in string literals and character constants. Support for extended characters includes multibyte character sets. A multibyte character is a character whose bit representation requires more than one byte.

z/OS® systems represent multibyte characters by enclosing them in shift-out <SO> and shift-in <SI> pairs. A string can consist entirely of multibyte characters:

<SO> x y z <SI>

Or it can mix multibyte and single-byte characters:

<SO> x <SI> y z x <SO> y <SI> z

In both forms, each character between an <SO> and <SI> pair is represented by two bytes; characters outside the pairs are single-byte. z/OS XL C/C++ restricts multibyte characters to character constants, string constants, and comments.
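The following sketch illustrates how the <SO> and <SI> pairs delimit two-byte characters in a mixed string. It is not part of the compiler or the runtime library: the function name count_mixed_chars is hypothetical, and the byte values 0x0E for <SO> and 0x0F for <SI> are assumed here as the shift-out and shift-in control codes.

```c
#include <stddef.h>

/* Assumed control-byte values for shift-out and shift-in. */
#define SO_BYTE 0x0Eu
#define SI_BYTE 0x0Fu

/*
 * Hypothetical helper: count the characters in a mixed string of len bytes.
 * Outside an <SO>...<SI> pair, each byte is one character. Inside a pair,
 * every two bytes form one character. The shift bytes are not counted.
 * Returns (size_t)-1 if the shift structure is malformed.
 */
size_t count_mixed_chars(const unsigned char *s, size_t len)
{
    size_t count = 0;
    size_t i = 0;
    int in_shift = 0;   /* nonzero while between <SO> and <SI> */

    while (i < len) {
        unsigned char c = s[i];

        if (!in_shift) {
            if (c == SO_BYTE) {
                in_shift = 1;           /* enter two-byte mode */
                i++;
            } else if (c == SI_BYTE) {
                return (size_t)-1;      /* <SI> without a preceding <SO> */
            } else {
                count++;                /* single-byte character */
                i++;
            }
        } else {
            if (c == SI_BYTE) {
                in_shift = 0;           /* leave two-byte mode */
                i++;
            } else {
                /* A two-byte character needs a second, non-shift byte. */
                if (i + 1 >= len || s[i + 1] == SI_BYTE || s[i + 1] == SO_BYTE)
                    return (size_t)-1;
                count++;
                i += 2;
            }
        }
    }

    /* An <SO> that is never closed by <SI> is treated as malformed. */
    return in_shift ? (size_t)-1 : count;
}
```

Under these assumptions, a buffer laid out like the mixed form above (one shifted character, three single-byte characters, one shifted character, one single-byte character) counts as six characters in 12 bytes.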

Multibyte characters can appear in any of the following contexts:
  • String literals and character constants. To declare a multibyte literal, use a wide-character representation, prefixed by L. For example:
    wchar_t *a = L"wide_char_string";
    wchar_t b = L'wide_char';

    Strings containing multibyte characters are treated essentially the same way as strings without multibyte characters. Generally, wide characters are permitted anywhere multibyte characters are, but they are incompatible with multibyte characters in the same string because their bit patterns differ. Wherever permitted, you can mix single-byte and multibyte characters in the same string.

  • Preprocessor directives. The following preprocessor directives permit multibyte-character constants and string literals:
    • #define
    • #pragma comment
    • #include
    A file name specified in an #include directive can contain multibyte characters. For example:
    #include <multibyte_char/mydir/mysource/multibyte_char.h>
    #include "multibyte_char.h"
  • Macro definitions. Because string literals and character constants can be part of #define statements, multibyte characters are also permitted in both object-like and function-like macro definitions.
  • The # and ## operators.
  • Program comments.
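As a rough illustration of the macro-related contexts listed above, the sketch below uses a string literal in an object-like macro and the # operator in a function-like macro. The macro and function names are hypothetical, and plain single-byte text stands in for the multibyte characters so that the example compiles on any platform.

```c
#include <string.h>

/* Object-like macro whose replacement text is a string literal.
   On z/OS, the literal could contain <SO>...<SI> sequences. */
#define PRODUCT_NAME "name"

/* Function-like macro that uses the # operator to turn its
   argument into a string literal. */
#define STRINGIZE(x) #x

/* Hypothetical accessors that expose the macro expansions. */
const char *product_name(void)
{
    return PRODUCT_NAME;
}

const char *stringized_label(void)
{
    return STRINGIZE(label);   /* expands to "label" */
}

/* Comments such as this one may also contain multibyte characters. */
```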
The following restrictions apply to the use of multibyte characters:
  • Multibyte characters are not permitted in identifiers.
  • Hexadecimal values for multibyte characters must be in the range of the code page being used.
  • You cannot mix wide characters and multibyte characters in macro definitions. For example, a macro expansion that concatenates a wide string and a multibyte string is not permitted.
  • Assignment between wide characters and multibyte characters is not permitted.
  • Concatenating wide character strings and multibyte character strings is not permitted.
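Because wide and multibyte strings cannot be assigned to each other or concatenated directly, one common approach is to convert explicitly with the standard C library function mbstowcs. The sketch below is only an illustration: the wrapper name to_wide is hypothetical, and the result depends on the locale in effect at run time.

```c
#include <stdlib.h>
#include <wchar.h>

/*
 * Hypothetical wrapper: convert the multibyte string mb into the wide
 * character buffer out, which has room for out_len elements including
 * the terminating null. Returns the number of wide characters written
 * (excluding the terminator), or (size_t)-1 if the input is not a valid
 * multibyte sequence in the current locale or the buffer is empty.
 * Input that does not fit is truncated.
 */
size_t to_wide(const char *mb, wchar_t *out, size_t out_len)
{
    size_t n;

    if (out_len == 0)
        return (size_t)-1;

    /* Convert at most out_len - 1 characters, leaving room for L'\0'. */
    n = mbstowcs(out, mb, out_len - 1);
    if (n == (size_t)-1)
        return (size_t)-1;

    out[n] = L'\0';   /* mbstowcs does not terminate when the buffer fills */
    return n;
}
```

A caller would declare a buffer such as wchar_t buf[64] and pass it with its element count, for example to_wide(source, buf, 64), then use the wide result wherever a wide string is required.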