Code page converters

Integration nodes perform string operations in Universal Character Set coded in 2 octets (UCS-2). If incoming strings are not encoded in UCS-2, they are converted to UCS-2 on arrival.

The integration node uses International Components for Unicode (ICU) code page converters to convert data. The Unicode Consortium has further information on Unicode.

A code page converter is a mapping from the byte sequence in one code page to a serialized representation of UCS-2, known as UCS Transformation Format 16-bit form (UTF-16). A code page converter allows the integration node to create a UCS-2 representation of an incoming string.

When you handle UTF-16 data, CCSIDs 1200, 13488, and 17584 are treated differently from other CCSIDs. Traditionally, in ICU usage, the endian encoding of these CCSIDs was platform-specific, so IBM® Integration Bus uses an encoding parameter with them. To make these CCSIDs explicitly produce little endian data, specify the encoding parameter as MQENC_INTEGER_REVERSED.
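
The effect of byte order on serialized UTF-16 can be seen in a minimal Python sketch. It uses Python's built-in codecs rather than the ICU converters, and the `serialize_utf16` helper and its `reversed_encoding` flag are hypothetical names that only mimic the role of the encoding parameter. Treating the default as big endian is an assumption of this sketch.

```python
def serialize_utf16(text: str, reversed_encoding: bool = False) -> bytes:
    """Serialize text as UTF-16 without a byte order mark.

    reversed_encoding is a hypothetical flag that stands in for an
    encoding parameter such as MQENC_INTEGER_REVERSED.
    """
    # Assumption: default is big endian, reversed is little endian.
    codec = "utf-16-le" if reversed_encoding else "utf-16-be"
    return text.encode(codec)


# The same character, U+0041, in both byte orders.
big_endian = serialize_utf16("A")                             # bytes 00 41
little_endian = serialize_utf16("A", reversed_encoding=True)  # bytes 41 00
```

A receiver that assumes the wrong byte order would read 0x4100 instead of 0x0041, which is why the byte order has to be stated explicitly for these CCSIDs.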

Consider this example of the use of a code page converter. A message comes in on a queue from z/OS®, with the WebSphere® MQ CCSID field set to 1047 (LATIN-1 Open Systems without euro). The integration node looks up ibm-1047 and uses the resulting converter to create a UCS-2 representation for internal use.
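
The lookup-and-convert step in this example can be sketched in Python. This is only an illustration of the idea, not how the integration node is implemented: it uses Python's built-in codecs instead of ICU, and it uses the `cp037` EBCDIC codec as a stand-in because the Python standard library is not assumed to ship an ibm-1047 converter. The `to_utf16` helper is a hypothetical name.

```python
def to_utf16(data: bytes, code_page: str) -> bytes:
    """Convert bytes in the given code page to big endian UTF-16.

    code_page names a Python codec; it plays the role of the
    converter that is looked up from the message CCSID.
    """
    # Step 1: map the incoming byte sequence to Unicode code points.
    text = data.decode(code_page)
    # Step 2: serialize those code points as UTF-16.
    return text.encode("utf-16-be")


# "Hello" as EBCDIC bytes (cp037 stand-in for CCSID 1047).
ebcdic_hello = bytes([0xC8, 0x85, 0x93, 0x93, 0x96])
utf16_hello = to_utf16(ebcdic_hello, "cp037")
```

Each single EBCDIC byte becomes one 2-byte UTF-16 code unit, so the 5-byte input produces 10 bytes of output.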

If you try to convert from a Unicode character set to a non-Unicode character set, the following errors might occur:
  • The target buffer is too small. This error causes a recoverable exception, which you can handle; alternatively, the message is rolled back.
  • A code point in the source does not have an equivalent value in the target. Fallback mappings are attempted first; for example, if you are converting to Japanese, a backslash (\) can be mapped to a yen sign (¥) if the converter supplies that fallback mapping. If no fallback mapping is present, a recoverable exception is thrown. You can handle the exception; otherwise, the message is rolled back.

    The MRM parser substitutes invalid code points with substitution characters.
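
The second case above (fallback mapping first, then either an exception or a substitution character) can be sketched in Python. This is a simplified model under stated assumptions: it uses Python's built-in codecs rather than ICU, it assumes a stateless target code page so that characters can be encoded one at a time, and the `convert_from_unicode` helper, its `fallbacks` table, and its `substitute` flag are all hypothetical. The "buffer too small" case is not modeled.

```python
def convert_from_unicode(text, code_page, fallbacks=None, substitute=False):
    """Encode text into a non-Unicode code page.

    fallbacks:  hypothetical table of replacement strings for
                characters that the code page cannot represent.
    substitute: if True, emit the codec's substitution character
                instead of raising (loosely modeling the MRM parser).
    """
    fallbacks = fallbacks or {}
    out = bytearray()
    for ch in text:
        try:
            out += ch.encode(code_page)
        except UnicodeEncodeError:
            if ch in fallbacks:
                # A fallback mapping exists, so use it.
                out += fallbacks[ch].encode(code_page)
            elif substitute:
                # No fallback: emit a substitution character.
                out += ch.encode(code_page, errors="replace")
            else:
                # No fallback and no substitution: let the caller
                # handle the exception.
                raise
    return bytes(out)


# The euro sign has no equivalent in Latin-1.
with_fallback = convert_from_unicode("5€", "latin-1", fallbacks={"€": "EUR"})
with_substitute = convert_from_unicode("5€", "latin-1", substitute=True)
```

Calling `convert_from_unicode("5€", "latin-1")` with neither option raises `UnicodeEncodeError`, which corresponds to the recoverable exception that a caller can handle.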

IBM Integration Bus currently supports the code pages that are listed in Supported code pages. If you need support for an additional code page, or if you require a different variant of a code page, you can extend the integration node to support this code page.