Data conversion between coded character sets

Message data in IBM® MQ defined formats (also known as built-in formats) can be converted by the queue manager from one coded character set to another, provided that both character sets relate to a single language or a group of similar languages.

For example, conversion between coded character sets with identifiers (CCSIDs) 850 and 500 is supported, because both apply to Western European languages.
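
The effect of such a conversion can be illustrated outside IBM MQ. The following is a minimal sketch that assumes Python's built-in cp850 and cp500 codecs correspond to CCSIDs 850 and 500. It does not show how the queue manager itself converts data, and the helper name convert is hypothetical.

```python
def convert(data: bytes, source_codec: str, target_codec: str) -> bytes:
    """Re-encode character data from one coded character set to another."""
    return data.decode(source_codec).encode(target_codec)


text = "Café"
ascii_bytes = text.encode("cp850")                      # ASCII-based code page
ebcdic_bytes = convert(ascii_bytes, "cp850", "cp500")   # EBCDIC code page

# The byte values differ, but every character survives the round trip
# because both code pages cover the same Western European repertoire.
assert ascii_bytes != ebcdic_bytes
assert convert(ebcdic_bytes, "cp500", "cp850") == ascii_bytes
```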

For EBCDIC newline (NL) character conversions to ASCII, see All queue managers.

Supported conversions are defined in Data conversion processing.

When a queue manager cannot convert messages in built-in formats

The queue manager cannot automatically convert messages in built-in formats if their CCSIDs represent different national-language groups. For example, conversion between CCSID 850 and CCSID 1025 (which is an EBCDIC coded character set for languages using Cyrillic script) is not supported because many of the characters in one coded character set cannot be represented in the other. If you have a network of queue managers working in different national languages, and data conversion among some of the coded character sets is not supported, you can enable a default conversion.
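
The reason such conversions are unsupported can be seen by checking which characters a target code page is able to represent. The sketch below is illustrative only: Python has no codec for CCSID 1025, so it assumes cp855 (an ASCII-based Cyrillic code page) as a stand-in, and the function name unrepresentable is hypothetical.

```python
def unrepresentable(text: str, target_codec: str) -> list:
    """Return the characters of text that target_codec cannot encode."""
    missing = []
    for ch in text:
        try:
            ch.encode(target_codec)
        except UnicodeEncodeError:
            missing.append(ch)
    return missing


# Western European text loses its accented letter in a Cyrillic code page,
# and Cyrillic text cannot be represented at all in code page 850.
western_gaps = unrepresentable("Café", "cp855")
cyrillic_gaps = unrepresentable("Жук", "cp850")
assert western_gaps == ["é"]
assert cyrillic_gaps == ["Ж", "у", "к"]
```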

For platforms to which the ccsid_part2.tbl file applies, see Specifying default data conversion using ccsid_part2.tbl for further information. Default data conversion on platforms other than those to which the ccsid_part2.tbl file applies is described in Default data conversion.


Enhanced Unicode data conversion support in IBM MQ 9.0

Versions of the product before IBM MQ 9.0 did not support conversion of data containing Unicode code points beyond the Basic Multilingual Plane (code points above U+FFFF). Unicode data conversion support was limited to code points defined in the Unicode 3.0 standard, encoded in either UTF-8 or UCS-2, a 2-byte fixed-width subset of UTF-16.

From IBM MQ 9.0, IBM MQ supports all Unicode characters defined in the Unicode 8.0 standard in data conversion. This includes full support for UTF-16, including surrogate pairs (a pair of 2-byte UTF-16 code units in the range X'D800' through X'DFFF' that together represent a Unicode code point above U+FFFF).
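
The surrogate-pair arithmetic defined by the UTF-16 encoding form can be sketched as follows. This is a generic illustration of the encoding, not IBM MQ code, and the function name to_surrogate_pair is hypothetical.

```python
def to_surrogate_pair(code_point: int) -> tuple:
    """Split a supplementary-plane code point into UTF-16 high and low surrogates."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("code point is not above U+FFFF")
    offset = code_point - 0x10000
    high = 0xD800 + (offset >> 10)    # top 10 bits, range X'D800'..X'DBFF'
    low = 0xDC00 + (offset & 0x3FF)   # bottom 10 bits, range X'DC00'..X'DFFF'
    return high, low


# U+1D11E (MUSICAL SYMBOL G CLEF) lies outside the Basic Multilingual Plane.
high, low = to_surrogate_pair(0x1D11E)
assert (high, low) == (0xD834, 0xDD1E)

# Python's own UTF-16 encoder produces the same two 2-byte code units.
encoded = chr(0x1D11E).encode("utf-16-be")
assert encoded == high.to_bytes(2, "big") + low.to_bytes(2, "big")
```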

Combining character sequences are also supported in cases where a precomposed character in one CCSID is mapped to a combining character sequence in another CCSID.
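
The difference between a precomposed character and a combining character sequence can be shown with Unicode normalization. This is only an analogy, under the assumption that it helps to see the two forms side by side: IBM MQ maps between the forms according to its CCSID conversion tables, not by calling a normalization routine like the one used here.

```python
import unicodedata

precomposed = "\u00e9"   # LATIN SMALL LETTER E WITH ACUTE, a single code point

# Canonical decomposition yields a base letter followed by a combining mark.
sequence = unicodedata.normalize("NFD", precomposed)
assert sequence == "e\u0301"
assert len(precomposed) == 1 and len(sequence) == 2

# Canonical composition maps the sequence back to the precomposed character.
assert unicodedata.normalize("NFC", sequence) == precomposed
```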

Data conversion to and from Unicode and CCSIDs 1388, 1390, 1399, 4933, 5488, and 16884 has been extended, on some platforms, to support all the code points currently defined for these CCSIDs, including those that map to code points in Unicode supplementary planes.

In the case of CCSIDs 1390, 1399, and 16884, this includes characters defined in the JIS X 0213 (JIS2004) standard.

Support has also been added for conversion to and from Unicode and six new CCSIDs (1374 through to 1379).


ccsid_part2.tbl file

From IBM MQ 9.0 an additional file, ccsid_part2.tbl, is provided.

The ccsid_part2.tbl file takes precedence over the ccsid.tbl file and allows you to:
  • Add or modify CCSID entries
  • Specify default data conversion
  • Specify data for different command levels
The ccsid_part2.tbl is applicable to the following platforms only:
  • Linux® - all versions
  • Solaris
  • Windows

From IBM MQ 9.0, on IBM MQ for Windows, ccsid_part2.tbl is located in directory MQDataRoot\conv\table by default. On IBM MQ for Windows, the file also records all the supported code sets.

From IBM MQ 9.0, on IBM MQ for Linux and Solaris platforms, ccsid_part2.tbl is located in directory MQDataRoot/conv/table. For all Linux and Solaris platforms, the supported code sets are held in conversion tables provided by IBM MQ.

Although the ccsid_part2.tbl file replaces the existing ccsid.tbl file used in previous versions of IBM MQ to supply additional CCSID information, the ccsid.tbl file continues to be parsed by IBM MQ and must therefore not be deleted.

For more information, see The ccsid_part2.tbl file.

ccsid.tbl file

On platforms other than those to which ccsid_part2.tbl applies, the file ccsid.tbl is used for the following purposes:
  • It specifies any additional code sets. To specify additional code sets, edit ccsid.tbl (guidance on how to do this is provided in the file).
  • It specifies any default data conversion.
On AIX® and HP-UX platforms, the supported code sets are held internally by the operating system.

You can update the information recorded in ccsid.tbl; you might want to do this if, for example, a future release of your operating system supports additional coded character sets.

Default data conversion

From IBM MQ 9.0, the method of default data conversion has changed on the following platforms:
  • Linux - all versions
  • Solaris
  • Windows
See Specifying default data conversion using ccsid_part2.tbl for further information.

If you set up channels between two machines on which data conversion is not normally supported, you must enable default data conversion for the channels to work.

On platforms other than those to which ccsid_part2.tbl applies, enable default data conversion by editing the ccsid.tbl file to specify a default EBCDIC CCSID and a default ASCII CCSID. Instructions on how to do this are included in the file. You must do this on all machines that will be connected using the channels. Restart the queue manager for the change to take effect.

The default data-conversion process is as follows:
  • If conversion between the source and target CCSIDs is not supported, but the CCSIDs of the source and target environments are either both EBCDIC or both ASCII, the character data is passed to the target application without conversion.
  • If one CCSID represents an ASCII coded character set, and the other represents an EBCDIC coded character set, IBM MQ converts the data using the default data-conversion CCSIDs defined in ccsid.tbl.
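
The decision process in the list above can be sketched as a small function. This is an illustrative model only, not IBM MQ code: the names plan_conversion, supported_pairs, and families are hypothetical, and the assumption that each side is treated as the default CCSID of its own family is one reading of the second rule.

```python
def plan_conversion(source, target, supported_pairs, families,
                    default_ascii, default_ebcdic):
    """Decide how character data moves from source CCSID to target CCSID.

    supported_pairs: set of (source, target) CCSID pairs with a real conversion.
    families: dict mapping each CCSID to "ascii" or "ebcdic".
    Returns (action, from_ccsid, to_ccsid).
    """
    if (source, target) in supported_pairs:
        return ("convert", source, target)
    if families[source] == families[target]:
        # Both EBCDIC or both ASCII: data is passed on without conversion.
        return ("pass-through", None, None)
    # One side ASCII, the other EBCDIC: fall back to the default CCSIDs.
    defaults = {"ascii": default_ascii, "ebcdic": default_ebcdic}
    return ("convert", defaults[families[source]], defaults[families[target]])


families = {850: "ascii", 500: "ebcdic", 1025: "ebcdic"}
supported = {(850, 500), (500, 850)}

assert plan_conversion(850, 500, supported, families, 850, 500) == ("convert", 850, 500)
assert plan_conversion(500, 1025, supported, families, 850, 500) == ("pass-through", None, None)
assert plan_conversion(850, 1025, supported, families, 850, 500) == ("convert", 850, 500)
```

In the last case the data is converted as if the target used CCSID 500, which is why only characters with matching code values in both sets are safe.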
Note: Try to restrict the characters being converted to those that have the same code values in the coded character set specified for the message and in the default coded character set. If you use only the set of characters that is valid for IBM MQ object names (as defined in Naming IBM MQ objects) you will, in general, satisfy this requirement. Exceptions occur with EBCDIC CCSIDs 290, 930, 1279, and 5026 used in Japan, where the lowercase characters have different codes from those used in other EBCDIC CCSIDs.
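
Whether a string is restricted to characters with identical code values in two coded character sets can be checked before sending. The sketch below assumes Python's cp037 and cp500 codecs as stand-ins for a message CCSID and a default EBCDIC CCSID; the helper same_code_values is hypothetical, and the Japanese CCSIDs named in the note are not covered because Python does not provide codecs for them.

```python
import string

# Characters that are valid in IBM MQ object names.
OBJECT_NAME_CHARS = string.ascii_letters + string.digits + "./_%"


def same_code_values(text: str, codec_a: str, codec_b: str) -> bool:
    """True if text encodes to identical bytes in both coded character sets."""
    return text.encode(codec_a) == text.encode(codec_b)


# Object-name characters have the same code values in both EBCDIC code pages.
assert same_code_values(OBJECT_NAME_CHARS, "cp037", "cp500")

# Square brackets do not, so they would be corrupted by default conversion.
assert not same_code_values("[x]", "cp037", "cp500")
```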

Converting messages in user-defined formats

The queue manager cannot convert messages in user-defined formats from one coded character set to another. If you need to convert data in a user-defined format, you must supply a data-conversion exit for each such format. Do not use default CCSIDs to convert character data in user-defined formats. For more information about converting data in user-defined formats and about writing data-conversion exits, see Writing data-conversion exits.

Changing the queue manager CCSID

After you use the CCSID attribute of the ALTER QMGR command to change the CCSID of the queue manager, stop and restart the queue manager to ensure that all running applications, including the command server and channel programs, are stopped and restarted.

This is necessary because any applications that are running when the queue manager CCSID is changed continue to use the existing CCSID.