z/OS Unicode Services User's Guide and Reference


Step a: Select the conversions

SA38-0680-00

There are four types of conversion:
  1. Character conversion between two different CCSIDs.
  2. Case conversion for Unicode characters.
  3. Normalizing of a Unicode string.
  4. Collation, for culturally correct comparison between two Unicode strings.
For character conversions, you must identify each pair of CCSIDs that you want the conversion services to convert between. Because there are several techniques for converting between two CCSIDs, you can also specify your preferred technique or techniques:
(R) Roundtrip conversion
Roundtrip conversions between two CCSIDs ensure that every character converted from the first CCSID to the second and back again arrives exactly as it was originally.
(E) Enforced Subset conversion
Enforced Subset conversions map only those characters from one CCSID to another that have a corresponding character in the second CCSID. All other characters are replaced by a substitution character.
(C) Customized conversion
Customized conversions use conversion tables that have been created to address some special requirements.
(L) Language Environment-Behavior conversion
Language Environment-Behavior conversions use tables that map characters the same way as the iconv() function of the C Runtime Library. These conversions differ from the others primarily in that they map the EBCDIC newline (NL) character to the ASCII and Unicode linefeed (LF) character.
(M) Modified for special use conversion
Modified for special use tables can be categorized into three main groups:
  • Tables that map characters the same way as the L tables, but for older code pages.
  • Tables that map characters the same way as the iconv() function of the C Runtime Library for converters whose names end with "C" (for example, IBM-932C).
  • Other special case mappings.
(0-9) User-defined conversions
User-defined conversions are supported. See Defining a new user-defined CCSID and then creating a user-defined conversion table using this new CCSID.
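
The difference between the R, E, and L techniques can be illustrated outside z/OS with a minimal Python sketch. It does not use Unicode Services or its conversion tables. It assumes that Python's cp037 and latin-1 codecs are acceptable stand-ins for an EBCDIC and an ISO 8859-1 CCSID, and the function names roundtrip_ok, enforced_subset, and le_behavior are hypothetical. The substitution character in the sketch is Python's '?', which may differ from the one a real conversion table uses.

```python
# Illustrative only: Python codecs stand in for z/OS conversion tables.
EBCDIC = "cp037"    # stand-in for an EBCDIC CCSID (assumption)
LATIN1 = "latin-1"  # stand-in for an ISO 8859-1 CCSID (assumption)


def roundtrip_ok(data: bytes) -> bool:
    """(R)-style check: EBCDIC -> Latin-1 -> EBCDIC gives back the original bytes."""
    as_latin1 = data.decode(EBCDIC).encode(LATIN1)
    return as_latin1.decode(LATIN1).encode(EBCDIC) == data


def enforced_subset(data: bytes, target: str = "ascii") -> bytes:
    """(E)-style conversion: characters missing from the target are substituted."""
    # Python substitutes '?'; a real conversion table may use another character.
    return data.decode(EBCDIC).encode(target, errors="replace")


def le_behavior(data: bytes) -> str:
    """(L)-style conversion: EBCDIC NL (X'15') ends up as linefeed (LF)."""
    # The cp037 codec decodes X'15' to U+0085 (NEL); remap it to U+000A (LF).
    return data.decode(EBCDIC).replace("\u0085", "\n")


if __name__ == "__main__":
    print(roundtrip_ok(bytes(range(256))))           # every byte value survives
    print(enforced_subset("café".encode(EBCDIC)))    # 'é' has no ASCII equivalent
    print(repr(le_behavior(b"\xc1\x15\xc2")))        # 'A', NL, 'B' in EBCDIC
```

The roundtrip check works for this pair because both code pages hold the same 256 characters in a different order. Converting to a smaller character set such as ASCII cannot be made roundtrip-safe, which is the case the enforced subset technique addresses.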
For case conversion you can have the following conversion modes:
  • NORMAL casing:

    Each character is mapped to its uppercase or lowercase equivalent using a one-to-one relationship, as described in the file UnicodeData.txt. Characters that cannot be mapped are copied to the output stream unchanged. Locale-specific casing is not supported in NORMAL mode. NORMAL is the preferred mode for converting English text.

  • SPECIAL casing:

    In addition to NORMAL casing, locale-independent special casing as listed in the file SpecialCasing.txt is performed. This can be unconditional special casing (for example, 'German Small Letter Sharp s' = X'00DF' uppercases to two 'Capital Letter S' characters = X'00530053') or conditional special casing (for example, 'Greek Capital Letter Sigma' = X'03A3' lowercases to 'Greek Small Sigma' = X'03C3' when it is within a word, or to 'Greek Small Final Sigma' = X'03C2' when it is the last character of a word).

  • LOCALE dependent casing:

    In addition to SPECIAL casing, locale-dependent special casing as listed in the file SpecialCasing.txt is performed (for example, 'Capital Letter I' = X'0049' lowercases to 'Small Letter i' = X'0069' when the caller's language is not Turkish, but lowercases to 'Small Letter Dotless i' = X'0131' when the caller's language is Turkish, that is, CUNBCPRM_Locale='tr...').

Note: User-defined case conversions are not supported.
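
The three casing modes can be approximated with a short Python sketch. It does not call the Unicode Services case conversion service. It assumes that Python's built-in str.upper() and str.lower(), which apply the locale-independent mappings from SpecialCasing.txt, are a reasonable stand-in for SPECIAL casing. The NORMAL functions are only an approximation that discards any mapping that is not one-to-one, and the locale_name parameter is a hypothetical stand-in for the caller's locale.

```python
def _one_to_one(ch: str, mapped: str) -> str:
    """Keep a mapping only if it yields exactly one character."""
    return mapped if len(mapped) == 1 else ch


def normal_upper(text: str) -> str:
    """Approximate NORMAL casing: unmappable characters are copied unchanged."""
    return "".join(_one_to_one(ch, ch.upper()) for ch in text)


def normal_lower(text: str) -> str:
    """Approximate NORMAL casing, character by character, without context."""
    return "".join(_one_to_one(ch, ch.lower()) for ch in text)


def special_upper(text: str) -> str:
    """SPECIAL-style casing: one character may expand to several."""
    return text.upper()


def special_lower(text: str) -> str:
    """SPECIAL-style casing: conditional rules such as final sigma apply."""
    return text.lower()


def locale_lower(text: str, locale_name: str) -> str:
    """LOCALE-style casing: Turkish maps I to dotless i before lowercasing."""
    if locale_name.lower().startswith("tr"):
        # U+0049 'I' -> U+0131 dotless i, U+0130 dotted capital I -> 'i'
        text = text.replace("I", "\u0131").replace("\u0130", "i")
    return text.lower()


if __name__ == "__main__":
    print(normal_upper("straße"), special_upper("straße"))
    print(normal_lower("ΟΔΟΣ"), special_lower("ΟΔΟΣ"))
    print(locale_lower("I", "en"), locale_lower("I", "tr"))
```

In this sketch, sharp s stays unchanged under the NORMAL approximation but expands to two characters under SPECIAL casing, and only the SPECIAL and LOCALE functions produce a final sigma at the end of a word.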

For normalization and collation services, no special mode is required. See Normalization conversion and Collation conversion.
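
As a brief illustration of why a Unicode string might need normalizing before it is compared, the following Python sketch uses the standard unicodedata module rather than the Unicode Services normalization service. The choice of the NFC and NFD forms here is an assumption made for the example, and collation is not shown.

```python
import unicodedata

composed = "\u00e9"      # 'é' as a single code point
decomposed = "e\u0301"   # 'e' followed by a combining acute accent

# The two strings display identically but differ code point by code point.
same_before = composed == decomposed

# After normalizing both to the same form, they compare equal.
same_after_nfc = unicodedata.normalize("NFC", decomposed) == composed
same_after_nfd = unicodedata.normalize("NFD", composed) == decomposed

if __name__ == "__main__":
    print(same_before, same_after_nfc, same_after_nfd)
```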





Copyright IBM Corporation 1990, 2014