There are four types of conversion:
- Character conversion between two different CCSIDs.
- Case conversion for Unicode characters.
- Normalizing of a Unicode string.
- Collation, for culturally correct comparison
between two Unicode strings.
For character conversions, you must identify each CCSID pair that you
want the conversion services to be able to convert between.
There are several techniques for converting between two CCSIDs,
and you can specify the technique or techniques that you prefer:
- (R) Roundtrip conversion
- Roundtrip conversions between two CCSIDs ensure that every character
making the "roundtrip" arrives back exactly as it was originally.
- (E) Enforced Subset conversion
- Enforced Subset conversions map only those characters from one
CCSID to another that have a corresponding character in the second
CCSID. All other characters are replaced by a substitution character.
- (C) Customized conversion
- Customized conversions use conversion tables that have been created
to address some special requirements.
- (L) Language Environment-Behavior conversion
- Language Environment-Behavior conversions use tables that map
characters like the iconv() function of the C Runtime Library does.
These conversions differ from others primarily in their mapping of
the EBCDIC newline (NL) character to ASCII and Unicode linefeed (LF).
- (M) Modified for special use conversion
- Modified for special use tables can be categorized
into three main groups:
- Tables that map characters like the L tables, but for older code
pages.
- Tables that map characters like the iconv() function of the C
Runtime Library does for converters ending with "C" (for example, IBM-932C).
- Other special case mappings.
- (0-9) User-defined conversions
- User-defined conversions are supported. See Defining a new user-defined CCSID and then creating a user-defined conversion table using this new CCSID.
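The difference between the techniques above can be illustrated with a minimal sketch. It does not call the conversion services; it assumes Python's built-in codecs ("cp037", "latin-1", "ascii") as stand-ins for CCSID tables, and the function names are hypothetical. It shows a roundtrip check, an enforced-subset conversion with a substitution character, and an L-style remapping of EBCDIC NL to LF.

```python
# Illustration only: Python codecs stand in for CCSID conversion tables.
# The codec names and all function names here are assumptions for the example.

def survives_roundtrip(data: bytes, source: str, target: str) -> bool:
    """Convert source -> target -> source and check that nothing changed."""
    converted = data.decode(source).encode(target)
    restored = converted.decode(target).encode(source)
    return restored == data

def enforced_subset(text: str, target: str) -> bytes:
    """Replace characters missing from the target with a substitution character."""
    # errors="replace" makes Python's encoder emit "?" for unmappable characters.
    return text.encode(target, errors="replace")

def nl_as_linefeed(data: bytes, source: str = "cp037") -> str:
    """Hypothetical L-style variant: EBCDIC NL (X'15') becomes linefeed (LF)."""
    # Python's cp037 table decodes X'15' to U+0085; remap it to U+000A.
    return data.decode(source).replace("\x85", "\n")

# Every one of the 256 byte values survives cp037 -> latin-1 -> cp037.
roundtrip_ok = survives_roundtrip(bytes(range(256)), "cp037", "latin-1")

# U+00EF and U+20AC have no ASCII equivalent, so each becomes "?".
subset_bytes = enforced_subset("na\u00efve \u20ac5", "ascii")

default_nl = b"\x15".decode("cp037")   # table as shipped with Python
l_style_nl = nl_as_linefeed(b"\x15")   # NL mapped to LF instead
```

The actual R, E, and L tables are defined by the conversion services; the sketch only mirrors the behavior described for each technique.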
For case conversion, the following conversion modes are available:
- NORMAL casing:
Each character is mapped to its uppercase or lowercase
equivalent using the one-to-one relationship described in the
file UnicodeData.txt. Characters that cannot be mapped are copied
to the output stream unchanged. Locale-specific casing
is not supported with mode NORMAL. NORMAL is the preferred mode for
converting English text.
- SPECIAL casing:
In addition to NORMAL casing, locale-independent
special casing as listed in the file SpecialCasing.txt is performed.
This can be unconditional special casing (for example, 'German Small
Letter Sharp s' = X'00DF' uppercases to two
'Capital Letter S' characters = X'00530053') or conditional special
casing (for example, 'Greek Capital Letter Sigma' = X'03A3' lowercases
to 'Greek Small Sigma' = X'03C3' when it is within a word,
or to 'Greek Small Final Sigma' = X'03C2' when it is the last
character of a word).
- LOCALE dependent casing:
In addition to SPECIAL casing, locale-dependent
special casing as listed in the file SpecialCasing.txt is
performed (for example, 'Capital Letter I' = X'0049' lowercases
to 'Small Letter i' = X'0069' when the caller's language is not
Turkish, but lowercases to 'Small Letter Dotless i' = X'0131'
when the caller's language is Turkish, CUNBCPRM_Locale='tr...').
Note: User-defined case conversions are not supported.
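The three casing modes can be approximated with a short sketch. It does not use the conversion services; it assumes Python's built-in string methods, whose upper() and lower() apply the locale-independent rules of SpecialCasing.txt. All function names are hypothetical, the NORMAL mode is only approximated (a character is left unchanged whenever Python's mapping is not one-to-one), and the locale handling covers only the Turkish example given above.

```python
# Illustration only: approximates the three modes with Python str methods.
# All function names are hypothetical and not part of any product API.

def upper_normal(text: str) -> str:
    """Approximate NORMAL: one-to-one mappings only, everything else copied."""
    out = []
    for ch in text:
        mapped = ch.upper()
        # Keep the original character when the result is not a single character.
        out.append(mapped if len(mapped) == 1 else ch)
    return "".join(out)

def upper_special(text: str) -> str:
    """SPECIAL-like uppercasing: sharp s expands to two capital S characters."""
    return text.upper()

def lower_special(text: str) -> str:
    """SPECIAL-like lowercasing: capital sigma depends on its position."""
    return text.lower()

def lower_locale(text: str, locale: str) -> str:
    """LOCALE-like lowercasing: Turkish dotless i handled before lowercasing."""
    if locale.lower().startswith("tr"):
        # Capital I -> dotless i (U+0131); capital dotted I (U+0130) -> i.
        text = text.replace("I", "\u0131").replace("\u0130", "i")
    return text.lower()

normal_result = upper_normal("stra\u00dfe")                # sharp s is copied unchanged
special_result = upper_special("stra\u00dfe")              # sharp s becomes "SS"
final_sigma = lower_special("\u039f\u0394\u039f\u03a3")    # sigma is last in the word
inner_sigma = lower_special("\u03a3\u039f")                # sigma is followed by a letter
turkish_i = lower_locale("I", "tr_TR")
english_i = lower_locale("I", "en_US")
```

In this sketch the locale string plays the role that CUNBCPRM_Locale plays for the caller; how the service itself parses that field is not modeled here.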
For normalization and collation services, no special mode is required.
See Normalization conversion and Collation conversion.