DB2 Version 9.7 for Linux, UNIX, and Windows

Language-aware collations for Unicode data

When you create a Unicode database, you can specify a collation based on a weight table used for collating non-Unicode data.

Such a collation orders Unicode data as if it had been converted to the non-Unicode code page and then had the corresponding SYSTEM collation applied. Multibyte characters and characters that are not in the non-Unicode code page collate after the single-byte characters present in that code page, and are sorted among themselves in IDENTITY order by their UCS-2BE code unit values.
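The ordering rule above can be illustrated outside the database with a minimal Python sketch. It is not DB2 code: the function name `collation_key` is hypothetical, Python's `latin-1` codec stands in for the non-Unicode code page, and the raw byte value stands in for the SYSTEM weight table, which in practice differs by code page and territory.

```python
def collation_key(text, codepage="latin-1", weights=None):
    """Build a sort key that mimics the ordering rule described above.

    Assumptions (not DB2 behavior): `codepage` is a Python codec standing in
    for the non-Unicode code page, and `weights` is a hypothetical 256-entry
    weight table; when omitted, the byte value itself is used as the weight.
    """
    key = []
    for ch in text:
        try:
            encoded = ch.encode(codepage)
        except UnicodeEncodeError:
            encoded = b""  # character is not in the code page
        if len(encoded) == 1:
            # Single-byte character: group 0, ordered by its weight.
            byte = encoded[0]
            key.append((0, weights[byte] if weights else byte))
        else:
            # Multibyte or unmapped character: group 1, ordered by its
            # big-endian 16-bit code unit values.
            units = ch.encode("utf-16-be")
            for i in range(0, len(units), 2):
                key.append((1, int.from_bytes(units[i:i + 2], "big")))
    return key


words = ["zebra", "\u4e2d", "apple", "\u0416", "\u00e9clair"]
ordered = sorted(words, key=collation_key)
print(ordered)
```

In this sketch the three words that fit in the stand-in code page sort first by weight, and the Cyrillic and CJK characters follow in code unit order.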

The non-Unicode SYSTEM collations can also be used with the COLLATION_KEY_BIT scalar function.

The names of these collations are in the SYSTEM_codepage_territory format. Any combination of code page and territory shown in "Supported territory codes and code pages" may be used, except for the Unicode code sets (UTF-8 and 16-bit Unicode) and all code pages where the operating system is shown as "Host".
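As a rough illustration of the naming format, the following Python sketch assembles a collation name and places it in two statement strings. The helper `system_collation_name`, the database name SAMPLEDB, and the STAFF table with its NAME column are hypothetical. The sketch also assumes that code page 819 with territory US is a supported combination, and that the territory part of the name is the two-letter territory code. Check both against the supported territory codes table before relying on them.

```python
def system_collation_name(codepage, territory):
    """Format a collation name as SYSTEM_codepage_territory."""
    return f"SYSTEM_{codepage}_{territory}"


# Assumed example combination: code page 819, territory US.
name = system_collation_name(819, "US")

# Statement text only; nothing is sent to a database here.
# SAMPLEDB, STAFF, and NAME are placeholder identifiers.
create_db = (
    "CREATE DATABASE SAMPLEDB USING CODESET UTF-8 TERRITORY US "
    f"COLLATE USING {name}"
)
order_by = (
    "SELECT NAME FROM STAFF "
    f"ORDER BY COLLATION_KEY_BIT(NAME, '{name}')"
)

print(name)
print(create_db)
print(order_by)
```

The first statement sets the collation for the whole database at creation time, while the second applies it to a single query through COLLATION_KEY_BIT.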

If the code page used is a multibyte code page, the language-aware collation applies only to characters that can be represented as a single byte in that code page. The code pages for China, Japan, Korea, and Taiwan are all multibyte code pages.
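To get a feel for which characters fall under the single-byte rule, the following Python sketch filters a string down to the characters that occupy one byte in a multibyte code page. It is only an approximation: the function name `language_aware_chars` is hypothetical, and Python's `cp932` codec is used as a stand-in for a Japanese code page, which may not match the DB2 code page tables exactly.

```python
def language_aware_chars(text, codepage="cp932"):
    """Return the characters of `text` that encode to one byte in `codepage`.

    `codepage` is a Python codec name used as a stand-in for a DB2
    multibyte code page (an assumption made for illustration).
    """
    result = []
    for ch in text:
        try:
            if len(ch.encode(codepage)) == 1:
                result.append(ch)
        except UnicodeEncodeError:
            pass  # character is not in the code page at all
    return result


# Latin A, half-width katakana A, hiragana A, Thai KO KAI
sample = "A\uff71\u3042\u0e01"
single_byte = language_aware_chars(sample)
print(single_byte)
```

Under this stand-in codec, the Latin letter and the half-width katakana are single-byte, the hiragana character takes two bytes, and the Thai character is not in the code page.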