In a Unicode database, all identifiers are stored in multibyte
UTF-8. You can therefore use any UCS-2 character in an identifier
wherever the DB2® database system allows a character from the extended
character set (for example, an accented character or a multibyte character).
Clients can enter any character that their environment supports,
and the database manager converts all characters in identifiers to UTF-8.
Take two points into account when
specifying national language characters in identifiers for a Unicode
database:
- Each non-ASCII character requires two to four bytes. Therefore,
an n-byte identifier can hold
between n/4 and n characters,
depending on the ratio of ASCII to non-ASCII characters. If the identifier
contains only one or two non-ASCII (for example, accented) characters, the
limit is closer to n characters, whereas
an identifier written entirely in a script such as Japanese
can hold only n/4 to n/3
characters.
- If identifiers are to be entered from different client environments,
they should be defined using the common subset of characters available
to those clients. For example, if a Unicode database is to be accessed
from Latin-1, Arabic, and Japanese environments, all identifiers should
realistically be limited to ASCII.
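
The two points above can be checked before an identifier is created. The following Python sketch is illustrative only and is not part of the DB2 product: the function names, the sample identifiers, and the 128-byte limit are hypothetical, and the actual limit depends on the kind of identifier. It also assumes that a character can be entered from a client environment if it can be encoded in that environment's code page, which is only an approximation.

```python
def utf8_length(identifier: str) -> int:
    """Return the number of bytes the identifier occupies in UTF-8."""
    return len(identifier.encode("utf-8"))


def fits(identifier: str, byte_limit: int) -> bool:
    """Check the identifier against a byte limit (not a character limit)."""
    return utf8_length(identifier) <= byte_limit


def enterable_from_all(identifier: str, client_encodings) -> bool:
    """Return True if every listed client code page can encode the identifier.

    Assumption: encodability in a code page stands in for "this client
    environment can enter the character".
    """
    for encoding in client_encodings:
        try:
            identifier.encode(encoding)
        except UnicodeEncodeError:
            return False
    return True


if __name__ == "__main__":
    # Hypothetical limit and client code pages (Latin-1, Arabic, Japanese).
    BYTE_LIMIT = 128
    CLIENTS = ["latin-1", "iso8859_6", "shift_jis"]

    # ASCII only, one accented character, and two Japanese characters.
    for name in ["CUSTOMER", "CAF\u00c9", "\u9867\u5ba2"]:
        print(
            ascii(name),
            len(name), "characters,",
            utf8_length(name), "bytes,",
            "fits" if fits(name, BYTE_LIMIT) else "too long",
            "/ enterable from all clients:",
            enterable_from_all(name, CLIENTS),
        )
```

In this sketch the ASCII-only name uses one byte per character and can be encoded in all three code pages, while the accented and Japanese names use more bytes than characters and cannot be encoded by every client.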