G7: Adding new multibyte characters
Adding new characters
New ideographic characters cannot be created by combining simple phonetic elements, the way new English words are formed from letters. Each new character must be assigned a position in a coded character set before it can be used. Many users are satisfied with the characters currently defined in IBM standard coded character sets, but items such as personal and family names often require characters that are not in any existing coded character set. The ability to add new characters to coded character sets is therefore a necessity.
Guideline: Allow users to add new characters where a requirement exists.
One way to meet this guideline is the UDC (User Defined Character) area found in many IBM coded character sets, where a section of the coded character set is reserved for personal or private use. New user-customized characters assigned to the UDC area are then mapped to standard IBM characters using code conversion tables. Note, however, that characters in the UDC area are not necessarily interchangeable across systems: different systems, even when they use the same coded character set, may have different UDC area assignments.
Example: The IBM Korean CCSID 00949 has 8,224 characters for PC data and 1,880 other characters in the UDC area.
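The interchange problem described above can be illustrated with a minimal sketch. It is not an IBM API: the UdcRegistry class, its methods, and the glyph names are hypothetical, and the Unicode Private Use Area (U+E000 to U+F8FF) is used only as a stand-in for a UDC area, since each IBM coded character set reserves its own range. The sketch assumes two systems that assign the same user-defined glyph to different UDC positions, and shows a conversion table mapping one assignment to the other.

```python
# Stand-in for a UDC area: the Unicode BMP Private Use Area.
# An actual IBM coded character set defines its own reserved range.
UDC_FIRST = 0xE000
UDC_LAST = 0xF8FF


class UdcRegistry:
    """Hypothetical per-system registry of user-defined characters."""

    def __init__(self):
        self._by_name = {}      # glyph name -> assigned character
        self._next = UDC_FIRST  # next free position in the UDC area

    def assign(self, name):
        """Return the UDC character for a glyph, assigning one if needed."""
        if name in self._by_name:
            return self._by_name[name]
        if self._next > UDC_LAST:
            raise ValueError("UDC area is full")
        ch = chr(self._next)
        self._by_name[name] = ch
        self._next += 1
        return ch

    def conversion_table(self, other):
        """Build a str.translate table from this system's UDC
        assignments to another system's, for glyphs defined in both."""
        return {
            ord(ch): other._by_name[name]
            for name, ch in self._by_name.items()
            if name in other._by_name
        }


def is_udc(ch):
    """True if the character lies in the (stand-in) UDC area."""
    return UDC_FIRST <= ord(ch) <= UDC_LAST


# Two systems define the same glyph, but in a different order, so the
# glyph ends up at a different UDC position on each system.
system_a = UdcRegistry()
system_b = UdcRegistry()
system_a.assign("family-name-glyph")
system_b.assign("company-logo")
system_b.assign("family-name-glyph")

text_on_a = "Name: " + system_a.assign("family-name-glyph")

# Without conversion, system B would display its own glyph at that
# position. The table remaps the code point before interchange.
table = system_a.conversion_table(system_b)
text_on_b = text_on_a.translate(table)
```

Sent unconverted, the last character of text_on_a would show up on system B as the "company-logo" glyph, because both systems use the same first UDC position for different characters. Keeping a shared name for each user-defined glyph is one possible design for building the table; a real deployment would rely on the code conversion tables supplied for the coded character sets involved.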