G4: Converting multibyte characters
Code point conversion is required when you move data between platforms that support different coded character sets.
Guideline G4
Ensure that graphic character conversion engines can accept mixed single-byte and multibyte code points. Allow the user to customize the conversion tables and methods used by these engines when necessary.
A code point conversion engine should be designed to support mixed single-byte and multibyte characters. Most conversions between two coded character sets cannot be performed using conversion tables alone; in those cases the engine must also rely on conversion methods. The user should be able to change these tables and methods when required.
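As an illustration only, the following minimal Python sketch shows one way an engine might combine a replaceable conversion table with a replaceable fallback method while scanning mixed single-byte and double-byte input. The class name `ConversionEngine`, the `override` helper, the `substitute` method, the lead-byte range, and the toy table entries are all assumptions made for this example; they are not taken from CDRA or from any IBM conversion table.

```python
from typing import Callable, Dict, Iterable

# A "method" in this sketch is any callable that handles a code point
# the table cannot map. The name and signature are hypothetical.
Method = Callable[[bytes], bytes]


def substitute(_code_point: bytes) -> bytes:
    """Default method: replace an unmapped code point with '?'."""
    return b"?"


class ConversionEngine:
    """Toy engine for a source encoding that mixes 1-byte and 2-byte code points."""

    def __init__(self, table: Dict[bytes, bytes],
                 lead_bytes: Iterable[int],
                 method: Method = substitute) -> None:
        self.table = dict(table)          # copied so callers can customize safely
        self.lead_bytes = set(lead_bytes) # byte values that start a 2-byte code point
        self.method = method              # user-replaceable fallback method

    def override(self, source: bytes, target: bytes) -> None:
        """Let the user customize a single table entry."""
        self.table[source] = target

    def convert(self, data: bytes) -> bytes:
        out = bytearray()
        i = 0
        while i < len(data):
            # A lead byte followed by at least one more byte forms a
            # 2-byte code point; everything else is treated as 1 byte.
            is_double = data[i] in self.lead_bytes and i + 1 < len(data)
            width = 2 if is_double else 1
            code_point = data[i:i + width]
            target = self.table.get(code_point)
            # Use the table first, then fall back to the method.
            out += target if target is not None else self.method(code_point)
            i += width
        return bytes(out)


# Toy table (assumed values): one single-byte and one double-byte entry.
toy_table = {b"A": b"A", b"\x81\x40": b"\xe3\x80\x80"}
engine = ConversionEngine(toy_table, lead_bytes=range(0x81, 0xA0))
converted = engine.convert(b"A\x81\x40A")
```

Because the table and the method are ordinary constructor arguments, a user could supply a corporate table or a different fallback (for example, one that raises an error instead of substituting) without changing the engine itself.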
The conversion table resources, listed under the heading 'Character Data Conversion Tables', can be downloaded from the IBM developerWorks website. Information on the different conversion methods can be found in the Character Data Representation Architecture (CDRA) reference.
Note: Many companies have set up corporate repositories for conversion tables. This ensures that all products developed by the company use the same set of conversion tables, which makes code point conversion consistent and predictable.