F3: Supporting graphic character sets
Products may be sensitive to the content of a character set: some products operate correctly only when the characters they handle are limited to a subset of the active coded character set. For products that must work with a restricted set of characters, that set must be redefinable so that it covers the most common characters in a particular region.
Several reasons can justify limiting the content of a character set:
Example: File names on a file system can contain only a subset of the characters supported by the platform. Certain characters, such as the wildcard and path separator characters, are not allowed because they are needed for other purposes. The set of characters supported by the platform differs from country to country, so the set of permitted file name characters must differ as well. For example, products that operate in France need to support file names such as RENÉ.
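The file name example can be sketched as a validator whose permitted character set is redefinable per region. This is a minimal illustration, not a description of any real file system: the names `RESERVED`, `BASE`, `REGION_EXTRAS`, `allowed_chars`, and `is_valid_file_name`, the region codes, and the contents of each set are all assumptions made for the example.

```python
import unicodedata

# Assumption: characters reserved for other purposes (wildcards, path separators).
RESERVED = set("*?/\\")

# Assumption: a base set of file name characters shared by every region.
BASE = set("ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789._-")

# Assumption: extra characters enabled per region; this table is the
# "redefinable" part and would normally come from configuration.
REGION_EXTRAS = {
    "US": set(),
    "FR": set("ÀÂÆÇÉÈÊËÎÏÔŒÙÛÜŸ"),
}


def allowed_chars(region):
    """Return the permitted file name characters for a region."""
    return (BASE | REGION_EXTRAS.get(region, set())) - RESERVED


def is_valid_file_name(name, region):
    """Check that every character of the name is permitted in the region."""
    # Normalize to NFC so that "E" + combining acute accent compares equal to "É".
    name = unicodedata.normalize("NFC", name)
    allowed = allowed_chars(region)
    return bool(name) and all(ch in allowed for ch in name)
```

With these hypothetical tables, `is_valid_file_name("RENÉ", "FR")` accepts the name, while the same name is rejected for `"US"`, and a name containing a wildcard such as `"A*B"` is rejected in every region.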
When a product or software application writes Unicode data, it must be able to write every character in the complete Unicode range (U+0000 to U+10FFFF), including the supplementary characters above U+FFFF that UTF-16 represents as surrogate pairs. This applies to every software interface through which data is written, such as network interfaces, database connections, and APIs. The Unicode data produced for files or data streams should use the required Unicode Transformation Format. For example, UTF-8 files should be produced when UTF-8 is required as the XML encoding.
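As a rough sketch of this requirement, the following Python writes a sample that spans the Unicode range to a byte stream in a requested encoding and reads it back. The helper name `write_unicode` and the `SAMPLE` string are assumptions introduced for illustration; a real product would apply the same check to its own network, database, and API write paths.

```python
import io


def write_unicode(stream, text, encoding="utf-8"):
    """Encode text in the requested encoding, write it, and return the byte count."""
    # Strict encoding raises UnicodeEncodeError for a lone surrogate code
    # point such as U+D800, which is not a character on its own.
    data = text.encode(encoding)
    stream.write(data)
    return len(data)


# Both ends of the range, two BMP characters, and two supplementary characters.
SAMPLE = "\u0000A\u00c9\u4e2d\U0001F600\U0010FFFF"

buf = io.BytesIO()
write_unicode(buf, SAMPLE, "utf-8")

# Decoding the written bytes should give back the original text unchanged.
round_trip = buf.getvalue().decode("utf-8")
```

Passing `"utf-16-le"` as the encoding exercises the surrogate pair path: each of the two supplementary characters in `SAMPLE` is written as two 16-bit code units.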