Supporting Graphic Character Sets

A product may be sensitive to the content of a character set; that is, it can operate correctly only when the characters it handles are limited to a subset of those in the active coded character set. For a product that must work with a restricted set of characters, that set must be redefinable so that it can include the characters most commonly used in a particular region.

Guideline F3

Support more than one character encoding and allow the user to select the character encoding; support Unicode at a minimum.
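As a minimal sketch of this guideline, the following Python fragment lets a user name the character encoding and falls back to UTF-8 when none is given. The function names `resolve_encoding` and `read_text` are hypothetical and chosen only for illustration; the guideline does not prescribe any particular interface.

```python
import codecs

# Assumption for this sketch: Unicode (as UTF-8) is the default when the
# user makes no selection.
DEFAULT_ENCODING = "utf-8"


def resolve_encoding(user_choice=None):
    """Return the codec name for the user's selection.

    codecs.lookup raises LookupError for an encoding that the runtime
    does not know, so an unsupported selection fails early.
    """
    name = user_choice or DEFAULT_ENCODING
    return codecs.lookup(name).name


def read_text(data, user_choice=None):
    """Decode bytes into a Unicode string using the selected encoding."""
    return data.decode(resolve_encoding(user_choice))


# The same text arrives in two different encodings and decodes to the
# same Unicode string.
from_cp1252 = read_text("RENÉ".encode("cp1252"), "cp1252")
from_utf8 = read_text("RENÉ".encode("utf-8"))
```

A product would typically expose the selection through a configuration setting or a user preference rather than a function argument; the argument is used here only to keep the sketch self-contained.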

There may be many reasons to limit the content of a character set, for example:

  • Object names that must be portable across multiple platforms.
  • Co-existence with older products that only support a limited number of characters.

Example: File names on a file system can contain only a subset of the characters supported by the platform. Certain characters, such as the wildcard and path separator characters, are not allowed because they are needed for other purposes. The set of characters that a platform supports differs from country to country, so the set of permitted file name characters must differ as well. For example, products that operate in France need to support file names such as RENÉ.
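The file name example can be sketched as a validator whose permitted character set is supplied per region instead of being hard-coded. Everything named here (`RESERVED`, `BASE_CHARS`, `FRENCH_CHARS`, `is_valid_file_name`) is hypothetical, and the character lists are illustrative rather than a complete definition for any platform or locale.

```python
import unicodedata

# Illustrative assumption: characters the platform reserves for other
# purposes (wildcards and path separators).
RESERVED = set("*?/\\")

# A restricted base set, plus a regional extension that redefines it.
BASE_CHARS = set("ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789._-")
FRENCH_CHARS = BASE_CHARS | set("ÀÂÆÇÉÈÊËÎÏÔŒÙÛÜŸ")


def is_valid_file_name(name, allowed_chars):
    """Return True if every character of name is permitted.

    The name is normalized to NFC first so that "E" followed by a
    combining acute accent is treated the same as the precomposed "É".
    """
    normalized = unicodedata.normalize("NFC", name)
    if not normalized:
        return False
    return all(
        ch not in RESERVED and ch in allowed_chars for ch in normalized
    )


accepted_in_france = is_valid_file_name("RENÉ", FRENCH_CHARS)
accepted_in_base = is_valid_file_name("RENÉ", BASE_CHARS)
accepted_wildcard = is_valid_file_name("REN*", FRENCH_CHARS)
```

Passing the permitted set as a parameter is the point of the sketch: the same validation logic serves a different region when a different set is supplied.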

Guideline F3-1

Configure data repositories for Unicode data; convert all data to be stored in the repository into Unicode.
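One way to picture this guideline is a small conversion step in front of the repository. The sketch below uses an in-memory SQLite database as a stand-in repository (SQLite stores TEXT values as UTF-8 by default); the table name `names` and the helpers `to_unicode` and `store_name` are assumptions made for illustration, not part of the guideline.

```python
import sqlite3


def to_unicode(raw, source_encoding):
    """Convert legacy-encoded bytes to a Unicode string.

    Decoding is strict, so bytes that are invalid in the declared source
    encoding raise UnicodeDecodeError instead of being stored corrupted.
    """
    return raw.decode(source_encoding)


def store_name(conn, raw, source_encoding):
    """Convert incoming data to Unicode, then store it."""
    conn.execute(
        "INSERT INTO names (value) VALUES (?)",
        (to_unicode(raw, source_encoding),),
    )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE names (value TEXT)")

# Data arrives in a legacy Windows code page and is converted on the way in.
store_name(conn, "RENÉ".encode("cp1252"), "cp1252")
stored = conn.execute("SELECT value FROM names").fetchone()[0]
```

Converting at the repository boundary means that every consumer reads the same Unicode data regardless of the encoding in which it originally arrived.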

When a product or software application writes Unicode data, it must be able to write every character in the complete Unicode range (U+0000 to U+10FFFF), including the characters that require surrogate pairs, through software interfaces such as network interfaces, database connections, and APIs. Unicode data written to files or data streams should use the Unicode Transformation Format (UTF) that the target requires. For example, produce UTF-8 files when UTF-8 is required as the XML encoding.
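The following sketch illustrates the XML case under stated assumptions: it uses Python's standard `xml.etree.ElementTree` module, a hypothetical element name `value`, and a sample string that contains U+20BB7, a character outside the Basic Multilingual Plane that UTF-16 represents as a surrogate pair. It is one possible check, not a required procedure. The `xml_declaration` argument of `tostring` requires Python 3.8 or later.

```python
import xml.etree.ElementTree as ET

# Sample text: a Latin-1 range character plus U+20BB7, which lies beyond
# U+FFFF and therefore needs a surrogate pair in UTF-16.
SAMPLE = "RENÉ \U00020BB7"


def write_xml_utf8(text):
    """Serialize text into a small XML document encoded as UTF-8 bytes."""
    root = ET.Element("value")  # hypothetical element name
    root.text = text
    return ET.tostring(root, encoding="utf-8", xml_declaration=True)


document = write_xml_utf8(SAMPLE)

# Parse the bytes back to confirm that no character was lost or replaced.
round_tripped = ET.fromstring(document).text

# The supplementary character occupies four bytes in UTF-8 and two
# 16-bit code units (four bytes) in UTF-16.
utf8_length = len("\U00020BB7".encode("utf-8"))
utf16_length = len("\U00020BB7".encode("utf-16-le"))
```

A round-trip test of this kind, run with characters from beyond U+FFFF, is a simple way to detect an interface that silently truncates or substitutes supplementary characters.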
