East Asian writing systems

Before discussing how an input method editor (IME) works, we must cover the basics of East Asian writing systems, using the Japanese writing system as an example. The entire Japanese written language comprises more than 50,000 characters, of which about 10,000 are in common use. The complexity and sheer number of these characters require some organization to simplify reading and writing. The Japanese writing system is organized into two categories: Kana and Kanji.

Kana is a syllabary, a set of written phonetic symbols, that can be used to represent the pronunciation of Kanji. The Kana syllabary is further divided into two subsets, Katakana and Hiragana, both of which represent the same set of phonetic syllables. Katakana characters are written in an angular form (see Figure 1). They are used to represent names and words that come from foreign languages other than Chinese and Korean.

Figure 1: Katakana characters


Hiragana characters are written in a cursive form and are used to represent native Japanese words and phonemes (see Figure 2). Kanji characters are non-phonetic characters that represent ideas or concepts; they originate from Chinese ideographs.

Figure 2: Hiragana characters

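Because Katakana and Hiragana represent the same set of syllables, text in one subset can be converted mechanically to the other. The following is a minimal sketch in Python, not part of the original article. It assumes the Unicode encoding, in which the Hiragana letters U+3041 to U+3096 and the corresponding Katakana letters U+30A1 to U+30F6 sit at a constant offset of 0x60. The function names are hypothetical and chosen only for illustration.

```python
# Constant distance between a Hiragana letter and its Katakana counterpart
# in Unicode (U+30A1 KATAKANA SMALL A minus U+3041 HIRAGANA SMALL A).
KANA_OFFSET = 0x30A1 - 0x3041  # 0x60


def hiragana_to_katakana(text: str) -> str:
    """Replace each Hiragana letter (U+3041..U+3096) with its Katakana form.

    Characters outside that range, such as Kanji, are left unchanged.
    """
    return "".join(
        chr(ord(ch) + KANA_OFFSET) if 0x3041 <= ord(ch) <= 0x3096 else ch
        for ch in text
    )


def katakana_to_hiragana(text: str) -> str:
    """Replace each Katakana letter (U+30A1..U+30F6) with its Hiragana form.

    Katakana-only characters above U+30F6 are left unchanged because they
    have no Hiragana counterpart at the same offset.
    """
    return "".join(
        chr(ord(ch) - KANA_OFFSET) if 0x30A1 <= ord(ch) <= 0x30F6 else ch
        for ch in text
    )


print(hiragana_to_katakana("ひらがな"))
print(katakana_to_hiragana("カタカナ"))
```

A real IME does far more than this, but the one-to-one mapping shows why the two subsets can be treated as two written forms of the same syllabary.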

Kanji characters are commonly referred to as ideographs. Figure 3 shows an example, the Kanji character for 'cloud'. Kanji characters are composed of units known as radicals, together with other, non-radical units. For example, the 'rain' radical is used to construct the Kanji character for 'cloud' (see Figure 4). Radicals themselves are constructed from even smaller units called strokes. A stroke is a line that is drawn in one continuous motion. For example, the 'rain' radical is composed of eight strokes, three of which are identified in Figure 5.

Figure 3: Kanji character for cloud


Figure 4: Kanji radical character rain

Figure 5: Strokes

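The character, radical, and stroke hierarchy described above can be modeled as a small data structure. The sketch below is illustrative only: the table is hand-written and hypothetical, not taken from any standard database, and it covers just the 'cloud' example. It assumes that the Kanji for 'cloud' (雲) consists of the 'rain' radical (雨, eight strokes) and one non-radical unit (云, four strokes).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    """One unit of a Kanji character: either its radical or a non-radical unit."""
    glyph: str
    strokes: int
    is_radical: bool


# Hypothetical, hand-written decomposition table used only for illustration.
KANJI_COMPONENTS = {
    "雲": (  # 'cloud'
        Component(glyph="雨", strokes=8, is_radical=True),   # 'rain' radical
        Component(glyph="云", strokes=4, is_radical=False),  # non-radical unit
    ),
}


def radical_of(kanji: str) -> str:
    """Return the glyph of the radical component listed for the character."""
    return next(c.glyph for c in KANJI_COMPONENTS[kanji] if c.is_radical)


def stroke_count(kanji: str) -> int:
    """Return the total stroke count by summing the strokes of each component."""
    return sum(c.strokes for c in KANJI_COMPONENTS[kanji])


print(radical_of("雲"), stroke_count("雲"))
```

Indexing characters by radical and by stroke count in this way is also how many Kanji dictionaries are organized.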
