Introduction to Indic languages

Vowel modification in Indic languages

The examples below show how vowel modification and conjunct formation are represented in Unicode.

Hindi examples
1. Consonant + vowel sign

ह consonant 'ha' + ि vowel sign 'i' = हि syllable 'hi'

2. Consonant + consonant

न consonant 'na' + ् 'halant' + द consonant 'da' = न्द syllable 'nda'

Note the ligature formation.


3. Consonant + consonant + vowel sign

न consonant 'na' + ् 'halant' + द consonant 'da' + ी vowel sign 'ii' = न्दी syllable 'ndii'

4. Combination of 1 + 3

ह consonant 'ha' + ि vowel sign 'i' + न consonant 'na' + ् 'halant' + द consonant 'da' + ी vowel sign 'ii' = हिन्दी word 'hindii'

Note that the word Hindi, when written in Devanagari script, does end in a long 'ii'.
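
The Hindi example can also be reproduced programmatically. The following is a minimal Python sketch, not part of the original article, that joins the six code points in the order they are stored and prints their official Unicode names. The names `HINDII` and `word` are chosen here purely for illustration.

```python
import unicodedata

# Code points for 'hindii' in logical (storage) order:
# ha, vowel sign i, na, halant (virama), da, vowel sign ii.
HINDII = [0x0939, 0x093F, 0x0928, 0x094D, 0x0926, 0x0940]

# Joining the code points is all that is needed; the ligature is
# formed later by the rendering engine and the font.
word = "".join(chr(cp) for cp in HINDII)

for cp in HINDII:
    print(f"U+{cp:04X}  {unicodedata.name(chr(cp))}")

print(word, "has", len(word), "code points")
```

The string holds six separate code points even though it displays as two syllables. The vowel sign 'i' is stored after 'ha', although it is drawn to the left of the consonant.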


Tamil example

த consonant 'ta' + ம consonant 'ma' + ி vowel sign 'i' + ழ consonant 'zha' + ் special modifier 'pulli' = தமிழ் word 'Tamizh'

Gujarati example

ગ consonant 'ga' + ુ vowel sign 'u' + જ consonant 'ja' + ર consonant 'ra' + ા vowel sign 'aa' + ત consonant 'ta' + ી vowel sign 'ii' = ગુજરાતી word 'Gujaratii'

Punjabi example

ਪ consonant 'pa' + ੰ special modifier 'tippi' + ਜ consonant 'ja' + ਾ vowel sign 'aa' + ਬ consonant 'ba' + ੀ vowel sign 'ii' = ਪੰਜਾਬੀ word 'Punjabii'
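
The Tamil, Gujarati and Punjabi examples follow the same pattern: each word is a plain sequence of code points in its own script block. The Python sketch below is an illustration added here; the `WORDS` table and the `describe` helper are hypothetical names. It lists the code point and Unicode name of every character in the three words.

```python
import unicodedata

# Each word is written as escapes so the stored sequence is explicit.
WORDS = {
    "Tamizh": "\u0BA4\u0BAE\u0BBF\u0BB4\u0BCD",
    "Gujaratii": "\u0A97\u0AC1\u0A9C\u0AB0\u0ABE\u0AA4\u0AC0",
    "Punjabii": "\u0A2A\u0A70\u0A1C\u0A3E\u0A2C\u0A40",
}

def describe(word):
    """Return (code point, Unicode name) pairs for each character."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in word]

for label, word in WORDS.items():
    print(label, word)
    for code, name in describe(word):
        print("   ", code, name)
```

In the output, the Tamil 'pulli' appears under its formal Unicode name, TAMIL SIGN VIRAMA, and the consonant 'zha' as TAMIL LETTER LLLA.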

Ligatures
As shown above, in many instances Indic syllables form new glyphs. These glyphs are called ligatures.

Character reshaping is often straightforward and follows simple rules. In some cases, however, the resulting ligature bears no visual resemblance to its constituents, and an untrained reader cannot identify them.

Unlike bidirectional scripts, Indic scripts require no major layout transformation when the Unicode approach is followed. Because character reshaping occurs within an individual conjunct cluster, the complexity is localized: once the behavior of a specific cluster is known, rendering it as expected is relatively easy.
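
To make the idea of cluster-level locality concrete, here is a rough Python sketch that splits Devanagari text into conjunct clusters. It is a deliberate simplification and not how a real shaping engine works: it assumes that every combining mark attaches to the preceding cluster and that any character following a halant continues the conjunct. The function name `split_clusters` is hypothetical.

```python
import unicodedata

VIRAMA = "\u094D"  # Devanagari halant

def split_clusters(text):
    """Group characters into simplified conjunct clusters.

    Assumption: a combining mark (category M*) joins the previous
    cluster, and so does any character that follows a halant.
    """
    clusters = []
    for ch in text:
        is_mark = unicodedata.category(ch).startswith("M")
        if clusters and (is_mark or clusters[-1].endswith(VIRAMA)):
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return clusters

# 'hindii': ha, vowel sign i, na, halant, da, vowel sign ii
hindii = "\u0939\u093F\u0928\u094D\u0926\u0940"
for cluster in split_clusters(hindii):
    print(cluster, [f"U+{ord(ch):04X}" for ch in cluster])
```

For the word 'hindii' this yields two clusters, 'hi' and 'ndii', so any reshaping can be handled one cluster at a time. A production implementation would need script-specific rules and edge cases that this sketch ignores.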

Note: Bidirectional processing will be required for languages such as Kashmiri, Sindhi and Urdu when written in the Urdu script. These cases are not addressed in this article.