
Introduction to Indic languages

Basic components of Indic languages

The basic phonetic components of Indic languages are vowels (called 'swar' in Hindi) and consonants (called 'vyanjan' in Hindi). Together these form the basis of Indic alphabets. There are also dependent characters, which are used in conjunction with the letters of the alphabet to modify the sounds those letters represent. We call these 'special modifiers'.

According to the linguistic view, an Indic alphabet contains vowels, pure consonants and some special modifiers. You will see later how this simplifies some complexities in understanding Indic languages.
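The split between independent letters and dependent characters can be sketched in Python. This is only an illustration under an assumption that the text above does not make: it treats the Unicode "combining mark" categories (those starting with M) as a stand-in for dependent characters and special modifiers. The helper name `is_dependent` is hypothetical.

```python
import unicodedata

# The Hindi word for "Hindi", written with escapes so each code point is explicit:
# HA + VOWEL SIGN I + ANUSVARA + DA + VOWEL SIGN II
word = "\u0939\u093F\u0902\u0926\u0940"


def is_dependent(ch):
    """Assumption: a character is 'dependent' if Unicode classifies it as a mark (M*)."""
    return unicodedata.category(ch).startswith("M")


for ch in word:
    kind = "dependent" if is_dependent(ch) else "independent"
    print(f"U+{ord(ch):04X}  {unicodedata.name(ch):30}  {kind}")
```

Under this assumption the two consonants are reported as independent, while the two vowel signs and the anusvara are reported as dependent.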

Encoding Indic scripts

Large-scale, multilingual applications have existed in India since the early days of computing. Because these applications were stand-alone, however, each scheme used to encode Indic languages remained confined to the environment in which it was deployed, and the schemes were therefore not interoperable. A number of these encodings exist, many of them proprietary.

The most established encoding is ISCII (Indian Script Code for Information Interchange), which was created in 1988. It is an ingenious encoding scheme that takes advantage of the common linguistic threads of Indic scripts. The Unicode Standard and its ISO counterpart (ISO/IEC 10646) were created as a global standard to address the needs of most major languages. When Unicode was extended to Indic languages, it drew on the ISCII standard.
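One visible trace of this shared design is that the Unicode blocks for the major Indic scripts are laid out in parallel, so the same letter tends to sit at the same offset within each block. The sketch below is an illustration rather than a conversion routine. The block list, the `KA_OFFSET` constant and the `letter_at` helper are names chosen here for the example, not part of any standard API.

```python
import unicodedata

# Starting code point of each script's Unicode block.
BLOCKS = {
    "Devanagari": 0x0900,
    "Bengali": 0x0980,
    "Gurmukhi": 0x0A00,
    "Gujarati": 0x0A80,
    "Oriya": 0x0B00,
    "Tamil": 0x0B80,
    "Telugu": 0x0C00,
    "Kannada": 0x0C80,
    "Malayalam": 0x0D00,
}

# The consonant KA sits 0x15 positions into each of these blocks.
KA_OFFSET = 0x15


def letter_at(script, offset):
    """Return the character at the given offset within a script's block."""
    return chr(BLOCKS[script] + offset)


for script in BLOCKS:
    ch = letter_at(script, KA_OFFSET)
    print(f"{script:10}  U+{ord(ch):04X}  {unicodedata.name(ch)}")
```

The parallel layout does not make this a transliteration scheme: not every offset is assigned in every script, so shifting arbitrary text between blocks can produce unassigned code points.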

Most modern software providers use Unicode for Web-based applications, making it the ideal choice for e-business in Indic languages. As more experience is gained with Unicode for Indic languages, refinements are being incorporated into future versions of the standard.
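As a small illustration of what Unicode text looks like on the Web, the sketch below encodes a Devanagari word as UTF-8 and decodes it again. UTF-8 is assumed here because it is the most common Unicode encoding form for Web content; the text above does not name a specific encoding form.

```python
# The Hindi word for "Hindi": five Unicode code points.
word = "\u0939\u093F\u0902\u0926\u0940"

# Each Devanagari code point falls in the range that UTF-8 encodes as three bytes.
encoded = word.encode("utf-8")

print("code points:", len(word))
print("UTF-8 bytes:", len(encoded))
print("hex:", " ".join(f"{b:02x}" for b in encoded))

# Decoding the bytes restores the original string exactly.
decoded = encoded.decode("utf-8")
print("round trip ok:", decoded == word)
```

The difference between the code-point count and the byte count is worth keeping in mind when sizing database columns or form fields for Indic text.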