Transliteration
|
Transliteration is the process of writing a language using the alphabet or script of another language. Transliteration should not be confused with translation where the information is rewritten in a different language.
|
|
For example:
| Original Japanese | ありがとう ございます |
| Transliteration into Latin script | Arigato gozaimasu |
| Translation into English | Thank you |
Transliteration is used in a variety of ways, such as in tourist phrase books, to produce the Latin form of a person's name in a passport, or to index foreign language books in a library. A common goal of transliteration is to achieve correct pronunciation of the language when it is written in a different script. This phonetic form of transliteration is sometimes referred to as transcription.
There are many different transliteration algorithms. Some transliteration algorithms ensure that each unique character in the original script is represented uniquely in the target script. Using this approach, the original script can be restored from the transliterated text. However, when a phonetically accurate transliteration algorithm is used, the source text usually cannot be restored. If the source text cannot be recreated, then it is usually a good idea to preserve the original source text.
Many governments and standards organizations, such as ISO and the U.S. Library of Congress, have created transliteration standards. Transliteration utilities are available from various sources such as International Components for Unicode (ICU) and Microsoft.
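The difference between a reversible scheme and a phonetic one can be sketched in a few lines of Python. The two mapping tables are tiny, hypothetical examples made up for this illustration; they are not taken from the ISO, Library of Congress, or ICU rules.

```python
# Hypothetical one-to-one table: every source letter has its own target
# letter, so the mapping can be inverted.
REVERSIBLE = {"и": "i", "й": "j", "м": "m", "р": "r"}

# Hypothetical phonetic table: two source letters share one target letter,
# so the original spelling cannot be recovered from the result.
PHONETIC = {"и": "i", "й": "i", "м": "m", "р": "r"}


def transliterate(text, table):
    """Replace each character using the table; unknown characters pass through."""
    return "".join(table.get(ch, ch) for ch in text)


def invert(table):
    """Return the reverse table, or None if the table is not one-to-one."""
    reverse = {target: source for source, target in table.items()}
    return reverse if len(reverse) == len(table) else None


word = "мир"
latin = transliterate(word, REVERSIBLE)
restored = transliterate(latin, invert(REVERSIBLE))
print(latin, restored == word)      # the round trip succeeds
print(invert(PHONETIC) is None)     # the phonetic table has no inverse
```

Real schemes often map one letter to several Latin letters, which makes restoring the source harder than this single-character sketch suggests.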
Reference: Wikipedia - Transliteration | ICU User Guide: Transforms | ICU Transform Demo
|  | Russian
|
Russian is spoken as a first language by approximately 145 million people and as a second language by another 110 million people. Based upon the number of first language speakers, it ranks as the 8th most spoken language in the world after Chinese, Hindi, Spanish, English, Arabic, Portuguese, and Bengali.
|
Русский
|
The Russian language has official status in at least four countries: Russia, Belarus, Kazakhstan, and Kyrgyzstan, along with being one of the six official languages of the United Nations. Russian is spoken as a first or second language by almost the total population of Russia. In addition, there are large numbers of Russian language speakers in other countries around the world.
The Cyrillic writing system that is used to write Russian is similar to the Latin writing system that is used for Western European languages. While the shapes of the characters are different from Latin characters, Cyrillic is written from left-to-right and it has upper and lower case variations of the characters. Almost all of the lower case letter forms are essentially a smaller form of the upper case letter forms rather than assuming a different shape, as is seen in about half of the Latin characters. However, the italic letter form of Cyrillic characters is known as cursive, and some cursive characters are not just slanted but have completely different shapes from their non-cursive forms.
Cyrillic has been adapted to write many languages besides Russian. Some examples include Belarusian, Bulgarian, Kazakh, Macedonian, Serbian, and Ukrainian. While many characters are commonly used in writing all of these languages, each language may require a slightly different selection of Cyrillic characters. For example, there are 33 letters in the Russian alphabet while there are only 30 letters in the Bulgarian alphabet.
Reference: Ethnologue - Russian | Wikipedia - Cyrillic alphabet | Wikipedia - Russian language | Wikipedia - Languages listed by number of speakers
Arabic
Arabic is spoken as a first language by approximately 206 million people. Based upon the number of first language speakers, it ranks as the 5th most spoken language in the world after Chinese, Hindi, Spanish, and English. |
اللغة العربية لغة جميلة
|
Arabic is spoken by the majority of people in 19 countries located in North Africa and the Middle East and is given official status in approximately 25 countries. While spoken Arabic often differs from country to country, the written form of Arabic, commonly called Modern Standard Arabic, is taught in schools and is understood by all Arabic speakers.
Arabic is written from right-to-left. However, numbers and foreign language words, such as English words, are written from left-to-right within Arabic text. Due to this need to write various elements of the text in different directions, the Information Technology industry refers to Arabic as a bidirectional or bidi (pronounced "bye dye") writing system.
The Arabic writing system, with some modifications, has been used to write other languages besides Arabic. For example, it is currently used to write Urdu, an official language in Pakistan and India; Farsi, the official language of Iran; Pashto, a major language in Pakistan and Afghanistan; and Azerbaijani in Iran. Most of the Turkic languages, such as Turkish, Azerbaijani, Kazakh, Turkmen, Uzbek, Kyrgyz, and Uyghur, have been written using the Arabic writing system at one time or another. Interestingly, the Arabic script is currently the official writing system for Uyghur in China.
Reference: Ethnologue - Standard Arabic | Omniglot - Arabic script | Wikipedia - Arabic language | Wikipedia - Languages listed by number of speakers
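Software decides the direction of each run of text from a bidirectional class that Unicode assigns to every character. As a rough illustration, Python's standard `unicodedata` module can report that class. The sample string is made up for this sketch, and the code only inspects the classes; it does not implement the full Unicode Bidirectional Algorithm.

```python
import unicodedata


def bidi_classes(text):
    """Return (character, bidirectional class) pairs, skipping whitespace."""
    return [(ch, unicodedata.bidirectional(ch)) for ch in text if not ch.isspace()]


# Four Arabic letters, a European number, and a Latin abbreviation.
sample = "\u0639\u0631\u0628\u064a 2004 IBM"

for ch, cls in bidi_classes(sample):
    # AL = Arabic letter (right-to-left), EN = European number, L = left-to-right
    print(f"U+{ord(ch):04X} {cls}")
```

A rendering engine uses these classes to lay out the Arabic letters right-to-left while keeping the digits and the Latin letters in left-to-right order.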
| The Turkish Dotted and Dotless I
The Turkish alphabet has 29 letters. The 26 upper and lower case letters of
the English alphabet are modified as follows:
- Remove the Qq, Ww, and Xx
- Add Çç, Ğğ, ı, İ, Öö, Şş, and Üü
Of these characters, the dotted and dotless I cause the most problems
for programmers and need special consideration whenever case conversion
to upper or lower case is required for an operation.
 |
| In English the lower case dotted i becomes an upper case dotless I: | i → I | U+0069, U+0049 |
| In Turkish the lower case dotted i becomes an upper case dotted İ: | i → İ | U+0069, U+0130 |
| In Turkish the lower case dotless ı becomes an upper case dotless I: | ı → I | U+0131, U+0049 |
The data above shows that there is more than one possible result of converting
a lower case dotted 'i' to upper case. If the operation is done in a locale-sensitive
manner, the result obtained on a machine running in a Turkish locale would
be different from the result obtained on a machine running in an English
or German locale. Such differences could result in functional failures in
the software. For example, when case is not important, an operation that compares
the text 'win' and 'WIN' by using a locale-sensitive API to convert the first
string to upper case and then comparing will result in a match in
an English or German locale, but a mismatch in a Turkish locale.
When a program deals with user text, software users expect locale-sensitive
operations to be used, but applying locale-sensitive operations, such as
Turkish casing rules, to some items (such as keyboard commands, programming
language grammar, HTML markup, etc.) would yield unexpected and incorrect
results. This means that great care must be taken in deciding when to, and
when not to, use locale-sensitive operations. In some situations, such as
for local language text, case conversion should be locale-sensitive. In
other situations, such as for file names or security authentication, case
conversion should be locale-insensitive.
The dotless I is found in other Turkic languages besides Turkish. Azerbaijani
and Tatar are examples of other Turkic languages that are written with a
Latin alphabet and have the dotless I. However, Turkmen is a Turkic language
written with a Latin alphabet but it does not have the dotless I. |
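The 'win' and 'WIN' comparison described above can be reproduced with a short Python sketch. Python's built-in `str.upper()` follows the default Unicode case mappings and does not consult the locale, so the Turkish behaviour is imitated here by `upper_tr`, a hypothetical helper written only for this illustration.

```python
def upper_tr(text):
    """Upper-case text using the Turkish pairing i -> İ and ı -> I.

    Illustrative helper only: the two special pairs are handled first and
    every other letter is left to the default upper-casing.
    """
    return text.replace("i", "\u0130").replace("\u0131", "I").upper()


# Locale-insensitive conversion: the strings match.
print("win".upper() == "WIN")

# Turkish-style conversion: 'win' becomes 'WİN', so the strings differ.
print(upper_tr("win") == "WIN")
```

For identifiers such as file names, commands, or markup, the locale-insensitive form is the one to compare; the Turkish form belongs only with text that is known to be Turkish.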
Reference: Unicode Case Mappings | Wikipedia entry
| U.S. Daylight Saving Time changes coming in 2007
On August 8, 2005, President Bush signed the new U.S. energy bill into law. That bill included provisions for some important changes to Daylight Saving Time in the United States. Specifically, starting in 2007, the U.S. Daylight Saving Time will be extended by four weeks, starting three weeks earlier on the second Sunday in March and ending one week later on the first Sunday in November.
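To see what the rule change means for a concrete year, the dates can be computed with Python's standard `datetime` module. This is a minimal sketch: the function names are made up for the illustration, and the old rule used for comparison (first Sunday in April to last Sunday in October) is the one the United States followed before the change.

```python
import datetime

SUNDAY = 6  # datetime.date.weekday() counts Monday as 0 and Sunday as 6


def nth_sunday(year, month, n):
    """Return the date of the n-th Sunday (n >= 1) in the given month."""
    first = datetime.date(year, month, 1)
    days_to_sunday = (SUNDAY - first.weekday()) % 7
    return first + datetime.timedelta(days=days_to_sunday + 7 * (n - 1))


def last_sunday_of_october(year):
    """Return the date of the last Sunday in October."""
    last_day = datetime.date(year, 10, 31)
    days_back = (last_day.weekday() - SUNDAY) % 7
    return last_day - datetime.timedelta(days=days_back)


def us_dst_period(year, new_rules):
    """Return (start, end) dates of U.S. Daylight Saving Time under either rule."""
    if new_rules:
        return nth_sunday(year, 3, 2), nth_sunday(year, 11, 1)
    return nth_sunday(year, 4, 1), last_sunday_of_october(year)


old_start, old_end = us_dst_period(2007, new_rules=False)
new_start, new_end = us_dst_period(2007, new_rules=True)
print(new_start, new_end)
# Days gained at the start and at the end of the period
print((old_start - new_start).days, (new_end - old_end).days)
```

For 2007 the sketch gives a start 21 days earlier and an end 7 days later, which is the four-week extension described above.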
|

|
It is fairly common for changes to be made to time zones at the regional or country level. For example, in 2004, the country of Georgia changed its time zone by one hour in relation to Universal Time, while Argentina made a number of changes to its time zone rules. Starting in 2006, Indiana, one of the states in the U.S., will begin observing new Daylight Saving Time rules. The one thing that can be said about time is that it is continually changing!
In light of this, the clock adjustment software and time-sensitive computer applications need to be prepared to deal with these changes. The first line of defense is to know that a time zone change has occurred for areas of the world that you are concerned about. The second line of defense is to consider the needs of your time-sensitive applications and establish procedures for handling time-related changes as they arise. Time-sensitive applications may include common applications such as basic calendaring and meeting room scheduling as well as more complex applications such as airline reservations systems.
One well-respected source of information on time zone changes is the public-domain Olson tz database. Other sources of time zone information are also available. |
|
Reference: Sources of time zone information | World time zone map
|  |
|
|
Hong Kong Supplementary Character Set (HKSCS)
Both Taiwan and the Chinese Special Administrative Region of Hong Kong use the Traditional Chinese writing system but they do not speak the same language. Mandarin Chinese is the common language of Taiwan while Cantonese Chinese is the common language of Hong Kong. Due to differences between Mandarin and Cantonese, the Traditional Chinese Big-5 character set, which was defined in Taiwan in 1984, does not fully meet the needs of the Cantonese speakers of Hong Kong.
|

|
|
The government of Hong Kong published an extension to Big-5 in 1994 and called it the Government Common Character Set (GCCS). In 1999, they revised the GCCS and renamed it the Hong Kong Supplementary Character Set (HKSCS). The latest revision of HKSCS was published in 2001. Since it is a "supplementary" character set, Hong Kong requires the Big5 characters plus HKSCS.
There are 4,818 characters in HKSCS-2001 as follows:
- 3,234 Han characters found in major dictionaries
- 889 names of people, companies, and locations
- 109 Cantonese dialect characters
- 30 radicals and shapes
- 12 scientific terms
- 103 other characters commonly found in Hong Kong fonts
- 441 symbols including table drawing symbols, Han character shapes and radicals, and various local and international alphabets such as Hanyu Pinyin, Japanese Katakana and Hiragana, and other international phonetic alphabets.
Updates to HKSCS are expected in the future.
|
|
Unicode Version 3.1 contains 4,783 of the 4,818 characters in HKSCS-2001. The remaining 35 characters, mainly special symbols and character radicals, will be added in Unicode Version 4.1. In the meantime, these 35 characters are handled in the Private Use Area (PUA) of Unicode. 3,132 of the HKSCS-2001 characters are encoded in the Basic Multilingual Plane (BMP) of Unicode. The other 1,651 characters are encoded in the Supplementary Ideographic Plane thus requiring support for the Supplementary characters (sometimes referred to as Surrogate characters). Since the number of HKSCS characters included in different versions of Unicode differs, precise but different mappings are defined between HKSCS-2001 and Unicode V2.1, Unicode V3.0, and Unicode V3.1 primarily to deal with characters that have to be handled in the Private Use Area (PUA) of Unicode.
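The practical effect of a character living outside the BMP can be seen by counting UTF-16 code units. In the minimal Python sketch below, the two sample characters were picked only because they sit in the relevant planes; they are not claimed to be HKSCS characters.

```python
def utf16_code_units(ch):
    """Return the number of 16-bit code units UTF-16 needs for one character."""
    return len(ch.encode("utf-16-be")) // 2


bmp_char = "\u4e2d"      # a Han character in the Basic Multilingual Plane
sip_char = "\U00020000"  # first code point of the Supplementary Ideographic Plane

for ch in (bmp_char, sip_char):
    print(f"U+{ord(ch):04X} needs {utf16_code_units(ch)} UTF-16 code unit(s)")
```

Software that assumes one 16-bit unit per character will split the second character into two surrogate halves, which is why support for supplementary characters has to be checked explicitly.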
Reference: Hong Kong S.A.R. HKSCS site | Unicode 4.1.0 beta announcement
|
|  |
|
International Domain Names - 24 September 2004
ibm.com is a domain name.
Domain names make it easier for us to surf the Web. Without domain names we would have to use IP (Internet Protocol) addresses such as 129.42.17.99. Entering either ibm.com or 129.42.17.99 into your browser will take you to the same place. Which do you think is easier to use? Businesses register meaningful domain names in order to make it easier for Internet users to find them.
Until recently, domain names could only contain characters from the English alphabet, numbers, and the dot and the dash symbols. While this worked fine for English speakers, it wasn't very usable for most Asians or Europeans since those users could not use their local language brand or company names in their domain names.
In March 2003, the Internet Engineering Task Force (IETF) published a set of standards that provide the mechanisms for expanding the domain name system to accommodate most of the characters in the Unicode V3.2 standard, thus supporting hundreds of languages. Domain names that use the expanded Unicode character set are known as International Domain Names (IDNs).
For a company such as Nestlé in Switzerland, this is good news. The final é is part of the brand name for Nestlé and is the correct way to spell the name. Check it out. If your browser supports IDNs, the following URL <http://www.nestlé.ch> will take you to the Swiss German Nestlé site. Obviously, this form of domain name would not be very easy to enter for a worldwide audience but it will be much more usable for the local Swiss audience.
Many domain name registries around the world currently allow registration of IDNs, and IDNs are supported natively in Netscape V7.1, Mozilla V1.4, Opera V7.2, and other browsers. Internet Explorer V6 does not support IDNs, but free plug-ins from a number of suppliers, such as Verisign's i-Nav or Domain Avenue.com iClient, easily correct that deficiency. Many people have not heard of IDNs yet, but usage is expected to grow gradually over time as knowledge spreads and browser and e-mail support becomes commonplace. To date hundreds of thousands of IDNs have been registered in .com, .net, and country code domain name registries.
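Behind the scenes, an IDN is converted to an ASCII-only form before it is looked up in the domain name system. Python ships an `idna` codec that follows the 2003 IETF rules, so the conversion can be sketched as follows. The helper names are made up for the illustration, and the snippet only converts the name; it does not contact any server.

```python
def to_ascii(domain):
    """Convert a Unicode domain name to its ASCII-compatible form."""
    return domain.encode("idna")


def to_unicode(ascii_domain):
    """Convert an ASCII-compatible domain name back to Unicode."""
    return ascii_domain.decode("idna")


name = "www.nestl\u00e9.ch"
ascii_form = to_ascii(name)

print(ascii_form)              # the label containing é now starts with "xn--"
print(to_unicode(ascii_form))  # converts back to the original spelling
```

Only the label that contains a non-ASCII character is rewritten; labels such as `www` and `ch` pass through unchanged.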
|  |
|
Japanese Names
|
| A modern Japanese name consists of a family name followed by a personal name. Japanese do not have middle names. |
|
When written in Japanese characters, the family name always comes before the personal name. Most Japanese family names and personal names consist of two kanji characters each, although some personal names use hiragana or katakana characters. Japanese law currently restricts personal names to a list of 2,232 characters. A personal name containing other characters cannot be officially registered and must be changed. The Ministry of Justice is considering allowing people to use 578 additional characters for personal names. A few Japanese names, particularly family names, include older or uncommon characters. These characters are not included in computer character sets, such as Unicode, so people with these names sometimes substitute similar but more common characters. However, many Japanese do not wish to substitute, so local Japanese systems add the characters using mechanisms such as the Private Use Area in Unicode.
Have you ever wondered if someone you are corresponding with in Japan is male or female? Frequently, the gender of a person can be guessed by the ending of his or her personal name. With a few exceptions, personal names ending with -o, -n, -ro, -shi, or -ya are male, while names ending in -a, -e, -ko, -mi, and -yo are female.
While a name written in kanji may have more than one pronunciation, only one pronunciation will be correct for a given individual. Conversely, quite a few kanji have identical pronunciations, so names that are pronounced the same are not necessarily written with the same kanji. If it is necessary to know how to pronounce the name when collecting a Japanese name in a computer or web application, it is common to ask the person to provide the pronunciation of the name in hiragana or katakana characters along with the kanji form of their name.
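One way an application might store a name together with its reading is sketched below in Python. The field names and the sample name are hypothetical, and the kana check is deliberately rough: it only tests that each character falls in the Hiragana or Katakana Unicode blocks.

```python
from dataclasses import dataclass


def is_kana(text):
    """Return True if every character is in the Hiragana or Katakana block."""
    return bool(text) and all("\u3040" <= ch <= "\u30ff" for ch in text)


@dataclass
class JapaneseName:
    family_kanji: str
    personal_kanji: str
    family_reading: str    # pronunciation in hiragana or katakana
    personal_reading: str  # pronunciation in hiragana or katakana

    def __post_init__(self):
        for reading in (self.family_reading, self.personal_reading):
            if not is_kana(reading):
                raise ValueError(f"reading must be written in kana: {reading!r}")


name = JapaneseName("田中", "太郎", "たなか", "たろう")
print(name.family_kanji, name.family_reading)
```

Keeping the reading in its own fields lets the application sort and search by pronunciation without guessing from the kanji.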
The Japanese commonly use their family name when talking to each other. The personal name is normally only used with close friends and children. In addition, a title is normally used with the name. There are a large number of titles depending on the gender and social position of the person you are addressing. “san” (for example Tanaka-san) is the most neutral title and can be used in most situations. However, it may not be polite enough in formal situations or customer letters where “sama” (for example Tanaka-sama) or other titles related to the specific position of the person would be a better choice.
|
|  |
|
|
On 01 May 2004 ten new countries will be joining the European Union (EU)
While many business practices will change as these countries are integrated into the EU, the adoption of the euro currency is one change that can have broad impact on computer applications.
All of these countries will join the European Monetary Union (EMU) and adopt the euro currency but not right away.
|
|
In theory, the 10 countries could adopt the euro currency as early as 2007 but they need to meet specific economic criteria in order to join the EMU. These countries will likely join the EMU at different times. One source predicts the following very rough schedule:
- Slovenia, Latvia, Lithuania, and Estonia will probably be first and could join in 2007 or 2008
- Poland in 2008 or 2009
- Hungary after 2008 but no guess at the date
- Czech Republic and Slovakia - probably 2010 or later
- No predictions have been found for Malta and Cyprus
Three other EU countries have not joined the EMU yet. While Sweden is not exempted from joining, the Swedish public voted against joining in a non-binding referendum in September 2003. The United Kingdom and Denmark are not required to join and have no plans to join at this time.
The initial 12 countries that joined the EMU went through a transition period when the traditional currency and the euro currency were both official. The new countries may choose to have a transition period or may choose to switch over on one particular day.
During a transition period when two currencies are in use, it is essential to present a number with the right currency indicator and not mix them up. Once a currency amount is entered into a system, it should be accompanied by a currency code in all processing. This currency code is independent of the user's locale. If a currency code identifier is not present, the software has to guess the currency code based on the current user's locale.
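Keeping the currency code with the amount can be as simple as storing the two together. The Python sketch below is one possible shape, not a prescribed design: the `Money` class is hypothetical, and the codes used (EUR for the euro, SIT for the Slovenian tolar) are ISO 4217 codes chosen as examples.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class Money:
    amount: Decimal
    currency: str  # 3-letter currency code stored with the amount, not taken from the locale

    def __add__(self, other):
        # Refuse to mix currencies instead of silently adding the numbers.
        if self.currency != other.currency:
            raise ValueError(f"cannot add {self.currency} and {other.currency}")
        return Money(self.amount + other.amount, self.currency)


total = Money(Decimal("10.00"), "EUR") + Money(Decimal("5.50"), "EUR")
print(total.amount, total.currency)

try:
    Money(Decimal("10.00"), "EUR") + Money(Decimal("2400"), "SIT")
except ValueError as error:
    print(error)
```

Because the code travels with the amount, a user in any locale sees the same currency, and accidental mixing during a transition period fails loudly.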
|
|  |
|
|
Among most nations of the world today, the seven-day week is universal. This has not always been true. For example,
|
 |
|
- The Romans had an 8-day week.
- The Mayans had 13-day and 20-day weeks in a complex calendar system.
- The short-lived French Revolutionary calendar experimented with 10-day weeks in the late 18th century.
- The Soviet Union experimented for a short time with 5-day and 6-day weeks during the 20th century.
The idea of a day off work at the end of the week (a weekend) comes from religious tradition. Over time, the concept of the weekend has been adopted by other nations and, in some cases, extended to two days instead of one. The weekend in predominantly Christian countries tends to be on Saturday and Sunday. The weekend in Israel and Egypt is Friday and Saturday while in Saudi Arabia it is Thursday and Friday. Other countries, such as Japan, have adopted the Saturday and Sunday weekend.
The system of two days off on the weekend and a 40-hour workweek is not universal, either. For example, Japan adopted the 40-hour workweek in April 1994, while China adopted it in May 1995. However, France went to a 35-hour workweek in the year 2000 while Brazil's official workweek is 44 hours and India's is 48 hours.
So, the concept of a seven-day week is universal, but the weekend still varies by country. e-business applications serving customers worldwide should be sure to consider these differences.
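An application can treat the weekend as data rather than hard-coding Saturday and Sunday. The Python sketch below uses a small lookup table whose entries are copied from the examples in the text above and may be out of date; the country codes, the table name, and the Saturday and Sunday default are assumptions made for the illustration.

```python
import datetime

# datetime.date.weekday() values: Monday is 0, Sunday is 6
THURSDAY, FRIDAY, SATURDAY, SUNDAY = 3, 4, 5, 6

# Hypothetical table built from the examples in the text.
WEEKEND_BY_COUNTRY = {
    "IL": {FRIDAY, SATURDAY},   # Israel
    "EG": {FRIDAY, SATURDAY},   # Egypt
    "SA": {THURSDAY, FRIDAY},   # Saudi Arabia
}
DEFAULT_WEEKEND = {SATURDAY, SUNDAY}  # assumed fallback


def is_weekend(day, country):
    """Return True if the date falls on the weekend for the given country code."""
    return day.weekday() in WEEKEND_BY_COUNTRY.get(country, DEFAULT_WEEKEND)


friday = datetime.date(2004, 5, 7)
for country in ("IL", "SA", "US"):
    print(country, is_weekend(friday, country))
```

Keeping the table in configuration makes it possible to update a country's weekend without changing the application code.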
|
|  |
The term "dollar" is used as the name for the official currencies in as many as 54 territories around the world. 14 of these territories use the U.S. dollar, 8 use the Australian dollar, 8 use the East Caribbean dollar, 5 use the New Zealand dollar, and the other 19 territories have unique "dollar" currencies.
The $ symbol is commonly used for all "dollar" currencies. In addition, the $ symbol combined with an alphabetic character is frequently used for other currencies. Some examples are the Brazilian real (R$), the Chilean peso (Ch$), and the Nicaraguan Cordoba (C$). While Portugal's currency is the euro, it uses the $ symbol as the decimal separator when writing monetary amounts.
Typically, the $ symbol is used by itself within a country but when used in the international context, it becomes ambiguous. Sometimes an alphabetic letter is combined with the symbol. For example, internally in Canada, Canadians use the $ symbol by itself for all currency figures. It is sometimes written as C$ when used on the same page as U.S. currency. However, the C$ is still ambiguous as that designation may be used for other currencies as well, such as the Nicaraguan Cordoba mentioned above. The only unambiguous designation in the international context is to use the 3-letter ISO 4217 currency codes.
The $ symbol is encoded as hex'24' in most coded character sets used on the Internet such as 7-bit ASCII, all IBM PC code pages, all Windows code pages, all ISO 8859-x series code pages, Japanese Shift-JIS, Simplified Chinese GBK, Traditional Chinese Big5, Korean code pages, Asian 2022 series code pages, and Unicode UTF-8 encodings. In EBCDIC code pages, encoding for the dollar symbol varies. It is often encoded as hex'5B' but this encoding may also apply to other currency symbols such as the Japanese Yen ¥ or the Pound Sterling £ depending upon the character set encoding being used. If care is not taken to recognize code pages and do appropriate handling, a report created on a UK EBCDIC system that uses the £ symbol with all monetary amounts will show the same monetary amounts with the $ symbol when viewed on U.S. EBCDIC systems.
Reference: CIA World Factbook
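The byte values mentioned above can be checked with the codecs that ship with Python. This is a small sketch over a hand-picked list of encodings, with `cp037` standing in for a U.S. EBCDIC code page; the UK EBCDIC case is not reproduced here.

```python
ASCII_BASED = [
    "ascii", "cp437", "cp1252", "iso8859_1",
    "shift_jis", "gbk", "big5", "euc_kr", "utf_8",
]

# In every ASCII-based encoding the dollar sign is the single byte hex 24.
for encoding in ASCII_BASED:
    print(encoding, "$".encode(encoding).hex())

# In the U.S. EBCDIC code page cp037 it is hex 5b instead.
print("cp037", "$".encode("cp037").hex())

# The same byte read under two different assumptions gives two different characters.
print(repr(b"\x5b".decode("cp037")), repr(b"\x5b".decode("ascii")))
```

The last line shows why the code page has to travel with the data: the byte alone does not say which character was meant.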
|
| Daylight Saving Time is the practice of moving the clocks one hour forward in order to provide more daylight at the end of the day during the summer months. Since this practice doesn't actually save daylight, it might better be described as shifting daylight. |
 |
|
|
Daylight Saving Time has been shown to reduce energy consumption, and people generally like it because it gives them more daylight hours in the evening during the nice summer weather.
Daylight Saving Time in the Northern Hemisphere is typically between March and October while in the Southern Hemisphere, where the seasons are opposite, Daylight Saving Time is from October to March. Since the number of hours of sunshine remains approximately the same all year for countries near the equator, Daylight Saving Time is not useful for them. Most of Africa, South America, and the Southern half of Asia, including Japan and China, do not have Daylight Saving Time.
The date for changing to Daylight Saving Time can vary.
- The European Union (EU) countries all start on the last Sunday in March and end on the last Sunday in October.
- The United States and Canada start on the first Sunday in April and end on the last Sunday in October.
- Some countries, such as Israel, establish new dates for the change each year.
During unusual economic conditions, such as war or fuel shortages, or for political reasons or festivals, countries may make changes in their Daylight Saving Time. For example, for 2 years during the fuel shortages in the mid 1970s, the United States stayed on Daylight Saving Time for a few extra months. Australia modified their Daylight Saving Time during the Olympic Games in Sydney in 2000.
For time-sensitive e-business applications, the fact that time changes on different schedules throughout the world can be an important consideration.
Reference: General information | Map of territories that have Daylight Saving Time
|
|  |
|
The majority of Internet users speak one of 10 languages. According to the CIA World Factbook, 13 countries had more than 10 million Internet users in 2002. These countries included the United States, Japan, China, United Kingdom, Germany, South Korea, Italy, Russia, France, Canada, Brazil, Taiwan, and Australia.
|
|
 |
|
This list includes all of the G8 countries plus China, Taiwan, South Korea, and Brazil. Another source, Global Reach, tracks Internet users by language. According to their statistics, the top 10 languages on the Internet are English, Chinese, Japanese, Spanish, German, Korean, French, Italian, Portuguese and Russian.
These top languages correlate directly to the countries with over 10 million Internet users (including Spanish if you consider that there are more than 17 million Spanish-speaking Internet users in the United States). The picture changes a bit when we look at Internet users as a percentage of the overall population. There are 16 countries or regions where over 50% of the population has Internet access: Iceland, Sweden, Denmark, the Netherlands, Norway, Hong Kong, United States, United Kingdom, Australia, South Korea, Switzerland, Canada, New Zealand, Finland, Taiwan, and Singapore. Again, the languages spoken in these countries or regions strongly correlate to the top Internet languages but we begin to see some Northern European languages represented by this group, specifically Swedish, Danish, Dutch, Norwegian, Finnish, and Icelandic.
|
|  |
|
|
October 2003
South Africa has more Internet users than any other country in Africa. It was reported to have over 3 million Internet users in 2002, five times more users than Egypt, despite having only about 60% of Egypt's population. South Africa has 11 official languages: Afrikaans, Ndebele, Northern Sotho (Sepedi), Southern Sotho, Swati, Tsonga, Tswana, Venda, Xhosa, Zulu, and English.
|
 |
|
|
All of the languages are written using the Latin script. Most of the languages can be written with the English character set although a few of the languages require some diacritics or accents on characters. According to the 1996 census figures, Zulu is the mother tongue of 23% of the population, followed by Xhosa (18%), Afrikaans (14%), Northern Sotho (9%) and English (9%). English is commonly used commercially, and most companies in South Africa can correspond in either English or Afrikaans.
--Source: Various including the CIA World Factbook and the Embassy of South Africa.
|
|  |
|
 |
|
The Latin script is used to write more languages than any other script in the world. Many languages are commonly written using Latin characters. These include most of the languages in Europe, North America, and South America, along with many African languages, and many languages of the island nations in the South Pacific.
Unicode provides a single character encoding mechanism for handling all variations of the Latin script. Popular 8-bit encodings such as the ISO-8859 series handle some of the different Latin variations but a single coded character set is not sufficient. You need a series of coded character sets. Since Unicode supports all the other major languages of the world in addition to the Latin-based languages, it is becoming a popular encoding for international businesses.
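The limitation of the 8-bit encodings is easy to demonstrate in Python. In this sketch the two sample words and the helper name are chosen only for illustration: one word needs a letter found in ISO-8859-1 but not in ISO-8859-2, and the other the reverse.

```python
def can_encode(text, encoding):
    """Return True if every character of text exists in the given encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


samples = {
    "Spanish": "Espa\u00f1a",         # España
    "Polish": "\u0141\u00f3d\u017a",  # Łódź
}

for language, word in samples.items():
    for encoding in ("iso8859_1", "iso8859_2", "utf_8"):
        print(language, encoding, can_encode(word, encoding))
```

Each 8-bit encoding handles one of the words but not the other, while the Unicode encoding UTF-8 handles both.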
|
|  |
|
|
Portuguese is one of the top ten most frequently spoken languages in the world.
It is a major or official language in nine countries and it is one of the eleven official languages of the European Union (EU). By far, the largest Portuguese-speaking country is Brazil with 176 million people. Brazil is the 5th largest country in the world by population.
|
|
 |
|
Other Portuguese-speaking countries with large populations include Mozambique (20 million), Angola (10 million), and Portugal (10 million). Smaller countries or regions include Guinea-Bissau, Cape Verde, and Sao Tome & Principe in Africa and East Timor and Macao, S.A.R. of China in Asia.
As with all languages, Portuguese differs in vocabulary, pronunciation, and grammar in the various places where it is spoken. Major software manufacturers commonly produce a Brazilian Portuguese translation of software. In some cases a separate translation will be produced for the European Portuguese market. Official EU documents are made available in all official languages of the EU including Portuguese. Web sites usually need to be localized for each market individually.
|
|  |
|
| There is no single standard English language. English differs from region-to-region in spelling, vocabulary, word meaning, idioms, grammar, and pronunciation. |
 |
|
|
According to the British Council, English has official or special status in over 75 countries although it isn't necessarily the majority language. With few exceptions, most of these are small island nations or territories.
English-speaking countries, other than the United States, usually follow the British system of spelling but vocabulary, word meaning, idioms, and pronunciation are often influenced by the local environment. For example, English in Canada, Australia, and India is spelled using the British system but Canadian vocabulary is heavily influenced by American English while vocabulary in Australia and India contains many regional contributions. Specific words can also have different meanings or different depths of meaning in different countries.
Besides basic English vocabulary and grammar, information units often contain elements that are local in nature. For example, a company may offer different products in different countries. Currencies, prices, payment methods, measurement systems, seasons, time zones, common ways of expressing dates and times, telephone numbers and address formats, contact information, and shipping options all differ from country-to-country. Local variations are commonly required for marketing and legal information.
Some information can work worldwide and some must be adapted to the specific market. These decisions should be made with care in order to achieve your company's objectives in the English-speaking countries of the world.
|
|  |
|
|
India officially recognizes and gives certain rights to English plus 18 Indian languages (known as the "scheduled languages"). Hindi and English are the official languages of communication of the central government. That does not mean that everyone understands these languages.
|
|
 |
|
Hindi and its various dialects are the mother tongue of 40% of the population and it is spoken as a second or third language by another 9% of the population. If someone speaks English in India, it is rarely their mother tongue. Approximately 11% of the population claim to speak English as a second or third language. A substantial minority of people, about 20%, can speak two or more languages, usually their official state language and the language of their village.
The states in India are free to adopt an official state language of administration and education from the 18 scheduled languages. Each of the 18 languages can have many local variants or dialects but there is usually an official form of each language that is used for state government communications.
Hindi is spoken mainly in the North central part of India. The other official languages are spoken in the Eastern, Western, and Southern parts of India. For example, Tamil, Telugu, Kannada, and Malayalam are languages of states in the South. Marathi, Gujarati, and Punjabi are languages of states in the West. Bengali, Oriya, and Assamese are languages of states in the East. Urdu is the 6th most commonly spoken language. It is spoken by just over 5% of the population but it is not a major language of any particular state in India.
Source: India Census 1991 (Language information from Census 2001 is not yet available)
References: Language data: 1991 Census | Three Main Languages in every State: 1991
|
|  |
|
 |
 |
|
Chinese dialects are generally written using either the simplified Chinese writing system that was introduced in the 1950s or the traditional Chinese writing system. Speakers of different Chinese languages can communicate with each other through their common writing system.
Mandarin Chinese is the most widely spoken language in the world with over a billion first or second-language speakers worldwide. About 70% of the population of China and about 90% of the population of Taiwan speak Mandarin as a first or second language. Significant populations of Mandarin speakers can also be found in Singapore, Malaysia, and Indonesia. Cantonese is the language most commonly spoken in Hong Kong and Macao.
Chinese speakers in China and Singapore generally use the simplified Chinese writing system while Chinese speakers in Hong Kong, Macao, and Taiwan use the traditional Chinese writing system. Traditional Chinese is the most commonly used writing system among Chinese communities elsewhere around the world.
—Source: Ethnologue, Languages of the World, 14th edition
|
|  |