The Arabic script is used by speakers of Arabic in countries such as Egypt, Saudi Arabia, Kuwait, and Iraq, and by speakers of Urdu (spoken mainly in Pakistan) and Farsi, or Persian (spoken mainly in Iran). Arabic script is cursive by nature: when presented, the characters appear as if handwritten. In most situations the characters in a word are connected to one another, and each may assume a different shape depending on its position in the word and on the connectivity properties of the character itself and of its neighbors. This is the only way in which Arabic script is presented, whether in print or on screen. As a consequence, an Arabic character can assume up to four different shapes depending on its location.

The four possible shapes are:

  • Initial, when the character is at the beginning of a word and connected to the succeeding character only
  • Middle, when the character is in the middle of a word and connected to both preceding and succeeding characters
  • Final, when the character is at the end of a word and connected to the preceding character only
  • Isolated, when the character is alone or not connected to the preceding or succeeding character
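
The four cases above reduce to two yes/no questions: is the character connected to the one before it, and is it connected to the one after it? A minimal sketch of that decision follows. The function name `choose_form` and its parameters are hypothetical, chosen only for illustration.

```python
def choose_form(joins_previous: bool, joins_next: bool) -> str:
    """Return the shape name for a character given its connections.

    joins_previous: the character is connected to the preceding character.
    joins_next: the character is connected to the succeeding character.
    """
    if joins_previous and joins_next:
        return "middle"
    if joins_previous:
        return "final"
    if joins_next:
        return "initial"
    return "isolated"


# A character at the start of a word that connects only to what follows:
print(choose_form(joins_previous=False, joins_next=True))
```

The sketch only names the shape. Deciding whether two neighboring characters actually connect depends on the connectivity properties of both characters.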

Example of the four possible shapes of the Arabic character Ghain (U+063A):

  • Initial form: ﻏ (Arabic Letter Ghain Initial Form, U+FECF)
  • Middle form: ﻐ (Arabic Letter Ghain Middle Form, U+FED0)
  • Final form: ﻎ (Arabic Letter Ghain Final Form, U+FECE)
  • Isolated form: ﻍ (Arabic Letter Ghain Isolated Form, U+FECD)
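
The relationship between Ghain and its four shaped code points can be checked against the Unicode character database. The sketch below uses Python's standard `unicodedata` module. The helper name `ghain_forms` is hypothetical. Note that Unicode tags the middle shape as `medial`.

```python
import unicodedata

GHAIN = "\u063A"  # unshaped Arabic letter Ghain


def ghain_forms() -> dict:
    """Map each form tag to the code point of the shaped Ghain."""
    forms = {}
    for cp in range(0xFECD, 0xFED1):  # U+FECD through U+FED0
        # Each decomposition looks like "<initial> 063A".
        tag, base = unicodedata.decomposition(chr(cp)).split()
        if chr(int(base, 16)) == GHAIN:
            forms[tag.strip("<>")] = f"U+{cp:04X}"
    return forms


for form, code_point in ghain_forms().items():
    print(form, code_point)
```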


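Some Arabic characters, for example Alef, Dal, Reh, and Waw, never connect to the character that follows them (the character on their left, since Arabic is written from right to left). As a result, not all Arabic characters have all four shapes. Arabic characters may be stored in unshaped form, but they must be shaped before any presentation.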
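
As a rough illustration of shaping stored, unshaped text before presentation, the sketch below builds a table of shapes from the Unicode compatibility decompositions in the Arabic Presentation Forms-B block and then picks a shape for each character. It is a simplification under stated assumptions, not a complete shaping algorithm: the names `build_form_table` and `shape` are hypothetical, and the code ignores diacritics, Lam-Alef ligatures, and explicit joining control characters. A character is treated as able to connect to its successor only if the table holds an initial shape for it.

```python
import unicodedata

FORM_TAGS = ("<isolated>", "<final>", "<initial>", "<medial>")


def build_form_table() -> dict:
    """Map each unshaped letter to its available shapes."""
    table = {}
    for cp in range(0xFE70, 0xFEFD):  # Arabic Presentation Forms-B
        parts = unicodedata.decomposition(chr(cp)).split()
        # Keep single-letter entries such as "<initial> 063A";
        # skip ligatures, spacing marks, and unassigned code points.
        if len(parts) == 2 and parts[0] in FORM_TAGS:
            base = chr(int(parts[1], 16))
            table.setdefault(base, {})[parts[0].strip("<>")] = chr(cp)
    return table


def shape(text: str, table: dict) -> str:
    """Replace each unshaped letter with its positional shape."""
    out = []
    for i, ch in enumerate(text):
        forms = table.get(ch)
        if forms is None:  # not an Arabic letter in the table
            out.append(ch)
            continue
        prev = table.get(text[i - 1], {}) if i > 0 else {}
        nxt = table.get(text[i + 1], {}) if i + 1 < len(text) else {}
        # A connection needs a left side that can join forward
        # and a right side that can join backward.
        joins_prev = "initial" in prev and "final" in forms
        joins_next = "initial" in forms and "final" in nxt
        if joins_prev and joins_next:
            key = "medial"
        elif joins_prev:
            key = "final"
        elif joins_next:
            key = "initial"
        else:
            key = "isolated"
        out.append(forms.get(key, ch))
    return "".join(out)


table = build_form_table()
# Ghain followed by Dal: Ghain takes its initial shape, Dal its final shape.
print([f"U+{ord(c):04X}" for c in shape("\u063A\u062F", table)])
# Dal followed by Ghain: Dal never connects forward, so both stay isolated.
print([f"U+{ord(c):04X}" for c in shape("\u062F\u063A", table)])
```

The table also makes the point about missing shapes visible: the entry for Dal (U+062F) holds only an isolated and a final shape, which is why Dal cannot connect to the character that follows it.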