Introduction to bidirectional languages

Hebrew script

The Hebrew script is used to write Hebrew as well as other languages and dialects, such as Yiddish and Ladino.

Hebrew is a beautiful language
In English this translates to, 'Hebrew is a beautiful language.'




Hebrew is the main language spoken in Israel, a Middle Eastern country of 6 million people bordering on Lebanon, Syria, Jordan, and Egypt. Hebrew is spoken by 95% of the population and is used in all types of writing, from religious to technical. Much of the population uses other languages as well: Arabic, the other official language, is spoken by 30%; English by 70-80%; and Russian by 20%. Many other languages are also used, owing to the immigration of Jewish people to Israel from countries around the world.

The Hebrew language is quite old, dating back to biblical times. It remained relatively unchanged until the end of the nineteenth century, when the birth of modern Hebrew took place. Since that time, the Academy of the Hebrew Language has extended traditional Hebrew to include words for objects or concepts not previously covered by the language.

The use of Hebrew in data processing creates some special problems. The majority of data processing professionals have a working understanding of technical English, which is generally regarded as a prerequisite for a career in this field. Moreover, all of the widely used programming languages are in English, as is virtually all reference material. End users, however, expect to interact with the computer in their everyday language, so all user instructions, help screens, and messages should be written in Hebrew.

Technical characteristics
This section provides information about:

Alphabet characteristics
The Hebrew alphabet uses 27 characters to represent 22 consonants; the disparity arises because five consonants take a different shape at the end of a word. These characters are listed below in Figure 27. The five final-form shapes are marked with the words 'Final Form' in parentheses following their names.
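The character count can be checked against Unicode. The following Python sketch assumes that the 27 letters are the code points U+05D0 through U+05EA and that the final forms can be recognized by the word FINAL in their Unicode character names; the names final_form_map, FINAL_TO_REGULAR, and CONSONANT_COUNT are illustrative only.

```python
import unicodedata

# Assumption: the 27 letters are the Unicode code points U+05D0..U+05EA.
HEBREW_LETTERS = [chr(cp) for cp in range(0x05D0, 0x05EB)]


def final_form_map():
    """Map each final-form letter to its regular form, using Unicode names."""
    mapping = {}
    for ch in HEBREW_LETTERS:
        name = unicodedata.name(ch)  # e.g. "HEBREW LETTER FINAL KAF"
        if " FINAL " in name:
            # Dropping the word FINAL gives the name of the regular letter.
            mapping[ch] = unicodedata.lookup(name.replace(" FINAL ", " "))
    return mapping


FINAL_TO_REGULAR = final_form_map()
CONSONANT_COUNT = len(HEBREW_LETTERS) - len(FINAL_TO_REGULAR)
```

A mapping of this kind may be useful when a search should treat a final form and its regular form as the same consonant.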

Long vowels, short vowels, and diacritical marks
Vowels are represented in two ways:

The diacritical marks are often omitted, so the reader must infer the vowel sounds of a word from the context. The marks are used in the lower school grades and in poetry, but modern Hebrew gets along without them. Information processing environments have generally disregarded them because implementing them is considered relatively difficult and the need for them is not perceived as critical. However, demand for this support is increasing, because diacritics are in fact necessary components of Hebrew words: identical character strings may have different meanings when vowels are added. For example, adding vowels to the root SFR gives the following results:

Meaning        Root with vowel (from right to left)
he counted     RaFaS
border land    RaFS
book           ReFeS
he told        RePiS
barber         RaPaS
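The ambiguity can be reproduced in a few lines of Python. This is a minimal sketch, assuming that the vowel points are encoded as Unicode combining marks (category Mn); the two pointed spellings are supplied here for illustration and the function name is hypothetical.

```python
import unicodedata


def strip_diacritics(text):
    """Remove combining marks (category Mn), which include the Hebrew vowel points."""
    return "".join(ch for ch in text if unicodedata.category(ch) != "Mn")


# Two pointed words built on the root SFR, written with explicit code points.
SEFER = "\u05e1\u05b5\u05e4\u05b6\u05e8"  # "book"
SAFAR = "\u05e1\u05b8\u05e4\u05b7\u05e8"  # "he counted"

unpointed_sefer = strip_diacritics(SEFER)
unpointed_safar = strip_diacritics(SAFAR)
```

Both results are the same three-consonant string, which is why the reader must rely on context once the marks are removed.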

The addition of vowels becomes especially important when reading a sentence. Reading a Hebrew sentence without vowels is a two-phase process: first, the grammatical structure is analyzed, and then vowel sounds are mentally added for comprehension.

If material contains words that might be misconstrued, the ambiguous words are sometimes written with diacritical marks. The same is done for words whose pronunciation may be unknown to the reader, such as words transliterated from a foreign language. When diacritical marks are not used (which happens frequently in order to simplify both writing and printing), some consonants have a double function and can be used as substitutes for vowels. It is up to the reader to distinguish when these letters are being used as consonants and when they are used as vowels.

The following shows the same Hebrew text, the first verse of the Bible, without and with diacritical marks:

First line of Bible without diacritical marks

First line of Bible with diacritical marks

Character set considerations
Single-byte character sets are sufficient for storing and interchanging bilingual English-Hebrew text.
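As an illustration, the Python sketch below encodes a mixed Hebrew and English string with one byte per character. It assumes ISO 8859-8 as the single-byte character set; the text above does not name a specific code page, so the codec choice and the function name are assumptions.

```python
# Assumption: ISO 8859-8 stands in for "a single-byte character set".
SINGLE_BYTE_CODEC = "iso8859_8"


def encode_bilingual(text):
    """Encode unpointed Hebrew and English text, one byte per character."""
    return text.encode(SINGLE_BYTE_CODEC)


# The Hebrew word "shalom" followed by its Latin transliteration.
sample = "\u05e9\u05dc\u05d5\u05dd shalom"
encoded = encode_bilingual(sample)
```

This particular code page has no positions for the diacritical marks, so pointed text cannot be encoded with it.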

Fonts
Character Cell Size
For data processing applications, a character cell that can represent English-language characters (including lowercase characters) is acceptable. For text processing, users in Israel expect high-quality output and a proportionally spaced font wherever one would be appropriate for English, in addition to non-proportional fonts.

Size and Shape of Hebrew Characters
Most Hebrew characters have square shapes that fit perfectly into a character cell; a few have descenders and only one has an ascender. The widths range from narrow to wide, but no letter is too wide to fit into one character box.

As in English, numerous fonts are used for printing the language. In printed religious material, some letters are printed at varying widths in order to fill out the line because gaps and hyphenation are not permitted.
Written Hebrew has no equivalent to capital letters. Nevertheless, large letters are sometimes used at the beginning of paragraphs or in headings for highlighting or emphasis.

The five letters that take on a special shape if they are located at the end of a word are treated as different characters by data processing applications and usually do not cause any hardware or software problems.

Punctuation and special characters
Figure 24 lists the special characters that have a directional meaning, as used in English and in Hebrew:

Punctuation marks and special characters

Other special characters are used as in English.
When these special characters are displayed, they should be shown in the proper context (either left-to-right or right-to-left). In the data stream, however, it is convenient always to use the same code to represent the same function (for example, an open parenthesis). The English code is used as the standard. This means that when the characters are transferred from the data stream to the display, they must be swapped if the context is Hebrew.
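The swap can be sketched in Python as follows. The pairs listed are an assumed subset of the directional characters (Figure 24 remains the authoritative list), and the function name is hypothetical.

```python
# Assumed subset of the directional characters; Figure 24 is the authoritative list.
MIRROR_PAIRS = {
    "(": ")", ")": "(",
    "[": "]", "]": "[",
    "{": "}", "}": "{",
    "<": ">", ">": "<",
}


def glyph_for_display(code, context_is_hebrew):
    """Return the glyph to draw for a character code taken from the data stream."""
    if context_is_hebrew:
        # The data stream always holds the English code; swap only on display.
        return MIRROR_PAIRS.get(code, code)
    return code
```

Because the swap happens only at display time, the stored code for a given function, such as an open parenthesis, is the same in both contexts.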

Keyboard requirements
Numeric Pad
The following diagram illustrates the numeric keypads used in Hebrew keyboards:

Hebrew and Western keypads

Language layer selection
Hebrew keyboards are usually bilingual. The Hebrew alphabet layout is generally accompanied by the United States Latin alphabet layout. The Hebrew characters are engraved on the right side of the keytop, and the Latin characters are on the left.

The group of all characters engraved on the left side is called the English keyboard layer or the English Graphic layer. The group of all characters engraved on the right side is called the Hebrew keyboard layer or the Hebrew Graphic layer. Some key tops have just one set of characters engraved on them; these characters belong to both layers. The right half or the left half of the key tops can be selected manually by using the Alt and right-shift or Alt and left-shift keys in combination. When invoked, the language layer selection performs as follows:

The language selection indicator may be an icon, the language name, an arrow, etc. The default language on the keyboard may be set automatically for the user depending on the way the screen, window, line, or field is defined. If the user is working on a left-to-right screen and enters a field that is defined as right-to-left, the Hebrew layer is activated automatically. The converse is also true.

Character lock
There are two types of character locks: shift lock and caps lock. Shift lock holds all keys in upper shift, whereas caps lock holds only the alphabetic keys in uppercase, but not the special characters.

Processing considerations
Data Entry
Operators normally enter data in a "logical" way; that is, in spelling order. They expect the data to be progressively updated so that it always presents its correct bidirectional aspect.

Printing and displaying direction
Printing and display systems are expected to present traditionally readable text; that is, to interpret the data stream and produce its bidirectional aspect. The transition time during which a printer produces a line or a page, or a display presents one field of data, is relatively unimportant; the physical printing or displaying direction has no bearing on the readability of the finished line.

Data storage orientation
Users in Israel need to be able to store data in both orientations, mainly for compatibility with existing databases, which may be left-to-right, right-to-left, or both. Stored text may be of the logical (or implicit) type, held in the order in which it is typed and read, or of the visual type, held in the physical order in which it appears on the display or page.
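The difference between the two text types can be illustrated with a deliberately simplified Python sketch. It converts one right-to-left line from logical to visual order by reversing everything except runs of Latin letters and European digits. It is not the Unicode Bidirectional Algorithm: it ignores mirrored punctuation, combining marks, and nested directions, and the function name is hypothetical.

```python
import re

# Assumption: Latin letters and digits, optionally joined by a single space or
# common separator, form one left-to-right run that keeps its internal order.
LTR_RUN = re.compile(r"[A-Za-z0-9]+(?:[ .,:/-][A-Za-z0-9]+)*")


def logical_to_visual_rtl(line):
    """Convert one right-to-left line from logical order to visual order."""
    runs = []
    pos = 0
    for match in LTR_RUN.finditer(line):
        if match.start() > pos:
            # Right-to-left text between LTR runs: reverse its characters.
            runs.append(line[pos:match.start()][::-1])
        runs.append(match.group())  # LTR run: characters stay in order.
        pos = match.end()
    if pos < len(line):
        runs.append(line[pos:][::-1])
    # The runs themselves are laid out from right to left.
    return "".join(reversed(runs))
```

For a logical line made of three Hebrew letters, the number 123, and two more Hebrew letters, the visual string starts with the last Hebrew word reversed, keeps 123 intact, and ends with the first word reversed.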

Text space requirements
Hebrew usually needs fewer letters per word than English to express the same idea. This is because of the omission of vowels, the use of single-letter suffixes and prefixes to represent words, and the use of abbreviations. As a result, Hebrew generally needs less space than English for the same material and fewer words per sentence. When, however, an English abbreviation is used, Hebrew may not have an equivalent abbreviation and may need more space than English in this case.

Spelling and hyphenation
Spelling
Hebrew words are generally derived from a three-letter fundamental root by the addition of some combination of prefixes, suffixes, or infixes. Certain prepositions, conjunctions, and articles (such as "to", "about", "from", "and", "than", and "the") are added as prefixes to the beginning of a word. More than one of these prefixes may be added to a single word.

For example, expressing the phrase "and when the rain falls" in Hebrew involves adding the prefixes "and", "when", and "the" to the noun "rain". In addition to complicating spelling verification, this makes any text retrieval based on an alphabetic search of Hebrew words more difficult.
Some Hebrew letters may be used as vowels in text without diacritic marks. This use is optional; these letters may or may not appear within words, according to the preference of the author. However, the usage is expected to be consistent within the text. As a result, there may be more than one correct spelling for many words.
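One way a retrieval system might cope with stacked prefixes is to generate candidate stems and look each of them up in its index. The Python sketch below is illustrative only: the set of prefix letters is an assumption, the function name is hypothetical, and real spelling verification would need a lexicon to reject stems that merely happen to begin with a prefix letter.

```python
# Assumption: an illustrative set of one-letter prefixes
# (vav, he, bet, lamed, mem, kaf, shin); not a complete grammar.
PREFIX_LETTERS = set("\u05d5\u05d4\u05d1\u05dc\u05de\u05db\u05e9")


def candidate_stems(word, max_prefixes=4, min_stem=2):
    """Return the word plus every form obtained by peeling leading prefix letters."""
    candidates = [word]
    current = word
    for _ in range(max_prefixes):
        if len(current) > min_stem and current[0] in PREFIX_LETTERS:
            current = current[1:]
            candidates.append(current)
        else:
            break
    return candidates


# "and when the rain" written as a single word, using explicit code points.
AND_WHEN_THE_RAIN = "\u05d5\u05db\u05e9\u05d4\u05d2\u05e9\u05dd"
RAIN = "\u05d2\u05e9\u05dd"
stems = candidate_stems(AND_WHEN_THE_RAIN)
```

The candidate list for the single word meaning "and when the rain" contains the bare noun "rain" as its last entry, along with the intermediate forms.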

Hyphenation
Hyphenation in Hebrew is simpler than in English. Since almost all letters represent a full syllable, words can be hyphenated almost anywhere. Quality printing, however, avoids hyphenation because most Hebrew words are short and do not need to be broken.

Calendars, week, date, and time
Calendars
Israel uses both the Gregorian and the Hebrew calendars. The Hebrew calendar numbers its years from the Gregorian calendar year 3761 B.C.E. A regular year has 12 lunar months; in 7 years of every 19-year cycle, a leap month of 30 days is added. Thus, a regular year can have 353, 354, or 355 days, and a leap year can have 383, 384, or 385 days.
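The leap-year cycle can be expressed compactly. The Python sketch below uses the customary placement of the leap years in the 19-year cycle (years 3, 6, 8, 11, 14, 17, and 19), a detail that is not stated in the text above; the function names are illustrative.

```python
def is_hebrew_leap_year(year):
    """True for the 7 leap years of each 19-year cycle (3, 6, 8, 11, 14, 17, 19)."""
    return (7 * year + 1) % 19 < 7


def months_in_hebrew_year(year):
    """A leap year has 13 months; a regular year has 12."""
    return 13 if is_hebrew_leap_year(year) else 12
```

Any 19 consecutive Hebrew years contain exactly 7 leap years under this rule. The exact length of a year in days depends on further calendar rules that this sketch does not model.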

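For instance, the Hebrew year 5719 began on the Gregorian date September 15, 1958 C.E. A simple algorithm subtracts 3,760 from the Hebrew year to get the (approximate) Gregorian year. The result is approximate because the Hebrew year begins in the autumn: 5719 - 3,760 = 1959 is the Gregorian year in which most of 5719 falls, whereas its first months fall in 1958.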
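The subtraction can be written so that it returns both Gregorian years that a Hebrew year touches. This is a minimal Python sketch for years in the Common Era; the function and constant names are illustrative.

```python
HEBREW_TO_GREGORIAN_OFFSET = 3760


def gregorian_years_spanned(hebrew_year):
    """Return (start_year, end_year): the Gregorian years a Hebrew year spans."""
    end_year = hebrew_year - HEBREW_TO_GREGORIAN_OFFSET
    # The Hebrew year begins in the autumn of the preceding Gregorian year.
    return end_year - 1, end_year
```

For 5719 the function returns 1958 and 1959, which matches the September 1958 start date given above.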

Week
In Israel, the working week begins on Sunday, which is considered the first day of the week. Saturday is the official day of rest and ends the week.

Date format
Figure 25 shows the preferred and alternative formats for recording the Gregorian date. The example uses the nineteenth day of the tenth month of 1995.

Date format table

Time Format
Figure 26 shows the format for recording the time. The time in the example is 42 minutes and 59 seconds past 10 o'clock in the evening.

Figure 26. Time format table.
Time Format Preferred Range of Hours
22:42:59 0 - 23
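A 24-hour time in this format can be produced with the Python standard library. The sketch assumes that hours below 10 are zero-padded, which the figure does not show; the function name is illustrative.

```python
from datetime import time


def format_time_24h(t):
    """Format a time as HH:MM:SS on a 24-hour clock (hours 0 through 23)."""
    # Assumption: hours below 10 are zero-padded.
    return t.strftime("%H:%M:%S")


example = format_time_24h(time(22, 42, 59))
```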

Numbers
In Israel, the decimal system is used, and numbers are written from left to right, using Arabic numerals, just as they are in English. In a right-to-left Hebrew sentence, a numeric value is a self-contained entity for which presentation (and reading) direction is reversed to left to right.

Number rounding
Number rounding in Israel is done as follows (where the period [.] represents the decimal separator):

xx.0 through xx.4, round to xx
xx.5 through xx.9, round to xx + 1

For example:
123.4 rounds to 123
123.7 rounds to 124

Numbers with two decimal positions round according to the following (where the period [.] represents the decimal separator):

xx.yy0 through xx.yy4, round to xx.yy
xx.yy5 through xx.yy9, round to xx.yy + .01

For example:
123.454 rounds to 123.45
123.457 rounds to 123.46
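These rules correspond to rounding half up. Python's built-in round() rounds ties to the nearest even digit instead, so the sketch below uses the decimal module. The function name is illustrative, and the treatment of negative numbers (ties round away from zero here) is an assumption, because the rules above cover only positive values.

```python
from decimal import Decimal, ROUND_HALF_UP


def round_half_up(value, places=0):
    """Round so that a trailing 5 through 9 always rounds up in magnitude."""
    quantum = Decimal(1).scaleb(-places)  # 1 for places=0, 0.01 for places=2
    # Going through str() avoids binary floating-point artifacts.
    return Decimal(str(value)).quantize(quantum, rounding=ROUND_HALF_UP)
```

Passing the values as strings, such as round_half_up("123.457", 2), keeps the input exact.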

Equations
In Israel, the notation for scientific and mathematical equations is the same as in English. Variable names generally are expressed with English letters.

Percent symbol
The usual format used to indicate percentages is the same as in English: 37 percent is written as 37%.

Currency
The local name for the currency used in Israel is the New Shekel. Monetary amounts are written with two decimal positions. Figure 28 shows the characters used as thousands and decimal separators, the length of the currency field, the formats used to indicate positive and negative amounts of money, and the symbol used to pad the currency format (for example, $***123.45).

Figure 28. Currency format table
Thousand Separator       Comma (,)
Decimal Separator        Period (.)
Padding Character        Asterisk (*)
Positive Format          NIS123.45 (Note 1)
Negative Format          NIS 123.45
Currency Field Length    17 (Note 2)

Notes

  1. NIS represents the national currency indicator, like the $ symbol in the United States.
  2. Because of inflation in Israel, the national currency has undergone several changes in recent years. Although the inflation rate is now relatively low, the currency field length may change in the future.
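The positive format and the padding rule can be sketched in Python as follows. Two assumptions are made that the figure does not settle: the field length of 17 is taken to include the NIS indicator, and the padding is placed between the indicator and the digits, as in the $***123.45 example. Only positive amounts are handled, and the function names are illustrative.

```python
from decimal import Decimal

CURRENCY_INDICATOR = "NIS"
FIELD_LENGTH = 17  # Assumption: the length includes the currency indicator.


def format_amount(amount):
    """Positive format: indicator, comma thousands separator, two decimals."""
    return f"{CURRENCY_INDICATOR}{Decimal(amount):,.2f}"


def format_padded(amount, width=FIELD_LENGTH):
    """Fill the field with asterisks between the indicator and the digits."""
    digits = f"{Decimal(amount):,.2f}"
    fill = max(width - len(CURRENCY_INDICATOR) - len(digits), 0)
    return CURRENCY_INDICATOR + "*" * fill + digits
```

Amounts are passed as strings or Decimal values so that no binary floating-point error reaches the formatted result.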

Weights and measurements
Israel uses the metric system (Système International d'Unités) for weights and measurements.