In Thai, sorting by the code point values of characters produces incorrect results. The standard Thai collation rules follow the Thai Royal Institute Dictionary, 2525 B.E. edition, the official standard Thai dictionary.
Table 3: Collation results of the code point value approach compared with the Thai Royal Institute Dictionary approach.
Words are ordered alphabetically, not phonetically. Consonant weight is:
Vowels are also ordered by written forms, not by sounds. Vowel weight is:
Tonal marks and diacritics are ignored at the primary level. If two words are identical at the primary level, their tonal marks and diacritics are compared at the secondary level. Tonal mark and diacritic weight is:
Thai punctuation marks are less significant than tonal marks and diacritics. They must be ignored at both the primary and secondary levels.
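The level scheme described above can be sketched as a multi-part sort key. This is a minimal illustration, not the dictionary's actual weight tables: the character sets `TONE_AND_DIACRITICS` and `PUNCTUATION` and the function `level_keys` are assumptions introduced here, and code point order stands in for the real consonant, vowel, and tonal mark weights.

```python
# Assumed character classes, for illustration only.
# Tonal marks U+0E48..U+0E4B, plus maitaikhu (U+0E47) and thanthakhat (U+0E4C).
TONE_AND_DIACRITICS = set("\u0e47\u0e48\u0e49\u0e4a\u0e4b\u0e4c")
# Paiyannoi, maiyamok, fongman, angkhankhu, khomut.
PUNCTUATION = set("\u0e2f\u0e46\u0e4f\u0e5a\u0e5b")


def level_keys(word):
    """Return a (primary, secondary, punctuation) sort key for a Thai word.

    Primary: letters only, with tonal marks, diacritics, and punctuation removed.
    Secondary: tonal marks and diacritics, consulted only on a primary tie.
    Last: punctuation, consulted only when the first two levels tie.
    Within each level, code point order is a placeholder for real weights.
    """
    primary = [c for c in word
               if c not in TONE_AND_DIACRITICS and c not in PUNCTUATION]
    secondary = [c for c in word if c in TONE_AND_DIACRITICS]
    punctuation = [c for c in word if c in PUNCTUATION]
    return (primary, secondary, punctuation)


words = [
    "\u0e1b\u0e32\u0e01",        # po pla, sara aa, ko kai
    "\u0e1b\u0e49\u0e32",        # po pla, mai tho, sara aa
    "\u0e1b\u0e48\u0e32",        # po pla, mai ek, sara aa
    "\u0e1b\u0e32",              # po pla, sara aa
]
by_levels = sorted(words, key=level_keys)
by_code_point = sorted(words)
```

With this key, the three words that differ only in tonal mark stay together and are separated only at the secondary level, whereas plain code point sorting places the word ending in ko kai ahead of the toned forms because the tonal mark is compared against a vowel.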
Leading vowels (U+0E40 through U+0E44) are written, and stored, before the initial consonant. A Thai collation implementation must treat a leading vowel as if it followed the initial consonant, which it does by swapping the leading vowel and the consonant before string comparison.
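The swap can be sketched as a preprocessing step applied to each string before comparison. The helper names `is_thai_consonant` and `reorder_leading_vowels` are hypothetical, the consonant range U+0E01 through U+0E2E is an assumption of this sketch, and the comparison after the swap still uses plain code point order rather than the dictionary weights.

```python
LEADING_VOWELS = set("\u0e40\u0e41\u0e42\u0e43\u0e44")  # U+0E40..U+0E44


def is_thai_consonant(ch):
    # Assumption: treat every letter in U+0E01..U+0E2E as a consonant.
    return "\u0e01" <= ch <= "\u0e2e"


def reorder_leading_vowels(text):
    """Swap each leading vowel with the consonant that immediately follows it."""
    chars = list(text)
    i = 0
    while i < len(chars) - 1:
        if chars[i] in LEADING_VOWELS and is_thai_consonant(chars[i + 1]):
            chars[i], chars[i + 1] = chars[i + 1], chars[i]
            i += 2  # skip past the pair that was just swapped
        else:
            i += 1
    return "".join(chars)


words = [
    "\u0e02\u0e32",  # kho khai, sara aa
    "\u0e40\u0e01",  # sara e (leading vowel), ko kai
]
naive = sorted(words)
swapped = sorted(words, key=reorder_leading_vowels)
```

Here `naive` keeps the word beginning with kho khai first, because the leading vowel U+0E40 has a higher code point than any consonant, while `swapped` places the word whose initial consonant is ko kai first, as a dictionary would. The original strings are left unchanged; only the sort key is rearranged.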