LC_COLLATE category

A collation sequence definition defines the relative order between collating elements (characters and multicharacter collating elements) in the locale. This order is expressed in terms of collation values. It assigns each element one or more collation values (also known as collation weights). The collation sequence definition is used by regular expressions, pattern matching, and sorting and collating functions. The following capabilities are provided:
  1. Multicharacter collating elements. Sequences of two or more characters can be specified to be collated as a single entity.
  2. User-defined ordering of collating elements. Each collating element is assigned a collation value defining its order in the character (or basic) collation sequence. This ordering is used by regular expressions and pattern matching and, unless collation weights are explicitly specified, also serves as the collation weight used in sorting.
  3. Multiple weights and equivalence classes. Collating elements can be assigned 1 to 6 collation weights for use in sorting. The first weight is referred to as the primary weight.
  4. One-to-many mapping. A single character is mapped into a string of collating elements.
  5. Many-to-many substitution. A string of one or more characters is mapped to another string (or to an empty string). When the replacement is an empty string, the character or characters are ignored for collation purposes.
    Note: This is an IBM extension; therefore, locales that use it may not be portable to localedef tools developed by other vendors.
  6. Equivalence class definition. Two or more collating elements have the same collation value (primary weight).
  7. Ordering by weights. When two strings are compared to determine their relative order, the two strings are first broken up into a series of collating elements. Each successive pair of elements is compared according to the relative primary weights for the elements. If they are equal, and more than one weight is assigned, then the pairs of collating elements are compared again according to the relative subsequent weights, until either two collating elements are not equal or the weights are exhausted.
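
The ordering-by-weights comparison described in item 7 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any particular C library implements collation: the `WEIGHTS` table, its weight values, the two-level limit, and the function names `split_elements` and `compare` are all hypothetical. The sketch treats "ch" as a multicharacter collating element and gives uppercase and lowercase letters the same primary weight, so they fall in the same equivalence class.

```python
# Hypothetical weight table: collating element -> (primary, secondary) weights.
# The values are invented for illustration and do not come from a real locale.
WEIGHTS = {
    "a": (1, 1),
    "A": (1, 2),   # same primary weight as "a": same equivalence class
    "b": (2, 1),
    "B": (2, 2),
    "c": (3, 1),
    "ch": (4, 1),  # multicharacter collating element, ordered after "c"
    "d": (5, 1),
}


def split_elements(text):
    """Break text into collating elements, preferring the longest match."""
    longest = max(len(key) for key in WEIGHTS)
    elements = []
    i = 0
    while i < len(text):
        for size in range(min(longest, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if candidate in WEIGHTS:
                elements.append(candidate)
                i += size
                break
        else:
            # No table entry matched at this position.
            raise ValueError(f"no collating element defined for {text[i]!r}")
    return elements


def compare(left, right):
    """Return -1, 0, or 1 by comparing weights one level at a time."""
    left_elems = split_elements(left)
    right_elems = split_elements(right)
    for level in range(2):  # primary weights first, then secondary
        left_w = [WEIGHTS[e][level] for e in left_elems]
        right_w = [WEIGHTS[e][level] for e in right_elems]
        if left_w != right_w:
            # List comparison walks successive pairs of weights.
            return -1 if left_w < right_w else 1
    return 0
```

With this table, `compare("cd", "cha")` returns -1 because "ch" is a single element whose primary weight is higher than that of "c", and `compare("ab", "Ab")` returns -1 only at the secondary level, since the primary weights of the two strings are equal.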