The International Components for Unicode (ICU) is a mature, portable set of C/C++ and Java libraries for software internationalization (I18N) and globalization (G11N) which implement the Unicode® Standard, giving applications the same results on all platforms.
ICU Features
As computing environments become more heterogeneous, software portability becomes more important. ICU provides robust, full-featured Unicode services on a wide variety of platforms without sacrificing performance. It is an open source project sponsored, supported, and used by IBM, and it delivers commercial-quality Unicode-based technologies. ICU supports the most current version of the Unicode® Standard, including the supplementary Unicode characters needed to support the repertoires of GB 18030, HKSCS, and JIS X 0213, and it offers great flexibility to extend and customize the supplied services, which include:
- Text: Unicode text handling, full character properties, and character set conversions (500+ code pages)
- Analysis: Unicode regular expressions; full Unicode sets; character, word and line boundaries
- Comparison: language sensitive collation and searching
- Transformations: normalization, upper/lowercase, script transliterations (50+ pairs)
- Locales: comprehensive data (230+) and resource bundle architecture
- Complex Text Layout: Arabic, Hebrew, Indic and Thai
- Formatting and Parsing: multi-calendar and time zone support, dates, times, numbers, currencies, messages
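
To make a few of the services listed above concrete, here is a minimal Java sketch of normalization, language-sensitive collation, and word-boundary analysis. It is an illustration under stated assumptions rather than ICU's own sample code: it uses the JDK's built-in java.text classes so that it runs without extra dependencies, and the class name I18nSketch and its method names are hypothetical. ICU4J offers counterparts such as com.ibm.icu.text.Collator and com.ibm.icu.text.BreakIterator with a similar shape, but results can differ between the JDK and ICU because each ships its own data.

```java
import java.text.BreakIterator;
import java.text.Collator;
import java.text.Normalizer;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical demo class; uses JDK java.text, not ICU4J itself.
public class I18nSketch {

    // Transformations: compose base letters and combining marks (NFC).
    static String toNfc(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFC);
    }

    // Comparison: sort by the locale's collation rules instead of code point order.
    static List<String> sortFor(Locale locale, List<String> words) {
        Collator collator = Collator.getInstance(locale);
        List<String> copy = new ArrayList<>(words);
        copy.sort(collator);
        return copy;
    }

    // Analysis: split at word boundaries, keeping only segments
    // that begin with a letter or digit (drops spaces and punctuation).
    static List<String> words(String text, Locale locale) {
        BreakIterator it = BreakIterator.getWordInstance(locale);
        it.setText(text);
        List<String> out = new ArrayList<>();
        int start = it.first();
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            String piece = text.substring(start, end);
            if (Character.isLetterOrDigit(piece.codePointAt(0))) {
                out.add(piece);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(toNfc("e\u0301").length());
        System.out.println(sortFor(Locale.ENGLISH, Arrays.asList("banana", "apple", "Cherry")));
        System.out.println(words("Hello, world!", Locale.ENGLISH));
    }
}
```

A plain code point sort would place "Cherry" before "apple" because uppercase letters have lower code points; the collator orders by letter first and treats case as a lesser difference, which is the behavior the Comparison item refers to.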
Getting started with ICU
Find more information about ICU or see the ICU mailing list of contacts (links reside outside of ibm.com).

