International Components for Unicode
Developer(s) | Unicode Consortium |
---|---|
Initial release | 1999 |
Stable release | 75.1[1] / 16 April 2024 |
Repository | |
Written in | C/C++, Java |
Type | Libraries for Unicode and internationalization |
License | Unicode License |
Website | icu |
International Components for Unicode (ICU) is an open-source project of mature C/C++ and Java libraries for Unicode support, software internationalization, and software globalization. ICU is portable to many operating systems and environments, and it gives applications the same results on all platforms and across C, C++, and Java software. The ICU project is a technical committee of the Unicode Consortium and is sponsored, supported, and used by IBM and many other companies.[2] ICU has been included as a standard component of Microsoft Windows since Windows 10 version 1703.[3]
ICU provides the following services:
ICU provides more extensive internationalization facilities than the standard libraries for C and C++. ICU 75, released in April 2024, requires C++17 (up from C++11) or C11 (up from C99), depending on which language is used. ICU has historically used UTF-16, and the Java library still uses only UTF-16; the C/C++ libraries also support UTF-8,[5][6] including the correct handling of "illegal UTF-8".[7]
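The difference between UTF-16 and UTF-8 storage mentioned above can be illustrated with a minimal sketch. It uses only the JDK standard library, not ICU itself; the class name `EncodingSketch` and its helper methods are hypothetical names chosen for this example. Java strings are sequences of UTF-16 code units, so a character outside the Basic Multilingual Plane occupies two units in a Java string but four bytes in UTF-8. The last helper shows how the JDK decoder treats malformed UTF-8, which is not necessarily identical to ICU's handling.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class for illustration; JDK standard library only, no ICU calls.
class EncodingSketch {
    // Number of UTF-16 code units, the unit in which Java strings are indexed.
    static int utf16Units(String s) {
        return s.length();
    }

    // Number of bytes needed to store the same text as UTF-8.
    static int utf8Bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Number of Unicode code points, independent of the encoding form.
    static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    // Decodes bytes as UTF-8; this JDK constructor replaces malformed
    // input with U+FFFD instead of throwing an exception.
    static String decodeUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String clef = "\uD834\uDD1E"; // U+1D11E MUSICAL SYMBOL G CLEF as a surrogate pair
        System.out.println(utf16Units(clef)); // prints 2
        System.out.println(utf8Bytes(clef));  // prints 4
        System.out.println(codePoints(clef)); // prints 1
    }
}
```

Code that counts characters by `length()` therefore counts code units rather than characters, which is one reason libraries in this area expose code-point-aware operations.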
ICU 73.2 has improved significant changes for
ICU 74 "updates to Unicode 15.1, including new characters, emoji, security mechanisms, and corresponding APIs and implementations. [..] ICU 74 and CLDR 44 are major releases, including a new version of Unicode and major locale data improvements."
Older version details
ICU 72 updated to
Origin and development
After
java.text
and java.util
The Java internationalization classes were then ported to C++ and C[14] as part of a library known as ICU4C ("ICU for C"). The ICU project also provides ICU4J ("ICU for Java"), which adds features not present in the standard Java libraries. ICU4C and ICU4J are very similar, though not identical; for example, ICU4C includes a Regular Expression API, while ICU4J does not. Both frameworks have been enhanced over time to support new facilities and new features of Unicode and the Common Locale Data Repository (CLDR).
ICU was released as an open-source project in 1999 under the name IBM Classes for Unicode. It was later renamed International Components for Unicode.[15] In May 2016, the ICU project joined the Unicode Consortium as technical committee ICU-TC, and the library sources are now distributed under the Unicode License.[16]
MessageFormat
A part of ICU is the MessageFormat class, a formatting system that allows for any number of arguments to control the plural form (plural, selectordinal) or more general
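The idea of an argument that controls which form of a message is produced can be sketched with the JDK's own `java.text.MessageFormat`, the standard-library class of the same name. This is not ICU's class: the JDK version has no `plural` or `selectordinal` argument types, so the sketch falls back on the older `choice` type, which selects a sub-message by numeric range. The class name `MessageSketch`, the method `describe`, and the pattern text are hypothetical and exist only for this example.

```java
import java.text.MessageFormat;
import java.util.Locale;

// Hypothetical example class; uses the JDK's java.text.MessageFormat, not ICU's.
class MessageSketch {
    // The "choice" argument picks a sub-message by numeric range:
    //   0#  applies from 0 up to (but not including) 1,
    //   1#  applies to exactly 1,
    //   1<  applies to values greater than 1.
    static final String PATTERN =
        "{0,choice,0#No files|1#One file|1<{0,number,integer} files} found.";

    // Formats the message for the given count using a locale-neutral formatter.
    static String describe(int count) {
        MessageFormat format = new MessageFormat(PATTERN, Locale.ROOT);
        return format.format(new Object[] { count });
    }

    public static void main(String[] args) {
        System.out.println(describe(0)); // prints "No files found."
        System.out.println(describe(1)); // prints "One file found."
        System.out.println(describe(5)); // prints "5 files found."
    }
}
```

Numeric ranges like these encode English rules by hand. In ICU's pattern syntax the same decision would typically be written with a plural argument, along the lines of `{count, plural, one {# file} other {# files}}`, which lets the locale's plural rules decide which branch applies.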
Alternatives
An alternative to using ICU directly from C++ is Boost.Locale, a C++ wrapper for ICU that also allows other backends.[18] The rationale given for using it rather than ICU directly is that ICU "is absolutely unfriendly to C++ developers. It ignores popular C++ idioms (the STL, RTTI, exceptions, etc), instead mostly mimicking the Java API."[19][20] Another claim, that ICU supports only UTF-16 (and should therefore be avoided), is no longer true, since ICU now also supports UTF-8 for C and C++.[5]
See also
- Apple Advanced Typography
- Apple Type Services for Unicode Imaging
- gettext
- Graphite (smart font technology)
- NetRexx (ICU license)
- OpenType
- Pango
- Uconv
- Uniscribe
References
- ^ "Release ICU 75.1 · unicode-org/icu". Retrieved 21 April 2024.
- ^ "ICU - International Components for Unicode". site.icu-project.org. Archived from the original on 2021-08-27. Retrieved 2011-11-14.
- ^ Chen, Raymond (27 May 2021). "How can I convert between IANA time zones and Windows registry-based time zones?". The Old New Thing. Microsoft.
- ^ "Layout Engine - ICU User Guide". userguide.icu-project.org.
- ^ a b "UTF-8". ICU Documentation. Retrieved 2022-05-24.
- ^ "UTF-8 - ICU User Guide". userguide.icu-project.org. Retrieved 2018-04-03.
- ^ "#13311 (change illegal-UTF-8 handling to Unicode "best practice")". bugs.icu-project.org. Retrieved 2018-04-03.
- ^ "ICU - International Components for Unicode - ICU 73". icu.unicode.org. Retrieved 2023-09-24.
- ^ "ICU - International Components for Unicode - ICU 74". icu.unicode.org. Retrieved 2023-11-29.
- ^ "ICU - International Components for Unicode - ICU 72". icu.unicode.org. Retrieved 2023-01-24.
- ^ "ICU - International Components for Unicode - ICU 70". icu.unicode.org. Retrieved 2023-01-24.
- ^ "Download ICU 64 - ICU - International Components for Unicode". site.icu-project.org. Retrieved 2019-10-20.
- ^ Laura Werner (1999). "Getting Java ready for the world: A brief history of IBM and Sun's internationalization efforts". Archived from the original on 2021-11-17. Retrieved 2007-05-23.
- ^ "ICU User Guide". userguide.icu-project.org.
- ^ "ICU Project Management Committee". Archived from the original on 2021-08-28. Retrieved 2012-08-17.
- ^ "ICU joins the Unicode Consortium". Unicode, Inc. 2016-05-16. Retrieved 2016-08-01.
- ^ "Formatting Messages". ICU User Guide.
- ^ "Boost.Locale: Using Localization Backends". www.boost.org. Retrieved 2022-05-24.
- ^ "Boost.Locale: Design Rationale". www.boost.org. Retrieved 2022-05-24.
- ^ "ICU vs Boost Locale in C++". Stack Overflow. Retrieved 2022-05-24.