Hong Kong Supplementary Character Set
MIME / IANA | Big5-HKSCS |
---|---|
Alias(es) | big5hk, csBig5HKSCS |
Language(s) | Big5 ETen |
The Hong Kong Supplementary Character Set (香港增補字符集; commonly abbreviated to HKSCS) is a set of
It evolved from the preceding Government Chinese Character Set (政府通用字庫) or GCCS. GCCS is a set of supplementary
History
Due to the inherent differences between
The Government Chinese Character Set (政府通用字庫) or GCCS was thus developed by the government. The character set consists of Chinese characters commonly used in Hong Kong. Some characters are
Subsequently, the HKSCS-1999 (HKSCS 1999 specification) was developed. Following its acceptance, newer revisions were released in 2001 (adding 116 new characters) and in 2004 (adding 123 new characters), totalling 4,941 characters. 106 GCCS characters were removed in HKSCS-1999 as a result of unification, and their Big5 code points are reserved for compatibility.[2][3] Retired "not verifiable" GCCS characters are found in UTC Sources (UTC-00877–UTC-00898),[4] where they are sourced from Adobe-CNS1-1,[5] an Adobe-CNS1 supplement implemented to support GCCS.[6]
The HKSCS is encoded in
As in Hong Kong, Macao needs characters that are included in neither Big5 nor HKSCS. The Macao Supplementary Character Set (MSCS) was therefore developed, building on HKSCS with additional Unicode-mapped characters. The first batch of 121 MSCS characters was submitted for addition to, or horizontal extension in, Unicode (as appropriate) in 2009,[10] and the first final version of MSCS was established in 2020.[11]
Versions
Since its predecessor GCCS, the HKSCS has been published in five editions.[12]
Version | Total characters | Publication date |
---|---|---|
GCCS | 3,049 | 1995 |
HKSCS-1999 | 4,702 | 09/1999 |
HKSCS-2001 | 4,818 | 12/2001 |
HKSCS-2004 | 4,941 | 05/2005 |
HKSCS-2008 | 5,009 | 12/2009 |
HKSCS-2016 | 5,033 | 05/2017 |
The last edition of HKSCS to encode all of its characters in Big5 was HKSCS-2008, while the characters added in HKSCS-2016 are mapped to Unicode only (as a CJK Unified Ideographs horizontal glyph extension where appropriate).[11]
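The totals in the version table can be cross-checked against the per-revision additions quoted in the History section (116 in 2001, 123 in 2004). A minimal Python sketch follows; the figures are copied from the table, and the variable and function names are illustrative only:

```python
# Total character counts copied from the version table.
TOTALS = [
    ("GCCS", 3049),
    ("HKSCS-1999", 4702),
    ("HKSCS-2001", 4818),
    ("HKSCS-2004", 4941),
    ("HKSCS-2008", 5009),
    ("HKSCS-2016", 5033),
]


def net_growth(totals):
    """Return {version: net change in total versus the previous edition}."""
    return {
        name: count - prev_count
        for (_, prev_count), (name, count) in zip(totals, totals[1:])
    }


growth = net_growth(TOTALS)
for version, delta in growth.items():
    print(f"{version}: {delta:+d}")
```

The values for HKSCS-2001 and HKSCS-2004 match the 116 and 123 additions stated in the text. The HKSCS-1999 value is only a net change, because 106 GCCS characters were removed in that edition.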
Compatibility
Operating systems
In
IBM assigns a separate CCSID to each HKSCS edition of its Big5 code page, all using CPGID 1374 as the double-byte component: CCSID 5471 for HKSCS-2001 (double-byte component CCSID 5470),[18][19] CCSID 9567 for HKSCS-2004 (double-byte component CCSID 9566),[20] and CCSID 13663 for HKSCS-2008 (double-byte component CCSID 13662).[21] CCSID 1375 (double-byte component CCSID 1374) is assigned to a growing HKSCS code page, currently equivalent to CCSID 13663.[22]
HKSCS support was added to glibc in 2000, but it has not been updated since then. HKSCS-2004 support is handled as Unicode 4.1 and later. On freedesktop.org-based setups, the AR PL ShanHeiSun Uni font has fully supported HKSCS-2004 since version 0.1-0.dot.1, with the latest revision of HKSCS-2004 supported from version 0.1.20060903-1. Modern desktop distributions (e.g. Ubuntu) include Arphic Technology's HKSCS-compliant UKai and UMing fonts out of the box when Traditional Chinese language support is selected during installation. The fonts can also be installed manually later.
Applications and the Web
Mozilla 1.5 and above supports HKSCS, with HKSCS-2004 support added in the Gecko 1.8.1 code base.[23] Unlike the above-mentioned patch, Mozilla uses its own code page table. However, the fix for bug 343129 does not support characters mapped to code points above the Basic Multilingual Plane.[24]
GNOME supports HKSCS characters in Unicode ranges, except those mapped to the Basic Multilingual Plane compatibility block. Patches to support characters mapped above the Basic Multilingual Plane were introduced during Pango 1.1.[26]
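Code points above the Basic Multilingual Plane (U+10000 and up) do not fit in a single 16-bit unit and need a surrogate pair in UTF-16, so software that assumes one unit per character mishandles them. The Python sketch below illustrates that check. U+20000 is used purely as an illustrative supplementary-plane code point and is not claimed to be an HKSCS character; the helper names are hypothetical:

```python
from typing import List


def is_bmp(ch: str) -> bool:
    """True if the single character lies in the Basic Multilingual Plane."""
    return ord(ch) <= 0xFFFF


def utf16_code_units(ch: str) -> List[int]:
    """Return the 16-bit code units used to encode the character in UTF-16."""
    data = ch.encode("utf-16-be")
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]


bmp_char = "\u9999"                # a BMP ideograph
supplementary_char = "\U00020000"  # illustrative code point in plane 2

for ch in (bmp_char, supplementary_char):
    units = " ".join(f"{u:04X}" for u in utf16_code_units(ch))
    print(f"U+{ord(ch):04X} bmp={is_bmp(ch)} utf16=[{units}]")
```

A renderer or converter that handles only the first case will split or drop characters of the second kind.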
The WHATWG Encoding Standard (used by HTML5) includes HKSCS in its definition of Big5 (used even with the plain Big5 label). However, only its decoder uses all HKSCS extensions, while its encoder explicitly excludes those with lead bytes below 0xA1 (thus excluding most of the HKSCS extensions but including, for example, those inherited from Big5 ETEN).[27] Newer browsers follow this standard, including Firefox.
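The asymmetry between decoder and encoder comes down to a lead-byte check. The sketch below follows the pointer-to-bytes arithmetic of the Encoding Standard's Big5 encoder (157 trail positions per lead byte, with lead bytes starting at 0x81). The function and constant names are hypothetical, and the index lookup table that maps code points to pointers is omitted:

```python
from typing import Tuple

FIRST_LEAD = 0x81         # lowest lead byte covered by the Big5 index
ENCODER_MIN_LEAD = 0xA1   # the encoder skips entries with a lower lead byte
TRAILS_PER_LEAD = 157     # trail-byte positions available per lead byte


def pointer_to_bytes(pointer: int) -> Tuple[int, int]:
    """Convert an index pointer into its (lead, trail) byte pair."""
    lead = pointer // TRAILS_PER_LEAD + FIRST_LEAD
    trail = pointer % TRAILS_PER_LEAD
    offset = 0x40 if trail < 0x3F else 0x62
    return lead, trail + offset


def encoder_may_emit(pointer: int) -> bool:
    """True if the encoder is allowed to output the entry at this pointer."""
    lead, _ = pointer_to_bytes(pointer)
    return lead >= ENCODER_MIN_LEAD


for pointer in (0, 5023, 5024):
    lead, trail = pointer_to_bytes(pointer)
    print(f"pointer={pointer} bytes={lead:02X} {trail:02X} "
          f"encodable={encoder_may_emit(pointer)}")
```

Under these assumptions, a decoder accepts byte pairs across the whole index, while the encoder only produces those whose lead byte is 0xA1 or higher.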
References
- ^ FAQs about GovHK Online Services – Other Technical Questions and Trouble Shooting
- ^ "Big5CMP.txt". Archived from the original on 13 September 2016. Found at Mapping table - HKSCS-2008
- ^ "HKSCS-2004 Annex IV. Compatibility Points for GCCS" (PDF). Archived from the original (PDF) on 30 September 2016. Retrieved 29 September 2016.
- ^ "Group:Big5-GCCS外字". Retrieved 30 September 2016.
- ^ "U-source glyphs" (PDF). Retrieved 30 September 2016.
- ^ "The Adobe-CNS1-6 Character Collection" (PDF). Retrieved 30 September 2016.
- ^ "Character Sets". IANA.
- ^ "SDK components".
- ^ "Big5-HKSCS:2004".
- ^ Computer Chinese Characters Encoding Workgroup (12 June 2009). "Submission of Characters from Macao Information Systems Character Set" (PDF). ISO/IEC JTC 1/SC 2/WG 2 IRGN 1580. Archived from the original (PDF) on 4 January 2015.
- ^ a b Macao Special Administrative Region Government (11 June 2020). "Submission of Macao's Vertical Extension (UNC Characters), Horizontal Extension, and IVSes Registration for MSCS" (PDF). ISO/IEC JTC 1/SC 2/WG 2 IRGN 2430.
- ^ "OGCIO - Development of HKSCS". Archived from the original on 22 August 2017. Retrieved 21 August 2017.
- ^ Steele, Shawn. "CP 951 & HKSCS". I'm not a Klingon. MS Dev Blog. Retrieved 13 September 2016.
- ^ 華通資訊網: 小心!有人悄悄換掉了你的Windows系統字型
- ^ Microsoft: Hong Kong Supplementary Character Set – Support for Windows Platform
- ^ Microsoft Character Code Conversion Routines For HKSCS-2004
- ^ Windows XP Font Pack for ISO 10646:2003 + Amendment 1 Traditional Chinese Support
- ^ "CCSID 5471: Mixed Big-5 ext for HKSCS-2001". IBM Globalization - Coded character set identifiers. IBM. Archived from the original on 29 November 2014.
- ^ International Components for Unicode (ICU), ibm-5471_P100-2006.ucm, 9 May 2007
- ^ "CCSID 9567: Mixed Big-5 ext for HKSCS-2004". IBM Globalization - Coded character set identifiers. IBM. Archived from the original on 29 November 2014.
- ^ "CCSID 13663: Mixed Big-5 ext for HKSCS-2008". IBM Globalization - Coded character set identifiers. IBM. Archived from the original on 29 November 2014.
- ^ "CCSID 1375: Mixed Big-5 ext for HKSCS". IBM Globalization - Coded character set identifiers. IBM. Archived from the original on 29 November 2014.
- ^ Mozilla.org: Bug 343129 – Big5-HKSCS 2004 <==> Unicode Table Update
- ^ Bug 162431 – add non-BMP Unicode (plane 1 and above. surrogate) support to charset encoder/decoder
- ^ "Qt 4.7: Big5-HKSCS Text Codec". Archived from the original on 4 March 2016. Retrieved 10 November 2011.
- ^ Bug 101081 – Non-BMP (plane 1 thru plane 16) characters are not supported
- ^ van Kesteren, Anne. "Encoding Standard". WHATWG.
External links
- Hong Kong Government site on the HKSCS: downloadable HKSCS documents and font
- Microsoft HKSCS Support for Windows Platform
- 香港參考宋體: download page for the HKSCS font by Dynalab (華康科技有限公司)
- Graphical View of Big5-HKSCS in ICU's Converter Explorer
- A character set that works on Mac OS X
- UMing/UKai – A free, open-source font supporting HKSCS
- Open Source Hong Kong Fonts Project