Code page 936 (IBM)
Alias(es) | SHIFTGB, IBM-1381 |
---|---|
Other related encoding(s) | Shift JIS |
IBM code page 936 is a character encoding for
IBM code page 936 should not be confused with
History
The encoding was in use mainly during the
The last revision of IBM-928/936/946 was documented in 1992, and it was superseded in 1993 by the
Status
IBM provides chart definitions online for Code page 1380 (the document C-H 3-3220-130 1993-11), but not for the older Code page 928 (the document C-H 3-3220-130 1992-11, an earlier revision of the same specification).[5][6] International Components for Unicode (ICU) does not include an IBM-936 or IBM-946 codec, and resolves the "cp936" label to the Windows code page instead.[7] The ICU project does hold mapping data for IBM-946 and makes it publicly available,[8] but does not ship it with ICU.
Structure
Code page 928, the double-byte component, encodes 9,355 characters as double-byte sequences whose lead bytes are 0x81 through 0xAC or 0xF0 through 0xFA.[9]
The 0x81–AC lead byte range is used for GB 2312 characters: lead bytes 0x81–87 are used for non-hanzi, 0x88–9C for level 1 hanzi and 0x9C–AC for level 2 hanzi.[1][5][8] Like Shift JIS, trail (second) bytes are in the range 0x40–FC excluding 0x7F, which gives 188 trail byte values and so allows two 94-cell GB 2312 rows to be encoded per lead byte;[8] unlike Shift JIS, the bytes 0xA0–AC are not excluded from the lead byte range,[5][8] since JIS X 0201 compatibility was not required. The 0xF0–FA lead byte range is used for IBM extensions: 0xF0 through 0xF9 are used for user-defined characters, and 0xFA is used for additional non-hanzi.[5]
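The byte ranges above can be sketched in Python as a simple classifier. This is only an illustration of the ranges as stated in this section, not a codec: the function names and labels are hypothetical, no mapping to GB 2312 code points or Unicode is attempted, and lead byte 0x9C is given its own label because it appears in both the level 1 and the level 2 range.

```python
# Sketch of the lead/trail byte ranges described in this section.
# Function names and labels are illustrative assumptions, not IBM terminology.

def classify_lead_byte(b: int) -> str:
    """Return a label for the lead byte range that b falls in."""
    if 0x81 <= b <= 0x87:
        return "GB 2312 non-hanzi"
    if 0x88 <= b <= 0x9B:
        return "GB 2312 level 1 hanzi"
    if b == 0x9C:
        # Both stated ranges (0x88-9C and 0x9C-AC) include this byte.
        return "GB 2312 level 1 and level 2 hanzi"
    if 0x9D <= b <= 0xAC:
        return "GB 2312 level 2 hanzi"
    if 0xF0 <= b <= 0xF9:
        return "user-defined"
    if b == 0xFA:
        return "IBM additional non-hanzi"
    return "not a lead byte"


def is_trail_byte(b: int) -> bool:
    """Trail bytes span 0x40-0xFC, excluding 0x7F."""
    return 0x40 <= b <= 0xFC and b != 0x7F


def is_well_formed_pair(lead: int, trail: int) -> bool:
    """Check only that both bytes fall in the permitted ranges."""
    return classify_lead_byte(lead) != "not a lead byte" and is_trail_byte(trail)


# Number of distinct trail byte values: twice the 94 cells of a GB 2312 row.
TRAIL_BYTE_COUNT = sum(is_trail_byte(b) for b in range(0x100))
```

A pair that passes `is_well_formed_pair` is not necessarily an assigned character, since only 9,355 of the structurally possible sequences are defined.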
References
- Leisher, Mark (2008) [1998-03-06]. "SHIFTGB.TXT: Shifted GB2312.1980. Generated from an algorithm provided with some older Chinese packages". Department of Mathematical Sciences, New Mexico State University. Archived from the original on 2023-01-20.
- ISBN 978-0-596-51447-1.
- "CCSID 936". IBM. Archived from the original on 2016-03-27.
- "CCSID 946". IBM. Archived from the original on 2016-03-26.
- "Table 1: Registration of GCSGID and CPGID for the IBM CH-S Graphic Character Set". C-H 3-3220-130 1993-11: IBM Simplified Chinese Graphic Character Set (PDF). 1993. p. 6.
- "Code page 928 information document". Archived from the original on 2016-03-17.
- "windows-936-2000 (alias cp936)". ICU Demonstration - Converter Explorer. International Components for Unicode.
- "ibm-946_P100-1995". International Components for Unicode Data Repository. Unicode Consortium, IBM.
- "CCSID 928 information document". Archived from the original on 2016-03-26.