CJK Notes
Encoding Schemes
- Chinese
- GB (Guobiao, 國標): Coding standard used in mainland China and Singapore, 8-bit.
- HZ (漢字): A variation of GB, 7-bit. Designed to support mixed ASCII/GB text file exchange and editing.
- BIG5 (大五碼): Coding Standard in Taiwan, double 8-bit. (data file)
- EUC-BIG5, 16-bit.
- ISO-2022-CN, 7-bit, supports GB, CNS/BIG5.
Note: Chinese also has two different character sets: Traditional (繁體) and Simplified (簡體).
- Japanese: JIS (7-bit), Shift-JIS (8-bit) and EUC-JIS (8-bit).
- Korean: KSC (8-bit)
- Unicode: (Unicode 16-bit, UTF-7 7-bit, UTF-8 8-bit) Designed to support CJK simultaneously. (mapping from Unicode to ISO and CJK)
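As a rough illustration of the difference between the 7-bit and 8-bit schemes listed above, the following Python sketch encodes a short sample string with a few of them. The codec names (gb2312, hz, big5, utf-7, utf-8) are those of Python's standard library, not anything defined on this page, and the sample strings and helper names are assumptions chosen for illustration.

```python
# Sketch only: codec names come from Python's standard library;
# the sample strings and helper names are illustrative assumptions.

def is_seven_bit(data: bytes) -> bool:
    """Return True if no byte has its high bit set."""
    return all(b < 0x80 for b in data)

def encode(text: str, codec: str) -> bytes:
    """Encode text with the named standard-library codec."""
    return text.encode(codec)

# GB and HZ cover Simplified characters; BIG5 covers Traditional ones.
SAMPLES = [
    ("gb2312", "汉字"),
    ("hz", "汉字"),
    ("big5", "漢字"),
    ("utf-7", "漢字"),
    ("utf-8", "漢字"),
]

for codec, text in SAMPLES:
    data = encode(text, codec)
    print(f"{codec:7} {len(data)} bytes, "
          f"7-bit clean: {is_seven_bit(data)}, raw: {data!r}")
```

In this sketch the HZ output is plain ASCII wrapped in the ~{ and ~} escape markers, while the GB and BIG5 outputs contain bytes with the high bit set, which is why they need an 8-bit clean channel.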
CCiC Chinese Software Primer
has a lot of information on the programs available for inputting Chinese and on basic
Chinese coding schemes.
For more information on CJK character codes and encodings, please check
Notes on CJK Character Codes and Encodings
Chinese Input Methods
- Pinyin(拼音)
- The Pinyin method inputs Chinese using pinyin, a romanization
based on the sound and tone of the character. The sounds are
represented using ASCII characters.
- Zhuyin(注音)
- This is the phonetic counterpart of the pinyin method. Each Zhuyin
phonetic symbol is assigned to a key on the ASCII keyboard.
Note that in the Zhuyin method there is usually no correlation between
the English character typed and the sound represented, e.g.,
the 'm' sound is on the 'a' key, not the 'm' key as it would be
in the pinyin method.
- Cangjie(倉頡)
- Cangjie strokes table
- Hanzi(HZ, 漢字)
- HZ is a Chinese coding scheme in which two ASCII characters are used to
represent one Chinese character. The Hanzi-table(GB | HZ)
is a table of the Chinese characters and their corresponding ASCII characters.
The table can be used to enter Chinese characters directly by their
ASCII representations.
- DianBaoMa (Chinese Telegram Code, 電報碼)
- DianBaoMa is a coding scheme using
the Chinese Telegram Code. The Telegram Code Table(GB) lists all the Chinese
Characters by the Telegram Code Numbers.
- Chinese-English(英數)
- This coding scheme translates English words or phrases into Chinese characters. Note that
the dictionary usually does not contain the exact translation, so a secondary
input method is usually used in conjunction with this method.
- NeiMa(內碼)
- This is the default coding method, which is defined by each encoding scheme.
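The HZ and NeiMa entries above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: it takes an HZ pair to be the two GB bytes with the high bit cleared (as in the published HZ specification), and it treats NeiMa input as typing a character's byte value in hexadecimal for whichever encoding is active. The function names hz_pair_to_gb and neima_to_char are hypothetical, and the codec names are those of Python's standard library.

```python
# Sketch only: function names are hypothetical; codec names are
# those of Python's standard library.

def hz_pair_to_gb(pair: str) -> bytes:
    """Map a two-character HZ pair to two GB bytes by setting the high bit."""
    if len(pair) != 2 or not all(0x21 <= ord(c) <= 0x7E for c in pair):
        raise ValueError("an HZ pair is two printable ASCII characters")
    return bytes(ord(c) | 0x80 for c in pair)

def neima_to_char(hex_code: str, encoding: str) -> str:
    """Decode an internal code typed as hex digits in the given encoding."""
    return bytes.fromhex(hex_code).decode(encoding)

# The ASCII pair "VP" stands for the GB bytes D6 D0.
print(hz_pair_to_gb("VP").decode("gb2312"))

# The same character has a different internal code in each encoding.
print(neima_to_char("D6D0", "gb2312"))
print(neima_to_char("A4A4", "big5"))
```

All three calls produce the same character (中), which shows why an internal-code method only makes sense relative to one particular encoding scheme.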
Chinese Systems and Programs
General On-Line Reviews:
Chinese System:
Chinese Applications:
jeanz@rice.edu last updated 3/12/98.