CJK Support Notes


Introduction

Chinese, Japanese, and Korean (CJK) each have far more than 256 characters, so they cannot be represented with single-byte character sets. Instead, two bytes are used to represent each CJK character. Several double-byte character sets are used for Chinese (GB, BIG5, HZ), Japanese (Shift-JIS, JIS, EUC-JIS), and Korean (KSC). Universal character sets that include all CJK characters are under development (ISO2022, UTF7, UTF8).
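
The double-byte representation can be checked with a short Python sketch. The codec names below are Python's own labels and are an assumption about how they map to the sets named here (gb2312 for GB, euc_kr for KSC, and so on); the sample characters are chosen only for illustration.

```python
# Encode one sample character per legacy CJK encoding and count the bytes.
# Codec names are Python's labels, assumed to match the sets named in the text.
samples = {
    "gb2312": "中",      # GB, simplified Chinese
    "big5": "中",        # BIG5, traditional Chinese
    "shift_jis": "日",   # Shift-JIS, Japanese
    "euc_jp": "日",      # EUC-JIS, Japanese
    "euc_kr": "한",      # KSC, Korean
}

def encoded_lengths(table):
    """Return {codec: number of bytes used for the single sample character}."""
    return {codec: len(ch.encode(codec)) for codec, ch in table.items()}

for codec, n in encoded_lengths(samples).items():
    print(f"{codec:10s} {n} bytes")

# Plain ASCII text still takes one byte per character in these encodings.
print(len("A".encode("gb2312")), "byte for ASCII 'A'")
```

Each sample character comes out as two bytes, while ASCII characters remain one byte, which is why these encodings can mix English and CJK text in one file.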

Dedicated input methods are also required to enter CJK text. The most popular ones are PY for Chinese, KK for Japanese, and HG for Korean. Intelligent Mode is frequently used for Chinese and Japanese input. It 'guesses' the next word that the user is likely to enter, based on common phrases. This greatly increases input speed.
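
How phrase-based guessing might work can be shown with a toy sketch. This is not how any of the products below implement Intelligent Mode; the candidate table, the phrase list, and the function name `suggest` are all hypothetical and far smaller than a real input method's dictionary.

```python
# Toy sketch of pinyin lookup with phrase-based reordering.
# CANDIDATES and PHRASES are tiny hypothetical samples, not real IME data.
CANDIDATES = {
    "guo": ["过", "国", "果"],
    "wen": ["问", "文", "闻"],
}

# Common two-character phrases used to guess what follows the previous character.
PHRASES = ["中国", "中文"]

def suggest(previous, pinyin):
    """Return candidates for `pinyin`, with phrase-completing ones first."""
    options = CANDIDATES.get(pinyin, [])
    # Characters that would complete a known phrase starting with `previous`.
    likely = {p[1] for p in PHRASES if len(p) == 2 and p[0] == previous}
    # sorted() is stable, so dictionary order is kept within each group.
    return sorted(options, key=lambda ch: ch not in likely)

print(suggest("", "guo"))    # no context: plain dictionary order
print(suggest("中", "guo"))  # "国" is moved to the front because of "中国"
```

With the previous character as context, the most likely candidate is offered first, so the user can usually accept it without scrolling through the list.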

CJK versions of Windows support CJK applications; however, they are quite expensive. Instead, a CJK system is frequently installed on top of regular Windows. There are different CJK systems for different platforms and operating systems. These CJK systems allow users to view and edit CJK files in any application, as if it were a CJK application. Thus, once a CJK system is installed, the user no longer needs specific programs for editing or viewing; all existing applications will support CJK.

Listed below is a summary of several commercially available CJK systems. Blanks denote information that is not currently available.

For more detailed information on CJK encodings, input methods, and various CJK applications, please see CJK Notes.


CJK Systems

[ UnionWay | CStar | MView | TwinBridge | AsianBridge | NJStar | WinMASS Lite ]


Recommendations

Below is a price comparison chart of the four most frequently used systems/programs:


UW-Asian StdPack 97 (Demo 60 Free)
	TSJK bmp fonts		$59	/ 1 user
				$325	/ 10 user
				$12	shipping

Chinese Star Overseas Edition v2.97 (Demo Free)
	TS ttf fonts		$100	/ 5-10 user license
				$9.5	shipping

NJWIN CJK Internet Viewer (viewer only)
	TSJK			$49	/ 1 user

MView System V1.00 (16-bit Windows)
	TSJKU bmp fonts		$18	/ 1 user
				$38	/ 3 user
				$48	/ 5 user

NJWIN is a viewer only, and Chinese Star (CStar) supports only Chinese.

The MView System V1.00 is clearly much cheaper; however, it supports only 16-bit Windows and does not work well on Win95 or Windows NT.

For cross-platform support, UW-Asian StdPack 97 is strongly recommended. UnionWay has all CJK fonts, can add more character sets, offers several popular input methods, supports intelligent mode, and is compatible with MS Office and several graphics programs.

NJWIN is the best CJK viewing system. It autodetects the character set in use, which lets the user view files that use multiple encodings.
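
How NJWIN detects encodings is not documented here. As a rough idea of the problem only, the naive Python sketch below tries each candidate codec and keeps those that decode the bytes without error. The function name `plausible_codecs` and the candidate list are hypothetical, and the codec names are Python's own.

```python
# Naive detection sketch: keep every candidate codec that decodes cleanly.
# This is an assumption-laden illustration, not NJWIN's actual method.
CANDIDATE_CODECS = ["ascii", "hz", "gb2312", "big5", "shift_jis", "euc_jp", "euc_kr"]

def plausible_codecs(data, codecs=CANDIDATE_CODECS):
    """Return the codecs that decode `data` without raising an error."""
    matches = []
    for name in codecs:
        try:
            data.decode(name)  # decoding is strict by default
        except UnicodeDecodeError:
            continue
        matches.append(name)
    return matches

gb_bytes = "中文".encode("gb2312")
sjis_bytes = "日本語".encode("shift_jis")

print(plausible_codecs(gb_bytes))
print(plausible_codecs(sjis_bytes))
```

Several codecs often accept the same bytes, because the double-byte ranges overlap. A practical detector would therefore also need to rank the surviving candidates, for example by how common the decoded characters are.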

Chinese Star is the best system for Chinese; however, it does not support Japanese or Korean.


Glossary

4C
Four Corners Method, for Chinese inputting. Each character is divided into 4 parts.
CAN
Cantonese input method, for Chinese. This is essentially pinyin method using Cantonese sounds.
CJ
Cangjie Method, for Chinese inputting.
CJK
Chinese, Japanese, and Korean.
Character Set
A set of characters defined for one or more languages. In most cases one character set is defined for one language, although there are exceptions (e.g., ISO-8859-1).
EC
English-Chinese input method. For Chinese.
EJ
English-Japanese input method. For Japanese.
Encoding
Encoding is the method by which a document or message is converted to computerized data. One encoding can be used by multiple languages, and one language may have several different encodings.
EUC-JIS
Japanese encoding. 8-bit. Used mostly on Unix.
FC
Four Corners input method for Japanese.
GB
GuoBiao encoding for Chinese. This is the most common encoding for places using simplified Chinese. Typically used in mainland China and Singapore. 8-bit.
HG
Hangul input method for Korean. Most common Korean input method.
HJ
Hanja input method. For Korean.
HZ
HanZi encoding for Chinese, a variation of GB. 7-bit. Created mostly to support mixed ASCII/GB network file exchange and editing.
J
Japanese Character Set.
JBS
Jianyi Bushou (Simplified Radical Lookup) input method, for Chinese.
JIS
Japanese encoding. 7-bit. Used mostly to support 7-bit internet mail/news.
K
Korean Character Set.
KK
Kana-Kanji input method. Most common Japanese input method.
NDPY
New Double Pinyin input method, for Chinese.
NPY
No-tone Pinyin input method, or New Pinyin input method. For Chinese inputting.
NI
Nelson Index input method, for Japanese.
PY
Pinyin input method. Most popular Chinese input method.
PZM
Popularized Zhengma input method, for Chinese.
QW
Quwei input method, for Chinese.
RK
Roma-Kanji input method, for Japanese.
RL
Radical Lookup input method, for Japanese.
S
Simplified Chinese Character Set.
Shift-JIS (SJIS, or S-JIS)
Japanese encoding. 8-bit. Used mostly on Mac/PC.
SN
Stroke Number input method, for Chinese.
ST
Strokes input method, for Japanese.
T
Traditional Chinese Character Set.
TC
Telecode input method, for Chinese. Each character is represented by its telegraph code, as used in Mainland China.
U
Unicode Character Set.
UI
Unicode input method, for Chinese, Japanese, Korean.
WB
Wubi (also called Five Strokes Method) input method, for Chinese.
WBB
Wubi Bridge input method, for Chinese.
WBD
Wubi Drawing input method, for Chinese.
WBS
Wubi Shape input method, for Chinese.
WM
WangMa input method, for Chinese.
ZM
ZhengMa input method, for Chinese.
ZY
Zhuyin input method, for Chinese. Popular in Taiwan.
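
The 7-bit and 8-bit labels in the encoding entries above can be checked with a small Python sketch. The codec names are Python's and their mapping to the glossary terms is an assumption: iso2022_jp stands in for JIS, euc_jp for EUC-JIS, and gb2312 for GB. The helper `is_seven_bit` is hypothetical.

```python
# Check which encodings produce only 7-bit bytes (all values below 0x80).
def is_seven_bit(data):
    """True if every byte has its high bit clear."""
    return all(b < 0x80 for b in data)

chinese = "中文"
japanese = "日本語"

encoded = {
    "hz": chinese.encode("hz"),                   # HZ: GB text inside ~{ ... ~}
    "gb2312": chinese.encode("gb2312"),           # GB as 8-bit bytes
    "iso2022_jp": japanese.encode("iso2022_jp"),  # JIS: escape sequences switch sets
    "euc_jp": japanese.encode("euc_jp"),          # EUC-JIS
    "shift_jis": japanese.encode("shift_jis"),    # Shift-JIS
}

for name, data in encoded.items():
    kind = "7-bit" if is_seven_bit(data) else "8-bit"
    print(f"{name:11s} {kind}  {data!r}")
```

HZ and JIS stay within 7-bit bytes by marking where the double-byte text starts and ends, which is what makes them safe for 7-bit mail and news; the 8-bit encodings instead set the high bit on CJK bytes.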


jeanz@rice.edu