There are roughly 800 different components. At least one of 214 common
components is found in every character. These common components are called
radicals (D-6).
Traditional and Simplified
As a further complication, Chinese uses two forms of characters:
traditional and simplified (4-10).
Simplified forms are used in (mainland) China, while Taiwan and others use the
traditional forms. Most Japanese kanji are also traditional Chinese characters,
but a few are simplified, and a few are Japanese inventions (not from Chinese).
A few components have typically Chinese and Japanese stylistic variants, but
these variants are not significant enough to require separate characters to
distinguish them.
Asian Symbol Sets
Because it is not practical to include all of the characters that have ever
been used in one symbol set (D-7),
various organizations have classified Chinese characters according to different
schemes, resulting in the different standard symbol sets in use today. The
world standards include China's GB2312, Taiwan's CNS11643 (Big 5 Code), Japan's
JIS X0208 (JIS), Korea's KS C5601, and a new standard, Unicode. Asian word
processors typically work with only one symbol set, with characters divided up
into multiple levels (D-5).
For example, Japanese JIS Level I contains the joyo kanji
plus some other commonly used and proper name kanji ordered by pronunciation.
JIS Level II contains rare kanji arranged by radical (D-6) and stroke (D-7).
While Smart Characters is used primarily with the Chinese and Japanese
Combined (4-9) symbol set, a Smart Characters document can use up to five
different fonts or symbol sets. You can convert between fonts and symbol sets
by using ScConv (D-7).
Combined Symbol Set
Smart Characters is supplied with 16 and 24 point Chinese character fonts
that use a Combined Japanese and Chinese symbol set (D-7).
The combined symbol set contains traditional characters for Chinese and the
Japanese joyo kanji. The traditional Japanese kanji are identical to
their Chinese forms, with some stylistic differences. The unique or simplified
joyo kanji are included in the combined symbol set separately at the end. The
combined symbol set uses two levels (D-5):
Level 0 (16h0ch00.fnt) contains 7731 traditional characters plus up to 69
user-defined characters. Level I (16h1ch00.fnt) adds about 400 additional
characters to form a 100% concordance (D-2) with the Japanese JIS Level I and
the Big Five (D-1) traditional Chinese code space (D-2).
Although the Combined symbol set unifies Japanese and Chinese
characters, the degree of unification is less than that found in the Unicode symbol
set, which requires two separate fonts to display Japanese and Chinese
characters. The Combined symbol set maintains separate characters if a unified
character would be unacceptable to either a native Japanese or a native Chinese
reader.
Symbol Set Unification
Ongoing work on international symbol set (D-7) unification by various
organizations has yielded changes to accepted forms. The Combined (4-9)
symbol set reflects these changes by accepting unifications that are generally
acceptable to native speakers, and unifying Japanese and Chinese characters
when appropriate. Consequently, the combined symbol set contains now-unused
code spaces that display as character numbers (D-2), not characters. You can
see these numbers as you browse the combined font, or in documents created by
prior versions of Smart Characters. See the Unify (D-8) symbol set unification
utility.
Simplified Characters
The Combined (4-9) symbol set combines both traditional Chinese and Japanese
characters into one symbol set (D-7)
for free intermixing within a document. Simplified characters used in
mainland China are supported by the optional (not included) accessory
simplified character fonts.
User Characters
User characters are characters that you need to use that are not in a
standard symbol set (D-7). Standard symbol sets define only a few thousand of
the most commonly used Chinese characters, and are therefore small subsets of
the over 50,000 characters that have been used over the years. Consequently,
the need frequently arises to use rare or obsolete characters that do not
exist in a particular standard symbol set. To use these characters, add them
to your user font (4-12), and add a corresponding reference (8-2) to your
user dictionary (4-7) for lookup and use. See Adding New Characters (8-2) and
Why Is this Necessary (Chinese) (2C-45), (Japanese) (2J-44).
Sharing User Characters
Most Asian language word processing systems do not contemplate electronic
document transmission or work group document sharing, so user characters (4-10)
are generally not practical to transmit. Smart Characters eliminates this
restriction by extracting the user characters actually used into a small
proxy font (D-6), and embedding the proxy font into the document for later
extraction when the document is opened. The document displays the correct user
characters, without transmitting and installing the author's user font (4-12).
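The extraction step described above can be pictured with a short sketch. This is only an illustration under stated assumptions, not the actual Smart Characters file format: the Document class, the (symbol set index, code) character representation, and the function names are hypothetical, and glyphs are stood in for by byte strings. The only detail taken from this manual is the convention that document symbol set 3 holds user characters.

```python
from dataclasses import dataclass, field

# By the manual's convention, document symbol set 3 is used for user characters.
USER_SET = 3


@dataclass
class Document:
    # Hypothetical representation: each character is (symbol_set_index, code).
    chars: list
    # Hypothetical embedded proxy font: code -> glyph data.
    proxy_font: dict = field(default_factory=dict)


def embed_proxy_font(doc, user_font):
    """Copy only the user glyphs the document actually uses into the document."""
    used = {code for set_index, code in doc.chars if set_index == USER_SET}
    doc.proxy_font = {code: user_font[code] for code in sorted(used)}
    return doc


def user_glyph(doc, code):
    """On the reader's side, resolve a user character from the embedded proxy font."""
    return doc.proxy_font[code]


# Author's full user font (placeholder glyph data).
author_font = {1: b"glyph-1", 2: b"glyph-2", 3: b"glyph-3"}
doc = embed_proxy_font(Document(chars=[(0, 100), (3, 2), (0, 101), (3, 2)]), author_font)
```

In this sketch the document carries one glyph rather than the author's three, which is the point of the proxy font: the recipient needs only the user characters that appear in the text.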
Symbol Set Index
A document can use up to five symbol sets, although most documents use just one
symbol set (D-7). Documents which use simplified characters (4-10), user
created characters, or lists such as the concordance files (12-6) used by the
document conversion utility ScConv (D-7) use more than one symbol set.
Smart Characters documents contain a symbol set index that holds up to five symbol set names, unique IDs, and encoding methods (D-3). Chinese characters are interpreted according to which symbol set and encoding method they are associated with. Chinese characters are associated with a symbol set by applying a symbol set index format code (D-3) using the Format Character (3-16) dialog Asian Character Symbol Set control. This format code includes only the index number from 0 to 4, not the actual symbol set name and unique ID.
By convention, some of the index numbers have pre-defined meanings: Document symbol set 0 is used for the default symbol set. Document symbol set 3 is used for user characters (4-10). Document symbol set 4 is used for a proxy font (D-6).
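As a rough model of the indirection described above, the following sketch treats the symbol set index as a five-slot table, with each character's format code storing only a slot number. The class and field names are hypothetical, and the entry values (name, ID, encoding) are placeholders rather than values used by Smart Characters; only the five-slot limit and the meanings of slots 0, 3, and 4 come from the text.

```python
from dataclasses import dataclass

MAX_SYMBOL_SETS = 5      # index numbers run from 0 to 4
DEFAULT_SET = 0          # by convention: the default symbol set
USER_SET = 3             # by convention: user characters
PROXY_SET = 4            # by convention: a proxy font


@dataclass(frozen=True)
class SymbolSetEntry:
    name: str            # symbol set name
    unique_id: int       # unique ID (placeholder type)
    encoding: str        # encoding method (placeholder type)


class SymbolSetIndex:
    """Hypothetical five-slot table stored with the document."""

    def __init__(self):
        self.slots = [None] * MAX_SYMBOL_SETS

    def _check(self, index):
        if not 0 <= index < MAX_SYMBOL_SETS:
            raise IndexError("symbol set index must be 0 to 4")

    def assign(self, index, entry):
        self._check(index)
        self.slots[index] = entry

    def resolve(self, index):
        """Map the index number in a format code to its symbol set entry."""
        self._check(index)
        entry = self.slots[index]
        if entry is None:
            raise LookupError(f"no symbol set assigned to index {index}")
        return entry


index = SymbolSetIndex()
index.assign(DEFAULT_SET, SymbolSetEntry("Combined", 1, "placeholder-encoding"))
default_entry = index.resolve(DEFAULT_SET)
```

Because characters carry only the slot number, reassigning a slot changes how every character tagged with that number is interpreted, without touching the characters themselves.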
Copyright © 1996 Apropos, Inc.