ISO/IEC JTC1/SC2/WG2 N1187
Date: 1995-03-24
This is an unofficial HTML version of a document submitted to WG2.


Title: Encoding the Yi script

Source: Michael Everson
Status: Expert Contribution
Action: For consideration by WG2

1.0 Encoding Yi in the BMP. WG2 document N965 proposes to encode the Yi script, specifying 1165 characters for inclusion in the BMP as Category A characters. I agree that it is essential that Yi be considered to be in Category A and that Yi be encoded in the BMP. A unique indigenous script, Yi's importance has been adequately demonstrated in N965.

1.1 Size of the Yi repertoire. The number of characters in N965 can be conveniently reduced by almost 30 percent. 345 characters proposed in N965 can be removed if a single character, YI COMBINING TONE MARK, is added to the repertoire (1165 - 345 + 1 = 821). Thus the basic number of Yi characters required in 10646 is 821, though there are 29-55 additional characters which may also be required, for which see section 2 below; the number of characters required for encoding Yi is thus at least 821 and at most 850 to 876, depending on how many of the additional characters are accepted.
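
The arithmetic behind these figures can be checked with a short sketch. The constant and function names are illustrative only and do not come from N965; the numbers are those quoted in the paragraph above.

```python
# Figures quoted in section 1.1; all names here are illustrative assumptions.
N965_PROPOSED = 1165           # characters proposed in N965
REMOVABLE = 345                # characters made redundant by the tone mark
TONE_MARK = 1                  # the single added YI COMBINING TONE MARK
EXTRA_MIN, EXTRA_MAX = 29, 55  # possible additions discussed in section 2


def repertoire_bounds():
    """Return (basic count, lower total, upper total) if additions are accepted."""
    base = N965_PROPOSED - REMOVABLE + TONE_MARK
    return base, base + EXTRA_MIN, base + EXTRA_MAX


if __name__ == "__main__":
    base, low, high = repertoire_bounds()
    print(base, low, high)  # 821 850 876
```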

1.2 Yi character names. Standard romanization of Yi characters can be found in Bburx 1984. The structure of Yi syllables is CVT, that is Consonant + Vowel + Tone. Initial consonants are ordered according to place of articulation, and are romanized as follows: b-, p-, bb-, nb-, hm-, m-, f-, v-; d-, t-, dd-, nd-, hn-, n-, hl-, l-; g-, k-, gg-, mg-, hx-, ng-, h-, w-; z-, c-, zz-, nz-, s-, ss-; zh-, ch-, rr-, nr-, sh-, r-; j-, q-, jj-, nj-, ny-, x-, y-, and (no initial consonant). Vowels are also ordered according to place of articulation, and are romanized as follows: -i, -ie, -a, -uo, -o, -e, -u, -ur, -y, -yr. Tones are marked conventionally in one of four ways: -t, -x, (no mark), and -p. Tones are notoriously difficult to mark in the Latin script; Chén 1985 would rewrite bit bix bi bip in I.P.A. as [transcription lost in this HTML version]. The YI COMBINING TONE MARK is added to indicate the -x tone. It appears that the romanization in N965 is based on this system, although it is not consistent and uses C for IE and Q for UO; there are also three duplicate characters in N965. Clews 1988 reprints a chart from DeFrancis 1984 which differs from the ordering in Bburx 1984 only in that the final series (vowels with no initial consonant) has been shifted to the beginning of the repertoire (see Annex D). I would assume that Bburx 1984, which N965 (mostly) follows, is to be preferred. The designation YI LETTER is preferable to YI CHARACTER, proposed in N965; LETTER is shorter and is appropriate for syllabic characters (cf. Hiragana and Katakana character names; only Thai uses the term CHARACTER in ISO/IEC 10646-1).
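
The CVT structure described above can be illustrated with a minimal sketch that splits a romanized syllable into initial, vowel, and tone, and then forms character names. The initial, vowel, and tone inventories are those listed in this section; the function names are hypothetical. The sketch also assumes that a -x syllable is represented as the unmarked (no tone letter) syllable followed by YI COMBINING TONE MARK; this document does not state which base letter carries the mark, so that choice is an assumption.

```python
# Inventories as listed in section 1.2; "" stands for "no initial consonant".
INITIALS = [
    "b", "p", "bb", "nb", "hm", "m", "f", "v",
    "d", "t", "dd", "nd", "hn", "n", "hl", "l",
    "g", "k", "gg", "mg", "hx", "ng", "h", "w",
    "z", "c", "zz", "nz", "s", "ss",
    "zh", "ch", "rr", "nr", "sh", "r",
    "j", "q", "jj", "nj", "ny", "x", "y",
    "",
]
VOWELS = ["i", "ie", "a", "uo", "o", "e", "u", "ur", "y", "yr"]


def split_syllable(syllable):
    """Split a romanized syllable into (initial, vowel, tone)."""
    s = syllable.lower()
    # Try longer initials first; fall back to shorter ones (and finally to
    # no initial) when the remainder is not a valid vowel plus tone.
    for initial in sorted(INITIALS, key=len, reverse=True):
        if not s.startswith(initial):
            continue
        rest = s[len(initial):]
        for tone in ("t", "x", "p", ""):
            if tone and not rest.endswith(tone):
                continue
            vowel = rest[:-len(tone)] if tone else rest
            if vowel in VOWELS:
                return initial, vowel, tone
    raise ValueError("not a syllable in this romanization: %r" % syllable)


def proposed_names(syllable):
    """Return the character name(s) for one syllable.

    ASSUMPTION: a -x syllable is the unmarked syllable plus the tone mark.
    """
    initial, vowel, tone = split_syllable(syllable)
    if tone == "x":
        return ["YI LETTER " + (initial + vowel).upper(),
                "YI COMBINING TONE MARK"]
    return ["YI LETTER " + syllable.upper()]


if __name__ == "__main__":
    for syl in ("bit", "bix", "bi", "bip", "bburx", "wu"):
        print(syl, split_syllable(syl), proposed_names(syl))
```

Because y- and x- occur both as initials and as a vowel or tone letter, the splitter backtracks instead of committing to the first initial that matches; for example, yt is read as no initial + y + t.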

1.3 Ordering. Several questions remain to be answered regarding the encoding of these characters. Bburx 1984 gives three orderings for Yi: a phonetic ordering based on the romanization (Annex A of this report), a radical order (Annex B), and a stroke order (Annex C). Which ordering should be preferred for the encoding?
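
To make the phonetic option concrete, the following sketch derives a sort key from the romanization. It assumes that the initial is the most significant component, followed by the vowel and then the tone, and that tones run -t, -x, unmarked, -p as in the listing in 1.2; the actual arrangement in Annex A may differ. The function name and the `bare_vowels_first` flag are hypothetical; the flag mimics the chart reprinted in Clews 1988, where the series with no initial consonant comes first.

```python
# Inventories as listed in section 1.2; "" stands for "no initial consonant".
INITIALS = [
    "b", "p", "bb", "nb", "hm", "m", "f", "v",
    "d", "t", "dd", "nd", "hn", "n", "hl", "l",
    "g", "k", "gg", "mg", "hx", "ng", "h", "w",
    "z", "c", "zz", "nz", "s", "ss",
    "zh", "ch", "rr", "nr", "sh", "r",
    "j", "q", "jj", "nj", "ny", "x", "y",
    "",
]
VOWELS = ["i", "ie", "a", "uo", "o", "e", "u", "ur", "y", "yr"]
TONES = ["t", "x", "", "p"]


def phonetic_key(parts, bare_vowels_first=False):
    """Sort key for a syllable given as (initial, vowel, tone).

    ASSUMPTION: initial outranks vowel, which outranks tone.
    """
    initial, vowel, tone = parts
    rank = INITIALS.index(initial)
    if bare_vowels_first and initial == "":
        rank = -1  # move the no-initial series to the front
    return (rank, VOWELS.index(vowel), TONES.index(tone))


if __name__ == "__main__":
    sample = [("p", "i", ""), ("", "a", ""), ("b", "i", "p"),
              ("b", "i", ""), ("b", "i", "x"), ("b", "i", "t")]
    print(sorted(sample, key=phonetic_key))
    print(sorted(sample, key=lambda p: phonetic_key(p, bare_vowels_first=True)))
```

A radical or stroke ordering cannot be computed from the romanization in this way and would need a table taken from Annex B or Annex C.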

1.4 Naming of the script. Is the name Yi the best name for the script? Other names for the language include Liángshān Yí and Lolo; the native designation is Nuo-su.

2.1 A new character. N965 contains one character which does not appear in the inventory of either Bburx 1984 or Clews 1988, namely YI LETTER WU. If included in a phonetically-based ordering, it should appear in its place in the W- series. Its position in radical- or stroke-based ordering was easily worked out: I placed it after the letter my and after the letter bbap, respectively.

2.2 Additional radicals. In Bburx 1984, 26 radical characters are listed which are used as a guide to readers and users of dictionaries. A photocopy of these is appended in Annex D. Of the 26 radicals, 16 have one or more variant forms, which it may also be appropriate to encode as distinct characters, since unifying them with the basic radicals would not be helpful. In the annexes provided I have added these 29 variants. Bburx (which Chén 1985 would write [transcription lost in this HTML version], by the way) gives names for the radicals based on the simplest character which contains each, and the names proposed here for the radical variants are based on the same principle, keeping the name of the basic radical first. Yi scholars should evaluate the suitability of this approach.

3.0 Bibliography. (I have tried to be accurate in representation, transcription, and translation, but beg the pardon of my Asian colleagues for any mistakes I have made here.)


Bburx Ddie Su. 1984. Nuo-su bbur-ma shep jie zzit: Syp-chuo se nuo bbur-ma syt mu curx su niep sha zho ddop ma bbur-ma syt mu wo yuop hop, Bburx Ddie da Su. [Chengdu]: Syp-chuo co cux tep yy ddurx dde. = Bn Xizh. 1984. Yí wén jiǎn zì běn: Yí Hàn wén duìzhào běn. Chéngdū: Sìchuān mínzú chūbǎnshè.
[Bburx Ddie Su = Bin Xizh. 1984. An examination of the fundamentals of the Yi script. Chéngdū: Sìchuān National Press.]


Chén Shìlín. 1985. Yí yǔ yǔyánxué jiǎnghuà = Nuo hxop ddop hxop mup mit hxip mgo. Chéngdū: Sìchuān mínzú chūbǎnshè.
[Chén Shìlín. 1985. An introduction to Yi linguistics. Chéngdū: Sìchuān National Press.]

Clews, John. 1988. Language automation worldwide: the development of character set standards. (British Library R&D reports; 5962) Harrogate: Sesame Computer Projects.

DeFrancis, John. 1984. The Chinese language: fact and fantasy. Honolulu: University of Hawaii Press.



Michael Everson, Evertype, Dublin, 2001-09-21