ISO/IEC JTC1/SC2/WG2 N1172
This is an unofficial HTML version of a document submitted to WG2.
Title: Proposal for encoding the Cherokee script
Source: Michael Everson
Status: Expert contribution
Action: For consideration by WG2
This is a Proposal and Proposal Summary (ISO/IEC JTC1/SC2/WG2 form N1116F) for encoding the Cherokee script in ISO/IEC 10646.
1. Requester's name:
2. Requester type:
3. Submission date:
4. Requester's reference:
5. Type of proposal:
This is a complete proposal.
The following two items are to be completed by WG2:
a. Relevant SC2/WG2 document numbers:
b. Status (list of meeting number and corresponding action or disposition):
B. Technical (General)
1. Nature of proposal:
This proposal is for a new script.
Proposed name of the script:
2. Number of characters in proposal:
3. Proposed category per SC2/WG2 N1116:
4. Proposed Level of Implementation:
Is a rationale provided for the choice?
There are no combining characters in Cherokee.
5. Is a repertoire including character names provided?:
Yes, see below.
a. If YES, are the names in accordance with the 'character naming guidelines' in Annex
K of ISO/IEC 10646-1?
b. Are the character shapes legible?
Yes, see the repertoires below.
6. Who will provide the appropriate computerized font for publishing the standard?
If available now, identify source(s) for the font:
a. Are references (to other character sets, dictionaries, descriptive texts etc.)
Yes, see below.
b. Are published examples (such as samples from newspapers, magazines, or other sources)
of use of proposed characters attached?
Yes, see Exhibits attached.
C. Technical (Justification)
1. Information on the user community for the proposed characters (for example: size,
demographics, information technology use, or publishing use) is included.
There are three tribes of Cherokees recognized by the U.S. government: the United
Keetoowah Band (UKB) of Cherokee Indians (about 7000 members), the Cherokee Nation
of Oklahoma (CNO) (140,000 members), and the Eastern Band of Cherokee Indians in
North Carolina (3000+ members). The Cherokee script is widely used in information processing in the Cherokee Nation; use of the script is widespread among the members of the United Keetoowah Band, who, apparently, preserve a higher proportion of Cherokee
speakers than the other groups. Several 8-bit vendor-specific code tables are in use, some
of which have a higher status than others.
2. The context of use for the proposed characters (type of use; common or rare) is
The Cherokee Syllabary is the usual script used to write the Cherokee language. Though
several fonts are available commercially and for free on the Internet, many are ad-hoc
ASCII-cypher solutions and there seems to be little standardization. The Cherokee
Nation uses code tables developed by Al Webster for work on the Apple Macintosh platform,
which he is sending to me; though I have not yet seen them, it is unlikely that their
platform-specific encoding would affect the proposal here. Certainly the repertoire
will not be affected.
3. Are the proposed characters in current use by the user community?
4. After giving due considerations to the principles in N1116 must the proposed characters
be entirely in the BMP?
Yes. Document N947 allocates space for Cherokee. The set of characters is well-defined
and there should be no difficulty encoding them in 10646.
5. Should the proposed characters be kept together in a contiguous range (rather than
Cherokee should have 6 columns reserved for it as reflected in the repertoire sample
given below. A suggested range would be 1500-155F.
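The arithmetic behind this range can be checked with a short sketch. It assumes the usual code-chart layout of 16 positions per column; the variable names are illustrative only and are not part of the proposal.

```python
# Assumption: one column of a code chart holds 16 positions.
POSITIONS_PER_COLUMN = 16
COLUMNS = 6
FIRST = 0x1500  # first position of the range suggested above

block_size = COLUMNS * POSITIONS_PER_COLUMN
last = FIRST + block_size - 1

# Six columns give 96 positions, ending at 155F.
print(f"{block_size} positions: {FIRST:04X}-{last:04X}")
```

With 86 named characters, six columns leave ten positions unassigned.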
6. Can any of the proposed characters be considered a presentation form of an existing
If YES, is a rationale for its inclusion provided?
7. Can any of the proposed character(s) be considered to be similar (in appearance
or function) to an existing character?
Yes, there is superficial resemblance of some Cherokee characters to characters belonging
to the Latin script (D R T, A J E, M, H, G Z, V S, L C P, K, B), but their use (a e i, go gu gv, lu, mi, nah no, do du, tle tli tlv, tso, yv) makes it clear that they are not Latin characters and that unification with Latin
characters would be wrong.
If YES, is a rationale for its inclusion provided?
Yes; the Cherokee characters are unique.
8. Does the proposal include use of composite sequences?
If YES, is a rationale provided?
Is a list of composite sequences and their corresponding glyph images (graphic symbols
1.0 Proposal for encoding of the Cherokee script
1.1 The Cherokee script is a syllabic script devised in the 19th century by Sequoyah
(who signed his name Ssiquoya). Sequoyah was impressed by the utility of writing, and set out to analyze the sounds of his language and to devise a writing system for it. Detailed information on Sequoyah's development of the script can be found in the references given below; some small part of that may be found in the Exhibit 1 appended to this proposal.
1.2 Structure of the script.
The Cherokee script is a syllabary. Unlike Ethiopic and some other syllabaries, the
glyphs assigned to each syllable are unrelated to one another. It is said that Sequoyah
derived the idea of writing from seeing books in English; some of the glyphs are
similar to Latin characters, but they are clearly not derived from a use of the knowledge
of their Latin values.
Cherokee is a simple left-to-right script requiring no combining characters. It is
a caseless script; when for various purposes initial characters are made larger it
is for typographic effect, not for casing as in Latin, Greek, or Cyrillic. Several
keyboarding conventions exist for inputting Cherokee, some involving dead-key input from
Latin transliterations, some based on sound-mnemonics related to the Latin letters
on keyboards, and some ergonomic systems based on frequency of the syllables in the
Diringer, David. 1948. The alphabet: a key to the history of mankind.
New York: The Philosophical Library.
Faulmann, Carl. 1990. Das Buch der Schrift.
Repr. of 1880 ed. Frankfurt: Eichborn.
Foreman, Grant. 1959. Sequoyah.
2nd ed. Norman: University of Oklahoma Press.
Foster, George E. 1885. Se-quo-yah, the American Cadmus and modern Moses.
Philadelphia: Office of the Indian Rights Association. Repr. 1979 by AMS.
Haarmann, Harald. 1990. Universalgeschichte der Schrift.
Frankfurt: Campus Verlag. (p. 258-61)
Holmes, Ruth Bradley, and Betty Sharp Smith. 1976. Beginning Cherokee (= Talisgo galiquogi dideliquasdodi Tsalagi digoweli).
Norman: University of Oklahoma Press.
Kilpatrick, Jack Frederick, and Anna Gritts Kilpatrick, eds. [s.d.] New Echota Letters: contributions of Samuel A. Worcester to the Cherokee Phoenix.
Dallas: Southern Methodist University Press.
3.0 CHEROKEE CHARACTER NAMES
xx00 (This position shall not be used)
xx01 CHEROKEE LETTER A
xx02 CHEROKEE LETTER E
xx03 CHEROKEE LETTER I
xx04 CHEROKEE LETTER O
xx05 CHEROKEE LETTER U
xx06 CHEROKEE LETTER V
xx07 CHEROKEE LETTER GA
xx08 CHEROKEE LETTER KA
xx09 CHEROKEE LETTER GE
xx0A CHEROKEE LETTER GI
xx0B CHEROKEE LETTER GO
xx0C CHEROKEE LETTER GU
xx0D CHEROKEE LETTER GV
xx0E CHEROKEE LETTER HA
xx0F CHEROKEE LETTER HE
xx10 CHEROKEE LETTER HI
xx11 CHEROKEE LETTER HO
xx12 CHEROKEE LETTER HU
xx13 CHEROKEE LETTER HV
xx14 CHEROKEE LETTER LA
xx15 CHEROKEE LETTER LE
xx16 CHEROKEE LETTER LI
xx17 CHEROKEE LETTER LO
xx18 CHEROKEE LETTER LU
xx19 CHEROKEE LETTER LV
xx1A CHEROKEE LETTER MA
xx1B CHEROKEE LETTER ME
xx1C CHEROKEE LETTER MI
xx1D CHEROKEE LETTER MO
xx1E CHEROKEE LETTER MU
xx1F CHEROKEE LETTER NA
xx20 CHEROKEE LETTER HNA
xx21 CHEROKEE LETTER NAH
xx22 CHEROKEE LETTER NE
xx23 CHEROKEE LETTER NI
xx24 CHEROKEE LETTER NO
xx25 CHEROKEE LETTER NU
xx26 CHEROKEE LETTER NV
xx27 CHEROKEE LETTER QUA
xx28 CHEROKEE LETTER QUE
xx29 CHEROKEE LETTER QUI
xx2A CHEROKEE LETTER QUO
xx2B CHEROKEE LETTER QUU
xx2C CHEROKEE LETTER QUV
xx2D CHEROKEE LETTER SA
xx2E CHEROKEE LETTER S
xx2F CHEROKEE LETTER SE
xx30 CHEROKEE LETTER SI
xx31 CHEROKEE LETTER SO
xx32 CHEROKEE LETTER SU
xx33 CHEROKEE LETTER SV
xx34 CHEROKEE LETTER DA
xx35 CHEROKEE LETTER TA
xx36 CHEROKEE LETTER DE
xx37 CHEROKEE LETTER TE
xx38 CHEROKEE LETTER DI
xx39 CHEROKEE LETTER TI
xx3A CHEROKEE LETTER DO
xx3B CHEROKEE LETTER DU
xx3C CHEROKEE LETTER DV
xx3D CHEROKEE LETTER DLA
xx3E CHEROKEE LETTER TLA
xx3F CHEROKEE LETTER TLE
xx40 CHEROKEE LETTER TLI
xx41 CHEROKEE LETTER TLO
xx42 CHEROKEE LETTER TLU
xx43 CHEROKEE LETTER TLV
xx44 CHEROKEE LETTER TSA
xx45 CHEROKEE LETTER TSE
xx46 CHEROKEE LETTER TSI
xx47 CHEROKEE LETTER TSO
xx48 CHEROKEE LETTER TSU
xx49 CHEROKEE LETTER TSV
xx4A CHEROKEE LETTER WA
xx4B CHEROKEE LETTER WE
xx4C CHEROKEE LETTER WI
xx4D CHEROKEE LETTER WO
xx4E CHEROKEE LETTER WU
xx4F CHEROKEE LETTER WV
xx50 CHEROKEE LETTER YA
xx51 CHEROKEE LETTER YE
xx52 CHEROKEE LETTER YI
xx53 CHEROKEE LETTER YO
xx54 CHEROKEE LETTER YU
xx55 CHEROKEE LETTER YV
xx56 CHEROKEE LETTER ARCHAIC HV
xx57 (This position shall not be used)
xx58 (This position shall not be used)
xx59 (This position shall not be used)
xx5A (This position shall not be used)
xx5B (This position shall not be used)
xx5C (This position shall not be used)
xx5D (This position shall not be used)
xx5E (This position shall not be used)
xx5F (This position shall not be used)
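The `xx` prefix in the list above stands for the row that is eventually assigned. As a minimal sketch, assuming the block starts at 1500 as suggested in section C.5, an offset can be turned into a code position as follows. `BASE`, `NAMES`, `UNUSED` and `code_position` are hypothetical names introduced for illustration, and only a few rows of the list are copied.

```python
BASE = 0x1500  # assumption: first position of the suggested range 1500-155F

# A few rows copied from the list above (offset within the block -> name).
NAMES = {
    0x01: "CHEROKEE LETTER A",
    0x06: "CHEROKEE LETTER V",
    0x2E: "CHEROKEE LETTER S",
    0x55: "CHEROKEE LETTER YV",
    0x56: "CHEROKEE LETTER ARCHAIC HV",
}

# Positions marked "shall not be used": xx00 and xx57 through xx5F.
UNUSED = {0x00} | set(range(0x57, 0x60))


def code_position(offset: int) -> int:
    """Return the code position for an offset within the proposed block."""
    if not 0x00 <= offset <= 0x5F:
        raise ValueError(f"offset {offset:#04x} is outside the six-column block")
    if offset in UNUSED:
        raise ValueError(f"offset {offset:#04x} shall not be used")
    return BASE + offset


for offset, name in sorted(NAMES.items()):
    print(f"{code_position(offset):04X} {name}")
```

If a different range were allocated, only `BASE` would change; the offsets and names would stay as listed.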
4.0 Some notes on the history of this proposal
On 2 July 1992 I sent a message to the ISO10646@JHUVM list asking whether anyone had
taken an interest in encoding Cherokee in the standard. I referred to several sources
which I had at the time for the syllabary (listed in the references above). I noted that Diringer and Haarmann had the usual alphabetical order of the syllables
wrong, and gave a list which followed that given in Holmes & Smith, which forms the
basis for this proposal.
4.1 Contemporary sorting order.
Lloyd Anderson got in touch with me a few days later, and I replied to some of the questions he raised, including why I had suggested that some of the orderings were "wrong". At some stage I had forwarded these same comments to Rick McGowan of the Unicode Consortium, and as far as I know these were used in the draft proposal for Cherokee in the Unicode Technical Report #3; certainly the tables looked a lot like the ASCII tables I sent.
Lloyd said: "As someone with interests both in scripts (as a linguist) and in computing,
I might have thought that you would want a regular arrangement of 8 code points per
consonant (so far as that is possible), keeping the relative positions of the regular 6 vowels constant within each set of 8."
I said: "Following is some information to answer some of your questions about Cherokee
coding. So far as I know, there is no coded standard for Cherokee; there is however an accepted alphabetical order, given in Holmes and Smith, which was not followed by Haarmann or Diringer. I like your idea of a regular arrangement of code per consonant, but the Cherokee syllabary is a finite set and I assumed that a strict run from D a
to B yv would make, for instance, sorting more straightforward than the aesthetic symmetry of relative positions, which would leave blanks in the set.
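The trade-off between the two layouts can be estimated with a rough count. The sketch below groups the 85 syllables in current use into series by reading the names list in section 3.0; the grouping (for example, counting GA and KA in one series, or NA, HNA and NAH in one series) is an assumption made for illustration, not something the proposal specifies.

```python
# Series sizes counted from the names list in section 3.0.
# Assumption: paired names such as GA/KA or DA/TA belong to one series.
SERIES = {
    "vowels": 6, "g/k": 7, "h": 6, "l": 6, "m": 5, "n/hn": 8, "qu": 6,
    "s": 7, "d/t": 9, "dl/tl": 7, "ts": 6, "w": 6, "y": 6,
}
SLOTS_PER_SERIES = 8  # the regular arrangement suggested by Lloyd Anderson

total = sum(SERIES.values())
grid_positions = len(SERIES) * SLOTS_PER_SERIES
blanks = sum(max(SLOTS_PER_SERIES - n, 0) for n in SERIES.values())
overflow = sum(max(n - SLOTS_PER_SERIES, 0) for n in SERIES.values())

print(f"syllables in use: {total}")
print(f"positions in a regular grid: {grid_positions}")
print(f"blank slots: {blanks}, syllables that do not fit their row: {overflow}")
```

Under these assumptions a strict run needs one position per syllable, while the regular grid leaves blanks in most rows and still cannot hold the longest series in eight slots.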
"First, the "wrong orderings" given in the writing-system books. Haarmann (1990:260)
gives (in German) a chart (see Exhibit 2 appended to this document) of the syllabary;
Haarmann wrote underlined a for the traditional transliteration v, which I use here. Clearly, Haarmann is keeping the relative positions of the regular
six vowels constant within each set of eight for the sake of his chart.
Diringer (1948:175) gives the following chart, in which he is apparently changing the order from six ranges across to five in order to fit the syllabary conveniently on the page. Note too he uses O WITH DIAERESIS for the usual -v, as well as dz for ts.
Holmes and Smith (1976), in their teaching grammar, give a chart (see the charts on
page 3 of this document and Exhibit 5) which is apparently the traditionally-accepted
and expected order.
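Because the proposed code positions follow this accepted order, comparing code positions reproduces it without a separate collation table. The sketch below illustrates the idea with offsets copied from the names list in section 3.0; the syllable sequences are arbitrary test data chosen for illustration, and `OFFSET` and `sort_key` are hypothetical names.

```python
# Offsets copied from the names list in section 3.0 (syllable -> xx position).
OFFSET = {
    "a": 0x01, "ga": 0x07, "gi": 0x0A, "la": 0x14, "ma": 0x1A,
    "quo": 0x2A, "se": 0x2F, "du": 0x3B, "tsa": 0x44, "ya": 0x50,
}


def sort_key(syllables):
    """Compare sequences syllable by syllable, by proposed code position."""
    return [OFFSET[s] for s in syllables]


# Arbitrary syllable sequences used only as test data.
items = [
    ["tsa", "la", "gi"],
    ["se", "quo", "ya"],
    ["a", "ma"],
    ["ga", "du"],
]

ordered = sorted(items, key=sort_key)
for syllables in ordered:
    print("-".join(syllables))
```

A layout with gaps would sort the same way as long as the series stay in the same relative order, so the gain from the strict run is mainly compactness and simplicity.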
4.2 Original sorting order.
The fullest story on the history of Cherokee sorting order is given in Kilpatrick
and Kilpatrick, who reprint an article by Worcester in the first issue of the Cherokee Phoenix (Tsalagi Tsulehisanvhi), 21 February 1828. See Exhibit 3 appended to this document.
Sequoyah's own original sorting order is given in his handwriting in Foster 1885 (see
Exhibit 4 appended to this document). The 27th character in the series is the character
CHEROKEE LETTER ARCHAIC HV which was not implemented in the nineteenth century or since, and which does not appear in the painting of Sequoyah reproduced in Haarmann
1990. Compare the facsimile in Exhibit 4 with the charts here:
Worcester's "proposed systematic arrangement", with slight modifications, is the one
found in Foreman 1959 (Exhibit 1), Holmes and Smith 1976 (Exhibit 5), and on page
3 of this document. It was accepted as the standard order and is used today as given
in Holmes and Smith. In this proposal I have reserved a place for the never-used 86th
letter CHEROKEE LETTER ARCHAIC HV, and I have modified the fonts used here to create a printed
form of this character following the facsimile given in Foster 1885. It may not be
necessary to encode this character, but it is given as the final one in the sequence,
though this might be rather pointless if the character was never used.
Sequoyah also invented a system for representing numbers, which was abandoned quite
early in favour of Arabic. A copy of it is given in Sequoyah's handwriting in Holmes
and Smith. I don't think it was ever set into type. They say: "Sequoyah then spent
much time thinking into and creating a system of numbers on the decimal system which
he laid before the tribal council. They sensibly voted Sequoyah's numbers out, as
Arabic numerals, which are simpler, were already in use."
5.0 This proposal to WG2.
In July 1994 Joe LoCicero initiated a discussion list TSALAGI@JOYCE.ENG.YALE.EDU for
Cherokee encoding questions. Discussion was intense, interesting, and stimulating,
but nothing much has been heard from that list for some time. Recently I spoke with
Dr Gloria Sly, deputy director of the Education Department at Cherokee Nation about encoding
Cherokee; we agreed that I would submit this proposal to WG2 and copy it to her for
local evaluation. She agreed that the sorting order given here was the same one which she is using, and that if the 86th character were to be encoded it should come
at the end of the series. This proposal is submitted on behalf of the Cherokee Nation
with Gloria's permission. I will liaise with the Cherokee between the Geneva meeting
in April 1995 and the Helsinki meeting in June 1995. If, as I believe, there is nothing
controversial in this encoding proposal, it should be possible for WG2 to vote on
the Cherokee script at the meeting in Helsinki.
5.1 Outstanding issues to be resolved:
Input from the Cherokee Nation is requested on the following two points:
1. Is the ordering here, based on Worcester's syllabic arrangement, acceptable for
encoding Cherokee? If not, shall Sequoyah's original arrangement be employed, or
some other arrangement?
2. Shall the character CHEROKEE LETTER ARCHAIC HV be encoded as the final character
here, or shall it be left unencoded in ISO 10646?
Foreman 1959. Sketch of Sequoyah's life and of the invention of the syllabary. Facsimile
of the Cherokee Phoenix, 1828. Facsimile of the script, with Sequoyah's arrangement and Worcester's systematic arrangement.
Haarmann 1990. Discussion of the development of the script. Phonetic chart in non-standard
order. Reproduction of painting of Sequoyah with detail of the Syllabary in his ordering.
Kilpatrick and Kilpatrick [s.d.]. Reprints Worcester's article in the Cherokee Phoenix, 1828-02-21, about the syllabary for which he designed printed type.
Foster 1885. Reproduction of a text in Cherokee with typographical capitals. Facsimile
of Sequoyah's original design for the characters and their original sorting order.
Holmes and Smith 1976. The syllabary as ordered today. Notes on Sequoyah's life. Chart
of the number system and facsimile of the syllabary, with Sequoyah's signature.
Faulmann 1880. The syllabary in non-standard ordering.
Michael Everson, Evertype, Dublin, 2001-09-21