ISO/IEC JTC1/SC2/WG2 N1176
Date: 1995-03-12
This is an unofficial HTML version of a document submitted to WG2.

Title: Names of Burmese characters: comment on Unicode Technical Report #1

Source: Michael Everson
Status: Expert contribution
Action: For consideration by WG2

In April 1993 I submitted the following report to the Unicode Consortium and distributed an ASCII version of it on ISO10646@JHUVM. Some of the comments recommend improvements to the text of the Unicode proposal, but the substantive comments involve character naming conventions, which should be of interest to WG2. Since the text below refers to the UTR, I append that to this contribution for reference. Note that I have proposed two additions to the Unicode proposal, 0F64 BURMESE DANDA and 0F65 BURMESE DOUBLE DANDA. Note too that in this proposal I suggested the name BURMESE SYMBOL 4NG for 0F73, which in its use of the digit 4 violates the naming conventions; BURMESE SYMBOL LENG may be an acceptable name.


1.0 Upon review of Unicode Technical Report #1, the Draft Proposal for Burmese encoding, I offer the following recommendations. They fall into three categories: correction of inconsistencies with Unicode standard practice elsewhere (as in the names of the digits), emendations to the explanatory text, and a proposed solution to the problem of Burmese romanization involving simpler names for the characters with an increased use of aliases for clarification.

Insert the following paragraphs between 3 and 4 of the text:

1.1 Romanization. Romanization of Burmese is made difficult because the sounds of Modern Burmese do not correspond neatly to the Brahmi-based structure of the alphabet. In John Okell's Guide to the romanization of Burmese (1971), no fewer than ten major transcriptions are given. Of these, two could be chosen as representative: one based on graphic transliteration (Okell 1971), and one based on pronunciation (Roop 1972). From the former, it is possible to reconstruct Burmese spelling consistently and unambiguously, while from the latter it is not; on the other hand, the strict transliteration gives an inaccurate picture of modern pronunciation. Since both romanizations enjoy some currency, and since each has its advantages, it has been suggested that both systems be codified here as synonyms or aliases to the Unicode standard name.

For the purposes of the Unicode standard, the names of most Burmese characters have been unified with those of the other Indic scripts where possible; this corresponds rather closely to Okell's standard transliteration and is consistent with the Indic structure of the script. The character names are followed by the Burmese names for these characters in both romanizations and in English translation. The graphic transliteration precedes the pronouncing transliteration, and contains the characters : and . and - to mark tones (essential for accurate transliteration); the pronouncing transliteration omits these. Strictly speaking, initial vowels do not exist in Burmese, since the glottal onset is considered a consonant. The Unicode standard romanization ignores that, preferring E- to 'E- or QE-. Likewise, in the graphic transliteration, consonants have been doubled when they represent dotted consonants, as has been established for other Indic scripts in the Unicode standard.

0F15 BURMESE LETTER KA Unicode standard name (Okell romanization)
= ka. krii: Okell romanization
= ka ji Roop romanization
= great ka translation (Roop romanization)
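
The entry layout above can be sketched as a small record type. This is only an illustration in Python: the class name CharacterEntry, its field names, and the method to_names_list are hypothetical and not part of the proposal. The code point is the one proposed in this document, not an assignment in any published standard.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class CharacterEntry:
    """One proposed character: standard name plus optional aliases.

    Hypothetical illustration only; field names are assumptions.
    """
    code: int                          # code point as proposed in this document
    name: str                          # Unicode standard name
    graphic: Optional[str] = None      # graphic transliteration (Okell)
    pronouncing: Optional[str] = None  # pronouncing romanization (Roop)
    translation: Optional[str] = None  # English translation of the Burmese name

    def aliases(self) -> List[str]:
        """Return the aliases that are present, in the order used above."""
        candidates = (self.graphic, self.pronouncing, self.translation)
        return [a for a in candidates if a]

    def to_names_list(self) -> str:
        """Render the entry as a name line followed by '= alias' lines."""
        lines = [f"{self.code:04X} {self.name}"]
        lines += [f"= {alias}" for alias in self.aliases()]
        return "\n".join(lines)


KA = CharacterEntry(0x0F15, "BURMESE LETTER KA", "ka. krii:", "ka ji", "great ka")
NGA = CharacterEntry(0x0F19, "BURMESE LETTER NGA", "nga.")

print(KA.to_names_list())
```

Characters whose Burmese name is just the letter itself, such as NGA, simply carry fewer alias lines.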


1.2 Other recommendations. Emendation to 4: [ ] When used in this way, this symbol is known as rhe. thui: (hyei htou), or 'thrust forward'.

Emendation to 5: Burmese distinguishes a set of "medial" consonants. Originally conjunct forms of RA, YA, WA, and HA, they are used in modern Burmese to form new letters.

Emendation to 7: When a syllable has more than one medial, it is recommended that they appear in the order that such syllables are traditionally spelled. That is, ha. thui: (ha htou) preceding ya. pang. (ya pin) or ra. rac (ya yi) preceding wa. chai: (wa hswe). Note that ya. pang. and ra. rac cannot appear in the same syllable in Burmese. For example, kywe (cwei) 'to drop off' is coded as 0F15 + 0F2F + 0F35 + 0F47 (KA + YA + WA + E). Mhruu (hmyu) 'to delight, allure' is coded as 0F2E + 0F39 + 0F30 + 0F42 (MA + HA + RA + UU). This differs from the order in which medials are normally written. [Note that the coding given in the UTR Draft Proposal uses separate encoding for the medials (0F5C for ya. pang., 0F5D for ra. rac, 0F5E for wa. chai:, 0F5F for ha. thui:) while 7 specifies otherwise (0F2F, 0F30, 0F35, 0F39).]
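
The ordering rule for medials can be sketched as follows. This is a minimal illustration, not part of the proposal: the function name encode_syllable and the rank table are assumptions, and the integers are the code points proposed in this document, not assignments in any published standard.

```python
# Code points as proposed in this document.
KA, MA = 0x0F15, 0x0F2E
YA, RA, WA, HA = 0x0F2F, 0x0F30, 0x0F35, 0x0F39
VOWEL_SIGN_UU, VOWEL_SIGN_E = 0x0F42, 0x0F47

# Recommended coding order of medials: HA first, then YA or RA, then WA.
MEDIAL_RANK = {HA: 0, YA: 1, RA: 1, WA: 2}


def encode_syllable(base, medials, vowel_signs=()):
    """Return base + medials (in the recommended order) + vowel signs."""
    medials = list(medials)
    unknown = [m for m in medials if m not in MEDIAL_RANK]
    if unknown:
        raise ValueError(f"not a medial: {[f'{m:04X}' for m in unknown]}")
    if len(set(medials)) != len(medials):
        raise ValueError("a medial may appear only once in a syllable")
    if YA in medials and RA in medials:
        # ya. pang. and ra. rac cannot appear in the same syllable.
        raise ValueError("YA and RA medials cannot co-occur")
    ordered = sorted(medials, key=MEDIAL_RANK.__getitem__)
    return [base, *ordered, *vowel_signs]


def show(seq):
    """Format a sequence in the '0F15 + 0F2F' notation used in the text."""
    return " + ".join(f"{cp:04X}" for cp in seq)


# kywe: KA + YA + WA + E, whatever order the medials are supplied in.
print(show(encode_syllable(KA, [WA, YA], [VOWEL_SIGN_E])))
# mhruu: MA + HA + RA + UU
print(show(encode_syllable(MA, [RA, HA], [VOWEL_SIGN_UU])))
```

Sorting by a fixed rank keeps the stored order independent of the order in which the medials happen to be written or typed.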

Emendation to 10:
LETTER U (0F09) and LETTER NYA (0F5F)
LETTER GA (0F17) and DIGIT EIGHT (0F6E)
LETTER WA (0F35) and DIGIT ZERO (0F66)
LETTER RA (0F30) and DIGIT SEVEN (0F6D)
DIGIT FOUR (0F6A) and SYMBOL 4NG (0F73)

Emendation to 12: Also, the LETTER O (0F13) is distinguished from the sequence 0F38 + 0F30 (SA + RA), and the LETTER JHA (0F1D) is distinguished from 0F1A + 0F2F (CA + YA).

Emendation to 13: Symbols not found as single characters are formed from sequences of the basic characters given here. For example, sa. krii: (tha ji) 'great tha' is coded by the sequence 0F38 + 0F4D + 0F38 (SA + VIRAMA + SA), i.e., it is a conjunct formed from two SAs. King:ci: (kinzi) is a conjunct formed from LETTER NGA followed by some other consonant, that is, the sequence 0F19 (NGA) + 0F4D (VIRAMA) + Consonant. Low level tone o (o) has already been noted. Level tone ui (ou) is to be coded as 0F41 + 0F3F (U + I). Other combinations follow similarly.
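
The sequences described above can be sketched with two small helpers. The function names conjunct, kinzi and show are hypothetical, and the integers are the code points proposed in this document, not assignments in any published standard.

```python
# Code points as proposed in this document.
KA, NGA, SA = 0x0F15, 0x0F19, 0x0F38
VOWEL_SIGN_I, VOWEL_SIGN_U = 0x0F3F, 0x0F41
VIRAMA = 0x0F4D


def conjunct(first, second):
    """Code a conjunct as first consonant + VIRAMA + second consonant."""
    return [first, VIRAMA, second]


def kinzi(consonant):
    """Code king:ci: as NGA + VIRAMA + the following consonant."""
    return conjunct(NGA, consonant)


def show(seq):
    """Format a sequence in the '0F38 + 0F4D' notation used in the text."""
    return " + ".join(f"{cp:04X}" for cp in seq)


SA_KRII = conjunct(SA, SA)                 # sa. krii: 'great tha'
LEVEL_UI = [VOWEL_SIGN_U, VOWEL_SIGN_I]    # level tone ui

print(show(SA_KRII))      # 0F38 + 0F4D + 0F38
print(show(kinzi(KA)))    # 0F19 + 0F4D + 0F15
print(show(LEVEL_UI))     # 0F41 + 0F3F
```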

Emendation to 15: The tone mark SIGN DOT BELOW is often written to the left of a subscript vowel sign or medial consonant. [ ] In this case, too, the SIGN DOT BELOW should come after the KILLER in the text stream. For example, the word rhwm. (hyun) (short, high falling tone) should be represented as 0F30 + 0F39 + 0F35 + 0F02 + 0F51 (RA + HA + WA + ANUSVARA + DOT BELOW).
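
A rough sketch of how a text process might enforce this ordering is given below. It is an assumption-laden illustration: the function name reorder_dot_below is hypothetical, only a DOT BELOW standing immediately before a KILLER is handled, and the integers are the code points proposed in this document.

```python
# Code points as proposed in this document.
ANUSVARA = 0x0F02
NGA, RA, WA, HA = 0x0F19, 0x0F30, 0x0F35, 0x0F39
DOT_BELOW, KILLER = 0x0F51, 0x0F52


def reorder_dot_below(seq):
    """Swap each adjacent DOT BELOW + KILLER pair so the KILLER comes first.

    Simplification (an assumption of this sketch): only immediately
    adjacent pairs are considered; everything else is left untouched.
    """
    out = list(seq)
    i = 0
    while i < len(out) - 1:
        if out[i] == DOT_BELOW and out[i + 1] == KILLER:
            out[i], out[i + 1] = out[i + 1], out[i]
            i += 2  # skip past the pair just swapped
        else:
            i += 1
    return out


# The example word from the text has no KILLER and is left unchanged.
RHWM = [RA, HA, WA, ANUSVARA, DOT_BELOW]
print(reorder_dot_below(RHWM) == RHWM)

# A DOT BELOW entered before the KILLER is moved after it.
print(reorder_dot_below([NGA, DOT_BELOW, KILLER]) == [NGA, KILLER, DOT_BELOW])
```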

Emendation to 16: The SYMBOL 4NG (0F73) is only used in the literary combination 0F73 + 0F19 + 0F52 + 0F03 (4NG + NGA + KILLER + TWO DOTS), meaning 'the aforementioned'. [For romanization 4NG cf. Okell 4.1.2.11; LENG: could be possible since 4 is written LE:.]

Emendation to 18: [ ] A notable exception is the pair LETTER NNYA and LETTER NYA. Historically, NYA is a simple palatal nasal, while NNYA is a ligature representing double NYA. NNYA, however, has come to be regarded as the primary form of the letter in Burmese, so it is assigned to the "preferred" ISCII slot for the palatal nasal (0F1E), and NYA is placed at 0F5F.


2.0 References: Okell, John. 1971. A guide to the romanization of Burmese. (James G. Forlong Fund; 27) London: Royal Asiatic Society of Great Britain and Ireland.
Roop, D. Haigh. 1972. An introduction to the Burmese writing system. New Haven and London: Yale University Press.


3.0 BURMESE CHARACTER NAMES

0F00
0F01
@ Various Signs
0F02 BURMESE SIGN ANUSVARA
= se:se: tang
= theidhei tin
= little thing put on
0F03 BURMESE SIGN TWO DOTS
= rhe.ka. pok
= hyeiga pai
= dots ahead
x visarga
0F04
@ Independent Vowels
0F05 BURMESE LETTER A
= a.
0F06
0F07 BURMESE LETTER I
= Paa-lli. atkharaa i.
= Pali ehkaya i
= Pali letter i
0F08 BURMESE LETTER II
= atkharaa ii
= ehkaya ii
= letter ii
0F09 BURMESE LETTER U
= atkharaa u.
= ehkaya u
= letter u
x Burmese nya → 0F5F
0F0A
0F0B BURMESE LETTER VOCALIC R
0F0C BURMESE LETTER VOCALIC L
0F0D
0F0E
0F0F BURMESE LETTER EI
= atkharaa ei:
= ehkaya ei
= letter ei
0F10
0F11
0F12
0F13 BURMESE LETTER O
= o.
x sra → 0F38 + 0F30
0F14
@ Consonants
0F15 BURMESE LETTER KA
= ka. krii:
= ka ji
= great ka
0F16 BURMESE LETTER KHA
= kha. khwe:
= hka gwei
= curved hka
0F17 BURMESE LETTER GA
= ga. ngay
= ga nge
= small ga
0F18 BURMESE LETTER GHA
= gha. krii:
= ga ji
= great ga
0F19 BURMESE LETTER NGA
= nga.
0F1A BURMESE LETTER CA
= ca. lum:
= sa loun
= round sa
0F1B BURMESE LETTER CHA
= cha. lim
= hsa lein
= twisted hsa
0F1C BURMESE LETTER JA
= ja. khwai:
= za gwe
= split za
0F1D BURMESE LETTER JHA
= jha. myang-chwai:
= za myinzwe
= bridle za
x cya → 0F1A + 0F2F
0F1E BURMESE LETTER NNYA
= nnya. krii:
= nya ji
= great nya
0F1F BURMESE LETTER TTA
= tta. samlyang:khyit
= ta talinjei
= bier-hook ta
0F20 BURMESE LETTER TTHA
= ttha. wam:bhai:
= hta wunbe
= duck hta
0F21 BURMESE LETTER DDA
= dda. rang-kok
= da yingau
= crooked-breasted da
0F22 BURMESE LETTER DDHA
= ddha. re-mhut
= da yeihmou
= water-dipper da
0F23 BURMESE LETTER NNA
= nna. krii:
= na ji
= great na
0F24 BURMESE LETTER TA
= ta. wam:puu
= ta wunbu
= pot-bellied ta
0F25 BURMESE LETTER THA
= tha. chang-thuu:
= hta hsindu
= elephant-fetter hta
0F26 BURMESE LETTER DA
= da. twe:
= da dwei
= twisted da
0F27 BURMESE LETTER DHA
= dha. okkhyuik
= da auhcai
= bottom-indented da
0F28 BURMESE LETTER NA
= na. ngay
= na nge
= small na
0F29
0F2A BURMESE LETTER PA
= pa. cok
= pa zau
= steep-sided pa
0F2B BURMESE LETTER PHA
= pha. uuthep
= hpa ouhtou
= capped hpa
0F2C BURMESE LETTER BA
= ba. thakkhyuik
= ba lahcai
= top-indented ba
0F2D BURMESE LETTER BHA
= bha. kun:
= ba goun
= hump-backed ba
0F2E BURMESE LETTER MA
= ma.
0F2F BURMESE LETTER YA
= ya. paklak
= ya pale
= supine ya
0F30 BURMESE LETTER RA
= ra. kok
= ya gau
= crooked ya
x Burmese digit seven → 0F6D
0F31
0F32 BURMESE LETTER LA
= la.
0F33 BURMESE LETTER LLA
= lla. krii:
= la ji
= great la
0F34
0F35 BURMESE LETTER WA
= wa.
x Burmese digit zero → 0F66
0F36 BURMESE LETTER SHA
= sha.
0F37 BURMESE LETTER SSA
= ssa.
0F38 BURMESE LETTER SA
= sa.
= tha
0F39 BURMESE LETTER HA
= ha.
0F3A
0F3B
0F3C
0F3D
@ Dependent Vowel Signs
0F3E BURMESE VOWEL SIGN AA
= re: khya.
= yei hca
= line drawn down
0F3F BURMESE VOWEL SIGN I
= lum:krii: tang
= lounji tin
= big circle put on
0F40 BURMESE VOWEL SIGN II
= lum:krii: tang chan khat
= lounji tin hsan hka
= big circle put on with a grain of rice
0F41 BURMESE VOWEL SIGN U
= takhyong: ngang
= tahcaun ngin
= one stroke drawn out
0F42 BURMESE VOWEL SIGN UU
= nhac-khyong: ngang
= hnacaun ngin
= two strokes drawn out
0F43 BURMESE VOWEL SIGN VOCALIC R
0F44 BURMESE VOWEL SIGN VOCALIC RR
0F45
0F46
0F47 BURMESE VOWEL SIGN E
= sawe thui:
= thawei htou
= thrust in front
0F48 BURMESE VOWEL SIGN AI
= nok pac
= nay pyi
= thrown backwards
0F49
0F4A
0F4B BURMESE VOWEL SIGN O
0F4C
@ Various signs
0F4D BURMESE SIGN VIRAMA
x Burmese sign killer → 0F52
0F4E
0F4F
0F50
0F51 BURMESE SIGN DOT BELOW
= ok-ka. mrac
= auka myi
= stopped below
0F52 BURMESE SIGN KILLER
= asat
= atha
= rhe. thui:
= hyei htou
= thrust forward
x Burmese virama → 0F4D
0F53
0F54
0F55
0F56
0F57
0F58
0F59
0F5A
0F5B
0F5C
0F5D
0F5E
@ Additional Consonants
0F5F BURMESE LETTER NYA
= nya. kale:
= nya galei
= little nya
x Burmese letter u → 0F09
@ Generic additions
0F60 BURMESE LETTER VOCALIC RR
0F61 BURMESE LETTER VOCALIC LL
0F62 BURMESE VOWEL SIGN VOCALIC L
0F63 BURMESE VOWEL SIGN VOCALIC LL
@ Punctuation
0F64 BURMESE DANDA
= pudma.
= pouma
= pudkrii:
= pouci
= section
0F65 BURMESE DOUBLE DANDA
= pudse:
= pouthei
= pudma. ngay
= pouma nge
= little section
@ Numbers
0F66 BURMESE DIGIT ZERO
x Burmese letter wa → 0F35
0F67 BURMESE DIGIT ONE
0F68 BURMESE DIGIT TWO
0F69 BURMESE DIGIT THREE
0F6A BURMESE DIGIT FOUR
x Burmese symbol 4ng → 0F73
0F6B BURMESE DIGIT FIVE
0F6C BURMESE DIGIT SIX
0F6D BURMESE DIGIT SEVEN
x Burmese letter ra → 0F30
0F6E BURMESE DIGIT EIGHT
x Burmese letter ga → 0F17
0F6F BURMESE DIGIT NINE
@ Other signs
0F70 BURMESE SYMBOL RWE
= atkharaa rwe.
= ehkaya ywei
= having done
0F71 BURMESE SYMBOL I
= atkharaa i.
= ehkaya i
= possession
0F72 BURMESE SYMBOL NHUIK
= atkharaa nhuik
= ehkaya hnai
= at, in
0F73 BURMESE SYMBOL 4NG
= atkharaa 4ng:
= atkharaa leng:
= ehkaya lagaun
= the aforementioned
x 0F6A + 0F19
x Burmese digit four → 0F6A
0F74
0F75
0F76
0F77
0F78
0F79
0F7A
0F7B
0F7C
0F7D
0F7E
0F7F
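
A list in this layout lends itself to mechanical checking. The sketch below, offered only as an illustration, parses a short excerpt copied from the list above and reports cross-references that are not reciprocated. The names parse_names_list and one_way_xrefs are hypothetical. The parser reads only the trailing four-digit code of an "x" line, so the separator glyph before it does not matter, and cross-references to sequences (such as 0F38 + 0F30) are deliberately kept out of the excerpt because the sketch does not handle them.

```python
import re

SAMPLE = """\
@ Consonants
0F17 BURMESE LETTER GA
= ga. ngay
= ga nge
= small ga
0F30 BURMESE LETTER RA
= ra. kok
= ya gau
= crooked ya
x Burmese digit seven → 0F6D
@ Numbers
0F6D BURMESE DIGIT SEVEN
x Burmese letter ra → 0F30
0F6E BURMESE DIGIT EIGHT
x Burmese letter ga → 0F17
"""

ENTRY = re.compile(r"^([0-9A-F]{4})(?: (.+))?$")  # code, optional name
XREF = re.compile(r"^x .*?([0-9A-F]{4})$")        # trailing code only


def parse_names_list(text):
    """Map code point -> {'name', 'aliases', 'xrefs'} for named entries."""
    entries = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("@"):
            continue  # blank line or block header
        if line.startswith("= "):
            if current is not None:
                current["aliases"].append(line[2:])
            continue
        match = XREF.match(line)
        if match:
            if current is not None:
                current["xrefs"].append(int(match.group(1), 16))
            continue
        match = ENTRY.match(line)
        if match:
            current = {"name": match.group(2), "aliases": [], "xrefs": []}
            entries[int(match.group(1), 16)] = current
    return entries


def one_way_xrefs(entries):
    """Return (source, target) pairs whose target does not point back."""
    missing = []
    for source, entry in entries.items():
        for target in entry["xrefs"]:
            if target not in entries or source not in entries[target]["xrefs"]:
                missing.append((source, target))
    return sorted(missing)


entries = parse_names_list(SAMPLE)
for source, target in one_way_xrefs(entries):
    print(f"{source:04X} -> {target:04X} is not reciprocated")
```

On this excerpt the check reports the reference from 0F6E to 0F17, since the 0F17 entry carries no matching "x" line.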


Michael Everson, Evertype, Dublin, 2001-09-21