ISO/IEC JTC1/SC2/WG2 N1473R
Date: 1997-03-01

Title: Proposal for encoding the Sinhala script in ISO/IEC 10646 (revision 1)

Source: Michael Everson
Status: Expert Contribution
Action: For consideration by JTC1/SC2/WG2

This proposal is a minor revision of N1473, which itself was a revision of a proposal by Hugh McGregor Ross, from ISO/IEC JTC1/SC2/WG2 N1376.

A. Administrative

1. Title: Proposal for encoding the Sinhala script in ISO/IEC 10646
2. Requester's name: Michael Everson, Evertype (WG2 member for Ireland)
3. Requester type: Expert contribution
4. Submission date: 1997-03-01
5. Requester's reference: http://www.evertype.com/standards/si/si.html; SC2/WG2 N1057, N1376, N1480
6a. Completion: This is a complete proposal.
6b. More information to be provided? No

B. Technical -- General

1a. New script? Name? Yes. Sinhala.
1b. Addition of characters to existing block? Name? No
2. Number of characters: 103
3. Proposed category: Category A
4. Proposed level of implementation and rationale: Sinhala requires Level 2 implementation, as other Brahmic scripts do.
5a. Character names included in proposal? Yes
5b. Character names in accordance with guidelines? Yes
5c. Character shapes reviewable? Yes (see below)
6a. Who will provide computerized font? Michael Everson, Evertype
6b. Font currently available? Michael Everson, Evertype
6c. Font format? TrueType
7a. Are references (to other character sets, dictionaries, descriptive texts, etc.) provided? Yes. An 8-bit coded character set is referred to in JTC1/SC2/WG2 N1376. Other references are given below.
7b. Are published examples (such as samples from newspapers, magazines, or other sources) of use of proposed characters attached? No. Sinhala is well known. The samples are to be found in the references.
8. Does the proposal address other aspects of character data processing? Yes (see below)

C. Technical -- Justification

1. Contact with the user community? Yes. Camillus Jayewardena (University of Ruhuna, Matara, Sri Lanka), Mettavihari (IBRIC - International Buddhist Research & Information Centre, Sri Lanka). Also with Rick McGowan, Hugh McGregor Ross, Lee Collins, Lloyd Anderson, and others.
2. Information on the user community? 17,464,000 people live in Sri Lanka (1992 estimate). 74% of those are Sinhala speakers, and they and the remaining 26% are also potential users of the script as encoded in ISO 10646.
3a. The context of use for the proposed characters? Sinhala script is commonly used to write Sinhala. Encoding Sinhala is also important because of the great many Buddhist scriptures and commentaries in Pali which are written in Sinhala script.
3b. Reference: Unicode Technical Report #2
4a. Proposed characters in current use? Yes
4b. Where? In Sri Lanka.
5a. Characters should be encoded entirely in BMP? Yes
5b. Rationale: Sinhala is a major Category A script.
6. Should characters be kept in a continuous range? Yes
7a. Can the characters be considered a presentation form of an existing character or character sequence? No
7b. Where? 
7c. Reference 
8a. Can any of the characters be considered to be similar (in appearance or function) to an existing character? No
8b. Where? 
8c. Reference 
9a. Combining characters or use of composite sequences included? No
9b. List of composite sequences and their corresponding glyph images provided? No
10. Characters with any special properties such as control function, etc. included? No

D. SC2/WG2 Administrative

To be completed by SC2/WG2
1. Relevant SC2/WG2 document numbers: N1057, N1376, N1473, N1480
2. Status (list of meeting number and corresponding action or disposition) 
3. Additional contact to user communities, liaison organizations etc. 
4. Assigned category and assigned priority/time frame 
Other Comments 

E. Proposal

Issues

ISO/IEC JTC1/SC2/WG2 N1480 from the Sri Lanka Standards Institution is a proposal to register the Sri Lankan 7-bit encoding as the Sinhala encoding in ISO/IEC 10646. Its character set is not large enough for the wider historical requirements of 10646 encoding (the archaic numbers, for instance, are omitted). The chief objection to the Brahmic encoding presented here appears to be the sorting requirement for the modern Sinhala language. Sorting shall not be dependent upon the order of elements in the code table, and the Sri Lankan requirement can easily be met by the work of ISO/IEC JTC1/SC22/WG20, who are providing sorting for all of 10646. Brahmic encoding should have wide-ranging benefits to the Theravada Buddhist community in Sri Lanka, as regards interchange of texts in the Pali language to other scripts, such as Burmese. Brahmic encoding would also facilitate the transfer of data from Sinhala script into Tamil script, an acknowledged requirement in Sri Lanka.
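The point that sorting need not follow code table order can be illustrated with a small Python sketch. It is not the WG20 ordering mechanism, only a toy weight table. The code points are taken from the character names table in this proposal (they are not claimed to match any published standard), and the collation order shown, with AE and AEE placed directly after AA, is an assumption made for illustration.

```python
# Code points as listed in this proposal's table (illustrative only).
A, AA, I, KA = 0x0D85, 0x0D86, 0x0D87, 0x0D95
AE, AEE = 0x0DD0, 0x0DD1

# Hypothetical collation order: AE and AEE sort right after AA,
# even though their code points lie far later in the block.
COLLATION_ORDER = [A, AA, AE, AEE, I, KA]
WEIGHT = {cp: rank for rank, cp in enumerate(COLLATION_ORDER)}


def sort_key(text):
    """Map each character to its collation weight.

    Characters missing from the toy table sort after all known ones,
    falling back to code point order among themselves.
    """
    return [WEIGHT.get(ord(ch), len(WEIGHT) + ord(ch)) for ch in text]


words = [chr(I), chr(AE), chr(A), chr(KA), chr(AA)]

by_code = sorted(words)                     # raw code point order
by_collation = sorted(words, key=sort_key)  # order from the weight table

print([f"{ord(w):04X}" for w in by_code])
print([f"{ord(w):04X}" for w in by_collation])
```

In code point order AE lands last, after KA; with the weight table it lands after AA. The arrangement of the code table therefore places no constraint on the sorted result.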

The character repertoire in this proposal is a superset of the repertoire in Sri Lankan Standard SLS 1134:1996. Brahmic encoding of the repertoire here is appropriate in the unified context of Brahmic scripts in ISO/IEC 10646 -- and will facilitate software development for the Sinhala script.

This proposal is considered to be extremely stable. Positions U+0D80 -> U+0DFF are proposed for the Sinhala block.


References


Sinhala code table


Sinhala character names

000	0D80	(This position shall not be used)
001	0D81	(This position shall not be used)
002	0D82	SINHALA SIGN ANUSVARA (naasikya, anusvaaraya, binduva)
003	0D83	SINHALA SIGN VISARGA (visargaya, visarjaniiya)
004	0D84	(This position shall not be used)
005	0D85	SINHALA LETTER A (ayanna)
006	0D86	SINHALA LETTER AA (aayanna)
007	0D87	SINHALA LETTER I (iyanna)
008	0D88	SINHALA LETTER II (iiyanna)
009	0D89	SINHALA LETTER U (uyanna)
010	0D8A	SINHALA LETTER UU (uuyanna)
011	0D8B	SINHALA LETTER VOCALIC R (iruyanna)
012	0D8C	SINHALA LETTER VOCALIC L (iluyanna)
013	0D8D	(This position shall not be used)
014	0D8E	SINHALA LETTER E (eyanna)
015	0D8F	SINHALA LETTER EE (eeyanna)
016	0D90	SINHALA LETTER AI (aiyanna)
017	0D91	(This position shall not be used)
018	0D92	SINHALA LETTER O (oyanna)
019	0D93	SINHALA LETTER OO (ooyanna)
020	0D94	SINHALA LETTER AU (avyanna)
021	0D95	SINHALA LETTER KA (alpapraana kayanna)
022	0D96	SINHALA LETTER KHA (mahaapraana khayanna)
023	0D97	SINHALA LETTER GA (alpapraana gayanna)
024	0D98	SINHALA LETTER GHA (mahaapraana ghayanna)
025	0D99	SINHALA LETTER NGA (kakudya naasika)
026	0D9A	SINHALA LETTER CA (alpapraana cayanna)
027	0D9B	SINHALA LETTER CHA (mahaapraana chayanna)
028	0D9C	SINHALA LETTER JA (alpapraana jayanna)
029	0D9D	SINHALA LETTER JHA (mahaapraana jhayanna)
030	0D9E	SINHALA LETTER NYA (taaluja naasika)
031	0D9F	SINHALA LETTER TTA (alpapraana ttayanna)
032	0DA0	SINHALA LETTER TTHA (mahaapraana tthayanna)
033	0DA1	SINHALA LETTER DDA (alpapraana ddayanna)
034	0DA2	SINHALA LETTER DDHA (mahaapraana ddhayanna)
035	0DA3	SINHALA LETTER NNA (muurddhaja nnayanna)
036	0DA4	SINHALA LETTER TA (alpapraana tayanna)
037	0DA5	SINHALA LETTER THA (mahaapraana thayanna)
038	0DA6	SINHALA LETTER DA (alpapraana dayanna)
039	0DA7	SINHALA LETTER DHA (mahaapraana dhayanna)
040	0DA8	SINHALA LETTER NA (dantaja nayanna)
041	0DA9	SINHALA LETTER NNNA (Tamil)
042	0DAA	SINHALA LETTER PA (alpapraana payanna)
043	0DAB	SINHALA LETTER PHA (mahaapraana phayanna)
044	0DAC	SINHALA LETTER BA (alpapraana bayanna)
045	0DAD	SINHALA LETTER BHA (mahaapraana bhayanna)
046	0DAE	SINHALA LETTER MA (mayanna)
047	0DAF	SINHALA LETTER YA (yayanna)
048	0DB0	SINHALA LETTER RA (rayanna)
049	0DB1	SINHALA LETTER RRA (Tamil)
050	0DB2	SINHALA LETTER LA (dantaja layanna)
051	0DB3	SINHALA LETTER LLA (muurddhaja layanna)
052	0DB4	SINHALA LETTER LLLA (Tamil)
053	0DB5	SINHALA LETTER VA (vayanna)
054	0DB6	SINHALA LETTER SHA (taaluja shayanna)
055	0DB7	SINHALA LETTER SSA (muurddhaja ssayanna)
056	0DB8	SINHALA LETTER SA (dantaja sayanna)
057	0DB9	SINHALA LETTER HA (hayanna)
058	0DBA	(This position shall not be used)
059	0DBB	(This position shall not be used)
060	0DBC	(This position shall not be used)
061	0DBD	(This position shall not be used)
062	0DBE	SINHALA VOWEL SIGN AA (aelapilla)
063	0DBF	SINHALA VOWEL SIGN I (keti ispilla)
064	0DC0	SINHALA VOWEL SIGN II (diirgha ispilla)
065	0DC1	SINHALA VOWEL SIGN U (keti papilla)
066	0DC2	SINHALA VOWEL SIGN UU (diirgha papilla)
067	0DC3	SINHALA VOWEL SIGN VOCALIC R (gaettee sahita aelapilla)
068	0DC4	SINHALA VOWEL SIGN VOCALIC RR (gaetta sahita aelapilli deka)
069	0DC5	(This position shall not be used)
070	0DC6	SINHALA VOWEL SIGN E (kombuva)
071	0DC7	SINHALA VOWEL SIGN EE (diirgha kombuva)
072	0DC8	SINHALA VOWEL SIGN AI (kombu deka)
073	0DC9	(This position shall not be used)
074	0DCA	SINHALA VOWEL SIGN O (kombuva saha aelapilla)
075	0DCB	SINHALA VOWEL SIGN OO (diirgha kombuva saha aelapilla)
076	0DCC	SINHALA VOWEL SIGN AU (kombuva saha gayanukitta)
077	0DCD	SINHALA SIGN VIRAMA (al-lakuna)
078	0DCE	(This position shall not be used)
079	0DCF	(This position shall not be used)
080	0DD0	SINHALA LETTER AE (aeyanna)
081	0DD1	SINHALA LETTER AEE (aeeyanna)
082	0DD2	SINHALA VOWEL SIGN AE (keti aedhapilla)
083	0DD3	SINHALA VOWEL SIGN AEE (diirgha aedhapilla)
084	0DD4	(This position shall not be used)
085	0DD5	(This position shall not be used)
086	0DD6	(This position shall not be used)
087	0DD7	(This position shall not be used)
088	0DD8	SINHALA LETTER NYGA (kakudya arddha naasika)
089	0DD9	SINHALA LETTER JNYA (taaluja naasika sanynyakaya)
090	0DDA	SINHALA LETTER NYJA (sanynyaka jayanna)
091	0DDB	SINHALA LETTER NNDDA (muurddhaja ddayanna)
092	0DDC	SINHALA LETTER NDA (dantaja dayanna)
093	0DDD	SINHALA LETTER MBA (amba bayanna)
094	0DDE	SINHALA LETTER FA (fayanna)
095	0DDF	(This position shall not be used)
096	0DE0	SINHALA LETTER VOCALIC RR (iruuyanna)
097	0DE1	SINHALA LETTER VOCALIC LL (iluuyanna)
098	0DE2	SINHALA VOWEL SIGN VOCALIC L (gayanukitta)
099	0DE3	SINHALA VOWEL SIGN VOCALIC LL (diirgha gayanukitta)
100	0DE4	(This position shall not be used)
101	0DE5	(This position shall not be used)
102	0DE6	(This position shall not be used)
103	0DE7	SINHALA DIGIT ONE
104	0DE8	SINHALA DIGIT TWO
105	0DE9	SINHALA DIGIT THREE
106	0DEA	SINHALA DIGIT FOUR
107	0DEB	SINHALA DIGIT FIVE
108	0DEC	SINHALA DIGIT SIX
109	0DED	SINHALA DIGIT SEVEN
110	0DEE	SINHALA DIGIT EIGHT
111	0DEF	SINHALA DIGIT NINE
112	0DF0	SINHALA NUMBER TEN
113	0DF1	SINHALA NUMBER TWENTY
114	0DF2	SINHALA NUMBER THIRTY
115	0DF3	SINHALA NUMBER FORTY
116	0DF4	SINHALA NUMBER FIFTY
117	0DF5	SINHALA NUMBER SIXTY
118	0DF6	SINHALA NUMBER SEVENTY
119	0DF7	SINHALA NUMBER EIGHTY
120	0DF8	SINHALA NUMBER NINETY
121	0DF9	SINHALA NUMBER ONE HUNDRED
122	0DFA	SINHALA NUMBER ONE THOUSAND
123	0DFB	(This position shall not be used)
124	0DFC	(This position shall not be used)
125	0DFD	(This position shall not be used)
126	0DFE	(This position shall not be used)
127	0DFF	SINHALA SIGN KUNDALIYA (kunnddaliya)
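The relationship between the decimal positions and the hexadecimal code points in the table above, and the count of assigned characters, can be checked with a short Python sketch. The function and constant names are hypothetical. The set of unused positions is transcribed from the rows marked "This position shall not be used", and the layout is that of this proposal only.

```python
BLOCK_START = 0x0D80   # position 000 in the table
BLOCK_SIZE = 128       # positions 000 through 127
BLOCK_END = BLOCK_START + BLOCK_SIZE - 1

# Rows marked "(This position shall not be used)" in the table.
UNUSED = {
    0x0D80, 0x0D81, 0x0D84, 0x0D8D, 0x0D91,
    *range(0x0DBA, 0x0DBE),   # 0DBA..0DBD
    0x0DC5, 0x0DC9, 0x0DCE, 0x0DCF,
    *range(0x0DD4, 0x0DD8),   # 0DD4..0DD7
    0x0DDF,
    *range(0x0DE4, 0x0DE7),   # 0DE4..0DE6
    *range(0x0DFB, 0x0DFF),   # 0DFB..0DFE
}


def code_point_for_index(index):
    """Convert a decimal table position (0..127) to its code point."""
    if not 0 <= index < BLOCK_SIZE:
        raise ValueError(f"position {index} is outside the block")
    return BLOCK_START + index


def in_proposed_block(cp):
    """True if cp falls inside the proposed 128-position block."""
    return BLOCK_START <= cp <= BLOCK_END


def is_assigned(cp):
    """True if the table gives a character name at cp."""
    return in_proposed_block(cp) and cp not in UNUSED


assigned_count = sum(is_assigned(cp) for cp in range(BLOCK_START, BLOCK_END + 1))

print(f"{code_point_for_index(21):04X}")  # position 021, SINHALA LETTER KA
print(assigned_count)                     # 128 positions minus 25 unused = 103
```

The result of 103 agrees with the number of characters stated in section B, item 2.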

Michael Everson, Evertype, Dublin, 2001-09-21