| ISO/IEC JTC1/SC2/WG2 N1___ | |
| DATE: | 1998-02-11 |
| DOC TYPE: | Expert contribution |
| TITLE: | Proposal to encode Javanese in the BMP of ISO/IEC 10646 |
| SOURCE: | Michael Everson Jeroen Hellingman |
| PROJECT: | JTC1.02.18.01 |
| STATUS: | Proposal. |
| ACTION ID: | FYI |
| DUE DATE: | -- |
| DISTRIBUTION: | Worldwide |
| MEDIUM: | Paper and web |
| NO. OF PAGES: | 3 (printed at 80%) |
A. Administrative | |
| 1. Title | Proposal to encode Javanese in the BMP of ISO/IEC 10646-1 |
| 2. Requester's name | Michael Everson, Jeroen Hellingman |
| 3. Requester type | Expert request |
| 4. Submission date | 1998-02-12 |
| 5. Requester's reference | |
| 6a. Completion | This is a complete proposal. |
| 6b. More information to be provided? | No |
B. Technical -- General | |
| 1a. New script? Name? | Yes. Javanese |
| 1b. Addition of characters to existing block? Name? | No. |
| 2. Number of characters | 64 |
| 3. Proposed category | Category A |
| 4. Proposed level of implementation and rationale | Level 2 |
| 5a. Character names included in proposal? | Yes |
| 5b. Character names in accordance with guidelines? | Yes |
| 5c. Character shapes reviewable? | Yes |
| 6a. Who will provide computerized font? | Michael Everson |
| 6b. Font currently available? | Michael Everson |
| 6c. Font format? | TrueType |
| 7a. Are references (to other character sets, dictionaries, descriptive texts, etc.) provided? | Yes. |
| 7b. Are published examples (such as samples from newspapers, magazines, or other sources) of use of proposed characters attached? | No |
| 8. Does the proposal address other aspects of character data processing? | Yes |
C. Technical -- Justification | |
| 1. Contact with the user community? | Yes. Jeroen Hellingman. |
| 2. Information on the user community? | Javanese enjoys both scholarly and some popular use. |
| 3a. The context of use for the proposed characters? | Used to represent texts in the Javanese languages. |
| 3b. Reference | See below. |
| 4a. Proposed characters in current use? | Yes. |
| 4b. Where? | In Indonesia, the Netherlands, and elsewhere. |
| 5a. Characters should be encoded entirely in BMP? | Yes |
| 5b. Rationale | Accordance with the Roadmap. |
| 6. Should characters be kept in a continuous range? | Yes |
| 7a. Can the characters be considered a presentation form of an existing character or character sequence? | No. |
| 7b. Where? | |
| 7c. Reference | |
| 8a. Can any of the characters be considered to be similar (in appearance or function) to an existing character? | No |
| 8b. Where? | |
| 8c. Reference | |
| 9a. Combining characters or use of composite sequences included? | Yes, the usual Brahmic matras are used. |
| 9b. List of composite sequences and their corresponding glyph images provided? | No. |
| 10. Characters with any special properties such as control function, etc. included? | No |
D. SC2/WG2 Administrative (to be completed by SC2/WG2) | |
| 1. Relevant SC 2/WG 2 document numbers: | |
| 2. Status (list of meeting number and corresponding action or disposition) | |
| 3. Additional contact to user communities, liaison organizations etc. | |
| 4. Assigned category and assigned priority/time frame | |
| Other Comments | |
Nowadays the script has been replaced by the Latin script and is slowly falling out of use. It is still taught in schools in East and Central Java, but only older people can read and write it easily. Computerized use seems to be of interest mainly to printers who still print traditional literature in the script, and to historians.
The consonants, called _aksara_, all carry an inherent a, which can be altered by adding a vowel sign. When two consonants follow each other directly, the second consonant is written in an alternative form, called _pasangan_, below the first, to indicate that no vowel should be pronounced between them. When a phrase ends with a consonant, a special sign, called _paten_, or _pangkon_ in high language (Sanskrit virama), is used to indicate the absence of the inherent a. Paten is also used when three or more consonants form a cluster, to avoid having to write three consonants below each other. A final aspirate is indicated by _wignyan_ (Sanskrit visarga), a final ng-sound by _cecak_ (Sanskrit anusvara), and a final r-sound by _layar_. Together with the secondary forms of ra (_cakra_), ri (_keret_), and ya (_pengkal_), which are treated specially, these signs and the vowel signs are referred to as _sandangan_.
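The inherent-vowel rule described above can be sketched in a few lines of Python. This is only an illustration: the tokens are symbolic names modelled on the character names in this proposal (LAYAR is taken from the prose), the Latin values are assumed for readability, and `romanize` is a hypothetical helper. Pasangan is not modelled, since the text does not say how it would be represented in encoded data.

```python
# Symbolic tokens; the Latin values are illustrative assumptions only.
CONSONANTS = {"HA": "h", "NA": "n", "CA": "c", "RA": "r", "KA": "k",
              "TA": "t", "SA": "s", "LA": "l", "PA": "p", "MA": "m"}
VOWEL_SIGNS = {"VOWEL SIGN I": "i", "VOWEL SIGN U": "u", "VOWEL SIGN O": "o"}
FINAL_SIGNS = {"WIGNYAN": "h", "CECAK": "ng", "LAYAR": "r"}
PATEN = "PATEN"


def romanize(tokens):
    """Turn a list of symbolic tokens into a rough Latin rendering."""
    out = []
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        nxt = tokens[i + 1] if i + 1 < len(tokens) else None
        if tok in CONSONANTS:
            if nxt in VOWEL_SIGNS:
                # A vowel sign replaces the inherent a.
                out.append(CONSONANTS[tok] + VOWEL_SIGNS[nxt])
                i += 2
            elif nxt == PATEN:
                # Paten suppresses the inherent a.
                out.append(CONSONANTS[tok])
                i += 2
            else:
                # A bare consonant carries the inherent a.
                out.append(CONSONANTS[tok] + "a")
                i += 1
        elif tok in FINAL_SIGNS:
            out.append(FINAL_SIGNS[tok])
            i += 1
        else:
            raise ValueError(f"unexpected token: {tok!r}")
    return "".join(out)
```

For example, `romanize(["PA", "VOWEL SIGN I", "TA", "VOWEL SIGN U"])` gives `"pitu"`, while a bare `["MA", "NA"]` gives `"mana"` because both consonants keep their inherent a.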
When a normal Javanese word starts with a vowel, this is written by applying the respective vowel sign to ha, which represents a weak aspiration. The _sastra-svara_ or independent vowels are only used in Sanskrit and Arabic loanwords that start with a vowel.
The letters that represent aspirated sounds in the Sanskrit sound system have lost their original value, because these sounds do not occur in Javanese. They are instead used in non-final position, in place of their non-aspirated counterparts, as honorific or `capital' letters in the names of persons and places that deserve respect.
Several extra letters have been created by placing three dots above some letters, to represent foreign sounds in loans from Arabic and Dutch. Normally this sign is used with ka (kha), da (da), pa (fa), ja (za), and ga (rha); it is also seen with ha, ta, sa, la, sa-gede, sha-gede, and ba. These three dots can be compared with the nukta in several North Indian scripts.
The Javanese script has its own decimal digits.
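As a rough illustration of how the digits could be used in data, the sketch below converts a non-negative integer to a string of Javanese digits. It assumes the draft layout of this proposal only: a block starting at 1B00 and JAVANESE DIGIT ZERO at position xx2F, with the other digits following in order. Both constants and the helper name are assumptions taken from this draft and should not be read as final assignments.

```python
BLOCK_BASE = 0x1B00        # assumption: start of the area proposed in this draft
DIGIT_ZERO_OFFSET = 0x2F   # assumption: draft position xx2F, JAVANESE DIGIT ZERO


def to_draft_javanese_digits(n: int) -> str:
    """Map each decimal digit of n to the corresponding draft code point."""
    if n < 0:
        raise ValueError("only non-negative integers are handled in this sketch")
    zero = BLOCK_BASE + DIGIT_ZERO_OFFSET
    return "".join(chr(zero + int(d)) for d in str(n))
```

Because the digits are decimal and contiguous in the draft list, the conversion is a plain per-digit offset.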
These three signs can be omitted if the last word of a sentence or sentence part ends with paten.
In verse, punctuation is rather complicated. The end of a line of verse is indicated with a special sign, which depends on the last vowel of the line. Actually these signs are not separators, but indicate the prolonged pronunciation of this last vowel, and thus are in effect vowel signs for the long vowels.
In older Kawi verse, the end of a small part of verse is indicated with _dirga_, which is preceded by a tarung if the word before it does not end with paten.
A sentence is normally started with an _adeg-adeg_ (a double dirga). At the opening of a letter, however, an ornamental sign is used that indicates the relation between the sender and the receiver. A _pada-luhur_ indicates that the sender is higher in rank than the receiver, a _pada-madhya_ is used between people of the same rank, and a _pada-andap_ is used when a person of lower rank addresses a person of higher rank.
Elaborate signs are used at the beginning and end of verse, and of its major subdivisions.
Currently, the area 1B00--1B5F of the BMP is proposed to be allocated to the Javanese script.
It may be necessary to encode word boundaries with ZERO WIDTH SPACE, to make sensible line-breaking possible.
Pancak could be encoded with its filling nature implicit -- that is, the appearance of one pancak character would result in as many repetitions of the graphic as are needed to fill the line. (The same approach could be followed by adding LINE-FILLER and DOT-FILLER characters, but I think this whole idea goes beyond the scope of Unicode.)
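A minimal sketch of what this would mean for a line breaker: the text is split at ZERO WIDTH SPACE (U+200B) and words are packed greedily into lines. The function name is hypothetical, and width is counted in code points purely as a simplifying assumption; a real layout engine would measure rendered glyph widths.

```python
ZWSP = "\u200b"  # ZERO WIDTH SPACE, marking word boundaries in the text


def break_lines(text: str, width: int) -> list[str]:
    """Greedily pack ZWSP-delimited words into lines of at most `width` code points."""
    lines = []
    current = ""
    for word in text.split(ZWSP):
        if current and len(current) + len(word) > width:
            # The next word does not fit: close the line at the word boundary.
            lines.append(current)
            current = word
        else:
            current += word
    if current:
        lines.append(current)
    return lines
```

A word longer than `width` is left on a line of its own rather than split, since the sketch only ever breaks at the marked boundaries.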
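The implicit filling behaviour could look roughly like the sketch below on the rendering side. The stand-in filler character `*`, the function name, and the measurement of line width in character cells are all assumptions made for illustration; nothing here is part of the proposal.

```python
def expand_filler(line: str, width: int, filler: str = "*") -> str:
    """Repeat a single filler character so that the line reaches `width` cells."""
    if line.count(filler) != 1:
        # This sketch only handles lines with exactly one filler character.
        return line
    before, after = line.split(filler)
    # Always draw the filler at least once, even if the line is already full.
    repeats = max(1, width - len(before) - len(after))
    return before + filler * repeats + after
```

With this approach the stored text keeps a single filler character, and the number of repetitions is decided only at display time.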
Tarung is derived from the vowel sign aa.
Dirga mure is derived from the length mark ai, and can be used with taling and taling tarung only.
Nya-gede is derived from the Sanskrit conjunct jnya, but has become a distinct letter in Javanese.
The ordering follows the order given in Roodra [1]. This is the traditional alphabetical order of the script.
| U+xx00 | JAVANESE LETTER HA |
| U+xx01 | JAVANESE LETTER NA |
| U+xx02 | JAVANESE LETTER CA |
| U+xx03 | JAVANESE LETTER RA |
| U+xx04 | JAVANESE LETTER KA |
| U+xx05 | JAVANESE LETTER DA |
| U+xx06 | JAVANESE LETTER TA |
| U+xx07 | JAVANESE LETTER SA |
| U+xx08 | JAVANESE LETTER WA |
| U+xx09 | JAVANESE LETTER LA |
| U+xx0A | JAVANESE LETTER PA |
| U+xx0B | JAVANESE LETTER DHA |
| U+xx0C | JAVANESE LETTER JA |
| U+xx0D | JAVANESE LETTER YA |
| U+xx0E | JAVANESE LETTER NYA |
| U+xx0F | JAVANESE LETTER MA |
| U+xx10 | JAVANESE LETTER GA |
| U+xx11 | JAVANESE LETTER BA |
| U+xx12 | JAVANESE LETTER THA |
| U+xx13 | JAVANESE LETTER NGA |
| U+xx14 | JAVANESE LETTER PA CEREK |
| U+xx15 | JAVANESE LETTER NGA LELET |
| U+xx16 | JAVANESE LETTER NA GEDHE |
| U+xx17 | JAVANESE LETTER CA GEDHE |
| U+xx18 | JAVANESE LETTER KA GEDHE |
| U+xx19 | JAVANESE LETTER TA GEDHE |
| U+xx1A | JAVANESE LETTER SA GEDHE |
| U+xx1B | JAVANESE LETTER SHA GEDHE |
| U+xx1C | JAVANESE LETTER PA GEDHE |
| U+xx1D | JAVANESE LETTER NYA GEDHE |
| U+xx1E | JAVANESE LETTER GA GEDHE |
| U+xx1F | JAVANESE LETTER BA GEDHE |
| U+xx20 | JAVANESE SIGN TRIPLE CECAK |
| U+xx21 | JAVANESE VOWEL SIGN E |
| U+xx22 | JAVANESE VOWEL SIGN I |
| U+xx23 | JAVANESE VOWEL SIGN U |
| U+xx24 | JAVANESE VOWEL SIGN EE |
| U+xx25 | JAVANESE VOWEL SIGN O |
| U+xx26 | JAVANESE SIGN PATEN |
| U+xx27 | JAVANESE SIGN WIGNYAN |
| U+xx28 | JAVANESE SIGN CECAK |
| U+xx29 | JAVANESE SIGN KERET |
| U+xx2A | JAVANESE LETTER A |
| U+xx2B | JAVANESE LETTER I |
| U+xx2C | JAVANESE LETTER U |
| U+xx2D | JAVANESE LETTER E |
| U+xx2E | JAVANESE LETTER O |
| U+xx2F | JAVANESE DIGIT ZERO |
| U+xx30 | JAVANESE DIGIT ONE |
| U+xx31 | JAVANESE DIGIT TWO |
| U+xx32 | JAVANESE DIGIT THREE |
| U+xx33 | JAVANESE DIGIT FOUR |
| U+xx34 | JAVANESE DIGIT FIVE |
| U+xx35 | JAVANESE DIGIT SIX |
| U+xx36 | JAVANESE DIGIT SEVEN |
| U+xx37 | JAVANESE DIGIT EIGHT |
| U+xx38 | JAVANESE DIGIT NINE |
| U+xx39 | JAVANESE PADA-LUNGSI |
| U+xx3A | JAVANESE PADA-LINGSA |
| U+xx3B | JAVANESE PANGKAT |
| U+xx3C | JAVANESE TARUNG |
| U+xx3D | JAVANESE DIRGA |
| U+xx3E | JAVANESE ADEG-ADEG |
| U+xx3F | JAVANESE ULU MELIK |
| U+xx40 | JAVANESE SUKU MENDUT |
| U+xx41 | JAVANESE DIRGA MURE |
| U+xx42 | JAVANESE PANCAK |
| U+xx43 | JAVANESE PADA LUHUR |
| U+xx44 | JAVANESE PADA MADYA |
| U+xx45 | JAVANESE PADA HANDHAP |
| U+xx46 | JAVANESE GURU |
| U+xx47 | JAVANESE PURWA PADA |
| U+xx48 | JAVANESE MADYA PADA |
| U+xx4A | JAVANESE WASANA PADA |
| U+xx4B | JAVANESE ARCHAIC LETTER DA GEDHE |
| U+xx4C | JAVANESE ARCHAIC LETTER AI |
| U+xx4D | JAVANESE ARCHAIC LENGTH MARK |
| U+xx4E | (This position shall not be used) |
| U+xx4F | (This position shall not be used) |