ISO/IEC JTC1/SC2/WG2 N1___
DATE: 1997-12-22

DOC TYPE: Expert contribution
TITLE: DRAFT Proposal to encode Batak in ISO/IEC 10646
SOURCE: Michael Everson
PROJECT: JTC1.02.18.01
STATUS: Proposal.
ACTION ID: FYI
DUE DATE: --
DISTRIBUTION: Worldwide
MEDIUM: Paper and web
NO. OF PAGES: 3


  • Faulmann, Carl. 1990 (1880). Das Buch der Schrift. Frankfurt am Main: Eichborn. ISBN 3-8218-1720-8
  • Haarmann, Harald. 1990. Universalgeschichte der Schrift. Frankfurt/Main; New York: Campus. ISBN 3-593-34346-0
  • Nakanishi, Akira. 1990. Writing systems of the world: alphabets, syllabaries, pictograms. Rutland, VT: Charles E. Tuttle. ISBN 0-8048-1654-9
  • van der Tuuk, H. N. 1864. Tobasche spraakkunst. Amsterdam: Het Nederlands Bijbelgenootschap.

    Preliminary

    This proposal for encoding the Batak script is based on but differs from that given in UTR #3 in the following ways:

    Proposal

    The Batak script is (or was) used to write Karo, Toba, Mandailing, Dairi, and possibly other dialects on the island of Sumatra. The alphabet is called si-siya-siya in Toba. Batak is read from left to right but, like Tagalog and Buhid, is often written vertically along the length of a piece of bamboo.

    The phonetic system of the script is similar to that of the scripts of the Philippines (Tagalog). Like Tagalog and other scripts of the archipelagos between Southeast Asia and Australia, Batak ultimately derives from scripts of India. Batak has a virama, and final consonants are expressed in the script. Like Tagalog, Batak includes only two independent vowels other than A (but several vowel signs are used). The alphabetical order differs from both the primeval Brahmic and Tagalog orders; the accompanying chart is in the order given for Karo.

    (I AM STILL WORKING ON THIS PARAGRAPH TO HARMONIZE IT WITH THE NAMES.) The VOWEL SIGNs I and O, and the PANGOLATs (< virama), are spacing marks. The VOWEL SIGNs E and NG are non-spacing marks. The VOWEL SIGN I is placed after the consonant. The VOWEL SIGN U is placed under the consonant and somewhat to the right. Several ligated forms of letters with the u sound are known. The VOWEL SIGN O is placed after the consonant. The PANGOLAT is likewise placed after the consonant, causing the inherent a vowel to be lost. The final NG is placed above the consonant and somewhat to the right. (Thus, when E and NG occur together on a consonant, there are two dashlike marks above.) The hamisaran is usually written above the vowels i and o. When the PANGOLAT (the devoweller) is used to close a syllable, the vowel sign for the previous vowel is placed either under the final consonant or after the final consonant, and before the PANGOLAT itself.
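
    The placement rule for a closed syllable can be illustrated with a small sketch. It assumes, purely for illustration, that text is stored in pronunciation order (consonant, vowel sign, final consonant, PANGOLAT) and that a renderer moves a spacing vowel sign to sit after the final consonant; this draft does not specify a storage order. The short names, the sample consonant set, and the function visual_order are hypothetical, and only the "after the final consonant" case for the spacing signs I and O is covered.

```python
# Illustrative only: short labels stand in for chart names, and the
# pronunciation-order storage model is an assumption, not part of the draft.
CONSONANTS = {"KA", "PA", "NA", "TA", "SA", "LA"}  # sample subset, not the full set
SPACING_VOWEL_SIGNS = {"I", "O"}  # described in the text as spacing marks
PANGOLAT = "PANGOLAT"


def visual_order(units):
    """Return the units in drawing order.

    When a spacing vowel sign is followed by a consonant that is closed by
    PANGOLAT, the vowel sign is drawn after that consonant and before the
    PANGOLAT. Everything else keeps its position.
    """
    out = list(units)
    i = 0
    while i + 2 < len(out):
        if (
            out[i] in SPACING_VOWEL_SIGNS
            and out[i + 1] in CONSONANTS
            and out[i + 2] == PANGOLAT
        ):
            # Swap the vowel sign with the final consonant.
            out[i], out[i + 1] = out[i + 1], out[i]
            i += 3
        else:
            i += 1
    return out


print(visual_order(["PA", "I", "NA", "PANGOLAT"]))
print(visual_order(["PA", "I", "NA"]))
```

    The first call reorders the closed syllable; the second is left unchanged because no PANGOLAT closes it.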

    Punctuation is not normally used, all letters simply running together, but a sign does exist and is occasionally used to disambiguate similar words or phrases. (This sign is, unfortunately, also known by the same name as the virama, PANGOLAT.) Other signs, called BINDUs, are often used. BINDUGODANG (main bindu) indicates the start of a text, and BINDU NA METEK (small bindu) is used to indicate the start of a new section or verse. BINDUs are often ornamental, carrying names like BINDUPINARJOLMA (bindu in the shape of a man) or BINDUPINARULOK (bindu in the shape of a snake). Such variants are not encoded here.

    A sign called PUSTAHA is also sometimes used to separate a title from the main text, which normally begins on the same line.

    Mandailing: The Mandailing alphabetical order differs somewhat from Toba, and North Mandailing again differs slightly from South Mandailing. Some of the letter shapes are likewise slightly different; these are HA and SA. The rendering forms for the consonant vowel-sign combinations PA+U, SA+U, and LA+U may differ from the forms used for Toba. Mandailing uses two other letters for KA and CA. These two letters are produced by putting a mark called TOMPI onto the normal letters for KA (which is used for HA in Mandailing) and SA. It is not known whether the TOMPI is otherwise productive, so both the Mandailing letters and the TOMPI itself are provisionally included in the chart (see Issues below).
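
    The relationship between the two Mandailing letters and the TOMPI can be written down as a simple mapping. The sketch below uses the character names from the chart; treating the pairs as formal equivalences is an assumption made for illustration, since the draft leaves open whether the letters, the TOMPI, or both will be retained. The functions decompose and compose are hypothetical helpers.

```python
TOMPI = "BATAK COMBINING TOMPI"

# Mandailing letter -> (base letter, mark), following the description above:
# the TOMPI is put onto the normal letters for KA and SA.
DECOMPOSITION = {
    "BATAK LETTER MANDAILING KA": ("BATAK LETTER KA", TOMPI),
    "BATAK LETTER MANDAILING CA": ("BATAK LETTER SA", TOMPI),
}


def decompose(names):
    """Replace each Mandailing letter by its base letter followed by TOMPI."""
    out = []
    for name in names:
        out.extend(DECOMPOSITION.get(name, (name,)))
    return out


def compose(names):
    """Replace each base letter + TOMPI pair by the single Mandailing letter."""
    inverse = {pair: letter for letter, pair in DECOMPOSITION.items()}
    out = []
    i = 0
    while i < len(names):
        pair = tuple(names[i:i + 2])
        if pair in inverse:
            out.append(inverse[pair])
            i += 2
        else:
            out.append(names[i])
            i += 1
    return out


text = ["BATAK LETTER MANDAILING KA", "BATAK LETTER A"]
print(decompose(text))
print(compose(decompose(text)) == text)
```

    If the TOMPI turns out to be productive on other letters, only the mapping would need new entries; if it does not, the two letters could be kept and the mark dropped without affecting the rest of the repertoire.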

    Dairi: Dairi alphabetical order again differs from that of Toba and Mandailing. Dairi does not include the letter NYA. The forms for TA and WA differ significantly from those used for Toba. The vowel sign listed in the chart as U is pronounced more like a closed E and written after the associated consonant rather than under (or attached to) the consonant. The sign SIKORJAN, which is pronounced as a soft H following the associated vowel (i.e. visarga), is placed over the consonant. When final NG (anusvara) is used in Dairi, it goes over the previous consonant rather than over the vowel sign. In Toba, it may optionally go over the vowel if the vowel is not a non-spacing mark.

    Batak is known to have been in use in the mid-1800s. Nakanishi (1975) states that it is "seldom used today." It may be extinct as of this writing (1992). The completeness of this analysis and chart is not known.

    Other issues


    U+xx00	BATAK LETTER A
    U+xx01	BATAK LETTER TOBA A
    U+xx02	BATAK LETTER KA
    U+xx1C	BATAK LETTER MANDAILING KA
    U+xx03	BATAK LETTER BA
    U+xx04	BATAK LETTER TOBA BA
    U+xx05	BATAK LETTER PA
    U+xx06	BATAK LETTER NA
    U+xx07	BATAK LETTER VARIANT NA
    U+xx08	BATAK LETTER WA
    U+xx09	BATAK LETTER GA
    U+xx0A	BATAK LETTER JA
    U+xx0B	BATAK LETTER DA
    U+xx0C	BATAK LETTER NDA
    U+xx0D	BATAK LETTER VARIANT NDA
    U+xx0E	BATAK LETTER RA
    U+xx0F	BATAK LETTER MA
    U+xx10	BATAK LETTER TOBA MA
    U+xx11	BATAK LETTER TA
    U+xx12	BATAK LETTER TOBA TA
    U+xx13	BATAK LETTER SA
    U+xx1D	BATAK LETTER MANDAILING CA
    U+xx14	BATAK LETTER VARIANT SA
    U+xx15	BATAK LETTER YA
    U+xx16	BATAK LETTER NGA
    U+xx17	BATAK LETTER NYA VARIANT CA
    U+xx18	BATAK LETTER LA
    U+xx19	BATAK LETTER CA
    U+xx1A	BATAK LETTER I
    U+xx1B	BATAK LETTER U
    U+xx1E	BATAK VOWEL SIGN KELAWAN (i)
    U+xx1F	BATAK VOWEL SIGN VARIANT KELAWAN (i)
    U+xx20	BATAK VOWEL SIGN KEBERETEN (e)
    U+xx21	BATAK VOWEL SIGN KETEELEENGEN (ee)
    U+xx22	BATAK VOWEL SIGN SIKURUN (u)
    U+xx23	BATAK VOWEL SIGN HABOROTAN (u)
    U+xx24	BATAK VOWEL SIGN KETOLONGEN (o)
    U+xx25	BATAK VOWEL SIGN VARIANT KETOLONGEN (o)
    U+xx26	BATAK VOWEL SIGN KEBINCAREN (ng)
    U+xx27	BATAK VOWEL SIGN KEJERINGEN (h)
    U+xx28	BATAK COMBINING TOMPI
    U+xx29	BATAK PANGOLAT PENENGEN
    U+xx2A	BATAK TOBA PANGOLAT
    U+xx2B	BATAK BINDU
    U+xx2C	BATAK BINDUGODANG
    U+xx2D	BATAK BINDU NA METEK
    U+xx2E	(This position shall not be used)
    U+xx2F	(This position shall not be used)
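
    Because the list above follows alphabetical order rather than code position order, a short script can help confirm that no position is assigned twice and produce a listing by position. The sketch below is only an illustration: it works on offsets within the still unassigned block (the "xx" row is left as is), and the function parse_chart and the four-line SAMPLE are hypothetical.

```python
import re

# One chart line: "U+xx", two hex digits, whitespace, then the name.
LINE = re.compile(r"U\+xx([0-9A-F]{2})\s+(.+)")


def parse_chart(text):
    """Map offset within the block -> character name; reject duplicates."""
    chart = {}
    for raw in text.splitlines():
        match = LINE.fullmatch(raw.strip())
        if not match:
            continue  # skip blank or unrelated lines
        offset = int(match.group(1), 16)
        if offset in chart:
            raise ValueError(f"duplicate position xx{offset:02X}")
        chart[offset] = match.group(2)
    return chart


# A few lines copied from the list above, for demonstration.
SAMPLE = """\
U+xx02\tBATAK LETTER KA
U+xx1C\tBATAK LETTER MANDAILING KA
U+xx03\tBATAK LETTER BA
U+xx2E\t(This position shall not be used)
"""

chart = parse_chart(SAMPLE)
# Parenthesised entries mark positions that are not to be used.
assigned = {k: v for k, v in chart.items() if not v.startswith("(")}
for offset in sorted(assigned):
    print(f"xx{offset:02X}  {assigned[offset]}")
```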

    Michael Everson, Evertype, Dublin, 2001-09-21