Inuktitut Encoding Principles

This essay discusses the principles of 7-bit, 8-bit, and 16-bit character sets and how they can and should be implemented for Inuktitut.
  1. Historical requirements. When computers first began to be used for Syllabics, the most important thing was to be able to input, display, and print the needed characters. Since it was then much more difficult to alter some aspects of the user interface than it is today, solutions for the placement of Syllabics characters were based on the English-language Latin-script keyboard drivers implemented in the operating systems. In order to place the final syllable series on the unshifted number keys (where they are very convenient for users), it was necessary to move the numbers from their existing positions in the code table to other positions.

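    This, while expedient for input, is a disaster in terms of functionality and text interchange: software that sorts, searches, or calculates expects the digits at their standard code positions and will misinterpret data in which they have been moved.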

  2. Modern requirements. Input, display, and printing are no longer the only requirements of Inuit computer users. Nowadays, with the Internet and with different computer platforms, it is necessary to interchange data and preserve its integrity. 7-bit solutions for Inuktitut are now effectively obsolete: Internet protocols require the use of the Latin script for addressing, and a 7-bit font that places Syllabics on the code positions of the Latin letters cannot represent both scripts in the same text. These protocols are based on plain text and do not recognize rich-text attributes such as "font" to determine whether a character is Syllabic or Latin. It is possible to use 7-bit based Inuktitut fonts in closed systems, but for Internet use, so-called "bilingual" fonts should be used.

  3. Future requirements. The future of computing in Syllabics and in Latin is the Universal Character Set (ISO/IEC 10646 and Unicode) -- and it is very nearly here. Roundtrip conversion of plain text documents between the UCS and 8-bit encodings should be taken very seriously at this point, when 8-bit code tables are still being designed.
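
As a rough illustration of what roundtrip conversion requires, the sketch below builds a conversion table for an 8-bit encoding and checks that plain text survives the trip to the UCS and back. The byte assignments in HYPOTHETICAL_UPPER_HALF are invented for this example and do not reproduce any real Inuktitut code table; the function names are likewise assumptions. The lower half is left as plain ASCII, so that Latin-script addresses and the digits keep their standard positions.

```python
# Hypothetical upper-half assignments, for illustration only.
# This is NOT a real Inuktitut 8-bit code table.
HYPOTHETICAL_UPPER_HALF = {
    0xA1: "\u1403",  # CANADIAN SYLLABICS I
    0xA2: "\u1405",  # CANADIAN SYLLABICS O (used for Inuktitut u)
    0xA3: "\u140A",  # CANADIAN SYLLABICS A
}


def build_tables(upper_half):
    """Return (decode_map, encode_map) for an 8-bit table with an ASCII lower half.

    Raises ValueError if two bytes map to the same character, because such a
    table cannot be converted back from the UCS without losing information.
    """
    decode_map = {b: chr(b) for b in range(0x80)}  # 0x00-0x7F stay ASCII
    decode_map.update(upper_half)
    encode_map = {ch: b for b, ch in decode_map.items()}
    if len(encode_map) != len(decode_map):
        raise ValueError("two bytes map to the same character; round trip is lossy")
    return decode_map, encode_map


DECODE_MAP, ENCODE_MAP = build_tables(HYPOTHETICAL_UPPER_HALF)


def to_ucs(data):
    """Convert bytes in the hypothetical 8-bit encoding to a Unicode string."""
    return "".join(DECODE_MAP[b] for b in data)


def to_8bit(text):
    """Convert a Unicode string back to the hypothetical 8-bit encoding."""
    return bytes(ENCODE_MAP[ch] for ch in text)


# A Latin-script address followed by three Syllabics characters.
sample = b"user@example.org " + bytes([0xA1, 0xA2, 0xA3])
assert to_8bit(to_ucs(sample)) == sample
```

The one-to-one check in build_tables is the point of the sketch: a code table that is still being designed can be tested this way before it is fixed, whereas a table that assigns one character to two bytes, or moves the digits away from their ASCII positions, cannot be repaired after data has been exchanged.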

Michael Everson, Evertype, Dublin, 2001-09-21