Extended European Subset of ISO/IEC 10646-1

Technical contents of Annex E to ENV 1973:1995

This standard is one of the results of CEN/TC304 work on the characters and the scripts of Europe. It is based in part on the results of another work item that specifies the characters used by the indigenous languages of Europe, and was established as a European Prestandard in December 1995.

This document contains the technical contents of an informative annex of ENV 1973:1995. This information may not be used without reference to the other normative parts of the standard. The technical contents have been made available to assist users in implementing the standard. Copyright of this material is held by CEN and the CEN member bodies. You can obtain a copy of ENV 1973:1995 directly from CEN or from any of the CEN member bodies.

The Extended European Subset of ISO/IEC 10646-1 is defined here as a guide to implementors. It includes 3109 characters and covers a set of historically related alphabetic scripts of singular cultural importance to Europe. Substantial advantages can be expected from its specification and implementation.

The Extended European Subset differs from the Minimum European Subset (MES) mainly in that it selects characters by script and function rather than by use in particular languages. In this way, the Extended European Subset provides a European specification for the needs of more specialized groups in government, industry, publishing, academia, and the private sector, needs that the less general Minimum European Subset does not cover. The technical contents of the MES have also been made available.

The Extended European Subset will help European developers implement all characters belonging to European scripts. This is not intended to imply that European users have no need for non-European scripts, but it is logical to specify a subset consisting of those collections in ISO/IEC 10646-1 which contain European scripts. The non-European scripts will be provided for by non-Europeans. Full functionality with regard to the use of the Extended European Subset characters is not expected to be provided in the near future. The subset is mainly intended to guide product developers, for instance in the cost-effective provision of fonts for rendering devices.

The Extended European Subset (EES) is a selected subset of ISO/IEC 10646-1 for use in Europe. It is a superset of the MES.

The Extended European Subset includes, exhaustively, the collections of ISO/IEC 10646-1 characters containing the Latin, Greek, Cyrillic, Armenian, and Georgian scripts, together with those collections of symbols used academically, commercially, and scientifically in Europe. By including combining characters and phonetic characters of the Latin alphabet (including the International Phonetic Alphabet), the Extended European Subset also provides for the needs of transliteration and transcription of many of the world's languages into the Latin script.
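As a rough illustration of the point about combining characters, the sketch below decomposes a letter used in transliteration into a base letter plus a combining mark and reports which collection each resulting position falls in. It relies on Python's unicodedata module and on Unicode canonical decomposition (NFD), neither of which is part of ENV 1973; the function name collections_used is hypothetical, and only three ranges from the collection list in this document are included.

```python
import unicodedata

# Only the ranges needed for this illustration, copied from the EES
# collection list: collection number -> (first position, last position).
RANGES = {
    1: (0x0020, 0x007E),  # Basic Latin
    5: (0x0250, 0x02AF),  # IPA Extensions
    7: (0x0300, 0x036F),  # Combining Diacritical Marks
}


def collections_used(text):
    """Decompose text (NFD) and pair each code point with its collection.

    Returns a list of ("U+XXXX", collection_number_or_None) tuples.
    This is an assumption-laden sketch, not a procedure from the standard.
    """
    result = []
    for ch in unicodedata.normalize("NFD", text):
        cp = ord(ch)
        number = next(
            (n for n, (first, last) in RANGES.items() if first <= cp <= last),
            None,
        )
        result.append((f"U+{cp:04X}", number))
    return result


print(collections_used("\u0161"))  # LATIN SMALL LETTER S WITH CARON
print(collections_used("\u0283"))  # LATIN SMALL LETTER ESH (IPA)
```

The first call yields the base letter s in collection 1 followed by the combining caron in collection 7, so a transliteration letter can be represented even where no precomposed form is available.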

The Extended European Subset consists of the following collections in ISO/IEC 10646-1. Each entry gives the collection number, the collection name, and the hexadecimal range. The collection number should be used normatively in implementation.

Collection 1 Basic Latin, 0020-007E
Collection 2 Latin-1 Supplement, 00A0-00FF
Collection 3 Latin Extended-A, 0100-017F
Collection 4 Latin Extended-B, 0180-024F
Collection 5 IPA Extensions, 0250-02AF
Collection 6 Spacing Modifier Letters, 02B0-02FF
Collection 7 Combining Diacritical Marks, 0300-036F
Collection 8 Basic Greek, 0370-03CF
Collection 9 Greek Symbols and Coptic, 03D0-03FF
Collection 10 Cyrillic, 0400-04FF
Collection 11 Armenian, 0530-058F
Collection 27 Basic Georgian, 10D0-10FF
Collection 28 Georgian Extended, 10A0-10CF
Collection 30 Latin Extended Additional, 1E00-1EFF
Collection 31 Greek Extended, 1F00-1FFF
Collection 32 General Punctuation, 2000-206F
Collection 33 Superscripts and Subscripts, 2070-209F
Collection 34 Currency Symbols, 20A0-20CF
Collection 35 Combining Diacritical Marks for Symbols, 20D0-20FF
Collection 36 Letterlike Symbols, 2100-214F
Collection 37 Number Forms, 2150-218F
Collection 38 Arrows, 2190-21FF
Collection 39 Mathematical Operators, 2200-22FF
Collection 40 Miscellaneous Technical, 2300-23FF
Collection 41 Control Pictures, 2400-243F
Collection 42 Optical Character Recognition, 2440-245F
Collection 43 Enclosed Alphanumerics, 2460-24FF
Collection 44 Box Drawing, 2500-257F
Collection 45 Block Elements, 2580-259F
Collection 46 Geometric Shapes, 25A0-25FF
Collection 47 Miscellaneous Symbols, 2600-26FF
Collection 48 Dingbats, 2700-27BF
Collection 63 Alphabetic Presentation Forms, FB00-FB4F
Collection 65 Combining Half Marks, FE20-FE2F
Collection 70 Specials, FFF0-FFFD
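
The list above can be turned into a simple lookup table. The following is a minimal sketch, assuming that a test on range membership is sufficient; it checks positions only and does not know which positions within a collection actually have characters assigned. The names EES_RANGES, ees_collection, and in_ees are hypothetical and do not come from the standard.

```python
# (collection number, first position, last position), transcribed from the
# collection list above.
EES_RANGES = [
    (1, 0x0020, 0x007E), (2, 0x00A0, 0x00FF), (3, 0x0100, 0x017F),
    (4, 0x0180, 0x024F), (5, 0x0250, 0x02AF), (6, 0x02B0, 0x02FF),
    (7, 0x0300, 0x036F), (8, 0x0370, 0x03CF), (9, 0x03D0, 0x03FF),
    (10, 0x0400, 0x04FF), (11, 0x0530, 0x058F), (27, 0x10D0, 0x10FF),
    (28, 0x10A0, 0x10CF), (30, 0x1E00, 0x1EFF), (31, 0x1F00, 0x1FFF),
    (32, 0x2000, 0x206F), (33, 0x2070, 0x209F), (34, 0x20A0, 0x20CF),
    (35, 0x20D0, 0x20FF), (36, 0x2100, 0x214F), (37, 0x2150, 0x218F),
    (38, 0x2190, 0x21FF), (39, 0x2200, 0x22FF), (40, 0x2300, 0x23FF),
    (41, 0x2400, 0x243F), (42, 0x2440, 0x245F), (43, 0x2460, 0x24FF),
    (44, 0x2500, 0x257F), (45, 0x2580, 0x259F), (46, 0x25A0, 0x25FF),
    (47, 0x2600, 0x26FF), (48, 0x2700, 0x27BF), (63, 0xFB00, 0xFB4F),
    (65, 0xFE20, 0xFE2F), (70, 0xFFF0, 0xFFFD),
]


def ees_collection(code_point):
    """Return the number of the EES collection whose range contains
    code_point, or None if the position lies outside every range."""
    for number, first, last in EES_RANGES:
        if first <= code_point <= last:
            return number
    return None


def in_ees(text):
    """True if every character of text falls inside some EES range."""
    return all(ees_collection(ord(ch)) is not None for ch in text)


print(ees_collection(0x10D0))  # a position in Basic Georgian
print(in_ees("a\nb"))          # the line feed is outside every range
```

Because collection 1 begins at 0020, control characters such as the line feed fall outside every range, so a practical check would probably need to treat them separately.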

The following object identifier can be used to identify the Extended European Subset:

{ISO(1) standard(0) 10646 part-(1) implementation-level(3) 1 2 3 4 5 6 7 8 9 10 11 27 28 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 63 65 70}
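
The trailing numbers in the identifier are the collection numbers of the subset. The sketch below, which is only an illustration and not a general parser for object identifiers, reads those trailing numbers back out of the text and compares them with the collection list. The function name collection_numbers is hypothetical, and the identifier string is copied from above (spacing inside the leading components does not affect the result).

```python
# Identifier text as given above.
EES_OID = (
    "{ISO(1) standard(0) 10646 part-(1) implementation-level(3) "
    "1 2 3 4 5 6 7 8 9 10 11 27 28 30 31 32 33 34 35 36 37 38 39 40 "
    "41 42 43 44 45 46 47 48 63 65 70}"
)

# Collection numbers from the EES collection list.
EXPECTED = list(range(1, 12)) + [27, 28] + list(range(30, 49)) + [63, 65, 70]


def collection_numbers(oid_text):
    """Return the run of purely numeric components at the end of oid_text.

    Walks the whitespace-separated components from the right and stops at
    the first one that is not all digits, so the leading components
    (including the bare 10646) are never mistaken for collection numbers.
    """
    tokens = oid_text.strip().strip("{}").split()
    numbers = []
    for token in reversed(tokens):
        if not token.isdigit():
            break
        numbers.append(int(token))
    numbers.reverse()
    return numbers


print(collection_numbers(EES_OID) == EXPECTED)
print(len(collection_numbers(EES_OID)))
```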

The EES can also be designated and invoked by making use of the escape sequences defined in Clause 17.3 of ISO/IEC 10646-1.

Michael Everson, Evertype, everson@egt.ie, Dublin, 1996-08-12