ISO/IEC 8859-14:1998, Information technology — 8-bit single-byte coded graphic character sets — Part 14: Latin alphabet No. 8 (Celtic), is part of the ISO/IEC 8859 series of ASCII-based standard character encodings, first edition published in 1998. It is informally referred to as Latin-8 or Celtic. It was designed to cover the Celtic languages, such as Irish, Manx, Scottish Gaelic, Welsh, Cornish, and Breton.
MIME / IANA | ISO-8859-14 |
---|---|
Alias(es) | iso-ir-199, latin8, iso-celtic, l8[1] |
Language(s) | Irish, Manx, Scottish Gaelic, Welsh, Cornish, Breton, English |
Standard | ISO/IEC 8859-14:1998 |
Classification | ISO/IEC 8859 (Extended ASCII, ISO/IEC 4873 level 1) |
Extends | US-ASCII |
Based on | ISO-IR-182 |
ISO-8859-14 is the IANA preferred charset name for this standard when supplemented with the C0 and C1 control codes from ISO/IEC 6429. CeltScript made an extension for Windows called Extended Latin-8. Microsoft has assigned code page 28604 a.k.a. Windows-28604 to ISO-8859-14.[2] FreeDOS assigned code page 58163 to ISO-8859-14.[3]
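As a rough illustration of how these names are used in practice, the sketch below looks up the IANA name and the aliases listed above with Python's standard `codecs` module. It assumes that the Python standard library registers each label as an alias of its `iso8859_14` codec. Other platforms may recognise a different set of names, so the helper `resolve` (a name chosen for this example) returns `None` instead of failing.

```python
import codecs

# Labels taken from the infobox above. Whether a given runtime accepts
# each of them is an assumption that this script checks rather than asserts.
LABELS = ["ISO-8859-14", "iso-ir-199", "latin8", "iso-celtic", "l8"]


def resolve(label):
    """Return the canonical codec name for a label, or None if unknown."""
    try:
        return codecs.lookup(label).name
    except LookupError:
        return None


resolved = {label: resolve(label) for label in LABELS}
for label, name in resolved.items():
    print(f"{label:12} -> {name}")
```

If every label resolves to the same codec name, the aliases are interchangeable on that platform.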
History
ISO-8859-14 was originally proposed for the Sami languages,[4] while ISO 8859-12 was proposed for Celtic.[5] Later, ISO 8859-12 was proposed for Devanagari, so the Celtic proposal was changed to ISO 8859-14. The Sami proposal was changed to ISO 8859-15,[6] but it was rejected as an ISO/IEC 8859 part, although it was registered as ISO-IR-197.[7]
The original proposal used a different arrangement of points 0xA1–BF.[5] At the committee draft stage of the specification, a dotless i was included at 0xAE,[8] which was changed to a registered trademark sign (matching ISO-8859-1) in the final publication.
ISO-IR-182, an earlier modification of ISO-8859-1 registered in 1994, had added the letters Ẁ, Ẃ, Ẅ, Ỳ, Ÿ, Ŵ, Ŷ and their lowercase forms (except for ÿ, which was already included) for Welsh language use.[9] The final published version of ISO-8859-14 includes these letters in the same positions they occupy in ISO-IR-182.
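The Welsh additions can be checked with a short script. This is a minimal sketch, assuming Python's built-in `iso8859_14` and `latin-1` codecs. It does not consult ISO-IR-182 itself, for which Python ships no codec. The helper names `byte_for` and `show` are chosen for this example.

```python
# The Welsh letters listed above, written as escapes so the precomposed
# code points are unambiguous: Ẁ Ẃ Ẅ Ỳ Ÿ Ŵ Ŷ
UPPER = "\u1e80\u1e82\u1e84\u1ef2\u0178\u0174\u0176"
LETTERS = UPPER + UPPER.lower()


def byte_for(ch, encoding):
    """Return the single byte value for ch, or None if it is not encodable."""
    try:
        return ch.encode(encoding)[0]
    except UnicodeEncodeError:
        return None


def show(value):
    """Format a byte value as hex, or '-' when the letter is missing."""
    return "-" if value is None else f"0x{value:02X}"


# Map each letter to (position in ISO-8859-14, position in ISO-8859-1).
positions = {
    ch: (byte_for(ch, "iso8859_14"), byte_for(ch, "latin-1"))
    for ch in LETTERS
}

for ch, (latin8, latin1) in positions.items():
    print(f"{ch}  Latin-8: {show(latin8)}  Latin-1: {show(latin1)}")
```

Only ÿ has a position in both encodings, which matches the exception noted above.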
Codepage layout
Differences from ISO-8859-1 are marked with the Unicode code point after the character.
0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F | |
0x | ||||||||||||||||
1x | ||||||||||||||||
2x | SP | ! | " | # | $ | % | & | ' | ( | ) | * | + | , | - | . | / |
3x | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | : | ; | < | = | > | ? |
4x | @ | A | B | C | D | E | F | G | H | I | J | K | L | M | N | O |
5x | P | Q | R | S | T | U | V | W | X | Y | Z | [ | \ | ] | ^ | _ |
6x | ` | a | b | c | d | e | f | g | h | i | j | k | l | m | n | o |
7x | p | q | r | s | t | u | v | w | x | y | z | { | \| | } | ~ | |
8x | ||||||||||||||||
9x | ||||||||||||||||
Ax | NBSP | Ḃ 1E02 | ḃ 1E03 | £ | Ċ 010A | ċ 010B | Ḋ 1E0A | § | Ẁ 1E80 | © | Ẃ 1E82 | ḋ 1E0B | Ỳ 1EF2 | SHY | ® | Ÿ 0178 |
Bx | Ḟ 1E1E | ḟ 1E1F | Ġ 0120 | ġ 0121 | Ṁ 1E40 | ṁ 1E41 | ¶ | Ṗ 1E56 | ẁ 1E81 | ṗ 1E57 | ẃ 1E83 | Ṡ 1E60 | ỳ 1EF3 | Ẅ 1E84 | ẅ 1E85 | ṡ 1E61 |
Cx | À | Á | Â | Ã | Ä | Å | Æ | Ç | È | É | Ê | Ë | Ì | Í | Î | Ï |
Dx | Ŵ 0174 | Ñ | Ò | Ó | Ô | Õ | Ö | Ṫ 1E6A | Ø | Ù | Ú | Û | Ü | Ý | Ŷ 0176 | ß |
Ex | à | á | â | ã | ä | å | æ | ç | è | é | ê | ë | ì | í | î | ï |
Fx | ŵ 0175 | ñ | ò | ó | ô | õ | ö | ṫ 1E6B | ø | ù | ú | û | ü | ý | ŷ 0177 | ÿ |
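The differences marked in the table above can also be derived programmatically. The following is a minimal sketch, assuming Python's built-in `latin-1` and `iso8859_14` codecs reflect the published mappings. It decodes every byte from 0xA0 to 0xFF with both codecs and keeps the positions where the results differ.

```python
# Every byte value in the upper graphic area, 0xA0 through 0xFF.
high = bytes(range(0xA0, 0x100))

latin1 = high.decode("latin-1")
latin8 = high.decode("iso8859_14")

# Map each differing byte position to its ISO-8859-14 character.
diffs = {
    0xA0 + i: c8
    for i, (c1, c8) in enumerate(zip(latin1, latin8))
    if c1 != c8
}

for pos, ch in diffs.items():
    print(f"0x{pos:02X}  {ch}  U+{ord(ch):04X}")

print(len(diffs), "positions differ from ISO-8859-1")
```

Positions such as 0xAE (®) and 0xB6 (¶) do not appear in the output because they are unchanged from ISO-8859-1.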
Draft layout
The first draft arranged positions 0xA0–BF differently. It did not include the pilcrow sign and instead kept the cent sign at its Latin-1 position. Later, it was ruled that the pilcrow sign was more common, so the published standard keeps the pilcrow sign at its Latin-1 position and drops the cent sign.
Differences from ISO-8859-14 are marked with the Unicode code point after the character.
0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F | |
Ax | NBSP | Ḃ 1E02 | ¢ 00A2 | £ | ḃ 1E03 | Ċ 010A | ċ 010B | § | Ẁ | © | Ẃ | Ṡ 1E60 | Ỳ | SHY | ® | Ÿ |
Bx | Ḋ 1E0A | ḋ 1E0B | Ḟ 1E1E | ḟ 1E1F | Ġ 0120 | ġ 0121 | Ṁ 1E40 | ṁ 1E41 | ẁ | Ṗ 1E56 | ẃ | ṡ 1E61 | ỳ | Ẅ | ẅ | ṗ 1E57 |
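To see which positions moved between the first draft and the published standard, the draft rows above can be compared against a decoder for the final encoding. This is a sketch only: `DRAFT_A0_BF` is a hand transcription of the table above into Unicode code points (not an official mapping file), and the published side assumes Python's built-in `iso8859_14` codec.

```python
# First-draft assignments for 0xA0-0xBF, transcribed from the table above
# as Unicode code points (illustrative transcription, not an official file).
DRAFT_A0_BF = [
    0x00A0, 0x1E02, 0x00A2, 0x00A3, 0x1E03, 0x010A, 0x010B, 0x00A7,
    0x1E80, 0x00A9, 0x1E82, 0x1E60, 0x1EF2, 0x00AD, 0x00AE, 0x0178,
    0x1E0A, 0x1E0B, 0x1E1E, 0x1E1F, 0x0120, 0x0121, 0x1E40, 0x1E41,
    0x1E81, 0x1E56, 0x1E83, 0x1E61, 0x1EF3, 0x1E84, 0x1E85, 0x1E57,
]

# Published assignments for the same byte range, from Python's codec.
final = bytes(range(0xA0, 0xC0)).decode("iso8859_14")

# Map each changed byte position to (draft character, final character).
changed = {
    0xA0 + i: (chr(cp), ch)
    for i, (cp, ch) in enumerate(zip(DRAFT_A0_BF, final))
    if chr(cp) != ch
}

for pos, (draft_ch, final_ch) in changed.items():
    print(
        f"0x{pos:02X}: draft {draft_ch} U+{ord(draft_ch):04X}"
        f" -> final {final_ch} U+{ord(final_ch):04X}"
    )
```

In this comparison, 0xA2 changes from the cent sign to ḃ, and 0xB6 changes from Ṁ to the pilcrow sign, consistent with the description above.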
References
- Character Sets, Internet Assigned Numbers Authority (IANA), 2018-12-12
- "SheetJS/js-codepage". GitHub. 12 October 2021.
- "Cpi/CPIISO/Codepage.TXT at master · FDOS/Cpi". GitHub.
- Everson, Michael. "Proposed ISO 8859-14 (later 15)".
- Everson, Michael. "Proposed ISO 8859-12 (later 14)".
- Everson, Michael (1996-06-19). Proposal for a new part of ISO/IEC 8859: Latin alphabet No. 9 (Sámi).
- Swedish Institute for Standards (1997-01-24). ISO-IR-197: Sami supplementary Latin set (PDF). ITSCJ/IPSJ.
- Everson, Michael (1997-05-05). "ISO/IEC CD 8859-14:1997 — Latin alphabet No. 8 (Celtic)" (Committee Draft).
- British Standards Institution (1994-03-16). ISO-IR-182: Welsh variant of Latin Alphabet No. 1 (right-hand part) (PDF). ITSCJ/IPSJ.
- Kuhn, Markus; Whistler, Ken (1999-07-27). "ISO/IEC 8859-14:1998 to Unicode". 8859 to Unicode mapping tables. Unicode, Inc.
- International Components for Unicode (ICU), iso-8859_14-1998.ucm, 1999-07-27
External links
- ISO/IEC 8859-14:1998
- ISO-IR 199 Celtic Supplementary Latin Set (May 1, 1998, submitted by Irish body NSAI/AGITS/WG6)