Windows-1254 is a code page used under Microsoft Windows (and on the web) to write Turkish, the language it was designed for. The vast majority of its users use it for Turkish, although it can also be used for some other languages. Characters with code points A0 through FF are compatible with ISO 8859-9, but the range 80 through 9F, which ISO 8859 reserves for C1 control codes, is instead used for additional characters (analogous to the relationship between ISO-8859-1 and Windows-1252). It matches Windows-1252 except that six Icelandic characters (Ðð, Ýý, Þþ) are replaced with characters unique to the Turkish alphabet (Ğğ, İ, ı, Şş), and that 8E and 9E, which hold Ž and ž in Windows-1252, are left unassigned.
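The relationships described above can be checked byte by byte. The following is a minimal sketch, assuming that Python's built-in `cp1254`, `cp1252` and `iso8859_9` codecs faithfully represent the three encodings; the helper names `decode_byte` and `differing_bytes` are hypothetical and exist only for this illustration.

```python
from typing import Dict, Optional, Tuple


def decode_byte(value: int, encoding: str) -> Optional[str]:
    """Decode one byte; return None where the codec leaves the byte undefined."""
    try:
        return bytes([value]).decode(encoding)
    except UnicodeDecodeError:
        return None


def differing_bytes(
    enc_a: str, enc_b: str, start: int = 0x00, stop: int = 0x100
) -> Dict[int, Tuple[Optional[str], Optional[str]]]:
    """Map each byte in [start, stop) that the two codecs decode differently."""
    diffs = {}
    for value in range(start, stop):
        a, b = decode_byte(value, enc_a), decode_byte(value, enc_b)
        if a != b:
            diffs[value] = (a, b)
    return diffs


# Compare Windows-1254 with Windows-1252 over all 256 byte values.
vs_1252 = differing_bytes("cp1254", "cp1252")
# Compare Windows-1254 with ISO 8859-9 over the A0..FF range only.
vs_8859_9 = differing_bytes("cp1254", "iso8859_9", start=0xA0)

for value, (in_1254, in_1252) in sorted(vs_1252.items()):
    print(f"{value:02X}: Windows-1254={in_1254!r} Windows-1252={in_1252!r}")
print("differences from ISO 8859-9 in A0..FF:", len(vs_8859_9))
```

Under these assumptions the comparison with Windows-1252 should report the six Turkish/Icelandic positions (D0, DD, DE, F0, FD, FE) together with 8E and 9E, which Windows-1252 assigns to Ž and ž and Windows-1254 leaves undefined, while the A0 through FF comparison with ISO 8859-9 should report no differences.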
MIME / IANA | windows-1254 |
---|---|
Alias(es) | cp1254 (Code page 1254) |
Language(s) | Turkish |
Created by | Microsoft |
Standard | WHATWG Encoding Standard |
Classification | extended ASCII, Windows-125x |
Extends | ISO 8859-9 (without single-byte C1 controls) |
The WHATWG Encoding Standard, which specifies the character encodings permitted in HTML5 and which compliant browsers must support,[1] includes Windows-1254 and uses it for both the Windows-1254 and ISO-8859-9 labels.[2][3] Unicode is preferred for modern applications; authors of new pages and designers of new protocols are instructed to use UTF-8 instead.[2] As of 2023, less than 0.05% of all web pages use Windows-1254, and less than 0.05% use ISO-8859-9,[4][5] which the WHATWG also requires web browsers to handle as Windows-1254.[2] Among websites located in Turkey, 2.2% declare ISO-8859-9 and 1.3% declare Windows-1254, so in effect 3.5% of websites there use Windows-1254.[6]
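The practical effect of treating the ISO-8859-9 label as Windows-1254 shows up in the 80 through 9F range. The sketch below is illustrative only: `WINDOWS_1254_LABELS` and `decode_with_label` are hypothetical names, the label set is deliberately limited to the labels mentioned in this article (the Encoding Standard defines more), and Python's `cp1254` codec stands in for a browser's decoder.

```python
# Partial label set (assumption): only the labels named in this article.
WINDOWS_1254_LABELS = frozenset({"windows-1254", "iso-8859-9", "cp1254"})


def decode_with_label(data: bytes, label: str) -> str:
    """Decode data the way a label-to-encoding lookup would for these labels."""
    # Labels are matched after trimming ASCII whitespace and ignoring ASCII case.
    normalized = label.strip(" \t\n\f\r").lower()
    if normalized not in WINDOWS_1254_LABELS:
        raise LookupError(f"label not handled by this sketch: {label!r}")
    # Both labels select the same decoder: Windows-1254.
    return data.decode("cp1254")


page = b"T\xfcrk\xe7e fiyat: 5 \x80"

# A page labelled ISO-8859-9 is still decoded as Windows-1254, so 0x80 is the euro sign.
print(decode_with_label(page, "ISO-8859-9"))

# A strict ISO 8859-9 decoder maps 0x80 to a C1 control character instead.
print(repr(page.decode("iso8859_9")))
```

The two outputs differ only in the final character, which is why a browser that honoured the ISO-8859-9 label literally would lose characters such as the euro sign and the curly quotes on mislabelled pages.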
IBM uses code page 1254 (CCSID 1254 and euro sign extended CCSID 5350) for Windows-1254.[7][8][9]
Character set
The following table shows Windows-1254. Each character is shown with its Unicode equivalent.
| | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0x | NUL | SOH | STX | ETX | EOT | ENQ | ACK | BEL | BS | HT | LF | VT | FF | CR | SO | SI |
1x | DLE | DC1 | DC2 | DC3 | DC4 | NAK | SYN | ETB | CAN | EM | SUB | ESC | FS | GS | RS | US |
2x | SP | ! | " | # | $ | % | & | ' | ( | ) | * | + | , | - | . | / |
3x | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | : | ; | < | = | > | ? |
4x | @ | A | B | C | D | E | F | G | H | I | J | K | L | M | N | O |
5x | P | Q | R | S | T | U | V | W | X | Y | Z | [ | \ | ] | ^ | _ |
6x | ` | a | b | c | d | e | f | g | h | i | j | k | l | m | n | o |
7x | p | q | r | s | t | u | v | w | x | y | z | { | \| | } | ~ | DEL |
8x | € | | ‚ | ƒ | „ | … | † | ‡ | ˆ | ‰ | Š | ‹ | Œ | | | |
9x | | ‘ | ’ | “ | ” | • | – | — | ˜ | ™ | š | › | œ | | | Ÿ |
Ax | NBSP | ¡ | ¢ | £ | ¤ | ¥ | ¦ | § | ¨ | © | ª | « | ¬ | SHY | ® | ¯ |
Bx | ° | ± | ² | ³ | ´ | µ | ¶ | · | ¸ | ¹ | º | » | ¼ | ½ | ¾ | ¿ |
Cx | À | Á | Â | Ã | Ä | Å | Æ | Ç | È | É | Ê | Ë | Ì | Í | Î | Ï |
Dx | Ğ | Ñ | Ò | Ó | Ô | Õ | Ö | × | Ø | Ù | Ú | Û | Ü | İ | Ş | ß |
Ex | à | á | â | ã | ä | å | æ | ç | è | é | ê | ë | ì | í | î | ï |
Fx | ğ | ñ | ò | ó | ô | õ | ö | ÷ | ø | ù | ú | û | ü | ı | ş | ÿ |
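The table above can be regenerated programmatically, which is a convenient way to confirm which positions are unassigned. This is a minimal sketch, assuming Python's built-in `cp1254` codec matches the table; `table_row` is a hypothetical helper introduced for the example.

```python
from typing import List, Optional

ENCODING = "cp1254"  # Python's codec name for Windows-1254


def table_row(high_nibble: int) -> List[Optional[str]]:
    """Return the 16 cells of one table row; None marks an undefined byte."""
    row: List[Optional[str]] = []
    for low_nibble in range(16):
        byte = bytes([(high_nibble << 4) | low_nibble])
        try:
            row.append(byte.decode(ENCODING))
        except UnicodeDecodeError:
            row.append(None)
    return row


# Collect every byte value that the codec leaves undefined.
undefined = [
    (high << 4) | low
    for high in range(16)
    for low, cell in enumerate(table_row(high))
    if cell is None
]
print("undefined:", [f"{value:02X}" for value in undefined])

# The letters specific to the Turkish alphabet each fit in a single byte.
turkish = "ĞğİıŞş"
encoded = turkish.encode(ENCODING)
print(encoded.hex(" "))
print(encoded.decode(ENCODING) == turkish)
```

With this codec the undefined positions should all fall in the 8x and 9x rows, matching the empty cells of the table, and the six Turkish letters should round-trip without loss.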
References
1. "8.2.2.3. Character encodings". HTML 5.1 2nd Edition. W3C. "User agents must support the encodings defined in the WHATWG Encoding standard, including, but not limited to […]"
2. van Kesteren, Anne. "Names and labels". Encoding Standard. WHATWG.
3. van Kesteren, Anne. "Legacy single-byte encodings". Encoding Standard. WHATWG.
4. "Historical trends in the usage of character encodings for websites". w3techs.com.
5. "Frequently Asked Questions". w3techs.com.
6. "Distribution of character encodings among websites that use Turkey". w3techs.com. Retrieved 2023-02-23.
7. "Code page 1254 information document". Archived from the original on 2016-03-03.
8. "CCSID 1254 information document". Archived from the original on 2016-03-26.
9. "CCSID 5350 information document". Archived from the original on 2014-11-29.
10. Unicode mapping table for Windows 1254
11. Unicode mappings of windows 1254 with "best fit"
12. Code Page CPGID 01254 (PDF), IBM
13. Code Page CPGID 01254 (txt), IBM
14. International Components for Unicode (ICU), ibm-1254_P100-1995.ucm, 2002-12-03
15. International Components for Unicode (ICU), ibm-5350_P100-1998.ucm, 2002-12-03