This page attempts to document standards and infrastructure for presentation of Unicode-related information on Wikipedia. It may also serve as a gathering point for work on building the same.
Templates
Unicode-related templates.
- {{SpecialChars}}
- Adds a small message box (floated right) informing the reader that the page uses special characters that might not display properly. Here, "special" basically means anything beyond ASCII and maybe Latin-1. This template should be added to the top of any page that makes extensive use of Unicode.
- {{Unicode}}
- This just wraps the given character(s) in an HTML SPAN block with class "Unicode". CSS can then be applied on a per-browser/platform basis to select appropriate fonts, or maybe even do other fix-ups.
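As a rough illustration of what this wrapping amounts to, the sketch below builds the same kind of markup in Python. The helper name `wrap_unicode` is hypothetical, and the exact HTML the template emits may differ; only the span element and the "Unicode" class come from the description above.

```python
from html import escape


def wrap_unicode(text: str) -> str:
    """Wrap text in a span with class "Unicode" (hypothetical helper)."""
    # Escape HTML-significant characters so the text is safe to embed;
    # non-ASCII characters pass through unchanged.
    return f'<span class="Unicode">{escape(text)}</span>'


print(wrap_unicode("☺"))
```

A stylesheet rule keyed on the "Unicode" class could then pick fonts per browser or platform, as described above.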
Glyph images
Wikipedia and/or Wikimedia Commons host many images of glyphs, that is, characters rendered in a given font. In article text, we generally prefer literal Unicode characters over these rendered images. The images are therefore used primarily in articles about characters, where an illustration is appropriate. In particular, the #Unicode tables provide both the literal character and an image of the character.
Ideally, all such glyph images would be vector graphics, in SVG format. However, many exist in a raster graphics format, such as GIF. These should be converted to, or replaced with, SVGs.
As of this writing, there is no standardized naming of these images. Sometimes an expression of the codepoint is used as the file name, e.g., U+2122.svg. In other cases, the character name is used, e.g., OCR-A char Quotation Mark.svg.
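For illustration only, the codepoint-based naming seen in U+2122.svg could be derived from a character as in the sketch below. The function name `codepoint_filename` and the default extension are assumptions made for this example; no naming convention has been standardized.

```python
def codepoint_filename(char: str, ext: str = "svg") -> str:
    """Return a codepoint-style file name such as "U+2122.svg" (hypothetical helper)."""
    if len(char) != 1:
        raise ValueError("expected exactly one character")
    # At least four uppercase hex digits, matching the usual U+XXXX notation.
    return f"U+{ord(char):04X}.{ext}"


print(codepoint_filename("™"))
```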
Unicode tables
Many articles dealing with Unicode include tables of Unicode characters. The standard form for such tables is as given in the following example.
Example table
Char | Image | Name | Hex | Decimal |
---|---|---|---|---|
☷ | | Trigram for Earth | U+2637 | 9783 |
☸ | | Wheel of Dharma | U+2638 | 9784 |
☹ | | White frowning face (Emoticon) | U+2639 | 9785 |
☺ | | White smiling face (Emoticon) | U+263A | 9786 |
Legend
- A copy of this legend, or something like it, will be linked from or displayed with all Unicode tables, once we figure out exactly how that should be done.
Char | The literal character. If your computer lacks Unicode support, you may see other symbols instead of the proper character. |
---|---|
Image | A sample image of the character, rendered in an example font. |
Name | The official name of the character. Additional information may be given in parentheses. |
Hex | The numeric code point for the character, in hexadecimal (base 16), with "U+" prefix. |
Decimal | The same code point value, expressed in decimal (base 10). |
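The relationship between the Name, Hex, and Decimal columns can be checked with a short Python sketch using the standard `unicodedata` module. The helper name `table_fields` is hypothetical. Note that `unicodedata.name` returns the official name in upper case, whereas the tables above use mixed case.

```python
import unicodedata


def table_fields(char: str) -> tuple[str, str, str, int]:
    """Return (char, official name, hex code point, decimal code point)."""
    code = ord(char)                    # numeric code point
    name = unicodedata.name(char)       # official name, in upper case
    return char, name, f"U+{code:04X}", code


for ch in "☷☸☹☺":
    print(table_fields(ch))
```

Comparing this output against a table row is one way to catch a mismatch between the literal character and its listed code point.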
Design features
The table format has the following design features:
- Sortable
- "Char" column
- "Image" column
  - A sample rendering of the Unicode glyph (see #Glyph images)
  - For systems/browsers that cannot render Unicode (or specific characters), allows the reader to see the intended appearance
  - Provides a consistency check for character, image, and browser. Discrepancies will stand out.
  - When a glyph image isn't available, the table cell is left empty
- "Name" column
  - The official codepoint name, as specified by the Unicode Consortium
  - Either the entire name, or individual words, may be wikilinked to articles
  - When the appropriate article title does not match the word(s) of the official name, piped links should be used to preserve the official name
  - Additional names or references can be provided in parentheses, if needed
  - For illustration, in the above table:
    - Only "Trigram" is wikilinked, because "for Earth" is not part of ba gua (concept)
    - All of "Wheel of Dharma" is wikilinked, because Dharmacakra is synonymous with "Wheel of Dharma"
    - "Emoticon" is a parenthetical, as that is not part of the official Unicode codepoint name
- "Hex" and "Decimal" columns
  - The codepoint number, in both decimal (base 10) and hexadecimal (base 16) formats
  - The "U+" prefix is used for hex, per the Unicode standard
  - Decimal is not prefixed, per WP:MOSNUM
- The plan is to eventually add some kind of standard explanation of the columns to the tables, most likely as an adjacent template, or maybe links from the headers. Ideas welcome!
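The features above could be combined in generated wikitext roughly as follows. This is a sketch, not the markup actually used by any article: the helpers `piped_link` and `table_row` are hypothetical, and the article titles are the ones mentioned in the list above.

```python
def piped_link(article: str, label: str) -> str:
    """Return a wikilink, piped only when the label differs from the article title."""
    return f"[[{article}]]" if article == label else f"[[{article}|{label}]]"


def table_row(char: str, name_markup: str, image: str = "") -> str:
    """Build one wikitext table row: Char, Image, Name, Hex, Decimal."""
    code = ord(char)
    # The image cell is left empty when no glyph image is available.
    image_cell = f"[[File:{image}]]" if image else ""
    cells = [char, image_cell, name_markup, f"U+{code:04X}", str(code)]
    return "|-\n| " + " || ".join(cells)


header = '{| class="wikitable sortable"\n! Char !! Image !! Name !! Hex !! Decimal'
rows = [
    # Only "Trigram" is linked; "for Earth" stays as plain text.
    table_row("☷", piped_link("ba gua (concept)", "Trigram") + " for Earth"),
    # The whole official name is linked through a piped link.
    table_row("☸", piped_link("Dharmacakra", "Wheel of Dharma")),
]
print("\n".join([header, *rows, "|}"]))
```

The piped link keeps the official name visible in the Name column while pointing at the differently titled article.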
Articles with Unicode tables
- Number Forms
- Letterlike Symbols
- Latin characters in Unicode
- Arrow (symbol)
- Unicode Mathematical Operators
- Miscellaneous Technical (Unicode)
- Box drawing characters
- Miscellaneous Symbols
- Unicode Specials
- List of Unicode characters
- Unicode symbols
- OCR-A
- Wikipedia:MOSNUM#Common_mathematical_symbols
- Linear_B#Unicode (note the link to the Unicode standard)