User:TomTheHand/Unit tests for AWB regexes/General
This section contains regular expressions that make general fixes, not limited to a particular topic or type of unit.
Replace incorrect or poorly supported characters
Replace non-breaking hyphen with regular hyphen
Description | |
---|---|
Replace non-breaking hyphen with regular hyphen. The non-breaking hyphen is poorly supported in browsers, so it probably shouldn't be used on Wikipedia. | |
Find | |
‑ | |
Replace with | |
- | |
Regular expression? | Case sensitive? |
N | N/A |
Text this regex should modify: | Intended result: |
16‑inch guns |
16-inch guns |
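As a rough way to check this rule outside AWB, here is a minimal Python sketch. It assumes the character in the Find cell is U+2011 NON-BREAKING HYPHEN; the function name `fix_hyphen` is made up for illustration.

```python
# Plain (non-regex) replacement, mirroring "Regular expression? N".
NON_BREAKING_HYPHEN = "\u2011"  # assumed to be the character in the Find cell


def fix_hyphen(text: str) -> str:
    """Replace every non-breaking hyphen with an ordinary hyphen-minus."""
    return text.replace(NON_BREAKING_HYPHEN, "-")
```

Calling `fix_hyphen("16\u2011inch guns")` should give the intended result from the table row above.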
Replace degree-like symbols with proper degree symbol
Description | |
---|---|
Replace Unicode ordinal indicator (º) or ring above (˚) with degree sign (°). Be careful about false positives! Ensure that the alternate symbols aren't actually intended. In most ship articles, they're probably a mistake, but they do have legitimate uses! | |
Find | |
[º˚] | |
Replace with | |
° | |
Regular expression? | Case sensitive? |
Y | N/A |
Text this regex should modify: | Intended result: |
|
|
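The test rows for this rule are empty, so here is a small Python sketch of what they might check. It uses Python's `re` module rather than the .NET engine AWB runs on, and the sample strings and the name `fix_degree` are invented for illustration.

```python
import re

# U+00BA MASCULINE ORDINAL INDICATOR and U+02DA RING ABOVE
DEGREE_LIKE = re.compile("[\u00ba\u02da]")
DEGREE_SIGN = "\u00b0"  # U+00B0 DEGREE SIGN


def fix_degree(text: str) -> str:
    """Replace degree look-alikes with the real degree sign."""
    return DEGREE_LIKE.sub(DEGREE_SIGN, text)
```

The rule is blind to context, which is where the false positives come from: a genuine ordinal such as the Spanish abbreviation "n.º" would also be rewritten.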
Use correct Unicode symbol for micro
Description | |
---|---|
Use the correct Unicode symbol for micro-, and insert non-breaking spaces per MoS. | |
Find | |
\b(\d+)(?:\s|&nbsp;|-)*μ(m|g|s|A|K|mol|cd|Hz|N|Pa|J|W|C|V|F|Ω|S|Wb|T|H|lm|lx|Bq|Gy|Sv|kat|M|l)\b | |
Replace with | |
$1&nbsp;µ$2 | |
Regular expression? | Case sensitive? |
Y | Y |
Text this regex should modify: | Intended result: |
|
|
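The Find and Replace cells use two different characters that look alike: the Greek letter mu (U+03BC) is matched and the micro sign (U+00B5) is written. The sketch below makes that explicit. It is a Python translation, so `$1` becomes `\1`, and it treats the non-breaking space as the literal text `&nbsp;`, which is an assumption about how the pattern is meant to be entered. The name `fix_micro` is hypothetical.

```python
import re

GREEK_MU = "\u03bc"    # what the Find pattern looks for
MICRO_SIGN = "\u00b5"  # what the Replace string writes

UNITS = ("m|g|s|A|K|mol|cd|Hz|N|Pa|J|W|C|V|F|Ω|S|Wb|T|H|"
         "lm|lx|Bq|Gy|Sv|kat|M|l")

# Number, then any run of whitespace, "&nbsp;" or hyphens, then mu + unit.
MICRO_RE = re.compile(
    r"\b(\d+)(?:\s|&nbsp;|-)*" + GREEK_MU + "(" + UNITS + r")\b"
)


def fix_micro(text: str) -> str:
    """Swap Greek mu for the micro sign and normalise the separator."""
    return MICRO_RE.sub(r"\1&nbsp;" + MICRO_SIGN + r"\2", text)
```

Text that already uses the micro sign is left alone, even if its spacing is wrong, because the pattern only matches the Greek letter.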
Format <br /> tags
Description | |
---|---|
Give <br /> tags proper XHTML format. | |
Find | |
</?br\s*/?> | |
Replace with | |
<br /> | |
Regular expression? | Case sensitive? |
Y | N |
Text this regex should modify: | Intended result: |
|
|
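Since the test rows here are blank, the following Python sketch suggests a few inputs worth trying. It uses Python's `re` with `IGNORECASE` to mirror "Case sensitive? N"; the sample strings and the name `fix_br` are illustrative only.

```python
import re

# Optional leading slash, optional whitespace, optional trailing slash.
BR_RE = re.compile(r"</?br\s*/?>", re.IGNORECASE)


def fix_br(text: str) -> str:
    """Normalise every bare <br> variant to the XHTML form."""
    return BR_RE.sub("<br />", text)
```

A tag that carries attributes, such as `<br clear=all>`, does not match and is left untouched.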
Make general SI fixes
Make k for kilo- lower-case
Description | |
---|---|
Make k for kilo- lower-case, and insert non-breaking spaces per MoS. | |
Find | |
\b(\d+)(?:\s|&nbsp;|-)*K(m|g|s|A|K|mol|cd|Hz|N|Pa|J|W|C|V|F|Ω|S|Wb|T|H|lm|lx|Bq|Gy|Sv|kat|M|l)\b | |
Replace with | |
$1&nbsp;k$2 | |
Regular expression? | Case sensitive? |
Y | Y |
Text this regex should modify: | Intended result: |
|
|
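This rule depends on case sensitivity, so a quick check of what it does and does not touch may help. The sketch below is a Python translation (AWB itself uses .NET regexes), writes the non-breaking space as the literal text `&nbsp;` by assumption, and uses the made-up name `fix_kilo`.

```python
import re

UNITS = ("m|g|s|A|K|mol|cd|Hz|N|Pa|J|W|C|V|F|Ω|S|Wb|T|H|"
         "lm|lx|Bq|Gy|Sv|kat|M|l")

# Case-sensitive on purpose: only an upper-case K before a unit matches.
KILO_RE = re.compile(r"\b(\d+)(?:\s|&nbsp;|-)*K(" + UNITS + r")\b")


def fix_kilo(text: str) -> str:
    """Lower-case the kilo- prefix and normalise the separator."""
    return KILO_RE.sub(r"\1&nbsp;k\2", text)
```

A bare temperature such as "300 K" is not changed, because the pattern needs a unit symbol after the K.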
In some cases plain text or HTML may be preferable to Unicode; watch for those situations.
I feel that in many cases vulgar fractions from sources are worth retaining as Unicode symbols rather than converting to decimals. For historical articles, vulgar fractions feel appropriate, and they convey a level of precision that is lost on conversion to a decimal. For example, converting 5⅞ to 5.875 implies precision to the thousandth when you only actually have precision to the eighth. If you convert to 5.9 instead, you lose information and still imply higher precision than the measurement actually provides.
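The arithmetic behind the 5⅞ example can be worked through with Python's standard `fractions` and `unicodedata` modules. This is only an illustration of the point above; the helper name `mixed_to_fraction` is invented.

```python
from fractions import Fraction
import unicodedata


def mixed_to_fraction(whole: int, vulgar: str) -> Fraction:
    """Turn a whole number plus a vulgar-fraction character into a Fraction."""
    # unicodedata.numeric gives a float; limit_denominator recovers the ratio.
    part = Fraction(unicodedata.numeric(vulgar)).limit_denominator(16)
    return whole + part


value = mixed_to_fraction(5, "⅞")          # 47/8
as_decimal = float(value)                   # exact decimal expansion
step_of_source = Fraction(1, 8)             # resolution of the measurement
step_implied = Fraction(1, 1000)            # resolution three decimals suggest
rounding_error = Fraction("5.9") - value    # what rounding to 5.9 throws away
```

Here `rounding_error` comes out as 1/40, so 5.9 is both less accurate than the source figure and written on a finer grid (tenths) than the eighths actually measured.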
Unicodify 1/2
Description | |
---|---|
Replace 1/2 with the Unicode symbol ½ | |
Find | |
\b1/2\b | |
Replace with | |
½ | |
Regular expression? | Case sensitive? |
Y | N/A |
Text this regex should modify: | Intended result: |
1/2 |
½ |
Text this regex should not modify: | |
|
Unicodify 1/3
Description | |
---|---|
Replace 1/3 with the Unicode symbol ⅓ | |
Find | |
\b1/3\b | |
Replace with | |
⅓ | |
Regular expression? | Case sensitive? |
Y | N/A |
Text this regex should modify: | Intended result: |
1/3 |
⅓ |
Text this regex should not modify: | |
|
Unicodify 2/3
Description | |
---|---|
Replace 2/3 with the Unicode symbol ⅔ | |
Find | |
\b2/3\b | |
Replace with | |
⅔ | |
Regular expression? | Case sensitive? |
Y | N/A |
Text this regex should modify: | Intended result: |
2/3 |
⅔ |
Text this regex should not modify: | |
|
Unicodify 1/4
Description | |
---|---|
Replace 1/4 with the Unicode symbol ¼ | |
Find | |
\b1/4\b | |
Replace with | |
¼ | |
Regular expression? | Case sensitive? |
Y | N/A |
Text this regex should modify: | Intended result: |
1/4 |
¼ |
Text this regex should not modify: | |
|
Unicodify 3/4
Description | |
---|---|
Replace 3/4 with the Unicode symbol ¾ | |
Find | |
\b3/4\b | |
Replace with | |
¾ | |
Regular expression? | Case sensitive? |
Y | N/A |
Text this regex should modify: | Intended result: |
3/4 |
¾ |
Text this regex should not modify: | |
|
Unicodify 1/5
Description | |
---|---|
Replace 1/5 with the Unicode symbol ⅕. Browser support for fifths isn't great, so you may not want to use this one. | |
Find | |
\b1/5\b | |
Replace with | |
⅕ | |
Regular expression? | Case sensitive? |
Y | N/A |
Text this regex should modify: | Intended result: |
1/5 |
⅕ |
Text this regex should not modify: | |
|
Unicodify 2/5
Description | |
---|---|
Replace 2/5 with the Unicode symbol ⅖. Browser support for fifths isn't great, so you may not want to use this one. | |
Find | |
\b2/5\b | |
Replace with | |
⅖ | |
Regular expression? | Case sensitive? |
Y | N/A |
Text this regex should modify: | Intended result: |
2/5 |
⅖ |
Text this regex should not modify: | |
|
Unicodify 3/5
Description | |
---|---|
Replace 3/5 with the Unicode symbol ⅗. Browser support for fifths isn't great, so you may not want to use this one. | |
Find | |
\b3/5\b | |
Replace with | |
⅗ | |
Regular expression? | Case sensitive? |
Y | N/A |
Text this regex should modify: | Intended result: |
3/5 |
⅗ |
Text this regex should not modify: | |
|
Unicodify 4/5
Description | |
---|---|
Replace 4/5 with the Unicode symbol ⅘. Browser support for fifths isn't great, so you may not want to use this one. | |
Find | |
\b4/5\b | |
Replace with | |
⅘ | |
Regular expression? | Case sensitive? |
Y | N/A |
Text this regex should modify: | Intended result: |
4/5 |
⅘ |
Text this regex should not modify: | |
|
Unicodify 1/6
Description | |
---|---|
Replace 1/6 with the Unicode symbol ⅙. Browser support for sixths isn't great, so you may not want to use this one. | |
Find | |
\b1/6\b | |
Replace with | |
⅙ | |
Regular expression? | Case sensitive? |
Y | N/A |
Text this regex should modify: | Intended result: |
1/6 |
⅙ |
Text this regex should not modify: | |
|
Unicodify 5/6
Description | |
---|---|
Replace 5/6 with the Unicode symbol ⅚. Browser support for sixths isn't great, so you may not want to use this one. | |
Find | |
\b5/6\b | |
Replace with | |
⅚ | |
Regular expression? | Case sensitive? |
Y | N/A |
Text this regex should modify: | Intended result: |
5/6 |
⅚ |
Text this regex should not modify: | |
|
Unicodify 1/8
Description | |
---|---|
Replace 1/8 with the Unicode symbol ⅛. Support for eighths is better than fifths or sixths, so this one is probably safe to use. | |
Find | |
\b1/8\b | |
Replace with | |
⅛ | |
Regular expression? | Case sensitive? |
Y | N/A |
Text this regex should modify: | Intended result: |
1/8 |
⅛ |
Text this regex should not modify: | |
|
Unicodify 3/8
Description | |
---|---|
Replace 3/8 with the Unicode symbol ⅜. Support for eighths is better than fifths or sixths, so this one is probably safe to use. | |
Find | |
\b3/8\b | |
Replace with | |
⅜ | |
Regular expression? | Case sensitive? |
Y | N/A |
Text this regex should modify: | Intended result: |
3/8 |
⅜ |
Text this regex should not modify: | |
|
Unicodify 5/8
Description | |
---|---|
Replace 5/8 with the Unicode symbol ⅝. Support for eighths is better than fifths or sixths, so this one is probably safe to use. | |
Find | |
\b5/8\b | |
Replace with | |
⅝ | |
Regular expression? | Case sensitive? |
Y | N/A |
Text this regex should modify: | Intended result: |
5/8 |
⅝ |
Text this regex should not modify: | |
|
Unicodify 7/8
Description | |
---|---|
Replace 7/8 with the Unicode symbol ⅞. Support for eighths is better than fifths or sixths, so this one is probably safe to use. | |
Find | |
\b7/8\b | |
Replace with | |
⅞ | |
Regular expression? | Case sensitive? |
Y | N/A |
Text this regex should modify: | Intended result: |
7/8 |
⅞ |
Text this regex should not modify: | |
|
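All of the fraction rules above share one shape, `\b` + fraction + `\b`, so they can be exercised together. The Python sketch below builds each pattern from the fraction it replaces and also probes the empty "should not modify" rows. The function name `unicodify` and the sample inputs are illustrative, and Python's `\b` may differ in edge cases from the .NET engine AWB uses.

```python
import re

FRACTIONS = {
    "1/2": "½", "1/3": "⅓", "2/3": "⅔", "1/4": "¼", "3/4": "¾",
    "1/5": "⅕", "2/5": "⅖", "3/5": "⅗", "4/5": "⅘",
    "1/6": "⅙", "5/6": "⅚",
    "1/8": "⅛", "3/8": "⅜", "5/8": "⅝", "7/8": "⅞",
}


def unicodify(text: str) -> str:
    """Apply every fraction rule in turn, one regex per fraction."""
    for ascii_form, symbol in FRACTIONS.items():
        pattern = r"\b" + re.escape(ascii_form) + r"\b"
        text = re.sub(pattern, symbol, text)
    return text
```

The word boundaries protect "11/2" and "1/25", but a slash counts as a boundary too, so a date such as "1/2/2005" is turned into "½/2005". That looks like a good candidate for a "should not modify" row.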
En dash
Description | |
---|---|
Replace the &ndash; HTML entity with the Unicode symbol –. | |
Find | |
&ndash; | |
Replace with | |
– | |
Regular expression? | Case sensitive? |
N | N |
Text this regex should modify: | Intended result: |
&ndash; |
– |
Em dash
Description | |
---|---|
Replace the &mdash; HTML entity with the Unicode symbol —, and remove spaces from around em dashes. | |
Find | |
[ \t]*(?:— | |
Replace with | |
— | |
Regular expression? | Case sensitive? |
Y | N |
Text this regex should modify: | Intended result: |
|
|
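The Find cell for the em dash rule appears to be cut off, so the sketch below guesses at a complete pattern from the description: optional spaces or tabs, then either the `&mdash;` entity or the Unicode character, then optional spaces or tabs again. That pattern is an assumption, not the original regex. The en dash rule is included as a plain case-insensitive replacement, and both function names are hypothetical.

```python
import re

EN_DASH = "\u2013"
EM_DASH = "\u2014"

# Plain text find, case-insensitive ("Regular expression? N", "Case sensitive? N").
EN_RE = re.compile(re.escape("&ndash;"), re.IGNORECASE)

# ASSUMED reconstruction of the truncated em dash pattern.
EM_RE = re.compile(r"[ \t]*(?:&mdash;|" + EM_DASH + r")[ \t]*", re.IGNORECASE)


def fix_en_dash(text: str) -> str:
    """Replace the &ndash; entity with the Unicode en dash."""
    return EN_RE.sub(EN_DASH, text)


def fix_em_dash(text: str) -> str:
    """Replace &mdash; or a spaced em dash with an unspaced em dash."""
    return EM_RE.sub(EM_DASH, text)
```

Under this guess, both "word &mdash; word" and an em dash with spaces on either side collapse to the two words joined by a bare em dash.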
Superscripts
Please read this section of the Manual of Style on Mathematics before using these regular expressions. If the article you are editing uses higher powers as well, use <sup></sup> tags, because these Unicode symbols will not match superscripts for higher numbers. If the article only contains ² and ³, and will never contain higher powers, using Unicode symbols can be more compact and easier to understand. An article completely unrelated to mathematics which happens to include an area in km² has no need to support higher powers.
<sup>2</sup>
Description | |
---|---|
Replace <sup>2</sup> with the Unicode symbol ². | |
Find | |
<sup>2</sup> | |
Replace with | |
² | |
Regular expression? | Case sensitive? |
N | N |
Text this regex should modify: | Intended result: |
<sup>2</sup> |
² |
<sup>3</sup>
Description | |
---|---|
Replace <sup>3</sup> with the Unicode symbol ³. | |
Find | |
<sup>3</sup> | |
Replace with | |
³ | |
Regular expression? | Case sensitive? |
N | N |
Text this regex should modify: | Intended result: |
<sup>3</sup> |
³ |
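The advice at the top of this section (only use ² and ³ when the article has no higher powers) can be turned into a guard. The Python sketch below is one possible reading of that advice, not part of the AWB rules themselves. It only looks at numeric `<sup>` contents, and the name `unicodify_superscripts` is invented.

```python
import re

SUP_RE = re.compile(r"<sup>(\d+)</sup>", re.IGNORECASE)
UNICODE_SUP = {"2": "²", "3": "³"}


def unicodify_superscripts(text: str) -> str:
    """Convert <sup>2</sup> and <sup>3</sup>, unless other powers appear."""
    powers = set(SUP_RE.findall(text))
    if not powers <= set(UNICODE_SUP):
        # Some other power is present: keep <sup> tags everywhere so the
        # superscripts stay visually consistent.
        return text
    return SUP_RE.sub(lambda m: UNICODE_SUP[m.group(1)], text)
```

With this guard, "5 km<sup>2</sup>" is converted, while a passage that also contains `<sup>4</sup>` is returned unchanged.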
Other Unicode
Use times sign instead of x
Description | |
---|---|
Use the Unicode times symbol instead of the letter x for multiplication, and provide correct spacing. Some WikiProjects prefer the letter x for ease of entry; make sure you don't step on anyone's toes. | |
Find | |
(\d)\s*[x×]\s*(\d) | |
Replace with | |
$1 × $2 | |
Regular expression? | Case sensitive? |
Y | N |
Text this regex should modify: | Intended result: |
|
|
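The test rows for this rule are empty, so here is a Python sketch with a few inputs that seem worth covering. It is a translation to Python's `re` (so `$1` becomes `\1`) with `IGNORECASE` standing in for "Case sensitive? N". The sample strings and the name `fix_times` are made up.

```python
import re

TIMES = "\u00d7"  # U+00D7 MULTIPLICATION SIGN

# A digit, optional whitespace, x or the times sign, optional whitespace, a digit.
TIMES_RE = re.compile(r"(\d)\s*[x" + TIMES + r"]\s*(\d)", re.IGNORECASE)


def fix_times(text: str) -> str:
    """Use the times sign between digits and put one space on each side."""
    return TIMES_RE.sub(r"\1 " + TIMES + r" \2", text)
```

Two things show up in Python that may be worth checking in AWB as well. A hexadecimal literal such as "0x1F" is treated as a multiplication, and in a chain such as "2x3x4" a single pass only fixes the first "x", because the middle digit is consumed by the first match.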