Wikipedia talk:Manual of Style/Dashes archive 1


em-dash and en-dash

Question on Style. I received the following comment on a page I edited: ("&#151;" is not a valid HTML entity... it should be "&mdash;" or "&#8212;"). I think "&#151;" is perfectly valid for a "printer's em or em dash". Anyone know why it is not? Also, should not the em be separated by spaces from the rest of the text, since it is NOT an ordinary dash, but a device for redirecting thought within a sentence? Anybody know about this? Marshman 04:47, 2 Aug 2003 (UTC)

Most style manuals prefer no spaces either side of m and n dashes. Where the line length is small, such as in newspapers, it's more common to avoid m dashes altogether, or to use spaces either side. Tony 02:29, 31 July 2005 (UTC)

In ISO 8859-1 and Unicode, code point 151 is reserved as a control character. It is not an em dash except in Microsoft's proprietary code page extensions, and any program that displays an em-dash for "&#151;" is doing so either erroneously or in deliberate emulation of common bugs in Windows. Relying on buggy behavior is not recommended. :) Please use the standard, either &mdash; or &#8212;. --Brion 05:02, 2 Aug 2003 (UTC)
Thanks, I will use &mdash; in the future. Coffee-Cup Software HTML Editor inserts "&#151;" for an em-dash and it certainly displays that way on browsers. Why the confusion? 24.94.86.252 05:36, 2 Aug 2003 (UTC) (me, not logged in) Marshman 05:38, 2 Aug 2003 (UTC)
Please don't! &mdash; looks very ugly in wikisource and some editors may not know what it is. Stick to "--". I know it's ugly, but in future our parser may turn that into mdash automagically. -- Tarquin 12:17, 2 Aug 2003 (UTC)
It hardly looks uglier than L&uuml;beck.
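Brion's point about code point 151 can be checked directly. The following is only an illustrative sketch in Python, using standard-library calls; the variable names are made up for the example and nothing here comes from the MediaWiki code.

```python
import html
import unicodedata

# In Unicode (and ISO 8859-1), code point 151 (0x97) is a C1 control character.
control = chr(151)
category = unicodedata.category(control)  # "Cc" is the "control" category

# The same byte value decoded with Microsoft's Windows-1252 code page
# is an em dash (U+2014).
from_cp1252 = bytes([151]).decode("cp1252")

# Decoded as ISO 8859-1 it stays at code point 151.
from_latin1 = bytes([151]).decode("latin-1")

# The two standard references both resolve to U+2014.
named = html.unescape("&mdash;")
numeric = html.unescape("&#8212;")

print(category, hex(ord(from_cp1252)), hex(ord(from_latin1)))
print(named == numeric == "\u2014")
```

So a browser that shows a dash for the reference to 151 is, in effect, treating the number as a Windows-1252 byte rather than as a Unicode code point.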


"--" gets really ugly when broken between lines, " - " would be a better advice.
-- Ruhrjung 12:42, 2 Aug 2003 (UTC)
— is the only form that renders correctly on the largest possible number of browsers. — is almost as good. No other form renders correctly, except by mistake. Always use — or —. Tannin 14:07, 15 Sep 2003 (UTC)

"pre–World War II" is not a good example of the n dash, because there should be a hyphen here! I'm changing it to a correct example. Tony 02:26, 31 July 2005 (UTC)

Actually, that example is correct. An en dash is used instead of a hyphen when one of the components is hyphenated or contains more than one word. I will add a sentence clarifying this. Babomb 08:43, 23 October 2005 (UTC)

Dashes

(from the village pump)

User:Wik seems to insist on replacing ndashes – with ASCII dashes -. Style guides for printed work such as encyclopedias, as well as Unicode, state that for ranges such as dates an ndash (1998–2000) and not a dash (1998-2000) should be used. One advantage of using the correct dash is that a linebreak won't occur on the right of it. Is there some official policy from the Wikipedia on this, or should I just wait until Wik tires of his game and restore the correct dashes? Jor 01:00, 12 Feb 2004 (UTC)

Well, if you're prepared to insert the "correct dashes" into all the tens of thousands of articles which now have the ASCII dashes, go ahead. --Wik 01:04, Feb 12, 2004 (UTC)
Okay. I will interpret your reply above to mean that you'll start leaving them alone from now on. Jor 01:05, 12 Feb 2004 (UTC)
No, only if you go through all articles and make it consistent. I will always edit the articles to fit the de facto standard. Currently, that's the ASCII dash. --Wik 01:07, Feb 12, 2004 (UTC)
Please use two ASCII hyphens -- in a future version of MediaWiki this will be automatically converted to –. The problem with using one hyphen is that they're very difficult to find and convert once the new feature is implemented. I'd be quite happy with people using – in the meantime. -- Tim Starling 01:11, Feb 12, 2004 (UTC)

Wikipedia articles are about being easy to read and edit. The average non-techie reader has no idea what the sequence of characters "&ndash;" is supposed to mean. It makes the article source ugly and therefore harder to edit. This kind of stuff should be kept at a minimum.—Eloquence 01:34, Feb 12, 2004 (UTC)

I also don't like ndashes as they make editing harder. Dori | Talk 03:14, Feb 12, 2004 (UTC)
An ndash and an mdash are NOT the same thing, and a '-' is not a substitute for an mdash. I agree with Jor. Stop putting in ASCII dashes anywhere. Or else you are going to be real busy for the rest of your days because I use only ndash and mdash and will change any ASCII dashes I encounter to the correct form (something a BOT cannot do). And a -- should become an mdash not an ndash. The look of the "source code" is not an issue. Correct English prevails over making editing "easier": maybe we should just ignore spelling too - Marshman 05:42, 12 Feb 2004 (UTC)
Wikipedia:Manual of Style (biographies) uses regular ASCII dashes in dates (1999 - 2005). I don't see what the problem with them is personally. It makes editing easier and looks fine when rendered to my eyes. The manual of style isn't compulsory, but it's the only guideline that should be applied to wikipedia IMO. If it's under debate then hash it out on the talk page and modify the guidelines if necessary when a consensus has been reached. fabiform | talk 06:57, 12 Feb 2004 (UTC)
Can anybody please explain why this matters at all? A dash is a dash is a minus sign... or not? And minus signs are far easier to use compared with some "&..." character sequence. Furthermore, it is my impression that everything else looks ugly in Mozilla-based browsers. The advantages of "&..." listed above do not look too significant compared with the ease of editing that "-" offers. So, what are the reasons for using the "&..." things? Specifically, why are they considered "correct"? Kosebamse 11:19, 12 Feb 2004 (UTC)
See Dash (punctuation), and in particular, the external link at the bottom, The trouble with EM 'n EN. While you are at it, look over Typography Matters—a short essay on the theme "Typography, at the root, is all about providing as many helpful cues for the reader’s eye as possible." Tannin
It can also be important when "viewing" pages through a different kind of browser, for example having a text-to-speech engine read it aloud. The different kinds of dash/hyphen can be used to cue different pauses or emphasis. HTH HAND --Phil 12:17, Feb 12, 2004 (UTC)
Fair enough. Still, everything except minus signs looks plain ugly on a monitor (at least under Mozilla et al.), i.e. there is not a helpful visual cue but a distraction, i.e. it is counter-productive to use the "n" and "m" things. Is there a solution to the display problem? Kosebamse 12:44, 12 Feb 2004 (UTC)
Get your eyes adjusted, Kosebamse. No, I'm not making a smart crack here---if proper typography looks ugly to you, you have been spending too long reading badly set web pages, or student term papers, or some such. Take a break from the 'pedia and read some real printed-on-paper things (books, National Geographic, anything you like) till your eyes adjust themselves back to normality. As for Mozilla, it is ugly. Always has been. The most stable and practical browser around but ugly as a hatful of ar.... um ... bottoms. If you like pretty, use Opera. Or, if you must, Explorer. Microsoft have always been good at pretty. Tannin 13:14, 12 Feb 2004 (UTC)
PS: I usually use Mozilla for most things, but nearly always Opera here (don't ask why, just habit). Looking at the page as rendered by Mozilla just now, it's fine. Perhaps your problem is the font support in Linux. Linux still has crappy on-screen fonts. Tannin 13:21, 12 Feb 2004 (UTC)
Oh my god, please stop spewing out misinformation. First of all, Mozilla has no problem with en or em dashes, minus signs, quotation marks, or most other relatively common characters in Unicode. If your font has the character, it will be displayed in Mozilla—just like with any other graphical browser. Second of all, this has absolutely nothing to do with Linux. Linux is an operating system kernel that controls your hardware and says which process gets to run when. It does not care in the slightest about em dashes. Finally, with that said, do check out the free, high-quality Bitstream Vera font family. —Daniel Brockman 08:50, Mar 7, 2004 (UTC)
Ah well... I like to see myself as a bibliophile and book maniac and could not agree more that good typesetting is A Good Thing. The "m" dashes are displayed too long, too high and without right or left spaces on Mozilla (under Linux). It would be A Very Good Thing to fix that but on which level? Browser? Style sheet? Font? Kosebamse 13:28, 12 Feb 2004 (UTC)
Font, I suspect. An m-dash should be exactly the same width as the letter "m" (uppercase or lowercase? I can't remember) in whatever font you are using. The "lack" of spaces is not an error. That's the way an m-dash is supposed to be rendered. Some---mostly American---publishing houses have taken to inserting spaces on either side of an m-dash in recent years. I have no idea why. A micro-space is acceptable if desired, but a full space ... well ... what is it they say? Two nations separated by a common language? Tannin
Quite possible it's the fonts. I have played around a little and they all look either like a minus or as described above (much wider than a lowercase "m" and too high). Kosebamse 13:50, 12 Feb 2004 (UTC)

Do not forget we are a wiki!

I must insist that we NOT use HTML entities in raw wiki markup. This is a barrier to editing to all the non-technically-minded people who do not know what &mdash; means when they see it in raw wiki text. Irrespective of what is correct typography, we must work with the tools at our disposal, and we must remember that this is a wiki and clarity in raw source is as important as clarity and accuracy in rendered form. -- Tarquin 16:42, 17 Mar 2004 (UTC)

I am relatively new, and I disagree completely. I know a lot of other people do, too. Can we vote on this or something? - Omegatron 17:12, Mar 17, 2004 (UTC)
Or perhaps a tech fix could solve the disagreement; replacing the 4 used dashes with special codes, for instance, as the horizontal bar has been replaced by "----", and people were talking about replacing en dashes with "--" or whatever, which would be rendered as the correct HTML character. Probably something like "-en-" would be better and easier to grasp. This would be very easy for newbies to grasp, and would still format articles in a readable way. - Omegatron 17:15, Mar 17, 2004 (UTC)
Getting the automatic conversion function back up would solve a lot of this tension. And "clarity in raw source"? Well, that disqualifies every single article with a summary box. Hajor 17:23, 17 Mar 2004 (UTC)
And just to throw in my two cents on the "evolution of typography"; I agree that throwing out "correct typography" is akin to throwing out proper spelling. Sure, we could use "online style" for everything, and "embrace the new internet style", but by that logic, "WELCOMA 2 WIKIEPDIA111!!!!! OMG LOL W3 R BUILDNG A MULTILNGUAL COPYLEFT ENCYCLOPADIA TAHT WIL ALWAYS BLONG 2 3VERYONA!1!111 LOL" is a perfectly valid intro. It's an encyclopedia. It should look good. There is nothing wrong with conforming to "old" standards. We certainly shouldn't force people to use them, but there is nothing wrong with using them, either. - Omegatron 17:43, Mar 17, 2004 (UTC)


Several points, Omegatron: first, yes, articles that begin with lengthy tables are a bad thing in many ways. There have been suggestions to move these to another namespace and insert them in pages. Second, you need to understand how wikis work: their open nature is crucial. Complex HTML terms are a barrier. Thirdly, I never mentioned "evolution of typography". What I SAID is that we have imperfect tools, that were not designed for typography, namely the basic ASCII set used on the internet. This is what we must work with, for now at least. Try to accept that WP is a work in progress! :) -- Tarquin 19:55, 18 Mar 2004 (UTC)
I have much more experience with WP than standard wikis, but if I am not mistaken, typical wikis don't have images, tables, TeX markup, boilerplate text, or normal text links (they have ugly CamelCase). All of these things are very good to have, make it much more encyclopedia-like, respectable, etc (if WP used camelcase I would probably never have come here. An article that looks crappy gives the impression of having crappy info). They should not be removed just because they make the markup a little more difficult to edit. Dashes are pretty trivial, but HTML entities in general should not be expressly prohibited just because they are confusing the first time someone sees them. Possible solutions to the disagreement are
  1. Put a description in the first few pages that new editors see. Perhaps a "[[what are these &number; things?]]" or "what are these special symbols?" at the bottom of an edit page. (near Editing help) (I just checked, and editing help itself has a huge list of possible HTML characters, explaining plain as day what they are for)
  2. Automated conversion - give some or all of the html entities less ugly formats that are automatically converted (-- or -n- or -en- or <endash> or [[[special character: en dash]]] becomes &ndash; for rendering) Obviously you need to make it obvious to newcomers that the code is supposed to be there, without being hard to read, yet without making it so long that no one will type it. - Omegatron
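The "automated conversion" idea in point 2 can be sketched as a simple token substitution. This is only a hypothetical illustration in Python: the tokens `-en-` and `-em-` are taken from the suggestions above, and the function name `expand_tokens` and the mapping are invented for the example; nothing like this is confirmed to exist in the wiki software.

```python
import re

# Hypothetical editor-friendly tokens mapped to the HTML entities they stand for.
TOKENS = {
    "-en-": "&ndash;",
    "-em-": "&mdash;",
}

# One pattern that matches any of the tokens literally.
_TOKEN_PATTERN = re.compile("|".join(re.escape(token) for token in TOKENS))


def expand_tokens(source: str) -> str:
    """Replace each friendly token in the wiki source with its HTML entity."""
    return _TOKEN_PATTERN.sub(lambda match: TOKENS[match.group(0)], source)


print(expand_tokens("1998-en-2000"))  # date range with an en dash entity
print(expand_tokens("well-known"))    # ordinary hyphens are left alone
```

A real implementation would also have to decide what to do when a token collides with ordinary hyphenated text, which is part of why the choice of token matters.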
It sure would be handy if there was a more capable browser based text editor provided to edit Wiki articles, with context driven online help. Or, maybe we could put together a very easy to use guide to markup sequences, in general, from the point of view of a beginner. - Bevo 21:14, 18 Mar 2004 (UTC)

Automated dash conversion and digital representation

Well, the new dash conversion has just gone live. One hyphen - ; two -- ; three --- ; and of course four is the horizontal rule. We've all been muttering about having confusing "&..." symbols in the wiki editing box, but it just occurred to me that this won't happen with the new markup. In the editing box the n-dash (if it was entered that way) will just look like --, just as horizontal rules display as ----. fabiform | talk 12:38, 12 Feb 2004 (UTC)

Yay! Finally we can have correct dashes! No more ugly poor man's dashes! (I'm with the mdash-ndash camp on this one.) --seav 13:07, Feb 12, 2004 (UTC)
Don't hooray too loud, as the same one breaks the new wiki table markup, where |- works, but |-- doesn't work anymore. But Tim already heard the complaint on IRC... andy 13:10, 12 Feb 2004 (UTC)
I briefly switched this on using a live patch of two lines of PHP code, which was a bit silly because it broke a few things. I switched it off when I realised it broke links to titles containing --, of which there are about 140. There was some contention on IRC as to whether -- should be expanded as an en dash or an em dash. -- Tim Starling 23:24, Feb 12, 2004 (UTC)
Two dashes should be an em dash of course. An en dash is represented in ASCII by a single dash, and as such cannot be automatically fixed but must be done manually. Jor 22:01, 13 Feb 2004 (UTC)
Actually, it makes more sense that two hyphens make an en dash, and three hyphens make an em dash, for at least three reasons: (1) It enables usage of the en dash; (2) it uses only one token for the em dash (i.e., "---"), whereas two hyphens would encourage people to put spaces around the dash (thus: " -- "), which complicates parsing and takes away power from the style sheet; finally (3) this (using three for an em and two for an en) is how TeX has done it for decades.
I strongly advise that this long-established convention be adopted. The table syntax invented—what?—some months ago?—can easily be adjusted to make this possible. I doubt that this would cause more confusion than throwing out the logical, intuitive, and well-tested TeX syntax. —Daniel Brockman 08:50, Mar 7, 2004 (UTC)
Has this also got something to do with why the "nowiki" tags are showing at the top of this page: (3) Sign your name and date (by typing "nowiki"–~~ ~~"nowiki"? fabiform | talk 13:17, 12 Feb 2004 (UTC) (not on irc).
Seems to be related - on Template:Villagepump it shows correctly, but once imported here it shows the nowiki's and there is a double - inside the nowikis. andy 13:24, 12 Feb 2004 (UTC)

Doesn't Wikipedia use UTF-8? Can't we just insert the actual mdash and ndash characters? That would make editing much easier. 137.222.10.57 17:03, 12 Feb 2004 (UTC).

No, most English and Western European wikis use ISO 8859-1, for maximum browser compatibility. -- Tim Starling 23:24, Feb 12, 2004 (UTC)
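The encoding constraint Tim mentions can be illustrated as follows. This is a minimal sketch in Python using only built-in codecs; the helper `can_encode` is a made-up name for the example. It shows that the dash characters have no representation in ISO 8859-1, which is why a numeric reference is needed on a wiki stored in that encoding.

```python
EN_DASH = "\u2013"
EM_DASH = "\u2014"


def can_encode(text: str, encoding: str) -> bool:
    """Return True if every character in text exists in the given encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# UTF-8 can store the em dash directly, as three bytes.
utf8_bytes = EM_DASH.encode("utf-8")

# ISO 8859-1 cannot; falling back to a numeric character reference works.
as_reference = EM_DASH.encode("latin-1", errors="xmlcharrefreplace")

print(can_encode(EM_DASH, "latin-1"), can_encode(EM_DASH, "utf-8"))
print(utf8_bytes, as_reference)
```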

OK. Now I'm confused. According to the recent additions to wikipedia:Manual of Style, one should use a single-dash-without-spaces to represent a simple hyphen (as in date ranges), a single-dash-surrounded-by-spaces to represent an ndash, and two dashes (ie, --) for an mdash. However, if I follow the above conversation correctly, it seems that the software will convert a double-dash into an ndash and a triple-dash into an mdash. Am I misunderstanding, are there two incompatible standards being developed, or has something changed? -Rholton (aka Anthropos) 23:51, 16 Feb 2004 (UTC)

OK, you confirmed what I suspected--that I thought it was resolved but there is in fact no clear statement to that effect. I'll try to rouse up a clarification of what's really going on. Elf 01:44, 17 Feb 2004 (UTC)
When the automatic conversion was briefly turned on, a - remained unaffected, -- turned into a dash (an n dash I assume) and --- turned into a longer dash (an m dash I assume). I have nothing to do with the programming though, so you might want to talk to someone else about this, especially if you would prefer it to be done another way.  :) fabiform | talk 07:27, 17 Feb 2004 (UTC)
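The behaviour described here (one hyphen untouched, two becoming an en dash, three an em dash, four staying a horizontal rule) can be sketched in a few lines. This is an assumption-laden illustration in Python, not the actual two-line PHP patch, whose code is not shown in this discussion; the name `convert_dashes` is invented.

```python
import re

EN_DASH = "\u2013"
EM_DASH = "\u2014"

# Match each maximal run of hyphens so that "---" is never seen as "--" plus "-".
_HYPHEN_RUN = re.compile(r"-+")


def convert_dashes(text: str) -> str:
    """TeX-style conversion: '--' to en dash, '---' to em dash, other runs unchanged."""

    def replace_run(match: re.Match) -> str:
        run = match.group(0)
        if len(run) == 2:
            return EN_DASH
        if len(run) == 3:
            return EM_DASH
        return run  # single hyphens and '----' (horizontal rule) are kept

    return _HYPHEN_RUN.sub(replace_run, text)


print(convert_dashes("1998--2000"))
print(convert_dashes("wait---what"))
print(convert_dashes("|--"))  # the table row marker is altered as well
```

The last call shows why a naive pass like this interferes with other markup: a row marker written as `|--`, or a link title containing `--`, is rewritten along with the prose unless the converter is made aware of those contexts.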

Not resolved. A few people involved in the field seem to find the use of the hyphen by ordinary people for all of the purposes offensive. Those same ordinary people have no objection to ignoring the attempt to prevent the language from evolving, and they ignore the variations in dash lengths now that normal people can easily write and publish without following the conventions which used to be used in the print world. Effectively, a small group is trying to enforce an undesired style rule on everyone else, when usage clearly indicates that the majority of contributors do not agree. Simply, the online style for almost everyone is to use - for everything. Since we're a wiki, we do have to accept that change in style expectations, because it's not practical for a few people to force everyone to do what they want. Just document the way most people do it - the simple hyphen - and document that it's accepted that those who are writing new text and object can do it the print way if they desire but are discouraged from changing the writing of others. Jamesday 02:46, 17 Feb 2004 (UTC)

You are contradicting yourself. You start out by arguing that the pro-dash people are trying to “prevent the language from evolving,” and then go on to say that we should definitely use hyphens everywhere because “it’s the online style.” This begs the question: aren’t you preventing the online style from evolving/maturing?
In my not-so-humble opinion, it would be an insult to humanity—and, specifically, the people who work with on-screen typography research, Unicode, the W3C, the Mozilla Project, and others who put in effort to enable the use of good typography on the World Wide Web—to throw out the typographical lore accumulated over the course of centuries just because some PHP script can’t do this or that, or because some people allegedly can read all text equally well no matter how badly formatted. As someone else noted, you’re not proposing that we abandon other seemingly “unnecessary” and “troublesome” English punctuation rules, such as italicising emphasized words or having a whole bunch of different punctuation marks—e.g., comma, semicolon, period, colon, dash, parentheses—that all basically mean “short pause”—or are you?
Finally, yes, this is a wiki. This means that if you don’t know or care about the difference between the variously sized line segments sprinkled about the text, that’s okay, because I—and probably a hundred other people who are willing to edit your text—do. I completely fail to see the logic in arguing that the collective competence of thousands of editors of a wiki could somehow be less than that of, say, one or a few of a paper. No doubt, an article is read a lot more often than it is edited, so it makes sense to spare all the future readers some eye-strain, irritation, and confusion, at the editor’s expense of a small one-time (or more likely few-times) typing cost.
Daniel Brockman 08:50, Mar 7, 2004 (UTC)
Daniel, please don't use HTML entities, it makes your text extremely hard to read in raw form. -- Tarquin 10:08, 7 Mar 2004 (UTC)
So who's going to want to read it in raw form when Wikipedia so nicely formats it for us? --Phil 10:10, Mar 9, 2004 (UTC)
Your point seems to be based on the mistaken belief that the use of multiple types of dash is moving forward. It's not, it's moving back to conventions based on the limitations of newsprint and paper reproduction. The web needs to support the old print conventions, to support republication of old documents and to support those who want to apply print conventions online, because that's how they have been trained and what they are used to. We don't need to use things just because they can be used. Sparing readers and editors eyestrain, irritation and confusion is why I'm going with the usual practice here, which is the hyphen for everything. I've no problems at all with not using rules learned over centuries when rules learned over the last twenty years show them to be inappropriate. One good example of such abandonment of inappropriate conventions is the usual choice of sans-serif rather than serif fonts online. Since it's unlikely that those used to print will convert to other conventions - they are more likely to believe that they must be right - the leave it alone approach seems best here, so those used to print don't feel that their own writing is wrong. Jamesday 03:05, 9 Mar 2004 (UTC)
So you're saying the em and en dash are obsolete? Proper typography causes eyestrain? You're joking, right? Wikisux 00:20, 2 Jun 2004 (UTC)
The choice of sans-serif fonts rather than serif fonts has everything to do with the relatively low resolution of display monitors versus the printed page. Serifs simply don't render well at typical sizes on a 100dpi monitor. I think it is safe to assume when monitor resolution comes more into line with printed resolution, say around 300dpi, we will see a resurgence of the use of serif fonts which are far more readable than sans serif. I cite personal experience before I learned proper typography and references such as "The PC is Not a Typewriter" as evidence. Jason Michael Smithson 12:49, 2004 Sep 13 (UTC)

Disputed paragraph

Evolving language and the decreased reliance on print world conventions have led to the hyphen becoming an acceptable replacement for other dashes. Where hyphens have been used in place of other dashes, you are discouraged from changing these, in the same way that changing spelling forms is discouraged. (See #Usage and spelling).

The statement that hyphens are now acceptable substitutes for other types of dashes had been added to the main page. In an attempt to avoid an edit war, I added a notation that this is disputed rather than removing it again. I just do not believe that this is the case. I don't know of any style guides or professional online publishers that have said that punctuation rules have changed. And I don't buy that, just because lots of people do it, that makes it correct. (By that rule, "its" and "it's" would be interchangeable, for example.) And I certainly object to having the statement put into the style guide that we're not supposed to correct somebody else's punctuation when we come across it. I could live with the inclusion of the observation that some people feel that hyphens are acceptable to use for other dashes. And I expect that in Wikipedia, people will type what they're comfortable with. And then other people will come through and clean it up. I get the impression from all of the various preceding discussions on dashes and hyphens that "hyphens-are-legitimate-for-anything" is a minority opinion, and "hyphens-once-typed-by-one-person-are-immutable" is very much a minority opinion. Elf 05:48, 8 Mar 2004 (UTC)

I couldn't agree more. I think that we should aim for Wikipedia articles to look clean and professional. This goes hand-in-hand with NPOV. There is plenty of Internet left for those who wish to experiment with new, "online" punctuation rules. —Daniel Brockman 21:21, Mar 8, 2004 (UTC)
I've commented this out for now and just noted that the hyphen is commonly used in place of other dashes. I disagree that it should be "corrected", and I believe the safest option is to go with the same policy we have for spelling to prevent edit wars over this. I personally regard pages containing text such as "&ldquo;&rdquo;" to be highly unreadable. Angela. 02:33, Mar 9, 2004 (UTC)
If you think it's a minority opinion, I suggest that you try the exercise of listing those who have on this talk page expressed opposition to or support for the use of different dash types. My quick count placed those opposing it in the majority, with those used to the print world being at least a high proportion of those who favor it. Jamesday 03:10, 9 Mar 2004 (UTC)
Lots of people will find the use of hyphens instead of dashes annoying, as you say, because they are used to the print world. After all, there's a pretty good amount of people who read books. On the other hand, I suspect that very few, if not no one, will find the use of proper dashes annoying to read. Granted, some might find it annoying to type, but that's been shown to be a non-problem, as other people will later come to clean it up.
As for being annoying to read in the edit box, well, let's face it: the source is and will always be harder to read than the rendered page. I think that in this case, we're talking about a rather minor decrease in source readability (em dashes are relatively rare) in exchange for a rather major enhancement in the appearance of the rendered page:
  • Hey, what's&mdash;what? ⇒ Hey, what's—what? vs.
  • Hey, what's - what? ⇒ Hey, what's - what?
Compare, for example, to Angela's example:
  • He said, "what's up?" ⇒ He said, "what's up?" vs.
  • He said, &ldquo;what&rsquo;s up?&rdquo; ⇒ He said, “what’s up?”.
Whether the apostrophes are a little bent makes a minor difference, since people will know they're apostrophes anyway. The length of dashes, on the other hand, makes a major difference, since a long dash carries a completely different meaning than a hyphen.
It might be possible to render " -- " or "---" as an em dash automatically, just as two apostrophes are rendered as emphasis markup. I'm all for this, since it would give us the very best of both worlds. (The new table syntax is very low-priority compared to this, IMHO.) In fact, I don't see how anyone could object to it. So what's the status on this? :-) Is it in the process of being implemented? —Daniel Brockman 09:38, Mar 9, 2004 (UTC)
In parallel with the automated conversion (which I think is a good idea: I reckon "--" should go to N-dash and "---" to M-dash, just like ''' makes things bold) maybe the toolbar could be extended to add in various types of dash. If people get used to clicking a button then you just need to alter the code behind the button. --Phil 10:14, Mar 9, 2004 (UTC)
The problem with having three dashes (---) go to emdash instead of two dashes (--) is that the latter has been taught as an emdash in typing classes probably since the advent of typewriters, and publishers who buy manuscripts from writers have used this as the standard for emdashes (and still do). I realize that TeX, which is a markup language, uses --- as the markup for an emdash, but that's a markup language, it's not the standard for typing. I'm going to reinsert the text that says use -- if you don't want to (or can't) use the others, because this is correct.
Wiki markup is just as much of a markup language as TeX is. Yes, representing em dashes by -- is an existing convention. But this was the case a few decades ago, too, when Donald Knuth decided to go against the convention, supposedly because it was too ambiguous. Today, both -- and --- are in widespread use. The former is still more common than the latter, but people who have used TeX are likely to stick to three hyphens, at least for text that is going to be parsed by a machine—as in the case of wiki markup—because they are aware of the ambiguities that would arise were both kinds of dashes represented by --. I feel confident that Knuth made the right choice, and I believe we are now facing the exact same choice. —Daniel Brockman 22:41, Mar 9, 2004 (UTC)
Furthermore, I don't really care whether people use dashes or the ampersand formats or whatever. What I do object to are (1) having info in the style guide that gives misinformation (hyphens just simply not the same as em dashes and just because people use single hyphens in their place doesn't make it correct; read some style guides) and (2) having info in the style guide that prohibits me from editing what other people have done. We made a real effort to specify what the different punctuations mean, to show the markup to use if you want to use it, and an alternative using regular dashes if that's what you want to use. I think that "-" no spaces for hyphen, " - " single with spaces for en dash, "--" double with or without spaces for em dash (I understand UK publishers sometimes prefer spaces) pretty much covers those options. Elf 20:05, 9 Mar 2004 (UTC)
I appreciate your careful efforts here Elf, but the bottom line is that using -- to represent an em dash is doing just that: it represents an em dash, it does not pretend to be a replacement for it in anything bar a typed manuscript on its way to a publisher to be set properly. Bad typography is every bit as sloppy and unprofessional as bad spelling. I don't expect every Wiki contributor to get his or her punctuation right first time, but the Manual of Style certainly should not endorse bad punctuation as the standard. If you want to abandon correct language, please first demonstrate a consensus to do so. Tannin 20:41, 9 Mar 2004 (UTC)
I expected to be stressed about this but instead you've made my day. I really am laughing--because I'm usually known as the harridan of correct typography and punctuation. It's so weird to be accused of being the opposite! Anyway, thanks for responding and I feel much better now. Maybe I'm repeating myself: I don't see that there is a disagreement that -- is never legit for em dashes, only whether to use markup of --- for emdashes (which see above) or whether single hyphens are acceptable substitutes, or whether markup should be required (which I think is your view) or shouldn't ever be used. I'm trying to be realistic. Double dashes have been taught in typing classes for so very long (I don't instinctively type &mdash; when I'm in the throes of writing text, I type --) that it can't be eradicated. And people who don't want to muck with markup are going to type something in place of em dashes, and it just seems to make more sense to identify the existing punctuation convention than to annoy people who hate typing markup. That's how we got into this whole discussion, because not everyone likes typing markup. Elf 20:57, 9 Mar 2004 (UTC)


This is perfectly fine with me. What I object to is recommending that people use a single hyphen in place of an em dash, and forbidding other people from correcting it (yes, I view this as changing incorrect language to correct language). I do not expect people to use the correct punctuation all the time, just as I do not expect people to write perfect prose all the time. What I do expect is (1) the right to improve on other people's subperfect language, and (2) that someone will eventually come along and improve on my subperfect language. —Daniel Brockman 22:41, Mar 9, 2004 (UTC)

Current status

OK, we've tried to add an NPOV description of the various dashes that does the following:
  • Briefly describes the standard usage for the various dash types. (For detailed descriptions, there's a reference to Dash (punctuation).)
  • Shows the special markup that's valid for each type of dash and also identifies how to represent each using the hyphen key on the keyboard. (Note that I don't have final info on whether someone implemented an automated tool that changes groups of hyphens to something else.)
  • Gives a nod to the fact that there might be technical issues involved in using the special markup.

Elf 17:17, 20 Feb 2004 (UTC)

I agree with the above. But I must strongly oppose the use of (--) for en dashes or (---) for em dashes. -- for en dash is absolutely wrong as it is far too long, and -- for em dash is wrong because in all computer fonts I know of they do not connect, and thus instead of providing the "long dash" which it should (as the practice does on old-style typewriters) it gives an ugly series of characters. — Jor (Darkelf) 22:18, 16 Mar 2004 (UTC)

What happened to the TeX-style autoexpansion of -- to – and --- to —? This would be very helpful... —Tkinias 21:23, 5 Apr 2004 (UTC)

Whatever happened to this? I thought it was a good idea. It just doesn't work with table markup? Sounds easy to fix. - Omegatron 19:09, Jan 21, 2005 (UTC)
I completely agree. I'd love the TeX behavior. If articles consistently used en-dashes in the usual way and un-spaced em-dashes as a grammatical dash, then a "dash style" could turn that semantic dash into whatever format people like, just like the date style. (Personally, I like the look of this – as a dash over—that, but semantic markup would allow us to all get along, even if we are using a cell phone and want - that as a dash.) — BenFrantzDale 06:30, 28 Jan 2005 (UTC)
What's the status of this? Can someone explain the problem with dashes as they interact with tables? I'd really like to see a solution so that the wiki source isn't full of HTML, but so the typography looks good. —BenFrantzDale 04:18, May 13, 2005 (UTC)
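
For readers wondering what the TeX-style expansion might look like, here is a minimal sketch in Python. It is an illustration only, not the code that was tried in the wiki software. The function name `expand_dashes` and the list of skipped line prefixes are assumptions made for this example: the idea is simply to leave alone any line that begins with table markup or a horizontal rule, which is one plausible way to avoid the table breakage mentioned above. It does not handle HTML comments, nowiki sections or URLs.

```python
import re

EN_DASH = "\u2013"
EM_DASH = "\u2014"

# Assumption for this sketch: lines starting with these prefixes are table
# markup or horizontal rules and are passed through unchanged.
MARKUP_PREFIXES = ("{|", "|}", "|-", "|+", "----")

# A run of exactly two or exactly three hyphens, not part of a longer run.
_HYPHEN_RUN = re.compile(r"(?<!-)-{2,3}(?!-)")


def expand_dashes(wikitext: str) -> str:
    """Turn '--' into an en dash and '---' into an em dash, line by line."""
    converted = []
    for line in wikitext.split("\n"):
        if line.lstrip().startswith(MARKUP_PREFIXES):
            # Table row separators such as '|-' and rules such as '----' survive.
            converted.append(line)
            continue
        converted.append(
            _HYPHEN_RUN.sub(
                lambda m: EM_DASH if len(m.group()) == 3 else EN_DASH, line
            )
        )
    return "\n".join(converted)


print(expand_dashes("pages 10--25 were cut---entirely"))
```

Runs of four or more hyphens are never touched, so a horizontal rule typed in the middle of a line is also left as it is.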

Minus signs vs figure dashes

Wow. This is an insane conversation. Anyway, I don't mean to start another feud, but I see that in both the Dash (punctuation) article and the Manual of Style article (that goes with this page), figure dashes are equated with minus signs. The HTML entities for each are different, however, as I pointed out in the dash talk page.

Hyphen:
+-=-=====
-+-=-----

Minus sign:
+−=−=====
−+−=−−−−−

Figure Dash:
+‒=‒=====
‒+‒=‒‒‒‒‒

TeX:
 

Hyphen:
1 + 2 - 3 =

Minus sign:
1 + 2 − 3 =

Figure dash:
1 + 2 ‒ 3 =


 
1+2-3=
1+2−3=
1+2‒3=

It looks as if the TeX markup HTML rendition uses a plain old hyphen for a minus sign. I think the &minus; works better in equations, since that is what it is designed for; to be the same width and height as the plus and equal signs. Hyphens and figure dashes look obviously bad in comparison. The figure dash doesn't even have space around it in my font. (8‒8‒0‒0)

So:

  1. Should these be separated in the two articles?
  2. I have been using the &minus; in my math articles. Is that ok?
  3. Should the TeX renderer use the minus sign too?

- Omegatron 21:56, Mar 16, 2004 (UTC)
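
As a quick check that the three characters compared above really are distinct, the following Python sketch prints the code point and Unicode name of each. The helper name `describe` and the `candidates` table are made up for this illustration; the figure dash is written as a numeric reference here because no named entity for it is assumed.

```python
import html
import unicodedata


def describe(ch: str) -> str:
    """Return the code point and official Unicode name of one character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


# The keyboard hyphen, the character behind &minus;, and the figure dash.
candidates = {
    "hyphen": "-",
    "minus sign": html.unescape("&minus;"),
    "figure dash": html.unescape("&#8210;"),
}

for label, ch in candidates.items():
    print(f"{label}: {describe(ch)}")
```

The keyboard character is named HYPHEN-MINUS in Unicode, which fits the observation that it gets pressed into service for both roles.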

Figure dashes are not minus signs, that is a mistake. To answer your questions:
  1. Yes. I'm doing so now.
  2. Using − is very good when it is used in mathematical operations.
  3. Probably. This is likely a font or encoding issue: the hyphen-minus is overused mainly because it is the only dash-like character known to exist on all platforms, as it is in ASCII. If possible, TeX should of course use the real character.
— Jor (Darkelf) 22:10, 16 Mar 2004 (UTC)
2. Mostly, I was asking if it should be used, since it might not be supported by enough modern browsers. - Omegatron
It should be compatible with all modern browsers. I can't check archaic browsers like Netscape 4, but any more recent browser should work. I know from personal experience that the following support &minus;:
  • Opera (3.5 and up)
  • Mozilla (any version), Netscape 6/7 and up, and other Geckos
  • Internet Explorer 4 and up
  • The Lynx and Links terminal browsers (ASCII approximations)
— Jor (Darkelf) 15:20, 17 Mar 2004 (UTC)

emdashes

Em dashes — dashes the width of an "m" character — are used in typesetting and on "upscale" web pages to set off text, in a use similar to parentheses. Where em dashes are unavailable, two "regular" dashes are often used: --

While em dashes look much nicer, on certain displays and devices -- such as my handheld -- they aren't in the character set at all, and are displayed using a placeholder character, often a question mark or unfilled square (a "box").

Should we use nice-looking em dashes, knowing they won't display properly on certain devices, or fall back on the less elegant but more portable double dash?

Thanks. orthogonal 04:22, 16 May 2004 (UTC)

Is it that time already? Yeah, ok, we've had the mirror/fork question and the case sensitivity question, I guess we must be back around to dashes. May I suggest we keep it civil this time? More like the time before last than last time. -- Tim Starling 04:28, May 16, 2004 (UTC)


Tim Starling: while it's certainly fine to note that this discussion has come up here before -- and I understand the annoyance of an old-timer seeing questions arise again and again -- I do think it would be more helpful for you to let us know what the previous consensus was.
Sorry. There was a short discussion last August, here. Then the civil discussion I referred to last December: Wikipedia talk:Special characters#Unicode. Then there was the discussion on wikitech-l in January: [1]. And finally there was the February discussion here. The current situation as far as I'm concerned is that either TeX-style conversion or -- to &mdash; conversion will be implemented as soon as someone works out how to do it without breaking various things such as the table syntax. -- Tim Starling 04:59, May 16, 2004 (UTC)
No problem, and thanks Tim, that's good to know. (However, this doesn't answer Orthogonal's original question -- what should be done about the incompatibility of em dashes with certain devices? Personally, I think it's the device's own damn fault.) Adam Conover 16:47, May 16, 2004 (UTC)
In principle we could change the rendering according to user agent, although that may make caching difficult. Obviously we have to think of everyone, not just those trying to view Wikipedia from a mobile phone. -- Tim Starling 01:23, May 17, 2004 (UTC)
Personally, though I use an emdash in my sig, I generally use double-dashes when writing, simply because that's what I'm used to doing -- most online communities don't support anything else, and '--' is the quickest way to write a dash in Word -- and because they're much easier to type. I would guess that double-dashes will continue to be by far the most prevalent regardless of what we decide. Adam Conover 04:33, May 16, 2004 (UTC)

Yeah, I've been against them for some time now as they make editing difficult. Use "--" instead. RickK 04:43, 16 May 2004 (UTC)

Changing '-' to "—"

I don't see why they should all be changed to "—". At large font sizes an "—" is *huge* and looks very ugly. A '-' works just as well and doesn't stand out as much.

Darrien 17:19, 2004 May 23 (UTC)

This perceived problem is a perceived font problem. Argue to have the Wikipedia font changed if you feel its m-dashes are too long. In the meanwhile, if a dash (—) is meant, a dash should be used; if a hyphen (-) is meant, a hyphen should be used. Read any book from a major publisher — you won't find them using hyphens for dashes, either single, as you like them (-), or double (--), as most Wikipedians seem to. Bad punctuation is ugly to educated people. Chameleon 17:35, 23 May 2004 (UTC)

It is not a font problem at all. I'm surprised an "educated" person would expect a hyphen to have the same boldness as a dash. Perhaps you should concentrate on changing all the 'U's and 'J's to 'V's and 'I's.
Darrien 17:43, 2004 May 23 (UTC)
I think it'd be better, more wikilike, to have '--' render as '—'. (I agree that a distinction between hyphens and em-dashes should be made.) If there's no burning need to use HTML entities for something, they shouldn't be used. I vote for '--'. (Then again, we could go the TeX route and use '-' for the hyphen, '--' for the en-dash (used for ranges, like 10--25), and '---' for the em-dash (used for separation---like this---of text). Higher-quality typesetting, fits well into the syntax ('----' is a horizontal rule, which is kinda sorta like the god-king of all dashes), and no more ugly HTML entities in the wikitext. Sounds like a win to me.) grendelkhan|(blather) 18:45, 2004 May 23 (UTC)
Yes, I agree that there should be automatic conversion. That way, people could just type "--" or "---" and correct dashes would appear. That would be great. However, until this is implemented, we will have to use HTML, in the same way that we would have to use <i> </i> if we didn't have '' '' — Chameleon 19:12, 23 May 2004 (UTC)
Automated conversion is a good idea. But & m d a s h ; (rendered as —) isn't supported in all browsers, as far as I know. And "--" seems to me a Usenet style for m-dashes, not for n-dashes. (A related problem is, btw, the difference between "some text ? some extra text ? some text" and "some text?some extra text?some more text" in different languages.) -- till we | Talk 18:54, 23 May 2004 (UTC)
& mdash; displays correctly in all modern browsers these days. Let's work for forward, not backward, compatibility. Also bear in mind that "-", "--" or "---" for dashes do not display correctly (—) in any browser. — Chameleon 19:12, 23 May 2004 (UTC)
We did have automatic conversion up and running a couple of months ago: really cool, except that it broke the wikitable formatting commands and was deactivated in a matter of minutes. It'd be nice to have it back, but I'm afraid I'm not a programmer.
Re the em-dashes and how they can look too long (a lot longer than a capital M here in Tahoma): the alternative is to use en-dashes (spaced), which is sanctioned by a fair number of schools of typography and punctuationalists. Hajor 01:14, 24 May 2004 (UTC)
Ah, that's a pity. Surely it can be programmed not to break wikitables. I hope the programmers are reading this. I've just been looking at a few fonts. It seems a lot have over-long m-dashes. Verdana, Arial Narrow, Book Antiqua, Comic Sans MS, Lucida Sans, High Tower Text, Trebuchet MS and others are okay though. Having said that, although dashes should not theoretically be any longer than an em, the fact that professional font designers make them a little longer shows that they agree with me in saying that it is more elegant for them to be too long than too short.
In any case, I think it's clear that those who believe it's okay to put just a hyphen are in a typographically-challenged minority. Darrien, please put the pages back how you found them, and stop turning my dashes into hyphens. — Chameleon 09:44, 24 May 2004 (UTC)
I like the "blank en-dash blank" proposal (written " -- " or " – ", possibly automatically transformed). Could we make a Wikipedia policy that this form should be used? – till we | Talk 16:16, 24 May 2004 (UTC)
I prefer hyphens to any form of dash, but I'd be far less opposed to the spaced endashes idea than to emdashes. Angela. 17:09, May 24, 2004 (UTC)
I would accept spaced endashes. That would seem to be the only compromise that can be reached. emdashes look horrible at large font sizes. endashes actually look slightly better than hyphens at large font sizes.
Darrien 17:32, 2004 May 24 (UTC)
Spaced en-dashes are a semi-acceptable approximation to the true em-dash. The most important thing is that dashes must never be turned into hyphens, whether single or double. Note that Darrien is still doing this.
I know points should be argued, not proved by action... which is the only thing stopping me adding lol to every page and defending it on the basis that it is the internet standard for funny things. — Chameleon 03:41, 25 May 2004 (UTC)
Does this mean that you will accept the hyphens being changed to endashes?
Darrien 04:14, 2004 May 25 (UTC)
I'd wholeheartedly support making spaced endashes policy. How would we go about that? Hajor 04:18, 25 May 2004 (UTC)
Add it to the page. See if you get reverted. If you don't, it's policy. ;) Angela. 11:38, May 26, 2004 (UTC)
Hang on — this en-for-em hack only works if en-dashes are significantly longer than hyphens in this font, and they are not. We need to keep using correct em-dashes, and perhaps consider changing the font. This is what I see around Wikipedia:
  • some text-more text
  • some text - more text
  • some text -more text
  • some text- more text
  • some text--more text
  • some text -- more text
  • some text-- more text
  • some text --more text
I'll carry on correcting all of these to:
  • some text — more text
...though I'll leave en-dashes if I find them. Chameleon 09:12, 26 May 2004 (UTC)
Please stop "correcting" them. There is clearly no consensus that these are correct, and certainly no agreement that they should be used. Anyway, aren't spaced emdashes always wrong? Angela. 11:38, May 26, 2004 (UTC)
The no-spaces thing is the old rule. Dashes have at the very least a hair-space these days, and it's a good thing too (because they are not hyphens).
Should I stop "correcting" teh for the before I get specific consensus for it too? No. I should just carry on proofreading, whilst keeping an eye on any new policies that arise.
If I see old-fashioned spaceless em-dashes (1), or new-fangled spaced en-dashes (2), I'll leave them; but various hyphen hacks will be changed to (3).
  1. some text—more text
  2. some text – more text
  3. some text — more text
Chameleon 15:52, 26 May 2004 (UTC)
Why change them to emdashes when there is far more agreement on spaced endashes? The compromise suggested above would be far more sensible than purposefully inserting characters which many users have stated look awful on their browsers. Angela. 22:56, May 26, 2004 (UTC)
Please, no. Using hyphens can be excused as ignorance, but purposefully using en dashes when you know an em dash is called for is a sin against typography. —Steven G. Johnson 04:02, Jun 17, 2004 (UTC)
I wouldn't say there's more agreement on spaced endashes. Spaced endashes are more modern (and recommended by Elements of Typographic Style) but entering them in wikitext loses their meaning. The problem is that there are three symbols (-, –, and —) and they convey several meanings (hyphen, en-length-hyphen, range delimiter (read "to"), dash, etc.). If there were a semantic dash entity, the rendering could be a style preference as it is for dates. Then an unspaced semantic dash could be The Right Way and you could have it render like—that — that – that -- that - that or whatever. Personally, I use un-spaced em-dashes because they would be the safest to search and replace with a semantic dash (I can't think of a case where a character followed immediately by an em-dash followed immediately by a character would not be a semantic dash). —BenFrantzDale 14:06, 4 October 2005 (UTC)
Unspaced emdashes are—pretty much—an article of faith among those Wikipedians who follow the Chicago Manual of Style. I suspect that on seeing your spaced ems, they'll feel a strong temptation to join 'em up. Faced with a spaced endash, the temptation would be less strong and the effort required greater. Chameleon, think about that one — which do you prefer, unspaced ems or spaced ens? (And while you're at it, take good note of Angela's comments there about the near-consensus we were approaching with spaced endashes: much closer than anything I've ever seen in the four or five times this issue has come up since I've been here.) Hajor 23:22, 26 May 2004 (UTC)
The consensus I see 'on the ground' is that those Wikipedians who know how to use these HTML entities use spaced em-dashes.
As for which I prefer, the main thing I want is hyphens not to be used. Chameleon 00:23, 27 May 2004 (UTC)

The question of em dashes vs. spaced en dashes is a matter of style. The latter has a couple of technical disadvantages, though. We should use the traditional em dash. Michael Z. 2005-10-4 15:01 Z

  • A spaced dash is three characters instead of one. I'm not worried about database size, but it means text entry would not be as simple.
  • A spaced dash must be preceded by a non-breaking space, or it may show up at the beginning of a line, which looks almost as ugly as using hyphens.
    • Unicode non-breaking spaces get destroyed in normal editing, because some browsers silently convert them to spaces.
    • Entering the entity &nbsp; or &#160; makes entry too complicated and wikitext too ugly.
  • In some fonts the hyphen looks like an en dash (e.g. Lucida Grande, found on every Mac, and an excellent international font for a Wikipedia user's style sheet). This is not a serious visual problem, but it can make it impossible for an editor to tell which dash has been used.
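
To make the non-breaking space point concrete, here is a small Python sketch of how the space in front of a spaced dash could be swapped for a no-break space automatically rather than typed by hand. The function name `bind_spaced_dashes` is hypothetical, and the sketch assumes the dashes are already present as real en or em dash characters with a plain space on each side.

```python
import re

NBSP = "\u00a0"  # no-break space, the character behind &nbsp; and &#160;


def bind_spaced_dashes(text: str) -> str:
    """Replace the plain space before a spaced en or em dash with a no-break space."""
    # The lookahead requires a dash followed by a plain space, so unspaced
    # dashes and number ranges are left untouched.
    return re.sub(r" (?=[\u2013\u2014] )", NBSP, text)


sample = "some text \u2013 more text"
print(repr(bind_spaced_dashes(sample)))
```

Only the leading space is changed; the space after the dash stays breakable, so a line can still wrap after the dash but never before it.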

UTF-8 can end this debate

Ever since I read the informative (and nit-picky, just my style) Dash (punctuation) article, I've been cheerfully using typographically correct dashes in my edits. And seeing as I can easily enter them as a single character, I've done so. Same goes for curly quotes, apostrophes, etc.: I've been typing them in directly to edit boxes.

I've just discovered that in using these characters I am quite the shill for Microsoft [2] — the funny thing is, I'm editing with Firefox on a Mac! I suppose that technology has marched on, at least in some operating systems, allowing me to join in on the Windows party of using fancy punctuation marks that aren't part of the ISO-8859-1 standard. I recognize that this is wrong, since the English Wikipedia claims to use the ISO-8859-1 charset. Sorry.

Once you edit this way, though, it's hard to go back. There's no need for ugly HTML code in the Wiki markup, just clean, correct typography. On a Mac you type in option key combos to get the symbols; on Windows you edit in Word and let it do its creepy thing, then copy and paste; on Linux, I don't know, you write a program or something. Anyway, it's better than the other options.

Of course, the problem is that the ISO-8859-1 charset does not have these characters. The solution is to move the English Wikipedia to UTF-8, like the French and other language Wikipedias have recently done. At that point we can use almost any single-character symbol we want.

As far as size goes, all ASCII characters still take up only one byte when represented in UTF-8. Characters that are not in ASCII but are in ISO-8859-1, like é, take up 2 bytes instead of 1, but I think that would be balanced by the savings from dashes being a 3-byte code instead of 7 bytes of HTML markup (or two ugly bytes when represented as --). Non-European characters (used in interwiki links) would take up less space in UTF-8, too. So in the end it might be an overall space savings, if the conversion is thorough.
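
The byte counts in question are easy to check. The following Python sketch measures a few representative strings; the helper `utf8_len` and the `samples` table are just names chosen for this example.

```python
def utf8_len(text: str) -> int:
    """Number of bytes the string occupies when encoded as UTF-8."""
    return len(text.encode("utf-8"))


samples = {
    "hyphen": "-",
    "double hyphen": "--",
    "e-acute": "\u00e9",
    "en dash": "\u2013",
    "em dash": "\u2014",
    "named entity": "&mdash;",
    "numeric entity": "&#8212;",
}

for label, text in samples.items():
    print(f"{label:15} {utf8_len(text)} byte(s) in UTF-8")

# For comparison, the same accented letter in ISO-8859-1.
latin1_len = len("\u00e9".encode("latin-1"))
print(f"e-acute         {latin1_len} byte(s) in ISO-8859-1")
```

Whether the switch saves space overall depends on how many accented letters an article has compared with how many entities it uses, which this sketch does not try to estimate.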

Thoughts? This change would probably take some community support, as it can't be much fun changing character sets on the largest Wikipedia. Nathan 17:08, Jun 10, 2004 (UTC)

I'll second your opinion. Unicode (UTF8 is the easiest) is the way to go. buckwad 03:38, 2004 Jun 11 (UTC)
I used to enter special characters directly with my keyboard (Microsoft have released a free tool that makes it possible to easily design your own keymap; I took the Spanish one and added a few extra characters), but people complained they couldn't view them, so I had to stop. It would be so good if I could start again. — Chameleon 12:39, 16 Jun 2004 (UTC)
AFAIK only Mozilla-based browsers like Firefox convert characters not present in the given encoding to Unicode entities --Hhielscher 09:22, 27 Mar 2005 (UTC)

Please go to Wikipedia:Unicode and make sure your voice is heard. --Hhielscher 09:22, 27 Mar 2005 (UTC)

We have UTF-8 support now. Sweet! See: "-", "–", "—". — Bcat (talk | email) 28 June 2005 15:46 (UTC)

Should non-English words be italicized?

Should non-English words be italicized? (Blackfoot music). Hyacinth 03:39, 14 Jun 2004 (UTC)

Yes, according to Wikipedia:Use other languages sparingly. Nathan 05:46, Jun 14, 2004 (UTC)

Proposed section for inclusion in the "Dashes" section

On repeated occasions we have been unable to reach agreement on what styles of dashes are acceptable in Wikipedia articles. In an attempt to avoid further petty dash reversion wars (in which I have readily – and pettily – participated in the past), I'd very much like to work for consensus towards including a paragraph along the following lines in the relevant section of the Manual of Style. It's basically the same as the approach proposed by User:Chameleon above.

===Dash guidelines for Wikipedia editors===
In the interests of Wikipedia:Wikilove and pending a planned update of the Wikimedia software that will automatically convert groups of hyphens into the appropriate correct en- and em-dashes, editors are encouraged to be accepting of others' dash preferences and not to modify a chosen style arbitrarily, in the same way as they would refrain from arbitrarily changing "artefact" to "artifact" (or vice-versa). The following five dash styles are currently in use on Wikipedia. Please do not change them to reflect your preference except as indicated below.
  • Tight (unspaced) em-dashes—like this. Entered by means of either &mdash; or &#8212;.
  • Spaced em-dashes — like this.
    • A very rare subset of this style separates the dash from the surrounding words using hair spaces; since many browsers cannot display hair spaces, these appear on the display as simple tight em-dashes.
  • Spaced en-dashes – like this. Entered by means of either &ndash; or &#8211;. (Note: an unspaced en-dash may be used to indicate a range of numbers, but unspaced en-dashes should not be used for the parenthetical use under discussion in this Guideline.)
  • A pair of hyphens--either spaced or unspaced -- like that. These are ugly, but simple to type, and will be taken care of in the future by the automatic conversion feature; indeed, under that future version of the software, they are expected to become the default style. Editors who do not want the bother of keying in HTML entities are free to type their dashes in this fashion. However, subsequent editors are free to convert any double-hyphens they come across to any of the above three types, depending on:
    • personal preference between en-dashes and em-dashes, and
    • how the hyphens were initially entered -- i.e., a spaced double hyphen may only be converted into a spaced em-dash or a spaced en-dash. The original editor's spacing preference is respected.
  • A single spaced hyphen - actually, there's no real reason to flout the rules of good typesetting in this way. If you come across one of these, please feel free to convert it into your preferred dash style from the above list.

Is there anything truly objectionable in this proposal? May I please include it in the Manual of Style? We've been through all the arguments before, several times; we all have our preferred ways of doing things and, until the software gets updated and we get automatic hyphen conversion, including this guideline in the Manual might help reduce the odd hot-spot of editing heat and, at the same time, improve the professional appearance of our articles. Please, comments, objections, amendments below. Hajor 22:48, 14 Jun 2004 (UTC)

Being a stickler for good spelling, grammar and punctuation, I obviously agree with this. — Chameleon 12:36, 16 Jun 2004 (UTC)
That seems quite reasonable to me, other than that it should be acceptable to substitute one's own preferred hyphen style along with one's own preferred spellings over that of the original editor when greatly expanding or greatly changing an article – for example when expanding a stub or doing a complete reworking of an article. But not when making minor changes or additions.
My own dash style, however, is to use "&nbsp;&ndash; " to keep the spaced en-dash (or spaced em-dash) with the preceding word in case of a line break. jallan 23:58, 14 Jun 2004 (UTC)
Yes, I too often find it helpful to use a non-breaking space to keep the dash in the right place. — Chameleon 12:36, 16 Jun 2004 (UTC)
I feel this is a good addition. I've had it with people not respecting my double hyphens. [[User:Meelar|Meelar (talk)]] 20:19, 15 Jun 2004 (UTC)
After my perusal of the 2000 Chicago Manual of Style yesterday, I'm glad to see such a succinct, simplified, and inclusive policy suggested here. (I had forgotten how many kinds of dashes there were in typesetting!) I'm all for it, even more so with Jallan's amendment. One question: will the new software actually convert the text, or just display two hyphens as an m-dash? And (okay, two questions) will it (or can it be made to) incorporate Jallan's clever use of &nbsp;? — Jeff Q 20:34, 15 Jun 2004 (UTC)
As I understand it, this new software we are hoping for will interpret double hyphens as special wiki-markup for dashes. At that point, it will be possible for someone to write a script to change all the HTML entities (&mdash; and &#8212;) into double hyphens, since these double hyphens will in any case be converted into the appropriate entity (or indeed, directly insert correct dashes as Unicode if we go over to UTF-8) for viewing in the user's browser, and the use of entities will therefore be unnecessary and undesirable — just as we prefer the use of double apostrophes (''text'') to the use of HTML (<em>text</em> or <i>text</i>) when italics are required, now that the software supports this.
N.B. Some argue that triple hyphens --- rather than double ones --- should be the markup for dashes, because this leaves "--" free to be the markup for those little en thingies ("–"), in keeping with TEΧ style. Personally, I don't mind either way. — Chameleon 12:36, 16 Jun 2004 (UTC)
An excellent proposal. I heartily endorse it. Tannin 14:56, 16 Jun 2004 (UTC)
PS: As we all know by now, the ndash and the mdash are different things and are used for different purposes: i.e., the software should convert "--" and &ndash; and &#8211; to –, and convert "---" and &mdash; and &#8212; to —.
By the way, as one of the strongest supporters of professional standards in our presentation (i.e., correct punctuation), and in the interests of harmony, I some time ago voluntarily switched from the no-space mdash style to the spaced style, as this seems to incur less wrath from the typographically challenged ... er ... sorry ... I mean as this seems to incur less wrath from my good and helpful Wikibretheren. Tannin
I think the convention of single, double and triple hyphens is a good convention that is easy and unambiguous to convert by software.
--Ruhrjung 02:47, 2004 Jun 17 (UTC)
As a LaTeX user, I'm inclined towards -- and --- for en and em dashes myself as well. Regardless of the specific syntax, though, it's important that a simple markup style be present for both dash types, since both have common and important typographical uses. Of course, most users will probably not use either one correctly, but it should be easy to correct them and the resulting wikitext should be readable. (PS. Who thinks that a spaced en dash is an acceptable typographical replacement for an em dash, spaced or not???) —Steven G. Johnson 03:48, Jun 17, 2004 (UTC)
Well, I don't like it either but plenty of people here see it as the perfect compromise between us em-dashers and the hyphen mob. — Chameleon 11:13, 17 Jun 2004 (UTC)
Who thinks that a spaced en dash is an acceptable typographical replacement for an em dash, spaced or not??? Germans, for one; their typographical convention is "space, en-dash, space" rather than (unspaced) "em-dash". A matter of style and tradition, perhaps.
Oh, and I also wouldn't mind seeing -- and --- being converted into en-dashes and em-dashes, respectively. -- pne 12:25, 17 Jun 2004 (UTC)
Yes. I've seen a number of dash recommendation specifications which suggest strongly that either em-dash or space,en-dash,space are equally acceptable. I normally use the second, but without any strong preference for it. Of course when used between quantities with the meaning 'to', an unspaced en-dash is always normal. But there are exceptions, for example "−7–−2" looks far worse than "−7 – −2". Actually both look rather horrible. But in any case I agree that '--' for en-dash and '---' for em-dash is the best known easily readable convention that provides the most control. Presumably the new software would first run a pass and change all hard "--" to "---" and then all " --- " to " -- " before beginning the new interpretation. jallan 14:24, 17 Jun 2004 (UTC)
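
One reading of the two-pass migration described just above can be sketched in a few lines of Python. This is only an interpretation for illustration: the function name `migrate_hyphen_runs` is hypothetical, and the sketch assumes that existing double hyphens were all typed as dashes and that a run of four or more hyphens is a horizontal rule to be left alone.

```python
import re


def migrate_hyphen_runs(text: str) -> str:
    """Rewrite legacy hyphen runs ahead of a switch to '--' and '---' markup."""
    # Pass 1: every standalone double hyphen becomes a triple hyphen,
    # since under the old typing convention '--' meant a full dash.
    text = re.sub(r"(?<!-)--(?!-)", "---", text)
    # Pass 2: spaced triple hyphens become spaced double hyphens,
    # that is, spaced dashes end up as spaced en dashes.
    text = text.replace(" --- ", " -- ")
    return text


print(migrate_hyphen_runs("tight--dash and spaced -- dash"))
```

The net effect is that unspaced dashes become `---` and spaced ones become ` -- `, whichever of the two the editor originally typed.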
While we're mentioning other languages, French uses spaced em-dashes. Spanish uses em-dashes spaced on one side, just like parentheses or brackets. — Chameleon 15:35, 17 Jun 2004 (UTC)


Dashes (moved from the village pump)

Opinions are being sought regarding a proposal on this thorny issue for inclusion in the Wikipedia:Manual of Style. Please see the bottom of Wikipedia talk:Manual of Style/Dashes. Thanks. Hajor 20:15, 15 Jun 2004 (UTC)

That huge and technical debate (still) seems to say that a software update will convert simple minus signs that everyone has on their keyboard into these funny entities that are not at all intuitive. One of the hugely disagreeable things about this debate is that usability (the essence of wikitext) seems to get repeatedly ignored, and we go round in circles about trivial details. Pcb21| Pete 08:07, 16 Jun 2004 (UTC)
You don't have to care about typography, but some editors do, and when they fix your typography it will be much easier for them, others, and you if they can write --- instead of &mdash; (—) and -- instead of &ndash; (–). —Steven G. Johnson 06:07, Jun 17, 2004 (UTC)
Right, that would be a software update extending the wiki-syntax. The discussion however seems to be about what combination of &mdash and &ndash to use right now. Pcb21| Pete 18:49, 17 Jun 2004 (UTC)
No, it is not about that. Except for one comment no-one has disagreed with Hajor's assertion that "&mdash;", " &ndash; " and even " &mdash; " are all acceptable. The discussion is on whether Hajor's modifications should replace the current statement which in any case already recommends use of &mdash; and &ndash;. There is discussion on what coding should be used which is another matter. Nor does using &mdash; instead of typing "--" or " - " in any way make any Wikipedia text less usable unless you mean usable in a strict ASCII text environment which barely exists any more. jallan 14:56, 18 Jun 2004 (UTC)

Can people be bothered?

I know I'm asking to get lynched posting this here, but can most editors really be bothered to do anything special just to get a dash that looks the "right" length? I certainly can't. I'm all for correct spelling, punctuation and grammar, but I will stop short of fretting about whether my dash is the correct length. I don't perceive any difference when reading between hyphens, m and n dashes and whatever else you have (by that I mean it doesn't alter the way I would read a sentence, unlike other misplaced or misused punctuation). When I'm writing I will use just a single hyphen/minus sign with a space either side - like this - when perhaps I "should" use a dash. It's easy to type and easy to read (both as source and when rendered on the page), and I frankly can't be bothered to go find the style guide every time I type a dash to check the correct way of doing it. Maybe you could blame my laziness on Microsoft Word for autocorrecting all my hyphens to dashes. Anyway, if you really have nothing more productive to do you're welcome to "correct" all my "misused" hyphens. But I'm sure Wikipedia could benefit from your time more if it were spent improving the content of articles rather than worrying about exactly what the dashes look like. Tjwood 17:34, 16 Jun 2004 (UTC)

I would hope that most editors would be in favor of anything that improves the look and readability of their text, including using dashes rather than hyphens when appropriate. Looking good is important. Your style is substandard in the technical sense. I don't believe you will find support for use of " - " as a dash in any normal style sheet. Instead you will find that use severely deprecated. Check also Google Search: publisher submission dash for specification after specification that in submissions to publishers a normal dash must be indicated by "--". If typing "&mdash;" is too difficult for you, then use "--" accordingly. That has been the standard keyboard substitute for dash for at least 60 years and is also more efficient to enter than " - ". As you point out, even MS-Word corrects use of " - " as an error. That this is unimportant to you is irrelevant. It is important to others that Wikipedia text looks like it was written and edited by knowledgeable people. If you use non-standard style of any kind, whether punctuation, spelling, typographical style, grammar ... then you can expect to have your work edited to fit the standards. And you shouldn't expect those who know better to accept a generally deprecated style any more than they would accept a generally deprecated spelling. You seem to be suggesting that because you and some others don't know the rules and conventions and don't want to know the rules and conventions, that the rules and conventions don't matter. It doesn't work that way. jallan 01:33, 17 Jun 2004 (UTC)
TJ, you are right that most editors cannot be bothered (nor should they) to understand the correct usages of the various dash types—the most important thing is to get good writing and even better content. However, some editors will want to correct the typography, and it's important that they can do so (a) easily and (b) in a way that still produces readable wiki, so that editors like you are not too confused. —Steven G. Johnson 03:56, Jun 17, 2004 (UTC)
…and (c) it is important that people must not convert (my!) correct dashes into hyphens, as has been happening. We need a policy in favour of correct punctuation to defend Wikipedia against silly edit wars. — Chameleon 11:13, 17 Jun 2004 (UTC)
Sorry, having re-read my first comment I don't think I made it quite clear. When I'm writing text that will be printed - in MS Word for instance - I will use an unspaced double hyphen which Word will automatically convert to an m-dash. Word will also convert a spaced hyphen to a spaced n-dash such as for date ranges etc. But for Web use where I'm limited to plain ASCII or otherwise using special entities, I use a spaced hyphen because it's too much hassle to look up the correct entities to use and the source looks a mess with entities all over the place. A double dash--like this--looks more ugly than a spaced hyphen.
If Wikipedia gets around to replacing double dashes with the correct entities when the pages are displayed (in a similar manner to Word), but leaving the source code readable, I will happily use them. But until that point I will stick to my "incorrect" spaced hyphens.
As for printed versions of Wikipedia, surely any possible printed version would need fairly widespread editing first to make things consistent anyway? I've seen different pages on Wikipedia written using completely different styles of layout etc, never mind just dashes. You can get away with it no problem on the web, but when it comes to paper it would look a mess without one fixed style decided by the publisher of the paper version.
Tjwood 13:14, 17 Jun 2004 (UTC)
Actually, you know, I've just looked at what I posted above, and I now think I was wrong when I said "A double dash--like this--looks more ugly than a spaced hyphen." The double dash looks better. You may have just converted me. :-) Tjwood 13:20, 17 Jun 2004 (UTC)
It doesn't really matter whether your hyphens are single or double, spaced or tight — just as long as you don't mind the rest of us correcting them to dashes. In the same way, it doesn't matter too much whether you misspell words when you make valuable factual contributions to the encyclopaedia — just as long as you don't mind us cleaning them up afterwards. — Chameleon 15:27, 17 Jun 2004 (UTC)
It may matter for ESL readers of Wikipedia. I am not sure, but I think the unspaced dashes, when used as parentheses, are globally a relatively uncommon phenomenon. Or at least, they introduce a resistance in the reading process for readers who aren't used to them. And unspaced dashes are not, let's be honest now, particularly common on the world wide web. I would favor a standard-rendering of spaces before and after m-wide dashes.
--Ruhrjung 18:56, 2004 Jun 17 (UTC)
At this point I'd like to say that dashes are often poor punctuation anyway. Most of the time, the sentence is better with a comma, semi-colon, colon, full stop or parenthesis. — Chameleon 19:28, 17 Jun 2004 (UTC)

Consistency within articles

Having looked at the first ever printed version of Wikipedia (the German WikiReader Internet), I've noticed how awful mixing hyphens and varying length dashes within articles is. It's bad enough between articles, but can we please have a policy of not mixing them. I think consistency is more important than deciding which dashes are used. Angela. 12:43, 17 Jun 2004 (UTC)

That's fine, as long as it is not used as an excuse to justify bad punctuation and convert dashes into hyphens. This is really just an extension of the existing policy on spelling: correct mistakes, respect other people's variations within what is correct, try to keep usage consistent within articles even if usage differs between them... — Chameleon 15:27, 17 Jun 2004 (UTC)
That would be very good, and according to other wikipolicies I believe in (like consistent British or American spelling per article).
--Ruhrjung 18:41, 2004 Jun 17 (UTC)
Isn't consistency kind of a lost cause on Wikipedia? Which, if you ask me, is the way it probably ought to be: parts of articles suck, other parts are better, but overall it's getting pulled in the right direction. It'll never be complete. Wikisux 15:03, 23 Jun 2004 (UTC)

Going live with policy

The proposed text (above) appears to enjoy consensus support, so I'll be inserting it into the Manual of Style shortly. The outstanding issue to include, if anyone fancies adding in the extra language, is consistency within a given article: the guideline given for US/UK spelling conventions is follow the standard set by the first major (non-stub) edit; the logical approach for dashes would be to follow the dash style used by the first editor to include a dash in the article.

Changing to explicit &mdash; and &ndash; entities now (while admittedly not as transparent as we might wish wikitext to be) will also save us a lot of anguish and conflict when the promised automatic conversion comes on line, when presumably we'll switch to using two- and three-hyphen chains in the markup -- either manually or (as suggested above) by someone running a bot. ("Conflict" in the sense that a lot of people have been typing "--" in the expectation that they'll autoconvert into emdashes, when from what we saw a few months ago when the function was briefly activated, it'll be three hyphens for an emdash and two for an endash, similar to the TeX convention described above.)

Everybody happy? Hajor 10:53, 28 Jun 2004 (UTC)

The trouble is that, if the software interprets double hyphens as en-dashes rather than em-dashes, and we have a policy of not changing other people's punctuation unless it's really bad (i.e. they use hyphens for dashes), then we'll have a lot of "--" that the author wanted to represent em-dashes that will now show up as en-dashes and our policy will specifically forbid us from manually turning these "--" into "---" (so that — rather than – appears).
So, the policy seems fair, but a technical glitch will unfairly favour the use of en-dashes. — Chameleon 11:32, 28 Jun 2004 (UTC)

No -- the third bullet says subsequent editors are free to change double hyphens to their choice of em-dashes or en-dashes (providing the original writer's spaced/unspaced preference is respected). That was intended precisely to preempt that type of conflict when the auto-conversion comes on line: make them explicit now, and then change them back to either (1) "---" & "--" or (2) "--" and " - " following the upgrade. Hajor 12:51, 28 Jun 2004 (UTC)

No, Hajor, you're not getting me here. I understand that the policy says that now, in the short term I can turn hyphen hacks into real dashes. That wasn't my concern. I was concerned with how this policy would interact with future software that would automatically display "--" and "---" in the markup as "–" and "—" on the page.
At that point in time, the policy you are advocating would obviously be largely obsolete. The preferable way to put correct dashes into text will be, at that point, by using double and triple hyphens in the markup. Using "&mdash;" etc will be deprecated, in the same way that we prefer the use of double inverted commas instead of "<i>". That is to say, we only use HTML to do something when there is no wiki markup for it. It would be a good idea, at that point, to write a script that goes through Wikipedia changing all instances of "&mdash;" to "---". This would not affect the display of the pages in the slightest; it would just make the markup easier to read.
Follow me so far?
My point is that most of the policy would be obsolete, except for the part about not changing other people's punctuation (from one of the three acceptable forms of dashes to one of the others). I think people would argue that this part still stands. The problem with this is that, although those em-dashes originally entered into the text as "&mdash;" will now show up in the markup as "---" and on the page as "—", there will also be many instances of dashes originally entered into the markup as "--" with the author's intention being to enter an em-dash, which would at that point show up on the page as "–".
There would therefore be thousands of en-dashes that the original authors wanted to be longer. And policy would prevent me from changing the markup from "--" to "---" to make them display as "—" instead of "–". — Chameleon My page/My talk 19:55, 30 Jun 2004 (UTC)
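For illustration, a minimal Python sketch of the kind of clean-up script described above. It assumes the software ends up treating "---" as an em dash and "--" as an en dash, which is only what was reported from the brief trial; the function name and mapping are hypothetical.

```python
# Hypothetical mapping: assumes "---" will render as an em dash
# and "--" as an en dash once automatic conversion is live.
ENTITY_TO_HYPHENS = {
    "&mdash;": "---",
    "&ndash;": "--",
}

def entities_to_hyphens(wikitext: str) -> str:
    """Replace explicit dash entities with hyphen chains in the markup."""
    for entity, hyphens in ENTITY_TO_HYPHENS.items():
        wikitext = wikitext.replace(entity, hyphens)
    return wikitext

print(entities_to_hyphens("1912&ndash;1914 was long&mdash;very long"))
```

Note that a "--" typed by hand before the switch passes through unchanged, so afterwards it cannot be told apart from a converted "&ndash;". That is exactly the ambiguity raised in the comment above.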
Seems to me this whole mess could be avoided by turning "--" into em-dashes, as everyone expects, and " - " (hyphen surrounded by spaces) into en-dashes. Reverse compatibility, preserving authors' intent, and intuitive. (The three dashes proposal I find ridiculous.) Wikisux 21:19, 30 Jun 2004 (UTC)

I've said this before, but I wonder if I am being understood: we should NOT use dashes that are created in source with characters that the average person cannot understand. The average person has NO IDEA what HTML entities are. Please put this ahead of nit-picking over correct and pretty typography. Agree on a way to represent the various typographical elements with "-", "--", "---" and spaces. -- Tarquin 22:34, 28 Jun 2004 (UTC)

You've said it before, you've been understood, but the consensus shown above indicates that you're simply not being agreed with. And I think you're insulting the intuitive abilities of the "average person", too. And why on Earth didn't you contribute your views in the two weeks since I posted the proposed text? Please -- could you try to live with the HTML entity solution until the automatic conversion function is activated (when we are sure to start using nice, transparent strings of hyphens)? Hajor 23:09, 28 Jun 2004 (UTC)
I'm also suspicious of average person arguments, especially when almost all Wikipedians (and logically almost 50% of them are below the Wikipedian average) are able to master the other various stylistic features of Wikipedia source code. I think that even the average computer user today does know what HTML entities are. The forms &mdash; and &ndash; are hardly unintuitive. But many Wikipedian editors find them ugly and annoying to type and some don't care about typography and some don't want to understand such things, balanced by those who very much care. But no-one wants to see "---" in the final presentation or "--" in a range, such as 1912--1914, in the final presentation. Accordingly we can't use such constructs before new interpretation software is applied. jallan 00:20, 29 Jun 2004 (UTC)
And that is in fact what was mandated by the text of the Manual of Style as it stood when I made the suggestion two weeks ago: "Historically a double hyphen (--) was used to represent an em dash because on a typewriter the hyphens tend to connect, creating a dash in appearance, but since this is almost never the case in ASCII do not use this." It's still there now, if you care to take a look. And no, it wasn't me who put it there. Hajor 04:21, 30 Jun 2004 (UTC)

I apologize if this has already been brought up, but I'd suggest replacing double hyphens (--) with em dashes, and single hyphens surrounded by spaces ( - ) with en dashes. This is the Textile standard (if you've ever used Movable Type) and it makes more sense than any of the other ideas I've seen here. edit: Oh, I see it has been brought up already. I throw my weight behind this proposal, then. Wikisux 04:40, 30 Jun 2004 (UTC)

Quotation Dash / Horizontal bar

The Unicode Standard (v4.0), chapter 6, page 155, officially names U+2015 (&#x2015;) Horizontal Bar and gives Quotation Dash only as an alternate name. Of course, calling it Horizontal Bar invites confusion with the HTML tag <HR>, whose rendered graphic people also seem to want to call a horizontal bar, when the W3C standard really calls it a Horizontal Rule. So yeah... Call U+2015 a quotation dash, a horizontal bar, or both? Keeping with the Unicode Standard for naming specific characters would be best. Kevin Breitenstein 05:43, 3 Jul 2004 (UTC)

Software

I’ve come late into this debate, but I’m pleased with the way it seems to have turned out. I’d like to wish the people writing the conversion software the best of luck. It looks like a wicked problem, but there are some good precedents for a solution.

  • Textile, written in PHP, was already mentioned. It does dashes, smart quotes and lots of other character conversions. It even supports a table formatting scheme similar to wikitables. There’s also a Textile plug-in for Movable Type, written in Perl.
  • SmartyPants, in Perl, is a simple processor which converts dashes, smart quotes, and ellipses. There's also a PHP version.
  • Markdown is a more full-featured text processor by the same author.

It was inspiring to see Wikipedians hash out a solution that respects everyone’s wishes and maintains a high standard of typography.

“Apostrophes and quotation marks, anyone?”

Michael Z. 06:04, 2004 Aug 27 (UTC)

Various ASCII to HTML conventions

Just trying to list them to keep them straight. If the automatic conversion is ever implemented, this will probably become a vote.

TeX and SmartyPants convention

  • hyphen = “-” (one hyphen: “Ex-wife”)
  • en dash = “--” (two hyphens: “1995--2004”)
  • em dash = “---” (three hyphens: “em dashes---those beautiful things”)
  • spaced en dash = “ -- ” (two hyphens surrounded by spaces: “November 1 -- December 26”)
  • spaced em dash = “ --- ” (three hyphens surrounded by spaces: “em dashes --- those beautiful things”)
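A minimal Python sketch of the TeX-style rule in the list above, offered only as an illustration and not as what any future MediaWiki converter would do. The function name is hypothetical, and leaving runs of four or more hyphens alone is an assumption of this sketch.

```python
import re

EN_DASH = "\u2013"
EM_DASH = "\u2014"

def tex_dashes(text: str) -> str:
    """Convert '--' to an en dash and '---' to an em dash.

    Single hyphens and runs of four or more hyphens are left untouched.
    """
    def convert(match):
        run = match.group(0)
        if len(run) == 2:
            return EN_DASH
        if len(run) == 3:
            return EM_DASH
        return run

    # Match whole runs of hyphens so '---' is never read as '--' plus '-'.
    return re.sub(r"-+", convert, text)

print(tex_dashes("Ex-wife, 1995--2004, em dashes---those beautiful things"))
```

Spaces around a hyphen chain are simply kept, so the spaced forms in the list need no special handling.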

Textile convention

  • hyphen = “-” (one hyphen with no spaces: “Ex-wife”)
  • en dash = “ - ” (one hyphen surrounded by spaces: “1995 - 2004”)
  • em dash = “--” (two hyphens: “em dashes--those beautiful things”)
  • spaced en dash = ??? (can this be typeset at all using the Textile notation?)
  • spaced em dash = “ -- ” (two hyphens surrounded by spaces: “em dashes -- those beautiful things”)

Nathan Hamblen’s backwards convention

  • hyphen = “-” (one hyphen with no spaces: “Ex-wife”)
  • en dash = “---” (three hyphens: “1995---2004”)
  • em dash = “--” (two hyphens: “em dashes--those beautiful things”)
  • spaced en dash = “ --- ”? (three hyphens surrounded by spaces: “November 1 --- December 26”)
  • spaced em dash = “ -- ”? (two hyphens surrounded by spaces: “em dashes -- those beautiful things”)
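To make the practical difference concrete, here is a small table-driven Python sketch of how the same unspaced hyphen chains would come out under each scheme listed above. The labels "tex", "textile" and "reversed" and the function name are hypothetical, and only the unspaced "--" and "---" forms are modelled.

```python
EN_DASH = "\u2013"
EM_DASH = "\u2014"

# What an unspaced hyphen chain means under each convention (labels are illustrative).
CONVENTIONS = {
    "tex": {"--": EN_DASH, "---": EM_DASH},
    "textile": {"--": EM_DASH},  # no three-hyphen form appears in the list
    "reversed": {"--": EM_DASH, "---": EN_DASH},
}

def render(text: str, convention: str) -> str:
    """Replace hyphen chains according to the chosen convention."""
    table = CONVENTIONS[convention]
    # Longest chains first, so '---' is not consumed as '--' followed by '-'.
    for chain in sorted(table, key=len, reverse=True):
        text = text.replace(chain, table[chain])
    return text

for name in CONVENTIONS:
    print(name, render("em dashes--those beautiful things", name))
```

The same "--" comes out as an en dash under the first scheme and as an em dash under the other two, which is why existing double hyphens cannot be converted safely without knowing what the author intended.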

I believe that Textile and SmartyPants both allow you to reverse their behaviour by setting a preference. I’m guessing TeX does too. The typist’s convention going back many decades trumps the significance of these excellent software packages’ behaviours. Michael Z. 2005-10-4 15:04 Z

Michael, I edited your nice list to (a) correct a small error in the example for “en dash” in Nathan Hamblen’s backwards convention, (b) add entries for “spaced en dash” and “spaced em dash,” and (c) add some <code> tags to improve the formatting. I hope you don’t mind. – Daniel Brockman 14:43, 26 December 2005 (UTC)
This is all moot since they can be entered directly into the markup now — like this. I've got a javascript button that converts a bunch of types automatically. — Omegatron 02:04, 1 December 2005 (UTC)

Contradiction on double hyphen use

According to "Dashes and hyphens used on Wikipedia":

Historically a double hyphen (--) was used to represent an em dash because on a typewriter the hyphens tend to connect, creating a dash in appearance. Since this is almost never the case in digital type, do not use this technique.

But according to "Dash guidelines for Wikipedia editors":

Editors who do not want the bother of keying in HTML entities or prefer to maintain the readability of the wikitext are free to type their dashes in this fashion.

(my emphasis)

It seems odd to me that such a broad exception to the rule would be placed in the bottommost section. Also the two section titles seem redundant, no? Dforest 12:45, 8 November 2005 (UTC)

Yep. I like the second one better. They should be allowed, but can be changed to a proper dash by anyone who wants to. — Omegatron 14:41, 9 January 2006 (UTC)
There can be problems displaying em dashes (and curly quotes for that matter) with some OS and character sets -- that is a reason why perhaps there should be tolerance for the double hyphen (and straight quotes) option. RomaC 03:12, 21 January 2006 (UTC)
But allow it in an inconsistent manner? That doesn't make sense. Either allow them or don't. There's no point in letting people use double dashes in a few articles so that they can read them in their browser, while all the other articles use Unicode dashes. Those same browsers have a problem with foreign characters, too. Should we remove foreign language Wikipedias?
Ideally, the Mediawiki would be smart enough to figure out where to use the real dashes, and we could use <nowiki> for the few cases where it would otherwise make a mistake. For instance, whenever it detects a dash between two dates, it renders it as an en dash. When it sees two dashes in a row, it renders it as an em, etc. Then you could turn it off in preferences if you had a broken browser. — Omegatron 19:40, 24 April 2006 (UTC)
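A rough Python sketch of that idea, under loud assumptions: "dates" are taken to mean two four-digit years joined by a single hyphen, a double hyphen always means an em dash, and anything inside <nowiki>...</nowiki> is left alone. None of this is actual MediaWiki behaviour, and the function name is hypothetical.

```python
import re

EN_DASH = "\u2013"
EM_DASH = "\u2014"

# The capturing group makes re.split keep the protected spans in the result.
NOWIKI = re.compile(r"(<nowiki>.*?</nowiki>)", re.DOTALL)

def smart_dashes(wikitext: str) -> str:
    """Convert year ranges and double hyphens, skipping <nowiki> spans."""
    parts = NOWIKI.split(wikitext)
    for i, part in enumerate(parts):
        if i % 2 == 1:
            continue  # odd positions are the <nowiki> spans: leave untouched
        # Exactly two hyphens (not part of a longer run) become an em dash.
        part = re.sub(r"(?<!-)--(?!-)", EM_DASH, part)
        # A hyphen between two four-digit numbers becomes an en dash.
        part = re.sub(r"\b(\d{4})-(\d{4})\b", r"\1" + EN_DASH + r"\2", part)
        parts[i] = part
    return "".join(parts)

print(smart_dashes("The war (1914-1918) ended--finally. <nowiki>a--b</nowiki>"))
```

The year pattern will also catch things that are not dates, such as two four-digit groups in an ISBN, which is the sort of mistake the <nowiki> escape would be needed for.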

"Allowed" should be changed to "tolerated". That would mean that if a user types a double dash when creating or adding to an article, nobody should get bent out of shape, and there would also be the encouragement (more than allowing) of editors to change them to em dash. Thus it would read something like: Hu 22:49, 18 November 2006 (UTC)

I agree 100% on this. There has been discussion and disagreement on this issue in the past, but I think it is worth revisiting the topic, especially now that the edit page has a link to insert the dashes without needing to use the HTML entity. —Doug Bell talkcontrib 23:08, 18 November 2006 (UTC)
I agree also, except that the wording should be: Double dashes may be tolerated, but editors are encouraged to convert any they might find to em dashes, or to spaced en dashes (depending on which of these two practices has already been established in the article). Remember that we allow spaced en dashes OR em dashes (spaced or not), as sentential punctuation. – Noetica 23:29, 18 November 2006 (UTC)
Thank you for re-emphasizing spaced em dashes. I think they are much preferred because the spaces enhance reading speed and comprehension since the words on either side are more easily recognized as individual cognates and don't have to be parsed away from the dash. Hu 10:32, 19 November 2006 (UTC)

Suggest cutting moot point at bottom

Most of the bottom of this article (how to type em-dashes on various keyboards) is totally moot since anyone can just click the link below the edit box to add the em-dash. Any objections to me chopping it totally? It can always live at em dash or something. Stevage 22:19, 19 January 2006 (UTC)

Don't. Some people will want to type them in by hand instead of the edit thingy, which is slower. — Omegatron 19:25, 26 January 2006 (UTC)

Should have a preferred dash style—really

I'm sure I can't say anything here that hasn't been said before, so please be patient with a new editor.

I know about the many styles people use for dashes. I was hoping by coming to the manual of style I'd get some guidance about the preferred style on Wikipedia. My understanding—generally, not specific to Wikipedia—has been that the em dash without spaces (or with hair spaces) was the preferred style. After reading the style manual here it seems like the answer is "whatever you want to type."

Also, having the guide tell editors not to change existing dash styles seems to go against WP:BB and will just result in ugly pages that use multiple dash styles. It's OK not to require a particular dash style, but not even stating a preference is, in my opinion, a mistake. – Doug Bell talkcontrib 04:07, 4 February 2006 (UTC)

No one can agree on it, so it sits in limbo for now.  :-\ — Omegatron 04:44, 4 February 2006 (UTC)
I agree with Doug Bell. My ideal policy would be that we always use proper em and en dashes, but I think mandating hyphens would be a better solution than "do what you like, but don't change anything". Can't this just be voted on once and for all? Surely Wikipedia has settled thornier issues than this... —Chowbok 22:51, 2 March 2006 (UTC)
It should be "do what you like, but x is preferred. don't revert if people change to x." — Omegatron 03:10, 3 March 2006 (UTC)
I agree. And as I've said elsewhere on this page, flush em dashes are the only style—short of inventing a nonstandard &dash; entity—with a prayer of automatically extracting semantic dashes if someone wanted to automatically convert Wikipedia to a different dash style. (PS: Is it a coincidence that this is the only Wikipedia discussion I've read in which all participants sign with a Unicode em dash? :-) ) —BenFrantzDale 04:10, 3 March 2006 (UTC)
I second the agreement with Omegatron's recommended wording. Do we really need a vote to determine what the preferred x style is?—it seems pretty obvious to me. (I've changed my signature from an en dash with spaces to an em dash with no spaces as a sign of support. :-) —Doug Bell talkcontrib 02:27, 18 March 2006 (UTC)

I usually change double hyphens, spaced hyphens, and spaced dashes to flush em dashes when I see them, especially when they are inconsistently used in an article. I don't make it a crusade, just clean it up when I'm editing something else. Em dashes must look fine, because no one ever complains or reverts. Michael Z. 2006-03-03 03:43 Z

I think Omegatron's solution is exactly right. It won't affect anybody who doesn't want to bother with correct dashes, but we'll gradually move to proper usage. And I don't see why x should be so hard to resolve. These aren't actually controversial issues in the publishing world (unlike disputes like, say, the serial comma or the possessive of singular words ending in "s"); em dashes break up sentences, en dashes signify ranges and are used for compound hyphenation. —Chowbok 02:42, 18 March 2006 (UTC)
Why are unspaced better? — Omegatron 03:46, 18 March 2006 (UTC)
To appeal to authority, unspaced em dashes are the style used by CMS and by Strunk & White. —BenFrantzDale 05:07, 20 March 2006 (UTC)

We actually seem to have reached some sort of consensus here. Can we change the policy, then? —Chowbok 01:51, 26 April 2006 (UTC)

Sorry to rain on the parade, but I don't think 5 people is much of a consensus for the whole encyclopedia. I'd love to see an em-dash as preferred style for parenthetical clauses and similar breaking, but am very much against flush or hair spacing. Flush spacing, though common in print, is far from universal, and at Wikipedia seems to be in the minority of articles I've worked on (before tweaking them, that is). Hair spacing looks nice as displayed, but isn't worth the editing annoyance, especially when ordinary spaces look fine as well. — Jeff Q (talk) 02:18, 26 April 2006 (UTC)
Fine, let's ignore the spacing issue for now. Can we at least say that 1) it's okay to put in hyphens or em/en dashes; 2) it's okay to replace hyphens with em or en dashes, as appropriate; and 3) it's not okay to replace em or en dashes with hyphens? —Chowbok 04:03, 26 April 2006 (UTC)
  • Support. Jeff Q (talk) 04:09, 26 April 2006 (UTC)
  • Support. (Seems like the sensible place to develop consensus on this is the dashes talk page, isn't it . . .) —Simetrical (talk • contribs) 03:08, 17 May 2006 (UTC)
  • Support, preferably with the addition that changes from en dash to em dash or em dash to en dash are discouraged. Jeltz
How come? The usage of those is pretty well-established (unlike with spacing). —Chowbok 18:21, 21 August 2006 (UTC)
Well established? I have seen both en dashes and em dashes used in serious typography (newspapers, books). I think we should ignore en vs. em for now since there is no well-established rule there that all of the typographic world agrees on. Jeltz talk 11:22, 22 August 2006 (UTC)
Right, both should be used. Whether it's an em or an en depends on the usage. Can you give me an example of a book that uses one or the other contrary to the standards? —Chowbok 16:00, 22 August 2006 (UTC)
Hmmm, I feel like we might be talking about different things. What I'm talking about is that both en dashes and em dashes can be used for parenthetical purposes, depending on what you prefer. See Dash#En_dash_versus_em_dash for what I'm referring to. Jeltz talk 17:49, 22 August 2006 (UTC)
  • Support, of course. Ignore the spacing issue, and the en vs em issue for now. (Of course, Mediawiki should just replace -- with — automatically.) — Omegatron 15:13, 21 August 2006 (UTC)
I personally prefer the LaTeX style where -- is replaced with – and --- is replaced with —. That way you can write both. Jeltz talk 11:24, 22 August 2006 (UTC)
I used to argue for that style, too.  :-) After creating the dash fixer, I realized that there is a vast number of double hyphens on Wikipedia already from people who learned it in typing class, and a few who just use a hyphen. Might as well just go with what they already use. (Besides, they'd just end up typing -- anyway, out of habit, getting an en dash, and leaving it that way; so we'd have ens everywhere we were supposed to have ems.) It could still be possible to create both dashes, if we were smart with the implementation. I tried to write up a Dash syntax summary, but never finished it. — Omegatron 11:45, 22 August 2006 (UTC)
  • Support, also hyphen-minus to minus (in math context) but not the reverse, and mediawiki should automatically do double -- (but leave more than two hyphens alone). This is the first I (who thought myself a bit of a typography geek) learned that hyphen and m-dash were not equivalent. In the long run, I'd say that type-anything is necessary, but any hyphen-minus outside a PRE tag or equivalent should be ripe for replacement — to minus if math; to en-dash when that's clear or, in parenthetical case, for consistency or when there's any indication that the original author prefers it so; to em-dash for the same reasons or for double hyphens; and to en-dash or em-dash as preferred (or possibly just em-dash) in the parenthetical case when there is no reason to infer authorial preference. --Homunq 14:25, 6 January 2007 (UTC) (or is it —~~~~?)
  • Support Raifʻhār Doremítzwr 15:29, 6 January 2007 (UTC)
  • Support David Eppstein 16:29, 6 January 2007 (UTC)
  • Support Strad 17:10, 6 January 2007 (UTC)
  • Support —Ben FrantzDale 00:11, 8 January 2007 (UTC)

X11

The following section was rather misleading, since e.g. most Linux distributions have switched from xmodmap to xkb, and the provided example does not work.

Under recent versions of X11, the shell command

xmodmap - <<EOT
keysym m           = m           NoSymbol U2014 NoSymbol
keysym n           = n           NoSymbol U2013 NoSymbol
keysym KP_Subtract = KP_Subtract NoSymbol U2212 NoSymbol
EOT

will add the em dash (—) to AltGr-M, the en dash (–) to AltGr-N, and the minus sign (−) can now be obtained by pressing the minus key on the numeric keypad while holding the AltGr key.

Ndashes are shorter than hyphens

Notice:

–––––––––––––––––––– twenty ndashes
-------------------- twenty hyphens
01234567890123456789 twenty digits

At least in my Lucida default font, the hyphen is at least a pixel longer than the ndash, contrary to this MoS section, and the hyphen matches the width of numbers, not the ndash.

So what I'm saying is, unless Lucida is unusual in this respect, please stop replacing hyphens in number ranges with ndashes. Thank you. --James S. 15:16, 16 February 2006 (UTC)

For the record: the character you describe as "hyphen" is the Unicode character U+002D called HYPHEN-MINUS. This is an old overloaded ASCII character that has in the past been used as both a HYPHEN (‐) and a MINUS (−). Some font designers make HYPHEN-MINUS look more like a hyphen (short), others make it look more like a minus (longer, higher, matching +). As long as there is no easy way to enter the proper unambiguous Unicode HYPHEN on non-specialist keyboards, this problem will persist. Markus Kuhn 19:19, 24 March 2006 (UTC)
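The official character names mentioned here can be checked with Python's standard unicodedata module. This is just a quick illustrative listing of the dash-like characters discussed on this page; the choice of characters is ours.

```python
import unicodedata

# Hyphen-minus, hyphen, figure dash, en dash, em dash, horizontal bar, minus sign.
DASH_LIKE = "\u002d\u2010\u2012\u2013\u2014\u2015\u2212"

for ch in DASH_LIKE:
    # Print the code point and the official Unicode character name.
    print(f"U+{ord(ch):04X}  {unicodedata.name(ch)}")
```

The names say nothing about glyph width; how long each character looks is up to the font designer, as described above.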
If that is indeed the case, then your copy of Lucida is quite unusual. The computer I am at right now has crappy fonts installed, but I am quite sure that at home the above would render with the first line longer than the second and with the line of numbers longer than the hyphens. Compare:
–––––––––––––––––––– twenty ndashes
NNNNNNNNNNNNNNNNNNNN twenty Ns
These should be very close to the same length.
What operating system and browser are you using? —BenFrantzDale 16:13, 16 February 2006 (UTC)
Lucida Grande has an unusually long hyphen; I assume other Lucidas do too. Normally a hyphen is very short, indeed. Michael Z. 2006-02-16 16:27 Z
Okay, that's just bizarre. My copy of Lucida Sans has en dashes longer than hyphens, but in Lucida Sans Unicode the hyphens are longer. Why would they do this? How dumb.
But yes, that particular Lucida is extremely unusual in that respect, so we shouldn't base policy on it. —Chowbok 17:48, 20 March 2006 (UTC)
If you have Lucida Sans Unicode installed, this should demonstrate it:
–––––––––––––––––––– twenty ndashes
-------------------- twenty hyphens
01234567890123456789 twenty digits
Chowbok 17:54, 20 March 2006 (UTC)
Whereas plain Lucida Sans has
–––––––––––––––––––– twenty n-dashes
-------------------- twenty hyphens
01234567890123456789 twenty digits
nnnnnnnnnnnnnnnnnnnn twenty n's
The behavior of Lucida Sans Unicode is unusual and should be ignored. Arial, the default font, gives all elements their expected widths:
–––––––––––––––––––– twenty n-dashes
-------------------- twenty hyphens
01234567890123456789 twenty digits
nnnnnnnnnnnnnnnnnnnn twenty n's
Notice how the n-dashes are precisely as long as the n's. —Simetrical (talk • contribs) 03:19, 17 May 2006 (UTC)

Meta: navigation

The Template:Style at the top of WP:MOSDASH made navigation more difficult than necessary. I've moved it to the bottom and filled out the missing shortcut in Template:Style-guideline. - Omniplex 18:02, 28 February 2006 (UTC)

Dashes not used on Wikipedia

The article states neither the figure dash ("‒") nor the quotation dash ("―") should be used in Wikipedia articles because browser support for them is lacking. Is that still true? Under Mac OS X 10.3.9 (2003) they are rendered correctly: not only in the current version of Safari, but also in a 2003 version of Camino (a Mozilla-based browser) and in a 2001 version of Internet Explorer. Ian Spackman 08:13, 24 March 2006 (UTC)

As of 2005, the support of the UTF-8 en dash (–), em dash (—) and minus sign (−) in web browsers has been excellent and there is no reason not to use these characters in Wikipedia. Even VT100-terminal based text-mode browsers such as w3m or lynx haven't had problems with them for many years. I don't know whether the same can already be said for figure dash and quotation dash, which are missing in most commonly-covered Unicode subsets, including WGL4. Markus Kuhn 19:10, 24 March 2006 (UTC)

Article text now outdated after UTF-8 switch

This policy page needs to be thoroughly reviewed for technical inaccuracies after the Wikipedia UTF-8 switch. I found one specifically wrong passage, and I suspect there may be others.

Use the HTML entity &mdash;, which the MediaWiki engine automatically converts into a numeric entity in the rendered HTML.

First, this appears to contradict the up-front, emphasized statement about using UTF-8 characters. The context suggests its point was recommending the type of dash, not necessarily the preferred means to use it, but that's my point — it doesn't comprehend the availability of UTF-8. Second, it's most definitely wrong about rendering "&mdash;" as a numeric entity — it's rendered as a UTF-8 em-dash character, as it should be. I suspect that whoever added the bold-statement update to the top didn't examine the rest of the article for now-inaccurate statements or undesirable advice. I'd be bold and do an update myself, but I'm not confident enough of my knowledge about these topics. (I had to do some careful testing just to be sure of even the single case I'm mentioning here.) ~ Jeff Q (talk) 11:45, 5 April 2006 (UTC)
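To make the distinction concrete: a named entity, a numeric entity and the literal character all denote the same code point, and only the bytes sent to the browser differ. The sketch below uses Python's standard `html.unescape`. It shows what the entities mean, not how MediaWiki itself processes them, and the helper name `decode_entity` is an assumption of this example.

```python
import html

def decode_entity(entity: str) -> tuple[str, str]:
    """Resolve an HTML entity and return (character, UTF-8 bytes as hex)."""
    ch = html.unescape(entity)
    return ch, ch.encode("utf-8").hex()

# Named and numeric forms resolve to the same em dash, U+2014.
print(decode_entity("&mdash;"))
print(decode_entity("&#8212;"))
```

Both calls yield the same character, whose UTF-8 encoding is the three bytes `e2 80 94`.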

Yes, the person who added the bold statement at the top (me) didn't alter the rest of the article very much. Originally it was more like a "this page is invalid now" notice. — Omegatron 14:08, 5 April 2006 (UTC)
I do have a problem with encouraging the use of the UTF-8 characters in wiki source, because n-dash looks exactly like a hyphen in the edit box, at least in Windows Courier (m-dash, while a hair longer, also looks the same unless you compare them carefully side-by-side). It's fine for the original editor, but (a) it can lead to confusion for subsequent editors ("hey, that's a date range, why did he put a hyphen there?") (b) it can mislead newcomers who use existing articles for style guidance, when all they can see is a dash (c) it can lead to a pointless discussion such as the one I just had with User Quoth over what I thought was the replacement of a &ndash; with a hyphen in John Wilkes. It takes a true geek to look at an edit box and figure out whether one is looking at an ASCII hyphen or a typographically correct dash.
Can we at least (1) say Entity and Unicode character are both valid; do not jump into an article and replace an HTML entity with a Unicode character just because you can (2) have the link in the Insert box create the entity instead of the Unicode character. David Brooks 00:09, 17 April 2006 (UTC)
But that makes the wikicode harder to read and more intimidating to the "technically impaired". Better to force the "true geeks" to deal with the complexity of unicode characters than to make it harder on all the regular people editing content by inserting all kinds of confusing HTML.
Ideally, the software would recognize all the major cases in which each type of dash is used and generate them automatically (and curly quotes, too, while we're at it). The special cases could be handled with nowiki tags just like everything else. But no one seems to want to implement this. Textile does it. — Omegatron 02:26, 17 April 2006 (UTC)
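A minimal sketch of what such automatic conversion might look like, covering only two easy cases: a hyphen between digits becomes an en dash, and a double hyphen becomes an em dash. The function name and the rules are assumptions made for illustration. They are not Textile's or MediaWiki's actual behavior.

```python
import re

EN_DASH = "\u2013"
EM_DASH = "\u2014"

def smart_dashes(text: str) -> str:
    """Convert two unambiguous ASCII patterns to typographic dashes."""
    # A single hyphen directly between two digits is treated as a range.
    text = re.sub(r"(?<=\d)-(?=\d)", EN_DASH, text)
    # A double hyphen is treated as an em dash.
    return text.replace("--", EM_DASH)

print(smart_dashes("1914-1918"))
print(smart_dashes("well-known"))  # ordinary compounds are left alone
```

Even this simple rule misfires on phone numbers, ISBNs and negative exponents, which hints at why the special cases would need an escape mechanism such as nowiki tags.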

Birth–death dates

A discussion at Wikipedia talk:Manual of Style (dates and numbers) has revealed an overwhelming consensus (with, I think, only one editor disagreeing) for the use of en-rules for connecting birth and death dates. As this is the universal English standard, is there a reason for not making it the Wikipedia standard? Editors wouldn't of course (indeed, couldn't) be forced to use it, but I can't imagine why any editor would complain because a hyphen had been corrected to an en-rule. --Mel Etitis (Μελ Ετητης) 16:08, 15 May 2006 (UTC)

Proposing to keep Wikipedia talk:Manual of Style (dates and numbers)#n dash as central discussion place on this issue. --Francis Schonken 16:48, 15 May 2006 (UTC)
That certainly seems to be the wisest approach. --Mel Etitis (Μελ Ετητης) 22:17, 15 May 2006 (UTC)

Special character list

The character to the left of the emdash (—) on the special character list under the edit box (–) appears to be an ndash (–). If this is correct, it would be helpful to have this fact added to Omegatron's bolded notice at the top of the article, for the benefit of casual skimmers of the page. With the broad lack of consensus over the styles discussed on this page, I don't feel so bold as to make the change myself. --Blainster 16:12, 18 May 2006 (UTC)

I second that, and I do feel so bold. —71.208.125.132 22:38, 1 September 2006 (UTC)

Negative numbers

I'd like to propose that negative numbers should use the traditional "hyphen/dash" character "-" rather than −. I believe this usage would make pages easier to maintain, and that pages using another convention would probably gradually migrate to this anyway just because of editor ignorance and common usage outside this wiki. I can see using − for subtraction in mathematical expressions (where length may be important for aesthetic reasons), but for plain negative numbers, I think the one or two pixel difference in length isn't worth the trouble. Thoughts? --Doradus 17:57, 23 July 2006 (UTC)

I just type the minus in Unicode (0x2212) like this: −. This shows up correctly on the page and in the edit window. Anal typographers will certainly change dashes to minuses. I see no reason to avoid good typography when we have the tools to easily do it. —Ben FrantzDale 05:23, 24 July 2006 (UTC)
Generally, “+1-microsecond error” and “−1-microsecond error” look okay, but “-1-microsecond error” doesn’t look elegant, could be even confusing. Compare:
  • -1- to -5-microsecond error
  • −1- to −5-microsecond error
U+2212 should be encouraged for more readable, non-ambiguous typesetting, although U+002D for the MINUS sign should not be disallowed. —Gyopi 06:22, 24 July 2006 (UTC)

Dashes In Foreign Languages

i'm a new editor to wikipedia, so i don't know if this issue has been raised already somewhere else, and i'm not sure this is the correct place to post it. if it isnt, pick my discussion up, and put it where it goes. tell me on my user page where u put it, or i'll just come back here for responses. here goes: hyphens are used frequently in french (Ile-de-France) and arabic (Burj al-Arab). i'm guessing the short guy (hyphen-minus) is the correct one to use? its the only one i've ever seen used in those languages. and in french, when do we put hyphens in phrases? i think the french department's name of "Loir-et-Cher" literally means "Loir and Cher". since english doesnt put hyphens there, does french really need them? i've seen people's and places's proper names with and without hyphens. are francophones arguing about whether to put or not put them, as much as anglophones are arguing about which type of dash to use? perhaps this is clarified in the french-language style manual, but im not a francophone. someone lemme know.4.230.174.207 01:57, 11 August 2006 (UTC)

There is actually a very specific hyphen character: Unicode U+2010 hex (8208 decimal)—but it may not be widely supported in fonts or browsers.
In your browser/font it looks like this: ‘‐’. (Looks about right in Safari.) Ian Spackman 00:49, 26 August 2006 (UTC)
It looks like a hyphen to me in Firefox on Ubuntu. Jeltz talk 10:07, 26 August 2006 (UTC)

Clarification re dashes separating surnames in page names

Gene Nygaard and I have been having a discussion on dashes in pagenames that could perhaps be clarified by a discussion here.

Specifically, it concerns dashes in the names of pages such as Erdős–Straus conjecture that involve subjects named after multiple people. Clearly in text such phrases should use en-dashes but we are disagreeing on whether the same is true as part of a page name. MOSDASH says "Hyphens and dashes are generally rather avoided in page names" but it only specifically talks in that passage about numeric ranges, and also says that for greater precision dashes can be used (with appropriate redirects). I think that "precision" can be interpreted here to mean that dashes are appropriate as a signal that the page name refers to two people instead of to one person with a hyphenated name, while Gene thinks the preference for hyphens over dashes takes precedence. Also, the first example of a dash between names in MOSDASH is a seemingly-approving link to Poincaré–Birkhoff–Witt theorem; I think this example supports the position that dashes should be used in this context while Gene thinks that the dashed page name is a historical artifact related to the fact that the page name policy of MOSDASH was merged from elsewhere some time after that example was added.

Also, and I'm not sure how relevant this is, but in the specific case of Erdős–Straus conjecture the url already has percent-encoded unicodes, so hyphens aren't needed to keep the url pretty.
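For reference, the percent-encoding in question can be reproduced with Python's standard `urllib.parse.quote`. The helper `title_to_path` and its space-to-underscore step are assumptions that mimic how page titles appear in URLs. This is not MediaWiki's own code.

```python
from urllib.parse import quote

def title_to_path(title: str) -> str:
    """Percent-encode a page title the way it would appear in a URL path."""
    return quote(title.replace(" ", "_"))

# En dash version: both the ő and the dash are percent-encoded.
print(title_to_path("Erd\u0151s\u2013Straus conjecture"))
# Hyphen version: the ő is still percent-encoded, only the dash stays readable.
print(title_to_path("Erd\u0151s-Straus conjecture"))
```

Either way the ő already forces percent-encoding, so switching to a hyphen removes only one of the two encoded sequences.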

I'm happy to abide by whatever the consensus is on this issue, and I don't really want to turn this into an attempt to change policy, but I would like a broader sample of opinion on what the existing policy actually is. Anyone?

If anyone cares to see our existing discussion on the issue, it's here.

David Eppstein 05:55, 8 October 2006 (UTC)

Note that Wikipedia:Naming conventions main page now specifically defers to this MoS page for naming as well as use in articles, and the naming discussion on this MoS page is a recent addition here. I don't know where all the past discussions of this issue with respect to article names have taken place, but I do know that such discussions exist.
Erdős–Woods number has a redirect from Erdős-Woods number, because it is a remnant that was left behind when Daniel Brockman moved the page.[3] If use of dashes in this context in the article names is encouraged, there will be a great many of them created without having that redirect from the hyphen form, just as we now have a huge number of article names with diacritics in the names which do not have the redirects from the English-alphabet spellings.
Look at this list, then answer the following questions: Erdős-Woods, Erdős - Woods, Erdős‐Woods, Erdős‑Woods, Erdős‒Woods, Erdős–Woods, Erdős−Woods.
  1. How many different hyphens do you see as you view this page as you'd normally read it?
  2. How many different hyphens/dashes do you see when you view it in edit mode (normally displayed in a different font)?
  3. How many of the seven in the list above are true duplicates?
  4. How many other distinct possibilities could other editors come up with?
Now, take a look at some of those possibilities contained in links to possible article names/redirects:
As you can see, lots of missing redirects as I post this.
Even more important, as far as the article naming question goes, if a link is made on a page as [[Erdős&#150;Woods number]] (displays as [[Erdős&#150;Woods number]]) or as [[Erdős&#x96;Woods number]] (displays as [[Erdős&#x96;Woods number]]) or as [[Erdős&ndash;Woods number]] (displays as Erdős–Woods number), the latter has apparently been tweaked by the software to work (something we know to be true because when it gets there, it doesn't say "Redirected from"), but the first two remain redlinks as I type this, even though all three display as exactly the same en dash on the page, the same as if the character – were used in them (and we know from above the link works when that is used, and in fact that is the article rather than a redirect).
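A plausible explanation for the first two redlinks, offered as an illustration rather than a statement about MediaWiki internals: 150 decimal (0x96) is a control character in Unicode, and it only becomes an en dash when read as a Windows-1252 byte. The sketch below checks both facts with the Python standard library.

```python
import unicodedata

# Taken literally as a Unicode code point, 150 (0x96) is a C1 control character.
as_code_point = chr(0x96)
print(unicodedata.category(as_code_point))  # "Cc" means control character

# Taken as a byte in Windows code page 1252, 0x96 is the en dash.
as_cp1252 = bytes([0x96]).decode("cp1252")
print(f"U+{ord(as_cp1252):04X} {unicodedata.name(as_cp1252)}")
```

A browser that leniently displays the reference as an en dash hides the fact that the link target contains a different character from the article title.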
Additional missing redirects only peripherally related to the issue at hand in this discussion and not included in the list above: hyphen Erdös-Woods number, en dash Erdos–Woods number and Erdös-Woods number. Note that encouraging use of dashes in article names would often mean that several new redirects should also be created (yet, in most cases, will not be done by the person creating or moving the article). Gene Nygaard 13:39, 8 October 2006 (UTC)
Here is an additional problem related to the failure to make the situation clear, and a failure within this MoS project page to maintain a clear distinction to the naming-conventions role which has been passed here, and other uses. We can easily end up with copy-and-paste clone articles (which can then diverge from each other), something which has already happened at Bose–Einstein condensate (page history) and Bose-Einstein condensate (page history). Gene Nygaard 13:56, 8 October 2006 (UTC)
I feel we ought to keep ease for the reader as a top consideration, which would mean preferring hyphens (which exist on keyboards) to en-dashes (which generally don't). The name should be something the reader can easily type into the search box and find. Nareek 15:20, 8 October 2006 (UTC)
Just make the hyphen version a redirect to the en dash version. — Omegatron 16:16, 8 October 2006 (UTC)
As you can see, that doesn't get done. And it is often not "a" redirect that is needed, but many redirects. People who see an en-dash version are likely to write it with &#150; or other things that don't work in links, or use ‒ instead of –, which doesn't work, or use − instead of –, which also doesn't work, as well as using the hyphen/minus (-) instead of the en dash –. That's much less of a problem if we have a rule to use the hyphens in the article names, and dashes only in redirects. People using ‒ or − and getting a redlink are more likely to think of turning it blue by using a hyphen than they are to discover that – is the character they really want, rather than one of the others that is nearly identical to it.
Encouraging use of dashes in article names will result in more inadvertent creation of duplicate articles (oblivious to the existence of the other) than continuing to recommend hyphens in article names will. Gene Nygaard 17:31, 8 October 2006 (UTC)
This isn't any different than any other naming convention. Many have special characters. Just make it very clear that redirects are necessary. — Omegatron 11:59, 14 October 2006 (UTC)
All those different variations will exist regardless of which one the policy tells us to use as the pagename. That's why the policy also says to put in the redirects. But if the search box can't find one of those names when people type in a different one, isn't that a flaw in the search box rather than a flaw in the naming convention? —David Eppstein 16:36, 8 October 2006 (UTC)
Yep. — Omegatron 11:59, 14 October 2006 (UTC)
The different variations will exist; but it is more broken, more of a serious problem, for links using the characters we do have on our keyboards not to work, than it is for characters we do not have on our keyboards not to work.
Furthermore, we need to work with what we have. We want to make our information accessible, now. If you want to claim that new features should be added to our software, go right ahead, but that isn't particularly relevant to what we have now. Gene Nygaard 17:31, 8 October 2006 (UTC)
I don't know about yours, but the en-dash is definitely a character on my keyboard: option-minus. —David Eppstein 17:46, 8 October 2006 (UTC)
Even that isn't exactly "on" your keyboard, and is a feature unfamiliar to many users with computers like yours—but as you already know, the computers which have an "option" key to even make that possible are a minority. Gene Nygaard 18:25, 8 October 2006 (UTC)
Seems to me like a mediawiki problem. If the average user isn't going to know the difference between four or more different symbols that all look near-identical, why not just define them to be the same, like has already been done with the capitalization of the first letter? We'd need a way to solve conflicts that the initial change would cause, naturally, but in the end, I think we'd have a far more user-friendly system. Where there's an actual difference in meaning between two or more, we can use a disambiguation of some kind. -FunnyMan 01:49, 14 October 2006 (UTC)
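A sketch of what "define them to be the same" could mean in practice: fold every dash-like character to the ASCII hyphen-minus before looking a title up. The function `fold_title` and the choice of which characters to fold are hypothetical. MediaWiki had no such feature at the time of this discussion.

```python
# Hyphen, non-breaking hyphen, figure dash, en dash, em dash, minus sign.
DASH_LIKE = "\u2010\u2011\u2012\u2013\u2014\u2212"
_FOLD = str.maketrans({ch: "-" for ch in DASH_LIKE})

def fold_title(title: str) -> str:
    """Map every dash-like character in a title to the ASCII hyphen-minus."""
    return title.translate(_FOLD)

variants = [
    "Erd\u0151s-Woods number",
    "Erd\u0151s\u2010Woods number",
    "Erd\u0151s\u2013Woods number",
    "Erd\u0151s\u2212Woods number",
]
# All variants collapse to one lookup key.
print({fold_title(v) for v in variants})
```

The displayed title could still keep its typographically correct dash. Only the lookup key would be folded, much as the first letter's capitalization already is.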
You mean "Why doesn't someone finish the {{DISPLAYTITLE|}} magic word and set rules for when it is appropriate?"
Probably, yes. Like one of the editors above I use a Mac and, since they are all easy to type, I am moderately scrupulous/pedantic about using hyphens, en rules and em rules as I see fit at a given moment in time. However, while article titles should certainly be typographically correct in their canonical and display forms (the main ugliness is the near-universal use of ' when what is meant is ’), it seems to me to be pretty barmy that a-1, a–1 and a—1 could all be distinct articles. Come to that I would certainly also like to see automatic case-insensitive redirects. —Ian Spackman 12:20, 14 October 2006 (UTC)