Wikipedia talk:Plagiarism/Archive 6


Plagiarism OF Wikipedia

There have been instances in which Wikipedia itself was plagiarized (or copyviolated). Perhaps this should be mentioned? --Piotr Konieczny aka Prokonsul Piotrus| talk 16:33, 20 July 2009 (UTC)

This is a subject I have raised a couple of times and I see on Wikipedia talk:Reusing Wikipedia content that you have found that page. Have you also seen m:talk:Communications committee? I also placed a comment on Wikipedia talk:Citing sources/Archive 23#Reliable Source Quotes Wikipedia. I also placed a suggestion at Wikipedia:Village pump (policy)/Archive 50#Plagiarism of a wiki article:
Originally I thought to have a list, but then I decided that a better solution is a template for the talk page of the article that has been plagiarised with some arguments: the sources of the violation; the date it was retrieved/found; and a comment. The template should have in it a category so that it is easy to spot articles which have been plagiarised. This would have the advantage of both warning editors that such a page exists and keeping a central list (in a category) for further analysis. --PBS (talk) 16:31, 21 July 2009 (UTC)

Plagiarism accusations in practice

I just read this real-life argument about the suggestion of plagiarism in someone's article work: [1]. It includes some of the points we have discussed here; use of a source's structure, close paraphrasing, copyright vs. plagiarism, etc. I thought I'd just plonk it here as food for thought. Similarly, this article recently saw an edit war over the inclusion of a sentence which one editor asserted was plagiarism (he received a 48-h block for breaking 3RR in the end). Again, just food for thought; the more prominent the topic of plagiarism becomes, the more such disputes we will have, and the more important it is that this guideline will help editors resolve them. JN466 22:58, 9 July 2009 (UTC)

Looking at these cases, I am amazed at how poorly understood the concept of plagiarism is. This page should provide a clear definition of plagiarism with examples. Jayen's examples indicate the definition adopted from Webster's dictionary is being twisted, emboldening some editors to copy and paste from sources and claim it's fine - per Wikipedia policy - since they put a footnote on it. Shame. Tobit2 (talk) 05:28, 2 August 2009 (UTC)

Disputed tag

See previous Wikipedia talk:Plagiarism/Archive 4#Remove disputed tag from article page

It is now August 2nd, and there don't seem to have been any additional efforts to dispute the status in over a month. This tag isn't intended to remain indefinitely. Are there further issues that remain unaddressed? --Moonriddengirl (talk) 14:23, 2 August 2009 (UTC)

Now August 9th (marginally), and just your response. I've removed the tag. :) --Moonriddengirl (talk) 00:13, 9 August 2009 (UTC)
Good call, Moonriddengirl. Durova298 00:14, 9 August 2009 (UTC)

Suggestions for improvement

In my opinion, the article is no longer objectionable because it now permits the use of attribution templates. It still needs improvement. I have avoided editing the article lately because I did not want to step into a contentious area. However, I feel that we still need some improvements:

  • Too much emphasis on copyright: This is supposed to be about plagiarism. Methods of adhering to copyright law (paraphrase, limit the text, etc.) are in direct conflict with the concept of proper acknowledgement of the work of the original author. When it is legal to do so, it is far more honest to use the author's own words (when appropriate) and then attribute them properly.
  • Far too much emphasis on copyright in the lede. As above, but worse. The idea of plagiarism is lost in the copyright discussion in the lede.
  • Fixing plagiarism: the only cure is also the easiest cure and the best cure: attribute the work that was copied. If you attribute it properly, it's not plagiarized. For purposes of curing plagiarism, there is no need to remove it, or contact the editor, or reword it, or anything else. Just attribute it. (Contact the original editor to remind them not to do it again.) If the original work is copyrighted, then refer to the copyright policy. If the text needs improvement, then improve it, but this has nothing to do with plagiarism and you must still attribute the result.
  • I recommend a new template: {{copied}}. Add this template just prior to the </ref> to generate the following text: "(some content copied from this source)". This template should be used when a ref has the correct implicit scope (i.e., a casual reader is likely to think the placement of the ref covers the text that has been copied.) This should be used for material not in blockquotes or quotations that is not so large that an attribution template is appropriate.

If there is any consensus about any of this, I can try my hand at a copyedit. -Arch dude (talk) 16:03, 2 August 2009 (UTC)

I think that the copied template is a bad idea, because as soon as a word is changed then it is no longer a copy, but that does not mean that the original source should not be attributed. As we already have lots of Attribution templates I don't see the need for such a template.
Those templates are good for the whole article, not for smaller scope. The proposed "copied" template is intended for when we use content. Note the wording: "some content." "some" means we can modify it. "Content" means more than merely text. -Arch dude (talk) 18:12, 2 August 2009 (UTC)
I am not sure that your concept of black and white between copyright violation and plagiarism is clear cut. For example it is possible to paraphrase something in such a way that it is not a copyright violation but it is still so close to the original words that it is plagiarism. Just adding a citation to the words does not fix that. --PBS (talk) 17:27, 2 August 2009 (UTC)
That's what I said (or tried to say.) A citation is insufficient, so the "copied" template adds a sentence to the citation. That sentence says "some content copied..." -Arch dude (talk) 18:12, 2 August 2009 (UTC)
No, a close paraphrase is just as much a copyright violation as an exact copy is. It is plagiarism if, and only if, you claim someone else's work as your own. Properly attributed work can never be plagiarism (although it could still be a copyvio). --RexxS (talk) 17:49, 2 August 2009 (UTC)
I think we may be agreeing. There is a point at which it is no longer a paraphrase which is a copyright violation, but it could still be plagiarism. The question of what constitutes proper attribution is the issue in such a case. --PBS (talk) 18:17, 2 August 2009 (UTC)
Properly attributed work, however, means much more than a citation. If an editor copies text verbatim from a referenced source or closely paraphrases it, he must include quotation marks or directly indicate that the work belongs to someone else. The article should state this clearly. Tobit2 (talk) 17:58, 2 August 2009 (UTC)
There is no need to quote text if the text comes from a work under which there is no such copyright requirement. That is what the sub-sections in "Attributing text copied from other sources" explain in detail (and something that has been discussed in detail several times over the last month or two on this talk page). However, in many cases I agree with you because otherwise it falls foul of the provision of WP:OR -- PBS (talk) 18:17, 2 August 2009 (UTC)
For works not under copyright...as you say, there is no requirement, no law, no governing body that says you must indicate the source or authorship. However, it is a convention. Tolstoy is not under copyright. However, most would consider copying War and Peace and passing it off as one's own rather improper. So long as quotation marks are used or some other indication of authorship - such as one of the attribution templates - Wikipedia is on solid ground. The problem I see is that while plagiarism is a simple matter, Wikipedia's policy/guideline on the subject has so much detail it is confusing. I suggest that the policy/guideline include a "rules of thumb" section, a few simple guidelines to help most editors avoid and spot plagiarism. I'd be happy to draft one. Tobit2 (talk) 19:13, 2 August 2009 (UTC)
The guideline is simple: If it is copied, then it must be attributed. Period. Why is this hard to understand? This simple concept is the heart of the plagiarism policy. Everything else is, or should be, explanations of how to attribute copied material, with examples. The interactions with the copyright policy are completely secondary and should be treated as such. -Arch dude (talk) 22:51, 2 August 2009 (UTC)
Simple but wrong. I assure you, if you were to copy work from copyrighted material - verbatim - and then put a footnote at the end, you would very likely be sued. Here is an example of plagiarism guidelines from Boston University's Law School:
All written work, whether in preliminary or final form, submitted by a student in the course of law study, in the course of employment, or in the course of other activities, including but not limited to moot court and law journal work, whether or not related to the study or profession of law, is assumed to be the student’s own work. Anything copied or paraphrased from another author or source must be appropriately identified, acknowledged, and attributed. The use of the exact language of another without identification as a direct quotation by quotation marks or otherwise is plagiarism even though the source is cited in the student’s work. Violation of the rules stated in this paragraph may be subject to disciplinary action, including suspension or expulsion. Use of the work of another without proper attribution constitutes plagiarism whether or not the writer acts with an intent to mislead or deceive. However, such intent, or lack of it, may be considered in determining the proper sanction if a violation is established.
Indeed that is a simple and clear policy. Tobit2 (talk) 23:30, 2 August 2009 (UTC)
How does this differ from what I said? (i.e., If it is copied, then it must be attributed.) Where does this mention copyright? NOTE: I have stated repeatedly that paraphrasing is still copying, so "paraphrased" is still "copied." -Arch dude (talk) 23:47, 2 August 2009 (UTC)
Here we see how this guideline gets all confused up. It's not about copyvio, and it's not necessarily about academic definitions of plagiarism (since they tend to read more toward expecting independent thought, which we don't do in articles). Indeed, if it is copied, then it must be attributed. I like the idea of the {{copied}} template, since just a footnote to the source text fails to indicate exact or close to exact copying of the actual words. But I like the idea of a "Rules of thumb" section too, a sort of cut-the-crap short form section for those readatudinally-challenged. It is a pretty long and meandering guideline... Franamax (talk) 02:17, 3 August 2009 (UTC)
I don't think that the copied template is desirable, as there comes a point at which it is no longer a copy, and it is a judgment as to when that takes place. I think the wording from {{1911}} "incorporates text" is better. Further, the suggested template with an exact copy of the original text assumes that someone is cutting and pasting text from one place to another and not just copying parts of the text and modernizing the text as they see fit. I am in the process of creating a list of Cromwell's upper house. I am taking it from Noble pp. 371-427; I am only taking the names and the words in italics. I am not copying the biographies and I am not leaving the spelling of names or the use of "f" for "s" as it was in 1787! Even if I cut it down to just the parts of the list I am using I am going to double the size of the table to an article which should not be over 32k in size, for no discernible advantage to the reader of the article. When I have finished I will mark the table with an attribution to Noble along the lines of {{1911}}--PBS (talk) 11:08, 4 August 2009 (UTC)
Sorry, I messed up. There is already a template called {{copied}}, and it has a different use. Allow me to rename my proposed template to {{attribute_ref}}. When you use only facts from a reference, you are free to cite it instead of attributing it, and the {{attribute_ref}} template is unnecessary. I think that this covers your example. If your content is a substantial percentage of the new article or if it pervades the article instead of being restricted to a single section or smaller scope, then an attribution template is more appropriate, because an inline reference is implicitly limited to a smaller scope than the entire article. As I said in my original proposal, the {{attribute_ref}} template is to be used when the copied content is within the implicit scope of an inline reference. Also as I said earlier, my proposed wording uses the terms "some content..." so later modification is not precluded. I am more than happy with a modification to the wording. Would you like to propose some wording? How about "some content incorporated from this source"? As to your method of editing: We are all peers, and we are free to use whatever methods work, within the overall guidelines. This is wonderful, and it lets us move toward our true goal, which is to build the best encyclopedia we can. I would like to thank you for your efforts in general and for the Cromwell's upper house article, and there is nothing wrong with your approach. I would have done it differently, but the result would have been the same (at least it would if I were as careful as you are.) I would have copied the entire content of the original to my computer and created a file with the bare minimum of wiki markup, together with an attribution. I would have then copied this to another file on my computer and added markup to bring the article to the bare minimum acceptable level.
I would then add the first file to the article with one edit, together with an edit summary "original text to preserve attribution", and then immediately replace that edit with the second edit, with a summary "preliminary wikification." Further work can then be done in the normal manner. Again, my method works for me, and is not the only method or the best method. -Arch dude (talk) 12:16, 4 August 2009 (UTC)

A bunch of questions

The first sentence says,

Now, I'm having trouble understanding how this relates to our policy. According to the table of contents of Diana Hacker's book, she's expounding the MLA's view on plagiarism; she's talking to academics, i.e. individuals seeking to build a reputation for originality. By contrast, our own article on plagiarism opens with the definition from the Random House Compact Unabridged Dictionary: "use or close imitation of the language and thoughts of another author and the representation of them as one's own original work". (emphasis added.) This definition suggests that WP:OR is enough to prevent plagiarism. Suppose you reproduce, but do not credit, a public domain source that contains copious footnotes. By including the footnotes, haven't you complied with WP:OR? Then haven't you also not plagiarized?

When I look at Wikipedia:Plagiarism#Why_plagiarism_is_a_problem, "original research" is the second reason; the first reason is "citing sources"; the third reason discusses "improperly copied content" without describing what makes some copying improper; the fourth reason is about copyright. Only the fifth reason sticks to the domain of plagiarism -- i.e. our impact on "Subject matter experts" -- by which I assume we mean university professors. So does the motivation for our plagiarism policy come down to "We don't want to offend university professors?"

I'd also like some clarification: when plagiarism occurs on Wikipedia, who is doing the plagiarizing -- WP, or the editor? If WP does not accept original work, how can it be accused of plagiarizing? The plagiarizer is the individual editor, and I would assume that's his problem -- between him and his conscience. The "nutshell" summary says, "Don't make the work of others look like your own; give credit where it's due."

Now, my impression from plagiarism is that plagiarism isn't about denying someone the credit she deserves -- it's about receiving credit that YOU don't deserve. ("plagiarism is concerned with the unearned increment to the plagiarizing author's reputation that is achieved through false claims of authorship.") So couldn't we solve this problem by creating a Userbox, "This user plagiarizes, and should not be congratulated for writing high-quality content"?

Sorry this post was disorganized. It's almost sunrise and I'm fighting insomnia. Andrew Gradman talk/WP:Hornbook 07:46, 14 August 2009 (UTC)

I would like to show you an example of OR in a work that I copied from a copyright-expired text that I fixed recently. There was an article called Murrough O'Brien, 1st Earl of Inchiquin which I deleted because it carried an incompatible copyleft licence. Rather than leave a red link I replaced it with a copy of Lee, Sidney (1903), "Dictionary of National Biography Index and Epitome" (an OCR version of which can be found at User:Magnus Manske/Dictionary of National Biography). In the text that I copied was the sentence "He outwitted the Irish leader, Donough MacCarty, 2nd Viscount Muskerry, at the battles of Cappoquin and Lismore". An editor asked for a citation on that specific sentence (probably not realising that the whole paragraph was cited), which was fair enough as it expresses an opinion (and as such could be seen as OR), so I fixed it by adding a specific attribution to the sentence: "Sidney Lee states that". --PBS (talk) 10:00, 14 August 2009 (UTC)
I don't believe those reasons are listed in priority order. The Wikipedia community knows that it is the contributor who has plagiarized; the world at large does not think of Wikipedia as a collection of individuals, but as a single entity. (This is something I've been called on to clarify quite a few times via OTRS.) In my own personal ranking, the fifth reason would be somewhat higher, as I prefer not to see headlines like this. Personally, I'd like to help create a highly reputable freely available resource, and to do that I think we need to conform to the standards of the larger community, which means outing and rectifying plagiarism where it occurs. I know that some of the contributors to this talk page hold very different opinions. Some see plagiarism as a moral offense akin to, say, making off with your neighbor's car, while others see it as something that should not concern Wikipedia at all. The RFC conducted to promote this to guideline found the majority of responders believed it was a level of concern that required guideline. --Moonriddengirl (talk) 10:44, 14 August 2009 (UTC)

Andrew, I agree completely. Our plagiarism article is completely off base. Plagiarism is two separate offences: 1) taking credit, and 2) failing to give credit. Offence number one is of central concern in an academic and professional context and is very serious, but is essentially irrelevant to Wikipedia. For us, the problem is offence number two. While I feel it is still a serious ethical breach, it is much less severe than falsely taking credit. We need to justify our anti-plagiarism policy on our own terms, and not in terms of academic plagiarism. -Arch dude (talk) 11:37, 14 August 2009 (UTC)

The context behind those questions: I want to aggregate good public domain sources

I co-created a template that aggregates public domain sources. Some editors are concerned that it could empower people to plagiarize. I'd like your feedback for how this template can be improved to 1) better accomplish its purposes and 2) not get attacked for violating our plagiarism policy. Also, I'm looking for someone who would be willing to work with me on implementing this feedback, because I don't know how to program -- I had it commissioned by hitch-hiking other people's programming skills.

Here's an awkward summary of the template:

  • We need some method of encouraging editors to engage in a "source gather", a necessary step in scholarly research that is not adequately supported by ==External Links== or ==Further reading==. This could be accomplished by a template that permits us to gather, annotate, and otherwise organize sources that have not yet been integrated into the text. It would consist of a list of sources and would appear on the talk page of articles. Checkboxes would indicate whether each source is public domain; NPOV; well-footnoted; not-outdated; and available in html. If the template contains a source that falls into all five of these categories (CRS Reports are of this character), it should transclude the article to a category like this one.
  • {{refideas}} illustrates these functions in a rudimentary way, but it needs a major overhaul. People have suggested incorporating features from {{expand further}}, {{findsourcesnotice}}, and {{findsources3}}, and to make the template collapsible when it contains 5 sources.

Some editors are concerned that the "category" function could empower people who engage in plagiarism. One editor wrote, "I have a real problem with this... it sounds like we are encouraging lazy editors to go out and cut and paste material from free sources into our articles...We want to encourage editors to actually read other sources and summarize what they say." I responded: "I definitely agree that [our editors] should read the source, but why should they then take the time to paraphrase it? That sounds like a waste of time. If the statement in the original is better, they should use the statement in the original. Especially when the original source was written by a professional, full-time, paid, scholarly author". Another editor agreed: "This isn't about editors being "lazy" - this is a rapid way to develop initial content that would take much longer for human volunteer labour to produce."
I recognize that plagiarism is a concern, which is why I'm posting here. Clearly, the page should be scrutinized and supervised, and should include a detailed banner detailing Wikipedia's policies on POV, citation, copyright, plagiarism, as well as encouraging the use of citation templates. What else? Andrew Gradman talk/WP:Hornbook 15:52, 14 August 2009 (UTC)

My two cents on this: 1) the best articles are indeed made by reading and summarizing multiple sources rather than just copying one; 2) however, direct copying of free text has been done in the past and is permissible, providing it is attributed. If you are creating 120 articles on all the species of five-spined stickleback from a single well-written source, there's not necessarily much the wiki-editor can add; 3) when large-scale unquoted copying is done, a template or note similar to one in Category:Attribution templates should be used; 4) I think we're still up in the air on how to properly attribute one or a few directly copied sentences, see Arch Dude's thread just above about using a ref template to indicate small-scale copying; 5) my preference is to add free text verbatim in one single edit so that it is clear what exactly was copied and note that clearly in the edit summary and on the talk page, but that's just me. Franamax (talk) 20:12, 14 August 2009 (UTC)
1) Not necessarily true. Several articles copied over from Citizendium are extremely high quality. Also there are some cases, admittedly rare, where old public domain works are still regarded as definitive references on a given subject. Kaldari (talk) 21:37, 14 August 2009 (UTC)

Why plagiarism is a problem

This reason doesn't make sense to me:

  • The correction of improperly copied content disrupts the encyclopedia and may require the deletion of all subsequent edits to the article.

How would correcting content disrupt the encyclopedia? Is this actually referring to deleting copyvio content? Kaldari (talk) 17:07, 14 August 2009 (UTC)

I think that's a good diagnosis. The sentence doesn't clarify what constitutes "improperly copied", so it's hard to know what sort of "improprieties" merit "deletion of all subsequent edits." But I can't imagine why this would apply to anything other than copyvio. Andrew Gradman talk/WP:Hornbook 21:42, 14 August 2009 (UTC)
If this is solely a copyvio problem, why are we listing it here? I think I'll remove it unless someone knows why it would be related to plagiarism in general. Kaldari (talk) 22:10, 14 August 2009 (UTC)

People are not getting the message

I just discovered numerous instances of material lifted verbatim from other sources without quotation marks, by an editor who seems to make a habit of it. I found text lifted from professional journals, and even in one instance from a college student's term paper! (And yes, I checked to make sure the cribbing wasn't in the other direction: the material had been uploaded to a "get your free research paper here" site before the article in question was "written".) The editor, when I called him on another instance (relating to obvious, non-neutral promotional material) claimed it was just A-OK to quote without quotation marks, as long as he had a footnote to the paragraph containing the text. (Not all of his "contributions" have such footnotes, though.) The funniest instance of this is when he copied a scholar's autobiographical information without even changing the possessive adjective "my" to "his", referring to "my recent work". I'm not talking about an inexperienced editor; he's prodigiously prolific, but there's too much copying and pasting going on. The issue for me is not legalities; the issue is confusion and fairness. Habits like that can also result in jarring changes in tone, and non-neutral content getting slipped in, without readers knowing to whom to attribute the point of view. I think this project page needs to emphasize more strongly that footnoting information, broadly displayed in a paragraph, is not the same as properly attributing and marking verbatim text. This should apply equally to all work, whether under legal copyright or not. 72.229.55.73 (talk) 05:24, 6 October 2009 (UTC)

The anon is referring to me, and this Talk:Systems psychology discussion just started a few hours ago. -- Marcel Douwe Dekker (talk) 06:52, 6 October 2009 (UTC)

Don't drag it in here, man. I cited examples I found in your articles, merely to illustrate a point, not about you (stop taking stuff so personally) but about the subject of this project page: I feel that people are editing stuff in ignorance of the simple idea that quotations need to be marked as such. The fact that you simply don't realize that there's a problem is supporting my point. Anyway, my fault, re-reading the project page, I see that, really, it's as clear as can be. You just don't care about the standard (which is not a question of Wikipedia rules, but one of ordinary, common-sense practice in regard to incorporating other people's work.) —Preceding unsigned comment added by 72.229.55.73 (talk) 07:42, 6 October 2009 (UTC)

I am merely trying to seek it out indeed. As I just was saying here. I have had a hell of a discussion on this item here, and I really would like to know what I am dealing with here.
I would really appreciate it if you (or anybody else) can create clarity here. I do, however, oppose any attitude claiming everything is and has been clear to begin with, and all of Wikipedia has been wrong for all those years. There is really a strong argument for not using quotation marks, because other editors don't accept them and start changing the text anyway. This, for a long time, I considered to be even worse. But nowadays I try to use them anyway. If I forgot or was lazy here and there in my article, please inform me and I will see what I can do.
And... making remarks like "You just don't care about the standard" is considered a personal attack and gives me the right to add another tag on your talkpage, which I will do, for you to get the drill here. -- Marcel Douwe Dekker (talk) 07:56, 6 October 2009 (UTC)

On Talk:Systems psychology the anon automatically assumed and stated (see here):

I'm removing the paragraph, since it's just plagiarised from a web page promoting original research.

Now on Wikipedia there is a common phrase, see for example here:

If the source is cited, it is not plagarism. There may be other problems, but not "plagarism".

I still have a hard time making sense of this contrast. -- Marcel Douwe Dekker (talk) 09:14, 6 October 2009 (UTC)

The issue is not simply sources, it's exact wording. Material can be considered plagiarism (or at least a copyright violation, if that applies), even if it's not a direct, verbatim quote, if no source is cited. But that doesn't mean that citing a source makes it permissible to use someone else's exact text at substantial length, without indicating in any way where that person's text ends, and yours begins. 72.229.55.73 (talk) 10:12, 6 October 2009 (UTC)
This has been a bit of a bone of contention here. Some editors here opine as Smokey Joe does in the comment you link, that a simple reference link suffices. Others (which include myself) are of the opinion that if you directly copy any substantive text, you have an obligation to make clear that it is not your own writing. I personally favour the latter because I don't want anyone to think I've written text which in fact I have not.
Leaving aside the copyright issue, it would be helpful to have clarity on the acceptability of directly copying unquoted text. Is it OK to do so as long as an attribution footnote follows? (I say no, but that's just me) What other methods of attribution are needed? Obviously there are the attribution templates for large-scale copying, but what about when it's a sentence or two? There's a suggestion somewhere above here to modify the {{cite}} templates to allow indication of direct copying, other than that, I'm not sure how to bridge the gap. Franamax (talk) 09:58, 6 October 2009 (UTC)

It's not just you. The page this page is attached to makes it plain: use quote marks or block indents on quoted material. Same standard used in the world at large, I always thought. (We have a simultaneous edit happening here, Franamax. The following is directed to Marcel Douwe Dekker, not at you.)

Marcel, Did you even read the page that this talk page is attached to? "If the external work is under standard copyright, then duplicating its text with little, or no, alteration into a Wikipedia article is usually a copyright violation, unless duplication is limited and clearly indicated in the article by quotation marks, or some other acceptable method (such as block quotations)." (Emphasis added)

Note how I quoted (that is, put quotation marks around) the material I just cited there. That's what you didn't do. And what you have done multiple times, and which you insist is acceptable. It isn't. I can't believe you've been operating under such a delusion for so long. Wikipedia, as an institution, has itself outed journalists and others who have quoted verbatim from articles while making it seem as if the text were their own. What you're doing is no different. As the article makes plain (and again, it's not just Wikipedia, it's normal, common-sense practice), you can't just shovel in someone else's text, without even using quotation marks, and then justify it simply because you put a footnote in. All the footnote tells the reader is where you got your information; it is not meant to indicate "just assume that anything you're reading in that paragraph was probably not written by the Wikipedia editor(s) who purport to have written it." 72.229.55.73 (talk) 10:03, 6 October 2009 (UTC)

Sorry guys. This is too much. I would like to keep it simple. There seem to be three ways (for short text):
  1. Just adding the ref-tag
  2. Use quotation marks and ref-tag
  3. Make a separate layout with large quotation marks and references.
Franamax leads me to believe all three are used and favoured by different users. Please correct me if I am wrong. -- Marcel Douwe Dekker (talk) 10:23, 6 October 2009 (UTC)
I have to hedge that a bit. Your method #1 has occurred in the past, and I run across it often (since it's often pretty easy to find direct text copies from web sources) but I believe it's strongly discouraged and nothing in the current guideline supports it. I'll wait for other voices to confirm that though.
And another technique is to simply indicate somewhere that you have made a direct copy of text. One way would be in the "References" section, though I'm not positive on how it could be done just now, possibly by using the "quote=" field in cite templates. The edit summary or on the talk page are other, less satisfactory ways to do it. It does seem to be a bit of a hole in the procedures right now. Ideas and action are welcome! Franamax (talk) 11:07, 6 October 2009 (UTC)
The problem is that the complainant has not identified the passage(s) concerned, neither here nor at Talk:Systems psychology. If it's a significant chunk, quotes or even attribution might be needed. If it's just phrases that have become part of the language of the field, 1 or 2 citations to show that it's common currency will be enough (see Wikipedia:Plagiarism#What_is_not_plagiarism).
For now I'd reinstate the text in the article and then post to Talk:Systems psychology a request for identification of the passage(s) concerned, plus citations to the work(s) allegedly plagiarised. --Philcha (talk) 11:23, 6 October 2009 (UTC)

I will list specific examples here later today. I will not edit any of the pages in question myself though, because the person doing the "non quoted" quoting would just revert them. And we're talking about potentially an awful lot of articles, in a field in which I have no expertise. As you can see from other comments here though, the editor whose actions I am taking issue with does not recognize that anything he's doing constitutes plagiarism. He specifically doesn't see the need to put long quotations in quotation marks. I'll post examples later. 72.229.55.73 (talk) 11:33, 6 October 2009 (UTC)

Thank you for your lucid explication of the issue, Franamax. I would maintain that even if the rules on plagiarism here weren't already explicit. (I mean, am I wrong? Is the paragraph I quoted in this project's page just intended to be ignored? It states explicitly that one must use quotation marks, or block indenting.) And if "Just adding the ref-tag" were considered acceptable (which again, it's not, if the documentation is to be believed), then that would put Wikipedia in a universe all its own, because really, where in encyclopedic, journalistic, academic or scholarly writing is such carbon copying of whole paragraphs of prose, without it being indicated as such, permitted? (People arguing for such permissiveness, please cite some examples other than "people disagree about this point in Wikipedia." Please show me any reputable reference work that addresses the issue and supports such a notion.)

And why in the world would anyone even want to do it to begin with? I can at least understand the motive for the reverse problem: journalists at least are getting paid for stuff they lift verbatim from Wikipedia, so I guess they think it's worth the risk to their credibility; why would volunteers undermine the credibility of Wikipedia by doing the same thing?

I agree, Franamax, everyone should keep it simple; when quoting, show that you're quoting. It's always been simple. 72.229.55.73 (talk) 11:34, 6 October 2009 (UTC)

Mmmm...!? With his remark "As you can see from other comments here though, the editor whose actions I am taking issue with does not recognize that anything he's doing constitutes plagiarism. He specifically doesn't see the need to put long quotations in quotation marks." is 72.229.55.73 talking about me...!? Is he really talking about me...?? -- Marcel Douwe Dekker (talk) 11:36, 6 October 2009 (UTC)
72.229.55.73, the initial burden of proof is on the person who alleges plagiarism. You're wasting your time and everyone else's if you do not identify specific passages you think are plagiarised, with citations to the relevant pages of the allegedly plagiarised works. --Philcha (talk) 11:47, 6 October 2009 (UTC)

←Well, a google search here of the text removed is somewhat informative. The article's "It holds the promise of integrating mind-body-spirit in a rigorous and coherent framework" is almost verbatim from the cited source: "Process Psychology holds the promise of integrating mind-body-spirit in a rigorous and coherent framework." I tend to think that "holds the promise of integrating mind-body-spirit in a rigorous and coherent framework" is protected expression, as it doesn't seem stock or uncreative. The passage that begins "Drawing from the depths of..." is taken directly from Mary Elizabeth Moore's review there, which is also creative expression governed by copyright. This does seem to be a plagiarism issue, as well as a problem under WP:NFC, which requires that copyrighted text incorporated into articles be clearly marked. --Moonriddengirl (talk) 12:49, 6 October 2009 (UTC)

Yes, that section is removed... and stays removed (if it is up to me). I didn't do a good job here 1.5 years ago. I agree that the removed section is badly sourced, and needs improvement. And it could be considered plagiarism. The one thing I am interested in is where the badly sourced ends and the plagiarism begins...?? -- Marcel Douwe Dekker (talk) 12:58, 6 October 2009 (UTC)

I think there is a misunderstanding here. If text is under copyright then Wikipedia editors must use quotes which is what this guideline says: "If the external work is under standard copyright, then duplicating its text with little, or no, alteration into a Wikipedia article is usually a copyright violation, unless duplication is limited and clearly indicated in the article by quotation marks, or some other acceptable method (such as block quotations)."

Where this guideline indicates that other methods such as attribution are acceptable is for non copyright material. In other words for all copied text from all copyright sources, then quotes must be used. The reason why we have the hedge in the guideline is because very occasionally we have copyright material where the author has given us permission to use it, see for example the Richard Lindon article and OTRS on Talk:Richard Lindon, and there are also some other very narrow criteria such as lists, where in practical terms there is no avoiding copying the structure and words of a text and where quotes are not needed, but I think we can put those to one side for this conversation. -- PBS (talk) 12:59, 6 October 2009 (UTC)

Mmm... Philcha seems to make the one exception here that "if it's just phrases that have become part of the language of the field, 1 or 2 citations to show that it's common currency will be enough.."... !? -- Marcel Douwe Dekker (talk) 13:02, 6 October 2009 (UTC)
Adding to what PBS said, plagiarism and copyright problems may co-exist or may exist independently. With those particular passages, we are dealing with both plagiarism (the source is indicated, but there is no indication that the words are verbatim) and a problem under copyright policy, as the language is very likely non-free and has been replicated directly without quotation marks. Copyright is the more urgent consideration here, but even if the material were free, copying the text without indicating that it is duplicated is (under current plagiarism guideline) a problem. Since it is likely copyrighted, as PBS says, you can only use brief quotations of the text in accordance with WP:NFC, which sets out when such quotations may be appropriate and how they must be marked. In terms of plagiarism, citation is typically understood to mean that you have gotten your facts from a particular source, but without quotation marks or block quote or some other indicator they are not generally understood to mean that you've gotten your language from the source.
Non-creative text is not protected, which includes phrases that have become part of the language of the field, but unless a phrase truly is common coin, most speech is amply creative enough to merit protection under US copyright laws. --Moonriddengirl (talk) 13:09, 6 October 2009 (UTC)
Ok, this is quite clear, thank you (all). -- Marcel Douwe Dekker (talk) 13:21, 6 October 2009 (UTC)

Ditto! 208.105.23.6 (talk) 16:15, 6 October 2009 (UTC) (different ip, same "anon" as above) Thanks for the thoughtful and clear contributions by all. 72.229.55.73 (talk) 23:29, 7 October 2009 (UTC)

Proof of burden

Hi, could somebody take a look at the continuing discussion we started here on the Talk:Project Management Institute. I am rather confused there. In short:

PBS seems to have seriously urged me to remove the history section. I asked him what was wrong; he said I have to look for myself. Now he seems to be telling me that if I feel it is OK, I should restore the page, but if I haven't solved the copyright problems, he states "if you repeatedly add wording that is a copy violation, your account will be blocked".

This seems to be the other way around. I don't have a clue what kind of copyright problems he is talking about. I have asked him three times to state the exact problems... But I don't get any answers. What am I doing wrong here?

I am under the impression that a proof of burden is a simple listing from the other editor of what he thinks specifically is wrong.. with the text?? -- Marcel Douwe Dekker (talk) 11:39, 7 October 2009 (UTC)

Please do not confuse plagiarism with copyright. This guideline has nothing to do with Wikipedia's policy on copyright. I have evaluated the text and identified additional material copied from non-free sources at that talk page. --Moonriddengirl (talk) 11:47, 7 October 2009 (UTC)
Ok, thank you. This is the feedback I wanted. I understand now; I will study this some more first, and get back on how to solve it...!? I would appreciate further input later on. -- Marcel Douwe Dekker (talk) 11:57, 7 October 2009 (UTC)

@Moonriddengirl: would you clarify this area for me, please -- If text has been plagiarized from copyrighted material, isn't it, ipso facto, a copyright violation? I don't see how representing someone else's words as one's own could not be also a copyright violation, if such text were non-free. And, in your opinion, if the copied text (I am talking about a substantial sequence of verbatim text that has not been marked with quotation marks) is for some reason not considered plagiarism, wouldn't such a verbatim, unmarked, undelimited passage nevertheless lead a reasonable person to suspect that a copyright violation might have occurred? Would you advise any editor to add such unmarked quotations, or would you not caution that such a practice might predictably result in frequent violations of copyright? Thanks. If you prefer to reply on policy on copyright or another page, it's all good. Thanks. 72.229.55.73 (talk) 23:53, 7 October 2009 (UTC)(sorry, forgot to log-in: this is my user page: Bacrito (talk) 23:56, 7 October 2009 (UTC))

My point here is that this is not the place to discuss copyright problems or how they are handled. This is the guideline on plagiarism; copyright concerns are governed by policies which set out procedures elsewhere. The plagiarism guideline is a matter of consensus—like other guidelines on Wikipedia. As Wikipedia:Policies and guidelines indicates, policy trumps. While it might be permissible to leave text in place while the person bringing it up musters evidence of plagiarism (if consensus so dictates), text under copyright investigation is, by contrast, removed or blanked while the investigation is under way.
That said, no, plagiarism and copyright violation do not always go hand in hand. One might, for example, paraphrase a passage to the point that it did not constitute substantial similarity to the original passage—thus representing no copyright violation—while still creating a plagiarism issue if the material is improperly attributed. Complicating matters further, even with literal duplication, whether or not "copyright violation" has occurred is determined only by a court of law. Even if copying is substantial, if it clears fair use a court of law might determine that copyright was not violated. Even so, such material would be plagiarized if it was not attributed.
Even given that, I have no idea what I might have said in this passage that leads you to wonder if I would advise any editor to add unmarked quotations, particularly given that I reference WP:NFC above. Wikipedia's policy on the use of copyrighted text is pretty clear: it may be briefly quoted under certain circumstances if properly attributed. Wikipedia has wisely side-stepped the gray areas of copyright law and adopted a relatively straightforward handling of protected text. --Moonriddengirl (talk) 01:19, 8 October 2009 (UTC)
Forgive me, my questions were intended as questions, I did not mean to imply that I thought you would advise an editor to add unmarked quotations, I simply asked your opinion on the matter. I and others have repeatedly insisted that they shouldn't do so. Sorry if I inadvertently annoyed you. Perhaps through exhaustion from the back and forth with Marcel Douwe Dekker, I felt the need to ask some perhaps blatantly unnecessary questions. Apologies for taking your time. Bacrito (talk) 01:54, 8 October 2009 (UTC)
I'm sorry if I came across as annoyed. :) I was not; just very puzzled. --Moonriddengirl (talk) 11:01, 8 October 2009 (UTC)

A similarity possibility

I think that Plagiarism is similar to Wikipedia:copy-paste as both of them have the function of "Taking stuff from other websites". Do you agree? If not, could you explain why they are different to each other please? I don't understand. Minimac94 (talk) 07:03, 23 December 2009 (UTC)

Plagiarism is a guideline. Copy-paste is an essay. Other than that, the only substantial difference is the content. :) Wikipedia:Copy-paste is primarily meant to be a user-friendly suggestion of how to comply with Wikipedia:Copyrights. This guideline offers information about what does and does not constitute plagiarism, how to attribute so that copying is not plagiarism when the source is not copyrighted, how to recognize plagiarism and what to do about it. --Moonriddengirl (talk) 11:27, 23 December 2009 (UTC)

I fought WP & WP won

(begun & copied from here) I've a question about plagiarism on WP that's bugged me for quite awhile. Nobody seems really troubled, but IMO it needs addressing. It appears many pages derived from DANFS (in particular, from what I've noticed, all the submarine pages), are verbatim copies. This is being defended as OK because they're not copyright. Except this suggests (& I agree) it's still plagiarism of somebody else's intellectual effort. Am I wrong? Maybe more important, can anything be done if I'm not? (BTW, I've added material from other sources where I encounter the pages, as much to correct DANFS POV & error; still...) TREKphiler any time you're ready, Uhura 01:52, 18 January 2010 (UTC)

Wikipedia:Plagiarism covers this issue. --Tagishsimon (talk) 02:05, 18 January 2010 (UTC)
Attribution of quoted text is what makes it not plagiarism. There is a specific attribution template for this purpose, {{DANFS}}. LeadSongDog come howl 03:12, 18 January 2010 (UTC)
Attribution when it's verbatim? Forgive me if I think that's a weasel. TREKphiler any time you're ready, Uhura 08:07, 18 January 2010 (UTC)
Did you read the template: "This article includes text from the public domain Dictionary of American Naval Fighting Ships." What part of that do you find ambiguous? --Tagishsimon (talk) 10:04, 18 January 2010 (UTC)
Hi. :) Coming late to this party, but, yes, current consensus is that plagiarism is not an issue when attribution is provided, as through one of those templates or other acceptable means. Since plagiarism is considered a moral issue and is not a legal one, Wikipedians are pretty much free to determine by group consensus what constitutes plagiarism and how it should be handled. I know from prior involvements in conversations that there are some who think any copied text should be in quotation marks or block quote. I also know that there are some who think that as an encyclopedia we are exempt from questions of plagiarism altogether, since nobody supposes this material to be original. (See, for example, the talk page of Wikipedia:Wikipedia Signpost/2009-04-13/Dispatches.) It's always possible that consensus will change, and, of course, you can make your voice heard on the matter at WP:VPP or at Wikipedia talk:Plagiarism. :) --Moonriddengirl (talk) 13:15, 18 January 2010 (UTC)
Actually, the answer was linked from the question. Wikipedia:Close paraphrasing#When_is_it_a_problem? is explicit that "using another's words as one's own is considered plagiarism" (my bold). Attribution makes it clear that the words are not one's own. LeadSongDog come howl 18:37, 18 January 2010 (UTC)
Perhaps I'm being unclear. First, it's not the ambiguity that troubles me, it's the dishonesty. "includes text"? It's copied entire, in the cases I've seen. Second, "as one's own"? Somebody posted the page, copied from DANFS, without making clear the entire page was copied verbatim. Maybe it's splitting hairs to say that's "one's own". I nevertheless think it's wrong. Maybe not in conflict with WP guidelines, but wrong, in which case the guidelines should be changed. (I hold out no hope of that.) Perhaps, being a writer, I'm touchier than most on the issue. It also bothers me somebody quoting WP in good faith could get hammered, not knowing it's a verbatim copy (in the cases it still is). And, since Moon suggests raising the issue (& I think that's a good idea), do any here object to having this exchange copied to one or both of the above talk pages? TREKphiler any time you're ready, Uhura 23:19, 18 January 2010 (UTC)
What you're missing is that PD text can be altered. While many PD-based articles are built quoting the source verbatim, that may change over time. And while I would fully support a best practice of making a note in the edit summary of the article when the PD material is added, I hold no illusions: only a tiny minority will be aware of that best practice. In a similar way, having better attribution templates where you can change a flag so that the wording can either read "includes text" or "includes all text" would also hinge upon the hope that when derivatives are made of the source text, the editor will change the flag.
Which would lead to debates about when to change the flag, but also be mostly ignored, given that attribution templates are, way too often, added manually by the copyright cleanup crew once they verify that the text is, indeed, based on a PD source. MLauba (talk) 23:30, 18 January 2010 (UTC)
A "switchable" template would be better than an unclear one, IMO, if only because it would get changed. It might not be "instantaneous", but if experience is any guide, it'd be pretty quick. And as a "health warning", an unchanged one beats an unclear one: in effect, it'd say, "Don't quote verbatim or it's gonna bite you." Honestly, tho, that doesn't address the underlying issue: copying verbatim from DANFS (or anywhere) to begin with. Which (being unclear, as usual... :( ) I have a real problem with: not over copyvio (if public domain), but over the deeper issue. Just because I can copy Shakespeare entire doesn't mean I should. TREKphiler any time you're ready, Uhura 00:39, 19 January 2010 (UTC)
Except that there is currently no policy that prevents quoting from a PD source verbatim, even in full, provided that the result fits all other content criteria. Should there be one? Frankly, I don't have a horse in this. That being said, your example is flawed: copying Shakespeare entire wouldn't make for an encyclopaedic article. That being said, chances are that he's being copied in full on wikisource. MLauba (talk) 00:52, 19 January 2010 (UTC)
Wikipedia talk:Plagiarism would be the proper place for the discussion. I was not clear, until your most recent post, whether you're suggesting that guidelines should be changed to prevent copying the whole of a source, or that attribution templates should be changed to cover the cases where all of the text has been copied. I guess you've answered that question somewhat - you have a problem with both.
We can haggle about template text. FWIW, I take "includes text" to mean anything from "a little" to "some" to "most" to "all". I'm puzzled by the impression of an arbitrary judgement by you to exclude the last of these. But that strand pales into insignificance with the "should we allow copies at all" strand of your argument.
My take: copyright is (or once was) a bargain. Society provides criminal and civil protection for a period of time, in return for the protected thing entering the public domain at some time. And the point of the public domain is that the text can be reused. And so we, getting the point, reuse it, and provide an attribution (the wording of which is the subject of the other strand of the argument). By using that attribution we are signalling that this is not my work, or all my work. It builds on the work of others. That clears the moral point about giving credit where it is due. To recap: no legal issue because it is public domain. No moral issue because it is acknowledged. Subject to an acceptable acknowledgement, I'm plain not understanding what are the issues that would militate against use of PD text. And, for the record, not enjoying the implied accusations of dishonesty and wrongness.
Doubtless we can mull the issue some more at Wikipedia talk:Plagiarism and leave MRG in peace. --Tagishsimon (talk) 01:02, 19 January 2010 (UTC)
I take copyright more a legal bargain than a moral one, which is why the "all" bothers me, & why the WP position really bothers me. "Not getting sued" is not the same as "being OK". And there's a distinction between saying "includes material" & "copied entirely". Not explaining the difference is dishonest by omission, & I don't know what else to call it. And copying entire, public domain or no, attribution of source or no, I find wrong, & I don't know what else to call that, either. I'm making no accusations against anyone in particular, Tagishsimon; I think the system that allows this, & the thinking underlying it, is faulty: GIGO. Put it another way: would you copy any PD material & use it anywhere but WP with nothing but the "contains material" warning? If you wouldn't, you get it: it ain't OK. Why, then, is it OK on WP? TREKphiler any time you're ready, Uhura 01:35, 19 January 2010 (UTC)
You're not actually explaining why it is not okay to create a WP article from a PD source; indeed, you seem to be inviting me to explain why it is not okay. I've already explained my view - which is that with attribution, it is okay to use some or all of a PD source. You need to make your case, not merely repeat your emotional reaction. --Tagishsimon (talk) 01:39, 19 January 2010 (UTC)

(outdent)

I have created many articles by direct copying of an entire article from the Dictionary of National Biography. When I do so, I use the Template:DNB template. I do not believe that there is any ethical or moral problem with this: anyone who wishes to do so can easily determine which verbatim content remains in the article, by simply following the link back to wikisource. This is not plagiarism, because plagiarism is copying without attribution, and my usage is fully attributed. A question was raised above: "how is WP different from any other place where the original is copied verbatim?" The answer is simple: WP (and certain other works, such as commercial encyclopedias and compendia) make no assertion of original authorship, and the reader has no expectation that the work is original. This is utterly different than a signed work of any type, such as an academic paper or a scholarly work. Plagiarism is wrong for two independent reasons. The first reason is failure to acknowledge the work of the original author, and this is an ethical or moral wrong committed against the original author. The second reason is falsely claiming credit as the creator of the work. This is a separate ethical and moral wrong committed against the reader, and it also constitutes academic or professional fraud, depending on the circumstances. This second type of wrong is the one that gets the most emphasis in most contexts, but it is not relevant when the resulting work has no explicitly acknowledged author. The effort required to determine the WP author (via the edit history) will also show the actual provenance of the text. As to my practice of copying DNB articles: This work is in the public domain. This means that I (and Wikipedia readers) have as much moral right to use this material as anyone else: we are the intellectual heirs of the authors by right of our shared humanity and by action of copyright law.
The original authors contributed this work to an organization that eventually sold the copyrights to another organization, and those copyrights then expired. I have no more (and no less) moral obligation to the original authors than does the successor organization, which in the case of the DNB is the Oxford University Press, who publishes the ODNB. If you look, you will see that WP is at least as careful about attribution of these articles as is Oxford University Press. In the case of the DNB in particular, I (and several others) have gone to great lengths to place the original sources on Wikisource and to actually figure out exactly who the original authors were. The author attribution at Wikisource is a great deal better than was the attribution in the original DNB. We have no more of a moral obligation to a PD author than we do to a GFDL or a CC-BY-SA author: If we provide a way for a reader to trace the source back to the original our obligation is fulfilled. -Arch dude (talk) 02:33, 19 January 2010 (UTC)

See Wikipedia_talk:Featured_article_criteria#FAs_that_are_copies_of_other_sources. Christopher Parham (talk) 15:09, 4 February 2010 (UTC)

Citation attribution

As no one else has produced a template along the lines that user:Arch dude suggested in Wikipedia_talk:Plagiarism/Archive 6#Suggestions for improvement I have gone ahead and written one. Hence the addition to the guideline:

To aid with attribution at the end of a few sentences consider using the {{citation-attribution}} template, or source specific ones such as {{DNB Cite}}.

I have made it the same format as {{source-attribution}} but it could be changed to be an addition more in line with the format that user:Arch dude suggested. What do others think? -- PBS (talk) 05:33, 8 March 2010 (UTC)

The templates recommended in section Wikipedia:Plagiarism#Where to place attribution have no docs or examples, and should not appear in the guideline until this is fixed - the guideline should be useful to the majority of editors, not just to specialists. --Philcha (talk) 05:49, 8 March 2010 (UTC)

How about fixing it? -- PBS (talk) 06:17, 8 March 2010 (UTC)
I doc templates which I develop and use:
  • I know the templates already.
  • I have the motivation.
I have neither the knowledge nor the motivation to doc templates which I neither develop nor use. Those who develop and use them should doc them. --Philcha (talk) 09:01, 8 March 2010 (UTC)
I've removed the confusing tag, as it seems your objection is that the templates are confusing or unclear. The guideline seems quite clear. You may wish to tag *them*, if you don't understand them, so that they can be repaired. Tagging the guideline to complain about issues with auxiliary material doesn't seem productive. --Moonriddengirl (talk) 11:40, 8 March 2010 (UTC)
{{DNB Cite}} seems to have had documentation for some time. I didn't check to see when documentation was added to {{citation-attribution}}, but it was there when I arrived. I added an example. --Moonriddengirl (talk) 12:23, 8 March 2010 (UTC)
I'm not sure, Philcha, why you would restore a tag that says {{Unclear section|the templates recommended here have no docs or examples, and should not be recommended until this is fixed}} when both templates now have docs and examples. --Moonriddengirl (talk) 12:50, 8 March 2010 (UTC)

PBS, I think this is a great idea, it really bothers me to see a footnote citing a work, when the text is actually a copy of the work itself. I'd also like to see a flag added to the {{cite}} family to show "Incorporates text from..." before all the usual parts of a proper citation, but this is a good start. Franamax (talk) 07:18, 8 March 2010 (UTC)

The flag would be fabulous. My only problem with the new inline citation (which I also think is a great idea) is that it is difficult to format it properly. Wonder who could go about implementing such a flag? --Moonriddengirl (talk) 12:23, 8 March 2010 (UTC)
I'll make a mockup in my userspace, then we can approach the folks at {{cite}} and see how hard they laugh. :) Franamax (talk) 18:02, 8 March 2010 (UTC)

It would be better to list the templates that are currently regarded as useful and in what situations. The problem with categories is that anyone can add anything. --Philcha (talk) 16:05, 8 March 2010 (UTC)

There are 142 of them, not including subcategories; in those subcategories, there are 59 US government attribution templates alone. Listing them and describing in what situations they may be useful seems like it would be overwhelming to this document. --Moonriddengirl (talk) 16:10, 8 March 2010 (UTC)
Perhaps a [WP:List of PD attribution templates] page? Various projects could be invited to contribute advice on when to use them. Categories are cumbersome to navigate and decipher sometimes. Franamax (talk) 18:02, 8 March 2010 (UTC)
That would work for me. :) I've often had a grumbly poke through the category looking for the attribution template I needed. OTOH, I'm not volunteering to write it up. :D --Moonriddengirl (talk) 13:08, 9 March 2010 (UTC)

We do not have to change {{cite}} because {{citation-attribution}} templates can be wrapped around the citation templates. eg using the current example from template:citation-attribution/doc:

{{citation-attribution|
{{cite book |last1=King |first1=Richard John |title=Handbook to the cathedrals of England |volume=Vol. 1, Part 2. |year=1869 |publisher=J. Murray |page=129 }}
}}

produces:

  One or more of the preceding sentences incorporates text from King, Richard John (1869). Handbook to the cathedrals of England. Vol. 1, Part 2. J. Murray. p. 129, a publication now in the public domain.

--PBS (talk) 01:59, 9 March 2010 (UTC)

O the magic of nested templates! My only suggestions then would be 1) change "One or more of the preceding sentences incorporates text from" to "Text incorporated from"; and 2) "a publication now in the public domain" to "a PD/free work" (link to PD and whatever we have that most closely describes "free"). I really like minimal wording in footnotes. This looks like a major advance. Franamax (talk) 02:32, 9 March 2010 (UTC)
I think the precise wording of the template should be discussed on the talk page of the template. I am going to copy this part of the conversation over there. -- PBS (talk) 03:37, 9 March 2010 (UTC)

Conflation

An editor has expressed that this may conflate wp:quote. 174.3.107.176 (talk) 09:57, 16 March 2010 (UTC)

How many words?

Hi, how many words in a row would you need to copy to be accused of plagiarism? Is there an acceptable minimum amount of copying that needs to take place in order for an accusation of plagiarism to stick? I've recently been told "Plagarism is copying without quotation marks three or more words." Is that correct? --HighKing (talk) 14:57, 25 March 2010 (UTC)

Were you told this on a Wikipedia page? If so, where? The wording was in Wikipedia:Quote until I removed it with this edit five days ago. -- PBS (talk) 19:51, 25 March 2010 (UTC)
It's not about word count; it's about creativity. If the phrase is unique to the work from which it is copied, you should attribute it. You can use any method of attribution, not just quotation. "Jesus wept" (shortest sentence in the King James Bible) is an attributable two-word quote. "don't count your chickens before they are hatched" does not (in my opinion) need attribution. Note that quote marks themselves do not provide attribution--you need a reference also. -Arch dude (talk) 00:16, 26 March 2010 (UTC)

Assistance requested at Village Pump

Unless the discussion [2] should actually be here. I wouldn't know. If I could ask a favor though: please read what I've written carefully. Some contributors apparently have not. Yakushima (talk) 07:06, 5 July 2010 (UTC)

Is it "not quite plagiarism" if you cite a source?

RESUMING: To the extent that there was any conclusion at the Village Pump discussion above, by anyone other than the editor with whom I have the dispute, it seems to be this: for the situation I outlined,

though the editor supplying that opinion hedged that some dictionary definition might permit escape. Well, I'm talking about Wikipedia here (and common sense). Not about what a given dictionary definition might say.

Let me put it a lot less hypothetically. The problematic passage was as follows. Note that bold indicates my emphasis added, to show diffs, and especially note that what follows was not a blockquote (or any other kind of quote) in the WP article, but part of its running text. It would therefore be presumed by readers to be editor-contributed, not copied -- much less quoted -- from the source cited. [3]:

They revealed that the government had knowledge all along that the war would not likely be won, and that continuing the war would lead to many times more casualties than was ever admitted publicly.[4] Further, the papers showed that high-ranking officials had a deep cynicism toward the public, as well as disregard for the loss of life and injury suffered by soldiers and civilians.[4]

The source cited in [4] had this:

They revealed the knowledge, early on, that the war would not likely be won and that continuing the war would lead to many times more casualties than was admitted publicly. Further, the papers showed a deep cynicism by the military towards the public and a disregard for the loss of life and injury suffered by soldiers and civilians.

I'm still being told by a certain editor, despite considerable discussion with him, that I'm ignorant of WP:PLAGIARISM because (he says) I don't realize (as he claims) that supplying that footnote [4] would make it "not quite plagiarism".

My position is: in this case, the footnote makes no difference. A footnote isn't even some ambiguous indication that a passage might be copied from anything. If anything, quite the contrary. A footnote suggests that somebody did their homework, and since people who do their homework are seldom so foolish as to lead a trail back to evidence of their own misbehavior, the (already relatively small) probability in the reader's mind that he's reading something copied (nearly) verbatim from some source is reduced, not increased. Either way, the reader will more naturally assume this wording is the work of the editor(s) who contributed it. Any editor who copied this into the WP article text, in this way, footnote or not, is therefore, clearly, making someone else's work look like it's their own. That's what would make any such copying plagiarism.

Yes? No? Yakushima (talk) 12:12, 8 July 2010 (UTC)

And since Talk pages should be about the article, and how to improve it: how would you improve the article so as to improve readers' understanding of your answer? IMO, the article as it stands was not enough for one editor to get it, for the case above. —Preceding unsigned comment added by Yakushima (talkcontribs) 8 July 2010
I've spoken to the contributor. If he finds the guideline unclear, we can talk about improvement. But note that his assertion is not held only by him. Consensus here reflects the majority opinion that specific acknowledgment of copying is required, but others have voiced similar opinions, particularly in encyclopedic works. Generally, I recommend not trying to convince somebody that their opinion of plagiarism is wrong (plagiarism being a social construct after all), just that it does not mesh with the current consensus view of Wikipedians. --Moonriddengirl (talk) 13:39, 8 July 2010 (UTC)
The question I asked was clear. Do you have an answer to it? If the text were plagiarism without the footnotes, do you really think that it's suddenly no longer within the scope of that WP:PLAGIARISM when you add cites to the source it copies from, but still no indication of quoting it? This is what he was saying. Do you agree with him? And do you find support for that position in WP:PLAGIARISM? If so, reason it out for me. I don't see it, but I can follow logical steps in an argument. Yakushima (talk) 14:36, 8 July 2010 (UTC)
Did you find my response at Village Pump unclear? (Excerpted: "Community consensus is that direct copying of prose is plagiarism unless it is clearly identified as a copy.") If so, hopefully you will find my note to him more definitive. --Moonriddengirl (talk) 14:40, 8 July 2010 (UTC)
Clear enough for me. But for the editor in question (who didn't seem to notice what you said), you might have to go further, and point out that merely adding a citation to a passage, by itself, doesn't "clearly identify as a copy" the text of the passage. In this case, I think you need to connect all of the dots, tied back to a real context. To do that, it might help if you answered the above question directly: if the mostly-copied passage were plagiarism to begin with under WP:PLAGIARISM, does it somehow fall short of that verdict simply by adding footnotes to the source it copied (or any other source, for that matter)? I'm about 99.9% sure that your answer is: No, it's still WP:PLAGIARISM. Once he sees the source of his confusion, he might even end up being productive in this long wrangle. He might actually tell us exactly what wording in WP:PLAGIARISM led him to think he understood it and that I didn't. The resulting clarification might lead to improvements in that guideline to help reduce future misunderstandings. Yakushima (talk) 15:05, 8 July 2010 (UTC)
Hard to say if he's noticed, since he hasn't edited since I left it. I think my note is plenty clear. If he doesn't, I'll be happy to talk to him about it further. --Moonriddengirl (talk) 17:34, 8 July 2010 (UTC)

Paragraph Two: Newbie Overwhelm

Gregcauletta's recent comment [4] suggests to me some serious and valid criticisms of this guideline:

  • There's too much, too soon, about copyright violation and other IP issues
  • That emphasis could leave editors who are unsophisticated about plagiarism with the impression that, for Wikipedia purposes, plagiarism and copyright violation are mostly (if not entirely) distinct, when in fact they are often enough coincident.

And that's if they get anything out of the introduction at all, except bewilderment: a sense that plagiarism is (notwithstanding the good nutshell summary) a very complex technical subject.

Don't get me wrong: The guideline is a tour de force insofar as it's about getting the issues and principles right. I just think it falls down badly, early, in a more important respect: getting those issues and principles across, to those in the target audience who need this guidance the most. After all, if you succeed in getting a somewhat flawed message across, you'll get more help in the long run for eventually getting it right. But if you fail in getting it across, you just get cognoscenti talking to themselves.

Look at the long, dense, very technical second paragraph, excerpted below. In the presentation that follows, I'll use italics for sentences lacking the word "plagiarism", and I'll use bold for any terms that are almost certainly over the head of most Wikipedia editors. After all, your average Wikipedia editor is above-average in education, I'm sure, but most of them don't concern themselves with the arcana of IP law and open source licensing. I'll break several times for comment.

By Wikipedia's verifiability policy, articles should be based on previously published sources. These must be handled appropriately to avoid plagiarism. If the external work is under standard copyright, then duplicating its text with little, or no, alteration into a Wikipedia article is usually a copyright violation, unless duplication is limited and clearly indicated in the article by quotation marks, or some other acceptable method (such as block quotations).

I must say something: an editor who has arrived at this guideline confused about copyvio/plagiarism distinctions could read that last sentence as implying that such duplication is never plagiarism, even if it is copyright violation. And it's a long sentence that doesn't directly address the topic raised by the previous one: how a source should be "handled appropriately to avoid plagiarism." Already, it's going off-topic.

.... If the external work is under a copyleft license that is compatible with Creative Commons Attribution/Share-Alike License 3.0, it may be acceptable to include the text directly into a Wikipedia article if it otherwise meets policies and guidelines and if adequate attribution is provided.

Still off-topic, but maybe worse: most people (and most new Wikipedia editors) have neither heard of, nor taken much note of, the term "copyleft". Then you throw in severe complication: that the hypothetical external work's copyleft license is somehow constrained by a complex and more recent licensing scheme (with version number!). Congratulations! You've got about 95% of the eyes definitely glazing over, at least among those who need a primer on plagiarism. (And those editors probably unwittingly constitute 95% of the plagiarism problem on Wikipedia. Of course reaching those people is the priority!)

"Adequate attribution" is technical in a different way: a lot of people will arrive at the article not knowing some fairly basic things about what constitutes "adequate". I'm probably better than most, but I have an advantage: I was once beaten and left for dead by an English professor wielding the Chicago Manual of Style.

Most copyleft licences require that attribution be given; omitting such attribution is not only plagiarism, but also a copyright violation.

Whoa. Waa-aa-ay before this, the reader should be told that the combination of plagiarism and copyright violation is hardly limited to cases where the source was under copyleft compatible with blah-blah-blah, about which readers in the most important target audience know next to nothing at this point, and care even less. And if you have to say this here (or anywhere), say it this way: "omitting such attribution is not only copyright violation, it's what makes it plagiarism." Actually, I'd throw out the "is not only copyright violation" part.

Text from works with incompatible licensing must be treated as if the text were under the standard copyright notice. Works that are in the public domain because they were never protected, or their copyright has lapsed, carry no legal requirement for attribution, but most articles in Wikipedia that are derived from such external works attribute the text to the public domain source. Attribution for compatibly licensed and public domain text is generally provided through the use of an appropriate attribution template, or similar annotation, placed in a "References section" near the bottom of the page.

Now, I've been gentle up there and have not flagged all uses of "attribution", even though earlier I flagged it as possibly technical, and not well understood yet by some readers. Step back, knit what I've separated with comments back together again in your mind, and look at how much of this long, complex paragraph is now in italics (i.e., the word "plagiarism" not used in the sentence) and peppered with boldface (i.e., possibly too technical to throw at a recent arrival starting cold on these issues.)

What to do? Strunk & White said, "Omit needless words." Yes, I'd support deleting the entire paragraph as an improvement to the guideline. If any points made in it are not made further on, in relevant sections, they should have been already.

OK, I actually do believe there should be a paragraph there. Just not this one. Yakushima (talk) 06:24, 9 July 2010 (UTC)

"When the only tool you have is a hammer everything looks like a naisl" You have come to this page after problems with one specific type of plagiarism, that from your description of it -- which I have not looked at in detail -- was a copyright problem.
But there are other areas were plagiarism is a problem and copyright is not an issue. It seems to me that you are starting from a position where you are not assuming good faith. Most plagiarism arise for two reasons. Either the summation of an article is done in such a way that instead of being a summary it becomes a from of plagiarism. This can be a real problem for many inexperienced editors if they use only one or two sources. The other is text is introduced from another source that providing that attribution is given it is not considered a plagiarism problem on Wikipedia. There are areas were plagiarism (and non obvious copyright violations) is a problem and this guideline also covers those issues.
There is another area were people copy text in good faith, but it is a problem but is not a simple copyright issue that that is incomparability with copyleft licensing.
The introduction to this guide line is a summary of the body of the guidline and while as always there are areas were it can be improved, I think its a reasonable summary of the contents. As to what is copyleft for anyone who does not know what it is we have links to the article copyleft to explain it. -- PBS (talk) 11:49, 9 July 2010 (UTC)
PBS (quotes his): "When the only tool you have is a hammer everything looks like a naisl"
Is that based on any review of my editing history? I have shown that I'm pretty aware of the distinction between copyright violation and plagiarism (and that they can overlap in the same incident). And I think my editing history offers ample evidence that I don't believe all of Wikipedia's many problems can be solved by finding plagiarism and rooting it out. Am I confused about what you mean by "hammer" in invoking that figure of speech?
PBS: You have come to this page after problems with one specific type of plagiarism, that from your description of it -- which I have not looked at in detail -- was a copyright problem.
Consider looking at the issue in detail (starting with the Talk page for Daniel Ellsberg) first, before characterizing my behavior as narrow-mindedly ("hammer/nail") focused on one problem. It clearly was plagiarism. I suspected early that it was WP-to-website copying in violation of WP license terms, but said early on that the copying could have been either way. I did slip at one point and talk about how the WP article "ripped" from the website (perhaps partly under the influence of Gregcauletta's belief that it was a legitimate source). I did see that the site itself didn't attribute the text to anybody. And it turned out on closer inspection that the website (u-s-history.com) had copied an earlier version of Daniel Ellsberg without attribution to Wikipedia, which (as I understand WP:PLAGIARISM -- correct me if I'm wrong) makes it plagiarism on the part of whoever copied that version of the Wikipedia article to that website without attribution. (BTW, I'm writing this from memory, so if the chronology is wrong, well, it's wrong. I'm tired of explaining the situation to people who won't look.)
PBS: But there are other areas were plagiarism is a problem and copyright is not an issue.
I've already shown that I know this, in so many places, in so many ways, that this comment of yours makes me want to scream.
PBS: It seems to me that you are starting from a position where you are not assuming good faith.
It might seem that way to you, but you've also admitted to not looking very deeply into what led up to all this. The header for this thread says it's about the second paragraph of the guideline. I open the thread accepting, in good faith, Gregcauletta's excuse for not understanding WP:PLAGIARISM. I also accept, in good faith, that the second paragraph was not written with any intent to confuse. So where is the evidence you see for a bad faith presumption?
PBS: Most plagiarism arise for two reasons. Either the summation of an article is done in such a way that instead of being a summary it becomes a from of plagiarism. This can be a real problem for many inexperienced editors if they use only one or two sources.
It takes more than that, to make it a problem. AGF: It takes an editor thinking somehow that a whole introductory section can hardly be written any differently than how it's laid out in those sources. Sans AGF: the person doesn't want to bother. In other words, it's from an editor who can't--or won't--contribute original wording.
Well, inability to be original in your paraphrasing doesn't get you off the hook. We're not all equally endowed in that department. Those people have no inalienable right to edit Wikipedia despite their lack of endowment. As for those who could, but won't, that goes to the issue of intent, and only makes it far worse.
PBS: The other is text is introduced from another source that providing that attribution is given it is not considered a plagiarism problem on Wikipedia.
Give me a sentence in English that actually parses, please. I can't make sense of this one.
E.g., text copied from Encyclopædia Britannica, Eleventh Edition with a template:1911 in the article. -- PBS (talk) 02:02, 10 July 2010 (UTC)
PBS: There are areas were plagiarism (and non obvious copyright violations) is a problem and this guideline also covers those issues.
I've already said what my problem is, with such formulations: they make plagiarism sound like it's OK except in special cases. In fact, plagiarism is almost invariably a problem of some kind, for somebody. One can hardly complain that "plagiarism" sounds like an ugly accusation on the one hand, while presenting it as if it's mostly innocuous on the other.
PBS: There is another area were people copy text in good faith, but it is a problem but is not a simple copyright issue that that is incomparability with copyleft licensing.
Is it really too much to ask that you write sentences that parse? Yes, I know that copyleft leads many people to believe they don't need to attribute the source in any way. But the amount of copyleft text that anybody might copy into Wikipedia is vanishingly small compared to the mainstream sources. So delving into this relatively obscure special case in a very dense, almost unreadable second paragraph of a guideline that everybody contributing to Wikipedia should at least understand the spirit of, well, it's just very bad organization.
I certainly don't mean to suggest that there should be no discussion of copyleft at all. Above, I say that anything in that first paragraph that isn't already covered below it should be covered below it already. Did you read that far? Nor am I against some mention of copyleft in the introduction. But it should be more like this: "There are some rare and slightly difficult cases when it comes to copyleft, which is discussed in its own section below."
PBS: The introduction to this guide line is a summary of the body of the guidline and while as always there are areas were it can be improved, I think its a reasonable summary of the contents.
For a certain audience, perhaps: people who practically know the guidelines and the issues like the back of their hand, and who just want a 30-second quick review. See above on its readability index. It's utterly unsuitable for people who actually need to be educated somewhat about plagiarism, for a specific--and relatively typical--context.
PBS: As to what is copyleft for anyone who does not know what it is we have links to the article copyleft to explain it.
Yes, but if most people don't need to know about it, to get their issues over plagiarism resolved within this guideline, why emphasize it so strongly at the outset, in a densely technical, hardly readable paragraph? The introduction in this guideline should, if anything, aim to be understood in its spirit, and for typical instances, even by sub-average Wikipedia editors. It should get people ready to navigate to the answers they need from the table of contents.
I can't believe I've become such a lousy writer that I didn't make myself pretty clear at the outset here. But stylistic problems and obscurity can creep up on you so slowly that you don't notice. People, please feel free to tell me where I was less than clear. Yakushima (talk) 13:56, 9 July 2010 (UTC)

By the way, according to the Flesch–Kincaid readability test as computed here [5], the second paragraph requires 19 years of schooling. Yes, beyond M.S./M.A. grad-student level. On the readability indices here [6], the passage scores at graduate school level for all but Coleman-Liau Index.

We don't need this level of discussion for someone who just wants to fix some WP info about an Eric Dolphy track from old vinyl album-cover liner notes. We just want them to get that they can't just copy those liner notes (at least, not without showing that the wording comes from those notes.)

I can't think of any reason why there couldn't be a second paragraph that would not only be easily read by most college freshmen, but also stand as an actual model of how they should be writing. And I think I've given you all plenty of reasons why there should be a second paragraph like that, or no paragraph at all. Would you take writing advice about what amounts (in AGF terms, at least) to a point about being clear, from somebody who was a long way from being clear? Yakushima (talk) 12:09, 9 July 2010 (UTC)
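
For anyone wanting to check a candidate rewrite of the second paragraph against the Flesch–Kincaid test mentioned above, here is a minimal sketch in Python. The grade-level formula is the standard published one. The syllable counter (counting runs of vowels) and the sentence splitter are rough assumptions made for illustration, and the function names are hypothetical. Online calculators use their own counting rules, so this will not necessarily reproduce the figure quoted above.

```python
import re


def count_syllables(word: str) -> int:
    """Rough estimate: one syllable per run of vowels (an assumption, not a dictionary lookup)."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def fk_grade(words: int, sentences: int, syllables: int) -> float:
    """Flesch-Kincaid grade level computed from raw counts."""
    if words <= 0 or sentences <= 0:
        raise ValueError("need at least one word and one sentence")
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


def fk_grade_of_text(text: str) -> float:
    """Estimate the grade level of a passage using the heuristics above."""
    # Naive sentence split on ., ! and ? (abbreviations will be miscounted).
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return fk_grade(len(words), len(sentences), syllables)
```

Long sentences and polysyllabic words both push the score up, which is why a paragraph dense with licensing terminology scores so high. Calling `fk_grade_of_text` on a draft gives only a rough comparison between versions, not an authoritative grade.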

1st para contra section heading

The opening paragraph says, among other things:

This guideline addresses cases where plagiarism is not a problem, ....

which puts the emphasis (unfortunately, I think) on what are at best such special cases that no reasonable person would call them plagiarism. (And what fun for purposes of out-of-context quoting by irresponsible people either trashing Wikipedia or defending plagiarism: "Wikipedia's own guidelines say '... plagiarism is not a problem ...'").

Worse, though, and getting to my point, there's a section heading later:

Why plagiarism is a problem

which directly implies that plagiarism is always a problem. Whichever is right (IMNSHO, plagiarism is an ethical problem almost by definition, even if many specific incidents owe only to cluelessness about the ethic), this guideline can't be seen as equivocal about whether it's always a problem or only in certain cases. I don't think this guideline should even use the word "problem" without making it clear who would have any such problem, and whether the problem is legal, ethical, reputational, orthographic, etc.

To further compound the confusion: one of the reasons given under "Why plagiarism is a problem" is nearly circular. It effectively re-asserts a good definition of plagiarism instead of saying why it's a problem for anybody.

A credible encyclopedia must not silently present content copied from elsewhere as though it were original

This is almost like saying "Wikipedia shouldn't have plagiarism because, well, it's plagiarism". You need to say why it damages the credibility of Wikipedia.

After all, for some people, it's subtle. A lot of plagiarized content is copied admiringly by the clueless precisely because, being well-written and closely reasoned, it sounds so credible. Even the clued-in might cite some such motive, rationalizing that they at least made Wikipedia "look better", Wikipedia being presumably a good cause. Such text, if never detected as plagiarism, actually redounds to Wikipedia's credibility.

It might seem to some that such credibility damage is not a cultural universal, depending on the topic. In some cultures where all norms are dictated at considerable length, and with mind-bending specificity, in sacred texts and authoritative commentaries thereon, attribution is thought beside the point. (Unless, perhaps, you're (a) ecclesiastically certified and (b) digging into relatively obscure material.)

However, in these cultures, the very fact that your lay text contains a comment on an ethical, moral or social issue is considered a clear enough signal, in most formal contexts, that you copied it in good faith from an authoritative source -- after all, how would you dare to have your own opinion on such matters, in which originality is tantamount to Original Sin? As well, the diction of the source may be considered signal enough, if it differs enough from the vernacular.

This is perhaps all well and good in those cultures, but Wikipedia is nothing if not a secular instrument of education. Yakushima (talk) 07:54, 9 July 2010 (UTC)

[Irrelevant point about 2nd para added here by mistake, see about readability of it, in section above] Yakushima (talk) 15:12, 9 July 2010 (UTC)
Addressing solely the point of "This guideline addresses cases where plagiarism is not a problem," I believe it is within the spirit of the guideline and probably uncontroversial to clarify as I have done here. If this is controversial, we can restore the status quo. The lead is written to summarize the contents, which were derived after considerable debate. The first section covers precisely that: where plagiarism is not a problem. But I agree with you that as written the phrase can be misleading. --Moonriddengirl (talk) 14:39, 9 July 2010 (UTC)
Sorry that I brought up two points, the second about readability of the second paragraph of the guideline. Not intended. I had an edit conflict earlier, and maybe in the process of "solving" it (*sigh*), the passage about the readability scale ended up here. I'll move it where it should be (the section above) after commenting here.
It's fine, I think, for the first paragraph to say that the guideline will talk about "where copying is not a problem." But "where plagiarism is not a problem" is going to sound nonsensical to people who are sensitive to the term's very negative connotations. And I thought that was almost everybody. Am I wrong? If so, when did it change? Yakushima (talk) 15:08, 9 July 2010 (UTC)
I think that "This guideline addresses where copying and close paraphrasing is not a problem" would be better than "This guideline addresses where copying and close paraphrasing may not be a problem". But I think it would be better if the clauses in the sentence were to be rearranged to place the problems first, e.g.: "This guideline addresses how to avoid plagiarism, how to address plagiarism when it is encountered, and where copying and close paraphrasing are not considered to be plagiarism". -- PBS (talk) 03:29, 10 July 2010 (UTC)

(dedent). Problems first is good. But plagiarism (because of its ugly sound on almost all ears) is problematic from the first word. So the first priority is to say that the purpose of this guideline is to help people understand what plagiarism is and what to do about it, from the ethical rudiments expressed in the nutshell to the subtleties (technical to emotional) that arise in confronting even the possibility of it.

"How to avoid plagiarism", right off the bat, makes it sound like plagiarism is an unfortunate accident, something you stumble into. For some people it is, admittedly. But for those people, "avoid" could reinforce any impressions they might have that people who plagiarize are generally blameless.

For those for whom the word sounds bad, like a reflection on character (i.e., most of us), the subtext of "how to avoid plagiarism" is too close to "how to avoid being a thief." Well, what sense would that make?

In short, "avoid plagiarism" is potentially misleading from either point of view. That's why I like part of Moonriddengirl's solution: it relies on non-normative diction at just the point where it says the guideline will draw crucial distinctions:

This guideline addresses where copying and close paraphrasing may not be a problem ....

However, even she uses "avoid plagiarism" just a bit later. Worse, copying and close paraphrasing happen in copyright violation as well (inadvertent and otherwise), and the second paragraph already muddies those waters too much for those who don't understand the distinctions and similarities. And remember how I got here: such confusion is a major reason why we're even having this discussion about improving the guideline.

I think the whole introduction should start and end on giving credit where credit is due, and pound the point pretty hard everywhere between. That's positive to the naive, the beleaguered, the more sophisticated, and the more judgmental. With a dark word like plagiarism, you want to get rays of sunshine beckoning on the horizon immediately. But we can't be Pollyanna here. Plagiarism is failing to give credit where it's due. That's dark. However, phrased the right way, it's also motivational: after all, nobody wants to fail. And the scrollbar status for this longish guideline will hint to any reader (naive or sophisticated) that this "not-failing" is more complex than they might have thought. Indeed, it might hint even to the sophisticates that almost anybody can fail even with the best of intentions. That humbling length is good, provided that readers aren't left too daunted by the writing itself.

Which gets back to my Newbie Overwhelm point above. Readers will certainly need motivation to either suck it up and get through paragraph two on a first reading, or (better) bravely skip it in the hope that things clear up later. Maybe any comment on that should be made above, not here. But for now, let me suggest that at least the tone of the introduction would be better if it rang somewhat like the following. (N.B. Not block-quoted from any source, and with the allowed exception of conventional expressions, not anything that I remember from anywhere. I claim these words are mine.):

This guideline talks about failing to give credit where credit is due. It explains how an editor can fail to give due credit, why this failure reflects very poorly on Wikipedia when noticed by its readers, and what to do if you think you see a case of it. Due credit isn't always an easy goal, but it's always worthwhile.
Wikipedia is nothing without its reliable sources. Imagine that it offered little more than bare statements of fact, supported by those sources. It might still have a good chance of modest success, though we'd certainly miss all the fine writing and organization found in so many of its articles. Imagine that Wikipedia too often failed to give due credit. Then Wikipedia would certainly fail altogether -- even if it had the best writing and best organization of any encyclopedia in the world, all offered with what was otherwise the best of intentions. Since plagiarism is failure to give due credit, editors should strive to understand plagiarism for what it is, and for how it endangers Wikipedia.
To see and to know is not enough, however. Editors also need to know what to do when they think they see plagiarism, and when they wonder whether they've strayed into behavior that looks like it.
Consider:
  • Apparent cases of plagiarism can be ambiguous and uncertain. Sometimes, what looks like failure to credit sources (even failure that might seem blatant at first) turns out not to be plagiarism at all -- or not in the copy direction first suspected.
  • Because plagiarism is a very serious ethical breach in instructional and professional settings, acrimony can easily arise at the very mention of it. Editors need to know how to temper their expressions and responses.
  • Since Wikipedia is open to being edited by anyone, some editors might not have a firm grasp of time-honored conventions for attribution. Editors seeing such failures should always try to remember that naivete or lack of orthographic skill might be at the root. Editors who think they might be failing in this way need to be guided toward safe conventions.
  • We now have a bewildering array of new text licensing schemes. Wikipedia itself is under a relatively complex combination of them. They go beyond copyright in how freely they permit redistribution. This leads many to think that the added freedom implies that sources released under such licenses are free for use with no attribution at all. This is seldom the case, but it's an "understandable misunderstanding."
This guideline aims to help you through the maze of difficult and differing perceptions, personal sensitivities, technical subtleties, administrative remedies, and the repair of articles (and discussions!), whether the problem turns out to be plagiarism or not. Above all this, let the principle of "credit where it's due" be your guide, but without forgetting other policies and common sense guidelines for editors. It's seldom easy to keep them all in mind in the context of plagiarism, whether apparent or real. But that's no excuse for not trying.

Remember: I'm only suggesting tone and reading level here, by example. I'm not nearly as up to speed on all the history, wikidrama, and subtleties as I should be. I don't claim I've covered all the facts and issues in the right way, above. And the writing? Still too gnarly. If the feeling is right, though, and better than what we have already, but the words are wrong, what matters is that the words get replaced without doing violence to that feeling. Yakushima (talk) 15:28, 10 July 2010 (UTC)

Simple and obvious

Rereading, I think this paragraph doesn't work:

Phrases that are the simplest and most obvious way to present information. Editors who claim that the phrasing at issue is plagiarism must show that there is an alternative phrasing that does not make the passage more difficult to read. If a proposed rephrasing may impair the clarity, or flow, of a paragraph, they must propose a rephrasing that avoids such side-effects, possibly by rephrasing content preceding and following the disputed passage, or even the whole paragraph. An objective measure of whether a proposed rephrasing makes the passage more difficult to read can be obtained by a readability tool such as Dispenser's Readability Analyser. However, issues about clarity and flow will have to be resolved by discussion.

It was a turbulent time when that was added, but I would imagine that the intent was to clarify that simple, non-creative content does not require attribution. That's basically what's implied by the first sentence, but the subsequent sentences seem to suggest that reading fluidity is the real issue here...and that doesn't make sense. Suppose a public domain source has an excellent description of the flight pattern of a Monarch butterfly and that rewriting it would dilute it. Does this eliminate the attribution requirement, when all that is needed to retain it is a note that it's copied verbatim? (If the passage is non-free, of course, it's even less useful, since copyright law doesn't care if you like the way the copyright holder puts it.)

I propose eliminating everything after the first sentence, replacing it with an expanded explanation that the reason this is not a plagiarism problem is because it lacks the degree of creativity requiring attribution. Maybe something like this:

Phrases that are the simplest and most obvious way to present information. Sentences such as "John Smith was born on 2 February, 1900" lack sufficient creativity to require attribution.

As it is currently written, anyone can claim that any non-free content they've imported requires no attribution if it has a better readability index. Again, all that's required if the content is free to permit its use is attribution. If it's non-free, readability is immaterial.

Thoughts? --Moonriddengirl (talk) 15:34, 19 July 2010 (UTC)

I agree. And I think your text version protects me from plagiarism accusations regarding the previous sentence. Hans Adler 21:05, 19 July 2010 (UTC)
I also agree (has that also protected me?). I also like Mrg's explanation of "simple, non-creative content does not require attribution", which perhaps could be included.
It does raise one issue though. When I have been involved in POV disputes where someone demands sources to justify a phrase, it can be extremely difficult not to use very similar wording to the original source, because if the wording differs, the person who wishes to emphasise a different POV will say that the wording does not accurately reflect the expert's point of view. For example take the phrases:
  • "the majority of Australian experts are considerably more circumspect" (Mark Levene p. 344, footnote 105).
  • however the majority of Australian experts are more circumspect (Wikipedia history wars --cites Levene and another source (for the rest of the sentence))
The words "majority of Australian experts" are mandated by WP:NPOV as accurately reflecting what was called "mass attribution" (e.g. "majority" is not "most" and not all "experts" are "historians"), and I would have liked to use a different word to "circumspect" but I could not find one that meant precisely the same thing (probably a lack of ability/imagination). If it were not for a POV dispute, I would have used a different word, but the phrase would have a slightly different meaning.
The question comes down to whether without turning a passage into a string of quotes, such phrases can be justified given the self imposed limitations we work under of accurately summarising the sources with no original research, and whether you (Mrg) intend your proposed wording to cover such instances. -- PBS (talk) 21:48, 19 July 2010 (UTC)
I doubt it. Occasionally, precision will demand direct (fair use) quotations. However, I think someone that repeatedly demands the exact words of a source takes Wikipedia too seriously, and should just go read the book. --Hroðulf (or Hrothulf) (Talk) 13:00, 20 July 2010 (UTC)
At the very least, they are misunderstanding WP:NOR.:) There are certainly occasionally sources so contentious that they require direct quotation to make sure they are accurately attributed and conveyed. I see that Lemkin has an in-sentence attribution. If not overused, this is generally taken in academic settings at least to authorize a slightly closer following of language. It also makes it easier to toss quotation marks around the odd "striking phrase". I would drop some quotation marks around that myself. In fact, I think I'll go have a shot at it. :) --Moonriddengirl (talk) 13:08, 20 July 2010 (UTC)
Oops! Levene is the source of the quote; not Lemkin. Hmmm. Well, still, the point stands, though I'm wrong in this instance. --Moonriddengirl (talk) 13:10, 20 July 2010 (UTC)

User notice template

Hi. I've created a notice template for unattributed copying of PD content, here. Wanted to let people know about it in case it can be improved. --Moonriddengirl (talk) 15:02, 11 September 2010 (UTC)

Where to place attribution templates

I note the section Wikipedia:Plagiarism#Where to place attribution added on July 20 2009. This advice seems to be in conflict with Wikipedia:MOS#Section headings which talks of "primary headings are then ==H2==, ===H3===, ====H4====, and so on up to ======H6======" and makes no mention of ad hoc emboldened section headings.

There does seem to be sense in using a section heading to regularise and highlight the attribution of text inserts into articles. Although hitherto I've placed attribution templates under an H2 ==Notes==, I could cope with placing it under an H3 ===Attribution===. Does anyone have thoughts on this?

I note that whatever advice is given here should probably be reflected in Wikipedia:Citing_sources, such that we establish a regular pattern for citations, references, notes, attribution, etc. --Tagishsimon (talk) 10:31, 13 September 2010 (UTC)

Why can you live with putting it under a section called notes and not at the end of a section called references? -- PBS (talk) 20:59, 13 September 2010 (UTC)
Attribution is not a section heading; it is simply a bold line. We added it because the contributors to this talk page thought it was desirable to highlight the attribution. I considered suggesting that it should be a section header but rejected the idea for three reasons:
  • The first was that triggering a TOC on a stub for one extra line was a complication that most editors could do without.
  • The second was that I saw it as only a rearrangement of the lines already in the References section(s): usually {{1911}} or whatever were already in the list of general citations, and as such it was to highlight the inclusion of the attribution for the reader, which was one of the major plagiarism concerns of those involved in writing this guideline who were opposed to including any text from third party sources unless it was in quotes.
  • The third was that if Attribution became a section it would just add more confusion for editors over what the difference was between Notes, References and this other section called Attribution, it would also mean a rewrite of WP:CITE and WP:LAYOUT making the text in those guidelines more complicated than if Attribution remained just a bold line -- it is a relatively small number of pages which include unquoted text from third party sources.
I think the advantages of keeping it as a bold line (simplicity) outweigh the advantages of having it as a separate section header. -- PBS (talk) 20:59, 13 September 2010 (UTC)
"We added it because the contributors to this talk page thought it was desirable to highlight the attribution." I've had a look through the archives of the talk pages of this guideline, but didn't find the discussion you allude to. Would you please point me to it? --Tagishsimon (talk) 16:17, 14 September 2010 (UTC)
The conversations from the time are in archive 5, but the two preceding archives are also relevant. -- PBS (talk) 23:10, 14 September 2010 (UTC)
I've searched Archive 5 thoroughly and found no discussions of the attribution header in it. Could you be any more specific about where these discussions are, please? --Tagishsimon (talk) 22:46, 15 September 2010 (UTC)
I myself just put the notices directly under the word "references". --Moonriddengirl (talk) 17:11, 14 September 2010 (UTC)
I also put it as the first thing in the Notes/References section (whichever one the article uses for its inline refs). I never actually noticed that this said to put it under its own fake heading. VernoWhitney (talk) 19:42, 14 September 2010 (UTC)
VernoWhitney, why would you want to put it above the {{reflist}} and not in the References section?
The reason I think that they are better at the bottom of the reference section is because sometimes there is more than one of them, and sometimes directly below the PD attribution is a list of the general references used in the PD article (and mixing up the references used by Wikipedia editors and secondary references used by the PD sources does not keep the clarity we need for "say where you got it").
The second reason is a technical one. Most of the attribution templates do not contain the code to work with {{sfn}} and {{harvnb}}, so if an article uses short citations there needs to be a standard {{cite book}} or whatever with the "ref=harv" parameter set (it is set by default in the {{citation}} template). The "mark 2" attribution templates I have been coding for {{1911}}, {{catholic}}, and {{DNB}} all have the "ref=harv" parameter set, but they are the exception to the rule. This means that for most articles, such as Western Allied invasion of Germany, the citation has to be given twice, and given that, I think it looks better if the attribution goes at the bottom. -- PBS (talk) 23:10, 14 September 2010 (UTC)
Most of the articles I add attribution to are freshly created and don't have separate Notes/References sections. I guess my logic is that the source for all of that material really is wherever you're copying it from, and so it should go in with all of the other footnotes which show directly where you got the info from. I just put it at the top of that section so people would be more likely to read it instead of skimming it along with all of the other footnotes. An attribution header would work just as well, but as I said, I just skimmed that part of the article before so I didn't know there was supposed to be one. VernoWhitney (talk) 01:56, 15 September 2010 (UTC)
I am at the moment working through the DNB articles putting in a "wstitle=" parameter; as it happens, just after I asked this question of you I came across a few articles laid out the way you describe. Here are a couple, Henry Newcome and George Henry Harlow, and I think they look OK, but I think that William Kiffin, which has a mix, is a mess and will become even more so as other sources are cited as references. I have altered Charles Lucas (politician) but have left the note at the top, and also altered Richard Royston; the latter shows how an article like William Kiffin can be modified to use an "attribution" line along with the {{harvnb}} template and can take any number of other references. However these are the exception and much more typical are articles like John Gayer and Rowland Searchfield. PBS (talk) 02:44, 15 September 2010 (UTC)
Sad to say I'm appalled by the (to me) new footnote style you're imposing on DNB articles, PBS. Henry Acton, for example, now conflates together the reference and the attribution, when previously there was a clear division between these two. The danger of going down the Henry Acton route in an article with multiple references is that attribution loses all prominence because it is buried in a mass of other references. My expectation is that we publish, in an article which has need to attribute a PD insert, a clear and self standing attribution statement, one that is entirely separate from a citation reference. (And I also gently point you to my reply above, where I'm still searching for the prior discussion, still not finding it in archive 5.) --Tagishsimon (talk) 23:10, 15 September 2010 (UTC)
If there is only one reference then it does not really matter if there is an attribution line or not, as no one who looks at the references section (such as in the article Henry Acton) is going to fail to notice it; it does no real harm to add it. In a more developed article, where there may be dozens of references, putting the reference which includes the copied-PD line at the bottom of the reference section is in my opinion quite an elegant solution; take for example the article George Monck, 1st Duke of Albemarle -- PBS (talk) 09:34, 18 September 2010 (UTC)
It didn't used to be there, Verno. :) It used to just say "Attribution for compatibly licensed and public domain text is generally provided through the use of an appropriate attribution template, or similar annotation, placed in a "References section" near the bottom of the page." I don't remember noticing the change; I've gone right along with what I believed it said. Tagishsimon, I suspect he may be referring to this comment. Not a lengthy discussion, but the question of prominence was raised and nobody seems to have objected to his bold edit at the time. We can talk about changing the current recommendations, though, if you think them inappropriate. I myself think what it used to say was fine. It allowed leeway for getting "fancy" if need be and for the simpler situations I usually encounter. --Moonriddengirl (talk) 23:40, 15 September 2010 (UTC)
I'm at the point where I quite like PBS' suggestion of an Attribution header, other than that per WP:MOS, as I said at the top, it should in my view be in the form of an H2 or H3 header, not merely ad hoc emboldening. I think it important that we signal in no uncertain terms inserts of PD text, and the suggested header reinforces the template. I don't think that mere whim should override WP:MOS (which gets many more eyeballs than this page and the guidelines in which are, thus, arguably given with more force, or at least much broader consensus than we represent). And the same argument for prominence and clarity in attribution is (part of) what is distressing me about the approach taken in Henry Acton (and by now many other PBS touched articles), that references and attribution are being conflated. (I recognise that questions of prominence are value judgements and that YMMV applies.) So, what are we discussing?
  • Whether the recommendation for an Attribution header should remain
  • Whether, if it does, it should be emboldened by hand, or an H2 or an H3
  • Whether attribution templates can also serve as / be conflated with references, as in Henry Acton
  • And, for completeness, since there's discussion in other places [7] [8], whether we want attribution templates such as {{DNB Cite}} and {{1911}} to be adorned with logos such as a PD logo or a wikisource logo.
--Tagishsimon (talk) 00:49, 16 September 2010 (UTC)
Tagishsimon, using the term "decorative" is a cigarette punch (the Kray twins discovered, rather annoyingly, that people expected to be hit by them, so they used to offer a victim a cigarette before hitting them, because if a victim had their jaw open when they punched them it tended to result in a broken jaw). Using the term decorative in this context implies that it serves no purpose, but for example in the article Thomas More the Wikisource icon serves exactly the purpose it is supposed to serve in showing that there is a Catholic Encyclopedia article on Wikisource relating to Tom More.
Tagishsimon, you wrote "I quite like PBS' suggestion of an Attribution header". I have not suggested that it should be a header (see what I wrote above). In my opinion your previous usage of "notes" and "references" section headings and the content you have placed in those sections shows that you were confused by the guidance in WP:CITE. Having been through 1,000 of articles in the last couple of weeks (changing unnamed parameters to "wstitle=") I have found that the majority of editors treat {{catholic}}, {{1911}} and {{DNB}} like any other reference, so your question "Whether attribution templates can also serve as / be conflated with references" is answered by usage: yes they can (and I can only conclude that you ask this question because of the rather unique way you have interpreted the usage of "notes" and "references" section headings) -- PBS (talk) 10:05, 18 September 2010 (UTC)

BTW for anyone who is interested: all three templates {{1911}}, {{Catholic}} and {{DNB}} have a flag called "inline=1" which allows them to be used like the {{citation-attribution}} "To aid with attribution at the end of a few sentences ..." e.g. {{1911|inline=1|wstitle=A}} returns:

-- PBS (talk) 10:21, 18 September 2010 (UTC)

Some of this discussion goes a bit beyond my areas of participation. :) I'm not very visual, as my userpage attests. Here are a few examples of one of the PD templates I created (maybe the only one? I lose track) in use: Omaha Race Riot of 1919#References; Limos (mythology)#References; William Zouche#Notes. IMO, it stands out clearly enough from other references to draw attention to itself so that readers who check sources should not miss it. As far as I'm concerned, the primary purpose of this guideline is to get the attribution on the article in some clear way, and any way that does that works for me. I don't think we need to make it too elaborate. Sometimes, it is the only source we have, or one of the few, and I worry about overwhelming a stub article with sections. Take Cavum vaginale, for instance. Would it be improved by the addition of a subheader or bolding for attribution? I don't think so. I think we should avoid being overly directive here. --Moonriddengirl (talk) 13:15, 19 September 2010 (UTC)

In-text attribution

Hi Moonriddengirl, what's your objection to this? It's standard practice per V to use in-text attribution without quotation marks. SlimVirgin talk|contribs 15:39, 8 October 2010 (UTC)

Consensus in the development of this guideline was that copied content should be clearly denoted as copied. The section in which you've added the note says, "You can avoid any dispute concerning potential plagiarism by". I don't believe people can "avoid any dispute" by in-line attribution if the content they are copying is complex or lengthy. Certainly, not everybody is going to regard it as plagiarism, but I believe that it's not the failsafe the guideline would suggest it to be. --Moonriddengirl (talk) 15:46, 8 October 2010 (UTC)
The guideline can't contradict other policies, guidelines, and standard practice, MRG (e.g. see WP:QUOTEFARM). It's normal practice on Wikipedia and every other kind of publication to attribute without quotation marks, because no one wants their articles to become a list of quotes. Writing "Moonriddengirl said she loved it" is just as appropriate as "Moonriddengirl said, 'I love it!'". Quotation marks are for words that we want to draw attention to for some reason, perhaps because they are very distinctive, or legally or politically important.
You are right that copyright violations can't be avoided with in-text attribution alone, but they can't be avoided with quotation marks either. But plagiarism is avoided by clearly attributing the source next to the text you are citing; quotation marks are irrelevant in that sense. SlimVirgin talk|contribs
Writing "Moonriddengirl said she loved it" is a proper paraphrase. The content "I loved it" is short and almost entirely devoid of creativity. The attribution in that case is sufficient. Writing:

Moonriddengirl said the US government utilizes a "substantial similarity" test intended to determine if infringement exists and that Melville Nimmer produced subcategories of "substantial similarity" for which the courts search.

is a different matter. (That's copied from my userpage, I note, to avoid self-plagiarism. ;)) Quotation marks are for words that are copied. This is a guideline; WP:QUOTEFARM is an essay. Nevertheless, it agrees that "Quotations must always be clearly indicated as being quotations." Inline attribution is insufficient if creative language is retained. --Moonriddengirl (talk) 16:08, 8 October 2010 (UTC)
You just keep saying that, but without a source. Please provide a reliable source. It's clearly fine a great deal of the time, on Wikipedia and in every other form of publication, to add reported speech without quotation marks so long as you make clear who said it. SlimVirgin talk|contribs 17:19, 8 October 2010 (UTC)
University of Pittsburgh, section D: "Copying Distinctive Words or Phrases without Proper Attribution Is Plagiarism The following use of Bettelheim’s “debilitated state of the ego” is plagiarism, even though the ideas are attributed to him, because there are no quotation marks around this distinctive phrase..." (I'll leave you to read the rest of the example yourself.) --Moonriddengirl (talk) 17:27, 8 October 2010 (UTC)
But it doesn't say you need quotation marks. You just need to make clear that the phrase is not your own, and that can easily be done with the writing, with the way you attribute in-text. It's just wrong to say that quotation marks are the only way to do that, and that webpage doesn't support your argument that it is. You're recommending a very poor writing practice. SlimVirgin talk|contribs 17:56, 8 October 2010 (UTC)
"The following use of Bettelheim’s “debilitated state of the ego” is plagiarism, even though the ideas are attributed to him, because there are no quotation marks around this distinctive phrase..."" (emphasis added). --Moonriddengirl (talk) 18:17, 8 October 2010 (UTC)
A few more for you: [9]: "*Note that in the paraphrase, a very brief quotation is used. When you paraphrase, you cannot conveniently borrow the direct language of your source, however brief, without using quotation marks"; [10]: "Be sure you have enclosed the exact words of the source with quotation marks or you have set off longer quotes with indention. You are committing plagiarism if you neglect this critical punctuation, demarcating words you have gathered directly from sources"; [11]: "Use quotation marks to identify any unique term or phraseology you have borrowed exactly from the source"; [12]: "Additionally, paraphrasing is plagiarism where you fail to cite your original source and, in some cases, where you fail to use quotation marks as well.... even where the original source has been cited, plagiarism occurs where you fail to use quotation marks around words or phrases that show the author’s distinct and original thought or expression.... While determining whether certain words warrant quotation marks might seem dependent on the reader, you should always use quotation marks when in doubt." --Moonriddengirl (talk) 18:32, 8 October 2010 (UTC)

I think there is a misunderstanding going on. I guess SlimVirgin is probably thinking of quotations of entire paragraphs, using appropriate means for marking them as quotations. If there are no quotation marks this includes at the minimum starting a new paragraph and saying where it is from, but typically that new paragraph is also indented and/or in italics. I think Moonriddengirl is probably thinking of the recent big copyvio case involving an established editor and regular contributor to policy discussions, who thought copy/paste (without quotation marks or other markup) is the only way to avoid original research, and acted accordingly for years. Hans Adler 16:49, 8 October 2010 (UTC)

Yes, you're right; that's the kind of thing I am thinking about. We see a lot of that at WP:CP and WP:CCI, and I wouldn't want to inadvertently confuse our contributors into thinking that quotation marks aren't necessary. Conversations about plagiarism on Wikipedia (nevermind copyvio) can get really ugly. :/ Clarity matters. When it comes to non-free content, there's already this in that section: "properly attributing any public-domain, or free-content text, that you place directly into an article." --Moonriddengirl (talk) 16:56, 8 October 2010 (UTC)
I don't see what difference adding quotation marks would make to the kind of thing you're describing. But the point is that quotation marks are not needed if there is in-text attribution, and if you want to change that you're going to need a wider consensus, because it would fly in the face of common writing practices and other policies. SlimVirgin talk|contribs 17:16, 8 October 2010 (UTC)
What other policies? --Moonriddengirl (talk) 17:18, 8 October 2010 (UTC)
In any case, (hypothetically contradictory) policies local to the project do not override legal standards of copyright which require that verbatim copying of non-free text be specifically attributed as direct quotations for any fair use claim to be plausible, nor can they supersede generally accepted standards of academic honesty. Peter Karlsen (talk) 18:04, 8 October 2010 (UTC)

SV, you say that to meet Wikipedia content policies and other guidelines, in-line attribution and an inline citation are enough, but how is a reader to know if what I have just written is a summary of what you said or a direct quote unless quotes are marked as such? I would assume that although imitation is the sincerest form of flattery, to avoid copyright and plagiarism issues, I would have put in quotes an exact copy of your words. MRG has told me in the past that if a Wikipedia editor uses a well-worn phrase -- as I did in the last sentence -- then those do not have to be quoted. On reading that last sentence would you assume that I was quoting verbatim what MRG said to me or paraphrasing what she said to me? -- PBS (talk) 01:11, 9 October 2010 (UTC)

Most of the time it doesn't matter. What is the difference between "Philip Baird Shearer said it was all nonsense," and "Philip Baird Shearer said: 'It is all nonsense'"?
Quotation marks are only needed if it's important or interesting for some reason to mark the exact words used, e.g. "U.S. President Philip Baird Shearer said the Russian president was 'talking nonsense.'" Then it would matter exactly what you had said. SlimVirgin talk|contribs 01:22, 9 October 2010 (UTC)
You asked for reliable sources for my position that inline attribution does not automatically eliminate plagiarism concerns. I have provided those. But I'm curious: do you have reliable sources to substantiate your view that "Quotation marks are only needed if it's important or interesting for some reason to mark the exact words used"? --Moonriddengirl (talk) 01:33, 9 October 2010 (UTC)

Consensus

Someone said the quotation-mark requirement was added after a discussion reached consensus about it. Can someone link to that discussion, please? SlimVirgin talk|contribs 17:59, 8 October 2010 (UTC)

I think this kind of escalation is premature. I am not sure at this point what this discussion is about, and I guess I am not the only one. I am still sure that we can get to unanimity if we start talking about concrete things that we want to allow or not, or that we want to encourage or not. I am absolutely sure that you don't want to allow the kind of thing that Moonriddengirl wants to prevent, and that we can find language that is acceptable to all once proper communication has started. Hans Adler 18:17, 8 October 2010 (UTC)
This conversation seems to begin with WP:NFC (policy and guideline). The policy has long permitted "in accordance with the guideline, use brief verbatim textual excerpts from copyrighted media, properly attributed or cited to its original source or author" and the guideline has long said, "Copyrighted text that is used verbatim must be attributed with quotation marks or other standard notation, such as block quotes." On the 7th, a contributor changed the policy by incorporating some of the guideline into it (here), following which Slim changed the policy thusly (perhaps not realizing that this contradicted the guideline, which had long been incorporated by reference...although not after her edit). Her change was reverted. A bit of back and forth editing followed and some conversation at Wikipedia talk:Non-free content#"If appropriate" quotation marks. It seems very likely to me that this conversation is an extension of that. It's my perception based on the thrust of conversation there that the goal here is to avoid unnecessary use of quotation marks. --Moonriddengirl (talk) 19:46, 8 October 2010 (UTC)
Yes, it's clearly about unnecessary use of quotation marks. But it's all extremely abstract so far. I would like to see a concrete example of the kind of literal quotation without quotation marks that SV is arguing for. Hans Adler 20:10, 8 October 2010 (UTC)
Example: I've just added to Ezra Pound (issue at point in bold):

At a literary salon in February 1909, he befriended the novelist Olivia Shakespear—Yeats's former lover and the subject of his The Lover Mourns for the Loss of Love—and her daughter, Dorothy, Pound's future wife, who Iris Barry said carried herself with the air of a young Victorian lady out skating, in strong contrast to Pound.[1]

The skating analogy comes from Iris Barry, who described Dorothy as "carrying herself delicately with the air, always, of a young Victorian lady out skating."
According to Moonriddengirl, attributing the description to Iris Barry is not enough. I would also have to place some of these words in quotation marks. But there is no other publication that would require that of a writer. Quotation marks there are entirely optional, depending on the extent to which I want to draw attention to the phrase. SlimVirgin talk|contribs 00:16, 9 October 2010 (UTC)
Thanks. I think that has clarified things sufficiently and we can all agree that quoting in this way is not plagiarism. The problem seems to be how we can make it clear that this is acceptable, without inadvertently encouraging some of our less literate editor colleagues to write something like the following:

At a literary salon in February 1909, he befriended the novelist Olivia Shakespear—Yeats's former lover and the subject of his The Lover Mourns for the Loss of Love—and her daughter, Dorothy, Pound's future wife, who always carried herself delicately with the air of a young Victorian lady out skating.[2]

I believe that's the kind of thing that Moonriddengirl is constantly cleaning up. I am sure it's not a nice job and she would prefer having more time for writing content. The problem is that the editors who write like that also tend to be rather bad at understanding the nuances of any instructions we give them. If we give them a chance to misunderstand what we tell them as permitting what they want to do, they won't miss it. Hans Adler 16:41, 9 October 2010 (UTC)
I'm afraid you may have misunderstood me. I didn't say it was "a discussion." I said, "in the development of this guideline". There are six pages of archived talk, and the question of directly noting copied content has been raised repeatedly. The guideline says, with respect to non-free content, "In addition to the edit summary note, be sure to attribute the material either by using blockquotes, or quotation marks, by using an attribution template, using an inline citation and/or adding your own note in the reference section of the article to indicate that language has been used verbatim." But all that misses the initial point. You added your change to a section that says (specifically), "You can avoid any dispute concerning potential plagiarism by..." (in your words) "providing in-text attribution without quotation marks, and referencing the source". I've quoted a number of reliable sources for you now which indicate that quotation marks are required to avoid plagiarism. Accordingly, it is not true to say that you can avoid any dispute concerning plagiarism by "providing in-text attribution without quotation marks, and referencing the source", and it does no service to readers of this guideline to tell them otherwise. If they don't use quotation marks, even if the content is free, they certainly may encounter very vitriolic disputes about plagiarism, and if they do it extensively with non-free sources, they might wind up at WP:CP. --Moonriddengirl (talk) 18:49, 8 October 2010 (UTC)
You said there had been consensus in the development of the guideline, and you reverted me on that basis, so could you link to wherever you feel consensus was expressed that all quotations must be in quotation marks, and that it's never okay simply to write that Moonriddengirl said SlimVirgin was an idiot? :) SlimVirgin talk|contribs 23:14, 8 October 2010 (UTC)
No, I reverted you on the basis that there is "No consensus for this" (quoting my edit summary); you made a change, and I disagreed. You then asked me my problem with it, and I've explained...including with the external WP:RS you requested. If you want to change the guideline, you're welcome to try to achieve consensus for that. --Moonriddengirl (talk) 23:33, 8 October 2010 (UTC)
Please don't tell me the word string above would have to be rendered as:
who "Iris Barry" said carried "herself ... with the air ... of a young Victorian lady out skating".
Save me. Save our readers from it. One of the principles of good quotation technique is to shield readers from bad English or formatting, and to integrate what others write/say smoothly into the grammar of the WP text. Provided the original meaning is not changed substantively (which it is not, here) and the attribution is there (yup), it would be objectionable not to do this. I copy-edit some of The Signpost's journalistic stuff each week. There, I sometimes see that readers would be forced to jump through hoops (ellipsis points, weaving in and out of quoted fragments) in the supposed service of black-letter law on this matter. Or exposed to rather bad prose in quoted material. No one, not the original source, not our readers, not WP, is served well by such inflexibility. Tony (talk) 04:05, 9 October 2010 (UTC)
In the example given, who said "in strong contrast to Pound"? Was it Iris Barry? If not, then why does the citation come after that phrase? Also, isn't the sentence ambiguous, as one can take the contrast to be in the deportment of Dorothy and Pound, or a contrast of opinion over Dorothy's deportment?
"You can avoid any dispute concerning potential plagiarism by: ... providing in-text attribution without quotation marks, and referencing the source" (shouldn't that be citing the source?) Can we tease out of this bundle and discuss the two issues of copyright and plagiarism (as at the moment we seem to be conflating the two) and see what common ground there is and what the differences are? For example, I don't think anyone is trying to impose a crude three-word rule. -- PBS (talk) 11:36, 9 October 2010 (UTC)
First, in spite of Slim's statement that "According to Moonriddengirl, attributing the description to Iris Barry is not enough..." I haven't actually weighed in on the proper punctuation of that snippet. That said, I believe, in keeping with the sources I supplied above to demonstrate problems with Slim's assertion that "You can avoid any dispute concerning potential plagiarism by..." (in her words) "providing in-text attribution without quotation marks, and referencing the source", that quotation marks would serve the sentence here:

Pound's future wife, who Iris Barry said carried herself "with the air...of a young Victorian lady out skating"

This would satisfy the plagiarism standards set out by those sources. People who weave in and out of ellipses are badly paraphrasing. My personal opinion (which I have stated on Wikipedia in the past, if Slim needs verification) is that where in-line attribution is used we can more closely follow our sources, though I do follow the convention of using quotation marks to indicate creative word choice. But none of this has anything to do with a black letter law on quotation marks. It's to do with quite the opposite, a blanket assertion that "You can avoid any dispute concerning potential plagiarism by..." (emphasis added). You can't. I've verified that with enough sources to demonstrate significant alternative viewpoints. Leaving aside non-free content concerns, the question of when and how to use quotation marks to avoid plagiarism is entirely up to the community...whether we choose to embrace a standard that meets the most rigorous expectations or not is up to us. --Moonriddengirl (talk) 12:02, 9 October 2010 (UTC)
I would have to endorse Moonriddengirl's position here. Slimvirgin's proposed modification to the guideline is – likely inadvertently – far too broad in its scope. It implies that inline citation is always an appropriate alternative to the use of quotation marks; it directly contradicts the previous instruction regarding verbatim copying. While I suspect that everyone participating in this discussion has a reasonable grasp of what plagiarism actually means and wouldn't be tempted to (mis)read the guideline that way, we should strive to write in such a way that we won't confuse editors who aren't familiar with proper academic sourcing standards. I regularly deal with editors who believe that it is appropriate to copy entire sentences or even paragraphs from outside sources, as long as they tack on an external link and mayhaps substitute a synonym in a couple of places. Editors naively reading the modified version of the guideline would feel that that sort of wholesale copying is acceptable. TenOfAllTrades(talk) 15:40, 9 October 2010 (UTC)

Proposal

The first section currently ends as follows:

You can avoid any dispute concerning potential plagiarism by:

  • rewriting text completely into your own words, using multiple referenced sources;
  • marking any material you copy as a verbatim quote, using quotation marks, and referencing the source;[3]
  • properly attributing any public-domain, or free-content text, that you place directly into an article.

  1. ^ Montgomery, Paul L. "Ezra Pound: A Man of Contradictions", The New York Times, 2 November 1972.
  2. ^ Montgomery, Paul L. "Ezra Pound: A Man of Contradictions", The New York Times, 2 November 1972.
  3. ^ Note that the amount of text you quote from non-free sources must be limited to comply with non-free content guidelines.

Contrary to what the section title suggests, this does not try to define plagiarism but only gives a bright-line rule for those who want to play it safe. How about something like the following instead:

Defining and identifying plagiarism is not as easy as it may appear, but we can establish some bright lines:

Recognising obvious plagiarism
  • More than 20 words are copied from a source with no or minimal rephrasing. The source does not appear in a citation.
  • More than 20 words are copied from a source with no or minimal rephrasing. The source appears in a citation, but nothing indicates to the reader that information from the source was copied rather than independently rephrased or summarised.
Playing it safe
  • Rewrite text completely into your own words, using multiple referenced sources.
  • Mark any material you copy as a verbatim quote, using quotation marks, and referencing the source.[1]
  • Properly attribute any public-domain, or free-content, text that you place directly into an article.[2]

There is a huge area between these two bright lines, and most of it is plagiarism. That said, competent writers have techniques for copying text verbatim without quotation marks and still attributing it correctly to the source. Only try this if you are really sure you understand plagiarism better than 50% of American undergraduate students.[3]


  1. ^ Note that the amount of text you quote from non-free sources must be limited to comply with non-free content guidelines.
  2. ^ See 1911 for an example.
  3. ^ Roig, Miguel (1997), "Can undergraduate students determine whether text has been plagiarized?", The Psychological Record

I probably missed some important things, but maybe this can serve as inspiration for a version that satisfies all concerns. Hans Adler 17:40, 9 October 2010 (UTC)

I think it's a sensible direction; I like your "There is a huge area between these two bright lines, and most of it is plagiarism". "Playing it safe" is probably good verbiage there. I'm not really comfortable with "That said, competent writers have techniques for copying text verbatim without quotation marks and still attributing it correctly to the source." It may be worth noting that some view inline attribution as adequate to avoid plagiarism, but, according to the sources I quote above, copying text verbatim without quotation marks is plagiarism even where attributed, and a good many Wikipedians are likely to agree. It's not a question of the competence of the writer, but of the definition and standards of plagiarism adopted. There really just is not a bright line definition of plagiarism; it varies by culture and context. I also worry a bit about setting any number as a definitive standard under "obvious plagiarism." I'm afraid people will overlook the "most of it is plagiarism" bit that follows for lesser copying and defend with a "but I only copied 19 words!" Five or six words in a striking phrase can be plagiarism, particularly if uncited. --Moonriddengirl (talk) 18:52, 9 October 2010 (UTC)
Hans, the problem with playing it safe is that you encourage quote farming and very poor writing, and we see this too much on Wikipedia already. The couple lived "in the heart of Oxford," after paying "a large sum" for a "sunny apartment". It's very much to be avoided, so we need to make the point that text can be attributed by saying who said something. SlimVirgin talk|contribs 20:28, 9 October 2010 (UTC)
I think TenOfAllTrades has summed up the problem we face. If we put any number in there it will be abused by people of less than good faith. I have been involved in a number of plagiarism cases over the last year and the thing that has shocked me and made me cynical is that when it is pointed out to the offenders they at first plead ignorance (often claiming that they did it before this guideline was written), but then not one of them has helped clean up the mess they have created even though most of them are very prolific contributors to WP.
"non-free sources" is open to abuse because it has more than one meaning. As Frank Zappa once said, "If you can't be free, at least you can be cheap". I downloaded it from a non-pay-per-view site, so it cost me nothing, so it's free.
If we put in a limit it will be abused, because it is a question of judgement as to what constitutes plagiarism. For example, if it is a list of members of a parliament in alphabetic order, then it may be a direct copy over 1000 words long, or if it is the first sentence of a biography "John Smith (1 June 1920 - 10 January 2000) was a chemist who won the Nobel Prize for chemistry in 1950 for work on pork scratchings". In the last example SV gives, I think we should be encouraging people to summarise using other words if possible. I would have thought that all the phrases you have used were cliche enough to be used without quotes, but if the sentence in the original was "The couple lived in the heart of Oxford, after paying a large sum for a sunny apartment" then I think we should encourage its rewrite, e.g. "The couple bought an expensive flat in the centre of Oxford and [moved in during ...].". The whole "original" sentence is only 17 words long, so it fits comfortably under a 20-word limit which, if adopted, would allow someone just to copy the sentence from the original. So far from "encourage[ing] quote farming and very poor writing", I think the current wording is encouraging creative summarising and discouraging plagiarism.
Perhaps the solution lies in rearranging the information already in the guideline. How about grouping the three sections "Definition of plagiarism", "Why plagiarism is a problem" and "What is not plagiarism" as subsections under a new section heading? Then moving "What is not plagiarism" above "Why plagiarism is a problem" and moving the words after "You can avoid any dispute.." out of the definition section (as they are not definitions) into a new subsection called something like "How to avoid plagiarism", placed as the last subsection in the new section? This would allow people to see the already large number of things that, when copied, are not plagiarism, before suggesting rules for avoidance of plagiarism with other text. -- PBS (talk) 21:56, 9 October 2010 (UTC)
To be honest, I'd much prefer to see over-quotation and over-citation than under-citation. Wikipedia articles are often works in progress, and we expect each editor to contribute according to his strengths. After a 'researcher' identifies important sources and statements which ought to form the basis for an article (or which might be added to an existing article), a 'copyeditor' can mould that clumsily-assembled raw material into brilliant prose, while a 'wikignome' correctly formats the references and footnotes. At the same time, editors who engage excessively in quote-farming – or whose contributions habitually degrade well-written articles – can be taken aside and gently encouraged to improve their approach to contributing. It is far easier to repair an article which relies too heavily on clearly-cited quotations than one which contains an excess of concealed and unreported copying or too-close paraphrasing, largely because the former is obvious to casual inspection while the latter is not. Trying to discourage the use of quotation marks probably won't reduce the amount of quote mining; it will just increase the amount of inconspicuous plagiarism.
It's also possible that advice on how to reduce one's reliance on quotations could (or should) become part of some other policy or guideline. Generally, an overuse of quotations in article writing is more an issue of bad writing than one of plagiarism. While still troubling, it's a bit out of our scope here (except in the cases where quotations entirely or near-entirely substitute for an article's prose — but closely paraphrasing someone else's article to use as our own without quotes is equally problematic). TenOfAllTrades(talk) 23:17, 9 October 2010 (UTC)