Wikipedia:Wikipedia Signpost/Single/2014-07-30

The Signpost
Single-page Edition
WP:POST/1
30 July 2014

 

Knowledge or unreality?

  • Dariusz Jemielniak, Common Knowledge?: An Ethnography of Wikipedia (Stanford University Press 2014).
  • Charles Seife, Virtual Unreality: Just Because the Internet Told You, How Do You Know It's True? (Viking/Penguin 2014).

In Common Knowledge?: An Ethnography of Wikipedia, Dariusz Jemielniak (User:Pundit on the English and Polish Wikipedias, and a steward) discusses Wikipedia from the standpoint of an experienced editor and administrator who is also a university professor specializing in management and organizations. In Virtual Unreality: Just Because the Internet Told You, How Do You Know It's True?, journalism professor and author Charles Seife presents a more broadly themed work reminding us to question the reliability of information found throughout the Internet; he cites Wikipedia as a prime example of a website that contains enough misinformation to warrant caution before relying on it.

Jemielniak's Common Knowledge?: An Ethnography of Wikipedia

Dariusz Jemielniak (Pundit) in 2010

Jemielniak's book is an academic discussion of Wikipedia; he does not aim to present either a "how-to" guide for editors and readers or a complete history of the project. He states that his "book is a result of long-term, reflexive participative ethnographic research" performed as a "native anthropologist." (p. 193) (The word "ethnographic" in this context refers not to ethnicity in the quasi-racial sense, but to the study of a subgroup of the population—here, the subgroup that actively edits Wikipedia.) By this, Jemielniak means that he has spent several years as a Wikipedian, has introspected about his experiences throughout that time through the lens of his academic background, and has now written up his findings and conclusions. I don't think he means that he became active in Wikipedia for the purpose of doing research about it, although it seems quite possible that he started thinking about combining his editing hobby and his professional interests fairly early in his wiki-career.

I cannot pretend to evaluate Common Knowledge as a work of anthropology or of organizational management science. As a general reader and a Wikipedian, I found the book interesting as a compilation of incidents in Wikipedia's history, some of which I was already familiar with and some of which were new to me, and as a reminder of some issues the project faces as it moves forward. Non-academic readers may find the book lacking in a unifying theme, beyond that Wikipedia plays an important role in the world today that warrants academic study of its culture and communities. Jemielniak recently stated (on a Wikipediocracy thread) that "I wrote this book for academic research purposes, I absolutely have no hope of high sales (and honestly, I'll be surprised if it goes beyond 500 copies)." The book has been praised by Jimmy Wales, Clay Shirky, Jonathan Zittrain, and Zygmunt Bauman, and it deserves to sell well over 500 copies, but it won't be making the wiki-best-seller list either.

The eight chapters of Common Knowledge discuss basic rules governing Wikipedia, different roles contributors take on within the project, dispute resolution processes, and the nature of project leadership. The topics are illustrated with examples of disputes or controversies drawn primarily from English Wikipedia history (though controversies about actions by Jimbo Wales on Wikimedia Commons and Wikiversity are also mentioned). The incidents Jemielniak discusses are presented in detail and accurately, but some of them are ten years old and don't necessarily reflect the project's practices or realities today. For example, Jemielniak reviews the bitter and protracted disagreement on En-WP regarding when the historical German-language name "Danzig" should be used for the city now located in Poland and known as Gdańsk. Perhaps aided by his own geographical and historical background, he does an excellent job of presenting the history of the dispute, surveying the arguments for the different points of view, and explaining why the dispute-resolution process ultimately reached the result it did. He does not, however, discuss whether the Wikipedia of 2014 would address the same issue, if it were arising anew, in the same fashion that the much younger Wikipedia of 2003-2004 did.

Jemielniak also doesn't spend much time discussing how lessons learned from Wikipedia dispute-resolution experiences can be used to minimize future disputes or to improve future decision-making. I find this unfortunate, but I can't call it a fault of the book, both because ethnography is descriptive rather than prescriptive, and more importantly because the failure to take stock of dispute-resolution successes and failures has struck me for years as a project-wide myopia. In the 13½ years of English Wikipedia there have been, in round numbers, a billion edit-wars, yet no one knows whether most edit-wars get resolved by civil discussion reaching a consensus on the optimal wording, or by one side's giving up and wandering away (or sometimes by everyone's ultimately losing interest and wandering away). Similarly, the English Wikipedia Arbitration Committee has decided several hundred cases since 2004, and community discussions on noticeboards have resolved thousands more content and conduct disputes, yet no one ever seems to have gone back and conducted any systematic review of which approaches to dispute-resolution worked better than others. That's a different book that ought to be written, although it too risks selling fewer than 500 copies.

Speaking of ArbCom (which I'm prone to do since I've served on ours since 2008), Jemielniak mentions the Arbitration Committees of both the Polish Wikipedia and the English Wikipedia. He opens the book with an account of a Polish Wikipedia arbitration case that resulted in his being blocked from Po-WP for one day. He claims that in retrospect he accepts the ruling against him, but his account of the dispute makes that ruling sound terribly unfair—a cynical gesture of evenhandedness, but meted out to editors who didn't deserve to be treated evenhandedly. (But of course those of us who can't read Polish will never hear the other side of the story.)

The book's mentions of En-WP ArbCom are sound, but dated. He discusses the historical origin of the Committee as an extension of the original authority of Jimmy Wales, and cites a handful of Committee decisions, the most recent of which is an unusual case-motion from 2009. He does not spend much time on the current role of the Committee. That's actually a very defensible omission, because at least on English Wikipedia (I can't speak for other projects), while ArbCom has other responsibilities (some of which most of us don't particularly want), the importance of the Arbitration Committee as an arbitration committee has radically declined in the past few years. (I've discussed this decline here.) So Jemielniak's not spending nearly as much space discussing arbitration as one might expect in a book about Wikipedia hierarchies, leadership, and dispute resolution turns out to be a reasonable decision, but one that is not explained.

Although the academic style of Common Knowledge (and the price of the book) will deter some readers, Wikipedians who want a taste of Jemielniak's thinking about the project can find it in a recent article he contributed to Slate, "The Unbearable Bureaucracy of Wikipedia". In this article, aimed at a general rather than an academic audience, Jemielniak posits that Wikipedia's "increasingly legalistic atmosphere is making it impossible to attract and keep the new editors the site needs." It's a thoughtful article that identifies a significant issue, and its more direct approach, accompanied by concrete suggestions, makes it more accessible than Common Knowledge for non-specialist readers. All of us who want Wikipedia to thrive, which requires that the project welcome newcomers and facilitate their becoming regular editors, can hope for more such wisdom from this Pundit.

Seife's Virtual Unreality: Just Because the Internet Told You, How Do You Know It's True?

By contrast to Jemielniak's academic treatment specific to Wikipedia, Charles Seife—the author of Zero, Alpha and Omega, and Proofiness—has written a more broadly themed book about the unreliability of information found throughout the Internet. "Just because the Internet told you," the subtitle asks, "how do you know it's true?" Now at one level, the fact that the Internet contains a fair amount of misinformation is not breaking news; "Someone is wrong on the internet" became a meme and then a cliché for a reason. Lots of us think we're sophisticated enough to avoid falling into the kinds of traps that Seife warns us about—but the warnings in Seife's book are important and timely nevertheless.

Wikipedia is just one of the many online sources of bad information that Seife discusses, but for obvious reasons it's the one I'll focus on here. Seife catalogs a dozen instances in which deliberate misinformation was introduced into Wikipedia. Such misinformation is inserted into Wikipedia, perhaps every day, by a miscellaneous array of pranksters, hoaxers, vandals, defamers, and in a few instances by Wikipedia critics conducting so-called "breaching experiments" to see how long a falsehood placed in Wikipedia stays in Wikipedia. (Such experiments are not permitted; see also Wikipedia:Do not create hoaxes.) Some of Seife's examples will be well-known to "Signpost" readers, such as the Colbert-inspired tripling of elephants and the Bicholim Conflict; others were new to me, such as AC Omonia Nicosia and the Edward Owens hoax.

Experienced Wikipedians are well aware of this problem, as are our critics. English Wikipedia, in what can equally be considered admirable self-criticism or self-absorbed navel-gazing, contains discussions of hoaxes on Wikipedia; we also have a lengthy List of hoaxes on Wikipedia; and another compilation recently appeared on the critic site Wikipediocracy.

Misinformation in the media has always been with us (Tom Burnham's books were favorites of mine growing up, and I'm mildly dismayed that Burnham's name comes up a redlink), but it certainly is possible to spread false information more rapidly online than it was in the analog era. Of course, it is possible to spread correct information more rapidly as well. A particular problem is misinformation posted on Wikipedia—and elsewhere all over the Internet—with the purpose of doing harm to someone. (A prime example of this sort of thing is the Qworty fiasco that unfolded last year.) Any falsehoods in article content damage the credibility and usefulness of the encyclopedia we are collaboratively writing, but intentional falsehoods posted by a subject's personal or political or ideological enemies with the malicious intent to defame or damage a living person do so tenfold. I am confident that well over 99% of Wikipedia pages are free of intentional falsehoods—yet no one can deny that Wikipedia articles must still contain far too many lies, damn lies, and sadistics.

Neither Seife nor Jemielniak says much about the biographies of living persons policy and its enforcement, although many Wikipedians, including myself, have long thought fair treatment of our article subjects to be the central ethical issue affecting the project. I know that when I've been defamed online I didn't enjoy it, and that Wikipedia BLP subjects feel the same way when their number-one Google hit has been edited in nasty ways by their personal or political or ideological enemies. (The good news is that when I or others spot defamation on Wikipedia we are often able to do something about it; I've often wished that I had an "edit" and a "delete" button that I could use on the rest of the Internet.)

Seife's discussion of misinformation on Wikipedia focuses on intentionally false information, but a greater number of inaccuracies are introduced by editors who make honest mistakes than by hoaxers and vandals. Sometimes mistakes are made by good editors who inadvertently type the wrong word or misread a source. Other times, we encounter a good-faith editor who wants to help build Wikipedia but, at least in a given topic-area, simply doesn't know what he or she is talking about. Wikipedia has no systematic quality control beyond surmounting the bar for deletion, at least until one seeks to bring an article to the mainpage or have it rated (at which point various sorts of flyspecking take place, some of which can be overdone, but that's another discussion). On English Wikipedia today, there are dedicated noticeboards to address conflict-of-interest issues, evaluate the reliability of sources, solve copyright problems (some quite abstruse), keep fringe theories in check, and put a stop to edit-warring. I've never seen anyone wonder why there's no dedicated noticeboard where one goes for help in figuring out whether questionable information in an article is accurate or not.

Despite the falsehoods he identifies, all of which have now been removed, Seife acknowledges that "by some measures one can argue that Wikipedia is roughly as accurate as its paper-and-ink competitors." (p. 29) He cites the well-known 2005 Nature article comparing the accuracy of Wikipedia's scientific content to that of a canonical, traditional reference source, the Encyclopedia Britannica. One continues to read of comparisons of Wikipedia with traditional library reference books (see Reliability of Wikipedia). The Wikipedia community should certainly aspire for our encyclopedia to land on the favorable side of such comparisons. I think that on balance it does.

But "Wikipedia vs. Britannica" is no longer the right question, or at least not the only right question. At least equally relevant today is how Wikipedia's completeness and fairness and accuracy compare, not only to traditional media sources, but to the other information available on the Internet. Wikipedia has evolved as part of, not independent of, the Internet as a whole. And it is the Internet as a whole, not just Wikipedia, that has changed the population's information-searching habits, so that today when one needs or wants to look something up, one does so on the computer or a handheld device rather than in a book or a (hard-copy) journal or newspaper. In the unlikely event that Wikipedia (and all of its mirrors and derivatives) were to disappear tomorrow (and not be replaced by a similar site), our readers from schoolchildren to senior citizens would not revert to the habits of 25 years ago and start trooping to the library or even the reference shelves in their living rooms when they wanted to check a fact. (I am not saying this is a good thing or a bad thing, though it has elements of both; it is simply a truth.)

Instead, people in the wikiless world would still perform the same Google searches that today bring up their subject's Wikipedia article as a top-ranking hit. They would find the same results, minus Wikipedia, and they would look at the other top-ranking hits on their subject instead. Would those pages, on average, provide better-written, better-sourced, more accurate, and more fair coverage of their subject than the corresponding Wikipedia pages? And to the extent the answer is yes, how do we link the best of that content to become accessible from Wikipedia? A future Wikipedia scholar may wish to focus more on these questions (and produce another 495-copy-selling book).

Seife rather kindly refrains from discussing in the book, as an example of a questionable Wikipedia page, his own BLP. Predictably, that page is the first Google hit on Seife's name (his own webpage at NYU is second). Unfortunately, the article bears a prominent, disfiguring banner at the top of the page, proclaiming that:

This article may require cleanup to meet Wikipedia's quality standards. The specific problem is: Article does not meet Wikipedia standards for quality. Please help improve this article if you can. (June 2013)

Now, no well-informed reader of Wikipedia would take this pronouncement alleging that Charles Seife is an ill-written article as a reflection on Charles Seife. (If anything, the obvious circular reasoning suggests sloppiness in the crafting of the tag.) After all, the reader would know that Charles Seife wouldn't have written the article and, as a matter of our conflict-of-interest guidelines, is discouraged from editing the article at all, much less improving its overall editorial quality. Nonetheless, it isn't exactly encouraging that in the 13 months since an anonymous IP editor added that tag, no one has improved the article enough to resolve the quality concern and remove the tag. If I were notable enough to warrant a Wikipedia BLP and this were the state of it for over a year, I think I'd have the right to be ticked off. (Cynical aside to editors interested in Wikipedia's public relations: improve the BLPs of journalists likely to cover us.)

Meanwhile, in a recent radio interview—which is well worth listening to—Seife claims that Wikipedia gets four or five facts of his life wrong (not controversial claims, he says, just basic facts, though he doesn't name them), which knowing about the COI guideline he didn't fix. (Aside to Charles Seife: let me know about the non-controversial fixes needed and I'll make them myself. You won't need to go to The New Yorker à la Philip Roth.)

The bottom line on these two books: Wikipedians should read (and think carefully about) Jemielniak's Slate article, but only the hardier ones among us will gain the full benefit of his book, although all of us should thank him for writing it. More Wikipedians will enjoy Seife's book, though only a sliver of it is about Wikipedia, and perhaps everyone should listen to his radio interview, although for many of us both the book and interview will reinforce, rather than challenge, our existing views about the reliability of the information that surrounds us.

Ira Brad Matetsky is a New York attorney. He edits as Newyorkbrad on the Wikimedia projects, having first registered on the English Wikipedia in 2006. He has been a member of the English Wikipedia Arbitration Committee since 2008.
The views expressed in this book review are those of the author alone; responses and critical commentary are invited in the comments section. A previous review of the Polish translation of Jemielniak's book is archived here.

Shifting values in the paid content debate; cross-language vandalism detection; translations from 53 Wiktionaries

A monthly overview of recent academic research about Wikipedia and other Wikimedia projects, also published as the Wikimedia Research Newsletter.

Understanding shifting values underlying the paid content debate on the English Wikipedia

See related Signpost content: "Extensive network of clandestine paid advocacy exposed", "With paid advocacy in its sights, the Wikimedia Foundation amends their terms of use"
Reviewed by Heather Ford

Kim Osman has performed a fascinating study[1] on the three 2013 failed proposals to ban paid advocacy editing in the English language Wikipedia. Using a Constructivist Grounded Theory approach, Osman analyzed 573 posts from the three main votes on paid editing conducted in the community in November, 2013. She found that editors who opposed the ban felt that existing policies of neutrality and notability in WP already covered issues raised by paid advocacy editing, and that a fair and accurate encyclopedia article could be achieved by addressing the quality of the edits, not the people contributing the content. She also found that a significant challenge to any future policy is that the community 'is still not clear about what constitutes paid editing'.

Osman uses these results to argue that there has been a transition in the values of the English-language Wikipedia editorial community, from seeing commercial involvement as being in direct opposition to Wikipedia's core values (a view repeated at the institutional level by the Wikimedia Foundation and Jimmy Wales, who see a bright line between paid and unpaid editing) to an acceptance of paid professionals and a resignation to their presence.

Osman argues that the romantic view of Wikipedia as a system somehow apart from the commercial market that characterized earlier depictions (such as those by Yochai Benkler) has been diluted in recent years and that sustainability in the current environment is linked to a platform's ability to integrate content across multiple places and spaces on the web. Osman also argues that these shifts reflect wider changes in assumptions about commerciality in digital media and that the boundaries between commercial and non-profit in the context of peer production are sometimes fuzzy, overlapping and not clearly defined.

Osman's close analysis of 573 posts is a valuable contribution to the ongoing policy debate about the role of paid editing in Wikipedia and will hopefully be used to inform future debates.

"Pivot-based multilingual dictionary building using Wiktionary"

Reviewed by Maximilian Klein
Figure caption: Straight edges represent translation pairs extracted directly from the Wiktionaries. The pair guild–breasla was found via triangulation.

Building multilingual dictionaries to and from every language is, combinatorially, a lot of work. If one uses triangulation (if A means B, and B means C, then A means C; see figure), much of that work can be done by machine. A large closed-source effort did this in 2009[supp 1], but a new paper by Ács[2] counters that "while our methods are inferior in data size, the dictionaries are available on our website"[supp 2]. The approach used the translation tables from 53 Wiktionaries to infer 19 million translations beyond the 4 million already occurring in Wiktionary. The researchers steered clear of classical problems such as polysemy (one word having multiple meanings) by using a machine learning classifier. The features used in the classifier were based on the graph-theoretic attributes of each possible word pair. For instance, the availability of two or more intermediate "pivot" languages for a translation turned out to be a good indicator of a valid match. To test the precision of these translations, manual spot checking was done, which found a precision of 47.9% for newly found word pairs versus 88.4% for random translations taken directly from Wiktionary. Recall was tested as coverage of a collection of 3,500 common words: 83.7% of those words were accounted for by automatic triangulation in the top 40 languages. That means that if we tried right now to make a 40-language pocket phrasebook for travelling around most of the world using only Wiktionary, there would be a translation about 85% of the time, and it would be roughly 50–85% correct.
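The triangulation idea described above can be sketched in a few lines of Python. This is only an illustration, not the paper's method or its released code: the word pairs are toy examples chosen for this sketch, the function names are hypothetical, and the single feature computed here (the number of distinct pivot languages supporting a candidate pair) stands in for the richer graph-theoretic features that the classifier reportedly used.

```python
from collections import defaultdict
from itertools import combinations

# Toy direct translation pairs, each word tagged with a language code.
# Illustrative data only; these are not taken from the paper.
DIRECT_PAIRS = [
    (("en", "guild"), ("de", "Zunft")),
    (("de", "Zunft"), ("ro", "breaslă")),
    (("en", "guild"), ("hu", "céh")),
    (("hu", "céh"), ("ro", "breaslă")),
    (("en", "dog"), ("de", "Hund")),
]


def build_graph(pairs):
    """Build an undirected graph: word -> set of directly linked words."""
    graph = defaultdict(set)
    for a, b in pairs:
        graph[a].add(b)
        graph[b].add(a)
    return dict(graph)


def triangulate(pairs):
    """Infer pairs A-C wherever A-B and B-C exist but A-C does not.

    Returns a dict mapping each inferred pair (a frozenset of two words)
    to the set of pivot words B that support it.
    """
    graph = build_graph(pairs)
    inferred = defaultdict(set)
    for pivot, neighbours in graph.items():
        for a, c in combinations(sorted(neighbours), 2):
            same_language = a[0] == c[0]
            already_known = c in graph.get(a, ())
            if not same_language and not already_known:
                inferred[frozenset((a, c))].add(pivot)
    return dict(inferred)


def pivot_language_count(pivots):
    """Number of distinct pivot languages supporting a candidate pair."""
    return len({lang for lang, _word in pivots})


if __name__ == "__main__":
    for pair, pivots in triangulate(DIRECT_PAIRS).items():
        words = " / ".join(f"{lang}:{word}" for lang, word in sorted(pair))
        print(words, "pivot languages:", pivot_language_count(pivots))
```

In this toy graph the English–Romanian pair is reachable through both a German and a Hungarian pivot, which is the kind of evidence the review describes as a good indicator of a valid match. A pair supported by only one pivot would be a weaker candidate, since a polysemous pivot word can link two unrelated meanings.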

This performance would likely need to increase before any results could be operationalized and contributed back into Wiktionary. However, given the fact that the code used to parse and compare 43 different Wiktionaries was also released on GitHub[supp 3], that goal is a possibility. It's yet another testament to the open ecosystem to see a Wikimedia project along with Open Researcher efforts make a resource to rival a closed standard. While Ács' research isn't the holy grail of translation between arbitrary languages, it cleverly mixes established theory and open data, and then contributes it back to the community.

"Cross Language Learning from Bots and Users to detect Vandalism on Wikipedia"

Reviewed by Han-Teng Liao (talk)

A new study[3] by Tran and Christen is the latest example of academic research on vandalism detection, which has been developed over the years[supp 4] in the context of the PAN workshop[supp 5], where researchers develop both corpus data and tools to uncover plagiarism, authorship, and the misuse of social media/software. This work should be of interest to both researchers and Wikipedians because of (a) the need to detect vandalism and (b) the interesting question of whether such vandalism-fighting data and tools are transferable or portable from one language version to another. The vandalism-fighting corpus and tools have both practical and theoretical implications for understanding the cross-lingual transfer of knowledge and bots.

In 2010 and 2011, PAN included Wikipedia vandalism detection competitions as workshops. The effort started with Martin Potthast's work on building the free-of-charge PAN Wikipedia vandalism corpus for research, PAN-WVC-10, which compiled 32,452 edits on 28,468 Wikipedia articles, among which 2,391 vandalism instances were identified by human coders recruited from Amazon's Mechanical Turk[supp 6]. In 2011, a larger crowdsourced corpus of 30,000+ Wikipedia edits was released in three languages: English, German, and Spanish[supp 7], with 65 features to capture vandalism.

Based on even larger datasets of over 500 million revisions across five languages (en:English, de:German, es:Spanish, fr:French, and ru:Russian), Tran & Christen's latest work adds to the efforts by applying several supervised machine learning algorithms from the Scikit-learn toolkit[supp 8], including Decision Tree (DT), Random Forest (RF), Gradient Tree Boosting (GTB), Stochastic Gradient Descent (SGD) and Nearest Neighbour (NN).

What Tran & Christen confirm from their findings is that "distinguishing the vandalism identified by bots and users show statistically significant differences in recognizing vandalism identified by users across languages, but there are no differences in recognizing the vandalism identified by bots" (p.13). This demonstrates that human beings can recognize a much wider spectrum of vandalism than bots, but that bots can still be trained to become more sophisticated and to capture more and more nonobvious cases of vandalism.

Tran & Christen further try to make the case for the benefits of cross-language learning of vandalism. They argue that the detection models are generalizable, based on the positive results of transferring the machine-learned capacity from English to other, smaller Wikipedia languages. While they are optimistic, they acknowledge that such generalization has at best been proven among some of the languages they studied (all of which are Roman-alphabet-based except for Russian), and they note the poor performance of the Russian language model. Thus, Tran & Christen rightly point out the need for research on non-English and especially non-European language versions. They also recognize that many word-based features are no longer useful for some languages such as Mandarin Chinese, because of tokenization and other language-specific issues.

Tran & Christen call for future research projects to include languages such as Arabic and Mandarin Chinese to complete the United Nations working set of languages. It will be interesting to see how such research projects can be executed and how the greater Wikipedia research and editor community can help and/or use such research efforts.

Readers' interests differ from editors' preferences

Reviewed by Piotr Konieczny.

A conference paper titled "Reader Preferences and Behavior on Wikipedia"[4] deals with the under-studied population of Wikipedia readers. The paper provides a useful literature review of the few studies about the reading preferences of that group. The researchers used publicly available page view data and, more interestingly, were able to obtain browsing data (such as time spent by a reader on a given page). Since such data is unfortunately not collected by Wikipedia, the researchers obtained this data through volunteers using a Yahoo! toolbar. The authors used Wikipedia:Assessment classes to gauge article quality.

The paper offers valuable findings, including important insights for the Wikipedia community, namely that "the most read articles do not necessarily correspond to those frequently edited, suggesting some degree of non-alignment between user reading preferences and author editing preference". This is not a finding that should come as much of a surprise, considering for example the high percentage of quality military history articles produced by the WikiProject Military History, one of the most active if not the most active WikiProject in existence, and how little importance this topic has for the general population. Statistics on topic popularity and the quality of corresponding articles can be seen in Table 1, page 3 of the article. Figure 1 on page 4 is also of interest, presenting a matrix of articles grouped by popularity and length. For example, the authors identify the area of "technology" as the 4th most popular, but the quality of its articles lags behind many other fields, placing it around 9th place. It would be a worthwhile exercise for the Wikipedia community to identify popular articles that are in need of more attention (through revitalizing tools like Wikipedia:Popular pages, perhaps using code that makes WikiProject popular pages listing work?) and direct more attention towards what our readers want to read about (rather than what we want to write about). Finally, the authors also identify different reading patterns, and suggest how those can be used to analyze an article's popularity in more detail.

Overall, this article seems like a very valuable piece of research for the Wikipedia community and the WMF, and it underscores why we should reconsider collecting more data on our readers' behavior. In order to serve our readers as best as we can, more information on their browsing habits on Wikipedia could help to produce more valuable research like this project.

Wikipedia from the perspective of PR and marketing

Reviewed by Piotr Konieczny.

An article[5] in "Business Horizons", written in very friendly prose (not a common finding among academic works), looks at Wikipedia (as well as some other forms of collaborative, Web 2.0 media) from the business perspective of public relations/marketing studies. Of particular interest to the Wikipedia community is the authors' goal of presenting "the three bases of getting your entry into Wikipedia, as well as a set of guidelines that help manage the potential Wikipedia crisis that might happen one day." The authors correctly recognize that Wikipedia has policies that must be adhered to by any contributors, though a weakness of the paper is that while it discusses Wikipedia concepts such as neutrality, notability, verifiability, and conflict of interest, it does not link to them. The paper provides a set of practical advice on how to get one's business entry on Wikipedia, or how to improve it. While the paper does not suggest anything outright unethical, it is frank to the point of raising some eyebrows. While nobody can disagree with advice such as "as a rule of thumb, try to remain as objective and neutral as possible" and "when in doubt, check with others on the talk page to determine whether proposed changes are appropriate", given the lack of consensus among Wikipedia's community on how to deal with for-profit and PR editors, other advice such as "maximize mentions in other Wikipedia entries" (i.e. gaming WP:RED), "be associated with serious contributors...leverage the reputation of an employee who is already a highly active contributor... [befriend Wikipedians in real life]", "When correcting negative information is not possible, try counterbalancing it by adding more positive elements about your firm, as long as the facts are interesting and verifiable", "...you might edit the negative section by replacing numerals (99) with words (ninety-nine), since this is also less likely to be read. Add pictures to draw focus away from the negative content" might be seen as more controversial, falling into the gaming-the-system gray area. The "Third, get help from friends and family" section in particular seems to fall foul of meatpuppetry.

In the end, this is an article worth reading in detail by all interested in the PR/COI topics, though for better or worse, the fact that it is closed access will likely reduce its impact significantly. On an ending note, one of the article's two co-authors has a page on Wikipedia at Andreas Kaplan. It was restored by a newbie editor in 2012, two years after its deletion, and has been maintained by throw-away SPAs; this reviewer cannot help but notice that it still seems to fail Wikipedia:Notability (academics)...

"No praise without effort: experimental evidence on how rewards affect Wikipedia's contributor community"

Reviewed by Piotr Konieczny.

In 2012, the authors of this paper[6] gave out over a hundred barnstars to the top 1% most active Wikipedians, and concluded that such awards improve editors' productivity. This time they repeated this experiment while broadening their sample to the top 10% most active editors. After excluding administrators and recently inactive editors, they handed out 300 barnstars "with a generic positive text that expressed community appreciation for their contributions", divided between the 91st–95th, 96th–99th, and 100th percentiles of the most active editors (corresponding, from most to least active, to an average of 282, 62 and 22 edits per month). They then tracked the activity of those editors, as well as of a corresponding control sample which did not receive any award. The experiment was designed to test the hypothesis that less active contributors will be responsive to rewards, similar to the most highly active contributors from the prior research.

The authors found, however, that rewarding less productive editors did not stimulate higher subsequent productivity. They note that while the top 1% group responded to an award with an increase in productivity (measured at a rather high 60% increase), less productive subjects did not change their behavior significantly. The researchers also noted that while some of the top 1% editors received an additional award from other Wikipedians, not a single subject from the less active group was a recipient of another award.

The researchers conclude that "this supports the notion that peer production’s incentive structure is broadly meritocratic; we did not observe contributors receiving praise or recognition without having first demonstrated significant and substantial effort." While this will come as little surprise to the Wikipedia community, their other observation - that outside the top 1% of editors, awards such as barnstars have little meaningful impact - is more interesting.

Further, the authors found that while rewarding the most active editors tends to increase their retention ratio, it may counter-intuitively decrease the retention ratio of the less active editors. The authors propose the following explanation: "Premature recognition of their work may convey a different meaning to these contributors; instead of signaling recognition and status in the eyes of the community, these individuals may perceive being rewarded as a signal that their contributions are sufficient, for the time being, or come to expect being rewarded for their contributions." They suggest that this could be better understood through future research. For the community in general, it raises an interesting question: how should we recognize less active editors, to make sure that thanking them will not be taken as "you did enough, now you can leave"?

Briefly

  • Wikipedia assignments improve students' research skills: It is refreshing to see a continuing and growing stream of academic works endorsing various aspects of the teaching-with-Wikipedia paradigm. A study[7] of eleven students "enrolled in a semester-long academic literacy course in a preparatory program for study at an Australian university... showed an educationally statistical improvement in the students’ research skills, while qualitative comments revealed that despite some technical difficulties in using the Wikipedia site, many students valued the opportunity to write for a ‘real’ audience and not just for a lecturer."
  • A split in the growing field of Chinese-language Wikipedia research: A blog post[8] by Han-Teng Liao (廖漢騰) presents an interesting exploratory overview of Chinese-language research on Wikipedia. The findings suggest that Chinese-language scholars and academic publication outlets are increasingly doing research in the field of Wikipedia studies; however, there's "a divide between mainland Chinese academic sources/search results on one hand, and Hong Kong/Taiwanese ones on the other." The reason for this seems to be primarily technical, as scholars from different regions seem to publish in different outlets, which in turn are not indexed in the academic search engines preferred by those from other regions.

Other recent publications

A list of other recent publications that could not be covered in time for this issue – contributions are always welcome for reviewing or summarizing newly published research.

  • "Uneven Openness: Barriers to MENA [Middle East/North Africa] Representation on Wikipedia"[9] (blog post)
  • "Detecting epidemics using Wikipedia article views: A demonstration of feasibility with language as location proxy"[10]
  • "The Reasons of People Continue Editing Wikipedia Content - Task Value Confirmation Perspective"[11]
  • "Circling the Infinite Loop, One Edit at a Time: Seriality in Wikipedia and the Encyclopedic Urge"[12]
  • "Identifying Duplicate and Contradictory Information in Wikipedia"[13]
  • "The impact of elite vs. non-elite contributor groups in online social production communities: The case of Wikipedia"[14]
  • "What do we Think an Encyclopaedia is?"[15] From the abstract: "Based on survey and interview research carried out with publishers, librarians and higher education students, [this article] demonstrates that certain physical features and qualities are associated with the encyclopaedia and continue to be valued by them. Having identified these qualities, the article then explores whether they apply to three incidences of electronic encyclopaedias, Britannica Online, The Stanford Encyclopedia of Philosophy and Wikipedia."
  • "Crowdsourcing Knowledge Interdiscursive Flows from Wikipedia into Scholarly Research"[16]. From the abstract: "using a dataset collected from the Scopus research database, which is processed with a combination of bibliometric techniques and qualitative analysis [this article finds] that there has been a significant increase in the use of Wikipedia as a reference within all areas of science and scholarship. Wikipedia is used to a larger extent within areas like Computer Science, Mathematics, Social Sciences and Arts and Humanities, than in Natural Sciences, Medicine and Psychology."
  • "How Readers Shape the Content of an Encyclopedia: A Case Study Comparing the German Meyers Konversationslexikon (1885-1890) with Wikipedia (2002-2013)"[17]

References

  1. ^ Osman, Kim (2014-06-17). "The Free Encyclopaedia that Anyone can Edit: The Shifting Values of Wikipedia Editors". Culture Unbound: Journal of Current Cultural Research. 6 (3): 593–607. doi:10.3384/cu.2000.1525.146593.
  2. ^ Ács, Judit (May 26–31, 2014). "Pivot-based multilingual dictionary building using Wiktionary" (PDF).
  3. ^ Tran, Khoi-Nguyen; Christen, P. (2014). "Cross Language Learning from Bots and Users to detect Vandalism on Wikipedia". IEEE Transactions on Knowledge and Data Engineering. 27 (3): 673–685. doi:10.1109/TKDE.2014.2339844.
  4. ^ Reader Preferences and Behavior on Wikipedia. HT’14, September 1–4, 2014, Santiago, Chile. http://www.dcs.gla.ac.uk/~mounia/Papers/wiki.pdf
  5. ^ Kaplan, Andreas; Haenlein, Michael (2014). "Collaborative projects (social media application): About Wikipedia, the free encyclopedia". Business Horizons. 57 (5): 617–626. doi:10.1016/j.bushor.2014.05.004. (closed access)
  6. ^ Restivo, Michael; van de Rijt, Arnout (2014). "No praise without effort: experimental evidence on how rewards affect Wikipedia's contributor community". Information, Communication & Society. 17 (4): 451–462. doi:10.1080/1369118X.2014.888459.
  7. ^ Miller, Julia (2014-06-13). "Building academic literacy and research skills by contributing to Wikipedia: A case study at an Australian university". Journal of Academic Language and Learning. 8 (2): A72–A86.
  8. ^ Liao, Han-Teng (2014-06-20). "Chinese-language literature about Wikipedia: a meta-analysis of academic search engine result pages".
  9. ^ Graham, Mark; Hogan, Bernie (2014-04-29). "Uneven Openness: Barriers to MENA Representation on Wikipedia". SSRN 2430912.
  10. ^ Generous, Nicholas; Fairchild, Geoffrey; Deshpande, Alina; Del Valle, Sara Y.; Priedhorsky, Reid. "Detecting epidemics using Wikipedia article views: A demonstration of feasibility with language as location proxy". arXiv:1405.3612v1 [cs.SI].
    Revised and published as Generous, Nicholas; Fairchild, Geoffrey; Deshpande, Alina; Del Valle, Sara Y.; Priedhorsky, Reid (2014). "Global Disease Monitoring and Forecasting with Wikipedia". PLOS Computational Biology. 10 (11): e1003892. arXiv:1405.3612. Bibcode:2014PLSCB..10E3892G. doi:10.1371/journal.pcbi.1003892. PMC 4231164. PMID 25392913.
  11. ^ Lai, Cheng-Yu; Heng-Li Yang (2014). "The Reasons of People Continue Editing Wikipedia Content - Task Value Confirmation Perspective". Behaviour & Information Technology. 33 (12): 1371–1382. doi:10.1080/0144929X.2014.929744.
  12. ^ Salor, E.: Circling the Infinite Loop, One Edit at a Time: Seriality in Wikipedia and the Encyclopedic Urge. In Allen, R. and van den Berg, T. (eds.) Serialization in Popular Culture. London: Routledge p.170 ff.
  13. ^ Weissman, Sarah; Ayhan, Samet; Bradley, Joshua; Lin, Jimmy (2014-06-04). "Identifying Duplicate and Contradictory Information in Wikipedia". arXiv:1406.1143 [cs.IR].
  14. ^ Mihai Grigore, Bernadetta Tarigan, Juliana Sutanto and Chris Dellarocas: "The impact of elite vs. non-elite contributor groups in online social production communities: The case of Wikipedia". SCECR 2014 PDF
  15. ^ Schopflin, Katharine (2014-06-17). "What do we Think an Encyclopaedia is?". Culture Unbound: Journal of Current Cultural Research. 6 (3): 483–503. doi:10.3384/cu.2000.1525.146483.
  16. ^ Lindgren, Simon (2014-06-17). "Crowdsourcing Knowledge Interdiscursive Flows from Wikipedia into Scholarly Research". Culture Unbound: Journal of Current Cultural Research. 6 (3): 609–627. doi:10.3384/cu.2000.1525.146609.
  17. ^ Spree, Ulrike (2014-06-17). "How Readers Shape the Content of an Encyclopedia: A Case Study Comparing the German Meyers Konversationslexikon (1885-1890) with Wikipedia (2002-2013)". Culture Unbound: Journal of Current Cultural Research. 6 (3): 569–591. doi:10.3384/cu.2000.1525.146569.
Supplementary references and notes:
  1. ^ Mausam; Soderland, Stephen; Etzioni, Oren; Weld, Daniel S.; Skinner, Michael; Bilmes, Jeff (2009). Compiling a Massive, Multilingual Dictionary via Probabilistic Inference. pp. 262–270. ISBN 978-1-932432-45-9.
  2. ^ "Hungarian Front Page".
  3. ^ "wiki2dict github". GitHub.
  4. ^ For example, in 2013 only two languages are studied [1] in contrast to the five languages reported in this 2014 journal article.
  5. ^ http://pan.webis.de/
  6. ^ See [2]
  7. ^ See [3]
  8. ^ Scikit-learn is an open source project in Python for machine-learning



2014-07-30

How many more hoaxes will Wikipedia find?

Another hoax on the English Wikipedia was uncovered this week, not through any thorough investigation but through the self-disclosure of an anonymous change made when the editors were in their sophomore year of college. The deliberate misinformation had been in the article for over five years, and although plenty of individuals noticed it, not one questioned its authenticity. This leads to one obvious question: how many more are there?

Amelia Bedelia is a fictional character used by children's book author Peggy Parish and her nephew Herman Parish, who stepped in to continue the series after the former's death in 1988. Bedelia is over 50 years old and is literal-minded to the extreme. According to publisher HarperCollins, "When she makes a sponge cake, she puts in real sponges. When she weeds the garden, she replants the weeds. And when she pitches a tent, she throws it into the woods!" The New York Times Book Review noted that "No child can resist Amelia and her literal trips through the minefield of the English language—and no adult can fail to notice that she's usually right when she's wrong." Writer Cynthia Samuels continued:


However, Peggy Parish would likely be the first to tell her readers that her main character was not based on a maid in Cameroon.

Nor did she spend some "formative years" there.

Yet this is precisely what the Wikipedia article on Amelia Bedelia had said since January 2009: "Amelia Bedelia's character is based on a maid in Cameroon, where the author spent some time during her formative years. Her vast collection of hats, notorious for their extensive plumage, inspired Parish to write an assortment of tales based on her experiences in North Africa."

The hoax was only revealed when EJ Dickson, a journalist and one of the two original hoax editors, noticed a series of tweets including one from Jay Caspian Kang, an editor for the New Yorker, that highlighted the text Dickson wrote five years earlier. In her words, "It was total bullshit ... It was the kind of ridiculous, vaguely humorous prank stoned college students pull, without any expectation that anyone would ever take it seriously." Her co-editor Evan continued, "I feel like we sort of did it with the intention of seeing how fast it would take to get it taken down [by Wikipedia editors]".

Their edits were removed after Dickson publicized her edits in the Daily Dot.

Historical hoaxes

Hoaxes have a lengthy history on Wikipedia. The longest-lasting hoax was a two-sentence, obscure biography of Gaius Flavius Antoninus, who was supposedly a Roman politician who helped assassinate Julius Caesar in 44 BCE.

At least 23 known hoaxes have lasted for five to six years, including an article on an equally obscure alleged war between Portugal and the Maratha Empire of modern-day India. Wikipedia editor A-b-a-a-a-a-a-a-b-a, who is now indefinitely blocked, wrote that this "Bicholim conflict" took place in 1640–41 and the resulting peace treaty played a major role in Portugal's keeping control over Goa until the 1960s. At the time it was exposed as a hoax, the meticulously created article had held good article status for five years. It was over 4300 words long, and had about 150 citations.

John Tyler, the tenth president of the United States, supposedly sent federal troops into Michigan in 1843 against a secessionist movement spawned from an equally farcical Canadian-Michigander conflict over the Upper Peninsula of Michigan.

Numerous hoaxes have existed for shorter amounts of time. Among the most colorful was another painstakingly detailed entry on the Upper Peninsula War. Boasting 23 references in its bibliography, this fake article chronicled a struggle between the United States, Canada, and nascent separatists in Michigan spawned from a disputed territorial line in the Upper Peninsula. It ended with the massacre of numerous Canadian troops (along with 80–120 civilians suspected of being Canadian co-conspirators), and the arrest and execution of Michigan's governor.

This fantastical story turned out to be a success story for Wikipedia: the hoax, despite the effort that had been put into it, was caught, nominated for deletion within a week of creation, and disposed of.

What's left?

With this latest hoax revelation, how many more are out there? An op-ed published in the Signpost last year argued that studies show Wikipedia is very accurate and false information is near the level of statistical irrelevance. When hoaxes do occur, they "have reached great prominence, true, but they are small in number, and they can be caught." According to the author, "Wikipedia is generally fairly effective (if not perfect) at keeping its information clean and rid of errors."

Yet just by itself, the Bedelia hoax caused a number of others to be revealed in comment threads discussing the case, including false ghost stories and a new origin story for the corporate name Verizon. Dickson's article also referenced a prior hoax regarding the alleged inventor of S'mores; one of those claimed inventors even had their own biography article which was deleted last year, but not before being cited in a number of books. How many more remain hidden in plain sight?

Though not a defense, these problems of falling for false information are not new. John McIntyre, a copyeditor for The Baltimore Sun and a noted critic of Wikipedia, also wrote about this latest hoax, and noted that those who were duped showed a "hardly novel" combination of laziness and gullibility, as demonstrated long ago by H.L. Mencken's 1917 Bathtub hoax.

Still, as EJ Dickson's article concluded, "I learned from my inadvertent Wikipedia hoax ... not that Wikipedia itself isn't reliable, but that ... many people believe it is." Numerous examples of Bedelia's alleged Cameroonian origins have been written about by scholars, bloggers, academics, and apparently even the current author of the series himself, who reportedly told a journalist in 2009 that the character was based on "a French colonial maid in Cameroon." The fact that these hoaxes are not caught for such a long time does not mean they cannot be caught—a discerning editor looking for questionable claims and lack of citations may spot them.

But the average reader using Wikipedia will likely not.

In brief

  • Swedish Wikipedia hits yet another milestone: The Swedish Wikipedia, which as of two weeks ago was the fourth-largest Wikipedia (behind the Dutch and German), is now the second-largest Wikipedia. In that timeframe, Swedish has grown by about 65,000 articles, compared to the German's 6,000 and the Dutch's 2,000—principally thanks to bot-created articles, a process that has proved controversial among Wikipedians.
  • New iOS app: Following the redesigned Wikipedia app for Android phones in June, the Wikimedia Foundation has now released a companion app for iOS, which has similar features, including the ability to edit (for the first time) and to save pages for later offline viewing. Fast Company's story notes that the WMF is "really interested" in eventually raising funds through the app.
  • Another legal victory: The WMF has declared a "victory for free and neutral knowledge" in their triumph over four websites who were improperly using the Wikipedia trademark. All looked nearly identical and offered to create articles for clients willing to pay US$799; the WMF was forced to take legal action when the domain owners declined to respond to Uniform Domain-Name Dispute-Resolution Policy complaints filed at the World Intellectual Property Organization. The Next Web reports that "In short, the Wikimedia Foundation has used cybersquatting legislation to combat four paid-for services here. However, it's a small victory in the grand scheme of things—it faces a mighty uphill battle to curb the practice altogether."
  • US Congress edits: Wikimedia DC has started a dialogue with staffers from the US Congress. According to DC president James Hare, "they are interested in exploring how they can contribute information in a manner consistent with Wikipedia policies and best practices."
  • University Challenge on Wikipedia: The well-known UK quiz show University Challenge featured a bonus round last week focusing on Wikipedia. The Wikimedia UK blog asks "how would you fare against the students of Jesus College, Oxford University who faced the questions?"
  • Open positions
    • Royal Society of Chemistry: The UK Royal Society of Chemistry is calling for a six-month Wikipedian in residence "to join the Communications and Campaigns team and foster collaboration between our staff, members, Wikipedians, the research community and the general public, with the aim of better understanding and improving the usability of chemistry related content on Wikipedia."
    • Program intern: Wikimedia UK is recruiting a program intern who will assist in putting on gender gap events later this year.


2014-07-30

Success in Egypt and the Arab world

Introduction

The Wikimedia Education Program currently spans 60 programs around the world. Students and instructors participate at almost every level of education. Subjects covered include law, medicine, arts, literature, information science, biology, history, psychology, and many others. This Signpost series presents a snapshot of the Wikimedia Global Education Program as it exists in 2014. We interviewed participants and facilitators from the United States and Canada, Serbia, Israel, the Arab World, and Mexico, in addition to the Wikimedia Foundation.

Education presentation by Dr. Martin Poulter of Wikimedia UK

Wikimedia Education in Egypt

Based on emails with Samir El-Sharbaty, member of the Egypt Wikimedians user group which was approved in July 2014 by the Affiliations Committee

Participants in the first Campus Ambassador training in Cairo, 2012
Recruiting poster at the Faculty of Alalsun (Languages) at Ain Shams University, 2012

Congratulations on getting user group recognition from the Affiliations Committee. Does the user group plan to involve itself with the Wikipedia Education Program in Egypt, and if yes, how?

Yes, the WEP is one of the top priorities of the user group. Also, most of the founding users of the user group are Education Program volunteers, like Walaa Abdelmonem, Ahmed Mohi, Mohamed Ouda, May Hachem and me. The relationship between the user group and the volunteers and activities is meant to be mutually beneficial: the user group will aim at supporting the wiki movement volunteers in Egypt, while volunteers will help organize events and activities to support existing users and attract new ones.

How would you describe the current WEP program in Egypt?

Photo from the 4th Wikipedia Education Celebration Conference in Cairo, 2014. 88% of Egyptian Wikipedia Education Program participants were women
The Egypt Education Program is the program of heroes. Although volunteers went through many challenges, they were able to prove themselves with great results. Last term, our students recorded numbers like 22 new articles per student, 88% female users, and 7 million bytes added by one institution. The number of articles was 2,435 (from that institution).

How many high schools and universities participate in WEP in Egypt, and how many instructors and students participate?

We have 3 institutions with 9 courses this term and new ones are coming soon. The number of campus volunteers is 10 and instructors are 6. The participating institutions are the Faculty of Alalsun (Languages) and Faculty of Arts in Ain Shams University, and the Faculty of Arts in Cairo University.

Which languages of Wikipedia do students read and edit?

Students read many versions of Wikipedia and translate from them but most of them edit the Arabic Wikipedia only.

How much student activity is translation and how much is new prose?

6 of our courses are translation courses and 3 are Arabic research based courses.

Besides Wikipedia, do students or instructors contribute to other Wikimedia projects like Wiktionary, Wikisource, or Commons?

In the past four terms our students edited Wikipedia only but this term we will train them on taking and uploading photos to Wikimedia Commons.

How many Egyptian Wikimedia volunteers assist students and instructors?

We have 15 members in the user group, 5 of them are WEP volunteers.

Is there anything else that Signpost readers should know about the Education Program in Egypt?

No, thanks.
Another photo from the 4th Wikipedia Education Celebration Conference in Cairo, 2014

Wikimedia Education in the broader Arab World

Based on emails and a Skype interview with Tighe Flanagan, WMF Arab World Wikipedia Education Program Manager

Professor Abeer Abd El-Hafez (front row just to the left of the centre) and her group of participants in the Cairo pilot, 6 March, 2011
"The Wikipedia Translation Center at the College of Languages and Translation" in 2013 at King Saud University in Riyadh, Saudi Arabia

Can you describe how the Education Program started in the Arab World?

The Wikipedia Education Program started in the Arab world in 2012 with a pilot program at Cairo University and Ain Shams University in Egypt. The first term had 7 classes and 54 students, and they added 1,855,454 bytes of content to the Arabic Wikipedia (divide bytes in half to compare with Latin script because of complex characters). This was before I joined the team, but my understanding is that this decision came out of a convening of Arab Wikipedians that was hosted in Doha in 2011 (see this blog post). The team at the WMF invested a lot of time and effort in getting off to a good start with trainings for professors and Wikipedia Ambassadors. There is a detailed report on-wiki.

How many instructors and students currently participate in the program?

Workshop banner at King Saud University
Currently I describe the various iterations and initiatives in the region as the Arab World Programs, since each country has its own dynamics, base of volunteers, and history. For example, we're in the 5th semester or iteration of the program in Egypt and the 3rd in Jordan. Right now, it looks like we have:
  • 7 classes in Egypt with 51 students registered on course pages on the education extension.
  • 10 classes in Jordan with 81 students registered
Egypt and Jordan have both decided locally to extend the editing period through the beginning of August to give students the opportunity to edit over the summer and during the month of Ramadan, especially in Egypt, where the normal academic calendar was disrupted by political events, making normal workshops and edit-a-thons impossible during this past semester.
One struggle we have with some programs is getting everything properly tracked on-wiki. The education extension has been a huge help with that, and the forthcoming campaigns extension should only make things better and integrate more with our other tracking tools like Wikimetrics.

Which countries currently participate?

“Group photo on the roof of the Amman International Hotel” on the second day of the Arab World regional education hackathon, May 2014
We had a regional WEP Hackathon last month with volunteers participating from Jordan, Egypt, Saudi Arabia and Yemen. Our volunteers from Yemen are working to get a pilot class launched at Sana'a University this fall. Volunteers in Saudi Arabia are working to get students to publish their work online (currently there is a lot of translation that happens offline or is impossible to track so far).
In terms of trackable student activity, universities and schools in Egypt and Jordan are our main participants. The Wiki Club at Princess Nora University in Riyadh, Saudi Arabia, is planning on joining us formally in the coming year. Things like the education extension are making it easier for people to create course pages and register students -- and for us to track their impact on-wiki.

What grade levels are the students who participate?

WEP started in the Cairo Pilot at the university level. Since then, we have had secondary school students also participate in both Egypt and Jordan. Jordan in particular has had very active students at the 10th grade level. The decision to engage at the secondary level was made locally, and we have been pleasantly surprised with the results.

As you probably know, Wikipedia editors are predominantly male in most languages. Approximately what percentage of the students who participate in the Arab World education program are female?

We are proud of the fact that most student participants in our programs in Egypt and Jordan are female. In Jordan last term (fall 2013-14) it was about 70%; in Egypt for the same term it was about 88%.

How are instructors and students trained to use Wikipedia?

Instructors and students are trained using the Wikipedia Ambassador model. Ambassadors have experience with Wikipedia, either because they are Wikipedians or have participated in the program as a student previously, or are professors who have taught with Wikipedia. These Ambassadors facilitate workshops and edit-a-thons that train groups of instructors, students, or a mix of both.

Do students and instructors usually use VisualEditor?

Yes and no. I know our Ambassadors in Jordan prefer to teach newbies how to edit using the VE, and I know that some Ambassadors in Egypt prefer the traditional editor. We try to make sure we cover how to use both editors in our materials, since the VisualEditor is not available in all namespaces on all projects. And some students like learning more advanced editing using templates, for example.

What kinds of assignments do students receive when using Wikipedia in the classroom? For example, are they translating, editing existing articles, or creating new articles? Which languages do they use?

The CourseInfo tool shows articles that were edited by a class in 2012
Most of our students translate articles or improve existing articles through translation. Some students also write new articles.
Our translation students translate from other languages into Arabic; English is the largest language, but there are also students who translate from Spanish, French, Italian, Hebrew, Turkish and Korean.

Has the program received any endorsements from governments of countries that are participating?

No, none of the national governments has endorsed the Wikipedia Education Program in any of the countries in the Arab region.

How do you expect the program to develop in the next few years?

The programs are being steered and managed day-to-day by local volunteers -- a mix of professors and Wikipedians and active student leaders. We are eager to see the programs continue to grow and scale, while also making sure that the relatively small Arabic Wikipedia community (about 600 active editors and fewer than 100 very active editors each month, on average) is able to absorb the new batches of editors and their contributions.
I also expect more and more classes to have specific topic focuses that fill specific content gaps on the Arabic Wikipedia as those needs are better identified.

Is there anything else that you would like Signpost readers to know about the program?

The Wikipedia Education Programs in the Arab world should be thought of as a variety of programs, not a single entity. They all work on the same project, but operate in a variety of contexts. I am eager to work with all types of volunteers in the Movement (and movement-aligned volunteers who aren't Wikipedians) to make a positive impact on Wikipedia, especially the Arabic version.

In addition to the written email interview above, we spoke by Skype.

Languages

Tighe speaks English, Arabic and French. English is a common second language for professors in the region. Many universities have translation faculties in countries such as Egypt, Jordan, and Saudi Arabia.

Outreach

It’s easy to market the idea of improving Arabic Wikipedia because there's a widespread push to put more Arabic online. People like the idea but getting familiar with Wiki can be a little challenging. Mission alignment is easy.
Personal relationships are very important in the Arab World, such as having good relationships with professors and administrators.

Biggest challenges

The biggest challenge is getting translation work placed online. For example, students often translate material as a part of their translation capstone projects. Tighe encourages students and professors to translate Wikipedia articles and place the results online. Translating open license work benefits Arabic Wikipedia, and the universities benefit because the translated material has an open license unlike many other materials that could be translated such as recent magazine articles or books. Some people hesitate to post translations online because they feel that a translation may be good enough for a passing grade but not good enough for publication.

Cultural norms: celebrations and physical artifacts

Celebration in Isra University, Amman, Jordan, August 2013
In Arab countries, capstone celebrations for course completions, such as small ceremonies, are important to students and professors. Celebrations are usually organized by local volunteers.
Physical letters, certificates, and stamps are culturally significant. Students appreciate having physical documentation of their online accomplishments.

Education program relationship with Arabic Wikipedia online volunteers

Arabic Wikipedia is smaller than English and has a pretty good ambassador program. People can know who's who. Pending changes is in effect across Arabic Wikipedia; ambassadors will push through reviews and check for reversion or deletion proposals.
There is some discussion within the community about whether the education program should focus on recruiting new editors or creating content. The discussion is mostly friendly.

Going forward

Tighe's role is like a regional facilitator. Arabic Wikipedia Education programs started very hands-on, boots on the ground. Now the programs are more locally owned initiatives driven by professors and students. WMF helps with materials and the Education Program extension. Different countries will run their programs differently, so the Arabic Education Program is more like a network of regional programs.


Wikipedia Education Program seminar in Algeria at the University of Medea, 2013



2014-07-30

Doom and gloom vs. the power of Reddit

We indeed moved far away from football this week, and further into much more serious issues of war and death. The Israeli–Palestinian conflict continues to dominate the news, and the top 10, with Gaza Strip (#4), Israel (#9), and Hamas (#10). The top 25 also includes Palestine (#15) and Israeli–Palestinian conflict itself (#17). Death also lies behind the popularity of James Garner (#1), the American actor who died on 19 July, Malaysia Airlines Flight 17 (#3), and Deaths in 2014 (#8).

We have Reddit to thank for some less serious topics of interest, including a funny story about songwriter Tom Lehrer (#5), as well as how land mine (#7) areas in the Falkland Islands have become penguin sanctuaries. Actress Rose Leslie (#21) made the top 25 simply because Reddit noticed she grew up in a castle. It's worth noting that earlier this week The New York Times was asking "Can Reddit Grow Up?", about that site's efforts to develop a mature business model. Considering that Reddit and Google Doodles are without peer in their ability to direct traffic, at least to Wikipedia, it stands to reason that someone will figure out how to leverage that site's massive audience.

For the full top 25 list, see WP:TOP25. See this section for an explanation for any exclusions.

For the week of 20 to 26 July 2014, the ten most popular articles on Wikipedia, as determined from the report of the 5,000 most viewed pages, were:

Rank Article Class Views Image Notes
1 James Garner B-class 1,160,042
This American actor died on July 19 at age 86 of a heart attack. Garner starred in several popular television series over more than five decades, including Maverick and The Rockford Files. He also starred in more than 50 films.
2 Fifty Shades of Grey B-class 579,935
This 2011 erotic romance novel by E. L. James (pictured) is one of the biggest best sellers of the past decade. It is being adapted into a movie directed by Sam Taylor-Wood so that even more people can experience it. On July 24, the trailer for the film was released, which is no doubt why this article was so popular this week.
3 Malaysia Airlines Flight 17 B-class 576,750
The tragic shooting down of this passenger aircraft over Eastern Ukraine drops one spot this week. Although it seems likely that Russian-backed insurgents, who recently downed some Ukrainian planes in the same area, mistook the Boeing 777 for a Ukrainian military plane, a full investigation of the crash needs to be completed. That continues to be hampered by the lack of government authority and ongoing fighting in the region, leading to news reports about the efforts made to simply transport bodies out of the area, as well as disturbing claims of scavenging of passenger belongings by local residents.
4 Gaza Strip C-class 508,624
The latest round of fighting between Israel and Hamas, part of a very long and complicated history of conflict, keeps this article on the list for the second straight week. The military operation is dubbed "Operation Protective Edge" though our article on the conflict is now filed under 2014 Israel–Gaza conflict.
5 Tom Lehrer B-class 507,403
This American singer-songwriter, satirist, and mathematician was the subject of a very popular Reddit thread this week. As Reddit noticed, when Lehrer was asked at age 84 by hip-hop artist 2 Chainz for permission to sample a song he wrote 60 years ago, Lehrer responded: "As sole copyright owner of 'The Old Dope Peddler', I grant you motherfuckers permission to do this. Please give my regards to Mr. Chainz, or may I call him 2?"
6 2014 Commonwealth Games C-class 487,610
The 2014 edition of the Commonwealth Games kicked off on 23 July in Glasgow, Scotland, and will run through 3 August. Almost 5,000 athletes from 71 different nations and territories will be competing in 18 sports, including lawn bowls.
7 Land mine C-class 438,852
Reddit also caused a huge spike in the popularity of this article on 25 July, when a "Today I Learned" thread noted that areas around landmines laid near the sea during the Falklands War (1982) have become favorite penguin sanctuaries, as penguins do not weigh enough to detonate the mines, and can breed free of human interference. The sanctuaries have proven so popular and lucrative for ecotourism that removal efforts have been opposed.
8 Deaths in 2014 List 408,553
The list of deaths in the current year is always a popular article. In addition to James Garner (#1), deaths this week included (and this is a random sample, truly): Indian actor Kadhal Dhandapani (July 20), English female aviator and World War II military pilot Lettice Curtis (July 21), American football player Robert Newhouse (July 22), American swimmer and 1932 Olympics gold medal winner Helen Johns (July 23), South Korean violinist Ik-Hwan Bae (July 24), American author Bel Kaufman (July 25), and Ukrainian mayor Oleh Babayev (July 26).
9 Israel B-class 396,605
Up from #14 last week. As with #4, the latest round of fighting between Israel and Hamas is no doubt the cause of the popularity of this article this week.
10 Hamas B-class 396,081
Up from #17 last week, giving the recent conflict three of the top ten spots this week. Sadly, this popularity, and the bloodshed causing it, is likely to continue.

It took 396,081 views to make the Top 10 this week, down substantially from the 467,674 views needed last week. In the greater raw WP:5000 stats, 158 articles received over 100,000 views this week, with The Big Bang Theory (#158) the last to do so. William Shakespeare (#587) was the last to break 50,000 views; Los Angeles Lakers (#2239) last to hit 25,000; and United States Navy SEALs and Jazz tied for last (#4999) on the WP:5000, with 16,068 views.


2014-07-30

Skeletons and Skeltons

This Signpost "Featured content" report covers material promoted from 20 through 26 July.

Two featured articles were promoted this week.

The Nusfjord Road by Simo Räsänen, a new featured picture.
  • Red Skelton (nominated by We hope) According to the nominator, this American comedian spent seventy years in the business of making people laugh, including "vaudeville, films, radio and a weekly television show" lasting for two decades. On the side, he painted portraits of clowns. Strangely, rumor has it that Skelton made more money from the latter.
  • The FP (nominated by Sock) A little-known, low-budget, low-grossing film with a small cult following, The FP follows gang members who fight using a Dance Dance Revolution knock-off. Taking in just over US$40,000 at the box office (including one weekend's total of just $93), The FP has gradually gained fans since its 2011 premiere at South by Southwest. Its bizarre dialogue and premise have divided critics and audiences alike.

Four featured lists were promoted this week.

The Chequered Skipper, a smallish butterfly found in northern and central Europe, is the subject of this featured picture.
  • Alastair Sim on stage and screen (nominated by SchroCat) SchroCat's twentieth featured list comes in the form of this article detailing the appearances of Alastair Sim, a "memorable character player of faded Anglo-Scottish gentility". He performed from 1930, when he played a messenger in Othello, until 1977, when he gave his last performance, as himself in the television show To See Such Fun, ending a 47-year career in which he appeared on stage, film, and television.
  • 86th Academy Awards (nominated by Birdienest81) Birdienest81's seventeenth featured list regarding the Academy Awards. The 86th Academy Awards were presented on 2 March 2014, at Hollywood's Dolby Theatre. 12 Years a Slave won the coveted award for best picture, whilst Gravity received the most awards.
  • Moons of Neptune (nominated by Double sharp) Double sharp earned this featured list, accompanied by some captivating pictures, on the 14 moons of Neptune. Triton is by far the biggest of the 14; its mass is far greater than that of the other moons combined, and it was the first discovered, in 1846. The most recently discovered moon was first seen only last year.
  • Dadasaheb Phalke Award (nominated by Vivvt) Vivvt's second featured list is, like the first, related to Indian cinema. The Dadasaheb Phalke Award is India's highest award for cinema, and has been running for the 45 years since 1969. The most recent edition (2013) was awarded to Gulzar of the Hindi film industry.
Panorama of Bath, Somerset.
The Death of Socrates by Jacques-Louis David.

