This is a census of anti-vandalism bots across all Wikimedia projects as of February 2011. It includes active and inactive bots, as well as simple clones of other bots. The main goal of this census is to compile all available information about these automatic counter-vandalism tools. Feel free to help improve this page.
Techniques
We can see two main types of anti-vandalism bots:
- First generation: simple scoring systems based on regular expressions and heuristics.
- Second generation: machine learning, natural language processing, neural networks and Bayesian filters.
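As an illustration of the first-generation approach, here is a minimal sketch of a scoring system built from regular expressions plus one simple heuristic. The rules, weights, threshold, and function names are all hypothetical and are not taken from any bot listed on this page; real bots use much larger, hand-tuned rule sets.

```python
import re

# Hypothetical rules: (compiled pattern, points per match).
# Negative points suggest vandalism, positive points suggest good faith.
RULES = [
    (re.compile(r"(.)\1{9,}"), -5),                            # one character repeated 10+ times
    (re.compile(r"\b[A-Z]{15,}\b"), -3),                       # very long run of capital letters
    (re.compile(r"\b(?:stupid|idiot)\b", re.IGNORECASE), -4),  # placeholder insult list
    (re.compile(r"<ref[ >]"), 2),                              # an added reference is a good sign
]

REVERT_THRESHOLD = -4  # assumed cutoff, not a value used by any real bot


def score_added_text(added_text: str, is_anonymous: bool) -> int:
    """Sum the points of every rule match in the text added by an edit."""
    score = 0
    for pattern, points in RULES:
        hits = sum(1 for _ in pattern.finditer(added_text))
        score += points * hits
    # Heuristic (assumption): suspicious edits by anonymous users weigh slightly more.
    if is_anonymous and score < 0:
        score -= 1
    return score


def should_revert(added_text: str, is_anonymous: bool) -> bool:
    """Decide whether the edit scores low enough to be reverted."""
    return score_added_text(added_text, is_anonymous) <= REVERT_THRESHOLD
```

A bot built this way would call `should_revert` on the text added by each new edit and revert only when the score falls at or below the threshold, so tuning the weights and the cutoff is where most of the effort goes.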
Census
Bot name | Operator | Project | First reversion | Last reversion[1] | Methods of detecting vandalism | Programming language | License | Source code | Edits[2] | Comments, additional features, etc. |
---|---|---|---|---|---|---|---|---|---|---|
Africanus | Magister Mathematicae | es.wikipedia.org | 2010-04-23 | Active | · | · | · | · | 7000 | AVBOT clone |
AmeliorationBot | Ameliorate! | en.wikipedia.org | 2008-07-21 | 2008-11-08 | Regular expressions | AWB | · | · | 2000 | Not really anti-vandalism. It removes test edits and default toolbar messages |
AntiSpamBot | Shadow1 | en.wikipedia.org | 2006-11-16 | 2007-12-25 | Blacklist | Perl | · | · | 44000 | Not really anti-vandalism; it is anti-spam |
AVBOT | Emijrp | es.wikipedia.org | 2008-03-10 | 2010-05-16 | Scoring system based on regular expressions | Python, Pywikipedia | GPL | http://code.google.com/p/avbot/ | 595000 | Anti-vandalism, tags new pages for deletion, removes non-notable biographies from date pages |
Bot que revierte | Orgullomoore | es.wikipedia.org | 2006-06-04 | 2006-07-03 | Regular expressions | · | · | · | 900 | It only reverts anonymous users, and only the last edit |
Botarel | Lucien leGrey | es.wikipedia.org | 2009-04-17 | Active | · | · | · | · | 46000 | AVBOT clone |
BOTirithel | Tirithel | es.wikipedia.org | 2009-09-14 | Active | · | · | · | · | 6000 | AVBOT clone |
BOTpolicia | Er Komandante | es.wikipedia.org | 2007-01-21 | 2008-03-18 | · | · | · | · | 153000 | Anti-vandalism, tags new pages for deletion |
ClueBot | Cobi | en.wikipedia.org | 2007-07-29 | 2010-12-02 | Scoring system based on heuristics | PHP | GPL | User:ClueBot/Source | 1500000 | Reports IP vandals who use open proxies to WP:OP |
ClueBot NG | Cobi, Crispy1989 | en.wikipedia.org | 2010-11-02 | Active | Artificial neural network | C, C++, PHP, Python, Bash, and Java (more info) | GPL | https://cobihome.external.cluenet.org:8443/viewvc/cluebotng/ | 1623187 | Main anti-vandalism bot on w:en: |
CounterVandalismBot | Lloydpick | en.wikipedia.org | 2007-09-28 | 2007-11-10 | Scoring system | Visual C# | · | · | 14000 | · |
CVBOT | HUB | es.wikipedia.org | 2009-10-01 | Active | · | · | · | · | 3000 | AVBOT clone |
CVNBot | Manuelt15 | es.wikipedia.org | 2010-04-08 | Active | · | · | · | · | 600 | AVBOT clone |
GoblinBot4 | Bluegoblin7 | simple.wikipedia.org | 2009-05-12 | Active | · | PHP | GPL | http://sourceforge.net/projects/antivandalbot | 10000 | · |
PseudoBot | Pseudomonas | en.wikipedia.org | 2008-04-12 | Active | Redlinks to nonexistent pages in date articles | Perl | · | · | 95000 | Removes non-notable biographies from date pages |
Salebot | Gribeco | fr.wikipedia.org | 2007-10-26 | Active | Scoring using regexps and user profiling | Perl | · | https://fisheye.toolserver.org/browse/gribeco/salebot2 | 713000 | Also used on ptwiki |
SalebotJunior | Nakor | fr.wikipedia.org | 2010-07-26 | Active | Scoring using regexps and user profiling | Perl | · | https://fisheye.toolserver.org/browse/gribeco/salebot2 | 1000 | Exactly the same code as Salebot, running when Salebot is unavailable |
SoxBot III | X! | en.wikipedia.org | 2009-01-01 | 2009-09-12 | Scoring system | PHP | GPL | https://github.com/soxred93/SoxBot-III | 29000 | · |
VoABot II | Voice of All | en.wikipedia.org | 2006-07-28 | 2009-10-06 | · | JavaScript/Java | · | · | 274000 | Anti-vandalism, reverts vandal page moves |
Bots to be added:
- AVBOT (es) and its clones, Salebot (pt), AntiVandalBot (en), previously Tawkerbot2 and Tawkerbot4, MartinBot (en), DASHBotAV (en), XLinkBot anti-spam (en)
- ask in other wikipedias
- no bots? pl
- anti-vandalism bots are not allowed in: de
- use of anti-abuse filters, not bots: it
- make statistics about reverts in the recentchanges table to discover bots
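The last item could be approached roughly as follows: count revert-like edit summaries per user and look at the accounts with the highest totals. This is only a sketch under stated assumptions. It takes rows that have already been read from the recentchanges table as hypothetical `(user, comment)` pairs, and the keyword pattern for spotting reverts is a guess that would need adapting to each project's language and summary conventions.

```python
import re
from collections import Counter

# Assumed English revert keywords; other projects use other words in edit summaries.
REVERT_COMMENT = re.compile(r"\b(?:revert(?:ed|ing)?|rv|undid|undo)\b", re.IGNORECASE)


def revert_counts(rows):
    """Count revert-like edit summaries per user.

    rows: iterable of (user, comment) pairs, a hypothetical simplification
    of the rows stored in the recentchanges table.
    """
    counts = Counter()
    for user, comment in rows:
        if REVERT_COMMENT.search(comment or ""):
            counts[user] += 1
    return counts


def likely_bots(rows, min_reverts=2):
    """Return users with at least min_reverts reverts, most active first."""
    return [user for user, n in revert_counts(rows).most_common() if n >= min_reverts]


# Made-up sample rows, for illustration only.
sample_rows = [
    ("ExampleBot", "Reverted edits by 1.2.3.4"),
    ("ExampleBot", "rv vandalism"),
    ("Alice", "fixed typo"),
    ("Bob", "Undid revision 123 by X"),
]
candidates = likely_bots(sample_rows)
```

Accounts that surface this way would still need a manual check, since prolific human patrollers using semi-automatic tools also produce many reverts.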
Semi-automatic anti-vandalism tools
For a fuller list, see Category:Wikipedia counter-vandalism tools
- Huggle
- Twinkle
- STiki
- Igloo
- Lupin's Anti-Vandal Tool, of which there are several variants e.g.
Notes
References
- 2011
- Wikipedia Vandalism Detection: Combining Natural Language, Metadata, and Reputation Features. B. Thomas Adler, Luca de Alfaro, Santiago Mola-Velasco, Paolo Rosso, and Andrew G. West.
- 2010
- Detecting Wikipedia vandalism with active learning and statistical language models. Si-Chi Chin, W. Nick Street, Padmini Srinivasan, and David Eichmann.
- The Work of Sustaining Order in Wikipedia: The Banning of a Vandal. R. Stuart Geiger and David Ribes.
- Crowdsourcing a Wikipedia Vandalism Corpus. Martin Potthast.
- Detecting Wikipedia Vandalism via Spatio-Temporal Analysis of Revision Metadata. Andrew G. West, Sampath Kannan, and Insup Lee.
- 2009
- Vandalism Detection in Wikipedia: a Bag-of-Words Classifier Approach. Amit Belani.
- Using Dynamic Markov Compression to Detect Vandalism in the Wikipedia. Kelly Y. Itakura and Charles L. A. Clarke.
- 2008
- Automatic Vandalism Detection in Wikipedia. Martin Potthast, Benno Stein, Robert Gerling.
- (German) Automatic Vandalism Detection in Wikipedia. Robert Gerling.
- Automatic Vandalism Detection in Wikipedia. Martin Potthast, Benno Stein, and Robert Gerling.
- Automatic Vandalism Detection in Wikipedia: Towards a Machine Learning Approach. Koen Smets, Bart Goethals, and Brigitte Verdonk.
- 2007
- Creating, Destroying, and Restoring Value in Wikipedia. Reid Priedhorsky, Jilin Chen, Shyong (Tony) K. Lam, Katherine Panciera, Loren Terveen, and John Riedl.
See also
- Wikipedia:Vandalism
- Wikipedia:WikiProject Vandalism studies
- Category:Wikipedia anti-vandal bots
- WikiTrust -- A computational technique running live on many language editions of Wikipedia (I believe), with a directly accessible API (announcement). The English version is integrated as a "queue" into the STiki semi-automated front-end.
- Wikipedia trust write-up (added by its author, not peer reviewed) -- If you're looking for a more academic angle, this might be worth reading. Fig. 9 in particular visualizes the relationships among many of the academic writings in this field. (Note that "vandalism", while a small subset of the "trust" spectrum, seems to be where all the effort is focused.)
- Wikipedia:Academic studies of Wikipedia
External links
- 1st International Competition on Wikipedia Vandalism Detection (Padua, Italy, 22–23 September 2010 - workshop page, see also Wikipedia:Wikipedia_Signpost/2010-09-27/In_the_news#Vandalism_detection_competition)
- http://www.andrew-g-west.com
- Reverts statistics