Wikipedia:Automated moderation
This page is currently inactive and is retained for historical reference. Either the page is no longer relevant or consensus on its purpose has become unclear. To revive discussion, seek broader input via a forum such as the village pump.
This is an information page. It is not an encyclopedic article, nor one of Wikipedia's policies or guidelines; rather, its purpose is to explain certain aspects of Wikipedia's norms, customs, technicalities, or practices. It may reflect differing levels of consensus and vetting.
Automated moderation in Wikipedia is the use of Wikipedia:Bots to promote good behavior in our shared wiki environment.
From 2001 to 2019, Wikipedia's bots mostly executed simple commands as directed by humans. Most of these commands were editorial and focused on Wikipedia's content; bot operators less often directed bots to intervene in human conduct.
With advances in data science, and nearly 20 years of data on human activities in Wikipedia, it has become possible to have bots patrol Wikipedia to detect various kinds of misconduct. For example, with thousands of examples of humans applying the Wikipedia:Blocking policy to user misconduct, bots can use machine learning to identify misconduct in past cases and then apply what they learn to situations which humans have not yet evaluated. This is "automated moderation".
Automated moderation should not replace human evaluation; it should only complement it. The Wikipedia community should take care to keep all of its moderation human-centered, and to value and demand transparency in judgement and in community process.
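As a rough sketch of the machine-learning idea described above, the example below trains a simple text classifier on past block decisions and then scores material humans have not yet evaluated. The file name and column names are hypothetical stand-ins for whatever labelled data a project assembles; this is an illustration of the technique, not a recommended tool.
<syntaxhighlight lang="python">
# A minimal sketch of learning from past blocking decisions, assuming a
# hypothetical file "block_history.csv" with two columns: "comment_text"
# (a user talk-page comment) and "was_blocked" (1 if the author was later
# blocked for misconduct, 0 otherwise). File and column names are illustrative.
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

data = pd.read_csv("block_history.csv")
train_text, test_text, train_labels, test_labels = train_test_split(
    data["comment_text"], data["was_blocked"], test_size=0.2, random_state=0
)

# Learn from past human blocking decisions ...
model = make_pipeline(TfidfVectorizer(min_df=5), LogisticRegression(max_iter=1000))
model.fit(train_text, train_labels)

# ... then estimate risk for material humans have not yet evaluated.
print("held-out accuracy:", model.score(test_text, test_labels))
print("risk of unseen comment:", model.predict_proba(["example unseen comment"])[0][1])
</syntaxhighlight>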
Projects
WikiTrust
From 2007 to 2011 the WikiTrust project proposed using automated moderation to support humans in patrolling Wikipedia. Problems at the time included the high cost of computation, lack of general software, difficulty in structuring Wikimedia data, and lack of community support and understanding. The technical problems with this project and the cost of running this sort of program have dropped greatly as of 2019. Insights from this project should be useful in planning next steps.
Application of Jigsaw
Jigsaw is a Google company which in 2017 ran a research project to seek out misconduct in Wikipedia. Useful parts of the project included the media attention it brought to Wikipedia and to research in automated moderation, the creation and sharing of a dataset for further research, and the publication of a research paper. Less useful was that the research team had no contact with the Wikipedia community about its needs, that the project was a one-time event, and that the project focused on the technical side of things while omitting community discussion and user interaction.
Google is a superpower, and having its attention is comparable to being recognized by a government or receiving a divine intervention. Thanks, Google; you do what you want, even though interacting with you socially is as strange as it gets.
ORES rankings
The Wikimedia Foundation has an in-house development team engaged in data mining that builds mw:ORES, whose name puns on mining "ore".
From May 2017 through 2019 the ORES project has used machine learning to rank the likely usefulness of individual edits. This permits moderation on a per-edit basis, and produces a dataset of edit scores which anyone can use to search for behavioral trends associated with accounts.
ORES is an active automated moderation program running in various Wikimedia projects, including English Wikipedia.
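For editors who want to see what the service returns, the hedged sketch below queries ORES for the "damaging" and "goodfaith" scores of a single revision. The endpoint layout and response shape follow the ORES v3 API as documented on mediawiki.org; consult mw:ORES for current details, since endpoints and model names may change.
<syntaxhighlight lang="python">
# Query the ORES scoring service for one English Wikipedia revision.
import requests

REV_ID = 123456789  # replace with a real revision ID
url = "https://ores.wikimedia.org/v3/scores/enwiki/"
params = {"models": "damaging|goodfaith", "revids": REV_ID}

response = requests.get(url, params=params, timeout=10)
response.raise_for_status()
scores = response.json()["enwiki"]["scores"][str(REV_ID)]

# Each model returns a prediction plus class probabilities.
for model_name, result in scores.items():
    if "score" in result:
        score = result["score"]
        print(model_name, score["prediction"], score["probability"])
    else:
        print(model_name, "no score returned:", result)
</syntaxhighlight>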
Automod @ U of Virginia
In 2018-2019 the Wikipedia Research Lab at the University of Virginia presented Automatic Detection of Online Abuse, a project applying machine learning to examine user blocks in English Wikipedia.
This research had two important findings: first, that with available technology and data we can create good automated moderation tools at a low enough cost to justify developing them; and second, that the social infrastructure needed to apply these tools will be more expensive to develop than the technology and cannot be rushed. The project recommends community conversation and policy development now, in anticipation of ever easier and cheaper access to automated moderation.
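As a sketch of the raw material such research begins with, the example below fetches recent entries from the English Wikipedia block log through the standard MediaWiki API (action=query, list=blocks). The field selection and user-agent string are illustrative.
<syntaxhighlight lang="python">
# Pull recent block-log entries from the MediaWiki API.
import requests

api = "https://en.wikipedia.org/w/api.php"
headers = {"User-Agent": "automated-moderation-example/0.1 (illustrative)"}
params = {
    "action": "query",
    "list": "blocks",
    "bkprop": "user|timestamp|reason",
    "bklimit": 50,
    "format": "json",
}

blocks = requests.get(api, params=params, headers=headers, timeout=10).json()["query"]["blocks"]
for block in blocks:
    # "reason" texts are what a classifier would learn block categories from.
    print(block.get("timestamp"), block.get("user"), "--", block.get("reason"))
</syntaxhighlight>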
Anyone can participate
Develop general Wikimedia content
An essential part of understanding automated moderation is having access to general reference information on its technical and social aspects. Consider the Wikipedia articles below, and encourage community groups and university students from all disciplines to edit them through the usual wiki process.
- technical
  - Data science
  - Machine learning
  - Natural language processing
  - Exploratory data analysis
  - Text mining
  - Sentiment analysis
  - Computational linguistics
  - Emotion recognition
- social
Recommend policy
Quis custodiet ipsos custodes? (Who will watch the watchmen?) The Wiki community should take care to set policy on what bots do and how anyone should interact with them.
Bots promise to save a huge amount of human labor on mundane tasks but also can themselves make bad judgments repeatedly and quickly. Test them cautiously, use them with care, and routinely check what they are doing. Keep Wikipedia policy which promotes transparency in how bots operate and who operates them.
Technology develops at the speed of trust; do not tolerate anyone saying that the wiki community should change its culture and increase trust to match available technology.
Ideological policy pages may be the highest authority in representing the Wikimedia community consensus that guides moderation policy. Some of those guides include the following:
- Wikipedia:English Wikipedia non-discrimination policy
- meta:Code of conduct (does not actually exist)
Evaluate bot recommendations
Bots generate reports. We make better bots by having humans evaluate bot activities as correct, incorrect, or uncertain.
Consider participating in human evaluation of bot activity. Anyone who asks for this kind of labor from the Wikipedia community should do so in alignment with human/bot interaction policy. Bots already demand a huge amount of attention from the humans who monitor them, so any request for evaluation from bot operators should come with a self-reported estimate of how much human labor, time, and evaluation they intend to collect, and for what purpose.
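One possible shape for such self-documenting evaluation work is sketched below: a small record type for human verdicts on bot recommendations, appended to a shared file. The field names are illustrative, not an established Wikipedia schema.
<syntaxhighlight lang="python">
# Record human review of bot recommendations so the labels can feed back
# into training. All names here are hypothetical.
import csv
import os
from dataclasses import dataclass, asdict

@dataclass
class BotEvaluation:
    revision_id: int        # edit the bot flagged
    bot_name: str           # which bot made the recommendation
    bot_verdict: str        # what the bot claimed, e.g. "likely misconduct"
    human_verdict: str      # "correct", "incorrect", or "uncertain"
    reviewer: str           # who reviewed it
    notes: str = ""         # optional free-text rationale

def append_evaluation(path: str, evaluation: BotEvaluation) -> None:
    """Append one human judgment to a running CSV of evaluations."""
    is_new = not os.path.exists(path) or os.path.getsize(path) == 0
    with open(path, "a", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=list(asdict(evaluation)))
        if is_new:
            writer.writeheader()
        writer.writerow(asdict(evaluation))

append_evaluation("evaluations.csv", BotEvaluation(
    revision_id=123456789, bot_name="ExampleBot",
    bot_verdict="likely misconduct", human_verdict="uncertain",
    reviewer="Example reviewer", notes="needs a second opinion"))
</syntaxhighlight>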
Misconduct to target
editProxy editing
Wikipedia has restrictions on editing through open proxy IP addresses. There are broadly two kinds of problematic proxy editing: one is by Wikipedia:IP users, and the other is by registered Wikimedia accounts editing from a proxy.
User:ProcseeBot scans IP addresses against known proxies and blocks them. This is routine, has not been the subject of controversy, and has resulted in more English Wikipedia blocks than any other process.
While not currently well documented or the subject of much attention, understanding proxies is essential to understanding blocks in English Wikipedia. Machine learning research on blocks and misconduct will detect huge numbers of blocked proxy accounts and ought to interpret the significance of these.
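For readers unfamiliar with how such checks work mechanically, the sketch below tests whether an editing IP address falls inside any known open-proxy range. The ranges shown are documentation placeholders, not a real blocklist, and ProcseeBot's actual sources and logic are more involved.
<syntaxhighlight lang="python">
# Test an IP address against a list of known proxy ranges.
import ipaddress

KNOWN_PROXY_RANGES = [
    ipaddress.ip_network("198.51.100.0/24"),   # documentation-range placeholder
    ipaddress.ip_network("203.0.113.0/24"),    # documentation-range placeholder
]

def is_known_proxy(ip_text: str) -> bool:
    """Return True if the address sits inside any listed proxy range."""
    address = ipaddress.ip_address(ip_text)
    return any(address in network for network in KNOWN_PROXY_RANGES)

print(is_known_proxy("203.0.113.42"))   # True  (inside a listed range)
print(is_known_proxy("192.0.2.7"))      # False (not listed)
</syntaxhighlight>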
Undisclosed paid editing
Undisclosed paid editing, such as when a company hires a public relations professional to manipulate Wikipedia content, is both against the Wikimedia Terms of Use and a serious social taboo in English Wikipedia.
Various community groups have organized themselves to use human labor to develop datasets of undisclosed paid editing, in anticipation of using machine learning to detect this particular kind of misconduct.
There is a hope and expectation in the Wikipedia community that undisclosed paid editing will be among the easier sorts of misconduct to detect with automated processes.
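As a sketch of what such a dataset might record before any machine learning is applied, the example below lists a few per-account features that community groups could label. The feature set is an assumption for illustration, not an endorsed detection method.
<syntaxhighlight lang="python">
# Hypothetical per-account features for an undisclosed-paid-editing dataset.
from dataclasses import dataclass

@dataclass
class AccountFeatures:
    account_age_days: int          # new accounts editing promotional topics stand out
    total_edits: int
    edits_to_single_article: int   # single-purpose editing is a common signal
    external_links_added: int      # heavy link insertion can indicate promotion
    declared_paid: bool            # disclosure required by the Terms of Use

def single_purpose_ratio(features: AccountFeatures) -> float:
    """Fraction of an account's edits concentrated on one article."""
    if features.total_edits == 0:
        return 0.0
    return features.edits_to_single_article / features.total_edits

example = AccountFeatures(account_age_days=3, total_edits=40,
                          edits_to_single_article=38,
                          external_links_added=12, declared_paid=False)
print(round(single_purpose_ratio(example), 2))  # 0.95
</syntaxhighlight>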
Spamming
Common spam behavior can include inappropriate posting of links or keywords in Wikipedia spaces.
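A minimal sketch of automated spam flagging follows: it checks added text against a placeholder list of suspect domains and keywords. Real patterns would come from community-maintained lists such as the spam blacklist.
<syntaxhighlight lang="python">
# Flag added text that contains a suspect external link or spam keyword.
import re

SUSPECT_DOMAINS = {"example-pills.test", "cheap-seo.test"}   # placeholders
SUSPECT_KEYWORDS = re.compile(r"\b(buy now|free trial|casino)\b", re.IGNORECASE)

def looks_like_spam(added_text: str) -> bool:
    """Return True if the text adds a suspect domain or spam keyword."""
    if any(domain in added_text for domain in SUSPECT_DOMAINS):
        return True
    return bool(SUSPECT_KEYWORDS.search(added_text))

print(looks_like_spam("Great article, buy now at http://cheap-seo.test/"))  # True
print(looks_like_spam("Added a citation to a journal article."))            # False
</syntaxhighlight>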
Harassment
Harassment is a pattern of repeated offensive behavior that appears to a reasonable observer to intentionally target a specific person or persons. Usually (but not always) the purpose is to make the target feel threatened or intimidated, and the outcome may be to make editing Wikipedia unpleasant for the target, or to undermine, frighten, or discourage them from editing. Harassment might also be called abuse, hounding, griefing, trolling, rudeness, or other social misconduct.
Sock farming
Sock farming is the set of activities whereby an evil organization mass-creates many Wikipedia accounts, which are themselves bots or puppets, and directs each of these accounts to make low-value good or neutral edits in order to develop the appearance of useful new-user behavior. When needed, the organization harvests one of these farmed accounts to engage in strategic misconduct, benefiting from the good will that prior constructive editing earns.
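One analysis idea sometimes discussed for this problem is sketched below: compare accounts by their hour-of-day editing rhythm and flag pairs whose activity profiles are suspiciously similar. The accounts, counts, and threshold are invented for illustration, and real investigations weigh many more signals.
<syntaxhighlight lang="python">
# Compare hour-of-day edit profiles of accounts and flag near-identical pairs.
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

# 24-hour edit-count profiles for three hypothetical accounts.
accounts = {
    "AccountA": [0, 0, 0, 5, 9, 7, 0] + [0] * 17,
    "AccountB": [0, 0, 0, 4, 8, 6, 1] + [0] * 17,   # nearly identical rhythm
    "AccountC": [2] * 24,                            # evenly spread activity
}

names = list(accounts)
for i, first in enumerate(names):
    for second in names[i + 1:]:
        similarity = cosine_similarity(accounts[first], accounts[second])
        flag = "  <- investigate" if similarity > 0.95 else ""
        print(f"{first} vs {second}: {similarity:.2f}{flag}")
</syntaxhighlight>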