User:Thatcher/Quis custodiet ipsos custodes

This essay contains the opinions of one or more Wikipedia contributors. For a related policy proposal, see Wikipedia:Review Board.

Oversight of Checkuser

I do not believe Wikipedia currently has effective oversight of the Checkuser function. By effective oversight, I mean review with the possibility that a Checkuser could lose the tool for misuse that was not an obvious violation of the privacy policy. (Probably 99.8% of checks and oversights are legitimate; I am concerned that we lack an effective means of dealing with the other 0.2%.)

Examples

Real examples, names changed. Disposition as indicated.^[1]

1. Checkuser Smith is an active editor of a contentious topic. He runs checks on nearly all editors who join the topic from a particular point of view. He justifies this because the topic area is being hounded by disruptive sockpuppets of a banned user, and in fact about 50% of the checks reveal socks of this banned user. However, he checks not only brand new accounts but also long-term editors with thousands of edits who are newly arrived at that particular topic. I raised this issue with Smith and with Jimbo Wales; Jimbo agreed that it might look bad, but Smith declined to change his behavior and stated that even if only 5% of the checks uncovered sockpuppetry he would continue.
2. Checkuser Adam was involved in discussing a banned user on one of the admin noticeboards, when editor Eve spoke up in defense of the banned user. Within a few minutes Adam had checked Eve as a potential sock of the banned user, even though Eve was a long-term editor and administrator and (in my opinion at least) there could be no reasonable suspicion that Eve had any relationship to the banned user. Sent to Newyorkbrad, who sent it to Arbcom-L. No word on final outcome.
3. Checkuser Groucho was involved in a dispute with User:Harpo and endorsed a community ban on Harpo on the admin noticeboard. The ban did not reach consensus and Harpo remains an active editor with no blocks since the discussion. During the course of the discussion several other admins told Groucho he was out of line. Near the end of the incident, Harpo expressed the sentiment that "If I am going to be hounded off the site I will just start a new account under a different name"; Groucho checkusered him on the grounds of sockpuppetry, even though Harpo was not blocked and the ban proposal had been defeated. Discussed with an arbitrator; no further action taken yet.

Example 1 is a case of a Checkuser being too close to an issue to be able to make an unbiased evaluation. I believe it would be better for Smith to request (publicly or privately) to have another checkuser look at editors that Smith considers suspicious and decide whether or not to run the checks. Examples 2 and 3 just look like Checkusers running checks because they are pissed off. (With apologies to my former colleagues, that's what it looks like.)

Analysis

The use of the Checkuser tool is governed first by the Wikimedia Foundation m:Privacy policy and second by the local Checkuser policy. Both policies prohibit the disclosure of private information (such as IP addresses) except under defined circumstances. But neither policy places any meaningful restrictions on when and how checks should or should not be run. On the actual use of Checkuser, the Checkuser policy states,

The CheckUser feature is approved for use to prevent disruption, or investigate legitimate concerns of bad faith editing. The tool is to be used to fight vandalism, to check for sockpuppet activity, to limit disruption or potential disruption of any Wikimedia project, and to investigate legitimate concerns of bad faith editing. The tool should not be used for political control; to apply pressure on editors; or as a threat against another editor in a content dispute. (emphasis added)

In practice, these statements have been interpreted by the Checkusers and Arbcom to not place any restrictions on what checks should be run and who should run them. Anything can be justified under the guise of "limiting disruption."

In addition to checks which may directly run afoul of the written policy by not being aimed at "legitimate concerns" (see especially example 2), there are checks which potentially violate Wikipedia's common law expectation of privacy and administrator best practice. Administrators are prohibited from acting where they have a potential conflict of interest. Admins may not block users with whom they have been involved in previous disputes and may not close deletion discussions on pages where they have a vested interest. Even though the Checkuser policy does not specifically prohibit in writing Checkusers from running checks when they have conflicts of interest, this prohibition is a natural extension of the expectations the community has toward administrative conflicts of interest. Checkusers should be held to a higher standard, not be excused from it. The Checkuser policy is written to suggest, and Checkusers all tell the community when asked, that checks are never run "without good reason," but in practice, any reason is good enough.

Checkusers theoretically review each other's actions. This is why Wikimedia wikis must have at least two Checkusers, or none. This review should include issues of appropriate use of the tool. However, it is my experience that internal discussions among the Checkusers consist entirely of technical support—re-analysis of findings in complex cases and double-checking results that seem likely to cause disruption or drama of their own. There is little if any internal discussion of possibly inappropriate checks, and such discussions that do occur are dominated by the following assumptions:

Checkusers are really busy and have no interest in editors' personal information except when a real concern arises, so even if the concern does not seem realistic to others on review, the Checkuser is given the benefit of the doubt.
Even though the act of looking, by itself, might be perceived as an invasion of privacy by some editors, it is viewed as essentially harmless as long as the information is not actually released or discussed outside the Checkuser community.
The Checkusers as a group are protective of their independence and discretion and are reluctant to endorse any position that might curtail their own use of the tool.

On the English language Wikipedia, the Checkuser privilege is granted by the Arbitration committee, so it is clear that Arbcom has the authority to remove the Checkuser privilege. I am, of course, not privy to their internal discussions, but I believe they make the same three assumptions noted above, possibly with an added consideration of being reluctant to take action (such as removing someone's Checkuser status) that might be seen as fuel for on- and off-wiki drama.

Oversight of Oversight

Because I have never had Oversight permission or access to the Oversight mailing list, I have much less first-hand information about the potential misuse of the Oversight tool. Nevertheless, I believe misuse has occurred.

Examples

Again, real cases, not hypothetical, that I saw with my own eyes or was informed about.

Editor Moe retires, leaving a strong anti-Wikipedia rant on his user page. Moe later "un-retires" and begins plans to seek adminship. He is concerned that his rant might negatively affect his RFA, so he asks Curly to oversight his rant. (Ultimately, Moe is banned for other misconduct.) Brought to the attention of Arbcom. No violation found.
Editor Bob, who has Oversight permission, leaves a talk page comment for editor Ray. Bob then revises the comment several times, substantially changing its tone and meaning. He then (before Ray sees it) uses Oversight to remove the original comment and majority of edits, leaving only the revised comment. Raised with an arbitrator, no wrongdoing found.

The case of Moe and Curly is complicated. Moe did indeed write an anti-Wikipedia "rant", but the message also revealed personal information about a third party. Arbcom found upon review that removal of the personal information was acceptable. However, by the nature of Oversight the entire comment was removed, and administrator "Shemp", who brought it to my attention, was quite concerned that the removal of the rant would prevent effective review of Moe at a forthcoming RfA. This is a case where more transparency might have helped; perhaps there could have been a review (as described below) followed by re-posting the non-private part of the "rant." Bob's oversighting of his own comments to Ray just looks wrong to me no matter what Arbcom thinks. Oversight is not meant to help make editors look good, and Oversighters should not remove their own edits any more than admins should unblock themselves; Bob should have asked for assistance.

And, of course, there is the much-publicized oversight of certain edits made by FT2 during his Arbcom election in 2007. The facts are relatively simple. Another editor, who opposed FT2's candidacy, wrote a post on his personal blog highlighting edits FT2 made to zoophilia, with the implication that Wikipedia was about to elect someone to a senior position who had an unacceptable viewpoint on the practice. David Gerard oversighted the edits, but because they were so old, the content was not removed, only the attribution of FT2 as author was lost [1]. David realized that the oversighting was a mistake, as it was contrary to the oversight policy, and apologized to Jimbo Wales. However, for the next year, no one would speak openly about or even acknowledge the removal of the edits. A simple statement made by someone in authority, acknowledging the oversight and the apology, would have eliminated much of the succeeding drama.

Analysis

The Oversight policy allows removal of edits that reveal non-public information or that contain potentially libelous statements or copyright infringements. I have been told that the most common use of Oversight is to remove edits made by editors who were accidentally logged out, thus inadvertently revealing their own IP addresses.^[2] (This was not in the official policy until I just added it; I wonder if it will stick.) Let me repeat: Oversight is not meant to be used to protect editors from embarrassment or other consequences of their edits, and it most certainly is not meant to allow editors who have Oversight permission, or who are friendly with someone with Oversight permission, to shield their embarrassing mistakes from view when that ability is not available to the general editing population.

I have been told that the Oversight mailing list handles active discussions on the appropriate use of the tool. If so, this is a good thing (and in marked contrast to the Checkuser mailing list). However, there remains no transparent on-wiki mechanism to report and evaluate questions about Oversight use.

Need for accountability and transparency

I do not believe that any single incident I have listed here is grounds for removal of Checkuser or Oversight access. I do believe they are grounds for issuing a caution or warning, and that repeated exhibition of poor judgement by the same people should result in loss of access. I also think that the community deserves some kind of transparency and accountability in how the Checkusers and Oversighters are handling their tasks. When confronted with a serious question from a concerned editor, if all I can say is "I can't tell you" or "I'll look into it", then I don't believe I am serving the community's interests.

Accountability and transparency could be provided by the Arbitration Committee acting on its own. However, my own experience contacting Arbcom or individual Arbitrators has been less than satisfactory. I report things but rarely hear back. I believe that Arbcom suffers from two problems, as far as review and oversight of Checkuser and Oversight is concerned:

Conflict of interest: Most Arbitrators are active Checkusers and/or Oversighters and are reluctant to issue rulings or find fault that might lead to curbs on their own discretion to use the tools.

Culture of silence: Arbcom is extremely reluctant to discuss these concerns publicly, and I believe they are unwilling to take strong action when otherwise appropriate, because that will lead to public discussion which is potentially highly dramatic and disruptive. There is nothing in the privacy policy that prevents me from naming the editors, Checkusers and Oversighters in my examples above, as no non-public information (such as Eve's IP address) would have to be disclosed in order to have a discussion about the appropriateness of Adam's check. I certainly think that we must be careful not to bring unfair criticism down on the head of any Checkuser or Oversighter accused of misconduct. I do not advocate a public inquest for every accusation, and I think that if the result of a complaint is "no violation" or "minor violation", it would be best to keep the specifics private. But an acknowledgment of the complaint and its disposition is certainly appropriate (something like this, maybe).

Proposal: Local ombudsman office

I propose the creation of a local ombudsman office to investigate and report on allegations of misuse of Checkuser and Oversight.^[3] The members of the ombudsman office would have to have Checkuser and Oversight permission themselves in order to have access to the relevant information (and should ideally have some experience using the tools), but should not themselves be active users of Checkuser or Oversight for routine matters. Members of the office could be selected by a number of different mechanisms, including direct election by the community, appointment by the Arbitration Committee, or election from among the currently active Checkusers and Oversighters. Direct election would seem likely to result in the most independence, but since only Arbcom can assign Oversight and Checkuser status, direct election would not work unless Arbcom assented to the result or pre-approved the candidates. The ombudsman office would develop procedures to take and investigate complaints and report on its findings. I believe that where there is a finding of a minor violation, the violation should be publicly noted but without naming the responsible Oversighter or Checkuser; Wikipedia does not do Scarlet Letters or public pillories. If misuse of Oversight is found, a note documenting the contents of the edit that was improperly removed and the name of the contributing editor could be made in an appropriate place. In cases of serious violations or repeated minor violations from the same person, a full public report may be appropriate. The ombudsman office could be given the power to direct the Stewards to remove Checkuser and Oversight access, but I think it is more likely that the office would make a public referral to Arbcom for final disposition.

Questions to be addressed

How will the officers be selected?
How will they take complaints, investigate, and respond?
What level of disclosure is appropriate for what type of finding?
Will the officers have direct power to remove Checkuser and Oversight permission or can they only refer to Arbcom?

Independence issues

There seems to be no way to avoid the fact that the local ombudsman office will be, to some extent, a creature of Arbcom. No one can hold the position without Arbcom's approval in one way or another, and the office would have no power to act unless that power was granted by Arbcom. Perhaps the only way to ensure that the office is independent is to select independent-minded people to serve in it.

In any event, the most important points of this proposal are not independence but transparency and accountability. If the officers refused to discuss their cases and found that every complaint was groundless, it would become pretty apparent that the office was not doing the job for which it was constituted, and it would rapidly lose credibility. In fact, the Arbitrators could fulfill this local ombudsman role themselves without delegating their authority to yet another Wikipedia bureaucracy, if they were minded to do so. Nothing now stops any Arbitrator from making public statements on the example cases I gave or on any other cases that I am not aware of.

Notes

^[1] Let me be clear that I am not talking about the check run by Lar that was the subject of a recent arbitration case. I said at the time, and I still believe, that although I probably would not have run that check myself, I considered it to be (barely) within the bounds of acceptable discretion. These other cases are, I think, much farther outside the bounds of acceptable procedure.
^[2] Single revision deletion would reduce the volume of oversighted edits (and thus the potential for misuse) but would not eliminate the need for an independent panel to review disputed cases.
^[3] The m:Ombudsman commission has determined that its mandate extends only to violations of the m:Privacy policy, that is, inappropriate release of non-public information.
