Wikipedia:Bots/Requests for approval/JVbot 3

The following discussion is an archived debate. Please do not modify it. Subsequent comments should be made in a new section. The result of the discussion was

Denied.

JVbot

Operator: John Vandenberg ^(chat)

Automatic or Manually Assisted: Unsupervised

Programming Language(s): Python

Function Summary: Automatically oversight regularly released non-public personal information that is unequivocally within policy and has been oversighted by humans many times already.

Edit period(s) (e.g. Continuous, daily, one time run): Continuous

Already has a bot flag (Y/N): Yes

Function Details: This bot will scan for reverts on prominent pages, analyse the reverted revision, and oversight when appropriate. Once that is bedded down, it will also analyse new revisions for the same problems, and revert and oversight them. The bot will use a blacklist to determine which edits should be oversighted, and this will not be publicly available for obvious reasons.
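
A minimal sketch of the matching step described above, assuming the blacklist is a list of regular expressions kept in a private data file apart from the code. The function names, the file format, and the sample pattern are hypothetical and are not taken from JVbot. The sketch only flags a reverted revision for review; it performs no oversight action.

```python
import re
from typing import Iterable, List


def load_blacklist(lines: Iterable[str]) -> List[re.Pattern]:
    """Compile one regular expression per non-empty, non-comment line."""
    patterns = []
    for line in lines:
        line = line.strip()
        if line and not line.startswith("#"):
            patterns.append(re.compile(line, re.IGNORECASE))
    return patterns


def should_flag(revision_text: str, blacklist: List[re.Pattern]) -> bool:
    """Return True if any blacklist pattern occurs in the revision text."""
    return any(p.search(revision_text) for p in blacklist)


# Hypothetical stand-in for the contents of the private blacklist file.
blacklist = load_blacklist(["# comment line", r"\bsecret-token-\d+\b"])
flagged = should_flag("see secret-token-123 for details", blacklist)
clean = should_flag("an ordinary revert of vandalism", blacklist)
```

Keeping the patterns in data rather than in the code is what would allow the code to be published while the blacklist stays private.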

Discussion

A separate bot account for this task should be created; bots with elevated user rights should not share accounts with bots that don't require them. BJ^Talk 00:59, 12 December 2008 (UTC)

If the request is approved in principle by the community and whoever else has a say in it, I will create a separate account to perform this task. John Vandenberg ^(chat) 02:06, 12 December 2008 (UTC)

Don't forget to advertise this request on the appropriate noticeboards to try to get wide community input. I for one would like some assurance that the oversighters will be keeping a general eye on the bot's actions, as by nature it will be impossible for the community at large to know just what the bot is oversighting. Anomie ⚔ 05:00, 12 December 2008 (UTC)

If it only oversights the diff prior to a revert, the community will be able to see the revert diff; however, it will be an empty diff. The bot could record a visible log with revids of those "empty" reverts, which would at least allow for a bit of review. John Vandenberg ^(chat) 05:42, 12 December 2008 (UTC)

That's the point: the community can't (by intention) see what the bot removed. I don't think a public log would be that useful, and oversighters already have the oversight log. Anomie ⚔ 12:15, 12 December 2008 (UTC)

A public log of the reverts creates a list of people who saw the diffs that were oversighted. They know what they saw, and can be queried about its contents. John Vandenberg ^(chat) 12:33, 12 December 2008 (UTC)

As I noted on IRC, this should get the OK from ArbCom, and possibly Cary Bass. And there definitely needs to be some assurance that everything the bot oversights gets reviewed. Would it be possible to release (the non-sensitive parts of) the code? Mr.Z-man 05:28, 12 December 2008 (UTC)

I'll be happy to have it reviewed by another Python coder before it goes into production. I did release my DJVU bot, and don't mind releasing my code for the patrol and oversight bot, but that can create an arms race, so I would like to have it properly bedded down before releasing the code. If an arms race starts anyway, I'll open it so that more devs can help. John Vandenberg ^(chat) 05:42, 12 December 2008 (UTC)

I'm not comfortable with a bot doing this because 1) the bot's actions can't be reviewed, 2) the bot's actions can't be undone, and 3) with the blacklist secret, there's no way of telling what the bot should be doing. --Carnildo (talk) 11:39, 12 December 2008 (UTC)

Strongly agree with Carnildo. Beyond the (significant) technical hurdles here, you're playing with fire. Oversight, unlike protection, deletion, etc., is not easily reversible or reviewable. You run the risk of false positives that destroy page histories. And I don't see any indication from any part of the community for giving bots this right. (The community is still lukewarm on the idea of bots having admin rights.) If we need more oversighters, then let's resolve that issue properly, please. :-) --MZMcBride (talk) 05:45, 13 December 2008 (UTC)

The issue is not simply solved by more oversighters, although more will reduce the problem. There are cases where the information clearly needs to be zapped immediately, i.e. within seconds; every set of eyes that sees it is a problem, specifically on pages where lots of eyes are looking. We would need 100 new oversighters to be able to have the required vigilance. No heuristics will be involved; just unambiguous cases that have already been discussed on oversight-l. John Vandenberg ^(chat) 06:56, 13 December 2008 (UTC)

What sort of information are we talking about that needs same-second elimination? Details of the President's security? True names of deep-cover operatives? What? --Carnildo (talk) 07:13, 13 December 2008 (UTC)

True names of deep-cover operatives is very close to the mark. John Vandenberg ^(chat) 10:01, 13 December 2008 (UTC)

Although I'm usually fine with adminbots, I'm really uncomfortable with an oversighterbot: it's not transparent (a necessary evil, though) and it's irreversible too. Maybe it would be better to have it report things like this to oversight-l? Maxim_(talk) 15:57, 13 December 2008 (UTC)
Reporting problems to oversight-l is a fallback that was being discussed offline last night. More code involved, and not as cool or effective, but ... that would be a good step forward. John Vandenberg ^(chat) 08:10, 14 December 2008 (UTC)

How about making it an adminbot that deletes the page and restores it without the bad revisions? Of course the problem is that the bot's deletion log would reveal the affected articles, and until the revision is oversighted any admin could view the deleted revision in the history, but it would minimize the number of 'outsiders' seeing the information. Essentially it boils down to how much we trust our admins. ~ User:Ameliorate! ^{(with the !) (talk)} 02:26, 15 December 2008 (UTC)

That would be a good solution for many cases; food for thought there.

However, the most problematic "leak" is one that occurs on a highly visible community page (e.g. ANI), and those pages usually have so many revisions that they can't be deleted. John Vandenberg ^(chat) 02:32, 15 December 2008 (UTC)

It seems to me that the best solution would be for the bot to delete the revision if it can, email oversight-l for removal, and pressure the developers for that selective deletion feature we've been promised for so long. --Carnildo (talk) 02:38, 16 December 2008 (UTC)

The 'selective deletion' process is a complete and total hack that should almost never be done by admins, much less by bots. Domas mused that we should require things to be deleted for a week just to stop admins from doing it. Truly, it's awful. --MZMcBride (talk) 01:02, 17 December 2008 (UTC)

From all this discussion, I gather that this bot has been discussed at another venue; link please if onwiki. Foxy Loxy ^Pounce! 12:20, 20 December 2008 (UTC)

I'm not aware of any other on-wiki discussion, but it has been talked about a good deal on IRC. BJ^Talk 12:24, 20 December 2008 (UTC)

Ok, it would be good if someone could detail what transpired on IRC for those who (unfortunately) weren't there. Foxy Loxy ^Pounce! 12:47, 20 December 2008 (UTC)

It hasn't been discussed onwiki, or anywhere else except on IRC with the same people who have also commented here. Some of the justification for the bot has been discussed on IRC as it is a mix of Wikipedia:BEANS and private, so it won't be repeated here. The rest of the IRC discussions have been mostly echoed here already, as I have often requested that BAG people mention here the concerns they raised on IRC. The notes here may be a bit obtuse as a result, but not due to any intention to hide the discussion. If you have questions, or feel parts need to be clarified, feel free to ask. John Vandenberg ^(chat) 12:20, 28 December 2008 (UTC)

There is no way I would trust this job to a closed-source bot. The code has to be public, just like the contribution history of someone applying for the oversight power is public. Per the adminbot discussion, the way you stop vandals from exploiting your bot is to separate the code from the data about what patterns to look for, and keep the data hidden. rspεεr (talk) 10:04, 28 December 2008 (UTC)

It is the data that I said would be hidden; see the initial request at the top.

I am more than happy to release the code (I am rather keen on open source, as you can see from my userpage), but doing so will likely result in an arms race, so I would prefer that it is reviewed by a few people, and given a little time to improve to better handle real world scenarios.

fwiw, the patrol bot is not open source because the framework I have written can be used maliciously; I've been working on a "malicious patroller detector" which will mean I can release the patrol bot code without worrying that I am causing new problems for the NPP crew.

John Vandenberg ^(chat) 12:06, 28 December 2008 (UTC)

I don't quite understand. Are you saying the bot would be exploitable through knowledge of its code alone? That's security by obscurity, and that's not a good thing to have in an oversight bot. If you're going to ask the community to grant oversight permission to your bot, you need to show us the code so that we have some basis on which to do so. rspεεr (talk) 21:28, 29 December 2008 (UTC)

Absolutely bad idea. This bot would be completely unaccountable to the community. Too much risk, and the damage that could be caused is irreversible (we'd obviously never know about it either). Majorly talk 00:53, 14 January 2009 (UTC)
Oppose: this task requires human judgment, and can't be done safely on the basis of a regexp or other automated pattern matching. The anti-vandal tools provide plenty of examples of bad regexp matches, such as matching "fannie" in an article about the recent financial crisis, but we trust the judgment of human editors not to press the rollback button. Anti-vandal bots use more restrictive regexps, but they still have false positives. Since anti-vandal bot errors are visible to everyone, we can simply revert the bot, and refine the regexp if necessary. But for an oversight bot, failures are more difficult to detect, and the effort required to fix them is much greater. Only oversighters could see that the bot made an error; then a developer would need to restore the incorrectly oversighted edits. At least, theoretically, because as the recent FT2 case shows, it may not be possible to bring the edits back at all. The result might be an acrimonious oversight controversy every month, which would really, really suck :( The Nordic Goddess Kristen ^{Worship her} 01:38, 14 January 2009 (UTC)
Oppose - Although users with Oversight will be able to check the bot's actions, there are 2 major problems: there are too few oversight users (currently 40), and the actions are irreversible. Humans with oversight may make mistakes; a bot with oversight is much more likely to. Note that these problems don't exist for deletions - there are over 1000 admins, and deletions can be reversed easily (except where there are also deleted revisions for the same title). עוד מישהו Od Mishehu 07:33, 14 January 2009 (UTC)
While I understand the reason for the bot, I'm uncomfortable with something that is permanent and not transparent. --Kbdank71 19:42, 14 January 2009 (UTC)
Oppose. Malfunctions would be disastrous and undetectable in a timely manner. A bot to report suspect edits to oversighters is acceptable though (the abuse filter could do that kind of thing actually). Cenarium (Talk) 03:15, 15 January 2009 (UTC)
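
The false-positive concern raised in the opposes above can be illustrated with a small sketch. Both patterns are hypothetical word-list entries, not taken from any real anti-vandal tool or from the proposed blacklist. The first matches legitimate text about the financial crisis; the second is more restrictive and avoids that case, yet still fires on another legitimate sentence.

```python
import re

# Hypothetical naive entry: matches the word anywhere, in any context.
naive = re.compile(r"fannie", re.IGNORECASE)

# Hypothetical stricter entry: whole word only, and not followed by "Mae".
strict = re.compile(r"\bfannie\b(?!\s+mae)", re.IGNORECASE)

crisis_text = "Fannie Mae was placed into conservatorship during the financial crisis."
biography_text = "Fannie Lou Hamer was a civil rights activist."

naive_hit_crisis = bool(naive.search(crisis_text))
strict_hit_crisis = bool(strict.search(crisis_text))
strict_hit_biography = bool(strict.search(biography_text))
```

Here `naive_hit_crisis` is true, `strict_hit_crisis` is false, and `strict_hit_biography` is true: tightening the pattern removes one false positive but not the underlying need for context.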

Denied. It seems clear that this task does not have the consensus of the community. Anomie ⚔ 03:41, 15 January 2009 (UTC)

The above discussion is preserved as an archive of the debate. Please do not modify it. Subsequent comments should be made in a new section.