Wikipedia:Bots/Requests for approval/CounterVandalismBot

The following discussion is an archived debate. Please do not modify it. Subsequent comments should be made in a new section. The result of the discussion was Approved.

CounterVandalismBot

Operator: Lloydpick

Automatic or Manually Assisted: Automatic

Programming Language(s): Visual C#

Function Summary: Reverting Vandalism

Edit period(s) (e.g. Continuous, daily, one time run): Continuous

Edit rate requested: Max 10 edits per minute.

Already has a bot flag (Y/N): N

Function Details: Monitors the Recent Changes feed. Once a change is read in, the bot analyses it, and if it believes the article has been vandalised it runs further checks to confirm the vandalism. Once those checks are complete, it reverts the article.

I am aware there are existing anti-vandalism bots on Wikipedia, but I created this not only to help control vandalism but also to learn a new programming language, in this case Visual C#. It is currently complete with only the basic features: detection of vandalism and reverting. Warning users will come soon, once I get some more time to update the bot. My understanding of the bot creation process, as stated here, is that I should create a request before letting it loose on Wikipedia, and then build on it further with the agreement of the group.

Discussion

Thank you for bringing your bot here for discussion. Vandalism bots are tricky, to say the least. What type of logic will your bot use to determine whether an edit is valid? — xaosflux ^Talk 13:55, 14 September 2007 (UTC)

Currently it uses a scoring system in which an article is given a starting score of zero. Various word checks are then done to identify bad as well as good words. Each separate check has a given value, which is added to or subtracted from the overall score. There are also generic checks that adjust the score, such as the number of words and whether the text matches the usual structure of a sentence. It also takes the article name into account, so that if someone references, say, a swear word in the article about that swear word, certain checks are ignored. Lloydpick 14:27, 14 September 2007 (UTC)
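
The scoring approach described above could look roughly like the following. This is only a minimal sketch in Python, not the bot's actual C# code: the word lists, the weights, the direction of the score (lower meaning more suspicious) and the name `score_edit` are all assumptions made for illustration.

```python
import re

# Hypothetical word lists and weights; the discussion does not give the real ones.
BAD_WORDS = {"stupid": -5, "sucks": -5}
GOOD_WORDS = {"citation": 2, "reference": 2}


def score_edit(added_text: str, article_title: str) -> int:
    """Return a score for the added text; lower means more suspicious (assumed)."""
    score = 0  # every edit starts at zero
    title_words = set(re.findall(r"[a-z']+", article_title.lower()))
    words = re.findall(r"[a-z']+", added_text.lower())
    for word in words:
        # Skip a bad-word check when the word is the article's own subject.
        if word in BAD_WORDS and word not in title_words:
            score += BAD_WORDS[word]
        elif word in GOOD_WORDS:
            score += GOOD_WORDS[word]
    # Generic check: very short additions carry little context.
    if 0 < len(words) < 3:
        score -= 1
    # Generic check: text that ends like a sentence earns a small bonus.
    if re.search(r"[A-Z][^.!?]*[.!?]\s*$", added_text):
        score += 1
    return score
```

With these made-up weights, the same word scores differently depending on the article it is added to, which is the point of the article-name check.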

From the talk page:

“This bot is currently NOT a exclusion compliant bot. It will be soon however.”

It probably should not be... — madman bum and angel 19:28, 14 September 2007 (UTC)

I'm not too sure really; I'm in two minds about whether it should or shouldn't be. There are pros and cons to each: if it's exclusion compliant, then some pages could get vandalised easily. I don't know, so I'm opening it up to the floor. Lloydpick 19:59, 14 September 2007 (UTC)

You could have a page like User:CounterVandalismBot/Run and if the page is changed to 'false' the robot will turn off. Are you using a framework? Alpta 03:44, 15 September 2007 (UTC)

It probably shouldn't have a maximum of 10 edits per minute. It should be able to edit as fast as it needs to in order to correct vandalism. --(Review Me) R ^ParlateContribs_@ (Let's Go Yankees!) 05:41, 15 September 2007 (UTC)

R, the bot doesn't run with a flag, it runs unflagged. Alpta 13:52, 15 September 2007 (UTC)

R is right though. We don't cap the edit rate of antivandalism bots, even though they run without a flag. I don't think this bot should be approved, however, until the warning of users is set up. Reverting is only half the process, and if users aren't warned, they're likely to vandalize again. —METS501 (talk) 15:41, 15 September 2007 (UTC)

Ok, thanks for the comments so far guys. As I said, it's still a work in progress, and I thought I should start the approval process before it's completely finished in case I come a cropper on something along the way. I'll remove the edit cap, and I'll be starting work on the warning system next; after that it will be the page flag to turn the bot on and off. Currently it uses api.php for most of the work, and the rest is done by custom code I have written. This project started off as something to learn C#, so I decided writing everything myself would be the best way to learn. Lloydpick 19:45, 16 September 2007 (UTC)

Cool, C# :) . Are you using the dotnetwikibot framework? Would you be willing to show me the code? H₂O 22:50, 16 September 2007 (UTC)

No, it's not using the dotnetwiki framework. I decided to jump in at the deep end and work out and learn how to do everything myself from scratch. I can show some WP admins the code if you like, but I don't want to post it publicly for everyone to see. Lloydpick 23:49, 16 September 2007 (UTC)

If you'd like, you can email the code to me and I'll post it and delete it quickly and link to the deleted revisions so that only admins can see it. —METS501 (talk) 03:33, 17 September 2007 (UTC)

Sure, that sounds like a good idea. I want to do a little bit more work on it before I show the source, however; it needs a little clean-up ;) Just to keep everyone else up to date, the bot now edits once every 5 seconds, to make sure that the work it does is completed before it starts the next. It also now gives users warnings, but only a static one. I'm going to work on making them dynamic (i.e. warning1, then warning2, etc.), hopefully tonight when I get back from a work meeting. Lloydpick 08:55, 17 September 2007 (UTC)

As discussed in previous counter-vandalism bots' BRFAs, the bot needs to be able to detect existing warnings, not only from itself but from other users. — madman bum and angel 14:04, 18 September 2007 (UTC)

Indeed. To make this possible I had to add a database to the back of it, so that keeping track of warnings becomes much easier. Hopefully by the end of tonight it can count up existing warnings. Out of interest, something popped up while I was coding: how long do warnings last? I know that there's no exact timespan, but roughly: monthly, weekly? If I'm going to count up others' warnings, I need to know how far back to look and how many to count. What would everyone recommend? So far I have gone with the current month. Lloydpick 00:49, 19 September 2007 (UTC)

Right then, the bot now warns the user correctly based on the vandalism warnings already issued for the current month. Next on the list is making it obey a "Run" page, so that the bot can be disabled by normal (registered) users. Lloydpick 12:39, 20 September 2007 (UTC)
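
Counting the current month's warnings to choose the next warning level could be sketched as below. This is an illustrative Python sketch, not the bot's code: the cap of four levels and the name `next_warning_level` are assumptions, and the warning timestamps are taken as already collected from the user's talk page or the bot's database.

```python
from datetime import datetime

MAX_LEVEL = 4  # assumption: four escalating warning levels


def next_warning_level(warning_times, now):
    """Count warnings issued in the current calendar month and return the next level."""
    this_month = [
        t for t in warning_times
        if (t.year, t.month) == (now.year, now.month)
    ]
    # One level above the warnings already given, capped at the top level.
    return min(len(this_month) + 1, MAX_LEVEL)
```

Warnings from earlier months are simply ignored, which matches the "current month" window mentioned above.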

This isn't necessary, IMO. MartinBot doesn't use one, as far as I know. If you want to use it though, you could have the bot grab the run page before each edit it makes to make sure it's still the same text. That's the simplest way to stop it, unless you create another script which will kill the main process if the page is changed. For example, "Run.exe" checks your run page, and if it is changed to 'stop' or something, that script will kill "main program.exe". Just food for thought... CO₂ 23:53, 20 September 2007 (UTC)

Using the extra process method could allow the bot to make another destructive edit in the time between checks. If it came to a point where it really needed to be shut down, I think it would be better to have it check every time, so that if it is going nuts it's stopped instantly rather than being allowed perhaps a few more edits. However, I like the process itself, and I may use it in another project (non-WP related) that I'm working on. Cheers :) Lloydpick 00:54, 21 September 2007 (UTC)
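
Checking the run page before every edit, as preferred above, might look like this. It is a hedged Python sketch only: `fetch_run_page` and `do_revert` are hypothetical callables standing in for the real network code, and treating the text 'true' as "keep running" is an assumption based on the earlier suggestion that changing the page to 'false' turns the bot off.

```python
def may_edit(run_page_text: str) -> bool:
    """Return True only while the run page still holds the expected text."""
    return run_page_text.strip().lower() == "true"


def process_queue(pending_reverts, fetch_run_page, do_revert):
    """Re-check the run page before every single edit; stop at once if it changed."""
    done = 0
    for revert in pending_reverts:
        if not may_edit(fetch_run_page()):
            break  # the page was changed, so no further edits are made
        do_revert(revert)
        done += 1
    return done
```

Because the check sits inside the loop, no edit is made after the page changes, at the cost of one extra page fetch per edit.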

More news regarding the database: anything the bot now does, from scoring an article to warning users, is recorded in the database to provide a log, so that if there is ever a problem it can be traced back. I'm also thinking about putting the ID number in the warning message so that if it is incorrect it will be much faster to track it down. Lloydpick 09:02, 21 September 2007 (UTC)

If you plan to include the ID, I'd recommend you put it in hidden HTML tags instead of making it visible, since that could confuse new contributors. CO₂ 03:48, 22 September 2007 (UTC)

Could do. I may make it obvious somewhere, like in the edit summary, but I'll change it to a hidden reference. Lloydpick 13:15, 22 September 2007 (UTC)

Adjusted the warning so that the reference ID number is hidden, and changed the edit summary to show the ID at the end of the message. Lloydpick 23:41, 22 September 2007 (UTC)
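
One way to hide the reference ID in the warning while still showing it in the edit summary is sketched below in Python. The use of an HTML comment (which is not rendered on the page), the exact wording, and both function names are assumptions; the discussion does not say what markup the bot actually uses.

```python
def build_warning(template_text: str, log_id: int) -> str:
    """Append the log ID inside an HTML comment so readers do not see it (assumed markup)."""
    return f"{template_text} <!-- CounterVandalismBot log ID: {log_id} -->"


def build_edit_summary(message: str, log_id: int) -> str:
    """Show the log ID at the end of the edit summary."""
    return f"{message} (ID {log_id})"
```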

← The bot should also either not revert to its own version, or not revert administrators' revisions. MartinBot obeys the former, VoABot II obeys the latter. — madman bum and angel 21:22, 23 September 2007 (UTC)

The bot will now not revert to its own version. I have also adjusted the scoring system a little after watching the bot for most of the night in observer mode (i.e. no edits, no warnings). It had a few false positives (roughly 2 in 200 checks), but with the changes I have made the rate should be lower. Lloydpick 21:51, 23 September 2007 (UTC)
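
The "do not revert to its own version" rule can be illustrated with a short Python sketch. The history format (a newest-first list of revision ID and username pairs) and the name `pick_revert_target` are assumptions for illustration, not the bot's real data structures.

```python
BOT_NAME = "CounterVandalismBot"


def pick_revert_target(history, vandal, bot_name=BOT_NAME):
    """Pick the revision to restore, or None if the bot should stay out of it.

    history is a list of (revision_id, username) pairs, newest first.
    """
    for rev_id, user in history:
        if user == vandal:
            continue  # skip the vandal's own run of edits at the top
        # First revision by someone else: refuse it if it is the bot's own.
        return None if user == bot_name else rev_id
    return None  # the vandal wrote every revision; nothing to restore
```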

With the bot being run in observer mode currently, it has tried to revert over 1,000 articles, and here's a little breakdown as to what it tried to do...

reverted:               823          71.9%     I attempted to revert these articles
beaten:                 218          19.1%     I was beaten to fixing this vandalism by another user
aborted:                92           8.0%      I stopped processing due to an error (client)
already reverted:       11           1.0%      I have already reverted this article once today
----------------------------------------------------------------------------------------------------
total:                  1144 as of 25/09/2007 00:38:34
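
The percentages in the table can be checked against the raw counts with a few lines of Python. The figures are copied from the table; the variable names are illustrative only.

```python
# Counts copied from the breakdown table.
counts = {"reverted": 823, "beaten": 218, "aborted": 92, "already reverted": 11}

total = sum(counts.values())
# Share of each outcome, as a percentage rounded to one decimal place.
shares = {name: round(100 * n / total, 1) for name, n in counts.items()}
```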

Obviously, the more useful statistics such as false positives are not really possible short of me reviewing every attempt the bot makes, which can be done, as I was doing that last night. However, this evening seems to have been very heavy for vandalism, and the bot was left running to watch memory usage and to test some fixes to stop it crashing rather than to assess its anti-vandalism capabilities. It is also getting to the point where I am possibly willing to share the source code; however, as it is a C# project, it isn't a one-file program. So I will zip up the solution and send it to people on demand (Wikipedia admins only please; I'm afraid I'm not willing to share it with anyone who isn't). If you would like to see the source code, post on my talk page with some sort of contact details and I'll send you the latest stable code I have at the time. Your comments and time are much appreciated, thank you! :) Lloydpick 00:03, 25 September 2007 (UTC)

I've been leaving the bot running in observer mode during the day and part of the nights, and we're now up to 2,386 attempted reverts. I have been looking through the reverts which could be false positives (ones by named accounts) and have found one situation where a real edit could be marked as vandalism. I'm currently working on a solution to that. Once that has been fixed, I suspect the bot could well be considered "feature complete", in my eyes anyway. Lloydpick 20:08, 26 September 2007 (UTC)

Ok, I have added another datagrid view to the bot so that I can see exactly what has been reverted and what the modifications were. I will leave it running through this evening and will be monitoring directly what it's doing (it's still in observer mode though, so still no reverts). The aim is to see whether any more preventable false positives occur. Beyond this, the bot, for me at least, is classed as feature complete. Lloydpick 17:33, 27 September 2007 (UTC)

Approved for trial. Please provide a link to the relevant contributions and/or diffs when the trial is complete. Make 50 edits/reverts in the mainspace with the bot, and warn as such. Run 40 of those reverts at 4 edits per minute (including warning, so in reality 2 reverts per minute), then run the last 10 edits as fast as it finds vandalism. CO₂ 20:42, 27 September 2007 (UTC)
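
Holding the bot to a rate such as 4 edits per minute means spacing edits at least 15 seconds apart. A minimal Python sketch of such a throttle follows; the class name `Throttle` and its interface are assumptions, and the caller is expected to sleep for the returned number of seconds before editing.

```python
class Throttle:
    """Space edits so that no more than edits_per_minute are made."""

    def __init__(self, edits_per_minute: float):
        self.interval = 60.0 / edits_per_minute  # seconds between edits
        self.next_allowed = 0.0

    def wait_time(self, now: float) -> float:
        """Return how long to wait before the next edit and reserve that slot."""
        wait = max(0.0, self.next_allowed - now)
        self.next_allowed = now + wait + self.interval
        return wait
```

Since each revert is followed by a warning, both count as edits here, which is why 4 edits per minute works out to 2 reverts per minute.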

Done. Trial completed: 40 edits at 2 epm and 10 edits at full speed. Details of the reverts can be found at User:CounterVandalismBot/Trial. I'm going to work on the bot, as some issues appeared while running this test. I'll update here again once those have been fixed. Thanks, Lloydpick 00:22, 28 September 2007 (UTC)

Added conflict detection, so the bot should no longer warn a user if the revert attempt conflicts. Also added contribution count restrictions, to help avoid reverting legitimate edits which look like vandalism to the system (primarily censorship). Unfortunately, though, there's no way to test the conflict detection without setting the bot to go live. Lloydpick 21:33, 29 September 2007 (UTC)
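
The two additions above, skipping the warning when the revert conflicts and leaving editors above a contribution count alone, could be combined roughly as follows. This Python sketch is illustrative only: the threshold of 50 edits, the outcome labels, and the callables `try_revert` and `warn_user` are all hypothetical.

```python
def handle_vandalism(editor_edit_count, try_revert, warn_user, max_edit_count=50):
    """Return an outcome label for one suspected vandalism edit.

    try_revert() is assumed to return False when the save conflicts,
    for example because another user reverted first.
    """
    # Contribution count restriction: leave established editors alone.
    if editor_edit_count > max_edit_count:
        return "skipped"
    # Conflict detection: no revert was made, so no warning is issued.
    if not try_revert():
        return "conflict"
    warn_user()
    return "reverted"
```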

Approved for trial. Please provide a link to the relevant contributions and/or diffs when the trial is complete. Same thing, make 50 edits/reverts in the mainspace with the bot, and warn as such. Run 40 of those reverts at 4 edits per minute, then run the last 10 edits as fast as it finds vandalism. CO₂ 14:11, 30 September 2007 (UTC)

Done. The 2nd trial is completed and ran much better than the previous one. A few errors occurred, all of which were fixed during the trial. The first was the conflict detection working incorrectly; this was fixed and confirmed to work during the test. The second was a single false positive, and I have added additional checks so that this particular case should not occur again. Trial results can be found on the Trial page. Lloydpick 01:47, 1 October 2007 (UTC)

Could I see the code? —— Eagle101^{Need help?} 01:23, 5 October 2007 (UTC)

Sure, I've left you a PM on IRC. Lloydpick 08:36, 5 October 2007 (UTC)

I don't see any reason not to. Approved. :) --uǝʌǝsʎʇɹnoɟ ʇs 11:47, 8 October 2007 (UTC)

The above discussion is preserved as an archive of the debate. Please do not modify it. Subsequent comments should be made in a new section.