Wikipedia:Bots/Requests for approval/HersfoldBot
- The following discussion is an archived debate. Please do not modify it. Subsequent comments should be made in a new section. The result of the discussion was Approved.
Automatic or Manually Assisted: Automatic, but only run when supervised.
Programming Language(s): Java, using User:MER-C/Wiki.java.
Function Overview: HersfoldBot will transwiki articles from Category:Copy to Wiktionary to Wiktionary using the Special:Import function. This bot will also need approval and admin/import rights over there before it can fully operate; however, testing of the import function is possible at test.wikipedia.org.
Edit period(s): When needed, probably no more than once a day.
Already has a bot flag (Y/N): No, and should not need one unless required by policy - it would be preferred to have the bot's edits show up in RC so they can be noticed and the imported articles dealt with.
Function Details: HersfoldBot will collect the list of articles (only pages in the main namespace) from Category:Copy to Wiktionary and complete the following execution cycle for each article. The bot will ignore any article that has been tagged with {{TooManyRevisions}}; that template's function is explained later on.
- Determine if Wiktionary already has a page existing at wikt:Transwiki:<article name>.
- If not, the bot will attempt to import the full history of the article using Special:Import at Wiktionary (through the use of the API).
- If the import is successful, the bot will replace the Transwiki template ({{Copy to Wiktionary}} or one of its redirects) on Wikipedia with {{TWCleanup}} and log the transwiki both at Wikipedia:Transwiki log/Articles moved from here/en.wiktionary and wikt:Wiktionary:Transwiki log.
- If there is already a transwikied article by that title, the bot will not import, but simply replace the transwiki template with {{TWCleanup2}}.
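The per-article cycle above can be sketched as a small decision function. This is only an illustration under stated assumptions, not the bot's actual source: the class, enum, and method names are hypothetical, the template check is a simple case-insensitive text match that ignores redirects, and the existence of the Transwiki page is passed in as a flag rather than queried through Wiki.java.

```java
import java.util.Locale;

class TranswikiCycle {
    /** Possible outcomes for one article (illustrative names, not the bot's own). */
    enum Action { SKIP, IMPORT_THEN_TWCLEANUP, TWCLEANUP2_ONLY }

    /** Title that is checked on Wiktionary before importing. */
    static String transwikiTitle(String articleName) {
        return "Transwiki:" + articleName;
    }

    /**
     * Decide what to do with one article.
     * wikitext: current wikitext of the Wikipedia article.
     * transwikiPageExists: whether the Transwiki: page already exists on Wiktionary.
     */
    static Action decide(String wikitext, boolean transwikiPageExists) {
        // Articles tagged {{TooManyRevisions}} are ignored entirely.
        // Assumption: a plain text match is enough to detect the tag.
        if (wikitext.toLowerCase(Locale.ROOT).contains("{{toomanyrevisions")) {
            return Action.SKIP;
        }
        // An existing Transwiki page means no import, only {{TWCleanup2}}.
        // Otherwise import first; {{TWCleanup}} and the log entries follow
        // only if the import succeeds.
        return transwikiPageExists ? Action.TWCLEANUP2_ONLY : Action.IMPORT_THEN_TWCLEANUP;
    }

    public static void main(String[] args) {
        System.out.println(transwikiTitle("Hoodrats"));
        System.out.println(decide("{{Copy to Wiktionary}} Some text.", false));
        System.out.println(decide("{{Copy to Wiktionary}} Some text.", true));
        System.out.println(decide("{{TooManyRevisions}} Some text.", false));
    }
}
```

Keeping the decision separate from the network calls would let the branching be checked without touching either wiki.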
All of the actions the bot makes are logged to a text file on my computer so that I can review what happened, why it stopped running, and whether or not I need to go in myself to clean up some of the things it wasn't able to do (see next paragraph).
The bot has multiple safety checks built into it which will either stop it running or set it to ignore particular articles which have proven to be a waste of time.
- The bot will stop editing more-or-less immediately if it has new messages at either Wiktionary or Wikipedia.
- The bot will be unable to continue if it is blocked, and should exit cleanly if this proves to be the case.
- The bot will stop if it is unable to create or open the text log file on my computer. This happens before it tries to log in to either wiki, and in fact before I even enter the password.
- The bot will stop running if it encounters an IOException at any point (with one exception mentioned later), as these usually indicate a problem with the internet connection.
- The bot will stop running if it has inadvertently been logged out or finds that it does not have access to import articles.
- The bot will stop running if it gets a "cantimport", "badinterwiki", or unknown error back from the import API, as this indicates access has been denied, there is a problem with the hard-coded portion of the request URL, or something really bad happened.
- The bot will stop running if it encounters ten "notempdir" errors in a row - this is a server-side error, and may be only temporary; the counter allows the server time to correct itself without the bot stopping, but then will force the bot to stop if it seems the server is really having trouble.
- The bot will stop if I enter the wrong password or it otherwise fails to log in twice.
- The bot will mark within its log that manual review is needed in the following circumstances, but will not stop running:
- The bot receives a HTTP 504 error from the import API after attempting to import an article. In testing, it appears that this will sometimes occur when importing articles with high revision counts (roughly 200-300, I think), even though the import may successfully complete. The bot will also pause for five seconds to allow the server to recover.
- The bot receives a "filetoobig" error from the API. This will cause the bot to stop importing it and add {{TooManyRevisions}} to the article on Wikipedia, which will cause it to ignore the article on future runs.
- The bot receives ten "cantopenfile" errors from the API for the same article. This seems to occur at random for some articles, but repeatedly for articles with very high revision counts (estimated to be 300 or more, not reliably tested yet). This will cause the bot to stop importing the article and add {{TooManyRevisions}} to the article on Wikipedia, which will cause it to ignore the article on future runs.
- The bot will stop running if a total of three or more of these errors occur during its run. While these errors do not necessarily indicate a problem by themselves (since the import API does appear to be only partially reliable at best), repeated occurrences of them could mean I need to check the code. When each of these errors occurs, the error will be noted in the text log and the article will be re-added to the bot's import queue for a later attempt.
- The bot receives a "notoken" error from the import API.
- The bot receives a "badtoken" error from the import API.
- The bot receives a "nofile" error from the import API.
- The bot receives a "partialupload" error from the import API.
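The handling of import API error codes listed above could be organised roughly as follows. This is a hedged sketch, not the bot's real code: the class, enum, and method names are hypothetical, the HTTP 504 case is left out because it is not an API error code, and two details are assumptions: that any other error resets the "in a row" counter for "notempdir", and that a "cantopenfile" error below the limit simply leads to another attempt.

```java
import java.util.HashMap;
import java.util.Map;

class ImportErrorPolicy {
    /** What the bot does next (illustrative names). */
    enum Outcome { STOP, RETRY, REQUEUE, TAG_TOO_MANY_REVISIONS }

    private int consecutiveNotempdir = 0;   // "notempdir" errors in a row
    private int requeueErrors = 0;          // token/file errors during this run
    private final Map<String, Integer> cantOpenFile = new HashMap<>();

    /** A successful import ends any run of consecutive "notempdir" errors. */
    void onSuccess() {
        consecutiveNotempdir = 0;
    }

    Outcome onError(String article, String code) {
        // Assumption: any other error also breaks the "in a row" streak.
        if (!code.equals("notempdir")) {
            consecutiveNotempdir = 0;
        }
        switch (code) {
            case "notempdir":
                // Server-side and possibly temporary: give up after ten in a row.
                return ++consecutiveNotempdir >= 10 ? Outcome.STOP : Outcome.RETRY;
            case "filetoobig":
                return Outcome.TAG_TOO_MANY_REVISIONS;
            case "cantopenfile": {
                // Counted per article; the tenth one marks the article.
                int seen = cantOpenFile.merge(article, 1, Integer::sum);
                return seen >= 10 ? Outcome.TAG_TOO_MANY_REVISIONS : Outcome.RETRY;
            }
            case "notoken":
            case "badtoken":
            case "nofile":
            case "partialupload":
                // Re-queue the article, but stop once three have occurred in total.
                return ++requeueErrors >= 3 ? Outcome.STOP : Outcome.REQUEUE;
            default:
                // "cantimport", "badinterwiki" and unknown errors are fatal.
                return Outcome.STOP;
        }
    }

    public static void main(String[] args) {
        ImportErrorPolicy policy = new ImportErrorPolicy();
        System.out.println(policy.onError("Example", "filetoobig"));
        System.out.println(policy.onError("Example", "badtoken"));
        System.out.println(policy.onError("Example", "cantimport"));
    }
}
```

Holding the counters in one object keeps the "ten in a row", "ten per article", and "three in total" limits from being mixed up with each other.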
I will be placing the source code online soon at User:HersfoldBot/Source for review; that page will be fully protected. The code contains more complete documentation, as well as a slightly more detailed listing of the various conditions that will make the bot die (there are currently 30 different exit codes that indicate an error).
I would like to get approval here first, if possible. I have been unable to test the editing functions of this bot yet, and would like to be able to test that here before trying to get approval and admin rights over at Wiktionary. The import functionality has been tested at testwiki: and appears to work fine (see testwiki:Special:Contributions/HersfoldBot). Once operational, I will also look into transferring the logs the bot produces onto Wikipedia, somewhere within the bot's userspace.
Discussion
Wow, that's one BRFA you've got there Hersfold. Anyhow, by request, a quick criteria analysis from me (as urgency probably plays second fiddle to getting it perfect here):
- is harmless: that's what a period of debate and trial is for.
- is useful : Yes, passes that one easily enough.
- does not consume resources unnecessarily: Yep.
- performs only tasks for which there is consensus: I can't see this being a problem.
- carefully adheres to relevant policies and guidelines: I can't see any problems, and I trust an admin to know his way around them anyway.
- uses informative messages, appropriately worded, in any edit summaries or messages left for users: I would hope so.
So in summary, no problems so far, although one gets the feeling that some time, trial and error may be needed to get everything working perfectly. - Jarry1250 (t, c) 20:25, 9 March 2009 (UTC)
- Oh, definitely. You'll see at User:HersfoldBot/Version that the code has already undergone several revisions as I worked out the weird bugs. I still don't entirely know what to expect from the import API (since by all accounts it's buggy as hell), but I think (hope) I've worked out most of the major things already. It's proven easy enough to work around those errors so far, anyway. Hersfold (t/a/c) 21:41, 9 March 2009 (UTC)
- Seems harmless. Approved for trial (50 edits or 5 days). Please provide a link to the relevant contributions and/or diffs when the trial is complete. MBisanz talk 23:09, 9 March 2009 (UTC)
- Trial running now -
- The bot will handle test.wikipedia.org as Wiktionary; edits can be viewed at testwiki:Special:Contributions/HersfoldBot.
- The bot will use User:HersfoldBot/Wikipedia:Transwiki log/Articles moved from here/en.wiktionary as the local transwiki log since the articles aren't being transwikied to Wiktionary.
- The bot will edit the source articles here; however, all of those edits will be rolled back on completion to keep the articles categorized. Special:Contributions/HersfoldBot
- Once done, I will copy the text log to User:HersfoldBot/Trial run log. Hersfold (t/a/c) 01:52, 10 March 2009 (UTC)
- Trial running now -
- Ok, trying this again after I clean up the mess on test wiki - I forgot to assign the results of some functions back into some strings, so when the bot tried to edit, it ended up not doing anything (on the articles) or overwriting the existing content (on the logs). Hersfold (t/a/c) 02:14, 10 March 2009 (UTC)
- Trial complete. The trial has finished. According to the last run log, the bot imported 13 articles to test.wikipedia.org out of the 16 that it attempted. The 3 articles that it failed to import received "cantopenfile" errors. I'm not sure what caused Hoodrats to fail (it was probably the API being ornery), but both List of slang terms for police officers and List of terms for gay in different languages have repeatedly gotten these errors, which makes me think they have too many revisions to import. Had the program continued, it probably would have marked both of these with {{TooManyRevisions}} eventually. The bot recorded all of its actions at testwiki:Transwiki log (mimicking Wiktionary's log) and at User:HersfoldBot/Wikipedia:Transwiki log/Articles moved from here/en.wiktionary (mimicking Wikipedia's log). The bot is now blocked again, pending further clearance. Hersfold (t/a/c) 04:32, 10 March 2009 (UTC)
I've just made some changes to the code to allow it to run through a GUI instead of the command line; could I get another trial to make sure it still works OK? The changes made will be logged in the bot's userspace shortly, although the changes made to the bot's operating code shouldn't have a substantial effect on how it runs. Hersfold (t/a/c) 23:29, 10 March 2009 (UTC)
- This seems pretty harmless, but could take some time to get right, so I'm moving it to Approved for extended trial. Please provide a link to the relevant contributions and/or diffs when the trial is complete. Lemme know when it is all fixed up. MBisanz talk 08:12, 11 March 2009 (UTC)
- I was called to comment over at Wiktionary and left my comments there, but my main suggestions are to make it check for duplicates among main Wiktionary entries under the same name (not just Transwiki pages), and to give it a character limit on how long imported articles can be. Goldenrowley (talk) 02:48, 12 March 2009 (UTC)
- (outdent) Still adding features as requested on Wiktionary; I'm going to hold off on the final trial until I get approval for test runs on their end or they stop throwing suggestions at me. Their suggestions include a lot of stuff that a Wikipedia editor wouldn't know about simply because it's about how they deal with things on their end. Hersfold (t/a/c) 05:29, 13 March 2009 (UTC)
- {{OnHold}} - In order for me to do final testing on Wiktionary, I'll need use of the import flag over there. In order to get that (and approval to run) I need to go through one of their two-week vote periods, so there isn't likely to be any more information here for a while. I am still watchlisting this, so if anyone has any further comments or questions, I'll see them. Hersfold (t/a/c) 01:20, 16 March 2009 (UTC)
{{BAGAssistanceNeeded}} - The approval vote over at Wiktionary should wrap up tomorrow, and so far it's unanimously in favor of granting the import flag. Once I (or whoever closes the vote) hunts down a steward for the flag, the bot will be ready for a final live trial between Wikipedia and Wiktionary for approval here. There have been several changes to the code since a trial was last run, so if I could get someone to review that and approve the bot for final trials that'd be great. Thanks. Hersfold (t/a/c) 19:17, 24 March 2009 (UTC)
Approved for trial (50 edits or 7 days). Please provide a link to the relevant contributions and/or diffs when the trial is complete. Sounds fine to me. MBisanz talk 21:35, 24 March 2009 (UTC)
- Running into some minor problems with the import API - should be running smoothly in a moment. Hersfold (t/a/c) 05:51, 26 March 2009 (UTC)
- Trial complete. After fixing the API queries (I forgot to fix that bit of the code before running), the bot imported 10 articles, marked one for manual review due to its size, and removed another from the category since it already existed at Wiktionary. The bot ran for approximately 11-12 minutes and encountered no errors. A log of the imports it made can be viewed at Wikipedia:Transwiki log/Articles moved from here/en.wiktionary#HersfoldBot import March 26 2009, 05:58:24 UTC, and the bot's full operation log is available at User:HersfoldBot/Trial run log#Final trial. Hersfold (t/a/c) 06:53, 26 March 2009 (UTC)
Approved. BJTalk 07:02, 26 March 2009 (UTC)
- The above discussion is preserved as an archive of the debate. Please do not modify it. Subsequent comments should be made in a new section.