Wikipedia:Bots/Requests for approval/PDFbot 2
- The following discussion is an archived debate. Please do not modify it. Subsequent comments should be made in a new section. The result of the discussion was Approved.
Operator: Dispenser
Automatic or Manually Assisted: Automatic
Programming Language(s): pywikipedia
Function Summary: Dictionary based dead link repair
Edit period(s) (e.g. Continuous, daily, one time run): Monthly
Edit rate requested: 1 edit per minute
Already has a bot flag (Y/N): Yes
Function Details: Adding a task to fix dead links using a dictionary that replaces all instances of a URL on the pages that are normally processed (pages with {{PDFlink}}). This task is only useful when a large number of links have been relocated, as with the Virginia State Routes.
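The dictionary-based replacement described in the function details could look roughly like the sketch below. It is not the bot's actual code: the function name `fix_dead_links`, the mapping `URL_FIXES`, and the example.org prefixes are placeholders invented for illustration, and the sketch assumes the dictionary maps old URL prefixes to new ones.

```python
import re

# Hypothetical, human-verified dictionary: old URL prefix -> new URL prefix.
# The example.org values are placeholders, not the real relocated paths.
URL_FIXES = {
    "http://www.example.org/old/": "http://www.example.org/new/",
}


def fix_dead_links(wikitext, url_fixes):
    """Replace every occurrence of each old URL prefix with its new prefix."""
    if not url_fixes:
        return wikitext
    # Try longer prefixes first so a more specific entry wins over a shorter one.
    keys = sorted(url_fixes, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(key) for key in keys))
    # A single pass means text that was already replaced is never rewritten again.
    return pattern.sub(lambda match: url_fixes[match.group(0)], wikitext)
```

Calling `fix_dead_links(page_text, URL_FIXES)` returns the page text with every matching prefix rewritten. Text that contains no dictionary entry comes back unchanged.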
Discussion
How will the bot know if a link is dead or if the server is temporarily down? —METS501 (talk) 23:36, 23 March 2007 (UTC)
- The dictionary is human-generated and verified. The bot currently checks the content-type to see whether it is application/PDF or octet-stream, so it doesn't report the size of 404 pages. I plan on using it on the VA highway pages, see [1]. —Dispenser 00:29, 24 March 2007 (UTC)
- Sorry, I don't understand :-( Not your fault. You can either try to explain it more step by step to me or wait for another BAG member or user. —METS501 (talk) 01:46, 24 March 2007 (UTC)
- The WikiProject U.S. Interstate Highways has been using www.virginiadot.org as a reference and has tagged the pages with PDFlink. About half a year ago, I believe, virginiadot moved their resources to a different directory. Fixing these URLs is trivial, but since I did not ask for it in my original request, I can't do it. Thus, I'm filling out a second request to cover fixing URLs. —Dispenser 03:34, 24 March 2007 (UTC)
- With the content-type precautions you've made and the use of a manually created list of what to replace in the URL to get it to work, I don't see any problem with the idea. However, in the diff you referenced above, the PDF size isn't shown in brackets afterwards (the parameter having been removed). Is this a problem present in the bot, or is it just the fact that you were carrying out the replacement manually? Also, I'd just like to seek assurance that the bot does check that the new links work before replacing the old ones with them, to avoid any problems with some dead links being introduced. Martinp23 21:05, 24 March 2007 (UTC)
- The parameter was removed manually because it was incorrect (those were the sizes of the 404 HTML pages); this is not reflective of what the bot would do. The links will pass the normal (correct content-type) tests before they are committed. —Dispenser 22:08, 24 March 2007 (UTC)
- Approved for trial. Please provide a link to the relevant contributions and/or diffs when the trial is complete. No more than 50 edits, please report back when complete. Martinp23 05:46, 26 March 2007 (UTC)
- Trial was completed last week. —Dispenser 00:53, 4 April 2007 (UTC)
- Everything looks OK, but it would be nice to change the edit summary to make it more fully describe the changes. On this understanding - Approved. Martinp23 09:38, 4 April 2007 (UTC)
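The content-type precaution discussed above (accept a link only if the server reports application/PDF or octet-stream) might be sketched as follows. This is an illustration under stated assumptions, not the bot's code: the names `looks_like_pdf`, `fetch_content_type`, and `ACCEPTED_TYPES` are hypothetical, and it assumes "octet-stream" means the media type `application/octet-stream`.

```python
import urllib.request

# Media types treated as a PDF download, per the discussion (assumed spelling).
ACCEPTED_TYPES = {"application/pdf", "application/octet-stream"}


def looks_like_pdf(content_type):
    """Return True if a Content-Type header value names an accepted media type."""
    if not content_type:
        return False
    # Drop parameters such as "; charset=..." and compare case-insensitively.
    media_type = content_type.split(";", 1)[0].strip().lower()
    return media_type in ACCEPTED_TYPES


def fetch_content_type(url, timeout=10):
    """Send a HEAD request and return the Content-Type header, or None on failure."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.headers.get("Content-Type")
    except OSError:
        # Covers HTTP errors such as 404, unreachable hosts, and timeouts.
        return None
```

A replacement would then be committed only when `looks_like_pdf(fetch_content_type(new_url))` is true. An HTML 404 page served at the new address would fail this check, so the old link would be left in place.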
- The above discussion is preserved as an archive of the debate. Please do not modify it. Subsequent comments should be made in a new section.