Wikipedia:Bots/Requests for approval/Joe's Null Bot
- The following discussion is an archived debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA. The result of the discussion was Approved.
Operator: Joe Decker
Time filed: 19:51, Saturday May 19, 2012 (UTC)
Automatic, Supervised, or Manual: Eventual goal is automatic. I'd first test in manual, expect to run trials and full runs (if approved) in supervised mode for a while, but hopefully eventually migrate to automating the daily run with a crontab entry.
Programming language(s): Perl via MediaWiki::Bot MediaWiki::API
Source code available: User:Joe's Null Bot/source
Function overview: Once-daily application of special purges (originally proposed as WP:NULLEDITs) with the "forcelinkupdate" option set to each of the articles within Category:BLP articles proposed for deletion by days left.
Links to relevant discussions (where appropriate):
- Attempt at engagement at WT:BLPPROD, which largely got side-tracked into larger alternative proposals for reworking how maintenance templates are organized.
I'm more than happy to seek consensus elsewhere if requested; just tell me where. These were the targeted alternatives that seemed most sensible to me. I also deliberately pinged a couple editors who I thought might have an opinion, and posted a note on the Template's talk page, but largely the response I got back was apathy, and surprise that something like this might be necessary; the Template bug is a bit obscure to many editors.
Edit period(s): One run daily. At first I'd probably manually trigger the daily run and I might miss some days, with the long-term goal being to stick it into a crontab and forget it.
Estimated number of pages affected: Roughly 100 per day.
Exclusion compliant (Yes/No): Sure, although I'm not sure it's strictly necessary in this case, I tend to assume that providing such compliance is a sensible default.
Already has a bot flag (Yes/No): n/a: This is a new proposal.
Function details:
Each day, I intend this bot to search through the articles listed at Category:BLP articles proposed for deletion by days left and apply a purge (originally proposed as a WP:NULLEDIT) with the super-secret "forcelinkupdate" parameter set.
Background: The WP:BLPPROD process uses Template:Prod blp to label articles marked for deletion and to populate appropriate maintenance categories, both "Category:BLP articles proposed for deletion by days left" and "Category:Expired_proposed_deletions_of_unsourced_BLPs". A couple admins such as myself aside, most admins see these expirations only when articles are pushed into the latter category, but that only happens when an edit is made after the expiration date. Unfortunately, a normal "purge" will not actually update the contents of these categories.
That template-placed categories don't update for articles, even with a purge operation, is a "won't fix" misfeature of the MediaWiki software; a WP:NULLEDIT (which discusses this) is likely the minimal operation necessary.
My design sketch treats any protection on an article as a reason to not attempt the null edit, and any failure from get_text as a reason to abort the run entirely. (The latter is likely too paranoid: it would fail if the article was deleted during the category traversal, when simply moving on to the next article would likely do no harm.)
If approved, I'd also like feedback on the question of whether logging of some sort would be desired (it strikes me as overkill, but hey, that's just me) and of what form.
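The daily run described in the function details can be sketched roughly as follows. This is an illustration in Python, not the bot's actual Perl source. It assumes each page record looks like what the MediaWiki API returns for `prop=info&inprop=protection` (a `title` plus a `protection` list); the helper names `is_protected` and `purge_requests` and the `batch_size` default are hypothetical.

```python
from typing import Dict, Iterable, List

# Category named in the request; the bot would list its members first.
CATEGORY = "Category:BLP articles proposed for deletion by days left"


def is_protected(page_info: Dict) -> bool:
    """Treat any entry in the page's protection list as a reason to skip it."""
    return bool(page_info.get("protection"))


def purge_requests(pages: Iterable[Dict], batch_size: int = 20) -> List[Dict[str, str]]:
    """Build one API parameter dict per batch of unprotected page titles."""
    titles = [p["title"] for p in pages if not is_protected(p)]
    requests = []
    for start in range(0, len(titles), batch_size):
        batch = titles[start:start + batch_size]
        requests.append({
            "action": "purge",
            "forcelinkupdate": "1",  # asks MediaWiki to refresh link/category tables
            "titles": "|".join(batch),
            "format": "json",
        })
    return requests
```

The sketch only builds the request parameters; sending them (and fetching the category members in the first place) is left to whatever API client the operator prefers.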
Discussion
Comment by proposer: I had had some concern that NULLEDITS would mess up the article history, but I was assured (and have tested by handmade NULL edits) that such edits both "do what I want" and leave the history untouched. This should make any out-of-process edits on such a bot extremely apparent. However, it does mean that the 'bot would be doing something that, while seemingly utterly trivial, could do harm if there's some deep dark corner case of a NULLEDIT having a noxious effect. I can't think of one, but maybe you can. --joe deckertalk to me 20:17, 19 May 2012 (UTC)
You should probably use purge instead of null edits. — HELLKNOWZ ▎TALK 19:58, 19 May 2012 (UTC) Disregard that, I realized you need updating transcluded cats. — HELLKNOWZ ▎TALK 20:04, 19 May 2012 (UTC)
- No problem!
I'm still trying to work on clarifying my exposition, which is why I haven't added this to the BRFA; my apologies if this is all still a muddle of jumbled thoughts! --joe deckertalk to me 20:10, 19 May 2012 (UTC)
I've not looked into this particular request that closely yet. However, I will say that in the past null edit bots have been denied. This is because the job queue is there for a reason and it's been suggested that using null edit bots to get around it is not the right way to go. Instead, job queue issues would be something for devs to handle. See Wikipedia:Bots/Requests for approval/Null edit bot - Kingpin13 (talk) 21:01, 19 May 2012 (UTC)
- Thank you for pointing me at this; I was unaware of it. It raises several interesting questions, and does suggest that I'm unlikely to win support here.
- But a few questions/comments, if I may.
- First, that suggests that I have a fundamental misunderstanding of the specific problem I'm trying to fix. It suggests that, contrary to my understanding, at some point, perhaps two months later, the category will be properly populated. Is that right?
- Second, "two months", is that right? So far I've only been able to tell that it's more than a week or so, not because I've tested beyond that, but only because I haven't let anything go longer than that.
- I wouldn't be here if I figured the issue was "a day or two"; nobody cares if a few BLPPRODs slip a couple days. I stopped hand-inserting NULL EDITS for ten days (mostly) and saw the oldest entries not deleted, so I can assume it's at least seven or eight days long; beyond that, I have no data. Perhaps I've been wasting my time the last couple years. I honestly don't care if things slip a few days.
- But I do care if they slip two months. Editors balance how much work they do to deal with copyright and BLP issues based on how long they figure the article will survive before deletion, and knowing that "10 days" was really "70 days" would make a non-trivial difference in some of those decisions.
- Finally, FWIW, the developers have said they won't fix this, or again, that is my understanding, which (as is so often the case) may be flawed. So that's not a plausible alternative.
- Thanks for the pointer, that really is interesting. Cheers! --joe deckertalk to me 22:04, 19 May 2012 (UTC)
- Hmmm, I'm not convinced that the Help:Job queue is actually the problem. At the moment I'm entering this, the size of that queue is listed at under 100 items here, and fluctuates to such an extent that it's hard, if I understand correctly, to say that the five-day current backlog of unpopulated expirations is explained by the mechanism there. Any clarification would be greatly welcomed; sorry if I'm just being clue-free here. --joe deckertalk to me 22:38, 19 May 2012 (UTC)
- While certain magic words cause MediaWiki to expire the cache early, it seems that only edits (null or otherwise) will update the categorylinks table. So while the correct category might be shown at the bottom of the page, the category page itself won't show the page as a member until an edit is made. This is probably the same underlying cause as T33628. Anomie⚔ 23:33, 19 May 2012 (UTC)
- Thank you for the explanation; this is closer to my original understanding, although I'm ignorant of specific MW internals. I would imagine, as categorylinks is a database table, that any general fix in the MediaWiki software might have performance implications, but I really don't know, obviously. --joe deckertalk to me 18:21, 20 May 2012 (UTC)
Oppose This is a bad hack to fix a software issue. You should file a bug to have the software issue fixed, not bot null edits. --Chris 10:29, 21 May 2012 (UTC)
- Thanks. I'm highly sympathetic to the idea this is a bug that should be fixed. Where we differ, I expect, is in our expectation of the practical chances of that happening based on this behavior's history, but if such a fix were to be developed at some point in time, that would clearly be the right solution. I have (as of yesterday) requested clarification of my expectation (at 31628) as to whether this will be fixed, but I can file a separate bug as well if you believe that would be more likely to elicit a response. --joe deckertalk to me
- Update: Well, that is fascinating. While not documented at Help:Purge or mw:Purge, I discover through the discussion at T39001 that there is an option to purge that forces the category updates in question. The only reference I can find to this behavior is [1], but it also works as command-line magic, e.g.,
- That's certainly clean enough.
As a result/modified proposal, at this time I would like to replace the "null edit" of my proposal with this special purge. Chris, did you have other objections, or do you consider the need for a special purge a MediaWiki bug as well? I mean besides the "ugly as sin" thing, which I *entirely* agree with. --joe deckertalk to me 16:23, 21 May 2012 (UTC)
- I like the special purge. Ok, let's move on this, with the understanding that if they ever do fix the bug then this bot will stop. Approved for trial (0 edits or 7 days). Please provide a link to the relevant contributions and/or diffs when the trial is complete. (heh) Anomie⚔ 01:06, 22 May 2012 (UTC)
- Hey, thanks! And of course: if there's a bugfix, this "tidbit of ugly" is history, either during the trial, or (should this eventually be approved for a longer term) after. I'll start putting it together tomorrow, and will provide at least a short update here by the next day. --joe deckertalk to me 05:11, 22 May 2012 (UTC)
- Yeah, a purge should be fine. --Chris 05:30, 22 May 2012 (UTC)
- Thanks! And thanks for suggestion of filing at Bugzilla, too. --joe deckertalk to me 06:02, 22 May 2012 (UTC)
Note In looking at this this morning, I expect it'll be easier to write directly to MediaWiki::API rather than ::Bot, mostly as ::Bot doesn't know the special parameter. Please let me know if this causes any concerns. --joe deckertalk to me 18:00, 22 May 2012 (UTC)
- No concern. Anomie⚔ 20:47, 22 May 2012 (UTC)
- Thanks! --joe deckertalk to me 21:54, 22 May 2012 (UTC)
Update: First full run complete earlier today; category updates were as expected. Also saw the lag_delay functionality kick in, and that too was functioning "as expected." Will continue to manually trigger and monitor once per day through the trial period. Current source mod password redaction here. What's y'all's preference with respect to exclusion compliance here? I haven't implemented it yet, but it's clearly straightforward. --joe deckertalk to me 21:54, 22 May 2012 (UTC)
Quick Update: Exclusion compliance added, tested, today's run completed. --joe deckertalk to me 19:26, 23 May 2012 (UTC)
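For readers unfamiliar with exclusion compliance, the check amounts to reading a page's wikitext for {{nobots}} or {{bots|allow=...}} / {{bots|deny=...}} before touching it. The following is a simplified Python sketch of that convention, not the bot's Perl implementation; the function name `allow_bot` is hypothetical and it ignores less common template parameters such as `optout`.

```python
import re


def allow_bot(text: str, bot_name: str) -> bool:
    """Return False if the wikitext asks this bot to stay away (simplified)."""
    # A bare {{nobots}} excludes every compliant bot.
    if re.search(r"\{\{\s*nobots\s*\}\}", text, re.IGNORECASE):
        return False
    match = re.search(
        r"\{\{\s*bots\s*\|\s*(allow|deny)\s*=\s*([^}|]*)", text, re.IGNORECASE
    )
    if not match:
        return True  # no exclusion template found
    kind = match.group(1).lower()
    names = {n.strip().lower() for n in match.group(2).split(",") if n.strip()}
    bot = bot_name.strip().lower()
    if kind == "allow":
        if "none" in names:
            return False
        return "all" in names or bot in names
    # kind == "deny"
    if "none" in names:
        return True
    return not ("all" in names or bot in names)
```

A run would call this on each article's text and skip the purge when it returns False.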
End of Trial 1 Update While the trial period isn't officially over for a few hours, the last daily run of the trial period completed a couple hours ago. The last significant change to the code was between the day 3 and day 4 trials (where I rearranged the code to be more clueful about server lag); the code has been stable save for debugging output tweaks since. No signs of trouble observed. Still manually triggering; I'll eventually want to put this in crontab or something similar, but other than that and debugging output tweaks, I'm not anticipating any changes. --joe deckertalk to me 18:47, 28 May 2012 (UTC)
- All 0 of the edits made look good. Approved. Anomie⚔ 19:48, 28 May 2012 (UTC)
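The trial updates mention backing off when the servers are lagged. The bot's own lag_delay code is not shown in this discussion, so the following Python sketch is only an assumption about how such handling commonly works: it adds the MediaWiki API's `maxlag` parameter to each request and retries after a pause when the API answers with a `maxlag` error. The `send` callable, the function name `call_with_maxlag`, and the default values are all hypothetical.

```python
import time
from typing import Callable, Dict


def call_with_maxlag(
    send: Callable[[Dict[str, str]], Dict],
    params: Dict[str, str],
    maxlag: int = 5,
    retries: int = 3,
    wait_seconds: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict:
    """Send a request with maxlag set; pause and retry while the API reports lag."""
    request = dict(params, maxlag=str(maxlag))
    for attempt in range(retries + 1):
        response = send(request)
        if response.get("error", {}).get("code") != "maxlag":
            return response  # success, or an unrelated error for the caller to handle
        if attempt < retries:
            sleep(wait_seconds)
    raise RuntimeError("servers still lagged after %d retries" % retries)
```

Passing `send` and `sleep` in as arguments keeps the retry logic independent of any particular HTTP client and makes it easy to exercise without touching the live site.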
- The above discussion is preserved as an archive of the debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA.