Wikipedia:Bots/Requests for approval/Hazard-Bot 12
- The following discussion is an archived debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA. The result of the discussion was Withdrawn by operator.
Operator: Hazard-SJ (talk · contribs · SUL · edit count · logs · page moves · block log · rights log · ANI search)
Time filed: 04:49, Sunday June 3, 2012 (UTC)
Automatic, Supervised, or Manual: Automatic
Programming language(s): C#
Source code available: AutoWikiBrowser
Function overview: Removing non-free files, per Wikipedia:Non-free content criteria#9, from non-articles.
Links to relevant discussions (where appropriate):
Edit period(s): Occasional
Estimated number of pages affected: ≥1200
Exclusion compliant (Yes/No): Yes
Already has a bot flag (Yes/No): Yes
Function details: Removes non-free files from non-articles, for example: this, this, this, this, these etc. Hazard-SJ ✈ 04:49, 3 June 2012 (UTC)
Discussion
I know we have (or have had in the past) bots that do this. It might be useful to see how they handle this task for comparison. Some random thoughts:
- How exactly is it determined whether an image is non-free?
- Should users be notified that images they've added have been removed?
- Should the images be straight-out removed as in the examples you linked, should they be commented out (like <!-- Non-free image outside of article space removed by Hazard-Bot: [[File:Non-free-image.jpg]] -->), or possibly a third option like colon-escaping them (that is, [[File:Non-free-image.jpg]] → [[:File:Non-free-image.jpg]])? The last two wouldn't work for infoboxes, but for straight inclusions, they may be more useful.
— The Earwig (talk) 23:26, 3 June 2012 (UTC)
- There is a tool on Toolserver with the file list, maintained by checking for the existence of {{Non-free media}} on the file description page. I could comment them out, but I'm not sure the additional text within the comment would actually work. As for colon-escaping, I'm not sure I could handle that with AWB. As for your second question, I have no way of notifying them. It would be considerably difficult to parse the history of the page to check that. Hazard-SJ ✈ 23:28, 4 June 2012 (UTC)
- Also, for comparison, CommonsDelinker removes the entire thing. Hazard-SJ ✈ 01:38, 8 June 2012 (UTC)
Sorry, but (as with your last BRFA), users really do need to be notified when you are messing with stuff they've been working on. If you don't notify users that you've removed their image, all you'll end up doing is confusing new users and annoying experienced users. --Chris 09:10, 8 June 2012 (UTC)
- Realistically, there are two strategies for determining who added the image – I don't see it as a "considerably difficult" process. Both of these are used by WikiBlame, the de facto "revision search tool". You've got your linear search, and your binary search. While binary search is faster for general cases (O(log n) vs. O(n)), we'd probably want to go with a simple linear search from most recent to earliest, since the addition is likely to be rather recent and binary search can't guarantee that the diff it finds is the most recent addition of the image. Unfortunately, this might identify the wrong user if someone else vandalizes the page, removing the image in the process, followed by a drive-by reversion by a random user, but I don't think there's anything we can do about that. All you have to do for linear search is find the most recent revision where the filename is not present in the wikitext, and the editor of the revision following that is the one you notify.
- As for alternatives to straight-out removing the image, you can also try replacing it with File:NonFreeImageRemoved.svg (see right). This might be an easier operation, demonstrated via regex, where File:Foo.png is the offending image: first try replacing \[\[(File|Image):Foo.png(\||\]\]) with [[File:NonFreeImageRemoved.svg\2, then (File|Image):Foo.png with File:NonFreeImageRemoved.svg if that isn't found, and finally Foo.png with NonFreeImageRemoved.svg if that isn't found. This way, you only remove what is absolutely necessary and do your best to avoid, e.g., removing [[:File:Foo.png]], a regular link to Foo. But regardless, I'm okay with removing the image completely (using whatever AWB function removes images) as long as users are notified with some message containing – at least – a link to the page, a link to the image, and the diff where they added the image. — The Earwig (talk) 22:55, 8 June 2012 (UTC)
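The three-step replacement described above can be sketched in Python roughly as follows. This is only an illustration, not code from any existing bot: the function name replace_non_free and the constant PLACEHOLDER are hypothetical, the filename is regex-escaped (the patterns quoted above leave the dot unescaped), and MediaWiki title normalization (underscores vs. spaces, first-letter case) is assumed to be handled elsewhere.

```python
import re

# Hypothetical constant: the placeholder image named in the discussion.
PLACEHOLDER = "NonFreeImageRemoved.svg"


def replace_non_free(wikitext: str, filename: str) -> str:
    """Swap a non-free image for the placeholder, trying the narrowest pattern first."""
    name = re.escape(filename)  # assumption: match the filename literally

    # Step 1: a direct inclusion such as [[File:Foo.png|thumb]] or [[Image:Foo.png]].
    # A colon link like [[:File:Foo.png]] does not match, because "[[" must be
    # followed immediately by the namespace.
    pattern = r"\[\[(?:File|Image):" + name + r"(\||\]\])"
    result, count = re.subn(
        pattern, lambda m: "[[File:" + PLACEHOLDER + m.group(1), wikitext
    )
    if count:
        return result

    # Step 2: a namespaced name without leading brackets (e.g. a gallery line).
    # Note: if step 1 found nothing, this would also rewrite a colon link.
    result, count = re.subn(
        r"(?:File|Image):" + name, lambda m: "File:" + PLACEHOLDER, wikitext
    )
    if count:
        return result

    # Step 3: a bare filename, e.g. an infobox parameter value.
    return wikitext.replace(filename, PLACEHOLDER)
```

Because each step runs only when the previous one found nothing, a page that both displays the image and links to it keeps the link intact.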
- I'm not aware of AWB being able to do any of the searches you mentioned. As for confusing new users, the edit summary could explain, as well as a message in the bot's userspace. As for annoying experienced editors, depending on their experience, they might know about the policy. Otherwise, the edit summary etc. will inform them. Hazard-SJ ✈ 05:12, 10 June 2012 (UTC)
- No, I don't think AWB can do this kind of searching. You would need to actually write the bot yourself, as far as I know (I've never worked with AWB directly, so don't take this as fact). Also, I disagree that experienced editors should know better; it's possible for an image previously considered free to end up incorrectly licensed and be converted to fair use. An explanatory edit summary is good (and required, too), but my concern is mainly that new users may not think to check edit history (because they are new and might assume "oh, it's probably just a server error, I'll re-add it"); experienced users are less benefited by a talk page explanation, but if they are not watching the page, they will never know the image was removed until manually checking it. Granted, I can't think of a situation where an unwatched page losing an image really matters, so perhaps this is only useful for newbies. — The Earwig (talk) 18:24, 13 June 2012 (UTC)
- As an example, see how DASHBot does (or did) it: like so. — The Earwig (talk) 06:54, 15 June 2012 (UTC)
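For a bot written from scratch, the linear search proposed earlier in this discussion (walk the history from newest to oldest, find the most recent revision lacking the filename, and notify the editor of the revision after it) might look something like the sketch below. It assumes the history has already been fetched, newest first, as (editor, wikitext) pairs; that shape, the Revision alias, and the name find_adder are hypothetical, and a plain substring test stands in for proper filename matching.

```python
from typing import Optional, Sequence, Tuple

# Hypothetical shape for one revision: (editor name, full wikitext).
Revision = Tuple[str, str]


def find_adder(revisions: Sequence[Revision], filename: str) -> Optional[str]:
    """Return the editor of the most recent addition of `filename`.

    `revisions` must be ordered newest first. Returns None if the current
    revision does not contain the filename at all.
    """
    if not revisions or filename not in revisions[0][1]:
        return None

    adder = revisions[0][0]
    for editor, text in revisions:
        if filename not in text:
            # This is the most recent revision without the image, so the
            # next-newer revision (tracked in `adder`) added it.
            return adder
        adder = editor

    # The image is present in every revision, so the page creator added it.
    return adder


history = [
    ("C", "intro [[File:Foo.png]] more text"),
    ("B", "intro [[File:Foo.png]]"),
    ("A", "intro"),
]
print(find_adder(history, "Foo.png"))
```

As noted in the discussion, this reports the reverting editor rather than the original uploader when the image was removed by vandalism and then restored.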
- Withdrawn by operator. Hazard-SJ ✈ 01:17, 22 June 2012 (UTC)
- This is a shame... we really need a bot for this task. Either way, the paperwork has been completed. — The Earwig (talk) 01:27, 22 June 2012 (UTC)
- The above discussion is preserved as an archive of the debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA.