Wikipedia:Bots/Requests for approval/GreenC bot 5
- The following discussion is an archived debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA. The result of the discussion was Approved.
Operator: GreenC (talk · contribs · SUL · edit count · logs · page moves · block log · rights log · ANI search)
Time filed: 02:52, Tuesday, April 24, 2018 (UTC)
Automatic, Supervised, or Manual: automatic
Programming language(s): BotWikiAwk
Source code available: accdate.awk
Function overview: The proposal is for 'accdate bot' to remove |access-date= from citations in the tracking category Category:Pages using citations with accessdate and no URL using targeted strategies.
Links to relevant discussions (where appropriate): Help_talk:Citation_Style_1#Clearing Category Pages using citations with accessdate and no URL - also CS1 documentation which supports use of |access-date= for |url= only.
Edit period(s): one-time run during first pass as standalone bot; then semi-continually as part of a module of WaybackMedic
Estimated number of pages affected: 25,000 (57% of 43,719)
Exclusion compliant (Yes/No): Yes
Already has a bot flag (Yes/No): Yes
Function details:
Of the Category:CS1 errors, the tracking category with the most entries is Category:Pages using citations with accessdate and no URL (43,719). There is no silver bullet solution to clearing the cat, so this will break it down by targeting known types of problems within that category. There have been many discussions about it over the years.
The proposal is for 'accdate bot' to remove |access-date= from citations in the tracking category Category:Pages using citations with accessdate and no URL using the following strategies:
- 1. Remove |accessdate= in CS1|2 templates that don't have a |url= but do have a value assigned to any of the various 'permanent-record' identifiers, excluding the templates {{cite web}}, {{cite podcast}}, and {{cite mailing list}}. Normally |isbn= would be excluded from the identifier list, but in a {{cite book}} it is included.
- 2. Remove |accessdate= in {{Cite book}}, {{Cite news}} and {{Cite journal}} with no |url=. Per the documentation, "Access dates are not required for links to published research papers, published books, or news articles with publication dates." If a publication date is provided, remove |accessdate=.
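The two strategies above can be sketched roughly as follows. This is an illustrative Python sketch, not the bot's actual accdate.awk code. The identifier list `PERMANENT_IDS`, the date parameter names in `DATE_PARAMS`, and all function names are assumptions made for the example; the BRFA does not enumerate them. The parser is deliberately naive and assumes a citation with no nested templates or piped wikilinks.

```python
# Assumed 'permanent-record' identifiers; the real list is not given in this BRFA.
PERMANENT_IDS = {"doi", "pmid", "pmc", "jstor", "bibcode", "oclc", "arxiv"}
EXCLUDED = {"cite web", "cite podcast", "cite mailing list"}
DATED_TEMPLATES = {"cite book", "cite news", "cite journal"}
ACCESS_DATE = {"accessdate", "access-date"}
DATE_PARAMS = {"date", "year"}  # assumed names for a publication date


def parse(template):
    """Naively split '{{name|k=v|...}}' into (name, [raw parameter strings])."""
    inner = template.strip()[2:-2]
    name, *params = inner.split("|")
    return name, params


def key_value(raw):
    """Return (lower-cased key, stripped value) for one raw parameter."""
    key, _, value = raw.partition("=")
    return key.strip().lower(), value.strip()


def strip_access_date(template):
    """Drop the access date when strategy 1 or strategy 2 applies."""
    name, params = parse(template)
    kind = " ".join(name.lower().split())
    if not (kind.startswith("cite ") or kind == "citation"):
        return template  # not a CS1|2 template
    values = dict(key_value(p) for p in params)
    if values.get("url"):
        return template  # a URL is present, so the access date is legitimate
    ids = set(PERMANENT_IDS)
    if kind == "cite book":
        ids.add("isbn")  # |isbn= only counts inside {{cite book}}
    rule1 = kind not in EXCLUDED and any(values.get(i) for i in ids)
    rule2 = kind in DATED_TEMPLATES and any(values.get(d) for d in DATE_PARAMS)
    if not (rule1 or rule2):
        return template
    # Other parameters are kept byte-for-byte, including their spacing.
    kept = [p for p in params if key_value(p)[0] not in ACCESS_DATE]
    return "{{" + "|".join([name] + kept) + "}}"
```

Because the remaining parameters are passed through untouched, a citation whose last parameter was the access date keeps the trailing space of the preceding parameter, and an empty |url= is left in place by this sketch.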
Discussion
- The bot has been updated to the specifications above. A dry run of 1,000 articles found a fix in 574 or about 57%. The total cites fixed is 1165, of those 1121 are of type #2 and 44 are of type #1. I manually checked about 100 diffs offline and don't see any problems but will manually check these 574 once they are uploaded. Or whatever number is approved for trial. -- GreenC 15:44, 1 May 2018 (UTC)
- See User talk:CitationCleanerBot/Archive 1#Accessdates. You may want to link to that, or provide a similar explanation on the bot's page, because those questions will happen A LOT. Headbomb {t · c · p · b} 16:39, 1 May 2018 (UTC)
- Agreed, a good idea to have a FAQ since |access-date= is a common source of confusion: what it's for and why it exists. -- GreenC 04:11, 3 May 2018 (UTC)
Approved for trial (50 edits). Please provide a link to the relevant contributions and/or diffs when the trial is complete. Since no one from BAG seems interested in this, I'll take it despite having been involved in the discussion a bit. Headbomb {t · c · p · b} 16:45, 16 May 2018 (UTC)
- Trial complete. Edits (toolserver). Or Special:Contributions of May 7. -- GreenC 21:06, 17 May 2018 (UTC)
- The edits look good to me. A very minor cosmetic issue: for edits like these where the accessdate parameter is the last parameter in the citation, ideally the bot should also be removing the white space in front of the pipe character rather than leaving some extra white space at the end of the citation. —RP88 (talk) 21:37, 17 May 2018 (UTC)
- The space is there because the preceding argument has a trailing space and the bot leaves other arguments alone for safety. I understand personal preferences for spacing, but I can't program for every contingency; cites are often a mix of spacing styles. Whether removing the preceding argument's trailing space is always the right decision, I don't know. Arguably in this case the spacing is consistent because every other argument has both a leading and trailing space. The bot retained the existing style, though it was coincidence. -- GreenC 22:15, 17 May 2018 (UTC)
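The spacing trade-off raised in the trial thread can be illustrated with a small Python sketch. It is not taken from the bot's source; the regular expressions and function names are assumptions for the example. `remove_conservative` deletes only the access-date parameter itself, which leaves the preceding parameter's trailing space in place, while `remove_trimming_last` additionally strips the white space in front of the pipe when the access date is the last parameter.

```python
import re

# Matches the pipe, the parameter name, and its value up to the next '|' or '}'.
ACCESS = re.compile(r"\|\s*access-?date\s*=[^|}]*", re.IGNORECASE)
# Same, but only as the last parameter, and including white space before the pipe.
LAST = re.compile(r"\s*\|\s*access-?date\s*=[^|}]*(?=\}\})", re.IGNORECASE)


def remove_conservative(cite):
    """Remove the access date and leave every other parameter untouched."""
    return ACCESS.sub("", cite)


def remove_trimming_last(cite):
    """Also trim the white space before the pipe when the access date is last."""
    return ACCESS.sub("", LAST.sub("", cite))


cite = "{{cite book |title=X |accessdate=2 May 2018}}"
print(remove_conservative(cite))   # keeps the space after 'X'
print(remove_trimming_last(cite))  # drops the space after 'X'
```

The conservative variant never rewrites a parameter it was not asked to touch, which is the safety argument made above; the trimming variant gives the tidier result when the access date came last, at the cost of editing the neighbouring parameter's spacing.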
@GreenC: In edits like these [1] (and I could pick several examples), the bot also removes empty |url= parameters, and I do not see the wisdom in doing that. This discourages finding free URLs and makes it (slightly) harder to add them. Empty parameters should be left alone. Headbomb {t · c · p · b} 16:23, 18 May 2018 (UTC)
- I concur with User:Trappist the monk in the discussion, and also generally about removing them when they might cause confusion - in this case empty |url= have actually created some of the problem this bot is attempting to resolve. There is no evidence empty arguments encourage users to fill them in (nudge theory); there's no way future editors can know why the empty argument exists: did it once have something and was deleted? Was the citation copy-pasted in with other empty args and lazily the empties were kept? Was it always empty? There's no nudge factor because there are so many possibilities of why it exists. If the empty |url= included a wikicomment saying "A URL might exist; please fill me in, or delete this notice and empty arg" that would be more clear. Do we want to do it? It seems like it would be true for any citation without a |url= and goes down the rabbit hole of trying to direct users what to do. -- GreenC 18:09, 18 May 2018 (UTC)
- By that rationale, every empty parameter should be removed, and that's not something I feel bots should be doing, save in fairly controlled situations, or strong consensus to do so (in which case the functionality could be implemented in AWB). I picked a clean edit, but I could have picked an edit where the bot removed an empty url parameter, but left a slew of other empty parameters alone (jstor/zbl/etc...) such as [2]. The problem the bot is trying to solve is stray accessdates, so it should stick to that IMO. Open to other BAG opinion here since I'm partly involved here. I will point out that in the discussion that led to this, no one suggested/supported removing empty url parameters from citations. Headbomb {t · c · p · b} 18:18, 18 May 2018 (UTC)
- "In this case empty |url= have actually created some of the problem this bot is attempting to resolve." Removal is relevant to the purpose of the bot, and it's limited to the citation it edits as a secondary - it doesn't seek out other empty arguments in other citations. To nudge the community to do things with signals of encouragement is not the bot's intention. OTOH removal of |url= within the citations it edits is relevant to the bot's purpose. -- GreenC 19:41, 18 May 2018 (UTC)
- Personally, I have no issue with removal. The empty args are a waste of space and accomplish nothing from my viewpoint. Also basic bots working with cite templates may encounter issues with empty URL parameters, though good coding can easily work around that.—CYBERPOWER (Chat) 20:30, 18 May 2018 (UTC)
- WP:COSMETICBOT says «changes that do not [change output] are typically considered cosmetic». Sometimes this means that it's taken for granted they can be performed alongside bigger changes, sometimes it means they raise more complaints than the bigger change. :) --Nemo 23:58, 18 May 2018 (UTC)
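The scoping described in this thread, where an empty |url= is dropped only as a secondary change inside a citation the bot is already editing, might look like the following Python sketch. The regular expressions and the function name `clean_citation` are hypothetical and are not taken from the bot's source; the sketch also assumes the citation has already qualified under one of the two strategies in the function details.

```python
import re

ACCESS = re.compile(r"\|\s*access-?date\s*=[^|}]*", re.IGNORECASE)
# |url= followed only by white space before the next '|' or '}'.
EMPTY_URL = re.compile(r"\|\s*url\s*=\s*(?=[|}])", re.IGNORECASE)
# |url= with at least one non-space character in its value.
HAS_URL = re.compile(r"\|\s*url\s*=\s*[^|}\s]", re.IGNORECASE)


def clean_citation(cite):
    """Remove the access date, then any empty |url=, in this citation only."""
    if HAS_URL.search(cite):
        return cite  # a real URL is present; nothing to do
    without_date, removed = ACCESS.subn("", cite)
    if removed == 0:
        return cite  # the citation is not being edited, so leave its empty |url= alone
    return EMPTY_URL.sub("", without_date)
```

Other empty parameters such as |jstor= are left untouched by this sketch, which mirrors the behaviour objected to above: only |url= is treated as related to the access-date problem.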
{{BAGAssistanceNeeded}} To be clear I'm recusing myself from making the final call here. I have listed some objections above, but I'll note for the record they are not a personal deal breaker for me, simply a concern I have. Headbomb {t · c · p · b} 16:37, 1 June 2018 (UTC)
- I have no issues with the removal of |url=, as Cyberpower678 mentions above, they can cause issues. Approved. SQLQuery me! 15:53, 28 August 2018 (UTC)
- The above discussion is preserved as an archive of the debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA.