Wikipedia:Bots/Requests for approval/BattyBot 10
- The following discussion is an archived debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA. The result of the discussion was Approved.
Operator: GoingBatty (talk · contribs · SUL · edit count · logs · page moves · block log · rights log · ANI search)
Time filed: 23:14, Sunday May 20, 2012 (UTC)
Automatic, Supervised, or Manual: Automatic
Programming language(s): AWB
Source code available: AWB general fixes
Function overview: Use AWB's general fixes to change {{No footnotes}} to {{More footnotes}} if an article has at least one inline citation. (e.g. diff)
Links to relevant discussions (where appropriate):
Edit period(s): Multiple runs
Estimated number of pages affected: Hundreds
Exclusion compliant (Yes/No): Yes
Already has a bot flag (Yes/No): Yes
Function details:
Discussion
How're you going to tell the difference between explanatory footnotes and genuine citations? I suggest only doing the fully automated change if there's an inline citation with a citation template - any other articles should be supervised. Allens (talk | contribs) 09:30, 22 May 2012 (UTC)[reply]
- Good question, Allens! I've asked the AWB developers here. GoingBatty (talk) 01:14, 23 May 2012 (UTC)[reply]
- Since AWB just counts <ref> tags, I'll be conservative and skip any page that does not contain {{cite. GoingBatty (talk) 22:12, 24 May 2012 (UTC)[reply]
- Sounds good. Also note the {{citation}} and {{sfn}} templates. (The latter generates its own internal "ref", so is particularly important.) Allens (talk | contribs) 22:31, 24 May 2012 (UTC)[reply]
- I agree, Allens. To ensure I match one of those templates but skip {{citation needed}}, I'll skip all articles that do NOT contain the following regex: {{(cite|sfn|citation\s?\|). GoingBatty (talk) 01:57, 25 May 2012 (UTC)[reply]
- Looks good. (You may need to escape the {{.) Allens (talk | contribs) 04:13, 25 May 2012 (UTC)[reply]
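A minimal sketch of the skip test agreed in this thread, written in Python for illustration only (AWB uses .NET regular expressions, and the function name `should_skip` is hypothetical, not part of AWB). The braces are escaped as suggested above, and the trailing pipe after "citation" is what keeps {{citation needed}} from counting as a citation.

```python
import re

# Illustrative Python translation of the regex proposed above; not AWB code.
# The braces are escaped, and "citation" must be followed by an optional
# space and a pipe, so "{{citation needed|...}}" does not match.
CITATION_TEMPLATE = re.compile(r"\{\{(cite|sfn|citation\s?\|)")


def should_skip(wikitext: str) -> bool:
    """Return True when the page contains no cite/sfn/citation template."""
    return CITATION_TEMPLATE.search(wikitext) is None
```

As written, the pattern is case-sensitive and does not allow white space after the opening braces, which is the limitation raised later in the discussion.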
Rather self-explanatory task, clarified well by the discussion above. Approved for trial (25 edits). Please provide a link to the relevant contributions and/or diffs when the trial is complete. — madman 16:34, 30 May 2012 (UTC)[reply]
- Setting up the trial now. Note that I won't run this fully automated until the next SVN of AWB is available, which will contain the fix for this bug I reported. GoingBatty (talk) 01:41, 31 May 2012 (UTC)[reply]
- Trial complete. Here are the edits made by the bot. Note that I made the bot more conservative to skip all articles that do NOT contain the following regex: <ref(\sname=.*)?>{{(cite|sfn|citation\s?\|). GoingBatty (talk) 02:57, 1 June 2012 (UTC)[reply]
- Actually, the "sfn" part is not valid - "sfn" contains a "ref" itself, so would not be inside a "ref". (In other words, {{sfn|author|year|page}} does the same thing as <ref>author (year), page</ref>.) Perhaps taking care of the sfn ones can be done on a second run after the first full run is complete? Allens (talk | contribs) 10:35, 1 June 2012 (UTC)[reply]
- Right again, Allens! I'll try {{sfn|<ref(\sname=.*)?>{{(cite|citation\s?\|) next time. GoingBatty (talk) 16:29, 1 June 2012 (UTC)[reply]
- ".*" could be something more explicit to reduce possible problems. "name ?=" would be better than "name=" and similar for all other places where white space could occur. Snowman (talk) 13:23, 2 June 2012 (UTC)[reply]
- Actually, ".*" should probably instead be "\s*.+?", if AWB will accept that (without the "?" for non-greedy if it won't); the names used are too variable to use much of anything else. "name\s*=" would indeed be better; good point. Allens (talk | contribs) 14:29, 2 June 2012 (UTC)[reply]
- What about "\s*.*?", and I think AWB does accept the non-greedy version. I have seen the malformed syntax "<ref name = >{{cite| " Snowman (talk) 21:09, 2 June 2012 (UTC)[reply]
- Needs to accept text versions with white space before "cite" and "citation";
"\s*(cite|citation)\s*\|". Why is the "\|" within the round brackets in the expression above? Also, white space at other places is not yet catered for. Snowman (talk) 21:09, 2 June 2012 (UTC)[reply]
- Point on the whitespace. I'm surprised that a ref without a name after the name= works. The \| is in the round brackets because it's only needed for distinguishing between "citation needed" and "citation | ...". Allens (talk | contribs) 21:38, 2 June 2012 (UTC)[reply]
- I see how it selects "citation" and "citation needed". The style of programming I prefer is with more belts and braces;
"\s*(cite\s*\||citation\s*\|)" or "\s*(cite|citation)\s*\|". I think that this is likely to give fewer unforeseen errors. I think that "cite\s*\|" should be used because it makes use of the pipe, which is a significant character. A ref without a name after "name=" is rendered normally. Snowman (talk) 22:01, 2 June 2012 (UTC)[reply]
- Umm... all the "cite" templates have something after the "cite" before the next pipe. I suppose one could use "cite\s+\w+\s*\|", but I'm not sure why one would bother. Allens (talk | contribs) 22:59, 2 June 2012 (UTC)[reply]
- Thank you, I got that wrong. There are a limited number of things that can occur after cite. "\s+\w+\s*" would not work consistently, because sometimes there are two words, hyphenated words, or words with unusual capitalization. Snowman (talk) 23:21, 2 June 2012 (UTC)[reply]
From [[Template:Citation/core]]: Template:Citation, Template:Cite arXiv, Template:Cite book, Template:Cite conference, Template:Cite DVD-notes, Template:Cite encyclopedia, Template:Cite IETF, Template:Cite interview, Template:Cite journal, Template:Cite mailing list, Template:Cite manual, Template:Cite news, Template:Cite newsgroup, Template:Cite press release, Template:Cite report, Template:Cite sign, Template:Cite speech, Template:Cite techreport, Template:Cite thesis, Template:Cite video, Template:Cite web. Snowman (talk) 23:21, 2 June 2012 (UTC)[reply]
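The suggestions in this thread (white space tolerance, {{sfn}} standing outside a ref tag, case-insensitive matching, multi-word and hyphenated "cite" template names) could be combined roughly as follows. This is a Python sketch under stated assumptions, not the regex the bot used: the character class for template names and the function name `has_template_citation` are illustrative choices, and AWB's .NET regex engine may need different syntax.

```python
import re

# Illustrative combination of the points raised above; not the bot's regex.
INLINE_CITATION = re.compile(
    r"""
    \{\{\s*sfn\s*\|                      # sfn generates its own ref, so it stands alone
    |
    <ref(?:\s+name\s*=\s*[^>]*?)?\s*>    # plain ref tag, or one with a (possibly empty) name
    \s*\{\{\s*
    (?:cite\s+[\w\- ]+?|citation)        # cite web, cite press release, cite DVD-notes, citation
    \s*\|                                # the pipe rules out the citation needed template
    """,
    re.IGNORECASE | re.VERBOSE,
)


def has_template_citation(wikitext: str) -> bool:
    """Return True when at least one templated inline citation is found."""
    return INLINE_CITATION.search(wikitext) is not None
```

A ref tag that wraps free text or a bare link is deliberately not matched, which keeps the test conservative at the cost of skipping some pages that a human would change.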
- Where is the source code? Snowman (talk) 13:26, 2 June 2012 (UTC)[reply]
- The bot is using AWB, as stated above; see Wikipedia:AWB#Getting_the_sources. Allens (talk | contribs) 14:29, 2 June 2012 (UTC)[reply]
- Of course, a lot of in-line citations do not use the cite template. Unless I have missed something, this proposal would miss a lot of pages that might need correcting. Snowman (talk) 13:32, 2 June 2012 (UTC)[reply]
- Yes - but better that than "correcting" pages that shouldn't be. Allens (talk | contribs) 14:29, 2 June 2012 (UTC)[reply]
- It is better to miss as few pages that need editing as possible. Snowman (talk) 21:25, 2 June 2012 (UTC)[reply]
- If it was semiautomated processing, that would be correct. When we're talking about fully automated processing, I disagree. Allens (talk | contribs) 21:38, 2 June 2012 (UTC)[reply]
- I think that the white space problem is likely to have shown up in the skipped pages, and I would like to know in what way the skipped pages were examined to make improvements to the regex. Snowman (talk) 22:28, 2 June 2012 (UTC)[reply]
- How will pages be selected for processing? Snowman (talk) 13:33, 2 June 2012 (UTC)[reply]
- I would imagine by being in the category Articles lacking in-text citations, into which they're placed by {{no footnotes}}. Allens (talk | contribs) 21:38, 2 June 2012 (UTC)[reply]
- Approximately, how many pages are skipped for one page edited? Snowman (talk) 13:46, 2 June 2012 (UTC)[reply]
- It might be useful to see what pages it has skipped, which is provided by the AWB log. This might give clues to what white space it is not catering for. Has the bot operator looked at the pages the bot has skipped? There is a link to the list of 25 edits that AWB has made, but there is not a list of pages that were skipped. Snowman (talk) 22:38, 2 June 2012 (UTC) Snowman (talk) 22:24, 2 June 2012 (UTC)[reply]
- Why not ask for this to be incorporated into AWB, so that AWB users can do this semi-automatically. Snowman (talk) 20:49, 2 June 2012 (UTC)[reply]
- I suspect it's already in there - note "AWB general fixes"? Allens (talk | contribs) 21:38, 2 June 2012 (UTC)[reply]
- Would it be better to write it within round brackets; "({{sfn|<ref(\sname=.*)?>{{(cite|citation\s?\|))"? (needs consideration for white space variations). Snowman (talk) 23:00, 2 June 2012 (UTC)[reply]
- There are lots of templates that can be used in in-line citations. Snowman (talk) 23:31, 2 June 2012 (UTC)[reply]
- True, but "cite", "citation", and "sfn" encompass most of them. Marking unsigned edit by Allens (talk | contribs) 04:29, 3 June 2012 (UTC)[reply]
- I think that it would benefit the efficiency of the regex if more templates (that are used in in-line citations) were incorporated into it. Snowman (talk) 08:12, 3 June 2012 (UTC)[reply]
- Certainly. What others are you thinking of? Allens (talk | contribs)
- Please remember to sign. Of course the bot's author would need to look them up. The one for Internet Movie Database "{{IMDb name|" and many for birds and animals ("{{IUCN 200[0-9]|", "{{IUCNlink") and bare links (like this one I found on the Philip Larkin page; <ref>[http://news.bbc.co.uk/2/hi/entertainment/3193692.stm Larkin is nation's top poet], BBC News, 23 October 2003; [http://entertainment.timesonline.co.uk/tol/arts_and_entertainment/books/article3127837.ece The 50 greatest British writers since 1945], ''The Times'', 5 January 2008.</ref>) The sequence within bare links does vary and they may not all begin with the url. Snowman (talk) 10:23, 4 June 2012 (UTC)[reply]
- Sorry about the signature! If it is considered desirable to spot URLs, the most general regex for http/https is: "https?://[^\]\[<>"\x00-\x20\x7F\p{Zs}]{2}" - this is for simply detecting URLs.
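For illustration, the URL-detection regex above can be approximated in Python. The standard `re` module does not support `\p{Zs}`, so this sketch spells out the space-separator code points explicitly; that list is an assumption based on the Unicode Zs category and should be checked before relying on it. The function name `contains_url` is hypothetical.

```python
import re

# Space separators (Unicode category Zs) other than U+0020, written out
# because Python's standard re module has no \p{Zs}. Assumed list.
ZS_CHARS = "\u00A0\u1680\u2000-\u200A\u202F\u205F\u3000"

# Same shape as the regex quoted above: scheme, then at least two
# characters that cannot terminate or break a bare URL in wikitext.
URL = re.compile(r'https?://[^\]\[<>"\x00-\x20\x7F' + ZS_CHARS + r"]{2}")


def contains_url(text: str) -> bool:
    """Return True when the text contains something that looks like a URL."""
    return URL.search(text) is not None
```

This only detects that a URL is present somewhere in the text; deciding whether it sits inside a ref tag would need a further check.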
- "Cite" and "cite" both render normally. It would be worth finding about if "Citation" and "citation" are both valid. Snowman (talk) 23:39, 2 June 2012 (UTC)[reply]
- They are, as are "Sfn" and "sfn". The WM software automatically capitalizes the first letter. Case-insensitive detection seems sensible. Allens (talk | contribs) 04:29, 3 June 2012 (UTC)[reply]
- What if there is only one in-line ref on a page and it is marked as a dead link? Snowman (talk) 08:51, 3 June 2012 (UTC)[reply]
- Dead links are still counted as references (although one could argue that it should be otherwise for cite web). It's for external links that dead means to be removed. Allens (talk | contribs) 19:07, 3 June 2012 (UTC)[reply]
- The date in the "more footnotes" template is not changed. The date is the same as the original "no footnotes" date; see one of the bot's edits. Snowman (talk) 08:58, 3 June 2012 (UTC)[reply]
- Yes. I can see arguments both ways whether this is a good thing or not. I'm tending to think that, given that "more footnotes" includes "no footnotes" as a more-extreme subset, using the old date is appropriate - it tells one how long the article has been noted as needing more footnotes (whether due to no footnotes at all or inadequate footnotes). Allens (talk | contribs) 19:07, 3 June 2012 (UTC)[reply]
- The basic information on the template wiki pages does not say anything about keeping old dates. In fact, the information says use {{date}}, which is the date a new template is written. Snowman (talk) 17:35, 4 June 2012 (UTC)[reply]
- If an article has both an "Refimprove" template and a "No footnotes" template, would it be better to remove the "no footnotes" template; see Infantry fighting vehicle? Snowman (talk) 09:13, 3 June 2012 (UTC)[reply]
- User GoingBatty has removed the footnotes template leaving the refimprove template; see his comment below. I think that this is correct on this page. I note that User GoingBatty did this manually and this was not an action of the bot. Snowman (talk) 12:22, 4 June 2012 (UTC)[reply]
- The two do not mean the same thing. "No footnotes" indicates that there are some references (without footnotes); it does not say whether or not these references are adequate. "Refimprove" indicates that the existing references are not adequate. Now, "unreferenced" (if correct) would indicate that "no footnotes" should be entirely removed. Allens (talk | contribs) 19:07, 3 June 2012 (UTC)[reply]
- On some pages would it be better to replace "no footnotes" with a "refimprove" template? I suspect that sometimes the "no footnotes" template is not always best replaced with "more footnotes", since some of the "no footnotes" templates have been on pages for years and the page may have had a lot of recent editing. I think that a human may need to assess the pages and do this task semi-automatically. Snowman (talk) 09:19, 3 June 2012 (UTC)[reply]
- Re:Konrad Lorenz on 3 June 2012. The "no footnotes" tag here is inappropriate, so changing it to "more footnotes" would be wrong. This needs changing to "refimprove", which I have just done. The bot would need to check if there is a list of non-inline references or it will be repeating mistakes. Snowman (talk) 09:38, 3 June 2012 (UTC)[reply]
- Some of the pages are marked with an "inline" template; see Michael Doohan page. The regex does not do anything about this form of the template. Incidentally, this is effectively another "no footnotes" template on a page without a list of non-inline references and needs removing or changing to "refimprove" which I have done. Snowman (talk) 09:59, 3 June 2012 (UTC)[reply]
- The regex is not what is spotting the "no footnotes", but, yes, {{inline}} and {{citations}} do need to be picked up (they're redirects to {{no footnotes}}). Allens (talk | contribs) 19:07, 3 June 2012 (UTC)[reply]
- Articles that contain {{inline}}, {{citations}}, and other redirects to {{no footnotes}} are all picked up by making a list where the source is "What transcludes page" Template:No footnotes. When processing this list of articles, Wikipedia:AutoWikiBrowser/Template redirects will convert these templates to {{no footnotes}} before checking to see if it should be changed to {{more footnotes}}. GoingBatty (talk) 21:52, 3 June 2012 (UTC)[reply]
- Bot's trial run: The bot has changed a "no footnotes" tag to a "more footnotes" tag (without changing the date) where this is inappropriate on the Istishia page. All the references on this page are in-line, so a "refimprove" tag is needed; see the bot's edit here. I have less confidence in this bot now. I think that the bot run needs to be checked and mistakes corrected. Also, where the bot's edit is appropriate the date should be updated in the template. I have corrected the bot's mistake on the Istishia page, changing the template to refimprove, and I put in today's date. Snowman (talk) 10:54, 3 June 2012 (UTC)[reply]
- See above. Allens (talk | contribs) 19:07, 3 June 2012 (UTC)[reply]
- The bot did not make a mistake on the Istishia page - it did exactly what it was programmed to do. I agree that since all the references are inline citations, that neither the "no footnotes" nor "more footnotes" tags are appropriate, and I'm glad that the bot's edit alerted you to review this article and remove them. However, the decision to add "refimprove" should be independent of the other tags. GoingBatty (talk) 22:40, 3 June 2012 (UTC)[reply]
- Unfortunately, I think that the bot did what it was programmed to do, but what it was programmed to do was to make a mistake. I would have thought that the owner of the bot would have checked all the 25 edits and picked this mistake up. What is the use of the test run, if the owner of the bot does not notice mistakes? I was checking the run of the bot's 25 edits and I came across this mistake after checking about 6 or 7 edits and I did not check the rest. I suspect that there are further mistakes in the bot's remaining edits. Snowman (talk) 17:32, 4 June 2012 (UTC)[reply]
- I disagree that this was actually a mistake. It was more of a null edit (neither good nor bad). Allens (talk | contribs) 18:10, 4 June 2012 (UTC)[reply]
- I think that putting the wrong template on a page is a mistake. I think that a bot should not make these mistakes for the sake of changing "no footnotes" to "more footnotes" appropriately by a proportion of its edits, which are relatively minor changes. Snowman (talk) 18:59, 4 June 2012 (UTC)[reply]
- Impression 1. What I have seen here does not inspire me with confidence and I suspect that the regex has not been fully tested. If the function of this bot is thought to be useful, then it could be incorporated into AWB. I suggest asking the authors of AWB. Snowman (talk) 21:21, 2 June 2012 (UTC)[reply]
- Is there some reason not to do as much as can be done safely automatically (using this bot), followed by semi-automatically (by AWB users) for anything requiring judgement (like whether free text is a citation)? Allens (talk | contribs) 21:38, 2 June 2012 (UTC)[reply]
- At this point in time, I do not have confidence that the regex caters for variations in white space. I think that considering white space is a basic aspect of writing regexes and I am puzzled why improvements to the regex were not made following analysis of skipped pages. I do not have confidence in this bot to evolve or improve should it come up against problems. Snowman (talk) 22:01, 2 June 2012 (UTC)[reply]
- One, it's not up to the bot to evolve/improve; it's up to humans to figure out how it should be changed. Two, customarily, what one is primarily concentrating on during a bot trial is it not doing anything it is not supposed to do. Why not do a first run with the old version of the regex (which has been trialed), then get approval for running with a new version after examining some missed pages? It is anticipated above that multiple runs may be necessary. Allens (talk | contribs) 22:59, 2 June 2012 (UTC)[reply]
- Of course, it is obvious that humans have to modify a bot and I am not inspired with confidence at this moment in time. I would have thought that this simple regex would have been tested adequately with AWB in semi-automated mode before it reached this page. How low is the bar here? Snowman (talk) 23:08, 2 June 2012 (UTC)[reply]
- What the regex is going to have been tested for, I suspect, is for whether it correctly excluded pages that shouldn't be changed, and whether on most pages that should be changed, it allowed them. Ever heard of "the perfect is the enemy of the good"? No regex is going to be perfect... Allens (talk | contribs) 04:29, 3 June 2012 (UTC)[reply]
- Do people who want to improve the efficiency of a regex get called an enemy here? I do not see any room for complacency here. Snowman (talk) 08:03, 3 June 2012 (UTC)[reply]
- I'm not saying you're an enemy; I apologize if it seemed that way - just that trying to get things perfect can sometimes be problematic. I certainly have no argument with trying to make it work as well as possible. Allens (talk | contribs) 19:07, 3 June 2012 (UTC)[reply]
I am selecting pages for processing by looking at all the transclusions of {{No footnotes}}, since both {{No footnotes}} and {{More footnotes}} categorize articles in Category:Articles lacking in-text citations.
Since one of the many things that the AWB general fixes do is to change {{No footnotes}} to {{More footnotes}} if an article has at least one inline citation, every other bot using AWB already has the possibility to make this change when performing another task, and I have not seen any concerns from the Wikipedia community about this. Therefore, I did not anticipate having to write a regex to ensure that the reference is not a footnote before submitting this bot request. However, I was happy to do so based on Allens' suggestion. It seems now that the issue is how to write the most effective regex for a fully automated bot to maximize the number of fixes with no false positives.
Once the AWB developers release a new SVN that includes the bug fixes they made related to this bot request, I'll have AWB preparse the 31,000+ articles containing {{No footnotes}} using general fixes to weed out the thousands of articles where the general fixes make no changes, only change whitespace/casing, or make changes that still leave "no footnotes". I'll then save the remaining list and preparse it with the regex to skip articles that do not contain references with citation templates. I'll analyze the skipped articles, tweak the regex, and do several iterations of this until I'm happy with the regex. I'll then post the regex here and request approval for a new trial run.
Of course, I'm open to another course of action if you have a better idea. Thanks! GoingBatty (talk) 05:45, 3 June 2012 (UTC)[reply]
- Based on the above discussion, I am fairly sure that the regex can be improved right now. I suggest that the current inefficient regex is not used again. Snowman (talk) 08:16, 3 June 2012 (UTC)[reply]
- ... Of course, it is not good practice to make AWB make loads of minor edits to whitespace and casing and editors have been blocked for doing this. From the above, I am not sure if you are planning to do this or not. Snowman (talk) 08:21, 3 June 2012 (UTC)[reply]
- "Pre-parse mode" skips articles - please see Wikipedia:AutoWikiBrowser/User manual#Options. GoingBatty (talk) 15:59, 3 June 2012 (UTC)[reply]
Incidentally, as far as I am aware, 25,000 is the maximum that AWB can list. I guess that there are work-arounds. Snowman (talk) 08:27, 3 June 2012 (UTC)[reply]
- Impression 2. I have looked at the use of the "no footnotes" template and found that many of them are years old. In some cases (provisional estimate 30 to 40%) the tagged page does not even have a list of references without in-line references or the page has had a lot of development since the addition of the "no footnotes" template. I think that the necessary updates can not be done by a bot that only changes "no footnotes" to "more footnotes" just because the article has at least one in-line citation. I think that each page needs individually reviewing to see what is the best template to use and which templates need removing. However, a well organised program (perhaps in Perl) could scan pages and sort some of this out and make some automatic "intelligent" choices about templates. Snowman (talk) 09:44, 3 June 2012 (UTC)[reply]
- "More footnotes" is at least an improvement over "No footnotes" when there are, in fact, footnotes present. Flagging cases in which the "No footnotes" template appears incorrect given the presence of citations but is also quite old (say, more than 2-3 years?) would probably be helpful. (I know Perl, incidentally, although I'm currently working on a program for the GOCE, a couple of copy-editing tasks, improvements to STiki, etc, not to mention RL...) Allens (talk | contribs) 19:07, 3 June 2012 (UTC)[reply]
- Reply: Just a reminder on what each template states:
- No footnotes: "This article includes a list of references, related reading or external links, but its sources remain unclear because it lacks inline citations. Please improve this article by introducing more precise citations."
- More footnotes: "This article includes a list of references, but its sources remain unclear because it has insufficient inline citations. Please help to improve this article by introducing more precise citations."
- Refimprove: "This article needs additional citations for verification. Please help improve this article by adding citations to reliable sources. Unsourced material may be challenged and removed."
- Based on the discussion above, I think the optimal process for pages with {{no footnotes}} would be:
- Step 1: If the article only contains inline citations (no other list of references, related reading, or external links), then remove {{no footnotes}} and {{more footnotes}} completely. (I did this on Infantry fighting vehicle manually.) GoingBatty (talk) 22:42, 3 June 2012 (UTC)[reply]
- I think that it is very likely that the article would need assessing to see if a refimprove tag is needed, and that you can not just remove the "no footnotes" or "more footnotes" with your simple bot. Thank you for reminding what the templates say. For the "Infantry fighting vehicle" I would agree that the no or more footnotes tag needed removing leaving the refimprove tag. Snowman (talk) 10:07, 4 June 2012 (UTC)[reply]
- Step 2: If the article contains at least one inline citation, then change {{no footnotes}} to {{more footnotes}}. (Presuming these contain a list of references, related reading, or external links, since {{no footnotes}} didn't get removed in step 1.) GoingBatty (talk) 22:31, 3 June 2012 (UTC)[reply]
- A page with "no footnotes" may not have a list of references, but it is a requirement that the "more footnotes" tag should be added only to pages with a list of references. Hence, the presence of a list of references must be confirmed as part of the process of up-dating an article from a "no references" tag to a "more references" tag. Snowman (talk) 13:26, 4 June 2012 (UTC)[reply]
- Do not replace {{no footnotes}} or {{more footnotes}} with {{refimprove}}, since these templates are not related. GoingBatty (talk) 22:31, 3 June 2012 (UTC)[reply]
- To me this seems to be absolute nonsense and your comment reduces my confidence in this proposed "bot". I have used AWB to test about 400 files from the same list that your bot is set to scan and about 12% to 15% needed the tags changing. My opinion, after personally examining all the 12% that needed tag modification in this set of 400 pages, is that a "refimprove" tag is an excellent choice and is required to update many pages. When pages have to be updated we are not restricted to only having to use "no footnotes" or "more footnotes" just because they are similar. We can choose the most suitable template available. Snowman (talk) 10:07, 4 June 2012 (UTC)[reply]
- Having said that, changing {{no footnotes}} to {{more footnotes}} on an article that does not contain a list of references, related reading, or external links doesn't make the article any worse than it started. Hopefully, it would also encourage editors who see the edit on their watchlist to manually review the article and use their best judgement on whether the template is still needed or not. GoingBatty (talk) 22:31, 3 June 2012 (UTC)[reply]
- I think that not assessing the page for a "Refimprove" tag in this situation is a mistake and I guess that you may be making excuses to justify your simple bot. I think that the correct thing to do is to assess the article to see if a "Refimprove" tag is needed. Snowman (talk) 10:07, 4 June 2012 (UTC)[reply]
- In other words, doing Step 2 and then Step 1 gets an article to the same place as doing Step 1 first. GoingBatty (talk) 22:42, 3 June 2012 (UTC)[reply]
- This would completely miss the point that many of the pages that the "bot" scans need to be updated with a refimprove tag. Snowman (talk) 10:11, 4 June 2012 (UTC)[reply]
- I made the comments above because I don't think that a bot can assess whether {{refimprove}} is necessary or not. What logic would you suggest? GoingBatty (talk) 17:10, 4 June 2012 (UTC)[reply]
- From my run using AWB of 400 pages in the category indicated, I am sure that a lot of pages need "refimprove" tags rather than "no footnotes" tags. I do not have any suggestions on how a bot could recognise a page that needs a "Refimprove" tag. From what I have seen of the bot, I think that it is not fit for purpose. Snowman (talk) 17:25, 4 June 2012 (UTC)[reply]
- Umm... you don't disagree that these pages shouldn't have "no footnotes", right? So long as at least some of them should have "more footnotes", and the change is harmless on the rest (which should have "refimprove"), what's the problem? Allens (talk | contribs) 18:10, 4 June 2012 (UTC)[reply]
- I have difficulty in interpreting what "Umm..." means, so I would be grateful if you would explain what this means. According to your philosophy on templates, I guess that you would be likely to consider that it is not harmful that "more footnotes" is not used instead of "no footnotes", so why do you think that this "bot" should ever be used? I think that it would be better to aim for more accuracy when placing templates. However, I think that it would be worthwhile considering if the templates here could be re-designed or merged. Snowman (talk) 18:34, 4 June 2012 (UTC)[reply]
- After doing a run myself using AWB on the first 400 pages in the relevant category, I estimate that 3000 to 5000 pages may need to have the "no references" updated to a more suitable reference (likely to be "more references" or "refimprove"). A bot that might edit thousands of pages needs to avoid mistakes. I think that the estimate of the pages affected given by the bot's author (given above as 100s) is far too low. However, I might be wrong, so I would be interested to see the author's calculations or reasoning on why he thinks the number of pages affected is hundreds and not thousands of pages. Snowman (talk) 11:07, 4 June 2012 (UTC)[reply]
- I made no calculations when estimating the number of pages. GoingBatty (talk) 17:10, 4 June 2012 (UTC)[reply]
- What do you think of my estimate of the number of pages affected? Will the owner of the "bot" make an estimate based on the results of the bot's run of 25 edits? Snowman (talk) 17:25, 4 June 2012 (UTC)[reply]
- Since there have been many suggestions given, I think we need to agree on a path forward for the bot before making a new estimate. GoingBatty (talk) 01:27, 5 June 2012 (UTC)[reply]
- It would be useful to estimate the number of pages with the problem. This is different to estimating the number of edits that are likely to be made by a bot. Snowman (talk) 09:09, 5 June 2012 (UTC)[reply]
- As regards the way forward, I think that new templates need to be made that do not need updating when one in-line reference is added or removed. I note that you think that discussing new templates is not relevant to the discussion here, which I find is not immediately helpful. Otherwise, if anyone wants to use the current rather awkward templates, I think that a script (suggest Perl) would be needed to scan the articles to see what is on the page (perhaps count relevant features) and then make some "intelligent" options about modifying templates and where to put them. The script could also make a log of problem pages that need to be looked at manually. You may need someone to help you with the script. I think that it may be worth you liaising with the authors of AWB to see if they are interested in building any relevant features into AWB or helping you with a script, but they may have other priorities. If you can improve the current approach, then please explain the next phase. Snowman (talk) 09:09, 5 June 2012 (UTC)[reply]
- When submitting this bot request, my goal was to run a bot that operates within the current consensus for "no footnotes" and "more footnotes". One of your many ideas is to discuss new templates that would be easier to use, which is a reasonable suggestion. Start that discussion in the right forum, and I'll be happy to join you there. I've considered submitting an AWB feature request to change the way AWB general fixes add/change/remove the "no footnotes"/"more footnotes" tags, but that request would be moot if you're successful in developing better templates. GoingBatty (talk) 02:49, 6 June 2012 (UTC)[reply]
- Would it be better to deprecate "no footnotes" and "more footnotes" and use a new template with wording that covers the two? Snowman (talk) 13:18, 4 June 2012 (UTC)[reply]
- Please let me know if you choose to make this proposal on the template talk pages or somewhere else more appropriate. GoingBatty (talk) 17:10, 4 June 2012 (UTC)[reply]
- I am hoping for some opinions on these templates. Why use these templates, when it might be better to use better worded templates that do not need to be updated when one in-line reference is added to an article? I think that it would be useful to think about better templates. Snowman (talk) 17:39, 4 June 2012 (UTC)[reply]
- This is, however, not the place for this discussion. Allens (talk | contribs) 18:10, 4 June 2012 (UTC)[reply]
- I think this might have a bearing on this bot, because this sort of a bot might have a simpler task if the templates could be worded better or perhaps merged. Snowman (talk) 19:06, 4 June 2012 (UTC)[reply]
- Snowman - please be more careful when using AWB to manually change these templates. I've fixed this, this and this. Thanks! GoingBatty (talk) 04:15, 5 June 2012 (UTC)[reply]
- Thank you. It was just the "date=" that was missing. I realised this mistake as I was running AWB, so the later AWB edits will have the date included. I went back to correct some of these errors, but there must have been more than I thought. I might have made some of these errors manually as well, when I had the wrong format in mind for the template. While you were looking at my edits, I hope that you noticed how frequently a "refimprove" tag is needed to replace a "no footnotes" tag on articles in the relevant category. Snowman (talk) 09:09, 5 June 2012 (UTC)[reply]
- While I agree that there are articles that should have the "no footnotes"/"more footnotes" tags removed, and there are articles that should have a "refimprove" tag added, I don't agree that the "refimprove" tag should replace the "no footnotes"/"more footnotes" tags, because they are used to identify two different issues. Some articles could even have both tags! GoingBatty (talk) 02:49, 6 June 2012 (UTC)[reply]
- I think that you have missed my point and we seem to be going round in circles. I have examined about 400 pages in the category that the bot is likely to edit and I have found that many of these pages have been modified significantly since the "no footnotes" tag was edited so that a manual review (or a sophisticated script) is needed to assess what tags need to be changed, kept, or added. I estimate that about 20 to 40% of the pages in the relevant category will need the "no footnotes" tag removing and a "refimprove" tag adding, but the "bot" is programmed to incorrectly tag most of these pages with a "more footnotes" tag. I found such a mistake within the bot's first 6 or 7 edits and stopped looking for more mistakes. Snowman (talk) 07:02, 6 June 2012 (UTC)[reply]
- Please note that the third edit of mine that you have listed above is where I have correctly replaced a "no footnotes" tag with a "refimprove" tag (but with date format wrong). The "bot" would have changed this to a "more footnotes" tag, because the "bot" only tests for the presence of in-line references (or more accurately certain types of in-line references) before it replaces a "no footnotes" tag with a "more footnotes" tag. Snowman (talk) 07:10, 6 June 2012 (UTC)[reply]
- I agree that we're going in circles on some points, which is why I've asked for BAG help. Until then, let's see what we can agree on. GoingBatty (talk) 02:53, 6 June 2012 (UTC)[reply]
- I believe we agree there are articles where it is appropriate to remove "no footnotes"/"more footnotes" because all of the citations are now inline references, and that I did not include this logic on my initial bot request. GoingBatty (talk) 02:53, 6 June 2012 (UTC)[reply]
- I don't think we agree on when it is appropriate for "no footnotes" to be changed to "more footnotes". For example, when you removed "no footnotes" from Konrad Lorenz in this edit, I hope you noticed the list of external links that lacks inline citations. We agree that the bot as proposed would have changed "no footnotes" to "more footnotes". Could you please explain how you determined that "more footnotes" would not be appropriate for this article? GoingBatty (talk) 02:53, 6 June 2012 (UTC)[reply]
- External links are not necessarily meant to be in-line references and may not have been used as sources. I suspect that what the "templates" may or may not say about external links is a red herring. "Cockatoo" is an example of a Featured Article with a list of external links. Snowman (talk) 10:55, 7 June 2012 (UTC)[reply]
- Although you estimated that 20-40% of the pages in the category should have a "refimprove" tag added, you do not have any suggestions on how a bot could recognise a page that needs a "refimprove" tag. Therefore, do you agree that adding "refimprove" tags is not a suitable task for this bot? Thanks! GoingBatty (talk) 22:20, 6 June 2012 (UTC)[reply]
- I think that you should decide what your bot is going to do. However, I think that it is currently not fit for the stated purpose. Further, the "no footnote" and "more footnotes" templates are rather clunky and not very usable in my opinion. In data processing, if the input is crap, then the output is very likely to be crap. Talking metaphorically, I think that this bot is rearranging crap and making extra crap. If I was to work in this area, I would re-design the templates as the first step in tidying-up here; however, I am planning some other tasks on the Wiki that I am giving a higher priority to. Snowman (talk) 10:55, 7 June 2012 (UTC)[reply]
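The scanning script suggested in this thread (look at what is on the page, decide what to do, and log problem pages for manual review) could be sketched roughly as follows. This is only an illustrative Python sketch, not AWB's logic or the bot's actual code. The function names `classify` and `scan`, the four outcome labels, the `NO_FOOTNOTES` and `REF_TAG` patterns, and the case-insensitive matching are all assumptions made for the example; the citation-template pattern is the regex quoted earlier in this discussion, with the braces escaped.

```python
import re

# Regex quoted earlier in this discussion, braces escaped; matches {{cite ...}},
# {{sfn ...}} and {{citation|...}} but not {{citation needed}}.
# Case-insensitive matching is an assumption added for this sketch.
CITATION_TEMPLATE = re.compile(r"\{\{(cite|sfn|citation\s?\|)", re.IGNORECASE)
# Hypothetical patterns for the tag itself and for any inline <ref> tag.
NO_FOOTNOTES = re.compile(r"\{\{\s*no footnotes\s*(\||\}\})", re.IGNORECASE)
REF_TAG = re.compile(r"<ref[\s>]", re.IGNORECASE)


def classify(wikitext):
    """Return 'skip', 'change', 'review' or 'keep' for one page's wikitext."""
    if not NO_FOOTNOTES.search(wikitext):
        return "skip"      # page is not tagged {{no footnotes}}
    if CITATION_TEMPLATE.search(wikitext):
        return "change"    # candidate for {{no footnotes}} -> {{more footnotes}}
    if REF_TAG.search(wikitext):
        return "review"    # <ref> without a citation template: may be an explanatory note
    return "keep"          # no inline references found; tag still applies


def scan(pages):
    """Group page titles by outcome so 'review' pages can be checked manually."""
    log = {"skip": [], "change": [], "review": [], "keep": []}
    for title, wikitext in pages.items():
        log[classify(wikitext)].append(title)
    return log
```

A page that lands in "change" is still only a candidate for the template swap; whether it also needs {{refimprove}} remains a manual judgement, which is the point of disagreement in this thread.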
{{BAG assistance needed}}
- While the comments by Snowman and Allens have been valuable and thought provoking, I would appreciate some feedback from someone in the Bot Approvals Group. GoingBatty (talk) 02:53, 6 June 2012 (UTC)[reply]
- I appreciate the concerns that have been raised in the discussion above, and I am impressed by the bot operator's willingness to address such concerns. I agree that in some of the articles that may be affected by this task, {{refimprove}} would be a more appropriate tag than {{more footnotes}}; I also agree that in all of the articles that may be affected, {{more footnotes}} would be a more appropriate tag than {{no footnotes}}. While changing {{no footnotes}} or {{more footnotes}} to {{refimprove}} would be too context-sensitive a task to be performed automatically, that is not the task that has been requested. The fact that a human contributor can perform a still more useful task (which is the case 99% of the time) should not prevent a bot from performing whatever lesser task it may perform harmlessly. This task is part of AWB's general fixes, which generally demonstrates a larger consensus that the task is valuable; this consensus is not outweighed by the concerns raised in the discussion above. There are a number of other AWB-based bots approved to perform general fixes in the course of their duties that may already be performing this task.
- In that light, I have reviewed the results of the trial and they all appear to be per specifications. I can see why some would feel the date parameter should be changed; however, I think that would result in a massive backlog for the months during which the bot would be operating with no benefit. I think it's more appropriate to retain information that indicates how long it's been since a human contributor has reviewed the article, so other contributors can address historical backlogs and decide whether a tag such as {{refimprove}} would be more appropriate. Approved. — madman 13:56, 15 June 2012 (UTC)[reply]
- The above discussion is preserved as an archive of the debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA.