Wikipedia:Bots/Requests for approval/Xenobot 6.2
- The following discussion is an archived debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA. The result of the discussion was Approved.
Automatic or Manually assisted: Automatic
Programming language(s): AWB or Python
Source code available: On request
Function overview: The bot will deploy a new ISO region code parameter to {{Infobox settlement}}, reducing excessive parserFunctions and template calls to {{CountryAbbr}} and its children. It will also set |coordinates_display= where appropriate.
Links to relevant discussions (where appropriate): Wikipedia:Bot requests#coordinates type= in Template:Infobox settlement (perm), also here (perm), and here (perm)
Edit period(s): One-time run, with occasional patrols as necessary
Estimated number of pages affected: close to 200,000 (revised from 160,000)
Exclusion compliant (Y/N): Y
Already has a bot flag (Y/N): Y
Function details: (Modified from original) The bot will use the |subdivision_name= & |subdivision_name1= parameters to generate a new parameter, |coordinates_region=. This is currently done through parserFunctions and fairly expensive template calls to {{CountryAbbr}} which are asserted to have a noticeable slowing effect when editing settlement articles.
| coordinates_region = {{subst:CountryAbbr|$SUBDIVISION_NAME|$SUBDIVISION_NAME1|subst=subst:}}
The bot will also set |coordinates_display=inline,title where appropriate (see #Revision to functionality for additional details).
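The substitution described in the function details could be sketched roughly as follows in Python. This is only an illustration, not the bot's actual code (which is available on request): the helper names `get_param` and `add_coordinates_region` are hypothetical, and the sketch assumes each infobox parameter sits on its own line, which is common but not guaranteed.

```python
import re


def get_param(wikitext, name):
    """Return the raw value of an infobox parameter, or None if it is absent.

    Assumption: one parameter per line.
    """
    pattern = r"^[ \t]*\|[ \t]*" + re.escape(name) + r"[ \t]*=[ \t]*(.*)$"
    match = re.search(pattern, wikitext, flags=re.MULTILINE)
    return match.group(1).strip() if match else None


def add_coordinates_region(wikitext):
    """Add a coordinates_region line that substs CountryAbbr on save."""
    if get_param(wikitext, "coordinates_region") is not None:
        return wikitext  # parameter already present, leave the page alone
    country = get_param(wikitext, "subdivision_name")
    if country is None:
        return wikitext  # nothing to build the region code from
    region = get_param(wikitext, "subdivision_name1") or ""
    new_line = ("| coordinates_region = {{subst:CountryAbbr|"
                + country + "|" + region + "|subst=subst:}}")
    # Place the new parameter directly below the subdivision_name line.
    anchor = r"^([ \t]*\|[ \t]*subdivision_name[ \t]*=.*)$"
    return re.sub(anchor, lambda m: m.group(1) + "\n" + new_line,
                  wikitext, count=1, flags=re.MULTILINE)
```

The wiki software, not the script, expands the `subst:` call when the edit is saved, so the script cannot see whether the resulting region code is valid or blank.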
Discussion
- The bot should also work on articles that use {{Geobox}}, since its internal {{Geobox2 coor}} and {{Geobox2 coor title}} templates also use CountryAbbr and its ilk. The problem there is that multiple xxx_coordinates_type fields are possible.
- Also, perhaps the simple subst solution you propose might not be the best idea, long-term. I think it's a good idea to build the parameter string to {{coord}} based on other existing infobox parameters (e.g. use the population field for city types) and that coding within Infobox settlement and Geobox seems sound. The only problem we need to fix is the parsing of country/subdivision/etc. fields to construct the ISO region string. Therefore, I think a better solution here might be to add a new coordinates_region parameter to Infobox settlement and Geobox, so that the ISO region string is specified directly, and then use this bot to add that line of markup to all the affected geographical articles.
| coordinates_region = {{subst:CountryAbbr|$SUBDIVISION_NAME|$SUBDIVISION_NAME1|subst=subst:}}
— Andrwsc (talk · contribs) 22:31, 17 March 2010 (UTC)[reply]
- For my part this makes things worlds easier, so I'll modify the task to suit the new parameter(s). –xenotalk 23:36, 17 March 2010 (UTC)[reply]
- How long would you think the bot would need to get through all those articles? I'm wondering if we need to create some intermediate "backwards compatibility" mode for the infobox templates during the time we transition from CountryAbbr? Or perhaps just make the final infobox template changes immediately prior to the bot run and let 'er rip? — Andrwsc (talk · contribs) 23:45, 17 March 2010 (UTC)[reply]
- There are close to 200,000 infobox settlements so it will definitely take some time. =) (14 days @ 10 epm) –xenotalk 00:10, 18 March 2010 (UTC)[reply]
- The bot will also strip out commas from population totals, which (correct me if I'm wrong) aren't valid inputs to coord. –xenotalk 00:46, 18 March 2010 (UTC)[reply]
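A minimal sketch of the comma-stripping step mentioned above, in Python. The function name is hypothetical, and the parameter name `population_total` and the one-parameter-per-line layout are assumptions. Only values that consist entirely of comma-grouped digits are touched; anything carrying a reference or extra text is left for manual review.

```python
import re


def strip_population_commas(wikitext, param="population_total"):
    """Remove thousands separators from a purely numeric population value."""
    pattern = (r"^([ \t]*\|[ \t]*" + re.escape(param) + r"[ \t]*=[ \t]*)"
               r"(\d{1,3}(?:,\d{3})+)[ \t]*$")
    return re.sub(pattern,
                  lambda m: m.group(1) + m.group(2).replace(",", ""),
                  wikitext, flags=re.MULTILINE)
```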
Err, what's the benefit to this bot? Reducing the number of ParserFunctions sounds like worrying about performance. --MZMcBride (talk) 00:34, 19 March 2010 (UTC)[reply]
- It's more the CountryAbbr. If you look at the template and its children, you'll see how tough it has been for them to keep up with the various ways names are passed ({{flag}}s, etc.) Having the standard ISO code in there would futureproof the template. It might make sense to split the parameter in two for the country and province/state. –xenotalk 00:59, 19 March 2010 (UTC)[reply]
- Agree. CountryAbbr is a disaster and must be replaced. — Andrwsc (talk · contribs) 05:04, 19 March 2010 (UTC)[reply]
- However, is it really necessary to make 200,000+ edits (I think that's the correct number) that aren't doing anything to the final rendered page? Wouldn't this be better as a bot that doesn't make edits solely to substitute {{CountryAbbr}}, but makes other edits and does this substitution along with those (e.g., AWB's genfixes), so as to avoid making a bunch of edits that could easily be combined into others? I hope you understand my point; I'm concerned about a bot making a very large number of edits that do not affect the final page. — The Earwig (talk) 22:51, 19 March 2010 (UTC)[reply]
- To increase the utility of the edits, general fixes could be turned on if approval was granted for that. –xenotalk 03:03, 20 March 2010 (UTC)[reply]
- Alternatively, we could add this logic to the AWB genfixes and then the templates would be improved over time, while users made other useful edits. Rjwilmsi 22:23, 28 March 2010 (UTC)[reply]
- I considered that - someone is also working on optimizing CountryAbbr2 (though I think it's dependent on "plainifying" the names passed through subdivision_name). My main concern with adding it to general fixes and just letting it happen over time is that depending on the input provided by subdivision_name and subdivision_name1, the coordinates_region may not actually be valid input. Though it would likely be a graceful failure, there is no easy way to track when the substitution actually occurs out there 'in the cloud'. Further, an AWB general fix would then be dependent on the CountryAbbr and its child-templates not changing. Lastly, at close to 200,000 articles, the time for an AWB general fix to propagate completely without someone focused on the task probably approaches infinity.
- Some investigation into the claim that editing these articles is noticeably slower due to CountryAbbr should be conducted. I'll admit I've done none. If there is indeed a substantial lag, and the suggested improvements to CountryAbbr don't significantly reduce it, the cost-benefit analysis may favour the bot completing this task: Multiply the time saved by bypassing CountryAbbr by the number of edits to the settlement articles over a year and you have a strong case for recovering that aggregated human time. There are also other tweaks that may be done to the infoboxes while the bot runs - stripping commas from population fields, for example. –xenotalk 23:41, 28 March 2010 (UTC)[reply]
- It's not just a speed issue, it's a maintenance one. I do a lot of flag template maintenance work, and this is severely impacted by CountryAbbr. Even with the changes mentioned above, I still see many thousands of "false positives". For example, Special:WhatLinksHere/Template:Country data Canada shows Aalborg Municipality near the top of the list, and the only reason it appears is because of CountryAbbr et al. It simply must go. — Andrwsc (talk · contribs) 23:51, 30 March 2010 (UTC)[reply]
- This really should be done, although I have no preference whether it may be best to first make it part of AWB's genfixes and wait a month or two before working through the rest of them.
Worrying about performance is only a minor aspect, as far as I'm concerned. The current solution is an unmaintainable hack with weird side effects. It's patently absurd that Chojnów renders {{Country data Macedonia}}, {{Country data Mauritius}}, {{Country data Nepal}}, {{Country data People's Republic of China}} and several others as part of comparisons to figure out that {{POL}} is "PL". I'm sure it was intended to be a smart hack, but it's really just a hack. Infoboxes should get the standardized input to generate the more complex representations, not vice versa. Amalthea 13:48, 20 April 2010 (UTC)[reply]
- {{BAGAssistanceNeeded}} FYI four days ago, I left a message at VPM (perm) inviting comments or objections to this BRFA and so far there have been no objections raised. I would like to move forward with a trial, as the fixes mentioned above and recently introduced by Wikid77 have not solved all the problems (in fact, they apparently introduced some more issues) that led to this request. –xenotalk 14:51, 20 April 2010 (UTC)[reply]
- Approved for trial (100 edits). Please provide a link to the relevant contributions and/or diffs when the trial is complete. I really don't know what to say about this. I'm a little undecided on the task, but it isn't really harmful, appears to have adequate support, and no opposition has been raised recently. Let's try it out. — The Earwig (talk) 20:18, 21 April 2010 (UTC)[reply]
- {{BotTrialComplete}} 100 edits. I tried to give a good cross-section of countries, with a few U.S. and Canada sprinkled in for good measure to show the State/Prov Abbr is working. Here is a graceful failure (subdivision_name did not contain a Country, it improperly had the subdivision_name1 value). This error was my fault, I temporarily fudged up the regex (now fixed). –xenotalk 19:10, 3 May 2010 (UTC)[reply]
- This looks good!! I especially like that you are also cleaning up the flag template usage, such as {{flagicon|Spain}} [[Spain]] to {{flag|Spain}}. Are you also able to replace the hard-coded image syntax (e.g. [[Image:Flag of Brazil.svg|25px]] [[Brazil]]) with the flag template equivalent? That would also solve a current WP:Accessibility problem. — Andrwsc (talk · contribs) 19:20, 3 May 2010 (UTC)[reply]
- I can try, though I stopped collapsing flagicon+Countryname because sometimes it would have strange things like {{flagicon|ESP}} Spain and it was hard to pick out the right one of the two. But with some clever parserFunctions, I can probably overcome this and still do the work when it's the same. How common is the second thing you mention? I don't think I saw it in the trial run. –xenotalk 19:25, 3 May 2010 (UTC)[reply]
- I pulled that example directly out of CountryAbbr, and there are several countries like that in the #switch statement, so I presume it is common enough that led to it being included there. — Andrwsc (talk · contribs) 21:56, 3 May 2010 (UTC)[reply]
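The "only collapse when the two names are the same" rule discussed above could look roughly like this in Python. It is a sketch under assumptions, not the bot's actual logic: the function name is hypothetical, and only the exact `{{flagicon|X}} [[X]]` shape is handled. Mismatched pairs and unlinked country names are left untouched.

```python
import re

# Matches {{flagicon|NAME}} followed by a plain [[NAME]] link (no piped links).
FLAGICON_LINK = re.compile(
    r"\{\{flagicon\|([^{}|]+)\}\}\s*\[\[([^\[\]|]+)\]\]")


def collapse_flagicon(value):
    """Rewrite {{flagicon|X}} [[X]] as {{flag|X}} when both names agree."""
    def repl(match):
        icon = match.group(1).strip()
        link = match.group(2).strip()
        if icon == link:
            return "{{flag|" + link + "}}"
        return match.group(0)  # ambiguous pair, leave for a human
    return FLAGICON_LINK.sub(repl, value)
```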
I just stopped by to comment how happy I am to see the bot helping with the task for which I summoned it. Keep up the good work, all. --Stepheng3 (talk) 03:03, 4 May 2010 (UTC)[reply]
Revision to functionality
- FYI I'll probably add this request (perm) to do work on coordinates_display into the task and increase the utility of the edits. –xenotalk 12:50, 4 May 2010 (UTC)[reply]
- Please update the Function details appropriately, and perform a trial of this newly extended functionality Approved for trial (30 edits). Please provide a link to the relevant contributions and/or diffs when the trial is complete. Josh Parris 14:03, 4 May 2010 (UTC)[reply]
- Shall do, as soon as I understand the additional task a bit more. –xenotalk 14:04, 4 May 2010 (UTC)[reply]
- Trial complete. 31 edits (AWB's new "max edit" feature goofed ;>). –xenotalk 13:10, 5 May 2010 (UTC)[reply]
- On Los Angeles, why did the bot leave coordinates_region= blank? --Stepheng3 (talk) 16:45, 5 May 2010 (UTC)[reply]
- Probably because {{flagicon|USA}} [[United States]] is an invalid input to my fork of CountryAbbr - I'll clean that up in the production run. I think a more important question is why such a large city didn't already have title coordinates. Are we sure these are desired across-the-board? –xenotalk 16:50, 5 May 2010 (UTC)[reply]
- This example perfectly illustrates why coordinates_region and your bot are necessary. Editors have freedom to use whatever markup they like for the infobox parameter, so that meant {{CountryAbbr}} needed constant maintenance (like a game of whack-a-mole) to catch all the variations, and of course, always had some articles that fell through the cracks. MUCH better to have a distinct infobox parameter for this purpose. After the bot run, a hidden category can be used to manually check all the articles with blank coordinates_region. — Andrwsc (talk · contribs) 17:00, 5 May 2010 (UTC)[reply]
- Quite right - though, it is probably a good idea to add the maintenance category before the bot run, so that graceful failures such as the Los Angeles example can be fixed promptly. –xenotalk 17:09, 5 May 2010 (UTC)[reply]
"Are we sure these are desired across-the-board?" There seems to be consensus on the template talk page, but of course there may be editors attached to particular articles not having title coordinates -- they are free to override by setting coordinates_display=inline.
"invalid input to my fork". What if someone were now to add this variant to {{CountryAbbr}}? Since the bot added a blank coordinates_region= to the template, that would block upgrades to CountryAbbr from affecting the region code in the article's coordinates. I'm thinking we should modify {{Infobox settlement}} to treat blank coordinates_region= the same as missing coordinates_region=. Or else keep the bot from adding the blank parameter.--Stepheng3 (talk) 00:49, 7 May 2010 (UTC)[reply]
- Hopefully we won't have to maintain CountryAbbr much longer. We should consider a maintenance category for blank & missing coordinates_region=, fixing them manually as the bot progresses. — Andrwsc (talk · contribs) 02:42, 7 May 2010 (UTC)[reply]
- That seems an acceptable route. --Stepheng3 (talk) 04:05, 7 May 2010 (UTC)[reply]
- Yep, the bot has no way to know if the subst of the subdivision names will come up with a proper or blank result, so the template needs to go into a maintenance category if it is blank. –xenotalk 04:21, 7 May 2010 (UTC)[reply]
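The blank-versus-missing distinction raised in this thread can be checked after the fact, once the substitution has been saved. The Python sketch below is purely illustrative: the function name is hypothetical and it assumes one parameter per line. On-wiki, the same test would live in the template itself and feed the maintenance category.

```python
import re


def coordinates_region_status(wikitext):
    """Classify a saved page as 'missing', 'blank' or 'set'."""
    match = re.search(
        r"^[ \t]*\|[ \t]*coordinates_region[ \t]*=[ \t]*(.*)$",
        wikitext, flags=re.MULTILINE)
    if match is None:
        return "missing"  # parameter never added
    # An empty value is the "graceful failure" case from the trial.
    return "set" if match.group(1).strip() else "blank"
```

Pages reported as 'blank' or 'missing' are the ones a maintenance category would need to surface for manual fixing.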
Why does the edit summary for this edit to Herat claim that the bot is doing something with coordinates_region, when it isn't? Josh Parris 08:48, 7 May 2010 (UTC)[reply]
- Because the botop didn't exclude articles from the first trial from the second =) [1] (Here's hoping for a feature to allow AWB to dynamically change the edit summary) –xenotalk 12:29, 7 May 2010 (UTC)[reply]
Approved. I see you're constrained by your tools. Josh Parris 13:02, 7 May 2010 (UTC)[reply]
- The above discussion is preserved as an archive of the debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA.