Template talk:Detect singular



Hi Hike395—this is a neat template you've created! I'm planning on following up about the discussion about infobox plurality at the VPT, and this template will be one of the ones under consideration, so I'd like it to be in good shape. Two potential improvements that come to mind:

  1. Adding "and" as a word that triggers a plural result
  2. Making it so that inputs that consist entirely of a single wikilink trigger a singular result, even if they have a comma or "list".

Thoughts? Also, I realize you missed the earlier plurality discussion. I've drafted an RfC follow-up and am seeking feedback at User_talk:Sdkb#RfC_draft_feedback, so feel free to let me know if you have thoughts about that more broadly. {{u|Sdkb}}talk 00:07, 9 December 2020 (UTC)Reply

These are both good ideas! It might be easiest to convert this to a small bit of Lua code. Unfortunately, I can't edit the template anymore (since I'm a templateeditor, not an admin). Thanks for letting me know about the RfC. — hike395 (talk) 06:15, 9 December 2020 (UTC)Reply
@Hike395, just following up about this. The latest stage in my quest to fix the (s) issue was this VPT discussion, and although the reception there was reasonably warm, there were a bunch of concerns about whether {{Detect singular}} was developed enough to be ready for the task. It's enough to make me nervous about trying to roll it out further, since those sorts of concerns can tip consensus. Do you have any interest in working with this in the sandbox to try to get it able to read all wikilinks as singular? In quasi-technical terms, I think what we want is basically to get it to remove anything within [[ ... ]] before it does the check for list things. Cheers, {{u|Sdkb}}talk 04:13, 8 January 2022 (UTC)
I can try to make a short Lua script to implement something like this. I'll also ask for a reduction in page protection (down to templateeditor). — hike395 (talk) 05:02, 8 January 2022 (UTC)
@Sdkb: Done. See Template:Detect singular/testcases. I've handled all of the corner cases correctly:
  • Items inside wikilinks don't get parsed, so each link is treated as a single object ("[[Alabama]] and [[Georgia]]" still becomes plural, because of the "and" between the links)
  • "and" as its own word triggers plural
  • A list of QIDs will trigger plural
  • Multiple asterisks in a row count as one asterisk
  • Comma only triggers plural if preceded by a letter and followed by a non-letter. This handles "May 4, 2020" and "4,563,523" both as singular
At this point, we may be able to turn on |bullets=1 by default or get rid of it. We could also get rid of |no_comma=, keeping {{force singular}} for cases like "Martin Luther King, Jr." (which is still recognized as plural outside of wikilinks). What do you think? — hike395 (talk) 06:47, 8 January 2022 (UTC)Reply
Later: handled all of the cases brought up in VPT (i.e., lists with one element, lists with items with commas). Added test cases for them. Should be OK now. — hike395 (talk) 07:16, 8 January 2022 (UTC)Reply
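For readers following along, the first two rules listed above (links are opaque, and "and" as its own word triggers plural) can be sketched roughly as follows. This is an illustration in Python, not the module's actual Lua code; the function names and the exact patterns are assumptions, and only the WIKILINK placeholder is mentioned elsewhere on this page.

```python
import re

# Assumed pattern: a simple, non-nested [[...]] span.
WIKILINK = re.compile(r"\[\[[^\[\]]*\]\]")
# "and" as its own word; word boundaries keep "Portland" from matching.
AND_WORD = re.compile(r"\band\b")


def mask_wikilinks(text: str) -> str:
    """Replace every [[...]] span with an opaque WIKILINK token."""
    return WIKILINK.sub("WIKILINK", text)


def has_and(text: str) -> bool:
    """True if 'and' appears as its own word outside any wikilink."""
    return AND_WORD.search(mask_wikilinks(text)) is not None
```

Masking first is what lets a single link such as [[Trinidad and Tobago]] stay singular while "[[Alabama]] and [[Georgia]]" is read as plural.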
Television articles will not be using {{force singular}}, so either make sure that the functionality of |bullets= and |no_comma= still works or don't remove them. Gonnym (talk) 08:41, 8 January 2022 (UTC)
@Gonnym: They work right now: I don't have to remove them. — hike395 (talk) 08:46, 8 January 2022 (UTC)Reply
Great, that was my only concern. Gonnym (talk) 08:48, 8 January 2022 (UTC)Reply
@Gonnym: I used Template:Infobox television episode/testcases to test the Lua version: to make everything work, I must set the default |bullets=yes. That's because {{plainlist}} doesn't expand into <li></li> pairs, but leaves any list items marked by asterisks. One of the VPT requests was to count the number of list items and if there's only one, call it singular. So I need to count the number of asterisk-delimited items under all cases, not just when |bullets= is set explicitly to yes.
In any event, the new Lua code does not break Template:Infobox television episode. Gonnym, if there are any other infoboxes that you're worried about, please let me know and I can test the Lua version explicitly. — hike395 (talk) 09:06, 8 January 2022 (UTC)
That was the one. Thanks for making sure it works :) Gonnym (talk) 09:08, 8 January 2022 (UTC)Reply
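The item-counting idea described above (count asterisk-delimited items, and treat a list with only one item as singular) might look something like this. It is a Python sketch rather than the module's Lua; it assumes each item starts on its own line, and the function names are made up for illustration.

```python
import re


def count_bullet_items(text: str) -> int:
    """Count list items; a run of asterisks at a line start counts once."""
    return len(re.findall(r"^\*+", text, flags=re.MULTILINE))


def bullets_suggest_plural(text: str) -> bool:
    """Only a list with two or more items suggests a plural label."""
    return count_bullet_items(text) > 1
```

Matching the whole run `\*+` is one way to make "multiple asterisks in a row count as one asterisk", so a nested item like `** Alice` is still a single item.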
@Hike395, just checked it out, and that's fantastic! (see token of appreciation on your talk) Regarding the parameters, I think if we're able to remove those, that'd be nice, as simplicity is always good. Once you're done working on the module, let's open an edit request to get the new version implemented. I just saw you commented with some work on {{Infobox settlement}}; I'll take a look at that, and begin some work on {{Infobox person}} myself. Combining those two very prominent infoboxes will hopefully give us a nice launch that others will pick up and spread to other infoboxes. Cheers, {{u|Sdkb}}talk 23:17, 8 January 2022 (UTC)Reply
Thanks for the barnstar!! hike395 (talk) 23:38, 8 January 2022 (UTC)

I'm starting to slowly convert the live versions of {{Infobox television}}, {{Infobox television episode}}, and {{Infobox film}} to use {{Pluralize from text}} and Module:Detect singular. — hike395 (talk) 15:57, 9 January 2022 (UTC)Reply

Another thought I just had: Would it be possible to make it so that links return as plural if they link to a page with "list of" in the title? E.g. [[List of honors and awards received by Barack Obama]] should return plural. {{u|Sdkb}}talk 22:39, 9 January 2022 (UTC)Reply
That's a good idea! Will implement. — hike395 (talk) 00:22, 10 January 2022 (UTC)Reply
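One possible shape for the "list of" check suggested above, sketched in Python (the module itself is Lua). The pattern and the function name are assumptions; in particular, accepting a lowercase "list of" is a guess.

```python
import re

# Assumed rule: the link target (the text right after "[[") starts with "List of".
LIST_LINK = re.compile(r"\[\[\s*[Ll]ist of [^\[\]]*\]\]")


def links_to_list(text: str) -> bool:
    """True if the text contains a wikilink whose target begins with 'List of'."""
    return LIST_LINK.search(text) is not None
```

A check like this would need to run before wikilinks are masked, since masking discards the link target.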
My work on {{Infobox person}} is here in the sandbox. I'll test out in previews on a bunch of biographies and see if I spot any errors. Let me know how it looks to you! {{u|Sdkb}}talk 23:36, 9 January 2022 (UTC)Reply
Okay, we've got a problem at Edgar Allan Poe. It uses spouse = {{Marriage|[[Virginia Eliza Clemm Poe]]|1836|1847|end=died}}, which is returning plural for some reason. Other uses of {{Marriage}} seem to be doing the same (see e.g. William Hanna). Any idea why, @Hike395? Also, we need to remove everything within a citation, so that the citations at e.g. Mark David Chapman don't lead to a plural result. {{u|Sdkb}}talk 00:17, 10 January 2022 (UTC)
Will check it out. I think removing everything in references is a good idea. — hike395 (talk) 00:21, 10 January 2022 (UTC)Reply
Problem with {{marriage}} is   Fixed. The template was getting confused by all of the markup, so I simply removed it before doing many of the checks. I'll work on your suggested changes next. — hike395 (talk) 01:10, 10 January 2022 (UTC)Reply
Also, something to note is that I had a particularly difficult time deciding what to do with |parents=, since it's often used to only list one parent when a person only has one notable parent, but it should still use Parents in that case, except for the rare instance where someone actually has only one parent. So I'm not sure we can escape (s) in that particular scenario. {{u|Sdkb}}talk 00:20, 10 January 2022 (UTC)
We may need to keep the (s), oh well. — hike395 (talk) 00:22, 10 January 2022 (UTC)Reply

I've gotten this to work with all of the area codes, demonyms, nicknames, and mottoes in Template:Infobox settlement/testcases. I made the following changes:

  1. Created a function _pluralize() that makes it easy to plug into an infobox. This is exposed as Template:Pluralize from text; its interface is documented in Template:Pluralize from text/doc.
  2. Added a parameter |ignore_links= that, when false, prevents the rewriting of all wikilinks as WIKILINK. This is required to parse the nicknames of some cities.
  3. Made the scanner look for a semicolon-separated list in addition to a comma-separated list.
  4. Made 4 changes to Template:Infobox settlement/sandbox to set up the automatic parsing of singular/plural, calling {{Pluralize from text}}.

Comments/suggestions are welcome. I'll see if I can find other interesting infoboxes to auto-pluralize. — hike395 (talk) 22:50, 8 January 2022 (UTC)Reply
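To show how a helper like _pluralize() plugs into an infobox label, here is a rough Python sketch. The detector is a deliberately crude stand-in that only knows about the semicolon-separated lists from item 3; the names looks_plural and pluralize and the argument order are hypothetical. The real interface is the one documented at Template:Pluralize from text/doc.

```python
def looks_plural(text: str) -> bool:
    """Hypothetical stand-in detector: a semicolon suggests a list.

    Caveat: this would misfire on HTML entities such as "&amp;".
    """
    return ";" in text


def pluralize(text: str, singular: str, plural: str) -> str:
    """Pick the label form that matches the detected plurality of text."""
    return plural if looks_plural(text) else singular
```

For example, `pluralize("Motor City; Motown", "Nickname", "Nicknames")` picks the plural label, and a value with no semicolon gets the singular one.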

If the template is going to support pluralization like you did, then really the next step is to deprecate the plural parameter name (drop |nicknames= and keep |nickname=), have a bot replace usages and when complete, remove support from the infobox. But that's not really related to this template, you just asked for comments :) Gonnym (talk) 23:43, 9 January 2022 (UTC)Reply
True -- I think we'd have to do that infobox-by-infobox. But happy to remove args[2] when those are all gone. — hike395 (talk) 00:22, 10 January 2022 (UTC)Reply

Proposed new features


Sdkb proposed two new features (above):

  • Strip out references from input.   Not done --- the preprocessor already replaces references with opaque objects, so it's not needed. Plural-looking text inside references doesn't trigger the existing code.
  • "List of" inside of a wikilink to trigger plural.   Done --- hopefully this doesn't cause unanticipated issues.

hike395 (talk) 02:07, 10 January 2022 (UTC)Reply

Awesome; thanks! {{u|Sdkb}}talk 05:28, 10 January 2022 (UTC)Reply

Protected edit request on 14 January 2022


Per above, please adopt the sandbox version, which converts to Lua to improve this template's functionality. Courtesy ping Hike395. {{u|Sdkb}}talk 07:03, 14 January 2022 (UTC)Reply

  Done — Martin (MSGJ · talk) 18:46, 18 January 2022 (UTC)Reply

Fails with dates


@Hike395, I was wondering if there was a way to make this work with dates? Date usages such as {{Detect singular|May 2, 2005}} and {{Detect singular|{{Start date|2005|5|2}}}} fail, I assume because of the comma in the date. Gonnym (talk) 10:18, 22 June 2021 (UTC)

@Gonnym: FWIW, now it does:
{{Pluralize from text|May 2, 2005||singular|plural}} → singular
{{Pluralize from text|{{Start date|2005|5|2}}||singular|plural}} → singular
Note we don't need to specify |no_comma=1 to get this behavior: the regular expressions are better now. — hike395 (talk) 00:04, 9 January 2022 (UTC)Reply
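The comma rule stated earlier on this page (a comma only triggers plural if preceded by a letter and followed by a non-letter) can be written as a single regular expression. Below is a Python approximation; the module uses Lua patterns, so the exact expression here is an assumption, including the choice to treat a comma at the end of the string as "followed by a non-letter".

```python
import re

# A letter, then a comma, then anything that is not a letter (or end of string).
PLURAL_COMMA = re.compile(r"[A-Za-z],(?![A-Za-z])")


def comma_suggests_plural(text: str) -> bool:
    """True if some comma follows a letter and does not precede a letter."""
    return PLURAL_COMMA.search(text) is not None
```

In "May 2, 2005" and "4,563,523" every comma follows a digit, so neither matches, while "Martin Luther King, Jr." still does, which is why {{force singular}} remains useful for that case.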

Override function


I've had this draft RfC sitting in my sandbox for a few months, with accompanying talk discussion with Jonesey95, RexxS, Hike395, and GhostInTheMachine. I've hesitated to launch it, though, since each of the possible options seems to have meaningful downsides. Option 2a got the closest, but it requires adding a ton of parameters, which would make documentation more complex, and errors in the automatic detection or ambiguous cases would require editing work.

I'm realizing that what we really need is a way to go with option 3 (using this template), but with an override function that can be triggered within uses of infoboxes that themselves use this template. For instance:

At {{Infobox person}}, we'd set it up so that instead of the label "Spouse(s)", as it currently does, it displays either "Spouse" or "Spouses" based on the result from this template. However, for the rare cases in which this template gets it wrong, someone using the infobox could do this:

{{Infobox person
|name=Camilla, Duchess of Cornwall
|spouse=[[Charles, Prince of Wales]]{{force singular}}
}}

Or this:

{{Infobox person
|name=Henry VIII
|spouse=[[Henry VIII's wives]]{{force plural}}
}}

This template would search for the presence of {{force singular}} and {{force plural}} in any string it evaluates and resolve appropriately if one of them is present. Invoking them when you use a template would be easier than Option 2a, as would be explaining them (since anyone curious would just go to their documentation page).

What do you all think? And if this override is something we want for this template, would any of you be able to help code it? {{u|Sdkb}}talk 23:55, 25 August 2021 (UTC)Reply
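As a rough illustration of the override idea, the check amounts to a plain substring search on the parameter value before any automatic detection runs. The Python sketch below searches for the literal template text, as in the examples above. That is a simplification: in practice the value would probably reach the template already expanded, so a real implementation would look for whatever output {{force singular}} and {{force plural}} emit. The function name is hypothetical.

```python
from typing import Optional


def find_override(value: str) -> Optional[str]:
    """Return 'singular' or 'plural' if a force marker is present, else None."""
    lowered = value.lower()
    if "{{force singular}}" in lowered:
        return "singular"
    if "{{force plural}}" in lowered:
        return "plural"
    return None
```

When `find_override` returns None, the caller would fall back to the automatic detection, so existing uses without a marker behave as before.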

I don't think that I agree with the method you described. The main issue is that a simple template would now need to fetch the content of a page, which is a costly function. We have pages that do that, but we should use that as a last resort. If an infobox knows about a problematic parameter, then just create an alias for it. So in this example, you have a |spouse= parameter which has a singular value that is in fact a plural. In this situation I'd add |spouses= to the infobox. Gonnym (talk) 08:03, 28 August 2021 (UTC)
And regarding the Charles, Prince of Wales issue, a solution (which I haven't tested for edge cases) could be for this template (with an optional parameter) to count links which would result in only 1 link for Charles and a singular value as a result. Gonnym (talk) 08:07, 28 August 2021 (UTC)Reply
@Gonnym, I was a little worried it'd be expensive. It wouldn't need to get/check the full content of a page, though, just the text being fed to the detect singular template. Does that help?
What you're describing with adding |spouses= is option 2 from the draft RfC; we considered that but it has some drawbacks.
Regarding the wikilink improvement, I would definitely like to see that; see above. But for the purposes here, the salient thing is that this template is never going to be 100% perfect, so we need a way to handle errors. And it needs to be something simple enough from the infobox user end that they find it intuitive to interact with. {{u|Sdkb}}talk 17:21, 28 August 2021 (UTC)Reply
For such an edge case, I'd prefer the ability to set a custom string, because maybe it's not the word "spouse(s)" people want, but "partners/hubbies" etc..
{{Infobox person
|name = Henry
|spouse = [[Henry VIII's wives]]
|spouse_text = Wives
}}
~ or in this case, a custom pluralization. Shushugah (he/him • talk) 15:14, 9 September 2021 (UTC)Reply
The spouse parameter is just the example I chose—I'm trying to find a solution widely applicable to every infobox everywhere with a (s) in a parameter name. We could discuss whether forced standardization is a bug or a feature (there's an argument against "wives"), but I think the larger issue is that this would require adding a ton of new parameters and significantly increase complexity on the template user end. It would also be very unlikely to be widely used, as it'd require editors to actively use the extra _text parameter, and if a parameter is renamed, there'd be no way to push that out to all its uses.
I'm still looking for an answer to my question above—is there a way to get only the content of the field, not the whole page, and does that help reduce the cost? {{u|Sdkb}}talk 17:20, 9 September 2021 (UTC)Reply
@Gonnym, following up, it looks like {{str find}} is what the template would use, just searching the value of the parameter (e.g. searching the string [[Charles, Prince of Wales]]{{force singular}} for {{force singular}} or {{force plural}}). How expensive would that be? {{u|Sdkb}}talk 07:52, 25 October 2021 (UTC)Reply
Thanks for pinging me. I've re-read the discussion and I made a mistake in thinking you wanted to read the content of the page instead of the parameter value only. The call shouldn't be expensive at all as it's just a string check. That said, I'm still not sure how well this will work with tools like the visual editor, so you should test that out. Also, current usages shouldn't be required to use the proposed system. The television infoboxes, for example, work completely fine as they are now without requiring additional information from users. Gonnym (talk) 18:05, 26 October 2021 (UTC)
@Gonnym, after a bunch of tinkering in the sandbox, I think I have this working. It's successful in the testcases, at any rate. I've made an implementation request below.
Regarding infoboxes like {{Infobox television episode}} that already use {{Detect singular}}, there won't have to be any changes. The only difference is that, if the infobox ever does make an error, users of it will be able to override it by adding one of the force templates to the value.
Regarding VisualEditor, I've added TemplateData, but since VE still displays wikitext within parameter values, there's nothing else that can help there. If an override is used, it'll display as value{{force singular}}, no different than any other template currently used in infobox values. {{u|Sdkb}}talk 22:19, 26 October 2021 (UTC)Reply
Great to hear you got it working. Regarding the Charles example, that could also be fixed by counting the number of links ("[["). The Henry example isn't a real one so can't really comment on a fix for that. Gonnym (talk) 22:23, 26 October 2021 (UTC)
Yes, it would be fantastic if singular wikilinks were always detected as singular! It's probably beyond my current technical ability to implement, but if you wanted to code it, it'd be very nice to have that when I go to VPT and seek to start using {{Detect singular}} more widely at infoboxes. {{u|Sdkb}}talk 22:31, 26 October 2021 (UTC)Reply

Protected edit request on 26 October 2021


In the sandbox version, I've added the capability for this template to detect overrides, which can be triggered through the {{Force singular}} or {{Force plural}} templates. If it looks good, could we implement? {{u|Sdkb}}talk 22:01, 26 October 2021 (UTC)Reply

  Done — Martin (MSGJ · talk) 15:36, 29 October 2021 (UTC)Reply

Fix for br bug in marriage?


@Frietjes: It looks like you added a bug fix to Module:Detect singular/sandbox for a case where there is a trailing line break in an argument to {{marriage}}. You wrote test cases, but you never promoted it to the main Module. I'm about to do some development in the sandbox: do you want me to incorporate your fix, or let it revert? — hike395 (talk) 01:19, 7 September 2022 (UTC)Reply

hike395, I think we should incorporate it. I can't think of any non-visual reason why there would be a br before a closing div tag. thank you. Frietjes (talk) 15:04, 7 September 2022 (UTC)Reply
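For context, a fix of this kind could be as small as dropping any line break that sits directly before a closing div tag. The Python sketch below only illustrates that idea; it is not Frietjes's sandbox code, and the pattern and function name are assumptions.

```python
import re

# One or more <br>, <br/> or <br /> tags (plus whitespace) right before </div>.
TRAILING_BR = re.compile(r"(?:<br\s*/?>\s*)+(?=</div>)", re.IGNORECASE)


def strip_trailing_br(html: str) -> str:
    """Remove line breaks that immediately precede a closing div tag."""
    return TRAILING_BR.sub("", html)
```

Line breaks elsewhere in the value are left alone, so they can still be counted as list separators.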