Document that the GCL template recognises the suffix "Z"

Why not document that the GCL template recognises the suffix "Z", as it does the prefix "N"?

From the module documentation for the data table:

--[[The table here is traversed by the function that expands glossing abbreviations.
If the abbreviation isn't found in this list and it begins with an "N" then the function
will discard the "N" and search again, returning the result prefixed with "non-" (ex. "NFUT"
is not found, so it will search for "FUT" and return "non-future"). … ]]--

Perhaps add the following description, for what I assume is the current behaviour, as described on List of glossing abbreviations:

If the abbreviation isn't found in this list and it ends with a "Z" then the function
will discard the "Z" and search again, returning the result suffixed with "-izer"
after discarding any final "e" (ex. "TRZ" is not found, so it will search for "TR" and,
finding "transitive", return "transitivizer").

I'm guessing that there aren't very many uses of this behaviour; if so, maybe it'd be better tabulated explicitly, rather than handled by program code. What do you think? yoyo (talk) 04:30, 5 September 2019 (UTC)

The function doesn't do that, I'm afraid. It would be too messy: not many of the meanings given in the table can be suffixed with "-izer" (NOMZ = "nominative caseizer", etc.). Uses of -Z are too few and far between, so I don't think there's a need for some special machinery to deal with them; if the specific abbreviation is common (like NMLZ), then it can just be added as a separate entry in the table. – Uanfala (talk) 20:42, 7 September 2019 (UTC)
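
For illustration only, here is a minimal sketch of the rule proposed above and of why it gets messy. It is not code from the module: the two-entry table and the function name `expand_with_z` are hypothetical, and the expansions are the ones quoted in this thread.

```lua
-- Hypothetical two-entry sample; the real data table is much larger.
local abbreviations = {
    TR  = { expansion = "transitive" },
    NOM = { expansion = "nominative case" },
}

-- Proposed (not implemented) rule: if the abbreviation is not found and ends
-- in "Z", strip the "Z", look up the remainder, drop a final "e" from its
-- expansion and append "izer".
local function expand_with_z(abbr)
    local entry = abbreviations[abbr]
    if entry then
        return entry.expansion
    end
    if #abbr > 1 and abbr:sub(-1) == "Z" then
        local base = abbreviations[abbr:sub(1, -2)]
        if base then
            -- Parentheses keep only the first return value of gsub.
            return (base.expansion:gsub("e$", "")) .. "izer"
        end
    end
    return nil
end

print(expand_with_z("TRZ"))   -- transitivizer
print(expand_with_z("NOMZ"))  -- nominative casizer
```

The second call shows the objection: the rule is only as good as the wording of each expansion, so an explicit entry such as NMLZ is the more predictable route.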

Whether the "O" abbreviation is also exempt from negation

Assuming that one doesn't ever use the gloss "NO" meaning "non-object argument of canonical transitive verb", shouldn't "O" be exempt from negation?

From the module documentation for the data table:

--[[ … A few abbreviations are exempt from this behaviour and they're marked by the ExcludeNegation key.]]--

Specifically, only these four abbreviations show ExcludeNegation = true:


["A"] = {expansion = "agent-like argument of canonical transitive verb", wikipage = "A (glossing abbreviation)", ExcludeNegation = true},

["P"] = {expansion = "patient-like argument of canonical transitive verb", wikipage = "P (grammar)", ExcludeNegation = true},

["S"] = {expansion = "single argument of canonical intransitive verb", wikipage = "S (grammar)", ExcludeNegation = true},

["SG"] = {expansion = "singular number", wikipage = "Singular number", ExcludeNegation = true},

But if P is Patient, and O is also Patient (I'd thought it was Object!) shouldn't O also be exempt from negation? Thus:


["O"] = {expansion = "patient-like argument (object) of canonical transitive verb", wikipage = "O (grammar)", ExcludeNegation = true},

yoyo (talk) 04:30, 5 September 2019 (UTC)

Thanks for spotting this! I've fixed that now. – Uanfala (talk) 20:34, 7 September 2019 (UTC)

Whether "S" is negated

I'm confused by this entry - "NS" appears in the table:
["NS"] = {expansion = "non-subject (see oblique case)", …},

Surely this is the negation of "S", which can't thereby be excluded? yoyo (talk) 05:50, 5 September 2019 (UTC)

ExcludeNegation is only relevant if the module tries to get the meaning of an abbreviation and can't find it in the table. Let's assume the abbreviation is NPL: the module won't find it in the table, so it will strip the N and then look for PL. If it finds an entry and this entry doesn't have ExcludeNegation turned on, then it will return "non-" + whatever meaning is set for PL. The module will never have to go through this process for NS, as there's already an entry for it in the table. So yeah, there's no need to have ExcludeNegation = true in the entry for S, though it does no harm, but feel free to remove it. – Uanfala (talk) 20:34, 7 September 2019 (UTC)
Done. Following the logic given above, whenever we add any new abbreviation "N<abbr>" to the table, we thereby make the ExcludeNegation parameter irrelevant (and meaningless) for the previously existing "<abbr>", and the Interlinear Lua module should therefore ignore it when processing. So the smart thing would be always to remove the ExcludeNegation parameter from any entry when adding its explicit negation. yoyo (talk) 04:12, 28 October 2019 (UTC)
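
The lookup order described in this thread can be sketched roughly as follows. This is an illustration under stated assumptions, not the module's actual code: the function name `expand` and the four-entry table are hypothetical, with expansions taken from the entries quoted on this page.

```lua
-- Hypothetical sample of the data table.
local abbreviations = {
    FUT = { expansion = "future" },
    SG  = { expansion = "singular number", ExcludeNegation = true },
    S   = { expansion = "single argument of canonical intransitive verb" },
    NS  = { expansion = "non-subject (see oblique case)" },
}

local function expand(abbr)
    -- 1. A direct hit always wins, so an explicit "NS" entry is returned
    --    without ever consulting the entry for "S".
    local entry = abbreviations[abbr]
    if entry then
        return entry.expansion
    end
    -- 2. Only on a miss: strip a leading "N" and retry, unless the base
    --    entry opts out through ExcludeNegation.
    if #abbr > 1 and abbr:sub(1, 1) == "N" then
        local base = abbreviations[abbr:sub(2)]
        if base and not base.ExcludeNegation then
            return "non-" .. base.expansion
        end
    end
    return nil
end

print(expand("NFUT"))  -- non-future
print(expand("NS"))    -- non-subject (see oblique case)
print(expand("NSG"))   -- nil
```

In this sketch, setting ExcludeNegation on "S" would change nothing, because step 2 is never reached for "NS".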

Documentation is wrong about some features, some of which don't work predictably

At least this section of the documentation is off:

The abbreviation will be rendered as a link to the relevant article if the |glossing= parameter is set to wikilink, thus {{gcl|CLF|glossing=wikilink}} gives: CLF. The wikipedia article is specified by the third parameter: {{gcl|classifier|Chinese classifier}} gives: CLF. The presence of this third parameter (even if empty) will force the abbreviation to be displayed as a wikilink – {{gcl|CLF||}} is equivalent to {{gcl|CLF|glossing=wikilink}}.

If you do {{gcl|classifier|Chinese classifier}}, as stated there, you get classifier, not the claimed CLF, which was produced with {{gcl|CLF|classifier|Chinese classifier}}. That seems reasonable enough, though this loses the ? hover icon and the "classifier" tooltip (or it does for me; I have those pop-up article preview things on). When I manually do this: [[Chinese classifier|<span title="classifier" class="explain" style="font-variant: small-caps;">clf</span>]] then I get what I would expect this template to actually do: clf. If the link is put on the inside, then some of the functionality (the ? tooltip) stops working: <span title="classifier" class="explain" style="font-variant: small-caps;">[[Chinese classifier|clf]]</span> gives: clf

Also, the casing behavior isn't making sense:

  • CLF – just "CLF" as bare text, for comparison
  • CLF – {{gcl|CLF|glossing=wikilink}}
  • clf – {{gcl|clf|glossing=wikilink}}
  • clf – just "clf" as bare text

So, if you use all-caps input (and it matches an abbreviation on the list – see below), it is applying small-caps, but is converting the real value to lower-case (when you copy-paste). However, if you supply a lower-case (or mixed-case, or a non-list-matching upper-case) value, it does nothing to the case at the content level or at the display level, nor does it affect the size. I would think that it should behave as {{sc}} does in these regards, and be documented as to exactly what it will do. The MOS:SMALLCAPS recommendation has been {{sc}} behavior: input is forced to be lower-case (when copy-pasted) regardless of whether you feed it "clf" or "CLF" or "cLf" or whatever, and it is displayed as smallcaps: clf CLF cLf. This is because the linguistic "specs" like Leipzig do not actually require small-caps or all-caps, just some kind of display distinction, and small-caps is the most common/recommended version. If we leave it lower-case, reusers of our content can do whatever they want with it just with CSS, while if we forced it to really be upper-case, then they'd be stuck with that unless they used some kind of scripting to do search-and-replace operations. Even with CSS 3's font-variant-caps property (which is not fully supported even in some major browsers), there's no "force it to lower-case" version, only various forms of small-caps and case-mixing. One consequence of this is that if people input literally quoted values from a source that uses standard abbreviations but not in all/small caps (maybe the source used serif/sans font change or something instead as the distinction), you get unexpected results. I just saw this in an article where one gloss's input used 1SG and another used 1sg. This is clearly not desirable, since people are likely to copy-paste such strings (especially complicated ones like "father\PL-DAT.PL" or "go.out-PFV-1SG") from source material.

A quick comparison of behavior:

  • {{gcl|CLF}} displays as: CLF, pastes as: clf
  • {{gcl|ABCDEFG}} displays as: ABCDEFG, pastes as: ABCDEFG
  • {{gcl|aBcDeFg}} displays as: aBcDeFg, pastes as: aBcDeFg
  • {{sc|aBcDeFg}} displays as: ABCDEFG, pastes as: abcdefg – converts input to LC, displays all as SC
  • {{sc1|aBcDeFg}} displays as: aBcDeFg, pastes as: aBcDeFg – no case conversion, displays LC as SC and UC as regular UC
  • {{sc2|aBcDeFg}} displays as: aBcDeFg, pastes as: aBcDeFg – no case conversion, displays UC as semi-small SC and LC as ultra-small SC (probably too small to comply with MOS:FONTSIZE, actually).

{{gcl}}'s present behavior appears to match that of {{sc}} but only if the input is all UC and it matches something on the list. If it was LC, or mixed case, or UC but not a match, you get no display change and you get no case conversion at the paste level. That's just weird and not what anyone would predict/expect. I would think the desired behavior is that of {{sc}}, as noted above – even if there's no list match; that just means there's no auto-generated tooltip and/or link available, and that they would have to be specified manually. When I do exactly that with {{gcl|ABCDEFG|rotisserie|assassin}}, suddenly the display changes, the case converts, and the link works (though tooltip does not): ABCDEFG (pastes as abcdefg). When I test with {{gcl|ABCDEFG|assassin}}, I get identical results ABCDEFG. When I test with {{gcl|ABCDEFG|rotisserie}}, I get ABCDEFG, so display change, case conversion, and tooltip work (link was submitted as void). With {{gcl|ABCDEFG|rotisserie}}, I get ABCDEFG, which is the same. I'm not sure why the doc says anything about using {{gcl|ABCDEFG|rotisserie}} format, unless it's to suppress something that happens when there is a list match.

So:

  1. The tooltip is broken when the link feature is on; I think this is fixable by wrapping the span inside the link instead of vice versa.
  2. Custom input does not work properly unless there is at least one of a tooltip or a link, so "clean" use after the first occurrence is not going to work; when I try supplying both optional parameters as void, the basic features stop working again: {{gcl|ABCDEFG}} gives: ABCDEFG. Yet this exact format is one that's documented to be successful. Turns out it only works with abbreviations on the pre-defined list.

Other attempts:

  • {{gcl|ABCDEFG|glossing=no link}} gives: ABCDEFG
  • {{gcl|ABCDEFG|glossing=no link}} gives: ABCDEFG
  • {{gcl|ABCDEFG|glossing=no link|3=}} gives: ABCDEFG
  • {{gcl|ABCDEFG|glossing=no link|2=}} gives: ABCDEFG
  • {{gcl|ABCDEFG|glossing=no abbr}} gives: ABCDEFG

So, finally the last one works, but this took much longer to figure out than any user would normally invest.

More weirdness... The third parameter is supposed to be the page link target, but look at this:

  • {{gcl|ABCDEFG|glossing=no abbr|2=rotisserie}} gives: ABCDEFG
  • {{gcl|ABCDEFG|glossing=no abbr|3=assassin}} gives: ABCDEFG

In the second, what should have been a link target in |3= (or apparently just discarded due to |glossing=no abbr) ends up being treated as a replacement for |2=!

It's unclear what the intent is or even how to document this (especially since some of this is probably bugs, not intentional behavior).  — SMcCandlish ¢ 😼  19:04, 8 February 2021 (UTC)
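
The nesting fix suggested in point 1 of the summary above might look something like this in Lua. It is only a sketch: `gloss_markup` is a hypothetical helper, not a function of the module, and it does no escaping of its arguments. The markup it produces is the hand-written version quoted earlier in this comment.

```lua
-- Builds the wikitext for one abbreviation. The span carrying the tooltip
-- goes INSIDE the link, so that both the link and the title attribute apply.
local function gloss_markup(abbr, meaning, target)
    local span = string.format(
        '<span title="%s" class="explain" style="font-variant: small-caps;">%s</span>',
        meaning, abbr:lower())
    if target then
        return string.format("[[%s|%s]]", target, span)
    end
    -- Without a link target, return the bare span.
    return span
end

print(gloss_markup("CLF", "classifier", "Chinese classifier"))
```

A real implementation would also need to escape quotation marks in the tooltip text and pipes in the link target.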

Thank you, SMcCandlish, this sort of detailed feedback is precisely what has been needed here.
  • The bit in the documentation that reads {{gcl|classifier|Chinese classifier}} is an error. It should instead be {{gcl|CLF|classifier|Chinese classifier}}. I'm going to correct that.
  • Yeah, the tooltips don't work if there's a link. I haven't thought much of it before – the link text sort of makes the tooltip redundant most of the time, and if both are displayed, then there will be something of a visual clash between the link and the abbr formatting (for example, in the way the mouse cursor changes). Still, it is better if the tooltip is displayed, and your solution seems to do the trick. I'm going to implement it when I get around to working on the next version of the module.
  • The small caps behaviour is intentional. The template serves two functions: 1) flagging a string of text as a glossing abbreviation, and then 2) defining, and displaying, this abbreviation's meaning. The abbreviation may be formatted in different ways, and that's up to the editor who's used it; it's perfectly reasonable, for example, to use small caps for most abbreviations, but still have the likes of 1sg and 2pl in lower case. The template meddles with the capitalisation only if the abbreviation supplied consists entirely of upper-case letters, in which case it's assumed that small caps were intended instead. That's only because this formatting is so commonly used, and the goal is just to save editors the need to use nested {{sc}} every time. I don't think {{gcl}} should override editors' stylistic choices and force output that always matches {{sc}}.
    I realise this leads to inconsistencies when copy-pasting the output: {{gcl|ACC}} will paste as lower-case acc, but {{gcl|Acc}} will paste as Acc. Should we try to make these consistent? I don't know if this is relevant here, but this transformation to lower case is not universal: there are good publications, like Glossa that don't do it with their glosses.
  • Uses like {{gcl|ABCDEFG}} are not valid: the meaning of the abbreviation should be known: either by being manually defined inside the template, or by being available in the pre-defined list. If it's not known, then the template will correctly output an error – here this is done by displaying the abbreviation in normal size and with the "error" class (which should make its text colour red), and with an error message in the tooltip. But yeah, this needs to be documented.
  • {{gcl|ABCDEFG|rotisserie}} doesn't display as a link, because it can't find anything to link to – the third parameter is empty, and there's no entry for "ABCDEFG" in the list of recognised abbreviations. I'm not sure what's the best thing to do in such cases, but the template can be tweaked to assume the link is the same as the label, so that {{gcl|ABCDEFG|rotisserie}} gets interpreted the same as {{gcl|ABCDEFG|rotisserie|rotisserie}}: ABCDEFG. What do you think?
  • |glossing=no abbr currently overrides other style choices, including formatting as a link. That's probably not desirable – I can imagine a user might go for that option if they want a plain link, without the abbr formatting. In the next version of the module, |glossing=no abbr should just strip the abbr element, while leaving everything else intact.
I should get to work on the bugs at some point within the next couple of months. I'm planning to do that as part of the switch to TemplateStyles – that should also make it a bit easier for other people to tinker with the template. As for the unexpected features – I'm open to persuasion. The template is only used on about 40 or so articles, and there's nothing in its current behaviour that is set in stone. – Uanfala (talk) 02:34, 10 February 2021 (UTC)
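
As a reading aid, the capitalisation and error rules described in the bullets above can be sketched like this. The function name `describe`, the returned field names, the error text and the one-entry table are all hypothetical. Treating a string such as "1SG" as "all upper case" (no lower-case letters, at least one upper-case letter) is also an assumption of the sketch, not something confirmed here.

```lua
-- Hypothetical one-entry stand-in for the pre-defined list.
local known = { CLF = "classifier" }

local function describe(abbr, meaning)
    meaning = meaning or known[abbr]
    if not meaning then
        -- Unknown meaning: normal size, "error" class, message in the tooltip.
        return { text = abbr, class = "error", smallcaps = false,
                 title = "unrecognised abbreviation" }
    end
    -- Small caps are applied only to all-upper-case input; anything else is
    -- taken to be the editor's own styling. string.upper is ASCII-only in
    -- plain Lua, so non-ASCII letters would need different handling.
    local all_caps = abbr:find("%u") ~= nil and abbr == abbr:upper()
    return { text = abbr, smallcaps = all_caps, title = meaning }
end

print(describe("CLF").smallcaps)                -- true
print(describe("Clf", "classifier").smallcaps)  -- false
print(describe("ABCDEFG").class)                -- error
```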
In the same order (but numbering for later clarity):
  1. Glad that's just a typo.
  2. Link popups on mouseover/hover (hovercards, I think it's called?) is an optional feature people have to turn on in Preferences, and it doesn't apply to non-logged-in users at all, so it really shouldn't be a factor. The fix for it is to just reverse the element order so the link surrounds the span with title, instead of being inside. Then both features work (you can even decide which you want to pop up by pointing carefully).
  3. "The small caps behaviour is intentional." Yes, I know (and I wrote most of the MoS material in question, in consultation with peeps from the linguistics project. :-)
    1. The main problem I'm reporting is that the sc behavior fails when the code someone uses as input isn't on your pre-configured list. So, the template behaves in a wildly inconsistent manner, both as to codes the list isn't aware of yet and (much worse) even differently-cased input that would match something on the list.
    2. Yes, they should be consistent. The MoS material needs to be updated to account for this template (after which "The template is only used on about 40 or so articles" won't be true for long). People shouldn't get weirdly different results if they switch from {{sc}} to {{gcl}}, which they probably should (eventually, anyway).
    3. It doesn't have anything to do with overriding the user agent; a user can still tell their browser (or MW, via Special:MyPage/Common.css) to apply whatever CSS they like. One of the reasons to normalize the input to lowercase is that even CSS 3 doesn't have a font-variant[-caps] option that translates to "all lower-case", only various forms of all-caps, small-caps, and mixed. So, they can do whatever they want with the display if we feed them LC stuff at the code level, but if we convert to UC (or permit editor insertion of UC that isn't changed to LC), this takes away the ability of the user to change it, without resorting to something like complicated JavaScript case-conversion tricks.
    4. Nothing about this stuff is universal, alas. The only kinda-sorta requirement is that these codes be distinct from other material in the gloss. That's most often done (and recommended to be done) as SC, though all caps is sometimes also used, and someone somewhere might instead be doing a font-family shift or whatever. What online systems do with the underlying data is going to vary widely. Most are simply not going to think about this stuff very hard. WP, however, is keen on WP:REUSE, and also (MOS:CAPS) isn't going to be keen on making the actual underlying data hardcoded as all-caps if it's a word or a non-acronym/non-initial abbreviation (and most of these aren't). So, for a general conflux of reasons, the current behavior of {{sc}} is the best option to emulate in {{gcl}}.
    5. Also, there appears to be no use-case at all for this to output something like "contemplative" (i.e. a whole word not an abbreviation) and not have it also be in the same smallcaps style. The Leipzig and other quasi-specs that call for such labels to be differentiated (usually small-caps) don't make an exception for ones that are not abbreviations. And I have in fact seen interlinear glosses that do not use abbreviations, mostly in pedagogical rather than linguistics material. So, we cannot actually depend on people inputting abbreviated forms, even if the template will effectively force them to do something like provide either a link or a tooltip or both since the template won't be able to auto-perform that action in such cases.
  4. Errors:
    1. "Uses like {{gcl|ABCDEFG}} are not valid". Okay, then that needs to be more explicit in the documentation. It would probably help a whole lot to have illustrative examples of various use cases.
    2. Red error messaging: Good idea. I've implemented that sort of thing in several templates myself, and it seems to help.
    3. "so that {{gcl|ABCDEFG|rotisserie}} gets interpreted the same as {{gcl|ABCDEFG|rotisserie|rotisserie}}: ABCDEFG. What do you think?" Possibly, but see below for alternative idea.
  5. What I was testing with "{{gcl|ABCDEFG|rotisserie}}" and "{{gcl|ABCDEFG|rotisserie}}" and "{{gcl|ABCDEFG|assassin}}" and "{{gcl|ABCDEFG}}" and "{{gcl|ABCDEFG}}" and so on was mostly to see if A) any of these could result in consistent smallcaps output (i.e., I was looking for a bug workaround), but also B) if it was possible to use a "bare" version without any tooltip or link, e.g. for later use after it's already been tooltipped and linked one morpheme earlier. Editors are going to want a means of avoiding the "sea of blue" effect.
    1. This latter might ultimately be at odds with use of this in {{interlinear}}. The simplest solution, if we want it to continue doing these things by default at every instance, would be to treat present-but-empty parameters as intentional suppression: {{gcl|ABCDEFG}}, {{gcl|ABCDEFG|rotisserie}}, {{gcl|ABCDEFG|assassin}}. That might conceivably even work inside {{interlinear}}, if it will recognize something like 1PL (or 1pl) automatically but ignore {{gcl|1pl}}.
  6. "|glossing=no abbr should just strip the abbr element": Yes, I'd thought that "operator overloading" that to do more than one thing was odd and potentially confusing, though I didn't get into it.
Anyway, I'm impressed (especially with the broader functionality of {{interlinear}}). Makes me wish I knew Lua better (I think I've only done one module so far). If I've come across as cranky about any of this, that's not the intent. This is a rather low-nuance medium (and I'm also in a hurry; I have a bunch of irons in the fire, so I'm not really poring over what I'm writing). If you think it'll take a few months to do much with it, I should probably hold off working mention of it into MOS:SMALLCAPS (which I was actually in the process of updating).
 — SMcCandlish ¢ 😼  05:21, 10 February 2021 (UTC)

3. With respect to the small-caps behaviour, we seem to be proceeding from widely diverging sets of assumptions. Here's what I see from where I'm standing:

  • The current behaviour of {{sc}} and {{gcl}} is different even in the prototypical case. Compare for example {{gcl|ACC}}: ACC with {{sc|ACC}}: ACC. {{sc}} converts the input string to upper case, and then puts it in a span with style set to font-variant: small-caps; text-transform: lowercase. {{Gcl}} applies the same styles as {{sc}}, but it uses an abbr element, and leaves the input string intact. Neither of these convert the input string to lower-case, which I understand is the behaviour you're recommending. As far as I can see, what you get in your clipboard when you copy something like ACC will depend on your browser: the version of Firefox I'm using seems to copy the underlying string (ACC in both examples), while Chrome applies the text-transform property to the copied text (so above you'll get acc).
  • Whatever other advantages there may be to having the abbreviation in lower case at the content level, this doesn't seem to affect reusability. This is because of the same CSS property text-transform, which is widely supported. Even if our glossing abbreviations are in upper case, a re-user of our content who prefers lower-case labels can simply apply text-transform: lowercase.
  • Please correct me if I'm mistaken, but the idea that {{gcl}} should always match {{sc}} seems to proceed from the assumption that glossing abbreviations on Wikipedia should always be displayed in small caps. Well, such a preference is already woven into {{interlinear}}, which automatically applies small caps to all abbreviations. Still, editors may choose a different style, so the template should be able to allow them to. If an editor has supplied a mixed-case string, then it's a mixed-case string that should be displayed. This is a valid stylistic choice. Actually, I'm more inclined to think that if any change in this aspect of {{gcl}} is desirable, it is in the other direction: preserving even all-caps input, with small caps having to be explicitly marked (for example, by nesting {{sc}}). This would be the option that maximises flexibility, though at the expense of convenience.

Now, 3.5 makes an interesting point about non-abbreviated labels (e.g. "contemplative", spelt out in full). For glosses inside {{interlinear}}, this could be done across the board with |glossing=no abbr. For an isolated instance of such a label, I would imagine that simply writing {{sc|CONTEMPLATIVE}} should do the job. If it's necessary to use {{gcl}}, then this could still be done, albeit a bit awkwardly, with {{gcl|CONTEMPLATIVE|contemplative|glossing=no abbr}}. Should the template be changed to make that easier?

5. If an instance of {{interlinear}} has been set up to turn all glossing abbreviations into links, and it's necessary to turn off this automatic linking for a single abbreviation, then it could simply be wrapped in {{gcl}}: something like {{gcl|ACC}} will always produce a string that's not a link, even within an interlinear block where linking is the default. I don't see any use for an option that intentionally suppresses linking, because that's already the default. However, you may be raising a case for an easier suppression of the whole abbr formatting (tooltip + dotted underline). I don't know how necessary this is – the formatting isn't as distracting as a sea of blue links is prone to be, and it may be argued that visual consistency is important and that all abbreviations should look the same, even if repeated. Still, I can imagine a setup where {{gcl|ACC}} outputs the bare ACC (currently only achievable using {{gcl|ACC|glossing=no abbr}}), with the currently default abbr-formatted ACC requiring an explicitly set, if empty, second parameter: {{gcl|ACC}}. That would probably be viable.

Don't worry, you don't sound cranky. Your feedback is helpful. If you're interested in tinkering with the module, you're welcome to. Though the real obstacle here would likely be the low readability of the code (I don't have a background in either web design or programming; the whole project got going only because I was writing an article and I got annoyed at the massive hassle that it was to format the interlinearised text that I wanted). – Uanfala (talk) 23:56, 11 February 2021 (UTC)
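
Both proposals in point 5 depend on telling an omitted parameter apart from one that is present but empty. A minimal sketch of that distinction in plain Lua follows. `param_state` is a hypothetical helper, and the assumption that the template's arguments arrive as a table in which an omitted parameter is nil and an empty one is "" should be checked against the Scribunto documentation.

```lua
-- Classifies one positional parameter of a template call.
-- args stands in for the argument table the module receives.
local function param_state(args, index)
    local value = args[index]
    if value == nil then
        return "absent"   -- e.g. no second parameter written at all
    elseif value == "" then
        return "empty"    -- written but left blank
    end
    return "value"        -- e.g. {{gcl|ABCDEFG|rotisserie}}
end

print(param_state({ "ABCDEFG" }, 2))                -- absent
print(param_state({ "ABCDEFG", "" }, 2))            -- empty
print(param_state({ "ABCDEFG", "rotisserie" }, 2))  -- value
```

Whichever meaning "empty" ends up with (suppression, or explicit abbr formatting), the module would branch on these three states.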