Wikipedia:Reference desk/Archives/Language/2024 July 24

Language desk
Welcome to the Wikipedia Language Reference Desk Archives
The page you are currently viewing is a transcluded archive page. While you can leave answers for any questions shown below, please ask new questions on one of the current reference desk pages.


July 24

Which + <noun>...? vs. What + <noun>... ?

I'm looking for questions of the type "Which/What + <noun>...?" in which replacing one interrogative word with the other would sound non-native. 2A06:C701:7B31:C100:7D63:C50F:C3A5:9744 (talk) 12:47, 24 July 2024 (UTC)[reply]

Which/what car do you drive? Which/what kind of food do you like to eat?
"If we’re presented with an infinite number or an undetermined number, then “what” is always the choice. However, if the choices are narrowed down to a more specific selection, we can use “which.”" AlmostReadytoFly (talk) 14:06, 24 July 2024 (UTC)[reply]
So, can "WHAT" always be the default, for a non-native speaker unaware of that rule? 2A06:C701:7B31:C100:7D63:C50F:C3A5:9744 (talk) 14:17, 24 July 2024 (UTC)[reply]
Yes, the same site I linked above advises "It’s worth noting that “what” can be used in place of “which” it’s just more informal if you were to do so." AlmostReadytoFly (talk) 14:24, 24 July 2024 (UTC)[reply]
Thanks. 2A06:C701:7B31:C100:7D63:C50F:C3A5:9744 (talk) 14:34, 24 July 2024 (UTC)[reply]
If you have told someone there are only two options, chicken soup and tomato soup, it sounds IMO a bit peculiar if you then ask, "What soup shall I get for you?". At least to me, "Which soup shall I get for you?" sounds better.  --Lambiam 17:03, 24 July 2024 (UTC)[reply]
Phrases like "which one" and "which answer" often follow multiple choices too. For example, "Our sides are A, B, C and D, which one would you like?" is more natural than the same sentence with "what one", and Google's Ngram viewer appears to confirm this. [1] Quoting the above citation: "We may already be presented with a list of potential answers, and we’re just asking someone to clarify which one applies to them." Of course, "what" is definitely used when asking open-ended questions such as: "We have A, B, C and more, what would you like?", although I think it's natural to ask "Which one would you like?" even then... Sentence order matters; compare: "What dressing would you like? We have..." to "We have... Which one would you like?". Modocc (talk) 17:32, 24 July 2024 (UTC)[reply]

Are there apps or any software, that can identify the native language of a speaker currently speaking in a non-native language (e.g. English)?

2A06:C701:7B31:C100:7D63:C50F:C3A5:9744 (talk) 13:18, 24 July 2024 (UTC)[reply]

Google Translate does this. If it's not sure, such as when the words could come from multiple languages, it tells you which languages, along with the translations. Matt Deres (talk) 14:11, 24 July 2024 (UTC)[reply]
I suspect Google Translate can only do this for written text. If the speech comes in through a microphone, Google Translate can't identify the language, even when it is spoken by a native speaker.
Anyway, I was asking about identifying a person's native language while they are speaking a non-native language (e.g. English), rather than while they are speaking the native language we want to identify. 2A06:C701:7B31:C100:7D63:C50F:C3A5:9744 (talk) 14:30, 24 July 2024 (UTC)[reply]
So you're asking if there's software that can identify someone's accent, if they're speaking English with a French/Italian/German/Swedish/Danish/Finnish/Norwegian/etc/etc/etc accent? That sounds like a big ask. AlmostReadytoFly (talk) 14:44, 24 July 2024 (UTC)[reply]
Yep. Maybe AI can help? 2A06:C701:7B31:C100:7D63:C50F:C3A5:9744 (talk) 14:52, 24 July 2024 (UTC)[reply]
Even bigger than one thinks...for example, people in France or Germany do speak with at least a little hint of their local accents (as do English speakers...Mancunian, Liverpudlian etc.)...irl, you rarely will encounter these stereotypical "accent speakers" (Maurice Chevalier) which are propagated by media. So you would need an even finer mesh to be able to differentiate between Parisian, Bavarian, or Berlinish accents/sociolects. Even AI might be a little overwhelmed by that. Lectonar (talk) 15:01, 24 July 2024 (UTC)[reply]
I would say that in the last ten years or so, genuine French and German people on British TV have taken to using increasingly comic accents. There's an alleged Frenchman called "Fred Serieuse" or somesuch who sounds like Inspector Clouseau on a very bad day. DuncanHill (talk) 23:56, 24 July 2024 (UTC)[reply]
I guess AI may overcome it by listening to millions of speech samples already known (or already defined) to be spoken in a given "typical" accent, say a "British" accent defined in advance to be a "typical British" accent. Thus AI may build a list of typical features of that typical accent. The same may be done for every other typical accent defined in advance, e.g. a "typical Mid-west American" accent, and so forth. After building a list of, say, 200 accents defined in advance to be "typical", the next step for the AI is to measure how close our own accent is to each typical accent. All of that may help identify in what accent we speak English, whether our accent is British or American or French or Chinese or Swahili or whatever. HOTmag (talk) 15:13, 24 July 2024 (UTC)[reply]
Estimates of the number of UK regional accents vary between 40 and 56. I think you'd need a lot more than 200. Alansplodge (talk) 18:56, 24 July 2024 (UTC)[reply]
The number of "main" accents depends on our choice: For example we can define a list of eight "main" typical accents: Chinese, Hindi, Spanish, French, Arabic, Bengali, Portuguese, Russian. All depends on what we define in advance to be "main". Then, AI should compare our own accent with this list only, and the output should be something like: "The typical accent closest to your own accent is Chinese", even if I'm a native English speaker, because English is not on the specific selected list defined in advance. In my previous response, I suggested a list of 200 accents defined in advance to be: "[main] typical accents". HOTmag (talk) 23:52, 24 July 2024 (UTC)[reply]
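The closed-list comparison described above can be sketched as a nearest-profile lookup. This is only an illustration under loud assumptions: the three-number feature vectors, the `TYPICAL_ACCENTS` table and the function `closest_typical_accent` are all hypothetical, and the values are invented placeholders rather than measurements. Nobody in this thread proposed this exact method.

```python
import math

# Hypothetical "typical accent" profiles, each an averaged feature vector.
# The numbers are invented placeholders, not real acoustic measurements.
TYPICAL_ACCENTS = {
    "Chinese": (0.9, 0.2, 0.4),
    "Spanish": (0.3, 0.8, 0.5),
    "Russian": (0.4, 0.3, 0.9),
}


def closest_typical_accent(features):
    """Return the listed accent whose profile is nearest (Euclidean distance)."""
    return min(
        TYPICAL_ACCENTS,
        key=lambda name: math.dist(features, TYPICAL_ACCENTS[name]),
    )


# A speaker is always mapped to something on the list, whatever their
# real native language is, because only listed accents are compared.
print(closest_typical_accent((0.8, 0.3, 0.4)))
```

As the post says, the output depends entirely on which accents were put on the list in advance: a native English speaker would still be assigned one of the listed accents.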
There are local accents too, right? I mean, many people can within their own region pinpoint the exact village a speaker comes from. One village sounds significantly different from the next, just three kilometres away. PiusImpavidus (talk) 11:23, 25 July 2024 (UTC)[reply]
You can statistically analyse the speech:
  • How is each phoneme pronounced exactly? Part of this is accent, part is voice, part is random variation, so there's a limit to the amount of information you can extract from that.
  • What's prosody like? To what extent is it stress-timed, syllable-timed or mora-timed? There's a continuum in that. Does the speech appear tonal? We can assign numbers to that.
  • How are words chosen? For example, English often has pairs of synonyms, one from Germanic, one from Romance, and the speaker may have a preference for one of them.
  • Look at syntax. Not only the errors, but also the ratio of alternative constructions both grammatical in the language. For example, English allows preposition + pronoun (in which) and a pronominal adverb (wherein) as alternatives, with the former far more common. If the speaker prefers the latter, chances are he's Dutch (or a lawyer).
Also, the question is about finding the native language of the speaker, not the accent. A person from Amsterdam speaking English has a different accent than one from Antwerp speaking English, although their native languages are the same.
So, you have to record enough speech to get statistics with sufficiently small error bars and collect similar statistics for a large number of accents, each belonging to some native language; then you can compute a score for how well the speech fits a particular native language. Because you first have to find the accent and only then the language, and there are more accents than languages, you need more statistics.
It's just statistics, no intelligence needed, artificial or otherwise (some intelligence may be required for tagging the recordings). But then, AI is just a huge pile of statistics with software calculating correlations. Collecting the statistics may be a challenge and I doubt a suitable corpus is available to make this work well. PiusImpavidus (talk) 11:18, 25 July 2024 (UTC)[reply]
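The scoring step outlined above might look roughly like the following. It is a minimal sketch, assuming each accent is summarised by a mean and standard deviation per feature and that features are independent and roughly Gaussian. The feature names, the accent table and every number are hypothetical and invented for illustration; the post does not specify any particular model.

```python
import math

# Hypothetical per-accent statistics: feature -> (mean, standard deviation).
# All values are invented for illustration, not taken from a real corpus.
ACCENT_STATS = {
    "Amsterdam": {"stress_timing": (0.70, 0.10), "wherein_ratio": (0.20, 0.08)},
    "Antwerp": {"stress_timing": (0.65, 0.10), "wherein_ratio": (0.22, 0.08)},
    "Paris": {"stress_timing": (0.30, 0.10), "wherein_ratio": (0.02, 0.02)},
}

# Several accents can belong to the same native language.
ACCENT_TO_LANGUAGE = {"Amsterdam": "Dutch", "Antwerp": "Dutch", "Paris": "French"}


def log_likelihood(sample, stats):
    """Sum of independent Gaussian log-densities, one term per feature."""
    total = 0.0
    for name, (mean, std) in stats.items():
        z = (sample[name] - mean) / std
        total += -0.5 * z * z - math.log(std * math.sqrt(2 * math.pi))
    return total


def guess_native_language(sample):
    """Score every accent, pick the best fit, then map it to its language."""
    scores = {
        accent: log_likelihood(sample, stats)
        for accent, stats in ACCENT_STATS.items()
    }
    best_accent = max(scores, key=scores.get)
    return best_accent, ACCENT_TO_LANGUAGE[best_accent]


# Measurements from one (imaginary) recording of English speech.
print(guess_native_language({"stress_timing": 0.68, "wherein_ratio": 0.21}))
```

The accent is found first and the language is read off from it, which is why two speakers from Amsterdam and Antwerp can get different best-fit accents and still the same native language.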

I think the short answer is no. There is no currently functional app to detect accents. It could exist in the future, but even if an AI could make use of the millions of YouTube videos that exist, there would still be a lot of arbitrary positives or errors, since at the end of the day each human has a unique way of speaking and speech changes over time. There would be significant interest in such an app for forensic investigations by immigration authorities, who have a lot of problems identifying the 'true' geographic origins of individuals. See [2], Language analysis for the determination of origin. --Soman (talk) 10:36, 25 July 2024 (UTC)[reply]

using past perfect verbs with dates in the same sentence

Should past perfect verbs be used in a sentence that begins with the year or date of a past event? I was copyediting this article's history section and found the use of "had" repetitive. When looking into when and when not to use the past perfect, I got stuck on whether it is appropriate in a sentence like "By the end of the financial year in 1874, 1,100 pounds had been spent in construction, of an estimated total of 13,200 pounds." Thank you for your help! Decsok (talk) 18:46, 24 July 2024 (UTC)[reply]

Entirely appropriate. According to Pluperfect (aka past perfect) the tense relates 'to an action that occurred prior to an aforementioned time in the past'. You have a textbook example. Starting from a specific past moment (EoFY 1874), it then refers to one or more preceding events (x pounds had been spent). Alternatively, it could have said 'In FY 1874, x pounds were spent', thus not splitting the spending events from the end of financial year. -- Verbarson  talkedits 19:02, 24 July 2024 (UTC)[reply]
Thank you! Decsok (talk) 19:38, 24 July 2024 (UTC)[reply]
If money had been spent on the same project in earlier financial years, the alternative formulation is not equivalent. It could be that 300 pounds were spent in construction in FY 1873, and a further 800 pounds in FY 1874. Then, by the end of FY 1873, 300 pounds had been spent in construction, which, by the end of FY 1874, had grown to a whopping 1,100 pounds, but in FY 1874 only 800 pounds were spent.  --Lambiam 00:24, 25 July 2024 (UTC)[reply]
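The arithmetic behind that distinction can be checked with a short sketch. It uses the figures from the example above (300 pounds in FY 1873, 800 pounds in FY 1874); the variable names are only illustrative.

```python
from itertools import accumulate

# Per-year spending from the example: what "was spent in FY x".
spent_in_year = {1873: 300, 1874: 800}

# Running totals: what "had been spent by the end of FY x".
years = sorted(spent_in_year)
spent_by_end_of_year = dict(
    zip(years, accumulate(spent_in_year[y] for y in years))
)

for year in years:
    print(
        f"In FY {year}, {spent_in_year[year]} pounds were spent; "
        f"by the end of FY {year}, {spent_by_end_of_year[year]} pounds had been spent."
    )
```

The two wordings only describe the same amount in the first year of spending; from the second year on, the past perfect with "by the end of" refers to the running total.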