Wikipedia:Reference desk/Archives/Language/2009 January 30

Language desk
Welcome to the Wikipedia Language Reference Desk Archives
The page you are currently viewing is an archive page. While you can leave answers for any questions shown below, please ask new questions on one of the current reference desk pages.


January 30

Guessing author's country of origin from text

Suppose I am trying to guess someone’s native language or country of origin based on a blog entry or something they wrote in English without using the content. What kind of clues could you gather from incorrect grammar, spelling, word usage? Can you give specific examples of grammatical mistakes you may have seen and what country that mistake may lead you to believe the author is from?


To take a first stab at answering my own question, I'd say first look for American English versus British English. Look at things like lift/elevator or color/colour, since this will tell you that they might be not only from England but from any number of former British colonies as well. Aside from this I'm at a loss. I assume people may sometimes incorrectly use English words in the grammatical form of their native language. So would you have any guesses as to who might not use "to", as in a sentence like "I want go the park"? Or what countries a person might be from if they always put their subject after the verb? I don't have any specific reason for asking this; I am just interested in seeing what clues people might hypothetically see in text that would give them good insight into guessing the author's country of origin. Anythingapplied (talk) 19:03, 30 January 2009 (UTC)[reply]

When a person at the very end of his sentence the verb places, he's probably German. DuncanHill (talk) 19:07, 30 January 2009 (UTC)[reply]
It's difficult to tell really, at least to be able to figure out the exact country. There are many languages that, for example, place verbs at the end of the sentence or after the object. In addition, the person could be multi-lingual, and even if they are mistakenly translating directly from one language to English, that doesn't mean it's his/her native language. The suggestion about American vs British dialectal differences is a good one; even that can backfire though because someone who speaks English as a second or third language may learn a mix of American/British/etc. colloquialisms.
This doesn't mean that you cannot gather clues from the grammar, syntax, writing style, etc. But those clues will usually only allow you to narrow down to a region, as opposed to a specific country. — Jclu (talk) 20:16, 30 January 2009 (UTC)[reply]

The absence of a definite or indefinite article may suggest that the writer is Russian, as his language manages very well without either. I know offhand of no other language which is like that, except Latin. Irish people routinely say "What happened him?" when they mean "What happened to him?" Pavel (talk) 20:21, 30 January 2009 (UTC)[reply]

[Map from the article Article (grammar). Its legend distinguishes languages with indefinite and prefixed definite articles, only prefixed definite articles, indefinite and postfixed definite articles, only postfixed definite articles, and no articles.]
-- Wavelength (talk) 21:14, 30 January 2009 (UTC)[reply]
(edit conflict) One language I know a bit about is Russian, and one of the big giveaways with Russians writing in English is their use of the definite article (the) - and, to a lesser extent, their use of the indefinite article (a/an). Russian has no use for any of them, so they tend not to have a good feel for when they're necessary and when they're contra-indicated in English. So, we see sentences like "In 1923, Pinkovsky was appointed to Politburo of Soviet Union", or "I am writing this in the English". -- JackofOz (talk) 20:28, 30 January 2009 (UTC)[reply]
(EC twice in a row :-) Other clues for German are an overabundance of passive constructions; conditional and explanatory phrases at the beginning of long sentences; using "of" instead of possessive 's; past tense instead of present perfect progressive; using statements as questions. Terms like "in general", "as a rule" and descriptions of what someone "has to" do and "is obligated/obliged" to do in business texts. - On the other hand it might just have been written by a native translator after translating a 70-page business report ;-).
Four other markers of a German accent in writing:
  • Setting off clauses with a comma even if restrictive ("Is German the only language, that does this?")
  • Writing ordinal numbers with a "." instead of th/st/nd/rd. Scandinavians also do this, though. ("We are in the 21. century")
  • When an acronym is used adjectivally, joining it to the noun with a hyphen ("German sentences may be written in the SOV-style")
  • Using "resp." to mean "or, respectively, ..." or just "or". ("Press the button marked 2 (resp. 3) [to go to the 2nd or 3rd floor]")
--Anonymous, 03:52 UTC, January 31, 2009.


Polish (and probably other Eastern European languages): missing, misplaced or wrong articles and pronouns. 76.97.245.5 (talk) 20:36, 30 January 2009 (UTC)[reply]
'Irish people routinely say "What happened him?" when they mean "What happened to him?"' -- Incorrect. When we say "What happened him?", we mean "What happened him?" ;p jnestorius(talk) 21:32, 30 January 2009 (UTC)[reply]
Another way is to look for false friends. If the use of a word seems strange, it might be because its foreign cousin has a different meaning. You could also look for expressions and figures of speech that don't exist in English but do exist in another language. Regardless of what you go by, you need knowledge of the possible 'source' languages. In short, no big shortcuts. --130.237.179.182 (talk) 22:54, 30 January 2009 (UTC)[reply]

Having consulted the Oxford English Dictionary and Webster's, I see that neither concedes the use of "happen" as a transitive verb. With the demise of the Celtic Tiger, our young people may once again have to go abroad to earn a living, and the fact that we speak English is one of our greatest assets. It would be better for us if we speak it properly. Some Hiberno-English usages are quirky and appealing, others are just annoying and wrong; and leaving out prepositions falls into that category. Is mise le meas. Pavel (talk) 00:35, 31 January 2009 (UTC)[reply]

Adding prepositions in expressions where English doesn't use one, can also be a giveaway. "On beforehand" instead of just "beforehand", is a mistake that a Scandinavian writer easily could make, because we have the same word (forhånd, förhand), but the expression requires the preposition "på" (on). --NorwegianBlue talk 13:13, 31 January 2009 (UTC)[reply]
I've actually seen "beeble" for "people" from the Indian Subcontinent. Look for "b" for "p" and "g" for "k". It is tempting to try to guess the country of a writer. I run into that a lot here copyediting random articles, but I can usually tell from the subject matter, so I'm coming in late, so to speak. --Milkbreath (talk) 13:26, 31 January 2009 (UTC)[reply]
In my work, I sometimes have to proofread works that have been translated into English from German by Indians. They don't make spelling mistakes like "beeble", but I have to watch out for progressives in the wrong place ("At our company, we are putting great emphasis on quality and safety") and for the word "thrice" (which just sounds quaint outside South Asia). —Angr 13:33, 31 January 2009 (UTC)[reply]

Thrice is lovely. One tends to wonder why "twice" survived in everyday usage in England and "thrice" fell by the wayside. You also wonder, if you've never been there, what other treasures, grammatical and otherwise, survive in the Indian subcontinent, apart from Royal Enfield motorcycles. They seem to have valued the gift of English, because it served as a unifying factor in a country with countless languages and dialects, and is still an official language of India; although I don't doubt there are those who would wish it otherwise. In any case, they've probably taken better care of their English than people at home, who mangle it mercilessly. Pavel (talk) 14:08, 31 January 2009 (UTC)[reply]

I agree. Indian English can be just adorable. The Global Indus Technovators Awards "recognize and felicitate distinguished young innovators of South Asian origin". How charmingly formal can you get? And that's another clue—somewhat outmoded, to-our-ears-stilted diction. --Milkbreath (talk) 14:26, 31 January 2009 (UTC)[reply]
Well yes, but technical documentation isn't supposed to be "just adorable". That's why even though we outsource some of our translation to India, there's still work for us Brits and Americans, editing the writing so it doesn't make you go, "awww...". —Angr 15:26, 31 January 2009 (UTC)[reply]
All writing which people pay to read should be adorable. DuncanHill (talk) 15:56, 31 January 2009 (UTC)[reply]
Up to a point, yes. When the writing itself becomes the focus of attention - for whatever reason - where's the story gone? I'm sure some people write things to compete for turgid prose prizes (It was a dark and stormy night, and so on), just as some advert makers make ads just to compete for the Best European Advert competition, rather than necessarily to sell products, but those are special cases. The usual position is that the writing should always be the medium, rather than the message itself (pace McLuhan). When you've read a story, or even while you're just getting into it, you might think "Wow! This is brilliantly written", but if it's so brilliant that you pause to reflect on wonderful turns of phrase and exceptional word choices, to the detriment of the narrative, then the writer has done too well, which is just as bad as not doing well enough. In some cases, it's tantamount to the writer bragging about how much he/she knows about their language, and that's not what writing is about. -- JackofOz (talk) 20:56, 31 January 2009 (UTC)[reply]
The same holds for the typeface. It should be attractive and pleasing without calling any attention to itself, so that the reader focuses on the content rather than the appearance of the letters. (That doesn't work on me, though; I'm so fascinated by typefaces that I always catch myself staring at the letters instead of reading the text when I read a book.) —Angr 21:16, 31 January 2009 (UTC)[reply]
We have an offshore office in Mumbai and I have to call them sometimes. An awful lot of early conversations went something like this:

"When will X be finished?" -"It is being finished now." "Right, so when can I expect it?" -"As I said, it is being finished now." "What does that mean? A few minutes? An hour? Tomorrow?" -"As I said several times already, it is being finished right now. It is already being finished."

*Penny drops*

"When you say 'being finished' you mean that it exists in the state of finishedness?" -"Yes, that is correct. Isn't that what I said?"

btw I'm British, and "thrice" wouldn't make me bat an eyelid. I didn't even realise anyone considered it archaic. —Preceding unsigned comment added by 86.173.22.48 (talk) 20:04, 1 February 2009 (UTC)[reply]

It's certainly not something you hear every day. The last time I heard it used was probably by Frankie Howerd in Up Pompeii! His character Lurcio had a habit of saying 'Nay, nay, and thrice nay!' You overseas chaps missed a treat if you missed that. Pavel (talk) 20:29, 1 February 2009 (UTC)[reply]

One of the FAQs on AskOxford is "What comes after once, twice, thrice?" DuncanHill (talk) 04:40, 2 February 2009 (UTC)[reply]
Wasn't the Up Pompeii! line usually "Woe! Woe! and thrice Woe!", from Senna the soothsayer? (Often interrupted by Lurcio saying "All right, dear, we heard you the first time.") AndrewWTaylor (talk) 15:22, 3 February 2009 (UTC)[reply]


Maybe, when the word "thrice" was originally coined, life was a lot simpler. People counted to three, and any more than three was simply "a lot." Pavel (talk) 14:11, 2 February 2009 (UTC)[reply]

You seem to have a lot of examples there. I just thought I'd offer the info that there's a web app somewhere out there that has been trained on blogs to distinguish between male and female writers. Perhaps a similar app exists, or could exist, to do the same with native language. Aaadddaaammm (talk) 14:28, 4 February 2009 (UTC)[reply]
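
A few of the surface cues mentioned in this thread are regular enough to check mechanically. As a rough sketch only, and not a trained classifier like the blog app mentioned above, the Python below flags a handful of them with regular expressions. The cue names, the `CUES` table and the `find_cues` function are hypothetical and made up for illustration; the patterns are guesses based on the examples quoted in the thread and will produce false positives (a British writer may well use "thrice", for instance).

```python
import re

# Hypothetical cue table: each name maps to a pattern based on an example
# quoted in this thread. These are illustrative guesses, not validated rules.
CUES = {
    # "We are in the 21. century" (German, Scandinavian)
    "ordinal_with_period": re.compile(r"\b\d{1,2}\.\s+[a-z]"),
    # "Press the button marked 2 (resp. 3)" (German)
    "resp_abbreviation": re.compile(r"\bresp\.", re.IGNORECASE),
    # "Is German the only language, that does this?" (German)
    "comma_before_that": re.compile(r",\s+that\b"),
    # "on beforehand" (Scandinavian)
    "on_beforehand": re.compile(r"\bon beforehand\b", re.IGNORECASE),
    # "thrice" (suggested above as more common in South Asian English)
    "thrice": re.compile(r"\bthrice\b", re.IGNORECASE),
}


def find_cues(text):
    """Return the names of all cues whose pattern occurs somewhere in text."""
    return [name for name, pattern in CUES.items() if pattern.search(text)]


if __name__ == "__main__":
    sample = "We are in the 21. century"
    print(find_cues(sample))
```

A hit only says that a pattern occurred. As several replies point out, such cues narrow things down to a region or language family at best, and a multilingual writer can defeat them entirely.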