Talk:List of English words of Japanese origin/Archive 2
This is an archive of past discussions about List of English words of Japanese origin. Do not edit the contents of this page. If you wish to start a new discussion or revive an old one, please do so on the current talk page.
Misleading
The title of this article -- "List of English words of Japanese origin" -- gives the misleading impression that these words are in sufficiently widespread and general use to be considered "English" by English speakers. In fact, only a tiny number of them fall into that category. Either the article needs renaming, or there needs to be a disclaimer at the start saying "these words are listed in standard English dictionaries, but..." I tried to do something along these lines but was reverted. Maybe my attempt was not good, but the way it reads right now cannot be correct IMO. Matt 20:43, 12 July 2007 (UTC).
- I agree that the article and the title are not well matched. My opinion is that words should be removed from the article. Would you agree that only words found in mainstream general English dictionaries should stay? Fg2 20:59, 12 July 2007 (UTC)
- Criteria for inclusion here have been discussed above in the Criteria section and in the subsequent sections. We could discuss that again if necessary, and delete some entries again.
- Having said that, I agree with the anonymous editor regarding a need for a short disclaimer. However, the disclaimer in the beginning should be kept simple and accurate, and I didn't agree with some of his details. For example, if you start listing words in the intro, it's going to look like original research.--Endroit 21:17, 12 July 2007 (UTC)
- I propose that the word be listed in at least three major English dictionaries. That should give a fairly impartial and general consensus of the widespread acceptability of a word as English. Citations should probably also be given for the dictionaries. Bendono 23:37, 12 July 2007 (UTC)
- That would strengthen Criterion #2. Do you have opinions about Criteria #1 and 3? Fg2 08:56, 13 July 2007 (UTC)
- I think it would be a good idea to at least state the criteria for inclusion in the article. Referencing individual words to individual dictionaries would be even better, I guess. The featured list List of English words containing Q not followed by U might have some general ideas and a model for a "disclaimer" warning. I accept the objections to my original edit as appearing like OR, but by way of explanation, my thinking was as follows. I came to this article interested in English words of Japanese origin. I scrolled down the list and saw Baren, Bokeh, Bunraku etc. etc. etc. My immediate thought was "this isn't what I want to know... none of these are English words." Hence my attempt to construct a very much shorter list of the words such as tycoon that, to my mind at least, could conceivably be called "English". On a different point, is there any reason why all the words are capitalised? Normal practice in word lists is to capitalise only proper nouns, isn't it? Matt 22:12, 13 July 2007 (UTC).
- Sorry for the late response. This page fell off my radar until a recent edit. Fg2 asked about Criteria 1 and 3. If we accept my above proposal, I do not think that any other criteria are needed. #1: if it is in three major English dictionaries, then it should "occur in English texts with [sufficient] regularity". #3: if it is used by people who don't speak Japanese, then it should already be in major English dictionaries. Realistically, it is unlikely that everyone will agree with #1 and 3, and it smells of original research. Nor should we have to; let lexicographers argue about those details. We can cite them. I asked for three or more major dictionaries to obtain a fairly impartial and general consensus of the widespread acceptability of a word as English. Bendono (talk) 03:23, 8 February 2008 (UTC)
(resetting indent) The three-dictionary rule is nice because it's objective. Would you want to elaborate on which dictionaries are acceptable? Or leave it open? Fg2 (talk) 08:24, 8 February 2008 (UTC)
- Ideally I would hope that we do not need to specify. Just like with reliable sources, some dictionaries are more reliable than others. It is difficult to describe, so let me give a few examples.
- Below are a few dictionaries that I consider to be acceptable:
- Below are some that I would not consider to be major English dictionaries:
- A Dictionary of Anime Terms
- A Dictionary of Japanese History Terminology
- The point is not to find a single obscure reference to a word. It is to find an impartial and general consensus of the wide-spread acceptability of a word. I think that the only objective way of doing that is by consulting multiple major English dictionaries, even if a specific list is not decided upon. That is the spirit of my proposal. As always, if there are any specific concerns, they can always be discussed here. Bendono (talk) 11:24, 8 February 2008 (UTC)
- I think giving examples of acceptable and unacceptable sources for this list will help ensure an authoritative list and give a rationale for removing inappropriate or marginal edits. In that spirit, it might be helpful to list a few categories of dictionaries that we want editors to avoid: dictionaries for non-native speakers of English and for translators, specialized dictionaries (Japanese History Terminology is an example), open-content dictionaries, electronic dictionaries that don't have reliable print counterparts. We can of course discuss exceptions if there are any to discuss. The criterion on the article page would be quite simple; elaborations like the ones in this paragraph would probably be best left to the talk page. Fg2 (talk) 12:17, 8 February 2008 (UTC)
- As ever on Wikipedia a multitude of busybody users have weighed in on an otherwise informative topic and spoiled it. Half of the words on here are only used to refer to items of Japanese culture in the first instance, and even then in fairly specialized contexts. Such words are hardly universally recognized in the way that, say, 'tycoon' and 'tsunami' are. The latter words have percolated so far into the English-speaking world that their origin is not immediately clear. Someone needs to put an end to the pedantry. Gunstar hero (talk) 19:48, 5 June 2008 (UTC)
Same old discussion
I'm amused to see the constant desire to see words that are jargon in specific disciplines or hobbies related to Japan added to this page. Properly call these non-anglicized words jargon, and you can't make an argument against it. Suggest that they might be established loan words, and it gives logophiles the excuse to wage dictionary wars. But call this ever-dynamic page glossary of Japanese words, and there shall be eternal peace (and additions to the list).--67.121.120.67 (talk) 02:35, 12 February 2008 (UTC)
None of these words are English
None of these words can be considered part of the English language. They are simply Japanese words transcribed into English.
The title of the article needs to be changed. Globalscene (talk) 22:00, 30 April 2008 (UTC)
- Yeah, to "list of Japanese words known by American nerds" <eyes rolling> -WikiSkeptic (talk) 20:53, 2 May 2008 (UTC)
- The Oxford English Dictionary and the Merriam-Webster Dictionaries disagree with you. Please point out the words that reputable English dictionaries do not include. Fg2 (talk) 00:00, 1 May 2008 (UTC)
Oxford and Merriam-Webster seem to agree. Most of the words I've checked are either labelled Japanese terms or Japanese proper nouns.
Every word on this planet can be transcribed into English; that does not make it an English word. Globalscene (talk) 16:38, 2 May 2008 (UTC)
Many of these words are naturalized English-language citizens (so to speak). But the list has become bloated once more; "Bokeh" is an example of a word that is still alien here. --Orange Mike | Talk 16:40, 2 May 2008 (UTC)
- I agree that every word on this planet can be transcribed into English. Note that Japanese has about fifty or a hundred thousand words that the English dictionaries have not seen fit to include. The English dictionaries include the Japanese words that have become English. There's a distinction between a word that is Japanese without being English, and a word that is English listed with its origin being Japanese. It's the latter that belong in this article.
- Regarding "bokeh": photography magazines use the word freely. This is one of the criteria the editors of dictionaries will look at when deciding whether to include it as an English word. So maybe it's on its way to entering the dictionaries. But Wikipedia's function is not to predict what future editions will say; instead, we should wait until this enters reputable dictionaries as an English word. When it does, it belongs in this article. So check the dictionaries. If it's in them, keep it in this article; if not, remove it. Fg2 (talk) 06:55, 3 May 2008 (UTC)
This isn't a list of words in an English dictionary; many are in the dictionary but are explicitly said to be Japanese. Globalscene (talk) 00:14, 4 May 2008 (UTC)
- Yes, and that's a problem exactly how? If they are in an English language dictionary, and they say they are Japanese, then they are obviously English words of Japanese origin. Hence this list. ···日本穣? · Talk to Nihonjoe 04:47, 4 May 2008 (UTC)
- No, the dictionaries do not say they are Japanese. This seems to be a common point of confusion so let me explain. Every entry in an English dictionary is an English word. The etymology of words may be different. Some come from French, some from German, some from Middle English, etc. Some from Japanese. Apparently you see the etymology section and think that means the word is not English despite it having an entry in the dictionary. For example, if you look at an example brought up in the AFD, honcho, the MW entry says the etymology is from the Japanese word "hanchō", but that does not mean "honcho" is Japanese. Indeed, that isn't even the proper Romanization, the meanings are different, and so are the pronunciations. --C S (talk) 00:21, 15 May 2008 (UTC)
- I think what is being pointed out here is that a large number of dictionaries give word origin information (though the detail varies). Therefore, if a word is of Japanese origin, the dictionary will often indicate this. ···日本穣? · Talk to Nihonjoe 01:19, 15 May 2008 (UTC)
- Exactly. When the Merriam-Webster Online Dictionary defines sayonara and writes "Etymology: Japanese" that means it's an English word that comes from the Japanese language. When the Oxford English Dictionary's Etymology page for the same word says "[Jap.]" it means the word is English and came from Japanese. As C S said, every word that has an entry in an English dictionary is English, according to the compilers of that dictionary. Words from East Asian languages take their place as part of English along with words from many historical and geographical relatives of German, French, Latin, and other European languages. And the example C S gives, "honcho," refutes the statement "They are simply Japanese words transcribed into English."
- I agree that there are words in this list that are not yet part of the English language. "Bokeh" is in neither Oxford nor Merriam-Webster. So maybe we need a new article, "Japanese words in use in English in specialized fields" or something similar, if these words are to find a place. Or maybe we should simply delete them. But we have to be careful to remove only words that are not English. Words that the compilers of reputable general English dictionaries agree are English belong in this article. Fg2 (talk) 02:55, 15 May 2008 (UTC)
Proposal
The following are, indisputably, words of Japanese origin whose meaning would be understood by, and which are part of the vocabulary of, the average speaker of English:
- bonsai tree
- haiku
- karaoke
- manga
- origami
- dojo
- committing hara-kiri
- head honcho
- judo
- jiu jitsu
- kamikaze
- karate chop
- ninja
- nunchucks (nunchaku)
- samurai
- sumo wrestling
- futon
- hooch
- kimono
- gingko tree
- hibachi grill
- ramen noodles
- shiitake mushrooms
- soy sauce
- sukiyaki
- sushi
- shrimp tempura
- teriyaki
- tofu
- tycoon
- shogun
- Zen
- akita dog
- geisha girl
- "Go" the game
- kudzu vines
- rickshaw
- sayonara
- Sensei
- sudoku
- anime
That's an impressive number of words right there. All the rest should be removed. There's no need to embellish the list. As a reverse analogy, I'm sure that the words "gerrymander" or "McCarthyism" have turned up in Japanese books, but they would not be considered to be part of Japanese vocabulary. I'm not sure that the opinion will be unanimous, but the list should be trimmed down to commonly used words. Mandsford (talk) 21:54, 4 May 2008 (UTC)
- I agree the list should be trimmed. It may be useful, as well, to have a section with words familiar to people within a certain field (such as bokeh and photography). While such words may not be within the general public knowledge, they may very well be important and frequently used within a particular field. ···日本穣? · Talk to Nihonjoe 23:02, 4 May 2008 (UTC)
- Please read the archives of this Talk page before taking radical action. This has been discussed thoroughly and there is no need to reinvent the list. Light pruning may be in order but not of the magnitude proposed. The link to the archives is in the Archives box at the top of this page. Fg2 (talk) 23:07, 4 May 2008 (UTC)
- Perhaps we could separate the words that truly are "English words" from the lesser-known words that happen to be Japanese. No need to tell us to shut up, because a lot of people have discussed the matter of revising (what you call "reinventing") the list. Mandsford (talk) 15:21, 5 May 2008 (UTC)
- Hey, now, that was uncalled for. No one is telling anyone to "shut up", so cool your jets. Fg2 simply asked you to read the talk archives before taking any major action (such as wiping out most of the list) as this issue has been discussed in the past. I know Fg2 well enough to know he wasn't trying to be rude here. ···日本穣? · Talk to Nihonjoe 19:25, 5 May 2008 (UTC)
- Whether it's been discussed in the past is irrelevant. The article is in need of a major revamp. Globalscene (talk) 21:06, 9 May 2008 (UTC)
- No, it's not irrelevant if it applies directly to this situation. If you go read the discussion, you'll see there are quite a few people who agree the list needs revamping and regular attention to keep it from becoming bloated by Japanese words which are not really English words (yet). ···日本穣? · Talk to Nihonjoe 01:17, 15 May 2008 (UTC)
Currently, the inclusion criteria are based on the most credible and reliable third-party sources for word origins and existence: major, published dictionaries. I can't imagine that there is a better (i.e. more reliable and authoritative) source than major published English dictionaries, so I can't imagine any better criteria for inclusion. If you have a specific proposal for inclusion criteria that you believe is better than using major published English dictionaries, please present it; however, caviling about how the list is too long and giving a list of words that should be included without any explanation of an independent criterion for selecting those words is not really a tenable proposal. You Are Probably Not a Lexicologist or a Lexicographer. Nohat (talk) 04:27, 15 May 2008 (UTC)
What about Typhoon? That's a pretty common English word from Japanese (Taifuu) —Preceding unsigned comment added by Ottawakismet (talk • contribs) 20:25, 4 November 2009 (UTC)
I removed 'typhoon' because it did not enter English from Japanese, but rather via Portuguese, Greek, and Chinese. Japanese "taifuu" is On'yomi, so it also came from Chinese. Ulmanor (talk) 04:51, 5 November 2009 (UTC)
Nomination for Deletion = vandalism?
Is this the typical level of maturity displayed here? —Preceding unsigned comment added by Globalscene (talk • contribs) 21:03, 9 May 2008 (UTC)
So is this article at the mercy of admin bias?
This is what I'm getting from this discussion. Especially these Japanese-American editors. Yes, I did some background checking and a lot of people here are actually Japanese. —Preceding unsigned comment added by Globalscene (talk • contribs) 21:09, 9 May 2008 (UTC)
- I think there is probably a more relevant reason: namely, you are mistaken in your assertions. Japanese etymologies of English words do not make the words Japanese, any more than a Greek etymology makes an English word Greek. --C S (talk) 00:24, 15 May 2008 (UTC)
- I'll bet you that a lot of people who you think are Japanese aren't. ···日本穣? · Talk to Nihonjoe 01:16, 15 May 2008 (UTC)
Which words appear in dictionaries?
I wrote a small program to check various online dictionaries for whether a word is listed, and here are the results for the words on this page.
word | american_heritage | merriam_webster | random_house | websters_new_millennium | wordnet |
---|---|---|---|---|---|
adzuki | no | yes | no | yes | no |
azuki bean | yes | yes | yes | no | no |
aikai | no | no | no | no | no |
aikido | yes | yes | yes | no | no |
akita | no | yes | yes | no | no |
atsu | no | no | no | no | no |
aucuba | yes | yes | yes | no | no |
banzai | yes | yes | yes | no | yes |
bento | yes | no | yes | no | no |
bokeh | no | no | no | yes | no |
bokken | no | no | no | no | no |
bonsai | yes | yes | yes | no | yes |
bonze | yes | yes | yes | no | no |
budo | no | yes | no | yes | no |
bukkake | no | no | no | no | no |
bushido | yes | yes | yes | no | yes |
daikon | yes | yes | yes | no | no |
daimyo | yes | yes | yes | no | no |
dan | yes | yes | yes | no | no |
dashi | no | yes | yes | no | no |
dojo | yes | yes | yes | no | no |
domoic acid | yes | yes | no | no | yes |
edamame | yes | no | no | no | no |
ekiden | no | no | no | no | no |
enokitake | no | no | yes | no | no |
enoki mushroom | no | yes | no | no | no |
fugu | yes | yes | yes | no | yes |
fusuma | no | yes | yes | no | no |
futon | yes | yes | yes | no | no |
gaijin | no | yes | yes | no | no |
geisha | yes | yes | yes | no | yes |
genro | no | yes | yes | no | no |
geta | yes | yes | yes | no | yes |
ginkgo | yes | yes | yes | no | yes |
go | yes | yes | yes | no | yes |
gyokuro | no | yes | no | no | no |
gyoza | no | no | yes | no | no |
haiku | yes | yes | yes | no | yes |
hanami | no | no | no | no | no |
happi | no | no | no | no | no |
happy coat | no | no | no | no | no |
hara-kiri | yes | yes | yes | no | yes |
hentai | no | no | no | yes | no |
hibachi | yes | yes | yes | no | no |
hijiki | no | no | yes | no | no |
hikikomori | no | no | no | yes | no |
hiragana | yes | yes | yes | no | no |
honcho | yes | yes | yes | no | yes |
honcho | yes | yes | yes | no | yes |
ikebana | no | yes | yes | no | no |
imari | no | yes | no | no | no |
inro | yes | yes | yes | no | no |
judo | yes | yes | yes | no | yes |
jujutsu | yes | yes | yes | no | yes |
juku | no | no | yes | no | no |
kabuki | yes | yes | yes | no | no |
kaizen | no | no | no | yes | no |
kakemono | yes | yes | yes | no | no |
kaki | yes | yes | yes | no | no |
kakiemon | no | yes | no | no | no |
kami | yes | yes | no | no | yes |
kamikaze | yes | yes | yes | no | yes |
kana | yes | yes | yes | no | no |
kanban | no | yes | yes | no | no |
kanji | yes | yes | yes | no | no |
karaoke | yes | yes | yes | no | no |
karate | yes | yes | yes | no | yes |
karoshi | no | no | no | yes | no |
kata | yes | yes | yes | no | no |
katakana | yes | yes | yes | no | no |
katana | no | yes | no | yes | no |
katsuo | no | no | no | no | no |
katsuobushi | no | no | no | no | no |
katsura | no | yes | no | no | no |
katsuramono | no | no | no | no | no |
keiretsu | yes | yes | yes | no | no |
keirin | no | no | no | no | no |
kendo | no | yes | yes | no | no |
kimono | yes | yes | yes | no | yes |
kirigami | yes | no | yes | no | no |
koan | yes | yes | yes | no | yes |
koi | no | yes | yes | yes | no |
koji | no | yes | yes | no | no |
kombu | no | yes | yes | no | no |
koto | yes | yes | yes | no | no |
kudzu | yes | yes | yes | no | no |
makimono | no | yes | yes | no | no |
manga | no | yes | yes | no | no |
matcha | no | no | no | no | no |
matsuri | no | no | no | no | no |
matsutake | no | yes | yes | no | no |
medaka | no | yes | yes | no | no |
mikado | yes | yes | yes | no | yes |
mirin | yes | yes | no | no | no |
miso | yes | yes | yes | no | yes |
mizuna | no | yes | yes | no | no |
mochi | no | no | no | no | no |
moxibustion | yes | yes | no | no | no |
nappa, napa cabbage | no | no | no | no | no |
nashi | no | yes | yes | no | no |
netsuke | yes | yes | yes | no | no |
ninja | yes | yes | yes | no | yes |
noh | yes | yes | yes | no | no |
nori | yes | yes | yes | no | no |
nunchaku | no | yes | yes | no | no |
obi | yes | yes | yes | no | yes |
ooch | no | no | no | no | no |
origami | yes | yes | yes | no | no |
otaku | no | no | no | yes | no |
oxa | no | no | no | no | no |
pachinko | yes | yes | yes | no | no |
panko | no | no | no | no | no |
rame | no | no | no | no | no |
ramen | no | yes | yes | no | no |
randori | no | yes | no | no | no |
renga | no | no | yes | no | no |
rickshaw | yes | yes | yes | no | yes |
romaji | no | yes | yes | no | no |
ronin | no | yes | no | no | no |
roshi | no | no | yes | no | no |
sai | no | no | no | no | no |
sake | yes | yes | yes | yes | yes |
sakura | no | yes | no | no | no |
salaryman | no | yes | yes | no | no |
samurai | yes | yes | yes | no | yes |
sashimi | yes | yes | yes | no | no |
satori | yes | yes | yes | no | no |
satsuma | yes | yes | yes | no | yes |
satsuma | yes | yes | yes | no | yes |
sayonara | yes | yes | yes | no | no |
senryu | no | yes | no | no | no |
sensei | no | yes | yes | no | no |
seppuku | yes | yes | yes | no | yes |
shabu shabu | no | yes | yes | no | no |
shakuhachi | no | yes | no | no | no |
shamisen | yes | yes | no | no | no |
shiatsu | yes | yes | yes | no | yes |
shiba inu | no | yes | no | no | no |
shiitake | yes | yes | yes | no | no |
shinkansen | no | no | no | no | no |
shinto | yes | yes | yes | no | yes |
shogi | yes | yes | yes | no | no |
shogun | yes | yes | yes | no | yes |
shoji | yes | yes | yes | no | no |
shoyu | no | yes | yes | no | no |
shunga | no | no | no | no | no |
sika | yes | yes | yes | no | no |
skosh | no | yes | yes | no | no |
soba | yes | yes | yes | no | no |
soroban | no | yes | yes | no | no |
soy | yes | yes | yes | no | yes |
sudoku | yes | yes | no | yes | yes |
sukiyaki | yes | yes | yes | no | no |
sumi-e | no | no | yes | no | no |
sumo | yes | yes | yes | no | yes |
surimi | no | yes | yes | no | no |
sushi | yes | yes | yes | no | yes |
tabi | yes | no | yes | no | yes |
taiko | no | no | no | no | no |
takoyaki | no | no | no | no | no |
tamari | no | yes | yes | no | no |
tanka | yes | yes | yes | no | yes |
tanuki | no | yes | no | no | no |
tatami | no | yes | yes | no | no |
tempura | yes | yes | yes | no | yes |
tenno | no | yes | no | no | yes |
teppanyaki | no | yes | no | no | no |
teriyaki | yes | yes | yes | yes | no |
tofu | yes | yes | yes | no | yes |
tokonoma | no | yes | yes | no | no |
tokusatsu | no | no | no | no | no |
torii | no | yes | yes | no | no |
tsunami | yes | yes | yes | no | yes |
tsutsugamushi | no | yes | no | no | no |
tycoon | yes | yes | yes | no | yes |
udo | yes | yes | yes | no | no |
udon | no | yes | yes | no | no |
ukiyo-e | no | yes | yes | no | no |
umami | yes | yes | no | no | no |
umeboshi | no | no | yes | no | no |
urushiol | yes | yes | yes | no | no |
utani | no | no | no | no | no |
uzushi | no | no | no | no | no |
waka | no | yes | yes | no | no |
wakame | no | yes | yes | no | no |
wakizashi | no | no | no | no | no |
wasabi | yes | yes | yes | no | yes |
yagi | yes | yes | no | no | no |
yakitori | no | yes | yes | no | no |
yakuza | yes | no | yes | no | no |
yukata | no | no | yes | no | no |
yumi | no | no | no | no | no |
zaibatsu | no | yes | yes | no | no |
zazen | yes | yes | yes | yes | no |
zen | yes | yes | yes | no | yes |
zori | yes | yes | yes | no | no |
And for the historical record, here is the program I wrote to do this:
#!/usr/bin/perl
use warnings;
use strict;
use LWP::Simple qw(get);
use Memoize;
memoize('my_get');

my %dictionary_data = (
    merriam_webster => { url => 'http://www.m-w.com/dictionary/%word%',
                         nomatch_regexp => qr/The word you've entered isn't in the dictionary/ },
    random_house => { url => 'http://dictionary.reference.com/browse/%word%',
                      nomatch_regexp => qr/No results found for/,
                      match_regexp => qr/Based on the Random House Unabridged Dictionary/ },
    american_heritage => { url => 'http://dictionary.reference.com/browse/%word%',
                           nomatch_regexp => qr/No results found for/,
                           match_regexp => qr/The American Heritage/ },
    wordnet => { url => 'http://dictionary.reference.com/browse/%word%',
                 nomatch_regexp => qr/No results found for/,
                 match_regexp => qr/WordNet/ },
    websters_new_millennium =>
        { url => 'http://dictionary.reference.com/browse/%word%',
          nomatch_regexp => qr/No results found for/,
          match_regexp => qr/Webster\'s New Millennium/ }, #'
);

my @dicts = sort keys %dictionary_data;

print "{|\n|-\n! word\n";
foreach my $dict (@dicts) {
    print "! $dict\n";
}

while (<>) {
    chomp;
    my @results = ();
    print "|-\n| $_\n";
    foreach my $dict (@dicts) {
        if (is_in_dictionary($_, $dict)) {
            print qq(| style="background:#99FF99" | yes\n);
        } else {
            print qq(| style="background:#FF9999" | no\n);
        }
    }
}
print "|}\n";

sub is_in_dictionary {
    my ($word, $dictionary) = @_;
    my $url = $dictionary_data{$dictionary}->{url};
    $url =~ s/%word%/$word/;
    my $result = my_get($url);
    my $nomatch_regexp = $dictionary_data{$dictionary}->{nomatch_regexp};
    my $match_regexp = $dictionary_data{$dictionary}->{match_regexp};
    if ($result =~ $nomatch_regexp) {
        return 0;
    } else {
        if (defined $match_regexp) {
            return $result =~ $match_regexp ? 1 : 0;
        } else {
            return 1;
        }
    }
}

sub my_get {
    my ($url) = @_;
    my $result = get($url);
    return $result;
}
Nohat (talk) 05:51, 15 May 2008 (UTC)
What to do with this data? For starters, I think words which don't appear in any of the dictionaries are good candidates for culling, provided no one can show that the word appears in a dictionary other than the ones given here, such as the Oxford English Dictionary. Nohat (talk) 05:53, 15 May 2008 (UTC)
- I think words that are in at least two dictionaries should qualify. Also, in cases like "bokeh", which is a very common word in photography (I believe Fg2 pointed this one out), exceptions can be made on a case by case basis. ···日本穣? · Talk to Nihonjoe 06:05, 15 May 2008 (UTC)
- "bokeh" is not a common word outside of photography, which makes it a field-specific technical jargon borrowed from another language. If that is English, then, for instance, "dal segno" (a music term) must also be English. Furthermore, "poco a poco diminuendo" should be an example of how English speakers express the idea of quieting down little by little. If you have to reach for a technical jargon dictionary to find a word, because it's not in the regular ones, it is not English. An English word is something in common use by English speakers, not just those engaged in a narrow field. Some music terms have passed into common use. An ability that you have is your "forte", and a "piano" is an instrument. Thus, these are English (and found in major dictionaries). Also, if the word requires accents, umlauts or strange characters, it is not English. "Resume" (noun) is English, but if you stick the French accents back into it, it is then, as written, the French word, and not English any longer. 24.85.131.247 (talk) 08:22, 29 April 2012 (UTC)
- Also, if "bokeh" is English, then so is every freaking Latin plant and animal name. Those are technical jargon also and used by English speaking scientists. We should not accept that Latin is English. Latin is Latin, Italian is Italian and Japanese is Japanese. 24.85.131.247 (talk) 08:22, 29 April 2012 (UTC)
- There were typos in the article and the list above was affected by them. Please see diff and check those that are fixed. "nappa, napa cabbage" also needs to be checked separately. --Kusunose 06:36, 15 May 2008 (UTC)
- I think before anything is done with the data, "Webster's New Millennium" ought to be removed. Clearly it is a rather brief dictionary, not even listing karate, karaoke, sushi, etc. even when the first three all do. Some dictionaries are more authoritative than others; not only have I never heard of the Webster's New Millennium dictionary, it seems rather hard to find in a print edition. For example, note that karoshi and hikikomori are listed in Webster's New Millennium but not the others. I have serious doubts about whether these have actually made it into the English language. --C S (talk) 06:52, 16 May 2008 (UTC)
- I agree there are some problems with this data. The Webster's New Millennium and the WordNet results do seem somewhat spurious, and as Kusunose pointed out, there were some typos in the list of words I used. For me, the main takeaway from my experiment was to show that the majority of words on this page are in fact listed in multiple major published English dictionaries, and that there isn't much of a compelling argument to support the calls for a drastic shortening of the list that could be based on any reliable, verifiable criteria. Certainly I agree that the words that aren't listed in any dictionaries or are only listed in one dictionary are good candidates for removal, but there aren't that many of those. Unless someone else does it before I get the chance, I will post corrections to the list, and then a proposed rubric for removing words and a list of words to be removed for further discussion. Nohat (talk) 08:01, 16 May 2008 (UTC)
- Hi Nohat. By the way, thanks for going to this effort. I look forward to your revision. --C S (talk) 08:09, 16 May 2008 (UTC)
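As a rough illustration of the threshold idea raised in this thread (words found in no dictionary or only one being candidates for removal, words found in two or more staying), here is a minimal Perl sketch that counts the "yes" cells in rows copied from the table above. The helper names count_hits and classify and the $min_dicts setting are hypothetical and are not part of Nohat's program; the threshold is only one of the proposals made here, not an agreed rubric.

```perl
#!/usr/bin/perl
use warnings;
use strict;

# Rows copied verbatim from the results table above.
my @rows = (
    'bokeh | no | no | no | yes | no |',
    'bokken | no | no | no | no | no |',
    'budo | no | yes | no | yes | no |',
    'karaoke | yes | yes | yes | no | no |',
);

# Assumption: two dictionaries is the minimum to stay on the list.
my $min_dicts = 2;

foreach my $row (@rows) {
    my ($word, $hits) = count_hits($row);
    printf "%-8s %d %s\n", $word, $hits, classify($hits, $min_dicts);
}

# Hypothetical helper: split one table row into the word and its cells,
# then count how many cells say "yes".
sub count_hits {
    my ($row) = @_;
    $row =~ s/^\s+//;
    my ($word, @cells) = split /\s*\|\s*/, $row;
    my $hits = grep { $_ eq 'yes' } @cells;
    return ($word, $hits);
}

# Hypothetical helper: apply the minimum-dictionary threshold.
sub classify {
    my ($hits, $min) = @_;
    return $hits >= $min ? 'keep' : 'removal candidate';
}
```

Setting $min_dicts to 3 would mirror Bendono's stricter three-dictionary proposal, under which "budo" (two dictionaries in the table) would also become a removal candidate. C S's suggestion to discount Webster's New Millennium could be approximated by dropping that column from each row before counting.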