Talk:Debate on the use of Korean mixed script
This article was nominated for deletion on 17 February 2024. The result of the discussion was keep.
This article is rated C-class on Wikipedia's content assessment scale.
Information theory + partiality
I am by no means an expert in Korean or in Asian scripts, but I am reasonably knowledgeable in mathematics and computer science, and it seems to me that the section about information theory, at least, is close to nonsense.
Obviously, a larger set of symbols allows words to be expressed with fewer symbols, but that doesn’t take into account the cost of learning the set. By having more possible symbols, you trade ease of learning for information density, and it is an inefficient deal, because density grows only logarithmically with the number of possible symbols. And what is a symbol, anyway? On this broken metric, Hangul could appear to outperform Hanja, simply by treating the syllables, rather than the individual letters, as the symbols (there are, apparently, 11,172 possible syllables, compared with the 2,000 common Hanja characters). Yet the set of Hangul syllables is much easier to learn, and Hangul syllables are much easier to decode, because they are formed in a principled way from a reduced set of 24 letters, whereas Hanja characters are just arbitrary drawings. The shape of a Hangul syllable conveys phonemic information, whereas the shape of a Hanja character conveys nothing.
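To illustrate what “formed in a principled way” can mean in practice, here is a minimal Python sketch. It assumes Unicode's precomposed Hangul Syllables block (starting at U+AC00) and its arithmetic layout of 19 initial consonants, 21 vowels and 28 final slots (27 finals plus “no final”), none of which is mentioned above. Those counts include compound letters, which is why they are larger than the 24 basic letters. The names `compose` and `decompose` are only illustrative.

```python
# Assumption: Unicode's Hangul Syllables block, which lays syllables out
# arithmetically by (initial consonant, vowel, final consonant) indices.
S_BASE = 0xAC00                            # code point of the first syllable, 가
L_COUNT, V_COUNT, T_COUNT = 19, 21, 28     # initials, vowels, finals (incl. "none")
SYLLABLE_COUNT = L_COUNT * V_COUNT * T_COUNT


def compose(initial: int, vowel: int, final: int = 0) -> str:
    """Build one precomposed syllable from its three letter indices."""
    if not (0 <= initial < L_COUNT and 0 <= vowel < V_COUNT and 0 <= final < T_COUNT):
        raise ValueError("letter index out of range")
    return chr(S_BASE + (initial * V_COUNT + vowel) * T_COUNT + final)


def decompose(syllable: str) -> tuple[int, int, int]:
    """Recover the (initial, vowel, final) indices of a precomposed syllable."""
    index = ord(syllable) - S_BASE
    if not 0 <= index < SYLLABLE_COUNT:
        raise ValueError("not a precomposed Hangul syllable")
    initial, rest = divmod(index, V_COUNT * T_COUNT)
    vowel, final = divmod(rest, T_COUNT)
    return initial, vowel, final


if __name__ == "__main__":
    print(SYLLABLE_COUNT)          # size of the whole syllable inventory
    print(compose(18, 0, 4))       # ㅎ + ㅏ + ㄴ
    print(decompose("글"))
```

The point of the sketch is that the whole syllable inventory is the product of three small letter sets, so a reader only has to learn the letters and the composition rule, not each syllable separately.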
The objection about homophones is of course valid, but the current paragraph lacks the amount of quantification that one would expect from a section titled “Information theory” and, besides, it makes a number of claims that ought to be supported by sources.
Also, −log2(1/24) is approximately 4.58, not 4.75 as written in the article (and −log2(1/2000) is closer to 10.97 than to 10.96).
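For anyone who wants to check these figures, here is a quick sketch in Python. It assumes, as the article's formula −log2(1/n) apparently does, that all n symbols are equally likely, which is a strong simplification for any real script. The helper name `bits_per_symbol` is hypothetical.

```python
import math


def bits_per_symbol(n_symbols: int) -> float:
    """Information per symbol, in bits, under the equiprobable assumption.

    -log2(1/n) is the same quantity as log2(n).
    """
    if n_symbols < 1:
        raise ValueError("need at least one symbol")
    return math.log2(n_symbols)


if __name__ == "__main__":
    # Symbol counts taken from the discussion above.
    for label, n in [("Hangul letters", 24),
                     ("common Hanja", 2000),
                     ("Hangul syllables", 11172)]:
        print(f"{label:>17}: n = {n:>6}, {bits_per_symbol(n):.2f} bits per symbol")
```

Going from 24 to 2,000 symbols multiplies the size of the set by more than 80 but the bits per symbol by less than 2.5, which is the logarithmic growth mentioned earlier.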
In any case, no source whatsoever is provided in that section to demonstrate that this is more than personal work. I second @Remsense, who put the {{Multiple issues}} box at the top of the article. The article as a whole does seem biased towards mixed script being superior to pure Hangul. For instance, it does not look like good faith to assert that “For the first 500 years of Hangŭl's existence Korea's literacy rates were not higher than that of other pre-industrialised states or even that of its character-using neighbours”: as I understand it, even though Hangul was invented some 500 years ago, it was banned or in marginal use until 1894, and when it finally became widespread, literacy in Korea took off at a quite spectacular rate. Attributing wrong claims (that mixed script was due to Japanese colonialism) to supporters of pure Hangul, again without sources, might be a strawman fallacy. I failed to find the alleged 2005 study about literacy in the OECD; I don’t doubt it exists, but without the source, I can’t check the related claim in this article (the given source is an archived newspaper article which isn’t very explicit about the discussed study). Maëlan 04:39, 17 February 2024 (UTC)
- Looking over this article again, I am thinking about AFD'ing this article as a WP:TNT case: its topic is notable, but it is wholly a net negative on this site in its present state. Remsense诉 05:58, 17 February 2024 (UTC)
- Most of the stuff in Information Theory other than the math is really an over-explanation of how Hangul cannot replace Hanja as a whole, since it is a phonographic script and cannot represent the logographic meaning that each Hanja character has. 00101984hjw (talk) 07:11, 19 February 2024 (UTC)