Yūrei moji (幽霊文字, lit. "ghost characters" or "phantom characters"), also called yūrei kanji (幽霊漢字), are characters in the JIS kanji character sets whose sources are unclear.

Overview

JIS C 6226 (currently JIS X 0208) was established by the Ministry of International Trade and Industry of Japan in 1978. Because the source of each character was not clearly documented when the standard was drawn up, it came to include characters whose usage is unknown. These are called yūrei moji.

Kanji that are considered yūrei moji (Shift JIS in parentheses)
垉 (9AB3) 垈 (9AB0) 墸 (9AD5)
壥 (9ADD) 妛 (9BAA) 岾 (9BB1)
彁 (9C5A) 恷 (9C8E) 挧 (9D6A)
暃 (9DF1) 椦 (9E9B) 橸 (9EEF)
汢 (9F89) 熕 (E090) 碵 (E1F1)
穃 (E26D) 粐 (E2E2) 粭 (E2E4)
粫 (E2E6) 糘 (E2F2) 膤 (E452)
蟐 (E5AA) 袮 (E5D7) 軅 (E75F)
鍄 (E7FB) 閠 (E880) 靹 (E8D6)
駲 (E971) 鵈 (E9FC)
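
The Shift JIS codes in the table above can be spot-checked by encoding the characters. The following is a minimal sketch, assuming Python's built-in `shift_jis` codec follows the JIS X 0208 mapping for these characters; the names `LISTED`, `sjis_hex`, and `mismatches` are hypothetical helpers introduced only for illustration, and only two entries of the table are sampled.

```python
# Hypothetical sample taken from the table above: character -> listed Shift JIS code.
LISTED = {
    "妛": "9BAA",
    "彁": "9C5A",
}


def sjis_hex(ch: str) -> str:
    """Return the Shift JIS encoding of one character as uppercase hex."""
    # Assumption: the built-in "shift_jis" codec covers this character.
    return ch.encode("shift_jis").hex().upper()


def mismatches(listed: dict[str, str]) -> dict[str, str]:
    """Return the characters whose codec result differs from the listed code,
    mapped to the code the codec actually produced."""
    return {ch: sjis_hex(ch) for ch, code in listed.items() if sjis_hex(ch) != code}


if __name__ == "__main__":
    for ch, code in LISTED.items():
        # Print the character, the code from the table, and the codec's result.
        print(ch, code, sjis_hex(ch))
    print("mismatches:", mismatches(LISTED))
```

Further rows of the table can be added to `LISTED` in the same way; an empty result from `mismatches` means the codec agrees with the listed codes for the sampled characters.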

The best-known examples are 妛 and 彁, neither of which appears in the Kangxi Dictionary. For 妛, the evidence submitted when the first edition of JIS X 0208 (JIS C 6226-1978) was compiled had been made by pasting two characters (山 and 女) together, and it is thought that the shadow between the two pieces was mistaken for a horizontal stroke. (The similar character that is actually attested is 𡚴, which is included in JIS X 0213.) For 彁, both the evidence and the usage are unclear.

After the JIS kanji were established and built into computers and word processors, people began to wonder what these characters were used for. Hiroyuki Sasahara of the National Institute for Japanese Language therefore began researching them, on the assumption that they were either used in place names or were mistakes made when copying from other data.

The research found that although these characters do not appear in general Kan-Wa jiten and were considered yūrei moji, many of them are actually used in place names. Twelve kanji (see JIS X 0208#Kanji from unknown sources), however, still have unknown sources. For most of these, similar characters can be found in old dictionaries, or they are thought to be mistakes made when copying other data; only a single character (彁) has no evidence or usage at all. Today, therefore, the characters properly considered yūrei moji are those 12, or in a narrower sense only one (彁). The results of this research are given in JIS X 0208:1997 Appendix 7.

Number of yūrei moji

There is no definite number of yūrei moji. Although 12 characters are reported to be yūrei moji, the number is not fixed at 12: even where no evidence can be found today, evidence may have existed when the standard was established.


Yūrei moji that have or had been in usage

  • 粫: used in Uruchida (粫田)
  • 橸: used in Ishidaru (石橸)
  • 軅: used in Takatobu (軅飛)
    • Located in Shirasaka, Shirakawa, Fukushima. The name has since been changed to Takatobi (鷹飛).

The character 椦 may be a miscopying of the first character of Nudeshima (橳島) in Maebashi, Gunma.

Handling in dictionaries

Because yūrei moji are characters of unknown source, they did not appear in dictionaries, at least before the JIS kanji were established. Whether they are mistakes, were used in the past, or were used by only a small number of people, their readings remain unclear.

Even so, as long as they are available in computers and word processors, it is impractical to exclude them from IME dictionaries, so most IME dictionaries assign unofficial readings by treating them as phono-semantic compound characters (形声文字, keisei moji). Kan-Wa jiten and kanji dictionaries that list character codes have followed suit and include these unofficial readings.


Reason the yūrei moji still exist

The research on yūrei moji was carried out as part of the 1997 revision of the JIS kanji. Because the so-called 1983 JIS Kanji Revision had already caused great confusion, further confusion from another revision could not be tolerated.

As a result, yūrei moji remain in JIS, and anyone with a suitable font can use them.

Examples of usage

Yūrei moji were originally characters of unknown meaning, and even where their meanings or the characters they derive from have been discovered, they are still rarely used.