Talk:GNU Unifont/Archive 1


Reference to Latest Version

{Request edit} Roman Czyborra's original GNU Unifont website at http://czyborra.com/unifont/ is down. Someone else posted a link to an archived version of his website. That archive does not contain the version of the GNU Unifont currently described in this article, the 2007-12-31 version. I created that version by assembling everything that existed, and adding glyphs of my own, then giving the result to Roman. This version was in Roman's "updates" directory on his website before it went offline. The only place where the 2007-12-31 and later versions are now available is my website at http://unifoundry.com/unifont.html.

I also created this GNU Unifont article partly because an editor deleted, from a more general page, an image of the entire GNU Unifont as it stood at one point in time (someone else had created and added the image). Another Wikipedian mentioned that they missed the image, so I included it in this article. A second editor deleted it from this new page when I added it, thinking it too much to include even in a dedicated article. The http://unifoundry.com/unifont.html link contains a table with links to graphical representations of all the glyphs in the latest GNU Unifont. With two editors deleting the font image from Wikipedia, that URL is the only place where such graphical depictions of the whole font now exist.

I am therefore requesting that a link be made to http://unifoundry.com/unifont.html so that the latest version of the font can be properly referenced in the article, and so that readers have access to graphical depictions of the whole font. If a link is made to unifoundry.com, I'll update the information on its Unicode Basic Multilingual Plane coverage. The GNU Unifont creator, Roman Czyborra, asked me to continue my additions to his font. --Ph9000 (talk) 16:36, 17 February 2008 (UTC)

Response. I declined the requested edit because I can't find 'unifont' in a search of http://www.gnu.org. Would welcome more discussion of this point. If Unifont were really being shipped as part of Linux, you'd expect to see the font stored and documented in more widely-accessible places. It's also a concern that the editor hosting the font, and making the request here, doesn't provide his real name. Having WP link to an anonymously-run website is (in my view) no better than linking to the web archive of Czyborra's work. I note there are quite a few Google hits for 'unifont' but I don't have time to make a study of the results. There is even a www.unifont.org site. Evidently Unifont exists, and Gnu exists. The question is, is there such a thing as 'Gnu Unifont' and if so, what is the best link to use for it? If there really is a Gnu Unifont why is it not mentioned at gnu.org? See also our article on Free software Unicode typefaces. EdJohnston (talk) 16:09, 3 April 2008 (UTC)
Response. I submitted my updated utility software to the Free Software Foundation in December 2007 or January 2008, but they were changing their whole database system. After changing their system, my submission sort of fell through the cracks. I called them back, and they added this link on 10 April 2008 for GNU Unifont utilities: http://directory.fsf.org/project/unifoundry/. My name is Paul Hardy, the name listed on the FSF site and the name under which my domain is registered (not anonymous). I told the FSF that I'd like to incorporate the latest font into Gnome as part of a GNU release, but only after I'm done adding the (as of the beginning of April) anticipated Unicode 5.1 glyphs. Roman Czyborra also requested before his website went down that I get my additions incorporated into Debian. There is just one script remaining from Unicode 5.1 that could be rendered well as a BDF font: the African script Vai, with about 300 glyphs. I'm also going to add all the remaining glyphs so at least there is some representation (even if complex scripts aren't rendered perfectly) since the Basic Multilingual Plane is almost complete. At that point, I'll get it incorporated into the latest Linux, Gnome, Debian, etc. I don't want to add it to official releases until then because a release usually sticks around in one form or another for years. --Ph9000 (talk) 19:48, 24 May 2008 (UTC)
So I guess we should wait until FSF ships a release of this software. EdJohnston (talk) 22:08, 24 May 2008 (UTC)
The software is already listed on FSF's site, at http://directory.fsf.org/project/unifoundry/. The last time GNU updated the source CD that they ship was in 2004. --Ph9000 (talk) 15:07, 25 May 2008 (UTC)
OK, that clarifies things somewhat. I am still concerned that the name 'Paul Hardy' does not appear at unifoundry.com. Can you fix that? It sounds as though you are taking over maintenance of Czyborra's work; is that correct? Also, the name 'Paul Hardy' doesn't seem to appear at gnu.org. Is there some other place where your name appears that we ought to be linking to? If you are the official maintainer of something that is a part of Gnu we ought to be able to reference that fact somehow. EdJohnston (talk) 16:21, 25 May 2008 (UTC)
My name has been in the "Licensing" section of the utilities page as copyright owner, but I guess that is buried in the middle of the page. It also appears as comments in my source code, but of course that is even more buried. My name appears at the end of the home page for the site. I just added it to the top of the page with the latest unifont version, at http://unifoundry.com/unifont.html. It seemed presumptuous to stick my name in such a prominent place; I've added thousands of glyphs, but others added most of them. Yes, I'm maintaining the font now; Roman Czyborra has told me to continue this work. He also told me that he would like to be sent the latest additions, but his website is down. When his website is back up, I'll send him the latest, or he can just copy them. I also checked on the FSF page and found that my name isn't publicly listed even though I gave them my name. I didn't realize that they kept names confidential. --Ph9000 (talk) 17:14, 25 May 2008 (UTC)
The idea of a Gnu Unifont seems interesting. One problem is that your role in it is currently so obscure, and if you can put your name more prominently on the unifoundry site that would help. It is also helpful if a WP article gives a good explanation of the issues around the whole topic. Does Gnu Unifont provide a widely-recognized solution to a known problem, or is it just another hypothesis for what ought to be done? Does your work have 'competitors?' How does this way of solving the problem compare to other proposals? Fonts appear to be a well-known topic, so if this is really a major contribution, surely it would have registered somewhere. EdJohnston (talk) 18:08, 25 May 2008 (UTC)
I am the one currently maintaining the font. It has been around for about 10 years, but no work had been done on it in a few years at the time I picked it up in 2007 (for example, the last Debian update with a new version of unifont.hex is from 2004). Roman Czyborra gave me the go-ahead to continue this work. The GNU Unifont entry on the Free Software font page you referred to has been there a long time as well. Roman Czyborra stated the justification for the font's existence in the first paragraph of the page currently linked as the archive: "The Unicode Standard was first published in 1991. Seven years later there is still no complete Unicode font and Unicode text often shows up unreadable with empty boxes or question marks for missing characters. This can lead to misunderstandings and is quite frustrating." Its purpose was to cover the entire Unicode Basic Multilingual Plane (the first 65,536 code points); I suppose that could get added to the writeup on the article's page. My name appears on the bottom of the main web page for the unifoundry site as a signature, and anybody can see who owns the domain name (it is in my name). I think that is enough, but it also appears on other pages. The whole point of my original post was for a link to the most recent version of unifont.hex rather than just a link to an outdated archive. At this point, my site is the only site with the latest version. --Ph9000 (talk) 20:05, 25 May 2008 (UTC)
By the way, as for "competitors," there aren't any others as far as I know. I searched for the most complete free font covering the Unicode Basic Multilingual Plane in 2007, and no other font even came close. That is why I am working with it. --Ph9000 (talk) 20:10, 25 May 2008 (UTC)
Does your work have actual users? If this were a serious problem, and you have solved it, wouldn't we see some evidence, in the form of enthusiastic testimonials? Or are people using other solutions instead? EdJohnston (talk) 19:02, 26 May 2008 (UTC)
If you do a Google search on "GNU Unifont" (with the double quotes), you'll see thousands of entries, including "testimonials." For example, the popular free international Unix editor Yudit uses GNU Unifont as a base font. I'm new to Wikipedia, but I do not think that testimonials belong here, not even in a discussion page. They fall in the realm of commercial marketing. I also think the idea of "competition" goes against the cooperative spirit of GNU and the free software community as a whole. It is an environment of cooperation, not competition, with the idea that such cooperation can produce something much greater than one person could achieve alone (just like Wikipedia). The spirit of this is captured in the Zulu word "Ubuntu" (the name of a derivative of Debian GNU/Linux); one translation is: "I am who I am because of who we all are." An important paper on the philosophy and dynamic of the free software community is Eric Raymond's paper "The Cathedral and the Bazaar"; see http://www.catb.org/~esr/writings/cathedral-bazaar/ for an online copy. The original "serious problem" this font addressed was that there was no font that covered the Unicode Basic Multilingual Plane. Unicode is the normal encoding for international documents and web pages today, and it will be in the future. Without font coverage of all of Unicode, it was likely that web pages would be encountered whose characters could not all be rendered. To render international characters, you need software support and fonts. Software can't even be tested until a font exists. For example, Unicode 5.1 just added the African script Vai, used by some people in Sierra Leone and Liberia; now that the Unicode encoding is defined, software and fonts are needed to support it. Is the lack of support of a specialized African script a "serious problem" for most of Wikipedia's readership? No. Does that detract from the importance of supporting Unicode as the standard for international information exchange? I don't think that it does. --Ph9000 (talk) 20:01, 31 May 2008 (UTC)
So it appears that people just put up with incomplete coverage of the Unicode plane. Consider the home page of the Bangla Wikipedia, which is here. Using Safari I can't read that font. If I had Gnu Unifont installed, could I read that page? EdJohnston (talk) 02:58, 1 June 2008 (UTC)

Yes, people put up with incomplete coverage if they have no other choice. As for the Bangla Wikipedia, the answer is: sort of...and therein lies the tradeoff of GNU Unifont. Roman Czyborra's original output was a BDF font for X-Windows. X-Windows did not have built-in capability to properly render complex scripts such as Indian (and other Brahmi-derived) scripts, Semitic scripts (notably Arabic), and others (such as Mongolian, which is written vertically). In complex scripts, letters change depending on their location in a word. The complication with Bengali (and other Indian scripts) is that vowels following consonants are written above, below, before, or after the consonant they follow in pronunciation. Bengali has a further complication not shared with Devanagari or most other Indian scripts in that some vowels have components that appear both before and after the consonant they follow in pronunciation. The only hope for this working in a simple bitmap font such as GNU Unifont is if consonants are drawn with sufficient padding on the left, right, top, and bottom to leave room for these combining vowel marks with Unicode-compatible software (again, X-Windows wasn't originally designed to handle that). The Bengali glyphs in GNU Unifont were more or less drawn that way, so that usually you'll be able to read what is there. But the result will not look perfect for all letter combinations. The philosophy is that at least something is better than nothing at all.
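
To make the "before and after the consonant" point concrete, here is a minimal sketch using Python's standard `unicodedata` module. It is not part of GNU Unifont or its utilities; the choice of the letter KA and the vowel sign O is only an illustration. Unicode stores the vowel sign after the consonant, and canonical decomposition (NFD) splits the sign into the two pieces that a renderer has to place on either side of the consonant.

```python
import unicodedata

# Illustrative choice of characters (not taken from the discussion above).
KA = "\u0995"            # BENGALI LETTER KA
VOWEL_SIGN_O = "\u09CB"  # BENGALI VOWEL SIGN O, a two-part vowel sign


def decompose(ch):
    """Return the canonical decomposition (NFD) of a single character."""
    return unicodedata.normalize("NFD", ch)


def names(text):
    """Return the Unicode character name of each code point in text."""
    return [unicodedata.name(c) for c in text]


# Logical (stored) order: consonant first, then the vowel sign.
syllable = KA + VOWEL_SIGN_O
parts = decompose(VOWEL_SIGN_O)

print(names(syllable))
print(names(parts))
```

A plain bitmap font can only draw the vowel sign as one glyph at a fixed position; putting the first piece to the left of the consonant and the second to its right is the job of the rendering software.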

Indian scripts are syllabic in that respect. When Unicode was being developed, the Indian government's ISCII standard was adopted as the coding scheme for Indian scripts in Unicode. ISCII was developed in the 1980s, when fonts were usually only given 256 code points. ISCII allowed ASCII in the lower half of a font, and one of several Indian scripts in the upper half for any given font. In other words, an ISCII font is limited to 128 characters per Indian script. Acceptable rendering requires far more glyphs than that, with several forms for most letters. In addition, most Indian scripts have many contractions of two or three letters. To use our Bengali example, there are over 300 such contractions. Obviously there is no way they can be encoded with only 128 code points, so in ISCII fonts they are broken up into their basic letters.
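
The arithmetic behind that limit can be sketched as follows. This assumes the simple 8-bit split described above (lower half ASCII, upper half one Indian script); the figure of 300 is only the "over 300" lower bound quoted for Bengali, and the variable names are hypothetical.

```python
# Assumed layout of an 8-bit ISCII-style font, as described in the comment above.
ASCII_HALF = range(0x00, 0x80)    # lower half: ASCII
SCRIPT_HALF = range(0x80, 0x100)  # upper half: one Indian script

script_slots = len(SCRIPT_HALF)

# Lower bound taken from the "over 300 contractions" figure for Bengali.
BENGALI_CONTRACTIONS_AT_LEAST = 300

# Even if every upper-half slot went to contractions (leaving none for the
# basic letters and vowel signs), this many would still have no code point.
unencodable = BENGALI_CONTRACTIONS_AT_LEAST - script_slots

print(script_slots, unencodable)
```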

There isn't general agreement in the Indian community about how many letter variations are necessary for a good Indian font (although the ITRANS project, no longer in development, made some good headway). ISCII fonts also were designed to handle modern scripts only; originally they didn't include some letters necessary for Sanskrit, and ISCII and Unicode today still do not have assignments for almost all Vedic marks. The most complete Vedic Devanagari font of which I am aware is from Omkarananda Ashram in the Himalayas (http://www.omkarananda-ashram.org/Sanskrit/itranslator2003.htm). The Sanskrit 2003 font there has over 1000 ligature combinations. Use of such ligatures is only possible with Unicode by using the Private Use area. By its nature, use of the Private Use area is non-standard. The Indian script assignments are evolving in Unicode.

BDF fonts don't by themselves allow rule sets for rendering of complex scripts. You'd need a specially encoded font and special software to work with that font for complex rendering. The most notable example of such a combination is probably Hanterm, an xterm (X-Windows terminal emulator) derivative for rendering Hangul (Hangeul), the Korean script (see http://unifoundry.com/hangul/ for a more detailed explanation).

TrueType / OpenType fonts do allow complex rules for rendering. They probably hold the promise for the future. Luis Gonzales Miranda wrote a program to convert GNU Unifont's native .hex font into a TrueType outline font. That provides an entry way for rendering complex scripts starting with GNU Unifont glyphs, but a lot more work is necessary. The pinnacle of font technology for complex scripts is currently the SIL's Graphite format (see http://scripts.sil.org/cms/scripts/page.php?site_id=nrsi&cat_id=RenderingGraphite). Graphite fonts are TrueType fonts with some additional font tables. Currently there isn't a lot of software that can handle Graphite fonts.
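
For readers unfamiliar with the .hex format mentioned above, the following is a rough sketch of how one line of unifont.hex can be read. The layout is an assumption not spelled out in this discussion: each line is taken to be a hexadecimal code point, a colon, and a bitmap of 16 rows, with 32 hex digits for an 8-pixel-wide glyph or 64 for a 16-pixel-wide one. The function names are hypothetical and the sample line is illustrative data for U+0041; this is not code from the Unifont utilities.

```python
GLYPH_HEIGHT = 16  # assumed fixed glyph height in pixel rows


def parse_hex_line(line):
    """Split 'CODEPOINT:BITMAP' into (code point, list of row bit strings)."""
    code_hex, bitmap_hex = line.strip().split(":")
    digits_per_row = len(bitmap_hex) // GLYPH_HEIGHT  # 2 -> 8 px, 4 -> 16 px
    width = digits_per_row * 4                        # 4 pixels per hex digit
    rows = []
    for i in range(GLYPH_HEIGHT):
        chunk = bitmap_hex[i * digits_per_row:(i + 1) * digits_per_row]
        rows.append(format(int(chunk, 16), "0{}b".format(width)))
    return int(code_hex, 16), rows


def render(rows):
    """Draw the glyph as text: '#' for a set pixel, '.' for a clear one."""
    return "\n".join(r.replace("0", ".").replace("1", "#") for r in rows)


# Illustrative sample line: an 8x16 glyph for U+0041.
SAMPLE = "0041:0000000018242442427E424242420000"

code_point, rows = parse_hex_line(SAMPLE)
print(hex(code_point))
print(render(rows))
```

Because each glyph is just a fixed grid of pixels, there is nowhere in such a file to store reordering or ligature rules, which is why the outline-font route is described as the way toward complex-script rendering.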

GNU Unifont was designed with the intention of at least rendering something for any Unicode glyph in the Basic Multilingual Plane as opposed to rendering nothing at all if the necessary custom font wasn't available -- that is the tradeoff it makes. It is a starting point, and itself will probably evolve into something more in the future. --Ph9000 (talk) 17:45, 1 June 2008 (UTC)