Talk:Double-byte character set
This article is rated Start-class on Wikipedia's content assessment scale.
Nonexistent page
It's no help to redirect to a nonexistent page. — Preceding unsigned comment added by Doovinator (talk • contribs) 19:48, 23 March 2004 (UTC)
What code pages *do* support all the major languages in East Asia?
Since Unicode supports all the major languages in East Asia, unlike many other codepages, it is generally easier to enable and maintain software that uses Unicode.
Does this mean there are some other codepages that do? —Frungi 03:17, 11 July 2005 (UTC)
Character set / Encoding
I feel confused when I read that UTF-8 would be a character set while it is in fact a character encoding, a way to represent characters (code points) of Unicode planes. Is DBCS misnamed? Should it have been named "double-byte character encoding" instead, or does it really represent a set of symbols (characters)? Teuxe (talk) 18:16, 31 August 2010 (UTC)
- That depends on whether you're asking whether the people who coined the term "double-byte character set" should have called it a "double-byte character encoding" (I would say "yes, they should have, to make it clearer what they're talking about", although I don't know whether, at that time, the "character set" vs. "character encoding" distinction was being properly drawn) or whether the page should be named "double-byte character encoding" rather than "double-byte character set" (I'd say that, if DBCS is the common term, it shouldn't be).
- The page should note that it's an encoding; I've changed the first paragraph to use "character encoding" rather than "character set". Guy Harris (talk) 23:23, 25 January 2013 (UTC)
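To make the character set vs. character encoding distinction concrete, here is a minimal Python sketch. It is an illustration only, not something from the article; the sample character and the list of codecs are assumptions chosen for the example. It takes one code point from the Unicode character set and shows that different encodings map it to different byte sequences.

```python
# One abstract character: a single code point in the Unicode character set.
ch = "\u3042"  # HIRAGANA LETTER A (sample character chosen for illustration)

# Several encodings that map that same character to different byte sequences.
encodings = ["utf-8", "utf-16-be", "shift_jis", "euc_jp"]
encoded = {name: ch.encode(name) for name in encodings}

for name, data in encoded.items():
    print(f"{name:10} {data.hex()} ({len(data)} bytes)")
```

The character is the same in every row; only the byte representation changes, which is why "encoding" is arguably the more precise word for what DBCS describes.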
DBCS/MBCS in Windows
In Microsoft Windows, MBCS denotes encodings that use a mixture of 1 and 2 bytes per character. In C and C++ using Microsoft's "generic-text mapping" this is enabled via the macro _MBCS. The documentation states that MBCS is DBCS, so in Windows DBCS also refers to 1/2 byte encodings.
Ref: http://msdn.microsoft.com/library/default.asp?url=/library/en-us/intl/unicode_90c3.asp http://msdn.microsoft.com/library/default.asp?url=/library/en-us/vclib/html/_crt_using_generic.2d.text_mappings.asp
Perhaps get this into the main text?
Cheers,
- Alf
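The mixed 1-byte/2-byte behaviour described above can be observed with a short Python sketch. This uses Python's built-in `cp932` codec (Windows code page 932) rather than the C/C++ `_MBCS` mappings mentioned in the comment; the helper name `byte_lengths` and the sample string are hypothetical, chosen only for this illustration.

```python
def byte_lengths(text, codec="cp932"):
    """Return (character, encoded length in bytes) pairs for each character."""
    return [(ch, len(ch.encode(codec))) for ch in text]

# 'A' followed by two kanji (U+65E5, U+672C); sample text is an assumption.
sample = "A\u65e5\u672c"
lengths = byte_lengths(sample)
print(lengths)

# Total byte length differs from the character count in a mixed-width encoding.
print(len(sample), "characters,", len(sample.encode("cp932")), "bytes")
```

The ASCII letter takes one byte and each kanji takes two, so a three-character string occupies five bytes in this code page.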
Always in East Asia?
Why are almost all double-byte character sets from East Asia? --84.61.7.180 16:11, 3 June 2006 (UTC)
- Probably because most other cultures either use the Roman alphabet for writing, and thus mainly just need some accented versions of Roman-alphabet letters (thus requiring only 104 or so code points, so they can continue to use one byte), or use another small alphabet (thus also requiring only one byte); Chinese, Japanese, and Korean all use logograms or syllabaries, which require a lot more code points, thus requiring more than one byte. Guy Harris (talk) 23:28, 25 January 2013 (UTC)
DBCS on System i not terribly controversial
I work for a software company that builds software for the IBM System i (formerly AS/400 and iSeries). DBCS is certainly a complex topic but not one which I would describe as particularly controversial for users of this platform. Poorly understood and hard to comprehend, perhaps. Also, using the term DBCS-enabled with other IBM System i users would not be ambiguous. Most applications that run on the IBM System i today use DBCS rather than Unicode, as it is a rather late comer to this platform and has at least one major restriction on the System i platform that prevents its rapid adoption. That should be clarified. If DBCS is controversial and non-deterministic on other platforms, I would suggest a separate section to talk about DBCS on a per-platform basis. I'm new here so I did not want to go nuts editing this article without feedback or guidance.
Marty Acks 00:41, 17 July 2007 (UTC)
- Perhaps the article used to say DBCS on System i was controversial, but it no longer does so. Guy Harris (talk) 23:31, 25 January 2013 (UTC)
IBM DBCS
IBM supported a true two-byte DBCS encoding, based on EBCDIC, back in the 1990s. (For example, the code X'4040' was the DBCS encoding for a space character, corresponding to the single-byte EBCDIC X'40' character code, and to ASCII X'20' and Unicode U+0020.) IBM COBOL (VS II) supported it with the PIC G(n) picture clause specifier, where G presumably stood for a 16-bit "graphic" character, as well as the IS DBCS class condition expression. Based on some of the documents I have for it, this character set was intended mainly for Japanese/Asian applications. Here are some online references: 1, 2, 3, and 4. — Loadmaster (talk) 19:00, 27 November 2013 (UTC)
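The space-character correspondences mentioned above can be checked for the single-byte cases with a small Python sketch. Assumptions: Python's `cp037` codec is used as a stand-in for "single-byte EBCDIC" (the comment does not name a specific code page), and the two-byte X'4040' value is built by hand from the value stated in the comment, since the Python standard library has no IBM DBCS codec. The doubling is only a way to write that one value down, not a general rule for deriving DBCS codes.

```python
# Single-byte EBCDIC space; code page 037 is an assumed stand-in.
ebcdic_space = " ".encode("cp037")

# ASCII space and the Unicode code point for comparison.
ascii_space = " ".encode("ascii")
unicode_space = ord(" ")

# DBCS space X'4040' as stated in the comment above, constructed by hand
# (not produced by a codec).
dbcs_space = ebcdic_space * 2

print("EBCDIC :", ebcdic_space.hex().upper())
print("ASCII  :", ascii_space.hex().upper())
print("Unicode:", f"U+{unicode_space:04X}")
print("DBCS   :", dbcs_space.hex().upper())
```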