While we're on the subject of character sets
Where can I find information on the number of bytes required to encode characters in the following character sets? I can guess a lot of them, but I need a definitive answer.
ANSI_CHARSET
DEFAULT_CHARSET
SYMBOL_CHARSET
MAC_CHARSET
SHIFTJIS_CHARSET
HANGEUL_CHARSET
HANGUL_CHARSET
JOHAB_CHARSET
GB2312_CHARSET
CHINESEBIG5_CHARSET
GREEK_CHARSET
TURKISH_CHARSET
VIETNAMESE_CHARSET
HEBREW_CHARSET
ARABIC_CHARSET
BALTIC_CHARSET
RUSSIAN_CHARSET
THAI_CHARSET
EASTEUROPE_CHARSET
OEM_CHARSET
t.i.a
Monday, October 13, 2003
Each character set is controlled by a different standards body, so you're going to have a hard time finding just one definitive source.
Some places to start would be:
For Windows and Macintosh character sets, the Nadine Kano book: ISBN 1-55615-840-8
For Asian character sets, the Ken Lunde book: ISBN 1-56592-224-7
List of IANA-assigned charset names: http://www.iana.org/assignments/character-sets
The Letter Database: http://www.edi.ee/letter/
RFC 1345: http://www.ietf.org/rfc/rfc1345.txt
If there is a definitive resource I'd love to know about it. I sure wish the world was all-Unicode already!
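In the meantime, one rough way to check your guesses empirically is to encode every character with a codec for the matching code page and record the longest result. The sketch below is only that, a sketch. The charset-to-codec table is my own assumption based on the usual Windows code page pairings, the function name is made up for illustration, and it only tries characters in the Basic Multilingual Plane. DEFAULT_CHARSET, SYMBOL_CHARSET and OEM_CHARSET are left out because they do not correspond to one fixed code page.

```python
# Assumed pairing of Windows charset names with Python codec names.
# This table is an illustration, not an official mapping.
CHARSET_CODECS = {
    "ANSI_CHARSET": "cp1252",
    "MAC_CHARSET": "mac_roman",
    "SHIFTJIS_CHARSET": "cp932",
    "HANGUL_CHARSET": "cp949",
    "JOHAB_CHARSET": "johab",
    "GB2312_CHARSET": "cp936",
    "CHINESEBIG5_CHARSET": "cp950",
    "GREEK_CHARSET": "cp1253",
    "TURKISH_CHARSET": "cp1254",
    "VIETNAMESE_CHARSET": "cp1258",
    "HEBREW_CHARSET": "cp1255",
    "ARABIC_CHARSET": "cp1256",
    "BALTIC_CHARSET": "cp1257",
    "RUSSIAN_CHARSET": "cp1251",
    "THAI_CHARSET": "cp874",
    "EASTEUROPE_CHARSET": "cp1250",
}


def max_bytes_per_char(codec):
    """Return the longest encoding, in bytes, of any single BMP character."""
    longest = 0
    for code_point in range(0x10000):
        # Surrogates are not characters on their own, so skip them.
        if 0xD800 <= code_point <= 0xDFFF:
            continue
        # errors="ignore" yields b"" for characters the codec cannot represent.
        encoded = chr(code_point).encode(codec, errors="ignore")
        longest = max(longest, len(encoded))
    return longest


if __name__ == "__main__":
    for charset, codec in CHARSET_CODECS.items():
        print(f"{charset:22} {codec:10} {max_bytes_per_char(codec)}")
```

This measures what Python's codecs do, which may differ in small ways from what a given Windows version does, so treat the numbers as a sanity check rather than the definitive answer you asked for. On Windows itself, GetCPInfo reports a maximum character size for a code page in the MaxCharSize field of CPINFO.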
Nate Silva
Monday, October 13, 2003