Fog Creek Software
Discussion Board

Handling DBCS characters

I am writing a Windows application (dialog-box based). There are a couple of strings that the user has to enter, and these can be in any language. I need to read the strings from the dialog box and use them in my program. Should I use GetDlgItemText() and convert the strings to wide characters (use Unicode)? Also, how do I enter DBCS characters in the text box?

Monday, February 23, 2004

First of all, I assume you're running on Windows NT/2000/XP, NOT Windows 95/98/Me, because Win 9x does not really support Unicode without a lot of extra work.

Also, it sounds like you're using Win32 / C++ as opposed to some other development environment, but you didn't make that clear.

Finally, DBCS is NOT the same as Unicode -- DBCS (double-byte character set) is an old, pre-Unicode family of encodings for East Asian languages, in which each character takes one or two bytes. Since you said that the strings can be in "any" language I assume you want Unicode, not DBCS.

If these three assumptions are correct, just #define UNICODE (before you include windows.h), then use wchar_t throughout your code instead of char and put a capital L in front of any string literals:

const char *s = "hi";        /* before: narrow string */
const wchar_t *ws = L"hi";   /* after: wide (Unicode) string */

For built-in string functions like "strcpy" use the "wide" versions, in which you replace "str" with "wcs". So strcpy->wcscpy; strlen->wcslen, etc.
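
To make the str-to-wcs renaming concrete, here is a minimal sketch using only the standard C library. The helper name copy_wide is made up for this example and is not part of any API. Note that wcslen counts wchar_t elements, not bytes, so buffer sizes have to be expressed in elements too.

```c
#include <stddef.h>
#include <wchar.h>

/* Hypothetical helper: copies src into dst, where capacity is the size
   of dst in wchar_t elements (including room for the terminator).
   Returns the copied length in wchar_t elements, or (size_t)-1 if the
   string does not fit. */
size_t copy_wide(wchar_t *dst, size_t capacity, const wchar_t *src)
{
    size_t len = wcslen(src);       /* wide counterpart of strlen */
    if (len + 1 > capacity)
        return (size_t)-1;          /* not enough room, copy nothing */
    wcscpy(dst, src);               /* wide counterpart of strcpy */
    return len;
}
```

If you need the size in bytes (for a memory allocation, say), multiply the element count by sizeof(wchar_t) rather than assuming one byte per character.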

You can still call the same Windows APIs. With UNICODE defined, the headers map each generic name to its Unicode ("W") version automatically, so GetDlgItemText, for example, becomes GetDlgItemTextW and fills a wchar_t buffer.
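
The mapping itself is nothing more than a preprocessor switch. The sketch below imitates it with two made-up functions, TextLengthA and TextLengthW, standing in for a real pair such as GetDlgItemTextA / GetDlgItemTextW. It is an illustration of the mechanism under that assumption, not a copy of the actual Windows headers.

```c
#define UNICODE   /* must appear before the code that tests it */

#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical narrow and wide versions of the same function. */
size_t TextLengthA(const char *s)    { return strlen(s); }
size_t TextLengthW(const wchar_t *s) { return wcslen(s); }

/* The header-style switch: one generic name, two possible targets. */
#ifdef UNICODE
#define TextLength TextLengthW
#else
#define TextLength TextLengthA
#endif

size_t greeting_length(void)
{
    /* With UNICODE defined this expands to TextLengthW(L"hello"). */
    return TextLength(L"hello");
}
```

In a real dialog procedure the same thing happens when you call GetDlgItemText with a wchar_t buffer: with UNICODE defined, the call compiles to GetDlgItemTextW, and the buffer size you pass is a count of characters, not bytes.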

Joel Spolsky
Fog Creek Software
Monday, February 23, 2004
