Fog Creek Software
Discussion Board




Internationalization / Multiple languages

Hello,
In Joel's article about cleaning up FogBugz, he mentioned separating the interface text into a language file. This is a programming 101 question, but how is this done? What's in the file? Is it a big array? How does one refer to this file for interface elements or error messages? I'm interested in concepts that aren't tied to a specific programming language. Are there any articles or books on this topic?
Thanks,
Phillip

Phillip Harrington
Monday, February 25, 2002

The GNU gettext package does this; you can read about it here:

http://www.gnu.org/software/gettext/gettext.html

Basically, you have a message catalog file that contains translations of *every* potential output text that your program uses into the languages you want to support.

Then, in your program, you retrieve the appropriate message and print it out.

So instead of

print "Hello world.";

you end up with something like

print gettext("Hello world.");

where the gettext() function looks in the message catalog for the key "Hello world." and a value that matches the current locale.
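
As a rough illustration of that lookup (not gettext's actual implementation, which reads compiled catalog files), here is a minimal Python sketch. The CATALOG dictionary, set_locale() and translate() are hypothetical names made up for this example. The source string doubles as the key, and a missing translation falls back to the original text.

```python
# Hypothetical in-memory message catalog: locale -> {source text -> translation}.
# Real gettext keeps catalogs in separate files; this dict only mimics the lookup.
CATALOG = {
    "es": {"Hello world.": "Hola mundo."},
    "fr": {"Hello world.": "Bonjour le monde."},
}

current_locale = "en"  # assumed module-level setting, for illustration only


def set_locale(code):
    """Select the locale used by subsequent lookups."""
    global current_locale
    current_locale = code


def translate(message):
    """Return the translation for the current locale, or the original text if none exists."""
    return CATALOG.get(current_locale, {}).get(message, message)


set_locale("es")
print(translate("Hello world."))
```

Falling back to the key means an untranslated string still shows up in the source language instead of breaking the page.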

David Sklar
Monday, February 25, 2002

You also need to make sure that you format times, dates and currency according to the current locale. There are functions that will format these for you.
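
To make the point concrete, here is a small Python sketch of locale-dependent formatting. The CONVENTIONS table and the three helper functions are hypothetical and hand-rolled so the example runs anywhere. In a real program you would normally call the locale functions your platform or library already provides rather than maintain a table like this.

```python
from datetime import date

# Hypothetical per-locale conventions, hard-coded for illustration only.
CONVENTIONS = {
    "en_US": {"date": "%m/%d/%Y", "decimal": ".", "group": ",", "money": "${amount}"},
    "de_DE": {"date": "%d.%m.%Y", "decimal": ",", "group": ".", "money": "{amount} EUR"},
}


def format_date(d, loc):
    """Format a date using the locale's day/month/year ordering."""
    return d.strftime(CONVENTIONS[loc]["date"])


def format_number(value, loc):
    """Format a number with two decimals and the locale's separators."""
    conv = CONVENTIONS[loc]
    text = f"{value:,.2f}"  # always "1,234.50" style at this point
    # Swap separators through a placeholder so the two replacements don't collide.
    return text.replace(",", "\0").replace(".", conv["decimal"]).replace("\0", conv["group"])


def format_money(value, loc):
    """Place the currency marker where the locale expects it."""
    return CONVENTIONS[loc]["money"].format(amount=format_number(value, loc))


when = date(2002, 2, 25)
for loc in ("en_US", "de_DE"):
    print(loc, format_date(when, loc), format_money(1234.5, loc))
```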

Matthew Lock
Monday, February 25, 2002

On top of the basic "put strings, button text, etc. into a resource file", you also have to realize that not all languages read left to right. Getting all strings into a resource file is a good start. Eventually, when you move on to languages such as Arabic, Hebrew or the East Asian languages, you'll have to start internationalizing your layout as well, since some scripts read right-to-left and some can be set top-to-bottom.
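
For a web page, one small piece of this can be sketched as choosing the text direction from the language code. The Python below is only an illustration: RTL_LANGUAGES is a deliberately short, hypothetical list, and text_direction() and html_open_tag() are made-up helper names. The only real API it relies on is the HTML "dir" attribute.

```python
# A few languages written right-to-left; not an exhaustive list.
RTL_LANGUAGES = {"ar", "he", "fa", "ur"}


def text_direction(lang_code):
    """Return "rtl" or "ltr" based on the primary language subtag."""
    primary = lang_code.replace("_", "-").split("-")[0].lower()
    return "rtl" if primary in RTL_LANGUAGES else "ltr"


def html_open_tag(lang_code):
    """Build an opening <html> tag that declares language and direction."""
    return f'<html lang="{lang_code}" dir="{text_direction(lang_code)}">'


print(html_open_tag("he"))
print(html_open_tag("en-US"))
```

Setting the direction is only the start; mirrored toolbars, alignment and similar layout issues still have to be handled in the design itself.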

-james

James Wann
Monday, February 25, 2002

There are different approaches depending on what kind of app you are building. 

If it is an executable then almost everyone uses resources.  At Juno we put the resources into different DLLs (one for English, one for Spanish, etc.) and then loaded the corresponding DLL.

For web apps (or just web pages), it's a little trickier.
Probably the easiest thing to do is either to place all your strings as constants in different files (lang-es.asp, lang-en.asp, etc.) or to build a little language manager along the lines of the gettext approach the previous poster mentioned.  The language manager class could create multiple string arrays, one for each language, and then you would just ask it for a specific string using a constant, e.g. Response.Write oLangMgr.GetString(S_HELLOWORLD)
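
A language manager along those lines might look roughly like the Python sketch below. The class name, the S_SAVE constant and the sample strings are all hypothetical, and falling back to English for an unknown language is an assumption of this example, not something described above.

```python
# Hypothetical string IDs, used as indexes into each language's string array.
S_HELLOWORLD = 0
S_SAVE = 1


class LangManager:
    """Holds one string array per language and hands out strings by constant."""

    STRINGS = {
        "en": ["Hello world.", "Save"],
        "es": ["Hola mundo.", "Guardar"],
    }

    def __init__(self, lang, default="en"):
        # Unknown languages fall back to the default language's array.
        self._strings = self.STRINGS.get(lang, self.STRINGS[default])

    def get_string(self, string_id):
        """Return the string for the given constant in the selected language."""
        return self._strings[string_id]


lang_mgr = LangManager("es")
print(lang_mgr.get_string(S_HELLOWORLD))
```

Compared with the gettext style, constants give you a compile-time (or at least grep-able) name for every string, at the cost of maintaining the list of IDs by hand.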

Michael H. Pryor
Tuesday, February 26, 2002

Michael H. Pryor wrote:
"For web apps (or just web pages), it's a little trickier."

Not if one uses a real scripting language like PHP...
[ducking and running for cover]

PHP supports gettext right out of the box:
http://www.php.net/manual/en/ref.gettext.php
http://www.php-er.com/chapters/Gettext_Functions.html

Jan Derk
Tuesday, February 26, 2002

For web applications or web pages that are dynamically generated (such as using ASP), another choice is XML files for localized strings.

There is an article on this subject at:
http://msdn.microsoft.com/library/en-us/dnexxml/html/xml09182000.asp
or
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnexxml/html/xml09182000.asp

This article was written in September 2000, and the URLs in the article either no longer work or link to files that are protected, which makes it a little more difficult to follow.

However, there is an example of an XML file containing localized strings, designed for the above purpose, at:

http://msdn.microsoft.com/msdn-files/026/000/210/ASP/Include%20Files/DE/DCLocStrings_xml.asp
(Note that this URL may not work in Netscape because it contains a space character.)

This type of XML file could be used in an ASP page by, for example, loading the file into an XML DOM object and then calling the DOM "selectSingleNode" method, such as:
' Create the DOM object and load the strings file synchronously
Set objXML = Server.CreateObject("Microsoft.XMLDOM")
objXML.async = False
objXML.load(FILE-SPEC)
' Look up one localized string by element name
Response.Write objXML.selectSingleNode("PageTitle").text

although there are probably more efficient ways to use the XML strings, such as using XSLT to generate HTML or XHTML. It may also be reasonable to cache the localized output of the transformation in some way.
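
As one possible take on the caching idea, the Python sketch below parses the strings file once into a dictionary instead of querying the DOM for every string. The XML layout shown is an assumption made for this example (the real structure of the linked file is not reproduced in this thread), and load_strings() is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Assumed layout: one child element per localized string, named after its ID.
STRINGS_XML = """
<Strings>
  <PageTitle>Willkommen</PageTitle>
  <SubmitButton>Senden</SubmitButton>
</Strings>
"""


def load_strings(xml_text):
    """Parse the XML once and return {element name: localized text}."""
    root = ET.fromstring(xml_text.strip())
    return {child.tag: (child.text or "") for child in root}


strings = load_strings(STRINGS_XML)  # could be cached per language
print(strings["PageTitle"])
```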

Philip Dickerson
Wednesday, February 27, 2002

gettext() is cool, except that it is really a Unix thing. Other solutions require some management of the strings in your application, which displeased me because I don't want to have a whole system to set up each time I add a new string.

I have found that Qt (www.trolltech.com) provides exactly what I want. Just like with gettext(), you don't really have to care about strings when you code (just put them inside a tr()). When you start to care, there is a great tool (Qt Linguist) that helps you translate them, check key bindings (often very tricky), find common translations, ... Qt Linguist is based on the experience of KDE, which is translated into 45 languages, including right-to-left and Asian ones.

What's so great is that Qt Linguist is perfectly accessible to non-programmers, and the same executable can easily switch languages without any need for recompiling.

Okay, I admit I love Qt programming, but they really have put together a very nice library that runs on Unix, Windows, Mac and even embedded devices.

Philipe Fremy
Friday, March 01, 2002
