Tutorial: Internationalization
I have not personally worked on international applications so my comments are theoretical and not drawn from personal experience. This article also contains reader-contributed tips and tricks for building international programs.

 

See also the tutorial International Character Sets.

Working With Strings

One of the first things you should do when building an international application is to separate the user interface's string values from the application code. The program must be able to change the strings displayed on labels, menus, buttons, and other user interface elements depending on the language selected.

The program must also be able to change accelerators, shortcuts, pictures, icons, and other graphics for different languages.

There are several issues to keep in mind when you use strings like this. First, depending on the languages you will support, you may need to use Unicode strings. You will need to handle these strings using only functions that can handle 2-byte wide Unicode characters.

Second, the same things can take longer to say in one language than in another. For example, a tiny "Ok" button in English may not be big enough to hold "Aceptar" in Spanish. You need to allow room for the longest value you will display. In the end, you will need to test the program in each of the languages it uses to verify that everything fits in all languages.

Third, you cannot use these string values in the code. For example, when the user clicks on a CheckBox, you cannot determine which box it is by testing its caption.


Loading String Values

There are several places you can store the strings an application will use. Resource files provide the greatest flexibility without too much more effort than the other methods, though each has its strengths and weaknesses and the others may be useful for very small projects.


Registry

You can store the strings using a different key section for each language. If the string variable g_Language stores the language you are using, the program can use code like this to get a label's value:
    lblName.Caption = GetSetting(App.ProductName, _
        g_Language, "lblName", "Name")

This method is easy to use. Unfortunately it requires that you store the information in the registry. You can use the RegEdit program to export the part of the registry containing your strings and then "execute" the .reg file to install the values on each computer, but that's a relatively complex process. The values must also be installed and maintained on each computer separately. Other strategies allow you to maintain these values for networked computers in a single location.


Text File

You can store the strings in a text file. For example, the file might look like this:

    English.YesButton:Yes
    English.NoButton:No
        :
    French.YesButton:Oui
    French.NoButton:Non
        :

The program can read this file searching for lines that begin with the name of the language it is using. It can place the values in a g_Strings collection using each string's name ("YesButton" and "NoButton" in the previous example) as the item's key in the collection. Then the program can assign a string value like this:

    cmdYes.Caption = g_Strings("YesButton")

This method is very easy to understand. It takes a little work to load the string values into the collection, but that's not too hard.
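
The loading step might be sketched as follows. This is a minimal example, assuming the file uses exactly the "Language.Name:Value" layout shown above; the routine name LoadStrings and its parameters are hypothetical, and duplicate names within one language would raise an error when added to the collection.

```vb
' Holds the strings for the current language, keyed by string name.
Public g_Strings As Collection

' Read every "Language.Name:Value" line for the given language
' from the file and store the values in g_Strings.
Public Sub LoadStrings(ByVal file_name As String, ByVal language As String)
    Dim fnum As Integer
    Dim text_line As String
    Dim prefix As String
    Dim pos As Long

    Set g_Strings = New Collection
    prefix = language & "."

    fnum = FreeFile
    Open file_name For Input As #fnum
    Do While Not EOF(fnum)
        Line Input #fnum, text_line
        text_line = Trim$(text_line)

        ' Only consider lines for the selected language.
        If Left$(text_line, Len(prefix)) = prefix Then
            pos = InStr(text_line, ":")
            If pos > Len(prefix) + 1 Then
                ' The key lies between the prefix and the colon;
                ' the value is everything after the colon.
                g_Strings.Add Mid$(text_line, pos + 1), _
                    Mid$(text_line, Len(prefix) + 1, pos - Len(prefix) - 1)
            End If
        End If
    Loop
    Close #fnum
End Sub
```

A form could call LoadStrings once at startup, for example with g_Language as the second argument, and then assign captions from g_Strings as shown earlier.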

If the text file is in a directory shared on a network, many computers can use it at the same time. To change the strings, you only need to change this single file and all of the computers are up to date.

This method has the disadvantage that it can only store text. It cannot hold pictures, icons, and other graphics that must change from language to language.


Database

A database offers a bit more security than a text file. You can password protect a database and some databases let you grant access to parts of the database for specific users. For example, a user who cannot access the program's supervisor screens does not need access to the strings displayed there.

A program can use a query to select the strings it needs from a database as in:

    SELECT * FROM Strings WHERE Language = 'English'

The program can then store the strings in a collection as described in the text file section. If the database stores the exact names of the controls, the program can find the control in the Controls collection and assign its value. Click here to download an example program that loads strings from a database.

This method has most of the same strengths and weaknesses as a text file. The database can be shared so you can use one copy for several computers and it works best for text data. You could store images in the database, but that's more difficult.
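
Here is a rough sketch of the database approach using late-bound ADO. It assumes a Jet (.mdb) database whose Strings table has Language, ControlName, and Caption columns; only the table name and the Language column come from the query above, and the other column names and both routine names are made up for illustration. It also assumes the columns contain no Null values.

```vb
' Build the query for one language. Single quotes are doubled
' so the language name cannot break the SQL string.
Public Function BuildStringQuery(ByVal language As String) As String
    BuildStringQuery = _
        "SELECT ControlName, Caption FROM Strings " & _
        "WHERE Language = '" & Replace(language, "'", "''") & "'"
End Function

' Assign each stored caption to the control with the matching name.
Public Sub LoadCaptionsFromDatabase(ByVal frm As Form, _
    ByVal db_name As String, ByVal language As String)
    Dim conn As Object
    Dim rs As Object
    Dim ctl_name As String
    Dim ctl_text As String

    Set conn = CreateObject("ADODB.Connection")
    conn.Open "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" & db_name
    Set rs = conn.Execute(BuildStringQuery(language))

    Do Until rs.EOF
        ctl_name = rs.Fields("ControlName").Value
        ctl_text = rs.Fields("Caption").Value

        ' Ignore rows that name a control this form does not have
        ' or a control without a Caption property.
        On Error Resume Next
        frm.Controls(ctl_name).Caption = ctl_text
        On Error GoTo 0

        rs.MoveNext
    Loop

    rs.Close
    conn.Close
End Sub
```

A form might call LoadCaptionsFromDatabase Me, db_name, "English" from its Form_Load event handler.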


Resource File

You can store the strings in a resource file. The LoadResString function takes a resource ID number as a parameter and fetches the corresponding string from a resource file. The program should use constants or enumerated values to define the IDs.

    Public Enum StringIDs
        resCaption = 101
        resCmdYes = 102
           :
    End Enum
        :
    ' Set the form's caption.
    Caption = LoadResString(resCaption)
        :

The resource file can contain more than one string table for use with different languages. When it runs, the application automatically selects the right table based on the system's LCID. If the LCID doesn't match a table, the application uses the first table.

Click here to download a Visual Basic example that uses a resource file with multiple string tables.


Another strategy is to give each language its own block of string IDs, with the blocks spaced 1000 apart. Define the base string IDs once, then add the appropriate language offset to get the ID for a particular language. For example, you might let the English offset be 0, the French offset be 1000, and the German offset be 2000.

    Public Enum LanguageOffsets
        langEnglish = 0
        langFrench = 1000
        langGerman = 2000
            :
    End Enum

    Public Enum LabelCaptions
        capOkButton = 1
        capCancelButton = 2
            :
    End Enum

Now suppose the Long variable g_Language contains the language offset. Then you could assign a button's caption like this:

    cmdOk.Caption = LoadResString(g_Language + capOkButton)

This variation has the advantage that the program can pick its language explicitly so you can change the language at run time for testing. Resource IDs must be between 1 and 32,767 so you may need to do some planning to avoid running out of IDs.

Click here to download a Visual Basic example that uses a resource file with one string table.


A third variation uses a separate resource file for each language. Because the resource file is compiled into the executable, this variation requires you to distribute a different executable for each language.

Resource files have the big advantage that they can easily hold strings or pictures. They have the disadvantage that they are compiled into the executable so changes to the resource file require that you recompile and redistribute the program.


XML

Enrico Strydom takes this approach. The basic idea is to store information that you would need to change for different languages in XML files. At runtime, you load the appropriate file and locate the correct information for each control. This approach doesn't require you to use resource files or other compiled resources. You only need to use simple text files in XML format.

For information on using XML in Visual Basic, see my book Visual Basic .NET and XML.

Here are Enrico's instructions:

I wrote an add-in that runs through all of the controls on a form and generates XML like so:

<frmChangeReason Caption='Change reason'>
<txtCRS Text='' ToolTipText=''/>
<cmdOK Caption='OK' ToolTipText=''/>
<cmdCancel Caption='Cancel' ToolTipText=''/>
<Label2 Caption='Change reason'
  ToolTipText='The reason for this change'/>
<lvwABA>
<Item Index='1' Text='Bookmark'/>
<Item Index='2' Text='ABA unique id'/>
<Item Index='3' Text='Action'/>
<Item Index='4' Text='Change reason'/>
</lvwABA>
</frmChangeReason> 
<frmChangeReasonDate
  Caption='Change reason and date'>
<txtCRS Text='' ToolTipText=''/>
<cmdCancel Caption='Cancel' ToolTipText=''/>
<cmdOK Caption='OK' ToolTipText=''/>
<dtpSCD ToolTipText=
  'The FSV date when this change will become effective'/>
<lblDataPassing Caption='' ToolTipText=''/>
<Label2 Index='0' Caption='Effective FSVD' ToolTipText=
  'The FSV Date when this change will become effective'/>
<Label2 Index='1' Caption='Change reason'
  ToolTipText='The reason for this change'/>
</frmChangeReasonDate>

Now we have a class that processes the XML. The easiest approach is to load it from a file where the LCID forms part of the filename, i.e. LANG409.xml, LANG40C.xml, and to load this file in the class's Initialize event.

Alternative is to have a resource file where the LCID is the resource ID and the "complete" XML is the resource. Only way I could find to make the RES file is to make a "C" style .RC file with lines like so:

    "409" RCDATA DISCARDABLE "_LANG409.xml"
    "40A" RCDATA DISCARDABLE "_LANG40A.xml" 
and compile the RES file manually before building the VB project (DOS box, rc -r PROJECT1.rc). Remember to convert to Unicode when loading the resource.

    sXML = StrConv(LoadResData("""" & sLocaleID & """", 10), vbUnicode)

(for some reason the MS people do not recompile the resource if you create it using their resource editor ...)

The class has a method "FormTextFromResource" that gets a form object as a parameter (each form calls this method in the Form_Load event, passing "Me" as the parameter).

FormTextFromResource now has everything it needs to do magic, here's how we do it ...

  1. Go find the resources for this form in the "complete" XML resource
        Set xFormNode = mvarDOMDOCForms.selectSingleNode("//" & frm.Name)
        If xFormNode Is Nothing Then
            MsgBox "No string resources found for form " & frm.Name
            Exit Function
        End If 
        frm.Caption = _
            xFormNode.Attributes.getNamedItem("Caption").nodeTypedValue
    
  2. For each control on the form ...
        ' The Index test relies on trapping error 343, so
        ' On Error Resume Next must be in effect here.
        On Error Resume Next
        For Each cControl In frm.Controls
            sCtlType = TypeName(cControl)
            Err.Clear
            sControlIndex = cControl.Index
            If Err.Number = 343 Then
                ' Control is not part of a control-array
                Set xNode = xFormNode.selectSingleNode(cControl.Name)
                If xNode Is Nothing Then _
                    Debug.Print "Lang: no resource found for control "; _
                    cControl.Name & " on form " & frm.Name
            Else
                ' This one is part of a control array
                ' Find the correct node
                Set xNode = xFormNode.selectSingleNode( _
                    cControl.Name & "[@Index='" & cControl.Index & "']")
                If xNode Is Nothing Then _
                    Debug.Print "Lang: no resource found for control "; _
                    cControl.Name & ", index=" & cControl.Index & _
                    " on form " & frm.Name
            End If

            Select Case sCtlType
                Case "TextBox", "usrTextBox"
                    cControl.Text = xNode.Attributes. _
                        getNamedItem("Text").nodeTypedValue
                    cControl.ToolTipText = xNode.Attributes. _
                        getNamedItem("ToolTipText").nodeTypedValue
                Case "Label", "CheckBox", "OptionButton", "CommandButton"
                    cControl.Caption = xNode.Attributes. _
                        getNamedItem("Caption").nodeTypedValue
                    cControl.ToolTipText = xNode.Attributes. _
                        getNamedItem("ToolTipText").nodeTypedValue
                Case "Frame"
                    cControl.Caption = xNode.Attributes. _
                        getNamedItem("Caption").nodeTypedValue
                Case "ListView"
                    ' The current node has sub-elements with
                    ' attributes for each ColumnHeader
                    Set xNodeList = xNode.selectNodes("Item")
                    xNodeList.Reset
                    Set xNode = xNodeList.nextNode
                    While Not xNode Is Nothing
                        idx = xNode.Attributes. _
                            getNamedItem("Index").nodeTypedValue
                        cControl.ColumnHeaders(idx).Text = _
                            xNode.Attributes.getNamedItem("Text").nodeTypedValue
                        Set xNode = xNodeList.nextNode
                    Wend
    ... and so on ...

Nice thing here is that you can have as many resource strings per control as you need (Caption/Tooltiptext etc), even column headers for a listview control! The "old" way of doing it with the resource ID in the TAG property limits you to a single resource per control, no way of also storing a tooltip.

[Note: You can store multiple values in a Tag property using a delimited string such as "Caption=Click Me;ToolTipText=Click Here To Launch". -- Rod]
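
The delimited Tag idea in the note above could be sketched like this. GetTagValue is a hypothetical helper, and the sketch assumes that neither names nor values contain a semicolon.

```vb
' Return the value stored under value_name in a string such as
' "Caption=Click Me;ToolTipText=Click Here To Launch".
' Returns an empty string if the name is not present.
Public Function GetTagValue(ByVal tag_text As String, _
    ByVal value_name As String) As String
    Dim pairs() As String
    Dim i As Long
    Dim pos As Long

    If Len(tag_text) = 0 Then Exit Function

    pairs = Split(tag_text, ";")
    For i = LBound(pairs) To UBound(pairs)
        pos = InStr(pairs(i), "=")
        If pos > 1 Then
            ' Compare the part before "=" without regard to case.
            If StrComp(Trim$(Left$(pairs(i), pos - 1)), value_name, _
                vbTextCompare) = 0 Then
                GetTagValue = Mid$(pairs(i), pos + 1)
                Exit Function
            End If
        End If
    Next i
End Function
```

A form could then use statements such as cmdLaunch.ToolTipText = GetTagValue(cmdLaunch.Tag, "ToolTipText"), where cmdLaunch is an illustrative control name.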

The XML file also contains an element "GeneralStrings" (for use in MsgBox etc.).

    <GeneralStrings>
        <GENERAL_RES_VERSION String="Version: %s Build %s"/>
        <GENERAL_RES_ENVIRONMENT String="Environment: %s"/>
        <MQ_ERR_QUEUE_OPEN String=
            "Error opening queue %s for input %s"/>
        <MQ_ERR_QUEUE_READ String=
            "Error reading message from queue %s %s"/>
        <GENERAL_UNEXPECTED_RESPONSE String=
            "Unexpected response received: %s"/>
        <GENERAL_RESPONSE_TIMEOUT String=
            "No response received in the allocated time."/>
        ... more ... 
    </GeneralStrings>

And we have a method in the class, "GetStringResource", with additional parameters. The method has some fancy logic to search for "%s" and use the additional parameters to replace each one (some languages have a different word order).

Public Function GetStringResource(sStringID As String, _
  ParamArray vReplacementValues() As Variant) As String 
    ... code not displayed ...

    Set xNode = mvarDOMDOCStrings.selectSingleNode( _
        "//GeneralStrings/" & sStringID)
    If xNode Is Nothing Then
        MsgBox "String resource not found (" & sStringID & _
            "), resource file : " & mvarResourceFile
        GetStringResource = ""
        Exit Function
    End If
    On Error GoTo ERR_NO_STRING
    sResString = xNode.Attributes.getNamedItem("String").nodeTypedValue

    ... some code to replace the %s ...
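
The replacement code is not shown, so the following is only a guess at the simplest version, not Enrico's implementation. The hypothetical FillPlaceholders function fills each "%s" in order with the next value. Because it is purely positional, the translated string must keep the placeholders in the same order; a scheme with numbered placeholders would be needed to reorder them.

```vb
' Replace successive "%s" markers in template with the given values.
' Extra markers are left alone and extra values are ignored.
Public Function FillPlaceholders(ByVal template As String, _
    ParamArray values() As Variant) As String
    Dim result As String
    Dim pos As Long
    Dim i As Long

    result = template
    pos = 1
    For i = LBound(values) To UBound(values)
        pos = InStr(pos, result, "%s")
        If pos = 0 Then Exit For

        result = Left$(result, pos - 1) & CStr(values(i)) & _
            Mid$(result, pos + 2)

        ' Continue after the inserted text so a value that itself
        ' contains "%s" is not expanded again.
        pos = pos + Len(CStr(values(i)))
    Next i

    FillPlaceholders = result
End Function
```

For example, FillPlaceholders("Environment: %s", "Test") returns "Environment: Test".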

The add-in code is basically the reverse of the code in "FormTextFromResource": you build the form using the IDE (the normal way), then run the add-in, which generates a block of XML with the form name, controls, etc. and the values for all the attributes that you have localized strings for. It includes some logic to search the XML file for the block and replace it if it already exists, or add it if not.


Books

Here are a few books you may find useful. If you have used any of them and want to write a review, please let me know.
  • Internationalization With Visual Basic by Michael S. Kaplan, $49.99, 650 pages, paperback. A mix of excellent and terrible reviews at Amazon. Read the publisher's information and reviews carefully. Judging by the reviews, you will either love or hate this book. [Amazon] [Amazon UK]
  • Programming for the World : A Guide to Internationalization by Sandra Martin O'Donnell, $33.00, 440 pages, paperback. Slightly mixed but generally good reviews at Amazon. [Amazon] [Amazon UK]


Visitor Tips

Martin Connelly:

Here is a response I wrote on this to an ASP newsgroup. This is also a hot topic on some XML newsgroups, like www.vbxml.com.

Q: I'm going to need to develop ASP pages that use MS Access and will require interfacing with various foreign languages. Can anyone point me in the direction to find out how to do this? I will need to support Arabic, Cambodian, Chinese, Korean, Laotian, plus the European languages.

A: You will need Access 2000; Access 97 doesn't have true Unicode support. You may need the algorithm for converting characters from UTF-8 to Unicode and vice versa. There are all sorts of gotchas doing this, e.g. German text strings tend to be 30-40% longer than English.

Have a look at Michael Kaplan's website; it handles multiple languages. There are tips and tools there for handling multiple languages, even Chinese (Big Five) and bidirectional languages like Hebrew and Arabic. He is writing a book on internationalization and does consulting work for Microsoft. I don't know the publication date.

Also, here's some other reading: Good Luck.

Review the Session.CodePage setting in the gate.asp file. This should be located on line 5 of the file. If you are using a non-Western language, you will probably need to change this setting. To find the correct settings, see Appendix F in Developing International Software for Windows 95 and Windows NT: A Handbook for International Software Design by Nadine Kano, Microsoft Press, 1995. [Out of print] This book is available on the Web here.

Also see this article: Creating a new locale.


Ruth Nisenbaum:

I'm writing from Israel where we have a lot of internationalization problems (besides all the rest that you might have heard).

The last problem I had was the following.

I develop with Win2000. In Hebrew, all the menus and combo boxes should be from right to left. On my computer everything looked OK, but when I installed on the user's computer, the screens looked really bad, with all the combo boxes and menus in the opposite direction.

The problem was solved by copying back the original DLLs from Win98: oleaut32.dll, ole32.dll, vbame.dll (special VBA Middle East DLL).

Of course I had to get out of windows to be able to copy the dll's in the windows\system directory.

We had another very strange problem with Crystal Reports 8 Web Server. We developed a label printing program. It worked fine with CR7, but when using the OCX for CR8 it reversed all the strings. The problem is that to solve the problem we have to unregister the CR8 component from all the users. We contacted Seagate but they didn't understand what we were talking about and we couldn't get any help.


Trevor Finch:

We have internationalised, but still find bugs and glitches months later.

The text for the user interface is saved in a text file (and not a resource file), with an ID number for each text item that is also in the TAG of each control and menu and is refreshed at runtime. The 'ENGLISH' text file and tag numbers are generated automatically by a utility programme that reads the VBP file.

This also worked (amazingly - this is VB4) with 2-byte Chinese text. There were some problems with using API calls to TextOut - it seemed to get confused by the length of the string.

We manage reports, graphic output, grid headings etc by putting report definition files in language sub-directories e.g. C:\ThisApp\ENGLISH, C:\ThisApp\SPANISH. These files configure the grids etc at runtime.

The user has to set this directory to change the report and configuration files.

An advantage is that the user can now do much more customising - not just the cosmetic text but also, for example, changing the complete layout of a grid.

I have no experience with elaborate icons and symbols, but we have avoided clever things like symbols in menus (mainly because I don't know how)


An additional problem that we had to handle was 'units' - metric, US, whatever

For example the metric world measures water volumes in litres and Kl and Ml - Americans use gallons, MGallons, AcreFt, anything.

Our solution was to do all calculations internally in metric, but allow entry and display in user selected units.
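
The store-in-metric, display-in-user-units idea might look something like this sketch. The enum and function names are hypothetical and are not Trevor's class; the conversion factors are the standard US gallon (3.785411784 litres) and acre-foot (1,233,481.83754752 litres).

```vb
Public Enum VolumeUnits
    volLitres = 0
    volUSGallons = 1
    volAcreFeet = 2
End Enum

' Number of litres in one of the given display units.
Public Function LitresPerUnit(ByVal units As VolumeUnits) As Double
    Select Case units
        Case volUSGallons
            LitresPerUnit = 3.785411784
        Case volAcreFeet
            LitresPerUnit = 1233481.83754752
        Case Else
            LitresPerUnit = 1#
    End Select
End Function

' Convert an internal value in litres to the user's display units.
Public Function LitresToDisplay(ByVal litres As Double, _
    ByVal units As VolumeUnits) As Double
    LitresToDisplay = litres / LitresPerUnit(units)
End Function

' Convert a value entered in the user's units back to litres.
Public Function DisplayToLitres(ByVal amount As Double, _
    ByVal units As VolumeUnits) As Double
    DisplayToLitres = amount * LitresPerUnit(units)
End Function
```

Keeping the full-precision metric value internally and converting only for display and entry limits the accumulated error when the user switches units back and forth.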

In the VB development environment, Microsoft allows the user to enter dimensions and control positions in user-selected units, and VB displays in these units, but presumably it saves internally in a consistent unit (twips ?)

We have a Class which is attached to each text box, and can also be attached to label captions and used by Picture.Print. The class can take Microsoft format codes ('#00.0'), and also special codes like 'VOL' to display in user-selected volume units.

It is an extension to what Microsoft have done with Access text boxes.

There can sometimes be 'rounding' problems if the user changes display/entry units


It is at the end of this mess that you really appreciate the work and thought that has gone into Microsoft tools - but then they have rooms of programmers and bookshelves of committees to get it right.
Whether it is worthwhile or not - try doing a demonstration of your own software running in a language you don't understand. I was totally lost...

It really makes you realise what it must be like to use software that is not in your language. [Also if you look at the Spanish version and see English text, you know you missed something. -- Rod]

We can now supply the Spanish version to US agriculture because the majority of their users are more comfortable in Spanish than English.


Jon Windle:

I was part of a project group that did this once, in C but the principles are probably the same. Essentially the way we did it was

  • we had a class that returned the message string to be displayed, given a predefined constant. These constants were a huge enum declaration.
  • this class also returned the number of languages supported, and the strings used to identify the language.
  • the strings were stored in a text file, a different file for each language supported.
  • when the class was initialised the language to be used was given, again using a predefined enum, and the appropriate file read.

All the messages for a given language were stored in the same file.

So to add new languages we simply translated the message file and added a new language constant. These language constants were stored in a separate file so that we could add languages without having to recompile the code. We had names like msg01.txt & msg02.txt so we could build a language file name from the value of the constant used to specify the language.
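
In Visual Basic, building the file name from the language constant could be as simple as the following sketch. LanguageFileName is a hypothetical name, and the two-digit pattern is inferred from the msg01.txt and msg02.txt examples.

```vb
' Build a message file name such as "msg01.txt" from a
' numeric language constant.
Public Function LanguageFileName(ByVal language_id As Long) As String
    LanguageFileName = "msg" & Format$(language_id, "00") & ".txt"
End Function
```

The program could then open App.Path & "\" & LanguageFileName(language_id), assuming the message files are installed beside the executable.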

The biggest headache was making sure we added the message strings in the file so that they corresponded to the predefined constants; especially when adding these messages and constants in the wee small hours.


Peter Chamberlin:

Good to see there was a lot of feedback about internationalisation in VB. I had actually done some work researching this during the week and came across a program at www.flashget.com which supports multiple languages, however not through the usual Resource method.

What the developer did was to build the app in English, then put together a .ini file with sections for each form, menus, dialogs, error messages, etc. The user then chooses a language from a menu in the program (which is a list of all .ini files detected within the language subdirectory of the program's install directory), which instantly updates the program's settings to another language. In this case the developer used unique IDs to identify each piece of text and then got foreign users of his program to create the translation file from English to their language and to send it back to him for inclusion within the next installer.

This is a much preferred solution, as I don't have to do the language translation work! The routine for loading would either go through all controls on a form, loading the new string from the relevant .ini section depending on the control ID, or go through the INI file and update each control with the new text. A table linking the control names to the unique IDs would probably be useful. I presume FlashGet was programmed in C++; however, a similar thing can be achieved in VB6 by setting each control's Tag property to a unique ID. An interesting issue anyway.
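
A VB6 version of the first loading routine might be sketched as follows. It assumes the .ini file has one section per form, named after the form, with each control's Tag used as the key; this layout and both routine names are illustrative guesses, not FlashGet's actual format. GetPrivateProfileString is the standard Windows API function for reading .ini files, and ini_file should be a full path because the API otherwise looks in the Windows directory.

```vb
Private Declare Function GetPrivateProfileString Lib "kernel32" _
    Alias "GetPrivateProfileStringA" ( _
    ByVal lpApplicationName As String, _
    ByVal lpKeyName As String, _
    ByVal lpDefault As String, _
    ByVal lpReturnedString As String, _
    ByVal nSize As Long, _
    ByVal lpFileName As String) As Long

' Return the value of key in [section] of the .ini file,
' or default_value if the entry is missing.
Public Function ReadIniString(ByVal section As String, _
    ByVal key As String, ByVal default_value As String, _
    ByVal ini_file As String) As String
    Dim buffer As String
    Dim length As Long

    buffer = Space$(512)
    length = GetPrivateProfileString(section, key, default_value, _
        buffer, Len(buffer), ini_file)
    ReadIniString = Left$(buffer, length)
End Function

' Give each tagged control the caption stored under its Tag.
Public Sub LoadFormFromIni(ByVal frm As Form, ByVal ini_file As String)
    Dim ctl As Control
    Dim new_text As String

    For Each ctl In frm.Controls
        If Len(ctl.Tag) > 0 Then
            new_text = ReadIniString(frm.Name, ctl.Tag, "", ini_file)
            If Len(new_text) > 0 Then
                ' Skip controls that have no Caption property.
                On Error Resume Next
                ctl.Caption = new_text
                On Error GoTo 0
            End If
        End If
    Next ctl
End Sub
```

Because the strings are read at run time, calling LoadFormFromIni again with a different file switches the form to another language without restarting the program.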


Mike Ohren:

There is a lot more to internationalisation than language (this is the least of our worries). You have to plan how you are going to resolve the following:

  • Currency issues (both single and multi-currency systems)
  • Format issues for dates and currencies and also the decimal symbol and thousand separator
  • Time zone issues to resolve any discrepancies between user-local time and server-local time
  • Daylight saving time adjustments in different countries.
  • Differing national holidays
  • Differing tax and financial laws and regulations.
  • Differing national legal requirements that your app has to conform to.

Also, is the application designed to operate in a single national environment for each installation, or does a single installation handle multiple nationalities (e.g. a Europe-wide call centre application or even a USA-wide application that has to deal with multiple time zones)?

Not all applications are affected by all these problems but you need to ask the question.
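
For the date, currency, and separator formats in the list above, Visual Basic can lean on the user's regional settings. The sketch below uses only built-in functions; the routine name and sample values are arbitrary. The named formats and the CDbl function follow the regional settings, while Val and Str$ always use a period as the decimal point, which makes them better suited to data files than to user input.

```vb
Public Sub ShowLocaleFormats()
    Dim amount As Double
    Dim due_date As Date

    amount = 1234.5
    due_date = DateSerial(2001, 3, 4)

    ' Named formats follow the user's regional settings.
    Debug.Print Format$(amount, "Currency")
    Debug.Print Format$(amount, "Standard")
    Debug.Print Format$(due_date, "Short Date")
    Debug.Print Format$(due_date, "Long Date")

    ' CDbl parses text using the regional settings, so it can
    ' read back what Format$ produced for the user.
    Debug.Print CDbl(Format$(amount, "Standard"))

    ' Val and Str$ ignore the regional settings and always
    ' treat the period as the decimal point.
    Debug.Print Val("1234.5")
    Debug.Print Str$(amount)
End Sub
```

The output of the Format$ calls depends on the regional settings of the computer running the code.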


[When I worked at GTE Laboratories, they consolidated all of their billing operations in one call center. I have no idea how they handled time zones ranging from New York to Hawaii (a 5-hour span) plus all the different telephone regulatory constraints in each state. -- Rod]


WJK:

Completed Application

I have completed a rather large multi-language commercial software program for the international market. So the information provided comes with five years of international development. I would not say that I am an expert… I continue to make my share of mistakes and plan to continue. I am providing these ideas, only as a basis of how it could be done. I read the article on using XML for the database. As we move into a web based world, that approach should be investigated too.

The software program, an engineering application for estimating, was developed in the USA for use in Europe. The program operates in 6 languages with space for a user-created language. The program consists of roughly 20 forms. The forms include menus, toolbars, taskbars, tab controls, grids and other common controls. I have seen articles describing the use of the registry for storing the language text. I do not agree with using the Windows registry because of corruption problems. I don’t want my software to cause a user system crash or have to keep up with changes as operating systems change.

We do not ever use the registry to store the language text translations. Our method allows the user to change languages on the fly as needed. Instead of using the registry, we developed a Languages Access Database with several tables for each control type.

The tables are as follows:

  • Menu System – stores all the menu items
  • Tab Descriptions – each tab description
  • Tool Tips
  • Grid Headings
  • List or Combo Boxes – text for each item available in a list or combination box.
  • Grid List Boxes – text for each drop down selection in the grid
  • Form Titles
  • Message Box Text – this can also hold miscellaneous text needed to build screen messages.
  • Text Labels and Control Buttons – this table holds a lot of entries.
The tables are designed to group similar controls together. Also controls that use the same code for translation are stored in the same table. Each table has a column to represent a single language. Adding another language is as simple as adding another column. Using these tables it is easy to confirm that a translation exists for each language.

Automatic Translation

At the start this seemed rather simple, just have a couple of foreign dictionaries available. Translate the word and manually enter it into the data table. As time went on the size and scope of the project expanded exponentially. Single word translations don’t always express the idea correctly. We ended up increasing the space allowed on each control for longer translations. We minimized the use of command buttons on the forms, and instead we used the tool and task bars. We used pictorial icon buttons and then used tool tips as labels because of space limitations.

Language translation services are expensive. Most of this work can be completed with software. It is a simple matter to export a column of data into lines of text for translation with translation software. The translated text is then imported into the database alongside the matching rows. As a further step, each language column has a manual binary override column, because the translation software is not always accurate. When the user provides a manual translation, the override checkmark causes that row to be skipped during automatic importation. The developer should be aware that controls are constantly being added during the early years of the program's development. The system has to be easy to update and relatively inexpensive.

Help Systems and User Manuals

The ideal system would use separate help files for each language. Because of the extensive cost, we have developed PDF user manuals in the various languages and shell to those documents. The help systems will be added as revenue permits, but we can still point to on-line help. A large amount of money should be planned for supporting help systems and manuals in various languages.

Reports

We ended up writing some reports multiple times for each language supported. Others we were able to make use of the languages database. I would recommend early planning for reports needed and how this capability will be accomplished. It is much easier to support one version of the report.

Custom Controls

Make sure that each custom control you intend to include with your software has multi-language capability. The sure fire test is using Chinese or other oriental language with the control. Make these tests early on in the development. You will lose a lot of work otherwise. Don’t say we will never be interested in marketing there, just search harder for a more flexible control.

References

I too would like to recommend reading Internationalization With Visual Basic by Michael S. Kaplan, $49.99, 650 pages, paperback. I would rather keep the $50 and have borrowed the book from the library. I don’t find it to be of continuing use. It does have ideas of how to be aware and let windows handle international number and date formatting. This was especially helpful. I simply do not agree with using the registry to store the translation.

Got Tips?
Email me and share your internationalization tips with others.


 
Copyright © 1997-2001 Rocky Mountain Computer Consulting, Inc.   All rights reserved.
www.vb-helper.com/tut9.htm Updated