Internationalizing Python PyQt Apps

About this post

This post is a brief summary of the translating subtask of the internationalizing task of software development.  The larger internationalization task is discussed in Qt documentation.  The subtask of translating string literals is also discussed in Mark Summerfield’s book, page 512 (although the word “internationalize” is not in the book’s index!)

This post is relevant for different situations:

  • C++ versus Python versus other languages
  • language used to write literals in the software
  • declarative GUI versus procedural definition of GUI (declarative: using uic or QML)

But the examples assume:

  • you are writing an app in Python using PyQt and Qt
  • you are an English speaker and coded your string literals in English (ASCII characters.)
  • you are procedurally defining your GUI (not declarative)

What is i18n?

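i18n stands for “internationalization” (there are 18 letters between the “i” and the “n”).  It is often mentioned together with “localization” (l10n), which strictly speaking is the work of adapting an internationalized app to one particular language and region.  Together they are a set of business procedures, programming tools, operating system support, etc. that make a software application display a different language than the one used for string literals in the source code.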

The gist of i18n translations

You write your program using a few idioms and a little boilerplate in app startup code.  Use the checklist below to be sure you coded everywhere for i18n.

You modify your build process with a few extra steps.

Data flow

This is the dataflow using Qt and its tools:

From your Python source (with i18n idioms and boilerplate) and a .pro file, the program pylupdate5 produces a .ts file (a translations file), or updates an existing one.

The program Qt Linguist edits the .ts file in place (like a database.)  A person in the translator role uses the program Qt Linguist.

From a .ts file the program lrelease creates .qm files.

From .qm files (in your app's resource directory, i.e. in your app's package somewhere) AND from system environment variables, your app reads and displays translated strings appropriate to your locale (as your app starts up.)
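The build steps of this data flow could be scripted roughly as follows.  This is only a sketch: it assumes pylupdate5 and lrelease are on your PATH, and the function names are made up for illustration.

```python
import subprocess


def translation_commands(pro_file, ts_files):
    """Return the command lines for the extract and compile steps."""
    # Step 1: Python source + .pro file -> .ts files (created or updated).
    commands = [["pylupdate5", str(pro_file)]]
    # Step 2: each .ts file -> a .qm file written next to it.
    for ts_file in ts_files:
        commands.append(["lrelease", str(ts_file)])
    return commands


def run_translation_build(pro_file, ts_files):
    """Run each step in order, stopping at the first failure."""
    for command in translation_commands(pro_file, ts_files):
        subprocess.run(command, check=True)


print(translation_commands("appName.pro", ["appName_es.ts"]))
```

The translating step (Qt Linguist) sits between the two commands and is done by a person, so it is not scripted here.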

Creating a .pro file

The .pro file specifies:

  • which of your source files contain translatable string literals
  • which languages you want to translate into

Example:

SOURCES += appName.py
TRANSLATIONS += appName_es.ts
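If you have many source files or languages, you might generate the .pro file instead of writing it by hand.  A minimal sketch, where make_pro_text and the file name dialogs.py are hypothetical:

```python
def make_pro_text(app_name, sources, languages):
    """Build .pro text with one SOURCES line per file and one
    TRANSLATIONS line per target language."""
    lines = [f"SOURCES += {source}" for source in sources]
    lines += [f"TRANSLATIONS += {app_name}_{language}.ts" for language in languages]
    return "\n".join(lines) + "\n"


# dialogs.py and the "fr" language are made-up examples.
print(make_pro_text("appName", ["appName.py", "dialogs.py"], ["es", "fr"]), end="")
```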

Running pylupdate5

>pylupdate5 appName.pro

This assumes:

  • the current directory is the project directory.
  • you are using Qt 5
  • the .pro file is in the current directory

It creates (or updates) one .ts file in the current directory for each language specified in the .pro file.

Installing the tools

Some of the tools come with the Qt SDK (Qt Creator) and some come with PyQt.  In particular, pylupdate5 from PyQt is the Python counterpart of lupdate from the Qt SDK (which handles C++.)

Note that there is now ‘qtchooser’.  Often a command you might run, e.g. ‘lrelease’, invokes the qtchooser command (through a link at /usr/bin/lrelease), which determines which version of the tool (loosely referred to as lrelease) to run.  Some of the Qt 4 and Qt 5 commands may actually be different commands.

Installing Qt Linguist for Qt5 on Linux

sudo apt-get install qttools5-dev-tools

Note that Qt Linguist is not specialized for Python code (the same whether the app is written in C++ or Python.)

Using Qt Linguist

Qt Linguist steps you (a person in the translator role) through phrases needing translation.  It doesn’t actually translate.

To start it, on a command line:

>linguist

A GUI app starts.  It is a document oriented app.  The document is a .ts file.  You Open the file, edit it, and Save it.  Your edits are: creating a translated string from an English language string (from the Python source, which is always in English, or ASCII anyway?)

Using Qt Linguist is iterative in this sense: in a session you iterate through the strings that have not been translated yet.

Using Qt Linguist is also iterative in this sense: when you change the GUI of your app (when you add a displayable string), you (or your person in the translator role) use Qt Linguist again on the same .ts file as before (again, it's a database.)
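Since a .ts file is XML, you can check from a script how much work is left for the translator.  The sketch below assumes the usual .ts layout (context elements holding message elements, with type="unfinished" on untranslated entries); the sample is trimmed and made up.

```python
import xml.etree.ElementTree as ET

# A trimmed, made-up sample of what a .ts file contains.
SAMPLE_TS = """<TS version="2.1" language="es_ES">
<context>
    <name>MainWindow</name>
    <message>
        <source>Open</source>
        <translation>Abrir</translation>
    </message>
    <message>
        <source>Save</source>
        <translation type="unfinished"></translation>
    </message>
</context>
</TS>
"""


def unfinished_messages(root):
    """Return (context, source) pairs that still need translating."""
    pending = []
    for context in root.iter("context"):
        name = context.findtext("name", default="")
        for message in context.iter("message"):
            translation = message.find("translation")
            if translation is None or translation.get("type") == "unfinished":
                pending.append((name, message.findtext("source", default="")))
    return pending


print(unfinished_messages(ET.fromstring(SAMPLE_TS)))  # [('MainWindow', 'Save')]
```

For a real file you would parse it with ET.parse("appName_es.ts").getroot() instead of the sample string.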

Do it yourself translating phrases for software apps

Even if you don’t know many foreign languages, you can translate reasonably well by yourself, without enlisting a specialized translator.

There is a website Ostext.org offering access to a database of phrases used in other software and already translated.

I suppose the internal query is something like “select appName, outPhrase from translations where inPhrase contains ‘foo’ and outLanguage = ‘bar’”.  In other words, it returns a set of translations of an English phrase, found among the translated phrases of many apps.

There is also Google translation.

It might be easy to do a reasonable translation of the important phrases of an app because such phrases are of the grammatical form “imperative-verb noun”, that is, not grammatically correct sentences.  Then again, most users might be able to understand those phrases in English, just because of their context in menus, dialogs, etc.

Testing an internationalization

You don’t need to package your app.  Just cd to your project directory (where the .pro, .ts, .qm, and .py files are.)  Then set your locale and start your app:

>export LANGUAGE=es_es
>./my_app.py

Note this changes the locale of any other programs you start from the same terminal (a shell having the same environment.)

Here es_es means: the Spanish language, in the Spain region (Castilian dialect of Spanish?)  Locale names are conventionally written with an uppercase region code, i.e. es_ES.
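One way to avoid changing the locale of the whole terminal is to set the variable only for the app's own process.  A small sketch using a launcher script; the function names are hypothetical and my_app.py stands for your app's script:

```python
import os
import subprocess
import sys


def env_with_language(language, base_env=None):
    """Return a copy of the environment with LANGUAGE overridden."""
    env = dict(os.environ if base_env is None else base_env)
    env["LANGUAGE"] = language
    return env


def run_app_in_language(script, language):
    """Start the app with the given language; the calling shell is untouched."""
    completed = subprocess.run([sys.executable, script],
                               env=env_with_language(language))
    return completed.returncode
```

Calling run_app_in_language("my_app.py", "es_es") starts the app in Spanish, while other programs started from the same terminal keep their usual locale.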

Organizing the files in directories

Some developers seem to put all their .ts and .qm files in a subdirectory ‘i18n’ of their project directory.

When packaged, only the .qm files are needed, again usually in a subdirectory, but now a subdirectory of the resource directory.

The idioms for i18n

Wherever you first define a string literal that will be visible to a user, bracket the string literal in a call to self.tr():

self.tr("foo")

Caution: you can’t bracket a variable (a reference), only a string literal.  This won’t work (pylupdate will fail to parse it into the .ts file):

bar = "foo"
self.tr(bar)

In other words, the tool pylupdate5 parses your code (looking for string literals that need translating), but is limited in its understanding of which references have type ‘string’.
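You can look for this mistake mechanically.  The following is a rough lint sketch using Python's ast module, not part of any Qt or PyQt tool: it reports tr() calls whose first argument is something other than a plain string literal.

```python
import ast


def non_literal_tr_calls(source):
    """Return line numbers of .tr() calls whose first argument
    is not a string literal."""
    problems = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        is_tr = isinstance(func, ast.Attribute) and func.attr == "tr"
        if not is_tr or not node.args:
            continue
        first = node.args[0]
        if not (isinstance(first, ast.Constant) and isinstance(first.value, str)):
            problems.append(node.lineno)
    return sorted(problems)


EXAMPLE = '''
class Window:
    def labels(self):
        good = self.tr("foo")
        bar = "foo"
        bad = self.tr(bar)
        return good, bad
'''

print(non_literal_tr_calls(EXAMPLE))  # [6]
```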

Caution: the method tr() is defined in class QObject (the root class of most Qt classes.)  You can’t use self.tr() in an object that is not derived from QObject.  In that case, you can delegate to another object that is derived from QObject (one that you defined, or a QApplication instance.)

Gotchas: complexities

Install translator early

You must install a QTranslator instance into the app before procedurally creating the GUI (otherwise you create the GUI with untranslated strings.)

One flavor: don’t create singleton instances (that include to-be-translated strings) at the top level of imported modules (outside of a class) if you import the modules before installing a translator; otherwise the singleton instance is untranslated.  In other words, don’t invoke tr() at import time (statements at the top level of modules get executed at import time.)  This mistake is very easy to make if your app is at all complex, i.e. has many modules.

Keep a reference to translator instances

Using PyQt, when you create a QTranslator instance and call app.installTranslator(instance), you should keep a reference to the instance.  Otherwise the Python object may go out of scope and be garbage collected.  That is, installing a translator does not transfer ownership to the app: Qt apparently keeps only a pointer to the translator, not a copy.

Don’t call tr() in mixin classes

At translation extraction time (when you run pylupdate5), the tool establishes the context key from the name of the class that lexically contains the call to tr(), that is, statically.  At runtime, a call to tr() establishes the context from the class name of the object (dynamically).  If you use mixin classes, these two context keys don’t match.
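The mismatch can be illustrated in plain Python, without PyQt.  This is only a stand-in for the behavior described above; the class names are made up.

```python
class SaveMixin:
    def runtime_context(self):
        # Stand-in for the runtime behavior: the context is taken
        # from the class of the object itself.
        return type(self).__name__


class MainWindow(SaveMixin):
    pass


# What a tool reading the source text would see: the class that
# lexically contains the call.
LEXICAL_CONTEXT = "SaveMixin"

print(LEXICAL_CONTEXT, MainWindow().runtime_context())  # SaveMixin MainWindow
```

PyQt's documentation suggests QCoreApplication.translate() as an alternative to tr(), since it takes the context as an explicit argument, e.g. QCoreApplication.translate("SaveMixin", "Save").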

Understanding translation context

Summerfield says context ‘… is only important to translators.’  That is wrong.  The context is used as a key (in various ways and at various times), including by the Qt translation system at app execution time.

Installing many translators

You can install several translators.  When your app asks for a translation, Qt searches the installed translators in reverse order of installation: the most recently installed translator is searched first.

Generally speaking, the translators include:

  • your app e.g. myApp_es.qm
  • external (installed separately) Python modules that your app imports e.g.  myPySubmodule_es.qm
  • Qt e.g. qt_es.qm

You include the Qt translations so that the dialogs, etc. built into the Qt library are translated. Similarly for external modules that you import.

Note that the order of installation may be significant, since Qt may translate the same phrases that your app includes.  To give precedence to your app’s translations, install the Qt translator first and your app’s translator last.

It seems like most people just copy the .qm files to an app’s resource file.  Alternatively, on certain platforms, at translator install time you might search the system for the .qm files in standard places.  For example, on Linux, your app might use the installed Qt library and install translators that load from the installed Qt translations directory (Qt can report that location through QLibraryInfo.)

Boilerplate example

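In outline, the boilerplate in your app startup code does this: create the QApplication, create a QTranslator and load the .qm file for the current locale, install the translator (keeping a reference to it), and only then create the GUI.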
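A minimal sketch of such startup boilerplate, assuming PyQt5.  The helper qm_file_name, the base name "appName", the "i18n" directory, and the "main" context are assumptions for illustration, not a definitive layout.

```python
import sys


def qm_file_name(base_name, locale_name):
    """Build a name such as 'appName_es.qm' from a locale name such as 'es_ES'."""
    language = locale_name.split("_")[0]
    return f"{base_name}_{language}.qm"


def main():
    # Imported here only so that qm_file_name() is usable without PyQt5.
    from PyQt5.QtCore import QLibraryInfo, QLocale, QTranslator
    from PyQt5.QtWidgets import QApplication, QLabel

    app = QApplication(sys.argv)
    locale_name = QLocale.system().name()  # e.g. "es_ES"

    # Keep references so the translators are not garbage collected.
    translators = []

    # Qt's own translations first: Qt searches the most recently
    # installed translator first, so these get lower precedence.
    qt_translator = QTranslator()
    qt_dir = QLibraryInfo.location(QLibraryInfo.TranslationsPath)
    if qt_translator.load(qm_file_name("qt", locale_name), qt_dir):
        app.installTranslator(qt_translator)
        translators.append(qt_translator)

    # The app's translations last, from an assumed "i18n" subdirectory.
    app_translator = QTranslator()
    if app_translator.load(qm_file_name("appName", locale_name), "i18n"):
        app.installTranslator(app_translator)
        translators.append(app_translator)

    # Only now create the GUI, so its strings are translated.
    label = QLabel(QApplication.translate("main", "Hello"))
    label.show()
    return app.exec_()
```

Call main() from your startup script.  If a .qm file is missing, load() returns False and the app simply shows the untranslated strings.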

String literals that you don’t need to translate

Some strings are universal across languages and regions, and should not be translated:

  • app names: this is more or less a brand name, or iconic name?
  • file suffixes: these are international standards (mimetypes)
  • strings built into Qt that Qt translates, e.g. “OK” for a standard button (but maybe “Apply” would be a better example, since isn’t ‘OK’ universal?)
  • strings used as keys in the settings database (the user should never see these.)

Special characters in translated strings

Certain strings in Qt use special characters, such as ‘&’ in menu item texts, which marks the character that should be underlined so that, when the menu is open, a user can select that menu item using the keyboard.  Note that this usage is only on the Win platform, and is deprecated on other platforms?  I don’t know whether the translator must retain that special character in the translated phrase.  Since translation changes the characters, and since the underlined characters in a menu should be unique (no duplicates), it is probably something the translator needs to do, but the translator may not have enough information to do it properly.

Packaging your translations

When you package your app, the .qm files should be in the resources part of the package.

The boilerplate code should load the .qm files from the resources part of the package.  In Qt, a path beginning with “:/” refers to the Qt resource system, that is, to files compiled into the app’s resources (for PyQt, typically with pyrcc5), rather than to a directory on disk.

I think different platforms might not actually install all the translations from the resource part of the package to the standard installation places on the install target computer?

Checklist for internationalizing an app

This is a checklist for complete internationalization of your code.  Check these things:

  • the boilerplate is in your app startup code
  • every string literal that you define in your app code that the user may see is idiomized for translation (grep for “[A-Z]+.*” or a better regex for string literals ?)
  • every file that includes an idiomized string is listed under SOURCES in the .pro file (grep for “.tr(” )
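The last check can be automated.  A rough sketch, assuming the simple .pro format shown earlier (one or more file names per SOURCES line, no line continuations) and looking only at .py files directly in the project directory; the function names are made up.

```python
import re
from pathlib import Path


def sources_in_pro(pro_text):
    """Return the file names listed on SOURCES lines of a .pro file."""
    names = set()
    for line in pro_text.splitlines():
        match = re.match(r"\s*SOURCES\s*\+?=\s*(.*)", line)
        if match:
            names.update(match.group(1).split())
    return names


def unlisted_sources(project_dir, pro_name):
    """Return .py files that contain '.tr(' but are missing from SOURCES."""
    project = Path(project_dir)
    listed = sources_in_pro((project / pro_name).read_text())
    missing = []
    for py_file in sorted(project.glob("*.py")):
        if ".tr(" in py_file.read_text() and py_file.name not in listed:
            missing.append(py_file.name)
    return missing
```

For example, unlisted_sources(".", "appName.pro") returns the names of files you still need to add to the .pro file.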

Strings user may see:

  • strings in the GUI
  • constant strings (not entered by the user) that your app writes into document files
  • error messages (resulting from user errors; you might omit critical exceptions from system or app coding errors.)
  • license and copyright notices?

Representations of custom key sequences such as “Ctl-Z” ?  I don’t know whether these need translation.

Prioritizing language for internationalization

You have limited resources.  You can only translate for the most popular languages.

The most popular spoken languages are Mandarin, Spanish, English, Hindi, Arabic, Portuguese, …

Unfortunately, that is not weighted by the count of speakers who use computers.  The most popular languages for Internet users are English, Mandarin, Spanish, Japanese.  (Internet usage is a good proxy for users who have computing devices, even just mobile phones?)

Unfortunately, that is not weighted by the count of computer users who might pay for your app.  Some countries have lax intellectual property standards, and in some countries many people can’t afford to pay for apps.  So I might guess that the priority is: English, Spanish, Japanese, …

Also, consider that some language pairs are closer than others: people who speak Portuguese may prefer a Spanish translation over no translation (that is, the default English strings.)

My guess is that Spanish is the most important (and easiest from English) language to translate to.

Translating to Chinese Using Qt Linguist

When I tried to cut and paste from Google Translate into Qt Linguist while it was operating on a Spanish translation, it showed square box characters!  To see the proper characters, open Linguist on a separate .ts file for Chinese, e.g. foo_cn.ts.  When Linguist starts, there is a dialog: choose Chinese as the target language.

Links

An old discussion about alternative idioms, and other alternatives

A discussion about common mistakes in using tr(), and how to test using MockSwedish

PyQt’s documentation about translation, including a discussion of C++ and Python differences.