Internationalization

As a platform intended for use around the world, Kolibri has a strong mandate for translation and internationalization. As such, it has been designed with technologies to enable this built in.

Writing localized strings

For strings in the frontend, we are using Vue-Intl, an in house port of React-intl. Strings are collected during the build process, and bundled into exported JSON files.

Messages will be discovered for any registered plugins and loaded into the page if that language is set as the Django language. All language setting for the frontend are based off the current Django language for the HTTP request.

.vue files

Within Kolibri .vue components, messages are defined in the <script> section as attributes of the component definition:

export default {
  name: 'componentName',
  $trs: {
    msgId1: 'Message text 1',
    msgId2: 'Message text 2',
  },
};

The component names and message IDs should all be camelCase.

User visible strings can be used anywhere in the .vue file using $tr('msgId') (in the template) or this.$tr('msgId') (in the script).

An example Vue component would then look like this

<template>
  <div>
    <!-- puts 'Hello world' in paragraph -->
    <p>{{ $tr('helloWorld') }}</p>
  </div>
</template>


<script>

  export default {
    name: 'someComponent',
    mounted() {
      // prints 'Hello world' to console
      console.log(this.$trs('helloWorld'));
    },
    $trs: {
      helloWorld: 'Hello world',
    },
  };

</script>

.js files

In order to translate strings in Javascript source files, the namespace and messages are defined like this:

import { createTranslator } from 'kolibri.utils.i18n';
const name = 'someModule';
const messages = {
  helloWorld: 'Hello world',
};
const translator = createTranslator(name, messages);

Then messages are available from the $tr method on the translator object:

console.log(translator.$tr('helloWorld'));

common*String modules

A pattern we use in order to avoid having to define the same string across multiple Vue or JS files is to define “common” strings translator. These common translators are typically used within plugins for strings common to that plugin alone. However, there is also a “core” set of common strings available to be used throughout the application.

In order to avoid bloating the common modules, we typically will not add a string we are duplicating to a common module unless it is being used across three or more files.

Common strings modules should typically have a translator created using the createTranslator function in which strings are defined - these can then be used in the setup function of a component to expose specific strings as functions:

import commonStringsModule from '../common/commonStringsModule';


export default {
  name: 'someComponent',
  setup() {
    const { myCoolString$, stringWithArgument$ } = commonStringsModule;
    return {
      myCoolString$, stringWithArgument$
    };
  },
};
<template>
  <div>
    <p>{{ myCoolString$() }}</p>
    <p>{{ stringWithArgument$({ count: 4 }) }}</p>
  </div>
</template>

Previously, this has been handled via mixins, which has required this additional complexity in the modules. You may see modules that include the translator object and the following:

  • An exported function that accepts a string and an object - which it then passes to the $tr() function to get a string from the translator in the module.

  • An exported Vue mixin that exposes the exported function as a method. This allows Vue components to use the mixin and have the exported function to get a translated string readily at hand easily.

ICU message syntax

All frontend translations can be parameterized using ICU message syntax. Additional documentation is available on crowdin.

This syntax can be used to do things like inject variables, pluralize words, and localize numbers.

Dynamic values are passed into translation strings as named arguments in an object. For example:

export default {
  name: 'anothetComponent',
  mounted() {
    // outputs 'Henry read 2 stories'
    console.log(this.$tr('msg', {name: 'Henry', count: 2}));
  },
  $trs: {
    msg: '{name} read {count} {count, plural, one {story} other {stories}}',
  },
};

.py files

For any user-facing strings in python files, we are using standard Django tools (gettext and associated functions). See the Django i18n documentation for more information.

RTL language support

Kolibri has full support for right-to-left languages, and all functionality should work equally well when displayed in both LTR and RTL languages.

There are a number of important considerations to take into account with RTL content. Material Design has an excellent article that covers most important topics at a high level.

Warning

Right-to-left support is broken when running the development server with hot reloading enabled (yarn run devserver-hot)

Text alignment

Alignment of application text (i.e. text using $tr syntax) is mostly handled “automagically” by the RTLCSS framework. This means that application text should have CSS applied to it as though it is written in English. For example, if you want the text aligned left for LTR languages and right for RTL, simply use text-align: left. This will be automatically flipped to text-align: right by the webpack plugin. Since the application is only ever viewed in one language at a time, RTLCSS can apply these changes to all CSS at once.

On the other hand, alignment of user-generated text (from databases or from content) is inherently unknown beforehand. Therefore all user-generated text must have dir="auto" set on a parent DOM node. This can get especially complicated when LTR and RTL content are mixed inline bidirectionally. Read more about the Unicode Bidirectional algorithm.

A rule of thumb for inline bidirectional text:

  • if user-generated text is on its own in a block-level DOM element, it should be aligned based on the text’s language using dir="auto" on the block-level element.

  • if user-generated text is displayed inline with application text (such as “App Label: user text”), it should be aligned using CSS text-align on the block-level element, and dir="auto" on a span wrapping the inline user text.

Behavior

Occasionally it is necessary to perform different logic depending on the directionalty of the the currently-selected language. For example, the handling of a button that changes horizontal scroll position would need to flip direction.

In the frontend, we provide a isRtl property attached to every Vue instance. For example, you could write Vue methods like:

previous() {
  if (this.isRtl) this.scrollRight();
  else this.scrollLeft();
},
next() {
  if (this.isRtl) this.scrollLeft();
  else this.scrollRight();
},

If you need to get the current language directionality on the backend, you can use django.utils.translation.get_language_bidi.

Iconography

Choosing whether or not to mirror icons in RTL languages is a subtle decision. Some icons should be flipped, but not others. From the Material guidelines:

anything that relates to time should be depicted as moving from right to left. For example, forward points to the left, and backwards points to the right

It is recommended to use the KIcon component when possible, as this will handle RTL flipping for you and apply it when appropriate, as well as taking care of other details:

<KIcon icon="forward" />

If KIcon does not have the icon you need or is not usable for some reason, we also provide a global CSS class rtl-icon which will flip the icon. This can be applied conditionally with the isRtl property, e.g.:

<img src="forward.png" :class="{ 'is-rtl': isRtl }" alt="" />

Content rendererers

User interfaces that are tightly coupled to embedded content, such as the ‘next page’ and ‘previous page’ buttons in a book, need to be flipped to match the language direction of that content. UIs that are not tightly integrated with the content should match the overall application language, not the content.

Information about content language direction is available in the computed props contentDirection and contentIsRtl from kolibri.coreVue.mixins.contentRendererMixin. These can be used to change styling and directionality dynamically, similar to the application-wide isRtl value.

In situations where we are using third-party libraries it might be necessary to flip the entire content renderer UI automatically using the RTLCSS framework rather than make targeted changes to the DOM. To handle these cases, it’s possible to dynamically load the correct CSS webpack bundle using a promise:

export default {
  name: 'SomeContentRenderer',
  created() {
    // load alternate CSS
    this.cssPromise = this.$options.contentModule.loadDirectionalCSS(this.contentDirection);
  },
  mounted() {
    this.cssPromise.then(() => {
      // initialize third-party library when the vue is mounted AND the CSS is loaded
    });
  },
};

Crowdin workflow

We use the Crowdin platform to enable third parties to translate the strings in our application.

Note that you have to specify branch names for most commands.

Note

These notes are only for the Kolibri application. For translation of user documentation, please see the kolibri-docs repository.

Note

The Kolibri Crowdin workflow relies on the project having the “Duplicate strings” setting set to “Show – translators will translate each instance separately”. If this is not set, the workflow will not function as expected!

Prerequisites

The tooling requires a minimum Python version of 3.7 and the dependencies in requirements/fonts.txt installed.

You’ll need to have GNU gettext available on your path. You may be able to install it using your system’s package manager.

Note

If you install gettext on Mac with Homebrew, you may need to add the binary to your path manually

Finally, ensure you have an environment variable CROWDIN_API_KEY. You can generate your Crowdin API key by navigating to your Crowdin account settings page.

Extracting and uploading sources

Typically, strings will be uploaded when a new release branch is cut from develop, signifying the beginning of string freeze and the beta releases. (See Release process.)

Before translators can begin working on the strings in our application, they need to be uploaded to Crowdin. Translations are maintained in release branches on Crowdin in the Crowdin kolibri project.

This command will extract front- and backend strings and upload them to Crowdin, and may take a while:

make i18n-upload branch=[release-branch-name]

The branch name will typically look something like: release-v0.8.x

Pre-translation

After running the i18n-upload command above, the newly created branch should have some percentage of strings in supported languages shown as both translated and approved. These strings are the exact matches from the previous release, meaning that both the string IDs and the English text is exactly the same.

At this point, it is often desirable to apply some form of pre-translation to the remaining strings using Crowdin’s “translation memory” functionality. There are two ways to do this: with and without auto-approval.

To run pre-translation without auto-approval (recommended):

make i18n-pretranslate branch=[release-branch-name]

Or to run pre-translation with auto-approval:

make i18n-pretranslate-approve-all branch=[release-branch-name]

Warning

The exact behavior of Crowdin’s translation memory is not specified. Given some English phrase, it is not always possible to predict what suggested translation it will make. Therefore, auto-approval be used with caution.

Transferring screenshots

Every release, we need to transfer screenshots on the platform from the previous branch to the new branch, as this is the only way to persist screenshots across branches. To do this run:

make i18n-transfer-screenshots branch=[release-branch-name] source=[previous-release-branch-name]

This will match all screenshots by their Kolibri message ID to persist screenshots across releases.

Reviewing screenshots

Every release, we need to review screenshots on the platform to ensure they are up to date. To generate a report of all the screenshots for a particular branch run:

make i18n-screenshot-report branch=[release-branch-name]

This will generate an HTML report that can be browsed to double check screenshots against the source English strings, with a link to the string on Crowdin to update the screenshot if needed.

Downloading translations to Kolibri

As translators work on Crowdin, we will periodically retrieve the latest updates and commit them to Kolibri’s codebase. In the process, we’ll also update the custom fonts that are generated based on the translated application text.

First, you need to download source fonts from Google. In order to do this, run:

make i18n-download-source-fonts

Next, we download the latest translations from Crowdin and rebuild a number of dependent files which will be checked in to git. Do this using the command below. It can take a long time!

make i18n-download branch=[release-branch-name]

This will do a number of things for you:

  • Rebuild the crowdin project (note that builds can only happen once every 30 minutes, as per the Crowdin API)

  • Download and update all translations for the currently supported languages

  • Run Django’s compilemessages command

  • Regenerate all font and css files

  • Regenerate Intl JS files

Check in all the updated files to git and submit them in a PR to the release branch.

Note

Remember about Perseus! Check if files in that repo have changed too, and submit a separate PR. It will be necessary to release a new version and referencing it in Kolibri’s base.txt requirements file.

Adding a newly supported language

In order to add a new language to Kolibri, the appropriate language information object must be added to the array in kolibri/locale/language_info.json.

Warning

Always test a newly added language thoroughly because there are many things that can go wrong. At a minumum, ensure that you can run the development server, switch to the language, and navigate around the app (including Perseus exercises). Additionally, ensure that the fonts are rendered with Noto.

The language must be described using the following keys, with everything in lower case

{
  "crowdin_code":   "[Code used to refer to the language on Crowdin]",
  "intl_code":      "[Lowercase code from Intl.js]",
  "language_name":  "[Language name in the target language]",
  "english_name":   "[Language name in English]",
  "default_font":   "[Name of the primary Noto font]"
}

If the language doesn’t exist in Django, you may get errors when trying to view the language. In this case it needs to be added to EXTRA_LANG_INFO in base.py.

For the new language to work, the django.mo files for the language must also be generated by running make i18n-download and committed to the repo.

To test unsupported languages, you can use the Deployment section LANGUAGES option in the Kolibri options.ini. Either set the value to all to activate all languages, or add the specific Intl language code as the value.

Once the language has been fully translated and is ready for use in Kolibri, its Intl language code must be added to the KOLIBRI_SUPPORTED_LANGUAGES list in kolibri/utils/i18n.py.

Updating font files

We pin our font source files to a particular commit in the Google Noto Fonts github repo.

Google occasionally adds new font files and updates existing ones based on feedback from the community. They’re also in the process of converting older-style fonts to their “Phase III” fonts, which are better for us because they can be merged together.

In order to update the version of the repo that we’re using to the latest HEAD, run:

python packages/kolibri-tools/lib/i18n/fonts.py update-font-manifest

You can also specify a particular git hash or tag:

python packages/kolibri-tools/lib/i18n/fonts.py update-font-manifest [commit hash]

Make sure to test re-generating font files after updating the sources.

Note

We attempt to download fonts from the repo. It is possible that the structure of this repo will change over time, and the download script might need to be updated after changing which version of the repo we’re pinned to.

Configuring language options

The languages available in an instance of Kolibri can be configured using a few mechanisms including:

  • An environment variable (KOLIBRI_LANGUAGES)

  • An options.ini file (in LANGUAGES under [Deployment] )

  • Overwriting the option in a kolibri_plugin.py plugin config file

It takes a comma separated list of intl_code language codes. It can also take these special codes:

  • kolibri-supported will include all languages listed in KOLIBRI_SUPPORTED_LANGUAGES

  • kolibri-all will include all languages defined in language_info.json

Auditing strings

Much of our string workflow before developer implementation happens using Ditto. In order to do a full audit of newly added strings from Ditto, a CSV of the newly added strings can be exported from Ditto, and then our internal audit tool can be run to generate a CSV report of any strings that are potentially missing:

yarn run auditdittostrings --ditto-file <path to Ditto CSV>

This will produce an output CSV file in kolibri/locale/en/LC_MESSAGES/profiles/ditto.csv that contains an audit report on which strings from the Ditto file that are marked as FINAL were not found in the codebase (using an exact match method, so this may produce false positives if strings are not in ICU syntax on Ditto), and also any strings that have been discovered in the codebase to be an exact match - i.e. when we appear to have duplicate strings in our codebase (again, these may be false positives, as some strings may be repeated for different senses).