L20n examples for localizers

Jeff Beatty


Depending on the localizer, l20n can be seen as either a gift or a curse. In its simplest form, l20n will guarantee a more streamlined and efficient way of localizing Mozilla projects. By separating an application’s localization logic from its program logic, l20n gives localizers more flexibility and freedom of expression than ever before. However, in order to take advantage of its full capabilities, localizers will need to become familiar with its syntax and how it operates within their localized strings. For many localizers, this can represent a mountain of a challenge.

Rest assured, help is on the way! The goal of this entry is to demonstrate to localizers how the l20n infrastructure will make their work easier and more robust. Through several examples, I hope to turn you into a believer! By seeing how l20n handles strings as objects, genders, and plurals, you’ll come to agree that l20n is innovative and efficient enough to improve L10n for Mozilla and for the L10n industry as a whole.

L20n: localizing strings as objects

One of the most classic examples of using strings in computer programming involves the string “Hello, world!”. Using “Hello, world!”, I’ll demonstrate how l20n uses strings as objects and why that makes L10n easier.

The code below displays “Hello, world!” as a top-level heading, which is marked up with the <h1></h1> tag. Written this way, the string is hard-coded and can only be used in that one place in the code.

HTML = <h1>Hello, world!</h1>

The l20n code below takes the “Hello, world!” string and turns it into an object called title. Once the string is assigned to an object, it can be used wherever the title object is referenced. You’ll see that the HTML-l20n example below references title in the l10n-id attribute of the <h1> tag. As a result, it displays “Hello, world!” without the text having to appear between the <h1> and </h1> tags.

L20n = <title "Hello, world!">
HTML = <h1 l10n-id="title"></h1>

Now, if you were to localize “Hello, world!”, you would simply do it within the title object.

L20n es = <title "Hola mundo!">

Within this framework, you have translated one single string which can now be reused wherever the application calls for title, thus cutting down the total number of strings you’ll need to translate.
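To make the reuse idea concrete, here is a minimal Python sketch of the lookup. It is not l20n’s actual implementation: the RESOURCES table and the localize function are hypothetical names invented for illustration, and the matching is deliberately simplistic (it only handles empty elements with a single l10n-id attribute).

```python
import re

# Hypothetical per-locale resources: entity id -> translated string.
# These mirror the <title "..."> entities shown above.
RESOURCES = {
    "en": {"title": "Hello, world!"},
    "es": {"title": "Hola mundo!"},
}

# Matches empty elements such as <h1 l10n-id="title"></h1>.
L10N_ELEMENT = re.compile(r'<(\w+) l10n-id="([^"]+)"></\1>')


def localize(html, locale):
    """Fill every element carrying l10n-id with that entity's value."""
    entities = RESOURCES[locale]

    def fill(match):
        tag, entity_id = match.group(1), match.group(2)
        return f'<{tag} l10n-id="{entity_id}">{entities[entity_id]}</{tag}>'

    return L10N_ELEMENT.sub(fill, html)


# The same entity is referenced twice but translated only once.
page = '<h1 l10n-id="title"></h1><p l10n-id="title"></p>'
print(localize(page, "en"))
print(localize(page, "es"))
```

Both elements pick up the same translation, so the localizer only ever touches the single title entry for their locale.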

L20n vs genders

For our first example, let’s look at how l20n intelligently processes gender by taking the strings “Firefox has been updated” and “Aurora has been updated” in English and comparing them to a language with grammatical gender, like Polish. We’ll use brandName as a variable to store Firefox/Aurora and update as a variable to contain the rest of the string.

In English:
<brandName "Firefox">
<update
  '{{ brandName }}
   has been updated.'>

This will display the string, “Firefox has been updated.” Simple enough, right? Now let’s add Polish gender rules to the mix.

<brandName "Firefox"
  gender: 'male'>
<brandName "Aurora"
  gender: 'female'>

We’ve now identified the gender for each brandName. Now let’s put it into the code.

<update [brandName..gender] {
male: '{{ brandName }} został zaktualizowany.',
female: '{{ brandName }} została zaktualizowana.'
}>

Notice how update has changed. Its value is now a hash (a dictionary-like object that maps keys to values) with two keys, male and female, each with a string value assigned to it. Furthermore, the brackets following the update identifier introduce an index. The index tells l20n which value to look up, here the gender attribute of brandName, and l20n then chooses the matching key from the hash.

By adding ..gender to brandName, we’ve determined that if brandName is “Firefox,” the male version of the string is used, and if brandName is “Aurora,” the female version is used. Pretty neat, right? This way, the localizer can translate both forms and leave it up to the L10n logic of the application to determine which gender form is appropriate for the corresponding product name.

An important thing to remember: all of this is happening in the localization file. Remember how we’re separating L10n logic from application logic? This is where that happens. The source (English in our case) is left unchanged. Similarly, the developer doesn’t even have to know that some languages have grammatical gender. All they ask l20n for is the value of the update string, and l20n makes sure it’s grammatically correct and fits the context.
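The selection logic described in this section can be sketched in a few lines of Python. This is only a model of the behaviour, not l20n’s real code: the ENTITIES structure, the PLACEABLE pattern, and the get function are hypothetical names chosen for illustration.

```python
import re

# Hypothetical in-memory model of the Polish resource shown above.
ENTITIES = {
    "brandName": {"value": "Firefox", "attrs": {"gender": "male"}},
    "update": {
        # Models the [brandName..gender] index: (entity id, attribute name).
        "index": ("brandName", "gender"),
        "value": {
            "male": "{{ brandName }} został zaktualizowany.",
            "female": "{{ brandName }} została zaktualizowana.",
        },
    },
}

# Matches placeables such as {{ brandName }}.
PLACEABLE = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def get(entity_id, entities=ENTITIES):
    """Resolve an entity to a plain string, as the developer would request it."""
    entity = entities[entity_id]
    value = entity["value"]
    if isinstance(value, dict):
        # The value is a hash: use the index to pick the matching key.
        ref_id, attr = entity["index"]
        value = value[entities[ref_id]["attrs"][attr]]
    # Replace each placeable with the resolved value of the named entity.
    return PLACEABLE.sub(lambda m: get(m.group(1), entities), value)


# Same request, different product: only the localization data changes.
aurora = dict(ENTITIES, brandName={"value": "Aurora", "attrs": {"gender": "female"}})
print(get("update"))          # Firefox został zaktualizowany.
print(get("update", aurora))  # Aurora została zaktualizowana.
```

Note that the caller only ever asks for update. Which grammatical form comes back is decided entirely by the data the localizer wrote.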

L20n vs plurals

One of the biggest advantages of l20n is that localizers can not only establish plural rules for a string in their locale, but they can set the rules for the program to intelligently determine when to display plural and singular forms. Localizers can combine multiple plural rules and apply them in a single string.

The example below demonstrates how these plural rules can be intelligently applied in a string. You can find a functional example of this here.

<plural($n) { $n == 1 ? 'one' : 'many' }>
<axel "Axel had {{ $beers }} {{ ~..bottles[plural($beers)] }} of beer."
bottles: {
one: "bottle",
many: "bottles"
}
>

This first line establishes the plural rule. It defines plural, which takes a number, $n, that changes according to quantity. Essentially, the line states that the rule returns 'one' when $n is exactly 1 and 'many' for every other value of $n.

<plural($n) { $n == 1 ? 'one' : 'many' }>

This next line contains the string with the plural in it. $beers is a numerical variable. In this context, it takes the place of $n in the plural rule. By putting $beers in the parentheses following plural, we’re asking the plural rule which form of the term bottles we need here. The plural rule checks whether the number is 1 or not and then selects the corresponding form of the term.

<axel "Axel had {{ $beers }} {{ ~..bottles[plural($beers)] }} of beer."

Finally, this line defines the singular and plural forms of the term in question. One of these is inserted once the plural rule has determined whether $beers is 1 or not.

bottles: { one: "bottle", many: "bottles" } >
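The three pieces above fit together roughly as in the Python sketch below. It is an illustration of the logic only, not l20n’s implementation: AXEL and render_axel are hypothetical names, the template uses Python’s own format fields instead of l20n placeables, and it assumes the placeable is meant to expand to the whole word “bottle” or “bottles”.

```python
def plural(n):
    """Mirror of the plural rule: 'one' when n is 1, otherwise 'many'."""
    return "one" if n == 1 else "many"


# Hypothetical model of the axel entity: a string value plus a bottles
# attribute whose hash holds one form per plural category.
AXEL = {
    "value": "Axel had {beers} {bottles} of beer.",
    "attrs": {"bottles": {"one": "bottle", "many": "bottles"}},
}


def render_axel(beers):
    """Pick the form of 'bottles' that matches the count, then fill the string."""
    bottles = AXEL["attrs"]["bottles"][plural(beers)]
    return AXEL["value"].format(beers=beers, bottles=bottles)


print(render_axel(1))  # Axel had 1 bottle of beer.
print(render_axel(3))  # Axel had 3 bottles of beer.
```

A locale with more plural categories would only need a different plural function and more keys in the bottles hash. The calling code would stay the same.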

For another working example of how l20n processes plurals, see slides 7-10 of Stas’s FOSDEM l20n demo.

While the examples above illustrate how l20n will make gender and plural constructs simpler to localize, l20n does not prioritize these grammatical concepts over others. Ultimately, we like to think that l20n sees all languages as equals. Consequently, we are working hard to ensure that its syntax will accommodate and support any and all unique grammatical concepts from all languages.

3 responses


  1. smo wrote on :

    The history of the l20n wiki page is quite enlightening, and so is a comparison of the current version with the Aug 2011 version:

    https://wiki.mozilla.org/index.php?title=L20n&action=historysubmit&diff=402369&oldid=339236

    I have yet to see something earth moving in the difference.

    As regards examples, they have been around since 2007. That is before Google Translate API and all kinds of other linguistic services (ever heard of Pontoon?) kicked in.

    Time to move on.


  2. Staś Małolepszy wrote on :

    I think that the diff you linked to is unfair to the amount of work that is happening. True, maybe the front page hasn’t changed much, but you only need to dive a little deeper and you’ll discover pages like https://wiki.mozilla.org/L20n/Features which represent a big effort that has been going on for months now.

    The code has matured a lot too, and we’re getting closer to having something ready to be implemented in Gaia and B2G. We’re also working on the C++ bindings and Gandalf has been able to successfully build Firefox with l20n support. We’re operating on many fronts and if it doesn’t look like we’re making progress, it’s because the progress is distributed among all of them. C++, JavaScript and Python bindings to start with. The file format and the syntax. The parser and the compiler. Tools, libraries and scripts.

    I’ll admit that we need to get better at communicating more about l20n and the progress we’re making. Jeff has been helping us and the result is the very blog post we’re commenting under right now. I started blogging about l20n too, so stay tuned for more status updates in the near future.

    This is happening right now. And it’s pretty exciting.


  3. Ibrahima Sarr wrote on :

    Fascinating to see l10n being cured of its anglo-centric overdose! I am speaking from experience. Are these features ready to be implemented in programmes like Virtaal or Pootle so they stop annoying us with false positives?
