Language packs are restartless now

Language packs are add-ons that you can install to add additional localizations to our desktop applications.

Starting with tomorrow’s nightly, and thus following the Firefox 18 train, language packs will be restartless. That was bug 677092, landed as 812d0ba83175.

To change your UI language, you just need to install a language pack, set your language (*), and open a new window. This also works for updates to an installed language pack. Opening a new window is the workaround for not having a reload button on the chrome window.

The actual patch turned out to be one line to make language packs restartless, and one line so that they don’t try to call in to bootstrap.js. I was optimistic that the chrome registry was already working, and rightfully so. There are no changes to the language packs themselves.

Tests were tricky, but Blair talked me through most of it, thanks for that.

(*) Language switching UI is bug 377881, which has a mock-up for those interested. Do not be scared, it only shows if you have language packs installed.

Why l10n tools should be editors instead of serializers

If your tool serializes internal state instead of editing files, it’ll do surprising things if it encounters surprising content. Like, turn

errNotSemicolonTerminated=Named character reference was not terminated by a semicolon. (Or “&” should have been escaped as “&amp;”.)

into

errNotSemicolonTerminated=Named character reference was not terminated by a semicolon. (Or “” should have been escaped as “amp;”.)

And that’s for a string the localizer never touched.

(likely narro issue 316)
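To make the difference concrete, here is a minimal sketch of the "editor" approach in Python. It is not how narro or any other tool works; the function name edit_properties and the simplified single-line key=value handling are assumptions for illustration. The point is that lines the localizer didn't touch are passed through as-is instead of being parsed and re-serialized.

```python
import re

# Matches a simple single-line "key=value" entry. Comments, continuation
# lines and ":" separators are deliberately not handled in this sketch.
ENTRY = re.compile(r'^([^#!\s=][^\s=]*)\s*=(.*)$')


def edit_properties(source, changes):
    """Return source with only the values named in `changes` replaced.

    Every other line is copied through unmodified, so surprising
    content (entities, odd escapes) survives unchanged.
    """
    out = []
    for line in source.splitlines(keepends=True):
        body = line.rstrip('\r\n')
        ending = line[len(body):]  # keep the original line ending
        match = ENTRY.match(body)
        if match and match.group(1) in changes:
            line = match.group(1) + '=' + changes[match.group(1)] + ending
        out.append(line)
    return ''.join(out)


original = 'untouched=Tom &amp; Jerry\ngreeting=Hello\n'
edited = edit_properties(original, {'greeting': 'Hallo'})
```

Here the untouched line comes out exactly as it went in, entity and all, and only greeting changes.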

Notes on SemWiki

Semantic Wiki is nice, but it’s hard to wrap one’s head around. Thus, writing down some notes-to-non-self.

Most importantly, start with paper. SemWiki isn’t very forgiving if you reconsider. Once you’ve made some headway with paper, set up mediawiki locally. I’ve thrown my db away three times so far because I did reconsider. FYI, that’s a tad tricky, here’s what works for me:

  • kill the db and recreate it
  • move LocalSettings.php away
  • load index.php, follow the configuration ’til you have a db, ignore the download
  • php maintenance/update.php to get the semwiki tables
  • Special:SMWAdmin to do tables and data update (log in again)
  • php maintenance/runJobs.php to speed up the data update

Using Special:CreateClass is nice, as it’s doing a whole lot of things for you. There is one thing it doesn’t do: for Properties linking to Pages, it won’t ask you for the Form to create/edit those pages. Special:CreateProperty does, but that’s of little help. You can add that later by editing the Property page, and adding a Has default form like

This is a property of type [[Has type::Page]]. It links to pages that use the form [[Has default form::Aisle Milestone]].

Property name is the wiki page name of a property, Field name is the human readable name used in the forms created, btw. If you lose the mapping between field and property name, edit the form, and explicitly specify the property with {{{field|Summary|property=Has summary}}} or the like. Though, more likely, you dropped [[Has summary::{{{Summary|}}}]] in the template; I replaced that with just {{{Summary|}}}, which breaks stuff.

Also, you don’t want to use ‘/’ in your form names, that breaks editing URLs from your referring pages.

Oh, and yes, you don’t need to add the namespaces like Template: etc when entering the names in the semwiki forms, those are prepended automagically.

Another drawback of Special:CreateClass is, it’ll do all its work asynchronously, so you’ll need to wait a while ’til things are up for you to actually enter your data. Locally, you can speed things up with php maintenance/runJobs.php.

I’m still torn on how much I’d really like to use the free text, right now I’m using a Text property to create a summary that I can put in a prominent part of the template.

One more trick, if you’re using largely prefixed pages, like we do on wikimo, you can get pretty decent sorting if you edit the Template to have a category hook like

[[Category:Aisle/Use Case|{{#titleparts: {{PAGENAME}} | | -1 }}]]

This is for Aisle, for which I ended up creating my own semweb instead of using our standard feature pages. They have a ton of stuff I don’t need, and don’t have a few things I hope to use.

Why I deleted my account on facebook

I opened a facebook account a good while ago to be present in communication channels where our community is. I’ve closed that account, with a host of pending “friend” requests from community members, and here’s why.

On one hand, there’s all the “you wanna be a friend of a northern-german elderly guy with a click” thing, and the “what’s the value of my life at NASDAQ”. But if facebook had worked for me, I would have continued to bite that bullet.

The real reason I left is that facebook doesn’t work for me. My role at Mozilla is to talk and engage with people all over the world. Many of them came back with friend requests on facebook. Their friends came back with friend requests. Now, most of what they’re putting up on facebook is targeted to their social circle, and in their language. Which is cool, but my facebook feed just turned into a series of stuff I can’t read.

And being your friend and then mute you? That’d be just rude.

I haven’t communicated there really. So, I’ve stopped being on facebook. I’m still in that 14-day grace period, but I cleared my cookies to make it through that.

If you want to keep in touch, subscribe to this blog, or catch me on twitter. And then there’s irc and email, of course.

I intend to stay on twitter for the foreseeable future. I’m not fond of their promo-tweets, but I enjoy the asymmetric nature of connections there. I have no plans to join other social networks at this point.

Rapid releases and the l10n dashboard are friends now

Wait a second, we’ve been on the rapid release schedule for almost a year now, and 9 releases. How can the l10n dashboard be friends with the trees only now?

Well, I’ve hacked and lied and tweaked and spoofed the data for a year. No more.

The obvious changes are:

  • Localizers as well as drivers can now see how far behind their work is, in release cycles
  • channel migration code is actually not a lie, and can be taken over by release management

On the team page, you’ll now see something like

You’ll notice the difference between the Current sign-off with the green check-mark, and the fx14 one with the looking glass. In the past, we’ve shown the green check-mark for both, now we’re actually showing the version that we’re using instead of the current one, and the looking glass is there to indicate that the localizer should actually look into this. There’s been a good deal of confusion about this, and I sure hope that this will resolve it a good deal.

There’s a ton of follow-up work, for one on elmo. This bug has been blocking a lot of other patches and work, both for the localizer-facing parts as well as the release infrastructure-facing parts.

More so, the state of localizations changes from “n missing strings, probably in this bucket” to “didn’t update since fx12”. That’s changing how we guide our work as l10n drivers a good deal. The impact between what we do and what localizers do becomes less anecdotal, and more science.

And there’s quite a few things we need to do, in particular for desktop.

compare-locales 0.9.6

I’ve updated compare-locales with two important fixes:

  • License header fix for ini files, bug 760998
  • l10n-merge now works with multiple errors per file, bug 756448

I’ve also updated the license to MPL2.

Update your local installs with the usual commands, like

pip install -U compare-locales

The l10n dashboard is already running the new code, I’ll file a bug to get the production builds updated probably tomorrow.

Migrating to the rapid release process

Wait, what, migrating to the rapid release process? Aren’t we, like, doing that? Well, not in the data models that drive the l10n dashboard. What follows is two-fold, for one, why would I be hacking on a patch for half a year? But also, there are some interesting technical tidbits on how to do intensive data migrations in a django project. I’ll link to the actual code in full later, right now the patch is still in flux. I’ll need to write this stuff down to get a review on the patch, so why not here.

I’ve blogged about data models before, but here’s a quick glossary:

Tree models a set of repositories to run automated tests against, namely, compare-locales. AppVersion models the thing we’re shipping, say Firefox 3.6 or Firefox 13.

When gandalf and I originally designed this part of the dashboard, the relationship between what we build and what we shipped was static, kinda like

Static relation between AppVersion and Tree

Small caveat, for AppVersions we’re not building right now, the old tree is stored as lasttree. We’ll need that data further down.

The rapid release process now lets AppVersions jump from Tree to Tree like little birds. So we need to have some intermediary that models that relationship, with start and end time. More like

Connect AppVersions and Trees through a model in time

Sorry for the ugliness, but that’s the pretty part so far.

Let’s start with doing that migration. Be warned, it’s migration 1 out of 4.

This first migration covers the sql schema, and persists the existing links between AppVersions and Trees. We’ll want to drop two ForeignKey columns for tree and lasttree from AppVersion, and add a ManyToMany table for the new relationship. Plus constraints and indices, sure. As I don’t speak sql, I want to use the ORM as much as I can, which sounds like one couldn’t. Enter a fake migration app that mimics the important pieces of our shipping app. It’ll contain a subset of AppVersion to the extent we need it, and the intermediate model, AppVersionTreeThrough. This needs to be really fake code, as you shouldn’t rely too much on the version of the code on disk to be really what you need for this db migration step. I cheat a bit on that, though. The important trick here is that the fake AppVersion contains both the old and the new fields, and that the fake AppVersionTreeThrough matches the one you migrate to in all flags and fields.

Now, getting django to eat a fake app is tricky. You need a module for the app, and that module needs to have a models module. Both need to be in sys.modules, and they need to have the __file__ properties set. Just because django is paranoid about equality of code and thinks it needs to verify location on disk. But hey, I can spoof all of that. Python ftw. Then you create a meta class, copying most data for db and management from your original app, and with that meta, create Model subclasses.
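The module spoofing part can be sketched without django at all. The following is a minimal illustration, not the code from the patch: the helper names install_fake_app and remove_fake_app and the base directory are made up, and the Meta class and Model subclasses would be added on top of this.

```python
import os
import sys
import types


def install_fake_app(app_name, base_dir):
    """Register an empty package `app_name` and `app_name.models` in sys.modules.

    base_dir is only used to fill in __file__; nothing is read from or
    written to disk.
    """
    app = types.ModuleType(app_name)
    app.__file__ = os.path.join(base_dir, app_name, '__init__.py')
    app.__path__ = []  # having __path__ marks the module as a package
    models = types.ModuleType(app_name + '.models')
    models.__file__ = os.path.join(base_dir, app_name, 'models.py')
    app.models = models
    sys.modules[app_name] = app
    sys.modules[app_name + '.models'] = models
    return app


def remove_fake_app(app_name):
    """Drop the fake modules again so later imports don't see them."""
    sys.modules.pop(app_name + '.models', None)
    sys.modules.pop(app_name, None)


# hypothetical base directory, used only for the __file__ attributes
fake_app = install_fake_app('migration_phase_1', '/tmp/fake_apps')
import migration_phase_1.models  # resolved from sys.modules, not from disk
```

Once the modules are in sys.modules, a regular import statement finds them like any other module, which is all the spoofing needs.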

Ok, so now we have python code to do the migration, but we don’t have the SQL tables and columns yet. Let’s get our fingers dirty with that.

from django.core.management import color, sql
style = color.no_style()
sql.sql_all(mig_module.models, style, ship_connection)

This is going to return a list of all CREATE TABLE, CREATE INDEX, ALTER TABLE ADD CONSTRAINT sql commands for our fake migration models. The migration code inspects that, and creates the many-to-many table, adds constraints and indices, and tweaks the CREATE TABLE statement for AppVersion to just pull out the SQL definitions for the columns. Those get inserted into ALTER TABLE ADD COLUMN statements, and then we execute all that.
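Pulling column definitions out of a CREATE TABLE statement can be sketched roughly like this. It’s an illustration under assumptions rather than the actual migration code: the helper name add_column_statements is made up, so is the column name in the example, and it assumes the generated SQL has one column definition per line between the parentheses.

```python
def add_column_statements(create_sql, new_columns):
    """Build ALTER TABLE ... ADD COLUMN statements from a CREATE TABLE statement.

    Assumes one column definition per line inside the parentheses,
    which is an assumption about the generated SQL, not a general parser.
    """
    header, _, rest = create_sql.partition('(')
    table = header.split()[-1]        # quoted table name, kept as-is
    body = rest.rsplit(')', 1)[0]     # everything before the closing paren
    statements = []
    for line in body.splitlines():
        definition = line.strip().rstrip(',')
        if not definition:
            continue
        column = definition.split()[0].strip('`"')
        if column in new_columns:
            statements.append(
                'ALTER TABLE %s ADD COLUMN %s;' % (table, definition))
    return statements


# hypothetical CREATE TABLE output; `example_new_field` is a made-up column
create_sql = """CREATE TABLE `shipping_appversion` (
    `id` integer AUTO_INCREMENT NOT NULL PRIMARY KEY,
    `code` varchar(20) NOT NULL,
    `example_new_field` integer
)
;"""
statements = add_column_statements(create_sql, {'example_new_field'})
```

Columns that already exist in the database are skipped, only the ones named in new_columns end up in ALTER TABLE statements.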

At this point, the database contains the old columns with the old data, and the new empty tables (and columns).

Now migrate the data, with the help of our intermediate classes that can create the django orm magic. As we’re still in the first phase of our migration, let’s not overrotate and just make links between AppVersion and Tree over all time for those that are currently bound, and for those that used to be linked (read lasttree), make the end time just now(). We’ll fix that later. That’s a nice and easy loop.
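The shape of that loop, sketched with plain dataclasses instead of the django models. The class and field names here are stand-ins, and I’m assuming an AppVersion has either tree or lasttree set, not both; None for start or end means open-ended.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class OldAppVersion:
    # stand-in for the fake AppVersion that still has the old columns
    code: str
    tree: Optional[str] = None      # tree currently built, if any
    lasttree: Optional[str] = None  # tree it used to be built on


@dataclass
class Through:
    # stand-in for AppVersionTreeThrough; None means "open-ended"
    appversion: str
    tree: str
    start: Optional[datetime] = None
    end: Optional[datetime] = None


def phase_one_links(appversions, now):
    """Create one link per AppVersion that has or had a tree."""
    links = []
    for av in appversions:
        if av.tree is not None:
            # currently bound: link over all time
            links.append(Through(av.code, av.tree))
        elif av.lasttree is not None:
            # used to be linked: close the link right now, fix it up later
            links.append(Through(av.code, av.lasttree, end=now))
    return links
```

AppVersions that never had a tree get no link at all.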

Now we’ve used our migration app to the extent we need. It’d be nice if we could just leave it with that, but we’ve got to tear it down, because otherwise the validation step will complain about multiple apps referring to the same tables. Module caches in django are tough, the following code does that at least for 1.3:

    # prune "migration_phase_1" app again
    del settings.INSTALLED_APPS[-1]
    from django.db.models import loading
    loading.cache.app_models.pop('migration_phase_1', None)
    loading.cache.register_models('migration_phase_1')  # clear cache
    # empty sys modules from our fake modules

And, yes, now we can DROP stuff, at least when using mysql. sqlite doesn’t do that, and in contrast to peterbe, I don’t mind postgres ;-). Of course, as django adds constraints with rather arbitrary names, the best thing we can do is inspect the database for the actual names of those, and then we drop’em, and the columns.

And if you think this blog post is too long, you’re right. Let’s talk about the migrations 2-4.

The second migration just inspects our actual builds, and adjusts the start and end times for stuff that’s not on the rapid release cycle, and old.

The third migration fights the past. To make our mismatching data model work for the rapid cycle, I replicated a lot of history data each time we migrated appversions, to a newly created set of appversions for that cycle. Now that our data model gets that right, this migration searches for that data (Signoff entries, and their associated Action ones), and deletes them.

The fourth migration sets the start and end times for the rapid release cycle, including the off-times for Firefox 5, and moves the Signoff entries from the long running aurora AppVersion to the respective AppVersions on aurora at the time. Not too bad, but really loads of weird data, and it gets worse every six weeks :-).

Phew. You made it. Now all I need to do is to fix the code that uses all that data.

compare-locales 0.9.5

Busy times for compare-locales, there’s another release out the door.

New in this release is a significant rewrite of the Properties parser. A lot fewer regular expressions, a lot more performance in bad situations. Thanks to glandium for poking me hard with a patch. That patch didn’t work, but at least it got my butt to it. Comparing bn-IN is now down to 23 secs from 3 minutes+.
The next big thing is that I now run checks on entities that are keys, too. That doesn’t seem to have caused any regressions, but look out for new false positives. On the plus side, if you use ‘&’ as accesskey, you’ll get an error report instead of a ysod.
Finally, I added support for mpl2 license headers, so we’re all set there.

As usual, update with pip install -U compare-locales.

What’s a glossary term?

I’m hacking on some tool that indexes the localizable strings in our apps.

One of the fall-outs could be a glossary tool, i.e., which terms in Firefox, Thunderbird, etc should localizers bother to get consistently translated.

Which raises an interesting question, where do you draw the line? What’s a good metric to use to define a glossary? Are there glossary-based applications that don’t need a cut-off at all?

Insights welcome.

Web-based IDEs for Localization

There isn’t much news on the localization tool front that I started at MozCamp in Berlin, but I’ve got some more questions for the web tool guys among you. Like any good project, a localization editor should stand on the shoulders of giants, so I’ve been looking at Orion, Cloud9/Ace, and etherpad-lite. All of them got parts right, and I’m trying to gather some input on what it’d take to bring home the rest. I’ve set up an etherpad each for you to type to.

A bit of context: Localization is editing code by people that don’t (necessarily) code. An editor needs to take an extra step to make breaking things hard, and editing the right pieces easy. There’s also a host of content assist available to suggest translations, spell checking, glossaries, etc. Which opens two paths: You offer forms, and your localizer will never learn what’s happening in real life, or you drive a code editor beyond what the programmer of that code editor ever needed herself. I would love to try the latter again, this time on the web.

So, if you have some experience with tweaking, wrestling, extending, stripping any of these, or with one I missed, I’d be thankful for your input.