Axel Hecht Mozilla in Your Language

November 5, 2008

extensions, l10n, and tools

Filed under: L10n,Mozilla — Tags: , — Axel Hecht @ 4:15 pm

Wladimir Palant has two recent posts on Perl scripts he wrote to help him manage localizations for Adblock Plus and TomTom Home.

Sadly, Wladimir ignored about a year of development in the compare-locales work, and a whole other flock of utilities available as part of the translation toolkit.

This year of development is the result of testing almost 80 different locales against up to 4 different applications and thousands of localizable strings, trying to catch more and more fatal errors in each update. Development hasn’t stopped, either. There is more in-depth work going on in Gandalf’s Silme project, for example.

While I appreciate that more folks are paying attention to l10n and extensions, it’s unfortunate to see such work being invested in steps back in capabilities.

Another approach was recently started by Jean-Bernard “Goofy” on the babelzilla wiki and forum. I’m looking forward to adding to and helping with that project.

October 26, 2008

New l10n dashboard entry points

Filed under: L10n,Mozilla — Tags: , — Axel Hecht @ 9:13 am

As requested in the l10n sessions in Barcelona, I added a locale entry hook. To get all builds for, say, German, you would go to http://l10n.mozilla.org/dashboard/?locale=de. HTH.
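A minimal sketch of building such an entry-point URL for any locale code. The helper name `dashboard_url` is made up for illustration; only the base URL and the `locale` parameter come from the post.

```python
from urllib.parse import urlencode

# Base URL as given in the post.
DASHBOARD = "http://l10n.mozilla.org/dashboard/"


def dashboard_url(locale):
    """Return the dashboard entry point for a single locale code."""
    return DASHBOARD + "?" + urlencode({"locale": locale})


print(dashboard_url("de"))
```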

October 21, 2008

l10n-merged linux builds on the l10n server

Filed under: L10n,Mozilla — Tags: , , , — Axel Hecht @ 2:43 pm

I reached another milestone on the l10n builds on the l10n server – reliable l10n depend builds.

A short recap on why they could not be reliable. Details are in Armen’s and John’s presentation in Whistler. First and foremost, l10n builds with missing strings break. They might start, or not, maybe even crash. Or just display the yellow screen of xml parsing death. Now, l10n builds are not really builds, but repackages of an en-US build. Between the time that the en-US build started, or, in hg, the revision it used, and the tip at the time when the en-US binary is finished and available, there can be further l10n-impact landings. We are using the nightly builds for the repackages throughout the whole day even, so the chance that the current en-US source doesn’t correspond to the nightly increases. So even if you know that a localization is good tip vs tip, you can’t say if it’s breaking the previous nightly or not. Huh? Look at the graphs in Armen’s and John’s presentation for more arrows going back and forth in time. ;-)

Enter bugs 452426 and 458014. 452426 added the changeset id to application.ini (thanks Ted), and 458014 refactored browser/locales/Makefile.in with additional logic to extract that info for the build system. I got that one landed yesterday, so we can now get the source stamp of mozilla-central for a firefox build.

Right, good catch, this doesn’t work for comm-central builds. I’ll leave it up to them to figure out how to reproduce the plethora of repos they have.

So far, so good. You download the nightly, unpack, ident (the rule to extract the changeset id). Now you go back to your source tree, hg update to that revision, and run compare-locales against that. We’d be able to reliably say “works” or “better don’t touch”.
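The ident step can be pictured roughly like this. It is a sketch only: it assumes the changeset id sits under a key such as `SourceStamp` in an `[App]` section of application.ini, and the sample file content and changeset id are invented. It is not the Makefile rule from bug 458014.

```python
import configparser

# Invented sample content; a real application.ini has more keys.
SAMPLE_INI = """\
[App]
Name=Firefox
SourceStamp=0123456789ab
"""


def read_source_stamp(ini_text):
    """Extract the changeset id from the text of an application.ini.

    Section and key names are assumptions made for this sketch.
    """
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    return parser.get("App", "SourceStamp")


stamp = read_source_stamp(SAMPLE_INI)
# The command one would then run in the en-US source tree.
print(["hg", "update", "-r", stamp])
```

With the source tree updated to that revision, compare-locales runs against exactly the strings the nightly was built from.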

We promised more, and more pieces came together today.

With reliable compare-locales code, one can not only detect missing strings, but also add missing strings to files. Think about a CPP step, nothing permanent, nothing gets landed upstream. But just for the needs of this particular build, you’d have something that has all strings. Not all translated, some padded from en-US. That works. compare-locales has been able to do merges for a while now, storing the changed files into a separate location. Mostly because I consider changing the source to be evil. So what about missing files? Nothing. Good files? Nothing. How does the build pick up files from merges, l10n, and en-US then?

By rewriting make-jars.pl, enter JarMaker.py. Among overall readability improvements and removing XPFE hacks, JarMaker.py offers to pick up l10n files from a list of top-level source dirs. It offers another cute feature, by writing both chrome and extension manifests at once. Now, with bug 458014, we don’t have to run the libs phase for installers and langpack separately. (I never got why we do that until I rewrote make-jars.pl, actually.) The rewrite of JarMaker.py was preceded by rewriting Preprocessor.py, so that all of the jar generation can happen in a single python process.
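To illustrate the idea of picking up a file from a list of top-level source dirs, here is a small sketch. It is not JarMaker.py code; the function names, directory names, and file names are all made up, and the search order (merged, then l10n, then en-US) is an assumption based on the description above.

```python
import os
import tempfile


def find_source_file(relpath, source_dirs):
    """Return the first existing copy of relpath, searching source_dirs in order."""
    for base in source_dirs:
        candidate = os.path.join(base, relpath)
        if os.path.isfile(candidate):
            return candidate
    return None


def write(base, relpath, text):
    """Demo helper: create base/relpath with the given content."""
    path = os.path.join(base, relpath)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(text)


with tempfile.TemporaryDirectory() as root:
    merged, l10n, en_us = (os.path.join(root, d) for d in ("merged", "l10n", "en-US"))
    for name in ("a.dtd", "b.dtd", "c.dtd"):
        write(en_us, "browser/" + name, "en-US")
    write(l10n, "browser/a.dtd", "complete localization")
    write(l10n, "browser/b.dtd", "partial localization")
    write(merged, "browser/b.dtd", "partial localization plus en-US padding")

    # Merged files win over l10n, l10n wins over en-US.
    order = [merged, l10n, en_us]
    for name in ("a.dtd", "b.dtd", "c.dtd"):
        found = find_source_file("browser/" + name, order)
        print(name, "->", os.path.relpath(found, root))
```

A complete file comes straight from l10n, a file with missing strings comes from the merge directory, and a file missing altogether falls back to en-US.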

Starting from today, all of this came together with my installation of buildbot on the l10n server.

This gives us

  • builds on push, i.e., feedback within 5-10 minutes (real stats pending)
  • comparisons of the l10n tip against both
    • the en-US tip (for the upcoming nightly)
    • changeset of the previous (for the existing nightly, with l10n-merge)
  • html-ified output for both of those
  • updates for the dashboard

and last, but not at all least, a

  • working build, even for partial translations.

Find 60 3.1b2pre linux builds on the l10n server.

Thanks to Armen, I used a few of his new makefile targets for download and upload, he did a bunch of work for the sourcestamps-in-application.ini on cvs, too. Thanks to Ted, the poor fellow had to review all my rewrites and Makefile dependencies foo, and did some patches, too. hgpoller stuff not to forget.

TODO:

  • silme will offer even more reliable merges
  • nightly scheduler for all locales (currently I only build on l10n and en-US l10n-impact changes) (*)
  • mar’s
  • comm-central
  • more Makefile foo to pick up more missing files from en-US in doubt… (*)
  • … or at least document the core set of required files (*)

I won’t take most of those, fwiw. Possibly only the (*) ones.

Sources are in my tooling repository, and there’s an updated version of compare-locales, 0.6, on pypi. No drastic changes here, just some paths fixes, mostly for Windows.

October 2, 2008

State of the Zwiebelfisch

Filed under: L10n,Mozilla — Tags: , — Axel Hecht @ 3:45 pm

For those that know that I’ve been somewhat offline lately, good news. I’m back online. I’m not going into the details, but ISPs in Germany and their handling of customers is, errr, let’s put it that way: not Firefox.

Other items of progress lately:

Seth covered Firefox 3.1 Beta 1. In my words, it went like I expected, our localizers just blew our expectations. That doesn’t make me change my expectations. I expect what I think we should expect, and heartily welcome that Firefox gets so much more. 35 localization teams swallowed the even more technical hg compared to CVS. Some localization teams volunteered to help out others. Overall, the current stats are that we have 41 hg l10n repos with localizers committing, 21 with just me so far, and 6 are plain empty. The last 6 are waiting on good content on cvs to migrate over to hg.

I just landed changes to both hgpoller and hg_templates that add a json output to hg.m.o, once that’s pushed live. That json output is designed to allow cloning the pushlog2.db, i.e., with a mere clone of a repository on hg.m.o and a copy of pushlog2.db, you can do all kinds of history magic. I sure hope to use that data to make our opt-in process a lot more appealing and understandable.

The real big thing in the queue is bug 458014, refactoring browser/locales/Makefile.in. There are a bunch of things I’m doing there. Not wasting time and computing power would be one. Being able to actually tell which build the build is repackaging before it is repacking is the other big thing. The initial comment in the bug has a good analysis.

This is a rather big step for l10n in mozilla land. Being able to actually pick the right source in en-US for the build you’re repackaging is something we’ve been talking about for years. It’s a prerequisite for doing l10n-merge at build time, too. Which that bug adds, btw. So this bug should really pave the path to reliable l10n nightly builds, good testing infrastructure, and more.

“More” being a new way to start localizing Mozilla applications. No more, no less. Up to now, your localized build was busted if the localized source you used to build wouldn’t include all the strings of en-US. With the changes I added in place, you can just make the build insert those at build time. This will not change the output of compare-locales, and will give localizers the ability to, without additional tools, decide on whether to keep a particular string with the same value as en-US or to bother about it later. l10n-merge is really only going to add an intermediate output. Think about it as CPP output.

We talked about this l10n-merge in Whistler, and there are a bunch of loose ends and room for improvement. Nevertheless, the state in bug 458014 is such that I would like to use it to figure out how to make a buildbot factory do the right thing at least in the more or less trivial cases.

l10n-merge won’t change our policy to only ship complete localizations. I hear folks trying to change it, thus the comment. It’ll be tough to convince me to not require complete localizations. “Localizations” doesn’t imply translating all strings, though. Localization implies a conscious decision on whether a particular string should be translated or not. Security error messages, of course, those protect our users on the web. XSLT error messages much more likely not, there are only a few languages which have a community around XSLT in their native language. Falling back to English without a conscious decision is a door wide open to a sucky user experience. Something that at least I don’t consider to be Firefox. Maybe something else. Minefield is one possible name for it, there might be others.

PS: I picked Zwiebelfisch for no other reason than “it’s late at night, and it starts with Onion”

July 25, 2008

compare-locales 0.5 is out

Filed under: L10n,Mozilla — Tags: — Axel Hecht @ 7:35 am

I just uploaded compare-locales 0.5 on pypi. To update your current install, use

easy_install -U compare-locales

This is the new version that can handle mozilla-central. There are two main features that were required to do so:

  • Not get the directories to compare from client.mk.
  • Drop the default path for l10n directories.

The latter was already an optional parameter to compare-locales, and is now mandatory. This is mostly because with hg, you’ll more likely end up with several local repos representing branches or items of work. You just say which source to compare to which tree, and you’re done.

The first change is a bit more drastic. I created new files to hold the information about which directories actually contribute to the localization of an app, for Firefox on mozilla-central, cvs trunk, and the 1.8 branch, and for Thunderbird on cvs trunk and 1.8 branch. Now that cvs trunk is dead for shredder, we’ll need a new one on comm-central. I suggest putting those files into $(APP)/locales/l10n.ini, you can take a peek at the browser one on mxr. As you can see, it pulls the toolkit information in from a separate ini file. If you’re creating an l10n.ini file for your app, feel free to CC me on the bug and request my review.
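As a rough illustration of how an ini file listing directories and pulling in a second ini file could be read, here is a sketch. The section names (`compare`, `includes`), the `dirs` key, and the file contents are assumptions for the example, not a copy of the real browser l10n.ini, and this is not the compare-locales implementation.

```python
import configparser

# Hypothetical file contents, keyed by path relative to the source root.
FILES = {
    "browser/locales/l10n.ini": """\
[compare]
dirs = browser
    other-licenses/branding/firefox

[includes]
toolkit = toolkit/locales/l10n.ini
""",
    "toolkit/locales/l10n.ini": """\
[compare]
dirs = netwerk
    dom
    toolkit
""",
}


def collect_dirs(ini_path, files, seen=None):
    """Return the directories listed in ini_path and in every ini it includes."""
    seen = set() if seen is None else seen
    if ini_path in seen:  # guard against include cycles
        return []
    seen.add(ini_path)
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(files[ini_path])
    dirs = parser.get("compare", "dirs", fallback="").split()
    if parser.has_section("includes"):
        for _name, included in parser.items("includes"):
            dirs.extend(collect_dirs(included, files, seen))
    return dirs


print(collect_dirs("browser/locales/l10n.ini", FILES))
```

Keeping the toolkit list in its own file means every application that builds on toolkit can include it instead of repeating it.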

The main change for localizers is that you call it differently. It doesn’t really matter anymore from where you call it, just make sure that the first argument is the path to the l10n.ini, the second argument is the base dir for the localizations, and then list the locales that you want to compare. There is a remaining constraint in that the l10n base dir needs to have the localization in a subdir with the same name as the locale code.

compare-locales browser/locales/l10n.ini ../l10n de fr hi-IN

would compare German, French and Hindi Firefox localizations.
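The remaining constraint, that each localization lives in a subdirectory of the l10n base dir named after its locale code, can be checked up front. This is only a sketch with made-up helper names; it is not part of compare-locales.

```python
import os
import tempfile


def locale_dirs(l10n_base, locales):
    """Map each locale code to the directory <l10n_base>/<locale>."""
    return {loc: os.path.join(l10n_base, loc) for loc in locales}


def missing_locales(l10n_base, locales):
    """Return the locale codes that have no directory under l10n_base."""
    return [
        loc
        for loc, path in locale_dirs(l10n_base, locales).items()
        if not os.path.isdir(path)
    ]


with tempfile.TemporaryDirectory() as base:
    # Only German and French are checked out in this demo.
    for loc in ("de", "fr"):
        os.mkdir(os.path.join(base, loc))
    print(missing_locales(base, ["de", "fr", "hi-IN"]))
```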

July 24, 2008

l10n buildbot update

Filed under: L10n,Mozilla — Tags: , — Axel Hecht @ 3:31 pm

I’ve updated the buildbot master and slaves running on the l10n server in the last few days.

There have been a few reasons to do so:

  • for Firefox 3.1 localization, I will need an updated compare-locales
  • … and support on the dashboard
  • there’s a new buildbot 0.7.8 around the corner
  • … which offers scheduler properties, which make l10n buildbots sooooo much nicer
  • and, hrm, I had bugs to fix in the history view

For a while now it was obvious that the code running the buildbot on l10n.mozilla.org would be much nicer and easier to grok, if it used the new features upcoming in buildbot, namely, scheduler properties. I’ve been doing some weird hacks to get around the lack of those on buildbot 0.7.7, and nobody liked those, including me.

Luckily, it wasn’t hard at all to drop all the weird code I used in favour of scheduler properties, for the most part, it was removing code. For the curious ones among you, have a look at the crucial diff.

The downside of that change is that I needed to update my custom code at a few places that used build properties in the 0.7.7 style. Once I learned the tricks, the patch was rather systematic, though, no real surprises. Again, there’s a patch in the original queue for curious folks.

I’ve followed the buildbot trunk darcs repository since the announcement of the imminent release last week on my local setup, and found a few issues. Which is cool, as those issues are fixed now, and I have an unpatched buildbot 0.7.8pre running on l10n.m.o. This is actually a real treat, as we (the release team as well as I) have been running slightly patched releases of buildbot to fix bustages in the past. And those bustage fixes tended to be slightly different upstream than in our repo, making upgrading to a new version of buildbot a pita.

On top of that work on the backbone, I’ve fixed a few timezone issues all around, which most apparently broke the statistics pages. They’re linked to from the dashboard as history (H). Those most apparently showed when trying to click on the bonsai links. There are blue vertical marker lines for l10n check-ins, and if you click on those, a lens opens with the committer, the files and the check-in comment. The committer email is actually a link to bonsai-l10n. UE guidance welcome.

While I was at it, I fixed both the start and end times of the timelines, as well as their display. Now they’re actually piecewise constant, as they should be. Ok, they’re not. They’re still P1, because timeline likes that, but they look really close to P0.
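One generic way to make a linearly interpolated (P1) plot look piecewise constant (P0) is to repeat each value just before the next change. The sketch below shows that trick under the assumption that the plotting widget connects consecutive points with straight lines; it is not the code running on the dashboard, and the function name and `eps` parameter are made up.

```python
def step_points(samples, end, eps=1):
    """Turn (time, value) samples into points that draw as steps.

    samples must be sorted by time. Each value is held until eps before
    the next sample (or before `end` for the last one), so straight lines
    between consecutive points look like a step function.
    """
    points = []
    for i, (time, value) in enumerate(samples):
        next_time = samples[i + 1][0] if i + 1 < len(samples) else end
        points.append((time, value))
        points.append((next_time - eps, value))
    return points


# Missing-string counts at three check-in times (invented numbers).
print(step_points([(0, 5), (10, 3), (25, 0)], end=40))
```

The smaller `eps` is, the steeper the connecting segment and the closer the curve looks to a true step.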

Part of the preparations for fx3.1 was the renaming of “trunk” into “fx30x”. That should not only enable us to have fx3.1 on the dashboard, but other projects, too. Those required some changes to the existing data archive, in particular the database that I have locally. Those changes weren’t too bad, though.

Known regressions:

  • I check out from scratch now on each build. Should be less of an issue on hg. And CVS is stable, and I only check out the locale that I build. Thus resolved WHATEVER.
  • If anybody has links referring to “trunk”, those are broken now. “fx30x” is the new name for “trunk”.
  • If you find more, file a bug, comment, or drop me a mail.

July 22, 2008

white russian

Filed under: L10n,Mozilla — Tags: , , — Axel Hecht @ 3:38 pm

Where do we go, sings Marillion. In this context, rather, where do my thoughts go.

l10n builds it is again. I’m currently working on porting my buildbot work over to buildbot 0.7.8, and prep it for mozilla-central. There are a few interesting points:

  • 0.7.8 is not released yet. Finding bugs, fixing bugs, reporting back upstream.
  • 0.7.8 has scheduler properties. Drop all the fuzziness about queues in schedulers etc. Just set up properties right away. Smoooooothness. Lost merging builds for now.
  • mozilla-central doesn’t have l10n configuration in client.mk. That’s bug 445217, landed on hg and various cvs branches. This change goes along with various changes to compare-locales, which I haven’t released yet.
  • As I don’t have to call into client.mk anymore and read the config data from plain files instead, I can do that asynchronously (more easily). My master just starts up now.

All in all, interactions inside the l10n build scheduling are much easier to grok thanks to scheduler properties.

To add some cream to the milk, here’s how I picture l10n builds to go one day:

  • pull central (and friends for Thunderbird, SeaMonkey)
  • pull and update to tip for locale to build
  • for the most performant platform:
    • update central and en-US to tip
    • compare-locales with rich reporting to dashboard and siblings
    • possibly other source verification and test code
  • update en-US to nightly changeset
  • compare-locales with merge, fail on error
  • repack, fail on error
  • upload binaries
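The flow in the list above can be sketched as a sequence of steps, some of which only report and some of which abort the build. Everything here is illustrative: the step names, the `fatal` flag, and the stand-in callables are assumptions, and no buildbot API is used.

```python
def run_build(locale, steps):
    """Run (name, action, fatal) steps in order for one locale.

    Each action returns True on success. A failing fatal step stops the
    build; a failing non-fatal step is only recorded.
    """
    log = []
    for name, action, fatal in steps:
        ok = action(locale)
        log.append((name, ok))
        if fatal and not ok:
            break
    return log


# Stand-in actions; a real setup would call hg, compare-locales, make, etc.
steps = [
    ("pull", lambda loc: True, True),
    ("compare against en-US tip (report only)", lambda loc: loc != "xx", False),
    ("update en-US to nightly changeset", lambda loc: True, True),
    ("compare-locales with merge", lambda loc: True, True),
    ("repack", lambda loc: True, True),
    ("upload binaries", lambda loc: True, True),
]

for name, ok in run_build("de", steps):
    print("ok  " if ok else "FAIL", name)
```

Keeping the report-only comparison non-fatal matches the idea that the dashboard always gets current data, while merge and repack decide whether a binary is produced.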

The source analysis step is going to be the important one to be used for the dashboard, as that needs to have the most current information.

The actual builds on the other hand will be generated against the changesource for the last nightly, i.e., they will merge in en-US content based on the right data, and will only fail on rare occasions where the merge is harder than necessary, or when there are plain bustages in the localization.

In case you really want to know, the patches I’m currently working on are in a public hg queue repo.

July 3, 2008

l10n merge

Filed under: L10n,Mozilla — Tags: — Axel Hecht @ 10:10 am

I’ve just pushed an implementation of l10n-merge to my tooling repository. It’s now actually just an option to compare-locales, and will do the weakest heuristic for now.

Whenever compare-locales finds missing entries in an existing file, it will create a copy of that file in a staging directory for the merge, and append the missing entries. For the bulk of our files, that should work fine. Bookmarks.html is an exception, as are the netError files, I think order of entities and dtd inclusions matters there. In the end, you get a staging directory with files that got fixed up, a localization directory with both good files and files with missing entries, and the original en-US source. Making jar.mn actually pick localized files up from three different places in a particular order is in my build-patches repository.
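A stripped-down sketch of that weakest heuristic for simple `key=value` files: copy the localized file and append whatever entries are missing from en-US. It deliberately ignores DTDs, multi-line values, and ordering concerns, and it is not the compare-locales code; the function names are made up.

```python
def entries(text):
    """Yield (key, line) for each key=value line, skipping blanks and comments."""
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#") or "=" not in stripped:
            continue
        yield stripped.split("=", 1)[0].strip(), line


def merge_missing(en_us_text, l10n_text):
    """Return the localized text padded with missing en-US entries.

    Returns None when nothing is missing, so no staging copy is needed.
    """
    present = {key for key, _line in entries(l10n_text)}
    missing = [line for key, line in entries(en_us_text) if key not in present]
    if not missing:
        return None
    merged = l10n_text if l10n_text.endswith("\n") else l10n_text + "\n"
    return merged + "\n".join(missing) + "\n"


en_us = "title=Downloads\nclear=Clear List\nsearch=Search\n"
german = "title=Downloads\nclear=Liste leeren\n"
print(merge_missing(en_us, german))
```

Returning `None` for a complete file mirrors the point that a complete localization causes no copies at all.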

Here’s why, in case you wonder: First and foremost, it leaves the original source alone. I like it like that. Secondly, it does as few file manipulations as possible. In the best case, a complete localization doesn’t do a single copy or anything. It’s not all that invasive into the build system as one might think, either. At least once you look at the code that decides where to look for files, checking a bunch of source base dirs is rather trivial.

June 2, 2008

EUROPEADA 2008

Filed under: L10n,Mozilla — Axel Hecht @ 3:57 am

Just wanted to point all of you who are into soccer and languages to the EUROPEADA 2008, which is currently on. “The soccer tournament for the autochthonous, national minorities in Europe”, as they say themselves. There’s a video featuring some of the languages spoken on tagesschau.de (German).

April 14, 2008

L10n buildbots builds, the first quarter

Filed under: L10n,Mozilla — Axel Hecht @ 9:37 am

The current setup of my l10n buildbot has been running for 3 months now, or a quarter. I figured that’d be a good time to do some stats.

In these three months, the l10n server ran about 10600 builds, 10416 of those on trunk. Of the latter, 2330 builds succeeded, 8086 failed (fun number pun). The total amount of build time during these three months was merely 3 days, the rest of the day it was just pounding bonsai-l10n and slacking.
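For the curious, the numbers above can be turned into a success rate and a mean build time with a few lines. The inputs are the figures quoted in the post (with “about 10600” and “merely 3 days” taken at face value); the derived values are back-of-the-envelope only.

```python
total_builds = 10600          # "about 10600" builds overall
trunk_builds = 10416
trunk_ok = 2330
trunk_failed = 8086

# The two trunk figures add up to the trunk total.
assert trunk_ok + trunk_failed == trunk_builds

success_rate = trunk_ok / trunk_builds
build_seconds = 3 * 24 * 3600  # "merely 3 days" of total build time
mean_build_seconds = build_seconds / total_builds

print(f"trunk success rate: {success_rate:.1%}")
print(f"mean build time: {mean_build_seconds:.0f} s")
```

That works out to a bit more than one in five trunk builds succeeding, and well under half a minute of build time per build on average.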

The mean response time, that is, the time between the change that bonsai shows and the end of the build is 3-4 minutes, with the following distribution

Histogram of build lags

The lags go as high up as almost 2 hours, the really big jumps seem to be problems on bonsai-l10n, there are a few builds taking 20-30 minutes due to slowness in cvs check-outs. I’m not showing some 36 builds in the diagram above. But the histogram shows nicely, you should in general be done within 10 minutes, and even if the build didn’t do langpacks on linux but full repacks on windows, I would expect a similar responsiveness on comparable hardware to the l10n server.

The bad news is, the code is mostly non-reviewed, and should use a pending feature for buildbot, custom build properties. Otherwise, reconfig will never work.
