l10n merge

I’ve just pushed an implementation of l10n-merge to my tooling repository. It’s now just an option to compare-locales, and for now it only uses the weakest heuristic.

Whenever compare-locales finds missing entries in an existing file, it creates a copy of that file in a staging directory for the merge and appends the missing entries. For the bulk of our files, that should work fine. Bookmarks.html is an exception, as are the netError files; I think the order of entities and DTD inclusions matters there. In the end, you get a staging directory with the files that got fixed up, a localization directory with both good files and files with missing entries, and the original en-US source. The changes that make jar.mn actually pick localized files up from three different places in a particular order are in my build-patches repository.
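
A minimal sketch of that copy-and-append step, not the actual compare-locales code. It assumes simple key=value files, and the names parse_entries and merge_missing are made up for illustration; DTD parsing and the Bookmarks.html and netError special cases are left out.

```python
import os
import shutil


def parse_entries(path):
    """Return an ordered list of (key, line) for simple key=value lines."""
    entries = []
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            stripped = line.strip()
            # Skip blank lines, comments, and anything that is not key=value.
            if not stripped or stripped.startswith("#") or "=" not in stripped:
                continue
            key = stripped.split("=", 1)[0].strip()
            entries.append((key, stripped))
    return entries


def merge_missing(reference, localized, staging):
    """Copy `localized` to `staging` and append entries only in `reference`.

    Returns the appended keys. A complete localization writes nothing.
    """
    present = {key for key, _ in parse_entries(localized)}
    missing = [(key, line) for key, line in parse_entries(reference)
               if key not in present]
    if not missing:
        return []
    os.makedirs(os.path.dirname(staging) or ".", exist_ok=True)
    # The localized original is only read; all writes go to the staging copy.
    shutil.copyfile(localized, staging)
    with open(staging, "a", encoding="utf-8") as handle:
        handle.write("\n")
        for _, line in missing:
            handle.write(line + "\n")
    return [key for key, _ in missing]
```

The appended entries are the untranslated en-US lines, which is about as weak as a merge heuristic can get.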

Here’s why, in case you wonder: First and foremost, it leaves the original source alone. I like it like that. Secondly, it does as few file manipulations as possible. In the best case, a complete localization doesn’t need a single copy. It’s also not as invasive to the build system as one might think. Once you’re in the code that decides where to look for files, checking a bunch of source base dirs is rather trivial.
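
To show how small that lookup can be, here is a sketch that is not taken from the build patches. The function name find_source is hypothetical, and the order used in the example (staging first, then the localization, then en-US) is an assumption; the post only says the order is a particular one.

```python
import os


def find_source(relpath, base_dirs):
    """Return the first existing base_dir/relpath, or None if none exists."""
    for base in base_dirs:
        candidate = os.path.join(base, relpath)
        if os.path.isfile(candidate):
            return candidate
    return None


# Hypothetical directory names, checked in the assumed order.
path = find_source("browser/chrome/example.dtd", ["merge", "l10n/de", "en-US"])
```

A complete localization never populates the staging directory, so the lookup simply falls through to the localization itself.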

The Conversation {6 comments}

  1. Asrail {Saturday July 5, 2008 @ 8:11 pm}

    What’s the best way to suggest improvements and report bugs?

    Well…

    It would help a lot if the shebang were “/usr/bin/env python” instead of “python”, since the latter won’t work on most Unix systems.

    Also, it would be nice for Linux users if the script ignored hidden files and files ending in “~”:
    – file names ending in “~” are temporary files for some editors, such as Emacs. CVS also adds files with names ending in “.~1.1.~” (where 1.1 stands for the version of the file).
    – file names starting with “.#”, or starting with “.” and ending with “.swp”, are swap files of some widely used text editors, such as Emacs and Vim.
    – some people may use a version control system on their files, and its directory is usually hidden (starts with a dot).

    It told me to add and localize “en-US.dic”.

    It doesn’t like comments in DTDs.

    Is it possible to use the locale directory as the destination of the merge command?
    I, for instance, use git, so it would be fairly easy to revert anything.

  2. Asrail {Saturday July 5, 2008 @ 9:17 pm}

    It breaks badly on “#expand” commands in DTDs.

    It added a lone “#expand” at the end of a file that started with such a command, and that broke the build.

    By the way, I’m using it with “suite” as the application.

  3. Axel Hecht {Saturday July 5, 2008 @ 11:52 pm}

    Hi Asrail,

    I need to look into the shebang stuff; I thought that easy_install would do that for me.

    Comments in DTDs are fine, unless you have counterexamples. The #expand thing is a bug that I just WONTFIXed, because we don’t use it anymore, at least in Firefox and Thunderbird. File a bug on suite? In the main toolkit apps, it was just an indication that the URL shouldn’t be in l10n in the first place.

    Regarding en-US.dic, this is going to hit you in other places, too. Suite doesn’t have a filter.py, so file a bug on suite again; http://mxr.mozilla.org/mozilla/source/browser/locales/filter.py is the equivalent.

    I’m not sure if I should really enhance my comparison stuff for all kinds of editor cruft.

    The merge isn’t intended to play nice for now; it’s intended to be a build-time workaround. Whether it breaks when you try to make it edit in place is something I don’t know.

  4. Asrail {Sunday July 6, 2008 @ 6:04 pm}

    It does not parse this comment:

    It contains an additional hyphen.

    Thanks for the reply and thanks for this tool.

  5. Asrail {Sunday July 6, 2008 @ 6:05 pm}

    That should be: “less than”!---SideBar--“greater than”
