• Presentation at WordCamp Philippines

    September 18th, 2009 by seth bindernagel with 5 comments »

    Gen and I attended WordCamp Philippines and I presented today to the audience of about 100-150 people. The purpose of our visit and participation was straightforward:

    1. To gain further insight into the landscape of the Web and Internet in the Philippines;
    2. To assess whether or not a localized version is something our community here might pursue;
    3. To meet our community of campus reps and others.

    It’s been a steamy (as in the humidity), but amazingly kind reception here and we booked our schedules full with meetings and events. That’s all Gen’s amazing work.

    As for my WordCamp chat, here is my presentation. I started by taking the audience through our open web demos (video, canvas, svg, css, js etc. thank you Paul Rouget…), and then honed in on describing our Mozilla community, using localization as an example of how we are a global community of passionate contributors working to promote Mozilla’s mission.

    My call to action was two-fold: the blogging community can help promote the Open Web through their blogs, AND, if people feel empowered to do so, let’s start a localization for Filipino users.

    Feedback from these local bloggers was energetic, questions were poignant, and the message was embraced. My prediction: a Mozilla community here is going to take off if we continue to nurture, empower, and participate.

    I am trying to embed the presentation here, based on some code that Gen shared with me, but it looks like it is not working.

  • Firefox Mongolian Direct Outreach

    September 16th, 2009 by seth bindernagel with 7 comments »

    Over the past couple of Firefox releases, the Mozilla community has proudly shipped a Mongolian localization of Firefox. And, based on the blocklist pings that Firefox makes every day, we can estimate that we have between 10,000 and 20,000 active daily users in that locale. That’s a nice accomplishment by the Mongolian community!

    However, as we ramp up our efforts to localize Firefox 3.6, and a mobile Firefox, we have been reaching out to our “mn” community leader, Natsagdorj (Nagi) Shagdar, but have had no response to the emails we have sent.  I guess it’s an inevitability to have some turnover when a 100% volunteer community rallies together to ship Firefox in over 70 languages.  Building sustainable communities is critical to our ongoing success and something we take very seriously.

    Therefore, this is an open blog post to reconnect with our Mongolian team in order to make sure everything is OK and receive a status update on the work/team going forward.  It would be terrific to receive an email from Nagi or others to let me know how to proceed with work on the mn locale.

    This post serves a secondary purpose because we also would like to invite any others interested in joining the Mozilla Mongolian community to contact us.  We are looking for community members to help take up some of the localization effort so we don’t lose all that we have accomplished with the mn version.  Plus, we don’t want to let down thousands of our Mongolian users who will be looking for the latest and greatest when Firefox 3.6 comes out.

    If you have interest in joining the community or know of anyone who might help in some capacity (even with simple referrals), then contact me through the comment section of this blog.  We have a robust set of community members and tools that makes localization easy and fun.

    As a matter of fact, we are welcoming all newcomers, so just ping me.  Thanks, everyone!

  • Improving LOL

    August 28th, 2009 by seth bindernagel with 6 comments »

    One more post coming from l10n intern, Jeremy Hiatt.  The following word-for-word post describes his work to improve the format of LOL, making it more readable and understandable for the developers and localizers who might use it.

    ————————————-

    Today I gave a presentation to some of the guys from the platform team about the state of l20n. In my previous posts I’ve blogged about the advantages and drawbacks for each of the formats we’ve considered, and I got some more good feedback about that in today’s brownbag. After the talk, I chatted with fantasai about how we could improve the LOL format to make it more readable and understandable. She had some interesting ideas that I’d like to share.

    Dropping Angle Brackets

    First, she (and a few others) pointed out that angle brackets make LOL look like XML. However, this resemblance might be confusing since LOL is otherwise nothing like XML. The intention for angle brackets in LOL was to delimit entity definitions and give clear visual separation. These cues are helpful for our parser, especially when it comes to error recovery. In an error case, the parser can drop all tokens until it recognizes an opening bracket (<) and resume the parse. If you have suggestions for implementing effective error recovery if we do remove the brackets from the syntax, please leave them below.
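
    The recovery strategy described above can be sketched roughly as follows. This is only an illustration under a simplified grammar of the form <key: "value">; the function name parseEntities and the regular expression are assumptions, not the actual l20n parser.

```javascript
// Hypothetical sketch of error recovery for a LOL-like syntax.
// Assumes a simplified grammar: <key: "value">. Not the real l20n parser.
function parseEntities(source) {
  const entities = {};
  const errors = [];
  // Sticky flag (y): the pattern must match exactly at lastIndex.
  const entityRe = /<\s*([A-Za-z_]\w*)\s*:\s*"([^"]*)"\s*>/y;
  let pos = 0;
  while (pos < source.length) {
    // Drop everything until the next opening bracket.
    const start = source.indexOf("<", pos);
    if (start === -1) break;
    entityRe.lastIndex = start;
    const match = entityRe.exec(source);
    if (match) {
      entities[match[1]] = match[2];
      pos = entityRe.lastIndex;
    } else {
      // Malformed entity: record where it started, then resume
      // scanning at the following "<".
      errors.push(start);
      pos = start + 1;
    }
  }
  return { entities, errors };
}
```

    Without the brackets, the parser would need some other unambiguous token to resynchronize on, which is the open question raised above.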

    Encoding Properties

    Another potentially confusing aspect of LOL files is that an entity may have properties defined in addition to its value. Here’s our usual example of a noun with a specified gender:

    <appName: "Jägermeister"
     gender: "male">

    This can be disambiguated slightly with indentation, but fantasai noted that the current syntax does little to explain the difference between the first assignment (which is specifying appName itself), and the second assignment to the appName.gender property. She suggested a syntax that differentiates assigning the value from assigning properties: use ‘=’ for the first assignment, and curly braces to delimit additional properties. Here’s the same example from above:

    appName = "Jägermeister" {
        gender: "male" }

    In this format, LOL would look a lot like CSS.

    Indexing

    An entity that mentions a variable gendered noun may define different forms for different genders. For example:

    <complex[appName.gender]: {
        male: "Ein hübscher ${appName}s.",
        female: "Ein hübsches ${appName}s."}>

    In the current syntax, square brackets following the entity key denote the index used to select the proper form. The suggestion was to move that to the RHS of the assignment:

    complex = [appName.gender] {
        male: "Ein hübscher ${appName}s.",
        female: "Ein hübsches ${appName}s."}

    If you’re familiar with a switch statement in programming, you’ll probably notice that we basically adopted the standard syntax, but substituted square brackets for the switch( ) keyword.
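
    To make the switch analogy concrete, here is a minimal sketch of how a runtime might pick the right form. The in-memory layout (value, gender, index, forms) and the helper selectForm are assumptions made for illustration, not the real l20n data model.

```javascript
// Hypothetical in-memory form of the two entities; the field names
// (value, gender, index, forms) are assumptions made for illustration.
const entities = {
  appName: { value: "Jägermeister", gender: "male" },
  complex: {
    index: ["appName", "gender"],
    forms: {
      male: "Ein hübscher ${appName}s.",
      female: "Ein hübsches ${appName}s.",
    },
  },
};

// Pick the form of `key` named by its index expression, like a switch.
function selectForm(ctx, key) {
  const entity = ctx[key];
  const [indexKey, indexProp] = entity.index;
  const selector = ctx[indexKey][indexProp];
  if (!Object.prototype.hasOwnProperty.call(entity.forms, selector)) {
    throw new Error(`No form "${selector}" defined for "${key}"`);
  }
  return entity.forms[selector];
}
```

    The sketch returns the selected form with its placeholder untouched; substituting the placeholder is a separate step.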

    Objects with Multiple Attributes

    Objects in the UI, such as buttons, typically have a “label” and “accesskey” attribute but no canonical string value. This is subtly distinct from the cases above, where in the first case we wished to specify additional properties, and in the second the string value was resolved based on an external index. Example time:

    <button: {label: "Push me", accesskey: "p"}>

    In this case, it doesn’t make sense to refer to just “button”: you want either the label or the accesskey, which are available through the “.” accessor (e.g. button.label). To draw attention to this distinction, we could require a syntactic difference, or we could simply omit the index from the switch syntax above.

    Summary

    There are plenty more features of l20n that I’d love to put under the microscope here, but in the interest of focusing the discussion I’ll add them to a future post instead. As always, please share your opinion. You can also find us on IRC if you’re looking to start a lively debate; just look for me (jhiatt), Pike, and gandalf. Thanks to everyone from the brownbag today, and thanks especially to fantasai for taking the time to help us out!

  • Worldwide Lexicon and the Firefox Universal Translator add-on

    August 26th, 2009 by seth bindernagel with 3 comments »

    Asa passed me this Read Write Web article about the Worldwide Lexicon’s project, Firefox Universal Translator, which helps translate web pages automatically within the browsing experience. The tool enables project members to create, curate, and share translations.  Have you seen it and what do you think?  I’m curious to hear.

  • Compiling Localizable Objects into Native JavaScript

    August 25th, 2009 by seth bindernagel with 5 comments »

    As promised, here is the second post from Jeremy Hiatt’s work on our l20n project.  This is a word-for-word reposting of his essay about compiling localizable objects in native JS.

    ====================================

    One of the goals for my summer internship is to improve performance of l20n. The initial implementation was a parser written entirely in JavaScript that operated on .lol files. For more details about our choices for file formats, see my previous post. After some failed attempts to rework the parser’s use of regular expressions, which only regressed performance, I experimented with JSON as an alternative file format. The hope was that we could leverage the performance of Gecko’s built-in JSON parser to speed up l20n. We did see some tremendous improvements: on a large testcase constructed from browser.dtd, JSON cut our parsing time from ~140 milliseconds down to just a few ms. Unfortunately, we were still slow when it came to evaluating and displaying all those entities. We still had a big chunk of parsing left that we couldn’t outsource to JSON. Each string value in l20n may contain variable placeholders. Here’s an example (in JSON):

    "droponbookmarksbutton" : {
        "value" : "Drop a link to bookmark it"},
    
    "popupWarning" : {
        "value" : "${brandShortName}s prevented this site
                  from opening a pop-up window."}

    (Line breaks inserted for clarity.) The first string doesn’t use any variables, but the second does. In order to catch all these placeholders, we scanned each string with a regular expression to match the ${…}s syntax, even though many strings don’t use any variables. That translated to a linear traversal of every single string before it could be returned, costing us a lot of time. In tests conducted in the xpcshell, rendering all the elements from browser.properties took roughly 40ms. In comparison, the current framework for properties files can parse and display all the elements in under 20ms. Since we can’t afford to regress overall performance, that meant we still had work to do to get faster.
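
    For illustration, the per-string scan described above might look something like this. It is a sketch only: the name expand and the vars lookup table are hypothetical, and the pattern simply follows the ${…}s syntax shown in the example.

```javascript
// Matches the ${name}s placeholder syntax from the JSON example.
const PLACEHOLDER = /\$\{(\w+)\}s/g;

// Hypothetical runtime expansion. Every string passes through the
// regular expression, including strings that contain no placeholders,
// which is the linear cost described above.
function expand(value, vars) {
  return value.replace(PLACEHOLDER, (match, name) =>
    Object.prototype.hasOwnProperty.call(vars, name) ? vars[name] : match
  );
}
```

    Called on "Drop a link to bookmark it", expand returns the string unchanged, but only after scanning all of it.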

    One way to eliminate checking every single string is to add extra information to the encoding for strings. Many languages define different behavior for single- vs. double-quoted strings, performing replacements in one but not the other. We could also have added a special flag to indicate simple (no replacements) vs. complex strings. Either of these approaches would have added further complexity to the localization process, so we did not seriously consider them.

    Instead, on the advice of the brilliant Staś Małolepszy, we embarked on an experiment to compile our l20n objects into native JavaScript. As a result, we saw another impressive performance jump. In an xpcshell test, we can load and display all of browser.properties in roughly 4ms (an order of magnitude improvement!). Here’s what our previous example looks like as compiled JavaScript:

    this.droponbookmarksbutton="Drop a link to bookmark it";
    this.__defineGetter__("popupWarning",
      function() { return "" + (brandShortName) +
        " prevented this site from opening a pop-up window.";});

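    A compile step that produces output along these lines could be sketched as below. This is an assumption-laden illustration, not the l20n compiler: the names compileEntity, compile and inflate are made up, the sketch uses Object.defineProperty instead of __defineGetter__, and it resolves placeholders as properties of the same context object rather than as free variables.

```javascript
const PLACEHOLDER_RE = /\$\{(\w+)\}s/g;

// Compile one entity into a JavaScript statement (as source text).
function compileEntity(key, value) {
  // Plain strings become simple assignments: no scanning at runtime.
  if (!value.includes("${")) {
    return `this[${JSON.stringify(key)}] = ${JSON.stringify(value)};`;
  }
  // Strings with placeholders become getters that concatenate
  // literal pieces with lookups on the context.
  const parts = [];
  let last = 0;
  for (const m of value.matchAll(PLACEHOLDER_RE)) {
    parts.push(JSON.stringify(value.slice(last, m.index)));
    parts.push(`this[${JSON.stringify(m[1])}]`);
    last = m.index + m[0].length;
  }
  parts.push(JSON.stringify(value.slice(last)));
  return (
    `Object.defineProperty(this, ${JSON.stringify(key)}, ` +
    `{ get() { return ${parts.join(" + ")}; }, enumerable: true });`
  );
}

// Compile a whole { key: string } map into one block of source text.
function compile(entities) {
  return Object.entries(entities)
    .map(([key, value]) => compileEntity(key, value))
    .join("\n");
}

// Inflate compiled source into a context object.
function inflate(source, context = {}) {
  new Function(source).call(context);
  return context;
}
```

    The placeholder scan happens once, at compile time. Because getters are evaluated lazily, the order in which entities are defined does not matter.
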
    Another great thing about compilation is that our runtime performance doesn’t depend on our choice of source file format. Here’s a diagram showing the different ways an l20n file can get inflated into a localization context:

    [Figure: l20n compilation scheme. Inflating l20n source into a context.]

    The performance numbers were collected using nsITimelineService in the xpcshell. The l20n runtime infrastructure can inflate a source file directly into a context, or it can load compiled JavaScript definitions for a significant performance boost. For comparison, here’s a diagram of Mozilla’s current l10n scheme:

    [Figure: Current l10n scheme.]

    Again, this time was measured in the xpcshell when loading the browser.properties string bundle. It’s not necessarily representative of performance for DTD files as well. As we can see, compilation now guarantees at least comparable performance to the current approach, no matter what file format we end up using. If you’d like to weigh in on that debate, please leave a comment on my previous post! And finally, we are also working on l20n support in Silme so that it will be easy to migrate existing DTD/.properties files to our new l20n format.

    [Figure: Intercompatibility with Silme.]

    Silme will serve as a critical compatibility layer to ensure a smooth transition to our new l10n framework. Please let me know if you have any questions or comments!