{"id":426,"date":"2012-07-16T21:08:53","date_gmt":"2012-07-16T21:08:53","guid":{"rendered":"http:\/\/blog.mozilla.org\/l10n\/?p=426"},"modified":"2012-07-16T21:08:53","modified_gmt":"2012-07-16T21:08:53","slug":"l20n-features-explained-dom-overlays","status":"publish","type":"post","link":"https:\/\/blog.mozilla.org\/l10n\/2012\/07\/16\/l20n-features-explained-dom-overlays\/","title":{"rendered":"L20n features explained. DOM overlays."},"content":{"rendered":"<p><em>(This is a crosspost from my blog: <a href=\"http:\/\/informationisart.com\/8\/\">http:\/\/informationisart.com\/8\/<\/a>.\u00a0 Check it out for better code formatting and syntax highlighting.)<\/em><\/p>\n<p><strong>With L20n&#8217;s DOM overlays, developers can amend localized strings with additional non-localizable HTML markup. This improves the separation of content and structure and reduces the cruft in localization files.<\/strong><\/p>\n<p>When it comes to localizing web content, you\u2019re likely to end up with a lot of HTML inside of your source strings. You might see markup like <code>&lt;strong&gt;<\/code> and <code>&lt;em&gt;<\/code>, or simply links to other resources with <code>&lt;a&gt;<\/code> tags.<\/p>\n<p>Consider the following paragraph taken from <a href=\"http:\/\/www.mozilla.org\">www.mozilla.org<\/a>.<\/p>\n<blockquote><p>Portions of this content are \u00a91998\u20132012 by individual mozilla.org contributors. Content available under a <a href=\"\/foundation\/licensing\/website-content.html\">Creative Commons license<\/a>.<\/p><\/blockquote>\n<p>The HTML code for this paragraph is this:<\/p>\n<div class=\"highlight\">\n<pre><code class=\"html\"> <span class=\"nt\">&lt;p&gt;<\/span> Portions of this content are \u00a91998\u20132012 by individual mozilla.org contributors. 
Content available under a <span class=\"nt\">&lt;a<\/span> <span class=\"na\">href=<\/span><span class=\"s\">\"\/foundation\/licensing\/website-content.html\"<\/span><span class=\"nt\">&gt;<\/span>Creative Commons license<span class=\"nt\">&lt;\/a&gt;<\/span>. <span class=\"nt\">&lt;\/p&gt;<\/span> <\/code><\/pre>\n<\/div>\n<p>You\u2019ll notice the <code>&lt;a&gt;<\/code> tag with an <code>href<\/code> attribute. The <code>href<\/code> is a URL, and it makes this HTML significantly harder to read.<\/p>\n<p>If we wanted to localize this paragraph, the L20n code for English would look like this:<\/p>\n<div class=\"highlight\">\n<pre><code class=\"clojure\"> <span class=\"nv\">&lt;licenseInfo<\/span> <span class=\"s\">\"\"\"<\/span> <span class=\"s\"> Portions of this content are \u00a91998\u20132012 by individual <\/span> <span class=\"s\"> mozilla.org contributors. Content available under a <\/span> <span class=\"s\"> &lt;a href=\"<\/span><span class=\"nv\">\/foundation\/licensing\/website-content<\/span><span class=\"o\">.<\/span><span class=\"nv\">html<\/span><span class=\"s\">\"&gt;Creative <\/span> <span class=\"s\"> Commons license&lt;\/a&gt;.<\/span> <span class=\"s\"> \"\"\"<\/span><span class=\"nv\">&gt;<\/span> <\/code><\/pre>\n<\/div>\n<p>The URL will always be <code>\/foundation\/licensing\/website-content.html<\/code>, regardless of the user\u2019s locale. It makes little sense, then, to have it in the source string. The tag makes the string harder to read and increases the risk of introducing an error (e.g., removing a quotation mark or accidentally editing the URL).<\/p>\n<p>In fact, the <code>href<\/code> attribute is part of the document\u2019s structure rather than its source content, and as such, does not belong in the L20n code at all.<\/p>\n<p>What if L20n let you skip attributes that are not related to the source content? 
What if it copied those attributes from the developer-defined code, thus sparing the localizer all the trouble?<\/p>\n<h2 id=\"enter_dom_overlays\">Enter DOM overlays<\/h2>\n<p>The premise is simple: only localizable source content should live in L20n files. Let\u2019s modify the HTML code and the <code>licenseInfo<\/code> string accordingly.<\/p>\n<div class=\"highlight\">\n<pre><code class=\"html\"> <span class=\"nt\">&lt;p<\/span> <span class=\"na\">l10n-id=<\/span><span class=\"s\">\"licenseInfo\"<\/span><span class=\"nt\">&gt;<\/span> <span class=\"nt\">&lt;a<\/span> <span class=\"na\">href=<\/span><span class=\"s\">\"\/foundation\/licensing\/website-content.html\"<\/span><span class=\"nt\">&gt;&lt;\/a&gt;<\/span> <span class=\"nt\">&lt;\/p&gt;<\/span> <\/code><\/pre>\n<\/div>\n<p>The actual content, both source &amp; target, will be injected with L20n code. This way the developer doesn\u2019t have to (although they can) put it in HTML. All that matters is the <code>l10n-id=\"licenseInfo\"<\/code> part, as well as the <code>a<\/code> tag with the <code>href<\/code> attribute defined in HTML. All the rest happens in the L20n file.<\/p>\n<div class=\"highlight\">\n<pre><code class=\"clojure\"> <span class=\"nv\">&lt;licenseInfo<\/span> <span class=\"s\">\"\"\"<\/span> <span class=\"s\"> Portions of this content are \u00a91998\u20132012 by individual <\/span> <span class=\"s\"> mozilla.org contributors. Content available under a <\/span> <span class=\"s\"> &lt;a&gt;Creative Commons license&lt;\/a&gt;.<\/span> <span class=\"s\"> \"\"\"<\/span><span class=\"nv\">&gt;<\/span> <\/code><\/pre>\n<\/div>\n<p>We keep the <code>&lt;a&gt;<\/code> tag in the L20n file so that the localizer has control over what is linked and what is not. However, the attributes of the tag are not localizable and thus, are absent in the string. 
The string is easier to read and harder to break accidentally.<\/p>\n<h2 id=\"matching_and_reordering_multiple_overlays\">Matching and reordering multiple overlays<\/h2>\n<p>L20n\u2019s DOM overlays match HTML nodes by type, name and position. If <code>licenseInfo<\/code> had two <code>&lt;a&gt;<\/code> child nodes, their attributes would be matched and copied from the source string in their respective order.<\/p>\n<p>Consider the following example.<\/p>\n<blockquote><p>Welcome to <a href=\"\/\">Pancake<\/a>, <a href=\"\/profile\">Sta\u015b<\/a>.<\/p><\/blockquote>\n<p>The HTML and L20n code responsible for this message might look like this:<\/p>\n<div class=\"highlight\">\n<pre><code class=\"html\"> <span class=\"nt\">&lt;p<\/span> <span class=\"na\">l10n-id=<\/span><span class=\"s\">\"welcome\"<\/span><span class=\"nt\">&gt;<\/span> <span class=\"nt\">&lt;a<\/span> <span class=\"na\">href=<\/span><span class=\"s\">\"\/\"<\/span><span class=\"nt\">&gt;&lt;\/a&gt;<\/span> <span class=\"nt\">&lt;a<\/span> <span class=\"na\">href=<\/span><span class=\"s\">\"\/profile\"<\/span><span class=\"nt\">&gt;&lt;\/a&gt;<\/span> <span class=\"nt\">&lt;\/p&gt;<\/span> <\/code><\/pre>\n<\/div>\n<div class=\"highlight\">\n<pre><code class=\"clojure\"> <span class=\"nv\">&lt;welcome<\/span> <span class=\"s\">\"\"\"<\/span> <span class=\"s\"> Welcome to &lt;a&gt;{{ brandName }}&lt;\/a&gt;, &lt;a&gt;{{ $user.firstname }}&lt;\/a&gt;.<\/span> <span class=\"s\"> \"\"\"<\/span><span class=\"nv\">&gt;<\/span> <\/code><\/pre>\n<\/div>\n<p>The <code>&lt;a&gt;<\/code> elements are in the same order in the source code and in the L20n code. L20n will thus copy the <code>href<\/code> attribute from the first <code>&lt;a&gt;<\/code> element in the source code to the first <code>&lt;a&gt;<\/code> element in the L20n code.<\/p>\n<p>Let\u2019s suppose now that the localizer wishes to change the order of the links. 
Maybe the grammar requires her to do so, or maybe the register is more (or less) formal in her locale. The expected result would be (translated back to English for the sake of this example) this:<\/p>\n<blockquote><p>Hi <a href=\"\/profile\">Sta\u015b<\/a>. Welcome to <a href=\"\/\">Pancake<\/a>.<\/p><\/blockquote>\n<p>Because the order is different, the localizer cannot rely on L20n\u2019s automatching any more. Instead, she needs to instruct L20n which <code>&lt;a&gt;<\/code> element corresponds to which one in the source. L20n allows her to do so via the <code>l10n-path<\/code> attribute set on the element, like so:<\/p>\n<div class=\"highlight\">\n<pre><code class=\"clojure\"> <span class=\"nv\">&lt;welcome<\/span> <span class=\"s\">\"\"\"<\/span> <span class=\"s\"> Hi &lt;a l10n-path=\"<\/span><span class=\"nv\">a<\/span><span class=\"p\">[<\/span><span class=\"mi\">2<\/span><span class=\"p\">]<\/span><span class=\"s\">\"&gt;{{ $user.firstname }}&lt;\/a&gt;.<\/span> <span class=\"s\"> Welcome to &lt;a l10n-path=\"<\/span><span class=\"nv\">a<\/span><span class=\"p\">[<\/span><span class=\"mi\">1<\/span><span class=\"p\">]<\/span><span class=\"s\">\"&gt;{{ brandName }}&lt;\/a&gt;.<\/span> <span class=\"s\"> \"\"\"<\/span><span class=\"nv\">&gt;<\/span> <\/code><\/pre>\n<\/div>\n<p>Using the <a href=\"https:\/\/developer.mozilla.org\/en\/XPath\">XPath<\/a> syntax, the localizer identifies which source <code>a<\/code> element to copy attributes from. The first <code>&lt;a&gt;<\/code> element in the <em>translation<\/em> will be matched against the second <code>&lt;a&gt;<\/code> element in the <em>source<\/em>. 
The meaning of <code>a[2]<\/code> in XPath is:<\/p>\n<blockquote><p>the second child (descendant of the first generation) of the context node that is an <code>&lt;a&gt;<\/code> element.<\/p><\/blockquote>\n<p>(The context node is the source node that\u2019s being localized, in this example the <code>&lt;p&gt;<\/code> element with <code>l10n-id=\"welcome\"<\/code>.)<\/p>\n<p>In most cases, the XPath expression will be very basic and minimal, as in the examples above. The full XPath syntax is supported, however, allowing for more complex matching.<\/p>\n<p>Lastly, the <code>l10n-path<\/code> attribute is only required when changing the order of elements of the same type, like two <code>a<\/code> elements. If you want to change the order of child nodes in a string with one <code>&lt;strong&gt;<\/code> and one <code>&lt;em&gt;<\/code> tag, you can do so without having to specify the <code>l10n-path<\/code> attributes.<\/p>\n<h2 id=\"privileges_and_autoextraction\">Privileges and autoextraction<\/h2>\n<p>The next step is to see if there\u2019s a need to prevent some attributes from being copied. It might be interesting to extend DOM overlays with a mechanism which only accepts whitelisted attributes, or blocks blacklisted ones from being copied from the source strings to the translation. This could be done globally, or even per-entity.<\/p>\n<p>I also started working on a maintenance script which extracts the contents of source nodes and automatically creates valid L20n code ready to be localized. It supports whitelisting attributes, but generally leaves most of the attributes out of the L20n code. 
You can find the code on <a href=\"https:\/\/github.com\/stasm\/l20n\/tree\/extract\">GitHub<\/a>, but bear in mind that this was more of an experiment and is very much a work-in-progress.<\/p>\n<h2 id=\"discussion\">Discussion<\/h2>\n<p>Please post your thoughts in the <a href=\"https:\/\/groups.google.com\/forum\/#!forum\/mozilla.dev.l10n\">mozilla.dev.l10n<\/a> newsgroup.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>(This is a crosspost from my blog: http:\/\/informationisart.com\/8\/.\u00a0 Check it out for better code formatting and syntax highlighting.) With L20n&#8217;s DOM overlays, developers can amend localized strings with additional non-localizable &hellip; <a class=\"go\" href=\"https:\/\/blog.mozilla.org\/l10n\/2012\/07\/16\/l20n-features-explained-dom-overlays\/\">Read more<\/a><\/p>\n","protected":false},"author":104,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[503],"tags":[],"_links":{"self":[{"href":"https:\/\/blog.mozilla.org\/l10n\/wp-json\/wp\/v2\/posts\/426"}],"collection":[{"href":"https:\/\/blog.mozilla.org\/l10n\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blog.mozilla.org\/l10n\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blog.mozilla.org\/l10n\/wp-json\/wp\/v2\/users\/104"}],"replies":[{"embeddable":true,"href":"https:\/\/blog.mozilla.org\/l10n\/wp-json\/wp\/v2\/comments?post=426"}],"version-history":[{"count":0,"href":"https:\/\/blog.mozilla.org\/l10n\/wp-json\/wp\/v2\/posts\/426\/revisions"}],"wp:attachment":[{"href":"https:\/\/blog.mozilla.org\/l10n\/wp-json\/wp\/v2\/media?parent=426"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blog.mozilla.org\/l10n\/wp-json\/wp\/v2\/categories?post=426"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blog.mozilla.org\/l10n\/wp-json\/wp\/v2\/tags?post=426"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}