{"id":60,"date":"2007-10-30T03:01:26","date_gmt":"2007-10-30T10:01:26","guid":{"rendered":"http:\/\/blog.mozilla.org\/axel\/2007\/10\/30\/firefox-2-glossary\/"},"modified":"2007-12-09T13:19:34","modified_gmt":"2007-12-09T20:19:34","slug":"firefox-2-glossary","status":"publish","type":"post","link":"https:\/\/blog.mozilla.org\/axel\/2007\/10\/30\/firefox-2-glossary\/","title":{"rendered":"Firefox 2 glossary"},"content":{"rendered":"<p>As I have <a href=\"http:\/\/blog.mozilla.org\/axel\/2007\/10\/17\/localization-is-hard-the-math-way\/\">blogged before<\/a>, I&#8217;m trying to create a <a href=\"http:\/\/l10n.mozilla.org\/~axel\/glossary\/\">glossary for Firefox 2<\/a>. I&#8217;m leaving out all the gory details, but it&#8217;s been a hard fight between me trying to be clever and being dumb. I&#8217;m still not happy with the code that creates the data, but still, I think this iteration generates output that looks good enough to share and to get busted by others.<\/p>\n<p>With all the grumpiness about my code, I&#8217;m pretty happy that I didn&#8217;t have to do the web part, so thanks to <a href=\"http:\/\/shaver.off.net\/diary\">shaver<\/a> for poking <a href=\"http:\/\/www.allpeers.com\/blog\/2007\/10\/14\/new-mozpad-api-project-statistics-online\/\">plasticmillion<\/a> about <a href=\"http:\/\/simile.mit.edu\/exhibit\/\">exhibit<\/a>. That went pretty slick, up to a site of my standards. Read: with full visual suckage. I did find a <a href=\"http:\/\/simile.mit.edu\/issues\/browse\/EXHIBIT-239\">bug in exhibit<\/a>, though.<\/p>\n<p>The dataset I have now is the result of a series of iterations on how to find phrases, and by now it should be a nice set of educated guesses. I&#8217;m probably wrong about some, so if you find a string that shouldn&#8217;t be there, or should be and isn&#8217;t, that&#8217;s likely a bug.<\/p>\n<p>Ok, beef. Here&#8217;s the <a href=\"http:\/\/l10n.mozilla.org\/~axel\/glossary\/\">link<\/a>. 
It should have all phrases that appear in Firefox more than once, sortable by length and occurrence, but not sequences that are just filler words. I have a short blacklist for the latter. You can click on each phrase, and it will open an mxr search for it on the MOZILLA_1_8_BRANCH in all localizable files. I didn&#8217;t file the mxr bugs, nor have I tried to work around them yet; searching for &#8216;foo&#8217; including ticks doesn&#8217;t work, at least. And of course, you can search the glossary, it&#8217;d suck otherwise, right? Thanks to exhibit, that was easy.<\/p>\n<p>Before going further into this, is this something worthwhile for the l10n community? Other RFEs? Right now, the source data is in sqlite, which I intend to share if anyone&#8217;s interested, though the database schema and the way I create it need work.<\/p>\n<p>And before you ask, yes, I tried to run it on Thunderbird, too, but it seemed to make the story harder and dominate the results. I guess it&#8217;d be better to just create two separate apps. I&#8217;m having perf problems with the code, too, so I wasn&#8217;t too keen to do more than initially necessary. I don&#8217;t index security\/manager, because the localizable files in there are just yucky, and I didn&#8217;t want to special case for stuff like # being replaced by a newline.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>As I have blogged before, I&#8217;m trying to create a glossary for Firefox 2. I&#8217;m leaving out all the gory details, but it&#8217;s been a hard fight between me trying to be clever and being dumb. 
I&#8217;m still not happy with the code that creates the data, but still, I think this iteration generates [&hellip;]<\/p>\n","protected":false},"author":17,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[7],"tags":[],"_links":{"self":[{"href":"https:\/\/blog.mozilla.org\/axel\/wp-json\/wp\/v2\/posts\/60"}],"collection":[{"href":"https:\/\/blog.mozilla.org\/axel\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blog.mozilla.org\/axel\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blog.mozilla.org\/axel\/wp-json\/wp\/v2\/users\/17"}],"replies":[{"embeddable":true,"href":"https:\/\/blog.mozilla.org\/axel\/wp-json\/wp\/v2\/comments?post=60"}],"version-history":[{"count":0,"href":"https:\/\/blog.mozilla.org\/axel\/wp-json\/wp\/v2\/posts\/60\/revisions"}],"wp:attachment":[{"href":"https:\/\/blog.mozilla.org\/axel\/wp-json\/wp\/v2\/media?parent=60"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blog.mozilla.org\/axel\/wp-json\/wp\/v2\/categories?post=60"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blog.mozilla.org\/axel\/wp-json\/wp\/v2\/tags?post=60"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}