What’s a glossary term?

I’m hacking on some tool that indexes the localizable strings in our apps.

One spin-off could be a glossary tool, i.e., one that identifies which terms in Firefox, Thunderbird, etc. localizers should bother to translate consistently.

Which raises an interesting question: where do you draw the line? What’s a good metric to use to define a glossary? Are there glossary-based applications that don’t need a cut-off at all?

Insights welcome.

The Conversation {4 comments}

  1. smo {Saturday January 28, 2012 @ 5:50 am}

    hi Axel:

    A good start would be a compilation of single-word nouns and verbs (actually and/or, as in the case of “File”). That should carry it a long way.

    See such a compilation by the Slovenian Ubuntu community here:
    https://wiki.lugos.si/slovenjenje:pojmovnik

    The most appropriate container for this kind of information would be the TMX format, allowing a single-file, multilingual solution.

    Regards

    smo
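
To make the single-file, multilingual idea concrete, here is a minimal sketch that writes one glossary entry as TMX using Python's standard library. The tool name in the header, the `build_tmx` helper, and the sample entry are assumptions for illustration only; a real export would come from the string index.

```python
import xml.etree.ElementTree as ET

# ElementTree serializes this namespaced name as the "xml:lang" attribute.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Illustrative entry only; real entries would come from the indexed strings.
GLOSSARY = [{"en": "File", "de": "Datei", "fr": "Fichier"}]


def build_tmx(entries, source_lang="en"):
    """Build a TMX 1.4 tree with one translation unit per glossary entry."""
    tmx = ET.Element("tmx", version="1.4")
    ET.SubElement(tmx, "header", {
        "creationtool": "glossary-sketch",  # hypothetical tool name
        "creationtoolversion": "0.1",
        "segtype": "phrase",
        "o-tmf": "none",
        "adminlang": "en",
        "srclang": source_lang,
        "datatype": "plaintext",
    })
    body = ET.SubElement(tmx, "body")
    for entry in entries:
        tu = ET.SubElement(body, "tu")
        # One <tuv> per language, each holding the term in a <seg>.
        for lang, text in entry.items():
            tuv = ET.SubElement(tu, "tuv", {XML_LANG: lang})
            ET.SubElement(tuv, "seg").text = text
    return tmx


tmx_text = ET.tostring(build_tmx(GLOSSARY), encoding="unicode")
print(tmx_text)
```

Each `<tu>` groups the same term across languages, so adding a locale means adding one more `<tuv>` to the existing units rather than a new file.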

  2. Caspy7 {Saturday January 28, 2012 @ 7:26 am}

    Seems like indexing everything and then counting how often each term occurs would be the place to start… or I’m not fully understanding the concept…
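
The index-then-count idea could look roughly like the sketch below. The sample strings and the `term_frequencies` helper are made up for illustration, and splitting on runs of letters is an assumed, deliberately naive tokenizer.

```python
import re
from collections import Counter

# Hypothetical sample of localizable strings; a real index would hold far more.
STRINGS = [
    "Open File",
    "Save File As",
    "Bookmark This Page",
    "Show All Bookmarks",
    "Open Link in New Tab",
]


def term_frequencies(strings):
    """Count how often each lowercased word occurs across all strings."""
    counts = Counter()
    for text in strings:
        # Runs of letters only: digits, underscores and punctuation are skipped.
        counts.update(re.findall(r"[^\W\d_]+", text.lower()))
    return counts


frequencies = term_frequencies(STRINGS)
for term, count in frequencies.most_common(3):
    print(term, count)
```

Note that this counts "bookmark" and "bookmarks" separately; merging inflected forms would need stemming or lemmatization, which the sketch leaves out.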

  3. Eloi {Friday February 3, 2012 @ 8:37 am}

    Perhaps Caspy is right: you could use AntConc or a similar tool and subtract articles, prepositions and any list of unwanted words you can make up along the way. However, the process will still be highly manual, and sometimes frequency is not the best filter: some key terms appear just a few times compared to many other words like open, file or whatever. I think people have been working on this for years and there is still no solution but to hire a full-time terminologist, yay :D
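
A small sketch of the subtraction step, and of why a frequency cut-off alone can hurt. The counts, the stop list, and the `candidates` helper are all hypothetical; the numbers are invented purely to show the effect.

```python
from collections import Counter

# Hypothetical counts, as a frequency index might produce them.
COUNTS = Counter({
    "the": 120, "open": 45, "file": 40, "in": 38, "bookmark": 6, "sync": 2,
})

# Hand-maintained stop list (an assumption; it grows as results are reviewed).
STOP_WORDS = {"the", "in", "of", "a", "to"}


def candidates(counts, stop_words, min_count=1):
    """Drop stop words and terms below min_count; most frequent first."""
    kept = {
        term: n
        for term, n in counts.items()
        if term not in stop_words and n >= min_count
    }
    # Sort by descending count, then alphabetically to keep ties stable.
    return sorted(kept, key=lambda term: (-kept[term], term))


print(candidates(COUNTS, STOP_WORDS))
print(candidates(COUNTS, STOP_WORDS, min_count=5))
```

With `min_count=5` the rare term "sync" drops out even though it is arguably a key term, which is the weakness of a pure frequency filter described above.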

  4. Philippe {Tuesday February 7, 2012 @ 3:14 am}

    Do you need something like this:
    http://www.frenchmozilla.org/transvision/index.php?recherche=bookmarks&locale=de&repo=release&t2t=t2t

    For now it’s a quick glossary based just on frequency; I plan to add more heuristic rules and similarity matching in the near future.
