But this flexibility can have cause some problems in term management for busy translators in some pairs. It's nice that UK English terminology will display in my project with US English set as the target language. But when I want to export a simple list of all English and all German words in a termbase, for example - and there are synonyms in the entry as well, the resultant delimited output is very confusing for many people. And if you decide that you want to move all the terms to "generic" English, things can get really messy.
I faced exactly that problem just last week. I had been maintaining the termbase for an end client for about a year, starting with a rather large in-house terminology I was given in an Excel file. I imported it to memoQ as generic German and started working with the target language set to generic English. After a while the customer pointed out that UK English was their "standard". I swallowed hard and set my template project to UK English, pointing out that they would still be getting English with a rather American character from me no matter what I did with the spellchecker. Then later when I found out about the bug in SDL Trados where XLIFF files are difficult to import if the sublanguages are not specified, I set the source language of that customer's project to German (Germany).
After that they opened a subsidiary in the US and suddenly my native language variant was the new "standard". Ha, ha. I now had a term base with five languages: two flavors of German and three flavors of English. Consolidation time!
But how does one do this? memoQ itself is unhelpful in this regard; it barely offers any management features for terminology, much less tricks like merging sublanguages.
It's not that hard, really, and the steps depend on what data you actually keep in the termbase. If you never enter definitions for your terms things are pretty easy.
To start consolidating your terms by combining the sublanguages, first export a fully specified delimited text file. That's a "CSV" file with all the defaults.
Let's have a look at how that file is structured:
A given language or sublanguage can have any number of terms, which are treated as synonyms. Why are they synonyms? Because in the term structure they share the same definition. If you have three language variants for English like I did, you can have three definitions. The definition fields are the first problem source.
Each entry for a language or sublanguage has three fields: the actual term (word, phrase), an example (which is what you wrote in the "usage" field), and "term info" which is a bunch of other meta data for capitalization rules, matching, gender, forbidden status, part of speech, etc.
If I have definitions that I want to keep, there is a little more complication to avoid losing data. My quick and dirty solution was to create a temporary column for each major language in which I combine all the definitions for the language's variants. I do this by using Excel's concatenation function, something like this:
In my case N was the definition column for English, U the definitions for UK English and Y for US English.
I renamed my merge columns as English_Def and German_Def and deleted the names of the original definition columns then saved the data as Unicode text (UTF-8, the memoQ default to avoid potential problems with mapping characters on re-import of the data to memoQ). After import, a quick look at my test data confirmed that no data was lost and all the language variants were combined as one major language:
Obviously a little editing is still desirable - entry #3 shows some duplication because two English variants contained the same term; my sloppy conditional statement also left a few leading slashes for cases where the first definition column was empty and a later one for another variant of the same language was not. But that's not a big deal; one should always have a careful look at data after doing something like this.