Nov 26, 2018

memoQ as an instrument of bowdlerization

Recently in a memoQ user forum, someone asked if the environment could be used to ensure that certain words would be kept out of a target text. 

The question wasn't clearly understood at first. Some thought the asker wanted to ensure that certain words were not translated and suggested the use of non-translatable lists; others pointed out that memoQ term bases have an option to mark certain term translations as forbidden, which are then shown in black in the Translation Results list, for example:

But no... what was wanted was indeed a monolingual list to ensure that the target text did not contain certain words, regardless of what the source text said.

This is indeed possible in memoQ:

The red box in the screenshot above marks a word on such a forbidden list, which is in an English to English (US) revision project I set up to clean up Mark Twain's 1601 and make it fit for teaching in Sunday school. The presence of a no-no is indicated by a little lightning bolt icon for a QA warning. Actually running a QA check for terminology gave the following result, a list of forbidden expressions (with some alternatives suggested):

How is this done? With an ordinary memoQ term base. One can make a term entry only for a single language - the target language in this instance - and mark the Forbidden term checkbox on the Usage tab.

If there is no source term (or, as in my example above, the source and target languages are variants of the same language), nothing will be shown in the Translation Results list. However, if a term base entry marked as forbidden is found on the target side, a warning will be displayed in the translation and editing grid, provided that the QA profile currently selected for the project includes terminology. (If the QA profile does not include appropriate term checking, no warning will be displayed for segments in the grid, nor will the forbidden words be indicated when QA is run. The Default QA profile does include term checking, but I use a lot of different profiles focused on fewer issues, such as just tag checking, so I have to pay attention to this detail.)
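To make the idea of a target-only check concrete, here is a minimal sketch in Python of what such a monolingual check amounts to. It is not how memoQ implements its QA check; the function name `find_forbidden` and the sample words are made up for illustration, and the matching is a plain whole-word, case-insensitive search that ignores inflected forms.

```python
import re


def find_forbidden(target_text, forbidden_terms):
    """Return the forbidden terms found in target_text, ordered by first appearance.

    Hypothetical helper for illustration only; memoQ's own term matching
    is configured in the term base and QA profile, not reproduced here.
    """
    hits = []
    for term in forbidden_terms:
        # Whole-word match so that "heck" is not reported inside "Checkmate".
        pattern = r"\b" + re.escape(term) + r"\b"
        match = re.search(pattern, target_text, flags=re.IGNORECASE)
        if match:
            hits.append((match.start(), term))
    # Sort by position of the first occurrence in the text.
    return [term for _, term in sorted(hits)]


if __name__ == "__main__":
    segment = "Well, darn it all to heck."
    print(find_forbidden(segment, ["heck", "darn", "blast"]))
```

Note that the source text plays no part in the check, which is exactly the point of the monolingual list.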

Building such a list word-by-word with manual entries is tedious. So it's probably easier to import your monolingual list of words to avoid from a text file or an Excel sheet, and then in the memoQ Term Base Editor, select a range of terms (like all of them) and set the desired forbidden status for the entire selection:

Bulk changes to any term properties are possible this way as I showed some time ago in a short video tutorial.
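If the words to avoid start out as a rough plain-text list, a little cleanup before the import can save time. The following is only a sketch under stated assumptions: the function name `wordlist_to_csv` is hypothetical, and the single-column layout with a language name in the header row is my assumption about a convenient import file, not a documented memoQ requirement. The column still has to be assigned to the right language in the import dialog, and the forbidden status is still set afterward in the Term Base Editor as described above.

```python
import csv
import io


def wordlist_to_csv(lines, language_header="English"):
    """Turn raw word-list lines into single-column CSV text.

    Blank lines are dropped and duplicates (ignoring case) are removed,
    keeping the first spelling seen. The header value is an assumption;
    adjust it to whatever your import mapping expects.
    """
    terms = []
    seen = set()
    for line in lines:
        term = line.strip()
        key = term.casefold()
        if term and key not in seen:
            seen.add(key)
            terms.append(term)

    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow([language_header])
    for term in terms:
        writer.writerow([term])
    return buffer.getvalue()


if __name__ == "__main__":
    raw = ["darn\n", "", " Heck ", "DARN"]
    print(wordlist_to_csv(raw))
```

The returned text can be written to a file with any name you like and then imported into a new or existing term base.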

The critical setting for the scenario described here is marked with a red box in the QA profile below:

As you can see, the source text can also be checked for forbidden expressions if that option is also selected.

I have created a dedicated term base to track forbidden expressions in three languages. Lists of forbidden expressions, monolingual or otherwise, can be maintained in one or more dedicated term bases, or the expressions can be kept in an ordinary term base alongside other terms. If barring certain common expressions in one or more languages is important to you, a dedicated term base is probably the more convenient choice.

And for recent versions of memoQ (8.4 and later), make sure that the term bases you want to use for quality assurance are marked for QA on the Term bases page of Project home. It's not a good idea to use your larger translation term bases for QA here, because they may produce rather large numbers of false positives. The optimum term property settings for translation and for quality assurance are often not the same.

Thus memoQ can be used as a powerful tool to avoid embarrassment from an unfortunate choice of words and to adapt the target language to fit a particular audience better. Thinking back to the time, years ago, when a friend who ran the translation department at a conservative German company nearly got fired for writing that a certain software operation could be performed at the touch of a penis (a Freudian slip after he and some other translators were joking about how sick they were of a certain phrase in the user documentation) and remembering the sensitivity to terms I have seen with some clients (at the same software company, the term FAQ was banned, because executive management was afraid that it might be pronounced like fuck), I can see how this somewhat unusual approach to terminology in memoQ could be a job-saver for some.


  1. Interesting post! Funnily enough, we've only recently looked at this issue as well, though given our target languages, our choice solution has been to use regex rules in the QA check to more easily track down inflections, plus allow a more finely grained filter (providing different warnings for mild and coarse language).

  2. Interesting post. I've been wondering if it's also possible to hack memoQ to allow only term-base approved words in the MT output. This would reduce the synonym salad produced by MT. First feeding the term base and then using QA to pick out unwanted words takes longer than doing it manually for shorter texts. An alternative would be to run the QA on the MT entries before you put them in the target segment.
    A feature of this kind has been promised for a while under the name Adaptive MT, but I haven't seen any working implementations yet.

