Apr 3, 2018

Dealing with tagged translatable text in memoQ

Lately I've been doing a bit of custom filter development for some translation agency clients. Most of it has been relatively simple stuff, like chaining an HTML filter after an Excel filter to protect HTML tags around the text in the Excel cells, but some of it is more involved; in a few cases, three levels of filters had to be combined using memoQ's cascading filter feature.

And sometimes things go too far....

A client had quite a number of JSON files, which were the basis for some online programming tutorials. There was quite a lot of non-translatable content that made it past memoQ's default JSON filter, much of which - if modified in any way - would mess up the functionality of the translated content and require a lot of troublesome post-editing and correction. In the example above, Seconds in a day: is clearly translatable text, but the special rules used with the Regex Tagger turned that text (and others) into protected tags. And unfortunately the rules could not be edited efficiently to avoid this without leaving a lot of untranslatable content unprotected and driving up the cost (due to increased word count) for the client.

In situations like this, there is only one proper thing to do in memoQ: edit the tags!

There are two ways to do this:

  • use the inline tag editing features of memoQ or
  • edit the tag on the target side of a memoQ RTF bilingual review file.
The second approach can be carried out by someone (like the client) in any reasonable text editor; tags in an RTF bilingual are represented as red text:

If, however, you go the RTF bilingual route, it's important to specify that the full text of the tags is to be exported, or all you'll get are numbers in brackets as placeholders:

Editing tags in the memoQ working environment is also straightforward:

On the Edit ribbon, select Tag Commands and chose the option Edit Inline Tag

When you change the tag content as required, remember to click the Save button in the editing dialog each time, or your changes will be lost.

These methods can be applied to cases such as HTML or XML attribute text which needs to be translated but which instead has been embedded in a tag due to an incorrectly configured filter. I've seen that rather often unfortunately.

The effort involved here is greater than the typical word- or character-based compensation schemes can justly compensate and should be charged at a decent hourly rate or be included in project management fees. 

A lot of translators are rather "tag-phobic", but the reality of translation today is that tags are an essential part of the translatable content, serving to format translatable content in some cases and containing (unfortunately) embedded text which needs to be translated in other (fortunately less common) cases. Correct handling of tags by translation service providers delivers considerable value to end clients by enabling translations to be produced directly in the file formats needed, saving a great deal of time and money for the client in many cases.

One reasonable objection that many translators have is that the flawed compensation models typically used in the bulk market bog do not fairly include the extra effort of working with tags. In simple cases where the tags are simply part of the format (or are residual garbage from a poorly prepared OCR file, for example), a fair way of dealing with this is to count the tags as words or as an average character equivalent. This is what I usually do, but in the case of tags which need editing, this is not enough, and an hourly charge would apply.

In the filter development project for the JSON files received by my agency client, the text used was initially analyzed at
14,985 words; 111,085 characters; 65 tags
and after proper tagging of the coded content to be protected it was
8766 words; 46,949 characters; 2718 tags.
The reduction in text count more than covered the cost of the few hours needed to produce the cascading filter needed for this client's case and largely ensured that the translator could not alter text which would impair the function of the product.

No comments:

Post a Comment

Notice to spammers: your locations are being traced and fed to the recreational target list for my new line of chemical weapon drones :-)