
Aug 24, 2016

memoQ autotranslatables: a partial antidote for drudgery

I'm currently working on a stack of legal pleadings for a patent nullity suit – lots of "urgent" words to churn by the end of the week. And after 10,000 or so of them, I got pretty damned tired of typing out the translation of text citations of the form "Spalte 7, Zeilen 34 bis 45" as "Column 7, Lines 34 to 45".

In fact, it was really starting to piss me off. In such situations, I try not to get mad but to get an autotranslatable ruleset instead. This is perhaps one of the most under-utilized productivity tools in memoQ.


So the next time I ran into a text that fit that format, the translation was offered as an autocompletable phrase as soon as I typed the first letter:


Of course life isn't usually that simple, at least not life with technology. And authors? Well, they seem to believe firmly in the old saying that "consistency is the hobgoblin of little minds". So of course the text also includes lots of references in the form "Spalte 7, Zeilen 34 - 45", with or without spaces around the hyphen. No problem, just add a rule for that (or if you are more clever, edit the single rule to cover the variations):
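A single rule covering those variations might look something like the sketch below. It is written in Python so it can be tested outside memoQ, and the pattern and the names `CITATION` and `translate_citation` are illustrative assumptions, not the rule from the original project. memoQ's own target side would use $1, $2, $3 where Python uses \1, \2, \3.

```python
import re

# Assumed combined rule: the separator is either the word "bis" or a hyphen,
# with zero or more spaces on either side of it.
CITATION = re.compile(r"Spalte (\d+), Zeilen (\d+)\s*(?:bis|-)\s*(\d+)")


def translate_citation(text: str) -> str:
    """Rewrite every German column/line citation in text into English."""
    return CITATION.sub(r"Column \1, Lines \2 to \3", text)


for sample in (
    "Spalte 7, Zeilen 34 bis 45",
    "Spalte 7, Zeilen 34 - 45",
    "Spalte 7, Zeilen 34-45",
):
    # Each variant should come out as "Column 7, Lines 34 to 45".
    print(translate_citation(sample))
```

The `(?:...)` part groups the two separator alternatives without creating a numbered group, so the three numbers stay at positions 1, 2 and 3.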



Now I am not one to advocate that the unwashed masses of translators – or even the washed ones – run out and learn to write regular expressions. I've been programming for about 45 years now, in more computer languages and on more systems than I can possibly remember, and I can't keep most of the autotranslatable rules in my head if I don't use them for a week or more after yet another refresher, so it would be stupid and hypocritical of me (or just bloody naive) to expect most people to mess with nerdy shit like this. But....

... a few simple rules and a couple of nice "recipe templates" to start can go a long way. And sometimes it pays not to be too clever; I have one highly sophisticated set of rules for complex legal citations that was written by a professional programmer, and it's unusable. It takes minutes to load even on a very fast computer, which is a huge pain in the backside every time a project is opened in memoQ. My more verbose, brute-force approach to legal reference autotranslation may not be elegant, but it loads much faster and covers 90% or more of what I encounter. Maybe a case where it's smart to be a little stupid.

There are lots of good tutorials out there on regex (regular expressions), including a few YouTube webinar videos from Kilgray, the memoQ Help, a few chapters in old books of mine, discussions in the Yahoogroups lists and more.

The examples above require knowledge of only a few rules:
  • Chunks of the source text to be analyzed are grouped in parentheses. In the examples shown, those groups are merely where numbers occur.
  • Digits are represented by the escape code "\d". If a number might have more than one digit, add a plus sign, which means "one or more" of whatever it is put after: \d+.
  • Spaces are represented by the escape code "\s". In the rules you can usually just type a space instead, but if you have to cover cases where it might be missing or where more than one might have been typed (usual sloppiness), then use the escape code, followed by an asterisk, which means "zero or more" of whatever it is put after: \s*.
  • For the rest of the text to match, you can usually type it just the way it occurs, as I have done above. For the target translation rules, you can usually just type the literal text you want, with the groups represented by the numerical order in which they occur, preceded by a dollar sign. So the first group (parentheses set) in the source is $1, the second is $2, etc. Of course the order can be changed in the target; it's just not necessary in this case, but in autotranslatable rules for dates this happens rather often.
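The building blocks in this list can be tried out in a few lines of Python. This is only a sketch: `apply_rule` is a hypothetical helper, the date rule is an invented illustration of reordering groups, and Python writes group references as \g<1> rather than $1, so the helper converts the template first.

```python
import re


def apply_rule(source_pattern: str, target_template: str, text: str) -> str:
    """Apply one source/target rule pair, with $1, $2, ... in the target."""
    # Convert each $n in the template to Python's \g<n> group reference.
    python_template = re.sub(r"\$(\d+)", r"\\g<\1>", target_template)
    return re.sub(source_pattern, python_template, text)


# Groups in parentheses, \d+ for the numbers, literal text for the rest.
print(apply_rule(
    r"Spalte (\d+), Zeilen (\d+) bis (\d+)",
    "Column $1, Lines $2 to $3",
    "Spalte 7, Zeilen 34 bis 45",
))  # Column 7, Lines 34 to 45

# Hypothetical date rule: day and month swap places in the target.
# A literal dot has to be escaped as \. because a bare dot matches any character.
print(apply_rule(r"(\d+)\.(\d+)\.(\d+)", "$2/$1/$3", "24.08.2016"))  # 08/24/2016
```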
Not only will the little rules I wrote for this big job save me a lot of typing, I can also use them in a QA profile to check that I have made no errors by switching numbers, missing a space or anything else in my translation. That is done by marking the appropriate checkbox on the first tab of the QA profile you plan to use:
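The idea behind such a check can be sketched in Python. This is not how memoQ implements it; `RULE`, `TEMPLATE` and `missing_translations` are assumed names for illustration. The sketch renders what the rule would produce for each match in the source and reports any rendering that does not appear in the target.

```python
import re

# Assumed rule pair, in Python syntax (\1 where memoQ would use $1).
RULE = re.compile(r"Spalte (\d+), Zeilen (\d+)\s*(?:bis|-)\s*(\d+)")
TEMPLATE = r"Column \1, Lines \2 to \3"


def missing_translations(source: str, target: str) -> list:
    """Return the expected renderings of source matches absent from target."""
    expected = [match.expand(TEMPLATE) for match in RULE.finditer(source)]
    return [phrase for phrase in expected if phrase not in target]


source = "Siehe Spalte 7, Zeilen 34 bis 45."
print(missing_translations(source, "See Column 7, Lines 34 to 45."))  # []
# Transposed digits in the target are reported:
print(missing_translations(source, "See Column 7, Lines 43 to 45."))
```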


Perhaps such things are worth a little effort in your projects once in a while....


2 comments:

  1. As always, a great post on a pretty complex topic written for translators as opposed to tech geeks. Thanks a million!
    Colin

  2. Kevin, I'm a Déjà Vu user and enthusiast, but this is definitely something that makes me think about switching to MemoQ. An excellent illustration.

