Dec 11, 2008

Hiding file content in MS Word and RTF files for translation

We frequently receive jobs in which part of a file is to be excluded from translation, or which contain tables where the customer wants the target text written in a column adjacent to the source text. Some of these jobs are straightforward to deal with, others less so. After a particularly nasty assignment with a mix of tables, colors and untranslatable notes embedded in the middle of sentences, I decided to write a short set of suggestions to share with some of my clients so that they can prepare certain jobs more effectively, saving me time and saving them money. If it can be of assistance to anyone, feel free to download the guidelines here. Suggestions for improvement or addition are very welcome.


  1. Your method seems extremely complicated. Suggestions for a basic CAT user:

    - make a copy of the file.
    - in this copy, erase everything that shouldn't be translated.
    - translate the remaining text with your favourite CAT tool, check, print, and make corrections
    - copy and paste the translated text into the original file, in the right column
    - last stage of verifications

    This even works in most cases for complicated columns (and I did lots of these).

    Btw I use Wordfast 5, which uses the Word interface. Another method is to mark all the untranslatable text with the double strikethrough/red ants method as described on page 16 of the Wordfast 5 manual, but this assumes that the user understands how Wordfast works and has the manual, and my colleagues don't.

    The other option, the tw4winExternal method as described in the Wordfast manual, cannot be used in complicated columns.

  2. Which method is complicated? There are several presented. For the most part they simply involve search and replace techniques to hide text which is not to be translated. If you deal with manuals of several hundred pages in which the changes are scattered throughout as colored text, it is simply too easy to overlook material to be translated, and manually copying and pasting runs this risk as well as being very time-consuming. And I'm not sure how you would handle the last example otherwise.
    Perhaps the methods seem complicated because you are not accustomed to the working environment of the two CAT tools discussed (TagEditor and DVX).
    Although I have a WF license, I've never used the tool, so I can't comment much on the double strike method. If applying a double strike property to text causes it to be skipped, then of course the search and replace method I described can be adapted by applying that property instead of the one for hidden text. Either way the other font properties would remain intact, unlike applying tw4winExternal or other untranslatable styles.
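As a rough illustration of the search-and-replace idea discussed in this reply, here is a minimal Python sketch that hides marked notes in a fragment of raw RTF by wrapping them in a hidden-text group (`\v` is the RTF control word for hidden text). It is not the procedure from the guidelines: the `[[...]]` marker convention and the function names `hide_notes` and `unhide_notes` are assumptions made up for this example, and it assumes the notes themselves contain no braces or other RTF control words.

```python
import re

# Assumption: notes to be excluded are wrapped in [[...]] markers.
# This convention is hypothetical and only serves the example.
NOTE_PATTERN = re.compile(r"\[\[.+?\]\]")

# Matches the hidden-text groups produced by hide_notes below.
HIDDEN_PATTERN = re.compile(r"\{\\v (\[\[.+?\]\])\}")


def hide_notes(rtf_fragment: str) -> str:
    """Wrap each marked note in an RTF hidden-text group: {\\v [[note]]}."""
    # A function is used as the replacement so the backslash in "\v"
    # is not interpreted as a regex replacement escape.
    return NOTE_PATTERN.sub(lambda m: r"{\v " + m.group(0) + "}", rtf_fragment)


def unhide_notes(rtf_fragment: str) -> str:
    """Undo hide_notes, restoring the notes as visible text."""
    return HIDDEN_PATTERN.sub(lambda m: m.group(1), rtf_fragment)


if __name__ == "__main__":
    source = "Press Start [[see note 12]] to begin."
    hidden = hide_notes(source)
    print(hidden)
    print(unhide_notes(hidden) == source)
```

Because the hidden group only adds one property and is closed by its own brace, other formatting around the note is left alone, which matches the point made above about keeping the remaining font properties intact. Adapting the sketch to a different property would mean swapping the control word in `hide_notes` and `HIDDEN_PATTERN`.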

  3. Thank you for the great post and related instructions, Kevin! We linked to it from Medical Translation Insight.

  4. @FET: As I noted in the comments on the other blog, this method also works just fine for MemoQ. At the time I wrote up the notes I wasn't a serious MemoQ user, so I wasn't aware of this. I'll have to get around to writing an update.


