Apr 17, 2009

Crossing segment boundaries

One of the basic rules I learned early on when using Déjà Vu to do Trados projects was that one must be very, very careful about messing with the codes ("tags" to you Trados users), and deleting codes for Trados segment boundaries was strictly off limits. On those rare occasions when I did so accidentally in a pre-segmented MS Word or RTF file, the result was a disaster: an asynchronous mess.

Consequently I always avoided changing the default segmentation when doing the Trados/DVX workflow (or adjusted it manually in Word or TagEditor, a time-consuming process). I would combine segments in the DVX environment but leave in the code for the transition between the two Trados segments. This sometimes resulted in some very odd "trash" segments in the Trados TMs, but these didn't worry me much, because my DVX TM had a "complete", sensible segment, albeit with a silly code embedded in it. Not a problem, really, since no one ever complained about the trash in the Trados TM: agencies are used to incompetent segmentation by translators inexperienced with Trados, and some even insist that the default segmentation never be altered (thus compelling translators to produce trash segments).

Well, yesterday a client did complain. He tried to demand that I stick to the original German segmentation or at least re-adjust it so that the segments made sense in a concordance. I understand his concern, as these trash segments really do bother me, but I wasn't about to abandon my DVX workflow and lose time working in TagEditor. I was also rather grumpy after another night of abbreviated sleep, and my response to his request was less than gracious, mostly taking the "high ground" as a native speaker who knows what's better for the structure of an English sentence. In fact, I was a real jerk.

Then I slept on it. For a few hours at least. And I realized that the problem, of course, was technical, not linguistic, and my intuition told me that there is in fact a solution that contradicts what I had thought to be true about deleting segmentation codes for Trados files in DVX. I'm sure others have noticed this before and mentioned it in various public forums, but I was so fixated on the disaster of deleting segmentation codes for MS Word and RTF files pre-segmented with Trados that I never considered trying it with a TTX. And guess what? It works like a charm!

I ran an experiment with three types of files: a Word document, an Excel document and an HTML document. In all three I was able to combine segments in DVX and delete the codes on the left side to change the segment boundaries in Trados and still have a valid TTX file as my export! And in every case I was able to generate valid target files, even ones with paragraphs combined.
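For anyone who wants a quick sanity check on an exported file outside of Trados, here is a minimal Python sketch. It rests on assumptions that are not part of the experiment above: that a TTX file is XML in which each translation unit is a `Tu` element holding one `Tuv` element per language, and that the heavily simplified `SAMPLE_TTX` string is a fair stand-in for a real export. The function name `summarize_ttx` is made up for illustration. The check only shows that the file is well-formed XML with complete source/target pairs; it does not prove that TagEditor will accept the file or produce a good target document.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified stand-in for an exported TTX file.
# Real files carry much more markup; the element names are assumptions.
SAMPLE_TTX = """<TRADOStag Version="2.0">
  <Body>
    <Raw>
      <Tu MatchPercent="0">
        <Tuv Lang="DE-DE">Also werden wir sehen, was passiert.</Tuv>
        <Tuv Lang="EN-US">So we will see what happens.</Tuv>
      </Tu>
      <Tu MatchPercent="0">
        <Tuv Lang="DE-DE">Das ist ein Test.</Tuv>
        <Tuv Lang="EN-US">This is a test.</Tuv>
      </Tu>
    </Raw>
  </Body>
</TRADOStag>"""


def summarize_ttx(xml_text):
    """Return (unit_count, incomplete_units) for TTX-like XML text.

    incomplete_units lists the zero-based positions of translation
    units that do not contain exactly two Tuv children.
    Raises ET.ParseError if the text is not well-formed XML.
    """
    root = ET.fromstring(xml_text)
    units = list(root.iter("Tu"))
    incomplete = [
        index for index, unit in enumerate(units)
        if len(unit.findall("Tuv")) != 2
    ]
    return len(units), incomplete


if __name__ == "__main__":
    count, incomplete = summarize_ttx(SAMPLE_TTX)
    print("translation units:", count)
    print("incomplete units:", incomplete)
```

Running the same check on the file before and after combining segments should show the unit count dropping by the number of boundaries removed, with no incomplete units, if the assumptions about the file structure hold.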

Here is what the source column in DVX looks like after the TTX segments are combined, before and after the codes are deleted:



This represents a huge advance in the way I use Déjà Vu X for doing Trados jobs.


  1. Wrong -
    Das soll gefährlich sein,und nach meiner Meinung ist es auch.
    Right -
    Das soll gefährlich sein, und nach meiner Meinung ist es das (dies, es) auch.

    Wrong -
    Also werden wir sehen was passiert.
    Right -
    Also werden wir sehen, was passiert.

  2. Irrelevant - at 5 am I don't pretend to write good German for examples, nor do I engage in translation into that language. The intent here was merely to create some files to test a technical principle, and that was quite successful.

    In the meantime I've tested some XML files, and - not surprisingly - these work as well when certain segment boundaries in the TTX file are deleted. So in the future, the Trados TMs that result will be a lot cleaner.

  3. Kevin - I'm sure Anonymous is perfect in everything they do. Don't sweat it. If I worked in Déjà Vu I would be thrilled that you shared this with me. There aren't enough translators out there who are willing to share their knowledge with others - and even fewer who work with Déjà Vu. Keep up the good work!

