Dec 29, 2009

Translating Trados TTX files with MemoQ

Quite some time ago, I summarized the techniques for translating TTX files with Atril's Déjà Vu X and published the information as a PDF file, which is available as a free download from one of my web sites (by clicking the link earlier in this sentence, for example). Now I would like to present how this task might be approached using Kilgray's MemoQ.

First of all, let's consider why you might want to do this at all. I think a typical situation might be where your customer insists on a TTX file as the deliverable translation. Another case might be where the files to be translated need to be pre-processed with TagEditor to reduce import time (like when MS Word files are heavily laden with graphics). I often use TagEditor to pre-process jobs I translate with Déjà Vu, and while MemoQ's import capabilities are generally better than those of DVX (and MemoQ sometimes handles files that TagEditor can't), sometimes everything just runs faster and better if I make a TTX to import into MemoQ.

You can argue about TMX compatibility with the customer all you like, but in many cases too much leverage is lost for future work if you deliver anything except a properly translated TTX file. So don't argue, just do it. In most cases you will want to pre-segment the file. The procedure for doing this is described in Steps 0 & 1 of the Trados TTX in DVX instructions previously mentioned. If you do not have access to a Trados license to use a TM provided and get the best leverage, you should ask your client or a colleague with a Trados license to assist you in the presegmentation (the latter only if the client's confidentiality rules permit).

When you are ready to import the TTX file into MemoQ, there are two options in the Project Wizard or in the Project Manager window: "Add document" and "Add document as...". The former is an import routine that simply brings in the segmented content. The second option opens the document import settings window.

Selecting the option to import unsegmented content (circled in red with a red arrow pointing to it) will cause numbers and dates, which are usually skipped by Trados, to be imported into the MemoQ project. I don't know of any other software that will do this currently. This is very helpful for technical or financial documents with tables of numbers to be corrected. It is not easy to find all this content in the TagEditor environment, so in this regard, this aspect of quality assurance is a lot easier for a Trados project if it is done in MemoQ.

Quick minds may have realized at this point that with this second import option, all the content of an unsegmented TTX file can be imported. While this is indeed possible, it's usually not a great idea, and it may upset the customer in many cases. This is because the TTX file cannot be "cleaned" to transfer the data into the Trados translation memory. However, a target file can be saved from it. If for some reason you translate a TTX without segmenting it, the TM information is transferable to the client as a bilingual file (Trados-compatible Word document) or a TMX export from the MemoQ TM, but this is a rotten idea for all but the simplest files, because the segment leverage of the content imported into the Trados TM will probably be awful. You are much better off getting someone to segment the original TTX for you and "retranslating" it from the TM in MemoQ. The only time I would translate an unsegmented TTX myself is if I am using that format for expedience in the case of a huge Word file full of graphics or something similar.
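If you are not sure whether a TTX file you received has actually been pre-segmented, a quick look at its XML can tell you. What follows is only a minimal Python sketch and not a feature of MemoQ or Trados. It assumes the commonly seen TTX layout in which each segment is a Tu element holding a source Tuv followed by a target Tuv; the names classify_units and SAMPLE are made up for this illustration.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Tiny hand-made sample (an assumption about the usual TTX layout):
# two segmented units plus one stretch of text that was never segmented.
SAMPLE = """<TRADOStag Version="2.0"><Body><Raw>
<Tu MatchPercent="0"><Tuv Lang="DE-DE">Guten Tag</Tuv><Tuv Lang="EN-US">Guten Tag</Tuv></Tu>
<Tu MatchPercent="100"><Tuv Lang="DE-DE">Danke</Tuv><Tuv Lang="EN-US">Thank you</Tuv></Tu>
Seite 12 von 40
</Raw></Body></TRADOStag>"""


def tuv_text(tuv):
    """Return all text inside a Tuv element, including text of nested inline tags."""
    return "".join(tuv.itertext()).strip()


def classify_units(root):
    """Count Tu elements by state: source copied to target, target differs, or incomplete."""
    counts = Counter()
    for tu in root.iter("Tu"):
        tuvs = tu.findall("Tuv")
        if len(tuvs) < 2:
            counts["incomplete"] += 1
        elif tuv_text(tuvs[0]) == tuv_text(tuvs[1]):
            counts["source_equals_target"] += 1
        else:
            counts["target_differs"] += 1
    return counts


counts = classify_units(ET.fromstring(SAMPLE))
if not counts:
    print("No Tu elements found: the file looks unsegmented.")
else:
    for state, number in sorted(counts.items()):
        print(state, number)
```

For a real file you would probably start from ET.parse(path).getroot() instead of the sample string. An empty result suggests the file is unsegmented, while text lying outside any Tu (like the page reference in the sample) is the kind of content the simple import leaves behind.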

After the TTX has been imported into MemoQ, if you want to clear the target cells, then right-click anywhere in the translation work area and choose Clear Translations... from the context menu. If you want to clear only a specific range, select the beginning of that range, then shift-click at the end of the range to select all the cells in between. In either case, there are a number of options for what should be cleared (all translations, just unconfirmed segments, etc.). The way this option is implemented is less dangerous than the analogous function in DVX, where I have to remember to filter what I want to keep before clearing target cells. Filter functions can, of course, be applied in MemoQ too.

When you are done translating the (segmented) TTX file in MemoQ, your output is an "uncleaned" TTX file that contains both the source content and your translation. If you have a copy of TagEditor and the original file available, you can save a copy of the translated file in its original format by using the command File > Save Target As... in TagEditor. If you don't have the original file, you can't save a target file - your customer will have to do that.

If your customer has a translation memory relevant to your project, it should be exported from Trados Workbench in TMX 1.4 format and imported into a MemoQ database for concordance use. Please note that it is better to use these databases for the pretranslation/presegmentation step than to presegment the Trados file against an empty database (basically copying the source content 1:1 to the target) and then translate in MemoQ using the migrated TM content from Trados; the leverage will generally be higher (i.e. more and better matches).
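Before importing such a TMX export, it can be worth a quick sanity check that it really contains the language pair you expect. The sketch below is one hedged way to do that in Python. It assumes the standard TMX layout (tu elements containing tuv elements with an xml:lang attribute and a seg child); the function name read_tmx_pairs and the sample data are invented for illustration and have nothing to do with MemoQ or Trados themselves.

```python
import xml.etree.ElementTree as ET

# ElementTree reports the xml:lang attribute under the XML namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hand-made sample: one complete unit and one with no English side.
SAMPLE_TMX = """<tmx version="1.4">
  <header creationtool="example" creationtoolversion="1" segtype="sentence"
          o-tmf="example" adminlang="en-US" srclang="de-DE" datatype="plaintext"/>
  <body>
    <tu>
      <tuv xml:lang="de-DE"><seg>Guten Tag</seg></tuv>
      <tuv xml:lang="en-US"><seg>Good day</seg></tuv>
    </tu>
    <tu>
      <tuv xml:lang="de-DE"><seg>Nur Quelle</seg></tuv>
    </tu>
  </body>
</tmx>"""


def read_tmx_pairs(root, src_lang, tgt_lang):
    """Return (source, target) text pairs for units that contain both languages."""
    src, tgt = src_lang.lower(), tgt_lang.lower()
    pairs = []
    for tu in root.iter("tu"):
        segs = {}
        for tuv in tu.findall("tuv"):
            # Older TMX versions used a plain "lang" attribute, so try both.
            lang = (tuv.get(XML_LANG) or tuv.get("lang") or "").lower()
            seg = tuv.find("seg")
            if seg is not None:
                segs[lang] = "".join(seg.itertext())
        if src in segs and tgt in segs:
            pairs.append((segs[src], segs[tgt]))
    return pairs


pairs = read_tmx_pairs(ET.fromstring(SAMPLE_TMX), "de-DE", "en-US")
print(len(pairs), "usable translation unit(s)")
```

Language codes are compared case-insensitively here because exports do not always agree on capitalization. If the count comes back as zero for a TM the customer swears is full, the language codes in the export are the first thing I would check.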

This procedure is safe and 100% compatible with Trados. It can also be performed with the unlicensed version of MemoQ (MemoQ4Free) with the restrictions that apply to that product (only one file, no import to the TM, just to the termbase).

6 comments:

  1. Two questions for you, Kevin, from a Trados non-user:

    If a customer asks you to deliver a TTX file, isn't it safe to assume that he has Trados? Otherwise, what would be the point?

    Assuming that the customer has Trados, isn't it better for him to create the TTX file for you? I was under the impression that this would safely avoid any variation in segmentation settings and the resulting problems.

    Or have I missed something obvious?

    Marc P.

  2. For those who don't know, Marc is the guy who launched the TOXIC project to enable OmegaT to work with TTX files. Nice piece of work.

    @Marc: Usually that's a safe assumption, yes. However, I have seen kitchen table agencies without any Trados license or skill who pass through such requests, and occasionally prepared files, from the end customer. Often the customer will create the TTX. However, it is not common at all for this TTX to be pre-segmented unless you ASK (and often you'll have to provide instructions on how you want it done!). That's because most assume the work will be done in Trados, and TagEditor segments the content of the TTX as you work. The parts you haven't worked on yet are not segmented. Also, if the file is being pretranslated using an existing TM, stuff that isn't at least a fuzzy match won't be segmented with the default settings. You have to specify explicitly in Trados Workbench that the source is to be copied to the target if there is no match. Otherwise you can pretty much forget making the translation of TTX with non-Trados tools work in a way that will satisfy the client.

    I do this work with Trados licenses that I own (two of them). You can do a lot with demo (unlicensed) versions or have other people (clients, colleagues) help you out, but I like the control of having my own full license and not depending on anyone else. It also lets me do extra testing or experiments if I run into a difficulty. There are also some useful QA features for verifying tags, etc. that Trados offers; it's not a bad idea to use these before you deliver to avoid sending trouble to your client.

    My main interest is getting jobs done right, as efficiently as possible, and giving the client exactly what the client wants and/or needs. I want maximum flexibility to do this. So I don't worry about pinching a few euros or a few hundred more on a license, because one way or another it will pay for itself in preparing and processing files or testing. I have one client who drives me nuts by sending me his reviews in the form of TagEditor comments. I grumble about this, but it's really not a big deal, because I have Trados and I can read these comments and go straight to the commented section with just a click.

    Do I recommend translating with Trados 8.3 or earlier? Generally no. But for those of us who use other tools, it's still often very useful to have a reasonably updated full license available, if only for exporting TM content to TMX. I'll bet if you were to poll 10 agency PMs at random on how to do this, at least several would not know.

  3. OK, this is becoming clearer now – I think.

    So far, I have encountered three types of TTX file: Source=Target, Source≠Target (that's "not equal to", if the symbol doesn't make it: this is the case where the target segments have been "pretranslated"), and "zero target", i.e. no target segments. Toxic can only handle the first kind, i.e. where source and target segments are both present and identical.

    From your description of an "unsegmented" TTX file, I suspect that it's what I'm calling the "zero target" TTX. (Because of the way Toxic is written, it's the fact that there are no target segments that is the crucial point and the reason it can't handle this form. Toxic simply extracts translatable content, and it handles TTX by regarding the target segments as the translatable content.) You can probably confirm this, but if not I will have to dig out some samples from my growing repository of sample TTX files.

    So far, the solution to Toxic's inability to handle the other two forms of TTX has been documentation: the readme file contains clear instructions on how to create the desired form of TTX in Trados, and these can be passed on to the customer, which is what some Toxic users have done, apparently with success. It would be ideal if the functionality for handling them as-is could be included in Toxic, but that's something for the future. In the meantime, if the instructions are completely foolproof, it's not a bad alternative, or at least so I thought.

    I wasn't aware that Trados could/would segment TTX as you worked. "Pretranslation" of a file against a TM is clear: that is something that can be done by the customer (and could then be reverted by Toxic), or the Source=Target form can be created by the customer and the TM simply supplied (in TMX format) for reference. The issue is, if I understand you correctly, that Trados actually adjusts segmentation according to the existing TM. That would make it more urgent for Toxic to support the "pretranslated" format.

    On having a whole raft of tools to deal with different proprietary formats: the problem isn't always cost. Much more significant, to me personally, are a) that Linux versions of most of the proprietary CAT tools aren't available, and b) the efficiency penalty of having to work in half a dozen different environments rather than just the one. The former I appreciate is not an issue to most people; the latter is only really an issue if I actually have to work in the tool, as distinct from using it simply to preprocess files.

    Marc

  4. You've got it, Marc. Linux versions of MemoQ, Trados, etc. aren't available, but (quoting from the VMware FAQ) "VMware Player is software that enables users to easily create and run virtual machines on a Windows or Linux PC."

    The Player (or Server, for that matter) is free, and you can run a virtual machine with Win XP or whatever you prefer to use any applications you need for processing and compatibility. I've used this solution (with a full VMware license actually) for years for ensuring that I can keep using my ancient electronic dictionaries. If Trados will function under VMware, then you could do all your preprocessing in the XP virtual machine, move the file to your fave tool in Linux (OmegaT or whatever), send it back to the virtual machine for exporting a target file & you're done.

    Your comment that "... the efficiency penalty of having to work in half a dozen different environments rather than just the one... is only really an issue if I actually have to work in the tool, as distinct from using it simply to preprocess files" (emphasis mine) is quite correct. I used to go bonkers switching back and forth between Trados and Déjà Vu to translate, because the ergonomics are so different. I can manage it with MemoQ and DVX only because MemoQ allows me to reprogram all the major keyboard shortcuts so that the application behaves like Déjà Vu (an absolutely wonderful feature). I really only work in "one" ergonomically unified environment (MQ/DVX) for translation, but I pre- and post-process in Trados or Star Transit all the time.

  5. This post is a few years old, but it's very helpful. I prefer translating TTX files with MemoQ to run the QA plugin and flag double spaces and wrong numbers, but I just realized there's a glitch.

    Has anyone translated large TTX files with MemoQ and found that all the ">" and "<" signs have been inverted?

    It only happens to me when the file is very large, not with small files. It also only happens if the sign is preceded or followed by a space. I only work with medical documents, so even one inverted sign is too many!

    1. There have been quite a few changes in TTX handling in the past year or two. Are you using a current version (which?) and have you shared this with Support? I haven't done much with TTX in a long time myself, so I've only followed the discussions at a bit of a distance. The format is still of interest to Kilgray, so I would report anything troubling and expect it will be dealt with.

