Translation Tribulations

Dec 31, 2009

Trados Classic as the fulcrum of collaboration

ful·crum (ˈfu̇l-krəm, ˈfəl-), n.

plural: fulcrums or ful·cra \-krə\

etymology: Late Latin, from Latin, bedpost, from fulcire to prop

1 (a): prop; specifically : the support about which a lever turns (b): one that supplies capability for action

2 : a part of an animal that serves as a hinge or support (also applies, given what a beast Trados is!)

Users of other translation environment tools often become irritated when Trados is referred to as a "standard". It is certainly not one in an official sense; no national or international body has recommended compliance with its formats and protocols. Yet as an early entrant to the field of machine-assisted human translation tools backed by ruthless marketing, for many it became a de facto standard, and in seeking to gain acceptance on the market, vendors of many better alternative tools adopted Trados file formats or enabled their users to work with them in some way.

I suppose that in ten years' time formats like TTX and the marked-up "bilingual" format of files processed with the Trados macros in Microsoft Word or some equivalent will be rare or non-existent. However, with other tools such as Wordfast Classic or Anaphraseus using this markup as a primary format and MemoQ and Déjà Vu X supporting it as an exchange format for compatibility and collaboration, the engine driving the persistence of Trados formats is no longer entirely of the original provider's making. So for a number of years yet, I think it will be important to understand the role that Trados data formats can play in information exchange and collaboration in translation projects. My comments below in this post can be understood primarily as tips and instructions for people who use the old Trados (version 8.3 or earlier) as a tool for a project where there is the intent to outsource some of that project's content to translators who use some tool other than Trados. Please note that I am describing only scenarios with which I am familiar; if there are important differences that apply to different tools, I encourage more knowledgeable persons to enlighten me and others in the comments.

The first thing an outsourcer must understand and decide is whether the "compatibility" that a translator offers by working in a tool other than Trados and exporting "Trados-compatible" files (bilingual files, TMs and terminology resources) is really sufficient. The answers to this question can vary a lot.

If you insist on a Trados project using a Trados TM server (TM Anywhere technology), especially where multiple translators must be coordinated, you are probably aware that there is really no substitute for the translator working in Trados itself in some way. None of the major TEnT vendors' servers are accessible by clients from other providers as far as I know. So here the translator will have to bend his or her knee and kiss the pope's ring or just skip the job. If you as an outsourcer are willing to accept compromises or the translator will be working alone (so you can perhaps provide a TM export and the translator won't miss out on contributions from others in a project), then keep reading for more options. I have a client in Switzerland that likes translators to work off their Trados server online. Although I have the technical ability to do this, I refuse to work this way due to years of bad experience with "enterprise technologies" from Trados. (TeamWorks put such a bad taste in my mouth that it would take an Act of God to make me willing to use such things from that source again.) So this client kindly provides TM exports with a password for access. For "confidentiality reasons" and because of contractual obligations to the end client (I am told) they cannot give me the password so I can use this database. However, this is a virtual fig leaf, because they know that I can find out the TWB password in seconds using TMPwdRec.exe from Kakeeware. It works with Trados versions through 8.3 (despite the fact that the highest version mentioned on the linked web page is 6.5).
If your source file is fairly complex or maximum leverage (i.e. highest quality matching) is very important in later projects, then you want to have your files pre-processed with Trados. How depends on your needs and workflows. As many who have moved from processing MS Word files with the macros in Word to translation of these files in TagEditor have learned, what should be a 100% match from the TWB TM often is not; the same issue will often be found if you get a TMX file from OmegaT, MemoQ or some other tool. Your translator may or may not have a copy of Trados to do this, but if you have special segmentation definitions or other unusual circumstances, you might want to prepare the files yourself. Also make it clear to the translator whether it is "allowed" to change your segmentation. I always assumed that combining bad segments and occasional adjacent ones in order to create more sensible and/or better content and avoid nonsense in the TM was desirable until I encountered a colleague in Colorado whose projects often demand such extreme levels of automation that he expects translators to change absolutely nothing with the default segmentation. This extreme attitude is an unfortunate byproduct of working with primitive technologies like Trados; with TM-driven segmentation like you'll find in MemoQ this is no longer an issue, as the best match can be created dynamically. Since my colleague is a smarter guy than I am, I assume he's already moved on to something better or will do so at some point.
If you will prepare the Trados files for your translator, so-called presegmentation of the files will generally be necessary (unless the translator uses something like Wordfast Classic or Anaphraseus, which more or less follow the old Trados macro rules for segmenting as they work). Find out whether the translator wants files where the target segments are populated with an exact copy of the source (necessary at the current time for OmegaT as I understand it) or whether fuzzy match content - where present - should be written to the target. The latter option is best if the translator's software can handle it, and it will usually save the most time. If your translator does not have a licensed copy of Trados, you should also be kind enough to export the TM, usually to TMX 1.4 instead of the Trados TXT format, so that the translator can use it for concordance lookups. Instructions on how to perform this presegmentation procedure using the Workbench Translate function and particular settings for copying the source to target on no match will be found in my old published instructions for processing Trados RTF and Word projects with Déjà Vu or the information on handling TTX files with Déjà Vu. The preparation is generally the same for software other than DVX. (Both instruction sets are long overdue for an upgrade, but are still OK for orientation purposes. There is also a lot more information to be found in online forums and the Yahoogroups lists for various tools.)
If all you care about is getting a good translation and having something for your Trados TM that will usually give you reasonable matches and enable concordance use, then TMX or bilingual exports from tools like DVX or MemoQ are generally more than adequate. In fact, in some cases, this is the only way that an outsourcer can access Trados project content in Trados. I have one customer, a great EN>DE translator and Class A editor who subcontracts a lot of her DE>EN work to us. She works with Trados and expects to receive TM material for her Workbench TMs as part of the deliveries. However, many of her projects, including the MS Word files, require the use of TagEditor, and she has an old version of Trados which cannot handle most of these files in TagEditor. So we do the work in DVX or MemoQ and send her a bilingual RTF or MS Word file to clean, even if the source format is InDesign, Excel, PowerPoint, XML or something else. She can access the information in her concordance and she's happy. I like the true joke that, in many instances, third-party tools are more compatible with Trados than Trados itself. I have seen many examples of this. Even die-hard Trados users would be well-advised to keep licensed or unlicensed versions of a tool like MemoQ around to iron out such circumstances or to send a translation of an InDesign file to a translator with Trados who refuses to use TagEditor. Or to deal with cases where there are a lot of numbers and dates to fix, as noted in an earlier post.
Exchange of terminology data poses its own challenges at times and should probably be handled in a separate post. However, in my experience, there are few outsourcers who make sophisticated use of MultiTerm, and terminologies are usually maintained and exchanged in another format. There are, however, a number of good methods for receiving and sending terms, and the only scenario I see as an insurmountable problem for other tools besides Trados is one where dynamic availability of terms via an online server is important. Those cases are rare.

Except for the first case cited - translation projects that use an online Trados server - outsourcers really can be confident that "Trados jobs" done with a third-party tool really are 100% compatible with their processes. This is especially the case if an actual copy of Trados (licensed or demo) is used for pre- and post-processing steps. With regard to segmentation issues, there are possibilities for "improvement" in some cases, which have been discussed for TTX files in an earlier post ("Crossing Segment Boundaries").
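For readers who want to check what actually ends up in a bilingual RTF or Word delivery, here is a minimal Python sketch that pulls source/target pairs out of the marked-up "bilingual" text. It assumes the delimiters follow the common layout {0>source<}NN{>target<0}, where NN is the match percentage, and that the text has already been extracted from the document as a plain string. The function name extract_pairs and the sample sentences are my own illustration, not part of any tool.

```python
import re

# Assumed delimiter layout of an "uncleaned" bilingual segment:
#   {0>source text<}NN{>target text<0}
# where NN is the match percentage recorded for the segment.
SEGMENT_RE = re.compile(r"\{0>(.*?)<\}(\d+)\{>(.*?)<0\}", re.DOTALL)


def extract_pairs(text):
    """Return (source, target, match_percent) tuples found in bilingual text."""
    return [
        (m.group(1), m.group(3), int(m.group(2)))
        for m in SEGMENT_RE.finditer(text)
    ]


# Hypothetical sample data for illustration only.
sample = (
    "{0>Das ist ein Test.<}0{>This is a test.<0} "
    "{0>Noch ein Satz.<}87{>Another sentence.<0}"
)

pairs = extract_pairs(sample)
for source, target, match in pairs:
    print(f"{match:3d}%  {source}  ->  {target}")
```

A quick listing like this makes it easy to spot segments where the target is empty or still identical to the source before the file goes back to the outsourcer.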

As with any project, it's also important to remember to provide copies of the source material as a PDF where possible, so that the formatting and the purpose of mysterious tags/codes can be understood better and errors avoided. This applies to all work, not just jobs involving a mix and match of tools, yet I am continually surprised by how many experienced project managers and translation consumers forget this basic principle.

Dec 29, 2009

Translating Trados TTX files with MemoQ

Quite some time ago, I summarized the techniques for translating TTX files with Atril's Déjà Vu X and published the information as a PDF file, which is available as a free download from one of my web sites (by clicking the link earlier in this sentence, for example). Now I would like to present how this task might be approached using Kilgray's MemoQ.

First of all, let's consider why you might want to do this at all. I think a typical situation might be where your customer insists on a TTX file as the deliverable translation. Another case might be where the files to be translated need to be pre-processed with TagEditor to reduce import time (like when MS Word files are heavily laden with graphics). I often use TagEditor to pre-process jobs I translate with Déjà Vu, and while MemoQ's import capabilities are generally better than those of DVX (and MemoQ sometimes handles files that TagEditor can't), sometimes everything just runs faster and better if I make a TTX to import into MemoQ.

You can argue about TMX compatibility with the customer all you like, but in many cases too much leverage is lost for future work if you deliver anything except a properly translated TTX file. So don't argue, just do it. In most cases you will want to pre-segment the file. The procedure for doing this is described in Steps 0 & 1 of the Trados TTX in DVX instructions previously mentioned. If you do not have access to a Trados license to use a TM provided and get the best leverage, you should ask your client or a colleague with a Trados license to assist you in the presegmentation (the latter only if the client's confidentiality rules permit).
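Before importing, it can be worth checking whether a TTX file really has been presegmented. The following Python sketch is one rough way to do that. It assumes that segmented content is stored as Tu elements, each holding a source Tuv followed by a target Tuv, which is how the TTX files I have seen are laid out. The sample document and the function name segment_report are invented for illustration. Real TTX files are often UTF-16 encoded and would be read from disk with ET.parse rather than from a string.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified TTX content (assumed structure).
SAMPLE_TTX = """<TRADOStag Version="2.0">
<FrontMatter></FrontMatter>
<Body><Raw>
<Tu MatchPercent="0"><Tuv Lang="DE-DE">Erster Satz.</Tuv><Tuv Lang="EN-US">Erster Satz.</Tuv></Tu>
<Tu MatchPercent="95"><Tuv Lang="DE-DE">Zweiter Satz.</Tuv><Tuv Lang="EN-US">Second sentence.</Tuv></Tu>
Unsegmentierter Text.
</Raw></Body>
</TRADOStag>"""


def segment_report(ttx_text):
    """Count segmented units and those whose target is still a copy of the source."""
    root = ET.fromstring(ttx_text)
    total = 0
    copied = 0
    for tu in root.iter("Tu"):
        tuvs = tu.findall("Tuv")
        if len(tuvs) != 2:
            continue  # skip anything that is not a plain source/target pair
        total += 1
        source = "".join(tuvs[0].itertext()).strip()
        target = "".join(tuvs[1].itertext()).strip()
        if source == target:
            copied += 1
    return {"segments": total, "source_copied": copied}


report = segment_report(SAMPLE_TTX)
print(report)
```

A count of zero segments suggests the file has not been presegmented at all, while a high share of source-copied targets simply means the presegmentation found few matches in the TM.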

When you are ready to import the TTX file into MemoQ, there are two options in the Project Wizard or in the Project Manager window: "Add document" and "Add document as...". The former is an import routine that simply brings in the segmented content. The second option opens the document import settings window.

Selecting the option to import unsegmented content (circled in red with a red arrow pointing to it) will cause numbers and dates, which are usually skipped by Trados, to be imported into the MemoQ project. I don't know of any other software that will do this currently. This is very helpful for technical or financial documents with tables of numbers to be corrected. It is not easy to find all this content in the TagEditor environment, so in this regard, this aspect of quality assurance is a lot easier for a Trados project if it is done in MemoQ.

Quick minds may have realized at this point that with this second import option, all the content of an unsegmented TTX file can be imported. While this is indeed possible, it's usually not a great idea, and it may upset the customer in many cases. This is because the TTX file cannot be "cleaned" to transfer the data into the Trados translation memory. However, a target file can be saved from it. If for some reason you translate a TTX without segmenting it, the TM information is transferable to the client as a bilingual file (Trados-compatible Word document) or a TMX export from the MemoQ TM, but this is a rotten idea for all but the simplest files, because the segment leverage of the content imported into the Trados TM will probably be awful. You are much better off getting someone to segment the original TTX for you and "retranslating" it from the TM in MemoQ. The only time I would translate an unsegmented TTX myself is if I am using that format for expedience in the case of a huge Word file full of graphics or something similar.

After the TTX has been imported into MemoQ, if you want to clear the target cells, then right-click anywhere in the translation work area and choose Clear Translations... from the context menu. If you want to clear only a specific range, select the beginning of that range, then shift-click at the end of the range to select all the cells in between. In either case, there are a number of options for what should be cleared (all translations, just unconfirmed segments, etc.). The way this option is implemented is less dangerous than the analogous function in DVX, where I have to remember to filter what I want to keep before clearing target cells. Filter functions can, of course, be applied in MemoQ too.

When you are done translating the (segmented) TTX file in MemoQ, your output is an "uncleaned" TTX file that contains both the source content and your translation. If you have a copy of TagEditor and the original file available, you can save a copy of the translated file in its original format by using the command File > Save Target As... in TagEditor. If you don't have the original file, you can't save a target file - your customer will have to do that.

If your customer has a translation memory relevant to your project, it should be exported from Trados Workbench in TMX 1.4 format and imported into a MemoQ database for concordance use. Please note that it is better to use these databases for the pretranslation/presegmentation step than to presegment the Trados file against an empty database (basically copying the source content 1:1 to the target) and then translate in MemoQ using the migrated TM content from Trados; the leverage will generally be higher (i.e. more and better matches).
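To get a feel for what such a TMX export contains, here is a small Python sketch that loads translation units and runs a crude concordance search outside any translation tool. It relies only on the basic TMX structure (tu, tuv with an xml:lang attribute, seg). The sample data and the helper names load_units and concordance are made up for illustration, and the simple substring search is far less capable than the concordance functions in MemoQ or DVX.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under the XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical, minimal TMX 1.4 content for illustration.
SAMPLE_TMX = """<tmx version="1.4">
  <header creationtool="example" creationtoolversion="1.0" segtype="sentence"
          o-tmf="example" adminlang="en-US" srclang="de-DE" datatype="plaintext"/>
  <body>
    <tu>
      <tuv xml:lang="de-DE"><seg>Die Pumpe ist defekt.</seg></tuv>
      <tuv xml:lang="en-US"><seg>The pump is defective.</seg></tuv>
    </tu>
    <tu>
      <tuv xml:lang="de-DE"><seg>Das Ventil ist offen.</seg></tuv>
      <tuv xml:lang="en-US"><seg>The valve is open.</seg></tuv>
    </tu>
    <tu>
      <tuv xml:lang="de-DE"><seg>Pumpe ausschalten.</seg></tuv>
      <tuv xml:lang="en-US"><seg>Switch off the pump.</seg></tuv>
    </tu>
  </body>
</tmx>"""


def load_units(tmx_text):
    """Return one dict per translation unit, mapping language code to segment text."""
    root = ET.fromstring(tmx_text)
    units = []
    for tu in root.iter("tu"):
        unit = {}
        for tuv in tu.findall("tuv"):
            # Older TMX versions used a plain "lang" attribute instead of xml:lang.
            lang = tuv.get(XML_LANG) or tuv.get("lang", "")
            seg = tuv.find("seg")
            unit[lang.lower()] = "".join(seg.itertext()) if seg is not None else ""
        units.append(unit)
    return units


def concordance(units, term, lang):
    """Return units whose segment in the given language contains the term."""
    term = term.lower()
    return [u for u in units if term in u.get(lang.lower(), "").lower()]


units = load_units(SAMPLE_TMX)
hits = concordance(units, "pumpe", "de-DE")
for hit in hits:
    print(hit["de-de"], "|", hit["en-us"])
```

Inline tags inside a segment are flattened to their text content here, which is good enough for lookups but not for round-tripping the TM.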

This procedure is safe and 100% compatible with Trados. It can also be performed with the unlicensed version of MemoQ (MemoQ4Free) with the restrictions that apply to that product (only one file, no import to the TM, just to the termbase).

Dec 25, 2009

The new Déjà Vu

As the developers choose to classify it, it's only a new build of the current version 7.5 of Déjà Vu X. It was expected quite some time ago but was delayed for the testing and refinement of file filters among other issues.

I downloaded the update file (about 35 MB) from Atril's web site and ran the installation right away. As I expected, there were problems with the dongle drivers immediately thereafter, so that I was confronted with a dialog informing me that DVX would only run in demo mode. This is a common problem, which was dealt with as usual by rebooting and running the program to reinstall the drivers (path: C:\Program Files\ATRIL\Deja Vu X\Dongle\setupdrv.exe). Afterward, when I launched the application, I was pleased that my other settings, including recent projects, were all intact.

According to Atril's version history, the changes in the new build versus Build 303 are:

Added new filter for working with XLIFF files, including SDL Trados Studio 2009 SDLXLIFF
Added support for FrameMaker v9.0 in FrameMaker MIF filter
Added support for InDesign CS4 in InDesign INX filter
Microsoft Windows 7 officially supported
Microsoft Office 2010 (current at Beta 2) officially supported
Improvements in the RTF filter, including reduced extraneous codes and improved performance
Fixed issues with curly brackets in PO filter
Improvements in the XML filter, including better handling of large CDATA sections and improved performance
Improvements and fixes in the MIF filter, including better handling of index entries, footnotes and text insets
Various fixes to the SDL IDT filter
Fixed issues with exporting satellites
Improvements in number handling (particularly in Propagate) and case conversions
Fixed issues with filter on selection
Fixed various issues with mouse/keyboard focus
Fixed various issues with TMX import/export
Fixed various issues with MultiTerm import
Fixed various issues with TM import/export
Fixed various issues with TD import/export
Fixed issues with alternate portion handling when working with a separate edit area
Fixed issues with search and replace in TM and project

The points highlighted in red are ones that have particularly concerned me in my work; others will have a greater interest in other points, of course. I particularly look forward to seeing if the improvements in the RTF filter will eliminate the need to run Dave Turner's CodeZapper macro on almost every RTF or DOC file I translate with Déjà Vu. Also, the fact that all attempts to import MultiTerm data in the past year have failed has been very irritating. I look forward to testing the performance of the new InDesign filter; in prior versions the filter was vastly inferior to the one in SDL Trados TagEditor, and the best I worked with in most cases was Kilgray's filter for MemoQ.

As has always been the case so far, this upgrade is free to all registered users of DVX. Free upgrades forever aren't the smartest business model if you are trying to cover the cost of ongoing development and support, so I hope that changes at some point so we might see more frequent improvements to what is still in many respects the best translation environment tool (TEnT) option available for freelance translators and small agencies. When asked what I recommend these days, it's a hard call for me. Most of the time I recommend MemoQ now, because of the advanced features, momentum and support that product has as well as its affordable server capabilities, but for quite a number of project types I do frequently, Déjà Vu X remains a critical element. Right now it's very hard to state the best technical choice without knowing a lot about the asker's project mix, so my recommendation is usually based largely on support now. In that respect the team of Atril and PowerLing still has a lot of lost ground to recover.
