Jan 25, 2009

Outsourcing translation work from DVX Workgroup to OmegaT

Discussions of CAT tools never end in the online forums and elsewhere, and there are often heated arguments about which tools are best, whether claims of compatibility are to be taken seriously, etc. Not infrequently, fire-breathing Open Source fanatics or even calmer souls who prefer free software to commercial stuff that would probably increase their efficiency and certainly improve their marketability make a case for tools like MetaTexis or OmegaT. I've always been rather skeptical of tools like these, especially after discussions with developers who have confirmed some of the limitations. One "knockout criterion" for me was that this software allows the translator to translate only a limited number of formats, not including many of the most popular ones. The usual workflow for MS Word documents with OmegaT, for example, involves conversion to ODT format for translation and subsequent re-conversion. I suspect there may be some issues here, because I used to use OpenOffice to clean up trash codes in MS Word and RTF files, and sometimes the formatting got messed up.

After several discussions with Marc Prior, the project coordinator for OmegaT, a workflow became apparent which would give both of the aforementioned free tools the ability to work on projects involving any format that can be processed by Déjà Vu X or SDL Trados. It's quite simple, really, if you have access to DVX Workgroup. The steps are as follows:
  1. Export an "external view" as an RTF table with all segments or whatever portion you want to have worked on by the OmegaT user.
  2. The OmegaT user opens the RTF external view and copies the source column to a new OpenOffice ODT document.
  3. The ODT document is translated normally in OmegaT, with care taken not to damage the codes enclosed in curly brackets (example: {13}).
  4. After the target file is exported, the table column in it is copied to the target text column of the original RTF external view table.
  5. The RTF external view table is re-imported to the DVX project.
Since DVX can process all Trados formats if they are pre-segmented, this method also enables OmegaT to process any Trados files with a guarantee of 100% compatibility.
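
The riskiest part of the steps above is step 3, since a damaged or dropped code may cause trouble on re-import. Before pasting the target column back into the external view, the codes could be checked with a small script. The following is only a sketch under stated assumptions: it assumes the codes always look like a number in curly brackets, as in the {13} example, and that the source and target columns have already been saved as two lists of segment strings. The function name `code_mismatches` is made up for illustration and is not part of DVX or OmegaT.

```python
import re
from collections import Counter

# Assumption: codes are a number in curly brackets, like {13}.
CODE_PATTERN = re.compile(r"\{\d+\}")


def code_mismatches(source_segments, target_segments):
    """Compare the codes in each source/target pair.

    Returns a list of (segment_index, missing_codes, extra_codes)
    for every pair whose codes do not match exactly.
    """
    if len(source_segments) != len(target_segments):
        raise ValueError("source and target have different numbers of segments")

    problems = []
    for index, (src, tgt) in enumerate(zip(source_segments, target_segments)):
        src_codes = Counter(CODE_PATTERN.findall(src))
        tgt_codes = Counter(CODE_PATTERN.findall(tgt))
        if src_codes != tgt_codes:
            # Codes present in the source but lost in the translation.
            missing = sorted((src_codes - tgt_codes).elements())
            # Codes in the translation that the source never had.
            extra = sorted((tgt_codes - src_codes).elements())
            problems.append((index, missing, extra))
    return problems


if __name__ == "__main__":
    source = ["Click {13}here{14} to continue.", "No codes in this one."]
    target = ["Klicken Sie {13}hier, um fortzufahren.", "Hier gibt es keine Codes."]
    for index, missing, extra in code_mismatches(source, target):
        print(f"Segment {index}: missing {missing}, extra {extra}")
```

This only counts codes; it does not check their order, which may legitimately change in translation.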

TM content can be exported from DVX or Trados as TMX and placed in the TM folder of the OmegaT project. The matching may not be optimal this way, so the translator may have to work a little harder, but this is not necessarily a bad thing. Fuzzy matches are processed a little differently in OmegaT, for example by putting [fuzzy] before a segment which is automatically inserted. Perhaps this will force some people to stop and think about a match before blindly continuing to the next segment. If one accidentally continues to the next segment, the automatic markers will make it easier to find the error. I did notice in my brief tests that subsegment matches from a TMX file were displayed in one of the panes on the right of the screen; this was very helpful. Terminology can be shared with OmegaT via tab-delimited text files with three columns (source, target and additional info). Unfortunately OmegaT does not appear to support fuzzy matching of terminology, so it may be necessary for the translator to do manual lookups if a term occurs as a plural but is listed in the singular in the glossary.
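
To make the glossary part concrete, here is a rough sketch of writing the three-column tab-delimited file and of the kind of manual plural lookup described above. Everything in it is an assumption for illustration: the function names are invented, the plural handling is a crude English-only rule (strip "es" or "s"), and nothing here reflects how OmegaT itself reads or matches glossaries.

```python
def write_glossary(entries, stream):
    """Write (source, target, info) tuples as tab-delimited lines."""
    for source, target, info in entries:
        stream.write("\t".join([source, target, info]) + "\n")


def lookup_with_plural_fallback(word, glossary):
    """Look up a word; if absent, retry with a naive English singular.

    `glossary` maps lower-case source terms to target terms.
    Returns the target term, or None if nothing matches.
    """
    key = word.lower()
    candidates = [key]
    if key.endswith("es"):
        candidates.append(key[:-2])  # e.g. "boxes" -> "box"
    if key.endswith("s"):
        candidates.append(key[:-1])  # e.g. "files" -> "file"
    for candidate in candidates:
        if candidate in glossary:
            return glossary[candidate]
    return None


if __name__ == "__main__":
    import io

    entries = [
        ("file", "Datei", "noun"),
        ("box", "Feld", "UI element"),
    ]
    buffer = io.StringIO()
    write_glossary(entries, buffer)
    print(buffer.getvalue(), end="")

    glossary = {source: target for source, target, _ in entries}
    print(lookup_with_plural_fallback("Files", glossary))
    print(lookup_with_plural_fallback("boxes", glossary))
```

A rule this simple will miss irregular plurals and is of no use for source languages other than English, so it is only a stopgap for the manual lookups.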

Thus outsourcers who work with Déjà Vu can cooperate safely with users of freeware tools like OmegaT and be assured of reasonably efficient information exchange and completely compatible final results.


  1. This would certainly be great, if only more outsourcers were using DVX. Most are using Trados, and currently there is no easy way to handle Trados TagEditor files (though OmT v2.0 will be able to handle bilingual XLIFF files, so there is a chance that TagEditor will be supported).

  2. There are some outsourcers that use DVX, but the majority use Trados, and a growing number of others use tools like MemoQ or that abomination Across. It's funny, though, because if you look at the translation market or the translation tools market as a whole, among translators, Trados does not in fact enjoy a majority share. Thus by insisting on blind adherence to SDL Trados in all things, many outsourcers are really undermining their own business. The smart approach for them would be to find a way to make their projects accessible to any translator (including those who don't use CAT tools) while still maintaining full compatibility with their favored tool. Déjà Vu X offers exactly this possibility at what for an agency or company is really a trivial investment (especially when one considers Atril's upgrade policies).
    I keep hoping that MemoQ will add a function very much like the "external view" RTF tables in DVX, and I'm told this is under consideration, but it hasn't happened yet.

  3. "The smart approach for them would be to find a way to make their projects accessible to any translator (including those who don't use CAT tools) while still maintaining full compatibility with their favored tool."

    I think that most outsourcers are doing so by using Idiom WorldServer or similar tools, hence giving the translator a way to work on the files on their own servers rather than handing the material out and hoping to receive something compatible in return.

  4. SDL announces plans to switch from ttx to xliff format in the next Trados version, so it's going to be much more compatible with other software.

  5. I had seen a comment about the changeover to XLIFF (and the probable demise of the hated TagEditor) in a review some weeks ago, but I couldn't find the link again (didn't look that hard either). This, of course, would change the game quite a bit in a positive way.

