Jun 14, 2018

Translating Wordfast GLP packages... elsewhere.


One reason to keep translation environment tool licenses up to date is that new formats continue to appear: new formats for translatable files as well as new file formats for the tools that help to process files for translation. Very often I have heard some "professional" say, "I'm a translator, not a [fill in the blank]. If the client wants this translated, I'll have to get it in a Microsoft Word file." Or something like that.

Let's get real for a moment.

  • That attitude is simply lazy and disrespectful toward translation consumers who would like to make use of one's services and
  • a lot of money is being left on the table here in many cases. I built a huge clientele at the start of the last decade, because my use of translation environment tools like Trados, Déjà Vu, STAR Transit and Wordfast enabled me as an individual to tackle translation challenges that many agencies at the time had no concept of how to cope with.
Translation agencies have since acquired more technical tools, but most of them unfortunately remain unaware of how to use them properly or how to plan anything beyond the simplest workflows. That's a subject for another day. Also...
  • ... by using tools and techniques that are compatible with what your clients require for a final format, you can save your client a lot of time and money for further layout work - and probably avoid the introduction of errors in your translation work in its final format as well.
  • And in my experience, showing technical and process competence to benefit clients usually leads to greater trust and better work together.
So what has all this got to do with Wordfast?

Well... I didn't like the Wordfast brand for a very long time. Its various incarnations were perhaps the weakest of the popular tools in a technical sense, and inevitably when agency friends called me, desperate to fix some massive translator screw-up (usually by somebody in France), Wordfast "Pro" was often involved in the disaster.

I looked at the "newer" Wordfast versions a number of times over the years, and honestly they always seemed like lobotomized wannabe tools. This was about the time that many other toolmakers were trying to decide if they should support XLIFF.

Well, a lot has changed since then. I became aware of the changes the other day when somebody posted a question in a social media forum for memoQ asking how to handle Wordfast Pro 5 GLP packages. I had never heard of these, so of course I was curious and decided to take a look. This finally led me to download a 30-day trial of the latest Wordfast Pro software to evaluate its potential for interoperable work with other translation environments. I see a lot of changes since my last look, and so far I think they are all positive. Along the way I also had good cause to look at Wordfast Anywhere, the free web-based CAT tool that I talked some university colleagues out of wasting their time on a while ago. Well, my recommendation in that regard might change, but that and commentary on the latest incarnation of WF Pro will have to wait for another day.

About those GLP packages....


Yes, those. This was the question:


Someone pointed out that GLP files - like every other translation "package" one finds from all the tool providers - are merely ZIP files with particular structure inside and the extension re-named. 


Gotta love Facebook. You'll always get an answer in some group, usually a wrong one. That's why I keep a blog. Good information gets buried in social media noise too often, and good luck finding it in any kind of search. In this case... we don' have no steenkeen TXML files as I learned... that's the old Wordfast Pro....

A colleague in Germany kindly provided me with a little GLP package to examine, which I promptly unzipped. I noticed that at least one tool (7-Zip) sees through the renamed extension nonsense and saved me the usual trouble of renaming it before unpacking.
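For anyone who would rather script this step than use 7-Zip, here is a minimal Python sketch. It assumes only what is described above, namely that the GLP file is an ordinary ZIP archive under a different extension; the function name `unpack_glp` and the paths in the usage line are made up for illustration.

```python
import zipfile
from pathlib import Path


def unpack_glp(glp_path, dest_dir):
    """Extract a GLP package, assuming it is a ZIP archive with a renamed extension.

    Returns the list of member names found in the package.
    """
    glp_path = Path(glp_path)
    # zipfile looks at the file contents, not the extension, so no renaming is needed.
    if not zipfile.is_zipfile(glp_path):
        raise ValueError(f"{glp_path} is not a ZIP archive")
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(glp_path) as package:
        package.extractall(dest)
        return package.namelist()
```

Calling `unpack_glp("project.glp", "project_unpacked")` with a hypothetical package name would create the folder and return the names of everything inside, which is a quick way to see what a package contains.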


So far, so good... inside the folder for the unpacked GLP file I found the following:


The test package was an English to Portuguese project. But source? Hello? Let's have a look there!


Very interesting. The original source files (English) came along for the ride. This is good, because I often like to translate source files in memoQ - taking advantage of the preview there for many file types - and then use the translation memory to translate the file that is created by the other tool (usually SDL Trados SDLXLIFF files in my work). Now let's have a look inside the pt target folder. There's actually another folder named txlf inside that one. And there I found:


No TXML files! TXLF is a new instance of the rather ubiquitous XLIFF files one finds in the translation world, some of which have some rather bothersome "extensions" that may require special handling in the translation process. In the simple test I performed, none of that was apparent; an ordinary XLIFF filter seemed to work well. I hope future tests will show me whether there are any quirks, but so far, so good.
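To peek inside such a file without any CAT tool, a short Python sketch like the following can list the segments. It is written under the assumption that a TXLF file uses the usual XLIFF element names (`trans-unit`, `source`, `target`); it matches on local names only, so vendor namespaces should not get in the way, but any Wordfast-specific extensions are simply ignored. The function names are hypothetical.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts in front of tag names."""
    return tag.rsplit("}", 1)[-1] if isinstance(tag, str) else ""


def read_segments(xliff_data):
    """Return (id, source text, target text) for each trans-unit in XLIFF data.

    Accepts str or bytes. Text inside inline elements is kept, the inline
    tags themselves are dropped, and a missing target comes back as "".
    """
    root = ET.fromstring(xliff_data)
    segments = []
    for element in root.iter():
        if local_name(element.tag) != "trans-unit":
            continue
        source = target = ""
        for child in element:
            if local_name(child.tag) == "source":
                source = "".join(child.itertext())
            elif local_name(child.tag) == "target":
                target = "".join(child.itertext())
        segments.append((element.get("id"), source, target))
    return segments
```

For a file on disk, passing the raw bytes (for example `Path("some_file.txlf").read_bytes()`, with a file name of your own) lets the XML parser work out the encoding itself.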

So one strategy, with pretty much any CAT tool, would be to unpack the GLP file, get at those TXLF files and then bring them into another working environment using an XLIFF filter. Maybe use my approach with the source files too, which will ensure that you can deliver a good target file even if quirky tags in the XLIFF lead you to produce a less than optimal result there.
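The "get at those TXLF files" step can be sketched in a few lines of Python. This assumes, as above, that the package is a plain ZIP archive and that the bilingual files end in .txlf; the function name `extract_txlf` and the folder names in the usage line are invented for the example.

```python
import zipfile
from pathlib import Path


def extract_txlf(glp_path, dest_dir):
    """Extract only the .txlf members of a GLP package into dest_dir.

    The folder structure inside the package is preserved; the return value
    is the list of extracted file paths.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    extracted = []
    with zipfile.ZipFile(glp_path) as package:
        for name in package.namelist():
            if name.lower().endswith(".txlf"):
                # ZipFile.extract returns the path of the file it created.
                extracted.append(Path(package.extract(name, dest)))
    return extracted
```

Something like `extract_txlf("project.glp", "for_import")` would leave just the bilingual files in a folder ready to be imported with an XLIFF filter, while the source files stay in the package.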


The current version of memoQ (8.4) does not recognize the TXLF extension, so as in all such cases, the All files option must be used and the correct filter applied in a later dialog. Unlike with some other tools, memoQ cannot be "trained" by the user to recognize new extensions as far as I know.

But what about importing the GLP files directly to memoQ? Wouldn't that be nice? And I thought it might be possible using the ZIP file filter recently introduced (and the same All files trick to get the GLP file and apply the ZIP filter later). Well...


It looked promising.


So much so that I even optimistically named and saved a custom configuration for the ZIP filter. All I need to do now is cascade an XLIFF filter!


Ack. Sooooo close. I've been here before. There are more things in heaven and down-to-earth cascading formats, Kilgray, than are dreamt of in your philosophy! Please, please expand the list of possible cascaded formats sensibly to make better use of this lovely new ZIP filter!

So for now, that's a no-go, but soon? Who knows? If you bother support@kilgray.com and tell the memoQ team how helpful it would be, maybe this and similar problems can be solved with relative ease.

In any case, for now it seems that the unpack-and-do-the-XLIFF approach will work for most anyone with a modern CAT tool. And that's good news, because in today's fast-changing technology environment for translation, interoperability of CAT tools is increasingly important. It is a foolish waste of time to translate in a large number of CAT tools and probably a bad idea to do so in two or three according to my old research. I've usually found that such JOATs are, professionally, often stupid goats who lack the depth in a single major environment or two, which could allow them to get the most out of their tools and serve their clients in the best way with their linguistic skills and subject matter knowledge.

So is the latest Wordfast a tool worth checking out? I don't know yet. But it may be used by colleagues and clients with whom I like to work, and understanding how to share projects and project resources in painless ways will benefit all of us, no matter what our tool preferences may be. Wordfast seems to be developing very much in that spirit, so I will revisit it for more collaboration scenarios in the future.


2 comments:

  1. Thanks Kevin for an excellent blog. Until recently I remained with memoQ 2015 but was forced to look into the WF 5.0/GLP issue recently (how I found your blog) and went through the processes you outline above. I did change to mQ 8.7.1 in the meantime, and as a result of my GLP import efforts failing at exactly the same point as yours, I took your cue and sent a test GLP to mQ support. They confirm after testing that "currently no bilingual files are supported by the [ZIP] filter." That is indeed a pity, as it really would make life a lot easier if those file types could be handled. We can but wait. All the best in the meantime.
    Laurence Fogarty/info@geminilanguageservices.com.

  2. A challenge is to import the match rates from TXLF or TXML files to memoQ.

