Jul 7, 2013

A quick look at XLIFF:doc in memoQ 2013

memoQ 2013 introduced a few new import filters: the TMX translation import filter, which I expect will mostly find use for PEMT applications and translation memory quality assurance; the GetText PO filter, which I suspect is of limited interest unless you are one of those people who consider "xl8" and "l10n" to be in the vanguard of spelling reform; the filter and wizard for TIPP packages, an interchange format promoted by Interoperability Now! and the Linport Project; and XLIFF:doc, which has been the subject of much discussion among the l10ners of my acquaintance.

Despite its too-geeky company, XLIFF:doc is actually a rather interesting format. The sample I obtained showed how a document can contain content for multiple translation files, which are displayed separately in the working environment after import and, most interestingly, how file previews and terminology match information can be embedded in the XLIFF:doc file for reference.
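
For the curious, here is a minimal sketch of what that multi-file structure looks like to a script. It assumes that XLIFF:doc follows the standard XLIFF 1.2 layout, with one `<file>` element per translatable document and `<trans-unit>` elements inside each. The sample content, the file names and the `summarize_xliff` helper are made up for illustration, and the embedded previews and terminology are not modeled here.

```python
import xml.etree.ElementTree as ET

# Namespace of standard XLIFF 1.2 (assumed to apply to XLIFF:doc as well).
XLIFF_NS = {"x": "urn:oasis:names:tc:xliff:document:1.2"}

# Hypothetical sample: one XLIFF document carrying two translation files.
SAMPLE = """<xliff version="1.2" xmlns="urn:oasis:names:tc:xliff:document:1.2">
  <file original="manual.docx" source-language="de" target-language="en" datatype="plaintext">
    <body>
      <trans-unit id="1"><source>Hallo Welt</source><target>Hello world</target></trans-unit>
    </body>
  </file>
  <file original="readme.txt" source-language="de" target-language="en" datatype="plaintext">
    <body>
      <trans-unit id="1"><source>Lies mich</source><target/></trans-unit>
    </body>
  </file>
</xliff>"""


def summarize_xliff(xliff_text):
    """Return a list of (original file name, [(unit id, source, target), ...])."""
    root = ET.fromstring(xliff_text)
    summary = []
    for file_el in root.findall("x:file", XLIFF_NS):
        units = []
        for tu in file_el.findall(".//x:trans-unit", XLIFF_NS):
            source = tu.findtext("x:source", default="", namespaces=XLIFF_NS)
            target = tu.findtext("x:target", default="", namespaces=XLIFF_NS)
            units.append((tu.get("id"), source, target))
        summary.append((file_el.get("original"), units))
    return summary


if __name__ == "__main__":
    for name, units in summarize_xliff(SAMPLE):
        print(name, len(units), "segment(s)")
```

Each `<file>` element corresponds to one of the documents that show up separately in the working environment after import.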

One of the serious problems at present with memoQ's interoperability solutions for proprietary packages from other vendors is that terminology is not transferred.

If the package comes from an SDL Trados Studio user, odds are that this won't be a problem, as the majority of Trados users seem to be terrified of managing terminology with MultiTerm, and damned few of them actually do it. I was often disappointed to find that I had spent years maintaining specialist terminologies in MultiTerm for various "sophisticated" clients of mine who used Trados, only to discover one day that the data had actually never been used, because none of the client's staff dared to use MultiTerm. Of the dozen or so SDLPPX files I've been given for translation jobs in the past year or so, not one included terminology, just a bit of TM data, which transfers nicely.

STAR Transit users have a better-integrated, somewhat more user-friendly terminology option and, as one might expect, use it more often. For now, that terminology has to be acquired in other ways for use in memoQ.

Using XLIFF:doc as an exchange format for translation offers a strategy for overcoming this long-standing problem. As far as I can tell, Ontram and XTM appear to be the only translation environments which generate or intend to generate XLIFF:doc as part of the Translation Interoperability Protocol packages (the TIPP packages which memoQ now imports).

Kilgray has stated no plans to generate XLIFF:doc or TIPP from memoQ projects, but I hope this changes. If all the major translation environments were to create files of these types as well as read them, we would be much, much further along with trouble-free exchange of data for projects with heterogeneous working tools.

Here's a video offering a quick "inside look" at an XLIFF:doc file imported to memoQ and the reference information (previews, terms and translation matches) it can contain:


  1. Interesting post, Kevin... what are your views on the likelihood of receiving source files as XLIFF:doc? By this I mean, do you have any insight into how many clients are sending out work in this format?

  2. As you know, Paul, that would depend on how widely the format is adopted. If you work with others who use XTM Cloud or Ontram, I would say there's a pretty good chance of seeing XLIFF:doc sooner rather than later. It might be in a TIPP file, but considering that memoQ will export the individual XLIFF:doc files of a TIPP package and others may as well, these files might be unbundled.

    If SDL and/or Kilgray were to support the creation of XLIFF:doc as an export format, I think its use would spread more quickly and cross-platform projects would be possible with far fewer difficulties than we face today. Of course, the makers of the One Ring in Maidenhead would probably prefer to see everyone locked in to a single platform (one of theirs, but which?), though if you consider how good competition has proved to be for SDL Trados in recent years, I think this would only make things better.

    I like the fact that the embedded terminology feature in XLIFF:doc makes it possible to share relevant terminology for a document without opening up access to your entire termbase. Given the paranoia of some corporates, this might even be a selling point.

  3. I'm surprised to see you make this comment on being locked in, Kevin. I think we've done more than anyone else to expose the ability of others to integrate workflows with SDL tools. If we really did think as you said, then we would never have provided the comprehensive SDK we have or launched the OpenExchange in the first place, and we would never have based our own internal format on XLIFF.
    So I guess the answer to my question is that we may see files like this coming from tools that create this format... in much the same way as we see memoqxliff and sdlxliff? What I was more interested in was whether you had any insight into how many work givers will adopt this? That would be the turning point... until then it's just another flavour of XLIFF that other tools will have to try and support. Once all tools at least support it (so not create it just handle it correctly) I think whether it's a simpler and more consistent flavour for all tools to support or not is really irrelevant because it has reduced value for users within the same supply chain.
    So for me it will be interesting to see whether the work givers adopt this and push it forward.

    1. I'm teasing you Paul... and I like to use some of the things you mention as a goad for some of your competitors. Some people like to make rude remarks about the OpenExchange; regardless of whatever those disputed participation levels may be, I think it's a great program worth emulating.

      As for lock-in, whether intentional or not, that is an unfortunate consequence of the move to "live" server-based projects these days, but that's another can of worms. I won't bore anyone here with a repetition of my concern that these servers are undermining more than a decade of work on interoperability between tools, which I hope will be changed by integration options for third-party clients in the various servers.

      XLIFF:doc seems to be a little more than just another flavor of XLIFF like SDL or Kilgray have spawned. Maybe I'm ignorant of other similar "flavors", but I think the inclusion of an accessible preview (which I think is updated during translation) and the embedding of terminology are very useful improvements. You are probably aware that there are industrial companies among the developers of the format, and I think it would be a huge advantage to the "work givers" to have a reliable, standardized format for project packages. Hell, SDL doesn't even have that internally! The current incarnation of the WorldServer has its own package format to be maintained and exchanged, and apparently this has also changed recently, as I have noted from compatibility issues with older versions of Studio.

      "Reduced value for users within the same supply chain"... is very much a debatable point, just another way of saying that everyone should use Studio (or memoQ or WordFast or whatever). The greatest value, which people too often forget and the bozos at TAUS probably never learned, is competent translation by someone with appropriate subject knowledge and linguistic skills. And tool switching (though foolishly practiced by many) is ergonomically bad and causes considerable losses in efficiency. So it's better for all to ensure that the most useful data are transmitted losslessly for processing using accessible, agreed standards.

    2. I was just wishing that SDL already had a TIPP package handler so I could easily handle a client that just switched to XTM, and still use all my glossaries, etc., without having to do a separate lookup step. TIPP is sponsored by Interoperability Now!, which I'm surprised SDL does not seem to be part of.

