Aug 19, 2015

Doing the deed with the DGT

Several years ago a legal translator in my circles began to use memoQ for her work, and I was asked to help with the migration of data from her old environment. When she was introduced to memoQ LiveDocs, she was delighted to learn that she was able to view the original document text or bitext of concordance hits for content saved in a LiveDocs corpus.

Because her work involved a lot of references to EU directives and other information sources from the EU, the parallel corpora from the DGT had great value to her work. These are enormous bodies of data, totaling several million translation units and growing constantly. Many translators in the EU use this data, but the sheer bulk of it tends to be burdensome to many translation environments, and the lack of context often limits the value of information retrieved from these corpora when stored in translation memories.

So she decided that LiveDocs was the medium in which the DGT data were to be stored, and because the DGT translation memories contain their data in sequential document order, the document context of any concordance hits can be viewed using the context menu in the memoQ concordance.

Thanks to the expansion of file types which can be included in LiveDocs since that time, it is easier than ever to import data from parallel corpora like the EU DGT and use these to support translation work. Using the LiveDocs approach, the extraction of a single large bilingual TMX file from the many zipped data collections is also completely unnecessary (in fact, the extreme quantity of data in those single files inevitably causes memory problems). To build reference corpora for concordancing or the construction of predictive typing resources such as Muses in memoQ, it is only necessary to unpack the individual zip files into folders full of small TMX files and then import these folder structures into memoQ.

Include only TMX files in the LiveDocs corpus import.

Selecting the desired languages extracts the bilingual data from the individual TMX files, which contain data in all the official EU languages. If a particular file does not contain the desired pairing, a corresponding message will be displayed. Don't worry about it.
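For anyone curious what that language selection amounts to, or who wants to check a file outside memoQ, here is a minimal Python sketch. It is not how memoQ does it internally. The function names, the sample data and the language codes (EN-GB, DE-DE, FR-FR) are my own assumptions for illustration, and the code accepts either an `xml:lang` or a plain `lang` attribute on each `<tuv>` element, since TMX files vary on this point.

```python
import xml.etree.ElementTree as ET

# Fully qualified name that ElementTree uses for the xml:lang attribute.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def tuv_language(tuv):
    """Return the upper-cased language code of a <tuv> element, or ''."""
    return (tuv.get(XML_LANG) or tuv.get("lang") or "").upper()


def extract_pairs(tmx_root, source_lang, target_lang):
    """Collect (source, target) segment pairs from a multilingual TMX tree.

    Translation units lacking either language are skipped, which mirrors
    the "pairing not found" case mentioned above.
    """
    source_lang = source_lang.upper()
    target_lang = target_lang.upper()
    pairs = []
    for tu in tmx_root.iter("tu"):
        segments = {}
        for tuv in tu.findall("tuv"):
            seg = tuv.find("seg")
            if seg is not None:
                # itertext() also picks up text inside inline tags.
                segments[tuv_language(tuv)] = "".join(seg.itertext())
        if source_lang in segments and target_lang in segments:
            pairs.append((segments[source_lang], segments[target_lang]))
    return pairs


# Tiny made-up sample, not real DGT data.
SAMPLE_TMX = """<tmx version="1.4">
  <header/>
  <body>
    <tu>
      <tuv xml:lang="EN-GB"><seg>Article 1</seg></tuv>
      <tuv xml:lang="DE-DE"><seg>Artikel 1</seg></tuv>
      <tuv xml:lang="FR-FR"><seg>Article premier</seg></tuv>
    </tu>
    <tu>
      <tuv xml:lang="EN-GB"><seg>Scope</seg></tuv>
      <tuv xml:lang="FR-FR"><seg>Champ d'application</seg></tuv>
    </tu>
  </body>
</tmx>"""

root = ET.fromstring(SAMPLE_TMX)
pairs = extract_pairs(root, "de-de", "en-gb")
print(pairs)
```

The second translation unit in the sample has no German, so only the first one ends up in the result. For a real file you would load the tree with `ET.parse(path).getroot()` instead of `ET.fromstring`.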

This approach of loading smaller TMX files into LiveDocs overcomes the memory problems which may occur with gigantic files. And once these smaller files are in a LiveDocs corpus, they can be selected en masse and exported to one or more translation memories.
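The unpacking step that produces those folders of small TMX files can be done by hand, but with dozens of zip archives a short script saves some clicking. The following Python sketch is one way to do it, assuming all the downloaded zip files sit together in one folder. The function name and the folder layout (one subfolder per archive) are my own choices, not anything prescribed by the DGT or memoQ.

```python
import zipfile
from pathlib import Path


def unpack_tmx_archives(archive_dir, output_dir):
    """Unpack every .zip in archive_dir into its own subfolder of output_dir.

    Only members ending in .tmx are written; anything else in the
    archives is ignored. Returns the number of TMX files written.
    """
    archive_dir = Path(archive_dir)
    output_dir = Path(output_dir)
    count = 0
    for archive in sorted(archive_dir.glob("*.zip")):
        # One subfolder per archive, named after the zip file.
        target = output_dir / archive.stem
        target.mkdir(parents=True, exist_ok=True)
        with zipfile.ZipFile(archive) as zf:
            for member in zf.infolist():
                if member.is_dir():
                    continue
                if not member.filename.lower().endswith(".tmx"):
                    continue
                # Flatten any folders inside the archive: keep the file name only.
                destination = target / Path(member.filename).name
                destination.write_bytes(zf.read(member))
                count += 1
    return count
```

Calling `unpack_tmx_archives("dgt_zips", "dgt_tmx")` (both folder names are placeholders) would leave a `dgt_tmx` folder whose subfolders can then be imported into a LiveDocs corpus as described above.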

In fact, this approach is useful to get around the current inability of memoQ translation memories to import more than one TMX file directly at a time. This may be helpful, for example, to OmegaT users who want to migrate their many TMX translation memories (one from each project!) if they start using memoQ.

Aug 18, 2015

Enter the Dragon, Anywhere!

Today Nuance made a presentation of a new product to be released this autumn for mobile devices using iOS and Android operating systems: Dragon Anywhere.

The mobile app will allow secure transcription with a WiFi or cellular data connection as well as synchronization of custom vocabulary with Dragon desktop computer applications for Windows and MacOS.

The initial presentation made no mention of which languages will be available in Dragon Anywhere; the synchronization feature makes me worry that it might be restricted to the current seven or eight languages available for Dragon NaturallySpeaking (Windows) or Dragon Dictate (Mac), but perhaps the standalone applications for desktop computers and laptops will finally be upgraded to offer the 40+ languages currently available for mobile devices with apps such as Dragon Dictation for iOS or Swype + Dragon Dictation for Android.

It is also unclear at this point whether the new Dragon Anywhere app will allow direct dictation or transfer to the cursor location on a linked computer, as one can do with myEcho or using Swype + Dragon Dictation in conjunction with Chrome Remote Desktop. But with the addition of extensive voice-controlled editing features to the mobile app, Dragon Anywhere represents significant progress toward better ergonomics for writing and translation!

Aug 12, 2015

"Upgrading" translation memories for document context

Translation memories are in a way the Zombie Apocalypse of our profession. Dead data walking, with rotting bits falling off and lying out of context in a concordance for the reader to puzzle over, wondering where that bit of wordflesh once fit in the whole.

How often in my Dark Days as a Trados User did I look at some concordance hit and wonder just how stoned the translator was, correct the "mistake" and discover later that it was quite right in context? Context matters, but with translation memories it's like the Invisible Hand at best, with a few missing fingers.

I once was lost, but now I found memoQ LiveDocs, and I
  1. export my context-free or -invisible TMs to TMX, then
  2. import the TMX to LiveDocs
where I can now go directly from a concordance hit to the translation unit in LiveDocs and read it in document context, provided that chunk of text was in fact written to the TM in the sequential order of the document, which is often the case.