Translation Tribulations: 2015

Dec 31, 2015

2015: that was the year that was.

Another year's over, my friend,

and once again fools can pretend

that Translations with Borders

of quite simian orders

didn't MpT heads to no end.

Nov 26, 2015

Fuzzy match of the month - WTF?!

Experienced translators using translation environment tool technology are quite familiar with the ludicrous results often obtained by so-called "fuzzy" matches in translation. For some 20 years now, the lie has been propagated that such matches usually help translators to work faster and that such "matches" therefore obligate one to offer discounts.

I will not rehash the familiar arguments and evidence that even truly close matches with the difference of a little word or two can cost more time than translation from scratch with no reference text or the fact that modern translation tools are useful primarily as a guide to facilitate consistency and not necessarily speed of work, especially if the translator is a real one with strong language and subject matter skills. Of course there are monkey-level jobs where a fuzzy match can usually be expected to save time, but once one ventures into fields such as legal or financial translation this is not the case as often as the linguistic sausage providers (aka LSPs) might claim.

I just wanted to share this little screenshot from my "daily bread", because it truly is worthy of sewer disposal.

All fuzzy matches are not created equal; every tool on the market will spew nonsense, and these nonsensical "values" are not even close to consistent between tools. It's time to cut the crap with fuzzies as a real means of evaluating work effort. Or at least share some of what the believers are smoking to reach such conclusions.
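To make the point about inconsistent values concrete, here is a minimal sketch using Python's standard difflib module. It is not the algorithm of any translation tool (those are proprietary and not described here); the two scoring functions and the example sentences are purely illustrative assumptions. It only shows that the same pair of segments gets a different "match" value depending on whether characters or words are compared, and that a high score says nothing about a reversal of meaning.

```python
from difflib import SequenceMatcher


def char_similarity(a: str, b: str) -> float:
    """Similarity ratio computed over individual characters."""
    return SequenceMatcher(None, a, b).ratio()


def word_similarity(a: str, b: str) -> float:
    """Similarity ratio computed over whitespace-separated words."""
    return SequenceMatcher(None, a.split(), b.split()).ratio()


# Illustrative segments: one "little word" reverses the legal meaning.
stored = "The buyer shall pay the price"
new = "The buyer shall not pay the price"

char_score = char_similarity(stored, new)
word_score = word_similarity(stored, new)

print(f"character-based score: {char_score:.1%}")
print(f"word-based score:      {word_score:.1%}")
```

Both scores come out high, yet they are not the same number, and neither reflects the effort of checking what that one extra word does to the sentence.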

Oct 29, 2015

Revised target document workflows in SDL Trados Studio 2015 vs. memoQ

Yesterday I had an unexpected opportunity to see the new SDL implementation of the feature Kilgray introduced to memoQ two years ago, in which a revised target document (or some portion thereof) is re-imported to a translation project for purposes of updating the translation memory. Since my involvement with the concept and specification of this feature in memoQ, I have been expecting the competition to follow suit, since in principle at least, this is a useful feature which nearly everyone can use in several common scenarios.

The way in which SDL Trados Studio 2015 handles project updates with edited target documents appears very different from what memoQ does, so that one might easily think that the functions are different. And this is one of those rare instances where I have to give SDL credit for a smoother, more streamlined procedure less likely to cause confusion and frustration among users.

The positive difference starts with the choice of terminology in the command interface. SDL refers to a "target document" rather than a "monolingual document" - I think this is less ambiguous and less likely to confuse an average user. The fact that these updates are perhaps not supported for bilingual formats in memoQ is one of those nerdy details which will not interest most people, especially given that there is a stable, established update process for project updates using bilingual documents.

When the reviewed file to import is selected, the user has the option to go to the aligner and correct possible matching errors for the revised target document (desirable if, for example, edits might cause the segmentation to change), but the default is to go straight back to the working window for translation and editing, with the changes already shown in tracked changes mode. Very nice.

In memoQ, the trip through the aligner is mandatory, but for simple changes, this is usually not needed, so I like the fact that Studio 2015 offers this as an option. And in memoQ, several extra steps are needed to show the changes in tracked mode (redlined markup), with confusing traps in the interface along the way. In a recent blog post, I described how Kilgray's emphasis on commands and terms relevant only to server projects, with the usual tracked changes options a translator would want buried under the "Custom" command, causes many users to conclude that tracked changes simply do not work in memoQ, which is not true at all. You just have to run the evil interface gauntlet to get there.

Does this mean I think everyone should dump memoQ and start using SDL Trados Studio 2015? Heck no. There are many processes involved in successful translation work, and switching from one tool to another based on a single feature or just a few features is not particularly clever, no matter which way you go. (Except for "away from Across", which is always a good idea.) I am very pleased and encouraged by SDL's different approach to this feature, because it shows once again the importance of competition and different approaches to a problem. Ultimately, ergonomics and user experiences should determine the further development of a feature. In my opinion, memoQ usually has the edge here, but not always, and this is a case where improvements to this innovative feature which first appeared in memoQ could very well be inspired by SDL.

Oct 27, 2015

Beware the document Reimport trap in memoQ!

In between sneezes and hot shots of gingered lime tea I saw the Skype icon on my Windows task bar change to indicate a message. A distress call from a financial translator friend who had just received a new version of the Q3 report she was translating. memoQ has excellent version management features, which include a document-based pretranslation (X-Translate), which allows one to use a current or previous version of a translation to identify unchanged sections which have already been translated when the client sends a new version. This avoids potential confusion with undesired matches coming out of any of the many translation memories or LiveDocs corpora which might be attached to a project.

This time, however, memoQ seemed to be getting weird on her, with error messages referring to ZIP archives and password protection. Her customer's file was not password protected, and as far as she knew, there was no ZIP archive anywhere in sight. She was dealing with "ordinary Word files". I have no idea what those are, but I hear about them often enough, and that is often where the trouble starts.

Last July I was teaching a week-long introductory course to memoQ in Lisbon, and when I wanted to show the course participants how this X-Translate feature worked, everyone ran into unexpected problems. When it was first introduced in memoQ, I noticed that the updates would work in any format. A translation which starts out as a script in a word processing file might later be updated as a set of presentation slides, and memoQ's document-based pretranslation did an excellent job of enabling me to focus quickly on the new material. It still does, but since the early days, some advocate of unintelligent programming decided that the filter used for the Reimport function to bring in the updated source text should assume that the source format was unchanged from the previous version rather than simply offer an appropriate filter for the current format. One must specify the filter to be used for an updated version if this assumption is not correct (as I also explained in my book New Beginnings with memoQ shortly after noticing this).

I can probably guess why this was done. With certain filters, the filter to use is not obvious from the extension (the multilingual delimited text filter, for example, if it is needed), or there may be a custom configuration of an "obvious" filter needed. In these cases, the assumption of using the last filter settings makes a lot of sense. However, if there is a change of format, where it is clear that the new filter should not apply, then some action should be taken other than a virtual assault on the user with mysterious error messages.
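The kind of decision I have in mind can be sketched in a few lines of Python. This is emphatically not how memoQ works internally; the function name, the filter labels and the extension table are all hypothetical and exist only to illustrate the logic: keep the previous filter settings when the format is unchanged, and fall back to a sensible default (or ask the user) when it is not.

```python
from pathlib import Path
from typing import Optional

# Hypothetical mapping of file extensions to default filter names.
# The labels are placeholders, not actual memoQ filter identifiers.
DEFAULT_FILTERS = {
    ".docx": "Word (DOCX)",
    ".doc": "Word (DOC)",
    ".rtf": "RTF",
    ".xlsx": "Excel",
    ".pptx": "PowerPoint",
}


def choose_reimport_filter(previous_file: str,
                           previous_filter: str,
                           new_file: str) -> Optional[str]:
    """Suggest a filter for a new version of a source document.

    Returns the previous filter if the extension is unchanged,
    a default for the new extension if one is known,
    and None if the user should be asked to choose.
    """
    old_ext = Path(previous_file).suffix.lower()
    new_ext = Path(new_file).suffix.lower()
    if old_ext == new_ext:
        # Same format: the earlier (possibly customized) settings still apply.
        return previous_filter
    # Format changed: the old filter cannot be right, so look up a default.
    return DEFAULT_FILTERS.get(new_ext)
```

With this logic, a DOCX original followed by a DOC update would get a DOC filter suggestion rather than an error message about ZIP archives, while a custom configuration would survive as long as the format stays the same.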

In the case of my financial translator friend, the update came as a DOC file, where the original had been DOCX. Geeks who have nothing better to learn with their time might know that DOCX files are actually renamed ZIP files, so at least the confusing error message above was "truthful" in a sense.
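For those who want to check what an "ordinary Word file" really is without relying on the (possibly hidden) extension, here is a small Python sketch that looks at the file content. It relies on three facts about the formats: DOCX files are ZIP archives containing a part named word/document.xml, legacy binary Office files such as DOC begin with a fixed eight-byte signature, and RTF files begin with the text {\rtf. The function name and the returned labels are my own invention.

```python
import zipfile

# Signature at the start of legacy binary Office files (DOC, XLS, PPT).
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")


def sniff_word_format(path: str) -> str:
    """Guess the real format of a file from its content, not its extension."""
    with open(path, "rb") as handle:
        head = handle.read(8)
    if head.startswith(OLE2_MAGIC):
        return "legacy binary Office file (e.g. DOC)"
    if head.startswith(b"{\\rtf"):
        return "RTF"
    if zipfile.is_zipfile(path):
        # A DOCX is a ZIP archive with the main text in word/document.xml.
        with zipfile.ZipFile(path) as archive:
            if "word/document.xml" in archive.namelist():
                return "DOCX"
        return "ZIP archive (not a Word document)"
    return "unknown"
```

Calling sniff_word_format on the old and new versions of a source file before a reimport would show at a glance whether the format has changed between versions.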

I see this sort of "switch hitting" with Microsoft Word file formats of various generations or changes from RTF to DOC or DOCX rather often. But in the case of importing new document versions, these changes mean trouble for memoQ if the user does not notice the difference, and given that the majority of working translators I have encountered who use Windows operating systems never fix the default system setting which hides the extensions of known file types, the chances that your average mortal wordworker will figure out this problem are just about zilch.

Armed with new insight into the problem, my friend was able to import the new document version successfully by specifying the appropriate filter manually and then use X-Translate to get her previous translation applied to sections of source text which had not changed (so that inappropriate 100% matches from a TM or LiveDocs corpus could be avoided). But for the future, I hope that Kilgray will apply a little more intelligent logic to the selection of filters for the document Reimport function of memoQ.

Oct 25, 2015

European Commission Workshop - Contracts for translation services

What the Linguistic Sausage Producers don't want you to know:

Did you know that tenders for work with the European Commission are not just for the big Wortwurstläden but can be submitted by individual translators who are EU citizens - and that these individuals have equal standing before the Directorate General for Translation? The DGT does not differentiate and many of its best external contractors are individuals, either self-employed persons or dynamic teams of two or three professionals.

The DGT uses taxpayers’ money and must be transparent, with fair and equal treatment for each candidate. Reading their specifications may appear daunting at first, but taking a closer look is worthwhile! Questions may be submitted and are answered during the weeks when the call for tender is open; this can be done in three languages, almost in real time, with all questions and replies made public on the DGT web site.

Quality pays and they will pay for quality: decisions are based on a quality/price ratio of 70/30, in favor of quality. For each job done, a quality note with feedback is sent to facilitate ongoing improvement.

But to get this far, you must first submit a persuasive offer to the selection board.

On November 28, 2015 from noon to 4 pm, IAPTI's UK chapter is hosting a workshop in Manchester (UK) to inform you of what it takes to tender and win at Europe's highest public level for translation. Profit from this important business event at yet another iconic venue! Registration information is available here.

The beautiful Manchester Central Library, venue for the EC tender workshop!

*******

The speaker: Monica Garcia-Soriano started her EU career as a lawyer linguist 24 years ago at the Court of Justice in Luxembourg. She later joined the Spanish Translation Unit at the European Commission in Brussels and for the last 8 years she has been in charge of procurement at the Commission's External Translation Unit.

Oct 15, 2015

The Invisible Hand of memoQ LiveDocs - making "broken" corpora work

Last month I published a post describing the "rules" for document visibility in the list of documents for a memoQ LiveDocs corpus. Further study has revealed that this is only part of the real story and is somewhat misleading.

I (wrongly) assumed that, in a LiveDocs corpus, if a document was visible in the list its content was available in concordance searches or the Translation Results pane, and if it was not shown in the list of documents for the corpus in the project, its content would not be available in the concordance or Translation Results pane. Both assumptions proved wrong in particular cases.

In the most recent versions of memoQ, for corpora created and indexed in those versions, all documents in a corpus shown in the list will be available in the concordance search and the Translation Results pane as expected. And the rules for what is currently shown in the list are described accurately in my previous post on this topic. However,

if there are documents in the corpus which share the same main language (as EN-US and EN-UK both share the main language, English) but are not shown in the list, these will still be used for matching in the memoQ Concordance and Translation Results and
if the corpus was created in an older version of memoQ (such as memoQ 2013R2), documents shown in the list of a corpus may in fact not show up in a Concordance search or in the Translation Results.

This second behavior - documents shown in the list but their content not appearing in searches - has been described to me recently by several people, but it could not be reproduced at first, so I thought they must be mistaken, and statements that "sometimes it works and sometimes it doesn't" made these pronouncements seem even more suspect. Except that they happen to be true and I now (sort of) understand why.

Prior to publishing my post to describe the rules governing the display of documents for a LiveDocs corpus in a project, I had been part of a somewhat confusing discussion with one of my favorite Kilgray experts, who mentioned monolingual "stub" documents a number of times as a possible solution to content availability in a corpus, but when I tried to test his suggestion and saw that the list of documents on display in the corpus had not expanded to include content I knew was there, I thought he was wrong. But actually, he was right; we were talking about two different things - visibility of a document versus availability of its content.

For purposes of this discussion, a stub document is a small file with content of no importance, added only to create the desired behavior in memoQ LiveDocs. It might be a little text file - "stubby.txt" - with any nonsense in it.

I went back to my test projects and corpora used to prepare the last article and found that in fact for the main languages in a project all the content was available from the corpora, regardless of whether the relevant documents were displayed in the list. In the case of a corpus not offered in the list for a project because of sublanguage mismatches in the source and target, adding a stub document with either a generic setting (DE, EN, PT, etc.) or sublanguage-specific setting for the source language or the correct sublanguage setting for the target (DE-CH, EN-US, etc.) made all the corpus content for the main languages available instantly. (In the project, documents added will have the project language settings; use the Resource Console for any other language settings you want.)

Content of a test corpus before a stub document was added. Viewed in the Resource Console.

The test corpus with the document list shown in my project; only the stub document is displayed, but all the indexed content shown above is also available in the Concordance and Translation Results.

It is unfortunate that in the current versions of memoQ the document list for a corpus in a project may not correspond to its actual content for the main languages. Not only does this preclude accessing a document's content without a match or a search, it also means that binary documents (such as one of the PDF files shown in the list) cannot be opened from within the project. I hope this bug will be fixed soon.

Since a few of my friends, colleagues and clients were concerned about odd behavior involving older corpora, I decided to have a look at those as well. Kilgray Support had made a general recommendation of rebuilding these corpora or had at least suggested that problems might occur, so I was expecting something.

And I found it. Test corpora created in the older version of memoQ (2013 R2) behaved in a way similar to my tests with memoQ 2015 - although the "display rules" for documents in the list differed as I described in my previous blog post, the content of "hidden" documents was available in Concordance searches and in the Translation Results pane. But....

When I accessed these corpora created in memoQ 2013 R2 using memoQ 2015, even if I could see documents (for example, a monolingual source document with a generic setting), the content was available in neither the Concordance nor the Translation Results until I added an appropriate stub document under memoQ 2015. Then suddenly the index worked under memoQ 2015 and I could access all the content, regardless of whether the documents were displayed in the list. If I deleted the stub document, the content became inaccessible again.

So what should we do to make sure that all the content of our memoQ corpora is available for searches in the Concordance or matches in the Translation Results?

If you always work out of the same main source language (which in my case would be German or "DE", regardless of whether the variant is from Germany, Austria or Switzerland), then add a generic language stub document for your source language to all corpora - old and new - under memoQ 2015 and all will be well.

If your corpora will be used bidirectionally, then add a generic stub for both the source and target to those corpora or add a "bilingual stub" with generic settings for both languages. This will ensure that the content remains available if you want to use the corpora later in projects with the source and target reversed.

Although it's hard to understand the principles governing what is displayed, when and why, following the advice in the two preceding paragraphs will at least eliminate the problem of content not being available for pretranslation, concordance searches and translation grid matches. And the mystery of inconsistent behavior for older corpora appears to be solved. The cases where these older corpora have "worked" - i.e. their content has been accessible in the Concordance, etc. - are cases where new documents were added to them under recent versions of memoQ. If you just keep adding to your corpora, doing so particularly from a project with generic language settings, you'll not have to bother with stub documents and your content will be accessible.

And if Kilgray deals with that list bug so we actually see all the documents in a corpus which share the main languages of a project, including the binary ones, then I think the confusion among users will be reduced considerably.

Oct 9, 2015

Words in music: what vocabulary and languages tell us about leading musicians

Corpus linguistics has been a passion of mine since an article published by three colleagues about a decade ago showed me the possibilities of "mining" collections of text in a subject area to determine its critical vocabulary, what sorts of words belong together in specialist language, etc. The public webinar offered tomorrow by the International Association of Professional Translators and Interpreters (IAPTI) is a new take on this familiar subject, exploring the use of vocabulary by popular musicians and relating these to things like mastery of multiple languages or commercial success. A familiar subject, sort of, but nonetheless something completely different.

It is this kind of reimagination of things we "already" know which can open our minds to new possibilities of many kinds which are available now, but which nobody expects and therefore nobody sees. So I will enjoy hearing what Mr. Jewalikar has to say at 3 pm UTC tomorrow, October 10th and look forward to the new ideas this may stir up in my head. And if you would like to join us in the presentation, you can register by sending a short e-mail request to info.request@iapti.org; there is no charge to participate.

This event is the sort of professional information with fresh perspectives and a clear focus on the needs and interests of individual professionals - not the linguistic sausagemakers and exploiters - that I have come to expect from this organization. While others sell out to the commercial interests of the bulk market bog to turn language professionals and aspiring professionals into "input" for their post-edited machine sausage of words, IAPTI and a very few other groups keep the focus on stimulating content of real professional worth.

Track changes in memoQ: misunderstandings and navigation

Although tracked changes have been part of memoQ since the distant days of memoQ 5.0, many users are still confused about how to use these features and how to navigate marked changes in a translation.

The confusion starts with the menu for activating the tracked changes, which in recent versions of memoQ is found on the Review ribbon. What most people do not realize is that the first two options - Against Last Received Version and Against Last Delivered Version - are not relevant to the usual workflows of an individual translator working in a local project created on his or her computer. Often I have caught myself selecting the option Against Last Delivered Version to show the tracked changes, because I want to compare against the last version I delivered to my client by exporting and e-mailing the document, forgetting that this option refers to the actual Deliver function in a server project.

If I am working locally in my own projects, the only track changes option that is relevant is Custom, with which I can show comparisons to specific minor versions:

In the present example, I've selected a comparison with a "snapshot" I made before an editing session. A snapshot creates a record of the status of a translation at a given time and makes rollbacks possible. Use the submenu of the Versions icon on the Documents ribbon to make a snapshot of your translation:

Once the tracking of changes for the current translation compared to a previous minor version has been activated, the relevant changes will be marked in red in the translation grid. If changes have been made to the source text (correcting OCR errors, for example, by editing the source text with F2), these will be shown as well.

Changes can be rejected by choosing Revert To Earlier Version on the Review ribbon, in the context menu (right-click) or with the corresponding keyboard shortcut. Or a version of a target text not shown in the markup can be recalled with the Row History and restored by copying it from the dialog (Ctrl+C) and pasting in the target cell and editing out extraneous information.

But how can one navigate many tracked changes in a larger document? Many users think this is not possible, though in fact it's rather simple with memoQ's filtering features.

Clicking the filter icon above the target text column opens a dialog to specify filter criteria for the working view. On the Status tab under Other properties... the option Change tracked can be selected to show only those segments with tracked changes.

Alternatively, the Go to next segment settings (Shift+Ctrl+G) can be configured in the same way with Change tracked on the Status tab, so choosing Go to next (Ctrl+G) or confirming a segment (if the option Automatically jump after confirmation is selected in the Go to next segment settings dialog) will take you to the next segment with tracked changes.

Sep 16, 2015

Getting around language variant issues in memoQ LiveDocs

I was told by some other users that a fundamental change had been made in the way language data are accessed in LiveDocs. It was said that until a few versions ago it had been possible to use documents for reference in LiveDocs regardless of their sublanguage settings. So I was told. The truth is more complicated than that.

According to my tests, memoQ 2015 is the first version of memoQ to have a logically consistent treatment of language variants for both bilingual and monolingual documents in corpora. All the other versions tested (memoQ 2013R2, 2014, 2014R2) are equally screwed up and show the same results.

The "visibility" of a monolingual or bilingual document when viewed in a corpus attached to a project running under memoQ 2015 follows these rules:

the sublanguage (language variant) settings of the document for source and target must match those of the project
or the language setting (of the document or the project) must be generic.

Two rules. Pretty simple. It doesn't matter what version of memoQ the project or corpus was created in, only which version is actively running.
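For those who prefer to see rules as logic, here is a minimal Python sketch of how I read the two rules. It is my interpretation of the observed behavior, not Kilgray's code; the function names and the convention of writing language codes as "DE" (generic) or "DE-CH" (variant) are assumptions made for the illustration, and a monolingual document is modeled by passing None for the missing side.

```python
from typing import Optional, Tuple


def split_language(code: str) -> Tuple[str, Optional[str]]:
    """Split 'DE-CH' into ('DE', 'CH'); a generic 'DE' gives ('DE', None)."""
    main, _, variant = code.upper().partition("-")
    return main, variant or None


def side_matches(doc_lang: Optional[str], project_lang: str) -> bool:
    """Check one side (source or target) of a document against the project."""
    if doc_lang is None:
        # Monolingual document: this side does not exist, nothing to compare.
        return True
    doc_main, doc_variant = split_language(doc_lang)
    proj_main, proj_variant = split_language(project_lang)
    if doc_main != proj_main:
        return False
    # Rule 2: a generic setting on either side is accepted.
    # Rule 1: otherwise the variants must be identical.
    return doc_variant is None or proj_variant is None or doc_variant == proj_variant


def is_listed(doc_source: Optional[str], doc_target: Optional[str],
              project_source: str, project_target: str) -> bool:
    """Predict whether a corpus document is shown in the project's list."""
    return (side_matches(doc_source, project_source)
            and side_matches(doc_target, project_target))
```

Under this reading, is_listed("DE-DE", "EN-US", "DE-CH", "EN") is False because the source variants differ, while the same document in a project with generic German and generic English would be listed.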

I created a test corpus with the following document mix:

The corpus contained 11 documents, both bilingual and monolingual with a mix of generic language settings and settings with language variants specified (such as German for Germany, Switzerland and Liechtenstein and English for Zimbabwe, the US and UK).

In a project running under memoQ 2015 with the languages set to generic German and generic English, all 11 documents in the corpus were accessible.

So if you want access to all LiveDocs corpus data for the major languages of your project, it is necessary to use generic language settings, either when you load the data into LiveDocs (difficult unless you always use the resource console, since adding documents to a corpus from within a project automatically applies the project's language settings!) or in the languages specified for the project itself. And this will only work with memoQ 2015. If you want to apply penalties to particular language variants this can be done using keyword markers (as seen in the screenshot above) and configuring the More penalties tab of the LiveDocs settings file applied to that corpus.

If the same corpus is attached to a project running under memoQ 2015 with language settings for Swiss German and generic English, the documents available from the corpus are these:

For a Swiss German and UK English project under memoQ 2015, this is the picture:

And for a Germany German and US English project:

All the screenshots above can be predicted based on the two rules stated. Work it out.

"But what happens with earlier versions of memoQ?" you might wonder. It's messy. Here is a look at a Swiss German and UK English project under memoQ 2013 R2, 2014 and 2014 R2:

And here's a project with generic German and generic English under memoQ 2013 R2, 2014 and 2014 R2:

In each case the five bilingual documents are visible no matter what the project's language settings are. However, there is strict adherence to language variants and the generic language setting for monolingual documents! In my opinion, that's for the birds. I see no good reason to follow a different rule for data availability in bilingual versus monolingual documents. So in a sense, Kilgray has cleaned up this inconsistency in the latest version of memoQ.

Some have expressed a desire for a "switch" setting to allow language variant settings to be ignored. And perhaps Kilgray will provide such a feature in the future. But the best way to get there now is simply to make your project's language settings generic.

Changing the language settings for bilingual data in an existing LiveDocs corpus

If you have a corpus with a mix of language settings and you want to convert these to generic settings or a particular variant, this can currently be done only for bilingual documents, as follows:

Select the bilingual documents to export from the corpus and export them to a folder. (If you choose to zip them all together, unpack the *.zip file later to make a folder of the exported *.mqxlz files.)
Re-import the *.mqxlz files to the LiveDocs corpus via the Resource Console so you are able to specify the exact language settings you want. In the import dialog, you'll have to change the filter setting manually from "binary" to "XLIFF". These *.mqxlz files are not the same as bilingual files from a translation document in a project and are not recognized automatically.

Unfortunately, there is no way to change the language settings of a monolingual document except to re-import it in the Resource Console in its original form and set the language variant (or generic value) there.

So really, for now, the best way to go seems to be to use memoQ 2015 with generic project language settings.

Sep 15, 2015

A quick trip to LiveDocs for EUR-Lex bilingual texts

Quite a number of friends and respected colleagues use EUR-Lex as a reference source for EU legislation. Being generally sensible people, some of them have backed away from the overfull slopbucket of bulk DGT data and built more selective corpora of the legislation which they actually need for their work.

However, the issue of how to get the data into a usable form with a minimum of effort has caused no little trouble at times. The various texts can be copied out or downloaded in the languages of interest and aligned, but depending on the quality of the alignment tool, the results are often unsatisfactory. I've been told that AlignFactory does a better job than most, but then the question of how best to deal with the HTML bitexts from AlignFactory remains.

memoQ LiveDocs is of course rather helpful for quick and sometimes dirty alignment, but if the synchronization of the texts is too many segments off, it is sometimes difficult to find the information one needs even when the (bilingual) document is opened from the context menu in a concordance window.

EUR-Lex offers bi- or tri-lingual views of most documents in a web page. The alignments are often imperfect, but the synchronization is usually off by only one or two segments, so finding the right text in a document's context is not terribly difficult. So these often imperfect alignments are usually quite adequate for use as references in a memoQ LiveDocs corpus. Here is a procedure one might follow to get the EUR-Lex data there.

The bilingual text of a view such as the one above can be selected by dragging the cursor to select the first part of the information, then scrolling to the bottom of the window and Shift+clicking to select all the text in both columns:

Copy this text, then paste it into Excel:

Then import the Excel file as a file for "translation" in a memoQ project with the right language settings. Because of quirks with data access in LiveDocs if the target language variants are specified and possibly not matched, I have created a "data conversion project" with generic language settings (DE + EN in my case as opposed to my usual DE-DE + EN-US project settings) to ensure that data stored in LiveDocs will be accessed without trouble from any project. (This irritating issue of language variants in LiveDocs was introduced a few versions ago by Kilgray in an attempt to placate some large agencies, but it has caused enormous headaches for professional translators who work with multiple sublanguage settings. We hope that urgent attention will be given to this problem soon, and until then, keep your LiveDocs language data settings generic to ensure trouble-free data access!)

When the Excel file is added to the Translations file list, there are two important changes to make in the import options. First, the filter must be changed from Microsoft Excel to "multilingual delimited text" (which also handles multilingual Excel files!). Second, the filter configuration must be "changed" to specify which data is in the columns of interest.

The screenshot above shows the import settings that were appropriate for the data I copied from EUR-Lex. Your settings will likely differ, but in each case the values need to be checked or set in the fields near the arrows ("Source language" particularly at the top and the three dropdown menus by the second arrow below).

Once the data are imported, some adjustments can be made by splitting or joining segments, but I don't think the effort is generally worth it, because in the cases I have seen, data are not far out of sync if they are mismatched, and the synchronization is usually corrected after a short interval.

In the Translations list of the Project home, the bilingual text can be selected and added to a LiveDocs corpus using the menus or ribbons.

The screenshot below shows the worst location of badly synchronized data in the text I copied here:

This minor dislocation does not pose a significant barrier to finding the information I might need to read and understand when using this judgment as a reference. The document context is available from the context menu in the memoQ Concordance as well as the context menu of the entry appearing in the Translation results pane.

A similar data migration procedure can be implemented for most bilingual tables in HTML files, word processing files or other data sources by copying the data into Excel and using the multilingual delimited text filter.
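Where the bilingual table already exists as data rather than as text on a screen, the detour through Excel can probably be skipped by writing a delimited text file directly. The following Python sketch shows one way to do that. The function name, the "DE"/"EN" column headers and the sample row are assumptions made up for illustration, and whether your memoQ version's multilingual delimited text filter accepts exactly this layout is something to verify in the filter configuration.

```python
import csv
import io

def write_bilingual_tsv(pairs, stream, source_header="DE", target_header="EN"):
    """Write (source, target) pairs as tab-delimited rows below a header line."""
    writer = csv.writer(stream, delimiter="\t", lineterminator="\n")
    writer.writerow([source_header, target_header])
    for source, target in pairs:
        # Collapse line breaks and repeated spaces so each pair stays on one row.
        writer.writerow([" ".join(source.split()), " ".join(target.split())])

# Hypothetical sample data; a real script would write to a UTF-8 text file instead.
buffer = io.StringIO()
write_bilingual_tsv([("Urteil des Gerichtshofs", "Judgment of the Court")], buffer)
print(buffer.getvalue())
```

The column headers are only labels here; the actual language assignment still happens in the import settings of the filter.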

Sep 14, 2015

French-English naval defense glossary published

Australian colleague Steve Dyson has published a new French-English glossary of terms for naval defense for technical journalists and translators, available in ePub format on Lulu.com. His many decades of experience as a translator and technical writer associated with French defense industries cannot be transplanted into the brains that might need it, but this distillation is a valuable resource nonetheless.

Aug 19, 2015

Doing the deed with the DGT

Several years ago a legal translator in my circles began to use memoQ for her work, and I was asked to help with the migration of data from her old environment. When she was introduced to memoQ LiveDocs, she was delighted to learn that she was able to view the original document text or bitext of concordance hits for content saved in a LiveDocs corpus.

Because her work involved a lot of references to EU directives and other information sources from the EU, the parallel corpora from the DGT had great value to her work. These are enormous bodies of data, totaling several million translation units and growing constantly. Many translators in the EU use this data, but the sheer bulk of it tends to be burdensome to many translation environments, and the lack of context often limits the value of information retrieved from these corpora when stored in translation memories.

So she decided that LiveDocs was the medium in which the DGT data were to be stored, and because the DGT translation memories contain their data in sequential document order, the document context of any concordance hits can be viewed using the context menu in the memoQ concordance:

Thanks to the expansion of file types which can be included in LiveDocs since that time, it is easier than ever to import data from parallel corpora like the EU DGT and use these to support translation work. Using the LiveDocs approach, the extraction of a single large bilingual TMX file from the many zipped data collections is also completely unnecessary (in fact, the extreme quantity of data in those single files inevitably causes memory problems). To build reference corpora for concordancing or the construction of predictive typing resources such as Muses in memoQ, it is simply necessary to unpack the individual zip files into folders full of small TMX files and then import these folder structures into memoQ:

Include only TMX files in the LiveDocs corpus import:

Selecting the desired languages extracts the bilingual data from the individual TMX files, which contain data in all the official EU languages. If a particular file does not contain the desired pairing a corresponding message will be displayed. Don't worry about it.
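To make clear what this language selection does conceptually, here is a minimal Python sketch that pulls one language pair out of a multilingual TMX document and keeps the file order. It is not how memoQ does it internally; the function name and the tiny sample with its language codes are assumptions for illustration only.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under this qualified name.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

def extract_pairs(tmx_data, source_lang, target_lang):
    """Return (source, target) segment pairs, in file order, from TMX data."""
    root = ET.fromstring(tmx_data)
    pairs = []
    for unit in root.iter("tu"):
        segments = {}
        for variant in unit.findall("tuv"):
            # TMX 1.4 uses xml:lang; older files may use a plain "lang" attribute.
            lang = (variant.get(XML_LANG) or variant.get("lang") or "").lower()
            seg = variant.find("seg")
            if seg is not None:
                segments[lang] = "".join(seg.itertext())
        source = segments.get(source_lang.lower())
        target = segments.get(target_lang.lower())
        # Units lacking either language are skipped, much like the import message.
        if source is not None and target is not None:
            pairs.append((source, target))
    return pairs

# Hypothetical three-language sample; real files are far larger.
sample = """<tmx version="1.4"><header/><body>
<tu>
  <tuv xml:lang="DE-DE"><seg>Guten Tag</seg></tuv>
  <tuv xml:lang="EN-GB"><seg>Good day</seg></tuv>
  <tuv xml:lang="FR-FR"><seg>Bonjour</seg></tuv>
</tu>
<tu>
  <tuv xml:lang="FR-FR"><seg>Merci</seg></tuv>
</tu>
</body></tmx>"""

print(extract_pairs(sample, "DE-DE", "EN-GB"))  # [('Guten Tag', 'Good day')]
```

Because the pairs come out in file order, the document context survives whenever the translation units were written sequentially.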

This approach of loading smaller TMX files into LiveDocs overcomes the memory problems which may occur with gigantic files. And once these smaller files are in a LiveDocs corpus, they can be selected en masse and exported to one or more translation memories.
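Unpacking dozens of zip archives by hand gets tedious, so the unpacking step can be scripted. This is a sketch under assumptions: the function name and folder arguments are hypothetical, and it simply gives each archive its own subfolder and ignores anything that is not a TMX file.

```python
import zipfile
from pathlib import Path

def unpack_tmx_archives(archive_dir, output_dir):
    """Extract the .tmx members of every .zip in archive_dir into per-archive folders."""
    output_root = Path(output_dir)
    extracted = []
    for archive_path in sorted(Path(archive_dir).glob("*.zip")):
        # One subfolder per archive keeps the small files grouped for import.
        target_folder = output_root / archive_path.stem
        target_folder.mkdir(parents=True, exist_ok=True)
        with zipfile.ZipFile(archive_path) as archive:
            for member in archive.namelist():
                if member.lower().endswith(".tmx"):
                    archive.extract(member, target_folder)
                    extracted.append(target_folder / member)
    return extracted
```

Calling `unpack_tmx_archives("downloads", "dgt_tmx")` (both folder names are placeholders) would leave a folder structure that can be pointed at in the LiveDocs import dialog.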

In fact, this approach is useful to get around the current inability of memoQ translation memories to import more than one TMX file directly at a time. This may be helpful, for example, to OmegaT users who want to migrate their many TMX translation memories (one from each project!) if they start using memoQ.

Aug 18, 2015

Enter the Dragon, Anywhere!

Today Nuance made a presentation of a new product to be released this autumn for mobile devices using iOS and Android operating systems: Dragon Anywhere.

The mobile app will allow secure transcription with a WiFi or cellular data connection as well as synchronization of custom vocabulary with Dragon desktop computer applications for Windows and MacOS.

The initial presentation made no mention of which languages will be available in Dragon Anywhere; the synchronization feature makes me worry that it might be restricted to the current seven or eight languages available for Dragon NaturallySpeaking (Windows) or Dragon Dictate (Mac), but perhaps the standalone applications for desktop computers and laptops will finally be upgraded to offer the 40+ languages currently available for mobile devices with apps such as Dragon Dictation for iOS or Swype + Dragon Dictation for Android.

It is also unclear at this point whether the new Dragon Anywhere app will allow direct dictation or transfer to the cursor location on a linked computer, as one can do with myEcho or using Swype + Dragon Dictation in conjunction with Chrome Remote Desktop. But with the addition of extensive voice-controlled editing features to the mobile app, Dragon Anywhere represents significant progress toward better ergonomics for writing and translation!

Aug 12, 2015

"Upgrading" translation memories for document context

Translation memories are in a way the Zombie Apocalypse of our profession. Dead data walking, with rotting bits falling off and lying out of context in a concordance for the reader to puzzle over, wondering where that bit of wordflesh once fit in the whole.

How often in my Dark Days as a Trados User did I look at some concordance hit and wonder just how stoned the translator was, corrected the "mistake" and discovered later it was quite right in context? Context matters, but with translation memories it's like the Invisible Hand at best, with a few missing fingers.

I once was lost, but now I found memoQ LiveDocs, and I

export my context-free or -invisible TMs to TMX, then
import the TMX to LiveDocs

where I now can go directly from a concordance hit to the translation unit with LiveDocs, which can be read in document context if that chunk of text was in fact written to the TM in the sequential order of the document, which is often the case.

Aug 11, 2015

memoQuickie: "Partial deliveries" from the preview

Earlier today a friend asked how she might deliver, in a word processing file, just a table from a larger report she was translating in memoQ. My quick answer was to use the recent export functions in memoQ which enable one to export partially completed work, then open the file and delete everything but the table.

But then I thought... there must be a better way. So I built a test file in Microsoft Word:

And then I imported this into memoQ, copied the text from the preview and pasted it back into Microsoft Word.

Content copied from the memoQ preview: the red box is placed at the currently selected segment.

The Excel object copied from the preview as a bitmap graphic, but the ordinary table and the other text were normal and editable. This would often suffice to send some small section of a translation quickly, and this could even be used with the monolingual review feature to update the translation with suggested changes.

Aug 6, 2015

"New Beginnings with memoQ" tutorial guide released

The initial version of my new tutorial guide for memoQ has finally been made available through the distribution site I use for convenient promotional distribution. Those interested in this 101-page PDF e-book can see more details or purchase it by clicking on the cover graphic above. A table of contents for this edition can be downloaded here. Many of the discount codes distributed by myself or Kilgray in the past year or two can be applied to anything on that site, including this volume.

The book is organized in three sections. The front part gives a quick guide to navigation in the old and new memoQ interfaces (because even those who have purchased memoQ recently may have to come to grips with older versions to do some server projects for customers using older releases). Then a series of four "job" chapters introduce common, fundamental concepts for basic work and safe data management in the memoQ translation environment. The last section, an appendix called "This & that" covers about a third of the total page count with a great variety of supplemental information to help new and experienced users come to grips with many challenges for working efficiently with this translation environment.

The appendix includes a glossary of terms to "de-nerdify" things a bit as well as an overview of some of the most important or useful features introduced for individual translators in recent versions. I added the latter because I often advise people about which versions to upgrade to for particular features, and with the rapid development of memoQ this is not easy to keep track of!

Until very recently I never considered doing instructional materials specifically focused on new users and their needs because of the large number of people offering beginners' instruction. However, as my conceptual collaborator in this guide and I observed, the feature-focused approach used by many instructors often confuses and intimidates users and sometimes fails to address basic matters needed for their routine, productive work.

The approach here focuses on workflows and possibilities for application rather than features. And because of the challenges many experienced users of memoQ face adapting to the recent changes in the interface and the many new possibilities for the memoQ translation environment, I called the book "New Beginnings" because it offers everyone a chance for a fresh start for better productivity.

Jul 22, 2015

Setting memoQ to run as a 64-bit application

It has been so long now since 64-bit versions of the Windows operating system were introduced, allowing access to more RAM and adding other efficiencies to the systems we use. So imagine my surprise today when, after a recent memoQ build upgrade, all Hell broke loose in the midst of a time-critical project, with crashes every five or ten minutes

and interesting views of the working translation grid like this one:

Of course all this happened as I was doing a time-critical translation, dictating at top speed to meet a tight deadline for a last-minute legal matter.

So I sent a quick cry for help to support@kilgray.com, and in a short while I got a question back, asking whether I was running in 32-bit or 64-bit mode. "What a silly question," I thought. "64-bit, of course!" Wrong, actually.

I don't know why, but it seems that memoQ upgrades are currently configured to run in 32-bit mode. And depending on what kind of work you are doing, that can have unfortunate results. Working with my version of Dragon NaturallySpeaking today it was an absolute disaster that caused me to lose half my working time and sweat buckets right up to the delivery.

There are two executable files for memoQ in the program folder: MemoQ32.exe and MemoQ.exe, the former obviously for 32-bit systems and the latter not-so-obviously for the more typical 64-bit systems for our times.

I checked the properties of the shortcut I use to launch memoQ

and 'shore 'nuff

there was the problem. So I edited the "target" field, which tells the shortcut what to run:

and lived happily ever after for the rest of the afternoon. The worst problems I have seen with a CAT tool for a decade were resolved.

"But how do I know if I have a 32-bit or a 64-bit Windows operating system," you might ask. Good question. Most translators answer this by disemboweling a black chicken and examining its entrails, but being an advocate of better treatment for animals, particularly oppressed chickens, I researched a better way.

The answer can also be found in the Windows Control Panel in the System control. The screenshot from my Windows 7 Ultimate version is shown above; other versions may have a slightly different display, but the System control is where you should look in any case.
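For those who would rather ask the computer directly than click through the Control Panel, a few lines of Python can report the machine architecture. This is only a sketch: the helper name is made up for this example, the list of architecture names is an assumption which may be incomplete, and the System control remains the authoritative place to look.

```python
import platform

def describes_64bit(machine):
    """Return True if a platform.machine() string names a known 64-bit architecture."""
    # Assumed list of common 64-bit architecture names; not exhaustive.
    return machine.lower() in {"amd64", "x86_64", "arm64", "aarch64", "ia64"}

machine = platform.machine()
verdict = "64-bit" if describes_64bit(machine) else "not recognized as 64-bit"
print(machine, "->", verdict)
```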

*******

P.S. - In my correspondence with Kilgray Support, the engineer helping me had suggested a one-off start of the memoQ.exe application (64-bit memoQ) from the program folder. I thought this was a bit of a hassle, and I questioned why he would propose such a thing and why the 32-bit application would be a default now. I received this response:

" ... this is by design: by default memoQ will start in 32-bit mode to ensure better startup and better user interface performance. But you can start memoQ 2014 R2 or memoQ 2015 in 64-bit mode, which is useful for heavy operations which requires several gigabytes of RAM."

All very well and good... those who do mostly lightweight stuff can go for the default and perhaps have a special shortcut for the "heavy lifting" memoQ version. The problem I see is that most users are not going to know what sucks up a lot of system capacity and may require the 64-bit application.

I suspect that large TMs or other big resources, things like predictive typing (commonly in use) and certainly the use of speech recognition software are going to cause a crunch. I am unaware of any publicly accessible discussion to date which makes clear where the line is crossed that requires more memory access. Most of the professionals I know tend to use memoQ in a rather resource-intensive way, and I am unclear what the recommendations should be here.

P.S. #2 - Kilgray responds to questions on 64-bit versus 32-bit on the Yahoogroups forum: https://groups.yahoo.com/neo/groups/memoQ/conversations/topics/41676. Due to access difficulties with Yahoo's sucky Neo interface, some of the message thread is included here:

Dear All,
I'll try my best to clarify how memoQ relates to 32-bit and 64-bit.
1. Use 64-bit if you expect "heavy" use
memoQ is a "hybrid" application that was designed (at least from maybe version 6) to run in 64-bit mode on a 64-bit system and 32-bit on 32-bit systems. The support for 64-bit mode was added to avoid out of memory issues with "heavy" use, like importing very large documents. (I clearly remember that one of the use cases was importing large FrameMaker documents. Most probably the other most significant use case was memoQ server in general, which, depending on how it is used by how many people, may have to do many things at once and load an enormous amount of resources.)
A 32-bit application is limited to 2 GB of memory regardless of how much memory is available on the system, while a 64-bit application is practically unlimited (technically there is a limit, but it is enormous). One could argue that 2 GB of memory use is not likely in memoQ anyway, but actually there can be short spikes, during document import or when loading a translation memory, for example. There are also further details that complicate this, like memory fragmentation, which are frankly too technical for me, but, as far as I understand, they mean that an "out of memory" error is still possible when there seems to be enough memory overall, but not enough "continuous" memory for a specific operation.
These kinds of issues are rare in 64-bit operation and more common on 32-bit operation. This means that with 64-bit memoQ you are less likely to get out of memory errors. Use 64-bit memoQ if you expect "heavy" usage or if you have run into out of memory errors. If you have, the error message displayed by memoQ most probably already advised you to go 64-bit.
2. memoQ in 32-bit is more snappy
Most modules of memoQ are based on the .NET framework and are written in "managed code" that runs under the Common Language Runtime of the .NET Framework. Sorry if this is too technical, but the point is that memoQ itself is (mostly) intermediary code that must be "translated" (compiled) to code that a computer can actually execute, right when the program modules are loaded to memory. The piece of software that does this compilation is called the Just In Time (JIT) compiler. It is the JIT compiler that creates either 64-bit or 32-bit machine code, as required.
The JIT compiler in recent versions of the .NET Framework simply performs way worse when creating 64-bit code than 32-bit code. (Most probably the compilation itself is slower on 64-bit than 32-bit, or it is possible that there are performance issues with the resulting "machine code" itself.) This is a widely known problem, and Kilgray has done elaborate research to confirm that this is why memoQ starts up slower (by approximately 100%) in 64-bit mode than 32-bit mode. It is also likely that some of the interface is also slower in 64-bit than 32-bit. We are also convinced that this was a very significant issue to the user base, and has led to unfavourable comparisons to the competition (which I think is at a dubious advantage here in the sense that they don't offer a 64-bit mode, as far as I know). The good news is that Microsoft has been working on the problem and they have created a brand new JIT compiler for version 4.6 of the .NET Framework. So whenever memoQ gets upgraded to .NET 4.6, 64-bit will see a performance boost.
It is a myth that 64-bit most always perform better, and in the case of .NET (before version 4.6), clearly the opposite is true, at least in terms of loading and "JIT compiling" program modules. This can heavily affect startup times, for example.
Best regards, Gergely

I left out the part that is (hopefully) obvious by now to many of you, and something I have explained here before, maybe twice or more:
As an answer to the performance (especially startup speed) concerns, in recent builds of memoQ, we turned off the automatic "hybrid" behaviour that detected the 32 or 64 bitness of the system, and created a separate 32-bit executable (and a start menu/screen shortcut) to start memoQ with. With this you can force memoQ to run in 32-bit mode. This is also the default in the sense that if you just click "memoQ" in your start menu/screen, you start the "forced" 32-bit mode. There is also a "memoQ (x64)" shortcut you can click to enable 64-bit mode on 64-bit systems.
Also, another piece of advice: for users with beefy systems the performance degradation caused by the 64-bit JIT compiler may be completely irrelevant, as memoQ may start up for you in maybe four seconds instead of two. If you have such a machine, 64-bit might be a very good choice. And you get some protection from "out of memory" issues.
Another piece of advice: please don't switch to 64-bit mode if you do not have a beefy system and you are not having any issues with memory use. This might just slow your copy of memoQ down with no benefit in your specific case. Switch to 64-bit if you know you are affected by "out of memory" issues, or you expect some heavy use (very large resources or translatable files, etc).
Gergely

Message 3 of 3 , Today at 12:49 PM
"It is a myth that 64-bit most always perform better, and in the case of .NET (before version 4.6), clearly the opposite is true"
Really glad that, despite this, Kilgray kept the 64-bit support. The 32-bit version caused out of memory errors for most of our translators after ~15-30 minutes of use, causing a lot of confusion after 2015 hit, until we found the 64bit version :D
wolfschoe
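For anyone wondering where the 2 GB figure in the thread above comes from, the arithmetic is short. The sketch below assumes the default arrangement on Windows, in which a 32-bit process gets half of its 4 GB address space for its own use; the variable names are merely illustrative.

```python
GIB = 2 ** 30  # bytes in one gigabyte (binary)

# Everything a 32-bit pointer can address.
address_space_32bit = 2 ** 32

# By default Windows reserves half of that for the system,
# leaving the other half to the application itself.
user_space_default = address_space_32bit // 2

print(address_space_32bit / GIB)  # 4.0
print(user_space_default / GIB)   # 2.0
```

That ceiling applies no matter how much RAM is installed, which is why short memory spikes can cause trouble in 32-bit mode even on a well-equipped machine.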

Jun 14, 2015

Introduction to memoQ in Lisbon / Introdução ao memoQ em Lisboa

From July 6th to the 11th, the summer school of the New University in Lisbon will offer a week-long introductory course for memoQ 2015, which includes:

The functional modules of the memoQ translation environment and how these work together;

Common workflows for translation and editing tasks;

Making use of legacy translations and data from other environments;

Collaboration with users of other translation environments (SDL Trados Studio, OmegaT, etc.);

Tips for problem-solving and added value for translation customers.

The course will be taught by me and Professor David Hardisty, with whom I have spent most of this year so far exploring innovative speech recognition and editing workflows, which will also be an important part of this course. It's a pleasure to work with David, because not only does he have a strong commitment to the success of his students, but he has a marvelous talent for taking my concepts and recasting them in a way that works really, really well for undergraduate and graduate students at all levels.

The course is open to anyone (limited enrollment, 16 persons I think) and offers 24 hours of instruction in the week. It will be taught in English with summaries in Portuguese.

A description of the course in English and Portuguese is here. I am not responsible for the errors in the English. Registration information (Portuguese only, alas) is on this page. Attendees who don't read Portuguese but manage to figure out how to register nonetheless will receive a special reward during the course.

During the week, the course is offered in the evenings, leaving the days free for work or local tourism. It is recommended that translators make a pilgrimage to the monastery named after the patron saint of translation in Belém. Miracles have occurred there, carpal tunnel syndrome has been healed and dead text has even come to life, but not even the intervention of St. Jerome can save a machine pseudo-translation. You might learn that trick from us, however. Or not.

On August 3rd a course on project management with memoQ will be offered on a similar plan.

Jun 10, 2015

The Great Translation Rate Conspiracy in France!

The French Society of Translators (SFT) is conducting another of its periodic rate surveys. Although the American ATA treads fearfully in matters of rates, deferring to corporates who prefer to keep everyone in the dark, afraid and focused on more important matters like the Kramer vs. Kramer spectacle of TransPerfect and the latest venture capital conquests of Dopeling's Stormin' CEO, European organizations occasionally pander to the masses like the sleazy socialists we all know them to be and do these surveys that injudiciously gather and publish real data from real translators in inconvenient contradiction to the usefully manufactured figures of the Common Nonsense Advisory which aforementioned corporates employ as their rightful negotiating bludgeon to keep 'em down in The Bulk Market Bog that is the world to all those who think that translation is mostly about proZtitution for the sake of Larry the Language Lizard at thepigturd and other industrial luminaries of word sausage.

The survey from those cheese-eating surrender monkeys includes 25 questions and is expected to take 20 valuable minutes of your time which might be better spent translating political propaganda praising Robert Mugabe for Translators Without Borders for free or registering to pay Lionbridge for the privilege of receiving lowball bulk market offers to churn words into sausage to feed their better bottoms' line.

The survey is open to anyone who can puzzle out its French, and those who foolishly believe in the value of free information in a free society for free-thinking translators to make informed decisions with a knowledge of current compensation statistics can support this leftist plot by clicking the link below and not selflessly sacrificing their futures for the good of their Big Bog betters:

http://sondage.sft.fr/index.php/765913/

Jun 7, 2015

The lightning passed.

"The difference between the right word
and the almost right word
is the difference between
lightning and a lightning bug."
- Mark Twain

RIP Chris Irwin, a friend and example to many

Known by some by his business pseudonym Textklick, Chris Irwin was an extraordinary man in his quiet way. His unexpected death recently has left many shocked and grieving for the loss of a friend and colleague whose wit, kindness and good judgment were usually understated but of a quality seldom surpassed. In over a decade, he enriched my life in ways I will remember to its end.

One of his qualities I appreciated most was his reluctance to take sides in a fight and his consideration that a human being is a complex construct, one we should judge with care. I do not, however, think anyone will decry a lack of care in judging him a good one and an example to follow in many ways, both professionally and personally. The many entries in his Kondolenzbuch attest to just some of the ways in which he will live for a very long time in the memories of those who knew, loved and respected him.

In the last week I have thought a lot about how, despite challenges he faced, he never forgot the importance of savoring life's vintage, even the sour notes, and he remained an active part of our professional community with his good wife through all the good times and those which were less so. Life is good. Enjoy it to the last drop. Even the lees.

May 31, 2015

Authoring and Editing with memoQ (webinar)

Last February I described my initial work with translation tools as environments for authoring and editing documents in a single language. Some people have been doing this quietly for a while; occasionally I would hear puzzled comments from a trainer who had held a class on SDL Trados Studio, OmegaT or memoQ which had been attended by a technical writer or someone with other professional writing interests not related to translation. But to my knowledge there has been no systematic approach to this.

Some weeks later I began to discuss and present some new possibilities for speech recognition in 38 languages which go well beyond the limitations of Dragon NaturallySpeaking for automated speech transcription in the eight languages for which it is available. These possibilities include a number of mobile solutions which are quickly gaining traction among translators and other professional writers.

On Tuesday, June 2nd (two days from now), I will be presenting a one-hour introduction to "MemoQ for Single-language Authoring and Editing" in the eCPD Webinar series. The registration page is here.

This presentation will be an update of the talk I gave earlier this year which discussed CAT tools in general as authoring and editing tools. Although any tool works in principle (and even a user of SDL Trados Studio, for example, can probably draw enough ideas from the upcoming eCPD talk to make good use of the approach), memoQ has some particular advantages, not the least due to its corpus-handling features in LiveDocs and its superior predictive typing facilities, including "Muses" (which are like SDL's AutoSuggest with more flexibility and without the onerously high data quantity requirements).

The presentation will include an overview of some of the latest advances in speech recognition in 38 languages for ergonomically superior writing by automated transcription as well as discussions of version management and dictation workflows which can be applied for greater ease in editing monolingual documents or even translations, including post-editing of machine pseudo-translation (PEMpT by the "Hardisty Method"). I've been fairly quiet on this blog in recent months due to conference organization and travels and the considerable time put in to researching improved work ergonomics for translation, writing and editing processes. (In fact I didn't even find time to blog the memoQ Day on April 22nd in Lisbon yet!) Elements of all these efforts, which have sparked no little interest at recent conferences and workshops I have presented at in Europe, will be part of Tuesday's talk, which will include Q&A afterward to explore the interests of those participating.

So if you are a translator involved in a lot of revision or editing work (bilingual or monolingual), a technical writer or other professional writing in a single language for publication, someone working on a thesis or authoring for other purposes, the eCPD presentation may help you to do this with better organized resources and greater efficiency. As one friend of mine who wrote a thesis just before I developed this approach put it, with this she would at least have been able to keep track of the feedback on her work from its five or so reviewers without going completely nuts.
