Showing posts with label tags. Show all posts

May 5, 2022

Understanding and mastering tags... with memoQ!

Everything you need to know... in 36 pages!

Following up on the success of his excellent guide to machine translation functions in memoQ, Marek Pawelec (Twitter: @wasaty) has now published his definitive guide to tag mastery in that translation environment. In a mere 36 pages of clearly written, engaging text, he has distilled more than a decade of personal expertise and exchanges with other top professionals in language services technology into simple recipes and strategies for success with situations which are often so messy that even experienced project managers and tech support gurus wail in despair. Garbage like this, for example:


This screenshot is taken from the import of The PPTX from Hell, which a frustrated PM asked for help with just as I began reviewing the draft of Marek's book about a month ago. It contained nearly 32,000 superfluous spacing tags and was such a mess that it choked all the best professional macros usually deployed to deal with such things. Last year, I had developed my own way of dealing with these things that involved RTF bilingual exports and some search and replace magic in Microsoft Word, but when I shared it with Marek, he said "There's a better way", and indeed there is. On page 23 of this book. It was much cleaner and faster, and in a few minutes I was able to produce a clean slide set that was much easier to read and translate in the CAT tool. A page that costs 50 cents (of the €18 purchase price of the guide) earned me a 140x return and saved hours of working frustration for the translation team.

The book covers a lot more than just the esoterica of really messed up source files. It is a superb introduction to dealing with tags and markup for students at university and for those new to the translation profession and its endemic technologies, and it has sober, engaging guidance at every level for experienced professionals. I consider it an essential troubleshooting work for those in support roles of internal translation departments and, quite honestly, for my esteemed colleagues in First Level Support at memoQ. Marek is a superb trainer and an articulate teacher, with a humility that masks expertise which very often surprises, delights and informs those of us who are sometimes thought to be experts.

I am also particularly pleased that in the final version of his text he addresses the seldom discussed matter of how to factor markup into cost quotations and service charges for translations. memoQ is particularly well designed to address these problems, because weighting factors equivalent to word or character counts can be incorporated in file statistics, offering a simple, transparent and fair way of dealing with the difficulties that too often leave project managers screaming and crying in frustration shortly before... or after planned deliveries.

Whatever aspect of tags may interest you in translation technology and most particularly in memoQ, this book will give you the concise, clear answers you need to understand the best actions to take.

The PDF e-book is available for purchase here: https://payhip.com/b/tHUDx


Sep 8, 2018

Editing inline tag content in memoQ

The topic of accessing and editing translatable text in tags comes up from time to time. I thought I had published instructions on this topic some time ago, but when a tech-savvy colleague who always does a proper search before asking questions couldn't find it, nor could yours truly, I concluded that it was time for another tutorial video. So here it is:


The video post on YouTube includes a hot-linked table of contents that will enable you to jump to key parts of the tutorial. This is a very simple function to implement with "markers" in Camtasia, and I recommend that those who make tutorials of any significant length or who post recorded webinars consider implementing such tables of contents to facilitate finding particular parts of interest without endless hit-and-miss searching in a long video.

Apr 3, 2018

Dealing with tagged translatable text in memoQ

Lately I've been doing a bit of custom filter development for some translation agency clients. Most of it has been relatively simple stuff, like chaining an HTML filter after an Excel filter to protect HTML tags around the text in the Excel cells, but some of it is more involved; in a few cases, three levels of filters had to be combined using memoQ's cascading filter feature.
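The idea behind chaining filters can be sketched outside memoQ. The Python below is not memoQ's implementation. It assumes a first stage has already pulled the cell texts out of the spreadsheet (the sample strings are invented) and shows a hypothetical second stage, with made-up function names, that turns HTML tags into numbered placeholders so that only the text between them is exposed for translation.

```python
import re

# Simplified pattern for an HTML tag; a real HTML filter handles far more cases.
HTML_TAG = re.compile(r"</?[A-Za-z][^<>]*>")

def protect_html(cell_text):
    """Replace each HTML tag with a numbered placeholder such as {1}.

    Returns the protected text and the list of original tags, so the
    placeholders can be restored after translation.
    """
    tags = []

    def _swap(match):
        tags.append(match.group(0))
        return "{%d}" % len(tags)

    return HTML_TAG.sub(_swap, cell_text), tags

def restore_html(protected_text, tags):
    """Put the original tags back in place of their placeholders."""
    return re.sub(r"\{(\d+)\}", lambda m: tags[int(m.group(1)) - 1], protected_text)

# Stage 1 (assumed to have happened already): texts extracted from cells.
cells = ['<p class="intro">Welcome to the <b>tutorial</b>.</p>', "No markup here"]

# Stage 2: protect the HTML inside each extracted cell text.
for cell in cells:
    protected, tags = protect_html(cell)
    print(protected, tags)
```

Each additional filter level works the same way: it only sees what the previous level left exposed as translatable text.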

And sometimes things go too far....


A client had quite a number of JSON files, which were the basis for some online programming tutorials. There was quite a lot of non-translatable content that made it past memoQ's default JSON filter, much of which - if modified in any way - would mess up the functionality of the translated content and require a lot of troublesome post-editing and correction. In the example above, Seconds in a day: is clearly translatable text, but the special rules used with the Regex Tagger turned that text (and others) into protected tags. And unfortunately the rules could not be edited efficiently to avoid this without leaving a lot of untranslatable content unprotected and driving up the cost (due to increased word count) for the client.

In situations like this, there is only one proper thing to do in memoQ: edit the tags!

There are two ways to do this:

  • use the inline tag editing features of memoQ or
  • edit the tag on the target side of a memoQ RTF bilingual review file.
The second approach can be carried out by someone (like the client) in any reasonable text editor; tags in an RTF bilingual are represented as red text:


If, however, you go the RTF bilingual route, it's important to specify that the full text of the tags is to be exported, or all you'll get are numbers in brackets as placeholders:


Editing tags in the memoQ working environment is also straightforward:


On the Edit ribbon, select Tag Commands and choose the option Edit Inline Tag


When you change the tag content as required, remember to click the Save button in the editing dialog each time, or your changes will be lost.

These methods can be applied to cases such as HTML or XML attribute text which needs to be translated but which instead has been embedded in a tag due to an incorrectly configured filter. I've seen that rather often unfortunately.

The effort involved here is greater than the typical word- or character-based compensation schemes can justly compensate and should be charged at a decent hourly rate or be included in project management fees. 

A lot of translators are rather "tag-phobic", but the reality of translation today is that tags are an essential part of the translatable content, serving to format translatable content in some cases and containing (unfortunately) embedded text which needs to be translated in other (fortunately less common) cases. Correct handling of tags by translation service providers delivers considerable value to end clients by enabling translations to be produced directly in the file formats needed, saving a great deal of time and money for the client in many cases.

One reasonable objection that many translators have is that the flawed compensation models typically used in the bulk market bog do not fairly include the extra effort of working with tags. In simple cases where the tags are simply part of the format (or are residual garbage from a poorly prepared OCR file, for example), a fair way of dealing with this is to count the tags as words or as an average character equivalent. This is what I usually do, but in the case of tags which need editing, this is not enough, and an hourly charge would apply.

In the filter development project for the JSON files received by my agency client, the text used was initially analyzed at
14,985 words; 111,085 characters; 65 tags
and after proper tagging of the coded content to be protected it was
8766 words; 46,949 characters; 2718 tags.
The reduction in text count more than covered the cost of the few hours needed to produce the cascading filter needed for this client's case and largely ensured that the translator could not alter text which would impair the function of the product.
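The arithmetic behind that claim can be checked with a few lines of Python. The counts are the ones quoted above; the tag weighting of 0.5 word equivalents per tag is purely a hypothetical value for illustration and is not taken from this project.

```python
# Analysis figures quoted above for the JSON files.
before = {"words": 14985, "characters": 111085, "tags": 65}
after = {"words": 8766, "characters": 46949, "tags": 2718}

def reduction_percent(old, new):
    """Percentage by which a count dropped from old to new."""
    return (old - new) / old * 100

def weighted_words(stats, tag_weight):
    """Word count with each tag counted as tag_weight word equivalents."""
    return stats["words"] + stats["tags"] * tag_weight

word_drop = reduction_percent(before["words"], after["words"])
char_drop = reduction_percent(before["characters"], after["characters"])

# HYPOTHETICAL weighting: one tag billed as 0.5 words (an assumption).
TAG_WEIGHT = 0.5
billable_before = weighted_words(before, TAG_WEIGHT)
billable_after = weighted_words(after, TAG_WEIGHT)

print(f"Words down {word_drop:.1f}%, characters down {char_drop:.1f}%")
print(f"Weighted words: {billable_before:.1f} before, {billable_after:.1f} after")
```

Under that assumed weighting, the billable volume after tagging is still well below the original count, even though the number of tags has grown from 65 to 2718.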




Feb 23, 2014

Cleaning up a crappy OCR job for translation

It's a sad fact in the professional work of translators that a lack of understanding of how to deal effectively with various PDF formats causes enormous loss of productivity and results which are not really fit for purpose. The aggressive insistence of many colleagues possessed of a dangerous Halbwissen (half-knowledge) on using half-baked methods and inappropriate tools contributes to the problem, but, bowing to the wisdom about arguing with fools, I now mostly sit back with a bemused and amused smile and watch the tribulations of those who believe in salvation by PDF import filters and cheap or free OCR. "TANSTAAFL" is as true as it ever was.

Just before the weekend I got an inquiry from an agency client I rather like. Nice people, good attitude, but struggling sometimes trying to find their way with technology despite some in-country "expert" training. This inquiry looked a bit like ripe fish at first glance. The smell got stronger after I was told that the OCR file would be all there was to work with, because the corporate end client had converted the PDF of their annual report and begun to edit the mess (and comment on it heavily too) in that file. It was a thoroughly appetizing sight when imported into a translation environment:


There are so many issues in that tossed salad of translation terror that I don't even know where to start describing them.

The screenshot above was in memoQ. How does it look in SDL Trados Studio? Often just as messy. In this case, this was the result in an older version of Studio:

SDL Trados Studio choked and refused to import the file!

I do have the latest version of SDL Trados Studio 2014, but unfortunately it's on a system that does not yet have Microsoft Office, because I refuse to bow to Microsoft's insistence that I must buy a Portuguese version of that software. No MS Office, no file import in this case with SDL Trados Studio. memoQ fortunately has not needed MS Office to import the old Office file formats since the release of memoQ 6.0.

Ugly OCR trash like this file is all too common at this time of year, and as I am busy compiling the syllabus for the workshop I want to do on better living with well-used technology for legal and financial translators, I felt obliged to take this one on as a teaching example. It's actually not as bad as it looks. On the other hand, the best approach may not always be obvious, and the best solution for one document may not apply as well or at all to another.

My first approach was to use Dave Turner's CodeZapper macros. This isn't as straightforward as it used to be since I downgraded from Microsoft Office 2003 to later versions; for some reason the toolbar refuses to stay loaded between work sessions, and there's no way I can keep track of all the abbreviations for macros on it.


I can't deal with anything more complicated than clicking the "CZL" option for "Code Zapper lite", which did a rather decent job on the heavy mess above:


But all was not quite as well as it seemed:


Text in the header and footer remained trashed, and the heavy use of comments and tabbed lists meant that there were plenty of legitimate tags to deal with which were just too confusing with the DVX-like mess of memoQ's default import and display for an RTF file.

So I went for a kinder, gentler approach. I changed my import filter settings in memoQ:


There is actually seldom any good reason to import an RTF or DOC file into memoQ using the default filter settings. And marking those two little checkboxes at the bottom often accomplishes much of what CodeZapper does. Sometimes less. A bit more in this case.


The header and footer texts were absolutely clean. Don't let the extra tags in this sample fool you: overall, there were fewer than in the code-zapped file. Now there are still a number of issues to be seen in the screenshot above, including paragraph breaks in the middle of a sentence and awful manual hyphenation (many instances of that in the whole text) and joys like badly placed comments and links which mess up the text and prevent term identification by the software:



Source editing features of memoQ (F2) enable issues like the two above to be dealt with easily:



After a bit of repair like this in the memoQ environment (where it is really much, much easier to fix the problems of bad comment and link placement), I copied the entire source text to the target to enable me to export a cleaner source text file. I then opened this file in Microsoft Word and used various search and replace operations to fix the bad hyphenation and other problems like excess spaces. Replacing the hyphens had to be done occurrence-by-occurrence, because the style of writing in German meant that there were many legitimate instances of hyphens followed by spaces.
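For anyone who prefers a script to Word's search and replace dialog, the same review can be sketched in Python. This is only an illustration of the idea, not the procedure used for this job: it lists each "hyphen plus space" occurrence with a suggested fix and leaves the decision to a human. The short list of words that usually follow a legitimate suspended hyphen is an assumption and far from complete.

```python
import re

# A word fragment ending in a hyphen, then whitespace, then the next word.
HYPHEN_SPACE = re.compile(r"(\w+)-\s+(\w+)")

# Words that typically follow a legitimate suspended hyphen in German
# ("Ein- und Ausgaben"). ASSUMPTION: this list is illustrative only.
LIKELY_LEGITIMATE_NEXT = {"und", "oder", "bzw", "sowie"}

def hyphen_candidates(text):
    """Yield (found, suggestion, likely_legitimate) for each hyphen+space hit."""
    for m in HYPHEN_SPACE.finditer(text):
        left, right = m.group(1), m.group(2)
        legit = right.lower() in LIKELY_LEGITIMATE_NEXT
        # Suggest joining the halves unless the hyphen looks intentional.
        yield m.group(0), (m.group(0) if legit else left + right), legit

def collapse_spaces(text):
    """Reduce runs of two or more spaces to a single space."""
    return re.sub(r" {2,}", " ", text)

sample = "Der Jahres- abschluss zeigt Ein- und Ausgaben  im Über- blick."
for found, suggestion, legit in hyphen_candidates(sample):
    status = "keep?" if legit else "join?"
    print(f"{status:6} {found!r} -> {suggestion!r}")
print(collapse_spaces(sample))
```

The suggestions still need a human eye: a broken compound such as "E-Mail- Adresse" should keep its hyphen when the halves are rejoined, which a simple rule like this one would get wrong.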

After all was done, the "before and after" looked like this:

BEFORE

AFTER

The remaining tags were all legitimate formatting tags for comments, hyperlinks, tabs after section numbering, etc. These do, of course, require attention and add complexity to the work still, so they must be included in the charges for the job. memoQ makes this calculation particularly simple by allowing weighting factors to be specified in the analysis. These are the settings I typically use for a German source text:


I find this usually represents a fair minimum for the additional effort in translation and quality assurance that tags require. In this case, of course, time charges for the cleanup apply, but as you can probably guess from comparing the two analysis tables above, the customer is actually saving a lot of money by paying me to clean up the mess, and the results will be a lot more usable. My cleaned-up version of the source text will also be returned in case the authors intend to make more revisions in the source - this will save more time and money by avoiding redundant cleanup in that case.


Dec 11, 2013

General settings for memoQ TMs

memoQ TM settings are found in the Resource Console, the Options and a project's Settings.
This is a very useful "light resource" which is well worth nearly every user's time.
To define the TM settings to be used in new projects, select a settings configuration under Tools > Options... > Default resources > TM settings (in the row of icons) by marking its checkbox.

To define the default TM settings to be used in the project you have opened, go to Project home > Settings > TM settings (in the row of icons) and mark the checkbox for the desired project default.

Different settings for individual TMs in a project (for example to set higher or lower match criteria) may be applied by going to Project home > Translation memories, selecting the TM of interest, clicking the Settings command at the right of the window and choosing the settings to apply instead of the project's standard TM settings.

The General settings tab is the same for all currently supported versions of memoQ. Role options are included on another tab in memoQ 2013 R2, and the Project Manager editions of memoQ offer additional possibilities for filtering and/or applying penalties to content on a Filters tab.


Match thresholds
The first value here (minimum) controls the fuzzy percentage below which a match will not be displayed in the translation results pane at the upper right of the working translation window.

The "good match" threshold is relevant to pretranslation (though this is unfortunately not made obvious in the dialog). The default value of 95% is really too high and would only apply to matches with small differences in tags or numbers, since any small difference in words is penalized significantly in memoQ (something I find very helpful, as I can understand more quickly what differences to look for compared to working in Trados). I usually set my "good matches" to 80%.

Not a "good match" according to the memoQ TM default setting
Penalties
In my work, an alignment penalty, which is a deduction from the match rate of a translation unit created by feeding an alignment to a translation memory, does not make a lot of sense. This is because
  • I almost never send alignments to a TM. Why bother? LiveDocs may be slower in pretranslation, but it provides context matching just like a TM, and you can actually read what you find in a concordance search in its original document context. TMs suck because you do not get the full context for your matching segment and are thus at greater risk for missing information which may be important for a translation. This is especially the case with short match segments.
  • if I happen to be aligning a dodgy translation and want to send it to a TM, I'll put it in a "quarantine TM" which already has its own penalty.
  • on those rare occasions when I might feed an alignment to a TM, it's because the content is going to a user of another CAT tool, and if that person uses Trados or another tool that can read XLIFF files or other available bilingual formats, I'll send the data as that instead, so it can be reviewed and modified more easily before feeding to a TM. This also gives the other person a bilingual reference with document context.
  • alignment for TMs is soooooo 1990s!
User penalties: If you have the misfortune to share a TM with someone whose work you do not trust completely and you want to avoid letting that person's 100% and context match segments slip past you unnoticed, apply a suitable penalty for the level of "risk" that person represents. If you want to be sure that user's content never gets used in a pretranslation and never appears in the translation results pane, apply a whopping big penalty like 80%. Those segments will not be shown or inserted but will still be there in a concordance search if you want them.

TM penalties: Sometimes a client provides you with a TM you do not trust completely, or you may have a "quarantine TM" with content of dubious quality. Or I might have a TM with good content in British English but need to deliver a translation in American English. Applying penalties to such TMs will reduce the priority of their matches and prevent 100% matches with inappropriate language from slipping past without more careful inspection. As in the case of user penalties, you can also apply a very large penalty to ensure that matches will never be displayed in the translation results pane or used in a pretranslation but still have the TM content available for concordance searches.
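How thresholds and penalties interact can be illustrated with a small Python sketch. It is a simplified model, not memoQ's actual logic: the threshold values (minimum 60%, "good match" 80%) are assumed for the example, and adding several penalties together is likewise an assumption.

```python
def effective_match(raw_match, *penalties):
    """Subtract penalties (in percentage points) from a raw match rate."""
    return max(0, raw_match - sum(penalties))

def classify(match, minimum, good):
    """Describe how a match is treated under the two thresholds."""
    if match < minimum:
        return "hidden"        # below the minimum: not shown in the results pane
    if match >= good:
        return "good match"    # counts as a "good match" for pretranslation
    return "shown only"        # displayed, but not a "good match"

# ASSUMED threshold values for illustration.
MINIMUM, GOOD = 60, 80

examples = [
    ("100% match, 80% user penalty", effective_match(100, 80)),
    ("100% match, 5% TM penalty", effective_match(100, 5)),
    ("85% match, 10% TM penalty", effective_match(85, 10)),
]
for label, rate in examples:
    print(f"{label}: {rate}% -> {classify(rate, MINIMUM, GOOD)}")
```

The first example shows why a very large penalty keeps content out of the results pane and pretranslation: the penalized rate falls below the minimum threshold, while the segment itself remains in the TM for concordance searches.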

Adjustments
It seems to be a good idea generally to enable the adjustment of fuzzy hits and inline tags. In many (but not all) cases, this will correct small differences in numbers, punctuation, cases and inline tags.

The only significant effect I was able to determine in adjusting the inline tag strictness in my tests was that more permissive settings might count a match with different tags as a full match. While this might meet the requirements of some clients hoping to impose discount schemes, from a quality assurance perspective, this does not seem like a good idea, and I believe it is better to have a strict setting here to draw attention to differences and reduce the chance that errors might be overlooked.

Sep 26, 2013

Getting a grip on memoQ QA resources

I think the initial reaction of a lot of people to memoQ's QA functions is overwhelmed bafflement. And that's really a shame. This simple, versatile and powerful feature included in the translation environment can save a lot of time and grief. But perhaps Kilgray and many memoQ advocates and trainers (yours truly included) have not taken as user-friendly an approach to presenting this feature as we might. The problem starts, I think, with the default QA profile in memoQ - as far as I know the only configured profile that is delivered with the software. It has nearly every damned option turned on and drives many people crazy with long lists of uninteresting alleged "errors". Even with sorting to group the problems that are of interest, the huge list of QA warnings is often like a big, nasty finger wagging in one's face. It just pisses me off.

In my previous memoQuickie posts on QA profiles and terminology QA as well as a later post with a demo video showing terminology QA in a LiveDocs alignment/editing workflow for a dictated translation, I tried to show how one can create focused QA profiles that can accomplish specific, important tasks like verifying tag integrity or checking a translation against a list of critical, mandatory terminology, but when one of my frequent collaborators called for advice on how to use memoQ to check the integrity and format of more than one hundred footnotes in an OCR document and admitted that she still had not created a custom QA profile for tags and didn't know how, I realized that my approach up to now has probably been a colossal failure.

I keep telling people how easy it is to create some of those custom QA profiles. But why should they have to for common tasks? Why doesn't Kilgray help them out a little with some demo QA profiles that can be used for common tasks, avoiding the "noise" of the point-the-finger-at-everything default profile? After all, there are many demonstration configurations for that LQA feature that is of little or no use to the freelance community. Why not something that would benefit more users?

Well, there are now a few simplified QA profiles available on Kilgray's Language Terminal:

Kilgray Language Terminal - get your QA profiles

Language Terminal is a community resource with a growing number of features, most of which I haven't blogged about for lack of time. Its future is far more interesting to me than its present, but it currently includes a small but growing library of resources such as custom filter configurations and QA profiles which can help users with certain tasks. It also offers nice online backup integration for memoQ projects and a free InDesign server. The latter can be used by anyone (including those with other tools) to create PDF previews of InDesign files they have received or translated, and the integration of this InDesign server with memoQ desktop projects is expected to increase in the not-distant future.

The three QA profiles (MQRES files which can be imported to your memoQ installation in seconds) are the ones I use most often. "Tags only" allows me to verify that I haven't messed up my target file formatting by leaving out important tags, and "Terminology check" lets me use an approved list of terms to ensure that they are translated as agreed with the client.

The "Empty QA profile" is great for the ego. Apply that to your project, and the QA check will show no errors or warnings at all. Fantastic, right? If you decide that there is some particular type of error or maybe a few types that you want to check in one go, it's a simple matter to clone this file, rename it and edit it to activate the QA tests of interest. Much easier than turning off all the garbage in the default profile.

Any of those three simplified profiles might make a good basis for creating an automatic QA check that best meets your needs for a particular project. And if you want to share it with others, Kilgray's Language Terminal is a good place to do so.

Nonetheless, I do hope that future builds or releases of memoQ might include these or other QA profile examples in every memoQ installation. That would surely help more users get a proper grip on memoQ's best quality assurance features.

Aug 2, 2013

Translating SDL Trados Studio SDLXLIFF files & more in memoQ!



My latest demonstration video actually covers a number of memoQ features so that I would have an excuse to create this video index:
Time  Description
0:32  Importing the first SDLXLIFF file to memoQ
1:12  Exporting the finished translation
1:27  Viewing the translation in SDL Trados Studio 2009
1:40  Re-importing the edited translation for a TM update
3:24  Saving the translation in a LiveDocs corpus for later reference
3:55  Importing a new version of the text in an SDLXLIFF source file
4:25  Comparing source text versions
5:55  Document-based pretranslation ("X-Translate")
7:11  Examining a "warning" for forgotten tags
7:46  Results of the second translation in SDL Trados Studio

That is the sort of thing I was talking about in a recent blog post about new approaches for online instruction. Many times I have wished for just such an index for long webinars or even much shorter reference videos like this one.

This tutorial was inspired by a Skype chat with a colleague in the US a few days ago. She uses memoQ but works with a number of others who use various versions of SDL Trados Studio, and there were some questions about how one might deal with TM updates after a translation as well as the inevitable new versions that legal and financial translators often encounter.

I have also noticed that quite a number of people are not up to date on SDLXLIFF compatibility with memoQ; this video also shows that former issues with preserving segment status have been taken care of, and everything now works well.

What is not obvious in the video is that one can also change the segmentation of the SDLXLIFF in memoQ; this happens only in the memoQ environment to allow better translation and more sensible translation memory content, and when the SDLXLIFF file is exported from memoQ, the original segmentation from Trados is preserved in the Trados environment.

Also not shown in the video is how I imported a third version of the source text, this time as a Microsoft Word file, not an SDLXLIFF. The document-based pre-translation (X-Translate) worked perfectly, and the target file was exported in the proper format (DOCX).

There are, of course, many other ways one could handle a "project" like this, but the procedure shown is not unlike what I sometimes do in projects myself.

********

I apologize for the quirky click animation in this tutorial; Camstudio had some problems I have never encountered before, and I'll have to get to the bottom of that if I keep using that tool. Otherwise, the video quality is probably the best I have achieved so far, and I would like to thank the friend who revealed the "secret" of better quality video for YouTube.

Jul 12, 2012

RegEx for translating DVX external view tables in memoQ

Atril's Déjà Vu was the first translation environment tool I am aware of to offer a means of exchanging translation content for review, correction and translation using an ordinary word processor. These "external views" were the original inspiration for memoQ's RTF bilingual tables, which are used in many interoperable workflows not only with people using a word processor but with many other CAT tools as well.

As with memoQ RTF bilinguals, the content in the "external view" which is not to be translated can be selected and hidden with a word processor, leaving only a target column into which the source text has been copied. But these steps alone with the standard RTF filter pose a problem:


The DVX "codes" (tags), which are represented by curly brackets enclosing a number, are not protected. Erasing parts of them can damage the content. It is also not possible to perform a tag check using the memoQ QA functions.

The solution is to use the Regex tagger in memoQ. There are two ways to do this.

If the document has already been imported,


the tagger can be run from the Format menu.

Enter the appropriate regular expression to convert the DVX code to a protected tag: \{(\d+)\}


This expression describes the pattern of the text to protect: an opening curly bracket (with a backslash in front of it to indicate that it is to be interpreted literally as a character, not as part of regex syntax such as a repetition count), one or more digits (\d indicates a digit as opposed to d, which is just the letter d, and the plus sign means one or more) and a closing curly bracket ("escaped" with a backslash so it is understood literally as the bracket character in the DVX code). The parentheses around \d+ form a capture group that holds the number of the code.
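The pattern can be tried out in any regex-capable tool before it goes into memoQ. Here is a minimal Python check; the sample sentence is invented, and memoQ's regex engine is not Python's, although this simple pattern is written the same way in both.

```python
import re

# The same pattern as entered in the Regex tagger dialog.
DVX_CODE = re.compile(r"\{(\d+)\}")

sample = "Press {1}Start{2} to begin, then wait {15} seconds."

# findall returns the captured digits of each code that was found.
codes = DVX_CODE.findall(sample)
print(codes)

# Braces that do not enclose digits only are left alone.
for text in ("{x}", "{}", "{ 3 }"):
    print(repr(text), "matches" if DVX_CODE.search(text) else "no match")
```

Only curly brackets that enclose nothing but digits are matched, so ordinary text in braces would stay editable.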

Click Add to put the rule in the list, then click Run tagger now.


The result is protected tags in the translation grid of memoQ. These can also be verified with a QA tag check after the translation is completed.
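What a tag check of this kind verifies can be sketched in Python. This is only a rough model of the idea, not memoQ's QA implementation: it compares the DVX codes found in a source and a target string and reports any that are missing or extra. The function name and sample sentences are invented for the example.

```python
import re
from collections import Counter

DVX_CODE = re.compile(r"\{\d+\}")

def tag_check(source, target):
    """Compare the codes in source and target; return (missing, extra)."""
    src = Counter(DVX_CODE.findall(source))
    tgt = Counter(DVX_CODE.findall(target))
    missing = sorted((src - tgt).elements())  # in source but not in target
    extra = sorted((tgt - src).elements())    # in target but not in source
    return missing, extra

source = "Klicken Sie auf {1}Start{2}."
target = "Click {1}Start."
missing, extra = tag_check(source, target)
print("missing:", missing, "extra:", extra)
```

Counting occurrences rather than just collecting distinct codes means a code that appears twice in the source but only once in the target is reported as well.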

Your regular expression rules can be saved in the dialog above and re-used, or exported from the list under Tools > Resource console... > Filter configurations and shared with others.

The regular expression tagger can also be used as a cascading filter when the RTF file for the external view is imported:



Here the configuration can also be saved or another one loaded.

Dec 27, 2011

SDL Trados Studio: Translating memoQ bilingual RTF files

Some time ago, I noted that SDL Trados Studio experiences difficulties importing XLIFF files in which the sublanguages are not exactly specified if the default languages are not set to the same major language. So if you plan to translate an XLIFF from memoQ or another tool in SDL Trados Studio, it is necessary to ask the one generating the file to specify the sublanguages or, if that is not practical, use the workaround described here. I discovered this bug before the release of the 2011 version of Studio and spoke to SDL development and management staff specifically about this at the TM Europe conference in Warsaw, but apparently this is not a priority to fix compared to other issues, and it may be a while before SDL Trados Studio users can work with client XLIFF files without coping with this headache.
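Whether an XLIFF file specifies sublanguages can be checked before it is sent to anyone. The Python sketch below assumes an XLIFF 1.2 file, where the languages are given in the source-language and target-language attributes of the file element, and treats a code without a hyphen (such as "de" rather than "de-DE") as lacking a sublanguage. The sample document and the function name are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace of XLIFF 1.2 documents.
XLIFF_NS = "{urn:oasis:names:tc:xliff:document:1.2}"

def languages_without_region(xliff_text):
    """Return (attribute, value) pairs whose language code has no sublanguage."""
    root = ET.fromstring(xliff_text)
    problems = []
    for file_el in root.iter(XLIFF_NS + "file"):
        for attr in ("source-language", "target-language"):
            value = file_el.get(attr)
            if value is not None and "-" not in value:
                problems.append((attr, value))
    return problems

# Invented sample: the source language lacks a sublanguage, the target has one.
sample = """<xliff version="1.2" xmlns="urn:oasis:names:tc:xliff:document:1.2">
  <file original="demo.docx" source-language="de" target-language="en-US" datatype="plaintext">
    <body>
      <trans-unit id="1"><source>Hallo</source><target>Hello</target></trans-unit>
    </body>
  </file>
</xliff>"""

print(languages_without_region(sample))
```

A file that yields an empty list specifies sublanguages for both languages, which is what the person generating the XLIFF should be asked for.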

Several of my client agencies using memoQ for project management have quite a number of freelance translators using various Trados versions and who have no intention to stop doing so. It's important to work smoothly with these resources in a compatible way, which also protects the data and formats. In a recent article on processing memoQ content with Trados TagEditor, I published a procedure I developed which enables the memoQ tags in the text of the bilingual RTF table export to be protected as tags when working in SDL Trados TagEditor. Now I would like to present a similar approach for Trados Studio users, which can serve as an alternative to XLIFF exchange.

If the bilingual RTF table is created in memoQ specifying the mqInternal style for tags, this style setting can be specified as non-translatable in SDL Trados Studio. To do this, select the menu choice Tools > Options, and in the dialog which appears under File Types, add the mqInternal style to the list of styles to be converted to internal tags in the appropriate formats (RTF, and just in case the file gets re-saved as a Microsoft Word document, for Microsoft Word 2000-2003 and Microsoft Word 2007-2010 as well):

SDL Trados Studio dialog for setting RTF, DOC and DOCX styles as "non-translatable" (converting to tags)

Once the mqInternal style has been entered this way in SDL Trados Studio, the prepared bilingual RTF file can be imported. "Preparation" for import includes copying the source text to the target and hiding all the text you do not intend to translate (the file header, the source column, and the comments and status columns if present). The result will look something like this:

The prepared memoQ bilingual RTF file imported to SDL Trados Studio. Note that the bold and
italic type are displayed normally as in memoQ, which offers the translator greater working ease.

Please note that the same procedure described for working with these files in TagEditor (hiding the red text of the tags, see the TagEditor article for details) also works for SDL Trados Studio, but this method involving the mqInternal style saves a few steps.

Dec 25, 2011

SDLXLIFF files in TagEditor, OmegaT and memoQ

As SDL Trados Studio gains acceptance, SDL's own flavor of XLIFF is encountered with increasing frequency by translators using other tools. I decided to test three of these to see how they fared: TagEditor (for "backward compatibility" with Trados users who haven't upgraded), the Open Source tool OmegaT and memoQ.

A simple DOCX test file was created, which looked like this:

It was opened in SDL Trados Studio 2009 and saved as an SDLXLIFF file, which was subsequently imported into each of the other three translation environment tools.


TagEditor test
Using the default XLIFF INI supplied with SDL Trados 2007, I obtained results which looked as follows:


Some ugly tag salad there and exposed, vulnerable information from the header. Using the adapted INI file I made for memoQ XLF files, things improved a bit:


Still not very pretty, but it works, and it works better than a memoQ XLIFF currently does in TagEditor. No breaking of tags.

Translated and brought back into SDL Trados Studio, the translation grid looked like this with everything in good order:


The target DOCX file with the translation saved nicely and was perfect.

In real life, however, it may be necessary to adapt the INI file in TagEditor more extensively for good results. The German consultancy Loctimize has compiled some good instructions for doing so in which the entire workflow is also described nicely (in German). So far I haven't run across similar instructions in English.


OmegaT test
Initially things looked much better with the SDLXLIFF file imported to OmegaT:


A great start, much cleaner-looking than TagEditor! But when the translation was re-imported to SDL Trados Studio, a small problem was apparent:


One of the tags in the second segment was dropped. In a similar test with an XLIFF from memoQ, the version of OmegaT I tested (version 2.3.0, update 3) appeared to trash even more tags, and the target file was completely reformatted! In fact, it even trashed tags on the source side in the memoQ file! Thus I was deeply concerned about the XLIFF filter in OmegaT. However, as astute observers have noted, I probably deleted the missing tag when editing in OmegaT, and a subsequent successful re-test of the workflow confirmed this. But the problem with the XLF file from memoQ was frighteningly repeatable. Careful, systematic testing revealed, however, that the roundtrip of a bilingual XLF file from memoQ back into memoQ failed. Either there is a problem with the version I have installed (5.0.56) or the installation is corrupted. The matter is being pursued with Kilgray support. The target file from the SDLXLIFF translated with OmegaT was fine.


memoQ test
I have translated many SDLXLIFF files in memoQ and seldom encountered a problem of any kind. The file from SDL Trados Studio looks as follows in the memoQ environment:



Please note: with memoQ I can use an XLIFF which has not had the source copied to the target or one which has been pretranslated. That is not really the case for the other two environments tested, because with both TagEditor and OmegaT the source must be copied to the target or you have nothing to translate. You might say that memoQ offers "real" XLIFF editing for translation.

The SDLXLIFF file translated in memoQ reimported beautifully to SDL Trados Studio 2009 and saved to a target file (DOCX) from there with no problems.

Trados TagEditor: Optimal translation of memoQ bilinguals

With the growing number of translation agencies, direct clients and outsourcing translators adopting Kilgray's memoQ as a working platform for managing translation project content, it is particularly important for these new memoQ users and their partners to understand the best approaches to working together with persons who use other tools. One tool which is still commonly found is SDL Trados TagEditor. Compared to the other "classic" Trados tool, the Workbench macros for Microsoft Word, TagEditor has the advantage of enabling many different file formats to be processed while protecting their formatting elements (also known as "tags").

SDL Trados TagEditor can work with two types of "bilingual" files prepared in memoQ: XLIFF (*.xlf) files and bilingual RTF tables. Each approach will be presented here along with some suggestions for best practice.

XLIFF files
TagEditor comes with a default INI file for translating XLIFF, typically found at the path C:\ProgramData\SDL International\Filters\XLIFF.ini. This INI enables the contents of the target segments from the memoQ XLF file to be translated as the source in TagEditor. Thus for this approach to work, the source must be copied completely to the target in memoQ before the bilingual XLIFF is created using the Export bilingual function of the Translations page. This makes pretranslation undesirable in most cases, because the source text for matches will not be accessible and the translator will end up with a very screwy TM. Data for the TM should be supplied to the translator as TMX; be aware that match rates for the segments in TagEditor will differ significantly in some cases.
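To make clear what "source copied to target" means at the file level, here is a minimal Python sketch that fills empty or missing target elements in a generic XLIFF 1.2 document. It is only an illustration under assumptions: the function name copy_source_to_target is made up for this example, it relies solely on the standard XLIFF 1.2 namespace, and it has not been tried on real memoQ exports, whose tool-specific header content may not survive a round trip through a generic XML library. For real projects, do the copy in memoQ as described above.

```python
import copy
import xml.etree.ElementTree as ET

XLIFF_NS = "urn:oasis:names:tc:xliff:document:1.2"
ET.register_namespace("", XLIFF_NS)  # keep XLIFF as the default namespace on output


def copy_source_to_target(xliff_text):
    """Return XLIFF text in which each empty or missing <target> mirrors its <source>.

    Hypothetical helper for illustration; existing translations are left alone.
    """
    root = ET.fromstring(xliff_text)
    for unit in list(root.iter(f"{{{XLIFF_NS}}}trans-unit")):
        source = unit.find(f"{{{XLIFF_NS}}}source")
        if source is None:
            continue
        target = unit.find(f"{{{XLIFF_NS}}}target")
        if target is not None and ((target.text or "").strip() or len(target)):
            continue  # target already has content (e.g. a pretranslation)
        # Copy text and inline tag elements, but not the source's attributes.
        filled = copy.deepcopy(source)
        filled.tag = f"{{{XLIFF_NS}}}target"
        filled.attrib.clear()
        children = list(unit)
        if target is None:
            unit.insert(children.index(source) + 1, filled)
        else:
            filled.attrib.update(target.attrib)  # keep e.g. a state attribute
            filled.tail = target.tail
            position = children.index(target)
            unit.remove(target)
            unit.insert(position, filled)
    return ET.tostring(root, encoding="unicode")
```

Segments that already contain a translation are skipped on purpose, which is also why a pretranslated file would leave TagEditor without the source text for those segments.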

The memoQ XLIFF files will have a lot of "junk" at the top of the file when viewed in TagEditor:

Skip the content between the mqfilterinformation tags and do not change it in any way. Place the cursor below that to start working. If you prefer not to see that information at all, use the XLIFF INI for TagEditor which I modified for use with memoQ XLF files. Then the XLIFF will look a bit cleaner with the header information filtered out:

Astute observers may have noticed, however, that all is not really well with the tag structures in the views above. I think there is a problem with the way that memoQ is generating the XLIFF files, with some tag structures being replaced by entities. (You see this if you open the XLIFF from memoQ in a text editor.) This causes consistent problems like the following in TagEditor:
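Instead of scanning the file by eye in a text editor, a rough check for this kind of entity-escaped markup can be sketched in Python. Everything here is an assumption for illustration: the function name find_escaped_markup is hypothetical, the file is treated as generic XLIFF 1.2, and the regular expression is only a heuristic that may flag harmless text such as "x<y and z>w". Escaped code inside inline elements like bpt and ept is normal in XLIFF, so only the text directly inside source and target is examined.

```python
import re
import xml.etree.ElementTree as ET

XLIFF_NS = "urn:oasis:names:tc:xliff:document:1.2"
# Heuristic: something that looks like <tag ...> or </tag> once entities are decoded.
TAG_LIKE = re.compile(r"</?[A-Za-z][^<>]*>")


def find_escaped_markup(xliff_text):
    """List (unit id, 'source' or 'target', matches) where segment text looks like markup.

    Hypothetical helper: the XML parser decodes &lt; and &gt;, so anything
    tag-like found in plain segment text must have been written as entities.
    """
    hits = []
    root = ET.fromstring(xliff_text)
    for unit in root.iter(f"{{{XLIFF_NS}}}trans-unit"):
        for part in ("source", "target"):
            elem = unit.find(f"{{{XLIFF_NS}}}{part}")
            if elem is None:
                continue
            # Text directly in the segment: leading text plus the text after
            # each inline element, but not the content of those elements.
            pieces = [elem.text or ""] + [child.tail or "" for child in elem]
            found = [m for piece in pieces for m in TAG_LIKE.findall(piece)]
            if found:
                hits.append((unit.get("id"), part, found))
    return hits
```

A non-empty result only indicates where to look; whether those segments actually cause trouble in TagEditor still has to be checked there.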


This will require a lot of tag fixing. Thus I really can't recommend the XLIFF method at this point, not for my simple little test file in any case. The methods using the bilingual RTF tables with memoQ tag protection are safer and the structures that result are much simpler.

But if you do use this method, when the translation is complete, clean the TTX file using Trados Workbench or use the menu option File > Save Target As... in TagEditor to create an XLIFF file to return with the translated content. If the content inside the mqfilterinformation tags has not been segmented, an accurate count of the words translated will be shown in Trados Workbench upon cleaning the TTX (as accurate as that tool is given its limitations with numbers, dates, etc.).

Bilingual RTF tables
These are created in memoQ using the Two-column RTF option of the Export bilingual function. Technically speaking, the files have more than two columns (source and target, index numbers and possibly columns for a second target text, comments and status). Good practice for working with these files in TagEditor and many other tools also requires the source to be copied to the target column. This can be done in memoQ or later in a word processor. The table might look like this, for example:


For best results in TagEditor, it is important that this file be generated with the "mqInternal" style selected for tag formatting. The dark red color imparted to the tags with this option means that proofreading in a word processor is easier, and it also enables the text of the tags to be selected and hidden using a search and replace function. If the RTF file is then saved as a Microsoft Word file, the memoQ tags in the table will then be protected in TagEditor!


If the "full text" option for tags is selected, this makes little or no difference in the TagEditor view.

Here's a quick look at what the protected memoQ tags look like in TagEditor and what can happen without protection:




One possible workflow for memoQ RTF tables in SDL Trados TagEditor consists of the following steps:
  1. Copy the source text to the target in memoQ
  2. Export a bilingual "two-column" RTF file with the mqInternal style option selected for the tags
  3. Re-save the RTF as a DOC or DOCX file! This is necessary so that TagEditor will use the right filter.
  4. Select and hide all the text in the file
  5. Select only the text to translate in the target column and unhide it
  6. Using search and replace, hide all the dark red text. The settings for the dialog are shown below and are set using the Font... option (marked with a red arrow in the screenshot) in the Format dropdown menu of the Replace dialog.


    The font color to hide will be found under More Colors... in the font colors of the font properties dialog:

  7. Launch TagEditor and open the Microsoft Word file with your content to translate. All the hidden text will be protected in tags. Translate the accessible text.
  8. Create a target MS Word file from your TTX as described above for the XLIFF files translated in TagEditor.
  9. Open the target file and unhide all the text.
  10. (Optional) When reviewing the text in the word processor, comments may be added if there is a comments column. These will be imported back into memoQ and can serve as valuable feedback.
  11. Re-save the target file as an RTF
  12. Re-import the RTF with the translated table into memoQ. The target text will be updated to include the translation. 
  13. A QA check for tags, terminology, etc. should be performed in memoQ before exporting the final file for delivery. If an external reviewer is used, another bilingual file in an appropriate format can be generated in memoQ for that work.
Steps 4 to 6 can be performed using a macro for convenience.
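
The tag check in step 13 belongs in memoQ, but a translator who wants a quick sanity check before returning a table could compare the tag placeholders in each source and target cell. The Python sketch below is only an illustration: the function name tag_differences is hypothetical, and the pattern assumes that the tags appear in the cell text as numbered placeholders such as {1}, [1} or {2]. If your export shows tags differently (for example with the "full text" option), the pattern has to be adapted.

```python
import re
from collections import Counter

# Assumed placeholder forms: {1}, [1}, {2] (a number inside braces or brackets).
TAG_PATTERN = re.compile(r"[\{\[]\d+[\}\]]")


def tag_differences(source, target):
    """Compare tag placeholders in one source cell and one target cell.

    Hypothetical helper: returns the placeholders missing from the target
    and those that appear in the target without a counterpart in the source.
    """
    source_tags = Counter(TAG_PATTERN.findall(source))
    target_tags = Counter(TAG_PATTERN.findall(target))
    return {
        "missing": sorted((source_tags - target_tags).elements()),
        "extra": sorted((target_tags - source_tags).elements()),
    }


if __name__ == "__main__":
    print(tag_differences("Press [1}OK{2] to continue{3}.",
                          "Drücken Sie [1}OK{2], um fortzufahren."))
```

The check only counts placeholders and ignores their order, so it catches a dropped tag but not one that has been moved to the wrong place.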

The procedure described above can, of course, be abbreviated considerably by simply copying the source text cells into a new Microsoft Word document, doing the search and replace to hide the dark red text for the tags, then processing the file in TagEditor. After translating, unhide the text in your working file, then paste the cells over the target cells in the RTF file.

Here's a look at the test file translated in TagEditor (with a comment added as shown by the dark speech balloon icon) after it was re-imported to memoQ:



And here's the translated file itself: