Translation Tribulations: productivity

Showing posts with label productivity.

Oct 1, 2023

Bring the lightning.

Quality is a slippery notion, especially when discussing it with those whose ethical approaches to providing services are even slipperier. According to one well-known figure in the trashlation sector, "Quality doesn't matter". Knowing that individual as I do, I know that this utterance was intended as a provocation, and that it is likely backed by some almost-persuasive sleight-of-hand involving differing definitions and whatnot. Given the variability in the human emotional perception of quality (as with obscenity, I cannot define quality, but I know it when I experience it), all of the attempts one sees to quantify it in language services seem all the more absurd.

All the myriad process definitions, ISO certifications, stamps and seals of sinlessness, diplomata, grants of honoris causa et cetera cannot transform the humble lightning bug into a Bolt of Zeus.

Nor are Large Language Models (LLMs) capable of such linguistic transubstantiation, but rather the opposite. The predictive practices at their core could take a training feed of all the world's great literature (and likely already have), and yet the output would be nothing more than an insipid averaging of the basest mediocrities. Only the basest of the mediocre could mistake such text for objectively good quality.

Were we to plot the degree of enthusiasm for AI as the "future" of trashlation against the degree of actual understanding and competence for good language, the graph would look something like this:

But a recent article in The Economist suggests a better way. Curiously, it is a process I resort to myself when the greatest subtlety and balance are needed in a work, for example in the translation of good poetry, or a letter of condolence occasioned by the loss of a belovèd child.

Back to pen on paper. Where the pressure of the nib is an expression in itself, as the sweeping flourish of a final letter or a well-executed ligature.

"But that's ABSURD!!!" some might protest, glancing nervously at their smartphone timers counting down to the next due delivery of linguistic sausage. Much too slow some might think. But is it? Really?

"But you need to run QA and you can't do that with a sheet of scribbles on paper!" some might suggest, more reasonably. Ah, but I can, merely dictate the text I will have read aloud already time and again as I refined the words and their rhythm, and then, in good electronic form, all the slings and arrows of outrageous regex are my quality arsenal.

We have a slow food movement. Perhaps if we want more delicious, digestible, properly communicative words in our translated lives, we should slow the fuck down and let them crystallize, with exquisite subconscious fractal creativity, to form bolts of emotion and understanding that pierce the veil between this world and others as they flash across a page.

As the morlocks cower in their caves and hovels, tapping tiny tablets in their claws, prompting their artificial gods to take this terror of meaning from their shriveled world.

Sep 11, 2023

memoQ "Auto-translation Roundup": 14 September 2023 at 15:00 CET

This week's public lecture for the "memoQuickies Resource Camp" on Thursday, September 14, 2023, at 3:00 p.m. Central European Time (2:00 p.m. Lisbon time) will be a summary of currently available auto-translation rules on the course pages which are open to everyone (enrolled or not) and those restricted to registered participants. This is all about getting to work now with stuff that is ready to go, and how to adapt that stuff for clients with different requirements.

So if you want to get right down to productive work using available memoQ auto-translation rulesets in translation, review and quality assurance without wasting time on learning bloody regex, this is for you.

Be there or be square....

A recording of the talk will be available to all registered course participants afterward.

Last week's lecture, "Auto-translation Rules for Everyone", is available here.

Oh, and next week at the same time we'll be talking about the memoQ Regex Assistant and all the cool libraries available for QA, filtering, find & replace and other tasks.....

Jun 17, 2022

memoQ Inside Out: Templates for Translators

In the summer of 2021, I was coaching a group of project managers and translators at a Portuguese service company, helping them to develop processes to overcome some rather complicated filtering and configuration challenges for recurring project types. It was clear to me that some of their difficulties could be overcome with the use of templates, but I had only recently begun to use these productively myself, and my attempts to communicate the subject matter overwhelmed the group for the most part, and examining the sample templates provided with memoQ installation simply made matters worse.

After several frustrating tutorial sessions and the failed acceptance of a template that I had developed which was tailored to a rather long wish list of automation that came out of our discussions, I decided that the only way to make the value of templates clear to this group of professionals was to wipe the slate clean, forget about all the myriad "wishes" and build a few simple templates which did just a few simple things. Starting from a new configuration with nothing at all. Surprisingly, less was indeed more, and the frustrated people began to "get it".

At almost the same time, my friend and colleague Marek Pawelec, a gifted teacher whom I often refer to quite objectively as "a consultant's consultant", mentioned that he was thinking of writing a book on memoQ templates, because he found that most people were unable to avoid the problems in the example templates provided with memoQ installation, nor were they able to work out most difficulties encountered when making their own templates. I could understand this very well, because the user interface in the configuration dialog for a template is not a stellar example of clarity, and it took me years to make proper sense of much of it. Disappointing, really, because I had been part of the chorus begging for something like templates for years, but when they were delivered, little about them made obvious sense to a dummy like me.

He sent me a chapter he had drafted, where I noted that he had adopted the same reductionist approach to getting started. A template with just a pick list or two for meta data to avoid the problem I've had for years of accidentally using different designations for the same clients, subjects, domains, etc. He had come to the same conclusion independently that the best approach to helping people use templates effectively is to start with one or two simple things they do all the time but often mess up.

That interesting draft chapter took time to evolve into a full-fledged guide of nearly 70 pages, with many practical, relatable examples of the kinds of challenges that individual translators (and many other service providers) often face in configuring translation projects. The topics cover the full range of options, from very simple tasks to extremely complex workflows involving pre-import scripts for preparing translation data and post-processing to recreate the original data formats. At every stage he offers clear examples and guidance on how to make things work in cases I have seen time and again in more than two decades of commercial translation work.

I had the pleasure to edit two drafts of this work as it neared completion. And pleasure really is the right word to use here. Marek has a very different explanatory style than mine, but one which I prefer for my own education. He manages very well the deep dive into messy details without drowning the reader in jargon and other unhelpful complexity. His guide gives valuable suggestions and information for every level of expertise. Much of the content can be understood and applied by unsophisticated new users of memoQ, but some of the details on content connectors and scripting can light a chandelier full of bulbs in the heads of alleged experts like myself.

Templates for Translators is an essential reference work for all memoQ users in my opinion, the sort of thing which ought to have been provided seven years or so ago when templates were introduced. Instead we got some imperfect examples which too often - especially in the hands of under-trained PMs at translation agencies - result in unworkable projects with 50+ translation memories and term bases grinding performance to a halt or a lot of mysterious and unwanted automation that does stupid shit like write unfinished and defective translations directly into one's master TM.

In addition to explaining clearly how to create your own helpful project shortcuts and automation from scratch, Marek included a great chapter in which he describes in detail the templates provided for local projects, what works in them and what doesn't, and how to fix any issues so things work right for you. Even if you are a server user working primarily with online projects, there is a wealth of material in this version of the templates guide to help you work more effectively with templates for online projects. A second edition is planned for later this year, which will cover the additional features of templates for memoQ Server projects, but the real problems of most people working with those are covered in the basics presented in the "translators" edition, not in a lack of guidance on the many extra event "triggers" for online projects or other details. So if you are a server user, don't wait for the later edition, get this guide now, read every damned page and try to contain your exuberance as you finally understand a lot of stuff that has been confusing the Hell out of most of us for a long time. Then when the "server edition" of the guide is published, you'll be better prepared to absorb the increment of information it offers.

This book is now a valued part of my teaching "arsenal", and I recommend it without reservation to every memoQ user who aspires to work independently and create more effective processes for the special needs of various clients and subject matter. If you are a consultant or trainer at a serious level, it could well be considered malpractice to train without some of the information you'll find in Templates for Translators. But that's just what I see too often: discussions of templates glibly use the few defective examples installed with memoQ with little consideration given to how many translators should work in the real world with real, common client projects. This book is a welcome aid to move beyond all that and improve our satisfaction with the routine of translation in memoQ.

So for less than the cost of half an hour of consulting, the €30 invested here will save nearly anyone a large multiple of that and continue to pay dividends for a very long time, even if you understand and apply only 10% of the material presented. I charge far, far more to teach people less than that.

memoQ Inside Out: Templates for Translators is available for purchase at https://payhip.com/b/agrxM

Aug 5, 2021

Workflow Wednesday: Getting started with memoQ templates

Recorded Aug 11, 2021

It has been more than seven years since memoQ introduced the use of project templates, and although the default method of project creation involves templates when the New Project icon is clicked on the Project ribbon, most users stick with the examples provided, venturing little beyond them, or they use the old Project Wizard and avoid templates altogether. It took me some years to really get my head around the use of project templates in memoQ, and the fully configured sample templates included with installation and made to specifications that were seldom aligned to my needs were not particularly helpful.

When I finally did understand how templates could revolutionize my productivity in local and online projects, I responded to help requests by some LSP consulting clients by providing fully configured templates to address all the problems they listed with the often complex needs of their high volume clients. And to my surprise, most of these configurations went unused. The project managers were simply overwhelmed. As I had been for nearly six years.

And then a colleague's request to help with a filter for a package type not included in memoQ's standard configuration opened my eyes to the importance of simplicity. I had to use a template for that particular challenge, and the template allowed easy import of GLP packages full of TXLF files and did no other special thing.

A weekend of training with project managers from a local LSP showed that this approach could clear up the confusion often caused by immediate confrontation with "kitchen sink" templates as an introduction. When the team shared their desires of "just one thing" to make their work easier and saw how simply that one thing could be accomplished, they understood the value of templates quickly and were soon able to build more sophisticated templates as their confidence grew and they dared tread just a bit farther. Step. By. Step.

So this webinar took a different approach to templates than you have probably seen so far, emphasizing simplicity and simple needs as a foundation for robust processes and automation. I had no intention of talking about all the myriad options for configuration and automation, though some of these were discussed in the Q&A. This talk is for people who are confused by templates. Who think they aren't really of any use for what they do. Or who are even scared stiff of them. So enjoy the recording (best viewed on YouTube, where you can take advantage of the time-coded table of contents).

Jan 28, 2020

Another look at Windows 10 speech recognition

A few years ago while on "holiday", I returned from dinner to find that my laptop had bluescreened. Panic time! It was Saturday night, and I still had quite a lot of text to translate and deliver on Monday morning. And up on the highest mountain in Portugal, I wasn't sure where I could find a replacement to finish the project, which was, at least, not utterly lost, because I had put it on a memoQ Cloud server for testing. The next day I got lucky: about 50 km away there was a Worten, where I picked up a gamer laptop with lots of RAM and an SSD. Well, not so lucky, as it was a Hewlett Packard Omen, with a fan prone to failure, but that's another story....

This new laptop was my first encounter with Windows 10. I had heard that this operating system offered improved speech recognition capabilities, and since I prefer to dictate my translations and downloading the 3 GB installation file for Dragon NaturallySpeaking (DNS) from my server at the office was going to take forever, I thought I would give Windows 10 speech recognition a try. I hadn't installed my CAT tool of choice yet, so I fired up Microsoft Word and began dictating. "Not bad," I thought. Then I tried it in my translation environment, and the results were a complete disaster. So I put that mess out of my mind.

Since then there have been some notable advances in speech-to-text capabilities on a number of platforms. But the best solution for my languages (German and English) with DNS became increasingly cranky thanks to neglect of the product by Nuance. Every week I read new reports of trouble with DNS in a variety of environments in which it used to perform very well. Apple's iOS 13 was a great leap forward of sorts for speech recognition and voice-controlled editing, but the new features are only available in English, and having Voice Control activated totally screws up my otherwise rather good dictation in German and Portuguese (or any other language). And don't get me started on the crappy vocabulary addition feature, which uses text entry alone with no link to actual pronunciation. Good luck with that garbage. It's not a bad solution in Hey memoQ with the additional command features added, but iOS dictation is not completely up to reasonable professional standards yet.

I probably would have given no further thought to Windows 10's speech-to-text features if it weren't for Anthony Rudd. We've corresponded a bit since I bought his excellent book on regular expressions for translators (and there's another practical guide for us coming soon from him!), and in a recent discussion he alluded to the use of Unicode with regex as a simple way of dealing with some things another colleague was struggling with. I was intrigued by this, and so for about half a day, I ran down a rabbit hole, testing Unicode subscripts and superscripts for a variety of purposes like fixing bad OCR of footnote markers and empirical formulae, autocorrecting common expressions for subscripted variables and chemical terms, including subscripts and superscripts in term bases and much more. Fascinating and useful stuff on the whole, even if some fonts don't support it well.
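The kind of repair described above can be sketched in a few lines of Python. This is only an illustration and not the method from Anthony Rudd's book: the function names are made up for this example, and so is the rule that digits directly following a letter or a closing parenthesis become subscripts. The pattern should be applied only to text already known to be a chemical formula.

```python
import re

# Translation tables from ASCII digits to Unicode subscript/superscript digits.
SUBSCRIPTS = str.maketrans("0123456789", "₀₁₂₃₄₅₆₇₈₉")
SUPERSCRIPTS = str.maketrans("0123456789", "⁰¹²³⁴⁵⁶⁷⁸⁹")

# Assumed rule for this sketch: only digits that directly follow a letter
# or a closing parenthesis are part of the formula. A leading coefficient
# such as the first 2 in "2H2O" is left untouched.
FORMULA_DIGITS = re.compile(r"(?<=[A-Za-z)])\d+")


def subscript_formula(formula: str) -> str:
    """Replace formula digits with their Unicode subscript equivalents."""
    return FORMULA_DIGITS.sub(lambda m: m.group().translate(SUBSCRIPTS), formula)


def superscript_marker(number: str) -> str:
    """Turn a plain footnote number into Unicode superscript digits."""
    return number.translate(SUPERSCRIPTS)


print(subscript_formula("Ca(NO3)2"))
print(superscript_marker("12"))
```

Deciding which digits in running text are footnote markers or formula indices is the hard part and depends on the document, so the sketch leaves that selection to the caller.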

And of course I looked at using these special Unicode characters in speech-to-text applications. DNS had some funky quirks (not allowing numbers in the "spoken" version of terms, for example), but it worked rather well, so I can now say "calcium nitrate formula" and get Ca(NO₃)₂ without much ado. And for some reason it occurred to me to give Windows 10 speech recognition a try, just because I was curious whether vocabulary could in fact be trained. Indeed it can, and that feature is better than iOS 13 or DNS by far.

But first I had to remember how to activate speech recognition for Windows on my laptop again. When in doubt, type what you're looking for in the search box....

Notice I've pinned Windows Speech Recognition to my taskbar on the right, which is good for quick tasks.

Gesucht, gefunden. Unlike other speech recognition solutions, the one in Windows 10 works only for the language set for the operating system. And options there are limited to English (United States, United Kingdom, Canada, India, and Australia), French, German, Japanese, Mandarin (Chinese Simplified and Chinese Traditional) and Spanish.

I put on my trusty Plantronics earset (the best microphone I've used for dictation tasks or audio in my occasional webinars in the past year) and began to dictate, first in Microsoft Word, which had shown acceptable results in my tests long ago. I found that adding vocabulary in the Speech Dictionary (accessed via the context menu in the dictation control element shown as a graphic at the top of this post) was dead simple.

The option to record pronunciation enabled me to record non-English names and words in several languages. And sure enough, the Unicode subscripts and superscripts worked, so I can now say CO₂ (I just dictated that) to my heart's content.

I was expecting a mess when I tried to use Windows 10 speech-to-text in a CAT tool, but it was not to be. It was brilliant, actually. I tried it in my copy of SDL Trados Studio, and with the scratchpad disabled so I could dictate directly into the target it worked well. No voice-controlled editing like I'm used to with DNS in memoQ, but that DNS feature does not work in SDL Trados Studio anyway, so this is no worse. But with the scratchpad box enabled (see the screenshot below), I could use voice commands to select and correct text or perform other operations. Brilliant!

After clicking or speaking "Insert", the text will be written to the target field with the proper formatting

So users of SDL Trados Studio who translate to a target language supported by Windows 10 speech recognition are probably better off not giving their money to Nuance, which I'm told can't even be bothered to make a 64-bit version of DNS now (which probably accounts for a lot of the trouble people have with that program).

I tested Wordfast Pro 5, which seems to confuse the speech recognition tool horribly, with source text displayed in the floating bar for some odd reason. But my earlier tests of Wordfast with DNS were equally unhappy, so somehow I'm not surprised. And I didn't test the Memsource desktop editor, which took the prize a few years ago for the worst-ever DNS dictation results with a CAT tool. I'll leave that to someone with a much wider masochistic streak.

But what about memoQ, my personal environment of choice for most translation work? Equally brilliant, works just the same as SDL Trados Studio. No voice control for editing without the dictation scratchpad enabled (there, DNS has an advantage in memoQ), but with the scratchpad you can use the voice commands to edit before inserting in the target text field.

Wanna see this in action? Have a look at this short demo video:

I hope that the future will bring us more language support for Windows 10 dictation (Portuguese, Russian and Arabic, please!) and that other providers (like Google, if you're listening, and Apple, which never listens to anyone anymore except to spy on them with Siri) will expand the speech-to-text features offered, particularly to include sound-linked vocabulary training and better adaptation to individual users' speech. Five years ago when I began to investigate alternatives for non-DNS languages, I expected we would have more by now, and we do, but professional needs require all providers to raise their game.

Addendum: Someone asked me if Windows Speech Recognition is a cloud resource or a locally installed one which will work without an Internet connection. It's definitely the latter. So if you have lousy bandwidth or find yourself disconnected from the Internet, you can still use speech-to-text features.

And more: I use a lot of spoken commands for keyboard shortcuts when I work, so I did a little research and testing. It seems that Windows 10 speech recognition gives full access to an application's keyboard shortcuts via voice. So in memoQ, for example, I can dictate the insertion of tags, items from the Translation Results pane and a lot more. Watch out, Nuance. Windows 10 is going to kick your Dragon's scaly butt!

Oct 29, 2019

Bilingual EU legislation the easy way in #xl8

Translators of European languages, whether based in the EU or elsewhere, often deal with citations of EU legislation or need to consult relevant EU legislation for terminology in their translations. One popular source of information for that is the EUR-LEX website, which provides a convenient archive of legislation and related information, with the possibility of multilingual text displays, as seen here:

Some years ago, I published a description of how data from these multilingual EUR-LEX displays can be transferred to translation memories or other corpora for reference purposes, and more recently I produced a video showing this same procedure. But some people don't like the paragraph-level alignment format of the EUR-LEX displays, and these can also occasionally be seriously out of sync for some reason, as in this example (or worse):

Now I don't find that much of a nuisance when I use memoQ LiveDocs, because I can simply view the full bilingual document context and see where the corresponding information really is (kind of like leaving alignments in memoQ uncorrected until you actually find a use for the data and determine that the effort is worthwhile), but if you plan to feed that aligned data to a translation memory, it's a bit of a disaster. And many people prefer data aligned at the sentence level anyway.

Well, there is a simple way to get the EU legislation texts you want, aligned at the sentence level, with the individual bitexts ready to import into a translation memory, LiveDocs corpus or other reference tool. See that document number above with the large red arrow pointing to it? That's where you start....

Did you know that much of the information available in EUR-LEX is also available in the publicly available DGT translation memories? These are sentence-level alignments. But most people go about using this data in a rather klutzy and unhelpful way. The "big data" craze some years ago had a lot of people trying to load this information into translation memories and other places, usually with miserable results. These include:

- the inability to load such enormous data quantities in a CAT tool's TM without having far more computer RAM than most translators ever think they'll need;
- very slow imports, some apparently proceeding on a geological time scale;
- data overload - so many concordance hits that users simply can't find the focused information they need; and
- system performance degradation, with extremely sluggish responses in a wide variety of tasks.

Bulk data is for monkeys and those who haven't evolved professionally much beyond that stage. Precision data selection makes more sense, and enables better use of the resources available. But how can you achieve that precision? If I want the full bilingual text of EU Regulation No. 575/2013 in some language pair, for example, with sentence-level alignment, how can I find that quickly in the vast swamp of DGT data?

Years ago, I published an article describing how it is better to load the individual TMX files found in the downloadable ZIP archives from the DGT into LiveDocs so that the full document context can be seen from the concordance searches. What I didn't mention in that article is that the names of those individual TMX files correspond to the document numbers in EUR-LEX.

Armed with that knowledge, you can be very selective in what data and how much you load from the DGT collection. For example, if you organize the data releases in folders by year...

... and simply unpack the ZIP files in each year's folder...

... each folder will contain TMX files...

... the names of which correspond to the document number found in EUR-LEX. So a quick search in Windows Explorer or by other means can locate the exact document you want as a TMX file ready to import into your CAT tool:
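For those who prefer "other means" to Windows Explorer, the search can be sketched as a short Python script. The function name, the folder layout and the example document number are assumptions made for illustration. The only thing taken from the procedure above is that the TMX file name contains the EUR-LEX document number.

```python
from pathlib import Path


def find_dgt_tmx(root, document_number):
    """Return all TMX files under root whose name contains the document number."""
    root = Path(root)
    if not root.is_dir():
        return []
    needle = document_number.lower()
    return sorted(
        path
        for path in root.rglob("*")
        if path.is_file()
        and path.suffix.lower() == ".tmx"
        and needle in path.stem.lower()
    )


# Hypothetical usage: a folder "DGT-TM" with one subfolder per release year,
# searched for a document number copied from the EUR-LEX page.
# for match in find_dgt_tmx("DGT-TM", "32013R0575"):
#     print(match)
```

The match is case-insensitive and looks for the number anywhere in the file name, so it still works if the files carry a prefix or suffix around the number.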

These TMX files typically contain 24 EU languages now, but most CAT tools will filter just the language pair you want. So the same file can usually give you Polish+French, German+English, Portuguese+Greek or whatever combination you need among the languages present.
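If a tool does not filter the language pair for you, the filtering itself is not complicated. The following is a minimal sketch using only the Python standard library. The function name is hypothetical, and it assumes that language codes sit in either an `xml:lang` or a plain `lang` attribute on each `tuv` element and that matching on the primary subtag (for example `EN` from `EN-GB`) is good enough.

```python
import xml.etree.ElementTree as ET

XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def extract_pair(tmx_data, source_lang, target_lang):
    """Return (source, target) segment pairs for two languages from TMX data."""
    source_lang, target_lang = source_lang.upper(), target_lang.upper()
    root = ET.fromstring(tmx_data)
    pairs = []
    for tu in root.iter("tu"):
        segments = {}
        for tuv in tu.findall("tuv"):
            # Check both attribute spellings, since TMX files differ here.
            code = (tuv.get(XML_LANG) or tuv.get("lang") or "").upper()
            seg = tuv.find("seg")
            if seg is not None:
                # Keep only the primary subtag and flatten inline markup.
                segments[code.split("-")[0]] = "".join(seg.itertext())
        if source_lang in segments and target_lang in segments:
            pairs.append((segments[source_lang], segments[target_lang]))
    return pairs


# Hypothetical usage with a file found in the DGT collection:
# from pathlib import Path
# pairs = extract_pair(Path("32013R0575.tmx").read_bytes(), "DE", "EN")
```

Passing the raw bytes of the file lets the parser honor whatever encoding the TMX file declares. Translation units that lack one of the two languages are simply skipped.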

I still prefer to import my TMX data into a LiveDocs corpus in memoQ, and there I can use the feature to import a folder structure, and in the import dialog, I simply write the name of the file I want, and all other files (thousands of them) are promptly excluded:

After I enter the file name in the Include files field, I click the Update button to refresh the view and confirm that only the file I want has been selected. Depending on where in memoQ you do the import, you may have to specify the languages (Resource Console) to extract or not (in a project, where the languages are already set). Of course, the data can also be imported to a translation memory in memoQ, but that is an inferior option, because then it is not possible to read the reference document in a bilingual view as you can in a LiveDocs corpus; only isolated segments can be viewed in the Concordance or Translation results pane.

How you work with these data and with what tools is up to you, but this procedure will provide you with a number of options for better data selection and improved access to the reference data you may need for EU legislation without getting stuck in the morass of millions of translation units in a performance-killing megabomb TM.

Aug 1, 2019

memoQ Ergonomics Webinar on August 14th

On Wednesday, August 14 at 16:00 Central European Time, I will be giving a talk on working ergonomics in memoQ, drawing on the outline of an online course to be released later this month. You can register now HERE. The webinar will be held in English and is available to all interested parties free of charge. The recording will be available later to participants with the course materials.

This discussion will highlight key concepts and approaches from the course outline shown below. memoQ version 9 will be used as the basis for discussion, but most of the talk's content is applicable to any version from recent years.

Working Ergonomics in memoQ 9.0: Technology Practice for Ease of Use

Getting Laid Out
- Standard memoQ Layouts

This can be improved on....

- More Fun with memoQ Working Layouts!

Colors, Visibility and Priority
- Color My Grid!
- Fonts in the Working Display

Wild & crazy? Or legible? You decide!

- Translation Results List Tuning

See the match results in the order you prefer!

- Setting the user interface language

- Showing hidden characters

Tuning Options for Typists

- Autocorrect
- Lookup & insertion
- Autopropagation and Its Implications
- Predictive Typing
- Keyboard Customizing!

The Great Dictators
- Hey memoQ

- Chrome Speech

- Dragon NaturallySpeaking

- Other Speech Tools and the 3-Stage Process

Other Views of Translation
- Combining and Filtering Files for Translation

- What's in a Translation Files List?

- Making Sense of memoQ QA Results

Jul 21, 2019

"Faulty" memoQ light resource defaults and how to change them

So often in the decade since I began using memoQ, I've felt an undercurrent of irritation at some of the default settings for certain types of resources, and with the need to switch these resources manually in so many projects. But with so many other pressing matters, I didn't really focus on the problem until a participant in last week's summer school course at Universidade Nova in Lisbon expressed the same irritation with regard to the default QA settings.

QA settings are probably the most familiar irritants to many memoQ users. Some have declared memoQ QA to be "unusable" because of many false positives or a failure to check certain things, and these opinions are usually unfounded and reflect a poor understanding of the available options and how to use them. But even those of us who do know how to use them get caught out by forgetting to change the QA settings to our favorite profiles on many occasions.

No more. If, for example, you want to change the memoQ QA default settings, it is very easy to configure them to match your preferred profile. I started off by cloning my empty QA profile, a template file that I maintain in which no checks at all are enabled. This is the starting point I use for custom QA rule sets in memoQ.

I then edited the renamed file and configured it with the terminology, auto-translation rule and tag settings I prefer in routine cases, leaving many of the usual, irritating memoQ QA defaults disabled. Then I exported the MQRES file as a backup and opened it in a text editor.

There I copied all of the text starting with the XML declaration (skipping the MemoQResource header), and I looked (for example, in the Resource Console, though the Options and Project Settings would do as well) to see where the default resource was located:

Then I went to the file location...

... opened the file, and pasted the copied text of my desired settings into the file:

Default resources (as well as already-imported light resources) do not use headers. I saved the new default file and closed it. Then I started memoQ again, made a copy of the Default file for QA settings and used the editor to examine its contents, which matched those I had pasted into the file for the QA option defaults.
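The copy-and-paste steps above could also be scripted. What follows is a rough sketch only: the function names and file paths are hypothetical, the location of the default file has to be looked up in memoQ as described, and the script assumes that the exported MQRES file is stored in an ASCII-compatible encoding such as UTF-8 so that the XML declaration can be found as plain bytes.

```python
import shutil
from pathlib import Path


def strip_mqres_header(mqres_bytes):
    """Return the data starting at the XML declaration, dropping the header."""
    start = mqres_bytes.find(b"<?xml")
    if start == -1:
        raise ValueError("No XML declaration found in the exported resource.")
    return mqres_bytes[start:]


def replace_default(exported_mqres, default_file):
    """Back up default_file, then overwrite it with the header-free export."""
    exported_mqres, default_file = Path(exported_mqres), Path(default_file)
    body = strip_mqres_header(exported_mqres.read_bytes())
    backup = default_file.with_name(default_file.name + ".bak")
    shutil.copy2(default_file, backup)  # keep the original defaults
    default_file.write_bytes(body)
    return backup


# Hypothetical usage; both paths are placeholders:
# replace_default("MyQA.mqres", r"C:\path\to\default\QA\settings\file")
```

Working on bytes avoids changing line endings or a byte order mark, and the backup copy makes it easy to restore the original defaults if memoQ does not accept the new file.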

I'm not sure (yet) whether these defaults will be replaced when bugfixed builds or new versions are installed, so I am keeping my exported custom QA configuration as a backup in case I need to do this again. And I will be looking at other light resources which might benefit from this approach.

I had originally considered setting my empty QA profile (with nothing set) as the default until I realized that this could lead to a false impression that nothing of interest was wrong if that default profile is accidentally chosen (or not deselected, rather) for a quality check. Then I realized that the best default to use would, of course, be the settings I use most frequently.

One objection to this procedure raised on social media is that one can set the default for new projects in the memoQ options under "default resources". However, this does nothing for projects which already exist. In these, the change would have to be made project-by-project. And for users who tend to re-use projects for a particular client rather than use the powerful, but somewhat confusing project templates feature, this is a real time-consuming nuisance. Changing the installation-level defaults automatically changes how QA is done in all the existing projects that use "default" QA options.

Jan 6, 2019

A voice-activated recorder for iOS

Some years ago I picked up an Olympus hand-held digital recorder which served me well for some translation work as well as for evidentiary purposes during assaults by a drunken psychotic. I also used it on many occasions to record ideas for projects and other things while out and about.

But juggling several electronic devices at once has never been easy for me, and I kept losing the little recorder in jacket pockets, suitcases, desk drawers, etc. I still have it, but it's been some weeks since I could tell you where.

I use the Voice Memos app on my iPhone, but for me its continuous recording feature is inconvenient, and I don't like pausing and resuming the recording frequently. It's too distracting. The Olympus device and an old tape recorder I used decades ago both had a convenient voice activation feature which avoided excessive dead space.

Being the technosaurus that I am, it took a long time to realize that there is probably an app for that these days. And indeed there is. Several in fact. I downloaded the iOS app depicted above, and in the days ahead I'll be using it for a few translation projects to evaluate what I consider to be an improved version of the three-step translation workflow I demonstrated a few years ago in a remote conference lecture in Buenos Aires using the now-defunct Dragon Dictation app from Nuance. Stay tuned.

Dec 29, 2018

memoQ Terminology Extraction and Management

Recent versions of memoQ (8.4+) have seen quite a few improvements in recording and managing significant terminology in translation and review projects. These include:

Easier inclusion of context examples for use (though this means that term information like source should be placed in the definition field so it is not accidentally lost)
Microsoft Excel import/export capabilities which include forbidden terminology marking with red text - very handy for term review workflows with colleagues and clients!
Improved stopword list management generally, and the inclusion of new basic stopword lists for Spanish, Hungarian, Portuguese and Russian
Prefix merging and hiding for extracted terms
Improved features for graphics in term entries - more formats and better portability

Since the introduction of direct keyboard shortcuts for writing to the first nine ranked term bases in a memoQ project (as part of the keyboard shortcuts overhaul in version 7.8), memoQ has offered perhaps the most powerful and flexible integrated term management capabilities of any translation environment despite some persistent shortcomings in its somewhat dated and rigid term model. But although I appreciate the ability of some other tools to create customized data structures that may better reflect sophisticated needs, nothing I have seen beats the ease of use and simple power of memoQ-managed terminology in practical, everyday project use.

An important part of that use throughout my nearly two decades of activity as a commercial translator has been the ability to examine collections of documents - including but not limited to those I am supposed to translate - to identify significant subject matter terminology in order to clarify these expressions with clients or coordinate their consistent translations with members of a project team. The introduction of the terminology extraction features in memoQ version 5 long ago was a significant boost to my personal productivity, but that prototype module remained unimproved for quite a long time, posing significant usability barriers for the average user.

Within the past year, those barriers have largely fallen, though sometimes in ways that may not be immediately obvious. And now practical examples to make the exploration of terminology more accessible to everyone have good ground in which to take root. So in two recent webinars, I shared my approach - in German and in English - to how I apply terminology extraction in various client projects or to assist colleagues. The German talk included some of the general advice on term management in memoQ which I shared in my talk last spring, Getting on Better Terms with memoQ. That talk included a discussion of term extraction (aka "term mining"), but more details are available here:

Due to unforeseen circumstances, I didn't make it to the office (where my notes were) to deliver the talk, so I forgot to show the convenience of access to the memoQ concordance search of translation memories and LiveDocs corpora during term extraction, which often greatly facilitates the identification of possible translations for a term candidate in an extraction session. This was covered in the German talk.

All my recent webinar recordings - and shorter videos, like playing multiple term bases in memoQ to best advantage - are best viewed directly on YouTube rather than in the embedded frames on my blog pages. This is because all of them since earlier in 2018 include time indexes that make it easier to navigate the content and review specific points rather than listen to long stretches of video and search for a long time to find some little thing. This is really quite a simple thing to do, as I pointed out in a blog post earlier this year, and it's really a shame that more of the often useful video content produced by individuals, associations and commercial companies to help translators is not indexed this way to make it more useful for learning.

There is still work to be done to improve term management and extraction in memoQ, of course. Some low-hanging fruit here might be expanded access to the memoQ web search feature in the term extraction as well as in other modules; this need can, of course, be covered very well by excellent third-party tools such as Michael Farrell's IntelliWebSearch. And the memoQ Concordance search is long overdue for an overhaul to allow proper filtering of concordance hits (by source, metadata, etc.), more targeted exploration of collocation proximities and more. But my observations of the progress made by the memoQ planning and development team in the past year give me confidence that many good things are ahead, and perhaps not so far away.

Dec 7, 2018

Integrated iOS speech recognition in memoQ 8.7

Today, memoQ Translation Technologies (the artists formerly known as "Kilgray") officially released their iOS dictation app along with memoQ version 8.7, making that popular translation environment tool the first on the desktop to offer free integrated speech recognition and control.

My initial tests of the release version are encouraging. Some bugs with capitalization which I identified with the beta test haven't been fixed yet, and some special characters which work fine in the iOS Notes app don't work at all, but on the whole it's a rather good start. The control commands implemented for memoQ work far better than I expected at this stage. I've got a very boring, clumsy (and unlisted) video of my initial function tests here if anyone cares to look.

Before long, I'll release a few command cheat sheets I've compiled for English (update: it's HERE), German and Portuguese, which show which iOS dictation functions are implemented so far in Hey memoQ and which don't perform as expected. There are no comprehensive lists of these commands, and even the ones that claim to cover everything have gaps and errors, which one can only sort out by trial and error. This isn't an issue with the memoQ development team for the most part, but rather with Apple's chaotic documentation.

The initial release only has a full set of commands implemented in English. Those who want to use control commands for navigating, selecting, inserting, etc. will have to enter their own localized commands for now, and this too involves some trial and error to come up with a good working set. And I hope that before long the development team will implement the language-specific command sets as shareable light resources. That will make it much easier to get all the available languages sorted out properly for productive work.

I am very happy with what I see at the start. Here are a few highlights of the current state of Hey memoQ dictation:

Bilingual dictation, with source language dictation active when the cursor is on the source side and target language dictation active when the cursor is on the target side. Switching languages in my usual dictation tool - Dragon NaturallySpeaking - is a total pain in the butt.
No trainable vocabulary at present (an iOS API limitation), but this is balanced in a useful way by commands like "insert first" through "insert ninth", which enable direct insertion of the first nine items in the Translation Results pane. Thus if you maintain good termbases, the "no train" pain is minimized. And you can always work in "mixed mode" as I usually do, typing what is not convenient to speak and using keyboard shortcuts for commands not yet supported by voice control, like tag insertion.
Microphones connected (physically or via Bluetooth) with the iPhone or iPad work well if you don't want to use the integrated microphone in the iOS device. My Apple earphones worked great in a brief test.

Some users are a bit miffed that they can't work directly with microphones connected to the computer or with Android devices, but at the present time, the iOS dictation API is the best option for the development team to explore integrated speech functions which include program control. That won't work with Chrome speech recognition, for example. As other APIs improve, we can probably expect some new options for memoQ dictation.

Moreover, with the release of iOS 12, I think many older devices (which are cheap on eBay or probably free from friends who don't use them) are now viable tools for Hey memoQ dictation. (Update: I found a list of iPhone and iPad devices compatible with iOS 12 here.)

Just for fun, I tested whether Hey memoQ and Dragon NaturallySpeaking interfere with one another. They don't, it seems. I switched back and forth from one to the other with no trouble. During the app's beta phase, I did not expect that I would take Hey memoQ as a serious alternative to DNS for English dictation, but with the current set of commands implemented, I can already work with greater comfort than expected, and I may in fact use this free tool quite a bit. And I think my friends working into Portuguese, Russian and other languages not supported by DNS will find Hey memoQ a better option than other dictation solutions I've seen so far.

This is just the beginning. But it's a damned good start really, and I expect very good things ahead from memoQ's development team. And I'm sure that, once again, SDL and others will follow the leader :-)

And last, but not least, here's an update to show how to connect the Hey memoQ app on your iOS device to memoQ 8.7+ on your computer to get started with dictation in translation:

Jul 19, 2018

On track with changes in memoQ

After a two-year break I decided to attend memoQ Fest again this year. I had burned out a bit on memoQ as a working environment in 2015 because of the slow resolution of many problems associated with the transition from traditional application menus to the awful, space-stealing Microsoft-style ribbons... bugs remained unresolved for so long in fact that I released my last book (New Beginnings with memoQ) with a "beta" designation and then months later simply withdrew it from the market because it seemed that no fixes were in sight. And then at the beginning of 2017 as things with memoQ 2015 (7.8) had stabilized to an acceptable degree, memoQ 8.0 (Adriatic) was released with a dog's breakfast of foolish interface changes and new bugs. The two minor releases that followed (8.1 and 8.2) deepened the muck, not the least by introducing new and very broken tracked changes features. Dark days indeed in which experienced users generally stuck desperately to the only really reliable release of memoQ that Kilgray still supported (7.8).

And then last autumn there was a perceptible inflection point in the memoQ development trajectory. The 8.3 release still had the awful new match comparison and other confusions, but it was unusually stable for a new release. And it dealt with some long-standing issues while introducing some improved term-handling features. Improvements in terminology handling continued in the version that followed, and stability increased with each minor release - a phenomenon nearly unheard of with any CAT tool I know. Whether it's memoQ, SDL Trados Studio or another tool, new releases are usually painful experiences with a lot of bug bites, but Kilgray seems to have discovered some new secret of pest control in software development. With memoQ 8.4 and 8.5 I have actually been able to recommend upgrades very near the release date, which past experience taught me is usually not a professionally wise thing to do. Whatever the memoQ team is doing now, I hope they keep it up, because this is the kind of stability and reliability that is needed in critical business applications and which is seldom delivered on schedule by any company.

Adding to that, the latest version of memoQ has restored the old (and better) match comparison feature and other aspects of productivity that went missing when the 8.x series first hit the street. I still have my usual long wish list of improvements and features, but at the moment the only big objection I still have to memoQ's feature set is the limitations of its image transcription module for handling graphics with a lot of text. Something closer to Fluency's approach to transcription would be helpful.

With all the encouraging things happening with the translation environment I depend on, I decided to go to Budapest this year and meet the new team members who were contributing to many of these improvements and to catch up with old friends and colleagues who have contributed so much to improving my productivity and professional satisfaction for many years. memoQ Fest is inevitably a good event no matter how good or bad its published program sounds; the gathering of talent, brains and outright decent people in the memoQ community means that it's probably impossible not to have a good time and learn a lot of useful things in the sessions, breaks and after-hours events. This year's 10th anniversary of memoQ Fest exceeded all my expectations in every aspect.

I was extremely pleased to see the progress made with the integrated speech recognition feature suggested three years ago after investigations begun by David Hardisty and others, and to hear the new design and education teams talk about approaches to future development, support and training. I'm even a little excited, which is an unusual thing given the cynicism that nearly half a century of working with software has embedded in my brain.

There are so many interesting things happening now in the memoQ world: not just speech recognition, but major ongoing improvements to terminology handling, subtitling, easier QA control... the list goes on, and the importance of any item on it will depend, of course, on the nature of an individual translator's or other service provider's work and clientele. But, really, it's not about the features. It's the people. And I think that the combined quality of the team behind the creation and support of memoQ software and the superb, mutually supportive professionals in the user community remains unbeatable for a secure professional present and a promising future in #xl8.

Lots of tasty tidbits in the latest memoQ!

Jun 25, 2018

OmegaT: free CAT tool, free webinar

Click this graphic for more information and registration....

Didier Briel, current project manager of the Open Source OmegaT CAT tool, will discuss what makes this language service community resource unique, how it can enable you to work together comfortably in teams with others who use different tools (interoperability) and other interesting matters.

Have a look and see if this is the versatile, multi-platform tool you've been looking for!

Jun 3, 2018

The Great Dictator in Translation.

I have no need for words. memoQ will have that covered in quite a few languages.

This is not your grandfather's memoQ!

Dec 1, 2017

Appearances matter in your CAT tool.

This morning on the social media memoQ forum of Putin's Western election influence platform, a user expressed their [1] woes while editing a translation with a lot of italic formatting in the text:

I assume they were facing something like this:

The problem is not so much the italic text per se, but the user's choice of display font for working. I have the same problem sometimes. Usually what I do is switch the text in the translation and editing grid of memoQ to a monospace font like Courier. I also keep the non-printing characters visible (and do so in Microsoft Word as well) so that I can pick up formatting problems like extra spaces, for example, or optional hyphens (which tend to produce "rogue tags" in a CAT tool, making term matching and search functions useless in most cases and complicating the work in other ways too).

The font here is Courier New

This approach can be used with most CAT tools which have their own editor. In recent versions of memoQ, this is done via the Quick Access Toolbar under Options > Appearance:

Changing the font in CAT tools can not only improve the visibility of the text you work on; it can also be an aid in editing. Many studies and articles have been published on how to improve accuracy when editing or proofreading one's work, with tips such as these:

12. Font size

Change the font size to one you can work with easily on your screen.

13. Font

We all have our favorite fonts. For proofreading change the font to the one that allows you to find mistakes most easily. Start by trying Times New Roman, Arial and Calibri and work through other fonts until you find one you like.

It would, of course, be nice to have a one-click changeover possible; I don't think any tool offers that at present. Considering that everyone faces this issue, software providers should think about that.

A font like the one above isn't everyone's cup of tea, but it would slow me down and make me notice careless typing errors better when I read through a text before delivery.

[1] This post is gender-neutered to provide a safe space for today's university students to pick up editing tips.
