
Jan 14, 2019

Specialist terminology taxonomies from Cologne Technical University

Click and thou shalt go there!

Early in the last decade when I lived near Düsseldorf and began translating full time, the nearby technical university in Cologne had an excellent terminology studies program run by Prof. Klaus-Dirk Schmitz, who also had a long history in Saarbrücken back in my exchange student days there. I had the pleasure of meeting this gentleman at various professional events for Passolo (before it was swallowed by SDL) or other occasions, and I remain impressed by the professional qualities of some of the colleagues he helped to educate. At some point he or one of his students pointed me to an interesting online collection of specialist terminologies created by students at the university as part of their degree work. While student work must be viewed carefully, on the whole I found these collections to be of better quality than quite a few put together by "professionals", and their structured taxonomies were also interesting to people like me who enjoy such things. And occasionally the terminologies were rather helpful for certain technical topics I translate.

But over the years I simply forgot about them for the most part, and when they did come to mind I assumed that the old MultiTerm engine used to handle the data on the site would no longer work. That latter assumption may be partly correct; I found the collection again, noted that the most recent addition to the term library was a bit over a decade ago and that the search functions don't seem to work with Chrome, though I am able to browse the structured taxonomies without difficulty.


Looking through the list of term collections, I saw one that would be particularly useful for a current personal effort: beekeeping. One of my projects for the year ahead is to add some hives to the garden to see if I can improve some of the vegetable, fruit and nut yields. A local Portuguese beekeeper and I have been trading poultry, and he kindly provided me with a copy of his thesis on apiculture and offered assistance to get me started. So I am reading up on the subject in several languages, thinking to put together a good terminology to make cross-referencing the concepts between English, German and Portuguese a little easier.

One thing I never tried to do before was to extract data from the FH Köln (Cologne Technical University) site into any sort of terminology management tool. I don't think they were ever intended to be used that way, and at the time most of the collections were put together, translation environment tools were much less widely used by professionals and university study programs than they are today. But after a little thought and experimentation, mining the pages proved to be quite simple.

Here's how I did it:
  1. Opened a collection of interest and expanded the folder tree for a particular language completely, then selected and copied all the text in that frame.
  2. Pasted the copied content as plain text (no formatting) into Microsoft Word. The numerical codes were followed directly by the text entries.
  3. Removed parentheses by searching and replacing with nothing.
  4. Inserted a tab between each number code and the text entry that follows it using search and replace with wildcards (regex of a sort).
  5. Switched to the other language in the term collection and repeated steps 1 through 4.
  6. Transferred the contents to Excel (various ways to do this).
  7. Imported the Excel file with the specialist terms into a term base in my translation environment tool of choice.
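For anyone who would rather script the clean-up than click through Word and Excel, the steps above can be sketched roughly as follows in Python. This is only a sketch under assumptions: I assume each pasted line consists of a dotted numeric code followed directly by the term (something like "1.2Bienenstock"), that both languages share the same codes, and that a tab-separated file is acceptable for import. The function names and the column header are my own inventions for illustration; the real pages may well need a different pattern.

```python
import csv
import re

# Assumed line shape: a dotted numeric code followed directly by the term,
# e.g. "1.2Bienenstock". Adjust the pattern if the pasted text looks different.
CODE_THEN_TERM = re.compile(r"^\s*(\d+(?:\.\d+)*)\.?\s*(.+?)\s*$")


def parse_taxonomy(text):
    """Map each numeric code to its term for one language."""
    entries = {}
    for line in text.splitlines():
        # Step 3: remove parentheses by replacing them with nothing.
        line = line.replace("(", "").replace(")", "")
        # Step 4: split the code from the term that follows it.
        match = CODE_THEN_TERM.match(line)
        if match:
            code, term = match.groups()
            entries[code] = term
    return entries


def merge_languages(first, second):
    """Pair terms that share a code; codes missing on either side are skipped."""
    return [(code, first[code], second[code]) for code in first if code in second]


def write_tsv(rows, path, header=("code", "de", "en")):
    """Write the merged rows as a tab-separated file (header names are hypothetical)."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle, delimiter="\t")
        writer.writerow(header)
        writer.writerows(rows)
```

Pairing by code rather than by line position means that an entry present in only one language is simply skipped instead of shifting every later row out of alignment.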




Jan 13, 2019

A second look at Wordfast Pro


The generally good impression made by Wordfast Anywhere in my recent tests inspired me to take a new look at the premium environment for freelance translators: Wordfast Pro 5. A lot has changed with Wordfast Pro since its early days, and much of what I found troublesome with early versions has been corrected. A new look has also been on my agenda for a while since I realized that two new formats were introduced (TXLF, an XLIFF format, and GLP, a zipped project package format for Wordfast), which can be handled by my usual translation environment but (currently) with a few extra steps required compared to the old TXML format.

The installation took about a minute and started off with a good impression from the warning about cloud drive synchronization:


I've seen a number of people come to grief with other tools when their projects, translation memories or other resources are stored in Dropbox or similar configurations so they can be shared by installations on different computers, and I appreciate Wordfast's attempt to warn people off from this dodgy practice. If you want to share resources, play it safe and stick them in Wordfast Anywhere.

At first, the program is in demo mode, which limits translation memories to 500 translation units (TUs) and does not allow access to remote resources such as Wordfast Anywhere. Fortunately, there is a fully functional 30-day trial available, and it took all of about two minutes to fill out the simple request form, receive the mail with the trial license key and activate it in Wordfast Pro 5.

I was really enthusiastic about the clean, uncluttered feel of the interface. There's a lot more functionality in SDL Trados Studio or memoQ, but all the myriad features of those environments can be intimidating to some, and even for experienced users navigation can be confusing at times to locate some obscure setting or feature. Not in Wordfast Pro 5: the features mostly aren't there, and what is there can be found without much ado. Given the limited scope of mastery and inclination to learn on the part of many hamsters running on the freelance translation wheel, this can be a definite advantage.

On the Help ribbon I saw a Feedback icon. I don't know why, but this inspired a weird enthusiasm in me, so I clicked it, and when the dialog appeared, I wrote a quick note to the development team to say what a great impression the new user interface was making before I had even started to do anything useful with it. I noticed that the feedback dialog also had options to include files and projects in case of a problem, which I also thought was really cool. Something like that in other tools would be very helpful to their users and probably encourage more suggestions and interaction.

It was really easy to navigate through the ribbon menus and explore the configuration options. I was pleased to see that different sets of keyboard shortcuts were available to make the ergonomics easier for users of some other tools.


But SDLX? Huh? That's kind of Jurassic. No memoQ shortcuts, but no problem. I can customize, right? Yes... but I soon discovered that I apparently had no way to save my customized shortcuts as "memoQ style" or whatever else I might want to call them. And then I noticed that I probably can't save the configuration to move it onto a second computer where the terms of the license agreement allow private individuals to install another copy. And, hmmmm, no option to print a cheat sheet I can refer to as I learn the keyboard shortcuts. memoQ users are kind of spoiled on both counts, I guess.

One thing I was very eager to try was the connection to my translation memories and glossaries in my Wordfast Anywhere account. That proved to be quite straightforward: it worked exactly as the clear instructions of the Wordfast Pro Help described the process.

So I was ready to try out some translation, maybe a little dictation with Dragon NaturallySpeaking. I imported a little text file to get started:


WTF??? Now I know what the problem is here, but importing the same file to Wordfast Anywhere gives this result:


And in memoQ:


The import with the simple text filter of Wordfast Pro 5 (version 5.7) does not map the characters correctly. I had to change the source text file from ANSI to UTF-8: not a big deal for me, but a lot of translators I know will be over their heads right there.
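For colleagues who would rather not dig through editor settings, the re-encoding can also be scripted. Here is a minimal sketch in Python, assuming that "ANSI" means Windows code page 1252 (the usual Western European default; a file from another system may use a different code page). The function name and parameters are hypothetical.

```python
from pathlib import Path


def convert_to_utf8(source, target, source_encoding="cp1252"):
    """Re-encode a text file as UTF-8.

    source_encoding is an assumption: cp1252 is what Windows commonly
    labels "ANSI" on Western European systems.
    """
    # Work on raw bytes so that the original line endings are preserved.
    text = Path(source).read_bytes().decode(source_encoding)
    Path(target).write_bytes(text.encode("utf-8"))
    return text
```

Writing to a separate target file leaves the original untouched in case the guessed code page turns out to be wrong.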

The choice of import filters available is fairly good as one might expect from most professional translation environments these days, but two important things were missing for me. There seems to be no option to cascade filters, useful for example if you have a Microsoft Excel file containing HTML text to translate, and there is also no facility for configuring custom regex-based text filters or tagging text content which needs protection (such as placeholder text). This won't be an issue for a lot of translators, but for those who deal with challenging, often unexpected formatting issues in customers' files it could be a real pain in the neck.

On to dictation... Dragon NaturallySpeaking (DNS) seemed to perform well. I had to turn off the DNS dictation box by unmarking the checkbox in its dialog. Text was then transcribed well into the target field, and my spoken keyboard shortcut to confirm a segment and go on to the next one worked perfectly. Then I misspoke and used a spoken editing command to correct my error. Nothing happened. I tried several different spoken selection and editing commands that I use every day in memoQ. Nothing worked. Shit. What we have here is a failure of compatibility. The full potential of Dragon NaturallySpeaking cannot be used in Wordfast Pro 5.

I explored the settings further... quality assurance. That looked pretty good; the options were easy to understand and I could set them as I wanted to check my work. But the QA settings I need vary in many projects, and sometimes I want to do a QA check on just one aspect like tags or maybe terminology. Wordfast Pro 5 offered no facility to save a QA configuration or profile and load it as one might do in SDL Trados Studio or memoQ. This too would be a deal-breaker for me, alas. I depend on a full hand of memoQ quality assurance profiles for selective checking of important quality parameters in my jobs. Toggling settings back and forth in Wordfast would drive me nuts. Still, this wouldn't disturb many CAT tool users who can barely be bothered to run a spelling check on their work, much less run a check for missing or mismatched tags.

In contrast to my conclusions years ago, I can now say that Wordfast Pro is "ready for prime time". It has a nice, clean, easy to navigate interface, and the Help descriptions are clear, if somewhat idiosyncratic in their spelling at times. The options are limited compared to other professional tools I use which have comparable costs of use, but that may be perceived as an advantage by many... until they need what's not there, which is probably inevitable if they work at translation in a full-time freelance capacity. Over the years I have heard many good things about Wordfast support, so I expect that users will at least find help and advice when they need it.

The integration with the online Wordfast Anywhere resources is also simple and good. That's a major point in favor of this tool and should be very helpful for collaboration.

Overall, I think that users who invest in a Wordfast Pro license will get their money's worth. A three-year license costs €400, with three-year renewals costing half the list price after that. If you aren't willing to pay after the three years, your license will stop working (unlike SDL Trados or memoQ, where the current license models allow you to keep working with the software long after your claim to support and upgrades has lapsed - basically "forever" if nothing strange happens with newer operating systems).

The possibilities for collaboration between Wordfast users and those who work with other environments are much better than they used to be, and in just a short time I was able to see how I can prepare projects for a colleague using Wordfast Pro 5. (SDL Trados packages can apparently be handled, though that's not the case for memoQ project packages prepared with the PM Edition - I would have to make MQXLIFF files and export TM and term base resources.) And I hope that this situation will only get better, with more environments offering various kinds of Wordfast resource integration and Wordfast acquiring new capacities to work with other formats and resources.

Jan 12, 2019

Another look at Wordfast Anywhere

The Wordfast suite of applications has a long history, and through much of it I've had my eye on the tools but up to now never really found them up to the demands of my work. Wordfast Classic (back when it was the only Wordfast app) was brought to my attention by an enthusiastic manager of a German bank's translation team more than 15 years ago; he found that the "blacklist" feature for terminology (since adopted by others - for example in memoQ's "forbidden" terms) was extremely helpful to his translators in avoiding terms which might provoke branding controversies or which were simply inappropriate in a particular specialist context.

When Wordfast Pro came along, I was disappointed by the interoperability of its early versions and by its being late to the party in supporting XLIFF formats (as were some other popular tools). That issue has since been solved, so I suspect I might not be quite so unhappy were I to revisit the application.

But really, Wordfast doesn't come onto my radar very often, and when it does, it's not so much the application suite itself as it is the Wordfast creator - Yves Champollion, who follows in a way the family tradition of the famous French Egyptologist, Jean-François Champollion, translator of the Rosetta Stone, and who has earned his own fair share of praise for his many years of support for individual translators and their professional organizations. It would not surprise me if much of the loyalty I find among users of Wordfast is inspired by the personal qualities of Yves as much as by any technical features of his tools.

The least among these tools was, in my consideration, the web-based Wordfast Anywhere (WFA). I looked at it briefly in the early days and was unimpressed: too limited, I thought. And the idea of translating in a browser seemed dubious to me, and it remains so in many scenarios that are relevant to my work. WFA was a bit ahead of its time, before the scamming Gold Rush that targeted corporate clients for web-based solutions designed to wrest data and control away from translators. WFA wasn't welcome in that party: its focus on empowering individual translators is anathema to most of the web CAT solutions one sees today.

My interest in Wordfast generally was revived recently when I saw that memoQ has integration plug-ins for Wordfast term bases and translation memories on servers. This inspired the thought that perhaps Wordfast Anywhere might function as a collaboration server here, sort of like some had hoped for the Language Terminal resources, but one that actually works perhaps. Alas no, or not yet at least; the memoQ plug-in cannot "see" the WFA server and an individual account. Oh, but if it could....

Collaboration and interoperability between translation environments have been topics of great interest for me since I began to use specialist tools for organizing translation resources some 19 years ago. And on those occasions when I want to share resources with someone who does not have a professional suite of desktop translation resources, I'm always a little uncomfortable with my default recommendations, because they are just a little too nerdy to work well with everyone. So I wondered... how well might WFA work with resources I prepare in SDL Trados Studio or memoQ and pass on to a colleague unequipped with those tools or other desktop solutions? I thought I remembered limits that would restrict such an effort, but either my memory is wrong or these limits changed.

WFA can accept files to translate which are up to 20 MB in size. I receive files that are sometimes larger than this, but not routinely, so this is not much of a restriction. But then I thought the limit on translation memory size would be the stumbling block, and indeed, when I tried to upload a 390 MB TM with about 330,000 translation units, I got an error message telling me that 300 MB (or rather 300000000 with no indication of units!) was the limit. Looking in the online documentation I found that 100,000 TUs is the limit for an individual translation memory in WFA. But you can attach multiple TMs and term bases (which can be much larger as I saw from the 800,000+ entry IATE termbase supplied by the environment). And most TMs that I see for mid-size companies are well under that size limit.

So I spent some time kicking the virtual tires again. Uploaded some damned big EU directives in various formats, including bilingual alignments in an XLIFF. No problem. Loaded a big memoQ XLIFF file: the *.mqxliff extension wasn't recognized, but I fixed that the usual way by changing it to *.xlf and it worked well, roundtripping perfectly back to memoQ and confirming that interoperability would work well enough for collaboration.
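The extension change is trivial to do by hand, but with a whole folder of files it is handy to script it. A small sketch in Python, with a function name of my own choosing, that copies the file under the new extension instead of renaming it, so the original memoQ file stays where it was:

```python
import shutil
from pathlib import Path


def copy_as_xlf(mqxliff_path):
    """Copy a *.mqxliff file to a sibling file with the *.xlf extension."""
    source = Path(mqxliff_path)
    target = source.with_suffix(".xlf")  # "job.mqxliff" becomes "job.xlf"
    shutil.copyfile(source, target)      # content is copied unchanged
    return target
```

Only the file name changes here; nothing inside the file is touched. For the return trip, the translated file can be given its original extension again in the same way.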

Indeed, the range of original file formats handled by this free online translation environment is impressive.

As I browsed through the options and customizing features of the WFA environment, my respect for its capabilities increased further. The thought occurred to me at one point that this might even be suited as an environment for a small company with limited translation needs to manage its language resources and make them available for in-house or external translators. With the several exchange formats available, translators and reviewers could easily perform their work with other translation environment tools or even word processors, and the results could be merged with the master records in the WFA account. This is probably the least expensive, secure way for a company to take its first steps toward central management of its translations and terminology resources. No big server investments needed, and later all resources can be migrated easily to more sophisticated environments, such as a memoQ Server, if necessary.

Some years ago, I opposed the use of Wordfast Anywhere in a local university program, arguing that more established professional tools like SDL Trados Studio and/or memoQ should be used instead, especially as the cost of doing so is negligible in teaching curricula. I take that back now. And my impression is that WFA is better suited to a teaching program than other, perhaps slicker web-based tools, because of the underlying philosophy of its design, which leaves translators and their partners in control of the data, not some third-party provider inclined to carry out dubious data mining and use the results to sell more dodgy commercial solutions.

Wordfast users also know that their desktop software can access translation memories and term bases on a WFA account as remote resources. My last look at Wordfast Pro showed me that the tool had come a long, long way since I last dealt with it to clean up some messes a French translator inflicted on an agency client of mine. It's been on my list to look at further for some time; I know it will likely not meet my criteria for the broad range of translation, quality assurance and consulting tasks I do, but it does do a good job of covering the real, practical needs of many colleagues, and it is important to me to understand other translation environments to facilitate collaboration with people who use them.

And for these cases of working together with a mix of environments, it seems to me that Wordfast Anywhere can be a productive bridge to bring partners together. To create a free account and start testing Wordfast Anywhere, click here.

Jan 11, 2019

Do you know Anki?

It began with a short DM last night, which I misunderstood at first:


Oh? Gábor must have read my mind. Just recently someone introduced me to Fotografia de Aves em Portugal, a public Facebook group for bird photography in Portugal, and I was thinking about making some sort of flashcard set to learn the bird names in Portuguese and English and maybe to do the same for all the mushrooms that I encounter at the quinta and out in the fields hunting. In fact, when I looked up the description of the desktop computer program Anki and its iOS mobile app companion, I realized that this is really what I have hoped to find for quite a long time for various learning tasks.

The computer app developed by Damien Elmes and its online server and synchronization site AnkiWeb are free; the charge for the iOS mobile app helps to support the development of all platforms. There is also a free compatible Android app by a different author. I like the idea of being able to coordinate my "learning decks" between devices and access them from anywhere. And a quick look at the import features tells me that it's not hard to send the content of some of my personal study term bases in memoQ to this application.
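As a rough idea of what that transfer might look like: Anki's importer accepts plain text files with one note per line and tab-separated fields, so a term list exported as CSV only needs a little reshaping. The sketch below is Python and rests on assumptions: the column names "Portuguese" and "English" are hypothetical placeholders for whatever the exported file actually contains, and the function name is my own.

```python
import csv


def terms_to_anki(csv_path, tsv_path, front="Portuguese", back="English"):
    """Turn a term export (CSV with a header row) into a tab-separated file.

    The column names are hypothetical; pass the ones found in the real export.
    Returns the number of cards written.
    """
    written = 0
    with open(csv_path, newline="", encoding="utf-8") as src, \
         open(tsv_path, "w", newline="", encoding="utf-8") as dst:
        writer = csv.writer(dst, delimiter="\t")
        for row in csv.DictReader(src):
            question = (row.get(front) or "").strip()
            answer = (row.get(back) or "").strip()
            # Skip entries that lack a term in either language.
            if question and answer:
                writer.writerow([question, answer])
                written += 1
    return written
```

Entries with a term in only one language are left out, since a card with an empty side is of no use for review.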

I assume this is an app he came across in his quest to learn Chinese; in fact, he did say that it's a tool that helps give one a fighting chance to learn all the myriad characters needed for basic literacy. And further research on my part showed that this is a popular tool for review in medical school and many other areas.

I downloaded and installed the app; the initial view was a little puzzling:


But within a few minutes I got my bearings and downloaded a few of the many "shared decks" online to familiarize myself with how the app works:


Pretty simple, really. Thank you, Gábor!

Jan 10, 2019

Word of the day: verjuice!

It all started with an argument about Dijon mustard. I hate Portuguese mustards which, for the most part, seem rather nasty and chemical, or at least flavorless. My go-to commercial mustard since college has been Grey Poupon Dijon mustard, with its nice sharp kick that I always thought came from horseradish. So when the doutora got her latest order of spices that included two kinds of mustard seed and spoke of making our own mustard, I said please don't forget to add some horseradish.

She couldn't see why I would want to ruin good mustard that way and informed me that my favorite mustard in fact contained no horseradish at all. With triumphant glee, my gouty fingers danced over the keyboard, asking Google Why is Grey Poupon mustard spicy? It seems there are in fact versions of that mustard with horseradish, but I've probably never bought them. In fact, the kick in Dijon mustard seems to come from the revolutionary introduction of verjuice in mustard-making in 1752 by Jean Naigeon.

Verjuice? Is that some kind of a typo? When I conceded the culinary argument and said that she was right, verjuice is used, the doutora replied what's that? She knows more about mustard than I do and thought that it simply had vinegar, which in fact it usually does. Verjus in Portuguese, or sumo verde (green juice). Vertjus in Middle French. Husroum (حصرم) in Arabic. Verjus in German too apparently. This seems to be one of those key culinary secrets I've been missing out on all my life.

Since my first foray into practical translation at age 14, I have found culinary translation and the associated terminology fascinating, not least because they often give me interesting excuses for hours of experimentation in the kitchen with interesting and often tasty results. And verjuice seems to be one such promising excuse.

There is a rich tradition, apparently, of cooking and formulating with the juice of unripe grapes, unripe oranges, crab apples and other juices which are collectively called verjuice, and Middle Eastern cuisine and Middle Ages European cuisine made much use of it. It has largely fallen into disuse in modern times, though the Australians have reversed that somewhat. So now I'll have to make some of this stuff and see where that leads me.

A comment in the German Wikipedia entry - Verjus ist deutlich milder als Essig - makes me think maybe I was right about that horseradish after all, but who cares when there are new frontiers to explore in culinary translation?

Jan 9, 2019

Translating in the trenches....


The other day I was chatting with a colleague about snarky communications with clients and vendors, and she reminded me of a delightful gem of a blog which, though it only had a run of five months, provided a lot of smiles and laughs to fellow translators, most especially those involved in some way with German translation. This was in the days before all the foolishness of "branding" and style over substance.

For a good time, check out https://trenchtranslation.blogspot.com/

Jan 8, 2019

Translating "smarter"

In response to my recent piece on the use of Fluency to translate Microsoft Publisher files, I received the following comments from Fluency's technical support:
It appears that you flat out ignored (or disabled) the warning message that pops up every time you try to open a Publisher file (attached).
There is no problem with Fluency with regards to Publisher. The issue lies in the fact that the Publisher interop is unstable. This (and the fact that professionals don’t use Publisher) are the reasons that CAT tools don’t support Publisher.
Are you hoping to ridicule us to encourage us to fix Publisher support? We’d just as soon remove support for it altogether, but we do have a few users who are grateful for it and understand that they might have to restart their computer a few times or kill the Publisher process that is frozen in the background. Another option that works, sometimes, is to try and save multiple times, which you discovered, but incorrectly attributed to resizing the view of the text (which doesn’t affect the output).
So we let you know before you start that Publisher is unstable and you’ve just spent a bunch of time documenting how it is unstable. To what end?
As for our XLIFF files, yeah they aren’t great, but extremely rarely is someone exporting something from Fluency to another CAT tool.
Regards,
Richard Tregaskis
Western Standard Support
Em: support@westernstandard.com
Ph: 801-224-7404
I'm not sure about the part that "professionals don't use Publisher". Certainly graphics professionals don't; when I earned my bread that way I usually used software like PageMaker, Quark Xpress or FrameMaker; nowadays Adobe InDesign seems to be the tool of choice. But I wouldn't think of some engineer in a technical department who uses Microsoft Publisher to write a manual as being unprofessional. Just foolish maybe, but no more so than the ones who use CorelDraw or even MS PowerPoint (!!!) in the same crazy way. People make the choices they do in the circumstances they work in, and professionals try to meet them at least halfway where possible to accomplish the necessary objectives. Fluency does that in the case of Microsoft Publisher, but one would do a service to the customer to suggest that another publishing platform might suit their needs better in the same way that responsible translation consultants often suggest that PDF is perhaps not the ideal format to provide for translation, and that the original format (if it isn't a Microsoft Publisher file or paper) might work better for everyone.

What concerns me about the response of Fluency's technical support, however, is the apparent lack of concern for the compatibility of their XLIFF files. If these cannot be exchanged readily with other platforms, one must ask what actual purpose they serve. Indeed, what would that be? And, perhaps, whether Fluency is really to be taken seriously as a professional platform for translation work.

Some years ago in a period where it looked a bit dark for my platform of choice, I thought that Fluency showed promise as a working platform, and I made a serious effort to investigate its suitability for my routine work as a translator of legal and scientific material. I was charmed by the generally functional approach to transcription, but the translation side of things was less encouraging, riddled with bugs at nearly every stage. After a few days I ran screaming back to more stable, well supported work platforms. The handling of SDLPPX (SDL Trados Studio package files) in particular was the sort of disaster one doesn't easily forget; even with products whose developers care about functionality and compatibility there are issues time and again as the SDL Trados platform evolves. I can only imagine what would happen with Fluency Now if I tried out one of those test files that a friend at SDL likes to play tricks on me with.

XLIFF is serious business. These days it is often the basis not only of interoperable processes with CAT tools but for all manner of bilingual exchange processes. And thus, until the technical support and/or development department of a tool takes this format seriously and makes a reasonable effort to ensure at least basic interoperability, that tool cannot be taken seriously for professional work.

That should be a question mark, not a period :-) 



Jan 7, 2019

"Unprofessional translation" and CAT tools


Those who came to this post expecting more ammunition for the war against the deprofessionalization of translation by the exploitative practices of the bulk market bog inhabited by the worst of (dis)service companies like Lionbridge, TransPerfect, thebigword and others will be disappointed. Nor will they find anything useful to combat the many technophobic misunderstandings and actual abuses of professional tools by sometimes less than bilingual wannabes whom some hope to keep away from translation by building some sort of virtual wall.

Tools are as useful as what we do with them. Hammers are good to drive nails or posts, depending on their design, weight and other factors, or they can be used to commit grisly murder, as one reads occasionally in the papers. But they can also make nice doorstops or play a part in exercise and sports. Computer-aided translation (CAT) tools or (as they are better known) translation environment tools (TEnTs) are versatile and often useful for solving problems and supporting processes for which they were not originally conceived. The E-Learning, Translation and Ideas Bakery website and blog (included for years now in the blog roll here on the left of the page) by a Romanian colleague who teaches at university in the UK shares a number of concepts that can be described thus, and the Unprofessional Translation blog (also in the list here) which I also follow shares many stories of situations where the usual working tools of my present profession can be applied well to situations beyond the usual commercial or literary borders that most of us set for our work.

I continue to be excited by the possibilities of using TEnTs as an aid in language learning. The fact that these are so seldom used in that way is, for me, evidence of great opportunities missed by teachers and students around the world and perhaps also by the providers of commercial tools, though for the broad market of teachers and learners everywhere I would encourage the use of several excellent free and open source tools like OmegaT and the Heartsome Suite, even the web-based tool of that axis of evil, the Google Translator Toolkit.

I have documented some of my own efforts to use my main professional tool, memoQ, to support my own progress of learning Portuguese since I moved to Portugal six years ago. This was essential for getting a grip on the terminology and expressions needed to pass my weapons license exam in Portuguese when I could barely speak well enough to order breakfast, and I continue to use it as a means to track vocabulary and expressions I encounter in the newspaper, magazines and public notices such as the warning about deadly Asian wasps here (in the graphic at the top of this post). Many years ago when learning German, Russian and Sumerian I kept a thick deck of flashcards for reference and practice, and at various times I have used online sites like Duolingo, Livemocha or Memrise to get farther with Portuguese or Spanish vocabulary, but none of these have proven as effective for focused study of a written language as the tools I have on my desktop computer, which enable me to compile corpora and glossaries which are adapted best to my personal situation and needs for language acquisition in a new country and culture.

One welcome difference of using TEnTs for personal projects as opposed to professional work is that one can focus on the parts most needed and need not worry about completing an "assignment". Thus I will maintain corpora with only partial translations or perhaps only simple comments to explain grammatical aspects of particularly challenging sentences. And if there is some useful external quiz engine I want to use for virtual flashcards (or I want to make printed cardstock ones or a cheat sheet to help with discussions at the tax office or sporting goods store) that is easy enough to do with the many data exchange options in memoQ, SDL Trados Studio, OmegaT or whatever.
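As a rough illustration of that kind of data exchange, here is a minimal Python sketch that turns a delimited glossary export into tab-separated front/back lines of the sort many flashcard tools can import. The column names "Portuguese" and "English" and the sample entries are my own assumptions for the example; a real term base export from memoQ or another tool will have its own headers and layout, so treat this as a starting point rather than a recipe.

```python
import csv
import io


def glossary_to_flashcards(csv_text, source_col, target_col, delimiter=","):
    """Turn a delimited glossary export into tab-separated front/back lines.

    source_col and target_col are the header names of the two columns to use;
    they are assumptions here and must match the actual export.
    """
    reader = csv.DictReader(io.StringIO(csv_text), delimiter=delimiter)
    lines = []
    for row in reader:
        # Tabs inside a term would break the two-column layout, so flatten them.
        front = (row.get(source_col) or "").replace("\t", " ").strip()
        back = (row.get(target_col) or "").replace("\t", " ").strip()
        if front and back:  # skip entries that have no translation yet
            lines.append(f"{front}\t{back}")
    return "\n".join(lines)


# Hypothetical sample export; the second entry is deliberately untranslated.
sample = (
    "Portuguese,English\n"
    "vespa asiática,Asian wasp\n"
    "colmeia,\n"
    "licença de porte de arma,weapons license\n"
)

cards = glossary_to_flashcards(sample, "Portuguese", "English")
print(cards)
```

Skipping the entries without a translation suits the partial corpora described above: unfinished terms simply stay out of the deck until they are filled in.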

In the same way that understanding the use of word processing software does not make you a writer, students of language who use translation environment tools are unlikely to become viable translators en masse, even if they may have that as an objective for some reason. As a professional translator, I see the attempts of bulk market providers to engage even competent bilinguals as translators and note with depressing frequency that a fool with a tool remains a fool and that language mastery will go nowhere professionally without mastery of concepts and subject matter details as well. But most people can, I think, get on farther and faster with the many challenges posed by a new language in areas of interest and necessity with the "unprofessional" aid of professional translation tools.

Jan 6, 2019

A voice-activated recorder for iOS


Some years ago I picked up an Olympus hand-held digital recorder which served me well for some translation work as well as for evidentiary purposes during assaults by a drunken psychotic. I also used it on many occasions to record ideas for projects and other things while out and about.

But juggling several electronic devices at once has never been easy for me, and I kept losing the little recorder in jacket pockets, suitcases, desk drawers, etc. I still have it, but it's been some weeks since I could tell you where.

I use the Voice Memos app on my iPhone, but for me its continuous recording feature is inconvenient, and I don't like pausing and resuming the recording frequently. It's too distracting. The Olympus device and an old tape recorder I used decades ago both had a convenient voice activation feature which avoided excessive dead space.

Being the technosaurus that I am, it took a long time to realize that there is probably an app for that these days. And indeed there is. Several in fact. I downloaded the iOS app depicted above, and in the days ahead I'll be using it for a few translation projects to evaluate what I consider to be an improved version of the three-step translation workflow I demonstrated a few years ago in a remote conference lecture in Buenos Aires using the now-defunct Dragon Dictation app from Nuance. Stay tuned.

Jan 5, 2019

memoQfest 2019: presentation proposals due by January 30

memoQfest 2019 will be held in Budapest this year from May 29 to 31. The call for papers has gone out, and submissions for talks are due by January 30th.

Last year I attended after a two-year break, during which quite a few things changed with the company (which has continued to change in name at least - now memoQ Translation Technologies Ltd., the artists formerly known as Kilgray). Not only was memoQfest 2018 a good opportunity to meet new members of the memoQ team, it was simply a spectacular event in its own right, the best of the 7 conferences I attended over the years, with a great deal of content relevant to users at all levels, both individuals and companies.

Stay updated on the conference and its planning at the memoQfest.org site (which redirects to its current home on the memoQ.com domain).

Jan 4, 2019

Translating Microsoft Publisher files

Every few months or so I run across a question in social media or am confronted with a project like this:



Some time ago, Paul Filkin published an interesting discussion of an Open Exchange application that enables SDL Trados Studio users to deal with the Microsoft Publisher format with some limitations; in the article, he also discussed other approaches, including one I have known about for some time: the use of Western Standard's Fluency.

I looked at Fluency some years ago, and while I found some interesting things there, such as its transcription module, on the whole the application never seemed ready for prime time with its sloppy programming of details. I spent some time trying to persuade its underfunded team to correct some of the problems I saw, but after a while it became clear that the company and its product were not able to cope with the demanding technical challenges routinely faced by language service providers today.

The discussion which followed the posted question suggested a number of approaches, but if the colleague's client expected to receive a translated PUB file instead of some other format, the only realistic option for this possibly one-off job would be to use Fluency in some way. I assumed (and suggested) that a workflow involving
  *.pub <-> Fluency <-> (exchange format) <-> memoQ
might do the trick (with the exchange format probably being XLIFF, but otherwise the bilingual RTF format that I remembered from my tests of Fluency long ago).

And so it proved to be. But the Devil is in the details.

The first sign of trouble came from a colleague - a professor at a local university who is known for his technical curiosity and flexibility in translation courses - who told me that Fluency does indeed offer an XLIFF export but that memoQ experienced problems importing it. His description of the error message sounded a lot to me like the typical mistakes that CAT tool programmers who are XLIFF newbies make when implementing a spec that they are probably too lazy to read and test. (I found the same error myself and submitted it to memoQ Support for comment a few hours ago.) He said that he had then tried the RTF export, but it wasn't clear to me what the result was and he was under time pressure, so I didn't press the matter but resolved to have a look myself.

I used a modified English template file for an invitation as my PUB file to test. The file imported easily into Fluency:

I assume that "terminology" download is some silly, unhelpful public domain dictionary I would never use.

The Fluency user interface offered a sort of WYSIWYG representation for the text, which makes it appear not bad for work, though appearances are deceiving. In fact, this proved to be a source of some trouble later.

As mentioned, the XLIFF export could not be used in memoQ, and although I am capable enough of analyzing structure problems in a tagged file, I wasn't in the mood to clean up someone else's mess, so I exported a "Fluency Work File" as my next attempt. That is app jargon for a bilingual RTF file similar to that found in other applications.


The difference with Fluency RTFs is that they include the WYSIWYG text representation. Nice, really, and this makes the work in another environment a little easier. I copied the source text column and pasted it into a new file (DOCX), then imported that to memoQ for translation:


Afterward, the translation exported from memoQ was pasted into the target column of the Fluency Work File (bilingual RTF exchange file). I imported that bilingual file back into Fluency and then exported a translated PUB file using the File / Save As command. I got a strange error message saying that there had been some trouble with the export and that some manual adjustment might be needed in Microsoft Publisher.


At first glance I thought, "Looks OK" and then... WTF??? Everything was OK except the title. Not only was the text cut off, it was not even the text I had translated into German. When I copied the text out of the field and pasted it into Notepad, this is what I saw:
Tag der Tag der kulturellen Vielfalt
kulturellen Vielfalt
Vielfalt
kulturellen Vielfalt
kulturellen Vielfalt
Vielfalt
kulturellen Vielfalt
kulturellen Vielfalt
Vielfalt
No joke. Fluency somehow went berserk exporting the text of the title field, and sliced, diced and multiplied the whole mess in a truly bizarre way.

In my nearly 5 decades of casual and occasionally professional programming I have seen almost every stupidity imaginable, so in this case I imagined that somehow the problem lay in sloppy programming associated with text that is longer than the space provided in the field. Interestingly, Fluency enabled me to change the size of the target text in the translation window, so I reduced it by about half and tried to export a new target PUB file.


That worked in fact. So Fluency can indeed be used as a sort of filter for Microsoft Publisher files to be translated in other tools such as memoQ, but the process is not without trouble on the Fluency side, at least when text overruns the field size available, as one might expect to happen with some frequency.

Western Standard offers a 15-day trial of Fluency Now, their desktop tool for freelance translators, and the application can be paid for on a monthly subscription of only 15 US dollars. So perhaps for the occasional project or client that requires work with PUB files that is an option. Microsoft Publisher is not taken seriously as a layout and publishing tool by graphics professionals and CAT tool providers, but because it is part of the Microsoft Office suite, one will find it in use from time to time, and this imperfect solution may be the best option for helping such clients.


Jan 3, 2019

Using analog microphones with newer iPhones


Microphone quality makes a great difference in the quality of speech recognition results. And although the microphones integrated in iOS devices are generally good and give decent results, positioning the device in a way that is ergonomically viable for efficient dictated translation - and concurrent keyboard use - is not always so easy. This is a potential barrier to the effective use of Hey memoQ speech recognition.

So a good external microphone may be needed. But with recent iPhone models lacking a separate microphone jack and using the Lightning port for both charging and microphone input, connecting that external microphone might not be as simple as one assumes. Especially not for someone like me, who is rather ignorant of the different kinds of 3.5 mm audio connections. I have had a few failures so far trying to link my good headset to the iPhone 7.

Colleague Jim Wardell is not only the most experienced speech recognition expert for translation whom I am privileged to know; he is also a musician with extensive experience in microphones of all kinds and their connections. And recently he was kind enough to share the video below with me to clear up some misunderstandings about how to connect some good analog equipment to use with Hey memoQ on an iPhone 7 or later:




Jan 2, 2019

Hacking the "Hey memoQ" dictation commands


In the initial release of the Hey memoQ dictation feature in memoQ version 8.7.3, it's a bit inconvenient to deal with command configuration. Unlike most configurations in memoQ, the dictation commands cannot yet be exported as a light resource and shared with other users, nor can a configuration for generic German, for example, be easily transferred to a desired variant such as "ger-DE" or "ger-CH". Surely this will be addressed soon, but at the moment it's a bit of a nuisance.

But fear not... there is usually a backdoor to hack memoQ configurations, and this is no exception.


The screenshot above shows the path to the current configuration file for dictation commands. The XML file contains all the configured commands for all the memoQ languages and variants, including those of no interest whatsoever.

Deep inside the Hey memoQ dictation command file with Notepad++

A peek inside the XML file reveals that the dictation commands are structured as key-value pairs. And here it is possible to enter the text for dictation commands, simply by typing the desired text between the string tags inside the Value tags.

A configuration (Commands set) for one variant of a language - such as generic Portuguese - can also be copied to other variants - such as Brazilian or European Portuguese, saving the trouble of re-entering everything laboriously in the configuration dialog within memoQ.
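For those who would rather script that copying than do it by hand in a text editor, here is a minimal Python sketch of the idea. Please note that the XML layout in it is invented for illustration: I have kept the Key, Value and string tags mentioned above, but the CommandSets, CommandSet and Entry element names, the lang attribute and the command names are all my assumptions, not the actual structure of the memoQ file. Anyone trying this for real would need to inspect their own file first, adjust the names accordingly and work on a copy.

```python
import copy
import xml.etree.ElementTree as ET

# Hypothetical miniature of a dictation command file. Only the Key, Value and
# string tags come from the description above; everything else is assumed.
SAMPLE = """<CommandSets>
  <CommandSet lang="por">
    <Entry><Key>ConfirmSegment</Key><Value><string>confirmar segmento</string></Value></Entry>
    <Entry><Key>NextSegment</Key><Value><string>segmento seguinte</string></Value></Entry>
  </CommandSet>
  <CommandSet lang="por-PT">
    <Entry><Key>ConfirmSegment</Key><Value /></Entry>
    <Entry><Key>NextSegment</Key><Value /></Entry>
  </CommandSet>
</CommandSets>"""


def copy_command_set(root, source_lang, target_lang):
    """Copy each command's Value from one language set to another, by Key."""
    sets = {cs.get("lang"): cs for cs in root.iter("CommandSet")}
    source, target = sets[source_lang], sets[target_lang]
    source_values = {e.findtext("Key"): e.find("Value") for e in source.iter("Entry")}
    copied = 0
    for entry in target.iter("Entry"):
        src_value = source_values.get(entry.findtext("Key"))
        if src_value is None:
            continue  # no matching command in the source set
        old_value = entry.find("Value")
        if old_value is not None:
            entry.remove(old_value)
        # Deep copy so that the two language sets do not share one element.
        entry.append(copy.deepcopy(src_value))
        copied += 1
    return copied


root = ET.fromstring(SAMPLE)
copied = copy_command_set(root, "por", "por-PT")
print(copied, "commands copied")
print(ET.tostring(root, encoding="unicode"))
```

Matching the commands by key rather than by position means the copy still works if the two sets list their commands in a different order.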

I made a copy of the XML configuration and edited it to have only the variants of English and German that were of interest to me. Then I copied this file over the one in the memoQ configuration directory shown in the screenshot above. When I restarted memoQ, the file bloated a bit; upon examining it, I saw that all the deleted languages had been restored after the ones I had left in the edited file, but the new file was still only 247 KB in size because the senseless copying of English commands to the other languages was gone.

A customized XML file can be shared with other users, who can use it to replace the existing configuration file and probably save time configuring their languages and variants of interest. My file with generic English, EN-US, EN-UK, generic German and DE-DE is here.




Jan 1, 2019

We made it this far.

The last week of 2018 was puzzling to me. New Year's greetings began to drift in with the fog on Boxing Day, causing me to think more than a few times that I had ripvanwinkled my distracted way into 2019 out here on the quinta, failing to notice much of anything more than my concern for getting the last late invoices out, finishing a course plan and hauling - wheelbarrow by wheelbarrow full - a tractor load of horse manure that some idiot dumped at the bottom of the hill rather than at the top, next to the garden where it is needed.

Whew. No new wars in 2018, and though the fascist stones continue to strike the crumbling bastions of democracy, these are in part being used to rebuild those walls or to whet the knives for the fights ahead. However these turn out, at least some will achieve a better understanding of the world shared by their grandparents or a generation or two before that.

Translation continues as before, seeming dull in its routine upon superficial inspection, but revealing unexpected and pleasurable textures as the eyes focus or the hand is laid on the work. Emily Wilson's Odyssey translation was among the better works which graced the past year for me, and it prepared me well for the imaginative inspirations of Madeline Miller's Song of Achilles and Circe, which told so well the subtext lost in too many heroic tales, that the real heroism of humanity is more often and better found in the hearts of its minor characters. Never did I expect to hear my thoughts in the mouth of Telemachus.

So. Here we are in 2019. Those good wishes for a bom ano novo won't seem so out of place now. Well aged, the flavors of good are complex, with bitter notes or a touch of vinegar sometimes, like the mead I forgot so long in its oxidative fermentation vessel and bottled a year late. But we can taste it, and sometimes that is enough.

Making limoncello from life.