
Jan 14, 2019

Specialist terminology taxonomies from Cologne Technical University

Click and thou shalt go there!

Early in the last decade when I lived near Düsseldorf and began translating full time, the nearby technical university in Cologne had an excellent terminology studies program run by Prof. Klaus-Dirk Schmitz, who also had a long history in Saarbrücken back in my exchange student days there. I had the pleasure of meeting this gentleman at various professional events for Passolo (before it was swallowed by SDL) or other occasions, and I remain impressed by the professional qualities of some of the colleagues he helped to educate. At some point he or one of his students pointed me to an interesting online collection of specialist terminologies created by students at the university as part of their degree work. While student work must be viewed carefully, on the whole I found these collections to be of better quality than quite a few put together by "professionals", and their structured taxonomies were also interesting to people like me who enjoy such things. And occasionally the terminologies were rather helpful for certain technical topics I translate.

But over the years I simply forgot about them for the most part, and when they did come to mind I assumed that the old MultiTerm engine used to handle the data on the site would no longer work. That latter assumption may be partly correct; I found the collection again, noted that the most recent addition to the term library was a bit over a decade ago and that the search functions don't seem to work with Chrome, though I am able to browse the structured taxonomies without difficulty.


Looking through the list of term collections, I saw one that would be particularly useful for a current personal effort: beekeeping. One of my projects for the year ahead is to add some hives to the garden to see if I can improve some of the vegetable, fruit and nut yields. A local Portuguese beekeeper and I have been trading poultry, and he kindly provided me with a copy of his thesis on apiculture and offered assistance to get me started. So I am reading up on the subject in several languages, thinking to put together a good terminology to make cross-referencing the concepts between English, German and Portuguese a little easier.

One thing I never tried to do before was to extract data from the FH Köln (Cologne Technical University) site into any sort of terminology management tool. I don't think they were ever intended to be used that way, and at the time most of the collections were put together, translation environment tools were much less widely used by professionals and university study programs than they are today. But after a little thought and experimentation, mining the pages proved to be quite simple.

Here's how I did it:
  1. Opened a collection of interest and expanded the folder tree for a particular language completely, then selected and copied all the text in that frame:

  2. Pasted the copied content as plain text (no formatting) into Microsoft Word. The numerical codes were followed directly by the text entries.
  3. Removed parentheses by searching and replacing with nothing.
  4. Inserted a tab between the number codes and the text entries using search and replace with wildcards (regex of a sort):

  5. Switched to the other language in the term collection and repeated steps 1 through 4.
  6. Transferred the contents to Excel (various ways to do this).
  7. Imported the Excel file with the specialist terms into a term base in my translation environment tool of choice.
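For anyone who would rather script steps 2 through 6 than click through Word and Excel, here is a minimal Python sketch of the same idea. It is not what I actually did, and it rests on assumptions: that each copied line consists of a dotted numeric code followed directly by the term (something like `1.2.3Bienenstock`), and that the same code identifies the same concept in both languages. The function names and the sample codes are made up for illustration; check them against the real pasted text before relying on the result.

```python
import csv
import re

# Assumed shape of a pasted line: a dotted numeric code immediately followed
# by the term, e.g. "1.2.3Bienenstock". The real codes on the site may differ.
CODE_THEN_TERM = re.compile(r"^\s*(\d+(?:\.\d+)*)\.?\s*(\D.*?)\s*$")


def parse_tree_text(text):
    """Map each numeric code to its term, dropping parentheses as in step 3."""
    entries = {}
    for line in text.splitlines():
        line = line.replace("(", "").replace(")", "")
        match = CODE_THEN_TERM.match(line)
        if match:
            code, term = match.groups()
            entries[code] = term
    return entries


def pair_languages(source_text, target_text):
    """Join two language trees on the codes they share (an assumption)."""
    source = parse_tree_text(source_text)
    target = parse_tree_text(target_text)
    return [(code, source[code], target[code]) for code in source if code in target]


def write_tsv(rows, path):
    """Write tab-separated rows that Excel or a term base import can read."""
    with open(path, "w", encoding="utf-8", newline="") as handle:
        csv.writer(handle, delimiter="\t").writerows(rows)
```

Calling `pair_languages()` with the text copied from the German tree and the English tree gives a list of (code, German term, English term) rows, and `write_tsv()` saves them as a tab-delimited file. Entries that exist in only one language are silently skipped, so it is worth comparing the row count against the original trees.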




Jan 13, 2019

A second look at Wordfast Pro


The generally good impression made by Wordfast Anywhere in my recent tests inspired me to take a new look at the premium environment for freelance translators: Wordfast Pro 5. A lot has changed with Wordfast Pro since its early days, and much of what I found troublesome with early versions has been corrected. A new look has also been on my agenda for a while since I realized that two new formats were introduced (TXLF, an XLIFF format, and GLP, a zipped project package format for Wordfast), which can be handled by my usual translation environment but (currently) with a few extra steps required compared to the old TXML format.

The installation took about a minute and started off with a good impression from the warning about cloud drive synchronization:


I've seen a number of people come to grief with other tools when their projects, translation memories or other resources are stored in Dropbox or similar configurations so they can be shared by installations on different computers, and I appreciate Wordfast's attempt to warn people off from this dodgy practice. If you want to share resources, play it safe and stick them in Wordfast Anywhere.

At first, the program is in demo mode, which limits translation memories to 500 translation units (TUs) and does not allow access to remote resources such as Wordfast Anywhere. Fortunately, there is a fully functional 30-day trial available, and it took all of about two minutes to fill out the simple request form, receive the mail with the trial license key and activate it in Wordfast Pro 5.

I was really enthusiastic about the clean, uncluttered feel of the interface. There's a lot more functionality in SDL Trados Studio or memoQ, but all the myriad features of those environments can be intimidating to some, and even for experienced users navigation can be confusing at times to locate some obscure setting or feature. Not in Wordfast Pro 5: the features mostly aren't there, and what is there can be found without much ado. Given the limited scope of mastery and inclination to learn on the part of many hamsters running on the freelance translation wheel, this can be a definite advantage.

On the Help ribbon I saw a Feedback icon. I don't know why, but this inspired a weird enthusiasm in me, so I clicked it, and when the dialog appeared, I wrote a quick note to the development team to say what a great impression the new user interface was making before I had even started to do anything useful with it. I noticed that the feedback dialog also had options to include files and projects in case of a problem, which I also thought was really cool. Something like that in other tools would be very helpful to their users and probably encourage more suggestions and interaction.

It was really easy to navigate through the ribbon menus and explore the configuration options. I was pleased to see that different sets of keyboard shortcuts were available to make the ergonomics easier for users of some other tools.


But SDLX? Huh? That's kind of Jurassic. No memoQ shortcuts, but no problem. I can customize, right? Yes... but I soon discovered that I apparently had no way to save my customized shortcuts as "memoQ style" or whatever else I might want to call them. And then I noticed that I probably can't save the configuration to move it onto a second computer where the terms of the license agreement allow private individuals to install another copy. And, hmmmm, no option to print a cheat sheet I can refer to as I learn the keyboard shortcuts. memoQ users are kind of spoiled on both counts, I guess.

One thing I was very eager to try was the connection to my translation memories and glossaries in my Wordfast Anywhere account. That proved to be quite straightforward: it worked exactly as the clear instructions of the Wordfast Pro Help described the process.

So I was ready to try out some translation, maybe a little dictation with Dragon NaturallySpeaking. I imported a little text file to get started:


WTF??? Now I know what the problem is here, but importing the same file to Wordfast Anywhere gives this result:


And in memoQ:


The import with the simple text filter of Wordfast Pro 5 (version 5.7) does not map the characters correctly. I had to change the source text file from ANSI to UTF-8: not a big deal for me, but a lot of translators I know will be over their heads right there.
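Most text editors can do this conversion with "Save as", but for a batch of files a few lines of Python will also do. This is only a sketch: it assumes that "ANSI" means Windows-1252 (`cp1252`), which is the usual case on Western European Windows systems but not a given, and the function name is my own invention.

```python
from pathlib import Path


def reencode_to_utf8(source_path, target_path, source_encoding="cp1252"):
    """Read a legacy-encoded text file and write it back out as UTF-8."""
    # The source encoding is an assumption; a wrong guess produces garbled
    # characters, so spot-check the output before importing it.
    text = Path(source_path).read_text(encoding=source_encoding)
    Path(target_path).write_text(text, encoding="utf-8")
    return text
```

Writing to a new file rather than overwriting the original keeps the customer's file untouched in case the encoding guess turns out to be wrong.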

The choice of import filters available is fairly good as one might expect from most professional translation environments these days, but two important things were missing for me. There seems to be no option to cascade filters, useful for example if you have a Microsoft Excel file containing HTML text to translate, and there is also no facility for configuring custom regex-based text filters or tagging text content which needs protection (such as placeholder text). This won't be an issue for a lot of translators, but for those who deal with challenging, often unexpected formatting issues in customers' files it could be a real pain in the neck.

On to dictation... Dragon NaturallySpeaking (DNS) seemed to perform well. I had to turn off the DNS dictation box by unmarking the checkbox in its dialog. Text was then transcribed well into the target field, and my spoken keyboard shortcut to confirm a segment and go on to the next one worked perfectly. Then I misspoke and used a spoken editing command to correct my error. Nothing happened. I tried several different spoken selection and editing commands that I use every day in memoQ. Nothing worked. Shit. What we have here is a failure of compatibility. The full potential of Dragon NaturallySpeaking cannot be used in Wordfast Pro 5.

I explored the settings further... quality assurance. That looked pretty good; the options were easy to understand and I could set them as I wanted to check my work. But the QA settings I need vary in many projects, and sometimes I want to do a QA check on just one aspect like tags or maybe terminology. Wordfast Pro 5 offered no facility to save a QA configuration or profile and load it as one might do in SDL Trados Studio or memoQ. This too would be a deal-breaker for me, alas. I depend on a full hand of memoQ quality assurance profiles for selective checking of important quality parameters in my jobs. Toggling settings back and forth in Wordfast would drive me nuts. Still, this wouldn't disturb many CAT tool users who can barely be bothered to run a spelling check on their work, much less run a check for missing or mismatched tags.

In contrast to my conclusions years ago, I can now say that Wordfast Pro is "ready for prime time". It has a nice, clean, easy to navigate interface, and the Help descriptions are clear, if somewhat idiosyncratic in their spelling at times. The options are limited compared to other professional tools I use which have comparable costs of use, but that may be perceived as an advantage by many... until they need what's not there, which is probably inevitable if they work at translation in a full-time freelance capacity. Over the years I have heard many good things about Wordfast support, so I expect that users will at least find help and advice when they need it.

The integration with the online Wordfast Anywhere resources is also simple and good. That's a major point in favor of this tool and should be very helpful for collaboration.

Overall, I think that users who invest in a Wordfast Pro license will get their money's worth. A three-year license costs €400, with three-year renewals costing half the list price after that. If you aren't willing to pay after the three years, your license will stop working (unlike SDL Trados or memoQ, where the current license models allow you to keep working with the software long after your claim to support and upgrades has lapsed - basically "forever" if nothing strange happens with newer operating systems).

The possibilities for collaboration between Wordfast users and those who work with other environments are much better than they used to be, and in just a short time I was able to see how I can prepare projects for a colleague using Wordfast Pro 5. (SDL Trados packages can apparently be handled, though that's not the case for memoQ project packages prepared with the PM Edition - I would have to make MQXLIFF files and export TM and term base resources.) And I hope that this situation will only get better, with more environments offering various kinds of Wordfast resource integration and Wordfast acquiring new capacities to work with other formats and resources.

Jan 12, 2019

Another look at Wordfast Anywhere

The Wordfast suite of applications has a long history, and through much of it I've had my eye on the tools but up to now never really found them up to the demands of my work. Wordfast Classic (back when it was the only Wordfast app) was brought to my attention by an enthusiastic manager of a German bank's translation team more than 15 years ago; he found that the "blacklist" feature for terminology (since adopted by others - for example in memoQ's "forbidden" terms) was extremely helpful to his translators in avoiding terms which might provoke branding controversies or which were simply inappropriate in a particular specialist context.

When Wordfast Pro came along, I was disappointed in the interoperability of its early versions and in its being late to the party in supporting XLIFF formats (as were some other popular tools). That issue has since been resolved, so I suspect I might not be quite so unhappy were I to revisit the application.

But really, Wordfast doesn't come onto my radar very often, and when it does, it's not so much the application suite itself as it is the Wordfast creator - Yves Champollion, who follows in a way the family tradition of the famous French Egyptologist, Jean-François Champollion, translator of the Rosetta Stone, and who has earned his own fair share of praise for his many years of support for individual translators and their professional organizations. It would not surprise me if much of the loyalty I find among users of Wordfast is inspired by the personal qualities of Yves as much as by any technical features of his tools.

The least among these tools was, in my consideration, the web-based Wordfast Anywhere (WFA). I looked at it briefly in the early days and was unimpressed: too limited, I thought. And the idea of translating in a browser seemed dubious to me, and it remains so in many scenarios that are relevant to my work. WFA was a bit ahead of its time, before the scamming Gold Rush that targeted corporate clients for web-based solutions designed to wrest data and control away from translators. WFA wasn't welcome at that party: its focus on empowering individual translators is anathema to most of the web CAT solutions one sees today.

My interest in Wordfast generally was revived recently when I saw that memoQ has integration plug-ins for Wordfast term bases and translation memories on servers. This inspired the thought that perhaps Wordfast Anywhere might function as a collaboration server here, sort of like some had hoped for the Language Terminal resources, but one that actually works perhaps. Alas no, or not yet at least; the memoQ plug-in cannot "see" the WFA server and an individual account. Oh, but if it could....

Collaboration and interoperability between translation environments have been topics of great interest for me since I began to use specialist tools for organizing translation resources some 19 years ago. And on those occasions when I want to share resources with someone who does not have a professional suite of desktop translation resources, I'm always a little uncomfortable with my default recommendations, because they are just a little too nerdy to work well with everyone. So I wondered... how well might WFA work with resources I prepare in SDL Trados Studio or memoQ and pass on to a colleague unequipped with those tools or other desktop solutions. I thought I remembered limits that would restrict such an effort, but either my memory is wrong or these limits changed.

WFA can accept files to translate which are up to 20 MB in size. I receive files that are sometimes larger than this, but not routinely, so this is not much of a restriction. But then I thought the limit on translation memory size would be the stumbling block, and indeed, when I tried to upload a 390 MB TM with about 330,000 translation units, I got an error message telling me that 300 MB (or rather 300000000 with no indication of units!) was the limit. Looking in the online documentation I found that 100,000 TUs is the limit for an individual translation memory in WFA. But you can attach multiple TMs and term bases (which can be much larger as I saw from the 800,000+ entry IATE termbase supplied by the environment). And most TMs that I see for mid-size companies are well under that size limit.

So I spent some time kicking the virtual tires again. Uploaded some damned big EU directives in various formats, including bilingual alignments in an XLIFF. No problem. Loaded a big memoQ XLIFF file: the *.mqxliff extension wasn't recognized, but I fixed that the usual way by changing it to *.xlf and it worked well, roundtripping perfectly back to memoQ and confirming that interoperability would work well enough for collaboration.
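The extension change is trivial to do by hand, but if a project has many files, a small helper can make renamed copies in one go. This is a rough sketch with a hypothetical function name; it simply copies the file under a new extension and leaves the memoQ original alone.

```python
import shutil
from pathlib import Path


def copy_as_xlf(mqxliff_path):
    """Copy a memoQ XLIFF file to the same folder with an .xlf extension."""
    source = Path(mqxliff_path)
    target = source.with_suffix(".xlf")
    # Only the extension changes; the file content is copied byte for byte.
    shutil.copyfile(source, target)
    return target
```

After translating in WFA, the downloaded file needs its *.mqxliff extension back before memoQ will treat it as its own bilingual format again.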

Indeed, the range of original file formats handled by this free online translation environment is impressive.

As I browsed through the options and customizing features of the WFA environment, my respect for its capabilities increased further. The thought occurred to me at one point that this might even be suited as an environment for a small company with limited translation needs to manage its language resources and make them available for in-house or external translators. With the several exchange formats available, translators and reviewers could easily perform their work with other translation environment tools or even word processors, and the results could be merged with the master records in the WFA account. This is probably the least expensive, secure way for a company to take its first steps toward central management of its translations and terminology resources. No big server investments needed, and later all resources can be migrated easily to more sophisticated environments, such as a memoQ Server, if necessary.

Some years ago, I opposed the use of Wordfast Anywhere in a local university program, arguing that more established professional tools like SDL Trados Studio and/or memoQ should be used instead, especially as the cost of doing so is negligible in teaching curricula. I take that back now. And my impression is that WFA is better suited to a teaching program than other, perhaps slicker web-based tools, because of the underlying philosophy of its design, which leaves translators and their partners in control of the data, not some third-party provider inclined to carry out dubious data mining and use the results to sell more dodgy commercial solutions.

Wordfast users also know that their desktop software can access translation memories and term bases on a WFA account as remote resources. My last look at Wordfast Pro showed me that the tool had come a long, long way since I last dealt with it to clean up some messes a French translator inflicted on an agency client of mine. It's been on my list to look at further for some time; I know it will likely not meet my criteria for the broad range of translation, quality assurance and consulting tasks I do, but it does do a good job of covering the real, practical needs of many colleagues, and it is important to me to understand other translation environments to facilitate collaboration with people who use them.

And for these cases of working together with a mix of environments, it seems to me that Wordfast Anywhere can be a productive bridge to bring partners together. To create a free account and start testing Wordfast Anywhere, click here.