
Jun 12, 2021

The right Ideas venue for #memoQ in #xl8

Yesterday I had an interesting chat with some of the memoQ team involved with the new Regex Assistant in memoQ 9.8. Before we finished, one of the fellows in the session offered me a tour of the "development portal" described in a January 2021 blog post. I had never accessed it myself, having seen similar user forums for other tools go straight down the toilet after many people had invested a lot of effort in them. In this case, however, I was quite surprised by the quality of what I saw, and then tweeted:

This thing really does look good, and I don't mean just its clean appearance:


As far as I can tell, the thing is implemented with WordPress; it uses gravatars for those users who want to have a chosen image appear by their posts and comments. The official company explanation of the portal, its purpose and how to use it is here.

One user had this comment about the new platform for suggestions:
Although memoQ's support team is generally excellent, they deal with tens of thousands of users and their ongoing troubles, and the form letters they used to send for requested features often had a somewhat impersonal feel, which frustrated many people. And there was little chance to see if others had similar requests or to discuss these. All that has changed for the better with this new portal.

The most encouraging thing for me, however, is the clear commitment to this platform which I saw in the memoQ product owner, Zsolt Varga, in a group chat with another responsible person. He's a consummate professional and no bullshitter, and it's clear he plans to watch the portal closely to inform efforts of many kinds with memoQ software.

So, for now I have abandoned my initial reservations about yet another user campaign, and I am taking this as seriously as the decision-makers I trust are. My friends at memoQ will probably also welcome the respite from late-night e-mails and Skype harangues about things I feel are harming the productivity of many of my LSP and independent friends in the translation sector.

Many people in the translation world, including some of my closest friends, misunderstand what I actually do. For more than twenty years now, I have been more of a consultant and occasionally a developer for translation workflow solutions, along with training or coaching. Yes, I translate a lot, though less in recent years as I prepare for a quiet retirement with my ducks and goats, but I have this bad habit as a former research scientist of turning almost every freakin' translation project into some kind of study, though my clients are usually spared the knowledge and the burden of results from these studies. Most of my time lately is spent training translators and project managers privately.

So in any case, I am really, really chuffed, as my British friends would put it, to see the memoQ Ideas Portal, which is, I believe, the best venue so far for memoQ users of every kind to tell the makers what we really need.

So take your ideas and wishes for a better memoQ here: https://ideas.memoq.com/

And now I have to get off my ass and contribute, or I'm going to lose a bet and have to pick up a really, really big bar tab at the next memoQ Fest. So won't you join me? At the Portal, I mean. Well, in Budapest at the next memoQ Fest too, assuming we all manage to get our vaccinations and don't die of the next plague or get blown up by all those jihadis for other CAT tools out there....

We need better backups! Vote for this!!!




Jun 5, 2021

Get better dates with memoQ

There's hope for all those incels stuck with RWS Trados Studio, Memsource, Wordfast, OmegaT and a host of other horrors if they are willing to make a change....


Not surprisingly, dates can be a real nuisance to translate and check, depending on the client's specifications. Target language specifications that include the use of elements such as non-breaking spaces can be particularly troublesome. But even apparently simple tasks like writing all dates in the target language as DDMonth-AbbreviatedYYYY or the like can go wrong far more often than expected, and skilled reviewers easily get caught up in the flow of the text and overlook details of format (and often even correct content) in dates. This proved to be a shock to one LSP client who found hundreds of overlooked date errors in a large volume of recently reviewed text.

What's the solution to these inevitable human errors? Proper automation of the monkey work so professionals can concentrate on what they are good at: a fluent text that accurately reflects the intent of the original.


In the case of the QA horror of checking translations from four source languages to ensure that the dog's breakfast of date input formats (which also included day+month and month+year entries) had been rendered correctly regardless of capitalization, memoQ enabled a simple auto-translation ruleset to be created (a few hours' work, including testing and documentation). This was then attached to a project using a QA profile configured to check only against the enabled auto-translation rules, and BOOM! After about a minute, all the date errors in something like 100,000 translated words were revealed. The only false positives found were a few instances where times were written after the date, and the rule can be updated easily to avoid this issue.
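For readers who want to see the principle outside memoQ, here is a minimal sketch in Python. It is not memoQ's auto-translation rule syntax and not the ruleset described here. The month table, the single English input format and the DD-mon-YYYY target format are all assumptions chosen for illustration. The idea is simply to derive the expected target date from each source date, then flag any that does not appear in the translation.

```python
import re

# Hypothetical English -> Portuguese month abbreviations (illustrative only).
MONTHS = {
    "january": "jan", "february": "fev", "march": "mar", "april": "abr",
    "may": "mai", "june": "jun", "july": "jul", "august": "ago",
    "september": "set", "october": "out", "november": "nov", "december": "dez",
}

# Matches dates such as "March 5, 2021" or "march 5 2021", ignoring capitalization.
DATE_RE = re.compile(
    r"\b(" + "|".join(MONTHS) + r")\s+(\d{1,2}),?\s+(\d{4})\b",
    re.IGNORECASE,
)


def expected_dates(source):
    """Return the assumed target format (DD-mon-YYYY) for each source date."""
    return [
        f"{int(m.group(2)):02d}-{MONTHS[m.group(1).lower()]}-{m.group(3)}"
        for m in DATE_RE.finditer(source)
    ]


def missing_dates(source, target):
    """Return expected target dates that do not occur in the translation."""
    return [d for d in expected_dates(source) if d not in target]


if __name__ == "__main__":
    # A March date translated as May is reported as missing.
    print(missing_dates("Signed on March 5, 2021.", "Assinado em 05-mai-2021."))
```

A real ruleset needs one pattern per input format (day+month, month+year and so on) and per source language, which is where the hours of testing go.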

I do a lot of date rule development for many languages, and I've published some of this in simplified forms on this blog. But interesting new tricks come up all the time. And I've found it useful when developing rules for others, who usually understand little about writing proper specifications that capture all the likely source input, to create special screening rules like the one shown in the first screenshot, which can be used to examine an entire large TM imported to the memoQ working grid, and see how the input and target texts vary. I used that expression on two large TMs in a view, and in just a few seconds, my laptop screen showed me all renderings of every English date in those TMs into Portuguese. While researching target formats for some new rules I also found quite a number of errors in the TM which I could have corrected had I cared to.
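The screening idea can be imitated roughly in Python on an exported list of segment pairs. This is only a sketch under assumptions: the two patterns below are simplified stand-ins, not the expression from the screenshot, and the "shape" summary is my own illustrative device for seeing at a glance how many target formats are in use.

```python
import re
from collections import Counter

# Simplified, assumed pattern for English dates like "March 5, 2021" or "Mar. 5 2021".
EN_DATE = re.compile(
    r"\b(?:jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec)[a-z]*\.?"
    r"\s+\d{1,2},?\s+\d{4}\b",
    re.IGNORECASE,
)

# Simplified, assumed pattern for Portuguese renderings such as
# "5 de março de 2021" or "05/03/2021".
PT_DATE = re.compile(r"\b\d{1,2}(?:\s+de\s+|[-./])\w+(?:\s+de\s+|[-./])\d{4}\b")


def shape(text):
    """Reduce a string to its format: every digit -> 9, every letter -> a."""
    digits_masked = re.sub(r"\d", "9", text)
    return re.sub(r"[^\W\d_]", "a", digits_masked)


def screen(pairs):
    """Count the target date formats used wherever the source contains a date."""
    counts = Counter()
    for source, target in pairs:
        if not EN_DATE.search(source):
            continue
        found = PT_DATE.findall(target)
        if not found:
            counts["<no date found>"] += 1
        for date in found:
            counts[shape(date)] += 1
    return counts


if __name__ == "__main__":
    tm = [
        ("Dated March 5, 2021", "Datado de 5 de março de 2021"),
        ("Due May 12, 2020", "Vence em 12/05/2020"),
        ("No date here", "Sem data"),
    ]
    for fmt, n in screen(tm).items():
        print(n, fmt)
```

Segments that land in the "<no date found>" bucket are the ones worth a closer look, since they are either errors or formats the pattern does not yet cover.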

Having rules like this available in the translation phase can prevent quite a few errors to start with. I've found too many cases of dates in March translated as May and overlooked by both the translator and the reviewers. memoQ is - as far as I know - the only tool which will offer such conversions in a results table from which they can be inserted just like any other "terminology hit".

Other tools like RWS TradoZe Studio will allow you to use regular expressions for quality assurance (checking the text), but I'm not aware of any tool other than memoQ which not only allows you to include these checks in customized QA profiles but can also provide them for on-the-fly review from a library of named expressions. That's what memoQ now does in version 9.8 (scheduled for release in June, within a few weeks) as shown in the first screenshot above.

This new rules library feature of memoQ makes it possible for the first time for users who have a life which does not involve wasting brain cells learning to program regular expressions to use the efforts of those who actually like that sort of thing and do it well. So with this tool, anyone can easily check for things like date errors and a lot more without knowing a single bit of regex syntax. That's some progress :-)

Stuff like dates just keeps getting better for memoQ users, leaving them more time for life and better stuff... like other kinds of dates.

If this kind of thing interests you, I think my friend Marek Pawelec may be teaching an in-person course in regular expressions for memoQ in July of this year, and he, I and others (including memoQ's Business Services unit) are available to help you with turnkey solutions to project challenges like those described here.

Jun 3, 2021

A Hebrew abbreviations "hint base" points the way for other languages

Years ago I published a guideline for how to create something like a term base for memoQ that can handle the irregularities one might find in the way German attorneys on tight deadlines might type the many abbreviations they use in crazy ways. The memoQ term base model can't cope with punctuation and many special characters, so it's basically impossible to use it to map something like "US-$" to the standard currency code "USD". But regular expressions in an auto-translation rule can do that, of course.
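To show why a regular expression copes where a term base cannot, here is a small Python sketch. The pattern is an assumption about which typing variants to tolerate, not the rule from my published guideline, and it only fires when an amount follows.

```python
import re

# Assumed variants: "US-$", "US $", "US$", "US -$", followed by a number.
USD_RE = re.compile(r"US\s*-?\s*\$\s*(?=\d)")


def normalize_usd(text):
    """Replace sloppy US dollar markers before an amount with the code 'USD '."""
    return USD_RE.sub("USD ", text)


if __name__ == "__main__":
    print(normalize_usd("Kaufpreis: US-$ 1.500,00"))
```

The lookahead `(?=\d)` keeps the amount itself untouched, so number formatting can be handled by a separate rule.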

The same principle can be used simply to map abbreviations to their full expression so the translator can decode the abbreviation and decide how to render it. Here's an example of that in Hebrew:

This can, of course, be done in other languages, but the fellow who had this idea and asked me about it happens to be a Hebrew translator working into several target languages. I'm tempted to adapt one of my German abbreviation sets to map to the full German expression in the target to serve as an aid to translators who might not be as familiar with the abbreviations as I am and who are also not bound strictly to a particular target language expression. A cheat sheet, basically, or a "hint base" if there is such a thing.

The code for this is particularly simple. Here's a quick look at the resource in an external editor:


The basic "engine" is just a list (#abbreviations#). And the resource was created quickly using search and replace on a list of over 600 abbreviations in an Excel spreadsheet.
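The same list-driven approach can be sketched in Python, assuming the spreadsheet is saved as a two-column CSV (abbreviation, full expression). The German sample data and the function names are hypothetical, and this does not reproduce the memoQ resource format. It only shows the mechanism: one lookup table, one pattern built from its keys.

```python
import csv
import io
import re

# Hypothetical sample standing in for the exported spreadsheet.
SAMPLE_CSV = "z.B.,zum Beispiel\ni.V.m.,in Verbindung mit\nu.a.,unter anderem\n"


def load_pairs(csv_text):
    """Read abbreviation/expansion rows into a dictionary."""
    return {abbr: full for abbr, full in csv.reader(io.StringIO(csv_text))}


def build_pattern(pairs):
    """Build one alternation from all abbreviations, longest first,
    escaping periods and other characters that are special in a regex."""
    alternatives = sorted(pairs, key=len, reverse=True)
    return re.compile("|".join(re.escape(a) for a in alternatives))


def hints(text, pairs):
    """Return (abbreviation, full expression) for each abbreviation found."""
    pattern = build_pattern(pairs)
    return [(m.group(0), pairs[m.group(0)]) for m in pattern.finditer(text)]


if __name__ == "__main__":
    table = load_pairs(SAMPLE_CSV)
    for abbr, full in hints("§ 5 i.V.m. § 7, z.B. bei Verzug", table):
        print(abbr, "=", full)
```

Sorting the abbreviations longest first matters when one abbreviation is a prefix of another, because the regex engine takes the first alternative that matches.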

In the awful memoQ rule editor it looks like this:


Those who know Hebrew may note that some periods are out of place. I'm not an RTL expert, so I had a few issues with punctuation migrating as I moved data from one format to another, but someone familiar with issues like that can fix things without much ado. This was just a quick prototype to demonstrate feasibility. And a few minutes of search and replace work in a text editor beats entering more than 600 pairs manually in the built-in editor for memoQ. It would be nice if that damned editor included list import features that would read Excel files directly!

As with other auto-translation rules, certain characters may need to be represented by entities or uuencoding. The simple rule shown above can also be made more robust by dealing with variable punctuation, for example. Complexity can always be added. 
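Both points can be illustrated with a short Python sketch. The helper name `tolerant_pattern` is hypothetical, and the tolerance rules (optional periods, optional spaces between letters) are just one assumption about what "variable punctuation" might mean in practice. The entity step uses the standard library's `xml.sax.saxutils.escape`, on the assumption that the rule file is XML.

```python
import re
from xml.sax.saxutils import escape


def tolerant_pattern(abbr):
    """Build a pattern that accepts an abbreviation with or without
    its periods and with optional spaces between its parts."""
    parts = [re.escape(p) for p in re.split(r"[.\s]+", abbr) if p]
    return re.compile(r"\b" + r"\.?\s*".join(parts) + r"\.?")


if __name__ == "__main__":
    pattern = tolerant_pattern("z.B.")
    for variant in ["z.B.", "z. B.", "zB", "z.B"]:
        print(variant, bool(pattern.fullmatch(variant)))

    # Characters with special meaning in XML become entities.
    print(escape("AT&T <Co>"))
```

The more tolerant the pattern, the greater the risk of false hits on ordinary words, so it is worth testing each loosened rule against real text.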

Many thanks to the translator colleague who shared his challenge and gave me something fun to do after a grueling day of mapping many messed-up date formats from a lot of different source languages I mostly don't know :-)