Dec 12, 2013

In HAMPsTr We Trust?

So many times, when I hear the bright and happy predictions of commercial interests who spout nonsense about "translation as a utility" and hope to feast on the roadkill of communication, who claim the highest of motives and show the basest motivations in their real acts, I hear a saxophone in my mind and a strained voice declaring that some day "they may understand our rage".

Machine pseudo-translation (MpT) and human-assisted machine pseudo-translation (HAMPsTr) are big business for the profiteers offering pseudo-solutions which typically start in the low six figures of investment. "Get on the MT boat or drown!" declared one such profiteer, Asia Online CEO Dion Wiggins at his unfortunate keynote presentation at memoQfest 2012 in Budapest.

It seems that each week a new story line appears to justify the linguistic lemmings' rush over the cliff. Recently I heard for the first time how translators suffer from the "blank page syndrome" (note: as of 25 December 2013 the entire blog with that "blank page" link has disappeared) and need machine-generated babble for inspiration. I thought perhaps I was just an odd one, usually struggling with many ways to render a text from German into my native language and trying to choose the best, but experienced colleagues I asked about their fear of blank pages all asked me if I was joking.

This morning another colleague sent me a real screamer:
"Smaller language service providers (LSPs) process fewer words than larger ones... [this] puts them at a disadvantage when it comes to leveraging linguistic assets due to the smaller size of their terminology databases and translation memories (TMs). These less comprehensive language resources limit reuse on subsequent projects or for training statistical machine translation (SMT) software."
The author of that particular bucket of bilge is Don DePalma, head of the Common (Non)sense Advisory, an organization rightly seen as incompetent to interpret even third-grade level mathematics in their discredited report of dramatic rate decreases for translations, which turned out to be an artifact of calculations involving mismatched survey populations. In any case, the idea that small translation agencies or individual translators, who are generally more aware of and concerned with their clients' business, are at any disadvantage by not being buried under mountains of monkeyfied mumbo-jumbo from bulk trashlation nearly ruined my keyboard as I spit my coffee laughing. Don deserves an extra Christmas bonus for that transcreation of the truth.

But the best was yet to come:

This inspiring graphic accompanied an article on how to motivate those involved in post-editing MpT in the HAMPsTr process promoted by Asia Online and others. There has been some vigorous and interesting speculation on where that arrow is pointing :-) The colleague who sent the link to me commented:
An interesting read from a humanitarian perspective. If they need to go to these lengths to "motivate" people, even those who are otherwise happy to swim in the muddy, toxic pond that these LSPs (your definition of the term) have created, one would have thought that they will understand that there is something wrong with their concept and goals. But why let the facts get in the way, I guess.
Indeed, those swimming in the pond do seem to have some real issues, even in cases frequently quoted as a HAMPsTr success. I long ago lost count of how many MpT advocates have told me of the wonderful words at Microsoft and Symantec, nicely extruded from controlled language sources and lovingly shaped into their final sausage form by happy hamsters. But this TAUS presentation by a Symantec insider tells another story:

And further indications that we are all getting mooned by the MpT Emperor can be heard in the excerpts of this recent GALA presentation in Berlin:

Unlike some of my colleagues, I have no fear of being replaced by Mr. Gurgle or any of his online Asian cousins, however well trained. What provokes some rage in me and more than a little concern is the callous dishonesty of the MpT profiteers and their transparent contempt for truth, the true interests of modern business and the health of those involved in language processes.

I have no little sympathy for the many businesses and individuals struggling to cope with the challenging changes in international business communication in the past 20 years. Nor do I feel that MpT has no role to play in communication processes; colleagues such as Steve Vitek have presented clear cases of value for screening of bulk information in legal discovery to identify documents which may need timely human translation and other applications. Kirti Vashee of Asia Online has commented honestly on numerous occasions on his blog and elsewhere about the functional train wreck of most "automated translation" processes one encounters, but still cannot take proper distance from the distortion and scaremongering practiced by the head of his team and others.

I am particularly concerned by the continued avoidance of the very real psychological dangers of post-editing MpT, which were discussed by Bevan and others in the decades before the lust for quick profits silenced discussions and research into appropriate occupational health measures. If Asia Online and others are truly concerned with developing sustainable HAMPsTr processes, then let them fund graduate research in psychology to understand how to protect the language skills and mental function of those routinely exposed to toxic machine language.

All this disregard for true value and truth reminds me so much of my days as an insider in the Y2K programmers' profit orgy: we all knew it was bullshit, but all the old COBOL programmers wanted to take their last chance to score big before they were swept into the dustbin of history. Some 60 years or so after it began, is machine translation ready to assume its place in that bin? The True Believers and profiteers will loudly say no, but at some point the dust will settle, the damage will be assessed, and we will find that the place of MpT is not at all what many imagine it to be today.


  1. Great post, Kevin!

    The messages the illustration sends are many. The illustration is self-explanatory, I would say.

    Why do our colleagues continue buying the lie of MpT as an opportunity if it has not one but many negative sides? I fail to understand.

    Thank you so very much for voicing so well the concern of many professional translators.

  2. Great article, Kevin.
    Very well written.
    I can feel your rage.
    Olivier den Hartigh
    English to French Translator

  3. Kevin, just reposted it on my FB wall with the following header: A brilliant assessment--by the prince of translation gab, Kevin Lossner--of the rampant "McCarthyism" that slimy, bottom-feeding MT hawkers in the translation industry are practicing. The message: Don't believe everything they're telling you. In fact, don't believe ANYTHING they're telling you.

    1. Oh, I believe some of what they say in those video clips linked here. The bits about problems finding a compensation model with which posteditors are happy at Symantec, for example. When I hear that sort of thing I always wonder what's wrong with paying the victims... uh, I mean the workers... by the hour as in most normal jobs. The answer, of course, is in that word the good lady repeats so often: "discounts". It's all about squeezing the last drop of blood, not from stones, but from the human bones gnawed to nothing by this dogs' technology.

      If, as some declare, there is such a great need for this HAMPsTr process, then it should be quite a commercially viable matter to set up large, "professional" galleys of post-editors rowing in MpT service in the language combinations needed most. In these hard times in some parts of the EU and neighboring areas, it should not be hard to find well-qualified university graduates who might be otherwise engaged waiting tables or cleaning toilets as guest workers in London if they are lucky. Let the capital interests behind MpT technology set up shop to support these HAMPsTr processes under conditions subject to health and safety laws of a First World country, and let's see where that leads, why don't we? Would anyone care to make wagers on whether this will happen and what the outcome will be after a few years? Anyone?

  4. Just to clarify, Kevin, I saw "don't believe everything they're telling you" as your message. Mine, to the readers of my FB wall, is "don't believe ANYTHING they're telling you," because the wholesaler/commoditizers habitually lie through their teeth.

  5. Kevin,

    While there are some overzealous MT proponents and some who even appear to be deliberately providing misinformation, there are also others who are trying to determine how this technology can be deployed in a productive way. I think it is more useful to be able to tell who is who rather than lump them all together.

    For the record, the graphic above, which you imply is from Asia Online, is ACTUALLY from KantanMT (an instant do-it-yourself MT option that I often warn users about in my blog), and the generic message that they propagate about "motivating post-editors" is quite different from what Asia Online recommends as best practices for post-editing.

    The Asia Online views on recommended post-editing practices are more accurately characterized in this post or this one. Dion’s statement was also very clearly directed at LSPs and not translators, and I understand that you disagree with him, as you have made this point repeatedly.

    The dialogue (if one is even possible) would be more constructive and your point more credible if the facts you presented were more accurate. It is unfortunate that the state of affairs in much of the translation business is mutual disrespect between translators, LSPs and buyers. While it is important to bring about change and raise awareness about things that do not work in the business, I doubt very much if “rage” or continued disrespect are means to bring about a change that any of us would find desirable.

    1. Sorry for the confusion over the graphic, Kirti. I didn't say it was from Asia Online, merely that it represents the process promoted by Asia Online (Human Assisted Machine Pseudo-Translation, aka HAMPsTr). The particular "other" you mentioned here (KantanMT) lacked the redeeming social value the courts spoke of long ago in pornography decisions, so I didn't think giving the rest of their obscenity any exposure would serve much purpose. You perform a good public service when you warn people against them and some of the other carnies out there picking pockets.

      As for "lumping" all the providers together, in this case it is entirely appropriate to do so. That is because, though you differ in the technical details of the systems promoted, you are all, in fact, promoting human post-processing of fractured, unnatural language, and the outcome in each case will inevitably be the same: psychological degradation of the word workers, high setup and maintenance costs better invested elsewhere and a text where clients may become so fatigued dealing with its unnatural structure and "unimportant" errors left in that the smart ones will eventually give up and save their money by hiring someone competent in the source language to abstract the originals instead. I was quite surprised to find that one of my colleagues on Stridonium has built a solid business doing exactly that, because some of her clients found that their MpT results, though intelligible, simply burned them out in real use. (So I suppose one could consider the final text consumer as a victim as well; I had only considered the post-editor.)

    2. It's not like the old days when the New Basin Canal was built in New Orleans and as the cheap labor from Ireland succumbed to the effects of the work and Yellow Fever it could be used as backfill for the roads. The great silence by "solution providers" on matters of occupational health for post-editors is damning. If they continue to push these systems at €100K or more apiece without offering clear, tested guidelines for maintaining the mental acuity and general work fitness of those running on their fancy wheels, then no few may learn to speak of them as final solution providers. And we really don't want that.

      It's clear that Dion was speaking to LSPs and corporates in the crowd that day in Budapest and on other occasions, because those are the ones with the six-figure sums he hopes to frighten them into handing over. Threatening someone with drowning is not what I would expect as the start of the responsible "dialogue" called for. Indeed, it's a rather thuggish thing to say, and disrespectful not only to the owners of the pockets addressed but to the collateral victims (word workers) as well.

      That TAUS clip featuring the woman from Symantec also makes it clear that even one of the frequently cited "success stories" for MpT use in its most obvious domain of application (IT) has achieved no good, sustainable solution for compensation. I usually avoid all those discussions of "underpaid hamsters", because after what happens to their minds with continued exposure to toxic text, I'm sure they'll be fit for top earnings in non-linguistic activities, but apparently those still trying to make a go of it are deeply unhappy. Why are there so many discussions of fancy edit distance formulas and other bizarre means of calculation just to figure out a paycheck? Pay their warm bodies by the day, by the hour, by the week or month at a wage rate commensurate with skills and experience. Companies figured that model out long ago and it continues to work in most cases, so why is it that those using MpT processes are so reluctant to use it? Is it because they don't see word workers as more than backfill for canal-side roads? Let's hope not. But the useless talk of "discounts, discounts" that we hear from people like that Symantec woman makes it clear that people are far from the focus. And where that is the case, the process is simply not sustainable. The short-term slash-and-burn approach to text production supported by MpT and HAMPsTr processes may eventually leave us with a wasted landscape of language talent which is as fit to support business as the Moon is to support life.

      My "facts", Kirti, are accurate. And if you are looking for disrespect, go listen again to the words from the MpT advocates in the clips here or the ridiculous statements by someone like Jaap van der Meer. I don't make this stuff up. I just quote it.

    3. I don't understand your perseveration on the payment model. Do professional (human) translators charge by the hour/day/week/month? That has not been my experience, although admittedly I have only contracted for translation services a handful of times.

    4. "Perservation"? What, pray tell, is that? There are many services for which translators charge based on time, as anyone familiar with the sector knows. Review activities (editing, proofreading, etc.) are typical examples of this, though those at the lower end of the spectrum try to shift risks onto the hamsters by pushing piece rates independent of quality or actual effort. In fact, in recent years, some of the service providers at the very top end of the profession have gone almost entirely to hourly charges. In some of the very complex workflows that may be involved in the production of annual report translations, for example, this may be the only reasonable way to handle remuneration, especially when text updates may occur constantly throughout the period of a project.

  6. Great post, Kevin. I take your point that prolonged exposure to MT may have an impact on language skills, but I fear it may be years before this is proven. Large-scale longitudinal studies are expensive and style is hard to quantify objectively, so impact must be hard to measure.

    In the meantime we find some translators are faster when they post-edit MT compared to translation from scratch so the perceived utility for these people will probably outweigh the perceived or even proven risk. It is vaguely depressing but I try to remain optimistic. Look at the history of electricity. It is (also?) carcinogenic but useful. Eventually, richer countries learned not to build houses under high-voltage pylons. I am not aware of any translator that only post-edits but I do know that post-editing is a growing niche so agencies may have to be careful not to turn budding enthusiastic translators into dumbed-down "demotivated" post-editors.

    Unfortunately, your suggestion of per hour payment has been tried but it doesn't work. In practice no LSP will allow a translator to translate or post-edit one word per hour. A common model is X€'s per hour @ Y words per hour but this is, of course, a poorly disguised per word pricing model. In short, the dog continues to chase his tail.

    So let's collaborate and find a solution so the mutt can finally get some rest!

    What we see in our data is that, except in exceptional circumstances where the MT is super-duper high quality, some translators are faster when they post-edit relative to translation from scratch and some are not. This is always my mental starting point when I think about the pricing problem. It makes sense. Humans come in all shapes and sizes, and just as some people are good at sprinting or endurance running, I think some people look at a garbled text and see a potential translation while others just see junk or, worse, toxic junk.

    Training and practice may help those who want to give MT post-editing the old college try for a few months, but I suspect in many cases this is a waste of time. Luckily, MT is such a small niche that these translators will always have work, so far from dying out like the dinosaurs, they are likely to continue to thrive (no matter what the CSA write).

    Just as you see these translators who are willing to post-edit as hamsters they may see you, rightly or wrongly, as a lumbering sauropod. As a trainee researcher, I prefer to watch from the sidelines and bean count.

    The solution to this problem that is proposed by Asia Online and similar providers is to provide better MT so that even you Kevin might be won over by high quality MT output. It is a seductive argument and mildly effective for some people or for some accounts where there is a large budget but in your case I am going to speculate that better MT is unlikely to win you over, not least because texts that are that easy to translate lend themselves to Dragon Dictate.

    Unfortunately, for the moment it is quite cheap to produce MT of quality X but expensive to produce MT of quality X+Y, as it normally involves sourcing large quantities of in-domain training data or writing lots of rules. This is how MT companies on the consulting side of the spectrum make a living. Kantan and Microsoft Translator Hub inhabit the opposite end, and both niches are valid. Warning people about competitors is a time-honored tradition in business, so I suspect Bill Gates and Tony O'Dowd will take Kirti's concerns in their stride.

    Soooo...getting back to pricing models and conveniently ignoring light post-editing (a valid but hard-to-define niche within a niche), if we assume some kind of QA process after post-editing, how do we stop the dog from chasing his tail in the "pay by the word or pay by the hour" debate?


    1. Sorry, ran out of space on the last post, this is a continuation...

      The solution is to...

      Allow translators to choose whether they post-edit or translate from scratch on each job. Most LSPs and Translation Management Systems have some sort of job portal to disseminate jobs, so it can just be a checkbox on an HTML page.

      After a few jobs if a translator prefers to post-edit he can negotiate a discount on new words that both parties feel is fair based on real or perceived speed improvements and stick to that discount for future jobs on the same account. Remember, he can always revert to translation from scratch at the full word price.

      Badda bing.

      Sadly, this blindingly obvious solution is not my idea; it comes from a well-regarded Madrid LSP called Celer Solutions, and it seems to work well for them on regular accounts. It doesn't work on long-tail client profiles for companies like who make heavy use of Google AdWords, so it is not one size fits all.

      Also, it is not a panacea, as it is likely that any LSP that implements this pricing model will favour translators who choose the post-editing option, but at least it provides a pushback mechanism in case MT quality suddenly degrades, e.g. because an end client hired a new technical writer with a distinctive style.

      I suspect the only reason why this model is not mentioned by Asia Online or TAUS is that it doesn't force MT down translators' throats. At least this is how Dion responded when I mentioned the model in a LinkedIn discussion. I think his words were "buyers prefer to maintain more control of the supply chain" or words to that effect.

      Remember, MT is normally provided by large translation buyers and this cost has to be justified by discounted translation costs but in any industry where you get fair discounts you also get unfair ones.

      It should always be up to a translator to choose what technology aids he or she uses; otherwise you end up shoving a square brick into a round hole.

      p.s. I was in the room when Katrin Drescher gave that presentation at that TAUS event. In fairness, she is not really describing conditions at Symantec but giving a true synopsis of what was discussed in the break-away discussion group that, as it happens, I also attended. Most people agree that the current pricing model is temporary, as MT is a new technology, so buyers and agencies are still finding their feet.

      Disclaimer: Symantec is an industrial partner of my employer, where I work as a Ph.D. student. The above implication that MT may be carcinogenic does not represent the view of my employer and is, in my own pScientific view, pretty bloody unlikely.

  7. Drum roll!

    Let translators decide on a job-by-job basis if they prefer to post-edit or translate from scratch. In practical terms this takes the form of a checkbox or radio button on a web portal. These are often bespoke portals developed by larger agencies or off-the-shelf solutions like XTRF, Plunet or XTM. Small LSPs can just use a spreadsheet.

    If a translator chooses to post-edit on a job, a discount is applied to the new words in the job. This discount has been negotiated in advance based on his or her perception of their working speed improvement using MT after a period of post-editing at full price.

    Badda bing!

    This is not a panacea that will protect translators from evil MT, as LSPs will naturally gravitate towards translators that provide the discount. They are cheaper. However, it does at least provide a pushback mechanism should MT quality degrade for any reason, e.g. a new technical writer on the buyer side, and it is certainly fairer than current models that are predominantly unilateral.

    Also, it is not a one-size-fits-all solution. A low-cost agency called tried it, but it does not suit their long-tail client base (where marketing is done mainly using Google AdWords). It does, however, work on large regular accounts, as is indicated by the success of the MT post-editing program within the agency that pioneered it, Celer Solutions in Madrid.

    Beware of vested interests. When I mentioned this model on LinkedIn, Dion Wiggins remarked that he didn’t think it would work, as MT clients prefer to have more control of the supply chain. Rubbish! It works for Celer: 30% of their turnover is post-editing in a market where 5% is closer to the average. I suspect we will never hear this model from companies like TAUS or Asia Online, as it does not involve shoving the technology they sell down translators’ throats.

    Remember, most MT is produced on the buyer side. This is where the bilingual data needed to train Statistical MT systems is collated and it costs engineering hours. This cost must be recouped so a discount request is inevitable. The problem is that in any industry where you see fair discounts you also see unfair ones. Worse still, what is fair for Mary is not fair for Bob.

    In the end, within reason, I feel it should always be up to a translator to choose how to get a job done.

    Disclaimer: my implication that MT may or may not be carcinogenic does not represent the view of my employer. In fact, in my own trainee pScientific opinion it is pretty bloody unlikely.

    p.s. Katrin Drescher from Symantec (one of our industrial partners) sat directly to my right in the TAUS breakout group she is summarizing so I was privy to the original discussion. I think she was pretty forthright about the fact that pricing models are still evolving and most people I know accept that we are in a period of transition in that regard. In fairness, her summary of the discussion is pretty unbiased and she is a trained translator herself. At least Symantec are trying to examine the issue. Some big buyers that use MT don’t care about anything other than price so quality has to suffer.

  8. While I can certainly understand taking offense to the HAMPsTr designation, the fact is MT is here to stay. Today Google Translate is the largest translation service provider in the world, with over 200 million monthly users. Google claims they "translate more data in one day than all the translators in the world translate in a year". It's also true that many translators use Google Translate to create their first draft when translating. The lines between computer translation and human translation are blurring. Your post reminds me of the angry comments I heard countless times many years ago when translation memory was first introduced. Translators have never been early adopters of new technologies; they seem to see them as threats to their livelihood. I understand that; disruption and change are never easy. After speaking with countless end clients, who are the ones that determine our future, I can tell you that translators' biggest threat is not MT but their inability to meet these end clients' business needs. Translation is viewed as a commodity because too often clients perceive little difference in translation quality from one supplier to the next. And that's one of the reasons why MT has flourished. While there's no doubt MT produces repetitive incomprehensible translation segments and that some MT suppliers oversell how to use MT, humans introduce variable translation errors of their own and oftentimes oversell their own capabilities.

    Recently Google has begun partnering with more LSPs and has incorporated human translation services into offerings like YouTube. The near future looks pretty clear to me, and it includes both MT and human translation. Having said that, there will always be room for the handmade craftsman who delivers value that clients can actually appreciate.


