Mar 19, 2011

Who's afraid of the big, bad MT post-editing job?

Still somewhat groggy without my third double espresso eggnog latté of the morning, I was woken up quickly by an interesting tweet from Andrew Bell, curator of the translator's social network The Watercooler. It pointed to a PDF file with eight slides summarizing the results of a global TAUS survey on MT post-editing by translation agencies.

The data are interesting and could be spun quite a number of ways. There were altogether 75 respondents out of God-and-TAUS-know-how-many-approached. I presume the latter figure was mentioned in the live presentation. About half of these actually provide such services to their clientele. Given all the talk about MT and "our" post-editing future, what fraction of business do you expect this activity to represent among the respondents? For 86% it was less than 10% of current revenue. I'm guessing much less. A mere 1.8% reported in the 26 to 50% range and none above that. Reality Check #1. The tsunami of MT hasn't hit the profession, nor do the waters appear to be retreating at the beaches. So don't head for the hills just yet. I assume that a certain company which MT'd and post-edited some fifty kazillion Wikipedia articles into Thai was among the respondents, and even that visionary firm didn't break 50% for MT-postedit revenues. So, people, there probably is other work out there in at least the near future.

Reality Check #2: The majority of respondents also reported little or no increase in such business in the past year. Most (75%) of the "post-editors" are in fact part of the regular stables of translators at these agencies, but I suspect these workhorses are those unable to digest better feed and make more than horseshit out of it. Or they are starving. Or a bit adventurously masochistic and can't find where they left the cuffs and leather whip.

If you want to surf the MT-postedit wavelet, do so by all means. It could be an interesting diversion, and as with any path taken in life, I expect you'll meet some people, possibly even ones worth knowing. But if this isn't your sport, don't think you're doomed to starvation and oblivion.

I don't know how large the total market is for my German to English translation pair, much less for any other combination. An estimate of 100 million euros would be far too low based on a quick napkin calculation of how many translators it supports and what they probably earn on average. Some billions I expect. In any case, even in a relatively small "market" for a common language, a translator would need to capture only a minuscule fraction of the business to make a moderate to good living. I suspect that even our blogging overworked American translator serves somewhat less than 0.001% of the German to English market. The line between starvation and survival and between survival and prosperity is not drawn by the available volumes in the FIGES markets nor even really by the average rates charged in them. Nor by skill as a translator beyond a certain reasonable standard to be expected in a particular discipline.

If you want to earn "enough" as a freelance translator in major language pairs and avoid the ball-and-chain future of an MT posteditor, and you have sufficient linguistic and personal skills to avoid embarrassing yourself or anyone who recommends you, getting "there" is mostly a matter of organization, discipline, initiative and a good service attitude. My attitude is mostly bad, and I still do OK. And I strive to improve that attitude by shooting and skinning pigs instead of customers that may appear to share some of their characteristics. Find a way to get the message of fair service to those who need it and you'll have the freedom to worry about other things, like whether the apple grafts in the garden will take. If you're clueless as to what to do, I can't help you. But I can offer you a link to 25 things translators should never do.


  1. Kevin

    Thank you for bringing this survey to my attention. I can confirm that Asia Online did NOT participate in the survey you refer to and that we are not a member of TAUS or the TDA.

    Most of our revenue comes from translation initiatives that simply would NEVER BE UNDERTAKEN were it not for the possibilities (cost, time, quality) created by our “human steered” MT engines. Thus to a great extent we extend the scope of business translation, rather than take work away from the human translation workload.

    I actually agree with you and others that say MT is not best suited to replace the SAME EXACT WORK that was previously done by human translators. (There are usually good reasons that this is done by humans.) This is a mistake that many LSPs and those in the survey you mention often make, thus it is not surprising that the adoption is very low. Square pegs in round holes? MT can make sense in large volume documentation projects when the material is highly repetitive, but not so much if it requires nuance and TEP levels of accuracy and the content differs from project to project.

    Companies like Microsoft spend hundreds of millions of $$$ on human translation services but this only covers a tiny fraction of the content that they actually translate and make available to their customers. Probably 90% or more is MT. However they always use humans to translate "critical" content like security and financial transaction details.

    We have long crossed the point where MT is producing more translation than humans; you just don’t see it because what MT does was never done by professional human translators in the first place. The free MT sites are probably doing more translation in a day (in terms of words) than the total professional translation industry is doing in a year. You may think that what they produce is crap, but there are now tens of millions of people who use these services daily, as a matter of course, to expand their reach to access information that is not available in their own languages.

    The Thai Wikipedia project that I have been involved with is an example of the new possibilities of man-machine collaboration. It is the fastest growing site in Thailand in terms of traffic. Even though only 1/3 of the gazillion articles we translated have been indexed, we already have 25% of the traffic of the most popular site in Thailand today. Given current growth rates, the Asia Online Wikipedia site will be the most popular site in Thailand in six months. So in spite of some of the translations being really bad, there is enough quality to make the information useful to the people who visit the site today. We have hardly begun the community assisted post-editing work to continue to improve the quality. It will get better. The potential of this technology to address the digital divide is significant. We have only begun to explore this as we intend to add millions more pages.

    I hope that the relationship between human translators and these new technology initiatives evolves beyond the current level of discourse. There is definitely a role for some translators in steering these engines to higher levels of quality, as most of what we see today is only what is possible when a bunch of NLP geeks drive the bus. The best systems will come with better linguistic steering, when more competent linguists drive the bus, which is beginning to happen.


  2. Well, Kirti, Wikipedia being what it is, I suppose the user community will iron out a lot with time. At least the English pages seem subject to endless revisions.

    I think you answered one question in my mind about that survey. I was puzzled at the relatively low number of responses. Perhaps this was a members-only survey. If AO wasn't asked to participate that seems likely. I just assumed you must be part of that data set. So where would AO fall on the graph?

    Even "human steered" MT for the mass market doesn't appeal to me personally all that much; I would probably get more out of steering a Trabi in the Indianapolis 500. But I'm sure there are humans who would be willing to do such things and even find them fulfilling in some way I'm probably not capable of understanding. The world is a strange and diverse place. I noticed you also found Steve Vitek's comments on MT interesting; his perspective for Japanese patent work makes some sense to me and isn't a lot different from me applying some of the electronic dictionaries I've built over the years.

  3. Kevin

    The survey was open to everybody who cared to respond and was broadcast widely to get feedback. I felt the questions were not likely to lead to useful answers, since I do not think that a good focus for MT is to do exactly the same thing that humans have been doing for years, so I did not respond, but many (or not so many) obviously did.

    MT's greatest potential is to expand the scope of business translation rather than be used as another rationale to pay translators less for doing the same thing. There are a lot of fools out there who think it is a "replacement for HT". It is not if you care to deliver the same quality, but if properly used it will make NEW high-volume projects economically feasible at lower than TEP quality but still good enough levels. It can also be a productivity aid to translators as Steve describes or even in the way TM is to some. A good MT engine can provide higher quality fuzzy matches than TM in addition to providing terminological consistency.

    Since the survey had such a small focus on the NEW kinds of translation projects I did not see much point in responding. Also, at AO we probably have less than 10% of what we do focused on the kind of projects that seemed to be the focus of the survey.

    I expect that for quite some time that market will remain small.


  4. Interesting stuff. MT post-editing has never even been on our radar and we've never thought to offer it as a service. We're not afraid of it :), but suspect we are better off doing more lucrative translation work for our direct clients. However, it's always worth exploring, isn't it?

  5. J&D: Always worth exploring? That reminds me of a remark I was told Voltaire, a vicious critic of gays, made after being caught coming out of a male brothel. "Once: a philosopher; twice: a pervert!" There are enough people getting screwed by MT, thank you... I'll pass.

    Actually, Kirti's comments above sit well with me and make a lot of sense. They also are in line with the experience Steve Vitek described, where attorneys or researchers use crappy MT to get a "gist" of a document and see if they might need a good translation. It is indeed very plausible that this will lead to more human translation business, and if this is the case, leave a little room for me on the bandwagon.

    Those who think MT will replace quality HT in my lifetime or my grandchildren's ought to consider carefully their patterns of substance abuse.

  6. MT #1: I'm not quite sure why people get so excited about the alleged "opportunities" of post-editing MT. Essentially, it's not that different to revising human translations (which can be a joy or a nightmare, depending on the input quality). Sure, MT tends to make different types of errors than human translators, and this in turn requires different revision/editing techniques some of the time, but what it comes down to in the end is revising a translation to the point where it is error-free and stylistically appropriate.

    MT #2: I also can't understand why anybody could be willing to accept MT post-editing work priced on a per text unit (word, line, etc.) basis (this also applies to revising human translations, of course). This simply means that the client (e.g. agency) is transferring the entire quality risk to the post-editors/revisers, whose reward is completely out of kilter with their risk. The only equitable model for pricing post-editing/revision is on an effort basis, i.e. by the hour.

    MT #3: For German-English translations, we’re in the rather fortunate position that this language pair has traditionally been one of the most difficult for MT systems, irrespective of whether they’re rule-based or statistical. Even the hybrid systems I’ve seen struggle to produce anything remotely approaching the output quality that can be achieved with many other language pairs.

    Market size: The market for German/English (both directions) financial translation (including “financial/legal” translation) is worth at least EUR 100 million a year, quite possibly a couple of hundred million euros (in terms of value added, meaning price to end client minus cost of outsourced input translations). A reasonable extrapolation therefore gives an overall market size for this language combination of at least EUR 1bn a year (across all national borders). Yet another reason why we need some hard economic analysis of the translation industry (at both macro- and micro-economic levels).

  7. RB, thank you for your insights. I'm very pleased to see you and some others re-emphasizing important points like piece rates versus real earnings lately. Although my perverted principles would have me swallowing rat poison before I translate a chemical procedure at 10 cents/word, frankly I would probably earn more per hour doing that instead of a highly complex legal text at 30 cents per word, though the latter might be more intellectually satisfying. People of good sense will keep such things in mind, but, alas, they are few.

    "MT #1: ... Essentially, it's not that different to revising human translations ... it comes down to in the end is revising a translation to the point where it is error-free and stylistically appropriate."

    Or simply retranslating if it's complete crap. But you're right, that issue applies with humans or machines, though recent published studies showing that even highly capable editors cut MT texts (or texts they believe to be MT) more slack than they would a "human" translation do concern me somewhat.

    "MT #2: I also can't understand why anybody could be willing to accept MT post-editing work priced on a per text unit ... the client (e.g. agency) is transferring the entire quality risk to the post-editors/revisers... The only equitable model for pricing post-editing/revision... by the hour."

    AMEN!!! I have never revised at piece rates, and I never will. It's simply bonkers. Of course I feel most revision work is bonkers anyway, but that's because I prefer to produce the bad originals myself.

    "MT #3: For German-English... this language pair has traditionally been one of the most difficult for MT systems..."

    That is, you know, because the Germans themselves are traditionally the most difficult. I know. I was married to one.

    "Market size: The market for German/English (both directions) financial translation (including “financial/legal” translation) is worth at least EUR 100 million a year, quite possibly a couple of hundred million euros (in terms of value added, meaning price to end client minus cost of outsourced input translations). A reasonable extrapolation therefore gives an overall market size for this language combination of at least EUR 1bn a year (across all national borders)."

    More or less my gut feel on the volumes, too, though I think your 'at least' figure could probably be doubled and still be 'at least'. When you brought out the need for an economic analysis of the 'translation market' in Corinne's discussion of earning 'enough', I sat here in my chair affirming that loudly. However, given what we both know about the good business to be found, do we really want more blood in the water for wannabe sharks? I don't need the broad picture, however much I would enjoy looking at it and pointing out all its flaws. I know what I want is there in the so-called market, waiting for me to enjoy it when I care to.

  8. Great writing and amusing too as always Kevin. My personal slant is that there are still sufficient and smart enough brains working on MT for it to a) improve in terms of quality compared to the present and b) increase its segment of the market; however, I think that MT's primary focus will remain those areas of translation where "gist" only is required/will do, or some forms of highly repetitive technical stuff, where a rough MT translation can be post-edited by a human translator and a specific in-house TM created. I also believe that fields such as the medical-pharmaceutical area in which I work will remain relatively unaffected - unless you know an MT engine that can translate badly written, third generation fax copy patient notes!

