Sep 19, 2011

The future is here (artful manipulation)

For personal reasons, I was unable to attend the recent conference on machine translation held in Ede in the Netherlands as I had planned. A colleague, Diane McCartney, was there and has graciously consented to share her impressions and extensive notes from the day, which will appear here in several blog posts, of which this is the first. It was a busy day.

***

I was very skeptical about the conference that the ATA (Association of Translation Agencies in the Netherlands) held in Ede on machine translation. I usually avoid conferences like this because the high level of BS and marketing makes me sick to my stomach. But I don’t believe in opinions based on gut feeling, outdated knowledge and obsolete technology, so off I went.

The day was not at all what I expected. Renato Beninatto’s keynote speech was not half as provocative as I thought it would be, but that may be because there was no one there to talk back. The room was filled with 135 translators and a deafening silence that was only interrupted by a gasp as Renato calmly stated that “quality doesn’t matter”. Talk about setting the tone: nobody talked about anything else for the rest of the day! And no one seems to have heard what he added to that statement almost immediately after he made it, namely “…until it is missing”.

Yes, the agent provocateur had struck again, although I found him to be much tamer than at last year’s Tradulínguas conference where I heard him provoke for the first time. Renato is oh so happy to plant controversial statements in the middle of the room and wait for his audience to react, but the sober and ever-so-practical Calvinist Dutch were unshaken. Maybe that’s because the audience, most of which was between 40 and 60 years old, had heard it all, seen it all, done it all before and before and before. Am I repeating myself? Ah, yes, well, MT, we were informed at the end of the day by the Dutch visionary Jaap van der Meer, was to do away not only with the repetitive tasks translators found themselves confronted with, but with the stupid tasks they were forced to do too. Human translation is stupid, machine translation is smart, long live machine translation. Yes, indeed, the future has been here for 50 years, only we’re all too stupid to do something with it.

We learned that “projects” will disappear and “drops” will be replaced with “drips.” Fully automated, integrated project management and translation environments stored on the SaaS provider’s server will reduce project management to a monitoring activity enabling project managers to focus on exceptions. Yes, we have been working like this for many years – although I don’t see “projects” as such disappearing because one still needs a project number for all sorts of tracking and tracing purposes. SaaS, however, still has to learn how to walk before it can run – try telling that to a software developer – in order for anyone to claim that this part of the future is here.

In terms of translation resources, crowd sourcing is the answer to a quick turnaround time and the best translation of software because the translation is done by the users. I’m sure this works well in environments such as Facebook where adherence to terminology and style are of no import, but what do you do with IBM’s answer to a European request for proposal? Or a bank’s investment strategy? These very real concerns were dismissed with a wave of the hand and a “Sometimes it’s better to apologize than to ask for permission. If you're worried about confidentiality, talk to your lawyer.” The future sure looks bright for Justice.

Translators should focus on services and revenue, not on price: We have to give customers what they want when they want it at the price they’re willing to pay. The recipe for success is differentiation: Give your customers more than they’re getting now and sell an unlimited number of languages. And remember: you do not define what you do; your customers do. Companies need to channel and understand what people of all nationalities and languages are saying about their products on the web. Opinions voiced on social networks are very important because no one reads a company’s marketing material. (So, companies could achieve huge savings by firing their marketing departments, stopping their market research activities and no longer translating the marketing material created to tap into foreign markets. All they need to do is put their name and address on Facebook, fire the marketing department, get a machine to trawl the web and gather users’ opinions, publish them on their Facebook page and let MT translate them into the language of the user looking for information about the company. Very efficient and cost effective.)

So we produce translations of a lesser quality. Who gives a toss if that’s what the customer wants? And what’s quality anyway? A subjective notion that depends on your customer’s definition of it. You may think a translation is too literal, but your customer may think you’re the best translator on the planet. You may love the wit in your translation only to find that the customer is stumped because they have no clue what you’re talking about. Focusing on revenue is not as simple because it means focusing on volume and volume is usually paired with quantity breaks (thank you Microsoft and IBM). But since we do have colleagues who are willing to work for a pittance, it looks like rates are not going to stop falling anytime soon. It will be interesting to see the new pricing models agencies impose on us because they, and not the end customer, will be driving the MT bandwagon until more large corporations start implementing MT themselves. Somehow I have the feeling that it's Trados with another name ….

We talk about the future, Renato explains, because the present is boring or we are totally lost. So in other words, we would be neither bored nor lost if we hadn’t listened to the BS that has been shoved down our throats for the last 20 years. So the same guys who helped us up that creek and made sure we lost the paddle are the same ones who are going to bail us out? Sounds like another rollercoaster ride to hell to me. And what did Renato mean when he said that we had to “make sure we didn’t make the same mistake with Trados by not stopping Jochen Hummel from selling it to companies?” It’s impossible to keep anything secret these days, especially from those driving the development of a new technology! Click on Members on the TAUS Web site and tell me again that we need to keep MT to ourselves.

In Jaap van der Meer’s Translation Business Innovation workshop, we in fact learn that TAUS was founded by companies experimenting with their own MTs. Understanding language is a matter of collecting loads of data and putting it in the cloud. Our culture is one of reciprocal collaboration: I win, you win. MT is a utility that is here to stay: It is a basic human right, a utility, like roads, utilities etc. I am totally lost for words ….

Language is a social experience in which we include and exclude people, invent words and change grammar. MT is therefore not a threat in this area. (Whatever that means, because as far as I can tell, this is the whole problem with MT: The flexibility of language and the flexibility with which people use language. But then I’m not a visionary!)

We are also told that MT has not gained the place it has today because computing power has become cheaper and content has been exploding, but because WE the users have changed: we no longer demand fully automatic high-quality translation, but fully automatic useful translation. We accept poor quality because we need real-time translation of even the most trivial piece of information. (Listen to the BBC World Service’s “World Have Your Say” program and think again. And how will such information be translated into every listener’s language? Will we be wearing the special glasses that were developed so the hearing impaired can enjoy movies at the cinema without inconveniencing those who are not hearing impaired? Or will we be wearing special earplugs? Anything’s possible, I’m sure, but when the latest gadget finally hits the market, who will guarantee that the on-the-fly translation is accurate? Will we have to be afraid of new conflicts arising as the result of a machine mistranslation? Is this the future we’re so keen on reaching?)

Because machine translation is now based on hybrid systems, i.e. a mixture of rule-based and statistical systems, the targeted correction of a text, i.e. MT post-editing will no longer be necessary within the next 5 years. The post-editor will have put himself out of work because MT systems will have learned so much from the corrections. Translation engines are currently being produced in real time. For example, you can upload your documents to Systran to train the MT. But beware of the pirates: Google, IBM, and Microsoft are aligning everything they can find on the web to fill their databases. (I hate to say this, but the stuff that these companies are aligning off the web has in part been created by the same people who are filling the Systran database. How does that make Systran a better engine than Google, IBM and Microsoft? Or lesser pirates? Are Systran and TAUS actually paying you for the stuff you add to their databases?)

Companies need a language strategy. (Considering that all LSPs sell themselves as consultants “with the in-depth knowledge companies need to set up and/or review their translation strategies,” this statement rather surprised me. But then again, I have yet to meet anyone working at an LSP who can explain to me what a language strategy is.) The only thing companies have done so far is force LSPs to reduce their rates and squeeze more words through the funnel. This is all thanks to social media, which they use to do their marketing. They need to maintain multiple language spheres and we need quality definitions – several of them (which means, I guess, a TAUS QA model as opposed to a LISA QA model, which, like the LISA model, will only ever apply to the translation of software). In five years’ time, translation will be really interesting because translators will have choices as opposed to earlier – they will be able to choose to move up the ladder and provide high-quality translations. (Think about this: the guy who wants you to upload your translations into his database is actually telling you that you are currently producing crap. And why should we need translators in five years’ time when MT has put editors out of business?)

Providers of international products want to differentiate themselves by providing very sphere-specific translations, so we need high-quality translations. This means specializing – transcreation, hyperlocalization. (Hang on a second – didn’t we just say that trawling the social networks for users’ opinions was the only thing that mattered because no one reads marketing material anyway? Didn’t someone also mention that MT will enable LSPs to offer more languages in more areas so we could service more customers? And didn’t Renato clearly state in his keynote speech that quality doesn’t matter, and was I really on a different planet when Jaap said that we the users are prepared to accept translations of lower quality? I don’t get it.)

Translation memories are a thing of the past, we are told, totally outdated technology in dire need of replacement. We are moving into the era of the semantic web in which translation memories will no longer be assets and therefore no longer need to be protected like crown jewels. Today’s MT produces much better results than the hopelessly outdated TM technology, which only leverages the segments you enter into it. New tools will soon be available that will have massive leveraging capabilities, and a positive by-product will be that we will be able to preserve endangered and less spoken languages, like Welsh. These languages will not disappear like others. This is all very exciting.

Don’t even bother holding that thought because the most interesting presentation of the day revealed quite the opposite, namely that translation memory maintenance will be more important than ever, because as most of us know, MT does not produce suitable results unless it is combined with a TM. The same goes for terminology management. Why? Because rule-based MT uses multilingual dictionaries to translate and you can’t expect a translation to have any level of accuracy if it cannot draw upon an accurate list of terms. Remember, MT consists of translating words in a dictionary based on rules, and NOT of translating concepts or meaning. Unlike humans, machines don’t know what meaning and context are. So MT may be able to create an accurate translation, but it is not able to create an idiomatic or meaningful one. This is why MT has sucked for the last 50 years and will continue to suck unless ...

We take on the pre- and post-editing challenge. This was the most interesting presentation of the day, but I didn’t know that when I rolled up my sleeves, cracked my knuckles and pumped myself up for the big battle. The presentation lasted 90 minutes, and all I did was nod in agreement the whole time. Dr. Sharon O’Brien from the School of Applied Language and Intercultural Studies at Dublin City University had saved the day for everyone in the room. We all could have listened to her for hours: no provocative statements, no condescending and humiliating remarks, just facts, facts and more facts. Results based on sound research that contradict everything everyone wants you to believe – especially the visionaries.

Time and time again she points out that not many companies have been successful at using MT. Those that are successful have their writers and translators on board and have involved their translators in the process from day one. Enabling the main players to take ownership of the process and carve out new roles for themselves is crucial to the success of MT. She also emphasizes that tight control is the key in every area that touches on MT and that quality issues have to be tackled at the source. This was really the best breath of fresh air I’d had in a long time!

After this extremely interesting and far too short workshop, the conference was closed by Jaap van der Meer. I have to say, the conference was almost worth going to just to hear what he had to say. Here are a few of the jewels I collected:

The 250,000 translators in the world aren’t enough to fill the translation demand, which is why we need MT. True if you believe that everything should be translated, including my English tweets. God forbid!

It’s time for the industry to define standards in the same way the banking industry did to facilitate electronic banking. What are TMX, SRX, XLIFF and TBX? Jost Zetzsche wrote a report on file format standards under the banner of TAUS that provides a clear overview of their status. Looks like Jaap doesn’t even know what his company commissions. In my opinion, it would be fair to say that it is time to review, consolidate and improve the standards, not define them. The comparison with the banking industry is also interesting when you think that it took some 15 years to develop a European standard for EFT and that MAESTRO, the development of which started in the 1980s, has been so successfully implemented across Europe that my Dutch bankcard still doesn’t work in many German stores.

Translators should fill the TAUS database and not be afraid. Why would anyone want to fill a database for a company FOR FREE so the company can use the database to SELL translations of dubious quality? If there is one thing Jaap did not address the whole day it is how TAUS intends to prepare the strings for entry in the database. Pre-editing was not mentioned once.

According to Jaap, machine translation will relieve translators of their stupid and repetitive tasks. I admit there is nothing more stupid than translating articles about the development of new sources of energy or breakthroughs in diabetes research. It’s pretty clear that Jaap thinks that translators do nothing but translate software strings and poorly written help texts. Well, wake up, Jaap – software is not the only market segment feeding translators! And believe it or not, there are translators who have never translated software and have no intention of ever doing so.

I went to this conference ready for a fight and I left the conference understanding what should have been clear from the outset: old business practices were being rebranded and given cool names such as crowd sourcing (pool of translators and editors), MT post-editing (proofreading and editing), and service (quality assurance). In that sense, the future really is here. But surely you didn’t need a visionary to tell you that?

***

Diane McCartney was born in California and raised in Germany where she attended a French-German school. She set up the translation department at ASK Computer Systems, where she used a UNIX program to prepare text for translation and review. Today she is based in the Netherlands and has been running her own company since 1997.

5 comments:

  1. Hello Diane,

    Do you know if there is a summary of Sharon O’Brien's presentation available anywhere?

    Thanks,

    Michael

  2. Hello Michael,

    I took so many notes during the conference that we had to break the article into several posts. One of the next posts will be a summary of Dr. O'Brien's presentation.

  3. More than once you say you went to the event ready for a fight. And yet when Jaap said so many silly things, you appear to have said nothing in response.

    Is this the same Jaap that started his career as a translator, that owned and ran translation companies for over 20 years....who is a regular speaker at events...

    And he failed to convey any nuances..the team will have to talk to him about this...

    Rahzeb Choudhury
    TAUS

  4. Dear Rahzeb,

    I stated exactly once, namely at the end of the post, that I had gone to the conference ready for a fight. Once the sessions had started, I quickly realized that it made much more sense not to interrupt Jaap: I could have disrupted his train of thought you see, and I had no intention of doing that.

    We are indeed talking about the same Jaap.

  5. Dear Diane,

    The future is not here. The present is catching up with the past. Machine translation has been here for decades. Most people had never heard about it until now, when it is becoming more widespread. I have been using it for decades. I worked with a large multilingual team of translators for very large projects.
    We prepared the machine to get a more accurate output, created the automatic glossaries and suggested better ways of writing for machine translation. Then we ran the translation through and sent that text to post-editing. Post-editors fixed the sentences as needed (easier said than done) or offered a new translation for sentences beyond repair. After the usual several readings and touching up that all translators or editors do in standard or MT jobs, the translated text was always followed by language QC (i.e., standard editing of the post-edited text) and a formal proofreading. The DTP department received the formatted text and adjusted it as needed. This semifinal translation came back for a final proof, spell-check and QC by the team of post-editors and editors/proofreaders. The DTP department did another QC and sent the final formatted text (software file and printout) to the customer for approval. If there were any changes, we introduced them and updated the machine as well as the automatic glossary so those changes would be applied in future projects.

    We had tables of words per hour for each function that were more realistic than some mentioned here, even though words per hour tend not to represent the actual time needed.
    There were many issues with the text, of course. MT always has its limitations. You have to work with the best post-editors to avoid transfers, wrong prepositions or word order and many other details that would look odd to native speakers reading a publication.
    I did this type of work before CAT tools came into being.
    No, the future is not here. The past has arrived and become more widespread.

