Showing posts with label MT post-editing. Show all posts

Jun 24, 2022

The Invisible Hand of Social Media

 


We've probably all been there by now. The evolution of social media platforms has seen the "rise of the machines", with artificial (non-)intelligence programmed by unintelligent humans monitoring and managing our actions, sometimes with little or no possibility of pushback on our part, or with correction processes that waste inordinate amounts of time. We see this on Facebook, LinkedIn and other media which are too often also a major venue for business communication and technical support for professional tools we use. In the case involved with the screenshots above, my "wicked" comment was made with regard to an economic theory promulgated by the long-dead white Englishman, Adam Smith, and I was given the opportunity to protest to an external board after the fourth-world budget help confirmed that I was indeed a bad person promoting terrorism. I wrote:
There was a discussion about the abuses of capitalism and the flawed thinking of Adam Smith's "invisible hand", which he thought guided markets to do the right thing without control. As we all know, it is necessary for societies to exercise some control to protect health and safety, the environment etc. Blind belief in the invisible hand too often leads to tragedy -- in a sense, this invisible hand concept points the middle finger at us humans too often. So I suggested a metaphorical amputation of a metaphorical middle finger on a metaphorical invisible hand, meaning that we need legislation, etc. to protect against abuses in an unrestricted market. No real violence against any living beings whatsoever was suggested. The AI used by Facebook is functionally idiotic.

Consider also that if the hand is invisible the metaphorical blood must be too, so amputation should shock nobody, as any gore will also be invisible ;-)

Now all that is just a bit of amusement over coffee in the morning. But other cases are more serious, like the automated banning of a Ukrainian software developer for two months on LinkedIn because Putinist trolls objected to him trying to describe the reality of serving his customers while coffee breaks are accompanied by missile strikes and business-as-usual genocide. It seems that a certain volume of complaints can trigger a ban even when no specific violation can be identified. Artificial intelligence indeed.

Entirely too much trust is granted to automation, almost as a reflex, even when it obviously contradicts logic and common sense. In the translation sector, I encounter time and again the idea that language services work should be cheaper if it involves text pre-processed by machine translation such as DeepL, even when it can be demonstrated that achieving the desired level of quality takes longer than if the text were translated by human effort only. So time isn't money for these people, I guess. The real issue is slavelancing.

Some postulate that most people need to believe in a Higher Power of some sort and take from it the direction and meaning of their lives. I used to dispute that, but seeing now how so many educated people in the business world have replaced Nin-girsu, Shiva, Allah, Nossa Senhora da Fatima and Mighty Cthulhu with the new God of MTness, I suspect I may have been wrong.

As with most religions, it's our people and their welfare who are the true victims of this automation religion, while the priests argue that the real problem is that we haven't sacrificed enough to that invisible middle finger the technologists wave in our faces.

But I ask, what's the harm of paying a few more cents if you must if it leads at last to a lot more sense and better understanding for us all?

Feb 24, 2021

A memoQ must-have: the definitive guide to MT use!

People who know me and my work know that I have a very low opinion of machine translation use in most language service situations. Even in the best scenarios, it offers no value to me in my routine work as a translator of scientific and intellectual property texts (patent filings and litigation mostly). So why am I totally excited about the new e-book by my friend and colleague Marek Pawelec? For several reasons.

  • MT discussions bore the crap out of me. But when Marek asked me to review a pre-release copy, I was actually entertained by his clear, concise writing and the superb way he explained basic concepts of resource management in general that most memoQ users still don't master. I was shocked at how much fun I had reading about a subject I hate!
  • He talks about more than just how to configure memoQ to use DeepL, Gargle Trashlate or some other MT engine. He details strategies and best practices for effective use that many people might not be aware of. He talks about how to circumvent prohibitions on MT use and how to catch people who do that. And more. I didn't learn something on every page, but it's probably not an exaggeration to say I did on every other one.
  • Pseudo-translation using a special plug-in for the Pre-translation step is covered in wonderful detail. This technique has been very important to my work for nearly 20 years now. I use it to identify hard-coded interface strings in software I translate and to check and quote large documents that might have paragraphs or whole pages scanned and inserted as graphics that look like editable text - or charts whose text can be selected in the document but never show up on the memoQ working grid after import. Marek also discusses other uses of pseudo-translation I never thought of (layout checks, for example) which could have saved me a lot of grief over the years.
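For readers who have never seen the trick, here is a minimal sketch of the idea behind pseudo-translation. It is not the memoQ plug-in or anything from Marek's book; the function names and the sample strings are invented for illustration. Every imported segment is replaced by a visibly altered copy, so any text that still looks normal afterward was never imported (hard-coded strings, text trapped in graphics and so on).

```python
# Hypothetical illustration only; this is not memoQ's plug-in or its API.
ACCENTED = str.maketrans("aeiouAEIOU", "áéíóúÁÉÍÓÚ")


def pseudo_translate(segment: str) -> str:
    """Return a visibly altered copy of a segment, wrapped in brackets."""
    return "[" + segment.translate(ACCENTED) + "]"


def untouched_strings(visible_strings, imported_segments):
    """List the visible strings that pseudo-translation never touched."""
    pseudo = {pseudo_translate(s) for s in imported_segments}
    return [s for s in visible_strings if s not in pseudo]


# Segments the translation tool managed to import (made-up examples).
imported = ["Save file", "Cancel"]

# What a tester might see on screen after loading the pseudo-translated
# resources; "Fatal error" stands in for a string hard-coded in the software.
visible = [pseudo_translate(s) for s in imported] + ["Fatal error"]

print(untouched_strings(visible, imported))
```

Anything the last line reports is text that would also be missing from a real translation, which is exactly what one wants to know before quoting a job.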
The only complaint I have about this book is that it's too cheap. The author teaches me more in its 36 pages than most can in 200 pages, and the learning is worth a Hell of a lot more to my business than fifty cents per page. Anyone else would probably have written much more and communicated far less of value, but that's a special gift that Marek has. Long ago, his talk at a memoQ Fest was the first time that regular expressions (regex) made any sense to me (as a casual programmer for about 40 years at the time I had approached the topic many times and mostly just found confusion). There aren't many people in this world who can take complex topics and make them seem simple and interesting to nearly anyone. Marek can. Richard Feynman could. I can't name many more on that list.

So... all I can really add is to tell you to go spend €18 here: https://payhip.com/b/tF62

The value you'll receive as a memoQ user at any level, even if you never use machine translation, is a large multiple of that price.


~
Update 2021-03-22
The Polish version of the book is now available at: https://payhip.com/b/RWcC

Update 2022-05-05
There are now editions available in Dutch and French:

Mar 4, 2019

What evil lurks in the results from your language service provider?

Let me start by disclosing that although I have a registered limited company through which I provide translation, training and technical consulting services for translation processes, I am essentially a sole trader who is not unreasonably, though not correctly, referred to as a freelancer much of the time. I have a long history of friendship and consulting support with the honorable owners of quite a few small and medium language service companies and of a few large ones. I vigorously dispute any foolish claims that there is no need for such companies, and I see a natural alliance and many shared interests between the best of them and the best of independent professionals in the same sector.

But as Sturgeon's Law states so well, "ninety percent of everything is crap", and that would apply in equal measure to translation brokers and translators I suspect, though of course this is influenced by context. But what context can justify this translation of a data privacy statement from German to English? Only the section headers are shown here to protect against sensory overload and blown mental circuits:

The rest of the text is actually worse. This is the kind of thing some unscrupulous agencies take money for these days.

Why, pray tell, was the section numbering translated so variously into English? Well, if you know anything about the mix-and-match statistical crapshoot that is SMpT (statistical machine pseudotranslation) and its not-as-good-as-you-think wannabe alternatives, it's easy to guess the frequency of certain correlations in English with German numbers followed by a period.

And clearly, the agency could not even be bothered to make corrections, and the robotic webmaster put the text up, noticing nothing, where it remained for about a year to embarrass a rather good company which I hold in high esteem.

What's the moral of this story? Take your pick from the many reasonable options. "Reasonable" does not include doing business with the liars and thieves who will try to sell you on the "value proposition" of machine translation to cut costs.

A skilled translator knowledgeable in the subject matter and trained in dictation techniques paired with a good speech recognition solution or transcriptionist can beat any human post-edited machine translation process for both volume and quality. And a skilled summarizer reading source texts and dictating summaries in another language can blow them both away as a "value proposition".

One thing that is too often forgotten in the fool's gold rush to cheap language (dis)service solutions is that, as noted by Bevan et alia, exposure to machine-translated output over any significant period of time has unfortunate effects on the language skills (reading, writing and comprehension) of the victims working with it. This has been confirmed time and again by translation company owners, slavelancers and other word workers. Serious occupational health measures are called for, but to date little or nothing has been done in this regard.

And when human intelligence is taken out of play or impaired by an automated linguistic lobotomy, the results inevitably fall in the lower quartile of the aforementioned 90%. Really crappy crap.

As another of my favorite fiction authors used to comment: TANSTAAFL. There ain't no such thing as a free lunch. And trust is always good, but these days you need to verify that your service providers really give you what you have paid for and don't pass off crap like you see in the example above.

Mar 10, 2018

Virtual symposium on AI, MpT and language processing March 26-29, 2018

The worlds of artificial intelligence and machine pseudo-translation are largely ones of delusion, wishful thinking, deceit and professional manipulation, but once in a while one encounters a few people in these fields who are worth the time to listen and discuss. Dion "Donny the Wig" Wiggins of Omniscien, formerly Asia Online, is one of these: a researcher at heart, it seems to me, and someone with a good appreciation of processes, even those having little to do with the technology he represents in his day job. Although an established godfather of the MT Mafia, his approaches to application have a large dose of common sense largely absent from the ignorant masses who place their faith in technologies they do not actually understand in detail. More than once he has shared workflow "revelations" that back up old research and testing of mine, but with more and better data to show how great productivity gains can be achieved by simple reorganization of common tasks. So when he told me about his company's upcoming symposium, I knew that it probably wouldn't be the usual bullshit-tinted fluff drifting through the professional atmosphere of translation these days.


Click the graphic above to see the symposium program - attendance is free. You'll see some familiar names and perhaps conclude that there might in fact be a bit of BS in the air, but there is likely to be a good bit of substance to consider and to apply even in areas not covered by the program.

One of the biggest problems I have with machine pseudo-translation technologies is the utter ignorance and dishonesty of many of its promoters and the massive social engineering which takes place to persuade and intimidate people to become its willing victims in areas where it offers little or no real value. The continued disregard for documented occupational health issues and language skills distortion in post-editing processes, and the vile corruption of academic programs to produce a new generation of linguistic dullards who cannot distinguish algorithmic spew from real human language are all matters of significant concern. But if we are to engage the Forces of Evil and know our Enemies and keep them within their wholly legitimate domain, this event might be a place to start :-) See you there.

May 31, 2015

Authoring and Editing with memoQ (webinar)

Last February I described my initial work with translation tools as environments for authoring and editing documents in a single language. Some people have been doing this quietly for a while; occasionally I would hear puzzled comments from a trainer who had held a class on SDL Trados Studio, OmegaT or memoQ which had been attended by a technical writer or someone with other professional writing interests not related to translation. But to my knowledge there has been no systematic approach to this.

Some weeks later I began to discuss and present some new possibilities for speech recognition in 38 languages which go well beyond the limitations of Dragon NaturallySpeaking for automated speech transcription in the eight languages for which it is available. These possibilities include a number of mobile solutions which are quickly gaining traction among translators and other professional writers.

On Tuesday, June 2nd (two days from now), I will be presenting a one-hour introduction to "MemoQ for Single-language Authoring and Editing" in the eCPD Webinar series. The registration page is here.

This presentation will be an update of the talk I gave earlier this year which discussed CAT tools in general as authoring and editing tools. Although any tool works in principle (and even a user of SDL Trados Studio, for example, can probably draw enough ideas from the upcoming eCPD talk to make good use of the approach), memoQ has some particular advantages, not the least due to its corpus-handling features in LiveDocs and its superior predictive typing facilities, including "Muses" (which are like SDL's AutoSuggest with more flexibility and without the onerously high data quantity requirements).

The presentation will include an overview of some of the latest advances in speech recognition in 38 languages for ergonomically superior writing by automated transcription as well as discussions of version management and dictation workflows which can be applied for greater ease in editing monolingual documents or even translations, including post-editing of machine pseudo-translation (PEMpT by the "Hardisty Method"). I've been fairly quiet on this blog in recent months due to conference organization and travels and the considerable time put in to researching improved work ergonomics for translation, writing and editing processes. (In fact I didn't even find time to blog the memoQ Day on April 22nd in Lisbon yet!) Elements of all these efforts, which have sparked no little interest at recent conferences and workshops I have presented at in Europe, will be part of Tuesday's talk, which will include Q&A afterward to explore the interests of those participating.

So if you are a translator involved in a lot of revision or editing work (bilingual or monolingual), a technical writer or other professional writing in a single language for publication, or someone working on a thesis or authoring for other purposes, the eCPD presentation may help you to do this with better organized resources and greater efficiency. As one friend of mine who wrote a thesis just before I developed this approach put it, with this she would at least have been able to keep track of the feedback on her work from its five or so reviewers without going completely nuts.

Jun 4, 2014

OmegaT’s Growing Place in the Language Services Industry

Guest post by John Moran

As both a translator and a software developer, I have much respect for the sophistication of the well-known proprietary standalone CAT tools like memoQ, Trados, DejaVu and Wordfast. I started with Trados 2.0 and have seen it evolve over the years. To greater and lesser extents these software publishers do a reasonable job at remaining interoperable and innovating on behalf of their main customers - us translators. Kudos in particular to Kilgray for using interoperability standards to topple the once mighty Trados from its monopolistic throne and forcing SDL to improve their famously shoddy customer support. Rotten tomatoes to Across for being a non-interoperable island and having a CAT tool that is unpopular with most (but curiously not all) of the freelance translators I work with in Transpiral.

But this piece is about OmegaT. Unlike some of the other participants in the OmegaT project, I became involved with OmegaT for purely selfish reasons. I am currently in the hopefully final stage of a Ph.D. in computer science with an Irish research institute called the Centre for Next Generation Localisation (www.cngl.ie). I wanted to gather activity data from translators working in a CAT tool for my research in a manner similar to a translation process research tool called TransLog. My first thought was to do this in Trados as that was the tool I knew best as a translator but Trados’ Application Programming Interface did not let me communicate with the editor.

Thus, I was forced to look for an open-source CAT tool. After looking at a few alternatives like the excellent Virtaal editor and a really buggy Japanese one called Benten I decided on OmegaT. 

Aside from the fact that it was programmed in Java, a language I have worked with for about ten years as a freelance programmer, it had most of the features I was used to working with in Trados. I felt it must be reliable if translators were downloading it 4000 times every month. That was in 2010. Four years later that number is about to reach 10,000. Even if most of those downloads are updates, it should be a worrying trend for the proprietary CAT tools. Considering that SDL reports having 135,000 paid Trados licenses in total, that is a significant number.

Having downloaded the code, I added a logging feature to it called instrumentation (the “i” in iOmegaT) and programmed a small replayer prototype. Imagine pressing a record button in Trados and later replaying the mechanical act of crafting the translation as a video, character-by-character or segment-by-segment, and you will get the picture. So far we use the XML it generates mainly to measure the impact of machine translation on translation speed relative to not having MT. Funnily enough, when I developed it I assumed it would show me that MT was bunk. I was wrong. It can aid productivity, and my bias was caused by the fact that I had never worked with useful trained MT. My dreams of standing ovations at translator association meetings turned to dust.
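To make the measurement concrete, here is a rough sketch of the kind of comparison such logs allow. The XML that iOmegaT generates is not shown in this post, so the record layout below is purely hypothetical, and the numbers are invented for the example rather than taken from any test.

```python
from collections import defaultdict

# Hypothetical per-segment records; the field names and values are made up
# for illustration and do not reflect the real iOmegaT log format or results.
segments = [
    {"words": 12, "seconds": 60, "mt": True},
    {"words": 20, "seconds": 90, "mt": True},
    {"words": 10, "seconds": 72, "mt": False},
    {"words": 15, "seconds": 108, "mt": False},
]


def words_per_hour(records):
    """Total words divided by total editing time, per condition (MT or not)."""
    totals = defaultdict(lambda: [0, 0.0])  # condition -> [words, seconds]
    for record in records:
        totals[record["mt"]][0] += record["words"]
        totals[record["mt"]][1] += record["seconds"]
    return {cond: words * 3600 / secs for cond, (words, secs) in totals.items()}


rates = words_per_hour(segments)
relative_change = (rates[True] - rates[False]) / rates[False]
print(f"with MT: {rates[True]:.0f} words/h, without: {rates[False]:.0f} words/h")
print(f"relative change: {relative_change:+.1%}")
```

Summing words and seconds before dividing, instead of averaging per-segment speeds, keeps a handful of very short segments from dominating the figure.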

If I can’t beat MT I might as well join it. About a year and a half ago, using a government research commercialization feasibility grant, I was joined by my friend Christian Saam on the iOmegaT project. We studied computational linguistics in Ireland and Germany on opposite sides of an Erasmus exchange programme, so we share a deep interest in language technology and a common vocabulary. We set about turning the software I developed in collaboration with Welocalize into a commercial data analysis application for large companies that use MT to reduce their translation costs.

However, MT post-editing is just one use case. We hope to be able to use the same technique to measure the impact of predictive typing and Automatic Speech Recognition on translators. I believe these technologies are more interesting to most translators as they impose less on word order.

At this point I should point out that CNGL is a really big research project with over 150 paid researchers in areas like speech and language technology. Localization is big business in Ireland. My idea is to funnel less commercially sensitive translator user activity data securely, legally, transparently and, in most cases, anonymously from translators using instrumented CAT tools into a research environment to develop and, most importantly, test algorithms to help improve translation productivity. Someone once called it telemetry for offline CAT tools. Translation companies take NDAs very seriously, but it is also a fact that many modern content types like User Generated Content and technical support responses appear on websites almost as soon as they are written in the source language, so my hope is that a controlled but automated data flow may be feasible. In the future it may also be possible to test algorithms for technologies like predictive typing without uploading any linguistic data from a working translator’s PC. Our bet is that researchers are data-tropic. If we build it they will come.

We have good cause to be optimistic. Welocalize, our industrial partner, is an enlightened kind of large translation company. They have a tendency to want to break down the walls of walled gardens. Many companies don’t trust anything that is free, but they know the dynamics of open-source. They had developed a complex but powerful open-source translation memory system called GlobalSight, and its timing was propitious.

It was released around the same time SDL announced they were mothballing their system to replace it with the newly acquired Idiom WorldServer (now SDL WorldServer). This panicked a number of corporate translation buyers, who suddenly realized how deeply networked their translation department was via its web services and how strategically important the SDL TMS system was. As the song goes, "you don’t know what you’ve got till it’s gone" – or, in this case, nearly gone.

SDL ultimately reversed the decision to mothball TMS and began to reinvest in its development, but that came too late for some corporates who migrated en-masse to GlobalSight. It is now one of the most implemented translation management systems in the world in technology companies and Fortune 500s. A lot of people think open-source is for hippies, but for large companies open-source can be an easy sell. They can afford engineering support, department managers won’t be caught with their pants down if the company doing the development ceases to exist, and most importantly their reliance on SDL’s famously expensive professional services division is reduced to zero. If they need a new web-service, they can program it themselves. GlobalSight is now used both in many companies that are customers of Welocalize and in companies like Intel that are not. Across should pay heed. At a C-Suite level corporates don’t like risk.

However, GlobalSight had a weakness. Unlike Idiom WorldServer it didn’t have its own free CAT tool. Translators had a choice of download formats and could use Trados but Trados licenses are expensive and many translators are slow to upgrade. Smart big companies like to have as much technical control of their supply-chain as possible so Welocalize were on the lookout for a good open-source CAT tool. OpenTM2 was a runner for a while but it proved unsuitable. In 2012 they began an integration effort to make OmegaT compatible with GlobalSight. When I worked with Welocalize as an intern I saw wireframes for an XLIFF editor on the wall but work had not yet started. Armed with data from our productivity tests and Didier Briel, the OmegaT project manager, who was in Dublin to give a talk on OmegaT, I made the case for integrating OmegaT with GlobalSight. It was a lucky guess. Two years later it works smoothly and both applications benefit from each other.

What did I have to gain from this? Data.

So why this blog? Next week I plan to present our instrumentation work at the LocWorld tradeshow and I want Kilgray to pay heed. OmegaT is a threat to their memoQ Translator Pro sales and that threat is not going to reduce with time. Christian and I have implemented a sexy prototype of a two-column working grid, and we can do the same trick importing SDL packages with OmegaT as they do with memoQ. Other large LSPs are beginning to take note of OmegaT and GlobalSight.

However, I am a fan of memoQ, and even though the poison pill has been watered down to homeopathic levels, I also like Kilgray’s style. The translator community has nothing to gain if a developer of a good CAT tool suffers poor sales. This reduces manpower for new and innovative features. Segment-level A/B testing using time data is a neat trick. The recent editing time feature is a step in the right direction, but it could be so much better. The problem is that CAT tools waste inordinate amounts of translator time, and the recent trend towards CAT tools connected to servers makes that even worse. Slow servers that are based on request-response protocols instead of synchronization protocols, slow fuzzy matches, bad MT, bad predictive typing suggestions, hours wasted fixing automatic QA to catch a few double spaces. These are the problems I want to see fixed using instrumentation and independent reporting.

So here is my point in the second person singular. Kilgray – I know you read this blog. Listen! Implement instrumentation and support it as a standard. You can use the web platform Language Terminal to report on the data or do it in memoQ directly. On our side, we plan to implement an offline application and web-application that lets translators analyse that data by manually importing it so they can see exactly how much they earn per hour for each client in any CAT tools that implement that standard. €10 says Trados will be last. A wise man once said you get the behavior you incentivize, and the per-word pricing model incentivizes agencies to not give a damn about how much a translator earns per hour. The important thing is to keep the choice about sharing translation speed data with the translator but let them share it with clients if they want to.  Web-based CAT tools don’t give them that choice, so play to your strengths. Instrumentation is a powerful form of telemetry and software QA.
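As a sketch of what such a report could compute, consider the following. No instrumentation standard exists yet, so the record fields here are assumptions, and the clients, rates and hours are invented for the example. The point is only that per-word fees plus logged time are enough to show what each client is really worth per hour.

```python
from collections import defaultdict

# Hypothetical job records combining per-word fees with logged working time.
# Field names and figures are illustrative assumptions, not a real data format.
jobs = [
    {"client": "Agency A", "words": 2000, "cents_per_word": 10, "hours": 4.0},
    {"client": "Agency A", "words": 1000, "cents_per_word": 10, "hours": 1.0},
    {"client": "Agency B", "words": 3000, "cents_per_word": 8, "hours": 8.0},
]


def hourly_earnings(records):
    """Return earnings in euros per logged hour for each client."""
    fee_cents = defaultdict(int)
    hours = defaultdict(float)
    for job in records:
        fee_cents[job["client"]] += job["words"] * job["cents_per_word"]
        hours[job["client"]] += job["hours"]
    return {client: fee_cents[client] / 100 / hours[client] for client in fee_cents}


earnings = hourly_earnings(jobs)
for client, euros in sorted(earnings.items()):
    print(f"{client}: EUR {euros:.2f} per hour")
```

Fees are kept in whole cents until the final division so that rounding in the per-word rate does not creep into the totals.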

So to summarize: OmegaT’s place in the language services industry is to keep proprietary CAT tool publishers on their toes!


*******


See also the CNGL interview with Mr. Moran....

Mar 8, 2014

The carnival is over. The MpT emperor still has no clothes.

The late Miguel Llorens once commented about David Grunwald, machine pseudo-translation (MpT) developer and advocate and owner of GTS Global Translations:
"... I disagree with Mr. Grunwald about most things. His ideas about translation as a commodity are depressing and I wouldn’t work for him unless something with a bit more dignity—such as “circus freak”—weren’t a viable career option (for whatever reason)."
However, Miguel also respected the man's efforts to expose the sleazy scam of a Canadian translation technology company called Ortsbo a few years ago. I also find many of Mr. Grunwald's views troubling, particularly statements that "one translator is easily replaceable with another" based upon a long string of unsupportable suppositions. His company's blog often contains interesting and useful insights into events, actors and issues of interest to translators but his obsession with machine pseudo-translation (MpT) and fanatical devotion to the ideology of commoditization over the years wore me out, and I prefer not to spend my energies contemplating the campaigns of one who seems to be on a personal mission as a mental battering ram directed against individuals who are professional language service providers.

So I was surprised when I found his recent guest post on the TAUS blog, which is too often a semi-coherent organ for the hucksters in the MpT carnival. It's more or less what I've expected to hear for a while and many of his points can be clearly picked out of arguments that Kirti Vashee and others make (and which are often overlooked or contradicted by their sources on other occasions). But Mr. Grunwald's piece strikes me as the clearest, most comprehensive and honest statement of current art that I've heard from the MpT camp so far.

I'm not quite the enemy of automation and machine pseudo-translation that some take me for. I am simply against lies, liars and (un)professional abuse in forms such as the human-assisted machine pseudo-translation (HAMPsTr) processes that so many piratical organizations and their enablers push. There are clear cases where automated translation processes offer value, but damned few of these have anything to do with my fields and level of work, and the attempts of SDL and other organizations to pretend otherwise are dishonest and/or deluded at best.

Read the TAUS post and think about it. You might wonder why an MpT advocate would make such unambiguous admissions. Well, unlike some in that camp, Mr. Grunwald never struck me as dishonest, merely as one who inhabited a stratum of the barrel where translators are perhaps indeed interchangeable. He clearly has a good mind, a sense of ethics which seems sound enough in most respects and perhaps a little taste for shaking things up. But as he points out, the money has gone elsewhere now.
"The VCs have rendered their decision: MT is out, human translation is in. In the last 2-3 year a number of venture capital companies have poured millions into companies that develop human translation automation platforms."
And
"Post-edited MT is not as good as from-scratch. Everyone has heard the ‘you get 2 out of 3’ saying. When you deliver post-edited translations, it will be cheap and fast, but will not be (as) good."
The whole MT carnival for years has reminded me of The Great Y2K Scam aka The Last Hurrah of Cobol Programmers. Grab the cash as fast as you can as long as the suckers leave it on the table. Things did not change much a few years ago in a technical sense when MpT became all the rage in the bottom tiers. What did change was the perception that there was money to be made, a fix of VC heroin to dream of language automation at least, and those in the pay of special interests began to brand skeptics like Mr. Llorens as "haters and naysayers" and worse.

Now the money has gone away; it's time to wake up and face reality - or the latest deceptions.

Dec 12, 2013

In HAMPsTr We Trust?

So many times when I hear the bright and happy predictions of commercial interests spouting nonsense about "translation as a utility" and hoping to feast on the roadkill of communication, who claim the highest of motives and show the basest motivations in their real acts, I hear a saxophone in my mind and a strained voice declaring that some day "they may understand our rage".

Machine pseudo-translation (MpT) and human-assisted machine pseudo-translation (HAMPsTr) are big business for the profiteers offering pseudo-solutions which typically start in the low six figures of investment. "Get on the MT boat or drown!" declared one such profiteer, Asia Online CEO Dion Wiggins, at his unfortunate keynote presentation at memoQfest 2012 in Budapest.

It seems that each week a new story line to justify the linguistic lemmings' rush over the cliff appears. Recently I heard for the first time how translators suffer from the "blank page syndrome" (note: as of 25 December 2013 the entire blog with that "blank page" link has disappeared) and need machine generated babble for inspiration. I thought perhaps I was just an odd one, usually struggling with many ways to render a text from German into my native language and trying to choose the best, but experienced colleagues I asked about their fear of blank pages all asked me if I was joking.

This morning another colleague sent me a real screamer:
"Smaller language service providers (LSPs) process fewer words than larger ones... [this] puts them at a disadvantage when it comes to leveraging linguistic assets due to the smaller size of their terminology databases and translation memories (TMs). These less comprehensive language resources limit reuse on subsequent projects or for training statistical machine translation (SMT) software."
The author of that particular bucket of bilge is Don DePalma, head of the Common (Non)sense Advisory, an organization rightly seen as incompetent to interpret even third-grade level mathematics in their discredited report of dramatic rate decreases for translations, which turned out to be an artifact of calculations involving mismatched survey populations. In any case, the idea that small translation agencies or individual translators, who are generally more aware of and concerned with their clients' business, are at any disadvantage by not being buried under mountains of monkeyfied mumbo-jumbo from bulk trashlation nearly ruined my keyboard as I spit my coffee laughing. Don deserves an extra Christmas bonus for that transcreation of the truth.

But the best was yet to come:


This inspiring graphic accompanied an article on how to motivate those involved in post-editing MpT in the HAMPsTr process promoted by Asia Online and others. There has been some vigorous and interesting speculation on where that arrow is pointing :-)  The colleague who sent the link to me commented:
An interesting read from a humanitarian perspective. If they need to go to these lengths to "motivate" people, even those who are otherwise happy to swim in the muddy, toxic pond that these LSPs (your definition of the term) have created, one would have thought that they will understand that there is something wrong with their concept and goals. But why let the facts get in the way, I guess.
Indeed, those swimming in the pond do seem to have some real issues, even in cases frequently quoted as a HAMPsTr success. I long ago lost count of how many MpT advocates have told me of the wonderful words at Microsoft and Symantec, nicely extruded from controlled language sources and lovingly shaped into their final sausage form by happy hamsters. But this TAUS presentation by a Symantec insider tells another story:


And further indications that we are all getting mooned by the MpT Emperor can be heard in the excerpts of this recent GALA presentation in Berlin:


Unlike some of my colleagues, I have no fear of being replaced by Mr. Gurgle or any of his online Asian cousins however well-trained. What provokes some rage in me and more than a little concern is the callous dishonesty of the MpT profiteers and their transparent contempt for truth, the true interests of modern business and the health of those involved in language processes.

I have no little sympathy for the many businesses and individuals struggling to cope with the challenging changes in international business communication in the past 20 years. Nor do I feel that MpT has no role to play in communication processes; colleagues such as Steve Vitek have presented clear cases of value for screening of bulk information in legal discovery to identify documents which may need timely human translation and other applications. Kirti Vashee of Asia Online has commented honestly on numerous occasions on his blog and elsewhere about the functional train wreck of most "automated translation" processes one encounters, but still cannot take proper distance from the distortion and scaremongering practiced by the head of his team and others.

I am particularly concerned by the continued avoidance of the very real psychological dangers of post-editing MpT, which were discussed by Bevan and others in the decades before the lust for quick profits silenced discussions and research into appropriate occupational health measures. If Asia Online and others are truly concerned with developing sustainable HAMPsTr processes, then let them fund graduate research in psychology to understand how to protect the language skills and mental function of those routinely exposed to toxic machine language.

All this disregard for true value and truth reminds me so much of my days as an insider in the Y2K programmers' profit orgy: we all knew it was bullshit, but all the old COBOL programmers wanted to take their last chance to score big before they were swept into the dustbin of history. Some 60 years or so after it began, is machine translation ready to assume its place in that bin? The True Believers and profiteers will loudly say no, but at some point the dust will settle, the damage will be assessed, and we will find that the place of MpT is not at all what many imagine it to be today.

Sep 8, 2013

Erich Kästner reloaded with PEMT

What to do with a sangria'd Saturday night starting to cool after a long brain baked day of heat and chemical text transmogrifications? That was the question in my mind when the words came over the skypewire asking me to deal once again with the oh so important question of how machine translation advances transform our profession, like blacksmiths of old Ms. Kelley tells us we wait on the Brave New Future our wordforges make as we strike and turn the nails and shape them to drive deep and hold fast the lids of our cost-conscious coffins.

Why beat 'em? Just join 'em, what the Hell, give 'em the words well earned. So a thirty year break with Kästner I broke, took the first poem that popped with his name, slammed it through Google and off I went to a corner of quiet in the chaotic cantinho for lombo assado and well-washed postedit pleasure.

Erich Kaestner

Die Entwicklung der Menschheit

Einst haben die Kerls auf den Bäumen gehockt,
behaart und mit böser Visage.
Dann hat man sie aus dem Urwald gelockt
und die Welt asphaltiert und aufgestockt,
bis zur dreißigsten Etage.

Da saßen sie nun, den Flöhen entflohn,
in zentralgeheizten Räumen.
Da sitzen sie nun am Telefon.
Und es herrscht noch genau derselbe Ton
wie seinerzeit auf den Bäumen.

Sie hören weit. Sie sehen fern.
Sie sind mit dem Weltall in Fühlung.
Sie putzen die Zähne. Sie atmen modern.
Die Erde ist ein gebildeter Stern
mit sehr viel Wasserspülung.

Sie schießen die Briefschaften durch ein Rohr.
Sie jagen und züchten Mikroben.
Sie versehn die Natur mit allem Komfort.
Sie fliegen steil in den Himmel empor
und bleiben zwei Wochen oben.

Was ihre Verdauung übrigläßt,
das verarbeiten sie zu Watte.
Sie spalten Atome. Sie heilen Inzest.
Und sie stellen durch Stiluntersuchungen fest,
daß Cäsar Plattfüße hatte.

So haben sie mit dem Kopf und dem Mund
Den Fortschritt der Menschheit geschaffen.
Doch davon mal abgesehen und
bei Lichte betrachtet sind sie im Grund
noch immer die alten Affen.

— Erich Kästner

The development of humanity

Once the guy who perched on the trees,
hairy and evil visage.
Then you have lured them out of the jungle
and the paved world and increased,
to the thirtieth floor.

There they sat, the fleas fled,
in centrally heated rooms.
As they sit on the phone.
And there is still exactly the same sound
as it did on the trees.

You hear far. Watch TV.
They are in touch with the universe.
You brush your teeth. You breathe modern.
The Earth is an educated Star
rinse with plenty of water.

You shoot the correspondence through a pipe.
They hunt and breed germs.
You mistaken the nature with all the comforts.
They fly steeply into the sky
and stay for two weeks above.

What makes their digestive left,
they process to cotton.
They split atoms. They heal incest.
And they realize by style investigation,
that Caesar had flat feet.

So they have with the head and the mouth
Created the progress of mankind.
But it apart and
they are considered in the light of the basic
still the old monkeys.

— Google Translate    

After the evening I had a look at what other masterful versions could be found of this work in English; many would indeed give the impression that Google is Good Enough. Like this, my God, or that (double-ack, farther down). This at least tried to rhyme but the translator failed to grasp the text, and the name on the best of the lot on page 5 of this PDF suggests that those coming late to English are not always the worst.

But all that came later. First I had before me the task of post-editing the silken syllables of Friend Google, which I still had not read as the jug of sangria arrived at the table. Somehow I was not quite in the mood. To be fair, I should first soak up the spirit, play a bit with the text, think about it and what I feel distinguishes Kästner and expresses his wit.

I can no more imagine a Kästner poem without rhyme than I can the work of Wilhelm Busch, and while a literal translation might help one learning German to appreciate some subtleties of the original, I think to render him without rhyme for a reader with no German is a criminal act for which there should be some terrible punishment I do not want to imagine. Better to fail at a rhyming Kästner translation than abandon hope of delivering some subversive punch with sarcastic swing he might approve.

So I scratched out tipsy lines in my notebook, thinking about this tale of Man's dubious ascent in the time since he descended from the trees and remade the world in His image.

The Descent of Man

Men once were but squatters in trees,
hairy, with faces of horror.
But drawn from the forest they please
to pave and pile a world on its knees
with their power to the thirtieth floor.

They sit now in comfort, flown from their fleas,
In chambers fine, with central heat.
on the phone voices chatter and freeze
the mind with that tone from the trees,
evolved to the very same beat.

They hear words so distant, see images far.
They are One with a cosmos so lush.
Their hygiene and breathing reveal how they are
One with the planet, this Shiningest Star
with no shortage of water to flush. 

Their missives fly straight and true through the wire.
They seek and cultivate germs and their why, 
and to Nature give comfort entire.
They fly Heaven high and boldly aspire
two weeks to remain in the sky.

What they cannot digest they do convert
to soft and pliable cotton mat.
They split the atom, heal incest's hurt. 
Their style analyses, smart and alert 
show Caesar's feet were flat.

With their heads and their words filled with such might
human progress they make which none do escape.
But considered soberly in the day's light,
for all their learning, creations and flight,
they remain at heart the same ape.

That quick hour of wine and wordplay settled my mood and my stomach and readied me for that Future of Post-editing which all the wisest noggins among Linguistic Sausage Producers say awaits us.

I tried to be true to the principles of Modern Quantitative Quality and keep the edit distance down to reduce the costs for my Imaginary Overseer, but I am a badly trained monkey and I kept wanting to do more with Google's dead words. It bothered me more than a little, moreover, that with the bad English text before me it was much harder to refer to the original German for guidance and inspiration. But the fact that PEMT proved five times faster in the end than the original draft translation will perhaps convince those with minds open like a sieve that MT is now the way to go for highly evolved marketing translation of which Darwin would surely approve.

The development of humanity

Once the fellows perched in trees,
hairy, with visages fell.
Then lured from the jungle
they paved and stacked the world
to thirty floors of Hell.

There they sat, their fleas they fled,
to sit in rooms well-heated.
As there they sit, phones in hand,
one hears the very echoes still
of their past arboreal chatter.

Far they hear and far they see,
their universe is One.
They brush their teeth; they breathe just right;
their Earth is a star alight
with the shine of rinsing water.

Conduits of correspondence swift sing
as they hunt and culture germs.
They give to Nature all their comforts
and ascend straight upward to the sky
and two weeks there above abide.

What digestion leaves,
they make to cotton.
Split atoms. Heal incest.
And their style investigations show,
that Caesar's feet were flat at best.

So they do with head and mouth
the world of mankind's progress shape.
But that aside and
in the light
one sees the same old ape.

Hm. I suppose if I were to meet Mr. Kästner in the afterlife, I'd get a fat shiner for that effort (and possibly for both). But what can one expect from a designated hater and naysayer who simply cannot make himself believe with the best of intentions and all the force of his scientific and technical training that the sound and fury about machine translations signifies anything fit for his purpose?

Jul 26, 2013

The trouble with voice recognition in translation environment tools....


I had not planned to make a video on voice recognition tools any time soon, but a few remarks by my American colleague Kevin Hendzel well down in the many comments about thepigturd's letter to translators sort of goaded me into it. I thought, "What the heck, I'll just grab some text from Wikipedia, record a bit of the work with Camtasia, and post a quick demo of how easy it is to work with Dragon NaturallySpeaking." So I got a text about chickens. And activated the screencast recorder. And then the trouble started.

It really sucked. Working with Dragon in memoQ is usually a fairly painless process, but tonight the dogs were anxious and kept poking me in the ribs, and I never did get the microphone adjusted quite right. Some days, microphone position is everything to my scaly transcriptionist. So I suffered with a lot more editing than usual, as anyone watching the video above will see. I worked in my usual "mixed mode" manner, with both keyboard and voice control. Some colleagues who swear by DNS like to do everything by voice and would probably wipe their backsides in the WC that way as well if they could, but that's way too geeky for me. After watching my copywriting partner fly through some 10,000 words of legal translation - and edit it - in a short working day while I slogged through my 3,000 and finished long after she called it a day, I realized that I could work in the relaxed way she did with thoughtful stares at the screen, muttered bursts and the occasional keyboard touch.


But today was a bad day with the Dragon. I might have gone a bit faster with the text. After all, chickens aren't rocket science or even chemistry, with its tag-ridden notation. I could have just dictated in a word processor and everything would have gone faster. And if I really want a TM or want to check the terminology, alignment is fast and also a good environment for editing my first draft. I know a number of translators who work that way now. Even with a dictaphone.

In his comments on the other post, Kevin Hendzel expressed a similar feeling to mine when translating with voice recognition: greater engagement and concentration on the text and its structure and meaning. But these tools are not without risk: any errors will in fact pass muster with a spelling checker, so proofreading workflows may have to be very different to be effective. I have noticed this myself - reading my text soon after I have translated it, I am very likely to overlook a missing or switched article or a homophone. Perhaps dictating into a word processor or - since I often look to the glossary hits and other hints on the right of my working window - exporting my text and re-aligning it in the CAT tool after an external rewrite may force my eyes to see things a little differently. In the two years that I have been making serious use of voice recognition I have not yet found the "perfect" workflow.

There are a lot of ways I can tease better results out of this work. But even on a bad day like today, things aren't all that awful. In fact, those familiar with some of the more honest estimates of output in optimized machine translation and post-editing scenarios will realize that today's lousy results (see the end of the video), maintained over the course of a working day, meet or beat the expectations for post-editing in a highly optimized scenario. Without the brain rot typically caused by PEMT! Now that's an advantage. Why don't we stop wasting time with machine translation and instead increase output by more research into the best ways of using voice recognition technology? Ah, but voice recognition is not yet optimized for every language! Ha ha ha... like MT is or ever will be. The millions that get flushed down the toilet with machine translation could and should buy a lot of improvement with voice recognition.

The real trouble with voice recognition is that you may not want your competition to use it. With or without CAT tools. Unlike machine translation.

Jul 22, 2013

Merchants of the Machine: A Parable of an MT Vision

And so it was that a group of merchants, many of them powerful men made wealthy by the work of wordsmiths, did conceive a scheme to replace workers with a translation machine, that they might increase their power and influence and add to their already swollen coffers.

As their servants toiled night and day on its creation, the merchants did send soothsayers out into the world to proclaim the coming of the machine. So ruthlessly did their acolytes preach the MT gospel, glorifying its spew with much trumpeting in many a marketplace, that other merchants did allow their own greed to triumph and they too became devotees of the machine. Rejoicing in their boastings, they held out the promise of riches without labour to any who would follow them and purchase their wares.
But it came to pass that these disciples learned that the machine could perform only the lowliest of the humblest translator’s tasks, for it had no mind and many tongues remained foreign to it; lo, as the machine devoured more its confusion increased! Though this discovery caused consternation amongst them and they were quietly afeared lest they lose face, vanity had made the merchants so presumptuous that they did not cease their evangelising but continued to hide behind a veil of half-truths, dissembling and exaggerating with such cunning that still many in the marketplace were duped.

Yea though the tumult of the charlatans’ voices sounded forth so loud as to deafen thought, skilled workers who were long practised in the art of translation would not be fooled by them. The translators’ ancient art was publicly scorned and mockery poured upon them; the merchants were desirous of concealing the true worth of translators and casting them into the wilderness, for they could dispel the myth of the machine as no other. And though they were mercilessly derided as haters and naysayers, these stewards of language courageously took a stand against the Goliath, exhorting others to beware and saying unto them, “Let not these purveyors of false doctrines exploit you.”
Fellow translators who had become despondent and were distressed lest their wisdom perish and their intelligence vanish if the will of the merchants did prevail were emboldened by this strength and support. At first there were only murmurings amongst them, for some feared incurring the wrath of the powerful money-worshippers. But slowly there arose a wave of dissension as cracks appeared in the ground beneath the philistines who would have the world believe that their machine was mightier than languages that had evolved over hundreds of years.

The translators eschewed confrontation with the merchants: not because they feared the machine, for they knew full well of its weaknesses, but because they understood the lengths to which such opponents were prepared to go to protect their power. In the knowledge that they must not join in battle on a battlefield created by the merchants they saw that they must deal with them wisely and rely on their own skill and sagacity to protect their profession.

Some of them did congregate in a meeting place which they named Stridonium, after the birthplace of their patron saint. There they set out a course, that they might deliver a different message. With renewed strength born of unity they conceived of the simplest of plans to quell the voices of false prophets and create a new gateway to their own honest marketplace. Theirs shall be customers of wisdom and understanding: they shall not translate for the machine, for that would be like unto casting pearls before swine. They shall not hide their light under a bushel, for this they must use to illuminate their path and enlighten unknowing procurers of language.

And henceforth they and their fellow translators shall diligently apply themselves to refuting false doctrine and shall not be daunted by their task. They shall reveal the failings of the machine and shall rekindle understanding of translation in the marketplace; they shall prevent extortion of the misguided until their message spreads far and wide.

And they shall not allow their profession to be sacrificed on the altar of Commerce.

*****

About the author of this guest post...

Chartered Linguist Christina Guy is a Dutch to English legal translator and interpreter based in the Netherlands with whom I collaborate for international English copywriting. As a native of the UK with long experience in providing language services in the legal, commercial and diplomatic sectors, she is a passionate and articulate advocate of efficient quality. Several years ago, she and other committed language specialists established the translators' forum Stridonium to facilitate professional exchange in a private atmosphere of competence and mutual support. Christina's previous contribution to Translation Tribulations, A sermon from Ede, was inspired by her first exposure to the mad machinations of the MT muftis of TAUS and their allies.

Dec 18, 2012

Do we need a marketplace for MT post-editing?

Today technology guru Jost Zetzsche posed this question:


Jost's newsletter - the Tool Box (yes, it used to be called Tool Kit, and he has a great book on tools by the same name, so the new name is a bit confusing) - is one of my favorite... er... tools... for keeping an overview of the rapid developments in translation technology. I have subscribed to the premium edition for years and find it well worth the modest investment, about the price of a lunch. Jost is offering a holiday special on premium edition subscriptions. He writes:

... if you would like to spread some holiday cheer among your contractors or colleagues (and at the same time help them up their technical ante), make sure that they receive a Premium edition of the Tool Box newsletter during the next year. Rather than the regular $25 per year, for a limited time you can pay only $10 per subscriber for packages of five subscribers or more. Just send me an email with the names of the subscribers and I will check to see whether they already receive the Premium edition.
...
Have a blessed holiday season.
Jost

P.S. For 20 subscribers or more, pay only $5 per subscriber!
He can be contacted at jzetzsche [at] internationalwriters.com

Regarding the marketplace for post-editors of machine translation, I take a conservative view. I feel that the traditional marketplace is perfectly well suited to handling this sort of commerce:

What do you think?