
Jun 25, 2017

NOW is not the National Organization of Words...

... but with over 4 billion of them, that interpretation of the News on the Web corpus at Brigham Young University would be plausible. BYU is known for the high-quality research corpora it makes available to the public. The news corpus grows by about 10,000 articles each day, and its content can be searched online or downloaded.

The results are displayed in a highlighted keyword in context (KWIC) hit list, with the source publications indicated in the "CONTEXT" column.


As a legal translator, I find the BYU corpus of US Supreme Court Opinions more useful. It displays results in a similar manner.


It is difficult or impossible to configure a direct search in these corpora using memoQ Web Search, IntelliWebSearch or similar integrated web search features in translation environments. However, these tools can be used as a shortcut to open the URL, and the search string can be applied once the site has been accessed. Since I only infrequently perform searches like this to study context, a standalone shortcut with IWS serves me best. If I were using this to study usage in a language I don't master very well, like Portuguese (yes, there is a Portuguese corpus at BYU; actually, two of them, one historical), then I might include the URL in a set of sites which open every time I invoke memoQ Web Search, or in a larger set of terminology-related sites in an IntelliWebSearch group.

One great benefit of using such corpora as a language learner is that context and collocations (words that occur together with a particular word or phrase) can be studied easily, and better than with dictionaries, enabling one to sound a bit less like an idiot in a second, third, fourth or fifth language. Or for many perhaps, even their first language :-)
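For anyone curious what a KWIC list and a collocation count amount to under the hood, here is a minimal Python sketch. It is not how the BYU corpora work internally; the function names, the sample sentences and the two-word window are all my own assumptions, chosen only to illustrate the idea on a few lines of text.

```python
import re
from collections import Counter


def kwic(text, keyword, width=30):
    """Return one aligned context line per occurrence of keyword (case-insensitive)."""
    lines = []
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    for match in pattern.finditer(text):
        # Take up to `width` characters on either side of the hit.
        left = text[max(0, match.start() - width):match.start()]
        right = text[match.end():match.end() + width]
        # Right-align the left context so the keywords line up in a column.
        lines.append(f"{left:>{width}} [{match.group(0)}] {right}")
    return lines


def collocates(text, keyword, window=2):
    """Count the words found within `window` tokens of each occurrence of keyword."""
    tokens = re.findall(r"\w+", text.lower())
    target = keyword.lower()
    counts = Counter()
    for i, token in enumerate(tokens):
        if token == target:
            neighbours = tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window]
            counts.update(neighbours)
    return counts


# Hypothetical sample text, not taken from any corpus.
sample = (
    "The court granted the motion. The court denied the appeal. "
    "A lower court granted relief."
)

for line in kwic(sample, "court"):
    print(line)
print(collocates(sample, "court").most_common(3))
```

A real corpus interface would also filter out function words like "the" and weight the collocates statistically, which this sketch does not attempt.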

Jun 24, 2017

The multilingual toolkit for getting a date in Swahili


Some time ago, I was asked by IAPTI to provide some technical support for a developing effort to assist professional translators in various African regions. The flame of the Translators Without Borders center established a few years ago in Kenya has apparently sputtered out due to an incredibly silly anti-business model which undermined local professionals, so various initiatives were launched to help translators in the region grow stronger together and improve their professional practice.

Since memoQ is perhaps the best tool for managing the challenges of expert translation under the widest range of languages and conditions, I considered how I might contribute to solving some of these and reduce the frustrations of language barriers in Africa. I thought of all the business travelers there, as well as the NGOs and representatives of governments around the world who want a piece of what's there. All alone, strangers in a strange land, sweltering in some Nairobi hotel, how can these people even get a date in Swahili?

Once again, it's Kilgray to the rescue... with memoQ's auto-translation rules!

Using the various methods I have developed and published for planning and specifying auto-translation rules, I assembled an expert team for translation in Swahili, Arabic, Hebrew, English, German, Portuguese, Spanish, French, Russian, Hungarian, Dutch, Finnish, Polish and Greek to draft the rules for getting long dates in Swahili.

And using the Cretinously Uncomplicated Process for Identifying Dates (CUPID), these results can be transmogrified quickly to support lonely translators working from German, French and English into Arabic or from German, French, English and Spanish into Portuguese, for example, or in any combination of the languages applied for Swahili dates or others as needed.

With memoQ and regex-based auto-translation, you'll never be stuck for a quality-controlled date in any language!
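To give a rough idea of the principle, here is a small Python sketch of a regex that recognizes US-style English long dates and rewrites them with Swahili month names. This is not memoQ's rule syntax and not the rule set described above; the function name, the day-month-year target order and the month table are my assumptions for illustration, and the month names should be checked by someone who actually works in Swahili.

```python
import re

# Swahili month names keyed by the English month (assumed; verify with a native speaker).
MONTHS_SW = {
    "January": "Januari", "February": "Februari", "March": "Machi",
    "April": "Aprili", "May": "Mei", "June": "Juni", "July": "Julai",
    "August": "Agosti", "September": "Septemba", "October": "Oktoba",
    "November": "Novemba", "December": "Desemba",
}

# Matches long dates such as "June 24, 2017": month name, day, comma, four-digit year.
LONG_DATE_EN = re.compile(
    r"\b(" + "|".join(MONTHS_SW) + r")\s+(\d{1,2}),\s*(\d{4})\b"
)


def translate_long_dates(text):
    """Rewrite each English long date in text as 'day month year' with a Swahili month."""
    def swap(match):
        month, day, year = match.groups()
        # int() drops any leading zero from the day; the target order is an assumption.
        return f"{int(day)} {MONTHS_SW[month]} {year}"
    return LONG_DATE_EN.sub(swap, text)


print(translate_long_dates("The meeting is on June 24, 2017 in Nairobi."))
```

Supporting another source language would mean a second pattern and month table keyed by that language's month names, while the target side stays the same.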

Germany needs Porsches! And Microsoft has the Final Solution....


I hear that Germany is suffering from a shortage of Porsches. Odd, given that the cars are made there and should be readily available, but it's true, because my friend who lives there told me. He owns a large, successful LSP (Linguistic Sausage Production) company, and to celebrate its rise in revenues, he decided to get everyone on the sales staff a new Porsche as a company car. The problem is that he can't find any for €5000.

So he was left with no choice but to cut overhead using the latest technologies. Microsoft to the rescue! With Microsoft Dictate, his crew of intern sausage technologists now speak customer texts into high-quality microphones attached to their Windows 10 service stations, and these are translated instantly into sixty target languages. As part of the company's ISO 9001-certified process, the translated texts are then sent for review to experts who actually speak and perhaps even read the respective languages before the final, perfected result is returned to the customer. This Linguistic Inspection and Accurate Revision process is what distinguishes the value delivered by Globelinguatrans GmbHaha from the TEPid offerings of freelance "translators" who won't get with the program.

But his true process engineering genius is revealed in Stage Two: the Final Acquisition and Revision Technology Solution. There the fallible human element has been eliminated for tighter quality control: texts are extracted automatically from the attached documents in client e-mails or transferred by wireless network from the Automated Scanning Service department, where they are then read aloud by the latest text-to-speech solutions, captured by microphone and then rendered in the desired target language. Where customers require multiple languages, a circle of microphones is placed around the speaker, with each microphone attached to an independent, dedicated processing computer for the target language. Eliminating the error-prone human speakers prevents contamination of the text by ums, ahs and unedited interruptions by mobile phone calls from friends and lovers, so the downstream review processes are no longer needed and the text can be transferred electronically to the payment portal, with customer notification ensuing automatically via data extracted from the original e-mail.

Major buyers at leading corporations have expressed excitement over this innovative, 24/7 solution for globalized business and its potential for cost savings and quality improvements, and there are predictions that further applications of the Goldberg Principle will continue to disrupt and advance critical communications processes worldwide.

Articles have appeared in The Guardian, The Huffington Post, The Wall Street Journal, Forbes and other media extolling the potential and benefits of the LIAR process and FARTS. And the best part? With all that free publicity, my friend no longer needs his sales staff, so they are being laid off and he has upgraded his purchase plans to a Maserati.