
Dec 7, 2018

Integrated iOS speech recognition in memoQ 8.7

Today, memoQ Translation Technologies (the artists formerly known as "Kilgray") officially released their iOS dictation app along with memoQ version 8.7, making that popular translation environment tool the first on the desktop to offer free integrated speech recognition and control.


My initial tests of the release version are encouraging. Some capitalization bugs which I identified in the beta test haven't been fixed yet, and some special characters which work fine in the iOS Notes app don't work at all, but on the whole it's a rather good start. The control commands implemented for memoQ work far better than I expected at this stage. I've got a very boring, clumsy (and unlisted) video of my initial function tests here if anyone cares to look.

Before long, I'll release a few command cheat sheets I've compiled for English (update: it's HERE), German and Portuguese, which show which iOS dictation functions are implemented so far in Hey memoQ and which don't perform as expected. There are no comprehensive lists of these commands, and even the ones that claim to cover everything have gaps and errors, which one can only sort out by trial and error. For the most part, this isn't an issue with the memoQ development team, but rather with Apple's chaotic documentation.

The initial release only has a full set of commands implemented in English. Those who want to use control commands for navigating, selecting, inserting, etc. in other languages will have to enter their own localized commands for now, and this too involves some trial and error to come up with a good working set. I hope that before long the development team will implement the language-specific command sets as shareable light resources. That will make it much easier to get all the available languages sorted out properly for productive work.

I am very happy with what I see at the start. Here are a few highlights of the current state of Hey memoQ dictation:
  • Bilingual dictation, with source language dictation active when the cursor is on the source side and target language dictation active when the cursor is on the target side. Switching languages in my usual dictation tool - Dragon NaturallySpeaking - is a total pain in the butt.
  • No trainable vocabulary at present (an iOS API limitation), but this is balanced in a useful way by commands like "insert first" through "insert ninth", which enable direct insertion of the first nine items in the Translation Results pane. Thus if you maintain good termbases, the "no train" pain is minimized. And you can always work in "mixed mode" as I usually do, typing what is not convenient to speak and using keyboard shortcuts for commands not yet supported by voice control, like tag insertion.
  • Microphones connected (physically or via Bluetooth) with the iPhone or iPad work well if you don't want to use the integrated microphone in the iOS device. My Apple earphones worked great in a brief test.
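
To make the "insert first" through "insert ninth" idea concrete, here is a minimal Python sketch of how a spoken phrase like that could be resolved to an entry in a list of results. This is purely illustrative: the function names and the sample results list are hypothetical, and nothing here reflects how memoQ or the Hey memoQ app actually implements these commands.

```python
# Hypothetical illustration only; this is NOT memoQ's implementation.
# It maps phrases like "insert third" to an item in a list of results.

ORDINALS = ["first", "second", "third", "fourth", "fifth",
            "sixth", "seventh", "eighth", "ninth"]


def parse_insert_command(phrase):
    """Return the 1-based result number for 'insert <ordinal>', else None."""
    words = phrase.strip().lower().split()
    if len(words) == 2 and words[0] == "insert" and words[1] in ORDINALS:
        return ORDINALS.index(words[1]) + 1
    return None


def apply_command(phrase, results):
    """Return the text to insert, or None if the phrase cannot be applied."""
    number = parse_insert_command(phrase)
    if number is None or number > len(results):
        # Not an insert command, or fewer results than the ordinal asks for.
        return None
    return results[number - 1]
```

With a results list such as `["Haus", "Gebäude"]`, the phrase "insert second" would pick the second entry, while "insert third" would simply do nothing because there is no third result. A real tool would of course also have to cope with recognition errors and localized command wording.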
Some users are a bit miffed that they can't work directly with microphones connected to the computer or with Android devices, but at the present time, the iOS dictation API is the best option for the development team to explore integrated speech functions which include program control. That won't work with Chrome speech recognition, for example. As other APIs improve, we can probably expect some new options for memoQ dictation.

Moreover, with the release of iOS 12, I think many older devices (which are cheap on eBay or probably free from friends who don't use them) are now viable tools for Hey memoQ dictation. (Update: I found a list of iPhone and iPad devices compatible with iOS 12 here.)

Just for fun, I tested whether Hey memoQ and Dragon NaturallySpeaking interfere with one another. They don't, it seems. I switched back and forth from one to the other with no trouble. During the app's beta phase, I did not expect that I would take Hey memoQ as a serious alternative to DNS for English dictation, but with the current set of commands implemented, I can already work with greater comfort than expected, and I may in fact use this free tool quite a bit. And I think my friends working into Portuguese, Russian and other languages not supported by DNS will find Hey memoQ a better option than other dictation solutions I've seen so far.

This is just the beginning. But it's a damned good start really, and I expect very good things ahead from memoQ's development team. And I'm sure that, once again, SDL and others will follow the leader :-)

And last, but not least, here's an update to show how to connect the Hey memoQ app on your iOS device to memoQ 8.7+ on your computer to get started with dictation in translation:




11 comments:

  1. One small remark/correction: iOS 12 only supports iPhone 5s and newer models, so you won't be able to install it on a 4s.

  2. Thanks for the feedback, Kevin!

    Your iPhone 4S won't work. iPhone 5 is the minimum, because you must be able to install iOS version 10 at least. A used iPhone 5 costs around EUR 50 around here.

    Replies
    1. Thanks, Gergely, someone pointed that out to me the other day on FB, but I didn't get around to updating the information on the post yet. €50 isn't bad; I usually spend that much or more for a microphone. I think some people have gotten so emotionally invested in their Android phones or other technology that it's hard for them to think of the (iOS) input as simply a microphone. At least from what I have been able to determine so far in my research on other speech APIs and their ability to pass commands to a Windows application, it doesn't look like there are any good options right now which would have the language coverage available for iOS speech recognition. Cortana certainly wouldn't (that does not even offer European Portuguese right now) and as far as I can tell, Chrome won't handle that either, so users who need Chrome languages (like Slovene) might as well just use that Chrome app I wrote about recently and input commands manually.

  3. If you do same-language memoQ projects like I occasionally do (generic English to EN-US for revision, for example), there is a problem with the selection commands in the first release version of "Hey memoQ" at least. See the video record at https://youtu.be/GFGGE_5RQTU

    Replies
    1. I couldn't reproduce this; there might be something else going on. Maybe it is a leftover from the pre-release version you tested.

    2. I think I accidentally deleted your other reply, sorry. You posted in duplicate and I was trying to clean up and something went wrong. In any case, the problem is a little different than I explained it; I'll install on my office machine over the weekend just to be sure and then send you a little video via the support ticket. Don't let my grumbling over some of these details distract you. I think this is a great new feature with a lot of potential for a great number of colleagues, including my friends in Portuguese- and Arabic-speaking countries. Thank you so much for responding to the needs of so many translators of important "minor" languages that Nuance and others continue to leave unserved.

  4. There is also a bug with the "reset filter" command for the target text. If there IS no matching target text, you can apply the filter just fine, getting an empty result, but the filter cannot be reset by voice then. It has to be done manually. I made a little video record of this (https://youtu.be/XMkO3sD4Tc4), which also shows how to make source text filtering work in the release as of 10 December 2018. The German command used to reset the filter ("Lösche Filter") was configured by me, because there currently are no German settings in the shipping software. What's not shown in the video record is how I filtered the target (with content) and was able to reset, but then I filtered on the missing word "pirates" and was once again not able to reset.

    Replies
    1. Fixed for next build. We spotted this ourselves in QA too, the fix just didn't make it into the first release.

  5. Congratulations and a big THANK YOU to everyone at memoQ who has always listened to translators through the years! Now, finally, translators into many of the non-Dragon-NaturallySpeaking languages have an effective way to reap the productivity, quality and ergonomic advantages of speech recognition while using the world's leading translation memory tool. I tested Hey memoQ today and came to the same conclusions as Kevin. This is an awesome step forward in translation technology. Translators into Portuguese, Russian, Arabic, Hebrew, Greek, Chinese, Japanese, Turkish, Hindi, Indonesian, Korean, Thai, and many others (https://www.memoq.com/en/news/hey-memoq-frequently-asked-questions), and, yes, Hungarian should be dancing in the streets. If you consider the number of translators working into these languages, it's going to be some big party. And, hey, thanks Kevin for your enthusiasm and dedication!
    Jim

    Replies
    1. Ha! I was going to write and ask what you thought of all this given that you have longer experience with voice transcription technology than anyone I know. Things sure have come a long way since we saw each other at memoQfest 2015 and you were showing a couple of solutions like this - I think one was an early Chrome speech recognition in memoQ Web. I'm hoping that you and Moshe will have some good ideas on optimizing the ergonomics for Hey memoQ. I usually do OK with it on a small tripod on my desktop - I think the Apple mikes are fairly good quality, but as they get farther away from the mouth there can be some issues with room acoustics. I think ultimately we'll be looking at some sort of external mike running through the iOS device, but I've just started to look at these options and I'm not terribly happy with the cheap solutions. Maybe a boom to keep the phone close to my mouth will work best.

  6. For those using an iOS device that has only those damned "lightning" plugs for both charging and sound input from an external microphone, there are splitters available. Searching Amazon with the keywords "iPhone" and "splitter" brings up a number of interesting options, including quite a few that include 3.5 mm jacks and one with a standard USB jack, a 3.5 mm jack and a lightning jack (i.e. three female ports) in the split. These might offer some good options for using high-quality external microphone equipment. Here's a short URL for one such search: https://goo.gl/AiUc4S

