Aug 9, 2013

Translating against the clock with Dragon Naturally Speaking

After a discussion on voice recognition, which began in the comments for a totally unrelated post about a scatologically bad Linguistic Sausage Producer (LSP), I attempted a timed test of translation with voice transcription in my usual working environment as a demonstration of the productivity gains to be achieved. The Gods were against me that day or my microphone was adjusted wrong; the results were somewhere around 2000 miserable words an hour with a lot of edits as I dictated.

Tonight I resolved to do better with the deck stacked against me. I waited until about 3 am (easy to do on a hot day in a Mediterranean climate) and picked a relatively easy but unfamiliar German text from Wikipedia, took out most of the Greek words I can't pronounce and fired up The Dragon to burn the translation.

No CAT tool this time, nothing but me, the text about snakes and a timer... how many words of draft translation would you expect to do with an easy text in an hour in the middle of the night without coffee? (Of course it will need some revision later, but I wanted a crappy translation for editing demos anyway.)


  1. DNS really works like a charm, but...

    - you must train it really carefully (though doing only the first recognition tests brings good results)
    - you must invest in a better microphone; the one delivered with DNS 10, at least, is crappy. I recommend Sennheiser.
    - it depends heavily on the text, e.g. a simply structured text offers more productivity gain than a legal one.
    - ...and on the tool. DNS with T 2009 does not really work, whereas Kilgray (at least in recent years) has paid attention to DNS users and their problems in some new releases.
    - ...and on the combination of voice commands and keyboard use. Pressing Ctrl-Enter is a lot faster than saying "click confirm", and I never got used to doing web searches with DNS, for example.

  2. Torsten, you're right about the training, but current versions of Dragon need much less training than the one I tried to use in 2004 when a defective space key and joint pain during a trip to California made me desperate for a quick alternative to the keyboard. As I've mentioned in other posts about using DNS, I favor a mixed mode of keyboard use with voice work, doing confirmations (as seen in the "bad day with memoQ" video), tag insertions, most formatting, etc. by keyboard.

    The microphone is important. One colleague insists on a particularly high grade that he orders from the US and which costs some USD 300; what you see demonstrated here is a Logitech USB headset mike for which I paid about EUR 40. I have another headset microphone that costs about twice that (a gamer's headset, beyerdynamic MMX2), which is also good but no better than the Logitech as far as I can see. As for legal work, I have a friend whom I have observed knock out 9,000 to 12,000 words of legal text, with proofreading, in an ordinary working day with familiar cases. I don't come anywhere near that for the heavily formatted texts I do, but comparing like with like, I do find that with a crap text where I might manage 1500 words in a very tiring, long day, I can handle twice that in less time and still be fairly fresh at the end. DNS allows me to concentrate on the text better and remain alert.

    You're right that Kilgray has indeed invested a lot more effort in compatibility with DNS than others, and that is reflected by its performance in memoQ. Last night I began testing Fluency, particularly the transcription module, using Dragon, and I was more than a little frustrated with the results, including what appears to be some interference with the function to transfer one's transcript directly to a new translation project. I've got a lot of stuff running on my laptop, so it will take a bit of work to isolate all the factors that kept screwing up the work, but I've never seen anything troublesome like that with DNS in memoQ. A little thing that drove me nuts was that in a number of instances, sentences were not started with capital letters, though I think this was due to intermediate edits. In any case, if I am to use DNS successfully in the other tool, my behavior will likely have to be more restricted.

  3. Which version of DNS are you using, Kevin?

    1. 11.5 now - I think the current version is 12, or at least that's what someone mentioned recently in a private forum.

    2. Current version is 12.

      Concerning the issue with sentences not starting with a capital letter, you can say "Groß" (for German...) before starting. You may also use the dictation window and transfer the text segment by segment. Neither solution is convincing, I know ...

    3. @Torsten: the equivalent of "groß" for English dictation is "cap", which is what I resorted to in a few cases. The behavior in Fluency was not consistent or not obviously so, but next time I'll pay closer attention to see what behavior on my part was leading Dragon astray there. I want to get it worked out, because I have such a nice idea for an integrated project!

  4. I found another good case for the use of voice recognition :-) The other day I dropped a cup, and in picking up the pieces managed to cut my wrist and the tip of the index finger on my dominant hand. When the finger became infected, the use of dictation was soon revealed as a preferable alternative to pounding a keyboard!

