A few years ago while on "holiday", I returned from dinner to find that my laptop had bluescreened. Panic time! It was Saturday night, and I still had quite a lot of text to translate and deliver on Monday morning. And up on the highest mountain in Portugal, I wasn't sure where I could find a replacement to finish the project, which was, at least, not utterly lost, because I had put it on a memoQ Cloud server for testing. The next day I got lucky: about 50 km away there was a Worten, where I picked up a gamer laptop with lots of RAM and an SSD. Well, not so lucky, as it was a Hewlett Packard Omen, with a fan prone to failure, but that's another story....
This new laptop was my first encounter with Windows 10. I had heard that this operating system offered improved speech recognition capabilities. I prefer to dictate my translations, and downloading the 3 GB installation file for Dragon NaturallySpeaking (DNS) from my server at the office was going to take forever, so I thought I would give Windows 10 speech recognition a try. I hadn't installed my CAT tool of choice yet, so I fired up Microsoft Word and began dictating. "Not bad," I thought. Then I tried it in my translation environment, and the results were a complete disaster. So I put that mess out of my mind.
Since then there have been some notable advances in speech-to-text capabilities on a number of platforms. But DNS, the best solution for my languages (German and English), became increasingly cranky thanks to Nuance's neglect of the product. Every week I read new reports of trouble with DNS in a variety of environments in which it used to perform very well. Apple's iOS 13 was a great leap forward of sorts for speech recognition and voice-controlled editing, but the new features are only available in English, and having Voice Control activated totally screws up my otherwise rather good dictation in German and Portuguese (or any other language). And don't get me started on the crappy vocabulary addition feature, which uses text entry alone with no link to actual pronunciation. Good luck with that garbage. With the additional command features in Hey memoQ, iOS dictation is not a bad solution, but it is not yet completely up to reasonable professional standards.
I probably would have given no further thought to Windows 10's speech-to-text features if it weren't for Anthony Rudd. We've corresponded a bit since I bought his excellent book on regular expressions for translators (and there's another practical guide for us coming soon from him!), and in a recent discussion he alluded to the use of Unicode with regex as a simple way of dealing with some things another colleague was struggling with. I was intrigued by this, and so for about half a day, I ran down a rabbit hole, testing Unicode subscripts and superscripts for a variety of purposes like fixing bad OCR of footnote markers and empirical formulae, autocorrecting common expressions for subscripted variables and chemical terms, including subscripts and superscripts in term bases and much more. Fascinating and useful stuff on the whole, even if some fonts don't support it well.
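For readers who want to try this kind of fix, here is a minimal sketch in Python of the general idea: a regular expression finds the digits, and a translation table swaps them for their Unicode subscript or superscript counterparts. The function names and the two patterns are assumptions made up for illustration; they are not taken from Anthony's book or from any of the tools mentioned in this post, and real OCR output will probably need more careful patterns.

```python
import re

# Translation tables from ASCII digits to Unicode subscripts and superscripts.
SUBSCRIPTS = str.maketrans("0123456789", "₀₁₂₃₄₅₆₇₈₉")
SUPERSCRIPTS = str.maketrans("0123456789", "⁰¹²³⁴⁵⁶⁷⁸⁹")

# Assumption: in a formula, digits that directly follow a letter or a
# closing parenthesis are atom counts and should be subscripted.
FORMULA_DIGITS = re.compile(r"(?<=[A-Za-z)])\d+")

# Assumption: a footnote marker flattened by OCR is 1 to 3 digits right after
# a lowercase letter plus punctuation, followed by whitespace or end of text.
FOOTNOTE_DIGITS = re.compile(r"(?<=[a-z][.,;:])\d{1,3}(?=\s|$)")


def subscript_formula(formula: str) -> str:
    """Subscript the atom counts in a string that holds only a formula."""
    return FORMULA_DIGITS.sub(lambda m: m.group().translate(SUBSCRIPTS), formula)


def superscript_footnotes(text: str) -> str:
    """Superscript digits that look like flattened footnote markers."""
    return FOOTNOTE_DIGITS.sub(lambda m: m.group().translate(SUPERSCRIPTS), text)


if __name__ == "__main__":
    print(subscript_formula("Ca(NO3)2"))
    print(superscript_footnotes("as shown earlier.12 The next section"))
```

Note that `subscript_formula` is meant for strings that contain nothing but a formula; run on ordinary text, it would also subscript the digits in something like "page2". The lookbehind in the footnote pattern requires a letter before the punctuation, so decimals such as "3.14" are left alone.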
And of course I looked at using these special Unicode characters in speech-to-text applications. DNS had some funky quirks (not allowing numbers in the "spoken" version of terms, for example), but it worked rather well, so I can now say "calcium nitrate formula" and get Ca(NO₃)₂ without much ado. And for some reason it occurred to me to give Windows 10 speech recognition a try, just because I was curious whether vocabulary could in fact be trained. Indeed it can, and that feature is better by far than its counterparts in iOS 13 or DNS.
But first I had to remember how to activate speech recognition for Windows on my laptop again. When in doubt, type what you're looking for in the search box....
Notice I've pinned Windows Speech Recognition to my taskbar on the right, which is good for quick tasks.
Gesucht, gefunden. Unlike other speech recognition solutions, the one in Windows 10 works only for the language set for the operating system. And options there are limited to English (United States, United Kingdom, Canada, India, and Australia), French, German, Japanese, Mandarin (Chinese Simplified and Chinese Traditional) and Spanish.
I put on my trusty Plantronics earset (the best microphone I've used for dictation tasks or audio in my occasional webinars in the past year) and began to dictate, first in Microsoft Word, which had shown acceptable results in my tests long ago. I found that adding vocabulary in the Speech Dictionary (accessed via the context menu in the dictation control element shown as a graphic at the top of this post) was dead simple.
After clicking or speaking "Insert", the text will be written to the target field with the proper formatting.
And more: I use a lot of spoken commands for keyboard shortcuts when I work, so I did a little research and testing. It seems that Windows 10 speech recognition gives full access to an application's keyboard shortcuts via voice. So in memoQ, for example, I can dictate the insertion of tags, items from the Translation Results pane and a lot more. Watch out, Nuance. Windows 10 is going to kick your Dragon's scaly butt!