Speech recognition software: Good supplement to keyboard
Munich - It may have sounded like science fiction 20 years ago, but it's now reality: computers listen to what you say.
Dragon NaturallySpeaking 10 is the latest incarnation of Nuance's speech recognition software for PCs. And rival company Linguatec has announced that the successor to Voice Pro 11 will be released this autumn. That means that updated versions of the two key programs in the voice recognition industry will be available in stores at the same time. But what are these programs really capable of, and who can benefit the most from them?
The development process at Linguatec focused on improved interaction with Vista, better recognition accuracy, and a more intuitive user interface. Nuance mentions similar goals as well. Dragon NaturallySpeaking 10 aims at providing improved recognition in less time, intuitive voice commands and hence improved control of the PC.
Firefox and Thunderbird are now also supported by Dragon, joining Outlook, Internet Explorer and the applications in the Office suite. Users can achieve roughly 80 per cent recognition precision without training the software, says Martin Held from Nuance. The optimum 99 per cent accuracy can only be achieved by going through the software's built-in training procedure.
That involves the user reading pre-selected texts aloud into the computer. "The system uses those acoustic signals to determine typical characteristics for the respective user," explains Wolfgang Hoeppner, professor of computational linguistics at the University of Duisburg-Essen.
The modern programs can even handle homophones - words that sound alike but are spelled differently. They draw on text databases to do so, telling words like "reign" and "rain" apart based on how likely each is to appear alongside other, already recognised words. The rule of thumb: "The more limited the vocabulary, the better the speech recognition," Hoeppner says.
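The idea can be illustrated with a minimal Python sketch. It is not how Dragon or Voice Pro work internally, which the article does not describe; the tiny corpus, the homophone table and the function name pick_spelling are all invented for illustration. The sketch simply counts how often word pairs occur in a sample text and picks the spelling that fits its neighbours best.

```python
from collections import Counter

# Toy "text database": a made-up corpus, purely for illustration.
CORPUS = (
    "the heavy rain fell all night . "
    "the rain stopped at dawn . "
    "the long reign of the queen ended . "
    "her reign lasted fifty years ."
).split()

# Hypothetical homophone table: spellings that sound alike.
HOMOPHONES = {
    "rain": {"rain", "reign"},
    "reign": {"rain", "reign"},
}

# Count how often each pair of adjacent words occurs in the corpus.
BIGRAMS = Counter(zip(CORPUS, CORPUS[1:]))


def pick_spelling(previous_word, heard_word, next_word):
    """Return the spelling of heard_word that best fits its neighbours."""
    # Words without known homophones are returned unchanged.
    candidates = HOMOPHONES.get(heard_word, {heard_word})

    def score(candidate):
        # How often the candidate follows the previous word,
        # plus how often it precedes the next word.
        return (BIGRAMS[(previous_word, candidate)]
                + BIGRAMS[(candidate, next_word)])

    # Sorting makes ties resolve alphabetically, so results are repeatable.
    return max(sorted(candidates), key=score)


print(pick_spelling("heavy", "reign", "fell"))
print(pick_spelling("her", "rain", "lasted"))
```

With this sample text, the first call settles on "rain" and the second on "reign", whichever spelling was initially guessed. Commercial programs presumably rely on far larger text collections and more refined statistics, but the sketch also hints at why a limited vocabulary helps: the fewer plausible candidates there are, the clearer the counts.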
That partly explains why the technology is already quite common in industries with relatively small and manageable vocabularies, such as medicine and law. Nuance and Linguatec in fact offer versions for those users with the specialised vocabulary pre-installed. Anyone working outside a limited vocabulary must accept a lower level of recognition precision, Hoeppner feels.
Dictation is primarily of interest for users who must write in high volumes, but cannot do so particularly quickly or accurately. Nuance promises writing speeds of up to 160 words per minute. That's the speed at which people talk, a study by the company found.
As the programs run parallel to the Office or e-mail applications themselves, a powerful PC is required: Nuance recommends at least one gigabyte (GB) of RAM and a 2.4 gigahertz (GHz) processor or a 1.7 GHz dual-core processor. The still current version of Voice Pro requires 1.5 GHz and 512 MB of RAM.
The consumer version of Dragon NaturallySpeaking is available from 99 dollars, while professional variants with added functionality start at 199 dollars. Lawyers and doctors pay 999 dollars. Linguatec offers that last group a bit of relief: the still current version of Voice Pro for doctors or lawyers costs 399 dollars, while a full version for all other users costs 199 dollars.
But don't throw away your keyboard and mouse just yet. It's impractical to control the computer using only voice commands, notes Held from Nuance. The more efficient way to work is to use all "interfaces," including both hands and the third helper - the voice. (dpa)