Rick Rashid, Microsoft’s Chief Research Officer, gave a demonstration in Tianjin, China, at Microsoft Research Asia’s 21st Century Computing event. He discussed speech recognition in computing and Microsoft's recent breakthrough in the field.
Until recently though, even the best speech systems still had word error rates of 20-25% on arbitrary speech.
Just over two years ago, researchers at Microsoft Research and the University of Toronto made another breakthrough. By using a technique called Deep Neural Networks, which is patterned after human brain behaviour, researchers were able to train more discriminative and better speech recognizers than previous methods.
During my October 25 presentation in China, I had the opportunity to showcase the latest results of this work. We have been able to reduce the word error rate for speech by over 30% compared to previous methods. This means that rather than having one word in 4 or 5 incorrect, now the error rate is one word in 7 or 8. While still far from perfect, this is the most dramatic change in accuracy since the introduction of hidden Markov modeling in 1979, and as we add more data to the training we believe that we will get even better results.
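To put the quoted figures side by side, here is a rough back-of-the-envelope sketch in Python. It only uses the numbers from the quote ("one word in 4 or 5" before, "one in 7 or 8" after); the helper name `relative_reduction` and the pairing of those endpoints are assumptions made for illustration, not something Microsoft published.

```python
def relative_reduction(before: float, after: float) -> float:
    """Fraction by which the error rate fell, relative to the starting rate."""
    return (before - after) / before


# Endpoints taken from the quote; which "before" goes with which "after"
# is not stated, so every combination is printed (an assumption).
before_rates = [1 / 4, 1 / 5]  # "one word in 4 or 5 incorrect"
after_rates = [1 / 7, 1 / 8]   # "one word in 7 or 8"

for before in before_rates:
    for after in after_rates:
        drop = relative_reduction(before, after)
        print(f"{before:.1%} -> {after:.1%}: about {drop:.0%} fewer errors")
```

Depending on which endpoints are paired, the relative drop comes out somewhere between roughly 29% and 50%, which is broadly in line with the "over 30%" figure in the quote.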
He later gave a live demo of the results of this work: whatever he said in English was translated into Chinese with a delay of only a few seconds. Watch the video above to see the magic!
via: Next at Microsoft