IBM beats Microsoft at Speech Recognition accuracy

Last year, Microsoft made an impressive breakthrough in speech recognition. The company claimed that its speech recognition technology had reached “human parity” with a WER (Word Error Rate) of only 5.9%. Now IBM has achieved an even lower WER with its own speech recognition technology. The company claims a 5.5% word error rate, beating Microsoft’s 5.9% record by 0.4 percentage points.
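For readers unfamiliar with the metric, WER is conventionally computed as the number of word substitutions, deletions, and insertions needed to turn the system's transcript into the reference transcript, divided by the number of words in the reference. The Python sketch below illustrates that standard definition only. The function name `word_error_rate` and the sample sentences are made up for illustration, and this is not the scoring pipeline IBM or Microsoft used, which would also involve text normalization that is omitted here.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # reference word missing (deletion)
                curr[j - 1] + 1,     # extra hypothesis word (insertion)
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: one substituted word out of four.
print(word_error_rate("turn the lights off", "turn the light off"))
```

By this definition, a 5.5% WER means roughly 5 or 6 word errors for every 100 words in the reference transcript.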

Microsoft previously beat IBM’s 6.9% WER record by achieving a 6.3% error rate back in September of 2016. So it probably won’t be long until Microsoft hits back at IBM.

What’s interesting is that IBM says it has yet to reach human parity. Unlike Microsoft, IBM puts human parity at a WER of 5.1%, a level no speech recognition technology has achieved so far. George Saon, an IBM principal research scientist, said:

“Reaching human parity – meaning an error rate on par with that of two humans speaking – has long been the ultimate industry goal. Others in the industry are chasing this milestone alongside us, and some have recently claimed reaching 5.9 percent as equivalent to human parity…but we’re not popping the champagne yet. As part of our process in reaching today’s milestone, we determined human parity is actually lower than what anyone has yet achieved — at 5.1 percent.”

IBM said in a blog post that it was able to achieve a lower error rate than Microsoft by combining LSTM (Long Short-Term Memory) and WaveNet language models.