IBM moves closer to human-like accuracy for speech recognition

PanARMENIAN.Net - The tech world has spent years trying to create speech recognition software that listens as well as humans. Now, IBM says it's achieved a 5.5 percent word error rate, down from its previous record of 6.9 percent -- an industry milestone that could eventually lead to improvements in voice assistants like Siri and Alexa, Engadget said.

Microsoft claimed to reach a 5.9 percent word error rate last October using neural language models resembling associative word clouds. At the time, the company believed 5.9 percent was equivalent to human parity. But IBM says it's not popping the champagne yet. "As part of our process in reaching today's milestone, we determined human parity is actually lower than what anyone has yet achieved -- at 5.1 percent," George Saon, IBM principal research scientist, wrote in a blog post this week.

IBM reached the 5.5 percent milestone by combining Long Short-Term Memory (LSTM), a type of artificial neural network, and WaveNet language models with three strong acoustic models. The system was then measured using the "SWITCHBOARD" corpus, a collection of telephone conversations that's been used as a benchmark for speech recognition software for decades. SWITCHBOARD is not the industry standard for measuring human parity, however, which makes breakthroughs harder to achieve.
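
For readers unfamiliar with the metric, word error rate is conventionally computed as the number of word substitutions, deletions and insertions needed to turn a system's transcript into the reference transcript, divided by the number of words in the reference. The Python sketch below illustrates that general definition only; it is not IBM's evaluation code, and the function name and sample sentences are made up for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Illustrative sentences (not from SWITCHBOARD): one dropped word out of six.
rate = word_error_rate("the cat sat on the mat", "the cat sat on mat")
print(f"{rate:.1%}")
```

By this definition, a 5.5 percent word error rate corresponds to roughly one word-level error for every 18 words in the reference transcript.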

"The ability to recognize speech as well as humans do is a continuing challenge, since human speech, especially during spontaneous conversation, is extremely complex," said Julia Hirschberg, a professor and Chair at the Department of Computer Science at Columbia University, in a statement to IBM. "It's also difficult to define human performance, since humans also vary in their ability to understand the speech of others."
