Adding AI and ML to Speech-to-Text and Language Translations Are Game Changers

At Google I/O, Sundar Pichai showed off an AI-based technology called Duplex, in which a computer called a restaurant to make a reservation in a natural human voice and interacted directly with the person taking reservations there.

This particular AI announcement got a lot of coverage at Google I/O, and given its importance and the technology breakthrough it delivered, it deserved to be highlighted as one of the most important announcements to come out of this year’s Google developer conference. However, for those of us at the conference, it was clear that the theme of AI and machine learning ran through all the products and services Google showed at the event.

The day before Google I/O opened, the company held a special analyst event that specifically focused on AI. The chart below was shared at this event and underlines the fact that AI is used across all Google products.

While the media mostly highlighted things like Duplex and the way AI is used in Android P, servers and G-Suite, there were two other things Google showed us at the analyst event that I consider potential game changers.

The first is how AI is applied to voice-to-text translation. Google’s goal is to reach 99% accuracy using AI and ML over the next few years. That said, the demos they showed us, in which they dictated comments into various G-Suite applications, were pretty accurate even now. They also gave us a more in-depth look at the new AI feature called Smart Compose, where a person writes a sentence and it writes the next sentence based on the first sentence’s context. Smart Compose will work with either keyboard or voice input.

We have had various voice recognition products such as Dragon Dictate on the market for years. But these programs relied on localized software and were limited to the processing power available at the time of each release. These programs did get better over the years, but if you add AI and ML to this problem, the accuracy rate is bound to improve further.

Google understands the importance of speech-to-text as it relates to our everyday lives. An accurate voice-to-text interface is critical when answering a message while driving. It is a meaningful way to respond to an email or text message on wearables or smartphones. It will eventually become a valuable input method for mixed reality glasses, where voice is part of the navigation process and voice-to-text is needed for various types of AR applications.

The second is how AI and ML are used in Google’s translation programs. Most of us are familiar with Google Translate now, but Google said that applying AI and ML to future versions of the translation program will increase the accuracy rate dramatically. Google Translate does a good job today but will deliver more precise translations thanks to Google’s own AI and ML-based technology.

But where AI-based translation could be genuinely transformative is with actual language translation in real time. As an international traveler who only speaks English, I would find this type of translation a godsend when communicating with locals in Japan, China, South Korea, France, Spain, Greece, Italy and other places I travel to. There are some handheld devices out there today that attempt to translate what you say into a local language, but if you have ever used one, you know they do not work well and have a lot of limitations regarding what they understand and can translate.

Google has their eye on this type of translation too, and it is safe to say that with their AI and ML-based research going strong in this area of text and voice translation, we could see some real breakthroughs in more accurate language translation on Android phones soon. Apple also has AI and ML research going on around various aspects of voice and text translation, and they too, along with potential partners, could deliver a mobile language translation solution on iOS someday.

AI and ML will have a dramatic impact on voice-to-text translation, and its most prominent effect may be in the UI of AR, VR or mixed reality glasses, where it serves as part of the information input system. Personally, the language translations excite me the most, as they would make my world travels easier: I could speak with locals and have our conversation translated in real time.

As I stated above, Google is integrating AI and ML into all of their products, and this will impact everything they bring to market. But voice-to-text and language translation may be two of the most practical ways to use AI and ML to enhance our digital lifestyles.

Published by

Tim Bajarin

Tim Bajarin is the President of Creative Strategies, Inc. He is recognized as one of the leading industry consultants, analysts and futurists covering the field of personal computers and consumer technology. Mr. Bajarin has been with Creative Strategies since 1981 and has served as a consultant to most of the leading hardware and software vendors in the industry including IBM, Apple, Xerox, Compaq, Dell, AT&T, Microsoft, Polaroid, Lotus, Epson, Toshiba and numerous others.
