Real-time translation tech moves us closer to ‘language transparent’ society
Language isn’t the barrier it used to be.
Connect Google’s newest headphones to a Google Pixel smartphone and the Google Assistant will translate conversations in more than 40 languages almost instantly.
The cameras in some smartphones can translate street signs and restaurant menus.
Video chat software from Skype and others translates fast enough to allow two people speaking different languages to talk almost seamlessly.
And technology developed at Carnegie Mellon University can translate lectures and speeches in real time.
Technology is pushing us toward a “language transparent” society, said Alex Waibel, a professor and researcher at Carnegie Mellon University’s Language Technologies Institute.
“A world where we naturally maintain our individual languages but operate unimpeded by language boundaries through tech in a way as if the barriers didn’t exist,” Waibel said. “And that means providing language translation delivered seamlessly in such a way that we don’t notice the barrier any more.”
Major tech companies are making big plays on translation. Facebook announced this month that its Messenger app will translate conversations in real time. Microsoft’s translation app will now use artificial intelligence and machine learning even when it’s not connected to the internet or a cellular signal. Amazon snagged some of CMU’s top translation researchers and set up an engineering center in Pittsburgh’s South Side to develop ways to seamlessly shift Amazon products and content between languages. The team helped launch Amazon.com in Spanish in early 2017, shortly after Amazon opened its Pittsburgh office.
With augmented reality, speech can be translated to text and projected onto glasses. Targeted audio speakers can direct sound so precisely that several different languages could be broadcast around the same conference room table without confusion or interference. Waibel is even working on silent speech implants, tiny devices inserted into people’s cheeks and mouth that monitor the vibrations of speech so that a person can whisper in their own language and it can be translated and broadcast in any other language.
“That’s a bit of science fiction,” Waibel said.
What isn’t science fiction is what Google now offers with its Pixel Buds Bluetooth headphones and the Google Assistant, the company’s AI-powered answer to Alexa and Siri. Pair the Pixel Buds with a Pixel smartphone and download the Google Translate app and your phone becomes a translation station. The app translates what you say into text displayed on the phone’s screen and into speech played through the phone’s speakers. When someone else is speaking, the phone can translate that into text on the screen and speech played into the headphones.
The Google Translate app also comes with a camera function that will translate text on the screen. Google’s camera function and a speech translation service are available on iPhones and other phones.
Google hopes this feature will “make the world seem a little smaller,” the company wrote in response to an inquiry from the Tribune-Review.
“It’s a pretty amazing technology, and Google’s working on improving it,” said Steve Van Dinter, public relations manager with Verizon, which gave the Trib a Google Pixel 2 phone and Pixel Buds to demo. “I can imagine being able to travel and having those in and how much better that experience is for someone going overseas.”
Don’t expect Google Translate or other translation technology to completely replace human translators. Alex Rudnicky, also a researcher at CMU’s Language Technologies Institute, said he envisions a future where technology and humans work together to translate text and speech where nuance and accuracy are important. Think Shakespeare, Rudnicky said.
“And let’s talk about poetry,” Rudnicky said. “It’s evocative. It’s not just for communicating meaning. It’s for communicating feelings. It touches on the shared experiences of the world.
“I can imagine, eventually, you can build systems that kind of sort of do that, but really, the easiest way to create that sort of text is to get a human in there.”
Software and artificial intelligence also miss the gestures, facial expressions and other forms of communication that accompany and provide rich context to speech. But machine translation isn’t designed to capture all that.
“Roughly, if you consider what machine translation is used for, and will be for the foreseeable future, accurately communicating meaning is probably good enough,” Rudnicky said.
Waibel said Google and others have been playing catch-up on this technology for more than a decade.
Waibel founded Mobile Technologies in 2008 and in 2009 launched Jibbigo, a speech-to-speech translation app for the iPhone. Waibel said it was the first such app. Apple featured it in a commercial advertising the iPhone 3GS.
“And Google said at the time it couldn’t be done,” Waibel said.
Facebook bought Mobile Technologies in 2013. Waibel and his team went to work for Facebook, full of hope that the large company could scale the technology and use it to make the world more open and connected, Facebook’s mission statement at the time.
It didn’t. Team members were siphoned off to work on other projects and speech translation was left as a low priority. Waibel left.
“I was saddened that the mission of really connecting the world wasn’t at the head of things,” Waibel said.
Waibel’s technology, from early algorithms to deep-learning neural networks, has pushed computer speech and automated translation since the 1970s. He is thankful his first professor didn’t laugh when Waibel told him in 1976 that he wanted to make computers convert text to speech.
“This was really a dream of mine when I was a student,” Waibel said.
Waibel came to CMU in 1979 to continue his research. Facebook bought a company he founded. Google has chased his innovations for the last decade. This month, Waibel traveled to Geneva, Switzerland, to talk to translators at the United Nations. Humans still interpret speeches at the UN in real time.
Waibel’s latest technology could change that.
Aaron Aupperlee is a Tribune-Review staff writer. Reach him at 412-336-8448 or via Twitter @tinynotebook.