by Gene Turnbow
The Universal Translator concept from Star Trek began as a plot device – how exactly does one explain that every creature the crew of the Enterprise encounters in interstellar space speaks perfect English, without a trace of an accent? The answer was a sort of magic gizmo that handled the job. There’s a tendency, though, for the gadgets we saw on Star Trek to inspire real engineers to make them actually work. A lot of the personal technology we take for granted today had its genesis in Star Trek: flip-open communicator devices that let you talk to anyone on the planet via a satellite uplink (cell phones), voice-controlled computers (my cell phone can do this), touch-sensitive tablets that can retrieve and display information on anything that interests you (the PADD – we call them iPads or Android tablets now), and even dermal regeneration that can repair the skin of burn victims by spraying on a concoction made from their own cultured skin cells. Tricorders? Not yet – but that’s being worked on.
The Universal Translator problem may be going away very soon. Frank Soong and Rick Rashid of Microsoft are hard at work prototyping the real thing. Technology Review reports that Soong, a research scientist, demonstrated the software at Microsoft’s Redmond, Washington campus on Tuesday. He created the translation system along with colleagues at Microsoft Research Asia, the company’s second-largest research lab, in Beijing, China. Currently the system needs about an hour of training to learn to produce speech that sounds like the user’s own voice.
For the demonstration, his boss Rick Rashid, who leads Microsoft’s research department, read out text that was immediately translated into Spanish. Craig Mundie, Microsoft’s chief research and strategy officer, did the same, only his translation was into Mandarin. A sample audio clip can be heard here in four versions: the original English and separate Spanish, Italian and Mandarin versions.
“We will be able to do quite a few scenario applications,” Soong promises. “For a monolingual speaker traveling in a foreign country, we’ll do speech recognition followed by translation, followed by the final text to speech output [in] a different language, but still in his own voice.”
After creating a “model” based on the user’s manner of speech, the system converts that model into one that’s able to read out text in another language. This is done by comparing the user’s voice with a stock text-to-speech model for the target language. The system then tweaks the second-language model to match the individual’s articulation as closely as possible. This method can reportedly convert between any pair of 26 languages, including Mandarin Chinese, Spanish and Italian.
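For the curious, the three stages Soong describes (speech recognition, then translation, then text-to-speech in the speaker’s own voice) can be sketched as a simple chain of functions. This is only a toy illustration, not Microsoft’s software: every name below (`translate_speech`, `toy_recognize`, `toy_translate`, `toy_synthesize`, `TOY_PHRASES`, the `"traveler-voice"` label) is made up for the example, the “audio” is just a string, and the stand-in stages do no real recognition, translation, or synthesis.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class SpokenOutput:
    """Stand-in for synthesized audio: what is said, in which language, in whose voice."""
    language: str
    text: str
    voice_id: str


def translate_speech(
    audio: str,
    recognize: Callable[[str], str],
    translate: Callable[[str, str], str],
    synthesize: Callable[[str, str, str], SpokenOutput],
    target_lang: str,
    voice_id: str,
) -> SpokenOutput:
    """Chain the three stages: recognition -> translation -> text-to-speech."""
    text = recognize(audio)                        # stage 1: speech to text
    translated = translate(text, target_lang)      # stage 2: text to text
    return synthesize(translated, target_lang, voice_id)  # stage 3: text to speech


# Hypothetical toy stages so the sketch runs without any speech software.
TOY_PHRASES = {("hello", "es"): "hola", ("thank you", "es"): "gracias"}


def toy_recognize(audio: str) -> str:
    # Pretend the "audio" is already a transcript that only needs tidying.
    return audio.strip().lower()


def toy_translate(text: str, target_lang: str) -> str:
    # Look the phrase up in a tiny table; fall back to the untranslated text.
    return TOY_PHRASES.get((text, target_lang), text)


def toy_synthesize(text: str, language: str, voice_id: str) -> SpokenOutput:
    # A real system would render audio with a voice model adapted to the user.
    return SpokenOutput(language=language, text=text, voice_id=voice_id)


result = translate_speech(
    "  Hello ", toy_recognize, toy_translate, toy_synthesize,
    target_lang="es", voice_id="traveler-voice",
)
print(result)
```

The hard part of the real system is the last stage, where the stock voice for the target language is adjusted to sound like the speaker; the sketch only carries a voice label through the chain to show where that personalization would plug in.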
“The word is just one part of what a person is saying,” says Shrikanth Narayanan, a professor at the University of Southern California, in Los Angeles. To truly convey all the information in a person’s speech, he says, translation systems will need to be able to preserve voices and much more. “Preserving voice, preserving intonation, those things matter, and this project clearly knows that,” he says. “Our systems need to capture the expression a person is trying to convey, who they are, and how they’re saying it.”
- Microsoft’s page on Frank Soong – no relation to Noonian Soong
- Tom’s Guide article