Google announces Gemini 3.5 Live Translate for instant voice-to-voice translation - Ars Technica

Voice translations preserve speaker’s tone, pacing, pitch—with SynthID watermarks for security.
Google has been chasing real-time translation for years, calling it one of its “pioneering machine learning experiments.” We’ve seen numerous demos on stage at Google events in the past, but they required Google phones, earbuds, or some other specific setup. Last year, Google brought real-time translation to more users in the Translate app, and now it’s expanding availability further. With the release of Gemini 3.5 Live Translate, you’ll have access to instant translation in more places and with lower latency than before.
The new AI model is part of the version 3.5 family that launched at I/O. Before today, Google had only rolled out the Flash version, but we’re expecting a Pro model to drop in the coming weeks. Gemini 3.5 Live Translate is a speech-to-speech model tuned to automatically detect and translate more than 70 languages.
Google says Gemini 3.5 Live Translate is fast enough to keep up with a normal conversation, following just a few seconds behind the speaker while also matching intonation, pacing, and pitch. In short, the voice sounds more like you than a generic robot. The demos, which are all recorded under controlled conditions, do sound impressive. You won’t have to wait long to verify the model’s abilities for yourself, though.
Gemini 3.5 Live Translate is rolling out across several parts of the Google ecosystem. Developers can begin building with a public preview in the Gemini Live API or AI Studio. The model processes speech continuously and handles all the multilingual inputs automatically, saving developers from manually configuring settings. It also filters out background noise in busy environments.
Select enterprise customers will also get access to the new translation model in Google Meet starting this month in advance of a wider rollout. Google says it’s tweaking the Meet interface to bring the live translate feature to the front, too. Most notably, 3.5 Live Translate will come to the Google Translate app on both Android and iOS soon.
At the tail end of last year, Google began testing Gemini-based live translation in the app with any earbuds (and in the iOS app); previously, you needed to have the company’s Pixel Buds with an Android phone. The pending update will expand further with the addition of the latest 3.5 model. Not only can you use any earbuds, you don’t need earbuds at all. If you don’t have any handy, you can hold the phone up to your ear like you’re on a call to hear the spoken translation. However, this “listening mode” only works on Android at this time.
The audio streams from Gemini 3.5 Live Translate are intended to sound lifelike even if they don’t exactly mimic the user’s voice. However, Google is still proceeding cautiously. All Gemini 3.5 Live Translate audio streams will have SynthID watermarks integrated into the waveform data. This will mark the speech as AI-generated, and there is (currently) no way to remove that.
