Google is working on a method to translate speech directly


Google researchers are working on a way to translate speech directly into another language without converting it to text first. The system, called Translatotron, can also preserve the speaker’s voice in the translated speech.

The technique uses a neural network that takes spectrograms of the source speech as input and generates spectrograms of the corresponding speech in the target language. According to the researchers, Translatotron is the first end-to-end model that can directly translate speech in one language into speech in another.
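To make the term concrete, a spectrogram describes how much energy an audio signal carries at each frequency over time. The sketch below computes a plain magnitude spectrogram with NumPy for a synthetic 440 Hz tone. It is only an illustration: the function name `magnitude_spectrogram`, the frame length, the hop size and the test tone are assumptions chosen here, not the features or settings the Google researchers use.

```python
import numpy as np


def magnitude_spectrogram(signal, frame_len=256, hop=128):
    """Return a (frames, frame_len // 2 + 1) magnitude spectrogram.

    Hypothetical helper for illustration; not taken from Translatotron.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Slice the signal into overlapping, windowed frames.
    frames = np.stack([
        signal[i * hop : i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    # One FFT per frame; keep only the magnitude of each frequency bin.
    return np.abs(np.fft.rfft(frames, axis=1))


# Assumed test input: one second of a 440 Hz sine wave sampled at 8 kHz.
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 440.0 * t)

spec = magnitude_spectrogram(tone)

# The strongest frequency bin should sit close to 440 Hz.
peak_bin = int(spec.mean(axis=0).argmax())
peak_hz = peak_bin * sample_rate / 256
```

Each row of `spec` is one short slice of time and each column is one frequency band. A model like the one described takes such a time-frequency grid as input and produces another one as output.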

It is already possible to translate speech and have it spoken in another language, but existing systems first convert the speech into text, translate that text, and then convert the result back into speech. That is also how Google Translate currently works.
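The difference between the two approaches can be sketched in a few lines of Python. Everything here is hypothetical: the function names and the toy stand-ins are invented for illustration and do not correspond to any Google API or to the actual Translatotron code. The point is only that the cascade passes through text twice, while the direct approach never does.

```python
from typing import Callable, List

Audio = List[float]  # stand-in type for an audio signal or spectrogram


def cascade_translate(
    audio: Audio,
    recognize: Callable[[Audio], str],
    translate_text: Callable[[str], str],
    synthesize: Callable[[str], Audio],
) -> Audio:
    """Conventional pipeline: speech -> text -> translated text -> speech."""
    text = recognize(audio)
    translated = translate_text(text)
    return synthesize(translated)


def direct_translate(audio: Audio, model: Callable[[Audio], Audio]) -> Audio:
    """End-to-end approach: one model maps speech to speech, with no text step."""
    return model(audio)


# Toy stand-ins so the sketch runs; real systems would use trained models.
def toy_recognize(audio: Audio) -> str:
    return "hola"


def toy_translate_text(text: str) -> str:
    return {"hola": "hello"}.get(text, text)


def toy_synthesize(text: str) -> Audio:
    return [float(ord(ch)) for ch in text]


def toy_direct_model(audio: Audio) -> Audio:
    return [sample * 0.5 for sample in audio]


cascade_out = cascade_translate(
    [0.1, 0.2], toy_recognize, toy_translate_text, toy_synthesize
)
direct_out = direct_translate([0.1, 0.2], toy_direct_model)
```

In the cascade, anything that text cannot carry, such as the sound of the speaker's voice, is lost at the first step. The direct path keeps the audio as audio from start to finish.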

By translating speech directly, without first turning it into text, the speaker’s voice can also be preserved, according to Google. An optional speaker encoder is used for this; its job is to ensure that the characteristics of the speaker’s voice carry over into the translated speech.

It is not yet known whether or when Translatotron will be used in practice. Examples of the new translation method are on GitHub. The full study is on arXiv.
