Baidu Introduces SwiftScribe, an AI Transcription App


Chinese internet giant Baidu has introduced SwiftScribe, a speech recognition program that uses artificial intelligence to convert speech into text and learns from the corrections users make to that text.

The program is mainly intended to help people such as interviewers convert voice recordings into text more quickly. According to Baidu, transcription with SwiftScribe is on average 40 percent faster than converting speech to text manually. The Chinese company is inviting thirty to fifty people to test the beta version.

SwiftScribe uses Deep Speech 2, Baidu's speech recognition system. Its neural network has been trained on thousands of hours of audio recordings, learning to associate sounds with particular words and phrases, so it should be able to transcribe with relative accuracy. In addition, SwiftScribe is able to learn from manual transcriptions and from the text changes users apply.

Users can upload a file in WAV or MP3 format. According to VentureBeat, which interviewed a Baidu project manager, transcribing a 30-second file takes a total of just 10 seconds, and a one-minute audio recording takes less than thirty seconds. SwiftScribe can transcribe audio files of up to an hour. Users will still need to add capitalization and punctuation and correct the spelling of certain words.
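
The reported timings can be expressed as a real-time factor: processing time divided by audio length, where a value below 1.0 means transcription runs faster than playback. The short Python sketch below only illustrates that arithmetic using the figures quoted above. The function name `real_time_factor` is hypothetical and is not part of SwiftScribe or Deep Speech 2, and the one-minute figure is an upper bound, since the article says only "less than thirty seconds".

```python
def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """Return processing time divided by audio duration.

    Values below 1.0 mean the transcription finishes faster than real time.
    Hypothetical helper for illustration; not a SwiftScribe API.
    """
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


# (processing seconds, audio seconds) as quoted in the article.
# The second pair is an upper bound: "less than thirty seconds".
reported_timings = [(10, 30), (30, 60)]

for processing, audio in reported_timings:
    rtf = real_time_factor(processing, audio)
    print(f"{audio}-second recording: real-time factor of about {rtf:.2f}")
```

These two data points say nothing certain about longer recordings, so the sketch does not extrapolate to the one-hour limit.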

Last year, Deep Speech 2 proved capable of recognizing spoken words very accurately. Researchers found that in some cases the technology was able to transcribe Standard Mandarin, the official spoken language of China, better than a person could.
