Microsoft’s New AI Can Imitate Your Voice With Just A 3-Second Sample

January 10, 2023 Artificial Intelligence

This new tech comes at a time when voice actors are increasingly concerned about robots taking their jobs.

The AIs are coming and they don’t stop coming. From AI art generators that can make Dungeons & Dragons characters to chatbots that can DM an entire D&D game, AI is becoming increasingly powerful. Now it can mimic not only the styles of various artists but also our voices.

We’ve already seen AI voice tech being used in video games, but Microsoft’s Vall-E promises to be even easier to use. Dubbed a “neural codec language model,” Vall-E (an homage to OpenAI’s Dall-E art generator) has been trained on over 60,000 hours of speech, a training set that Microsoft describes as “hundreds of times larger than existing systems.”

You can hear a demo of Vall-E on Microsoft’s GitHub page (thanks, Rock Paper Shotgun). The system can recreate a specific voice from just three seconds of recorded speech; the user then simply types what they want that voice to say to generate paragraphs upon paragraphs of spoken audio.