DEEPFAKES CAN REPLICATE HUMAN VOICES NOW

It’s not just your face that can be convincingly replicated by a deepfake. It’s also your voice, and quite easily, as journalist Chloe Veltman found:

Given the complexities of speech synthesis, it’s quite a shock to find out just how easy it is to order one up. For a basic conversational build, all a customer has to do is record themselves saying a bunch of scripted lines for roughly an hour. And that’s about it.

“We extract 10 to 15 minutes of net recordings for a basic build,” says Speech Morphing founder and CEO Fathy Yassa.

The hundreds of phrases I record so that Speech Morphing can build my digital voice double seem very random: “Here the explosion of mirth drowned him out.” “That’s what Carnegie did.” “I’d like to be buried under Yankee Stadium with JFK.” And so on.

But they aren’t as random as they appear. Yassa says the company chooses utterances that will produce a wide enough variety of sounds across a range of emotions – such as apologetic, enthusiastic, angry and so on – to feed a neural network-based AI training system. It essentially teaches itself the specific patterns of a person’s speech.

CHLOE VELTMAN, “SEND IN THE CLONES: USING ARTIFICIAL INTELLIGENCE TO DIGITALLY REPLICATE HUMAN VOICES” AT WAMU 88.5 (AMERICAN UNIVERSITY RADIO) (JANUARY 17, 2022)

And how did Chloe feel about “Chloney,” her digital voice? “Chloney sounds quite a lot like me. It’s impressive, but it’s also a little scary.”
