In a very short and bizarre demonstration, Amazon showed how Alexa can mimic the voice of a dead relative to read bedtime stories or perform other tasks that call for “human-like empathy.” The feature is still experimental, but according to Amazon, Alexa needs only a few minutes of audio to impersonate someone’s voice.
The demonstration was tucked into the middle of Amazon’s annual re:MARS conference, an industry get-together that focuses on machine learning, space exploration, and other heady stuff. In it, a young child asks Alexa if Grandma can read The Wizard of Oz, and the speaker responds accordingly in a synthesized voice.
“Instead of Alexa’s voice reading the book, it’s the kid’s grandma’s voice,” Rohit Prasad, Amazon’s head scientist for Alexa AI, told a quiet crowd after the demo.
Prasad points out that “so many of us have lost someone we love” to the pandemic, and claims that AI voice synthesis can “make their memories last.” This is obviously a controversial idea—it’s morally questionable, we don’t know how it could impact mental health, and we’re not sure how far Amazon wants to push the technology. (I mean, can I use a dead relative’s voice for GPS navigation? What’s the goal here?)
Amazon’s advanced voice synthesis tech is also worrying. Previously, Amazon duplicated the voices of celebrities like Shaquille O’Neal using several hours of professionally recorded content, but the company now claims that it can copy a voice with just a few minutes of audio. We’ve already seen how synthesized voices can assist in fraud and robbery, so what happens next?
We don’t know if Amazon will ever debut this voice synthesis feature on its smart speakers. But audio deepfakes are basically inevitable. They’re already a huge part of the entertainment industry (see Top Gun: Maverick for an example), and Amazon is just one of many companies trying to clone voices.
Source: Amazon via The Verge