The revelation that a documentary filmmaker used voice-cloning software to make the late chef Anthony Bourdain say words he never spoke has drawn criticism amid ethical concerns about the use of the powerful technology.
The film “Roadrunner: A Film About Anthony Bourdain” opened in theaters Friday and mostly features real footage of the beloved celebrity chef and globe-trotting television host before he died in 2018. But its director, Morgan Neville, told The New Yorker that a snippet of dialogue was created using artificial intelligence technology.
The disclosure renews a debate about the future of voice-cloning technology, not only in the entertainment world but also in politics and in a rapidly growing business sector dedicated to transforming text into realistic-sounding human speech.
“Unapproved voice cloning is a slippery slope,” Andrew Mason, founder and CEO of the voice generator Descript, said in a blog post Friday. “As soon as you get into a world where you’re making subjective judgment calls about whether specific cases can be ethical, it won’t be long before anything goes.”
Before this week, most of the public controversy around such technologies centered on the creation of hard-to-detect deepfakes using simulated audio and/or video, and on their use to fuel misinformation and political conflict.
But Mason, who previously founded and led Groupon, said in an interview that Descript has repeatedly rejected requests to bring back a voice, including from “people who have lost someone and are grieving.”
“It’s not even that we want to pass judgment,” he said. “We’re just saying you should have some bright lines in what’s okay and what’s not.”
The angry and uneasy reactions to the voice cloning in the Bourdain case reflect expectations and issues of disclosure and consent, said Sam Gregory, program director at Witness, a nonprofit that works on using video technology for human rights. He said it would have been appropriate to obtain consent and disclose the technique at work. Instead, viewers were stunned, first by the fact of the audio fakery and then by the director’s seeming dismissal of any ethical questions, and expressed their displeasure online.
“It also touches on our fears of death and ideas about how people could take control of our digital likeness and make us say or do things without any way to stop it,” Gregory said.
Neville has not identified what tool he used to recreate Bourdain’s voice, but said he used it for a few sentences that Bourdain wrote but never said aloud.
“With the blessings of his estate and literary agent, we used AI technology,” Neville said in a written statement. “It was a modern storytelling technique that I used in some places where I felt it was important to bring Tony’s words to life.”
Neville also told GQ magazine that he had received the approval of Bourdain’s widow and literary executor. The chef’s widow, Ottavia Busia, responded by tweeting: “Certainly I wasn’t the one who said Tony would be good with this.”
Although text-to-speech research is dominated by tech giants like Microsoft, Google, and Amazon, there are now a number of startups like Descript that provide voice-cloning software. Uses range from customer service chatbots to video games and podcasting.
“We have very strong policies about what can be done on our platform,” said Zohaib Ahmed, founder and CEO of Resemble AI, a Toronto-based company that sells a custom AI voice generator service. “When you’re making a voice clone, it requires consent from whoever’s voice it is.”
Ahmed said the rare occasions on which he has allowed posthumous voice cloning were for academic research, including a project working with the voice of Winston Churchill, who died in 1965.
Ahmed said a more common commercial use is to take a TV ad recorded by real voice actors and then customize it for a region by adding a local reference. He said the technology is also used to dub anime movies and other videos, taking a voice in one language and making it speak another.
He compared it to past innovations in the entertainment industry, from stunt actors to greenscreen technology.
Northeastern University professor Rupal Patel said just seconds or minutes of recorded human speech can help teach an AI system to generate its own synthetic speech, although getting it to capture the clarity and rhythm of Anthony Bourdain’s voice probably took far more training. Patel also runs another voice-generating company, VocaliD, which focuses on customer service chatbots.
“If you wanted it to really speak like him, you needed a lot, maybe 90 minutes of good, clean data,” she said. “You’re building an algorithm that learns to speak like Bourdain does.”
Neville is an acclaimed documentarian who also directed the Fred Rogers portrait “Won’t You Be My Neighbor?” and the Oscar-winning “20 Feet From Stardom.” He began making his latest film in 2019, more than a year after Bourdain died by suicide in June 2018.