Meta continued its push into the increasingly crowded AI field on Friday, announcing the creation of a tool called Voicebox. It’s a speech-generating app with a variety of potential use cases, but it’s also ripe for misuse, as Meta admits, which is exactly why the social media giant isn’t releasing Voicebox to the public yet.
Unlike previous speech generation platforms, Meta claims, Voicebox can perform speech generation tasks it hasn’t been specifically trained on. Given text input and a short audio clip for context, the AI tool can create a potentially believable piece of new speech that sounds like whoever was in the source clip.
“Before Voicebox, generative AI for speech required specific training for each task using carefully prepared training data,” said Meta AI. “Voicebox uses a new approach to learning from raw audio only and an accompanying transcript.”
Generative AI is a type of program that can generate text, images, or other media in response to user requests. Meta AI said Voicebox can output audio in six languages (English, French, German, Spanish, Polish and Portuguese) and can do so in a way that is closer to how people naturally speak in the real world.
Meta suggests that the tool can be used to make conversations across languages sound more natural, or to deliver more natural-sounding video game character dialogue. But Voicebox also seems like a faster and cheaper way to create copycat “deepfake” dialogue, making it sound like someone (perhaps a public figure or celebrity) said something they didn’t actually say.
While Voicebox may be a breakthrough in AI development, Meta AI also recognized the potential for misuse, saying the company has developed classifiers that distinguish between Voicebox-generated audio and real human speech. Similar to spam filters, AI classifiers are programs that sort data into different groups or classes, in this case human or AI-generated.
Meta highlighted the need for transparency in AI development in its blog post, saying it’s crucial to be open with the research community. However, the company also said it has no plans to make Voicebox publicly available because the technology could be exploited in harmful ways.
“There are many exciting use cases for generative speech models, but due to the potential risks of misuse, we are not currently making the Voicebox model or code publicly available,” a Meta AI spokesperson told Decrypt in an email.
“While we believe it is important to be open with the AI community and share our research to advance the state of the art in AI,” the spokesperson continued, “it is also necessary to strike the right balance between openness and accountability.”
Instead of releasing the tool in a functional state, Meta shared audio samples and a research paper to help fellow researchers understand its potential.
AI risks emerge
As AI tools, especially AI chatbots, have become more commonplace since the launch of OpenAI’s ChatGPT last November, rapid advances in AI have led global leaders to sound the alarm about the potential misuse of the technology.
On Monday, the UN Secretary-General reiterated the need to take warnings about generative AI seriously.
“The alarm bells about the latest form of artificial intelligence, generative AI, are deafening, and they are loudest from the developers who designed it,” UN Secretary-General António Guterres said at a press conference. “Scientists and experts have called on the world to act, declaring AI an existential threat to humanity on a par with the risk of nuclear war.”
As troubling as the threat of global nuclear war may be, that possibility remains in the realm of science fiction and Hollywood blockbusters. A more likely misuse of generative AI comes from scams in which deepfake images and voices are used to dupe victims out of money, for example, or, as the UN stated in a recent report, from the stoking of hate and disinformation online.
A deepfake is an increasingly common type of AI-created video or audio content that depicts false events in a way that can be very difficult to identify as fake.
In April, CNN reported that scammers used AI technology to clone the voice of an Arizona woman’s 15-year-old daughter, claiming they had kidnapped the teenager and demanding a ransom. And in March, an AI-generated image of former President Donald Trump being arrested went viral after being shared on social media.