First, OpenAI offered a tool that allowed people to create digital images simply by describing what they wanted to see. Then it built similar technology that produced full-motion video like something out of a Hollywood movie.
Now, it has unveiled technology that can recreate someone’s voice.
The high-profile artificial intelligence start-up said on Friday that a small group of businesses was testing a new OpenAI system, Voice Engine, that can recreate a person’s voice from a 15-second recording. If you upload a recording of yourself and a paragraph of text, it can read the text using a synthetic voice that sounds like yours.
The text does not have to be in your native language. If, for example, you speak English, it can recreate your voice in Spanish, French, Chinese or many other languages.
OpenAI is not sharing the technology more widely because it is still trying to understand its potential risks. Like image and video generators, a voice generator could help spread misinformation on social media. It could also allow criminals to impersonate people online or during phone calls.
The company said it was particularly concerned that this type of technology could be used to breach voice authenticators that control access to online banking accounts and other personal applications.
“This is a sensitive thing and it’s important to get it right,” an OpenAI product manager, Jeff Harris, said in an interview.
The company is exploring ways to watermark synthetic voices or add controls that prevent people from using the technology with the voices of politicians or other prominent figures.
Last month, OpenAI took a similar approach when it unveiled its video generator, Sora. It demonstrated the technology but did not release it publicly.
OpenAI is among several companies that have developed a new breed of AI technology that can quickly and easily create synthetic voices. They include tech giants like Google as well as start-ups like New York-based ElevenLabs. (The New York Times sued OpenAI and its partner, Microsoft, over claims of copyright infringement involving artificial intelligence systems that generate text.)
Businesses can use these technologies to create audiobooks, give voice to online chatbots, or even create an automated DJ radio station. Since last year, OpenAI has used its technology to power a talking version of ChatGPT. And it has long offered businesses a range of voices that can be used for similar applications, all built from clips provided by voice actors.
However, the company has yet to offer a public tool that would allow individuals and businesses to recreate voices from a short clip, as Voice Engine does. The ability to recreate any voice in this way, Mr. Harris said, is what makes the technology dangerous. It could be especially dangerous in an election year, he said.
In January, New Hampshire residents received robocall messages discouraging them from voting in the state’s primary election, delivered in a voice that was likely artificially generated to resemble President Biden. The Federal Communications Commission later banned such calls.
Mr. Harris said OpenAI had no immediate plans to monetize the technology. He said the tool could be particularly useful to people who have lost their voice due to illness or accident.
He showed how the technology was used to recreate a woman’s voice after brain cancer destroyed it. She was now able to speak, she said, after providing a short recording of a presentation she had once given as a high school student.