OpenAI Moves to Generate AI Music in Potential Rivalry With Startup Suno
OpenAI recently made a splash with artificial intelligence that generates short videos from text prompts—say, a scene from “Lord of the Rings” but with Gen Z characters, or Pokémon’s Pikachu as “The Godfather.” Now the company has set its sights on doing the same thing with music.
OpenAI staff have been taking steps to develop AI that generates music, according to a person with knowledge of the work. For instance, the company has been working with some students from the Juilliard School to annotate music scores, according to a second person. The students are providing the kind of training data that would be needed to develop AI that produces music.
The Takeaway
OpenAI is working on AI for music generation
Company has been rapidly introducing new products, expanding beyond core chatbot services
New models or apps would increase rivalry with AI startups Suno, Udio
The efforts show how OpenAI continues to aggressively expand the types of services it provides beyond its core chatbot, with the aim of keeping its 800 million–plus users on the service longer every day and potentially finding new ways to generate revenue.
OpenAI has privately discussed generating music with text and audio prompts: for example, enabling people to ask the AI to add guitar accompaniment to an existing vocal track, according to a third person who has been involved in the discussions. The resulting product could also help people add music to videos.
If OpenAI releases a music-generating tool, that could help the company’s own expansion into advertising someday. An ad agency, for instance, could use OpenAI’s tools for tasks related to generating an ad campaign, such as brainstorming ideas for lyrics, creating a catchy jingle based on music samples or recordings, or uploading a video to mimic in terms of style, according to one of the people.
The company, recently valued at $500 billion in an employee stock sale, is also trying to catch up with rival Google. The search engine giant in May launched the second generation of its music-making model, Lyria. Google customers can use Lyria through Google Cloud, and the company has touted the ability of marketers to employ it in creating soundtracks for ads.
The timing of OpenAI’s music efforts is unclear, as is whether ChatGPT or its video-generation app, Sora, would incorporate the new music AI capabilities or whether they would be part of a stand-alone application. Spokespeople for OpenAI declined to comment.
Before the release of ChatGPT, OpenAI had developed two music-generation models—MuseNet in 2019 and Jukebox in 2020—but they are currently not available to users via its chatbot.
A music AI model that enables OpenAI users to generate music, such as vocal tracks or instrumentation, would go beyond what ChatGPT and OpenAI’s existing models do. People using ChatGPT can ask the chatbot to write lyrics, chord progressions or rhyme schemes in the style of a specific artist, for instance. But it can’t create a full song the way other AI music generators, like those developed by startups Suno and Udio, can.
Using AI to make music has taken off to some extent. Three-year-old startup Suno, which sells subscriptions to an AI-powered music generator, is generating around $150 million in annual recurring revenue, up nearly four times from a year ago.
An OpenAI foray into music would follow its launch of video app Sora, which people use to generate short-form videos similar to those on TikTok but entirely AI generated. OpenAI’s head of Sora, Bill Peebles, said in an X post earlier this month that Sora had a million app downloads in less than five days—faster than ChatGPT’s growth when it first launched in late 2022.
The company also has been developing an X-like social feed within ChatGPT where users can share how they’re using the chatbot. These social features could serve as avenues for potentially sharing any music users create with OpenAI.
OpenAI has launched other AI models that generate audio, and this technology would likely overlap with generating music such as vocals. In March, the company released updated speech-to-text and text-to-speech audio models, aimed at helping customers generate voice agents in areas like customer service.
Label Litigation
OpenAI may need to make deals with music labels to avoid copyright lawsuits. The Recording Industry Association of America, a trade group representing Universal Music Group, Sony Music and Warner Music Group, has sued Suno and Udio, alleging that the startups trained their AI models on copyrighted songs without permission or proper compensation. The labels are demanding as much as $150,000 in damages for each song allegedly infringed, potentially totaling billions of dollars.
Both Suno and Udio have said they can use copyrighted material under the fair use doctrine in copyright law. Universal Music and Warner Music have been in talks with these companies as well as Google for AI licensing deals, the Financial Times reported earlier this month.
Google declined to comment on any potential deals with music labels over AI licensing. The company has existing music industry relationships through its ownership of YouTube.
OpenAI currently takes some precautions to prevent the use of copyrighted material, such as not allowing ChatGPT to share the full lyrics of certain songs and instead offering to summarize them. After OpenAI’s Sora app launched last month, CEO Sam Altman said in a blog post that the company would give copyright holders “more granular control over generation of characters,” suggesting it would ask them to give permission before users can generate videos based on those characters. Altman also said OpenAI will share some of the revenue it makes from video generation with copyright holders.
The company also said it has stopped Sora users from creating videos of Dr. Martin Luther King Jr. at the request of his estate, as it improves its guardrails for historical figures.