Google trained ‘experimental AI’ to generate high-fidelity songs from text prompts. Now it’s available to the public

In January, Google unveiled MusicLM, an ‘experimental AI’ tool that can generate high-fidelity music from text prompts and humming.

The tool is now available for the public to test out.

Google explains that, at the public-use level, users type in a prompt like “soulful jazz for a dinner party”.

The MusicLM model then creates two versions of the requested song. Users can vote on which one they prefer, which Google says will “help improve the AI model”.

The model was trained on five million audio clips, amounting to 280,000 hours of music at 24 kHz.

At the time of its unveiling back in January, Google released a set of examples of the tool’s ‘Audio Generation’ abilities ‘From Rich Captions’, the results of which you can listen to here.

Google claims that, “whether you’re a professional musician or just starting out, MusicLM is an experimental tool that can help you express your creativity”.

The company published a ‘behind-the-scenes look’ yesterday at MusicLM being used by a sound artist, a Google Arts & Culture Artist in Residence, and a Google researcher.

Google also published a paper in January outlining the research that went into developing the tool.

According to Google’s researchers, “Future work may focus on lyrics generation, along with improvement of text conditioning and vocal quality. Another aspect is the modeling of high-level song structure like introduction, verse, and chorus”.

The research paper, which suggests that MusicLM “further extends the set of tools that assist humans with creative music tasks”, also notes that “there are several risks associated with our model and the use-case it tackles”.

According to the researchers, one of those risks is that the “generated samples will reflect the biases present in the training data, raising the question about appropriateness for music generation for cultures underrepresented in the training data, while at the same time also raising concerns about cultural appropriation”.

Another risk highlighted by the paper was the “potential misappropriation of creative content”.

The researchers explained: “In accordance with responsible model development practices, we conducted a thorough study of memorization, adapting and extending a methodology used in the context of text-based LLMs, focusing on the semantic modeling stage”.

They said that they “found that only a tiny fraction of examples was memorized exactly, while for 1% of the examples we could identify an approximate match”.

They then added: “We strongly emphasize the need for more future work in tackling these risks associated to music generation — we have no plans to release models at this point.”

Google’s surprise public release of MusicLM this week arrived on the same day that Google and Alphabet CEO Sundar Pichai announced a huge push into AI with a range of AI-powered updates to various Google products.

“Seven years into our journey as an AI-first company, we’re at an exciting inflection point,” said Pichai in his keynote address at the Google I/O 2023 event on Wednesday (May 10).

“We have an opportunity to make AI even more helpful for people, for businesses, for communities, for everyone.”

As part of Google’s new AI push, the company is expanding its conversational AI tool and ChatGPT rival, Bard, into over 180 countries after an initial launch in the UK and US.

Google has also recently moved Bard to its “state-of-the-art language model” PaLM 2. Google says that this is “a far more capable large language model”, which features “advanced math and reasoning skills and coding capabilities”.

The public release of MusicLM arrives at a time of rising unease around the use of generative AI in music.

One of the main reasons for the industry’s concerns around the use of generative AI, which is trained on other music, is the risk of copyright infringement.

Last month, AI-generated music productions that mimic the vocals of superstar artists dominated headlines after a song called heart on my sleeve, featuring AI-generated vocals copying the voices of Drake and The Weeknd, went viral.

The track, uploaded by an artist called ghostwriter, was subsequently removed from YouTube, Spotify and other platforms. On YouTube, the holding page for ghostwriter’s now-defunct upload confirmed what had triggered the takedown.

It read: “This video is no longer available due to a copyright claim by Universal Music Group.”

Speaking on Universal Music Group’s Q1 earnings call last month, Sir Lucian Grainge, CEO & Chairman of Universal Music Group, noted that: “Unlike its predecessors, much of the latest generative AI [i.e. ‘fake Drake’] is trained on copyrighted material, which clearly violates artists’ and labels’ rights and will put platforms completely at odds with the partnerships with us and our artists and the ones that drive success.”

In his opening remarks to analysts on that same call, Sir Lucian Grainge also criticized the “content oversupply” that currently sees around 100,000 tracks a day distributed to music streaming services.

“Not many people realize that AI has already been a major contributor to this content oversupply,” said Grainge.

“Most of this AI content on DSPs comes from the prior generation of AI, a technology that is not trained on copyrighted IP and that produces very poor quality output with virtually no consumer appeal.”

The rise of AI platforms that allow users to create vast volumes of tracks at the touch of a button has also exposed the potential for generative AI to be used for streaming fraud.

Earlier this month, AI-powered music creation app Boomy, whose users have created 14.4 million songs to date, said that Spotify had shut down its ability to upload songs to the DSP, and that some already-uploaded tracks had been removed.

A Spotify spokesperson later confirmed to MBW that those “certain catalog releases” from Boomy were removed because the streaming platform detected artificial streaming of these tracks. (There was no suggestion that Boomy itself was involved in artificial streaming.)

“Curated delivery to Spotify of new releases by Boomy artists has been re-enabled,” the company wrote on its Discord server on Saturday (May 6).

While Spotify confirmed it had made some tracks unavailable, it emerged that it was likely Boomy’s own distribution partner – Downtown-owned DashGo – that had halted uploads to Spotify.

Only a small fraction of Boomy tracks appeared to have been “greyed out” so that they couldn’t be played. As of Monday (May 8), there were no greyed-out tracks on Boomy’s playlists on Spotify.

Music Business Worldwide