Meta’s new generative AI model can take chords or beats and turn them into songs

June 19, 2024

Facebook and Instagram owner Meta Platforms is among the increasingly numerous contenders in the field of AI music generation, and on Tuesday (June 18), the company’s AI research division unveiled its latest step forward in that effort.

Meta’s Fundamental AI Research (FAIR) team gave the world its first glimpse of JASCO, a tool that can take chords or beats and turn them into full musical tracks.

Meta says this functionality will give creators more control over the output of AI music tools.

JASCO – which stands for “Joint Audio and Symbolic Conditioning for Temporally Controlled Text-to-Music Generation” – is comparable in quality to other AI tools, “while allowing significantly better and more versatile controls over the generated music,” Meta FAIR said in a blog post.

To showcase JASCO’s abilities, Meta published a page of music clips, where simple public-domain melodies are turned into musical tracks.

For instance, a melody from Maurice Ravel’s Bolero is turned into “an 80s driving pop song” and a “folk song with accordion and acoustic guitar.” Tchaikovsky’s Swan Lake becomes a “Chinese traditional track with guzheng, percussion, and bamboo flute,” and an “R&B track with deep bass, electronic drums and lead trumpet.”

Meta has been making a fair amount of its AI research available to the public. With JASCO, the company has released a research paper outlining the work, and later this month, it plans to release the inference code under an MIT license and the pre-trained JASCO model under a Creative Commons license. This means other AI developers will be able to use the model to create their own AI tools.

“As innovation in the field continues to move at a rapid pace, we believe that collaboration with the global AI community is more important than ever,” Meta FAIR said.

The latest innovation comes a year after Meta released MusicGen, a text-to-audio generator capable of creating 12-second tracks from simple text prompts.

That tool was trained on 20,000 hours of music licensed by Meta for the purpose of training AI, as well as 390,000 instrument-only tracks from Shutterstock and Pond5.

MusicGen is also capable of using melodies as its input, which, according to some, made it the first music AI tool capable of turning a melody into a fully developed song.

Meta’s JASCO is one of several AI music innovations revealed in recent days.

The same day that Meta unveiled JASCO, Google’s AI lab, DeepMind, revealed a new video-to-audio (V2A) tool capable of creating soundtracks for video. Users can input text prompts to tell the tool what kind of sound they want for the video – or the tool can simply create sounds itself, based on what the video shows.

DeepMind described this as a crucial part of being able to create video content exclusively using AI tools. Most AI video generators create only silent videos.

Last week, Stability AI, the company behind the popular AI art generator Stable Diffusion, released Stable Audio Open, a free, open-source model for creating audio clips up to 47 seconds long.

The tool – which is not meant for the creation of songs, but rather for the creation of sounds that can be used in songs or for other applications – enables users to fine-tune the product with their own custom audio data.

For instance, a drummer can train the model on their own drum recordings to generate new and unique beats in their own style.

These types of AI tools stand in contrast to AI music platforms such as Udio and Suno, which create entire tracks from nothing more than text prompts.

Such tools are generally trained on large amounts of data, and have become a source of concern for the music industry, due to suspicions they have been trained on copyrighted music without authorization.

Music Business Worldwide
