Google unveils new AI tool called “MusicLM” that turns text into music – It’s Amazing

Google has recently unveiled an impressive new AI that turns text, or a whistled or hummed melody, into instrumental music, all in just a matter of seconds.

Generative AI, the new age of artificial intelligence, is hitting its peak, with numerous platforms and tools launching back to back. It started with text-to-image (OpenAI’s DALL-E 2), moved on to text-to-video (Meta’s Make-A-Video), and here we are with AI scoring background music by itself.

Google takes the credit for creating the first-ever AI that can produce high-fidelity music with its “MusicLM”. Though there have been similar attempts earlier, such as Riffusion (an AI that composes music by visualizing it), Google’s own AudioLM and OpenAI’s Jukebox, MusicLM has surpassed all of them at the task. The AI is detailed in an academic paper.

MusicLM – What can it do?

Google’s text-to-music generator ‘MusicLM’, trained on 280,000 hours of music, can generate high-quality audio from text captions, with soundtracks ranging from as short as 10 seconds to the length of a full song, about 5 minutes. Yes, it can create an entire song from the precise text prompts you feed it.

The songs and snippets of music it has composed are quite comparable to a human’s. The admirable part of MusicLM is that the AI even adds human voices to its tracks, sounding so natural that you would seldom pick out the AI-composed one from among other songs.

Here, for instance, when fed a caption that goes like “Induces the experience of being lost in space, and the music would be designed to evoke a sense of wonder and awe, while being danceable”, the AI came out with this:

Impressively, MusicLM can take several descriptions in sequence and create a soundtrack that tells a “story”, a sort of melodic narrative that runs for several minutes. Perhaps it could be used as background music in movies.

This is an example of such storytelling audio. Don’t mind the lyrics, as the AI sings creepy gibberish words that don’t exist. The lyrics are what give away that an AI made the song. In fact, the DALL-E 2 AI came out with a creepy set of words, creating its own language, which shocked the researchers. MusicLM’s gibberish may well be an heir to that.

time to meditate (0:00-0:15)
time to wake up (0:15-0:30)
time to run (0:30-0:45)
time to give 100% (0:45-1:00)
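
The timed captions above amount to a simple schedule: each caption gets its own consecutive window. Since MusicLM has no public interface, the following is only a minimal sketch of how such a schedule could be represented, not Google's method. The names `build_story_schedule` and `format_timestamp` and the fixed 15-second window are assumptions made for illustration.

```python
from typing import List, Tuple

# Assumption: every caption gets an equal 15-second window, as in the example above.
SEGMENT_SECONDS = 15


def format_timestamp(seconds: int) -> str:
    """Render a second count as m:ss, so 60 seconds becomes '1:00'."""
    minutes, secs = divmod(seconds, 60)
    return f"{minutes}:{secs:02d}"


def build_story_schedule(
    captions: List[str], segment_seconds: int = SEGMENT_SECONDS
) -> List[Tuple[str, int, int]]:
    """Assign each caption a consecutive, non-overlapping (start, end) window in seconds."""
    schedule = []
    for index, caption in enumerate(captions):
        start = index * segment_seconds
        schedule.append((caption, start, start + segment_seconds))
    return schedule


if __name__ == "__main__":
    # Hypothetical input: the four captions from the story example.
    story = ["time to meditate", "time to wake up", "time to run", "time to give 100%"]
    for caption, start, end in build_story_schedule(story):
        print(f"{caption} ({format_timestamp(start)}-{format_timestamp(end)})")
```

Running the script prints the four captions with their windows, the last one ending at the one-minute mark.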

MusicLM also simulates human vocals and turns hummed and whistled tunes into music played on whatever instrument you want. That said, the simulation of human voices still needs more training to reach good quality.

In case anyone is eager to know how Google built MusicLM, step by step:

MusicLM is not for everyone

Google, however, hasn’t opened the gates for people to play with the AI music composer, fearing the risks it may carry, and says it has ‘no immediate plans to release it’.

“We acknowledge the risk of potential misappropriation of creative content associated to the use case,” the co-authors of the paper wrote. “We strongly emphasize the need for more future work in tackling these risks associated to music generation.”

Out of all the songs and tracks generated by the AI composer, 1% of the music was directly replicated from the songs on which it was trained. So releasing MusicLM to the public could raise copyright concerns, says Google.

Imagine if MusicLM were opened to the public like ChatGPT; everyone would be a music composer.

What do you think of Google’s AI text-to-music generator?

