This AI can create any sound on command

Tuesday 26th November 2024 07:52 AM

The “Swiss army knife of sound”: this is how Nvidia presents its latest innovation in artificial intelligence. Fugatto, short for Foundational Generative Audio Transformer Opus 1, does not just generate music or modify voices like its competitors. This versatile AI can create virtually any sound imaginable, from the simplest to the most complex, from a simple text command.

A production studio in your pocket

“This thing is crazy,” enthuses Ido Zmishlany, multi-platinum producer and co-founder of One Take Audio. For this seasoned professional, the ability to instantly create new sounds in the studio opens up new creative possibilities. Fugatto lets you quickly prototype musical ideas, add or remove instruments from an existing piece, or even modify the accent and emotion of a voice.

The history of music is closely linked to technological advances. “The electric guitar gave birth to rock and roll. When the sampler appeared, hip-hop was born,” recalls Zmishlany. “With AI, we are writing the next chapter of music. We have a new instrument, a new tool for making music.”

An AI that understands sound like a human

Rafael Valle, head of applied audio research at Nvidia and one of the architects of the project, explains: “We wanted to create a model that understands and generates sound like humans”. This approach made it possible to develop unique capabilities. For example, Fugatto can make a trumpet bark or a saxophone meow.

Even more impressive, the model can generate soundscapes that evolve over time. It can reproduce the sound of a thunderstorm passing through an area, with claps of thunder that intensify then gradually fade into the distance. The system even allows you to create new transitions, like a storm calming down to give way to birdsong at dawn.

Accessible and versatile technology

Trained on millions of audio samples, Fugatto has 2.5 billion parameters and required 32 NVIDIA H100 GPUs to train. Its development involved an international team of researchers for more than a year, a diversity that strengthened the model's multilingual capabilities.

The potential applications go far beyond music. Language learning tools will be able to personalize their content with any voice chosen by the user. Video game developers will be able to generate dynamic sounds that adapt to player actions. Many other uses are likely to follow.

Nvidia launches Fugatto, an AI capable of creating or modifying any sound from text
The model can combine complex instructions to generate new and evolving sounds
Fugatto's applications could shape the future of music
