Jakarta, Indonesia Sentinel - Tech giant Nvidia has unveiled a new AI audio generation model capable of producing sounds “never heard before.” Named Fugatto, short for Foundational Generative Audio Transformer Opus 1, the model promises to redefine the boundaries of AI-driven audio creation.
According to Nvidia, Fugatto’s capabilities extend beyond current AI audio tools, such as those from Stability AI, OpenAI, Google DeepMind, ElevenLabs, and Adobe. While those other AI audio tools have focused on replicating or synthesizing existing sounds, Nvidia claims Fugatto can create entirely original audio.
According to MBW, Nvidia says Fugatto allows users to generate, modify, and manipulate audio using text or audio inputs. Its advanced features include the ability to produce surreal sounds like “a howling trumpet” or “a meowing saxophone.” The model can also generate high-quality singing voices from text commands.
Fugatto’s core capabilities include composing music from text prompts, modifying existing tracks by adding or removing instruments, altering sound characteristics such as accents and emotions, and generating entirely new and unique sounds.
In a demonstration video, Nvidia showcased Fugatto’s ability to compose music based on unique prompts such as “create a howling saxophone paired with barking sounds, followed by electronic music infused with dog barks.”
Another example highlighted its capability to produce a combination of deep, rumbling basslines and fragmented, high-pitched digital chirps, mimicking the sound of a colossal machine. Fugatto can even transform human voices by altering accents or imbuing them with specific emotions, such as anger or calmness.
The tool also enables deep edits to music, such as isolating vocals within a track, adding instruments, and even transforming melodies by replacing them with the voice of an opera singer. With Fugatto, Nvidia has introduced versatile new ways to edit music.
Building Fugatto: A Technical Feat
According to The Verge, to develop the Fugatto audio generator, Nvidia researchers had to collect a vast dataset containing millions of audio samples. Nvidia then developed models engineered to handle an extensive range of tasks with greater accuracy, even enabling entirely new functionalities without requiring additional training data.
A detailed report lists the extensive datasets Fugatto draws upon, including sound effect libraries such as those from the BBC.
Nvidia’s release of Fugatto marks a significant leap in generative AI audio technology, offering artists, creators, and industries a versatile tool to explore new frontiers in sound design. As Nvidia pushes the boundaries of AI audio, Fugatto has the potential to revolutionize music production, signaling a transformative era in generative audio.
(Raidi/Agung)