Nvidia, known for its advanced computer chips, is doubling down on artificial intelligence products with a new generative AI audio model that can “produce sounds that have never been heard before.”
The new model is called Fugatto, which stands for Foundational Generative Audio Transformer Opus 1. Nvidia says it can use text and audio inputs to generate, transform and manipulate sounds, creating sounds like a trumpet barking or a saxophone meowing. The model can also produce “high-quality singing voices” based on text prompts.
Key features of Fugatto include creating music clips based on text prompts, modifying existing songs by adding or removing instruments, changing voice characteristics such as accent and mood, and generating entirely new sounds. Nvidia describes its new technology as “the Swiss Army Knife of sound.”
Nvidia demonstrated Fugatto’s capabilities in a video, showing how users can generate sounds through prompts such as: “Create the sound of a train passing by and turn it into a lush string orchestra.” The video also shows that Fugatto allows users to isolate sounds from songs.
“This stuff is crazy,” said Ido Zmishlany, a multi-platinum producer, songwriter and co-founder of One Take Audio, a member of Nvidia’s Inception program for cutting-edge startups.
“Sound is my inspiration. That’s what drives me to make music. The idea that I can create new sounds on the fly in the studio is incredible.”
Fugatto uses ComposableART, a technique that lets users combine instructions that were not seen together during training. Nvidia explains that this means users can request complex audio transformations, such as text spoken with a sad feeling in a French accent.
The model also introduces temporal interpolation, allowing the creation of soundscapes that change over time. For example, users can generate a heavy rainstorm in which the thunder grows stronger and then fades into the distance.
Nvidia points out that Fugatto is a transformer model with 2.5 billion parameters, trained on an Nvidia DGX system using 32 Nvidia H100 Tensor Core GPUs, the same GPUs that power Vultr, which claims to be the world’s largest privately held cloud computing platform.
Meanwhile, Nvidia said its research team, with members from India, Brazil, China, Jordan and South Korea, spent more than a year building a data set containing millions of audio samples to develop Fugatto.
Nvidia says Fugatto can be used in a variety of industries, including music production, advertising, language learning and video game development.
Rafael Valle, a manager of applied audio research at Nvidia and one of the project’s contributors, described Fugatto as “our first step into a future where unsupervised multi-task learning in audio synthesis and transformation will emerge from the data and model scale.”
“We wanted to create a model that could understand and produce sounds like a human,” Valle said.
Nvidia is the latest tech company to launch an AI-powered audio tool, joining others such as Stability AI, OpenAI and Google DeepMind. However, Nvidia has not yet announced a timetable for public release or commercialization of Fugatto.
“The era of artificial intelligence is in full swing, driving the world to shift to NVIDIA computing,” Jensen Huang, Nvidia’s founder and CEO, said last week when the company released its third-quarter earnings report.
“Artificial intelligence is transforming every industry, company and country. Enterprises are adopting agentic AI to revolutionize work processes. With breakthroughs in physical AI, investment in industrial robots has surged. Countries have realized the importance of developing their own artificial intelligence and infrastructure.”
Music Business Worldwide