
NVIDIA Corp. on Wednesday announced general availability of its neural module (NeMo) microservices, software tools for building artificial intelligence (AI) agents for enterprises.
The chip maker’s modular platform for building and customizing generative AI models and AI agents works with partner platforms to provide capabilities such as prompt tuning, supervised fine-tuning and knowledge retrieval.
NVIDIA sees NeMo microservices as building blocks for enterprises to create data flywheels: feedback loops in which data collected from business processes is used to continually refine AI models.
“Our view is that every [AI] agent will need a data flywheel,” Joey Conway, NVIDIA’s senior director for generative AI software for enterprise, said in a press conference on Tuesday. “The data flywheel is the way we can go from enterprise data — things like inference data, business intelligence, and user feedback — to power and improve the agent so it gains new capabilities and skills, and learns from its experiences.”
NeMo microservices, deployed via NVIDIA’s AI Enterprise software platform, pass data along a chain, starting with NeMo Curator, which processes data at scale for training and customization. Other components include NeMo Customizer and NeMo Guardrails. They all work in a circular pipeline, taking in new data and user feedback, using it to improve the AI model, and then redeploying the result.
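The circular pipeline described above can be sketched in rough pseudocode. The function names below are illustrative placeholders for the roles the article assigns to Curator (data cleaning), Customizer (model refinement) and Guardrails (output filtering); they are not actual NeMo microservice APIs.

```python
"""Minimal sketch of a data-flywheel loop.

All names here are hypothetical stand-ins, not real NeMo interfaces.
"""

def curate(raw_records):
    # Curator role: filter out empty entries and deduplicate feedback.
    seen, cleaned = set(), []
    for rec in raw_records:
        text = rec.strip().lower()
        if text and text not in seen:
            seen.add(text)
            cleaned.append(text)
    return cleaned

def customize(model, examples):
    # Customizer role: "refine" the model by absorbing curated examples
    # and bumping its version for redeployment.
    model["examples"].extend(examples)
    model["version"] += 1
    return model

def guardrail(response):
    # Guardrails role: block responses containing disallowed terms.
    banned = {"secret"}
    return "[blocked]" if set(response.split()) & banned else response

def flywheel_step(model, feedback):
    # One turn of the flywheel: curate new feedback, refine the model,
    # and return the updated version for redeployment.
    return customize(model, curate(feedback))

model = {"examples": [], "version": 0}
model = flywheel_step(model, ["Reset my router", "reset my router ", ""])
print(model["version"], len(model["examples"]))  # → 1 1
print(guardrail("this is a secret"))             # → [blocked]
```

Each pass through `flywheel_step` mirrors the loop the article describes: inference data and user feedback flow in, the model is updated, and the new version is redeployed behind the guardrail layer.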
NVIDIA shared a few customer use cases. AT&T used NeMo microservices to build an AI agent to process a knowledge base of nearly 10,000 documents that is reworked weekly. Cisco Systems Inc.’s emerging tech unit is using NeMo microservices to build a coding assistant that it says responds 10 times faster than similar tools and reduces tool-selection errors by 40%. And Amdocs, a maker of software used by phone companies, is using NeMo microservices to create billing agents, sales agents and network agents. The billing agent was able to resolve more inquiries, including a 50% increase in what’s called “first-call resolution,” said Conway.
NeMo microservices support open models such as Meta Platforms Inc.’s Llama, Google’s Gemma, Mistral, and Microsoft Corp.’s Phi collection of small language models (SLMs).
Channeling a frequent refrain of NVIDIA CEO Jensen Huang, Conway said NeMo software would usher in an era of AI agents being used as “digital employees.”
Conway expects NVIDIA’s efforts to grow. “With microservices, each are like a Docker container,” he said. “The orchestration today, we rely on things like Kubernetes, so we have additional features like Kubernetes Operators that help orchestrate it. We have some software today to help with the data preparation and curation. There will be a lot more coming there.”