VMware and NVIDIA today announced a collaboration on a platform that will make it possible to build and deploy applications infused with artificial intelligence (AI) on virtual machines running in a private cloud or on-premises IT environment managed by an internal IT team.
Announced at the VMware Explore 2023 conference, VMware Private AI Foundation with NVIDIA, due out early next year, will enable enterprises to customize AI models and run AI applications on top of the existing VMware Cloud Foundation platform, using the virtual machines VMware already provides.
The extension to that platform will give organizations access to up to four graphics processing units (GPUs) per instance, along with a vector database, virtual machines optimized to run deep learning algorithms and the virtual storage services needed to build these applications, says Paul Turner, vice president of product management for the VMware vSphere platform.
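The vector database matters because customizing an AI model typically involves retrieving an organization's own documents by semantic similarity rather than keyword match. Below is a minimal, hypothetical sketch of the core operation a vector database performs: nearest-neighbor search over embeddings. All names and the toy three-dimensional "embeddings" are illustrative stand-ins, not any vendor's actual API.

```python
# Hypothetical sketch of nearest-neighbor search, the core operation a
# vector database performs. Real embeddings have hundreds of dimensions;
# these toy 3-dimensional vectors are for illustration only.
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 = identical direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def top_k(query, index, k=2):
    """Return the ids of the k stored embeddings most similar to the query."""
    scored = sorted(index.items(),
                    key=lambda item: cosine_similarity(query, item[1]),
                    reverse=True)
    return [doc_id for doc_id, _ in scored[:k]]

# Toy index mapping document ids to embeddings.
index = {
    "doc_a": [1.0, 0.0, 0.0],
    "doc_b": [0.9, 0.1, 0.0],
    "doc_c": [0.0, 1.0, 0.0],
}
print(top_k([1.0, 0.05, 0.0], index, k=2))  # → ['doc_a', 'doc_b']
```

Production vector databases replace the exhaustive `sorted` scan with approximate-nearest-neighbor indexes so lookups stay fast across millions of embeddings, but the retrieval contract is the same as this sketch.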
It will also include support for NVIDIA NeMo, a framework that makes it simpler to build, customize and deploy AI models. NeMo is based on TensorRT for Large Language Models (TRT-LLM), which NVIDIA created using its CUDA parallel computing platform. “It will support any of the open AI models,” says Turner.
Dell Technologies, Hewlett Packard Enterprise (HPE) and Lenovo have all committed to building systems incorporating VMware Private AI Foundation with NVIDIA that should become available next year.
Those platforms will enable organizations to unlock the value of AI in a way that lets them maintain control of their IT environments, adds Justin Boitano, vice president of enterprise and edge computing for NVIDIA.
The size of AI models has reached a point where they typically require only gigabytes of storage, notes Boitano, which makes it far more practical for IT organizations to deploy them, he adds.
That’s crucial, because generative AI models will soon be deployed almost everywhere, notes Boitano. “Generative AI is the most transformational technology of our lifetime,” he says.
The challenge organizations face now is bringing together the data scientists, developers and IT professionals needed to build, deploy, update and secure those models at scale. Before long, the number of AI models any organization has deployed across highly distributed computing environments will run into the hundreds. Most AI models today are built by data science teams that are not well integrated with the IT teams that build and deploy applications, and a significant cultural divide between the two groups has already emerged.
Regardless of how AI models are built and deployed, one thing is certain: most applications will soon be augmented by AI models. At this point it’s hard to conceive of any new application being built that does not include some type of AI capability. In effect, when it comes to application development, AI is now table stakes. The issue is determining how best to manage those AI models within the context of a larger set of IT workflows.