Generative AI, powered by Large Language Models (LLMs), has the potential to bring about significant change to many organizations; however, most still struggle to deploy the necessary infrastructure and applications to extract value from AI. Whether it is selecting the appropriate LLM and preparing corporate data for a RAG solution, or designing the storage and networking to keep GPUs fed with data, building infrastructure and applications for AI is a complex task. Compounding the complexity of on-premises deployment is the need to run multiple AI applications on shared infrastructure with differing resource requirements and business value. A platform that automates the deployment of infrastructure and supporting components will enable faster adoption and a shorter time to AI value. The Nutanix Enterprise AI (NAI) platform is a new option to simplify the deployment and operation of a private AI platform.

Building Blocks

Choosing the right LLM family and model size within the family is critical; different LLMs have capabilities suited to specific requirements, and the model size directly impacts the amount of resources required for acceptable performance. Model marketplaces, such as Hugging Face, offer an overwhelming range of models but seldom provide effective governance of the models they host. In parallel with model experimentation, hardware will be purchased and deployed. Buying GPUs is a significant financial commitment, costing tens of thousands of dollars each. Undersized GPUs are unable to run larger models, while oversized GPUs have huge price tags and may never be fully utilized. The GPUs are still housed inside servers, where the CPU and RAM configuration will impact data delivery to the GPUs. Joining servers and GPUs together, the network is critical for allowing data to flow to the GPUs and for results to be returned to users. Data center power capacity is frequently a constraint for AI, as GPUs consume a significant amount of power.  A balanced design will load all these resource types evenly, without bottlenecks or overloads, or excessive idle resources.

AI Unleashed 2025

Production AI

Experimenting with open-source models is an excellent way to get started with generative AI, but transitioning to production deployment requires significant effort in terms of security and governance. With over a million models available on Hugging Face, some models are likely to contain security vulnerabilities or be malicious. Nutanix Enterprise AI features a curated model registry, offering models that have been scanned for security before being made available to your AI developers within the platform. NAI provides an interface where AI developers can choose a certified model or upload a custom model, then configure that model to provide an endpoint for an application developer to consume, allowing them to integrate AI into their applications easily. Within the interface, the necessary resource requirements are identified, along with identifying where on the internal platform the model can be run. Model serving is delivered by a variety of standard tools, such as vLLM, and the interface provides monitoring of the models being served to show both model and host resource utilization.

Nutanix Enterprise AI

Nutanix Enterprise AI provides a platform to simplify and accelerate AI adoption for businesses, from addressing the “cold start” problem through to serving and managing AI applications in production operations. It facilitates the development of AI agents by supporting diverse model types (embedding, vision, and reasoning) and providing pre-configured endpoints, thereby streamlining the inference process and enabling security scanning. Offering both on-prem and cloud deployment options, including air-gapped environments, Nutanix leverages its core infrastructure capabilities, including storage and Kubernetes platform, to provide an all-in-one solution for AI deployments, aiming to make infrastructure invisible.

Nutanix presented NAI at AI Infrastructure Field Day; you can watch the videos of all the Nutanix presentations on the appearances page.

TECHSTRONG TV

Click full-screen to enable volume control
Watch latest episodes and shows

Tech Field Day Events

TECHSTRONG AI PODCAST

SHARE THIS STORY