Transforming your business processes with AI is a significant undertaking; the best starting point is to prove out and learn from one or two use cases before attempting to cover hundreds more. This leads to another problem: the approach that delivers AI for a couple of use cases doesn’t scale to hundreds of different AI applications. Many companies find their AI adoption stalling in the transition from experimentation to full production. What these companies need is a scalable platform with enterprise-ready security and governance, as well as built-in multi-tenancy. Haseeb Budhani built Rafay to provide precisely this platform for enterprise organizations and the many emerging NeoClouds. The Rafay platform provides GPU-as-a-Service and Model-as-a-Service, with support for the latest NVIDIA GPUs and the NIM framework. Customers use the Rafay platform to go from racks of new servers to a multi-tenant AI service in days, rather than months. Rafay presented at AI Infrastructure Field Day.
What Makes Scaling to Production So Hard?
The core challenges are the variability of use cases and the need for long-term cost-effectiveness. Our first few AI applications will likely use similar model types and often operate in isolation. As we examine more applications, we will naturally encounter a broader range of AI models and see more data flowing between these AI applications. Managing an estate of a hundred models in use by five hundred applications is only practical through policies and automated tools. That same automation is vital as applications change over time, newer models are released, or training and quantization updates need to be deployed to production. Security and governance must be embedded in the processes, with integration with the enterprise authentication system and declarative configurations to enable simpler compliance and auditing. Application developers will also want a platform that integrates with their existing tooling, often based on Git repositories for both the application and the platform configuration source files.
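To make the idea of policy-driven automation concrete, here is a minimal Python sketch of checking declarative deployment records against a central policy. It is an illustration only, not Rafay's API or configuration format: the Deployment fields, the POLICY keys, and every model, application, and provider name are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Deployment:
    """One declarative record, e.g. parsed from a file kept in Git (hypothetical schema)."""
    app: str
    model: str
    quantization: str
    auth_provider: str


# Hypothetical central policy; a real platform would load this from versioned config.
POLICY = {
    "approved_models": {"model-a", "model-b"},
    "approved_quantizations": {"fp16", "int8"},
    "required_auth_provider": "enterprise-sso",
}


def violations(deployment: Deployment, policy: dict = POLICY) -> list[str]:
    """Return a human-readable list of policy violations (empty if compliant)."""
    problems = []
    if deployment.model not in policy["approved_models"]:
        problems.append(f"{deployment.app}: model '{deployment.model}' is not approved")
    if deployment.quantization not in policy["approved_quantizations"]:
        problems.append(
            f"{deployment.app}: quantization '{deployment.quantization}' is not approved"
        )
    if deployment.auth_provider != policy["required_auth_provider"]:
        problems.append(f"{deployment.app}: must authenticate via enterprise SSO")
    return problems


if __name__ == "__main__":
    estate = [
        Deployment("chat-assistant", "model-a", "int8", "enterprise-sso"),
        Deployment("doc-search", "model-x", "int4", "enterprise-sso"),
    ]
    for d in estate:
        for problem in violations(d):
            print(problem)
```

Because the records are plain data, the same check can run automatically on every change to the repository, which is what makes auditing hundreds of deployments tractable.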
Why Not Build Your Own?
Enterprise IT teams often relish the prospect of building solutions using new technologies, such as AI, and happily fill racks with new servers, GPUs, and high-speed networks. For some organizations with a track record of delivering innovative infrastructure solutions, this model is very successful; for many enterprise organizations, building a custom platform and automation feels like a risk they cannot accept. The risks are that building the platform will take a long time and require learning through experimentation, and that the business becomes dependent on the unique knowledge of a small group of people. If others have trodden this path, why not pay them to provide the answers and the platform? The classic ‘build vs. buy’ choice is to build only the things that make your business’s value unique and buy everything else if possible. Does building a platform for AI applications make your business unique? Or do the AI applications you run on the platform bring that value? For most organizations, the faster the AI infrastructure is deployed, the sooner they can move on to building differentiated AI applications. Rafay has a proven methodology and track record for rapidly converting new hardware into an AI platform.
What Are NeoClouds?
The NeoCloud category has emerged over the last couple of years and refers to cloud providers that specialize in providing a platform for AI applications. A NeoCloud offers services and service levels that differentiate it from more traditional cloud providers, such as AWS, Azure, and Google. The services focus on delivering GPUs or AI models as a service, using high-speed infrastructure to keep the GPUs fed with data and return results with low latency. Some NeoClouds provide explicit sovereignty, guaranteeing that data will never leave specific locations, which brings the added value of consistent network throughput and latency. Other NeoClouds are located where power and cooling costs are low, or near interconnect locations where network costs and latency are also low. The common factor is that these are new infrastructures built to run AI applications in a multi-tenant environment, offering the option to rent AI infrastructure rather than build or buy it. Building a NeoCloud presents most of the challenges associated with creating an AI infrastructure, along with far stricter requirements for multi-tenancy, accounting, and billing. NeoCloud multi-tenancy is usually multi-layered, with each organization being a tenant of the NeoCloud and each business unit (BU) within the organization a tenant of the organization! Speed from hardware to selling services is vital for NeoClouds, so again, they look to Rafay as the builder for their AI Factory.
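The layered tenancy described above, with its accounting and billing requirement, can be sketched as a simple roll-up of GPU usage from business units to organizations. This Python example is a hypothetical illustration, not how Rafay or any particular NeoCloud implements billing: the tenant names, the usage figures, and the flat hourly rate are all made up.

```python
from collections import defaultdict

# Hypothetical metering records: (organization, business_unit, gpu_hours).
USAGE = [
    ("acme", "research", 120.0),
    ("acme", "support", 30.0),
    ("globex", "analytics", 75.5),
    ("acme", "research", 10.0),
]


def roll_up(records):
    """Aggregate GPU hours at both tenancy layers: per business unit and per organization."""
    per_bu = defaultdict(float)
    per_org = defaultdict(float)
    for org, bu, hours in records:
        per_bu[(org, bu)] += hours   # inner layer: BU is a tenant of the organization
        per_org[org] += hours        # outer layer: organization is a tenant of the NeoCloud
    return dict(per_org), dict(per_bu)


def invoice(per_org, rate_per_gpu_hour):
    """Price each organization's total at an assumed flat rate per GPU hour."""
    return {org: round(hours * rate_per_gpu_hour, 2) for org, hours in per_org.items()}


if __name__ == "__main__":
    per_org, per_bu = roll_up(USAGE)
    for org, amount in invoice(per_org, rate_per_gpu_hour=2.0).items():
        print(org, amount)
    for (org, bu), hours in per_bu.items():
        print(f"  {org}/{bu}: {hours} GPU hours")
```

The NeoCloud bills each organization from the outer totals, while the per-BU totals let that organization charge usage back internally.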
Watch all the presentations from AI Infrastructure Field Day on the Tech Field Day website.



