Red Hat this week launched a full stack edition of an artificial intelligence (AI) platform, dubbed Red Hat AI Enterprise, in addition to making available an instance of the AI Factory developed by NVIDIA to IT teams that have standardized on the platform.

The AI Factory from NVIDIA provides access to a set of tools, models and frameworks based on the CUDA framework that are optimized for building and deploying AI models on graphics processing units (GPUs) from NVIDIA.

Red Hat has extended its Red Hat AI platform by adding support for Mistral-Large-3, Nemotron-Nano, Apertus-8B-Instruct and DeepSeek-V3.2 models and a technology preview of support for Intel CPUs for small language models. Additionally, Red Hat AI 3.3 expands hardware certification to NVIDIA Blackwell Ultra GPUs and AMD MI325X accelerators.

Finally, Red Hat is also previewing a Models-as-a-Service (MaaS) capability to enable IT teams to provide self-service access to privately hosted models via an API gateway in addition to adding support for 3x Whisper speedup, geospatial support, improved EAGLE speculative decoding, enhanced tool calling for agentic workflows and a Red Hat AI Python Index repository to unify the data-to-model lifecycle.

At the core of Red Hat Enterprise is a virtual large language model (vLLM) inference engine and llm-d distributed inference framework that makes it possible to distribute AI models and applications across multiple Kubernetes clusters.

On top of that core platform, Red Hat AI Enterprise provides IT teams with everything from the underlying instance of Red Hat Enterprise Linux (RHEL) and Kubernetes clusters to the models required to build and deploy AI applications, says Joe Fernandes, vice president and general manager for the AI Business Unit at Red Hat.

The overall goal is to provide IT teams with an integrated platform that is ultimately less costly to deploy and support, he adds.

In effect, Red Hat is making a commitment to enable IT teams to build and deploy AI models and applications on any IT infrastructure they prefer, says Fernandes.

While NVIDIA GPUs and frameworks continue to be the most widely used to build and deploy AI models and applications, a growing number of IT teams are also starting to adopt a variety of alternative AI accelerators. Those alternatives are often more available than NVIDIA GPUs and provide a less expensive option, especially when it comes to deploying AI inference engines.

Many IT teams are also looking to deploy those AI inference engines on platforms that are located closer to the point where data is being created and consumed by AI applications. In those environments, an AI inference model that might have been trained using GPUs doesn’t necessarily need to be deployed on a GPU when there is a less expensive class of processors available.

Some organizations are also concerned about AI sovereignty mandates that require them to deploy AI models and applications in on-premises IT environments that are deployed and managed within the confines of a specific geographic region.

The Red Hat platform supports multiple classes of processors and platforms that can be deployed either in an on-premises IT environment or in a cloud computing environment as internal IT teams best see fit, says Fernandes. “That hybrid approach makes us unique,” he adds.

Inevitably, most IT teams will soon find themselves deploying AI models just about everywhere imaginable. In fact, the Futurum Group projects spending on AI platforms market research will surge to $292 billion by 2030, growing at an approximately 50.8% compound annual growth rate (CAGR).

The challenge then, of course, becomes managing all those highly distributed instances of AI applications.