Edera today extended its platform for isolating workloads in a hardened runtime environment to graphics processing units (GPUs) running artificial intelligence (AI) applications.

Extending a hardened runtime based on the open source Xen hypervisor to cover GPUs as well as CPUs will make it possible to narrow the scope of a potential outage to a single GPU rather than an entire machine, says Edera CTO Alex Zenla. That’s critical because IT teams running inference models need to ensure the availability of workloads as it becomes apparent that GPUs tend to fail frequently, Zenla adds.

In effect, IT teams now have a means to run AI applications in a multi-tenant environment that enables them to better optimize consumption of scarce GPU resources deployed in the cloud or in an on-premises IT environment, says Zenla. Without that isolation, every application running on a machine stays offline until the failed GPU is replaced and the 30 minutes required to cold-start a GPU have elapsed, Zenla notes.

“GPUs tend to fail a lot,” says Zenla. “We’re providing elasticity.”

Additionally, the Edera platform provides a processor-agnostic runtime environment that can be extended in the future to other classes of processors and accelerators that might be used to run AI models, adds Zenla.

Finally, the sandboxes that Edera creates isolate workloads in a way that eliminates the root causes of privilege escalation, lateral movement, and data exfiltration because the host is sheltered from vulnerable system calls and kernel-level attack paths.

It’s not clear how much focus there is now on optimizing consumption of GPU resources, but as more AI workloads are deployed, the total cost of the infrastructure used to run AI applications is becoming a bigger concern. IT teams can improve utilization rates of GPUs and other processors, so long as they are not also creating a single point of failure for multiple mission-critical applications.

Each IT team will need to determine how best to employ the underlying IT infrastructure used to run AI applications, which may differ depending on the use case. The one thing that is certain is the cost of that IT infrastructure ultimately becomes a limiting factor when determining how many AI applications an organization might be able to afford to deploy in a production environment. IT teams will also need to consider to what degree they might want to migrate AI inference workloads from one class of processors to another to reduce costs.

It’s still early days as far as defining a set of best practices for running AI applications is concerned. However, as more responsibility for managing these workloads shifts to internal IT teams, it becomes more probable that a set of playbooks for running AI applications will be developed. The challenge and the opportunity now is to determine which foundational technologies make it possible to manage AI workloads as flexibly as possible in an era where many infrastructure decisions have been made with little to no regard for future requirements.