SoftBank has unveiled a software platform designed to simplify how AI data centers are built and operated, positioning the Japanese conglomerate to offer a menu of GPU cloud services. Called Infrinia AI Cloud OS, the platform provides the foundation for inference-as-a-service offerings, reducing the challenges associated with managing GPU infrastructure and Kubernetes environments.

Infrinia is a software-defined stack that automates the layers required to run modern AI workloads. That includes GPU drivers, networking, storage, and Kubernetes orchestration. The goal, according to SoftBank, is to allow operators and customers to focus on deploying AI models rather than managing the underlying systems that support them.

Junichi Miyakawa, president and CEO of SoftBank, said the software will connect AI data centers with enterprises, service providers, and developers. Rather than assembling bespoke systems or building in-house platforms, operators can use Infrinia to deploy AI services more quickly and at lower operational cost.

Users can select a large language model and deploy inference workloads through APIs without interacting directly with Kubernetes clusters or GPU hardware. SoftBank says this abstraction is increasingly necessary as demand for GPU-accelerated computing expands across fields such as generative AI, robotics, simulation, drug discovery, and materials science.

SoftBank has been a major funder of tech initiatives, and this release is an effort to expand its profile as a vendor, too. The Infrinia platform, said Miyakawa, will enable SoftBank to “play a central role in building the cloud foundation for the AI era and delivering sustainable value to society.”

Inference Workloads on Shared Infrastructure

The platform includes APIs compatible with OpenAI-style interfaces, enabling drop-in integration with existing applications. It also automates routine operational tasks such as monitoring, failover, and maintenance. Multi-tenant isolation is built in, allowing multiple customers to run inference workloads on shared infrastructure while keeping data securely separated.

Infrinia is optimized for today’s most advanced GPU platforms, including high-density systems that rely on ultra-fast interconnects. The stack integrates networking technology from NVIDIA, including NVLink, to reduce latency and maximize GPU-to-GPU bandwidth for distributed AI workloads. SoftBank claims the system can dynamically allocate nodes based on GPU proximity, improving performance for large-scale jobs.

While the initial deployment of Infrinia will be within SoftBank’s own GPU cloud, the company plans to expand availability to other enterprise data centers and cloud environments. That global ambition aligns with SoftBank’s broader strategy of building AI-focused digital infrastructure rather than simply investing in application-layer companies.

Stargate and The GPU Cloud

SoftBank has been steadily increasing its exposure to data center and network assets, including its agreement to acquire DigitalBridge in a deal valued at roughly $4 billion. Once completed, that acquisition would give SoftBank access to a portfolio managing more than 100 billion dollars in digital infrastructure assets, including multiple gigawatts of data center capacity.

That infrastructure push is closely tied to SoftBank’s backing of the Stargate AI data center initiative, of which OpenAI is a leading player. SoftBank has committed tens of billions of dollars toward AI-related investments, funding those moves in part through the sale of its stakes in NVIDIA and T-Mobile US.

Infrinia reflects SoftBank’s strategic view that AI infrastructure competitiveness will increasingly depend on tightly integrated software stacks, not just access to GPUs. As AI models grow larger and inference workloads proliferate, the ability to deploy, scale, and operate GPU clouds efficiently may become a decisive advantage.