OpenAI on Thursday announced GPT-5.3-Codex-Spark, a lightweight, ultra-fast model engineered for near-instantaneous code generation and intended to eliminate lag in programming assisted by artificial intelligence (AI).

The release marks a significant departure from OpenAI’s traditional infrastructure because the new model runs on hardware from Cerebras Systems Inc., under a multibillion-dollar partnership, rather than on industry-standard NVIDIA Corp. GPUs.

Codex-Spark is a smaller version of the flagship GPT-5.3-Codex model released earlier this month. While the full model is designed for deep reasoning and complex, multi-step engineering tasks, Spark is optimized for low-latency interaction.

OpenAI claims the model generates code 15 times faster than its predecessor, which allows developers to maintain creative flow during rapid prototyping. To achieve this, the company is using Cerebras’ Wafer Scale Engine 3 (WSE-3). Unlike standard chips, the WSE-3 is a single massive processor the size of a dinner plate, packed with 4 trillion transistors. This architecture reduces the communication overhead typically found when data moves between multiple smaller GPUs.

“Integrating Cerebras into our mix of compute solutions is all about making our AI respond much faster,” OpenAI said. It called Spark the “first milestone” of a $10 billion multi-year agreement between the two firms.

The leap in speed comes with a tactical sacrifice in raw brainpower. Internal benchmarks, including SWE-Bench Pro, show that Spark underperforms the full GPT-5.3-Codex on highly sophisticated autonomous tasks.

The pivot to Cerebras hardware arrives during a complicated era for OpenAI. The company is currently managing a strained relationship with NVIDIA, its primary chip supplier, while facing internal upheaval and public scrutiny over its recent Pentagon contracts and the introduction of advertisements into ChatGPT.

OpenAI was careful to frame the move as an expansion rather than a replacement.

For Cerebras, the partnership is a massive validation. Having recently raised $1 billion at a $23 billion valuation, the Sunnyvale-based firm is positioning its mega-chip as the premier solution for real-time AI inference.

“This is just the beginning,” said Sean Lie, chief technology officer of Cerebras. “We are discovering what fast inference makes possible—fundamentally different model experiences.”

The initial rollout is restricted. The model is currently available only as a research preview to subscribers of ChatGPT Pro, which costs $200 per month. It supports a 128,000-token context window but handles text only, lacking the multimodal features of other recent models. It is accessible through the Codex app, CLI, and Visual Studio Code extensions.