The models are getting better. The agents are getting more capable. But actually running those agents reliably in production? That’s still a mess.

Google is trying to fix that. On May 20, the company released Agent Executor — known in the GitHub repo as AX, short for Agent eXecutor — an open-source distributed runtime for AI agents. It’s available now under the Apache 2.0 license, though Google makes it clear that this is an early preview and that significant changes are coming before a stable release.

The core idea behind AX is straightforward. As agents take on longer and more complex tasks — workflows that run for minutes, hours, or even days — the infrastructure around them needs to be more resilient. Standard software deployments don’t account for the fact that agents are nonlinear programs. They pause, wait for human input, call external tools, and pick back up again. Traditional compute abstractions weren’t built for that.

AX was designed to address this gap with native support for durable execution, secure isolation, session consistency, connection recovery, and trajectory branching.

Durable Execution

The most pressing problem AX tackles is resumption. If an agent is mid-task and something fails — a network outage, a service interruption, a human approval that takes too long — most existing setups lose that state. AX uses an event log and snapshotting to allow workflows to resume after outages or interruptions, including human-in-the-loop confirmations.
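To make the event-log-plus-snapshot idea concrete, here is a minimal Python sketch of the general technique. It is not AX's API: the step names, the helper functions, and the JSON persistence format are all hypothetical, chosen only to show how a workflow could pick up where it left off after an outage.

```python
import json

# Hypothetical three-step workflow; the step names are illustrative only.
STEPS = ["fetch_data", "await_approval", "send_report"]


def apply_event(state, event):
    """Return a new state with the event's step marked as completed."""
    return {"completed": state["completed"] + [event["step"]]}


def take_snapshot(state, log):
    """Capture the state together with the log offset it corresponds to."""
    return {"state": state, "offset": len(log)}


def recover(snapshot, log):
    """Rebuild state from the snapshot plus any events logged after it."""
    state = {"completed": list(snapshot["state"]["completed"])}
    for event in log[snapshot["offset"]:]:
        state = apply_event(state, event)
    return state


def run(state, log, fail_before=None):
    """Run the remaining steps, logging each one as it completes."""
    for step in STEPS:
        if step in state["completed"]:
            continue  # already done before the interruption
        if step == fail_before:
            raise RuntimeError(f"simulated outage before {step}")
        event = {"step": step}
        log.append(event)
        state = apply_event(state, event)
    return state


def demo():
    """Simulate an outage, persist the log and snapshot, then resume."""
    log = []
    snapshot = take_snapshot({"completed": []}, log)
    try:
        run(recover(snapshot, log), log, fail_before="await_approval")
    except RuntimeError:
        pass  # the process "dies" here; only persisted data survives
    persisted = json.dumps({"snapshot": snapshot, "log": log})

    # A fresh process loads the persisted data and continues.
    restored = json.loads(persisted)
    state = recover(restored["snapshot"], restored["log"])
    final = run(state, restored["log"])
    return final, restored["log"]
```

Calling `demo()` in this sketch resumes at the approval step without repeating the first one, because the recovered state already lists it as completed. A production runtime would also need durable storage and safe handling of side effects, which this example leaves out.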

This matters more than it might sound, because teams have been improvising their own solutions to this problem for a while. One observer noted that the event log, snapshotting, single-writer model, and connection recovery in Agent Executor are exactly what SRE teams have been cobbling together on their own for the past year. The same observer added that existing frameworks like LangChain and AutoGen are useful for prototyping but tend to fall apart in production once agents run for hours or days.

Isolation and Security

AX also puts a hard boundary around individual components. Each actor — whether an agent, harness, skill, tool, or sandbox — runs in isolation with secure-by-design sandboxes that prevent harmful side effects and limit the blast radius if something goes wrong. This is especially relevant when agents are generating and executing code or handling data across multiple tenants simultaneously.

Model-Agnostic and Harness-Agnostic

One of the more notable design decisions Google made here is that AX doesn’t care what model or framework you’re using. Enterprises can bring their own agents built with LangChain, LangGraph, Google’s Agent Development Kit, or any agent that uses the Agent2Agent (A2A) protocol. The runtime also natively supports MCP servers.

This positions AX less as a Google-specific tool and more as infrastructure that sits beneath your agents, regardless of how they were built. Developers aren’t locked into a specific harness or a specific model provider.

Trajectory Branching

Trajectory branching is also worth calling out. Checkpoints let you branch an agent’s decision or workflow path at any point, so agents can test or evaluate different paths without losing context or other state. For teams running evaluations or experiments on agent behavior, this removes much of the manual bookkeeping that work currently requires.
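As a rough illustration of branching from a checkpoint, the Python sketch below stores each checkpoint as an independent copy of the agent's state with a pointer to its parent. The `CheckpointStore` class, the `step` helper, and the action names are assumptions made for this example and do not reflect how AX implements the feature.

```python
import copy
import itertools


class CheckpointStore:
    """Hypothetical in-memory store of immutable state checkpoints."""

    def __init__(self):
        self._ids = itertools.count(1)
        self._checkpoints = {}

    def save(self, state, parent=None):
        """Store a deep copy of the state and return its checkpoint id."""
        cp_id = next(self._ids)
        self._checkpoints[cp_id] = {
            "parent": parent,
            "state": copy.deepcopy(state),
        }
        return cp_id

    def load(self, cp_id):
        """Return a private copy so callers cannot mutate the checkpoint."""
        return copy.deepcopy(self._checkpoints[cp_id]["state"])

    def lineage(self, cp_id):
        """Return the checkpoint ids from the root down to cp_id."""
        path = []
        while cp_id is not None:
            path.append(cp_id)
            cp_id = self._checkpoints[cp_id]["parent"]
        return list(reversed(path))


def step(store, parent_id, action):
    """Take one action starting from a checkpoint and save the result."""
    state = store.load(parent_id)
    state["actions"].append(action)
    return store.save(state, parent=parent_id)


store = CheckpointStore()
root = store.save({"actions": ["plan"]})

# Two alternative paths explored from the same starting point.
branch_a = step(store, root, "search_web")
branch_b = step(store, root, "query_database")
```

Because each branch works on its own copy, `store.load(root)` still returns only the original `plan` action after both branches have run, and `store.lineage(branch_b)` traces that branch back to the shared root. An evaluation harness could compare the two branches side by side without rerunning the shared prefix.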

Agent Substrate: The Kubernetes Layer

AX doesn’t operate alone. Google also released Agent Substrate alongside it — an open-source Kubernetes abstraction designed specifically for agent workloads. Standard Kubernetes handles thousands of long-running services well, but Agent Substrate is designed for the pattern of millions of sub-second tool calls that would otherwise overwhelm a standard control plane. Together, AX and Agent Substrate aim to provide a foundation for production-scale deployments.

What AX Is Not

Google is direct about the project’s scope. AX is not a managed service — it’s self-hosted. It’s not an agentic framework like LangChain or ADK. It doesn’t dictate what harness or model you use. It’s the serving layer that wraps around those choices, making them more reliable at runtime.

The Bigger Picture

Mitch Ashley, VP and Practice Lead at The Futurum Group, sees this as a sign of the agent stack maturing into distinct layers. “The agent stack is separating into layers, and Google is staking the execution layer,” he said. “AX treats durable runtime reliability as the foundation beneath agents, where workflows survive failures, resume after interruptions, and run in isolation.”

But Ashley is also clear about what AX doesn’t solve. “Reliable execution is not governance. Teams that adopt AX still face the unsolved layer above it: governing agent decisions, enforcing policy, holding agents accountable. The runtime makes agents dependable, and procurement must now scope the governance layer that sits over it.”

The project is still under heavy development. The team notes it’s in the middle of a major refactor and is holding off on accepting most external contributions until the core stabilizes. On the roadmap: support for bring-your-own harness (BYOH), improvements to suspension and resumption of subagents, and tool call approvals within subagents.

Developers can get started at github.com/google/ax.