Delivering AI Safety

Every new technology brings questions and concerns, and the current AI rush is no exception, especially now while we’re still early in the usual hype cycle. LLMs, MCP, and agents raise a lot of fascinating topics to look into, from ethics to long-term effects on new practitioners, but for today we’ll focus on safety.

AI safety is a simple concept: anyone using an AI tool – whether an end user asking an agent for help or an application developer trying to make inference queries – needs to trust that the tool will be safe to use and that it will give sane answers.

This is actually a bit beyond the state of the art at the moment! Our existing AI tools can, and do, go off the rails for no apparent reason, and it’s difficult or impossible to understand what they tried and how they failed. When trying to fulfill a request, they might hand sensitive data to something they shouldn’t, or they might run dangerous commands. This is an uncomfortable state of affairs at best, so it’s very much worth asking what we need to change in order to make things safer.

It’s also worth realizing that we’ve been here before: every time we introduce new kinds of workloads, we have to engineer new capabilities to better manage and monitor those new kinds of workloads. Over time, we’ve also found that pushing those new capabilities into the infrastructure is often the best way to serve the users’ needs. So what do we need in the infrastructure for AI workloads?

AI-Safe Infrastructure

At a glance, the requests being made to AI workloads feel familiar: they’re based on HTTP, which the infrastructure understands pretty well. However, there are some very important differences between AI workloads and typical microservices:

AI workload instances are usually not interchangeable.
All AI workloads require payload processing.
Finally, LLMs are nondeterministic, which spills into the other tools as well.

Interchangeable vs Unique

When building with microservices, the assumption is that each instance of a microservice is interchangeable: if you have three replicas of a microservice, you expect that any of the three can handle requests as they come in and all will be well. This is the foundation of horizontal scaling and of the kind of load balancing we’ve been doing for pretty much the whole life of cloud native computing.

It doesn’t work for AI tools, though. Partly this is because of sheer expense: for example, LLMs require a massive amount of computation, so they need dedicated GPUs, which have to be preloaded with data for the model and for the kind of request you’re about to make. It’s expensive both to perform requests and to switch between kinds of requests, which makes it important to keep track of which instance of the LLM is set up for which requests.

Additionally, AI tools maintain connection state: the context is a critical piece of the request, and it builds up and changes over time. Current tools generally associate context with long-lived connections rather than sending the context with every request. (There’s ongoing work around lessening this dependence on connection state, but it’s not complete yet.)

Taken together, these mean that each instance of an AI tool is unique, and delivering a request to the wrong instance can result in spiraling costs and a badly-behaving application. So a basic tenet of an AI-safe platform is keeping track of which instance of a tool is correct for a given request.
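
To make the idea of "keeping track of which instance is correct" concrete, here is a minimal sketch of affinity-based routing. Everything in it is an assumption for illustration: the Instance and AffinityRouter names, the notion of a session id, and the least-loaded tie-break are hypothetical, not taken from any particular gateway or mesh.

```python
from dataclasses import dataclass, field


@dataclass
class Instance:
    """A hypothetical AI tool instance (for example, one LLM replica)."""
    name: str
    loaded_model: str                     # model already loaded on this instance
    sessions: set = field(default_factory=set)


class AffinityRouter:
    """Sends each request to the instance that already holds its state."""

    def __init__(self, instances):
        self.instances = list(instances)
        self.session_owner = {}           # session id -> Instance

    def route(self, session_id, model):
        # A known session must return to the instance holding its context.
        owner = self.session_owner.get(session_id)
        if owner is not None:
            return owner
        # A new session goes to the least-loaded instance that already has
        # the right model loaded, avoiding an expensive model switch.
        candidates = [i for i in self.instances if i.loaded_model == model]
        if not candidates:
            raise LookupError(f"no instance has model {model!r} loaded")
        chosen = min(candidates, key=lambda i: len(i.sessions))
        chosen.sessions.add(session_id)
        self.session_owner[session_id] = chosen
        return chosen
```

Unlike round-robin load balancing, the router here never treats replicas as interchangeable: a session is pinned once, and a request for a model that no instance has loaded is surfaced as an error rather than silently sent somewhere expensive.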

Payload Processing

The application protocols used by basically all networked workloads carefully distinguish between the headers of a request or response and the body: the headers contain only metadata (like an HTTP request’s path and hostname, or an HTTP response status), and the body contains the actual data. Protocol design as far back as SMTP (first standardized in 1982!) is based on the idea that infrastructure should only look at the headers and never inspect the body.

Unfortunately, this doesn’t work for AI workloads. For example:

In MCP, the name of the tool being requested is in the body, not in a header.
When you make inference requests of an LLM, basically everything of import is in the body.
The overall success or failure of AI workload requests is indicated in the body, not in the headers.

Our existing infrastructure can do bare-bones routing since AI workloads use protocols built on HTTP, and it can tell when things go horribly wrong by looking at the HTTP status, but that’s it. To really provide safety, we need to more deeply process the payload, cracking open the bodies of both requests and responses so that we can, for example, handle routing or authorization policy based on which MCP tool is requested; observe that a given model is delivering junk; or prevent an agent from passing context that it shouldn’t.
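
As a rough illustration of what cracking open the body involves, the sketch below parses MCP messages, which are JSON-RPC 2.0, to find the requested tool and to spot failures that the HTTP status would hide. The ALLOWED_TOOLS policy table, the caller name, and the function names are hypothetical, and the sketch assumes the body is a single plain JSON message rather than a stream.

```python
import json

# Hypothetical policy: which MCP tools each caller identity may invoke.
ALLOWED_TOOLS = {
    "support-agent": {"search_docs", "get_ticket"},
}


def requested_tool(body: bytes):
    """Return the tool name from an MCP tools/call request body, else None."""
    try:
        message = json.loads(body)
    except ValueError:                    # covers malformed JSON and bad UTF-8
        return None
    if not isinstance(message, dict) or message.get("method") != "tools/call":
        return None
    params = message.get("params")
    if not isinstance(params, dict):
        return None
    name = params.get("name")
    return name if isinstance(name, str) else None


def authorize_tool_call(caller: str, body: bytes) -> bool:
    """True only if the body is a tools/call for a tool the caller may use."""
    tool = requested_tool(body)
    return tool is not None and tool in ALLOWED_TOOLS.get(caller, set())


def response_failed(body: bytes) -> bool:
    """True if the response body reports failure, whatever the HTTP status."""
    try:
        message = json.loads(body)
    except ValueError:
        return True
    if not isinstance(message, dict) or "error" in message:
        return True                       # JSON-RPC level error
    result = message.get("result")
    # MCP tool results flag tool-level failure with isError in the result.
    return isinstance(result, dict) and result.get("isError") is True
```

None of these decisions can be made from headers alone: the tool name lives in params.name, and a response carrying isError can arrive with a perfectly healthy HTTP 200.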

Nondeterminism

Finally, AI tools end up being nondeterministic, largely because LLMs are: feed an LLM the same prompt repeatedly, and you’ll get varying responses – which means, for example, that two agents given the same task, even if they use the same model, can take wildly different tacks. Even MCP servers, which generally behave more like traditional microservices, can end up with wildly varying inputs, including inputs that may be nonsensical.

This kind of autonomy can be beneficial, but it dramatically increases the need to carefully monitor inputs for correctness and to keep an eye on costs. What happens if your agent gets a request and ends up spending ten times as many tokens as it did the last time it got that request? This is another way in which AI-safe platforms need to be able to provide very fine-grained observability and policy enforcement, for which they’ll need efficient payload processing.
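
One simple way to catch the runaway-token case is to compare each run against a baseline of earlier runs for the same task. The sketch below is only an illustration under stated assumptions: the TokenBudget class, the task keys, the median baseline, and the default thresholds are all hypothetical choices, and the token counts are assumed to have been extracted from response bodies already.

```python
from collections import defaultdict
from statistics import median


class TokenBudget:
    """Flags runs that use far more tokens than earlier runs of the same task."""

    def __init__(self, max_ratio=5.0, min_samples=3):
        self.max_ratio = max_ratio        # how many times the baseline is tolerated
        self.min_samples = min_samples    # runs needed before judging anything
        self.history = defaultdict(list)  # task key -> past token counts

    def record(self, task, tokens):
        """Store this run's usage; return True if it looks anomalous."""
        past = self.history[task]
        anomalous = (
            len(past) >= self.min_samples
            and tokens > self.max_ratio * median(past)
        )
        past.append(tokens)
        return anomalous
```

The median is used so that a single outlier does not immediately raise the baseline; a production system would more likely enforce a hard budget in addition to alerting.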

Additionally, the autonomy of agents makes it absolutely critical that the agent not use the same credentials as its human. We always need to be able to tell the difference between the human taking an action and their agent taking an action on their behalf. This is challenging for existing authentication protocols, so a properly AI-safe platform will likely need to have support for new authentication extensions (such as the existing draft for OAuth agent authentication).
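
To show what telling the human and the agent apart might look like at the policy layer, the sketch below reads two claims from an already-verified token. It borrows the act (actor) claim from OAuth 2.0 Token Exchange (RFC 8693) as a stand-in; the draft mentioned above may well use a different mechanism. The describe_actor function and its output keys are hypothetical, and signature verification is deliberately out of scope.

```python
def describe_actor(claims: dict) -> dict:
    """Split verified token claims into who the action is for and who acted.

    Assumes RFC 8693 style claims: "sub" names the party the token is about,
    and an optional "act" object names the party acting on their behalf.
    """
    subject = claims.get("sub")
    if not isinstance(subject, str) or not subject:
        raise ValueError("token has no subject")

    act = claims.get("act")
    if act is None:
        # No delegation: the subject acted directly.
        return {"on_behalf_of": subject, "acted_by": subject, "delegated": False}

    actor = act.get("sub") if isinstance(act, dict) else None
    if not isinstance(actor, str) or not actor:
        raise ValueError("malformed act claim")
    return {"on_behalf_of": subject, "acted_by": actor, "delegated": True}
```

With a record like this attached to every request, audit logs and authorization rules can treat "the human did this" and "the agent did this for the human" as different events.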

Putting it All Together

The current lack of AI-safe infrastructure is clearly an issue, but it’s far from an insurmountable problem. Again, we’ve been in exactly this situation before: when HTTP first came on the scene, network infrastructure could only work at the level of TCP connections and had to be modified to understand HTTP’s structure to support the kind of functionality we take for granted today, like per-request load balancing and detailed observability.

Likewise, today we’re faced with the need to modify our existing infrastructure to support the things we need for AI safety, and we’re seeing progress already at various levels, from the formation of the AAIF down to Linkerd’s newly announced MCP support for service mesh and efforts like the CNCF AI Gateway working group. We’ll definitely be able to produce AI-safe infrastructure – until then, though, it falls to all of us to pay extra careful attention to what kinds of AI workloads we run, and how we manage access to them.
