There’s a lot of excitement right now about how fast AI is advancing. Bigger models. More autonomy. More things delegated to machines. What’s getting less attention is whether the systems we’re deploying were ever designed to support that level of capability safely.
Over the last few years, we’ve optimized almost exclusively for performance. What we haven’t done is pause to ask whether the underlying architecture — how data flows, how access is controlled, how failures are contained — can actually hold up at scale.
As we move toward 2026, those gaps will become harder to ignore. The most important shifts in AI won’t come from new benchmarks or flashier demos. They’ll come from constraints like security failures, power limits, and architectural realities that force the industry to recalibrate.
These are the trends I think will matter most.
U.S. Open Models Will Fall Further Behind at the Foundation Layer
I’d like U.S. open models to become the best open models in the world. I’m just not optimistic that it’ll happen in the near term.
What we’re seeing instead is a widening gap at the foundation layer. Chinese research labs are releasing base models that are competitive with early GPT-5–class systems, while many U.S. “open” releases are fine-tuned variants built on top of foreign architectures. That distinction matters. Fine-tuning can improve behavior, but it doesn’t give you control over the underlying intelligence.
This isn’t about ideology. Open models matter because access and agency matter. In the same way we don’t restrict who gets to read books, we shouldn’t concentrate access to intelligence in a handful of proprietary systems. Without competitive base models, open ecosystems lose the ability to fully understand, audit, and govern the systems they rely on.
That imbalance is becoming harder to paper over.
Transformer Scaling is Running Into Architectural and Physical Limits
Transformers aren’t disappearing, but the rate of improvement is clearly slowing.
We’re spending dramatically more compute to get smaller gains, and most progress now comes from routing, retrieval, and fine-tuning rather than new architectures. As far as anyone can tell publicly, no major lab has trained a fundamentally new foundation model architecture at scale since 2024.
We’re capped on high-quality data, and the underlying architecture isn’t changing. That leaves only one path to continued gains: more compute. And more compute requires more power. That’s where the current strategy starts to break down.
U.S. data centers already consumed about 4.4% of total U.S. electricity in 2023, with projections reaching between 6.7% and 12% by 2028. At that scale, power availability stops being an abstract concern and becomes a hard constraint on what can actually be built.
Hyperscalers already have GPUs they can’t fully utilize because regional power grids can’t support sustained load. Temporary generators and infrastructure workarounds can bridge gaps, but they don’t scale indefinitely. When a large share of compute cost is power alone, physics becomes the limiting factor.
Applications will keep improving, but raw model intelligence is starting to plateau. Scaling transformers alone is no longer enough.
Prompt-Injection Worm Becomes Plausible
Prompt injection isn’t hypothetical. It’s already happening. What’s changing is the blast radius.
Agentic systems and AI browsers blend untrusted web data with logged-in accounts and autonomous actions. That collapses the separation between the data plane and the control plane, a mistake security engineers have warned about for decades. Combine untrusted inputs, broad permissions, and access to sensitive data, and you get the conditions for self-propagating attacks.
We’ve seen this pattern before. Early internet worms didn’t spread because encryption failed; they spread because systems trusted inputs they shouldn’t have. A prompt-injection worm that propagates through AI browsers isn’t far-fetched. It’s a predictable outcome of how these systems are being built.
Historically, cybersecurity best practices are often developed in response to breaches. AI won’t be an exception.
Trusted Execution is Shifting From Optional to Required
Trusted execution environments and confidential computing have been around for a long time. What’s changed is that AI finally makes them unavoidable.
The hardware is fast enough now. The tooling is usable. And the data AI systems touch is too sensitive for trust-based architectures to hold up. Privacy guarantees that rely on policy or vendor promises won’t survive as AI-driven attacks accelerate.
Systems that can’t prove where data went, who accessed it, and what was retained will increasingly be seen as unfit for serious deployment. Verifiable privacy stops being a differentiator and starts being table stakes.
The Common Thread
None of these shifts is about hype. They’re about constraints.
AI isn’t slowing down because people have lost interest. It’s running into architectural, physical, and security limits that can’t be ignored forever. The next phase of AI won’t be defined by the biggest model release. It’ll be defined by which systems were actually built to survive contact with reality.

