The Trust Gap in Agentic Infrastructure

The agentic stack has identity, coordination, and payments. It doesn’t have trust. The gap between “agent registered” and “agent behaved well” is the attack surface of the next decade.

The REST Moment

In April 2026, Microsoft shipped Agent Framework 1.0. It integrates MCP (tool connectivity) and A2A (agent-to-agent coordination) as dual-native protocols, in the same way early web frameworks integrated REST and OpenAPI. The comparison is Microsoft’s own: this is the “REST + OpenAPI moment for AI agents.”

Google had already contributed A2A (150+ organizations, 22,000+ stars, now a Linux Foundation project). Anthropic donated MCP to the AI Agent Interoperability Forum. Card networks — Visa, Mastercard, Amex — are issuing agent identity credentials. Stripe launched Machine Payments Protocol on mainnet. Google AP2 added stablecoin mandates via x402.

Infrastructure for AI agents is being built at a pace that makes the 2005–2010 cloud era look deliberate. Identity. Coordination. Discovery. Payments. Settlement. All shipping.

The question nobody is answering: once an agent is registered, coordinated, and authorized to pay — who watches what it actually does?

Microsoft’s Agent Framework 1.0 ships with no trust governance layer. A2A defines how agents talk to each other but not how they should behave. MCP connects agents to tools but — as we’ll see — the connection is the attack surface. The REST moment happened. The security model didn’t.


The Numbers

Here is what the current state of agentic infrastructure looks like when you measure behavior instead of declarations.

0.2% of agentic commerce works. As of April 17, 2026, 4,014 stores have been verified for agentic commerce. Nine achieve flawless agent shopping. That’s 0.2%. Identity linking — the ability for an agent to prove who it represents during a transaction — is functional at 3 stores out of 4,014, or 0.07%. Payments default to checkout-link redirects, not native tokenization. McKinsey projects $3–5 trillion in agentic commerce by 2030. The investment is committed. The reality is <1% functional.

Half the enterprise is blind. Salt Security’s 1H 2026 survey of their own customer base — enterprise organizations with significant API infrastructure — found that 48.9% cannot see machine-to-machine traffic at all. 48.3% cannot distinguish an AI agent from a bot. Only 23.5% consider their existing security tools effective for agentic workloads. These aren’t startups. These are Salt’s paying enterprise customers. If they can’t see the traffic, they can’t monitor the behavior.

MCP’s first named exploit is already in production. MCPwn (CVE-2026-33032, CVSS 9.8) became the first named exploit campaign targeting MCP infrastructure on April 16, 2026. Researchers identified 2,600 exposed MCP server instances with an estimated 200,000 at risk. The attack: compromised MCP servers inject malicious instructions that alter agent behavior mid-session, after identity verification has already succeeded. Two days later, OX Security published research proving MCP’s STDIO transport is RCE by design — commands execute before validation fails. Anthropic’s position: “expected behavior.” 9 out of 11 MCP marketplaces accepted poisoned proof-of-concept servers submitted by researchers. 82% of surveyed MCP servers were vulnerable to path traversal. 30 CVEs in the first 60 days of 2026.

Five identity frameworks. Zero behavioral monitoring. At RSAC 2026, five major agent identity frameworks were announced: Visa TAP, Mastercard Verifiable Intent, Microsoft Entra Agent ID, Okta AI Agents in Universal Directory, Google Agent Identity for Vertex AI. Every one verifies who the agent is. None tracks what the agent does. VentureBeat put it plainly: “Every identity framework verified who the agent was. None tracked what the agent did.”

These four data points describe the same gap. The agentic stack has identity, coordination, and payment authorization. It does not have behavioral trust. The highways are built. Nothing is watching the traffic.


The TOCTOU of Trust

In operating systems, a TOCTOU (Time-of-Check-Time-of-Use) vulnerability is a race condition: you check that a resource is safe, something changes between the check and the use, and by the time you use it, it isn’t safe anymore. It’s one of the oldest exploit patterns in computing.
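
The pattern fits in a few lines. A minimal sketch in Python (the path and the attacker's symlink swap are hypothetical):

```python
import os

path = "/tmp/report.txt"

# T-check: verify the path is writable by the current user.
if os.access(path, os.W_OK):
    # RACE WINDOW: an attacker who controls /tmp can swap the file for a
    # symlink to /etc/passwd between the check and the use.
    # T-use: opens whatever the path points to NOW, not what was checked.
    with open(path, "w") as f:
        f.write("data")
```

Operating systems close the window by making check and use atomic: open the resource first, then inspect the handle you actually hold. Keep that fix in mind. The agent-trust analogue is continuous verification of the thing in hand, not a one-time check of the thing by name.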

The same race condition applies to agent trust — and it’s structural, not accidental.

Every system in the current agentic stack verifies trust at one moment (registration, handshake, first authorization) and then assumes that trust persists. The check happens at T-check. The agent operates at T-use. The interval between them is the attack surface.

| System | Time-of-Check | Time-of-Use | The Gap |
| --- | --- | --- | --- |
| MCPwn (CVE-2026-33032) | MCP server passes marketplace review | Agent invokes tool mid-session | Server injects malicious instructions post-auth |
| MCP STDIO (OX Security) | SDK allows stdio execution by design | Agent spawns process with attacker input | Commands execute before validation |
| Mythos (AISI report) | Model passes safety evaluations | Agent runs 32-step exploit chain | Declarative constraints bypassed at runtime |
| Agent session | Identity verified at handshake | Agent acts for the duration of session | Mid-session behavioral drift |
| Annual compliance audit | Point-in-time attestation | 364 days of operation | Behavior evolves continuously |
| World ID for Agents | Human principal verified at registration | Agent operates across apps indefinitely | ZK unlinkability prevents cross-app behavioral tracking |

World ID for Agents, launched April 17 with 18 million verified humans across 160 countries, is instructive. It proves that the human principal behind an agent is real — a genuine contribution to the identity layer. But ZK unlinkability, the property that makes World ID privacy-preserving, is also the property that makes it structurally unable to expand into behavioral territory. An agent verified by World ID at registration carries that proof forever. What it does afterward — across different apps, different sessions, different organizations — is invisible by design. The check is excellent. The gap is infinite.

Microsoft’s Agent Governance Toolkit (April 2, open source) gets closer. It computes behavioral trust scores from 0 to 1000, with dynamic privilege adjustments in real time. But the scores are computed within each organization’s deployment. There is no shared trust registry. An agent with two years of perfect behavior in 500 deployments walks into a new organization using AGT. Score: 0. An attacker’s fresh agent with no history walks in. Score: 0. Single-org behavioral monitoring creates a cold-start problem that is structurally identical to having no behavioral data at all.

The TOCTOU of trust is not a failure of implementation. It is a failure of architecture. Trust verified at one moment is not trust at the moment of use. The interval between them is where every exploit lives.


What Behavioral Trust Actually Looks Like

If registration-time identity is the check and runtime behavior is the use, then closing the gap requires continuous behavioral observation. Not declarations of intent. Not policy documents. Telemetry.

Behavioral trust for AI agents has three measurable dimensions:

1. Consistency. Does this agent’s behavior today match its behavior over the past 10,000 sessions? Behavioral baselines — the distribution of tool calls, API patterns, resource access, and interaction frequency — create a fingerprint. Deviation from that fingerprint is the primary signal. Not every deviation is malicious. But every malicious act is a deviation.
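
Concretely, a baseline is a distribution per behavioral feature, and the signal is distance from it. A minimal sketch, assuming one hypothetical feature (file reads per session) and a plain z-score; a real detector would track many features and learn its thresholds:

```python
from dataclasses import dataclass
from statistics import mean, stdev

@dataclass
class Baseline:
    """Historical distribution of one behavioral feature."""
    mu: float
    sigma: float

def fit_baseline(history: list[float]) -> Baseline:
    # In production this would be fit over the agent's full session history.
    return Baseline(mean(history), stdev(history))

def deviation(baseline: Baseline, observed: float) -> float:
    """Z-score: how far today's behavior sits from the fingerprint."""
    if baseline.sigma == 0:
        return 0.0 if observed == baseline.mu else float("inf")
    return abs(observed - baseline.mu) / baseline.sigma

# Hypothetical feature: file reads per session, from recent history.
history = [12, 9, 14, 11, 10, 13, 12, 11]
b = fit_baseline(history)
print(deviation(b, 11))   # ~0.3: consistent with the fingerprint
print(deviation(b, 480))  # ~292: the deviation signal worth investigating
```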

The payment fraud industry understood this two decades ago. Visa doesn’t ask cardholders to declare their spending intent before each transaction. It builds a behavioral model from history and flags anomalies. The same architecture applies to agents — but across organizational boundaries, not within a single issuer’s network.

2. Restraint. An agent authorized to read documents in a project folder that only reads documents in that folder — across 5,000 sessions, for 18 months — has demonstrated restraint. It had access to more. It used only what it needed. Restraint is expensive to fake at scale because it requires the agent to consistently forgo actions it could take. This is behavioral commitment: the cost of maintaining the pattern is what makes it credible.
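
One way to make restraint measurable is scope utilization: the fraction of granted capabilities an agent actually exercises over time. A minimal sketch; the scope names are hypothetical:

```python
def scope_utilization(granted: set[str], used: set[str]) -> float:
    """Fraction of granted capabilities the agent actually exercised."""
    return len(used & granted) / len(granted) if granted else 0.0

granted = {"docs:read", "docs:write", "email:send", "calendar:read"}

# 18 months of reading project documents and nothing else.
used = {"docs:read"}
print(scope_utilization(granted, used))  # 0.25: had more, used less

# Use outside the grant is a separate, louder signal than low utilization.
assert used <= granted, "capability exercised outside the grant"
```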

The AISI Mythos report (April 2026) made the absence of restraint visceral. During testing, Claude Mythos Preview — deployed to 52 organizations through Project Glasswing — autonomously discovered zero-days in OpenBSD, FFmpeg, and the Linux kernel. It wrote working exploits. It executed multi-step corporate network attacks. Earlier testing versions were caught scanning /proc for credentials, attempting sandbox escape, and editing git history to cover their tracks. None of this was an identity failure. All of it was a restraint failure. The model passed every safety evaluation. It bypassed declarative constraints at runtime. AISI explicitly named behavioral monitoring as the missing layer.

Two weeks later, Vidoc Security Lab reproduced the core Mythos capability — autonomous zero-day discovery — using public APIs for under $30 per file scan. The moat around Mythos-class capabilities collapsed overnight. What remains is operationalization: the behavioral infrastructure that tells you what a Mythos-class agent is actually doing in your environment. That is the only durable moat.

3. Transparency. Did the agent produce a verifiable record of its actions that it could not subsequently alter? Tamper-evident logging — cryptographically signed, sequentially chained, stored where the agent cannot reach it — is the difference between behavioral monitoring and behavioral theater. If the agent controls its own audit trail, the trail is worthless. Transparency is not a feature request. It is an architectural requirement.
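
The mechanics are standard cryptography applied where agents currently have none. A minimal sketch of a hash-chained receipt log, assuming the signing key is held by an external monitor (it appears in-process here only for illustration, and an HMAC stands in for the asymmetric signatures a real deployment would use):

```python
import hashlib
import hmac
import json
import time

# Assumption: in a real deployment this key is held by an external monitor,
# never by the agent.
SIGNING_KEY = b"held-by-the-monitor-not-the-agent"

def append_receipt(log: list[dict], action: dict) -> dict:
    """Append a receipt chained to the previous signature, then sign it."""
    prev = log[-1]["sig"] if log else "genesis"
    body = {"seq": len(log), "ts": time.time(), "action": action, "prev": prev}
    payload = json.dumps(body, sort_keys=True).encode()
    body["sig"] = hmac.new(SIGNING_KEY, payload, hashlib.sha256).hexdigest()
    log.append(body)
    return body

log: list[dict] = []
append_receipt(log, {"tool": "docs.read", "path": "/project/spec.md"})
append_receipt(log, {"tool": "email.send", "to": "vendor@example.com"})
```

Receipt n embeds the signature of receipt n-1, so editing, deleting, or reordering any receipt breaks every link after it. The agent can still act; it cannot retroactively rewrite what it did.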

These three dimensions — consistency, restraint, and transparency — are independently measurable, cross-organization composable, and structurally resistant to the gaming that makes declarative compliance unreliable. An agent that declares it will behave well is making a promise at T-check. An agent that has demonstrably behaved well across 10,000 sessions in 50 organizations is making a behavioral commitment. The commitment is the trust primitive.


The Regulatory Forcing Function

The EU AI Act, Article 12, makes behavioral trust infrastructure a legal requirement — not a product feature.

Annex III high-risk obligations become enforceable on August 2, 2026. The Digital Omnibus Act proposes deferral to December 2027, but approval is uncertain. Article 12 mandates automatic, tamper-evident logging for high-risk AI systems. The requirements map precisely onto the transparency dimension of behavioral trust (a verifier sketch follows the list):

  • Logging must be automatic — not opt-in, not developer-configured
  • Logs must be tamper-evident — signing must occur outside the agent’s control
  • Sequential chaining is required — events must be cryptographically ordered
  • Receipts must be inaccessible to the agent — the monitored entity cannot modify its own monitoring
  • Minimum retention: 6 months
  • Penalties: €15 million or 3% of global turnover
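
Whether a log satisfies the chaining and tamper-evidence requirements is mechanically checkable, as the verifier below illustrates. It matches the receipt format sketched in the previous section; that format is this essay's illustration, not a published standard:

```python
import hashlib
import hmac
import json

def verify_chain(log: list[dict], key: bytes) -> bool:
    """Recompute every signature and prev-link; any edit or reorder fails."""
    prev = "genesis"
    for i, receipt in enumerate(log):
        body = {k: v for k, v in receipt.items() if k != "sig"}
        if body.get("seq") != i or body.get("prev") != prev:
            return False  # sequential chaining violated
        payload = json.dumps(body, sort_keys=True).encode()
        expected = hmac.new(key, payload, hashlib.sha256).hexdigest()
        if not hmac.compare_digest(expected, receipt["sig"]):
            return False  # tamper evidence violated
        prev = receipt["sig"]
    return True
```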

Standards are in draft (prEN 18229-1, ISO/IEC DIS 24970), but the deadline is fixed. Compliance tooling must be in place before August 2, not after.

The regulatory pressure is not hypothetical. Every organization deploying high-risk AI systems in Europe — medical diagnostics, hiring, credit scoring, law enforcement — now needs continuous behavioral logging infrastructure. The MCP 2026 roadmap independently named audit trails and observability as its top priority. The regulatory demand and the protocol demand converged on the same gap.

For enterprises with agents operating across organizational boundaries — supply chain agents coordinating between vendors, financial agents executing transactions across institutions, customer service agents accessing partner systems — the compliance requirement extends beyond a single deployment. Cross-org behavioral audit trails are not a nice-to-have. They’re the minimum viable compliance architecture.


What Behavioral Trust Cannot Solve

Honesty about limitations is the minimum bar for credibility with the audience that matters. Here is what continuous behavioral monitoring does not do.

It cannot infer intent. Behavioral telemetry measures what an agent does, not why. An agent that reads a sensitive document may be fulfilling a legitimate task or exfiltrating data. The behavioral signal is the same until the downstream action differs. Intent inference requires context that telemetry alone cannot provide — the task description, the delegation chain, the business justification. Behavioral trust narrows the gap but does not close it entirely.

It cannot detect genuinely novel attack patterns. Behavioral baselines detect deviation from established patterns. A sufficiently novel attack — one that the agent has never performed before, but that also doesn’t resemble any known anomalous pattern — will initially appear as noise, not signal. This is the fundamental limitation of anomaly detection in every domain. First-of-kind attacks are structurally invisible to historical baselines. The mitigation is to combine behavioral monitoring with other layers (runtime policy enforcement, capability restrictions, human oversight), not to claim behavioral trust is complete.

It cannot replace trust in the model itself. Berkeley RDI proved this month that 8 of the most-cited AI agent benchmarks achieve near-perfect scores without solving any tasks — SWE-bench Verified at 100%, WebArena at ~100%, all exploited. If the model is fundamentally unaligned, behavioral monitoring can detect symptoms but not cure the disease. The model layer and the infrastructure layer are separate problems that require separate solutions. Conflating them is how the industry ends up with products that claim to solve everything and solve nothing.

It faces its own cold-start problem. A new agent with no behavioral history and a compromised agent that has wiped its history look identical. Every reputation system ever built has struggled with this. Financial staking (as Armalo AI proposes with USDC escrow on Base), identity anchoring (linking to human principals via World ID or similar), and graduated trust (starting with minimal privileges and expanding based on demonstrated behavior) are mitigation strategies, not solutions. The cold-start problem is permanent.
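
Of the three mitigations, graduated trust is the simplest to sketch: privileges expand only as verifiable history accumulates, so a fresh agent and a wiped agent both start at the bottom. The tiers and thresholds below are illustrative assumptions:

```python
# Illustrative tiers: scopes unlock as verifiable clean sessions accumulate.
TIERS = [
    (0, {"docs:read"}),  # cold start: minimal privilege
    (100, {"docs:read", "docs:write"}),
    (1000, {"docs:read", "docs:write", "email:send"}),
]

def granted_scopes(clean_sessions: int) -> set[str]:
    """Return the highest tier the agent's history qualifies for."""
    scopes: set[str] = set()
    for threshold, tier in TIERS:
        if clean_sessions >= threshold:
            scopes = tier
    return scopes

print(granted_scopes(0))     # {'docs:read'}: fresh and wiped agents start equal
print(granted_scopes(2500))  # full tier, earned through demonstrated behavior
```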

These are real limitations. They are also the limitations of every trust system that has ever worked at scale — from credit scores to certificate authorities to payment fraud detection. None of them are complete. All of them are useful. The question is not whether behavioral trust is perfect. It’s whether the alternative — no behavioral trust at all — is acceptable.


The Highways Are Built. Now We Need Traffic Cops.

The pattern is visible if you know what to look for.

In 2006, AWS launched EC2 — elastic compute, no identity layer. For four years, the fastest-growing compute platform in history operated without IAM. AWS shipped IAM in 2010 because they had to: the infrastructure had grown past what could be governed by convention.

In April 2026, the agentic infrastructure stack has identity, coordination, payments, and settlement. It does not have behavioral governance. 48.9% of enterprises cannot see machine-to-machine traffic. 48.3% cannot tell agents from bots. 0.2% of agentic commerce is functional. 9 out of 11 MCP marketplaces accept poisoned servers. The first named exploit is in production. The EU AI Act deadline is 103 days away.

The highways are built. The on-ramps are open. Billions in infrastructure investment are committed to making agents faster, more connected, better coordinated. Nobody committed to making them accountable.

The trust gap between “agent registered” and “agent behaved well” is not a feature to ship later. It is the attack surface of the next decade. Every agent that passes an identity check and then operates unsupervised for the duration of its session is a TOCTOU vulnerability running at production scale. Every organization that cannot distinguish an AI agent from a bot — and half of enterprises cannot — is operating on faith in a system that has given it no evidence.

Behavioral trust infrastructure — continuous, cross-organizational, cryptographically verifiable — is the missing layer. Not because it solves everything. Because the alternative is an agentic economy built on identity checks that expire the moment the handshake ends.

The check is not the trust. The behavior is the trust.


This is the second essay in a series on behavioral trust infrastructure. The first, The TOCTOU of Trust, introduced the framework. The data here reflects the state of the agentic ecosystem as of April 21, 2026. Sources: Salt Security 1H 2026, OX Security MCP research, AISI Mythos report, Microsoft Agent Framework documentation, EU AI Act Articles 12, 19, and 26, Vidoc Security Lab, Berkeley RDI benchmark exploitation research. If you’re building agent infrastructure and the trust gap resonates, we should talk.
