Germany Didn't Trust a Certificate. Neither Should You.

Germany's national digital ID abandoned static device certification for runtime behavioral attestation. The same logic applies to AI agents.

Germany's national digital identity infrastructure, its implementation of the EU Digital Identity Wallet under eIDAS, has a security problem that should be familiar to anyone building with AI agents.

The problem is this: you can certify a device today and have no idea what it will be tomorrow.

Germany's solution is documented in their Mobile Device Vulnerability Management (MDVM) architecture. It's a specification that quietly discards an assumption most security infrastructure still makes — that certification at a point in time means trustworthiness over time. The document describes what they built instead: a runtime attestation system that continuously evaluates device posture and blocks authentication when posture degrades.

Read it as an AI architect and you'll notice something. Every problem they describe is a problem we have with agents.

The Certification Trap

Traditional device certification works like this: an auditor evaluates a device against known attack potentials, assigns a certification level, and publishes results. The device is trusted until certification expires.

The MDVM architects identified the flaw precisely: "new vulnerabilities may be discovered after certification." A device that passed every test in 2024 may have active exploits in 2025. The certification is still valid. The trust assumption it encodes is not.

Their solution is a system that:

  • Collects runtime signals: Google Play Integrity verdicts (including the MEETS_STRONG_INTEGRITY check, which requires a security patch level from within the last 12 months), Apple App Attest assertions, and RASP (Runtime Application Self-Protection) telemetry that independently detects rooting, jailbreaking, emulation, and hooking
  • Cross-references vulnerability databases: Device model and OS version are used to query known CVEs, identifying whether specific devices are in an affected class
  • Enforces dynamically: If a vulnerability affecting authentication integrity is discovered, the system "prevents the use of keys by user authentication with insufficiently secure devices" — mid-wallet-lifetime, without requiring OS updates or reinstallation

The design explicitly does not trust any single signal. Key attestation proves that keys reside in hardware-backed storage, but can be defeated by leaked attestation keys. Play Integrity adds Google's backend assessment, but carries its own trust assumptions. RASP monitors runtime behavior independently of both. Layered, because no single layer is sufficient.
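The layered design reads naturally as a veto chain: any layer can block key use, but no layer alone grants trust. Here is a minimal sketch of that evaluation logic. All names, field shapes, and the CVE index are hypothetical illustrations, not MDVM's actual schema or API.

```python
from dataclasses import dataclass

# Hypothetical signal shapes; field names are illustrative only.
@dataclass
class DeviceSignals:
    play_integrity_verdicts: set  # e.g. {"MEETS_DEVICE_INTEGRITY", "MEETS_STRONG_INTEGRITY"}
    rasp_alerts: set              # e.g. {"root_detected", "hook_detected"}
    device_model: str
    os_version: str

# Hypothetical CVE index: (model, os_version) -> CVEs affecting auth integrity.
AUTH_CRITICAL_CVES = {
    ("ExampleModel", "14.0"): ["CVE-2025-0001"],
}

def evaluate_posture(sig: DeviceSignals):
    """Layered check: every layer can veto key use; none alone grants trust."""
    reasons = []
    # Layer 1: backend attestation. MEETS_STRONG_INTEGRITY implies a recent
    # security patch level (within 12 months).
    if "MEETS_STRONG_INTEGRITY" not in sig.play_integrity_verdicts:
        reasons.append("missing MEETS_STRONG_INTEGRITY")
    # Layer 2: independent runtime self-protection telemetry.
    if sig.rasp_alerts:
        reasons.append(f"RASP alerts: {sorted(sig.rasp_alerts)}")
    # Layer 3: vulnerability cross-reference by device class, which is how
    # a device can be blocked mid-wallet-lifetime without any on-device change.
    cves = AUTH_CRITICAL_CVES.get((sig.device_model, sig.os_version), [])
    if cves:
        reasons.append(f"affected by {cves}")
    return (not reasons, reasons)

ok, why = evaluate_posture(DeviceSignals(
    play_integrity_verdicts={"MEETS_DEVICE_INTEGRITY"},
    rasp_alerts=set(),
    device_model="ExampleModel",
    os_version="14.0",
))
# Blocked: strong integrity is missing AND a known auth-critical CVE applies,
# even though RASP sees nothing wrong. That is the layering at work.
```

Note that the decision is recomputed at each authentication, so a new CVE entry changes the outcome immediately, with no re-certification step.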

The Same Problem, Different Substrate

Replace "mobile device" with "AI agent" and the architecture description reads identically.

An agent can pass every evaluation at deployment time. It has valid credentials. Its authorization scope is correctly specified. Its behavior in testing matched expectations.

None of that tells you whether the agent will remain trustworthy when it operates autonomously under novel conditions, with counterparties it has never encountered, in edge cases the test set never covered.

The MDVM architects solved this for devices by shifting from certification to continuous posture evaluation. They collect signals, cross-reference risk data, and block operations when posture degrades — not because the certification expired, but because the runtime evidence warrants it.

Agent trust infrastructure needs the same shift. Not "did this agent pass an evaluation once?" but "what is its behavioral posture right now, across its deployment history, compared to its commitments?"

The signals are different: not Play Integrity verdicts but behavioral patterns across deployments, commitment-keeping rates, operator renewal decisions, escalation behavior when scope is ambiguous. But the architecture is identical. Runtime. Layered. Continuous. Enforced at the moment of use.
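Translated into code, the same veto-chain shape applies; only the signals change. This is a sketch of the essay's framing, not an established standard: the posture fields, thresholds, and the authorize function are all hypothetical.

```python
from dataclasses import dataclass

# Illustrative behavioral posture record for an agent. These signal names
# come from this essay's framing, not from any deployed system.
@dataclass
class AgentPosture:
    commitment_keep_rate: float   # fraction of declared commitments honored
    operator_renewal_rate: float  # fraction of operators who redeployed the agent
    escalations_on_ambiguity: int # times the agent asked instead of guessing
    scope_violations: int         # observed operations outside declared scope

def authorize(posture: AgentPosture, min_keep_rate: float = 0.95) -> bool:
    """Enforced at the moment of use, like MDVM blocking key use: the
    decision reflects current behavioral posture, not a past evaluation."""
    if posture.scope_violations > 0:
        return False  # hard veto, analogous to an auth-critical CVE match
    return (posture.commitment_keep_rate >= min_keep_rate
            and posture.operator_renewal_rate >= 0.5)

# An agent with a 99% keep rate, 80% renewal, and no violations is authorized;
# a single observed scope violation blocks it regardless of its other scores.
```

The thresholds here are placeholders; the point is the shape of the decision: recomputed per operation, fed by deployment history rather than a one-time evaluation.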

Why This Matters Beyond Analogy

Germany built MDVM because they were deploying sovereign-scale infrastructure — national identity credentials — on devices they couldn't physically inspect or control. They couldn't assume trustworthiness. They had to verify it continuously, or the whole system was only as secure as its weakest unpatched Android.

The agentic economy is building the same kind of infrastructure. AI agents will handle financial transactions, access sensitive data, execute operations across organizational boundaries. Deployers can't inspect the agents they use. They can't audit counterparty behavior. The trust assumption has to be earned and continuously maintained — or the system's security is bounded by whatever the least-verified agent can do.

The Strata/CSA survey published this month found that 70% of enterprises run agents in production outside their IAM systems. Only 18% are confident their IAM can handle agent identities. Just 11% have runtime authorization enforcement. That gap isn't a tooling problem — it's an architecture problem. Static evaluation was designed for static entities.

Germany figured this out for devices. Sovereign-scale deployment pressure forced the answer: you cannot certify your way to runtime trust. You have to measure it.

The agent layer is next. The architecture is already written.


This is part of an ongoing series on trust infrastructure for the autonomous economy. Earlier essays: The Agent Passed All the Checks. That Was the Problem., 60% of Consumers Want Approval Gates for AI Spending, Commitment Is the New Link, Who Decides What Agents Are Allowed to Buy?, Declarations Are Gameable. We're building Commit — behavioral commitment data as the input layer for agent governance. Reach out if you're thinking about runtime trust infrastructure for autonomous agents.
