The drift, the surrender, and the architecture they are asking for

Two papers, two disciplines, the same diagnosis. Practitioners are converging on the symptom-vocabulary. The cure-vocabulary is not yet claimed. This essay walks from one to the other, and lands on three runtime primitives: Declared Intent, Living Authority, Evidence Audit.

By February 2026 the diagnosis had a name. Margaret-Anne Storey, a software engineering researcher with a twenty-year track record on developer cognition, posted From Technical Debt to Cognitive and Intent Debt (arXiv 2603.22106). The argument is small and load-bearing. AI-assisted development has changed the structure of the debt that codebases accumulate. The classical category, technical debt, is still real. Two new ones now sit beside it.

Cognitive debt is the erosion of the team's shared mental model of the system: the code exists, runs, passes tests, and no one on the team can quite say why it works. Intent debt is the erosion of explicit rationale: nobody remembers the constraints that originally shaped the design, because they were never written down where a future reader could find them.

Storey gave the failure mechanism a name too. Cognitive surrender. The developer accepts AI-generated code without paying for the implementation friction that used to build their understanding of it. The AI did the thinking that would have produced the mental model. The developer got the code but not the model.

Three weeks before Storey's paper, Wharton's Shaw and Nave published an empirical study of the same shape (SSRN 6097646): 1,372 participants, roughly 10,000 trials. Their measurement: when the AI gave a wrong answer, 80% of users followed it anyway, and 73% of those failures showed no evidence of System 2 engagement at all. The participants did not check. They did not even pause.

Engineering and cognitive science arrived at the same diagnosis from different evidence within weeks of each other. That kind of independent convergence is unusual, and it is why the diagnosis stuck.

Within weeks the practitioner culture began re-deriving the same conclusion in its own words. Simon Willison amplified Storey as the best explanation of cognitive debt he had seen. Martin Fowler cited both papers in his April Fragments. Addy Osmani extended the framework with what he called comprehension debt. Pydantic published The Human-in-the-Loop is Tired. Wired wrote about agent drift. VentureBeat catalogued context decay, orchestration drift, silent failures. Anthropic disclosed in their Claude Code data that 93% of permission prompts get approved. The number is the smoking gun. Permission prompts that approve 93% of the time are not gating anything; they are paperwork.

So practitioners have language for the pain now. Cognitive surrender. Intent drift. Plan decay. Unclosed loops. Silent failures. The Triple Debt. These are not yet vendor-blog vocabulary, but they will be by autumn. The diagnosis is not contested. The audience already knows it from their own work.

What practitioners do not yet have is language for the cure.

This essay is one attempt to provide it. Not as a manifesto. As an architectural question. What does a system that took these symptoms seriously actually look like?

The shape of the failure, in engineering terms

The standard story about AI failure modes is wrong. It is too soft, and it focuses on the wrong layer.

The soft version says AI sometimes hallucinates and sometimes makes mistakes, and the cure is better evals or more training data. That is true and orthogonal. It is not what Storey and Wharton are describing.

The thing they are describing is structural. The agent that does the work and produces a result is the same kind of process at the start of the run and at the end. Nothing about it changes when the user's understanding changes, or when the world the user was thinking about changes, or when the original purpose of the run stops being the right purpose. The run finishes. The diff lands. The merge happens. The model that should have updated is the model in the developer's head, and that model never had to be updated, because the friction that used to force the update was traded for speed.

This is not a model-quality problem. It is a runtime-shape problem.

You can see it most clearly in long-running agents. An agent that is given a task, runs for an hour, takes 200 tool calls, and produces a working result has done something impressive. It has also operated for an hour with no signal that the goal it received at minute 0 is the same goal the user has at minute 60. The user's understanding has moved. The codebase has moved. Maybe the agent itself, in tool call number 47, learned something that should have made it stop and re-ask. Nothing in the runtime asks. Nothing in the runtime can ask, because the goal was never an object the runtime held. It was a string in a system prompt, and once the system prompt was loaded, the goal had nowhere to live.

This is the engineering articulation of intent drift. The phrase comes from the academic literature; the failure is what every team running a serious agent harness has felt in their hands.

The arXiv paper that named the failure (Evaluating Goal Drift in Language Model Agents, arXiv 2505.02709) ran the first replicable benchmark. Some models drifted on the majority of long-horizon runs; others held better. The authors' conclusion was that the field needs longer time-horizon evaluations. The implicit conclusion is that something in the runtime is missing.

What is missing is not more eval coverage. It is an object that the agent, the user, and the future reader can all point at and say this is what we agreed to do.

Three primitives

By construction, then: a runtime that took cognitive surrender and intent drift seriously would have three things current runtimes do not. None of them are model improvements. All of them are runtime concerns. The order matters. Each primitive needs the previous one to be in place to mean anything.

1. Declared Intent

The user states what they want done. Not as a prompt that vanishes after one turn. As an object the runtime holds for the duration of the work, refers back to, and treats as the contract.

Right now, in every framework that ships, the user's intent is encoded as the first message of a conversation, or as a system prompt, or as an instruction string passed to an agent constructor. In all of these cases the intent is data the model reads once and then forgets, except insofar as the conversation history happens to scroll past it again. The model does not know what the user "agreed to" because there is no representation of agreement. There is only the residue of the original message in context, decaying as new tokens arrive.

A runtime with Declared Intent treats the user's stated goal differently. The intent is created as a first-class record at the start of the work. It has a stable identifier. It has a scope: what the agent is and is not allowed to touch. It has constraints: the budget, the deadline, the things the user said matter and the things they explicitly said do not matter. It is the object the entire run is anchored to.
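As a sketch only, with field names that are my assumptions rather than any shipping framework's schema, such a record might look like this:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from uuid import uuid4

@dataclass(frozen=True)
class Intent:
    """A declared goal the runtime holds for the duration of the work."""
    goal: str                    # what the user asked for, verbatim
    scope: tuple[str, ...]       # what the agent may and may not touch
    constraints: dict[str, str]  # budget, deadline, explicit non-goals
    intent_id: str = field(default_factory=lambda: uuid4().hex)
    declared_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc))

# The run is anchored to this object, not to a string in a system prompt.
intent = Intent(
    goal="Migrate the billing service's retry logic to the new queue client",
    scope=("services/billing/**",),
    constraints={"budget": "4 hours of agent time",
                 "non_goal": "do not change the public API"},
)
```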

Concretely this changes three things. The agent can re-read the contract mid-run instead of relying on the decaying residue of a first message. The runtime can check each proposed step against the declared scope and constraints. And the record outlives the session, so decisions and evidence can be indexed by the intent they served.

This is the foundation. Without an Intent object, the next two primitives have nothing to anchor to.

2. Living Authority

The user grants authority to a goal at a moment in time. The world moves. The user's understanding of the goal moves. The system the agent is working on moves.

A runtime with Living Authority detects when the contract no longer matches reality and stops to re-ask, instead of executing on stale assumptions.

This is the missing trigger. Every existing framework I know of has the substrate for it (LangGraph's checkpoints, Letta's memory blocks, Anthropic's evaluator-agent harness) and none of them ship the trigger itself. The trigger has to fire when something specific happens: the file the agent is editing was modified by a human in a different branch since the work started, the dependency the plan was built on changed in a way the agent did not see, the approver who originally signed the work has revoked their session, the user changed their mind about something downstream that invalidates the upstream choice.

The pattern in the wallet world is Prompt-to-Propose, Human-to-Sign. The agent prepares a transaction. The human signs on a trusted device. The agent never holds the key, and the human never approves a transaction they cannot read on their own screen. The point of separation is not "approval theatre." It is that the consequential boundary preserves implementation friction in the only place where friction earns its cost.

Living Authority is the same shape, generalized. The agent prepares a step. The runtime checks whether the world the user signed off on has moved. If it has, the step does not execute. The user is asked again, with the diff between then and now in front of them.
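A minimal sketch of that check, assuming the runtime records re-checkable facts about the world at sign-off. Every name here (SignedCondition, stale_conditions, file_unchanged) is hypothetical, not a shipped API:

```python
import hashlib
from dataclasses import dataclass
from typing import Callable

@dataclass
class SignedCondition:
    """A fact about the world that was true when the user signed off."""
    description: str                 # human-readable: what was true, and when
    still_holds: Callable[[], bool]  # re-checks the fact against the world now

def stale_conditions(conditions: list[SignedCondition]) -> list[SignedCondition]:
    """Conditions that no longer hold. Empty: the step may execute.
    Non-empty: the step does not execute; the user is re-asked with the diff."""
    return [c for c in conditions if not c.still_holds()]

def file_unchanged(path: str, signed_sha256: str) -> Callable[[], bool]:
    """True while the file still hashes to what it hashed to at sign-off."""
    def check() -> bool:
        try:
            with open(path, "rb") as f:
                return hashlib.sha256(f.read()).hexdigest() == signed_sha256
        except FileNotFoundError:
            return False
    return check

conditions = [
    SignedCondition(
        description="services/billing/retry.py unchanged since sign-off",
        still_holds=file_unchanged("services/billing/retry.py",
                                   "<sha256 recorded at sign-off>"),
    ),
]
for moved in stale_conditions(conditions):
    print(f"World moved: {moved.description}. Re-ask before executing.")
```

The substrates named above can already persist conditions like these across a run; what no framework ships is the step that re-evaluates them before anything consequential executes.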

The crucial thing about this primitive is that it is not "approval before every action." That is the regime that produced Anthropic's 93% approval rate. It is approval at the boundary where the world has actually moved. Most steps in a long agent run do not need re-approval. The ones that do need it are the ones that current frameworks cannot detect.

3. Evidence Audit

For every meaningful decision the agent makes, the runtime records what it consulted. Not as a transcript dump. As a structured record, indexable by Intent, persistent, queryable.

This is the post-hoc primitive. Declared Intent and Living Authority operate at runtime. Evidence Audit operates after the fact. It is what you reach for when you come back to a piece of work three weeks later and need to reconstruct why the agent made the choices it did.

Most agent frameworks do not record evidence in this sense. They record traces. A trace is the unstructured stream of tool calls and model outputs and intermediate text. To answer "what did the agent stand on for this decision" from a trace, you read the trace. That is fine for ten decisions. It does not survive a hundred. It does not survive across team members. It does not survive the agent itself going away and being replaced by a different one.

Evidence in the structured sense is different. When the agent reads a pattern from the corpus, that read becomes an evidence record: kind, source, timestamp, the intent it was consulted under. When the agent reads a wiki article, same. When the agent searches and chooses among results, the choice is an evidence record. When the agent's decision is "do the thing," the decision record links to the evidence records that informed it.
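A sketch of the two record shapes, using the fields the paragraph above names (kind, source, timestamp, intent) and treating everything else as assumed:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from uuid import uuid4

@dataclass(frozen=True)
class EvidenceRecord:
    kind: str        # "corpus_pattern", "wiki_article", "search_choice", ...
    source: str      # stable reference to the thing consulted
    intent_id: str   # the Intent this was consulted under
    record_id: str = field(default_factory=lambda: uuid4().hex)
    consulted_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc))

@dataclass(frozen=True)
class DecisionRecord:
    summary: str                   # what the agent decided to do
    intent_id: str
    evidence_ids: tuple[str, ...]  # the EvidenceRecords that informed it

def stood_on(decision: DecisionRecord,
             store: dict[str, EvidenceRecord]) -> list[EvidenceRecord]:
    """Answer 'what did the agent stand on for this decision'
    without reading a transcript."""
    return [store[rid] for rid in decision.evidence_ids if rid in store]

read = EvidenceRecord(kind="wiki_article",
                      source="wiki/retry-semantics", intent_id="intent-001")
decision = DecisionRecord(summary="use the queue client's built-in backoff",
                          intent_id="intent-001",
                          evidence_ids=(read.record_id,))
print([e.source for e in stood_on(decision, {read.record_id: read})])
```

The query is the point: the audit question becomes a join over structured records rather than a read through a trace.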

This is auditable in the legal sense, not the developer-tools-marketing sense. A regulator, or a future maintainer, or the user themselves three months later, can answer the question what did this system stand on? without reading anything that looks like a transcript. The answer is structured, citable, indexable.

The closest academic analog is Grounding Agent Memory in Contextual Intent (STITCH, arXiv 2601.10702), which indexes memory blocks by the intent under which they were created. The runtime claim is that this should be the default, not a research direction.

What this is not

The three primitives are not a manifesto. They are also not new. The pieces exist in adjacent fields. What is new is putting them together as the runtime shape for AI work, and saying out loud that this is what cognitive surrender and intent drift are asking for.

It is not chain-of-thought. Chain-of-thought is a model behavior at inference time. Declared Intent is a runtime object that exists between turns.

It is not LangGraph checkpointing. LangGraph has the substrate; the trigger that fires when the world has moved is the missing piece.

It is not crypto agent wallets. Wallets handle transactions, per-action consent, spending limits. They are the right answer for transactional agents. They are not sufficient for long-running agents whose authority needs to live across hours or days, whose evidence load is dominated by reads and not transactions, and whose drift mode is intent-shape, not budget-overrun.

It is not "AI safety guardrails." Guardrails are content classifiers. Living Authority is a runtime contract. Different layer, different problem.

It is not a refusal of speed. Most steps in a serious agent run can and should be fast. The argument is not that everything should be slow. The argument is that the consequential boundaries should preserve friction, in the way they do in every other infrastructure where consequences and speed have been negotiated honestly: signing keys, deploys, financial transactions, legal documents.

What changes if you build it this way

If a runtime is built around these three primitives, several things follow that do not follow from the current shape. Drift is caught while the work is running, not discovered at merge. Approval prompts become rare enough to mean something, instead of paperwork that passes 93% of the time. The question what did this system stand on? has a structured answer three weeks later. And the team's mental model does not have to be reconstructed from a transcript, because the intent and the evidence were recorded where a future reader can find them.

These are not nice-to-haves. They are the shape difference between a runtime that takes the diagnosis seriously and one that doesn't.

On the names

The three primitives have names because names are how a vocabulary forms. Declared Intent. Living Authority. Evidence Audit. They are the cure-vocabulary for the symptoms practitioners are already naming.

I am aware that "cure-vocabulary" is doing work it has not yet earned. Cognitive surrender is consolidating because two independent disciplines arrived at it from different angles. Intent drift is consolidating because the academic and practitioner communities are converging on the term. The three primitives are a proposal. They will earn their place if the runtime that carries them produces results that are recognizable to the practitioners who already named the symptoms.

That is the test, and it is the only test that matters. The bridge from pain to category is not made by branding. It is made by an architecture that, when you see it, makes you say "yes, that is what was missing."

This exists

A version of this runtime exists. It is called Spegling.

Spegling holds Intent objects across sessions, refers back to them as work continues, and re-asks for confirmation when the world has moved under an Intent that was signed off before. It records evidence as structured records linkable to the Intent the agent was working under, and exposes that as a queryable surface, not a transcript dump. It hosts the agent harness alongside the intent registry and the evidence store, so the three primitives are runtime properties, not afterthoughts. Concretely: every coding session run inside Spegling links the patterns, wiki articles, and X-Ray reports the agent consulted into structured evidence records, indexable by the intent the session was anchored to.

It is currently private, by allowlist. Not because the work is secret. Because the people who will use it well are the people who already feel the failure mode and want the architectural answer, and that is a smaller audience than the one currently telling vendors to ship faster.

If you have read this far and the diagnosis sounds familiar, the contact form is at the bottom of varjosoft.com. The right time to talk is now, while the symptom-vocabulary is consolidating and the cure-vocabulary has not yet been claimed.


Hannu Varjoranta writes about trust, continuity, and the design of systems that respect human attention. Previously: The Permission Economy, Under the Shared Sky, and Building Software with AI Agents. He builds Spegling and runs patterns-starter, the open bootstrap for personal pattern corpora.


Selected references.
Storey 2026, From Technical Debt to Cognitive and Intent Debt. arXiv 2603.22106, Feb 2026.
Shaw and Nave 2026, Thinking — Fast, Slow, and Artificial. SSRN 6097646, Jan 2026.
Evaluating Goal Drift in Language Model Agents. arXiv 2505.02709.
Grounding Agent Memory in Contextual Intent (STITCH). arXiv 2601.10702.
Anthropic, Claude Code permission-prompt approval rate (93%), disclosed publicly in 2026 product communications.
Spegling and patterns-starter: github.com/varjoranta/patterns-starter.
Companion thinking on calm and refusal-aware design: Amber Case's Calm Tech Institute.
