Guide

Multi Agent Coordination Drift: The Cheapest Fix Is Not Fanning Out

Coordination drift is the slow divergence of two or more AI agents' working models of a shared task as they pass work between each other. The literature prescribes episodic memory consolidation, drift-aware routing, and structured handoff checklists. The cheaper fix, the one most prompt-to-app builders actually need, is structural: don't fan out. One agent, one tree, one commit per turn. Here is what that looks like in mk0r's source.

Matthew Diakonov, Written with AI

Published May 12, 2026 · 7 min

Direct answer (verified 2026-05-12)

Multi agent coordination drift is when separate AI agents' working memories about a shared task diverge across handoffs. It is one of three drift modes described in the 2026 paper Agent Drift: Quantifying Behavioral Degradation in Multi-Agent LLM Systems, alongside semantic drift and behavioral drift. The standard mitigations are episodic memory consolidation, drift-aware routing, and adaptive behavioral anchoring. The simpler mitigation, when the task allows it, is to run one agent against one source of truth and skip the handoff entirely. mk0r ships exactly that contract: one Claude Code subprocess per session, one git tree at /app, one commit per turn.

The thesis, stated plainly

Coordination drift is not a property of LLMs. It is a property of architectures that route work across multiple LLM contexts. The architecture creates a step that takes a rich, accurate, current view of the world and compresses it into prose for the next agent to read. The next agent decompresses the prose into its own model of the world, which is wrong in small ways. Over enough handoffs, the small ways compound.

Remove the architecture and the property goes with it. If only one agent ever touches the artifact, there is no compression step. There is no decompression step. The same context that edited the file in turn 3 reads the file in turn 4. Drift requires distance between agents. One agent is zero distance.
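A toy model makes the compounding concrete. Assume, purely for illustration, that each handoff independently preserves a fixed fraction of the facts the next agent needs. The 0.9 below is a made-up number, not a measurement, and retainedAfter is a hypothetical helper that appears nowhere in mk0r.

```typescript
// Toy model (an assumption, not a measurement): each handoff keeps a
// fraction `perHandoff` of the facts the next agent needs, independently.
function retainedAfter(handoffs: number, perHandoff: number): number {
  if (handoffs < 0 || perHandoff < 0 || perHandoff > 1) {
    throw new RangeError("handoffs must be >= 0 and perHandoff in [0, 1]");
  }
  return Math.pow(perHandoff, handoffs);
}

// Zero handoffs (one agent) loses nothing. Four handoffs at a hypothetical
// 90% fidelity keep roughly two thirds of the original facts.
for (const n of [0, 1, 4]) {
  console.log(n, retainedAfter(n, 0.9).toFixed(2));
}
```

The exact numbers do not matter. What matters is that the exponent is the number of handoffs, and a single agent holds it at zero.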

This sounds tautological, and in a sense it is. The interesting question is whether the single-agent constraint costs you anything. For one user iterating on one prototype, the answer is no.

What a drifted handoff actually looks like

A typical orchestrator-worker pattern on a vibe-coded app has a Planner, a Coder, and a Reviewer. Each one sees a slice of context and writes a summary for the next. Watch how a small rename gets lost.

Three agents, four handoffs, one quiet rename failure

The Reviewer's last message is a confident yes. The Planner accepts it and moves on. The Coder never edited the fixture in src/lib/seed.ts because nothing in the Reviewer's message said to. The fixture still says title, the type says name, and the next render crashes.

No single agent did a bad job. The Planner's brief was correct. The Coder did what it was told. The Reviewer answered the question it was asked. The drift lived in the medium between them: each handoff lost the part of the world the next agent would have needed to read directly.
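The rename failure can be sketched in a few lines. Everything here is illustrative: the in-memory tree stands in for /app, the src/types.ts path and file contents are assumptions, and filesStillUsing is a hypothetical helper. The point is only that a direct read of the tree finds the stale fixture while the prose summary never mentions it.

```typescript
// Hypothetical in-memory stand-in for the working tree after the Coder's edit.
const tree: Record<string, string> = {
  "src/types.ts": "export type Todo = { id: string; name: string };",
  "src/lib/seed.ts": 'export const seed = [{ id: "1", title: "Buy milk" }];',
};

// What the Coder handed to the Reviewer: prose, with the fixture left out.
const handoffSummary = "Renamed Todo.title to Todo.name in src/types.ts.";

// Reading the tree directly finds every file that still uses the old field.
function filesStillUsing(field: string, files: Record<string, string>): string[] {
  const pattern = new RegExp(`\\b${field}\\b`);
  return Object.keys(files).filter((path) => pattern.test(files[path])).sort();
}

const stale = filesStillUsing("title", tree);
const mentionedInSummary = stale.filter((path) => handoffSummary.includes(path));
console.log({ stale, mentionedInSummary });
```

The fixture shows up in stale and is absent from mentionedInSummary, which is the gap the Reviewer could not see from the message alone.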

“Agent-written prose drifts from reality within 3-4 handoffs, but machine-generated state (test counts, git status, automated snapshots) doesn't.”

Redis: Why Multi-Agent LLM Systems Fail

How mk0r skips the failure class

mk0r runs one Claude Code agent per session. The ACP bridge inside the E2B sandbox (src/core/vm-scripts.ts) spawns exactly one claude-agent-acp subprocess and reuses it across turns, tracked by a single auth fingerprint. The whole loop runs inside that one process.

One turn, one agent, one commit

1. Prompt lands. The user types one sentence into the chat. The Next.js API at /api/chat receives it and forwards it to the one ACP subprocess running inside the E2B sandbox.

2. Agent reads the tree. The same Claude Code process that handled the last turn reads /app directly. No briefing from another agent. The context that wrote the last commit is also the context reading this turn.

3. Agent edits and verifies. The Edit tool changes files. The Playwright tool opens localhost:5173 and takes a snapshot. The Bash tool runs the test if there is one. All are called from the same context, and all read the same /app.

4. One commit. commitTurn runs 'git add -A && git commit'. The whole working tree lands as one SHA, which is pushed onto session.historyStack. The unit is the whole repo, not a per-agent diff.

The commit step is the part most multi-agent setups quietly skip. commitTurn in src/core/e2b.ts:1759 runs five commands inside the VM:

cd /app
git add -A
git diff --cached --quiet && exit 0  # nothing to commit
git commit -q -m '<turn message>'
git rev-parse HEAD                    # returned SHA

The entire working tree is the unit. If the agent edited types.ts and forgot the fixture, the commit still captures both files exactly as they sit on disk. No agent narrated the change in prose; git add -A read the filesystem directly.
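A minimal sketch of how such a commit step could be wrapped in TypeScript follows. It is not mk0r's implementation: RunInVm, shQuote, buildCommitScript, and commitTurnSketch are hypothetical names, and the runner that executes the script inside the sandbox is injected because its real transport is not shown in this guide. Only the shell commands come from the snippet above.

```typescript
// Hypothetical runner type: executes a shell script inside the VM and
// resolves with its stdout. The real transport is an assumption here.
type RunInVm = (script: string) => Promise<string>;

// Quote a string for POSIX sh by wrapping it in single quotes.
function shQuote(value: string): string {
  return "'" + value.replace(/'/g, "'\\''") + "'";
}

function buildCommitScript(message: string): string {
  return [
    "cd /app",
    "git add -A",
    "git diff --cached --quiet && exit 0", // nothing staged: print no SHA
    `git commit -q -m ${shQuote(message)}`,
    "git rev-parse HEAD",
  ].join("\n");
}

// Returns the new SHA, or null when the turn changed nothing.
async function commitTurnSketch(run: RunInVm, message: string): Promise<string | null> {
  const sha = (await run(buildCommitScript(message))).trim();
  return sha === "" ? null : sha;
}
```

Injecting the runner keeps the sketch testable without a sandbox: a fake runner that returns a SHA, or an empty string for the no-change case, exercises both branches.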

Undo (undoTurn at line 1855) runs 'git checkout <sha> -- .' and the tree is back. There is no second agent that needs to be told the undo happened, because there is no second agent.
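The FAQ at the end of this guide lists the commands the undo path runs: a checkout of the target SHA, a re-stage, and a commit that records the undo. A sketch of building that script is below. The function names and the SHA validation are assumptions added for illustration, not code from mk0r.

```typescript
// Accept only hex object names so nothing else can reach the shell.
function isCommitSha(value: string): boolean {
  return /^[0-9a-f]{7,40}$/.test(value);
}

// Build the revert script from the commands quoted in this guide.
function buildRevertScript(sha: string): string {
  if (!isCommitSha(sha)) {
    throw new Error(`not a commit SHA: ${sha}`);
  }
  return [
    "cd /app",
    `git checkout ${sha} -- .`,
    "git add -A",
    `git commit --allow-empty -m "Undo to ${sha}"`,
  ].join(" && ");
}

console.log(buildRevertScript("abc1234"));
```

Because the undo is itself a commit, the history only ever moves forward, and the single agent reads the reverted tree on its next turn like any other state of /app.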

But what about parallelism? (the honest counterargument)

The strongest case for multi-agent setups is parallelism: ten subagents each doing one piece of work concurrently. This is real. A web search across ten queries genuinely runs faster as ten parallel calls than as one serial loop.

The trick is that the parallel calls do not need to be separate agents. They can be parallel tool calls from one agent. Claude Code does this natively: one prompt fires ten Read calls in the same turn, gets ten results back, decides what to do. The agent that issued the calls is the agent that reads the results. No handoff, no summary, no drift.
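The shape of "parallel tool calls, one context" can be sketched with plain promises. This is not Claude Code's API: ReadTool and readAll are hypothetical names, and the fake reader stands in for whatever file-read tool the agent actually has.

```typescript
// Hypothetical tool signature: reads one file and resolves with its contents.
type ReadTool = (path: string) => Promise<string>;

// Issue every read at once and collect the results in the same context
// that issued them. Promise.all preserves the order of its inputs.
async function readAll(paths: string[], read: ReadTool): Promise<Map<string, string>> {
  const contents = await Promise.all(paths.map((path) => read(path)));
  return new Map(paths.map((path, i): [string, string] => [path, contents[i]]));
}

// Usage with a fake reader standing in for a real file-read tool.
const fakeRead: ReadTool = async (path) => `contents of ${path}`;
readAll(["src/types.ts", "src/lib/seed.ts"], fakeRead).then((files) => {
  console.log(files.size, files.get("src/lib/seed.ts"));
});
```

The caller gets the raw results keyed by path. Nothing between the call and the result is rewritten into a summary, which is the property that matters here.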

Where parallelism does require separate agents (different runtimes, different secrets, different hosts), the cost of coordination drift is real and the mitigations in the literature start to matter. The point of this guide is not that multi-agent setups are never worth it. It is that a prompt-to-app builder is the easy case, and the easy case has a much cheaper fix.

The resolution, for builders deciding right now

Before you architect a multi-agent system to solve a coordination problem, ask: does this task actually need to be split across contexts, or can one context handle it with parallel tool calls? If the subtasks all need to read the same source of truth, the answer is usually no.

If it does need to be split, do not pass prose between agents. Pass artifacts: commit SHAs, file paths, test names, structured records. The next agent reads the artifact directly with the same tools the writer used. The medium is no longer lossy.
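One way to enforce "artifacts, not prose" is to give the handoff a type with no free-text field at all. The record shape and validateHandoff below are hypothetical, a sketch of the idea rather than a format any tool mentioned here defines.

```typescript
// Hypothetical handoff record: pointers to artifacts, no narration field.
interface ArtifactHandoff {
  commitSha: string;      // tree the next agent should inspect
  changedPaths: string[]; // repo-relative files touched in that commit
  testNames: string[];    // tests the writer ran
}

// Reject handoffs that cannot be resolved to something in git or on disk.
function validateHandoff(h: ArtifactHandoff): string[] {
  const problems: string[] = [];
  if (!/^[0-9a-f]{7,40}$/.test(h.commitSha)) {
    problems.push("commitSha is not a hex SHA");
  }
  if (h.changedPaths.length === 0) {
    problems.push("changedPaths is empty");
  }
  if (h.changedPaths.some((p) => p.trim() === "" || p.startsWith("/"))) {
    problems.push("changedPaths must be non-empty repo-relative paths");
  }
  return problems;
}

console.log(validateHandoff({ commitSha: "abc1234", changedPaths: ["src/types.ts"], testNames: [] }));
```

A receiving agent that gets a valid record can open the commit and the listed files itself, so whatever the writer forgot to say is still there to be read.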

And if you find yourself building a third agent to summarise the conversation between the first two, you have rediscovered why one agent was the right answer.

Frequently asked questions

What is multi agent coordination drift, in one paragraph?

Multi agent coordination drift is the gradual divergence of two or more AI agents' working models of a shared task as they pass work between each other. Each agent reads a slice of context, writes a summary for the next agent, and the summary loses fidelity. After three or four handoffs the orchestrator and its workers no longer agree on what was decided, what was tried, or what the current state of the artifact is. The 2026 paper 'Agent Drift: Quantifying Behavioral Degradation in Multi-Agent LLM Systems Over Extended Interactions' (arxiv.org/abs/2601.04170) names this 'coordination drift' and separates it from semantic drift (each agent wandering from intent) and behavioral drift (emergent unintended strategies).

Is coordination drift the same thing as context drift in a single agent?

No. Context drift in a single agent is the assistant losing track of a constraint mentioned 80 turns ago. It is solvable with better memory, smaller context windows, or summarisation. Coordination drift is structural: two agents that have never shared a memory write summaries to each other and the summaries are lossy. Better memory inside one agent does not help because the lossy step is the handoff. The two failures often look the same in the chat log but the fix is different.

Why does mk0r not use a multi agent setup like an orchestrator and workers?

Because there is no work that needs to be split for a single app prototype, and splitting it creates drift. The build loop is: prompt arrives, agent reads files, edits files, runs the dev server, takes a Playwright snapshot, fixes errors, commits. Every step needs the same context (the current source tree). Handing each step to a different agent means each agent rebuilds that context from a summary, which is exactly the operation that drifts. mk0r runs one Claude Code subprocess per session, spawned by ACP_BRIDGE_JS in src/core/vm-scripts.ts, and reuses it across turns.

But surely orchestrator-worker setups have advantages for parallelism?

They do, when the subtasks are genuinely independent (web search across 10 queries, parsing 50 documents, running 20 unit tests). For those, mk0r's single agent uses parallel tool calls inside one process: one agent fires ten Read calls in the same turn, gets ten results back, decides. There is no handoff and no summary, so there is no drift. Coordination drift shows up when subtasks depend on each other and each step's output is interpreted by a different agent. That is the case where the parallelism gain is fictitious because the agents spend their tokens re-reading each other's notes.

What is the cheapest mitigation that still preserves a multi agent setup?

Make the shared state machine-generated, not agent-written. The Redis blog post on why multi-agent LLM systems fail makes this point: agent-written prose drifts from reality within 3-4 handoffs, but test counts, git status, and automated snapshots do not. If you must coordinate, coordinate over the filesystem, the test results, and the git history. Have each agent operate by reading those, not by reading another agent's narration. mk0r treats /app's git tree as that shared state for a single agent; if you fanned out, every worker would still read /app and run 'git diff HEAD' instead of reading a brief from the orchestrator.
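As a sketch of machine-generated state, the parser below turns the output of 'git status --porcelain' (the v1 format: two status characters, a space, then the path) into structured entries a worker could read instead of a narration. StatusEntry and parsePorcelain are hypothetical names, and the sketch deliberately ignores quoted paths and keeps rename entries verbatim.

```typescript
interface StatusEntry {
  index: string;    // staged status code (first column)
  worktree: string; // unstaged status code (second column)
  path: string;
}

// Parse `git status --porcelain` v1 output. The input must not be trimmed,
// because a leading space is a meaningful status character.
function parsePorcelain(output: string): StatusEntry[] {
  return output
    .split("\n")
    .filter((line) => line.length > 3)
    .map((line) => ({ index: line[0], worktree: line[1], path: line.slice(3) }));
}

const entries = parsePorcelain(" M src/types.ts\n?? src/lib/seed.ts\n");
console.log(entries);
```

Every field in the result came from git rather than from an agent's description of what it believes it changed.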

Where exactly in mk0r's source is the 'one tree per turn' contract?

src/core/e2b.ts:1759 defines commitTurn(sessionKey, message). It runs five commands inside the VM: 'cd /app && git add -A && (skip if no changes) && git commit -q -m <message> && git rev-parse HEAD'. Every successful prompt lands as one SHA. The undo path (undoTurn at line 1855) calls revertToSha (line 1815), which runs 'git checkout <sha> -- . && git add -A && git commit --allow-empty -m "Undo to <sha>"'. The whole tree is the unit, atomic per turn. There is no second agent to be out of sync with that tree because there is only one ACP subprocess per session.

What about the Playwright browser inside the VM? Is that not a second agent?

It is a tool the one agent calls, not a peer agent. The Claude Code subprocess invokes Playwright over MCP, waits for a snapshot, reads the snapshot, decides. The browser does not write prose summaries that the agent then consumes. The contract is structured (DOM snapshot, console messages, network log), not natural language. Coordination drift requires the lossy summary step, and there is none here. The same logic applies to npm scripts, the bash tool, and the dev server: structured outputs read by the same context that issued the command.

When does the cheapest fix stop being 'use one agent'?

When the task genuinely cannot fit in one context window or when subtasks have hard isolation requirements (different secrets, different hosts, different runtimes). At that point you need real coordination and the right answer is structured shared state, not handoff prose. mk0r's case is the easy case: a single user iterating on a single prototype on a single sandbox. The hard case is a fleet of long-lived agents managing infrastructure, where you cannot avoid handoffs. The drift mitigation literature is aimed at that hard case. For a prompt-to-app builder the literature is overkill.

If I am building a multi agent setup right now, what is the first thing to check?

Whether the agents are passing each other prose or artifacts. Prose drifts: 'I added a Todo type with id and title' loses information the next agent could have read directly from the file. Artifacts do not: 'commit SHA abc1234' is a pointer to a tree the next agent can inspect with the same tools the writer used. Rewrite every handoff to reference an artifact (a commit, a file path, a test name, a stored object) and watch the drift rate collapse. The drift was never in the agents; it was in the medium.