Argument

tmux is a multiplexer. An agent session is a paused microVM. They are not the same primitive.

Both get called "sessions," and the word does most of the harm. tmux keeps processes alive under one server daemon on one host. An mk0r session is a whole microVM that gets paused as a memory snapshot on idle and resumed by ID. Different isolation boundary, different failure modes, different gotchas. The clearest tell is the tmpfs wipe on resume.

  • One E2B microVM per session, paused on idle, resumed by ID
  • lifecycle: { onTimeout: "pause", autoResume: true } at src/core/e2b.ts:383
  • /run is tmpfs, wiped on resume; e2b.ts:1128 and :1644 patch it back

Direct answer, verified 2026-05-11

tmux multiplexes processes inside one shared host: one kernel, one filesystem, processes attached to a long-lived tmux server daemon. Detach and the processes ride along under the same daemon; reboot the host and they all die together.

An mk0r agent session is a full E2B microVM, one per session, paused as a memory image on idle and resumed via Sandbox.connect(sandboxId). The sandbox ID is persisted in Firestore so Cloud Run restarts do not lose it. tmux is a multiplexer with detach/attach semantics; an agent session is an isolation boundary with snapshot/resume semantics.

The architecture details are documented at e2b.dev/docs/sandbox/persistence. The mk0r-specific wiring is in src/core/e2b.ts at lines 383, 409, 1128, and 1644.

What people are usually mashing together

The intuition behind "use tmux for agents" is sensible at first glance. Agents need long-lived processes (a dev server, a headless browser, an MCP bridge). Long-lived processes survive across SSH disconnects under tmux. So tmux feels like a natural place to park an agent.

It works until the second agent arrives. Two agents on one tmux server share a filesystem, share ports, share kernel-level resources, share the user that owns the tmux socket. The first time one of them runs an experimental npm install that corrupts a global config, or binds port 5173, or forks something that does not clean up, the other one inherits the wreckage. tmux is a great ergonomics tool. It is a poor isolation tool because that was not its job.

What an "agent session" needs is what a cloud function or a container scheduler gives you: a per-session blast radius. mk0r picks the strongest version of that: one entire microVM per session, with its own kernel and filesystem. The session does not detach from a daemon; it freezes a whole VM.

One quiet hour, end to end

Here is the actual protocol for a session that handles two prompts 47 minutes apart. The middle of the sequence is where tmux and a snapshot VM diverge.

POST /api/chat, idle, POST /api/chat again

user -> /api/chat: POST /api/chat ('a habit tracker')
/api/chat -> E2B VM: Sandbox.create(template)
E2B VM -> /api/chat: sandboxId, host, ACP on :3002
/api/chat -> Firestore: persistSession(sandboxId)
user: idle for 47 min
E2B VM: onTimeout: pause -> memory snapshot
user -> /api/chat: POST /api/chat ('add dark mode')
/api/chat -> Firestore: read sandboxId
/api/chat -> E2B VM: Sandbox.connect(sandboxId)
E2B VM: resumed, /run wiped
/api/chat -> E2B VM: re-write /run/mk0r-session.json
/api/chat -> E2B VM: probe :3000, :5173, :9222

Note the second-to-last row: after Sandbox.connect returns, the server immediately rewrites /run/mk0r-session.json. That step has no analogue in a tmux model; tmux never lost the file, because tmux was never gone.

What survives a pause, what gets wiped

A snapshot resume is not magic. The VM image captures disk and memory at the instant the pause fired, but a few surfaces inside the VM are intentionally volatile. The tmpfs at /run is the big one. We use it for two short-lived files: /run/mk0r-session.json (credentials the scheduler MCP reads on tool call) and /run/brd.conf (residential proxy config). Both have to be rewritten after every wake. That is what writeSchedulerCredsToVm() and the residential-IP re-enable block at src/core/e2b.ts:1644 are doing every time a session is reloaded.
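Sketched in TypeScript, here is roughly what that heal step amounts to. writeSchedulerCredsToVm() is the name the source uses; this body is an assumption, not the mk0r source, and it leans on the E2B SDK's files.write call:

```ts
import { Sandbox } from "e2b";

// Hedged sketch of the post-resume tmpfs heal. The two paths are the
// ones named in this article; the function body is an assumption.
async function healTmpfs(
  sandbox: Sandbox,
  creds: Record<string, string>,
  proxyConf: string
): Promise<void> {
  // /run is tmpfs: both files vanish on every pause/resume cycle,
  // so they are rewritten unconditionally after each wake.
  await sandbox.files.write("/run/mk0r-session.json", JSON.stringify(creds));
  await sandbox.files.write("/run/brd.conf", proxyConf);
}
```

The key design choice is that the rewrite is unconditional: checking whether the files survived is more code than just writing them again.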

Long-lived processes started by /opt/startup.sh are the next surface. In theory the snapshot includes them; in practice we observe enough drift across long pauses that ensureVmServices() at src/core/e2b.ts:438-447 probes port 3000 first and re-runs the startup script in the background if anything is missing. If you are designing your own snapshot session primitive, that probe is the thing you will want.
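A hedged sketch of that probe-then-restart shape. The article only says ensureVmServices() probes port 3000 and re-runs the startup script; the curl one-liner and the background-restart mechanics here are assumptions:

```ts
// Sketch of ensureVmServices() (e2b.ts:438-447 per the text): probe the
// load-bearing port, and if the snapshot drifted, re-run startup.
async function ensureVmServices(sandbox: Sandbox): Promise<void> {
  const probe = await sandbox.commands.run(
    "curl -sf -o /dev/null --max-time 1 http://localhost:3000 && echo UP || echo DOWN"
  );
  if (probe.stdout.trim() === "DOWN") {
    // Background restart; callers poll :3000 afterwards rather than block.
    await sandbox.commands.run("/opt/startup.sh", { background: true });
  }
}
```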

None of these surface in tmux. tmux does not pause processes. It does not lose tmpfs. It also does not give you a clean per-session kernel, which is the reason we accept the gotchas.

The five named phases of an mk0r session

If you want to read this as a state machine, this is the shape. Every phase is one or two function calls in src/core/e2b.ts.

1. Create as a fresh microVM

Sandbox.create(E2B_TEMPLATE, { lifecycle: { onTimeout: "pause", autoResume: true } }) at src/core/e2b.ts:383 boots a microVM from a pre-baked template. The VM gets its own kernel, its own filesystem, and starts a fixed set of processes via /opt/startup.sh: a node proxy on :3000, Playwright MCP on :3001, the ACP bridge on :3002, Vite on :5173, and a headless Chromium with CDP on :9222.
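As a minimal sketch, using the E2B JS SDK call shape quoted above; the template id is a hypothetical stand-in and error handling is omitted:

```ts
import { Sandbox } from "e2b";

const E2B_TEMPLATE = "mk0r-agent"; // hypothetical template id

// Boot a fresh microVM; the lifecycle options mirror the call the
// article quotes at src/core/e2b.ts:383.
const sandbox = await Sandbox.create(E2B_TEMPLATE, {
  lifecycle: { onTimeout: "pause", autoResume: true },
});

// The ID is the only handle you need later; host and ports can be
// re-derived after any resume.
const sandboxId = sandbox.sandboxId;
```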

2. Run the agent loop on private ports

Every turn the model issues tool calls through the ACP bridge. File writes hit /app inside this one VM. Shell commands run in this one VM's namespace. The Playwright MCP drives the Chromium inside this one VM. No port, no path, no process is visible to another session. The isolation is the VM boundary, not a tmux group.

3. Pause as a memory snapshot on idle

When the user stops typing, the lifecycle kicks in. E2B writes the VM's memory and disk to a snapshot. The sandboxId stays valid. There is no live tmux daemon multiplexing anything between turns; there is a frozen image waiting to be unfrozen. (E2B documents this at e2b.dev/docs/sandbox/persistence.)

4. Persist the sandboxId, not the runtime

persistSession() writes sandboxId, sessionId, ACP URL, residential IP flags, and the turn historyStack to Firestore under app_sessions. Cloud Run can rotate, the Node server can restart, and the next request that arrives for this user just reads the sandboxId back out and hands it to Sandbox.connect.
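A minimal sketch of what persistSession() might look like against @google-cloud/firestore. The field list follows the prose above; the exact document shape in mk0r is an assumption:

```ts
import { Firestore } from "@google-cloud/firestore";

const db = new Firestore();

// Hedged sketch of persistSession(): one durable row per session,
// keyed so any Cloud Run instance can pick the session back up.
async function persistSession(session: {
  sessionId: string;
  sandboxId: string;
  acpUrl: string;
  residentialIp: boolean;
  historyStack: string[];
}): Promise<void> {
  const { sessionId, ...fields } = session;
  await db.collection("app_sessions").doc(sessionId).set(
    { ...fields, updatedAt: Date.now() },
    { merge: true } // instances may race; merge keeps the last write per field
  );
}
```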

5. Resume by ID, then heal the tmpfs holes

Sandbox.connect(sandboxId) at e2b.ts:409 brings the VM back. /run is tmpfs, so /run/mk0r-session.json and /run/brd.conf are gone; e2b.ts:1128 and :1644 rewrite both. ensureVmServices() at e2b.ts:438 probes :3000 and re-runs /opt/startup.sh if any of the long-lived processes failed to wake. Only after those holes are patched does the next prompt run.
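Composed with the helpers sketched earlier (healTmpfs, ensureVmServices, and the Firestore handle), the whole resume path fits in one function. Everything except Sandbox.connect is an assumed name:

```ts
// Hedged sketch of the full resume path, phases 4-5 end to end.
async function resumeSession(sessionId: string): Promise<Sandbox> {
  const doc = await db.collection("app_sessions").doc(sessionId).get();
  const { sandboxId, creds, proxyConf } = doc.data() as {
    sandboxId: string;
    creds: Record<string, string>;
    proxyConf: string;
  };

  const sandbox = await Sandbox.connect(sandboxId); // e2b.ts:409
  await healTmpfs(sandbox, creds, proxyConf); // /run came back empty
  await ensureVmServices(sandbox);            // probe :3000, restart on drift
  return sandbox; // only now is the next prompt safe to run
}
```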

Where tmux is genuinely the right answer

None of this is a knock on tmux. There is a real category of session that tmux models correctly and a snapshot VM does not.

If your session is fundamentally about being on one specific machine (an SSH login on a dev box, a training run that has to stay on the GPU you already booked, an interactive REPL holding state you cannot serialize), tmux is the primitive that fits. The host stays up, your processes ride along, you reattach from any client. A snapshot VM, by design, has no concept of "the same physical machine I logged into yesterday." It is whichever VM has the matching sandbox ID.

tmux also wins on cost when isolation is not load-bearing. One tmux server is cheaper than N microVMs. We pay the per-VM cost because the failure model of an autonomous agent assumes a buggy turn can wedge anything in the VM, and we do not want that wedge to leak into another user's project. The right primitive depends on whether you need a multiplexer or an isolation boundary, and the word "session" quietly conflates the two.

What this changes if you are building agent infra

Three concrete things to take away if you are designing your own session model and the tmux comparison is on your whiteboard:

  • Persist the sandbox ID, not the runtime. Your server should be allowed to die. The truth about a session is one row in a durable store (app_sessions in mk0r's case) that names the snapshot to resume. Anything you cache in process memory is best-effort.
  • Mark your tmpfs. Write down every path that is volatile across pause/resume and own the rewrite explicitly (a sketch of that manifest follows this list). The bugs from a half-restored tmpfs are subtle and look like phantom auth failures rather than what they are.
  • Probe before you trust the resume. The contract of a snapshot is "same memory, same disk." The reality, on every snapshot system, includes drift. A 1-second TCP probe of your three or four load-bearing ports is cheap insurance.
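Here is the manifest idea from the second bullet as a sketch. The two paths are from this article; the manifest structure and the content sources are hypothetical:

```ts
import { Sandbox } from "e2b";

// Hypothetical sources for the fresh file contents:
declare function currentCreds(): Record<string, string>;
declare function renderProxyConf(): string;

// One explicit list of every volatile path, so the post-wake rewrite
// is a loop rather than a scatter of ad hoc patches.
const VOLATILE_PATHS: Array<{ path: string; render: () => string }> = [
  { path: "/run/mk0r-session.json", render: () => JSON.stringify(currentCreds()) },
  { path: "/run/brd.conf",          render: () => renderProxyConf() },
];

async function rewriteVolatilePaths(sandbox: Sandbox): Promise<void> {
  for (const { path, render } of VOLATILE_PATHS) {
    await sandbox.files.write(path, render());
  }
}
```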

Start a session. Walk away for an hour. Come back and watch the resume.

Open mk0r

Designing your own agent session model?

Book 20 minutes. We'll walk through the pause/resume protocol, the tmpfs gotchas we hit, and what we'd do differently if we were starting over.

Frequently asked questions

What is the one-line difference between an agent session and a tmux session?

A tmux session is a named group of processes living inside one tmux server daemon on one host. Detach and the processes keep running on that host; reboot the host and they are gone. An mk0r agent session is one E2B microVM, paused as a memory snapshot when idle, resumed later by sandbox ID. There is no shared daemon, no shared kernel, no shared filesystem. Different category of artifact, not a faster tmux.

Where in mk0r can I actually see the pause/resume model?

Two lines. Open src/core/e2b.ts. Line 383 creates the sandbox with `lifecycle: { onTimeout: "pause", autoResume: true }`. Line 409 resumes a paused sandbox with `Sandbox.connect(sandboxId)`. The sandbox ID lives in Firestore under the `app_sessions` collection so it survives Cloud Run restarts. That is the whole persistence story, end to end.

What actually survives a pause in this model?

The filesystem and the in-memory process state of every process running when the pause fired, captured as a snapshot. So your git repo at /app is intact, the npm node_modules tree is intact, the Vite dev server process resumes mid-flight. What does NOT survive cleanly is anything on tmpfs. /run is tmpfs in this VM, so /run/mk0r-session.json (scheduler creds) and /run/brd.conf (residential proxy config) are wiped on every pause/resume. src/core/e2b.ts line 1128 and line 1644 exist purely to re-write those files when a session is reloaded. That gotcha is one of the clearest tells that this is a real snapshot, not just process pinning.

Why not just use tmux to keep agent processes alive on one host?

Because the per-session blast radius is wrong. Agents write files, run npm install, kick off long-lived dev servers, and drive a real Chromium. If sessions share one tmux server on one host, they share that host's filesystem, ports, kernel, and IPC. A misbehaving turn from session A can `rm -rf /` everything session B was working on. tmux gives you ergonomics; it does not give you isolation. The isolation you want lives at the VM boundary, not the process group boundary.

Is the snapshot resume actually fast enough to feel live?

Fast enough that the user does not notice. The relevant cost is `Sandbox.connect(sandboxId)` plus a TCP probe of port 3000 inside the VM (src/core/e2b.ts:443). When that probe says DOWN, ensureVmServices() re-runs /opt/startup.sh in the background and waits up to 30 seconds for port 3000 to come back. On a healthy resume the probe says UP immediately and we skip the restart. The slow path exists because, in practice, some processes inside the VM do drift after a long pause. Worth knowing if you are designing your own snapshot session model.

How is the conversation history kept across pause and resume?

Two layers. The ACP bridge (the Agent Client Protocol server running inside the VM on port 3002) keeps the session prompt and tool-use stream in the process's memory, so a fast resume continues the same session id. The durable layer is git: `commitTurn` in src/core/e2b.ts:1759 runs `git add -A && git commit` after every successful turn and records the SHA in a per-session `historyStack` persisted to Firestore. So even if the VM is killed entirely and a new one is created, you can replay the turn history by SHA. tmux has neither of those layers; it only knows about live processes.
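A sketch of that durable layer, reusing the Firestore handle from the sketches above. commitTurn is the name the source gives at e2b.ts:1759; this body is an assumption:

```ts
import { FieldValue } from "@google-cloud/firestore";

// Commit the turn inside the VM, then append the SHA to the
// session's historyStack so the turn is replayable even if this
// exact VM never comes back.
async function commitTurn(sandbox: Sandbox, sessionId: string, turn: number): Promise<string> {
  const result = await sandbox.commands.run(
    `cd /app && git add -A && git commit --allow-empty -m "turn ${turn}" >/dev/null && git rev-parse HEAD`
  );
  const sha = result.stdout.trim();

  await db.collection("app_sessions").doc(sessionId).update({
    historyStack: FieldValue.arrayUnion(sha),
  });
  return sha;
}
```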

What is the failure mode tmux handles that snapshot VMs do not?

tmux is the right primitive when you need a shell to keep running on a machine that is already configured (an SSH session, an interactive REPL, a long-running training job on your dev box) and the cost of redoing setup elsewhere is high. The host stays up, your processes ride along, you reattach from any client. A snapshot VM has no concept of "the same physical machine I logged into yesterday"; it is whatever VM has the matching sandbox ID. If your session is fundamentally about being on one specific machine, tmux is the answer. Most agent sessions are not.

Does sharing one E2B VM across many user sessions help with cost?

It would, in the same way one Linux box with tmux can host many users. We chose not to. The mk0r failure model assumes a buggy turn will write files outside /app, kill the dev server, fork a runaway process, or wedge the Chromium on port 9222. We do not want any of that landing in another user's project, which it would if sessions shared a kernel. The cost of one VM per session is the price of an isolation boundary that tmux cannot give you.
