Guide

AI app builder one-shot prototype limits: the seven clocks

Every other guide on this topic answers the question with the model context window. That number does not bite first. The real cap on a single one-shot turn is seven concurrent runtime clocks, and the smallest one always wins. mk0r ships every one of them as a constant in source.

mk0r engineering · 11 min read
Route hard ceiling: 800 seconds per turn
TTFT watchdog: 15 seconds with no notification evicts the session
Sandbox lifetime: 1 hour; pool entry max age: 45 minutes
Attachment caps: 20 MB inline image, 10 MB inline text

What people usually mean by the limit

Open the existing playbooks on this topic and they all converge on the same answer: the model context window. They will tell you Claude Sonnet 4.6 is 200,000 tokens, that GPT models cap their output at some number, that you can stuff a big spec into the prompt but the response has to fit somewhere. None of that is wrong; it is just the wrong layer to look at when something on a one-shot prototype actually fails.

A typical first-prompt build is small. A landing page, a calculator, a habit tracker, a Pictionary word generator. The generated code is a few hundred lines of TypeScript and JSX. That is nowhere near 200K tokens of context, on either side of the request. So when something does fail on a single turn, it is almost never the model running out of room. It is one of the runtime clocks running out of time.

mk0r’s codebase has seven of them, all visible to anyone who can grep the repo. The rest of this page is each one with the file path and line number, in firing order.

TTFT watchdog

15 seconds. If the ACP bridge inside the sandbox does not emit any notification within 15s of receiving the prompt, the route evicts the session and tells you to retry.

ACP initialize

30 seconds. AbortSignal.timeout(30_000) on POST /initialize. If the bridge does not finish handshaking with Claude in 30s, the sandbox gets killed.

ACP session/new

30 seconds. Same AbortSignal.timeout pattern, same outcome. The bridge has to spin up an agent session in 30s.
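Both bridge calls share the same guard. A minimal sketch of the pattern, assuming a hypothetical bridgeUrl and simplified error handling (only the AbortSignal.timeout(30_000) call itself is quoted from the guide):

```typescript
// Illustrative sketch of the 30s timeout guard on the two ACP bridge calls.
// The bridgeUrl parameter and error messages are assumptions, not mk0r's code.
async function acpPost(bridgeUrl: string, path: "/initialize" | "/session/new") {
  try {
    const res = await fetch(`${bridgeUrl}${path}`, {
      method: "POST",
      signal: AbortSignal.timeout(30_000), // give up after 30 seconds
    });
    return await res.json();
  } catch (err) {
    // AbortSignal.timeout() aborts the fetch with a TimeoutError
    if ((err as Error | undefined)?.name === "TimeoutError") {
      throw new Error(`ACP ${path} exceeded 30s; sandbox will be killed`);
    }
    throw err;
  }
}
```

The timeout is per call, so a slow initialize does not eat into the session/new budget.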

Tab-hidden pause

10 minutes. Tab in the background for 10 min disconnects the websockets and unloads the iframe. Resume calls /api/vm/ensure-session.

Route maxDuration

800 seconds. The hard ceiling on a single chat request. Past that the load balancer returns a 504 even if the model is mid-stream.

Pool entry max age

45 minutes. Pre-warmed sandboxes older than 45 min are considered stale and get killed during the next cleanup pass.

Sandbox lifetime

1 hour. E2B reaps the underlying process 60 minutes after the timeout was last set, no matter how busy the session looks in between. Resuming from pause refreshes the window; idle sandboxes run out.

Attachment inline caps

20 MB for inline images, 10 MB for inline text. Files above the cap still land in /app/uploads/ inside the sandbox, but the agent has to Read them on demand.

The anchor: maxDuration = 800 on the chat route

This is the line that decides everything else. Cloud Run will stream a response for the full duration the route exports; Next.js forwards the value to the platform via the route segment config. Eight hundred seconds is roughly thirteen minutes, which is comfortably more than even a slow VM mode build needs (typical: 60 to 180 seconds). It is also less than both the pool TTL and the sandbox lifetime, by design: the request times out before the sandbox can quietly disappear underneath you mid-stream.

src/app/api/chat/route.ts

If you ever see a 504 from a long mk0r prompt, this is the number that fired. The agent is fine, the sandbox is fine, the model is fine. The platform just hung up because the route ran out of time.
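In context this is just a route segment config export at the top of the file. A sketch of how it sits (only the constant is quoted in this guide; the handler body is a placeholder):

```typescript
// src/app/api/chat/route.ts (shape only; the handler body is an assumption)
export const maxDuration = 800; // seconds; Next.js forwards this to the platform

export async function POST(req: Request): Promise<Response> {
  // ...claim or boot a sandbox, forward the prompt to the ACP bridge, and
  // stream the response; past 800s the load balancer answers with a 504
  return new Response("ok");
}
```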

Where each limit lives in the request path

Browser (tab-hidden pause: 10 min) → /api/chat (TTFT watchdog: 15s; maxDuration: 800s) → ACP bridge (initialize and session/new: 30s each) → E2B sandbox (pool max age: 45 min; lifetime: 1h)

The TTFT watchdog: 15 seconds and you are out

Right after the route POSTs the prompt to the ACP bridge, it arms a 15 second timer. The timer’s job is to detect a wedged subprocess: if Claude inside the sandbox crashes or hangs before producing a single token, the bridge stops emitting notifications, and the route’s NDJSON reader sits waiting forever. The watchdog short-circuits that. If 15 seconds pass with zero notifications, it evicts the session from the active cache, sends a PostHog event, and closes the stream with a clear retry message.

src/app/api/chat/route.ts

Note the eviction. The next prompt on the same sessionKey cannot reuse the wedged sandbox; it has to claim a new one out of the pool or boot fresh. This is the limit that surfaces a recoverable error instead of a silent infinite spinner. Most other one-shot builders hide this layer entirely; you just wait, then refresh, then wait again.

The sandbox lifetime and the pool age window

Underneath every prompt there is an E2B sandbox. The sandbox has a one hour wall-clock TTL set by E2B_TIMEOUT_MS. The pool of pre-warmed sandboxes considers any entry older than 45 minutes stale, so it never hands you a sandbox that is about to die mid-build. The default pool target is one, which means a single user always has a warm sandbox waiting and a second concurrent user pays the cold boot cost (roughly 2.5 seconds because the template is pre-baked).

src/core/e2b.ts

The 15 minute gap between pool age (45 min) and sandbox lifetime (1 hour) is the safety margin. A sandbox you claimed from the pool always has at least 15 minutes of life left when it lands in your hands, which is comfortably more than the 13 minute route timeout. So a sandbox can never expire under an in-flight one-shot prompt; the route times out first.
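The arithmetic can be checked directly from the two constants the guide quotes from src/core/e2b.ts (the isStale helper is illustrative, not the repo's code):

```typescript
const E2B_TIMEOUT_MS = 3_600_000;        // sandbox wall-clock lifetime: 1 hour
const POOL_MAX_AGE_MS = 45 * 60 * 1000;  // pool entry max age: 45 minutes

// Illustrative staleness check: entries older than 45 min die on cleanup.
function isStale(createdAtMs: number, nowMs: number): boolean {
  return nowMs - createdAtMs > POOL_MAX_AGE_MS;
}

// A just-claimed pool sandbox therefore has at least this much life left:
const minRemainingMs = E2B_TIMEOUT_MS - POOL_MAX_AGE_MS; // 900_000 ms = 15 min
const routeCeilingMs = 800 * 1000;                       // maxDuration in ms
// 15 min > ~13.3 min, so the route times out before the sandbox can die.
```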

The seven clocks, in firing order

15s · TTFT watchdog
30s · ACP init + session
10 min · Tab-hidden pause
800s · Route maxDuration
45 min · Pool entry max age
1 h · Sandbox lifetime
20 MB · Inline image cap
10 MB · Inline text cap

The attachment caps, and what happens above them

One-shot prompts can include files. Drop a screenshot of a mockup, paste a JSON spec, attach a small CSV. mk0r writes every attachment to /app/uploads/ inside the sandbox using writeFileToVm, then decides whether to inline the bytes in the prompt or just leave a path reference.

src/app/api/chat/route.ts

The split matters because inline images consume input tokens. A 20 MB image inlined as base64 is a substantial chunk of the context budget. Above the cap, the file is still available; it is just available through the agent’s Read tool rather than as part of the prompt, which spends a tool call instead of tokens. For a one-shot build, the practical takeaway is this: if you want the agent to react to an image, keep it under 20 MB. If you only need the agent to parse a file on-demand, size does not matter beyond E2B’s own disk budget.
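The decision reduces to a size check against the two caps. A sketch, assuming a simplified attachment shape (the constants are the ones quoted later from src/app/api/chat/route.ts):

```typescript
const MAX_INLINE_IMAGE = 20 * 1024 * 1024; // 20 MB
const MAX_INLINE_TEXT = 10 * 1024 * 1024;  // 10 MB

// Simplified attachment type for illustration.
type Attachment = { kind: "image" | "text"; sizeBytes: number };

// Every file is written to /app/uploads/ regardless; the cap only decides
// whether the bytes also ride along inside the prompt.
function howAgentSeesIt(a: Attachment): "inline" | "read-on-demand" {
  const cap = a.kind === "image" ? MAX_INLINE_IMAGE : MAX_INLINE_TEXT;
  return a.sizeBytes <= cap ? "inline" : "read-on-demand";
}
```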

The tab-hidden pause: why the session is gone after lunch

This one is invisible until you notice it. After ten minutes of the tab being hidden, the client tears down the websockets, unloads the preview iframe, and marks the session as paused. When you focus the tab again, the resume hook calls /api/vm/ensure-session, which either revives the paused sandbox or replaces it with a fresh one if E2B already reaped it.

src/hooks/useSessionPause.ts

The pause is not a session-killer; it is a resource-saver. You do not pay for compute on a tab that no one is watching. The cost is the brief “reconnecting...” state when you come back. If you want to test the pause without waiting ten minutes, append ?pauseDelay=10000 to the URL and the timer drops to ten seconds.
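The mechanics are a visibilitychange listener plus a timer. A browser-side sketch, with hypothetical pause/resume callbacks standing in for the real hook's teardown and the /api/vm/ensure-session call:

```typescript
const DEFAULT_PAUSE_DELAY_MS = 10 * 60 * 1000; // quoted from useSessionPause.ts

// Illustrative pause timer; `document` exists only in a browser, so the
// listener registration is skipped elsewhere via optional chaining.
function watchVisibility(
  pause: () => void,
  resume: () => void,
  delayMs: number = DEFAULT_PAUSE_DELAY_MS,
) {
  const doc = (globalThis as any).document;
  let timer: ReturnType<typeof setTimeout> | undefined;
  doc?.addEventListener("visibilitychange", () => {
    if (doc.hidden) {
      timer = setTimeout(pause, delayMs); // tear down sockets, unload iframe
    } else {
      clearTimeout(timer);
      resume(); // revalidate the backend session before prompting again
    }
  });
}
```

The ?pauseDelay=10000 test override described above would map to something like the delayMs parameter here.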

What other guides on this topic talk about, vs. what mk0r exposes

The runtime clocks are the part of one-shot prototyping that decides whether your turn finishes. The model context window almost never does.

| Feature | Common writing on the topic | mk0r (numbers in source) |
| --- | --- | --- |
| Stated cap on a one-shot | Model context window (e.g. 200K tokens) | Route maxDuration: 800s, in src/app/api/chat/route.ts:18 |
| What kills a wedged session | Refresh the page | TTFT watchdog at 15s, evicts the active session |
| Sandbox lifetime | Not mentioned | E2B_TIMEOUT_MS = 3_600_000 (1 hour) in src/core/e2b.ts:33 |
| Pool warmth guarantee | Not mentioned | POOL_MAX_AGE_MS = 45 min, target size 1 by default |
| Tab-hidden behavior | Not mentioned | 10 min auto-pause, validated resume via /api/vm/ensure-session |
| Attachment caps | Treated as a model concern | 20 MB inline image, 10 MB inline text, files always written to /app/uploads/ |
| ACP bridge timeouts | Not mentioned | AbortSignal.timeout(30_000) on /initialize and /session/new |
| Where to verify | Vendor blog post | Open the file paths in this guide and grep the constants |
800s · Route hard ceiling per turn
15s · TTFT watchdog evicts the session
45 min · Pool entry max age
1 h · Sandbox wall-clock lifetime

How to predict which clock fired

The stack of limits is small enough to memorize, and the failure modes are distinct enough to diagnose by feel.

1. Stream stops at exactly 15 seconds, no tokens received

The TTFT watchdog fired. The ACP subprocess inside the sandbox is wedged. The session has been evicted; retry on the same sessionKey will boot a new sandbox. Look for chat_ttft_timeout in PostHog if you have access.

2. Stream stops around the 13 minute mark, mid-output

The route hit maxDuration = 800. Cloud Run cut the connection. The model and sandbox are likely fine; they just lost the streaming socket. Split the prompt into two turns or accept the truncated output.

3. Boot takes ~2.5 seconds and reports 'pool_claim done'

Normal. You got a pre-warmed sandbox and skipped cold boot. If you see vm_boot in the boot_progress events instead, the pool was empty (concurrent user, recent spec change, or a stale-cleanup pass).

4. Tab was idle, prompt button is grayed out, then 'Reconnecting...'

Tab-hidden pause fired at the 10 minute mark. The resume call to /api/vm/ensure-session is validating the backend session. Wait one or two seconds and you can prompt again.

5. Image upload returned 'too large to inline'

The image was over 20 MB. The file landed at /app/uploads/<name>; the agent will use Read or its image tools to view it. For one-shot builds where you want the agent to actually see the screenshot, downsample first.


What none of these clocks are

None of them are the model’s context window. None of them are Anthropic’s rate limit. None of them depend on whether you signed in or stayed anonymous. They are runtime facts about the path between your browser and Claude, and because they are constants in source, you can read them without running anything.

That is the asymmetry mk0r is built on. Most one-shot builders are opaque about the runtime; they tell you it “just works” and let you find the limits the hard way. mk0r ships them as named constants in two files. If a future you wonders why a one-shot prompt failed at exactly fifteen seconds, the answer is one grep away: ttftMs = 15_000.

Want to walk the seven clocks in the live codebase?

Book 20 minutes and we will open mk0r.com together, fire each limit on purpose, and show you the lines in src/app/api/chat/route.ts and src/core/e2b.ts that decide what dies first.

Frequently asked questions

Why is the model context window not the right limit to think about for one-shot prototyping?

Context window is the limit on how much the model can read and write inside a single response. For a one-shot prototype that produces a small React app or a standalone HTML file, you are nowhere near that ceiling on either side. The cap that bites first is something else: the request timeout on the route that wraps the model call. Every cloud platform draws that line somewhere. Cloud Run allows a streaming response to run for up to 60 minutes, but most app builders set a much shorter ceiling for cost reasons. mk0r sets it to 800 seconds, which is roughly thirteen minutes. Past that, the platform kills the connection regardless of how much context the model still has to work with.

What is the actual maxDuration that mk0r uses for a single prompt?

Eight hundred seconds. The line is `export const maxDuration = 800;` in src/app/api/chat/route.ts at line 18. That is the budget a single one-shot turn has to boot the sandbox if needed, claim a session, send the prompt to the ACP bridge, stream the response, commit a git turn, and close the response. Anything that does not finish in 800 seconds becomes a 504 from the load balancer. The Quick mode path on a small app finishes in seconds; the VM mode path on a complex prompt usually finishes inside two minutes. The 800 second ceiling exists for the worst case, not the typical case.

What is the TTFT watchdog and when does it fire?

TTFT means time to first token. mk0r runs a fifteen second timer (`const ttftMs = 15_000;` at src/app/api/chat/route.ts line 421) the moment a prompt is forwarded to the ACP bridge inside the sandbox. If the bridge does not emit any notification before the timer fires, the route assumes the session is wedged. It evicts the active session from the cache, sends a `chat_ttft_timeout` PostHog event, and closes the stream with the message `Agent did not respond within 15s. Please retry.` This is one of the limits no other guide writes about, because it is invisible until you hit a session that lost its ACP subprocess to a crash inside the sandbox.

How long does the underlying E2B sandbox live before it gets killed?

One hour. `const E2B_TIMEOUT_MS = 3_600_000;` in src/core/e2b.ts line 33. That is the wall-clock lifetime of the sandbox process from when E2B starts it to when E2B reaps it, regardless of how active you are. The pool refreshes its 1h timeout window after every resume from pause, so an active session never hits this. Idle sandboxes do. If you open mk0r, prototype for ten minutes, walk away, and come back ninety minutes later, the original sandbox is gone and you get a fresh one out of the pool (or a cold boot if the pool was empty).

What happens after the tab has been hidden for a while?

Ten minutes. `const DEFAULT_PAUSE_DELAY_MS = 10 * 60 * 1000;` in src/hooks/useSessionPause.ts line 5. After the tab is hidden for ten minutes, the client disconnects the websockets, unloads the preview iframe, and marks the session as paused. When you focus the tab again, it calls /api/vm/ensure-session which either resumes the paused sandbox or revives it if E2B already collected it. The pause is there to stop wasting browser resources on a tab no one is looking at; it is not the same thing as the sandbox dying. You can override it for testing with `?pauseDelay=10000` in the URL.

Are there limits on attachments going into the prompt?

Yes, two of them, both visible in src/app/api/chat/route.ts. `const MAX_INLINE_IMAGE = 20 * 1024 * 1024;` (line 269) is the cap for inlining an image as a base64 block in the prompt, so the agent can see it. `const MAX_INLINE_TEXT = 10 * 1024 * 1024;` (line 270) is the cap for inlining a text or JSON file. Both files always get written to /app/uploads/ in the sandbox via writeFileToVm, so the agent can read them with its Read tool even when they exceed the inline cap. The cap only controls whether the bytes ride along with the prompt itself or whether the agent has to fetch them separately.

How long does the pool of pre-warmed sandboxes hold an entry?

Forty-five minutes. `const POOL_MAX_AGE_MS = 45 * 60 * 1000;` in src/core/e2b.ts line 1849. The comment on the same line makes the relationship explicit: the sandbox timeout is one hour, so anything older than 45 minutes in the pool is considered stale and gets killed. The default pool target size is one (`process.env.VM_POOL_TARGET_SIZE || "1"` at line 1852), so the second concurrent prompt at any given moment usually pays a cold boot. The template is pre-baked and boots in about 2.5 seconds, so the cold start is small, but it is real.

What about the ACP bridge timeouts inside the sandbox?

Two thirty second timeouts, side by side at src/core/e2b.ts. The /initialize call uses `signal: AbortSignal.timeout(30_000)` (line 1184), and so does /session/new (line 1207). If either takes longer than 30 seconds, the route gives up, kills the sandbox, and surfaces an ACP-specific error. These are bridge-level limits, not model-level limits. Once initialize and session/new succeed, the prompt forwarding has the full remaining maxDuration budget to work with.

Does the model's stop_reason ever cap a one-shot prototype?

Sometimes, but not in the way you might expect. The chat route reads stop_reason from the ACP prompt_complete notification and passes it through as `send({ type: 'done', stopReason: ... })`. A stop reason of `end_turn` means the model finished cleanly; `tool_use` means it asked for a tool and the conversation continues; `max_tokens` means the response itself hit Anthropic's per-response token cap. For a one-shot UI build, max_tokens is rare because the agent splits work across tool calls. When it does happen, the page just gets the truncated text and the next user message can pick up where it stopped. That is the only model-side hard cap that ever shows up in a single turn.
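The three stop_reason values named above map to distinct outcomes. Sketched as a lookup (the descriptive strings are illustrative, not mk0r's messages):

```typescript
type StopReason = "end_turn" | "tool_use" | "max_tokens";

// Illustrative interpretation of the stop reasons described above.
function describeStop(reason: StopReason): string {
  switch (reason) {
    case "end_turn":
      return "model finished cleanly";
    case "tool_use":
      return "model asked for a tool; the conversation continues";
    case "max_tokens":
      return "response hit the per-response token cap; output is truncated";
  }
}
```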

Why does the order of these clocks matter?

Because they fire in nested scopes, and the smallest one always wins. In ascending order: TTFT watchdog at 15s, ACP initialize and session/new at 30s each, tab pause at 10 min, route timeout at 800s (~13 min), pool entry max age at 45 min, sandbox lifetime at 1 hour. If you understand the hierarchy you can predict exactly what will go wrong: a prompt that times out at exactly 15 seconds is a wedged ACP subprocess, not a slow model. A 504 around the 13 minute mark is the route timeout, not the sandbox. A fresh-looking sandbox after a long lunch is the pool serving you a different pre-warmed sandbox, not your original one.
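The hierarchy is easy to verify in code: put the time-based limits from the answer above into milliseconds and check the list is sorted ascending (initialize and session/new share one value; the two attachment caps are byte limits, not clocks, so they are omitted):

```typescript
// The time-based limits from this guide, in milliseconds, smallest first.
const clocks: Array<[name: string, ms: number]> = [
  ["TTFT watchdog", 15_000],
  ["ACP initialize / session-new", 30_000],
  ["tab-hidden pause", 10 * 60 * 1000],
  ["route maxDuration", 800 * 1000],
  ["pool entry max age", 45 * 60 * 1000],
  ["sandbox lifetime", 60 * 60 * 1000],
];

// The smallest applicable clock always wins; the list is strictly ascending.
const ascending = clocks.every((c, i) => i === 0 || clocks[i - 1][1] < c[1]);
```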

mk0r · AI app builder
© 2026 mk0r. All rights reserved.