AI prototype app limits: eleven outcomes, not one
Most writing on this topic stops at “context window” or “rate limits per minute.” A real prototype turn produces eleven distinguishable terminal states. mk0r names every one of them as a constant in source, classifies each failure into the right bucket, and routes each bucket to a different recovery path.
What people usually mean by “limits” here
Open the playbooks online for this topic and they all converge on the same two answers. Either the limit is the model context window (200K tokens for Claude Sonnet 4.6, 128K for some others), or it is the API rate limit (something like fifteen requests per minute on a free tier). Both numbers are real. Neither one is the answer most users actually need.
A typical prototype build, when it fails, fails for one of a small handful of specific reasons. The model said no. The image was too big. The Anthropic billing window tripped. The agent’s subprocess inside the sandbox lost its session. The user clicked stop. None of those is “you ran out of context.” They all need different responses.
mk0r writes the response logic out as a six-way switch in one file, and ships the model’s own five stop reasons next to it. That is the structure of this page.
One prompt becomes one of eleven outcomes
The six error kinds in classifyPromptError
The whole classifier is forty lines of TypeScript. It runs on the raw error string from the ACP prompt_error notification, plus an ApiRetryInfo struct carrying HTTP status and Anthropic’s typed errorType (billing_error, rate_limit, authentication_failed, image_error, invalid_request). Structured signals win; regex on the raw message is a fallback for the cases where the bridge does not forward typed info.
The order matters: credit before auth before image before invalid before stale. A credit_exhausted message can also match the ‘limit’ regex, so it has to be checked first. A stale_session is only chosen when nothing else matched and the message says “Internal error.” That is the case where the ACP subprocess inside the sandbox lost its in-memory session, which is the only failure worth a one-shot recovery attempt.
credit_exhausted
HTTP 402 or 429, errorType billing_error or rate_limit, or a regex hit on 'credit balance is too low' / 'rate limit rejected.' Surface the raw message with reset window. Do NOT restart ACP — the restart hits the same limit.
auth_required
HTTP 401 or 403, errorType authentication_failed, or a regex hit on 'invalid api key' / 'failed to authenticate.' Surface the message and stop. Re-link Anthropic OAuth or check the shared key.
image_error
errorType image_error or a regex hit on 'image too large' / 'unable to resize image' / 'dimension.' Non-retryable. Show the user the actual message; the only fix is a smaller attachment.
invalid_request
errorType invalid_request or a regex hit on 'usage' / 'limit' / 'quota' / 'rejected.' Non-retryable. The prompt itself is wrong: edit and resend.
stale_session
Last-resort match: message says 'Internal error' AND no other classification won. This is the ONLY kind that triggers a recovery: if textDeltaCount is zero, restartAndReloadSession spins a fresh ACP and replays the prompt once.
generic
Default bucket. Show the raw error. Do not paper over with 'Something went wrong' — that hides real SDK issues from the user and from the logs.
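The ordering described above can be sketched in a few lines. This is a minimal illustration, not the code in the repo: it assumes a two-argument signature (the guide describes a third meta input, omitted here), and the field names on ApiRetryInfo and the exact regexes are assumptions reconstructed from the six descriptions above.

```typescript
type ErrorKind =
  | "credit_exhausted"
  | "auth_required"
  | "image_error"
  | "invalid_request"
  | "stale_session"
  | "generic";

// Assumed shape: the guide only says it carries HTTP status and a typed errorType.
interface ApiRetryInfo {
  httpStatus?: number;
  errorType?: string;
}

interface Classified {
  kind: ErrorKind;
  httpStatus?: number;
  errorType?: string;
}

function classifyPromptError(message: string, info?: ApiRetryInfo): Classified {
  const httpStatus = info?.httpStatus;
  const errorType = info?.errorType;
  const tag = (kind: ErrorKind): Classified => ({ kind, httpStatus, errorType });

  // 1. Structured signals win over anything in the raw message.
  if (
    errorType === "billing_error" ||
    errorType === "rate_limit" ||
    httpStatus === 402 ||
    httpStatus === 429
  ) {
    return tag("credit_exhausted");
  }
  if (errorType === "authentication_failed" || httpStatus === 401 || httpStatus === 403) {
    return tag("auth_required");
  }
  if (errorType === "image_error") return tag("image_error");
  if (errorType === "invalid_request") return tag("invalid_request");

  // 2. Regex fallback, in the same order: credit, auth, image, invalid.
  //    Credit must come before the broad 'limit' pattern below.
  if (/credit balance is too low|rate limit rejected/i.test(message)) return tag("credit_exhausted");
  if (/invalid api key|failed to authenticate/i.test(message)) return tag("auth_required");
  if (/image too large|unable to resize image|dimension/i.test(message)) return tag("image_error");
  if (/usage|limit|quota|rejected/i.test(message)) return tag("invalid_request");

  // 3. Stale session only when nothing else matched.
  if (/internal error/i.test(message)) return tag("stale_session");

  return tag("generic");
}
```

Swapping the credit and invalid regex checks in this sketch would send "rate limit rejected" to invalid_request, which is the misclassification the ordering rule exists to prevent.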
The five stop reasons (the clean-finish side)
The other half of the eleven-outcome surface is the model telling you it stopped, and why. The ACP bridge forwards Anthropic’s stop_reason on the prompt_complete notification. mk0r’s type union pins it to exactly five values: end_turn, max_tokens, max_turn_requests, refusal, and cancelled.
Every one of these is a successful turn from the route’s point of view. The model finished, the sandbox is still running, the conversation history is intact. What changes is whether the user gets a continue button, a banner, or nothing at all.
What the UI does for each stop reason
- end_turn — render normally, no banner. Conversation continues if the user types again.
- max_tokens — amber banner 'Response truncated (max tokens)' with a Continue button.
- max_turn_requests — amber banner 'Turn limit reached' with Continue. Common on long agentic builds.
- refusal — banner 'Agent declined.' No Continue button (model said no, do not retry).
- cancelled — banner suppressed (the user already knows they pressed stop).
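The mapping in the list above is small enough to write as an exhaustive switch. The following is a sketch under assumptions: the BannerState shape and the bannerFor name are hypothetical, and only the banner strings and Continue-button rules come from the list.

```typescript
type StopReason =
  | "end_turn"
  | "max_tokens"
  | "max_turn_requests"
  | "refusal"
  | "cancelled";

// Hypothetical UI state: null banner means nothing is rendered.
interface BannerState {
  banner: string | null;
  showContinue: boolean;
}

function bannerFor(reason: StopReason): BannerState {
  switch (reason) {
    case "end_turn":
      return { banner: null, showContinue: false };
    case "max_tokens":
      return { banner: "Response truncated (max tokens)", showContinue: true };
    case "max_turn_requests":
      return { banner: "Turn limit reached", showContinue: true };
    case "refusal":
      // The model said no: show the banner but never offer a retry.
      return { banner: "Agent declined", showContinue: false };
    case "cancelled":
      // The user pressed stop, so the banner is suppressed.
      return { banner: null, showContinue: false };
    default: {
      // Compile-time exhaustiveness check: a sixth stop reason fails to type-check here.
      const unreachable: never = reason;
      throw new Error(`unhandled stop reason: ${String(unreachable)}`);
    }
  }
}
```

The never assignment in the default branch is what keeps a union like this honest: adding a value to StopReason without adding a case becomes a compile error rather than a silent fallthrough.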
Two boundary caps that live outside the chat route
Not every limit is per-turn. Two caps sit at the edges of the request lifecycle, one on the client and one on the account. Both are short and self-contained.
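As a rough sketch of the two caps: the constant names and values (MAX_FILE_SIZE, PROJECT_CAP) and the 409 message are the ones quoted elsewhere in this guide, while the helper functions partitionBySize and checkProjectCap and their return shapes are hypothetical stand-ins for the real handlers.

```typescript
const MAX_FILE_SIZE = 25 * 1024 * 1024; // 25 MB, checked in the browser
const PROJECT_CAP = 10; // named projects per signed-in account

interface FileLike {
  name: string;
  size: number; // bytes
}

// Hypothetical helper: split picked files into those that may upload and those
// that are rejected before any network request is made.
function partitionBySize<T extends FileLike>(files: T[]): { accepted: T[]; rejected: T[] } {
  const accepted: T[] = [];
  const rejected: T[] = [];
  for (const file of files) {
    (file.size > MAX_FILE_SIZE ? rejected : accepted).push(file);
  }
  return { accepted, rejected };
}

type CapCheck = { ok: true } | { ok: false; status: 409; error: string };

// Hypothetical helper: the decision a project-creation handler would make once
// it knows how many projects the signed-in user already owns.
function checkProjectCap(ownedCount: number): CapCheck {
  if (ownedCount >= PROJECT_CAP) {
    return { ok: false, status: 409, error: `project cap reached (${PROJECT_CAP})` };
  }
  return { ok: true };
}
```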
Two things to notice. First, the 25 MB cap is enforced in the browser before any upload happens, so a 400 MB video file never even reaches the chat route. The browser logs an attachment_rejected PostHog event with the file name and size, so rejected files show up in analytics. Second, PROJECT_CAP only applies to signed-in users; anonymous users have no project list at all, so anonymous prototyping has no project ceiling. The cap is part of the persistent project catalog feature, not part of the runtime.
Eleven outcomes, in source
What other guides on this topic say, vs. what mk0r ships
The 'context window' framing is one number. The mk0r framing is eleven. One is more useful.
| Feature | Common writing on this topic | mk0r (constants in source) |
|---|---|---|
| What 'limit' actually means | Model context window | Eleven terminal states + two boundary caps |
| How errors are classified | Single error string | Six ErrorKinds via classifyPromptError, structured signals first |
| Recovery path on rate limit | Retry the request | Surface raw message with reset window; do NOT restart ACP |
| Recovery path on stale session | Refresh the page | restartAndReloadSession + replay prompt once, only if textDeltaCount===0 |
| Stop reason exposed to UI | Generic 'done' or 'failed' | end_turn / max_tokens / max_turn_requests / refusal / cancelled |
| Cancel button behavior | UI-only, orphans the model call | POST /session/cancel reaches the agent, returns stopReason='cancelled' |
| Per-attachment size cap | Vendor-specific, often hidden | 25 MB browser-side, 20 MB inline image, 10 MB inline text — all in source |
| Per-account project cap | Sometimes documented | PROJECT_CAP = 10 in src/app/api/projects/route.ts:10. None for anonymous users. |
| Where to verify | Vendor blog post | Open the file paths in this guide and grep the constants |
How to read a failed turn
Once you know the eleven outcomes, you can almost always tell which one fired from the message alone. The decision tree is short.
Banner contains 'reset' or 'minutes' or a 429
credit_exhausted. The Anthropic per-minute or daily cap tripped. Wait the stated window. mk0r intentionally does NOT restart ACP here; the restart would hit the same limit. If you have personal Claude OAuth credentials linked, the cap is on your account, not on mk0r's pool.
Banner contains '401' or 'invalid api key'
auth_required. The shared API key was rotated or your linked Anthropic OAuth tokens expired. Re-link in account settings or contact support if it is the shared key. The route does not retry this kind.
Banner mentions 'image' or 'too large' or 'dimension'
image_error. The model rejected the inline image. Downsize below 20 MB if it was an image; if it was a text file or PDF, the file is still on disk in /app/uploads/ and the agent can Read it on the next turn.
Banner is exactly 'Internal error' with no token streamed
stale_session. mk0r's recovery branch fires automatically: a fresh ACP subprocess spins up inside the same sandbox and the prompt is replayed once. If the recovery succeeds you see streaming resume; if it fails you get 'Session recovery failed: ...' with the underlying reason.
Done event arrived with stopReason='max_tokens' or 'max_turn_requests'
Not a failure. The turn finished but the model wants you to confirm the next chunk. Use the Continue button. max_tokens means the response was truncated; max_turn_requests means the agent ran the maximum tool calls in one turn.
Done event arrived with stopReason='refusal'
The model declined. Edit the prompt; retrying as-is will produce the same refusal. The route deliberately omits the Continue button for this case.
Banner says 'project cap reached (10)'
Account-side cap. You have ten named projects already; archive or delete one to create another. Anonymous prototyping is not affected.
Two failed turns, side by side
The same prompt can produce two completely different terminal states depending on which limit fires. Here is what each one looks like in the boot/event stream the route emits.
Why eleven, not one
Most articles online answer “limits” with a single number because the alternative is harder to write and the vendor often does not expose the seams. The seams are the interesting part. Whether your turn ends in credit_exhausted or stale_session or max_turn_requests decides what happens next. A continue button is the right answer for one. A reset window is the right answer for another. A one-shot ACP restart is the right answer for a third. Bucketing them all together and showing “Something went wrong” is how an AI prototype tool stops being a tool.
mk0r ships the eleven-state surface as named constants and a forty-line classifier. Everything in this guide is one grep away in the repo: the file paths are real, the line numbers are stable, the constants do not change between deploys. If a future you wonders why a prototype turn ended exactly the way it did, the answer is the kind tag on the event in PostHog (chat_prompt_error_classified) or the stopReason on the done event.
Frequently asked questions
Why is 'context window' a misleading answer to 'what are the limits on an AI prototype app?'
Context window is a single number from the model card. It tells you nothing about which class of failure your prompt actually hits. A real prototyping turn ends in one of eleven distinguishable states: six error classes (credit_exhausted, auth_required, image_error, invalid_request, stale_session, generic) and five model stop reasons (end_turn, max_tokens, max_turn_requests, refusal, cancelled). Most of them have nothing to do with how many tokens fit in the prompt. The credit_exhausted class fires on HTTP 429 from Anthropic, which is a per-minute rate cap (for example, tokens per minute), not a per-prompt cap. The stale_session class fires when the agent's subprocess inside the sandbox lost its in-memory session, which is purely a runtime state issue. Treating all of them as 'context window' obscures the fix for any of them.
Where is the failure-classification code in mk0r and what does it actually do?
src/app/api/chat/route.ts, lines 43 to 82, defines a function named classifyPromptError. Its inputs are the raw error message, an optional ApiRetryInfo struct (HTTP status plus ACP errorType), and a meta object holding the most recent api_retry frame. Its output is a tagged union {kind, httpStatus, errorType} where kind is one of six values. The decision order is: structured signals first (errorType=='billing_error' OR HTTP 402/429 maps to credit_exhausted; errorType=='authentication_failed' OR HTTP 401/403 maps to auth_required), then a regex fallback (matching 'credit balance is too low', 'rate limit rejected', 'invalid api key', 'unable to resize image', 'usage', 'quota'). Stale-session is only chosen when the message says 'Internal error' AND no other signal won. Each kind takes a different branch starting at line 463.
What does mk0r do differently for each error kind?
credit_exhausted (line 463): emit a 'credit_exhausted' event with the raw SDK message and the resets-at hint, then DO NOT restart ACP. Restarting would just hit the same Anthropic limit. The user sees the rate-limit window. auth_required (line 479): emit 'auth_required' with the HTTP status and stop. invalid_request and image_error (line 486): mark the turn non-retryable, surface the real message instead of hiding it behind 'Something went wrong.' stale_session (line 499) AND textDeltaCount===0 AND not yet retried (line 500): call restartAndReloadSession to spin up a fresh ACP subprocess inside the same sandbox, replay the prompt once, and resume streaming. Anything else (generic): show the message as a normal error. Six kinds, each with its own defined response, and only one of them triggers an automatic recovery.
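The branch structure can be condensed into one decision function. This is a sketch, not the route code: the Action labels and the TurnState shape are hypothetical, and the fallback for a stale session that has already streamed text or already been retried (treated here like a generic error) is an assumption the guide does not spell out.

```typescript
type ErrorKind =
  | "credit_exhausted"
  | "auth_required"
  | "image_error"
  | "invalid_request"
  | "stale_session"
  | "generic";

// Hypothetical per-turn bookkeeping.
interface TurnState {
  textDeltaCount: number; // text deltas streamed before the error
  alreadyRetried: boolean; // whether the one-shot recovery was already used
}

// Hypothetical action labels, one per distinct response.
type Action =
  | "surface_and_wait"
  | "surface_and_stop"
  | "surface_non_retryable"
  | "restart_and_replay_once"
  | "surface_raw_error";

function decideAction(kind: ErrorKind, turn: TurnState): Action {
  switch (kind) {
    case "credit_exhausted":
      return "surface_and_wait"; // never restart: it would hit the same limit
    case "auth_required":
      return "surface_and_stop";
    case "image_error":
    case "invalid_request":
      return "surface_non_retryable";
    case "stale_session":
      // Recovery is only safe when nothing was streamed and it has not been tried yet.
      return turn.textDeltaCount === 0 && !turn.alreadyRetried
        ? "restart_and_replay_once"
        : "surface_raw_error"; // assumption: otherwise behave like generic
    case "generic":
      return "surface_raw_error";
    default: {
      const unreachable: never = kind;
      throw new Error(`unhandled error kind: ${String(unreachable)}`);
    }
  }
}
```

Gating the replay on textDeltaCount being zero avoids duplicating output the user has already seen, and the alreadyRetried flag keeps the recovery from looping.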
What is the difference between an error kind and a stop reason?
An error kind comes from classifyPromptError when the ACP bridge returns a prompt_error notification — something went wrong, the turn never reached a clean finish. A stop reason comes from a prompt_complete notification — the turn finished, but the model has to tell you why it stopped. The five stop reasons in src/lib/chat-events.ts line 10 are: end_turn (finished cleanly, this is the success case), max_tokens (the response itself hit Anthropic's per-response token cap and was truncated), max_turn_requests (the agent ran the maximum number of tool calls in one turn and was cut off), refusal (the model declined to act on the prompt), cancelled (the user pressed stop and POST'd /api/chat/cancel which forwarded /session/cancel to the sandbox). The UI in src/app/(landing)/page.tsx line 1370 maps each to a distinct banner string.
What boundary limits live outside the chat route?
Two of them, sitting at the edges of the request lifecycle. The browser-side cap is at src/components/assistant-ui/thread.tsx line 32: const MAX_FILE_SIZE = 25 * 1024 * 1024. Files larger than 25 MB never leave the browser; the addFiles handler logs an attachment_rejected PostHog event and skips them. The account-side cap is at src/app/api/projects/route.ts line 10: const PROJECT_CAP = 10. POST /api/projects calls countProjects(uid) and returns 409 'project cap reached (10)' when a signed-in user already owns ten named projects. Anonymous users do not have a project list at all, so the cap does not apply to anonymous prototyping; the cap is a feature of the persistent project catalog, not the prototype runtime.
How does the cancel path relate to the other limits?
Cancel is the only limit you fire on yourself. POST /api/chat/cancel reads the sessionKey from the body, looks up the active session via getActiveSession, and forwards POST /session/cancel to the in-VM ACP bridge with a 5 second AbortSignal.timeout. If the bridge accepts, the next prompt_complete notification arrives with stopReason='cancelled' and the route closes with that reason. The whole route is at src/app/api/chat/cancel/route.ts, twenty-eight lines including imports. The reason it matters in a limits discussion: most other tools treat 'stop' as a UI-only operation that orphans the model call. mk0r's cancel actually reaches the agent and produces a real stop reason in the same union as end_turn and max_tokens, so the same downstream handlers can render it.
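The forwarding step can be illustrated with a short sketch. It assumes a runtime where AbortSignal.timeout is available (current Node and browsers); the forwardCancel name, the injected fetch parameter, and the boolean return are hypothetical and are not the shape of the actual route.

```typescript
// Minimal fetch-like signature so the sketch can be exercised without a network.
type CancelFetch = (
  url: string,
  init: { method: "POST"; signal: AbortSignal },
) => Promise<{ ok: boolean }>;

// Hypothetical helper: forward a cancel to the in-VM bridge and give up after 5 s.
async function forwardCancel(bridgeBaseUrl: string, fetchImpl: CancelFetch): Promise<boolean> {
  try {
    const res = await fetchImpl(`${bridgeBaseUrl}/session/cancel`, {
      method: "POST",
      // Aborts the request if the bridge does not answer within 5 seconds.
      signal: AbortSignal.timeout(5_000),
    });
    return res.ok;
  } catch {
    // Timeout or network failure: the cancel did not reach the agent.
    return false;
  }
}
```

In a real handler the global fetch could be passed as fetchImpl; injecting it here only keeps the example testable in isolation.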
What stops me from filling the context window in practice?
The agent does. mk0r's prompt is plain user text plus optional inline images and inline text. Inline images are capped at 20 MB at src/app/api/chat/route.ts line 269. Inline text is capped at 10 MB at line 270. Files above the cap still land in /app/uploads/ inside the sandbox via writeFileToVm, so the agent reads them with its Read tool on demand instead of inlining them. The agent itself runs the multi-tool turn loop and is allowed to issue many tool calls in one turn; max_turn_requests is the model's signal that it ran out of those, not that it ran out of tokens. In practice, a one-shot prototype build never approaches the 200K context window; it ends at end_turn well below the limit, with stopReason logged on the prompt_complete notification.
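The inline-or-not decision reduces to a size comparison per attachment type. In this sketch the constant and function names are hypothetical, and treating a file of exactly the cap size as inlinable is an assumption; only the 20 MB and 10 MB figures come from the guide.

```typescript
const MAX_INLINE_IMAGE_BYTES = 20 * 1024 * 1024; // 20 MB
const MAX_INLINE_TEXT_BYTES = 10 * 1024 * 1024; // 10 MB

type AttachmentKind = "image" | "text";

// Returns true when the attachment is small enough to be placed in the prompt.
// Larger files are assumed to stay on disk in the sandbox for on-demand reads.
function shouldInline(kind: AttachmentKind, sizeBytes: number): boolean {
  const cap = kind === "image" ? MAX_INLINE_IMAGE_BYTES : MAX_INLINE_TEXT_BYTES;
  return sizeBytes <= cap;
}
```

Under this sketch a 15 MB log file is too large to inline as text but would fit the image cap, which is why the check needs the attachment kind and not just the size.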
If I get credit_exhausted, do I lose my prototype?
No. credit_exhausted is a signal from the model API, not from the sandbox. The sandbox is still running, the prompt blocks are still in the conversation history, and the next turn picks up from the same place. mk0r intentionally does NOT restart ACP on credit_exhausted (line 466 comment in src/app/api/chat/route.ts) because the restart would just hit the same Anthropic limit. The user sees the raw message containing the reset window and waits, then retries. If you have your own Anthropic OAuth tokens loaded (line 178), the route uses those instead of the shared API key, so the limit follows your account, not mk0r's pool.
Why eleven outcomes and not 'one big error bucket'?
Because they need different responses. credit_exhausted means wait. auth_required means re-link credentials. image_error means downsize the attachment. invalid_request means edit the prompt. stale_session means a one-time recovery is worth attempting. generic means show the user the actual error. end_turn means do nothing. max_tokens means offer a continue button. max_turn_requests means offer a continue button with an 'I will keep going' phrasing. refusal means do not retry; the model said no. cancelled means the user already knows. Bucketing them all together produces an unhelpful UX and an unobservable backend; splitting them produces eleven precise event types and eleven precise UI states. Each event lands in PostHog with its kind tag (chat_prompt_error_classified at line 454) so failures can be counted by category.
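One way to keep a list like this from drifting is to key it on the union type, so a missing or extra outcome fails to compile. A minimal sketch, with the Outcome type and the RESPONSE table as illustrative names rather than anything in the repo:

```typescript
type ErrorKind =
  | "credit_exhausted"
  | "auth_required"
  | "image_error"
  | "invalid_request"
  | "stale_session"
  | "generic";

type StopReason = "end_turn" | "max_tokens" | "max_turn_requests" | "refusal" | "cancelled";

type Outcome = ErrorKind | StopReason;

// Record<Outcome, string> forces exactly one entry per outcome at compile time.
const RESPONSE: Record<Outcome, string> = {
  credit_exhausted: "wait for the reset window",
  auth_required: "re-link credentials",
  image_error: "downsize the attachment",
  invalid_request: "edit the prompt",
  stale_session: "attempt a one-time recovery",
  generic: "show the actual error",
  end_turn: "do nothing",
  max_tokens: "offer a Continue button",
  max_turn_requests: "offer a Continue button",
  refusal: "do not retry",
  cancelled: "show nothing; the user already knows",
};
```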
Are these the only limits I should know about?
These are the limits that decide what happens to one turn. There is a separate seven-clock layer that governs runtime timing (TTFT watchdog, route maxDuration, sandbox lifetime) which we cover in our companion guide on one-shot prototype limits. There is also a billing entitlement layer at /api/billing/status that gates publishing to a custom domain after the trial. The sequence is: a turn either finishes or fails (these eleven outcomes), the runtime clocks decide how long it had to do that (the seven clocks), and the billing layer decides whether you can ship what you built (the entitled flag). Three layers, three categories of limit, all checkable in the repo.