AI app builder prototyping: what's actually inside the sandbox
Every AI app builder ends its pitch at “describe it, get an app.” The interesting part is what runs after you press enter. This is a walkthrough of what mk0r's sandbox contains, how the agent verifies its own output with a real browser, and why that decides whether your prototype breaks the moment you touch it.
Most AI builders stop at “generate code and hope”
If you've bounced off v0, Bolt, Lovable, or Emergent after the first broken preview, the reason is almost always the same. The agent wrote code that looked right. It streamed into an in-page renderer or a browser-hosted WebContainer. It never ran in a real browser, with a real network stack, against real npm resolution. The first time you clicked a button and nothing happened, the crack showed.
mk0r treats prototyping as a loop, not a one-shot. The sandbox isn't a trick to render faster; it's the environment the agent uses to check its own work. Everything below is the machinery that makes that loop possible.
What's inside every sandbox
The sandbox is an E2B template with ID 2yi5lxazr1abcs2ew6h8. It's pre-baked: the npm install happens at image build time, the project is scaffolded at /app, and the boot sequence only has to start processes. The concrete contents:
Real Chromium
Not WebContainer, not jsdom. Headful Chromium with fonts and libnss3 for TLS verification. The agent can open any preview URL the same way you would.
Playwright MCP 0.0.70
Installed globally via npm in the Dockerfile and exposed to Claude as a tool over MCP. browser_take_screenshot, browser_snapshot, and browser_click are all callable mid-generation.
Vite + React + TS + Tailwind v4
Pre-installed at /app. Dev server on :5173 with HMR already wired over wss. No cold npm install when your prompt lands.
Xvfb + x11vnc + websockify
A virtual display so the agent can see a real browser window, and a VNC bridge so you can too. The screencast in the UI is this stream.
Pre-provisioned services
Postgres via Neon, email via Resend, analytics via PostHog, already wired with env vars. Your prototype can persist data without you signing up for anything.
ACP bridge 0.25.0
@agentclientprotocol/claude-agent-acp runs inside the VM so the Claude agent can stream tool calls, text, and version commits back to your browser as NDJSON.
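NDJSON means one JSON object per line, so whatever reads the stream in the browser has to hold on to partial lines until the rest arrives in a later chunk. Here is a minimal sketch of that buffering. It assumes each event has a string `type` field (the article mentions "tool_call_start"); the `AgentEvent` shape and the `NdjsonBuffer` name are illustrative and not taken from mk0r's code.

```typescript
// Hypothetical event shape: only `type` is assumed; real events carry more fields.
interface AgentEvent {
  type: string;
  [key: string]: unknown;
}

// Accumulates text chunks and returns one parsed event per complete line.
class NdjsonBuffer {
  private pending = "";

  push(chunk: string): AgentEvent[] {
    this.pending += chunk;
    const lines = this.pending.split("\n");
    // The last element is either "" or an incomplete line; keep it for the next chunk.
    this.pending = lines.pop() ?? "";

    const events: AgentEvent[] = [];
    for (const line of lines) {
      const trimmed = line.trim();
      if (trimmed === "") continue; // tolerate blank lines between events
      events.push(JSON.parse(trimmed) as AgentEvent);
    }
    return events;
  }
}
```

If the chunks arrive as bytes rather than strings, decoding them with `new TextDecoder().decode(chunk, { stream: true })` before calling `push` keeps multi-byte characters intact across chunk boundaries.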
The self-verification loop, as a diagram
Competitors skip the second leg. mk0r's prototyping loop has four legs, and the agent drives three of them without you:
Prompt to verified preview
What the agent actually does after it writes the file
The order of operations inside a single iteration. Each step is an actual tool call streamed back to your browser as an NDJSON event on the chat endpoint.
Write file
Agent emits a tool_call with the file path and contents. Vite picks up the change.
Streamed from the ACP bridge as type: "tool_call_start", visible in src/app/api/chat/route.ts.
HMR paints
The parent window waits up to 800ms for an hmr:after event from the in-app bridge. If it fires, there is no hard reload.
Open the preview in Chromium
Agent calls the Playwright MCP tool browser_navigate against http://localhost:5173.
Snapshot the DOM
browser_snapshot returns the accessibility tree. The agent reads it, checks that required elements exist and that the error overlay is gone.
Screenshot on demand
For layout bugs the snapshot can't catch (overflowing text, wrong z-index), the agent calls browser_take_screenshot and reads the image back.
Iterate or hand off
If the check passed, the iteration ends with a version commit. If not, the agent writes another diff and starts over, typically without asking you.
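The steps above can be sketched as a small retry loop. This is an illustration under stated assumptions, not mk0r's implementation: `CallTool` is a hypothetical stand-in for an MCP client, `applyFix` stands in for the agent writing another diff, and detecting the error overlay by searching for "vite-error-overlay" in the snapshot text is a guess at how it would surface. The tool names browser_navigate and browser_snapshot are the ones named in the steps.

```typescript
// Hypothetical MCP client stand-in: call a named tool, get its text result back.
type CallTool = (name: string, args: Record<string, unknown>) => Promise<string>;

interface CheckResult {
  ok: boolean;
  missing: string[];
  errorOverlay: boolean;
}

// Pure check over the snapshot text: required labels present, no error overlay.
// Matching on "vite-error-overlay" is an assumption about the snapshot contents.
function checkSnapshot(snapshot: string, required: string[]): CheckResult {
  const missing = required.filter((label) => !snapshot.includes(label));
  const errorOverlay = snapshot.includes("vite-error-overlay");
  return { ok: missing.length === 0 && !errorOverlay, missing, errorOverlay };
}

// Navigate, snapshot, check; on failure ask for a fix and try again.
async function verifyWithRetries(
  callTool: CallTool,
  applyFix: (result: CheckResult) => Promise<void>,
  required: string[],
  maxAttempts = 3,
): Promise<CheckResult> {
  let result: CheckResult = { ok: false, missing: required, errorOverlay: false };
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    await callTool("browser_navigate", { url: "http://localhost:5173" });
    const snapshot = await callTool("browser_snapshot", {});
    result = checkSnapshot(snapshot, required);
    if (result.ok) return result; // hand off: the caller commits a version
    if (attempt < maxAttempts) await applyFix(result); // otherwise write another diff
  }
  return result;
}
```

The screenshot step is left out because it depends on reading an image back. Capping the attempts is a design choice of this sketch, so a failure the agent cannot fix ends the loop instead of spinning.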
The HMR bridge that keeps iterations instant
Every preview iframe runs a tiny script called _mk0rBridge.ts that hooks Vite's HMR events and posts them to the parent window. The parent side lives in src/components/phone-preview.tsx. When the agent writes a file, the parent bumps a refreshNonce and then waits up to 800ms for an hmr:after message before falling back to a reload.
The practical effect: small edits (a className change, a string tweak, a new handler) paint in a few hundred milliseconds without dropping form state. Big edits that invalidate modules get the reload path, layered so there's no white flash between iframes.
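The parent-side wait can be pictured as a race between an hmr:after message and an 800ms timer. The sketch below is a guess at that logic and is not the code in phone-preview.tsx: the `{ type: "hmr:after" }` message shape, the `Subscribe` abstraction, and the function names are all assumptions made so the example runs outside a browser.

```typescript
type Unsubscribe = () => void;
// Registers a handler for messages from the preview iframe and returns a remover.
type Subscribe = (handler: (data: unknown) => void) => Unsubscribe;

// Assumed message shape: { type: "hmr:after" }. The real payload may differ.
function isHmrAfter(data: unknown): boolean {
  return (
    typeof data === "object" &&
    data !== null &&
    (data as { type?: unknown }).type === "hmr:after"
  );
}

// Resolves "hmr" if an hmr:after message arrives within the window, else "reload".
function waitForHmr(
  subscribe: Subscribe,
  timeoutMs = 800,
): Promise<"hmr" | "reload"> {
  return new Promise((resolve) => {
    let unsubscribe: Unsubscribe = () => {};
    const timer = setTimeout(() => {
      unsubscribe();
      resolve("reload"); // HMR did not paint in time: fall back to a hard reload
    }, timeoutMs);
    unsubscribe = subscribe((data) => {
      if (!isHmrAfter(data)) return; // ignore hmr:before and unrelated messages
      clearTimeout(timer);
      unsubscribe();
      resolve("hmr");
    });
  });
}
```

In a page, `subscribe` would wrap `window.addEventListener("message", listener)` and return a function that calls `removeEventListener`. A real handler should also confirm that `event.source` is the preview iframe's `contentWindow` before trusting the message.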
Instant HTML vs. real VM: the tradeoff nobody explains
AI builders split into two camps, and the SERP never says so out loud. Knowing which camp your tool is in tells you in advance what kinds of prototypes will survive contact with reality.
| Feature | In-browser HTML / WebContainer | Real VM sandbox (mk0r) |
|---|---|---|
| Runs a real Chromium the agent can use | No, rendered in the parent page | Yes, headful via Xvfb |
| npm install against the real registry | Partial, shimmed resolution | Yes, pre-baked at image build time |
| External API calls without a CORS proxy | No, browser same-origin rules apply | Yes, VM has normal egress |
| Agent can screenshot and read the DOM | No, agent is blind to the render | Yes, via Playwright MCP |
| SSR behaves the way production does | Partial, emulated | Yes, Node 20 runtime |
| First paint under two seconds | Yes, no VM to boot | Close, ~2.5s cold / instant from pool |
| Works fully offline | Yes, browser-only | No, needs the sandbox |
In-browser builders win on first-paint latency. Real VMs win on fidelity. For anything past a one-screen toy, fidelity is the thing you can't retrofit later.
The iterate-see-change loop, as a mini film
Watch what changes frame to frame. This is the short version of what the UI shows you when you're prototyping with mk0r:
One prompt, one iteration
00:00 you type
“Make a habit tracker with a 7-day streak chart.” Session key created in localStorage. Sandbox claimed from the pre-warmed pool.
Frequently asked questions
What does 'the agent verifies its own output' actually mean?
Inside every mk0r sandbox, @playwright/mcp@0.0.70 is installed globally as a tool the Claude agent can call. After the agent writes a file, it can open the running Vite dev server in a real Chromium instance, take a snapshot of the DOM, and read it back. If a required element is missing or an error surface appears, the agent fixes the code and runs the check again before the iteration is handed to you.
How is this different from v0, Bolt, Lovable, or Replit?
v0 and Lovable stream React to an in-page renderer, which is fast, but you only see what the code claims to do. Bolt runs in-browser WebContainers, which can't host a real Chromium. Replit spins up a VM but doesn't put a browser-automation tool inside the agent loop. mk0r is the only one where the AI opens its own app in a real browser inside the sandbox and checks it visually before returning.
How fast is the sandbox?
The E2B template (ID 2yi5lxazr1abcs2ew6h8) is pre-baked: Debian bookworm-slim with Node 20, Chromium, Playwright MCP, Vite, React, TypeScript, Tailwind, and a scaffolded /app directory. Cold boot is roughly 2.5 seconds, and a pre-warmed pool backed by Firestore keeps idle sandboxes ready so most sessions skip cold start entirely.
Do I need to sign up?
No. The landing page generates a session key in localStorage on first visit (crypto.randomUUID) and uses it to claim a sandbox. You can prototype, iterate, and get a shareable preview URL without an email address. Signup only matters if you want projects to persist across devices.
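A get-or-create helper is one way to picture the session key. This is a sketch under assumptions: the storage key name `mk0r.sessionKey` and the function name are hypothetical, and the storage and ID generator are passed in so the example runs outside a browser.

```typescript
// Subset of the Web Storage API this sketch needs; window.localStorage satisfies it.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

// Hypothetical storage key; the name mk0r actually uses is not stated here.
const SESSION_KEY = "mk0r.sessionKey";

// Returns the stored key if present; otherwise generates, stores, and returns one.
// In the browser: getOrCreateSessionKey(window.localStorage, () => crypto.randomUUID())
function getOrCreateSessionKey(store: KeyValueStore, newId: () => string): string {
  const existing = store.getItem(SESSION_KEY);
  if (existing) return existing;
  const created = newId();
  store.setItem(SESSION_KEY, created);
  return created;
}
```

Because the key lives only in one browser's localStorage, clearing site data or switching devices produces a new key, which is consistent with signup being what makes projects persist across devices.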
What's the difference between mk0r's Quick mode and VM mode for prototyping?
Quick mode streams a single-file HTML/CSS/JS app from Claude Haiku in under 30 seconds, which suits calculators, landing pages, and throwaway utilities. VM mode boots the full sandbox with Vite + React + TypeScript + Playwright MCP for anything that needs real npm packages, hot-reload iteration, or visual verification. Anything you'll touch more than once belongs in VM mode.
How fast does the preview update when the agent changes a file?
The bridge script, _mk0rBridge.ts, posts hmr:before / hmr:after messages from the preview iframe to the parent window, whose handler lives in src/components/phone-preview.tsx. When the agent writes a file, the parent waits up to 800ms for an hmr:after event; if HMR paints in time, it skips the hard reload and the iframe updates in place. Most edits paint in under a second with no flash.
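On the iframe side, a bridge like this could be a few lines that forward Vite's HMR lifecycle events. The sketch below is an assumption about its shape and is not the contents of _mk0rBridge.ts: `vite:beforeUpdate` and `vite:afterUpdate` are real Vite client events, but mapping them to hmr:before / hmr:after, the `{ type }` payload, and the `HotLike` and `installBridge` names are illustrative.

```typescript
// The two Vite client HMR events this sketch listens for.
type ViteHmrEvent = "vite:beforeUpdate" | "vite:afterUpdate";

// Minimal slice of import.meta.hot that the bridge needs.
interface HotLike {
  on(event: ViteHmrEvent, cb: () => void): void;
}

// Forwards Vite HMR lifecycle events to the parent as hmr:before / hmr:after.
// The { type } payload shape is an assumption.
function installBridge(
  hot: HotLike,
  post: (message: { type: string }) => void,
): void {
  hot.on("vite:beforeUpdate", () => post({ type: "hmr:before" }));
  hot.on("vite:afterUpdate", () => post({ type: "hmr:after" }));
}
```

Inside a Vite app this would be called only when `import.meta.hot` is defined, with `post` wrapping `window.parent.postMessage(message, targetOrigin)`. Passing the parent's exact origin instead of "*" keeps the messages from reaching an unexpected embedder.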
What's pre-installed in the sandbox?
From docker/e2b/e2b.Dockerfile: Chromium with fonts and libnss3, ffmpeg, Xvfb + x11vnc + websockify for remote desktop, Python 3, postgresql-client, Node 20, Vite, React, TypeScript, Tailwind v4, @playwright/mcp@0.0.70, and @agentclientprotocol/claude-agent-acp@0.25.0. The project template is pre-scaffolded at /app so npm install has already run at image build time.
Does the agent really fix broken code without me asking?
Yes, for observable failures. If Vite throws a build error, the error surface is visible in Playwright and the agent reads it. If a click target doesn't exist, the snapshot shows it. What the agent cannot catch without you is intent mismatch: if the UI renders cleanly but doesn't do what you meant, only you can spot that. Visual verification raises the floor; it doesn't replace taste.
Prototype something, now
You get the full sandbox on your first prompt. Boot, Chromium, Playwright, Postgres, and an agent that checks its own work. No signup.
Open mk0r