Guide

AI app builder prototyping: what's actually inside the sandbox

Every AI app builder ends its pitch at “describe it, get an app.” The interesting part is what runs after you press enter. This is a walkthrough of what mk0r's sandbox contains, how the agent verifies its own output with a real browser, and why that decides whether your prototype breaks the moment you touch it.

Matthew Diakonov, Written with AI

Published April 18, 2026 · 8 min

4.8 from engineers who tried v0, Bolt, Lovable, and came here

~2.5s pre-baked VM boot

Playwright MCP in the agent loop

Real Chromium, real npm install

The prototyping loop

Prompt in. Real app out. Verified.

Pre-baked VM boots in ~2.5 seconds

Agent writes React into /app

Agent opens its own app in Chromium

Playwright MCP screenshots the DOM

Fixes applied before you see the preview


Try it, no signup. Session key created in your browser on first click.

Most AI builders stop at “generate code and hope”

If you've bounced off v0, Bolt, Lovable, or Emergent after the first broken preview, the reason is almost always the same. The agent wrote code that looked right. It streamed into an in-page renderer or a browser-hosted WebContainer. It never ran in a real browser, with a real network stack, against real npm resolution. The first time you clicked a button and nothing happened, the crack showed.

mk0r treats prototyping as a loop, not a one-shot. The sandbox isn't a trick to render faster; it's the environment the agent uses to check its own work. Everything below is the machinery that makes that loop possible.

~2.5s cold VM boot

800ms HMR paint window


What's inside every sandbox

The sandbox is an E2B template with ID 2yi5lxazr1abcs2ew6h8. It's pre-baked: the npm install happens at image build time, the project is scaffolded at /app, and the boot sequence only has to start processes. The concrete contents:

Real Chromium

Not WebContainer, not jsdom. Headful Chromium with fonts and libnss3 for TLS verification. The agent can open any preview URL the same way you would.

Playwright MCP 0.0.70

Installed globally via npm in the Dockerfile. Exposed to Claude as a tool over MCP. browser_take_screenshot, browser_snapshot, browser_click, all callable mid-generation.

Vite + React + TS + Tailwind v4

Pre-installed at /app. Dev server on :5173 with HMR already wired over wss. No cold npm install when your prompt lands.

Xvfb + x11vnc + websockify

A virtual display so the agent can see a real browser window, and a VNC bridge so you can too. The screencast in the UI is this stream.

Pre-provisioned services

Postgres via Neon, email via Resend, analytics via PostHog, already wired with env vars. Your prototype can persist data without you signing up for anything.

ACP bridge 0.25.0

@agentclientprotocol/claude-agent-acp runs inside the VM so the Claude agent can stream tool calls, text, and version commits back to your browser as NDJSON.
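An NDJSON stream is one JSON object per line, so whatever reads it in the browser has to tolerate network chunks that end mid-line. The following is a minimal sketch of that buffering, not mk0r's actual client code. It assumes each event carries a `type` field such as `tool_call_start` (the only event name this guide confirms); the `NdjsonParser` class, the `StreamEvent` shape, and the `path` and `text` fields in the demo are hypothetical.

```typescript
// Hypothetical event shape: only `type` (e.g. "tool_call_start") comes from
// this guide; all other fields are passed through untyped.
interface StreamEvent {
  type: string;
  [key: string]: unknown;
}

// Incremental NDJSON parser: feed it raw text chunks, get complete events back.
class NdjsonParser {
  private buffer = "";

  // Returns every event whose line was completed by this chunk.
  push(chunk: string): StreamEvent[] {
    this.buffer += chunk;
    const lines = this.buffer.split("\n");
    // The last piece has no trailing newline yet, so keep it for the next chunk.
    this.buffer = lines.pop() ?? "";

    const events: StreamEvent[] = [];
    for (const line of lines) {
      const trimmed = line.trim();
      if (trimmed === "") continue; // skip blank lines
      events.push(JSON.parse(trimmed) as StreamEvent);
    }
    return events;
  }
}

// Demo: the second event is split across two chunks (field names are made up).
const parser = new NdjsonParser();
const first = parser.push('{"type":"tool_call_start","path":"/app/src/App.tsx"}\n{"type":"te');
const second = parser.push('xt","text":"done"}\n');
console.log(first.map((e) => e.type), second.map((e) => e.type));
```

Keeping the unfinished tail in a buffer means a UI can render each tool call as soon as its line completes instead of waiting for the whole response.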

The self-verification loop, as a diagram

Competitors skip the second leg. mk0r's prototyping loop is four legs, and the agent drives three of them without you:

Prompt to verified preview

What the agent actually does after it writes the file

The order of operations inside a single iteration. Each step is an actual tool call streamed back to your browser as an NDJSON event on the chat endpoint.

Write file

Agent emits a tool_call with the file path and contents. Vite picks up the change.

Streamed from the ACP bridge as type: "tool_call_start", visible in src/app/api/chat/route.ts.

HMR paints

Parent iframe waits up to 800ms for an hmr:after event from the in-app bridge. If it fires, no hard reload.

Open the preview in Chromium

Agent calls the Playwright MCP tool browser_navigate against http://localhost:5173.

Snapshot the DOM

browser_snapshot returns the accessibility tree. The agent reads it, checks that required elements exist and that the error overlay is gone.

Screenshot on demand

For layout bugs the snapshot can't catch (overflowing text, wrong z-index), the agent calls browser_take_screenshot and reads the image back.

Iterate or hand off

If the check passed, the iteration ends with a version commit. If not, the agent writes another diff and starts over, typically without asking you.
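The steps above can be sketched as a small retry loop. This is an illustration under stated assumptions, not the agent's real implementation: `browser_navigate` and `browser_snapshot` are the Playwright MCP tool names used in this guide, but the `CallTool` signature, the plain-string snapshot, the `applyFix` callback, the retry limit, and the `vite-error-overlay` marker are all hypothetical stand-ins.

```typescript
// Assumed abstraction over the MCP client: call a tool by name, get text back.
type CallTool = (name: string, args: Record<string, unknown>) => Promise<string>;

interface CheckResult {
  ok: boolean;
  problems: string[];
}

// Pure check over snapshot text: every required label must appear and the
// error marker must not. The default marker is a guess at how Vite's error
// overlay would show up in a snapshot.
function checkSnapshot(
  snapshot: string,
  required: string[],
  errorMarker = "vite-error-overlay",
): CheckResult {
  const problems = required
    .filter((label) => !snapshot.includes(label))
    .map((label) => `missing: ${label}`);
  if (snapshot.includes(errorMarker)) problems.push("error overlay present");
  return { ok: problems.length === 0, problems };
}

// Navigate, snapshot, check; on failure ask for a fix and try again.
async function verifyIteration(
  callTool: CallTool,
  applyFix: (problems: string[]) => Promise<void>,
  required: string[],
  maxAttempts = 3, // hypothetical limit
): Promise<boolean> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    await callTool("browser_navigate", { url: "http://localhost:5173" });
    const snapshot = await callTool("browser_snapshot", {});
    const result = checkSnapshot(snapshot, required);
    if (result.ok) return true; // hand off: the iteration can be committed
    await applyFix(result.problems); // write another diff, then re-check
  }
  return false;
}

// Demo with a fake tool runner: the first snapshot lacks the button.
const snapshots = ['- heading "Habits"', '- heading "Habits"\n- button "Add habit"'];
const fixesRequested: string[][] = [];
const fakeTool: CallTool = async (name) =>
  name === "browser_snapshot" ? (snapshots.shift() ?? "") : "ok";
const verified = verifyIteration(
  fakeTool,
  async (problems) => { fixesRequested.push(problems); },
  ['heading "Habits"', 'button "Add habit"'],
);
verified.then((ok) => console.log(ok, fixesRequested));
```

Keeping the check a pure function of the snapshot text makes it easy to test without a browser; a screenshot step for layout bugs would slot in after the snapshot check in the same loop.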

What it looks like in the sandbox terminal

Boot, serve, snapshot. An abridged session from inside the VM:

sandbox / bash

The HMR bridge that keeps iterations instant

Every preview iframe runs a tiny script called _mk0rBridge.ts that hooks Vite's HMR events and posts them to the parent window. The parent side lives in src/components/phone-preview.tsx. When the agent writes a file, the parent bumps a refreshNonce and then waits:

src/components/phone-preview.tsx

The practical effect: small edits (a className change, a string tweak, a new handler) paint in a few hundred milliseconds without dropping form state. Big edits that invalidate modules get the reload path, layered so there's no white flash between iframes.
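The wait-then-fall-back behavior described above can be sketched as a small gate. This is a hedged illustration, not the contents of phone-preview.tsx: the guide names the `hmr:after` event and the 800ms budget, while the `{ type }` message shape, the `awaitHmrOrReload` function, and the injected `subscribe` and `schedule` adapters are assumptions made so the logic can run outside a browser.

```typescript
// Assumed message shape; the guide names the events but not their payload.
interface BridgeMessage {
  type: string;
}

type Cancel = () => void;

interface GateOptions {
  // Registers a message handler and returns an unsubscribe function.
  subscribe: (handler: (msg: BridgeMessage) => void) => Cancel;
  // Schedules a callback and returns a cancel function.
  schedule: (fn: () => void, ms: number) => Cancel;
  budgetMs?: number; // 800ms per this guide
  onHmr: () => void; // HMR painted in time: keep the iframe as-is
  onHardReload: () => void; // budget expired: reload the iframe
}

function awaitHmrOrReload(opts: GateOptions): void {
  const { subscribe, schedule, budgetMs = 800, onHmr, onHardReload } = opts;
  let settled = false;
  let unsubscribe: Cancel = () => {};
  let cancelTimer: Cancel = () => {};

  unsubscribe = subscribe((msg) => {
    if (settled || msg.type !== "hmr:after") return;
    settled = true;
    cancelTimer();
    unsubscribe();
    onHmr();
  });

  cancelTimer = schedule(() => {
    if (settled) return;
    settled = true;
    unsubscribe();
    onHardReload();
  }, budgetMs);
}

// In a browser the adapters could wrap window "message" events and setTimeout:
//   subscribe: (h) => { const l = (e: MessageEvent) => h(e.data);
//     window.addEventListener("message", l);
//     return () => window.removeEventListener("message", l); }
//   schedule: (fn, ms) => { const id = setTimeout(fn, ms); return () => clearTimeout(id); }

// Demo with fake adapters: the HMR event arrives before the timer fires.
const listeners: Array<(msg: BridgeMessage) => void> = [];
const timers: Array<() => void> = [];
const outcomes: string[] = [];
awaitHmrOrReload({
  subscribe: (h) => {
    listeners.push(h);
    return () => { const i = listeners.indexOf(h); if (i >= 0) listeners.splice(i, 1); };
  },
  schedule: (fn) => {
    timers.push(fn);
    return () => { const i = timers.indexOf(fn); if (i >= 0) timers.splice(i, 1); };
  },
  onHmr: () => { outcomes.push("hmr"); },
  onHardReload: () => { outcomes.push("hard-reload"); },
});
listeners.slice().forEach((h) => h({ type: "hmr:after" }));
timers.slice().forEach((fire) => fire());
console.log(outcomes);
```

Whichever side wins first settles the gate, so a late `hmr:after` after a hard reload, or a stale timer after a successful paint, is ignored rather than causing a second refresh.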

Instant HTML vs. real VM: the tradeoff nobody explains

AI builders split into two camps, and the SERP never says so out loud. Knowing which camp your tool is in tells you in advance what kinds of prototypes will survive contact with reality.

Feature	In-browser HTML / WebContainer	Real VM sandbox (mk0r)
Runs a real Chromium the agent can use	No, rendered in the parent page	Yes, headful via Xvfb
npm install against the real registry	Partial, shimmed resolution	Yes, pre-baked at image build time
External API calls without a CORS proxy	No, browser same-origin rules apply	Yes, VM has normal egress
Agent can screenshot and read the DOM	No, agent is blind to the render	Yes, via Playwright MCP
SSR behaves the way production does	Partial, emulated	Yes, Node 20 runtime
First paint under two seconds	Yes, no VM to boot	Close, ~2.5s cold / instant from pool
Works fully offline	Yes, browser-only	No, needs the sandbox

In-browser builders win on first-paint latency. Real VMs win on fidelity. For anything past a one-screen toy, fidelity is the thing you can't retrofit later.

The iterate-see-change loop, as a mini film

Watch what changes frame to frame. This is the short version of what the UI shows you when you're prototyping with mk0r:

One prompt, one iteration


00:00 you type

“Make a habit tracker with a 7-day streak chart.” Session key created in localStorage. Sandbox claimed from the pre-warmed pool.

The numbers that matter when you're prototyping

Real numbers from the running product, not a spec sheet.

~2.5s: cold VM boot from a pre-baked template

800ms: HMR paint budget before a hard reload kicks in

0: accounts required to get a shareable preview URL

Frequently asked questions

What does 'the agent verifies its own output' actually mean?

Inside every mk0r sandbox, @playwright/mcp@0.0.70 is installed globally as a tool the Claude agent can call. After the agent writes a file, it can open the running Vite dev server in a real Chromium instance, take a snapshot of the DOM, and read it back. If a required element is missing or an error surface appears, the agent fixes the code and runs the check again before the iteration is handed to you.

How is this different from v0, Bolt, Lovable, or Replit?

v0 and Lovable stream React to an in-page renderer — fast, but you only see what the code claims to do. Bolt runs in-browser WebContainers, which can't host a real Chromium. Replit spins up a VM but doesn't put a browser-automation tool inside the agent loop. mk0r is the only one where the AI opens its own app in a real browser inside the sandbox and checks it visually before returning.

How fast is the sandbox?

The E2B template (ID 2yi5lxazr1abcs2ew6h8) is pre-baked: Debian bookworm-slim with Node 20, Chromium, Playwright MCP, Vite, React, TypeScript, Tailwind, and a scaffolded /app directory. Cold boot is roughly 2.5 seconds, and a pre-warmed pool backed by Firestore keeps idle sandboxes ready so most sessions skip cold start entirely.

Do I need to sign up?

No. The landing page generates a session key in localStorage on first visit (crypto.randomUUID) and uses it to claim a sandbox. You can prototype, iterate, and get a shareable preview URL without an email address. Signup only matters if you want projects to persist across devices.
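A get-or-create session key of this kind can be sketched in a few lines. This is an assumption-laden illustration rather than mk0r's code: `crypto.randomUUID` and localStorage come from the answer above, while the storage key name, the `getOrCreateSessionKey` helper, and the in-memory store used for the demo are hypothetical.

```typescript
// Minimal subset of the Web Storage API, so the sketch also runs outside a browser.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

// Hypothetical storage key: the guide does not name the one mk0r uses.
const SESSION_STORAGE_KEY = "mk0r:session-key";

// Returns the stored key if present; otherwise generates, stores, and returns one.
function getOrCreateSessionKey(
  store: KeyValueStore,
  makeId: () => string = () => crypto.randomUUID(),
): string {
  const existing = store.getItem(SESSION_STORAGE_KEY);
  if (existing) return existing;
  const fresh = makeId();
  store.setItem(SESSION_STORAGE_KEY, fresh);
  return fresh;
}

// In-memory stand-in for window.localStorage, used only for the demo.
function createMemoryStore(): KeyValueStore {
  const data: Record<string, string> = {};
  return {
    getItem: (key) => (key in data ? data[key] : null),
    setItem: (key, value) => { data[key] = value; },
  };
}

// Demo: the second call reuses the key created by the first.
const store = createMemoryStore();
const firstVisit = getOrCreateSessionKey(store, () => "demo-key");
const secondVisit = getOrCreateSessionKey(store, () => "should-not-be-used");
console.log(firstVisit === secondVisit);
```

In a browser the call would be `getOrCreateSessionKey(window.localStorage)`, which falls back to `crypto.randomUUID()` for the first visit.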

What's the difference between mk0r's Quick mode and VM mode for prototyping?

Quick mode streams a single-file HTML/CSS/JS app from Claude Haiku in under 30 seconds — great for calculators, landing pages, and throwaway utilities. VM mode boots the full sandbox with Vite + React + TypeScript + Playwright MCP for anything that needs real npm packages, hot-reload iteration, or visual verification. Prototyping something you'll touch more than once belongs in VM mode.

How fast does the preview update when the agent changes a file?

The bridge script, _mk0rBridge.ts, runs inside the preview iframe and posts hmr:before / hmr:after messages to the parent window, where src/components/phone-preview.tsx handles them. When the agent writes a file, the parent waits up to 800ms for an hmr:after event; if HMR paints in time, it skips the hard reload and the iframe updates in place. Most edits paint in under a second with no flash.

What's pre-installed in the sandbox?

From docker/e2b/e2b.Dockerfile: Chromium with fonts and libnss3, ffmpeg, Xvfb + x11vnc + websockify for remote desktop, Python 3, postgresql-client, Node 20, Vite, React, TypeScript, Tailwind v4, @playwright/mcp@0.0.70, and @agentclientprotocol/claude-agent-acp@0.25.0. The project template is pre-scaffolded at /app so npm install has already run at image build time.

Does the agent really fix broken code without me asking?

Yes, for observable failures. If Vite throws a build error, the error surface is visible in Playwright and the agent reads it. If a click target doesn't exist, the snapshot shows it. What the agent cannot catch without you is intent mismatch — if the UI renders cleanly but doesn't do what you meant, only you can spot that. Visual verification raises the floor; it doesn't replace taste.

Prototype something, now

You get the full sandbox on your first prompt. Boot, Chromium, Playwright, Postgres, and an agent that checks its own work. No signup.

Open mk0r