Instant HTML app generator: the pipeline that has no VM in it
Most pages on this topic talk about being instant in marketing terms. This one names the actual mechanism. mk0r's Quick mode is an instant HTML app generator because nothing in the request path between your prompt and your first paint is allowed to be slow. No Docker pull. No npm install. No bundler. Just Claude Haiku writing HTML, the chat route forwarding it as a server-sent event stream, and the browser appending each chunk to an iframe's srcdoc attribute.
Direct answer (verified 2026-05-04)
An instant HTML app generator is a tool where the model writes a self-contained HTML document and you see it render in the browser in real time, with no sandbox spin-up between the prompt and the preview. mk0r's Quick mode is built this way. The model is Claude Haiku 4.5. The transport is one POST to /api/chat that returns text/event-stream. The renderer is an iframe whose srcdoc is rewritten on every chunk. First visible paint is well under a second.
Why most "instant" AI app generators are not
Open any of the well-known AI app builders, type a prompt, and watch what the network tab does. There's a sign-in. Then a project create call. Then a sandbox provision. Then a Docker image pull, somewhere between five and twenty seconds depending on cache. Then npm install, even if it's pre-baked into the image, because the framework still wants to verify. Then the dev server starts, then HMR connects, then the agent loop spins up, then the agent writes its first file, then the iframe finally loads. By the time you see anything, you've waited long enough that the tab feels broken.
Quick mode collapses that whole sequence into one network call. There is nothing between you and Haiku except an edge route. No container. No filesystem. No build tool. The cost is real (you can't install lodash from inside the iframe) and the scope is real (the entire app has to fit in one file). For about 80 percent of the things people actually try to build first, that scope is enough.
The whole pipeline, in four hops
The four steps below are everything you would see if you put breakpoints in the chat route and the iframe's parent component. Nothing is hidden behind "our agent does X." It really is this short.
Quick mode request, end to end
Browser opens SSE to /api/chat
POST with the prompt and active model. No VM lookup, no sandbox claim, just a streaming response.
Haiku starts emitting tokens
Time-to-first-token in the few-hundred-millisecond range on a warm route. HTML appears before a Docker pull would have finished.
Each chunk rewrites the iframe srcdoc
The preview is a sandboxed iframe. The browser parses and paints the partial document in place as it grows.
Inline script runs at end of stream
Buttons, inputs, and event listeners light up the instant the document finishes streaming. No build step.
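The client side of those hops can be sketched in a few lines. This is an illustrative sketch, not mk0r's actual source: the `/api/chat` path and streaming POST come from the description above, but `parseSSEEvents`, `streamQuickMode`, and the request body shape are invented names for illustration.

```typescript
// Split a buffered SSE stream into complete "data:" payloads, returning
// any trailing partial frame so the caller can resume on the next read.
function parseSSEEvents(raw: string): { events: string[]; rest: string } {
  const events: string[] = [];
  let rest = raw;
  let idx: number;
  while ((idx = rest.indexOf("\n\n")) !== -1) {
    const frame = rest.slice(0, idx);
    rest = rest.slice(idx + 2);
    const data = frame
      .split("\n")
      .filter((line) => line.startsWith("data: "))
      .map((line) => line.slice(6))
      .join("\n");
    if (data) events.push(data);
  }
  return { events, rest };
}

// One POST, one reader loop. Each decoded payload is handed to the caller,
// which (in mk0r's case) appends it to the iframe srcdoc buffer.
async function streamQuickMode(
  prompt: string,
  onChunk: (html: string) => void,
): Promise<void> {
  const res = await fetch("/api/chat", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ prompt }), // body shape is hypothetical
  });
  const reader = res.body!.getReader();
  const decoder = new TextDecoder();
  let pending = "";
  for (;;) {
    const { done, value } = await reader.read();
    if (done) break;
    pending += decoder.decode(value, { stream: true });
    const { events, rest } = parseSSEEvents(pending);
    pending = rest;
    for (const event of events) onChunk(event);
  }
}
```

Note that the parser keeps a `rest` buffer: SSE frames can be split across network reads, so a chunk boundary in the middle of `data: <p>wor` has to wait for the next read before it becomes a complete event.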
The wire-level view
If you want to verify this for yourself, open devtools, switch mk0r into Quick mode, and submit a prompt. You will see exactly this trace. One streaming POST, no follow-ups, no asset fetches unless Haiku embedded a CDN link in the HTML.
One POST. Streaming response. No sandbox.
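The server half is just as small. Below is a hedged sketch of what a route like this could return, not mk0r's actual handler: `sseFormat` and `toSSEStream` are invented names, and the model is stubbed as a generic async iterable of tokens.

```typescript
// Format one model token as an SSE frame. A multi-line token needs one
// "data:" field per line, per the event-stream format.
function sseFormat(token: string): string {
  return token.split("\n").map((line) => `data: ${line}`).join("\n") + "\n\n";
}

// Wrap a token stream (e.g. from a model SDK) in a byte stream the
// route can hand straight to the Response constructor.
function toSSEStream(tokens: AsyncIterable<string>): ReadableStream<Uint8Array> {
  const encoder = new TextEncoder();
  return new ReadableStream({
    async start(controller) {
      for await (const token of tokens) {
        controller.enqueue(encoder.encode(sseFormat(token)));
      }
      controller.close();
    },
  });
}

// A route handler would then return something like:
//   new Response(toSSEStream(modelTokens), {
//     headers: { "Content-Type": "text/event-stream" },
//   });
```

Because the response body is a stream rather than a buffered string, the first token reaches the browser as soon as the model emits it, which is what makes the sub-second first paint possible.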
Why an iframe srcdoc, not a separate document
The iframe is the hidden detail that makes streaming feel instant. If you wrote the partial HTML to a file, served it back, and reloaded the iframe on every chunk, you'd burn a network round trip per token and the preview would flicker. Instead the parent component holds a buffer in React state. Every SSE chunk appends to the buffer and resets the iframe's srcdoc. The browser parses and paints the partial document in place. You see the document grow downward as Haiku writes it, like watching someone type into a code editor with live preview.
The iframe is sandboxed (sandbox="allow-scripts") so the script Haiku writes can run, but the document is isolated from the parent page. Same-origin escape hatches are off. That matters because Haiku will sometimes invent a fetch to an API, and you want that request to go out from an opaque origin, carrying none of your page's cookies or credentials, rather than run as a real call from your origin.
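The chunk-to-srcdoc loop the parent component runs reduces to one piece of state. The sketch below is a simplified stand-in, not mk0r's React component: it factors the logic behind a minimal `SrcdocTarget` interface so it runs outside a browser, where the real target would be an HTMLIFrameElement with sandbox="allow-scripts".

```typescript
// Minimal shape of the one iframe property we touch, so the logic is
// testable outside a browser (hypothetical name; in the page this would
// be an HTMLIFrameElement).
interface SrcdocTarget {
  srcdoc: string;
}

// Hold a growing buffer of partial HTML; every pushed chunk rewrites the
// whole srcdoc, and the browser re-parses and repaints the partial
// document in place. No network round trip, no reload flicker.
function createPreview(target: SrcdocTarget) {
  let buffer = "";
  return {
    push(chunk: string): void {
      buffer += chunk;
      target.srcdoc = buffer;
    },
    get html(): string {
      return buffer;
    },
  };
}
```

In a React version, `buffer` would live in state and `srcdoc` would be the iframe's `srcDoc` prop, but the shape is the same: append, rewrite, let the browser's forgiving HTML parser handle the fact that the document is incomplete mid-stream.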
“The first chunk of HTML hits the iframe before a Docker pull would have finished.”
[Chart: Quick mode time-to-first-paint, mk0r]
What you actually feel as a user
Numbers are easier to argue about than vibes, but the vibe is the point. What the metrics capture is the difference between "type a prompt and stare at a loading spinner" and "type a prompt and watch your app appear."
When Quick mode is the wrong choice
A pipeline that skips the VM also skips everything the VM gives you. If your app needs:
- Real npm packages beyond what fits as a CDN script tag
- Multi-file structure with shared state across components
- Server-side data, user accounts, or persistence beyond localStorage
- A long-lived dev loop where you iterate over hours, not minutes
- Any kind of automated visual verification before the model hands you the result
Then you flip mk0r into VM mode. That route boots a real E2B sandbox with Vite, React, TypeScript, Tailwind v4, and Playwright MCP wired into the agent loop, so the model can open its own app in a real Chromium and check it visually before returning. Slower (cold boot is around two and a half seconds plus the agent loop) but with capabilities Quick mode does not pretend to have. The honest answer to "which mode should I use" is "run your idea in Quick first; if you outgrow it in the second prompt, flip the toggle."
Things that are good fits for instant HTML
The kind of thing that fits cleanly: tip splitters, unit converters, color pickers, regex testers, JSON viewers, simple calculators, drum machines, meme captioners, pictionary word generators, QR code generators, resume scaffolds, landing-page drafts, throwaway dashboards over a public API, and the single-page utility you wanted to ship in 20 minutes but always ended up scaffolding a full project for. The output is an HTML file. You can host it on any static surface, paste it into a Notion embed, mail it to a friend, or open it locally with a double-click. There is nothing to install on the other end.
What "no friction" means in practice
The other axis of instant is friction. A pipeline that streams fast is wasted if you have to sign in, confirm an email, pick a template, and click through onboarding before you reach the prompt input. mk0r treats the prompt input as the front door. You land on the homepage, the input is already focused, you start typing. The session key is a UUID written to localStorage the first time the page loads. Your generated apps are tied to that key. Sign-in matters only if you want to publish to a subdomain or sync projects across devices. For everything else, the page is the product.
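The session-key bootstrap described above is a few lines of logic. A minimal sketch under stated assumptions: the storage key name `mk0r_session` and the function names are invented for illustration, and the storage is abstracted behind an interface so the logic runs outside a browser, where the real store would be window.localStorage and the real `makeId` would be `() => crypto.randomUUID()`.

```typescript
// Minimal key-value shape matching the part of the Web Storage API we use.
interface KV {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

const SESSION_KEY = "mk0r_session"; // hypothetical key name

// Return the existing session key, or mint one on first visit and
// persist it so every later page load sees the same identity.
function getOrCreateSessionKey(store: KV, makeId: () => string): string {
  let id = store.getItem(SESSION_KEY);
  if (!id) {
    id = makeId();
    store.setItem(SESSION_KEY, id);
  }
  return id;
}
```

The important property is idempotence: the first call writes, every later call reads, and nothing about it requires a server round trip or an account.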
Want help wiring an instant builder into your own product?
Half-hour call. We'll walk through the Quick mode pipeline, the iframe-srcdoc trick, and where the seams are if you want to fork the pattern.
Frequently asked questions
What does 'instant' mean here, in seconds?
First visible paint happens before a typical Docker-backed builder would have finished pulling its base image. There is no sandbox in the path. The browser opens an SSE connection to the chat route, Claude Haiku starts emitting tokens, and each chunk gets piped into the srcdoc of an iframe on the right side of the page. You see headings, layout, and styles appear as Haiku writes them. Total time from prompt submit to a working app you can click around in is on the order of 15 to 30 seconds for a small utility, with the first character of HTML showing up in well under a second.
Is the output really HTML, or is it React under the hood?
Quick mode output is one self-contained HTML document. Doctype, head, style block, body, script. The script is plain JavaScript or sometimes a tiny inline framework via CDN, but the file you can right-click 'view source' on is the file Haiku wrote. No bundler in between. That is also why you can copy the HTML straight out of the preview pane and drop it into a static host or an email and have it work.
Why skip the VM at all? Other builders treat the sandbox as the product.
Because for a calculator, a tip splitter, a meme generator, or a single-page utility, the sandbox is overhead. A VM gives you npm, hot module replacement, and a real Chromium for visual verification. Useful when the app needs npm packages or multi-file state. Wasteful when the entire app is 200 lines of HTML. mk0r runs both pipelines side by side. Quick mode is the no-VM path. VM mode (Vite plus React plus TypeScript inside an E2B sandbox) is the with-VM path. You pick by toggling a mode switch in the header. Quick is the default for first-time visitors because it makes the maker loop instant.
What's actually in the network tab when I press send in Quick mode?
One POST to the chat route with the prompt and current model. The response is text/event-stream. Each event carries a chunk of generated text. The client appends each chunk to a buffer that is written to an iframe's srcdoc attribute on every chunk. The browser re-renders the iframe contents continuously, so you see the document grow. There is no separate request for assets; if Haiku writes a base64 SVG inline or a CDN link to a font, that's what you get.
Why Haiku specifically, not Sonnet or Opus?
Latency. Haiku 4.5 is the cheapest, fastest Claude model with output quality high enough to write coherent HTML in one pass. The cost of a typical Quick mode app is a few cents at most, and time-to-first-token is in the few-hundred-millisecond range. Sonnet and Opus do better on multi-file React projects where the agent needs to plan, read tools, and verify, which is what VM mode uses them for. For one-shot HTML, the marginal IQ of a heavier model is not worth the extra seconds of waiting.
Can I edit the generated HTML, or do I have to re-prompt every time?
Both. Iteration in Quick mode happens by sending a follow-up prompt: 'make the buttons larger, switch to dark mode, add a copy button'. Haiku rewrites the HTML and the iframe re-renders. If you want to hand-edit, the source is right there. Copy it out, edit it, paste it back via 'replace the source with this'. There is no lock-in to the format. The output is the file.
Where does this fall over?
Anywhere you need real persistence, real auth, multi-page routing, or a backend. Quick mode generates a single document that lives in the iframe. State is whatever you keep in localStorage or memory. If you want a database, a login screen, or a multi-screen flow with shared state, you switch to VM mode where there's a real dev server and you can install packages. The honest framing is: Quick mode is for the 80 percent of side-project ideas that fit in a self-contained HTML file. The other 20 percent need the sandbox.
How is this different from Claude Artifacts?
Artifacts also produce a single HTML document, but inside a chat surface. mk0r is a builder UI: a fixed prompt input below the canvas, a phone-frame preview, a download button, a publish button, undo and redo. The model under Quick mode is the same family. The wrapper is the difference. mk0r also has the VM mode escape hatch when the project outgrows a single file, which Artifacts does not.
Do I need an account?
No. Open mk0r.com, type a prompt, watch HTML stream in. The session is keyed off a UUID stored in localStorage on first visit. You can build, iterate, copy the source, and download a .html file without giving an email address. Sign-in only matters if you want to publish to a custom subdomain or keep projects synced across devices.