
When HTML Prompts and HTML Outputs Actually Match

Three rules decide it. Name one noun and one constraint. Lock the output to a single file. Stream the output so drift is visible while you can still interrupt. Drop any one of those and the prompt and the HTML stop matching. Notes from building mk0r, an open-source AI app maker.

Matthew Diakonov · 5 min read

Direct answer (verified 2026-05-10)

HTML prompts and HTML outputs match when the prompt commits to one noun and one specific constraint, the model is locked to a single-file HTML target so structure cannot drift across files, and the output streams so any divergence is visible early enough to interrupt and re-state. Verified against the mk0r source at github.com/m13v/appmaker.

Why prompts and outputs drift in the first place

You type, “a habit tracker with daily streaks, weekly graphs, friend circles, and push notifications.” You get back a habit tracker with daily streaks, four placeholder cards labelled “Friends,” and no graphs. The model picked the cheapest constraint to satisfy and dropped the rest into decoration. That is one of three failure modes:

  • Too many constraints. Five things named in one breath. The model commits to the noun and one of the constraints, and the rest become labels without logic.
  • Too open a target. When the output is “a project,” structure spreads across files before you see any of it. The first file looks right, the others quietly do not match.
  • No streaming. A returned blob commits the model to a draft you cannot interrupt. Drift in the first paragraph propagates into every paragraph after it.

Take any of those three away and matching improves. The interesting ones are the second and third, because most tools shipping today optimise for one of them and leave the other on the floor.

The shape of a prompt that matches on the first try

Open mk0r.com. Under the input box there are four clickable example prompts. They were not chosen at random. Read them as a set:

  • A habit tracker with daily streaks
  • A pomodoro timer with focus sessions
  • A mood journal with weekly insights
  • A flashcard app for language learning

Every one is a noun phrase plus one constraint. Habit tracker, daily streaks. Pomodoro timer, focus sessions. Mood journal, weekly insights. Flashcard app, language learning. There is no fifth thing tacked on, no “and,” no list. That is the shape that lands. The source for the shipped array is in src/app/(landing)/page.tsx of the public repo.
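The shipped array can be sketched as plain data. This is a reconstruction from the four prompts listed above, not a verbatim copy of `src/app/(landing)/page.tsx`; the identifier `EXAMPLE_PROMPTS` is my own name for it.

```typescript
// Reconstructed sketch of the landing-page prompt array; the real
// identifier in src/app/(landing)/page.tsx may differ.
const EXAMPLE_PROMPTS: string[] = [
  "A habit tracker with daily streaks",
  "A pomodoro timer with focus sessions",
  "A mood journal with weekly insights",
  "A flashcard app for language learning",
];

// Every entry is a noun phrase plus exactly one constraint:
// no "and", no comma-separated feature list.
const wellShaped = EXAMPLE_PROMPTS.every(
  (p) => !p.includes(" and ") && !p.includes(","),
);
console.log(wellShaped); // true
```

The property worth noticing is structural: the set passes the no-list check as a whole, which is what makes it work as a teaching device under the input box.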

mk0r homepage hero: one sentence in, a working mobile app out, in about 30 seconds. No signup, no setup, no code. Tweak it by just saying what to change.

The single-file constraint, and why it matters

The Quick mode in mk0r calls Claude Haiku with a system prompt that asks for “a complete single-file HTML app.” The model returns one document. CSS in a style tag. JavaScript in a script tag. State, layout, behavior, and visuals all live in the same scroll. You can read the full document end to end in under a minute.

That is why prompt and output match more reliably here than in tools that target a project tree. There is nowhere for the model to push complexity. If the prompt said “a quiz with one question per screen” and the script tag does not contain a state variable for the current question, the mismatch is visible in seconds. In a multi-file project you would have to open three files before noticing the same drift.
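The drift check described above can be made concrete. The HTML below is a toy document of my own, not mk0r output, and the regex check is an illustration of why a single scroll is auditable, not anything the tool actually runs.

```typescript
// Toy illustration (not from the mk0r repo): the single-file shape,
// and the kind of cheap drift check it enables.
const singleFileApp = `<!DOCTYPE html>
<html>
<head><style>body { font-family: sans-serif; }</style></head>
<body>
  <div id="question"></div>
  <script>
    let currentQuestion = 0; // "one question per screen" needs this state
    const questions = ["2+2?", "3+3?"];
    document.getElementById("question").textContent = questions[currentQuestion];
  </script>
</body>
</html>`;

// If the prompt said "one question per screen", the state variable
// must exist somewhere in the one scroll you can read end to end.
const hasQuestionState = /let\s+currentQuestion/.test(singleFileApp);
console.log(hasQuestionState); // true
```

In a multi-file project the same check would mean opening the component, the store, and the router before knowing whether the state exists at all.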

The tradeoff is real. A single HTML file cannot hold a real backend, real auth, real multi-user state, or anything past a JSON literal for data. When you outgrow that, mk0r switches to VM mode, which boots a Vite plus React sandbox and edits files there while you watch a Playwright-driven browser show the result. The matching rule changes from “one file, one shape” to “one file at a time, with a browser open.” Same spirit, different scope.

What streaming buys you that a returned blob does not

The shape of the call is straightforward. Open a streaming connection to the model, ask for a single HTML document, yield every text delta straight to the browser as it arrives. The preview iframe repaints continuously. The first interactive paint shows up before the model has finished writing.
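The loop described above can be sketched with the model behind a plain async iterable, so the shape runs without an API key. In the real Quick mode the deltas come from a streaming Anthropic call; here the source is a stub and `streamToPreview` is a hypothetical name.

```typescript
// Stub standing in for the model's text deltas.
async function* fakeDeltas(): AsyncGenerator<string> {
  yield "<!DOCTYPE html><html><body>";
  yield "<h1>Habit tracker</h1>";
  yield "</body></html>";
}

// Append each delta to the document and repaint; here "repaint" is a
// callback, in the real app it is the preview iframe re-rendering.
async function streamToPreview(
  deltas: AsyncIterable<string>,
  repaint: (html: string) => void,
): Promise<string> {
  let html = "";
  for await (const delta of deltas) {
    html += delta; // the document grows token by token
    repaint(html); // first interactive paint happens mid-stream
  }
  return html;
}

streamToPreview(fakeDeltas(), () => {}).then((html) =>
  console.log(html.includes("</html>")),
);
```

The point of the shape is that `repaint` fires on every delta, so the moment the stream commits to the wrong layout you see it and can interrupt.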

Quick mode: prompt to rendered HTML

You → mk0r → Claude Haiku → Preview. One sentence goes in; the system prompt pins a single-file HTML target; tokens stream back; each delta repaints the preview; first paint lands early; you can interrupt and edit at any point.

Without streaming, you would read the finished blob, decide it is not what you wanted, write a fresh prompt, wait again. Round trips cost minutes. With streaming, the unit is still one sentence, but the cycle is seconds. You stop generation the instant the layout commits to the wrong primary action, edit the sentence, send. The matched parts of the previous output do not even need to be regenerated; you keep iterating against the same project.

The four moves, in order

If you want a prompt that matches the HTML you get back, this is the sequence. Skip any one of them and matching gets noisier.

  1. Name one noun. A tracker. A timer. A quiz. The noun pins the layout the model already knows.

  2. Add one constraint. Daily streaks. Focus sessions. Four-second cycles. One specific decision, not five.

  3. Lock the output target. Single HTML file with embedded CSS and JS. No project, no folder, no place to hide structure.

  4. Watch it stream. First paint before the model finishes. Catch drift while you can still interrupt.

When the first output still drifts

No rule wins every turn. When the first paint comes back and the layout is wrong, the temptation is to write a long correction prompt: “the streaks should reset, the colors are off, the font is too thin, the buttons should be square, the title is too loud.” That replaces the parts of the output that already matched.

The cleaner move is one short sentence per change. Pick the one most-wrong thing. Name it specifically. Send. The matched parts stay matched, the named part moves. Repeat. The iteration loop was built for this; do not fight it by trying to make every change at once.

That habit is also the one piece of advice that translates from the streaming HTML mode to the full React mode and outside the tool entirely. One change, one sentence, look at the result. The part the prompt did not name is the part that does not change.

Five sentences worth trying

Each one fits the noun-plus-one-constraint shape. Open mk0r.com, paste any of these, watch the HTML stream in. Then iterate one sentence at a time.

  • A bill splitter with a tip slider
  • A breathing exercise with four-second cycles
  • A reading log with a yearly goal bar
  • A meeting timer with a yellow warning at one minute
  • A grocery list with a checkbox per item

Bring a sentence, leave with matching HTML

Twenty minutes on a screen share. You bring an app idea, mk0r builds it, we walk through where prompt and output stay matched and where they need a follow-up sentence.

Frequently asked questions

Why does the HTML I get back stop matching what I described?

Three reasons cover almost every case. The prompt named two or more constraints in one breath, so the model picked the cheapest one to satisfy. The output target was open-ended (a project, a folder, a Next.js app), so structure drifted across files before you saw any of it. Or the output was returned as a finished blob, so you only noticed the divergence after the model had already committed to it. Take any of those three away and alignment improves.

What does 'noun plus one constraint' actually look like?

A pomodoro timer with focus sessions. A bill splitter with a tip slider. A breathing exercise with four-second cycles. The noun pins the layout (timer, calculator, exercise), the constraint pins the one decision the model would otherwise have to invent (focus sessions, a tip slider, four-second cycles). The four prompts shipped on mk0r.com under the input box are all this shape, by design.

Why does a single-file target matter so much for matching?

When the output is one HTML document with the CSS in a style tag and the JS in a script tag, the model cannot push complexity into other files where you cannot see it. Every decision lives in the same scroll. If the prompt said 'a quiz with one question per screen' and the script tag does not have a state variable for the current question, the mismatch is visible in seconds. Multi-file projects let the model split the answer across places you have to go look for.

How does streaming change the iteration loop?

Streaming turns a five-minute round trip into a five-second one. You watch the model commit to a layout, a color, a primary action. The moment one of those is wrong, you stop generation, edit the sentence, send it again. Without streaming you would have read the full output, decided you did not like it, written a new prompt, and waited for a fresh draft. With streaming, the iteration unit is the same one sentence, but the cycle time drops by an order of magnitude.

What kinds of prompts almost always match the HTML output?

Single-screen utilities the model has rendered a hundred times: trackers, timers, calculators, journals, quizzes, generators, single-page tools. Anything where the layout is conventional, the interactivity is scoped to one screen, and there is no shared state across users. These prompts match on the first turn because the model has a strong prior for what the HTML should look like.

What kinds of prompts almost never match on the first turn?

Prompts that pack five constraints in one breath ('a habit tracker with daily streaks, weekly graphs, friend circles, push notifications, payment, and dark mode'). Prompts that name a kind of app the model does not have a strong layout prior for ('a serverless workflow orchestrator UI'). Prompts that depend on data the model has to invent ('a dashboard for our quarterly KPIs'). The first turn comes back generic. You iterate from there one constraint at a time.

Where can I see the system prompt that constrains the output to a single HTML file?

The Quick mode call lives in the mk0r repo. The shape is literally: anthropic.messages.stream with a system prompt that asks for a complete single-file HTML app, then yielding text deltas to the browser as they arrive. The reference snippet is on the mk0r overview page at mk0r.com/t/mk0r and the implementation is in the open-source repo at github.com/m13v/appmaker.
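The call shape the answer describes can be sketched with the client behind an interface, so it runs offline. The system string, the event shape, and every name here are assumptions modeled on the description above, not the verbatim mk0r source.

```typescript
// Sketch of the Quick mode call shape, with the client stubbed so it runs
// without an API key. All names and the system string are assumptions.
type StreamEvent = { type: "text"; text: string };

interface ClientLike {
  stream(opts: { system: string; prompt: string }): AsyncIterable<StreamEvent>;
}

// Forward every text delta straight to the caller as it arrives.
async function* quickMode(client: ClientLike, prompt: string) {
  const events = client.stream({
    system: "Return a complete single-file HTML app. Inline all CSS and JS.",
    prompt,
  });
  for await (const e of events) {
    if (e.type === "text") yield e.text;
  }
}

// Stub standing in for the real streaming model call.
const stub: ClientLike = {
  async *stream() {
    yield { type: "text", text: "<!DOCTYPE html>" };
    yield { type: "text", text: "<html></html>" };
  },
};

(async () => {
  let out = "";
  for await (const chunk of quickMode(stub, "a quiz with one question per screen")) {
    out += chunk;
  }
  console.log(out.startsWith("<!DOCTYPE")); // true
})();
```

Swapping the stub for a real streaming client changes nothing about the loop: the system prompt pins the single-file target, and the generator yields deltas to whatever is painting the preview.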

Does this work the same way for full React projects?

Not quite. The single-file rule only applies to the streaming HTML mode. The VM mode boots a real Vite plus React project in a sandbox and a Claude agent edits files inside it. The matching rule changes from 'one file, one shape' to 'one file at a time, with a browser open you can watch.' The constraint is still about narrowing what can change between turns, just at a different scope.

How do I recover when the first output drifts hard from the prompt?

Do not write a paragraph trying to fix everything. Identify the one most-wrong thing, name it specifically, send a one-sentence change. 'Make the buttons square instead of rounded.' 'The streak count should reset on day skipped, not just decrement.' Each follow-up sentence edits the same project, so the matched parts stay matched and only the named part moves. Trying to course-correct with a long edit prompt usually replaces correct output with new errors.
