Build App From Text Prompt: What Actually Happens Between Send and Render
Every guide on this topic stops at the same sentence: "a large language model interprets your intent and generates code." That is the boring part. The interesting part is what the model reads first, before it writes a single character of your app. On mk0r, that first read is a 320-line opinionated CLAUDE.md file that sits in the open source repo. This page walks through it.
Direct answer (verified 2026-04-29)
Building an app from a text prompt on mk0r is four things in sequence: a pre-warmed Linux sandbox catches your sentence; an agent reads its system prompt, which describes a Vite plus React project layout, a running dev server, browser-test tools, and explicit design constraints; the agent writes files into /app/src on a real filesystem; and the dev server hot-reloads each change into a browser frame you watch update in real time. The system prompt is in the open source repo at src/core/vm-claude-md.ts.
Step 1: a sandbox is already waiting
The first thing that happens after you type a sentence is also the first thing that already happened, before you typed anything. When mk0r.com loaded, the React tree fired a fire-and-forget POST to /api/vm/prewarm. That route tops up a Firestore collection called vm_pool, keeping at least one fully booted E2B sandbox ready: Node, Vite, React, TypeScript, Tailwind v4, Chromium, and the Playwright MCP server are already running.
Pressing send runs a Firestore transaction that pops one of those entries and binds it to your session. The full latency from keystroke-to-claim is a single transaction plus an ACP re-initialize, not a five-second cold boot.
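The claim step can be sketched in memory. The real system does this inside a Firestore transaction; the entry shape and function name below are invented for illustration:

```typescript
// Hypothetical in-memory sketch of the pool-claim step. In production this
// runs inside a single Firestore transaction so two sessions can never
// claim the same sandbox.
type PoolEntry = { sandboxId: string; claimedBy: string | null };

function claimSandbox(pool: PoolEntry[], sessionId: string): string | null {
  // Find the first unclaimed entry and bind it to the session.
  const entry = pool.find((e) => e.claimedBy === null);
  if (!entry) return null; // pool empty: fall back to a cold boot
  entry.claimedBy = sessionId;
  return entry.sandboxId;
}
```

The transactional version has the same shape: read the pool, pick one free entry, mark it claimed, commit; a conflicting concurrent claim retries.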
We wrote a deeper page about why the no-signup choice is what unlocks this parallelism: Instant AI App Builder. The short version: a builder that gates behind signup cannot warm sandboxes for anonymous visitors, so it pays a serial cold boot every time.
Step 2: the agent reads its system prompt
Inside the sandbox, the Claude agent is wrapped in the agent client protocol (the @agentclientprotocol/claude-agent-acp@0.25.0 package, pinned). Before your sentence arrives, the agent has already received a system prompt. There are two layers worth calling out, and both ship in the open source repo.
The short bootstrap prompt is in src/core/e2b.ts at line 148, exported as DEFAULT_APP_BUILDER_SYSTEM_PROMPT. It tells the agent the project layout, the dev server port, and that Playwright MCP is available for browser testing. It also instructs the agent to read its CLAUDE.md files for detailed workflow.
That CLAUDE.md is the second layer. The actual content is in src/core/vm-claude-md.ts as the globalClaudeMd export, around 320 lines of explicit rules. It covers the agent's role, a memory protocol (when to save user preferences, project context, and corrections), workflow rules, code-quality rules, copywriting rules, and a long block of design constraints.
"Never default to Inter, Roboto, Arial, or system fonts. These are the hallmark of generic AI output."
src/core/vm-claude-md.ts, around line 144
That sentence is what makes the system prompt unusual. Most prompt-to-app services give their model a permissive instruction ("build a beautiful modern app") and ship whatever average-of-the-internet output comes out. The mk0r agent gets a specific list of things not to do, and a list of aesthetic directions to commit to before writing any code.
The five anti-slop rules baked into the system prompt
Every one of these is verbatim from globalClaudeMd. None of them are novel taste; the unusual part is that they are instructions to a code-generating agent, not a designer brief.
Three colors maximum
Black, white, and one accent. No secondary or tertiary accents. The agent picks one Tailwind palette color (teal-500, blue-500, etc.) and uses it consistently for buttons, links, and highlights.
Never Inter, Roboto, or Arial
Verbatim from globalClaudeMd: 'These are the hallmark of generic AI output.' The agent picks a distinctive display font and pairs it with a refined body font from Google Fonts.
No decorative icons
No icons on feature cards, section headers, list items, or how-it-works steps. Functional only: chevrons, close buttons, status indicators, navigation arrows. No icon library is installed by default.
No purple-indigo gradients on white
Listed by name in the Anti-Patterns block as a 'cookie-cutter layout that could be any app.' Same block forbids uniform rounded card grids, generic centered hero sections, and Lorem Ipsum left in place.
Pick an aesthetic first
Before writing any UI code, the agent commits to a tone: brutally minimal, editorial, retro-futuristic, soft pastel, industrial, playful, luxury, organic. The point is intentionality, not which aesthetic.
You can verify all five by cloning the repo and reading the file. The Design Constraints section starts around line 97. The Anti-Patterns block is around line 168.
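The font rule is mechanical enough to check programmatically. This is a hypothetical lint sketch, not part of the mk0r codebase, showing what "never Inter, Roboto, or Arial" means when applied to generated CSS:

```typescript
// Hypothetical check: flag generated CSS that falls back to the fonts the
// system prompt bans. Not part of mk0r; purely illustrative.
const BANNED_FONTS = ["Inter", "Roboto", "Arial"];

function findBannedFonts(css: string): string[] {
  return BANNED_FONTS.filter((font) =>
    // match the font name inside a font-family declaration
    new RegExp(`font-family[^;]*\\b${font}\\b`, "i").test(css)
  );
}
```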
What this changes about the output
A constrained system prompt does not guarantee a good app. It guarantees a different distribution of outputs. Take one sentence and look at what most builders ship for it.
One sentence: 'a habit tracker for morning routines'
Most builders ship a centered hero with an indigo-to-purple gradient, a uniform 3x3 card grid with a colored emoji on each card, a system font (usually Inter), and a CTA button with a glow. Every section has the same border radius. The accent color is whatever the model leaned toward this week.
- Inter or system fallback font
- Indigo/purple hero gradient
- Identical rounded card grid
- Decorative emoji on every section
Step 3: files land on a real filesystem
The agent runs as a subprocess inside the same VM as the Vite dev server. There is no remote API translating between the agent and the project; the agent's Write and Edit tool calls hit the real filesystem at /app/src. Vite is already watching that directory. The moment a file changes, the dev server fires an HMR update over a websocket and your browser swaps the changed module in place. End-to-end from file write to repaint is on the order of tens of milliseconds.
This is why iterating feels different from re-prompting a chat model. When you say "make the buttons rounded", the agent reads the actual current contents of App.tsx on disk, not a stale transcription of its previous output. The change is one tool call against a live filesystem the next prompt can also see.
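An Edit-style tool call amounts to an exact-match string replacement against the file's current on-disk contents. The uniqueness check below is an assumption about how such tools typically behave, not a quote from the mk0r code:

```typescript
// Sketch of an Edit-style tool call applied to a file's current contents.
// The uniqueness requirement is an assumed behavior: an ambiguous match
// should fail loudly rather than edit the wrong occurrence.
function applyEdit(content: string, oldStr: string, newStr: string): string {
  const first = content.indexOf(oldStr);
  if (first === -1) throw new Error("old_string not found in file");
  if (content.indexOf(oldStr, first + 1) !== -1) {
    throw new Error("old_string is not unique; provide more context");
  }
  return content.slice(0, first) + newStr + content.slice(first + oldStr.length);
}
```

The important property is that `content` is read from disk at edit time, so the agent is always patching what is actually there, not what it remembers writing.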
Step 4: the browser frame catches up
The same VM also runs Chromium with a Playwright MCP server attached, and the browser is pointed at the dev server. Two things use it. First, you see what is happening: the preview frame on mk0r.com is a CDP screencast of the agent's tab, so as HMR fires, you watch the layout change. Second, the agent uses it to verify its own work. After UI changes, the system prompt directs it to navigate to localhost:5173, take a snapshot, check console messages for runtime errors, and not report completion until the browser shows the expected result.
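That self-check loop can be sketched against an abstract browser interface. The interface below stands in for the Playwright MCP tools; the real tool names and signatures may differ:

```typescript
// Hypothetical stand-in for the Playwright MCP tools the agent calls.
interface Browser {
  navigate(url: string): void;
  snapshot(): string;        // accessibility/DOM snapshot as text
  consoleErrors(): string[]; // runtime errors logged since navigation
}

// The verification the system prompt prescribes: load the page, require a
// clean console, require the expected content to actually be visible.
function verifyUi(browser: Browser, expectedText: string): boolean {
  browser.navigate("http://localhost:5173");
  if (browser.consoleErrors().length > 0) return false; // runtime error: not done
  return browser.snapshot().includes(expectedText);
}
```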
We wrote about that self-test loop in more depth on the AI App Generator page. The relevant fact for this page is that it is part of the same session: the agent finishes the file edits and the browser check inside one prompt-to-response, not as a separate review step.
What "text prompt" can actually be
The input is a string, but it does not have to come from your keyboard. mk0r's landing page wires up a Web Speech API recorder that POSTs the audio to /api/transcribe. That route forwards to Deepgram's Nova-2 model with smart formatting and punctuation, returns a string, and feeds that string into the same prompt path as the keyboard does. Quick mode and VM mode both accept it.
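On the server side, the transcribe route has to pull one string out of Deepgram's nested response before handing it to the prompt path. The nesting below follows Deepgram's documented prerecorded-response shape; the type and function names are invented for illustration:

```typescript
// Minimal sketch: extract the transcript string from a Deepgram-style
// prerecorded response so it can enter the same prompt path as typed input.
type DeepgramResponse = {
  results: { channels: { alternatives: { transcript: string }[] }[] };
};

function extractTranscript(res: DeepgramResponse): string {
  // First channel, best alternative; empty string if anything is missing.
  return res.results.channels[0]?.alternatives[0]?.transcript ?? "";
}
```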
So the practical answer to "build app from text prompt" is: type or speak. The first turn is one sentence. Iteration is also one sentence. The agent keeps the conversation, the file tree, and the browser state across turns.
The honest limits
The system prompt makes the output not look like AI slop. It does not make the output bug-free, and it does not turn a one-sentence description into a production app. A few real limits:
- State and persistence. The default project is client-side React. If your prompt implies real auth, a real database, or real payments, you are still going to need to plumb those in. The agent can do it (the system prompt has a Backend Services skill), but a one-sentence prompt will not.
- Native iOS or Android. The output is a web app. Mobile-first by default, but web. If you need an App Store binary, this is not the tool.
- Ambiguous prompts. The system prompt tells the agent to ask clarifying questions, but on the first turn the agent will usually guess and ask in the same reply. If you want it to ask first, say so.
- Complex multi-screen flows. A todo list, a landing page, a dashboard prototype: yes. A multi-screen real-time collaboration tool with presence and conflict resolution: not from one sentence.
The general shape: anything that fits in a Vite plus React app with a small amount of state and a clear UI is in range. Anything that needs serious backend logic is in range, but you will be iterating with the agent for more than one turn.
Why the system prompt being open source matters
Most prompt-to-app builders treat their system prompt as a moat. You ship a sentence, you get an app, you have no idea what the agent was told to do or not do. If you do not like the output, you re-prompt and hope.
mk0r's system prompt is in src/core/vm-claude-md.ts in the open source repo (github.com/m13v/appmaker). You can read it, fork the repo, change the rules, and run the result. If you want a builder that defaults to dark mode, swap the rule that forbids dark backgrounds. If you want it to favor Material UI instead of plain Tailwind, swap the styling section. The output distribution shifts in a way you can predict.
That is not a moat, but it is a different bet: the system prompt and the surrounding sandbox are infrastructure, the model makes choices the system prompt narrows, and the user gets to inspect both layers. Every other prompt-to-app guide on this topic skips this entirely, which is why this page exists.
Want to walk through your idea live?
Bring a sentence. We will open mk0r, build it, and talk through where the system prompt helps and where you will need to iterate. Twenty minutes.
Frequently asked questions
When I type one sentence, what is the very first thing that happens server-side?
Before you press send, the landing page already fired a fire-and-forget POST to /api/vm/prewarm on mount. That route tops up a Firestore-backed pool of pre-booted E2B sandboxes. Each pool entry is a real running Linux box with Vite, React, TypeScript, Tailwind, Chromium, and Playwright MCP already initialized. When your sentence arrives, the server runs a Firestore transaction that atomically claims one of those pool entries for your session, then the sentence flows into an open ACP (agent client protocol) channel as the first user message of that session. There is no LLM call yet.
What system prompt does the agent actually receive?
Two layers. The short one is in src/core/e2b.ts at line 148 and reads: 'You are an expert app builder inside an E2B sandbox with a Vite + React + TypeScript + Tailwind CSS v4 project at /app. The dev server is running on port 5173 with HMR. You have Playwright MCP for browser testing. Read your CLAUDE.md files for detailed workflow, coding standards, and memory instructions.' The longer one is the CLAUDE.md file dropped into /app on first boot. That file (src/core/vm-claude-md.ts, the globalClaudeMd export) is around 320 lines of explicit rules: copywriting principles, design constraints, browser testing rules, and a memory protocol. Both ship with every session.
What is in the design-constraint section that other builders skip?
The verbatim rules are: three colors maximum (black, white, one accent), no decorative icons, no purple or indigo gradients on white backgrounds, no uniform rounded card grids, no Inter or Roboto or Arial as the default font, no exclamation points in copy, and a directive to pick a specific aesthetic (brutally minimal, editorial, retro-futuristic, soft pastel, industrial) before writing any UI code. None of this is novel taste; it is just unusual to bake into a builder's system prompt. Most prompt-to-app systems give the model a permissive 'build a beautiful modern app' instruction and ship whatever comes out.
Where do I see the actual file path so I can verify these rules myself?
src/core/vm-claude-md.ts in the mk0r repo. The relevant export is globalClaudeMd, starting at line 19. The Anti-Patterns block is around line 168. The font rule is at line 144. The repo is open source, so a clone and a grep is enough.
How does the agent actually write files into the project?
The agent runs as a subprocess inside the same VM as the Vite dev server. The agent client protocol (claude-agent-acp v0.25.0) exposes the local /app directory as the working directory. Tool calls like Write and Edit hit the real filesystem. Vite is watching /app/src; an HMR update fires the moment a file lands. The end-to-end path from 'agent writes App.tsx' to 'browser repaints' is a websocket message and a JS module replacement, on the order of tens of milliseconds.
Does the agent only work in one shot, or can I iterate with follow-up prompts?
The session stays open. Every follow-up sentence is appended to the same conversation. The agent has the full prior turn history and the live filesystem state. So 'make the buttons rounded' lands on top of a real /app directory the agent can read, not a fresh transcription of the previous output. This is why the iteration loop feels closer to talking to a developer than re-prompting an LLM.
What is the failure mode when my sentence is ambiguous?
The system prompt explicitly tells the agent to ask clarifying questions when context would improve the work, and lists examples: 'What is the primary use case?', 'Do you have a color scheme in mind?', 'Is this for a technical audience?'. In practice the agent guesses on the first turn (so something appears fast) and asks the clarifying question in its reply text. You can answer the question in your next turn and the agent will revise.
Does this work for non-technical users, or do I need to know what HMR and Vite are?
It works for non-technical users. You type a sentence, you watch a browser frame populate, you click around. The Vite and HMR plumbing is invisible. The reason this page goes deep is that the question 'how does prompt-to-app actually work' is interesting whether or not you ever look at the file paths.
Where is the line between Quick mode and VM mode?
Quick mode streams a single self-contained HTML file straight from a smaller model. Good for a calculator, a meme generator, a one-page tool. No sandbox, under thirty seconds. VM mode boots (or claims) an E2B sandbox with the Vite plus React stack and the full agent loop described above. mk0r picks the mode based on the prompt; you can also force one. VM mode is what unlocks the iterate-with-words behavior, because there is a real filesystem to keep editing.
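mk0r's actual routing logic is not shown in this document, so the following is purely an invented sketch of the kind of heuristic a mode picker might use, with made-up signal words:

```typescript
// Invented heuristic, NOT mk0r's real routing logic: guess whether a prompt
// needs the full VM or can be served by Quick mode's single HTML file.
function pickMode(prompt: string): "quick" | "vm" {
  const vmSignals = ["database", "auth", "login", "dashboard", "multi", "api"];
  const wantsVm = vmSignals.some((s) => prompt.toLowerCase().includes(s));
  return wantsVm ? "vm" : "quick";
}
```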
Is this open source so I can read the prompts myself?
Yes. The repo path is github.com/m13v/appmaker. The system prompt files are at src/core/vm-claude-md.ts (the longer CLAUDE.md content, around 320 lines) and src/core/e2b.ts at line 148 (the short bootstrap prompt). Read them and you will see exactly what the agent sees on its first turn, before you have typed anything.