Argument

AI builds apps that skip the middle

Most AI app builder pages describe what the model generates. The more interesting story is what the model does not have to generate, because it was baked into the image and the warm pool before anyone showed up. Seven steps, with file paths.


Matthew Diakonov, Written with AI

Published May 7, 2026 (8 min read)

Direct answer (verified 2026-05-07)

An AI app builder "skips the middle" by pre-baking everything between "you typed a sentence" and "the app is running" into the sandbox image and a warm pool, so the model only writes the part that is unique to your idea. On mk0r the middle is exactly seven things: a Vite scaffold, the dev server, the HMR bridge, the Playwright runner, the git repo, the framework decision, and four service env vars. All seven live in the image (template id 2yi5lxazr1abcs2ew6h8) or are provisioned in parallel before the first model turn.

The middle is real, and it is specific

When a Reddit thread says "AI built me an app and it skipped the middle," that is not a feeling. It is a measurement. The middle is a list of concrete steps every AI app builder has to deal with somehow, and the choice each one makes is whether to do them at request time or before the user arrives.

Here is the version that runs on most tools that ask for a login first. Every step is a wait or a context switch. Most ideas do not survive seven of them in a row.

The middle, on a typical AI app builder

Account: email + password, sometimes a verification round trip
Pick stack: framework, language, styling, package manager
Scaffold: generate or download a starter project
Install deps: npm install, sometimes 2 minutes
Boot server: start dev server and wait for first paint
Wire HMR: hot reload bridge from server to preview
Add services: database, analytics, email, repo creation

That figure is the part nobody puts on a landing page, because nobody wants to admit it is there. The interesting question is what each step looks like when you decide it has to run before the user clicks anything.

The single sentence that pins the stack

On mk0r the framework decision is one constant in one file. src/core/e2b.ts, lines 170 through 171. The variable is DEFAULT_APP_BUILDER_SYSTEM_PROMPT.

export const DEFAULT_APP_BUILDER_SYSTEM_PROMPT =
  "You are an expert app builder inside an E2B sandbox " +
  "with a Vite + React + TypeScript + Tailwind CSS v4 " +
  "project at /app. The dev server is running on port " +
  "5173 with HMR. You have Playwright MCP for browser " +
  "testing. Read your CLAUDE.md files for detailed " +
  "workflow, coding standards, and memory instructions.";

That sentence is doing real work. It is telling the model the framework, the language, the bundler, the styling, the port, the hot reload, and the test runner. The model never picks a stack, never types npm create, never waits for a dependency tree. It opens /app/src/App.tsx, reads the CLAUDE.md sitting next to it, and writes the part of the project that depends on what the user actually asked for.

Removing the framework decision from the request path is the single highest leverage move. Every other middle step fans out from this one.

What the image already contains

The image lives at docker/e2b/e2b.Dockerfile. It is a thin wrapper over a few static COPYs and one meaningful build step:

WORKDIR /app
RUN npm create vite@latest . -- --template react-ts \
 && npm install \
 && npm install -D tailwindcss @tailwindcss/vite

Read that twice. By the time the image is published, the following things already exist on disk inside the box: package.json, node_modules/, the Vite config, the React entry point, Tailwind, and the custom HMR bridge file src/_mk0rBridge.ts that posts message events to the parent iframe so the preview can listen for vite:beforeUpdate, vite:afterUpdate, and reload events.

The image also globally installs three things every agent needs: @playwright/mcp for browser testing, @agentclientprotocol/claude-agent-acp for the agent bridge, and @anthropic-ai/claude-code. Plus Chromium, ffmpeg, Xvfb, x11vnc, and websockify so the box can stream a real browser back to the user. None of that runs at the user's prompt time. All of it is sediment.

The dev server starts the moment the sandbox boots, not the moment the agent decides it should exist. By the time the first prompt lands, port 5173 is already serving an empty React app, the iframe has already loaded it, and the bridge is already posting messages back.
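To make the bridge idea concrete, here is a minimal sketch of what a file like src/_mk0rBridge.ts could look like. It is not the actual file: the function name installBridge, the message shape, and the use of vite:beforeFullReload for the "reload" case are assumptions. Only the Vite HMR event names themselves are real. The two small interfaces stand in for import.meta.hot and window.parent so the logic can run outside a browser.

```typescript
// Hypothetical sketch of an HMR-to-parent bridge. The message shape and the
// function name are assumptions; only the Vite event names are real.

// Minimal slice of Vite's import.meta.hot that the bridge needs.
interface HotLike {
  on(event: string, cb: (payload: unknown) => void): void;
}

// Minimal slice of window.parent that the bridge needs.
interface ParentLike {
  postMessage(message: unknown, targetOrigin: string): void;
}

type BridgeMessage = { source: "hmr-bridge"; event: string };

const FORWARDED_EVENTS = [
  "vite:beforeUpdate",
  "vite:afterUpdate",
  "vite:beforeFullReload",
] as const;

export function installBridge(
  hot: HotLike | undefined,
  parent: ParentLike,
  targetOrigin = "*", // a production bridge should pin the preview host origin
): void {
  if (!hot) return; // import.meta.hot is undefined outside dev mode
  for (const event of FORWARDED_EVENTS) {
    // Forward only the event name; the HMR payload itself stays in the iframe.
    hot.on(event, () => {
      const message: BridgeMessage = { source: "hmr-bridge", event };
      parent.postMessage(message, targetOrigin);
    });
  }
}

// In a Vite app this would be wired as:
//   installBridge(import.meta.hot, window.parent);
```

The parent page would then add a message listener and filter on the source field, which is how a preview can show "updating" and "updated" states without polling the dev server.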

The pool that boots before you do

An image alone does not skip the middle. A sandbox still has to boot, ACP (Agent Client Protocol, the bridge to the agent) still has to initialize, and the session still has to register. That stack of operations takes about ten to thirty seconds. If you wait until the user submits their prompt, the user feels every second of it. So mk0r does not wait.

The landing page fires this on mount, in src/app/(landing)/page.tsx around line 63:

useEffect(() => {
  fetch("/api/vm/prewarm", { method: "POST" }).catch(() => {});
}, []);

That fire-and-forget POST hits a route that calls prewarmSession() in src/core/e2b.ts. The function boots a sandbox from the template, runs the ACP initialize, runs session/new with the same hardcoded system prompt, sets the model to haiku, and writes a "ready" document into the vm_pool collection in Firestore. The entire stack of setup runs while the user is reading the textarea.

When the user finally clicks submit, the request runs a Firestore transaction inside claimPrewarmedSession that atomically deletes one ready document and hands its metadata back to the request. The user's session is now bound to a sandbox that has been alive long enough that the dev server's first paint is already in the iframe.
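The prewarm and claim steps above can be sketched with an in-memory stand-in. This is not the Firestore code: the WarmPool class, its method names, and the ReadySandbox fields are hypothetical, and a plain array replaces the vm_pool collection. What it does show is the contract that matters, namely that a claim reads and removes one ready entry in a single step, so two requests are never handed the same sandbox.

```typescript
// In-memory stand-in for the warm pool. The real pool is described as a
// Firestore collection; this class and its field names are hypothetical.

interface ReadySandbox {
  sandboxId: string;
  acpSessionId: string;
  readyAt: number; // epoch millis when prewarm finished
}

export class WarmPool {
  private ready: ReadySandbox[] = [];

  // Called at the end of a prewarm, once the sandbox has booted and the
  // agent session has been initialized.
  markReady(entry: ReadySandbox): void {
    this.ready.push(entry);
  }

  // Claim = take the oldest ready entry and remove it in the same step.
  // Here the step is atomic because shift() is synchronous; in a database
  // the same guarantee has to come from a transaction.
  claim(): ReadySandbox | null {
    return this.ready.shift() ?? null; // null means: fall back to a cold boot
  }

  get size(): number {
    return this.ready.length;
  }
}

// Usage: prewarm twice, then serve two requests and see the pool drain.
const demoPool = new WarmPool();
demoPool.markReady({ sandboxId: "sbx-1", acpSessionId: "acp-1", readyAt: Date.now() });
demoPool.markReady({ sandboxId: "sbx-2", acpSessionId: "acp-2", readyAt: Date.now() });
const firstClaim = demoPool.claim();  // sbx-1
const secondClaim = demoPool.claim(); // sbx-2
const thirdClaim = demoPool.claim();  // null, pool is empty
```

The null branch is the design point worth noticing: a warm pool is an optimization, so the request path still needs a cold-boot fallback for the case where traffic outruns the prewarmer.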

The number that matters

Roughly 0s of sandbox boot, ACP initialize, and Vite first paint. Each one of them happens before the user has finished typing the prompt, not after.

The four services that do not need a signup form either

Stack and server are the obvious middle. Services are the one almost nobody talks about. If your app needs a database, an analytics key, a transactional email sender, or a git repo, you usually need four separate accounts on four different platforms. mk0r treats all four as middle and provisions them in parallel from src/core/service-provisioning.ts:

One Promise.allSettled over the three network calls, one synchronous PostHog step, and four sets of env vars dropped into /app/.env before the agent's first turn runs.

Pre-provisioned per app, not per user

PostHog: analytics project ID and key written to .env at first prompt
Neon: Postgres database created via API, DATABASE_URL injected
Resend: transactional email key, audience ID per app
GitHub: private repo created with a deploy token in the env

The function that provisions the four services is provisionServices(), around line 277. Its parallel part reads:

const results = await Promise.allSettled([
  provisionResend(sessionKey, name),
  provisionNeon(sessionKey, name),
  provisionGitHub(sessionKey, name),
]);

PostHog is synchronous because it reuses a shared project and just mints an app id. The other three are real network calls to real APIs that create a real audience, a real Postgres database, and a real GitHub repo. The agent does not orchestrate them. The orchestrator runs them while the first model turn is still in flight, then writes the resulting env vars to /app/.env so the agent can read them with process.env the moment it actually needs persistence.
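What happens after Promise.allSettled returns is not shown in the article, so the following is only a sketch of one reasonable way to fold the settled results into .env content. The helper names mergeSettled, toDotEnv, and provisionAll are hypothetical, as are the GITHUB_REPO variable and the stand-in providers. The point it illustrates is why allSettled fits here: one failing provider yields a missing env var, not a failed session.

```typescript
// Sketch of folding Promise.allSettled results into .env content.
// Provider stand-ins, env var names, and helper names are illustrative.

type EnvVars = Record<string, string>;

// Merge every fulfilled result and collect the reasons for rejected ones,
// so a single failing provider does not block the others.
export function mergeSettled(results: PromiseSettledResult<EnvVars>[]): {
  env: EnvVars;
  failures: string[];
} {
  const env: EnvVars = {};
  const failures: string[] = [];
  for (const result of results) {
    if (result.status === "fulfilled") {
      Object.assign(env, result.value);
    } else {
      failures.push(String(result.reason));
    }
  }
  return { env, failures };
}

// Serialize to KEY=value lines, one per variable, sorted for stable output.
export function toDotEnv(env: EnvVars): string {
  return Object.keys(env)
    .sort()
    .map((key) => `${key}=${env[key]}`)
    .join("\n");
}

// Usage with stand-in providers; a real provider would call an external API.
export async function provisionAll(): Promise<string> {
  const calls: Promise<EnvVars>[] = [
    Promise.resolve({ DATABASE_URL: "postgres://example" }),
    Promise.reject(new Error("email provider unavailable")),
    Promise.resolve({ GITHUB_REPO: "example/app" }),
  ];
  const results = await Promise.allSettled(calls);
  return toDotEnv(mergeSettled(results).env);
}
```

Values containing spaces or newlines would need quoting before being written to a real .env file; the sketch skips that to stay short.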

What the model is left with

Once you remove the middle, the agent's job collapses to a much smaller surface. It does not pick a framework. It does not boot a server. It does not install packages (except the few specific to the user's idea). It does not wire HMR. It does not create a database, an email sender, an analytics key, or a repo. It does not run the first git commit either; that is done in ensureSessionRepo around line 670 of e2b.ts the moment a session is bound.

What it does is open /app/src/App.tsx and the small set of component files in /app/src/components, read the CLAUDE.md that lives next to them (which lists the env vars that exist and the conventions of the project), and write the part that is unique to the user's description. Then it watches its own work in the iframe via the bridge messages and iterates.

That is the spec for "skips the middle." Not fewer keystrokes from the user, fewer keystrokes from the model. The model is a faster typist than any human, but it is still being charged per token, and every token it spends re-deriving the framework decision is a token it does not spend on the user's actual idea.

The honest cost of skipping the middle

Pre-baking the middle is opinionated. If your idea is a SwiftUI iOS app, mk0r is the wrong tool. If you want SvelteKit, wrong tool. If you need a Laravel backend, wrong tool. The whole point of pinning the framework in a constant in a single file is that the constant is sharp. You cannot have a fast first hour and a flexible framework at the same time, and the design picked the first hour.

The other admission: nothing about the pre-baked middle helps the second hour. Once your prototype is past the point where the agent can keep the whole thing in its head (a few hundred lines, a handful of components), you start hitting state management, real auth, multi-screen navigation, the dozens of small architectural choices that are the actual skill of building an app. mk0r's opinion is that those steps belong in a real engineering process, not in a chat textarea, and the moment you reach them you should be exporting your code to a real repository and continuing in a real editor.

The contract is honest in both directions. The first hour is genuinely faster, by minutes. The second hour is not cheaper than anywhere else. Skipping the middle is a tool for ideas that need to exist quickly so their owner can decide whether to invest more, not for projects that already know they want to ship.


Frequently asked questions

What does "the middle" actually mean in an AI app builder?

The middle is everything between "a person typed a sentence" and "a working app is on screen." On most builders that includes: a login form, a framework picker, a project name, a starter repo download, an npm install, a dev server boot, an HMR bridge, a preview server, and a sequence of API calls to set up a database and analytics. Each step is a tiny choice or a small wait. Together they are the reason most no-code or AI builder demos look smooth on a recorded video and feel slow when you actually try them. Skipping the middle does not mean removing any one step. It means moving every step that does not depend on the user's idea out of the request path and into a pre-built image.

Where is the system prompt that hardcodes the stack?

It is one sentence in src/core/e2b.ts at lines 170 to 171. The constant is DEFAULT_APP_BUILDER_SYSTEM_PROMPT. It tells the agent: "You are an expert app builder inside an E2B sandbox with a Vite + React + TypeScript + Tailwind CSS v4 project at /app. The dev server is running on port 5173 with HMR. You have Playwright MCP for browser testing." That is it. The model does not ask the user which framework, does not pick a port, does not boot a server, does not install Tailwind. Every one of those decisions is already a fact about the box the agent is sitting in. The agent's job is to fill src/App.tsx and a few component files. The rest of the project is the floor under it.

What is in the sandbox image before the agent ever runs?

The Dockerfile lives at docker/e2b/e2b.Dockerfile. At build time it runs `npm create vite@latest . -- --template react-ts` inside /app, then `npm install`, then installs Tailwind v4 and the @tailwindcss/vite plugin, then overlays our own vite.config.ts, src/_mk0rBridge.ts (the HMR bridge that talks back to the parent iframe), and src/index.css. It also globally installs @playwright/mcp, @anthropic-ai/claude-code, and the agent client protocol bridge. By the time the image is published with template id 2yi5lxazr1abcs2ew6h8, the box has Node 20, Chromium, Xvfb + x11vnc + websockify for screen sharing, the Vite dev server ready to run on 5173, and a Playwright MCP server ready to drive the browser. The model writes none of that. It is sediment.

How does the dev server end up running before the user types?

Two stacked tricks. First, the image starts the proxy and Xvfb on boot, so the box is ready to serve. Second, the landing page fires fetch("/api/vm/prewarm", { method: "POST" }) on mount in src/app/(landing)/page.tsx around line 63. The prewarm route calls prewarmSession() in src/core/e2b.ts which boots a sandbox from the template, runs ACP initialize, runs ACP session/new with the same hardcoded system prompt, and writes a "ready" doc into a Firestore pool. When the user submits their first prompt, claimPrewarmedSession() runs a Firestore transaction that pops one ready sandbox out of that pool and binds it to the session UUID. The dev server has been running for thirty seconds before the user finishes typing.

What about the database, analytics, email, and the repo?

Those are also middle. They live in src/core/service-provisioning.ts. The provisionServices() function provisions four services. provisionPostHog is synchronous and reuses the shared project. The other three run in parallel under Promise.allSettled: provisionResend (creates an audience and a sending API key), provisionNeon (creates a Postgres project and returns a connection string), and provisionGitHub (creates a private repo and a deploy token). The result is a flat object of env vars the orchestrator writes into /app/.env on the sandbox before the first model turn runs. The CLAUDE.md the agent reads tells it those env vars exist. The agent reaches for them when the user's idea actually needs persistence or email, not as boilerplate at the top of every app.

Why bake the framework choice instead of letting the model pick?

Because picking a framework is one of the most expensive shapes of "middle" there is. Every framework choice fans out into ten more: bundler, language, styling, package manager, dev server, file conventions, lint rules. If the model picks at runtime, it has to write all of that, install it, wait for it, boot a server, and then start on the user's idea. By committing to Vite + React + TypeScript + Tailwind v4 in the image, every single one of those decisions is already settled. The cost is real: any user who wants Solid, Svelte, Astro, or Vue is in the wrong tool. The benefit is that the maker loop is roughly ten seconds long instead of two minutes.

How is this different from "no signup" as a marketing claim?

No-signup is one of the seven middle steps mk0r removes. The other six are equally load-bearing. You can have no signup and still spend ninety seconds running npm install on every first prompt, in which case the maker loop is still dead. The phrasing "skip the middle" is the better frame because it forces a builder to enumerate the actual steps and then move each one out of the request path: account, scaffold, deps, dev server, HMR bridge, framework choice, services. Each one is concrete, each one has a measurable wait attached to it, and each one is either pre-baked into the image or pre-warmed by a separate process before the user arrives.

Does any of this skip help once the app is past the prototype stage?

No, and the design admits it. The pre-baked middle is tuned for the first hour of an idea, the part of the maker loop where most projects die before they exist. Real auth, real billing, real multi-user persistence, a deploy pipeline that goes somewhere other than a preview iframe, monitoring, alerting: those are not skippable, they are the work. The honest claim is that mk0r removes seven specific steps that should not be in the path between idea and artifact, and stays out of the way for the steps that should. If you outgrow it, you outgrow it. That is the contract.

Where can I see the file that pins the whole thing?

Three files in order. src/core/e2b.ts lines 162 through 220 (the system prompt constant and the MCP servers config). docker/e2b/e2b.Dockerfile (the layered scaffold). src/core/service-provisioning.ts function provisionServices around line 277 (the four parallel API calls). Together those three files are the entire "middle" of mk0r. Everything else in the repo is either UI around them, telemetry, or per-turn git plumbing for undo and redo.

Three pages that look at the same idea from a different angle.

Adjacent reading on this site

Argument

Why Friction Kills the Maker Loop for Hobby Projects

The three classic context switches between idea and artifact, and what concretely replaces each one when the absence of the dashboard is the spec.


Mechanism

AI App Builder, No Signup: What's Actually Running for You

Six exposed ports, a real Chromium with Playwright, a Vite dev server, a Postgres, a Resend key, and a private GitHub repo. All for an anonymous visitor.


Setup

Single File HTML Side Project Setup, Without the Setup

Three operations the builder runs in the background before you pick a project idea. The honest answer to "how do I set up a single file side project."
