Vibe coding, one year in. Five tradeoffs, five constants.
Most one-year retros on vibe coding stay abstract. Speed up, quality down, agents got smarter. Open a shipped AI builder and the tradeoffs are not opinions, they are numbers in the source. This is a walk through the five that ended up encoded in mk0r, with the file and line where each one lives.
Five tradeoffs settled out after one year of vibe coding, each one a constant or a missing route in mk0r's source. Iteration beat one-shot (ANON_TURN_LIMIT = 6, /undo exists, /regenerate does not). Throwaway beat persistent (E2B_TIMEOUT_MS = 1 hour). A one-sentence system prompt plus 2438 lines of CLAUDE.md beat the megaprompt. Generate-then-verify in a real browser beat generate-and-ship. Free-tier defaults beat model choice on the first prompt.
Verified against the appmaker source at src/app/api/chat/route.ts:23, src/core/e2b.ts:33, src/core/e2b.ts:184-185, src/core/vm-claude-md.ts (2438 lines), and src/app/api/chat/model/route.ts:5.
“The system prompt is one sentence. The 2438 lines of rules that used to live in a megaprompt now live in a CLAUDE.md file the agent reads at boot.”
src/core/e2b.ts:184-185 (DEFAULT_APP_BUILDER_SYSTEM_PROMPT) and src/core/vm-claude-md.ts
The year, in one sentence
Andrej Karpathy posted "there's a new kind of coding I call 'vibe coding'" on February 2, 2025. Collins English Dictionary picked it as Word of the Year for 2025. Karpathy now prefers "agentic engineering" for the professional version, but the original meaning held: describe what you want, watch it build, iterate with words.
In the year that followed, every team shipping an AI app builder had to commit to a position on five tradeoffs. Not in a blog post, in a constant. Here is what mk0r ended up with, and the source line that votes for each one.
1. One-shot to iteration
ANON_TURN_LIMIT = 6 (src/app/api/chat/route.ts:23). /undo, /redo, /revert, /history exist. /regenerate does not. The free trial budgets six turns because that is the floor for closing a single iteration loop.
2. Persistent to throwaway
E2B_TIMEOUT_MS = 3_600_000 (src/core/e2b.ts:33). One hour, in milliseconds. The default lifespan is a forcing function so prototypes do not silently become production.
3. Megaprompt to CLAUDE.md
The system prompt is one sentence (src/core/e2b.ts:184-185). The workflow rules live in 2438 lines of CLAUDE.md inside the VM (src/core/vm-claude-md.ts). The agent reads its memory on every turn.
4. Generate to verify
Quick mode streams HTML from claude-haiku-4-5 with no sandbox. VM mode runs Vite + React + Tailwind + Playwright MCP so the agent can open a real browser and check its own work.
5. Choice to default
FREE_MODEL = 'haiku' (src/app/api/chat/model/route.ts:5). The free tier picks the model for you so a first-time visitor never has to comparison-shop on their first prompt.
1. Iteration over one-shot
The first thing the year settled. A single prompt cannot describe a non-trivial app well enough for the model to converge on the first try. The chat API under src/app/api/chat lists every endpoint the client can hit during a session: /undo, /redo, /revert, /history, /cancel, /mode, /model. There is no /regenerate, /generate, or /restart. The route table is the vote.
The trial budget votes the same way. ANON_TURN_LIMIT = 6 at src/app/api/chat/route.ts:23, with the comment two lines above: "Two turns was way too tight, users were getting blocked before they finished their first prompt cycle." Six is the floor for a real evaluation: one to seed, two or three to refine, one to undo a misstep, one to retry. Below that the visitor is judging the product on a first draft alone, which for anything non-trivial is the wrong sample size.
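A minimal sketch of how a turn budget like this could be enforced. Only the constant name and its value come from the source; the `checkAnonTurn` helper and the `TurnDecision` shape are hypothetical, and the real check in route.ts may look quite different.

```typescript
// From the article: the anonymous trial budget.
const ANON_TURN_LIMIT = 6;

// Hypothetical result shape, for illustration only.
interface TurnDecision {
  allowed: boolean;
  remaining: number; // turns left after this one, when allowed
}

// turnsUsed is the number of turns the anonymous session has already completed.
function checkAnonTurn(turnsUsed: number): TurnDecision {
  if (turnsUsed >= ANON_TURN_LIMIT) {
    return { allowed: false, remaining: 0 };
  }
  return { allowed: true, remaining: ANON_TURN_LIMIT - turnsUsed - 1 };
}
```

With a limit of six, the sixth request (five turns already used) is still allowed and the seventh is refused, which is the seed, refine, undo, retry budget described above.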
The substrate that makes the iteration cheap is in src/core/e2b.ts:2073. commitTurn runs git add -A and git commit inside the VM after every successful turn, pushes the SHA onto a historyStack, and persists the session. undoTurn at line 2169 walks the stack back via git checkout and creates a new commit. Undo is itself a commit, so the timeline is never silently rewritten, and a year-old session that paused and resumed still has the full history. Without that substrate, "undo" would mean asking the model to write back the version it produced two prompts ago, which it would get close to but not byte-exact.
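The "undo is itself a commit" idea can be sketched with an in-memory model. This is not the code in e2b.ts, which according to the description above shells out to git inside the VM. The `Timeline` class, its fields, and the fake SHAs are all hypothetical stand-ins; only the names commitTurn, undoTurn, and historyStack are borrowed from the source.

```typescript
// Hypothetical in-memory stand-in for a git history inside the VM.
interface Commit {
  sha: string;
  snapshot: string; // stands in for the working tree contents
  note: string;
}

class Timeline {
  private commits: Commit[] = []; // append-only log, never rewritten
  private historyStack: string[] = []; // SHAs that undo can step back through
  private counter = 0;

  private append(snapshot: string, note: string): Commit {
    this.counter += 1;
    const commit: Commit = { sha: `c${this.counter}`, snapshot, note };
    this.commits.push(commit);
    return commit;
  }

  // Record a successful turn and remember its SHA.
  commitTurn(snapshot: string): string {
    const commit = this.append(snapshot, "turn");
    this.historyStack.push(commit.sha);
    return commit.sha;
  }

  // Restore the previous turn by appending a NEW commit with the old contents.
  undoTurn(): string | null {
    if (this.historyStack.length < 2) return null; // nothing earlier to restore
    this.historyStack.pop(); // drop the turn being undone
    const targetSha = this.historyStack[this.historyStack.length - 1];
    const target = this.commits.find((c) => c.sha === targetSha);
    if (!target) return null;
    return this.append(target.snapshot, `undo to ${targetSha}`).sha;
  }

  head(): Commit | undefined {
    return this.commits[this.commits.length - 1];
  }

  log(): readonly Commit[] {
    return this.commits;
  }
}
```

After two turns and one undo, the log holds three commits and the head carries the first turn's contents exactly. The log only grows, which is what keeps a resumed session's history intact.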
2. Throwaway over persistent (with the option to upgrade)
E2B_TIMEOUT_MS = 3_600_000 at src/core/e2b.ts:33. One hour, in milliseconds. After that the sandbox pauses, and the reaper cleans up pools well before any individual session gets old. The number is a forcing function. Two failure modes show up when prototypes linger: people start treating them as production ("the vibe coding hangover"), and the sandbox accumulates state the agent did not author.
One hour is long enough to run a real demo session and short enough that the next time you open the project, you are explicitly choosing to resume rather than passively inheriting state. There is a reconnect path, so you do not lose history when you come back, but the default is "this expires." The throwaway property only stays throwaway if something forces it.
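As a rough illustration of the arithmetic, here is a sketch of an expiry check built on that constant. Only E2B_TIMEOUT_MS and its value come from the source. The `isExpired` and `timeoutInMinutes` helpers are hypothetical, and the sketch assumes the timeout is measured from a session start timestamp, which the article does not confirm.

```typescript
// From the article: default sandbox lifespan, one hour in milliseconds.
const E2B_TIMEOUT_MS = 3_600_000;

// Hypothetical helper: 3_600_000 ms / 60_000 ms per minute = 60 minutes.
function timeoutInMinutes(): number {
  return E2B_TIMEOUT_MS / 60_000;
}

// Hypothetical helper: true once the session has lived for the full timeout.
// Assumes both arguments are epoch milliseconds, e.g. from Date.now().
function isExpired(startedAtMs: number, nowMs: number): boolean {
  return nowMs - startedAtMs >= E2B_TIMEOUT_MS;
}
```

A session started at time 0 is still live at 3,599,999 ms and expired at 3,600,000 ms. What happens next (pause, reconnect, cleanup) is up to the surrounding system and is not modeled here.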
3. CLAUDE.md over megaprompt
This is the change I didn't expect. A year ago, system prompts for app builders were thousand-line documents trying to anticipate every decision the model would face. Today the system prompt is one sentence:
You are an expert app builder inside an E2B sandbox with a Vite + React + TypeScript + Tailwind CSS v4 project at /app. The dev server is running on port 5173 with HMR. You have Playwright MCP for browser testing. Read your CLAUDE.md files for detailed workflow, coding standards, and memory instructions.
Everything else lives in src/core/vm-claude-md.ts, which holds 2438 lines of CLAUDE.md content written into the VM at boot: /root/.claude/CLAUDE.md (global rules) and /app/CLAUDE.md (project rules). Those files cover color discipline, typography, copywriting, memory format, design anti-patterns, the save-a-memory rules, and the workflow steps. The agent reads these files at the start of every turn.
The lesson, in one line: prompts you tweak per-session belong in the session, and rules you want enforced every time belong in a file the agent reads on startup. The system prompt should hold the role and the boundary. The CLAUDE.md should hold everything you would regret the agent forgetting.
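The split between a short role prompt and rule files can be sketched as follows. The prompt text and the two file paths are quoted from the source. The `BootFile` type, the `buildBootFiles` helper, and the way rule text is passed in are hypothetical; the actual write into the VM in vm-claude-md.ts is not shown in the article.

```typescript
// Quoted from the article: the role and the boundary, nothing else.
const DEFAULT_APP_BUILDER_SYSTEM_PROMPT =
  "You are an expert app builder inside an E2B sandbox with a Vite + React + " +
  "TypeScript + Tailwind CSS v4 project at /app. The dev server is running on " +
  "port 5173 with HMR. You have Playwright MCP for browser testing. Read your " +
  "CLAUDE.md files for detailed workflow, coding standards, and memory instructions.";

// Hypothetical shape for a file to be written into the VM at boot.
interface BootFile {
  path: string;
  content: string;
}

// Hypothetical helper: pairs rule text with the two paths named in the article.
function buildBootFiles(globalRules: string, projectRules: string): BootFile[] {
  return [
    { path: "/root/.claude/CLAUDE.md", content: globalRules },
    { path: "/app/CLAUDE.md", content: projectRules },
  ];
}
```

The design point is that changing a rule means editing file content, not the prompt string, so the prompt can stay stable while the rules grow.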
4. Generate-and-verify over generate-and-ship
The year separated two distinct shapes of vibe coding. Quick mode streams HTML, CSS, and JS straight out of claude-haiku-4-5 into the preview, no sandbox boot, completion in under thirty seconds. Good for a tip calculator, a meme generator, a static tile. There is no verification loop: you get the output as-is.
VM mode runs the agent inside an E2B sandbox with Vite + React + Tailwind + Playwright MCP wired in (system prompt line 3 names the MCP explicitly). The agent can open a real Chromium, navigate to localhost:5173, and check that the empty state renders, the click handler works, and the layout makes sense at a phone width. That verification step is what separates "the model thinks this is right" from "the page actually loads." The cost is a sandbox boot. The value is a check the rest of the year taught us we needed.
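The shape of a verification pass can be sketched without a browser. This is a simplified, synchronous model: the `Check` type and `verify` function are hypothetical, and in a real Playwright session each check would be an asynchronous browser interaction rather than a plain function.

```typescript
// Hypothetical: one named check, e.g. "empty state renders".
type Check = { name: string; run: () => boolean };

// Runs every check and collects the names of the ones that fail.
// A check that throws is treated as a failure, not a crash.
function verify(checks: Check[]): { passed: boolean; failures: string[] } {
  const failures: string[] = [];
  for (const check of checks) {
    let ok = false;
    try {
      ok = check.run();
    } catch {
      ok = false;
    }
    if (!ok) failures.push(check.name);
  }
  return { passed: failures.length === 0, failures };
}
```

The failure list is the useful output: it gives the agent something concrete to fix on the next turn instead of a bare pass or fail.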
5. Default model over model picker
FREE_MODEL = "haiku" at src/app/api/chat/model/route.ts:5. The route refuses any other model selection for unauthenticated or unpaid users on lines 17-24. The free tier picks the model for you on the first prompt.
The tradeoff is real: you cannot benchmark "would this work better with a bigger model" on the free tier. The framing that justifies it is that the first-time visitor is not trying to comparison-shop, they are trying to find out whether the product works at all on their idea. Picking the model removes a decision that they have no information to make. Power users get the picker once they upgrade. That is a different product, and the year taught us not to bundle the two.
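A minimal sketch of the gating rule described above. Only FREE_MODEL and its value come from the source. The `Requester` and `ModelDecision` types, the `selectModel` helper, and the model name used in the usage note are hypothetical and do not reproduce lines 17-24 of the real route.

```typescript
// From the article: the model the free tier is pinned to.
const FREE_MODEL = "haiku";

// Hypothetical view of who is asking.
interface Requester {
  authenticated: boolean;
  paid: boolean;
}

type ModelDecision =
  | { ok: true; model: string }
  | { ok: false; reason: string };

// Unauthenticated or unpaid users may only select FREE_MODEL.
function selectModel(user: Requester, requested: string): ModelDecision {
  const entitled = user.authenticated && user.paid;
  if (!entitled && requested !== FREE_MODEL) {
    return { ok: false, reason: `free tier is limited to ${FREE_MODEL}` };
  }
  return { ok: true, model: requested };
}
```

Under these assumptions an anonymous request for "haiku" succeeds, an anonymous request for a placeholder name such as "bigger-model" is refused, and a paid, signed-in user gets whatever they ask for.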
What did not get solved
An honest year-in-review has to admit what the year did not fix. Three things stayed hard: persistent state across screens, real auth flows, and anything where the failure mode is "wrong answer that looks right." The first two are where vibe-coded apps still hit a wall: the agent can build the screens, but the data model and the auth surface need a human to design them. The third is the "productivity tax" the broader research kept finding: a December 2025 analysis of AI-generated code reported elevated rates of logic errors and configuration mistakes. Faster code is not free code.
None of these are reasons to skip the on-ramp. They are reasons to read the code your agent wrote, especially on the parts that touch money, identity, or data you cannot afford to lose. The people who pair vibe coding with reading the generated diff get further than the people who treat the output as a black box.
The honest summary
A year of vibe coding did not produce a tier list of tools. It produced five product positions every builder had to take, and the position is the integer. mk0r's integers are 6, 3.6 million milliseconds, 1 sentence, 2438 lines, and one default model. Other tools have other integers for the same five questions, and that is fine: the differences are real product choices.
If you want to see all five of these in motion on a real app, you do not need an account. Open mk0r.com, type the app you want, and watch the iteration loop close. The trial gives you six turns. The first turn streams in under thirty seconds.
Want to walk through the five tradeoffs on a real app?
Bring an app idea. We will open mk0r together, iterate it live, and walk through the constants in the source as we go.
Frequently asked questions
What does 'one year of vibe coding' actually mean as a date?
Andrej Karpathy coined 'vibe coding' on February 2, 2025. By the time you're reading this in May 2026, the term is about fifteen months old, and the tooling stack has crossed at least two thresholds: per-turn diff editing replaced full-app regeneration, and agentic coding assistants started shipping with browser test loops. Collins English Dictionary picked it as Word of the Year for 2025. The 'one year in' window is roughly Feb 2025 to early 2026.
Why frame the tradeoffs as constants instead of opinions?
Because every shipped builder has had to pick a number for each tradeoff, and the number is the position. Take iteration vs one-shot. mk0r's anonymous trial sets ANON_TURN_LIMIT = 6 (src/app/api/chat/route.ts:23). That is not an essay, it is a vote: six is the floor below which a curious visitor cannot evaluate the product, because at fewer turns they never close their first iteration loop. The opinion lives in the integer. Every builder you compare to has its own integer for the same question, and the difference between the integers is the difference between the products.
Is one-shot dead? You can still do one-shots in mk0r.
One-shot is not dead, it shrank. The Quick Haiku mode that lives in mk0r exists for one-shot cases: a tip calculator, a meme generator, a static landing tile. It streams HTML, CSS, and JavaScript out of claude-haiku-4-5 in under 30 seconds with no sandbox boot. The honest scope for one-shot is 'small enough to describe in one sentence and you don't care if the result drifts from your sentence.' Anything with multiple screens, state, or a brand voice that has to stay consistent across the app, you will iterate. The trial budget gives you six turns because below that you cannot tell whether iteration works on your problem.
Why is the system prompt only one sentence?
DEFAULT_APP_BUILDER_SYSTEM_PROMPT at src/core/e2b.ts:184-185 reads: 'You are an expert app builder inside an E2B sandbox with a Vite + React + TypeScript + Tailwind CSS v4 project at /app. The dev server is running on port 5173 with HMR. You have Playwright MCP for browser testing. Read your CLAUDE.md files for detailed workflow, coding standards, and memory instructions.' Nothing about color rules, typography, copywriting, memory format, or design constraints lives in the system prompt. All of that lives in 2438 lines of CLAUDE.md (src/core/vm-claude-md.ts) that are written into the VM at boot. The system prompt is the role; the CLAUDE.md is the manual. The lesson from a year of vibe coding is that prompts you tweak per-session belong in the session, and rules you want enforced every time belong in a file the agent reads on startup.
Why is the VM killed after one hour by default?
E2B_TIMEOUT_MS = 3_600_000 lives at src/core/e2b.ts:33 (one hour in milliseconds). The number is a forcing function. A year of vibe coding showed two failure modes for prototypes that linger: people start treating them as production code (the 'vibe coding hangover'), and the underlying sandbox accumulates state that the agent did not author. One hour is long enough to run a real demo session and short enough that the next time you open the project, you are explicitly choosing to resume rather than passively inheriting state. There is a reconnect-after-pause path so you do not lose history, but the default is 'this expires.'
Why is the free tier pinned to one model?
FREE_MODEL = 'haiku' lives at src/app/api/chat/model/route.ts:5. The route refuses any other model selection for unauthenticated or unpaid users (lines 17-24). Two reasons: the free tier still has to converge on a working first draft, and haiku is fast enough that a 30-second preview is realistic. The tradeoff is that you cannot test 'would this work better with a bigger model' without upgrading, which is a real cost for users comparing builders. The honest framing is that the free tier optimizes for 'can a first-time visitor reach a working app' rather than 'can a power user benchmark every model.' Those are different products.
What's the one thing a year of vibe coding taught that you didn't expect?
How much of the work that used to happen in the system prompt now happens in a file the agent reads after boot. A year ago, system prompts for app builders were thousand-line megaprompts trying to anticipate every coding decision. Today the prompt is one sentence and the rules live in CLAUDE.md inside the VM. The shift moved the agent from 'told what to do once' to 'reads its memory at the start of every turn.' That's the unglamorous structural change behind the more visible ones like per-turn commits and inline diffs.
Where does this leave 'just learn to code'?
Mostly unchanged. The set of apps where vibe coding converges fast (simple state, opinionated styling, no compliance surface) grew this year. The set where it does not (regulated data, multi-user real-time, complex auth, anything where the failure mode is 'wrong answer that looks right') did not get smaller. The honest take after one year is that vibe coding is a different on-ramp, not a replacement, and the people who pair it with reading the generated code get further than the people who treat the output as a black box.
Why no /regenerate route in mk0r? Other builders have one.
Because the route table is the position. /undo, /redo, /revert, /history, and /cancel all live under src/app/api/chat/. /generate, /regenerate, and /restart do not. The only way to throw away a project and start fresh is to provision a new VM, which is intentionally not a one-click action. Once you commit a turn (commitTurn at src/core/e2b.ts:2073 runs git add -A and git commit in the VM after every successful turn), the cheapest move is always to iterate on top of it. A regenerate button would undo a year of learning that iteration beats one-shot for anything non-trivial.
The five tradeoffs, each one expanded
Keep reading
Iterate, Don't One-Shot: The Answer Is in the Route Table
Tradeoff number one, expanded. The /undo and /history endpoints, the missing /regenerate, and why the anon trial is 6 turns instead of 1.
Throwaway Prototypes: When the Tool Deletes It For You
Tradeoff number two, expanded. Why E2B_TIMEOUT_MS is 1 hour and how the auto-expire forces the throwaway property to stick.
Where Vibe Coding Hits the Wall: State and Auth
What the year did not solve. Persistent state across screens and real auth flows are still the cliff for vibe coded apps.