Vibe coding limits for non coders: the build is not the limit, the verification gap is

The agent writes the code and checks its own work in a sandboxed Chromium before it reports done. That closes most of the limits generic posts on vibe coding talk about. Four specific things still slip past it. This is what they are, with file paths.

Matthew Diakonov, Written with AI

Published May 13, 2026 · 6 min read

Direct answer, verified 2026-05-13

For a non coder on mk0r, the limit is not writing code. An in-VM agent writes it and self-checks the UI in a sandboxed Chromium before claiming done. The limit is the verification gap: four specific things slip past that self-check. Interaction paths the agent didn't click. Backend writes (email, database, payments) that succeed visually but fail on the wire. The 2-turn anonymous cap before sign in is required. And the 1-hour idle pause when you walk away mid build.

Sources: src/core/vm-claude-md.ts lines 273 to 283 (the verification ritual), and src/core/e2b.ts line 33 (E2B_TIMEOUT_MS).

What the agent actually verifies before it says done

The product configuration file the in-VM agent reads on its first tool call is src/core/vm-claude-md.ts. Under the heading ## Browser Testing on line 273, the agent is told exactly what to do after any UI change. The list below is paraphrased from lines 278 to 283. The loop runs in a real Chromium tab on an Xvfb display inside the same E2B sandbox that holds your code, with a Playwright MCP server attached to it over CDP.

The in-VM agent's 5-step browser test

Navigate: to http://localhost:5173 via Playwright MCP
Snapshot: the rendered DOM to verify structure
Console: check browser_console_messages for errors
Imports: if blank, confirm the component is in App.tsx
Refuse done: until the browser shows the expected result
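
The five steps can be read as a gate over what the browser reported. The sketch below is illustrative only: the BrowserObservation type and the canReportDone function are hypothetical names, not code from src/core/vm-claude-md.ts, and the Playwright MCP calls that would fill in the observation are deliberately left out.

```typescript
// Hypothetical shape for what the agent learned from one browser pass.
// None of these names come from the mk0r source; they only mirror the five steps.
interface BrowserObservation {
  navigated: boolean;         // step 1: the dev server URL loaded
  domMatchesIntent: boolean;  // step 2: the snapshot shows the expected structure
  consoleErrors: string[];    // step 3: errors read from the browser console
  pageIsBlank: boolean;       // step 4 trigger: nothing rendered
  componentImported: boolean; // step 4: the component is wired into App.tsx
}

type Verdict = { done: true } | { done: false; reason: string };

// Step 5: refuse "done" until every earlier check passes.
function canReportDone(obs: BrowserObservation): Verdict {
  if (!obs.navigated) {
    return { done: false, reason: "page did not load" };
  }
  if (obs.pageIsBlank) {
    return {
      done: false,
      reason: obs.componentImported
        ? "blank page, cause unknown"
        : "blank page, component not imported in App.tsx",
    };
  }
  if (obs.consoleErrors.length > 0) {
    return { done: false, reason: `console error: ${obs.consoleErrors[0]}` };
  }
  if (!obs.domMatchesIntent) {
    return { done: false, reason: "rendered DOM does not match the request" };
  }
  return { done: true };
}
```

No field in this sketch records a click or a network response. That is the shape of the gap: the gate can only refuse on things it observed.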

That loop is the difference between mk0r and tools where the agent declares completion based on what it wrote, not what rendered. For a non coder it does the heavy lifting that a developer would do by opening the page and clicking around. Most "vibe coding limits" posts stop here and call it solved. It is not solved. The loop checks four things (page load, DOM snapshot, console errors, imports) before it will report done, and there are four other things it does not check.

The four limits that still slip past

Each one is rooted in something the agent's snapshot, console read, and import check cannot see.

Interaction paths the agent never clicked

The agent verifies the page rendered. It does not verify that tapping "Save" leads to the right second screen, that the form re-enables after a failed submit, or that the modal closes after you confirm. A non coder cannot tell whether "looks right" means "the agent clicked through and it worked" or "the agent rendered the first screen, took a snapshot, and stopped". On a Twitter thread you'll see this as "I built a habit tracker in 30 seconds" followed two days later by "the streak counter doesn't actually increment". The fix is to explicitly ask the agent to click the path you care about and report what happened. The agent already has Playwright, so this is not extra tooling: it is the same browser the agent uses for its own check.
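
To make the gap concrete, here is a minimal sketch of the habit tracker case. Everything in it is hypothetical (the state shape, the two handlers, the two checks); it only shows how a render check can pass while the interaction is broken.

```typescript
// Hypothetical habit tracker state; not taken from any generated app.
type TrackerState = { streak: number };
type Handler = (state: TrackerState) => TrackerState;

// Buggy "mark done" handler: it returns a copy but never increments.
const markDoneBuggy: Handler = (state) => ({ streak: state.streak });

// Corrected handler.
const markDoneFixed: Handler = (state) => ({ streak: state.streak + 1 });

// What the user sees on screen.
function render(state: TrackerState): string {
  return `Streak: ${state.streak} days`;
}

// Render-only check: the first screen shows a streak label.
// It never calls a handler, so it passes whichever handler the app uses.
function renderCheck(): boolean {
  return render({ streak: 0 }).startsWith("Streak:");
}

// Interaction check: simulate the click and compare before and after.
function interactionCheck(handler: Handler): boolean {
  const before: TrackerState = { streak: 0 };
  const after = handler(before);
  return after.streak === before.streak + 1;
}
```

Here renderCheck() passes for both versions of the app, while interactionCheck(markDoneBuggy) fails. Asking the agent to click the path is the prompt-level equivalent of running the second check.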

Backend writes that look like a green UI

Four services are auto provisioned per session in src/core/service-provisioning.ts: PostHog (analytics), Resend (email), Neon (Postgres), and GitHub (source). Their keys are written into /app/.env before your first prompt. The agent reads them and wires them in. But the verification loop checks the UI, not the third party service. A "thanks for signing up" toast says the form submitted to your client code. It does not say the email landed in Resend's queue or that a row was inserted into Neon's contacts table. For a non coder, this is the cheapest failure mode to test: open the actual service dashboard (Resend, Neon) once and see whether anything is in it. If you can't, that's the friend you ask for help.
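
One way to narrow this gap inside the generated code is to pick the toast from what came back over the network rather than from the fact that the form submitted. The sketch below assumes a hypothetical WireResult shape and toastFor helper; it does not describe how mk0r, Resend, or Neon are actually wired.

```typescript
// Hypothetical summary of what came back over the network for one submit.
interface WireResult {
  ok: boolean;    // true when the HTTP status was in the 2xx range
  status: number; // raw HTTP status code
}

type Toast = { kind: "success" | "error"; message: string };

// Decide the toast from the wire result, not from the submit event.
function toastFor(result: WireResult | null): Toast {
  if (result === null) {
    // No response at all: the request failed or timed out.
    return { kind: "error", message: "Could not reach the server. Try again." };
  }
  if (!result.ok) {
    return { kind: "error", message: `Signup failed (HTTP ${result.status}).` };
  }
  return { kind: "success", message: "Thanks for signing up" };
}
```

Even with this in place, a 2xx response from your own endpoint does not prove the email was queued or the row was inserted, so the dashboard check still applies.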

The 2-turn anonymous cap, set on the landing page itself

In src/app/(landing)/page.tsx line 339, the chat store is initialized with signInGateAfterTurns: 2. An anonymous user gets 2 turns before the third is gated behind Google sign in. That is a generous default for letting the curious see something work, but it has a non coder failure mode: if your first prompt is "make me an app" and your second is "fix the colors", you have already used your free budget on alignment, not on the actual build. The fix: spend the first turn on the most specific sentence you can write, not on the most general. "Build a daily water intake tracker with a reset-at-midnight rule" is one turn. "Build me a health app" followed by five clarifications is six turns.
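
The budget arithmetic can be sketched in a few lines. Only the name signInGateAfterTurns and its value of 2 come from the text above; the isGated and turnsNeeded functions are hypothetical and are not the chat store's actual logic.

```typescript
// Value quoted above; the functions around it are an illustrative sketch.
const signInGateAfterTurns = 2;

// True when the next prompt would be blocked behind sign in.
function isGated(
  turnsUsed: number,
  signedIn: boolean,
  gateAfter: number = signInGateAfterTurns,
): boolean {
  if (signedIn) return false;
  return turnsUsed >= gateAfter;
}

// One opening prompt plus one turn per clarification.
function turnsNeeded(clarifications: number): number {
  return 1 + clarifications;
}
```

Under these assumptions, isGated(2, false) is true: the third prompt hits the gate. A general opener with five clarifications gives turnsNeeded(5), which is 6, three times the anonymous budget.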

The 1-hour idle pause, and what it feels like when you come back

The E2B sandbox holding your project has a 1 hour idle ceiling: E2B_TIMEOUT_MS = 3_600_000 on line 33 of src/core/e2b.ts, with lifecycle: { onTimeout: "pause", autoResume: true } on line 471. As long as you stay active inside that window, the session can run as long as you like. Once the window elapses with no activity, the sandbox pauses. Your files, your git history, and your env vars all persist. When you come back, the chat store reconnects and sandbox.setTimeout resets the window (e2b.ts line 499). For a non coder the practical effect is a slower first prompt after a break: the page shows a boot progress indicator while the sandbox wakes. The work is not gone; the sandbox just slept. If you're going to walk away for an afternoon, expect a 20 to 30 second cold start when you come back, not "where did my app go".
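
As a quick check on the numbers, the sketch below converts the constant to minutes and models the pause rule. The constant's name and value are as quoted above; stateAfterIdle is a hypothetical helper that calls no E2B API, and it assumes the pause happens once idle time reaches the timeout.

```typescript
// Constant as quoted from src/core/e2b.ts; everything else is a sketch.
const E2B_TIMEOUT_MS = 3_600_000;

type SandboxState = "running" | "paused";

// Assumption: the sandbox pauses once idle time reaches the timeout.
function stateAfterIdle(idleMs: number): SandboxState {
  return idleMs >= E2B_TIMEOUT_MS ? "paused" : "running";
}

// 3,600,000 ms / 60,000 ms per minute = 60 minutes.
const timeoutMinutes = E2B_TIMEOUT_MS / 60_000;
```

A 30 minute coffee break leaves the sandbox running; a two hour lunch returns "paused", which is when the boot indicator appears.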

What the verification loop cannot see, as a checklist

If you are a non coder, this is the list to keep open in a second tab. The agent will run its own loop. You are responsible for the rest.

Things to verify yourself

Click through every interaction the app advertises (button, form, modal, transition).
Open the Resend, Neon, or Stripe dashboard and confirm a real row or email or charge after a test submit.
On the third prompt, sign in. The 2-turn cap is a budget, not a paywall, and you keep your history.
Reload the page after a break. A boot indicator is normal; the sandbox is restoring.
If something looks off and you cannot read the code, tell the agent the exact symptom ("the streak number never goes up") and ask it to test the path with Playwright itself.

Why this list is shorter than every other vibe coding limits post

Most generic "limits of vibe coding" posts list state, auth, databases, deployment, real time, native mobile, scale, observability. Half of those are not limits on mk0r. Auth is configured per session via Firebase. Email is configured via Resend with a restricted per app key. A Postgres database is provisioned via Neon and the connection URI is in /app/.env. The repo is pushed to GitHub. The agent reads /app/.env and uses what is there, so an entire category of "non coder cannot wire up backend services" is removed by the provisioning code in src/core/service-provisioning.ts. The four limits that remain are the ones that survive that provisioning. Three of them are external (the world is not the sandbox). One is the sign in gate. None of them are about writing code.

What to ask before you start

Before you type the first prompt, decide which side of the verification gap your idea sits on. If the app is a calculator, a tracker, a quiz, a static-feeling tool with no backend write beyond analytics, the verification loop covers almost everything. If the app writes to a database, sends an email, charges money, or holds state across days, the loop covers half. The other half is you (or a friend who can read code) confirming the backend actually fired. That is true on every platform, not just this one; the honest version of "vibe coding limits for non coders" is the friction of catching what the agent could not click.
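
That decision can be written down as a short checklist generator. This is a hypothetical sketch: the AppIdea fields and both functions are invented here to mirror the paragraph above, not taken from mk0r.

```typescript
// Hypothetical description of an app idea, mirroring the four cases above.
interface AppIdea {
  writesToDatabase: boolean;
  sendsEmail: boolean;
  chargesMoney: boolean;
  holdsStateAcrossDays: boolean;
}

// Things a person still has to confirm after the agent's own loop has run.
function manualChecks(idea: AppIdea): string[] {
  const checks: string[] = [];
  if (idea.writesToDatabase) checks.push("confirm a row exists in the database");
  if (idea.sendsEmail) checks.push("confirm the email shows up in the email dashboard");
  if (idea.chargesMoney) checks.push("confirm a test charge shows up in the payment dashboard");
  if (idea.holdsStateAcrossDays) checks.push("come back after a break and confirm the data is still there");
  return checks;
}

// An idea sits on the easy side when nothing is left to confirm by hand.
function sitsOnEasySide(idea: AppIdea): boolean {
  return manualChecks(idea).length === 0;
}
```

A calculator or quiz returns an empty list. A signup form that stores contacts and sends a welcome email returns two items, and those two items are the half the loop does not cover.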

Want help building the part the agent can't verify?

20 minute call. Open mk0r.com together, pick one of your real app ideas, and walk through the four limit cases on your idea specifically. If yours sits on the easy side of the verification gap, you'll know in five minutes.

Frequently asked questions

What is the real limit of vibe coding for someone who cannot read code?

The limit is not writing code. The in-VM agent writes it for you and checks its own work in a sandboxed Chromium before reporting done. The limit is the gap between 'looks finished' and 'is finished'. Four specific things slip through that gap: interaction paths the agent didn't think to click, backend writes (email, database, payments) that succeed visually but fail on the wire, the 2-turn anonymous cap before the sign-in gate fires (set in src/app/(landing)/page.tsx line 339, signInGateAfterTurns: 2), and the 1-hour idle pause after which the sandbox is suspended (E2B_TIMEOUT_MS = 3,600,000 in src/core/e2b.ts line 33). A coder catches three of those four on instinct. A non coder catches none of them without prompting.

What does the verification loop actually do?

It is five steps, paraphrased from src/core/vm-claude-md.ts lines 278 to 283 under the Browser Testing heading: 1) navigate to http://localhost:5173 via Playwright MCP, 2) take a snapshot to verify the DOM rendered correctly, 3) check browser_console_messages for runtime errors, 4) if the page is blank, verify the component is imported in App.tsx, 5) do not report completion until the browser shows the expected result. Chromium runs inside the same E2B sandbox on an Xvfb display with a Playwright MCP server attached over CDP. The agent writes code, looks at the page in a real browser tab, and only hands control back if the render matched its intent. That loop closes most of the limits other vibe coding tools have for non technical users. It does not close all of them.

What does the verification loop NOT catch?

Four things. First, interaction state. The snapshot is a DOM check at one moment in time; it does not click every button to see what state the app enters next. Second, backend writes. The agent confirms the UI rendered, not that the POST to Resend, Neon, or Stripe actually wrote a row. A green toast is not a green database. Third, the 2-turn anonymous gate. Anonymous users have 2 free turns before sign in is required (src/app/(landing)/page.tsx:339). Two vague prompts burn the whole budget. Fourth, the sandbox idle timeout. After 1 hour with no activity, E2B pauses the sandbox (src/core/e2b.ts:33, with lifecycle.onTimeout: 'pause' at line 471). The work persists and is restored exactly as you left it, but a non coder coming back two hours later may not understand why the preview spinner showed up.

What is pre wired so the non coder doesn't have to do it?

Four services are auto provisioned per session in src/core/service-provisioning.ts: PostHog for analytics, Resend for email, Neon for Postgres, and GitHub for source control. The keys are written into /app/.env before your first prompt arrives, and the project CLAUDE.md tells the agent to check /app/.env for pre provisioned services before asking you for any API keys. So 'add email signup' or 'save form submissions to a database' is one prompt, not five.
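
A sketch of what "check /app/.env first" could look like in code follows. The variable names in EXPECTED_KEYS are assumptions made for illustration, since the real names are not listed here, and the parser handles only plain KEY=value lines.

```typescript
// Hypothetical variable names; the real keys in /app/.env may differ.
const EXPECTED_KEYS: Record<string, string> = {
  POSTHOG_KEY: "PostHog",
  RESEND_API_KEY: "Resend",
  DATABASE_URL: "Neon",
  GITHUB_REPO_URL: "GitHub",
};

// Parse KEY=value lines, skipping blank lines and # comments.
function parseEnv(text: string): Map<string, string> {
  const vars = new Map<string, string>();
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === "" || line.startsWith("#")) continue;
    const eq = line.indexOf("=");
    if (eq <= 0) continue; // no key before "=", or no "=" at all
    vars.set(line.slice(0, eq).trim(), line.slice(eq + 1).trim());
  }
  return vars;
}

// Names of services whose key is present and non-empty.
function provisionedServices(envText: string): string[] {
  const vars = parseEnv(envText);
  return Object.entries(EXPECTED_KEYS)
    .filter(([key]) => (vars.get(key) ?? "") !== "")
    .map(([, service]) => service);
}
```

A key that is present but empty is treated as not provisioned, which is the case where asking the user for a value would still make sense.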

Why is the default model Haiku, and what is the limit there for a non coder?

FREE_MODEL is set to 'haiku' in src/app/api/chat/model/route.ts line 5. Anonymous and unsubscribed users can only use Haiku. It is fast (the preview streams while you read), and cheap enough that the platform can serve anonymous prototypes without metering them. The limit: Haiku is a smaller model than Sonnet or Opus, and on the fifth iteration when the codebase has eight files and you're asking for cross file refactors, the model can lose track. The route returns 402 with subscription_required if a free user tries to switch (lines 17 to 24). The non coder won't know to switch, so the limit feels like 'the AI got worse' rather than 'I'm on the cheap model'.
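
The gating rule reads roughly like the sketch below. FREE_MODEL, the 402 status, and the subscription_required error are as quoted above; the resolveModelSwitch function and its result type are hypothetical and are not the actual handler in src/app/api/chat/model/route.ts.

```typescript
// Quoted above; the function that uses it is an illustrative sketch.
const FREE_MODEL = "haiku";

type SwitchResult =
  | { status: 200; model: string }
  | { status: 402; error: "subscription_required" };

// Free users may only stay on the free model; subscribers may pick any model.
function resolveModelSwitch(requested: string, isSubscribed: boolean): SwitchResult {
  if (!isSubscribed && requested !== FREE_MODEL) {
    return { status: 402, error: "subscription_required" };
  }
  return { status: 200, model: requested };
}
```

The point for a non coder is that the 402 only appears if you try to switch. If you never try, nothing on screen tells you which model is answering.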

Can the agent build me an iOS or Android app?

No. The /app directory inside the sandbox is a Vite + React + TypeScript + Tailwind project (project CLAUDE.md, also in vm-claude-md.ts). The agent edits files in /app/src/ and the Vite dev server hot reloads at http://localhost:5173. Output is a mobile first responsive web app, not a native binary. If the goal is the App Store, the limit is the stack, not the agent. Ask first whether a mobile web app is enough for your idea; for most consumer prototypes, it is.

What is the design refusal list, and why does it matter to a non coder?

Lines 168 to 175 of src/core/vm-claude-md.ts list six anti patterns the agent is told never to produce: purple/indigo gradients on white backgrounds, uniform rounded cards in a grid with icons, a generic hero with centered text and a gradient button, every section looking the same with slight color variations, cookie cutter layouts that could be any app, and placeholder images or Lorem Ipsum left in place. The rule is scoped by 'These rules apply unless the user explicitly overrides them' on line 99. For a non coder who types 'make it look modern', this means the default is not the generic AI aesthetic. The override is one sentence: 'I want purple gradients on white'. The point is to move the aesthetic decision off the user and onto a written rule the model applies consistently.

Can I see the code, or own it later, if I'm not technical?

Yes. Every prompt commits to a local git repo at /app inside the sandbox. The agent can also push your app to a private GitHub repo auto created under m13v/ at session start (the repo URL is in /app/.env from turn one, see service-provisioning.ts line 51 and the GITHUB_ORG constant). If you want a developer friend to look at it later, they have a real TypeScript project on GitHub. Not a Bubble export, not a screenshot, not a Figma file.