To build an app from an open source AI project, you need a VM that is already running
Most write-ups on this turn into a provisioning checklist: clone the repo, set up a Python venv, install Postgres locally, register for some hosted vector store, buy a deploy target, wire CI. mk0r collapses that whole list into one sentence: describe the project. The Debian sandbox the agent works inside already has the toolchain installed, four backend services running, and a private GitHub repo waiting before you send the first message. Here is exactly what is in the box, and where to verify it in the source.
The thing every other guide forgets to mention
Open source AI projects are not, in 2026, hard to find. Hugging Face has hundreds of thousands. GitHub has the rest. The friction is not discovery. It is everything between "I cloned the repo" and "there is a working app I can open on my phone." That gap is dependencies, a database, an inference endpoint, an HTTP layer, a UI framework, a deploy target, a domain. Do that for one project and it is a weekend. Do it for three and the math turns against you.
The mk0r answer is to make the gap zero. Not by writing yet another scaffold generator, but by handing the agent a real machine that already has the toolchain. The agent does the same things a human would do (clone, install, edit, run), only it never waits for a download and never asks you for an API key because the keys are already in /app/.env.
What is already installed before you say a word
One Dockerfile layer. One apt-get block. Every binary an open source AI repo is likely to invoke is on PATH the moment the VM boots.
The interesting line is the apt-get block. It is dense on purpose: every entry is something a typical open source AI repo asks for in its README. git for the clone. python3 + pip for the half of AI repos that are still Python. ffmpeg for audio and video pipelines. postgresql-client so a project that wants psql in a script just has it. chromium + xvfb + x11vnc so anything that needs a real browser (scraping, end-to-end test of the demo, headless screenshot of a generated page) gets one without a separate container.
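A representative sketch of that layer, reconstructed only from the packages named on this page (the authoritative list is in docker/e2b/e2b.Dockerfile):

```shell
# Sketch of the single apt-get block baked in at image build time.
# Not the actual Dockerfile layer; see docker/e2b/e2b.Dockerfile lines 26-35.
apt-get update && apt-get install -y \
  git python3 python3-pip ffmpeg postgresql-client \
  chromium xvfb x11vnc websockify cron libnss3 libxss1
```

One block, one image layer: every session boots from the same snapshot, so nothing here is ever installed at session time.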
What gets wired into the env file at boot
The toolchain is one half. The other half is the backend. Every session that runs the provisioning step gets a fresh /app/.env with four sets of credentials. The agent reads them; you do not have to.
Pre-provisioned
Backend services already wired
Neon Postgres
Per-app database. Use it for vector embeddings, retrieval, or any state.
Resend
Transactional email and a per-app audience already created.
PostHog
Analytics with the app ID set as a group on every event.
GitHub
Private repo provisioned at session start. Push when ready.
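Taken together, a freshly provisioned /app/.env looks roughly like this. DATABASE_URL, RESEND_API_KEY, and GITHUB_REPO_URL are variable names this page uses elsewhere; the PostHog key name and all values are placeholders:

```shell
# /app/.env -- sketch only; written at session start, values are placeholders
DATABASE_URL=postgres://user:pass@<endpoint>.neon.tech/appdb   # Neon Postgres
RESEND_API_KEY=re_xxxxxxxx                                     # Resend + per-app audience
POSTHOG_API_KEY=phc_xxxxxxxx                                   # hypothetical variable name
GITHUB_REPO_URL=https://github.com/<org>/<private-repo>.git
```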
How an open source AI repo enters the VM
The shortest way to describe what the agent does is to draw it. One sandbox, one Vite dev server, one Chromium. Anything you bring (a GitHub URL, a model name, a paper code release) flows through the agent into the same workspace.
Bring a repo. The VM is already running.
The shape of one turn, end to end
From the moment you describe the project to the moment a working page exists, six things happen. None of them are setup. The setup already happened, once, at image build time.
git clone the repo into the VM
The agent runs git clone on a residential IP. Hugging Face, GitHub, and most model hubs do not throttle the request because it does not look like a datacenter pull.
Read the README and decide the integration shape
Three options: import as an npm or pip dependency, run as a long-lived sidecar (e.g. a Python FastAPI process on port 8000), or cherry-pick specific files into /app/src. The choice is per-project and the agent makes it from the README.
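A minimal sketch of that decision as a shell function. This is hypothetical, not the agent's actual logic (the agent reads the README); it only keys off the manifest files a repo ships, mirroring the three options above:

```shell
# Hypothetical: map a cloned repo to one of the three integration shapes.
# Checks manifests in priority order; a README-reading agent does better.
integration_shape() {
  repo="$1"
  if [ -f "$repo/package.json" ]; then
    echo "npm-dependency"    # import as an npm package
  elif [ -f "$repo/requirements.txt" ] || [ -f "$repo/pyproject.toml" ]; then
    echo "python-sidecar"    # long-lived process, e.g. FastAPI on port 8000
  else
    echo "cherry-pick"       # copy specific files into /app/src
  fi
}
```

Usage: `integration_shape /tmp/some-repo` after the clone lands in /tmp.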
Install the dependencies
npm install for Node projects, pip install -r requirements.txt for Python, both on the same VM. No virtualenv ceremony unless the repo demands it.
Wire the backend pieces it needs
A retrieval project gets pointed at DATABASE_URL (Neon). An email-sending demo gets RESEND_API_KEY. A model-hub wrapper gets the user's HF token if you provided one, otherwise the agent uses the public endpoint. None of this requires asking you.
Edit /app/src/App.tsx to mount the UI
The Vite + React + Tailwind v4 scaffold is already serving on :5173. The agent imports its new component into App.tsx (the rule that says every new component must be imported there is in /app/CLAUDE.md). Vite HMR picks up the change in milliseconds.
Open the page in Chromium and verify the DOM
Through Playwright MCP attached to CDP at 127.0.0.1:9222. The agent reads the rendered accessibility tree, checks the browser console for runtime errors, and only reports done when the page actually shows up. Same Chromium you see in the screencast.
What it looks like in the shell
Here is the in-VM shell during a representative turn: the user asked for an app that wraps an open source speech-to-text repo. The agent does what a developer would do, only without the think-time.
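A representative sketch of such a turn (repo URL, package names, and component file are invented for illustration; the real transcript varies per project):

```shell
$ git clone https://github.com/example/whisper-wrapper /tmp/whisper-wrapper
$ cat /tmp/whisper-wrapper/README.md      # decide: thin client over hosted inference
$ cd /app && npm install                  # plus the repo's SDK dependency
$ grep DATABASE_URL /app/.env             # credentials already present, no prompt
$ $EDITOR src/components/Transcriber.tsx  # new component wrapping the repo
$ $EDITOR src/App.tsx                     # import it, per the rule in /app/CLAUDE.md
# Vite HMR reloads :5173; the agent verifies the DOM via CDP on :9222
```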
Which kinds of repos this fits
The constraint is the sandbox shape, not the agent. A 1 vCPU / 1Gi RAM VM is a poor place to run a 70B parameter model locally. It is a great place to run wrappers, demos, and pipelines that talk to a hosted inference endpoint. Most open source AI repos in 2026 are exactly that.
Hosted-inference SDK wrappers
OpenAI, Anthropic, Hugging Face Inference, Replicate, Cohere, Mistral. The repo is a thin client; the heavy compute is remote. Drop it into /app/src and call it.
Retrieval and RAG demos
Anything that wants a Postgres + pgvector store. The agent points the repo at DATABASE_URL (Neon) and you have a real database, not a SQLite toy.
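For a concrete flavor, pointing a RAG repo at the Neon database usually starts with enabling pgvector over DATABASE_URL. An illustrative sketch (run via `psql "$DATABASE_URL"`; the table layout and embedding dimension depend entirely on the repo):

```sql
-- Illustrative retrieval-store setup; not any specific repo's schema.
CREATE EXTENSION IF NOT EXISTS vector;
CREATE TABLE IF NOT EXISTS documents (
  id        bigserial PRIMARY KEY,
  content   text NOT NULL,
  embedding vector(1536)   -- dimension must match the embedding model
);
-- The demo's retriever then runs a nearest-neighbour query such as:
-- SELECT content FROM documents ORDER BY embedding <-> $1 LIMIT 5;
```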
Audio and video pipelines
ffmpeg is already on PATH. Speech-to-text, text-to-speech, transcription, splice and dub flows that lean on open source codecs work without extra installs.
Agent and tool-use frameworks
LangChain, LlamaIndex, smol-agents, anything that wants Node or Python plus a real shell. The VM is a real shell.
Browser-driving demos
Repos that need a real Chromium for scraping, screenshots, or end-to-end test of a generated UI. The same Chromium that drives the verification loop is available to your code.
Out of scope
Local inference on multi-billion parameter weights, CUDA-only training scripts, anything that needs a GPU. Use the agent to wrap a hosted endpoint instead of running the model in the box.
Why the residential IP matters here
A surprising amount of the frustration in "clone an AI repo" comes from network behavior, not code. Hugging Face soft-throttles datacenter IPs. Some package mirrors return different content for cloud egress. Model providers ask for CAPTCHA on signup attempts from datacenter ranges. The mk0r VM sends Chromium and (when configured) shell traffic out through a residential upstream provider, so the clone, the install, and the API call to the model endpoint all leave from a normal residential address. That removes a category of failures that most builders learn about the hard way.
The proxy is per-session, not shared
brd-proxy.js reads /run/brd.conf per request. The conf is written when the session asks for residential routing and refreshed on every wake-up. Your session therefore never shares an upstream identity with another user's, which matters when an open source AI demo logs into a third-party service on your behalf.
How that compares to the typical guide flow
The fastest way to see what mk0r removes from your day is to line it up against the steps every other write-up on this asks you to do.
| Feature | Typical guide flow | mk0r (pre-warmed VM) |
|---|---|---|
| Toolchain install | You install Node, Python, pip, ffmpeg, Postgres locally | Already in the image (apt-get layer at build time) |
| Database | You docker-compose Postgres or sign up for one | Per-session Neon Postgres in /app/.env |
| Email sending | You sign up for Resend or Sendgrid, copy keys | Resend API key + per-app audience pre-provisioned |
| Analytics | You sign up for PostHog, add init code, copy keys | PostHog project key in env, app ID set as group |
| Source control | You create a repo, configure SSH or PAT | Private GitHub repo URL in /app/.env |
| Browser for verification | None, or a separate headless install | Chromium with CDP on :9222, shared with screencast |
| Outbound IP | Datacenter IP, sometimes throttled by AI hubs | Residential proxy per session |
| Time to first running page | An afternoon to a weekend | Single-digit minutes for a small wrapper |
| Account to start | Several | None |
What you can stop doing
The point of a pre-warmed sandbox is not that the agent is smarter. It is that the agent does not have to wait for, or ask you about, the things below. You stop being the bottleneck.
Steps that disappear
- Installing Node, Python, pip, ffmpeg, Chromium on a fresh box
- Docker-composing Postgres for a vector store
- Signing up for an email API and copying the key
- Wiring an analytics SDK and waiting for events to land
- Creating a GitHub repo and configuring auth before you can push
- Setting up a deploy target before you have anything to deploy
- Manually opening the preview on your phone to check mobile layout
The numbers that make this work
Three small constants do most of the heavy lifting, and all three are worth naming so you can verify them: the in-VM dev server port (5173), the Chromium debugging port (9222), and the number of backend services already wired into the env file (four).
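From a shell inside the VM, each constant can be checked directly. A sketch (the grep pattern is a guess at the env variable names, not the authoritative set):

```shell
$ curl -s -o /dev/null http://localhost:5173 && echo vite-up   # dev server on 5173
$ curl -s http://127.0.0.1:9222/json/version                   # Chromium CDP on 9222
$ grep -c 'API_KEY\|_URL' /app/.env                            # the four wired services
```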
Reading it yourself
Two files cover almost every claim on this page. The toolchain is in docker/e2b/e2b.Dockerfile (lines 26 to 57 are the meaningful part). The pre-provisioned services and how the agent is told to use them are in src/core/vm-claude-md.ts (the table at lines 1182 to 1187, the per-service skill that follows it). The boot order, including which services start before the agent ever sees a prompt, is in docker/e2b/files/opt/startup.sh. All three are in the appmaker repo. Open them and the rest of the page reads itself.
Bring an open source AI repo, watch it become an app
Book 20 minutes. We will pick a repo together, paste it in, and walk through what the agent does inside the pre-warmed VM until the mobile preview renders.
Frequently asked questions
What does mk0r do that a normal coding agent does not?
It hands the agent a real Debian VM with the toolchain pre-installed and four backend services already wired into /app/.env. The agent does not have to provision anything. It does not ask you for an OpenAI key or a Postgres URL. It runs git clone, runs npm install or pip install, and wires the cloned project into a Vite + React UI that is already serving on port 5173 with HMR.
Which open source AI projects fit into the VM well?
Anything that runs on Node 20 or Python 3, ships as a pip package, npm package, or git repo, and does not need a GPU. Wrappers around hosted models (OpenAI SDK, Anthropic SDK, Hugging Face Inference, Replicate, Ollama against an external endpoint), text and audio processing pipelines that lean on ffmpeg, retrieval projects that want a Postgres database, and any web demo that talks to a remote inference API. Heavy local model inference is out of scope for the standard 1 vCPU sandbox.
Where exactly is this toolchain defined?
Lines 26 to 35 of docker/e2b/e2b.Dockerfile in the appmaker repo. One apt-get block installs chromium, ffmpeg, libnss3, libxss1, xvfb, x11vnc, websockify, python3, python3-pip, postgresql-client, cron, and git. Lines 52 to 57 run npm install -g for @playwright/mcp@0.0.70, ws, @agentclientprotocol/claude-agent-acp@0.25.0, @anthropic-ai/claude-code, and social-autoposter. The image is baked once and every session boots from the same snapshot.
What about the database and the email?
Pre-provisioned. A dedicated Neon Postgres database, a Resend API key plus a per-app audience, a PostHog project key, and a private GitHub repo are written to /app/.env at session start. The pre-provisioning table lives in src/core/vm-claude-md.ts at lines 1182 to 1187. If the open source AI project you cloned needs a vector store, the agent points it at Neon. If it needs to email a result, the agent uses Resend. No signup, no copy-pasting keys.
Does the running app actually work on a phone?
Yes. The agent is told in /app/CLAUDE.md to build mobile-first with Tailwind v4 (base, sm:, md:, lg: breakpoints). Vite serves on 0.0.0.0:5173 inside the VM and the proxy on port 3000 routes external traffic so the preview is reachable from any browser. When the agent verifies its own work, it does so through the same Chromium that is screencast back to your tab.
What does a typical first turn look like?
You describe the project. The agent runs git clone <url> /tmp/<repo> inside the VM, reads the README, decides whether to import it as a library, run it as a sidecar, or copy specific files into /app/src. It runs npm install or pip install -r requirements.txt, edits /app/src/App.tsx to mount a UI, and reloads its own page in Chromium via Playwright MCP to check it actually rendered. Total time is usually under two minutes for a small wrapper, longer for projects that need a Python sidecar.
Why is the residential IP relevant?
Several open source AI repos pull weights or datasets from Hugging Face, GitHub, or model hubs that throttle or block datacenter IPs. mk0r routes Chromium and (optionally) login-shell HTTP traffic through a residential upstream so the clone, the package install, and the inference calls go out from a real residential address. The plumbing is in docker/e2b/files/opt/brd-proxy.js and the env config in /etc/profile.d/brd-proxy.sh.
Do I get the source code afterwards?
Yes. Each session is provisioned with a private GitHub repo and the URL is in /app/.env as GITHUB_REPO_URL. The agent can run git init, add the remote, and push when you want a permanent copy. There is no proprietary lock-in: you walk out with a regular Vite + React + TypeScript project plus whatever open source repo you brought with you.
How is this different from Emergent or v0 or other AI app builders?
Most other tools generate code in a chat panel and hand you a zip or a deploy button. They do not give the agent a working VM, real internet, a real database, or a real browser to verify against. mk0r treats the VM as the product. The agent operates inside it the way a human developer operates on a fresh laptop, except the laptop already has everything installed. There is also no signup wall: you land on mk0r.com, type a request, and the VM is yours.
What happens to the VM when I close the tab?
The session sleeps. State persists in the E2B snapshot keyed to your session and is restored when you come back. The /run/brd.conf residential proxy config is re-ensured on every restore (it does not survive sleep on its own). You can close the tab mid-build and pick up where you left off.
The toolchain is already running. You just bring the repo.
Open a session