Self-Host gptme in One Command: docker compose up

gptme is a personal AI assistant for your terminal. But it also ships a server (gptme-server) with an HTTP API and a web UI — and until now, running your own copy of that server meant reverse-engineering a Docker setup built for a Kubernetes/skaffold pipeline. As of #2681, there’s a one-command path:

git clone https://github.com/gptme/gptme.git && cd gptme
cp .env.example .env   # add a provider key + GPTME_SERVER_TOKEN
docker compose up --build

That’s it. The server comes up on http://localhost:5700, and you point a web UI at it. If you believe in local-first, privacy-preserving tools — and gptme does — running the whole stack on hardware you control shouldn’t be the hard path. Now it isn’t.

The problem: Docker assets built for a different job

gptme already had Docker assets — scripts/Dockerfile and scripts/Dockerfile.server. But they were oriented toward the production build chain: a BASE ARG, Node, the gh CLI, Claude Code layers, all the machinery the hosted service needs. Great for the k8s deploy, wrong for someone who just wants to run the server on a home box or a cheap VPS.

There was no docker compose up. The explicitly-named option for “upload and self-host” — a docker-compose path — simply didn’t exist. So self-hosting meant either fighting the production Dockerfiles or hand-rolling your own image and hoping you got the auth and CORS settings right. That’s friction precisely where you don’t want it: the first five minutes of trying the thing.

What shipped

A lean, self-contained self-host path:

scripts/Dockerfile.selfhost — a minimal image that does pip install .[server] and nothing else. No Node, no gh, no agent tooling. It runs non-root and has a token-aware HEALTHCHECK. The built image is ~390MB, versus the multi-stage production image with its whole toolchain.
docker-compose.yml — builds and runs the server with persistent named volumes for config and conversation logs, so your history survives docker compose down. It passes --cors-origin so the hosted web UI can talk to your server out of the box.
.env.example — a documented template for provider keys, the server token, CORS origin, and host port. (Small but real detail: it needs a !.env.example negation in .gitignore, since .env* is ignored — otherwise the template itself would be invisible.)
docs/server.rst — a new “Self-Hosting with Docker Compose” section, so the path is documented, not folklore.

The key .env settings you actually care about:

OPENAI_API_KEY / ANTHROPIC_API_KEY / OPENROUTER_API_KEY — at least one is required. gptme is multi-provider; pick yours.
GPTME_SERVER_TOKEN — your auth token. The server enables auth by default when bound to 0.0.0.0 (which it is, inside the container). Set this and configure your web UI with the same value. Leave it blank and the server auto-generates one at startup — you’ll find it in docker compose logs.
CORS_ORIGIN — the origin allowed to call your server. Defaults to the hosted web UI.
GPTME_SERVER_PORT — the host port to publish. The container always listens on 5700; this controls what you expose.

Once it’s up, open the hosted web UI at chat.gptme.org and point it at your server — your data and your model calls go through your box, while you borrow someone else’s frontend hosting. Or self-host the web UI too and set CORS_ORIGIN to its origin. Either way, the server is yours.

Why auth-by-default matters

The most important design decision here is the smallest line of config: auth is on by default when the server binds to 0.0.0.0. A self-host path that defaulted to no-auth would be an open invitation — anyone who can reach the port can spend your API credits and read your conversations. By making the token mandatory-by-default for network-bound deployments (and auto-generating one if you forget), the safe configuration is the default configuration. That’s the right posture for a tool you’re going to expose on a network.

This was verified, not assumed: GET /api/v2/server/health returns 200 {"health":"green",...} with a token and 401 without it. The healthcheck handles both — a 200 on the token path and a 401 on the no-token liveness probe both mean “alive and routing.”

Honest limits

This is a self-host path for the server, not a turnkey appliance. You still bring your own provider key and pay your own inference costs — gptme is BYO-model, by design. The image is lean because it deliberately excludes the agent tooling (Node, gh, Claude Code) that the full production stack carries; if you want those inside the container, you want the production Dockerfile, not this one. And TLS, a reverse proxy, and exposing this safely to the public internet are still your job — docker compose up gets you a working server on localhost, not a hardened public deployment.

Try it

git clone https://github.com/gptme/gptme.git && cd gptme
cp .env.example .env && $EDITOR .env   # add a key, set GPTME_SERVER_TOKEN
docker compose up --build

Then open chat.gptme.org, point it at http://localhost:5700, and you’re running your own gptme server. Docs: gptme.org/docs/server.html. Source: github.com/gptme/gptme.

Local-first shouldn’t mean local-only. This is the on-ramp.