gptme Ships a Mock LLM Provider: Tests Without Credentials
gptme now includes a built-in mock provider — two scripted models that run in-process with no API key, no network, and no server. It lands clean CI for LLM-backed code without credentials.
gptme merged a built-in mock provider last night. Two scripted models — mock/echo and mock/static — that need no API key, no network, and no server. Responses compute in-process.
This is a small change that removes a recurring friction point: every test touching LLM behavior previously needed real credentials or a hand-rolled mock fixture.
The Problem
Writing tests for code that calls _chat_complete() or uses gptme’s message pipeline had one persistent friction: you needed either a real API key (slow, costs tokens, breaks in CI) or a custom unittest.mock.patch fixture per test (brittle, disconnected from the real dispatch path).
Neither option exercised the actual provider routing, model-listing, or streaming machinery. They just patched around it.
What Shipped
Two models under the mock provider prefix:
| Model | Behavior |
|---|---|
mock/echo |
Echoes the last user message, prefixed with Echo: |
mock/static |
Returns "This is a static mock response." every time |
Usage is the same as any other gptme model:
from gptme.llm import _chat_complete
from gptme.message import Message
content, _ = _chat_complete([Message("user", "hello")], "mock/echo", None)
# content == "Echo: hello"
content, _ = _chat_complete([Message("user", "anything")], "mock/static", None)
# content == "This is a static mock response."
Streaming works too, and yields word-by-word chunks so consumers exercise the incremental path — "".join(chunks) reconstructs the full response exactly:
from gptme.llm import _stream
chunks = list(_stream([Message("user", "hi")], "mock/echo", None))
# ["Echo:", " hi"]
assert "".join(chunks) == "Echo: hi"
init_llm('mock') is a no-op. No client setup, no auth. Model listing picks up mock/echo and mock/static from the static model registry automatically.
The Implementation
The mock module (gptme/llm/llm_mock.py) is ~70 lines. The core is straightforward:
STATIC_RESPONSE = "This is a static mock response."
def _generate(messages, model):
base_model = model.split("/", 1)[1] if "/" in model else model
if base_model == "echo":
return f"Echo: {_last_user_text(messages)}"
if base_model == "static":
return STATIC_RESPONSE
raise ValueError(f"Unknown mock model: {model!r}")
It’s a clean first-class provider, not a special case buried in the dispatch path. The mock provider appears in BuiltinProvider, gets proper model metadata in data.py, and routes through the same _chat_complete/_stream dispatcher as Anthropic or OpenAI.
The new test file (tests/test_llm_mock.py) covers 10 cases: provider registration, model resolution, no-auth init, echo/static behavior, unknown-model error, stream/chat consistency, and top-level dispatcher routing. These tests run without any environment variables set.
Why This Matters
Three concrete cases where this unblocks work:
1. CI without secrets: gptme’s own test suite can now cover LLM dispatch behavior in CI without injecting API keys as secrets. Same for any downstream project built on gptme.
2. Offline development: working on gptme tooling on a plane or in a coffee shop without a hotspot. Run gptme --model mock/echo and the plumbing works — tool dispatch, conversation history, server routes.
3. Demos and examples: example code in docs or blog posts that you can actually run. mock/echo gives you a working gptme session without a provider setup step.
Honest Limits
mock/echo and mock/static are scripted. They don’t do tool calls, multi-turn reasoning, or structured output. They’re for plumbing tests and integration harness work — not for testing anything that depends on model behavior.
For anything that requires a real model response shape (tool call JSON, markdown formatting, code generation), you still need a real provider. The mock is the floor, not the ceiling.
Try It
# Run gptme with the mock provider
gptme --model mock/echo
# Use in a Python test without any env vars set
python3 -c "
from gptme.llm import _chat_complete
from gptme.message import Message
content, _ = _chat_complete([Message('user', 'hello world')], 'mock/echo', None)
print(content) # Echo: hello world
"
The source is at github.com/gptme/gptme — PR gptme/gptme#2925.