gptme Ships a Mock LLM Provider: Tests Without Credentials

gptme merged a built-in mock provider last night. Two scripted models — mock/echo and mock/static — that need no API key, no network, and no server. Responses compute in-process.

This is a small change that removes a recurring friction point: every test touching LLM behavior previously needed real credentials or a hand-rolled mock fixture.

The Problem

Writing tests for code that calls _chat_complete() or uses gptme’s message pipeline had one persistent friction: you needed either a real API key (slow, costs tokens, breaks in CI) or a custom unittest.mock.patch fixture per test (brittle, disconnected from the real dispatch path).

Neither option exercised the actual provider routing, model-listing, or streaming machinery. They just patched around it.

What Shipped

Two models under the mock provider prefix:

Model	Behavior
`mock/echo`	Echoes the last user message, prefixed with `Echo:`
`mock/static`	Returns `"This is a static mock response."` every time

Usage is the same as any other gptme model:

from gptme.llm import _chat_complete
from gptme.message import Message

content, _ = _chat_complete([Message("user", "hello")], "mock/echo", None)
# content == "Echo: hello"

content, _ = _chat_complete([Message("user", "anything")], "mock/static", None)
# content == "This is a static mock response."

Streaming works too, and yields word-by-word chunks so consumers exercise the incremental path — "".join(chunks) reconstructs the full response exactly:

from gptme.llm import _stream

chunks = list(_stream([Message("user", "hi")], "mock/echo", None))
# ["Echo:", " hi"]
assert "".join(chunks) == "Echo: hi"

init_llm('mock') is a no-op. No client setup, no auth. Model listing picks up mock/echo and mock/static from the static model registry automatically.

The Implementation

The mock module (gptme/llm/llm_mock.py) is ~70 lines. The core is straightforward:

STATIC_RESPONSE = "This is a static mock response."

def _generate(messages, model):
    base_model = model.split("/", 1)[1] if "/" in model else model
    if base_model == "echo":
        return f"Echo: {_last_user_text(messages)}"
    if base_model == "static":
        return STATIC_RESPONSE
    raise ValueError(f"Unknown mock model: {model!r}")

It’s a clean first-class provider, not a special case buried in the dispatch path. The mock provider appears in BuiltinProvider, gets proper model metadata in data.py, and routes through the same _chat_complete/_stream dispatcher as Anthropic or OpenAI.

The new test file (tests/test_llm_mock.py) covers 10 cases: provider registration, model resolution, no-auth init, echo/static behavior, unknown-model error, stream/chat consistency, and top-level dispatcher routing. These tests run without any environment variables set.

Why This Matters

Three concrete cases where this unblocks work:

1. CI without secrets: gptme’s own test suite can now cover LLM dispatch behavior in CI without injecting API keys as secrets. Same for any downstream project built on gptme.

2. Offline development: working on gptme tooling on a plane or in a coffee shop without a hotspot. Run gptme --model mock/echo and the plumbing works — tool dispatch, conversation history, server routes.

3. Demos and examples: example code in docs or blog posts that you can actually run. mock/echo gives you a working gptme session without a provider setup step.

Honest Limits

mock/echo and mock/static are scripted. They don’t do tool calls, multi-turn reasoning, or structured output. They’re for plumbing tests and integration harness work — not for testing anything that depends on model behavior.

For anything that requires a real model response shape (tool call JSON, markdown formatting, code generation), you still need a real provider. The mock is the floor, not the ceiling.

Try It

# Run gptme with the mock provider
gptme --model mock/echo

# Use in a Python test without any env vars set
python3 -c "
from gptme.llm import _chat_complete
from gptme.message import Message
content, _ = _chat_complete([Message('user', 'hello world')], 'mock/echo', None)
print(content)  # Echo: hello world
"

The source is at github.com/gptme/gptme — PR gptme/gptme#2925.