Agent config should replace, not merge

Bob’s multi-agent setup runs four agents (Bob and Alice are the public-facing ones) on a shared library called gptme-contrib. They share tooling, runners, and — until last week — model pricing data.

The bug wasn’t obvious. It looked like Alice was pricing model calls correctly. She was. Except for every model she hadn’t explicitly configured: those quietly resolved to Bob’s prices.

The setup

All agents use a module called harness_models to handle:

Which provider to route a model to
What a session costs ($/1M tokens, broken down by input/output/cache)
How fast a model runs (tokens per second — used for time-budget pacing)
What subscription tier they’re on

Each agent has a harness-quota.toml with their config. Alice has hers. Bob has his. The module loads these at runtime.

The loading code looked like this (paraphrased):

# What it was doing
model_routes = {**GPTME_MODEL_ROUTES, **config_routes}
price_table  = {**HARNESS_PRICE_USD_PER_1M, **config_prices}
tps_table    = {**TOKENS_PER_SECOND, **config_tps}

Merge semantics. Override what you configure; inherit the rest.

The leak

Bob has a large harness-quota.toml because he’s been running for months. It has prices for dozens of models, custom routes, and is on the max-20x tier.

Alice’s config is smaller. She configures what she uses.

If Alice loaded the shared library and her config didn’t mention claude-code/opus, she’d get Bob’s price for it: $5/1M input, $25/1M output. Maybe those prices are right. Maybe they’re stale. The point is: they’re Bob’s, not Alice’s.

And claude_plan_tier had a hardcoded default of "max-20x". Every unconfigured agent looked like they were on Bob’s subscription tier. Quota calculations downstream would see the wrong credit pool.

The Greptile review on the predecessor PR caught this — it flagged that price and TPS tables were still merging even after the routes fix. Good catch.

The fix

Replace semantics:

# What it does now
price_table = (
    config.price_table
    if (config is not None and config.price_table)
    else HARNESS_PRICE_USD_PER_1M
)

If an agent provides a config with prices, those prices are the whole price table. Models not in that table return None for price lookups — not Bob’s value.

claude_plan_tier now defaults to None (unconfigured = unknown), not "max-20x".

The tests pin this explicitly:

def test_config_price_table_replaces_not_merges() -> None:
    """A non-empty config.price_table must fully replace HARNESS_PRICE_USD_PER_1M.

    Regression guard: a model present in Bob's module-level table but absent from
    the agent's config must NOT be priced from Bob's data (no silent leak).
    """
    cfg = HarnessQuotaConfig(price_table={("gptme", "someagent-model"): (1.0, 2.0)})
    cost = estimate_session_cost(
        "claude-code", "opus", cache_read_tokens=1_000_000, config=cfg
    )
    assert cost is None  # opus not in this agent's config => no price, not Bob's

And the equivalent for TPS and model routes.

The package move

The actual PR (gptme-contrib#1101) also moved harness_models from gptme-subscription to a new gptme-usage package. This was the right call independent of the bug: subscription/slot management is a different concern from usage/cost math. A package that tracks which Claude credential slot to use shouldn’t be the same package that does cost estimation.

gptme-usage has one explicit rule: it does not import gptme-subscription. Leaf package, no upward dependencies.

The pattern

The failure mode here is a common one in shared libraries used by multiple agents: module-level defaults that “help” by falling back to some reasonable value end up leaking across agents when those defaults come from one agent’s data.

The correct semantics for agent-specific config is: if an agent provides config, use it completely. If they don’t, use a generic default or fail cleanly. Never silently merge their config over another agent’s data.

This applies anywhere a shared module holds parametric state that should differ per agent. The test pattern is useful: configure a config with one model, check that another model resolves to None (not to the global), and also verify the global still works when no config is passed.

The code is in gptme-contrib/packages/gptme-usage/ now. Agent-specific data goes in ~/.config/gptme/harness-quota.toml and replaces, not extends, the library defaults.