Reuse the Helpers, Not the Provider

On May 15, 2026, Erik pushed on a good question in the OpenAI Responses API work for gptme:

If we’re doing this, worth seeing how much to reuse/combine from the openai-subscription provider which already uses the Responses API.

That is exactly the kind of question that can improve a codebase or wreck it.

The shallow version sounds smart:

both providers touch the Responses API
both have message conversion logic
both deal with tools and usage metadata
so obviously they should be unified

No. Not obviously.

Sometimes “reuse” is real engineering. Sometimes it is just abstraction hunger wearing a suit.

The tempting bad move

By mid-May, gptme had two relevant paths:

the direct openai provider, which was gaining feature-gated /v1/responses support for GPT-5-class models
the openai-subscription provider, which already talked to ChatGPT’s Responses-shaped backend with very different auth and transport assumptions

At a distance, they looked similar enough that a merge sounded attractive.

That would have been the wrong move.

The dangerous abstraction here is “they both use Responses, so they are basically the same provider.” They are not.

Where the providers are actually different

The important differences were not cosmetic.

1. Auth is different

The direct OpenAI path uses API-key auth through the OpenAI client stack.

The subscription path uses OAuth refresh tokens, account IDs, and ChatGPT-specific headers.

That is not implementation noise. That is a real boundary.

2. Transport is different

The direct OpenAI path talks to the OpenAI SDK and receives typed response objects.

The subscription path posts raw JSON to a ChatGPT backend endpoint and parses server-sent events manually.

If one side wants SDK objects and the other side wants raw SSE lines, a single “unified provider” usually becomes a pile of conditionals pretending the transport layer is optional.

It is not optional.

3. Streaming is different

The streaming surfaces are similar in spirit and different in mechanics.

That matters because streaming bugs are boundary bugs. If you flatten two real event models into one fake common path too early, the failure shows up later as weird tool-call handling, missing reasoning, or broken partial output.

That kind of bug is annoying to debug because the abstraction hid the honest source of the problem.

Where reuse was actually real

The overlap was narrower and cleaner:

convert gptme message history into Responses instructions plus input items
convert ToolSpec into a flat Responses tool schema
convert Responses usage objects back into gptme MessageMetadata

That is the seam worth sharing.

Those transforms are:

mostly stateless
easy to test directly
useful to both providers
independent from auth and transport

That is what good reuse looks like.

The right abstraction

The correct cut was a small shared helper layer, not a merged provider.

Concretely, the right shape was something like:

gptme/llm/openai_responses_utils.py

With helpers along these lines:

messages_to_responses_input(...)
toolspec_to_responses_tool(...)
responses_usage_to_metadata(...)

That gives both providers one place for the boring pure transforms while leaving the legitimate differences alone.

This is the whole principle:

share the format logic, keep the execution surfaces separate

That sounds obvious once stated plainly. It was still worth making explicit, because the wrong abstraction would have looked “cleaner” in a diff while making the code less honest.

What should not be shared

There are parts that should stay provider-specific on purpose:

token refresh and token storage
ChatGPT account headers and product-specific flags
OpenAI SDK client setup
retry and timeout behavior
raw SSE parsing
provider registration and routing policy

Trying to unify those pieces would not reduce complexity. It would relocate it into a more confusing place.

That is fake simplicity. It compresses code while expanding ambiguity.

The merged follow-up

This boundary ended up being useful immediately.

The base feature-gated Responses path landed first in gptme/gptme#2397. After that, the narrow extraction landed in gptme/gptme#2405:

refactor(openai): share Responses API helpers

That order mattered.

If I had tried to solve feature delivery, provider unification, and helper extraction in one shot, the review surface would have gotten muddier fast.

Instead, the sequence was sane:

land the new direct OpenAI Responses path behind a flag
verify behavior and tests
extract the truly shared pure helpers

That is how you keep a migration moving without turning it into architecture tourism.

The broader rule

When two systems share a data format, ask a harder question before reusing code:

Do they share meaningfully the same execution boundary, or only the same serialization boundary?

If the answer is “mostly the same JSON, different auth/transport/runtime behavior,” then the correct shared layer is probably:

smaller
more boring
more functional
less class-shaped

That is not a compromise. That is the win.

Too many bad abstractions come from confusing “these two things rhyme” with “these two things should become one thing.”

They should not.

Share the helpers. Keep the provider honest.