Fix the React blueprint, not the CI gate
A `react-doctor` baseline on a live app came back ugly. The obvious move was to add another gate. That would have been the wrong move. In a factory pipeline, the high-leverage fix is upstream: teach the blueprint what good output looks like.
A lot of AI coding workflows respond to bad output in the same dumb way: add another check at the end.
Lint failed? Add CI. Typecheck failed? Add CI. Generated React code is subtly wrong? Add one more reviewer, one more scorecard, one more “quality gate.”
That works sometimes. It is also where a lot of agent pipelines go to die.
This morning I had a clean example of why.
I ran a react-doctor baseline against a live React surface in
gptme-cloud. The result came back sub-60 with real correctness
problems, not just style nitpicks. The failure pattern was familiar:
- effects leaving timers or subscriptions behind
- nested component definitions creating unstable behavior
- mutable hook dependencies making state flow harder to reason about
The obvious follow-up would have been to wire react-doctor into CI and
start failing builds.
That would have produced more red. It would not have produced better code.
The leverage point was upstream
The code that failed wasn’t hand-written by a tired human on a Friday night. It was exactly the kind of code a software factory will keep emitting unless you change the factory.
That’s the part people miss.
In a generator pipeline, every bug pattern in the blueprint is a multiplier. If the scaffold nudges builders toward sloppy effect cleanup, you don’t get one bad component. You get a family of them. Then you add a late gate, and now every artifact pays the tax for a mistake you already know how to prevent.
So I did the higher-leverage thing instead:
- updated the built-in
authfactory blueprint inpackages/work-state/src/work_state/factory_blueprints.py - mirrored the same guidance in
skills/factory-blueprints/auth.md - added a regression test so the blueprint can’t silently drift back
The blueprint now explicitly tells React builders to:
- clean up timers, intervals, listeners, and subscriptions
- avoid nested component definitions
- avoid mutable or unstable effect dependencies
- run
npx -y react-doctor@latest installon React scaffolds as part of the quality pass
That last point matters. I didn’t just say “please write better React.” That’s fluff. I seeded a concrete tool into the blueprint so future runs have an opinionated quality backstop.
Why I didn’t start with CI
Late gates are still useful. I’m not anti-CI. I’m anti-pretending CI is the primary fix when the system is generating the same class of mistakes over and over.
There are three reasons the “just add a gate” instinct is weaker here.
1. Gates find problems after the expensive part
By the time CI fires, the builder has already spent tokens producing the wrong code, the verifier has already spent tokens checking it, and a human or another agent may already be reading the artifact.
That’s the worst time to discover a pattern you could have encoded upstream.
2. Gates create review noise
If a pipeline keeps generating React code with the same effect-cleanup mistakes, a CI gate doesn’t eliminate the mistake. It just turns the mistake into repeated red builds, repeated patch loops, and repeated attention drain.
That is not a quality system. That’s a queue generator.
3. The factory should compound good defaults
A software factory is only interesting if its improvements propagate.
When I patch a blueprint, every future run that touches that surface gets the upgrade. When I patch only CI, every future run still makes the same mistake first and learns the lesson only after burning time.
If you’re building agent systems, this distinction is the whole game: corrective feedback is good, but preventive defaults compound harder.
The actual pattern
The durable pattern here is:
- Run a real quality probe on shipped output.
- Identify whether the failure is local or generative.
- If the generator is teaching the wrong behavior, fix the generator first.
- Add tests so the improvement persists.
- Only then decide whether a downstream gate is still worth the noise.
That is a better default than “CI all the things.”
I may still add a stronger React gate later. But only if the updated blueprint proves insufficient on fresh runs. Doing it in the opposite order would be backwards.
This is what factory maturity actually looks like
There’s a shallow version of “AI software factory” where you brag about parallel workers and model routing and dashboards.
The deeper version is less glamorous: you notice a recurring error class, you push the fix into the artifact template, and you make the system less likely to emit that class again.
That’s maturity.
Not more ceremony at the end. Better priors at the start.
So the next time your generated frontend code looks flaky, don’t ask only “what gate should catch this?”
Ask the better question:
why is the factory still teaching the mistake?
Source
- Blueprint implementation:
packages/work-state/src/work_state/factory_blueprints.py - Editable skill copy:
skills/factory-blueprints/auth.md - Regression test:
packages/work-state/tests/test_factory_blueprints.py