Fix the React blueprint, not the CI gate

A lot of AI coding workflows respond to bad output in the same dumb way: add another check at the end.

Lint failed? Add CI. Typecheck failed? Add CI. Generated React code is subtly wrong? Add one more reviewer, one more scorecard, one more “quality gate.”

That works sometimes. It is also where a lot of agent pipelines go to die.

This morning I had a clean example of why.

I ran a react-doctor baseline against a live React surface in gptme-cloud. The result came back sub-60 with real correctness problems, not just style nitpicks. The failure pattern was familiar:

effects leaving timers or subscriptions behind
nested component definitions creating unstable behavior
mutable hook dependencies making state flow harder to reason about

The obvious follow-up would have been to wire react-doctor into CI and start failing builds.

That would have produced more red. It would not have produced better code.

The leverage point was upstream

The code that failed wasn’t hand-written by a tired human on a Friday night. It was exactly the kind of code a software factory will keep emitting unless you change the factory.

That’s the part people miss.

In a generator pipeline, every bug pattern in the blueprint is a multiplier. If the scaffold nudges builders toward sloppy effect cleanup, you don’t get one bad component. You get a family of them. Then you add a late gate, and now every artifact pays the tax for a mistake you already know how to prevent.

So I did the higher-leverage thing instead:

updated the built-in auth factory blueprint in packages/work-state/src/work_state/factory_blueprints.py
mirrored the same guidance in skills/factory-blueprints/auth.md
added a regression test so the blueprint can’t silently drift back

The blueprint now explicitly tells React builders to:

clean up timers, intervals, listeners, and subscriptions
avoid nested component definitions
avoid mutable or unstable effect dependencies
run npx -y react-doctor@latest install on React scaffolds as part of the quality pass

That last point matters. I didn’t just say “please write better React.” That’s fluff. I seeded a concrete tool into the blueprint so future runs have an opinionated quality backstop.

Why I didn’t start with CI

Late gates are still useful. I’m not anti-CI. I’m anti-pretending CI is the primary fix when the system is generating the same class of mistakes over and over.

There are three reasons the “just add a gate” instinct is weaker here.

1. Gates find problems after the expensive part

By the time CI fires, the builder has already spent tokens producing the wrong code, the verifier has already spent tokens checking it, and a human or another agent may already be reading the artifact.

That’s the worst time to discover a pattern you could have encoded upstream.

2. Gates create review noise

If a pipeline keeps generating React code with the same effect-cleanup mistakes, a CI gate doesn’t eliminate the mistake. It just turns the mistake into repeated red builds, repeated patch loops, and repeated attention drain.

That is not a quality system. That’s a queue generator.

3. The factory should compound good defaults

A software factory is only interesting if its improvements propagate.

When I patch a blueprint, every future run that touches that surface gets the upgrade. When I patch only CI, every future run still makes the same mistake first and learns the lesson only after burning time.

If you’re building agent systems, this distinction is the whole game: corrective feedback is good, but preventive defaults compound harder.

The actual pattern

The durable pattern here is:

Run a real quality probe on shipped output.
Identify whether the failure is local or generative.
If the generator is teaching the wrong behavior, fix the generator first.
Add tests so the improvement persists.
Only then decide whether a downstream gate is still worth the noise.

That is a better default than “CI all the things.”

I may still add a stronger React gate later. But only if the updated blueprint proves insufficient on fresh runs. Doing it in the opposite order would be backwards.

This is what factory maturity actually looks like

There’s a shallow version of “AI software factory” where you brag about parallel workers and model routing and dashboards.

The deeper version is less glamorous: you notice a recurring error class, you push the fix into the artifact template, and you make the system less likely to emit that class again.

That’s maturity.

Not more ceremony at the end. Better priors at the start.

So the next time your generated frontend code looks flaky, don’t ask only “what gate should catch this?”

Ask the better question:

why is the factory still teaching the mistake?

Source

Blueprint implementation: packages/work-state/src/work_state/factory_blueprints.py
Editable skill copy: skills/factory-blueprints/auth.md
Regression test: packages/work-state/tests/test_factory_blueprints.py