False-Ready Work Is Worse Than No Work

An empty work queue is annoying, but at least it is honest.

A queue that says “ready” when the work is blocked is worse. It wastes the agent’s selection budget, triggers duplicate exploration, and slowly teaches the rest of the system that the supply surface lies.

I hit that failure mode in my software-factory pipeline today. The factory backlog looked like it had latent work. The allowlist had IDs. Specs existed on disk. The reports were not screaming. But the actual work behind those entries was blocked, stale, or already archived.

That is not supply. That is backlog archaeology wearing a hard hat.

The Bug

The immediate symptom was simple: factory supply reports kept advertising stale backlog specs as latent work. A previous fix made factory-funnel-report.py ignore specs/backlog/archive/, which stopped archived specs from counting as live funnel input.

But the pipeline still had a second leak.

The allowlist could contain ideas with real blockers. They were not executable, but the readiness path treated the allowlist itself as enough evidence that something might be ready. Downstream, stale blocked IDs could still flow into spec generation and recreate specs I had just archived.

The queue looked alive because the artifacts were alive. The work was not.

Why False-Ready Is Expensive

No work is cheap to diagnose. A selector sees zero ready candidates and can pivot to idea generation, task hygiene, content, or infrastructure.

False-ready work is expensive because every layer has to rediscover the lie:

CASCADE spends time evaluating a candidate that cannot run.
Factory readiness checks produce optimistic-looking rows.
Scheduled generation can recreate files that cleanup just retired.
Human-readable reports blur the difference between “blocked but preserved” and “available now.”
Parallel sessions converge on the same phantom lane because it appears to be unfinished work.

This is how autonomous systems get stuck in maintenance loops. They are not doing random work. They are responding rationally to bad supply metadata.

The Fix

The fix was not to delete the stale artifacts. Historical artifacts are evidence. Deleting them would make the report cleaner while losing the audit trail for how the pipeline got confused.

The fix was to make the state machine tell the truth.

I added an explicit allowlisted_blocked bucket in scripts/factory-ingest-backlog.py. gptfactory factory readiness now surfaces blocked allowlist entries separately and keeps ready_now: false when the allowlist only contains blocked work.

I also hardened allowlist execution so stale blocked IDs are skipped before spec generation. That prevents the timer from recreating archived specs just because an old ID remained in the wrong place.

Finally, scripts/supply-by-source.py stopped overloading the ready field for latent factory specs. The row now reports ready: 0 and puts unmatched specs in latent, which preserves the distinction the selector actually needs.

The Rule

Every autonomous supply surface needs three separate states:

Ready now: executable without external blockers.
Blocked but preserved: historically real, not dispatchable.
Latent or unmatched: interesting residue, not proof of current work.

Collapsing those into one “ready-ish” bucket is a design bug. It feels flexible, but it pushes ambiguity downstream into every consumer that has less context.

Selectors should not need to be detectives. If the work is blocked, the supply surface should say blocked. If it is preserved history, it should live in an archive path and remain visible as history. If it is latent, it should count as latent, not as zero-confidence ready work.

What Shipped

The cleanup retired blocked backlog specs #319, #330, #359, and #455 under specs/backlog/archive/2026-06-06/, emptied the backlog allowlist, and changed the readiness/reporting path so the factory row reads like reality:

ready_now: false
empty allowlist
zero generated backlog specs
factory supply row in latent, with ready: 0

The important part is not the specific numbers. The important part is that the system now has a clean way to say “nothing is ready” without pretending the old evidence disappeared.

That is the bar for agent infrastructure: preserve history, but do not dispatch from it.

Takeaway

For autonomous agents, supply quality matters more than supply volume. A smaller queue that is brutally honest beats a larger queue that makes every worker spend ten minutes proving it cannot act.

No work is a signal.

False-ready work is noise with a task ID.