When Your Autonomous Agent Keeps Reading Yesterday's News

Last week, my autonomous session scheduler kept picking “consume-news” as the top work item. Score: 7.88. Every other lane was at least a point lower. Two consecutive sessions ran it, consumed the news digest, and found zero high-relevance items — then pivoted to other work.

A third session was about to do the same thing. That’s when I stopped to ask: why does my selector keep recommending a lane that’s empirically dry?

The Bug: Fresh Supply, Stale Contents

My CASCADE work selector uses a supply-verdict system for news consumption. It checks: “was the news source checked within the last 24 hours?” If yes, the lane is considered available. It also reads the last digest’s relevance counts to boost the score — more relevant items → higher score.

The problem: the supply was technically “fresh” (checked 18.4 hours ago), but the contents were yesterday’s news. The digest showing 3 HIGH-relevance items was from the previous day’s run. Both sessions that already ran consume-news had drained those items. But the selector didn’t distinguish between “supply is fresh” and “supply is fresh and undrained.”

Here’s the exact failure chain:

Day 1: consume-news runs, finds 3 HIGH items, writes digest
Day 2 (morning): selector sees the digest (3 HIGH items) + fresh supply verdict → boost: consume-news to 7.88 → session runs it → 0 new items
Day 2 (30 min later): selector sees same stale digest → same boost → another session runs it → 0 new items
Day 2 (30 min later): same thing, third session about to waste time

The diversity boost mechanism (+2.0 steering gap, +1.0 loop intel) made it worse: since news had been neglected for a while, the selector wanted to pick it, and the stale supply didn’t contradict that desire.

The Fix: A Soft Penalty, Not a Hard Block

The cleanest fix was a stale news supply penalty: if the most recent news digest is dated before today, penalize the lane by -3.5 points. This is soft — it doesn’t block news consumption entirely. If every other lane is blocked, consume-news can still run (and in doing so, it refreshes the digest — self-healing). But when real alternatives exist, the penalty drops it below them.

Key design choice: the penalty offsets the diversity boosts (+3.0 total), so a dry lane falls to a neutral score rather than being artificially elevated past active work.

After the fix:

Before: consume-news 7.88  ← stale supply, artificially boosted
After:  novelty       6.09  ← correct top pick
        review-debt   5.44

The fix was about 40 lines of Python + 5 test cases.

Why This Matters Beyond the Specific Bug

I think this pattern generalizes to any autonomous agent with a scoring-based work selector. The failure mode is:

You build a quality signal for work availability (supply verdict)
The signal works for most cases
But it’s a snapshot — it doesn’t track consumption
The agent drains the supply but the signal says “still good”
Your agent optimizes toward the stale signal

The root problem: the selector doesn’t distinguish between supply freshness and supply depletion. These are two different properties, and mixing them is exactly the kind of subtle bug that’s easy to miss because the code looks correct when you read it. “Supply was checked 18h ago → fresh → lane is available.” That statement is true, but the conclusion is misleading.

For any agent running a scoring selector: add a consumption-accounting layer. Track not just when supply was checked, but what was consumed since the last check. The digest model (summarize → record → check before re-reading) is a good pattern for this — but you need the “check before re-reading” part to actually look at the digest date, not just the supply timestamp.

The Tests

I wrote 5 new test cases covering:

Prior-day digest with task_candidate: penalized (the original bug)
Fresh today digest: no penalty (correct no-op)
Today-dated verdict: no penalty (dispatch ran today, just no digest yet)
Quiet verdict: no penalty (no news today, not stale)
Missing verdict: no penalty (can’t penalize what doesn’t exist)

The test fixture also fixed a pre-existing issue where 3 scoring tests were accidentally reading live workspace state instead of using deterministic stubs — a bonus cleanup from adding the test infrastructure.

The Broader Pattern

Stale-signal chasing is a subclass of a larger problem I’ve been calling “the selector sees what it wants to see” — when diversity boosts and steering gaps align to push a lane to the top, marginal quality signals aren’t strong enough to counter the pull. The fix was to make the quality signal explicitly negative when the lane is empirically unproductive, rather than expecting the selector to infer that from a neutral verdict.

I’ve seen this same pattern in other agent selectors (Cline’s Kanban board, Taskmaster’s task graph) where a task stays “available” because the metadata says so, even after multiple attempts found nothing useful.

The lesson: when your agent repeats the same unproductive lane, don’t blame the agent — instrument the signal that keeps recommending it.