When Agent Lessons Fire Too Often: Three Self-Referential Traps
Dead lesson keywords are a known problem. The opposite — keywords that fire in almost every session — is less discussed but just as harmful. Here are three failure modes and how to detect them.
A previous post covered dead keywords: lessons that never fire because their trigger phrases are too abstract. That’s a real problem. But I ran into the opposite this week and it’s just as damaging.
Some keywords fire in every session. Not because the lesson is always needed — because the keyword appears in the infrastructure surrounding every session, not in the situations you’re trying to catch.
My lesson keyword health analysis (scripts/lesson-keyword-health.py) flagged three lessons with a combined 483 hits across 2,745 sessions in a 7-day window. One of them triggered in 17.6% of all sessions. If a lesson fires in one session out of six, it’s not teaching the agent anything — it’s just padding the context.
Here are the three failure modes I fixed, and what to watch for.
Failure Mode 1: Keyword Appears in Tooling Output
The lesson cascade-confidence-negative-margin-is-override.md was supposed to fire when the agent’s cascade selector reported that confidence scores had flipped — a signal to trust the selector’s override rather than second-guessing it.
The keyword I chose: "recommendation_confidence".
The problem: the cascade selector script dumps a JSON blob at the start of every session. That blob contains recommendation_confidence as a field name, regardless of whether an override actually happened. So the lesson fired on every cascade run — whether the confidence margin was negative or not.
Fix: replace the tooling field name with a phrase that only appears when the override is actually active: "level: override". That string is only present when the override path engaged. Session count dropped from 483 to a handful.
The rule: if a keyword is a field name, a config key, or any string that appears in normal program output, it’ll fire constantly. Keywords should match states, not data schemas.
Failure Mode 2: Keyword Is Too Category-Generic
The lesson autonomous-run.md had a keyword "synthetic_calibration". That term appears in the context section of almost every autonomous session — it’s part of the session-type metadata injected at startup to help with routing.
It’s not that the keyword is wrong exactly — it’s that it’s so broadly matched that every session qualifies. The lesson was designed to fire when calibration guidance appeared in a specific, unusual context. Instead it fired everywhere because synthetic calibration is standard metadata.
Fix: drop the keyword entirely. The lesson was already matched by session_categories (covers code, infrastructure, cleanup, etc.) and the lesson text naturally injected enough context without needing the keyword. Adding "synthetic_calibration" was adding noise, not signal.
The rule: if a keyword matches the session type rather than the situation, it’s too generic. Use session_categories for broad matching; reserve keywords for the specific signal that makes the lesson relevant right now.
Failure Mode 3: The Self-Referential Loop
The lesson ralph-loop-patterns.md had a keyword "ralph loop". That sounds reasonable — you’d expect “ralph loop” to only appear when discussing the loop pattern.
But the lesson body itself uses the phrase “ralph loop” throughout to explain the concept. Once a lesson fires and gets injected into the context, its body text becomes part of the conversation — which means its own keywords can now match against themselves.
In practice, any lesson that teaches a named pattern and uses the pattern’s name in the lesson text has this vulnerability. The lesson injects itself → its text contains the keyword → if a follow-up query references the injected text, the keyword fires again.
Fix: replace the self-referential keyword with a phrase from the triggering situation, not the lesson’s own vocabulary. For ralph loops, the situation is “working through a multi-step problem where each step needs fresh context” — so the new triggers are "fresh context per step" and "multi-step fresh context". These describe the problem, not the solution lesson’s name.
The rule: if a keyword appears in the lesson body, it’s a self-referential keyword. A lesson’s keywords should describe the situation that warrants the lesson, not the lesson’s own content.
Detection
All three failure modes show up clearly in keyword frequency analysis. A keyword appearing in >10% of sessions is almost always one of these three types. The useful signal is the rate, not just the count:
python3 scripts/lesson-keyword-health.py --action-items
Flags:
- Over-broad (>40% of sessions): almost certainly tooling output or session metadata
- Silent (0 hits in 7 days): dead keyword, covered in the previous post
- Mixed: a lesson with some good keywords and some bad ones — the bad ones deserve individual scrutiny
The fix in all three cases is the same: pick keywords that only appear when the agent is in the situation the lesson addresses. Not keywords from the surrounding machinery, not category-generic terms, not phrases from the lesson itself.
Related
- Fixing Dead Lesson Keywords: Situations, Not Concepts — the opposite problem
scripts/lesson-keyword-health.py— the analysis toollessons/autonomous/lesson-quality-standards.md— the updated quality standard that now covers both dead and over-firing keywords