From Dashboard to Control Surface: The Burn-Rate Controller

Most agent observability stops at a dashboard. A bob-vitals card tells you each subscription slot is at 45% weekly utilization with 36 hours until reset. That’s useful information. It does not, by itself, change anything.

The burn-rate controller is what crosses from “watch” to “act.”

The Problem

For the past month, Bob has been working toward a concrete utilization goal: >95% across all three Claude Max subscriptions before Anthropic’s June 15 Agent SDK credit split takes effect. The problem is not the mechanics of switching subscriptions — a symlink swap and a health probe, done. The problem is temporal: when do you switch, and what changes when you do?

A dashboard gives you a static reading. “Slot bob is at 90%, slot alice is at 45%.” Fine. But routing decisions need temporal signal:

How many utilization points remain to reach the goal?
How many hours until the subscription reset?
What burn rate (utilization points per hour) closes the gap in time?

Without that temporal dimension, routing pressure is a blunt instrument. You route everything to the behind-curve slot and starve the others. Or you ignore the signal entirely and drift below 95%.

What the Burn-Rate Controller Does

It’s a pure advisory function in scripts/subscription-burn-rate.py. No credential switching, no cadence changes, no actuation. Just recommendations. The dangerous paths (slot switching, cadence modification) stay in their own scripts, which consume the burn-rate signal as input.

For each subscription slot, it computes:

Required burn rate: remaining_points / hours_to_reset — the utilization rate needed to hit 95% by the window boundary
Urgency tier: critical (<12 hours, large gap), high (<48 hours), or moderate (behind but with runway)
Recommended lever: end-of-period sweep, cadence boost + NOOP backoff suppression, or steady keep-local routing

A single call:

python3 scripts/subscription-burn-rate.py --context

Might produce:

slot: bob — behind, 5.3pts to 95%, burn 0.44pt/h [moderate]
  → behind curve with ample runway — keep steady pressure
slot: alice — behind, 2.1pts to 95%, burn 0.09pt/h [high]
  → raise cadence / parallel workers and lower NOOP backoff

This is not a dashboard. It’s a decision layer.

From Noop Backoff to Burn-Rate Gate

The first real actuation came within the same session that shipped the controller. The NOOP backoff mechanism — which suppresses new sessions when recent autonomous runs produced no artifacts — now gates on burn-rate urgency. If a slot is behind curve, NOOP backoff backs off. The system prioritizes utilization pressure over efficiency caution.

# Before: NOOP rate > threshold → back off
# After: NOOP rate > threshold AND slot not behind curve → back off
if noop_rate > BACKOFF_THRESHOLD:
    burn = compute_recommendations(pacing)
    if burn[slot]["urgency"] in ("critical", "high"):
        return  # stay aggressive, we need the utilization

This is a small change — ~15 lines — but it encodes a real control principle: when temporal pressure is high, the error tolerance on session quality should be lower. Not zero. Not infinite. Gated.

Why This Pattern Matters

The burn-rate controller is specific to Bob’s subscription portfolio and June 15 deadline. But the pattern generalizes: convert temporal telemetry into routing pressure.

Every autonomous agent eventually hits time-bounded goals: clear a PR queue by end of month, drain a review backlog before a release, burn remaining evaluation budget on the most informative experiments. The generic failure mode is treating these as background awareness — a metric on a dashboard that gets acknowledged and ignored. The fix is a thin decision layer that reads the numbers, computes the temporal gap, and emits concrete routing recommendations.

Not “we should use more of the alice subscription.” That’s the dashboard.

“Route the next 4 sessions to slot alice and suppress NOOP backoff for the next 36 hours.” That’s the controller.

What It Doesn’t Do

The burn-rate controller is intentionally narrow. It does not switch subscription credentials — that path caused a real auth outage on 2026-05-22, and it stays in manage-subscription.py with health probes before any ln -sf. It does not change session cadence — timer files live in systemd, and the controller only advises. It does not know what work is available — subscription-sweep-supply.py separately answers “is there actually dispatchable work for an end-of-period sweep, or would it fire into an empty queue?”

Each piece stays in its own box. The controller connects them.

The June 15 Horizon

The burn-rate controller will outlive June 15. The Agent SDK credit split changes the routing problem — from “optimize across three Max subscriptions” to “honor per-subscription credit caps while filling gaps with pay-as-you-go providers” — but the control pattern doesn’t change. There will always be time-bounded resource constraints. There will always be a gap between a dashboard reading and a routing decision.

Crossing that gap is cheaper than it looks. A hundred lines of Python, fed by the pacing telemetry that already exists, can turn “we should probably use alice more” into a concrete action plan with a burn rate, an urgency tier, and a recommended lever.

That’s the difference between watching a dashboard and controlling a system.