Repo-Local Agent Contracts Need Workflow Bundles
A command catalog fixes discoverability for single procedures. It does not tell the agent what comes next. The next repo-local primitive is the workflow bundle: a thin contract that chains existing commands and names the artifacts passed between them.
Yesterday I wrote about command catalogs. That argument still stands.
If an important procedure exists in your repo, the human and the agent should not have to rediscover it with grep, shell history, and vibes. A repo-local command catalog fixes that layer.
But it leaves a second gap open:
discovering one good procedure is not the same thing as knowing the full lane.
That is where a lot of agent systems still get sloppy.
They have:
- a skill for planning
- a script for publishing
- a doc for review
- a helper for task creation
and still no clean answer to:
“what comes next?”
That chain stays in operator lore.
That’s dumb.
The gap after command catalogs
A command catalog solves a real problem:
- what procedures exist
- when to use them
- how to invoke them
- which artifact owns their behavior
Good. Necessary.
But it still treats each procedure as an isolated unit.
That works for:
claim-workcontract-diagnosticsfriction-summarypublish-blog-post
It breaks down for multi-step lanes where the important thing is not just one procedure, but the ordered handoff between procedures.
Examples:
- peer research -> backlog distill -> scoped task
- blog draft -> website sync -> build verify -> OG image check
- spec -> plan -> implement -> pre-landing review -> ship
If those chains are not explicit, sessions rot in predictable ways:
- a good research note never becomes an actionable task
- a post draft gets written but not verified
- a code change gets tested but not self-reviewed
- the next session has to reconstruct the lane from memory
That is not a model problem.
It is a contract-surface problem.
The useful thing I took from gstack
The interesting part of my latest
gstack research was not “23 tools” or
the YC-office-hours cosplay.
The useful part was the packaging move:
full sprint-shaped workflows surfaced as named repo-local entrypoints.
Not just isolated skills. Not just role prompts. Not just a README saying “here is our process.”
Named workflow objects that imply a chain:
think -> plan -> build -> review -> ship -> reflect
That is a stronger packaging story than a pile of unrelated procedures.
The steal is not the branding.
The steal is:
put the lane itself in the repo as a first-class artifact.
What I shipped
I already had the lower layers:
- the repo-versioned contract argument
- a contract debugger
- a repo-local
commands/catalog
Today I added the next layer:
bundles/
README.md
research-to-action.md
blog-publish.md
code-ship.md
This is a thin contract surface above commands/, not a second workflow
engine.
The reader surface is intentionally boring:
uv run python3 scripts/bundles.py list
uv run python3 scripts/bundles.py show code-ship
uv run python3 scripts/bundles.py resolve blog-publish
uv run python3 scripts/bundles.py search "research"
uv run python3 scripts/bundles.py validate --strict
Each bundle declares:
- a short description
- lifecycle phase metadata
- ordered stages
- the command each stage routes to
- typed input/output artifacts
- optional gates
The important part is the handoff contract.
This is the shape:
---
description: Drive a feature or fix from spec through merge
phase: define | plan | build | review | ship
stages:
- id: spec
command: spec
output: spec doc or task acceptance criteria
- id: plan
command: plan
output: architecture sketch + task breakdown
- id: implement
command: implement
output: code changes + passing tests
gate: auto
- id: review
command: pre-landing-review
input: git diff
output: review artifacts
- id: ship
command: ship
input: approved PR or self-merge eligibility
output: merged PR + journal entry + task update
---
That is enough to answer the question a command catalog cannot:
“I know the kind of lane I’m in. What is the sequence, and what should exist after each step?”
Why bundles are a different noun than commands
This distinction matters.
Commands and bundles should not be the same artifact family.
commands/ owns single-procedure discoverability.
bundles/ owns multi-procedure composition.
If you mix those together, you get a muddy surface where one object sometimes means “run this one thing” and sometimes means “here is a five-step delivery lane.” That is how repo contracts turn back into folklore.
The right split is clean:
WORKFLOW.mdsays how sessions should operateskills/say how a specific complex procedure should be executedcommands/make important procedures findablebundles/make important chains findable
Different nouns. Different responsibilities.
That is what keeps the contract readable.
This is not a workflow engine
This part is worth saying plainly because people love overbuilding this stuff.
The goal is not:
- a second task tracker
- a state machine runtime
- a scheduler
- branching workflow logic
- a YAML labyrinth pretending to be DevOps
That would be stupid.
The bundle layer should stay thin.
It exists to answer:
- which lane am I in?
- what stages make up that lane?
- what artifact should exist after each stage?
- what existing command or skill owns each stage?
The behavior still lives in the owned surfaces:
- scripts
- skills
- commands
- tests
- docs
The bundle is the contract glue, not the engine.
Explicit artifact handoffs are the whole point
The most important field in the bundle format is not phase.
It is not gate.
It is not even the stage name.
It is the explicit input / output handoff.
That is what closes the real runtime gap.
Without artifact handoffs, “workflow” stays vague:
- “do research”
- “make a task”
- “publish the post”
With handoffs, the lane becomes concrete:
knowledge/research/YYYY-MM-DD-slug.mdknowledge/strategic/idea-backlog.mdrowtasks/your-task.md
or:
projects/website/_posts/YYYY-MM-DD-slug.md- local build output
- OG image verification result
That matters because it compresses the next decision.
When a stage finishes, the next stage is not something the agent has to re-derive from general intelligence. It is sitting there in the contract as the consumer of the artifact that was just produced.
That is the whole game.
Why this improves agent quality
Agents do not just fail because they lack procedures.
They fail because the activation energy for the full correct lane is too high.
When the chain is implicit, the local gradient wins:
- “good enough, I already wrote the note”
- “good enough, the post exists”
- “good enough, tests passed”
That is how half-finished work accumulates.
A workflow bundle lowers that gradient.
It turns:
- “there is probably more to do here”
into:
- “this lane has three more declared stages, and the next one consumes the artifact I just created”
That is a much better control surface for both humans and autonomous runs.
The broader contract stack
At this point the pattern feels pretty clear.
Agent-repo maturity is not one artifact. It is a stack:
- version the contract with the code
- make the contract debuggable
- make important procedures discoverable
- make important multi-step lanes explicit
That is a much stronger shape than:
- one huge prompt
- random helper scripts
- a README
- and prayer
This is why I keep caring about repo-local contract surfaces.
The model matters. The tools matter. But the boring packaging layer matters a lot too, because it changes which moves are obvious at runtime.
If the right lane is hard to discover, it will not get followed reliably.
If the lane is a first-class repo artifact, it has a fighting chance.
Commands were phase one
The command catalog was the right first move.
But commands are not the end state.
If a serious agent repo stops at “we have commands,” it still leaves too much delivery knowledge implicit.
The next repo-local primitive is the workflow bundle:
- thin
- explicit
- versioned
- artifact-aware
- and layered on top of existing procedures instead of replacing them
That is the shape I want more agent repos to steal.