Workflow Bundles Over Commands: A Thin Composition Layer for Agent Workspaces

Yesterday I shipped Agent Procedures Need a Command Catalog.

That solved one problem cleanly:

what procedures exist in this repo?

Today I pushed the next layer:

how do those procedures compose into a real lane?

That is what bundles/ is for.

But the first draft had a bug in the concept.

The bug was not in the bundle files. The bug was in the surface around them.

The Bundle Layer

Inspired by gstack’s workflow-bundling insight, I added a bundles/ directory as a thin composition layer above commands/.

Each bundle is a Markdown file with YAML frontmatter that declares stages:

---
description: Publish a blog post from draft through website sync to image verification
phase: verify
stages:
  - id: draft-sync
    command: publish-blog-post
    output: projects/website/_posts/YYYY-MM-DD-slug.md
  - id: build-verify
    command: blog-build-verify
    input: projects/website/_posts/YYYY-MM-DD-slug.md
    output: projects/website/_site/ (build output)
  - id: og-verify
    command: og-image-check
    input: projects/website/_posts/YYYY-MM-DD-slug.md
    gate: auto
---

That’s it. No engine, no task tracker, no second source of truth. The stages reference commands from commands/, and the bundle owns the ordering and artifact handoffs — not the implementation.

The First Draft Was Still Too Folkloric

My first pass proved the file format, but it still had the same weakness that many agent repos have:

you could list bundles,
you could show one by name,
you could resolve one to concrete commands,
but you still mostly needed to know what you were looking for.

That is not enough.

If a bundle exists but the human or agent has to remember its exact name, the workflow is still half folklore.

This is the same mistake people make with skills and scripts:

“It exists in the repo” is not the same as “it is easy to discover at runtime.”

What Made The Surface Real

The follow-up change was simple:

add search
add --format json

Now the bundle layer is queryable by:

name
phase
stage ids
stage commands
description
“when to use” text

And it can be emitted as a machine-readable catalog instead of only terminal text.

That changed the surface more than the bundle files themselves.

uv run python3 scripts/bundles.py search "publish blog"
uv run python3 scripts/bundles.py --format json list

The first command answers “what lane handles this?” without exact-name memory. The second answers “how can another tool consume the bundle catalog?” without scraping prose.

Why Search Matters

Search results for "publish blog" - 1 match(es)

blog-publish [verify]
  Publish a blog post from draft through website sync to image verification
  stages: draft-sync, build-verify, og-verify
  when: draft written and ready to publish

This matters because exact names are a fake usability story.

Humans do not remember every bundle slug in a busy repo. Agents do not either. What they remember is the intent:

“publish a blog post”
“turn research into a task”
“ship code safely”

If the runtime surface cannot resolve that intent into a lane, the bundle layer is decorative.

Search is what turns the bundle from a note into an interface.

Why JSON Matters

The more important addition might actually be --format json.

Once the bundle catalog can be emitted structurally, it stops being only a human-facing helper and starts becoming an input to:

selectors
contract diagnostics
UIs and TUIs
agent planning surfaces
other commands that need to answer “what lane owns this artifact?”

That is the difference between documentation and a runtime object.

Markdown remains the durable authoring layer. JSON becomes the interchange layer.

Why This Pattern Works

Composition without coupling. The bundle doesn’t know how the command implements the step. The command doesn’t know what bundle invoked it. They share only a contract: a named procedure that produces a named artifact.
Discoverable by query, not just convention. bundles/*.md still matter, but search and JSON output are what make the catalog usable under time pressure.
No engine, no lock-in. This is the opposite of a workflow engine. It’s a packaging layer. If the bundle pattern doesn’t work, delete the directory and nothing breaks. The commands still work.
Phased rollout. I now have three pilot bundles:
- research-to-action
- blog-publish
- code-ship

Each one is small enough to delete if the pattern proves fake.

The Anti-Pattern I Avoided

I almost stopped after the first draft and called it done.

That would have been dumb.

A bundle layer that is readable but not searchable, and inspectable but not machine-readable, still leaves too much activation energy on the table. The repo would technically contain the workflows while still failing to surface them when needed.

That is the same failure mode I called out in the command-catalog post, just one layer up the stack.

The bigger anti-pattern would have been overcorrecting into a workflow engine. That is also dumb.

The right move is thinner:

version the lane as Markdown
resolve it through existing commands
expose query surfaces for humans and agents
stop there until reality proves more machinery is needed

What’s Next

The bundle contract is still in pilot mode, but now it has a believable runtime surface.

The next real test is straightforward:

use bundle search during normal work selection
feed JSON output into other contract/debug surfaces
see whether this reduces workflow archaeology in actual sessions

If it does, the pattern stays. If it does not, I delete it.

That is the whole point of keeping the layer thin.

Repo Maps Before Edits — another discovery-layer pattern
The Health Check That Wasn’t There — why discovery matters
Agent Procedures Need a Command Catalog — the lower layer this builds on