Edition 8: Hooks Become the Policy Plane

Every coding agent eventually runs into the same wall.

Prompt guidance is not enough.

You can tell the model “don’t do dangerous things,” “ask before writing,” and “stay inside the repo.” That works until the model is under pressure, the repo has weird setup requirements, or the user wants a little more automation than the prompt alone can safely carry.

That is why hooks are quietly becoming one of the most important surfaces in the coding-agent stack.

Not “hooks” in the old sense of random shell glue bolted onto a tool.

Hooks as a policy plane: a typed runtime surface where the operator or the repo can say:

block this tool call
rewrite this risky input
attach extra context
log this event
bootstrap this worktree
fail open here
fail closed there

Once you see the pattern, it is everywhere.

The shift: from extensibility to runtime governance

Older agent extension stories mostly answered one question: how do I add more capability?

install a plugin
add a skill
register a tool
point at an MCP server

The current wave is answering a different question:

how do I govern the capability I already have?

That is a much more consequential surface.

The strongest systems are no longer treating hooks as an afterthought for power-users. They are using them as the runtime layer where safety, approval behavior, auditing, and environment setup become explicit.

Four distinct hook models are emerging

Across the current agent landscape, hooks are converging into four different roles.

1. Blocking safety hooks

This is the clearest pattern.

The hook fires before a tool call or command executes. It can inspect the payload and decide whether the action should proceed.

Crush is the cleanest public example. Its PreToolUse hooks run before the permission UI, can allow, deny, or halt, can inject extra context, and can even rewrite tool input. That is not a cosmetic extension point. That is a real policy layer.

Kimi Code CLI pushes the same idea further with explicit lifecycle events like PreToolUse, PostToolUse, UserPromptSubmit, SessionStart, SessionEnd, and PreCompact. Exit code 2 blocks. Non-zero errors usually fail open. The key point is not the shell implementation. It is that the runtime semantics are spelled out.

Windsurf also exposes pre-hooks that can block via exit code 2, but does the more important thing: it puts the hook config in a repo-local tracked file at .windsurf/hooks.json. That turns project-specific runtime policy into a versioned artifact instead of a hidden local hack.

This is the first real break from “please be careful” prompt design.

2. Audit and automation hooks

Not every important hook needs to block.

Post-action hooks are where agents start to look less like chat apps and more like serious runtimes:

record the command that ran
emit telemetry
summarize a completed session
archive a transcript
trigger downstream automation

This is the layer where the runtime becomes observable.

Cline is strong here because it treats hooks, plugins, tools, schedules, and runtime events as typed product objects instead of lore. The sharp design choice is not any single hook. It is the fact that hook behavior is part of a named extension vocabulary with explicit policies like fail_open and fail_closed.

That matters because an agent runtime without explicit post-action semantics always degenerates into shell-script folklore. Things “happen after runs,” but nobody can quite say where the contract lives.

3. Bootstrap lifecycle hooks

This is the underrated category.

The glamorous use case for hooks is stopping rm -rf. The operationally useful use case is making parallel work actually start cleanly.

Windsurf has the sharpest example here with post_setup_worktree. The hook runs inside the new worktree and is explicitly meant to handle all the boring things that break isolated agent runs in practice:

copy .env files
restore ignored local state
run setup
make the new workspace usable

That is a huge deal.

A lot of multi-agent demos quietly assume the hard part is task routing. It isn’t. The hard part is that the second workspace starts broken because secrets, ignored files, local caches, or generated artifacts are missing.

A lifecycle hook attached directly to worktree setup is more valuable than another clever planner.

4. Prompt-boundary hooks

The most subtle model is the hook that fires before the model even reasons.

Kimi Code CLI’s UserPromptSubmit is a particularly interesting design. The hook can block before the model runs at all. If it does not block, its result is shown to the user but not blindly shoved into the model context.

That is a very modern instinct.

It acknowledges that not every runtime intervention should become more prompt text. Some policy belongs at the boundary around the model, not inside the model’s visible reasoning substrate.

This is one of the cleanest answers I’ve seen to a question most agent systems still muddle: when does a rule belong in prompt context, and when does it belong in the runtime that wraps the model?

The strongest projects are converging on the same thesis

Across these systems, the design converges on a simple idea:

runtime policy should be explicit, typed, and inspectable.

The details differ:

Crush emphasizes deterministic policy intervention before execution.
Windsurf emphasizes repo-local, version-controlled hook contracts.
Kimi Code CLI emphasizes full lifecycle coverage and a clear blocking contract.
Cline emphasizes a typed extension vocabulary with hook policies as first-class runtime objects.

But the underlying thesis is the same:

prompts are guidance, hooks are governance.

That distinction is becoming more important as agents get broader tool access, longer run durations, and more background autonomy.

What does not work

This convergence is real, but the failure modes are also obvious.

Prompt-only safety

If the only protection against a bad tool call is “the system prompt said not to,” the system is fragile.

That is not a theory point anymore. The recent safety-benchmark wave already showed that stronger models do not magically infer correct approval boundaries or least-privilege policies from vague instructions.

Runtime enforcement exists because prompts drift, models improvise, and task pressure changes behavior.

Hidden local hook sprawl

The opposite failure mode is also bad: hooks exist, but they live in a private shell history, a dotfile nobody else sees, or an operator’s memory.

That gives you local power but not a real contract.

The important step Windsurf gets right is not merely “support hooks.” It is “make workspace hooks a tracked repo artifact.”

If the repo depends on a runtime policy, the policy should live with the repo.

Hook maximalism

Hooks are not a substitute for every other control surface.

If a runtime uses hooks for everything, it becomes impossible to reason about:

which behavior comes from prompt context
which behavior comes from declarative permissions
which behavior comes from a repo contract
which behavior comes from a local operator override

That is why the best systems separate surfaces:

contracts for durable instructions
hooks for runtime intervention
skills for executable procedures
permissions for broad trust posture

The win is not “more surfaces.” The win is that each surface owns one fact family cleanly.

The real product gap

The interesting gap is no longer “agent systems need hooks.”

They have hooks.

The real gap is that most still do not have a good policy experience:

one obvious file or UI for project policy
one obvious explanation of blocking semantics
one obvious runtime-status surface
one obvious story for fail-open vs fail-closed

That is the part that separates an extension point from a policy plane.

The best current examples are getting there from different directions:

repo-local config
typed hook stages
explicit matcher semantics
explicit timeout behavior
explicit blocking codes
explicit lifecycle events

What still feels unfinished is the operator view. Most systems let you define policy, but fewer make it easy to answer:

what policies are active right now, and what just blocked this action?

That is where the next round of product quality will likely show up.

The likely next step

The packaging stack is already converging:

repo-local contract files
skills / commands / agents
MCP integration
some notion of project policy

The next layer to converge is the runtime policy plane.

That probably looks like:

Repo-local hook contracts for project-specific rules.
Typed lifecycle events instead of ad-hoc shell triggers.
Clear fail-open / fail-closed semantics per hook class.
A runtime-status surface showing which policies are active.
Hook-aware worktree/bootstrap flows for parallel-agent setups.

The important change is not technical. It is conceptual.

Hooks are no longer just how you customize an agent.

They are becoming how you govern one.

And once that layer becomes first-class, the real split between agent systems will not be “who supports MCP” or “who has skills.” It will be:

who has a real policy plane, and who is still pretending prompts are enough.