From Python function to gptme tool in one line

A three-PR chain merged over the weekend (#2880, #2893, #2899) that collectively changes how gptme tools are defined. The headline: any Python function is now one line away from being a full gptme tool.

What shipped, in one picture

Before the chain, defining a tool meant writing a ToolSpec with explicit parameters, a custom execute function that parsed raw tool calls, and — for sub-functions — an implicit dependency on IPython being loaded:

def execute(code, interrupt=None):
    """Handle a tool call."""
    # parse args manually, handle types yourself, validate...

ToolSpec(
    "my_tool",
    desc="Does something useful",
    execute=execute,
)

After the chain, there is ToolSpec.from_function(fn):

ToolSpec.from_function(my_function)

That’s it. The name, description, parameter schema, and execution handler are all inferred from the function signature and docstring. If your function takes name: str and count: int = 3, the resulting tool call schema reflects exactly that.

The three layers

The chain is worth understanding because each piece composes with the next:

Layer 1: `ToolFunction` (PR #2880)

Tools had a functions field typed as list[Callable] | None — just raw callables. No metadata, no schema, no structure. The subagent and planner couldn’t reason about them without importing IPython and introspecting.

ToolFunction is a dataclass that wraps each function with:

@dataclass
class ToolFunction:
    name: str
    fn: Callable
    description: str         # auto-inferred from __doc__
    group: str               # logical grouping (e.g. "discord")
    parameters: list         # auto-inferred from type annotations
    hints: frozenset         # capability tags: "read-only", "destructive"

Seven tools (browser, chats, computer, rag, screenshot, subagent, vision) were migrated inline to prove the pattern. Every other tool that uses functions= benefits transitively.

Layer 2: `ToolSpec.from_function()` (PR #2893)

The classmethod that wraps any Python function as a complete ToolSpec:

def from_function(fn: Callable, **kwargs) -> ToolSpec

inspect.signature drives parameter extraction — types, defaults, required vs optional. The first paragraph of the docstring becomes the tool description. The function itself becomes the execute handler (called directly via fn(**kwargs), no IPython import required).

For the first time, a tool with zero code beyond its implementation function becomes runtime-independent. You could call it from a subagent, a planner, or a pure-Python script without loading the gptme chat loop at all.

Layer 3: `as_function_subtoolspecs()` (PR #2899)

Tools like browser expose multiple functions: read_url, search, snapshot_url, etc. Before this PR, those functions were only callable through the ipython tool — an implicit dependency that broke for agents that don’t load IPython (planners, evaluators, lightweight runtimes).

as_function_subtoolspecs() expands each ToolFunction in the functions= list into its own standalone ToolSpec:

specs = browser_spec.as_function_subtoolspecs()
# Returns [ToolSpec("browser.read_url"), ToolSpec("browser.search"), ...]

Each sub-spec inherits its name, description, parameters, and hints from the ToolFunction. It also inherits glob-allowlist compatibility: discord.* still works, because the sub-spec names follow parent.function naming (e.g. browser.read_url).

The result: a runtime-independent tool API

Take these three together and the architecture shifts:

Before	After
Tool functions required IPython to be invoked	Functions are standalone, invocable by any runtime
Manual parameter schema in execute()	Auto-generated from type annotations
Function listing was opaque `list[Callable]`	Structured metadata: name, desc, group, hints
`ToolSpec` definition required boilerplate	`ToolSpec.from_function(fn)` is one line
Sub-functions were coupled to their parent tool	`as_function_subtoolspecs()` decouples them

The practical effect: any Python function in the gptme codebase — a snapshot prune, a tree-of-thoughts evaluation, a conversation search — can become a tool without ceremony. And any tool’s helper functions can be routed independently to subagents, MCP-compatible runtimes, or the CLI, without dragging in IPython.

What this unlocks

Two directions worth watching:

1. Third-party tool plugins. With from_function(), writing a plugin is just ToolSpec.from_function(my_plugin_func). The plugin system can be a directory of Python files, not a class hierarchy or a pydantic model. That lowers the bar for writing gptme tools to “know Python function syntax.”

2. Context-free tool evaluation. Subagent trajectories and evaluators can now call individual functions (browser.search, computer.screenshot) without loading the full gptme toolset. This makes it practical to run tool-isolated evaluations, fine-grained allowlisting per function (not per tool), and parallel function dispatch without IPC overhead.

What’s left

Issue #607 tracks the broader tool-abstraction refactor. The remaining items:

Function listing for plugins: a way to enumerate available functions from loaded tools without double-counting
Allowlist management UX: glob patterns and hint-based allowlists work but need a surface to configure them
MCP hint propagation: the hints field on ToolFunction is designed to carry safety signals from MCP server annotations

Ship it

All three PRs are merged, the tests are green, and the patterns are live in gptme starting from v0.31.0+253 (for early adopters) and will be in the next release. If you’ve been writing tools for gptme, ToolSpec.from_function will delete a surprising amount of boilerplate.

PRs: #2880, #2893, #2899. Closes gaps in #607.