agent-dispatch 0.5.0__tar.gz → 0.8.0__tar.gz

Files changed (30)
  1. agent_dispatch-0.8.0/AGENTS.md +57 -0
  2. {agent_dispatch-0.5.0 → agent_dispatch-0.8.0}/CHANGELOG.md +119 -1
  3. {agent_dispatch-0.5.0 → agent_dispatch-0.8.0}/PKG-INFO +139 -30
  4. {agent_dispatch-0.5.0 → agent_dispatch-0.8.0}/README.md +137 -28
  5. {agent_dispatch-0.5.0 → agent_dispatch-0.8.0}/agents.example.yaml +10 -1
  6. {agent_dispatch-0.5.0 → agent_dispatch-0.8.0}/pyproject.toml +15 -2
  7. {agent_dispatch-0.5.0 → agent_dispatch-0.8.0}/src/agent_dispatch/__init__.py +1 -1
  8. {agent_dispatch-0.5.0 → agent_dispatch-0.8.0}/src/agent_dispatch/cli.py +260 -41
  9. {agent_dispatch-0.5.0 → agent_dispatch-0.8.0}/src/agent_dispatch/config.py +8 -3
  10. {agent_dispatch-0.5.0 → agent_dispatch-0.8.0}/src/agent_dispatch/jobs.py +71 -16
  11. {agent_dispatch-0.5.0 → agent_dispatch-0.8.0}/src/agent_dispatch/models.py +20 -3
  12. {agent_dispatch-0.5.0 → agent_dispatch-0.8.0}/src/agent_dispatch/runner.py +341 -87
  13. {agent_dispatch-0.5.0 → agent_dispatch-0.8.0}/src/agent_dispatch/server.py +279 -98
  14. {agent_dispatch-0.5.0 → agent_dispatch-0.8.0}/tests/test_cli.py +452 -82
  15. {agent_dispatch-0.5.0 → agent_dispatch-0.8.0}/tests/test_config.py +30 -0
  16. {agent_dispatch-0.5.0 → agent_dispatch-0.8.0}/tests/test_jobs.py +123 -0
  17. {agent_dispatch-0.5.0 → agent_dispatch-0.8.0}/tests/test_models.py +18 -0
  18. {agent_dispatch-0.5.0 → agent_dispatch-0.8.0}/tests/test_runner.py +518 -0
  19. {agent_dispatch-0.5.0 → agent_dispatch-0.8.0}/tests/test_server.py +762 -116
  20. {agent_dispatch-0.5.0 → agent_dispatch-0.8.0}/.github/dependabot.yml +0 -0
  21. {agent_dispatch-0.5.0 → agent_dispatch-0.8.0}/.github/workflows/ci.yml +0 -0
  22. {agent_dispatch-0.5.0 → agent_dispatch-0.8.0}/.github/workflows/publish.yml +0 -0
  23. {agent_dispatch-0.5.0 → agent_dispatch-0.8.0}/.gitignore +0 -0
  24. {agent_dispatch-0.5.0 → agent_dispatch-0.8.0}/LICENSE +0 -0
  25. {agent_dispatch-0.5.0 → agent_dispatch-0.8.0}/SECURITY.md +0 -0
  26. {agent_dispatch-0.5.0 → agent_dispatch-0.8.0}/assets/mascot.png +0 -0
  27. {agent_dispatch-0.5.0 → agent_dispatch-0.8.0}/src/agent_dispatch/cache.py +0 -0
  28. {agent_dispatch-0.5.0 → agent_dispatch-0.8.0}/tests/__init__.py +0 -0
  29. {agent_dispatch-0.5.0 → agent_dispatch-0.8.0}/tests/conftest.py +0 -0
  30. {agent_dispatch-0.5.0 → agent_dispatch-0.8.0}/tests/test_cache.py +0 -0
@@ -0,0 +1,57 @@
1
+ # AGENTS.md
2
+
3
+ Guidance for AI coding agents working on this repository.
4
+
5
+ > **Using agent-dispatch** (not developing it)? Read [README.md](README.md) — it has the full setup path with verify steps and the complete MCP tool reference. This file is for contributing to the codebase.
6
+
7
+ ## What this project is
8
+
9
+ MCP server + CLI that lets Claude Code agents delegate tasks to agents in other project directories. One sync core, two surfaces:
10
+
11
+ | File | Role |
12
+ |------|------|
13
+ | `src/agent_dispatch/runner.py` | Sync subprocess wrapper around `claude -p` — the actual work |
14
+ | `src/agent_dispatch/server.py` | Async FastMCP interface (19 MCP tools), wraps runner in `asyncio.to_thread` + semaphore |
15
+ | `src/agent_dispatch/cli.py` | Click CLI: `init`, `add`, `update`, `remove`, `list`, `describe`, `test`, `doctor`, `jobs`, `job`, `cancel`, `gc`, `serve` |
16
+ | `src/agent_dispatch/models.py` | Pydantic v2 models (`AgentConfig`, `Settings`, `DispatchResult`) |
17
+ | `src/agent_dispatch/config.py` | YAML config load/save + project auto-description |
18
+ | `src/agent_dispatch/cache.py` | Thread-safe in-memory TTL cache |
19
+ | `src/agent_dispatch/jobs.py` | Persistent per-job JSON files for async dispatch |
20
+
21
+ ## Dev setup
22
+
23
+ ```bash
24
+ pip install -e ".[dev]"
25
+ ```
26
+
27
+ ## Gates — both must pass before a change is done (CI rejects otherwise)
28
+
29
+ ```bash
30
+ ruff check src/ tests/
31
+ python3 -m pytest tests/ -v # 428 tests, ~2s — all subprocess calls are mocked
32
+ ```
33
+
34
+ Tests must **never** invoke the real `claude` CLI. Runner tests mock `shutil.which` + `subprocess.run`/`Popen`; server tests mock `_get_config` + `runner.dispatch`.
35
+
36
+ ## Non-obvious invariants (violating these breaks real behavior)
37
+
38
+ - `allowed_tools` / `disallowed_tools` are **tri-state**: `None` = inherit settings defaults, `[]` = explicitly no tools, `[...]` = exactly these. Check with `is not None`, never `or` — `[]` is falsy but semantically distinct.
39
+ - `denied_tools` non-empty + `is_error` ⇒ `error_type="permission"`, regardless of what the error text matches.
40
+ - On failure, callers read `DispatchResult.error` + `error_type` — `result` holds the raw agent output even on errors.
41
+ - `--session-id` and `--resume` conflict — never pass both to `claude`.
42
+ - Valid permission modes: `default`, `plan`, `bypassPermissions` (`models.py: KNOWN_PERMISSION_MODES`).
43
+ - `JobStore.finish`/`fail` refuse already-terminal jobs (returns `None`) — this closes the race with force-cancel; never "fix" it by overwriting.
44
+ - Cancelling a *running* job requires the in-memory `_running_procs` registry (server.py) — the job is marked `cancelled` **before** the subprocess is killed. Don't persist PIDs to disk (PID reuse after restart could kill an unrelated process).
45
+ - `max_budget_usd` is **post-hoc**: `_apply_budget` (runner.py) sets `budget_exceeded` + `hint` after the cost is known; it never fails the dispatch.
46
+
47
+ ## Conventions
48
+
49
+ Python ≥ 3.10 · `from __future__ import annotations` everywhere · Pydantic v2 · Click (CLI) + FastMCP (server) · ruff, line length 100 · all MCP tools return JSON strings, errors as `{"error": "..."}`.
50
+
51
+ ## When adding a feature, check every layer
52
+
53
+ `models.py` (data shape) → `runner.py` (dispatch mechanics) → `server.py` (MCP tool) → `cli.py` (CLI flag) → tests for each → `README.md` + `agents.example.yaml` (user docs).
54
+
55
+ ## More detail
56
+
57
+ [README.md](README.md) documents every MCP tool with parameter tables, response shapes, and the error-recovery map — it doubles as the behavioral spec. The test suite (`tests/`, 428 tests) encodes the exact expected behavior of every layer: when in doubt, read the tests for the module you're touching (`test_runner.py`, `test_server.py`, `test_cli.py`, ...).
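
The tri-state rule for `allowed_tools` / `disallowed_tools` listed under the invariants above can be sketched as follows. This is an illustration only: `resolve_tools` is a hypothetical helper name, not a function confirmed to exist in `runner.py` or `models.py`.

```python
from __future__ import annotations


def resolve_tools(agent_value: list[str] | None, default: list[str] | None) -> list[str] | None:
    """Pick the effective tool list (hypothetical helper, for illustration)."""
    # `is not None` keeps an explicit [] ("no tools") from falling through to
    # the settings default. `agent_value or default` would get this wrong,
    # because [] is falsy.
    return agent_value if agent_value is not None else default
```

With this shape, `resolve_tools([], ["Bash"])` yields `[]` (explicitly no tools), while `resolve_tools(None, ["Bash"])` inherits `["Bash"]`.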
@@ -7,6 +7,123 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
7
7
 
8
8
  ## [Unreleased]
9
9
 
10
+ ## [0.8.0] - 2026-06-17
11
+
12
+ Let agents declare what they are good at, so callers can pick the right one.
13
+
14
+ ### Added
15
+ - **Declared capabilities.** `AgentConfig` gains `capabilities` and
16
+ `risky_capabilities` — short snake_case labels describing what an agent is
17
+ for (e.g. `docker_logs`, `restart_services`). They are descriptive metadata
18
+ only (never passed to the `claude` CLI): settable via `add_agent` /
19
+ `update_agent` (MCP) and `add` / `update` (CLI, `--capabilities` /
20
+ `--risky-capabilities`, `none` clears), and surfaced in `list_agents` /
21
+ `inspect_agent` so the calling agent can choose a target at a glance.
22
+ `risky_capabilities` flags higher-risk abilities for extra scrutiny.
23
+
24
+ ### Changed
25
+ - `save_config` no longer writes empty `capabilities` / `risky_capabilities`
26
+ keys for agents that don't declare them, keeping `agents.yaml` clean.
27
+
28
+ ### Note
29
+ - A keyword-scoring router (`recommend_agent` / `dispatch_auto` MCP tools and
30
+ `recommend` / `auto` CLI commands) was prototyped during this cycle and then
31
+ removed before release: a deterministic keyword scorer adds little over the
32
+ calling LLM's own judgment when there are only a handful of agents, and the
33
+ capability labels above cover the "what is this agent for" need without the
34
+ extra surface area or the risk of auto-dispatching to a wrong guess.
35
+
36
+ ## [0.7.0] - 2026-06-10
37
+
38
+ Job control release: running jobs become cancellable, the budget field stops
39
+ being decorative, and async jobs get a CLI.
40
+
41
+ ### Added
42
+ - **Cancel running jobs.** `dispatch_cancel(job_id)` now kills a *running*
43
+ job's `claude` subprocess when the job was started by the same server
44
+ instance (in-memory process registry — no PID files, no risk of killing an
45
+ unrelated process after a restart). The job is marked `cancelled` *before*
46
+ the kill, and `JobStore.finish`/`fail` now refuse already-terminal jobs, so
47
+ the worker's trailing write can't resurrect it. New outcome:
48
+ `cancelled_running`. Jobs from a previous server run still report
49
+ `running` (cannot be killed safely).
50
+ - **Budget visibility (post-hoc).** `max_budget_usd` was stored and displayed
51
+ but never checked. A dispatch whose `cost_usd` exceeds the agent's
52
+ `max_budget_usd` (or `settings.default_max_budget_usd`) now returns
53
+ `budget_exceeded: true` plus a `hint`. The dispatch is *not* failed — the
54
+ `claude` CLI has no spend cap, so by the time the cost is known the money is
55
+ spent; the flag makes runaway agents visible instead of silent.
56
+ - **CLI for async jobs.** New commands: `agent-dispatch jobs [--status
57
+ --limit]` (list), `agent-dispatch job <id>` (detail with progress tail and
58
+ result preview), `agent-dispatch cancel <id>` (pending jobs; running jobs
59
+ belong to the MCP server process), `agent-dispatch gc [--days]` (purge old
60
+ terminal jobs).
61
+ - **PyPI discoverability:** expanded package keywords (5 → 12).
62
+
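
A minimal sketch of the post-hoc budget check described in the 0.7.0 notes above, assuming the result is a plain dict and that a limit of 0 or `None` means "no limit". The function name and hint wording are hypothetical; the real `_apply_budget` in `runner.py` may differ.

```python
from __future__ import annotations


def apply_budget(result: dict, max_budget_usd: float | None) -> dict:
    """Flag an over-budget dispatch after the cost is known; never fail it."""
    cost = result.get("cost_usd")
    # A falsy limit (None or 0) is treated as "no limit" in this sketch.
    if max_budget_usd and cost is not None and cost > max_budget_usd:
        result["budget_exceeded"] = True
        # Hypothetical hint text, not the project's actual message.
        result["hint"] = f"cost ${cost:.2f} exceeded budget ${max_budget_usd:.2f}"
    return result
```

Note that `success` is left untouched: the flag only makes the overspend visible.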
63
+ ### Changed
64
+ - `runner.dispatch_stream` accepts an `on_proc` callback (receives the Popen
65
+ handle right after spawn) — used by the async worker to register the
66
+ process for cancellation.
67
+ - `JobStore.cancel` accepts `force=True` to cancel running jobs (callers must
68
+ kill the subprocess themselves); `finish`/`fail` return `None` for terminal
69
+ jobs instead of overwriting them.
70
+
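
The "refuse already-terminal jobs" contract for `finish`/`fail` can be illustrated with an in-memory sketch. It assumes a job is a dict with a `status` field and uses the status names from the README; the real `JobStore` persists JSON files and is not reproduced here.

```python
from __future__ import annotations

# Terminal statuses as named in the README's dispatch_jobs filter list.
TERMINAL = {"done", "failed", "cancelled"}


def finish(job: dict, result: dict) -> dict | None:
    """Mark a job done unless it is already terminal (illustrative only)."""
    if job["status"] in TERMINAL:
        # A force-cancelled job stays cancelled: the worker's trailing
        # write is dropped instead of overwriting the terminal state.
        return None
    job["status"] = "done"
    job["result"] = result
    return job
```

Returning `None` rather than raising lets the worker ignore the lost race quietly.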
71
+ ## [0.6.0] - 2026-06-04
72
+
73
+ Reliability release: timeouts stop being fatal, permission-blocked "successes"
74
+ become visible, async jobs show live progress.
75
+
76
+ ### Fixed
77
+ - **`dispatch_stream` was broken on current claude CLIs** — they reject
78
+ `--print --output-format stream-json` without `--verbose` ("requires
79
+ --verbose"), so every stream dispatch (and CLI `test --stream`) failed
80
+ immediately. The runner now passes `--verbose`. Caught by live verification
81
+ against the real CLI before this release; without it the async-worker
82
+ switch to streaming (below) would have broken all `dispatch_async` jobs.
83
+
84
+ ### Added
85
+ - **Per-call timeout override.** `dispatch`, `dispatch_session`,
86
+ `dispatch_stream`, and `dispatch_async` accept `timeout_seconds` (0 = agent
87
+ default, clamped to 10–7200); `dispatch_parallel` accepts it per item. Use
88
+ it for known-long tasks instead of editing the agent config. CLI:
89
+ `agent-dispatch test <name> --timeout N`.
90
+ - **Resumable timeouts.** Fresh dispatches pre-assign a session UUID via
91
+ `--session-id`, so a timed-out dispatch still returns a `session_id` — the
92
+ partial transcript survives the kill. The timeout error now spells out the
93
+ recovery options: resume via `dispatch_session(..., session_id=...)`, retry
94
+ with `timeout_seconds`, or go async.
95
+ - **Denied-tools visibility.** The claude CLI's `permission_denials` output is
96
+ parsed into `DispatchResult.denied_tools`. A dispatch that "succeeds" while
97
+ tools were blocked (the agent answers "I need permission for X") now carries
98
+ `denied_tools` + a `hint` that the result may be incomplete and how to grant
99
+ access. On `is_error` results, non-empty denials force
100
+ `error_type="permission"` even when the error text has no permission
101
+ keywords. CLI `test` prints the hint as a yellow note.
102
+ - **Async job progress.** Async workers now run with streaming: the job file
103
+ keeps a rolling tail (last 20 lines, throttled to ~1 write/sec) of assistant
104
+ text and tool-use events. `dispatch_status` returns it as `progress` while
105
+ running (kept afterwards as a post-mortem trace); `dispatch_jobs` shows
106
+ `last_progress` for running jobs. New `JobStore.update_progress` (refuses
107
+ terminal jobs, so a trailing write can't resurrect a finished job).
108
+
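
The per-call timeout rule above (0 means "use the agent default", anything else is clamped to 10–7200) might look like the sketch below. `effective_timeout` is a hypothetical name, and how the project treats negative values is not stated in the changelog; here they are simply clamped up to 10.

```python
from __future__ import annotations

MIN_TIMEOUT = 10    # lower clamp from the changelog
MAX_TIMEOUT = 7200  # upper clamp from the changelog


def effective_timeout(timeout_seconds: int, agent_timeout: int) -> int:
    """Resolve a per-call override against the agent's configured timeout."""
    if timeout_seconds == 0:
        return agent_timeout  # 0 = no override
    return max(MIN_TIMEOUT, min(MAX_TIMEOUT, timeout_seconds))
```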
109
+ ### Changed
110
+ - Timeout error messages are actionable (mention `timeout_seconds`,
111
+ `dispatch_async`, `agent-dispatch update --timeout`, and the resumable
112
+ session) instead of just "increase timeout in agents.yaml".
113
+ - Plain-text fallback successes now carry the generated `session_id`; the
114
+ stream "no result line" fallback does too (a crash mid-stream stays
115
+ resumable).
116
+ - **Old-CLI self-healing**: if the installed claude CLI predates
117
+ `--session-id`, dispatch detects the "unknown option" rejection and retries
118
+ once without the flag (logged warning; timed-out dispatches lose
119
+ resumability) instead of failing every dispatch.
120
+ - `dispatch_parallel` validates per-item `timeout_seconds` / `summary_chars`
121
+ numerically **up front** — a bad value rejects the whole call before any
122
+ dispatch runs, consistent with the structural validation contract.
123
+ - `denied_tools` parsing is bounded (10 entries, 100 chars per name) — the
124
+ field comes from the dispatched subprocess's output, which is untrusted;
125
+ unbounded lists could inflate job files and `return_ref` payloads.
126
+
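
A hedged sketch of the bounded `denied_tools` parsing from the last bullet above. It assumes each `permission_denials` entry is an object with a `tool_name` key; that shape is an assumption about the CLI output, and `parse_denied_tools` is a hypothetical name.

```python
from __future__ import annotations

MAX_ENTRIES = 10       # bound from the changelog
MAX_NAME_CHARS = 100   # bound from the changelog


def parse_denied_tools(permission_denials: object) -> list[str]:
    """Extract tool names from untrusted subprocess output, with hard bounds."""
    if not isinstance(permission_denials, list):
        return []
    names: list[str] = []
    for entry in permission_denials[:MAX_ENTRIES]:
        # Assumed entry shape: {"tool_name": "..."}; anything else is skipped.
        name = entry.get("tool_name") if isinstance(entry, dict) else None
        if isinstance(name, str) and name:
            names.append(name[:MAX_NAME_CHARS])
    return names
```

Slicing before filtering means malformed entries count against the cap, so the output can never exceed 10 names of 100 characters each.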
10
127
  ## [0.5.0] - 2026-06-01
11
128
 
12
129
  Security-hardening release. A multi-agent audit of the codebase surfaced
@@ -214,7 +331,8 @@ cache bounding, and stale-job recovery.
214
331
  - Dependabot for `pip` + `github-actions`, GitHub Actions pinned to
215
332
  commit SHAs for supply-chain integrity.
216
333
 
217
- [Unreleased]: https://github.com/ginkida/agent-dispatch/compare/v0.5.0...HEAD
334
+ [Unreleased]: https://github.com/ginkida/agent-dispatch/compare/v0.6.0...HEAD
335
+ [0.6.0]: https://github.com/ginkida/agent-dispatch/compare/v0.5.0...v0.6.0
218
336
  [0.5.0]: https://github.com/ginkida/agent-dispatch/compare/v0.4.0...v0.5.0
219
337
  [0.4.0]: https://github.com/ginkida/agent-dispatch/compare/v0.3.0...v0.4.0
220
338
  [0.3.0]: https://github.com/ginkida/agent-dispatch/compare/v0.2.2...v0.3.0
@@ -1,6 +1,6 @@
1
1
  Metadata-Version: 2.4
2
2
  Name: agent-dispatch
3
- Version: 0.5.0
3
+ Version: 0.8.0
4
4
  Summary: MCP server that lets Claude Code agents delegate tasks to agents in other project directories
5
5
  Project-URL: Homepage, https://github.com/ginkida/agent-dispatch
6
6
  Project-URL: Repository, https://github.com/ginkida/agent-dispatch
@@ -8,7 +8,7 @@ Project-URL: Issues, https://github.com/ginkida/agent-dispatch/issues
8
8
  Author: ginkida
9
9
  License-Expression: MIT
10
10
  License-File: LICENSE
11
- Keywords: agent,claude,dispatch,mcp,multi-agent
11
+ Keywords: agent,agent-orchestration,ai-agents,anthropic,claude,claude-code,delegation,dispatch,mcp,mcp-server,multi-agent,subagents
12
12
  Classifier: Development Status :: 3 - Alpha
13
13
  Classifier: Intended Audience :: Developers
14
14
  Classifier: License :: OSI Approved :: MIT License
@@ -45,29 +45,61 @@ Each agent runs as a separate `claude -p` session in its own project directory
45
45
 
46
46
  Works with OAuth, API key, and Claude subscription authentication.
47
47
 
48
+ > **AI agents:** this README is the canonical doc for *using* the tool — setup: [Quick Start](#quick-start) (every step has a deterministic verify), first call: [`dispatch`](#dispatch), tool selection: [Which Tool to Use](#which-tool-to-use), failure handling: [Error Recovery](#error-recovery). Working *on* this repo instead? See [AGENTS.md](AGENTS.md).
49
+
48
50
  ## Quick Start
49
51
 
52
+ **Prerequisite:** the [Claude Code CLI](https://docs.anthropic.com/en/docs/claude-code) must be installed and authenticated. Check first:
53
+
50
54
  ```bash
51
- pip install agent-dispatch
55
+ claude --version # must print a version — if it fails, install Claude Code before continuing
56
+ ```
57
+
58
+ Then:
52
59
 
53
- # Initialize: creates config + registers MCP server with Claude Code
60
+ ```bash
61
+ pip install agent-dispatch # or: pipx install agent-dispatch
62
+
63
+ # 1. Create config + register the MCP server with Claude Code (user scope)
54
64
  agent-dispatch init
55
65
 
56
- # Add agents (description auto-generated from project files)
66
+ # 2. Register project directories as agents. REPLACE the example paths with
67
+ # real directories on your machine; they must exist (~ is expanded, relative
68
+ # paths are resolved). Descriptions are auto-generated from project files.
69
+ # No second project handy? Use the zero-setup block below instead.
57
70
  agent-dispatch add infra ~/projects/infra
58
71
  agent-dispatch add backend ~/projects/backend
59
72
 
60
- # Test it works
73
+ # 3. Smoke test — dispatches a real task to the agent added in step 2 and prints
74
+ # the answer; exit 0 on success. Default task when none given:
75
+ # "What project is this? Describe in one sentence."
61
76
  agent-dispatch test infra
62
77
 
63
- # If agents hit permission errors, grant tool access:
64
- agent-dispatch update infra --permission-mode bypassPermissions
65
-
66
- # If something doesn't work, run the diagnostic:
78
+ # 4. Verify the whole install: prints "All checks passed." and exits 0 on success
67
79
  agent-dispatch doctor
68
80
  ```
69
81
 
70
- Done. Every Claude Code session now has access to all dispatch tools.
82
+ **Zero-setup alternative** for steps 2–3 (no second project needed; registers the current directory):
83
+
84
+ ```bash
85
+ agent-dispatch add self . && agent-dispatch test self "Say hello"
86
+ ```
87
+
88
+ Every Claude Code session now has the dispatch tools. Independent check: `claude mcp list` must print a line starting with `agent-dispatch:`. From inside a Claude Code session, the first MCP calls are `list_agents()`, then [`dispatch(...)`](#dispatch).
89
+
90
+ **If `init` fails to register the MCP server** (prints a warning instead of `Registered MCP server`), register manually:
91
+
92
+ ```bash
93
+ claude mcp add-json agent-dispatch "{\"type\":\"stdio\",\"command\":\"$(which agent-dispatch)\",\"args\":[\"serve\"]}" --scope user
94
+ ```
95
+
96
+ **If `test` fails with a permission error** (`error_type: "permission"`), grant tool access and re-test:
97
+
98
+ ```bash
99
+ agent-dispatch update infra --allowed-tools "Bash,Read,Grep" # least privilege
100
+ # or, if the agent needs everything (see SECURITY.md for the trade-off):
101
+ agent-dispatch update infra --permission-mode bypassPermissions
102
+ ```
71
103
 
72
104
  ## When to Dispatch
73
105
 
@@ -97,6 +129,8 @@ Lists all configured agents. **Call this first** to see what's available.
97
129
  "mcp_servers": ["portainer", "postgres"],
98
130
  "stacks": ["Python", "Docker"],
99
131
  "dbs": ["Alembic"],
132
+ "capabilities": ["docker_logs", "deploy_debug"],
133
+ "risky_capabilities": ["restart_services"],
100
134
  "permission_mode": "bypassPermissions",
101
135
  "allowed_tools": ["Bash", "Read", "Grep"]
102
136
  }
@@ -130,6 +164,18 @@ One-shot task delegation. Results are cached — identical requests within TTL r
130
164
  | `response_format` | string | no | `"json"` to request a single JSON value; the parsed result lands in `parsed_result`. Empty = free-form text. |
131
165
  | `return_ref` | bool | no | When `true`, returns just a `ref` + summary preview instead of the full result text. Use `fetch_result(ref)` to load the full text on demand. |
132
166
  | `summary_chars` | int | no | Max chars of result text to include in the ref response (default 500). |
167
+ | `timeout_seconds` | int | no | One-off timeout override for this call (0 = agent's configured timeout; clamped to 10–7200). No config edit needed for known-long tasks. |
168
+
169
+ ```python
170
+ # Call — recommended form (always include caller and goal)
171
+ dispatch(
172
+ agent="infra", # must exist in list_agents()
173
+ task="Check container logs for errors related to the scheduler service",
174
+ context="Error: TypeError at scheduler.py:42",
175
+ caller="backend", # your project/role
176
+ goal="debug production crash" # the broader objective
177
+ )
178
+ ```
133
179
 
134
180
  ```json
135
181
  // Response (success)
@@ -155,6 +201,21 @@ One-shot task delegation. Results are cached — identical requests within TTL r
155
201
 
156
202
  **`error_type` values:** `permission` (tool/action denied), `timeout`, `recursion` (dispatch depth exceeded), `not_found` (missing directory or CLI), `cli_error` (other failures). Permission errors include an actionable hint.
157
203
 
204
+ **Resumable timeouts:** every fresh dispatch pre-assigns a session UUID (`--session-id`), so a timed-out dispatch still returns a `session_id` — the partial transcript survives the kill. The timeout error spells out the recovery: resume with `dispatch_session(agent, "Continue where you left off", session_id=...)`, retry with a bigger `timeout_seconds`, or use `dispatch_async`.
205
+
206
+ **Denied-tools visibility:** in non-interactive mode the claude CLI auto-denies tools the agent isn't allowed to use — the agent then often "succeeds" with an answer like *"I need your permission for one read-only query"*. When that happens the response carries the deterministic signal: `denied_tools` (parsed from the CLI's `permission_denials`) plus a `hint` explaining the result may be incomplete and how to grant access. `success` stays `true` — it's a soft signal, not a failure.
207
+
208
+ ```json
209
+ // Response (success, but a tool was blocked)
210
+ {
211
+ "agent": "analysis",
212
+ "success": true,
213
+ "result": "Here is the offline mapping. To finish I'd need to run one read-only query...",
214
+ "denied_tools": ["Bash"],
215
+ "hint": "1 tool call(s) were denied by permissions: Bash. The result may be incomplete..."
216
+ }
217
+ ```
218
+
158
219
  **Structured JSON output:** pass `response_format="json"` to ask the agent for a single JSON value. The runner appends an instruction footer ("respond with a single valid JSON value, no fences, no prose") and on success parses the response — the parsed value lands in `parsed_result`. The raw text is always in `result`. Parse failures leave `parsed_result=None` but don't fail the dispatch (soft mode).
159
220
 
160
221
  ```json
@@ -195,6 +256,9 @@ Multi-turn: continue a conversation with an agent. First call starts a session,
195
256
  | `context` | string | no | Extra context |
196
257
  | `caller` | string | no | Who is dispatching |
197
258
  | `goal` | string | no | Broader objective |
259
+ | `timeout_seconds` | int | no | One-off timeout override (0 = agent default; clamped to 10–7200) |
260
+
261
+ `dispatch_session` is also the **timeout recovery path**: a timed-out `dispatch` returns a `session_id` — pass it here with `task="Continue where you left off"` to salvage the partial work instead of restarting.
198
262
 
199
263
  ```
200
264
  Turn 1: dispatch_session("infra", "List running containers")
@@ -210,7 +274,7 @@ Run multiple tasks concurrently. Much faster than sequential `dispatch` calls.
210
274
 
211
275
  | Parameter | Type | Required | Description |
212
276
  |-----------|------|----------|-------------|
213
- | `dispatches` | string (JSON) | yes | JSON array of `{"agent", "task", "context?", "caller?", "goal?"}` |
277
+ | `dispatches` | string (JSON) | yes | JSON array of `{"agent", "task", "context?", "caller?", "goal?", "response_format?", "return_ref?", "summary_chars?", "timeout_seconds?"}` |
214
278
  | `aggregate` | string | no | Agent name to synthesize all results into one answer |
215
279
 
216
280
  **Important:** `dispatches` is a JSON string, not a list.
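
Since `dispatches` must be a JSON string, a caller building the argument in Python would serialize the list first. The agent names and tasks below are placeholders, and the final tool call is shown as a comment because it runs through MCP, not as a Python function.

```python
import json

# Build the items as ordinary dicts, then serialize once.
items = [
    {"agent": "infra", "task": "List running containers", "timeout_seconds": 600},
    {"agent": "backend", "task": "Summarize recent errors", "caller": "ops"},
]
dispatches = json.dumps(items)

# dispatch_parallel(dispatches=dispatches)  # pass the string, not the list
```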
@@ -250,7 +314,7 @@ Run multiple tasks concurrently. Much faster than sequential `dispatch` calls.
250
314
 
251
315
  Same as `dispatch` but shows live progress while the agent works. Use for long-running tasks. Not cached.
252
316
 
253
- Parameters are identical to `dispatch`.
317
+ Parameters are the same as `dispatch` except `return_ref`/`summary_chars` (streaming is incompatible with ref-mode).
254
318
 
255
319
  ### `dispatch_dialogue`
256
320
 
@@ -288,9 +352,10 @@ Register a new project directory as an agent. Description is auto-generated from
288
352
  | Parameter | Type | Required | Description |
289
353
  |-----------|------|----------|-------------|
290
354
  | `name` | string | yes | Agent name (letters, digits, hyphens, underscores) |
291
- | `directory` | string | yes | Absolute path to project directory |
355
+ | `directory` | string | yes | Path to an existing project directory (`~` is expanded, relative paths resolved) |
292
356
  | `description` | string | no | What this agent can do — auto-generated if empty |
293
357
  | `timeout` | int | no | Timeout in seconds (0 = use global default) |
358
+ | `max_budget_usd` | float | no | Max cost in USD per dispatch (0 = no limit) |
294
359
  | `permission_mode` | string | no | Permission mode (e.g. `default`, `plan`, `bypassPermissions`) |
295
360
  | `allowed_tools` | string | no | Comma-separated allowed tools (e.g. `"Bash,Read,Edit"`) |
296
361
  | `disallowed_tools` | string | no | Comma-separated disallowed tools |
@@ -304,6 +369,7 @@ Update an existing agent's configuration. Only non-empty fields are changed. Pas
304
369
  | `name` | string | yes | Agent name to update |
305
370
  | `description` | string | no | New description |
306
371
  | `timeout` | int | no | New timeout (0 = don't change) |
372
+ | `max_budget_usd` | float | no | New budget limit (0 = don't change, negative = clear the limit) |
307
373
  | `model` | string | no | Model override. `"none"` to clear |
308
374
  | `permission_mode` | string | no | Permission mode. `"none"` to clear |
309
375
  | `allowed_tools` | string | no | Comma-separated. `"none"` to clear |
@@ -344,15 +410,17 @@ Refs reuse the same storage as `dispatch_async` jobs (under `~/.config/agent-dis
344
410
  When a dispatched task is going to take a while, you don't want to block your own tool slot for minutes. Async dispatch returns a `job_id` immediately and lets you check back when you're ready.
345
411
 
346
412
  ```
347
- // 1. fire and forget
413
+ // 1. fire and forget (timeout_seconds= works here too for known-long tasks)
348
414
  dispatch_async(agent="infra", task="audit every container log for OOM kills today")
349
415
  -> {"job_id": "8f3a...e1", "status": "pending", "agent": "infra"}
350
416
 
351
417
  // 2. do other work, then check progress (non-blocking)
418
+ // `progress` is a rolling tail of what the agent is doing right now
352
419
  dispatch_status(job_id="8f3a...e1")
353
- -> {"id": "8f3a...e1", "status": "running", "started_at": 1730000123.4, ...}
420
+ -> {"id": "8f3a...e1", "status": "running", "started_at": 1730000123.4,
421
+ "progress": ["Using tool: Bash", "Scanning container logs for OOM events..."], ...}
354
422
 
355
- // 3. or block until done (with a timeout cap)
423
+ // 3. or block until done (timeout_seconds default: 60, capped at 3600)
356
424
  dispatch_wait(job_id="8f3a...e1", timeout_seconds=120)
357
425
  -> {"id": "8f3a...e1", "status": "done", "result": {"agent": "infra", "success": true, ...}}
358
426
 
@@ -360,11 +428,13 @@ dispatch_wait(job_id="8f3a...e1", timeout_seconds=120)
360
428
  -> {"id": "...", "status": "running", "timed_out_waiting": true}
361
429
  ```
362
430
 
363
- `dispatch_cancel(job_id)` cancels a job that is still **pending** (before its subprocess starts) a running job is left to finish, since its `claude` subprocess can't be safely interrupted. The response carries an `outcome` of `cancelled`, `running`, `already_terminal`, or `not_found`.
431
+ `dispatch_cancel(job_id)` cancels a **pending** job, and also kills a **running** job's `claude` subprocess when the job was started by the same server instance (the job is marked `cancelled` first, so the worker's trailing write can't undo it; partial work is lost but the progress tail is preserved). A running job started by a *previous* server run can't be killed safely and is left to finish. The response carries an `outcome` of `cancelled`, `cancelled_running`, `running` (not owned by this server), `already_terminal`, or `not_found`.
432
+
433
+ Async workers run with streaming under the hood: the job file keeps a rolling tail (last 20 lines, ~1 write/sec) of assistant text and tool-use events. `dispatch_status` shows it as `progress` while the job runs and keeps it afterwards as a post-mortem trace; `dispatch_jobs` shows `last_progress` for running jobs.
364
434
 
365
435
  `dispatch_jobs(status?)` lists recent jobs as summaries (filter by `pending` / `running` / `done` / `failed` / `cancelled`). `dispatch_gc(max_age_days=7)` purges terminal jobs older than the threshold — pending and running jobs are never deleted.
366
436
 
367
- Job state persists to disk at `~/.config/agent-dispatch/jobs/` (override with `AGENT_DISPATCH_JOBS_DIR`). One JSON file per job, written owner-only (`0o600`) with atomic writes — safe to read or `ls` while jobs are in flight. Caller-supplied `job_id`s are validated as 32-char hex before any file access (no path traversal). On startup the server marks jobs abandoned in `running` by a prior crashed instance as `failed`.
437
+ Job state persists to disk at `~/.config/agent-dispatch/jobs/` (override with `AGENT_DISPATCH_JOBS_DIR`). One JSON file per job, written owner-only (`0o600`) with atomic writes — safe to read or `ls` while jobs are in flight. Caller-supplied `job_id`s are validated as 32-char hex before any file access (no path traversal). On startup the server marks jobs left in `running` by a crashed instance as `failed` once they are stale (stuck for over an hour).
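
The "32-char hex before any file access" check could be sketched like this. The lowercase-only pattern is an assumption (it matches what `uuid.uuid4().hex` produces), and `is_valid_job_id` is a hypothetical name rather than the project's function.

```python
import re

# Exactly 32 lowercase hex characters; no slashes or dots can match,
# which is what rules out path traversal.
_JOB_ID_RE = re.compile(r"[0-9a-f]{32}")


def is_valid_job_id(job_id: object) -> bool:
    """Return True only for strings that are exactly 32 hex characters."""
    return isinstance(job_id, str) and _JOB_ID_RE.fullmatch(job_id) is not None
```

Using `fullmatch` instead of `match` matters here: a trailing newline or path suffix after 32 valid characters is rejected.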
368
438
 
369
439
  | When to use async | When to use `dispatch` |
370
440
  |-------------------|------------------------|
@@ -372,14 +442,6 @@ Job state persists to disk at `~/.config/agent-dispatch/jobs/` (override with `A
372
442
  | Several long tasks you'll collect later | Several short tasks → `dispatch_parallel` |
373
443
  | Don't care about caching (each call is a fresh job) | Cached by default — identical requests are free |
374
444
 
375
- ### Error Responses
376
-
377
- All tools return errors as:
378
-
379
- ```json
380
- {"error": "Unknown agent: 'foo'. Available: infra, db, monitoring"}
381
- ```
382
-
383
445
  ## Which Tool to Use
384
446
 
385
447
  | Scenario | Tool |
@@ -392,6 +454,32 @@ All tools return errors as:
392
454
  | Need a combined summary from multiple agents | `dispatch_parallel` with `aggregate` |
393
455
  | Long task — don't block your tool slot | `dispatch_async` + `dispatch_wait` |
394
456
  | Check progress without blocking | `dispatch_status` |
457
+ | Known-long task, one-off | any dispatch tool with `timeout_seconds=...` |
458
+ | A dispatch timed out | `dispatch_session` with the `session_id` from the error |
459
+
460
+ ## Error Recovery
461
+
462
+ Failures are deterministic: check `success`, then branch on `error_type`.
463
+
464
+ | `error_type` | Meaning | Recovery |
465
+ |--------------|---------|----------|
466
+ | `permission` | A tool call was denied | `update_agent(name, allowed_tools="Bash,Read")` (least privilege) or `update_agent(name, permission_mode="bypassPermissions")`, then re-dispatch. The `error` text includes a hint with the exact fix. |
467
+ | `timeout` | Process killed at the timeout | Resume the partial work: `dispatch_session(agent, "Continue where you left off", session_id=<from the error text>)`. Or retry with a bigger `timeout_seconds=`, or use `dispatch_async`. |
468
+ | `not_found` | Agent directory or `claude` CLI missing | `list_agents()` → check `healthy`. Re-add the agent with an existing path, or run `agent-dispatch doctor` to find what's missing. |
469
+ | `recursion` | Dispatch nesting exceeded `max_dispatch_depth` (default 3) | Don't dispatch from dispatched agents; if the nesting is intentional, raise `max_dispatch_depth` in settings. |
470
+ | `cli_error` | Anything else from the `claude` subprocess | Read the `error` text; run `agent-dispatch doctor` for environment issues; retry once if transient. |
471
+
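The table above maps naturally onto a small dispatcher on the caller side. The sketch below assumes the result has already been decoded into a `dict` with the `success`, `error_type`, and `error` fields named in this section. The function name and the wording of the returned strings are illustrative, not part of the package.

```python
def recovery_action(result: dict) -> str:
    """Return a short description of the next step for a dispatch result."""
    # Tool-level errors arrive as a plain {"error": "..."} envelope
    # with no `success` field at all.
    if "success" not in result and "error" in result:
        return "tool-level error: fix the request (" + str(result["error"]) + ")"
    if result.get("success"):
        return "ok"
    actions = {
        "permission": "grant tools via update_agent, then re-dispatch",
        "timeout": "resume with dispatch_session, or retry with a larger timeout_seconds",
        "not_found": "check list_agents() health, or run `agent-dispatch doctor`",
        "recursion": "avoid nested dispatch, or raise max_dispatch_depth",
        "cli_error": "read the error text; retry once if transient",
    }
    error_type = result.get("error_type")
    return actions.get(error_type, f"unhandled error_type: {error_type!r}")
```

Keeping the mapping in one dictionary means a new `error_type` shows up as an explicit "unhandled" message rather than being silently treated as success.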
472
+ Three soft signals can arrive with `success: true`:
473
+
474
+ - **`denied_tools` + `hint`** — the agent finished but some tool calls were blocked; the result may be incomplete. Grant access (see the `permission` row) and re-dispatch.
475
+ - **`parsed_result: null` with `response_format="json"`** — the reply wasn't valid JSON; the raw text is still in `result`. Caveat: an agent that *can't* comply returns `{"error": "<reason>"}` — which parses successfully — so also check `parsed_result` for an `"error"` key.
476
+ - **`budget_exceeded: true`** — `cost_usd` exceeded the agent's `max_budget_usd` (or the settings default). The dispatch is not failed — the money is already spent — but a runaway agent is now visible. Tighten the task, pick a cheaper model, or raise the budget.
477
+
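Since these signals accompany a successful result, they are easy to miss. A minimal sketch of a post-dispatch check follows, assuming the result is a decoded `dict` carrying the fields named in the list above. The function name and message texts are hypothetical.

```python
from typing import Optional


def soft_warnings(result: dict, response_format: Optional[str] = None) -> list:
    """Collect soft signals from a result that reported success."""
    warnings = []
    if result.get("denied_tools"):
        # Some tool calls were blocked, so the result may be incomplete.
        warnings.append("denied tools: " + ", ".join(result["denied_tools"]))
    if response_format == "json":
        parsed = result.get("parsed_result")
        if parsed is None:
            warnings.append("reply was not valid JSON; fall back to raw `result`")
        elif isinstance(parsed, dict) and "error" in parsed:
            # Valid JSON, but the agent is reporting that it could not comply.
            warnings.append("agent reported an error: " + str(parsed["error"]))
    if result.get("budget_exceeded"):
        warnings.append("cost exceeded max_budget_usd")
    return warnings
```

An empty list means none of the three signals fired. The JSON checks run only when the caller asked for `response_format="json"`, because `parsed_result` is not meaningful otherwise.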
478
+ Tool-level errors (unknown agent, malformed input) return a plain envelope instead of a `DispatchResult`:
479
+
480
+ ```json
481
+ {"error": "Unknown agent: 'foo'. Available: infra, db, monitoring"}
482
+ ```
395
483
 
396
484
  ## Configuration
397
485
 
@@ -403,9 +491,14 @@ agents:
403
491
  directory: ~/projects/infra
404
492
  description: "Infrastructure agent. MCP: portainer."
405
493
  timeout: 300 # seconds, default: 300
494
+ capabilities: # capability labels, shown in list_agents
495
+ - docker_logs
496
+ - deploy_debug
497
+ risky_capabilities: # high-risk labels, surfaced for visibility
498
+ - restart_services
406
499
  # model: sonnet # optional model override
407
500
  # max_budget_usd: 1.0 # cost limit per dispatch
408
- # permission_mode: auto # permission mode for the agent
501
+ # permission_mode: bypassPermissions # one of: default | plan | bypassPermissions
409
502
  # allowed_tools: # restrict which tools the agent can use
410
503
  # - Read
411
504
  # - Grep
@@ -440,6 +533,18 @@ Config is reloaded on every tool call — add agents without restarting.
440
533
  - Stack indicators — Docker, Rust, Go, Python, Node.js
441
534
  - DB indicators — Prisma, Alembic, migrations
442
535
 
536
+ ### Explicit Capabilities
537
+
538
+ Auto-description is useful, but explicit `capabilities` make it clearer what each agent is for. Add short snake_case task labels to agents:
539
+
540
+ ```bash
541
+ agent-dispatch update infra \
542
+ --capabilities docker_logs,deploy_debug \
543
+ --risky-capabilities restart_services
544
+ ```
545
+
546
+ `list_agents` and `inspect_agent` surface `capabilities` and `risky_capabilities` so the caller can pick the right agent at a glance — `risky_capabilities` flags higher-risk abilities (e.g. restarting services) for extra scrutiny.
547
+
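On the caller side, the labels can drive agent selection. The sketch below assumes that the `list_agents` output has been decoded into a list of dicts with `name`, `capabilities`, and `risky_capabilities` keys. That shape is an assumption made for illustration; the source only states that the two fields are surfaced.

```python
def agents_for(agents: list, capability: str) -> list:
    """Return (agent_name, is_risky) pairs for agents advertising a label."""
    matches = []
    for agent in agents:
        if capability in agent.get("risky_capabilities", []):
            # Higher-risk label: the caller should apply extra scrutiny.
            matches.append((agent["name"], True))
        elif capability in agent.get("capabilities", []):
            matches.append((agent["name"], False))
    return matches


# Hypothetical decoded list_agents output, mirroring the config example.
agents = [
    {
        "name": "infra",
        "capabilities": ["docker_logs", "deploy_debug"],
        "risky_capabilities": ["restart_services"],
    },
    {"name": "db", "capabilities": []},
]
```

With this data, `agents_for(agents, "restart_services")` flags `infra` as a risky match, so the caller can ask for confirmation before dispatching.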
443
548
  ## How It Works
444
549
 
445
550
  ```
@@ -466,7 +571,7 @@ agent-dispatch MCP server
466
571
  - **Argument-injection guard** — structured CLI fields (`session_id`, `model`, `permission_mode`, tool names) that start with `-` are rejected so they can't smuggle extra `claude` flags.
467
572
  - **Path-traversal guard** — caller-supplied `job_id`/`ref` values are validated as 32-char hex before any filesystem access.
468
573
  - **Owner-only state** — job files (`0o600`) and `agents.yaml` (`0o600`) are written for the owner only; their directories are `0o700`.
469
- - **Cost control** — `max_budget_usd` per agent or globally.
574
+ - **Cost visibility** — `max_budget_usd` per agent or globally; a dispatch whose cost exceeds it returns `budget_exceeded: true` + a hint (post-hoc — the `claude` CLI has no spend cap, so the overage can be flagged but not prevented).
470
575
  - **Concurrency** — `max_concurrency` (default: 5) caps parallel `claude -p` processes. Note: the sync and async dispatch paths use separate semaphores, so the worst-case total is `2 × max_concurrency`.
471
576
  - **Timeout** — per-agent or global (default: 300s). Orphaned processes are cleaned up.
472
577
  - **Caching** — identical `(agent, task, context, caller, goal, response_format)` requests return cached results, bounded by `cache.max_size` (oldest entry evicted first). Only successes are cached. Sessions and dialogues are never cached.
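The caching rule above (keyed on the request fields, bounded, oldest entry evicted first, successes only) can be sketched with an ordered dictionary. This is an illustrative model under those stated rules, not the package's implementation. The class and method names are hypothetical.

```python
from collections import OrderedDict
from typing import Optional


class DispatchCache:
    """Bounded cache that evicts the oldest inserted entry first."""

    def __init__(self, max_size: int) -> None:
        if max_size < 1:
            raise ValueError("max_size must be at least 1")
        self._max_size = max_size
        self._entries = OrderedDict()

    @staticmethod
    def key(agent, task, context=None, caller=None, goal=None, response_format=None):
        """Build the cache key from the request fields listed above."""
        return (agent, task, context, caller, goal, response_format)

    def get(self, key) -> Optional[dict]:
        return self._entries.get(key)

    def put(self, key, result: dict) -> None:
        if not result.get("success"):
            return  # only successes are cached
        if key not in self._entries and len(self._entries) >= self._max_size:
            self._entries.popitem(last=False)  # drop the oldest insertion
        self._entries[key] = result
```

Reads do not reorder entries here, so eviction follows insertion order (FIFO) rather than least-recently-used, which is how "oldest entry evicted first" is read in this sketch.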
@@ -485,12 +590,16 @@ See [SECURITY.md](SECURITY.md) for the full threat model (including the `bypassP
485
590
  | `agent-dispatch describe <name>` | Show full configuration for one agent (tri-state tools, project files) |
486
591
  | `agent-dispatch test <name> [task] [--stream]` | Test an agent with a dispatch (`--stream` for live progress) |
487
592
  | `agent-dispatch doctor` | Diagnose installation: claude CLI, MCP registration, agent health |
593
+ | `agent-dispatch jobs [--status --limit]` | List async dispatch jobs (most recent first) |
594
+ | `agent-dispatch job <id>` | Show one job: status, progress tail, result preview |
595
+ | `agent-dispatch cancel <id>` | Cancel a pending job (running jobs: use the `dispatch_cancel` MCP tool) |
596
+ | `agent-dispatch gc [--days]` | Purge terminal jobs older than N days (default 7) |
488
597
  | `agent-dispatch serve` | Start MCP server (stdio, used by Claude Code) |
489
598
 
490
599
  ## Requirements
491
600
 
492
601
  - Python >= 3.10
493
- - [Claude Code CLI](https://docs.anthropic.com/en/docs/claude-code) installed and authenticated
602
+ - [Claude Code CLI](https://docs.anthropic.com/en/docs/claude-code) installed, authenticated, and on `PATH` (verify: `claude --version`)
494
603
 
495
604
  ## License
496
605