PyPI - agent-dispatch - Versions diffs - 0.3.0__tar.gz → 0.5.0__tar.gz - Mend

agent-dispatch 0.3.0.tar.gz → 0.5.0.tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (31)

{agent_dispatch-0.3.0 → agent_dispatch-0.5.0}/.github/workflows/ci.yml RENAMED

@@ -6,6 +6,10 @@ on:
   pull_request:
     branches: [main]
+# Least privilege: CI only needs to read the repo.
+permissions:
+  contents: read
 jobs:
   test:
     runs-on: ubuntu-latest

{agent_dispatch-0.3.0 → agent_dispatch-0.5.0}/.github/workflows/publish.yml RENAMED

@@ -4,12 +4,16 @@ on:
   release:
     types: [published]
+# Default to no privileges; the publish job opts into exactly what it needs.
+permissions: {}
 jobs:
   publish:
     runs-on: ubuntu-latest
     environment: pypi
     permissions:
-      id-token: write
+      id-token: write   # OIDC token for PyPI Trusted Publisher
+      contents: read    # checkout the tagged source
     steps:
       - uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6

agent_dispatch-0.5.0/CHANGELOG.md ADDED

@@ -0,0 +1,224 @@
+# Changelog
+All notable changes to this project are documented here.
+The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
+and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
+## [Unreleased]
+## [0.5.0] - 2026-06-01
+Security-hardening release. A multi-agent audit of the codebase surfaced
+several issues; the confirmed ones are fixed here, plus job cancellation,
+cache bounding, and stale-job recovery.
+### Security
+- **Path traversal in async jobs (fixed).** `dispatch_status`, `dispatch_wait`,
+  and `fetch_result` accept a caller-supplied `job_id`/`ref` that flowed
+  straight into `JobStore`'s file-path construction. A crafted value such as
+  `../../secret` could read any Job-shaped `.json` file outside the jobs
+  directory. Job ids are now validated against `^[0-9a-f]{32}$` at the tool
+  boundary (`_validate_ref`), in `JobStore.get`, and in `JobStore._path`
+  (defense in depth). Malformed ids are rejected without touching the
+  filesystem. New helper `jobs.is_valid_job_id`.
+- **Argument/flag injection via structured CLI fields (fixed).** A
+  `session_id` (caller-controlled in `dispatch_session`) — or a misconfigured
+  `model`, `permission_mode`, or tool name — that started with `-` was placed
+  in the argument position after a flag (e.g. `--resume <session_id>`) and the
+  `claude` CLI parsed it as a *new* flag, allowing options like
+  `--permission-mode bypassPermissions` to be smuggled in. `_build_command`
+  now rejects any such value via `_reject_flaglike` (raising
+  `runner.ArgInjectionError`); `dispatch`/`dispatch_stream` surface it as a
+  clean failed result, never spawning a subprocess.
+- **Tightened file permissions.** Job files are written `0o600` and the jobs
+  directory is created `0o700` (they hold full task/context/result payloads
+  that may contain secrets). `save_config` now writes `agents.yaml` `0o600`
+  and its parent directory `0o700`. All `chmod`s are best-effort (skipped on
+  platforms without POSIX modes).
+### Added
+- `dispatch_cancel(job_id)` MCP tool — cancel a *pending* async job before it
+  starts. Running jobs are left to finish (their subprocess can't be safely
+  interrupted); the tool reports an `outcome` of `cancelled`, `running`,
+  `already_terminal`, or `not_found`. Makes the previously-unreachable
+  `cancelled` job status real. Backed by `JobStore.cancel`, and the
+  cancel/start race is closed by `mark_running` refusing a cancelled job.
+- Cache size bound — `CacheSettings.max_size` (default 1000) caps the
+  in-memory dispatch cache, evicting the oldest entry first (FIFO by insertion
+  time; read access does not refresh, since the timestamp also drives TTL),
+  preventing unbounded memory growth from many unique requests. `cache_stats`
+  now reports `max_size` and `evictions`.
+- Stale-job recovery — on startup the server marks jobs abandoned in
+  `running` (older than 1h, e.g. from a crashed prior run) as `failed` so
+  callers don't poll them forever (`JobStore.recover_stale`).
+### Changed
+- Input bounds hardened across MCP tools: `dispatch_jobs(limit)` clamped to
+  `[1, 1000]`; `dispatch_gc(max_age_days)` rejects non-finite values;
+  `summary_chars` (in `dispatch` and per-item `dispatch_parallel`) clamped to
+  `[0, 100000]`; `dispatch_parallel` rejects more than
+  `max(100, max_concurrency * 20)` items to bound subprocess fan-out.
+- Async job worker now logs lifecycle transitions (running / finished) with
+  the job id for easier production debugging.
+- Type hints filled in (`_ref_payload`, `_run_job`, `_run_one`).
+- Lint surface expanded — ruff now enforces bugbear (`B`), bandit security
+  (`S`), import order (`I`), and pyupgrade (`UP`) in addition to the defaults,
+  with documented ignores for the trusted `claude` subprocess calls.
+- `SECURITY.md` rewritten: accurate supported-versions table and an expanded
+  threat model (bypassPermissions, on-disk job files, env inheritance,
+  best-effort recursion depth, argument-injection mitigation).
+## [0.4.0] - 2026-05-15
+### Added
+- Result references — `dispatch(..., return_ref=True)` and per-item in
+  `dispatch_parallel` now return a compact `{ref, agent, success, size,
+  summary, summary_chars, cost_usd, ...}` payload instead of the full
+  result text. The full DispatchResult is persisted to disk (reusing the
+  async JobStore) and can be loaded on demand via the new
+  `fetch_result(ref, max_chars=0)` MCP tool. Saves caller context when
+  the result is large; the JSON parsed_result (small by nature) is still
+  inlined alongside the ref. fetch_result also works on any
+  `dispatch_async` job_id — the storage is shared.
+- `JobStore.create_completed(...)` — persists an already-finished
+  DispatchResult as a Job in terminal state. Used by ref mode; future
+  iterations can use it for result archival.
+- Structured JSON response support — `dispatch`, `dispatch_session`,
+  `dispatch_async`, `dispatch_stream`, and per-item in `dispatch_parallel`
+  now accept `response_format="json"`. When set, the runner appends a clear
+  "respond with a single JSON value, no prose, no fences" footer to the
+  prompt and attempts to parse the agent's response (tolerating ```json
+  fences). The parsed value lands in a new `DispatchResult.parsed_result`
+  field — `None` when not requested or unparseable (soft mode: parse
+  failure does NOT mark the dispatch as failed). Cache key now includes
+  `response_format` so JSON and text requests for the same task don't
+  collide.
+- `list_agents` MCP tool now surfaces `mcp_servers`, `stacks`, and `dbs`
+  per agent (when present) — the same structured data `auto_describe`
+  already collects from `.mcp.json`, `Dockerfile`, `pyproject.toml`,
+  `package.json`, `Cargo.toml`, `go.mod`, `prisma/`, `alembic.ini`, etc.
+  Calling agents no longer need to dispatch a probe just to learn what
+  tools the target has.
+- New `inspect_agent(name, preview_lines=40)` MCP tool — cheap detailed
+  lookup without a `claude` subprocess. Returns the agent's full config
+  fields (timeout, model, budget, permission_mode, tool lists), detected
+  MCP/stacks/DBs, plus short previews of `CLAUDE.md` and `README.md` so
+  the caller can confirm capabilities before spending a real dispatch.
+- `config.collect_mcp_servers()`, `config.detect_stacks()`, and
+  `config.detect_dbs()` are now public helpers (the previous private
+  `_collect_mcp_servers` remains as an alias for compatibility).
+- Async dispatch with a `job_id` pattern — five new MCP tools let calling
+  agents fire-and-forget long-running tasks without blocking their own tool
+  slot:
+  - `dispatch_async(agent, task, ...)` — start a dispatch in the background,
+    returns `{job_id, status: "pending", agent}` immediately.
+  - `dispatch_status(job_id)` — read the current state of a job without
+    blocking (pending / running / done / failed) including the
+    `DispatchResult` once complete.
+  - `dispatch_wait(job_id, timeout_seconds=60)` — block until terminal or
+    until the timeout fires (capped at 3600s). Returns the same shape as
+    `dispatch_status` plus `timed_out_waiting: true` on timeout — the job
+    keeps running and the caller can poll/wait again.
+  - `dispatch_jobs(status?, limit=50)` — list recent jobs as summaries,
+    optionally filtered by status (most recent first).
+  - `dispatch_gc(max_age_days=7)` — purge terminal jobs older than the
+    threshold. Pending and running jobs are never touched.
+- Job state persists to disk as one JSON file per job under
+  `~/.config/agent-dispatch/jobs/` (override via `AGENT_DISPATCH_JOBS_DIR`).
+  Atomic writes via `os.replace()` so partial files never appear, and jobs
+  survive across server restarts (existing terminal jobs remain queryable,
+  in-flight jobs are abandoned on restart — to be addressed in a future
+  iteration with PID tracking).
+## [0.3.0] - 2026-05-08
+### Added
+- `agent-dispatch doctor` CLI command — diagnoses installation issues:
+  checks `claude` CLI on PATH, `agent-dispatch` on PATH, config validity,
+  MCP registration with Claude Code, and per-agent directory health.
+  Exits non-zero if any blocking issue is found.
+- `agent-dispatch describe <name>` CLI command — show one agent's full
+  configuration: directory, description, timeout, model, budget, permission
+  mode, tri-state tool fields (`(inherit defaults)` vs `(none — explicit
+  override)` vs explicit list), and which project files would be inherited.
+- `--stream` flag for `agent-dispatch test` — surfaces live progress
+  (assistant text + tool use) while the agent works, useful for long
+  tasks where you'd otherwise see nothing until completion.
+### Fixed
+- `list_agents` MCP tool no longer crashes the entire response when one
+  agent's directory is unreadable (`PermissionError`, network FS hiccup,
+  etc.). The bad agent now reports `healthy: "UNREADABLE"` and the rest
+  of the listing succeeds — matching the documented response shape.
+- Dispatch cache key now includes `caller` and `goal`. Previously two
+  requests with the same `(agent, task, context)` but different framing
+  (e.g. `caller="frontend"` vs `caller="backend"`) would collide and the
+  second request would receive the cached response from the first — even
+  though the structured prompt sent to Claude is materially different.
+## [0.2.2] - 2026-04-17
+### Fixed
+- `agent-dispatch list` now distinguishes `allowed_tools: None` (inherit
+  from settings defaults) from `allowed_tools: []` (explicitly no tools).
+  Previously both were rendered identically.
+## [0.2.1] - 2026-04-17
+### Fixed
+- 13 bugs across the runner, server, CLI, config, and models:
+  - Runner: defensive coercion in `_classify_error` for non-string inputs;
+    fallback messages when `is_error=True` produces empty `result`;
+    correct error_type classification on plain-text stdout fallbacks;
+    orphan subprocess cleanup on stream exit paths.
+  - Server: up-front validation in `dispatch_parallel` (rejects bad items
+    before any dispatch runs); `dispatch_dialogue` surfaces per-turn errors;
+    `cache_stats` evicts expired entries before reporting.
+  - CLI: friendly error messages on malformed YAML / invalid schema;
+    `list` handles `OSError` from unreachable directories;
+    sentinel patterns for `update` to clear fields (`"none"` / `""`).
+  - Config: deduplication when collecting MCP servers from multiple paths.
+  - Models: tighter validation bounds (`ge=0`, `ge=1`).
+## [0.2.0] - 2026-04-16
+### Added
+- Error classification — `DispatchResult.error_type` now reports
+  `permission`, `timeout`, `recursion`, `not_found`, or `cli_error`.
+  Permission errors include an actionable hint with suggested fixes.
+- Permission management — agents and global settings support
+  `permission_mode`, `allowed_tools`, and `disallowed_tools`. Tool lists
+  use tri-state semantics: `None` inherits from defaults, `[]` overrides
+  to "no tools", a list specifies the allowed/disallowed set.
+- `update_agent` MCP tool — modify an existing agent's configuration
+  without remove + re-add. CLI parity via `agent-dispatch update`.
+- CLI tests for `init` and `test` commands.
+## [0.1.0] - 2026-04-10
+### Added
+- Initial release.
+- 11 MCP tools: `list_agents`, `add_agent`, `remove_agent`, `dispatch`,
+  `dispatch_session`, `dispatch_parallel` (with optional aggregation),
+  `dispatch_stream`, `dispatch_dialogue`, `cache_stats`, `cache_clear`.
+- CLI: `init`, `add`, `remove`, `list`, `test`, `serve`.
+- Recursion protection via `AGENT_DISPATCH_DEPTH` env var.
+- In-memory TTL cache (thread-safe).
+- Concurrency control via `asyncio.Semaphore` (default: 5 parallel
+  `claude -p` processes).
+- Auto-description from `CLAUDE.md`, `README.md`, `pyproject.toml`,
+  `package.json`, `.mcp.json`, and stack/DB indicators.
+- PyPI publishing via Trusted Publisher (OIDC).
+- CI matrix on Python 3.10, 3.11, 3.12, 3.13.
+- Dependabot for `pip` + `github-actions`, GitHub Actions pinned to
+  commit SHAs for supply-chain integrity.
+[Unreleased]: https://github.com/ginkida/agent-dispatch/compare/v0.5.0...HEAD
+[0.5.0]: https://github.com/ginkida/agent-dispatch/compare/v0.4.0...v0.5.0
+[0.4.0]: https://github.com/ginkida/agent-dispatch/compare/v0.3.0...v0.4.0
+[0.3.0]: https://github.com/ginkida/agent-dispatch/compare/v0.2.2...v0.3.0
+[0.2.2]: https://github.com/ginkida/agent-dispatch/compare/v0.2.1...v0.2.2
+[0.2.1]: https://github.com/ginkida/agent-dispatch/compare/v0.2.0...v0.2.1
+[0.2.0]: https://github.com/ginkida/agent-dispatch/compare/v0.1.0...v0.2.0
+[0.1.0]: https://github.com/ginkida/agent-dispatch/releases/tag/v0.1.0
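Reviewer note on the 0.5.0 security entries above: the two input checks the changelog describes (job-id validation against `^[0-9a-f]{32}$` and rejection of values that start with `-`) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the package's code: the changelog names `jobs.is_valid_job_id`, `_reject_flaglike`, and `runner.ArgInjectionError`, but their real signatures and bodies are not shown in this diff, so the signatures and the `ArgInjectionError` class below are hypothetical stand-ins.

```python
import re

# Pattern quoted in the changelog: exactly 32 lowercase hex characters.
_JOB_ID_RE = re.compile(r"^[0-9a-f]{32}$")


class ArgInjectionError(ValueError):
    """Hypothetical stand-in for the runner.ArgInjectionError named in the changelog."""


def is_valid_job_id(value: object) -> bool:
    """Return True only for a 32-char lowercase hex string.

    fullmatch is used so that a trailing newline (which `$` alone would
    tolerate with re.match) is rejected as well.
    """
    return isinstance(value, str) and _JOB_ID_RE.fullmatch(value) is not None


def reject_flaglike(field: str, value: str) -> str:
    """Raise if a structured CLI field could be parsed as a new flag.

    Assumed signature: the field name is passed only to build the message.
    """
    if value.startswith("-"):
        raise ArgInjectionError(f"{field} must not start with '-': {value!r}")
    return value
```

Checking the id before any path is built means a value such as `../../secret` never reaches the filesystem layer, which matches the "rejected without touching the filesystem" wording in the changelog.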

{agent_dispatch-0.3.0 → agent_dispatch-0.5.0}/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: agent-dispatch
-Version: 0.3.0
+Version: 0.5.0
 Summary: MCP server that lets Claude Code agents delegate tasks to agents in other project directories
 Project-URL: Homepage, https://github.com/ginkida/agent-dispatch
 Project-URL: Repository, https://github.com/ginkida/agent-dispatch
@@ -85,7 +85,7 @@ Done. Every Claude Code session now has access to all dispatch tools.
 Lists all configured agents. **Call this first** to see what's available.
 ```json
-// Response (permission fields shown only when configured)
+// Response (capability + permission fields shown only when populated)
 [
   {
     "name": "infra",
@@ -94,12 +94,28 @@ Lists all configured agents. **Call this first** to see what's available.
     "healthy": true,
     "has_claude_md": true,
     "has_mcp_config": true,
+    "mcp_servers": ["portainer", "postgres"],
+    "stacks": ["Python", "Docker"],
+    "dbs": ["Alembic"],
     "permission_mode": "bypassPermissions",
     "allowed_tools": ["Bash", "Read", "Grep"]
   }
 ]
 ```
+`mcp_servers`, `stacks`, and `dbs` are detected from the agent's project files (`.mcp.json`, `Dockerfile`, `pyproject.toml`, `Cargo.toml`, `prisma/`, `alembic.ini`, etc.) so callers can pick the right agent without dispatching a probe.
+### `inspect_agent`
+Cheap detailed lookup — reads the agent's files without spawning a `claude` session. Returns the full config (timeout, model, budget, permission mode, allowed/disallowed tools), detected MCP/stacks/DBs, plus short previews of `CLAUDE.md` and `README.md` when present.
+| Parameter | Type | Required | Description |
+|-----------|------|----------|-------------|
+| `name` | string | yes | Agent name from `list_agents` |
+| `preview_lines` | int | no | Max lines of CLAUDE.md/README.md (default 40, max 200, 0 disables) |
+Use this **before** `dispatch_async`/`dispatch` to confirm an agent has the tools and context for your task — much cheaper than a probe dispatch.
 ### `dispatch`
 One-shot task delegation. Results are cached — identical requests within TTL return instantly.
@@ -111,6 +127,9 @@ One-shot task delegation. Results are cached — identical requests within TTL r
 | `context` | string | no | Extra context: error messages, code snippets, stack traces |
 | `caller` | string | no | Your project/role — helps the agent understand who's asking |
 | `goal` | string | no | Broader objective — helps the agent make better trade-offs |
+| `response_format` | string | no | `"json"` to request a single JSON value; the parsed result lands in `parsed_result`. Empty = free-form text. |
+| `return_ref` | bool | no | When `true`, returns just a `ref` + summary preview instead of the full result text. Use `fetch_result(ref)` to load the full text on demand. |
+| `summary_chars` | int | no | Max chars of result text to include in the ref response (default 500). |
 ```json
 // Response (success)
@@ -136,6 +155,18 @@ One-shot task delegation. Results are cached — identical requests within TTL r
 **`error_type` values:** `permission` (tool/action denied), `timeout`, `recursion` (dispatch depth exceeded), `not_found` (missing directory or CLI), `cli_error` (other failures). Permission errors include an actionable hint.
+**Structured JSON output:** pass `response_format="json"` to ask the agent for a single JSON value. The runner appends an instruction footer ("respond with a single valid JSON value, no fences, no prose") and on success parses the response — the parsed value lands in `parsed_result`. The raw text is always in `result`. Parse failures leave `parsed_result=None` but don't fail the dispatch (soft mode).
+```json
+// Response with response_format="json"
+{
+  "agent": "infra",
+  "success": true,
+  "result": "{\"errors\": 3, \"first_at\": \"14:02\"}",
+  "parsed_result": {"errors": 3, "first_at": "14:02"}
+}
+```
 **Always pass `caller` and `goal`** — the dispatched agent sees a structured prompt:
 ```markdown
@@ -290,6 +321,57 @@ Remove an agent from config.
 View cache hit rate and size, or clear all cached results.
+### Result references — `return_ref` + `fetch_result`
+For dispatches whose result text is large (audits, log dumps, code searches), passing the full text back inflates the calling agent's context. Use `return_ref=True` to get just a small reference instead:
+```
+dispatch(agent="infra", task="audit every container", return_ref=True, summary_chars=200)
+  -> {"ref": "8f3a...e1", "agent": "infra", "success": true,
+      "size": 14823, "summary_chars": 200,
+      "summary": "Inspected 32 containers. Found 3 OOM kills in the last hour:\n- worker-3...",
+      "cost_usd": 0.08, "duration_ms": 9200}
+// Later, when you actually need to read the result:
+fetch_result(ref="8f3a...e1")              -> full DispatchResult JSON
+fetch_result(ref="8f3a...e1", max_chars=2000)  -> truncated, plus {"truncated": true, "full_size": 14823}
+```
+Refs reuse the same storage as `dispatch_async` jobs (under `~/.config/agent-dispatch/jobs/`), so any `job_id` returned by `dispatch_async` is also a valid `ref` for `fetch_result`. `parsed_result` (when `response_format="json"` is set) is small and is always inlined directly in the ref response — no second fetch needed.
+### Async dispatch — `dispatch_async`, `dispatch_status`, `dispatch_wait`, `dispatch_cancel`, `dispatch_jobs`, `dispatch_gc`
+When a dispatched task is going to take a while, you don't want to block your own tool slot for minutes. Async dispatch returns a `job_id` immediately and lets you check back when you're ready.
+```
+// 1. fire and forget
+dispatch_async(agent="infra", task="audit every container log for OOM kills today")
+  -> {"job_id": "8f3a...e1", "status": "pending", "agent": "infra"}
+// 2. do other work, then check progress (non-blocking)
+dispatch_status(job_id="8f3a...e1")
+  -> {"id": "8f3a...e1", "status": "running", "started_at": 1730000123.4, ...}
+// 3. or block until done (with a timeout cap)
+dispatch_wait(job_id="8f3a...e1", timeout_seconds=120)
+  -> {"id": "8f3a...e1", "status": "done", "result": {"agent": "infra", "success": true, ...}}
+// If the timeout fires, the job keeps running:
+  -> {"id": "...", "status": "running", "timed_out_waiting": true}
+```
+`dispatch_cancel(job_id)` cancels a job that is still **pending** (before its subprocess starts) — a running job is left to finish, since its `claude` subprocess can't be safely interrupted. The response carries an `outcome` of `cancelled`, `running`, `already_terminal`, or `not_found`.
+`dispatch_jobs(status?)` lists recent jobs as summaries (filter by `pending` / `running` / `done` / `failed` / `cancelled`). `dispatch_gc(max_age_days=7)` purges terminal jobs older than the threshold — pending and running jobs are never deleted.
+Job state persists to disk at `~/.config/agent-dispatch/jobs/` (override with `AGENT_DISPATCH_JOBS_DIR`). One JSON file per job, written owner-only (`0o600`) with atomic writes — safe to read or `ls` while jobs are in flight. Caller-supplied `job_id`s are validated as 32-char hex before any file access (no path traversal). On startup the server marks jobs abandoned in `running` by a prior crashed instance as `failed`.
+| When to use async | When to use `dispatch` |
+|-------------------|------------------------|
+| Long task (minutes) — you want to keep working | Short task — you need the answer right now |
+| Several long tasks you'll collect later | Several short tasks → `dispatch_parallel` |
+| Don't care about caching (each call is a fresh job) | Cached by default — identical requests are free |
 ### Error Responses
 All tools return errors as:
@@ -308,6 +390,8 @@ All tools return errors as:
 | Long task, want to see progress | `dispatch_stream` |
 | Two agents need to collaborate | `dispatch_dialogue` |
 | Need a combined summary from multiple agents | `dispatch_parallel` with `aggregate` |
+| Long task — don't block your tool slot | `dispatch_async` + `dispatch_wait` |
+| Check progress without blocking | `dispatch_status` |
 ## Configuration
@@ -336,10 +420,11 @@ settings:
   #   - Read
   #   - Edit
   max_dispatch_depth: 3     # recursion protection
-  max_concurrency: 5        # max parallel claude -p processes
+  max_concurrency: 5        # max parallel claude -p processes (per dispatch path)
   cache:
     enabled: true
     ttl: 300                # seconds
+    max_size: 1000          # max cached entries; oldest evicted first (FIFO)
 ```
 Config is reloaded on every tool call — add agents without restarting.
@@ -377,11 +462,16 @@ agent-dispatch MCP server
 ## Safety
-- **Recursion protection** — `AGENT_DISPATCH_DEPTH` env var tracks nesting. Default limit: 3.
+- **Recursion protection** — `AGENT_DISPATCH_DEPTH` env var tracks nesting. Default limit: 3. Best-effort across the subprocess boundary (see [SECURITY.md](SECURITY.md)).
+- **Argument-injection guard** — structured CLI fields (`session_id`, `model`, `permission_mode`, tool names) that start with `-` are rejected so they can't smuggle extra `claude` flags.
+- **Path-traversal guard** — caller-supplied `job_id`/`ref` values are validated as 32-char hex before any filesystem access.
+- **Owner-only state** — job files (`0o600`) and `agents.yaml` (`0o600`) are written for the owner only; their directories are `0o700`.
 - **Cost control** — `max_budget_usd` per agent or globally.
-- **Concurrency** — `max_concurrency` (default: 5) limits parallel `claude -p` processes.
+- **Concurrency** — `max_concurrency` (default: 5) caps parallel `claude -p` processes. Note: the sync and async dispatch paths use separate semaphores, so the worst-case total is `2 × max_concurrency`.
 - **Timeout** — per-agent or global (default: 300s). Orphaned processes are cleaned up.
-- **Caching** — identical `(agent, task, context)` requests return cached results. Only successes are cached. Sessions and dialogues are never cached.
+- **Caching** — identical `(agent, task, context, caller, goal, response_format)` requests return cached results, bounded by `cache.max_size` (oldest entry evicted first). Only successes are cached. Sessions and dialogues are never cached.
+See [SECURITY.md](SECURITY.md) for the full threat model (including the `bypassPermissions` escalation risk and on-disk job files).
 ## CLI
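Reviewer note on the caching change above: the docs describe a cache bounded by `cache.max_size` that evicts the oldest entry first (FIFO by insertion time), where a read does not refresh an entry because the same timestamp drives TTL. A minimal sketch of that policy is below. The class name `BoundedTTLCache`, its methods, and the injectable `clock` parameter are assumptions made for illustration; the package's actual cache implementation is not part of this diff.

```python
import time
from collections import OrderedDict


class BoundedTTLCache:
    """Hypothetical FIFO-bounded TTL cache (not the package's real class)."""

    def __init__(self, max_size=1000, ttl=300.0, clock=time.monotonic):
        self._max_size = max_size
        self._ttl = ttl
        self._clock = clock          # injectable so tests can control time
        self._entries = OrderedDict()  # key -> (stored_at, value)
        self.evictions = 0

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        stored_at, value = entry
        if self._clock() - stored_at > self._ttl:
            # Expired: drop it and report a miss.
            del self._entries[key]
            return None
        # Deliberately no move_to_end(): reads do not refresh the entry.
        return value

    def put(self, key, value):
        # Re-inserting a key counts as a fresh insertion at the back.
        self._entries.pop(key, None)
        self._entries[key] = (self._clock(), value)
        while len(self._entries) > self._max_size:
            self._entries.popitem(last=False)  # oldest insertion first
            self.evictions += 1
```

Compared with an LRU policy, FIFO keeps a single timestamp per entry doing double duty for ordering and expiry, at the cost of possibly evicting a frequently read entry.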

{agent_dispatch-0.3.0 → agent_dispatch-0.5.0}/README.md RENAMED

@@ -55,7 +55,7 @@ Done. Every Claude Code session now has access to all dispatch tools.
 Lists all configured agents. **Call this first** to see what's available.
 ```json
-// Response (permission fields shown only when configured)
+// Response (capability + permission fields shown only when populated)
 [
   {
     "name": "infra",
@@ -64,12 +64,28 @@ Lists all configured agents. **Call this first** to see what's available.
     "healthy": true,
     "has_claude_md": true,
     "has_mcp_config": true,
+    "mcp_servers": ["portainer", "postgres"],
+    "stacks": ["Python", "Docker"],
+    "dbs": ["Alembic"],
     "permission_mode": "bypassPermissions",
     "allowed_tools": ["Bash", "Read", "Grep"]
   }
 ]
 ```
+`mcp_servers`, `stacks`, and `dbs` are detected from the agent's project files (`.mcp.json`, `Dockerfile`, `pyproject.toml`, `Cargo.toml`, `prisma/`, `alembic.ini`, etc.) so callers can pick the right agent without dispatching a probe.
+### `inspect_agent`
+Cheap detailed lookup — reads the agent's files without spawning a `claude` session. Returns the full config (timeout, model, budget, permission mode, allowed/disallowed tools), detected MCP/stacks/DBs, plus short previews of `CLAUDE.md` and `README.md` when present.
+| Parameter | Type | Required | Description |
+|-----------|------|----------|-------------|
+| `name` | string | yes | Agent name from `list_agents` |
+| `preview_lines` | int | no | Max lines of CLAUDE.md/README.md (default 40, max 200, 0 disables) |
+Use this **before** `dispatch_async`/`dispatch` to confirm an agent has the tools and context for your task — much cheaper than a probe dispatch.
 ### `dispatch`
 One-shot task delegation. Results are cached — identical requests within TTL return instantly.
@@ -81,6 +97,9 @@ One-shot task delegation. Results are cached — identical requests within TTL r
 | `context` | string | no | Extra context: error messages, code snippets, stack traces |
 | `caller` | string | no | Your project/role — helps the agent understand who's asking |
 | `goal` | string | no | Broader objective — helps the agent make better trade-offs |
+| `response_format` | string | no | `"json"` to request a single JSON value; the parsed result lands in `parsed_result`. Empty = free-form text. |
+| `return_ref` | bool | no | When `true`, returns just a `ref` + summary preview instead of the full result text. Use `fetch_result(ref)` to load the full text on demand. |
+| `summary_chars` | int | no | Max chars of result text to include in the ref response (default 500). |
 ```json
 // Response (success)
@@ -106,6 +125,18 @@ One-shot task delegation. Results are cached — identical requests within TTL r
 **`error_type` values:** `permission` (tool/action denied), `timeout`, `recursion` (dispatch depth exceeded), `not_found` (missing directory or CLI), `cli_error` (other failures). Permission errors include an actionable hint.
+**Structured JSON output:** pass `response_format="json"` to ask the agent for a single JSON value. The runner appends an instruction footer ("respond with a single valid JSON value, no fences, no prose") and on success parses the response — the parsed value lands in `parsed_result`. The raw text is always in `result`. Parse failures leave `parsed_result=None` but don't fail the dispatch (soft mode).
+```json
+// Response with response_format="json"
+{
+  "agent": "infra",
+  "success": true,
+  "result": "{\"errors\": 3, \"first_at\": \"14:02\"}",
+  "parsed_result": {"errors": 3, "first_at": "14:02"}
+}
+```
 **Always pass `caller` and `goal`** — the dispatched agent sees a structured prompt:
 ```markdown
@@ -260,6 +291,57 @@ Remove an agent from config.
 View cache hit rate and size, or clear all cached results.
+### Result references — `return_ref` + `fetch_result`
+For dispatches whose result text is large (audits, log dumps, code searches), passing the full text back inflates the calling agent's context. Use `return_ref=True` to get just a small reference instead:
+```
+dispatch(agent="infra", task="audit every container", return_ref=True, summary_chars=200)
+  -> {"ref": "8f3a...e1", "agent": "infra", "success": true,
+      "size": 14823, "summary_chars": 200,
+      "summary": "Inspected 32 containers. Found 3 OOM kills in the last hour:\n- worker-3...",
+      "cost_usd": 0.08, "duration_ms": 9200}
+// Later, when you actually need to read the result:
+fetch_result(ref="8f3a...e1")              -> full DispatchResult JSON
+fetch_result(ref="8f3a...e1", max_chars=2000)  -> truncated, plus {"truncated": true, "full_size": 14823}
+```
+Refs reuse the same storage as `dispatch_async` jobs (under `~/.config/agent-dispatch/jobs/`), so any `job_id` returned by `dispatch_async` is also a valid `ref` for `fetch_result`. `parsed_result` (when `response_format="json"` is set) is small and is always inlined directly in the ref response — no second fetch needed.
+### Async dispatch — `dispatch_async`, `dispatch_status`, `dispatch_wait`, `dispatch_cancel`, `dispatch_jobs`, `dispatch_gc`
+When a dispatched task is going to take a while, you don't want to block your own tool slot for minutes. Async dispatch returns a `job_id` immediately and lets you check back when you're ready.
+```
+// 1. fire and forget
+dispatch_async(agent="infra", task="audit every container log for OOM kills today")
+  -> {"job_id": "8f3a...e1", "status": "pending", "agent": "infra"}
+// 2. do other work, then check progress (non-blocking)
+dispatch_status(job_id="8f3a...e1")
+  -> {"id": "8f3a...e1", "status": "running", "started_at": 1730000123.4, ...}
+// 3. or block until done (with a timeout cap)
+dispatch_wait(job_id="8f3a...e1", timeout_seconds=120)
+  -> {"id": "8f3a...e1", "status": "done", "result": {"agent": "infra", "success": true, ...}}
+// If the timeout fires, the job keeps running:
+  -> {"id": "...", "status": "running", "timed_out_waiting": true}
+```
+`dispatch_cancel(job_id)` cancels a job that is still **pending** (before its subprocess starts) — a running job is left to finish, since its `claude` subprocess can't be safely interrupted. The response carries an `outcome` of `cancelled`, `running`, `already_terminal`, or `not_found`.
+`dispatch_jobs(status?)` lists recent jobs as summaries (filter by `pending` / `running` / `done` / `failed` / `cancelled`). `dispatch_gc(max_age_days=7)` purges terminal jobs older than the threshold — pending and running jobs are never deleted.
+Job state persists to disk at `~/.config/agent-dispatch/jobs/` (override with `AGENT_DISPATCH_JOBS_DIR`). One JSON file per job, written owner-only (`0o600`) with atomic writes — safe to read or `ls` while jobs are in flight. Caller-supplied `job_id`s are validated as 32-char hex before any file access (no path traversal). On startup the server marks jobs abandoned in `running` by a prior crashed instance as `failed`.
+| When to use async | When to use `dispatch` |
+|-------------------|------------------------|
+| Long task (minutes) — you want to keep working | Short task — you need the answer right now |
+| Several long tasks you'll collect later | Several short tasks → `dispatch_parallel` |
+| Don't care about caching (each call is a fresh job) | Cached by default — identical requests are free |
 ### Error Responses
 All tools return errors as:
@@ -278,6 +360,8 @@ All tools return errors as:
 | Long task, want to see progress | `dispatch_stream` |
 | Two agents need to collaborate | `dispatch_dialogue` |
 | Need a combined summary from multiple agents | `dispatch_parallel` with `aggregate` |
+| Long task — don't block your tool slot | `dispatch_async` + `dispatch_wait` |
+| Check progress without blocking | `dispatch_status` |
 ## Configuration
@@ -306,10 +390,11 @@ settings:
   #   - Read
   #   - Edit
   max_dispatch_depth: 3     # recursion protection
-  max_concurrency: 5        # max parallel claude -p processes
+  max_concurrency: 5        # max parallel claude -p processes (per dispatch path)
   cache:
     enabled: true
     ttl: 300                # seconds
+    max_size: 1000          # max cached entries; oldest evicted first (FIFO)
 ```
 Config is reloaded on every tool call — add agents without restarting.
@@ -347,11 +432,16 @@ agent-dispatch MCP server
 ## Safety
-- **Recursion protection** — `AGENT_DISPATCH_DEPTH` env var tracks nesting. Default limit: 3.
+- **Recursion protection** — `AGENT_DISPATCH_DEPTH` env var tracks nesting. Default limit: 3. Best-effort across the subprocess boundary (see [SECURITY.md](SECURITY.md)).
+- **Argument-injection guard** — structured CLI fields (`session_id`, `model`, `permission_mode`, tool names) that start with `-` are rejected so they can't smuggle extra `claude` flags.
+- **Path-traversal guard** — caller-supplied `job_id`/`ref` values are validated as 32-char hex before any filesystem access.
+- **Owner-only state** — job files (`0o600`) and `agents.yaml` (`0o600`) are written for the owner only; their directories are `0o700`.
 - **Cost control** — `max_budget_usd` per agent or globally.
-- **Concurrency** — `max_concurrency` (default: 5) limits parallel `claude -p` processes.
+- **Concurrency** — `max_concurrency` (default: 5) caps parallel `claude -p` processes. Note: the sync and async dispatch paths use separate semaphores, so the worst-case total is `2 × max_concurrency`.
 - **Timeout** — per-agent or global (default: 300s). Orphaned processes are cleaned up.
-- **Caching** — identical `(agent, task, context)` requests return cached results. Only successes are cached. Sessions and dialogues are never cached.
+- **Caching** — identical `(agent, task, context, caller, goal, response_format)` requests return cached results, bounded by `cache.max_size` (oldest entry evicted first). Only successes are cached. Sessions and dialogues are never cached.
+See [SECURITY.md](SECURITY.md) for the full threat model (including the `bypassPermissions` escalation risk and on-disk job files).
 ## CLI
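Reviewer note on the job-state paragraph above: the README says each job is one JSON file, written owner-only (`0o600`) in a `0o700` directory, atomically via `os.replace()`, with best-effort `chmod`s. One way to get those properties is sketched below. The function name `write_job_file` and its parameters are hypothetical; the real `JobStore` write path is not shown in this diff and may differ.

```python
import json
import os
import tempfile
from pathlib import Path


def write_job_file(jobs_dir: Path, job_id: str, payload: dict) -> Path:
    """Write payload to <jobs_dir>/<job_id>.json atomically (illustrative only)."""
    jobs_dir.mkdir(parents=True, exist_ok=True)
    try:
        os.chmod(jobs_dir, 0o700)  # best-effort: may be a no-op off POSIX
    except OSError:
        pass

    target = jobs_dir / f"{job_id}.json"
    # The temp file lives in the same directory so os.replace stays on one
    # filesystem, which is what makes the final rename atomic.
    fd, tmp_name = tempfile.mkstemp(dir=jobs_dir, prefix=f".{job_id}.", suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            json.dump(payload, fh)
        try:
            os.chmod(tmp_name, 0o600)  # best-effort owner-only mode
        except OSError:
            pass
        os.replace(tmp_name, target)  # readers see the old file or the new one
    except BaseException:
        # Do not leave a stray temp file behind on failure.
        try:
            os.unlink(tmp_name)
        except FileNotFoundError:
            pass
        raise
    return target
```

Because the rename is the last step, a concurrent reader or `ls` never observes a half-written job file, which is the property the README calls "safe to read while jobs are in flight".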
