PyPI - agent-dispatch - Versions diffs - 0.4.0__tar.gz → 0.6.0__tar.gz - Mend

agent-dispatch 0.4.0 (tar.gz) → 0.6.0 (tar.gz)


Files changed (30)

{agent_dispatch-0.4.0 → agent_dispatch-0.6.0}/.github/workflows/ci.yml RENAMED

@@ -6,6 +6,10 @@ on:
   pull_request:
     branches: [main]
+# Least privilege: CI only needs to read the repo.
+permissions:
+  contents: read
 jobs:
   test:
     runs-on: ubuntu-latest

{agent_dispatch-0.4.0 → agent_dispatch-0.6.0}/.github/workflows/publish.yml RENAMED

@@ -4,12 +4,16 @@ on:
   release:
     types: [published]
+# Default to no privileges; the publish job opts into exactly what it needs.
+permissions: {}
 jobs:
   publish:
     runs-on: ubuntu-latest
     environment: pypi
     permissions:
-      id-token: write
+      id-token: write   # OIDC token for PyPI Trusted Publisher
+      contents: read    # checkout the tagged source
     steps:
       - uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6

{agent_dispatch-0.4.0 → agent_dispatch-0.6.0}/CHANGELOG.md RENAMED

@@ -7,6 +7,124 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 ## [Unreleased]
+## [0.6.0] - 2026-06-04
+Reliability release: timeouts stop being fatal, permission-blocked "successes"
+become visible, async jobs show live progress.
+### Fixed
+- **`dispatch_stream` was broken on current claude CLIs** — they reject
+  `--print --output-format stream-json` without `--verbose` ("requires
+  --verbose"), so every stream dispatch (and CLI `test --stream`) failed
+  immediately. The runner now passes `--verbose`. Caught by live verification
+  against the real CLI before this release; without it the async-worker
+  switch to streaming (below) would have broken all `dispatch_async` jobs.
+### Added
+- **Per-call timeout override.** `dispatch`, `dispatch_session`,
+  `dispatch_stream`, and `dispatch_async` accept `timeout_seconds` (0 = agent
+  default, clamped to 10–7200); `dispatch_parallel` accepts it per item. Use
+  it for known-long tasks instead of editing the agent config. CLI:
+  `agent-dispatch test <name> --timeout N`.
+- **Resumable timeouts.** Fresh dispatches pre-assign a session UUID via
+  `--session-id`, so a timed-out dispatch still returns a `session_id` — the
+  partial transcript survives the kill. The timeout error now spells out the
+  recovery options: resume via `dispatch_session(..., session_id=...)`, retry
+  with `timeout_seconds`, or go async.
+- **Denied-tools visibility.** The claude CLI's `permission_denials` output is
+  parsed into `DispatchResult.denied_tools`. A dispatch that "succeeds" while
+  tools were blocked (the agent answers "I need permission for X") now carries
+  `denied_tools` + a `hint` that the result may be incomplete and how to grant
+  access. On `is_error` results, non-empty denials force
+  `error_type="permission"` even when the error text has no permission
+  keywords. CLI `test` prints the hint as a yellow note.
+- **Async job progress.** Async workers now run with streaming: the job file
+  keeps a rolling tail (last 20 lines, throttled to ~1 write/sec) of assistant
+  text and tool-use events. `dispatch_status` returns it as `progress` while
+  running (kept afterwards as a post-mortem trace); `dispatch_jobs` shows
+  `last_progress` for running jobs. New `JobStore.update_progress` (refuses
+  terminal jobs, so a trailing write can't resurrect a finished job).
+### Changed
+- Timeout error messages are actionable (mention `timeout_seconds`,
+  `dispatch_async`, `agent-dispatch update --timeout`, and the resumable
+  session) instead of just "increase timeout in agents.yaml".
+- Plain-text fallback successes now carry the generated `session_id`; the
+  stream "no result line" fallback does too (a crash mid-stream stays
+  resumable).
+- **Old-CLI self-healing**: if the installed claude CLI predates
+  `--session-id`, dispatch detects the "unknown option" rejection and retries
+  once without the flag (logged warning; timed-out dispatches lose
+  resumability) instead of failing every dispatch.
+- `dispatch_parallel` validates per-item `timeout_seconds` / `summary_chars`
+  numerically **up front** — a bad value rejects the whole call before any
+  dispatch runs, consistent with the structural validation contract.
+- `denied_tools` parsing is bounded (10 entries, 100 chars per name) — the
+  field comes from the dispatched subprocess's output, which is untrusted;
+  unbounded lists could inflate job files and `return_ref` payloads.
+## [0.5.0] - 2026-06-01
+Security-hardening release. A multi-agent audit of the codebase surfaced
+several issues; the confirmed ones are fixed here, plus job cancellation,
+cache bounding, and stale-job recovery.
+### Security
+- **Path traversal in async jobs (fixed).** `dispatch_status`, `dispatch_wait`,
+  and `fetch_result` accept a caller-supplied `job_id`/`ref` that flowed
+  straight into `JobStore`'s file-path construction. A crafted value such as
+  `../../secret` could read any Job-shaped `.json` file outside the jobs
+  directory. Job ids are now validated against `^[0-9a-f]{32}$` at the tool
+  boundary (`_validate_ref`), in `JobStore.get`, and in `JobStore._path`
+  (defense in depth). Malformed ids are rejected without touching the
+  filesystem. New helper `jobs.is_valid_job_id`.
+- **Argument/flag injection via structured CLI fields (fixed).** A
+  `session_id` (caller-controlled in `dispatch_session`) — or a misconfigured
+  `model`, `permission_mode`, or tool name — that started with `-` was placed
+  in the argument position after a flag (e.g. `--resume <session_id>`) and the
+  `claude` CLI parsed it as a *new* flag, allowing options like
+  `--permission-mode bypassPermissions` to be smuggled in. `_build_command`
+  now rejects any such value via `_reject_flaglike` (raising
+  `runner.ArgInjectionError`); `dispatch`/`dispatch_stream` surface it as a
+  clean failed result, never spawning a subprocess.
+- **Tightened file permissions.** Job files are written `0o600` and the jobs
+  directory is created `0o700` (they hold full task/context/result payloads
+  that may contain secrets). `save_config` now writes `agents.yaml` `0o600`
+  and its parent directory `0o700`. All `chmod`s are best-effort (skipped on
+  platforms without POSIX modes).
+### Added
+- `dispatch_cancel(job_id)` MCP tool — cancel a *pending* async job before it
+  starts. Running jobs are left to finish (their subprocess can't be safely
+  interrupted); the tool reports an `outcome` of `cancelled`, `running`,
+  `already_terminal`, or `not_found`. Makes the previously-unreachable
+  `cancelled` job status real. Backed by `JobStore.cancel`, and the
+  cancel/start race is closed by `mark_running` refusing a cancelled job.
+- Cache size bound — `CacheSettings.max_size` (default 1000) caps the
+  in-memory dispatch cache, evicting the oldest entry first (FIFO by insertion
+  time; read access does not refresh, since the timestamp also drives TTL),
+  preventing unbounded memory growth from many unique requests. `cache_stats`
+  now reports `max_size` and `evictions`.
+- Stale-job recovery — on startup the server marks jobs abandoned in
+  `running` (older than 1h, e.g. from a crashed prior run) as `failed` so
+  callers don't poll them forever (`JobStore.recover_stale`).
+### Changed
+- Input bounds hardened across MCP tools: `dispatch_jobs(limit)` clamped to
+  `[1, 1000]`; `dispatch_gc(max_age_days)` rejects non-finite values;
+  `summary_chars` (in `dispatch` and per-item `dispatch_parallel`) clamped to
+  `[0, 100000]`; `dispatch_parallel` rejects more than
+  `max(100, max_concurrency * 20)` items to bound subprocess fan-out.
+- Async job worker now logs lifecycle transitions (running / finished) with
+  the job id for easier production debugging.
+- Type hints filled in (`_ref_payload`, `_run_job`, `_run_one`).
+- Lint surface expanded — ruff now enforces bugbear (`B`), bandit security
+  (`S`), import order (`I`), and pyupgrade (`UP`) in addition to the defaults,
+  with documented ignores for the trusted `claude` subprocess calls.
+- `SECURITY.md` rewritten: accurate supported-versions table and an expanded
+  threat model (bypassPermissions, on-disk job files, env inheritance,
+  best-effort recursion depth, argument-injection mitigation).
 ## [0.4.0] - 2026-05-15
 ### Added
@@ -152,7 +270,9 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 - Dependabot for `pip` + `github-actions`, GitHub Actions pinned to
   commit SHAs for supply-chain integrity.
-[Unreleased]: https://github.com/ginkida/agent-dispatch/compare/v0.4.0...HEAD
+[Unreleased]: https://github.com/ginkida/agent-dispatch/compare/v0.6.0...HEAD
+[0.6.0]: https://github.com/ginkida/agent-dispatch/compare/v0.5.0...v0.6.0
+[0.5.0]: https://github.com/ginkida/agent-dispatch/compare/v0.4.0...v0.5.0
 [0.4.0]: https://github.com/ginkida/agent-dispatch/compare/v0.3.0...v0.4.0
 [0.3.0]: https://github.com/ginkida/agent-dispatch/compare/v0.2.2...v0.3.0
 [0.2.2]: https://github.com/ginkida/agent-dispatch/compare/v0.2.1...v0.2.2
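Reviewer note: the 0.5.0 path-traversal fix in the changelog above comes down to validating a caller-supplied id against `^[0-9a-f]{32}$` before any path is built. The package source for `jobs.is_valid_job_id` and `JobStore._path` is not part of this hunk, so the following is only a minimal sketch of that idea. The function bodies, the `job_path` helper, and the `.json` layout under a `jobs_dir` argument are assumptions for illustration.

```python
import re
from pathlib import Path

# Equivalent to the changelog's ^[0-9a-f]{32}$ when used with fullmatch.
# fullmatch also rejects a trailing newline, which `$` alone would tolerate.
_JOB_ID_RE = re.compile(r"[0-9a-f]{32}")


def is_valid_job_id(job_id: object) -> bool:
    """Return True only for a 32-character lowercase hex string."""
    return isinstance(job_id, str) and _JOB_ID_RE.fullmatch(job_id) is not None


def job_path(jobs_dir: Path, job_id: str) -> Path:
    """Hypothetical helper: build the job file path, refusing bad ids first."""
    if not is_valid_job_id(job_id):
        # Reject before any filesystem call, so "../../secret" never
        # reaches path construction.
        raise ValueError("invalid job id")
    return jobs_dir / f"{job_id}.json"
```

Checking in more than one layer (tool boundary, `get`, and path construction), as the changelog describes, means a future caller that skips the outer check still cannot build a path from a malformed id.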

{agent_dispatch-0.4.0 → agent_dispatch-0.6.0}/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: agent-dispatch
-Version: 0.4.0
+Version: 0.6.0
 Summary: MCP server that lets Claude Code agents delegate tasks to agents in other project directories
 Project-URL: Homepage, https://github.com/ginkida/agent-dispatch
 Project-URL: Repository, https://github.com/ginkida/agent-dispatch
@@ -130,6 +130,7 @@ One-shot task delegation. Results are cached — identical requests within TTL r
 | `response_format` | string | no | `"json"` to request a single JSON value; the parsed result lands in `parsed_result`. Empty = free-form text. |
 | `return_ref` | bool | no | When `true`, returns just a `ref` + summary preview instead of the full result text. Use `fetch_result(ref)` to load the full text on demand. |
 | `summary_chars` | int | no | Max chars of result text to include in the ref response (default 500). |
+| `timeout_seconds` | int | no | One-off timeout override for this call (0 = agent's configured timeout; clamped to 10–7200). No config edit needed for known-long tasks. |
 ```json
 // Response (success)
@@ -155,6 +156,21 @@ One-shot task delegation. Results are cached — identical requests within TTL r
 **`error_type` values:** `permission` (tool/action denied), `timeout`, `recursion` (dispatch depth exceeded), `not_found` (missing directory or CLI), `cli_error` (other failures). Permission errors include an actionable hint.
+**Resumable timeouts:** every fresh dispatch pre-assigns a session UUID (`--session-id`), so a timed-out dispatch still returns a `session_id` — the partial transcript survives the kill. The timeout error spells out the recovery: resume with `dispatch_session(agent, "Continue where you left off", session_id=...)`, retry with a bigger `timeout_seconds`, or use `dispatch_async`.
+**Denied-tools visibility:** in non-interactive mode the claude CLI auto-denies tools the agent isn't allowed to use — the agent then often "succeeds" with an answer like *"I need your permission for one read-only query"*. When that happens the response carries the deterministic signal: `denied_tools` (parsed from the CLI's `permission_denials`) plus a `hint` explaining the result may be incomplete and how to grant access. `success` stays `true` — it's a soft signal, not a failure.
+```json
+// Response (success, but a tool was blocked)
+{
+  "agent": "analysis",
+  "success": true,
+  "result": "Here is the offline mapping. To finish I'd need to run one read-only query...",
+  "denied_tools": ["Bash"],
+  "hint": "1 tool call(s) were denied by permissions: Bash. The result may be incomplete..."
+}
+```
 **Structured JSON output:** pass `response_format="json"` to ask the agent for a single JSON value. The runner appends an instruction footer ("respond with a single valid JSON value, no fences, no prose") and on success parses the response — the parsed value lands in `parsed_result`. The raw text is always in `result`. Parse failures leave `parsed_result=None` but don't fail the dispatch (soft mode).
 ```json
@@ -195,6 +211,9 @@ Multi-turn: continue a conversation with an agent. First call starts a session,
 | `context` | string | no | Extra context |
 | `caller` | string | no | Who is dispatching |
 | `goal` | string | no | Broader objective |
+| `timeout_seconds` | int | no | One-off timeout override (0 = agent default; clamped to 10–7200) |
+`dispatch_session` is also the **timeout recovery path**: a timed-out `dispatch` returns a `session_id` — pass it here with `task="Continue where you left off"` to salvage the partial work instead of restarting.
 ```
 Turn 1: dispatch_session("infra", "List running containers")
@@ -210,7 +229,7 @@ Run multiple tasks concurrently. Much faster than sequential `dispatch` calls.
 | Parameter | Type | Required | Description |
 |-----------|------|----------|-------------|
-| `dispatches` | string (JSON) | yes | JSON array of `{"agent", "task", "context?", "caller?", "goal?"}` |
+| `dispatches` | string (JSON) | yes | JSON array of `{"agent", "task", "context?", "caller?", "goal?", "response_format?", "return_ref?", "summary_chars?", "timeout_seconds?"}` |
 | `aggregate` | string | no | Agent name to synthesize all results into one answer |
 **Important:** `dispatches` is a JSON string, not a list.
@@ -250,7 +269,7 @@ Run multiple tasks concurrently. Much faster than sequential `dispatch` calls.
 Same as `dispatch` but shows live progress while the agent works. Use for long-running tasks. Not cached.
-Parameters are identical to `dispatch`.
+Parameters are the same as `dispatch` except `return_ref`/`summary_chars` (streaming is incompatible with ref-mode).
 ### `dispatch_dialogue`
@@ -339,18 +358,20 @@ fetch_result(ref="8f3a...e1", max_chars=2000)  -> truncated, plus {"truncated":
 Refs reuse the same storage as `dispatch_async` jobs (under `~/.config/agent-dispatch/jobs/`), so any `job_id` returned by `dispatch_async` is also a valid `ref` for `fetch_result`. `parsed_result` (when `response_format="json"` is set) is small and is always inlined directly in the ref response — no second fetch needed.
-### Async dispatch — `dispatch_async`, `dispatch_status`, `dispatch_wait`, `dispatch_jobs`, `dispatch_gc`
+### Async dispatch — `dispatch_async`, `dispatch_status`, `dispatch_wait`, `dispatch_cancel`, `dispatch_jobs`, `dispatch_gc`
 When a dispatched task is going to take a while, you don't want to block your own tool slot for minutes. Async dispatch returns a `job_id` immediately and lets you check back when you're ready.
 ```
-// 1. fire and forget
+// 1. fire and forget (timeout_seconds= works here too for known-long tasks)
 dispatch_async(agent="infra", task="audit every container log for OOM kills today")
   -> {"job_id": "8f3a...e1", "status": "pending", "agent": "infra"}
 // 2. do other work, then check progress (non-blocking)
+//    `progress` is a rolling tail of what the agent is doing right now
 dispatch_status(job_id="8f3a...e1")
-  -> {"id": "8f3a...e1", "status": "running", "started_at": 1730000123.4, ...}
+  -> {"id": "8f3a...e1", "status": "running", "started_at": 1730000123.4,
+      "progress": ["Using tool: Bash", "Scanning container logs for OOM events..."], ...}
 // 3. or block until done (with a timeout cap)
 dispatch_wait(job_id="8f3a...e1", timeout_seconds=120)
@@ -360,9 +381,13 @@ dispatch_wait(job_id="8f3a...e1", timeout_seconds=120)
   -> {"id": "...", "status": "running", "timed_out_waiting": true}
 ```
+`dispatch_cancel(job_id)` cancels a job that is still **pending** (before its subprocess starts) — a running job is left to finish, since its `claude` subprocess can't be safely interrupted. The response carries an `outcome` of `cancelled`, `running`, `already_terminal`, or `not_found`.
+Async workers run with streaming under the hood: the job file keeps a rolling tail (last 20 lines, ~1 write/sec) of assistant text and tool-use events. `dispatch_status` shows it as `progress` while the job runs and keeps it afterwards as a post-mortem trace; `dispatch_jobs` shows `last_progress` for running jobs.
 `dispatch_jobs(status?)` lists recent jobs as summaries (filter by `pending` / `running` / `done` / `failed` / `cancelled`). `dispatch_gc(max_age_days=7)` purges terminal jobs older than the threshold — pending and running jobs are never deleted.
-Job state persists to disk at `~/.config/agent-dispatch/jobs/` (override with `AGENT_DISPATCH_JOBS_DIR`). One JSON file per job, atomic writes — safe to read or `ls` while jobs are in flight.
+Job state persists to disk at `~/.config/agent-dispatch/jobs/` (override with `AGENT_DISPATCH_JOBS_DIR`). One JSON file per job, written owner-only (`0o600`) with atomic writes — safe to read or `ls` while jobs are in flight. Caller-supplied `job_id`s are validated as 32-char hex before any file access (no path traversal). On startup the server marks jobs abandoned in `running` by a prior crashed instance as `failed`.
 | When to use async | When to use `dispatch` |
 |-------------------|------------------------|
@@ -390,6 +415,8 @@ All tools return errors as:
 | Need a combined summary from multiple agents | `dispatch_parallel` with `aggregate` |
 | Long task — don't block your tool slot | `dispatch_async` + `dispatch_wait` |
 | Check progress without blocking | `dispatch_status` |
+| Known-long task, one-off | any dispatch tool with `timeout_seconds=...` |
+| A dispatch timed out | `dispatch_session` with the `session_id` from the error |
 ## Configuration
@@ -418,10 +445,11 @@ settings:
   #   - Read
   #   - Edit
   max_dispatch_depth: 3     # recursion protection
-  max_concurrency: 5        # max parallel claude -p processes
+  max_concurrency: 5        # max parallel claude -p processes (per dispatch path)
   cache:
     enabled: true
     ttl: 300                # seconds
+    max_size: 1000          # max cached entries; oldest evicted first (FIFO)
 ```
 Config is reloaded on every tool call — add agents without restarting.
@@ -459,11 +487,16 @@ agent-dispatch MCP server
 ## Safety
-- **Recursion protection** — `AGENT_DISPATCH_DEPTH` env var tracks nesting. Default limit: 3.
+- **Recursion protection** — `AGENT_DISPATCH_DEPTH` env var tracks nesting. Default limit: 3. Best-effort across the subprocess boundary (see [SECURITY.md](SECURITY.md)).
+- **Argument-injection guard** — structured CLI fields (`session_id`, `model`, `permission_mode`, tool names) that start with `-` are rejected so they can't smuggle extra `claude` flags.
+- **Path-traversal guard** — caller-supplied `job_id`/`ref` values are validated as 32-char hex before any filesystem access.
+- **Owner-only state** — job files (`0o600`) and `agents.yaml` (`0o600`) are written for the owner only; their directories are `0o700`.
 - **Cost control** — `max_budget_usd` per agent or globally.
-- **Concurrency** — `max_concurrency` (default: 5) limits parallel `claude -p` processes.
+- **Concurrency** — `max_concurrency` (default: 5) caps parallel `claude -p` processes. Note: the sync and async dispatch paths use separate semaphores, so the worst-case total is `2 × max_concurrency`.
 - **Timeout** — per-agent or global (default: 300s). Orphaned processes are cleaned up.
-- **Caching** — identical `(agent, task, context)` requests return cached results. Only successes are cached. Sessions and dialogues are never cached.
+- **Caching** — identical `(agent, task, context, caller, goal, response_format)` requests return cached results, bounded by `cache.max_size` (oldest entry evicted first). Only successes are cached. Sessions and dialogues are never cached.
+See [SECURITY.md](SECURITY.md) for the full threat model (including the `bypassPermissions` escalation risk and on-disk job files).
 ## CLI
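Reviewer note: the caching bullet in the Safety hunk above describes a cache bounded by `cache.max_size`, evicting the oldest entry first, with reads that do not refresh an entry because the same timestamp drives TTL. The cache implementation itself is not shown in this diff, so the sketch below only illustrates that eviction policy. The `BoundedCache` class, its method names, and the injectable `now` argument are hypothetical.

```python
import time
from collections import OrderedDict
from typing import Optional


class BoundedCache:
    """FIFO-by-insertion cache with a TTL; reads do not refresh entries."""

    def __init__(self, max_size: int = 1000, ttl: float = 300.0) -> None:
        self.max_size = max_size
        self.ttl = ttl
        self.evictions = 0
        self._entries = OrderedDict()  # key -> (stored_at, value)

    def get(self, key: tuple, now: Optional[float] = None) -> Optional[str]:
        now = time.monotonic() if now is None else now
        entry = self._entries.get(key)
        if entry is None:
            return None
        stored_at, value = entry
        if now - stored_at > self.ttl:
            del self._entries[key]  # expired: drop it and report a miss
            return None
        # No move_to_end here: a hit must not extend the entry's lifetime.
        return value

    def put(self, key: tuple, value: str, now: Optional[float] = None) -> None:
        now = time.monotonic() if now is None else now
        self._entries.pop(key, None)  # re-insert so this key becomes newest
        self._entries[key] = (now, value)
        while len(self._entries) > self.max_size:
            self._entries.popitem(last=False)  # evict the oldest insertion
            self.evictions += 1
```

With FIFO rather than LRU, a frequently read entry is still evicted once enough newer keys arrive, which matches the "read access does not refresh" wording in the changelog.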

{agent_dispatch-0.4.0 → agent_dispatch-0.6.0}/README.md RENAMED

@@ -100,6 +100,7 @@ One-shot task delegation. Results are cached — identical requests within TTL r
 | `response_format` | string | no | `"json"` to request a single JSON value; the parsed result lands in `parsed_result`. Empty = free-form text. |
 | `return_ref` | bool | no | When `true`, returns just a `ref` + summary preview instead of the full result text. Use `fetch_result(ref)` to load the full text on demand. |
 | `summary_chars` | int | no | Max chars of result text to include in the ref response (default 500). |
+| `timeout_seconds` | int | no | One-off timeout override for this call (0 = agent's configured timeout; clamped to 10–7200). No config edit needed for known-long tasks. |
 ```json
 // Response (success)
@@ -125,6 +126,21 @@ One-shot task delegation. Results are cached — identical requests within TTL r
 **`error_type` values:** `permission` (tool/action denied), `timeout`, `recursion` (dispatch depth exceeded), `not_found` (missing directory or CLI), `cli_error` (other failures). Permission errors include an actionable hint.
+**Resumable timeouts:** every fresh dispatch pre-assigns a session UUID (`--session-id`), so a timed-out dispatch still returns a `session_id` — the partial transcript survives the kill. The timeout error spells out the recovery: resume with `dispatch_session(agent, "Continue where you left off", session_id=...)`, retry with a bigger `timeout_seconds`, or use `dispatch_async`.
+**Denied-tools visibility:** in non-interactive mode the claude CLI auto-denies tools the agent isn't allowed to use — the agent then often "succeeds" with an answer like *"I need your permission for one read-only query"*. When that happens the response carries the deterministic signal: `denied_tools` (parsed from the CLI's `permission_denials`) plus a `hint` explaining the result may be incomplete and how to grant access. `success` stays `true` — it's a soft signal, not a failure.
+```json
+// Response (success, but a tool was blocked)
+{
+  "agent": "analysis",
+  "success": true,
+  "result": "Here is the offline mapping. To finish I'd need to run one read-only query...",
+  "denied_tools": ["Bash"],
+  "hint": "1 tool call(s) were denied by permissions: Bash. The result may be incomplete..."
+}
+```
 **Structured JSON output:** pass `response_format="json"` to ask the agent for a single JSON value. The runner appends an instruction footer ("respond with a single valid JSON value, no fences, no prose") and on success parses the response — the parsed value lands in `parsed_result`. The raw text is always in `result`. Parse failures leave `parsed_result=None` but don't fail the dispatch (soft mode).
 ```json
@@ -165,6 +181,9 @@ Multi-turn: continue a conversation with an agent. First call starts a session,
 | `context` | string | no | Extra context |
 | `caller` | string | no | Who is dispatching |
 | `goal` | string | no | Broader objective |
+| `timeout_seconds` | int | no | One-off timeout override (0 = agent default; clamped to 10–7200) |
+`dispatch_session` is also the **timeout recovery path**: a timed-out `dispatch` returns a `session_id` — pass it here with `task="Continue where you left off"` to salvage the partial work instead of restarting.
 ```
 Turn 1: dispatch_session("infra", "List running containers")
@@ -180,7 +199,7 @@ Run multiple tasks concurrently. Much faster than sequential `dispatch` calls.
 | Parameter | Type | Required | Description |
 |-----------|------|----------|-------------|
-| `dispatches` | string (JSON) | yes | JSON array of `{"agent", "task", "context?", "caller?", "goal?"}` |
+| `dispatches` | string (JSON) | yes | JSON array of `{"agent", "task", "context?", "caller?", "goal?", "response_format?", "return_ref?", "summary_chars?", "timeout_seconds?"}` |
 | `aggregate` | string | no | Agent name to synthesize all results into one answer |
 **Important:** `dispatches` is a JSON string, not a list.
@@ -220,7 +239,7 @@ Run multiple tasks concurrently. Much faster than sequential `dispatch` calls.
 Same as `dispatch` but shows live progress while the agent works. Use for long-running tasks. Not cached.
-Parameters are identical to `dispatch`.
+Parameters are the same as `dispatch` except `return_ref`/`summary_chars` (streaming is incompatible with ref-mode).
 ### `dispatch_dialogue`
@@ -309,18 +328,20 @@ fetch_result(ref="8f3a...e1", max_chars=2000)  -> truncated, plus {"truncated":
 Refs reuse the same storage as `dispatch_async` jobs (under `~/.config/agent-dispatch/jobs/`), so any `job_id` returned by `dispatch_async` is also a valid `ref` for `fetch_result`. `parsed_result` (when `response_format="json"` is set) is small and is always inlined directly in the ref response — no second fetch needed.
-### Async dispatch — `dispatch_async`, `dispatch_status`, `dispatch_wait`, `dispatch_jobs`, `dispatch_gc`
+### Async dispatch — `dispatch_async`, `dispatch_status`, `dispatch_wait`, `dispatch_cancel`, `dispatch_jobs`, `dispatch_gc`
 When a dispatched task is going to take a while, you don't want to block your own tool slot for minutes. Async dispatch returns a `job_id` immediately and lets you check back when you're ready.
 ```
-// 1. fire and forget
+// 1. fire and forget (timeout_seconds= works here too for known-long tasks)
 dispatch_async(agent="infra", task="audit every container log for OOM kills today")
   -> {"job_id": "8f3a...e1", "status": "pending", "agent": "infra"}
 // 2. do other work, then check progress (non-blocking)
+//    `progress` is a rolling tail of what the agent is doing right now
 dispatch_status(job_id="8f3a...e1")
-  -> {"id": "8f3a...e1", "status": "running", "started_at": 1730000123.4, ...}
+  -> {"id": "8f3a...e1", "status": "running", "started_at": 1730000123.4,
+      "progress": ["Using tool: Bash", "Scanning container logs for OOM events..."], ...}
 // 3. or block until done (with a timeout cap)
 dispatch_wait(job_id="8f3a...e1", timeout_seconds=120)
@@ -330,9 +351,13 @@ dispatch_wait(job_id="8f3a...e1", timeout_seconds=120)
   -> {"id": "...", "status": "running", "timed_out_waiting": true}
 ```
+`dispatch_cancel(job_id)` cancels a job that is still **pending** (before its subprocess starts) — a running job is left to finish, since its `claude` subprocess can't be safely interrupted. The response carries an `outcome` of `cancelled`, `running`, `already_terminal`, or `not_found`.
+Async workers run with streaming under the hood: the job file keeps a rolling tail (last 20 lines, ~1 write/sec) of assistant text and tool-use events. `dispatch_status` shows it as `progress` while the job runs and keeps it afterwards as a post-mortem trace; `dispatch_jobs` shows `last_progress` for running jobs.
 `dispatch_jobs(status?)` lists recent jobs as summaries (filter by `pending` / `running` / `done` / `failed` / `cancelled`). `dispatch_gc(max_age_days=7)` purges terminal jobs older than the threshold — pending and running jobs are never deleted.
-Job state persists to disk at `~/.config/agent-dispatch/jobs/` (override with `AGENT_DISPATCH_JOBS_DIR`). One JSON file per job, atomic writes — safe to read or `ls` while jobs are in flight.
+Job state persists to disk at `~/.config/agent-dispatch/jobs/` (override with `AGENT_DISPATCH_JOBS_DIR`). One JSON file per job, written owner-only (`0o600`) with atomic writes — safe to read or `ls` while jobs are in flight. Caller-supplied `job_id`s are validated as 32-char hex before any file access (no path traversal). On startup the server marks jobs abandoned in `running` by a prior crashed instance as `failed`.
 | When to use async | When to use `dispatch` |
 |-------------------|------------------------|
@@ -360,6 +385,8 @@ All tools return errors as:
 | Need a combined summary from multiple agents | `dispatch_parallel` with `aggregate` |
 | Long task — don't block your tool slot | `dispatch_async` + `dispatch_wait` |
 | Check progress without blocking | `dispatch_status` |
+| Known-long task, one-off | any dispatch tool with `timeout_seconds=...` |
+| A dispatch timed out | `dispatch_session` with the `session_id` from the error |
 ## Configuration
@@ -388,10 +415,11 @@ settings:
   #   - Read
   #   - Edit
   max_dispatch_depth: 3     # recursion protection
-  max_concurrency: 5        # max parallel claude -p processes
+  max_concurrency: 5        # max parallel claude -p processes (per dispatch path)
   cache:
     enabled: true
     ttl: 300                # seconds
+    max_size: 1000          # max cached entries; oldest evicted first (FIFO)
 ```
 Config is reloaded on every tool call — add agents without restarting.
@@ -429,11 +457,16 @@ agent-dispatch MCP server
 ## Safety
-- **Recursion protection** — `AGENT_DISPATCH_DEPTH` env var tracks nesting. Default limit: 3.
+- **Recursion protection** — `AGENT_DISPATCH_DEPTH` env var tracks nesting. Default limit: 3. Best-effort across the subprocess boundary (see [SECURITY.md](SECURITY.md)).
+- **Argument-injection guard** — structured CLI fields (`session_id`, `model`, `permission_mode`, tool names) that start with `-` are rejected so they can't smuggle extra `claude` flags.
+- **Path-traversal guard** — caller-supplied `job_id`/`ref` values are validated as 32-char hex before any filesystem access.
+- **Owner-only state** — job files (`0o600`) and `agents.yaml` (`0o600`) are written for the owner only; their directories are `0o700`.
 - **Cost control** — `max_budget_usd` per agent or globally.
-- **Concurrency** — `max_concurrency` (default: 5) limits parallel `claude -p` processes.
+- **Concurrency** — `max_concurrency` (default: 5) caps parallel `claude -p` processes. Note: the sync and async dispatch paths use separate semaphores, so the worst-case total is `2 × max_concurrency`.
 - **Timeout** — per-agent or global (default: 300s). Orphaned processes are cleaned up.
-- **Caching** — identical `(agent, task, context)` requests return cached results. Only successes are cached. Sessions and dialogues are never cached.
+- **Caching** — identical `(agent, task, context, caller, goal, response_format)` requests return cached results, bounded by `cache.max_size` (oldest entry evicted first). Only successes are cached. Sessions and dialogues are never cached.
+See [SECURITY.md](SECURITY.md) for the full threat model (including the `bypassPermissions` escalation risk and on-disk job files).
 ## CLI
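Reviewer note: the argument-injection guard listed in the Safety hunk above rejects structured values that start with `-`, so they cannot be parsed as extra `claude` flags. The real `_build_command` and `_reject_flaglike` are not included in this diff. The sketch below is an assumed shape of that check, and `build_resume_args` is a hypothetical helper that uses only the flags named in the diff (`-p`, `--resume`, `--model`).

```python
class ArgInjectionError(ValueError):
    """Raised when a structured value would be parsed as a CLI flag."""


def reject_flaglike(field: str, value: str) -> str:
    """Return value unchanged, or raise if it looks like a flag."""
    if value.startswith("-"):
        raise ArgInjectionError(f"{field} must not start with '-': {value!r}")
    return value


def build_resume_args(session_id: str, model: str = "") -> list:
    """Hypothetical command builder; every flag argument is checked first."""
    args = ["claude", "-p"]
    args += ["--resume", reject_flaglike("session_id", session_id)]
    if model:
        args += ["--model", reject_flaglike("model", model)]
    return args
```

Passing arguments as a list already rules out shell injection. This check covers the separate case where the CLI's own option parser would treat a value such as `--permission-mode` as a new flag.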

agent_dispatch-0.6.0/SECURITY.md ADDED

@@ -0,0 +1,77 @@
+# Security Policy
+## Reporting a Vulnerability
+If you discover a security vulnerability, please report it via [GitHub Security Advisories](https://github.com/ginkida/agent-dispatch/security/advisories/new).
+**Do not** open a public issue for security vulnerabilities.
+## Supported Versions
+| Version | Supported |
+|---------|-----------|
+| 0.5.x   | Yes       |
+| 0.4.x   | Yes       |
+| ≤ 0.3.x | No        |
+## Threat Model
+`agent-dispatch` runs `claude -p` subprocesses in configured directories on
+behalf of a calling Claude Code agent. The MCP caller and the agent
+configurations are part of the same trust domain as the user running the
+server — this is a developer tool, not a multi-tenant service. With that in
+mind, the security-relevant areas are:
+### Subprocess execution
+- Tasks/context strings are passed as **argument-list** elements to
+  `subprocess.run`/`Popen` (never `shell=True`), so there is no shell
+  injection.
+- **Argument injection is guarded.** Structured fields placed next to a CLI
+  flag (`session_id` → `--resume`, `model` → `--model`, `permission_mode`,
+  and tool names) are rejected if they start with `-`, which the `claude`
+  CLI would otherwise parse as a *separate* flag. See
+  `runner._reject_flaglike` / `runner.ArgInjectionError`.
+### Permission escalation (`bypassPermissions`)
+- Setting `permission_mode: bypassPermissions` (or a permissive
+  `default_permission_mode`) disables Claude Code's permission prompts for
+  that agent — it can use any tool without confirmation. Only enable it for
+  agents whose project directories you trust. Prefer `allowed_tools` /
+  `disallowed_tools` for least privilege.
+- A dispatched agent running with broad permissions can, in principle, start
+  its own `claude`/dispatch chain. Recursion depth (`AGENT_DISPATCH_DEPTH`,
+  bounded by `max_dispatch_depth`) is **best-effort**: it crosses the process
+  boundary via an environment variable, so a deliberately hostile agent that
+  clears its environment can reset the counter. It protects against accidental
+  A→B→A loops, not against an adversarial agent.
+### On-disk state
+- Async/`return_ref` job records persist to
+  `~/.config/agent-dispatch/jobs/<job_id>.json` (override with
+  `AGENT_DISPATCH_JOBS_DIR`). They contain the full task, context, and result,
+  which may include sensitive output. Files are written `0o600` and the
+  directory `0o700` (owner-only). Call `dispatch_gc()` periodically to purge
+  old results.
+- `agents.yaml` is written `0o600`. It records project paths and permission
+  settings.
+- `job_id`s are unauthenticated 32-char hex UUIDs — anyone who can call the
+  MCP tools and knows a `job_id` can read its result. Don't relay `job_id`s
+  over untrusted channels. Caller-supplied `job_id`/`ref` values are validated
+  (`^[0-9a-f]{32}$`) before any filesystem access, blocking path traversal.
+### Environment & directories
+- The dispatched subprocess inherits the **full parent environment**
+  (`os.environ.copy()`) — necessary for `claude` to find its credentials.
+  Keep secrets you don't want dispatched agents to see out of the shell that
+  launches the server.
+- Agent directories are resolved to absolute paths via `Path.resolve()` and
+  must exist at registration time.
+### Cost
+- `max_budget_usd` (per agent or as a default) caps spend per dispatch.
+## Reproducibility & CI
+Third-party GitHub Actions are pinned to commit SHAs; workflows run with
+least-privilege `permissions`. Releases publish to PyPI via OIDC Trusted
+Publishing (no long-lived tokens).
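
The two input guards described in SECURITY.md above (rejecting flag-like values before they sit next to a CLI flag, and validating `job_id` before any filesystem access) can be sketched as follows. This is a minimal illustration, not the package's code: the diff names `runner._reject_flaglike` and `runner.ArgInjectionError` but does not show their bodies, so the signatures, `is_valid_job_id`, and `build_resume_args` here are assumptions.

```python
import re


class ArgInjectionError(ValueError):
    """Raised when a structured field could be parsed as a CLI flag.

    Hypothetical stand-in for runner.ArgInjectionError; the real class
    may derive from a different base.
    """


def reject_flaglike(field: str, value: str) -> str:
    """Return value unchanged, or raise if it starts with '-'.

    A value such as "--some-flag" placed after "--model" would be parsed
    by the CLI as a separate flag rather than as the model name.
    """
    if value.startswith("-"):
        raise ArgInjectionError(f"{field} must not start with '-': {value!r}")
    return value


# Pattern quoted in SECURITY.md. fullmatch is used instead of match so that
# a trailing newline, which '$' alone would tolerate, is also rejected.
JOB_ID_RE = re.compile(r"^[0-9a-f]{32}$")


def is_valid_job_id(job_id: str) -> bool:
    """True only for 32 lowercase hex characters (e.g. uuid.uuid4().hex)."""
    return JOB_ID_RE.fullmatch(job_id) is not None


def build_resume_args(session_id: str, model: str) -> list[str]:
    """Illustrative argument list; the real runner builds a longer command."""
    return [
        "claude", "-p",
        "--resume", reject_flaglike("session_id", session_id),
        "--model", reject_flaglike("model", model),
    ]
```

Because the job file path is derived from `job_id`, checking the pattern first means strings like `../../etc/passwd` never reach a path join. The argument-list form keeps the shell out of the picture, and the prefix check covers the remaining case where the CLI's own parser would misread a value.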

{agent_dispatch-0.4.0 → agent_dispatch-0.6.0}/agents.example.yaml RENAMED

@@ -45,3 +45,4 @@ settings:
   cache:
     enabled: true
     ttl: 300                       # seconds; identical (agent, task, context) requests are cached
+    max_size: 1000                 # max cached entries; oldest is evicted first (FIFO)

{agent_dispatch-0.4.0 → agent_dispatch-0.6.0}/pyproject.toml RENAMED

@@ -1,6 +1,6 @@
 [project]
 name = "agent-dispatch"
-version = "0.4.0"
+version = "0.6.0"
 description = "MCP server that lets Claude Code agents delegate tasks to agents in other project directories"
 readme = "README.md"
 license = "MIT"
@@ -47,6 +47,28 @@ asyncio_mode = "auto"
 target-version = "py310"
 line-length = 100
+[tool.ruff.lint]
+select = [
+    "E", "W",  # pycodestyle
+    "F",       # pyflakes
+    "B",       # flake8-bugbear (likely bugs)
+    "I",       # isort (import order)
+    "UP",      # pyupgrade (modern syntax)
+    "S",       # flake8-bandit (security)
+]
+ignore = [
+    # The dispatch family shells out to the trusted `claude` CLI with argument
+    # lists (never shell=True); see runner._build_command and the arg-injection
+    # guard (_reject_flaglike). Partial path is intentional — `claude` is
+    # resolved from PATH.
+    "S603",  # subprocess call with possibly-untrusted input
+    "S607",  # starting a process with a partial executable path
+]
+[tool.ruff.lint.per-file-ignores]
+# Tests legitimately assert and use throwaway /tmp paths.
+"tests/**" = ["S101", "S108"]
 [project.optional-dependencies]
 dev = [
     "pytest>=8.0",

{agent_dispatch-0.4.0 → agent_dispatch-0.6.0}/src/agent_dispatch/__init__.py RENAMED

@@ -1,3 +1,3 @@
 """agent-dispatch: Delegate tasks between Claude Code agents across projects."""
-__version__ = "0.4.0"
+__version__ = "0.6.0"

{agent_dispatch-0.4.0 → agent_dispatch-0.6.0}/src/agent_dispatch/cache.py RENAMED

@@ -23,12 +23,14 @@ class DispatchCache:
     requests with different framing would collide and return the wrong response.
     """
-    def __init__(self, ttl: int = 300) -> None:
+    def __init__(self, ttl: int = 300, max_size: int = 1000) -> None:
         self._ttl = ttl
+        self._max_size = max_size
         self._store: dict[str, tuple[float, DispatchResult]] = {}
         self._lock = threading.Lock()
         self._hits = 0
         self._misses = 0
+        self._evictions = 0
     @staticmethod
     def _make_key(
@@ -89,6 +91,15 @@ class DispatchCache:
             return  # don't cache failures
         key = self._make_key(agent, task, context, caller, goal, response_format)
         with self._lock:
+            # Bound memory: when at capacity and inserting a new key, evict the
+            # oldest entry by insertion time (FIFO). We intentionally do NOT
+            # refresh timestamps on read — the timestamp also drives TTL expiry,
+            # so touching it on access would turn TTL into idle-time. Refreshing
+            # an existing key never triggers eviction.
+            if key not in self._store and len(self._store) >= self._max_size:
+                oldest = min(self._store, key=lambda k: self._store[k][0])
+                del self._store[oldest]
+                self._evictions += 1
             self._store[key] = (time.monotonic(), result)
     def clear(self) -> int:
@@ -97,6 +108,7 @@ class DispatchCache:
             self._store.clear()
             self._hits = 0
             self._misses = 0
+            self._evictions = 0
             return count
     def evict_expired(self) -> int:
@@ -112,8 +124,10 @@ class DispatchCache:
             total = self._hits + self._misses
             return {
                 "size": len(self._store),
+                "max_size": self._max_size,
                 "hits": self._hits,
                 "misses": self._misses,
+                "evictions": self._evictions,
                 "hit_rate": round(self._hits / total, 3) if total else 0.0,
                 "ttl": self._ttl,
             }
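
The eviction policy added in the cache.py hunks above (FIFO by insertion time, with the same timestamp driving TTL expiry) can be exercised in isolation with a small sketch. `BoundedTTLCache`, its `get` method, and the injectable `clock` parameter are assumptions made for illustration; only the eviction branch in `put` mirrors the diff, and the extra empty-store guard is an addition so that `max_size=0` does not call `min()` on an empty dict.

```python
import threading
import time
from typing import Any, Callable, Optional


class BoundedTTLCache:
    """Size-bounded cache: FIFO eviction plus TTL expiry on one timestamp."""

    def __init__(
        self,
        ttl: float = 300,
        max_size: int = 1000,
        clock: Callable[[], float] = time.monotonic,  # injectable for tests
    ) -> None:
        self._ttl = ttl
        self._max_size = max_size
        self._clock = clock
        self._store: dict[str, tuple[float, Any]] = {}
        self._lock = threading.Lock()
        self.evictions = 0

    def get(self, key: str) -> Optional[Any]:
        """Return the cached value, or None if absent or older than ttl.

        Reads never touch the stored timestamp, so TTL stays an absolute
        age limit rather than an idle timeout.
        """
        with self._lock:
            entry = self._store.get(key)
            if entry is None:
                return None
            stored_at, value = entry
            if self._clock() - stored_at > self._ttl:
                del self._store[key]
                return None
            return value

    def put(self, key: str, value: Any) -> None:
        """Insert or refresh a key, evicting the oldest entry when full."""
        with self._lock:
            is_new = key not in self._store
            if is_new and self._store and len(self._store) >= self._max_size:
                # Oldest by insertion timestamp; O(n) scan, as in the diff.
                oldest = min(self._store, key=lambda k: self._store[k][0])
                del self._store[oldest]
                self.evictions += 1
            self._store[key] = (self._clock(), value)
```

One consequence of this design is that a frequently read entry is still the first to go once it is the oldest insert, which is the trade-off the diff's comment accepts in exchange for keeping TTL semantics simple. The linear `min()` scan is fine at the default `max_size` of 1000; a larger bound would favor an ordered structure.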
