@jaggerxtrm/specialists 3.15.2 → 3.15.4

package/README.md CHANGED
@@ -1,162 +1,256 @@
  # Specialists
 
- **One MCP server. Many specialists. Bead-first orchestration.**
+ **One MCP server. Many specialist agents. Bead-first orchestration.**
 
  [![npm version](https://img.shields.io/npm/v/@jaggerxtrm/specialists.svg)](https://www.npmjs.com/package/@jaggerxtrm/specialists)
- [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
+ [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](LICENSE)
  [![TypeScript](https://img.shields.io/badge/TypeScript-5.0-blue.svg)](https://www.typescriptlang.org/)
 
- **Specialists is a universal framework for defining and running specialist agents.** You can invoke the same specialist from the terminal, through MCP inside coding agents, inside autonomous multi-agent runtimes, or from scripts and CI/CD pipelines. Each run can explicitly define the model, tool access, system prompt, task input, permission level, timeout, output format, tracking behavior, memory sources, and dependency context.
+ Specialists is a project-scoped runtime for running focused AI agents from the CLI, MCP, scripts, CI, or HTTP sidecars. A specialist definition declares its model, tools, permission tier, prompt, skills, output contract, timeout/stall policy, worktree behavior, and tracking behavior. The orchestrator keeps task identity in a **bead**; specialists run as fresh scoped sessions that report evidence, changes, and results back to that bead.
 
- Specialists is built on top of the **[pi coding agent](https://github.com/Jaggerxtrm/pi-coding-agent)** as its base execution technology. That gives Specialists access to a broad provider surface across many OAuth and API-backed models, a richer lifecycle event stream for tracking session progress and tool execution, and a usable RPC protocol for orchestrating specialist runs as a stable subprocess boundary.
+ Specialists sits in the xt/xtrm stack:
 
- Specialists is intended to run inside the **xt/xtrm architecture** provided by **[xtrm-tools](https://github.com/Jaggerxtrm/xtrm-tools)**. xtrm-tools provides the worktree isolation, execution boundaries, session model, and surrounding workflow environment that Specialists expects. Specialists handles specialist execution; xtrm-tools owns the broader operator workflow and beads enforcement hooks. For tracking and coordination Specialists uses **beads** by **Steven Yegge** as the issue, dependency, and communication layer. I built a similar issue system for Mercury AACS and Terminal back in November, but Beads is already widely used and actively maintained, so xt/Specialists is built around Beads instead of carrying a separate workflow stack. When a specialist run originates from a bead, its output is written back to that same bead, so the task spec, dependency context, coordination state, and result stay inside one tight, controlled loop.
+ - **[pi coding agent](https://github.com/Jaggerxtrm/pi-coding-agent)** supplies the model/provider execution layer, JSONL/RPC subprocess boundary, tool events, and extension hooks.
+ - **[xtrm-tools](https://github.com/Jaggerxtrm/xtrm-tools)** supplies the surrounding operator workflow: worktree sessions, `.xtrm/` skills/hooks, session reports, update tooling, and workflow enforcement.
+ - **[beads](https://github.com/steveyegge/beads)** supplies issue IDs, dependency edges, claims, and durable task/result notes.
 
- A specialist is a reusable execution spec: model, allowed tools, skills, system prompt, task prompt, timeout, permission level, output format, and background-job behavior. It can run from a plain prompt, from a system+task prompt pair, or directly from an **issue/bead ID as the task source**. Dependency chains can be injected as context, centralized memory can be reused across runs, and jobs can execute in the foreground or as background processes with status, events, and results exposed through the CLI and MCP surfaces.
+ When a run starts from `--bead <id>`, the bead is the task prompt. Dependency context and relevant memory can be injected, the specialist output is appended back to the same bead, and edit-capable specialists work in isolated branches/worktrees that can be reviewed and merged through `sp merge` or `sp epic merge`.
 
  ---
 
  ## Vision
 
- Specialists turns one overloaded agent chat into a coordinated agent mind: a central orchestrator keeps task identity and evidence, while fresh specialist sessions act as scoped capabilities with their own prompts, rules, memory, and contracts. See [specialists.scheme.md](specialists.scheme.md) for diagrams comparing the single-chat model with specialist pipelines, herd memory, adaptive chains, and service specialists.
+ Specialists turns one overloaded agent chat into a coordinated agent mind: a central orchestrator keeps task identity, evidence, and publication control, while fresh specialist sessions act as scoped capabilities with their own prompts, rules, tools, memory, and output contracts.
+
+ The problem it solves is not just token count. Long single-agent sessions accumulate old hypotheses, partial plans, tool residue, self-review bias, and forgotten constraints. Specialists uses **contract-bound cognition** instead: write the task contract once, dispatch the right expert role with only the relevant context, require a structured handoff, and let the orchestrator decide the next step.
+
+ See [specialists.scheme.md](specialists.scheme.md) for the full diagrams and rationale. The core shape is:
+
+ ```mermaid
+ flowchart TD
+ U[User / project need] --> O[Orchestrator]
+ O --> B[Bead issue contract]
+ B --> C{Contract ready?}
+ C -->|no| R[Repair scope, success, constraints]
+ R --> B
+ C -->|yes| D[Dispatch specialist]
+
+ D --> E[Explorer\nfresh read-only context]
+ D --> G[Debugger\nfresh root-cause context]
+ D --> X[Executor\nfresh implementation context]
+ D --> T[Test-runner\nfresh validation context]
+ D --> S[Code-sanity / Security\nadvisory context]
+ D --> V[Reviewer\ncontract compliance context]
+
+ B --> E
+ B --> G
+ B --> X
+ B --> T
+ B --> S
+ B --> V
+
+ E --> H[Structured handoff / evidence]
+ G --> H
+ X --> H
+ T --> H
+ S --> H
+ V --> H
+ H --> O
+ O --> P[Merge, resume, re-review, or close]
+ ```
 
- ## Quick start
+ ## What you can run
 
- 1. Install Bun.
+ | Need | Specialist / surface |
+ |---|---|
+ | Map unfamiliar local code | `explorer` |
+ | Diagnose a bug with unknown cause | `debugger` |
+ | Implement a scoped change | `executor` |
+ | Review an executor/debugger result | `reviewer --job <exec-job>` |
+ | Run/classify tests | `test-runner` |
+ | Current library/API/GitHub research | `researcher` |
+ | Plan a multi-file feature into beads | `planner` |
+ | Check code shape before review | `code-sanity` |
+ | Audit security-sensitive diffs | `security-auditor` |
+ | Sync exactly one doc | `sync-docs` |
+ | Draft changelog gaps | `changelog-keeper` |
+ | One-shot script/HTTP generation | `sp script` / `sp serve` |
+
+ The live registry is authoritative:
 
  ```bash
- bun --version
- curl -fsSL https://bun.sh/install | bash
+ sp list
+ sp list --compact
+ sp list-rules
+ sp help
  ```
 
- 2. Install xtrm-tools.
+ ## Install and bootstrap
+
+ Specialists is **Bun-only** and expects xtrm-tools to be installed explicitly. xtrm-tools is a runtime prerequisite, not an npm dependency of this package.
 
  ```bash
+ # 1. Bun
+ curl -fsSL https://bun.sh/install | bash
+ bun --version
+
+ # 2. xtrm-tools
  npm install -g xtrm-tools
  xt install
  xt init
- ```
-
- 3. Install Specialists.
 
- ```bash
+ # 3. Specialists
  npm install -g @jaggerxtrm/specialists
  sp init
+ sp doctor
  sp list
  ```
 
- `sp` is a shorter alias for `specialists` — both commands are identical:
+ `sp` is an alias for `specialists`.
 
- ```bash
- sp list
- sp run bug-hunt --bead <id>
- ```
+ `sp init` is an interactive, human-run bootstrap. It checks for `xt` and `.xtrm/`, wires project MCP registration, hooks, skill symlinks, `.specialists/` runtime directories, and the Specialists block in `AGENTS.md`. It does **not** require copying package-owned defaults into every repo.
+
+ ## Update and drift repair
+
+ Specialists uses two distribution tracks:
 
- Tracked work:
+ | Track | Owned by | What it covers | Check / update |
+ |---|---|---|---|
+ | **Category A** runtime assets | `@jaggerxtrm/specialists` package | specialist JSON, mandatory rules, catalog, nodes, hooks shipped with the package | `sp doctor --check-drift`, `sp prune-stale-defaults --dry-run`, `sp prune-stale-defaults` |
+ | **Category B** filesystem assets | xtrm-tools | `.xtrm/skills`, `.claude/skills`, `.pi/skills`, hook snapshots read directly from disk | `xt doctor --cwd <repo> --json`, `xt update --repo <repo> --apply` |
+
+ `.specialists/user/` is your customization layer. `.specialists/default/` is now only for intentional pins or compatibility snapshots; stale default files are drift debt and `sp prune-stale-defaults` removes them by default. `sp init --sync-defaults` remains as a compatibility path, but it is deprecated because it creates repo-local snapshots that can drift from the package-canonical source.
+
+ For an interactive, agent-guided update flow that runs both tracks, diagnoses drift, and asks before applying destructive changes, invoke the `/update-specialists` skill in Claude Code instead of running the raw commands manually.
+
+ ## Operator skills
+
+ These skills load into your Claude Code session on demand and guide the most common operator workflows:
+
+ | Skill | Invoke | When to use |
+ |---|---|---|
+ | `using-specialists-v3` | `/using-specialists-v3` | **Canonical orchestration guide.** Use for any substantial delegated work: implementation, debugging, review, planning, security audit, doc sync, multi-chain epics. Covers bead contracts, role selection, chain lifecycle, merge path, and escalation. |
+ | `using-specialists-auto` | `/using-specialists-auto` | **Autonomous / offline mode.** Activates when you hand over a multi-item list and step away ("auto mode", "run the list"). Layers pacing discipline and escalation triggers on top of `using-specialists-v3`. |
+ | `update-specialists` | `/update-specialists` | **Guided drift repair.** Runs both Category A and Category B diagnostics, presents a combined plan, and asks before applying anything. Prefer this over running raw `sp`/`xt` commands directly. |
+
+ ## Core tracked workflow
 
  ```bash
  bd create "Investigate auth bug" -t bug -p 1 --json
- specialists run bug-hunt --bead <id>
- specialists feed -f
- bd close <id> --reason "Done"
+ bd update <id> --claim --json
+
+ sp run debugger --bead <id> --context-depth 3
+ sp ps
+ sp feed <job-id> --follow
+ sp result <job-id>
+
+ # After implementation and reviewer PASS
+ sp merge <chain-root-bead> # standalone chain
+ sp epic status <epic-id> # multi-chain publication check
+ sp epic merge <epic-id> # canonical epic publication
+
+ bd close <id> --reason "Done" --json
  ```
 
- Merge worktree branches:
+ Ad-hoc work is still available, but tracked work should use beads:
 
  ```bash
- specialists merge <bead-id> # single chain or epic (topological)
- specialists merge <bead-id> --rebuild # rebuild after merge
+ sp run explorer --prompt "Map the CLI architecture"
  ```
 
- `specialists run` prints `[job started: <id>]` early. Normal runtime is DB-backed; `.specialists/jobs/latest` is legacy/operator-only.
+ ## Background jobs and monitoring
 
- Runtime state lives in `observability.db`; `.specialists/jobs/latest` is legacy convenience pointer only.
+ Normal runtime is DB-first: `.specialists/db/observability.db` stores jobs, events, and results. File mirrors under `.specialists/jobs/` are legacy/operator recovery surfaces.
 
- Ad-hoc work:
+ Useful commands:
 
  ```bash
- specialists run codebase-explorer --prompt "Map the CLI architecture"
+ sp ps # actionable dashboard
+ sp ps -f # TTY dashboard follow; pipes emit ANSI-free snapshots
+ sp feed <job-id> # full DB-backed event replay
+ sp feed -f # follow all active jobs
+ sp result <job-id> --wait
+ sp steer <job-id> "focus only on X"
+ sp resume <job-id> "continue with these findings"
+ sp finalize <any-chain-job> # cascade-close waiting keep-alive chain after PASS if needed
+ sp clean --reap-orphans --dry-run
+ sp clean --ps # hide terminal dashboard history without deleting DB audit rows
  ```
 
- ## What `specialists init` does
-
- - creates `specialists/`
- - creates `.specialists/` runtime dirs (`jobs/`, `ready/`)
- - adds `.specialists/` to `.gitignore`
- - injects the canonical Specialists Workflow block into `AGENTS.md`
- - registers the Specialists MCP server at project scope
+ ## Script and service specialists
 
- Verify bootstrap state:
+ Use `sp run` for interactive agent orchestration. Use the script/service surfaces when you need a synchronous, READ_ONLY, one-shot generation path:
 
  ```bash
- specialists status
- specialists doctor
+ sp script <name> --vars key=value --json
+ sp serve --port 8000 --readiness-canary warn
+ curl -sS http://localhost:8000/v1/generate \
+ -H 'content-type: application/json' \
+ -d '{"specialist":"hello","variables":{"name":"world"}}'
  ```
  ```
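Note on the `/v1/generate` call above: the request body is a JSON object with a `specialist` name and a `variables` map. The sketch below builds that body from `key=value` arguments. `build_generate_payload` is a hypothetical helper, not part of the `sp` CLI, and it assumes keys and values contain no characters that need JSON escaping.

```bash
#!/usr/bin/env bash
# Sketch: build the JSON body for POST /v1/generate.
# build_generate_payload is a hypothetical helper (not an `sp` command).
# Assumption: keys and values need no JSON escaping.
build_generate_payload() {
  local specialist="$1"; shift
  local vars="" pair key value
  for pair in "$@"; do
    key="${pair%%=*}"     # text before the first '='
    value="${pair#*=}"    # text after the first '='
    vars+="${vars:+,}\"${key}\":\"${value}\""
  done
  printf '{"specialist":"%s","variables":{%s}}\n' "$specialist" "$vars"
}

# Prints the same body used in the curl example above.
build_generate_payload hello name=world
```

The output can be passed to the documented call as `-d "$(build_generate_payload hello name=world)"`.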
 
- ## Documentation map
+ `sp serve` is intended as a sidecar for script-class specialists. For container deployments, mount the whole `.specialists/` directory read-write, set `HOME=/pi-home`, and align container UID/GID with the host user. See [docs/specialists-service.md](docs/specialists-service.md), [docs/specialists-service-install.md](docs/specialists-service-install.md), and [docs/deploying-alongside.md](docs/deploying-alongside.md).
 
- `docs/` is the source of truth for detailed documentation. Start with the page that matches your task:
+ ## Documentation map
 
  | Need | Doc |
  |---|---|
- | Install and bootstrap a project | [docs/bootstrap.md](docs/bootstrap.md) |
- | Release notes and version history | [CHANGELOG.md](CHANGELOG.md) |
- | Changelog drafting specialist | [config/specialists/changelog-keeper.specialist.json](config/specialists/changelog-keeper.specialist.json) |
- | Run a script-class specialist over HTTP (`sp serve`) — overview & contract | [docs/specialists-service.md](docs/specialists-service.md) |
- | Install `sp serve` in another project (sidecar Docker / Podman) | [docs/specialists-service-install.md](docs/specialists-service-install.md) |
- | Build & publish the specialists-service image | [docs/release-image.md](docs/release-image.md) |
- | Release flow (skill + specialist) | [config/skills/releasing/SKILL.md](config/skills/releasing/SKILL.md) |
- | Bead-first workflow and semantics | [docs/workflow.md](docs/workflow.md) |
+ | Install, update, and distribution model | [docs/installation.md](docs/installation.md) |
+ | Project bootstrap and `sp init` | [docs/bootstrap.md](docs/bootstrap.md) |
+ | Bead-first workflow | [docs/workflow.md](docs/workflow.md) |
  | CLI commands and flags | [docs/cli-reference.md](docs/cli-reference.md) |
- | Background jobs, feed, result, stop | [docs/background-jobs.md](docs/background-jobs.md) |
- | Write or edit a `.specialist.yaml` | [docs/authoring.md](docs/authoring.md) |
- | Current built-in specialists | [docs/specialists-catalog.md](docs/specialists-catalog.md) |
- | MCP registration details | [docs/mcp-servers.md](docs/mcp-servers.md) |
- | Hook behavior | [docs/hooks.md](docs/hooks.md) |
- | Skills shipped in this repo | [docs/skills.md](docs/skills.md) |
- | xtrm / worktree integration | [docs/worktree.md](docs/worktree.md) |
- | RPC mode notes | [docs/pi-rpc.md](docs/pi-rpc.md) |
- | Pi subprocess isolation and extensions | [docs/pi-session.md](docs/pi-session.md) |
- | NodeSupervisor architecture, node lifecycle, and `sp node` CLI | [docs/nodes.md](docs/nodes.md) |
-
- ## Ownership model
-
- Specialists uses layered ownership with deterministic loader precedence: user layer overrides default layer, and default layer falls back to package source (`.specialists/user/*` > `.specialists/default/*` > `config/*`). Operationally: `config/*` is upstream source shipped by package, `.specialists/default/*` is managed mirror refreshed by `specialists init --sync-defaults` (scope: specialists + mandatory-rules + nodes), `.specialists/user/*` is repo customization layer, and `.specialists/{jobs,ready,db}` is runtime/generated state; `.specialists/jobs/` is legacy mirror/debug surface, not normal-runtime source of truth. Use `sp edit --fork-from <base>` to promote non-user specialist into user layer before editing.
+ | Background jobs / `ps` / `feed` / `result` | [docs/background-jobs.md](docs/background-jobs.md) |
+ | Specialist JSON authoring | [docs/authoring.md](docs/authoring.md) |
+ | Built-in specialists | [docs/specialists-catalog.md](docs/specialists-catalog.md) |
+ | Tool catalog and permission resolver | [docs/manifest.md](docs/manifest.md) |
+ | MCP registration and tool surface | [docs/mcp-servers.md](docs/mcp-servers.md), [docs/mcp-tools.md](docs/mcp-tools.md) |
+ | Hooks | [docs/hooks.md](docs/hooks.md) |
+ | Skills and operator skill reference | [docs/skills.md](docs/skills.md) |
+ | Orchestration skill (`using-specialists-v3`) | [docs/skills.md#using-specialists-v3](docs/skills.md#using-specialists-v3) |
+ | Auto mode skill (`using-specialists-auto`) | [docs/skills.md#using-specialists-auto](docs/skills.md#using-specialists-auto) |
+ | Update / drift repair skill (`update-specialists`) | [docs/skills.md#update-specialists](docs/skills.md#update-specialists) |
+ | Worktrees and session close | [docs/worktrees.md](docs/worktrees.md), [docs/worktree.md](docs/worktree.md) |
+ | Runtime architecture | [docs/ARCHITECTURE.md](docs/ARCHITECTURE.md) |
+ | Pi subprocess isolation / RPC boundary | [docs/pi-session.md](docs/pi-session.md), [docs/pi-rpc-boundary.md](docs/pi-rpc-boundary.md) |
+ | NodeSupervisor | [docs/nodes.md](docs/nodes.md) |
+ | Service sidecar / HTTP contract | [docs/specialists-service.md](docs/specialists-service.md) |
+ | Compose deployment recipe | [docs/deploying-alongside.md](docs/deploying-alongside.md) |
+ | Release notes | [CHANGELOG.md](CHANGELOG.md) |
 
  ## Project structure
 
  ```text
  config/
- ├── specialists/ canonical specialist definitions (.specialist.json)
- ├── mandatory-rules/ canonical rule sets injected into specialist prompts (+ README)
- ├── nodes/ canonical node configs
+ ├── specialists/ package-canonical specialist definitions (.specialist.json)
+ ├── mandatory-rules/ package-canonical rule sets injected into specialist prompts
+ ├── catalog/ package-canonical tool catalog
+ ├── nodes/ package-canonical node configs
  ├── hooks/ bundled hook scripts
- ├── skills/ repo-local skills used by specialists
- └── extensions/ pi extensions (future)
+ └── skills/ repo-local skills shipped by this package
+
  .specialists/
- ├── default/ managed mirror of canonical (from sp init --sync-defaults)
- ├── specialists/
- ├── mandatory-rules/
- ├── nodes/
- ├── hooks/
- └── skills/
- ├── user/ repo-owned customizations (overrides default + canonical)
- │ ├── specialists/
- ├── hooks/
- └── skills/
- ├── mandatory-rules/ repo-specific rule overlay (wins on set-id conflict)
- ├── jobs/ runtime gitignored
- └── ready/ runtime — gitignored
- src/ CLI, server, loader, runner, tools
+ ├── user/ repo-owned specialists and overrides (highest precedence)
+ ├── default/ optional pins / compatibility snapshots; prune stale files
+ ├── mandatory-rules/ legacy/repo rule overlay compatibility
+ ├── db/ runtime SQLite state (gitignored)
+ ├── jobs/ legacy runtime mirror (gitignored)
+ └── ready/ legacy ready markers (gitignored)
+
+ .xtrm/
+ ├── skills/ xtrm-managed skill snapshots and active links
+ └── hooks/ xtrm-managed hook snapshots
+
+ src/ CLI, server, loader, runner, supervisor, MCP tool
  ```
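Note on layer precedence: the tree above marks `user/` as highest precedence, and the 3.15.2 text removed in this diff gave the full order as `.specialists/user/*` > `.specialists/default/*` > `config/*`. The sketch below models that lookup, assuming the order still holds in 3.15.4. `resolve_specialist` is a hypothetical helper, not an `sp` command; the `specialists/` subdirectories and the repo-local `config/` path stand in for wherever the real loader reads each layer.

```bash
#!/usr/bin/env bash
# Sketch: report which layer would supply <name>.specialist.json.
# Assumed order: user > default > package config (hypothetical paths).
resolve_specialist() {
  local root="$1" name="$2" dir
  for dir in .specialists/user/specialists \
             .specialists/default/specialists \
             config/specialists; do
    if [ -f "$root/$dir/$name.specialist.json" ]; then
      printf '%s\n' "$dir/$name.specialist.json"
      return 0
    fi
  done
  return 1   # not found in any layer
}

# Demo in a throwaway directory.
demo="$(mktemp -d)"
mkdir -p "$demo/config/specialists" "$demo/.specialists/user/specialists"
touch "$demo/config/specialists/explorer.specialist.json"
touch "$demo/config/specialists/debugger.specialist.json"
touch "$demo/.specialists/user/specialists/explorer.specialist.json"
resolve_specialist "$demo" explorer   # user-layer override wins
resolve_specialist "$demo" debugger   # falls back to the package layer
rm -rf "$demo"
```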
 
- ## Core workflow rules
+ ## Core rules
 
- - **Use `--bead` for tracked work.** The bead is the prompt source.
- - **Use `--prompt` for ad-hoc work only.**
- - `--context-depth` controls how many completed blocker levels are injected.
- - `--no-beads` does **not** disable bead reading.
- - specialists are **project-only**. User-scope specialist discovery is deprecated.
+ - Use `--bead` for tracked work; use `--prompt` only for quick untracked work.
+ - `--context-depth` controls completed dependency context injection; default is 3 for bead runs.
+ - `--no-beads` disables tracking bead creation/updates, but it does not disable reading the input bead when `--bead` is provided.
+ - Edit-capable specialists run in isolated worktrees. Review/fix passes should use `--job <exec-job>` to reuse the same workspace.
+ - Reviewer PASS is the publish gate. Code-sanity/security/test-runner outputs are advisory evidence, not merge approval.
+ - Specialists are project-scoped. User-scope specialist discovery is deprecated.
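Note on the publish-gate rule above: only a reviewer PASS approves a merge, and every other role's output is advisory. The sketch below models that as a check over `role=verdict` pairs. `can_publish`, the verdict strings for non-reviewer roles, and the choice to let the last reviewer verdict win are illustrative assumptions, not the package's implementation.

```bash
#!/usr/bin/env bash
# Sketch: gate publication on the reviewer verdict only.
# can_publish is a hypothetical helper; advisory roles are ignored.
can_publish() {
  local pair role verdict reviewer=""
  for pair in "$@"; do
    role="${pair%%=*}"
    verdict="${pair#*=}"
    if [ "$role" = "reviewer" ]; then
      reviewer="$verdict"   # assumption: the last reviewer verdict wins
    fi
  done
  [ "$reviewer" = "PASS" ]
}

if can_publish test-runner=PASS code-sanity=WARN reviewer=PASS; then
  echo "publish: sp merge or sp epic merge"
fi
if ! can_publish test-runner=PASS security-auditor=PASS; then
  echo "hold: no reviewer PASS yet"
fi
```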
 
  ## Deprecated commands
 
@@ -164,8 +258,9 @@ These commands are still recognized for migration guidance but are no longer onb
 
  - `specialists setup`
  - `specialists install`
+ - `sp release prepare` / `sp release publish` (deprecated aliases; release flow is skill-driven)
 
- Use `specialists init` instead.
+ Use `sp init`, `xt update`, and the release skill flow instead.
 
  ## Development
 
@@ -173,12 +268,10 @@ Use `specialists init` instead.
  bun run build
  bun test # bun vitest run (default)
  bun run test:node # node vitest run (subprocess-safe alternative)
- specialists help
- specialists quickstart
+ sp help
+ sp quickstart
  ```
 
- `test:node` uses plain `node vitest run` as an alternative to `bun --bun vitest`. Useful for executor/codex subprocess chains that may trigger stall detection during vitest's tinypool worker initialization silence.
-
  ## License
 
- MIT
+ MIT — see [LICENSE](LICENSE).