@jaggerxtrm/specialists 3.10.0 → 3.12.0

Files changed (100)
  1. package/README.md +3 -0
  2. package/config/hooks/specialists-session-start.mjs +33 -1
  3. package/config/mandatory-rules/changelog-conventions.md +21 -0
  4. package/config/mandatory-rules/changelog-keeper-scope.md +50 -0
  5. package/config/mandatory-rules/gitnexus-required.md +6 -1
  6. package/config/mandatory-rules/sync-docs-scope-discipline.md +40 -0
  7. package/config/skills/releasing/SKILL.md +82 -0
  8. package/config/skills/specialists-creator/SKILL.md +84 -10
  9. package/config/skills/specialists-creator/scripts/validate-specialist.ts +1 -1
  10. package/config/skills/update-specialists/SKILL.md +41 -7
  11. package/config/skills/using-kpi/SKILL.md +150 -0
  12. package/config/skills/using-script-specialists/SKILL.md +208 -0
  13. package/config/skills/using-specialists-v2/SKILL.md +162 -28
  14. package/config/skills/using-specialists-v3/SKILL.md +284 -0
  15. package/config/skills/using-specialists-v3/evals/evals.json +89 -0
  16. package/config/specialists/changelog-drafter.specialist.json +62 -0
  17. package/config/specialists/changelog-keeper.specialist.json +79 -0
  18. package/config/specialists/code-sanity.specialist.json +106 -0
  19. package/config/specialists/debugger.specialist.json +4 -4
  20. package/config/specialists/executor.specialist.json +4 -4
  21. package/config/specialists/explorer.specialist.json +14 -4
  22. package/config/specialists/memory-processor.specialist.json +4 -4
  23. package/config/specialists/node-coordinator.specialist.json +3 -3
  24. package/config/specialists/overthinker.specialist.json +3 -3
  25. package/config/specialists/planner.specialist.json +4 -4
  26. package/config/specialists/researcher.specialist.json +3 -3
  27. package/config/specialists/reviewer.specialist.json +4 -4
  28. package/config/specialists/security-auditor.specialist.json +68 -0
  29. package/config/specialists/specialists-creator.specialist.json +6 -5
  30. package/config/specialists/sync-docs.specialist.json +15 -18
  31. package/config/specialists/test-runner.specialist.json +3 -3
  32. package/config/specialists/xt-merge.specialist.json +4 -4
  33. package/dist/index.js +3323 -1004
  34. package/dist/lib.js +480 -135
  35. package/dist/types/cli/clean.d.ts.map +1 -1
  36. package/dist/types/cli/config.d.ts.map +1 -1
  37. package/dist/types/cli/db.d.ts.map +1 -1
  38. package/dist/types/cli/doctor.d.ts.map +1 -1
  39. package/dist/types/cli/feed.d.ts.map +1 -1
  40. package/dist/types/cli/help.d.ts.map +1 -1
  41. package/dist/types/cli/init.d.ts.map +1 -1
  42. package/dist/types/cli/list.d.ts +4 -0
  43. package/dist/types/cli/list.d.ts.map +1 -1
  44. package/dist/types/cli/merge.d.ts +4 -2
  45. package/dist/types/cli/merge.d.ts.map +1 -1
  46. package/dist/types/cli/node.d.ts.map +1 -1
  47. package/dist/types/cli/prune-stale-defaults.d.ts +2 -0
  48. package/dist/types/cli/prune-stale-defaults.d.ts.map +1 -0
  49. package/dist/types/cli/ps.d.ts.map +1 -1
  50. package/dist/types/cli/result.d.ts.map +1 -1
  51. package/dist/types/cli/run.d.ts.map +1 -1
  52. package/dist/types/cli/script.d.ts.map +1 -1
  53. package/dist/types/cli/serve-hot-reload.d.ts +13 -0
  54. package/dist/types/cli/serve-hot-reload.d.ts.map +1 -0
  55. package/dist/types/cli/serve.d.ts +28 -0
  56. package/dist/types/cli/serve.d.ts.map +1 -1
  57. package/dist/types/cli/status.d.ts.map +1 -1
  58. package/dist/types/cli/stop.d.ts.map +1 -1
  59. package/dist/types/cli/version-check.d.ts +17 -0
  60. package/dist/types/cli/version-check.d.ts.map +1 -0
  61. package/dist/types/index.d.ts +1 -1
  62. package/dist/types/pi/session.d.ts +10 -0
  63. package/dist/types/pi/session.d.ts.map +1 -1
  64. package/dist/types/specialist/canonical-asset-resolver.d.ts +6 -0
  65. package/dist/types/specialist/canonical-asset-resolver.d.ts.map +1 -0
  66. package/dist/types/specialist/drift-detector.d.ts +39 -0
  67. package/dist/types/specialist/drift-detector.d.ts.map +1 -0
  68. package/dist/types/specialist/epic-lifecycle.d.ts.map +1 -1
  69. package/dist/types/specialist/epic-readiness.d.ts.map +1 -1
  70. package/dist/types/specialist/epic-reconciler.d.ts.map +1 -1
  71. package/dist/types/specialist/loader.d.ts +2 -1
  72. package/dist/types/specialist/loader.d.ts.map +1 -1
  73. package/dist/types/specialist/mandatory-rules.d.ts.map +1 -1
  74. package/dist/types/specialist/manifest-resolver.d.ts +55 -0
  75. package/dist/types/specialist/manifest-resolver.d.ts.map +1 -0
  76. package/dist/types/specialist/node-contract.d.ts +2 -2
  77. package/dist/types/specialist/observability-sqlite.d.ts +43 -0
  78. package/dist/types/specialist/observability-sqlite.d.ts.map +1 -1
  79. package/dist/types/specialist/payload-measure.d.ts +19 -0
  80. package/dist/types/specialist/payload-measure.d.ts.map +1 -0
  81. package/dist/types/specialist/porcelain-parser.d.ts +2 -0
  82. package/dist/types/specialist/porcelain-parser.d.ts.map +1 -0
  83. package/dist/types/specialist/resolution-diagnostics.d.ts +36 -0
  84. package/dist/types/specialist/resolution-diagnostics.d.ts.map +1 -0
  85. package/dist/types/specialist/runner.d.ts +8 -0
  86. package/dist/types/specialist/runner.d.ts.map +1 -1
  87. package/dist/types/specialist/schema.d.ts +27 -0
  88. package/dist/types/specialist/schema.d.ts.map +1 -1
  89. package/dist/types/specialist/script-runner.d.ts +44 -1
  90. package/dist/types/specialist/script-runner.d.ts.map +1 -1
  91. package/dist/types/specialist/supervisor.d.ts +4 -0
  92. package/dist/types/specialist/supervisor.d.ts.map +1 -1
  93. package/dist/types/specialist/timeline-events.d.ts +29 -1
  94. package/dist/types/specialist/timeline-events.d.ts.map +1 -1
  95. package/dist/types/specialist/timeline-query.d.ts.map +1 -1
  96. package/dist/types/specialist/tool-catalog.d.ts +126 -0
  97. package/dist/types/specialist/tool-catalog.d.ts.map +1 -0
  98. package/dist/types/tools/specialist/feed_specialist.tool.d.ts +2 -2
  99. package/dist/types/tools/specialist/use_specialist.tool.d.ts.map +1 -1
  100. package/package.json +1 -1
@@ -8,7 +8,7 @@ description: >
8
8
  work without drift. Trigger for code review, debugging, implementation,
9
9
  planning, test generation, doc sync, multi-chain epics, and any question about
10
10
  specialist orchestration.
11
- version: 1.0
11
+ version: 1.4
12
12
  ---
13
13
 
14
14
  # Specialists V2
@@ -17,6 +17,22 @@ You are the orchestrator. Your job is to specify the work, choose the right spec
17
17
 
18
18
  Use this skill for substantial work: codebase exploration, debugging, implementation, review, testing, documentation sync, planning, specialist authoring, and multi-chain orchestration. Do small deterministic edits directly when the scope is already clear and delegation would add ceremony.
19
19
 
20
+ For one-shot synchronous specialist invocations from services or scripts (template + variables, READ_ONLY, JSON out), use `using-script-specialists` instead. That runtime (`sp script` / `sp serve`) is unrelated to bead-first orchestration.
21
+
22
+ ## Update Awareness On Skill Load
23
+
24
+ On first activation in a session, before substantial work, check whether the local specialists install is current:
25
+
26
+ ```bash
27
+ LOCAL=$(node -p "require('./package.json').version" 2>/dev/null)
28
+ LATEST=$(git ls-remote --tags --refs origin 2>/dev/null | grep -oE 'v[0-9]+\.[0-9]+\.[0-9]+$' | sort -V | tail -1 | sed 's/^v//')
29
+ [ -n "$LATEST" ] && [ "$LOCAL" != "$LATEST" ] && echo "specialists v$LOCAL is local; v$LATEST published — consider /update-specialists before substantial work."
30
+ ```
31
+
32
+ Skip the check entirely when `SPECIALISTS_OFFLINE=1` is set, when stdin is not a TTY (specialist-spawned subagent context), or when the previous turn already surfaced this notice. Surface at most one line — never block, never spam, never auto-update. The operator decides whether to run `/update-specialists`.
33
+
34
+ When the local version is behind, summarize the latest CHANGELOG entry with `head -50 CHANGELOG.md` to show what changed, and point to the `update-specialists` skill for the actual reconcile flow.
35
+
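The added check compares versions with `!=`, so it would also print the notice when the local install is ahead of the latest tag. A stricter comparison could look like the sketch below. It is an illustration only, not code from the package: `is_behind` and `update_notice` are hypothetical names, it relies on `sort -V` as the original snippet does, and it covers only the version comparison and the `SPECIALISTS_OFFLINE` opt-out (the TTY and already-surfaced conditions are left out).

```shell
# Hypothetical helper: succeeds (exit 0) only when $1 is strictly older than $2.
# Relies on `sort -V` (version sort), the same tool the skill's snippet uses.
is_behind() {
  local_v=$1
  latest_v=$2
  # Missing input or identical versions: nothing to report.
  [ -n "$local_v" ] && [ -n "$latest_v" ] || return 1
  [ "$local_v" != "$latest_v" ] || return 1
  # The older version sorts first; "behind" means local is that first entry.
  oldest=$(printf '%s\n%s\n' "$local_v" "$latest_v" | sort -V | head -n 1)
  [ "$oldest" = "$local_v" ]
}

# Hypothetical wrapper: prints at most one line and never fails the caller.
update_notice() {
  # Honor the documented opt-out.
  if [ "${SPECIALISTS_OFFLINE:-0}" = "1" ]; then
    return 0
  fi
  if is_behind "$1" "$2"; then
    echo "specialists v$1 is local; v$2 published"
  fi
  return 0
}
```

With this shape, `update_notice "$LOCAL" "$LATEST"` stays silent when the two versions match, when either is empty, or when the local version is newer.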
20
36
  ## Hard Rules
21
37
 
22
38
  1. `--bead` is the prompt for tracked work.
@@ -24,14 +40,32 @@ Use this skill for substantial work: codebase exploration, debugging, implementa
24
40
  3. Never use `--prompt` to supplement tracked work. Update bead notes instead.
25
41
  4. Use explorer only when the implementation path is unknown.
26
42
  5. Use executor only after scope, constraints, and validation are clear enough to act.
27
- 6. Edit-capable specialists use `--worktree` for the first implementation job.
28
- 7. Reviewer gets its own bead and enters the executor workspace with `--job <exec-job>`.
29
- 8. Use `--context-depth 2` for chained work unless there is a specific reason not to.
43
+ 6. Edit-capable specialists with `--bead` auto-provision a worktree. `--worktree` is still accepted for clarity but not required (the deprecated `--no-worktree` flag is gone).
44
+ 7. Reviewer gets its own bead and enters the executor workspace with `--job <exec-job>`. `--job` auto-resolves the bead if `--bead` is omitted.
45
+ 8. `--context-depth` defaults to 3 (parent task + predecessor + own bead). Override only when the chain needs less or more upstream context.
30
46
  9. Keep executor/debugger jobs alive through review so they can be resumed.
31
47
  10. Merge specialist branches with `sp merge` or `sp epic merge`, never manual `git merge`.
32
48
  11. Specialists must not perform destructive or irreversible actions.
33
49
  12. If a specialist fails, inspect feed/result and either steer, resume, rerun with a better bead, or report the blocker.
34
50
  13. Drive chains autonomously. Do not ask the operator to approve routine stage transitions. Escalate only on critical events (see Autonomous Drive section).
51
+ 14. Stale-base guard: dispatch refuses to provision a worktree when sibling epic chains have unmerged substantive commits. Override only with explicit `--force-stale-base` and a reason. Merge-time rebase happens automatically.
52
+ 15. Auto-checkpoint: executor and debugger commit substantive worktree changes on `waiting` by default (`auto_commit: checkpoint_on_waiting`). Noise paths (`.xtrm/`, `.wolf/`, `.specialists/jobs/`, `.beads/`) are filtered.
53
+ 16. Per-turn output appends to the input bead notes for **all** specialists on every `run_complete`, with `[WAITING — more output may follow]` or `[DONE]` headers. `bd show <bead-id>` is a valid path to read intermediate output.
54
+ 17. Specialist jobs do not orchestrate nested specialist chains. The top-level orchestrator dispatches specialists, collects results, and advances the workflow.
55
+ 18. Treat test failures as evidence to classify against the bead scope. Validate whether failures are in-scope, pre-existing, or infrastructure-related before sending an executor into a fix loop.
56
+
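Rule 15 says noise paths are filtered out before an auto-checkpoint commit. As a rough mental model, that filter can be pictured as a prefix match on repo-relative paths. The sketch below is an assumption about the shape of such a filter, not the package's implementation; `is_noise_path` and `filter_substantive` are hypothetical names, and only the four directories listed in the rule are taken from the source.

```shell
# Hypothetical predicate: exit 0 when a repo-relative path falls under one of
# the noise directories listed in rule 15, exit 1 otherwise.
is_noise_path() {
  case "$1" in
    .xtrm/*|.wolf/*|.specialists/jobs/*|.beads/*) return 0 ;;
    *) return 1 ;;
  esac
}

# Read paths on stdin (one per line) and print only the substantive ones.
filter_substantive() {
  while IFS= read -r path; do
    is_noise_path "$path" || printf '%s\n' "$path"
  done
}
```

Feeding it a list of changed paths (for example from `git status --porcelain` after stripping the status columns) would leave only the paths that count as substantive under this model.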
57
+ ## Canonical Runtime State
58
+
59
+ These are current operating facts, not migration notes:
60
+
61
+ - **Asset ownership:** Cat A runtime assets — specialists, mandatory-rules, catalog, and nodes — resolve live from the specialists package after project tiers. Cat B filesystem assets — skills and hooks — are owned by xtrm-tools under `.xtrm/skills/default` and `.xtrm/hooks/default`.
62
+ - **Resolution precedence:** project/user tiers win over managed defaults; package-live is the final fallback. Mandatory-rule indexes are not stacked across tiers; per-id mandatory-rule files may fall through to package canonical when absent locally.
63
+ - **Drift surface:** use `sp doctor --check-drift` to inspect stale managed defaults and `sp prune-stale-defaults --dry-run` to preview cleanup.
64
+ - **Source verification:** resolver/catalog changes in a worktree are verified with `sp config show <name> --resolved --from-source` so evidence comes from the checked-out source, not an installed dist.
65
+ - **Worktree publication:** edit-capable specialists produce worktree branches. Before review or merge, verify the branch diff and status from that worktree.
66
+ - **Epic publication:** epics are the merge-gated identity. Publish through `sp epic merge`; use `sp epic abandon` to deliberately close failed or cancelled epic bookkeeping.
67
+ - **CLI safety:** command help paths are side-effect free. New commands must parse `--help`/`-h` before action and have a no-write help test.
68
+ - **Release context:** changelog-keeper receives xt report context through the `releasing` skill's helper. Release-range logic supports annotated tags.
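
The resolution-precedence bullet above describes a first-match lookup: project and user tiers are consulted before managed defaults, and the package copy is the final fallback. A minimal sketch of that ordering follows. It is an assumption for illustration only: `resolve_asset` is a hypothetical name, the directories are supplied by the caller rather than taken from the package, and the special handling of mandatory-rule indexes is not modeled.

```shell
# Hypothetical resolver: print the first candidate directory that contains
# the named asset. Callers pass directories in precedence order, e.g.
# project tier, user tier, managed defaults, then the package fallback.
resolve_asset() {
  name=$1
  shift
  for dir in "$@"; do
    if [ -f "$dir/$name" ]; then
      printf '%s\n' "$dir/$name"
      return 0
    fi
  done
  return 1  # not found in any tier
}
```

Because the loop stops at the first hit, adding a project-level copy shadows the managed and package copies without touching them.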
35
69
 
36
70
  ## Autonomous Drive
37
71
 
@@ -72,7 +106,7 @@ Do not busy-loop `sp ps` in tight intervals. One sleep + one confirmation poll i
72
106
 
73
107
  ```bash
74
108
  # Dispatch
75
- JOB=$(sp run <specialist> --bead <bead-id> --context-depth 2 --background 2>&1 | tail -1)
109
+ JOB=$(sp run <specialist> --bead <bead-id> --context-depth 3 --background 2>&1 | tail -1)
76
110
 
77
111
  # Sleep for median
78
112
  sleep 180
@@ -180,19 +214,24 @@ Run `specialists list` if you need the live registry. Choose by task, not by hab
180
214
  | Planning/decomposition | `planner` | You need beads, dependencies, file scopes, or sequencing. |
181
215
  | Design/tradeoffs | `overthinker` | The approach is risky, ambiguous, or needs critique. |
182
216
  | Implementation | `executor` | The contract is clear enough to write code or docs. |
183
- | Compliance/code review | `reviewer` | An executor/debugger produced changes that need a verdict. |
217
+ | Compliance/code review | `reviewer` | An executor/debugger produced changes that need the final PASS/PARTIAL/FAIL verdict. |
218
+ | Implementation sanity | `code-sanity` | You want a cheap READ_ONLY smell pass for simplicity, type safety, dead code, brittle async/error handling, or maintainability before reviewer. |
219
+ | Security/dependency audit | `security-auditor` | You need threat modeling, secure-code review, package advisory triage, or agent/config security scanning. LOW: scan/read/recommend only. |
184
220
  | Multiple review perspectives | `parallel-review` | A critical diff needs independent review passes. |
185
221
  | Test execution | `test-runner` | You need suites run and failures interpreted. |
186
222
  | Docs audit/sync | `sync-docs` | Docs may be stale or need targeted synchronization. |
187
- | External/live research | `researcher` | Current library/docs/media lookup is needed. |
223
+ | External/live research | `researcher` | Current non-security library/docs/media lookup is needed. |
188
224
  | Specialist config | `specialists-creator` | Creating or changing specialist JSON/config. |
225
+ | Release publication (end-to-end) | `changelog-keeper` | A new tag is being cut. MEDIUM specialist: drafts CHANGELOG section from xt reports, bumps package.json, rebuilds dist, commits, tags, pushes. Use the `releasing` skill to dispatch. |
189
226
 
190
227
  Selection rules:
191
228
 
192
229
  - Explorer is READ_ONLY and should answer specific questions.
193
230
  - Debugger is better than explorer for failures because it traces causes and remediation.
194
231
  - Executor does not own full test validation; use reviewer/test-runner for that phase.
195
- - Reviewer always uses its own bead plus `--job <executor-job>`.
232
+ - Code-sanity is optional and non-blocking by default: use it when a diff smells overcomplicated or type-risky, then resume executor with concrete findings. It is not a merge gate.
233
+ - Security-auditor may run safe local audit commands and web/source research, but must not edit files, update dependencies, exfiltrate secrets, or run destructive/live-target exploit tests. Executor applies any recommended fixes in a separate bead.
234
+ - Reviewer always uses its own bead plus `--job <executor-job>` and remains the final merge gate.
196
235
  - Sync-docs is for audit/sync; executor is for heavy doc rewrites.
197
236
  - Specialists-creator should precede specialist config/schema edits.
198
237
 
@@ -202,15 +241,21 @@ Daily commands:
202
241
 
203
242
  ```bash
204
243
  specialists list
244
+ specialists list-rules # rule × specialist matrix
205
245
  specialists doctor
206
- specialists run <name> --bead <id> --context-depth 2 --background
207
- specialists run executor --worktree --bead <impl-bead> --context-depth 2 --background
208
- specialists run reviewer --bead <review-bead> --job <exec-job> --context-depth 2 --keep-alive --background
246
+ specialists doctor --check-drift # inspect stale .specialists/default snapshots
247
+ sp prune-stale-defaults --dry-run # preview redundant default snapshots
248
+ specialists run <name> --bead <id> --background
249
+ specialists run executor --bead <impl-bead> --background # worktree auto-provisioned
250
+ specialists run code-sanity --bead <sanity-bead> --job <exec-job> --keep-alive --background
251
+ specialists run security-auditor --bead <security-bead> --job <exec-job> --keep-alive --background
252
+ specialists run reviewer --bead <review-bead> --job <exec-job> --keep-alive --background
209
253
  specialists ps
210
254
  specialists ps <job-id>
211
255
  specialists feed <job-id>
212
256
  specialists feed -f
213
- specialists result <job-id>
257
+ specialists result <job-id> # works on done/error/waiting
258
+ specialists result <job-id> --wait --timeout 600
214
259
  specialists steer <job-id> "new direction"
215
260
  specialists resume <job-id> "next task"
216
261
  specialists stop <job-id>
@@ -223,20 +268,25 @@ sp merge <chain-root-bead>
223
268
  sp epic status <epic-id>
224
269
  sp epic sync <epic-id> --apply
225
270
  sp epic merge <epic-id>
271
+ sp epic abandon <epic-id> --reason "..."
226
272
  sp end
227
273
  ```
228
274
 
229
- Avoid `specialists status --job` for normal monitoring; prefer `sp ps <job-id>`.
275
+ `sp result <job-id>` returns the most recent completed turn for `waiting` jobs with a `Session is waiting for your input` footer — use it to inspect a keep-alive job before deciding whether to resume. For `running` jobs, `sp feed <job-id>` is the right tool; `sp poll` is deprecated. Avoid `specialists status --job` for normal monitoring; prefer `sp ps <job-id>`.
230
276
 
231
277
  ## Flag Semantics
232
278
 
233
279
  `--bead <id>` is the task prompt and tracked work identity.
234
280
 
235
- `--context-depth N` controls parent/ancestor bead context. Use `--context-depth 2` for chains so the specialist sees its own bead, predecessor output, and parent task context.
281
+ `--context-depth N` controls parent/ancestor bead context. Default is **3** (own bead + predecessor + parent task). Lower it when the chain is shallow or the parent context is noisy.
282
+
283
+ `--worktree` provisions a new isolated workspace and branch for edit-capable work. Optional when `--bead` is provided to an edit-capable specialist — a worktree is auto-provisioned. Pass `--worktree` explicitly only when you want it without a bead, or for emphasis. The deprecated `--no-worktree` flag is removed and now errors out.
236
284
 
237
- `--worktree` provisions a new isolated workspace and branch for edit-capable work. Use it for the first executor/debugger job that writes files.
285
+ `--job <id>` reuses an existing job's workspace. Use it for reviewer and fix passes. If `--bead` is omitted, bead_id is inferred from the target job's status; explicit `--bead` always wins.
238
286
 
239
- `--job <id>` reuses an existing job's workspace. Use it for reviewer and fix passes. The caller's own `--bead` remains authoritative; `--job` only selects the workspace.
287
+ `--force-job` overrides the concurrency lock that blocks edit-capable specialists from entering an owner workspace while it is `starting`/`running`. Use only when you accept the write race; prefer `sp stop` on dead jobs first.
288
+
289
+ `--force-stale-base` bypasses the dispatch-time stale-base guard that blocks `--worktree` provisioning when sibling epic chains have unmerged substantive commits. Use only with a clear reason; the guard prevents merge-conflict cascades.
240
290
 
241
291
  `--epic <id>` explicitly associates a job with an epic. Use it for prep jobs whose parent is not the epic but should appear in epic status/readiness.
242
292
 
@@ -273,7 +323,7 @@ CONSTRAINTS: READ_ONLY; cite files/symbols.
273
323
  VALIDATION: Findings include recommended executor scope and risks.
274
324
  OUTPUT: Evidence-backed implementation plan."
275
325
  bd dep add <explore> <task>
276
- specialists run explorer --bead <explore> --context-depth 2 --background
326
+ specialists run explorer --bead <explore> --context-depth 3 --background
277
327
  specialists result <explore-job>
278
328
  ```
279
329
 
@@ -289,10 +339,46 @@ CONSTRAINTS: Keep telemetry names stable; avoid broad refactor.
289
339
  VALIDATION: npm run lint, npx tsc --noEmit, targeted auth tests if available.
290
340
  OUTPUT: Diff summary, checks run, follow-up risks."
291
341
  bd dep add <impl> <explore-or-task>
292
- specialists run executor --worktree --bead <impl> --context-depth 2 --background
342
+ specialists run executor --worktree --bead <impl> --context-depth 3 --background
293
343
  specialists result <exec-job>
294
344
  ```
295
345
 
346
+ Optional code-sanity pass for implementation smell checks (use when the diff is non-trivial or likely to accumulate agent-code complexity):
347
+
348
+ ```bash
349
+ bd create --title "Code sanity check token refresh retry" --type task --priority 3 \
350
+ --description "PROBLEM: Cheap READ_ONLY sanity pass for executor implementation quality before final review.
351
+ SUCCESS: Identify concrete simplicity/type-safety/maintainability findings, or return OK.
352
+ SCOPE: executor job <exec-job>, implementation diff only.
353
+ NON_GOALS: No requirements verdict, no security audit, no test execution, no edits.
354
+ CONSTRAINTS: At most 5 concrete findings; cite files/symbols/lines where possible.
355
+ VALIDATION: Findings are suitable to paste into specialists resume <exec-job>.
356
+ OUTPUT: OK/FINDINGS/BLOCKED with handoff."
357
+ bd dep add <sanity> <impl>
358
+ specialists run code-sanity --bead <sanity> --job <exec-job> --context-depth 3 --keep-alive --background
359
+ specialists result <sanity-job>
360
+ ```
361
+
362
+ If code-sanity returns `FINDINGS`, resume executor with those concrete instructions, then rerun code-sanity only if the fixes were substantive. Do not treat code-sanity `OK` as reviewer PASS.
363
+
364
+ Optional security pass when the task touches auth, secrets, input handling, dependency updates, package advisories, agent config, hooks, or exposed endpoints:
365
+
366
+ ```bash
367
+ bd create --title "Security audit token refresh retry" --type task --priority 2 \
368
+ --description "PROBLEM: Scoped security/dependency/config audit for executor changes.
369
+ SUCCESS: Identify evidence-backed security findings or return no findings.
370
+ SCOPE: executor job <exec-job>, changed files, relevant manifests/config only.
371
+ NON_GOALS: No edits, no package updates, no destructive scans, no live exploit testing.
372
+ CONSTRAINTS: LOW permission; recommendations only. HN/social signals are not authoritative proof.
373
+ VALIDATION: Findings cite local evidence or OSV/GHSA/NVD/vendor/package-audit sources.
374
+ OUTPUT: Security audit summary, findings, dependency triage, residual risk."
375
+ bd dep add <security> <impl>
376
+ specialists run security-auditor --bead <security> --job <exec-job> --context-depth 3 --keep-alive --background
377
+ specialists result <security-job>
378
+ ```
379
+
380
+ If security-auditor recommends code or dependency changes, create/resume an executor fix bead. Do not let security-auditor apply updates.
381
+
296
382
  Create review bead:
297
383
 
298
384
  ```bash
@@ -305,7 +391,7 @@ CONSTRAINTS: Findings first with file/line references.
305
391
  VALIDATION: Inspect diff and available checks.
306
392
  OUTPUT: PASS/PARTIAL/FAIL verdict with required fixes."
307
393
  bd dep add <review> <impl>
308
- specialists run reviewer --bead <review> --job <exec-job> --context-depth 2 --keep-alive --background
394
+ specialists run reviewer --bead <review> --job <exec-job> --context-depth 3 --keep-alive --background
309
395
  specialists result <review-job>
310
396
  ```
311
397
 
@@ -353,7 +439,7 @@ CONSTRAINTS: READ_ONLY; produce dependency plan.
353
439
  VALIDATION: Plan names file scopes and merge order.
354
440
  OUTPUT: Parallel track plan."
355
441
  bd dep add <plan> <epic>
356
- specialists run planner --bead <plan> --epic <epic> --context-depth 2 --background
442
+ specialists run planner --bead <plan> --epic <epic> --context-depth 3 --background
357
443
  ```
358
444
 
359
445
  Create independent implementation beads only when write scopes are disjoint:
@@ -383,8 +469,8 @@ bd dep add <impl-docs> <plan>
383
469
  Run parallel executors only if scopes are disjoint:
384
470
 
385
471
  ```bash
386
- specialists run executor --worktree --bead <impl-cli> --context-depth 2 --background
387
- specialists run executor --worktree --bead <impl-docs> --context-depth 2 --background
472
+ specialists run executor --worktree --bead <impl-cli> --context-depth 3 --background
473
+ specialists run executor --worktree --bead <impl-docs> --context-depth 3 --background
388
474
  ```
389
475
 
390
476
  Review each chain with its own review bead and `--job`.
@@ -406,6 +492,12 @@ Standard loop:
406
492
  ```text
407
493
  executor --worktree --bead impl
408
494
  -> waiting after turn
495
+ optional code-sanity --bead sanity --job exec-job
496
+ -> OK: continue
497
+ -> FINDINGS: resume executor with exact sanity findings
498
+ optional security-auditor --bead security --job exec-job
499
+ -> no findings: continue
500
+ -> findings: create/resume executor fix bead; auditor never edits
409
501
  reviewer --bead review --job exec-job
410
502
  -> PASS: verify commit, publish, stop members if needed
411
503
  -> PARTIAL: resume executor with exact findings
@@ -414,7 +506,7 @@ reviewer --bead review --job exec-job
414
506
 
415
507
  Prefer `sp resume <exec-job>` over a new fix executor when the original job is waiting and context is healthy. Use a new fix bead with `--job <exec-job>` only when the original executor is dead, context exhausted, or a separate audit trail is required.
416
508
 
417
- Reviewer output must be consumed before publishing. Do not treat job completion as equivalent to acceptance.
509
+ Code-sanity and security-auditor outputs are advisory inputs to the chain; reviewer output must still be consumed before publishing. Do not treat job completion, code-sanity OK, or security no-findings as equivalent to reviewer acceptance.
418
510
 
419
511
  ## Dependency Mapping
420
512
 
@@ -454,11 +546,20 @@ Use `sp ps` instead of ad-hoc polling.
454
546
  sp ps
455
547
  sp ps <job-id>
456
548
  sp ps --follow
549
+ sp ps --running # only starting/running/waiting jobs
550
+ sp ps --bead <bead-id> # only jobs linked to one bead
551
+ sp ps --since 30m # only jobs started in the last 30 minutes
552
+ sp ps --mine # only jobs whose bead is assigned to you
553
+ sp ps --include-terminal # include merged/abandoned epics (hidden by default)
457
554
  sp feed <job-id>
458
555
  sp result <job-id>
459
556
  ```
460
557
 
461
- Read results at every stage. For READ_ONLY specialists, output also appends to the input bead notes. If result is empty, inspect feed and rerun or switch specialists before relying on it.
558
+ Filter flags compose: `sp ps --running --bead <id>` is the canonical way to inspect "what's actively working on this issue right now". By default `sp ps` hides epics in `merged` or `abandoned` state to keep the snapshot focused; use `--include-terminal` (or `--all`) to bring them back.
559
+
560
+ When dead epics pile up in `failed` state (sibling-chain conflicts, manual stops), recover with `sp epic abandon <epic-id> --reason "<text>"`. The `failed -> abandoned` transition is allowed specifically for cleanup; live members still require `--force`.
561
+
562
+ Read results at every stage. Every specialist (not just READ_ONLY) auto-appends per-turn output to the input bead notes on each `run_complete`, with `[WAITING]` or `[DONE]` headers — `bd show <bead-id>` shows the full handoff trail. `sp result <job-id>` works on `waiting` jobs and returns the most recent turn plus a "Session is waiting for your input" footer; use it to decide whether to resume. If result is empty, inspect feed and rerun or switch specialists before relying on it.
462
563
 
463
564
  Context percentage in `sp ps`/feed is an action signal:
464
565
 
@@ -467,6 +568,8 @@ Context percentage in `sp ps`/feed is an action signal:
467
568
  - 65-80%: steer toward conclusion.
468
569
  - Above 80%: finish, summarize, or replace the job.
469
570
 
571
+ Do not confuse raw token totals with context percentage. `sp ps` may show raw token counts around 50k-100k for large-context models; that alone is not a stop signal. Use the context percentage when available, plus stalls, repeated edit failures, or scope drift.
572
+
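The two upper bands visible in this hunk can be written as a small classifier. This is a sketch under stated assumptions: `context_action` is a hypothetical name, the input is an integer percentage, and the bands below 65% are defined earlier in the skill and are not reproduced here, so they are lumped into a single placeholder result.

```shell
# Hypothetical classifier for the two bands visible in this hunk.
# Input: integer context percentage (0-100).
context_action() {
  pct=$1
  if [ "$pct" -gt 80 ]; then
    echo "finish-or-replace"      # above 80%: finish, summarize, or replace the job
  elif [ "$pct" -ge 65 ]; then
    echo "steer-to-conclusion"    # 65-80%: steer toward conclusion
  else
    echo "below-65"               # lower bands are not shown in this excerpt
  fi
}
```

Note that the input is the context percentage, not the raw token total, in line with the warning above about not treating 50k-100k raw tokens as a stop signal.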
470
573
  ## Steering And Resume
471
574
 
472
575
  Use `steer` for running jobs:
@@ -506,6 +609,28 @@ Rules:
506
609
  - Merge between stages only when later stages need the code on the main line.
507
610
  - Run or confirm required gates before closing the root bead or epic.
508
611
 
612
+ ## Release Publication
613
+
614
+ Tagged releases go through the `releasing` skill, which dispatches the
615
+ `changelog-keeper` MEDIUM specialist. The specialist reads xt session
616
+ reports via the releasing skill's `xt-reports.ts` helper, drafts the new
617
+ section into `CHANGELOG.md`, bumps `package.json`, rebuilds `dist/`, commits
618
+ with `release: vX.Y.Z`, tags, and pushes with `--follow-tags`. It optionally runs
619
+ `gh release create` if the bead requests it.
620
+
621
+ Operator gate: run a single `git diff --stat HEAD~1 HEAD` after the specialist
622
+ finishes. It must show only `CHANGELOG.md`, `package.json`, `dist/`. Anything
623
+ else means scope was violated; revert and refile.
624
+
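The operator gate is a manual read of `git diff --stat`, but the same whitelist could be checked mechanically. The sketch below is one possible way to do that and is not part of the package: `check_release_scope` is a hypothetical name, and it expects one path per line on stdin, for example from `git diff --name-only HEAD~1 HEAD`.

```shell
# Hypothetical checker: reads changed paths on stdin (one per line) and prints
# any path outside the release whitelist. Exit 0 means the commit stayed in scope.
check_release_scope() {
  violations=0
  while IFS= read -r path; do
    case "$path" in
      CHANGELOG.md|package.json|dist/*) ;;   # allowed by the operator gate
      *)
        printf 'out of scope: %s\n' "$path"
        violations=1
        ;;
    esac
  done
  return "$violations"
}
```

A possible invocation is `git diff --name-only HEAD~1 HEAD | check_release_scope`, where a non-zero exit corresponds to the "revert and refile" case.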
625
+ The `changelog-keeper-scope` mandatory rule enforces the edit whitelist at
626
+ the specialist level. See `config/skills/releasing/SKILL.md` for the bead
627
+ template, dispatch command, and recovery commands.
628
+
629
+ Release helper contract:
630
+
631
+ - Report extraction is provided by the `releasing` skill, so consumer repos do not need repo-local release helper scripts.
632
+ - Release ranges support annotated tags and should be validated through the same path used by tagged releases.
633
+
509
634
  ## Epic Lifecycle
510
635
 
511
636
  Epics are merge-gated identities with a persisted state machine:
@@ -550,7 +675,7 @@ Override with `--force-job` only when the caller explicitly accepts the write
550
675
  race (e.g. emergency fix into a stalled-but-not-terminal executor):
551
676
 
552
677
  ```bash
553
- sp run executor --bead <fix-bead> --job <stalled-exec-job> --force-job --context-depth 2 --background
678
+ sp run executor --bead <fix-bead> --job <stalled-exec-job> --force-job --context-depth 3 --background
554
679
  ```
555
680
 
556
681
  Do not use `--force-job` as a routine unblock. Inspect `sp ps <job-id>` and
@@ -598,10 +723,13 @@ Do not silently fall back to doing substantial specialist work yourself unless t
598
723
  Dead or zombie process:
599
724
 
600
725
  ```bash
601
- sp stop <job-id>
602
- specialists clean --processes
726
+ sp stop <job-id> # explicit single-job stop
727
+ sp clean --processes --dry-run # preview stale non-terminal cancellations (PID-dead OR > --stale-after, default 24h)
728
+ sp clean --processes # apply: cancel stale rows in observability.db
603
729
  ```
604
730
 
731
+ `sp clean --processes` reads from `observability.db` (DB-first) and uses PID liveness as the primary gate: live PIDs are never cancelled, regardless of age. The `--stale-after <hours>` fallback applies only when a row has no recorded PID. `sp clean` with no flags purges terminal rows older than `SPECIALISTS_JOB_TTL_DAYS` (7d default); `--all` purges all terminal rows; `--keep <n>` retains the N most recent.
732
+
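The staleness rule described above can be read as a small decision function. The sketch below is illustrative only: `should_cancel` and its parameters are hypothetical names, and whether the real CLI compares age with `>` or `>=` is an assumption.

```python
from typing import Optional


def should_cancel(pid: Optional[int], pid_alive: bool, age_hours: float,
                  stale_after_hours: float = 24.0) -> bool:
    """Decide whether a non-terminal job row counts as stale."""
    if pid is None:
        # No recorded PID: fall back to the age threshold (assumed strict >).
        return age_hours > stale_after_hours
    # Recorded PID: liveness is the only gate, age is ignored.
    return not pid_alive
```

Under this reading, a row with a live PID is kept however old it is, and the age threshold never applies to rows that do have a PID.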
605
733
  Epic state unclear:
606
734
 
607
735
  ```bash
@@ -609,13 +737,17 @@ sp epic status <epic-id>
609
737
  sp epic sync <epic-id> --apply
610
738
  ```
611
739
 
612
- Specialist missing or config skipped:
740
+ Specialist missing, config skipped, or stale default snapshots:
613
741
 
614
742
  ```bash
615
743
  specialists list
616
744
  specialists doctor
745
+ specialists doctor --check-drift
746
+ sp prune-stale-defaults --dry-run
617
747
  ```
618
748
 
749
+ `sp prune-stale-defaults` is intentionally operator-facing. Always run `--dry-run` first unless the bead explicitly asks to apply cleanup.
750
+
619
751
  Worktree already exists:
620
752
 
621
753
  ```text
@@ -628,6 +760,8 @@ Reviewer cannot enter job workspace:
628
760
  Check target job status with sp ps. MEDIUM/HIGH jobs are blocked from entering a running write-capable workspace unless forced.
629
761
  ```
630
762
 
763
+ When resolver/catalog changes are under review inside a worktree, run `sp config show <name> --resolved --from-source` so the reviewer sees local source behavior, not the installed dist.
764
+
631
765
  Explorer produced empty output:
632
766
 
633
767
  ```text
@@ -0,0 +1,284 @@
1
+ ---
2
+ name: using-specialists-v3
3
+ description: >
4
+ Canonical specialist orchestration skill. Use proactively for substantial work
5
+ that should be delegated, tracked, reviewed, fixed, tested, or merged through
6
+ specialists: code review, debugging, implementation, planning, doc sync,
7
+ security checks, multi-step chains, and questions about specialist workflow.
8
+ version: 3.1
9
+ ---
10
+
11
+ # Using Specialists v3
12
+
13
+ You are the orchestrator. Your job is to turn user intent into a clear bead contract, choose the right specialist from the live registry, launch the chain, monitor it, consume results, drive fixes, and publish through the specialist merge path.
14
+
15
+ Keep this skill practical. It should contain the core behavior needed to orchestrate well; use live commands for volatile details instead of embedding a static catalog.
16
+
17
+ ## When To Delegate
18
+
19
+ Use specialists for substantial work: codebase exploration, debugging, implementation, review, test execution, planning, documentation sync, security/config audit, release publication, and multi-chain epics.
20
+
21
+ Do small deterministic edits directly when the scope is already obvious and delegation would add ceremony. Do not self-investigate or self-implement a substantial task just because you can read files faster; the audit trail and specialist review are part of the workflow.
22
+
23
+ ## Non-Negotiable Rules
24
+
25
+ 1. `--bead` is the prompt for tracked work.
26
+ 2. Do not dispatch until the bead is a usable task contract.
27
+ 3. Never use `--prompt` to supplement tracked work. Update the bead instead.
28
+ 4. Choose by task shape, not by habit. Check `specialists list --full` when roles may have changed.
29
+ 5. Explorer/debugger answer uncertainty before executor writes code.
30
+ 6. Executor starts only when scope, constraints, and validation are clear.
31
+ 7. Reviewer uses its own bead and the executor workspace via `--job <exec-job>`.
32
+ 8. Keep executor/debugger jobs alive through review so they can be resumed.
33
+ 9. Merge specialist-owned work with `sp merge` or `sp epic merge`, not manual `git merge`.
34
+ 10. Specialists must not perform destructive or irreversible operations.
35
+ 11. Treat tests as evidence: classify failures as in-scope, pre-existing, or infrastructure before starting a fix loop.
36
+ 12. Drive routine stages autonomously once the task is clear. Escalate only for human judgment, destructive actions, repeated crashes, or reviewer `FAIL`.
37
+
38
+ ## Live Registry And Help
39
+
40
+ Use the live registry for role details, permissions, current models, and skills:
41
+
42
+ ```bash
43
+ specialists list --full
44
+ ```
45
+
46
+ Use help for command flags and subcommands:
47
+
48
+ ```bash
49
+ sp help
50
+ sp run --help
51
+ sp ps --help
52
+ sp feed --help
53
+ sp result --help
54
+ sp resume --help
55
+ sp merge --help
56
+ sp epic --help
57
+ ```
58
+
59
+ Do not rely on stale remembered flags when help is available.
60
+
61
+ ## Role Selection
62
+
63
+ Common routing:
64
+
65
+ | Need | Specialist |
66
+ | --- | --- |
67
+ | Unknown architecture, call flow, dependencies, implementation options | `explorer` |
68
+ | Symptom, stack trace, regression, flaky/failing test, root cause | `debugger` |
69
+ | Broad feature decomposition, bead board, dependencies, sequencing | `planner` |
70
+ | Risky design choice, tradeoff, premortem, critique | `overthinker` |
71
+ | Clear implementation or scoped doc edit | `executor` |
72
+ | Cheap implementation-quality smell pass before final review | `code-sanity` |
73
+ | Security/config/dependency audit with recommendations only | `security-auditor` |
74
+ | Final compliance verdict on executor/debugger diff | `reviewer` |
75
+ | Run checks and interpret failures without fixing | `test-runner` |
76
+ | Exactly one doc needs drift-aware sync | `sync-docs` |
77
+ | Current external docs/API/ecosystem research | `researcher` |
78
+ | Create or fix specialist config/schema | `specialists-creator` |
79
+ | Release changelog/package/dist/tag publication | `changelog-keeper` through the `releasing` skill |
80
+
81
+ Selection rules:
82
+
83
+ - Use `explorer` when you need evidence before deciding what to change.
84
+ - Use `debugger` instead of explorer when there is a failure symptom.
85
+ - Use `executor` only after the task can name target files/symbols or a bounded discovery result.
86
+ - Use `reviewer` as the merge gate; code-sanity and security-auditor are advisory.
87
+ - Use `test-runner` for running/classifying tests; it does not implement fixes.
88
+ - Use `specialists-creator` before changing specialist definitions.
89
+
90
+ ## Bead Contract
91
+
92
+ Every specialist-bound bead must be a usable prompt. Title-only beads are not acceptable.
93
+
94
+ Required structure:
95
+
96
+ ```text
97
+ PROBLEM: What is wrong or needed.
98
+ SUCCESS: Observable completion criteria.
99
+ SCOPE: Files, symbols, commands, docs, or discovery area.
100
+ NON_GOALS: Explicitly out of scope.
101
+ CONSTRAINTS: Safety, compatibility, style, permissions, sequencing.
102
+ VALIDATION: Checks/tests/review expected before closure.
103
+ OUTPUT: Expected handoff format.
104
+ ```
105
+
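One way to check a bead description against the required structure before dispatch is a simple label scan. This is a hedged sketch: `missing_sections` is a hypothetical helper, not something the package is known to ship, and it only checks that each `LABEL:` appears, not that its content is useful.

```python
import re

# Section labels from the required bead structure.
REQUIRED_SECTIONS = (
    "PROBLEM", "SUCCESS", "SCOPE", "NON_GOALS",
    "CONSTRAINTS", "VALIDATION", "OUTPUT",
)


def missing_sections(description: str) -> list[str]:
    """Return required labels that do not appear as `LABEL:` in the text."""
    missing = []
    for label in REQUIRED_SECTIONS:
        # The lookbehind stops a longer word ending in the label from matching.
        if not re.search(rf"(?<![A-Z_]){label}:", description):
            missing.append(label)
    return missing
```

A title-only bead would come back with most or all labels missing, which is the signal to update the bead rather than dispatch.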
106
+ If the existing issue is vague, update it before dispatch:
107
+
108
+ ```bash
109
+ bd update <id> --notes "CONTRACT: ..."
110
+ ```
111
+
112
+ Contract tuning by role:
113
+
114
+ - Explorer: ask specific questions; require citations to files/symbols/flows; forbid implementation.
115
+ - Debugger: include symptom, reproduction, expected/actual behavior, logs/tests; ask for root cause and minimal fix path.
116
+ - Executor: name target files/symbols and do-not-touch boundaries; require verification evidence.
117
+ - Reviewer: reference the executor job, diff, acceptance criteria, constraints, and required verdict format.
118
+ - Test-runner: name exact commands/suites and expected classification of failures.
119
+ - Sync-docs: exactly one doc in scope.
120
+
121
+ ## Canonical Single-Chain Flow
122
+
123
+ Use this for one implementation branch.
124
+
125
+ ```bash
126
+ # 1. Create or claim root task bead with complete contract
127
+ bd create --title "..." --type task --priority 2 --description "PROBLEM: ..."
128
+ bd update <task> --claim
129
+
130
+ # 2. Optional discovery when path is unknown
131
+ bd create --title "Explore ..." --type task --priority 2 --description "PROBLEM: ... OUTPUT: evidence-backed plan."
132
+ bd dep add <explore> <task>
133
+ specialists run explorer --bead <explore> --context-depth 3
134
+ specialists result <explore-job>
135
+
136
+ # 3. Implementation
137
+ bd create --title "Implement ..." --type task --priority 2 --description "PROBLEM: ... VALIDATION: ..."
138
+ bd dep add <impl> <explore-or-task>
139
+ specialists run executor --bead <impl> --context-depth 3
140
+ specialists result <exec-job>
141
+
142
+ # 4. Optional advisory passes
143
+ specialists run code-sanity --bead <sanity-bead> --job <exec-job> --context-depth 3
144
+ specialists run security-auditor --bead <security-bead> --job <exec-job> --context-depth 3
145
+
146
+ # 5. Final review
147
+ bd create --title "Review ..." --type task --priority 2 --description "PROBLEM: Verify executor output ... OUTPUT: PASS/PARTIAL/FAIL."
148
+ bd dep add <review> <impl>
149
+ specialists run reviewer --bead <review> --job <exec-job> --context-depth 3
150
+ specialists result <review-job>
151
+
152
+ # 6. Publish after reviewer PASS
153
+ sp merge <impl>
154
+ bd close <task> --reason "Reviewer PASS; merged."
155
+ ```
156
+
157
+ Edit-capable specialists with `--bead` auto-provision a worktree. `--worktree` is accepted for clarity but is usually unnecessary. Use `--job <exec-job>` for reviewer/fix passes that must enter the existing executor workspace.
158
+
159
+ ## Review And Fix Loop
160
+
161
+ A chain stays alive until it is merged or abandoned.
162
+
163
+ ```text
164
+ executor/debugger -> waiting
165
+ optional code-sanity/security-auditor -> advisory findings
166
+ reviewer -> PASS | PARTIAL | FAIL
167
+ ```
168
+
169
+ - `PASS`: verify expected commit/diff, then publish.
170
+ - `PARTIAL`: resume the same executor/debugger with exact findings, then re-review.
171
+ - `FAIL`: stop and decide whether to replace the chain, re-scope the bead, or ask the operator if judgment is required.
172
+
173
+ Prefer resume over spawning a new fix executor when the original job is waiting and context is healthy:
174
+
175
+ ```bash
176
+ sp resume <exec-job> "Reviewer PARTIAL. Fix only these findings: ..."
177
+ ```
178
+
179
+ Do not treat job completion, code-sanity OK, or security no-findings as equivalent to reviewer PASS.
180
+
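The verdict handling above amounts to a three-way dispatch. The following sketch assumes a hypothetical `next_action` helper that returns a human-readable next step; only the `sp resume` command text comes from this skill, the rest is illustration.

```python
def next_action(verdict: str, exec_job: str, findings: str = "") -> str:
    """Map a reviewer verdict to the orchestrator's next step."""
    verdict = verdict.strip().upper()
    if verdict == "PASS":
        return "verify the expected commit/diff, then publish"
    if verdict == "PARTIAL":
        # Resume the same job rather than spawning a new fix executor.
        return (
            f'sp resume {exec_job} '
            f'"Reviewer PARTIAL. Fix only these findings: {findings}"'
        )
    if verdict == "FAIL":
        return "stop: replace the chain, re-scope the bead, or ask the operator"
    # Job completion, code-sanity OK, etc. are not reviewer verdicts.
    raise ValueError(f"unknown reviewer verdict: {verdict!r}")
```

Rejecting anything other than the three verdicts mirrors the rule that advisory results never stand in for a reviewer PASS.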
181
+ ## Monitoring And Steering
182
+
183
+ Use `sp ps` for state and `sp result` for completed turns.
184
+
185
+ ```bash
186
+ sp ps
187
+ sp ps <job-id>
188
+ sp ps --bead <bead-id>
189
+ sp feed <job-id> # live/running output
190
+ sp result <job-id> # done/error/waiting result
191
+ ```
192
+
193
+ If a job is running, use `sp feed`. If it is waiting, use `sp result` and decide whether to resume, review, merge, or stop. Avoid tight polling; sleep based on task size, then check once.
194
+
195
+ Use `steer` for running jobs and `resume` for waiting jobs:
196
+
197
+ ```bash
198
+ sp steer <job-id> "Stop broad audit. Answer only the three bead questions."
199
+ sp resume <job-id> "Continue with the next scoped fix. Do not refactor."
200
+ ```
201
+
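The pairing of job state with command can be summarized as a lookup table. This is a sketch under assumptions: the state names `running`, `waiting`, `done`, and `error` are taken from the comments above, but whether the CLI prints exactly these strings is not confirmed here, and `commands_for` is a hypothetical helper.

```python
# Job states named in this skill and the matching inspect/message commands.
COMMANDS_BY_STATE = {
    "running": {"inspect": "sp feed", "message": "sp steer"},
    "waiting": {"inspect": "sp result", "message": "sp resume"},
    "done": {"inspect": "sp result", "message": None},
    "error": {"inspect": "sp result", "message": None},
}


def commands_for(state: str, job_id: str) -> dict:
    """Return the commands to inspect and, where possible, message a job."""
    entry = COMMANDS_BY_STATE[state]  # raises KeyError for unknown states
    inspect = f"{entry['inspect']} {job_id}"
    message = f"{entry['message']} {job_id}" if entry["message"] else None
    return {"inspect": inspect, "message": message}
```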
202
+ Context usage is an action signal when available:
203
+
204
+ - 0-40%: healthy.
205
+ - 40-65%: monitor.
206
+ - 65-80%: steer toward conclusion.
207
+ - Above 80%: finish, summarize, or replace the job.
208
+
209
+ Raw token totals are not context percentages.
210
+
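The bands above can be expressed as a threshold function. In this sketch, `context_action` is a hypothetical name, and the treatment of the shared boundaries (40 and 65 belong to the lower band, by analogy with "Above 80%") is an assumption, since the list does not say which side they fall on.

```python
def context_action(used_pct: float) -> str:
    """Translate a context-usage percentage into the suggested action."""
    if not 0 <= used_pct <= 100:
        # Raw token totals are not percentages and should not be passed here.
        raise ValueError("used_pct must be a percentage between 0 and 100")
    if used_pct <= 40:
        return "healthy"
    if used_pct <= 65:
        return "monitor"
    if used_pct <= 80:
        return "steer toward conclusion"
    return "finish, summarize, or replace the job"
```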
211
+ ## Merge And Publication
212
+
213
+ Standalone chain:
214
+
215
+ ```bash
216
+ sp merge <chain-root-bead>
217
+ ```
218
+
219
+ Epic-owned chains:
220
+
221
+ ```bash
222
+ sp epic status <epic-id>
223
+ sp epic merge <epic-id>
224
+ ```
225
+
226
+ Rules:
227
+
228
+ - Merge only after reviewer PASS unless the operator explicitly accepts a draft for follow-up work.
229
+ - Use `sp epic merge` for unresolved epic chains; `sp merge` refuses those by design.
230
+ - Do not manually `git merge` specialist branches.
231
+ - If merge refuses because a chain job is still `waiting`, consume the result, then deliberately resume, stop, or finalize that job.
232
+ - If merge reports a dirty worktree, inspect that worktree. Revert generated noise only when it is clearly unrelated; otherwise ask or re-dispatch.
233
+ - Run or confirm required gates before closing the root bead or epic.
234
+
235
+ ## Multi-Chain Epic Flow
236
+
237
+ Use an epic when multiple implementation chains publish together.
238
+
239
+ 1. Create an epic bead with complete contract.
240
+ 2. Use planner/explorer for shared prep if needed.
241
+ 3. Create independent implementation beads with disjoint file scopes.
242
+ 4. Dispatch executors in parallel only when scopes are provably disjoint.
243
+ 5. Review each chain with its own review bead and `--job`.
244
+ 6. After every chain has reviewer PASS, publish with `sp epic merge <epic-id>`.
245
+
246
+ Use `--epic <id>` when a job belongs to an epic but its bead is not a direct child. Avoid parallel executors on the same file; sequence them or consolidate the work.
247
+
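Before dispatching executors in parallel, the disjoint-scope requirement could be checked mechanically. The sketch below assumes each bead's scope has already been reduced to a set of exact file paths; `overlapping_scopes` and the bead names in the usage note are hypothetical, and directory or glob scopes would need extra handling.

```python
from itertools import combinations


def overlapping_scopes(scopes: dict[str, set[str]]) -> list[tuple[str, str, set[str]]]:
    """Return (bead_a, bead_b, shared_files) for every pair sharing a file."""
    conflicts = []
    for bead_a, bead_b in combinations(sorted(scopes), 2):
        shared = scopes[bead_a] & scopes[bead_b]
        if shared:
            conflicts.append((bead_a, bead_b, shared))
    return conflicts
```

For example, with `impl-a` scoped to `src/a.ts` and `impl-c` scoped to `src/a.ts` and `src/c.ts`, the pair is reported as a conflict, so those two chains would be sequenced or consolidated rather than run in parallel.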
248
+ ## Failure Recovery
249
+
250
+ When something fails:
251
+
252
+ ```bash
253
+ sp ps <job-id>
254
+ sp feed <job-id>
255
+ sp result <job-id>
256
+ sp doctor
257
+ ```
258
+
259
+ Then choose one action:
260
+
261
+ - Steer a running job back to scope.
262
+ - Resume a waiting job with exact next instructions.
263
+ - Stop a dead or obsolete job.
264
+ - Rerun with a better bead contract.
265
+ - Switch specialist if the selected role was wrong.
266
+ - Report a blocker if destructive, high-risk, or manual action is required.
267
+
268
+ Common recovery commands:
269
+
270
+ ```bash
271
+ sp stop <job-id>
272
+ sp clean --processes --dry-run
273
+ sp epic status <epic-id>
274
+ sp epic sync <epic-id> --apply
275
+ sp epic abandon <epic-id> --reason "..."
276
+ specialists doctor --check-drift
277
+ sp prune-stale-defaults --dry-run
278
+ ```
279
+
280
+ Do not silently take over substantial specialist work yourself unless the operator agrees or the remaining change is genuinely small and deterministic.
281
+
282
+ ## What Stays Out Of This Skill
283
+
284
+ Do not embed the full specialist catalog, all CLI help, release mechanics, stale incident reports, or historical gotchas. Keep volatile detail in `specialists list --full`, `sp help`, bead notes, and focused skills such as `releasing`, `using-nodes`, or `specialists-creator`.