npm - xtrm-tools - Versions diffs - 0.7.12 → 0.7.14 - Mend

xtrm-tools 0.7.12 → 0.7.14

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (42)

package/.xtrm/skills/default/using-nodes/SKILL.md CHANGED

@@ -53,7 +53,7 @@ Coordinator commands should still use `$SPECIALISTS_NODE_ID` directly.
    - Your only tool is `bash`. Your only bash commands are `sp node` plus `sp ps`/`sp result`.
    - Do not call `read`, `ls`, `find`, `grep`, or any file inspection tool. You have none.
-2. **Use only `sp node` + `sp ps` + `sp result` + `sp steer` + `sp resume` command surface for orchestration**
+2. **Use only `sp node` command surface for orchestration**
    - Do not emit legacy contract JSON plans as the primary control mechanism.
    - Do not call deprecated node action channels.
@@ -84,8 +84,6 @@ Coordinator commands should still use `$SPECIALISTS_NODE_ID` directly.
 | `sp node spawn-member --node $SPECIALISTS_NODE_ID --member-key <key> --specialist <name> [--bead <id>] [--phase <id>] [--json]` | Coordinator | Launch a member for the current phase. |
 | `sp node wait-phase --node $SPECIALISTS_NODE_ID --phase <id> --members <k1,k2,...> [--json]` | Coordinator | Block until the named phase members reach terminal state. |
 | `sp result $SPECIALISTS_NODE_ID:<member-key> --wait --json` | Coordinator | Read the persisted output for a specific member after a phase barrier. |
-| `sp steer <job-id> 'direction'` | Coordinator | Steer a running member with new context mid-flight. |
-| `sp resume <job-id> 'next task'` | Coordinator | Resume a waiting member with new task instructions. |
 | `sp node create-bead --node $SPECIALISTS_NODE_ID --title '...' [--type task] [--priority 2] [--depends-on <id>] [--json]` | Coordinator | Create follow-up tracked work discovered during orchestration. |
 | `sp node complete --node <node-id> --strategy <pr\|manual> [--json]` | Operator-only | Force-close node lifecycle when coordinator has reached waiting and operator decides to finalize. |
 | `sp node members <node-id> [--json]` | Operator | Inspect member registry and lineage. |
@@ -110,21 +108,13 @@ Coordinator commands should still use `$SPECIALISTS_NODE_ID` directly.
    - after `wait-phase` succeeds, call `sp result $SPECIALISTS_NODE_ID:<member-key> --wait --json` for each participating member,
    - synthesize the outputs into the next decision.
-4. **Steer members dynamically**
-   - after reading a member's result, if other members need updated context, steer them with `sp steer <job-id> 'specific direction from findings'`.
-   - only steer with concrete, evidence-based direction — never speculative.
-   - example: explorer finds X → steer researcher to 'investigate X patterns in external docs'.
-5. **Re-check status**
+4. **Re-check status**
    - re-read node status after each command sequence,
    - adjust the plan from actual runtime state.
-6. **Coordinator terminal behavior**
+5. **Coordinator terminal behavior**
    - once goals are satisfied (or terminally blocked with explicit reason),
-   - synthesize ALL member evidence into a unified report,
-   - this report is your final output — it MUST integrate all member findings,
-   - 'Node completed. ok:true.' is NOT acceptable synthesis,
-   - enter/remain in `waiting` after producing synthesis.
+   - synthesize evidence and enter/remain in `waiting`.
    - do not issue a completion command; operator decides lifecycle closure via `sp node stop` (or force-close via `sp node complete`).
 ---
@@ -137,70 +127,25 @@ Use this exact loop:
 1. `status`
 2. decide the next phase/member set
-3. spawn members for THIS phase only (not all phases)
+3. launch members
 4. `wait-phase`
-5. `result --wait` for each member
+5. `result --wait`
 6. synthesize evidence
-7. steer or spawn members for next phase based on synthesis
-8. repeat until all phases complete
-9. produce final synthesis report
-10. enter waiting for operator closure
-### Multi-phase coordination pattern
-The coordinator MUST use at least 2 distinct phases:
-**Phase 1 — Explore:**
-- Spawn explorer to gather initial evidence
-- wait-phase → read result → synthesize findings
-- Decide: what needs deeper investigation?
-**Phase 2 — Deep-dive (conditional):**
-- Based on explore findings, spawn researcher/overthinker with specific context
-- Steer running members with evidence from phase 1
-- wait-phase → read results → synthesize
-**Phase 3 — Synthesis:**
-- Read ALL member results from all phases
-- Produce unified report integrating all findings
-- Enter waiting
+7. choose next action or enter waiting after synthesis
 ### Synthesis mandate
-Before declaring synthesis complete, the coordinator **MUST** read the persisted results for ALL members across ALL phases.
-The synthesis report MUST:
-- Integrate findings from every member
-- Highlight agreements, contradictions, and gaps
-- Provide actionable conclusions
-- Be the coordinator's own substantive output
-'Node completed. ok:true.' is NEVER acceptable as synthesis output.
-### Synthesis mandate (repeated for emphasis)
 Before declaring synthesis complete, the coordinator **MUST** read the persisted results for the members that produced the evidence.
 Do not rely only on status transitions. `wait-phase` tells you the members are terminal; `sp result $SPECIALISTS_NODE_ID:<member-key> --wait --json` tells you what they actually found or changed. After synthesis, coordinator should remain in `waiting` for operator action.
 ### Steering guidance
-Steer when concrete result evidence shows a gap, contradiction, or missed requirement.
-**Steering commands:**
-- `sp steer <job-id> 'new direction based on evidence'` — for running members
-- `sp resume <job-id> 'next task with context from phase N'` — for waiting members
-- `sp node spawn-member ... --phase <next-phase>` — for new members with specific context
-**Good steering patterns:**
-- Explorer finds module X handles auth → steer researcher: 'Investigate how other frameworks handle auth patterns similar to module X'
-- Researcher finds tradeoff A vs B → spawn overthinker: 'Analyze tradeoff between A and B. Explorer found that X uses A, researcher found Y uses B. Consider: performance, complexity, ecosystem support.'
-- Reviewer finds missing test coverage → spawn executor: 'Add tests for the paths reviewer identified: ...'
+Only steer when concrete result evidence shows a gap, contradiction, or missed requirement.
-**Bad steering patterns:**
-- Steering a member before reading its completed output
-- Steering with generic instructions ('do better', 'investigate more')
-- Steering speculatively without evidence from a prior member result
+Do **not** steer speculatively.
+- Good: result evidence shows a reviewer found a missing acceptance criterion.
+- Bad: steering a member before reading its completed output.
 ---
@@ -242,49 +187,22 @@ When a command fails:
 ## Example command sequences
-### Sequence A: multi-phase explore → deep-dive → synthesis
+### Sequence A: explore -> synthesis -> impl -> waiting
 ```bash
-# Phase 1: explore
 sp ps --node $SPECIALISTS_NODE_ID --json
 sp node spawn-member --node $SPECIALISTS_NODE_ID --member-key explore-1 --specialist explorer --phase explore-1 --json
 sp node wait-phase --node $SPECIALISTS_NODE_ID --phase explore-1 --members explore-1 --json
 sp result $SPECIALISTS_NODE_ID:explore-1 --wait --json
-# Synthesize explore-1 findings. Decide what needs deeper investigation.
-# Phase 2: deep-dive (spawned based on explore findings)
-sp node spawn-member --node $SPECIALISTS_NODE_ID --member-key researcher-1 --specialist researcher --phase deep-dive-1 --json
-sp node spawn-member --node $SPECIALISTS_NODE_ID --member-key overthinker-1 --specialist overthinker --phase deep-dive-1 --json
-sp node wait-phase --node $SPECIALISTS_NODE_ID --phase deep-dive-1 --members researcher-1,overthinker-1 --json
-sp result $SPECIALISTS_NODE_ID:researcher-1 --wait --json
-sp result $SPECIALISTS_NODE_ID:overthinker-1 --wait --json
-# Synthesize all phase 2 evidence.
-# Phase 3: final synthesis
-# Read all member results, produce unified report, enter waiting.
-sp ps --node $SPECIALISTS_NODE_ID --json
-```
-### Sequence B: explore → steer → synthesis
-```bash
-# Phase 1: explore
-sp ps --node $SPECIALISTS_NODE_ID --json
-sp node spawn-member --node $SPECIALISTS_NODE_ID --member-key explore-1 --specialist explorer --phase explore-1 --json
-sp node wait-phase --node $SPECIALISTS_NODE_ID --phase explore-1 --members explore-1 --json
-sp result $SPECIALISTS_NODE_ID:explore-1 --wait --json
-# Explorer found X. Researcher is running — steer it.
-# Steer researcher with explorer findings
-sp steer <researcher-job-id> 'Based on explorer findings about X, investigate Y patterns in external docs'
-sp node wait-phase --node $SPECIALISTS_NODE_ID --phase deep-dive-1 --members researcher-1 --json
-sp result $SPECIALISTS_NODE_ID:researcher-1 --wait --json
-# Final synthesis — produce unified report integrating ALL findings
+# Synthesize the explore findings and decide whether impl is required.
+sp node spawn-member --node $SPECIALISTS_NODE_ID --member-key impl-1 --specialist executor --phase impl-1 --json
+sp node wait-phase --node $SPECIALISTS_NODE_ID --phase impl-1 --members impl-1 --json
+sp result $SPECIALISTS_NODE_ID:impl-1 --wait --json
+# Synthesize impl evidence, then stay in waiting for operator closure.
 sp ps --node $SPECIALISTS_NODE_ID --json
 ```
-### Sequence C: discovered work + review synthesis + operator closure
+### Sequence B: discovered work + review synthesis + operator closure
 ```bash
 sp ps --node $SPECIALISTS_NODE_ID --json
@@ -319,8 +237,6 @@ sp ps --node $SPECIALISTS_NODE_ID --json
 - `sp node wait-phase --node $SPECIALISTS_NODE_ID --phase <id> --members <k1,k2,...> [--json]`
 - `sp result $SPECIALISTS_NODE_ID:<member-key> --wait --json`
 - `sp ps --node $SPECIALISTS_NODE_ID --json`
-- `sp steer <job-id> 'new direction or context'` — steer a running member mid-flight
-- `sp resume <job-id> 'next task'` — resume a waiting member with new instructions
 ### Operator-only closure commands
 - `sp node stop <node-id>`
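Note on the loop above: after this change the coordinator loop is status, launch members, `wait-phase`, `result --wait`, synthesize. A minimal Python sketch of how one phase of that loop maps onto command lines may help when reading Sequence A. The helper name `phase_commands`, the `members` mapping shape, and the `node-123` id are hypothetical; only the `sp` subcommands and flags are taken from the diff, and the sketch builds argv lists without executing anything.

```python
import shlex


def phase_commands(node_id: str, phase: str, members: dict[str, str]) -> list[list[str]]:
    """Return argv lists for one phase: status, spawn, wait-phase, then results.

    `members` maps member key -> specialist name (hypothetical helper shape).
    """
    # 1. Read node status before acting.
    commands = [["sp", "ps", "--node", node_id, "--json"]]
    # 2. Launch every member of this phase.
    for key, specialist in members.items():
        commands.append([
            "sp", "node", "spawn-member", "--node", node_id,
            "--member-key", key, "--specialist", specialist,
            "--phase", phase, "--json",
        ])
    # 3. Block until all listed members reach a terminal state.
    commands.append([
        "sp", "node", "wait-phase", "--node", node_id,
        "--phase", phase, "--members", ",".join(members), "--json",
    ])
    # 4. Read the persisted result of each member before synthesizing.
    for key in members:
        commands.append(["sp", "result", f"{node_id}:{key}", "--wait", "--json"])
    return commands


if __name__ == "__main__":
    # Hypothetical node id; in a real run this comes from $SPECIALISTS_NODE_ID.
    for argv in phase_commands("node-123", "explore-1", {"explore-1": "explorer"}):
        print(shlex.join(argv))
```

Keeping the result reads as a separate step after `wait-phase` mirrors the synthesis mandate: terminal status alone does not say what a member found.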

package/.xtrm/skills/default/using-script-specialists/SKILL.md ADDED

@@ -0,0 +1,208 @@
+---
+name: using-script-specialists
+description: >
+  Use this skill for synchronous one-shot specialist invocations via `sp script`
+  (CLI) or `sp serve` (HTTP daemon). These run READ_ONLY, template-driven
+  specialists with `$var` substitution and return JSON in-process — no beads,
+  no chains, no worktrees, no job lifecycle. Trigger when integrating a
+  specialist into a service, script, or library, when the caller needs the
+  output immediately, or when the work is a single LLM call with structured
+  input/output. Do NOT use for tracked agent work — that belongs to
+  `using-specialists-v2`.
+version: 1.0
+---
+# Script-Class Specialists
+`sp script` and `sp serve` are a separate runtime from the bead-first
+orchestration covered by `using-specialists-v2`. They exist for service and
+library integration, not for agent chains.
+| Aspect | `sp run` (orchestration) | `sp script` / `sp serve` |
+| --- | --- | --- |
+| Driver | bead contract | template + variables |
+| Execution | supervised job, async | one-shot, synchronous |
+| Permissions | READ_ONLY / MEDIUM / HIGH | READ_ONLY only |
+| Worktrees | edit-capable provisions one | rejected |
+| Output | result.txt + events.jsonl + bead notes | stdout JSON / HTTP body |
+| Audit | `.specialists/jobs/<id>/` | one row in `.specialists/db/observability.db` |
+Use `sp script` from a shell or build pipeline. Use `sp serve` from a service
+that needs an HTTP endpoint backed by `pi`. The same `.specialist.json` runs
+under both.
+## When To Use This Skill
+Trigger when:
+- A service or script needs a single LLM-backed transform (summarize, classify,
+  extract) returning JSON.
+- You are integrating specialists into Python/Node code that cannot block on a
+  supervised job lifecycle.
+- The call is request/response shaped: variables in, structured output out.
+- You need a sidecar HTTP endpoint (`sp serve`) to wrap a specialist for a
+  service consumer that already speaks HTTP.
+Do NOT trigger for: code review, debugging, implementation, multi-turn work,
+keep-alive sessions, anything that should write files. Those belong to
+`using-specialists-v2`.
+## Specialist Compatibility (compatGuard)
+A spec is rejected at request time (`specialist_load_error`) if any of:
+- `execution.interactive` is `true`
+- `execution.requires_worktree` is `true`
+- `execution.permission_required` is anything other than `READ_ONLY`
+- `skills.scripts` is non-empty
+- `prompt.task_template` is missing
+- a referenced `$var` in the chosen template is not supplied (`template_variable_missing`)
+Author specs that explicitly target script-class:
+```json
+{
+  "specialist": {
+    "metadata": { "name": "summarize-event", "version": "1.0.0", "category": "ingestion" },
+    "execution": {
+      "mode": "auto",
+      "model": "anthropic/claude-haiku-4-5",
+      "timeout_ms": 30000,
+      "interactive": false,
+      "response_format": "json",
+      "output_type": "custom",
+      "permission_required": "READ_ONLY",
+      "requires_worktree": false,
+      "max_retries": 0
+    },
+    "prompt": {
+      "task_template": "Summarize event $event_id with body: $body. Return JSON {\"summary\": \"...\"}.",
+      "output_schema": { "required": ["summary"] }
+    }
+  }
+}
+```
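Note on the compatibility rules above: they can be read as a simple checklist over the spec document. The following Python sketch is an illustration only, not the package's compatGuard. The function name `script_class_violations` is hypothetical, the placement of `skills` under `specialist` is assumed from the dotted paths, and a missing `permission_required` is treated here as a violation, which the real guard may or may not do.

```python
def script_class_violations(spec: dict) -> list[str]:
    """List the reasons a spec would not qualify as script-class (sketch)."""
    specialist = spec.get("specialist", {})
    execution = specialist.get("execution", {})
    problems = []

    if execution.get("interactive") is True:
        problems.append("execution.interactive is true")
    if execution.get("requires_worktree") is True:
        problems.append("execution.requires_worktree is true")
    # Assumption: an absent value also counts as "not READ_ONLY".
    if execution.get("permission_required") != "READ_ONLY":
        problems.append("execution.permission_required is not READ_ONLY")
    # Assumption: `skills` sits next to `execution` and `prompt`.
    if specialist.get("skills", {}).get("scripts"):
        problems.append("skills.scripts is non-empty")
    if not specialist.get("prompt", {}).get("task_template"):
        problems.append("prompt.task_template is missing")
    return problems


if __name__ == "__main__":
    example = {
        "specialist": {
            "execution": {
                "interactive": False,
                "permission_required": "READ_ONLY",
                "requires_worktree": False,
            },
            "prompt": {"task_template": "Summarize event $event_id with body: $body."},
        }
    }
    print(script_class_violations(example))
```

The missing-variable rule is left out because it depends on the variables supplied per request rather than on the spec alone.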
+## `sp script` — One-Shot CLI
+```bash
+sp script <specialist-name> \
+  --vars key1=value1 --vars key2=value2 \
+  [--template task_template] \
+  [--model anthropic/claude-sonnet-4-6] \
+  [--thinking medium] \
+  [--timeout-ms 60000] \
+  [--db-path /path/to/observability.db] \
+  [--single-instance <lock-name>] \
+  [--no-trace] \
+  [--json]
+```
+Behaviour:
+- Loads the spec via `SpecialistLoader` (same loader as `sp run`).
+- Renders `prompt.task_template` (or named template) with `--vars`.
+- Spawns `pi --mode json --no-session --no-extensions --no-tools` with the
+  resolved model.
+- Returns the final assistant text on stdout. With `--json`, returns the full
+  `ScriptGenerateResult` envelope.
+- Writes one row to `.specialists/db/observability.db` (same writer as `sp run`).
+Exit codes:
+- `0` — success.
+- non-zero — failure; with `--json`, body has `success: false` and `error_type`.
+Use `--single-instance <lock>` when concurrent invocations of the same logical
+job must be serialized (cron, batch script).
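Note on `--vars` and `$var` rendering: the substitution step described above behaves much like Python's `string.Template`, so a small sketch can show what a missing variable looks like. This is an analogy under assumptions, not the runtime's code: the real renderer lives in the package's TypeScript sources and its exact placeholder rules are not shown in this diff. `parse_vars`, `render`, and `TemplateVariableMissing` are hypothetical names.

```python
from string import Template


class TemplateVariableMissing(Exception):
    """Raised when the template references a variable that was not supplied."""


def parse_vars(pairs: list[str]) -> dict[str, str]:
    """Turn repeated `key=value` arguments into a dict (split on the first '=')."""
    variables = {}
    for pair in pairs:
        key, sep, value = pair.partition("=")
        if not sep or not key:
            raise ValueError(f"expected key=value, got {pair!r}")
        variables[key] = value
    return variables


def render(template: str, variables: dict[str, str]) -> str:
    """Substitute `$name` placeholders; fail loudly on a missing variable."""
    try:
        return Template(template).substitute(variables)
    except KeyError as exc:
        raise TemplateVariableMissing(str(exc.args[0])) from None


if __name__ == "__main__":
    task = "Summarize event $event_id with body: $body."
    print(render(task, parse_vars(["event_id=abc", "body=disk full"])))
```

Failing on the first missing variable, rather than leaving the placeholder in the prompt, matches the `template_variable_missing` error type listed for this runtime.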
+## `sp serve` — HTTP Daemon
+```bash
+sp serve \
+  [--port 8000] \
+  [--concurrency 4] \
+  [--queue-timeout-ms 5000] \
+  [--shutdown-grace-ms 30000] \
+  [--project-dir /path/to/project] \
+  [--fallback-model anthropic/claude-haiku-4-5]
+```
+POST `/v1/generate`:
+```json
+{
+  "specialist": "summarize-event",
+  "variables": { "event_id": "abc", "body": "..." },
+  "template": "task_template",
+  "model_override": "anthropic/...",
+  "timeout_ms": 60000,
+  "trace": true
+}
+```
+Response (200, success):
+```json
+{
+  "success": true,
+  "output": "<final text>",
+  "parsed_json": { "summary": "..." },
+  "meta": {
+    "specialist": "summarize-event",
+    "model": "anthropic/claude-haiku-4-5",
+    "duration_ms": 1234,
+    "trace_id": "<uuid>"
+  }
+}
+```
+Response (200, failure):
+```json
+{ "success": false, "error": "...", "error_type": "..." }
+```
+Error types: `specialist_not_found | specialist_load_error |
+template_variable_missing | auth | quota | timeout | network | invalid_json |
+output_too_large | internal`.
+`400` is reserved for malformed HTTP. `429` returns when concurrency cap is
+saturated past `queue-timeout-ms`.
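Note on the response contract above: because specialist failures come back as HTTP 200 with `success: false`, a caller has to inspect the body rather than the status code. The Python sketch below shows one way a client might do that with the standard library. It is an assumption-laden illustration: the base URL, the `Content-Type` header, the 65 second socket timeout, and the names `GenerateError`, `interpret`, and `generate` are all hypothetical; only the `/v1/generate` path, the payload fields, and the 200/400/429 semantics come from the diff.

```python
import json
import urllib.error
import urllib.request


class GenerateError(Exception):
    """Failure reported by the daemon or by the HTTP layer."""

    def __init__(self, error_type: str, message: str):
        super().__init__(f"{error_type}: {message}")
        self.error_type = error_type


def interpret(status: int, body: str) -> dict:
    """Map an HTTP status and body onto a result dict or a GenerateError."""
    if status == 429:
        raise GenerateError("http_429", "concurrency cap saturated past queue timeout")
    if status == 400:
        raise GenerateError("http_400", "malformed HTTP request")
    if status != 200:
        raise GenerateError("http_unexpected", f"status {status}")
    data = json.loads(body)
    # A 200 response can still carry a failure envelope.
    if not data.get("success"):
        raise GenerateError(data.get("error_type", "internal"), data.get("error", ""))
    return data


def generate(base_url: str, payload: dict, timeout_s: float = 65.0) -> dict:
    """POST the payload to /v1/generate and return the success envelope."""
    request = urllib.request.Request(
        base_url.rstrip("/") + "/v1/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},  # assumed, not confirmed
        method="POST",
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout_s) as response:
            return interpret(response.status, response.read().decode("utf-8"))
    except urllib.error.HTTPError as err:
        # urllib raises for 4xx/5xx; route those through the same mapping.
        return interpret(err.code, err.read().decode("utf-8", "replace"))
```

A call might look like `generate("http://localhost:8000", {"specialist": "summarize-event", "variables": {"event_id": "abc", "body": "..."}})`, where the host and port are placeholders for wherever `sp serve` is listening.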
+## Operational Rules
+- One `pi` subprocess per in-flight request, bounded by `--concurrency`.
+- Credentials come from `pi`'s own `~/.pi/agent/auth.json`. The service never
+  touches API keys.
+- Observability DB is shared with `sp run`. Audit trail is unified.
+- The service is sidecar-per-consumer: no multi-tenant routing, no session
+  state, no orchestration. If you need orchestration, use `sp run` + beads.
+- For container deployments, see `docs/specialists-service-install.md`. Image
+  runs as non-root UID 10001; bind-mount `~/.pi` and `.specialists/`.
+## When To Switch Back To `using-specialists-v2`
+If any of these become true mid-design, drop script-class and use the
+orchestration runtime:
+- The work needs to write files.
+- The caller wants a multi-turn / keep-alive session.
+- A reviewer pass is needed.
+- The work should be tracked as a bead with auditability beyond a single
+  observability row.
+- The output is iterative (steer / resume).
+## What Not To Put Here
+- Bead workflow, chains, epics, reviewers, worktrees — those live in
+  `using-specialists-v2`.
+- Orchestration MCP tooling (`use_specialist`).
+- Long-running multi-turn examples.
+## Reference
+- `docs/specialists-service.md` — HTTP contract and operational notes.
+- `docs/specialists-service-install.md` — Docker/Podman install path.
+- `docs/script-specialists.md` — historical context for the script-class shape.
+- `src/cli/script.ts`, `src/cli/serve.ts`, `src/specialist/script-runner.ts` — runtime.

package/.xtrm/skills/default/using-specialists/SKILL.md CHANGED

@@ -62,6 +62,17 @@ Specialists are autonomous AI agents that run independently — fresh context, d
 8. **No destructive operations by specialists.** No `rm -rf`, no force pushes, no database drops, no credential rotation, no mass deletes, no history rewrites. Surface destructive requirements to the user.
 9. **Executor does not run tests.** Executor runs lint + tsc only. Tests are the reviewer's and test-runner's responsibility in the chained pipeline.
 10. **Keep specialists alive through the review cycle.** Never `sp stop` an executor or debugger before the reviewer delivers its verdict. The specialist stays in `waiting` so you can `resume` it — to commit changes, apply fixes from reviewer feedback, or continue work. Only stop after final reviewer PASS and confirmed commit.
+11. **Respect ownership layers and loader precedence.** Loader resolution order is `.specialists/user/*` > `.specialists/default/*` > package fallback `config/*`. Upstream source = package `config/*` (read-only for repo operators); managed mirror = `.specialists/default/*` (no hand edits); repo custom layer = `.specialists/user/*`; runtime/generated = `.specialists/{jobs,ready,db}`.
+12. **Keep backlog-clean isolated.** Do not mix backlog-clean changes into specialist ownership/migration tasks.
+## Mandatory-rules template sets
+Use template-driven mandatory rules for repeatable policy bundles.
+- Specialist config field: `specialist.mandatory_rules.template_sets`
+- Template source: `config/mandatory-rules/*.md`
+- Template format: YAML frontmatter + body content
+- Runtime behavior: runner resolves templates and injects rendered rules at end of prompt
 ---
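Note on rule 11 above: the loader precedence is a first-match search across three layers. A short Python sketch of that lookup, under stated assumptions, makes the ordering concrete. The function name `resolve_specialist`, the `<name>.specialist.json` file naming, and the flat directory layout are hypothetical; the diff only states the order `.specialists/user/*` > `.specialists/default/*` > package `config/*`.

```python
from pathlib import Path
from typing import Optional


def resolve_specialist(project_dir: Path, package_dir: Path, name: str) -> Optional[Path]:
    """Return the first matching spec file, searching layers in precedence order."""
    layers = [
        project_dir / ".specialists" / "user",     # repo custom layer, wins first
        project_dir / ".specialists" / "default",  # managed mirror, no hand edits
        package_dir / "config",                    # package fallback, read-only
    ]
    for layer in layers:
        # Assumed file naming; the real loader may use a different layout.
        candidate = layer / f"{name}.specialist.json"
        if candidate.is_file():
            return candidate
    return None
```

Under this reading, forking a specialist into `.specialists/user/` shadows the managed copy without editing it, which is why the rule forbids hand edits in `.specialists/default/*`.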
@@ -127,11 +138,13 @@ specialists stop <job-id> --force             # 5s SIGTERM timeout, then pgroup
 # Management
 specialists edit <name>                       # edit specialist config (dot-path, --preset)
+specialists edit <name> --fork-from <base>   # fork non-user specialist into .specialists/user/ then edit
 specialists clean                             # purge old job dirs + worktree GC
 specialists clean --processes                 # kill all running/starting specialist jobs
 specialists db vacuum                         # compact SQLite storage (refuses if jobs running)
 specialists db prune --before <iso|duration> --dry-run|--apply  # prune old events/results/terminal jobs
 specialists doctor orphans                    # integrity scan: orphan, stale-pointer, integrity-violation
+specialists init --sync-defaults              # refresh specialists + mandatory-rules + nodes from canonical defaults
 specialists init --sync-skills                # re-sync skills only (no full init)
 specialists init --no-xtrm-check              # skip xtrm prerequisite check (CI/testing)
 ```