xtrm-tools 0.7.17 → 0.7.18
- package/.xtrm/config/hooks.json +2 -0
- package/.xtrm/config/instructions/agents-top.md +2 -1
- package/.xtrm/registry.json +429 -712
- package/.xtrm/skills/default/creating-service-skills/scripts/bootstrap.py +82 -156
- package/.xtrm/skills/default/creating-service-skills/scripts/scaffolder.py +73 -121
- package/.xtrm/skills/default/hook-development/references/patterns.md +1 -1
- package/.xtrm/skills/default/last30days/scripts/test-v1-vs-v2.sh +2 -2
- package/.xtrm/skills/default/planning/SKILL.md +75 -29
- package/.xtrm/skills/default/releasing/SKILL.md +163 -57
- package/.xtrm/skills/default/security-pipeline/SKILL.md +192 -0
- package/.xtrm/skills/default/security-pipeline/scripts/security-bootstrap.sh +294 -0
- package/.xtrm/skills/default/security-pipeline/templates/.githooks/pre-push.template +39 -0
- package/.xtrm/skills/default/security-pipeline/templates/.github/workflows/gitleaks.yml +33 -0
- package/.xtrm/skills/default/security-pipeline/templates/.github/workflows/osv-scanner.yml +33 -0
- package/.xtrm/skills/default/security-pipeline/templates/.github/workflows/semgrep.yml +41 -0
- package/.xtrm/skills/default/security-pipeline/templates/.gitleaks.toml +44 -0
- package/.xtrm/skills/default/security-pipeline/templates/.pre-commit-config.yaml +67 -0
- package/.xtrm/skills/default/security-pipeline/templates/.semgrepignore +46 -0
- package/.xtrm/skills/default/security-pipeline/templates/scripts/security-scan.sh +57 -0
- package/.xtrm/skills/default/security-pipeline/templates/scripts/semgrep-diff.sh +68 -0
- package/.xtrm/skills/default/session-close-report/SKILL.md +167 -6
- package/.xtrm/skills/default/sync-docs/SKILL.md +1 -1
- package/.xtrm/skills/default/update-xt/SKILL.md +270 -4
- package/.xtrm/skills/default/updating-service-skills/scripts/drift_detector.py +22 -0
- package/.xtrm/skills/default/using-script-specialists/SKILL.md +7 -5
- package/.xtrm/skills/default/using-specialists/SKILL.md +13 -12
- package/.xtrm/skills/default/using-specialists-auto/SKILL.md +137 -0
- package/.xtrm/skills/default/using-specialists-v2/SKILL.md +14 -21
- package/.xtrm/skills/default/using-specialists-v3/SKILL.md +533 -21
- package/.xtrm/skills/default/vaultctl/SKILL.md +2 -2
- package/CHANGELOG.md +82 -3
- package/cli/dist/index.cjs +12425 -3770
- package/cli/dist/index.cjs.map +1 -1
- package/cli/package.json +9 -3
- package/package.json +27 -7
- package/packages/pi-extensions/package.json +1 -1
- package/.xtrm/skills/default/planning/evals/evals.json +0 -19
- package/.xtrm/skills/default/quality-gates/evals/evals.json +0 -181
- package/.xtrm/skills/default/quality-gates/workspace/iteration-1/FINAL-EVAL-SUMMARY.md +0 -75
- package/.xtrm/skills/default/quality-gates/workspace/iteration-1/edge-case-auto-fix-verification/with_skill/outputs/response.md +0 -59
- package/.xtrm/skills/default/quality-gates/workspace/iteration-1/edge-case-mixed-language-project/with_skill/outputs/response.md +0 -60
- package/.xtrm/skills/default/quality-gates/workspace/iteration-1/eval-summary.md +0 -105
- package/.xtrm/skills/default/quality-gates/workspace/iteration-1/partial-install-python-only/with_skill/outputs/response.md +0 -93
- package/.xtrm/skills/default/quality-gates/workspace/iteration-1/python-refactor-request/with_skill/outputs/response.md +0 -104
- package/.xtrm/skills/default/quality-gates/workspace/iteration-1/quality-gate-error-fix/with_skill/outputs/response.md +0 -74
- package/.xtrm/skills/default/quality-gates/workspace/iteration-1/should-not-trigger-general-chat/with_skill/outputs/response.md +0 -18
- package/.xtrm/skills/default/quality-gates/workspace/iteration-1/should-not-trigger-math-question/with_skill/outputs/response.md +0 -18
- package/.xtrm/skills/default/quality-gates/workspace/iteration-1/should-not-trigger-unrelated-coding/with_skill/outputs/response.md +0 -56
- package/.xtrm/skills/default/quality-gates/workspace/iteration-1/tdd-guard-blocking-confusion/with_skill/outputs/response.md +0 -67
- package/.xtrm/skills/default/quality-gates/workspace/iteration-1/typescript-feature-with-tests/with_skill/outputs/response.md +0 -97
- package/.xtrm/skills/default/sync-docs/evals/evals.json +0 -89
- package/.xtrm/skills/default/test-planning/evals/evals.json +0 -23
- package/.xtrm/skills/default/using-specialists/SKILL.safe.md +0 -1082
- package/.xtrm/skills/default/using-specialists/SKILL.ultra.md +0 -1082
- package/.xtrm/skills/default/using-specialists/evals/evals.json +0 -68
- package/.xtrm/skills/default/using-specialists-v3/evals/evals.json +0 -89
- package/packages/pi-extensions/.serena/project.yml +0 -130
@@ -119,13 +119,13 @@ specialists status --job <job-id> # single-job detail view (legacy
 # Epic lifecycle (canonical publication path)
 specialists epic list [--unresolved] # list epics with lifecycle state
 specialists epic status <epic-id> # show chains, blockers, readiness
-specialists epic sync <epic-id> [--apply] #
-specialists epic resolve <epic-id> # transition open -> resolving
+specialists epic sync <epic-id> [--apply] # recompute derived readiness; repair drift
 specialists epic abandon <epic-id> --reason <text> [--force] # terminal transition for stuck epics
-specialists epic merge <epic-id> [--pr] # publish all epic-owned chains
+specialists epic merge <epic-id> [--pr] # publish all epic-owned chains; auto-finalizes PASS chains
 
-# Merge (
-specialists merge <chain-root-bead> [--rebuild]
+# Merge (per-chain or standalone; PASS chains can merge inside an active epic)
+specialists merge <chain-root-bead> [--rebuild]
+specialists finalize <chain-root-bead> # manual recovery if PASS auto-finalize did not fire
 
 # Session close (chain-aware, epic-aware)
 specialists end [--pr] # close session, publish via merge or PR
@@ -682,14 +682,15 @@ sp epic list --unresolved # show non-terminal epics
 
 # Inspect one epic
 sp epic status unitAI-3f7b
-# Shows:
+# Shows: derived readiness state, persisted state (audit only), chains[], blockers[], summary
 
-#
-sp epic
-
-# Publish
-sp epic merge unitAI-3f7b # merge_ready → merged
+# Publish (no manual state transition — readiness is derived live)
+sp epic merge unitAI-3f7b # batch publish all chains; auto-finalizes PASS chains
 sp epic merge unitAI-3f7b --pr # PR mode
+
+# Or per-chain (PASS chain inside active epic is allowed)
+sp merge <chain-root-bead>
+sp finalize <chain-root-bead> # manual recovery if PASS auto-finalize missed
 ```
 
 ### Conflict handling
@@ -1136,7 +1137,7 @@ sp stop <job-id> --force
 
 ### 5) `sp end` open-state loop fix
 
-If `sp end` detects open-state mismatch, tool
+If `sp end` detects open-state mismatch, tool surfaces the derived readiness summary (`sp epic status <epic-id>`) and the per-chain merge path. There is no `sp epic resolve` anymore — readiness is recomputed live from chain state.
 
 - **RPC timeout on worktree job start** (30s, `command id=1`) → pi runs `npm install` in fresh
 worktrees if `.pi/settings.json` lists local packages. Root cause: worktree gets a stale copy
@@ -0,0 +1,137 @@
+---
+name: using-specialists-auto
+description: >
+Operator-offline autonomous orchestration overlay. Activate when the user says
+"auto mode", "full auto", "run autonomously", "I'll leave you alone", or
+similar — and hands over a multi-item priority list. Layers on top of
+`using-specialists-v3`: paranoid pacing, dispatch loop shape, dist-rebuild
+discipline, escalation triggers specific to unsupervised runs. Does NOT
+duplicate v3's bead contracts, sleep table, rebuttal patterns, escalation
+matrix, or session-end handoff — refers to v3 for those.
+version: 2.0
+---
+
+# Using Specialists — Auto Mode (overlay)
+
+You are running unsupervised. Every shortcut you skip costs the operator on return. Move slowly enough to be correct.
+
+`using-specialists-v3` is the canonical specialist orchestration skill — bead contracts, role selection, advisory passes, sleep cadence, rebuttal patterns, escalation matrix, session-end handoff all live there. This skill adds **only** the discipline overlay that changes when no operator is present to catch drift.
+
+## When this skill activates
+
+User explicitly hands over autonomy: "auto mode", "go", "I'll leave you alone", "run the list", "do them all". Skill stays active until session end or the operator returns. Do NOT activate on a single ad-hoc task.
+
+## Auto-mode-specific rules (in addition to v3 hard rules)
+
+These EXTEND v3's Non-Negotiable Rules + Escalation Matrix — they do not replace them.
+
+1. **Default to serial chains.** Auto-mode rarely benefits from parallel chains; the project-wide commit gate (v3 → Bead Lifecycle) forces serial-tail anyway. Only parallelize when file scopes are provably disjoint AND the time savings outweigh the conflict-resolution cost (rare).
+2. **Re-read each bead and defend each field in your head before launching.** If you can't, the bead isn't ready. Title-only beads waste a turn.
+3. **Rebuild + smoke after each P0 (and after every chain touching `src/`).** Skipping breaks the next chain's baseline silently.
+4. **One rebuttal per reviewer, then escalate.** v3 documents the rebuttal pattern — auto-mode just caps the loop count.
+5. **Session-close report is non-optional.** Operator returning to a clean tree but no report = blind cold-start next session. Follow `/session-close-report` skill at session end.
+
+## Per-item loop shape
+
+```
+read bead → write 7-section contract (child impl bead) → bd dep add parent→child
+→ sp run executor --bead <impl> --keep-alive --context-depth 3 --background
+→ sleep 10 && sp ps # confirm started, not stuck queued
+→ sleep <role-typical from v3> & sp ps # check (see v3 Monitoring section)
+→ sp result <exec-job> # consume immediately on transition to waiting
+→ optional advisory passes per v3 (code-sanity if smelly, security-auditor if risk surface)
+→ write reviewer bead contract → sp run reviewer --bead <review> --job <exec-job> --background
+→ sleep 90 & sp ps
+→ sp result <reviewer-job>
+→ PASS? → sp finalize <exec> → sp merge → rebuild dist → smoke → close chain (memory ack first)
+→ PARTIAL? → resume executor with exact findings → resume reviewer
+→ FAIL with valid evidence? → stop and report (file follow-up bead)
+→ FAIL with overcautious gate? → rebut once with cited evidence (v3 → Specialist Rebuttal As Routine)
+```
+
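The verdict branches that close the per-item loop above can be sketched as a small dispatch function. This is an illustration only and not part of the package: `next_action`, its arguments, and the action labels are hypothetical names. Only the PASS/PARTIAL/FAIL branching and the one-rebuttal cap are taken from the skill text.

```shell
#!/usr/bin/env bash
# Hypothetical helper: map a reviewer verdict to the next step of the loop.
# $1 = verdict (PASS | PARTIAL | FAIL)
# $2 = "valid" or "overcautious" (only consulted on FAIL; defaults to valid)
# $3 = rebuttals already sent to this reviewer (defaults to 0)
next_action() {
  local verdict="$1" evidence="${2:-valid}" rebuttals="${3:-0}"
  case "$verdict" in
    PASS)    echo "finalize-merge-rebuild-smoke" ;;
    PARTIAL) echo "resume-executor-then-reviewer" ;;
    FAIL)
      # One rebuttal per reviewer, then escalate.
      if [ "$evidence" = "overcautious" ] && [ "$rebuttals" -lt 1 ]; then
        echo "rebut-once"
      else
        echo "stop-and-report"
      fi
      ;;
    *) echo "unknown-verdict" >&2; return 1 ;;
  esac
}
```

For example, `next_action FAIL overcautious 1` yields `stop-and-report`, because the single allowed rebuttal has already been spent.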
+## Dist rebuild + commit after every P0 or src/-touching chain
+
+```bash
+bun build src/index.ts --target=bun --outfile=dist/index.js
+sed -i '1s|#!/usr/bin/env node|#!/usr/bin/env bun|' dist/index.js
+chmod +x dist/index.js
+git add dist/index.js dist/types/<changed-paths> 2>/dev/null
+git commit -m "build: rebuild dist after <bead-id> <one-line summary>"
+```
+
+Without this, the next chain's tests/smokes run against stale dist and the globally-installed `sp` binary (symlinked to local `dist/index.js`) silently uses pre-fix behavior.
+
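One way to notice a stale dist before the next chain starts is to compare modification times. The sketch below is an assumption-laden illustration, not something the skill ships: `dist_is_stale` is a hypothetical name, and it only checks whether any source file is newer than the built artifact.

```shell
#!/usr/bin/env bash
# Hypothetical helper: exit 0 when the artifact is missing or older than any
# file under the source directory, i.e. when a rebuild is due.
# $1 = source directory (default: src)
# $2 = built artifact (default: dist/index.js)
dist_is_stale() {
  local src_dir="${1:-src}" artifact="${2:-dist/index.js}"
  # A missing artifact always counts as stale.
  [ -f "$artifact" ] || return 0
  # find -newer prints source files modified after the artifact.
  [ -n "$(find "$src_dir" -type f -newer "$artifact" -print 2>/dev/null | head -n 1)" ]
}
```

A possible use is `dist_is_stale && echo "rebuild dist before the next chain"`. Modification times are only a heuristic here; a fresh checkout or a touched file can give a false positive.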
+## Smoke per chain
+
+Tighter than v3's E2E Smoke Phase (which is integration-end). Per-chain smoke is:
+
+- `bunx tsc --noEmit` clean
+- The targeted test(s) the chain added — green
+- After P0 also: `sp --version`, the specific CLI surface that changed, and (if runtime resolution touched) the same command from a non-repo cwd (`cd /tmp/smoke && sp <cmd>`)
+
+If any chain in the session touched auth/secrets/input/dep-lock surface, do v3's cross-cutting security-auditor pass once at end before session close.
+
+## Pre-merge state hygiene (transitional, until v3.14.2 ships globally)
+
+`sp merge` now ignores `.beads/` and `.xtrm/skills/active/**` (per `unitAI-pqe96` shipped this session). The globally-installed `sp` symlinks to local `dist/index.js`, so after `npm install -g .` the fix is live locally. If you still see `sp merge` refuse on dirty state, the leftover is usually a STAGED `.beads/issues.jsonl` (`M ` not ` M`):
+
+```bash
+git restore --staged .beads/issues.jsonl 2>/dev/null
+git checkout -- .beads/issues.jsonl 2>/dev/null
+sp merge <chain-root-bead>
+```
+
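The `M ` versus ` M` distinction above comes from the two status columns of `git status --porcelain` (v1 format): the first column is the index (staged) state and the second is the working-tree state. A minimal sketch of reading them follows; `classify_porcelain` is a hypothetical helper name introduced for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: classify one `git status --porcelain` (v1) line.
# Column 1 is the index (staged) status, column 2 the working-tree status.
classify_porcelain() {
  local line="$1"
  local x="${line:0:1}" y="${line:1:1}"
  if [ "$x" = "?" ] && [ "$y" = "?" ]; then
    echo "untracked"
  elif [ "$x" != " " ] && [ "$y" != " " ]; then
    echo "staged+unstaged"
  elif [ "$x" != " " ]; then
    echo "staged"
  else
    echo "unstaged"
  fi
}
```

It could be fed from git like this, using `IFS=` so the leading space of an unstaged entry is preserved: `git status --porcelain -- .beads/issues.jsonl | while IFS= read -r l; do classify_porcelain "$l"; done`.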
+In the worktree, drop noise stash before merging if `.xtrm/skills/active/...` files are dirty:
+
+```bash
+git stash push -u -m "noise" -- .xtrm/
+```
+
+After merge cleanup: `git worktree remove <path> --force`, `git branch -D feature/<bead>-<role>`, `git worktree prune`, `rm -rf .worktrees/<bead>`.
+
+## Auto-mode-specific escalation triggers
+
+These supplement v3's Escalation Matrix — stop and report when:
+
+- Reviewer FAIL twice after one rebuttal attempt (v3 rebuttal limit hit).
+- Any chain looped twice with no progress.
+- Repeated specialist crashes (>2 same role).
+- Cross-project state pollution (specialists from another repo holding locks/processes).
+- Anything that would otherwise require a v3 hard-rule break.
+
+When you stop: file an issue bead with concrete evidence (PIDs, job IDs, exact error), save a memory if the failure mode is durable, write a partial session-close report, do NOT abandon mid-merge.
+
+## Session start (auto-specific)
+
+In addition to v3's session-start patterns (`bd prime`, `bv --robot-triage`):
+
+```bash
+specialists list --full # confirm current roles + models (registry may have drifted)
+sp ps # 0 active expected
+git worktree list # main only expected (specialist worktrees from prior sessions should be cleaned)
+git status -s # clean expected
+bd ready # work to pick up
+```
+
+If any of these are dirty (active jobs, lingering worktrees, dirty tree from prior session), reconcile before claiming new work.
+
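The three "expected" states in the session-start checklist above can be folded into one pass/fail report. The sketch below is illustrative: `preflight_report` is a hypothetical name, and the active-job count is passed in by hand because the output format of `sp ps` is not assumed here.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report which session-start expectations are violated.
# $1 = active job count (read manually from `sp ps`)
# $2 = number of git worktrees (1 means main only)
# $3 = number of lines printed by `git status -s`
preflight_report() {
  local jobs="$1" worktrees="$2" dirty="$3" problems=0
  if [ "$jobs" -gt 0 ]; then echo "active jobs: $jobs"; problems=1; fi
  if [ "$worktrees" -gt 1 ]; then echo "lingering worktrees: $((worktrees - 1))"; problems=1; fi
  if [ "$dirty" -gt 0 ]; then echo "dirty tree: $dirty path(s)"; problems=1; fi
  [ "$problems" -eq 0 ] && echo "clean: ok to claim new work"
  return "$problems"
}
```

The git-derived inputs could be gathered with `git worktree list --porcelain | grep -c '^worktree '` and `git status -s | wc -l | tr -d ' '`. A non-zero exit status means something needs reconciling before new work is claimed.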
+## Session close (mandatory)
+
+Per v3 → At Session End — Mandatory Handoff. Auto-mode addenda:
+
+1. Confirm `sp ps` empty (0 running, 0 waiting) and `git worktree list` shows only main.
+2. Drop accumulated stashes if they contain only known noise.
+3. Memory gate at session level: `bd kv set "memory-gate-done:<session-id>" "saved: <durable-insights>"` (or `"nothing novel: <reason>"`).
+4. Run `/session-close-report` skill — fills the canonical report template, drives CHANGELOG sync, commit, push.
+
+## Telltale signs you're drifting from auto-mode discipline
+
+- Skipping bead contracts because "the change is small" → write it anyway, costs nothing, downstream specialist needs it.
+- Polling `sp ps` more often than v3's sleep table → wastes context.
+- Letting executors run >2× expected without `sp feed` → blocking future work.
+- Accepting reviewer FAIL without cited evidence in the rebuttal → produces longer loops.
+- Manually editing a config file because "the executor would just do the same thing" → breaks v3 hard rule 13 (orchestrator never edits code), no audit trail.
+- Skipping `bun run build` after src/ change → next chain's smoke fails for unrelated reason.
+- Closing the parent bead before chain memory saved → loses durable insight forever.
+
+Read once at session start. Re-read if you catch yourself drifting.
@@ -633,36 +633,29 @@ Release helper contract:
 
 ## Epic Lifecycle
 
-Epics are merge-gated identities with a persisted
+Epics are merge-gated identities with a **derived** readiness model. State is computed live from chain readiness; only `merged` and `abandoned` are persisted as terminal markers.
 
-
-
-
-
-
+| Derived state | Meaning | Per-chain merge | Batch merge |
+| --- | --- | :---: | :---: |
+| `blocked` | Some chain pending or has no reviewer verdict. | — | No |
+| `failed` | At least one chain has a failed reviewer verdict. | per-chain on PASS chains | No |
+| `merge_ready` | All chains pass and no active jobs. | Yes | Yes via `sp epic merge` |
+| `merged` | (persisted) Publication complete. | — | — |
+| `abandoned` | (persisted) Cancelled without merge. | — | — |
 
-
-| --- | --- | --- |
-| `open` | Epic created, chains not yet running. | No |
-| `resolving` | Chains actively running. | No |
-| `merge_ready` | All chains terminal, reviewer PASS, tsc gate passes. | Yes via `sp epic merge` |
-| `merged` | Publication complete. | — |
-| `failed` | One or more chains failed. | Resolve or abandon. |
-| `abandoned` | Cancelled without merge. | — |
-
-Operator transitions:
+Operator commands:
 
 ```bash
-sp epic
-sp epic merge <epic-id> # merge_ready -> merged (canonical publication)
+sp epic merge <epic-id> # publish all chains; auto-runs finalize preflight
 sp epic merge <epic-id> --pr # PR mode (publish via pull request)
-sp epic sync <epic-id> --apply #
+sp epic sync <epic-id> --apply # recompute derived readiness; repair drift
 sp epic abandon <epic-id> --reason <t> # terminal close for unrecoverable epic
 sp epic abandon <epic-id> --reason <t> --force # force when active pointers still exist
 ```
 
-`sp merge <chain>`
-
+`sp merge <chain>` is allowed for any PASS chain regardless of sibling-epic state. Use `sp epic merge` only when batching all epic chains together. `sp finalize <chain>` is a manual recovery for keep-alive executors that reached PASS but did not auto-finalize.
+
+`sp epic resolve` was removed — there is no operator-driven open→resolving transition anymore; readiness is recomputed every read.
 
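The derived readiness model added in the Epic Lifecycle hunk above can be approximated as a pure function over chain verdicts. This is a sketch under stated assumptions, not the CLI's actual logic: `derive_epic_state` is a hypothetical name, and both the precedence (`failed` before `blocked`) and the treatment of active jobs or an empty epic as `blocked` are guesses that the diff does not confirm.

```shell
#!/usr/bin/env bash
# Hypothetical helper: derive an epic's readiness from its chain verdicts.
# $1 = number of active jobs; remaining args = one verdict per chain
# (PASS | FAIL | anything else counts as pending / no verdict).
# ASSUMPTIONS: failed wins over blocked; active jobs or zero chains => blocked.
derive_epic_state() {
  local active_jobs="$1"; shift
  local verdict failed=0 pending=0
  [ "$#" -gt 0 ] || { echo "blocked"; return 0; }
  for verdict in "$@"; do
    case "$verdict" in
      PASS) ;;
      FAIL) failed=1 ;;
      *)    pending=1 ;;
    esac
  done
  if [ "$failed" -eq 1 ]; then
    echo "failed"
  elif [ "$pending" -eq 1 ] || [ "$active_jobs" -gt 0 ]; then
    echo "blocked"
  else
    echo "merge_ready"
  fi
}
```

Because nothing is stored, calling the function again after a chain changes verdict gives the new state immediately, which is the property the diff describes as readiness being recomputed on read.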
 ## Concurrency And Force Flags
 