xtrm-tools 0.7.17 → 0.7.18

Files changed (57)
  1. package/.xtrm/config/hooks.json +2 -0
  2. package/.xtrm/config/instructions/agents-top.md +2 -1
  3. package/.xtrm/registry.json +429 -712
  4. package/.xtrm/skills/default/creating-service-skills/scripts/bootstrap.py +82 -156
  5. package/.xtrm/skills/default/creating-service-skills/scripts/scaffolder.py +73 -121
  6. package/.xtrm/skills/default/hook-development/references/patterns.md +1 -1
  7. package/.xtrm/skills/default/last30days/scripts/test-v1-vs-v2.sh +2 -2
  8. package/.xtrm/skills/default/planning/SKILL.md +75 -29
  9. package/.xtrm/skills/default/releasing/SKILL.md +163 -57
  10. package/.xtrm/skills/default/security-pipeline/SKILL.md +192 -0
  11. package/.xtrm/skills/default/security-pipeline/scripts/security-bootstrap.sh +294 -0
  12. package/.xtrm/skills/default/security-pipeline/templates/.githooks/pre-push.template +39 -0
  13. package/.xtrm/skills/default/security-pipeline/templates/.github/workflows/gitleaks.yml +33 -0
  14. package/.xtrm/skills/default/security-pipeline/templates/.github/workflows/osv-scanner.yml +33 -0
  15. package/.xtrm/skills/default/security-pipeline/templates/.github/workflows/semgrep.yml +41 -0
  16. package/.xtrm/skills/default/security-pipeline/templates/.gitleaks.toml +44 -0
  17. package/.xtrm/skills/default/security-pipeline/templates/.pre-commit-config.yaml +67 -0
  18. package/.xtrm/skills/default/security-pipeline/templates/.semgrepignore +46 -0
  19. package/.xtrm/skills/default/security-pipeline/templates/scripts/security-scan.sh +57 -0
  20. package/.xtrm/skills/default/security-pipeline/templates/scripts/semgrep-diff.sh +68 -0
  21. package/.xtrm/skills/default/session-close-report/SKILL.md +167 -6
  22. package/.xtrm/skills/default/sync-docs/SKILL.md +1 -1
  23. package/.xtrm/skills/default/update-xt/SKILL.md +270 -4
  24. package/.xtrm/skills/default/updating-service-skills/scripts/drift_detector.py +22 -0
  25. package/.xtrm/skills/default/using-script-specialists/SKILL.md +7 -5
  26. package/.xtrm/skills/default/using-specialists/SKILL.md +13 -12
  27. package/.xtrm/skills/default/using-specialists-auto/SKILL.md +137 -0
  28. package/.xtrm/skills/default/using-specialists-v2/SKILL.md +14 -21
  29. package/.xtrm/skills/default/using-specialists-v3/SKILL.md +533 -21
  30. package/.xtrm/skills/default/vaultctl/SKILL.md +2 -2
  31. package/CHANGELOG.md +82 -3
  32. package/cli/dist/index.cjs +12425 -3770
  33. package/cli/dist/index.cjs.map +1 -1
  34. package/cli/package.json +9 -3
  35. package/package.json +27 -7
  36. package/packages/pi-extensions/package.json +1 -1
  37. package/.xtrm/skills/default/planning/evals/evals.json +0 -19
  38. package/.xtrm/skills/default/quality-gates/evals/evals.json +0 -181
  39. package/.xtrm/skills/default/quality-gates/workspace/iteration-1/FINAL-EVAL-SUMMARY.md +0 -75
  40. package/.xtrm/skills/default/quality-gates/workspace/iteration-1/edge-case-auto-fix-verification/with_skill/outputs/response.md +0 -59
  41. package/.xtrm/skills/default/quality-gates/workspace/iteration-1/edge-case-mixed-language-project/with_skill/outputs/response.md +0 -60
  42. package/.xtrm/skills/default/quality-gates/workspace/iteration-1/eval-summary.md +0 -105
  43. package/.xtrm/skills/default/quality-gates/workspace/iteration-1/partial-install-python-only/with_skill/outputs/response.md +0 -93
  44. package/.xtrm/skills/default/quality-gates/workspace/iteration-1/python-refactor-request/with_skill/outputs/response.md +0 -104
  45. package/.xtrm/skills/default/quality-gates/workspace/iteration-1/quality-gate-error-fix/with_skill/outputs/response.md +0 -74
  46. package/.xtrm/skills/default/quality-gates/workspace/iteration-1/should-not-trigger-general-chat/with_skill/outputs/response.md +0 -18
  47. package/.xtrm/skills/default/quality-gates/workspace/iteration-1/should-not-trigger-math-question/with_skill/outputs/response.md +0 -18
  48. package/.xtrm/skills/default/quality-gates/workspace/iteration-1/should-not-trigger-unrelated-coding/with_skill/outputs/response.md +0 -56
  49. package/.xtrm/skills/default/quality-gates/workspace/iteration-1/tdd-guard-blocking-confusion/with_skill/outputs/response.md +0 -67
  50. package/.xtrm/skills/default/quality-gates/workspace/iteration-1/typescript-feature-with-tests/with_skill/outputs/response.md +0 -97
  51. package/.xtrm/skills/default/sync-docs/evals/evals.json +0 -89
  52. package/.xtrm/skills/default/test-planning/evals/evals.json +0 -23
  53. package/.xtrm/skills/default/using-specialists/SKILL.safe.md +0 -1082
  54. package/.xtrm/skills/default/using-specialists/SKILL.ultra.md +0 -1082
  55. package/.xtrm/skills/default/using-specialists/evals/evals.json +0 -68
  56. package/.xtrm/skills/default/using-specialists-v3/evals/evals.json +0 -89
  57. package/packages/pi-extensions/.serena/project.yml +0 -130
@@ -119,13 +119,13 @@ specialists status --job <job-id> # single-job detail view (legacy
119
119
  # Epic lifecycle (canonical publication path)
120
120
  specialists epic list [--unresolved] # list epics with lifecycle state
121
121
  specialists epic status <epic-id> # show chains, blockers, readiness
122
- specialists epic sync <epic-id> [--apply] # reconcile DB vs live state (dry-run default)
123
- specialists epic resolve <epic-id> # transition open -> resolving
122
+ specialists epic sync <epic-id> [--apply] # recompute derived readiness; repair drift
124
123
  specialists epic abandon <epic-id> --reason <text> [--force] # terminal transition for stuck epics
125
- specialists epic merge <epic-id> [--pr] # publish all epic-owned chains
124
+ specialists epic merge <epic-id> [--pr] # publish all epic-owned chains; auto-finalizes PASS chains
126
125
 
127
- # Merge (for standalone chains only)
128
- specialists merge <chain-root-bead> [--rebuild] # publish ONE standalone chain
126
+ # Merge (per-chain or standalone; PASS chains can merge inside an active epic)
127
+ specialists merge <chain-root-bead> [--rebuild]
128
+ specialists finalize <chain-root-bead> # manual recovery if PASS auto-finalize did not fire
129
129
 
130
130
  # Session close (chain-aware, epic-aware)
131
131
  specialists end [--pr] # close session, publish via merge or PR
@@ -682,14 +682,15 @@ sp epic list --unresolved # show non-terminal epics
682
682
 
683
683
  # Inspect one epic
684
684
  sp epic status unitAI-3f7b
685
- # Shows: persisted_state, readiness_state, chains[], blockers[], summary
685
+ # Shows: derived readiness state, persisted state (audit only), chains[], blockers[], summary
686
686
 
687
- # Transition states (manual)
688
- sp epic resolve unitAI-3f7b # open → resolving
689
-
690
- # Publish
691
- sp epic merge unitAI-3f7b # merge_ready → merged
687
+ # Publish (no manual state transition — readiness is derived live)
688
+ sp epic merge unitAI-3f7b # batch publish all chains; auto-finalizes PASS chains
692
689
  sp epic merge unitAI-3f7b --pr # PR mode
690
+
691
+ # Or per-chain (PASS chain inside active epic is allowed)
692
+ sp merge <chain-root-bead>
693
+ sp finalize <chain-root-bead> # manual recovery if PASS auto-finalize missed
693
694
  ```
694
695
 
695
696
  ### Conflict handling
@@ -1136,7 +1137,7 @@ sp stop <job-id> --force
1136
1137
 
1137
1138
  ### 5) `sp end` open-state loop fix
1138
1139
 
1139
- If `sp end` detects open-state mismatch, tool now suggests `sp epic resolve <epic-id>` as next command (no redirect loop).
1140
+ If `sp end` detects an open-state mismatch, the tool surfaces the derived readiness summary (`sp epic status <epic-id>`) and the per-chain merge path. `sp epic resolve` no longer exists; readiness is recomputed live from chain state.
1140
1141
 
1141
1142
  - **RPC timeout on worktree job start** (30s, `command id=1`) → pi runs `npm install` in fresh
1142
1143
  worktrees if `.pi/settings.json` lists local packages. Root cause: worktree gets a stale copy
@@ -0,0 +1,137 @@
1
+ ---
2
+ name: using-specialists-auto
3
+ description: >
4
+ Operator-offline autonomous orchestration overlay. Activate when the user says
5
+ "auto mode", "full auto", "run autonomously", "I'll leave you alone", or
6
+ similar — and hands over a multi-item priority list. Layers on top of
7
+ `using-specialists-v3`: paranoid pacing, dispatch loop shape, dist-rebuild
8
+ discipline, escalation triggers specific to unsupervised runs. Does NOT
9
+ duplicate v3's bead contracts, sleep table, rebuttal patterns, escalation
10
+ matrix, or session-end handoff — refers to v3 for those.
11
+ version: 2.0
12
+ ---
13
+
14
+ # Using Specialists — Auto Mode (overlay)
15
+
16
+ You are running unsupervised. Every shortcut you take costs the operator on return. Move slowly enough to be correct.
17
+
18
+ `using-specialists-v3` is the canonical specialist orchestration skill — bead contracts, role selection, advisory passes, sleep cadence, rebuttal patterns, escalation matrix, session-end handoff all live there. This skill adds **only** the discipline overlay that changes when no operator is present to catch drift.
19
+
20
+ ## When this skill activates
21
+
22
+ User explicitly hands over autonomy: "auto mode", "go", "I'll leave you alone", "run the list", "do them all". Skill stays active until session end or the operator returns. Do NOT activate on a single ad-hoc task.
23
+
24
+ ## Auto-mode-specific rules (in addition to v3 hard rules)
25
+
26
+ These EXTEND v3's Non-Negotiable Rules + Escalation Matrix — they do not replace them.
27
+
28
+ 1. **Default to serial chains.** Auto-mode rarely benefits from parallel chains; the project-wide commit gate (v3 → Bead Lifecycle) forces serial-tail anyway. Only parallelize when file scopes are provably disjoint AND the time savings outweigh the conflict-resolution cost (rare).
29
+ 2. **Re-read each bead and defend each field in your head before launching.** If you can't, the bead isn't ready. Title-only beads waste a turn.
30
+ 3. **Rebuild + smoke after each P0 (and after every chain touching `src/`).** Skipping breaks the next chain's baseline silently.
31
+ 4. **One rebuttal per reviewer, then escalate.** v3 documents the rebuttal pattern — auto-mode just caps the loop count.
32
+ 5. **Session-close report is non-optional.** Operator returning to a clean tree but no report = blind cold-start next session. Follow `/session-close-report` skill at session end.
33
+
34
+ ## Per-item loop shape
35
+
36
+ ```
37
+ read bead → write 7-section contract (child impl bead) → bd dep add parent→child
38
+ → sp run executor --bead <impl> --keep-alive --context-depth 3 --background
39
+ → sleep 10 && sp ps # confirm started, not stuck queued
40
+ → sleep <role-typical from v3> && sp ps # check (see v3 Monitoring section)
41
+ → sp result <exec-job> # consume immediately on transition to waiting
42
+ → optional advisory passes per v3 (code-sanity if smelly, security-auditor if risk surface)
43
+ → write reviewer bead contract → sp run reviewer --bead <review> --job <exec-job> --background
44
+ → sleep 90 && sp ps
45
+ → sp result <reviewer-job>
46
+ → PASS? → sp finalize <exec> → sp merge → rebuild dist → smoke → close chain (memory ack first)
47
+ → PARTIAL? → resume executor with exact findings → resume reviewer
48
+ → FAIL with valid evidence? → stop and report (file follow-up bead)
49
+ → FAIL with overcautious gate? → rebut once with cited evidence (v3 → Specialist Rebuttal As Routine)
50
+ ```
51
+
52
+ ## Dist rebuild + commit after every P0 or src/-touching chain
53
+
54
+ ```bash
55
+ bun build src/index.ts --target=bun --outfile=dist/index.js
56
+ sed -i '1s|#!/usr/bin/env node|#!/usr/bin/env bun|' dist/index.js
57
+ chmod +x dist/index.js
58
+ git add dist/index.js dist/types/<changed-paths> 2>/dev/null
59
+ git commit -m "build: rebuild dist after <bead-id> <one-line summary>"
60
+ ```
61
+
62
+ Without this, the next chain's tests/smokes run against stale dist and the globally-installed `sp` binary (symlinked to local `dist/index.js`) silently uses pre-fix behavior.
63
+
64
+ ## Smoke per chain
65
+
66
+ Tighter than v3's E2E Smoke Phase (which is integration-end). Per-chain smoke is:
67
+
68
+ - `bunx tsc --noEmit` clean
69
+ - The targeted test(s) the chain added — green
70
+ - After P0 also: `sp --version`, the specific CLI surface that changed, and (if runtime resolution touched) the same command from a non-repo cwd (`cd /tmp/smoke && sp <cmd>`)
71
+
72
+ If any chain in the session touched auth/secrets/input/dep-lock surface, do v3's cross-cutting security-auditor pass once at end before session close.
73
+
74
+ ## Pre-merge state hygiene (transitional, until v3.14.2 ships globally)
75
+
76
+ `sp merge` now ignores `.beads/` and `.xtrm/skills/active/**` (per `unitAI-pqe96` shipped this session). The globally-installed `sp` symlinks to local `dist/index.js`, so after `npm install -g .` the fix is live locally. If you still see `sp merge` refuse on dirty state, the leftover is usually a STAGED `.beads/issues.jsonl` (`M ` not ` M`):
77
+
78
+ ```bash
79
+ git restore --staged .beads/issues.jsonl 2>/dev/null
80
+ git checkout -- .beads/issues.jsonl 2>/dev/null
81
+ sp merge <chain-root-bead>
82
+ ```
83
+
84
+ In the worktree, stash the noise before merging if `.xtrm/skills/active/...` files are dirty:
85
+
86
+ ```bash
87
+ git stash push -u -m "noise" -- .xtrm/
88
+ ```
89
+
90
+ After the merge, clean up: `git worktree remove <path> --force`, `git branch -D feature/<bead>-<role>`, `git worktree prune`, `rm -rf .worktrees/<bead>`.
91
+
92
+ ## Auto-mode-specific escalation triggers
93
+
94
+ These supplement v3's Escalation Matrix — stop and report when:
95
+
96
+ - Reviewer FAIL twice after one rebuttal attempt (v3 rebuttal limit hit).
97
+ - Any chain looped twice with no progress.
98
+ - Repeated specialist crashes (>2 same role).
99
+ - Cross-project state pollution (specialists from another repo holding locks/processes).
100
+ - Anything that would otherwise require a v3 hard-rule break.
101
+
102
+ When you stop: file an issue bead with concrete evidence (PIDs, job IDs, exact error), save a memory if the failure mode is durable, write a partial session-close report, do NOT abandon mid-merge.
103
+
104
+ ## Session start (auto-specific)
105
+
106
+ In addition to v3's session-start patterns (`bd prime`, `bv --robot-triage`):
107
+
108
+ ```bash
109
+ specialists list --full # confirm current roles + models (registry may have drifted)
110
+ sp ps # 0 active expected
111
+ git worktree list # main only expected (specialist worktrees from prior sessions should be cleaned)
112
+ git status -s # clean expected
113
+ bd ready # work to pick up
114
+ ```
115
+
116
+ If any of these are dirty (active jobs, lingering worktrees, dirty tree from prior session), reconcile before claiming new work.
117
+
118
+ ## Session close (mandatory)
119
+
120
+ Per v3 → At Session End — Mandatory Handoff. Auto-mode addenda:
121
+
122
+ 1. Confirm `sp ps` empty (0 running, 0 waiting) and `git worktree list` shows only main.
123
+ 2. Drop accumulated stashes if they contain only known noise.
124
+ 3. Memory gate at session level: `bd kv set "memory-gate-done:<session-id>" "saved: <durable-insights>"` (or `"nothing novel: <reason>"`).
125
+ 4. Run `/session-close-report` skill — fills the canonical report template, drives CHANGELOG sync, commit, push.
126
+
127
+ ## Telltale signs you're drifting from auto-mode discipline
128
+
129
+ - Skipping bead contracts because "the change is small" → write it anyway, costs nothing, downstream specialist needs it.
130
+ - Polling `sp ps` more often than v3's sleep table → wastes context.
131
+ - Letting executors run >2× expected without `sp feed` → blocking future work.
132
+ - Accepting reviewer FAIL without cited evidence in the rebuttal → produces longer loops.
133
+ - Manually editing a config file because "the executor would just do the same thing" → breaks v3 hard rule 13 (orchestrator never edits code), no audit trail.
134
+ - Skipping `bun run build` after src/ change → next chain's smoke fails for unrelated reason.
135
+ - Closing the parent bead before chain memory is saved → loses durable insight forever.
136
+
137
+ Read once at session start. Re-read if you catch yourself drifting.
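The pre-merge hygiene section above distinguishes a staged `.beads/issues.jsonl` (`M ` in `git status -s`) from an unstaged one (` M`). The sketch below shows one way to tell the two apart from a single status line. It relies only on Git's short-format convention that column 1 is the index status and column 2 is the worktree status. `classify_porcelain_line` is a hypothetical helper name for illustration and is not part of `sp`.

```bash
# Classify one line of `git status -s` / `git status --porcelain` output.
# Column 1 = index (staged) status, column 2 = worktree (unstaged) status.
# classify_porcelain_line is a hypothetical helper, not an `sp` command.
classify_porcelain_line() {
  case "$1" in
    '??'*) echo "untracked" ;;        # file is not tracked at all
    ' '?*) echo "unstaged" ;;         # blank index column: change only in worktree
    ?' '*) echo "staged" ;;           # blank worktree column: change only in index
    *)     echo "staged+unstaged" ;;  # both columns set
  esac
}

classify_porcelain_line "M  .beads/issues.jsonl"
classify_porcelain_line " M .beads/issues.jsonl"
```

A `staged` result for `.beads/issues.jsonl` is the case where `git restore --staged` is needed before `git checkout --` has any effect on the merge preflight.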
@@ -633,36 +633,29 @@ Release helper contract:
633
633
 
634
634
  ## Epic Lifecycle
635
635
 
636
- Epics are merge-gated identities with a persisted state machine:
636
+ Epics are merge-gated identities with a **derived** readiness model. State is computed live from chain readiness; only `merged` and `abandoned` are persisted as terminal markers.
637
637
 
638
- ```text
639
- open -> resolving -> merge_ready -> merged
640
- -> failed
641
- -> abandoned
642
- ```
638
+ | Derived state | Meaning | Per-chain merge | Batch merge |
639
+ | --- | --- | :---: | :---: |
640
+ | `blocked` | At least one chain is pending or has no reviewer verdict. | — | No |
641
+ | `failed` | At least one chain has a failed reviewer verdict. | per-chain on PASS chains | No |
642
+ | `merge_ready` | All chains pass and no jobs are active. | Yes | Yes via `sp epic merge` |
643
+ | `merged` | (persisted) Publication complete. | — | — |
644
+ | `abandoned` | (persisted) Cancelled without merge. | — | — |
643
645
 
644
- | State | Meaning | Chains mergeable? |
645
- | --- | --- | --- |
646
- | `open` | Epic created, chains not yet running. | No |
647
- | `resolving` | Chains actively running. | No |
648
- | `merge_ready` | All chains terminal, reviewer PASS, tsc gate passes. | Yes via `sp epic merge` |
649
- | `merged` | Publication complete. | — |
650
- | `failed` | One or more chains failed. | Resolve or abandon. |
651
- | `abandoned` | Cancelled without merge. | — |
652
-
653
- Operator transitions:
646
+ Operator commands:
654
647
 
655
648
  ```bash
656
- sp epic resolve <epic-id> # open -> resolving (marks epic as merge-ready target)
657
- sp epic merge <epic-id> # merge_ready -> merged (canonical publication)
649
+ sp epic merge <epic-id> # publish all chains; auto-runs finalize preflight
658
650
  sp epic merge <epic-id> --pr # PR mode (publish via pull request)
659
- sp epic sync <epic-id> --apply # reconcile DB vs live job state when stuck
651
+ sp epic sync <epic-id> --apply # recompute derived readiness; repair drift
660
652
  sp epic abandon <epic-id> --reason <t> # terminal close for unrecoverable epic
661
653
  sp epic abandon <epic-id> --reason <t> --force # force when active pointers still exist
662
654
  ```
663
655
 
664
- `sp merge <chain>` refuses if the chain belongs to an unresolved epic. Use
665
- `sp epic merge` for epic-owned chains.
656
+ `sp merge <chain>` is allowed for any PASS chain regardless of sibling-epic state. Use `sp epic merge` only when batching all epic chains together. `sp finalize <chain>` is a manual recovery step for keep-alive executors that reached PASS but did not auto-finalize.
657
+
658
+ `sp epic resolve` was removed. There is no operator-driven open→resolving transition anymore; readiness is recomputed on every read.
666
659
 
667
660
  ## Concurrency And Force Flags
668
661
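The derived readiness table in the Epic Lifecycle hunk above can be read as a small pure function over chain verdicts. The sketch below is an illustration under stated assumptions, not the logic `sp` actually runs: `derive_epic_state` is a hypothetical helper, the verdict labels `PASS`, `FAIL`, and `PENDING` are assumed, and the precedence (`failed` wins over `blocked`; active jobs or an empty chain list yield `blocked`) is a guess the table does not confirm. The persisted terminal states `merged` and `abandoned` are out of scope here.

```bash
# Usage: derive_epic_state <active-job-count> <verdict>...
# Hypothetical helper; verdict labels and precedence are assumptions.
derive_epic_state() {
  active_jobs="$1"
  shift
  has_fail=0
  has_pending=0
  for verdict in "$@"; do
    case "$verdict" in
      PASS) ;;
      FAIL) has_fail=1 ;;
      *)    has_pending=1 ;;   # anything else counts as "no verdict yet"
    esac
  done

  if [ "$has_fail" -eq 1 ]; then
    echo "failed"              # assumption: failed takes precedence over blocked
  elif [ "$has_pending" -eq 1 ] || [ "$active_jobs" -gt 0 ] || [ "$#" -eq 0 ]; then
    echo "blocked"             # pending chain, running job, or no chains at all
  else
    echo "merge_ready"         # all chains PASS and nothing is running
  fi
}

derive_epic_state 0 PASS PASS
derive_epic_state 0 PASS FAIL
```

Because the state is a function of current inputs, there is nothing for an operator to transition by hand, which matches the removal of `sp epic resolve`.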