xtrm-tools 0.7.14 → 0.7.16

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -13,6 +13,22 @@ synced_at: 236ca5e6
13
13
 
14
14
  > Source of truth: `src/specialist/schema.ts` | Runtime: `src/specialist/runner.ts`
15
15
 
16
+
17
+ ## Canonical References
18
+
19
+ When a custom specialist needs a standard rule or skill, reference the canonical asset by name instead of copying its file into the repo. Runtime/package fallback resolves canonical mandatory rules and skills when no project-local override exists.
20
+
21
+ Example:
22
+
23
+ ```json
24
+ {
25
+ "mandatory_rules": { "template_sets": ["serena-cheatsheet"] },
26
+ "skills": { "paths": ["releasing"] }
27
+ }
28
+ ```
29
+
30
+ Only create project-local copies when intentionally changing canonical behavior. After setting references, run `sp config show <name> --resolved` to verify the resolved runtime surface.
31
+
16
32
  ---
17
33
 
18
34
  ## ACTION REQUIRED BEFORE ANYTHING ELSE
@@ -143,6 +143,50 @@ sp db stats --with-payload --format json \
143
143
  end'
144
144
  ```
145
145
 
146
+ ## Recipe 7 — payload component breakdown per specialist
147
+
148
+ **Truth source first.** The actual prompt size billed by the API is the first turn's `input_tokens` from `token_trajectory_json[0]`. Use it as the ground truth — `payload_breakdown` events undercount (tool definitions and harness framing are not captured) and historical rows before the rule N× fix overcount mandatory_rule by attached-rule count.
149
+
150
+ ```bash
151
+ DB=.specialists/db/observability.db
152
+ sqlite3 "$DB" "SELECT specialist, model, AVG(json_extract(token_trajectory_json, '\$[0].token_usage.input_tokens')) AS avg_first_in, COUNT(*) AS n FROM specialist_job_metrics WHERE token_trajectory_json IS NOT NULL AND status='done' GROUP BY specialist, model ORDER BY avg_first_in DESC"
153
+ ```
154
+
155
+ Use this number for cost decisions. Use `payload_breakdown` only for *relative* component analysis (which knob to tune), not absolute sizing.
156
+
157
+ `sp db stats --with-payload` only surfaces total `payload_kb` / `payload_tokens`. To audit *what* fills the prompt (system_prompt vs mandatory rules vs skills vs bead_context vs memory), query `payload_breakdown` events directly. Use this for eager-load bloat investigations, prompt/rule consolidation planning, or duplication hunts — but cross-check against the truth source above.
158
+
159
+ ```bash
160
+ DB=.specialists/db/observability.db
161
+ sqlite3 "$DB" "SELECT specialist, event_json FROM specialist_events WHERE type='payload_breakdown' GROUP BY specialist ORDER BY t DESC" \
162
+ | python3 -c '
163
+ import json, sys
164
+ rows = []
165
+ for line in sys.stdin:
166
+ if "|" not in line: continue
167
+ spec, js = line.split("|", 1)
168
+ d = json.loads(js)
169
+ agg = {}
170
+ for c in d["payload_breakdown"]["components"]:
171
+ a = agg.setdefault(c["kind"], {"tokens":0,"n":0})
172
+ a["tokens"] += c["tokens"]; a["n"] += 1
173
+ rows.append((spec, d["payload_breakdown"]["totals"]["tokens"], agg))
174
+ rows.sort(key=lambda r: -r[1])
175
+ print(f"{\"specialist\":<22}{\"total\":>8}{\"rules\":>8}{\"rules_n\":>8}{\"sys\":>8}{\"skills\":>8}{\"bead\":>8}{\"mem\":>8}")
176
+ for s, t, a in rows:
177
+ g = lambda k: a.get(k, {"tokens":0,"n":0})
178
+ print(f"{s:<22}{t:>8}{g(\"mandatory_rule\")[\"tokens\"]:>8}{g(\"mandatory_rule\")[\"n\"]:>8}{g(\"system_prompt\")[\"tokens\"]:>8}{g(\"skill\")[\"tokens\"]:>8}{g(\"bead_context\")[\"tokens\"]:>8}{g(\"memory\")[\"tokens\"]:>8}")
179
+ '
180
+ ```
181
+
182
+ Component kinds: `system_prompt`, `mandatory_rule` (one event entry per attached rule), `skill` (path reference, ~10 tokens — bodies are loaded on demand, not eagerly), `task_template`, `bead_context`, `memory`.
183
+
184
+ Optimization signals:
185
+ - `mandatory_rule` total dominates: audit wrapper inflation by comparing `bytes` per rule in the event vs `wc -c config/mandatory-rules/<id>.md`. Mismatch >5x means a wrapper or richer source is adding hidden cost — investigate `formatMandatoryRulesBlock` and `parseMandatoryRulesFrontmatter`.
186
+ - `skill` total small (always): skills are reference-only at startup; inlining skill bodies into rules saves nothing.
187
+ - `bead_context` huge: the bead description is bloated — orchestrator should write more concise contracts.
188
+ - `memory` huge: stale or noisy memories — run `bd memories` cleanup or consolidation.
189
+
146
190
  ## References
147
191
 
148
192
  - `docs/observability-metrics.md`
@@ -25,6 +25,20 @@ bd update <id> --claim # claim before any edit
25
25
 
26
26
  > Use `bv --robot-next` for the single top pick. Use `bv --robot-triage --format toon` to save context tokens. **Never run bare `bv` — it launches an interactive TUI.**
27
27
 
28
+ ---
29
+
30
+ ## Current xt Command Surfaces
31
+
32
+ Use these command surfaces when the task is operational rather than code-editing:
33
+
34
+ | Need | Command | Notes |
35
+ |------|---------|-------|
36
+ | Refresh xtrm-managed skills/hooks/reports in one repo | `xt update --apply` | Default `xt update` is dry-run; `--apply` writes. |
37
+ | Refresh many repos | `xt update --apply --root <dir>` | Discovers repos with `.xtrm/registry.json`; failures are reported per repo. |
38
+ | Cut a release | `xt release prepare --patch` then `xt release publish` | `prepare` drafts from xt reports; `publish` tags/pushes. If `prepare` fails on changelog script compatibility, check specialists `unitAI-dnmcg` state and use the manual fallback in `/releasing`. |
39
+ | Close a session report | update latest same-day `.xtrm/reports/<date>-*.md` | `session-close-report` prefers one same-day SSOT handoff; do not create duplicate reports unless asked. |
40
+
41
+
28
42
  ---
29
43
 
30
44
  ## Trigger Patterns
package/CHANGELOG.md CHANGED
@@ -7,6 +7,16 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
7
7
 
8
8
  ---
9
9
 
10
+ ## [0.7.16] - 2026-05-05
11
+
12
+ ### Fixed
13
+ - `xt update` and `xt install` now repair a broken `.xtrm/skills/default` symlink before running the registry install. Previously only `xt init` repaired stale dev-mode symlinks, so updates failed on machines where the legacy symlink target no longer existed. The npm package root is always the source.
14
+
15
+ ## [0.7.15] - 2026-05-05
16
+
17
+ ### Changed
18
+ - Updated `using-xtrm` and `docs/XTRM-GUIDE.md` to document `xt update`, `xt release prepare/publish`, and same-day SSOT session report behavior.
19
+
10
20
  ## [0.7.14] - 2026-05-05
11
21
 
12
22
  ### Added
@@ -55493,6 +55493,14 @@ async function runInstall(opts = {}) {
55493
55493
  console.log(kleur_default.bold("\n \u2699 xtrm install (.xtrm registry scaffold)"));
55494
55494
  console.log(kleur_default.dim(` \u2022 registry: ${registryPath}`));
55495
55495
  console.log(kleur_default.dim(` \u2022 target: ${userXtrmDir}`));
55496
+ const scaffoldResult = await scaffoldSkillsDefaultFromPackage({
55497
+ packageRoot,
55498
+ userXtrmDir,
55499
+ dryRun
55500
+ });
55501
+ if (scaffoldResult === "copy") {
55502
+ console.log(kleur_default.dim(" \u2022 Repaired .xtrm/skills/default from package payload"));
55503
+ }
55496
55504
  const stats = await installFromRegistry({
55497
55505
  packageRoot,
55498
55506
  registry: registry2,