xtrm-tools 0.7.15 → 0.7.16
- package/.xtrm/registry.json +408 -408
- package/.xtrm/skills/default/using-kpi/SKILL.md +44 -0
- package/CHANGELOG.md +5 -0
- package/cli/dist/index.cjs +8 -0
- package/cli/dist/index.cjs.map +1 -1
- package/cli/package.json +1 -1
- package/package.json +1 -1
- package/packages/pi-extensions/package.json +1 -1
package/.xtrm/skills/default/using-kpi/SKILL.md
CHANGED
@@ -143,6 +143,50 @@ sp db stats --with-payload --format json \
 end'
 ```
 
+## Recipe 7 — payload component breakdown per specialist
+
+**Truth source first.** The actual prompt size billed by the API is the first turn's `input_tokens` from `token_trajectory_json[0]`. Use it as the ground truth — `payload_breakdown` events undercount (tool definitions and harness framing are not captured) and historical rows before the rule N× fix overcount mandatory_rule by attached-rule count.
+
+```bash
+DB=.specialists/db/observability.db
+sqlite3 "$DB" "SELECT specialist, model, AVG(json_extract(token_trajectory_json, '\$[0].token_usage.input_tokens')) AS avg_first_in, COUNT(*) AS n FROM specialist_job_metrics WHERE token_trajectory_json IS NOT NULL AND status='done' GROUP BY specialist, model ORDER BY avg_first_in DESC"
+```
+
+Use this number for cost decisions. Use `payload_breakdown` only for *relative* component analysis (which knob to tune), not absolute sizing.
+
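The "truth source" query above can also be sketched in Python, which may be easier to extend than the `sqlite3` one-liner. This is a minimal sketch, not the package's implementation: the table and column names are taken from the recipe, while the in-memory fixture, the sample specialists, and the function name `avg_first_turn_input` are hypothetical. It parses the trajectory JSON in Python instead of calling `json_extract`, and it skips jobs whose first turn has no `input_tokens`, so its counts can differ slightly from the SQL `COUNT(*)`.

```python
import json
import sqlite3
from collections import defaultdict


def avg_first_turn_input(conn):
    """Average first-turn input_tokens per (specialist, model), largest first."""
    sums = defaultdict(lambda: [0, 0])  # (specialist, model) -> [token_sum, job_count]
    query = (
        "SELECT specialist, model, token_trajectory_json "
        "FROM specialist_job_metrics "
        "WHERE token_trajectory_json IS NOT NULL AND status = 'done'"
    )
    for specialist, model, trajectory in conn.execute(query):
        turns = json.loads(trajectory)
        if not turns:
            continue  # no turns recorded for this job
        first_in = turns[0].get("token_usage", {}).get("input_tokens")
        if first_in is None:
            continue  # first turn carries no usage data
        acc = sums[(specialist, model)]
        acc[0] += first_in
        acc[1] += 1
    rows = [(s, m, total / n, n) for (s, m), (total, n) in sums.items()]
    return sorted(rows, key=lambda r: -r[2])


# Hypothetical in-memory fixture; only the columns the recipe reads are modelled.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE specialist_job_metrics "
    "(specialist TEXT, model TEXT, status TEXT, token_trajectory_json TEXT)"
)
sample = [
    ("reviewer", "model-a", "done", json.dumps([{"token_usage": {"input_tokens": 12000}}])),
    ("reviewer", "model-a", "done", json.dumps([{"token_usage": {"input_tokens": 14000}}])),
    ("planner", "model-a", "done", json.dumps([{"token_usage": {"input_tokens": 30000}}])),
    ("planner", "model-a", "error", json.dumps([{"token_usage": {"input_tokens": 99999}}])),
]
conn.executemany("INSERT INTO specialist_job_metrics VALUES (?, ?, ?, ?)", sample)

for specialist, model, avg_in, n in avg_first_turn_input(conn):
    print(f"{specialist:<12}{model:<10}{avg_in:>10.0f}{n:>4}")
```

To run it against real data, replace the fixture with `sqlite3.connect(".specialists/db/observability.db")`, assuming the database sits at the path the recipe uses.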
+`sp db stats --with-payload` only surfaces total `payload_kb` / `payload_tokens`. To audit *what* fills the prompt (system_prompt vs mandatory rules vs skills vs bead_context vs memory), query `payload_breakdown` events directly. Use this for eager-load bloat investigations, prompt/rule consolidation planning, or duplication hunts — but cross-check against the truth source above.
+
+```bash
+DB=.specialists/db/observability.db
+sqlite3 "$DB" "SELECT specialist, event_json FROM specialist_events WHERE type='payload_breakdown' GROUP BY specialist ORDER BY t DESC" \
+| python3 -c '
+import json, sys
+rows = []
+for line in sys.stdin:
+    if "|" not in line: continue
+    spec, js = line.split("|", 1)
+    d = json.loads(js)
+    agg = {}
+    for c in d["payload_breakdown"]["components"]:
+        a = agg.setdefault(c["kind"], {"tokens":0,"n":0})
+        a["tokens"] += c["tokens"]; a["n"] += 1
+    rows.append((spec, d["payload_breakdown"]["totals"]["tokens"], agg))
+rows.sort(key=lambda r: -r[1])
+print(f"{\"specialist\":<22}{\"total\":>8}{\"rules\":>8}{\"rules_n\":>8}{\"sys\":>8}{\"skills\":>8}{\"bead\":>8}{\"mem\":>8}")
+for s, t, a in rows:
+    g = lambda k: a.get(k, {"tokens":0,"n":0})
+    print(f"{s:<22}{t:>8}{g(\"mandatory_rule\")[\"tokens\"]:>8}{g(\"mandatory_rule\")[\"n\"]:>8}{g(\"system_prompt\")[\"tokens\"]:>8}{g(\"skill\")[\"tokens\"]:>8}{g(\"bead_context\")[\"tokens\"]:>8}{g(\"memory\")[\"tokens\"]:>8}")
+'
+```
+
+Component kinds: `system_prompt`, `mandatory_rule` (one event entry per attached rule), `skill` (path reference, ~10 tokens — bodies are loaded on demand, not eagerly), `task_template`, `bead_context`, `memory`.
+
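The per-kind aggregation in the recipe can be written as a standalone script, which makes it easier to test than a `python3 -c` one-liner. This is a sketch under assumptions: the event shape (`payload_breakdown.components[].kind`, `.tokens`, and `payload_breakdown.totals.tokens`) is inferred from the recipe, and the sample event, its token counts, and the function names are hypothetical. It also keeps quotes out of the f-string replacement fields, since the backslash-escaped quotes in the one-liner are likely to be rejected by the Python parser.

```python
import json


def aggregate_components(event):
    """Sum tokens and count entries per component kind for one payload_breakdown event."""
    agg = {}
    for comp in event["payload_breakdown"]["components"]:
        slot = agg.setdefault(comp["kind"], {"tokens": 0, "n": 0})
        slot["tokens"] += comp["tokens"]
        slot["n"] += 1
    return agg


def format_row(specialist, total, agg):
    """Render one table row in the same column order as the recipe's header."""
    def tokens(kind):
        return agg.get(kind, {"tokens": 0, "n": 0})["tokens"]

    rules_n = agg.get("mandatory_rule", {"n": 0})["n"]
    return (
        f"{specialist:<22}{total:>8}{tokens('mandatory_rule'):>8}{rules_n:>8}"
        f"{tokens('system_prompt'):>8}{tokens('skill'):>8}"
        f"{tokens('bead_context'):>8}{tokens('memory'):>8}"
    )


# Hypothetical event; real events come from specialist_events.event_json.
sample_event = json.loads("""
{"payload_breakdown": {"totals": {"tokens": 5230}, "components": [
  {"kind": "system_prompt", "tokens": 1200},
  {"kind": "mandatory_rule", "tokens": 1800},
  {"kind": "mandatory_rule", "tokens": 2100},
  {"kind": "skill", "tokens": 10},
  {"kind": "bead_context", "tokens": 120}]}}
""")

agg = aggregate_components(sample_event)
header = ("specialist", "total", "rules", "rules_n", "sys", "skills", "bead", "mem")
print(f"{header[0]:<22}" + "".join(f"{h:>8}" for h in header[1:]))
print(format_row("reviewer", sample_event["payload_breakdown"]["totals"]["tokens"], agg))
```

Kinds that never appear in an event, such as `memory` in this sample, fall back to zero rather than raising a `KeyError`.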
+Optimization signals:
+- `mandatory_rule` total dominates: audit wrapper inflation by comparing `bytes` per rule in the event vs `wc -c config/mandatory-rules/<id>.md`. Mismatch >5x means a wrapper or richer source is adding hidden cost — investigate `formatMandatoryRulesBlock` and `parseMandatoryRulesFrontmatter`.
+- `skill` total small (always): skills are reference-only at startup; inlining skill bodies into rules saves nothing.
+- `bead_context` huge: the bead description is bloated — orchestrator should write more concise contracts.
+- `memory` huge: stale or noisy memories — run `bd memories` cleanup or consolidation.
+
 ## References
 
 - `docs/observability-metrics.md`
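The first optimization signal in the hunk above, comparing each rule's `bytes` in the event against the size of its source file, could be automated roughly as follows. This is a sketch only: the mapping from rule id to event bytes is assumed to have been extracted already (the diff does not show which field identifies a rule), and `inflated_rules`, the temporary directory, and the sample sizes are hypothetical. The 5x threshold comes from the recipe.

```python
import os
import tempfile


def inflated_rules(event_bytes_by_rule, rules_dir, threshold=5.0):
    """Return (rule_id, event_bytes, file_bytes, ratio) for rules above the threshold."""
    flagged = []
    for rule_id, event_bytes in event_bytes_by_rule.items():
        path = os.path.join(rules_dir, f"{rule_id}.md")
        if not os.path.isfile(path):
            continue  # source file missing; nothing to compare against
        file_bytes = os.path.getsize(path)  # same number `wc -c` reports
        if file_bytes == 0:
            continue  # avoid dividing by zero on empty rule files
        ratio = event_bytes / file_bytes
        if ratio > threshold:
            flagged.append((rule_id, event_bytes, file_bytes, ratio))
    return sorted(flagged, key=lambda r: -r[3])


# Hypothetical fixture: two rule files of 100 bytes each in a temp directory.
with tempfile.TemporaryDirectory() as rules_dir:
    for rule_id in ("rule-a", "rule-b"):
        with open(os.path.join(rules_dir, f"{rule_id}.md"), "wb") as fh:
            fh.write(b"x" * 100)
    flagged = inflated_rules({"rule-a": 120, "rule-b": 900}, rules_dir)
    for rule_id, event_bytes, file_bytes, ratio in flagged:
        print(f"{rule_id}: event={event_bytes}B file={file_bytes}B ratio={ratio:.1f}x")
```

In a real checkout, `rules_dir` would presumably be `config/mandatory-rules`, the directory named in the recipe.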
package/CHANGELOG.md
CHANGED
@@ -7,6 +7,11 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 
 ---
 
+## [0.7.16] - 2026-05-05
+
+### Fixed
+- `xt update` and `xt install` now repair a broken `.xtrm/skills/default` symlink before running the registry install. Previously only `xt init` repaired stale dev-mode symlinks, so updates failed on machines where the legacy symlink target no longer existed. The npm package root is always the source.
+
 ## [0.7.15] - 2026-05-05
 
 ### Changed
package/cli/dist/index.cjs
CHANGED
@@ -55493,6 +55493,14 @@ async function runInstall(opts = {}) {
   console.log(kleur_default.bold("\n \u2699 xtrm install (.xtrm registry scaffold)"));
   console.log(kleur_default.dim(` \u2022 registry: ${registryPath}`));
   console.log(kleur_default.dim(` \u2022 target: ${userXtrmDir}`));
+  const scaffoldResult = await scaffoldSkillsDefaultFromPackage({
+    packageRoot,
+    userXtrmDir,
+    dryRun
+  });
+  if (scaffoldResult === "copy") {
+    console.log(kleur_default.dim(" \u2022 Repaired .xtrm/skills/default from package payload"));
+  }
   const stats = await installFromRegistry({
     packageRoot,
     registry: registry2,
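The body of `scaffoldSkillsDefaultFromPackage` is not part of this diff, so the following Python sketch only illustrates the behaviour the changelog describes: detect a dangling `skills/default` symlink and replace it with a copy from the package root. Everything here is an assumption made for illustration, including the directory layout, the `"skip"` return value, and the function name `repair_skills_default`; only the `"copy"` result and the `packageRoot` / `userXtrmDir` / `dryRun` parameters appear in the hunk above. It assumes a POSIX system where `os.symlink` is available.

```python
import os
import shutil
import tempfile


def repair_skills_default(package_root, user_xtrm_dir, dry_run=False):
    """Replace a dangling skills/default symlink with a copy from the package.

    Returns "copy" when a repair was (or would be) made, else "skip".
    """
    source = os.path.join(package_root, ".xtrm", "skills", "default")
    target = os.path.join(user_xtrm_dir, "skills", "default")
    # Dangling symlink: islink() is true, but exists() follows the link and fails.
    broken = os.path.islink(target) and not os.path.exists(target)
    if not broken:
        return "skip"
    if not dry_run:
        os.unlink(target)  # remove the stale link itself, not its target
        shutil.copytree(source, target)
    return "copy"


# Hypothetical fixture: a package payload plus a project whose link points nowhere.
with tempfile.TemporaryDirectory() as tmp:
    package_root = os.path.join(tmp, "pkg")
    user_xtrm_dir = os.path.join(tmp, "project", ".xtrm")
    os.makedirs(os.path.join(package_root, ".xtrm", "skills", "default"))
    os.makedirs(os.path.join(user_xtrm_dir, "skills"))
    os.symlink(os.path.join(tmp, "gone"), os.path.join(user_xtrm_dir, "skills", "default"))
    first = repair_skills_default(package_root, user_xtrm_dir)
    second = repair_skills_default(package_root, user_xtrm_dir)
    print(first, second)
```

Running the repair before the registry install, as the hunk does, means the install step never has to deal with a link whose target has disappeared.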