slice-tournament-zoo 0.5.7 → 0.7.1

package/README.md CHANGED
@@ -28,6 +28,7 @@
28
28
 
29
29
  - [Requirements](#requirements)
30
30
  - [Install](#install)
31
+ - [Updating](#updating)
31
32
  - [Use](#use)
32
33
  - [Example commands and workflows](#example-commands-and-workflows)
33
34
  - [Uninstall](#uninstall)
@@ -43,6 +44,16 @@
43
44
  - No database, no vector service, no API keys beyond what Claude Code already
44
45
  uses for its subagents.
45
46
 
47
+ > **Token cost.** A tournament is deliberately redundant: every slice runs *N*
48
+ > specimens in parallel, a judge casts multiple votes per pair, and a multi-slice
49
+ > GRPO project repeats that across the DAG. That buys selection pressure and an
50
+ > auditable trail — but it is **token-intensive**, far more than a single-agent
51
+ > run. Budget accordingly (tune `n`, `votesPerPair`, and `traceTier` down for
52
+ > cheaper runs), and consider installing token-efficiency companion plugins
53
+ > alongside STZ: **Caveman** (compressed responses), **RTK** (token-optimized CLI
54
+ > proxy), **Headroom**, and **CodeSight**. They reduce the per-call overhead the
55
+ > tournament multiplies.
56
+
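To make the multiplication concrete, here is a rough call-count sketch. It is a hypothetical cost model, not STZ's own accounting: it assumes one agent call per specimen and one judge call per vote, treats every specimen as a gate passer (so it is an upper bound on judging), and ignores retry/replan rounds. Only `n` and `votesPerPair` are names taken from the text above; `slices`, `TournamentShape`, and `estimateCalls` are made up for illustration.

```typescript
// Hypothetical cost model: just the arithmetic implied by the note above.
interface TournamentShape {
  slices: number;       // slices in the project DAG (assumed name)
  n: number;            // specimens per slice
  votesPerPair: number; // judge votes cast per specimen pair
}

// Upper bound on model calls: every specimen runs once, and every unordered
// pair of specimens is judged votesPerPair times (assumes all specimens pass).
function estimateCalls({ slices, n, votesPerPair }: TournamentShape): number {
  const pairs = (n * (n - 1)) / 2;
  return slices * (n + pairs * votesPerPair);
}

const full = estimateCalls({ slices: 5, n: 4, votesPerPair: 3 });
const lean = estimateCalls({ slices: 5, n: 2, votesPerPair: 1 });
console.log(full, lean); // 110 15
```

Under these assumptions, dropping from four specimens and three votes to two specimens and one vote cuts the call count by roughly a factor of seven, which is why `n` and `votesPerPair` are the first knobs to turn.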
46
57
  ## Install
47
58
 
48
59
  STZ installs two ways: as a global CLI via **npm**, or as a **Claude Code
@@ -89,6 +100,44 @@ needed (Node.js 20+ is the only requirement; the bundled copy fetches `tsx` via
89
100
  > Developing STZ itself, or running the engine without Claude Code? See
90
101
  > [`docs/development/local-and-testing.md`](https://github.com/dr-robert-li/slice-tournament-zoo/blob/main/docs/development/local-and-testing.md).
91
102
 
103
+ ## Updating
104
+
105
+ STZ ships through two channels that update independently — the **npm CLI** and
106
+ the **Claude Code plugin**. Keep them on the same version so the `/stz:*`
107
+ commands and the `stz` you call by hand agree.
108
+
109
+ ```bash
110
+ stz --version # what you have
111
+ stz update # check npm for a newer release + plugin/CLI drift
112
+ stz update --check # same, as JSON (CI-friendly; exits non-zero if action needed)
113
+ ```
114
+
115
+ `stz update` does not self-install (it never runs `npm`/`/plugin` behind your
116
+ back); it checks the npm registry, compares against your installed version, and
117
+ prints the exact commands to run. When a plugin manifest is reachable — i.e.
118
+ `CLAUDE_PLUGIN_ROOT` is set (as in a Claude Code session) or you run from a repo
119
+ checkout — it also reports **drift** between the CLI and the plugin's bundled
120
+ engine. The commands it prints look like this:
121
+
122
+ ```bash
123
+ npm i -g slice-tournament-zoo@latest # update the CLI
124
+ /plugin update stz # update the plugin (inside Claude Code)
125
+ ```
126
+
127
+ After updating the engine, bring an **existing project's `.stz/` tree** up to the
128
+ current taxonomy schema. Engine updates never touch a scaffolded project on their
129
+ own, so a tree created by an older STZ can fall behind:
130
+
131
+ ```bash
132
+ stz migrate # additive + backed-up; no-op if already current
133
+ ```
134
+
135
+ `migrate` is safe by construction: it only *creates* missing tiers (never
136
+ deletes or renames), and copies the prior tree to a `.stz.bak-schema<N>/` sibling
137
+ before any change. Each `.stz/` carries a `manifest.json` stamped with the STZ
138
+ version and schema version so drift is detectable. Pass `--no-backup` to skip the
139
+ copy.
140
+
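For CI, the JSON from `stz update --check` can be consumed directly. The sketch below assumes the output carries the `installed`, `latest`, `stale`, `drift`, and `commands` fields (the names come from the verdict type in `src/update.ts` in this release). The `summarize` helper and the sample payload are hypothetical, written by hand for illustration rather than captured from a real run.

```typescript
// Subset of the verdict fields emitted by `stz update --check`.
interface VerdictSubset {
  installed: string;
  latest: string | null;
  stale: boolean;
  drift: boolean;
  commands: string[];
}

// Hypothetical helper: turn the JSON text into a one-line CI message.
// "Action needed" mirrors the CLI's exit-code rule: stale OR drift.
function summarize(json: string): { actionNeeded: boolean; message: string } {
  const v = JSON.parse(json) as VerdictSubset;
  const actionNeeded = v.stale || v.drift;
  const message = actionNeeded
    ? `STZ ${v.installed} needs attention; run: ${v.commands.join("; ")}`
    : `STZ ${v.installed} is current`;
  return { actionNeeded, message };
}

// Sample payload written by hand for this example.
const sample = JSON.stringify({
  installed: "0.5.7",
  latest: "0.7.1",
  stale: true,
  drift: false,
  commands: ["npm i -g slice-tournament-zoo@latest", "/plugin update stz"],
});
console.log(summarize(sample).message);
```

Note that the second command is typed inside a Claude Code session rather than a shell, so a CI job can surface it but cannot run it.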
92
141
  ## Use
93
142
 
94
143
  ### Scaffold a project
@@ -350,7 +399,8 @@ For contributors and anyone going past day-to-day operation:
350
399
  - **Sealed-suite integrity** — the guide-vs-sensor contract behind the frozen
351
400
  held-out suite: [`docs/development/sealed-suite.md`](https://github.com/dr-robert-li/slice-tournament-zoo/blob/main/docs/development/sealed-suite.md).
352
401
  - **Requirement-to-test mapping** — [`docs/TESTPLAN.md`](https://github.com/dr-robert-li/slice-tournament-zoo/blob/main/docs/TESTPLAN.md).
353
- - **What is real versus deferred** — [`docs/AS-BUILT.md`](https://github.com/dr-robert-li/slice-tournament-zoo/blob/main/docs/AS-BUILT.md).
402
+ - **Roadmap — what is built, deferred, and planned next** —
403
+ [`docs/ROADMAP.md`](https://github.com/dr-robert-li/slice-tournament-zoo/blob/main/docs/ROADMAP.md).
354
404
 
355
405
  ## License
356
406
 
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "slice-tournament-zoo",
3
- "version": "0.5.7",
3
+ "version": "0.7.1",
4
4
  "description": "STZ: a contract-bounded slice pipeline that implements each slice adversarially via an N-specimen tournament with frozen sealed tests, GRPO-style selection, layered anti-reward-hacking, and a replayable markdown audit trail.",
5
5
  "license": "Apache-2.0",
6
6
  "homepage": "https://github.com/dr-robert-li/slice-tournament-zoo#readme",
package/src/README.md CHANGED
@@ -14,6 +14,6 @@ the production path — see [`mock/`](https://github.com/dr-robert-li/slice-tour
14
14
  ## Further reading
15
15
 
16
16
  - The requirement-to-test mapping is in [`docs/TESTPLAN.md`](https://github.com/dr-robert-li/slice-tournament-zoo/blob/main/docs/TESTPLAN.md).
17
- - What is real versus deferred is in [`docs/AS-BUILT.md`](https://github.com/dr-robert-li/slice-tournament-zoo/blob/main/docs/AS-BUILT.md).
17
+ - What is built, deferred, and planned next is in [`docs/ROADMAP.md`](https://github.com/dr-robert-li/slice-tournament-zoo/blob/main/docs/ROADMAP.md).
18
18
  - Running the engine locally / in CI: [`docs/development/local-and-testing.md`](https://github.com/dr-robert-li/slice-tournament-zoo/blob/main/docs/development/local-and-testing.md).
19
19
  - The deterministic bridge CLI: [`docs/development/bridge-cli.md`](https://github.com/dr-robert-li/slice-tournament-zoo/blob/main/docs/development/bridge-cli.md).
package/src/bridge.ts CHANGED
@@ -33,6 +33,7 @@ import type {
33
33
  ProjectPhase,
34
34
  ProjectSliceEntry,
35
35
  RunConfig,
36
+ SpecimenId,
36
37
  } from "./types.js";
37
38
  import { PROJECT_PHASES } from "./types.js";
38
39
  import { scaffold, writeDoc, readDoc, stzPath } from "./taxonomy.js";
@@ -56,6 +57,8 @@ import {
56
57
  defaultRunConfig,
57
58
  } from "./project.js";
58
59
  import { detectHacks } from "./hack-detector.js";
60
+ import { STZ_VERSION, SCHEMA_VERSION, PACKAGE_NAME } from "./version.js";
61
+ import { onNoPassers, type EscalationState } from "./escalation.js";
59
62
  import { evalGate, select, pairings } from "./selection.js";
60
63
  import { diffSpecs, renderSpecDiff, isFaithful, unmatchedIntentIds, mismatchedAsBuiltIds, type Spec } from "./specdiff.js";
61
64
  import { seal, verifySeal, amendSeal, heldOutFiles } from "./seal.js";
@@ -95,6 +98,16 @@ function print(obj: unknown): void {
95
98
  process.stdout.write(JSON.stringify(obj, null, 2) + "\n");
96
99
  }
97
100
 
101
+ /**
102
+ * Report the bundled engine's identity (F19). The `/stz:*` commands and a
103
+ * SessionStart hook call this to compare the plugin's engine against a global
104
+ * `stz` CLI and surface channel drift deterministically (no version parsing
105
+ * from prose).
106
+ */
107
+ function versionCmd(): void {
108
+ print({ version: STZ_VERSION, schemaVersion: SCHEMA_VERSION, packageName: PACKAGE_NAME });
109
+ }
110
+
98
111
  // ── paths within a slice ────────────────────────────────────────────────────
99
112
 
100
113
  const sliceRel = (id: string) => join("40-slices", id);
@@ -236,10 +249,136 @@ function gate(args: Record<string, string>): void {
236
249
  const { root, slice } = args as { root: string; slice: string };
237
250
  const evals = loadEvals(root, slice);
238
251
  const { passers, eliminated } = evalGate(evals);
239
- // Emit the pairing schedule the command must drive with judge agents.
252
+ // Emit the pairing schedule the command must drive with judge agents. `gate`
253
+ // is a pure read — it never advances escalation. When `passers` is empty the
254
+ // command calls `escalate` (below), which owns the state transition; keeping
255
+ // them separate means a re-run of `gate` can't double-advance the FSM.
240
256
  print({ passers, eliminated, pairings: pairings(passers) });
241
257
  }
242
258
 
259
+ /** Build the pressure-log entries: every specimen that is not the winner is a
260
+ * negative exemplar (F9). `winner` is null for a no-passers round (all culled). */
261
+ function culledFromEvals(
262
+ root: string,
263
+ slice: string,
264
+ evals: EvalResult[],
265
+ winner: SpecimenId | null,
266
+ ): CulledSpecimen[] {
267
+ return evals
268
+ .filter((e) => e.specimen !== winner)
269
+ .map((e) => ({
270
+ specimen: e.specimen,
271
+ reason: e.hackFindings.length
272
+ ? `hack: ${e.hackFindings.map((f) => f.pattern).join(",")}`
273
+ : `gate testPassRate=${e.testPassRate.toFixed(2)}`,
274
+ diff: Object.entries(readSpecimenFiles(root, slice, e.specimen))
275
+ .map(([p, c]) => `+++ ${p}\n${c}`)
276
+ .join("\n"),
277
+ critique: "",
278
+ hackFindings: e.hackFindings,
279
+ }));
280
+ }
281
+
282
+ /**
283
+ * Bounded cross-round escalation (F14), driven from the command-level `/stz:run`
284
+ * loop. Call this ONCE after a gate that yielded zero passers. It is the single
285
+ * deterministic owner of "are we allowed another round?": it advances the
286
+ * escalation FSM over `state.json`, persists the new counts, and on retry/replan
287
+ * writes the PDR refinement context the next round's specimens consume — exactly
288
+ * the path the mock orchestrator drives internally, now exposed to the real
289
+ * command so it is not the LLM deciding when to stop.
290
+ *
291
+ * The sealed suite is NOT touched here: retry/replan re-enter the tournament with
292
+ * the SAME frozen suite (the command re-runs `seal-verify` each round). Re-using
293
+ * the FSM's hard ceiling (≤1 retry, ≤1 replan) means even a stray double-call is
294
+ * fail-safe — it halts early, it never loops.
295
+ */
296
+ async function escalateCmd(args: Record<string, string>): Promise<void> {
297
+ const { root, slice } = args as { root: string; slice: string };
298
+ const evals = loadEvals(root, slice);
299
+ let state = await loadState(root, slice);
300
+
301
+ const cur: EscalationState = {
302
+ stage: state.escalation,
303
+ retryCount: state.retryCount,
304
+ replanCount: state.replanCount,
305
+ };
306
+ // The round that just failed (1-based): rounds already consumed + this one.
307
+ const failedRound = cur.retryCount + cur.replanCount + 1;
308
+ const { next, action } = onNoPassers(cur);
309
+ state.escalation = next.stage;
310
+ state.retryCount = next.retryCount;
311
+ state.replanCount = next.replanCount;
312
+ state = appendEvent(state, "judgment", `escalation-${action.type}`, action.note);
313
+
314
+ // The whole field is culled this round (no winner). Persist the pressure log so
315
+ // the negative exemplars are auditable regardless of what comes next (F9).
316
+ const culled = culledFromEvals(root, slice, evals, null);
317
+ await writeDoc(root, join("50-pressure", slice, "pressure.md"), {
318
+ frontmatter: { summary: `Pressure log ${slice}: round ${failedRound}, ${culled.length} culled (no passers).` },
319
+ body: renderPressureLog({ sliceId: slice, culled }),
320
+ });
321
+
322
+ if (action.type === "halt") {
323
+ const report =
324
+ `# Failure report — ${slice}\n\n` +
325
+ `No specimen passed the sealed-suite gate after ${failedRound} round(s) ` +
326
+ `(${next.retryCount} retry, ${next.replanCount} replan). The bounded-escalation ` +
327
+ `budget (≤1 retry, ≤1 replan) is exhausted; halting per F14.\n\n` +
328
+ `## Per-specimen gate outcomes (final round)\n` +
329
+ evals
330
+ .map((e) => {
331
+ const why = e.hackFindings.length
332
+ ? `disqualified — hack: ${e.hackFindings.map((f) => f.pattern).join(", ")}`
333
+ : `gate fail — testPassRate=${e.testPassRate.toFixed(2)}, coverage=${e.coverage.toFixed(2)}, mutation=${e.mutationScore.toFixed(2)}`;
334
+ return `- specimen-${e.specimen}: ${why}`;
335
+ })
336
+ .join("\n") +
337
+ "\n";
338
+ state.failureReport = report;
339
+ state = setPhaseStatus(state, "judgment", "failed");
340
+ await writeDoc(root, join(sliceRel(slice), "failure-report.md"), {
341
+ frontmatter: { summary: `Halt: no passers after ${failedRound} round(s).` },
342
+ body: report,
343
+ });
344
+ await saveState(root, state);
345
+ print({
346
+ action: "halt",
347
+ note: action.note,
348
+ round: failedRound,
349
+ escalation: state.escalation,
350
+ retryCount: state.retryCount,
351
+ replanCount: state.replanCount,
352
+ failureReportPath: stzPath(root, join(sliceRel(slice), "failure-report.md")),
353
+ });
354
+ return;
355
+ }
356
+
357
+ // retry or replan → build the PDR refinement context (F9) from this round's
358
+ // group-relative advantages (no votes: GRPO over the eval rewards alone), the
359
+ // same computation the mock uses (orchestrator select(evals, [])).
360
+ const advantages = select(evals, []).judgment.advantages;
361
+ await writeDoc(root, join("50-pressure", slice, "refinement.md"), {
362
+ frontmatter: { summary: `PDR refinement for ${slice} after round ${failedRound} (${action.type}).` },
363
+ body: refinementContext({ sliceId: slice, culled }, advantages),
364
+ });
365
+ if (action.type === "replan") {
366
+ // Re-enter planning: the command rewrites intent.json before re-spawning.
367
+ state = setPhaseStatus(state, "planning", "running");
368
+ }
369
+ await saveState(root, state);
370
+ print({
371
+ action: action.type,
372
+ note: action.note,
373
+ round: failedRound,
374
+ nextRound: failedRound + 1,
375
+ escalation: state.escalation,
376
+ retryCount: state.retryCount,
377
+ replanCount: state.replanCount,
378
+ refinementPath: stzPath(root, join("50-pressure", slice, "refinement.md")),
379
+ });
380
+ }
381
+
243
382
  function recordVotes(args: Record<string, string>): void {
244
383
  const { root, slice } = args as { root: string; slice: string };
245
384
  const votes = readJSON<PairwiseVote[]>(args.votes!);
@@ -282,19 +421,7 @@ async function finalize(args: Record<string, string>): Promise<void> {
282
421
  : { ranking: [], winner: null, advantages: [], votes: [] };
283
422
 
284
423
  // Pressure log: every non-winning specimen is a negative exemplar (F9).
285
- const culled: CulledSpecimen[] = evals
286
- .filter((e) => e.specimen !== judgment.winner)
287
- .map((e) => ({
288
- specimen: e.specimen,
289
- reason: e.hackFindings.length
290
- ? `hack: ${e.hackFindings.map((f) => f.pattern).join(",")}`
291
- : `gate testPassRate=${e.testPassRate.toFixed(2)}`,
292
- diff: Object.entries(readSpecimenFiles(root, slice, e.specimen))
293
- .map(([p, c]) => `+++ ${p}\n${c}`)
294
- .join("\n"),
295
- critique: "",
296
- hackFindings: e.hackFindings,
297
- }));
424
+ const culled = culledFromEvals(root, slice, evals, judgment.winner);
298
425
  await writeDoc(root, join("50-pressure", slice, "pressure.md"), {
299
426
  frontmatter: { summary: `Pressure log ${slice}: ${culled.length} culled.` },
300
427
  body: renderPressureLog({ sliceId: slice, culled }),
@@ -916,10 +1043,12 @@ export async function runBridge(argv: string[]): Promise<void> {
916
1043
  const [sub, ...rest] = argv;
917
1044
  const args = parseArgs(rest);
918
1045
  switch (sub) {
1046
+ case "version": versionCmd(); break;
919
1047
  case "begin": await begin(args); break;
920
1048
  case "record-eval": recordEval(args); break;
921
1049
  case "eval": evalCmd(args); break;
922
1050
  case "gate": gate(args); break;
1051
+ case "escalate": await escalateCmd(args); break;
923
1052
  case "record-votes": recordVotes(args); break;
924
1053
  case "select": await selectCmd(args); break;
925
1054
  case "finalize": await finalize(args); break;
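The bounded escalation that `escalateCmd` relies on can be pictured as a three-step state machine. The sketch below is an assumption about its shape, not the contents of `src/escalation.ts` (which this diff does not show): the stage names, the notes, and the function name `onNoPassersSketch` are hypothetical. Only the `{ stage, retryCount, replanCount }` state, the `retry` / `replan` / `halt` action types, and the ceiling of one retry and one replan come from the code above.

```typescript
// Hypothetical stage names; the real ones are not visible in this diff.
type Stage = "fresh" | "retried" | "replanned" | "halted";

interface EscalationState {
  stage: Stage;
  retryCount: number;
  replanCount: number;
}

interface Step {
  next: EscalationState;
  action: { type: "retry" | "replan" | "halt"; note: string };
}

// One transition per zero-passer round: retry once, then replan once, then halt.
function onNoPassersSketch(cur: EscalationState): Step {
  if (cur.stage !== "halted" && cur.retryCount < 1) {
    return {
      next: { ...cur, stage: "retried", retryCount: cur.retryCount + 1 },
      action: { type: "retry", note: "first failure: retry with refinement context" },
    };
  }
  if (cur.stage !== "halted" && cur.replanCount < 1) {
    return {
      next: { ...cur, stage: "replanned", replanCount: cur.replanCount + 1 },
      action: { type: "replan", note: "retry failed: re-enter planning" },
    };
  }
  return { next: { ...cur, stage: "halted" }, action: { type: "halt", note: "budget exhausted" } };
}

// Drive the sketch through four consecutive zero-passer rounds.
let state: EscalationState = { stage: "fresh", retryCount: 0, replanCount: 0 };
const actions: string[] = [];
for (let i = 0; i < 4; i++) {
  const step = onNoPassersSketch(state);
  actions.push(step.action.type);
  state = step.next;
}
console.log(actions.join(" -> ")); // retry -> replan -> halt -> halt
```

Because the counters only grow and both are capped at one, an extra call can only reach `halt` sooner. This is the fail-safe property the doc comment on `escalateCmd` describes.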
package/src/cli.ts CHANGED
@@ -3,16 +3,22 @@
3
3
  *
4
4
  * stz init [dir] scaffold the .stz/ taxonomy + AGENTS.md
5
5
  * stz run [dir] run the bundled demo slice through the mock pipeline
6
+ * stz update check npm for a newer release + channel drift (F19)
7
+ * stz migrate [dir] bring an existing .stz/ tree up to the current schema (F19)
8
+ * stz --version
6
9
  * stz help
7
10
  */
8
11
  import { join } from "node:path";
9
- import { writeFile } from "node:fs/promises";
12
+ import { writeFile, readFile } from "node:fs/promises";
10
13
  import { existsSync } from "node:fs";
11
14
  import { scaffold, writeDoc, STZ_DIR, TIERS } from "./taxonomy.js";
12
15
  import { runSlice } from "./mock/orchestrator.js";
13
16
  import { runBridge } from "./bridge.js";
14
17
  import { MockModelLayer, defaultMockConfig } from "./mock/mock.js";
15
18
  import type { SliceManifest } from "./types.js";
19
+ import { STZ_VERSION } from "./version.js";
20
+ import { checkLatest, buildVerdict, formatVerdict } from "./update.js";
21
+ import { writeManifest, migrate } from "./migrate.js";
16
22
 
17
23
  const AGENTS_MD = `# AGENTS.md — STZ table of contents
18
24
 
@@ -51,6 +57,7 @@ const DEMO_MANIFEST: SliceManifest = {
51
57
 
52
58
  async function cmdInit(dir: string): Promise<void> {
53
59
  const created = await scaffold(dir);
60
+ await writeManifest(dir); // F19: stamp the tree so `stz migrate` can detect drift later
54
61
  await writeFile(join(dir, "AGENTS.md"), AGENTS_MD, "utf8");
55
62
  await writeDoc(dir, join("00-intent", "bootstrap.md"), {
56
63
  frontmatter: { summary: "Bootstrap (slice-00): hand-written minimal kernel; STZ produces itself from slice-01 (R7/F18)." },
@@ -59,6 +66,63 @@ async function cmdInit(dir: string): Promise<void> {
59
66
  console.log(`Scaffolded ${STZ_DIR}/ (${TIERS.length} tiers, ${created.length} created) + AGENTS.md at ${dir}`);
60
67
  }
61
68
 
69
+ /**
70
+ * Discover the Claude Code plugin's bundled engine version, for drift detection
71
+ * (F19). The plugin sets `CLAUDE_PLUGIN_ROOT`; fall back to a manifest in cwd so
72
+ * a developer running inside the repo still sees drift. Returns null when no
73
+ * plugin manifest is reachable (a pure npm-CLI user has no second channel).
74
+ */
75
+ async function readPluginVersion(dir: string): Promise<string | null> {
76
+ const roots = [process.env.CLAUDE_PLUGIN_ROOT, dir].filter(Boolean) as string[];
77
+ for (const root of roots) {
78
+ const p = join(root, ".claude-plugin", "plugin.json");
79
+ if (!existsSync(p)) continue;
80
+ try {
81
+ const manifest = JSON.parse(await readFile(p, "utf8")) as { version?: unknown };
82
+ if (typeof manifest.version === "string") return manifest.version;
83
+ } catch {
84
+ // Unreadable/malformed manifest -> treat as "no plugin info", not a crash.
85
+ }
86
+ }
87
+ return null;
88
+ }
89
+
90
+ async function cmdUpdate(): Promise<void> {
91
+ const asJson = process.argv.includes("--check") || process.argv.includes("--json");
92
+ const latest = await checkLatest();
93
+ // Plugin discovery uses the working directory (and CLAUDE_PLUGIN_ROOT), not a
94
+ // positional — `update` takes flags, not a dir, so the operator's cwd is the
95
+ // right place to look for a co-located plugin manifest.
96
+ const pluginVersion = await readPluginVersion(process.cwd());
97
+ const verdict = buildVerdict({
98
+ installed: STZ_VERSION,
99
+ latest: latest.version,
100
+ pluginVersion,
101
+ reason: latest.ok ? undefined : latest.reason,
102
+ });
103
+ if (asJson) {
104
+ console.log(JSON.stringify(verdict, null, 2));
105
+ } else {
106
+ console.log(formatVerdict(verdict));
107
+ }
108
+ // Exit non-zero when action is required, so scripts/CI can gate on it.
109
+ if (verdict.stale || verdict.drift) process.exitCode = 1;
110
+ }
111
+
112
+ async function cmdMigrate(dir: string): Promise<void> {
113
+ const noBackup = process.argv.includes("--no-backup");
114
+ const report = await migrate(dir, { backup: !noBackup });
115
+ if (report.upToDate) {
116
+ console.log(`${STZ_DIR}/ already at schema ${report.toSchema} — nothing to migrate.`);
117
+ return;
118
+ }
119
+ console.log(
120
+ `Migrated ${STZ_DIR}/ schema ${report.fromSchema} → ${report.toSchema} ` +
121
+ `(${report.created.length} tier(s) created).`,
122
+ );
123
+ if (report.backedUpTo) console.log(`Backup of the prior tree: ${report.backedUpTo}`);
124
+ }
125
+
62
126
  async function cmdRun(dir: string): Promise<void> {
63
127
  if (!existsSync(join(dir, STZ_DIR))) await scaffold(dir);
64
128
  const model = new MockModelLayer(defaultMockConfig());
@@ -86,7 +150,10 @@ function cmdHelp(): void {
86
150
  Usage:
87
151
  stz init [dir] scaffold the .stz/ taxonomy + AGENTS.md (default: cwd)
88
152
  stz run [dir] run the bundled demo slice through the mock pipeline
153
+ stz update [--check] check npm for a newer release + plugin/CLI drift
154
+ stz migrate [dir] bring an existing .stz/ tree up to the current schema
89
155
  stz bridge <cmd> deterministic orchestration bridge (used by the /stz:* commands)
156
+ stz --version print the installed version
90
157
  stz help show this help
91
158
 
92
159
  In Claude Code, install the plugin and drive the full pipeline with /stz:new,
@@ -104,6 +171,17 @@ async function main(): Promise<void> {
104
171
  case "run":
105
172
  await cmdRun(dir);
106
173
  break;
174
+ case "update":
175
+ await cmdUpdate();
176
+ break;
177
+ case "migrate":
178
+ await cmdMigrate(dir);
179
+ break;
180
+ case "--version":
181
+ case "-v":
182
+ case "version":
183
+ console.log(STZ_VERSION);
184
+ break;
107
185
  case "bridge":
108
186
  // Deterministic orchestration bridge called by the /stz:run command
109
187
  // between Task-subagent spawns. Everything after "bridge" is its argv.
package/src/migrate.ts ADDED
@@ -0,0 +1,130 @@
1
+ /**
2
+ * Project-scaffold migration (F19, the "B" half of the update pathway).
3
+ *
4
+ * Updating the *engine* (npm CLI / plugin) does not touch a project's on-disk
5
+ * `.stz/` tree. When a new STZ release changes the taxonomy (adds a tier, a
6
+ * manifest field), existing projects silently fall behind. This module stamps
7
+ * every scaffold with a manifest carrying `{stzVersion, schemaVersion}` and
8
+ * provides an **additive, backed-up** migration so an old tree can be brought
9
+ * current without losing anything.
10
+ *
11
+ * Safety contract: migration only ever *creates* missing tiers (it reuses the
12
+ * idempotent `scaffold`, which never deletes) and always copies the prior tree
13
+ * to a sibling backup first. A destructive change (renamed/removed tier) is out
14
+ * of scope by construction — there is no code path here that removes a file.
15
+ */
16
+ import { writeFile, readFile, cp } from "node:fs/promises";
17
+ import { existsSync } from "node:fs";
18
+ import { join } from "node:path";
19
+ import { STZ_DIR, TIERS, scaffold } from "./taxonomy.js";
20
+ import { STZ_VERSION, SCHEMA_VERSION } from "./version.js";
21
+
22
+ /** The on-disk manifest stamped at the root of a `.stz/` tree. */
23
+ export interface StzManifest {
24
+ stzVersion: string;
25
+ schemaVersion: number;
26
+ tiers: string[];
27
+ }
28
+
29
+ const MANIFEST_REL = "manifest.json";
30
+
31
+ /** Absolute path to a project's `.stz/manifest.json`. */
32
+ export function manifestPath(root: string): string {
33
+ return join(root, STZ_DIR, MANIFEST_REL);
34
+ }
35
+
36
+ /** Write (or overwrite) the manifest for the current STZ + schema version. */
37
+ export async function writeManifest(root: string): Promise<StzManifest> {
38
+ const manifest: StzManifest = {
39
+ stzVersion: STZ_VERSION,
40
+ schemaVersion: SCHEMA_VERSION,
41
+ tiers: [...TIERS],
42
+ };
43
+ await writeFile(manifestPath(root), JSON.stringify(manifest, null, 2) + "\n", "utf8");
44
+ return manifest;
45
+ }
46
+
47
+ /**
48
+ * Read the manifest if present. Returns null for a pre-manifest project (an
49
+ * `.stz/` tree scaffolded before F19) so callers can treat it as schema 0.
50
+ */
51
+ export async function readManifest(root: string): Promise<StzManifest | null> {
52
+ const p = manifestPath(root);
53
+ if (!existsSync(p)) return null;
54
+ const parsed = JSON.parse(await readFile(p, "utf8")) as Partial<StzManifest>;
55
+ return {
56
+ stzVersion: typeof parsed.stzVersion === "string" ? parsed.stzVersion : "0.0.0",
57
+ schemaVersion: typeof parsed.schemaVersion === "number" ? parsed.schemaVersion : 0,
58
+ tiers: Array.isArray(parsed.tiers) ? parsed.tiers : [],
59
+ };
60
+ }
61
+
62
+ /** What `migrate` did, for the CLI to report. */
63
+ export interface MigrateReport {
64
+ root: string;
65
+ fromSchema: number;
66
+ toSchema: number;
67
+ /** True when nothing needed doing (already current, all tiers present). */
68
+ upToDate: boolean;
69
+ /** Tiers created by this migration (additive only). */
70
+ created: string[];
71
+ /** Sibling path the prior tree was copied to, or null when no change. */
72
+ backedUpTo: string | null;
73
+ }
74
+
75
+ /** True when every current tier directory already exists under `.stz/`. */
76
+ function allTiersPresent(root: string): boolean {
77
+ return TIERS.every((t) => existsSync(join(root, STZ_DIR, t)));
78
+ }
79
+
80
+ /**
81
+ * Bring an existing `.stz/` tree up to the current schema. Idempotent: a second
82
+ * run on an already-current tree is a no-op (`upToDate: true`, no backup).
83
+ *
84
+ * @throws if there is no `.stz/` tree to migrate (use `stz init` first).
85
+ */
86
+ export async function migrate(
87
+ root: string,
88
+ opts: { backup?: boolean } = {},
89
+ ): Promise<MigrateReport> {
90
+ const backup = opts.backup ?? true;
91
+ if (!existsSync(join(root, STZ_DIR))) {
92
+ throw new Error(`no ${STZ_DIR}/ tree at ${root} — run \`stz init\` first`);
93
+ }
94
+ const current = await readManifest(root);
95
+ const fromSchema = current?.schemaVersion ?? 0;
96
+
97
+ // Already current AND structurally complete -> nothing to do. (We still
98
+ // rewrite a missing manifest below if the schema matched but the stamp was
99
+ // absent, so a pre-manifest tree at the same layout still gets stamped.)
100
+ if (fromSchema === SCHEMA_VERSION && current !== null && allTiersPresent(root)) {
101
+ return {
102
+ root,
103
+ fromSchema,
104
+ toSchema: SCHEMA_VERSION,
105
+ upToDate: true,
106
+ created: [],
107
+ backedUpTo: null,
108
+ };
109
+ }
110
+
111
+ // Back up the prior tree before any additive change.
112
+ let backedUpTo: string | null = null;
113
+ if (backup) {
114
+ backedUpTo = join(root, `${STZ_DIR}.bak-schema${fromSchema}`);
115
+ await cp(join(root, STZ_DIR), backedUpTo, { recursive: true });
116
+ }
117
+
118
+ // Additive only: scaffold creates missing tiers, never removes.
119
+ const created = await scaffold(root);
120
+ await writeManifest(root);
121
+
122
+ return {
123
+ root,
124
+ fromSchema,
125
+ toSchema: SCHEMA_VERSION,
126
+ upToDate: false,
127
+ created,
128
+ backedUpTo,
129
+ };
130
+ }
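The decision `migrate` makes before touching the disk can be restated as a pure function. This is a sketch under stated assumptions, not the module's API: `planMigration`, `ManifestLike`, and `Plan` are hypothetical names, the schema number `1` is an assumed value (the real `SCHEMA_VERSION` is not shown in this diff), and the `.stz` directory name is taken from the README. The tier names are ones that appear in the bridge code.

```typescript
// Pure sketch of the migrate decision; no filesystem access.
interface ManifestLike {
  schemaVersion: number;
}

interface Plan {
  upToDate: boolean;
  missingTiers: string[];
  backupDir: string | null;
}

function planMigration(
  manifest: ManifestLike | null, // null models a pre-manifest tree (treated as schema 0)
  presentTiers: string[],
  currentTiers: string[],
  currentSchema: number,
  backup = true,
): Plan {
  const fromSchema = manifest?.schemaVersion ?? 0;
  const missingTiers = currentTiers.filter((t) => !presentTiers.includes(t));
  // Current only when the stamp exists, the schema matches, and no tier is missing.
  const upToDate =
    manifest !== null && fromSchema === currentSchema && missingTiers.length === 0;
  return {
    upToDate,
    missingTiers,
    backupDir: !upToDate && backup ? `.stz.bak-schema${fromSchema}` : null,
  };
}

const tiers = ["00-intent", "40-slices", "50-pressure"];
const oldTree = planMigration(null, ["00-intent", "40-slices"], tiers, 1);
const freshTree = planMigration({ schemaVersion: 1 }, tiers, tiers, 1);
console.log(oldTree, freshTree);
```

In this sketch an unstamped tree is never reported as up to date, even when every tier is present, which matches the `current !== null` guard in `migrate`: such a tree still gets a backup and a fresh manifest.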
@@ -8,7 +8,7 @@
8
8
  * The model layer is injected (ModelLayer), so this runs identically against
9
9
  * the deterministic mock and a future live Claude Code / Codex implementation.
10
10
  *
11
- * STUBBED vs the full design (logged via the `log` sink, surfaced in AS-BUILT):
11
+ * STUBBED vs the full design (logged via the `log` sink, surfaced in ROADMAP):
12
12
  * - git worktrees per specimen → prototypes/specimen-X/ directories instead.
13
13
  * - per-worktree ephemeral observability stacks → not spun up.
14
14
  * - live Python eval drivers / mutation / PBT → mock EvalRunner.
package/src/update.ts ADDED
@@ -0,0 +1,207 @@
1
+ /**
2
+ * Update pathway (F19): tell the operator whether their STZ is current, and
3
+ * print the exact command(s) to fix it. Two distribution channels mean two
4
+ * things can be stale independently:
5
+ *
6
+ * - the **npm CLI** (`slice-tournament-zoo` on PATH), and
7
+ * - the **Claude Code plugin** (the bundled `stz bridge` the `/stz:*`
8
+ * commands call via `${CLAUDE_PLUGIN_ROOT}`).
9
+ *
10
+ * A "sustainable" pathway therefore does three things: detect a newer npm
11
+ * release, detect *drift* between the two channels, and emit deterministic
12
+ * remediation commands. The registry fetch is **injectable** so the pure verdict
13
+ * logic is unit-tested offline (STZ's no-network test ethos) while the real CLI
14
+ * uses global `fetch`.
15
+ */
16
+ import { PACKAGE_NAME, registryLatestUrl } from "./version.js";
17
+
18
+ /** Result of an npm latest-version check. Structured, never prose-parsed. */
19
+ export interface LatestResult {
20
+ ok: boolean;
21
+ version: string | null;
22
+ /** Machine-readable reason on failure (network, parse, http, …). */
23
+ reason: string;
24
+ }
25
+
26
+ /** The verdict the CLI renders and `--check` emits as JSON. */
27
+ export interface UpdateVerdict {
28
+ packageName: string;
29
+ installed: string;
30
+ latest: string | null;
31
+ /** A newer npm release exists. */
32
+ stale: boolean;
33
+ /** Installed is ahead of npm latest (local/dev build). */
34
+ ahead: boolean;
35
+ /** Plugin bundled engine differs from the installed CLI, when known. */
36
+ drift: boolean;
37
+ /** Plugin engine version if discoverable, else null. */
38
+ pluginVersion: string | null;
39
+ /** Exact remediation commands, in order. Empty when fully up to date. */
40
+ commands: string[];
41
+ /** Why the check could not complete, if `latest` is null. */
42
+ reason?: string;
43
+ }
44
+
45
+ // ── semver compare (the subset STZ versions actually use) ────────────────────
46
+
47
+ interface Semver {
48
+ major: number;
49
+ minor: number;
50
+ patch: number;
51
+ /** Pre-release identifiers (e.g. `rc.1` -> ["rc", 1]); empty for releases. */
52
+ pre: Array<string | number>;
53
+ }
54
+
55
+ /** Parse `MAJOR.MINOR.PATCH[-pre]`. Throws on a non-semver string. */
56
+ export function parseSemver(v: string): Semver {
57
+ const m = /^(\d+)\.(\d+)\.(\d+)(?:-([0-9A-Za-z.-]+))?$/.exec(v.trim());
58
+ if (!m) throw new Error(`not a semver: ${v}`);
59
+ const pre = m[4]
60
+ ? m[4].split(".").map((id) => (/^\d+$/.test(id) ? Number(id) : id))
61
+ : [];
62
+ return { major: Number(m[1]), minor: Number(m[2]), patch: Number(m[3]), pre };
63
+ }
64
+
65
+ /**
66
+ * Compare two semvers. Returns -1 if a<b, 0 if equal, 1 if a>b. Implements the
67
+ * precedence rule that a pre-release is *lower* than its release (1.0.0-rc < 1.0.0).
68
+ */
69
+ export function compareSemver(a: string, b: string): -1 | 0 | 1 {
70
+ const pa = parseSemver(a);
71
+ const pb = parseSemver(b);
72
+ for (const k of ["major", "minor", "patch"] as const) {
73
+ if (pa[k] !== pb[k]) return pa[k] < pb[k] ? -1 : 1;
74
+ }
75
+ // Equal core. A release outranks any pre-release of the same core.
76
+ if (pa.pre.length === 0 && pb.pre.length === 0) return 0;
77
+ if (pa.pre.length === 0) return 1; // a is release, b is pre
78
+ if (pb.pre.length === 0) return -1; // a is pre, b is release
79
+ const n = Math.min(pa.pre.length, pb.pre.length);
80
+ for (let i = 0; i < n; i++) {
81
+ const x = pa.pre[i];
82
+ const y = pb.pre[i];
83
+ if (x === y) continue;
84
+ // Numeric identifiers rank lower than alphanumeric; otherwise compare in kind.
85
+ const xn = typeof x === "number";
86
+ const yn = typeof y === "number";
87
+ if (xn && yn) return (x as number) < (y as number) ? -1 : 1;
88
+ if (xn !== yn) return xn ? -1 : 1;
89
+ return (x as string) < (y as string) ? -1 : 1;
90
+ }
91
+ if (pa.pre.length === pb.pre.length) return 0;
92
+ return pa.pre.length < pb.pre.length ? -1 : 1;
93
+ }
94
+
95
+ // ── verdict ──────────────────────────────────────────────────────────────────
96
+
97
+ /** Inputs to {@link buildVerdict}; all version strings, no I/O. */
98
+ export interface VerdictInput {
99
+ installed: string;
100
+ latest: string | null;
101
+ pluginVersion?: string | null;
102
+ reason?: string;
103
+ }
104
+
105
+ /**
106
+ * Pure: turn (installed, latest, pluginVersion) into a verdict + remediation
107
+ * commands. No network, no filesystem — this is the unit under test.
108
+ */
109
+ export function buildVerdict(input: VerdictInput): UpdateVerdict {
110
+ const { installed, latest } = input;
111
+ const pluginVersion = input.pluginVersion ?? null;
112
+
113
+ const stale = latest != null && compareSemver(installed, latest) < 0;
114
+ const ahead = latest != null && compareSemver(installed, latest) > 0;
115
+ const drift = pluginVersion != null && compareSemver(installed, pluginVersion) !== 0;
116
+
117
+ const commands: string[] = [];
118
+ if (stale) commands.push(`npm i -g ${PACKAGE_NAME}@latest`);
119
+ // The plugin updates through Claude Code's plugin manager, not npm. Surface it
120
+ // whenever a newer release exists OR the two channels have drifted apart.
121
+ if (stale || drift) commands.push("/plugin update stz");
122
+
123
+ return {
124
+ packageName: PACKAGE_NAME,
125
+ installed,
126
+ latest,
127
+ stale,
128
+ ahead,
129
+ drift,
130
+ pluginVersion,
131
+ commands,
132
+ ...(input.reason ? { reason: input.reason } : {}),
133
+ };
134
+ }
135
+
136
+ // ── registry check (injectable fetch) ────────────────────────────────────────
137
+
138
+ /** A minimal fetch signature so tests can inject a fake and run without network access. */
139
+ export type FetchLike = (url: string) => Promise<{
140
+ ok: boolean;
141
+ status: number;
142
+ json: () => Promise<unknown>;
143
+ }>;
144
+
145
+ /**
146
+ * Query npm for the latest published version. Network failures, non-2xx responses, and
147
+ * malformed bodies all collapse to `{ok:false, reason}` rather than throwing,
148
+ * so the CLI degrades to "couldn't check" instead of crashing.
149
+ */
150
+ export async function checkLatest(
151
+ fetchImpl: FetchLike = globalThis.fetch as unknown as FetchLike,
152
+ ): Promise<LatestResult> {
153
+ if (typeof fetchImpl !== "function") {
154
+ return { ok: false, version: null, reason: "no_fetch_available" };
155
+ }
156
+ let res: Awaited<ReturnType<FetchLike>>;
157
+ try {
158
+ res = await fetchImpl(registryLatestUrl());
159
+ } catch {
160
+ return { ok: false, version: null, reason: "network_error" };
161
+ }
162
+ if (!res.ok) {
163
+ return { ok: false, version: null, reason: `http_${res.status}` };
164
+ }
165
+ let body: unknown;
166
+ try {
167
+ body = await res.json();
168
+ } catch {
169
+ return { ok: false, version: null, reason: "invalid_json" };
170
+ }
171
+ const version = (body as { version?: unknown })?.version;
172
+ if (typeof version !== "string") {
173
+ return { ok: false, version: null, reason: "missing_version_field" };
174
+ }
175
+ try {
176
+ parseSemver(version);
177
+ } catch {
178
+ return { ok: false, version: null, reason: "unparseable_version" };
179
+ }
180
+ return { ok: true, version, reason: "ok" };
181
+ }
182
+
183
+ /** Human-readable summary for the `stz update` (non-`--check`) path. */
184
+ export function formatVerdict(v: UpdateVerdict): string {
185
+ const lines: string[] = [];
186
+ lines.push(`STZ ${v.installed} (${v.packageName})`);
187
+ if (v.latest == null) {
188
+ lines.push(`Couldn't check npm for updates (reason: ${v.reason ?? "unknown"}).`);
189
+ lines.push(`To update manually: npm i -g ${v.packageName}@latest`);
190
+ return lines.join("\n");
191
+ }
192
+ if (v.stale) lines.push(`Update available: ${v.latest} (you have ${v.installed}).`);
193
+ else if (v.ahead) lines.push(`You're ahead of npm latest (${v.latest}) — local/dev build.`);
194
+ else lines.push(`Up to date with npm latest (${v.latest}).`);
195
+ if (v.drift) {
196
+ lines.push(
197
+ `⚠ Channel drift: plugin engine ${v.pluginVersion} ≠ CLI ${v.installed}. ` +
198
+ `The /stz:* commands may use a different version than the CLI.`,
199
+ );
200
+ }
201
+ if (v.commands.length) {
202
+ lines.push("");
203
+ lines.push("Run:");
204
+ for (const c of v.commands) lines.push(` ${c}`);
205
+ }
206
+ return lines.join("\n");
207
+ }
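The pre-release precedence rules implemented by `compareSemver` above can be checked against the example chain from the Semantic Versioning 2.0.0 specification. What follows is a minimal standalone sketch, not the package's own module: the regular expression and the names `parse`, `compare`, `shuffled`, and `ordered` are assumptions made for illustration, since the pattern behind `parseSemver` is not shown in this part of the diff.

```typescript
// Standalone sketch: the regex and helper names here are assumptions, not the package's code.
type PreId = number | string;

interface Parsed {
  core: number[]; // [major, minor, patch]
  pre: PreId[];   // pre-release identifiers; empty for a release
}

function parse(v: string): Parsed {
  // Assumed pattern: MAJOR.MINOR.PATCH, optional -prerelease, optional +build (ignored).
  const m = /^(\d+)\.(\d+)\.(\d+)(?:-([0-9A-Za-z.-]+))?(?:\+[0-9A-Za-z.-]+)?$/.exec(v);
  if (!m) throw new Error(`not a semver: ${v}`);
  const pre: PreId[] = m[4]
    ? m[4].split(".").map((id): PreId => (/^\d+$/.test(id) ? Number(id) : id))
    : [];
  return { core: [Number(m[1]), Number(m[2]), Number(m[3])], pre };
}

function compare(a: string, b: string): -1 | 0 | 1 {
  const pa = parse(a);
  const pb = parse(b);
  for (let i = 0; i < 3; i++) {
    const ca = pa.core[i] as number;
    const cb = pb.core[i] as number;
    if (ca !== cb) return ca < cb ? -1 : 1;
  }
  // Equal core: a release outranks any pre-release of that core.
  if (pa.pre.length === 0 || pb.pre.length === 0) {
    if (pa.pre.length === pb.pre.length) return 0;
    return pa.pre.length === 0 ? 1 : -1;
  }
  const n = Math.min(pa.pre.length, pb.pre.length);
  for (let i = 0; i < n; i++) {
    const x = pa.pre[i] as PreId;
    const y = pb.pre[i] as PreId;
    if (x === y) continue;
    if (typeof x === "number" && typeof y === "number") return x < y ? -1 : 1;
    // Numeric identifiers rank below alphanumeric ones.
    if (typeof x === "number") return -1;
    if (typeof y === "number") return 1;
    return x < y ? -1 : 1;
  }
  // All shared identifiers are equal: the longer identifier list ranks higher.
  if (pa.pre.length === pb.pre.length) return 0;
  return pa.pre.length < pb.pre.length ? -1 : 1;
}

// The specification's example chain, deliberately shuffled.
const shuffled = [
  "1.0.0", "1.0.0-beta.11", "1.0.0-alpha.1", "1.0.0-rc.1",
  "1.0.0-alpha", "1.0.0-beta.2", "1.0.0-beta", "1.0.0-alpha.beta",
];
const ordered = [...shuffled].sort(compare);
console.log(ordered.join(" < "));
```

Sorting recovers the chain `1.0.0-alpha < 1.0.0-alpha.1 < 1.0.0-alpha.beta < 1.0.0-beta < 1.0.0-beta.2 < 1.0.0-beta.11 < 1.0.0-rc.1 < 1.0.0`. Two details are worth noting: `beta.2` sorts before `beta.11` because purely numeric identifiers are compared as numbers rather than as strings, and the release `1.0.0` lands last because it outranks every pre-release of the same core.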
package/src/version.ts ADDED
@@ -0,0 +1,51 @@
1
+ /**
2
+ * Single version-identity seam (F19 update pathway).
3
+ *
4
+ * One place owns "what version am I, what package am I, what npm endpoint do I
5
+ * check". Every other module imports from here rather than re-typing a literal.
6
+ * Two hard-won lessons from prior-art update mechanisms are baked in:
7
+ *
8
+ * - The **package name is a code constant**, never an LLM/runtime free choice.
9
+ * A model-driven update path that "decides" the npm name at execution time
10
+ * can mistype it (`@stz/cli`, `slice-tournament`, a typosquat) and query the
11
+ * wrong package. Pinning it here closes that gap.
12
+ * - The **CLI version is read from package.json**, never hardcoded into a `.ts`
13
+ * string, so a release bump can never leave the reported version stale.
14
+ *
15
+ * `SCHEMA_VERSION` is independent of the package version: it tracks the shape of
16
+ * the on-disk `.stz/` taxonomy and only bumps when the tier layout changes, so
17
+ * `stz migrate` knows whether an existing project tree needs additive upgrade.
18
+ */
19
+ import { readFileSync } from "node:fs";
20
+ import { fileURLToPath } from "node:url";
21
+ import { dirname, join } from "node:path";
22
+
23
+ /** The npm package name. A code constant — see file header. */
24
+ export const PACKAGE_NAME = "slice-tournament-zoo";
25
+
26
+ /**
27
+ * Schema version of the `.stz/` taxonomy tree. Bump when `TIERS` (or the
28
+ * manifest shape) changes so `stz migrate` can detect an out-of-date project.
29
+ */
30
+ export const SCHEMA_VERSION = 1;
31
+
32
+ /** Read the package version from the shipped package.json (never hardcoded). */
33
+ function readPackageVersion(): string {
34
+ const here = dirname(fileURLToPath(import.meta.url));
35
+ // src/version.ts -> ../package.json. npm always ships package.json, and the
36
+ // source-available repo has it at the root, so this resolves in both modes.
37
+ const pkgPath = join(here, "..", "package.json");
38
+ const pkg = JSON.parse(readFileSync(pkgPath, "utf8")) as { version?: string };
39
+ if (!pkg.version) throw new Error(`package.json at ${pkgPath} has no version`);
40
+ return pkg.version;
41
+ }
42
+
43
+ /** The installed STZ version, sourced from package.json. */
44
+ export const STZ_VERSION = readPackageVersion();
45
+
46
+ /** The npm registry endpoint that resolves the latest published version. */
47
+ export function registryLatestUrl(pkg: string = PACKAGE_NAME): string {
48
+ // The `latest` dist-tag document is small and CORS-free; `.version` is the
49
+ // published latest. Encode the name defensively though it is a constant.
50
+ return `https://registry.npmjs.org/${encodeURIComponent(pkg)}/latest`;
51
+ }
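The defensive encoding in `registryLatestUrl` only shows its effect for names that contain reserved characters. The sketch below is a hypothetical standalone copy of the URL builder (the names `REGISTRY_BASE` and `latestUrl` and the scoped package name are made up for illustration). It shows that the constant package name passes through unchanged, while the `@` and `/` of a scoped name are percent-encoded so the slash is not read as a path separator.

```typescript
// Hypothetical standalone copy of the URL builder, for illustration only.
const REGISTRY_BASE = "https://registry.npmjs.org"; // same host as in the source above

function latestUrl(pkg: string): string {
  // encodeURIComponent leaves letters, digits and "-" alone but escapes "@" and "/".
  return `${REGISTRY_BASE}/${encodeURIComponent(pkg)}/latest`;
}

const plain = latestUrl("slice-tournament-zoo");
const scoped = latestUrl("@example/some-pkg"); // made-up scoped name

console.log(plain);  // https://registry.npmjs.org/slice-tournament-zoo/latest
console.log(scoped); // https://registry.npmjs.org/%40example%2Fsome-pkg/latest
```

For the pinned `PACKAGE_NAME` the encoding is a no-op, which matches the source comment that the name is encoded "defensively though it is a constant".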