@ctxr/skill-llm-wiki 1.0.2 → 1.2.0

package/CHANGELOG.md CHANGED
@@ -4,6 +4,134 @@ All notable changes to `skill-llm-wiki` are documented in this file.
  The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/), and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
+ ## [Unreleased]
+
+ ### Cross-harness support (Claude Code + OpenAI Codex CLI)
+
+ - **SKILL.md prose neutralised so the wiki-runner sub-agent dispatch works under both Claude Code (via the `Agent` tool) and Codex CLI (via its equivalent).** The Tier 2 envelope shape is now the open `subagent.dispatch.v1` contract: top-level `kind: "subagent.dispatch.v1"`, `role: "wiki-tier2-<kind>"`, the per-Tier-2-request kind moved to the `tier2_kind` extension field. The deprecated `model_hint` / `effort_hint` aliases continue to be emitted for one release so existing wiki-runner consumers keep working; new code should consume `effort` (and optional `model` override) instead. See `https://github.com/ctxr-dev/kit/blob/main/docs/subagent-dispatch-v1.md` for the full envelope spec.
+ - Replaced hardcoded Claude model names (`opus` / `sonnet` / `haiku`) in default-model documentation with provider-neutral effort hints (`heavy` / `balanced` / `light`). Each host harness maps `effort` to its own lineup; `model` overrides remain available for explicit pinning.
+ - Repositioned package.json description from "Claude Code skill" to "Agent Skills (Claude Code, Codex CLI)".
+ - Reordered package.json keywords to lead with `agent-skills`, `agents-md`, `codex`, `claude-code`.
+ - Updated git-submodule install path in README from `.claude/skills/` to `.agents/skills/` to match the canonical install topology used by `@ctxr/kit`.
+ - Fixed three broken cross-references in `guide/correctness/safety.md` and `guide/layout/in-place-mode.md` (relative paths that resolved outside their target subdir).
+ - Declared `publishConfig.access: "public"` for scoped npm publish; added `prepublishOnly` lint+test gate.
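The `subagent.dispatch.v1` envelope described in the first bullet of this section can be pictured with a small builder. This is only a sketch: the field names (`kind`, `role`, `tier2_kind`, `effort`, `model`, `effort_hint`, `model_hint`) come from the entry, but the function name `buildDispatchEnvelope`, the example kind value, and the assumption that the deprecated aliases simply mirror the new fields are illustrative. The linked envelope spec is the authority on the real shape.

```javascript
// Hypothetical builder: only the field names are taken from the changelog entry.
function buildDispatchEnvelope(tier2Kind, effort, model) {
  const envelope = {
    kind: "subagent.dispatch.v1",     // top-level contract identifier
    role: `wiki-tier2-${tier2Kind}`,  // role follows the "wiki-tier2-<kind>" pattern
    tier2_kind: tier2Kind,            // per-request kind lives in the extension field
    effort,                           // provider-neutral: "heavy" | "balanced" | "light"
    effort_hint: effort,              // deprecated alias (assumed to mirror `effort`)
  };
  if (model !== undefined) {
    envelope.model = model;           // optional explicit pin
    envelope.model_hint = model;      // deprecated alias (assumed to mirror `model`)
  }
  return envelope;
}

// Example: a request with no model override.
const example = buildDispatchEnvelope("propose_structure", "balanced");
```

A consumer written against the new contract would read `effort` and `model` and ignore the `*_hint` aliases.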
+
+ ### Performance
+
+ - **Large-corpus pairwise-sweep speedup (~50-100× on I/O-bound paths).** A 596-leaf deterministic build previously took 2h15m; this release targets the three I/O antipatterns responsible for most of that wall time. Surfaced during a live Phase X.6 rebuild attempt on `skill-code-review/reviewers.src`.
+ - **`decision-log.mjs::appendDecision` now uses POSIX append.** The prior implementation read the WHOLE `.llmwiki/decisions.yaml` on every call (via `readFileSync`), concatenated the new entry, wrote to a temp file, and renamed — O(file-size) per append. On a 45 MB decisions.yaml that's ~22 MB of avg-read per call × 189k calls ≈ 4 TB of I/O. Rewritten to use `appendFileSync` for every call after the first; the header-creation path still goes through temp+rename for atomicity. Cost per append drops from ~20 ms to ~200 µs — a ~100× speedup at scale. The append path is best-effort (not the same atomicity contract as temp+rename): on a crash a torn trailing entry is possible and the YAML parser will reject the truncated scalar; recovery is to drop the last `- ...` block and re-run the op. The decision log is an audit trail rather than a reproducibility artefact — tree SHAs are unaffected. See the docstring in `scripts/lib/decision-log.mjs` for the full durability breakdown (first-call atomicity via temp+rename, subsequent calls best-effort, Windows/concurrency caveats). Also includes a pre-append check that peeks at the log's last byte: if the existing file doesn't end in `\n` (manual edit, prior torn-tail truncation), a leading newline is inserted so the appended entry can't concatenate onto the previous line and corrupt YAML.
+ - **`similarity-cache.mjs` shards cache entries by 2-hex prefix.** The prior flat `<cacheDir>/<sha256>.json` layout put 178k files in a single directory on a 596-leaf build. APFS/ext4/ZFS directory lookups degrade once the VFS dirent cache overflows (~10k entries on typical kernels), so by mid-sweep every `existsSync` / `writeFileSync` paid O(log N)-with-large-N cost. New layout: `<cacheDir>/<2hex>/<rest>.json` → 256 shards, ~700 entries per shard for a 178k-pair corpus. Same pattern as `.git/objects/`. Exported new `CACHE_SHARD_PREFIX_LEN` constant so tuning is explicit. Pre-sharding flat caches are NOT auto-migrated — the similarity cache is pure perf memoisation with no user data at stake, and the next build repopulates the sharded layout. `clearCache` and `cacheSize` walk every shard subdir plus any residual flat entries, so pre-upgrade caches still report size and can be cleared.
+ - **`operators.mjs::detectMerge` batches pairwise `decide()` calls via `Promise.allSettled`.** The O(N²) nested loop previously awaited each pair sequentially. At 178k pairs and ~10 ms per pair (cache-hit avg), that alone was ~30 min of wall-clock time even with all I/O cached. Pairs are now streamed into batches of `DETECT_MERGE_PAIR_BATCH_SIZE = 32` concurrent awaits; the streaming accumulator keeps memory at O(batch) instead of O(N²). Result order is preserved via positional `allSettled[]` semantics so downstream proposal collection stays deterministic. No tiered.mjs changes required — each `decide()` call is independent (its own cache file, its own decision-log append); concurrent calls on Node's cooperative event loop don't race on shared mutable state since microtask-ordered awaits serialize map mutations. Expected speedup on I/O-bound sweeps: 10-20× on modern SSDs; more modest on spinning disks.
+ - **Per-pair timeout + retry guards via `p-timeout` + `p-retry`.** A stuck pair (NFS hang, runaway embed call, filesystem lockup) used to stall the whole convergence phase forever under the serial `Promise.all` semantics, AND one genuine error would abort the whole batch, discarding every successfully-completed pair alongside it. Every `decide()` call is now wrapped with `pTimeout({ milliseconds: DETECT_MERGE_PAIR_TIMEOUT_MS = 30_000 })` for per-call deadline enforcement, and `pRetry({ retries: DETECT_MERGE_PAIR_RETRIES = 2, minTimeout: 500, maxTimeout: 5_000 })` for transient-failure resilience. The outer batch uses `Promise.allSettled` so a pair that exhausts retries logs to stderr and skips instead of aborting — the batch's other 31 pair decisions land normally. Validation errors inside `decide()` are treated as non-retryable by throwing `AbortError` so a structural bug doesn't burn the retry budget on something that would fail the same way every time. Breadcrumb logs on every retry attempt + every final-fail let operators see reliability activity without grepping through the decision log. New runtime deps: `p-retry@^8` and `p-timeout@^7` (both sindresorhus, zero-dep, ESM-only — the canonical Node retry/timeout pair, ~1 KB each gzipped).
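The batching behaviour described for `detectMerge` (pairs streamed into fixed-size batches, each batch awaited with `Promise.allSettled`, failed pairs skipped rather than aborting the batch) can be sketched roughly as follows. The names `pairs`, `decideAllPairs`, and `PAIR_BATCH_SIZE` are hypothetical, the `decide(a, b)` signature is an assumption, and the timeout/retry wrappers are left out to keep the sketch dependency-free.

```javascript
const PAIR_BATCH_SIZE = 32; // mirrors the DETECT_MERGE_PAIR_BATCH_SIZE value quoted above

// Lazily yield every unordered pair (i < j) so pairs are never materialised all at once.
function* pairs(items) {
  for (let i = 0; i < items.length; i++) {
    for (let j = i + 1; j < items.length; j++) {
      yield [items[i], items[j]];
    }
  }
}

// Run `decide` over all pairs, at most `batchSize` concurrently.
// Rejected pairs are logged to stderr and skipped; fulfilled results keep pair order.
async function decideAllPairs(items, decide, batchSize = PAIR_BATCH_SIZE) {
  const results = [];
  let batch = [];
  const flush = async () => {
    // The async arrow turns a synchronous throw inside decide() into a rejection.
    const settled = await Promise.allSettled(batch.map(async ([a, b]) => decide(a, b)));
    settled.forEach((outcome, k) => {
      if (outcome.status === "fulfilled") {
        results.push({ pair: batch[k], value: outcome.value });
      } else {
        console.error(`pair ${batch[k].join(" / ")} failed: ${outcome.reason}`);
      }
    });
    batch = [];
  };
  for (const pair of pairs(items)) {
    batch.push(pair);
    if (batch.length === batchSize) await flush();
  }
  if (batch.length > 0) await flush(); // final partial batch
  return results;
}
```

For simplicity this sketch collects every fulfilled outcome in `results`; an accumulator that keeps only merge proposals, as the entry describes, would hold far less in memory.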
+
+ ### Added
+
+ - **CLI progress breadcrumbs (Phase X.9).** Long-running operations (build / rebuild / fix / join) now stream a per-phase breadcrumb to stderr as each orchestrator phase fires:
+ - `[build-20260424-212011-abc123 3] ingest: read 596 source file(s), 589 leaves`
+ - `[build-... 7] operator-convergence: 156 operator(s) applied across 23 iteration(s); ...`
+ - Format: `[<op-id> <phase-index>] <phase-name>: <phase-summary>`
+ - Format matches exactly what the final phase log records in the op-summary payload — no new formatter, just a live mirror of `record()` calls.
+ - Writes to stderr (never stdout) so consumers piping the command's stdout don't conflate phase chatter with the final op-summary payload.
+ - Monotonically-increasing phase index per op, so structured log aggregators can reconstruct phase ordering after multiplexing.
+ - Progress is on by default; suppressed when `--json` is passed (the `skill-llm-wiki/v1` envelope consumer contract requires a clean stderr) or when `LLM_WIKI_NO_PROGRESS=1` is set in the environment (CI / hermetic runs that want the pre-X.9 silent-stderr shape).
+ - Implementation: `scripts/lib/orchestrator.mjs::runOperation` accepts an `onProgress({name, summary, index})` callback that the internal `record()` helper invokes alongside the phase-log push. CLI wires a callback that writes the breadcrumb to stderr; in-process callers can pass their own handler for custom streaming (TUI, structured log shipper, etc.). Progress-hook throws are swallowed — a misbehaving callback never halts the op.
+ - Tests (`tests/unit/cli-progress.test.mjs`, 3 scenarios): stderr breadcrumb shape + monotonic phase index; `LLM_WIKI_NO_PROGRESS=1` suppresses; `--json` suppresses.
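The breadcrumb format and the two suppression rules listed above are small enough to sketch directly. The helper names (`formatBreadcrumb`, `progressEnabled`, `makeProgressHook`) are hypothetical; only the format string, the `--json` flag, the `LLM_WIKI_NO_PROGRESS=1` variable, and the `{name, summary, index}` callback shape come from the entry.

```javascript
// Format: [<op-id> <phase-index>] <phase-name>: <phase-summary>
function formatBreadcrumb(opId, index, name, summary) {
  return `[${opId} ${index}] ${name}: ${summary}`;
}

// Progress is on by default; --json or LLM_WIKI_NO_PROGRESS=1 turns it off.
function progressEnabled(argv, env) {
  if (argv.includes("--json")) return false;
  if (env.LLM_WIKI_NO_PROGRESS === "1") return false;
  return true;
}

// Build an onProgress-style callback that writes one line per phase.
// Errors from the writer are swallowed so the hook can never halt the operation.
function makeProgressHook(opId, write) {
  return ({ name, summary, index }) => {
    try {
      write(formatBreadcrumb(opId, index, name, summary) + "\n");
    } catch {
      // intentionally ignored
    }
  };
}
```

A CLI would typically pass `(line) => process.stderr.write(line)` as `write`, while an in-process caller could pass its own sink.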
+ - **Real 11-phase `join` implementation (Phase X.2).** The `join` operation now implements the full pipeline from `guide/operations/ingest/join.md` instead of falling through to the convergence-only stub. The skill can now actually merge N ≥ 2 source wikis into one unified output.
+ - **New module `scripts/lib/join.mjs`** exports `runJoin(sources, target, ctx)` and the per-phase helpers (`ingestWiki`, `validateSources`, `planUnion`, `resolveIdCollisions`, `mergeCategoriesWithSameFocus`, `rewireReferences`, `materialisePlan`), plus the canonical `VALID_COLLISION_POLICIES` / `DEFAULT_COLLISION_POLICY` constants.
+ - **Pipeline phases (per the methodology):**
+ 1. `ingest-all` — walk each source wiki's tree, reading frontmatter through `readFrontmatterStreaming` (CRLF-safe, bounded reads) and the full body from `captured.bodyOffset`. Plain markdown files with no frontmatter fence and dot-prefixed files are skipped.
+ 2. `source-validate` — `validateWiki` on each source, aggregated into a single error/warning report. Any hard error halts with `JOIN-SOURCE-INVALID` — joining a broken source would produce a broken output.
+ 3. `plan-union` — merge per-source leaf + index records into a unified plan with `sourceWiki` provenance on every record.
+ 4. `resolve-id-collisions` — three policies: `namespace` (default) prefixes colliding ids with `<source-basename>.` and relies on source-scoped `renameMap` rewrites in phase 6 to resolve inbound references (the original id stays with the keeper, so adding it to `aliases[]` would trip `ALIAS-COLLIDES-ID`); `merge` folds duplicates when frontmatter is compatible (same `focus`/`type`/`depth_role`) and falls back to namespace on incompatibility; `ask` throws `JOIN-COLLISION-ASK` for the caller to surface. Every renamed leaf also has its filename rewritten (via `relPath`) to satisfy the validator's `ID-MISMATCH-FILE` rule. A running `liveIds` Set guards against secondary collisions when two sources share the same basename (e.g. `/a/reviewers.wiki` and `/b/reviewers.wiki`) — `reserveId` appends `-2`, `-3`, ... as needed.
+ 5. `merge-categories` — detects top-level categories with matching `focus`. First-cut is DETECT-ONLY: folding is deferred because `runConvergence`'s MERGE operator handles sibling leaves, not entire category subtrees. A directory-MERGE operator is tracked as a follow-up.
+ 6. `rewire-references` — `links[].id` and `overlay_targets[]` entries are rewritten via `renameMap` then `mergeMap`. Unresolvable references are left as-is and surface as `DANGLING-LINK`/`DANGLING-OVERLAY` at Phase 9 so the user gets one unified report.
+ 7. `apply-operators` — `runConvergence` on the unified materialised tree (same tiered-AI quality mode the user passes to `build`/`rebuild`).
+ 8. `generate-indices` — `rebuildAllIndices` so every index's `entries[]` reflects the unified structure.
+ 9. `validation` — `validateWiki` on the target; failures throw `JOIN-TARGET-INVALID` and trigger rollback.
+ 10. `golden-path-union` — placeholder; fixture-regression gate lands as a follow-up.
+ 11. `commit` — the orchestrator tags `op/<op-id>` and appends the op-log entry.
+ - **Source immutability.** Every source wiki is treated as strictly read-only. `runJoin` writes only to the prepared target directory (which the CLI creates empty before the op fires). A consumer passing a `--target` that equals, nests under, or contains any source wiki is refused at the intent layer with `INT-18` — the check runs before the pre-op snapshot, so a stray `--target /path/to/source-a/inside` never writes a byte to source A. A target that is unrelated to every source but happens to already exist non-empty also fails early with `INT-01`.
+ - **Per-phase commits.** `runJoin` takes an optional `onPhaseCommit` callback that the orchestrator wires up to `git add -A` + `gitCommit` between the materialise, convergence, and index-generation phases — the joined wiki's private git log shows the same per-phase granularity a fresh `build` does, so `git diff op/<id>~1..op/<id>` works at phase scope.
+ - **Orchestrator integration.** A new `plan.operation === "join"` early branch in `scripts/lib/orchestrator.mjs::runOperation` takes the pre-op snapshot on the fresh target, delegates the 11 phases to `runJoin`, tags the op on success, and rolls the tree back to the pre-op snapshot on any failure — the same rollback semantics `build`/`rebuild`/`fix` get.
+ - **CLI surface.** `node scripts/cli.mjs join <wiki-a> <wiki-b> [<wiki-c>...] --target <path> [--id-collision namespace|merge|ask] [--quality-mode <mode>]`. Positionals can be any number ≥ 2; each must be an existing skill-managed wiki (has `.llmwiki/git/HEAD`). `--target` is required and must be empty (or not exist yet). New `--id-collision` flag with default `namespace`.
+ - **Intent validation.** New structured error codes: `INT-06` for fewer than 2 sources / non-existent source / source-not-managed, `INT-09b` for missing `--target`, `INT-01` for non-empty `--target`, `INT-17` for invalid `--id-collision` value, and `INT-18` for `--target` that overlaps any source wiki (source-immutability guard). All fire at the intent layer so a typo never reaches the pre-op snapshot.
+ - **Tests.** 13 unit tests in `tests/unit/join.test.mjs` cover each phase's contract: ingestWiki reads frontmatter + body + skips dotfiles; validateSources aggregates findings across sources; planUnion tags provenance; resolveIdCollisions exercises all three policies (namespace prefix, merge-compatible fold, merge-fallback-to-namespace on incompatible frontmatter, ask-throws, invalid-policy-rejects); VALID_COLLISION_POLICIES canonical list pinned; mergeCategoriesWithSameFocus detects; rewireReferences rewrites via map; materialisePlan writes to target.
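The `namespace` collision policy from phase 4, together with the `reserveId` suffixing that guards against secondary collisions, might look something like the sketch below. The record shape (`{ id, sourceName }`), the function `namespaceCollisions`, and the `originalId` field are assumptions made for illustration; the actual module also maintains a `renameMap` and rewrites filenames, which this sketch omits.

```javascript
// Return `candidate` if it is free, otherwise candidate-2, candidate-3, ...
// The chosen id is recorded in `liveIds` before returning.
function reserveId(liveIds, candidate) {
  let id = candidate;
  for (let n = 2; liveIds.has(id); n++) {
    id = `${candidate}-${n}`;
  }
  liveIds.add(id);
  return id;
}

// records: [{ id, sourceName }] in plan order (hypothetical shape).
// The first holder of an id keeps it; later holders get "<sourceName>.<id>".
function namespaceCollisions(records) {
  const liveIds = new Set();
  return records.map((rec) => {
    const wanted = liveIds.has(rec.id) ? `${rec.sourceName}.${rec.id}` : rec.id;
    return { ...rec, id: reserveId(liveIds, wanted), originalId: rec.id };
  });
}
```

With two sources that share a basename, the second collision on the same id falls through to the `-2` suffix, which is the case the running `liveIds` set exists to handle.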
+ - **Root-leaf containment invariant (Phase X.11).** The wiki root must hold only `index.md` plus subdirectories — never a direct `.md` leaf. A new orchestrator phase between soft-DAG synthesis and review (Phase 4.4.5, `scripts/lib/root-containment.mjs::runRootContainment`) walks `wikiRoot`, identifies direct-child outlier leaves (non-index `.md` files at the wiki root itself — depth 0 per `depthOf`, topical singletons whose affinity to every other leaf stayed below clustering thresholds), and moves each into its own semantically-named subcategory derived from its own TF-IDF distinguishing tokens. A stub `<slug>/index.md` is written so the new category is routable; Phase 5's `rebuildAllIndices` populates the stub's `entries[]` on the same pass.
+ - **Why single-member categories instead of a shared "uncategorised" bucket.** Every reviewer leaf has `focus` / `covers` / `tags` that describe some coherent topic, so the honest answer to "where does this belong?" is "in its own tight category named after what it is." A shared bucket label admits defeat about something the data already tells us; a per-outlier slug preserves the semantic signal. If the corpus later grows a topically-adjacent leaf, future builds' convergence + balance may nest both into an existing category — a single-member start state is a valid transient, not a permanent scar.
+ - **Slug derivation.** Reuses Phase X.3's `generateDeterministicSlug([outlier], siblings)` and `deterministicPurpose([outlier])` so every outlier gets the same byte-stable naming any clustered leaf would. Sibling corpus is recomputed per outlier so the newly-added subcategory from the previous iteration is included in the IDF ranking — the second outlier picks a slug that discriminates against the first, not a near-duplicate. Uniqueness is enforced via `resolveNestSlug` + the full-wiki forbidden-id index from PR #5: a generated slug that happens to collide with an existing subcategory basename, leaf id, or alias elsewhere in the tree gets the `-group` / `-group-N` fallback treatment. The shared `wikiIndex` is mutated after each resolve so subsequent outliers can't reuse a just-assigned slug.
+ - **parents[] rewrite.** The leaf's new direct parent (`<slug>/index.md`) is same-dir-as-leaf, so a primary `"index.md"` entry stays byte-identical across the move even though its semantic target moves from root-index to subcategory-index. Same convention `applyBalanceFlatten` leveraged when moving a subtree up one level (PR #8). Non-primary entries that DON'T already start with `"../"` gain a `"../"` prefix because paths that were relative to the old leaf-dir (wiki root) are now one level too shallow. Example: `["index.md", "beta/index.md"]` → `["index.md", "../beta/index.md"]`. Entries that already start with `"../"` are preserved byte-identical — a root-level leaf has no legitimate parent above wikiRoot to reference, so the input is already a depth-contract violation; prepending another `"../"` would only escape the root outright (`"../foo"` → `"../../foo"`). The malformed entry stays as-is and validation surfaces it under its normal parent-path rules.
+ - **Determinism.** Outlier iteration is lex-sorted by filename, so two runs on the same outlier set produce byte-identical slug assignment order (matters for `-group-N` collision tie-breaks). Files whose frontmatter fails to parse are skipped silently — the validator surfaces them separately under `PARSE`.
+ - **LIFT guard.** `detectLift` refuses to emit a proposal when the lift destination would be the wiki root (`dirname(dir) === wikiRoot`). Without this, Phase 4.4.5 and LIFT would oscillate forever on the single-member X.11 subcategories that containment itself creates — LIFT would move the leaf back to root, containment would move it back into a subcategory, next run repeats. Every deeper single-child passthrough is still fair game.
+ - **Validator.** New `LEAF-AT-WIKI-ROOT` error in `scripts/lib/validate.mjs` declaratively enforces the invariant: any non-index `.md` at the wiki root is flagged with remediation guidance to run `fix`. `heal.mjs::FINDING_ACTIONS` maps the code to `fix` so automated remediation picks the right command. Even a hand-edit that re-introduces a root-level leaf gets caught on the next validate pass.
+ - **Integration.** Phase 4.4.5 runs BEFORE Phase 4.5 review so the containment commit participates in the `--review` diff — users can drop/abort individual containment moves like any other tree-mutating phase's commits. The `anyMutation` gate that decides whether to fire review now includes `containmentDidCommit` so review is surfaced even when containment is the only phase that changed the tree. Runs BEFORE Phase 5 so `rebuildAllIndices` sees the final tree shape and the stub indexes get populated as part of the regular pass. Runs for every top-level operation that reaches convergence + review (build / rebuild / fix) — containment is an invariant the tree must satisfy, not a one-shot migration. Zero outliers → no `mkdir`, no writes, no commit; the phase is a strict no-op for clean wikis.
+ - **Tests** (8 new scenarios): zero-outliers no-op, single outlier creates folder with deterministic slug, two distinct outliers get two distinct folders, slug collision with existing subcategory triggers `-group-N` fallback, determinism across runs (byte-identical slug assignment), parents[] rewrite correctness for primary + soft-parent entries, already-`../`-prefixed parents preserved byte-identical, dotfiles / frontmatter-less root files are skipped silently.
+ - **Unblocks.** The skill-code-review Phase X.6 build landed 2 outlier leaves (`footgun-bidi-rtl-locale-collation.md`, `footgun-file-path-cross-platform.md`) at the wiki root: their TF-IDF cosine to every other leaf fell below both the 0.35 soft-DAG threshold and the 0.63 HAC threshold, and coarse K-means never placed either in a cluster that met the `MIN_CLUSTER_SIZE = 3` cutoff. Running `fix` on the existing `reviewers.wiki/` now moves each into its own subcategory and restores the invariant.
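The `parents[]` rewrite rule from the root-containment entry (primary entry untouched, other entries gain one `"../"` unless they already start with it) can be expressed as a short pure function. The name `rewriteParentsForContainment` is hypothetical, and treating the first array element as the primary parent is an assumption based on the "primary parent first" convention mentioned elsewhere in this changelog.

```javascript
// Rewrite a leaf's parents[] after the leaf moves from the wiki root
// into <slug>/ (one directory deeper).
function rewriteParentsForContainment(parents) {
  return parents.map((entry, i) => {
    if (i === 0) return entry;                  // primary "index.md" stays byte-identical
    if (entry.startsWith("../")) return entry;  // already malformed at root; left for the validator
    return `../${entry}`;                       // was relative to wiki root, now one level deeper
  });
}

// The example from the entry:
const rewritten = rewriteParentsForContainment(["index.md", "beta/index.md"]);
// rewritten is ["index.md", "../beta/index.md"]
```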
+ - **Deterministic coarse-partition pre-pass for flat large-diverse roots** (`scripts/lib/cluster-detect.mjs::detectCoarseClusters`). Phase X.10 ships the engineering fix for a class of inputs that the v1.0.2 HAC path couldn't structure: a flat directory with hundreds of topically-diverse leaves. On a 596-leaf hand-authored corpus, the HAC's 3-8-size shape-score optimum produced zero valid NESTs during convergence under `--quality-mode deterministic` — the best partition at every candidate threshold was dominated by one giant component plus many singletons, and the 3-8-size clusters that emerged were too few to pass `partitionShapeScore > 0`. Balance enforcement then tried to carve the 576-leaf root linearly (3-5 leaves per sub-cluster iteration) and hit its 20-iter convergence cap far short of the `--fanout-target 6` threshold. The defense-in-depth guard from PR #8 fired, rolled back to the pre-op snapshot, and the whole 2h15m build was wasted.
+ - **Trigger.** `detectClusters` now dispatches to `detectCoarseClusters` when `leaves.length > COARSE_PARTITION_THRESHOLD` (default 50). Below that, the existing HAC path runs unchanged.
+ - **Algorithm.** Deterministic K-means with farthest-first seed init. K = `ceil(N / COARSE_TARGET_CLUSTER_SIZE)` (target 8 avg per cluster). First seed is always `leaves[0]` (lex-first by caller's ordering); each subsequent seed maximises its minimum similarity-distance (`1 - max(sim-to-existing-seeds)`). Assignment uses mean-member-similarity to each cluster so we reuse the existing NxN affinity matrix without exposing per-leaf vectors. Converges in a handful of iterations; `COARSE_KMEANS_MAX_ITERS = 20` cap is defensive against pathological oscillation.
+ - **Filters.** Clusters smaller than `MIN_CLUSTER_SIZE = 3` or larger than `MAX_COARSE_CLUSTER_SIZE = 30` are rejected. Small clusters aren't worth nesting; giant clusters are noise-floor concentration that balance enforcement can refine in a second pass if `--fanout-target` is tight.
+ - **Determinism.** All ordering uses lex-first tie-breaking: first seed index 0, subsequent farthest-first with index-ascending tie-break, member iteration in leaf-array order, post-hoc sort by (average_affinity desc, first-member-path asc). Two runs on the same corpus produce byte-identical cluster membership.
+ - **Integration.** Coarse proposals match the existing `buildNestProposal` output shape with `source: "math-coarse"` and `threshold: null` (K-means rather than HAC-at-threshold). `operators.mjs::tryClusterNestIteration` and `balance.mjs::runBalance` consume them unchanged. Tier 2 escalation for coarse paths is deliberately disabled: the LLM would be asked to partition 500+ leaves in one shot, which is both a huge token cost and typically produces worse structure than the deterministic K-means — so the coarse path returns empty on failure instead of emitting a `propose_structure` marker.
+ - New exports: `detectCoarseClusters`, `COARSE_PARTITION_THRESHOLD`, `COARSE_TARGET_CLUSTER_SIZE`, `MAX_COARSE_CLUSTER_SIZE`, `COARSE_KMEANS_MAX_ITERS`.
+ - **Tests** (5 new scenarios): constants-pin sanity, dispatch fires above threshold, byte-identical membership across runs (determinism), size filters reject tiny/giant clusters, post-sort invariant (affinity-desc, path-asc tie-break).
+ - **Unblocks.** The skill-code-review Phase X.6 rebuild of `reviewers.src/` (596 flat hand-authored leaves) can now run under `--quality-mode deterministic` without hitting the no-NEST-then-balance-stall failure mode.
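The farthest-first seed initialisation described under **Algorithm** can be sketched over a plain N×N similarity matrix. This covers only the seeding step, not the assignment or iteration loop. The function name `farthestFirstSeeds` and the optional `k` parameter are hypothetical; the default `k = ceil(N / 8)`, the fixed first seed at index 0, the `1 - max(sim-to-existing-seeds)` distance, and the index-ascending tie-break follow the entry.

```javascript
const TARGET_CLUSTER_SIZE = 8; // mirrors the "target 8 avg per cluster" figure above

// sim: symmetric N x N matrix of pairwise similarities in [0, 1].
// Returns the indices of the chosen seeds, in selection order.
function farthestFirstSeeds(sim, k = Math.ceil(sim.length / TARGET_CLUSTER_SIZE)) {
  const n = sim.length;
  if (n === 0 || k <= 0) return [];
  const seeds = [0]; // the first seed is always the first leaf in caller order
  while (seeds.length < Math.min(k, n)) {
    let best = -1;
    let bestDist = -Infinity;
    for (let i = 0; i < n; i++) {
      if (seeds.includes(i)) continue;
      // Distance to the nearest existing seed: 1 - max similarity to any seed.
      const dist = 1 - Math.max(...seeds.map((s) => sim[i][s]));
      // Strict ">" keeps the lowest index when distances tie.
      if (dist > bestDist) {
        bestDist = dist;
        best = i;
      }
    }
    seeds.push(best);
  }
  return seeds;
}
```

Because every choice depends only on the matrix and on index order, two runs over the same input pick the same seeds, which is the property the **Determinism** bullet relies on.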
+ - **`--soft-dag-parents` — post-convergence DAG soft-parent synthesis.** A new Phase 4.4 hook between balance-enforcement and review synthesises soft-parent pointers for each routable leaf by scoring the leaf's TF-IDF vector against every candidate category directory's aggregate vector. Directories whose cosine similarity meets `SOFT_PARENT_AFFINITY_THRESHOLD` (0.35) become soft parents; up to `SOFT_PARENT_MAX_PER_LEAF` (3) are kept per leaf, ranked by cosine desc with POSIX-path ascending as a deterministic tie-break. Each leaf's `parents[]` frontmatter is rewritten with the primary parent first (`"index.md"` by the same convention Phase X.5's flatten preserves) and the chosen soft parents after as POSIX-relative paths to each claimed `index.md`. A companion Phase 5.1 pass, `applySoftParentEntries`, runs after `rebuildAllIndices` and propagates each leaf's soft-claim into the corresponding index's `entries[]` array — so the DAG view materialises in every claimed parent's index.md. Idempotent: records are deduped by id, and repeated runs produce byte-identical output. Build/rebuild only; intent rejects the flag elsewhere via `INT-16a`.
+ - **Signal source.** `entryText` from `scripts/lib/similarity.mjs` (focus + covers + tags + domains, focus doubled) is used verbatim so cosine scores sit on the same TF-IDF basis as Phase X.3 / tiered-AI's pairwise comparisons. Threshold calibration transfers across phases. A new companion helper `buildComparisonModelFromTexts(texts)` is exported from similarity.mjs and used here — it skips the `entryText` roundtrip `buildComparisonModel` performs, so pre-aggregated category text (already run through `entryText` per contributor) isn't double-weighted.
+ - **Traversal.** Filesystem-native (`readdirSync` + dot-skip) rather than `listChildren` for leaf discovery, so pre-bootstrap category dirs from Phase 3 draft are visible for candidate-category enumeration. Leaf routability is still validated via frontmatter-must-have-id discipline to match `listChildren`'s "movable leaf" semantics — Phase X.5 round-16 learning applied.
+ - **Determinism.** POSIX-normalised lex sort throughout (leaf order, candidate order, tie-break), deterministic frontmatter serialisation via `renderFrontmatter`. Build twice on the same tree produces byte-identical `parents[]` arrays and `entries[]` appends on every platform.
+ - **Defense-in-depth.** Intent layer rejects the flag outside `build` / `rebuild` (`INT-16a`); orchestrator gates the phase on `plan.operation ∈ {build, rebuild}`. A programmatic caller constructing a plan directly can't accidentally trigger soft-DAG synthesis on a `fix` op. Path-traversal guard in `applySoftParentEntries`: two-layer containment check. Layer 1 is a lexical guard on `path.resolve(leafDir, rel)` + wikiRoot prefix — rejects pure `..`-traversal without touching the filesystem. Layer 2 is a symlink-aware `realpathSync` containment check — resolves the full symlink chain (including intermediate symlinked directories) and rejects any resolved path that sits outside the wiki root's realpath. A hostile leaf's crafted `"../../../external/index.md"` OR `"../trap/index.md"` where `trap/index.md` symlinks out, both get silently rejected instead of mutating arbitrary filesystem paths. Per-index `try/catch` around frontmatter parse so one malformed index doesn't abort the propagation pass.
+ - **Performance.** `applySoftParentEntries` uses bounded `readFrontmatterStreaming` reads (via a new `collectAllLeaves(wikiRoot, withBody=false)` mode) rather than full `readFileSync` — O(frontmatter bytes) not O(total leaf bytes) on large corpora. One `collectAllLeaves` walk feeds a `groupLeavesByDir` map that both `collectCandidateDirs` (routable-leaf check) and `buildCategoryText` (aggregate text) consume directly; neither calls `listChildren` anymore — so the soft-DAG phase does one frontmatter pass over the tree regardless of dir count. `buildCategoryText` reads each candidate `index.md` via `readFrontmatterStreaming` rather than a full `readFileSync` — authored orientation bodies don't inflate the TF-IDF pass. `scoreCandidates` threads `ctx.threshold` through the filter rather than hard-coding the constant, so overrides work correctly.
+ - **CRLF-fence compatibility.** The `withBody=true` path in `collectAllLeaves` now also routes through `readFrontmatterStreaming` (which normalises CRLF→LF on the frontmatter payload), instead of using `readFileSync` + `parseFrontmatter` directly. `parseFrontmatter` only recognises an LF opening fence, so leaves with CRLF frontmatter (common from Windows editors) used to be silently skipped. The body slice now anchors at `captured.bodyOffset` from a raw buffer read so multi-byte characters at the fence boundary can't corrupt the body.
+ - **Actual-write stats.** `applySoftParentEntries` returns `indicesTouched` and `softEntriesAdded` from counters incremented per successful write, not from the planned-appends map. Pre-round-2 stats over-reported on idempotent reruns (every claim deduped → zero writes but `indicesTouched` still counted) and on parse-failure skips.
+ - **Review gate tracks commit reality.** The orchestrator's Phase 4.5 `anyMutation` gate uses a `softDagDidCommit` boolean tracked at commit time, not the `softParentsAdded` counter. A rerun that removes previously-synthesised soft parents now below threshold, or a canonical-order frontmatter rewrite that leaves no net soft-parent change but still alters bytes, produces `softParentsAdded === 0` but still dirties the tree and commits — the review gate correctly surfaces the diff in both cases. The commit subject is `"parents[] synthesis across N leaves"` with no added-count in the message at all — the per-phase `record()` line still surfaces `softParentsAdded` (labelled "selected", since it's the count of pointers chosen this run, not a delta) and the private-git diff is the byte-exact source of truth for what actually changed.
+ - **Non-index safety check.** `applySoftParentEntries` skips any target `index.md` whose frontmatter lacks `type: index` or a non-empty `id` — defense against a leaf claiming a path that happens to resolve to a same-named-but-non-managed markdown file. Soft-DAG propagation never mutates arbitrary content under wikiRoot.
+ - **Atomic writes.** Both leaf rewrites (`rewriteLeafParents`) and index appends (inside `applySoftParentEntries`) go through a local `atomicWriteFile` helper: write to `<path>.tmp`, then `renameSync` into place. Matches `indices.mjs::atomicWriteFile`'s discipline — a crash / SIGKILL between write and rename leaves either the old file intact or the `.tmp` orphaned, never a half-written target. Same durability guarantee the rest of the index-generation pipeline provides.
+ - New module `scripts/lib/soft-dag.mjs` exports `runSoftDagParents`, `applySoftParentEntries`, `SOFT_PARENT_AFFINITY_THRESHOLD`, `SOFT_PARENT_MAX_PER_LEAF`.
+ - `contract.mjs::SUBCOMMANDS.build` + `.rebuild` include `--soft-dag-parents` in their flag list. Extend remains a contract stub and the flag is NOT declared there.
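The soft-parent selection rule for `--soft-dag-parents` (keep candidates whose cosine meets the threshold, rank by cosine descending with POSIX path ascending as tie-break, cap at three per leaf) can be sketched as below. The function name `selectSoftParents` and the candidate shape `{ dir, cosine }` are hypothetical; the constants mirror the values quoted for `SOFT_PARENT_AFFINITY_THRESHOLD` and `SOFT_PARENT_MAX_PER_LEAF`. Scores are taken as given, so the TF-IDF scoring itself is out of scope here.

```javascript
const AFFINITY_THRESHOLD = 0.35; // value quoted for SOFT_PARENT_AFFINITY_THRESHOLD
const MAX_PER_LEAF = 3;          // value quoted for SOFT_PARENT_MAX_PER_LEAF

// candidates: [{ dir: "posix/relative/path", cosine: number }] (hypothetical shape).
// Returns the chosen directories, best first.
function selectSoftParents(candidates, threshold = AFFINITY_THRESHOLD, max = MAX_PER_LEAF) {
  return candidates
    .filter((c) => c.cosine >= threshold) // "meets" the threshold, so inclusive
    .sort((a, b) => {
      if (b.cosine !== a.cosine) return b.cosine - a.cosine; // higher cosine first
      return a.dir < b.dir ? -1 : a.dir > b.dir ? 1 : 0;     // then path ascending
    })
    .slice(0, max)
    .map((c) => c.dir);
}
```

Passing `threshold` as a parameter rather than reading the constant inside the filter reflects the note above that overrides are threaded through `ctx.threshold`.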
+ - **`--fanout-target=N` and `--max-depth=D` — post-convergence balance enforcement.** A new `balance-enforcement` phase between operator-convergence and index-generation iterates until fixed point applying two deterministic transform classes:
+ - **Sub-cluster overfull directories.** Any directory whose *movable* (leaf-only) fan-out exceeds `fanout-target × 1.5` is a candidate; sub-clustering extracts coherent clusters out of leaves, so subdir-heavy dirs with few leaves are un-actionable here and correctly ignored. The math cluster detector carves out the strongest coherent cluster, the Phase X.3 deterministic slug + purpose helpers name it, and `applyNest` applies it. The fanout pass walks the full lex-sorted overfull list until it finds a parent whose leaves yield a live proposal, so one un-actionable candidate never stalls the whole iteration. The "× 1.5" slack (`FANOUT_OVERLOAD_MULTIPLIER`) avoids thrashing on directories that sit one or two children above target. `computeFanoutStats` now returns both `perDir` (combined leaves+subdirs, the Claude-routing-cost view) and `leafCounts` (the movable-fanout view) from a single traversal so `detectFanoutOverload` doesn't re-walk the tree. `buildWikiForbiddenIndex` is built once at `runBalance` entry — but ONLY when `--fanout-target` is set, since `--max-depth`-only runs never call `resolveNestSlug` and would pay a full-tree walk for nothing. The index is mutated (`wikiIndex.add(resolvedSlug)`) after each successful apply, mirroring `operators.mjs::tryClusterNestIteration`'s amortisation pattern — a previous draft rebuilt it per apply, quadratic on the 596-leaf target corpus.
93
+ - **Flatten overdeep single-child passthroughs.** Any branch exceeding `max-depth` whose terminal segment holds exactly one subdir and zero leaves is lifted up one level. Descendants' `parents[]` paths are left unchanged — they are relative to the direct parent's `index.md`, so promoting the whole subtree up one level preserves every relative path by construction. Multi-child subcategories are left alone. `applyBalanceFlatten` preflights the passthrough dir's raw `readdirSync` entries against the allowed set `{child-basename, "index.md"}` BEFORE any mutation — catches stray non-`.md` content (assets, orphan `README.txt`, subdirs lacking `index.md`) that `listChildren` doesn't enumerate, refusing the flatten rather than silently deleting unexpected data. Dot-prefixed entries (`.DS_Store`, editor backups, `.shape/` internals) are treated as noise — not grounds for refusal, but cleaned before the rename so the final `rmdirSync` succeeds. This matches the blanket dot-skip rule the rest of the pipeline already uses (`listChildren`, `buildWikiForbiddenIndex`, `collectEntryPaths`). Final `rmdirSync` refuses non-empty dirs natively as a second safety layer (e.g., against mid-flight writes between preflight and remove).
94
+ - Phase runs only when at least one flag is set; otherwise it is a strict no-op. Deterministic in the inputs — POSIX-normalised lex-sorted dir iteration (sort key is relative path with `\` normalised to `/`, so ordering matches across POSIX + Windows), lex-sorted cluster-member iteration, Phase X.3 deterministic naming. Two runs on the same tree produce identical output on either OS. Balance's tree traversal (`computeDepthMap`, `computeFanoutStats`, `detectDepthOverage`) walks the filesystem directly via `readdirSync` rather than through `listChildren`'s index.md-requiring discipline — balance runs at Phase 4.3, BEFORE Phase 5's `bootstrapIndexStubs`, so category dirs created in Phase 3 draft that have leaves but no `index.md` yet would otherwise be invisible to the rebalance pass.
95
+ - **Hard-fail on non-convergence.** The orchestrator's Phase 4.3 hook now throws when `runBalance` reports `converged: false` (iteration cap hit without reaching a fixed point). The pre-op snapshot is restored and the user sees a clear error instead of a silently half-balanced tree: an enforcement phase owes the caller a guarantee, not an advisory best effort.
96
+ - New module `scripts/lib/balance.mjs` exports `runBalance`, `computeDepthMap`, `getMaxDepth`, `computeFanoutStats`, `detectFanoutOverload`, `detectDepthOverage`, `applyBalanceFlatten`, and `FANOUT_OVERLOAD_MULTIPLIER`.
97
+ - New intent errors: `INT-14` (invalid `--fanout-target`, must be a positive integer in [`FANOUT_TARGET_MIN`, `FANOUT_TARGET_MAX`] = [2, 100]), `INT-14a` (`--fanout-target` used on a subcommand other than build / rebuild), `INT-15` (invalid `--max-depth`, must be a positive integer in [`MAX_DEPTH_MIN`, `MAX_DEPTH_MAX`] = [1, 10]), and `INT-15a` (`--max-depth` used on a subcommand other than build / rebuild). All four surface before the orchestrator runs, so a typo never triggers a pre-op snapshot; as defense in depth, the orchestrator also gates the phase on `plan.operation ∈ {build, rebuild}`.
98
+ - `contract.mjs::SUBCOMMANDS.build` and `.rebuild` now list the two new flags so consumers gating on the contract know they're available. `extend` is intentionally NOT in that list — the operation is a stub that throws "not yet implemented", so advertising the flags on it would be a contract lie.
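The overfull-directory rule described in the balance-enforcement notes can be sketched as follows. This is a minimal illustration, not the code in `scripts/lib/balance.mjs`: only `FANOUT_OVERLOAD_MULTIPLIER` and its 1.5 value come from the changelog, while `detectOverfull`, `posixKey`, and the sample directory names are hypothetical, and the plain code-unit sort is an assumption about what "lex-sorted" means.

```javascript
// Value taken from the changelog; everything else here is a hypothetical sketch.
const FANOUT_OVERLOAD_MULTIPLIER = 1.5;

// Sort key: relative path with "\" normalised to "/", so ordering
// is the same on POSIX and Windows.
const posixKey = (relPath) => relPath.replace(/\\/g, "/");

// leafCounts maps a relative directory path to its movable (leaf-only) fan-out.
function detectOverfull(leafCounts, fanoutTarget) {
  const limit = fanoutTarget * FANOUT_OVERLOAD_MULTIPLIER;
  return [...leafCounts.entries()]
    .filter(([, leaves]) => leaves > limit) // slack avoids thrashing near the target
    .map(([dir]) => posixKey(dir))
    .sort(); // default sort compares code units, independent of locale
}

const leafCounts = new Map([
  ["guides\\setup", 9], // Windows-style separator on purpose
  ["api", 7],
  ["guides", 4],
  ["arch/events", 8],
]);

const overfull = detectOverfull(leafCounts, 5); // limit is 5 * 1.5 = 7.5
console.log(overfull);
```

With a target of 5, a directory holding 7 leaves sits under the 7.5 limit and is left alone, which is the thrash-avoidance the multiplier is there for.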
99
+ - **`--quality-mode deterministic`** — a new quality mode that produces byte-reproducible wiki builds with zero LLM/sub-agent calls in the structural decision path. Complements `tiered-fast` / `claude-first` / `tier0-only`:
100
+ - **Pairwise decisions** (`scripts/lib/tiered.mjs::decide`): Tier 0 decisive paths fire as-is; Tier 0 mid-band escalates to Tier 1 (MiniLM embeddings, already deterministic); Tier 1 mid-band is resolved by a static threshold (`TIER1_DETERMINISTIC_THRESHOLD`, derived from the midpoint of the Tier 1 decisive bounds) instead of escalating to Tier 2. No Tier-2 escalation and no mid-band "undecidable" outcome — Tier 1 always produces a concrete same/different, and `tier2Handler` is never invoked. (Tier 0's "insufficient-text" undecidable on empty-frontmatter pairs is a separate path that predates this mode and is unchanged by design: an empty text pair can't be discriminated by any tier, regardless of quality mode.)
101
+ - **Cluster NEST** (`scripts/lib/operators.mjs::tryClusterNestIteration`): the `propose_structure` Tier 2 request is skipped entirely; math-only candidates bypass the `nest_decision` gate (auto-approved — the partition-shape score + metric regression gate already provide an algorithmic equivalent); math-only candidates also bypass the `cluster_name` request and receive a deterministic slug from `generateDeterministicSlug()` + a deterministic purpose from `deterministicPurpose()`.
102
+ - **Deterministic slug algorithm** (`scripts/lib/cluster-detect.mjs::generateDeterministicSlug`): TF-IDF over member frontmatters with the siblings' corpus as the IDF context, ranked `(weight desc, term asc)` for lex tie-breaking, top 1–2 valid tokens joined with `-`. Falls back to `cluster-<7-hex-fnv1a>` when no token survives the slug regex — still deterministic, still member-derived. Byte-stable across member shuffles.
103
+ - **Use case**: the mode to pair with an upcoming `--fanout-target` / `--max-depth` balance pass and soft-parent synthesis for large hand-authored corpora where reproducible builds matter more than the extra nuance an LLM adds at Tier 2.
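The deterministic slug algorithm described above can be illustrated with a small sketch. It is not `generateDeterministicSlug()` itself: the tokenizer, the IDF formula, the `SLUG_TOKEN` validity rule, the assumption that the sibling corpus includes the cluster members, and the way the seven hex digits are taken from the FNV-1a hash are all guesses made for the example.

```javascript
// Hypothetical validity rule standing in for the real slug regex.
const SLUG_TOKEN = /^[a-z][a-z0-9]{2,}$/;

const tokenize = (text) => text.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean);

// 32-bit FNV-1a over character codes (ASCII ids assumed).
function fnv1a(str) {
  let h = 0x811c9dc5;
  for (let i = 0; i < str.length; i++) {
    h ^= str.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h;
}

function sketchSlug(members, siblings) {
  // Document frequency over the sibling corpus (the IDF context).
  const df = new Map();
  for (const s of siblings) {
    for (const t of new Set(tokenize(s.text))) df.set(t, (df.get(t) || 0) + 1);
  }
  // Term frequency summed over members, so member order cannot matter.
  const tf = new Map();
  for (const m of members) {
    for (const t of tokenize(m.text)) tf.set(t, (tf.get(t) || 0) + 1);
  }
  const ranked = [...tf.entries()]
    .filter(([t]) => SLUG_TOKEN.test(t))
    .map(([t, n]) => [t, n * Math.log((1 + siblings.length) / (1 + (df.get(t) || 0)))])
    .filter(([, w]) => w > 0)
    // Rank by weight descending, then term ascending for lexical tie-breaking.
    .sort((a, b) => b[1] - a[1] || (a[0] < b[0] ? -1 : 1));
  if (ranked.length > 0) return ranked.slice(0, 2).map(([t]) => t).join("-");
  // Fallback: still derived from the members, still deterministic.
  const ids = members.map((m) => m.id).sort().join("\n");
  return "cluster-" + fnv1a(ids).toString(16).padStart(8, "0").slice(0, 7);
}

const siblings = [
  { id: "a", text: "kafka event streaming" },
  { id: "b", text: "event sourcing kafka" },
  { id: "c", text: "sql index tuning" },
  { id: "d", text: "sql query plans" },
];
const members = [siblings[0], siblings[1]];
console.log(sketchSlug(members, siblings));
```

Because both the term counts and the fallback hash input are built from order-independent aggregates (summed counts, sorted ids), shuffling the members cannot change the result, which is the property the changelog calls byte-stable.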
104
+
105
+ ### Removed
106
+
107
+ - **`--quality-mode tier0-only`** — removed entirely. The mode was a narrow "hermetic CI / no-Claude" path that returned `"undecidable"` for every Tier 0 mid-band pair, forcing manual resolution via the interactive review flow. With `--quality-mode deterministic` (Phase X.3, shipped above) covering the same "no LLM / no sub-agent call" use case AND resolving mid-band pairs via the static Tier 1 threshold (no interactive-review fallback required), keeping tier0-only around was pure maintenance tax. Removal touches `scripts/lib/intent.mjs::VALID_QUALITY_MODES`, `scripts/lib/tiered.mjs::QUALITY_MODES`, the mid-band-undecidable branch in `decide()`, the `INT-13` validation message, the CLI help, methodology.md / README.md / guide sections, and their tests. `--quality-mode tier0-only` now raises `INT-13` at the intent layer — users should migrate to `deterministic` (the path for every tier0-only use case) or `tiered-fast` (the default). Note: "deterministic" eliminates runtime Claude calls but the Tier 1 MiniLM model is still downloaded on first use by `@xenova/transformers` if the local cache is cold — fully air-gapped use requires pre-warming `~/.cache/huggingface` on a networked host. A companion `tests/unit/intent-resolve.test.mjs::"VALID_QUALITY_MODES is in sync with tiered.mjs::QUALITY_MODES"` test guards the mode-list duplication between intent.mjs and tiered.mjs so future drift fails loud.
108
+
109
+ ### Fixed
110
+
111
+ - **`LLM_WIKI_QUALITY_MODE` env var now actually works.** The env var was documented and tested through `tiered.mjs::resolveQualityMode`, but the orchestrator bypassed that helper with `plan.flags?.quality_mode || "tiered-fast"` — so the env var silently had no effect on CLI runs. Orchestrator now calls `resolveQualityMode(plan.flags)` at both the convergence phase and the balance phase, wiring the env var through end-to-end. The flag still wins when both flag and env are set. Invalid values raise `INT-13` at the intent layer on BOTH paths with identical valid-values suggestions (flag and env share the structured-error shape), so a stale `LLM_WIKI_QUALITY_MODE=tier0-only` in a shell profile fails loud on the next invocation rather than surfacing as a generic exit-1. Env-var validation is gated to subcommands that actually consume quality mode (`build` / `extend` / `rebuild` / `fix` / `join`) so a stale env var doesn't lock the user out of recovery paths like `rollback` or `validate`.
112
+ - **Convergence cap starved cluster-NEST on large-merge-heavy corpora.** `operators.mjs::MAX_CONVERGENCE_ITERATIONS` was 20. `runConvergence` applies at most ONE pairwise operator (DESCEND/LIFT/MERGE) per outer iteration, and pairwise ops always outrank cluster NEST (which only runs when no pairwise op fires in a given iteration). Once cluster NEST IS reached, `tryClusterNestIteration` can apply multiple NEST commits in that single pass — the scarcity is at the outer iteration level, not per-NEST. On the 596-leaf `skill-code-review/reviewers.src/` fixture — now reproducibly re-run as a post-merge X.6 smoke test — Tier 1 similarity finds ~20 viable MERGE pairs. Iterations 1-20 all apply MERGE, convergence hits the 20-cap, and the Phase X.10 coarse-partition (which emits ~75 top-level NESTs for a flat 596-leaf root) never runs even once. The downstream balance phase then tried to linearly carve a 576-leaf root, hit its own 20-iter cap, and rolled back. Bumped to `MAX_CONVERGENCE_ITERATIONS = 200` — enough for 20 pairwise ops + at least one cluster-NEST pass (which internally lands all ~75 NESTs) + follow-up MERGEs after NESTs reveal new overlaps. Small wikis exit early via the "no ops fired this iteration" break; the raise is effectively free for them. Caller override via `runConvergence({ maxIterations })` is unchanged. (Follow-up: the real architectural fix is to let cluster NEST interleave with pairwise ops in the same iteration rather than run only as a fallback — tracked separately, not blocking X.6.)
113
+ - **Cross-depth slug collision guard in `resolveNestSlug`.** The v1.0.0 collision resolver checked only the cluster's immediate parent directory, missing collisions with leaf ids or subdirectory basenames elsewhere in the tree. On real-world multi-branch wikis (first observed during a 596-leaf novel-corpus build in the consumer skill `skill-code-review`), Tier 2's `propose_structure` picked slug `event-patterns` for a cluster under `design-patterns-group/`, colliding with an existing leaf at `arch/event-patterns/index.md` (id: `event-patterns`) in a completely different branch. The parent-dir-only walk missed it; validation caught `DUP-ID` post-apply and forced a rollback. The resolver's signature is now `resolveNestSlug(slug, proposal, wikiRoot, opts)`: it gains an optional `wikiRoot` third argument and an `opts.wikiIndex` escape hatch. When `wikiRoot` is supplied, the internal `collectForbiddenIdsPredicate` returns a predicate backed by a local parent-dir set PLUS either the caller's precomputed index (via `opts.wikiIndex`) or a fresh full-tree walk (via the new private `walkWikiIds`). A new exported helper `buildWikiForbiddenIndex(wikiRoot)` materialises the wiki-wide id + directory-basename set once per convergence iteration: `operators.mjs::tryClusterNestIteration` precomputes it when at least one proposal is picked, mutates it by `wikiIndex.add(resolvedSlug)` after each successful apply, and passes it through `opts.wikiIndex` so each slug-resolve call runs in O(parent-dir) instead of O(full-tree). Total cost across a multi-NEST iteration: O(#files + #applies) instead of O(#applies × #files). Dot-prefixed entries (directories AND files: `.llmwiki/`, `.work/`, `.git/`, `.github/`, stray `.DS_Store` / `.foo.md`) are skipped under a blanket rule matching `scripts/lib/chunk.mjs::collectEntryPaths` discipline. Per-file frontmatter is read via `readFrontmatterStreaming` for bounded reads at the ~600-leaf scale. Legacy callers that don't pass `wikiRoot` continue to get the parent-dir-only behaviour, so the change is backward-compatible. Fixes issue [#4](https://github.com/ctxr-dev/skill-llm-wiki/issues/4) bug 2.
114
+ - **Pre-apply alias collision guard in MERGE.** `applyMerge` in `scripts/lib/operators.mjs` now walks the full wiki (skipping every dot-directory) to collect every live entry id (every `.md` file's frontmatter `id:`, including `index.md` entries) before it writes the keeper's new `aliases[]`. If any of the new aliases (absorbed's id or any of absorbed's own aliases) would collide with a live id elsewhere in the tree, the merge is refused pre-apply with a clear error — nothing is written, nothing is deleted, the convergence iteration can continue with the next proposal. Before this fix, such collisions were caught downstream at validation as `ALIAS-COLLIDES-ID`, forcing a full pipeline rollback. The guard is defensive and targets the multi-operator-per-iteration reach-state that produced the collisions during the consumer-skill `skill-code-review` 596-leaf novel-corpus build (3 MERGE pairs hit this during Bundle A2: `pattern-eip-messaging↔pattern-eip-endpoint`, `smell-data-class↔antipattern-anemic-domain-model`, `smell-duplicate-code↔antipattern-copy-paste`). The new helper `collectLiveIds(wikiRoot, excludePaths)` is exported so future operators (e.g. the real `join` implementation) can reuse it. Fixes issue [#4](https://github.com/ctxr-dev/skill-llm-wiki/issues/4) bug 3.
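The pre-apply alias guard can be pictured with an in-memory sketch. The real `applyMerge` walks the wiki on disk via `collectLiveIds(wikiRoot, excludePaths)`; here the set of live ids is passed in directly, and `planMergeAliases`, the return shape, and the sample ids are hypothetical.

```javascript
// liveIdsElsewhere: ids of every live entry except the two merge participants
// (an in-memory stand-in for what the changelog's excludePaths achieves).
function planMergeAliases(liveIdsElsewhere, keeper, absorbed) {
  // The keeper would inherit the absorbed entry's id plus its existing aliases.
  const incoming = [absorbed.id, ...(absorbed.aliases ?? [])];
  const collisions = incoming.filter((alias) => liveIdsElsewhere.has(alias));
  if (collisions.length > 0) {
    // Refuse before anything is written or deleted; the caller moves on
    // to the next proposal instead of rolling back the whole pipeline.
    return { ok: false, collisions };
  }
  const aliases = [...new Set([...(keeper.aliases ?? []), ...incoming])];
  return { ok: true, aliases };
}

const liveIdsElsewhere = new Set(["event-patterns", "retry-policy"]);
const keeper = { id: "dup-code", aliases: [] };

const refused = planMergeAliases(liveIdsElsewhere, keeper, {
  id: "copy-paste",
  aliases: ["retry-policy"], // already a live id elsewhere in the tree
});
const accepted = planMergeAliases(liveIdsElsewhere, keeper, {
  id: "copy-paste",
  aliases: ["clone-code"],
});
console.log(refused, accepted);
```

Returning a result object rather than throwing is a choice made for this sketch only; the changelog says the real guard refuses the merge "with a clear error" and lets the convergence iteration continue.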
115
+
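For the `LLM_WIKI_QUALITY_MODE` fix in this section, the precedence rule (flag first, then env var, then the `tiered-fast` default, with invalid values rejected as `INT-13`) can be sketched like this. The function name, the explicit `env` parameter, and the error object's shape are assumptions for illustration; the real helper is `tiered.mjs::resolveQualityMode(plan.flags)`.

```javascript
// Mode list after this release, per the changelog (tier0-only removed).
const SKETCH_QUALITY_MODES = ["tiered-fast", "claude-first", "deterministic"];

// env is passed explicitly so the sketch stays testable; the real helper
// presumably reads the process environment itself.
function sketchResolveQualityMode(flags = {}, env = {}) {
  const raw = flags.quality_mode || env.LLM_WIKI_QUALITY_MODE || "tiered-fast";
  if (!SKETCH_QUALITY_MODES.includes(raw)) {
    const err = new Error(
      `invalid quality mode "${raw}"; valid values: ${SKETCH_QUALITY_MODES.join(", ")}`
    );
    err.code = "INT-13"; // hypothetical way of carrying the structured error code
    throw err;
  }
  return raw;
}

console.log(sketchResolveQualityMode({}, {}));
console.log(sketchResolveQualityMode({}, { LLM_WIKI_QUALITY_MODE: "deterministic" }));
console.log(
  sketchResolveQualityMode(
    { quality_mode: "claude-first" },
    { LLM_WIKI_QUALITY_MODE: "deterministic" }
  )
);
```

A stale `LLM_WIKI_QUALITY_MODE=tier0-only` takes the throwing branch in this sketch, mirroring the "fails loud" behaviour the entry describes.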
116
+ ### Changed
117
+
118
+ - **`VALID_QUALITY_MODES`** (`scripts/lib/intent.mjs`) and **`QUALITY_MODES`** (`scripts/lib/tiered.mjs`) now include `"deterministic"`. The `intent-resolve` and `QUALITY_MODES` canonical-allow-list tests follow.
119
+ - **NEST audit-trail semantics for deterministic mode.** `appendNestDecision` in `scripts/lib/decision-log.mjs` previously hard-coded `tier_used: 2` for every NEST entry — correct before deterministic mode existed, since every NEST touched Tier 2 via `propose_structure` or `nest_decision`. Under `--quality-mode deterministic` no sub-agent is ever consulted for math candidates, so entries from that path now record `tier_used: 0` and a new `confidence_band: "deterministic-math"` distinguishes them from `"math-gated"` (which still means "math candidate that passed a Tier 2 gate" under the other quality modes). Tooling and tests that filter `decisions.yaml` by `tier_used` to reason about sub-agent costs now see accurate zeros on the deterministic path. Call sites that don't supply `tier_used` still default to 2, so every existing non-deterministic entry is byte-identical.
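The defaulting behaviour for NEST audit entries might look roughly like the sketch below. Only `tier_used`, its default of 2, and the two `confidence_band` values come from the changelog; the function name and the `operator` / `slug` fields are hypothetical stand-ins for whatever `appendNestDecision` actually records.

```javascript
// Call sites that omit tier_used keep the historical value of 2,
// so existing non-deterministic entries are unaffected.
function sketchNestEntry({ slug, confidence_band, tier_used = 2 }) {
  return { operator: "NEST", slug, tier_used, confidence_band };
}

const entries = [
  // Deterministic path: no sub-agent consulted, so the entry records tier 0.
  sketchNestEntry({ slug: "event-kafka", tier_used: 0, confidence_band: "deterministic-math" }),
  // Other quality modes: math candidate that passed a Tier 2 gate.
  sketchNestEntry({ slug: "storage-group", confidence_band: "math-gated" }),
];

// Tooling that estimates sub-agent cost can now filter on tier_used.
const subAgentEntries = entries.filter((e) => e.tier_used === 2).length;
console.log(subAgentEntries);
```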
120
+
121
+ ### Tests
122
+
123
+ - `tests/unit/soft-dag.test.mjs` — 17 new scenarios covering `runSoftDagParents` (no-op empty wiki, zero soft parents on unrelated topics, multiple-category overlap adds soft parents, `maxPerLeaf` cap honoured, determinism across runs, primary-parent preservation at `parents[0]`, CRLF-fence leaf recognition) and `applySoftParentEntries` (append into claimed parent's entries[], idempotency across runs, ignore primary-only leaves, path-traversal rejection via lexical + symlink-realpath defense-in-depth guard, tolerate malformed `parents[]` entries, stats reflect actual writes not planned appends, end-to-end synthesis + propagation). Plus `tests/unit/intent-resolve.test.mjs` — 2 new scenarios covering `INT-16a` subcommand-scoping rejection on `fix`/`extend`/`validate`/`join`/`rollback` and strict `status === "ok"` acceptance on `build`.
124
+ - `tests/unit/balance.test.mjs` — 18 new scenarios: depth-map (plus non-wiki-node dirs skipped under the `index.md`-only discipline), max-depth, fanout-stats (including `leafCounts` return), overload detection (with and without `nestedParents` exclusion, plus a leaf-metric regression guard against flagging dirs overfull only via subdir count, plus a POSIX-sort-key regression guard for cross-platform ordering), depth-overage detection (only single-child passthroughs), flatten happy path + refuse on multi-child, flatten refuses when passthrough holds stray non-`.md` content (defensive emptiness check), flatten tolerates + cleans dot-prefixed noise (`.DS_Store`) under the blanket dot-skip rule, `runBalance` no-op when flags absent, fanout-only pass, depth-only pass, fanout pass skips un-actionable `overfull[0]` and acts on a later candidate, multiplier constant pin.
125
+ - `tests/unit/intent-resolve.test.mjs` — 7 new scenarios covering INT-14 / INT-15 accept/reject boundaries plus INT-14a / INT-15a subcommand-scoping rejection on `fix`/`extend`/`validate` and acceptance on `build`/`rebuild`.
126
+ - New scenarios in `tests/unit/nest-applier.test.mjs` for cross-depth collision, subdir-basename collision, `-group-N` chain fallback, dot-prefixed skip, clean-tree no-op, same-depth regression guard, `buildWikiForbiddenIndex` snapshot shape, `opts.wikiIndex` short-circuit semantics, caller-mutation round-trip, and wiki-root `index.md` id capture (in both the legacy walk and the precomputed-index path). All pre-existing tests pass unchanged.
127
+ - 4 new scenarios in `tests/unit/operators.test.mjs` covering the guard-trips-on-collision path, the guard-permits-clean-merge regression, and two `collectLiveIds` helper tests (skips `.llmwiki/`/`.work/`, honours `excludePaths`).
128
+ - New deterministic-mode coverage across three files:
129
+ - `tests/unit/tiered.test.mjs` — `QUALITY_MODES` canonical allow-list including `"deterministic"`, a constant-derivation pin for `TIER1_DETERMINISTIC_THRESHOLD` (midpoint of the Tier 1 decisive bounds), a joint Tier 0 + Tier 1 mid-band sweep that exercises the `confidence_band === "deterministic-mid-band"` branch empirically, byte-stability across runs, non-escalation to Tier 2, and Tier 0 decisive-path preservation under the new mode.
130
+ - `tests/unit/cluster-detect.test.mjs` — `generateDeterministicSlug` (distinguishing-token selection, order invariance, multi-run stability, hash fallback determinism, precomputed-IDF equivalence) and `deterministicPurpose` (most-shared cover, lex tie-break, focus fallback, plain-frontmatter input equivalence).
131
+ - `tests/e2e/determinism.test.mjs` — new full-build test: two independent `build --quality-mode deterministic` runs on a 6-leaf two-theme corpus must produce byte-identical tree SHAs; the cluster-nest path is in-frame (not skipped); audit-trail check hard-asserts ≥1 NEST entry carrying `tier_used: 0` + `confidence_band: "deterministic-math"`.
132
+ - `tests/unit/intent-resolve.test.mjs` — extended the `--quality-mode` acceptance test to cover `"deterministic"` alongside the existing three modes.
133
+ - 3 skipped (unchanged from prior baseline; all opt-in gates). 0 failing on `ubuntu-latest` / `windows-latest` (CI) and locally.
134
+
7
135
  ## [1.0.0] — 2026-04-16
8
136
 
9
137
  First stable release. The semantic-routing substrate landed in v0.4.0, multi-NEST convergence landed in v0.4.1, and 1.0.0 closes the remaining sharp edge — a DUP-ID collision path discovered during the v0.4.1 deferred novel-corpus validation — plus the Windows CI parity gap. The v0.4.1 "Known remaining gaps" novel-corpus validation item is **resolved**: the combined `skill-code-review/reviewers/` + `overlays/` corpus (45 leaves) now builds end-to-end on the first try, `validate` returns 0 errors / 0 warnings, and multi-NEST applies atomically in a single convergence iteration. Semver commitments are now in effect: the six public operations (Build, Extend, Validate, Rebuild, Fix, Join), the CLI exit-code surface, the layout-mode contract, and the private-git history shape are stable and will not break in 1.x.
package/README.md CHANGED
@@ -3,9 +3,12 @@
3
3
  [![npm](https://img.shields.io/npm/v/@ctxr/skill-llm-wiki)](https://www.npmjs.com/package/@ctxr/skill-llm-wiki)
4
4
  [![License: MIT](https://img.shields.io/badge/License-MIT-blue.svg)](LICENSE)
5
5
  [![CI](https://img.shields.io/badge/CI-Ubuntu%20%2B%20Windows-green)](.github/workflows/ci.yml)
6
+ [![Agent Skills](https://img.shields.io/badge/Agent%20Skills-Claude%20Code%20%7C%20Codex%20CLI-blue)](https://agentskills.io)
6
7
 
7
8
  > **Turn any folder of markdown, docs, or source into a deterministic, token-efficient knowledge base your AI agent reads the way you'd want it to — once, and only the parts it needs.**
8
9
 
10
+ Supports Claude Code and OpenAI Codex CLI via the open Agent Skills standard. Tier 2 sub-agent dispatch follows the [`subagent-dispatch-v1`](https://github.com/ctxr-dev/kit/blob/main/docs/subagent-dispatch-v1.md) envelope.
11
+
9
12
  ## The problem every AI-heavy workflow eventually hits
10
13
 
11
14
  You want your AI pair — Claude, Cursor, an agent loop, whatever — to *know things*. Architecture decisions. Runbooks. API contracts. Prior postmortems. Team conventions. The messy folder of `.md` notes you've been keeping for eighteen months.
@@ -62,7 +65,7 @@ Every phase is a git commit in the wiki's private history, so you can inspect, d
62
65
  - **Stable sibling layout.** `<source>.wiki/` is the one folder a wiki ever lives in. No more `.llmwiki.v1`/`.v2`/`.v3` directory proliferation — prior states are reachable as git tags (`pre-op/<id>`, `op/<id>`) in the private repo.
63
66
  - **Three layout modes, never guessed.** `sibling` (default), `in-place` (source IS the wiki), and `hosted` (user-chosen path with a `.llmwiki.layout.yaml` contract). Ambiguous invocations refuse and prompt — see the "Ask, don't guess" rule.
64
67
  - **User-repo coexistence.** An auto-generated `.gitignore` hides the private metadata from any ancestor user git. The skill's isolation env block (`GIT_DIR`, `GIT_CONFIG_NOSYSTEM`, `core.hooksPath=/dev/null`, …) keeps the two gits from leaking into each other.
65
- - **Tiered AI strategy.** TF-IDF (free) → local MiniLM embeddings (required, ~23 MB one-time model download, zero-API) → Claude (only for mid-band ambiguity and decisions requiring natural-language judgment). `--quality-mode tiered-fast|claude-first|tier0-only` selects the escalation policy.
68
+ - **Tiered AI strategy.** TF-IDF (free) → local MiniLM embeddings (required, ~23 MB one-time model download, zero-API) → Claude (only for mid-band ambiguity and decisions requiring natural-language judgment). `--quality-mode tiered-fast|claude-first|deterministic` selects the escalation policy.
66
69
  - **Deterministic slug collisions.** NEST operator auto-resolves slug-vs-member-id collisions with a deterministic `-group` suffix before apply. Your convergence loop never needs manual retries for DUP-ID.
67
70
  - **Optional interactive review.** `skill-llm-wiki rebuild <wiki> --review` prints the post-convergence diff and commit list, lets the user approve / abort / `drop:<sha>` specific iterations, and re-runs validation + index regen on the reverted tree.
68
71
  - **Windows parity.** The CI matrix runs the smoke suite on both `ubuntu-latest` and `windows-latest`; the isolation env switches `/dev/null` to `NUL` and enables `core.longpaths=true` on Windows.
@@ -91,8 +94,8 @@ Merge my docs and runbooks wikis into a handbook
91
94
 
92
95
  This skill has two hard requirements. If either is missing, the skill will refuse to run and print a clear message explaining why and how to fix it.
93
96
 
94
- 1. **[Claude Code](https://claude.ai/code) CLI or IDE extension.**
95
- 2. **Node.js ≥ 18.0.0.** The skill's deterministic CLI (`scripts/cli.mjs`) is a Node.js program, so Node must be available in the shell Claude Code uses to run Bash commands. If Node.js is missing or below the minimum version, Claude will stop the operation before making any changes and relay platform-specific install instructions.
97
+ 1. **An Agent Skills-compatible harness** ([Claude Code](https://claude.ai/code) CLI/IDE, OpenAI Codex CLI, or another harness implementing the [open Agent Skills standard](https://agentskills.io)).
98
+ 2. **Node.js ≥ 18.0.0.** The skill's deterministic CLI (`scripts/cli.mjs`) is a Node.js program, so Node must be available in the shell the host harness uses to run Bash commands. If Node.js is missing or below the minimum version, the harness will stop the operation before making any changes and relay platform-specific install instructions.
96
99
 
97
100
  ### Verify your environment before invoking the skill
98
101
 
@@ -172,7 +175,7 @@ npx @ctxr/kit install @ctxr/skill-llm-wiki # project-local
172
175
  npx @ctxr/kit install @ctxr/skill-llm-wiki --user # user-global
173
176
  ```
174
177
 
175
- Installs to `.claude/skills/ctxr-skill-llm-wiki/` (or `~/.claude/skills/…` with `--user`). No post-install wiring, no automatic hooks, no filesystem watchers — the skill is pure standby until you explicitly ask Claude to run an operation against a specific directory.
178
+ Installs canonically to `.agents/skills/ctxr-skill-llm-wiki/` (or `~/.agents/skills/…` with `--user`); `@ctxr/kit` auto-creates discovery-mirror symlinks at `.claude/skills/` (and `~/.codex/skills/` for user-scope) so Claude Code, Codex CLI, and other Agent Skills harnesses all find the artefact. No post-install wiring, no automatic hooks, no filesystem watchers; the skill is pure standby until you explicitly ask the host harness to run an operation against a specific directory.
176
179
 
177
180
  The installed package contains `SKILL.md` (the routing entry point Claude reads at activation), `LICENSE`, `README.md`, `scripts/` (invoked via `node scripts/cli.mjs <subcommand>`, never read as source), and `guide/` (context-specific routing leaves loaded on keyword activation — `hidden-git.md` when the user asks about history or diff, `user-intent.md` when the request is ambiguous, `tiered-ai.md` when the user asks about quality modes, etc.). The internal design doc `methodology.md` is deliberately excluded from the installed package (`files[]` in `package.json` does not list it) so it is never copied into any user environment and never loaded during a session.
178
181
 
@@ -180,15 +183,15 @@ The installed package contains `SKILL.md` (the routing entry point Claude reads
180
183
 
181
184
  ```bash
182
185
  git clone https://github.com/ctxr-dev/skill-llm-wiki.git /tmp/skill-llm-wiki
183
- mkdir -p .claude/skills
184
- cp -r /tmp/skill-llm-wiki .claude/skills/skill-llm-wiki
186
+ mkdir -p .agents/skills
187
+ cp -r /tmp/skill-llm-wiki .agents/skills/skill-llm-wiki
185
188
  ```
186
189
 
187
190
  ### Git Submodule
188
191
 
189
192
  ```bash
190
193
  git submodule add https://github.com/ctxr-dev/skill-llm-wiki.git \
191
- .claude/skills/skill-llm-wiki
194
+ .agents/skills/skill-llm-wiki
192
195
  ```
193
196
 
194
197
  ## Usage
@@ -463,7 +466,7 @@ Quality modes select the escalation policy:
463
466
 
464
467
  - `tiered-fast` (default) — full Tier 0 → 1 → 2 ladder.
465
468
  - `claude-first` — skip Tier 1; mid-band Tier 0 escalates straight to Claude.
466
- - `tier0-only` — air-gapped mode; mid-band becomes an "undecidable" marker resolved via the interactive review flow.
469
+ - `deterministic` — no LLM in the loop; Tier 1 mid-band resolved by a static threshold, cluster naming produced from member frontmatter. Byte-reproducible builds for air-gapped / hermetic CI use.
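How a static threshold can settle the Tier 1 mid-band without escalation is sketched below. The bound values are made up for the example (the real Tier 1 decisive bounds live in `scripts/lib/tiered.mjs`), and both the `>=` comparison at the threshold and the `"escalate-tier2"` label are assumptions; only the midpoint derivation and the `"deterministic-mid-band"` band name are taken from the changelog.

```javascript
// Hypothetical decisive bounds on the Tier 1 similarity score.
const SKETCH_DECISIVE_DIFFERENT = 0.5;
const SKETCH_DECISIVE_SAME = 0.75;
// Midpoint of the decisive bounds, as the changelog describes.
const SKETCH_THRESHOLD = (SKETCH_DECISIVE_DIFFERENT + SKETCH_DECISIVE_SAME) / 2; // 0.625

function sketchTier1Decide(similarity, qualityMode) {
  if (similarity >= SKETCH_DECISIVE_SAME) return { decision: "same" };
  if (similarity <= SKETCH_DECISIVE_DIFFERENT) return { decision: "different" };
  if (qualityMode === "deterministic") {
    // Mid-band: always a concrete answer, never a Tier 2 request.
    return {
      decision: similarity >= SKETCH_THRESHOLD ? "same" : "different",
      confidence_band: "deterministic-mid-band",
    };
  }
  // Other modes would hand the pair to Tier 2 at this point.
  return { decision: "escalate-tier2" };
}

console.log(sketchTier1Decide(0.7, "deterministic"));
console.log(sketchTier1Decide(0.6, "deterministic"));
console.log(sketchTier1Decide(0.7, "tiered-fast"));
```

The same similarity score therefore yields the same answer on every run, which is what makes the mode reproducible, at the cost of the extra judgment a Tier 2 call would add in the mid-band.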
467
470
 
468
471
  Tier 1 uses `@xenova/transformers` running `Xenova/all-MiniLM-L6-v2` locally via ONNX (~23 MB one-time model download, ~50 ms per text on CPU, zero API cost). It is a **required** runtime dependency since v0.4.0 — the dependency preflight at CLI startup verifies it is resolvable, and will offer to `npm install` it on a fresh checkout if it is missing.
469
472
 
package/SKILL.md CHANGED
@@ -62,18 +62,18 @@ When the user asks for any of the six operations (Build, Extend, Validate, Rebui
62
62
 
63
63
  1. **Resolve the ask** — pin down the operation, source, target, layout mode, and any constraints. Prompt the user to disambiguate where needed (see `guide/ux/user-intent.md`).
64
64
  2. **Run the Node.js preflight** in the main session — this is cheap, produces a tiny output, and must happen before any agent is spawned so the user sees the detailed install/upgrade message on failure. Preflight failures stop the operation; do not spawn an agent.
65
- 3. **Spawn a dedicated "wiki-runner" sub-agent** via the `Agent` tool with a self-contained prompt describing: the operation, the resolved CLI invocation, the activated `guide/` leaves by filename, any quality-mode / layout-mode flags, and the completion criterion. The sub-agent runs the CLI, handles Tier 2 sub-delegations, manages its own context, and reports back a summary when done.
65
+ 3. **Spawn a dedicated "wiki-runner" sub-agent** via the host harness's sub-agent dispatch primitive (Claude Code: `Agent` tool; Codex CLI: equivalent; see the [`subagent-dispatch-v1`](https://github.com/ctxr-dev/kit/blob/main/docs/subagent-dispatch-v1.md) spec) with a self-contained prompt describing: the operation, the resolved CLI invocation, the activated `guide/` leaves by filename, any quality-mode / layout-mode flags, and the completion criterion. The sub-agent runs the CLI, handles Tier 2 sub-delegations, manages its own context, and reports back a summary when done.
66
66
  4. **Relay the sub-agent's summary** to the user. The main session never loads the wiki's content into its own context window.
67
67
 
68
68
  ### Why
69
69
 
70
- Wikis can be any size — a 10-entry notes folder or a 10,000-entry knowledge base. A Build that drafts frontmatter for a prose-heavy 10k corpus can run Claude against thousands of entries in Tier 2. Running all of that inline in the main session would consume the user's context budget on content they never asked to see, and would leave no room for continued conversation. The wiki-runner sub-agent has its own context window; the main session's budget stays lean for the user's ongoing chat.
70
+ Wikis can be any size: a 10-entry notes folder or a 10,000-entry knowledge base. A Build that drafts frontmatter for a prose-heavy 10k corpus can run the host LLM against thousands of entries in Tier 2. Running all of that inline in the main session would consume the user's context budget on content they never asked to see, and would leave no room for continued conversation. The wiki-runner sub-agent has its own context window; the main session's budget stays lean for the user's ongoing chat.
71
71
 
72
72
  ### What the wiki-runner sub-agent is responsible for
73
73
 
74
74
  - **Executing the CLI subcommand** and streaming progress back periodically (don't spam every phase — one line per phase is plenty).
75
75
  - **Its own context-window hygiene.** The sub-agent monitors its remaining budget and auto-compacts when it approaches the limit. See `guide/isolation/scale.md` "Context-window management in the wiki-runner" for the protocol — the short version is: phase commits in the private git are the durable checkpoint, so the sub-agent can safely drop its conversation history of prior phases and re-read only what the next phase needs.
76
- - **Handling the Tier 2 exit-7 handshake.** The skill's CLI runs under Node and cannot spawn sub-agents directly. When the operator-convergence phase accumulates Tier 2 requests (cluster naming, mid-band merge decisions, `propose_structure` whole-directory asks, `nest_decision` gate decisions, …) the CLI writes them to `<wiki>/.work/tier2/pending-<batch-id>.json` and exits with code **7** (`NEEDS_TIER2`). Exit 7 is **not a failure** — it is the suspend-and-resume signal. The wiki-runner must:
76
+ - **Handling the Tier 2 exit-7 handshake.** The skill's CLI runs under Node and cannot spawn sub-agents directly. When the operator-convergence phase accumulates Tier 2 requests (cluster naming, mid-band merge decisions, `propose_structure` whole-directory asks, `nest_decision` gate decisions, …) the CLI writes them to `<wiki>/.work/tier2/pending-<batch-id>.json` and exits with code **7** (`NEEDS_TIER2`). Each pending request follows the [`subagent.dispatch.v1`](https://github.com/ctxr-dev/kit/blob/main/docs/subagent-dispatch-v1.md) shape (`prompt`, `inputs`, `effort`, optional `model`, `response_schema`, `outputs_path`) so any Agent Skills harness can service it. Exit 7 is **not a failure**, it is the suspend-and-resume signal. The wiki-runner must:
77
77
  1. Detect exit 7 from the CLI.
78
78
  2. Read every `pending-*.json` file under `<wiki>/.work/tier2/`.
79
79
  3. Service each request (see "Inline servicing vs fan-out" below).
@@ -86,9 +86,9 @@ Wikis can be any size — a 10-entry notes folder or a 10,000-entry knowledge ba
86
86
 
87
87
  The wiki-runner chooses **inline** or **fan-out** servicing based on the batch size. The skill CLI's wire protocol is identical either way — pending files in, response files out, exit 7 between — so the choice is entirely a context-budget and throughput call that the wiki-runner makes at runtime:
88
88
 
89
- - **Inline (≤ ~50 requests per batch).** The wiki-runner answers every request directly, reasoning as the Tier 2 worker itself. No child `Agent` spawn per request. Each request's `prompt`, `inputs`, `response_schema`, `model_hint`, and `effort_hint` are visible to the wiki-runner, which writes the JSON response inline. This is the right choice for a typical `build`/`rebuild` against a ~10-50 leaf corpus: batch sizes stay small, fan-out overhead would dwarf the work, and the wiki-runner's own context is plenty for a few dozen frontmatter-blob comparisons. Environment constraint: general-purpose sub-agents in the current Claude Code harness cannot spawn further `Agent`s themselves; inline servicing is actually the *only* option when the wiki-runner is itself a nested sub-agent, so the skill's design must not require fan-out.
89
+ - **Inline (≤ ~50 requests per batch).** The wiki-runner answers every request directly, reasoning as the Tier 2 worker itself. No child sub-agent dispatch per request. Each request's `prompt`, `inputs`, `response_schema`, `effort` (with optional `model` override) are visible to the wiki-runner, which writes the JSON response inline. This is the right choice for a typical `build`/`rebuild` against a ~10-50 leaf corpus: batch sizes stay small, fan-out overhead would dwarf the work, and the wiki-runner's own context is plenty for a few dozen frontmatter-blob comparisons. Environment constraint: general-purpose sub-agents in current Agent Skills harnesses (Claude Code, Codex CLI) cannot spawn further sub-agents themselves, so inline servicing is actually the *only* option when the wiki-runner is itself a nested sub-agent, and the skill's design must not require fan-out.
90
90
 
91
- - **Fan-out (> ~50 requests per batch, or mixed `model_hint`s).** The wiki-runner reports the batch size back to the main session, which spawns one narrowly-scoped `Agent` per request (or per small group of homogeneous requests) with the request's `prompt`, `inputs`, `response_schema`, `model_hint`, and `effort_hint`. The sub-agent sees ONLY those inputs: two frontmatter blobs for a `merge_decision`, a cluster's leaf metadata for a `cluster_name`, a directory's whole leaf list for a `propose_structure`, and so on. Never pass the whole wiki. Fan-out is the right choice for large corpora (thousands of leaves → thousands of draft-frontmatter and merge-decision requests) where inline would burn through the wiki-runner's context.
91
+ - **Fan-out (> ~50 requests per batch, or mixed `effort`/`model` per request).** The wiki-runner reports the batch size back to the main session, which dispatches one narrowly-scoped sub-agent per request (or per small group of homogeneous requests) via the host harness's sub-agent primitive, passing the request's `prompt`, `inputs`, `response_schema`, `effort`, and optional `model`. The sub-agent sees ONLY those inputs (two frontmatter blobs for a `merge_decision`, a cluster's leaf metadata for a `cluster_name`, a directory's whole leaf list for a `propose_structure`, and so on). Never pass the whole wiki. Fan-out is the right choice for large corpora (thousands of leaves → thousands of draft-frontmatter and merge-decision requests) where inline would burn through the wiki-runner's context.
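
The choice between the two servicing modes above can be sketched as a small pure function. This is an illustration only, not code from the skill: `chooseServicingMode`, the `nested` option, and the exact cutoff of 50 are assumptions made for the example (the guide itself says "~50").

```javascript
// Hypothetical helper: not part of the skill's CLI or the wiki-runner contract.
const INLINE_LIMIT = 50; // stands in for the "~50 requests per batch" guideline

function chooseServicingMode(requests, { nested = false, inlineLimit = INLINE_LIMIT } = {}) {
  // A nested sub-agent cannot dispatch further sub-agents, so inline is the only option.
  if (nested) return "inline";

  // Collect the distinct (effort, model) combinations present in the batch.
  const profiles = new Set(requests.map((r) => `${r.effort}|${r.model ?? ""}`));

  // Large batches or mixed effort/model requirements go to fan-out.
  if (requests.length > inlineLimit || profiles.size > 1) return "fan-out";
  return "inline";
}
```

Either result leaves the CLI side untouched: pending files still go in and response files still come out.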
92
92
 
93
93
  Either way the skill CLI doesn't change — it always emits pending files and exits 7, and the wiki-runner is free to decide how the actual reasoning happens before the response files appear.
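
As a rough sketch of the wiki-runner's side of this wire protocol, the two helpers below classify a CLI exit code and pick the pending batch files out of a directory listing. The helper names are hypothetical; only exit code 7 (`NEEDS_TIER2`) and the `pending-<batch-id>.json` naming come from the text above.

```javascript
// Hypothetical helpers for illustration; the real wiki-runner logic may differ.
const NEEDS_TIER2 = 7; // suspend-and-resume exit code

// Classify a CLI exit code. Exit 7 is a suspend signal, not a failure.
function classifyExit(code) {
  if (code === 0) return "done";
  if (code === NEEDS_TIER2) return "suspended";
  return "failed";
}

// Given the file names found under <wiki>/.work/tier2/, keep only pending batches.
function pendingBatchFiles(fileNames) {
  return fileNames.filter((name) => /^pending-.+\.json$/.test(name)).sort();
}
```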
94
94
 
@@ -156,13 +156,13 @@ Response: { "slug": "<kebab-case>", "purpose": "<one line>" } or { "decision": "
156
156
 
157
157
  Unless the user specifies otherwise, the wiki-runner and its Tier 2 fan-outs pick the **most suitable model for the task size** at their default effort level. Concretely:
158
158
 
159
- - **Wiki-runner** spawned at the subagent type that can orchestrate CLI subprocesses and hold the whole operation in its context. For very large corpora (>1k entries or >10 MB) prefer a 1M-context Claude variant.
160
- - **Tier 2 draft-frontmatter sub-agent** picks whatever model is cost-effective for writing a ~200-word `focus` + `covers[]` pair from a single source file. Effort: minimal.
161
- - **Tier 2 operator-convergence sub-agent** picks whatever model is strong at structural judgment on frontmatter pairs. Effort: minimal-to-medium depending on pair ambiguity.
162
- - **Tier 2 rebuild plan review sub-agent** picks a strong reasoning model because this is the "deep understanding" case. Effort: medium.
163
- - **HUMAN-class Fix sub-agent** — picks a strong reasoning model; effort medium because the decision needs justification.
159
+ - **Wiki-runner**: spawned at the sub-agent type that can orchestrate CLI subprocesses and hold the whole operation in its context. For very large corpora (>1k entries or >10 MB) prefer a 1M-context model with high effort (`effort: "heavy"`).
160
+ - **Tier 2 draft-frontmatter sub-agent**: picks whatever model the host maps to `effort: "light"`, cost-effective for writing a ~200-word `focus` + `covers[]` pair from a single source file.
161
+ - **Tier 2 operator-convergence sub-agent**: `effort: "light"` to `"balanced"` depending on pair ambiguity. The host's mapped model should be strong at structural judgment on frontmatter pairs.
162
+ - **Tier 2 rebuild plan review sub-agent**: `effort: "heavy"` because this is the "deep understanding" case.
163
+ - **HUMAN-class Fix sub-agent**: `effort: "balanced"` because the decision needs justification.
164
164
 
165
- **User overrides.** If the user specifies a model (`"use sonnet"`, `"run it on haiku"`, `"use opus 1M for the whole thing"`) or an effort level (`"minimal effort"`, `"maximum quality"`), honour the override on every sub-agent the operation spawns, not just the wiki-runner. Pass the override through to the Tier 2 prompts as an explicit instruction. If the user specifies conflicting overrides (e.g., a model that doesn't support the requested effort level), ask before proceeding.
165
+ **User overrides.** If the user specifies a model (`"use sonnet"`, `"use gpt-5-codex"`, `"use opus 1M for the whole thing"`) or an effort level (`"light effort"`, `"maximum quality"`), pass it as the dispatch envelope's optional `model` field and the required `effort` field. The host harness honours the explicit `model` when set; otherwise it maps `effort` to its own model lineup. Honour the override on every sub-agent the operation spawns, not just the wiki-runner. If the user specifies conflicting overrides (e.g., a model that doesn't support the requested effort level), ask before proceeding.
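
A minimal sketch of how a user override might be folded into a dispatch envelope, assuming the field names listed in this changelog (`kind`, `role`, `tier2_kind`, `prompt`, `inputs`, `effort`, optional `model`, `response_schema`, `outputs_path`). `buildDispatchEnvelope` and the shape of its `override` argument are hypothetical; the linked `subagent.dispatch.v1` spec remains the authoritative definition.

```javascript
// Hypothetical builder; field names follow the changelog, everything else is assumed.
const EFFORTS = new Set(["light", "balanced", "heavy"]);

function buildDispatchEnvelope(request, override = {}) {
  // A user-supplied effort wins over the request's default.
  const effort = override.effort ?? request.effort;
  if (!EFFORTS.has(effort)) throw new Error(`unknown effort: ${effort}`);

  const envelope = {
    kind: "subagent.dispatch.v1",
    role: `wiki-tier2-${request.tier2_kind}`,
    tier2_kind: request.tier2_kind,
    prompt: request.prompt,
    inputs: request.inputs,
    effort,
    response_schema: request.response_schema,
    outputs_path: request.outputs_path,
  };

  // `model` is optional: set it only when the user or the request pins one.
  const model = override.model ?? request.model;
  if (model !== undefined) envelope.model = model;
  return envelope;
}
```

Leaving `model` out when nothing pins it lets the host harness map `effort` to its own lineup.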
166
166
 
167
167
  ### Inline execution is the escape hatch, not the norm
168
168
 
package/guide/cli.md CHANGED
@@ -11,7 +11,7 @@ covers:
11
11
  - "hidden-git plumbing: log (+ --op), show, diff (+ --op), blame, reflog, history"
12
12
  - "remote mirroring: remote add/list/remove, sync with tag-only default refspec"
13
13
  - "layout mode flags (--layout-mode sibling|in-place|hosted, --target)"
14
- - "tiered-AI flags (--quality-mode tiered-fast|claude-first|tier0-only)"
14
+ - "tiered-AI flags (--quality-mode tiered-fast|claude-first|deterministic)"
15
15
  - "UX flags (--no-prompt, --json, --json-errors (legacy alias), --accept-dirty, --accept-foreign-target, --review)"
16
16
  - "internal helpers: ingest, draft-leaf, draft-category, index-rebuild, index-rebuild-one, shape-check"
17
17
  - "exit code summary (0 ok, 1 usage, 2 validation/ambiguity/review-abort, 3 resolve miss, 4 node too old, 5 git missing/too old, 6 wiki corrupt, 7 NEEDS_TIER2 suspend-and-resume, 8 DEPS_MISSING runtime dependency missing)"
@@ -230,11 +230,12 @@ All top-level operations accept:
230
230
 
231
231
  ## Tiered-AI flags
232
232
 
233
- - `--quality-mode tiered-fast|claude-first|tier0-only` — select the escalation policy. Default `tiered-fast` (TF-IDF → MiniLM embeddings → Claude). `tier0-only` never calls Claude, never loads Tier 1; mid-band pairs become "undecidable" markers the user resolves interactively. Unknown values raise `INT-13`.
233
+ - `--quality-mode tiered-fast|claude-first|deterministic` — select the escalation policy. Default `tiered-fast` (TF-IDF → MiniLM embeddings → Claude). `deterministic` never calls Claude; Tier 1 mid-band pairs are resolved by a static threshold so repeated runs on the same inputs are byte-reproducible. Unknown values raise `INT-13`.
234
234
 
235
235
  ## UX flags
236
236
 
237
237
  - `--no-prompt` / env `LLM_WIKI_NO_PROMPT=1` — fail loudly on any ambiguity instead of prompting; emits `INT-12` if the skill would otherwise ask a TTY question.
238
+ - Env `LLM_WIKI_NO_PROGRESS=1` — suppress per-phase progress breadcrumbs that the CLI otherwise streams to stderr during long-running operations (build / rebuild / fix / join). Useful for CI jobs that already capture their own progress or that want the pre-X.9 silent stderr shape. `--json` implicitly sets this (the `skill-llm-wiki/v1` envelope consumer contract requires a clean stderr).
238
239
  - `--json` — canonical machine-output flag. Enables the `skill-llm-wiki/v1` envelope on subcommands that emit one (validate, init, heal, rollback) and switches `INT-NN` ambiguity errors to JSON on stderr. The consumer-facing probe subcommands `contract` and `where` always emit JSON when this flag is present.
239
240
  - `--json-errors` — legacy alias for `--json`, kept for consumers that adopted the flag before the envelope shipped. Triggers identical behaviour. New code should pass `--json`.
240
241
  - `--accept-dirty` — operate on a source inside a dirty user git repo (escape hatch for `INT-08`).
@@ -69,8 +69,8 @@ The `.work/` directory is scratch space used by phases that need to stage interm
69
69
  folders. History is tracked by the private git repo at
70
70
  `<source>.wiki/.llmwiki/git/`, and rollback is `skill-llm-wiki rollback
71
71
  <source>.wiki --to pre-<op-id>` (byte-exact via `git reset --hard`).
72
- - See [guide/layout-modes.md](layout-modes.md) for the full mode matrix and
73
- [guide/in-place-mode.md](in-place-mode.md) for the in-place variant. Legacy
72
+ - See [guide/layout/layout-modes.md](../layout/layout-modes.md) for the full mode matrix and
73
+ [guide/layout/in-place-mode.md](../layout/in-place-mode.md) for the in-place variant. Legacy
74
74
  `<source>.llmwiki.v<N>/` wikis are detected via **INT-04** and must be
75
75
  migrated explicitly with `skill-llm-wiki migrate <legacy-path>` before any
76
76
  other operation will run.
@@ -93,5 +93,5 @@ If the user's source folder is already tracked by their own git repo, the
93
93
  first in-place operation writes `.gitignore` with `.llmwiki/`, `.work/`,
94
94
  `.shape/history/*/work/`. The user's git sees those paths as ignored. Our
95
95
  private repo's operations never touch the user's `.git/` — see
96
- [guide/coexistence.md](coexistence.md) for the full coexistence story and
96
+ [guide/isolation/coexistence.md](../isolation/coexistence.md) for the full coexistence story and
97
97
  proof-of-isolation tests.
@@ -61,7 +61,7 @@ NEST fires in two modes:
61
61
 
62
62
  **Cluster-based application.** Each accepted cluster is named via a Tier 2 `cluster_name` request (slug + purpose) — or receives a slug directly from a `propose_structure` Tier 2 response. Names are NEVER shortcut from shared tags; if the sub-agent cannot name a cluster, that cluster does not nest. The NEST applier (`scripts/lib/nest-applier.mjs`) then:
63
63
 
64
- 1. **Atomic slug resolution.** Before touching the filesystem, `resolveNestSlug(slug, proposal)` checks whether the proposed slug collides with (a) any member leaf's id, (b) any non-member sibling leaf's id in the same parent, or (c) an existing sibling subdirectory name. On collision the slug is auto-suffixed deterministically (`<slug>-group`, then `<slug>-group-2`, `-group-3`, …) until it's non-colliding. The rename is audited in `decisions.yaml` as `decision: slug-renamed`. This pre-empts the DUP-ID class of validation failure that would otherwise rollback the entire NEST after apply.
64
+ 1. **Atomic slug resolution.** Before touching the filesystem, `resolveNestSlug(slug, proposal, wikiRoot, opts)` checks whether the proposed slug collides with (a) any member leaf's id, (b) any non-member sibling leaf's id in the same parent, (c) an existing sibling subdirectory name, or (d) any live leaf id or subdirectory basename elsewhere in the tree (full-tree walk, activated whenever `wikiRoot` is provided). On collision the slug is auto-suffixed deterministically (`<slug>-group`, then `<slug>-group-2`, `-group-3`, …) until it's non-colliding. The rename is audited in `decisions.yaml` as `decision: slug-renamed`. This pre-empts the DUP-ID class of validation failure that would otherwise rollback the entire NEST after apply — including cross-depth collisions (e.g. a cluster slug `event-patterns` proposed under `design-patterns-group/` that would collide with an existing `arch/event-patterns/` in a different branch of the tree). The optional `opts.wikiIndex` argument accepts a precomputed `Set` from `buildWikiForbiddenIndex(wikiRoot)` — the convergence loop builds it once per iteration and mutates it with `wikiIndex.add(resolvedSlug)` after each successful apply, dropping per-proposal cost from O(full-tree) to O(parent-dir). `wikiRoot` is itself optional: legacy callers that omit it get the parent-dir-only walk preserved from v1.0.0 (modulo the dot-skip rule described in the module source).
65
65
  2. Creates `<parent>/<slug>/` (using the resolved slug).
66
66
  3. Moves each cluster member into the new directory and rewrites its `parents[]` to `["index.md"]`.
67
67
  4. Writes a minimal `index.md` stub carrying `id` (= resolved slug), `type: index`, `depth_role: subcategory`, a `focus:` line from the cluster purpose, and — when the members share them — `shared_covers[]` (intersection of member covers) and `tags[]` (intersection of member tags). The stub does NOT carry aggregated `activation_defaults`: routing is semantic, and descent decisions are made against the stub's `focus` + `shared_covers`, not against a literal keyword union.
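
The deterministic suffixing in step 1 can be illustrated with a small sketch. This is not the code in `scripts/lib/nest-applier.mjs`: `suffixUntilFree` is a hypothetical stand-in, and the `forbidden` set plays the role of the precomputed index of ids and directory names that a slug must not collide with.

```javascript
// Hypothetical stand-in for the suffixing rule; not the real resolveNestSlug.
function suffixUntilFree(slug, forbidden) {
  // No collision: keep the proposed slug unchanged.
  if (!forbidden.has(slug)) return slug;

  // First fallback is "<slug>-group", then "<slug>-group-2", "-group-3", ...
  let candidate = `${slug}-group`;
  let n = 2;
  while (forbidden.has(candidate)) {
    candidate = `${slug}-group-${n}`;
    n += 1;
  }
  return candidate;
}

// Usage: after a successful apply, record the resolved slug so later proposals see it.
const forbidden = new Set(["event-patterns"]);
const resolved = suffixUntilFree("event-patterns", forbidden);
forbidden.add(resolved);
```

Because the candidate sequence is fixed, the same inputs always resolve to the same slug.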
@@ -9,7 +9,7 @@ covers:
9
9
  - "Tier 0 is TF-IDF over frontmatter (focus + covers + tags) with fixed thresholds"
10
10
  - "Tier 1 is local embeddings via @xenova/transformers (MiniLM, REQUIRED dep)"
11
11
  - "Tier 2 is a sub-agent, executed via the CLI exit-7 handshake (never inline)"
12
- - default quality mode is tiered-fast; claude-first and tier0-only are opt-in
12
+ - default quality mode is tiered-fast; claude-first and deterministic are opt-in
13
13
  - "similarity-cache at <wiki>/.llmwiki/similarity-cache/ memoises pairwise results"
14
14
  - "decision-log at <wiki>/.llmwiki/decisions.yaml records every non-trivial decision"
15
15
  - operator-convergence routes every MERGE similarity check through tiered.decide
@@ -85,8 +85,9 @@ well-structured corpora — pairs of near-duplicate entries
85
85
  should collapse as SAME, obviously unrelated pairs as DIFFERENT
86
86
  — leaving only genuinely ambiguous pairs to escalate. The actual
87
87
  Tier 0 hit rate on a given wiki depends on how informative the
88
- frontmatter is; run with `--quality-mode tier0-only` and inspect
89
- `decisions.yaml` to measure the tier distribution for your corpus.
88
+ frontmatter is; inspect `decisions.yaml` after a build to measure
89
+ the tier distribution for your corpus (grep the `tier:` field on
90
+ every decision entry).
90
91
 
91
92
  ## Tier 1 — local embeddings (scripts/lib/embeddings.mjs)
92
93
 
@@ -275,13 +276,13 @@ all of its Tier 2 cost on subsequent rebuilds.
275
276
 
276
277
  ## Quality modes
277
278
 
278
- Choose via `--quality-mode` or the `LLM_WIKI_QUALITY_MODE` env var.
279
+ Choose via `--quality-mode` (flag) or `LLM_WIKI_QUALITY_MODE` (env var). The flag wins when both are set. Invalid values on EITHER path raise `INT-13` at the intent layer with the same valid-values suggestions — a stale env value from an obsolete shell profile fails loud on the next `skill-llm-wiki` invocation rather than silently falling through to a plain throw at convergence time. Env-var validation is gated to subcommands that consume quality mode (build / extend / rebuild / fix / join); recovery paths like `rollback` are unaffected.
279
280
 
280
281
  | Mode | Behaviour | Use when |
281
282
  |------|-----------|----------|
282
283
  | **`tiered-fast`** (default) | Full ladder. Tier 0 → Tier 1 → Tier 2 on mid-band escalations. | General-purpose builds. |
283
284
  | `claude-first` | Tier 0 is still consulted for decisive cases. Mid-band Tier 0 skips Tier 1 and goes directly to Tier 2. | When the user values Claude's judgment over speed/cost. |
284
- | `tier0-only` | Tier 0 only. Mid-band decisions become "undecidable" and the caller must resolve manually. | Air-gapped, hermetic CI, and smoke tests that must not reach out to Claude. |
285
+ | `deterministic` | Tier 0 → Tier 1 ladder with a static threshold resolving mid-band Tier 1 pairs. No LLM/sub-agent is ever consulted. Cluster naming comes from `generateDeterministicSlug` + `deterministicPurpose`; Tier 2 escalations are skipped entirely. Repeated runs on the same inputs produce byte-reproducible output. | Hermetic CI; large deterministic corpus builds where reproducibility matters more than Tier 2's naming nuance. For air-gapped use, pre-warm the Tier 1 MiniLM model cache on a networked host; `@xenova/transformers` downloads the model on first use otherwise. |
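
The flag-over-env precedence and the `INT-13` failure described in this section could look roughly like the sketch below. `resolveQualityMode` is a hypothetical name, and validating both values even when the flag is set is an assumption drawn from the "invalid values on EITHER path" wording; the real intent layer may behave differently.

```javascript
// Hypothetical resolver; mode names and the INT-13 code come from the guide.
const QUALITY_MODES = ["tiered-fast", "claude-first", "deterministic"];

function resolveQualityMode({ flag, env } = {}) {
  // Assumption: a stale env value fails loudly even when a valid flag is present.
  for (const value of [flag, env]) {
    if (value !== undefined && !QUALITY_MODES.includes(value)) {
      const err = new Error(
        `INT-13: unknown quality mode "${value}"; use ${QUALITY_MODES.join(" / ")}`
      );
      err.code = "INT-13";
      throw err;
    }
  }
  // The flag wins when both are set; the default is tiered-fast.
  return flag ?? env ?? "tiered-fast";
}
```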
285
286
 
286
287
  ## Similarity cache
287
288
 
@@ -80,7 +80,7 @@ in the wrong place) is always higher than a one-sentence clarifying question.
80
80
  | INT-10 | Unknown `--layout-mode` value | use `sibling` / `in-place` / `hosted` |
81
81
  | INT-11 | Unknown flag / malformed flag value | correct the flag |
82
82
  | INT-12 | Prompt required in non-interactive mode | supply the flag the prompt was asking for, or re-run in a TTY |
83
- | INT-13 | Unknown `--quality-mode` value | use `tiered-fast` / `claude-first` / `tier0-only` |
83
+ | INT-13 | Unknown `--quality-mode` value | use `tiered-fast` / `claude-first` / `deterministic` |
84
84
 
85
85
  ## `--json` for programmatic consumption
86
86
 
package/package.json CHANGED
@@ -1,7 +1,7 @@
1
1
  {
2
2
  "name": "@ctxr/skill-llm-wiki",
3
- "version": "1.0.2",
4
- "description": "Claude Code skill build, extend, validate, rebuild, fix, and join LLM wikis from any knowledge corpus. Token-efficient retrieval via hierarchical indices, DAG parents, and deterministic rewrite operators.",
3
+ "version": "1.2.0",
4
+ "description": "Agent Skills (Claude Code, Codex CLI): build, extend, validate, rebuild, fix, and join LLM wikis from any knowledge corpus. Token-efficient retrieval via hierarchical indices, DAG parents, and deterministic rewrite operators.",
5
5
  "type": "module",
6
6
  "license": "MIT",
7
7
  "repository": {
@@ -16,6 +16,9 @@
16
16
  "node": ">=18.0.0"
17
17
  },
18
18
  "keywords": [
19
+ "agent-skills",
20
+ "agents-md",
21
+ "codex",
19
22
  "claude-code",
20
23
  "skill",
21
24
  "llm-wiki",
@@ -38,6 +41,9 @@
38
41
  "type": "skill",
39
42
  "target": "folder"
40
43
  },
44
+ "publishConfig": {
45
+ "access": "public"
46
+ },
41
47
  "bin": {
42
48
  "skill-llm-wiki": "scripts/cli.mjs"
43
49
  },
@@ -46,7 +52,8 @@
46
52
  "validate": "node scripts/cli.mjs --version",
47
53
  "lint": "markdownlint-cli2 '**/*.md' '#node_modules'",
48
54
  "lint:fix": "markdownlint-cli2 --fix '**/*.md' '#node_modules'",
49
- "prepare": "husky || true"
55
+ "prepare": "husky || true",
56
+ "prepublishOnly": "npm run lint && npm test"
50
57
  },
51
58
  "devDependencies": {
52
59
  "husky": "^9.1.0",
@@ -54,6 +61,8 @@
54
61
  },
55
62
  "dependencies": {
56
63
  "@xenova/transformers": "2.17.2",
57
- "gray-matter": "^4.0.3"
64
+ "gray-matter": "^4.0.3",
65
+ "p-retry": "^6.2.0",
66
+ "p-timeout": "^6.1.3"
58
67
  }
59
68
  }