npm - @oneie/claude - Versions diffs - 0.3.2 → 0.5.0 - Mend

@oneie/claude 0.3.2 → 0.5.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (31) hide show

package/README.md +29 -19
package/agents/w1-recon.md +9 -2
package/agents/w2-decide.md +40 -9
package/agents/w3-edit.md +34 -4
package/agents/w4-verify.md +112 -15
package/commands/do-autonomous.md +1 -1
package/commands/do.md +84 -46
package/commands/skill-create.md +4 -4
package/commands/sync.md +7 -7
package/hooks/scripts/stop-reflect.sh +3 -3
package/hooks/scripts/sync-todo-docs.sh +1 -1
package/package.json +2 -1
package/rules/documentation.md +18 -18
package/rules/engine.md +48 -192
package/scripts/do-auto.sh +5 -5
package/scripts/do-folder.sh +1 -1
package/scripts/do-prove.sh +10 -27
package/scripts/do-reconcile.sh +212 -19
package/scripts/do-smoke.sh +65 -25
package/scripts/do-survey.sh +1 -1
package/scripts/do-tier.sh +1 -1
package/scripts/w4-rubric.ts +2 -2
package/skills/oneie/SKILL.md +4 -4
package/skills/signal/SKILL.md +2 -2
package/skills/sui/SKILL.md +1 -1
package/templates/template-agent.md +63 -0
package/templates/template-feature.md +31 -0
package/templates/template-plan.md +80 -0
package/templates/template-teach.md +124 -0
package/templates/template-tests.md +43 -0
package/templates/template-todo.md +783 -0

package/commands/do.md CHANGED Viewed

@@ -1,10 +1,12 @@
 # /do — idea to shipped (v2, lifecycle-first)
-**One command.** Type `/do "<an idea>"` and it walks the whole build lifecycle from wherever the idea is: it agrees a goal, then walks the artifact spine — promise → spec → todo → code → tests → proof → docs → release — **writing whatever is missing, skipping whatever exists.** Point it at a bare idea and it writes everything. Point it at code with no tests or docs and it backfills just those.
+**One command.** Type `/do "<an idea>"` and it walks the whole build lifecycle from wherever the idea is: it agrees a goal, then walks the artifact spine — **making each artifact true** — writing whatever is missing, skipping whatever exists. Point it at a bare idea and it writes everything. Point it at code with no tests or docs and it backfills just those.
+For each artifact on the spine: `true(artifact) ≡ exists(artifact) ∧ reconciles(artifact, canon)`. Missing → write. Stale → rewrite. True → skip.
 The human owns one thing: the goal sentence. `/do` owns everything below it.
-> Narrative for humans: `text/000-do.md`. Per-stage template/skill/agent routing: `plans/templates.md`. This file is what `/do` executes.
+> Narrative for humans: `text/do.md`. Per-stage template/skill/agent routing: `text/templates-plan.md`. This file is what `/do` executes.
 ---
@@ -21,7 +23,22 @@ Sonnet  low | medium      mechanical edit (low) · genuine restructure / prose (
 Opus    high | xhigh      architecture (high) · substrate / schema reconciliation (xhigh)
 ```
-Fan-out: independent agents in ONE message, capped by the plan's `parallel_budget`. Recon + verify fan wide (Haiku × N). Edits run one-per-file (Sonnet × N). Only the design decision (Opus) stays single — understanding is not delegable.
+Fan-out: independent agents in ONE message, capped by the plan's `parallel_budget`. Recon + verify fan wide (Haiku × N). Edits run one-per-file (Sonnet × N). Only W2 stays single — understanding is not delegable, but it runs as a **spawned single Opus `w2-decide` agent**, not inline. The conductor/session model runs Sonnet; W2 stays Opus·high. W2 is in the spawn list (W1·**W2**·W3·W4-rubric); the "never delegated → run inline" constraint is replaced by "stays single — one Opus agent, never fanned."
+### The coherence ratchet
+`reconciles` is not pass/fail in isolation — it is **monotonic**. A write may leave the system *more* coherent or equally coherent. Never less.
+```
+types     — zero new tsc errors          (the original ratchet)
+names     — converge to the dictionary   (no new synonym, no dead name)
+primitives— net new ≤ 0                   (compose 3 existing before adding 1)
+schema    — extend, never fork            (one model, not two)
+surfaces  — reachable, never orphaned     (registered + linked, never dangling)
+docs      — synced, never stale           (no broken link, no lie)
+```
+This is what *"every line makes the system stronger"* means mechanically: coherence is a gate the loop runs, not a hope. The ratchet can tighten the system; it can never loosen it.
 ### Fast & lean — the token economy (automatic, never a flag)
@@ -42,7 +59,7 @@ Comprehensive and fast are not in tension: the **tier decides which gates run**,
 | Write-once state | per cycle | `.w2-spec`/`.w3-receipts` read by path, never re-derived after compaction |
 | `cost:cycle` gradient | close | spend → pheromone; future SELECT routes toward cheap-and-effective paths |
-A cycle that runs **zero LLM calls** is a *good* cycle. The comprehensiveness lives in the gates (DoD, rubric, doc-sync, ANALYZE, PROVE); the tier decides how many of them you pay for.
+A cycle that runs **zero LLM calls** is a *good* cycle. The comprehensiveness lives in the gates (DoD, rubric, RECONCILES, ANALYZE, PROVE); the tier decides how many of them you pay for.
 ### Maximize parallelism — read the graph, reject imaginary blockers
@@ -64,7 +81,7 @@ C2·C3·C4 are siblings → their W1/W2 run concurrently and their W3a edits mer
 **The only valid arrow.** `C_n → C_m` exists **only** when C_m reads a file C_n writes. Before drawing one, fill the blank or delete it:
 > `C_n → C_m because C_m imports/reads {exact file path} that C_n creates.`
-**Imaginary blockers — reject on sight** (full list in `plans/template-todo.md`): "same feature area" · "same folder" · "both touch the DB" (unless the *same* entity) · "logical/reading order" · "don't want too many agents at once" (`parallel_budget` already caps it) · "want to review C_n first" (policy, not a dependency) · "C_m depends on what we *learn* in C_n" (W2 learning ≠ a file dependency).
+**Imaginary blockers — reject on sight** (full list in `text/template-todo.md`): "same feature area" · "same folder" · "both touch the DB" (unless the *same* entity) · "logical/reading order" · "don't want too many agents at once" (`parallel_budget` already caps it) · "want to review C_n first" (policy, not a dependency) · "C_m depends on what we *learn* in C_n" (W2 learning ≠ a file dependency).
 **Cross-cycle W3 batch (the big win).** A batch of 3 cycles with 8 + 6 + 4 independent edits = **18 `w3-edit` agents in ONE message**, not 3 sequential rounds. Only a named same-file collision serializes (→ W3b). Same rule for waves: W1 recon across all cycles in a batch is one Haiku spawn; W4 rubric is 5 Haiku × N cycles in one message — all capped by `parallel_budget`.
@@ -75,7 +92,7 @@ C2·C3·C4 are siblings → their W1/W2 run concurrently and their W3a edits mer
 | Input | Treat as | Entry |
 |---|---|---|
 | bare text (`"add usage billing"`) | **IDEA** (default) | walk the spine from the top |
-| `<slug>` or a `plans/*.md` path | existing work | resolve its slug; ≥2 incomplete cycles → auto context-isolated loop, else walk from the first gap inline |
+| `<slug>` or a `text/*.md` path | existing work | resolve its slug; ≥2 incomplete cycles → auto context-isolated loop, else walk from the first gap inline |
 | `--wave N` | force one wave of a todo | BUILD engine |
 | `--next-cycle` | *(internal)* run one batch then exit — the loop's recursion guard, set by `do-auto.sh`; never typed by a human | BUILD engine |
 | `--auto` | *(legacy alias)* same as a bare multi-cycle `/do <slug>` — the loop is automatic now | BUILD engine, trust-aware |
@@ -97,40 +114,43 @@ Derive the **slug** from the idea: kebab-case, 2–4 words (`"add usage billing"
 The pruned spine by tier:
 ```
-PATCH     code                                                                  (edit + W4 verify gate; 0 spawns)
-FIX       survey → [investigate if legacy] → code → tests → prove
-FEATURE   promise → survey → spec ▸clarify → todo ▸analyze → code → tests → proof → docs → release
-SCHEMA    = FEATURE + substrate reconcile at max
+PATCH     AIM → BUILD → VERIFY                                                                          (edit + W4 verify gate; 0 spawns)
+FIX       AIM → SURVEY → [INVESTIGATE] → BUILD → TEST → PROVE → LEARN
+FEATURE   AIM → PROMISE → SURVEY → DESIGN ▸clarify → PLAN ▸analyze → BUILD → TEST → VERIFY → PROVE → TEACH → SHIP → LEARN
+SCHEMA    = FEATURE + Substrate reconcile at max
 ```
-`investigate` rides any tier when the work touches code we didn't just write — it's the "understand before you change" step, not a separate tier.
+`INVESTIGATE` rides any tier when the work touches code we didn't just write — it's the "understand before you change" step, not a separate tier.
 ---
 ## Step 2 — Walk the artifact spine (presence-check · backfill · skip)
-For each **enabled** stop (per the pruned spine), in order: check if its artifact exists. **Exists → skip. Missing → write it with the owner below.** A stop whose artifact is present is a phase already done; resume is free, no `--from` flag.
+For each **enabled** stop (per the pruned spine), in order: check if its artifact exists and reconciles with its canon. **True → skip. Missing or stale → write/rewrite with the owner below.**
 | Stop | Presence check | Missing → write with | Owner · model · effort |
 |---|---|---|---|
-| **promise** | `test -f text/<slug>.md` | `writer` skill + `text/template-frame.md` | sonnet · medium |
-| **survey** | `do-survey.sh` (4 surfaces) | reuse verdict (expose / extend / build / drop) → the gap list, not the whole idea, feeds the spec | `w1-recon`, `Explore` · haiku · low |
-| **investigate** *(fix / legacy only)* | code we didn't just write | forensic recon → root cause (not symptom) + blast radius + the `must_not_break` line | `w1-recon` · haiku→sonnet · medium |
-| **spec** | `test -f plans/<slug>.md` | Opus + `plans/template-spec.md` (designs the gap list; pre-mortem + decisions live in the template), then `do-reconcile.sh` | opus · high |
-| ↳ *CLARIFY* | spec has `## Clarifications` (FEATURE/SCHEMA) | ≤5 `AskUserQuestion`, written back into the spec | opus · high |
-| **todo** | `test -f plans/<slug>-todo.md` | `/create todo` + `plans/template-todo.md` | sonnet · medium |
-| ↳ *ANALYZE* | — | `do-analyze.sh plans/<slug>-todo.md` — CRITICAL exit 1 blocks BUILD | bash · none |
-| **code** | survey verdict = `build` (≥70% match → `expose`/`extend` instead) | the BUILD engine (Step 3) | sonnet · low–medium |
-| **tests** | test file in the repo folder | test-first, one assertion per deliverable | sonnet · low |
-| **proof** | proof artifact captured | `do-prove.sh` (browser / curl / contract / sync) + `accessibility`; **promise-check** the shipped thing against `text/<slug>.md`. A multi-cycle plan already built on its own `do/<slug>` branch (worktree isolation — see Step 2); PROVE is the gate that branch must clear before a human merges it to trunk | sonnet · low |
-| **docs** | feature doc + runbook exist | `tutorial` + `writer` — written against the *proven* behavior | sonnet · medium |
-| **release** | changelog / README row | `/release` + adoption signal | sonnet · low |
+| **AIM** | goal + tier confirmed | `AskUserQuestion` (one) | — |
+| **PROMISE** | `test -f text/<slug>.md` | `writer` skill + `text/template-feature.md` | sonnet · medium |
+| **SURVEY** | `do-survey.sh` (4 surfaces) | reuse verdict (expose / extend / build / drop) → the gap list, not the whole idea, feeds the design | `w1-recon`, `Explore` · haiku · low |
+| **INVESTIGATE** *(fix / legacy only)* | code we didn't just write | forensic recon → root cause (not symptom) + blast radius + the `must_not_break` line | `w1-recon` · haiku→sonnet · medium |
+| **DESIGN** | `test -f text/<slug>-plan.md` | Opus + `text/template-plan.md` (designs the gap list; pre-mortem + decisions live in the template), then `do-reconcile.sh substrate` + `do-reconcile.sh dictionary` | opus · high |
+| ↳ *CLARIFY* | plan has `## Clarifications` (FEATURE/SCHEMA) | ≤5 `AskUserQuestion`, written back into the plan | opus · high |
+| **PLAN** | `test -f text/<slug>-todo.md` | `/create todo` + `text/template-todo.md` | sonnet · medium |
+| ↳ *ANALYZE* | — | `do-analyze.sh text/<slug>-todo.md` — CRITICAL exit 1 blocks BUILD | bash · none |
+| **BUILD** | survey verdict = `build` (≥70% match → `expose`/`extend` instead) | the BUILD engine (Step 3) | sonnet · low–medium |
+| **TEST** | test file in the repo folder | test-first, one assertion per deliverable | sonnet · low |
+| **VERIFY** | rubric + ratchet pass | `do-reconcile.sh <canon>` per category + rubric composite ≥ 0.65 | haiku · medium |
+| **PROVE** | proof artifact captured | `do-prove.sh` (browser / curl / contract / sync) + `accessibility`; the shipped thing checked against `text/<slug>.md`. A multi-cycle plan already built on its own `do/<slug>` branch (worktree isolation — see context isolation below); PROVE is the gate that branch must clear before a human merges it to trunk | sonnet · low |
+| **TEACH** | feature doc + runbook exist | `tutorial` + `writer` — written against the *proven* behavior | sonnet · medium |
+| **SHIP** | changelog / README row | `/release` + adoption signal | sonnet · low |
+| **LEARN** | learnings entry written | close: learnings + cost signal + trust write | — |
 **Proof before docs.** PROVE can fail and loop back to BUILD; docs written first would describe behavior that isn't final. Document what's proven, then ship.
-The two `↳` rows are **gates, not artifacts**: CLARIFY edits the spec in place (FEATURE/SCHEMA only); ANALYZE is read-only and blocks only on CRITICAL. Each real stop closes by leaving **one durable artifact on disk** — that artifact is the skip-check next time.
+The two `↳` rows are **gates, not artifacts**: CLARIFY edits the plan in place (FEATURE/SCHEMA only); ANALYZE is read-only and blocks only on CRITICAL. Each real stop closes by leaving **one durable artifact on disk** — that artifact is the skip-check next time.
-**FRAME first (for FEATURE/SCHEMA).** The promise (`text/<slug>.md`) is what PROVE later checks the shipped feature against. If you can't state what the user gets, halt — the idea isn't ready to build. PATCH/FIX have no promise to keep, so they skip it.
+**PROMISE first (for FEATURE/SCHEMA).** The promise (`text/<slug>.md`) is what PROVE later checks the shipped feature against. If you can't state what the user gets, halt — the idea isn't ready to build. PATCH/FIX have no promise to keep, so they skip it.
 **Before BUILD — EQUIP (skill-check).** Infer skills from the task tags → `ls .claude/skills/{name}/` → **ready** (proceed) / **stale** (warn, proceed) / **missing** (offer to add). Block only if a required skill is missing.
@@ -139,20 +159,20 @@ The two `↳` rows are **gates, not artifacts**: CLARIFY edits the spec in place
 | When | Ticked by |
 |---|---|
 | a W1 recon agent returns findings for a file | `/do` after the agent settles |
-| a W2 verdict / diff spec / doc-plan resolves | `/do` inline |
+| a W2 verdict / diff spec / doc-plan resolves | `/do` after the spawned W2 agent settles |
 | a W3 edit agent reports success on a file | `/do` after the agent settles |
-| a W4 check passes (verify · demo · doc-sync · rubric · ratchet) | `/do` after the bash exits 0 |
+| a W4 check passes (verify · demo · RECONCILES · rubric · ratchet) | `/do` after the bash exits 0 |
 | all of a wave's items are `[x]` | the wave header (derived) |
 | all 4 waves `[x]` | the cycle header (derived) |
 | all cycles in a batch `[x]` | the batch header (derived) |
-Forward-only — a `[x]` is never un-ticked except by `/do --wave N --redo`. If an action has no checkbox, it isn't tracked, and untracked work is the loudest smell. Full granularity table in `plans/template-todo.md`.
+Forward-only — a `[x]` is never un-ticked except by `/do --wave N --redo`. If an action has no checkbox, it isn't tracked, and untracked work is the loudest smell. Full granularity table in `text/template-todo.md`.
 After the spine: **LEARN / close** — write one `learnings.md` entry (slug, tier, composite, deliverable, proof line), emit `signal("cost:cycle", {tokens, model, composite})`, delete `.w3-receipts.json`, write `.do-trust.json {level, consecutive, composite}`, and append `.w4-improvements.json` (an open item recurring in 3+ consecutive cycles = systemic gap → emit `substrate:systemic-gap`; next W1 reads it as a mandatory recon target).
 **Plan-outcome kill-switch (zero LLM).** Re-run the plan's `outcome:` command every cycle close. Exit 0 → the goal is met: remaining cycles enter justify-or-drop (default drop), `--auto` halts. Over the tier ceiling → `warn` + justify. Weak rubric 3× → halt, the *plan* is wrong, not the code. Goal-drift (outcome fails 3× with green rubrics) → force a re-plan.
-**One command, complexity-sized.** `/do <anything>` is the only thing a human types — an idea, a slug, or a `plans/<slug>-todo.md` path. The tier sizes the work and the spine prunes itself: a typo is edited and verified inline (one cycle, no loop); a feature writes its promise, spec, todo, tests, and docs and then runs every cycle. The human never picks the mode, never types a flag, never runs a second command.
+**One command, complexity-sized.** `/do <anything>` is the only thing a human types — an idea, a slug, or a `text/<slug>-todo.md` path. The tier sizes the work and the spine prunes itself: a typo is edited and verified inline (one cycle, no loop); a feature writes its promise, plan, todo, tests, and docs and then runs every cycle. The human never picks the mode, never types a flag, never runs a second command.
 **Context isolation is automatic for multi-cycle plans.** When `/do` resolves to a todo with **≥2 incomplete cycles**, it does not run them inline — it hands the loop to the internal engine so every cycle gets a **fresh context**:
@@ -170,7 +190,7 @@ A **single** incomplete cycle (or a PATCH/FIX) runs inline — there's nothing t
 **`--next-cycle` is internal.** It means "run one batch, tick boxes, exit — do **not** loop." Only `do-auto.sh` passes it; it is the recursion guard that keeps a fresh `/do` from re-entering the loop. A human never types it.
 State that survives the reset (all on disk — the loop is stateless in memory):
-- `plans/<slug>-todo.md` — checked boxes are the progress bar; the next `/do --next-cycle` reads them and skips completed cycles automatically
+- `text/<slug>-todo.md` — checked boxes are the progress bar; the next `/do --next-cycle` reads them and skips completed cycles automatically
 - `.do-trust.json` — trust level and consecutive score history (drives auto-continue vs halt)
 - `.w4-improvements.json` — open items that become mandatory W1 targets next cycle
 - `.w2-spec.json` / `.w3-receipts.json` — write-once state for soft-resume within a cycle
@@ -193,7 +213,7 @@ This compounds with the W1 cache and W4 rubric cache: a cycle 6 run with context
 ## Step 3 — The BUILD engine (W0 → W4)
-Runs only when the spine reaches `code`. Folder-aware: there is **no root `package.json`** — resolve the target repo folder from the touched files.
+Runs only when the spine reaches `BUILD`. Folder-aware: there is **no root `package.json`** — resolve the target repo folder from the touched files.
 **W0 — Baseline.**
 ```bash
@@ -218,15 +238,23 @@ bun .claude/scripts/w1-recon.ts --targets "$FILES" --mode RECON
 ```
 Script exits 2 if `ANTHROPIC_API_KEY` is absent → fall back to spawning `w1-recon` agent (Haiku · low) for ALL files in ONE message. Either path: skip `relevance_score < 0.4`; all-filtered → halt and broaden (zero-findings guard). Receipt capped at 400 words; persist high-signal slices into `.w2-spec.json`. *(The `w1-recon` agent carries the two-track existing-code + primitive-inventory contract.)*
-**W2 — Decide (never delegated — Opus · high).** Write the goal/deliverable/UX gate (3 sentences) first. Then: reconcile names against `dictionary.md`; run the **compress check** before any new primitive (name 3 existing primitives that compose it → `compose` removes it from the diff, `new` needs a one-line justification + same-diff doc edit). The pre-mortem + trade-offs were already captured at the `spec` stop (`template-spec.md`) — carry the failure modes forward as test cases, don't redo them. Classify each W1 finding Act / Keep / Defer. Output diff specs + write `.w2-spec.json` (+ `.w2-doc-plan.json` if a doc trigger fires). *(The `w2-decide` agent carries the compose-target table — which canonical doc to check per primitive type — plus context-triggers for surgical doc injection.)*
+**W2 — Decide (spawned single Opus `w2-decide` agent — Opus · high).** W2 is NOT run inline; the conductor spawns one `w2-decide` agent at Opus·high and waits for its result. Understanding is not delegable; it stays single, never fanned — but pinning it to the conductor process is not required. The conductor/session model runs Sonnet; W2 stays Opus. W2 writes the goal/deliverable/UX gate (3 sentences) first. Then: reconcile names against `dictionary.md`; run the **compress check** before any new primitive (name 3 existing primitives that compose it → `compose` removes it from the diff, `new` needs a one-line justification + same-diff doc edit). The pre-mortem + trade-offs were already captured at the DESIGN stop (`template-plan.md`) — carry the failure modes forward as test cases, don't redo them. Classify each W1 finding Act / Keep / Defer. Output diff specs + write `.w2-spec.json` (+ `.w2-doc-plan.json` if a doc trigger fires). Pin the **Interface Contract** (shared names, CLI signatures, decisions that multiple downstream tasks reference — pins let them run parallel, not serial). Emit the maximal-width batch DAG (arrow test applied: arrows exist only where one cycle reads a file another writes). *(The `w2-decide` agent carries the compose-target table — which canonical doc to check per primitive type — plus context-triggers for surgical doc injection.)*
-**W3 — Edit (Sonnet × N parallel).** Soft-resume first: if `.w3-receipts.json` has fewer receipts than `.w2-spec.json` diff specs, a prior W3 was interrupted — skip W0–W2 and resume from the first unapplied spec. Pre-validate every anchor with `grep -qF` in parallel. No file overlap → spawn all `w3-edit` agents in ONE message (W3a). Overlap → W3a (independent) then W3b (same-file, sequential). Append a receipt per edit; anchor miss twice → halt, ask user. Build every UI state (empty / loading / error / edge), not just the happy path.
+**W3 — Edit (Sonnet × N parallel).** Soft-resume first: if `.w3-receipts.json` has fewer receipts than `.w2-spec.json` diff specs, a prior W3 was interrupted — skip W0–W2 and resume from the first unapplied spec. Pre-validate every anchor with `grep -qF` in parallel. No file overlap → spawn all `w3-edit` agents in ONE message (W3a). Overlap → W3a (independent) then W3b (same-file, sequential). Append a receipt per edit; anchor miss twice → halt, ask user. Build every UI state (empty / loading / error / edge), not just the happy path. Surfaces are built in category order: `schema → types → receiver/SDK → API route → component → page → route registered → nav entry → inbound links → states`.
 **W4 — Verify (bash first, zero tokens).**
 ```bash
 ( cd $folder && bun run verify )            # biome + tsc + vitest
 $(cycle.demo.command)                       # the goal gate — exit 0 = pass
-# doc-sync gate (reads .w2-doc-plan.json): stale-name=0, links ok, contract mtime current
+# RECONCILES gate: per-category reconcile calls
+do-reconcile.sh types                       # tsc delta ≤ 0
+do-reconcile.sh dictionary                  # no dead name, no new synonym
+do-reconcile.sh substrate   <file>          # DATA tasks — schema extended, not forked
+do-reconcile.sh sdk         <file>          # GATEWAY tasks — joins a receiver
+do-reconcile.sh navigation  <surface>       # SURFACE tasks — registered + ≥1 inbound link
+do-reconcile.sh design      <file>          # SURFACE tasks — composes shadcn/tokens
+do-reconcile.sh authority   <file>          # GATEWAY tasks — walk-up resolves, no ad-hoc
+# TEACH tasks: do-reconcile.sh dictionary + do-reconcile.sh types on doc files
 DELTA_TSC=$((TSC_NOW - TSC_BASELINE))       # hard gate: ≤ 0, no new type errors ever
 ```
 Rubric (inline for TRIVIAL/SIMPLE; for COMPLEX — run the SDK script first (6 Haiku in parallel, rubric + spec block cached across all calls):
@@ -240,44 +268,54 @@ gate: composite ≥ 0.65  AND  goal-fit ≥ 0.50 (hard)  AND  no adversarial > 0
 ```
 Verify scope is the **dependency cone** of `git diff` (full verify only at cycle close — speed). Then compress sweep (ts-prune + noUnusedLocals → delete orphans/dead code), write `.w4-improvements.json`, and fire one pre-warm Haiku at the next cycle's W1 targets. Verify or demo failure on touched files → W3.5 (one Sonnet per dirty file), max 3 W4 loops then halt.
-**Testing policy** (full detail in `plans/template-todo.md`): Vitest-first; Playwright only when the cycle declares `requires_playwright: true`; one goal-based `expect()` per deliverable (assert the destination, not the path); ≤1 test file per cycle; a cycle needing >5 tasks splits unless `mode: lean`; N-variant split-test → winner `mark()`, losers `warn(0.5)`.
+**Testing policy** (full detail in `text/template-todo.md`): Vitest-first; Playwright only when the cycle declares `requires_playwright: true`; one goal-based `expect()` per deliverable (assert the destination, not the path); ≤1 test file per cycle; a cycle needing >5 tasks splits unless `mode: lean`; N-variant split-test → winner `mark()`, losers `warn(0.5)`.
 ---
 ## Definition of done (tier-scaled — `/do` ticks these automatically)
-1. The agreed goal is met (goal-fit gate). 2. Every shippable deliverable has a test asserting it. 3. No regression — tests green, `delta_tsc ≤ 0`, `must_not_break` preserved. 4. Proven live, matches the promise. 5. Docs in sync. 6. Full coverage, no orphan tasks. 7. Rubric clears the bar. 8. Closed loop — recorded result + one learnings entry.
+| # | Done means | Gate |
+|---|---|---|
+| 1 | The agreed goal is met | RUBRIC goal-fit |
+| 2 | Every shippable deliverable has a test asserting it | ANALYZE + TEST |
+| 3 | No regression — ratchet held, `must_not_break` preserved | RECONCILES |
+| 4 | Proven live, reachable, matches the promise | PROVE |
+| 5 | Surfaces wired into navigation, not orphaned | RECONCILES (Navigation) |
+| 6 | Docs in sync — no stale name, no dead link | RECONCILES (Docs) |
+| 7 | Rubric clears the bar | RUBRIC |
+| 8 | Loop closed — result + learning recorded | LEARN |
-PATCH clears {1,2,3,8}. FEATURE clears all eight.
+PATCH clears {1, 3, 7, 8}. FEATURE clears all eight. Same bar, scaled to the work.
 ---
 ## Non-negotiable rules
-- **Never skip W2.** Understanding is not delegable.
-- **Default to parallel.** Spawn W1, W3, and W4-rubric agents in a **single message** — across cycles in a batch, not just within one. An arrow exists only when you can name the file one cycle writes and the next reads; reject every imaginary blocker.
+- **Never skip W2.** Understanding is not delegable — spawn the single Opus `w2-decide` agent; never inline it, never fan it.
+- **Default to parallel.** Spawn W1, **W2**, W3, and W4-rubric agents — across cycles in a batch, not just within one. W2 is in the spawn list; the conductor runs Sonnet. An arrow exists only when you can name the file one cycle writes and the next reads; reject every imaginary blocker.
 - **Tick the moment it lands.** Flip `[ ]`→`[x]` as each agent/check settles — never batch at the end. The file is the progress bar.
-- **FRAME before code (FEATURE/SCHEMA)** — no *feature* ships without a stated promise. PATCH/FIX skip FRAME by design.
+- **PROMISE before code (FEATURE/SCHEMA)** — no feature ships without a stated promise. PATCH/FIX skip PROMISE by design.
 - **Closed loop** — every cycle ends in `mark`/`warn`, never a silent return.
 - **Cheapest tool that decides wins** — a phase that needs no LLM call is the best kind.
 - **W4 max 3 loops**, then halt and report.
 - **One command.** A human types `/do <anything>` and nothing else. Complexity sizing, spine pruning, and (for multi-cycle plans) the context-isolated loop are all automatic — never a flag, never a second command.
 - **Every cycle gets a fresh context.** A multi-cycle `/do` runs each cycle in a clean subprocess via the internal loop. Prior-cycle conversation is noise; the todo checkboxes are the only state that carries forward.
+- **Coherence ratchet is a hard gate.** Every write must leave types / names / primitives / schema / surfaces / docs equal or more coherent than before. The ratchet can tighten; it can never loosen.
 ---
 ## Available to /do — the toolbox
-**Scripts** (`.claude/scripts/`): `do-auto.sh` (*internal* — the context-isolated, worktree-isolated loop `/do` drives for multi-cycle plans; builds on branch `do/<slug>`, merges to trunk only on a human's say-so) · `do-tier.sh` (tier + pruned spine + classifier + ceiling) · `do-folder.sh` (folder-aware verify/build) · `do-survey.sh` (reuse verdict) · `do-reconcile.sh` (substrate dim/verb/dead-name gate) · `do-analyze.sh` (deliverable↔cycle coverage gate) · `do-prove.sh` (surface-detect proof) · `do-smoke.sh` (deterministic outcome) · `w1-recon.ts` (prompt-cached recon) · `w4-rubric.ts` (cached parallel rubric).
+**Scripts** (`.claude/scripts/`): `do-auto.sh` (*internal* — the context-isolated, worktree-isolated loop `/do` drives for multi-cycle plans; builds on branch `do/<slug>`, merges to trunk only on a human's say-so) · `do-tier.sh` (tier + pruned spine + classifier + ceiling) · `do-folder.sh` (folder-aware verify/build) · `do-survey.sh` (reuse verdict) · `do-reconcile.sh <canon>` (**one predicate over 7 canons** — `substrate|dictionary|authority|sdk|design|navigation|types` — subsumes the old RECONCILE, COMPRESS, PROMISE-CHECK, DOC-SYNC gates; `navigation` branch enforces surface reachability; `--self-test` drives fixtures) · `do-analyze.sh` (deliverable↔cycle coverage gate) · `do-prove.sh` (surface-detect proof — pure PROVE, promise-check moved into `do-reconcile.sh design`) · `do-smoke.sh` (deterministic outcome — drives ≥1 fixture per canon branch + orphan/reachable nav fixtures) · `w1-recon.ts` (prompt-cached recon) · `w4-rubric.ts` (cached parallel rubric).
-**Templates**: `text/template-frame.md` (promise) · `plans/template-spec.md` (design + pre-mortem + decisions) · `plans/template-todo.md` (plan + parallel budget + testing policy) · `plans/agent-template.md` (agent definition).
+**Templates**: `text/template-feature.md` (promise/PROMISE) · `text/template-plan.md` (design + pre-mortem + decisions/DESIGN) · `text/template-todo.md` (plan + parallel budget + testing policy/PLAN) · `text/template-agent.md` (agent definition/BUILD).
-**Agents**: `w1-recon` · `w2-decide` · `w3-edit` · `w4-verify` · `Explore` · `Plan` · `pr-review-toolkit:*` (code-reviewer, silent-failure-hunter, type-design-analyzer, pr-test-analyzer, code-simplifier) · `find-bugs`.
+**Agents**: `w1-recon` · `w2-decide` (spawned single Opus; carries the full W2 contract — compress check, Interface Contract pin, maximal batch DAG) · `w3-edit` · `w4-verify` · `Explore` · `Plan` · `pr-review-toolkit:*` (code-reviewer, silent-failure-hunter, type-design-analyzer, pr-test-analyzer, code-simplifier) · `find-bugs`.
-**Skills** (per-stage routing in `plans/templates.md`): FRAME → `writer`, `copywriting` · SPEC → `typedb`, `signal` · BUILD → `astro`, `react19`, `shadcn`, `ai-ui`, `ai-sdk`, `reactflow`, `hono`, `cloudflare`, `wrangler`, `durable-objects` · TEST → `vitest`, `playwright-best-practices`, `webapp-testing` · VERIFY → `accessibility`, `find-bugs`, `perf`, `typecheck` · TEACH → `tutorial`, `writer`.
+**Skills** (per-stage routing in `text/templates-plan.md`): PROMISE → `writer`, `copywriting` · DESIGN → `typedb`, `signal` · BUILD → `astro`, `react19`, `shadcn`, `ai-ui`, `ai-sdk`, `reactflow`, `hono`, `cloudflare`, `wrangler`, `durable-objects` · TEST → `vitest`, `playwright-best-practices`, `webapp-testing` · VERIFY → `accessibility`, `find-bugs`, `perf`, `typecheck` · TEACH → `tutorial`, `writer`.
 **State files** (write-once, read-by-path — compaction-proof): `.w0-baseline.json` · `.w2-spec.json` · `.w2-doc-plan.json` · `.w3-receipts.json` · `.do-trust.json` · `.w4-improvements.json`.
 ---
-*`/do` is `select()` made human-readable: idea in, shipped-and-proven feature out. The path remembers every execution.*
+*`/do` is `select()` made human-readable: idea in, shipped-and-proven feature out. Each artifact is made true — exists and agrees with its canon. The path remembers every execution.*

package/commands/skill-create.md CHANGED Viewed

@@ -23,11 +23,11 @@ If `[slug]` is omitted, derive it from the most recent `learnings.md` entry
 ```bash
 # Last learnings entry
-tail -5 plans/learnings.md
+tail -5 text/learnings.md
 # Recent cycle diff (what actually changed)
 git diff HEAD~1 HEAD --stat
-git diff HEAD~1 HEAD -- $(git diff HEAD~1 HEAD --name-only | grep -v 'plans/\|text/' | head -20)
+git diff HEAD~1 HEAD -- $(git diff HEAD~1 HEAD --name-only | grep -v 'text/\|text/' | head -20)
 ```
 **Step 2 — Extract the pattern (Haiku · low)**
@@ -72,7 +72,7 @@ Use this template:
 **Step 5 — Close**
 Emit `signal("learning:harden", 1, "skill=<slug>")` and append one line to
-`plans/learnings.md`:
+`text/learnings.md`:
 ```
 - YYYY-MM-DD · skill:<slug> · crystallised from <source-slug> · source=skill-create
@@ -96,5 +96,5 @@ Emit `signal("learning:harden", 1, "skill=<slug>")` and append one line to
 ## Close
 After writing the skill file: `[ ] skill-create:<slug> done` appended to
-`plans/learnings.md`, signal emitted, session stop-reflect will pick it up as
+`text/learnings.md`, signal emitted, session stop-reflect will pick it up as
 a new primitive on the next session start.

package/commands/sync.md CHANGED Viewed

@@ -10,8 +10,8 @@ Reconcile substrate state — tick loops, absorb markdown, propagate knowledge.
 |------|------|------|
 | *(default)* | Tick all loops + scan docs + todos + agents | L1-L7 |
 | `tick` | Fire all L1-L7 loops once (one full growth cycle) | L1-L7 |
-| `docs` | Scan `docs/*.md` → memory → TypeDB | L6 |
-| `todos` | Scan `docs/*-todo.md` → tasks → TypeDB + KV | L1 |
+| `docs` | Scan `text/*.md` → memory → TypeDB | L6 |
+| `todos` | Scan `text/*-todo.md` → tasks → TypeDB + KV | L1 |
 | `tasks [dir]` | Import reusable task templates from `tasks/` (or `[dir]`) → world-scoped tasks + skills + capabilities | L1 |
 | `agents` | Scan `agents/**/*.md` → units → TypeDB | L1 |
 | `fade` | Fire L3 only — asymmetric decay | L3 |
@@ -41,8 +41,8 @@ Individual noun invocations target a single loop layer; default runs all of them
 ### *(default)*
 1. `bun run verify` — W0 gate (skip if already passed this session)
-2. Scan `docs/*-todo.md` → parse checkboxes → POST `/api/tasks/sync`
-3. Scan `docs/*.md` → extract concepts → write to memory → TypeDB
+2. Scan `text/*-todo.md` → parse checkboxes → POST `/api/tasks/sync`
+3. Scan `text/*.md` → extract concepts → write to memory → TypeDB
 4. Scan `agents/**/*.md` → parse frontmatter → sync units + skills
 5. GET `http://localhost:4321/api/tick?interval=0` — fire all L1-L7 loops
 6. Report:
@@ -73,14 +73,14 @@ Individual noun invocations target a single loop layer; default runs all of them
 ### docs
-1. Scan all `docs/*.md` files via `src/engine/doc-scan.ts`
+1. Scan all `text/*.md` files via `src/engine/doc-scan.ts`
 2. Extract key concepts, verify doc-code alignment
 3. Write to memory → TypeDB
 4. Report: docs scanned, gaps found, TypeDB writes
 ### todos
-1. Scan all `docs/*-todo.md` files via `src/engine/task-parse.ts` (`scanTodos()`)
+1. Scan all `text/*-todo.md` files via `src/engine/task-parse.ts` (`scanTodos()`)
 2. POST to `http://localhost:4321/api/tasks/sync` (hash-gated: skips if data unchanged)
 3. KV update (~10ms), async TypeDB sync (~100ms)
 4. Report:
@@ -129,7 +129,7 @@ Optional: `description`, `tags`, `wave`, `value`, `effort`, `phase`, `persona`,
 **Why this noun exists:** Stage 1 (LIST) of the trade lifecycle needs inventory.
 A new seller world with zero listings is a dead market. Reusable task templates
 are pre-packaged LIST entries — import the catalog, your world has a starter
-set of listings ready to receive offers. See `docs/trade-lifecycle-todo.md § Cycle 6`.
+set of listings ready to receive offers. See `text/trade-lifecycle-todo.md § Cycle 6`.
 ### agents

package/hooks/scripts/stop-reflect.sh CHANGED Viewed

@@ -75,11 +75,11 @@ check_drift '^api/src/'            'api/CLAUDE.md'
 # (4) Completed /do cycle → suggest /skill-create to crystallise the pattern.
 # Heuristic: learnings.md was touched AND W3 edited >=3 non-plan/non-text files.
-if echo "$CHANGED" | grep -qF 'plans/learnings.md'; then
-  CYCLE_FILES=$(echo "$CHANGED" | grep -vE '^plans/|^text/|^\.claude/' | wc -l | awk '{print $1}')
+if echo "$CHANGED" | grep -qF 'text/learnings.md'; then
+  CYCLE_FILES=$(echo "$CHANGED" | grep -vE '^text/|^text/|^\.claude/' | wc -l | awk '{print $1}')
   if [ "$CYCLE_FILES" -ge 3 ]; then
     # Extract the cycle slug from the last learnings entry
-    CYCLE_SLUG=$(grep -oE 'cycle [0-9]+|· [a-z][a-z0-9-]+' plans/learnings.md 2>/dev/null | tail -1 | sed 's/^· //')
+    CYCLE_SLUG=$(grep -oE 'cycle [0-9]+|· [a-z][a-z0-9-]+' text/learnings.md 2>/dev/null | tail -1 | sed 's/^· //')
     BUF+="- **skill-create** cycle closed with ${CYCLE_FILES} file(s) changed — run \`/skill-create ${CYCLE_SLUG}\` to crystallise the pattern as a reusable skill.\n"
     PROPOSALS=$((PROPOSALS + 1))
   fi

package/hooks/scripts/sync-todo-docs.sh CHANGED Viewed

@@ -24,7 +24,7 @@ esac
 # Don't recurse: if the edit IS TODO.md itself, it was likely regenerated. Skip.
 case "$FILE" in
-  */docs/TODO.md) exit 0 ;;
+  */text/TODO-plan.md) exit 0 ;;
 esac
 # Best-effort POST with a 2s cap so edits stay snappy.

package/package.json CHANGED Viewed

@@ -1,6 +1,6 @@
 {
   "name": "@oneie/claude",
-  "version": "0.3.2",
+  "version": "0.5.0",
   "description": "ONE Claude Code plugin — /do lifecycle, W1-W4 BUILD engine, ONE substrate via MCP, signals, and rules",
   "files": [
     ".claude-plugin",
@@ -11,6 +11,7 @@
     "rules",
     "hooks",
     "scripts",
+    "templates",
     "README.md"
   ],
   "peerDependencies": {

package/rules/documentation.md CHANGED Viewed

@@ -16,12 +16,12 @@ Before any code is written, explicitly list which docs will change:
 ### Documentation Updates (W2)
 **New docs:**
-- `docs/feature.md` — {purpose}
+- `text/feature.md` — {purpose}
 **Docs modified:**
-- `docs/dictionary.md` — add term {name}
-- `docs/routing.md` — update loop L{N}
-- `docs/rubrics.md` — add dimension {name}
+- `text/dictionary-plan.md` — add term {name}
+- `text/routing-plan.md` — update loop L{N}
+- `text/rubrics-plan.md` — add dimension {name}
 **Schema changes:**
 - New TypeDB entities, D1 migrations, TypeQL functions
@@ -37,12 +37,12 @@ Before any code is written, explicitly list which docs will change:
 | Code File | Related Doc |
 |-----------|-------------|
-| `nanoclaw/src/types.ts` | `docs/dictionary.md` — add new type terms |
-| `src/engine/world.ts` | `docs/DSL.md` — update signal grammar |
-| `src/engine/loop.ts` | `docs/routing.md` — update L1-L7 loops |
-| `src/pages/api/*.ts` | `docs/lifecycle.md` — update lifecycle stage |
-| `src/schema/*.tql` | `docs/one-ontology.md` — update dimension/entity |
-| Feature-specific file | `docs/feature.md` — update feature spec |
+| `nanoclaw/src/types.ts` | `text/dictionary-plan.md` — add new type terms |
+| `src/engine/world.ts` | `text/DSL-plan.md` — update signal grammar |
+| `src/engine/loop.ts` | `text/routing-plan.md` — update L1-L7 loops |
+| `src/pages/api/*.ts` | `text/lifecycle-plan.md` — update lifecycle stage |
+| `src/schema/*.tql` | `text/one-ontology.md` — update dimension/entity |
+| Feature-specific file | `text/feature.md` — update feature spec |
 **Pattern:** If you're adding a new field to a TypeScript interface, add it to the doc that defines that interface. If you're changing a loop, update the loop diagram in the doc.
@@ -54,7 +54,7 @@ Before any code is written, explicitly list which docs will change:
 1. **Terminology** — All renamed concepts updated everywhere
    ```bash
-   grep -r "old-name" docs/ -- ignore new-name where intentional (dead names)
+   grep -r "old-name" text/ -- ignore new-name where intentional (dead names)
    ```
 2. **Examples** — Code examples in docs match actual implementation
@@ -125,10 +125,10 @@ After W4 completes, auto-run consistency checks:
 ```bash
 # Terminology consistency
-grep -r "old-term" docs/ | flag if > 0 (not on deadname list)
+grep -r "old-term" text/ | flag if > 0 (not on deadname list)
 # Cross-references
-markdown-link-check docs/*.md
+markdown-link-check text/*.md
 # Metaphor alignment (if touching signal/path semantics)
 grep -l "signal\|strength\|resistance" src/changed-files.txt | xargs -I {} \
@@ -151,9 +151,9 @@ grep "rubric\|dimension" src/changed-files.txt | flag for rubrics.md review
 - None (extends existing types)
 **Docs modified:**
-- `docs/dictionary.md` — add "Rich Messages" section, explain `data.rich` convention
-- `docs/rich-messages.md` — add PaymentMetadata type, Payment flow section
-- `docs/template-todo.md` — add "Documentation Updates (W2 + W4)" section
+- `text/dictionary-plan.md` — add "Rich Messages" section, explain `data.rich` convention
+- `text/rich-messages.md` — add PaymentMetadata type, Payment flow section
+- `text/template-todo.md` — add "Documentation Updates (W2 + W4)" section
 **Schema changes:**
 - `nanoclaw/src/types.ts` — RichMessage.thread, RichMessage.payment added (Cycle 1 & 3)
@@ -197,9 +197,9 @@ When running `/do {name}-todo.md`, the workflow enforces:
 | feature doc named in plan | feature doc | sync verb/component/endpoint counts |
 | rename across W3 | every `**/*.md` containing old name | inline replace (auto-grep) |
-Anchor mismatch → log to `docs/improvements.md` for manual W2 next cycle (no halt — the cycle still closes; the propagate task surfaces next iteration).
+Anchor mismatch → log to `text/improvements.md` for manual W2 next cycle (no halt — the cycle still closes; the propagate task surfaces next iteration).
-**Learnings entry** (append to `docs/learnings.md` on each close):
+**Learnings entry** (append to `text/learnings.md` on each close):
 ```
 - YYYY-MM-DD · cycle N · wave N|gate · {one sentence} · rubric=0.NN · source=w1|w2|w3|w4|cycle