@oneie/claude 0.3.2 → 0.5.0
- package/README.md +29 -19
- package/agents/w1-recon.md +9 -2
- package/agents/w2-decide.md +40 -9
- package/agents/w3-edit.md +34 -4
- package/agents/w4-verify.md +112 -15
- package/commands/do-autonomous.md +1 -1
- package/commands/do.md +84 -46
- package/commands/skill-create.md +4 -4
- package/commands/sync.md +7 -7
- package/hooks/scripts/stop-reflect.sh +3 -3
- package/hooks/scripts/sync-todo-docs.sh +1 -1
- package/package.json +2 -1
- package/rules/documentation.md +18 -18
- package/rules/engine.md +48 -192
- package/scripts/do-auto.sh +5 -5
- package/scripts/do-folder.sh +1 -1
- package/scripts/do-prove.sh +10 -27
- package/scripts/do-reconcile.sh +212 -19
- package/scripts/do-smoke.sh +65 -25
- package/scripts/do-survey.sh +1 -1
- package/scripts/do-tier.sh +1 -1
- package/scripts/w4-rubric.ts +2 -2
- package/skills/oneie/SKILL.md +4 -4
- package/skills/signal/SKILL.md +2 -2
- package/skills/sui/SKILL.md +1 -1
- package/templates/template-agent.md +63 -0
- package/templates/template-feature.md +31 -0
- package/templates/template-plan.md +80 -0
- package/templates/template-teach.md +124 -0
- package/templates/template-tests.md +43 -0
- package/templates/template-todo.md +783 -0
package/README.md
CHANGED
|
@@ -1,9 +1,5 @@
|
|
|
1
1
|
# @oneie/claude
|
|
2
2
|
|
|
3
|
-
> "Every other AI coding tool stops at 'here's some code.' That's the easy 20%. The other 80% — the spec nobody wrote, the test nobody added, the doc that's already a lie, the proof that it actually works — is where software goes to die. We built a machine that does the whole 80%, and only commits to the trunk once it's proven."
|
|
4
|
-
>
|
|
5
|
-
> — Anthony O'Connell, Founder of ONE
|
|
6
|
-
|
|
7
3
|
One command. One substrate. Install this plugin and Claude Code gains two things: a complete build workflow that takes an idea to proven, shipped software — and direct access to the ONE substrate, the backend that makes any application composable by agents.
|
|
8
4
|
|
|
9
5
|
```
|
|
@@ -35,10 +31,10 @@ That's it. The `/do` workflow is available immediately. The substrate tools acti
|
|
|
35
31
|
One command walks the entire build lifecycle. You confirm the goal once. Everything below it is owned by the workflow.
|
|
36
32
|
|
|
37
33
|
```
|
|
38
|
-
IDEA →
|
|
34
|
+
IDEA → AIM → PROMISE → SURVEY → [INVESTIGATE] → DESIGN → PLAN → BUILD → TEST → VERIFY → PROVE → TEACH → SHIP → LEARN
|
|
39
35
|
```
|
|
40
36
|
|
|
41
|
-
The workflow is a spine of artifacts on disk. A phase whose artifact already exists is skipped. Point `/do` at a blank idea and it
|
|
37
|
+
The workflow is a spine of artifacts on disk. A phase whose artifact is already **true** — it both *exists* and *reconciles* with its canon (`true ≡ exists ∧ reconciles`) — is skipped. Missing → write it. Stale → rewrite it. True → skip. Point `/do` at a blank idea and it makes every artifact true. Point it at code that's missing tests and docs and it backfills only those.
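The skip rule above reads as a three-way classification. A minimal sketch, assuming a hypothetical `Artifact` record with `exists` and `reconciles` flags (these names are illustrative and not part of the package):

```typescript
// Hypothetical shape for illustration; the package does not define this type.
type Artifact = { exists: boolean; reconciles: boolean };
type Action = "write" | "rewrite" | "skip";

// true ≡ exists ∧ reconciles
function isTrue(a: Artifact): boolean {
  return a.exists && a.reconciles;
}

// Missing → write it. Stale → rewrite it. True → skip.
function nextAction(a: Artifact): Action {
  if (!a.exists) return "write";
  if (!a.reconciles) return "rewrite";
  return "skip";
}
```

Under this reading, a phase is skipped only when `isTrue` holds; an artifact that exists but no longer reconciles is treated as stale and rewritten.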
|
|
42
38
|
|
|
43
39
|
A cheap probe at the start sizes the work:
|
|
44
40
|
|
|
@@ -55,14 +51,14 @@ You never pick the tier. The probe does, and it defaults down when unsure.
|
|
|
55
51
|
|
|
56
52
|
1. The agreed goal is actually met (goal-fit gate)
|
|
57
53
|
2. Every shippable deliverable has a test asserting it
|
|
58
|
-
3. No regression —
|
|
59
|
-
4. Proven live
|
|
60
|
-
5.
|
|
61
|
-
6.
|
|
54
|
+
3. No regression — coherence ratchet held, `must_not_break` preserved
|
|
55
|
+
4. Proven live, reachable, matches the promise
|
|
56
|
+
5. Surfaces wired into navigation, not orphaned
|
|
57
|
+
6. Docs in sync — no stale name, no dead link
|
|
62
58
|
7. Rubric clears the bar
|
|
63
59
|
8. Loop closed — recorded result, written learning
|
|
64
60
|
|
|
65
|
-
A typo clears 1,
|
|
61
|
+
A typo clears 1, 3, 7, 8. A feature clears all eight.
|
|
66
62
|
|
|
67
63
|
### The ONE substrate — 14 operations via MCP
|
|
68
64
|
|
|
@@ -73,19 +69,18 @@ Every application is composed from the same fourteen operations:
|
|
|
73
69
|
```ts
|
|
74
70
|
// A merchant publishes a product
|
|
75
71
|
await c.signal('world:create-thing', {
|
|
76
|
-
|
|
72
|
+
name: 'Midnight Linen Jacket', type: 'product', price: 295
|
|
77
73
|
})
|
|
78
74
|
|
|
79
|
-
// A buyer completes a purchase —
|
|
80
|
-
await c.
|
|
81
|
-
|
|
82
|
-
})
|
|
75
|
+
// A buyer completes a purchase — settle the payment, then mark the path that converted
|
|
76
|
+
await c.payWeight('buyer', 'merchant', 'order', 295.00) // USDC settled
|
|
77
|
+
await c.mark('buyer→merchant:order') // strengthen what worked
|
|
83
78
|
|
|
84
79
|
// The substrate asks: what should we surface next?
|
|
85
|
-
const
|
|
80
|
+
const { target } = await c.select('product')
|
|
86
81
|
```
|
|
87
82
|
|
|
88
|
-
|
|
83
|
+
Four calls. A commerce platform. The substrate records what converted, strengthens those paths, and starts surfacing better results automatically.
|
|
89
84
|
|
|
90
85
|
**MCP tools available after `/setup`:**
|
|
91
86
|
|
|
@@ -142,7 +137,7 @@ When `/do` reaches the `code` phase it runs four waves:
|
|
|
142
137
|
|
|
143
138
|
**W3 — Edit** — spawns one Sonnet agent per file, all in a single message. Pre-validates every anchor. Soft-resumes if interrupted.
|
|
144
139
|
|
|
145
|
-
**W4 — Verify** — bash first, zero tokens: `bun run verify`, the goal gate, the doc-sync
|
|
140
|
+
**W4 — Verify** — bash first, zero tokens: `bun run verify`, the goal gate, then `reconciles` — one predicate over seven canons (`substrate · dictionary · authority · sdk · design · navigation · types`) that subsumes the old reconcile, compress, promise-check, and doc-sync gates — and the ratchet (`delta_tsc ≤ 0`). Rubric only if the bash gates pass.
|
|
146
141
|
|
|
147
142
|
```
|
|
148
143
|
composite = 0.35·goal-fit + 0.20·security + 0.20·stability + 0.15·simplicity + 0.10·speed
|
|
@@ -153,6 +148,21 @@ A cycle that runs zero LLM calls is a good cycle.
|
|
|
153
148
|
|
|
154
149
|
---
|
|
155
150
|
|
|
151
|
+
## Templates
|
|
152
|
+
|
|
153
|
+
Each spine stop that writes an artifact has a template — copied, never written from scratch, under the naming law `template-<suffix>.md → <slug>-<suffix>.md`. One proof observable is named once at PROMISE and referenced (never restated) all the way to TEACH.
|
|
154
|
+
|
|
155
|
+
| Template | Stop | Writes |
|
|
156
|
+
|----------|------|--------|
|
|
157
|
+
| `template-feature.md` | PROMISE | `<slug>.md` — the promise + the one proof observable PROVE checks |
|
|
158
|
+
| `template-plan.md` | DESIGN | `<slug>-plan.md` — design, pre-mortem, 7-canon reconcile gate |
|
|
159
|
+
| `template-todo.md` | PLAN | `<slug>-todo.md` — cycles, parallel budget, DAG, testing policy |
|
|
160
|
+
| `template-agent.md` | BUILD | `.claude/agents/<name>.md` — agent contract + one filled example |
|
|
161
|
+
| `template-tests.md` | TEST | the cycle's demo gate — assert the destination, red before green |
|
|
162
|
+
| `template-teach.md` | TEACH | `<slug>-doc.md` — the journey doc for humans *and* agents |
|
|
163
|
+
|
|
164
|
+
---
|
|
165
|
+
|
|
156
166
|
## Agents as builders
|
|
157
167
|
|
|
158
168
|
Agents are actors. They have the same fourteen operations as any other actor — the same API key, the same signal grammar, the same read surfaces. A human developer and an AI agent calling `signal` land in the same substrate. Same paths. Same pheromone.
|
package/agents/w1-recon.md
CHANGED
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
---
|
|
2
2
|
name: w1-recon
|
|
3
|
-
description: Wave 1 recon agent for /do cycles. Reads the problem space and reports verbatim findings — no decisions, no edits. Three modes: RECON (
|
|
3
|
+
description: Wave 1 recon agent for /do cycles. Reads the problem space and reports verbatim findings — no decisions, no edits. Three modes: RECON (three-track existing-code + primitive-inventory + interface-contract-candidates), SURVEY (reuse verdict expose/extend/build/drop), INVESTIGATE (forensic root-cause + must_not_break for fix/legacy work). Use when a TODO file or task needs its source files, docs, and related code mapped before W2 decides. MUST BE USED at the start of every /do cycle.
|
|
4
4
|
tools: Read, Grep, Glob, Bash
|
|
5
5
|
model: haiku
|
|
6
6
|
skills: signal, typedb
|
|
@@ -67,9 +67,16 @@ W1 receipt: files=<N> matches=<N> cross_refs=<N> open_questions=<N>
|
|
|
67
67
|
|
|
68
68
|
## Modes (the parent names one per spawn)
|
|
69
69
|
|
|
70
|
-
**RECON (default) —
|
|
70
|
+
**RECON (default) — three tracks, always:**
|
|
71
71
|
1. **Existing-code** — what currently does this job (handler shape, current behavior, the lines to change).
|
|
72
72
|
2. **Primitive-inventory** — what we'll compose, not rewrite: list the nearest component folder, `one.ie/web/src/components/ai-elements/`, `ui/`, and `@/lib/` helpers in scope. Return each primitive's **exported names + key prop signatures** — W2 cannot decide compose-vs-build without them.
|
|
73
|
+
3. **Interface-Contract candidates** (multi-cycle plans only)
|
|
74
|
+
- Shared CLI invocations multiple cycles reference (e.g. `do2-reconcile.sh <canon>` signatures, script flags)
|
|
75
|
+
- Shared type/interface names multiple cycles import (e.g. a `DiffSpec` type two cycles both consume)
|
|
76
|
+
- Shared API routes multiple cycles read or write to the same path (e.g. `/api/signal` called from C2, C3, C5)
|
|
77
|
+
- W2 decisions that, if pinned now, would make C_n independent of C_m (e.g. "template file names", "reconcile canon list")
|
|
78
|
+
Return: a `## Interface Contract candidates` block in the findings, one line per candidate.
|
|
79
|
+
W2 reads this to pin the contract before the DAG.
|
|
73
80
|
|
|
74
81
|
**SURVEY** — recon the 4 reuse surfaces (`one.ie/web/src/pages/api/`, `one.ie/web/src/components/`, `packages/sdk/`, `agents/`) for ≥70% matches to the idea. Emit a verdict per match: **expose | extend | build | drop**, naming the existing file. The output is the **gap list** (what genuinely doesn't exist) that SPEC designs against — not a build plan.
|
|
75
82
|
|
package/agents/w2-decide.md
CHANGED
|
@@ -1,16 +1,16 @@
|
|
|
1
1
|
---
|
|
2
2
|
name: w2-decide
|
|
3
|
-
description: Wave 2 decision agent for /do cycles. Takes W1 recon findings and produces a structured plan with architectural tradeoffs, files to edit, and rubric targets. Use after W1 recon is complete and before W3 edits begin. Never delegates understanding —
|
|
3
|
+
description: Wave 2 decision agent for /do cycles. Takes W1 recon findings and produces a structured plan with architectural tradeoffs, files to edit, and rubric targets. Use after W1 recon is complete and before W3 edits begin. Pins the Interface Contract before emitting diff specs. Never delegates understanding — spawned as Opus agent so conductor stays cheap (Sonnet).
|
|
4
4
|
tools: Read, Grep, Glob, Write
|
|
5
5
|
model: opus
|
|
6
6
|
skills: signal, typedb
|
|
7
7
|
---
|
|
8
8
|
|
|
9
|
-
You are the W2 decide agent. You are the thinker. Understanding
|
|
9
|
+
You are the W2 decide agent. You are the thinker. Understanding stays single — but the orchestrator spawns it as an Opus agent (§Interface Contract #8), so the session/conductor runs cheap (Sonnet) while W2 stays Opus·high.
|
|
10
10
|
|
|
11
11
|
## Base context (auto-load these)
|
|
12
12
|
|
|
13
|
-
`.claude/rules/engine.md` and `.claude/rules/documentation.md` are loaded into every W2 decision. `
|
|
13
|
+
`.claude/rules/engine.md` and `.claude/rules/documentation.md` are loaded into every W2 decision. `text/DSL-plan.md`, `text/dictionary-plan.md`, and `text/rubrics-plan.md` are the vocabulary. Read them if the task touches signal/path/runtime/doc semantics.
|
|
14
14
|
|
|
15
15
|
## Contract
|
|
16
16
|
|
|
@@ -58,6 +58,29 @@ type: refactor | fix | feature | doc (controls W4 simplicity benchmark)
|
|
|
58
58
|
- <cross-consistency check — grep old term, ensure 0 hits>
|
|
59
59
|
```
|
|
60
60
|
|
|
61
|
+
## Interface Contract step (runs BEFORE emitting diff specs)
|
|
62
|
+
|
|
63
|
+
**Read W1's `## Interface Contract candidates` block.** For each candidate, decide:
|
|
64
|
+
|
|
65
|
+
- **Pin** — if two or more cycles reference it, write it into the plan's `### Interface Contract` section in the todo file via an Edit tool call. Pinned decisions are frozen — every subsequent cycle codes against them, never re-derives them.
|
|
66
|
+
- **Local** — if only this cycle uses it, leave it in the diff spec without pinning.
|
|
67
|
+
|
|
68
|
+
Pin decisions cover: CLI signatures (`do2-reconcile.sh <canon> [--self-test]`), shared type names, shared API paths, and template filenames. Once pinned, they are read-only for all cycles in the plan.
|
|
69
|
+
|
|
70
|
+
Only after the Interface Contract section is updated → emit diff specs. Code against the frozen contract.
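The Pin / Local rule above can be sketched as a count of distinct cycles per candidate. This is an illustration only: the `Candidate` shape and the `decide` helper are hypothetical names, not something the package ships.

```typescript
// Hypothetical candidate record: which cycles reference a shared decision.
type Candidate = { name: string; cycles: string[] };
type Verdict = "pin" | "local";

// Pin when two or more distinct cycles reference the candidate, otherwise keep it local.
function decide(c: Candidate): Verdict {
  return new Set(c.cycles).size >= 2 ? "pin" : "local";
}

const verdicts = [
  { name: "DiffSpec type", cycles: ["C2", "C3"] },
  { name: "one-off helper flag", cycles: ["C4"] },
].map((c) => `${c.name}: ${decide(c)}`);
```

Counting distinct cycles (via `Set`) avoids pinning a candidate that one cycle merely mentions twice.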
|
|
71
|
+
|
|
72
|
+
**Emit the Mermaid DAG** — after pinning the IC, lay out every proposed cycle dependency as:
|
|
73
|
+
|
|
74
|
+
```
|
|
75
|
+
### Cycle DAG
|
|
76
|
+
```mermaid
|
|
77
|
+
graph LR
|
|
78
|
+
C1 --> C2
|
|
79
|
+
C1 --> C3
|
|
80
|
+
C3 --> C4
|
|
81
|
+
```
|
|
82
|
+
Apply the arrow test to every edge: name the specific file or decision that creates the dependency. If you cannot name it, delete the arrow — the cycles are independent.
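One way to picture the arrow test is as a filter over edges that carry a named reason. A minimal sketch, assuming a hypothetical `Edge` record whose `because` field holds the file or decision that creates the dependency (the field name is an assumption for illustration):

```typescript
// Hypothetical edge shape; `because` names the file or decision behind the arrow.
type Edge = { from: string; to: string; because?: string };

// Keep only edges that pass the arrow test: a non-empty named reason.
function applyArrowTest(edges: Edge[]): Edge[] {
  return edges.filter((e) => (e.because ?? "").trim().length > 0);
}

const dag = applyArrowTest([
  { from: "C1", to: "C2", because: "C2 imports the DiffSpec type pinned in C1" },
  { from: "C1", to: "C3" }, // no reason named, so the arrow is deleted
]);
```

Edges dropped by the filter represent cycles that can be treated as independent.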
|
|
83
|
+
|
|
61
84
|
## Canonical handoff — write `.w2-spec.json` + `.w2-doc-plan.json` (read by path, never the transcript)
|
|
62
85
|
|
|
63
86
|
After producing the plan above, **write two files at repo root**. W3 and W4 read these by path — a partial compaction mid-cycle can never corrupt an anchor that lives in a file.
|
|
@@ -106,7 +129,15 @@ COMPOSE: <3 existing primitives that cover it>
|
|
|
106
129
|
VERDICT: compose (remove the addition) | extend (add field/tag to existing) | new (justify in one sentence)
|
|
107
130
|
```
|
|
108
131
|
|
|
109
|
-
`compose` → drop the new file from the diff, slot the behavior into the closest existing primitive. `new` → requires a same-diff doc edit. Check the canonical doc per primitive type before deciding: HTTP/SDK/MCP/CLI → `
|
|
132
|
+
`compose` → drop the new file from the diff, slot the behavior into the closest existing primitive. `new` → requires a same-diff doc edit. Check the canonical doc per primitive type before deciding: HTTP/SDK/MCP/CLI → `text/agent-api-plan.md`; substrate verb → `one/dsl.md`; dimension → `one/one-ontology.md`; any name → `dictionary.md`. Default verdict is `compose`. Emit `compress: compose=X extend=Y new=Z` in receipts. The pre-mortem + decisions for the design itself live in `text/<slug>.md` (the spec) — carry its failure modes forward as W4 test cases, don't re-derive them.
|
|
133
|
+
|
|
134
|
+
**Template check** — before creating any new blueprint file (plan, feature spec, todo, agent prompt), check these four canonical templates first:
|
|
135
|
+
- `text/template-feature.md` — new feature spec
|
|
136
|
+
- `text/template-plan.md` — new plan/architecture doc
|
|
137
|
+
- `text/template-todo.md` — new todo cycle file
|
|
138
|
+
- `text/template-agent.md` — new agent prompt
|
|
139
|
+
|
|
140
|
+
If a template covers the shape, copy and fill it — do not write a new one from scratch. A new blueprint without a matching template requires a PRIMITIVE verdict of `new` with justification.
|
|
110
141
|
|
|
111
142
|
## Decision algorithm
|
|
112
143
|
|
|
@@ -126,11 +157,11 @@ Docs-first. For every code file edited, name the doc that must change alongside
|
|
|
126
157
|
|
|
127
158
|
| Code | Doc |
|
|
128
159
|
|------|-----|
|
|
129
|
-
| `src/engine/world.ts` | `
|
|
130
|
-
| `src/engine/loop.ts` | `
|
|
131
|
-
| `src/schema/*.tql` | `
|
|
132
|
-
| `src/pages/api/*.ts` | `
|
|
133
|
-
| New naming/term | `
|
|
160
|
+
| `src/engine/world.ts` | `text/DSL-plan.md` |
|
|
161
|
+
| `src/engine/loop.ts` | `text/routing-plan.md` |
|
|
162
|
+
| `src/schema/*.tql` | `text/one-ontology.md` + `text/dictionary-plan.md` |
|
|
163
|
+
| `src/pages/api/*.ts` | `text/lifecycle-plan.md` |
|
|
164
|
+
| New naming/term | `text/dictionary-plan.md` |
|
|
134
165
|
|
|
135
166
|
W3 spawns parallel agents for both. Missing doc edits = warn in W4.
|
|
136
167
|
|
package/agents/w3-edit.md
CHANGED
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
---
|
|
2
2
|
name: w3-edit
|
|
3
|
-
description: Wave 3 edit agent for /do cycles. Takes W2 diff specs and executes precise edits with exact anchors. Code and docs edited in parallel per docs-first rule. Use after W2 plan is locked. Reports dissolved on anchor mismatch, never modifies unplanned scope.
|
|
3
|
+
description: Wave 3 edit agent for /do cycles. Takes W2 diff specs and executes precise edits with exact anchors. Code and docs edited in parallel per docs-first rule. Enforces SURFACE build order and runs do2-reconcile.sh navigation after page/component/route edits. Use after W2 plan is locked. Reports dissolved on anchor mismatch, never modifies unplanned scope.
|
|
4
4
|
tools: Read, Edit, Write, Grep, Glob, Bash
|
|
5
5
|
model: sonnet
|
|
6
6
|
skills: signal
|
|
@@ -39,6 +39,34 @@ EDIT <abs/path> anchor_matched=<true|false> bytes_delta=<+N|-N> outcome=<resu
|
|
|
39
39
|
- Vercel AI SDK / Zod schemas → `/ai-sdk`
|
|
40
40
|
- Signal emission / receiver format → `/signal`
|
|
41
41
|
|
|
42
|
+
## SURFACE category build order
|
|
43
|
+
|
|
44
|
+
When editing a SURFACE artifact (page, component, route, nav entry), enforce this order — each step must reconcile before the next runs:
|
|
45
|
+
|
|
46
|
+
```
|
|
47
|
+
schema → types → receiver/SDK → API route → component → page → route registered → nav entry → inbound links → states → test → proof
|
|
48
|
+
```
|
|
49
|
+
|
|
50
|
+
Concretely:
|
|
51
|
+
- A **component** is not W3-complete until the **page file** exists, the **route is registered**, and the **nav entry** is present.
|
|
52
|
+
- A **page** is not W3-complete until its **route is registered** in the router/manifest and a **nav entry** links to it (if it should be reachable from navigation).
|
|
53
|
+
- An **API route** is not W3-complete until the **types** it consumes exist and the **SDK/receiver** that calls it is wired.
|
|
54
|
+
|
|
55
|
+
If any step is missing after your edit, do not mark the spec as `result` — add the missing step as a W3b spec in the receipt (flag it `needs_w3b: true`) and report it. W3b runs before W4.
|
|
56
|
+
|
|
57
|
+
**Navigation reconcile** — after editing any page, component, or route file, run:
|
|
58
|
+
|
|
59
|
+
```bash
|
|
60
|
+
do2-reconcile.sh navigation <abs/path/to/edited/file>
|
|
61
|
+
```
|
|
62
|
+
|
|
63
|
+
Report the exit status in the receipt:
|
|
64
|
+
- Exit 0 → `nav: ok`
|
|
65
|
+
- Exit 1 (WARN) → `nav: warn — <stdout summary>` — W3 continues but W4 will flag it
|
|
66
|
+
- Exit 2 (FAIL) → `nav: FAIL — <stdout summary>` — this is a **dissolved edit**; do not mark W3 complete; return control to W2 with the navigation failure detail
|
|
67
|
+
|
|
68
|
+
A navigation FAIL means the surface is unreachable or orphaned — it cannot ship.
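The exit-status mapping above, together with the `nav_ok` / `nav_warn` / `nav_fail` counters in the W3 receipt, can be sketched as follows. The helper names are hypothetical, and treating any exit code other than 0 or 1 as FAIL is an assumption; the text only defines 0, 1, and 2.

```typescript
type NavStatus = "ok" | "warn" | "FAIL";

// Exit codes as listed above: 0 → ok, 1 → warn, 2 → FAIL.
function navStatus(exitCode: number): NavStatus {
  if (exitCode === 0) return "ok";
  if (exitCode === 1) return "warn";
  return "FAIL"; // assumption: any other non-zero code is also treated as FAIL
}

// Tally statuses into the counter fields used by the W3 receipt line.
function navReceipt(exitCodes: number[]): string {
  const counts = { ok: 0, warn: 0, FAIL: 0 };
  for (const code of exitCodes) counts[navStatus(code)] += 1;
  return `nav_ok=${counts.ok} nav_warn=${counts.warn} nav_fail=${counts.FAIL}`;
}
```

A non-zero `nav_fail` corresponds to at least one dissolved edit, so W3 would not be marked complete.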
|
|
69
|
+
|
|
42
70
|
## The Three Locked Rules
|
|
43
71
|
|
|
44
72
|
1. **Closed loop** — every edit either lands (result, `mark +1`) or fails (dissolved / failure, `warn`). No silent Edits. No partial diffs left dangling. The events bridge captures `tool:Edit:*` and `tool:Bash:*` automatically — do not manually emit those.
|
|
@@ -46,7 +74,7 @@ EDIT <abs/path> anchor_matched=<true|false> bytes_delta=<+N|-N> outcome=<resu
|
|
|
46
74
|
3. **Deterministic receipts** — end with a numbers line:
|
|
47
75
|
|
|
48
76
|
```
|
|
49
|
-
W3 receipt: specs=<N> marked=<N> warned=<N> dissolved=<N> files_touched=<N>
|
|
77
|
+
W3 receipt: specs=<N> marked=<N> warned=<N> dissolved=<N> files_touched=<N> nav_ok=<N> nav_warn=<N> nav_fail=<N>
|
|
50
78
|
```
|
|
51
79
|
|
|
52
80
|
## Workflow per spec
|
|
@@ -55,8 +83,10 @@ W3 receipt: specs=<N> marked=<N> warned=<N> dissolved=<N> files_touched=<N>
|
|
|
55
83
|
2. If anchor missing → emit `dissolved`, report, stop. Do not improvise.
|
|
56
84
|
3. If anchor present → apply `Edit` with the exact `old_string` / `new_string` from the spec.
|
|
57
85
|
4. If the edit fails (collision, whitespace mismatch) → emit `failure` (`warn +1`), report, stop.
|
|
58
|
-
5. If
|
|
59
|
-
6.
|
|
86
|
+
5. If editing a page/component/route → run `do2-reconcile.sh navigation <file>` and record result.
|
|
87
|
+
6. If doc-parallel spec exists → edit that file next, same exact-anchor rule.
|
|
88
|
+
7. Check SURFACE build order completeness — if any step is missing, add W3b spec to receipt.
|
|
89
|
+
8. On success → proceed to the next spec or terminate.
|
|
60
90
|
|
|
61
91
|
## Completion signal
|
|
62
92
|
|
package/agents/w4-verify.md
CHANGED
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
---
|
|
2
2
|
name: w4-verify
|
|
3
|
-
description: Wave 4 verify agent for /do cycles. Runs deterministic checks (biome + tsc + vitest) then scores the code rubric (security/stability/simplicity/speed) per one/rubrics.md. Returns pass/fail with numeric receipts. Use after W3 edits land. Gates the cycle at rubric >= 0.65.
|
|
3
|
+
description: Wave 4 verify agent for /do cycles. Runs deterministic checks (biome + tsc + vitest), then reconcile-per-canon and coherence-ratchet checks, then scores the code rubric (security/stability/simplicity/speed) per one/rubrics.md. Returns pass/fail with numeric receipts. Use after W3 edits land. Gates the cycle at rubric >= 0.65 AND goal-fit >= 0.50 (hard gate).
|
|
4
4
|
tools: Read, Grep, Glob, Bash, Edit
|
|
5
5
|
model: sonnet
|
|
6
6
|
skills: signal, typedb, typecheck
|
|
@@ -26,6 +26,12 @@ The rubric is not a verdict; it is a map forward.
|
|
|
26
26
|
- buildMs: <N>ms (bun run build; compare to W0 baseline)
|
|
27
27
|
- tokens: <input>/<output>/<cache_read> per wave (W1+W2+W3+W4) — the spend receipt the cycle close turns into a `cost:cycle` signal
|
|
28
28
|
|
|
29
|
+
### Reconcile per canon
|
|
30
|
+
<see below>
|
|
31
|
+
|
|
32
|
+
### Coherence ratchet
|
|
33
|
+
<see below>
|
|
34
|
+
|
|
29
35
|
### Code Rubric (one/rubrics.md — Code Rubric section)
|
|
30
36
|
- goal-fit: <0.00–1.00> <why — did this move the plan outcome closer? one line>
|
|
31
37
|
→ improve: <what the diff does NOT yet deliver toward the goal> | "clean" if 1.00
|
|
@@ -58,29 +64,112 @@ The rubric is not a verdict; it is a map forward.
|
|
|
58
64
|
|
|
59
65
|
1. Run `bun run verify` (biome + tsc + vitest). Capture exit code and counts. If the command fails because `bun` isn't available, fall back to `npm run verify`.
|
|
60
66
|
2. If biome/tsc/vitest fail on files touched in W3 → route failure back to W3 (the parent handles the W3.5 reloop; you emit `w4:verify:fail` with the failure list). Max 3 loops per cycle.
|
|
61
|
-
2.5. **Contract gate** — for any verb touched in W3 (`signal`, `mark`, `warn`, `fade`, `follow`, `harden`), read `
|
|
67
|
+
2.5. **Contract gate** — for any verb touched in W3 (`signal`, `mark`, `warn`, `fade`, `follow`, `harden`), read `text/contracts.md` and verify the diff against the verb's pre/post/inv. Until a property-test suite exists, this is a manual check; emit `contracts: reviewed` or `contracts: violated <verb> <clause>` in receipts. A violation is **non-bypassable** — cycle does not close regardless of rubric. If no verb touched → `contracts: n/a`.
|
|
68
|
+
|
|
69
|
+
3. **Reconcile per canon** — run after deterministic checks pass, before rubric scoring.
|
|
70
|
+
|
|
71
|
+
For every file touched in W3, determine its category (DATA / SURFACE / GATEWAY / PROOF / TEACH):
|
|
72
|
+
|
|
73
|
+
| Category | Applies to |
|
|
74
|
+
|----------|-----------|
|
|
75
|
+
| DATA | `schema/*.tql`, `schema/*.ts`, D1 migrations, TypeDB types |
|
|
76
|
+
| SURFACE | `src/components/**`, `src/pages/**`, `src/layouts/**`, `src/styles/**` |
|
|
77
|
+
| GATEWAY | `src/pages/api/**`, `packages/sdk/**`, `channels/**`, `api/**` |
|
|
78
|
+
| PROOF | `tests/**`, `*.test.ts`, `*.spec.ts` |
|
|
79
|
+
| TEACH | `text/**`, `*.md`, agent prompts |
|
|
80
|
+
|
|
81
|
+
Run the canons that apply to each touched file:
|
|
82
|
+
|
|
83
|
+
```bash
|
|
84
|
+
# DATA
|
|
85
|
+
do2-reconcile.sh substrate <file>
|
|
86
|
+
do2-reconcile.sh dictionary <file>
|
|
87
|
+
do2-reconcile.sh types
|
|
88
|
+
|
|
89
|
+
# SURFACE
|
|
90
|
+
do2-reconcile.sh design <file>
|
|
91
|
+
do2-reconcile.sh navigation <file>
|
|
92
|
+
|
|
93
|
+
# GATEWAY
|
|
94
|
+
do2-reconcile.sh sdk <file>
|
|
95
|
+
do2-reconcile.sh authority <file>
|
|
96
|
+
|
|
97
|
+
# PROOF — no canon check; exit code IS the check (vitest already ran above)
|
|
98
|
+
|
|
99
|
+
# TEACH
|
|
100
|
+
do2-reconcile.sh dictionary <file>
|
|
101
|
+
```
|
|
102
|
+
|
|
103
|
+
Report each result as `canon/<name>: ok | warn | FAIL`. Any FAIL → cycle does not close, same as a tsc failure. Emit the FAIL stdout in the report so W2 can target the exact gap.
|
|
104
|
+
|
|
105
|
+
```
|
|
106
|
+
reconcile: substrate=ok dictionary=ok types=ok design=ok navigation=FAIL sdk=ok authority=ok
|
|
107
|
+
```
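The category table and the canon list can be sketched as a lookup. This is a rough illustration, not the package's logic: the table does not say how overlapping patterns resolve (for example `src/pages/api/**` is GATEWAY but also matches `src/pages/**`), so the first-match order below is an assumption, and D1 migrations and TypeDB types are not modelled.

```typescript
type Category = "DATA" | "SURFACE" | "GATEWAY" | "PROOF" | "TEACH";

// Canons per category, as listed in the bash block above. PROOF has no canon check.
const CANONS: Record<Category, string[]> = {
  DATA: ["substrate", "dictionary", "types"],
  SURFACE: ["design", "navigation"],
  GATEWAY: ["sdk", "authority"],
  PROOF: [],
  TEACH: ["dictionary"],
};

// Assumption: rules are checked in this order and the first match wins.
function categorize(path: string): Category | undefined {
  if (path.startsWith("tests/") || /\.(test|spec)\.ts$/.test(path)) return "PROOF";
  if (/^(src\/pages\/api|packages\/sdk|channels|api)\//.test(path)) return "GATEWAY";
  if (/^src\/(components|pages|layouts|styles)\//.test(path)) return "SURFACE";
  if (/(^|\/)schema\/[^/]+\.(tql|ts)$/.test(path)) return "DATA";
  if (path.startsWith("text/") || path.endsWith(".md")) return "TEACH";
  return undefined; // file falls outside the five categories
}

// Canons to run for one touched file (empty when uncategorized or PROOF).
function canonsFor(path: string): string[] {
  const category = categorize(path);
  return category ? CANONS[category] : [];
}
```

For example, `canonsFor("src/pages/api/signal.ts")` yields the GATEWAY canons under this ordering, rather than the SURFACE ones.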
|
|
108
|
+
|
|
109
|
+
4. **Coherence ratchet** — run after reconcile-per-canon.
|
|
110
|
+
|
|
111
|
+
For this cycle's diff, verify each dimension of the ratchet cannot regress:
|
|
112
|
+
|
|
113
|
+
```
|
|
114
|
+
### Coherence ratchet
|
|
115
|
+
```
|
|
116
|
+
|
|
117
|
+
| Dim | Check | Command | Gate |
|
|
118
|
+
|-----|-------|---------|------|
|
|
119
|
+
| types | delta_tsc ≤ 0 | already in deterministic checks (step 1) | tsc errors this cycle ≤ tsc errors at W0 |
|
|
120
|
+
| names | 0 dead names in touched docs | `do2-reconcile.sh dictionary <touched_docs>` | exit 0 |
|
|
121
|
+
| primitives | net new ≤ 0 | `git diff --name-status HEAD \| grep -c '^A'` minus `git diff --name-status HEAD \| grep -c '^D'` | new_files − deleted_files ≤ 0 |
|
|
122
|
+
| schema | no fork | `do2-reconcile.sh substrate <touched_schema_files>` | exit 0 |
|
|
123
|
+
| surfaces | registered + linked | `do2-reconcile.sh navigation <touched_pages>` | exit 0 |
|
|
124
|
+
| docs | no broken link | `markdown-link-check <touched_docs>` | exit 0 |
|
|
125
|
+
|
|
126
|
+
```bash
|
|
127
|
+
# primitives net check
|
|
128
|
+
NEW_FILES=$(git diff --name-status HEAD | grep -c '^A')
|
|
129
|
+
DEL_FILES=$(git diff --name-status HEAD | grep -c '^D')
|
|
130
|
+
NET_PRIMITIVES=$((NEW_FILES - DEL_FILES))
|
|
131
|
+
# pass if NET_PRIMITIVES <= 0; warn if > 0 and justify with PRIMITIVE verdict from W2
|
|
132
|
+
|
|
133
|
+
# names ratchet
|
|
134
|
+
do2-reconcile.sh dictionary $(git diff --name-only HEAD | grep '\.md$' | tr '\n' ' ')
|
|
135
|
+
|
|
136
|
+
# schema fork check
|
|
137
|
+
SCHEMA_TOUCHED=$(git diff --name-only HEAD | grep '\.tql$' | tr '\n' ' ')
|
|
138
|
+
[ -n "$SCHEMA_TOUCHED" ] && do2-reconcile.sh substrate $SCHEMA_TOUCHED
|
|
139
|
+
|
|
140
|
+
# surfaces ratchet
|
|
141
|
+
PAGES_TOUCHED=$(git diff --name-only HEAD | grep -E 'src/pages/' | tr '\n' ' ')
|
|
142
|
+
[ -n "$PAGES_TOUCHED" ] && do2-reconcile.sh navigation $PAGES_TOUCHED
|
|
143
|
+
|
|
144
|
+
# broken links
|
|
145
|
+
DOCS_TOUCHED=$(git diff --name-only HEAD | grep '\.md$' | tr '\n' ' ')
|
|
146
|
+
[ -n "$DOCS_TOUCHED" ] && markdown-link-check $DOCS_TOUCHED
|
|
147
|
+
```
|
|
148
|
+
|
|
149
|
+
Any ratchet regression → cycle does not close. Report each dim: `ratchet/<dim>: ok | FAIL — <reason>`.
|
|
62
150
|
|
|
63
|
-
|
|
151
|
+
5. If deterministic checks, reconcile-per-canon, and coherence ratchet all pass → score the code rubric. Target is 1.0 on every dim.
|
|
64
152
|
Full KPIs, scoring bands, and improvement format are in `one/rubrics.md` — Code Rubric.
|
|
65
153
|
|
|
66
|
-
|
|
154
|
+
5.5. **Goal-fit (0.35 — the heaviest dim, hard gate ≥ 0.50):** re-read the cycle's
|
|
67
155
|
`Goal delta:` / plan `outcome:` and verify the shipped diff actually moves it. Confirm the
|
|
68
156
|
deliverable proof is present (curl / screenshot / log) and the `ux_after` journey is reachable.
|
|
69
157
|
Clean, fast, safe code that does NOT advance the goal scores low here and **cannot close** —
|
|
70
|
-
goal-fit < 0.50 fails the cycle regardless of the other four dims
|
|
71
|
-
|
|
158
|
+
goal-fit < 0.50 fails the cycle regardless of the other four dims, regardless of composite.
|
|
159
|
+
This is a hard gate: a cycle that produces clean code solving the wrong problem does not ship.
|
|
160
|
+
`→ improve:` names what the goal still needs.
|
|
72
161
|
|
|
73
|
-
|
|
162
|
+
6. **Security (0.20):** grep the diff for `/api[_-]?key|secret|password|token/i`, `eval(`,
|
|
74
163
|
`dangerouslySetInnerHTML`. Check every `src/pages/api/*.ts` route validates input with Zod
|
|
75
164
|
at the boundary. CF Worker env via `context.env` only. No wildcard CORS headers.
|
|
76
165
|
Score 1.0 = all greps return 0. For every gap, emit `→ improve: file:line — what`.
|
|
77
166
|
|
|
78
|
-
|
|
167
|
+
7. **Stability (0.20):** biome + tsc + vitest already ran. Now check: no new `any`, no
|
|
79
168
|
`@ts-ignore` without WHY comment, no silent returns (Rule 1), no wall-clock units in new
|
|
80
169
|
code or docs (Rule 2), no retired names `knowledge|connections|people|node|scent|alarm|
|
|
81
170
|
trail|colony`. Score 1.0 = all zero. For each gap, emit `→ improve: exact location`.
|
|
82
171
|
|
|
83
|
-
|
|
172
|
+
8. **Simplicity (0.15):** the philosophy is small, focused files. The substrate — the
|
|
84
173
|
entire schema + engine — is 200 lines total. Use that as your reference point.
|
|
85
174
|
|
|
86
175
|
```bash
|
|
@@ -109,7 +198,7 @@ The rubric is not a verdict; it is a map forward.
|
|
|
109
198
|
Score 1.0 = all files feel focused and single-purpose; functions tight; zero ceremony.
|
|
110
199
|
For each gap, name what to split or delete.
|
|
111
200
|
|
|
112
|
-
|
|
201
|
+
9. **Speed (0.10):** Three sub-checks, all must pass for 1.0.
|
|
113
202
|
|
|
114
203
|
**Lighthouse:** run against all pages derived from the file→route map. Target 100 on all
|
|
115
204
|
four categories. For each audit below 100, name it and the component responsible.
|
|
@@ -137,15 +226,17 @@ The rubric is not a verdict; it is a map forward.
|
|
|
137
226
|
Score 1.0 = all Lighthouse 100, bundle ≤ W0, agent lines ≤ W0, no context stuffing, cache ≥ 80%.
|
|
138
227
|
For each gap, name the audit, component, or file.
|
|
139
228
|
|
|
140
|
-
|
|
229
|
+
10. Composite = `0.35·goal-fit + 0.20·security + 0.20·stability + 0.15·simplicity + 0.10·speed`. Gate: composite ≥ 0.65 AND goal-fit ≥ 0.50 (hard).
|
|
141
230
|
|
|
142
|
-
|
|
231
|
+
11. Must-not checks (bypass composite — immediate warn):
|
|
143
232
|
- Hardcoded secret or API key → `warn(1)` on security, cycle fails.
|
|
144
233
|
- `eval()` or unsanitized `dangerouslySetInnerHTML` → `warn(1)`, cycle fails.
|
|
145
234
|
- Test failure on W3-touched files → `warn(1)` on stability, route to W3.5.
|
|
146
235
|
- Lighthouse any category drops > 5 pts from baseline → `warn(1)` on speed.
|
|
236
|
+
- Any reconcile-per-canon FAIL → cycle fails (same weight as tsc failure).
|
|
237
|
+
- Any coherence ratchet regression → cycle fails.
|
|
147
238
|
|
|
148
|
-
|
|
239
|
+
12. Cross-consistency checks from the TODO's verify checklist (doc terms match code identifiers,
|
|
149
240
|
no 404 links, no retired names leaked).
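The composite formula and the two-part gate from step 10 can be sketched as below. The `Rubric` type and function names are hypothetical; the weights and thresholds are taken from the text. The sketch does not model the must-not checks in step 11, which bypass the composite.

```typescript
// Hypothetical container for the five rubric dims, each scored 0.00 to 1.00.
type Rubric = {
  goalFit: number;
  security: number;
  stability: number;
  simplicity: number;
  speed: number;
};

// composite = 0.35·goal-fit + 0.20·security + 0.20·stability + 0.15·simplicity + 0.10·speed
function composite(r: Rubric): number {
  return (
    0.35 * r.goalFit +
    0.2 * r.security +
    0.2 * r.stability +
    0.15 * r.simplicity +
    0.1 * r.speed
  );
}

// Gate: composite ≥ 0.65 AND goal-fit ≥ 0.50 (hard).
function passesGate(r: Rubric): boolean {
  return composite(r) >= 0.65 && r.goalFit >= 0.5;
}

// Clean code that misses the goal: high composite, but goal-fit is below the hard gate.
const wrongProblem: Rubric = { goalFit: 0.4, security: 1, stability: 1, simplicity: 1, speed: 1 };
const closes = passesGate(wrongProblem);
```

In this example the composite is about 0.79, which clears 0.65, yet the cycle still fails because goal-fit is 0.40.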
|
|
150
241
|
|
|
151
242
|
---
|
|
@@ -283,7 +374,7 @@ git diff HEAD | grep -E '^\+' | grep -E 'define|match|insert' \
|
|
|
283
374
|
Zero hits across all greps = security score eligible for 1.0.
|
|
284
375
|
Each hit = `→ improve: file:line — what the pattern is`.
|
|
285
376
|
|
|
286
|
-
|
|
377
|
+
13. **Write improvement artifacts** — this is how the system learns.
|
|
287
378
|
|
|
288
379
|
```bash
|
|
289
380
|
# a) Machine-readable: feeds next cycle's W1 recon
|
|
@@ -336,7 +427,7 @@ the substrate — this file is structurally weak and should be prioritized in fu
|
|
|
336
427
|
}
|
|
337
428
|
```
|
|
338
429
|
|
|
339
|
-
|
|
430
|
+
14. Emit the completion signal.
|
|
340
431
|
|
|
341
432
|
## Known-flaky allowlist
|
|
342
433
|
|
|
@@ -358,6 +449,7 @@ Success:
|
|
|
358
449
|
"content": {
|
|
359
450
|
"passed": N, "failed": 0,
|
|
360
451
|
"rubric": {
|
|
452
|
+
"goal-fit": { "score": 0.90, "improve": "clean" },
|
|
361
453
|
"security": { "score": 0.95, "improve": "src/pages/api/provision.ts:31 — missing Zod parse on slug" },
|
|
362
454
|
"stability": { "score": 1.00, "improve": "clean" },
|
|
363
455
|
"simplicity": { "score": 0.85, "improve": "inline formatDate() at src/lib/slug.ts:12, saves 9 lines" },
|
|
@@ -367,6 +459,8 @@ Success:
|
|
|
367
459
|
"velocity": +0.06,
|
|
368
460
|
"buildMs": N,
|
|
369
461
|
"lighthouse": { "perf": 97, "a11y": 100, "bp": 100, "seo": 100 },
|
|
462
|
+
"reconcile": { "substrate": "ok", "dictionary": "ok", "types": "ok", "design": "ok", "navigation": "ok", "sdk": "ok", "authority": "ok" },
|
|
463
|
+
"ratchet": { "types": "ok", "names": "ok", "primitives": "ok", "schema": "ok", "surfaces": "ok", "docs": "ok" },
|
|
370
464
|
"improvements_file": ".w4-improvements.json"
|
|
371
465
|
}
|
|
372
466
|
}
|
|
@@ -384,6 +478,7 @@ Failure:
|
|
|
384
478
|
"passed": N, "failed": M,
|
|
385
479
|
"failures": ["<test name or tsc error>"],
|
|
386
480
|
"rubric": {
|
|
481
|
+
"goal-fit": { "score": 0.40, "improve": "diff does not advance plan outcome — goal still needs <X>" },
|
|
387
482
|
"security": { "score": 0.50, "improve": "src/pages/api/chat.ts:23 — missing Zod parse on body.slug" },
|
|
388
483
|
"stability": { "score": 0.00, "improve": "vitest: chat renders message FAILED — type mismatch line 14" },
|
|
389
484
|
"simplicity": { "score": 0.60, "improve": "parseMarkdown() 18 lines, one caller — inline and delete" },
|
|
@@ -391,6 +486,8 @@ Failure:
|
|
|
391
486
|
},
|
|
392
487
|
"composite": 0.34,
|
|
393
488
|
"velocity": -0.12,
|
|
489
|
+
"reconcile": { "navigation": "FAIL — src/pages/u/[slug]/new.astro not registered in manifest" },
|
|
490
|
+
"ratchet": { "surfaces": "FAIL — new page missing nav entry" },
|
|
394
491
|
"improvements_file": ".w4-improvements.json"
|
|
395
492
|
}
|
|
396
493
|
}
|
package/commands/do-autonomous.md
CHANGED
|
@@ -9,7 +9,7 @@ Loaded by `do.md` for empty invocation or `--once` flag.
|
|
|
9
9
|
```bash
|
|
10
10
|
W0: bun run verify (once per session, skip if already passed)
|
|
11
11
|
|
|
12
|
-
ORIENT: Read
|
|
12
|
+
ORIENT: Read text/TODO-plan.md
|
|
13
13
|
→ note the active front (Atomicity / Vocabulary / New Surfaces)
|
|
14
14
|
→ note the Top 15 priority list
|
|
15
15
|
→ let this shape which task you pick
|