@pilotspace/add 1.0.0 → 1.1.0

@@ -31,7 +31,10 @@ Run the tool to find the resume point instead of re-reading the repo:
31
31
  python3 .add/tooling/add.py status
32
32
  ```
33
33
 
34
- - **No `.add/` yet** → go to **phase 0 (setup)**: read `phases/0-setup.md`.
34
+ - **No `.add/state.json` yet** (a fresh install drops tooling + docs but does *not* init — so `status` says
35
+ `no .add/ project found`) → enter **autonomous setup**: YOU run init yourself —
36
+ `add.py init --name "<inferred>" --stage <picked> --await-lock` (don't tell the human to) — then read
37
+ `phases/0-setup.md` and draft the foundation + first scope + first contract through to the human lock-down.
35
38
  - **A task is active** → open `.add/tasks/<active>/TASK.md`, look at its `phase:`
36
39
  marker, and read the matching `phases/<n>-<phase>.md`. Work *only* that phase.
37
40
  - **No active task** → first SIZE the request (see Intake below), then create the
@@ -56,28 +59,51 @@ Load the phase guide **only for the phase you are in** (progressive disclosure):
56
59
 
57
60
  | Phase | Guide | Produces (TASK.md section) | Who leads |
58
61
  |-------|-------|----------------------------|-----------|
59
- | setup | `phases/0-setup.md` | `.add/` + survivor files | human |
60
- | specify | `phases/1-specify.md` | §1 rules + ranked least-sure flag | human + AI (co-specify) |
61
- | scenarios | `phases/2-scenarios.md` | §2 Given/When/Then | human |
62
- | contract | `phases/3-contract.md` | §3 frozen shape | human + AI |
63
- | tests | `phases/4-tests.md` | §4 + red suite in `tests/` | human sets, AI writes |
62
+ | setup | `phases/0-setup.md` | `.add/` + survivors + first §1–§3 + `SETUP-REVIEW.md` | AI drafts → **human locks** (the lock-down) |
63
+ | specify | `phases/1-specify.md` | §1 rules + ranked least-sure flag | AI drafts (co-specify) |
64
+ | scenarios | `phases/2-scenarios.md` | §2 Given/When/Then | AI drafts† |
65
+ | contract | `phases/3-contract.md` | §3 frozen shape | AI drafts → **human approves once** (the seam)† |
66
+ | tests | `phases/4-tests.md` | §4 + red suite in `tests/` | AI drafts† |
64
67
  | build | `phases/5-build.md` | code in `src/`, tests green | **AI** |
65
- | verify | `phases/6-verify.md` | §6 checks + gate record | **human** |
68
+ | verify | `phases/6-verify.md` | §6 checks + gate record | **AI auto-gates on evidence**; human on residue/security‡ |
66
69
  | observe | `phases/7-observe.md` | §7 spec delta | human + AI |
67
70
 
71
+ † **One-approval front (v7).** §1–§4 are drafted by the AI as a single bundle and frozen
72
+ together; the human gives **one approval, at the contract freeze** (the autonomy seam) — not
73
+ three separate sign-offs. The AI presents the bundle least-sure-first. See `run.md`.
74
+ ‡ **Verify auto-gate (v6–v7).** Under `autonomy: auto` (the default) a run may auto-PASS once
75
+ the evidence is complete (all tests green · loops dry · no residue) — recorded as *auto-resolved*,
76
+ an explicit PASS, not a skip. **Security always escalates** (HARD-STOP), as do concurrency /
77
+ architecture residue and `conservative` autonomy. See `run.md`.
78
+
79
+ Whenever you present a seam to the human in chat (intake · front approval · gate ·
80
+ milestone close), follow `report-template.md` — SUMMARY → DECISION → ⚠ FLAGS →
81
+ EVIDENCE → NEXT, engine-sourced facts, show-before-ask, never pre-stamp a seam.
82
+
68
83
  In **observe**, also emit **competency deltas** — learnings tagged by which of the five
69
84
  (`DDD · SDD · UDD · TDD · ADD`) they improve — so the foundation self-improves across loops.
70
85
  You write them as `open`; the human folds them into `PROJECT.md`. Read `deltas.md` for the
71
86
  grammar and the status lifecycle. At milestone close (or on demand), run the fold ritual that
72
87
  gathers confirmed deltas into a versioned foundation — read `fold.md`.
73
88
 
74
- ## The dynamic run (v6)
89
+ ## The dynamic run (v6–v7)
90
+
91
+ Once **§3 CONTRACT is FROZEN**, the build→verify half runs as a dynamic, auto-gated run —
92
+ fan-out + in-run convergence — instead of a manual build (`autonomy: auto` is the default; lower
93
+ to `conservative` to keep a human at the gate). Read `run.md` for the trigger, the touch-boundary,
94
+ the evidence auto-gate, and the autonomy dial. The human-led front still owns *direction*, but v7
95
+ compresses it to a **single approval at the contract seam**; the run never edits a frozen contract
96
+ and never auto-passes a security finding.
97
+
98
+ ## Parallel streams — pipelining independent tasks (opt-in)
75
99
 
76
- Once **§3 CONTRACT is FROZEN**, the build→verify half MAY run as a dynamic, auto-gated run
77
- (fan-out + in-run convergence) instead of a manual build. Read `run.md` for the trigger, the
78
- touch-boundary, the evidence auto-gate, and the autonomy dial. The human-led front
79
- (specify·scenarios·contract) is unchanged; the run never edits a frozen contract and never
80
- auto-passes a security finding.
100
+ The default is one task at a time. When a milestone has several tasks whose `deps=` are
101
+ already `PASS` and a human is ready to review, you MAY run them concurrently: read
102
+ `streams.md`. It changes no `add.py` code: you compute a READY-QUEUE from `status`,
103
+ spawn one worker per ready task (each in a worktree, building behind its own frozen
104
+ contract), and keep the human seams (front approval · escalated Verify) on one serial
105
+ REVIEW-QUEUE. The honest gain is pipelining (the reviewer never waits on a build), not
106
+ N× speed; the autonomy dial sets how much actually overlaps.
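The READY-QUEUE step described above can be sketched in a few lines. This is a minimal illustration, not code from `add.py`: the `tasks` mapping, its `deps` and `gate` keys, and the function name `ready_queue` are all hypothetical stand-ins for whatever `status` actually reports.

```python
from typing import Dict, List


def ready_queue(tasks: Dict[str, dict]) -> List[str]:
    """Return the slugs of unfinished tasks whose deps have all passed.

    `tasks` is a hypothetical shape: slug -> {"deps": [...], "gate": "PASS" | None}.
    """
    ready = []
    for slug, task in tasks.items():
        if task.get("gate") == "PASS":
            continue  # already done, nothing to schedule
        deps = task.get("deps", [])
        # A dep that is missing from the mapping is treated as not yet passed.
        if all(tasks.get(dep, {}).get("gate") == "PASS" for dep in deps):
            ready.append(slug)
    return sorted(ready)
```

Each slug in the result would get its own worker and worktree; the human seams still go through the single REVIEW-QUEUE.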
81
107
 
82
108
  ## Non-negotiable rules (from the method)
83
109
 
@@ -100,6 +126,7 @@ inside TASK.md):
100
126
  ```bash
101
127
  python3 .add/tooling/add.py advance # next phase of the active task
102
128
  python3 .add/tooling/add.py gate PASS # at verify: records PASS, marks done
129
+ python3 .add/tooling/add.py use <slug> # switch the active task (e.g. across parallel streams)
103
130
  ```
104
131
 
105
132
  ## Depth by stage
@@ -0,0 +1,65 @@
1
+ # Adopt — map an existing repo into the foundation (silent)
2
+
3
+ When ADD is pointed at a repo that already has code, onboarding is **silent**: the code
4
+ answers the questions a greenfield interview would ask, so you read it rather than ask.
5
+ This is the **brownfield path** of setup (the greenfield path keeps the 4-lens interview —
6
+ see `phases/0-setup.md`). You fill the survivor files from evidence, then stop at the one
7
+ human gate: the **lock-down** (`add.py lock`).
8
+
9
+ ## The signal — and arming the gate
10
+
11
+ Enter a brownfield repo with `--await-lock`:
12
+
13
+ ```bash
14
+ python3 .add/tooling/add.py init --await-lock
15
+ ```
16
+
17
+ `--await-lock` does two things. It seeds an **unlocked** setup, which *arms the lock-down gate*
18
+ — the engine then refuses a second task, crossing into build, and recording a gate until you
19
+ `lock`. And init, being brownfield-aware, prints a line that begins:
20
+
21
+ ```
22
+ brownfield: existing code detected — the `add` skill maps it into your foundation …
23
+ ```
24
+
25
+ That line is your cue to run this guide. **Always use `--await-lock` for brownfield onboarding**:
26
+ a plain `init` writes no setup and is grandfathered-locked, so its gate never arms *and* the
27
+ closing `lock` below would refuse with `already_locked`. The engine only *detects* the existing
28
+ code (a mechanical fact); it never reads or fills it — interpreting it is your job.
29
+
30
+ ## The silent mapping
31
+
32
+ Fill each survivor file in `.add/` from what the code actually shows — **ask nothing**:
33
+
34
+ | Survivor | Read it from |
35
+ |----------|--------------|
36
+ | `PROJECT.md` (foundation) | the domain nouns, entry points, the README, the first milestone the code implies |
37
+ | `CONVENTIONS.md` | the languages, folder layout, naming, lint config, error style already in the tree |
38
+ | `GLOSSARY.md` | the recurring names in modules, models, and public APIs (one name per concept) |
39
+ | `MODEL_REGISTRY.md` | leave the active model record; note any AI-authored code you can detect |
40
+ | `dependencies.allowlist` | the manifests already in the repo (package.json, pyproject, go.mod, …) |
41
+
42
+ Two rules that never bend:
43
+
44
+ 1. **Never clobber a survivor.** `init` already skips any survivor that exists; if a human
45
+ already wrote `PROJECT.md`, you READ it, you do not overwrite it. Add, never replace.
46
+ 2. **Tag every drafted decision `evidence-grounded` vs `guessed`.** A line you read from the
47
+ code is *evidence-grounded* (cite the file). A line you inferred because the code was silent
48
+ is *guessed*. The human's single lock-down is only honest if they can see which is which —
49
+ the guesses are what they actually need to check. (The tags feed `SETUP-REVIEW.md`.)
50
+
51
+ ## Where it ends — the lock-down
52
+
53
+ Brownfield onboarding draws no per-step approvals. You map the foundation, then draft the
54
+ first milestone's scope and the first task's candidate front exactly as greenfield does, and
55
+ present it all at **one** human gate. The human reviews the decisions (least-sure / `guessed`
56
+ first) and signs:
57
+
58
+ ```bash
59
+ python3 .add/tooling/add.py lock --by "<name>"
60
+ ```
61
+
62
+ `lock` freezes the foundation + scope + first contract in one atomic write and opens the build.
63
+ Until it is run, the engine refuses a second task, crossing into build, and recording a gate —
64
+ so nothing is built on an unreviewed map. That gate is the only thing brownfield onboarding asks
65
+ of a human; everything before it, you did from the code.
@@ -10,7 +10,7 @@ You (the AI) **emit** deltas as `open`. Only the **human** moves a delta to `fol
10
10
 
11
11
  ## The grammar (frozen)
12
12
 
13
- Each delta is ONE line, exactly:
13
+ Each delta begins on its own **tag line**; the learning may wrap onto continuation lines:
14
14
 
15
15
  ```
16
16
  - [<COMPETENCY> · <status>] <learning> (evidence: <pointer>)
@@ -18,10 +18,20 @@ Each delta is ONE line, exactly:
18
18
 
19
19
  - `<COMPETENCY>` — exactly one of the five (below).
20
20
  - `<status>` — `open` | `folded` | `rejected`. A **newly emitted delta is `open`**.
21
- - `<learning>` — the insight, in one phrase ("the domain model missed multi-tenancy").
21
+ - `<learning>` — the insight ("the domain model missed multi-tenancy"). It may run past one line;
22
+ the `- [COMPETENCY · status]` tag line must come **first**, and the `(evidence: …)` clause must
23
+ **close** the delta (on its last line).
22
24
  - `(evidence: …)` — **required**, non-empty: a failing scenario, a production signal, a review
23
25
  note. No evidence → it is an opinion, not a delta.
24
26
 
27
+ A long learning may wrap — the lint (`add.py check`) joins continuation lines, so this is **one**
28
+ delta, not two:
29
+
30
+ ```
31
+ - [SDD · open] the export endpoint must reject a tenant-scoped token used cross-tenant,
32
+ returning `forbidden` (not `not_found`) (evidence: scenario_cross_tenant_export failed)
33
+ ```
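To make the wrapped-delta rule concrete, here is a small parsing sketch. It is an assumption about how such a lint could work, not the `add.py check` implementation: the names `TAG`, `EVIDENCE`, `Delta`, and `parse_deltas` are hypothetical, and treating every non-blank line after a tag line as a continuation is a simplification.

```python
import re
from typing import List, NamedTuple

# Tag line: "- [<COMPETENCY> · <status>] <learning...>"
TAG = re.compile(r"^- \[(DDD|SDD|UDD|TDD|ADD) · (open|folded|rejected)\] (.+)$")
# The evidence clause must close the joined delta and must be non-empty.
EVIDENCE = re.compile(r"^(?P<learning>.+?)\s*\(evidence:\s*(?P<pointer>\S.*?)\)$")


class Delta(NamedTuple):
    competency: str
    status: str
    learning: str
    evidence: str


def parse_deltas(text: str) -> List[Delta]:
    """Join continuation lines onto their tag line, then split off the evidence."""
    deltas: List[Delta] = []
    current: List[str] = []

    def flush() -> None:
        if not current:
            return
        joined = " ".join(current)
        current.clear()
        tag = TAG.match(joined)
        body = EVIDENCE.match(tag.group(3))
        if body is None:
            raise ValueError(f"delta has no closing evidence clause: {joined!r}")
        deltas.append(Delta(tag.group(1), tag.group(2), body["learning"], body["pointer"]))

    for raw in text.splitlines():
        line = raw.strip()
        if TAG.match(line):
            flush()               # a new tag line closes the previous delta
            current.append(line)
        elif line and current:
            current.append(line)  # continuation of the delta in progress
        elif not line:
            flush()               # a blank line also closes it
    flush()
    return deltas
```

Fed the two-line example above, this yields one `Delta` whose learning ends in ``(not `not_found`)`` and whose evidence is `scenario_cross_tenant_export failed`.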
34
+
25
35
  ## The five competencies (pick exactly one per delta)
26
36
 
27
37
  | tag | competency | a delta here means you learned something about… |
@@ -1,35 +1,98 @@
1
- # Phase 0 — Setup (once per project)
1
+ # Phase 0 — Setup (autonomous draft → one human lock-down)
2
2
 
3
- Goal: make every later gate enforceable automatically. Do this once.
3
+ Goal: point ADD at a repo and **you** draft the whole foundation — domain, first-milestone scope,
4
+ and the first task's contract — then hand the human exactly one decision: the **lock-down**. Brownfield
5
+ is silent (the code answers the questions); greenfield keeps a short interview. Either way, the human's
6
+ only gate is `add.py lock`. This is the setup-altitude analog of a task's one-approval contract freeze.
4
7
 
5
- ## Do
8
+ ## 1 · Zero-touch entry — you run init yourself
6
9
 
7
- 1. Initialise the runtime (creates `.add/` + survivor-layer files):
10
+ When there is no `.add/state.json`, do **not** tell the human to initialise; run it yourself. Infer the
11
+ project name and stage from the repo, and **arm the lock-down gate** with `--await-lock`:
12
+
13
+ ```bash
14
+ python3 .add/tooling/add.py init --name "<inferred from repo/dir>" --stage <prototype|poc|mvp|production> --await-lock
15
+ ```
16
+
17
+ - `--await-lock` is **required** here: it seeds an *unlocked* setup, which arms the gate so the engine
18
+ refuses a second task / crossing into build / a `gate` until you `lock`. A plain `init` is
19
+ grandfathered-locked — its gate never arms, and the closing `lock` would error `already_locked`.
20
+ - name + stage are **your judgment** (read them from the dir name, README, manifests); the engine stays
21
+ mechanical. Pick the stage from the ambition you hear: throwaway → `prototype`, one risky slice → `poc`,
22
+ narrow-but-real → `mvp`, full rigor → `production`.
23
+
24
+ `init` prints one of two things — **that is your branch**:
25
+ - a line starting `brownfield:` → there is existing code (go to **2a**);
26
+ - the greenfield closing (no `brownfield:`) → an empty repo (go to **2b**).
27
+
28
+ ## 2a · Brownfield — map it silently
29
+
30
+ The code answers the questions a greenfield interview would ask, so **read it, don't ask**. Open
31
+ `adopt.md` and follow it: fill each survivor file from the code, never clobber an existing one, and tag
32
+ every decision `evidence-grounded` (cite the file) or `guessed`. Ask the human **nothing** at this step.
33
+
34
+ ## 2b · Greenfield — the 4-lens interview (kept): co-specify at foundation altitude
35
+
36
+ An empty repo has no code to read, so run the short interview. This is the **co-specify at foundation
37
+ altitude** move — the same diverge → converge → validate brainstorm a task's §1 uses (`phases/1-specify.md`),
38
+ lifted to the foundation. Ask the one load-bearing question per lens (diverge), draft the foundation
39
+ (converge), then rank what you're least sure of and show the top flag first (validate):
40
+
41
+ | Lens | The one question that unblocks the section |
42
+ |------|--------------------------------------------|
43
+ | Domain (DDD) | The 3–5 core nouns, and the one invariant that must NEVER break? |
44
+ | Spec (SDD) | The first milestone's outcome — and what's explicitly NOT in v1? |
45
+ | Users (UDD) | The primary user and the one job they hire this for? (or "no UI — surface is X") |
46
+ | Decisions | What's already decided that you'd regret re-litigating? (first Key Decision row) |
47
+
48
+ Ask only the live ones; skip what the request already answers. Rank your drafts least-sure-first using the
49
+ one notation every altitude shares — `⚠ <assumption> — least sure because <why>; if wrong: <cost>` — and
50
+ tag thin or inferred answers `guessed`.
51
+
52
+ ## 3 · Draft to the lock (both paths)
53
+
54
+ 1. **Fill the survivors** (they outlive all code): `.add/PROJECT.md` (the foundation — Domain · Spec/active
55
+ milestone · UI/UX · Key Decisions, one screen), `CONVENTIONS.md`, `GLOSSARY.md`, `MODEL_REGISTRY.md`,
56
+ `dependencies.allowlist`. Brownfield: from the code. Greenfield: from the interview, gaps flagged `guessed`.
57
+ 2. **Size the first milestone** (read `scope.md`) and draft its `MILESTONE.md` — goal · scope · exit criteria
58
+ · breadth-first tasks.
59
+ 3. **Create the first task and draft its candidate front.** `new-task` is allowed pre-lock:
8
60
  ```bash
9
- python3 .add/tooling/add.py init --name "<project>" --stage prototype
61
+ python3 .add/tooling/add.py new-task <slug> --title "<first feature>"
10
62
  ```
11
- If the tool isn't there yet, the installer (`npx @pilotspace/add init`) placed it at
12
- `.add/tooling/add.py`.
13
- 2. Fill the survivor-layer files (they outlive all code):
14
- - `.add/PROJECT.md` is **the foundation**: Domain (DDD) · Spec/Living-Document (SDD,
15
- active milestone) · UI/UX (UDD) · Key Decisions. Cross-milestone context the
16
- engine reads first. Keep it to one screen. Book: `docs/14-foundation.md`.
17
- - `.add/CONVENTIONS.md`: language, folders, naming, lint, error-code style, architecture.
18
- - `.add/GLOSSARY.md` — one name per concept; used in specs, contracts, and code.
19
- - `.add/MODEL_REGISTRY.md`: which AI model/version writes this project.
20
- - `.add/dependencies.allowlist` — packages the AI may use; CI rejects others.
21
- 3. Confirm CI runs green on the empty skeleton before the first feature.
63
+ Draft §1 (specify) · §2 (scenarios) · §3 (contract). **Leave §3 `Status: DRAFT`**: the lock is its
64
+ approval (see §5). You MAY `advance` through specify → scenarios → contract → tests pre-lock, but the
65
+ engine **refuses crossing into build** until you `lock` (`setup_unlocked`). Sequence: front → lock → build.
66
+ 4. **Write `.add/SETUP-REVIEW.md`** per `setup-review.md`: every decision you drafted (foundation, scope,
67
+ first contract), **least-sure-first**, each tagged `guessed` | `evidence-grounded`.
68
+
69
+ ## 4 · The one human gate: the lock-down
70
+
71
+ Present `SETUP-REVIEW.md` least-sure-first (the `guessed` rows are what the human must actually check). They
72
+ sign **once**:
73
+
74
+ ```bash
75
+ python3 .add/tooling/add.py lock --by "<name>"
76
+ ```
77
+
78
+ `lock` records the lock layers (foundation · scope · contract) in one atomic write and opens the build. It is
79
+ judgment-free — it does **not** parse `SETUP-REVIEW.md`; the human *reading* it is the review.
80
+
81
+ ## 5 · After the lock
82
+
83
+ - The lock **is** the first task's contract approval — the v7 one-approval-front and the lock-down collapse
84
+ into this single signature. Do **not** ask for a separate contract-freeze sign-off (that double-gates).
85
+ - Stamp the first task's §3 `Status: FROZEN @ v1` (lock-authorized), then read `phases/5-build.md` — build is
86
+ now open. Everything before this signature, you drafted.
22
87
 
23
88
  ## Exit gate
24
89
 
25
- - [ ] `.add/state.json` exists (`add.py status` works).
26
- - [ ] `.add/PROJECT.md` foundation filled (domain · spec · UI/UX).
27
- - [ ] CONVENTIONS, GLOSSARY, MODEL_REGISTRY, allowlist filled.
28
- - [ ] Pipeline green on the skeleton.
90
+ - [ ] `.add/state.json` exists; setup was seeded unlocked (`--await-lock`) then locked.
91
+ - [ ] Survivors filled (brownfield: from code, tagged evidence-grounded; greenfield: from the interview).
92
+ - [ ] First task created; §1–§3 drafted; `.add/SETUP-REVIEW.md` written least-sure-first.
93
+ - [ ] Human signed `add.py lock`; first task §3 `FROZEN @ v1`; build open.
29
94
 
30
95
  ## Next
31
96
 
32
- ```bash
33
- python3 .add/tooling/add.py new-task <slug> --title "<feature>"
34
- ```
35
- Then read `phases/1-specify.md`. · Book: `docs/10-setup-and-stages.md`.
97
+ After the lock, read `phases/5-build.md` (build is open). · Book: `docs/10-setup-and-stages.md`
98
+ *(note: book chapters 10 / 13 / 14 still describe the older human-led setup until `book-align` lands).*
@@ -20,6 +20,22 @@ whole bundle (§1–§4). Before asking for it, present the bundle **least-sure
20
20
  most likely wrong (`⚠ [spec|scenario|contract|test] … — because …; if wrong: …`) — aim the human's
21
21
  eye before they freeze. See `run.md`.
22
22
 
23
+ ## The freeze review checklist
24
+
25
+ The human's one minute, aimed. Walk these six before saying yes:
26
+
27
+ - **⚠ flags first** — read the least-sure flags; accept each knowing its cost if wrong.
28
+ - **Intent** — does §1 say what you actually want built (and is anything you expected missing)?
29
+ - **Cases** — does every Must and Reject have an observable §2 scenario you care about?
30
+ - **Shape** — glossary names, error codes, additive vs breaking: is THIS the shape to freeze?
31
+ - **Risk** — is this scope high-risk or method-defining? Then require
32
+ `risk: high · autonomy: conservative` in the TASK.md header — the engine refuses an unguarded completion.
33
+ - **Tests** — will §4 go red for the right reason, asserting behavior rather than internals?
34
+
35
+ This checklist AIMS the one approval — never a second gate, no sign-off forms, no
36
+ extra documents. Reject any line and the bundle goes back to draft; that is
37
+ backward-correction, not failure.
38
+
23
39
  ## AI prompt
24
40
 
25
41
  > Role: an interface architect; frozen contracts are immutable. Read §1, §2,
@@ -17,6 +17,20 @@ before code exists is testing nothing and will wave bad code through later.
17
17
  - Side-effect assertions on rejection paths (`assert balance unchanged`).
18
18
  - A recorded coverage target in §4.
19
19
 
20
+ ## Declaring where tests live
21
+
22
+ §4's `Tests live in:` line is machine-read: when a task has no local `tests/`,
23
+ `add.py report` counts test functions at the declared path(s) instead. The FIRST
24
+ line matching `Tests live in:` is read; paths are its backticked tokens.
25
+ Resolution: `./…` → this task's dir · a token containing `/` → the project root
26
+ (the parent of `.add/`) · a bare name → a sibling of the previous token's
27
+ directory (else the task dir). A directory token counts the `*.py` files directly
28
+ inside it (non-recursive); a `.py` file token counts itself; anything else is
29
+ ignored. Resolved files are deduped, and reports mark declared counts with `†`.
30
+ Paths are confined: anything resolving (symlinks followed)
31
+ outside the project root counts 0 — `..` traversal, absolute paths, and
32
+ symlink escapes are never read.
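The resolution rules above are dense, so here is one way to read them as code. This is a sketch under stated assumptions, not the `add.py report` implementation: the function names are hypothetical, "the previous token's directory" is taken to mean the directory that contains the previous token, and the sketch stops at listing the resolved files rather than counting test functions inside them.

```python
import re
from pathlib import Path
from typing import List, Optional


def _confined(path: Path, root: Path) -> bool:
    """True when the fully resolved path (symlinks followed) stays inside root."""
    try:
        path.resolve().relative_to(root)
        return True
    except ValueError:
        return False


def declared_test_files(section4: str, task_dir: Path, project_root: Path) -> List[Path]:
    """Resolve the backticked tokens of the FIRST 'Tests live in:' line."""
    root = project_root.resolve()
    marker = "Tests live in:"
    line = next((l for l in section4.splitlines() if marker in l), None)
    if line is None:
        return []
    tokens = re.findall(r"`([^`]+)`", line.split(marker, 1)[1])

    files: List[Path] = []
    prev_dir: Optional[Path] = None
    for token in tokens:
        if token.startswith("./"):
            candidate = task_dir / token[2:]             # relative to this task's dir
        elif "/" in token:
            candidate = project_root / token             # relative to the project root
        else:
            candidate = (prev_dir or task_dir) / token   # sibling of the previous token
        prev_dir = candidate.parent

        if not _confined(candidate, root):
            continue                                     # escapes the root: counts 0
        if candidate.is_dir():
            hits = sorted(p for p in candidate.glob("*.py") if p.is_file())
        elif candidate.is_file() and candidate.suffix == ".py":
            hits = [candidate]
        else:
            hits = []                                    # anything else is ignored
        for hit in hits:
            resolved = hit.resolve()
            if _confined(hit, root) and resolved not in files:
                files.append(resolved)                   # deduped, first-seen order
    return files
```

Note the order of the first two checks: a `./` token also contains `/`, so it has to be tested first. An absolute token replaces the root when joined with `pathlib`, and is then dropped by the confinement check.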
33
+
20
34
  ## AI prompt
21
35
 
22
36
  > Role: a test author who writes tests before code. Read §2 and §3. Turn each
@@ -36,3 +36,6 @@ change request back to Specify. Honor the feature-specific safety rule named in
36
36
 
37
37
  `python3 .add/tooling/add.py advance` → read `phases/6-verify.md`.
38
38
  Book: `docs/07-step-5-build.md`.
39
+
40
+ > Under `autonomy: auto` (the default) Build and Verify run together as one dynamic,
41
+ > evidence-auto-gated run — not two manual stops. See `run.md`.
@@ -1,8 +1,16 @@
1
1
  # Phase 6 — Verify (evidence + blind-spot checks)
2
2
 
3
3
  Goal: establish trust and record an outcome. Passing tests are necessary, not
4
- sufficient. This phase is **human-led**; there is no AI role. Fill **§6** in
5
- TASK.md including the GATE RECORD.
4
+ sufficient. Fill **§6** in TASK.md including the GATE RECORD.
5
+
6
+ > **Who resolves this gate depends on the `autonomy:` header (see `run.md`).**
7
+ > Under `autonomy: auto` (the default) a run auto-PASSes once the evidence is
8
+ > complete — every test green, the convergence loops dry, and **no residue**
9
+ > (security · concurrency · architecture) — recording it as *auto-resolved* with
10
+ > the named run as accountable owner: an explicit PASS, not a skip. **Security is
11
+ > always a HARD-STOP and is never auto-passed.** Under `autonomy: conservative`,
12
+ > or whenever residue is found, this phase is **human-led** and the checks below
13
+ > are the human's.
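The gate rule in the note above reduces to a short decision function. The sketch below is illustrative only: the `Evidence` fields and the returned labels (`HARD-STOP`, `RETURN-TO-BUILD`, `HUMAN`, `AUTO-PASS`) are hypothetical names chosen for this example, not values `add.py` is known to emit.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Evidence:
    tests_green: bool
    loops_dry: bool
    # Hypothetical residue labels: "security", "concurrency", "architecture".
    residue: List[str] = field(default_factory=list)


def resolve_gate(autonomy: str, evidence: Evidence) -> str:
    """Decide who resolves the verify gate, following the rule in the note."""
    if "security" in evidence.residue:
        return "HARD-STOP"          # security is never auto-passed
    if not (evidence.tests_green and evidence.loops_dry):
        return "RETURN-TO-BUILD"    # incomplete evidence: nothing to verify yet
    if autonomy != "auto" or evidence.residue:
        return "HUMAN"              # conservative dial, or any other residue
    return "AUTO-PASS"              # recorded as auto-resolved, an explicit PASS
```

The security check comes first on purpose, so that no combination of dial and evidence can route around it.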
6
14
 
7
15
  ## Part one — confirm the evidence
8
16
 
@@ -18,6 +26,9 @@ If any is false, stop and return to Build — there is nothing to verify yet.
18
26
  and miss races.) This is usually the single most important check.
19
27
  - **Security** — exposed secrets, injection openings, unexpected/invented
20
28
  dependencies. A security finding is always `HARD-STOP`, never a waiver.
29
+ Writing ANY note on this line means the gate escalates to the human — and
30
+ start it with `NOTE` or `⚠` so `add.py audit` can see it: a marked security
31
+ note reviewed by the auto-gate is an audit finding (`unescalated_security_note`).
21
32
  - **Architecture** — does it respect layering/dependency rules in CONVENTIONS.md?
22
33
 
23
34
  ## Record exactly one outcome (no silent pass)
@@ -30,7 +41,8 @@ If any is false, stop and return to Build — there is nothing to verify yet.
30
41
 
31
42
  ## Exit gate / Next
32
43
 
33
- - [ ] Evidence confirmed, blind-spots checked, a person approved, outcome recorded.
44
+ - [ ] Evidence confirmed, blind-spots checked, outcome recorded — a person approved, or
45
+ (under `autonomy: auto` with no residue) the run auto-resolved as the accountable owner.
34
46
 
35
47
  ```bash
36
48
  python3 .add/tooling/add.py gate PASS # marks the task done
@@ -0,0 +1,48 @@
1
+ # Chat reports — the seam template (for the AI, not for add.py)
2
+
3
+ The engine renders artifacts (`report`, `report --decide`, `status`); this file
4
+ governs the CHAT MESSAGE you wrap around them. The digest is the artifact BEHIND
5
+ your presentation, never a replacement for it — and your prose is never a
6
+ replacement for the digest.
7
+
8
+ Use it every time you report at or near a decision seam: an intake proposal, a
9
+ bundle/front approval, a verify gate, a task completion, a milestone close.
10
+
11
+ ## The five blocks, in order
12
+
13
+ ```
14
+ SUMMARY one line: intent + target + where we are
15
+ DECISION what you need from the human (or "none — FYI")
16
+ ⚠ FLAGS least-sure first, why + cost-if-wrong
17
+ EVIDENCE small table: tests · gates · parity · check — engine-sourced
18
+ NEXT the single next action + what it unlocks
19
+ ```
20
+
21
+ 1. **SUMMARY** — one line carrying intent + target + position, e.g.
22
+ "v13 task 2/3 — tests-declared-fallback is green, gate PASS." The reader
23
+ knows where they are before they read anything else.
24
+ 2. **DECISION** — the question the human must answer, stated plainly; exactly
25
+ one decision per report, or an explicit "none — FYI". If a decision exists,
26
+ ask it AFTER everything below has been shown (show-before-ask).
27
+ 3. **⚠ FLAGS** — least-sure first, each with *why* it is least sure and the
28
+ *cost if wrong*. Where TASK.md markers exist (`⚠` / `- [~]` / `- [ ]`),
29
+ quote them verbatim and keep their document order — extraction ≠ judgment.
30
+ 4. **EVIDENCE** — engine-sourced facts pasted from `add.py` output, never
31
+ re-typed from memory. If your prose and the engine disagree, the engine
32
+ wins: fix the engine or the data, not the sentence.
33
+ 5. **NEXT** — one action and what it unlocks. Mirror the rollup's DECIDE NEXT
34
+ line when it is right; overrule it only with a stated reason (e.g. planned
35
+ tasks the state file cannot see yet).
36
+
37
+ ## Hard rules
38
+
39
+ - **Summary-first.** Never bury the decision under a task list or a diff.
40
+ - **Show before ask.** Render the artifact (digest · diff · report) before any
41
+ approval question; the human decides on what they can see.
42
+ - **Never pre-stamp a human seam.** Freeze / gate / lock fields stay DRAFT or
43
+ blank until the answer returns: show → ask → stamp → advance. An artifact
44
+ must never claim an approval that has not happened.
45
+ - **One report per seam.** After an approval, point at the frozen artifact —
46
+ do not re-render the whole bundle.
47
+ - **Honest scope.** "Done" means the request, not the last task: report
48
+ "task 2/3", never "done" while approved scope remains.
package/skill/add/run.md CHANGED
@@ -28,7 +28,8 @@ then builds against and self-gate the result — the circular trust v6's dogfood
28
28
  What the human is actually approving in that one gate: that the drafted Spec captures the real intent,
29
29
  that the Scenarios cover the cases that matter, and that the Contract shape is the one to freeze. Reject
30
30
  any part and the bundle goes back to draft — that is backward-correction (principle 4), not failure.
31
- Approve, and the run begins.
31
+ Approve, and the run begins. The seam guide (`phases/3-contract.md`) carries the
32
+ **freeze review checklist** — six lines that walk the human through exactly this, ⚠-first.
32
33
 
33
34
  **The least-sure flag — aiming the one approval.** A single approval over a whole bundle invites a
34
35
  rubber stamp. So the AI presents the bundle **least-sure first**: of everything it is asking the human
@@ -148,5 +149,12 @@ closes the v6 dogfood blind-spot, where the whole milestone ran at `auto` on the
148
149
  scope (defining the method) with no friction. The default is `auto` *for ordinary, well-tested scope*;
149
150
  high risk still earns a human gate.
150
151
 
151
- The dial is a **rubric convention** read by the human and the run; it is **not an `add.py` flag** (the
152
- engine stays judgment-free); the level lives in the `TASK.md` header where the run already reads.
152
+ Judging *what* is high-risk stays human: the scope declares **`risk: high`** in the same `TASK.md`
153
+ header where the dial lives, reviewed at the freeze like every header line (the engine never
154
+ classifies scope). **Since v14 the guard is mechanical for the declared case:**
155
+ the engine refuses the declared combination — `add.py gate` will not complete (`PASS`/`RISK-ACCEPTED`) a task whose header
156
+ carries `risk: high` without `autonomy: conservative` (error `unguarded_high_risk_auto`; `HARD-STOP`
157
+ always records — stopping is never blocked), and `add.py audit` flags the same code on a finished
158
+ record whose header was tampered with or whose GATE RECORD reviewer is the auto-gate, which CI enforces
159
+ (audit-ci). The honest limit mirrors the audit's: an **undeclared** high-risk scope passes; declaring
160
+ is the human seam, the engine enforces what was declared.
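The declared-case guard can be pictured as a small check on the header. This is a sketch of the stated rule, not the engine's code: the header parsing, the function name `gate_allowed`, and the tuple return shape are assumptions; only the error code `unguarded_high_risk_auto` and the outcome names come from the text above.

```python
import re
from typing import Optional, Tuple


def gate_allowed(header: str, outcome: str) -> Tuple[bool, Optional[str]]:
    """Return (ok, error_code) for recording `outcome` against a TASK.md header."""
    # Hypothetical parsing: pick "risk: <x>" and "autonomy: <y>" out of the header text.
    fields = dict(re.findall(r"\b(risk|autonomy):\s*([A-Za-z-]+)", header))
    if outcome == "HARD-STOP":
        return True, None  # stopping is never blocked
    if fields.get("risk") == "high" and fields.get("autonomy") != "conservative":
        return False, "unguarded_high_risk_auto"
    return True, None
```

A header with no `risk:` line always passes this check, which is the honest limit the text names: an undeclared high-risk scope is invisible to it.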
@@ -20,6 +20,24 @@ scope drafting honors intake's classification — it never re-sizes a request:
20
20
  means one drafting pass, NOT auto-creation. Nothing is written to disk — single draft or the
21
21
  whole batch — until the human confirms. You propose; you wait.
22
22
 
23
+ ## Brainstorm before you draft — co-specify at milestone altitude
24
+
25
+ Don't draft a MILESTONE.md from thin input. Run the same three-move co-specify as a
26
+ task's §1 (`phases/1-specify.md`) — Diverge (framings + open questions) → Converge
27
+ (draft + rank) → Validate (show flags first) — raised to milestone scope. Ask only
28
+ what moves the goal, the In/Out line, or the task list; skip what PROJECT.md settles.
29
+ Draft the WHOLE milestone before showing; nothing hits disk until the human confirms.
30
+
31
+ Diverge seeds (pick the live ones):
32
+ - **Outcome** — done means a user can do *what* they can't today? (goal sentence)
33
+ - **Edge of scope** — nearest thing assumed IN that you want OUT? (Out list)
34
+ - **Riskiest seam** — which contract, if wrong, costs the most rework? (freeze-first)
35
+ - **Done-looks-like** — how do we SEE each outcome without reading code? (exit criteria)
36
+ - **First slice** — which task unblocks the rest? (breadth-first order)
37
+
38
+ Rank assumptions least-sure first; the top 1–2 get the flag the human reads at confirm:
39
+ `⚠ <assumption> — least sure because <why>; if wrong: <cost>`.
40
+
23
41
  ## Drafting a good MILESTONE.md (section by section)
24
42
 
25
43
  - **goal** — ONE sentence, an outcome not an output ("a user can size any request", not "write
@@ -0,0 +1,62 @@
1
+ # Setup review — the one page the human signs
2
+
3
+ Autonomous setup ends at a single human gate: the **lock-down** (`add.py lock`). Before that
4
+ signature is honest, the human needs to see *what you drafted and how sure you were* — not re-derive
5
+ it. `SETUP-REVIEW.md` is that page: every decision you made while drafting the foundation, first-scope,
6
+ and the first contract, **ordered least-sure-first** so the riskiest guesses meet their eye first.
7
+
8
+ This is the setup-altitude analog of presenting a task's front least-sure-first at the contract freeze.
9
+ The engine never reads this file — `add.py lock` is judgment-free, the signature *is* the gate (see
10
+ `setup-lock-state`). The human **reading** this page is the review; your job is to make the reading honest.
11
+
12
+ ## Where it lives
13
+
14
+ Write **one** artifact at `.add/SETUP-REVIEW.md`. **Never clobber a human-edited one** — if it already
15
+ exists with hand edits, append/update, don't overwrite (the same non-clobber rule `init` applies to
16
+ survivors). It is a per-onboarding, setup-altitude artifact; it sits beside `PROJECT.md`, not under a task.
17
+
18
+ ## The template
19
+
20
+ ```markdown
21
+ # SETUP REVIEW — <project>
22
+
23
+ <stage> · <brownfield | greenfield> · drafted by <model> @ <date>
24
+
25
+ | # | Decision | Lands in | Tag | Why / Evidence |
26
+ |---|----------|----------|-----|----------------|
27
+ | 1 | <the drafted decision> | PROJECT.md \| scope \| first-contract | `guessed` | <the inference + why you had to guess> |
28
+ | 2 | <…> | <…> | `evidence-grounded` | <cite the source file/line you read it from> |
29
+
30
+ Sign: reviewed the above → `add.py lock --by "<name>"`
31
+ ```
32
+
33
+ Rows are numbered for reference at the gate ("row 1 is the one I'm least sure about").
34
+
35
+ ## The two rules that make it honest
36
+
37
+ 1. **Least-sure-first.** Order rows by confidence **ascending**. A `guessed` row always floats above an
38
+ `evidence-grounded` one. The point is not completeness theatre — it is to spend the human's attention
39
+ where it changes outcomes: the top of the table is the part they actually need to challenge.
40
+
41
+ 2. **Every row is tagged — `guessed` or `evidence-grounded`.**
42
+ - `evidence-grounded` — you read it from the code/repo. **Cite the file** (e.g. `pyproject.toml`,
43
+ `src/orders/models.py`). Brownfield onboarding (see `adopt.md`) is mostly these.
44
+ - `guessed` — the repo was silent, so you inferred it. **State the inference and why.** Thin-greenfield
45
+ onboarding (a near-empty repo, only the 4-lens answers) produces these. These are what the human
46
+ must check; that is why they sit on top.
47
+
48
+ The tag vocabulary is shared with `adopt.md` — the brownfield map tags each filled survivor decision
49
+ `guessed`/`evidence-grounded`, and those tags flow straight into this table.
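The two rules combine into a simple sort key. The sketch below is only an illustration: the `Row` fields, the numeric `confidence` self-estimate, and the function names are hypothetical, since the source fixes the ordering and the tag vocabulary but not how confidence is measured.

```python
from typing import List, NamedTuple

TAGS = ("guessed", "evidence-grounded")


class Row(NamedTuple):
    decision: str
    lands_in: str
    tag: str           # one of TAGS
    why: str
    confidence: float  # hypothetical 0..1 self-estimate, lower = less sure


def order_rows(rows: List[Row]) -> List[Row]:
    """Least-sure-first: every `guessed` row floats above every `evidence-grounded` one."""
    for row in rows:
        if row.tag not in TAGS:
            raise ValueError(f"untagged or unknown tag: {row.tag!r}")
    return sorted(rows, key=lambda r: (r.tag != "guessed", r.confidence))


def render_table(rows: List[Row]) -> str:
    """Render the ordered rows in the template's table layout."""
    lines = [
        "| # | Decision | Lands in | Tag | Why / Evidence |",
        "|---|----------|----------|-----|----------------|",
    ]
    for number, row in enumerate(order_rows(rows), start=1):
        lines.append(
            f"| {number} | {row.decision} | {row.lands_in} | `{row.tag}` | {row.why} |"
        )
    return "\n".join(lines)
```

Sorting on the tag before the confidence keeps a confidently held guess above a shaky but cited fact, which matches the rule that a `guessed` row always floats on top.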
50
+
51
+ ## Where it ends
52
+
53
+ `SETUP-REVIEW.md` is **read-only context** for the lock-down. You do not ask the human to approve it
54
+ field-by-field; you present it, least-sure-first, and they sign once:
55
+
56
+ ```bash
57
+ python3 .add/tooling/add.py lock --by "<name>"
58
+ ```
59
+
60
+ `lock` records the lock layers and opens the build — it does **not** parse or validate this file (the
61
+ engine stays judgment-free). The review lives in the human's reading of the page, not in the tool. Make
62
+ the top of the table the truth they most need, and the one signature is informed.