@pilotspace/add 1.1.0 → 1.3.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/CHANGELOG.md +81 -0
- package/GETTING-STARTED.md +187 -139
- package/README.md +13 -7
- package/bin/cli.js +96 -5
- package/docs/01-principles.md +3 -3
- package/docs/02-the-flow.md +19 -12
- package/docs/03-step-1-specify.md +15 -13
- package/docs/04-step-2-scenarios.md +2 -2
- package/docs/05-step-3-contract.md +3 -3
- package/docs/06-step-4-tests.md +10 -2
- package/docs/07-step-5-build.md +3 -1
- package/docs/08-step-6-verify.md +25 -5
- package/docs/09-the-loop.md +12 -6
- package/docs/10-setup-and-stages.md +27 -13
- package/docs/11-governance.md +6 -2
- package/docs/12-roles.md +3 -3
- package/docs/13-adoption.md +1 -1
- package/docs/14-foundation.md +15 -15
- package/docs/15-foundations-and-lineage.md +106 -0
- package/docs/README.md +4 -0
- package/docs/appendix-a-templates.md +3 -3
- package/docs/appendix-b-prompts.md +40 -5
- package/docs/appendix-c-glossary.md +49 -12
- package/docs/appendix-d-worked-example.md +2 -2
- package/docs/appendix-e-checklists.md +16 -4
- package/docs/appendix-f-requirements-matrix.md +8 -8
- package/docs/appendix-g-references.md +106 -0
- package/package.json +1 -1
- package/skill/add/SKILL.md +41 -38
- package/skill/add/adopt.md +13 -11
- package/skill/add/deltas.md +8 -6
- package/skill/add/fold.md +19 -17
- package/skill/add/graduate.md +74 -0
- package/skill/add/intake.md +22 -7
- package/skill/add/loop.md +59 -0
- package/skill/add/phases/0-ground.md +66 -0
- package/skill/add/phases/0-setup.md +32 -25
- package/skill/add/phases/1-specify.md +28 -13
- package/skill/add/phases/2-scenarios.md +14 -4
- package/skill/add/phases/3-contract.md +27 -12
- package/skill/add/phases/4-tests.md +15 -5
- package/skill/add/phases/5-build.md +33 -4
- package/skill/add/phases/6-verify.md +40 -2
- package/skill/add/phases/7-observe.md +13 -5
- package/skill/add/report-template.md +65 -7
- package/skill/add/run.md +93 -39
- package/skill/add/scope.md +10 -6
- package/skill/add/setup-review.md +13 -10
- package/skill/add/streams.md +88 -23
- package/tooling/add.py +1817 -90
- package/tooling/templates/CONVENTIONS.md.tmpl +1 -1
- package/tooling/templates/DESIGN.md.tmpl +66 -0
- package/tooling/templates/GLOSSARY.md.tmpl +29 -0
- package/tooling/templates/MILESTONE.md.tmpl +1 -0
- package/tooling/templates/PROJECT.md.tmpl +6 -3
- package/tooling/templates/TASK.md.tmpl +55 -15
- package/tooling/templates/catalog.sample.json +38 -0
- package/tooling/templates/prototype.sample.json +48 -0
- package/tooling/templates/tokens.sample.json +55 -0
- package/tooling/templates/udd-catalog.md +122 -0
- package/tooling/templates/udd-tokens.md +79 -0
|
@@ -10,6 +10,21 @@ Pick ONE task-sized slice, restate the tests it must satisfy, implement, run
|
|
|
10
10
|
tests, iterate to green. Keep each batch small enough to review in full — you
|
|
11
11
|
cannot move faster than you can verify.
|
|
12
12
|
|
|
13
|
+
## Declaring the scope of impact (Scope + Strategy)
|
|
14
|
+
|
|
15
|
+
§5 of TASK.md opens with two declarations, drafted WITH the specification bundle
|
|
16
|
+
and frozen by the one §3 approval — never invented mid-build:
|
|
17
|
+
|
|
18
|
+
- **Scope (may touch)** — the allowlist of every file the build may write
|
|
19
|
+
(backticked tokens; grammar in the template comment). During build, needing a
|
|
20
|
+
file outside the declared Scope is a **STOP → change request** back to Specify,
|
|
21
|
+
never improvisation.
|
|
22
|
+
- **Strategy (ordered batches)** — the planned build order. Guidance, not
|
|
23
|
+
enforced: it aims the small-batches loop, it does not gate it.
|
|
24
|
+
|
|
25
|
+
Deferral, named: the engine gate (touched ⊆ declared) lands in the
|
|
26
|
+
`scope-gate-enforce` task — until it ships this section is prose discipline.
|
|
27
|
+
|
|
13
28
|
## The cardinal rule
|
|
14
29
|
|
|
15
30
|
**Never weaken or delete a test to make it pass, and never edit the frozen
|
|
@@ -19,18 +34,26 @@ change request back to Specify. Honor the feature-specific safety rule named in
|
|
|
19
34
|
|
|
20
35
|
## AI prompt
|
|
21
36
|
|
|
22
|
-
>
|
|
23
|
-
|
|
24
|
-
|
|
25
|
-
|
|
37
|
+
<prompt>
|
|
38
|
+
Role: implement the feature so EVERY failing test passes — the build phase.
|
|
39
|
+
Read first: §1 · §3 · §4 · CONVENTIONS.
|
|
40
|
+
Objective: every §4 test green, one small batch at a time.
|
|
41
|
+
Steps:
|
|
42
|
+
1. Make EVERY failing test pass, one small batch at a time, honoring the §5 safety rule.
|
|
43
|
+
2. Report which tests pass and exactly what changed.
|
|
44
|
+
Never: change a test or the contract; use a package off the allow-list; or push past something unclear instead of asking.
|
|
45
|
+
</prompt>
|
|
26
46
|
|
|
27
47
|
## Exit gate
|
|
28
48
|
|
|
49
|
+
<exit_gate>
|
|
29
50
|
- [ ] All tests pass.
|
|
30
51
|
- [ ] Coverage did not decrease.
|
|
31
52
|
- [ ] No test and no contract modified by the AI.
|
|
32
53
|
- [ ] No dependency outside the allow-list.
|
|
54
|
+
- [ ] No file outside the declared §5 Scope was touched.
|
|
33
55
|
- [ ] Change small enough to review in full.
|
|
56
|
+
</exit_gate>
|
|
34
57
|
|
|
35
58
|
## Next
|
|
36
59
|
|
|
@@ -39,3 +62,9 @@ Book: `docs/07-step-5-build.md`.
|
|
|
39
62
|
|
|
40
63
|
> Under `autonomy: auto` (the default) Build and Verify run together as one dynamic,
|
|
41
64
|
> evidence-auto-gated run — not two manual stops. See `run.md`.
|
|
65
|
+
>
|
|
66
|
+
> **Honest redo.** If the verify gate finds a confirmed cheat (a tamper, or a reported
|
|
67
|
+
> earned-green failure), the task returns HERE for an honest redo — revert the tampered
|
|
68
|
+
> file or de-overfit src, then advance again. This is the bounded self-heal loop (`run.md`),
|
|
69
|
+
> capped: after the cap a confirmed cheat HARD-STOPs to the human. Never weaken a test or
|
|
70
|
+
> edit the frozen contract to pass.
|
|
@@ -1,4 +1,4 @@
|
|
|
1
|
-
# Phase 6 — Verify (evidence +
|
|
1
|
+
# Phase 6 — Verify (evidence + non-functional review)
|
|
2
2
|
|
|
3
3
|
Goal: establish trust and record an outcome. Passing tests are necessary, not
|
|
4
4
|
sufficient. Fill **§6** in TASK.md including the GATE RECORD.
|
|
@@ -31,8 +31,44 @@ If any is false, stop and return to Build — there is nothing to verify yet.
|
|
|
31
31
|
note reviewed by the auto-gate is an audit finding (`unescalated_security_note`).
|
|
32
32
|
- **Architecture** — does it respect layering/dependency rules in CONVENTIONS.md?
|
|
33
33
|
|
|
34
|
+
## Part three — the deep check (do not skim)
|
|
35
|
+
|
|
36
|
+
Green tests prove behavior on the inputs you thought of. They do not prove the change
|
|
37
|
+
is *wired in*, nor that you did not leave a dead end behind — and for a non-coding change
|
|
38
|
+
they prove nothing about whether you actually *read* the thing you signed off. So one more
|
|
39
|
+
requirement, every gate:
|
|
40
|
+
|
|
41
|
+
Deep check — do not skim. If the task produced code, record that every new symbol is
|
|
42
|
+
referenced (wiring) and that no new dead/unused code was introduced. If it produced prose
|
|
43
|
+
or non-code, record a semantic read — what you read in full and what it confirmed. Which
|
|
44
|
+
path applies is the resolver's judgement; the engine never classifies.
|
|
45
|
+
|
|
46
|
+
Record it in the §6 **Deep checks** block — where each new symbol is called (a reference
|
|
47
|
+
search), the dead-code scan result, or the prose you read in full and what it confirmed.
|
|
48
|
+
An unfilled Deep checks block is a **shallow verify**, not a PASS.
|
|
49
|
+
|
|
50
|
+
## Part four — was the green earned?
|
|
51
|
+
|
|
52
|
+
A green suite proves the tests pass — not that the build EARNED them. Three judgment cheats
|
|
53
|
+
pass the unchanged suite without earning it: src overfit to the test fixtures (special-cased
|
|
54
|
+
to the literal inputs, not the general behavior §1 asked for), vacuous asserts (tautological —
|
|
55
|
+
green even against an empty implementation), and real logic stubbed away (the function returns
|
|
56
|
+
a constant the tests happen to accept). These cheats are invisible to the mechanical tamper
|
|
57
|
+
tripwire, which only sees edited files. Score them with an adversarial refute-read: an
|
|
58
|
+
independent reviewer — a subagent under `autonomy: auto` is recommended, the engine never
|
|
59
|
+
spawns one — prompted to argue the green was NOT earned from outside the build context. This
|
|
60
|
+
is the verify-gate, whole-suite specialization of run.md's adversarial verify (see run.md), not
|
|
61
|
+
a new discipline. A confirmed earned-green failure is HARD-STOP-class: never auto-passed, never
|
|
62
|
+
RISK-ACCEPTED — but a first cheat is a chance to redo: a confirmed cheat (mechanical tamper or a
|
|
63
|
+
reported earned-green failure) enters the bounded self-heal loop — it returns to build for an honest
|
|
64
|
+
redo, and only after the loop's cap does it HARD-STOP to the human (the loop lives in run.md).
|
|
65
|
+
|
|
34
66
|
## Record exactly one outcome (no silent pass)
|
|
35
67
|
|
|
68
|
+
When you present this gate to the human, open with the ARC (goal · done · plan) per
|
|
69
|
+
`report-template.md`, and reconcile its FLAGS with `add.py report --decide`'s open-item count
|
|
70
|
+
before the ask — per that file's reconcile rule (verify is where a flag-vs-digest mismatch bites).
|
|
71
|
+
|
|
36
72
|
| Outcome | When |
|
|
37
73
|
|---------|------|
|
|
38
74
|
| `PASS` | all checks met |
|
|
@@ -41,8 +77,10 @@ If any is false, stop and return to Build — there is nothing to verify yet.
|
|
|
41
77
|
|
|
42
78
|
## Exit gate / Next
|
|
43
79
|
|
|
44
|
-
|
|
80
|
+
<exit_gate>
|
|
81
|
+
- [ ] Evidence confirmed, non-functional risks checked, outcome recorded — a person approved, or
|
|
45
82
|
(under `autonomy: auto` with no residue) the run auto-resolved as the accountable owner.
|
|
83
|
+
</exit_gate>
|
|
46
84
|
|
|
47
85
|
```bash
|
|
48
86
|
python3 .add/tooling/add.py gate PASS # marks the task done
|
|
@@ -6,7 +6,7 @@ about the feature finally appears. Fill **§7** in TASK.md.
|
|
|
6
6
|
|
|
7
7
|
## Do
|
|
8
8
|
|
|
9
|
-
1. **Release behind a
|
|
9
|
+
1. **Release behind a scope-of-impact limit** — feature flag and/or gradual rollout.
|
|
10
10
|
2. **Reuse scenarios as monitors** — the §2 scenarios that defined "correct" now
|
|
11
11
|
define what you alert on: overall error rate, each rejection's rate (a spike in
|
|
12
12
|
one is a signal), latency of the risky operation under load.
|
|
@@ -15,16 +15,24 @@ about the feature finally appears. Fill **§7** in TASK.md.
|
|
|
15
15
|
|
|
16
16
|
## AI prompt
|
|
17
17
|
|
|
18
|
-
>
|
|
19
|
-
|
|
20
|
-
|
|
21
|
-
|
|
18
|
+
<prompt>
|
|
19
|
+
Role: a reliability analyst feeding the next cycle.
|
|
20
|
+
Read first: telemetry · objectives · incidents.
|
|
21
|
+
Objective: turn what production shows into the next SPEC delta.
|
|
22
|
+
Steps:
|
|
23
|
+
1. Report error-budget burn.
|
|
24
|
+
2. Cluster errors and surface the top real-world failures.
|
|
25
|
+
3. Draft a SPEC delta with evidence links.
|
|
26
|
+
Never: auto-roll-back — recommend; a human owns the production decision.
|
|
27
|
+
</prompt>
|
|
22
28
|
|
|
23
29
|
## Exit gate
|
|
24
30
|
|
|
31
|
+
<exit_gate>
|
|
25
32
|
- [ ] Released behind a flag/rollout.
|
|
26
33
|
- [ ] Scenario-based monitors live.
|
|
27
34
|
- [ ] A reviewed spec delta captured (becomes the next `new-task`).
|
|
35
|
+
</exit_gate>
|
|
28
36
|
|
|
29
37
|
## Next
|
|
30
38
|
|
|
@@ -1,19 +1,59 @@
|
|
|
1
|
-
# Chat reports — the
|
|
1
|
+
# Chat reports — the decision-point template (for the AI, not for add.py)
|
|
2
2
|
|
|
3
3
|
The engine renders artifacts (`report`, `report --decide`, `status`); this file
|
|
4
4
|
governs the CHAT MESSAGE you wrap around them. The digest is the artifact BEHIND
|
|
5
5
|
your presentation, never a replacement for it — and your prose is never a
|
|
6
6
|
replacement for the digest.
|
|
7
7
|
|
|
8
|
-
Use it every time you report at or near a decision
|
|
9
|
-
bundle
|
|
8
|
+
Use it every time you report at or near a decision point: an intake proposal, a
|
|
9
|
+
bundle approval, a verify gate, a task completion, a milestone close.
|
|
10
|
+
|
|
11
|
+
## The decision arc — rendered first, above the five blocks
|
|
12
|
+
|
|
13
|
+
Every report at a human gate opens with the **ARC** — three labelled lines that
|
|
14
|
+
place the decision in the work's whole arc, so the human confirms with sight of
|
|
15
|
+
where this is going, not just the step in front of them. Render it first, then a
|
|
16
|
+
separator, then the unchanged five blocks below:
|
|
17
|
+
|
|
18
|
+
```
|
|
19
|
+
ARC goal: <the milestone / project goal this decision serves>
|
|
20
|
+
done: <proven progress — tasks done · exit-criteria met · what this gate proves>
|
|
21
|
+
plan: <this gate → the next step → the goal>
|
|
22
|
+
```
|
|
23
|
+
|
|
24
|
+
- **goal** — the milestone or project goal the decision serves, read from the
|
|
25
|
+
`m-goal` line in `add.py status`; never re-typed from memory.
|
|
26
|
+
- **done** — proven progress only: exit-criteria met/total and tasks done from
|
|
27
|
+
the rollup, plus what this gate proves. An honest fact, never a hope.
|
|
28
|
+
- **plan** — this gate → the next step → the goal, mirroring the rollup's
|
|
29
|
+
`DECIDE NEXT` line.
|
|
30
|
+
|
|
31
|
+
The arc is required at every human gate: **baseline-lock · contract-freeze ·
|
|
32
|
+
verify · intake · scope · milestone-close · graduation**. The three labels stay
|
|
33
|
+
constant; their content adapts to the gate. The arc is presentation only — it
|
|
34
|
+
adds no gate and changes no PASS / RISK-ACCEPTED / HARD-STOP / freeze outcome.
|
|
35
|
+
|
|
36
|
+
Its facts are engine-sourced, exactly like EVIDENCE below: goal = `m-goal` ·
|
|
37
|
+
done = exit-criteria met/total + tasks done · plan = `DECIDE NEXT`. If your arc
|
|
38
|
+
and `add.py` output disagree, the engine wins — fix the arc, not the engine.
|
|
39
|
+
|
|
40
|
+
### Per-gate examples — one shape, gate-specific content
|
|
41
|
+
|
|
42
|
+
- **verify** — `goal:` ship the decision arc · `done:` report-arc tests 6/6
|
|
43
|
+
green, gate ready · `plan:` PASS this gate → wire the arc into every gate → goal.
|
|
44
|
+
- **contract-freeze** — `goal:` … · `done:` bundle drafted, lowest-confidence
|
|
45
|
+
flag surfaced · `plan:` freeze §3 → build → goal.
|
|
46
|
+
- **milestone-close** — `goal:` … · `done:` exit-criteria 3/3 met, all tasks
|
|
47
|
+
done · `plan:` close → archive → the next milestone.
|
|
48
|
+
- **intake** — `goal:` the sized request · `done:` classified new-major,
|
|
49
|
+
rationale stated · `plan:` create the milestone → first contract → goal.
|
|
10
50
|
|
|
11
51
|
## The five blocks, in order
|
|
12
52
|
|
|
13
53
|
```
|
|
14
54
|
SUMMARY one line: intent + target + where we are
|
|
15
55
|
DECISION what you need from the human (or "none — FYI")
|
|
16
|
-
⚠ FLAGS
|
|
56
|
+
⚠ FLAGS lowest-confidence first, why + cost-if-wrong
|
|
17
57
|
EVIDENCE small table: tests · gates · parity · check — engine-sourced
|
|
18
58
|
NEXT the single next action + what it unlocks
|
|
19
59
|
```
|
|
@@ -24,7 +64,7 @@ NEXT the single next action + what it unlocks
|
|
|
24
64
|
2. **DECISION** — the question the human must answer, stated plainly; exactly
|
|
25
65
|
one decision per report, or an explicit "none — FYI". If a decision exists,
|
|
26
66
|
ask it AFTER everything below has been shown (show-before-ask).
|
|
27
|
-
3. **⚠ FLAGS** —
|
|
67
|
+
3. **⚠ FLAGS** — lowest-confidence first, each with *why* confidence is lowest and the
|
|
28
68
|
*cost if wrong*. Where TASK.md markers exist (`⚠` / `- [~]` / `- [ ]`),
|
|
29
69
|
quote them verbatim and keep their document order — extraction ≠ judgment.
|
|
30
70
|
4. **EVIDENCE** — engine-sourced facts pasted from `add.py` output, never
|
|
@@ -34,15 +74,33 @@ NEXT the single next action + what it unlocks
|
|
|
34
74
|
line when it is right; overrule it only with a stated reason (e.g. planned
|
|
35
75
|
tasks the state file cannot see yet).
|
|
36
76
|
|
|
77
|
+
**The ask itself** — when block 2's decision becomes a literal question component
|
|
78
|
+
(option picker, numbered menu), compose it as a summary: the detail stays in the
|
|
79
|
+
report above, the question carries intent + what "yes" means + the flag count.
|
|
80
|
+
|
|
37
81
|
## Hard rules
|
|
38
82
|
|
|
83
|
+
<constraints>
|
|
39
84
|
- **Summary-first.** Never bury the decision under a task list or a diff.
|
|
40
85
|
- **Show before ask.** Render the artifact (digest · diff · report) before any
|
|
41
86
|
approval question; the human decides on what they can see.
|
|
42
|
-
- **
|
|
87
|
+
- **Reconcile the count.** Before the ask, your ⚠ FLAGS must reconcile with
|
|
88
|
+
`add.py report --decide`'s open-item count. If your prose calls an item
|
|
89
|
+
resolved while the digest still counts it open, the engine wins — fix the data
|
|
90
|
+
(the TASK.md markers the digest reads), not the sentence. A report whose flag
|
|
91
|
+
count disagrees with the engine is the un-transparent gate the ARC exists to close.
|
|
92
|
+
- **Never pre-stamp a human decision point.** Freeze / gate / lock fields stay DRAFT or
|
|
43
93
|
blank until the answer returns: show → ask → stamp → advance. An artifact
|
|
44
94
|
must never claim an approval that has not happened.
|
|
45
|
-
- **One report per
|
|
95
|
+
- **One report per decision point.** After an approval, point at the frozen artifact —
|
|
46
96
|
do not re-render the whole bundle.
|
|
47
97
|
- **Honest scope.** "Done" means the request, not the last task: report
|
|
48
98
|
"task 2/3", never "done" while approved scope remains.
|
|
99
|
+
- **The question is a summary, never the artifact.** Every approval ask carries
|
|
100
|
+
two layers: a compact SUMMARY · DECISION · ⚠ FLAGS block sits in chat
|
|
101
|
+
immediately before the ask (positional), and the question text itself is a
|
|
102
|
+
summary of two lines at most — intent + what "yes" means + the flag count —
|
|
103
|
+
pointing at the report above (compositional). The full bundle, diff, or
|
|
104
|
+
artifact lives only in the chat report; a question that re-carries it buries
|
|
105
|
+
the decision.
|
|
106
|
+
</constraints>
|
package/skill/add/run.md
CHANGED
|
@@ -1,25 +1,24 @@
|
|
|
1
1
|
# The dynamic run — executing a locked scope
|
|
2
2
|
|
|
3
3
|
Once a task's CONTRACT is frozen (phase 3), the scope is *locked*: the external shape will not move.
|
|
4
|
-
That lock is ADD's autonomy
|
|
5
|
-
covers what runs on the far side of the
|
|
6
|
-
self-improving run** instead of a manual, sequential build. The human-led
|
|
7
|
-
· Contract) still owns *direction*, but v7 compresses it to a **single human approval at the
|
|
8
|
-
(see "The
|
|
4
|
+
That lock is ADD's autonomy decision point — below it code is disposable; above it nothing breaks. This rubric
|
|
5
|
+
covers what runs on the far side of the decision point: the **build->verify half, executed as a dynamic,
|
|
6
|
+
self-improving run** instead of a manual, sequential build. The human-led **specification bundle** (Specify · Scenarios
|
|
7
|
+
· Contract) still owns *direction*, but v7 compresses it to a **single human approval at the decision point**
|
|
8
|
+
(see "The specification bundle" below) — the AI drafts the whole bundle, a human approves it once.
|
|
9
9
|
|
|
10
10
|
> **Self-improving = within-run convergence + emit v5 deltas** — same definition as v5: tracked,
|
|
11
11
|
> evidence-backed, never autonomous training. The run converges in-turn AND feeds the human-gated
|
|
12
|
-
>
|
|
12
|
+
> consolidation loop (`deltas.md` · `fold.md`). The engine stays judgment-free: this is a rubric, not `add.py`.
|
|
13
13
|
|
|
14
|
-
## The
|
|
14
|
+
## The specification bundle (v7)
|
|
15
15
|
|
|
16
|
-
The
|
|
17
|
-
freeze. v7 compresses it to **one**. From the user's input the AI **drafts the whole
|
|
18
|
-
|
|
19
|
-
human gives **one approval, at the frozen contract** (the seam). That single approval is the green light
|
|
16
|
+
The specification bundle used to be three separate approvals — Specify, then Scenarios, then the Contract
|
|
17
|
+
freeze. v7 compresses it to **one**. From the user's input the AI **drafts the whole specification bundle in one pass** — the Spec, the Scenarios, the Contract, and the failing Tests — and presents it together. The
|
|
18
|
+
human gives **one approval, at the frozen contract** (the decision point). That single approval is the green light
|
|
20
19
|
for the self-driving run.
|
|
21
20
|
|
|
22
|
-
Why one approval and not zero: the contract freeze is the autonomy
|
|
21
|
+
Why one approval and not zero: the contract freeze is the autonomy decision point, and the decision point **stays human**.
|
|
23
22
|
The AI *drafts* the contract but never *freezes its own* — a person approves the frozen shape before any
|
|
24
23
|
auto-run touches code. This is exactly what keeps "never self-gate a human-led gate" true under an auto
|
|
25
24
|
default: the one gate that remains is human. Drop it to zero and the AI would freeze the interface it
|
|
@@ -28,11 +27,11 @@ then builds against and self-gate the result — the circular trust v6's dogfood
|
|
|
28
27
|
What the human is actually approving in that one gate: that the drafted Spec captures the real intent,
|
|
29
28
|
that the Scenarios cover the cases that matter, and that the Contract shape is the one to freeze. Reject
|
|
30
29
|
any part and the bundle goes back to draft — that is backward-correction (principle 4), not failure.
|
|
31
|
-
Approve, and the run begins. The
|
|
32
|
-
**freeze review checklist** —
|
|
30
|
+
Approve, and the run begins. The decision-point guide (`phases/3-contract.md`) carries the
|
|
31
|
+
**freeze review checklist** — seven lines that walk the human through exactly this, ⚠-first.
|
|
33
32
|
|
|
34
|
-
**The
|
|
35
|
-
|
|
33
|
+
**The lowest-confidence flag — aiming the one approval.** A single approval over a whole bundle is easy to
|
|
34
|
+
grant without reading. So the AI presents the bundle **lowest-confidence first**: of everything it is asking the human
|
|
36
35
|
to freeze, it names the **1–2 points most likely to be wrong**, tagged by part
|
|
37
36
|
(`⚠ [spec|scenario|contract|test] … — because …; if wrong: …`), each with *why* it is uncertain and
|
|
38
37
|
*what it costs if wrong*. The §1 assumptions feed it, but a flag may equally point at an uncovered
|
|
@@ -40,7 +39,7 @@ scenario or the contract shape. If nothing is materially uncertain, the AI still
|
|
|
40
39
|
biggest risk, however small — never a blank "none". Honest about its limit: the flag records that the
|
|
41
40
|
human approved with the soft spots **in front of them**, eyes open; it makes a real review cheap and a
|
|
42
41
|
lazy one visibly negligent, but it cannot *force* engagement — and the AI never asserts that the human
|
|
43
|
-
engaged when it cannot know (a self-asserted gate would just
|
|
42
|
+
engaged when it cannot know (a self-asserted gate would just move the unread approval one level up). Closing
|
|
44
43
|
that enforcement gap is the job of a CI checker, not of prose.
|
|
45
44
|
|
|
46
45
|
## When the run begins — the scope-lock trigger
|
|
@@ -50,17 +49,18 @@ The trigger is the **frozen contract**, nothing else. A run may start only when:
|
|
|
50
49
|
- §3 CONTRACT is marked `FROZEN @ vN` (the shape is fixed), AND
|
|
51
50
|
- §4 TESTS exist and are RED for the right reason (the target the run drives to green).
|
|
52
51
|
|
|
53
|
-
No frozen contract -> no run: you are still
|
|
52
|
+
No frozen contract -> no run: you are still inside the specification bundle, and starting early is the
|
|
54
53
|
forward-skip the flow forbids. The lock is what makes autonomous execution *safe* — the AI cannot
|
|
55
54
|
drift the interface, because the interface is frozen above it.
|
|
56
55
|
|
|
57
|
-
## The
|
|
56
|
+
## The change scope — what the run may and may not touch
|
|
58
57
|
|
|
58
|
+
<constraints>
|
|
59
59
|
A locked run has a hard boundary. It MAY:
|
|
60
60
|
|
|
61
|
-
- write and rewrite **code** (`src/`) — code is disposable below the
|
|
61
|
+
- write and rewrite **code** (`src/`) — code is disposable below the decision point;
|
|
62
62
|
- drive the **tests** to green WITHOUT weakening them (a weakened test is a method violation);
|
|
63
|
-
- gather **evidence** for the verify gate (test output,
|
|
63
|
+
- gather **evidence** for the verify gate (test output, non-functional review).
|
|
64
64
|
|
|
65
65
|
It MUST NOT:
|
|
66
66
|
|
|
@@ -68,10 +68,11 @@ It MUST NOT:
|
|
|
68
68
|
the run STOPS and hands back to a human to reopen Specify (principle 4). The run never re-locks
|
|
69
69
|
scope on its own.
|
|
70
70
|
- weaken, delete, or skip a **test** to make the build pass (that inverts the method).
|
|
71
|
-
- touch the **
|
|
71
|
+
- touch the **specification-bundle artifacts** (§1–§3) except to halt and escalate.
|
|
72
|
+
</constraints>
|
|
72
73
|
|
|
73
74
|
Crossing the boundary is not a fast run; it is an unverified one. When the run hits something only the
|
|
74
|
-
|
|
75
|
+
specification bundle can resolve, it stops — and that stop is the loop working, not failing.
|
|
75
76
|
|
|
76
77
|
## The dynamic run — fan-out and in-run convergence
|
|
77
78
|
|
|
@@ -83,21 +84,28 @@ on a trustworthy result with three loops:
|
|
|
83
84
|
Stopping at the first green is how defects survive; the run stops only when the well runs dry.
|
|
84
85
|
- **adversarial verify** — for every "done" claim, an independent skeptic tries to REFUTE it. The
|
|
85
86
|
claim survives only if it withstands refutation, not because one pass looked plausible.
|
|
86
|
-
- **completeness-critic** — a final pass that asks "what did we NOT cover — a scenario, a
|
|
87
|
+
- **completeness-critic** — a final pass that asks "what did we NOT cover — a scenario, a non-functional risk,
|
|
87
88
|
an unstated assumption?" Whatever it finds re-enters the run.
|
|
88
89
|
|
|
89
90
|
The run ends only when the loops go dry AND the auto-gate's evidence is satisfied. This is the run
|
|
90
91
|
**self-improving within the turn** — the same convergence the foundation loop runs across milestones,
|
|
91
92
|
compressed into one task.
|
|
92
93
|
|
|
93
|
-
## The
|
|
94
|
+
## The automated quality gate
|
|
94
95
|
|
|
96
|
+
<constraints>
|
|
95
97
|
The verify gate may be resolved by **evidence** rather than by a person — when the evidence is
|
|
96
98
|
sufficient and the result is recorded (principle 7, reframed: an automated, recorded pass is an
|
|
97
99
|
explicit pass, not a skip).
|
|
98
100
|
|
|
99
101
|
- **Auto-PASS requires ALL of:** every test green; coverage not decreased; no test weakened and no
|
|
100
|
-
contract edited; the convergence loops dry; the completeness-critic found nothing open
|
|
102
|
+
contract edited; the convergence loops dry; the completeness-critic found nothing open; and the
|
|
103
|
+
deep check below recorded.
|
|
104
|
+
- **The deep check (every gate, no skim).** Deep check — do not skim. If the task produced code, record
|
|
105
|
+
that every new symbol is referenced (wiring) and that no new dead/unused code was introduced. If it
|
|
106
|
+
produced prose or non-code, record a semantic read — what you read in full and what it confirmed.
|
|
107
|
+
Which path applies is the resolver's judgement; the engine never classifies. An unfilled deep check is
|
|
108
|
+
a **shallow verify**, not an auto-PASS — evidence the work is wired, not merely plausible.
|
|
101
109
|
- **Always escalates to a human (never auto-passed):** any **security** finding (HARD-STOP, always);
|
|
102
110
|
a **concurrency**/timing risk the tests cannot exercise; an **architecture**/layering violation; and
|
|
103
111
|
any failing test. These are the residue principle 2 names — automation cannot judge them.
|
|
@@ -107,54 +115,100 @@ explicit pass, not a skip).
|
|
|
107
115
|
|
|
108
116
|
The auto-gate NEVER writes a human signature it did not get. An auto-PASS is logged as *auto-resolved*,
|
|
109
117
|
honestly — the line between a pass and a skip is the recorded outcome, not a forged name.
|
|
118
|
+
</constraints>
|
|
119
|
+
|
|
120
|
+
## The bounded self-heal loop — a confirmed cheat returns to build
|
|
121
|
+
|
|
122
|
+
The auto-gate trusts evidence; but evidence can be **gamed**. A build can make the unchanged red suite
|
|
123
|
+
pass without EARNING it — a test or the frozen contract edited after the red run, src **overfit** to the
|
|
124
|
+
fixtures, **vacuous** asserts, or real logic **stubbed away**. That is a **confirmed cheat**, and a cheat
|
|
125
|
+
is **HARD-STOP-class**: never auto-passed, never RISK-ACCEPTED-waived (like a security finding). But a
|
|
126
|
+
first cheat is not yet a stop — it is a chance to redo honestly.
|
|
127
|
+
|
|
128
|
+
So a confirmed cheat enters a **bounded self-heal loop**: the engine returns the task to **build** for an
|
|
129
|
+
honest redo, **counts** the attempt, and **caps** it. After **3** honest re-build attempts a fourth
|
|
130
|
+
confirmed cheat forces a **HARD-STOP that escalates to the human** — never an auto-PASS, never an unbounded
|
|
131
|
+
loop. The engine COUNTS, CAPS, and ESCALATES; the **agent** does the honest re-build (the engine never
|
|
132
|
+
auto-fixes). The counter is **monotonic** — it never auto-resets, so the cap cannot be cleared by
|
|
133
|
+
re-crossing a phase; only an honest build (no cheat) escapes the loop, and an honest build PASSes even at
|
|
134
|
+
the third attempt (the cap bites a *continued* cheat, never a recovery).
|
|
135
|
+
|
|
136
|
+
Two findings enter the loop:
|
|
137
|
+
- **mechanical** (enforced) — the tamper tripwire (`tamper-tripwire`): at the gate the engine re-hashes the
|
|
138
|
+
red test files + the frozen §3 against the `tests→build` snapshot; any divergence is a cheat, routed to
|
|
139
|
+
the loop before any completing outcome is recorded.
|
|
140
|
+
- **semantic** (honor-system, necessary-not-sufficient) — the **adversarial refute-read** (`6-verify.md`):
|
|
141
|
+
an independent reviewer argues "the green was NOT earned" and, on a confirmed overfit/vacuous/stub, the
|
|
142
|
+
agent reports it with `add.py heal <slug> --reason "<finding>"`. The engine cannot SEE a judgment cheat,
|
|
143
|
+
so this entry is the agent's honest report — the human verify gate stays the real backstop.
|
|
144
|
+
|
|
145
|
+
The mechanical entry returns-to-build automatically at the gate; the `heal` verb is how a *reported* cheat
|
|
146
|
+
enters the same bounded loop. Either way: ≤3 honest redos, then escalate. A gamed green never ships.
|
|
110
147
|
|
|
111
148
|
## Emitting deltas — feeding the foundation back
|
|
112
149
|
|
|
113
150
|
The completeness-critic does not discard what it finds. Every gap, surprise, or convention that helped
|
|
114
|
-
or hurt becomes an **`open`
|
|
151
|
+
or hurt becomes an **`open` lesson learned** in the task's OBSERVE block, in the `deltas.md` grammar,
|
|
115
152
|
tagged by competency:
|
|
116
153
|
|
|
117
154
|
- a finding the run FIXED but that taught the foundation something (a missing scenario -> `TDD`);
|
|
118
155
|
- a finding the run could NOT fix — a residue escalation -> a delta AND the escalation to a human.
|
|
119
156
|
|
|
120
|
-
These `open` deltas feed v5's human-gated
|
|
121
|
-
the human
|
|
157
|
+
These `open` deltas feed v5's human-gated consolidation (`fold.md`) at milestone close: the run emits `open`;
|
|
158
|
+
the human consolidates. That is the loop closing — **v6 run -> v5 foundation** — so a dynamic run sharpens the
|
|
122
159
|
five competencies instead of letting its findings evaporate at end-of-run.
|
|
123
160
|
|
|
124
|
-
## The autonomy
|
|
161
|
+
## The autonomy level
|
|
125
162
|
|
|
163
|
+
<constraints>
|
|
126
164
|
How much a run may auto-gate is a **per-scope setting**, not a global switch (principle 5: trust is
|
|
127
165
|
earned per scope). A task declares its level in its `TASK.md` header:
|
|
128
166
|
|
|
129
167
|
```
|
|
130
|
-
autonomy:
|
|
168
|
+
autonomy: manual | conservative | auto
|
|
131
169
|
```
|
|
132
170
|
|
|
133
|
-
|
|
171
|
+
An ordered ladder — `manual < conservative < auto` — declared once in the header and reviewed at the freeze:
|
|
172
|
+
|
|
173
|
+
- **auto (the seeded default)** — the run may auto-PASS when the evidence + residue checks above are
|
|
134
174
|
satisfied. Security still always escalates. This is the default starting point: a frozen contract
|
|
135
175
|
flips the task into a self-driving run that converges and auto-gates on evidence.
|
|
136
176
|
- **conservative** — the deliberate *lowering*: the run does all the work and converges, but STOPS at
|
|
137
177
|
the verify gate for a human. Auto-PASS is disabled. Choose it wherever evidence is thin or risk is high.
|
|
178
|
+
- **manual** — the strict floor: the human owns the verify gate and the engine never auto-resolves
|
|
179
|
+
(behaviourally the conservative floor with the explicit "I drive this decision; the AI proposes only"
|
|
180
|
+
name). Choose it for the highest-stakes scope; like `conservative`, it satisfies the high-risk guard.
|
|
138
181
|
|
|
139
182
|
> **v7 reversal (recorded, not hidden).** Earlier the default was `conservative` and `auto` was the
|
|
140
183
|
> earned exception; v7 flips this — `auto` is the default, `conservative` is the deliberate lowering.
|
|
141
|
-
> What did **not** change is principle 5: the
|
|
184
|
+
> What did **not** change is principle 5: the autonomy level is still **per-scope**, and it still lives in the
|
|
142
185
|
> `TASK.md` header, and you still lower it anywhere risk demands. Only the starting point moved.
|
|
143
186
|
|
|
144
|
-
**The high-risk guard — `auto` is refused where it matters most.** The
|
|
187
|
+
**The high-risk guard — `auto` is refused where it matters most.** The autonomy level is not a blank cheque. On a
|
|
145
188
|
**high-risk or method-defining scope** — anything where a wrong-but-plausible result is expensive or
|
|
146
189
|
hard to reverse (auth, money, data-loss paths, the method/trust-layer itself) — `auto` must be lowered
|
|
147
|
-
to `conservative`; leaving it at `auto` there is the reject code
|
|
148
|
-
|
|
190
|
+
to a stricter rung — `conservative` or `manual`; leaving it at `auto` there is the reject code
|
|
191
|
+
**`unguarded_high_risk_auto`**. This
|
|
192
|
+
closes the v6 dogfood gap, where the whole milestone ran at `auto` on the riskiest possible
|
|
149
193
|
scope (defining the method) with no friction. The default is `auto` *for ordinary, well-tested scope*;
|
|
150
194
|
high risk still earns a human gate.
|
|
151
195
|
|
|
152
196
|
Judging *what* is high-risk stays human — the scope declares **`risk: high`** in the same `TASK.md`
|
|
153
|
-
header where the
|
|
197
|
+
header where the autonomy level lives, reviewed at the freeze like every header line (the engine never
|
|
154
198
|
classifies scope). **Since v14 the guard is mechanical for the declared case:**
|
|
155
199
|
the engine refuses the declared combination — `add.py gate` will not complete (`PASS`/`RISK-ACCEPTED`) a task whose header
|
|
156
|
-
carries `risk: high` without `
|
|
200
|
+
carries `risk: high` without a lowered level — `conservative` or `manual` (error `unguarded_high_risk_auto`; `HARD-STOP`
|
|
157
201
|
always records — stopping is never blocked), and `add.py audit` flags the same code on a finished
|
|
158
202
|
record whose header was tampered or whose GATE RECORD reviewer is the auto-gate — which CI enforces
|
|
159
203
|
(audit-ci). The honest limit mirrors the audit's: an **undeclared** high-risk scope passes; declaring
|
|
160
|
-
is the human
|
|
204
|
+
is the human decision point, the engine enforces what was declared.
|
|
205
|
+
|
|
206
|
+
**Autonomy is earned by goal-clarity — the auto-ready goal.** The level decides *who* resolves Verify;
|
|
207
|
+
an **auto-ready goal** decides whether a self-verifying run is even *meaningful*. A milestone goal is
|
|
208
|
+
auto-ready when **every exit criterion cites a verifier** — `(verify: <test | command | metric>)` — so the
|
|
209
|
+
run can check its own result against the goal without human judgment. `add.py check` raises a
|
|
210
|
+
`goal_not_auto_ready` WARN (never red, the active milestone only) while criteria are uncited, and `status`
|
|
211
|
+
prints a `goal-ready:` line every session. It **measures, never blocks** — it changes neither the freeze
|
|
212
|
+
gate nor the autonomy level. The lint forces a citation slot per criterion (raising the floor) but cannot
|
|
213
|
+
prove the citation is honest (`(verify: it works)` passes) — that judgment stays the human's.
|
|
214
|
+
</constraints>
|
package/skill/add/scope.md
CHANGED
|
@@ -20,7 +20,7 @@ scope drafting honors intake's classification — it never re-sizes a request:
|
|
|
20
20
|
means one drafting pass, NOT auto-creation. Nothing is written to disk — single draft or the
|
|
21
21
|
whole batch — until the human confirms. You propose; you wait.
|
|
22
22
|
|
|
23
|
-
## Brainstorm before you draft — co-specify at milestone
|
|
23
|
+
## Brainstorm before you draft — co-specify at milestone level
|
|
24
24
|
|
|
25
25
|
Don't draft a MILESTONE.md from thin input. Run the same three-move co-specify as a
|
|
26
26
|
task's §1 (`phases/1-specify.md`) — Diverge (framings + open questions) → Converge
|
|
@@ -31,12 +31,14 @@ Draft the WHOLE milestone before showing; nothing hits disk until the human conf
|
|
|
31
31
|
Diverge seeds (pick the live ones):
|
|
32
32
|
- **Outcome** — done means a user can do *what* they can't today? (goal sentence)
|
|
33
33
|
- **Edge of scope** — nearest thing assumed IN that you want OUT? (Out list)
|
|
34
|
-
- **Riskiest
|
|
34
|
+
- **Riskiest decision point** — which contract, if wrong, costs the most rework? (freeze-first)
|
|
35
35
|
- **Done-looks-like** — how do we SEE each outcome without reading code? (exit criteria)
|
|
36
36
|
- **First slice** — which task unblocks the rest? (breadth-first order)
|
|
37
37
|
|
|
38
|
-
Rank assumptions
|
|
39
|
-
`⚠ <assumption> —
|
|
38
|
+
Rank assumptions lowest-confidence first; the top 1–2 get the flag the human reads at confirm:
|
|
39
|
+
`⚠ <assumption> — lowest confidence because <why>; if wrong: <cost>`. Present the draft via
|
|
40
|
+
`report-template.md` — open with the ARC (goal · done · plan): the goal this milestone serves,
|
|
41
|
+
what is already covered, and the plan its task list lays out.
|
|
40
42
|
|
|
41
43
|
## Drafting a good MILESTONE.md (section by section)
|
|
42
44
|
|
|
@@ -45,8 +47,8 @@ Rank assumptions least-sure first; the top 1–2 get the flag the human reads at
|
|
|
45
47
|
- **Scope In/Out** — the explicit anti-creep deferral list. Naming what is OUT is as important
|
|
46
48
|
as what is IN; an empty Out list usually means the scope is not yet thought through.
|
|
47
49
|
- **Shared decisions & glossary deltas** — cross-cutting rules every task must honor, named from
|
|
48
|
-
the glossary. New terms get a glossary entry (the
|
|
49
|
-
- **Shared / risky contracts to freeze first** — the
|
|
50
|
+
the glossary. New terms get a glossary entry (the living documentation stays honest).
|
|
51
|
+
- **Shared / risky contracts to freeze first** — the decision points between tasks; name the owning task.
|
|
50
52
|
- **Tasks (breadth-first)** — `slug · depends-on · one line` each. Decompose by deliverable, not
|
|
51
53
|
by phase; keep each task one-file-sized. Order by dependency, not by guesswork.
|
|
52
54
|
- **Exit criteria** — observable, and **every exit criterion maps to a declared task slug**
|
|
@@ -54,6 +56,7 @@ Rank assumptions least-sure first; the top 1–2 get the flag the human reads at
|
|
|
54
56
|
|
|
55
57
|
## Reject codes (emit `{ reject, rationale }`, create nothing)
|
|
56
58
|
|
|
59
|
+
<reject_codes>
|
|
57
60
|
- `not_classified` — the request has not been through intake yet. Classify it first; you cannot
|
|
58
61
|
draft scope for an unclassified request.
|
|
59
62
|
- `dangling_criterion` — a drafted MILESTONE.md has an exit criterion that maps to no declared
|
|
@@ -61,6 +64,7 @@ Rank assumptions least-sure first; the top 1–2 get the flag the human reads at
|
|
|
61
64
|
a malformed milestone. With no engine lint, you are the first check and the human is the backstop.
|
|
62
65
|
- `no_milestone` — intake routed the request to `task` or `change-request`; scope drafting
|
|
63
66
|
creates NO milestone. Honor the classification; do not invent milestone-sized scope.
|
|
67
|
+
</reject_codes>
|
|
64
68
|
|
|
65
69
|
## Worked example (from this repo's own history)
|
|
66
70
|
|
|
@@ -1,11 +1,11 @@
|
|
|
1
1
|
# Setup review — the one page the human signs
|
|
2
2
|
|
|
3
|
-
Autonomous setup ends at a single human gate: the **
|
|
3
|
+
Autonomous setup ends at a single human gate: the **baseline approval** (`add.py lock`). Before that
|
|
4
4
|
signature is honest, the human needs to see *what you drafted and how sure you were* — not re-derive
|
|
5
5
|
it. `SETUP-REVIEW.md` is that page: every decision you made while drafting the foundation, first-scope,
|
|
6
|
-
and the first contract, **ordered
|
|
6
|
+
and the first contract, **ordered lowest-confidence-first** so the riskiest guesses meet their eye first.
|
|
7
7
|
|
|
8
|
-
This is the setup-
|
|
8
|
+
This is the setup-level analog of presenting a task's specification bundle lowest-confidence-first at the contract freeze.
|
|
9
9
|
The engine never reads this file — `add.py lock` is judgment-free, the signature *is* the gate (see
|
|
10
10
|
`setup-lock-state`). The human **reading** this page is the review; your job is to make the reading honest.
|
|
11
11
|
|
|
@@ -13,7 +13,7 @@ The engine never reads this file — `add.py lock` is judgment-free, the signatu
|
|
|
13
13
|
|
|
14
14
|
Write **one** artifact at `.add/SETUP-REVIEW.md`. **Never clobber a human-edited one** — if it already
|
|
15
15
|
exists with hand edits, append/update, don't overwrite (the same non-clobber rule `init` applies to
|
|
16
|
-
|
|
16
|
+
living docs). It is a per-onboarding, setup-level artifact; it sits beside `PROJECT.md`, not under a task.
|
|
17
17
|
|
|
18
18
|
## The template
|
|
19
19
|
|
|
@@ -27,14 +27,15 @@ survivors). It is a per-onboarding, setup-altitude artifact; it sits beside `PRO
|
|
|
27
27
|
| 1 | <the drafted decision> | PROJECT.md \| scope \| first-contract | `guessed` | <the inference + why you had to guess> |
|
|
28
28
|
| 2 | <…> | <…> | `evidence-grounded` | <cite the source file/line you read it from> |
|
|
29
29
|
|
|
30
|
-
Sign:
|
|
30
|
+
Sign: confirm in chat → the agent runs `add.py lock --by "<name>"` (typing it yourself works too)
|
|
31
31
|
```
|
|
32
32
|
|
|
33
|
-
Rows are numbered for reference at the gate ("row 1 is
|
|
33
|
+
Rows are numbered for reference at the gate ("row 1 is where my confidence is lowest").
|
|
34
34
|
|
|
35
35
|
## The two rules that make it honest
|
|
36
36
|
|
|
37
|
-
|
|
37
|
+
<constraints>
|
|
38
|
+
1. **Lowest-confidence-first.** Order rows by confidence **ascending**. A `guessed` row always floats above an
|
|
38
39
|
`evidence-grounded` one. The point is not completeness theatre — it is to spend the human's attention
|
|
39
40
|
where it changes outcomes: the top of the table is the part they actually need to challenge.
|
|
40
41
|
|
|
@@ -45,13 +46,15 @@ Rows are numbered for reference at the gate ("row 1 is the one I'm least sure ab
|
|
|
45
46
|
onboarding (a near-empty repo, only the 4-lens answers) produces these. These are what the human
|
|
46
47
|
must check; that is why they sit on top.
|
|
47
48
|
|
|
48
|
-
The tag vocabulary is shared with `adopt.md` — the brownfield map tags each filled
|
|
49
|
+
The tag vocabulary is shared with `adopt.md` — the brownfield map tags each filled living-doc decision
|
|
49
50
|
`guessed`/`evidence-grounded`, and those tags flow straight into this table.
|
|
51
|
+
</constraints>
|
|
50
52
|
|
|
51
53
|
## Where it ends
|
|
52
54
|
|
|
53
|
-
`SETUP-REVIEW.md` is **read-only context** for the
|
|
54
|
-
field-by-field; you present it,
|
|
55
|
+
`SETUP-REVIEW.md` is **read-only context** for the baseline approval. You do not ask the human to approve it
|
|
56
|
+
field-by-field; you present it, lowest-confidence-first; they confirm in conversation, and you run the lock
|
|
57
|
+
with their name:
|
|
55
58
|
|
|
56
59
|
```bash
|
|
57
60
|
python3 .add/tooling/add.py lock --by "<name>"
|