@pilotspace/add 1.1.0 → 1.2.0
- package/CHANGELOG.md +40 -0
- package/GETTING-STARTED.md +165 -139
- package/README.md +13 -7
- package/bin/cli.js +13 -4
- package/docs/01-principles.md +3 -3
- package/docs/02-the-flow.md +15 -11
- package/docs/03-step-1-specify.md +13 -13
- package/docs/04-step-2-scenarios.md +2 -2
- package/docs/05-step-3-contract.md +3 -3
- package/docs/06-step-4-tests.md +2 -2
- package/docs/07-step-5-build.md +1 -1
- package/docs/08-step-6-verify.md +14 -5
- package/docs/09-the-loop.md +12 -6
- package/docs/10-setup-and-stages.md +27 -13
- package/docs/11-governance.md +2 -2
- package/docs/12-roles.md +3 -3
- package/docs/13-adoption.md +1 -1
- package/docs/14-foundation.md +15 -15
- package/docs/15-foundations-and-lineage.md +106 -0
- package/docs/README.md +4 -0
- package/docs/appendix-a-templates.md +3 -3
- package/docs/appendix-b-prompts.md +40 -5
- package/docs/appendix-c-glossary.md +42 -12
- package/docs/appendix-d-worked-example.md +2 -2
- package/docs/appendix-e-checklists.md +2 -2
- package/docs/appendix-f-requirements-matrix.md +8 -8
- package/docs/appendix-g-references.md +106 -0
- package/package.json +1 -1
- package/skill/add/SKILL.md +39 -37
- package/skill/add/adopt.md +13 -11
- package/skill/add/deltas.md +8 -6
- package/skill/add/fold.md +19 -17
- package/skill/add/graduate.md +74 -0
- package/skill/add/intake.md +22 -7
- package/skill/add/loop.md +59 -0
- package/skill/add/phases/0-setup.md +29 -24
- package/skill/add/phases/1-specify.md +23 -13
- package/skill/add/phases/2-scenarios.md +14 -4
- package/skill/add/phases/3-contract.md +24 -11
- package/skill/add/phases/4-tests.md +15 -5
- package/skill/add/phases/5-build.md +11 -4
- package/skill/add/phases/6-verify.md +24 -2
- package/skill/add/phases/7-observe.md +13 -5
- package/skill/add/report-template.md +65 -7
- package/skill/add/run.md +45 -34
- package/skill/add/scope.md +10 -6
- package/skill/add/setup-review.md +13 -10
- package/skill/add/streams.md +69 -19
- package/tooling/add.py +476 -34
- package/tooling/templates/CONVENTIONS.md.tmpl +1 -1
- package/tooling/templates/GLOSSARY.md.tmpl +23 -0
- package/tooling/templates/MILESTONE.md.tmpl +1 -0
- package/tooling/templates/PROJECT.md.tmpl +4 -3
- package/tooling/templates/TASK.md.tmpl +33 -12
@@ -11,42 +11,52 @@ understand the feature — that is information, not an obstacle. Stop and ask.
 
 1. **Diverge** — before drafting, surface the decision space: the 2–3 genuine framings of the
 feature + the open questions you would otherwise guess. Invite the user to add, kill,
-redirect. (Conversational — no new file. At prototype/poc this
-2. **Converge** — draft §1, then RANK
+redirect. (Conversational — no new file. At prototype/poc this shortens to one sentence.)
+2. **Converge** — draft §1, then RANK where your confidence is lowest (below).
 3. **Validate** — present the ranked uncertainty first; the user confirms, corrects, or sends back.
 
 ## Produce (in TASK.md §1)
 
+<output_format>
 - **Framings weighed** — a one-line trace of what you considered: `X (chosen) · Y · Z`.
 - **Must** — each required behavior.
 - **Reject** — each refused input/situation, paired with a **named error code**
 (`amount <= 0 -> "amount_invalid"`, never "handle bad input").
 - **After** — the state that is true once it succeeds.
-- **Assumptions —
-`⚠` flag: `⚠ <assumption> —
-low-stakes `[x]` tail.
+- **Assumptions — lowest-confidence first** — ranked most-likely-wrong → least. The top 1–2 carry a
+`⚠` flag: `⚠ <assumption> — lowest confidence because <why>; if wrong: <cost>`. The rest are the
+low-stakes `[x]` tail. Keep the ranking visible — a flat list of equal `[x]` ticks gets approved without reading.
+</output_format>
 
-## The
+## The lowest-confidence flag is bundle-wide
 
 The single human approval happens once, at the contract freeze, over the whole bundle. So your
-§1 ranking is the
+§1 ranking is the first input into a bundle-level flag the user reads at the decision point (`run.md`):
 *"of everything I'm asking you to freeze, these 1–2 are most likely wrong."* A flag may point at
 a §1 assumption, an uncovered scenario, or the contract shape.
 
 ## AI prompt
 
->
-
-
-
-
+<prompt>
+Role: a domain analyst who brainstorms, then asks rather than assumes.
+Read first: CONVENTIONS · GLOSSARY · the user's raw input.
+Objective: fill §1 SPECIFY with zero ambiguity left for the AI to resolve by guessing.
+Steps:
+1. Surface 2–3 framings + the open questions; let the user react before you draft.
+2. Produce §1 — Framings weighed, every Must, every Reject with a named error code, the
+After state, and the Assumptions RANKED lowest-confidence first.
+3. Flag the 1–2 where your confidence is lowest, each with why + cost.
+Never: resolve an ambiguity by guessing.
+</prompt>
 
 ## Exit gate
 
+<exit_gate>
 - [ ] Framings weighed noted; every required behavior stated.
 - [ ] Every rejection has a named error code; success state-change described.
-- [ ] Assumptions ordered
+- [ ] Assumptions ordered lowest-confidence first; the 1–2 `⚠` flags carry why + cost — or an honest
 "none material" that still names the single biggest risk (never a blank "none").
+</exit_gate>
 
 ## Next
 
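The Reject rule quoted in this hunk (`amount <= 0 -> "amount_invalid"`) maps naturally onto a validator that returns a named code instead of a generic failure. The following is a minimal sketch under stated assumptions: `validate_transfer`, the `balance` parameter, and the `insufficient_funds` code are hypothetical names chosen for illustration and do not come from the package.

```python
from typing import Optional


def validate_transfer(amount: float, balance: float) -> Optional[str]:
    """Return a named error code for a rejected input, or None when accepted."""
    # Reject rule taken from the text: amount <= 0 -> "amount_invalid".
    if amount <= 0:
        return "amount_invalid"
    # Hypothetical second Reject rule, added only for illustration.
    if amount > balance:
        return "insufficient_funds"
    return None
```

Each branch corresponds to one Reject line in §1, so a reviewer can match codes to rules one for one.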
@@ -6,6 +6,7 @@ generated from it. Fill **§2 SCENARIOS** in TASK.md.
 
 ## Produce (in TASK.md §2)
 
+<output_format>
 ```gherkin
 Scenario: <short name>
 Given <starting situation>
@@ -15,20 +16,29 @@ Scenario: <short name>
 ```
 
 The `And ... unchanged` clause catches corrupting partial failures (e.g. a balance
-deducted before a check fails).
+deducted before a check fails). Include it on every rejection.
+</output_format>
 
 ## AI prompt
 
->
-
-
+<prompt>
+Role: a specification tester.
+Read first: §1 · GLOSSARY.
+Objective: one scenario per Must and per Reject rule, each result specific and observable.
+Steps:
+1. Write one scenario per Must rule and one per Reject rule.
+2. For every rejection add an And-clause asserting what must NOT change.
+Never: settle for a vague result ("then it works") — results must be specific and observable.
+</prompt>
 
 ## Exit gate
 
+<exit_gate>
 - [ ] One scenario per Must rule.
 - [ ] One scenario per Reject rule.
 - [ ] Each result is a specific, observable fact.
 - [ ] Every rejection asserts what stays unchanged.
+</exit_gate>
 
 ## Next
 
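The `And ... unchanged` clause described in this hunk can be pictured as an assertion that runs after the rejection is observed. The sketch below assumes a toy `Account` class, a `RejectedTransfer` exception, and an `insufficient_funds` code; all three are hypothetical and exist only to show the shape of such a check.

```python
class RejectedTransfer(Exception):
    """Raised with a named error code when a request is refused."""

    def __init__(self, code: str) -> None:
        super().__init__(code)
        self.code = code


class Account:
    def __init__(self, balance: int) -> None:
        self.balance = balance

    def withdraw(self, amount: int) -> None:
        # Validate before mutating, so a rejection cannot leave a partial change.
        if amount <= 0:
            raise RejectedTransfer("amount_invalid")
        if amount > self.balance:
            raise RejectedTransfer("insufficient_funds")
        self.balance -= amount


def check_rejection_leaves_balance_unchanged() -> str:
    """Given balance 100, When 150 is withdrawn, Then rejected And balance unchanged."""
    account = Account(balance=100)
    before = account.balance
    try:
        account.withdraw(150)
    except RejectedTransfer as err:
        assert account.balance == before  # the "And ... unchanged" clause
        return err.code
    raise AssertionError("withdrawal should have been rejected")
```

If `withdraw` deducted first and validated afterwards, the inner assertion would fail, which is the corrupting partial failure the clause is meant to catch.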
@@ -1,12 +1,13 @@
 # Phase 3 — Contract (freeze the shape)
 
 Goal: fix the external shape — interfaces, data, names, error cases — and FREEZE
-it. This is the
+it. This is the decision point that makes the AI-led build safe: below it code is
 disposable; above it nothing breaks because the shape does not move. Fill
 **§3 CONTRACT** in TASK.md.
 
 ## Produce (in TASK.md §3)
 
+<output_format>
 - Interfaces (endpoints/functions/messages) with inputs/outputs.
 - Request/response shapes + persistent schema (note transactional needs).
 - Names drawn from `GLOSSARY.md` (same concept = same name everywhere).
@@ -14,17 +15,22 @@ disposable; above it nothing breaks because the shape does not move. Fill
 
 Then mark `Status: FROZEN @ v1`. Generate a mock + contract tests so dependent
 work can start before the real code exists.
+</output_format>
 
-**The freeze is the one approval.** This
-whole bundle (§1–§4). Before asking for it, present the bundle **
+**The freeze is the one approval.** This decision point is where the single human approval lands, over the
+whole bundle (§1–§4). Before asking for it, present the bundle **lowest-confidence first**: the 1–2 points
 most likely wrong (`⚠ [spec|scenario|contract|test] … — because …; if wrong: …`) — aim the human's
-eye before they freeze.
+eye before they freeze. Open that report with the ARC (goal · done · plan) per `report-template.md` so the
+human sees the goal this freeze serves and the plan beyond it, not just the bundle. See `run.md`.
 
 ## The freeze review checklist
 
 The human's one minute, aimed. Walk these six before saying yes:
 
-- **⚠ flags first** — read the
+- **⚠ flags first** — read the lowest-confidence flags; accept each knowing its cost if wrong.
+The engine refuses an unflagged freeze before build: a frozen §3 with no well-formed
+lowest-confidence flag is rejected (`unflagged_freeze`), and `audit` re-checks it on every
+record that crossed.
 - **Intent** — does §1 say what you actually want built (and is anything you expected missing)?
 - **Cases** — does every Must and Reject have an observable §2 scenario you care about?
 - **Shape** — glossary names, error codes, additive vs breaking: is THIS the shape to freeze?
@@ -32,24 +38,31 @@ The human's one minute, aimed. Walk these six before saying yes:
 `risk: high · autonomy: conservative` in the TASK.md header — the engine refuses an unguarded completion.
 - **Tests** — will §4 go red for the right reason, asserting behavior rather than internals?
 
-This checklist AIMS the one approval —
+This checklist AIMS the one approval — the freeze stays the only gate: no sign-off forms, no
 extra documents. Reject any line and the bundle goes back to draft; that is
 backward-correction, not failure.
 
 ## AI prompt
 
->
-
-
-
-
+<prompt>
+Role: an interface architect; frozen contracts are immutable.
+Read first: §1 · §2 · GLOSSARY.
+Objective: produce §3 — the frozen external shape, nothing more.
+Steps:
+1. Define interfaces, shapes, and schema named from the glossary, with a response for every Reject code.
+2. Generate a mock returning the contracted shapes and contract tests pinning them.
+3. Mark FROZEN. No business logic.
+Never: change a frozen contract — a change reopens Specify.
+</prompt>
 
 ## Exit gate
 
+<exit_gate>
 - [ ] Versioned and marked `FROZEN`.
 - [ ] Contract tests pass against the mock.
 - [ ] Every name matches the glossary.
 - [ ] Every spec rejection has a contracted response.
+</exit_gate>
 
 ## Next
 
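The contract phase text says a frozen §3 with no well-formed lowest-confidence flag is rejected as `unflagged_freeze`. How `add.py` decides "well-formed" is not visible in this diff, so the sketch below is an assumption: it treats a flag as well-formed when it matches the template `⚠ [spec|scenario|contract|test] … — because …; if wrong: …`. The names `FLAG_PATTERN` and `check_freeze` are hypothetical.

```python
import re
from typing import List, Optional

# Assumed shape of a well-formed flag, derived from the template in the text.
FLAG_PATTERN = re.compile(
    r"^⚠ \[(spec|scenario|contract|test)\] .+ — because .+; if wrong: .+$"
)


def check_freeze(contract_status: str, bundle_lines: List[str]) -> Optional[str]:
    """Return "unflagged_freeze" when a frozen contract carries no well-formed flag."""
    if not contract_status.startswith("FROZEN"):
        return None  # still a draft: nothing to refuse yet
    has_flag = any(FLAG_PATTERN.match(line.strip()) for line in bundle_lines)
    return None if has_flag else "unflagged_freeze"
```

A flag that names its target but gives no reason or cost fails the pattern, so a vague `⚠` line does not count as a flag.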
@@ -1,4 +1,4 @@
-# Phase 4 — Tests (
+# Phase 4 — Tests (failing-first suite)
 
 Goal: turn scenarios + contract into automated tests and confirm they FAIL before
 any code exists. This operationalizes red/green TDD: red now, green only after
@@ -12,10 +12,12 @@ before code exists is testing nothing and will wave bad code through later.
 
 ## Produce
 
+<output_format>
 - One executable test per scenario (§2), asserting **behavior, not internals**.
 - Contract-conformance tests (shapes + error responses from §3).
 - Side-effect assertions on rejection paths (`assert balance unchanged`).
 - A recorded coverage target in §4.
+</output_format>
 
 ## Declaring where tests live
 
@@ -33,17 +35,25 @@ symlink escapes are never read.
 
 ## AI prompt
 
->
-
-
-
+<prompt>
+Role: a test author who writes tests before code.
+Read first: §2 · §3.
+Objective: a red suite that fails for the right reason — behavior, not internals.
+Steps:
+1. Turn each scenario into an executable test.
+2. Add contract-conformance and edge-case tests.
+3. Run the suite and confirm it fails for the right reason; record a coverage target.
+Never: implement the feature, or assert on internals.
+</prompt>
 
 ## Exit gate
 
+<exit_gate>
 - [ ] One test per scenario.
 - [ ] Suite runs and is **red for the right reason**.
 - [ ] Tests assert observable behavior.
 - [ ] Coverage target recorded.
+</exit_gate>
 
 ## Next
 
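"Red for the right reason" can be checked mechanically in a small way: run the suite against a stub and confirm every failure traces back to the missing behavior rather than to, say, an import error. The sketch below uses the standard `unittest` module. Treating a `NotImplementedError` raised by the stub as the "right reason" is an assumption of this example, and `withdraw` and `run_red_check` are hypothetical names.

```python
import unittest


def withdraw(balance: int, amount: int) -> int:
    """Stub standing in for code that does not exist yet."""
    raise NotImplementedError("withdraw is not implemented yet")


def run_red_check() -> dict:
    """Run one scenario test against the stub and report why it is red."""

    class WithdrawScenario(unittest.TestCase):
        def test_valid_withdrawal_reduces_balance(self) -> None:
            # Asserts observable behavior only, not internals.
            self.assertEqual(withdraw(100, 30), 70)

    suite = unittest.defaultTestLoader.loadTestsFromTestCase(WithdrawScenario)
    result = unittest.TestResult()
    suite.run(result)

    # Each entry is a (test, formatted traceback) pair.
    problems = result.failures + result.errors
    right_reason = bool(problems) and all(
        "NotImplementedError" in traceback for _, traceback in problems
    )
    return {"red": not result.wasSuccessful(), "right_reason": right_reason}
```

A suite that is red because of a typo in the test file would report `right_reason` as false, which is the case the exit gate is warning about.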
@@ -19,18 +19,25 @@ change request back to Specify. Honor the feature-specific safety rule named in
 
 ## AI prompt
 
->
-
-
-
+<prompt>
+Role: implement the feature so EVERY failing test passes — the build phase.
+Read first: §1 · §3 · §4 · CONVENTIONS.
+Objective: every §4 test green, one small batch at a time.
+Steps:
+1. Make EVERY failing test pass, one small batch at a time, honoring the §5 safety rule.
+2. Report which tests pass and exactly what changed.
+Never: change a test or the contract; use a package off the allow-list; or push past something unclear instead of asking.
+</prompt>
 
 ## Exit gate
 
+<exit_gate>
 - [ ] All tests pass.
 - [ ] Coverage did not decrease.
 - [ ] No test and no contract modified by the AI.
 - [ ] No dependency outside the allow-list.
 - [ ] Change small enough to review in full.
+</exit_gate>
 
 ## Next
 
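The build exit gate includes "No dependency outside the allow-list". For Python sources, one way to approximate that check is to parse the imports with the standard `ast` module and compare top-level package names against the list. This is a sketch under assumptions: the diff does not show where the allow-list lives or how the engine enforces it, and `imports_outside_allow_list` is a hypothetical helper.

```python
import ast
from typing import Iterable, Set


def imports_outside_allow_list(source: str, allow_list: Iterable[str]) -> Set[str]:
    """Return top-level packages imported by `source` that are not allowed."""
    allowed = set(allow_list)
    found: Set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            found.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # level == 0 skips relative imports, which are project-internal.
            found.add(node.module.split(".")[0])
    return found - allowed
```

An empty result means every absolute import resolves to an allowed package; dynamic imports (for example via `importlib`) are not detected by this approach.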
@@ -1,4 +1,4 @@
-# Phase 6 — Verify (evidence +
+# Phase 6 — Verify (evidence + non-functional review)
 
 Goal: establish trust and record an outcome. Passing tests are necessary, not
 sufficient. Fill **§6** in TASK.md including the GATE RECORD.
@@ -31,8 +31,28 @@ If any is false, stop and return to Build — there is nothing to verify yet.
 note reviewed by the auto-gate is an audit finding (`unescalated_security_note`).
 - **Architecture** — does it respect layering/dependency rules in CONVENTIONS.md?
 
+## Part three — the deep check (do not skim)
+
+Green tests prove behavior on the inputs you thought of. They do not prove the change
+is *wired in*, nor that you did not leave a dead end behind — and for a non-coding change
+they prove nothing about whether you actually *read* the thing you signed off. So one more
+requirement, every gate:
+
+Deep check — do not skim. If the task produced code, record that every new symbol is
+referenced (wiring) and that no new dead/unused code was introduced. If it produced prose
+or non-code, record a semantic read — what you read in full and what it confirmed. Which
+path applies is the resolver's judgement; the engine never classifies.
+
+Record it in the §6 **Deep checks** block — where each new symbol is called (a reference
+search), the dead-code scan result, or the prose you read in full and what it confirmed.
+An unfilled Deep checks block is a **shallow verify**, not a PASS.
+
 ## Record exactly one outcome (no silent pass)
 
+When you present this gate to the human, open with the ARC (goal · done · plan) per
+`report-template.md`, and reconcile its FLAGS with `add.py report --decide`'s open-item count
+before the ask — per that file's reconcile rule (verify is where a flag-vs-digest mismatch bites).
+
 | Outcome | When |
 |---------|------|
 | `PASS` | all checks met |
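The deep check added in this hunk asks for evidence that every new symbol is referenced and that no dead code was introduced. The text leaves the method to the resolver, so the following is only one possible aid, not the engine's behavior: a single-file scan with the standard `ast` module that lists functions defined but never named anywhere else in the same source. `unreferenced_functions` is a hypothetical helper.

```python
import ast
from typing import Set


def unreferenced_functions(source: str) -> Set[str]:
    """Return names of functions defined in `source` but never referenced in it."""
    tree = ast.parse(source)
    defined = {
        node.name
        for node in ast.walk(tree)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
    }
    referenced: Set[str] = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Name):
            referenced.add(node.id)  # plain names, e.g. used()
        elif isinstance(node, ast.Attribute):
            referenced.add(node.attr)  # attribute access, e.g. obj.used()
    return defined - referenced
```

A name reported here may still be called from another module, so the result is a prompt for a project-wide reference search rather than proof of dead code.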
@@ -41,8 +61,10 @@ If any is false, stop and return to Build — there is nothing to verify yet.
 
 ## Exit gate / Next
 
-
+<exit_gate>
+- [ ] Evidence confirmed, non-functional risks checked, outcome recorded — a person approved, or
 (under `autonomy: auto` with no residue) the run auto-resolved as the accountable owner.
+</exit_gate>
 
 ```bash
 python3 .add/tooling/add.py gate PASS # marks the task done
@@ -6,7 +6,7 @@ about the feature finally appears. Fill **§7** in TASK.md.
 
 ## Do
 
-1. **Release behind a
+1. **Release behind a scope-of-impact limit** — feature flag and/or gradual rollout.
 2. **Reuse scenarios as monitors** — the §2 scenarios that defined "correct" now
 define what you alert on: overall error rate, each rejection's rate (a spike in
 one is a signal), latency of the risky operation under load.
@@ -15,16 +15,24 @@ about the feature finally appears. Fill **§7** in TASK.md.
 
 ## AI prompt
 
->
-
-
-
+<prompt>
+Role: a reliability analyst feeding the next cycle.
+Read first: telemetry · objectives · incidents.
+Objective: turn what production shows into the next SPEC delta.
+Steps:
+1. Report error-budget burn.
+2. Cluster errors and surface the top real-world failures.
+3. Draft a SPEC delta with evidence links.
+Never: auto-roll-back — recommend; a human owns the production decision.
+</prompt>
 
 ## Exit gate
 
+<exit_gate>
 - [ ] Released behind a flag/rollout.
 - [ ] Scenario-based monitors live.
 - [ ] A reviewed spec delta captured (becomes the next `new-task`).
+</exit_gate>
 
 ## Next
 
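The observe phase reuses each rejection's rate as a monitor and treats a spike in one as a signal. A minimal sketch of that idea follows, assuming rejections are logged as named error codes and that a "spike" means exceeding a multiple of a baseline rate. The threshold factor, function names, and sample codes are all hypothetical; the package does not specify them in this diff.

```python
from collections import Counter
from typing import Dict, Iterable, List


def rejection_rates(error_codes: Iterable[str], total_requests: int) -> Dict[str, float]:
    """Rate of each named rejection code over all requests in a window."""
    if total_requests <= 0:
        return {}
    counts = Counter(error_codes)
    return {code: n / total_requests for code, n in counts.items()}


def spiking_codes(
    current: Dict[str, float], baseline: Dict[str, float], factor: float = 3.0
) -> List[str]:
    """Codes whose current rate exceeds `factor` times their baseline rate."""
    return sorted(
        code
        for code, rate in current.items()
        if rate > factor * baseline.get(code, 0.0)
    )
```

In line with the prompt's "Never: auto-roll-back", the output of such a check would feed a recommendation to a human rather than trigger an action.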
@@ -1,19 +1,59 @@
-# Chat reports — the
+# Chat reports — the decision-point template (for the AI, not for add.py)
 
 The engine renders artifacts (`report`, `report --decide`, `status`); this file
 governs the CHAT MESSAGE you wrap around them. The digest is the artifact BEHIND
 your presentation, never a replacement for it — and your prose is never a
 replacement for the digest.
 
-Use it every time you report at or near a decision
-bundle
+Use it every time you report at or near a decision point: an intake proposal, a
+bundle approval, a verify gate, a task completion, a milestone close.
+
+## The decision arc — rendered first, above the five blocks
+
+Every report at a human gate opens with the **ARC** — three labelled lines that
+place the decision in the work's whole arc, so the human confirms with sight of
+where this is going, not just the step in front of them. Render it first, then a
+separator, then the unchanged five blocks below:
+
+```
+ARC goal: <the milestone / project goal this decision serves>
+done: <proven progress — tasks done · exit-criteria met · what this gate proves>
+plan: <this gate → the next step → the goal>
+```
+
+- **goal** — the milestone or project goal the decision serves, read from the
+`m-goal` line in `add.py status`; never re-typed from memory.
+- **done** — proven progress only: exit-criteria met/total and tasks done from
+the rollup, plus what this gate proves. An honest fact, never a hope.
+- **plan** — this gate → the next step → the goal, mirroring the rollup's
+`DECIDE NEXT` line.
+
+The arc is required at every human gate: **baseline-lock · contract-freeze ·
+verify · intake · scope · milestone-close · graduation**. The three labels stay
+constant; their content adapts to the gate. The arc is presentation only — it
+adds no gate and changes no PASS / RISK-ACCEPTED / HARD-STOP / freeze outcome.
+
+Its facts are engine-sourced, exactly like EVIDENCE below: goal = `m-goal` ·
+done = exit-criteria met/total + tasks done · plan = `DECIDE NEXT`. If your arc
+and `add.py` output disagree, the engine wins — fix the arc, not the engine.
+
+### Per-gate examples — one shape, gate-specific content
+
+- **verify** — `goal:` ship the decision arc · `done:` report-arc tests 6/6
+green, gate ready · `plan:` PASS this gate → wire the arc into every gate → goal.
+- **contract-freeze** — `goal:` … · `done:` bundle drafted, lowest-confidence
+flag surfaced · `plan:` freeze §3 → build → goal.
+- **milestone-close** — `goal:` … · `done:` exit-criteria 3/3 met, all tasks
+done · `plan:` close → archive → the next milestone.
+- **intake** — `goal:` the sized request · `done:` classified new-major,
+rationale stated · `plan:` create the milestone → first contract → goal.
 
 ## The five blocks, in order
 
 ```
 SUMMARY one line: intent + target + where we are
 DECISION what you need from the human (or "none — FYI")
-⚠ FLAGS
+⚠ FLAGS lowest-confidence first, why + cost-if-wrong
 EVIDENCE small table: tests · gates · parity · check — engine-sourced
 NEXT the single next action + what it unlocks
 ```
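Because the ARC's three lines are meant to be filled from engine output rather than from memory, a report author could assemble them from already-extracted values. The sketch below assumes those values have been read from `add.py status` by some other means. The function name, its parameters, and the exact spacing of the rendered lines are assumptions, since this diff shows the template only with its original alignment collapsed.

```python
def render_arc(
    goal: str,
    criteria_met: int,
    criteria_total: int,
    tasks_done: int,
    gate_proves: str,
    next_step: str,
) -> str:
    """Build the three labelled ARC lines from engine-sourced values."""
    done = (
        f"exit-criteria {criteria_met}/{criteria_total} met · "
        f"{tasks_done} tasks done · {gate_proves}"
    )
    plan = f"this gate → {next_step} → the goal"
    return "\n".join([f"ARC goal: {goal}", f"done: {done}", f"plan: {plan}"])
```

Keeping the labels fixed and passing only the content mirrors the rule that the three labels stay constant while their content adapts to the gate.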
@@ -24,7 +64,7 @@ NEXT the single next action + what it unlocks
 2. **DECISION** — the question the human must answer, stated plainly; exactly
 one decision per report, or an explicit "none — FYI". If a decision exists,
 ask it AFTER everything below has been shown (show-before-ask).
-3. **⚠ FLAGS** —
+3. **⚠ FLAGS** — lowest-confidence first, each with *why* confidence is lowest and the
 *cost if wrong*. Where TASK.md markers exist (`⚠` / `- [~]` / `- [ ]`),
 quote them verbatim and keep their document order — extraction ≠ judgment.
 4. **EVIDENCE** — engine-sourced facts pasted from `add.py` output, never
@@ -34,15 +74,33 @@ NEXT the single next action + what it unlocks
 line when it is right; overrule it only with a stated reason (e.g. planned
 tasks the state file cannot see yet).
 
+**The ask itself** — when block 2's decision becomes a literal question component
+(option picker, numbered menu), compose it as a summary: the detail stays in the
+report above, the question carries intent + what "yes" means + the flag count.
+
 ## Hard rules
 
+<constraints>
 - **Summary-first.** Never bury the decision under a task list or a diff.
 - **Show before ask.** Render the artifact (digest · diff · report) before any
 approval question; the human decides on what they can see.
-- **
+- **Reconcile the count.** Before the ask, your ⚠ FLAGS must reconcile with
+`add.py report --decide`'s open-item count. If your prose calls an item
+resolved while the digest still counts it open, the engine wins — fix the data
+(the TASK.md markers the digest reads), not the sentence. A report whose flag
+count disagrees with the engine is the un-transparent gate the ARC exists to close.
+- **Never pre-stamp a human decision point.** Freeze / gate / lock fields stay DRAFT or
 blank until the answer returns: show → ask → stamp → advance. An artifact
 must never claim an approval that has not happened.
-- **One report per
+- **One report per decision point.** After an approval, point at the frozen artifact —
 do not re-render the whole bundle.
 - **Honest scope.** "Done" means the request, not the last task: report
 "task 2/3", never "done" while approved scope remains.
+- **The question is a summary, never the artifact.** Every approval ask carries
+two layers: a compact SUMMARY · DECISION · ⚠ FLAGS block sits in chat
+immediately before the ask (positional), and the question text itself is a
+summary of two lines at most — intent + what "yes" means + the flag count —
+pointing at the report above (compositional). The full bundle, diff, or
+artifact lives only in the chat report; a question that re-carries it buries
+the decision.
+</constraints>
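The "Reconcile the count" rule compares the flags in a chat report with the open-item count from `add.py report --decide`. How the engine actually computes that count is not shown in this diff. The sketch below assumes it corresponds to TASK.md lines beginning with one of the markers named earlier (`⚠`, `- [~]`, `- [ ]`), and both function names are hypothetical.

```python
# Markers the text names as open items in TASK.md (assumed to be line prefixes).
OPEN_MARKERS = ("⚠", "- [~]", "- [ ]")


def count_open_items(task_md: str) -> int:
    """Count lines in TASK.md text that start with an open-item marker."""
    return sum(
        1 for line in task_md.splitlines() if line.strip().startswith(OPEN_MARKERS)
    )


def reconcile(report_flag_count: int, task_md: str) -> bool:
    """True when the report's flag count matches the open markers in TASK.md."""
    return report_flag_count == count_open_items(task_md)
```

When the two disagree, the rule in the text applies: the fix goes into the TASK.md markers, not into the report's wording.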