gm-kilo 2.0.888 → 2.0.889

@@ -3,30 +3,34 @@ name: gm-emit
3
3
  description: EMIT phase. Pre-emit debug, write files, post-emit verify from disk. Any new unknown triggers immediate snake back to planning — restart chain.
4
4
  ---
5
5
 
6
- # GM EMIT — Write and Verify
6
+ # GM EMIT — Write and verify from disk
7
7
 
8
- GRAPH: `PLAN → EXECUTE → [EMIT] → VERIFY → COMPLETE`
9
- Entry: all mutables KNOWN. From `gm-execute` or re-entered from VERIFY.
8
+ Entry: every mutable KNOWN, from `gm-execute` or re-entered from VERIFY. Exit: gates clear → `gm-complete`.
10
9
 
11
- ## TRANSITIONS
10
+ Cross-cutting dispositions live in `gm` SKILL.md.
12
11
 
13
- **EXIT → VERIFY**: all gate conditions true → invoke `gm-complete` immediately.
14
- **SELF-LOOP**: post-emit variance with known cause → fix, re-verify, stay in EMIT.
15
- **REGRESS → EXECUTE**: pre-emit reveals known logic error.
16
- **REGRESS → PLAN**: pre-emit reveals new unknown | post-emit variance with unknown cause | scope changed.
12
+ ## Transitions
17
13
 
18
- ## LEGITIMACY GATE (before pre-emit run)
14
+ - All gates clear → `gm-complete`
15
+ - Post-emit variance with known cause → fix in-band, re-verify, stay in EMIT
16
+ - Pre-emit reveals known logic error → `gm-execute`
17
+ - Pre-emit reveals new unknown OR post-emit variance with unknown cause OR scope changed → `planning`
19
18
 
20
- For every claim landing in a file:
21
- 1. **Earned specificity** — traces to `authorization=witnessed`, not inflated from weak prior?
22
- 2. **Repair legality** — local patch dressed as structural repair? Downgrade scope or snake to PLAN.
23
- 3. **Lawful downgrade** — can a weaker, true statement replace it? PREFER the downgrade.
24
- 4. **Alternative-route suppression** — live competing route being silenced? Preserve it.
25
- 5. **Strongest objection** — if a reviewer pushed back on this change, what would the sharpest argument be? Articulate it. Cannot articulate = have not understood the alternatives = regress to `gm-execute`.
19
+ ## Legitimacy gate (before pre-emit run)
26
20
 
27
- Fail any regress to `gm-execute` to witness what was missing, or `planning` if gap is structural.
21
+ For every claim landing in a file, answer five questions:
28
22
 
29
- ## PRE-EMIT RUN (mandatory before writing any file)
23
+ 1. Earned specificity — does it trace to `authorization=witnessed`, or is it inflated from a weak prior?
24
+ 2. Repair legality — is a local patch dressed as structural repair? Downgrade scope or regress to PLAN.
25
+ 3. Lawful downgrade — can a weaker, true statement replace it? Prefer the downgrade.
26
+ 4. Alternative-route suppression — is a live competing route being silenced? Preserve it.
27
+ 5. Strongest objection — what would the sharpest reviewer pushback be? Articulate it. Cannot articulate = have not understood the alternatives → `gm-execute`.
28
+
29
+ Any failure regresses to `gm-execute` to witness what was missing, or `planning` if the gap is structural.
30
+
31
+ ## Pre-emit run
32
+
33
+ Mandatory before writing any file.
30
34
 
31
35
  ```
32
36
  exec:nodejs
@@ -34,58 +38,29 @@ const { fn } = await import('/abs/path/to/module.js');
34
38
  console.log(await fn(realInput));
35
39
  ```
36
40
 
37
- 1. Import actual module from disk — witness current behavior as baseline
38
- 2. Run proposed logic in isolation WITHOUT writing — witness with real inputs
39
- 3. Probe failure paths with real error inputs
40
- 4. Compare: matches expected → write. Unexpected → new unknown → `planning`.
41
+ Import the actual module from disk to witness current behavior as the baseline. Run the proposed logic in isolation without writing — witness with real inputs and with real error inputs. Match expected → write. Unexpected → new unknown → `planning`.
41
42
 
42
- ## WRITING FILES
43
+ ## Writing
43
44
 
44
- `exec:nodejs` with `require('fs')`. Write only when every gate mutable resolved simultaneously.
45
+ `exec:nodejs` with `require('fs')`. Write only when every gate mutable resolves simultaneously.
45
46
 
46
- ## POST-EMIT VERIFICATION (immediately after writing)
47
+ ## Post-emit verification
47
48
 
48
- 1. Re-import from disk (not in-memory — stale is inadmissible)
49
- 2. Run identical inputs as pre-emit — must match pre-emit baseline exactly
50
- 3. Known variance → fix immediately, re-verify (EMIT self-loop)
51
- 4. Unknown variance → new unknown → invoke `planning`
49
+ Re-import from disk — in-memory state is stale and inadmissible. Run identical inputs as pre-emit; output must match the baseline exactly. Known variance → fix and re-verify (self-loop). Unknown variance → `planning`.
52
50
 
53
- ## GATE CONDITIONS (all true simultaneously)
51
+ ## Gate (all true at once)
54
52
 
55
- - Legitimacy gate passed; none of five refused collapses
56
- - Pre-emit passed with real inputs + error inputs
53
+ - Legitimacy gate passed; no refused collapse
54
+ - Pre-emit passed with real inputs and real error inputs
57
55
  - Post-emit matches pre-emit exactly
58
- - Hot reloadable; errors throw with context (no fallbacks, `|| default`, `catch { return null }`)
59
- - No mocks/fakes/stubs/scattered test files (delete on discovery)
60
- - Behavior change in this emit = a corresponding assertion in test.js (a change no test would catch is a change you cannot prove)
61
- - If this emit changes any browser-facing code (client/, served HTML/JS, shaders, page-bundle imports, gh-pages assets), the post-emit verify MUST include a live browser witness via `exec:browser` (boot server → page.goto → page.evaluate asserting the invariant the change established). Node-side import + test.js does NOT satisfy this — see `gm-complete` BROWSER VALIDATION GATE.
62
- - Files ≤200 lines
63
- - No duplicate concern (run exec:codesearch for primary concern after writing; any overlap → `planning`)
64
- - No comments; no hardcoded values; no adjectives in identifiers; no unnecessary files
65
- - Observability: new server subsystems expose `/debug/<subsystem>`; new client modules in `window.__debug`
66
- - Structure: no if/else where dispatch table suffices; no one-liners that require decoding; no reinvented APIs
67
- - All facts resolved this phase memorized via background Agent(memorize)
68
- - CHANGELOG.md updated; TODO.md cleared/deleted
69
-
70
- ## FIX ON SIGHT — HARD RULE
71
-
72
- Pre-emit run, post-emit run, or legitimacy gate surfaces ANY issue (failing assertion, stderr, type/lint error, unexpected variance, broken import, runtime throw) → fix at root cause this turn, re-run pre-emit AND post-emit, advance only when all gates pass simultaneously. Never write-and-promise-fix-later, never `try/catch`-to-hide, never `.skip`, never silence with redirection. Known variance → fix and re-verify (self-loop). Unknown variance → regress to `planning`.
73
-
74
- ## CODE EXECUTION
75
-
76
- `exec:<lang>` only. File writes via exec:nodejs + require('fs'). Never Bash(node/npm/npx/bun).
77
- Pack runs: Promise.allSettled, each idea own try/catch, under 12s per call.
78
-
79
- ## CODEBASE SEARCH
80
-
81
- `exec:codesearch` only. Grep/Glob/Find/Explore = hook-blocked. Known path → `Read`.
82
-
83
- ## MEMORIZE
84
-
85
- ```
86
- Agent(subagent_type='gm:memorize', model='haiku', run_in_background=true, prompt='## CONTEXT TO MEMORIZE\n<fact>')
87
- ```
88
-
89
- Same turn as resolution. Parallel when multiple. End-of-turn self-check mandatory.
90
-
91
- **Never**: write before pre-emit | advance with post-emit variance | absorb surprises | respond to user mid-phase
56
+ - Hot-reloadable; errors throw with context (no `|| default`, no `catch { return null }`, no fallbacks)
57
+ - No mocks, fakes, stubs, or scattered test files (delete on discovery)
58
+ - Any behavior change has a corresponding assertion in `test.js` — a change no test catches is a change you cannot prove
59
+ - Browser-facing change → post-emit verify includes a live `exec:browser` witness (boot server → `page.goto` → `page.evaluate` asserting the invariant the change established). Node-side import + test.js does not satisfy this — the final gate runs again in `gm-complete`.
60
+ - Files ≤ 200 lines
61
+ - No duplicate concern (run `exec:codesearch` for the primary concern after writing; overlap → `planning`)
62
+ - No comments, no hardcoded values, no adjectives in identifiers, no unnecessary files
63
+ - Observability: new server subsystems expose `/debug/<subsystem>`; new client modules register in `window.__debug`
64
+ - Structure: no if/else where dispatch suffices; no one-liners that obscure; no reinvented APIs
65
+ - Every fact resolved this phase memorized via background `Agent(memorize)`
66
+ - CHANGELOG.md updated; TODO.md cleared or deleted
@@ -3,70 +3,61 @@ name: gm-execute
3
3
  description: EXECUTE phase AND the foundational execution contract for every skill. Every exec:<lang> run, every witnessed check, every code search, in every phase, follows this skill's discipline. Resolve all mutables via witnessed execution. Any new unknown triggers immediate snake back to planning — restart chain from PLAN.
4
4
  ---
5
5
 
6
- # GM EXECUTE — Resolve Every Unknown
6
+ # GM EXECUTE — Resolve every unknown by witness
7
7
 
8
- GRAPH: `PLAN → [EXECUTE] → EMIT → VERIFY → COMPLETE`. Entry: .prd with named unknowns.
8
+ Entry: `.prd` with named unknowns. Exit: every mutable KNOWN → invoke `gm-emit`.
9
9
 
10
- This skill = execution contract for ALL phases. About to run anything → load this first.
10
+ This skill is the execution contract for ALL phases — pre-emit witnesses, post-emit verifies, e2e checks all run on this discipline. Cross-cutting dispositions live in `gm` SKILL.md.
11
11
 
12
- ## TRANSITIONS
12
+ ## Transitions
13
13
 
14
- - **EXIT → EMIT**: all mutables KNOWN → invoke `gm-emit`.
15
- - **SELF-LOOP**: still UNKNOWN → re-run different angle (max 2 passes).
16
- - **REGRESS → PLAN**: new unknown | unresolvable after 2 passes.
14
+ - All mutables KNOWN → `gm-emit`
15
+ - Still UNKNOWN → re-run from a different angle (max 2 passes)
16
+ - New unknown OR unresolvable after 2 passes → `planning`
17
17
 
18
- ## MUTABLE DISCIPLINE
18
+ ## Mutable discipline
19
19
 
20
- Each mutable: name | expected | current | resolution method.
20
+ Each mutable carries: name, expected, current, resolution method.
21
21
 
22
- Resolves to KNOWN only when ALL four pass:
23
- - **ΔS=0** — witnessed output equals expected
24
- - **λ≥2** — two independent paths agree
25
- - **ε intact** — adjacent invariants hold
26
- - **Coverage≥0.70** — enough corpus inspected
27
-
28
- Unresolved after 2 passes = regress to `planning`. Never narrate past an unresolved mutable.
29
-
30
- ## PRIORS DON'T AUTHORIZE
31
-
32
- Route candidates from PLAN = `weak_prior` only. Plausibility = right to TEST, not BELIEVE.
33
- weak_prior → witnessed probe → witnessed → feed to EMIT. "The plan says" / "obviously X" = prior, not fact.
22
+ Resolves to KNOWN only when all four pass:
34
23
 
35
- Claims in response prose stand or fall by their last witness. A claim with no witness in this session is a hypothesis, not a finding — say so when you state it, and say what would settle it. The next reader (you, next turn) needs to know which lines were earned and which were carried forward.
24
+ - **ΔS = 0** — witnessed output equals expected
25
+ - **λ ≥ 2** — two independent paths agree
26
+ - **ε intact** — adjacent invariants hold
27
+ - **Coverage ≥ 0.70** — enough corpus inspected to rule out contradiction
36
28
 
37
- ## VERIFICATION BUDGET
29
+ Unresolved after 2 passes regresses to `planning`. Never narrate past an unresolved mutable.
38
30
 
39
- Spend on `.prd` items in descending order of consequence-if-wrong × distance-from-witnessed. Items whose failure would collapse the headline finding must reach witnessed status before EMIT; items with sub-argument-level consequence need at minimum a stated fallback path.
31
+ Route candidates from PLAN are `weak_prior` only. Plausibility is the right to test, not the right to believe. A claim with no witness in the current session is a hypothesis — say so when stating it, and say what would settle it. The next reader (you, next turn) needs to know which lines were earned and which were carried forward.
40
32
 
41
- ## CODE EXECUTION
33
+ ## Verification budget
42
34
 
43
- `exec:<lang>` only via Bash: `exec:<lang>\n<code>`
35
+ Spend on `.prd` items in descending order of consequence-if-wrong × distance-from-witnessed. Items whose failure would collapse the headline finding must reach witnessed status before EMIT; sub-argument-level items need at minimum a stated fallback path.
44
36
 
45
- Langs: `nodejs` (default) | `bash` | `python` | `typescript` | `go` | `rust` | `c` | `cpp` | `java` | `deno` | `cmd`
37
+ ## Code execution
46
38
 
47
- File I/O: exec:nodejs + require('fs'). Git directly in Bash. **Never** Bash(node/npm/npx/bun).
39
+ `exec:<lang>` only via Bash. Languages: nodejs (default), bash, python, typescript, go, rust, c, cpp, java, deno, cmd. File I/O via `exec:nodejs` + `require('fs')`. Git directly in Bash. Never `Bash(node/npm/npx/bun)`.
48
40
 
49
- Pack runs: Promise.allSettled parallel, each idea own try/catch, under 12s per call.
50
- Runner: `exec:runner\nstart|stop|status`
41
+ Pack runs: `Promise.allSettled`, each idea own try/catch, under 12s per call. Runner: `exec:runner\n{start|stop|status}`.
51
42
 
52
- Every exec daemonizes. The hook tails the task logfile up to 30s wall-clock and returns whatever it has — short tasks complete inside the window and look synchronous; long tasks return a task_id with partial output and the agent continues with `exec:tail` (drain more output, bounded), `exec:watch` (resume blocking until text/regex match or timeout), or `exec:close` (terminate). Never re-spawn a long task to "check on it" — that orphans the first one. `exec:wait` is a pure timer with no log scanning; `exec:sleep` blocks on a specific task's output; `exec:watch` is the match-or-timeout primitive. Every interaction with the execution platform returns the live list of running tasks for this session — close stragglers via `exec:close\n<id>` so the list stays scannable. Session-end (clear/logout/prompt_input_exit) kills the session's tasks; compaction/handoff preserves them.
43
+ Every exec daemonizes. The hook tails the task logfile up to 30s wall-clock and returns whatever is there — short tasks complete inside the window and look synchronous; long tasks return a task_id with partial output. Continue with `exec:tail` (drain, bounded), `exec:watch` (resume blocking until match or timeout), or `exec:close` (terminate). Never re-spawn a long task to check on it — that orphans the first one. `exec:wait` is a pure timer; `exec:sleep` blocks on a specific task's output; `exec:watch` is the match-or-timeout primitive. Every execution-platform RPC returns the live list of running tasks for this session — close stragglers via `exec:close\n<id>` so the list stays scannable. Session-end (clear/logout/prompt_input_exit) kills the session's tasks; compaction/handoff preserves them.
53
44
 
54
- ## CODEBASE SEARCH
45
+ Utility verbs (`exec:wait`, `exec:sleep`, `exec:status`, `exec:close`, `exec:pause`, `exec:type`, `exec:runner`, `exec:kill-port`, `exec:recall`, `exec:memorize`, `exec:forget`) take their argument on the next line. Inline form (`exec:status <id>`) is denied by the hook.
55
46
 
56
- `exec:codesearch` only. Grep/Glob/Find/Explore/grep/rg/find = hook-blocked.
47
+ ## Codebase search
57
48
 
58
- Known absolute path → `Read`. Known dir → exec:nodejs + fs.readdirSync.
49
+ `exec:codesearch` only. Grep, Glob, Find, Explore, raw grep/rg/find inside `exec:bash` are all hook-blocked.
59
50
 
60
51
  ```
61
52
  exec:codesearch
62
53
  <two-word query>
63
54
  ```
64
55
 
65
- Iterate: change/add one word per pass. Min 4 attempts before concluding absent.
56
+ Start two words, change/add one per pass, minimum four attempts before concluding absent. Known absolute path → `Read`. Known directory → `exec:nodejs` + `fs.readdirSync`.
66
57
 
67
- ## IMPORT-BASED EXECUTION
58
+ ## Import-based execution
68
59
 
69
- Always import actual modules. Reimplemented = UNKNOWN.
60
+ Hypotheses become real by importing actual modules from disk. Reimplemented behavior is UNKNOWN.
70
61
 
71
62
  ```
72
63
  exec:nodejs
@@ -74,76 +65,16 @@ const { fn } = await import('/abs/path/to/module.js');
74
65
  console.log(await fn(realInput));
75
66
  ```
76
67
 
77
- Differential diagnosis: smallest reproduction → compare actual vs expected → name the delta = mutable.
78
-
79
- ## CI — AUTOMATED
80
-
81
- `git push` → Stop hook auto-watches Actions for pushed HEAD. Same-repo only — downstream cascades not auto-watched.
82
- - Green → Stop approves with summary
83
- - Failure → run names+IDs → `gh run view <id> --log-failed`
84
- - Deadline 180s (override `GM_CI_WATCH_SECS`)
85
-
86
- ## GROUND TRUTH
87
-
88
- Real services, real data, real timing. Mocks/stubs/scattered tests/fallbacks = delete.
89
-
90
- **Scan before edit**: exec:codesearch before creating/modifying. Duplicate concern = regress to `planning`.
91
- **Hypothesize via execution**: hypothesis → run → witness → edit. Never edit on unwitnessed assumption.
92
- **Code quality**: native → library → structure (map/pipeline) → write.
93
-
94
- ## PARALLEL SUBAGENTS
95
-
96
- ≤3 `gm:gm` subagents for independent items in ONE message. Browser escalation: exec:browser → browser skill → screenshot last resort.
97
-
98
- ## RECALL — HARD RULE
99
-
100
- Before resolving any new unknown via fresh execution, recall first.
101
-
102
- ```
103
- exec:recall
104
- <2-6 word query>
105
- ```
106
-
107
- Triggers: "did we hit this" | feels familiar | new sub-task in known project | about to comment a non-obvious choice | about to ask user something likely discussed.
108
-
109
- Hits = weak_prior; still witness. Empty = proceed. Capped 6s, ~5ms when serve running. ~200 tokens / 5 hits.
110
-
111
- ## MEMORIZE — HARD RULE
112
-
113
- Unknown→known = same-turn memorize.
114
-
115
- ```
116
- Agent(subagent_type='gm:memorize', model='haiku', run_in_background=true, prompt='## CONTEXT TO MEMORIZE\n<fact>')
117
- ```
118
-
119
- Triggers: exec output answers prior unknown | CI log reveals root cause | code read confirms/refutes | env quirk | user states preference/constraint.
120
-
121
- N facts → N parallel Agent calls in ONE message. End-of-turn self-check mandatory.
122
-
123
- ## FIX ON SIGHT — HARD RULE
124
-
125
- Issue surfaced mid-execution (failing test, exec stderr, broken import, runtime exception, lint/type error, deprecation warning, unexpected output) is fixed THIS turn, at root cause, in-band. Never `// TODO`, never `try/catch`-to-swallow, never `2>/dev/null`, never `.skip`, never "out of scope" inside the same file. Re-witness after fix. New unknown surfaced by the fix → regress to `planning`. Genuine out-of-scope → write a `.gm/prd.yml` item before continuing.
126
-
127
- **Incidental errors auto-plan**: a reasonably-fixable issue that is *not* what the user asked about — pre-existing build break, lockfile drift, broken dep feature, dead import in adjacent module, missing artifact, neighboring lint failure — still belongs to the agent. Add it to `.gm/prd.yml` the same turn it surfaces and execute it before COMPLETE. Do not ask the user; do not narrate past it; do not file it as "next session." Only errors that genuinely need user credentials, decisions, or external services that are down are name-and-stop, recorded with `blockedBy: external`.
128
-
129
- **Obvious re-architecting auto-plans**: same discipline for clear refactor wins surfaced mid-task — code competing with an existing library/package that does the same thing, multi-file ad-hoc logic one import would replace, duplicated logic asking for one helper. Regress to `planning`, add the item, execute. Bar is *obvious + reachable from this session*; speculative refactors stay out.
130
-
131
- **Cross-session PRD**: items in `.gm/prd.yml` from prior sessions are this session's work the moment they're discovered. Finish every item in the file before COMPLETE — including ones the current user message did not mention. "From another session" is not an exemption.
132
-
133
- ## BROWSER WITNESS — HARD RULE
68
+ Differential diagnosis: smallest reproduction → compare actual vs expected → name the delta — that delta is the mutable.
134
69
 
135
- Editing browser-facing code (under `client/`, `docs/`, `*.html`, shaders, page-bundle imports, served JS/CSS, gh-pages assets, anything imported by a browser entry, anything visible in DOM/canvas/WebGL) → live `exec:browser` witness in THIS phase, same turn as the edit. Not deferred to EMIT, not deferred to VERIFY — those layers re-witness on top, they don't replace this one.
70
+ ## Edits depend on witnesses
136
71
 
137
- Protocol on every client edit:
138
- 1. Boot server / open page → HTTP 200 witnessed
139
- 2. `exec:browser` → `page.goto(url)` → poll the affected global (`window.__app.<system>`, `window.__debug.<module>`)
140
- 3. `page.evaluate` asserting the specific invariant the change establishes — capture numbers
141
- 4. Variance → fix at root cause, re-witness (FIX ON SIGHT). Never advance to EMIT on unwitnessed client behavior.
72
+ Hypothesis → run → witness → edit. An edit before a witness is a guess. Scan via `exec:codesearch` before creating or modifying — duplicate concern regresses to `planning`. Code-quality preference: native → library → structure → write.
142
73
 
143
- Forbidden: `node test.js` green as a substitute | screenshot without evaluate | "I'll check it in VERIFY" then skipping | committing a client diff without an `exec:browser` block this turn. Exempt only for server-only / no-browser repos; tag the exemption explicitly.
74
+ ## Parallel subagents
144
75
 
145
- ## CONSTRAINTS
76
+ Up to 3 `gm:gm` subagents for independent items in one message. Browser escalation: `exec:browser` → `browser` skill → screenshot only as last resort.
146
77
 
147
- **Never**: Bash(node/npm/npx/bun) | fake data | mocks | scattered tests | fallbacks | Grep/Glob/Find/Explore | sequential independent items | respond mid-phase | edit before witnessing | duplicate code | if/else where dispatch suffices | one-liners that obscure | reinvent native/library
78
+ ## CI is automated
148
79
 
149
- **Always**: witness every hypothesis | import real modules | scan before edit | regress on new unknown | delete mocks/comments/scattered tests on discovery | update test.js for behavior changes | invoke next skill immediately when done | weight verification by load
80
+ `git push` triggers the Stop hook to watch Actions for the pushed HEAD on the same repo (downstream cascades are not auto-watched). Green → Stop approves with summary; failure → run names + IDs surfaced, investigate via `gh run view <id> --log-failed`. Deadline 180s (override `GM_CI_WATCH_SECS`).
@@ -3,24 +3,25 @@ name: governance
3
3
  description: Governance reference invoked by PLAN/EXECUTE/EMIT/VERIFY. Separates route discovery (PLAN) from weak-prior handoff (EXECUTE) from earned-emission legitimacy (EMIT/VERIFY). Encodes 16-failure taxonomy, 4 state planes, ΔS/λ/ε/Coverage metrics, governance stress suite.
4
4
  ---
5
5
 
6
- # Governance — Route, Bridge, Legitimacy
6
+ # Governance — Route, bridge, legitimacy
7
7
 
8
- Three roles, three failure surfaces:
9
- 1. **Route discovery** — what family of fault? Owned by `planning`.
10
- 2. **Weak-prior bridge** — plausibility ≠ authorization. Owned by `gm-execute`.
11
- 3. **Legitimacy gate** — did this answer earn its strength? Owned by `gm-emit`/`gm-complete`.
8
+ Three roles, three failure surfaces.
12
9
 
13
- ## Five Refused Collapses
10
+ 1. Route discovery — what family of fault? Owned by `planning`.
11
+ 2. Weak-prior bridge — plausibility is not authorization. Owned by `gm-execute`.
12
+ 3. Legitimacy gate — did this answer earn its strength? Owned by `gm-emit` and `gm-complete`.
14
13
 
15
- 1. Route → authorization ("plan looks good" → "code is right")
16
- 2. Candidate → structural repair (local patch presented as architectural fix)
14
+ ## Five refused collapses
15
+
16
+ 1. Route → authorization ("plan looks good" treated as "code is right")
17
+ 2. Candidate → structural repair (local patch shipped as architectural fix)
17
18
  3. Hidden → public law (internal convenience shipped as contract)
18
- 4. Cleanliness → legitimacy (compiles = evidence-supports)
19
+ 4. Cleanliness → legitimacy (compiles treated as evidence-supports)
19
20
  5. One strong route → universal closure (best answer treated as only answer)
20
21
 
21
- When in doubt: preserve ambiguity. Lawful downgrade beats forced closure.
22
+ When in doubt, preserve ambiguity. Lawful downgrade beats forced closure.
22
23
 
23
- ## 7 Route Families
24
+ ## 7 route families
24
25
 
25
26
  | Family | What breaks | Repair |
26
27
  |---|---|---|
@@ -32,7 +33,7 @@ When in doubt: preserve ambiguity. Lawful downgrade beats forced closure.
32
33
  | boundary | Interfaces, contracts, seams | Re-assert contract from one source |
33
34
  | representation | Data shape, schema, type | Make illegal states unrepresentable |
34
35
 
35
- ## 16 Failure Modes
36
+ ## 16 failure modes
36
37
 
37
38
  | # | Name | Family |
38
39
  |---|---|---|
@@ -53,7 +54,7 @@ When in doubt: preserve ambiguity. Lawful downgrade beats forced closure.
53
54
  | 15 | Deployment deadlock | execution |
54
55
  | 16 | Pre-deploy collapse | execution |
55
56
 
56
- ## 4 State Planes
57
+ ## 4 state planes
57
58
 
58
59
  | Plane | Owner | States | Implication |
59
60
  |---|---|---|---|
@@ -62,18 +63,18 @@ When in doubt: preserve ambiguity. Lawful downgrade beats forced closure.
62
63
  | repair_legality | gm-emit | unverified → local_candidate → structural | Local cannot ship as structural |
63
64
  | hidden_decision_posture | gm-complete | open → down_weighted → closed | Close only after CI green |
64
65
 
65
- ## Quality Metrics
66
+ ## Quality metrics
66
67
 
67
68
  - **ΔS** — witnessed output equals expected. ΔS≠0 = still open.
68
- - **λ≥2** — two independent paths agree. λ=1 = still unknown.
69
+ - **λ ≥ 2** — two independent paths agree. λ=1 = still unknown.
69
70
  - **ε** — adjacent invariants hold (types, tests, neighboring callers).
70
- - **Coverage≥0.70** — enough corpus inspected to rule out contradicting evidence.
71
+ - **Coverage ≥ 0.70** — enough corpus inspected to rule out contradicting evidence.
71
72
 
72
- All four must pass before mutable flips UNKNOWN→KNOWN.
73
+ All four pass before a mutable flips UNKNOWN → KNOWN.
73
74
 
74
- ## Stress Suite (8 Cases)
75
+ ## Stress suite
75
76
 
76
- Run before declaring COMPLETE:
77
+ Run before declaring COMPLETE.
77
78
 
78
79
  | # | Case | Failure if flunked |
79
80
  |---|---|---|
@@ -86,11 +87,11 @@ Run before declaring COMPLETE:
86
87
  | A1 | Authenticity eval partial signals | Surface appearance beats evidence |
87
88
  | D1 | Deploy-gate under CI flake | Treats noise as green |
88
89
 
89
- Legal: illegal_commitment=0, evidence_boundary_violation=0, lawful_downgrade=available in all 8, outlier_visibility=preserved.
90
+ Legal: `illegal_commitment=0`, `evidence_boundary_violation=0`, `lawful_downgrade=available` in all 8, `outlier_visibility=preserved`.
90
91
 
91
- ## Phase Application
92
+ ## Phase application
92
93
 
93
94
  - **planning** — tag every `.prd` item with route family + failure-mode IDs
94
- - **gm-execute** — weak prior only; witnessed probe required before authorization
95
+ - **gm-execute** — weak prior only; witnessed probe before authorization
95
96
  - **gm-emit** — legitimacy gate; unearned specificity → lawful downgrade
96
- - **gm-complete** — stress-suite pass; close posture only after CI green
97
+ - **gm-complete** — stress-suite pass; close posture only when CI is green