gm-cc 2.0.569 → 2.0.570

@@ -4,7 +4,7 @@
  "name": "AnEntrypoint"
  },
  "description": "State machine agent with hooks, skills, and automated git enforcement",
- "version": "2.0.569",
+ "version": "2.0.570",
  "metadata": {
  "description": "State machine agent with hooks, skills, and automated git enforcement"
  },
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "gm-cc",
- "version": "2.0.569",
+ "version": "2.0.570",
  "description": "State machine agent with hooks, skills, and automated git enforcement",
  "author": "AnEntrypoint",
  "license": "MIT",
package/plugin.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "gm",
- "version": "2.0.569",
+ "version": "2.0.570",
  "description": "State machine agent with hooks, skills, and automated git enforcement",
  "author": {
  "name": "AnEntrypoint",
@@ -17,7 +17,7 @@ Transitions = state changes, not reminders. Phase exit condition met → next Sk
 
  `gm-execute` = execution contract. Defines "running code" across every phase: `exec:<lang>` = only runner; `exec:codesearch` = only exploration; witnessed output = only ground truth; import real modules over reimplementation. Execution happens in every phase, not only EXECUTE. About to run anything, `gm-execute` protocols not fresh in context → operating outside contract → reload `gm-execute` first.
 
- `twin-atlas` = governance reference. Forward Atlas (route discovery, 7 route families, 16 failure taxonomy) feeds `planning`. Bridge (weak-prior transfer plausibility never equals authorization) constrains `gm-execute`. Inverse Atlas (earned specificity, lawful downgrade, five refused collapses) gates `gm-emit` and `gm-complete`. Load once at session start.
+ `governance` = governance reference. Route discovery (7 route families, 16 failure taxonomy) feeds `planning`. Weak-prior bridge (plausibility never equals authorization) constrains `gm-execute`. Legitimacy gate (earned specificity, lawful downgrade, five refused collapses) gates `gm-emit` and `gm-complete`. Load once at session start.
 
  ## FRAGILE LEARNINGS — HARD RULE
 
@@ -56,7 +56,7 @@ One call per fact. **End-of-turn self-check** mandatory: any resolved unknown un
  - `git_pushed=UNKNOWN` until `git log origin/main..HEAD --oneline` returns empty
  - `ci_passed=UNKNOWN` until all GitHub Actions runs triggered by the push reach `conclusion: success`
  - `prd_empty=UNKNOWN` until `.gm/prd.yml` is deleted (not just empty — file must not exist)
- - `stress_suite_clear=UNKNOWN` until the change has been mentally walked through every applicable case in the `twin-atlas` governance stress suite (M1, F1, C1, H1, S1, B1, A1, D1) and none flunks. Flunk = regress to the phase that owns the gap.
+ - `stress_suite_clear=UNKNOWN` until the change has been mentally walked through every applicable case in the `governance` stress suite (M1, F1, C1, H1, S1, B1, A1, D1) and none flunks. Flunk = regress to the phase that owns the gap.
  - `hidden_decision_posture=open` until CI green. Posture advances `open → down_weighted` only when some evidence is in, `down_weighted → closed` only when CI green + stress suite clear. Closing early = collapse #3 (hidden orchestration into public law).
 
  All must resolve to KNOWN (or `closed` for posture) before COMPLETE. Any UNKNOWN = absolute barrier.
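
Reviewer note on the hunk above: the completion barrier it describes can be read as a small state check. The following is a minimal sketch, assuming the mutables are held as a plain name-to-status mapping; the function names `advance_posture` and `completion_blockers` are hypothetical and do not come from the package.

```python
from typing import Dict, List

KNOWN = "KNOWN"
UNKNOWN = "UNKNOWN"


def advance_posture(posture: str, *, evidence_in: bool, ci_green: bool,
                    stress_suite_clear: bool) -> str:
    """Move hidden_decision_posture at most one step forward."""
    if posture == "open" and evidence_in:
        return "down_weighted"
    if posture == "down_weighted" and ci_green and stress_suite_clear:
        return "closed"
    return posture  # no transition condition met: stay where we are


def completion_blockers(mutables: Dict[str, str], posture: str) -> List[str]:
    """List everything that still bars COMPLETE; an empty list means clear."""
    blockers = [name for name, status in mutables.items() if status != KNOWN]
    if posture != "closed":
        blockers.append("hidden_decision_posture=" + posture)
    return blockers


# Hypothetical snapshot: CI has not finished yet, so COMPLETE is barred.
mutables = {
    "git_pushed": KNOWN,
    "ci_passed": UNKNOWN,
    "prd_empty": KNOWN,
    "stress_suite_clear": KNOWN,
}
print(completion_blockers(mutables, "open"))
```

Returning the list of blockers instead of a boolean keeps the "any UNKNOWN = absolute barrier" rule visible: the caller can see which mutable is still unresolved.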
@@ -184,7 +184,7 @@ Before declaring complete, sweep the entire codebase for violations:
  12. **memorize** → every fact surfaced during verification that would have saved this session's time if it had been in memory at the start (CI timing, flaky-test patterns, environment quirks, runtime behaviors, user preferences stated this session) is handed off via a background memorize call at the moment of resolution. One call per fact, non-blocking. `Agent(subagent_type='memorize', model='haiku', run_in_background=true, prompt='## CONTEXT TO MEMORIZE\n<fact>')`
  13. **Deploy/publish** → if deployable, deploy. If npm package, publish.
  14. **GitHub Pages** → check if repo has a GH Pages site. If `.github/workflows/pages.yml` is absent OR `docs/index.html` is absent: invoke the `pages` skill to scaffold the site before advancing.
- 15. **Governance stress-suite sweep** (`twin-atlas`) — walk the finished change against every applicable case: M1 missing-evidence-forced-decision, F1 unsourced-number, C1 ambiguous-clause, H1 contradictory-witnesses, S1 attribution-under-pressure, B1 RCA-live-alternatives, A1 authenticity-partial-signals, D1 deploy-gate-under-flake. Ask per case: did the change over-commit, hide contradiction, or treat surface appearance as evidence? Any flunk = regress to the owning phase. The 8 legal outcomes must hold: illegal commitments=0, evidence-boundary violations=0, lawful downgrades available=8, outlier visibility preserved.
+ 15. **Governance stress-suite sweep** (`governance`) — walk the finished change against every applicable case: M1 missing-evidence-forced-decision, F1 unsourced-number, C1 ambiguous-clause, H1 contradictory-witnesses, S1 attribution-under-pressure, B1 RCA-live-alternatives, A1 authenticity-partial-signals, D1 deploy-gate-under-flake. Ask per case: did the change over-commit, hide contradiction, or treat surface appearance as evidence? Any flunk = regress to the owning phase. The 8 legal outcomes must hold: illegal commitments=0, evidence-boundary violations=0, lawful downgrades available=8, outlier visibility preserved.
 
  Any violation found = fix immediately before advancing.
 
@@ -67,9 +67,9 @@ Only git in bash directly. `Bash(node/npm/npx/bun)` = violations. File writes vi
  - Target under 12s per exec call; split work across multiple calls only when dependencies require it
  - Prefer a single well-structured exec that does 5 things over 5 sequential execs
 
- ## INVERSE ATLAS LEGITIMACY GATE — EARNED SPECIFICITY
+ ## LEGITIMACY GATE — EARNED SPECIFICITY
 
- Before the pre-emit run, apply the Inverse Atlas check from `twin-atlas`. For every claim, assertion, or specific value about to land in a file, answer:
+ Before the pre-emit run, apply the legitimacy check from `governance`. For every claim, assertion, or specific value about to land in a file, answer:
 
  1. **Earned specificity** — does the claim trace to a witnessed mutable (`authorization=witnessed`), or is it inflated from a weak prior?
  2. **Repair legality** — is this a local candidate repair being dressed up as a structural repair? If yes, either downgrade the scope or snake back to PLAN for structural work.
@@ -113,8 +113,8 @@ The post-emit verification is a differential diagnosis against the pre-emit base
 
  ## GATE CONDITIONS (all true simultaneously before advancing)
 
- - Inverse Atlas legitimacy gate passed: every claim traces to `authorization=witnessed`, no weak-prior inflation, no local-candidate-dressed-as-structural, lawful downgrade considered and either taken or explicitly justified, live competing routes preserved
- - None of the five refused collapses (`twin-atlas`): route→authorization | candidate→structural | hidden→public-law | cleanliness→legitimacy | one-route→universal-closure
+ - Legitimacy gate passed: every claim traces to `authorization=witnessed`, no weak-prior inflation, no local-candidate-dressed-as-structural, lawful downgrade considered and either taken or explicitly justified, live competing routes preserved
+ - None of the five refused collapses (`governance`): route→authorization | candidate→structural | hidden→public-law | cleanliness→legitimacy | one-route→universal-closure
 
  - Pre-emit debug passed with real inputs and error inputs
  - Post-emit verification matches pre-emit exactly
@@ -30,19 +30,19 @@ New unknown surfaced by a run → stop, state-regress to `planning`, restart cha
 
  Each mutable: name | expected | current | resolution method. Execute → witness → assign → compare. Zero variance = resolved. Unresolved after 2 passes = new unknown = snake to `planning`. Never narrate past an unresolved mutable.
 
- ## BRIDGE DISCIPLINE WEAK PRIORS DO NOT AUTHORIZE
+ ## WEAK-PRIOR BRIDGE — PRIORS DO NOT AUTHORIZE
 
- EXECUTE receives route candidates from PLAN. Per `twin-atlas` Bridge: **those candidates arrive as weak priors only — structural value preserved, authorization NOT transferred**. Route plausibility ≠ authorization. A plausible route earns the right to be TESTED, not the right to be BELIEVED.
+ EXECUTE receives route candidates from PLAN. Per the weak-prior rule in `governance`: **those candidates arrive as weak priors only — structural value preserved, authorization NOT transferred**. Route plausibility ≠ authorization. A plausible route earns the right to be TESTED, not the right to be BELIEVED.
 
  - Prior from PLAN: `authorization=weak_prior`. Permitted use: pick the next witnessed probe.
  - After witnessed probe succeeds: `authorization=witnessed`. Permitted use: feed into EMIT.
- - Collapsing `weak_prior` to `witnessed` without a witnessed probe = route-into-authorization leak (collapse #1 in `twin-atlas`). Snake to PLAN.
+ - Collapsing `weak_prior` to `witnessed` without a witnessed probe = route-into-authorization leak (collapse #1 in `governance`). Snake to PLAN.
 
  Rhetorical inflation also strips here: "the plan says" / "we agreed that" / "obviously X" are prior-statements, not witnessed-facts. Restate as weak prior, run the probe, witness, only then authorize.
 
  ## QUALITY METRICS — APPLY BEFORE MARKING KNOWN
 
- Every mutable passes all four before status flips UNKNOWN → KNOWN (see `twin-atlas` for full definitions):
+ Every mutable passes all four before status flips UNKNOWN → KNOWN (see `governance` for full definitions):
 
  - **ΔS = 0** — witnessed output equals expected
  - **λ ≥ 2** — two independent paths agree (different search, different caller, different import), not just one confirmation
@@ -1,19 +1,19 @@
  ---
- name: twin-atlas
- description: Governance reference invoked by PLAN/EXECUTE/EMIT/VERIFY. Separates route discovery (Forward Atlas) from weak-prior handoff (Bridge) from earned-emission legitimacy (Inverse Atlas). Encodes 16-failure taxonomy, 4 state planes, ΔS/λ/ε/Coverage metrics, governance stress suite. Adapted from WFGY 4.0 Twin Atlas.
+ name: governance
+ description: Governance reference invoked by PLAN/EXECUTE/EMIT/VERIFY. Separates route discovery (PLAN) from weak-prior handoff (EXECUTE) from earned-emission legitimacy (EMIT/VERIFY). Encodes 16-failure taxonomy, 4 state planes, ΔS/λ/ε/Coverage metrics, governance stress suite.
  ---
 
- # Twin Atlas — Route, Bridge, Legitimacy
+ # Governance — Route, Bridge, Legitimacy
 
- Central governance reference. Three-module architecture separates three failure surfaces every phase must respect simultaneously:
+ Central governance reference. Three roles separate three failure surfaces every phase must respect simultaneously:
 
- 1. **Forward Atlas** — route-first structural orientation. Where could this fail? What family of fault does it live in? Owned by `planning`.
- 2. **Bridge** — advisory-only weak-prior transfer. Route plausibility never converts into authorization. Owned by `gm-execute`.
- 3. **Inverse Atlas** — legitimacy-first emission governance. Did this answer earn its requested strength? Owned by `gm-emit` and `gm-complete`.
+ 1. **Route discovery** — route-first structural orientation. Where could this fail? What family of fault does it live in? Owned by `planning`.
+ 2. **Weak-prior bridge** — advisory-only transfer. Route plausibility never converts into authorization. Owned by `gm-execute`.
+ 3. **Legitimacy gate** — earned-emission governance. Did this answer earn its requested strength? Owned by `gm-emit` and `gm-complete`.
 
- Neither route-first nor legitimacy-first alone suffices. Bridge exists precisely to stop route plausibility from masquerading as authorization.
+ Neither route-first nor legitimacy-first alone suffices. The weak-prior bridge exists precisely to stop route plausibility from masquerading as authorization.
 
- ## The Five Collapses Twin Atlas Refuses
+ ## The Five Collapses Governance Refuses
 
  A conclusion ships only when none of these has occurred:
 
@@ -25,7 +25,7 @@ A conclusion ships only when none of these has occurred:
 
  When in doubt: preserve ambiguity. Lawful downgrade beats forced closure.
 
- ## The 7 Route Families (Forward Atlas)
+ ## The 7 Route Families
 
  Every planned item belongs to at least one family. Naming the family disciplines the repair move.
 
@@ -41,7 +41,7 @@ Every planned item belongs to at least one family. Naming the family disciplines
 
  Route family gets written into the `.prd` item. Repair attempted in the wrong family = wasted work.
 
- ## The 16 Failure Modes (Problem Map)
+ ## The 16 Failure Modes
 
  Routing taxonomy. Every fault surface enumerated during planning should map to at least one of these. Missing mapping = unexamined surface.
 
@@ -109,11 +109,11 @@ Legal outcomes:
  - Lawful downgrade: 8 of 8 (always available as an option, always taken when warranted)
  - Outlier visibility: preserved (downgrade over hiding)
 
- ## How Each Phase Applies Twin Atlas
+ ## How Each Phase Applies Governance
 
- - **planning** — enumerates route families (Forward Atlas). Tags every `.prd` item with its family and failure-mode IDs. Writes `route_fit` and the expected `authorization` level needed.
- - **gm-execute** — treats every prior decision as a weak prior (Bridge). Only `witnessed` execution raises authorization. ΔS/λ/ε/Coverage checks on every mutable.
- - **gm-emit** — Inverse Atlas gate. Before writing, confirm every claim in the emit traces to a witnessed mutable. Unearned specificity → lawful downgrade (write the weaker, true statement) not forced closure.
+ - **planning** — enumerates route families. Tags every `.prd` item with its family and failure-mode IDs. Writes `route_fit` and the expected `authorization` level needed.
+ - **gm-execute** — treats every prior decision as a weak prior. Only `witnessed` execution raises authorization. ΔS/λ/ε/Coverage checks on every mutable.
+ - **gm-emit** — legitimacy gate. Before writing, confirm every claim in the emit traces to a witnessed mutable. Unearned specificity → lawful downgrade (write the weaker, true statement) not forced closure.
  - **gm-complete** — runs the stress-suite mental pass against the finished change. Closes `hidden_decision_posture` only with CI green.
 
  ## Not Every Answer Has Earned the Right to Exist
@@ -75,11 +75,11 @@ Planning = exhaustive fault-surface enumeration. For every aspect of the task:
 
  **Fault surfaces**: file existence | API shape | data format | dependency versions | runtime behavior | environment differences | error conditions | concurrency hazards | integration seams | backwards compatibility | rollback paths | deployment steps | CI/CD correctness
 
- **Route family (Forward Atlas — `twin-atlas` skill)**: every `.prd` item is tagged with at least one of the 7 route families — `grounding | reasoning | state | execution | observability | boundary | representation`. The family disciplines the repair move. Bug in `grounding` does not get a `reasoning` fix; bug in `boundary` does not get a `state` fix. Mis-routed repair = wasted EXECUTE pass + snake back to PLAN. Add `route_family:` to the item YAML.
+ **Route family (`governance` skill)**: every `.prd` item is tagged with at least one of the 7 route families — `grounding | reasoning | state | execution | observability | boundary | representation`. The family disciplines the repair move. Bug in `grounding` does not get a `reasoning` fix; bug in `boundary` does not get a `state` fix. Mis-routed repair = wasted EXECUTE pass + snake back to PLAN. Add `route_family:` to the item YAML.
 
- **Failure-mode mapping**: cross-reference against the 16-failure taxonomy in `twin-atlas`. If the fault you are enumerating does not map to any entry, either you have found a 17th mode (add to twin-atlas) or the fault is not yet named sharply enough — refine until it maps. Items with no failure-mode mapping SHIP silent bugs.
+ **Failure-mode mapping**: cross-reference against the 16-failure taxonomy in `governance`. If the fault you are enumerating does not map to any entry, either you have found a 17th mode (add to the governance skill) or the fault is not yet named sharply enough — refine until it maps. Items with no failure-mode mapping SHIP silent bugs.
 
- **Competing routes stay live**: if two route families plausibly explain the same symptom, keep both alive in the PRD until witnessed execution makes one dominant. Collapsing to one route pre-witness = route-into-authorization leak (see `twin-atlas` — the first of five refused collapses).
+ **Competing routes stay live**: if two route families plausibly explain the same symptom, keep both alive in the PRD until witnessed execution makes one dominant. Collapsing to one route pre-witness = route-into-authorization leak (see `governance` — the first of five refused collapses).
 
  **MANDATORY CODEBASE SCAN**: For every planned item, add `existingImpl=UNKNOWN`. Resolve via exec:codesearch. Existing code serving same concern → consolidation task, not addition. `exec:codesearch` indexes PDFs page-by-page alongside source — spec PDFs, papers, vendor manuals, and RFCs are searchable as code. When planning against a protocol, hardware, or compliance requirement, search the PDF corpus the same way you search source: two words, iterate. A constraint the PRD is missing because it only lives in a PDF is a fault surface — enumerate doc PDFs as scan targets during mutable discovery.
 
@@ -135,7 +135,7 @@ Path: `./.gm/prd.yml`. YAML via `exec:nodejs` (use `fs.writeFileSync`). Ensure `
  - failure mode
  ```
 
- `route_family`, `failure_modes`, `route_fit`, `authorization` come from `twin-atlas`. Required for items with emission impact (architecture, public API, contract change). Small surgical edits may omit. `authorization` starts `none`; gm-execute raises it to `weak_prior` on hypothesis, `witnessed` only when execution has proven it.
+ `route_family`, `failure_modes`, `route_fit`, `authorization` are defined in the `governance` skill. Required for items with emission impact (architecture, public API, contract change). Small surgical edits may omit. `authorization` starts `none`; gm-execute raises it to `weak_prior` on hypothesis, `witnessed` only when execution has proven it.
 
  Status: `pending` → `in_progress` → `completed` (remove completed items). Effort: small <15min | medium <45min | large >1h.
 
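
Reviewer note on the hunk above: the `authorization` field behaves like a three-step ladder, `none`, then `weak_prior`, then `witnessed`. The following is a minimal sketch of that ladder, assuming it can be modelled as an ordered enum; `Authorization`, `raise_authorization`, and `may_emit` are hypothetical names for illustration, not part of the package.

```python
from enum import IntEnum


class Authorization(IntEnum):
    """Ordered so that a higher value means stronger authorization."""
    NONE = 0
    WEAK_PRIOR = 1
    WITNESSED = 2


def raise_authorization(current: Authorization, *, has_hypothesis: bool,
                        probe_witnessed: bool) -> Authorization:
    """Raise the level only as far as the available evidence allows."""
    if probe_witnessed:
        # Only a witnessed execution earns the top level.
        return Authorization.WITNESSED
    if has_hypothesis:
        # A hypothesis lifts NONE to WEAK_PRIOR but never lowers a level.
        return max(current, Authorization.WEAK_PRIOR)
    return current


def may_emit(level: Authorization) -> bool:
    """Per the diffed text, only witnessed items feed into EMIT."""
    return level is Authorization.WITNESSED


level = raise_authorization(Authorization.NONE,
                            has_hypothesis=True, probe_witnessed=False)
print(level.name, may_emit(level))
```

In this sketch a plausible plan alone never reaches `WITNESSED`, which mirrors the rule that route plausibility does not transfer authorization.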
@@ -175,9 +175,9 @@ Invoke `browser` skill. Escalation: (1) `exec:browser <js>` → (2) browser skil
 
  ## SKILL REGISTRY
 
- `gm-execute` → `gm-emit` → `gm-complete` → `update-docs` | `browser` | `twin-atlas` (governance reference, read once per session) | `memorize` (sub-agent, background only)
+ `gm-execute` → `gm-emit` → `gm-complete` → `update-docs` | `browser` | `governance` (read once per session) | `memorize` (sub-agent, background only)
 
- `twin-atlas` carries the Forward/Bridge/Inverse governance model, 7 route families, 16 failure taxonomy, 4 state planes, ΔS/λ/ε/Coverage metrics, and the 8-case governance stress suite. Load once per session at the top of `planning` so protocols stay fresh across phases.
+ `governance` carries the route-discovery / weak-prior-bridge / legitimacy-gate model, 7 route families, 16 failure taxonomy, 4 state planes, ΔS/λ/ε/Coverage metrics, and the 8-case governance stress suite. Load once per session at the top of `planning` so protocols stay fresh across phases.
 
  `memorize`: `Agent(subagent_type='memorize', model='haiku', run_in_background=true, prompt='## CONTEXT TO MEMORIZE\n<what>')`