gm-copilot-cli 2.0.631 → 2.0.633

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -6,115 +6,82 @@ allowed-tools: Write
6
6
 
7
7
  # Planning — State Machine Orchestrator
8
8
 
9
- Root of all work. Runs `PLAN → EXECUTE → EMIT → VERIFY → UPDATE-DOCS → COMPLETE`.
9
+ Runs `PLAN → EXECUTE → EMIT → VERIFY → UPDATE-DOCS → COMPLETE`.
10
10
 
11
- **Entry**: prompt-submit hook → `gm` agent invoke `planning` skill (here). Also re-entered any time a new unknown surfaces in any phase.
12
-
13
- ## WHERE YOU ARE
14
-
15
- Phase where work shape still unknown. Codebase scans, `exec:codesearch` queries, quick imports to probe APIs = all executions. Execution contract in `gm-execute`; protocols not fresh → runs drift (reimplement over import, narrate over witness). Load first.
11
+ Entry: prompt-submit hook → `gm` → here. Re-enter on any new unknown in any phase.
16
12
 
17
13
  ## UNKNOWNS = PRODUCT
18
14
 
19
- Output of this phase ≠ task list. Output = enumeration of every fault surface work could fail on. Each unknown named + resolved cheaper downstream. Each unknown skipped EMIT/VERIFY surprise snake back anyway at higher cost.
20
-
21
- **Unknown emerges later (EXECUTE, EMIT, VERIFY) → return here. Not failure — machine working as designed.** Later-phase unknown means mutable map incomplete. Come back. Think laterally — what else could this touch, what invariant assumed, what subsystem unexamined? Enumerate newly-visible fault surfaces. Only then push forward.
22
-
23
- Patch-around-unknowns-in-place = compounding silent-failure debt. Regress early = stay inside contract.
15
+ Output = every fault surface work could fail on. Unknown named+resolved = cheaper downstream. Unknown skipped = EMIT/VERIFY surprise = snake back at higher cost.
24
16
 
25
- ## FRAGILE LEARNINGSMEMORIZE ON RESOLUTION (HARD RULE)
17
+ Later-phase unknown return here. Not failure machine working. Patch-around-in-place = compounding debt.
26
18
 
27
- Every unknown resolved in any phase (existingImpl absent, dep version confirmed, API shape witnessed, env quirk observed, CI failure root-caused, build flag required, user preference stated) = fact that dies on context compaction unless handed off. Not optional. Not batched at end. **Memorize at the moment of resolution, before the next tool call.**
19
+ ## MEMORIZE HARD RULE
28
20
 
29
- **Trigger contract memorize fires when ANY of these occur**:
30
- - An `exec:` run produces output that answers a prior "I don't know" / "let me check"
31
- - A code read confirms or refutes an assumption about existing structure
32
- - A CI log reveals the root cause of a failure
33
- - User states a preference, constraint, deadline, or decision
34
- - A fix works and the reason it worked is non-obvious (would be forgotten next session)
35
- - An environment / tooling quirk bites once (blocked tools, path oddities, platform differences)
21
+ Every unknown resolved memorize same turn. Not batched, not deferred.
36
22
 
37
- **Enforcement**:
38
- - Regardless of phase, a resolved unknown → spawn `memorize` subagent **in the same turn** as the resolution. Do not wait for phase end. Do not batch across multiple facts — one call per fact keeps classification clean.
39
- - Run in background (`run_in_background=true`). Non-blocking. Continue work in the same message.
40
- - If you notice at end of turn that an unknown resolved earlier was not memorized, spawn it now. Catching up is allowed. Skipping is not.
23
+ Triggers: exec: output answers prior unknown | code read confirms/refutes | CI log reveals root cause | user states preference/constraint | fix worked non-obviously | env quirk observed.
41
24
 
42
- **Invocation (copy verbatim, substitute the fact)**:
43
25
  ```
44
- Agent(subagent_type='gm:memorize', model='haiku', run_in_background=true, prompt='## CONTEXT TO MEMORIZE\n<single resolved fact with enough context for a cold-start agent to use it>')
26
+ Agent(subagent_type='gm:memorize', model='haiku', run_in_background=true, prompt='## CONTEXT TO MEMORIZE\n<fact>')
45
27
  ```
46
28
 
47
- **Parallelization**: multiple unknowns resolved in one turn → spawn multiple memorize agents in the **same message** (parallel tool blocks). Never serialize them.
48
-
49
- **Violations = memory leak**. If you leave the turn with a resolved unknown un-memorized, the fact is gone on next compaction. That is the failure mode this rule prevents. Treat every exit from a turn without memorize-catch-up as a bug.
29
+ Multiple facts in one turn → parallel Agent calls in ONE message. End-of-turn: scan for missed spawn now.
50
30
 
51
31
  ## STATE MACHINE
52
32
 
53
- **FORWARD**: PLAN complete → `gm-execute` | EXECUTE complete → `gm-emit` | EMIT complete → `gm-complete` | VERIFY .prd remains → `gm-execute` | VERIFY .prd empty+pushed → `update-docs`
33
+ **FORWARD**: PLAN → `gm-execute` | EXECUTE → `gm-emit` | EMIT → `gm-complete` | VERIFY .prd remains → `gm-execute` | VERIFY .prd empty+pushed → `update-docs`
54
34
 
55
35
  **REGRESSIONS**: new unknown at any state → re-invoke `planning` | EXECUTE unresolvable 2 passes → `planning` | EMIT logic error → `gm-execute` | EMIT new unknown → `planning` | VERIFY broken output → `gm-emit` | VERIFY logic wrong → `gm-execute` | VERIFY new unknown → `planning`
56
36
 
57
- **Runs until**: .gm/prd.yml empty AND git clean AND all pushes confirmed AND CI green.
37
+ Runs until: .gm/prd.yml empty AND git clean AND all pushes confirmed AND CI green.
38
+
39
+ **Cannot stop while**: .gm/prd.yml has items | git has uncommitted changes | git has unpushed commits.
58
40
 
59
- ## ENFORCEMENT COMPLETE EVERY TASK END-TO-END
41
+ ## SKIP PLANNING (DEFAULT for small work)
60
42
 
61
- **Cannot respond or stop while**:
62
- - .gm/prd.yml exists and has items
63
- - git has uncommitted changes
64
- - git has unpushed commits
43
+ Skip if ANY apply:
44
+ - Single-file, single-concern edit
45
+ - Task trivially bounded, under ~5 min
46
+ - User gave explicit surgical instructions
47
+ - Bug fix with identified root cause
48
+ - Zero unknowns
65
49
 
66
- The skill chain MUST be followed completely end-to-end without exception. Partial execution = violation. After every phase: read .prd, check git, invoke next skill.
50
+ Heavy ceremony (PRD + parallel subagents) for multi-file architectural work or genuinely unknown fault surfaces. If new unknown surfaces mid-work, THAT is when to regress — not preemptively.
67
51
 
68
52
  ## PLAN PHASE — MUTABLE DISCOVERY
69
53
 
70
- Planning = exhaustive fault-surface enumeration. For every aspect of the task:
54
+ For every aspect of the task:
71
55
  - What do I not know? → mutable (UNKNOWN)
72
56
  - What could go wrong? → edge case item with failure mode
73
- - What depends on what? → blocking/blockedBy mapped explicitly
74
- - What assumptions am I making? → each = unwitnessed hypothesis = mutable until execution proves it
75
-
76
- **Fault surfaces**: file existence | API shape | data format | dependency versions | runtime behavior | environment differences | error conditions | concurrency hazards | integration seams | backwards compatibility | rollback paths | deployment steps | CI/CD correctness
57
+ - What depends on what? → blocking/blockedBy mapped
58
+ - What assumptions am I making? → each = unwitnessed hypothesis = mutable
77
59
 
78
- **Route family (`governance` skill)**: every `.prd` item is tagged with at least one of the 7 route families `grounding | reasoning | state | execution | observability | boundary | representation`. The family disciplines the repair move. Bug in `grounding` does not get a `reasoning` fix; bug in `boundary` does not get a `state` fix. Mis-routed repair = wasted EXECUTE pass + snake back to PLAN. Add `route_family:` to the item YAML.
60
+ Fault surfaces: file existence | API shape | data format | dependency versions | runtime behavior | environment differences | error conditions | concurrency hazards | integration seams | backwards compatibility | rollback paths | CI/CD correctness
79
61
 
80
- **Failure-mode mapping**: cross-reference against the 16-failure taxonomy in `governance`. If the fault you are enumerating does not map to any entry, either you have found a 17th mode (add to the governance skill) or the fault is not yet named sharply enough refine until it maps. Items with no failure-mode mapping SHIP silent bugs.
62
+ **Route family** (`governance`): tag every `.prd` item with familygrounding|reasoning|state|execution|observability|boundary|representation. Mis-routed repair = wasted EXECUTE pass.
81
63
 
82
- **Competing routes stay live**: if two route families plausibly explain the same symptom, keep both alive in the PRD until witnessed execution makes one dominant. Collapsing to one route pre-witness = route-into-authorization leak (see `governance` — the first of five refused collapses).
64
+ **Failure-mode mapping**: cross-reference 16-failure taxonomy. No mapping = unexamined surface = silent bug.
83
65
 
84
- **MANDATORY CODEBASE SCAN**: For every planned item, add `existingImpl=UNKNOWN`. Resolve via exec:codesearch. Existing code serving same concern → consolidation task, not addition. `exec:codesearch` indexes PDFs page-by-page alongside source — spec PDFs, papers, vendor manuals, and RFCs are searchable as code. When planning against a protocol, hardware, or compliance requirement, search the PDF corpus the same way you search source: two words, iterate. A constraint the PRD is missing because it only lives in a PDF is a fault surface — enumerate doc PDFs as scan targets during mutable discovery.
66
+ **Competing routes stay live** until witnessed execution makes one dominant. Pre-witness collapse = route-into-authorization leak.
85
67
 
86
- **EXIT PLAN**: zero new unknowns in last pass AND all .prd items have explicit acceptance criteria AND all dependencies mapped launch subagents or invoke `gm-execute`.
68
+ **MANDATORY CODEBASE SCAN**: For every item, `existingImpl=UNKNOWN`. Resolve via exec:codesearch. Existing code serving same concern consolidation, not addition. PDFs indexed page-by-page search specs same way you search source.
87
69
 
88
- **SELF-LOOP**: new items discovered add to .prd → plan again.
70
+ **EXIT PLAN**: zero new unknowns last pass AND all .prd items have acceptance criteria AND dependencies mapped launch subagents or invoke `gm-execute`.
89
71
 
90
- **Skip planning entirely** (this is the DEFAULT for small work) if ANY of these apply:
91
- - Single-file, single-concern edit
92
- - Task is trivially bounded and under ~5 minutes
93
- - User gave explicit surgical instructions ("change X to Y")
94
- - Bug fix where root cause is already identified
95
- - Zero unknowns / no mutables to resolve
96
-
97
- Heavy ceremony (PRD + parallel subagents) is for multi-file architectural work or genuinely unknown fault surfaces. Writing a 7-item PRD for a 3-line change is waste. Err toward skipping — if a new unknown surfaces mid-work, THAT is when you regress to planning, not preemptively.
98
-
99
- **Contrast examples:**
100
- - "Fix the hold-detect logic at apcKey25.cpp:163" → SKIP planning. Read, edit, done.
101
- - "Add drift correction and watchdog and observability across the USB audio path" → DO plan. Multi-file, multiple unknowns.
102
-
103
- ## OBSERVABILITY ENUMERATION — MANDATORY EVERY PASS
72
+ ## OBSERVABILITY MANDATORY EVERY PASS
104
73
 
105
- During every planning pass, enumerate every possible aspect of the app's runtime observability that can be improved. The goal is permanent structures — not ad-hoc logs — that make any future debugging session start from a complete, live picture of system state.
74
+ Enumerate every runtime observability gap:
106
75
 
107
- **Server-side permanent structures**: Every internal subsystem state machine, queue, cache, connection pool, active task map, process registry, RPC handler, hook dispatcher — must expose a named, queryable inspection endpoint (e.g. `/debug/<subsystem>`). State must be readable, filterable, and ideally modifiable without restart. Profiling hooks on every hot path. Structured logs with subsystem tag, severity, and timestamp — filterable at runtime by subsystem, not just log level. Any internal state that requires a restart to inspect is an observability gap.
76
+ **Server**: every subsystem exposes `/debug/<subsystem>`. State readable/filterable without restart. Structured logs with subsystem+severity+timestamp.
108
77
 
109
- **Client-side permanent structures**: `window.__debug` must be a live, structured registry — not a dump. Every component's state, every active request, every event queue, every WebSocket connection, every rendered prop, every error boundary — all addressable by key, all queryable without refreshing. New modules register themselves into `window.__debug` on mount and deregister on unmount. Any execution path not traceable via `window.__debug` is an observability gap.
78
+ **Client**: `window.__debug` is live structured registry. Every component/request/queue/connection addressable by key. Modules register on mount, deregister on unmount.
110
79
 
111
- **Permanent vs ad-hoc**: `console.log` = ad-hoc = not observability. A structured logger with subsystem routing = permanent. `window.__debug.myModule.state` = permanent. `window.__debug = { ...window.__debug, tmp: x }` = ad-hoc = not acceptable. Permanent structures survive deploys and accumulate diagnostic value across sessions.
112
-
113
- **Mandate**: on discovery of any observability gap → immediately add a .prd item. Observability improvements are highest-priority — never deferred. The agent must be able to see specifically anything it wants and nothing else — no guessing, no blind spots.
80
+ `console.log` = ad-hoc = not observability. `window.__debug.module.state` = permanent. Discovery of gap add .prd item immediately. Observability = highest priority, never deferred.
114
81
 
115
82
  ## .PRD FORMAT
116
83
 
117
- Path: `./.gm/prd.yml`. YAML via `exec:nodejs` (use `fs.writeFileSync`). Ensure `.gm/` dir exists before writing. Delete when empty never leave empty file. Delete `.gm/` dir when completely empty.
84
+ Path: `./.gm/prd.yml`. Write via `exec:nodejs` + `fs.writeFileSync`. Delete when empty; delete `.gm/` when completely empty.
118
85
 
119
86
  ```yaml
120
87
  - id: kebab-id
@@ -135,9 +102,7 @@ Path: `./.gm/prd.yml`. YAML via `exec:nodejs` (use `fs.writeFileSync`). Ensure `
135
102
  - failure mode
136
103
  ```
137
104
 
138
- `route_family`, `failure_modes`, `route_fit`, `authorization` are defined in the `governance` skill. Required for items with emission impact (architecture, public API, contract change). Small surgical edits may omit. `authorization` starts `none`; gm-execute raises it to `weak_prior` on hypothesis, `witnessed` only when execution has proven it.
139
-
140
- Status: `pending` → `in_progress` → `completed` (remove completed items). Effort: small <15min | medium <45min | large >1h.
105
+ Status: pending in_progress completed (remove completed). Effort: small <15min | medium <45min | large >1h.
141
106
 
142
107
  ## PARALLEL SUBAGENT LAUNCH
143
108
 
@@ -145,81 +110,43 @@ After .prd written, launch ≤3 parallel `gm:gm` subagents for all independent i
145
110
 
146
111
  `Agent(subagent_type="gm:gm", prompt="Work on .prd item: <id>. .prd path: <path>. Item: <full YAML>.")`
147
112
 
148
- After each wave: read .prd, find newly unblocked items, launch next wave. Exception: browser tasks serialize.
149
-
150
- When parallelism not applicable: invoke `gm-execute` skill directly.
151
-
152
- ## MUTABLE DISCIPLINE
113
+ After each wave: read .prd, find newly unblocked, launch next. Browser tasks serialize.
153
114
 
154
- Each mutable: name | expected | current | resolution method. Resolve by witnessed execution only. Zero variance = resolved. Unresolved after 2 passes = new unknown = re-invoke `planning`. Mutables live in conversation only.
115
+ When parallelism not applicable: invoke `gm-execute` directly.
155
116
 
156
117
  ## CODE EXECUTION
157
118
 
158
- `exec:<lang>` only. Bash body: `exec:<lang>\n<code>`
119
+ `exec:<lang>` only via Bash tool. File I/O: exec:nodejs + require('fs'). Git only in Bash directly. Never Bash(node/npm/npx/bun).
159
120
 
160
- `exec:nodejs` | `exec:bash` | `exec:python` | `exec:typescript` | `exec:go` | `exec:rust` | `exec:c` | `exec:cpp` | `exec:java` | `exec:deno` | `exec:cmd`
161
-
162
- File I/O: exec:nodejs + require('fs'). Git only in Bash directly. `Bash(node/npm/npx/bun)` = violation.
163
-
164
- Pack runs: `Promise.allSettled` for parallel ops. Each idea its own try/catch. Under 12s per call.
165
-
166
- Background: `exec:sleep <id>` | `exec:status <id>` | `exec:close <id>`. Runner: `exec:runner start|stop|status`.
121
+ Pack runs: `Promise.allSettled` for parallel. Each idea its own try/catch. Under 12s per call.
167
122
 
168
123
  ## CODEBASE EXPLORATION
169
124
 
170
- `exec:codesearch` only. Glob/Grep/Read/Explore/WebSearch = hook-blocked. Start 2 words → change one word → add third → minimum 4 attempts before concluding absent.
171
-
172
- ## BROWSER AUTOMATION
173
-
174
- Invoke `browser` skill. Escalation: (1) `exec:browser <js>` → (2) browser skill → (3) navigate/click → (4) screenshot last resort. Browser tasks serialize — one Chrome instance per project.
175
-
176
- ## SKILL REGISTRY
177
-
178
- `gm-execute` → `gm-emit` → `gm-complete` → `update-docs` | `browser` | `governance` (read once per session) | `memorize` (sub-agent, background only)
179
-
180
- `governance` carries the route-discovery / weak-prior-bridge / legitimacy-gate model, 7 route families, 16 failure taxonomy, 4 state planes, ΔS/λ/ε/Coverage metrics, and the 8-case governance stress suite. Load once per session at the top of `planning` so protocols stay fresh across phases.
181
-
182
- `memorize`: `Agent(subagent_type='gm:memorize', model='haiku', run_in_background=true, prompt='## CONTEXT TO MEMORIZE\n<what>')`
125
+ `exec:codesearch` only. Glob/Grep/Read/Explore = hook-blocked. Start 2 words → change one → add third → minimum 4 attempts before concluding absent.
183
126
 
184
127
  ## MANDATORY DEV WORKFLOW
185
128
 
186
- No comments. No scattered test files. 200-line limit — split before continuing. Fail loud. No duplication. Scan before every edit. Duplicate concern = regress to PLAN. Errors throw with context — no `|| default`, no `catch { return null }`. `window.__debug` exposes all client state. AGENTS.md via memorize only. CHANGELOG.md: append per commit.
187
-
188
- **Minimal code / maximal DX process**: Before writing any logic, run this process in order — stop at the first step that resolves the need:
189
- 1. **Native first** — does the language or runtime already do this? Use it exactly as designed.
190
- 2. **Library second** — does an existing dependency already solve this pattern? Use its API directly.
191
- 3. **Structure third** — can the problem be encoded as data (map, table, pipeline) so the structure enforces correctness and eliminates branching entirely?
192
- 4. **Write last** — only author new logic when the above three are exhausted. New logic = new surface area = new bugs.
129
+ No comments. No scattered test files. 200-line limit. Fail loud. No duplication. Scan before edit. AGENTS.md via memorize only. CHANGELOG.md: append per commit.
193
130
 
194
- When structure eliminates a whole class of wrong states — name that pattern explicitly. Dispatch tables replacing switch chains, pipelines replacing loop-with-accumulator, maps replacing if/else forests — these are not just style preferences, they are correctness properties. Code that cannot be wrong because of how it is shaped is the goal. Readable top-to-bottom without mental simulation = done right. Requires decoding = not done.
131
+ **Minimal code process** (stop at first that resolves need):
132
+ 1. Native — does language/runtime do this? Use it.
133
+ 2. Library — existing dep solve this? Use its API.
134
+ 3. Structure — encode as data (map/table/pipeline)? Make structure enforce correctness.
135
+ 4. Write — only when 1-3 exhausted.
195
136
 
196
137
  ## SINGLE INTEGRATION TEST POLICY
197
138
 
198
- Every project maintains exactly one `test.js` at project root. 200-line max. No other test files anywhere — no `.test.js`, `.spec.js`, `__tests__/`, `fixtures/`, `mocks/`. Delete all scattered tests on discovery and consolidate coverage into `test.js`.
199
-
200
- **test.js replaces all unit tests.** It tests the real system end-to-end with real data. No mocks, no stubs, no test frameworks. Plain node assertions or process exit codes.
201
-
202
- **Creation**: if `test.js` does not exist, create it during EXECUTE phase covering all testable surface of current work.
139
+ One `test.js` at project root. 200-line max. No .test.js, .spec.js, __tests__/, fixtures/, mocks/. No frameworks, no mocks, no stubs plain assertions, real data, real system.
203
140
 
204
- **Maintenance**: every code change that adds or modifies behavior must update `test.js` to cover it. Every bug fix must add a regression case that would have caught the bug.
205
-
206
- **Structure**: group by subsystem, each subsystem gets a section. When approaching 200 lines, compress older stable tests into tighter assertions to make room for new coverage.
207
-
208
- **Execution**: `gm-complete` runs `test.js` before allowing completion. Failure = regression to EXECUTE.
141
+ `gm-complete` runs test.js before completion. Failure = regression to EXECUTE. Every behavior change updates test.js. Every bug fix adds regression case.
209
142
 
210
143
  ## RESPONSE POLICY
211
144
 
212
- Always terse. Technical substance stays. Fluff dies. Drop articles, filler, pleasantries, hedging. Fragments OK. Short synonyms. Technical terms exact. Pattern: `[thing] [action] [reason]. [next step].`
213
-
214
- Code, commits, and PR descriptions write in normal prose. Security warnings, destructive confirmations, and genuinely ambiguous sequences also drop terseness. Everything else stays terse.
145
+ Terse. Drop filler. Fragments OK. Pattern: `[thing] [action] [reason]. [next step].`
146
+ Code/commits/PRs = normal prose.
215
147
 
216
148
  ## CONSTRAINTS
217
149
 
218
- **Tier 0**: no_crash, no_exit, ground_truth_only, real_execution, fail_loud
219
- **Tier 1**: max_file_lines=200, hot_reloadable, checkpoint_state
220
- **Tier 2**: no_duplication, no_hardcoded_values, modularity
221
- **Tier 3**: no_comments, convention_over_code
222
-
223
- **Never**: `Bash(node/npm/npx/bun)` | skip planning | partial execution | stop while .gm/prd.yml has items | stop while git dirty | sequential independent items | screenshot before JS exhausted | fallback/demo modes | silently swallow errors | duplicate concern | leave comments | create scattered test files (only root test.js) | write if/else chains where a map or pipeline suffices | write one-liners that require decoding | branch on enumerated cases when a dispatch table exists | **leave a resolved unknown un-memorized** | batch memorize calls until end of turn | serialize parallel memorize spawns
150
+ **Never**: Bash(node/npm/npx/bun) | skip planning | partial execution | stop while .prd has items | stop while git dirty | sequential independent items | screenshot before JS exhausted | fallback/demo modes | swallow errors | duplicate concern | leave comments | scattered test files | if/else chains where map suffices | one-liners that require decoding | leave resolved unknown un-memorized | batch memorize | serialize memorize spawns
224
151
 
225
- **Always**: invoke named skill at every state transition | regress to planning on any new unknown | witnessed execution only | scan codebase before edits | enumerate every possible observability improvement every planning pass | follow skill chain completely end-to-end on every task without exception | prefer dispatch tables over switch/if chains | prefer pipelines over loop-with-accumulator | make wrong states structurally impossible | name patterns when structure eliminates a whole class of bugs | **spawn `memorize` the same turn an unknown resolves** | parallel-spawn one memorize per fact when multiple resolve together | end-of-turn self-check: any resolved unknowns un-memorized? → spawn now before handing control back
152
+ **Always**: invoke Skill at every transition | regress to planning on new unknown | witnessed execution only | scan before edits | enumerate observability gaps every pass | follow chain end-to-end | prefer dispatch tables | prefer pipelines | make wrong states unrepresentable | spawn memorize same turn unknown resolves | end-of-turn self-check
@@ -5,28 +5,15 @@ description: Run shell commands on remote SSH hosts via exec:ssh. Reads targets
5
5
 
6
6
  # exec:ssh — Remote SSH Execution
7
7
 
8
- **Use gm subagents for all independent work items. Invoke all skills in the chain: planning → gm-execute → gm-emit → gm-complete → update-docs.**
9
-
10
-
11
- Runs shell commands on a remote host over SSH. No shell open, no manual connection — just write the command.
8
+ Runs shell commands on remote host. No manual connection needed.
12
9
 
13
10
  ## Setup
14
11
 
15
- Create `~/.claude/ssh-targets.json`:
16
-
12
+ `~/.claude/ssh-targets.json`:
17
13
  ```json
18
14
  {
19
- "default": {
20
- "host": "192.168.1.10",
21
- "port": 22,
22
- "username": "pi",
23
- "password": "yourpassword"
24
- },
25
- "prod": {
26
- "host": "10.0.0.1",
27
- "username": "ubuntu",
28
- "keyPath": "/home/user/.ssh/id_rsa"
29
- }
15
+ "default": { "host": "192.168.1.10", "port": 22, "username": "pi", "password": "pass" },
16
+ "prod": { "host": "10.0.0.1", "username": "ubuntu", "keyPath": "/home/user/.ssh/id_rsa" }
30
17
  }
31
18
  ```
32
19
 
@@ -39,39 +26,28 @@ exec:ssh
39
26
  <shell command>
40
27
  ```
41
28
 
42
- Target a named host with `@name` on the first line:
43
-
29
+ Named host with `@name` on first line:
44
30
  ```
45
31
  exec:ssh
46
32
  @prod
47
33
  sudo systemctl restart myapp
48
34
  ```
49
35
 
50
- Multi-line scripts work:
51
-
52
- ```
53
- exec:ssh
54
- cd /var/log && tail -20 syslog
55
- ```
56
-
57
36
  ## Process Persistence
58
37
 
59
- SSH sessions kill child processes on close. To keep a process running after the command returns:
60
-
38
+ SSH kills child processes on close. To persist:
61
39
  ```
62
40
  exec:ssh
63
- sudo systemctl reset-failed myunit 2>/dev/null; systemd-run --unit=myunit bash -c 'your-long-running-command'
41
+ sudo systemctl reset-failed myunit 2>/dev/null; systemd-run --unit=myunit bash -c 'your-command'
64
42
  ```
65
43
 
66
- Always call `systemctl reset-failed <unit>` before reusing a unit name, or use a unique timestamped name:
67
-
44
+ Unique name:
68
45
  ```
69
46
  exec:ssh
70
47
  systemd-run --unit=job-$(date +%s) bash -c 'nohup myprogram &'
71
48
  ```
72
49
 
73
- Fallback if systemd unavailable:
74
-
50
+ Fallback (no systemd):
75
51
  ```
76
52
  exec:ssh
77
53
  setsid nohup bash -c 'myprogram > /tmp/out.log 2>&1' &
@@ -79,11 +55,8 @@ setsid nohup bash -c 'myprogram > /tmp/out.log 2>&1' &
79
55
 
80
56
  ## Dependency
81
57
 
82
- Requires `ssh2` npm package. Install in `~/.claude/gm-tools`:
83
-
58
+ Requires `ssh2` npm package in `~/.claude/gm-tools`:
84
59
  ```
85
60
  exec:bash
86
61
  cd ~/.claude/gm-tools && npm install ssh2
87
62
  ```
88
-
89
- The plugin searches: `~/.claude/gm-tools/node_modules/ssh2`, `~/.claude/plugins/node_modules/ssh2`, then global.
@@ -3,111 +3,58 @@ name: update-docs
3
3
  description: UPDATE-DOCS phase. Refresh README.md, AGENTS.md, and docs/index.html to reflect changes made this session. Commits and pushes doc updates. Terminal phase — declares COMPLETE.
4
4
  ---
5
5
 
6
- # GM UPDATE-DOCS — Documentation Refresh
6
+ # GM UPDATE-DOCS
7
7
 
8
- You are in the **UPDATE-DOCS** phase. Feature work is verified and pushed. Refresh all documentation to match the actual codebase state right now.
8
+ GRAPH: `PLAN EXECUTE EMIT VERIFY [UPDATE-DOCS] COMPLETE`
9
+ Entry: feature verified, committed, pushed. From `gm-complete`.
9
10
 
10
- **GRAPH POSITION**: `PLAN EXECUTE EMIT VERIFY [UPDATE-DOCS] → COMPLETE`
11
- - **Entry**: Feature work verified, committed, and pushed. Entered from `gm-complete`.
11
+ **FORWARD**: docs updated + committed + pushed → COMPLETE
12
+ **BACKWARD**: unknown architecture change `planning`
12
13
 
13
- ## TRANSITIONS
14
-
15
- **FORWARD**: All docs updated, committed, and pushed → COMPLETE (session ends)
16
-
17
- **BACKWARD**:
18
- - Diff reveals unknown architecture change → invoke `planning` skill, restart chain
19
- - Doc write fails post-emit verify → fix and re-verify before advancing
20
-
21
- ## EXECUTION SEQUENCE
22
-
23
- **Step 1 — Understand what changed**:
14
+ ## Sequence
24
15
 
16
+ **1 — What changed**:
25
17
  ```
26
18
  exec:bash
27
19
  git log -5 --oneline
28
20
  git diff HEAD~1 --stat
29
21
  ```
30
22
 
31
- Witness which files changed. Identify doc-sensitive changes: new skills, new states in the state machine, new platforms, modified architecture, new constraints, renamed files.
32
-
33
- **Step 2 — Read current docs**:
34
-
23
+ **2 Read current docs**:
35
24
  ```
36
25
  exec:nodejs
37
26
  const fs = require('fs');
38
- ['README.md', 'AGENTS.md', 'docs/index.html',
39
- 'gm-starter/agents/gm.md'].forEach(f => {
27
+ ['README.md', 'AGENTS.md', 'docs/index.html', 'gm-starter/agents/gm.md'].forEach(f => {
40
28
  try { console.log(`=== ${f} ===\n` + fs.readFileSync(f, 'utf8')); }
41
29
  catch(e) { console.log(`MISSING: ${f}`); }
42
30
  });
43
31
  ```
44
32
 
45
- Identify every section that no longer matches the actual codebase state.
46
-
47
- **Step 3 — Write updated docs**:
48
-
49
- Write only sections that changed. Do not rewrite unchanged content. Rules per file:
50
-
51
- **README.md**: platform count matches adapters in `platforms/`, skill tree diagram matches current state machine, quick start commands work.
52
-
53
- **AGENTS.md**: Launch memorize sub-agent in background with session learnings. Do not inline-edit AGENTS.md — the memorize agent handles extraction, deduplication, and writing. Use: `Agent(subagent_type='gm:memorize', model='haiku', run_in_background=true, prompt='## CONTEXT TO MEMORIZE\n<session learnings>')`
54
-
55
- **docs/index.html**: `PHASES` array matches current skill state machine phases. Platform lists match `platforms/` directory. State machine diagram updated if new phases added.
56
-
57
- **gm-starter/agents/gm.md**: Skill chain on the `gm skill →` line updated if new skills were added.
58
-
59
- ```
60
- exec:nodejs
61
- const fs = require('fs');
62
- fs.writeFileSync('/abs/path/file.md', updatedContent);
63
- ```
33
+ **3 Write changed sections only**:
64
34
 
65
- **Step 4 Verify from disk**:
35
+ - **README.md**: platform count, skill tree diagram, quick start commands
36
+ - **AGENTS.md**: via memorize sub-agent only — never inline-edit. `Agent(subagent_type='gm:memorize', model='haiku', run_in_background=true, prompt='## CONTEXT TO MEMORIZE\n<learnings>')`
37
+ - **docs/index.html**: `PHASES` array, platform lists, state machine diagram
38
+ - **gm-starter/agents/gm.md**: skill chain line if new skills added
66
39
 
40
+ **4 — Verify from disk**:
67
41
  ```
68
42
  exec:nodejs
69
- const fs = require('fs');
70
- const content = fs.readFileSync('/abs/path/file.md', 'utf8');
43
+ const content = require('fs').readFileSync('/abs/path/file.md', 'utf8');
71
44
  console.log(content.includes('expectedString'), content.length);
72
45
  ```
73
46
 
74
- Witness each written file from disk. Any variance from expected content → fix and re-verify.
75
-
76
- **Step 5 — Commit and push**:
77
-
47
+ **5 Commit and push**:
78
48
  ```
79
49
  exec:bash
80
50
  git add README.md docs/index.html gm-starter/agents/gm.md
81
- git diff --cached --stat
82
- ```
83
-
84
- Witness staged files. Then commit and push:
85
-
86
- ```
87
- exec:bash
88
51
  git commit -m "docs: update documentation to reflect session changes"
89
52
  git push -u origin HEAD
90
53
  ```
91
54
 
92
- Witness push confirmation. Zero variance = COMPLETE.
93
-
94
- ## DOC FIDELITY RULES
95
-
96
- Every claim in docs must be verifiable against disk right now:
97
- - State machine phase names must match skill file `name:` frontmatter
98
- - Platform names must match adapter class names in `platforms/`
99
- - File paths must exist on disk
100
- - Constraint counts must match actual constraints
101
-
102
- If a doc section cannot be verified against disk: remove it, do not speculate.
55
+ ## Fidelity Rules
103
56
 
104
- ## CONSTRAINTS
105
-
106
- **Never**: skip diff analysis | write docs from memory alone | push without verifying from disk | add comments to doc content | claim done without witnessed push output
107
-
108
- **Always**: witness git diff first | read current file before overwriting | verify each file from disk after write | push doc changes | confirm with witnessed push output
109
-
110
- ---
57
+ Every claim verifiable against disk: phase names match frontmatter, platform names match `platforms/`, file paths exist, constraint counts accurate. Unverifiable section → remove, don't speculate.
111
58
 
112
- **→ COMPLETE**: Docs committed and pushed session ends.
113
- **↩ SNAKE to PLAN**: New unknown about codebase state invoke `planning` skill.
59
+ **Never**: write from memory | push without disk verify | add comments | claim done without witnessed push
60
+ **Always**: git diff first | read before overwriting | verify after write | push
package/tools.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "gm",
3
- "version": "2.0.631",
3
+ "version": "2.0.633",
4
4
  "description": "State machine agent with hooks, skills, and automated git enforcement",
5
5
  "tools": [
6
6
  {