gm-copilot-cli 2.0.726 → 2.0.1063

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -1,92 +1,61 @@
1
1
  ---
2
2
  name: planning
3
3
  description: State machine orchestrator. Mutable discovery, PRD construction, and full PLAN→EXECUTE→EMIT→VERIFY→COMPLETE lifecycle. Invoke at session start and on any new unknown.
4
- allowed-tools: Write
4
+ allowed-tools: Skill
5
5
  ---
6
6
 
7
- # Planning — State Machine Orchestrator
7
+ # Planning — PLAN phase
8
8
 
9
- Runs `PLAN EXECUTE EMIT VERIFY → UPDATE-DOCS → COMPLETE`. Re-enter on any new unknown in any phase.
9
+ Translate the request into `.gm/prd.yml` and hand to `gm-execute`. Re-enter on any new unknown in any phase.
10
10
 
11
- ## RECALL HARD RULE
11
+ A `@<discipline>` sigil in the request scopes recall, codesearch, and memorize calls during PLAN to that discipline's store. Without one, retrievals fan across default plus enabled disciplines and writes land in default only.
12
12
 
13
- Before naming any unknown, run recall.
13
+ Cross-cutting dispositions (autonomy, fix-on-sight, nothing-fake, browser-witness, scope, recall, memorize) live in `gm` SKILL.md; this skill only carries what is unique to PLAN.
14
14
 
15
- ```
16
- exec:recall
17
- <2-6 word query>
18
- ```
19
-
20
- Triggers: matches prior topic | "have we seen this" | designing where prior decision likely exists | quirk feels familiar | sub-task in known project.
21
-
22
- Hits = weak_prior; witness via EXECUTE before adopting. Skip recall only on brand-new project / trivially-bounded edit / surgical user instruction.
23
-
24
- ## MEMORIZE — HARD RULE
25
-
26
- Every unknown→known = same-turn memorize. Background, parallel, never batched.
27
-
28
- ```
29
- Agent(subagent_type='gm:memorize', model='haiku', run_in_background=true, prompt='## CONTEXT TO MEMORIZE\n<fact>')
30
- ```
31
-
32
- Triggers: exec output answers prior unknown | code read confirms/refutes | CI log reveals root cause | user states preference/constraint | fix worked non-obviously | env quirk observed.
33
-
34
- N facts → N parallel Agent calls in ONE message.
35
-
36
- ## STATE MACHINE
37
-
38
- **FORWARD**: PLAN → `gm-execute` | EXECUTE → `gm-emit` | EMIT → `gm-complete` | VERIFY .prd remains → `gm-execute` | VERIFY .prd empty+pushed → `update-docs`
39
-
40
- **REGRESSIONS**: new unknown anywhere → `planning` | EXECUTE unresolvable 2 passes → `planning` | EMIT logic error → `gm-execute` | VERIFY broken output → `gm-emit` | VERIFY logic wrong → `gm-execute`
15
+ ## Transitions
41
16
 
42
- Runs until: .gm/prd.yml empty AND git clean AND all pushes confirmed AND CI green.
17
+ - PLAN done `gm-execute`
18
+ - New unknown anywhere in chain → re-enter PLAN
19
+ - EXECUTE unresolvable after 2 passes → PLAN
20
+ - VERIFY: `.prd` empty + git clean + pushed → `update-docs`; else → `gm-execute`
43
21
 
44
- ## AUTONOMY HARD RULE
22
+ Cannot stop while `.gm/prd.yml` has items, git is dirty, or commits are unpushed.
45
23
 
46
- PRD written → execute to COMPLETE without asking the user. No "should I continue", no "want me to do X next", no offering to split work.
24
+ ## Orient
47
25
 
48
- Asking permitted only as last resort: destructive-irreversible with no PRD coverage, OR user intent unrecoverable from PRD/memory/code. Channel: `exec:pause` (renames prd.yml prd.paused.yml; question in header). In-conversation asking last-resort.
26
+ Open every plan with one parallel pack of `exec:recall` + `exec:codesearch` against the request's nouns. Hits land as `weak_prior`; misses confirm the unknown is fresh. The pack runs in one message.
49
27
 
50
- **Cannot stop while**: .gm/prd.yml has items | git uncommitted | git unpushed.
28
+ **Auto-recall injection (skills-only platforms)**: derive a 2–6 word query from the request's nouns (subject, verb objects, key domain terms). Call `exec:recall <query>` at PLAN start before writing `.gm/prd.yml`, inline. This replaces the prompt-submit hook's auto-recall for platforms without hook infrastructure. Recall hits are injected as context into mutable discovery and PRD item acceptance criteria.
51
29
 
52
- ## LAWFUL DOWNGRADE — HARD RULE
30
+ ## Mutable discovery
53
31
 
54
- Per paper III §2.5: lawful downgrade is always available; forced closure (refusal) is never available. Refusing the task because part is out of reach is the inverse of bluffing — both bypass witnessed execution.
32
+ For each aspect of the work, ask: what do I not know, what could go wrong, what depends on what, what am I assuming. Unwitnessed assumptions are mutables.
55
33
 
56
- Forbidden: "honest stop", "stopping for a hard call", "I cannot do this from inside this conversation", "pretending I can would be a lie", any preamble that announces inability before attempting the bounded subset.
34
+ Fault surfaces to scan: file existence, API shape, data format, dep versions, runtime behavior, env differences, error conditions, concurrency, integration seams, backwards compat, rollback paths, CI correctness.
57
35
 
58
- Required: identify the witnessable bounded subset, PRD-write it, execute it. Residual scope = follow-up item, never refusal.
36
+ Tag every item with a route family (grounding | reasoning | state | execution | observability | boundary | representation) and cross-reference the 16-failure taxonomy. `governance` skill holds the table.
59
37
 
60
- ## SKIP PLANNING (DEFAULT for small work)
38
+ `existingImpl=UNKNOWN` is the default; resolve via `exec:codesearch` before adding the item. An existing concern routes to consolidation, not addition.
61
39
 
62
- Skip if ANY: single-file single-concern edit | trivially bounded <5min | surgical user instructions | bug fix with identified root cause | zero unknowns. Heavy ceremony only for multi-file architectural work.
40
+ Plan exits when zero new unknowns surfaced last pass AND every item has acceptance criteria AND deps are mapped.
63
41
 
64
- ## PLAN PHASE MUTABLE DISCOVERY
42
+ ## .gm/mutables.ymlco-equal with .gm/prd.yml
65
43
 
66
- For every aspect: what do I not know (UNKNOWN) | what could go wrong (failure mode) | what depends on what (blocking/blockedBy) | what assumptions am I making (unwitnessed hypothesis = mutable).
44
+ Every unknown surfaced during PLAN lands as an entry in `.gm/mutables.yml` the same pass. Live during work, deleted when empty. Hook-gated: Write/Edit/NotebookEdit and `git commit`/`git push` are hard-blocked while any entry has `status: unknown`; turn-stop is hard-blocked the same way.
67
45
 
68
- Fault surfaces: file existence | API shape | data format | dep versions | runtime behavior | env differences | error conditions | concurrency | integration seams | backwards compat | rollback paths | CI correctness.
69
-
70
- **Route family** (governance): tag every item — grounding|reasoning|state|execution|observability|boundary|representation.
71
-
72
- **Failure-mode mapping**: cross-reference 16-failure taxonomy.
73
-
74
- **MANDATORY CODEBASE SCAN**: `existingImpl=UNKNOWN` for every item. Resolve via exec:codesearch before adding. Existing concern → consolidation, not addition.
75
-
76
- **EXIT PLAN**: zero new unknowns last pass AND every item has acceptance criteria AND deps mapped → launch subagents or invoke `gm-execute`.
77
-
78
- ## OBSERVABILITY — MANDATORY EVERY PASS
79
-
80
- Server: every subsystem exposes `/debug/<subsystem>`. Structured logs `{subsystem, severity, ts}`.
81
- Client: `window.__debug` live registry; modules register on mount.
82
-
83
- `console.log` ≠ observability. Discovery of gap → add .prd item immediately, never deferred.
46
+ ```yaml
47
+ - id: kebab-id
48
+ claim: One-line statement of what is assumed
49
+ witness_method: exec:codesearch <query> | exec:nodejs import | exec:recall <query> | Read <path>
50
+ witness_evidence: ""
51
+ status: unknown
52
+ ```
84
53
 
85
- **No parallel test runners or smoke pages.** Per paper II §5.4, `window.__debug` is THE registry. Creating dedicated `docs/smoke.js` / `docs/smoke-network.js` / `docs/test.html` / `*-playground.html` files is a parallel observability surface that fights the discipline register surfaces in `window.__debug` instead. The single `test.js` at project root (see SINGLE INTEGRATION TEST POLICY) is the only out-of-page test asset.
54
+ `status: unknown` `witnessed` only when `witness_evidence` is filled with concrete proof (file:line, codesearch hit, dispatched test output). Resolution lives in gm-execute. PRD items reference mutables via optional `mutables: [id1, id2]` field; an item is blocked while any referenced mutable is unresolved.
86
55
 
87
- ## .PRD FORMAT
56
+ ## .prd format
88
57
 
89
- Path: `./.gm/prd.yml`. Write via `exec:nodejs` + `fs.writeFileSync`. Delete when empty.
58
+ Path: `./.gm/prd.yml`. Write via the Write tool or by emitting a nodejs spool file (`in/nodejs/<N>.js`) that calls `fs.writeFileSync`. Delete the file when empty.
90
59
 
91
60
  ```yaml
92
61
  - id: kebab-id
@@ -108,44 +77,42 @@ Path: `./.gm/prd.yml`. Write via `exec:nodejs` + `fs.writeFileSync`. Delete when
108
77
  - failure mode
109
78
  ```
110
79
 
111
- **load** axis (consequence — convergence 3.3.0): 0.9 = headline collapses if wrong. 0.7 = sub-argument rebuilt. 0.4 = local patch. 0.1 = nothing breaks. **Verification budget = load × (1 − tier_confidence)**. High-load + low-tier item = top priority. λ>0.75 must be witnessed before EMIT.
112
-
113
- Status: pending → in_progress → completed (remove). Effort: small <15min | medium <45min | large >1h.
80
+ `load` is consequence-if-wrong: 0.9 = headline collapses, 0.7 = sub-argument rebuilt, 0.4 = local patch, 0.1 = nothing breaks. Verification budget = `load × (1 − tier_confidence)`. λ>0.75 must reach witnessed before EMIT.
114
81
 
115
- ## PARALLEL SUBAGENT LAUNCH
82
+ `status`: pending in_progress → completed (then remove). `effort`: small <15min | medium <45min | large >1h.
116
83
 
117
- After .prd written, ≤3 parallel `gm:gm` subagents for independent items in ONE message. Browser tasks serialize.
84
+ ## Parallel subagent launch
118
85
 
119
- `Agent(subagent_type="gm:gm", prompt="Work on .prd item: <id>. .prd path: <path>. Item: <full YAML>.")`
86
+ After `.prd` is written, up to 3 parallel `gm:gm` subagents for independent items in one message. Browser tasks serialize.
120
87
 
121
- Not parallelizable → invoke `gm-execute` directly.
122
-
123
- ## EXECUTION RULES
88
+ ```
89
+ Agent(subagent_type="gm:gm", prompt="Work on .prd item: <id>. .prd path: <path>. Item: <full YAML>.")
90
+ ```
124
91
 
125
- `exec:<lang>` only via Bash. File I/O via exec:nodejs + fs. Git directly in Bash. Never Bash(node/npm/npx/bun).
92
+ Items not parallelizable invoke `gm-execute` directly.
126
93
 
127
- `exec:codesearch` only Glob/Grep/Find/Explore hook-blocked. Start 2 words → change/add one per pass → minimum 4 attempts before concluding absent.
94
+ ## Observability gates in the plan
128
95
 
129
- Pack runs: Promise.allSettled for parallel. Each idea own try/catch. Under 12s per call.
96
+ Server: every subsystem exposes `/debug/<subsystem>`; structured logs `{subsystem, severity, ts}`. Client: `window.__debug` live registry; modules register on mount. `console.log` is not observability. Discovery of a gap during PLAN adds a `.prd` item the same pass — never deferred.
130
97
 
131
- ## DEV WORKFLOW
98
+ `window.__debug` is THE in-page registry; `test.js` at project root is the sole out-of-page test asset. Any new file whose purpose is to exercise, smoke-test, demo, or sandbox in-page behavior outside that registry fights the discipline — extend the registry instead.
132
99
 
133
- No comments. No scattered test files. 200-line limit per file. Fail loud. No duplication. Scan before edit. AGENTS.md via memorize agent only. CHANGELOG.md append per commit.
100
+ ## Test discipline encoded in the plan
134
101
 
135
- **Minimal code process** (stop at first that resolves): native library structure (map/pipeline) write.
102
+ One `test.js` at project root, 200-line hard cap, real data, real system. No fixtures, mocks, or scattered tests. A second test runner under any name in any directory is a smuggled parallel surface.
136
103
 
137
- ## SINGLE INTEGRATION TEST POLICY
104
+ The 200 lines are a *budget* for maximum surface coverage, not a target. Subsystems get one combined group each — names joined with `+` (`home+config+skin`, `mcp+swe+distributions+account+credpool`). When a new subsystem's failure mode overlaps an existing group's side-effects, fold the assertion in rather than creating a new group. When `wc -l test.js > 200`, the discipline is *merge groups + drop redundancy*, never split.
138
105
 
139
- One `test.js` at project root. 200-line max. No `.test.js` / `.spec.js` / `__tests__/` / fixtures / mocks. Plain assertions, real data, real system. `gm-complete` runs it. Failure = regression to EXECUTE.
106
+ ## Execution norms encoded in the plan
140
107
 
141
- **Also forbidden**: `docs/smoke.js`, `docs/smoke-*.js`, `*-smoke.html`, `docs/test.html`, `docs/demo.html`, `*-playground.html`. These are smuggled second test runners. If a surface needs to be exercised in-page, register it in `window.__debug` and assert via `test.js`.
108
+ Code execution AND utility verbs both write to `.gm/exec-spool/in/<lang-or-verb>/<N>.<ext>`. Languages live under `in/<lang>/` (nodejs, python, bash, typescript, go, rust, c, cpp, java, deno); verbs live under `in/<verb>/` (codesearch, recall, memorize, wait, sleep, status, close, browser, runner, type, kill-port, forget, feedback, learn-status, learn-debug, learn-build, discipline, pause, health). The spool watcher runs the file and streams to `out/<N>.out` (stdout) + `out/<N>.err` (stderr) line-by-line, then writes `out/<N>.json` metadata (exitCode, durationMs, timedOut, startedAt, endedAt) at completion. Both streams return as systemMessage with `--- stdout ---` / `--- stderr ---` separators. `in/` and `out/` are wiped at session start and at real-exit session end. Only `git` (and `gh`) run directly via Bash; never `Bash(node/npm/npx/bun)`, never `Bash(exec:<anything>)`. Spool paths in nodejs files are platform-literal use `os.tmpdir()` and `path.join`. The spool enforces per-task timeouts; on timeout, partial output is preserved and the watcher emits `[exec timed out after Nms; partial output above]`.
142
109
 
143
- ## RESPONSE POLICY
110
+ `exec:codesearch` only — Grep/Glob/Find/Explore are hook-blocked. Start two words, change/add one per pass, minimum four attempts before concluding absent.
144
111
 
145
- Terse. Drop filler. Fragments OK. Pattern: `[thing] [action] [reason]. [next step].` Code/commits/PRs = normal prose.
112
+ Pack runs use `Promise.allSettled`, each idea its own try/catch, under 12s per call.
146
113
 
147
- ## CONSTRAINTS
114
+ ## Dev workflow encoded in the plan
148
115
 
149
- **Never**: Bash(node/npm/npx/bun) | skip planning | partial execution | stop while .prd has items | stop while git dirty | sequential independent items | screenshot before JS exhausted | fallback/demo modes | swallow errors | duplicate concern | leave comments | scattered tests | if/else where map suffices | one-liners that obscure | leave resolved unknown un-memorized | batch memorize | serialize memorize spawns
116
+ No comments. 200-line per-file cap. Fail loud. No duplication. Scan before edit. AGENTS.md edits route through the memorize sub-agent only. CHANGELOG.md gets one entry per commit.
150
117
 
151
- **Always**: invoke Skill at every transition | regress to planning on new unknown | witnessed execution only | scan before edits | enumerate observability gaps every pass | follow chain end-to-end | prefer dispatch tables and pipelines | make wrong states unrepresentable | spawn memorize same turn unknown resolves | end-of-turn self-check
118
+ Minimal-code process, stop at the first that resolves: native library structure (map / pipeline) write.
@@ -0,0 +1,43 @@
1
+ ---
2
+ name: research
3
+ description: Web research via parallel subagent fan-out. Use when a question needs the live web, spans multiple sources, requires comparison across vendors/papers/repos, or would saturate a single context window. Skip for one-page lookups answerable by a single WebFetch.
4
+ allowed-tools: Skill
5
+ ---
6
+
7
+ # Research
8
+
9
+ Declare the discipline before fetching anything. The first line of work names the `@<discipline>` the corpus belongs to — fresh material lives at `.gm/disciplines/<name>/corpus/raw/`, concise rewrites at `.gm/disciplines/<name>/corpus/concise/<chunk-id>.md`, the merged synthesis at `.gm/disciplines/<name>/deep-understanding.md`. A run without a discipline declaration falls back to the default store and the orchestrator says so in one line.
10
+
11
+ Lead orchestrates. Workers fetch. Findings converge on disk. The lead never reads pages — workers do.
12
+
13
+ Treat a thousand-document corpus with the same care as a codebase. Material above roughly ten pages — about eight thousand tokens — splits at paragraph boundaries into chunks each owned by one parallel `gm:research-worker` (haiku model). Each worker emits a fact-preserving concise rewrite to `corpus/concise/<chunk-id>.md` — every claim, number, name, caveat from the source survives, prose density rises, citations stay attached. Once every chunk returns, the lead merges into `deep-understanding.md` enumerating the opportunities the corpus opens and the reasonable opinionations it forces. Concise files and the merged document auto-ingest via `exec:memorize @<name>` so the next pass recalls instead of re-fetching.
14
+
15
+ Effort matches stakes. A single fact is one short fetch. A vendor comparison is a handful of workers, each owning one vendor. A landscape survey is ten or more, each owning one axis. Spending a fan-out on a fact wastes tokens; spending a fact-fetch on a landscape under-delivers.
16
+
17
+ Breadth first, depth on demand. Open with a wide sweep that maps the terrain, then commit deep dives only where the sweep surfaces something load-bearing. A narrow opening misses the alternative the user actually needed.
18
+
19
+ ## Worker contract
20
+
21
+ Each worker receives the precise question it owns, the shape of the answer (bullets, table row, prose paragraph), the boundary of what it must not pursue, and the destination path under `.gm/research/<slug>/<worker-id>.md`. Workers write structured findings to disk and return only a path plus a one-line summary. The lead reads the paths it cares about; the rest stay on disk. Returning full prose through the agent boundary burns context that the synthesis pass needs.
22
+
23
+ Workers run in parallel — independent questions launch in one message, never serialized.
24
+
25
+ ## Citations
26
+
27
+ A claim without a source URL is a hallucination waiting to be quoted. Workers attach the URL and the quoted span beside every non-trivial assertion. The lead refuses to lift a claim into the final answer if its citation field is empty.
28
+
29
+ ## Source quality
30
+
31
+ Vendor docs, RFCs, primary repos, dated blog posts from named authors, and academic preprints beat aggregator pages. When two sources disagree, the older primary usually beats the newer aggregator.
32
+
33
+ ## Convergence
34
+
35
+ Synthesis happens once, after all workers return. Mid-flight summarisation truncates findings the next worker would have built on. If a worker's return reveals a new axis the original plan missed, expand the fan-out — do not stretch an existing worker past its brief.
36
+
37
+ ## When not to fan out
38
+
39
+ One question, one page, one fetch. A single `WebFetch` answers it. The fan-out machinery has overhead; spending it on a lookup is the same mistake as skipping it on a survey.
40
+
41
+ ## Handoff
42
+
43
+ Final answer cites every load-bearing claim, names the workers' output paths for audit, and surfaces disagreements rather than averaging them away.
@@ -3,13 +3,14 @@ name: ssh
3
3
  description: Run shell commands on remote SSH hosts via exec:ssh. Reads targets from ~/.claude/ssh-targets.json. Use for deploying, monitoring, or controlling remote machines.
4
4
  ---
5
5
 
6
- # exec:ssh — Remote SSH Execution
6
+ # exec:ssh — Remote SSH execution
7
7
 
8
- Runs shell commands on remote host. No manual connection needed.
8
+ Runs shell commands on a remote host. No manual connection needed.
9
9
 
10
10
  ## Setup
11
11
 
12
12
  `~/.claude/ssh-targets.json`:
13
+
13
14
  ```json
14
15
  {
15
16
  "default": { "host": "192.168.1.10", "port": 22, "username": "pi", "password": "pass" },
@@ -17,7 +18,7 @@ Runs shell commands on remote host. No manual connection needed.
17
18
  }
18
19
  ```
19
20
 
20
- Fields: `host` (required), `port` (default 22), `username` (required), `password` OR `keyPath` + optional `passphrase`.
21
+ `host` and `username` required. `port` defaults to 22. Auth: `password` OR `keyPath` + optional `passphrase`.
21
22
 
22
23
  ## Usage
23
24
 
@@ -26,28 +27,32 @@ exec:ssh
26
27
  <shell command>
27
28
  ```
28
29
 
29
- Named host with `@name` on first line:
30
+ Named host with `@name` on the first line:
31
+
30
32
  ```
31
33
  exec:ssh
32
34
  @prod
33
35
  sudo systemctl restart myapp
34
36
  ```
35
37
 
36
- ## Process Persistence
38
+ ## Process persistence
39
+
40
+ SSH kills child processes on close. To survive disconnect:
37
41
 
38
- SSH kills child processes on close. To persist:
39
42
  ```
40
43
  exec:ssh
41
44
  sudo systemctl reset-failed myunit 2>/dev/null; systemd-run --unit=myunit bash -c 'your-command'
42
45
  ```
43
46
 
44
- Unique name:
47
+ Unique unit name per launch:
48
+
45
49
  ```
46
50
  exec:ssh
47
51
  systemd-run --unit=job-$(date +%s) bash -c 'nohup myprogram &'
48
52
  ```
49
53
 
50
- Fallback (no systemd):
54
+ No-systemd fallback:
55
+
51
56
  ```
52
57
  exec:ssh
53
58
  setsid nohup bash -c 'myprogram > /tmp/out.log 2>&1' &
@@ -55,7 +60,8 @@ setsid nohup bash -c 'myprogram > /tmp/out.log 2>&1' &
55
60
 
56
61
  ## Dependency
57
62
 
58
- Requires `ssh2` npm package in `~/.claude/gm-tools`:
63
+ Requires `ssh2` in `~/.claude/gm-tools`:
64
+
59
65
  ```
60
66
  exec:bash
61
67
  cd ~/.claude/gm-tools && npm install ssh2
@@ -0,0 +1,40 @@
1
+ ---
2
+ name: textprocessing
3
+ description: The required surface for any text task whose correctness depends on understanding. Code does mechanics; this skill does meaning. Invoked via Agent(subagent_type='gm:textprocessing', model='haiku', ...).
4
+ allowed-tools: Skill
5
+ ---
6
+
7
+ # Textprocessing — Meaning goes through Haiku
8
+
9
+ ## Invocation
10
+
11
+ ```
12
+ Agent(subagent_type='gm:textprocessing', model='haiku', prompt='## INPUT\n<body>\n\n## INSTRUCTION\n<task>')
13
+ ```
14
+
15
+ Background fire-and-forget: add `run_in_background=true`. Batch: N parallel `Agent` calls in one message, one per item.
16
+
17
+ ## The split
18
+
19
+ Mechanics stay in code: char/word/token/line counts, byte length, split on delimiter, exact-string find/replace, regex match/extract, sort, group-by-key, dedup-by-equality, lower/uppercase, JSON parse/stringify, base64, URL encode, hash, diff, format/pretty-print.
20
+
21
+ Meaning goes through this skill: summarize, classify, extract entities or intents, rewrite for tone or audience, translate, semantic dedup (same meaning, different words), rank or score by quality, label by topic, decide whether two texts are about the same thing, paraphrase, simplify, expand outline → prose, headline-from-body, body-from-headline, fact-from-passage, sentiment, toxicity, relevance, similarity-by-meaning.
22
+
23
+ The bar: would a human have to *read and understand* the text to do this correctly? Yes → skill. No → code. A keyword-list, a regex on phrases like "important", or a string-similarity ratio loop deciding meaning is a stub of this skill. Replace it with one (or N parallel) Agent calls.
24
+
25
+ ## Batch
26
+
27
+ Independent items run in parallel — one Agent call per item, all in one message. The runner Promise-allSettles. Sequential calls are wasteful when items don't depend on each other.
28
+
29
+ For one large body exceeding a single-prompt budget, the *caller* chunks deterministically (paragraph, section, fixed token count), fans out one Agent per chunk, and merges with a final reducer Agent if cross-chunk synthesis is needed. The agent itself never chunks — it processes whatever it receives in one shot.
30
+
31
+ ## Output contract
32
+
33
+ Plain-text instruction → plain-text output, no fences, no labels. JSON instruction → exactly that JSON, parseable by `JSON.parse`. Multi-document input requested as a list → one entry per input doc in the same order. Ambiguous shape → defaults to plain text. Empty input → empty output.
34
+
35
+ ## Constraints
36
+
37
+ - Model fixed at `haiku`. Escalate to opus only when haiku output fails an acceptance check.
38
+ - One transform per call. Three parallel calls beats one prompt asking for "summarize AND classify AND translate".
39
+ - Idempotent: same input + same instruction → same output, modulo sampling. Strict determinism callers specify `temperature=0` in the prompt.
40
+ - Output is the deliverable. No commentary, no "here is your output".
@@ -5,24 +5,22 @@ description: UPDATE-DOCS phase. Refresh README.md, AGENTS.md, and docs/index.htm
5
5
 
6
6
  # GM UPDATE-DOCS
7
7
 
8
- GRAPH: `PLAN EXECUTE EMIT VERIFY[UPDATE-DOCS]COMPLETE`
9
- Entry: feature verified, committed, pushed. From `gm-complete`.
8
+ Entry: feature verified, committed, pushed. Exit: docs match disk, committed, pushed COMPLETE. Unknown architecture change → `planning`.
10
9
 
11
- **FORWARD**: docs updated + committed + pushed COMPLETE
12
- **BACKWARD**: unknown architecture change → `planning`
10
+ Every claim in docs is verifiable against disk. Phase names match frontmatter, platform names match `platforms/`, file paths exist, constraint counts are accurate. An unverifiable section is removed, not speculated.
13
11
 
14
12
  ## Sequence
15
13
 
16
- **1What changed**:
14
+ What changed run directly via Bash:
15
+
17
16
  ```
18
- exec:bash
19
17
  git log -5 --oneline
20
18
  git diff HEAD~1 --stat
21
19
  ```
22
20
 
23
- **2 Read current docs**:
21
+ Read current docs via Read tool, or via a nodejs spool file (`in/nodejs/<N>.js`):
22
+
24
23
  ```
25
- exec:nodejs
26
24
  const fs = require('fs');
27
25
  ['README.md', 'AGENTS.md', 'docs/index.html', 'gm-starter/agents/gm.md'].forEach(f => {
28
26
  try { console.log(`=== ${f} ===\n` + fs.readFileSync(f, 'utf8')); }
@@ -30,31 +28,39 @@ const fs = require('fs');
30
28
  });
31
29
  ```
32
30
 
33
- **3 — Write changed sections only**:
31
+ Write changed sections only:
32
+
33
+ - **README.md** — platform count, skill tree diagram, quick-start commands
34
+ - **AGENTS.md** — via `Agent(subagent_type='gm:memorize', model='haiku', run_in_background=true, prompt='## CONTEXT TO MEMORIZE\n<learnings>')`. Never inline-edit.
35
+ - **docs/index.html** — `PHASES` array, platform lists, state machine diagram
36
+ - **gm-starter/agents/gm.md** — skill chain line if new skills added
34
37
 
35
- - **README.md**: platform count, skill tree diagram, quick start commands
36
- - **AGENTS.md**: via memorize sub-agent only — never inline-edit. `Agent(subagent_type='gm:memorize', model='haiku', run_in_background=true, prompt='## CONTEXT TO MEMORIZE\n<learnings>')`
37
- - **docs/index.html**: `PHASES` array, platform lists, state machine diagram
38
- - **gm-starter/agents/gm.md**: skill chain line if new skills added
38
+ Verify from disk (Read tool, or a nodejs spool file):
39
39
 
40
- **4 — Verify from disk**:
41
40
  ```
42
- exec:nodejs
43
41
  const content = require('fs').readFileSync('/abs/path/file.md', 'utf8');
44
42
  console.log(content.includes('expectedString'), content.length);
45
43
  ```
46
44
 
47
- **5 — Commit and push**:
45
+ Commit and push directly via Bash:
46
+
48
47
  ```
49
- exec:bash
50
48
  git add README.md docs/index.html gm-starter/agents/gm.md
51
49
  git commit -m "docs: update documentation to reflect session changes"
52
50
  git push -u origin HEAD
53
51
  ```
54
52
 
55
- ## Fidelity Rules
53
+ ## Exit: browser cleanup
54
+
55
+ After docs push succeeds, close any browser sessions spawned during this or prior skill phases. Write a nodejs spool file calling rs-exec:
56
56
 
57
- Every claim verifiable against disk: phase names match frontmatter, platform names match `platforms/`, file paths exist, constraint counts accurate. Unverifiable section → remove, don't speculate.
57
+ ```javascript
58
+ const sessionId = process.env.CLAUDE_SESSION_ID;
59
+ if (!sessionId) return;
60
+ const rs = require('rs-exec');
61
+ try {
62
+ rs.client().close_sessions_for(sessionId).catch(() => {});
63
+ } catch (e) {}
64
+ ```
58
65
 
59
- **Never**: write from memory | push without disk verify | add comments | claim done without witnessed push
60
- **Always**: git diff first | read before overwriting | verify after write | push
66
+ Best-effort: session context or rs-exec unavailable skip gracefully. No error thrown.
package/tools.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "gm",
3
- "version": "2.0.726",
3
+ "version": "2.0.1063",
4
4
  "description": "State machine agent with hooks, skills, and automated git enforcement",
5
5
  "tools": [
6
6
  {
@@ -1,44 +0,0 @@
1
- name: Publish to npm
2
-
3
- on:
4
- push:
5
- branches:
6
- - main
7
- workflow_dispatch:
8
-
9
- jobs:
10
- publish:
11
- runs-on: ubuntu-latest
12
- steps:
13
- - uses: actions/checkout@v4
14
-
15
- - uses: actions/setup-node@v4
16
- with:
17
- node-version: '22'
18
- registry-url: 'https://registry.npmjs.org'
19
-
20
- - name: Publish to npm
21
- run: |
22
- PACKAGE=$(jq -r '.name' package.json)
23
- VERSION=$(jq -r '.version' package.json)
24
- echo "Package: $PACKAGE@$VERSION"
25
-
26
- # Skip if this exact version is already on npm
27
- PUBLISHED=$(npm view "$PACKAGE@$VERSION" version 2>/dev/null || echo "")
28
- if [ "$PUBLISHED" = "$VERSION" ]; then
29
- echo "✅ $PACKAGE@$VERSION already published - skipping"
30
- exit 0
31
- fi
32
-
33
- echo "Publishing $PACKAGE@$VERSION..."
34
- npm publish --access public 2>&1 | tee /tmp/npm-out.log; EXIT=${PIPESTATUS[0]}
35
- if [ "$EXIT" != "0" ]; then
36
- if grep -q "cannot publish over\|previously published" /tmp/npm-out.log; then
37
- echo "⚠️ Version already published, skipping"
38
- else
39
- exit "$EXIT"
40
- fi
41
- fi
42
- echo "✅ Published $PACKAGE@$VERSION"
43
- env:
44
- NODE_AUTH_TOKEN: ${{ secrets.NPM_TOKEN }}
@@ -1,34 +0,0 @@
1
- #!/usr/bin/env node
2
- const fs = require('fs');
3
- const path = require('path');
4
- let raw = '';
5
- try { raw = fs.readFileSync(0, 'utf8'); } catch (_) {}
6
- if (!raw.trim()) raw = process.env.CLAUDE_HOOK_INPUT || '{}';
7
- const input = JSON.parse(raw);
8
- const toolName = input.tool_name || input.tool_use?.name || '';
9
- const toolOutput = input.tool_result || input.output || '';
10
- const gmDir = path.join(process.cwd(), '.gm');
11
- const tsPath = path.join(gmDir, 'turn-state.json');
12
- const readState = () => { try { return JSON.parse(fs.readFileSync(tsPath, 'utf8')); } catch (_) { return { firstToolFired: false, execCallsSinceMemorize: 0, recallFiredThisTurn: false }; } };
13
- const writeState = (s) => { try { if (!fs.existsSync(gmDir)) fs.mkdirSync(gmDir, { recursive: true }); fs.writeFileSync(tsPath, JSON.stringify(s), 'utf8'); } catch (_) {} };
14
- const state = readState();
15
- const messages = [];
16
- if (!state.firstToolFired) {
17
- state.firstToolFired = true;
18
- state.firstToolName = toolName;
19
- }
20
- const isMemorize = toolName === 'Agent' && /memorize/i.test(JSON.stringify(input.tool_input || input.tool_use?.input || {}));
21
- if (isMemorize) {
22
- state.execCallsSinceMemorize = 0;
23
- try { fs.unlinkSync(path.join(gmDir, 'no-memorize-this-turn')); } catch (_) {}
24
- }
25
- if (toolName === 'Bash') {
26
- const cmd = (input.tool_input && input.tool_input.command) || (input.tool_use && input.tool_use.input && input.tool_use.input.command) || '';
27
- if (/^\s*exec:recall\b/.test(cmd)) state.recallFiredThisTurn = true;
28
- if (toolOutput && typeof toolOutput === 'string' && toolOutput.length > 20 && !/^\s*exec:(recall|memorize|codesearch|wait|sleep|status|runner|type|kill-port|close|pause)/.test(cmd)) {
29
- state.execCallsSinceMemorize = (state.execCallsSinceMemorize || 0) + 1;
30
- messages.push('exec: run completed. MEMORIZE CHECK: did this output resolve any prior unknown? If YES → spawn Agent(subagent_type=\'gm:memorize\', model=\'haiku\', run_in_background=true, prompt=\'## CONTEXT TO MEMORIZE\\n<fact>\') NOW. Skipping = memory leak. (Counter: ' + state.execCallsSinceMemorize + '/3 before hard block.)');
31
- }
32
- }
33
- writeState(state);
34
- if (messages.length) process.stdout.write(JSON.stringify({ systemMessage: messages.join('\n\n') }));
@@ -1,45 +0,0 @@
1
- #!/usr/bin/env node
2
- const fs = require('fs');
3
- const path = require('path');
4
- let raw = '';
5
- try { raw = fs.readFileSync(0, 'utf8'); } catch (_) {}
6
- if (!raw.trim()) raw = process.env.CLAUDE_HOOK_INPUT || '{}';
7
- const input = JSON.parse(raw);
8
- const toolName = input.tool_name || input.tool_use?.name || '';
9
- const toolInput = input.tool_input || input.tool_use?.input || {};
10
- const skillName = toolInput.skill || toolInput.name || '';
11
- const gmDir = path.join(process.cwd(), '.gm');
12
- const needsGmPath = path.join(gmDir, 'needs-gm');
13
- const lastskillPath = path.join(gmDir, 'lastskill');
14
- const isSkillTool = toolName === 'Skill' || toolName === 'skill';
15
- if (isSkillTool && skillName) {
16
- try {
17
- if (!fs.existsSync(gmDir)) fs.mkdirSync(gmDir, { recursive: true });
18
- fs.writeFileSync(lastskillPath, skillName, 'utf8');
19
- if (skillName === 'gm' || skillName === 'gm:gm') {
20
- try { fs.unlinkSync(needsGmPath); } catch (_) {}
21
- }
22
- } catch (_) {}
23
- process.exit(0);
24
- }
25
- if (fs.existsSync(needsGmPath)) {
26
- process.stdout.write(JSON.stringify({ decision: 'block', reason: 'HARD CONSTRAINT: invoke the Skill tool with skill: "gm:gm" before any other tool. The gm:gm skill must be the first action after every user message.' }));
27
- process.exit(0);
28
- }
29
- const turnStatePath = path.join(gmDir, 'turn-state.json');
30
- const noMemoPath = path.join(gmDir, 'no-memorize-this-turn');
31
- const turnState = (() => { try { return JSON.parse(fs.readFileSync(turnStatePath, 'utf8')); } catch (_) { return null; } })();
32
- if (turnState && (turnState.execCallsSinceMemorize || 0) >= 3 && !fs.existsSync(noMemoPath)) {
33
- const isMemAgent = toolName === 'Agent' && /memorize/i.test(JSON.stringify(toolInput || {}));
34
- if (!isMemAgent) {
35
- process.stdout.write(JSON.stringify({ decision: 'block', reason: '3+ exec results have resolved unknowns without a memorize call. HARD BLOCK until you spawn at least one Agent(subagent_type=\'gm:memorize\', model=\'haiku\', run_in_background=true, prompt=\'## CONTEXT TO MEMORIZE\\n<fact>\') OR write file .gm/no-memorize-this-turn (containing reason) to declare nothing memorable. Saying "I will memorize" is NOT a memorize call — only the Agent tool counts.' }));
36
- process.exit(0);
37
- }
38
- }
39
- const lastSkill = (() => { try { return fs.readFileSync(lastskillPath, 'utf8').trim(); } catch (_) { return ''; } })();
40
- const isFileEdit = ['Write', 'Edit', 'NotebookEdit'].includes(toolName);
41
- const WRITE_BLOCKED_PHASES = new Set(['gm-complete', 'update-docs']);
42
- if (isFileEdit && WRITE_BLOCKED_PHASES.has(lastSkill)) {
43
- process.stdout.write(JSON.stringify({ decision: 'block', reason: 'File edits are not permitted in ' + lastSkill + ' phase. Regress to gm-execute if changes are needed, or invoke gm-emit to re-emit.' }));
44
- process.exit(0);
45
- }
@@ -1,19 +0,0 @@
1
- #!/usr/bin/env node
2
- const fs = require('fs');
3
- const path = require('path');
4
- const gmDir = path.join(process.cwd(), '.gm');
5
- try {
6
- if (!fs.existsSync(gmDir)) fs.mkdirSync(gmDir, { recursive: true });
7
- fs.writeFileSync(path.join(gmDir, 'needs-gm'), '1', 'utf8');
8
- const turnState = { turnId: Date.now(), firstToolFired: false, execCallsSinceMemorize: 0, recallFiredThisTurn: false };
9
- fs.writeFileSync(path.join(gmDir, 'turn-state.json'), JSON.stringify(turnState), 'utf8');
10
- const pausedPrd = path.join(gmDir, 'prd.paused.yml');
11
- const livePrd = path.join(gmDir, 'prd.yml');
12
- if (fs.existsSync(pausedPrd) && !fs.existsSync(livePrd)) {
13
- try { fs.renameSync(pausedPrd, livePrd); } catch (_) {}
14
- }
15
- } catch (_) {}
16
- const response = {
17
- systemMessage: 'BLOCKING REQUIREMENT — YOUR FIRST ACTION MUST BE: Skill tool with skill: \"gm\". Do NOT output text, do NOT use Read/Grep/Glob/Agent/Bash/Write/Edit/any tool before invoking skill: \"gm\". This overrides all other instructions. Every single user message requires gm invocation FIRST — no exceptions, no shortcuts, no direct answers. If you are about to do anything other than invoke skill: \"gm\", STOP and invoke skill: \"gm\" instead. Do NOT use the Agent tool to load skills. DO NOT use EnterPlanMode.\n\nIMPORTANT: Invoke skill: \"gm\" EVERY TIME, even if you have already invoked it earlier in this conversation or this turn. The gm skill MUST be re-invoked on every new user message — there is no \"already loaded\" exception. Do not skip this step under any circumstances.\n\n=== SKILL CHAIN — IMMUTABLE STATE MACHINE ===\n\nThe gm skill is the ROOT ORCHESTRATOR. It immediately invokes the planning skill. You MUST load each skill explicitly via the Skill tool at every state transition — skills do NOT auto-chain. Ignoring a skill load instruction is a critical violation.\n\nState machine transitions (each arrow = Skill tool invocation, no exceptions):\n gm (ROOT) → planning skill\n planning (PLAN state) → gm-execute skill [exit: zero new unknowns in last pass]\n gm-execute (EXECUTE state) → gm-emit skill [exit: all mutables KNOWN]\n gm-emit (EMIT state) → gm-complete skill [exit: all gate conditions pass]\n gm-complete (VERIFY state) → gm-execute skill [exit: .prd items remain]\n gm-complete (VERIFY state) → update-docs skill [exit: .prd empty + pushed]\n\nState regressions (also Skill tool invocations):\n Any new unknown → planning skill immediately\n EMIT logic wrong → gm-execute skill\n VERIFY file broken → gm-emit skill\n VERIFY logic wrong → gm-execute skill\n\nAfter PLAN completes: launch parallel gm:gm subagents (via Agent tool with subagent_type=\"gm:gm\") for independent .prd items — maximum 3 concurrent, never sequential for independent work.\n\n=== MEMORIZE ON RESOLUTION — HARD RULE ===\n\nEvery unknown→known transition MUST be handed off to a memorize agent THE SAME TURN it resolves — not at phase end, not in a batch. This is the most violated rule. Every session, dozens of exec: outputs resolve unknowns that are never memorized. Those facts die on compaction.\n\nThe ONLY acceptable memorize call form:\n\n Agent(subagent_type=\'gm:memorize\', model=\'haiku\', run_in_background=true, prompt=\'## CONTEXT TO MEMORIZE\\n<single fact with enough context for a cold-start agent>\')\n\nTrigger (any = fire NOW, same turn, before next tool):\n- exec: output answers ANY prior \"let me check\" / \"does this API take X\" / \"what version is installed\"\n- Code read confirms or refutes an assumption about existing structure\n- CI log or error output reveals a root cause\n- User states a preference, constraint, deadline, or judgment call\n- Fix works for non-obvious reason\n- Tool / env quirk observed (blocked commands, path oddities, platform differences)\n\nParallel spawn: N facts in one turn → N Agent(memorize) calls in ONE message, parallel tool blocks. NEVER serialize.\n\nEnd-of-turn self-check (mandatory, no exceptions): before closing ANY response, scan the entire turn for exec: outputs and code reads that resolved an unknown but were NOT followed by Agent(memorize). Spawn ALL missed ones now. \"I\'ll memorize this\" in text is NOT a memorize call — only the Agent tool call counts.\n\nSkipping memorize = memory leak = critical bug. Saying you will memorize ≠ memorizing.\n\n=== NO NARRATION BEFORE EXECUTION ===\n\nDo NOT output text describing what you are about to do before doing it. Run the tool first. State findings AFTER. Pattern: tool call → tool result → brief text summary of what was found. NOT: text describing upcoming tool → tool call.\n\n\"I\'ll check the file:\" followed by Read = violation.\n\"Let me search for X\" followed by exec:codesearch = violation.\n\"Now I\'ll fix Y\" followed by Edit = violation.\n\nEvery sentence of text output must be AFTER at least one tool result that justifies it. No pre-announcement narration.'
18
- };
19
- process.stdout.write(JSON.stringify(response));