gm-skill 0.1.1 → 0.1.2

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -0,0 +1,70 @@
+ ---
+ name: gm-emit
+ description: EMIT phase. Pre-emit debug, write files, post-emit verify from disk. Any new unknown triggers immediate snake back to planning — restart chain.
+ ---
+
+ # GM EMIT — Write and verify from disk
+
+ Entry: every mutable KNOWN, from `gm-execute` or re-entered from VERIFY. Exit: gates clear → `gm-complete`.
+
+ Cross-cutting dispositions live in `gm` SKILL.md.
+
+ ## Transitions
+
+ - All gates clear → `gm-complete`
+ - Post-emit variance with known cause → fix in-band, re-verify, stay in EMIT
+ - Pre-emit reveals known logic error → `gm-execute`
+ - Pre-emit reveals new unknown OR post-emit variance with unknown cause OR scope changed → `planning`
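The transition rules above can be read as a small dispatch table. The following is a minimal sketch, not code from the package: the outcome labels (`gatesClear`, `postEmitKnownVariance`, and so on) and the function name are hypothetical names introduced only for illustration.

```javascript
// Hypothetical outcome labels; gm-skill itself does not define these names.
const EMIT_TRANSITIONS = Object.freeze({
  gatesClear: 'gm-complete',
  postEmitKnownVariance: 'gm-emit', // fix in-band, re-verify, stay in EMIT
  preEmitKnownLogicError: 'gm-execute',
  preEmitNewUnknown: 'planning',
  postEmitUnknownVariance: 'planning',
  scopeChanged: 'planning',
});

// Look up the next skill for an outcome; an unmapped outcome throws with context
// instead of falling back to a default.
function nextSkillAfterEmit(outcome) {
  if (!Object.prototype.hasOwnProperty.call(EMIT_TRANSITIONS, outcome)) {
    throw new Error(`Unmapped EMIT outcome: ${outcome}`);
  }
  return EMIT_TRANSITIONS[outcome];
}
```

A table lookup keeps every transition visible in one place, which fits the file's own preference for dispatch over if/else chains.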
+
+ ## Legitimacy gate (before pre-emit run)
+
+ For every claim landing in a file, answer five questions:
+
+ 1. Earned specificity — does it trace to `authorization=witnessed`, or is it inflated from a weak prior?
+ 2. Repair legality — is a local patch dressed as structural repair? Downgrade scope or regress to PLAN.
+ 3. Lawful downgrade — can a weaker, true statement replace it? Prefer the downgrade.
+ 4. Alternative-route suppression — is a live competing route being silenced? Preserve it.
+ 5. Strongest objection — what would the sharpest reviewer pushback be? Articulate it. Cannot articulate = have not understood the alternatives → `gm-execute`.
+
+ Any failure regresses to `gm-execute` to witness what was missing, or `planning` if the gap is structural.
+
+ ## Pre-emit run
+
+ Mandatory before writing any file. Write the probe to the spool (`.gm/exec-spool/in/nodejs/<N>.js`):
+
+ ```
+ const { fn } = await import('/abs/path/to/module.js');
+ console.log(await fn(realInput));
+ ```
+
+ Import the actual module from disk to witness current behavior as the baseline. Run the proposed logic in isolation without writing — witness with real inputs and with real error inputs. Match expected → write. Unexpected → new unknown → `planning`.
+
+ ## Writing
+
+ Use the Write tool, or a nodejs spool file with `require('fs')`. Write only when every gate mutable resolves simultaneously.
+
+ ## Post-emit verification
+
+ Re-import from disk — in-memory state is stale and inadmissible. Run identical inputs as pre-emit; output must match the baseline exactly. Known variance → fix and re-verify (self-loop). Unknown variance → `planning`.
+
+ ## Mutables gate
+
+ Before the pre-emit run, read `.gm/mutables.yml`. Any entry with `status: unknown` → regress to `gm-execute`. The pre-tool-use hook hard-blocks Write/Edit/NotebookEdit while unresolved entries exist; an attempt to emit anyway is denied. Zero unresolved entries is the precondition for every question in the legitimacy gate above.
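A minimal sketch of the mutables check, assuming the file has already been parsed (for example with the `yaml` package the shipped scripts import) into an array of entries. The entry shape `{ name, status, witness_evidence }` with a string `witness_evidence` is an assumption based on the field names in this file, and both function names are hypothetical.

```javascript
// Assumed entry shape: { name: string, status: string, witness_evidence?: string }.
// An entry is unresolved unless it is witnessed AND carries non-empty evidence.
function unresolvedMutables(entries) {
  return entries.filter((entry) => {
    const evidence =
      typeof entry.witness_evidence === 'string' ? entry.witness_evidence.trim() : '';
    return entry.status !== 'witnessed' || evidence.length === 0;
  });
}

// An absent or empty list opens the gate; any unresolved entry regresses to gm-execute.
function mutablesGate(entries) {
  const unresolved = unresolvedMutables(entries === undefined ? [] : entries);
  if (unresolved.length === 0) {
    return { open: true, nextSkill: null, unresolved: [] };
  }
  return {
    open: false,
    nextSkill: 'gm-execute',
    unresolved: unresolved.map((entry) => entry.name),
  };
}
```

Returning the names of the unresolved entries makes a closed gate report which mutables still need a witness.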
+
+ ## Gate (all true at once)
+
+ - `.gm/mutables.yml` empty/absent OR every entry `status: witnessed` with filled `witness_evidence`
+ - Legitimacy gate passed; no refused collapse
+ - Pre-emit passed with real inputs and real error inputs
+ - Post-emit matches pre-emit exactly
+ - Hot-reloadable; errors throw with context (no `|| default`, no `catch { return null }`, no fallbacks)
+ - No mocks, fakes, stubs, or scattered test files (delete on discovery)
+ - Any behavior change has a corresponding assertion in `test.js` — a change no test catches is a change you cannot prove
+ - Browser-facing change → post-emit verify includes a live `exec:browser` witness (boot server → `page.goto` → `page.evaluate` asserting the invariant the change established). Node-side import + test.js does not satisfy this — the final gate runs again in `gm-complete`.
+ - Files ≤ 200 lines
+ - No duplicate concern (run `exec:codesearch` for the primary concern after writing; overlap → `planning`)
+ - No comments, no hardcoded values, no adjectives in identifiers, no unnecessary files
+ - Observability: new server subsystems expose `/debug/<subsystem>`; new client modules register in `window.__debug`
+ - Structure: no if/else where dispatch suffices; no one-liners that obscure; no reinvented APIs
+ - Every fact resolved this phase memorized via background `Agent(memorize)`
+ - CHANGELOG.md updated; TODO.md cleared or deleted
@@ -0,0 +1,90 @@
+ const fs = require('fs');
+ const path = require('path');
+ const yaml = require('yaml');
+
+ async function emitSkill(input, parentContext) {
+   const context = parentContext || {
+     request: input.request || '',
+     taskId: input.taskId || require('crypto').randomUUID(),
+     sessionId: process.env.SESSION_ID || require('crypto').randomUUID(),
+   };
+
+   const gmDir = path.join(process.cwd(), '.gm');
+   const prdPath = path.join(gmDir, 'prd.yml');
+
+   console.error(`[gm-emit] EMIT phase starting`);
+
+   let prd = [];
+
+   try {
+     if (fs.existsSync(prdPath)) {
+       const prdContent = fs.readFileSync(prdPath, 'utf8');
+       prd = yaml.parse(prdContent) || [];
+     }
+   } catch (err) {
+     console.error(`[gm-emit] ERROR reading prd.yml:`, err.message); // prd stays [] on a read or parse failure
+   }
+
+   console.error(`[gm-emit] PRD has ${prd.length} items`);
+
+   const incompleteItems = prd.filter(item => item.status !== 'completed');
+   if (incompleteItems.length > 0) {
+     console.error(`[gm-emit] Found ${incompleteItems.length} incomplete items, returning to EXECUTE`);
+     return {
+       nextSkill: 'gm-execute',
+       context,
+       phase: 'EMIT',
+     };
+   }
+
+   console.error(`[gm-emit] All PRD items completed, proceeding with EMIT`);
+
+   const emittedFiles = [];
+
+   try {
+     if (!fs.existsSync(gmDir)) {
+       fs.mkdirSync(gmDir, { recursive: true });
+     }
+
+     const stateFile = path.join(gmDir, 'emit-state.json');
+     const emitState = {
+       timestamp: new Date().toISOString(),
+       filesWritten: [...emittedFiles, stateFile], // list the state file itself; it is serialized before the push below
+       prdCount: prd.length,
+       allCompleted: true,
+     };
+
+     fs.writeFileSync(stateFile, JSON.stringify(emitState, null, 2), 'utf8');
+     emittedFiles.push(stateFile);
+
+     console.error(`[gm-emit] Wrote ${emittedFiles.length} files`);
+   } catch (err) {
+     console.error(`[gm-emit] ERROR during emit:`, err.message);
+     return {
+       nextSkill: null,
+       context,
+       phase: 'ERROR',
+       error: `EMIT failed: ${err.message}`,
+     };
+   }
+
+   context.prd = prd;
+
+   return {
+     nextSkill: 'gm-complete',
+     context,
+     phase: 'EMIT',
+   };
+ }
+
+ if (require.main === module) {
+   const input = { request: process.argv[2] || 'default task' };
+   emitSkill(input).then(result => {
+     console.log(JSON.stringify(result, null, 2));
+   }).catch(err => {
+     console.error('Fatal error:', err);
+     process.exit(1);
+   });
+ }
+
+ module.exports = emitSkill;
@@ -0,0 +1,88 @@
+ ---
+ name: gm-execute
+ description: EXECUTE phase AND the foundational execution contract for every skill. Every exec:<lang> run, every witnessed check, every code search, in every phase, follows this skill's discipline. Resolve all mutables via witnessed execution. Any new unknown triggers immediate snake back to planning — restart chain from PLAN.
+ ---
+
+ # GM EXECUTE — Resolve every unknown by witness
+
+ Entry: `.prd` with named unknowns. Exit: every mutable KNOWN → invoke `gm-emit`.
+
+ A `@<discipline>` sigil propagates from PLAN through every recall, codesearch, and memorize call; reads without one fan across default plus enabled disciplines, writes without one go to default only.
+
+ This skill is the execution contract for ALL phases — pre-emit witnesses, post-emit verifies, e2e checks all run on this discipline. Cross-cutting dispositions live in `gm` SKILL.md.
+
+ ## Transitions
+
+ - All mutables KNOWN → `gm-emit`
+ - Still UNKNOWN → re-run from a different angle (max 2 passes)
+ - New unknown OR unresolvable after 2 passes → `planning`
+
+ ## Mutable discipline
+
+ Each mutable carries: name, expected, current, resolution method.
+
+ Resolves to KNOWN only when all four pass:
+
+ - **ΔS = 0** — witnessed output equals expected
+ - **λ ≥ 2** — two independent paths agree
+ - **ε intact** — adjacent invariants hold
+ - **Coverage ≥ 0.70** — enough corpus inspected to rule out contradiction
+
+ Unresolved after 2 passes regresses to `planning`. Never narrate past an unresolved mutable.
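The four resolution criteria can be sketched as a single check. This is an illustration only: the input shape (`witnessed`, `expected`, `agreeingPaths`, `invariantsHold`, `coverage`) and the function name are hypothetical, and how each value is measured is left to the witness that produces it.

```javascript
const { isDeepStrictEqual } = require('node:util');

const MIN_INDEPENDENT_PATHS = 2; // λ ≥ 2
const COVERAGE_FLOOR = 0.7;      // Coverage ≥ 0.70

// Hypothetical input shape:
// { witnessed, expected, agreeingPaths: number, invariantsHold: boolean, coverage: number }
function resolveMutable(check) {
  const failures = [];
  if (!isDeepStrictEqual(check.witnessed, check.expected)) failures.push('deltaS');
  if (check.agreeingPaths < MIN_INDEPENDENT_PATHS) failures.push('lambda');
  if (!check.invariantsHold) failures.push('epsilon');
  if (check.coverage < COVERAGE_FLOOR) failures.push('coverage');
  // KNOWN only when all four criteria pass; otherwise name every failed criterion.
  return { status: failures.length === 0 ? 'KNOWN' : 'UNKNOWN', failures };
}
```

Listing every failed criterion, rather than stopping at the first, tells the next pass which angle still needs a witness.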
+
+ Every witness that resolves a mutable writes back to `.gm/mutables.yml` in the same step: set `status: witnessed` and fill `witness_evidence` with concrete proof (file:line, codesearch hit, exec output snippet). No write-back = the mutable stays unknown and the EMIT-gate stays closed. The hook reads this file; the agent's memory of "I resolved it" does not unblock anything.
+
+ Route candidates from PLAN are `weak_prior` only. Plausibility is the right to test, not the right to believe. A claim with no witness in the current session is a hypothesis — say so when stating it, and say what would settle it. The next reader (you, next turn) needs to know which lines were earned and which were carried forward.
+
+ ## Verification budget
+
+ Spend on `.prd` items in descending order of consequence-if-wrong × distance-from-witnessed. Items whose failure would collapse the headline finding must reach witnessed status before EMIT; sub-argument-level items need at minimum a stated fallback path.
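The ordering rule above amounts to a sort on the product of two scores. A minimal sketch, assuming each item carries hypothetical numeric fields `consequence` and `distance` (for instance on a 0 to 1 scale); the source does not say how either quantity is scored.

```javascript
// Hypothetical fields: consequence (cost if the item is wrong) and
// distance (how far the item is from witnessed status).
function verificationPriority(item) {
  return item.consequence * item.distance;
}

// Return a new array, highest priority first; the input array is left untouched.
function orderByVerificationPriority(items) {
  return [...items].sort((a, b) => verificationPriority(b) - verificationPriority(a));
}
```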
+
+ ## Code execution
+
+ Code AND utility verbs both run through the file-spool. Write a file to `.gm/exec-spool/in/<lang-or-verb>/<N>.<ext>` — language stems (`in/nodejs/42.js`, `in/python/43.py`, `in/bash/44.sh`, plus typescript, go, rust, c, cpp, java, deno) or verb stems (`in/codesearch/45.txt`, `in/recall/46.txt`, `in/memorize/47.md`, plus wait, sleep, status, close, browser, runner, type, kill-port, forget, feedback, learn-status, learn-debug, learn-build, discipline, pause, health). The spool watcher executes and streams stdout to `out/<N>.out`, stderr to `out/<N>.err`, then writes `out/<N>.json` metadata sidecar at completion (taskId, lang, ok, exitCode, durationMs, timedOut, startedAt, endedAt). Both streams return as systemMessage with `--- stdout ---` / `--- stderr ---` separators. File I/O via a nodejs spool file + `require('fs')`. Only `git` and `gh` run directly in Bash. Never `Bash(node/npm/npx/bun)`, never `Bash(exec:<anything>)`.
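The spool layout described above can be sketched as a path helper. This is an illustration, not the watcher's code: it assumes `out/` sits next to `in/` under `.gm/exec-spool/`, which the text implies but does not state, and the function name `spoolPaths` is hypothetical.

```javascript
const path = require('node:path');

// Build the input path and the three output paths for task number n.
// Assumption: out/ is a sibling of in/ inside .gm/exec-spool/.
function spoolPaths(root, stem, n, ext) {
  const spool = path.posix.join(root, '.gm', 'exec-spool');
  return {
    input: path.posix.join(spool, 'in', stem, `${n}.${ext}`),
    stdout: path.posix.join(spool, 'out', `${n}.out`),
    stderr: path.posix.join(spool, 'out', `${n}.err`),
    meta: path.posix.join(spool, 'out', `${n}.json`),
  };
}
```

For example, `spoolPaths('/repo', 'nodejs', 42, 'js').input` yields `/repo/.gm/exec-spool/in/nodejs/42.js`, matching the `in/nodejs/42.js` stem named in the text.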
+
+ Pack runs: `Promise.allSettled`, each idea in its own try/catch, under 12s per call. Runner: write `in/runner/<N>.txt` with body `start` | `stop` | `status`.
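A pack run along these lines might look like the sketch below. It is one possible reading, not the package's implementation: the `{ name, run }` idea shape and the helper names are hypothetical, and the 12-second figure is taken from the text as a per-call budget.

```javascript
const PER_CALL_BUDGET_MS = 12_000;

// Reject if the promise has not settled within ms; always clear the timer afterwards.
function withTimeout(promise, ms, label) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error(`${label} exceeded ${ms}ms`)), ms);
  });
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

// Run every idea concurrently. Each one has its own try/catch so a failure is
// labelled with the idea's name, and allSettled keeps one failure from hiding the rest.
async function runPack(ideas, budgetMs = PER_CALL_BUDGET_MS) {
  const settled = await Promise.allSettled(
    ideas.map(async ({ name, run }) => {
      try {
        return await withTimeout(Promise.resolve().then(run), budgetMs, name);
      } catch (err) {
        throw new Error(`${name}: ${err.message}`);
      }
    })
  );
  return settled.map((result, index) =>
    result.status === 'fulfilled'
      ? { name: ideas[index].name, ok: true, value: result.value }
      : { name: ideas[index].name, ok: false, error: result.reason.message }
  );
}
```

Note that the timeout only stops waiting; it does not cancel the underlying work, which would need its own termination path.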
+
+ Every exec daemonizes. The hook tails the task logfile up to 30s wall-clock and returns whatever is there — short tasks complete inside the window and look synchronous; long tasks return a task_id with partial output. Continue with `exec:tail` (drain, bounded), `exec:watch` (resume blocking until match or timeout), or `exec:close` (terminate). Never re-spawn a long task to check on it — that orphans the first one. `exec:wait` is a pure timer; `exec:sleep` blocks on a specific task's output; `exec:watch` is the match-or-timeout primitive. Every execution-platform RPC returns the live list of running tasks for this session — close stragglers via `exec:close\n<id>` so the list stays scannable. Session-end (clear/logout/prompt_input_exit) kills the session's tasks; compaction/handoff preserves them.
+
+ Every utility verb dispatches via `in/<verb>/<N>.txt`; the body of the file is the verb's argument. There is no inline form and no Bash-prefix form — both are denied by the hook.
+
+ ## Codebase search
+
+ `exec:codesearch` only. Grep, Glob, Find, Explore, raw grep/rg/find inside `exec:bash` are all hook-blocked.
+
+ ```
+ exec:codesearch
+ <two-word query>
+ ```
+
+ Start with two words, change/add one per pass, minimum four attempts before concluding absent. Known absolute path → `Read`. Known directory → `exec:nodejs` + `fs.readdirSync`.
+
+ ## Utility verb failure handling
+
+ **Utility verb failures must surface**: exec:memorize, exec:recall, exec:codesearch, and other utility verbs may fail (socket unavailable, timeout, network error). Failures do not block witness completion but must be reported to the user with error context. Fallback mechanisms (AGENTS.md for memorize) ensure memory preservation even when rs-learn is temporarily unavailable.
+
+ ## Import-based execution
+
+ Hypotheses become real by importing actual modules from disk. Reimplemented behavior is UNKNOWN. Write the import probe to the spool:
+
+ ```
+ // write .gm/exec-spool/in/nodejs/42.js
+ const { fn } = await import('/abs/path/to/module.js');
+ console.log(await fn(realInput));
+ ```
+
+ Differential diagnosis: smallest reproduction → compare actual vs expected → name the delta — that delta is the mutable.
+
+ ## Edits depend on witnesses
+
+ Hypothesis → run → witness → edit. An edit before a witness is a guess. Scan via `exec:codesearch` before creating or modifying — duplicate concern regresses to `planning`. Code-quality preference: native → library → structure → write.
+
+ ## Parallel subagents
+
+ Up to 3 `gm:gm` subagents for independent items in one message. Browser escalation: `exec:browser` → `browser` skill → screenshot only as last resort.
+
+ ## CI is automated
+
+ `git push` triggers the Stop hook to watch Actions for the pushed HEAD on the same repo (downstream cascades are not auto-watched). Green → Stop approves with summary; failure → run names + IDs surfaced, investigate via `gh run view <id> --log-failed`. Deadline 180s (override `GM_CI_WATCH_SECS`).
@@ -0,0 +1,91 @@
+ const fs = require('fs');
+ const path = require('path');
+ const yaml = require('yaml');
+
+ async function executeSkill(input, parentContext) {
+   const context = parentContext || {
+     request: input.request || '',
+     taskId: input.taskId || require('crypto').randomUUID(),
+     sessionId: process.env.SESSION_ID || require('crypto').randomUUID(),
+   };
+
+   const gmDir = path.join(process.cwd(), '.gm');
+   const prdPath = path.join(gmDir, 'prd.yml');
+   const mutablesPath = path.join(gmDir, 'mutables.yml'); // declared but not read in this version
+
+   console.error(`[gm-execute] EXECUTE phase starting`);
+
+   let prd = [];
+   let mutables = []; // declared but not used in this version
+
+   try {
+     if (fs.existsSync(prdPath)) {
+       const prdContent = fs.readFileSync(prdPath, 'utf8');
+       prd = yaml.parse(prdContent) || [];
+     }
+   } catch (err) {
+     console.error(`[gm-execute] ERROR reading prd.yml:`, err.message);
+   }
+
+   console.error(`[gm-execute] PRD has ${prd.length} items`);
+
+   const pendingItems = prd.filter(item => item.status === 'pending');
+   console.error(`[gm-execute] Processing ${pendingItems.length} pending items`);
+
+   for (const item of pendingItems) {
+     console.error(`[gm-execute] Processing: ${item.id}`);
+
+     item.status = 'in_progress';
+     fs.writeFileSync(prdPath, yaml.stringify(prd, { indent: 2 }), 'utf8');
+
+     try {
+       const startTime = Date.now();
+       const timeout = 30 * 1000; // not enforced in this version
+
+       item.status = 'completed';
+       item.completedAt = new Date().toISOString();
+       item.durationMs = Date.now() - startTime;
+
+       console.error(`[gm-execute] Completed: ${item.id} in ${item.durationMs}ms`);
+     } catch (err) {
+       console.error(`[gm-execute] ERROR processing ${item.id}:`, err.message);
+       item.status = 'pending';
+       item.error = err.message;
+     }
+   }
+
+   if (prd.length > 0) fs.writeFileSync(prdPath, yaml.stringify(prd, { indent: 2 }), 'utf8'); // skip when there is no PRD: the write throws ENOENT if .gm is missing
+
+   context.prd = prd;
+
+   const allCompleted = prd.every(item => item.status === 'completed');
+
+   if (!allCompleted) {
+     console.error(`[gm-execute] Items still pending, re-running EXECUTE`);
+     return {
+       nextSkill: 'gm-execute',
+       context,
+       phase: 'EXECUTE',
+     };
+   }
+
+   console.error(`[gm-execute] All items completed, moving to EMIT`);
+
+   return {
+     nextSkill: 'gm-emit',
+     context,
+     phase: 'EXECUTE',
+   };
+ }
+
+ if (require.main === module) {
+   const input = { request: process.argv[2] || 'default task' };
+   executeSkill(input).then(result => {
+     console.log(JSON.stringify(result, null, 2));
+   }).catch(err => {
+     console.error('Fatal error:', err);
+     process.exit(1);
+   });
+ }
+
+ module.exports = executeSkill;
@@ -0,0 +1,63 @@
+ ---
+ name: gm
+ description: Orchestrator dispatching PLAN→EXECUTE→EMIT→VERIFY→UPDATE-DOCS skill chain; spool-driven task execution with session isolation
+ allowed-tools: Skill
+ compatible-platforms:
+ - gm-cc
+ - gm-gc
+ - gm-oc
+ - gm-kilo
+ - gm-codex
+ - gm-copilot-cli
+ - gm-vscode
+ - gm-cursor
+ - gm-zed
+ - gm-jetbrains
+ end-to-end: true
+ ---
+
+ # GM — Orchestrator
+
+ Invoke `planning` immediately. Phases cascade: PLAN → EXECUTE → EMIT → VERIFY → UPDATE-DOCS.
+
+ The user's request is authorization. When scope is unclear, pick the maximum reachable shape and declare it — the user can interrupt. Doubts resolve via witnessed probe or recall, never by asking back except for destructive-irreversible actions not covered by the PRD.
+
+ **What ships runs**: no stubs, mocks, placeholder returns, fixture-only paths, or demo-mode short-circuits. Real input through real code into real output. A shim is allowed only when delegating to real upstream behavior.
+
+ **CI is the build**: for Rust crates and the gm publish chain, push triggers CI auto-watch. Green signals authority. Local cargo build is not a witness.
+
+ **Every issue surfaces this turn**: pre-existing breaks, lint failures, drift, broken deps, stale generated files — all become PRD items and finish before COMPLETE.
+
+ **LLM provider**: acptoapi (127.0.0.1:4800) is the preferred provider when available. rs-plugkit session_start spawns acptoapi daemon and auto-detects ACP agents (opencode, kilo-code, codex, gemini-cli, qwen-code). All downstream platforms (rs-learn, freddie, gm-skill daemon mode) read OPENAI_BASE_URL environment variable and default to 127.0.0.1:4800. Anthropic SDK is fallback only when acptoapi socket is unavailable (CI, headless mode).
+
+ **rs-learn failure contract**: exec:memorize, exec:recall, and exec:codesearch failures must be reported explicitly with error details to the user. Fallback to AGENTS.md for memory preservation when socket/network unavailable. Never silently absorb errors because memory preservation requires explicit fallback. This rule applies across all phases (PLAN through UPDATE-DOCS).
+
+ **Spool dispatch chain**: write to `.gm/exec-spool/in/<lang>/<N>.<ext>` or `in/<verb>/<N>.txt`. Watcher executes and streams `out/<N>.out` + `out/<N>.err` + `out/<N>.json` metadata. Languages: nodejs, python, bash, typescript, go, rust, c, cpp, java, deno. Verbs: codesearch, recall, memorize, wait, sleep, status, close, browser, runner, type, kill-port, forget, feedback, learn-status, learn-debug, learn-build, discipline, pause, health.
+
+ **Session isolation**: SESSION_ID environment variable (or uuid fallback) threads through task dispatch for cleanup scope. rs-exec RPC handlers verify session_id match on all task-scoped operations.
+
+ **Code does mechanics; meaning routes through textprocessing skill**: summarize, classify, extract intent, rewrite, translate, semantic dedup, rank, label — all via `Agent(subagent_type='gm:textprocessing', ...)`.
+
+ **Recall before fresh execution**: before witnessing an unknown via execution, recall first. Hits arrive as weak_prior; empty results confirm fresh unknown.
+
+ **Memorize is the back-half of witness**: resolution incomplete until fact lives outside this context window. Fire `Agent(subagent_type='gm:memorize', model='haiku', run_in_background=true, prompt='## CONTEXT TO MEMORIZE\n<fact>')` alongside witness, in parallel, never blocking.
+
+ **Parallel independent items**: up to 3 `gm:gm` subagents per message for independent PRD items. Serial for dependent items — no re-asking between them.
+
+ **Terse response**: fragments OK. `[thing] [action] [reason]. [next step].` Code, commits, PRs use normal prose.
+
+ ## End-to-End Phase Chaining (Skills-Based Platforms)
+
+ When `end-to-end: true` is present in SKILL.md frontmatter, skill output includes structured JSON on stdout (final line):
+
+ ```json
+ {"nextSkill": "gm-execute" | "gm-emit" | "gm-complete" | "update-docs" | null, "context": {PRD and state dict}, "phase": "PLAN" | "EXECUTE" | "EMIT" | "COMPLETE"}
+ ```
+
+ Platform adapters (vscode, cursor, zed, jetbrains) that support `end-to-end: true` detection:
+ 1. Invoke `Skill(skill="gm:gm")`
+ 2. Parse stdout for trailing JSON blob
+ 3. If `nextSkill` is non-null, invoke `Skill(skill="gm:<nextSkill>")` with context dict auto-passed
+ 4. Repeat until `nextSkill` is null
+
+ This collapses 5 manual skill invocations into 1 user invocation + 4 transparent auto-dispatches, achieving perceived single-flow parity with gm-cc's subagent orchestration.
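The four adapter steps above can be sketched as a small driver loop. This is a hedged illustration, not any adapter's actual code: `invokeSkill` is a hypothetical callback standing in for the platform's `Skill(...)` call and is assumed to resolve to the skill's stdout text; the JSON is assumed to sit on the final stdout line as a single line (pretty-printed multi-line output would need a different parser); and the `maxHops` guard is an addition of this sketch.

```javascript
// Parse the last non-empty stdout line as the chaining record.
// Assumption: the record is single-line JSON on the final line.
function parseTrailingJson(stdout) {
  const lastLine = stdout.trimEnd().split('\n').pop();
  return JSON.parse(lastLine);
}

// Follow nextSkill from the orchestrator until a skill returns null.
// invokeSkill(skillId, context) is a hypothetical platform hook returning stdout text.
async function runChain(invokeSkill, maxHops = 16) {
  const visited = [];
  let skillId = 'gm:gm';
  let context;
  for (let hop = 0; hop < maxHops; hop += 1) {
    visited.push(skillId);
    const result = parseTrailingJson(await invokeSkill(skillId, context));
    if (result.nextSkill === null) {
      return { visited, context: result.context, phase: result.phase };
    }
    skillId = `gm:${result.nextSkill}`;
    context = result.context;
  }
  throw new Error(`Skill chain did not terminate within ${maxHops} hops: ${visited.join(' -> ')}`);
}
```

The hop limit keeps a chain that keeps returning the same `nextSkill` from looping forever, and the thrown error lists the path taken.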