npm - gm-cc - Versions diffs - 2.0.727 → 2.0.1064 - Mend

gm-cc 2.0.727 → 2.0.1064

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (44)

package/.claude-plugin/marketplace.json +1 -1
package/agents/gm.md +1 -3
package/agents/memorize.md +22 -2
package/agents/research-worker.md +36 -0
package/agents/textprocessing.md +47 -0
package/bin/bootstrap.js +624 -34
package/bin/plugkit.js +95 -53
package/bin/plugkit.sha256 +6 -6
package/bin/plugkit.version +1 -1
package/bin/rtk.sha256 +6 -0
package/bin/rtk.version +1 -0
package/hooks/hooks.json +2 -46
package/hooks/hooks.spec.json +48 -0
package/package.json +2 -2
package/plugin.json +1 -1
package/skills/browser/SKILL.md +18 -16
package/skills/code-search/SKILL.md +15 -15
package/skills/create-lang-plugin/SKILL.md +22 -26
package/skills/gm/SKILL.md +31 -66
package/skills/gm-cc/SKILL.md +19 -0
package/skills/gm-codex/SKILL.md +19 -0
package/skills/gm-complete/SKILL.md +52 -69
package/skills/gm-copilot-cli/SKILL.md +19 -0
package/skills/gm-cursor/SKILL.md +19 -0
package/skills/gm-emit/SKILL.md +44 -61
package/skills/gm-execute/SKILL.md +42 -84
package/skills/gm-gc/SKILL.md +19 -0
package/skills/gm-jetbrains/SKILL.md +19 -0
package/skills/gm-kilo/SKILL.md +19 -0
package/skills/gm-oc/SKILL.md +19 -0
package/skills/gm-vscode/SKILL.md +19 -0
package/skills/gm-zed/SKILL.md +19 -0
package/skills/governance/SKILL.md +24 -23
package/skills/pages/SKILL.md +42 -92
package/skills/planning/SKILL.md +83 -80
package/skills/research/SKILL.md +43 -0
package/skills/ssh/SKILL.md +15 -9
package/skills/textprocessing/SKILL.md +40 -0
package/skills/update-docs/SKILL.md +27 -21
package/.github/workflows/publish-npm.yml +0 -44
package/hooks/post-tool-use-hook.js +0 -34
package/hooks/pre-tool-use-hook.js +0 -45
package/hooks/prompt-submit-hook.js +0 -19
package/hooks/session-start-hook.js +0 -23

package/skills/gm/SKILL.md CHANGED

@@ -1,91 +1,56 @@
 ---
 name: gm
-description: Agent (not skill) - immutable programming state machine. Always invoke for all work coordination.
+description: Orchestrator dispatching PLAN→EXECUTE→EMIT→VERIFY→UPDATE-DOCS skill chain; spool-driven task execution with session isolation
+allowed-tools: Skill
+end-to-end: true
 ---
-# GM — Skill-First Orchestrator
+# GM — Orchestrator
-Invoke `planning` skill immediately. Skill tool only — never Agent tool for skills.
+Invoke `planning` immediately. Phases cascade: PLAN → EXECUTE → EMIT → VERIFY → UPDATE-DOCS.
-## STATE MACHINE
+The user's request is authorization. When scope is unclear, pick the maximum reachable shape and declare it — the user can interrupt. Doubts resolve via witnessed probe or recall, never by asking back except for destructive-irreversible actions uncovered by the PRD.
-Top of chain. No mutables resolved. Phases: PLAN → EXECUTE → EMIT → VERIFY → UPDATE-DOCS.
-Each phase loads protocols via Skill invocation only. Reading summary ≠ being in phase.
+**What ships runs**: no stubs, mocks, placeholder returns, fixture-only paths, or demo-mode short-circuits. Real input through real code into real output. A shim is allowed only when delegating to real upstream behavior.
-`gm-execute` = execution contract (all phases). `governance` = route/legitimacy reference (load once).
+**CI is the build**: for Rust crates and the gm publish chain, push triggers CI auto-watch. Green signals authority. Local cargo build is not a witness.
-## RECALL — HARD RULE
+**Every issue surfaces this turn**: pre-existing breaks, lint failures, drift, broken deps, stale generated files — all become PRD items and finish before COMPLETE.
-Before resolving any unknown via fresh execution, check past sessions. Memorized facts only help if recalled.
+**LLM provider**: acptoapi (127.0.0.1:4800) is the preferred provider when available. rs-plugkit session_start spawns acptoapi daemon and auto-detects ACP agents (opencode, kilo-code, codex, gemini-cli, qwen-code). All downstream platforms (rs-learn, freddie, gm-skill daemon mode) read OPENAI_BASE_URL environment variable and default to 127.0.0.1:4800. Anthropic SDK is fallback only when acptoapi socket is unavailable (CI, headless mode).
-```
-exec:recall
-<2-6 word query>
-```
-Triggers: unknown feels familiar | sub-task on a known project | about to ask user something likely already discussed | about to design where prior decision exists. Hits = weak_prior; still witness before adopting. ~200 tokens, ~5ms when serve is running.
-## MEMORIZE — HARD RULE
+**rs-learn failure contract**: memorize, recall, and codesearch spool dispatch failures must be reported explicitly with error details to the user. Fallback to AGENTS.md for memory preservation when socket/network unavailable. Never silently absorb errors because memory preservation requires explicit fallback. This rule applies across all phases (PLAN through UPDATE-DOCS).
-Unknown→known = memorize same turn it resolves. Background, non-blocking.
-Triggers: exec: output answers prior unknown | code read confirms/refutes assumption | CI log reveals root cause | user states preference/constraint | fix worked for non-obvious reason | env quirk observed.
-```
-Agent(subagent_type='gm:memorize', model='haiku', run_in_background=true, prompt='## CONTEXT TO MEMORIZE\n<fact>')
-```
+**Spool watcher restart**: at session start, check `.gm/exec-spool/.watcher.heartbeat`. If older than 30s, the watcher is dead. Write a nodejs spool file to `.gm/exec-spool/in/nodejs/restart-watcher.js` that kills stale plugkit processes and spawns `plugkit runner --watch .gm/exec-spool/in --out .gm/exec-spool/out` detached. Wait 2s for initialization before dispatching any exec commands.
-Multiple facts → parallel Agent calls in ONE message. End-of-turn: scan for un-memorized resolutions → spawn now.
+**Spool dispatch chain**: write to `.gm/exec-spool/in/<lang>/<N>.<ext>` or `in/<verb>/<N>.txt`. Watcher executes and streams `out/<N>.out` + `out/<N>.err` + `out/<N>.json` metadata. Languages: nodejs, python, bash, typescript, go, rust, c, cpp, java, deno. Verbs: codesearch, recall, memorize, wait, sleep, status, close, browser, runner, type, kill-port, forget, feedback, learn-status, learn-debug, learn-build, discipline, pause, health.
-**Recall + memorize together = learning loop.** Skipping either breaks it.
+**Session isolation**: SESSION_ID environment variable (or uuid fallback) threads through task dispatch for cleanup scope. rs-exec RPC handlers verify session_id match on all task-scoped operations.
-## AUTONOMY — HARD RULE
+**Code does mechanics; meaning routes through textprocessing skill**: summarize, classify, extract intent, rewrite, translate, semantic dedup, rank, label — all via `Agent(subagent_type='gm:textprocessing', ...)`.
-Default = autonomous execution. Emit PRD, run it to completion, push. Do NOT ask the user mid-task.
+**Recall before fresh execution**: before witnessing unknown via execution, recall first. Hits arrive as weak_prior; empty results confirm fresh unknown.
-Forbidden patterns:
-- "Should I continue with X?" / "Want me to do Y next?" / "Want me to also Z?"
-- "This is a lot — should I do A first and confirm?" / "Two options: A or B, which?"
-- Pre-confirmation before multi-file edits when scope is already clear
-- Stopping after partial completion to summarize and await direction
+**Memorize is the back-half of witness**: resolution incomplete until fact lives outside this context window. Fire `Agent(subagent_type='gm:memorize', model='haiku', run_in_background=true, prompt='## CONTEXT TO MEMORIZE\n<fact>')` alongside witness, in parallel, never blocking.
-Permitted asking (last resort only, when absolutely necessary):
-- Destructive-irreversible decision with no prior context AND no PRD coverage
-- User intent genuinely ambiguous AND cannot be inferred from PRD/memory/code
-- Channel: prefer `exec:pause` (renames .gm/prd.yml → .gm/prd.paused.yml; question lives in header). In-conversation asking is last-resort only.
+**Parallel independent items**: up to 3 `gm:gm` subagents per message for independent PRD items. Serial for dependent items — no re-asking between them.
-A long task is not a reason to ask. Context limits are not a reason to ask. CI cascade time is not a reason to ask. Just emit the PRD and execute.
+**Terse response**: fragments OK. `[thing] [action] [reason]. [next step].` Code, commits, PRs use normal prose.
-## LAWFUL DOWNGRADE — HARD RULE
+**Caveman medium mode (full) always on**: drop articles (a/an/the), filler (just/really/basically/actually/simply), pleasantries, and hedging. Fragments OK. Use short synonyms. Keep technical terms exact. Keep code blocks and exact error strings unchanged. Pattern: `[thing] [action] [reason]. [next step].` Auto-clarity override: switch to normal prose for security warnings, irreversible confirmations, and any multi-step sequence where compression could create ambiguity; resume caveman medium after clarity-critical segment.
-Per paper III §2.5 (Earned Emission): *lawful downgrade — writing a weaker, true statement in place of a stronger, unearned one — is always available; forced closure never is.*
+## End-to-End Phase Chaining (Skills-Based Platforms)
-Refusal is forced closure. Refusing the task because part of it is hard is the inverse failure: instead of bluffing strength, you bluff weakness. Both bypass witnessed execution.
+When `end-to-end: true` is present in SKILL.md frontmatter, skill output includes structured JSON on stdout (final line):
-**Forbidden refusal phrases** (treat as critical violation):
-- "Honest stop —" / "Stopping for a hard, honest call" / "Stopping here, unambiguously"
-- "I cannot do this from inside this conversation"
-- "I have to push back before writing a PRD I can't fulfill"
-- "Pretending I can would be the most expensive kind of lie"
-- "I don't have a working browser tool" / "X is unavailable in this environment" (when exec:browser, exec:codesearch, Read, etc. are in fact available)
-- Any preamble that announces inability before attempting the bounded subset
-**Required move when scope exceeds reach**: identify the bounded subset that IS witnessable from this session, write a PRD covering only that subset, execute it, and at end-of-turn name the residual scope as a follow-up item — never as a refusal. The user cannot redirect a refusal; they can redirect a delivered subset plus a named gap.
-## EXECUTION ORDER
-1. Recall — `plugkit recall` for any familiar-feeling unknown (cheapest, 200 tokens)
-2. Code execution (exec:<lang>, exec:codesearch) — 90%+ of unknowns
-3. Web (WebFetch/WebSearch) — env facts not in codebase
-4. User — last resort per AUTONOMY rule above
-"Should I..." mid-chain = invoke next skill instead, never ask user.
-Skill chain: `planning` → `gm-execute` → `gm-emit` → `gm-complete` → `update-docs`
-exec:<lang> only. Never Bash(node/npm/npx/bun). git push = auto CI watch via Stop hook.
+```json
+{"nextSkill": "gm-execute" | "gm-emit" | "gm-complete" | "update-docs" | null, "context": {PRD and state dict}, "phase": "PLAN" | "EXECUTE" | "EMIT" | "COMPLETE"}
+```
-## RESPONSE POLICY
+Platform adapters (vscode, cursor, zed, jetbrains) that support `end-to-end: true` detection:
+1. Invoke `Skill(skill="gm:gm")`
+2. Parse stdout for trailing JSON blob
+3. If `nextSkill` is non-null, invoke `Skill(skill="gm:<nextSkill>")` with context dict auto-passed
+4. Repeat until `nextSkill` is null
-Terse. Drop filler. Fragments OK. Pattern: `[thing] [action] [reason]. [next step].`
-Code/commits/PRs = normal prose. Security/destructive = drop terseness.
+This collapses 5 manual skill invocations into 1 user invocation + 4 transparent auto-dispatches, achieving perceived single-flow parity with gm-cc's subagent orchestration.
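
Reviewer note: the adapter loop described at the end of this hunk can be sketched as below. This is a minimal illustration under stated assumptions, not the package's adapter code: `invokeSkill` is a hypothetical synchronous callback standing in for the platform's `Skill(...)` call, `maxHops` is an invented safety bound, and the trailing JSON is taken to be the last non-empty line of stdout.

```javascript
// Hypothetical sketch of the end-to-end chaining loop described in the hunk above.
// `invokeSkill(name, context)` is an assumed callback that returns the skill's stdout.

// Parse the final non-empty stdout line as a JSON object; return null when it is not one.
function parseTrailingJson(stdout) {
  const lines = stdout.split(/\r?\n/).filter((line) => line.trim() !== '');
  if (lines.length === 0) return null;
  try {
    const parsed = JSON.parse(lines[lines.length - 1]);
    const isObject = parsed !== null && typeof parsed === 'object' && !Array.isArray(parsed);
    return isObject ? parsed : null;
  } catch {
    return null;
  }
}

// Start at gm:gm, then follow nextSkill until it is null or no trailing JSON is found.
// maxHops is an assumed guard against a skill that never terminates the chain.
function runChain(invokeSkill, maxHops = 8) {
  const visited = [];
  let skill = 'gm';
  let context = {};
  for (let hop = 0; hop < maxHops && skill !== null; hop += 1) {
    visited.push(skill);
    const result = parseTrailingJson(invokeSkill(`gm:${skill}`, context));
    if (result === null) break;
    context = result.context ?? context;
    skill = result.nextSkill ?? null;
  }
  return visited;
}
```

Stopping when no trailing JSON is found, rather than throwing, is one possible choice; a real adapter might prefer to surface that as an error.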

package/skills/gm-cc/SKILL.md ADDED

@@ -0,0 +1,19 @@
+---
+name: gm-cc
+description: AI-native software engineering via skill-driven orchestration on cc; bootstraps plugkit for task execution and session isolation
+allowed-tools: Skill
+---
+# GM — cc Platform
+AI-native software engineering orchestrated via skill chain: PLAN → EXECUTE → EMIT → VERIFY → UPDATE-DOCS.
+**Bootstrap pattern**: `bun x gm-plugkit@latest --daemon` downloads the correct platform binary, verifies SHA256, and starts the spool watcher daemon. Call once at session start; idempotent on subsequent calls. All execution routes through the file-spool: write to `.gm/exec-spool/in/<lang>/<N>.<ext>` or `in/<verb>/<N>.txt`, poll `out/<N>.json` for results.
+**Session-ID threading (no session-start hook)**: At skill invoke time, generate or detect SESSION_ID (env var `SESSION_ID` or `uuid()`). Pass `sessionId: "<id>"` in every rs-exec RPC body (spawn, tail, watch, etc.) and every spool-written task body. All task-scoped cleanup (deleteTask, getTask, appendOutput, killSessionTasks) requires matching sessionId. Absence is forbidden — hard reject by rs-exec handler.
+**Spool dispatch surface**: Write to `.gm/exec-spool/in/<lang>/<N>.<ext>` (languages: nodejs, python, bash, typescript, go, rust, c, cpp, java, deno) or `in/<verb>/<N>.txt` (verbs: codesearch, recall, memorize, wait, sleep, status, close, browser, runner, etc.). Watcher executes and streams `out/<N>.out` (stdout) + `out/<N>.err` (stderr) line-by-line, then `out/<N>.json` metadata (exitCode, durationMs, timedOut, startedAt, endedAt) at completion.
+**End-to-end skill chaining (skills-based platforms)**: When gm SKILL.md includes `end-to-end: true`, adapter detects signal and parses stdout for trailing JSON: `{"nextSkill": "...", "context": {...}, "phase": "..."}`. If nextSkill is non-null, invoke `Skill(skill="gm:<nextSkill>")` with context dict, repeat until null. This auto-chains 5 invocations into 1 user invocation.
+Every task returns complete: taskId, exitCode, durationMs, timedOut, stdout, stderr. Background tasks return immediately with task_id; continue with `in/status/<N>.txt` (tail), `in/watch/<N>.txt` (watch), or `in/close/<N>.txt` (close).
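
Reviewer note: the file-spool contract in this hunk (write a task under `in/`, then poll `out/<N>.json`) might look roughly like the sketch below. The helper names `writeSpoolTask` and `readSpoolResult` are hypothetical, and the spool root is passed in rather than hard-coded to `.gm/exec-spool`. It also assumes the metadata file only appears at completion, as the hunk states.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// Hypothetical helper: write a task body to <spoolRoot>/in/<lane>/<n>.<ext>,
// where lane is a language (nodejs, python, ...) or a verb (recall, status, ...).
function writeSpoolTask(spoolRoot, lane, n, ext, body) {
  const dir = path.join(spoolRoot, 'in', lane);
  fs.mkdirSync(dir, { recursive: true });
  const file = path.join(dir, `${n}.${ext}`);
  fs.writeFileSync(file, body);
  return file;
}

// Hypothetical helper: return metadata plus captured output once out/<n>.json exists,
// or null while the task is still running (or the metadata is only partially written).
function readSpoolResult(spoolRoot, n) {
  const outDir = path.join(spoolRoot, 'out');
  const metaPath = path.join(outDir, `${n}.json`);
  if (!fs.existsSync(metaPath)) return null;
  let meta;
  try {
    meta = JSON.parse(fs.readFileSync(metaPath, 'utf8'));
  } catch {
    return null;
  }
  const readIfPresent = (name) => {
    const file = path.join(outDir, name);
    return fs.existsSync(file) ? fs.readFileSync(file, 'utf8') : '';
  };
  return { ...meta, stdout: readIfPresent(`${n}.out`), stderr: readIfPresent(`${n}.err`) };
}
```

A caller would invoke `readSpoolResult` on a timer until it returns a non-null value; the polling interval is left to the caller because the hunk does not specify one.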

package/skills/gm-codex/SKILL.md ADDED

@@ -0,0 +1,19 @@
+---
+name: gm-codex
+description: AI-native software engineering via skill-driven orchestration on codex; bootstraps plugkit for task execution and session isolation
+allowed-tools: Skill
+---
+# GM — codex Platform
+AI-native software engineering orchestrated via skill chain: PLAN → EXECUTE → EMIT → VERIFY → UPDATE-DOCS.
+**Bootstrap pattern**: `bun x gm-plugkit@latest --daemon` downloads the correct platform binary, verifies SHA256, and starts the spool watcher daemon. Call once at session start; idempotent on subsequent calls. All execution routes through the file-spool: write to `.gm/exec-spool/in/<lang>/<N>.<ext>` or `in/<verb>/<N>.txt`, poll `out/<N>.json` for results.
+**Session-ID threading (no session-start hook)**: At skill invoke time, generate or detect SESSION_ID (env var `SESSION_ID` or `uuid()`). Pass `sessionId: "<id>"` in every rs-exec RPC body (spawn, tail, watch, etc.) and every spool-written task body. All task-scoped cleanup (deleteTask, getTask, appendOutput, killSessionTasks) requires matching sessionId. Absence is forbidden — hard reject by rs-exec handler.
+**Spool dispatch surface**: Write to `.gm/exec-spool/in/<lang>/<N>.<ext>` (languages: nodejs, python, bash, typescript, go, rust, c, cpp, java, deno) or `in/<verb>/<N>.txt` (verbs: codesearch, recall, memorize, wait, sleep, status, close, browser, runner, etc.). Watcher executes and streams `out/<N>.out` (stdout) + `out/<N>.err` (stderr) line-by-line, then `out/<N>.json` metadata (exitCode, durationMs, timedOut, startedAt, endedAt) at completion.
+**End-to-end skill chaining (skills-based platforms)**: When gm SKILL.md includes `end-to-end: true`, adapter detects signal and parses stdout for trailing JSON: `{"nextSkill": "...", "context": {...}, "phase": "..."}`. If nextSkill is non-null, invoke `Skill(skill="gm:<nextSkill>")` with context dict, repeat until null. This auto-chains 5 invocations into 1 user invocation.
+Every task returns complete: taskId, exitCode, durationMs, timedOut, stdout, stderr. Background tasks return immediately with task_id; continue with `in/status/<N>.txt` (tail), `in/watch/<N>.txt` (watch), or `in/close/<N>.txt` (close).
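
Reviewer note: the session-ID threading rule in this hunk (use `SESSION_ID` from the environment or fall back to a UUID, then attach `sessionId` to every request body) could be sketched as follows. The function names are hypothetical and the request body shape is invented for illustration; only the `sessionId` field name and the env-or-uuid fallback come from the text.

```javascript
const { randomUUID } = require('node:crypto');

// Hypothetical helper: prefer SESSION_ID from the environment, else generate a UUID.
function resolveSessionId(env = process.env) {
  const fromEnv = env.SESSION_ID;
  return typeof fromEnv === 'string' && fromEnv.trim() !== '' ? fromEnv : randomUUID();
}

// Hypothetical helper: attach sessionId to a request or task body.
// Throwing on a missing id mirrors the "absence is forbidden" rule on the client side.
function withSession(body, sessionId) {
  if (typeof sessionId !== 'string' || sessionId === '') {
    throw new Error('sessionId is required on every task-scoped request');
  }
  return { ...body, sessionId };
}
```

Resolving the id once per skill invocation and reusing it keeps cleanup scoped to the same session; calling `resolveSessionId` repeatedly without `SESSION_ID` set would produce a different UUID each time.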

package/skills/gm-complete/SKILL.md CHANGED

@@ -3,121 +3,104 @@ name: gm-complete
 description: VERIFY and COMPLETE phase. End-to-end system verification and git enforcement. Any new unknown triggers immediate snake back to planning — restart chain.
 ---
-# GM COMPLETE — Verify and Complete
+# GM COMPLETE — Verify, then close
-GRAPH: `PLAN → EXECUTE → EMIT → [VERIFY] → UPDATE-DOCS → COMPLETE`
-Entry: all EMIT gates passed. From `gm-emit`.
+Entry: EMIT gates clear, from `gm-emit`. Exit: `.prd` deleted + test.js green + pushed + CI green → `update-docs`.
-## TRANSITIONS
+Cross-cutting dispositions live in `gm` SKILL.md.
-**EXIT → EXECUTE**: .prd items remain → invoke `gm-execute` immediately.
-**EXIT → COMPLETE**: .prd deleted + test.js passes + pushed + CI green → invoke `update-docs`.
-**REGRESS → EMIT**: broken file output.
-**REGRESS → EXECUTE**: logic wrong.
-**REGRESS → PLAN**: new unknown or wrong requirements.
+## Transitions
-Failure triage: broken output → EMIT | wrong logic → EXECUTE | new unknown → PLAN. Never patch around surprises.
+- `.prd` items remain → `gm-execute`
+- `.prd` empty AND test.js green AND pushed AND CI green → `update-docs`
+- Broken file output → `gm-emit`
+- Wrong logic → `gm-execute`
+- New unknown or wrong requirements → `planning`
-## MUTABLES — ALL MUST RESOLVE BEFORE COMPLETE
+Failure triage: broken output to EMIT, wrong logic to EXECUTE, new unknown to PLAN. Never patch around surprises.
+## Mutables that must resolve before COMPLETE
 - `witnessed_e2e` — real end-to-end run with witnessed output
-- `browser_validated` — MANDATORY for any change touching client/UI/browser-facing code (anything served to a browser, rendered, or whose output is visible to a user). Must invoke `browser` skill, navigate the live page, and witness the change in `window` / DOM / scene state. test.js + node-side imports DO NOT satisfy this gate. See BROWSER VALIDATION GATE below.
+- `browser_validated` — for any change touching client / UI / browser-facing code, see gate below. test.js + node-side imports DO NOT satisfy this gate.
 - `git_clean` — `git status --porcelain` returns empty
 - `git_pushed` — `git log origin/main..HEAD --oneline` returns empty
-- `ci_passed` — all GitHub Actions runs reach `conclusion: success`
-- `prd_empty` — `.gm/prd.yml` deleted (file must not exist)
-- `stress_suite_clear` — change walked through all applicable governance stress cases (M1-D1), none flunk
-- `hidden_decision_posture` — advances open→down_weighted→closed only when CI green + stress suite clear
+- `ci_passed` — every GitHub Actions run reaches `conclusion: success`
+- `mutables_resolved` — `.gm/mutables.yml` deleted OR every entry `status: witnessed`. Stop hook hard-blocks turn-stop while any entry is `status: unknown`.
+- `prd_empty` — `.gm/prd.yml` deleted AFTER residual scan: enumerate every in-spirit reachable residual surfaced this session; any hit re-enters `planning`, appends PRD items, executes. Empty PRD is necessary, not sufficient — done = empty PRD AND zero reachable in-spirit residuals. Out-of-spirit-or-unreachable residuals are named in the response and skipped; everything else is this turn's work.
+- `stress_suite_clear` — change walked through M1–D1 (governance), none flunked
+- `hidden_decision_posture` — open → down_weighted → closed only when CI is green AND stress suite is clear
-## END-TO-END VERIFICATION
+## End-to-end verification
-Run real system, real data, witness actual output. NOT verification: docs updates, saying done, screenshots alone.
+Real system, real data, witness actual output. Doc updates, "saying done", and screenshots alone are not verification. Write the e2e probe to the spool (`.gm/exec-spool/in/nodejs/<N>.js`):
 ```
-exec:nodejs
 const { fn } = await import('/abs/path/to/module.js');
 console.log(await fn(realInput));
 ```
-Browser/UI: invoke `browser` skill. After every success: enumerate what remains — never stop at first green.
+After every success, enumerate what remains — never stop at first green.
+## Browser validation gate
+Required when this session changed any code that runs in a browser: anything under `client/`, UI components, shaders, page-loaded JS, served HTML, gh-pages assets, dev-server endpoints, or any module imported into the page bundle.
-## BROWSER VALIDATION GATE — MANDATORY FOR CLIENT WORK
+Trigger detection (any one): `git diff --name-only origin/main..HEAD` includes paths under `client/`, `apps/*/index.js` with client export, `docs/`, `*.html`, shader files, or any file imported by a browser entry; new/changed export consumed by `window.*` or rendered in DOM/canvas/WebGL; visual, layout, animation, input, network-on-page, or shader behavior altered.
-If this session changed any code that runs in a browser — anything under client/, UI components, shaders, page-loaded JS, served HTML, gh-pages assets, dev-server endpoints, or any module imported into the page bundle — `browser_validated` MUST resolve before COMPLETE. Skipping it because "node tests pass" or "test.js is green" is a forced-closure refusal of witnessed verification.
+Protocol: boot the real server (or open the static page) on a known URL — witness HTTP 200. `exec:browser` → `page.goto(url)` → wait for app init by polling for the global the change affects (`window.__app.<system>`). Probe via `page.evaluate(() => …)` asserting the specific invariant the change was supposed to establish — instance counts, scene meshes, DOM nodes, render stats, network frames. Capture witnessed numbers in the response — "looks fine" is not a witness. Failures route to `gm-execute` (logic) or `gm-emit` (output) — never paper over.
-Trigger detection (any one suffices):
-- `git diff --name-only origin/main..HEAD` includes paths under `client/`, `apps/*/index.js` with client export, `docs/`, `*.html`, shader files, or any file imported by a browser entry.
-- New/changed export consumed by `window.*` or rendered in DOM/canvas/WebGL.
-- Visual, layout, animation, input, network-on-page, or shader behavior altered.
+Long-running probes split into navigate-call → `exec:wait N` → probe-call to stay under the per-call budget. Do not stack multi-second `setTimeout` inside one `exec:browser` invocation.
-Required protocol:
-1. Boot the real server (or open the static page) on a known URL — witness HTTP 200.
-2. `exec:browser` → `page.goto(url)` → wait for app init (poll for the global the change affects, e.g. `window.__app.<system>`).
-3. Probe via `page.evaluate(() => …)` — assert the specific invariant the change was supposed to establish (instance counts, scene meshes, DOM nodes, render stats, network frames, etc.).
-4. Capture the witnessed numbers in the response. "Looks fine" is not a witness.
-5. Failures → regress to `gm-execute` (logic) or `gm-emit` (output) — never paper over.
+Exempt only when: change is server-only with zero browser-facing surface, OR the repository has no browser surface at all (pure CLI / library). Exemption requires explicit tag in the response: `BROWSER EXEMPT: <reason — must reference diff paths showing zero browser-facing surface>`. Default posture is NOT exempt — burden is on the agent to prove exemption with diff evidence.
-Long-running probes: split into navigate-call → `exec:wait N` → probe-call to stay under the per-call budget. Do not stack multi-second `setTimeout` inside one `exec:browser` invocation.
+Pre-flight: run `git diff --name-only origin/main..HEAD` directly via Bash, then dispatch a nodejs spool file that reads the diff list and filters lines matching `client/|docs/|\.html$|\.glsl$|\.frag$|\.vert$`. Any hit AND no `exec:browser` block in this session → mandatory regression to `gm-execute`.
-Exempt only when: change is server-only with zero browser-facing surface, OR repository has no browser surface at all (pure CLI/library). Tag the exemption in the response with the reason; do not silently skip.
+## Integration test gate
-## INTEGRATION TEST GATE
+Write to `.gm/exec-spool/in/nodejs/<N>.js`:
 ```
-exec:nodejs
 const { execSync } = require('child_process');
 try { execSync('node test.js', { stdio: 'inherit', timeout: 30000 }); console.log('PASS'); }
 catch (e) { console.error('FAIL'); process.exit(1); }
 ```
-Failure → regress to `gm-execute`. No test.js + testable surface → regress to `gm-execute` to create it.
+Failure → `gm-execute`. No test.js in a repo with testable surface → `gm-execute` to create it.
-## GIT ENFORCEMENT
+## Git enforcement
+Run directly via Bash:
 ```
-exec:bash
 git status --porcelain
 git log origin/main..HEAD --oneline
 ```
-Both must return empty. Local commit without push ≠ complete.
+Both must return empty. Local commit without push is not complete.
-## CI — AUTOMATED
+## CI is automated
-Stop hook watches all GitHub Actions runs for the pushed HEAD. Do not call `gh run list` manually.
-- All-green → Stop approves with CI summary in next turn context
-- Failure → Stop blocks with run names+IDs → investigate with `gh run view <id> --log-failed`, fix, push, hook re-watches
-- Deadline 180s (override `GM_CI_WATCH_SECS`) → slow jobs get "still in progress" approve
+The Stop hook watches Actions for the pushed HEAD. Do not call `gh run list` manually. All-green → Stop approves with CI summary in next-turn context. Failure → Stop blocks with run names + IDs; investigate via `gh run view <id> --log-failed`, fix, push, hook re-watches. Deadline 180s (override `GM_CI_WATCH_SECS`); slow jobs get a "still in progress" approve.
-## HYGIENE SWEEP
+## Hygiene sweep
-Before declaring complete:
 1. Files >200 lines → split
 2. Comments in code → remove
-3. Scattered test files (.test.js, .spec.js, __tests__/, fixtures/, mocks/) → delete, consolidate into root test.js
-4. Mock/stub/simulation files → delete
-5. Unnecessary doc files (not CHANGELOG/CLAUDE/README/TODO.md) → delete
-6. Duplicate concern → snake to `planning` with restructuring instructions
+3. Scattered test files (`.test.js`, `.spec.js`, `__tests__/`, `fixtures/`, `mocks/`) → delete, consolidate into root `test.js`
+4. Mock / stub / simulation files → delete
+5. Unnecessary doc files (not CHANGELOG, CLAUDE, README, TODO.md) → delete
+6. Duplicate concern → regress to `planning` with restructuring instructions
 7. Hardcoded values → derive from ground truth
-8. Fallback/demo modes → remove, fail loud
-9. TODO.md → empty/deleted
-10. CHANGELOG.md → has entries for this session
+8. Fallback / demo modes → remove, fail loud
+9. TODO.md → empty or deleted
+10. CHANGELOG.md → entries for this session
 11. Observability gaps → server subsystems expose `/debug/<subsystem>`; client modules register in `window.__debug`
-12. Memorize → every fact from verification handed off via background Agent(memorize) at moment of resolution
-13. Deploy/publish → if deployable, deploy; if npm package, publish
+12. Memorize → every fact from verification handed off via background `Agent(memorize)` at moment of resolution
+13. Deploy / publish → if deployable, deploy; if npm package, publish
 14. GitHub Pages → check `.github/workflows/pages.yml` + `docs/index.html` exist; invoke `pages` skill if absent
-15. Governance stress-suite → walk change through M1,F1,C1,H1,S1,B1,A1,D1; any flunk = regress to owning phase
-## MEMORIZE
-```
-Agent(subagent_type='gm:memorize', model='haiku', run_in_background=true, prompt='## CONTEXT TO MEMORIZE\n<fact>')
-```
-One per fact, parallel, same turn resolved. End-of-turn self-check mandatory.
-## COMPLETION DEFINITION
+15. Governance stress-suite → walk change through M1, F1, C1, H1, S1, B1, A1, D1; any flunk regresses to the owning phase
-All: witnessed e2e | browser_validated (when client work touched) | failure paths exercised | test.js passes | .prd deleted | git clean+pushed | CI green | hygiene sweep clean | TODO.md gone | CHANGELOG.md updated
+## Completion
-**Never**: claim done without witnessed output | claim done on a client change without browser-validation witness | stop while .prd has items | skip hygiene | skip test.js | uncommitted/unpushed work | stop at first green
+All true at once: witnessed e2e | browser_validated when client work touched | failure paths exercised | test.js passes | `.prd` deleted | git clean and pushed | CI green | hygiene sweep clean | TODO.md gone | CHANGELOG.md updated.
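
Reviewer note: the pre-flight step in the browser validation gate filters `git diff --name-only origin/main..HEAD` output against the pattern `client/|docs/|\.html$|\.glsl$|\.frag$|\.vert$`. A minimal sketch of that filter is below. The function names and the `browserProbeRan` flag are hypothetical; the sketch takes the diff output as a string instead of running git itself.

```javascript
// Pattern copied from the pre-flight rule in the hunk above (slashes escaped for a JS literal).
const BROWSER_SURFACE = /client\/|docs\/|\.html$|\.glsl$|\.frag$|\.vert$/;

// Hypothetical helper: keep only the diff paths that match the browser-surface pattern.
function browserFacingPaths(diffOutput) {
  return diffOutput
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line !== '' && BROWSER_SURFACE.test(line));
}

// Hypothetical helper: any hit without a browser probe in this session means regress to gm-execute.
function preflight(diffOutput, browserProbeRan) {
  const hits = browserFacingPaths(diffOutput);
  return { hits, mustRegress: hits.length > 0 && !browserProbeRan };
}
```

Note that the pattern as written matches `client/` and `docs/` anywhere in a path, not only at the repository root, so a file such as `packages/x/docs/notes.md` would also count as a hit.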

package/skills/gm-copilot-cli/SKILL.md ADDED

@@ -0,0 +1,19 @@
+---
+name: gm-copilot-cli
+description: AI-native software engineering via skill-driven orchestration on copilot-cli; bootstraps plugkit for task execution and session isolation
+allowed-tools: Skill
+---
+# GM — copilot-cli Platform
+AI-native software engineering orchestrated via skill chain: PLAN → EXECUTE → EMIT → VERIFY → UPDATE-DOCS.
+**Bootstrap pattern**: `bun x gm-plugkit@latest --daemon` downloads the correct platform binary, verifies SHA256, and starts the spool watcher daemon. Call once at session start; idempotent on subsequent calls. All execution routes through the file-spool: write to `.gm/exec-spool/in/<lang>/<N>.<ext>` or `in/<verb>/<N>.txt`, poll `out/<N>.json` for results.
+**Session-ID threading (no session-start hook)**: At skill invoke time, generate or detect SESSION_ID (env var `SESSION_ID` or `uuid()`). Pass `sessionId: "<id>"` in every rs-exec RPC body (spawn, tail, watch, etc.) and every spool-written task body. All task-scoped cleanup (deleteTask, getTask, appendOutput, killSessionTasks) requires matching sessionId. Absence is forbidden — hard reject by rs-exec handler.
+**Spool dispatch surface**: Write to `.gm/exec-spool/in/<lang>/<N>.<ext>` (languages: nodejs, python, bash, typescript, go, rust, c, cpp, java, deno) or `in/<verb>/<N>.txt` (verbs: codesearch, recall, memorize, wait, sleep, status, close, browser, runner, etc.). Watcher executes and streams `out/<N>.out` (stdout) + `out/<N>.err` (stderr) line-by-line, then `out/<N>.json` metadata (exitCode, durationMs, timedOut, startedAt, endedAt) at completion.
+**End-to-end skill chaining (skills-based platforms)**: When gm SKILL.md includes `end-to-end: true`, adapter detects signal and parses stdout for trailing JSON: `{"nextSkill": "...", "context": {...}, "phase": "..."}`. If nextSkill is non-null, invoke `Skill(skill="gm:<nextSkill>")` with context dict, repeat until null. This auto-chains 5 invocations into 1 user invocation.
+Every task returns complete: taskId, exitCode, durationMs, timedOut, stdout, stderr. Background tasks return immediately with task_id; continue with `in/status/<N>.txt` (tail), `in/watch/<N>.txt` (watch), or `in/close/<N>.txt` (close).

package/skills/gm-cursor/SKILL.md ADDED

@@ -0,0 +1,19 @@
+---
+name: gm-cursor
+description: AI-native software engineering via skill-driven orchestration on cursor; bootstraps plugkit for task execution and session isolation
+allowed-tools: Skill
+---
+# GM — cursor Platform
+AI-native software engineering orchestrated via skill chain: PLAN → EXECUTE → EMIT → VERIFY → UPDATE-DOCS.
+**Bootstrap pattern**: `bun x gm-plugkit@latest --daemon` downloads the correct platform binary, verifies SHA256, and starts the spool watcher daemon. Call once at session start; idempotent on subsequent calls. All execution routes through the file-spool: write to `.gm/exec-spool/in/<lang>/<N>.<ext>` or `in/<verb>/<N>.txt`, poll `out/<N>.json` for results.
+**Session-ID threading (no session-start hook)**: At skill invoke time, generate or detect SESSION_ID (env var `SESSION_ID` or `uuid()`). Pass `sessionId: "<id>"` in every rs-exec RPC body (spawn, tail, watch, etc.) and every spool-written task body. All task-scoped cleanup (deleteTask, getTask, appendOutput, killSessionTasks) requires matching sessionId. Absence is forbidden — hard reject by rs-exec handler.
+**Spool dispatch surface**: Write to `.gm/exec-spool/in/<lang>/<N>.<ext>` (languages: nodejs, python, bash, typescript, go, rust, c, cpp, java, deno) or `in/<verb>/<N>.txt` (verbs: codesearch, recall, memorize, wait, sleep, status, close, browser, runner, etc.). Watcher executes and streams `out/<N>.out` (stdout) + `out/<N>.err` (stderr) line-by-line, then `out/<N>.json` metadata (exitCode, durationMs, timedOut, startedAt, endedAt) at completion.
+**End-to-end skill chaining (skills-based platforms)**: When gm SKILL.md includes `end-to-end: true`, adapter detects signal and parses stdout for trailing JSON: `{"nextSkill": "...", "context": {...}, "phase": "..."}`. If nextSkill is non-null, invoke `Skill(skill="gm:<nextSkill>")` with context dict, repeat until null. This auto-chains 5 invocations into 1 user invocation.
+Every task returns complete: taskId, exitCode, durationMs, timedOut, stdout, stderr. Background tasks return immediately with task_id; continue with `in/status/<N>.txt` (tail), `in/watch/<N>.txt` (watch), or `in/close/<N>.txt` (close).
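The end-to-end chaining rule above amounts to a loop: read the trailing JSON, invoke the named skill, and repeat until `nextSkill` is null. The following is a sketch only, under three assumptions not stated in the source: the trailing JSON occupies the final stdout line, `invokeSkill` is a synchronous stand-in for `Skill(skill="gm:<nextSkill>")`, and the `maxSteps` guard is added purely for illustration.

```javascript
// Hypothetical parser: returns the JSON object on the last stdout line, or null
// when the last line is not JSON. Malformed JSON throws rather than being swallowed.
function parseTrailingJson(stdout) {
  const lines = stdout.trimEnd().split('\n');
  const last = lines[lines.length - 1].trim();
  if (!last.startsWith('{')) return null;
  return JSON.parse(last);
}

// Hypothetical chain runner: follows nextSkill until it is null or missing.
// invokeSkill(name, context) must return the stdout of the invoked skill.
function runChain(firstStdout, invokeSkill, maxSteps = 10) {
  const visited = [];
  let signal = parseTrailingJson(firstStdout);
  while (signal && signal.nextSkill != null) {
    if (visited.length >= maxSteps) {
      throw new Error(`chain exceeded ${maxSteps} steps`);
    }
    visited.push(signal.nextSkill);
    const stdout = invokeSkill(`gm:${signal.nextSkill}`, signal.context);
    signal = parseTrailingJson(stdout);
  }
  return visited;
}
```

Letting `JSON.parse` throw on a malformed trailer keeps a broken chain visible instead of silently ending it.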

package/skills/gm-emit/SKILL.md CHANGED

@@ -3,85 +3,68 @@ name: gm-emit
 description: EMIT phase. Pre-emit debug, write files, post-emit verify from disk. Any new unknown triggers immediate snake back to planning — restart chain.
 ---
-# GM EMIT — Write and Verify
+# GM EMIT — Write and verify from disk
-GRAPH: `PLAN → EXECUTE → [EMIT] → VERIFY → COMPLETE`
-Entry: all mutables KNOWN. From `gm-execute` or re-entered from VERIFY.
+Entry: every mutable KNOWN, from `gm-execute` or re-entered from VERIFY. Exit: gates clear → `gm-complete`.
-## TRANSITIONS
+Cross-cutting dispositions live in `gm` SKILL.md.
-**EXIT → VERIFY**: all gate conditions true → invoke `gm-complete` immediately.
-**SELF-LOOP**: post-emit variance with known cause → fix, re-verify, stay in EMIT.
-**REGRESS → EXECUTE**: pre-emit reveals known logic error.
-**REGRESS → PLAN**: pre-emit reveals new unknown | post-emit variance with unknown cause | scope changed.
+## Transitions
-## LEGITIMACY GATE (before pre-emit run)
+- All gates clear → `gm-complete`
+- Post-emit variance with known cause → fix in-band, re-verify, stay in EMIT
+- Pre-emit reveals known logic error → `gm-execute`
+- Pre-emit reveals new unknown OR post-emit variance with unknown cause OR scope changed → `planning`
-For every claim landing in a file:
-1. **Earned specificity** — traces to `authorization=witnessed`, not inflated from weak prior?
-2. **Repair legality** — local patch dressed as structural repair? Downgrade scope or snake to PLAN.
-3. **Lawful downgrade** — can a weaker, true statement replace it? PREFER the downgrade.
-4. **Alternative-route suppression** — live competing route being silenced? Preserve it.
-5. **Strongest objection** — if a reviewer pushed back on this change, what would the sharpest argument be? Articulate it. Cannot articulate = have not understood the alternatives = regress to `gm-execute`.
+## Legitimacy gate (before pre-emit run)
-Fail any → regress to `gm-execute` to witness what was missing, or `planning` if gap is structural.
+For every claim landing in a file, answer five questions:
-## PRE-EMIT RUN (mandatory before writing any file)
+1. Earned specificity — does it trace to `authorization=witnessed`, or is it inflated from a weak prior?
+2. Repair legality — is a local patch dressed as structural repair? Downgrade scope or regress to PLAN.
+3. Lawful downgrade — can a weaker, true statement replace it? Prefer the downgrade.
+4. Alternative-route suppression — is a live competing route being silenced? Preserve it.
+5. Strongest objection — what would the sharpest reviewer pushback be? Articulate it. Cannot articulate = have not understood the alternatives → `gm-execute`.
+Any failure regresses to `gm-execute` to witness what was missing, or `planning` if the gap is structural.
+## Pre-emit run
+Mandatory before writing any file. Write the probe to the spool (`.gm/exec-spool/in/nodejs/<N>.js`):
 ```
-exec:nodejs
 const { fn } = await import('/abs/path/to/module.js');
 console.log(await fn(realInput));
 ```
-1. Import actual module from disk — witness current behavior as baseline
-2. Run proposed logic in isolation WITHOUT writing — witness with real inputs
-3. Probe failure paths with real error inputs
-4. Compare: matches expected → write. Unexpected → new unknown → `planning`.
-## WRITING FILES
+Import the actual module from disk to witness current behavior as the baseline. Run the proposed logic in isolation without writing — witness with real inputs and with real error inputs. Match expected → write. Unexpected → new unknown → `planning`.
-`exec:nodejs` with `require('fs')`. Write only when every gate mutable resolved simultaneously.
+## Writing
-## POST-EMIT VERIFICATION (immediately after writing)
+Use the Write tool, or a nodejs spool file with `require('fs')`. Write only when every gate mutable resolves simultaneously.
-1. Re-import from disk (not in-memory — stale is inadmissible)
-2. Run identical inputs as pre-emit — must match pre-emit baseline exactly
-3. Known variance → fix immediately, re-verify (EMIT self-loop)
-4. Unknown variance → new unknown → invoke `planning`
-## GATE CONDITIONS (all true simultaneously)
-- Legitimacy gate passed; none of five refused collapses
-- Pre-emit passed with real inputs + error inputs
-- Post-emit matches pre-emit exactly
-- Hot reloadable; errors throw with context (no fallbacks, `|| default`, `catch { return null }`)
-- No mocks/fakes/stubs/scattered test files (delete on discovery)
-- Behavior change in this emit = a corresponding assertion in test.js (a change no test would catch is a change you cannot prove)
-- If this emit changes any browser-facing code (client/, served HTML/JS, shaders, page-bundle imports, gh-pages assets), the post-emit verify MUST include a live browser witness via `exec:browser` (boot server → page.goto → page.evaluate asserting the invariant the change established). Node-side import + test.js does NOT satisfy this — see `gm-complete` BROWSER VALIDATION GATE.
-- Files ≤200 lines
-- No duplicate concern (run exec:codesearch for primary concern after writing; any overlap → `planning`)
-- No comments; no hardcoded values; no adjectives in identifiers; no unnecessary files
-- Observability: new server subsystems expose `/debug/<subsystem>`; new client modules in `window.__debug`
-- Structure: no if/else where dispatch table suffices; no one-liners that require decoding; no reinvented APIs
-- All facts resolved this phase memorized via background Agent(memorize)
-- CHANGELOG.md updated; TODO.md cleared/deleted
+## Post-emit verification
-## CODE EXECUTION
+Re-import from disk — in-memory state is stale and inadmissible. Run identical inputs as pre-emit; output must match the baseline exactly. Known variance → fix and re-verify (self-loop). Unknown variance → `planning`.
-`exec:<lang>` only. File writes via exec:nodejs + require('fs'). Never Bash(node/npm/npx/bun).
-Pack runs: Promise.allSettled, each idea own try/catch, under 12s per call.
+## Mutables gate
-## CODEBASE SEARCH
+Before pre-emit run, read `.gm/mutables.yml`. Any entry with `status: unknown` → regress to `gm-execute`. The pre-tool-use hook hard-blocks Write/Edit/NotebookEdit while unresolved entries exist; trying to emit anyway returns deny. Zero unresolved is the precondition for every legitimacy-gate question.
-`exec:codesearch` only. Grep/Glob/Find/Explore = hook-blocked. Known path → `Read`.
+## Gate (all true at once)
-## MEMORIZE
-```
-Agent(subagent_type='gm:memorize', model='haiku', run_in_background=true, prompt='## CONTEXT TO MEMORIZE\n<fact>')
-```
-Same turn as resolution. Parallel when multiple. End-of-turn self-check mandatory.
-**Never**: write before pre-emit | advance with post-emit variance | absorb surprises | respond to user mid-phase
+- `.gm/mutables.yml` empty/absent OR every entry `status: witnessed` with filled `witness_evidence`
+- Legitimacy gate passed; no refused collapse
+- Pre-emit passed with real inputs and real error inputs
+- Post-emit matches pre-emit exactly
+- Hot-reloadable; errors throw with context (no `|| default`, no `catch { return null }`, no fallbacks)
+- No mocks, fakes, stubs, or scattered test files (delete on discovery)
+- Any behavior change has a corresponding assertion in `test.js` — a change no test catches is a change you cannot prove
+- Browser-facing change → post-emit verify includes a live `exec:browser` witness (boot server → `page.goto` → `page.evaluate` asserting the invariant the change established). Node-side import + test.js does not satisfy this — the final gate runs again in `gm-complete`.
+- Files ≤ 200 lines
+- No duplicate concern (run `exec:codesearch` for the primary concern after writing; overlap → `planning`)
+- No comments, no hardcoded values, no adjectives in identifiers, no unnecessary files
+- Observability: new server subsystems expose `/debug/<subsystem>`; new client modules register in `window.__debug`
+- Structure: no if/else where dispatch suffices; no one-liners that obscure; no reinvented APIs
+- Every fact resolved this phase memorized via background `Agent(memorize)`
+- CHANGELOG.md updated; TODO.md cleared or deleted
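The first gate condition (mutables file empty or absent, or every entry witnessed with filled evidence) can be expressed as a filter. This is a minimal sketch, assuming `.gm/mutables.yml` has already been parsed into an array of objects; the `id` field and the `unresolvedMutables` name are hypothetical, and only `status` and `witness_evidence` come from the text.

```javascript
// Hypothetical check: returns the entries that still block the emit gate.
// Assumed entry shape after YAML parsing: { id, status, witness_evidence }.
// An absent or empty file is modeled as undefined, null, or [].
function unresolvedMutables(entries) {
  if (!entries) return [];
  return entries.filter((entry) => {
    const evidence = String(entry.witness_evidence ?? '').trim();
    // Blocked unless the entry is witnessed AND carries non-empty evidence.
    return entry.status !== 'witnessed' || evidence.length === 0;
  });
}

const blocking = unresolvedMutables([
  { id: 'a', status: 'witnessed', witness_evidence: 'out/3.json exitCode 0' },
  { id: 'b', status: 'unknown' },
]);
console.log(blocking.map((entry) => entry.id)); // [ 'b' ]
```

An empty result corresponds to the gate condition holding; any returned entry would mean regressing to `gm-execute` per the mutables gate.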