npm - gm-oc - Versions diffs - 2.0.333 → 2.0.334 - Mend

gm-oc 2.0.333 → 2.0.334

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (10) hide show

package/package.json +2 -1
package/skills/browser/SKILL.md +175 -0
package/skills/create-lang-plugin/SKILL.md +180 -0
package/skills/gm/SKILL.md +125 -0
package/skills/gm-complete/SKILL.md +107 -0
package/skills/gm-emit/SKILL.md +111 -0
package/skills/gm-execute/SKILL.md +126 -0
package/skills/planning/SKILL.md +93 -0
package/skills/ssh/SKILL.md +86 -0
package/skills/update-docs/SKILL.md +113 -0

package/package.json CHANGED Viewed

@@ -1,6 +1,6 @@
 {
   "name": "gm-oc",
-  "version": "2.0.333",
+  "version": "2.0.334",
   "description": "State machine agent with hooks, skills, and automated git enforcement",
   "author": "AnEntrypoint",
   "license": "MIT",
@@ -37,6 +37,7 @@
     "agents/",
     "hooks/",
     "scripts/",
+    "skills/",
     "gm.js",
     "gm-oc.mjs",
     "index.js",

package/skills/browser/SKILL.md ADDED Viewed

@@ -0,0 +1,175 @@
+---
+name: browser
+description: Browser automation via playwriter. Use when user needs to interact with websites, navigate pages, fill forms, click buttons, take screenshots, extract data, test web apps, or automate any browser task.
+allowed-tools: Bash(browser:*), Bash(exec:browser*)
+---
+# Browser Automation with playwriter
+## Two Pathways
+**Session commands** — use `browser:` prefix via Bash for all browser control.
+Create a session first, then run commands against it. Use `--direct` for CDP mode (no extension needed — requires Chrome with remote debugging):
+```
+browser:
+playwriter session new --direct
+```
+Returns a numeric session ID (e.g. `1`). Use that ID for all subsequent commands.
+If `--direct` fails, the user needs Chrome running with debugging enabled:
+- Open `chrome://inspect/#remote-debugging` in Chrome, OR
+- Launch Chrome with `chrome --remote-debugging-port=9222`
+```
+browser:
+playwriter -s 1 -e 'await page.goto("https://example.com")'
+```
+```
+browser:
+playwriter -s 1 -e 'await snapshot({ page })'
+```
+```
+browser:
+playwriter -s 1 -e 'await screenshotWithAccessibilityLabels({ page })'
+```
+State persists across calls within a session:
+```
+browser:
+playwriter -s 1 -e 'state.x = 1'
+```
+```
+browser:
+playwriter -s 1 -e 'console.log(state.x)'
+```
+List active sessions:
+```
+browser:
+playwriter session list
+```
+**JS eval in browser** — use `exec:browser` via Bash when you need to run JavaScript in the page context directly.
+```
+exec:browser
+await page.goto('https://example.com')
+await snapshot({ page })
+```
+```
+exec:browser
+const title = await page.title()
+console.log(title)
+```
+Always use single quotes for the `-e` argument to avoid shell quoting issues.
+## Core Workflow
+Every browser automation follows this pattern:
+1. **Create session**: `playwriter session new` (note the returned ID)
+2. **Navigate**: `playwriter -s <id> -e 'await page.goto("https://example.com")'`
+3. **Snapshot**: `playwriter -s <id> -e 'await snapshot({ page })'`
+4. **Interact**: click, fill, type via JS expressions
+5. **Re-snapshot**: after navigation or DOM changes
+```
+browser:
+playwriter session new
+playwriter -s 1 -e 'await page.goto("https://example.com/form")'
+playwriter -s 1 -e 'await snapshot({ page })'
+playwriter -s 1 -e 'await page.fill("[name=email]", "user@example.com")'
+playwriter -s 1 -e 'await page.click("[type=submit]")'
+playwriter -s 1 -e 'await page.waitForLoadState("networkidle")'
+playwriter -s 1 -e 'await snapshot({ page })'
+```
+## Common Patterns
+### Navigation and Snapshot
+```
+browser:
+playwriter session new
+playwriter -s 1 -e 'await page.goto("https://example.com")'
+playwriter -s 1 -e 'await snapshot({ page })'
+```
+### Screenshot with Accessibility Labels
+```
+browser:
+playwriter -s 1 -e 'await screenshotWithAccessibilityLabels({ page })'
+```
+### Data Extraction
+```
+exec:browser
+await page.goto('https://example.com/products')
+const items = await page.$$eval('.product-title', els => els.map(e => e.textContent))
+console.log(JSON.stringify(items))
+```
+### Persistent State Across Steps
+```
+browser:
+playwriter -s 1 -e 'state.loginDone = false'
+playwriter -s 1 -e 'await page.goto("https://app.example.com/login")'
+playwriter -s 1 -e 'await page.fill("[name=user]", "admin")'
+playwriter -s 1 -e 'await page.fill("[name=pass]", "secret")'
+playwriter -s 1 -e 'await page.click("[type=submit]")'
+playwriter -s 1 -e 'state.loginDone = true'
+```
+### Multiple Sessions
+```
+browser:
+playwriter session new
+playwriter session new
+playwriter -s 1 -e 'await page.goto("https://site-a.com")'
+playwriter -s 2 -e 'await page.goto("https://site-b.com")'
+playwriter session list
+```
+## JavaScript Evaluation (exec pathway)
+Use `exec:browser` via Bash when you need direct page access. The body is plain JavaScript executed in the browser context.
+```
+exec:browser
+await page.goto('https://example.com')
+await snapshot({ page })
+```
+```
+exec:browser
+const links = await page.$$eval('a', els => els.map(e => e.href))
+console.log(JSON.stringify(links))
+```
+Never add shell quoting or escaping to the exec body — write plain JavaScript directly.
+## Key Patterns for Agents
+**Which pathway to use**:
+- Multi-step session workflows → `browser:` prefix with `playwriter -s <id> -e '...'`
+- Quick JS eval or data extraction → `exec:browser` with plain JS body
+**Always use single quotes** for the `-e` argument to playwriter to avoid shell interpretation.
+**Session IDs are numeric**: `playwriter session new` returns `1`, `2`, etc. Use the exact returned value.
+**Snapshot before interacting**: always call `await snapshot({ page })` to understand current page state before clicking or filling.

package/skills/create-lang-plugin/SKILL.md ADDED Viewed

@@ -0,0 +1,180 @@
+---
+name: create-lang-plugin
+description: Create a lang/ plugin that wires any CLI tool or language runtime into gm-cc — adds exec:<id> dispatch, optional LSP diagnostics, and optional prompt context injection. Zero hook configuration required.
+---
+# CREATE LANG PLUGIN
+A lang plugin is a single CommonJS file at `<projectDir>/lang/<id>.js`. gm-cc's hooks auto-discover it — no hook editing, no settings changes. The plugin gets three integration points: **exec dispatch**, **LSP diagnostics**, and **context injection**.
+## PLUGIN SHAPE
+```js
+'use strict';
+module.exports = {
+  id: 'mytool',                          // must match filename: lang/mytool.js
+  exec: {
+    match: /^exec:mytool/,               // regex tested against full "exec:mytool\n<code>" string
+    run(code, cwd) {                     // returns string or Promise<string>
+      // ...
+    }
+  },
+  lsp: {                                 // optional — synchronous only
+    check(fileContent, cwd) {            // returns Diagnostic[] synchronously
+      // ...
+    }
+  },
+  extensions: ['.ext'],                  // optional — file extensions lsp.check applies to
+  context: `=== mytool ===\n...`        // optional — string or () => string
+};
+```
+```ts
+type Diagnostic = { line: number; col: number; severity: 'error'|'warning'; message: string };
+```
+## HOW IT WORKS
+- **`exec.run`** is called in a child process (30s timeout) when Claude writes `exec:mytool\n<code>`. Output is returned as `exec:mytool output:\n\n<result>`. Async is fine here.
+- **`lsp.check`** is called synchronously in the hook process on each prompt submit — must NOT be async. Use `execFileSync` or `spawnSync`.
+- **`context`** is injected into every prompt's `additionalContext` (truncated to 2000 chars) and into the session-start context.
+- **`match`** regex is tested against the full command string `exec:mytool\n<code>` — keep it simple: `/^exec:mytool/`.
+## STEP 1 — IDENTIFY THE TOOL
+Answer these before writing any code:
+1. What is the tool's CLI name or npm package? (`gdlint`, `tsc`, `deno`, `ruff`, ...)
+2. How do you run a single expression/snippet? (`tool eval <expr>`, `tool -e <code>`, HTTP POST, ...)
+3. How do you run a file? (`tool run <file>`, `tool <file>`, ...)
+4. Does it have a lint/check mode? What does its output format look like?
+5. What file extensions does it apply to?
+6. Is the game/server running required, or does it work headlessly?
+## STEP 2 — IMPLEMENT exec.run
+Pattern for **HTTP eval** (tool has a running server):
+```js
+const http = require('http');
+function httpPost(port, urlPath, body) {
+  return new Promise((resolve, reject) => {
+    const data = JSON.stringify(body);
+    const req = http.request(
+      { hostname: '127.0.0.1', port, path: urlPath, method: 'POST',
+        headers: { 'Content-Type': 'application/json', 'Content-Length': Buffer.byteLength(data) } },
+      (res) => { let raw = ''; res.on('data', c => raw += c); res.on('end', () => { try { resolve(JSON.parse(raw)); } catch { resolve({ raw }); } }); }
+    );
+    req.setTimeout(8000, () => { req.destroy(); reject(new Error('timeout')); });
+    req.on('error', reject);
+    req.write(data); req.end();
+  });
+}
+```
+Pattern for **file-based execution** (write temp file, run headlessly):
+```js
+const fs = require('fs');
+const os = require('os');
+const path = require('path');
+const { execFileSync } = require('child_process');
+function runFile(code, cwd) {
+  const tmp = path.join(os.tmpdir(), `plugin_${Date.now()}.ext`);
+  fs.writeFileSync(tmp, code);
+  try {
+    return execFileSync('mytool', ['run', tmp], { cwd, encoding: 'utf8', timeout: 10000 });
+  } finally {
+    try { fs.unlinkSync(tmp); } catch (_) {}
+  }
+}
+```
+**Distinguish single expression vs multi-line** when both modes exist:
+```js
+function isSingleExpr(code) {
+  return !code.trim().includes('\n') && !/\b(func|def|fn |class|import)\b/.test(code);
+}
+```
+## STEP 3 — IMPLEMENT lsp.check (if applicable)
+Must be **synchronous**. Parse the tool's stderr/stdout for diagnostics:
+```js
+const { spawnSync } = require('child_process');
+const fs = require('fs');
+const os = require('os');
+const path = require('path');
+function check(fileContent, cwd) {
+  const tmp = path.join(os.tmpdir(), `lsp_${Math.random().toString(36).slice(2)}.ext`);
+  try {
+    fs.writeFileSync(tmp, fileContent);
+    const r = spawnSync('mytool', ['check', tmp], { encoding: 'utf8', cwd });
+    const output = r.stdout + r.stderr;
+    return output.split('\n').reduce((acc, line) => {
+      const m = line.match(/^.+:(\d+):(\d+):\s+(error|warning):\s+(.+)$/);
+      if (m) acc.push({ line: parseInt(m[1]), col: parseInt(m[2]), severity: m[3], message: m[4].trim() });
+      return acc;
+    }, []);
+  } catch (_) {
+    return [];
+  } finally {
+    try { fs.unlinkSync(tmp); } catch (_) {}
+  }
+}
+```
+Common output patterns to parse:
+- `file:line:col: error: message` → standard
+- `file:line: E001: message` → gdlint style (`E`=error, `W`=warning)
+- JSON output → `JSON.parse(r.stdout).errors.map(...)`
+## STEP 4 — WRITE context STRING
+Describe what `exec:<id>` does and when to use it. This appears in every prompt. Keep it under 300 chars:
+```js
+context: `=== mytool exec: support ===
+exec:mytool
+<expression or code block>
+Runs via <how>. Use for <when>.`
+```
+## STEP 5 — WRITE THE FILE
+File goes at `lang/<id>.js` in the project root. The `id` field must match the filename (without `.js`).
+Verify after writing:
+```
+exec:nodejs
+const p = require('/abs/path/to/lang/mytool.js');
+console.log(p.id, typeof p.exec.run, p.exec.match.toString());
+```
+Then test dispatch:
+```
+exec:mytool
+<a simple test expression>
+```
+If it returns `exec:mytool output:` → working. If it errors → fix `exec.run`.
+## CONSTRAINTS
+- `exec.run` may be async — it runs in a child process with a 30s timeout
+- `lsp.check` must be synchronous — no Promises, no async/await
+- Plugin must be CommonJS (`module.exports = { ... }`) — no ES module syntax
+- No persistent processes — `exec.run` must complete and exit cleanly
+- `id` must match the filename exactly
+- First match wins — if multiple plugins could match, make `match` specific
+## EXAMPLE — gdscript plugin (reference implementation)
+See `C:/dev/godot-kit/lang/gdscript.js` for a complete working example combining HTTP eval (single expressions via port 6009) with headless file execution fallback, synchronous gdlint LSP, and a context string.

package/skills/gm/SKILL.md ADDED Viewed

@@ -0,0 +1,125 @@
+---
+name: gm
+description: Immutable programming state machine. Root orchestrator. Invoke for all work coordination via the Skill tool.
+---
+# GM — Immutable Programming State Machine
+You think in state, not prose. You are the root orchestrator of all work in this system.
+**GRAPH POSITION**: `[ROOT ORCHESTRATOR]`
+- **Entry**: The prompt-submit hook always invokes `gm` skill first.
+- **Shared state**: .prd file (markdown format) on disk + witnessed execution output only. Nothing persists between skills. Delete .prd when empty — do not leave an empty file.
+- **First action**: Invoke `planning` skill immediately.
+## THE STATE MACHINE
+`PLAN → EXECUTE → EMIT → VERIFY → UPDATE-DOCS → COMPLETE`
+**FORWARD (ladders)**:
+- PLAN complete → invoke `gm-execute` skill
+- EXECUTE complete → invoke `gm-emit` skill
+- EMIT complete → invoke `gm-complete` skill
+- COMPLETE with .prd items remaining → invoke `gm-execute` skill (next wave)
+**BACKWARD (snakes) — any new unknown at any phase restarts from PLAN**:
+- New unknown discovered → invoke `planning` skill, restart chain
+- EXECUTE mutable unresolvable after 2 passes → invoke `planning` skill
+- EMIT logic wrong → invoke `gm-execute` skill
+- EMIT new unknown → invoke `planning` skill
+- VERIFY file broken → invoke `gm-emit` skill
+- VERIFY logic wrong → invoke `gm-execute` skill
+- VERIFY new unknown or wrong requirements → invoke `planning` skill
+**Runs until**: .prd empty AND git clean AND all pushes confirmed.
+## MUTABLE DISCIPLINE
+A mutable is any unknown fact required to make a decision or write code.
+- Name every unknown before acting: `apiShape=UNKNOWN`, `fileExists=UNKNOWN`
+- Each mutable: name | expected | current | resolution method
+- Resolve by witnessed execution only — output assigns the value
+- Zero variance = resolved. Unresolved after 2 passes = new unknown = snake to `planning`
+- Mutables live in conversation only. Never written to files.
+## CODE EXECUTION
+**exec:<lang> is the only way to run code.** Bash tool body: `exec:<lang>\n<code>`
+Languages: `exec:nodejs` (default) | `exec:bash` | `exec:python` | `exec:typescript` | `exec:go` | `exec:rust` | `exec:c` | `exec:cpp` | `exec:java` | `exec:deno` | `exec:cmd`
+- Lang auto-detected if omitted. `cwd` field sets working directory.
+- File I/O: `exec:nodejs` with `require('fs')`
+- Only `git` runs directly in Bash. `Bash(node/npm/npx/bun)` = violations.
+**Execution efficiency — pack every run:**
+- Combine multiple independent operations into one exec call using `Promise.allSettled` or parallel subprocess spawning
+- Each independent idea gets its own try/catch with independent error reporting — never let one failure block another
+- Target under 12s per exec call; split work across multiple calls only when dependencies require it
+- Prefer a single well-structured exec that does 5 things over 5 sequential execs
+**Background tasks** (auto-backgrounded after 15s):
+```
+exec:sleep
+<task_id> [seconds]
+```
+```
+exec:status
+<task_id>
+```
+```
+exec:close
+<task_id>
+```
+**Runner management**:
+```
+exec:runner
+start|stop|status
+```
+## CODEBASE EXPLORATION
+```
+exec:codesearch
+<natural language description>
+```
+Alias: `exec:search`. **Glob, Grep, Read, Explore, WebSearch are hook-blocked** — the pre-tool-use hook denies them. Use `exec:codesearch` exclusively for all codebase discovery.
+## BROWSER AUTOMATION
+Invoke `browser` skill. Escalation — exhaust each before advancing:
+1. `exec:browser\n<js>` — query DOM/state via JS
+2. `browser` skill — for full session workflows
+3. navigate/click/type — only when real events required
+4. screenshot — last resort only
+## SKILL REGISTRY
+**`planning`** — Mutable discovery and .prd construction. Invoke at start and on any new unknown.
+**`gm-execute`** — Resolve all mutables via witnessed execution.
+**`gm-emit`** — Write files to disk when all mutables resolved.
+**`gm-complete`** — End-to-end verification and git enforcement.
+**`update-docs`** — Refresh README, CLAUDE.md, and docs to reflect session changes. Invoked by `gm-complete`.
+**`browser`** — Browser automation. Invoke inside EXECUTE for all browser/UI work.
+## DO NOT STOP
+**You may not respond to the user or stop working while any of these are true:**
+- .prd file exists and has items
+- git has uncommitted changes
+- git has unpushed commits
+Completing a phase is NOT stopping. After every phase: read .prd, check git, invoke next skill. Only when .prd is deleted AND git is clean AND all commits are pushed may you return a final response to the user.
+## CONSTRAINTS
+**Tier 0**: no_crash, no_exit, ground_truth_only, real_execution
+**Tier 1**: max_file_lines=200, hot_reloadable, checkpoint_state
+**Tier 2**: no_duplication, no_hardcoded_values, modularity
+**Tier 3**: no_comments, convention_over_code
+**Never**: `Bash(node/npm/npx/bun)` | skip planning | sequential independent items | screenshot before JS exhausted | narrate past unresolved mutables | stop while .prd has items | ask the user what to do next while work remains
+**Always**: invoke named skill at every transition | snake to planning on any new unknown | witnessed execution only | keep going until .prd deleted and git clean

package/skills/gm-complete/SKILL.md ADDED Viewed

@@ -0,0 +1,107 @@
+---
+name: gm-complete
+description: VERIFY and COMPLETE phase. End-to-end system verification and git enforcement. Any new unknown triggers immediate snake back to planning — restart chain.
+---
+# GM COMPLETE — Verification and Completion
+You are in the **VERIFY → COMPLETE** phase. Files are written. Prove the whole system works end-to-end. Any new unknown = snake to `planning`, restart chain.
+**GRAPH POSITION**: `PLAN → EXECUTE → EMIT → [VERIFY] → UPDATE-DOCS → COMPLETE`
+- **Entry**: All EMIT gates passed. Entered from `gm-emit`.
+## TRANSITIONS
+**FORWARD**:
+- .prd items remain → invoke `gm-execute` skill (next wave)
+- .prd empty + feature work pushed → invoke `update-docs` skill
+**BACKWARD**:
+- Verification reveals broken file output → invoke `gm-emit` skill, fix, re-verify, return
+- Verification reveals logic error → invoke `gm-execute` skill, re-resolve, re-emit, return
+- Verification reveals new unknown → invoke `planning` skill, restart chain
+- Verification reveals requirements wrong → invoke `planning` skill, restart chain
+**TRIAGE on failure**: broken file output → snake to `gm-emit` | wrong logic → snake to `gm-execute` | new unknown or wrong requirements → snake to `planning`
+**RULE**: Any surprise = new unknown = snake to `planning`. Never patch around surprises.
+## MUTABLE DISCIPLINE
+- `witnessed_e2e=UNKNOWN` until real end-to-end run produces witnessed output
+- `git_clean=UNKNOWN` until `exec:bash\ngit status --porcelain` returns empty
+- `git_pushed=UNKNOWN` until `git log origin/main..HEAD --oneline` returns empty
+- `prd_empty=UNKNOWN` until .prd file is deleted (not just empty — file must not exist)
+All four must resolve to KNOWN before COMPLETE. Any UNKNOWN = absolute barrier.
+## END-TO-END VERIFICATION
+Run the real system with real data. Witness actual output.
+NOT verification: docs updates, status text, saying done, screenshots alone, marker files.
+```
+exec:nodejs
+const { fn } = await import('/abs/path/to/module.js');
+console.log(await fn(realInput));
+```
+For browser/UI: invoke `browser` skill with real workflows. Server + client features require both exec:nodejs AND browser. After every success: enumerate what remains — never stop at first green.
+## CODE EXECUTION
+**exec:<lang> is the only way to run code.** Bash tool body: `exec:<lang>\n<code>`
+`exec:nodejs` (default) | `exec:bash` | `exec:python` | `exec:typescript` | `exec:go` | `exec:rust` | `exec:java` | `exec:deno` | `exec:cmd`
+Only git in bash directly. Background tasks: `exec:sleep\n<id>`, `exec:status\n<id>`, `exec:close\n<id>`. Runner: `exec:runner\nstart|stop|status`.
+**Execution efficiency — pack every run:**
+- Combine multiple independent operations into one exec call using `Promise.allSettled` or parallel subprocess spawning
+- Each independent idea gets its own try/catch with independent error reporting — never let one failure block another
+- Target under 12s per exec call; split work across multiple calls only when dependencies require it
+- Prefer a single well-structured exec that does 5 things over 5 sequential execs
+## CODEBASE EXPLORATION
+```
+exec:codesearch
+<natural language description>
+```
+## GIT ENFORCEMENT
+```
+exec:bash
+git status --porcelain
+```
+Must return empty.
+```
+exec:bash
+git log origin/main..HEAD --oneline
+```
+Must return empty. If not: stage → commit → push → re-verify. Local commit without push ≠ complete.
+## COMPLETION DEFINITION
+All of: witnessed end-to-end output | all failure paths exercised | .prd empty | git clean and pushed | `user_steps_remaining=0`
+## DO NOT STOP
+After end-to-end verification passes: read .prd from disk. If any items remain, immediately invoke `gm-execute` skill — do not respond to the user. Only respond when .prd is deleted AND git is clean AND all commits are pushed.
+## CONSTRAINTS
+**Never**: claim done without witnessed output | uncommitted changes | unpushed commits | .prd items remaining | stop at first green | absorb surprises silently | respond to user while .prd has items
+**Always**: triage failure before snaking | witness end-to-end | snake to planning on any new unknown | enumerate remaining after every success | check .prd after every verification pass
+---
+**→ FORWARD**: .prd items remain → invoke `gm-execute` skill (keep going, do not stop).
+**→ FORWARD**: .prd deleted + feature work pushed → invoke `update-docs` skill.
+**↩ SNAKE to EMIT**: file output wrong → invoke `gm-emit` skill.
+**↩ SNAKE to EXECUTE**: logic wrong → invoke `gm-execute` skill.
+**↩ SNAKE to PLAN**: new unknown or wrong requirements → invoke `planning` skill, restart chain.

package/skills/gm-emit/SKILL.md ADDED Viewed

@@ -0,0 +1,111 @@
+---
+name: gm-emit
+description: EMIT phase. Pre-emit debug, write files, post-emit verify from disk. Any new unknown triggers immediate snake back to planning — restart chain.
+---
+# GM EMIT — Writing and Verifying Files
+You are in the **EMIT** phase. Every mutable is KNOWN. Prove the write is correct, write, confirm from disk. Any new unknown = snake to `planning`, restart chain.
+**GRAPH POSITION**: `PLAN → EXECUTE → [EMIT] → VERIFY → COMPLETE`
+- **Entry**: All .prd mutables resolved. Entered from `gm-execute` or via snake from VERIFY.
+## TRANSITIONS
+**FORWARD**: All gate conditions true simultaneously → invoke `gm-complete` skill
+**SELF-LOOP**: Post-emit variance with known cause → fix immediately, re-verify, do not advance until zero variance
+**BACKWARD**:
+- Pre-emit reveals logic error (known mutable) → invoke `gm-execute` skill, re-resolve, return here
+- Pre-emit reveals new unknown → invoke `planning` skill, restart chain
+- Post-emit variance with unknown cause → invoke `planning` skill, restart chain
+- Scope changed → invoke `planning` skill, restart chain
+- From VERIFY: end-to-end reveals broken file → re-enter here, fix, re-verify, re-advance
+## MUTABLE DISCIPLINE
+Each gate condition is a mutable. Pre-emit run witnesses expected value. Post-emit run witnesses current value. Zero variance = resolved. Variance with unknown cause = new unknown = snake to `planning`.
+## CODE EXECUTION
+**exec:<lang> is the only way to run code.** Bash tool body: `exec:<lang>\n<code>`
+`exec:nodejs` (default) | `exec:bash` | `exec:python` | `exec:typescript` | `exec:go` | `exec:rust` | `exec:java` | `exec:deno` | `exec:cmd`
+Only git in bash directly. `Bash(node/npm/npx/bun)` = violations. File writes via exec:nodejs + require('fs').
+**Execution efficiency — pack every run:**
+- Combine multiple independent operations into one exec call using `Promise.allSettled` or parallel subprocess spawning
+- Each independent idea gets its own try/catch with independent error reporting — never let one failure block another
+- Target under 12s per exec call; split work across multiple calls only when dependencies require it
+- Prefer a single well-structured exec that does 5 things over 5 sequential execs
+## PRE-EMIT DEBUGGING (before writing any file)
+1. Import actual module from disk via `exec:nodejs` — witness current on-disk behavior
+2. Run proposed logic in isolation WITHOUT writing — witness output with real inputs
+3. Debug failure paths with real error inputs — record expected values
+```
+exec:nodejs
+const { fn } = await import('/abs/path/to/module.js');
+console.log(await fn(realInput));
+```
+Pre-emit revealing unexpected behavior → new unknown → snake to `planning`.
+## WRITING FILES
+`exec:nodejs` with `require('fs')`. Write only when every gate mutable is `resolved=true` simultaneously.
+## POST-EMIT VERIFICATION (immediately after writing)
+1. Re-import the actual file from disk — not in-memory version
+2. Run same inputs as pre-emit — output must match exactly
+3. For browser: reload from disk, re-inject `__gm` globals, re-run, compare captures
+4. Known variance → fix and re-verify | Unknown variance → snake to `planning`
+## GATE CONDITIONS (all true simultaneously before advancing)
+- Pre-emit debug passed with real inputs and error inputs
+- Post-emit verification matches pre-emit exactly
+- Hot reloadable: state outside reloadable modules, handlers swap atomically
+- Crash-proof: catch at every boundary, recovery hierarchy
+- No mocks/fakes/stubs anywhere
+- Files ≤200 lines, no duplicate code, no comments, no hardcoded values
+- CLAUDE.md reflects actual behavior
+## CODEBASE EXPLORATION
+```
+exec:codesearch
+<natural language description>
+```
+Alias: `exec:search`. **Glob, Grep, Read, Explore are hook-blocked** — use `exec:codesearch` exclusively.
+## BROWSER DEBUGGING
+Invoke `browser` skill. Escalation: (1) `exec:browser\n<js>` → (2) `browser` skill → (3) navigate/click → (4) screenshot last resort.
+## SELF-CHECK (before and after each file)
+File ≤200 lines | No duplication | Pre-emit passed | No mocks | No comments | Docs match | All spotted issues fixed
+## DO NOT STOP
+Never respond to the user from this phase. When all gate conditions pass, immediately invoke `gm-complete` skill. Do not pause, summarize, or ask questions.
+## CONSTRAINTS
+**Never**: write before pre-emit passes | advance with post-emit variance | absorb surprises silently | comments | hardcoded values | defer spotted issues | respond to user or pause for input
+**Always**: pre-emit debug before writing | post-emit verify from disk | snake to planning on any new unknown | fix immediately | invoke next skill immediately when gates pass
+---
+**→ FORWARD**: All gates pass → invoke `gm-complete` skill immediately.
+**↺ SELF-LOOP**: Known post-emit variance → fix, re-verify.
+**↩ SNAKE to EXECUTE**: Known logic error → invoke `gm-execute` skill.
+**↩ SNAKE to PLAN**: Any new unknown → invoke `planning` skill, restart chain.

package/skills/gm-execute/SKILL.md ADDED Viewed

@@ -0,0 +1,126 @@
+---
+name: gm-execute
+description: EXECUTE phase. Resolve all mutables via witnessed execution. Any new unknown triggers immediate snake back to planning — restart chain from PLAN.
+---
+# GM EXECUTE — Resolving Every Unknown
+You are in the **EXECUTE** phase. Resolve every named mutable via witnessed execution. Any new unknown = stop, snake to `planning`, restart chain.
+**GRAPH POSITION**: `PLAN → [EXECUTE] → EMIT → VERIFY → COMPLETE`
+- **Entry**: .prd exists with all unknowns named. Entered from `planning` or via snake from EMIT/VERIFY.
+## TRANSITIONS
+**FORWARD**: All mutables KNOWN → invoke `gm-emit` skill
+**SELF-LOOP**: Mutable still UNKNOWN after one pass → re-run with different angle (max 2 passes then snake)
+**BACKWARD**:
+- New unknown discovered → invoke `planning` skill immediately, restart chain
+- From EMIT: logic error → re-enter here, re-resolve mutable
+- From VERIFY: runtime failure → re-enter here, re-resolve with real system state
+## MUTABLE DISCIPLINE
+Each mutable: name | expected | current | resolution method. Execute → witness → assign → compare. Zero variance = resolved. Unresolved after 2 passes = new unknown = snake to `planning`. Never narrate past an unresolved mutable.
+## CODE EXECUTION
+**exec:<lang> is the only way to run code.** Bash tool body: `exec:<lang>\n<code>`
+`exec:nodejs` (default) | `exec:bash` | `exec:python` | `exec:typescript` | `exec:go` | `exec:rust` | `exec:c` | `exec:cpp` | `exec:java` | `exec:deno` | `exec:cmd`
+Lang auto-detected if omitted. `cwd` sets directory. File I/O via exec:nodejs + require('fs'). Only git in bash directly. `Bash(node/npm/npx/bun)` = violations.
+**Execution efficiency — pack every run:**
+- Combine multiple independent operations into one exec call using `Promise.allSettled` or parallel subprocess spawning
+- Each independent idea gets its own try/catch with independent error reporting — never let one failure block another
+- Target under 12s per exec call; split work across multiple calls only when dependencies require it
+- Prefer a single well-structured exec that does 5 things over 5 sequential execs
+**Background tasks** (auto-backgrounded when execution exceeds 15s):
+```
+exec:sleep
+<task_id> [seconds]
+```
+```
+exec:status
+<task_id>
+```
+```
+exec:close
+<task_id>
+```
+**Runner**:
+```
+exec:runner
+start|stop|status
+```
+## CODEBASE EXPLORATION
+```
+exec:codesearch
+<natural language description of what you need>
+```
+Alias: `exec:search`. **Glob, Grep, Read, Explore, WebSearch are hook-blocked** — the pre-tool-use hook denies them. Use `exec:codesearch` exclusively for all codebase discovery.
+## IMPORT-BASED DEBUGGING
+Always import actual codebase modules. Never rewrite logic inline.
+```
+exec:nodejs
+const { fn } = await import('/abs/path/to/module.js');
+console.log(await fn(realInput));
+```
+Witnessed import output = resolved mutable. Reimplemented output = UNKNOWN.
+## EXECUTION DENSITY
+Pack every related hypothesis into one run. Each run ≤15s. Witnessed output = ground truth. Narrated assumption = nothing.
+Parallel waves: ≤3 `gm:gm` subagents via Task tool — independent items simultaneously, never sequentially.
+## CHAIN DECOMPOSITION
+Break every multi-step operation before running end-to-end:
+1. Number every distinct step
+2. Per step: input shape, output shape, success condition, failure mode
+3. Run each step in isolation — witness — assign mutable — KNOWN before next
+4. Debug adjacent pairs for handoff correctness
+5. Only when all pairs pass: run full chain end-to-end
+Step failure revealing new unknown → snake to `planning`.
+## BROWSER DEBUGGING
+Invoke `browser` skill. Escalation — exhaust each before advancing:
+1. `exec:browser\n<js>` — query DOM/state. Always first.
+2. `browser` skill — for full session workflows
+3. navigate/click/type — only when real events required
+4. screenshot — last resort
+## GROUND TRUTH
+Real services, real data, real timing. Mocks/fakes/stubs = delete immediately. No .test.js/.spec.js. Delete on discovery.
+## DO NOT STOP
+Never respond to the user from this phase. When all mutables are KNOWN, immediately invoke `gm-emit` skill. The chain continues until .prd is deleted and git is clean — that happens in `gm-complete`, not here.
+## CONSTRAINTS
+**Never**: `Bash(node/npm/npx/bun)` | fake data | mock files | Glob/Grep/Read/Explore (hook-blocked — use exec:codesearch) | sequential independent items | absorb surprises silently | respond to user or pause for input
+**Always**: witness every hypothesis | import real modules | snake to planning on any new unknown | fix immediately on discovery | invoke next skill immediately when done
+---
+**→ FORWARD**: All mutables KNOWN → invoke `gm-emit` skill immediately.
+**↺ SELF-LOOP**: Still UNKNOWN → re-run (max 2 passes).
+**↩ SNAKE to PLAN**: Any new unknown → invoke `planning` skill, restart chain.

package/skills/planning/SKILL.md ADDED Viewed

@@ -0,0 +1,93 @@
+---
+name: planning
+description: Mutable discovery and PRD construction. Invoke at session start and any time new unknowns surface during execution. Loop until no new mutables are discovered.
+allowed-tools: Write
+---
+# PRD Construction — Mutable Discovery Loop
+You are in the **PLAN** phase. Your job is to discover every unknown before execution begins.
+**GRAPH POSITION**: `[PLAN] → EXECUTE → EMIT → VERIFY → COMPLETE`
+- **Entry chain**: prompt-submit hook → `gm` skill → `planning` skill (here).
+- **Also entered**: any time a new unknown surfaces in EXECUTE, EMIT, or VERIFY.
+## TRANSITIONS
+**FORWARD**:
+- No new mutables discovered in latest pass → .prd is complete → invoke `gm-execute` skill
+**SELF-LOOP (stay in PLAN)**:
+- Each planning pass may surface new unknowns → add them to .prd → plan again
+- Loop until a full pass produces zero new items
+- Do not advance to EXECUTE while unknowns remain discoverable through reasoning alone
+**BACKWARD (snakes back here from later phases)**:
+- From EXECUTE: execution reveals an unknown not in .prd → snake here, add it, re-plan
+- From EMIT: scope shifted mid-write → snake here, revise affected items, re-plan
+- From VERIFY: end-to-end reveals requirement was wrong → snake here, rewrite items, re-plan
+## WHAT PLANNING MEANS
+Planning = exhaustive mutable discovery. For every aspect of the task ask:
+- What do I not know? → name it as a mutable
+- What could go wrong? → name it as an edge case item
+- What depends on what? → map blocking/blockedBy
+- What assumptions am I making? → validate each as a mutable
+**Iterate until**: a full reasoning pass adds zero new items to .prd.
+Categories of unknowns to enumerate: file existence | API shape | data format | dependency versions | runtime behavior | environment differences | error conditions | concurrency | integration points | backwards compatibility | rollback paths | deployment steps | verification criteria
+## .PRD FORMAT
+Path: exactly `./.prd` in current working directory. **JSON array** written via `exec:nodejs`.
+**Delete the file when empty.** Do not leave an empty `.prd` on disk — remove it entirely when all items are completed.
+```json
+[
+  {
+    "id": "descriptive-kebab-id",
+    "subject": "Imperative verb phrase — what must be true when done",
+    "status": "pending",
+    "description": "Precise completion criterion",
+    "effort": "small|medium|large",
+    "category": "feature|bug|refactor|infra",
+    "blocking": ["ids this prevents from starting"],
+    "blockedBy": ["ids that must complete first"],
+    "acceptance": ["measurable, binary criterion 1"],
+    "edge_cases": ["known failure mode 1"]
+  }
+]
+```
+**Status flow**: `pending` → `in_progress` → `completed` (completed items are removed from array).
+**Effort**: `small` = single execution, under 15min | `medium` = 2-3 rounds, under 45min | `large` = multiple rounds, over 1h.
+**blocking/blockedBy**: always bidirectional. Every dependency must be explicit in both directions.
+**Deletion rule**: when the last item is completed and removed, delete the `.prd` file. An empty file is a violation.
+## EXECUTION WAVES
+Independent items (empty `blockedBy`) run in parallel waves of ≤3 subagents.
+- Find all pending items with empty `blockedBy`
+- Launch ≤3 parallel `gm:gm` subagents via Task tool
+- Each subagent handles one item: resolves it, witnesses output, removes from .prd
+- After each wave: check newly unblocked items, launch next wave
+- Never run independent items sequentially. Never launch more than 3 at once.
+## COMPLETION CRITERION
+.prd is ready when: one full reasoning pass produces zero new items AND all items have explicit acceptance criteria AND all dependencies are mapped.
+**Skip planning entirely** if: task is single-step, trivially bounded, zero unknowns, under 5 minutes.
+## DO NOT STOP
+Never respond to the user from this phase. When .prd is complete (zero new items in last pass), immediately invoke `gm-execute` skill. Do not pause, summarize, or ask for confirmation.
+---
+**→ FORWARD**: No new mutables → invoke `gm-execute` skill immediately.
+**↺ SELF-LOOP**: New items discovered → add to .prd → plan again.
+**↩ SNAKE here**: New unknown surfaces in any later phase → add it, re-plan, re-advance.

package/skills/ssh/SKILL.md ADDED Viewed

@@ -0,0 +1,86 @@
+---
+name: ssh
+description: Run shell commands on remote SSH hosts via exec:ssh. Reads targets from ~/.claude/ssh-targets.json. Use for deploying, monitoring, or controlling remote machines.
+---
+# exec:ssh — Remote SSH Execution
+Runs shell commands on a remote host over SSH. No shell open, no manual connection — just write the command.
+## Setup
+Create `~/.claude/ssh-targets.json`:
+```json
+{
+  "default": {
+    "host": "192.168.1.10",
+    "port": 22,
+    "username": "pi",
+    "password": "yourpassword"
+  },
+  "prod": {
+    "host": "10.0.0.1",
+    "username": "ubuntu",
+    "keyPath": "/home/user/.ssh/id_rsa"
+  }
+}
+```
+Fields: `host` (required), `port` (default 22), `username` (required), `password` OR `keyPath` + optional `passphrase`.
+## Usage
+```
+exec:ssh
+<shell command>
+```
+Target a named host with `@name` on the first line:
+```
+exec:ssh
+@prod
+sudo systemctl restart myapp
+```
+Multi-line scripts work:
+```
+exec:ssh
+cd /var/log && tail -20 syslog
+```
+## Process Persistence
+SSH sessions kill child processes on close. To keep a process running after the command returns:
+```
+exec:ssh
+sudo systemctl reset-failed myunit 2>/dev/null; systemd-run --unit=myunit bash -c 'your-long-running-command'
+```
+Always call `systemctl reset-failed <unit>` before reusing a unit name, or use a unique timestamped name:
+```
+exec:ssh
+systemd-run --unit=job-$(date +%s) bash -c 'nohup myprogram &'
+```
+Fallback if systemd unavailable:
+```
+exec:ssh
+setsid nohup bash -c 'myprogram > /tmp/out.log 2>&1' &
+```
+## Dependency
+Requires `ssh2` npm package. Install in `~/.claude/gm-tools`:
+```
+exec:bash
+cd ~/.claude/gm-tools && npm install ssh2
+```
+The plugin searches: `~/.claude/gm-tools/node_modules/ssh2`, `~/.claude/plugins/node_modules/ssh2`, then global.

package/skills/update-docs/SKILL.md ADDED Viewed

@@ -0,0 +1,113 @@
+---
+name: update-docs
+description: UPDATE-DOCS phase. Refresh README.md, CLAUDE.md, and docs/index.html to reflect changes made this session. Commits and pushes doc updates. Terminal phase — declares COMPLETE.
+---
+# GM UPDATE-DOCS — Documentation Refresh
+You are in the **UPDATE-DOCS** phase. Feature work is verified and pushed. Refresh all documentation to match the actual codebase state right now.
+**GRAPH POSITION**: `PLAN → EXECUTE → EMIT → VERIFY → [UPDATE-DOCS] → COMPLETE`
+- **Entry**: Feature work verified, committed, and pushed. Entered from `gm-complete`.
+## TRANSITIONS
+**FORWARD**: All docs updated, committed, and pushed → COMPLETE (session ends)
+**BACKWARD**:
+- Diff reveals unknown architecture change → invoke `planning` skill, restart chain
+- Doc write fails post-emit verify → fix and re-verify before advancing
+## EXECUTION SEQUENCE
+**Step 1 — Understand what changed**:
+```
+exec:bash
+git log -5 --oneline
+git diff HEAD~1 --stat
+```
+Witness which files changed. Identify doc-sensitive changes: new skills, new states in the state machine, new platforms, modified architecture, new constraints, renamed files.
+**Step 2 — Read current docs**:
+```
+exec:nodejs
+const fs = require('fs');
+['README.md', 'CLAUDE.md', 'docs/index.html',
+ 'plugforge-starter/agents/gm.md'].forEach(f => {
+  try { console.log(`=== ${f} ===\n` + fs.readFileSync(f, 'utf8')); }
+  catch(e) { console.log(`MISSING: ${f}`); }
+});
+```
+Identify every section that no longer matches the actual codebase state.
+**Step 3 — Write updated docs**:
+Write only sections that changed. Do not rewrite unchanged content. Rules per file:
+**README.md**: platform count matches adapters in `platforms/`, skill tree diagram matches current state machine, quick start commands work.
+**CLAUDE.md**: Only non-obvious technical caveats that required multiple runs to discover — things that could not be known without hitting the problem first. Remove anything that no longer applies. Never add anything obvious from reading the code or that any developer would already know. The test: "would a developer need to discover this the hard way, or is it self-evident?" If self-evident, exclude it.
+**docs/index.html**: `PHASES` array matches current skill state machine phases. Platform lists match `platforms/` directory. State machine diagram updated if new phases added.
+**plugforge-starter/agents/gm.md**: Skill chain on the `gm skill →` line updated if new skills were added.
+```
+exec:nodejs
+const fs = require('fs');
+fs.writeFileSync('/abs/path/file.md', updatedContent);
+```
+**Step 4 — Verify from disk**:
+```
+exec:nodejs
+const fs = require('fs');
+const content = fs.readFileSync('/abs/path/file.md', 'utf8');
+console.log(content.includes('expectedString'), content.length);
+```
+Witness each written file from disk. Any variance from expected content → fix and re-verify.
+**Step 5 — Commit and push**:
+```
+exec:bash
+git add README.md CLAUDE.md docs/index.html plugforge-starter/agents/gm.md
+git diff --cached --stat
+```
+Witness staged files. Then commit and push:
+```
+exec:bash
+git commit -m "docs: update documentation to reflect session changes"
+git push -u origin HEAD
+```
+Witness push confirmation. Zero variance = COMPLETE.
+## DOC FIDELITY RULES
+Every claim in docs must be verifiable against disk right now:
+- State machine phase names must match skill file `name:` frontmatter
+- Platform names must match adapter class names in `platforms/`
+- File paths must exist on disk
+- Constraint counts must match actual constraints
+If a doc section cannot be verified against disk: remove it, do not speculate.
+## CONSTRAINTS
+**Never**: skip diff analysis | write docs from memory alone | push without verifying from disk | add comments to doc content | claim done without witnessed push output
+**Always**: witness git diff first | read current file before overwriting | verify each file from disk after write | push doc changes | confirm with witnessed push output
+---
+**→ COMPLETE**: Docs committed and pushed → session ends.
+**↩ SNAKE to PLAN**: New unknown about codebase state → invoke `planning` skill.