gm-hermes 2.0.1077 → 2.0.1079
- package/hermes-skill.json +1 -1
- package/index.html +1 -1
- package/package.json +1 -1
- package/skills/gm/SKILL.md +12 -61
- package/skills/gm-cc/SKILL.md +2 -14
- package/skills/gm-codex/SKILL.md +2 -14
- package/skills/gm-complete/SKILL.md +12 -111
- package/skills/gm-copilot-cli/SKILL.md +2 -14
- package/skills/gm-cursor/SKILL.md +2 -14
- package/skills/gm-emit/SKILL.md +12 -66
- package/skills/gm-execute/SKILL.md +12 -79
- package/skills/gm-gc/SKILL.md +2 -14
- package/skills/gm-jetbrains/SKILL.md +2 -14
- package/skills/gm-kilo/SKILL.md +2 -14
- package/skills/gm-oc/SKILL.md +2 -14
- package/skills/gm-skill/SKILL.md +21 -0
- package/skills/gm-vscode/SKILL.md +2 -14
- package/skills/gm-zed/SKILL.md +2 -14
- package/skills/planning/SKILL.md +12 -165
- package/skills/update-docs/SKILL.md +12 -56
package/hermes-skill.json
CHANGED
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
{
|
|
2
2
|
"name": "gm",
|
|
3
|
-
"version": "2.0.
|
|
3
|
+
"version": "2.0.1079",
|
|
4
4
|
"description": "Spool-dispatch orchestration engine with unified state machine, skills, and automated git enforcement",
|
|
5
5
|
"author": "AnEntrypoint",
|
|
6
6
|
"homepage": "https://github.com/AnEntrypoint/gm",
|
package/index.html
CHANGED
|
@@ -74,7 +74,7 @@ body { display: flex; flex-direction: column; min-height: 100vh; }
|
|
|
74
74
|
<section>
|
|
75
75
|
<div class="gm-section-label"><span class="slash">//</span>status</div>
|
|
76
76
|
<div class="panel">
|
|
77
|
-
<div class="panel-head"><span>release · v2.0.
|
|
77
|
+
<div class="panel-head"><span>release · v2.0.1079</span><span>probably emerging</span></div>
|
|
78
78
|
<div class="panel-body">
|
|
79
79
|
<div class="row">
|
|
80
80
|
<span class="code"><span style="color:var(--panel-accent)">●</span></span>
|
package/package.json
CHANGED
package/skills/gm/SKILL.md
CHANGED
|
@@ -1,72 +1,23 @@
|
|
|
1
1
|
---
|
|
2
2
|
name: gm
|
|
3
3
|
description: Orchestrator dispatching PLAN→EXECUTE→EMIT→VERIFY→UPDATE-DOCS skill chain; spool-driven task execution with session isolation
|
|
4
|
-
allowed-tools: Skill
|
|
4
|
+
allowed-tools: Skill, Read, Write
|
|
5
5
|
end-to-end: true
|
|
6
6
|
---
|
|
7
7
|
|
|
8
|
-
#
|
|
8
|
+
# gm — ORCHESTRATOR phase
|
|
9
9
|
|
|
10
|
-
|
|
10
|
+
Dispatch via plugkit spool. Write `.gm/exec-spool/in/<verb>/<N>.txt` with the request body. Read response from `.gm/exec-spool/out/<N>.json`. Follow `nextSkill` field via `Skill()` immediately — no stop, no approval, no narration.
|
|
11
11
|
|
|
12
|
-
|
|
12
|
+
Verbs for this phase:
|
|
13
|
+
- `phase-status` — read current FSM state
|
|
14
|
+
- `transition` — advance phase
|
|
15
|
+
- `mutable-resolve` — mark mutable witnessed (auto-fires memorize)
|
|
16
|
+
- `memorize-fire` — explicit memorize dispatch
|
|
17
|
+
- Plus existing: `recall`, `codesearch`, `memorize`, `health`, all language stems
|
|
13
18
|
|
|
14
|
-
|
|
19
|
+
The plugkit orchestrator owns state. The skill is one line: dispatch and follow.
|
|
15
20
|
|
|
16
|
-
|
|
21
|
+
## Transition
|
|
17
22
|
|
|
18
|
-
|
|
19
|
-
|
|
20
|
-
**LLM provider**: rs-learn uses a 3-tier fallback chain: acptoapi HTTP (127.0.0.1:4800) → ACP subprocess (opencode/kilo-code/codex/gemini-cli/qwen-code) → AGENTS.md content. All errors reported explicitly — never silently absorbed. SWE-bench scores rank model fallback ordering (highest first). Downstream platforms read OPENAI_BASE_URL defaulting to 127.0.0.1:4800. Anthropic SDK only when acptoapi socket unavailable. acptoapi strips internal tools (fs_*, bash, python, ssh, recall, memorize, codesearch, etc.) before forwarding to providers.
|
|
21
|
-
|
|
22
|
-
**rs-learn failure contract**: memorize, recall, and codesearch spool dispatch failures must be reported explicitly with error details to the user. Fallback to AGENTS.md for memory preservation when socket/network unavailable. Never silently absorb errors because memory preservation requires explicit fallback. This rule applies across all phases (PLAN through UPDATE-DOCS).
|
|
23
|
-
|
|
24
|
-
**Spool watcher restart**: at session start, check `.gm/exec-spool/.watcher.heartbeat`. If older than 30s, the watcher is dead. Write a nodejs spool file to `.gm/exec-spool/in/nodejs/restart-watcher.js` that kills stale plugkit processes and spawns `plugkit runner --watch .gm/exec-spool/in --out .gm/exec-spool/out` detached. Wait 2s for initialization before dispatching any exec commands.
|
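The heartbeat check described in the removed "Spool watcher restart" paragraph can be sketched as below. This is a minimal illustration, not the package's implementation: it assumes freshness is judged by the heartbeat file's modification time (the text only says "older than 30s"), and `watcherIsStale` is a hypothetical helper name.

```javascript
const fs = require('fs');

// Hypothetical helper: returns true when the heartbeat file is missing
// or its mtime is more than maxAgeMs (30s by default) in the past.
// Assumption: staleness is measured by file mtime, not file contents.
function watcherIsStale(heartbeatPath, nowMs = Date.now(), maxAgeMs = 30000) {
  let stat;
  try {
    stat = fs.statSync(heartbeatPath);
  } catch (err) {
    if (err.code === 'ENOENT') return true; // no heartbeat at all: treat as dead
    throw err; // any other error is reported, not absorbed
  }
  return nowMs - stat.mtimeMs > maxAgeMs;
}
```

A caller would pass `.gm/exec-spool/.watcher.heartbeat` and only write the restart spool file when the function returns `true`.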
|
25
|
-
|
|
26
|
-
**Spool dispatch chain**: write to `.gm/exec-spool/in/<lang>/<N>.<ext>` or `in/<verb>/<N>.txt`. Watcher executes and streams `out/<N>.out` + `out/<N>.err` + `out/<N>.json` metadata. Languages: nodejs, python, bash, typescript, go, rust, c, cpp, java, deno. Verbs: codesearch, recall, memorize, wait, sleep, status, close, browser, runner, type, kill-port, forget, feedback, learn-status, learn-debug, learn-build, discipline, pause, health.
|
|
27
|
-
|
|
28
|
-
**Data layout**: `.gm/` for ephemeral state (excluded from git), `gm-data/` for persistent DBs (rs-learn.db, code-search/ — tracked in git). Legacy `.gm/rs-learn.db` auto-migrates to `gm-data/rs-learn.db` on first access.
|
|
29
|
-
|
|
30
|
-
**WASM parity**: All CLI verbs available in browser/WASM via thebird host emulation. Python/bash/ssh/powershell map to exec_js; status/wait/sleep/close/kill-port/forget/feedback/learn-status/learn-debug/learn-build/discipline/pause/runner/type/browser all have WASM handlers. rs-search WASM uses fusion search (vector + BM25 + git via RRF scoring).
|
|
31
|
-
|
|
32
|
-
**Session isolation**: SESSION_ID environment variable (or uuid fallback) threads through task dispatch for cleanup scope. rs-exec RPC handlers verify session_id match on all task-scoped operations.
|
|
33
|
-
|
|
34
|
-
**Code does mechanics; meaning routes through textprocessing skill**: summarize, classify, extract intent, rewrite, translate, semantic dedup, rank, label — all via `Agent(subagent_type='gm:textprocessing', ...)`.
|
|
35
|
-
|
|
36
|
-
**Recall before fresh execution**: before witnessing unknown via execution, recall first. Hits arrive as weak_prior; empty results confirm fresh unknown.
|
|
37
|
-
|
|
38
|
-
**Memorize is the back-half of witness**: resolution incomplete until fact lives outside this context window. Fire `Agent(subagent_type='gm:memorize', model='haiku', run_in_background=true, prompt='## CONTEXT TO MEMORIZE\n<fact>')` alongside witness, in parallel, never blocking.
|
|
39
|
-
|
|
40
|
-
**Parallel independent items**: up to 3 `gm:gm` subagents per message for independent PRD items. Serial for dependent items — no re-asking between them.
|
|
41
|
-
|
|
42
|
-
**Terse response**: fragments OK. `[thing] [action] [reason]. [next step].` Code, commits, PRs use normal prose.
|
|
43
|
-
|
|
44
|
-
**Caveman medium mode (full) always on**: drop articles (a/an/the), filler (just/really/basically/actually/simply), pleasantries, and hedging. Fragments OK. Use short synonyms. Keep technical terms exact. Keep code blocks and exact error strings unchanged. Pattern: `[thing] [action] [reason]. [next step].` Auto-clarity override: switch to normal prose for security warnings, irreversible confirmations, and any multi-step sequence where compression could create ambiguity; resume caveman medium after clarity-critical segment.
|
|
45
|
-
|
|
46
|
-
## End-to-End Phase Chaining (Skills-Based Platforms)
|
|
47
|
-
|
|
48
|
-
When `end-to-end: true` is present in SKILL.md frontmatter, skill output includes structured JSON on stdout (final line):
|
|
49
|
-
|
|
50
|
-
```json
|
|
51
|
-
{"nextSkill": "gm-execute" | "gm-emit" | "gm-complete" | "update-docs" | null, "context": {PRD and state dict}, "phase": "PLAN" | "EXECUTE" | "EMIT" | "COMPLETE"}
|
|
52
|
-
```
|
|
53
|
-
|
|
54
|
-
Platform adapters (vscode, cursor, zed, jetbrains) that support `end-to-end: true` detection:
|
|
55
|
-
1. Invoke `Skill(skill="gm:gm")`
|
|
56
|
-
2. Parse stdout for trailing JSON blob
|
|
57
|
-
3. If `nextSkill` is non-null, invoke `Skill(skill="gm:<nextSkill>")` with context dict auto-passed
|
|
58
|
-
4. Repeat until `nextSkill` is null
|
|
59
|
-
|
|
60
|
-
This collapses 5 manual skill invocations into 1 user invocation + 4 transparent auto-dispatches, achieving perceived single-flow parity with gm-cc's subagent orchestration.
|
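The four adapter steps in the removed "End-to-End Phase Chaining" section amount to a loop over the trailing JSON line. The sketch below illustrates that loop under stated assumptions: `invokeSkill` is a hypothetical stand-in for the platform's `Skill()` call that returns the skill's stdout as a string, and `maxSteps` is an added safety bound that the source does not mention.

```javascript
// Parse the final non-empty stdout line as the chaining JSON.
// Returns null when that line is not valid JSON.
function parseTrailingJson(stdout) {
  const lines = stdout.split(/\r?\n/).filter((l) => l.trim() !== '');
  if (lines.length === 0) return null;
  try {
    return JSON.parse(lines[lines.length - 1]);
  } catch (_) {
    return null;
  }
}

// Invoke skills in sequence until nextSkill is null.
// invokeSkill(name, context) is hypothetical and must return stdout text.
function runChain(invokeSkill, firstSkill, maxSteps = 16) {
  const visited = [];
  let skill = firstSkill;
  let context = {};
  while (skill !== null && visited.length < maxSteps) {
    visited.push(skill);
    const result = parseTrailingJson(invokeSkill(`gm:${skill}`, context));
    if (!result) break; // no trailing JSON: the chain cannot continue
    context = result.context || context; // context dict is passed along
    skill = result.nextSkill ?? null;
  }
  return visited;
}
```

With a chain of `gm`, `gm-execute`, `gm-emit`, `gm-complete`, and `update-docs`, the loop performs five invocations from one initial call, which matches the "1 user invocation + 4 auto-dispatches" description.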
|
61
|
-
|
|
62
|
-
## Skill Transition Protocol
|
|
63
|
-
|
|
64
|
-
Skill-to-skill transitions NEVER stop for approval or ask the user. Each skill completes its phase and invokes the next skill via `Skill()` tool call directly. Platform adapters auto-detect the transition from stdout JSON and fire the next skill without user interaction. Tool-use approvals are pre-authorized by the initial user request — re-asking defeats the cascade and breaks autonomy. Skills must assume subsequent tools are approved once the PRD is written by PLAN.
|
|
65
|
-
|
|
66
|
-
## Gate enforcement (spool-dispatch layer)
|
|
67
|
-
|
|
68
|
-
The file-spool (`lib/spool-dispatch.js::checkDispatchGates()`) blocks Write/Edit/git operations when:
|
|
69
|
-
1. `.gm/prd.yml` exists AND `.gm/needs-gm` exists AND `.gm/gm-fired-<sessionId>` does NOT exist → reason: "gm orchestration in progress; skills must complete work before tools execute"
|
|
70
|
-
2. `.gm/mutables.yml` has entries with `status: unknown` → reason: "unresolved mutables block tool execution; resolve all mutables before proceeding"
|
|
71
|
-
|
|
72
|
-
Gate 1 auto-clears: PLAN writes THREE markers (`.gm/prd.yml`, `.gm/needs-gm`, `.gm/gm-fired-<sessionId>`) at session start BEFORE transitioning to EXECUTE. The marker proves planning has run and authorized tool use. Gate 2 auto-clears: EXECUTE resolves mutables by updating `.gm/mutables.yml` entries to `status: witnessed`, or the file is deleted when empty by gm-complete. Tool denials surface the reason text to the agent, which adjusts behavior (e.g., resolve remaining mutables before retrying). Tool denials never mutate command arguments — they surface policy as imperative instruction.
|
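The two gates in the removed "Gate enforcement" section can be illustrated with a small check over the `.gm/` directory. This is a sketch under assumptions, not the body of `lib/spool-dispatch.js::checkDispatchGates()`: the function name `checkGates` and its return shape are hypothetical, and the mutables check uses a line-based regex rather than a YAML parser.

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical sketch of the two gates. gmDir is the path to `.gm`.
function checkGates(gmDir, sessionId) {
  const has = (name) => fs.existsSync(path.join(gmDir, name));

  // Gate 1: PRD and needs-gm exist, but this session's fired marker does not.
  if (has('prd.yml') && has('needs-gm') && !has(`gm-fired-${sessionId}`)) {
    return {
      allowed: false,
      reason: 'gm orchestration in progress; skills must complete work before tools execute',
    };
  }

  // Gate 2: any mutable still marked `status: unknown`.
  // Assumption: a regex scan stands in for real YAML parsing.
  if (has('mutables.yml')) {
    const text = fs.readFileSync(path.join(gmDir, 'mutables.yml'), 'utf8');
    if (/^\s*(?:-\s*)?status:\s*unknown\s*$/m.test(text)) {
      return {
        allowed: false,
        reason: 'unresolved mutables block tool execution; resolve all mutables before proceeding',
      };
    }
  }

  return { allowed: true, reason: null };
}
```

Returning the reason text alongside the decision mirrors the described behavior where a denial surfaces policy to the agent instead of changing the command.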
|
23
|
+
Read `out/<N>.json::nextSkill`. Invoke `Skill(skill="gm:<nextSkill>")` immediately. End of skill.
|
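The new 2.0.1079 skill body reduces each phase to a write, a read, and a follow. A minimal sketch of that round trip is shown below; the helper names (`writeRequest`, `readResponse`, `nextSkillTarget`) and the `root` parameter are hypothetical, and the response is assumed to be a JSON object with a `nextSkill` field as the added text states.

```javascript
const fs = require('fs');
const path = require('path');

// Write the request body to <root>/.gm/exec-spool/in/<verb>/<N>.txt.
function writeRequest(root, verb, n, body) {
  const dir = path.join(root, '.gm', 'exec-spool', 'in', verb);
  fs.mkdirSync(dir, { recursive: true });
  const file = path.join(dir, `${n}.txt`);
  fs.writeFileSync(file, body);
  return file;
}

// Read <root>/.gm/exec-spool/out/<N>.json.
// Returns null while the response file does not exist yet.
function readResponse(root, n) {
  const file = path.join(root, '.gm', 'exec-spool', 'out', `${n}.json`);
  if (!fs.existsSync(file)) return null;
  return JSON.parse(fs.readFileSync(file, 'utf8'));
}

// Map a response to the Skill() target, or null when there is nothing to follow.
function nextSkillTarget(response) {
  return response && response.nextSkill ? `gm:${response.nextSkill}` : null;
}
```

A caller would poll `readResponse` until it returns an object, then pass the result of `nextSkillTarget` to `Skill()`. The sketch does not guard against reading a response file that is still being written.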
package/skills/gm-cc/SKILL.md
CHANGED
|
@@ -1,19 +1,7 @@
|
|
|
1
1
|
---
|
|
2
2
|
name: gm-cc
|
|
3
3
|
description: AI-native software engineering via skill-driven orchestration on cc; bootstraps plugkit for task execution and session isolation
|
|
4
|
-
allowed-tools: Skill
|
|
4
|
+
allowed-tools: Skill, Read, Write
|
|
5
5
|
---
|
|
6
6
|
|
|
7
|
-
|
|
8
|
-
|
|
9
|
-
AI-native software engineering orchestrated via skill chain: PLAN → EXECUTE → EMIT → VERIFY → UPDATE-DOCS.
|
|
10
|
-
|
|
11
|
-
**Bootstrap pattern**: `bun x gm-plugkit@latest --daemon` downloads the correct platform binary, verifies SHA256, and starts the spool watcher daemon. Call once at session start; idempotent on subsequent calls. All execution routes through the file-spool: write to `.gm/exec-spool/in/<lang>/<N>.<ext>` or `in/<verb>/<N>.txt`, poll `out/<N>.json` for results.
|
|
12
|
-
|
|
13
|
-
**Session-ID threading (no session-start hook)**: At skill invoke time, generate or detect SESSION_ID (env var `SESSION_ID` or `uuid()`). Pass `sessionId: "<id>"` in every rs-exec RPC body (spawn, tail, watch, etc.) and every spool-written task body. All task-scoped cleanup (deleteTask, getTask, appendOutput, killSessionTasks) requires matching sessionId. Absence is forbidden — hard reject by rs-exec handler.
|
|
14
|
-
|
|
15
|
-
**Spool dispatch surface**: Write to `.gm/exec-spool/in/<lang>/<N>.<ext>` (languages: nodejs, python, bash, typescript, go, rust, c, cpp, java, deno) or `in/<verb>/<N>.txt` (verbs: codesearch, recall, memorize, wait, sleep, status, close, browser, runner, etc.). Watcher executes and streams `out/<N>.out` (stdout) + `out/<N>.err` (stderr) line-by-line, then `out/<N>.json` metadata (exitCode, durationMs, timedOut, startedAt, endedAt) at completion.
|
|
16
|
-
|
|
17
|
-
**End-to-end skill chaining (skills-based platforms)**: When gm SKILL.md includes `end-to-end: true`, adapter detects signal and parses stdout for trailing JSON: `{"nextSkill": "...", "context": {...}, "phase": "..."}`. If nextSkill is non-null, invoke `Skill(skill="gm:<nextSkill>")` with context dict, repeat until null. This auto-chains 5 invocations into 1 user invocation.
|
|
18
|
-
|
|
19
|
-
Every task returns complete: taskId, exitCode, durationMs, timedOut, stdout, stderr. Background tasks return immediately with task_id; continue with `in/status/<N>.txt` (tail), `in/watch/<N>.txt` (watch), or `in/close/<N>.txt` (close).
|
|
7
|
+
See [gm-skill](../gm-skill/SKILL.md). All platforms share the same plugkit dispatch surface.
|
package/skills/gm-codex/SKILL.md
CHANGED
|
@@ -1,19 +1,7 @@
|
|
|
1
1
|
---
|
|
2
2
|
name: gm-codex
|
|
3
3
|
description: AI-native software engineering via skill-driven orchestration on codex; bootstraps plugkit for task execution and session isolation
|
|
4
|
-
allowed-tools: Skill
|
|
4
|
+
allowed-tools: Skill, Read, Write
|
|
5
5
|
---
|
|
6
6
|
|
|
7
|
-
|
|
8
|
-
|
|
9
|
-
AI-native software engineering orchestrated via skill chain: PLAN → EXECUTE → EMIT → VERIFY → UPDATE-DOCS.
|
|
10
|
-
|
|
11
|
-
**Bootstrap pattern**: `bun x gm-plugkit@latest --daemon` downloads the correct platform binary, verifies SHA256, and starts the spool watcher daemon. Call once at session start; idempotent on subsequent calls. All execution routes through the file-spool: write to `.gm/exec-spool/in/<lang>/<N>.<ext>` or `in/<verb>/<N>.txt`, poll `out/<N>.json` for results.
|
|
12
|
-
|
|
13
|
-
**Session-ID threading (no session-start hook)**: At skill invoke time, generate or detect SESSION_ID (env var `SESSION_ID` or `uuid()`). Pass `sessionId: "<id>"` in every rs-exec RPC body (spawn, tail, watch, etc.) and every spool-written task body. All task-scoped cleanup (deleteTask, getTask, appendOutput, killSessionTasks) requires matching sessionId. Absence is forbidden — hard reject by rs-exec handler.
|
|
14
|
-
|
|
15
|
-
**Spool dispatch surface**: Write to `.gm/exec-spool/in/<lang>/<N>.<ext>` (languages: nodejs, python, bash, typescript, go, rust, c, cpp, java, deno) or `in/<verb>/<N>.txt` (verbs: codesearch, recall, memorize, wait, sleep, status, close, browser, runner, etc.). Watcher executes and streams `out/<N>.out` (stdout) + `out/<N>.err` (stderr) line-by-line, then `out/<N>.json` metadata (exitCode, durationMs, timedOut, startedAt, endedAt) at completion.
|
|
16
|
-
|
|
17
|
-
**End-to-end skill chaining (skills-based platforms)**: When gm SKILL.md includes `end-to-end: true`, adapter detects signal and parses stdout for trailing JSON: `{"nextSkill": "...", "context": {...}, "phase": "..."}`. If nextSkill is non-null, invoke `Skill(skill="gm:<nextSkill>")` with context dict, repeat until null. This auto-chains 5 invocations into 1 user invocation.
|
|
18
|
-
|
|
19
|
-
Every task returns complete: taskId, exitCode, durationMs, timedOut, stdout, stderr. Background tasks return immediately with task_id; continue with `in/status/<N>.txt` (tail), `in/watch/<N>.txt` (watch), or `in/close/<N>.txt` (close).
|
|
7
|
+
See [gm-skill](../gm-skill/SKILL.md). All platforms share the same plugkit dispatch surface.
|
package/skills/gm-complete/SKILL.md
CHANGED
|
@@ -1,121 +1,22 @@
|
|
|
1
1
|
---
|
|
2
2
|
name: gm-complete
|
|
3
3
|
description: VERIFY and COMPLETE phase. End-to-end system verification and git enforcement. Any new unknown triggers immediate snake back to planning — restart chain.
|
|
4
|
+
allowed-tools: Skill, Read, Write
|
|
4
5
|
---
|
|
5
6
|
|
|
6
|
-
#
|
|
7
|
+
# gm-complete — VERIFY phase
|
|
7
8
|
|
|
8
|
-
|
|
9
|
+
Dispatch via plugkit spool. Write `.gm/exec-spool/in/<verb>/<N>.txt` with the request body. Read response from `.gm/exec-spool/out/<N>.json`. Follow `nextSkill` field via `Skill()` immediately — no stop, no approval, no narration.
|
|
9
10
|
|
|
10
|
-
|
|
11
|
+
Verbs for this phase:
|
|
12
|
+
- `phase-status` — read current FSM state
|
|
13
|
+
- `transition` — advance phase
|
|
14
|
+
- `mutable-resolve` — mark mutable witnessed (auto-fires memorize)
|
|
15
|
+
- `memorize-fire` — explicit memorize dispatch
|
|
16
|
+
- Plus existing: `recall`, `codesearch`, `memorize`, `health`, all language stems
|
|
11
17
|
|
|
12
|
-
|
|
18
|
+
The plugkit orchestrator owns state. The skill is one line: dispatch and follow.
|
|
13
19
|
|
|
14
|
-
|
|
15
|
-
- `.prd` empty AND test.js green AND pushed AND CI green → `update-docs`
|
|
16
|
-
- Broken file output → `gm-emit`
|
|
17
|
-
- Wrong logic → `gm-execute`
|
|
18
|
-
- New unknown or wrong requirements → `planning`
|
|
20
|
+
## Transition
|
|
19
21
|
|
|
20
|
-
|
|
21
|
-
|
|
22
|
-
## Mutables that must resolve before COMPLETE
|
|
23
|
-
|
|
24
|
-
- `witnessed_e2e` — real end-to-end run with witnessed output
|
|
25
|
-
- `browser_validated` — for any change touching client / UI / browser-facing code, see gate below. test.js + node-side imports DO NOT satisfy this gate.
|
|
26
|
-
- `git_clean` — `git status --porcelain` returns empty
|
|
27
|
-
- `git_pushed` — `git log origin/main..HEAD --oneline` returns empty
|
|
28
|
-
- `ci_passed` — every GitHub Actions run reaches `conclusion: success`
|
|
29
|
-
- `mutables_resolved` — `.gm/mutables.yml` deleted OR every entry `status: witnessed`. Never stop the turn while any entry is `status: unknown` — this gate is self-enforced.
|
|
30
|
-
- `prd_empty` — `.gm/prd.yml` deleted AFTER residual scan: enumerate every in-spirit reachable residual surfaced this session; any hit re-enters `planning`, appends PRD items, executes. Empty PRD is necessary, not sufficient — done = empty PRD AND zero reachable in-spirit residuals. Out-of-spirit-or-unreachable residuals are named in the response and skipped; everything else is this turn's work.
|
|
31
|
-
- `stress_suite_clear` — change walked through M1–D1 (governance), none flunked
|
|
32
|
-
- `hidden_decision_posture` — open → down_weighted → closed only when CI is green AND stress suite is clear
|
|
33
|
-
|
|
34
|
-
## End-to-end verification
|
|
35
|
-
|
|
36
|
-
Real system, real data, witness actual output. Doc updates, "saying done", and screenshots alone are not verification. Write the e2e probe to the spool (`.gm/exec-spool/in/nodejs/<N>.js`):
|
|
37
|
-
|
|
38
|
-
```
|
|
39
|
-
const { fn } = await import('/abs/path/to/module.js');
|
|
40
|
-
console.log(await fn(realInput));
|
|
41
|
-
```
|
|
42
|
-
|
|
43
|
-
After every success, enumerate what remains — never stop at first green.
|
|
44
|
-
|
|
45
|
-
## Browser validation gate
|
|
46
|
-
|
|
47
|
-
Required when this session changed any code that runs in a browser: anything under `client/`, UI components, shaders, page-loaded JS, served HTML, gh-pages assets, dev-server endpoints, or any module imported into the page bundle.
|
|
48
|
-
|
|
49
|
-
Trigger detection (any one): `git diff --name-only origin/main..HEAD` includes paths under `client/`, `apps/*/index.js` with client export, `docs/`, `*.html`, shader files, or any file imported by a browser entry; new/changed export consumed by `window.*` or rendered in DOM/canvas/WebGL; visual, layout, animation, input, network-on-page, or shader behavior altered.
|
|
50
|
-
|
|
51
|
-
Protocol: boot the real server (or open the static page) on a known URL — witness HTTP 200. `exec:browser` → `page.goto(url)` → wait for app init by polling for the global the change affects (`window.__app.<system>`). Probe via `page.evaluate(() => …)` asserting the specific invariant the change was supposed to establish — instance counts, scene meshes, DOM nodes, render stats, network frames. Capture witnessed numbers in the response — "looks fine" is not a witness. Failures route to `gm-execute` (logic) or `gm-emit` (output) — never paper over.
|
|
52
|
-
|
|
53
|
-
Long-running probes split into navigate-call → `exec:wait N` → probe-call to stay under the per-call budget. Do not stack multi-second `setTimeout` inside one `exec:browser` invocation.
|
|
54
|
-
|
|
55
|
-
Exempt only when: change is server-only with zero browser-facing surface, OR the repository has no browser surface at all (pure CLI / library). Exemption requires explicit tag in the response: `BROWSER EXEMPT: <reason — must reference diff paths showing zero browser-facing surface>`. Default posture is NOT exempt — burden is on the agent to prove exemption with diff evidence.
|
|
56
|
-
|
|
57
|
-
Pre-flight: run `git diff --name-only origin/main..HEAD` directly via Bash, then dispatch a nodejs spool file that reads the diff list and filters lines matching `client/|docs/|\.html$|\.glsl$|\.frag$|\.vert$`. Any hit AND no `exec:browser` block in this session → mandatory regression to `gm-execute`.
|
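The filtering half of the removed pre-flight step can be sketched as a pure function over the `git diff --name-only` output. The regex is the one quoted in the text; the function name `browserFacingPaths` is hypothetical, and running `git` itself is left to the caller.

```javascript
// Pattern copied from the pre-flight description.
const BROWSER_FACING = /client\/|docs\/|\.html$|\.glsl$|\.frag$|\.vert$/;

// Return the changed paths that count as browser-facing.
// diffOutput is the raw text of `git diff --name-only origin/main..HEAD`.
function browserFacingPaths(diffOutput) {
  return diffOutput
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line !== '' && BROWSER_FACING.test(line));
}
```

A non-empty result means the browser validation gate applies. Note that `client/` and `docs/` are unanchored in the quoted pattern, so nested directories with those names also match.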
|
58
|
-
|
|
59
|
-
## Integration test gate
|
|
60
|
-
|
|
61
|
-
Write to `.gm/exec-spool/in/nodejs/<N>.js`:
|
|
62
|
-
|
|
63
|
-
```
|
|
64
|
-
const { execSync } = require('child_process');
|
|
65
|
-
try { execSync('node test.js', { stdio: 'inherit', timeout: 30000 }); console.log('PASS'); }
|
|
66
|
-
catch (e) { console.error('FAIL'); process.exit(1); }
|
|
67
|
-
```
|
|
68
|
-
|
|
69
|
-
Failure → `gm-execute`. No test.js in a repo with testable surface → `gm-execute` to create it.
|
|
70
|
-
|
|
71
|
-
## Git enforcement
|
|
72
|
-
|
|
73
|
-
Run directly via Bash:
|
|
74
|
-
|
|
75
|
-
```
|
|
76
|
-
git status --porcelain
|
|
77
|
-
git log origin/main..HEAD --oneline
|
|
78
|
-
```
|
|
79
|
-
|
|
80
|
-
Both must return empty. Local commit without push is not complete.
|
|
81
|
-
|
|
82
|
-
## CI is automated
|
|
83
|
-
|
|
84
|
-
After `git push`, poll `gh run list --branch main --limit 5 --json status,name,databaseId` until all runs reach a terminal state. Green → continue; failure → investigate via `gh run view <id> --log-failed`, fix, push again. Deadline 180s (override `GM_CI_WATCH_SECS`). Poll every 10s via a nodejs spool file with a `setInterval` loop writing results to stdout.
|
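The decision logic of the removed CI polling step can be separated from the polling itself. The sketch below classifies one parsed `gh run list --json` result and reads the deadline override. It rests on assumptions: the JSON is taken to include a `conclusion` field in addition to the `status,name,databaseId` fields named in the text, and the function names are hypothetical.

```javascript
// Classify one poll result: 'pending' until every run is completed,
// then 'green' only if every run concluded with success.
// Assumption: each run object carries `status` and `conclusion`.
function classifyRuns(runs) {
  if (runs.length === 0) return 'pending'; // nothing reported yet
  if (runs.some((r) => r.status !== 'completed')) return 'pending';
  return runs.every((r) => r.conclusion === 'success') ? 'green' : 'failed';
}

// Deadline in seconds: GM_CI_WATCH_SECS when it is a positive number, else 180.
function ciDeadlineSecs(env = process.env) {
  const n = Number(env.GM_CI_WATCH_SECS);
  return Number.isFinite(n) && n > 0 ? n : 180;
}
```

A polling loop would call `classifyRuns` every 10 seconds and stop on `green`, on `failed`, or once `ciDeadlineSecs()` has elapsed. Under this strict reading, a completed run with any conclusion other than `success` counts as a failure.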
|
85
|
-
|
|
86
|
-
## Hygiene sweep
|
|
87
|
-
|
|
88
|
-
1. Files >200 lines → split
|
|
89
|
-
2. Comments in code → remove
|
|
90
|
-
3. Scattered test files (`.test.js`, `.spec.js`, `__tests__/`, `fixtures/`, `mocks/`) → delete, consolidate into root `test.js`
|
|
91
|
-
4. Mock / stub / simulation files → delete
|
|
92
|
-
5. Unnecessary doc files (not CHANGELOG, CLAUDE, README, TODO.md) → delete
|
|
93
|
-
6. Duplicate concern → regress to `planning` with restructuring instructions
|
|
94
|
-
7. Hardcoded values → derive from ground truth
|
|
95
|
-
8. Fallback / demo modes → remove, fail loud
|
|
96
|
-
9. TODO.md → empty or deleted
|
|
97
|
-
10. CHANGELOG.md → entries for this session
|
|
98
|
-
11. Observability gaps → server subsystems expose `/debug/<subsystem>`; client modules register in `window.__debug`
|
|
99
|
-
12. Memorize → every fact from verification handed off via background `Agent(memorize)` at moment of resolution
|
|
100
|
-
13. Deploy / publish → if deployable, deploy; if npm package, publish
|
|
101
|
-
14. GitHub Pages → check `.github/workflows/pages.yml` + `docs/index.html` exist; invoke `pages` skill if absent
|
|
102
|
-
15. Governance stress-suite → walk change through M1, F1, C1, H1, S1, B1, A1, D1; any flunk regresses to the owning phase
|
|
103
|
-
|
|
104
|
-
## Completion
|
|
105
|
-
|
|
106
|
-
All true at once: witnessed e2e | browser_validated when client work touched | failure paths exercised | test.js passes | `.prd` deleted | git clean and pushed | CI green | hygiene sweep clean | TODO.md gone | CHANGELOG.md updated.
|
|
107
|
-
|
|
108
|
-
## Marker file protocol
|
|
109
|
-
|
|
110
|
-
On transition to `update-docs`, delete all gm orchestration markers for the next cycle:
|
|
111
|
-
```
|
|
112
|
-
const fs = require('fs');
|
|
113
|
-
const path = require('path');
|
|
114
|
-
const sessionId = process.env.SESSION_ID || 'default';
|
|
115
|
-
const gm = path.join(process.cwd(), '.gm');
|
|
116
|
-
['.gm/prd.yml', '.gm/mutables.yml', '.gm/needs-gm', `.gm/gm-fired-${sessionId}`].forEach(m => {
|
|
117
|
-
try { fs.unlinkSync(path.join(gm, m.split('/')[1])); } catch (_) {}
|
|
118
|
-
});
|
|
119
|
-
```
|
|
120
|
-
|
|
121
|
-
The `.gm/gm-fired-<sessionId>` marker was written by PLAN at session start and proves gm orchestration has completed. Cleanup before next cycle resets gates so the next PLAN run can write fresh markers.
|
|
22
|
+
Read `out/<N>.json::nextSkill`. Invoke `Skill(skill="gm:<nextSkill>")` immediately. End of skill.
|
package/skills/gm-copilot-cli/SKILL.md
CHANGED
|
@@ -1,19 +1,7 @@
|
|
|
1
1
|
---
|
|
2
2
|
name: gm-copilot-cli
|
|
3
3
|
description: AI-native software engineering via skill-driven orchestration on copilot-cli; bootstraps plugkit for task execution and session isolation
|
|
4
|
-
allowed-tools: Skill
|
|
4
|
+
allowed-tools: Skill, Read, Write
|
|
5
5
|
---
|
|
6
6
|
|
|
7
|
-
|
|
8
|
-
|
|
9
|
-
AI-native software engineering orchestrated via skill chain: PLAN → EXECUTE → EMIT → VERIFY → UPDATE-DOCS.
|
|
10
|
-
|
|
11
|
-
**Bootstrap pattern**: `bun x gm-plugkit@latest --daemon` downloads the correct platform binary, verifies SHA256, and starts the spool watcher daemon. Call once at session start; idempotent on subsequent calls. All execution routes through the file-spool: write to `.gm/exec-spool/in/<lang>/<N>.<ext>` or `in/<verb>/<N>.txt`, poll `out/<N>.json` for results.
|
|
12
|
-
|
|
13
|
-
**Session-ID threading (no session-start hook)**: At skill invoke time, generate or detect SESSION_ID (env var `SESSION_ID` or `uuid()`). Pass `sessionId: "<id>"` in every rs-exec RPC body (spawn, tail, watch, etc.) and every spool-written task body. All task-scoped cleanup (deleteTask, getTask, appendOutput, killSessionTasks) requires matching sessionId. Absence is forbidden — hard reject by rs-exec handler.
|
|
14
|
-
|
|
15
|
-
**Spool dispatch surface**: Write to `.gm/exec-spool/in/<lang>/<N>.<ext>` (languages: nodejs, python, bash, typescript, go, rust, c, cpp, java, deno) or `in/<verb>/<N>.txt` (verbs: codesearch, recall, memorize, wait, sleep, status, close, browser, runner, etc.). Watcher executes and streams `out/<N>.out` (stdout) + `out/<N>.err` (stderr) line-by-line, then `out/<N>.json` metadata (exitCode, durationMs, timedOut, startedAt, endedAt) at completion.
|
|
16
|
-
|
|
17
|
-
**End-to-end skill chaining (skills-based platforms)**: When gm SKILL.md includes `end-to-end: true`, adapter detects signal and parses stdout for trailing JSON: `{"nextSkill": "...", "context": {...}, "phase": "..."}`. If nextSkill is non-null, invoke `Skill(skill="gm:<nextSkill>")` with context dict, repeat until null. This auto-chains 5 invocations into 1 user invocation.
|
|
18
|
-
|
|
19
|
-
Every task returns complete: taskId, exitCode, durationMs, timedOut, stdout, stderr. Background tasks return immediately with task_id; continue with `in/status/<N>.txt` (tail), `in/watch/<N>.txt` (watch), or `in/close/<N>.txt` (close).
|
|
7
|
+
See [gm-skill](../gm-skill/SKILL.md). All platforms share the same plugkit dispatch surface.
|
package/skills/gm-cursor/SKILL.md
CHANGED
|
@@ -1,19 +1,7 @@
|
|
|
1
1
|
---
|
|
2
2
|
name: gm-cursor
|
|
3
3
|
description: AI-native software engineering via skill-driven orchestration on cursor; bootstraps plugkit for task execution and session isolation
|
|
4
|
-
allowed-tools: Skill
|
|
4
|
+
allowed-tools: Skill, Read, Write
|
|
5
5
|
---
|
|
6
6
|
|
|
7
|
-
|
|
8
|
-
|
|
9
|
-
AI-native software engineering orchestrated via skill chain: PLAN → EXECUTE → EMIT → VERIFY → UPDATE-DOCS.
|
|
10
|
-
|
|
11
|
-
**Bootstrap pattern**: `bun x gm-plugkit@latest --daemon` downloads the correct platform binary, verifies SHA256, and starts the spool watcher daemon. Call once at session start; idempotent on subsequent calls. All execution routes through the file-spool: write to `.gm/exec-spool/in/<lang>/<N>.<ext>` or `in/<verb>/<N>.txt`, poll `out/<N>.json` for results.
|
|
12
|
-
|
|
13
|
-
**Session-ID threading (no session-start hook)**: At skill invoke time, generate or detect SESSION_ID (env var `SESSION_ID` or `uuid()`). Pass `sessionId: "<id>"` in every rs-exec RPC body (spawn, tail, watch, etc.) and every spool-written task body. All task-scoped cleanup (deleteTask, getTask, appendOutput, killSessionTasks) requires matching sessionId. Absence is forbidden — hard reject by rs-exec handler.
|
|
14
|
-
|
|
15
|
-
**Spool dispatch surface**: Write to `.gm/exec-spool/in/<lang>/<N>.<ext>` (languages: nodejs, python, bash, typescript, go, rust, c, cpp, java, deno) or `in/<verb>/<N>.txt` (verbs: codesearch, recall, memorize, wait, sleep, status, close, browser, runner, etc.). Watcher executes and streams `out/<N>.out` (stdout) + `out/<N>.err` (stderr) line-by-line, then `out/<N>.json` metadata (exitCode, durationMs, timedOut, startedAt, endedAt) at completion.
|
|
16
|
-
|
|
17
|
-
**End-to-end skill chaining (skills-based platforms)**: When gm SKILL.md includes `end-to-end: true`, adapter detects signal and parses stdout for trailing JSON: `{"nextSkill": "...", "context": {...}, "phase": "..."}`. If nextSkill is non-null, invoke `Skill(skill="gm:<nextSkill>")` with context dict, repeat until null. This auto-chains 5 invocations into 1 user invocation.
|
|
18
|
-
|
|
19
|
-
Every task returns complete: taskId, exitCode, durationMs, timedOut, stdout, stderr. Background tasks return immediately with task_id; continue with `in/status/<N>.txt` (tail), `in/watch/<N>.txt` (watch), or `in/close/<N>.txt` (close).
|
|
7
|
+
See [gm-skill](../gm-skill/SKILL.md). All platforms share the same plugkit dispatch surface.
|
package/skills/gm-emit/SKILL.md
CHANGED
|
@@ -1,76 +1,22 @@
|
|
|
1
1
|
---
|
|
2
2
|
name: gm-emit
|
|
3
3
|
description: EMIT phase. Pre-emit debug, write files, post-emit verify from disk. Any new unknown triggers immediate snake back to planning — restart chain.
|
|
4
|
+
allowed-tools: Skill, Read, Write
|
|
4
5
|
---
|
|
5
6
|
|
|
6
|
-
#
|
|
7
|
+
# gm-emit — EMIT phase
|
|
7
8
|
|
|
8
|
-
|
|
9
|
+
Dispatch via plugkit spool. Write `.gm/exec-spool/in/<verb>/<N>.txt` with the request body. Read response from `.gm/exec-spool/out/<N>.json`. Follow `nextSkill` field via `Skill()` immediately — no stop, no approval, no narration.
|
|
9
10
|
|
|
10
|
-
|
|
11
|
+
Verbs for this phase:
|
|
12
|
+
- `phase-status` — read current FSM state
|
|
13
|
+
- `transition` — advance phase
|
|
14
|
+
- `mutable-resolve` — mark mutable witnessed (auto-fires memorize)
|
|
15
|
+
- `memorize-fire` — explicit memorize dispatch
|
|
16
|
+
- Plus existing: `recall`, `codesearch`, `memorize`, `health`, all language stems
|
|
11
17
|
|
|
12
|
-
|
|
18
|
+
The plugkit orchestrator owns state. The skill is one line: dispatch and follow.
|
|
13
19
|
|
|
14
|
-
##
|
|
20
|
+
## Transition
|
|
15
21
|
|
|
16
|
-
|
|
17
|
-
- Post-emit variance with known cause → fix in-band, re-verify, stay in EMIT (self-loop, no transition)
|
|
18
|
-
- Pre-emit reveals known logic error → `gm-execute` (invoke via `Skill(skill="gm:gm-execute")` immediately, no stop)
|
|
19
|
-
- Pre-emit reveals new unknown OR post-emit variance with unknown cause OR scope changed → `planning` (invoke via `Skill(skill="planning")` immediately, no stop)
|
|
20
|
-
|
|
21
|
-
## Legitimacy gate (before pre-emit run)
|
|
22
|
-
|
|
23
|
-
For every claim landing in a file, answer five questions:
|
|
24
|
-
|
|
25
|
-
1. Earned specificity — does it trace to `authorization=witnessed`, or is it inflated from a weak prior?
|
|
26
|
-
2. Repair legality — is a local patch dressed as structural repair? Downgrade scope or regress to PLAN.
|
|
27
|
-
3. Lawful downgrade — can a weaker, true statement replace it? Prefer the downgrade.
|
|
28
|
-
4. Alternative-route suppression — is a live competing route being silenced? Preserve it.
|
|
29
|
-
5. Strongest objection — what would the sharpest reviewer pushback be? Articulate it. Cannot articulate = have not understood the alternatives → `gm-execute`.
|
|
30
|
-
|
|
31
|
-
Any failure regresses to `gm-execute` to witness what was missing, or `planning` if the gap is structural.
|
|
32
|
-
|
|
33
|
-
## Pre-emit run
|
|
34
|
-
|
|
35
|
-
Mandatory before writing any file. Write the probe to the spool (`.gm/exec-spool/in/nodejs/<N>.js`):
|
|
36
|
-
|
|
37
|
-
```
|
|
38
|
-
const { fn } = await import('/abs/path/to/module.js');
|
|
39
|
-
console.log(await fn(realInput));
|
|
40
|
-
```
|
|
41
|
-
|
|
42
|
-
Import the actual module from disk to witness current behavior as the baseline. Run the proposed logic in isolation without writing — witness with real inputs and with real error inputs. Match expected → write. Unexpected → new unknown → `planning`.
|
|
43
|
-
|
|
44
|
-
## Writing
|
|
45
|
-
|
|
46
|
-
Use the Write tool, or a nodejs spool file with `require('fs')`. Write only when every gate mutable resolves simultaneously.
|
|
47
|
-
|
|
48
|
-
## Post-emit verification
|
|
49
|
-
|
|
50
|
-
Re-import from disk — in-memory state is stale and inadmissible. Run identical inputs as pre-emit; output must match the baseline exactly. Known variance → fix and re-verify (self-loop). Unknown variance → `planning`.
|
|
51
|
-
|
|
52
|
-
## Mutables gate
|
|
53
|
-
|
|
54
|
-
Before pre-emit run, read `.gm/mutables.yml`. Any entry with `status: unknown` → regress to `gm-execute`. Never use Write/Edit/NotebookEdit while unresolved entries exist — this gate is self-enforced. Zero unresolved is the precondition for every legitimacy question below.
|
|
55
|
-
|
|
56
|
-
## Gate (all true at once)
|
|
57
|
-
|
|
58
|
-
- `.gm/mutables.yml` empty/absent OR every entry `status: witnessed` with filled `witness_evidence`
|
|
59
|
-
- Legitimacy gate passed; no refused collapse
|
|
60
|
-
- Pre-emit passed with real inputs and real error inputs
|
|
61
|
-
- Post-emit matches pre-emit exactly
|
|
62
|
-
- Hot-reloadable; errors throw with context (no `|| default`, no `catch { return null }`, no fallbacks)
|
|
63
|
-
- No mocks, fakes, stubs, or scattered test files (delete on discovery)
|
|
64
|
-
- Any behavior change has a corresponding assertion in `test.js` — a change no test catches is a change you cannot prove
|
|
65
|
-
- Browser-facing change → post-emit verify includes a live `exec:browser` witness (boot server → `page.goto` → `page.evaluate` asserting the invariant the change established). Node-side import + test.js does not satisfy this — the final gate runs again in `gm-complete`.
|
|
66
|
-
- Files ≤ 200 lines
|
|
67
|
-
- No duplicate concern (run `exec:codesearch` for the primary concern after writing; overlap → `planning`)
|
|
68
|
-
- No comments, no hardcoded values, no adjectives in identifiers, no unnecessary files
|
|
69
|
-
- Observability: new server subsystems expose `/debug/<subsystem>`; new client modules register in `window.__debug`
|
|
70
|
-
- Structure: no if/else where dispatch suffices; no one-liners that obscure; no reinvented APIs
|
|
71
|
-
- Every fact resolved this phase memorized via background `Agent(memorize)`
|
|
72
|
-
- CHANGELOG.md updated; TODO.md cleared or deleted
|
|
73
|
-
|
|
74
|
-
## Marker file protocol
|
|
75
|
-
|
|
76
|
-
EMIT phase operates after EXECUTE resolves all mutables. `.gm/prd.yml` and `.gm/mutables.yml` remain live (deleted by gm-complete when work finishes). Gate enforcement does not block EMIT — the `.gm/gm-fired-<sessionId>` marker was already written by PLAN at session start, so tool use (Write/Edit) is fully authorized. EMIT does not write or read markers; it only invokes Write/Edit on files and runs post-emit verifications. On transition to gm-complete (all gates clear), invoke the skill immediately (no marker write needed by EMIT).
|
|
22
|
+
Read `out/<N>.json::nextSkill`. Invoke `Skill(skill="gm:<nextSkill>")` immediately. End of skill.
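The dispatch-and-follow loop described in the added text can be sketched as a few file helpers. This is only an illustration under stated assumptions: the helper names (`requestPath`, `responsePath`, `writeRequest`, `readNextSkill`) are hypothetical, the request body format for each verb is not specified in this diff, and the only response field relied on is `nextSkill`, which the text names.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// Spool layout as given in the skill text; `root` is the project directory.
function requestPath(root, verb, n) {
  return path.join(root, '.gm', 'exec-spool', 'in', verb, `${n}.txt`);
}

function responsePath(root, n) {
  return path.join(root, '.gm', 'exec-spool', 'out', `${n}.json`);
}

// Write the request body for one verb dispatch and return the file written.
function writeRequest(root, verb, n, body) {
  const file = requestPath(root, verb, n);
  fs.mkdirSync(path.dirname(file), { recursive: true });
  fs.writeFileSync(file, body);
  return file;
}

// Returns undefined while no response file exists yet,
// null when the response carries no nextSkill, otherwise the skill name.
function readNextSkill(root, n) {
  const file = responsePath(root, n);
  if (!fs.existsSync(file)) return undefined;
  const response = JSON.parse(fs.readFileSync(file, 'utf8'));
  return response.nextSkill ?? null;
}
```

A caller would write the request, re-check `readNextSkill` until it stops returning `undefined`, and then pass the result to `Skill(skill="gm:<nextSkill>")`.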
|
package/skills/gm-execute/SKILL.md
CHANGED
|
@@ -1,89 +1,22 @@
|
|
|
1
1
|
---
|
|
2
2
|
name: gm-execute
|
|
3
3
|
description: EXECUTE phase AND the foundational execution contract for every skill. Every spool dispatch run, every witnessed check, every code search, in every phase, follows this skill's discipline. Resolve all mutables via witnessed execution. Any new unknown triggers immediate snake back to planning — restart chain from PLAN.
|
|
4
|
+
allowed-tools: Skill, Read, Write
|
|
4
5
|
---
|
|
5
6
|
|
|
6
|
-
#
|
|
7
|
+
# gm-execute — EXECUTE phase
|
|
7
8
|
|
|
8
|
-
|
|
9
|
+
Dispatch via plugkit spool. Write `.gm/exec-spool/in/<verb>/<N>.txt` with the request body. Read response from `.gm/exec-spool/out/<N>.json`. Follow `nextSkill` field via `Skill()` immediately — no stop, no approval, no narration.
|
|
9
10
|
|
|
10
|
-
|
|
11
|
+
Verbs for this phase:
|
|
12
|
+
- `phase-status` — read current FSM state
|
|
13
|
+
- `transition` — advance phase
|
|
14
|
+
- `mutable-resolve` — mark mutable witnessed (auto-fires memorize)
|
|
15
|
+
- `memorize-fire` — explicit memorize dispatch
|
|
16
|
+
- Plus existing: `recall`, `codesearch`, `memorize`, `health`, all language stems
|
|
11
17
|
|
|
12
|
-
|
|
18
|
+
The plugkit orchestrator owns state. The skill is one line: dispatch and follow.
|
|
13
19
|
|
|
14
|
-
|
|
20
|
+
## Transition
|
|
15
21
|
|
|
16
|
-
|
|
17
|
-
|
|
18
|
-
- All mutables KNOWN → `gm-emit` (invoke via `Skill(skill="gm:emit")` immediately, no stop)
|
|
19
|
-
- Still UNKNOWN → re-run from a different angle (max 2 passes)
|
|
20
|
-
- New unknown OR unresolvable after 2 passes → `planning` (invoke via `Skill(skill="planning")` immediately, no stop)
|
|
21
|
-
|
|
22
|
-
## Mutable discipline
|
|
23
|
-
|
|
24
|
-
Each mutable carries: name, expected, current, resolution method.
|
|
25
|
-
|
|
26
|
-
Resolves to KNOWN only when all four pass:
|
|
27
|
-
|
|
28
|
-
- **ΔS = 0** — witnessed output equals expected
|
|
29
|
-
- **λ ≥ 2** — two independent paths agree
|
|
30
|
-
- **ε intact** — adjacent invariants hold
|
|
31
|
-
- **Coverage ≥ 0.70** — enough corpus inspected to rule out contradiction
|
|
32
|
-
|
|
33
|
-
Unresolved after 2 passes regresses to `planning`. Never narrate past an unresolved mutable.
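The four resolution criteria in the removed "Mutable discipline" text can be read as a single predicate. The sketch below is a hedged illustration: the field names on `mutable` (`witnessed`, `expected`, `agreeingPaths`, `invariantsHold`, `coverage`) are hypothetical and are not the schema of `.gm/mutables.yml`; only the thresholds and the `witnessed` / `unknown` status values come from the text.

```javascript
const { isDeepStrictEqual } = require('node:util');

const MIN_INDEPENDENT_PATHS = 2; // lambda >= 2
const MIN_COVERAGE = 0.70;       // coverage >= 0.70

// Field names are illustrative only, not the real mutables.yml schema.
function evaluateMutable(mutable) {
  const checks = {
    deltaS: isDeepStrictEqual(mutable.witnessed, mutable.expected), // witnessed equals expected
    lambda: mutable.agreeingPaths >= MIN_INDEPENDENT_PATHS,         // two independent paths agree
    epsilon: mutable.invariantsHold === true,                       // adjacent invariants hold
    coverage: mutable.coverage >= MIN_COVERAGE,                     // enough corpus inspected
  };
  const failed = Object.keys(checks).filter((name) => !checks[name]);
  return { status: failed.length === 0 ? 'witnessed' : 'unknown', failed };
}
```

Returning the list of failed checks, rather than a bare boolean, keeps the reason for an unresolved mutable visible to whoever makes the next pass.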
|
|
34
|
-
|
|
35
|
-
Every witness that resolves a mutable writes back to `.gm/mutables.yml` the same step: set `status: witnessed` and fill `witness_evidence` with concrete proof (file:line, codesearch hit, exec output snippet). No write-back = the mutable stays unknown and the EMIT-gate stays closed. The file is the record; the agent's memory of "I resolved it" does not count.
|
|
36
|
-
|
|
37
|
-
Route candidates from PLAN are `weak_prior` only. Plausibility is the right to test, not the right to believe. A claim with no witness in the current session is a hypothesis — say so when stating it, and say what would settle it. The next reader (you, next turn) needs to know which lines were earned and which were carried forward.
|
|
38
|
-
|
|
39
|
-
## Verification budget
|
|
40
|
-
|
|
41
|
-
Spend on `.prd` items in descending order of consequence-if-wrong × distance-from-witnessed. Items whose failure would collapse the headline finding must reach witnessed status before EMIT; sub-argument-level items need at minimum a stated fallback path.
|
|
42
|
-
|
|
43
|
-
## Code execution
|
|
44
|
-
|
|
45
|
-
Code AND utility verbs both run through the file-spool. Write a file to `.gm/exec-spool/in/<lang-or-verb>/<N>.<ext>` — language stems (`in/nodejs/42.js`, `in/python/43.py`, `in/bash/44.sh`, plus typescript, go, rust, c, cpp, java, deno) or verb stems (`in/codesearch/45.txt`, `in/recall/46.txt`, `in/memorize/47.md`, plus wait, sleep, status, close, browser, runner, type, kill-port, forget, feedback, learn-status, learn-debug, learn-build, discipline, pause, health). The spool watcher executes and streams stdout to `out/<N>.out`, stderr to `out/<N>.err`, then writes `out/<N>.json` metadata sidecar at completion (taskId, lang, ok, exitCode, durationMs, timedOut, startedAt, endedAt). Both streams return as systemMessage with `--- stdout ---` / `--- stderr ---` separators. File I/O via a nodejs spool file + `require('fs')`. Only `git` and `gh` run directly in Bash. Never `Bash(node/npm/npx/bun)`, never `Bash(exec:<anything>)`.
|
|
46
|
-
|
|
47
|
-
Pack runs: `Promise.allSettled`, each idea own try/catch, under 12s per call. Runner: write `in/runner/<N>.txt` with body `start` | `stop` | `status`.
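The removed "pack run" rule (`Promise.allSettled`, each idea isolated, under 12s per call) could look roughly like the following. The names `withTimeout` and `runPack` are hypothetical, and the shape of `tasks` (an object of named functions) is an assumption made for the sketch.

```javascript
// Reject if the task has not settled within timeoutMs; always clear the timer afterwards.
function withTimeout(task, timeoutMs) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error(`timed out after ${timeoutMs}ms`)), timeoutMs);
  });
  // Promise.resolve().then(task) turns a synchronous throw into a rejection,
  // so each idea is isolated without its own explicit try/catch.
  return Promise.race([Promise.resolve().then(task), timeout])
    .finally(() => clearTimeout(timer));
}

// tasks: { name: () => value | Promise<value> }.
// One failing or hanging idea never hides the results of the others.
async function runPack(tasks, timeoutMs = 12_000) {
  const names = Object.keys(tasks);
  const settled = await Promise.allSettled(
    names.map((name) => withTimeout(tasks[name], timeoutMs))
  );
  return Object.fromEntries(names.map((name, index) => [name, settled[index]]));
}
```

Each entry of the result is a standard `allSettled` record, so callers can branch on `status` and read `value` or `reason` per idea.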
|
|
48
|
-
|
|
49
|
-
Every exec daemonizes. The spool watcher tails the task logfile up to 30s wall-clock and returns whatever is there — short tasks complete inside the window and look synchronous; long tasks return a task_id with partial output. Continue with `exec:tail` (drain, bounded), `exec:watch` (resume blocking until match or timeout), or `exec:close` (terminate). Never re-spawn a long task to check on it — that orphans the first one. `exec:wait` is a pure timer; `exec:sleep` blocks on a specific task's output; `exec:watch` is the match-or-timeout primitive. Every execution-platform RPC returns the live list of running tasks for this session — close stragglers via `exec:close\n<id>` so the list stays scannable. Session-end (clear/logout/prompt_input_exit) kills the session's tasks; compaction/handoff preserves them.
|
|
50
|
-
|
|
51
|
-
Every utility verb dispatches via `in/<verb>/<N>.txt`; the body of the file is the verb's argument. There is no inline form and no Bash-prefix form — use only the spool path.
|
|
52
|
-
|
|
53
|
-
## Codebase search
|
|
54
|
-
|
|
55
|
-
Codesearch only. Never use Grep, Glob, Find, Explore, raw grep/rg/find inside Bash. Write query to `.gm/exec-spool/in/codesearch/<N>.txt`. Read result from `.gm/exec-spool/out/<N>.out`.
|
|
56
|
-
|
|
57
|
-
Start two words, change/add one per pass, minimum four attempts before concluding absent. Known absolute path → `Read`. Known directory → nodejs spool file + `fs.readdirSync`.
|
|
58
|
-
|
|
59
|
-
## Utility verb failure handling
|
|
60
|
-
|
|
61
|
-
**Utility verb failures must surface**: memorize, recall, codesearch, and other utility verbs may fail (socket unavailable, timeout, network error). Failures do not block witness completion but must be reported to the user with error context. Fallback mechanisms (AGENTS.md for memorize) ensure memory preservation even when rs-learn is temporarily unavailable.
|
|
62
|
-
|
|
63
|
-
## Import-based execution
|
|
64
|
-
|
|
65
|
-
Hypotheses become real by importing actual modules from disk. Reimplemented behavior is UNKNOWN. Write the import probe to the spool:
|
|
66
|
-
|
|
67
|
-
```
|
|
68
|
-
# write .gm/exec-spool/in/nodejs/42.js
|
|
69
|
-
const { fn } = await import('/abs/path/to/module.js');
|
|
70
|
-
console.log(await fn(realInput));
|
|
71
|
-
```
|
|
72
|
-
|
|
73
|
-
Differential diagnosis: smallest reproduction → compare actual vs expected → name the delta — that delta is the mutable.
|
|
74
|
-
|
|
75
|
-
## Edits depend on witnesses
|
|
76
|
-
|
|
77
|
-
Hypothesis → run → witness → edit. An edit before a witness is a guess. Scan via codesearch (write to `.gm/exec-spool/in/codesearch/<N>.txt`) before creating or modifying — duplicate concern regresses to `planning`. Code-quality preference: native → library → structure → write.
|
|
78
|
-
|
|
79
|
-
## Parallel subagents
|
|
80
|
-
|
|
81
|
-
Up to 3 `gm:gm` subagents for independent items in one message. Browser escalation: write to `.gm/exec-spool/in/browser/<N>.txt` → `browser` skill → screenshot only as last resort.
|
|
82
|
-
|
|
83
|
-
## CI is automated
|
|
84
|
-
|
|
85
|
-
After `git push`, poll `gh run list --branch main --limit 5 --json status,name,databaseId` until all runs reach a terminal state. Green → continue; failure → investigate via `gh run view <id> --log-failed`, fix, push again. Deadline 180s (override `GM_CI_WATCH_SECS`). Poll every 10s via a nodejs spool file with a `setInterval` loop writing results to stdout.
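The polling rule in the removed "CI is automated" text (10s interval, 180s deadline, `GM_CI_WATCH_SECS` override) can be sketched as a loop with injected dependencies. Several things here are assumptions: `watchCi` and `allTerminal` are hypothetical names, `listRuns` stands in for whatever returns the parsed `gh run list --json status,name,databaseId` output, and treating `status === 'completed'` as the terminal state is my reading of the GitHub CLI output rather than something this diff states. The original text mentions a `setInterval` loop; an awaited loop is used here because it is easier to bound and to test.

```javascript
// Assumption: a run is terminal once gh reports status "completed".
function allTerminal(runs) {
  return runs.length > 0 && runs.every((run) => run.status === 'completed');
}

// listRuns, sleep and now are injected so the loop can run without gh or real delays.
async function watchCi(listRuns, options = {}) {
  const deadlineMs = options.deadlineMs
    ?? Number(process.env.GM_CI_WATCH_SECS ?? 180) * 1000;
  const intervalMs = options.intervalMs ?? 10_000;
  const sleep = options.sleep
    ?? ((ms) => new Promise((resolve) => setTimeout(resolve, ms)));
  const now = options.now ?? Date.now;

  const startedAt = now();
  for (;;) {
    const runs = await listRuns();
    if (allTerminal(runs)) return { done: true, runs };
    // Stop when the next wait would overshoot the deadline.
    if (now() - startedAt + intervalMs > deadlineMs) return { done: false, runs };
    await sleep(intervalMs);
  }
}
```

A `done: false` result means the deadline passed with runs still in flight; deciding whether a completed run was green would need an extra field such as the run's conclusion, which the command quoted in the text does not request.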
|
|
86
|
-
|
|
87
|
-
## Marker file protocol
|
|
88
|
-
|
|
89
|
-
EXECUTE phase works against `.gm/prd.yml` and `.gm/mutables.yml` written by PLAN. Both files live for the duration of work and are deleted by gm-complete when work finishes. Gate enforcement (spool-dispatch layer) checks: if `.gm/prd.yml` + `.gm/needs-gm` exist BUT `.gm/gm-fired-<sessionId>` marker is missing, tool use blocks. PLAN writes all three markers at session start before transitioning to EXECUTE, so the gate is already clear when EXECUTE runs. EXECUTE does not write or read markers — it only reads and updates mutables. On transition to gm-emit, invoke the skill immediately (no marker write needed by EXECUTE).
|
|
22
|
+
Read `out/<N>.json::nextSkill`. Invoke `Skill(skill="gm:<nextSkill>")` immediately. End of skill.
|
package/skills/gm-gc/SKILL.md
CHANGED
|
@@ -1,19 +1,7 @@
|
|
|
1
1
|
---
|
|
2
2
|
name: gm-gc
|
|
3
3
|
description: AI-native software engineering via skill-driven orchestration on gc; bootstraps plugkit for task execution and session isolation
|
|
4
|
-
allowed-tools: Skill
|
|
4
|
+
allowed-tools: Skill, Read, Write
|
|
5
5
|
---
|
|
6
6
|
|
|
7
|
-
|
|
8
|
-
|
|
9
|
-
AI-native software engineering orchestrated via skill chain: PLAN → EXECUTE → EMIT → VERIFY → UPDATE-DOCS.
|
|
10
|
-
|
|
11
|
-
**Bootstrap pattern**: `bun x gm-plugkit@latest --daemon` downloads the correct platform binary, verifies SHA256, and starts the spool watcher daemon. Call once at session start; idempotent on subsequent calls. All execution routes through the file-spool: write to `.gm/exec-spool/in/<lang>/<N>.<ext>` or `in/<verb>/<N>.txt`, poll `out/<N>.json` for results.
|
|
12
|
-
|
|
13
|
-
**Session-ID threading (no session-start hook)**: At skill invoke time, generate or detect SESSION_ID (env var `SESSION_ID` or `uuid()`). Pass `sessionId: "<id>"` in every rs-exec RPC body (spawn, tail, watch, etc.) and every spool-written task body. All task-scoped cleanup (deleteTask, getTask, appendOutput, killSessionTasks) requires matching sessionId. Absence is forbidden — hard reject by rs-exec handler.
|
|
14
|
-
|
|
15
|
-
**Spool dispatch surface**: Write to `.gm/exec-spool/in/<lang>/<N>.<ext>` (languages: nodejs, python, bash, typescript, go, rust, c, cpp, java, deno) or `in/<verb>/<N>.txt` (verbs: codesearch, recall, memorize, wait, sleep, status, close, browser, runner, etc.). Watcher executes and streams `out/<N>.out` (stdout) + `out/<N>.err` (stderr) line-by-line, then `out/<N>.json` metadata (exitCode, durationMs, timedOut, startedAt, endedAt) at completion.
|
|
16
|
-
|
|
17
|
-
**End-to-end skill chaining (skills-based platforms)**: When gm SKILL.md includes `end-to-end: true`, adapter detects signal and parses stdout for trailing JSON: `{"nextSkill": "...", "context": {...}, "phase": "..."}`. If nextSkill is non-null, invoke `Skill(skill="gm:<nextSkill>")` with context dict, repeat until null. This auto-chains 5 invocations into 1 user invocation.
|
|
18
|
-
|
|
19
|
-
Every task returns complete: taskId, exitCode, durationMs, timedOut, stdout, stderr. Background tasks return immediately with task_id; continue with `in/status/<N>.txt` (tail), `in/watch/<N>.txt` (watch), or `in/close/<N>.txt` (close).
|
|
7
|
+
See [gm-skill](../gm-skill/SKILL.md). All platforms share the same plugkit dispatch surface.
|
package/skills/gm-jetbrains/SKILL.md
CHANGED
|
@@ -1,19 +1,7 @@
|
|
|
1
1
|
---
|
|
2
2
|
name: gm-jetbrains
|
|
3
3
|
description: AI-native software engineering via skill-driven orchestration on jetbrains; bootstraps plugkit for task execution and session isolation
|
|
4
|
-
allowed-tools: Skill
|
|
4
|
+
allowed-tools: Skill, Read, Write
|
|
5
5
|
---
|
|
6
6
|
|
|
7
|
-
|
|
8
|
-
|
|
9
|
-
AI-native software engineering orchestrated via skill chain: PLAN → EXECUTE → EMIT → VERIFY → UPDATE-DOCS.
|
|
10
|
-
|
|
11
|
-
**Bootstrap pattern**: `bun x gm-plugkit@latest --daemon` downloads the correct platform binary, verifies SHA256, and starts the spool watcher daemon. Call once at session start; idempotent on subsequent calls. All execution routes through the file-spool: write to `.gm/exec-spool/in/<lang>/<N>.<ext>` or `in/<verb>/<N>.txt`, poll `out/<N>.json` for results.
|
|
12
|
-
|
|
13
|
-
**Session-ID threading (no session-start hook)**: At skill invoke time, generate or detect SESSION_ID (env var `SESSION_ID` or `uuid()`). Pass `sessionId: "<id>"` in every rs-exec RPC body (spawn, tail, watch, etc.) and every spool-written task body. All task-scoped cleanup (deleteTask, getTask, appendOutput, killSessionTasks) requires matching sessionId. Absence is forbidden — hard reject by rs-exec handler.
|
|
14
|
-
|
|
15
|
-
**Spool dispatch surface**: Write to `.gm/exec-spool/in/<lang>/<N>.<ext>` (languages: nodejs, python, bash, typescript, go, rust, c, cpp, java, deno) or `in/<verb>/<N>.txt` (verbs: codesearch, recall, memorize, wait, sleep, status, close, browser, runner, etc.). Watcher executes and streams `out/<N>.out` (stdout) + `out/<N>.err` (stderr) line-by-line, then `out/<N>.json` metadata (exitCode, durationMs, timedOut, startedAt, endedAt) at completion.
|
|
16
|
-
|
|
17
|
-
**End-to-end skill chaining (skills-based platforms)**: When gm SKILL.md includes `end-to-end: true`, adapter detects signal and parses stdout for trailing JSON: `{"nextSkill": "...", "context": {...}, "phase": "..."}`. If nextSkill is non-null, invoke `Skill(skill="gm:<nextSkill>")` with context dict, repeat until null. This auto-chains 5 invocations into 1 user invocation.
|
|
18
|
-
|
|
19
|
-
Every task returns complete: taskId, exitCode, durationMs, timedOut, stdout, stderr. Background tasks return immediately with task_id; continue with `in/status/<N>.txt` (tail), `in/watch/<N>.txt` (watch), or `in/close/<N>.txt` (close).
|
|
7
|
+
See [gm-skill](../gm-skill/SKILL.md). All platforms share the same plugkit dispatch surface.
|
package/skills/gm-kilo/SKILL.md
CHANGED
|
@@ -1,19 +1,7 @@
|
|
|
1
1
|
---
|
|
2
2
|
name: gm-kilo
|
|
3
3
|
description: AI-native software engineering via skill-driven orchestration on kilo; bootstraps plugkit for task execution and session isolation
|
|
4
|
-
allowed-tools: Skill
|
|
4
|
+
allowed-tools: Skill, Read, Write
|
|
5
5
|
---
|
|
6
6
|
|
|
7
|
-
|
|
8
|
-
|
|
9
|
-
AI-native software engineering orchestrated via skill chain: PLAN → EXECUTE → EMIT → VERIFY → UPDATE-DOCS.
|
|
10
|
-
|
|
11
|
-
**Bootstrap pattern**: `bun x gm-plugkit@latest --daemon` downloads the correct platform binary, verifies SHA256, and starts the spool watcher daemon. Call once at session start; idempotent on subsequent calls. All execution routes through the file-spool: write to `.gm/exec-spool/in/<lang>/<N>.<ext>` or `in/<verb>/<N>.txt`, poll `out/<N>.json` for results.
|
|
12
|
-
|
|
13
|
-
**Session-ID threading (no session-start hook)**: At skill invoke time, generate or detect SESSION_ID (env var `SESSION_ID` or `uuid()`). Pass `sessionId: "<id>"` in every rs-exec RPC body (spawn, tail, watch, etc.) and every spool-written task body. All task-scoped cleanup (deleteTask, getTask, appendOutput, killSessionTasks) requires matching sessionId. Absence is forbidden — hard reject by rs-exec handler.
|
|
14
|
-
|
|
15
|
-
**Spool dispatch surface**: Write to `.gm/exec-spool/in/<lang>/<N>.<ext>` (languages: nodejs, python, bash, typescript, go, rust, c, cpp, java, deno) or `in/<verb>/<N>.txt` (verbs: codesearch, recall, memorize, wait, sleep, status, close, browser, runner, etc.). Watcher executes and streams `out/<N>.out` (stdout) + `out/<N>.err` (stderr) line-by-line, then `out/<N>.json` metadata (exitCode, durationMs, timedOut, startedAt, endedAt) at completion.
|
|
16
|
-
|
|
17
|
-
**End-to-end skill chaining (skills-based platforms)**: When gm SKILL.md includes `end-to-end: true`, adapter detects signal and parses stdout for trailing JSON: `{"nextSkill": "...", "context": {...}, "phase": "..."}`. If nextSkill is non-null, invoke `Skill(skill="gm:<nextSkill>")` with context dict, repeat until null. This auto-chains 5 invocations into 1 user invocation.
|
|
18
|
-
|
|
19
|
-
Every task returns complete: taskId, exitCode, durationMs, timedOut, stdout, stderr. Background tasks return immediately with task_id; continue with `in/status/<N>.txt` (tail), `in/watch/<N>.txt` (watch), or `in/close/<N>.txt` (close).
|
|
7
|
+
See [gm-skill](../gm-skill/SKILL.md). All platforms share the same plugkit dispatch surface.
|
package/skills/gm-oc/SKILL.md
CHANGED
|
@@ -1,19 +1,7 @@
|
|
|
1
1
|
---
|
|
2
2
|
name: gm-oc
|
|
3
3
|
description: AI-native software engineering via skill-driven orchestration on oc; bootstraps plugkit for task execution and session isolation
|
|
4
|
-
allowed-tools: Skill
|
|
4
|
+
allowed-tools: Skill, Read, Write
|
|
5
5
|
---
|
|
6
6
|
|
|
7
|
-
|
|
8
|
-
|
|
9
|
-
AI-native software engineering orchestrated via skill chain: PLAN → EXECUTE → EMIT → VERIFY → UPDATE-DOCS.
|
|
10
|
-
|
|
11
|
-
**Bootstrap pattern**: `bun x gm-plugkit@latest --daemon` downloads the correct platform binary, verifies SHA256, and starts the spool watcher daemon. Call once at session start; idempotent on subsequent calls. All execution routes through the file-spool: write to `.gm/exec-spool/in/<lang>/<N>.<ext>` or `in/<verb>/<N>.txt`, poll `out/<N>.json` for results.
|
|
12
|
-
|
|
13
|
-
**Session-ID threading (no session-start hook)**: At skill invoke time, generate or detect SESSION_ID (env var `SESSION_ID` or `uuid()`). Pass `sessionId: "<id>"` in every rs-exec RPC body (spawn, tail, watch, etc.) and every spool-written task body. All task-scoped cleanup (deleteTask, getTask, appendOutput, killSessionTasks) requires matching sessionId. Absence is forbidden — hard reject by rs-exec handler.
|
|
14
|
-
|
|
15
|
-
**Spool dispatch surface**: Write to `.gm/exec-spool/in/<lang>/<N>.<ext>` (languages: nodejs, python, bash, typescript, go, rust, c, cpp, java, deno) or `in/<verb>/<N>.txt` (verbs: codesearch, recall, memorize, wait, sleep, status, close, browser, runner, etc.). Watcher executes and streams `out/<N>.out` (stdout) + `out/<N>.err` (stderr) line-by-line, then `out/<N>.json` metadata (exitCode, durationMs, timedOut, startedAt, endedAt) at completion.
|
|
16
|
-
|
|
17
|
-
**End-to-end skill chaining (skills-based platforms)**: When gm SKILL.md includes `end-to-end: true`, adapter detects signal and parses stdout for trailing JSON: `{"nextSkill": "...", "context": {...}, "phase": "..."}`. If nextSkill is non-null, invoke `Skill(skill="gm:<nextSkill>")` with context dict, repeat until null. This auto-chains 5 invocations into 1 user invocation.
|
|
18
|
-
|
|
19
|
-
Every task returns complete: taskId, exitCode, durationMs, timedOut, stdout, stderr. Background tasks return immediately with task_id; continue with `in/status/<N>.txt` (tail), `in/watch/<N>.txt` (watch), or `in/close/<N>.txt` (close).
|
|
7
|
+
See [gm-skill](../gm-skill/SKILL.md). All platforms share the same plugkit dispatch surface.
|
|
@@ -0,0 +1,21 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: gm-skill
|
|
3
|
+
description: Canonical universal harness — AI-native software engineering via skill-driven orchestration; bootstraps plugkit for task execution and session isolation
|
|
4
|
+
allowed-tools: Skill, Read, Write
|
|
5
|
+
---
|
|
6
|
+
|
|
7
|
+
# GM — Universal Skill Harness
|
|
8
|
+
|
|
9
|
+
Single canonical body re-exported by every platform-specific gm-<platform> skill. All 15 platforms share this identical plugkit dispatch surface.
|
|
10
|
+
|
|
11
|
+
AI-native software engineering orchestrated via skill chain: PLAN → EXECUTE → EMIT → VERIFY → UPDATE-DOCS.
|
|
12
|
+
|
|
13
|
+
**Bootstrap pattern**: `bun x gm-plugkit@latest --daemon` downloads the correct platform binary, verifies SHA256, and starts the spool watcher daemon. Call once at session start; idempotent on subsequent calls. All execution routes through the file-spool: write to `.gm/exec-spool/in/<lang>/<N>.<ext>` or `in/<verb>/<N>.txt`, poll `out/<N>.json` for results.
|
|
14
|
+
|
|
15
|
+
**Session-ID threading (no session-start hook)**: At skill invoke time, generate or detect SESSION_ID (env var `SESSION_ID` or `uuid()`). Pass `sessionId: "<id>"` in every rs-exec RPC body (spawn, tail, watch, etc.) and every spool-written task body. All task-scoped cleanup (deleteTask, getTask, appendOutput, killSessionTasks) requires matching sessionId. Absence is forbidden — hard reject by rs-exec handler.
|
|
16
|
+
|
|
17
|
+
**Spool dispatch surface**: Write to `.gm/exec-spool/in/<lang>/<N>.<ext>` (languages: nodejs, python, bash, typescript, go, rust, c, cpp, java, deno) or `in/<verb>/<N>.txt` (verbs: codesearch, recall, memorize, wait, sleep, status, close, browser, runner, etc.). Watcher executes and streams `out/<N>.out` (stdout) + `out/<N>.err` (stderr) line-by-line, then `out/<N>.json` metadata (exitCode, durationMs, timedOut, startedAt, endedAt) at completion.
|
|
18
|
+
|
|
19
|
+
**End-to-end skill chaining (skills-based platforms)**: When gm SKILL.md includes `end-to-end: true`, adapter detects signal and parses stdout for trailing JSON: `{"nextSkill": "...", "context": {...}, "phase": "..."}`. If nextSkill is non-null, invoke `Skill(skill="gm:<nextSkill>")` with context dict, repeat until null. This auto-chains 5 invocations into 1 user invocation.
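The chaining behaviour described above can be illustrated with a small parser and loop. This is a sketch under assumptions, not the adapter's actual code: `parseTrailingJson` and `runChain` are hypothetical names, the trailing JSON is assumed to sit on the last non-empty line of stdout, `invokeSkill` is a stand-in for the platform's `Skill()` call that returns the skill's stdout, and the `maxHops` guard is an addition of this sketch.

```javascript
// Parse the last non-empty stdout line as a JSON object; null when it is not one.
function parseTrailingJson(stdout) {
  const lines = stdout.split(/\r?\n/).filter((line) => line.trim() !== '');
  if (lines.length === 0) return null;
  try {
    const parsed = JSON.parse(lines[lines.length - 1]);
    const isObject = parsed !== null && typeof parsed === 'object' && !Array.isArray(parsed);
    return isObject ? parsed : null;
  } catch {
    return null; // the last line was ordinary output, not chaining metadata
  }
}

// invokeSkill(name, context) is hypothetical: it runs one skill and returns its stdout.
// Returns the list of skills visited, in order.
function runChain(invokeSkill, firstSkill, maxHops = 16) {
  const visited = [];
  let next = { nextSkill: firstSkill, context: {} };
  while (next && next.nextSkill && visited.length < maxHops) {
    visited.push(next.nextSkill);
    const stdout = invokeSkill(`gm:${next.nextSkill}`, next.context ?? {});
    next = parseTrailingJson(stdout);
  }
  return visited;
}
```

The loop ends as soon as a skill prints no trailing JSON or a null `nextSkill`, which matches the "repeat until null" wording.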
|
|
20
|
+
|
|
21
|
+
Every task returns complete: taskId, exitCode, durationMs, timedOut, stdout, stderr. Background tasks return immediately with task_id; continue with `in/status/<N>.txt` (tail), `in/watch/<N>.txt` (watch), or `in/close/<N>.txt` (close).
|
package/skills/gm-vscode/SKILL.md
CHANGED
|
@@ -1,19 +1,7 @@
|
|
|
1
1
|
---
|
|
2
2
|
name: gm-vscode
|
|
3
3
|
description: AI-native software engineering via skill-driven orchestration on vscode; bootstraps plugkit for task execution and session isolation
|
|
4
|
-
allowed-tools: Skill
|
|
4
|
+
allowed-tools: Skill, Read, Write
|
|
5
5
|
---
|
|
6
6
|
|
|
7
|
-
|
|
8
|
-
|
|
9
|
-
AI-native software engineering orchestrated via skill chain: PLAN → EXECUTE → EMIT → VERIFY → UPDATE-DOCS.
|
|
10
|
-
|
|
11
|
-
**Bootstrap pattern**: `bun x gm-plugkit@latest --daemon` downloads the correct platform binary, verifies SHA256, and starts the spool watcher daemon. Call once at session start; idempotent on subsequent calls. All execution routes through the file-spool: write to `.gm/exec-spool/in/<lang>/<N>.<ext>` or `in/<verb>/<N>.txt`, poll `out/<N>.json` for results.
|
|
12
|
-
|
|
13
|
-
**Session-ID threading (no session-start hook)**: At skill invoke time, generate or detect SESSION_ID (env var `SESSION_ID` or `uuid()`). Pass `sessionId: "<id>"` in every rs-exec RPC body (spawn, tail, watch, etc.) and every spool-written task body. All task-scoped cleanup (deleteTask, getTask, appendOutput, killSessionTasks) requires matching sessionId. Absence is forbidden — hard reject by rs-exec handler.
|
|
14
|
-
|
|
15
|
-
**Spool dispatch surface**: Write to `.gm/exec-spool/in/<lang>/<N>.<ext>` (languages: nodejs, python, bash, typescript, go, rust, c, cpp, java, deno) or `in/<verb>/<N>.txt` (verbs: codesearch, recall, memorize, wait, sleep, status, close, browser, runner, etc.). Watcher executes and streams `out/<N>.out` (stdout) + `out/<N>.err` (stderr) line-by-line, then `out/<N>.json` metadata (exitCode, durationMs, timedOut, startedAt, endedAt) at completion.
|
|
16
|
-
|
|
17
|
-
**End-to-end skill chaining (skills-based platforms)**: When gm SKILL.md includes `end-to-end: true`, adapter detects signal and parses stdout for trailing JSON: `{"nextSkill": "...", "context": {...}, "phase": "..."}`. If nextSkill is non-null, invoke `Skill(skill="gm:<nextSkill>")` with context dict, repeat until null. This auto-chains 5 invocations into 1 user invocation.
|
|
18
|
-
|
|
19
|
-
Every task returns complete: taskId, exitCode, durationMs, timedOut, stdout, stderr. Background tasks return immediately with task_id; continue with `in/status/<N>.txt` (tail), `in/watch/<N>.txt` (watch), or `in/close/<N>.txt` (close).
|
|
7
|
+
See [gm-skill](../gm-skill/SKILL.md). All platforms share the same plugkit dispatch surface.
|
package/skills/gm-zed/SKILL.md
CHANGED
|
@@ -1,19 +1,7 @@
|
|
|
1
1
|
---
|
|
2
2
|
name: gm-zed
|
|
3
3
|
description: AI-native software engineering via skill-driven orchestration on zed; bootstraps plugkit for task execution and session isolation
|
|
4
|
-
allowed-tools: Skill
|
|
4
|
+
allowed-tools: Skill, Read, Write
|
|
5
5
|
---
|
|
6
6
|
|
|
7
|
-
|
|
8
|
-
|
|
9
|
-
AI-native software engineering orchestrated via skill chain: PLAN → EXECUTE → EMIT → VERIFY → UPDATE-DOCS.
|
|
10
|
-
|
|
11
|
-
**Bootstrap pattern**: `bun x gm-plugkit@latest --daemon` downloads the correct platform binary, verifies SHA256, and starts the spool watcher daemon. Call once at session start; idempotent on subsequent calls. All execution routes through the file-spool: write to `.gm/exec-spool/in/<lang>/<N>.<ext>` or `in/<verb>/<N>.txt`, poll `out/<N>.json` for results.
|
|
12
|
-
|
|
13
|
-
**Session-ID threading (no session-start hook)**: At skill invoke time, generate or detect SESSION_ID (env var `SESSION_ID` or `uuid()`). Pass `sessionId: "<id>"` in every rs-exec RPC body (spawn, tail, watch, etc.) and every spool-written task body. All task-scoped cleanup (deleteTask, getTask, appendOutput, killSessionTasks) requires matching sessionId. Absence is forbidden — hard reject by rs-exec handler.
|
|
14
|
-
|
|
15
|
-
**Spool dispatch surface**: Write to `.gm/exec-spool/in/<lang>/<N>.<ext>` (languages: nodejs, python, bash, typescript, go, rust, c, cpp, java, deno) or `in/<verb>/<N>.txt` (verbs: codesearch, recall, memorize, wait, sleep, status, close, browser, runner, etc.). Watcher executes and streams `out/<N>.out` (stdout) + `out/<N>.err` (stderr) line-by-line, then `out/<N>.json` metadata (exitCode, durationMs, timedOut, startedAt, endedAt) at completion.
|
|
16
|
-
|
|
17
|
-
**End-to-end skill chaining (skills-based platforms)**: When gm SKILL.md includes `end-to-end: true`, adapter detects signal and parses stdout for trailing JSON: `{"nextSkill": "...", "context": {...}, "phase": "..."}`. If nextSkill is non-null, invoke `Skill(skill="gm:<nextSkill>")` with context dict, repeat until null. This auto-chains 5 invocations into 1 user invocation.
|
|
18
|
-
|
|
19
|
-
Every task returns complete: taskId, exitCode, durationMs, timedOut, stdout, stderr. Background tasks return immediately with task_id; continue with `in/status/<N>.txt` (tail), `in/watch/<N>.txt` (watch), or `in/close/<N>.txt` (close).
+See [gm-skill](../gm-skill/SKILL.md). All platforms share the same plugkit dispatch surface.
package/skills/planning/SKILL.md
CHANGED
@@ -1,175 +1,22 @@
 ---
 name: planning
 description: State machine orchestrator. Mutable discovery, PRD construction, and full PLAN→EXECUTE→EMIT→VERIFY→COMPLETE lifecycle. Invoke at session start and on any new unknown.
-allowed-tools: Skill
+allowed-tools: Skill, Read, Write
 ---
 
-#
+# planning — PLAN phase
 
-
+Dispatch via plugkit spool. Write `.gm/exec-spool/in/<verb>/<N>.txt` with the request body. Read response from `.gm/exec-spool/out/<N>.json`. Follow `nextSkill` field via `Skill()` immediately — no stop, no approval, no narration.
 
-
+Verbs for this phase:
+- `phase-status` — read current FSM state
+- `transition` — advance phase
+- `mutable-resolve` — mark mutable witnessed (auto-fires memorize)
+- `memorize-fire` — explicit memorize dispatch
+- Plus existing: `recall`, `codesearch`, `memorize`, `health`, all language stems
 
-
+The plugkit orchestrator owns state. The skill is one line: dispatch and follow.
 
-##
+## Transition
 
-
-- New unknown anywhere in chain → re-enter PLAN
-- EXECUTE unresolvable after 2 passes → PLAN
-- VERIFY: `.prd` empty + git clean + pushed → `update-docs`; else → `gm-execute`
-
-Cannot stop while `.gm/prd.yml` has items, git is dirty, or commits are unpushed.
-
-## Session start: restart spool watcher
-
-Before any orient or PRD work, ensure the spool watcher is running. Check `.gm/exec-spool/.watcher.heartbeat` — if older than 30s, the watcher is dead. Restart it:
-
-```
-# write .gm/exec-spool/in/nodejs/restart-watcher.js
-const { spawn, spawnSync } = require('child_process');
-const fs = require('fs');
-const path = require('path');
-const os = require('os');
-const bin = path.join(os.homedir(), '.claude', 'gm-tools', 'plugkit.wasm');
-const root = process.cwd();
-const spoolIn = path.join(root, '.gm', 'exec-spool', 'in');
-const spoolOut = path.join(root, '.gm', 'exec-spool', 'out');
-const pidFile = path.join(os.tmpdir(), 'gm-plugkit-spool.pid');
-if (fs.existsSync(pidFile)) {
-const pid = parseInt(fs.readFileSync(pidFile, 'utf8').trim(), 10);
-if (Number.isFinite(pid)) { try { process.kill(pid); } catch (_) {} }
-try { fs.unlinkSync(pidFile); } catch (_) {}
-}
-if (process.platform === 'win32') {
-try { spawnSync('taskkill', ['/F', '/IM', 'node.exe'], { windowsHide: true, timeout: 3000, stdio: 'ignore' }); } catch (_) {}
-} else {
-try { spawnSync('pkill', ['-f', 'plugkit'], { timeout: 3000, stdio: 'ignore' }); } catch (_) {}
-}
-fs.mkdirSync(spoolIn, { recursive: true });
-fs.mkdirSync(spoolOut, { recursive: true });
-const proc = spawn('node', [bin, 'runner', '--watch', spoolIn, '--out', spoolOut], {
-detached: true, stdio: 'ignore', windowsHide: true, cwd: root,
-});
-proc.unref();
-fs.writeFileSync(pidFile, String(proc.pid));
-```
-
-Wait 2s for watcher to initialize, then proceed with orient.
-
-## Orient
-
-Open every plan with one parallel pack of recall + codesearch against the request's nouns. Write queries to `.gm/exec-spool/in/recall/<N>.txt` and `.gm/exec-spool/in/codesearch/<N>.txt`. Read results from `.gm/exec-spool/out/<N>.out`. Hits land as `weak_prior`; misses confirm the unknown is fresh. The pack runs in one message.
-
-**Auto-recall injection (skills-only platforms)**: derive a 2–6 word query from the request's nouns (subject, verb objects, key domain terms). Write recall query to `.gm/exec-spool/in/recall/<N>.txt` at PLAN start before writing `.gm/prd.yml`. Read result from `.gm/exec-spool/out/<N>.out`. This replaces the prompt-submit hook's auto-recall for platforms without hook infrastructure. Recall hits are injected as context into mutable discovery and PRD item acceptance criteria.
-
-## Mutable discovery
-
-For each aspect of the work, ask: what do I not know, what could go wrong, what depends on what, what am I assuming. Unwitnessed assumptions are mutables.
-
-Fault surfaces to scan: file existence, API shape, data format, dep versions, runtime behavior, env differences, error conditions, concurrency, integration seams, backwards compat, rollback paths, CI correctness.
-
-Tag every item with a route family (grounding | reasoning | state | execution | observability | boundary | representation) and cross-reference the 16-failure taxonomy. `governance` skill holds the table.
-
-`existingImpl=UNKNOWN` is the default; resolve via codesearch (write to `.gm/exec-spool/in/codesearch/<N>.txt`) before adding the item. An existing concern routes to consolidation, not addition.
-
-Plan exits when zero new unknowns surfaced last pass AND every item has acceptance criteria AND deps are mapped.
-
-## .gm/mutables.yml — co-equal with .gm/prd.yml
-
-Every unknown surfaced during PLAN lands as an entry in `.gm/mutables.yml` the same pass. Live during work, deleted when empty. Self-enforced: never use Write/Edit/NotebookEdit, never run `git commit`/`git push`, never stop the turn while any entry has `status: unknown`. This discipline is owned by the agent — not by external infrastructure.
-
-```yaml
-- id: kebab-id
-claim: One-line statement of what is assumed
-witness_method: codesearch <query> | nodejs import | recall <query> | Read <path>
-witness_evidence: ""
-status: unknown
-```
-
-`status: unknown` → `witnessed` only when `witness_evidence` is filled with concrete proof (file:line, codesearch hit, dispatched test output). Resolution lives in gm-execute. PRD items reference mutables via optional `mutables: [id1, id2]` field; an item is blocked while any referenced mutable is unresolved.
-
-## .prd format
-
-Path: `./.gm/prd.yml`. Write via the Write tool or by emitting a nodejs spool file (`in/nodejs/<N>.js`) that calls `fs.writeFileSync`. Delete the file when empty.
-
-```yaml
-- id: kebab-id
-subject: Imperative verb phrase
-status: pending
-description: Precise criterion
-effort: small|medium|large
-category: feature|bug|refactor|infra
-route_family: grounding|reasoning|state|execution|observability|boundary|representation
-load: 0.0-1.0
-failure_modes: []
-route_fit: unexamined|examined|dominant
-authorization: none|weak_prior|witnessed
-blocking: []
-blockedBy: []
-acceptance:
-- binary criterion
-edge_cases:
-- failure mode
-```
-
-`load` is consequence-if-wrong: 0.9 = headline collapses, 0.7 = sub-argument rebuilt, 0.4 = local patch, 0.1 = nothing breaks. Verification budget = `load × (1 − tier_confidence)`. λ>0.75 must reach witnessed before EMIT.
-
-`status`: pending → in_progress → completed (then remove). `effort`: small <15min | medium <45min | large >1h.
-
-## Parallel subagent launch
-
-After `.prd` is written, up to 3 parallel `gm:gm` subagents for independent items in one message. Browser tasks serialize.
-
-```
-Agent(subagent_type="gm:gm", prompt="Work on .prd item: <id>. .prd path: <path>. Item: <full YAML>.")
-```
-
-Items not parallelizable → invoke `gm-execute` directly.
-
-## Observability gates in the plan
-
-Server: every subsystem exposes `/debug/<subsystem>`; structured logs `{subsystem, severity, ts}`. Client: `window.__debug` live registry; modules register on mount. `console.log` is not observability. Discovery of a gap during PLAN adds a `.prd` item the same pass — never deferred.
-
-`window.__debug` is THE in-page registry; `test.js` at project root is the sole out-of-page test asset. Any new file whose purpose is to exercise, smoke-test, demo, or sandbox in-page behavior outside that registry fights the discipline — extend the registry instead.
-
-## Test discipline encoded in the plan
-
-One `test.js` at project root, 200-line hard cap, real data, real system. No fixtures, mocks, or scattered tests. A second test runner under any name in any directory is a smuggled parallel surface.
-
-The 200 lines are a *budget* for maximum surface coverage, not a target. Subsystems get one combined group each — names joined with `+` (`home+config+skin`, `mcp+swe+distributions+account+credpool`). When a new subsystem's failure mode overlaps an existing group's side-effects, fold the assertion in rather than creating a new group. When `wc -l test.js > 200`, the discipline is *merge groups + drop redundancy*, never split.
-
-## Execution norms encoded in the plan
-
-Code execution AND utility verbs both write to `.gm/exec-spool/in/<lang-or-verb>/<N>.<ext>`. Languages live under `in/<lang>/` (nodejs, python, bash, typescript, go, rust, c, cpp, java, deno); verbs live under `in/<verb>/` (codesearch, recall, memorize, wait, sleep, status, close, browser, runner, type, kill-port, forget, feedback, learn-status, learn-debug, learn-build, discipline, pause, health). The spool watcher runs the file and streams to `out/<N>.out` (stdout) + `out/<N>.err` (stderr) line-by-line, then writes `out/<N>.json` metadata (exitCode, durationMs, timedOut, startedAt, endedAt) at completion. Both streams return as systemMessage with `--- stdout ---` / `--- stderr ---` separators. `in/` and `out/` are wiped at session start and at real-exit session end. Only `git` (and `gh`) run directly via Bash; never `Bash(node/npm/npx/bun)`, never `Bash(exec:<anything>)`. Spool paths in nodejs files are platform-literal — use `os.tmpdir()` and `path.join`. The spool enforces per-task timeouts; on timeout, partial output is preserved and the watcher emits `[exec timed out after Nms; partial output above]`.
-
-Codesearch only — never use Grep/Glob/Find/Explore. Write to `.gm/exec-spool/in/codesearch/<N>.txt`. Start two words, change/add one per pass, minimum four attempts before concluding absent.
-
-Pack runs use `Promise.allSettled`, each idea its own try/catch, under 12s per call.
-
-## Dev workflow encoded in the plan
-
-No comments. 200-line per-file cap. Fail loud. No duplication. Scan before edit. AGENTS.md edits route through the memorize sub-agent only. CHANGELOG.md gets one entry per commit.
-
-Minimal-code process, stop at the first that resolves: native → library → structure (map / pipeline) → write.
-
-## Marker File Protocol
-
-PLAN phase writes THREE marker files before transitioning to EXECUTE:
-1. `.gm/prd.yml` — the work items (already written per PRD format above)
-2. `.gm/needs-gm` — empty marker file signaling PRD is ready for orchestration
-3. `.gm/gm-fired-<sessionId>` — signals that gm orchestration (planning) has run and cleared the gate
-
-When `.gm/prd.yml` and `.gm/needs-gm` both exist, downstream tools check `.gm/gm-fired-<sessionId>` marker. If missing, tool execution blocks with reason: "gm orchestration in progress; skills must complete work before tools execute." This gate prevents tool use from tools that run BEFORE the orchestration phase is complete. Writing the marker clears this gate.
-
-Write all three markers as final step of PLAN:
-```
-const fs = require('fs');
-const path = require('path');
-const sessionId = process.env.SESSION_ID || 'default';
-fs.mkdirSync(path.join(process.cwd(), '.gm'), { recursive: true });
-fs.writeFileSync(path.join(process.cwd(), '.gm', 'needs-gm'), '');
-fs.writeFileSync(path.join(process.cwd(), '.gm', `gm-fired-${sessionId}`), '');
-```
-
-Transition to `gm-execute` (or `gm-gm` subagent) immediately after writing all files. No stop-for-approval; the transition is automatic.
+Read `out/<N>.json::nextSkill`. Invoke `Skill(skill="gm:<nextSkill>")` immediately. End of skill.
|
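The planning text removed in this release defines a verification budget as `load × (1 − tier_confidence)` and requires items with λ above 0.75 to reach `witnessed` before EMIT. The arithmetic can be sketched as follows. This is an illustration under assumptions, not code from the package: it assumes λ denotes the item's `load`, that `tier_confidence` is a number in [0, 1] (the source does not define it), and that "witnessed" refers to the item's `authorization` field. Both function names are hypothetical.

```javascript
// Illustrative sketch; function names and the [0, 1] range for
// tierConfidence are assumptions, not part of gm or plugkit.
function verificationBudget(load, tierConfidence) {
  const inUnitRange = (v) => Number.isFinite(v) && v >= 0 && v <= 1;
  if (!inUnitRange(load) || !inUnitRange(tierConfidence)) {
    throw new RangeError('load and tierConfidence must be numbers in [0, 1]');
  }
  // Budget grows with the cost of being wrong and shrinks with confidence.
  return load * (1 - tierConfidence);
}

// Assumes λ is the item's load and that "witnessed" is read from the
// PRD item's authorization field. Returns true while the item blocks EMIT.
function blocksEmit(item) {
  return item.load > 0.75 && item.authorization !== 'witnessed';
}
```

Under these assumptions, an item with `load: 0.9` and a tier confidence of 0.5 gets a budget of 0.45, and it keeps blocking EMIT until its `authorization` becomes `witnessed`.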
package/skills/update-docs/SKILL.md
CHANGED
@@ -1,66 +1,22 @@
 ---
 name: update-docs
 description: UPDATE-DOCS phase. Refresh README.md, AGENTS.md, and docs/index.html to reflect changes made this session. Commits and pushes doc updates. Terminal phase — declares COMPLETE.
+allowed-tools: Skill, Read, Write
 ---
 
-#
+# update-docs — UPDATE-DOCS phase
 
-
+Dispatch via plugkit spool. Write `.gm/exec-spool/in/<verb>/<N>.txt` with the request body. Read response from `.gm/exec-spool/out/<N>.json`. Follow `nextSkill` field via `Skill()` immediately — no stop, no approval, no narration.
 
-
+Verbs for this phase:
+- `phase-status` — read current FSM state
+- `transition` — advance phase
+- `mutable-resolve` — mark mutable witnessed (auto-fires memorize)
+- `memorize-fire` — explicit memorize dispatch
+- Plus existing: `recall`, `codesearch`, `memorize`, `health`, all language stems
 
-
+The plugkit orchestrator owns state. The skill is one line: dispatch and follow.
 
-
+## Transition
 
-
-git log -5 --oneline
-git diff HEAD~1 --stat
-```
-
-Read current docs via Read tool, or via a nodejs spool file (`in/nodejs/<N>.js`):
-
-```
-const fs = require('fs');
-['README.md', 'AGENTS.md', 'docs/index.html', 'gm-starter/agents/gm.md'].forEach(f => {
-try { console.log(`=== ${f} ===\n` + fs.readFileSync(f, 'utf8')); }
-catch(e) { console.log(`MISSING: ${f}`); }
-});
-```
-
-Write changed sections only:
-
-- **README.md** — platform count, skill tree diagram, quick-start commands
-- **AGENTS.md** — via `Agent(subagent_type='gm:memorize', model='haiku', run_in_background=true, prompt='## CONTEXT TO MEMORIZE\n<learnings>')`. Never inline-edit.
-- **docs/index.html** — `PHASES` array, platform lists, state machine diagram
-- **gm-starter/agents/gm.md** — skill chain line if new skills added
-
-Verify from disk (Read tool, or a nodejs spool file):
-
-```
-const content = require('fs').readFileSync('/abs/path/file.md', 'utf8');
-console.log(content.includes('expectedString'), content.length);
-```
-
-Commit and push directly via Bash:
-
-```
-git add README.md docs/index.html gm-starter/agents/gm.md
-git commit -m "docs: update documentation to reflect session changes"
-git push -u origin HEAD
-```
-
-## Exit: browser cleanup
-
-After docs push succeeds, close any browser sessions spawned during this or prior skill phases. Write a nodejs spool file calling rs-exec:
-
-```javascript
-const sessionId = process.env.CLAUDE_SESSION_ID;
-if (!sessionId) return;
-const rs = require('rs-exec');
-try {
-rs.client().close_sessions_for(sessionId).catch(() => {});
-} catch (e) {}
-```
-
-Best-effort: session context or rs-exec unavailable → skip gracefully. No error thrown.
+Read `out/<N>.json::nextSkill`. Invoke `Skill(skill="gm:<nextSkill>")` immediately. End of skill.
|
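Both rewritten skills reduce to the same loop: write a request to `.gm/exec-spool/in/<verb>/<N>.txt`, read `.gm/exec-spool/out/<N>.json`, and follow its `nextSkill` field. A minimal Node.js sketch of that loop is shown here. It assumes the response file is a JSON object and that the watcher has finished writing it by the time it is read; the diff does not show the rest of the response shape. The helper names are hypothetical and are not part of plugkit.

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical helper: drop a request body into the spool for one verb.
function writeRequest(projectRoot, verb, n, body) {
  const dir = path.join(projectRoot, '.gm', 'exec-spool', 'in', verb);
  fs.mkdirSync(dir, { recursive: true });
  const file = path.join(dir, `${n}.txt`);
  fs.writeFileSync(file, body);
  return file;
}

// Hypothetical helper: return the parsed response, or null if the
// watcher has not written out/<N>.json yet.
function readResponse(projectRoot, n) {
  const file = path.join(projectRoot, '.gm', 'exec-spool', 'out', `${n}.json`);
  if (!fs.existsSync(file)) return null;
  return JSON.parse(fs.readFileSync(file, 'utf8'));
}

// Map a response to the skill identifier used in Skill(skill="gm:<nextSkill>").
// Returns null when the response carries no usable nextSkill.
function nextSkillName(response) {
  if (!response || typeof response.nextSkill !== 'string' || response.nextSkill === '') {
    return null;
  }
  return `gm:${response.nextSkill}`;
}
```

A caller would write the request, poll `readResponse` until it returns a non-null value, and pass the result of `nextSkillName` to `Skill()`. Polling interval and timeout handling are left out because the diff does not specify them.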