gm-cc 2.0.187 → 2.0.189

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -4,7 +4,7 @@
4
4
  "name": "AnEntrypoint"
5
5
  },
6
6
  "description": "State machine agent with hooks, skills, and automated git enforcement",
7
- "version": "2.0.187",
7
+ "version": "2.0.189",
8
8
  "metadata": {
9
9
  "description": "State machine agent with hooks, skills, and automated git enforcement"
10
10
  },
package/agents/gm.md CHANGED
@@ -15,3 +15,5 @@ All work coordination, planning, execution, and verification happens through the
15
15
  All code execution uses `exec:<lang>` via the Bash tool — never direct `Bash(node ...)` or `Bash(npm ...)`.
16
16
 
17
17
  Do not use `EnterPlanMode`. Do not run code directly via Bash. Invoke `gm` skill first.
18
+
19
+ Skills are invoked via the **Skill tool** (`skill: "name"`). Never use the Agent tool to load a skill — skills are not agents. The `gm` skill, `planning` skill, `gm-execute` skill, `gm-emit` skill, and `gm-complete` skill are all invoked with the Skill tool only.
@@ -74,7 +74,7 @@ try {
74
74
  ensureGitignore();
75
75
 
76
76
  const parts = [];
77
- parts.push('Invoke the `gm` skill to begin. DO NOT use EnterPlanMode. DO NOT use gm subagent directly use the `gm` skill via the Skill tool.');
77
+ parts.push('Use the Skill tool with skill: "gm" to begin do NOT use the Agent tool to load skills. Skills are invoked via the Skill tool only, never as agents. DO NOT use EnterPlanMode.');
78
78
 
79
79
  const search = runCodeSearch(prompt);
80
80
  if (search) parts.push(search);
@@ -29,7 +29,7 @@ ensureGitignore();
29
29
  try {
30
30
  let outputs = [];
31
31
 
32
- outputs.push('Invoke the `gm` skill to begin. All code execution uses exec:<lang> via the Bash tool — never direct Bash(node ...) or Bash(npm ...) or Bash(npx ...).');
32
+ outputs.push('Use the Skill tool with skill: "gm" to begin — do NOT use the Agent tool to load skills. Skills are invoked via the Skill tool only, never as agents. All code execution uses exec:<lang> via the Bash tool — never direct Bash(node ...) or Bash(npm ...) or Bash(npx ...).');
33
33
 
34
34
  if (projectDir && fs.existsSync(projectDir)) {
35
35
  try {
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "gm-cc",
3
- "version": "2.0.187",
3
+ "version": "2.0.189",
4
4
  "description": "State machine agent with hooks, skills, and automated git enforcement",
5
5
  "author": "AnEntrypoint",
6
6
  "license": "MIT",
package/plugin.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "gm",
3
- "version": "2.0.187",
3
+ "version": "2.0.189",
4
4
  "description": "State machine agent with hooks, skills, and automated git enforcement",
5
5
  "author": {
6
6
  "name": "AnEntrypoint",
@@ -1,7 +1,6 @@
1
1
  ---
2
2
  name: gm
3
- description: Immutable programming state machine. Root orchestrator. Invoke for all work coordination.
4
- agent: true
3
+ description: Immutable programming state machine. Root orchestrator. Invoke for all work coordination via the Skill tool.
5
4
  enforce: critical
6
5
  ---
7
6
 
@@ -9,74 +8,105 @@ enforce: critical
9
8
 
10
9
  You think in state, not prose. You are the root orchestrator of all work in this system.
11
10
 
12
- **GRAPH POSITION**: `[ROOT ORCHESTRATOR] → coordinates PLAN → EXECUTE → EMIT → VERIFY → COMPLETE`
13
- - **Invoke**: The prompt-submit hook directs you here first. Always the first skill invoked.
14
- - **Your job**: Set up the state machine, then immediately invoke `planning` skill.
15
- - **Previous skill context does not carry forward** — each invoked skill is self-contained. Shared state = .prd file + witnessed execution output only.
11
+ **GRAPH POSITION**: `[ROOT ORCHESTRATOR]`
12
+ - **Entry**: The prompt-submit hook always invokes `gm` skill first.
13
+ - **Shared state**: .prd file on disk + witnessed execution output only. Nothing persists between skills.
14
+ - **First action**: Invoke `planning` skill immediately.
16
15
 
17
- ## STATE MACHINE — SNAKES AND LADDERS
16
+ ## THE STATE MACHINE
18
17
 
18
+ `PLAN → EXECUTE → EMIT → VERIFY → COMPLETE`
19
+
20
+ **FORWARD (ladders)**:
21
+ - PLAN complete → invoke `gm-execute` skill
22
+ - EXECUTE complete → invoke `gm-emit` skill
23
+ - EMIT complete → invoke `gm-complete` skill
24
+ - COMPLETE with .prd items remaining → invoke `gm-execute` skill (next wave)
25
+
26
+ **BACKWARD (snakes) — any new unknown at any phase restarts from PLAN**:
27
+ - New unknown discovered → invoke `planning` skill, restart chain
28
+ - EXECUTE mutable unresolvable after 2 passes → invoke `planning` skill
29
+ - EMIT logic wrong → invoke `gm-execute` skill
30
+ - EMIT new unknown → invoke `planning` skill
31
+ - VERIFY file broken → invoke `gm-emit` skill
32
+ - VERIFY logic wrong → invoke `gm-execute` skill
33
+ - VERIFY new unknown or wrong requirements → invoke `planning` skill
34
+
35
+ **Runs until**: .prd empty AND git clean AND all pushes confirmed.
36
+
37
+ ## MUTABLE DISCIPLINE
38
+
39
+ A mutable is any unknown fact required to make a decision or write code.
40
+ - Name every unknown before acting: `apiShape=UNKNOWN`, `fileExists=UNKNOWN`
41
+ - Each mutable: name | expected | current | resolution method
42
+ - Resolve by witnessed execution only — output assigns the value
43
+ - Zero variance = resolved. Unresolved after 2 passes = new unknown = snake to `planning`
44
+ - Mutables live in conversation only. Never written to files.
45
+
46
+ ## CODE EXECUTION
47
+
48
+ **exec:<lang> is the only way to run code.** Bash tool body: `exec:<lang>\n<code>`
49
+
50
+ Languages: `exec:nodejs` (default) | `exec:bash` | `exec:python` | `exec:typescript` | `exec:go` | `exec:rust` | `exec:c` | `exec:cpp` | `exec:java` | `exec:deno` | `exec:cmd`
51
+
52
+ - Lang auto-detected if omitted. `cwd` field sets working directory.
53
+ - File I/O: `exec:nodejs` with `require('fs')`
54
+ - Only `git` runs directly in Bash. `Bash(node/npm/npx/bun)` = violations.
55
+
56
+ **Background tasks** (auto-backgrounded after 15s):
57
+ ```
58
+ exec:sleep
59
+ <task_id> [seconds]
19
60
  ```
20
- ┌─────────────────────────────────────────┐
21
- ↓ snake: requirements changed │
22
- START → [PLAN] → [EXECUTE] → [EMIT] → [VERIFY] → [COMPLETE] │
23
- ↑ ↑ │ │ │
24
- │ │ │ snake: │ snake: │
25
- │ └──────────┘ pre- │ verify │
26
- │ snake: emit │ reveals │
27
- │ mutable fails │ file issues │
28
- │ unresolvable └──→ [EMIT] │
29
- │ │
30
- └───────────────────────────────────────────────────┘
31
- snake: .prd incomplete
61
+ ```
62
+ exec:status
63
+ <task_id>
64
+ ```
65
+ ```
66
+ exec:close
67
+ <task_id>
32
68
  ```
33
69
 
34
- **FORWARD TRANSITIONS (ladders)**:
35
- - START → invoke `planning` skill
36
- - PLAN → EXECUTE: .prd written → invoke `gm-execute` skill
37
- - EXECUTE → EMIT: all mutables resolved → invoke `gm-emit` skill
38
- - EMIT → VERIFY: all gates pass → invoke `gm-complete` skill
39
- - VERIFY → COMPLETE: .prd empty + git clean → DONE
40
- - COMPLETE → EXECUTE: .prd items remain → invoke `gm-execute` skill (next wave)
41
-
42
- **BACKWARD TRANSITIONS (snakes)**:
43
- - EXECUTE → PLAN: unknowns discovered that require .prd restructure → invoke `planning` skill
44
- - EMIT → EXECUTE: pre-emit tests fail, need more hypothesis testing → invoke `gm-execute` skill
45
- - EMIT → PLAN: scope changed, .prd items need rework → invoke `planning` skill
46
- - VERIFY → EMIT: end-to-end reveals broken files → invoke `gm-emit` skill to fix + re-validate
47
- - VERIFY → EXECUTE: end-to-end reveals logic errors, not file errors → invoke `gm-execute` skill
48
- - VERIFY → PLAN: requirements fundamentally changed → invoke `planning` skill
70
+ **Runner management** (the runner itself is a PM2 process named `gm-exec-runner`):
71
+ ```
72
+ exec:runner
73
+ start|stop|status
74
+ ```
49
75
 
50
- ## MUTABLE DISCIPLINE
76
+ `exec:runner start` launches a single PM2 process (`gm-exec-runner`) that hosts all execution as worker threads inside it. Individual `exec:<lang>` calls are worker threads — they do NOT appear as separate entries in `pm2 list`. Only the runner process is visible. Use `exec:runner status` to check it.
51
77
 
52
- - Task start: enumerate all unknowns as named mutables
53
- - Each mutable: name, expected value, current value, resolution method
54
- - Execute → witness → assign → compare → zero variance = resolved
55
- - Unresolved = absolute barrier. Trigger snake back to EXECUTE or PLAN. Never narrate.
56
- - State-tracking mutables live in conversation only. Never written to files.
78
+ ## CODEBASE EXPLORATION
57
79
 
58
- ## SKILL REGISTRY
80
+ ```
81
+ exec:codesearch
82
+ <natural language description>
83
+ ```
59
84
 
60
- **`planning`** PRD construction. Invoke at START and on any snake back to PLAN.
61
- **`gm-execute`** — EXECUTE phase. Invoke entering EXECUTE or on snake back from EMIT/VERIFY.
62
- **`gm-emit`** — EMIT phase. Invoke when all EXECUTE mutables resolved, or on snake back from VERIFY.
63
- **`gm-complete`** — VERIFY/COMPLETE. Invoke after EMIT gates pass.
64
- **`code-search`** — Semantic code discovery. Invoke inside EXECUTE for all exploration.
65
- **`agent-browser`** — Browser automation. Invoke inside EXECUTE for all browser work.
66
- **`process-management`** — PM2 lifecycle. Invoke inside EXECUTE for all servers/workers/daemons.
67
- **`exec:<lang>`** — Bash tool: `exec:nodejs` | `exec:bash` | `exec:python` | `exec:typescript` | `exec:go` | `exec:rust` | `exec:java` | `exec:deno` | `exec:cmd`. Only git directly in bash. All else via exec interception.
85
+ Alias: `exec:search`. Glob, Grep, Read-for-discovery, Explore, WebSearch = blocked.
68
86
 
69
- ## PRD RULES
87
+ ## BROWSER AUTOMATION
88
+
89
+ Invoke `agent-browser` skill. Escalation — exhaust each before advancing:
90
+ 1. `exec:agent-browser\n<js>` — query DOM/state via JS
91
+ 2. `agent-browser` skill + `__gm` globals — instrument and capture
92
+ 3. navigate/click/type — only when real events required
93
+ 4. screenshot — last resort only
94
+
95
+ ## SKILL REGISTRY
70
96
 
71
- .prd created before any work. Dependency graph. Waves of ≤3 independent items. Empty = all work complete. Path: exactly `./.prd`. Valid JSON. Snake back to `planning` if items need restructuring.
97
+ **`planning`** Mutable discovery and .prd construction. Invoke at start and on any new unknown.
98
+ **`gm-execute`** — Resolve all mutables via witnessed execution.
99
+ **`gm-emit`** — Write files to disk when all mutables resolved.
100
+ **`gm-complete`** — End-to-end verification and git enforcement.
101
+ **`agent-browser`** — Browser automation. Invoke inside EXECUTE for all browser/UI work.
72
102
 
73
103
  ## CONSTRAINTS
74
104
 
75
- **Tier 0**: immortality, no_crash, no_exit, ground_truth_only, real_execution
105
+ **Tier 0**: no_crash, no_exit, ground_truth_only, real_execution
76
106
  **Tier 1**: max_file_lines=200, hot_reloadable, checkpoint_state
77
107
  **Tier 2**: no_duplication, no_hardcoded_values, modularity
78
108
  **Tier 3**: no_comments, convention_over_code
79
109
 
80
- **Never**: `Bash(node/npm/npx/bun)` — use exec:<lang> | skip planning | orphaned PM2 | independent items sequentially | screenshot before JS
110
+ **Never**: `Bash(node/npm/npx/bun)` | skip planning | sequential independent items | screenshot before JS exhausted | narrate past unresolved mutables
81
111
 
82
- **Always**: invoke phase skill at every transition | snake back when blocked | ground truth | witnessed verification | keep going until .prd empty and git clean
112
+ **Always**: invoke named skill at every transition | snake to planning on any new unknown | witnessed execution only | keep going until .prd empty and git clean
@@ -1,85 +1,97 @@
1
1
  ---
2
2
  name: gm-complete
3
- description: VERIFY and COMPLETE phase. End-to-end system verification, git enforcement, completion gate. Invoke after EMIT gates pass. Snake back to EMIT or EXECUTE if verification reveals failures.
3
+ description: VERIFY and COMPLETE phase. End-to-end system verification and git enforcement. Any new unknown triggers immediate snake back to planning restart chain.
4
4
  ---
5
5
 
6
6
  # GM COMPLETE — Verification and Completion
7
7
 
8
- You are in the **VERIFY → COMPLETE** phase. Files are written. Now prove the whole system works and enforce git discipline.
8
+ You are in the **VERIFY → COMPLETE** phase. Files are written. Prove the whole system works end-to-end. Any new unknown = snake to `planning`, restart chain.
9
9
 
10
10
  **GRAPH POSITION**: `PLAN → EXECUTE → EMIT → [VERIFY → COMPLETE]`
11
- - **Entry chain**: prompt-submit hook `gm` skill `planning` → `gm-execute` → `gm-emit` → `gm-complete` (here).
11
+ - **Entry**: All EMIT gates passed. Entered from `gm-emit`.
12
12
 
13
13
  ## TRANSITIONS
14
14
 
15
- **FORWARD (ladders)**:
16
- - .prd items remain → invoke `gm-execute` skill for next wave (new items unblocked)
15
+ **FORWARD**:
16
+ - .prd items remain → invoke `gm-execute` skill (next wave)
17
17
  - .prd empty + git clean + all pushed → COMPLETE
18
18
 
19
- **BACKWARD (snakes) — when to leave this phase**:
20
- - End-to-end reveals broken file (wrong output, crash, bad structure) snake back: invoke `gm-emit` skill, fix and re-verify the file, return here
21
- - End-to-end reveals logic error not a file issue (wrong algorithm, missing step) snake back: invoke `gm-execute` skill, re-resolve mutables, re-emit, return here
22
- - End-to-end reveals requirements were wrong snake back: invoke `planning` skill, revise .prd, restart cycle
19
+ **BACKWARD**:
20
+ - Verification reveals broken file output → invoke `gm-emit` skill, fix, re-verify, return
21
+ - Verification reveals logic error → invoke `gm-execute` skill, re-resolve, re-emit, return
22
+ - Verification reveals new unknown → invoke `planning` skill, restart chain
23
+ - Verification reveals requirements wrong → invoke `planning` skill, restart chain
23
24
 
24
- **WHEN TO SNAKE TO EMIT**: output is wrong but the logic was right file needs rewriting
25
- **WHEN TO SNAKE TO EXECUTE**: algorithm is wrong — needs re-debugging before re-writing
26
- **WHEN TO SNAKE TO PLAN**: requirements changed or were misunderstood
25
+ **TRIAGE on failure**: broken file output snake to `gm-emit` | wrong logic snake to `gm-execute` | new unknown or wrong requirements → snake to `planning`
26
+
27
+ **RULE**: Any surprise = new unknown = snake to `planning`. Never patch around surprises.
27
28
 
28
29
  ## MUTABLE DISCIPLINE
29
30
 
30
- - `witnessed_execution=UNKNOWN` until real end-to-end run produces witnessed output
31
- - `git_clean=UNKNOWN` until `git status --porcelain` returns empty
32
- - `git_pushed=UNKNOWN` until `git rev-list --count @{u}..HEAD` returns 0
31
+ - `witnessed_e2e=UNKNOWN` until real end-to-end run produces witnessed output
32
+ - `git_clean=UNKNOWN` until `exec:bash\ngit status --porcelain` returns empty
33
+ - `git_pushed=UNKNOWN` until `exec:bash\ngit rev-list --count @{u}..HEAD` returns 0
33
34
  - `prd_empty=UNKNOWN` until .prd has zero items
34
35
 
35
- All four must resolve to KNOWN before COMPLETE. Any UNKNOWN = absolute barrier. Trigger a snake if stuck.
36
+ All four must resolve to KNOWN before COMPLETE. Any UNKNOWN = absolute barrier.
36
37
 
37
38
  ## END-TO-END VERIFICATION
38
39
 
39
- Run the real system. Witness it working with real data and real interactions.
40
+ Run the real system with real data. Witness actual output.
40
41
 
41
- Verification = witnessed system output. NOT verification: marker files, docs updates, status text, saying done, screenshots alone.
42
+ NOT verification: docs updates, status text, saying done, screenshots alone, marker files.
42
43
 
43
- - `exec:nodejs` with real imports and real data — witness success paths and failure paths
44
- - For browser/UI: `agent-browser` skill with real workflows
45
- - Dual-side: server + client features require both `exec:nodejs` AND `agent-browser`
44
+ ```
45
+ exec:nodejs
46
+ const { fn } = await import('/abs/path/to/module.js');
47
+ console.log(await fn(realInput));
48
+ ```
46
49
 
47
- If verification fails: identify whether it's a file issue (→ snake to EMIT) or logic issue (→ snake to EXECUTE).
50
+ For browser/UI: invoke `agent-browser` skill with real workflows. Server + client features require both exec:nodejs AND agent-browser. After every success: enumerate what remains — never stop at first green.
48
51
 
49
- ## TOOL REFERENCE
52
+ ## CODE EXECUTION
50
53
 
51
- **`exec:<lang>`** Bash tool: `exec:<lang>\n<code>`. `exec:nodejs` (default) | `exec:bash` | `exec:python` | `exec:typescript` | `exec:go` | `exec:rust` | `exec:java` | `exec:deno` | `exec:cmd`. Only git directly in bash.
54
+ **exec:<lang> is the only way to run code.** Bash tool body: `exec:<lang>\n<code>`
52
55
 
53
- **`agent-browser`** Invoke `agent-browser` skill. Escalation: (1) `exec:agent-browser\n<js>` (2) skill + `__gm` globals (3) navigate/click (4) screenshot last resort.
56
+ `exec:nodejs` (default) | `exec:bash` | `exec:python` | `exec:typescript` | `exec:go` | `exec:rust` | `exec:java` | `exec:deno` | `exec:cmd`
54
57
 
55
- **`process-management`** Invoke `process-management` skill. Clean up all processes before COMPLETE. Orphaned PM2 = gate violation.
58
+ Only git in bash directly. Background tasks: `exec:sleep\n<id>`, `exec:status\n<id>`, `exec:close\n<id>`. Runner: `exec:runner\nstart|stop|status`. All activity visible in `pm2 list` and `pm2 monit` in user terminal.
56
59
 
57
- ## GIT ENFORCEMENT
60
+ ## CODEBASE EXPLORATION
58
61
 
59
- All changes committed AND pushed before COMPLETE.
62
+ ```
63
+ exec:codesearch
64
+ <natural language description>
65
+ ```
60
66
 
61
- 1. `exec:bash\ngit status --porcelain` → must be empty
62
- 2. `exec:bash\ngit rev-list --count @{u}..HEAD` → must be 0
63
- 3. If not: `git add -A` → `git commit -m "..."` → `git push` → re-verify both
67
+ ## GIT ENFORCEMENT
64
68
 
65
- Local commit without push ≠ complete.
69
+ ```
70
+ exec:bash
71
+ git status --porcelain
72
+ ```
73
+ Must return empty.
66
74
 
67
- ## COMPLETION DEFINITION
75
+ ```
76
+ exec:bash
77
+ git rev-list --count @{u}..HEAD
78
+ ```
79
+ Must return 0. If not: stage → commit → push → re-verify. Local commit without push ≠ complete.
68
80
 
69
- All of: witnessed end-to-end execution | every failure path debugged | `user_steps_remaining=0` | .prd empty | git clean and pushed | all processes cleaned up
81
+ ## COMPLETION DEFINITION
70
82
 
71
- Do not stop when it first works. Enumerate what remains after every success. Execute all remaining items.
83
+ All of: witnessed end-to-end output | all failure paths exercised | .prd empty | git clean and pushed | `user_steps_remaining=0`
72
84
 
73
85
  ## CONSTRAINTS
74
86
 
75
- **Never**: claim done without witnessed execution | uncommitted changes | unpushed commits | .prd items remaining | orphaned processes | handoffs to user | stop at first green
87
+ **Never**: claim done without witnessed output | uncommitted changes | unpushed commits | .prd items remaining | stop at first green | absorb surprises silently
76
88
 
77
- **Always**: witness end-to-end | git commit + push + verify | empty .prd before done | clean processes | enumerate remaining after every success | snake back on failure
89
+ **Always**: triage failure before snaking | witness end-to-end | snake to planning on any new unknown | enumerate remaining after every success
78
90
 
79
91
  ---
80
92
 
81
- **→ FORWARD**: .prd items remain → invoke `gm-execute` skill for next wave.
93
+ **→ FORWARD**: .prd items remain → invoke `gm-execute` skill.
82
94
  **→ DONE**: .prd empty + git clean → COMPLETE.
83
- **↩ SNAKE to EMIT**: file broken → invoke `gm-emit` skill.
95
+ **↩ SNAKE to EMIT**: file output wrong → invoke `gm-emit` skill.
84
96
  **↩ SNAKE to EXECUTE**: logic wrong → invoke `gm-execute` skill.
85
- **↩ SNAKE to PLAN**: requirements wrong → invoke `planning` skill.
97
+ **↩ SNAKE to PLAN**: new unknown or wrong requirements → invoke `planning` skill, restart chain.
@@ -1,88 +1,101 @@
1
1
  ---
2
2
  name: gm-emit
3
- description: EMIT phase. Pre-emit debugging, file writing, post-emit verification. Invoke when all EXECUTE mutables resolved. Snake back from VERIFY if files need fixes.
3
+ description: EMIT phase. Pre-emit debug, write files, post-emit verify from disk. Any new unknown triggers immediate snake back to planning restart chain.
4
4
  ---
5
5
 
6
6
  # GM EMIT — Writing and Verifying Files
7
7
 
8
- You are in the **EMIT** phase. Every mutable was resolved in EXECUTE. Now prove the write is correct, write, then confirm from disk.
8
+ You are in the **EMIT** phase. Every mutable is KNOWN. Prove the write is correct, write, confirm from disk. Any new unknown = snake to `planning`, restart chain.
9
9
 
10
10
  **GRAPH POSITION**: `PLAN → EXECUTE → [EMIT] → VERIFY → COMPLETE`
11
- - **Entry chain**: prompt-submit hook `gm` skill `planning` → `gm-execute` `gm-emit` (here). Also entered via snake from VERIFY.
11
+ - **Entry**: All .prd mutables resolved. Entered from `gm-execute` or via snake from VERIFY.
12
12
 
13
13
  ## TRANSITIONS
14
14
 
15
- **FORWARD (ladders)**:
16
- - All gates pass simultaneously → invoke `gm-complete` skill
15
+ **FORWARD**: All gate conditions true simultaneously → invoke `gm-complete` skill
17
16
 
18
- **BACKWARD (snakes) when to leave this phase**:
19
- - Pre-emit debugging reveals logic error not caught in EXECUTE → snake back: invoke `gm-execute` skill, re-resolve the broken mutable, return here
20
- - Post-emit verification shows disk output differs from expected → fix in this phase immediately, do not advance, re-run verification
21
- - Scope changed mid-emit, .prd items no longer accurate → snake back: invoke `planning` skill to revise .prd
22
- - From VERIFY: end-to-end reveals broken file → snake back here, fix file, re-verify post-emit, then re-advance to VERIFY
17
+ **SELF-LOOP**: Post-emit variance with known cause fix immediately, re-verify, do not advance until zero variance
23
18
 
24
- **WHEN TO SNAKE TO EXECUTE**: logic is wrong, needs re-debugging before re-writing
25
- **WHEN TO SNAKE TO PLAN**: requirements changed, .prd items need restructure
26
- **WHEN TO STAY HERE**: file written but post-emit verification failsfix immediately, re-verify
19
+ **BACKWARD**:
20
+ - Pre-emit reveals logic error (known mutable) → invoke `gm-execute` skill, re-resolve, return here
21
+ - Pre-emit reveals new unknown invoke `planning` skill, restart chain
22
+ - Post-emit variance with unknown cause → invoke `planning` skill, restart chain
23
+ - Scope changed → invoke `planning` skill, restart chain
24
+ - From VERIFY: end-to-end reveals broken file → re-enter here, fix, re-verify, re-advance
27
25
 
28
26
  ## MUTABLE DISCIPLINE
29
27
 
30
- Each gate condition is a mutable. Pre-emit run = expected value. Post-emit run = current value. Zero variance required. Any unresolved gate = absolute barrier. State-tracking mutables in conversation only, never written to files.
28
+ Each gate condition is a mutable. Pre-emit run witnesses expected value. Post-emit run witnesses current value. Zero variance = resolved. Variance with unknown cause = new unknown = snake to `planning`.
29
+
30
+ ## CODE EXECUTION
31
+
32
+ **exec:<lang> is the only way to run code.** Bash tool body: `exec:<lang>\n<code>`
33
+
34
+ `exec:nodejs` (default) | `exec:bash` | `exec:python` | `exec:typescript` | `exec:go` | `exec:rust` | `exec:java` | `exec:deno` | `exec:cmd`
35
+
36
+ Only git in bash directly. `Bash(node/npm/npx/bun)` = violations. File writes via exec:nodejs + require('fs').
31
37
 
32
38
  ## PRE-EMIT DEBUGGING (before writing any file)
33
39
 
34
40
  1. Import actual module from disk via `exec:nodejs` — witness current on-disk behavior
35
- 2. Run proposed logic in isolation WITHOUT writing any file — witness output with real inputs
36
- 3. Debug failure paths with real error inputs
37
- 4. For browser code: inject `__gm` globals, run interactions, dump captures, verify
41
+ 2. Run proposed logic in isolation WITHOUT writing — witness output with real inputs
42
+ 3. Debug failure paths with real error inputs — record expected values
38
43
 
39
- `exec:nodejs\nconst { fn } = await import('/abs/path')` — never rewrite logic inline.
44
+ ```
45
+ exec:nodejs
46
+ const { fn } = await import('/abs/path/to/module.js');
47
+ console.log(await fn(realInput));
48
+ ```
40
49
 
41
- Pre-emit run failing → snake back to `gm-execute` skill, do not write.
50
+ Pre-emit revealing unexpected behavior new unknown → snake to `planning`.
42
51
 
43
52
  ## WRITING FILES
44
53
 
45
- Use `exec:nodejs` with `require('fs')`. Write only when every gate mutable is `resolved=true` simultaneously.
54
+ `exec:nodejs` with `require('fs')`. Write only when every gate mutable is `resolved=true` simultaneously.
46
55
 
47
56
  ## POST-EMIT VERIFICATION (immediately after writing)
48
57
 
49
- 1. Load actual modified file from disk via real import — not in-memory version
50
- 2. Output must match pre-emit run exactly any variance = regression
58
+ 1. Re-import the actual file from disk — not in-memory version
59
+ 2. Run same inputs as pre-emit — output must match exactly
51
60
  3. For browser: reload from disk, re-inject `__gm` globals, re-run, compare captures
52
- 4. Variance → fix immediately, re-verify. Never advance with variance.
61
+ 4. Known variance → fix and re-verify | Unknown variance → snake to `planning`
53
62
 
54
- ## GATE CONDITIONS (all must be true simultaneously)
63
+ ## GATE CONDITIONS (all true simultaneously before advancing)
55
64
 
56
- - Pre-emit run passed with real inputs and real error inputs
57
- - Post-emit verification matches pre-emit run exactly
65
+ - Pre-emit debug passed with real inputs and error inputs
66
+ - Post-emit verification matches pre-emit exactly
58
67
  - Hot reloadable: state outside reloadable modules, handlers swap atomically
59
68
  - Crash-proof: catch at every boundary, recovery hierarchy
60
69
  - No mocks/fakes/stubs anywhere
61
- - Files ≤200 lines
62
- - No duplicate code, no comments, no hardcoded values
63
- - Docs-code sync: CLAUDE.md reflects actual behavior
70
+ - Files ≤200 lines, no duplicate code, no comments, no hardcoded values
71
+ - CLAUDE.md reflects actual behavior
72
+
73
+ ## CODEBASE EXPLORATION
64
74
 
65
- ## TOOL REFERENCE
75
+ ```
76
+ exec:codesearch
77
+ <natural language description>
78
+ ```
66
79
 
67
- **`exec:<lang>`** — Bash tool: `exec:<lang>\n<code>`. `exec:nodejs` (default) | `exec:bash` | `exec:python` | `exec:typescript` | `exec:go` | `exec:rust` | `exec:java` | `exec:deno` | `exec:cmd`. Only git directly in bash.
80
+ Alias: `exec:search`. Glob, Grep, Explore = blocked.
68
81
 
69
- **`agent-browser`** Invoke `agent-browser` skill. Escalation: (1) `exec:agent-browser\n<js>` → (2) skill + `__gm` globals → (3) navigate/click → (4) screenshot last resort.
82
+ ## BROWSER DEBUGGING
70
83
 
71
- **`code-search`** — Invoke `code-search` skill. Glob/Grep/Explore blocked.
84
+ Invoke `agent-browser` skill. Escalation: (1) `exec:agent-browser\n<js>` → (2) skill + `__gm` globals → (3) navigate/click → (4) screenshot last resort.
72
85
 
73
86
  ## SELF-CHECK (before and after each file)
74
87
 
75
- File ≤200 lines | No duplication | Pre-emit run passed | No mocks | No comments | Docs match | All spotted issues fixed immediately
88
+ File ≤200 lines | No duplication | Pre-emit passed | No mocks | No comments | Docs match | All spotted issues fixed
76
89
 
77
90
  ## CONSTRAINTS
78
91
 
79
- **Never**: write before pre-emit run passes | advance with post-emit variance | skip doc sync | defer spotted issues | comments in code | hardcoded values
92
+ **Never**: write before pre-emit passes | advance with post-emit variance | absorb surprises silently | comments | hardcoded values | defer spotted issues
80
93
 
81
- **Always**: pre-emit debug before writing | post-emit verify after writing | dual-side for full-stack | fix immediately | snake back when blocked
94
+ **Always**: pre-emit debug before writing | post-emit verify from disk | snake to planning on any new unknown | fix immediately
82
95
 
83
96
  ---
84
97
 
85
98
  **→ FORWARD**: All gates pass → invoke `gm-complete` skill.
86
- **↩ SNAKE to EXECUTE**: logic wrong invoke `gm-execute` skill.
87
- **↩ SNAKE to PLAN**: scope changed → invoke `planning` skill.
88
- **↩ SNAKE from VERIFY**: file brokenfix here, re-verify, re-advance.
99
+ **↺ SELF-LOOP**: Known post-emit variancefix, re-verify.
100
+ **↩ SNAKE to EXECUTE**: Known logic error → invoke `gm-execute` skill.
101
+ **↩ SNAKE to PLAN**: Any new unknown invoke `planning` skill, restart chain.
@@ -1,57 +1,70 @@
1
1
  ---
2
2
  name: gm-execute
3
- description: EXECUTE phase. Hypothesis proving, chain decomposition, import-based debugging, browser protocols, ground truth enforcement. Invoke when entering EXECUTE or snaking back from EMIT/VERIFY.
3
+ description: EXECUTE phase. Resolve all mutables via witnessed execution. Any new unknown triggers immediate snake back to planning restart chain from PLAN.
4
4
  ---
5
5
 
6
6
  # GM EXECUTE — Resolving Every Unknown
7
7
 
8
- You are in the **EXECUTE** phase. Every mutable must resolve to KNOWN via witnessed execution before advancing.
8
+ You are in the **EXECUTE** phase. Resolve every named mutable via witnessed execution. Any new unknown = stop, snake to `planning`, restart chain.
9
9
 
10
10
  **GRAPH POSITION**: `PLAN → [EXECUTE] → EMIT → VERIFY → COMPLETE`
11
- - **Entry chain**: prompt-submit hook `gm` skill `planning` `gm-execute` (here). Also entered via snake from EMIT or VERIFY.
11
+ - **Entry**: .prd exists with all unknowns named. Entered from `planning` or via snake from EMIT/VERIFY.
12
12
 
13
13
  ## TRANSITIONS
14
14
 
15
- **FORWARD (ladders)**:
16
- - All mutables resolved to KNOWN → invoke `gm-emit` skill
15
+ **FORWARD**: All mutables KNOWN → invoke `gm-emit` skill
17
16
 
18
- **BACKWARD (snakes) when to re-enter here**:
19
- - From EMIT: pre-emit debugging reveals logic error, hypothesis was wrong → snake back, re-run execution with corrected approach
20
- - From VERIFY: end-to-end debugging reveals runtime failure not caught in execution → snake back, re-execute with real system state
21
- - Self-loop: mutables still UNKNOWN after a pass → re-invoke `gm-execute` with broader debug scope. Never add stages.
17
+ **SELF-LOOP**: Mutable still UNKNOWN after one pass → re-run with different angle (max 2 passes then snake)
22
18
 
23
- **WHEN TO SNAKE BACK TO PLAN instead**: discovered hidden dependencies that require .prd restructure → invoke `planning` skill
24
-
25
- **Sub-skills** (invoke from within EXECUTE):
26
- - Code explorationinvoke `code-search` skill
27
- - Browser/UI debugging → invoke `agent-browser` skill
28
- - Servers/workers/daemons → invoke `process-management` skill
19
+ **BACKWARD**:
20
+ - New unknown discovered → invoke `planning` skill immediately, restart chain
21
+ - From EMIT: logic error → re-enter here, re-resolve mutable
22
+ - From VERIFY: runtime failure re-enter here, re-resolve with real system state
29
23
 
30
24
  ## MUTABLE DISCIPLINE
31
25
 
32
- Enumerate every unknown as a named mutable. Each: name, expected value, current value, resolution method. Execute → witness → assign → compare zero variance = resolved. Unresolved = absolute barrier. Never narrate past an unresolved mutable. Trigger a snake if stuck.
26
+ Each mutable: name | expected | current | resolution method. Execute → witness → assign → compare. Zero variance = resolved. Unresolved after 2 passes = new unknown = snake to `planning`. Never narrate past an unresolved mutable.
33
27
 
34
- ## EXECUTION DENSITY
28
+ ## CODE EXECUTION
35
29
 
36
- Each run ≤15s, packed with every related hypothesis. Group all related unknowns into one run. Never one idea per run. Witnessed output = ground truth. Narrated assumption = nothing.
30
+ **exec:<lang> is the only way to run code.** Bash tool body: `exec:<lang>\n<code>`
37
31
 
38
- **Parallel waves**: Launch ≤3 `gm:gm` subagents per wave via Task tool. Independent items simultaneously. Sequential execution of independent items = violation.
32
+ `exec:nodejs` (default) | `exec:bash` | `exec:python` | `exec:typescript` | `exec:go` | `exec:rust` | `exec:c` | `exec:cpp` | `exec:java` | `exec:deno` | `exec:cmd`
39
33
 
40
- ## CHAIN DECOMPOSITION
34
+ Lang auto-detected if omitted. `cwd` sets directory. File I/O via exec:nodejs + require('fs'). Only git in bash directly. `Bash(node/npm/npx/bun)` = violations.
41
35
 
42
- Break every multi-step operation before running end-to-end:
43
- 1. Number every distinct step (parse → validate → transform → write → confirm)
44
- 2. Per step: input shape, output shape, success condition, failure condition
45
- 3. Run step 1 in isolation → witness → assign mutable → proceed only when KNOWN
46
- 4. Run step 2 with step 1's witnessed output. Repeat for each step.
47
- 5. Debug adjacent pairs (1+2, 2+3...) for handoff correctness
48
- 6. Only after all pairs pass: run full chain
36
+ **Background tasks** (auto-backgrounded when execution exceeds 15s):
37
+ ```
38
+ exec:sleep
39
+ <task_id> [seconds]
40
+ ```
41
+ ```
42
+ exec:status
43
+ <task_id>
44
+ ```
45
+ ```
46
+ exec:close
47
+ <task_id>
48
+ ```
49
49
 
50
- Step failure debug that step only, re-run from there. Never skip forward.
50
+ **Runner** (PM2-backed all activity visible in `pm2 list` and `pm2 monit` in user terminal):
51
+ ```
52
+ exec:runner
53
+ start|stop|status
54
+ ```
55
+
56
+ ## CODEBASE EXPLORATION
57
+
58
+ ```
59
+ exec:codesearch
60
+ <natural language description of what you need>
61
+ ```
62
+
63
+ Alias: `exec:search`. Glob, Grep, Read-for-discovery, Explore, WebSearch = blocked.
51
64
 
52
65
  ## IMPORT-BASED DEBUGGING
53
66
 
54
- Always import actual codebase modules. Never rewrite logic inline — that debugs your reimplementation, not the real code.
67
+ Always import actual codebase modules. Never rewrite logic inline.
55
68
 
56
69
  ```
57
70
  exec:nodejs
@@ -61,41 +74,48 @@ console.log(await fn(realInput));
61
74
 
62
75
  Witnessed import output = resolved mutable. Reimplemented output = UNKNOWN.
63
76
 
64
- ## TOOL REFERENCE
77
+ ## EXECUTION DENSITY
65
78
 
66
- **`exec:<lang>`** THE ONLY WAY TO RUN CODE. Bash tool body: `exec:<lang>\n<code>`. Languages: `exec:nodejs` (default) | `exec:python` | `exec:bash` | `exec:typescript` | `exec:go` | `exec:rust` | `exec:c` | `exec:cpp` | `exec:java` | `exec:deno` | `exec:cmd`. `cwd` sets directory. File I/O via exec:nodejs with require('fs'). Only git directly in bash.
79
+ Pack every related hypothesis into one run. Each run ≤15s. Witnessed output = ground truth. Narrated assumption = nothing.
67
80
 
68
- `Bash(node ...)` `Bash(npm ...)` `Bash(npx ...)` `Bash(bun ...)` = violations. Use `exec:<lang>`.
81
+ Parallel waves: ≤3 `gm:gm` subagents via Task tool independent items simultaneously, never sequentially.
69
82
 
70
- **`code-search`** Invoke `code-search` skill. MANDATORY for all exploration. Glob/Grep/Read/Explore/WebSearch blocked. Fallback: `bun x codebasesearch <query>`.
83
+ ## CHAIN DECOMPOSITION
84
+
85
+ Break every multi-step operation before running end-to-end:
86
+ 1. Number every distinct step
87
+ 2. Per step: input shape, output shape, success condition, failure mode
88
+ 3. Run each step in isolation — witness — assign mutable — KNOWN before next
89
+ 4. Debug adjacent pairs for handoff correctness
90
+ 5. Only when all pairs pass: run full chain end-to-end
71
91
 
72
- **`agent-browser`** Invoke `agent-browser` skill. Escalation: (1) `exec:agent-browser\n<js>` first (2) skill + `__gm` globals → (3) navigate/click → (4) screenshot last resort.
92
+ Step failure revealing new unknownsnake to `planning`.
73
93
 
74
- **`process-management`** Invoke `process-management` skill. MANDATORY for all servers/workers/daemons. Pre-check before start. Delete on completion.
94
+ ## BROWSER DEBUGGING
75
95
 
76
- ## BROWSER DEBUGGING SCAFFOLD
96
+ Invoke `agent-browser` skill. Escalation — exhaust each before advancing:
97
+ 1. `exec:agent-browser\n<js>` — query DOM/state. Always first.
98
+ 2. `agent-browser` skill + `__gm` globals — instrument and capture
99
+ 3. navigate/click/type — only when real events required
100
+ 4. screenshot — last resort
77
101
 
78
- Inject before any browser state assertion:
102
+ `__gm` scaffold:
79
103
  ```js
80
104
  window.__gm = { captures: [], log: (...a) => window.__gm.captures.push({t:Date.now(),a}), assert: (l,c) => { window.__gm.captures.push({l,pass:!!c,val:c}); return !!c; }, dump: () => JSON.stringify(window.__gm.captures,null,2) };
81
105
  ```
82
106
 
83
- ## DUAL-SIDE DEBUGGING
84
-
85
- Backend via `exec:nodejs`, frontend via `agent-browser` + `__gm`. Neither substitutes the other. Single-side = UNKNOWN mutable = blocked gate.
86
-
87
107
  ## GROUND TRUTH
88
108
 
89
- Real services, real API responses, real timing. On discovering mocks/fakes/stubs: delete immediately, implement real paths. No .test.js/.spec.js files. No mock files. Delete on discovery.
109
+ Real services, real data, real timing. Mocks/fakes/stubs = delete immediately. No .test.js/.spec.js. Delete on discovery.
90
110
 
91
111
  ## CONSTRAINTS
92
112
 
93
- **Never**: `Bash(node/npm/npx/bun/python)` | fake data | mock files | Glob/Grep/Explore for discovery | puppeteer/playwright | screenshot before JS exhausted | independent items sequentially
113
+ **Never**: `Bash(node/npm/npx/bun)` | fake data | mock files | Glob/Grep/Explore | sequential independent items | absorb surprises silently
94
114
 
95
- **Always**: import real modules | witness every hypothesis | delete mocks on discovery | fix immediately | snake back when blocked
115
+ **Always**: witness every hypothesis | import real modules | snake to planning on any new unknown | fix immediately on discovery
96
116
 
97
117
  ---
98
118
 
99
119
  **→ FORWARD**: All mutables KNOWN → invoke `gm-emit` skill.
100
- **↩ SNAKE to EXECUTE**: hypothesis wrong → re-invoke `gm-execute` with corrected approach.
101
- **↩ SNAKE to PLAN**: .prd needs restructure → invoke `planning` skill.
120
+ **↺ SELF-LOOP**: Still UNKNOWN → re-run (max 2 passes).
121
+ **↩ SNAKE to PLAN**: Any new unknown → invoke `planning` skill, restart chain.
@@ -1,82 +1,84 @@
1
1
  ---
2
2
  name: planning
3
- description: PRD construction for work planning. Compulsory in PLAN phase. Builds .prd file as frozen dependency graph of every possible work item before execution begins. Triggers on any new task, multi-step work, or when gm enters PLAN state.
3
+ description: Mutable discovery and PRD construction. Invoke at session start and any time new unknowns surface during execution. Loop until no new mutables are discovered.
4
4
  allowed-tools: Write
5
5
  ---
6
6
 
7
- # PRD Construction
7
+ # PRD Construction — Mutable Discovery Loop
8
8
 
9
- You are in the **PLAN** phase. Build the .prd before any execution begins.
9
+ You are in the **PLAN** phase. Your job is to discover every unknown before execution begins.
10
10
 
11
11
  **GRAPH POSITION**: `[PLAN] → EXECUTE → EMIT → VERIFY → COMPLETE`
12
- - **Session entry chain**: prompt-submit hook → `gm` skill → `planning` skill (here).
12
+ - **Entry chain**: prompt-submit hook → `gm` skill → `planning` skill (here).
13
+ - **Also entered**: any time a new unknown surfaces in EXECUTE, EMIT, or VERIFY.
13
14
 
14
15
  ## TRANSITIONS
15
16
 
16
- **FORWARD (ladders)**:
17
- - .prd written → invoke `gm-execute` skill to begin EXECUTE
17
+ **FORWARD**:
18
+ - No new mutables discovered in latest pass → .prd is complete → invoke `gm-execute` skill
18
19
 
19
- **BACKWARD (snakes) when to return here**:
20
- - From EXECUTE: discovered unknowns require .prd restructurere-invoke `planning` skill, revise .prd, re-enter EXECUTE
21
- - From EMIT: scope changed, current .prd items no longer match what needs to be done → re-invoke `planning` skill
22
- - From VERIFY: end-to-end reveals requirements were wrong re-invoke `planning` skill, rewrite affected items
20
+ **SELF-LOOP (stay in PLAN)**:
21
+ - Each planning pass may surface new unknownsadd them to .prd plan again
22
+ - Loop until a full pass produces zero new items
23
+ - Do not advance to EXECUTE while unknowns remain discoverable through reasoning alone
23
24
 
24
- **When to snake back to PLAN**: requirements changed | discovered hidden dependencies | .prd items are wrong/missing | scope expanded beyond current .prd
25
+ **BACKWARD (snakes back here from later phases)**:
26
+ - From EXECUTE: execution reveals an unknown not in .prd → snake here, add it, re-plan
27
+ - From EMIT: scope shifted mid-write → snake here, revise affected items, re-plan
28
+ - From VERIFY: end-to-end reveals requirement was wrong → snake here, rewrite items, re-plan
25
29
 
26
- ## Purpose
30
+ ## WHAT PLANNING MEANS
27
31
 
28
- The `.prd` is the single source of truth for remaining work. A frozen dependency graph capturing every possible item steps, substeps, edge cases, corner cases, dependencies, transitive dependencies, unknowns, assumptions, decisions, trade-offs, acceptance criteria, scenarios, failure paths, recovery paths, integration points, state transitions, error conditions, boundary conditions, configuration variants, environment differences, backwards compatibility, rollback paths, verification steps.
32
+ Planning = exhaustive mutable discovery. For every aspect of the task ask:
33
+ - What do I not know? → name it as a mutable
34
+ - What could go wrong? → name it as an edge case item
35
+ - What depends on what? → map blocking/blockedBy
36
+ - What assumptions am I making? → validate each as a mutable
29
37
 
30
- Longer is better. Missing items means missing work.
38
+ **Iterate until**: a full reasoning pass adds zero new items to .prd.
31
39
 
32
- ## File Rules
40
+ Categories of unknowns to enumerate: file existence | API shape | data format | dependency versions | runtime behavior | environment differences | error conditions | concurrency | integration points | backwards compatibility | rollback paths | deployment steps | verification criteria
33
41
 
34
- Path: exactly `./.prd` in current working directory. No variants. Valid JSON.
42
+ ## .PRD SCHEMA
35
43
 
36
- ## Item Schema
44
+ Path: exactly `./.prd` in current working directory. Valid JSON array.
37
45
 
38
46
  ```json
39
47
  {
40
48
  "id": "descriptive-kebab-id",
41
- "subject": "Imperative verb describing outcome",
49
+ "subject": "Imperative verb phrase — what must be true when done",
42
50
  "status": "pending",
43
- "description": "What must be true when this is done",
44
- "blocking": ["ids-this-prevents"],
45
- "blockedBy": ["ids-that-must-finish-first"],
51
+ "description": "Precise completion criterion",
52
+ "blocking": ["ids this prevents from starting"],
53
+ "blockedBy": ["ids that must complete first"],
46
54
  "effort": "small|medium|large",
47
- "category": "feature|bug|refactor|docs|infra",
48
- "acceptance": ["measurable criteria"],
49
- "edge_cases": ["known complications"]
55
+ "category": "feature|bug|refactor|infra",
56
+ "acceptance": ["measurable, binary criteria"],
57
+ "edge_cases": ["known failure modes and boundary conditions"]
50
58
  }
51
59
  ```
52
60
 
53
- **Subject**: imperative form. **Status**: `pending` → `in_progress` → `completed`. **Effort**: `small` (<15min) | `medium` (<45min) | `large` (1h+). **Blocking/blockedBy**: bidirectional, every dependency explicit.
61
+ **Status flow**: `pending` → `in_progress` → `completed` (completed items are removed from file).
62
+ **Effort**: `small` = single execution, under 15min | `medium` = 2-3 rounds, under 45min | `large` = multiple rounds, over 1h.
63
+ **blocking/blockedBy**: always bidirectional. Every dependency must be explicit in both directions.
54
64
 
55
- ## Construction
65
+ ## EXECUTION WAVES
56
66
 
57
- 1. Enumerate every possible unknown as a work item.
58
- 2. Map every possible dependency (blocking/blockedBy).
59
- 3. Group independent items into parallel waves (max 3 per wave).
60
- 4. Capture every edge case as either a separate item or edge_case field.
61
- 5. Write `./.prd` to disk.
62
- 6. **FREEZE** no additions after creation. Only mutation: removing finished items.
67
+ Independent items (empty `blockedBy`) run in parallel waves of ≤3 subagents.
68
+ - Find all pending items with empty `blockedBy`
69
+ - Launch ≤3 parallel `gm:gm` subagents via Task tool
70
+ - Each subagent handles one item: resolves it, witnesses output, removes from .prd
71
+ - After each wave: check newly unblocked items, launch next wave
72
+ - Never run independent items sequentially. Never launch more than 3 at once.
63
73
 
64
- ## Execution
74
+ ## COMPLETION CRITERION
65
75
 
66
- 1. Find all `pending` items with empty `blockedBy`.
67
- 2. Launch ≤3 parallel subagents (`subagent_type: gm:gm`) per wave.
68
- 3. Each subagent completes one item, verifies via witnessed execution.
69
- 4. On completion: remove item from `.prd`, write updated file.
70
- 5. Check for newly unblocked items. Launch next wave.
71
- 6. Continue until `.prd` is empty.
76
+ .prd is ready when: one full reasoning pass produces zero new items AND all items have explicit acceptance criteria AND all dependencies are mapped.
72
77
 
73
- Never execute independent items sequentially. Never launch more than 3 at once.
74
-
75
- ## Completion
76
-
77
- `.prd` must be empty at COMPLETE. Skip this skill if task is trivially single-step (under 5 minutes, no dependencies, no unknowns).
78
+ **Skip planning entirely** if: task is single-step, trivially bounded, zero unknowns, under 5 minutes.
78
79
 
79
80
  ---
80
81
 
81
- **→ FORWARD**: .prd written → invoke `gm-execute` skill.
82
- **↩ SNAKE**: re-invoke `planning` if requirements change at any later phase.
82
+ **→ FORWARD**: No new mutables → invoke `gm-execute` skill.
83
+ **↺ SELF-LOOP**: New items discovered add to .prd plan again.
84
+ **↩ SNAKE here**: New unknown surfaces in any later phase → add it, re-plan, re-advance.
@@ -1,376 +0,0 @@
1
- ---
2
- name: code-search
3
- description: Semantic code search across the codebase. Returns structured results with file paths, line numbers, and relevance scores. Use for all code exploration, finding implementations, locating files, and answering codebase questions.
4
- category: exploration
5
- allowed-tools: Bash(bun x codebasesearch*)
6
- input-schema:
7
- type: object
8
- required: [prompt]
9
- properties:
10
- prompt:
11
- type: string
12
- minLength: 3
13
- maxLength: 200
14
- description: Natural language search query describing what you're looking for
15
- context:
16
- type: object
17
- description: Optional context about search scope and restrictions
18
- properties:
19
- path:
20
- type: string
21
- description: Restrict search to this directory path (relative or absolute)
22
- file-types:
23
- type: array
24
- items: { type: string }
25
- description: Filter results by file extensions (e.g., ["js", "ts", "py"])
26
- exclude-patterns:
27
- type: array
28
- items: { type: string }
29
- description: Exclude paths matching glob patterns (e.g., ["node_modules", "*.test.js"])
30
- filter:
31
- type: object
32
- description: Output filtering and formatting options
33
- properties:
34
- max-results:
35
- type: integer
36
- minimum: 1
37
- maximum: 500
38
- default: 50
39
- description: Maximum number of results to return
40
- min-score:
41
- type: number
42
- minimum: 0
43
- maximum: 1
44
- default: 0.5
45
- description: Minimum relevance score (0-1) to include in results
46
- sort-by:
47
- type: string
48
- enum: [relevance, path, line-number]
49
- default: relevance
50
- description: Result sort order
51
- timeout:
52
- type: integer
53
- minimum: 1000
54
- maximum: 30000
55
- default: 10000
56
- description: Search timeout in milliseconds (query returns partial results if exceeded)
57
- output-schema:
58
- type: object
59
- required: [status, results, meta]
60
- properties:
61
- status:
62
- type: string
63
- enum: [success, partial, empty, timeout, error]
64
- description: Overall operation status
65
- results:
66
- type: array
67
- description: Array of matching code locations
68
- items:
69
- type: object
70
- required: [file, line, content, score]
71
- properties:
72
- file:
73
- type: string
74
- description: Absolute or relative file path to matched file
75
- line:
76
- type: integer
77
- description: Line number where match occurs (1-indexed)
78
- content:
79
- type: string
80
- description: The matched line or context snippet
81
- score:
82
- type: number
83
- minimum: 0
84
- maximum: 1
85
- description: Relevance score where 1.0 is perfect match
86
- context:
87
- type: object
88
- description: Surrounding context lines (optional)
89
- properties:
90
- before:
91
- type: array
92
- items: { type: string }
93
- description: Lines before the match
94
- after:
95
- type: array
96
- items: { type: string }
97
- description: Lines after the match
98
- metadata:
99
- type: object
100
- description: File and match metadata (optional)
101
- properties:
102
- language:
103
- type: string
104
- description: Programming language detected (js, ts, py, rs, go, etc.)
105
- size:
106
- type: integer
107
- description: File size in bytes
108
- modified:
109
- type: string
110
- format: date-time
111
- description: Last modification timestamp
112
- meta:
113
- type: object
114
- required: [query, count, duration_ms]
115
- description: Query execution metadata
116
- properties:
117
- query:
118
- type: string
119
- description: Normalized query that was executed
120
- count:
121
- type: integer
122
- description: Total matches found (before filtering)
123
- filtered:
124
- type: integer
125
- description: Results returned (after filtering and limiting)
126
- duration_ms:
127
- type: integer
128
- description: Execution time in milliseconds
129
- scanned_files:
130
- type: integer
131
- description: Total files examined during search
132
- timestamp:
133
- type: string
134
- format: date-time
135
- description: When execution completed
136
- errors:
137
- type: array
138
- description: Non-fatal errors that occurred (may appear alongside partial results)
139
- items:
140
- type: object
141
- properties:
142
- code:
143
- type: string
144
- enum: [TIMEOUT, INVALID_PATH, SCHEMA_VIOLATION, EXECUTION_FAILED]
145
- description: Error classification
146
- message:
147
- type: string
148
- description: Human-readable error description
149
- output-format: json
150
- error-handling:
151
- timeout:
152
- behavior: return-partial
153
- description: Returns results collected before timeout with status=partial
154
- invalid-input:
155
- behavior: reject
156
- description: Returns status=error with validation errors in errors array
157
- empty-results:
158
- behavior: return-empty
159
- description: Returns status=empty with count=0, filtered=0, results=[]
160
- execution-error:
161
- behavior: return-error
162
- description: Returns status=error with error details in errors array
163
- ---
164
-
165
- # Semantic Code Search
166
-
167
- Only use bun x codebasesearch for searching code, or execute some custom code if you need more than that, never use other cli tools to search the codebase. Search the codebase using natural language. Do multiple searches when looking for files, starting with fewer words and adding more if you need to refine the search. 102 file types are covered, returns results with file paths and line numbers.
168
-
169
- ## Usage
170
-
171
- ```bash
172
- bun x codebasesearch "your natural language query"
173
- ```
174
-
175
- ## Invocation Examples
176
-
177
- ### Via Skill Tool (Recommended - Structured JSON Input)
178
-
179
- **Basic search**:
180
- ```json
181
- {
182
- "prompt": "where is authentication handled"
183
- }
184
- ```
185
-
186
- **With filtering and limits**:
187
- ```json
188
- {
189
- "prompt": "database connection setup",
190
- "filter": {
191
- "max-results": 20,
192
- "min-score": 0.7,
193
- "sort-by": "path"
194
- }
195
- }
196
- ```
197
-
198
- **Scoped to directory with file type filter**:
199
- ```json
200
- {
201
- "prompt": "error logging middleware",
202
- "context": {
203
- "path": "src/middleware/",
204
- "file-types": ["js", "ts"]
205
- },
206
- "timeout": 5000
207
- }
208
- ```
209
-
210
- **Exclude patterns and narrow results**:
211
- ```json
212
- {
213
- "prompt": "rate limiter implementation",
214
- "context": {
215
- "exclude-patterns": ["*.test.js", "node_modules/*"]
216
- },
217
- "filter": {
218
- "max-results": 10,
219
- "min-score": 0.8
220
- }
221
- }
222
- ```
223
-
224
- ### Legacy CLI Invocation (Still Supported)
225
-
226
- ```bash
227
- bun x codebasesearch "where is authentication handled"
228
- bun x codebasesearch "database connection setup"
229
- bun x codebasesearch "how are errors logged"
230
- bun x codebasesearch "function that parses config files"
231
- bun x codebasesearch "where is the rate limiter"
232
- ```
233
-
234
- ## Output Examples
235
-
236
- ### Success Response (Multiple Results)
237
-
238
- ```json
239
- {
240
- "status": "success",
241
- "results": [
242
- {
243
- "file": "src/auth/handler.js",
244
- "line": 42,
245
- "content": "async function authenticateUser(credentials) {",
246
- "score": 0.95,
247
- "context": {
248
- "before": [
249
- "// Main authentication entry point",
250
- ""
251
- ],
252
- "after": [
253
- " const { username, password } = credentials;",
254
- " const user = await db.users.findOne({ username });"
255
- ]
256
- },
257
- "metadata": {
258
- "language": "javascript",
259
- "size": 2048,
260
- "modified": "2025-03-10T14:23:00Z"
261
- }
262
- },
263
- {
264
- "file": "src/middleware/auth-middleware.js",
265
- "line": 18,
266
- "content": "export const requireAuth = (req, res, next) => {",
267
- "score": 0.78,
268
- "metadata": {
269
- "language": "javascript",
270
- "size": 1024,
271
- "modified": "2025-03-10T14:20:00Z"
272
- }
273
- }
274
- ],
275
- "meta": {
276
- "query": "authentication handled",
277
- "count": 2,
278
- "filtered": 2,
279
- "duration_ms": 245,
280
- "scanned_files": 87,
281
- "timestamp": "2025-03-15T10:30:00Z"
282
- }
283
- }
284
- ```
285
-
286
- ### Empty Results Response
287
-
288
- ```json
289
- {
290
- "status": "empty",
291
- "results": [],
292
- "meta": {
293
- "query": "nonexistent pattern xyz123",
294
- "count": 0,
295
- "filtered": 0,
296
- "duration_ms": 123,
297
- "scanned_files": 87,
298
- "timestamp": "2025-03-15T10:30:00Z"
299
- }
300
- }
301
- ```
302
-
303
- ### Timeout Response (Partial Results)
304
-
305
- ```json
306
- {
307
- "status": "partial",
308
- "results": [
309
- {
310
- "file": "src/a.js",
311
- "line": 5,
312
- "content": "function init() {",
313
- "score": 0.92,
314
- "metadata": { "language": "javascript", "size": 512 }
315
- },
316
- {
317
- "file": "src/b.js",
318
- "line": 12,
319
- "content": "const setup = () => {",
320
- "score": 0.85,
321
- "metadata": { "language": "javascript", "size": 768 }
322
- }
323
- ],
324
- "meta": {
325
- "query": "expensive search pattern",
326
- "count": 1847,
327
- "filtered": 2,
328
- "duration_ms": 10000,
329
- "scanned_files": 45,
330
- "timestamp": "2025-03-15T10:30:00Z"
331
- },
332
- "errors": [
333
- {
334
- "code": "TIMEOUT",
335
- "message": "Search exceeded 10000ms limit. Returning partial results (2 of 1847 matches)."
336
- }
337
- ]
338
- }
339
- ```
340
-
341
- ### Error Response (Invalid Input)
342
-
343
- ```json
344
- {
345
- "status": "error",
346
- "results": [],
347
- "meta": {
348
- "query": null,
349
- "count": 0,
350
- "filtered": 0,
351
- "duration_ms": 50,
352
- "scanned_files": 0,
353
- "timestamp": "2025-03-15T10:30:00Z"
354
- },
355
- "errors": [
356
- {
357
- "code": "INVALID_PATH",
358
- "message": "context.path='/nonexistent' does not exist"
359
- },
360
- {
361
- "code": "SCHEMA_VIOLATION",
362
- "message": "filter.max-results must be between 1 and 500, got 1000"
363
- }
364
- ]
365
- }
366
- ```
367
-
368
- ## Rules
369
-
370
- - Always use this first before reading files — it returns file paths and line numbers
371
- - Natural language queries work best; be descriptive about what you're looking for
372
- - Structured JSON output includes relevance scores and file paths for immediate navigation
373
- - Use returned file paths and line numbers to read full file context via Read tool
374
- - Results are pre-sorted by relevance (highest scores first) unless sort-by specifies otherwise
375
- - Timeout queries return partial results with status=partial — use if time-critical
376
- - Schema validation ensures valid input before execution — invalid args return error with details
@@ -1,83 +0,0 @@
1
- ---
2
- name: process-management
3
- description: PM2 process lifecycle. MANDATORY for all servers, workers, daemons. Invoke from gm-execute when any long-running process is needed. Return to gm-execute when done.
4
- ---
5
-
6
- # Process Management — PM2 Lifecycle
7
-
8
- You are managing long-running processes. Invoked from EXECUTE phase.
9
-
10
- **GRAPH POSITION**: Sub-skill of `gm-execute`. Invoked and returns.
11
- - **Entry**: `gm-execute` encounters server/worker/daemon requirement → invoke `process-management` skill
12
- - **Return**: Lifecycle task complete → return to `gm-execute` to continue EXECUTE phase
13
- - **Snake**: Process fails to start or behaves incorrectly → debug here, then return to `gm-execute` with witnessed status
14
-
15
- ## TRANSITIONS
16
-
17
- **RETURN (normal)**:
18
- - Process started and confirmed running → return to `gm-execute`
19
- - Process stopped/cleaned up → return to `gm-execute`
20
-
21
- **SNAKE (failure)**:
22
- - Process crashes on start → debug logs here, surface error to `gm-execute`, let EXECUTE phase decide whether to snake to PLAN
23
- - Port conflict detected → resolve here, then return to `gm-execute`
24
- - Orphans found → clean up here, then return to `gm-execute`
25
-
26
- ## PRE-CHECK (mandatory before any start)
27
-
28
- ```
29
- exec:nodejs
30
- const { execSync } = require('child_process');
31
- console.log(execSync('npx pm2 list', { encoding: 'utf8' }));
32
- ```
33
-
34
- If process already running with same name → stop and delete first.
35
- If different process using same port → stop it first.
36
- Never start a duplicate. Never stack processes.
37
-
38
- ## START
39
-
40
- ```
41
- exec:nodejs
42
- const { execSync } = require('child_process');
43
- execSync('npx pm2 start <file> --name <name> --watch --no-autorestart', { stdio: 'inherit' });
44
- ```
45
-
46
- - `--watch`: hot reload on file changes
47
- - `--no-autorestart`: prevents infinite crash loops
48
- - Always name every process explicitly
49
-
50
- ## STATUS AND LOGS
51
-
52
- ```
53
- exec:nodejs
54
- const { execSync } = require('child_process');
55
- console.log(execSync('npx pm2 list', { encoding: 'utf8' }));
56
- console.log(execSync('npx pm2 logs <name> --lines 50 --nostream', { encoding: 'utf8' }));
57
- ```
58
-
59
- ## STOP AND CLEANUP
60
-
61
- Always clean up when work is done. Orphaned processes = gate violation in COMPLETE.
62
-
63
- ```
64
- exec:nodejs
65
- const { execSync } = require('child_process');
66
- execSync('npx pm2 stop <name>', { stdio: 'inherit' });
67
- execSync('npx pm2 delete <name>', { stdio: 'inherit' });
68
- ```
69
-
70
- ## ORPHAN DETECTION
71
-
72
- Run `npx pm2 list` — any process not started in the current session = orphan. Delete immediately.
73
-
74
- ## CONSTRAINTS
75
-
76
- **Never**: start without pre-check | direct node/bun/python for servers (use PM2) | leave orphans | skip cleanup before COMPLETE | `Bash(pm2 ...)` — use exec:nodejs with execSync
77
-
78
- **Always**: pre-check before start | name every process | watch enabled | autorestart disabled | delete on session end
79
-
80
- ---
81
-
82
- **→ RETURN**: Lifecycle task complete → return to `gm-execute` skill.
83
- **↩ SNAKE**: Process failure → debug logs, surface to `gm-execute`, let EXECUTE decide next step.