gm-cc 2.0.410 → 2.0.412
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/.claude-plugin/marketplace.json +1 -1
- package/package.json +1 -1
- package/plugin.json +1 -1
- package/skills/gm/SKILL.md +44 -32
- package/skills/gm-complete/SKILL.md +27 -20
- package/skills/gm-emit/SKILL.md +28 -22
- package/skills/gm-execute/SKILL.md +32 -29
- package/skills/planning/SKILL.md +33 -28
|
@@ -4,7 +4,7 @@
|
|
|
4
4
|
"name": "AnEntrypoint"
|
|
5
5
|
},
|
|
6
6
|
"description": "State machine agent with hooks, skills, and automated git enforcement",
|
|
7
|
-
"version": "2.0.
|
|
7
|
+
"version": "2.0.412",
|
|
8
8
|
"metadata": {
|
|
9
9
|
"description": "State machine agent with hooks, skills, and automated git enforcement"
|
|
10
10
|
},
|
package/package.json
CHANGED
package/plugin.json
CHANGED
package/skills/gm/SKILL.md
CHANGED
|
@@ -16,22 +16,25 @@ You think in state, not prose. You are the root orchestrator of all work in this
|
|
|
16
16
|
|
|
17
17
|
`PLAN → EXECUTE → EMIT → VERIFY → UPDATE-DOCS → COMPLETE`
|
|
18
18
|
|
|
19
|
-
|
|
20
|
-
|
|
21
|
-
|
|
22
|
-
-
|
|
23
|
-
-
|
|
24
|
-
|
|
25
|
-
|
|
26
|
-
-
|
|
27
|
-
|
|
28
|
-
|
|
29
|
-
-
|
|
30
|
-
-
|
|
31
|
-
-
|
|
32
|
-
-
|
|
33
|
-
|
|
34
|
-
|
|
19
|
+
Each state transition REQUIRES an explicit Skill tool invocation. Skills do not auto-chain. Failing to invoke the next skill is a critical violation.
|
|
20
|
+
|
|
21
|
+
**STATE TRANSITIONS (forward)**:
|
|
22
|
+
- PLAN state complete (zero new unknowns in last pass) → invoke `gm-execute` skill
|
|
23
|
+
- EXECUTE state complete (all mutables KNOWN) → invoke `gm-emit` skill
|
|
24
|
+
- EMIT state complete (all gate conditions pass) → invoke `gm-complete` skill
|
|
25
|
+
- VERIFY state: .prd items remain → invoke `gm-execute` skill (next wave)
|
|
26
|
+
- VERIFY state: .prd empty + pushed → invoke `update-docs` skill
|
|
27
|
+
|
|
28
|
+
**STATE REGRESSIONS (any new unknown triggers regression)**:
|
|
29
|
+
- New unknown discovered at any state → invoke `planning` skill, reset to PLAN
|
|
30
|
+
- EXECUTE mutable unresolvable after 2 passes → invoke `planning` skill, reset to PLAN
|
|
31
|
+
- EMIT logic error (known cause) → invoke `gm-execute` skill, reset to EXECUTE
|
|
32
|
+
- EMIT new unknown → invoke `planning` skill, reset to PLAN
|
|
33
|
+
- VERIFY broken file output → invoke `gm-emit` skill, reset to EMIT
|
|
34
|
+
- VERIFY logic wrong → invoke `gm-execute` skill, reset to EXECUTE
|
|
35
|
+
- VERIFY new unknown or wrong requirements → invoke `planning` skill, reset to PLAN
|
|
36
|
+
|
|
37
|
+
**Runs until**: .prd empty AND git clean AND all pushes confirmed AND CI green.
|
|
35
38
|
|
|
36
39
|
## MUTABLE DISCIPLINE
|
|
37
40
|
|
|
@@ -112,12 +115,21 @@ Invoke `browser` skill. Escalation — exhaust each before advancing:
|
|
|
112
115
|
|
|
113
116
|
## SKILL REGISTRY
|
|
114
117
|
|
|
115
|
-
**`planning`** — Mutable discovery and .prd construction. Invoke at start and on any new unknown.
|
|
116
|
-
**`gm-execute`** — Resolve all mutables via witnessed execution.
|
|
117
|
-
**`gm-emit`** — Write files to disk when all mutables resolved.
|
|
118
|
-
**`gm-complete`** — End-to-end verification and git enforcement.
|
|
119
|
-
**`update-docs`** — Refresh README, CLAUDE.md, and docs to reflect session changes. Invoked by `gm-complete`.
|
|
120
|
-
**`browser`** — Browser automation. Invoke inside EXECUTE for all browser/UI work.
|
|
118
|
+
**`planning`** — PLAN state. Mutable discovery and .prd construction. Invoke at start and on any new unknown. EXIT: invoke `gm-execute` skill when zero new unknowns in last pass.
|
|
119
|
+
**`gm-execute`** — EXECUTE state. Resolve all mutables via witnessed execution. EXIT: invoke `gm-emit` skill when all mutables KNOWN.
|
|
120
|
+
**`gm-emit`** — EMIT state. Write files to disk when all mutables resolved. EXIT: invoke `gm-complete` skill when all gate conditions pass.
|
|
121
|
+
**`gm-complete`** — VERIFY state. End-to-end verification and git enforcement. EXIT: invoke `gm-execute` if .prd items remain; invoke `update-docs` if .prd empty and pushed.
|
|
122
|
+
**`update-docs`** — Refresh README, CLAUDE.md, and docs to reflect session changes. Invoked by `gm-complete`. Terminal state — declares COMPLETE.
|
|
123
|
+
**`browser`** — Browser automation. Invoke inside EXECUTE state for all browser/UI work.
|
|
124
|
+
|
|
125
|
+
## PARALLEL SUBAGENTS (post-PLAN)
|
|
126
|
+
|
|
127
|
+
After `planning` skill completes and .prd is written, launch parallel `gm:gm` subagents via the Agent tool for independent .prd items:
|
|
128
|
+
- Find all pending items with empty `blockedBy`
|
|
129
|
+
- Launch ≤3 parallel subagents simultaneously: `Agent(subagent_type="gm:gm", prompt="...")`
|
|
130
|
+
- Each subagent handles one .prd item end-to-end through its own state machine
|
|
131
|
+
- Never run independent items sequentially — parallelism is mandatory
|
|
132
|
+
- Exception: items requiring `exec:browser` must run sequentially (one Chrome instance per project)
|
|
121
133
|
|
|
122
134
|
## DO NOT STOP
|
|
123
135
|
|
|
@@ -130,7 +142,7 @@ Completing a phase is NOT stopping. After every phase: read .prd, check git, inv
|
|
|
130
142
|
|
|
131
143
|
## MANDATORY DEV WORKFLOW — ABSOLUTE RULES
|
|
132
144
|
|
|
133
|
-
These rules apply to ALL
|
|
145
|
+
These rules apply to ALL states. Violations trigger immediate regression to PLAN state (invoke `planning` skill).
|
|
134
146
|
|
|
135
147
|
**FILES**:
|
|
136
148
|
- Permanent structure ONLY — NO ephemeral/temp/mock/simulation files. Use exec: and browser skill instead
|
|
@@ -144,8 +156,8 @@ These rules apply to ALL phases. Violations trigger immediate snake to planning.
|
|
|
144
156
|
|
|
145
157
|
**CODE QUALITY**:
|
|
146
158
|
- ALWAYS scan codebase (exec:codesearch) before editing — find everything that touches the same concern
|
|
147
|
-
- **Duplicate concern =
|
|
148
|
-
- After every file write: run exec:codesearch for the primary function/concern you just wrote. If ANY other code serves the same concern →
|
|
159
|
+
- **Duplicate concern = regress to PLAN**: overlapping responsibility, similar logic in different places, parallel implementations, or code that could be consolidated. Invoke `planning` skill with consolidation instructions
|
|
160
|
+
- After every file write: run exec:codesearch for the primary function/concern you just wrote. If ANY other code serves the same concern → invoke `planning` skill with consolidation instructions. This is not optional — it is a gate
|
|
149
161
|
- When a native feature, stdlib function, or convention replaces custom code → delete the custom code. When it would add code → do not use it
|
|
150
162
|
- When a naming convention, directory structure, or auto-discovery pattern can replace explicit registration or configuration → replace it
|
|
151
163
|
- ZERO hardcoded values — all values derived from ground truth, config, or convention
|
|
@@ -159,10 +171,11 @@ These rules apply to ALL phases. Violations trigger immediate snake to planning.
|
|
|
159
171
|
- The only acceptable error handling: catch → log the real error → re-throw or display to user
|
|
160
172
|
|
|
161
173
|
**DEBUGGING**:
|
|
162
|
-
- ALWAYS
|
|
163
|
-
-
|
|
164
|
-
-
|
|
165
|
-
- Clear cache before
|
|
174
|
+
- ALWAYS form a falsifiable hypothesis before touching any file — run it, witness the output, confirm or falsify
|
|
175
|
+
- Differential diagnosis: isolate the smallest unit reproducing the failure. Name the delta between expected and actual. That delta is the mutable.
|
|
176
|
+
- Check git history (`git log`, `git diff`) for regressions — never revert, use differential comparisons, edit new code manually
|
|
177
|
+
- Logs concise (<4k chars ideal, 30k max). Clear cache before browser debugging.
|
|
178
|
+
- Adjacent step pairs are the most common failure site in chains — debug handoffs, not just individual steps
|
|
166
179
|
|
|
167
180
|
**DOCUMENTATION** (update at every phase transition, not at the end):
|
|
168
181
|
- CLAUDE.md: after each structural change, update technical info. NO progress/changelogs
|
|
@@ -172,8 +185,7 @@ These rules apply to ALL phases. Violations trigger immediate snake to planning.
|
|
|
172
185
|
|
|
173
186
|
**PROCESS**:
|
|
174
187
|
- Only persistent background shells for long-running CLI processes
|
|
175
|
-
- Test via exec: and browser skill — NO test files ever
|
|
176
|
-
- Test locally before live
|
|
188
|
+
- Test via exec: and browser skill — NO test files ever. Test locally before live.
|
|
177
189
|
|
|
178
190
|
## CONSTRAINTS
|
|
179
191
|
|
|
@@ -184,4 +196,4 @@ These rules apply to ALL phases. Violations trigger immediate snake to planning.
|
|
|
184
196
|
|
|
185
197
|
**Never**: `Bash(node/npm/npx/bun)` | skip planning | sequential independent items | screenshot before JS exhausted | narrate past unresolved mutables | stop while .prd has items | ask the user what to do next while work remains | create fallback/demo modes | silently swallow errors | duplicate concern | leave comments | create test files | leave stale architecture when changes reveal restructuring opportunity
|
|
186
198
|
|
|
187
|
-
**Always**: invoke named skill at every transition |
|
|
199
|
+
**Always**: invoke named skill at every state transition | regress to planning on any new unknown | regress to planning when duplicate concern or restructuring opportunity discovered | witnessed execution only | scan codebase before edits | keep going until .prd deleted and git clean
|
|
@@ -12,19 +12,19 @@ You are in the **VERIFY → COMPLETE** phase. Files are written. Prove the whole
|
|
|
12
12
|
|
|
13
13
|
## TRANSITIONS
|
|
14
14
|
|
|
15
|
-
**
|
|
16
|
-
- .prd items remain → invoke `gm-execute` skill (next wave)
|
|
17
|
-
- .prd empty + feature work pushed → invoke `update-docs` skill
|
|
15
|
+
**EXIT — .prd items remain**: Verified items completed, .prd still has pending items → invoke `gm-execute` skill immediately (next wave). Do not stop.
|
|
18
16
|
|
|
19
|
-
**
|
|
20
|
-
- Verification reveals broken file output → invoke `gm-emit` skill, fix, re-verify, return
|
|
21
|
-
- Verification reveals logic error → invoke `gm-execute` skill, re-resolve, re-emit, return
|
|
22
|
-
- Verification reveals new unknown → invoke `planning` skill, restart chain
|
|
23
|
-
- Verification reveals requirements wrong → invoke `planning` skill, restart chain
|
|
17
|
+
**EXIT — COMPLETE**: .prd empty + all work pushed + CI green → invoke `update-docs` skill.
|
|
24
18
|
|
|
25
|
-
**
|
|
19
|
+
**STATE REGRESSIONS**:
|
|
20
|
+
- Verification reveals broken file output → invoke `gm-emit` skill, reset to EMIT state, re-verify on return
|
|
21
|
+
- Verification reveals logic error → invoke `gm-execute` skill, reset to EXECUTE state, re-emit and re-verify on return
|
|
22
|
+
- Verification reveals new unknown → invoke `planning` skill, reset to PLAN state
|
|
23
|
+
- Verification reveals wrong requirements → invoke `planning` skill, reset to PLAN state
|
|
26
24
|
|
|
27
|
-
**
|
|
25
|
+
**TRIAGE on failure**: broken file output → regress to `gm-emit` | wrong logic → regress to `gm-execute` | new unknown or wrong requirements → regress to `planning`
|
|
26
|
+
|
|
27
|
+
**RULE**: Any surprise = new unknown = regress to `planning`. Never patch around surprises.
|
|
28
28
|
|
|
29
29
|
## MUTABLE DISCIPLINE
|
|
30
30
|
|
|
@@ -36,11 +36,11 @@ You are in the **VERIFY → COMPLETE** phase. Files are written. Prove the whole
|
|
|
36
36
|
|
|
37
37
|
All five must resolve to KNOWN before COMPLETE. Any UNKNOWN = absolute barrier.
|
|
38
38
|
|
|
39
|
-
## END-TO-END VERIFICATION
|
|
39
|
+
## END-TO-END DIAGNOSTIC VERIFICATION
|
|
40
40
|
|
|
41
|
-
Run the real system with real data. Witness actual output.
|
|
41
|
+
Run the real system with real data. Witness actual output. This is a full-system fault-detection pass.
|
|
42
42
|
|
|
43
|
-
NOT verification: docs updates, status text, saying done, screenshots alone, marker files.
|
|
43
|
+
NOT verification: docs updates, status text, saying done, screenshots alone, marker files. Unwitnessed claims are inadmissible.
|
|
44
44
|
|
|
45
45
|
```
|
|
46
46
|
exec:nodejs
|
|
@@ -48,7 +48,14 @@ const { fn } = await import('/abs/path/to/module.js');
|
|
|
48
48
|
console.log(await fn(realInput));
|
|
49
49
|
```
|
|
50
50
|
|
|
51
|
-
|
|
51
|
+
**Failure triage protocol**: when end-to-end fails, do not patch blindly. Isolate the fault:
|
|
52
|
+
1. Identify which subsystem produced the unexpected output
|
|
53
|
+
2. Reproduce the failure in isolation (single function, single module)
|
|
54
|
+
3. Name the delta between expected and actual — this is the mutable
|
|
55
|
+
4. Triage: broken file output → regress to EMIT | wrong logic → regress to EXECUTE | new unknown → regress to PLAN
|
|
56
|
+
5. Never fix a symptom without identifying and fixing the root cause
|
|
57
|
+
|
|
58
|
+
For browser/UI: invoke `browser` skill with real workflows. Server + client features require both exec:nodejs AND browser diagnostics. After every success: enumerate what remains — never stop at first green. First green is not COMPLETE.
|
|
52
59
|
|
|
53
60
|
## CODE EXECUTION
|
|
54
61
|
|
|
@@ -149,12 +156,12 @@ After end-to-end verification passes: read .prd from disk. If any items remain,
|
|
|
149
156
|
|
|
150
157
|
**Never**: claim done without witnessed output | uncommitted changes | unpushed commits | failed CI runs | .prd items remaining | TODO.md with items remaining | stop at first green | absorb surprises silently | respond to user while .prd has items | skip hygiene sweep | leave comments/mocks/test files/fallbacks
|
|
151
158
|
|
|
152
|
-
**Always**: triage failure before
|
|
159
|
+
**Always**: triage failure before regressing | witness end-to-end | regress to planning on any new unknown | enumerate remaining after every success | check .prd after every verification pass | run hygiene sweep before declaring complete | deploy/publish if applicable | update CHANGELOG.md
|
|
153
160
|
|
|
154
161
|
---
|
|
155
162
|
|
|
156
|
-
|
|
157
|
-
|
|
158
|
-
|
|
159
|
-
|
|
160
|
-
|
|
163
|
+
**EXIT → EXECUTE**: .prd items remain → invoke `gm-execute` skill immediately (keep going, never stop with .prd items).
|
|
164
|
+
**EXIT → COMPLETE**: .prd deleted + feature work pushed + CI green → invoke `update-docs` skill.
|
|
165
|
+
**REGRESS → EMIT**: file output wrong → invoke `gm-emit` skill, reset to EMIT state.
|
|
166
|
+
**REGRESS → EXECUTE**: logic wrong → invoke `gm-execute` skill, reset to EXECUTE state.
|
|
167
|
+
**REGRESS → PLAN**: new unknown or wrong requirements → invoke `planning` skill, reset to PLAN state.
|
package/skills/gm-emit/SKILL.md
CHANGED
|
@@ -12,16 +12,16 @@ You are in the **EMIT** phase. Every mutable is KNOWN. Prove the write is correc
|
|
|
12
12
|
|
|
13
13
|
## TRANSITIONS
|
|
14
14
|
|
|
15
|
-
**
|
|
15
|
+
**EXIT — invoke `gm-complete` skill immediately when**: All gate conditions are true simultaneously. Do not pause. Invoke the skill.
|
|
16
16
|
|
|
17
|
-
**SELF-LOOP**: Post-emit variance with known cause → fix immediately, re-verify, do not advance until zero variance
|
|
17
|
+
**SELF-LOOP (remain in EMIT state)**: Post-emit variance with known cause → fix immediately, re-verify, do not advance until zero variance
|
|
18
18
|
|
|
19
|
-
**
|
|
20
|
-
- Pre-emit reveals logic error (known mutable) → invoke `gm-execute` skill,
|
|
21
|
-
- Pre-emit reveals new unknown → invoke `planning` skill,
|
|
22
|
-
- Post-emit variance with unknown cause → invoke `planning` skill,
|
|
23
|
-
- Scope changed → invoke `planning` skill,
|
|
24
|
-
-
|
|
19
|
+
**STATE REGRESSIONS**:
|
|
20
|
+
- Pre-emit reveals logic error (known mutable) → invoke `gm-execute` skill, reset to EXECUTE, return here after resolution
|
|
21
|
+
- Pre-emit reveals new unknown → invoke `planning` skill, reset to PLAN state
|
|
22
|
+
- Post-emit variance with unknown cause → invoke `planning` skill, reset to PLAN state
|
|
23
|
+
- Scope changed → invoke `planning` skill, reset to PLAN state
|
|
24
|
+
- Re-entered from VERIFY state (broken file output) → fix, re-verify, then re-invoke `gm-complete` skill
|
|
25
25
|
|
|
26
26
|
## MUTABLE DISCIPLINE
|
|
27
27
|
|
|
@@ -41,11 +41,14 @@ Only git in bash directly. `Bash(node/npm/npx/bun)` = violations. File writes vi
|
|
|
41
41
|
- Target under 12s per exec call; split work across multiple calls only when dependencies require it
|
|
42
42
|
- Prefer a single well-structured exec that does 5 things over 5 sequential execs
|
|
43
43
|
|
|
44
|
-
## PRE-EMIT
|
|
44
|
+
## PRE-EMIT DIAGNOSTIC RUN (mandatory before writing any file)
|
|
45
45
|
|
|
46
|
-
|
|
46
|
+
The pre-emit run is a diagnostic pass. Its purpose is to falsify the write before it happens.
|
|
47
|
+
|
|
48
|
+
1. Import the actual module from disk via `exec:nodejs` — witness current on-disk behavior as the baseline
|
|
47
49
|
2. Run proposed logic in isolation WITHOUT writing — witness output with real inputs
|
|
48
|
-
3.
|
|
50
|
+
3. Probe all failure paths with real error inputs — record expected vs actual for each
|
|
51
|
+
4. Compare: if proposed output matches expected → proceed to write. If not → new unknown, regress to `planning`.
|
|
49
52
|
|
|
50
53
|
```
|
|
51
54
|
exec:nodejs
|
|
@@ -53,18 +56,21 @@ const { fn } = await import('/abs/path/to/module.js');
|
|
|
53
56
|
console.log(await fn(realInput));
|
|
54
57
|
```
|
|
55
58
|
|
|
56
|
-
Pre-emit revealing unexpected behavior → new unknown →
|
|
59
|
+
Pre-emit revealing unexpected behavior → name the delta → new unknown → invoke `planning` skill, reset to PLAN state.
|
|
57
60
|
|
|
58
61
|
## WRITING FILES
|
|
59
62
|
|
|
60
63
|
`exec:nodejs` with `require('fs')`. Write only when every gate mutable is `resolved=true` simultaneously.
|
|
61
64
|
|
|
62
|
-
## POST-EMIT VERIFICATION (immediately after writing)
|
|
65
|
+
## POST-EMIT DIAGNOSTIC VERIFICATION (immediately after writing)
|
|
66
|
+
|
|
67
|
+
The post-emit verification is a differential diagnosis against the pre-emit baseline.
|
|
63
68
|
|
|
64
|
-
1. Re-import the actual file from disk — not in-memory version
|
|
65
|
-
2. Run
|
|
66
|
-
3. For browser: reload from disk, re-inject `__gm` globals, re-run, compare
|
|
67
|
-
4. Known variance
|
|
69
|
+
1. Re-import the actual file from disk — not the in-memory version (stale in-memory state is inadmissible)
|
|
70
|
+
2. Run identical inputs as pre-emit — output must match pre-emit witnessed values exactly
|
|
71
|
+
3. For browser: reload from disk, re-inject `__gm` globals, re-run, compare captured outputs to pre-emit baseline
|
|
72
|
+
4. Known variance (cause is identified, mutable is KNOWN) → fix immediately and re-verify
|
|
73
|
+
5. Unknown variance (delta exists but cause cannot be determined) → this is a new unknown → invoke `planning` skill, reset to PLAN state
|
|
68
74
|
|
|
69
75
|
## GATE CONDITIONS (all true simultaneously before advancing)
|
|
70
76
|
|
|
@@ -109,11 +115,11 @@ Never respond to the user from this phase. When all gate conditions pass, immedi
|
|
|
109
115
|
|
|
110
116
|
**Never**: write before pre-emit passes | advance with post-emit variance | absorb surprises silently | comments | hardcoded values | fallback/demo modes | silent error swallowing | defer spotted issues | respond to user or pause for input
|
|
111
117
|
|
|
112
|
-
**Always**: pre-emit debug before writing | post-emit verify from disk |
|
|
118
|
+
**Always**: pre-emit debug before writing | post-emit verify from disk | regress to planning on any new unknown | fix immediately | invoke next skill immediately when gates pass
|
|
113
119
|
|
|
114
120
|
---
|
|
115
121
|
|
|
116
|
-
|
|
117
|
-
|
|
118
|
-
|
|
119
|
-
|
|
122
|
+
**EXIT → VERIFY**: All gates pass → invoke `gm-complete` skill immediately.
|
|
123
|
+
**SELF-LOOP**: Known post-emit variance → fix, re-verify (remain in EMIT state).
|
|
124
|
+
**REGRESS → EXECUTE**: Known logic error → invoke `gm-execute` skill, reset to EXECUTE state.
|
|
125
|
+
**REGRESS → PLAN**: Any new unknown → invoke `planning` skill, reset to PLAN state.
|
|
@@ -12,14 +12,15 @@ You are in the **EXECUTE** phase. Resolve every named mutable via witnessed exec
|
|
|
12
12
|
|
|
13
13
|
## TRANSITIONS
|
|
14
14
|
|
|
15
|
-
**
|
|
15
|
+
**EXIT — invoke `gm-emit` skill immediately when**: All mutables are KNOWN (zero UNKNOWN remaining). Do not wait, do not summarize. Invoke the skill.
|
|
16
16
|
|
|
17
|
-
**SELF-LOOP**: Mutable still UNKNOWN after one pass → re-run with different angle (max 2 passes then
|
|
17
|
+
**SELF-LOOP (remain in EXECUTE state)**: Mutable still UNKNOWN after one pass → re-run with different angle (max 2 passes, then regress to PLAN)
|
|
18
18
|
|
|
19
|
-
**
|
|
20
|
-
- New unknown discovered → invoke `planning` skill immediately,
|
|
21
|
-
-
|
|
22
|
-
-
|
|
19
|
+
**STATE REGRESSIONS**:
|
|
20
|
+
- New unknown discovered → invoke `planning` skill immediately, reset to PLAN state
|
|
21
|
+
- EXECUTE mutable unresolvable after 2 passes → invoke `planning` skill, reset to PLAN state
|
|
22
|
+
- Re-entered from EMIT state (logic error) → re-resolve the mutable, then re-invoke `gm-emit` skill
|
|
23
|
+
- Re-entered from VERIFY state (runtime failure) → re-resolve with real system state, then re-invoke `gm-emit` skill
|
|
23
24
|
|
|
24
25
|
## MUTABLE DISCIPLINE
|
|
25
26
|
|
|
@@ -75,9 +76,9 @@ exec:codesearch
|
|
|
75
76
|
4. Keep changing or adding words each pass until content is found
|
|
76
77
|
5. Minimum 4 attempts before concluding content is absent
|
|
77
78
|
|
|
78
|
-
## IMPORT-BASED
|
|
79
|
+
## DIAGNOSTIC PROTOCOL — IMPORT-BASED EXECUTION
|
|
79
80
|
|
|
80
|
-
Always import actual codebase modules. Never rewrite logic inline.
|
|
81
|
+
Always import actual codebase modules. Never rewrite logic inline. Reimplemented output is unwitnessed and inadmissible as ground truth.
|
|
81
82
|
|
|
82
83
|
```
|
|
83
84
|
exec:nodejs
|
|
@@ -87,38 +88,40 @@ console.log(await fn(realInput));
|
|
|
87
88
|
|
|
88
89
|
Witnessed import output = resolved mutable. Reimplemented output = UNKNOWN.
|
|
89
90
|
|
|
91
|
+
**Differential diagnosis**: when behavior diverges from expectation, run the smallest possible isolation test first. Compare actual vs expected. Name the delta. The delta is the mutable — resolve it before touching any file.
|
|
92
|
+
|
|
90
93
|
## EXECUTION DENSITY
|
|
91
94
|
|
|
92
|
-
Pack every related hypothesis into one run. Each run ≤15s. Witnessed output = ground truth. Narrated assumption =
|
|
95
|
+
Pack every related hypothesis into one run. Each run ≤15s. Witnessed output = ground truth. Narrated assumption = inadmissible.
|
|
93
96
|
|
|
94
|
-
Parallel waves: ≤3 `gm:gm` subagents via
|
|
97
|
+
Parallel waves: ≤3 `gm:gm` subagents via Agent tool (`Agent(subagent_type="gm:gm", ...)`) — independent items simultaneously, never sequentially.
|
|
95
98
|
|
|
96
|
-
## CHAIN DECOMPOSITION
|
|
99
|
+
## CHAIN DECOMPOSITION — FAULT ISOLATION
|
|
97
100
|
|
|
98
|
-
Break every multi-step operation before running end-to-end:
|
|
101
|
+
Break every multi-step operation before running end-to-end. Treat each step as a diagnostic unit:
|
|
99
102
|
1. Number every distinct step
|
|
100
103
|
2. Per step: input shape, output shape, success condition, failure mode
|
|
101
|
-
3. Run each step in isolation — witness — assign mutable — KNOWN before next
|
|
102
|
-
4. Debug adjacent pairs for handoff correctness
|
|
104
|
+
3. Run each step in isolation — witness output — assign mutable — must be KNOWN before proceeding to next step
|
|
105
|
+
4. Debug adjacent step pairs for handoff correctness — the seam between steps is the most common failure site
|
|
103
106
|
5. Only when all pairs pass: run full chain end-to-end
|
|
104
107
|
|
|
105
|
-
Step failure revealing new unknown →
|
|
108
|
+
Step failure revealing new unknown → regress to `planning` state immediately.
|
|
106
109
|
|
|
107
|
-
## BROWSER
|
|
110
|
+
## BROWSER DIAGNOSTIC ESCALATION
|
|
108
111
|
|
|
109
|
-
Invoke `browser` skill.
|
|
110
|
-
1. `exec:browser\n<js>` —
|
|
111
|
-
2. `browser` skill — for full session workflows
|
|
112
|
-
3. navigate/click/type — only when real events required
|
|
113
|
-
4. screenshot — last resort
|
|
112
|
+
Invoke `browser` skill. Exhaust each level before advancing to next:
|
|
113
|
+
1. `exec:browser\n<js>` — inspect DOM state, read globals, check network responses. Always first.
|
|
114
|
+
2. `browser` skill — for full session workflows requiring navigation
|
|
115
|
+
3. navigate/click/type — only when real events required and DOM inspection insufficient
|
|
116
|
+
4. screenshot — last resort, only after all JS-based diagnostics exhausted
|
|
114
117
|
|
|
115
|
-
## GROUND TRUTH
|
|
118
|
+
## GROUND TRUTH ENFORCEMENT
|
|
116
119
|
|
|
117
|
-
Real services, real data, real timing. Mocks/fakes/stubs/simulations = delete immediately. No .test.js/.spec.js. Delete on discovery. No fallback/demo modes — errors must
|
|
120
|
+
Real services, real data, real timing. Mocks/fakes/stubs/simulations = diagnostic noise = delete immediately. No .test.js/.spec.js. Delete on discovery. No fallback/demo modes — errors must surface with full diagnostic context and fail loud.
|
|
118
121
|
|
|
119
|
-
**SCAN BEFORE EDIT**: Before modifying or creating any file, search the codebase (exec:codesearch) for existing implementations of the same concern. "Duplicate" means overlapping responsibility, similar logic, or parallel implementations — not just identical files. If consolidation is possible,
|
|
122
|
+
**SCAN BEFORE EDIT**: Before modifying or creating any file, search the codebase (exec:codesearch) for existing implementations of the same concern. "Duplicate" means overlapping responsibility, similar logic, or parallel implementations — not just identical files. If consolidation is possible, regress to `planning` with restructuring instructions instead of continuing.
|
|
120
123
|
|
|
121
|
-
**HYPOTHESIZE VIA EXECUTION**:
|
|
124
|
+
**HYPOTHESIZE VIA EXECUTION — NEVER VIA ASSUMPTION**: Formulate a falsifiable hypothesis. Run it. Witness the output. The output either confirms or falsifies. Only a witnessed falsification justifies editing a file. Never edit based on unwitnessed assumptions — form hypothesis → run → witness → edit.
|
|
122
125
|
|
|
123
126
|
## DO NOT STOP
|
|
124
127
|
|
|
@@ -128,10 +131,10 @@ Never respond to the user from this phase. When all mutables are KNOWN, immediat
|
|
|
128
131
|
|
|
129
132
|
**Never**: `Bash(node/npm/npx/bun)` | fake data | mock files | test files | fallback/demo modes | Glob/Grep/Read/Explore (hook-blocked — use exec:codesearch) | sequential independent items | absorb surprises silently | respond to user or pause for input | edit files before executing to understand current behavior | duplicate existing code
|
|
130
133
|
|
|
131
|
-
**Always**: witness every hypothesis | import real modules | scan codebase before creating/editing files |
|
|
134
|
+
**Always**: witness every hypothesis | import real modules | scan codebase before creating/editing files | regress to planning on any new unknown | fix immediately on discovery | delete mocks/stubs/comments/test files on discovery | invoke next skill immediately when done
|
|
132
135
|
|
|
133
136
|
---
|
|
134
137
|
|
|
135
|
-
|
|
136
|
-
|
|
137
|
-
|
|
138
|
+
**EXIT → EMIT**: All mutables KNOWN → invoke `gm-emit` skill immediately.
|
|
139
|
+
**SELF-LOOP**: Still UNKNOWN → re-run (max 2 passes, then regress to PLAN).
|
|
140
|
+
**REGRESS → PLAN**: Any new unknown → invoke `planning` skill, reset to PLAN state.
|
package/skills/planning/SKILL.md
CHANGED
|
@@ -14,30 +14,32 @@ You are in the **PLAN** phase. Your job is to discover every unknown before exec
|
|
|
14
14
|
|
|
15
15
|
## TRANSITIONS
|
|
16
16
|
|
|
17
|
-
**
|
|
18
|
-
-
|
|
19
|
-
|
|
20
|
-
|
|
21
|
-
-
|
|
17
|
+
**EXIT — invoke `gm-execute` skill immediately when**:
|
|
18
|
+
- Zero new unknowns discovered in the latest reasoning pass
|
|
19
|
+
- All .prd items have explicit acceptance criteria
|
|
20
|
+
- All dependencies are mapped bidirectionally
|
|
21
|
+
- Do NOT advance while unknowns remain discoverable through reasoning alone
|
|
22
|
+
|
|
23
|
+
**SELF-LOOP (remain in PLAN state)**:
|
|
24
|
+
- Each planning pass surfaces new unknowns → add them to .prd → plan again
|
|
22
25
|
- Loop until a full pass produces zero new items
|
|
23
|
-
- Do not advance to EXECUTE while unknowns remain discoverable through reasoning alone
|
|
24
26
|
|
|
25
|
-
**
|
|
26
|
-
- From EXECUTE: execution reveals an unknown not in .prd →
|
|
27
|
-
- From EMIT: scope shifted mid-write →
|
|
28
|
-
- From VERIFY: end-to-end reveals requirement
|
|
27
|
+
**REGRESSION ENTRIES (this skill is re-entered from later states)**:
|
|
28
|
+
- From EXECUTE state: execution reveals an unknown not in .prd → add it, re-plan, re-advance
|
|
29
|
+
- From EMIT state: scope shifted mid-write → revise affected items, re-plan, re-advance
|
|
30
|
+
- From VERIFY state: end-to-end reveals wrong requirement → rewrite items, re-plan, re-advance
|
|
29
31
|
|
|
30
32
|
## WHAT PLANNING MEANS
|
|
31
33
|
|
|
32
|
-
Planning = exhaustive
|
|
33
|
-
- What do I not know? → name it as a mutable
|
|
34
|
-
- What could go wrong? → name it as an edge case item
|
|
35
|
-
- What depends on what? → map blocking/blockedBy
|
|
36
|
-
- What assumptions am I making? →
|
|
34
|
+
Planning = exhaustive fault-surface enumeration. For every aspect of the task, apply diagnostic questioning:
|
|
35
|
+
- What do I not know? → name it as a mutable (UNKNOWN)
|
|
36
|
+
- What could go wrong? → name it as an edge case item with a failure mode
|
|
37
|
+
- What depends on what? → map blocking/blockedBy explicitly
|
|
38
|
+
- What assumptions am I making? → each assumption is an unwitnessed hypothesis = mutable until proven by execution
|
|
37
39
|
|
|
38
|
-
**Iterate until**: a full reasoning pass adds zero new items to .prd.
|
|
40
|
+
**Iterate until**: a full reasoning pass adds zero new items to .prd. Every item must have an acceptance criterion that is binary and measurable — no subjective criteria.
|
|
39
41
|
|
|
40
|
-
|
|
42
|
+
Fault surfaces to enumerate exhaustively: file existence | API shape | data format | dependency versions | runtime behavior | environment differences | error conditions | concurrency hazards | integration seams | backwards compatibility | rollback paths | deployment steps | verification criteria | CI/CD pipeline correctness
|
|
41
43
|
|
|
42
44
|
**MANDATORY CODEBASE SCAN**: For every planned item, add `existingImpl=UNKNOWN` mutable. Resolve by running exec:codesearch for the concern (not the implementation). If existing code serves the same concern → the .prd item becomes a consolidation task, not an addition. The plan restructures existing code to absorb the new requirement — never bolt new code alongside existing code that does related work.
|
|
43
45
|
|
|
@@ -69,15 +71,18 @@ Path: exactly `./.prd` in current working directory. **JSON array** written via
|
|
|
69
71
|
**blocking/blockedBy**: always bidirectional. Every dependency must be explicit in both directions.
|
|
70
72
|
**Deletion rule**: when the last item is completed and removed, delete the `.prd` file. An empty file is a violation.
|
|
71
73
|
|
|
72
|
-
##
|
|
74
|
+
## PARALLEL SUBAGENT LAUNCH (immediately after .prd is written)
|
|
75
|
+
|
|
76
|
+
When .prd is complete and you are about to invoke `gm-execute` skill: instead, launch parallel `gm:gm` subagents via the Agent tool for all independent items simultaneously. Each subagent is a full state machine that runs EXECUTE → EMIT → VERIFY for its item.
|
|
73
77
|
|
|
74
|
-
Independent items (empty `blockedBy`) run in parallel waves of ≤3 subagents.
|
|
75
78
|
- Find all pending items with empty `blockedBy`
|
|
76
|
-
- Launch ≤3 parallel `gm:gm
|
|
77
|
-
- Each subagent
|
|
78
|
-
- After each wave:
|
|
79
|
-
- Never run independent items sequentially
|
|
80
|
-
- **Exception — browser tasks**: items requiring `exec:browser` must run sequentially
|
|
79
|
+
- Launch ≤3 parallel subagents: `Agent(subagent_type="gm:gm", prompt="Work on .prd item: <id>. .prd path: <path>. Item: <full item JSON>.")`
|
|
80
|
+
- Each subagent resolves its item end-to-end: witnessed execution → file write → verification → removes item from .prd
|
|
81
|
+
- After each wave: read .prd from disk, find newly unblocked items, launch next wave
|
|
82
|
+
- Never run independent items sequentially — parallelism is mandatory for independent items
|
|
83
|
+
- **Exception — browser tasks**: items requiring `exec:browser` must run sequentially (one Chrome instance per project; concurrent browser subagents will conflict)
|
|
84
|
+
|
|
85
|
+
**When parallelism is not applicable** (single-item .prd, or all items blocked): invoke `gm-execute` skill directly.
|
|
81
86
|
|
|
82
87
|
## COMPLETION CRITERION
|
|
83
88
|
|
|
@@ -87,10 +92,10 @@ Independent items (empty `blockedBy`) run in parallel waves of ≤3 subagents.
|
|
|
87
92
|
|
|
88
93
|
## DO NOT STOP
|
|
89
94
|
|
|
90
|
-
Never respond to the user from this phase. When .prd is complete (zero new items in last pass), immediately invoke `gm-execute` skill. Do not pause, summarize, or ask for confirmation.
|
|
95
|
+
Never respond to the user from this phase. When .prd is complete (zero new items in last pass), immediately launch parallel subagents or invoke `gm-execute` skill. Do not pause, summarize, or ask for confirmation.
|
|
91
96
|
|
|
92
97
|
---
|
|
93
98
|
|
|
94
|
-
|
|
95
|
-
|
|
96
|
-
|
|
99
|
+
**EXIT → EXECUTE**: Zero new mutables → launch parallel subagents or invoke `gm-execute` skill immediately.
|
|
100
|
+
**SELF-LOOP**: New items discovered → add to .prd → plan again (remain in PLAN state).
|
|
101
|
+
**REGRESSION ENTRY**: New unknown surfaces in any later state → add it, re-plan, re-advance through full chain.
|