npm - gm-cc - Versions diffs - 2.0.410 → 2.0.412 - Mend

gm-cc 2.0.410 → 2.0.412

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (8) hide show

package/.claude-plugin/marketplace.json +1 -1
package/package.json +1 -1
package/plugin.json +1 -1
package/skills/gm/SKILL.md +44 -32
package/skills/gm-complete/SKILL.md +27 -20
package/skills/gm-emit/SKILL.md +28 -22
package/skills/gm-execute/SKILL.md +32 -29
package/skills/planning/SKILL.md +33 -28

package/.claude-plugin/marketplace.json CHANGED Viewed

@@ -4,7 +4,7 @@
     "name": "AnEntrypoint"
   },
   "description": "State machine agent with hooks, skills, and automated git enforcement",
-  "version": "2.0.410",
+  "version": "2.0.412",
   "metadata": {
     "description": "State machine agent with hooks, skills, and automated git enforcement"
   },

package/package.json CHANGED Viewed

@@ -1,6 +1,6 @@
 {
   "name": "gm-cc",
-  "version": "2.0.410",
+  "version": "2.0.412",
   "description": "State machine agent with hooks, skills, and automated git enforcement",
   "author": "AnEntrypoint",
   "license": "MIT",

package/plugin.json CHANGED Viewed

@@ -1,6 +1,6 @@
 {
   "name": "gm",
-  "version": "2.0.410",
+  "version": "2.0.412",
   "description": "State machine agent with hooks, skills, and automated git enforcement",
   "author": {
     "name": "AnEntrypoint",

package/skills/gm/SKILL.md CHANGED Viewed

@@ -16,22 +16,25 @@ You think in state, not prose. You are the root orchestrator of all work in this
 `PLAN → EXECUTE → EMIT → VERIFY → UPDATE-DOCS → COMPLETE`
-**FORWARD (ladders)**:
-- PLAN complete → invoke `gm-execute` skill
-- EXECUTE complete → invoke `gm-emit` skill
-- EMIT complete → invoke `gm-complete` skill
-- COMPLETE with .prd items remaining → invoke `gm-execute` skill (next wave)
-**BACKWARD (snakes) — any new unknown at any phase restarts from PLAN**:
-- New unknown discovered → invoke `planning` skill, restart chain
-- EXECUTE mutable unresolvable after 2 passes → invoke `planning` skill
-- EMIT logic wrong → invoke `gm-execute` skill
-- EMIT new unknown → invoke `planning` skill
-- VERIFY file broken → invoke `gm-emit` skill
-- VERIFY logic wrong → invoke `gm-execute` skill
-- VERIFY new unknown or wrong requirements → invoke `planning` skill
-**Runs until**: .prd empty AND git clean AND all pushes confirmed.
+Each state transition REQUIRES an explicit Skill tool invocation. Skills do not auto-chain. Failing to invoke the next skill is a critical violation.
+**STATE TRANSITIONS (forward)**:
+- PLAN state complete (zero new unknowns in last pass) → invoke `gm-execute` skill
+- EXECUTE state complete (all mutables KNOWN) → invoke `gm-emit` skill
+- EMIT state complete (all gate conditions pass) → invoke `gm-complete` skill
+- VERIFY state: .prd items remain → invoke `gm-execute` skill (next wave)
+- VERIFY state: .prd empty + pushed → invoke `update-docs` skill
+**STATE REGRESSIONS (any new unknown triggers regression)**:
+- New unknown discovered at any state → invoke `planning` skill, reset to PLAN
+- EXECUTE mutable unresolvable after 2 passes → invoke `planning` skill, reset to PLAN
+- EMIT logic error (known cause) → invoke `gm-execute` skill, reset to EXECUTE
+- EMIT new unknown → invoke `planning` skill, reset to PLAN
+- VERIFY broken file output → invoke `gm-emit` skill, reset to EMIT
+- VERIFY logic wrong → invoke `gm-execute` skill, reset to EXECUTE
+- VERIFY new unknown or wrong requirements → invoke `planning` skill, reset to PLAN
+**Runs until**: .prd empty AND git clean AND all pushes confirmed AND CI green.
 ## MUTABLE DISCIPLINE
@@ -112,12 +115,21 @@ Invoke `browser` skill. Escalation — exhaust each before advancing:
 ## SKILL REGISTRY
-**`planning`** — Mutable discovery and .prd construction. Invoke at start and on any new unknown.
-**`gm-execute`** — Resolve all mutables via witnessed execution.
-**`gm-emit`** — Write files to disk when all mutables resolved.
-**`gm-complete`** — End-to-end verification and git enforcement.
-**`update-docs`** — Refresh README, CLAUDE.md, and docs to reflect session changes. Invoked by `gm-complete`.
-**`browser`** — Browser automation. Invoke inside EXECUTE for all browser/UI work.
+**`planning`** — PLAN state. Mutable discovery and .prd construction. Invoke at start and on any new unknown. EXIT: invoke `gm-execute` skill when zero new unknowns in last pass.
+**`gm-execute`** — EXECUTE state. Resolve all mutables via witnessed execution. EXIT: invoke `gm-emit` skill when all mutables KNOWN.
+**`gm-emit`** — EMIT state. Write files to disk when all mutables resolved. EXIT: invoke `gm-complete` skill when all gate conditions pass.
+**`gm-complete`** — VERIFY state. End-to-end verification and git enforcement. EXIT: invoke `gm-execute` if .prd items remain; invoke `update-docs` if .prd empty and pushed.
+**`update-docs`** — Refresh README, CLAUDE.md, and docs to reflect session changes. Invoked by `gm-complete`. Terminal state — declares COMPLETE.
+**`browser`** — Browser automation. Invoke inside EXECUTE state for all browser/UI work.
+## PARALLEL SUBAGENTS (post-PLAN)
+After `planning` skill completes and .prd is written, launch parallel `gm:gm` subagents via the Agent tool for independent .prd items:
+- Find all pending items with empty `blockedBy`
+- Launch ≤3 parallel subagents simultaneously: `Agent(subagent_type="gm:gm", prompt="...")`
+- Each subagent handles one .prd item end-to-end through its own state machine
+- Never run independent items sequentially — parallelism is mandatory
+- Exception: items requiring `exec:browser` must run sequentially (one Chrome instance per project)
 ## DO NOT STOP
@@ -130,7 +142,7 @@ Completing a phase is NOT stopping. After every phase: read .prd, check git, inv
 ## MANDATORY DEV WORKFLOW — ABSOLUTE RULES
-These rules apply to ALL phases. Violations trigger immediate snake to planning.
+These rules apply to ALL states. Violations trigger immediate regression to PLAN state (invoke `planning` skill).
 **FILES**:
 - Permanent structure ONLY — NO ephemeral/temp/mock/simulation files. Use exec: and browser skill instead
@@ -144,8 +156,8 @@ These rules apply to ALL phases. Violations trigger immediate snake to planning.
 **CODE QUALITY**:
 - ALWAYS scan codebase (exec:codesearch) before editing — find everything that touches the same concern
-- **Duplicate concern = snake to planning**: overlapping responsibility, similar logic in different places, parallel implementations, or code that could be consolidated. Snake to `planning` with consolidation instructions
-- After every file write: run exec:codesearch for the primary function/concern you just wrote. If ANY other code serves the same concern → snake to `planning` with consolidation instructions. This is not optional — it is a gate
+- **Duplicate concern = regress to PLAN**: overlapping responsibility, similar logic in different places, parallel implementations, or code that could be consolidated. Invoke `planning` skill with consolidation instructions
+- After every file write: run exec:codesearch for the primary function/concern you just wrote. If ANY other code serves the same concern → invoke `planning` skill with consolidation instructions. This is not optional — it is a gate
 - When a native feature, stdlib function, or convention replaces custom code → delete the custom code. When it would add code → do not use it
 - When a naming convention, directory structure, or auto-discovery pattern can replace explicit registration or configuration → replace it
 - ZERO hardcoded values — all values derived from ground truth, config, or convention
@@ -159,10 +171,11 @@ These rules apply to ALL phases. Violations trigger immediate snake to planning.
 - The only acceptable error handling: catch → log the real error → re-throw or display to user
 **DEBUGGING**:
-- ALWAYS hypothesize/troubleshoot via execution BEFORE editing any files
-- Check git history (`git log`, `git diff`) for troubleshooting known regressions — never revert, use differential comparisons and edit new code manually
-- Keep execution logs concise (<4k chars ideal, 30k max)
-- Clear cache before playwright/browser debugging
+- ALWAYS form a falsifiable hypothesis before touching any file — run it, witness the output, confirm or falsify
+- Differential diagnosis: isolate the smallest unit reproducing the failure. Name the delta between expected and actual. That delta is the mutable.
+- Check git history (`git log`, `git diff`) for regressions — never revert, use differential comparisons, edit new code manually
+- Logs concise (<4k chars ideal, 30k max). Clear cache before browser debugging.
+- Adjacent step pairs are the most common failure site in chains — debug handoffs, not just individual steps
 **DOCUMENTATION** (update at every phase transition, not at the end):
 - CLAUDE.md: after each structural change, update technical info. NO progress/changelogs
@@ -172,8 +185,7 @@ These rules apply to ALL phases. Violations trigger immediate snake to planning.
 **PROCESS**:
 - Only persistent background shells for long-running CLI processes
-- Test via exec: and browser skill — NO test files ever
-- Test locally before live
+- Test via exec: and browser skill — NO test files ever. Test locally before live.
 ## CONSTRAINTS
@@ -184,4 +196,4 @@ These rules apply to ALL phases. Violations trigger immediate snake to planning.
 **Never**: `Bash(node/npm/npx/bun)` | skip planning | sequential independent items | screenshot before JS exhausted | narrate past unresolved mutables | stop while .prd has items | ask the user what to do next while work remains | create fallback/demo modes | silently swallow errors | duplicate concern | leave comments | create test files | leave stale architecture when changes reveal restructuring opportunity
-**Always**: invoke named skill at every transition | snake to planning on any new unknown | snake to planning when duplicate concern or restructuring opportunity discovered | witnessed execution only | scan codebase before edits | keep going until .prd deleted and git clean
+**Always**: invoke named skill at every state transition | regress to planning on any new unknown | regress to planning when duplicate concern or restructuring opportunity discovered | witnessed execution only | scan codebase before edits | keep going until .prd deleted and git clean

package/skills/gm-complete/SKILL.md CHANGED Viewed

@@ -12,19 +12,19 @@ You are in the **VERIFY → COMPLETE** phase. Files are written. Prove the whole
 ## TRANSITIONS
-**FORWARD**:
-- .prd items remain → invoke `gm-execute` skill (next wave)
-- .prd empty + feature work pushed → invoke `update-docs` skill
+**EXIT — .prd items remain**: Verified items completed, .prd still has pending items → invoke `gm-execute` skill immediately (next wave). Do not stop.
-**BACKWARD**:
-- Verification reveals broken file output → invoke `gm-emit` skill, fix, re-verify, return
-- Verification reveals logic error → invoke `gm-execute` skill, re-resolve, re-emit, return
-- Verification reveals new unknown → invoke `planning` skill, restart chain
-- Verification reveals requirements wrong → invoke `planning` skill, restart chain
+**EXIT — COMPLETE**: .prd empty + all work pushed + CI green → invoke `update-docs` skill.
-**TRIAGE on failure**: broken file output → snake to `gm-emit` | wrong logic → snake to `gm-execute` | new unknown or wrong requirements → snake to `planning`
+**STATE REGRESSIONS**:
+- Verification reveals broken file output → invoke `gm-emit` skill, reset to EMIT state, re-verify on return
+- Verification reveals logic error → invoke `gm-execute` skill, reset to EXECUTE state, re-emit and re-verify on return
+- Verification reveals new unknown → invoke `planning` skill, reset to PLAN state
+- Verification reveals wrong requirements → invoke `planning` skill, reset to PLAN state
-**RULE**: Any surprise = new unknown = snake to `planning`. Never patch around surprises.
+**TRIAGE on failure**: broken file output → regress to `gm-emit` | wrong logic → regress to `gm-execute` | new unknown or wrong requirements → regress to `planning`
+**RULE**: Any surprise = new unknown = regress to `planning`. Never patch around surprises.
 ## MUTABLE DISCIPLINE
@@ -36,11 +36,11 @@ You are in the **VERIFY → COMPLETE** phase. Files are written. Prove the whole
 All five must resolve to KNOWN before COMPLETE. Any UNKNOWN = absolute barrier.
-## END-TO-END VERIFICATION
+## END-TO-END DIAGNOSTIC VERIFICATION
-Run the real system with real data. Witness actual output.
+Run the real system with real data. Witness actual output. This is a full-system fault-detection pass.
-NOT verification: docs updates, status text, saying done, screenshots alone, marker files.
+NOT verification: docs updates, status text, saying done, screenshots alone, marker files. Unwitnessed claims are inadmissible.
 ```
 exec:nodejs
@@ -48,7 +48,14 @@ const { fn } = await import('/abs/path/to/module.js');
 console.log(await fn(realInput));
 ```
-For browser/UI: invoke `browser` skill with real workflows. Server + client features require both exec:nodejs AND browser. After every success: enumerate what remains — never stop at first green.
+**Failure triage protocol**: when end-to-end fails, do not patch blindly. Isolate the fault:
+1. Identify which subsystem produced the unexpected output
+2. Reproduce the failure in isolation (single function, single module)
+3. Name the delta between expected and actual — this is the mutable
+4. Triage: broken file output → regress to EMIT | wrong logic → regress to EXECUTE | new unknown → regress to PLAN
+5. Never fix a symptom without identifying and fixing the root cause
+For browser/UI: invoke `browser` skill with real workflows. Server + client features require both exec:nodejs AND browser diagnostics. After every success: enumerate what remains — never stop at first green. First green is not COMPLETE.
 ## CODE EXECUTION
@@ -149,12 +156,12 @@ After end-to-end verification passes: read .prd from disk. If any items remain,
 **Never**: claim done without witnessed output | uncommitted changes | unpushed commits | failed CI runs | .prd items remaining | TODO.md with items remaining | stop at first green | absorb surprises silently | respond to user while .prd has items | skip hygiene sweep | leave comments/mocks/test files/fallbacks
-**Always**: triage failure before snaking | witness end-to-end | snake to planning on any new unknown | enumerate remaining after every success | check .prd after every verification pass | run hygiene sweep before declaring complete | deploy/publish if applicable | update CHANGELOG.md
+**Always**: triage failure before regressing | witness end-to-end | regress to planning on any new unknown | enumerate remaining after every success | check .prd after every verification pass | run hygiene sweep before declaring complete | deploy/publish if applicable | update CHANGELOG.md
 ---
-**→ FORWARD**: .prd items remain → invoke `gm-execute` skill (keep going, do not stop).
-**→ FORWARD**: .prd deleted + feature work pushed → invoke `update-docs` skill.
-**↩ SNAKE to EMIT**: file output wrong → invoke `gm-emit` skill.
-**↩ SNAKE to EXECUTE**: logic wrong → invoke `gm-execute` skill.
-**↩ SNAKE to PLAN**: new unknown or wrong requirements → invoke `planning` skill, restart chain.
+**EXIT → EXECUTE**: .prd items remain → invoke `gm-execute` skill immediately (keep going, never stop with .prd items).
+**EXIT → COMPLETE**: .prd deleted + feature work pushed + CI green → invoke `update-docs` skill.
+**REGRESS → EMIT**: file output wrong → invoke `gm-emit` skill, reset to EMIT state.
+**REGRESS → EXECUTE**: logic wrong → invoke `gm-execute` skill, reset to EXECUTE state.
+**REGRESS → PLAN**: new unknown or wrong requirements → invoke `planning` skill, reset to PLAN state.

package/skills/gm-emit/SKILL.md CHANGED Viewed

@@ -12,16 +12,16 @@ You are in the **EMIT** phase. Every mutable is KNOWN. Prove the write is correc
 ## TRANSITIONS
-**FORWARD**: All gate conditions true simultaneously → invoke `gm-complete` skill
+**EXIT — invoke `gm-complete` skill immediately when**: All gate conditions are true simultaneously. Do not pause. Invoke the skill.
-**SELF-LOOP**: Post-emit variance with known cause → fix immediately, re-verify, do not advance until zero variance
+**SELF-LOOP (remain in EMIT state)**: Post-emit variance with known cause → fix immediately, re-verify, do not advance until zero variance
-**BACKWARD**:
-- Pre-emit reveals logic error (known mutable) → invoke `gm-execute` skill, re-resolve, return here
-- Pre-emit reveals new unknown → invoke `planning` skill, restart chain
-- Post-emit variance with unknown cause → invoke `planning` skill, restart chain
-- Scope changed → invoke `planning` skill, restart chain
-- From VERIFY: end-to-end reveals broken file → re-enter here, fix, re-verify, re-advance
+**STATE REGRESSIONS**:
+- Pre-emit reveals logic error (known mutable) → invoke `gm-execute` skill, reset to EXECUTE, return here after resolution
+- Pre-emit reveals new unknown → invoke `planning` skill, reset to PLAN state
+- Post-emit variance with unknown cause → invoke `planning` skill, reset to PLAN state
+- Scope changed → invoke `planning` skill, reset to PLAN state
+- Re-entered from VERIFY state (broken file output) → fix, re-verify, then re-invoke `gm-complete` skill
 ## MUTABLE DISCIPLINE
@@ -41,11 +41,14 @@ Only git in bash directly. `Bash(node/npm/npx/bun)` = violations. File writes vi
 - Target under 12s per exec call; split work across multiple calls only when dependencies require it
 - Prefer a single well-structured exec that does 5 things over 5 sequential execs
-## PRE-EMIT DEBUGGING (before writing any file)
+## PRE-EMIT DIAGNOSTIC RUN (mandatory before writing any file)
-1. Import actual module from disk via `exec:nodejs` — witness current on-disk behavior
+The pre-emit run is a diagnostic pass. Its purpose is to falsify the write before it happens.
+1. Import the actual module from disk via `exec:nodejs` — witness current on-disk behavior as the baseline
 2. Run proposed logic in isolation WITHOUT writing — witness output with real inputs
-3. Debug failure paths with real error inputs — record expected values
+3. Probe all failure paths with real error inputs — record expected vs actual for each
+4. Compare: if proposed output matches expected → proceed to write. If not → new unknown, regress to `planning`.
 ```
 exec:nodejs
@@ -53,18 +56,21 @@ const { fn } = await import('/abs/path/to/module.js');
 console.log(await fn(realInput));
 ```
-Pre-emit revealing unexpected behavior → new unknown → snake to `planning`.
+Pre-emit revealing unexpected behavior → name the delta → new unknown → invoke `planning` skill, reset to PLAN state.
 ## WRITING FILES
 `exec:nodejs` with `require('fs')`. Write only when every gate mutable is `resolved=true` simultaneously.
-## POST-EMIT VERIFICATION (immediately after writing)
+## POST-EMIT DIAGNOSTIC VERIFICATION (immediately after writing)
+The post-emit verification is a differential diagnosis against the pre-emit baseline.
-1. Re-import the actual file from disk — not in-memory version
-2. Run same inputs as pre-emit — output must match exactly
-3. For browser: reload from disk, re-inject `__gm` globals, re-run, compare captures
-4. Known variance → fix and re-verify | Unknown variance → snake to `planning`
+1. Re-import the actual file from disk — not the in-memory version (stale in-memory state is inadmissible)
+2. Run identical inputs as pre-emit — output must match pre-emit witnessed values exactly
+3. For browser: reload from disk, re-inject `__gm` globals, re-run, compare captured outputs to pre-emit baseline
+4. Known variance (cause is identified, mutable is KNOWN) → fix immediately and re-verify
+5. Unknown variance (delta exists but cause cannot be determined) → this is a new unknown → invoke `planning` skill, reset to PLAN state
 ## GATE CONDITIONS (all true simultaneously before advancing)
@@ -109,11 +115,11 @@ Never respond to the user from this phase. When all gate conditions pass, immedi
 **Never**: write before pre-emit passes | advance with post-emit variance | absorb surprises silently | comments | hardcoded values | fallback/demo modes | silent error swallowing | defer spotted issues | respond to user or pause for input
-**Always**: pre-emit debug before writing | post-emit verify from disk | snake to planning on any new unknown | fix immediately | invoke next skill immediately when gates pass
+**Always**: pre-emit debug before writing | post-emit verify from disk | regress to planning on any new unknown | fix immediately | invoke next skill immediately when gates pass
 ---
-**→ FORWARD**: All gates pass → invoke `gm-complete` skill immediately.
-**↺ SELF-LOOP**: Known post-emit variance → fix, re-verify.
-**↩ SNAKE to EXECUTE**: Known logic error → invoke `gm-execute` skill.
-**↩ SNAKE to PLAN**: Any new unknown → invoke `planning` skill, restart chain.
+**EXIT → VERIFY**: All gates pass → invoke `gm-complete` skill immediately.
+**SELF-LOOP**: Known post-emit variance → fix, re-verify (remain in EMIT state).
+**REGRESS → EXECUTE**: Known logic error → invoke `gm-execute` skill, reset to EXECUTE state.
+**REGRESS → PLAN**: Any new unknown → invoke `planning` skill, reset to PLAN state.

package/skills/gm-execute/SKILL.md CHANGED Viewed

@@ -12,14 +12,15 @@ You are in the **EXECUTE** phase. Resolve every named mutable via witnessed exec
 ## TRANSITIONS
-**FORWARD**: All mutables KNOWN → invoke `gm-emit` skill
+**EXIT — invoke `gm-emit` skill immediately when**: All mutables are KNOWN (zero UNKNOWN remaining). Do not wait, do not summarize. Invoke the skill.
-**SELF-LOOP**: Mutable still UNKNOWN after one pass → re-run with different angle (max 2 passes then snake)
+**SELF-LOOP (remain in EXECUTE state)**: Mutable still UNKNOWN after one pass → re-run with different angle (max 2 passes, then regress to PLAN)
-**BACKWARD**:
-- New unknown discovered → invoke `planning` skill immediately, restart chain
-- From EMIT: logic error → re-enter here, re-resolve mutable
-- From VERIFY: runtime failure → re-enter here, re-resolve with real system state
+**STATE REGRESSIONS**:
+- New unknown discovered → invoke `planning` skill immediately, reset to PLAN state
+- EXECUTE mutable unresolvable after 2 passes → invoke `planning` skill, reset to PLAN state
+- Re-entered from EMIT state (logic error) → re-resolve the mutable, then re-invoke `gm-emit` skill
+- Re-entered from VERIFY state (runtime failure) → re-resolve with real system state, then re-invoke `gm-emit` skill
 ## MUTABLE DISCIPLINE
@@ -75,9 +76,9 @@ exec:codesearch
 4. Keep changing or adding words each pass until content is found
 5. Minimum 4 attempts before concluding content is absent
-## IMPORT-BASED DEBUGGING
+## DIAGNOSTIC PROTOCOL — IMPORT-BASED EXECUTION
-Always import actual codebase modules. Never rewrite logic inline.
+Always import actual codebase modules. Never rewrite logic inline. Reimplemented output is unwitnessed and inadmissible as ground truth.
 ```
 exec:nodejs
@@ -87,38 +88,40 @@ console.log(await fn(realInput));
 Witnessed import output = resolved mutable. Reimplemented output = UNKNOWN.
+**Differential diagnosis**: when behavior diverges from expectation, run the smallest possible isolation test first. Compare actual vs expected. Name the delta. The delta is the mutable — resolve it before touching any file.
 ## EXECUTION DENSITY
-Pack every related hypothesis into one run. Each run ≤15s. Witnessed output = ground truth. Narrated assumption = nothing.
+Pack every related hypothesis into one run. Each run ≤15s. Witnessed output = ground truth. Narrated assumption = inadmissible.
-Parallel waves: ≤3 `gm:gm` subagents via Task tool — independent items simultaneously, never sequentially.
+Parallel waves: ≤3 `gm:gm` subagents via Agent tool (`Agent(subagent_type="gm:gm", ...)`) — independent items simultaneously, never sequentially.
-## CHAIN DECOMPOSITION
+## CHAIN DECOMPOSITION — FAULT ISOLATION
-Break every multi-step operation before running end-to-end:
+Break every multi-step operation before running end-to-end. Treat each step as a diagnostic unit:
 1. Number every distinct step
 2. Per step: input shape, output shape, success condition, failure mode
-3. Run each step in isolation — witness — assign mutable — KNOWN before next
-4. Debug adjacent pairs for handoff correctness
+3. Run each step in isolation — witness output — assign mutable — must be KNOWN before proceeding to next step
+4. Debug adjacent step pairs for handoff correctness — the seam between steps is the most common failure site
 5. Only when all pairs pass: run full chain end-to-end
-Step failure revealing new unknown → snake to `planning`.
+Step failure revealing new unknown → regress to `planning` state immediately.
-## BROWSER DEBUGGING
+## BROWSER DIAGNOSTIC ESCALATION
-Invoke `browser` skill. Escalation — exhaust each before advancing:
-1. `exec:browser\n<js>` — query DOM/state. Always first.
-2. `browser` skill — for full session workflows
-3. navigate/click/type — only when real events required
-4. screenshot — last resort
+Invoke `browser` skill. Exhaust each level before advancing to next:
+1. `exec:browser\n<js>` — inspect DOM state, read globals, check network responses. Always first.
+2. `browser` skill — for full session workflows requiring navigation
+3. navigate/click/type — only when real events required and DOM inspection insufficient
+4. screenshot — last resort, only after all JS-based diagnostics exhausted
-## GROUND TRUTH
+## GROUND TRUTH ENFORCEMENT
-Real services, real data, real timing. Mocks/fakes/stubs/simulations = delete immediately. No .test.js/.spec.js. Delete on discovery. No fallback/demo modes — errors must fail loud with clear logs.
+Real services, real data, real timing. Mocks/fakes/stubs/simulations = diagnostic noise = delete immediately. No .test.js/.spec.js. Delete on discovery. No fallback/demo modes — errors must surface with full diagnostic context and fail loud.
-**SCAN BEFORE EDIT**: Before modifying or creating any file, search the codebase (exec:codesearch) for existing implementations of the same concern. "Duplicate" means overlapping responsibility, similar logic, or parallel implementations — not just identical files. If consolidation is possible, snake to `planning` with restructuring instructions instead of continuing.
+**SCAN BEFORE EDIT**: Before modifying or creating any file, search the codebase (exec:codesearch) for existing implementations of the same concern. "Duplicate" means overlapping responsibility, similar logic, or parallel implementations — not just identical files. If consolidation is possible, regress to `planning` with restructuring instructions instead of continuing.
-**HYPOTHESIZE VIA EXECUTION**: Always troubleshoot and validate hypotheses through witnessed execution BEFORE editing files. Never edit based on assumptions — run the code first, observe the actual behavior, then edit with ground truth.
+**HYPOTHESIZE VIA EXECUTION — NEVER VIA ASSUMPTION**: Formulate a falsifiable hypothesis. Run it. Witness the output. The output either confirms or falsifies. Only a witnessed falsification justifies editing a file. Never edit based on unwitnessed assumptions — form hypothesis → run → witness → edit.
 ## DO NOT STOP
@@ -128,10 +131,10 @@ Never respond to the user from this phase. When all mutables are KNOWN, immediat
 **Never**: `Bash(node/npm/npx/bun)` | fake data | mock files | test files | fallback/demo modes | Glob/Grep/Read/Explore (hook-blocked — use exec:codesearch) | sequential independent items | absorb surprises silently | respond to user or pause for input | edit files before executing to understand current behavior | duplicate existing code
-**Always**: witness every hypothesis | import real modules | scan codebase before creating/editing files | snake to planning on any new unknown | fix immediately on discovery | delete mocks/stubs/comments/test files on discovery | invoke next skill immediately when done
+**Always**: witness every hypothesis | import real modules | scan codebase before creating/editing files | regress to planning on any new unknown | fix immediately on discovery | delete mocks/stubs/comments/test files on discovery | invoke next skill immediately when done
 ---
-**→ FORWARD**: All mutables KNOWN → invoke `gm-emit` skill immediately.
-**↺ SELF-LOOP**: Still UNKNOWN → re-run (max 2 passes).
-**↩ SNAKE to PLAN**: Any new unknown → invoke `planning` skill, restart chain.
+**EXIT → EMIT**: All mutables KNOWN → invoke `gm-emit` skill immediately.
+**SELF-LOOP**: Still UNKNOWN → re-run (max 2 passes, then regress to PLAN).
+**REGRESS → PLAN**: Any new unknown → invoke `planning` skill, reset to PLAN state.

package/skills/planning/SKILL.md CHANGED Viewed

@@ -14,30 +14,32 @@ You are in the **PLAN** phase. Your job is to discover every unknown before exec
 ## TRANSITIONS
-**FORWARD**:
-- No new mutables discovered in latest pass → .prd is complete → invoke `gm-execute` skill
-**SELF-LOOP (stay in PLAN)**:
-- Each planning pass may surface new unknowns → add them to .prd → plan again
+**EXIT — invoke `gm-execute` skill immediately when**:
+- Zero new unknowns discovered in the latest reasoning pass
+- All .prd items have explicit acceptance criteria
+- All dependencies are mapped bidirectionally
+- Do NOT advance while unknowns remain discoverable through reasoning alone
+**SELF-LOOP (remain in PLAN state)**:
+- Each planning pass surfaces new unknowns → add them to .prd → plan again
 - Loop until a full pass produces zero new items
-- Do not advance to EXECUTE while unknowns remain discoverable through reasoning alone
-**BACKWARD (snakes back here from later phases)**:
-- From EXECUTE: execution reveals an unknown not in .prd → snake here, add it, re-plan
-- From EMIT: scope shifted mid-write → snake here, revise affected items, re-plan
-- From VERIFY: end-to-end reveals requirement was wrong → snake here, rewrite items, re-plan
+**REGRESSION ENTRIES (this skill is re-entered from later states)**:
+- From EXECUTE state: execution reveals an unknown not in .prd → add it, re-plan, re-advance
+- From EMIT state: scope shifted mid-write → revise affected items, re-plan, re-advance
+- From VERIFY state: end-to-end reveals wrong requirement → rewrite items, re-plan, re-advance
 ## WHAT PLANNING MEANS
-Planning = exhaustive mutable discovery. For every aspect of the task ask:
-- What do I not know? → name it as a mutable
-- What could go wrong? → name it as an edge case item
-- What depends on what? → map blocking/blockedBy
-- What assumptions am I making? → validate each as a mutable
+Planning = exhaustive fault-surface enumeration. For every aspect of the task, apply diagnostic questioning:
+- What do I not know? → name it as a mutable (UNKNOWN)
+- What could go wrong? → name it as an edge case item with a failure mode
+- What depends on what? → map blocking/blockedBy explicitly
+- What assumptions am I making? → each assumption is an unwitnessed hypothesis = mutable until proven by execution
-**Iterate until**: a full reasoning pass adds zero new items to .prd.
+**Iterate until**: a full reasoning pass adds zero new items to .prd. Every item must have an acceptance criterion that is binary and measurable — no subjective criteria.
-Categories of unknowns to enumerate: file existence | API shape | data format | dependency versions | runtime behavior | environment differences | error conditions | concurrency | integration points | backwards compatibility | rollback paths | deployment steps | verification criteria
+Fault surfaces to enumerate exhaustively: file existence | API shape | data format | dependency versions | runtime behavior | environment differences | error conditions | concurrency hazards | integration seams | backwards compatibility | rollback paths | deployment steps | verification criteria | CI/CD pipeline correctness
 **MANDATORY CODEBASE SCAN**: For every planned item, add `existingImpl=UNKNOWN` mutable. Resolve by running exec:codesearch for the concern (not the implementation). If existing code serves the same concern → the .prd item becomes a consolidation task, not an addition. The plan restructures existing code to absorb the new requirement — never bolt new code alongside existing code that does related work.
@@ -69,15 +71,18 @@ Path: exactly `./.prd` in current working directory. **JSON array** written via
 **blocking/blockedBy**: always bidirectional. Every dependency must be explicit in both directions.
 **Deletion rule**: when the last item is completed and removed, delete the `.prd` file. An empty file is a violation.
-## EXECUTION WAVES
+## PARALLEL SUBAGENT LAUNCH (immediately after .prd is written)
+When .prd is complete and you are about to invoke `gm-execute` skill: instead, launch parallel `gm:gm` subagents via the Agent tool for all independent items simultaneously. Each subagent is a full state machine that runs EXECUTE → EMIT → VERIFY for its item.
-Independent items (empty `blockedBy`) run in parallel waves of ≤3 subagents.
 - Find all pending items with empty `blockedBy`
-- Launch ≤3 parallel `gm:gm` subagents via Task tool
-- Each subagent handles one item: resolves it, witnesses output, removes from .prd
-- After each wave: check newly unblocked items, launch next wave
-- Never run independent items sequentially. Never launch more than 3 at once.
-- **Exception — browser tasks**: items requiring `exec:browser` must run sequentially, never in parallel. Each project has one Chrome instance; concurrent browser subagents will conflict.
+- Launch ≤3 parallel subagents: `Agent(subagent_type="gm:gm", prompt="Work on .prd item: <id>. .prd path: <path>. Item: <full item JSON>.")`
+- Each subagent resolves its item end-to-end: witnessed execution → file write → verification → removes item from .prd
+- After each wave: read .prd from disk, find newly unblocked items, launch next wave
+- Never run independent items sequentially — parallelism is mandatory for independent items
+- **Exception — browser tasks**: items requiring `exec:browser` must run sequentially (one Chrome instance per project; concurrent browser subagents will conflict)
+**When parallelism is not applicable** (single-item .prd, or all items blocked): invoke `gm-execute` skill directly.
 ## COMPLETION CRITERION
@@ -87,10 +92,10 @@ Independent items (empty `blockedBy`) run in parallel waves of ≤3 subagents.
 ## DO NOT STOP
-Never respond to the user from this phase. When .prd is complete (zero new items in last pass), immediately invoke `gm-execute` skill. Do not pause, summarize, or ask for confirmation.
+Never respond to the user from this phase. When .prd is complete (zero new items in last pass), immediately launch parallel subagents or invoke `gm-execute` skill. Do not pause, summarize, or ask for confirmation.
 ---
-**→ FORWARD**: No new mutables → invoke `gm-execute` skill immediately.
-**↺ SELF-LOOP**: New items discovered → add to .prd → plan again.
-**↩ SNAKE here**: New unknown surfaces in any later phase → add it, re-plan, re-advance.
+**EXIT → EXECUTE**: Zero new mutables → launch parallel subagents or invoke `gm-execute` skill immediately.
+**SELF-LOOP**: New items discovered → add to .prd → plan again (remain in PLAN state).
+**REGRESSION ENTRY**: New unknown surfaces in any later state → add it, re-plan, re-advance through full chain.