@mutmutco/opencode-mmi 2.48.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (39)
  1. package/dist/index.d.ts +35 -0
  2. package/dist/index.js +194 -0
  3. package/package.json +44 -0
  4. package/skills/_shared/doctrine.md +238 -0
  5. package/skills/bootstrap/SKILL.md +419 -0
  6. package/skills/bootstrap/seeds/Dockerfile.template +25 -0
  7. package/skills/bootstrap/seeds/README.template.md +36 -0
  8. package/skills/bootstrap/seeds/architecture.template.md +19 -0
  9. package/skills/bootstrap/seeds/components.json.template +31 -0
  10. package/skills/bootstrap/seeds/cursor-environment.template.json +3 -0
  11. package/skills/bootstrap/seeds/cursor-rules.template.mdc +11 -0
  12. package/skills/bootstrap/seeds/design-system.paths.template.json +8 -0
  13. package/skills/bootstrap/seeds/docker-compose.template.yml +17 -0
  14. package/skills/bootstrap/seeds/gate.template.yml +42 -0
  15. package/skills/bootstrap/seeds/google-login.template.md +35 -0
  16. package/skills/bootstrap/seeds/manifest.json +32 -0
  17. package/skills/bootstrap/seeds/mcp-playwright.template.json +13 -0
  18. package/skills/bootstrap/seeds/mmi-product-required-checks.template.json +23 -0
  19. package/skills/browser-automation/SKILL.md +137 -0
  20. package/skills/build/SKILL.md +237 -0
  21. package/skills/build/references/halt-report.md +38 -0
  22. package/skills/build/references/loops.md +13 -0
  23. package/skills/build/references/worked-example.md +18 -0
  24. package/skills/build/templates/campaign-northstar.md +40 -0
  25. package/skills/coop/SKILL.md +77 -0
  26. package/skills/grind/SKILL.md +469 -0
  27. package/skills/grind/references/auto.md +107 -0
  28. package/skills/grind/references/build-notes.md +56 -0
  29. package/skills/grind/references/routing.md +76 -0
  30. package/skills/grind/references/verify.md +83 -0
  31. package/skills/grind/templates/saga-snapshot.md +28 -0
  32. package/skills/grind/templates/synthesize-panel.md +104 -0
  33. package/skills/handoff/SKILL.md +67 -0
  34. package/skills/hotfix/SKILL.md +219 -0
  35. package/skills/mmi/SKILL.md +372 -0
  36. package/skills/rcand/SKILL.md +169 -0
  37. package/skills/release/SKILL.md +309 -0
  38. package/skills/secrets/SKILL.md +137 -0
  39. package/skills/stage/SKILL.md +150 -0
@@ -0,0 +1,77 @@
1
+ ---
2
+ name: coop
3
+ description: Cross-repo, cross-PC multi-agent coordination — coordinator + joiners, GitHub issue handshake, #mmi-agents bus, Hub wake. Use instead of send_message or ad-hoc Slack MCP in unsupervised mode.
4
+ ---
5
+
6
+ # /coop — agent coordination
7
+
8
+ Opt-in **multi-agent alignment** when parallel agents (different worktrees, IDEs, PCs, or repos) must handshake before merging or continuing.
9
+
10
+ ## When to use
11
+
12
+ - Two or more agents need to **align on a plan** before either proceeds
13
+ - Cross-repo dependency (Agent A's API shape ↔ Agent B's consumer)
14
+ - Cross-PC / cross-IDE coordination without human relay
15
+
16
+ ## When not to use
17
+
18
+ - Serial merge train → use `wave land`
19
+ - Session transfer → use `/handoff`
20
+ - General Slack chat → not chatops; `#mmi-agents` + `COOP_*` protocol only
21
+
22
+ ## Quick start
23
+
24
+ **Coordinator** (creates issue + posts `COOP_START`):
25
+
26
+ ```bash
27
+ mmi-cli coop start --repo mutmutco/MyRepo --message-file tmp/coop-open.md
28
+ ```
29
+
30
+ **Joiner**:
31
+
32
+ ```bash
33
+ mmi-cli coop join <coopId> [--cloud]
34
+ ```
35
+
36
+ **Handshake** (substance on the **GitHub issue**; Slack gets stubs):
37
+
38
+ ```bash
39
+ mmi-cli coop say <coopId> --phase HANDSHAKE_OPEN --message-file tmp/proposal.md
40
+ mmi-cli coop say <coopId> --phase ACK --message-file tmp/ack.md
41
+ mmi-cli coop say <coopId> --phase SHOOK --message-file tmp/shook.md
42
+ ```
43
+
44
+ **End** (coordinator, after user confirms):
45
+
46
+ ```bash
47
+ mmi-cli coop end <coopId>
48
+ ```
49
+
50
+ ## Wake (primary path)
51
+
52
+ Hub dispatches wake on every coop message targeting joiners:
53
+
54
+ 1. **SessionStart** — `coop pending` banner + detached `coop deliver`
55
+ 2. **Cursor cloud** — when joiner used `--cloud`
56
+ 3. **Slack Events** — inbound `#mmi-agents` coop messages re-trigger wake
57
+
58
+ **Do not** rely on `coop watch` unless wake is broken; it is only a degraded polling fallback.
59
+
60
+ ## Rules
61
+
62
+ - **Never** use Claude `send_message` or harness-specific live chat for org coordination
63
+ - **Substance on GitHub issue**; keep Slack stubs short
64
+ - Any **mutmutco org member** with a Hub session may join
65
+ - Coordinator drives until `SHOOK` or explicit abort
66
+
67
+ ## Reference
68
+
69
+ Architecture SSOT: `docs/Architecture/coop-agent-coordination.md`
70
+
71
+ ## Retro
72
+
73
+ If this skill misfires, file one lesson on the Hub board:
74
+
75
+ ```powershell
76
+ mmi-cli skill-lesson --skill coop --title "<what misfired>" --body "<what; evidence; proposed amendment>"
77
+ ```
@@ -0,0 +1,469 @@
1
+ ---
2
+ name: grind
3
+ description: Drive one or more board items to a verified, PR-ready outcome autonomously — or, with --auto, all the way to merged into development. Verify with a fresh-eyes panel, iterate until clean, open a PR; with --auto, watch CI and merge when green. Use when the user says "grind on X", "take this to a PR", "work this issue until it's done", or invokes /grind [--explore] [--light|--standard|--deep|--ultra] [--auto].
4
+ ---
5
+
6
+ # /grind — autonomous deliver-and-verify loop
7
+
8
+ You drive an outcome from intake to a PR-ready state, then **stop** — or, with `--auto`, all the
9
+ way to **merged**. By default you never merge; `--auto` is the sole exception (see its section).
10
+
11
+ Two kinds of work, one loop:
12
+ - **Convergent** (fix a known issue): iterate until zero blockers. Quality = correctness.
13
+ - **Generative** (`--explore`): explore approaches, iterate toward a goal/metric. Quality =
14
+ fitness for the desired end result, not just absence of bugs. When auto-detection fires,
15
+ use the **generative fusion path** (fusion doctrine → fused markdown deliverable) for
16
+ research/explore-class outcomes — see shared doctrine and **## Generative fusion path**.
17
+
18
+ **Shared doctrine:** Read `skills/_shared/doctrine.md` at session start and on resume. Fusion, parallelism, panel economics, flat fan-out, classifier-denied spawns, worktree hygiene, saga resume, enforcement matrix — single source; do not duplicate here.
19
+
20
+ Flags:
21
+ - `--explore` — brainstorm and judge approaches before building (use for open-ended,
22
+ "find a better/faster way" work). Without it, run the convergent loop.
23
+ - **`--light` · `--standard` · `--deep` · `--ultra`** — force one of the four shared effort tiers
24
+ (same ladder `/build` uses; SSOT `_shared/doctrine.md` § Effort tiers + § Model economy). Without a
25
+ flag, grind self-selects from signals (default `standard`); the verify dial tracks the tier. Doctrine
26
+ also governs lightest-model-first, batch-to-cap, and the fusion-model `light` cap.
27
+ - `--ultra` — the **top tier = YOLO full force** across the whole grind loop (not verify-only): ≥3
28
+ distinct models (cross-vendor when exposed), always multi-agent planning + judge, always tool-enabled
29
+ lenses on applicable hard/soft lenses, generative fusion when research/`--explore`, raised verify (8)
30
+ and CI-fix (5) caps under `--auto`, Ultra routing every round. Composable with `--explore`/`--auto`.
31
+ **Distinct from auto-ultra** (below) — escalation triggers alone do **not** unlock YOLO without `--ultra`.
32
+ - `--auto` — fully autonomous, **zero gates**: classify + route silently, auto-frame (auto-deciding
33
+ whether to `--explore`), then after the PR watch CI, fix failures, and **merge into
34
+ `development`** when green. Accepts multiple issues and partitions them
35
+ (batch/serialize/parallel). See **## `--auto` — fully autonomous mode**.
36
+
37
+ ## Hard rules (never break these)
38
+ - **Two gates.** Gate 1: after you state the direction + success criteria, ALWAYS stop and
39
+ wait for the user's go — never skip it. Gate 2: after you open the PR, wait for the merge
40
+ command. Run autonomously between the gates. Every gate follows **## Gates**: a
41
+ self-contained final message, answered through the host's question module. (`--auto` removes
42
+ both gates — it is the only mode that does; see its section.)
43
+ - **Never merge — except under `--auto`.** In every other mode, do not run `mmi-cli pr merge`;
44
+ the human merges. `--auto` is the sole exception: it merges its own PR into `development`
45
+ (never `rc`/`main`, never a promotion), pre-authorized by the flag.
46
+ - **Claim for the human.** `mmi-cli board claim <owner/repo#N> --for <login>` so the human
47
+ owns the work, not the bot — claiming several at once (Phase 00 sets, side-issue batches) →
48
+ pass every ref to ONE call (`board claim <ref> <ref> … --for <login>`; it dedupes and claims
49
+ in parallel), never loop one-by-one. Resolve `<login>` yourself — the session banner's
50
+ `current human:` line, else `mmi-cli whoami` — and echo who you're claiming for; ask only
51
+ when whoami returns `unknown`. Branch from the latest `origin/development` in a worktree
52
+ (`../mmi-worktrees/<branch>`). Never push `rc`/`main`.
53
+ - **Verifier ≠ builder.** Verifier (and hard lenses) must use a **different model** from builder
54
+ whenever the host exposes two models. Under `--ultra`, the third model must also differ from
55
+ builder; synthesizer is never builder.
56
+ - **Bounded.** Iteration cap = **5** verify→fix rounds (**8** when explicit `--ultra`).
57
+ Stuck = the same blocker survives 2 fix attempts. On cap OR stuck: STOP and ask the user — (a) open a WIP PR + report, (b) keep
58
+ grinding with a raised cap, or (c) stop. Never loop forever, never thrash silently. (Under
59
+ `--auto` you cannot ask — afk — so on cap OR stuck you **terminate and report**: open a
60
+ `[WIP]` PR, do not merge.)
61
+ - **Leave nothing behind.** Every main *and* side issue *of this outcome* is resolved before you
62
+ stop — a second root cause, an unverified ask, a downstream break the fix exposes is part of the
63
+ grind, not a follow-up to punt. Fan out **parallel sub-agents** to resolve independent side issues
64
+ concurrently rather than serializing or skipping them. The only leftovers allowed: a
65
+ **truly-separate side-quest** (unrelated scope — file it out-of-scope, it was never this grind's
66
+ job) or the **cap/stuck hand-off** (a `[WIP]` PR — the user's, or `--auto`'s terminate-and-report).
67
+ "Honest scope: fixed one ask, punted the rest" is not an allowed exit.
68
+ - **Not grindable ≠ punt.** A few issues have **no code/PR deliverable at all**: privileged
69
+ master-only ops (prod deploys, org-tier secret writes, box/runner admin), pure coordination
70
+ checklists, or work outside this run's authority. The floor is positive and hard: **if the
71
+ issue's deliverable is a PR to `development`, it IS grindable** — topic, difficulty, or
72
+ privileged-*adjacent* code never make in-scope work "not grindable" (calling it that would be
73
+ the punt the rule above forbids). For a genuinely non-grindable issue: classify it **before you
74
+ claim**, then do NOT claim it, do NOT execute its ops, do NOT open an empty/fake PR. Interactive
75
+ `/grind`: surface it up front (skip Phase 0a′/0a, don't propose criteria) and stop.
76
+ `--auto`: drop it (no claim, no branch, no PR) and surface it in the final report with the reason
77
+ — if it was the only issue, the run produces just that report.
78
+ - **Data/config-fix outcomes (#1576).** Some grindable issues resolve via a **safe, authorized data or
79
+ config mutation** (e.g. a registry-META backfill via `mmi-cli project set`), not a code PR. When the
80
+ root-cause fix is data and a code change would be speculative (AGENTS §2 "least code"):
81
+ - **Do not** manufacture a synthetic code PR, and **do not** drop the issue as "not grindable".
82
+ - Confirm the mutation is **within standing authority** (`access role`) and **low-risk/reversible**;
83
+ under `--auto` the flag authorizes resolving the named issue its recommended way.
84
+ - Perform the fix, **verify empirically** (run the command whose output the acceptance pins — no diff
85
+ to panel), close the issue, and **file any durable-prevention enhancement** as a separate item.
86
+ - Terminal-done layer 1 is the live command output; layer 3 is the close, not a merge.
87
+ - **Resumable.** After every phase: (1) silent one-line `mmi-cli saga note "<audit>"`; (2) `mmi-cli saga snapshot set --kind grind …` (or `--json-file`) — see **## Saga keep (resume snapshot)**. On resume, `mmi-cli saga snapshot show --kind grind` first; never reconstruct from collapsed chat history.
88
+ - **Announce routing.** At phase transitions (after Phase 0a′ / model selection, before Gate 1 under
89
+ interactive, and at equivalent points under `--auto`), **announce explicitly** in user-visible text
90
+ **and** mirror the same facts in a silent `mmi-cli saga note`:
91
+ - **Ultra mode** when explicit `--ultra` YOLO is active (distinct from auto-ultra verify uplift).
92
+ - **Explore mode** when `--explore` or auto-framed explore is selected.
93
+ - **Routing tier** (Budget / Balanced / Paranoid / Ultra verify routing).
94
+ - **Models chosen** for panel roles: builder, verifier, third (when ultra), synthesizer, judge
95
+ (multi-plan) — slot names matching host-exposed models.
96
+ - **Reasoning effort** when explicit `--ultra` elevates it (model + level; see **## Reasoning effort
97
+ under explicit `--ultra`**).
98
+ Under `--auto`, emit these announcements in the run output/report even without gates — never route silently.
99
+ - **Self-contained intake.** Given a task (issue ref, title, or fresh ask), `/grind` owns
100
+ grindability, task classification, explore-vs-convergent, ultra eligibility, and scope splitting.
101
+ Vague bodies are framed in Phase 0b — derive testable criteria, file child issues with `--parent`
102
+ when the scope must split, or `--explore` when the approach is open-ended.
103
+
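A minimal sketch of the **Bounded** rule above, under stated assumptions. The names `RoundResult`, `StopReason`, `iterationCap`, and `shouldStop` are hypothetical and not part of `mmi-cli`. Blockers are assumed to carry a stable id string, and "survives 2 fix attempts" is read here as the same id appearing in three consecutive failed rounds (the first sighting plus two fixes that did not clear it). Only the cap values (5, or 8 under explicit `--ultra`) come from the text.

```typescript
// Hypothetical shapes for illustration; not the package's real types.
type RoundResult = { blockerIds: string[] };
type StopReason = "clean" | "cap" | "stuck" | null;

// Cap from the Bounded rule: 5 verify→fix rounds, 8 under explicit --ultra.
function iterationCap(explicitUltra: boolean): number {
  return explicitUltra ? 8 : 5;
}

// Decide whether the loop must stop after the latest verify round.
// Returns null when another fix round is still allowed.
function shouldStop(rounds: RoundResult[], explicitUltra: boolean): StopReason {
  const last = rounds[rounds.length - 1];
  if (last === undefined) return null; // no round has run yet
  if (last.blockerIds.length === 0) return "clean";

  // Stuck (assumed reading): one blocker id present in each of the last 3 rounds.
  const recent = rounds.slice(-3);
  const stuck =
    recent.length === 3 &&
    last.blockerIds.some((id) =>
      recent.every((r) => r.blockerIds.indexOf(id) !== -1)
    );
  if (stuck) return "stuck";

  if (rounds.length >= iterationCap(explicitUltra)) return "cap";
  return null;
}
```

On `"cap"` or `"stuck"` the interactive path would ask the user for (a)/(b)/(c), while `--auto` would terminate and open a `[WIP]` PR; that dispatch is left out of the sketch.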
104
+ ## Saga keep (resume snapshot)
105
+
106
+ Grind-specific snapshot wiring — shared resume rules in `skills/_shared/doctrine.md`. Schema: `templates/saga-snapshot.md`. Use `--kind grind` for show/set. CLI maps snapshot fields → HEAD primitives (`next`, `anchor`, checklist) — no parallel store.
107
+
108
+ ## Gates — how every gate (and decision point) is presented
109
+ Hosts collapse mid-turn text: anything you printed between tool calls may be invisible when
110
+ the user reaches the gate. So:
111
+
112
+ - **Self-contained final message.** Each gate message restates everything the user needs to
113
+ decide, in the message itself — never "as shown above", never relying on earlier mid-turn
114
+ output. Required content:
115
+ - **Gate 1:** grind **class** (bug/feature/research/task/security-critical), **routing**
116
+ (Budget/Balanced/Paranoid/Ultra), **effort tier** — name the chosen tier
117
+ (`light`/`standard`/`deep`/`ultra`), not an "Ultra: yes/no"; at `ultra` list the YOLO dimensions, and
118
+ note any auto-ultra verify uplift separately. Plus **planning** verdict when it ran (planner count, judge scores),
119
+ **tool-enabled lenses** when auto-gated (or always at `ultra`), **generative fusion** when active;
120
+ model slots + cost note at `ultra`; candidate approaches (with `--explore`: judge verdict/scores);
121
+ the chosen direction; and the full success criteria/target.
122
+ - **Gate 2:** the results summary — criteria/metric met, rounds run, issues filed/fixed,
123
+ anything unresolved.
124
+ - **Cap/stuck ask:** the blocker, what was tried, and the (a)/(b)/(c) options.
125
+ - **Ask through the host's question module.** Collect the answer at EVERY gate and decision
126
+ point — Phase 0a classification confirm, ultra confirm (when auto-ultra), model overrides, Gate 1
127
+ go, cap/stuck ask, Gate 2 merge decision — with the host's structured-question UI
128
+ (AskUserQuestion in Claude Code; the equivalent on Codex / Cursor Composer; plain text only if
129
+ the host has none). Required content stays in the message body; the question module carries the
130
+ choice, not the context.
131
+
132
+ ## Phase 0a′ — Classify & route
133
+
134
+ Runs immediately after the grindability check, **before** Phase 0a model questions. (`--auto`:
135
+ apply silently for classification; still **announce** routing per **Hard rules — Announce routing**;
136
+ `saga note "grind class=X routing=Y ultra=Z reason=…"`.)
137
+
138
+ Read the issue **type label** (`bug` / `feature` / `task`), **priority**, **title/body**,
139
+ **labels** (e.g. `security`), known **files touched**, and flags (`--explore`, `--ultra`, `--auto`).
140
+
141
+ | Class | Signals | Builder bias | Default routing | Explore? |
142
+ |-------|---------|--------------|-----------------|----------|
143
+ | **Bug** | `bug` label, repro, regression | Fast/cheap OK | **Budget** → Balanced if security-adjacent | No |
144
+ | **Feature** | `feature` label, new capability | Stronger | **Balanced** | Only if body open-ended |
145
+ | **Research** | `--explore`, spike, metric goal | Brainstorm model | **Balanced**; ultra if high-stakes metric | Yes |
146
+ | **Task / chore** | `task` label, refactor, deps, CI | Cheap | **Budget**; prose-only → reduced panel | No |
147
+ | **Security-critical** | auth/crypto/payments/PII labels or code; agent-facing trusted corpus | Strong verifier | **Balanced** min; auto-ultra likely | Rare |
148
+
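The "Default routing" column of the class table can be read as a small lookup. The sketch below is illustrative only: `GrindClass`, `RoutingSignals`, and `defaultRouting` are hypothetical names, the two boolean signals are assumed to have been derived already from labels and body text, and the "auto-ultra likely" note on the security-critical row is not modeled.

```typescript
// Hypothetical names; the real policy is stated to live in cli/src/grind-policy.ts.
type GrindClass = "bug" | "feature" | "research" | "task" | "security-critical";
type Routing = "Budget" | "Balanced" | "Paranoid" | "Ultra";

interface RoutingSignals {
  securityAdjacent?: boolean; // bug touches security-adjacent code
  highStakesMetric?: boolean; // research goal with a high-stakes metric
}

// Map a class plus signals to the table's default verify routing.
function defaultRouting(cls: GrindClass, s: RoutingSignals = {}): Routing {
  if (cls === "bug") return s.securityAdjacent ? "Balanced" : "Budget";
  if (cls === "research") return s.highStakesMetric ? "Ultra" : "Balanced";
  if (cls === "task") return "Budget";
  // "feature" defaults to Balanced; "security-critical" has Balanced as its floor.
  return "Balanced";
}
```

Paranoid never appears as a default here because the text makes it opt-in (ultra or explicit override).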
149
+ ### Ultra routing detail → `references/routing.md`
150
+
151
+ The full **auto-ultra detection**, **explicit `--ultra` YOLO dimensions**, **cross-vendor model
152
+ diversity**, and **reasoning-effort** rules live in `references/routing.md`. Load it when ultra is in
153
+ play. Quick recap:
154
+
155
+ - **auto-ultra** = verify-panel uplift only (≥3 models, Ultra routing, hard-lens double-pass) — fires on urgent+security, architectural scope, or after 2 failed rounds. **Not** whole-loop YOLO.
156
+ - **Explicit `--ultra`** = whole-loop YOLO (≥3 cross-vendor models, always multi-plan, tool lenses, fusion, raised caps 8/5, elevated reasoning). List every dimension at Gate 1.
157
+ - Verifier ≠ builder always; synthesizer never builder; document any vendor-diversity gap honestly.
158
+
159
+ ## Phase 0a — Choose models
160
+
161
+ (`--auto`: skip questions — see **### Phase 0a — Models (auto)** under `--auto`.)
162
+
163
+ First, **grindability** (Hard rule): if not grindable, stop before Phase 0a′.
164
+
165
+ **Already-resolved guard (#1617):** before claiming, check the named issue's live state. If it is
166
+ already **CLOSED/COMPLETED with its grind PR MERGED**, the outcome already shipped — no-op: do
167
+ **not** re-claim, branch, or open a PR. Report "already done" and stop. Under `--auto` this is the
168
+ safe landing for a stale re-invocation (e.g. a wakeup that fires after the work merged — see the
169
+ **background-land** note in Phase 4); verify against `gh`/the board, never assume from chat.
170
+
171
+ Then run **Phase 0a′**. Interactive flow:
172
+
173
+ 1. Present **classification summary** (class, routing, ultra yes/no + reason; cost note if ultra).
174
+ 2. **Confirm or override** builder, verifier, third model (ultra only), synthesizer — pre-filled
175
+ from class table + ultra rules.
176
+ 3. **Verify routing** — pre-fill Budget/Balanced/Paranoid/Ultra; Paranoid only when ultra or
177
+ explicit override.
178
+
179
+ Offer models the **current host** exposes — do not hardcode one vendor:
180
+ - Claude Code: Opus / Sonnet / Haiku / Fable.
181
+ - Codex / Cursor Composer: that host's models.
182
+ - If you cannot enumerate them, accept free-text model names.
183
+
184
+ **Prioritize cross-vendor pairing** whenever the host exposes multiple vendors — verifier **≠**
185
+ builder is the floor, mixed vendors is the target. **Hard lenses cannot drop below verifier tier.**
186
+ See `templates/synthesize-panel.md`.
187
+
188
+ **Lens tiers:**
189
+
190
+ | Lens | Role |
191
+ |------|------|
192
+ | `requirements-match`, `tests-actually-test`, `error-handling` | **Soft** |
193
+ | `security`, `correctness` | **Hard** (verifier tier minimum) |
194
+
195
+ | Verify routing | Soft lenses | Hard lenses | Phase 2b | When |
196
+ |----------------|-------------|-------------|----------|------|
197
+ | **Budget** | budget tier | verifier, **≠ builder** | every round | Routine bugs, chores |
198
+ | **Balanced** | mid/budget tier | verifier, **≠ builder** | every round | Features, default `--auto` |
199
+ | **Paranoid** | verifier tier | **double-pass** | every round | Opt-in high-stakes (non-ultra) |
200
+ | **Ultra** | verifier tier | **double-pass**, 2 models if possible | every round | `--ultra` or auto-ultra |
201
+
202
+ **Paranoid / Ultra hard-lens double-pass:** run `security` and `correctness` twice — different
203
+ temperature or two different models — before Phase 2b.
204
+
205
+ Log each verify round: `mmi-cli saga note "verify round N: routing=X ultra=Y"`.
206
+
207
+ ## Phase 0b — Frame [GATE 1]
208
+ (`--auto`: no gate — auto-decide explore-vs-convergent, in `--explore` auto-pick the judge's
209
+ winner, then proceed straight into Phase 1 without waiting. See **## `--auto`**.)
210
+
211
+ ### Multi-agent planning (auto-gated)
212
+
213
+ When auto-detection fires (or **always** under explicit `--ultra`), run the **fusion doctrine**
214
+ planning shape — **parallel planners + verifier-tier judge** — before Gate 1 / Phase 1:
215
+
216
+ 1. Spawn **N parallel planner sub-agents** (N=2–3; N=3 when scope is architectural, Research class,
217
+ wide `--explore`, or explicit `--ultra`).
218
+ 2. Each planner returns: approach summary, risks, proposed success criteria, estimated complexity.
219
+ 3. **Judge agent** (verifier-tier model, **≠** any planner/builder) scores against the goal rubric;
220
+ picks winner or synthesizes hybrid — synthesize-and-reconcile, not vote/debate.
221
+ 4. Winner + criteria written to issue body; North Star push; proceed to Gate 1 (interactive) or
222
+ Phase 1 (`--auto`). `mmi-cli saga note "multi-plan N=<n> winner=<summary>"`.
223
+
224
+ **Skip multi-agent planning** for narrow bugs, low priority, or user "quick"/"small" — single
225
+ planner + judge (or direct criteria framing) is enough.
226
+
227
+ | Trigger | Multi-agent planning |
228
+ |---------|---------------------|
229
+ | `--explore` + wide/open-ended body | Yes (N=3) |
230
+ | `--explore` + clear metric/goal | Yes (N=2) |
231
+ | Class = **Research** | Yes (N=3) |
232
+ | Architectural / cross-cutting (auto-ultra trigger #4) | Yes (N=3) |
233
+ | Priority `urgent` + security-critical | Yes (N=2) |
234
+ | Explicit **`--ultra`** | **Always** (N=3) |
235
+ | Narrow bug / low priority / quick | No |
236
+
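One way to read the trigger table above is as a function from intake signals to a planner count. This is a sketch, not the package's policy code: `PlanSignals` and `plannerCount` are hypothetical names, and the precedence is an assumption (explicit `--ultra` wins, then the skip signals, then the largest N among the remaining triggers), since the table does not state how conflicting rows are resolved.

```typescript
// Hypothetical signal bag; fields mirror the table rows.
interface PlanSignals {
  explicitUltra?: boolean;
  explore?: "wide" | "clear-goal"; // --explore with open-ended body vs clear metric/goal
  grindClass?: "bug" | "feature" | "research" | "task" | "security-critical";
  architectural?: boolean; // architectural / cross-cutting scope
  urgent?: boolean;        // priority `urgent`
  quick?: boolean;         // narrow bug, low priority, or user said "quick"/"small"
}

// Returns 0 when multi-agent planning is skipped, else the number of planners.
function plannerCount(s: PlanSignals): 0 | 2 | 3 {
  if (s.explicitUltra) return 3; // "Always (N=3)"
  if (s.quick) return 0;         // assumed to beat the non-ultra triggers
  if (s.explore === "wide" || s.grindClass === "research" || s.architectural) return 3;
  if (s.explore === "clear-goal") return 2;
  if (s.urgent && s.grindClass === "security-critical") return 2;
  return 0;
}
```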
237
+ Composable with **`--explore`** generative fusion (below) — fused deliverable may incorporate
238
+ planner outputs when both paths are active.
239
+
240
+ **Default (`/grind`):**
241
+ 1. Resolve the item — file it
242
+ (`mmi-cli issue create --type <bug|feature|task> --priority <…> --title "…" --body "…"`)
243
+ or claim the one the user named. Then `mmi-cli board claim <owner/repo#N> --for <login>`.
244
+ (`--priority` writes the board Priority **field**, never a `priority:*` label — #416.)
245
+ (Grindability was already settled before Phase 0a′; a not-grindable issue never reaches here.)
246
+ 2. Convert the ask into **verifiable success criteria** — testable statements, not vibes.
247
+ **Vague or ambiguous body?** Read the full issue (body + comments), state the deliverable,
248
+ and write criteria the user can confirm at Gate 1. Umbrella scope → child issues (`--parent`),
249
+ one shippable unit per grind.
250
+ Write them into the issue body; push the criteria to North Star (`mmi-cli northstar push <slug>` —
251
+ the default push queues a background sync and prints "queued": that is success, not a failure;
252
+ `mmi-cli northstar status` checks it, `mmi-cli northstar sync` confirms durably).
253
+ 3. Present per **## Gates** (class, routing, ultra, criteria). **Wait for the user's go.**
254
+
255
+ **With `--explore`:**
256
+ 1. When **generative fusion** auto-gates (see below), run the fusion panel instead of or after
257
+ single-builder brainstorm — fused markdown becomes the direction doc.
258
+ 2. Otherwise brainstorm **2-3 candidate approaches** (2 if the goal is clear, 3 if wide) on the
259
+ builder model — or use **multi-agent planning** when auto-gated.
260
+ 3. Score with the **verifier-model judge** (or fusion panel judge); pick or synthesize a direction.
261
+ 4. Define a success target — numeric metric if stated, else judged rubric. File/claim, North Star
262
+ push. Present per **## Gates**. **Wait for the user's go.**
263
+
264
+ ## Generative fusion path (auto-gated)
265
+
266
+ Same **fusion doctrine** shape — parallel cross-vendor planners/agents on the explore problem,
267
+ verifier-tier **judge** fuses into one markdown deliverable. For **research-class** and wide
268
+ **`--explore`** grinds whose deliverable is a design conclusion, spike report, or approach doc
269
+ (not only a code PR):
270
+
271
+ 1. Run **2–3 parallel planner-tier agents** (cross-vendor when the host allows) with **bounded
272
+ tools** (web search, repo read) on the same explore problem. **Judge** (verifier-tier, **≠**
273
+ any planner) synthesizes — not vote/debate.
274
+ 2. Judge output is a **fused markdown deliverable**: chosen approach, tradeoffs, criteria
275
+ refinement, spike findings.
276
+ 3. Land in issue body + `mmi-cli northstar push <slug>`.
277
+ 4. **Code-shipping explore:** fused doc feeds Phase 1; Phase 2 verify stays diff-pinned.
278
+ 5. **Research-only** (no code PR): fusion output is the primary artifact; stop after criteria met
279
+ + Gate 2 (`--auto` report).
280
+
281
+ | Trigger | Generative fusion |
282
+ |---------|-------------------|
283
+ | `--explore` + non-purely-convergent deliverable | Candidate |
284
+ | Class = **Research** | Yes |
285
+ | Open-ended feature ("find best approach") | Yes |
286
+ | Explicit **`--ultra`** + Research/`--explore` | Yes |
287
+ | Bug / task / chore / quick | No |
288
+
289
+ Auto-detection rules are also encoded in `cli/src/grind-policy.ts` (test fixtures lock inputs → path).
290
+
291
+ ## Phase 1 — Build
292
+ - Cut a worktree from the latest `origin/development`. Implement to the criteria/approach on the
293
+ **builder model**. Write/adjust tests that lock the behavior (TDD where it fits).
294
+ - Run **scoped repo checks** while building (shared doctrine § Verify tiers) — not the full gate mirror on every loop. Fix obvious breaks before spending verifier tokens.
295
+ - (`--explore`) Spike a second approach if the first looks weak; let the judges choose.
296
+ - **Checkpoint verification (build sub-loops).** Do not defer all judgment to end-of-round Phase 2.
297
+ At natural **increment checkpoints** — a module landed, a test suite green, a refactor slice done
298
+ — run a **light verify pass**: requirements slice for that increment, quick correctness read, or
299
+ a single lens if stakes are high. Cheap early signal; full panel still runs at Phase 2 before PR.
300
+ - **Re-tackle when inadequate.** If a checkpoint or local check shows the approach is wrong, discard
301
+ the slice and rebuild — do not patch forward to green.
302
+ - **Build gotchas → `references/build-notes.md`.** Load it before trusting a local green: edit-the-worktree
303
+ false-greens, clean-UTF-8/mojibake checks (and the Windows-console false-positive), Windows warm-build
304
+ flakes, the CI-gate mirror (incl. the `infra` `plugin-set` static guards for spine/surface diffs), and
305
+ the real-run requirement for IO/syscall-boundary changes.
306
+
307
+ ## Phase 2 — Judge (fresh eyes)
308
+
309
+ **Mechanism:** spawn parallel lenses → each returns strict JSON → Phase 2b synthesizer produces
310
+ `PanelReport` → triage uses **`PanelReport.blockers` only**. Not vote, not debate.
311
+
312
+ `/grind` (`--auto` too) authorizes required panels. On Codex, use `multi_agent_v1` when
313
+ available; report degraded fallback. Never replace an available panel with hand-written lens JSON.
314
+
315
+ Spawn a **parallel panel** per lens, using **Phase 0a routing** for each lens's model tier. Under
316
+ **Ultra**, assign soft lenses to the third model when available. Each verifier gets ONLY the diff +
317
+ criteria/goal — never how you built it. **Pin every verifier to the change — a hard rule on EVERY
318
+ round.** The orchestrator MUST hand each lens the *verbatim* diff — never a summarized/minified
319
+ rendering. Form (a) inline verbatim `git diff` only when verifying from the **same checkout** that
320
+ holds the change. **Grind default `isolation=worktree`:** form (b) only — write
321
+ `git -C <worktree> diff …` to `tmp/grind-verify-<round>.patch` and pass that path; never rely on
322
+ lens CWD.
323
+
324
+ **Lens-prompt clauses → `references/verify.md`.** Every lens prompt MUST carry: the **verbatim-includes-test-files** rule, the **abstention** rule (`cannot-verify`, never a false "absent/missing" blocker), the **diff-shape** clause (a referenced-but-undefined symbol is pre-existing — never flag it), and the **worktree-isolation** clause (patch-only, deny repo FS, stale-checkout warning). The exact wording lives in `references/verify.md` — load it before spawning lenses.
325
+
326
+ Under **Paranoid** or **Ultra**, run **hard lenses twice** before Phase 2b.
327
+ Run **all five lenses, every round** (except prose/docs-only reduction below):
328
+ 1. **requirements-match** — does it meet the stated criteria?
329
+ 2. **correctness** — bugs, logic errors, broken edge cases.
330
+ 3. **error-handling** — failure modes, boundaries, resource cleanup.
331
+ 4. **security** — exploitable paths (attacker · input · sink · impact).
332
+ 5. **tests-actually-test** — do the tests exercise behavior, or pass vacuously?
333
+
334
+ **Prose/docs-only changes** run a **reduced panel**: **requirements-match**, **correctness**
335
+ (factual accuracy), **no-new-falsehood**. Error-handling and tests-actually-test are **N/A**.
336
+ **Keep security** for agent-facing prose (SKILL/AGENTS/prompts, KB). Drop security only for inert
337
+ human-only prose.
338
+
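The panel-selection rules above (five lenses by default, a reduced panel for prose/docs-only changes, security kept for agent-facing prose) could be sketched as follows. `ChangeKind` and `lensesFor` are hypothetical names, and how a change gets classified as prose-only or agent-facing is assumed to happen elsewhere; `--explore` goal judges are not included.

```typescript
// Lens names are taken from the text; the function itself is illustrative.
const FULL_PANEL: string[] = [
  "requirements-match",
  "correctness",
  "error-handling",
  "security",
  "tests-actually-test",
];

interface ChangeKind {
  proseOnly: boolean;   // prose/docs-only diff
  agentFacing: boolean; // SKILL/AGENTS/prompts/KB prose that agents will read
}

// Pick the lenses to spawn for one verify round.
function lensesFor(change: ChangeKind): string[] {
  if (!change.proseOnly) return FULL_PANEL.slice();
  // Reduced panel: error-handling and tests-actually-test are N/A for prose.
  const reduced = ["requirements-match", "correctness", "no-new-falsehood"];
  // Security is dropped only for inert, human-only prose.
  if (change.agentFacing) reduced.push("security");
  return reduced;
}
```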
339
+ (`--explore`) Add **goal judges**: metric hit? simpler path? right result?
340
+
341
+ Each verifier returns strict JSON:
342
+ `{ "verdict": "pass|fail|cannot-verify", "blockers": [{"title","file","line","why"}], "nits": [] }`.
343
+ Do **not** triage raw lens blockers directly (Phase 2b first).
344
+
345
+ **Tool-enabled lenses → `references/verify.md`.** Hard lenses should anchor to objective signals
346
+ (failing test, typecheck error, cited source) and populate `blockers[].sources`. Bounded `webSearch`
347
+ is the default on applicable lenses; `sandboxRepro` only when the body has repro commands. Under the
348
+ grind default `isolation=worktree`, **repo FS tools stay forbidden even under ultra** — web search for
349
+ external facts only. Full trigger table + domain-hygiene rules: `references/verify.md`.
350
+
351
+ ## Phase 2b — Synthesize (panel reconciliation)
352
+ After Phase 2, spawn **one synthesizer sub-agent** on the **synthesizer model** (Phase 0a). Feed
353
+ it ONLY: success criteria/goal, every lens JSON from this round (labeled by lens name), and a
354
+ diff stat or excerpt — **never** builder reasoning or session history. Use the prompt in
355
+ `templates/synthesize-panel.md`.
+
+ The synthesizer returns a **`PanelReport`** — structured reconciliation of lens outputs:
+
+ | Field | Meaning |
+ |-------|---------|
+ | `consensus` | Points most/all lenses agree on (higher confidence) |
+ | `contradictions` | Topic + per-lens stances where they disagree |
+ | `partial_coverage` | Points only some lenses covered |
+ | `unique_insights` | A point raised by only one lens |
+ | `blind_spots` | Criteria imply it matters; no lens addressed it |
+ | `blockers` | Deduped ship-stoppers; each has a stable `id`, `title`, `file`, `line`, `why`, `sources` |
+ | `nits` | Advisory, non-blocking |
+
+ A verify round **fails if `PanelReport.blockers` is non-empty**. If synthesis errors or returns
+ invalid JSON, **degrade gracefully**: union raw lens `blockers` (manual dedupe by file+line+title),
+ `saga note` the degradation, and continue Phase 3 — synthesis is an uplift, not a hard dependency.
+ Optional CLI path: `mmi-cli verify panel` plans lens jobs; pipe lens JSON to `mmi-cli verify synthesize`
+ for deterministic blocker dedupe before the host synthesizer enriches consensus/contradictions.
+ Real verifier lanes only. Empty or controller-authored all-pass stubs are invalid evidence and do
+ not count toward clean rounds.
+
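The graceful-degradation path (union raw lens `blockers`, dedupe by file+line+title) can be sketched as follows. This is an illustrative assumption, not the behavior of `mmi-cli verify synthesize`; `union_blockers`, `round_fails`, and the added `lenses` field are hypothetical.

```python
def union_blockers(lens_reports: dict[str, dict]) -> list[dict]:
    """Union raw lens blockers, deduped by (file, line, title)."""
    merged: dict[tuple, dict] = {}
    for lens_name, report in lens_reports.items():
        for blocker in report.get("blockers", []):
            key = (blocker["file"], blocker["line"], blocker["title"])
            # Keep the first copy seen; record every lens that raised it.
            entry = merged.setdefault(key, {**blocker, "lenses": []})
            entry["lenses"].append(lens_name)
    return list(merged.values())


def round_fails(blockers: list[dict]) -> bool:
    """A verify round fails when any blocker remains."""
    return len(blockers) > 0
```

Because the key ignores `why`, two lenses describing the same defect in different words collapse into one entry, which is the manual dedupe the fallback asks for.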
+ **Synthesis handling → `references/verify.md`** for the detail on `blind_spots` (one micro-lens max),
+ `contradictions` (escalate spec ambiguity per **## Gates**; `--auto` picks the criteria-satisfying
+ reading), and **absence-claims** (drop "missing/absent" blockers contradicted by the pinned patch or
+ green builder tests; re-run patch-only if lens logs show disk reads). Fusion is IDE-native multi-model
+ only — no hosted provider.
+
+ ## Phase 3 — Triage & iterate
+
+ **Each loop must deepen understanding** — of the problem and of the implementation — not merely
+ retry the same shape with more tokens. Re-frame criteria and approach as evidence accumulates; a
+ fix round should leave you knowing more than the last round.
+
+ Triage from the **`PanelReport`** (Phase 2b), not raw lens JSON. Use stable blocker `id`s across
+ rounds — the same root cause should keep the same `id` until resolved.
+
+ - **Substantial or pre-existing** blocker → file a board issue
+ (`mmi-cli issue create --type bug …`), `board claim --for <login>`, then fix it. Reference the
+ `PanelReport` blocker `id` in the issue body when filing.
+ - **Trivial** blocker → fix from a `tmp/` worklist (gitignored scratch), no board issue.
+ - **Side issue *of this outcome*** → in scope: resolve before you stop. Fan out **parallel
+ sub-agents** for independent side issues.
+ - **Truly-separate side-quest** → file out-of-scope and move on.
+ - (`--explore`) A "works but a better path exists" verdict drives the next round; re-measure
+ against the target.
+ - Re-run Phase 2 → 2b → 3. Repeat until **2 consecutive synthesis-stable clean rounds**:
+   - `PanelReport.blockers` empty on both rounds, **and**
+   - the blocker `id` set is identical across both (both empty counts as stable), **and**
+   - repo checks / CI green (terminal-done layer 1); two clean rounds alone are insufficient.
+
+   This stability rule approximates **variance control**: one lucky clean round is not enough.
+   If round N is clean but round N+1 introduces a **new** `id`, reset the counter. The loop also
+   ends on cap/stuck (then ask, per Hard rules). On round 3+, if still failing and not yet ultra,
+   apply **auto-ultra escalation** (Phase 0a′) before the next verify round.
+
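The stop rule for the verify loop can be sketched as a trailing-streak check. This is a minimal sketch under stated assumptions: each round is represented only by its set of `PanelReport` blocker `id`s, and `clean_streak` and `may_stop` are hypothetical names.

```python
def clean_streak(rounds: list[set[str]]) -> int:
    """Count trailing consecutive rounds whose blocker-id set is empty."""
    streak = 0
    for ids in reversed(rounds):
        if ids:
            # Any blocker id, new or recurring, resets the counter.
            break
        streak += 1
    return streak


def may_stop(rounds: list[set[str]], ci_green: bool, required: int = 2) -> bool:
    """Clean rounds alone are not enough; repo checks / CI must also be green."""
    return ci_green and clean_streak(rounds) >= required
```

With this representation, two empty sets in a row are trivially identical, which matches the note that both-empty counts as stable.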
+ ## Phase 4 — PR [GATE 2]
+ (`--auto`: no gate — open the PR, then run the **CI-merge loop** and merge into `development`
+ yourself when green. See **## `--auto`**.)
+
+ - **Base freshness (#1906):** before `pr create`, follow doctrine § worktree hygiene.
+ - Open the PR: `mmi-cli pr create --base development --head <grind-branch> --title "…" --body "…"`
+ with `Closes #<issue>` for every issue resolved. **Pass `--head` explicitly** — `pr create`
+ defaults `--head` to the *current* branch, which is `development` when you run Phase 4 from the
+ main checkout (the Windows worktree-cleanup default), and `head == base == development` errors with
+ "cannot create a pull request". (`mmi-cli pr create` has no `--draft`; for a cap-hit hand-off, open
+ a regular PR with a `[WIP]` title and the unresolved blockers listed in the body.)
+ - **Tick the parent's checklist (#1796).** If the resolved issue appears as a markdown task-list
+ item in a parent epic/tracking issue, flip it: `mmi-cli issue check <epic-ref> --item "<text>"`
+ — so the epic doesn't read 0% after its items shipped. (Native sub-issues auto-tick on close;
+ this is for body checklists not backed by sub-issues.)
+ - A PR is done only when **every main + side issue of the outcome is resolved**. The sole
+ legitimate leftovers: a truly-separate side-quest (filed out-of-scope) or a cap/stuck `[WIP]`
+ hand-off — never an "honest residual" punt.
+ - Run the **## Retro** check (silent when clean) before reporting.
+ - Report per **## Gates**: criteria/metric met, rounds run, issues filed/fixed (including a
+ skill-lesson issue, if the retro filed one), and — only on a cap/stuck hand-off — what's
+ explicitly deferred and why.
+ - Emit the **## End-of-grind summary** block before waiting for merge.
+ - **Wait for the merge command.** Never run `mmi-cli pr merge` yourself.
+
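The explicit-`--head` rule from the Phase 4 list can be guarded in a wrapper that builds the command line before anything runs. The sketch below only assembles arguments from the flags quoted above (`--base`, `--head`, `--title`, `--body`); `pr_create_argv` and the branch name in the usage note are hypothetical, and nothing here invokes `mmi-cli`.

```python
def pr_create_argv(head: str, title: str, body: str, issues: list[int],
                   base: str = "development") -> list[str]:
    """Build the argv for `mmi-cli pr create`, always with an explicit --head."""
    if head == base:
        # Running from the main checkout would otherwise make head == base.
        raise ValueError("head and base are the same branch; pass the grind branch as --head")
    # One `Closes #<issue>` line for every issue the PR resolves.
    closes = "\n".join(f"Closes #{number}" for number in issues)
    full_body = f"{body}\n\n{closes}" if closes else body
    return ["mmi-cli", "pr", "create",
            "--base", base, "--head", head,
            "--title", title, "--body", full_body]
```

Calling `pr_create_argv("grind/example", "Fix parser", "Summary.", [123])` returns a list suitable for `subprocess.run`, while passing `development` as the head raises before the CLI can fail with "cannot create a pull request".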
+ ## Retro — one check after the PR opens
+
+ See shared doctrine § Self-learning + retro. Grind-specific examples: gate message missing context; verifier lens missed a blocker; Windows worktree lock on cleanup; ambiguous phase instruction. File via `mmi-cli skill-lesson --skill grind` — at most one lesson per run.
+
+ ## End-of-grind summary
+
+ At grind completion (PR opened + interactive stop, or `--auto` terminate/merge report), emit a **very
+ brief** summary block — user-visible and mirrored in `mmi-cli saga note`:
+
+ - **Tier + modes:** chosen tier (`light`/`standard`/`deep`/`ultra`); `--explore`, auto-ultra (verify uplift), `--auto` if applied.
+ - **Models used:** builder / verifier / third / synthesizer / judge (host slot names).
+ - **Reasoning level** when elevated under explicit `--ultra`.
+ - **Fusion paths taken:** generative fusion, multi-agent planning — IDE-native only (not hosted).
+ - **Verify rounds run** (and cap if hit).
+ - **PR outcome:** opened, WIP, merged under `--auto`, CI state if relevant.
+
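One way to keep the summary block brief and uniform is to render it from a small record. The sketch below is an assumption about shape only: `GrindSummary`, its field names, and the placeholder model names in the usage note are hypothetical, and optional lines are omitted when they do not apply.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class GrindSummary:
    tier: str                                                # light / standard / deep / ultra
    modes: list[str] = field(default_factory=list)           # e.g. --explore, auto-ultra, --auto
    models: dict[str, str] = field(default_factory=dict)     # host slot name -> model
    reasoning_level: Optional[str] = None                    # only when elevated under --ultra
    fusion_paths: list[str] = field(default_factory=list)
    verify_rounds: int = 0
    cap_hit: bool = False
    pr_outcome: str = "opened"

    def render(self) -> str:
        """Return the summary as a short Markdown bullet list."""
        lines = [f"- Tier + modes: {self.tier}; {', '.join(self.modes) or 'none'}"]
        if self.models:
            used = " / ".join(f"{slot}={name}" for slot, name in self.models.items())
            lines.append(f"- Models used: {used}")
        if self.reasoning_level:
            lines.append(f"- Reasoning level: {self.reasoning_level}")
        if self.fusion_paths:
            lines.append(f"- Fusion paths taken: {', '.join(self.fusion_paths)}")
        rounds = str(self.verify_rounds) + (" (cap hit)" if self.cap_hit else "")
        lines.append(f"- Verify rounds run: {rounds}")
        lines.append(f"- PR outcome: {self.pr_outcome}")
        return "\n".join(lines)
```

For instance, `GrindSummary(tier="standard", models={"builder": "model-a"}, verify_rounds=3).render()` produces four bullets; the same string could be shown to the user and passed to `mmi-cli saga note`.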
+ ## `--auto` — fully autonomous mode → `references/auto.md`
+
+ When `--auto` is passed, **load `references/auto.md`** — it carries the full autonomous flow. Summary:
+
+ - **Zero gates.** Both Gate 1 (direction) and Gate 2 (merge) are removed; never wait on the human.
+ - **Authority.** `--auto` is the human's durable up-front authorization to merge *this run's* PR into
+ `development` only — never `rc`/`main`, never a promotion. Probe `mmi-cli access role <owner/repo>`
+ at start; abort if `train: false`. Branch protection under the human's `gh` token is the real guard.
+ - **Bounded.** On cap/stuck you cannot ask (afk) → **terminate and report**: open a `[WIP]` PR, do not merge.
+ - **Phase 00 (set):** partition multiple issues into **batch / parallel / serialize** (mode per group);
+ shared files or a shared generated artifact (e.g. `cli/dist/*`) → serialize/batch; ≤3 loops at once;
+ claim every grindable issue `--for <login>`.
+ - **Phases 0a/0b (auto):** silent class/routing/model pick (2-model default; ultra mixes vendors);
+ auto-frame explore-vs-convergent; auto-pick the planning winner; straight into Phase 1.
+ - **Phase 4 (CI-merge loop, replaces Gate 2):** `mmi-cli pr land <n> --json` (the recommended single call);
+ CI red → fix per Phase 3 (CI-fix cap 3, 5 under `--ultra`); permission-reject → leave green PR open.
+
+ See `references/auto.md` for the per-step land mechanics, Windows worktree-cleanup rule, and the
+ background-land wake-signal guidance.
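
The CI-merge loop's bounds from the summary above (CI-fix cap 3, 5 under `--ultra`; permission-reject leaves the green PR open; cap hit terminates without merging) can be sketched as a decision function. This is an illustrative sketch, not the flow in `references/auto.md`; `ci_fix_cap`, `next_action`, and the action labels are hypothetical.

```python
def ci_fix_cap(ultra: bool) -> int:
    """CI-fix attempts allowed: 3 by default, 5 under --ultra."""
    return 5 if ultra else 3


def next_action(ci_green: bool, fix_attempts: int,
                ultra: bool = False, merge_rejected: bool = False) -> str:
    """Pick the next step of the --auto CI-merge loop."""
    if ci_green:
        # A permission reject leaves the green PR open for the human.
        return "leave-open" if merge_rejected else "merge"
    if fix_attempts >= ci_fix_cap(ultra):
        # Cap hit with nobody to ask: terminate and report as [WIP], do not merge.
        return "terminate-wip"
    return "fix"
```

Under these assumptions, a red CI after three fix attempts returns `terminate-wip` by default but `fix` under `--ultra`, since the higher cap has not yet been reached.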