@mutmutco/opencode-mmi 2.48.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (39) hide show
  1. package/dist/index.d.ts +35 -0
  2. package/dist/index.js +194 -0
  3. package/package.json +44 -0
  4. package/skills/_shared/doctrine.md +238 -0
  5. package/skills/bootstrap/SKILL.md +419 -0
  6. package/skills/bootstrap/seeds/Dockerfile.template +25 -0
  7. package/skills/bootstrap/seeds/README.template.md +36 -0
  8. package/skills/bootstrap/seeds/architecture.template.md +19 -0
  9. package/skills/bootstrap/seeds/components.json.template +31 -0
  10. package/skills/bootstrap/seeds/cursor-environment.template.json +3 -0
  11. package/skills/bootstrap/seeds/cursor-rules.template.mdc +11 -0
  12. package/skills/bootstrap/seeds/design-system.paths.template.json +8 -0
  13. package/skills/bootstrap/seeds/docker-compose.template.yml +17 -0
  14. package/skills/bootstrap/seeds/gate.template.yml +42 -0
  15. package/skills/bootstrap/seeds/google-login.template.md +35 -0
  16. package/skills/bootstrap/seeds/manifest.json +32 -0
  17. package/skills/bootstrap/seeds/mcp-playwright.template.json +13 -0
  18. package/skills/bootstrap/seeds/mmi-product-required-checks.template.json +23 -0
  19. package/skills/browser-automation/SKILL.md +137 -0
  20. package/skills/build/SKILL.md +237 -0
  21. package/skills/build/references/halt-report.md +38 -0
  22. package/skills/build/references/loops.md +13 -0
  23. package/skills/build/references/worked-example.md +18 -0
  24. package/skills/build/templates/campaign-northstar.md +40 -0
  25. package/skills/coop/SKILL.md +77 -0
  26. package/skills/grind/SKILL.md +469 -0
  27. package/skills/grind/references/auto.md +107 -0
  28. package/skills/grind/references/build-notes.md +56 -0
  29. package/skills/grind/references/routing.md +76 -0
  30. package/skills/grind/references/verify.md +83 -0
  31. package/skills/grind/templates/saga-snapshot.md +28 -0
  32. package/skills/grind/templates/synthesize-panel.md +104 -0
  33. package/skills/handoff/SKILL.md +67 -0
  34. package/skills/hotfix/SKILL.md +219 -0
  35. package/skills/mmi/SKILL.md +372 -0
  36. package/skills/rcand/SKILL.md +169 -0
  37. package/skills/release/SKILL.md +309 -0
  38. package/skills/secrets/SKILL.md +137 -0
  39. package/skills/stage/SKILL.md +150 -0
@@ -0,0 +1,107 @@
1
+ # grind `--auto` — fully autonomous mode (load when `--auto` is passed)
2
+
3
+ Hand off the prompt and walk away. `--auto` has **zero interactive gates** — it never waits on
4
+ the human. It overrides specific phases of the loop in `SKILL.md`; everything not listed here is
5
+ unchanged.
6
+
7
+ - **No gates.** Both Gate 1 (direction) and Gate 2 (merge) are removed. No decision point pauses
8
+ for input.
9
+ - **Authority.** Passing `--auto` is the human's explicit, durable, up-front authorization to
10
+ merge *this run's* PR into `development` — never `rc`/`main`, never a promotion (the train
11
+ skills stay human-only). Probe `mmi-cli access role <owner/repo>` at run start; proceed only when
12
+ `train: true` (master or project-admin). The real guard remains GitHub branch protection (fail-closed)
13
+ under the human's own `gh` token.
14
+ - **Bounded, not interactive.** The safety cap stays. On cap OR stuck you cannot ask (afk), so
15
+ you **terminate and report**: open a `[WIP]` PR, do not merge.
16
+
17
+ ## Phase 00 — Plan the set (when given more than one issue)
18
+ Read every issue (title, body, labels, linked code). First drop any **not-grindable** issue (per
19
+ the Hard rule) — unclaimed, no branch/PR, just a line in the final report. Partition the rest into
20
+ execution groups — **mode per group, not one global mode.** No override flags; you always decide.
21
+ - **Batch** → one shared worktree/branch → one PR that `Closes` every issue in the group. Stay
22
+ there until PR/integration; don't bounce to main for routine re-sync. For issues
23
+ that are facets of one change (same files/module, no independent value).
24
+ - **Parallel** → each issue its own worktree + PR, run concurrently. For independent issues
25
+ (disjoint files, no dependency). Fastest.
26
+ - **Serialize** → one PR after another in dependency order (ties by issue number); each branches from
27
+ the prior's merged result. For sequential same-repo/session work, reuse the active worktree
28
+ until the group/session ends; create a new worktree only at a real branch/PR boundary, for true
29
+ parallel work, on user ask, real base drift, or a broken/unsafe worktree.
30
+
31
+ Procedure: map shared files + stated dependencies → group → run. **A committed generated
32
+ artifact counts as a shared file:** when 2+ items rebuild the same checked-in bundle (e.g.
33
+ `cli/dist/*`), classify them as overlapping even when their source files are disjoint — prefer
34
+ **serialize** or **batch**. When parallel is still chosen, plan an explicit rebase + rebuild +
35
+ retest step for every follow-on PR, never treating the conflict as a surprise. A real set may mix
36
+ modes (e.g.
37
+ `{950,951}` batched, `{952}` parallel, `{953}` serialized after `952`). **Concurrency bound:**
38
+ at most **3 grind loops run at once**; the rest queue. Claim every **partitioned** (grindable)
39
+ issue `--for <login>` before its work, in every mode — never the not-grindable ones already dropped.
40
+ If `/stage` is running, keep it attached to the active worktree; don't restart it for git
41
+ bookkeeping. If a new worktree is truly needed, stop/destroy/recreate stage there, or warn
42
+ first when intent is unclear.
43
+
44
+ ## Phase 0a — Models (auto)
45
+ Run **Phase 0a′** silently first (class, routing, ultra, escalation).
46
+
47
+ **Default (2-model, not ultra):** builder = current host model; verifier = a *different* model
48
+ (Opus, or Sonnet if builder is Opus); synthesizer = verifier. Routing from class table — usually
49
+ **Budget** for bug/task, **Balanced** for feature/research. Bug/task: prefer cheaper builder when
50
+ tiers exist; feature/research: prefer stronger builder.
51
+
52
+ **Ultra (explicit `--ultra` or auto-ultra):** pick three models **actively mixing vendors** when the
53
+ host allows; hard lenses on verifier; soft lenses on third model; synthesizer = third or verifier;
54
+ routing = **Ultra**.
55
+ Explicit `--ultra` applies YOLO dimensions (planning, tools, fusion, caps) per `references/routing.md`
56
+ § Explicit `--ultra`; auto-ultra alone does **not**.
57
+
58
+ On a host that cannot enumerate models, use one model for all roles and note degradation in the
59
+ report. Never pretend cross-vendor ultra ran when it did not.
60
+
61
+ ## Phase 0b — Frame (auto, no gate)
62
+ (Not-grindable issues were already dropped — in Phase 00 for a set, or Phase 0a′ for a lone issue.)
63
+ Auto-decide explore-vs-convergent from the ask (open-ended → explore; `--explore` forces it on).
64
+ Run **multi-agent planning + judge** when auto-gated (always under explicit `--ultra`). In explore,
65
+ brainstorm + judge or **generative fusion** when auto-gated — **auto-pick the winning approach** —
66
+ no wait. File/claim the item(s), write the criteria, push to North Star, then go straight to Phase 1.
67
+ `saga note` multi-plan winner, tool policy, and fusion path when active.
68
+
69
+ ## Phase 4 — PR → CI-merge loop (replaces Gate 2)
70
+
71
+ **Start (`--auto` only):** `mmi-cli access role <owner/repo> --json` — abort the run if `train: false`
72
+ before building (fail fast; do not waste a full grind on a PR you cannot land).
73
+
74
+ 0. **Base freshness (#1906):** doctrine § worktree hygiene per worktree.
75
+ 1. Open the PR (squash; normal title — `[WIP]` only on a cap/stuck hand-off). **Pass `--head` to
76
+ `pr create` explicitly** — `mmi-cli pr create --base development --head <grind-branch> …`; the
77
+ default `--head` is the current branch, which is `development` when Phase 4 runs from the main
78
+ checkout (per step 2's Windows rule), and that errors with `head == base`.
79
+ 2. **Land** — `(Recommended)` one call: `mmi-cli pr land <n> --json` (train probe → `ci-policy` →
80
+ `checks-wait` → `merge --auto` → poll enqueued auto-merge → branch/worktree cleanup → return the
81
+ checkout to an up-to-date `development`). Base must be `development`. Never land promotion PRs.
82
+ Cleanup is **automatic at the branch/PR boundary** — a successful land/merge removes that grind
83
+ worktree + local branch and fast-forwards the checkout onto `development` (when its tree is clean);
84
+ never hand-run `git checkout development` / `git worktree remove` afterward (#1606). For shared
85
+ sequential same-repo work, make the execution group one branch/PR where possible; do not land each
86
+ issue separately and then pretend the same active worktree continues. Active-batch PR:
87
+ `pr land <n> --preserve-worktree --json` skips local cleanup; clean at end (#1888).
88
+ **Windows:** run land/merge from
89
+ the **main repo checkout** (not the grind worktree cwd) so post-merge cleanup can rmdir the worktree
90
+ (#1444).
91
+ - **Background-land wake signal (#1617):** if you run `pr land` in the background, the
92
+ task-completion notification IS your wake signal — the harness re-invokes you when the land
93
+ finishes. Do **not** schedule a self-firing wakeup with `/grind` as its prompt as a "fallback"
94
+ for a backgrounded land; it fires redundantly *after* the land already merged and re-enters a
95
+ finished grind (the **already-resolved guard** in Phase 0a then no-ops it). A genuine watchdog
96
+ must be a long-interval **status check**, never a `/grind` re-invocation.
97
+ 3. **Manual path** (when debugging land): probe `mmi-cli pr ci-policy --json`; when `policy` is `no-ci`
98
+ skip check polling after local verify; else `mmi-cli pr checks-wait <n>`. On green (or no-ci skip),
99
+ `mmi-cli pr merge <n> --auto` (add `--preserve-worktree` for active batches; not bare, #973). If merge
100
+ returns `auto-merge-enqueued`, poll `gh pr view` until `MERGED` before declaring success or cleaning
101
+ up worktrees (#1440).
102
+ 4. CI **red** → treat failing checks as blockers, triage + fix per Phase 3, push, re-land. CI-fix rounds
103
+ cap **3** (**5** under explicit `--ultra`); exceeding = stuck → terminate + report (PR open, not merged).
104
+ 5. Merge **rejected for permission** → leave the green PR open; report merge needs an authorized human
105
+ (`train: false` or not on push allowlist).
106
+ 6. Run **## Retro**, then **## End-of-grind summary** per issue — merged / PR-open / WIP-stopped, CI state,
107
+ rounds run, issues filed/fixed.
@@ -0,0 +1,56 @@
1
+ # grind Phase 1 — build gotchas (load when implementing / before verifying locally)
2
+
3
+ Detail pulled from `SKILL.md` § Phase 1. The operational spine (cut worktree, implement to criteria, write tests, run repo checks, checkpoint-verify, re-tackle) stays in `SKILL.md`; these are the failure-mode notes.
4
+
5
+ - **Edit the worktree, not the primary checkout.** All edits must target the worktree path you just
6
+ cut — it is a *separate* working tree from the primary checkout. Before trusting a green local
7
+ verification, confirm `git -C <worktree> status` lists exactly the files you changed. Two false-green
8
+ signatures to treat as "edits landed in the wrong tree, re-check": (a) the test count is unchanged
9
+ despite added test cases, or (b) `cli/dist` shows no diff after a `cli/src` edit. Re-point edits at
10
+ the worktree and re-verify before opening the PR.
11
+ - **Verify every edited file is clean UTF-8 before running checks** — no NUL (`U+0000`), no zero-width /
12
+ soft-hyphen chars (`U+200B`-`U+200D`, `U+2060`, `U+FEFF`, `U+00AD`), no mojibake. A `git diff --stat`
13
+ showing `Bin` for a text file, or `file` reporting `data`, is the signal. When a literal MUST hold an
14
+ invisible or non-ASCII character (a regex class, a test fixture, a dedup key), build it from an escape
15
+ sequence or char-code — never by typing the raw character (the host's editor can inject one silently).
16
+ - **A "mojibake" report seen only in terminal/console output is a false positive — verify the stored
17
+ bytes before treating it as a defect (#1609).** A Windows console on a non-UTF-8 code page (the common
18
+ default) renders clean UTF-8 bytes through CP1252, so a correctly stored em-dash / arrow / ellipsis
19
+ looks corrupted (e.g. an em-dash as `â€"`).
20
+ Before any lens files an encoding/mojibake blocker, or any builder rewrites a file to "fix" it, `grep`
21
+ the source for mojibake markers (`â€`, `Ã`, `Â`) — only stored-byte corruption counts. Never edit
22
+ source to fix mojibake observed only in console rendering: that injects real corruption while chasing
23
+ a phantom (AGENTS: "fix mojibake, don't propagate it" — but only REAL mojibake).
24
+ - **Windows worktree — warm the build first (#1630).** On a fresh Windows worktree, a full `cli`
25
+ suite run can **false-fail** on tests that `execFile` `build.mjs`: esbuild's first cold build (and
26
+ concurrent builds across test files) can exceed a short per-test timeout. Warm it once
27
+ (`node build.mjs`) before the suite and/or raise `--testTimeout`. A test that fails *inside* a
28
+ `build.mjs`/esbuild exec (a timeout, or "Command failed: node build.mjs") on a cold/concurrent
29
+ worktree is a **probable environment flake** — re-run that file in isolation to confirm before
30
+ triaging it as a code blocker. CI runs Linux with warm caches and does not hit this (generalizes
31
+ the #1588 "validate the harness/inputs first" guidance).
32
+ - **Mirror the repo's CI gate once before PR — not at every checkpoint.** Read the gate workflow (e.g.
33
+ `.github/workflows/gate.yml`) and run its read-only build/test steps locally — especially
34
+ artifact-parity checks (e.g. a committed `dist/` that must match a rebuild). Commit regenerated
35
+ artifacts before opening the PR; a locally-green branch that fails the gate on artifact drift
36
+ wastes a CI round. **When the diff touches a spine/surface path** — `AGENTS.md`, `CLAUDE.md`,
37
+ `GEMINI.md`, root `*.md`, `skills/**`, `plugins/**`, `hooks/**`, `.github/workflows/**`, or org
38
+ docs — the local gate-mirror MUST run the **infra** job's static-guard tests too (`cd infra &&
39
+ npm ci && npx vitest run plugin-set` at minimum; ideally the full `infra` `npm test`), not only the
40
+ `cli` suite + `scripts/check-*.mjs`. Those scripts validate structure/mirror sync but do **not**
41
+ assert the byte-identical plain-language-copy contract that `infra/plugin-set.test.mjs` pins.
42
+ - **`plugin-set` is not enough when you change a tested module (#1700).** `npx vitest run plugin-set`
43
+ pins only the plugin-set contract — it does **not** run the rest of the `infra` suite. So when the diff
44
+ rewrites or alters the **export surface** of an existing `scripts/**` or `infra/**` module, a green
45
+ `plugin-set` can hide a co-located test that imports that module and now breaks (a dropped re-export
46
+ resolves to `undefined`, and the test misreads it as missing). Before opening the PR: grep for files
47
+ importing the changed module (`from '.*<module>'`) and run the **full** `cd infra && npm test` — or at
48
+ minimum each importing `*.test.mjs` — not just `plugin-set`.
49
+ - **IO/syscall-boundary changes need a real run, not just mocks (#1588).** For a fix at an IO boundary
50
+ (stdin, fs, sockets, process spawn), mocked unit tests and the diff-review panel can both pass while
51
+ real behavior regresses — they exercise in-memory mocks, not the syscall. A green panel is necessary but
52
+ not sufficient: such a change also needs a **real end-to-end run on the target platform** before you trust
53
+ it. When a real run regresses after a change, **validate the test harness and inputs first** — shell
54
+ escaping, payload validity, env contamination — before re-debugging the code; a malformed fixture commonly
55
+ masquerades as a code regression. And don't rewrite a proven IO primitive to satisfy a hardening nit — wrap
56
+ or bound it instead.
@@ -0,0 +1,76 @@
1
+ # grind routing detail — auto-ultra, explicit `--ultra`, model diversity, reasoning
2
+
3
+ Loaded on demand from `SKILL.md` § Phase 0a′. The class table + the effort-tier flags live in `SKILL.md`; this file holds the ultra / cross-vendor / reasoning detail.
4
+
5
+ ## Auto-ultra detection
6
+
7
+ **auto-ultra = true** (verify-panel uplift **only** — not whole-loop YOLO) when **any** of
8
+ (log first match in saga):
9
+
10
+ 1. ~~User passed **`--ultra`**.~~ **Explicit `--ultra` is separate** — see **Flags** and
11
+ **§ Explicit `--ultra` vs auto-ultra**; it is **not** auto-ultra.
12
+ 2. **Priority `urgent`** + (`security` label OR auth/crypto/payments/PII/agent-facing trusted corpus).
13
+ 3. **`--explore`** + stated numeric SLA/metric with high blast radius.
14
+ 4. **Architectural / cross-cutting** scope (multi-module, public API break, migration).
15
+ 5. **Escalation:** after **2 failed verify rounds** on default routing, escalate to auto-ultra for round 3+
16
+ (once per grind; saga note). Interactive: announce; `--auto`: silent.
17
+
18
+ **auto-ultra = false** (stay 2-model): docs/prose-only diffs; priority `low`; narrow bug with clear repro;
19
+ user says "quick" / "small" in the prompt (explicit user instruction wins).
20
+
21
+ **Explicit `--ultra` vs auto-ultra:**
22
+
23
+ | Mode | Scope |
24
+ |------|-------|
25
+ | **Explicit `--ultra`** | Whole-loop YOLO — models, planning, tools, generative fusion when applicable, raised caps |
26
+ | **auto-ultra** | Stronger verify panel only (≥3 models, Ultra routing, hard-lens double-pass) — **no** automatic multi-plan, tool lenses, fusion deliverable, or cap raises unless user also passed `--ultra` |
27
+
28
+ When auto-ultra fires interactively, **confirm at Gate 1** (default yes) with cost note — list
29
+ **auto-ultra dimensions only** (do not imply YOLO unless `--ultra` is also set).
30
+
31
+ ## Explicit `--ultra` — YOLO dimensions
32
+
33
+ When the user passes **`--ultra`** (explicit, not auto-ultra alone):
34
+
35
+ - **Models:** ≥3 distinct models; cross-vendor required when the host exposes them; report gaps —
36
+ never claim full ultra when diversity is unavailable.
37
+ - **Routing:** Ultra/Paranoid on **every** verify round; hard lenses always double-pass on two models when possible.
38
+ - **Phase 0b:** Always **multi-agent planning + judge** (N=3 planners, verifier-tier judge ≠ builders).
39
+ - **Phase 2 tools:** Always enable bounded web search / sandbox repro on applicable lenses (see **§ Tool-enabled lenses** in `references/verify.md`).
40
+ - **Generative fusion:** When class = Research or `--explore`, use the fusion deliverable path.
41
+ - **Caps:** verify rounds **8**, CI-fix rounds **5** under `--auto` (still bounded — not infinite).
42
+ - **Gate 1 (interactive):** List **all** YOLO dimensions enabled — not just model count.
43
+ - **`--ultra --auto` final report:** State YOLO dimensions used per issue.
44
+
45
+ ## Cross-vendor model diversity (priority)
46
+
47
+ **Actively prefer mixing vendors the host exposes** — cross-vendor diversity is the strongest
48
+ independence signal. On Cursor: GPT, Claude, Composer, Kimi among others. On Claude Code:
49
+ Anthropic tiers are one vendor — still assign **verifier ≠ builder** and pull a third from another
50
+ host when ultra requires it.
51
+
52
+ When ultra (explicit or auto), assign **≥3 distinct models** across **≥2 vendors when the host
53
+ allows**:
54
+
55
+ | Role | Assignment |
56
+ |------|------------|
57
+ | **Builder** | Strongest implementation model (class table may refine) |
58
+ | **Verifier** | Top-tier; **different vendor** from builder when possible; hard lenses |
59
+ | **Third model** | Soft lenses / split panel; **different vendor** from builder and verifier when possible |
60
+ | **Synthesizer** | Third model if distinct; else verifier — **never builder** |
61
+
62
+ **Routing under ultra:** treat as **Ultra** = Paranoid rules (verifier-tier soft lenses; hard-lens
63
+ double-pass on **two different models** when host allows). If the host cannot supply 3 models or 2
64
+ vendors, use maximum diversity available; document the gap at Gate 1 / final report — never claim
65
+ ultra ran fully when it did not.
66
+
67
+ ## Reasoning effort under explicit `--ultra`
68
+
69
+ When the user passes **explicit `--ultra`**, select **higher reasoning effort** wherever the host IDE
70
+ exposes it (e.g. elevated thinking/reasoning tier on supported models). Apply to builder, verifier,
71
+ third, synthesizer, and judge roles when the host offers per-model reasoning knobs.
72
+
73
+ - **Announce** when elevated reasoning is selected — model + level — in the routing announcement block
74
+ and `mmi-cli saga note "reasoning=<model>:<level> …"`.
75
+ - **Fallback:** when the host has no reasoning-effort knob, use the strongest available model tier and
76
+ note the gap in the announcement — do not block the grind.
@@ -0,0 +1,83 @@
1
+ # grind verify detail — lens prompts, tool-enabled lenses, Phase 2b notes
2
+
3
+ Loaded on demand from `SKILL.md` § Phase 2 / 2b. The mechanism, the five lenses, the verbatim-diff
4
+ pin, the `PanelReport` table, and the round-fail rule stay in `SKILL.md`; this file holds the
5
+ lens-prompt clauses and the deeper synthesis notes.
6
+
7
+ ## Lens-prompt clauses (put these in EVERY lens prompt)
8
+
9
+ **Verbatim includes test files.** Never abbreviate or summarize for **tests-actually-test**. A
10
+ "commented-out / absent" verdict may be a paste artifact — re-check input before a fix round.
11
+
12
+ **Lens abstention.** If a lens cannot see the full change, it returns
13
+ `{"verdict":"cannot-verify"}` — it must NEVER emit an "absent / missing / commented-out" blocker
14
+ from a file it could not read. **The orchestrator MUST put this abstention rule in every lens
15
+ prompt** — a lens never told to abstain files false "missing test / missing code" blockers instead,
16
+ and a hand-trimmed diff is the usual trigger (#1561). Every lens prompt must also carry an explicit
17
+ diff-shape clause:
18
+
19
+ > You are reading a unified DIFF, not whole files. Any function, type, or symbol that is *referenced
20
+ > but not defined within the patch* is PRE-EXISTING code — assume it exists and works; NEVER report
21
+ > it as "undefined", "missing", or "never defined". If you cannot see a definition you need to judge
22
+ > a point, return `cannot-verify` for that point.
23
+
24
+ This prevents diff-absent symbols from becoming ship-stoppers at the lens; Phase 2b's absence-claims
25
+ drop rule remains the backstop.
26
+
27
+ **Worktree isolation (#1621, #1895).** Phase 2: `isolation=worktree` — other checkouts may be stale.
28
+ **Orchestrator MUST:** patch-only input; deny repo FS tools on lenses; stale-checkout clause in every lens prompt; re-run Phase 2 if a transcript shows disk reads — never triage disk-sourced blockers.
29
+ Abstention + diff-shape rules above still apply; `cannot-verify` beats false absence.
30
+
31
+ ## Tool-enabled lenses (default expectation on applicable lenses)
32
+
33
+ When an objective signal exists, hard lenses must anchor to it — failing test, typecheck error,
34
+ sandbox repro, cited source — not bare judgment. Tool-enabled path (`webSearch`, `sandboxRepro`)
35
+ is the **default expectation** on applicable lenses, not only an auto-gated extra. Populate
36
+ `PanelReport.blockers[].sources` with objective anchors; opinion-only blockers require explicit
37
+ justification when no signal was available.
38
+
39
+ Eligible lenses may use **bounded tools** during Phase 2 — Fusion-style fact-check and repro —
40
+ while Phase 2b synthesizer stays **tool-free** (lens JSON + diff stat only).
41
+
42
+ | Tool | Use |
43
+ |------|-----|
44
+ | **Web search** | External fact-check (API version, CVE, deprecation, third-party behavior in criteria) |
45
+ | **Sandboxed repro** | Read-only bash in worktree when issue body includes repro shell commands |
46
+
47
+ **Auto-enable tools** when triggers match (no manual `--tools` required):
48
+
49
+ | Trigger | Enable tools |
50
+ |---------|--------------|
51
+ | Criteria/body need external fact-check | Web search for that lens |
52
+ | Issue body has repro shell commands | Sandboxed repro for `correctness` |
53
+ | Prose/docs-only reduced panel | No |
54
+ | Priority `low` + narrow bug, inline repro in diff | No (cost guard) |
55
+ | **`isolation=worktree`** (grind default) | No sandbox repro / repo FS; web search only for external facts |
56
+ | **auto-ultra** + external validation implied | Prefer tools on hard lenses (not repo FS under isolation) |
57
+ | **Explicit `--ultra`** | Always on applicable lenses (not repo FS under isolation) |
58
+
59
+ **Hygiene:** configurable allow/deny domain lists (org/repo-level) — exclude benchmark-leak
60
+ domains (e.g. Stack Overflow, issue mirrors) from verify search. Default deny list in
61
+ `cli/src/grind-policy.ts`. **Per-lens budget:** hard cap (3 queries/lens/round); `saga note` on exceed.
62
+
63
+ **Diff-pinning preserved:** tools supplement — never replace — the pinned patch. Under
64
+ `isolation=worktree`, repo FS tools stay forbidden even under ultra; cite builder test output from
65
+ criteria, not lens shell into a stale checkout.
66
+
67
+ ## Phase 2b — deeper synthesis notes
68
+
69
+ **Fusion doctrine:** IDE-native multi-model only; the agent orchestrates panels. No hosted grind provider.
70
+
71
+ **`blind_spots`:** optionally spawn **one** targeted micro-lens on the highest-priority spot (not a
72
+ full re-panel) before Phase 3 triage — at most once per round.
73
+
74
+ **`contradictions`:** if the disagreement is **criteria/spec ambiguity** (lenses read the spec
75
+ differently), stop and escalate to the human per **## Gates** — do not guess. If one lens is
76
+ clearly wrong against consensus + diff, note it in the saga and triage real blockers only.
77
+
78
+ **Absence-claims (#1621, #1895).** Drop "missing/absent/unimplemented/not fixed" blockers
79
+ contradicted by the pinned patch or green builder tests. Drop blockers when lens logs show repo
80
+ Read/Grep despite isolation — re-run Phase 2 patch-only instead of fixing phantom gaps.
81
+
82
+ (`--auto`) No human escalation — pick the interpretation that satisfies the written criteria; if
83
+ still ambiguous, fail the round and fix the criteria in the issue body before continuing.
@@ -0,0 +1,28 @@
1
+ # Grind saga keep — resume snapshot
2
+
3
+ Enforced via **`mmi-cli saga snapshot`** (maps to saga HEAD — no parallel store). Checklist uses namespaced prefixes: `gs-open:`, `gs-resolved:`, `gs-ceiling:`.
4
+
5
+ ## Read-first on resume
6
+
7
+ ```bash
8
+ mmi-cli saga snapshot show --kind grind
9
+ ```
10
+
11
+ ## Write-last each phase
12
+
13
+ ```bash
14
+ mmi-cli saga note "grind phase=2b: …"
15
+ mmi-cli saga snapshot set --kind grind \
16
+ --class bug --routing Balanced --ultra auto \
17
+ --criteria "…" --phase 2b --verify-round 2 --verify-cap 5 \
18
+ --open-blocker cfg-null-parse-42 \
19
+ --resolved-blocker typo-import-7 \
20
+ --verification-ceiling "CI green" \
21
+ --next-action "triage PanelReport blockers, round 3"
22
+ ```
23
+
24
+ `set` retires prior active `gs-*` checklist rows before writing. Resolved/done rows are marked `done` on write.
25
+
26
+ **NEXT format:** `grind class=… verify=N/cap | next action` — split on ` | ` only (pipes in next action are allowed).
27
+
28
+ Stable blocker `id`s come from `synthesize-panel.md` / `PanelReport`.
@@ -0,0 +1,104 @@
1
+ # Grind panel synthesizer prompt
2
+
3
+ Use this template for **Phase 2b — Synthesize**. Spawn one sub-agent on the **synthesizer model**
4
+ (Phase 0a). The synthesizer reconciles parallel lens outputs — it does not re-review the diff from
5
+ scratch and does not merge panel text into a narrative answer.
6
+
7
+ ## Inputs (provide all)
8
+
9
+ 1. **Success criteria / goal** — verbatim from the grind frame.
10
+ 2. **Lens outputs** — strict JSON from every lens that ran this round (full five, reduced prose
11
+ panel, or `--explore` goal judges). Label each with its lens name.
12
+ 3. **Diff excerpt or stat** — enough to anchor file/line references (`git diff --stat` minimum).
13
+ Do **not** include builder reasoning, session history, or implementation notes.
14
+
15
+ ## Hard rules for the synthesizer
16
+
17
+ - Treat agreement across lenses as higher-confidence `consensus`.
18
+ - Surface disagreements as `contradictions` — do not pick a winner silently.
19
+ - Preserve `unique_insights` a single lens raised.
20
+ - Flag `blind_spots` no lens addressed but the criteria/goal imply should be checked.
21
+ - **Dedupe blockers** — same root cause across lenses → one blocker with stable `id`.
22
+ - Assign each blocker a stable `id` (short kebab, e.g. `cfg-null-parse-42`) reused across rounds
23
+ when the root cause persists. **Convergence** requires the same `id` set on two consecutive
24
+ clean rounds (see Phase 3).
25
+ - Each blocker must cite **objective anchors** in `sources` when available (test name, CI log,
26
+ repro command, file:line from a failing check) — not lens opinion alone.
27
+ - Output **only** the JSON object below — no prose wrapper, no markdown fences.
28
+
29
+ ## Wrong-checkout false positives (#1621, #1895)
30
+
31
+ Grind verify lenses run under **worktree isolation** — the orchestrator pins a patch file and denies
32
+ repo FS tools. Guard against stale-checkout contamination anyway:
33
+
34
+ - **Drop** any blocker whose evidence is "missing / absent / unimplemented / not fixed" when the
35
+ pinned patch or another lens contradicts it — treat as paste artifact or stale-checkout read.
36
+ - **Drop** blockers when the orchestrator flagged the lens transcript for **disk reads** (Read/Grep
37
+ on repo paths despite isolation) — never ship-stop on that evidence alone.
38
+ - **Prefer `cannot-verify`** over `fail` when a lens abstained because the patch was insufficient.
39
+ - Record demoted items in `nits` or `unique_insights` with `"wrong-checkout-fp"` when useful for saga
40
+ audit — do not promote them to `blockers`.
41
+
42
+ ## Output schema (`PanelReport`)
43
+
44
+ ```json
45
+ {
46
+ "consensus": ["Points most or all lenses agreed on — factual, specific"],
47
+ "contradictions": [
48
+ {
49
+ "topic": "What the lenses disagree about",
50
+ "stances": [
51
+ { "source": "lens-name or model", "stance": "..." }
52
+ ]
53
+ }
54
+ ],
55
+ "partial_coverage": [
56
+ {
57
+ "sources": ["lens-a", "lens-b"],
58
+ "point": "Only some lenses covered this"
59
+ }
60
+ ],
61
+ "unique_insights": [
62
+ { "source": "lens-name", "insight": "Something only one lens raised" }
63
+ ],
64
+ "blind_spots": ["Topics no lens addressed that criteria imply matter"],
65
+ "blockers": [
66
+ {
67
+ "id": "stable-kebab-id",
68
+ "title": "Short blocker title",
69
+ "file": "path/to/file",
70
+ "line": "42 or range",
71
+ "why": "Concrete reason this blocks shipping",
72
+ "sources": ["correctness", "security"]
73
+ }
74
+ ],
75
+ "nits": [
76
+ {
77
+ "title": "Advisory, non-blocking",
78
+ "file": "path/to/file",
79
+ "line": "10",
80
+ "why": "..."
81
+ }
82
+ ]
83
+ }
84
+ ```
85
+
86
+ ## Phase 3 hints (for the builder — synthesizer does not act on these)
87
+
88
+ The grind loop uses this report as follows:
89
+
90
+ | Field | Action |
91
+ |-------|--------|
92
+ | `blockers` | Phase 3 — any blocker fails the round |
93
+ | `nits` | Advisory — fold cheap fixes, file costly ones |
94
+ | `blind_spots` | Optional **one** targeted micro-lens this round (not a full re-panel) |
95
+ | `contradictions` | If criteria/spec ambiguity → human escalation; if one lens is wrong → note in synthesis, do not block on noise |
96
+ | `consensus` | Elevated confidence; do not re-litigate unless a later round contradicts |
97
+
98
+ ## Degradation
99
+
100
+ If the synthesizer returns invalid JSON or errors:
101
+
102
+ 1. Fall back to **raw lens blockers** — union all lens `blockers`, manual dedupe by file+line+title.
103
+ 2. `mmi-cli saga note "phase 2b: synthesize degraded, raw lens triage"`.
104
+ 3. Continue Phase 3 — synthesis is an uplift, not a hard dependency.
@@ -0,0 +1,67 @@
1
+ ---
2
+ name: handoff
3
+ description: Record or claim an explicit session handoff in the saga + North Star system that SessionStart resumes — open a handoff to leave work for a future session, accept a pending one, or cancel your own. Use when the user says "handoff" or "/handoff", asks to hand off work, leave a checkpoint for the next session, or claim a prior handoff, or invokes /handoff.
4
+ ---
5
+
6
+ # /handoff — explicit saga handoff lifecycle
7
+
8
+ Use when the user says `handoff` or `/handoff`, or asks to leave work for a future session or claim a prior handoff. This skill records an explicit handoff in the same saga + North Star system that SessionStart resumes.
9
+
10
+ ## Start
11
+
12
+ 1. Read the current repo context:
13
+ - `mmi-cli saga show`
14
+ - `mmi-cli saga key --json`
15
+ - `mmi-cli northstar relevant`
16
+ 2. If you are opening a handoff, identify the North Star slug or issue key. If the saga HEAD already has `northstar:<slug>`, use that slug unless the user names a different one.
17
+ 3. If you are accepting or cancelling, list open handoffs first:
18
+ - `mmi-cli handoff list`
19
+
20
+ ## Open
21
+
22
+ When the user asks to hand off current work:
23
+
24
+ ```powershell
25
+ mmi-cli handoff open --key "<slug-or-#issue>" --north-star-slug "<slug>" "<state snapshot + next action>"
26
+ ```
27
+
28
+ The summary should be short and evidence-oriented: current branch, latest verified state, next action, and any blocker. Do not paste secrets or large logs. Keep the durable plan in North Star; the handoff is the session bridge.
29
+
30
+ ## Accept
31
+
32
+ When the user chooses an open handoff:
33
+
34
+ ```powershell
35
+ mmi-cli handoff accept "<slug-or-#issue>"
36
+ mmi-cli northstar pull "<slug>"
37
+ ```
38
+
39
+ Then corroborate the inherited state against git, GitHub, and the board before acting. The command closes the source mark and binds the accepting session saga anchor to the handoff's North Star slug. Handoffs are discovered repo-wide, so a handoff may have been opened on a different branch — `accept` shows that branch and a `git switch` hint when it differs from yours.
40
+
41
+ ## Cancel Or Decline
42
+
43
+ Cancel closes an open handoff without claiming it:
44
+
45
+ ```powershell
46
+ mmi-cli handoff cancel "<slug-or-#issue>"
47
+ ```
48
+
49
+ Decline is deliberately non-destructive. It leaves the handoff open so it is offered again later:
50
+
51
+ ```powershell
52
+ mmi-cli handoff decline "<slug-or-#issue>"
53
+ ```
54
+
55
+ ## Boundaries
56
+
57
+ - Handoffs are same-login and same-repo (any branch): discovery is repo-scoped, enforced by the Hub session identity and the project-scoped saga read. `handoff list` shows each handoff's source branch.
58
+ - Multiple open handoffs are allowed. Use the North Star slug or issue key as the handle.
59
+ - Never treat a handoff as present truth. It is prior-session belief until verified.
60
+
61
+ ## Retro
62
+
63
+ If this skill misfires, file one lesson on the Hub board:
64
+
65
+ ```powershell
66
+ mmi-cli skill-lesson --skill handoff --title "<what misfired>" --body "<what; evidence; proposed amendment>"
67
+ ```