npm - @mutmutco/opencode-mmi - Versions diffs - 2.48.0 - Mend

@mutmutco/opencode-mmi 2.48.0


Files changed (39)

package/dist/index.d.ts +35 -0
package/dist/index.js +194 -0
package/package.json +44 -0
package/skills/_shared/doctrine.md +238 -0
package/skills/bootstrap/SKILL.md +419 -0
package/skills/bootstrap/seeds/Dockerfile.template +25 -0
package/skills/bootstrap/seeds/README.template.md +36 -0
package/skills/bootstrap/seeds/architecture.template.md +19 -0
package/skills/bootstrap/seeds/components.json.template +31 -0
package/skills/bootstrap/seeds/cursor-environment.template.json +3 -0
package/skills/bootstrap/seeds/cursor-rules.template.mdc +11 -0
package/skills/bootstrap/seeds/design-system.paths.template.json +8 -0
package/skills/bootstrap/seeds/docker-compose.template.yml +17 -0
package/skills/bootstrap/seeds/gate.template.yml +42 -0
package/skills/bootstrap/seeds/google-login.template.md +35 -0
package/skills/bootstrap/seeds/manifest.json +32 -0
package/skills/bootstrap/seeds/mcp-playwright.template.json +13 -0
package/skills/bootstrap/seeds/mmi-product-required-checks.template.json +23 -0
package/skills/browser-automation/SKILL.md +137 -0
package/skills/build/SKILL.md +237 -0
package/skills/build/references/halt-report.md +38 -0
package/skills/build/references/loops.md +13 -0
package/skills/build/references/worked-example.md +18 -0
package/skills/build/templates/campaign-northstar.md +40 -0
package/skills/coop/SKILL.md +77 -0
package/skills/grind/SKILL.md +469 -0
package/skills/grind/references/auto.md +107 -0
package/skills/grind/references/build-notes.md +56 -0
package/skills/grind/references/routing.md +76 -0
package/skills/grind/references/verify.md +83 -0
package/skills/grind/templates/saga-snapshot.md +28 -0
package/skills/grind/templates/synthesize-panel.md +104 -0
package/skills/handoff/SKILL.md +67 -0
package/skills/hotfix/SKILL.md +219 -0
package/skills/mmi/SKILL.md +372 -0
package/skills/rcand/SKILL.md +169 -0
package/skills/release/SKILL.md +309 -0
package/skills/secrets/SKILL.md +137 -0
package/skills/stage/SKILL.md +150 -0

package/skills/bootstrap/seeds/google-login.template.md ADDED

@@ -0,0 +1,35 @@
+# Google login — {{REPO_NAME}}
+This repo has a Google OAuth client provisioned by the org — one client spanning local/`/stage`, dev, rc, and
+prod. Adding Google login here is **self-serve**: you do not need master-admin help.
+## Get the creds (from SSM — never in git)
+```bash
+mmi-cli secrets get GOOGLE_CLIENT_ID       # dev tier, self-serve on your repo role
+mmi-cli secrets get GOOGLE_CLIENT_SECRET
+```
+The canonical keys are `{dev,rc,main}/GOOGLE_CLIENT_ID|SECRET` (the master stores them once with
+`mmi-cli oauth set-creds`). Runtime and CI read their own tier keylessly via OIDC. Never bake a secret into an
+image or commit it. The deployed box writes the release `.env` itself — **do not commit or hand-edit a `.env`**;
+the only `.env` you create is the gitignored one `/stage` makes from `.env.example` for local dev.
+## The one rule that makes every environment work
+Build the OAuth `redirect_uri` from the **incoming request** — never a hardcoded base URL:
+- callback path: `/api/auth/callback`
+- derive scheme + host from `X-Forwarded-Proto` / `X-Forwarded-Host` (set by the Caddy proxy), falling back
+  to the `Host` header.
+The client's loopback redirect URIs are registered **port-agnostic**, so any local port works; dev/rc/prod
+URIs are registered for both `mutatismutandis.co` and `mutmut.co`, so the deploy train works unchanged.
+## Reference implementation + full guide
+Org KB guide (FastAPI+authlib and Next.js/Auth.js patterns, the `@mutatismutandis.co` domain allowlist, and the
+exact URI set):
+```bash
+mmi-cli kb get kb/guides/google-login.md
+```
+Inspect this repo's expected URIs and confirm the client is port-agnostic:
+```bash
+mmi-cli oauth plan      # the canonical JS origins + redirect URIs + SSM cred params
+mmi-cli oauth verify    # probes an arbitrary :9123 loopback — no redirect_uri_mismatch = good
+```
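
The redirect rule in the template above can be sketched as follows. This is a minimal illustration, not the org's reference implementation (that lives in the KB guide): `buildRedirectUri`, `HeaderMap`, and the `http` scheme fallback are assumptions made for the example. It also assumes header keys arrive lowercased and that the app only trusts `X-Forwarded-*` because it sits behind the Caddy proxy.

```typescript
// Hypothetical helper: names and types are illustrative, not taken from the package.
type HeaderMap = Record<string, string | undefined>;

const CALLBACK_PATH = "/api/auth/callback";

// Proxies may send a comma-separated list; use the first non-empty entry.
function firstValue(raw: string | undefined): string | undefined {
  if (raw === undefined) return undefined;
  const first = (raw.split(",")[0] ?? "").trim();
  return first.length > 0 ? first : undefined;
}

// Build the OAuth redirect_uri from the incoming request instead of a fixed base URL.
// Assumption: when X-Forwarded-Proto is absent (plain local dev), fall back to "http".
function buildRedirectUri(headers: HeaderMap, fallbackScheme: string = "http"): string {
  const scheme = firstValue(headers["x-forwarded-proto"]) ?? fallbackScheme;
  const host = firstValue(headers["x-forwarded-host"]) ?? firstValue(headers["host"]);
  if (host === undefined) {
    throw new Error("cannot derive redirect_uri: no X-Forwarded-Host or Host header");
  }
  return `${scheme}://${host}${CALLBACK_PATH}`;
}
```

Because the host comes from the request, the same code serves any local port and every deployed tier without per-environment configuration.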

package/skills/bootstrap/seeds/manifest.json ADDED

@@ -0,0 +1,32 @@
+{
+  "_comment": "Bootstrap seed manifest (#201) — the machine-readable contract of what /bootstrap and `mmi-cli bootstrap --apply` (#202) stamp into a target repo. Consumed by the CLI (loadBootstrapSeeds). ownership: 'org' = org-delivered, OVERWRITTEN on upgrade (the org owns it); 'repo' = created ONCE on a fresh bootstrap, never clobbered on upgrade (the repo owns its content). source: 'self' = copy MMI-Hub's own current file verbatim; 'seed:<file>' = render the named template in this dir with {{PLACEHOLDERS}}; 'fanout' = delivered by the fanout pipeline (.github/fanout-targets.json), NOT by bootstrap — listed for completeness; 'managed-block' = merge the org-managed .gitignore block in place (preserves the repo's own ignore lines). classes: which repo classes receive this seed.",
+  "placeholders": ["OWNER", "REPO", "REPO_SLUG", "REPO_NAME", "CLASS", "INSTALL_CMD", "GATE_CMD", "GATE_PUSH_BRANCHES_YAML", "GATE_FULL_RUN_BRANCH", "GATE_RULESET_BRANCH_REFS_JSON", "PROJECT_OWNER", "PROJECT_NUMBER", "PROJECT_ID", "STATUS_FIELD_ID", "STATUS_TODO", "STATUS_IN_PROGRESS", "STATUS_IN_REVIEW", "STATUS_DONE", "STACK", "REGION", "SAGA_API_URL"],
+  "seeds": [
+    { "target": "AGENTS.md", "source": "self", "ownership": "org", "classes": ["deployable", "content"] },
+    { "target": "CLAUDE.md", "source": "self", "ownership": "org", "classes": ["deployable", "content"] },
+    { "target": ".claude/settings.json", "source": "self", "ownership": "org", "classes": ["deployable", "content"] },
+    { "target": ".github/ISSUE_TEMPLATE/bug.yml", "source": "self", "ownership": "org", "classes": ["deployable", "content"] },
+    { "target": ".github/ISSUE_TEMPLATE/feature.yml", "source": "self", "ownership": "org", "classes": ["deployable", "content"] },
+    { "target": ".github/ISSUE_TEMPLATE/task.yml", "source": "self", "ownership": "org", "classes": ["deployable", "content"] },
+    { "target": ".github/ISSUE_TEMPLATE/config.yml", "source": "self", "ownership": "org", "classes": ["deployable", "content"] },
+    { "target": "scripts/next-version.mjs", "source": "self", "ownership": "org", "classes": ["deployable"] },
+    { "target": ".github/workflows/gate.yml", "source": "seed:gate.template.yml", "ownership": "org", "classes": ["deployable"] },
+    { "target": ".github/rulesets/mmi-product-required-checks.json", "source": "seed:mmi-product-required-checks.template.json", "ownership": "org", "classes": ["deployable"] },
+    { "target": ".gitignore", "source": "managed-block", "ownership": "org", "classes": ["deployable", "content"] },
+    { "target": "README.md", "source": "seed:README.template.md", "ownership": "repo", "classes": ["deployable", "content"] },
+    { "target": "architecture.md", "source": "seed:architecture.template.md", "ownership": "repo", "classes": ["deployable", "content"] },
+    { "target": ".cursor/environment.json", "source": "seed:cursor-environment.template.json", "ownership": "repo", "classes": ["deployable", "content"] },
+    { "target": ".cursor/mcp.json", "source": "seed:mcp-playwright.template.json", "ownership": "repo", "classes": ["deployable", "content"] },
+    { "target": ".cursor/rules/{{REPO_SLUG}}.mdc", "source": "seed:cursor-rules.template.mdc", "ownership": "repo", "classes": ["deployable", "content"] },
+    { "target": "docs/Guides/google-login.md", "source": "seed:google-login.template.md", "ownership": "repo", "classes": ["deployable"] },
+    { "target": "docker-compose.yml", "source": "seed:docker-compose.template.yml", "ownership": "repo", "classes": ["deployable"], "deployModels": ["tenant-container"] },
+    { "target": "Dockerfile", "source": "seed:Dockerfile.template", "ownership": "repo", "classes": ["deployable"], "deployModels": ["tenant-container"] },
+    { "target": "components.json", "source": "seed:components.json.template", "ownership": "repo", "classes": ["deployable"], "dashboard": true },
+    { "target": "design-system.paths.json", "source": "seed:design-system.paths.template.json", "ownership": "repo", "classes": ["deployable"], "dashboard": true }
+  ],
+  "labels": [
+    { "name": "bug", "color": "d73a4a", "description": "Something is broken or behaving wrong" },
+    { "name": "feature", "color": "a2eeef", "description": "New capability or enhancement" },
+    { "name": "task", "color": "0052cc", "description": "Task, chore, or improvement" }
+  ]
+}
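
The ownership rules described in the manifest's `_comment` can be illustrated with a small decision function. This is a sketch under stated assumptions: `planSeed`, `Seed`, and `SeedAction` are hypothetical names, and the real `loadBootstrapSeeds` consumer in the CLI may behave differently. The `fanout` and `managed-block` sources and the `deployModels` / `dashboard` filters are deliberately left out.

```typescript
// Hypothetical types mirroring the manifest fields shown above.
type Ownership = "org" | "repo";
type RepoClass = "deployable" | "content";

interface Seed {
  target: string;
  source: string;
  ownership: Ownership;
  classes: RepoClass[];
}

type SeedAction = "write" | "skip-class" | "skip-exists";

// Decide what to do with one seed for a given repo class.
function planSeed(seed: Seed, repoClass: RepoClass, targetExists: boolean): SeedAction {
  // Seeds only apply to the repo classes they list.
  if (seed.classes.indexOf(repoClass) === -1) return "skip-class";
  // 'repo' seeds are created once and never clobbered afterwards.
  if (seed.ownership === "repo" && targetExists) return "skip-exists";
  // 'org' seeds are overwritten on upgrade; missing 'repo' seeds are created.
  return "write";
}
```

Under these rules an existing `README.md` survives an upgrade, while `.github/workflows/gate.yml` is rewritten each time.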

package/skills/bootstrap/seeds/mcp-playwright.template.json ADDED

@@ -0,0 +1,13 @@
+{
+  "mcpServers": {
+    "playwright": {
+      "command": "npx",
+      "args": [
+        "-y",
+        "@playwright/mcp@latest",
+        "--output-dir",
+        "tmp/playwright-mcp"
+      ]
+    }
+  }
+}

package/skills/bootstrap/seeds/mmi-product-required-checks.template.json ADDED

@@ -0,0 +1,23 @@
+{
+  "_comment": "Repository-level ruleset requiring the product gate job (#1333). Apply via GitHub repo rulesets (master-admin) after bootstrap seeds gate.yml — the job name gate must match this context.",
+  "name": "mmi-product-required-checks",
+  "target": "branch",
+  "enforcement": "active",
+  "conditions": {
+    "ref_name": {
+      "include": {{GATE_RULESET_BRANCH_REFS_JSON}},
+      "exclude": []
+    }
+  },
+  "rules": [
+    {
+      "type": "required_status_checks",
+      "parameters": {
+        "strict_required_status_checks_policy": false,
+        "required_status_checks": [
+          { "context": "gate" }
+        ]
+      }
+    }
+  ]
+}
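
Note that the template above is not valid JSON until `{{GATE_RULESET_BRANCH_REFS_JSON}}` is substituted with a JSON array. A minimal sketch of that rendering step follows. `renderTemplate` is a hypothetical helper (the CLI's actual renderer is not shown in this diff), and the `refs/heads/development` value is only an illustrative branch ref.

```typescript
// Hypothetical renderer: replaces {{NAME}} tokens and fails loudly on unknown names.
function renderTemplate(template: string, values: Record<string, string>): string {
  return template.replace(/\{\{([A-Z_]+)\}\}/g, (_match: string, name: string): string => {
    const value = Object.prototype.hasOwnProperty.call(values, name) ? values[name] : undefined;
    if (value === undefined) {
      throw new Error(`no value for placeholder ${name}`);
    }
    return value;
  });
}

// Shortened fragment of the ruleset template, for illustration only.
const rulesetFragment =
  '{ "ref_name": { "include": {{GATE_RULESET_BRANCH_REFS_JSON}}, "exclude": [] } }';

const renderedRuleset = renderTemplate(rulesetFragment, {
  // The placeholder value is itself JSON, so it is serialized before substitution.
  GATE_RULESET_BRANCH_REFS_JSON: JSON.stringify(["refs/heads/development"]),
});

// JSON.parse throws if the rendered text is not valid JSON.
const parsedRuleset = JSON.parse(renderedRuleset);
```

Parsing the rendered output is a cheap guard against a placeholder that was left unfilled or filled with a bare string instead of an array.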

package/skills/browser-automation/SKILL.md ADDED

@@ -0,0 +1,137 @@
+---
+name: browser-automation
+description: Org browser automation doctrine and MCP setup — accessibility tree first, DOM second, vision last; Playwright MCP without --caps=vision; tmp/playwright-mcp artifacts; when to use /stage vs MCP. Use when driving a browser from an agent, configuring Playwright MCP, debugging UI flows, or verifying web behavior outside /stage smoke.
+---
+# Browser automation — DOM-first Playwright MCP
+Org standard for agent browser work across Cursor, Claude Code, VS Code, and Codex. **Playwright** is the engine; agents interact through **structure-first** MCP tools, not pixels-first defaults.
+## Doctrine (non-negotiable)
+1. **Accessibility tree first** — semantic structure the agent can reason about (`browser_snapshot`, a11y refs).
+2. **DOM second** — selectors, snapshots, network when the tree is not enough.
+3. **Vision last** — screenshots only when tree + DOM cannot answer the question.
+4. **Prefer HTTP/OpenAPI** — call APIs directly when discovery or the task allows; do not drive UI for data you can fetch.
+**Never** pass `--caps=vision` (or equivalent vision-first defaults) on Playwright MCP org-wide. Vision caps burn tokens, hide structure, and break on theme/layout drift.
+## When to use what
+| Need | Use |
+|------|-----|
+| Local dev server + smoke on current branch | **`/stage`** — gitignored stack under `tmp/stage/`; see `skills/stage/SKILL.md` |
+| Personal cloud dev preview of your branch | **`/stage --live`** — IP-gated dev stage; not rc/prod |
+| Interactive UI debug, one-off flow, agent-driven clicks | **Playwright MCP** (this skill) — DOM-first, artifacts under `tmp/` |
+| Durable hosted automation outside dev machines | **Stagehand + Browserbase** (production path) — explicit choice, not the default for every local task |
+`/stage` and Playwright MCP **complement** each other. `/stage` spins the app; MCP drives the browser against a URL (often the stage URL).
+## MCP configuration (no vision)
+### Cursor / VS Code — `.cursor/mcp.json`
+Repo template (bootstrap seed): `.cursor/mcp.json` from `mcp-playwright.template.json`.
+```json
+{
+  "mcpServers": {
+    "playwright": {
+      "command": "npx",
+      "args": [
+        "-y",
+        "@playwright/mcp@latest",
+        "--output-dir",
+        "tmp/playwright-mcp"
+      ]
+    }
+  }
+}
+```
+**Do not** add `--caps=vision`. Team/user MCP can mirror the same block (Phase 1 — per developer machine; Hub seeds the repo copy).
+### Codex — `~/.codex/config.toml`
+```toml
+[mcp_servers.playwright]
+command = "npx"
+args = ["-y", "@playwright/mcp@latest", "--output-dir", "tmp/playwright-mcp"]
+```
+No vision caps. Restart Codex after edits.
+### Claude Code
+Enable the official Playwright plugin (`playwright@claude-plugins-official`). Follow its DOM-first tools; do not enable vision-first modes for routine agent work.
+## Playwright availability
+Check the configured/global CLI before adding a temporary local dependency. On Windows PowerShell:
+```powershell
+Get-Command playwright
+playwright --version
+```
+`node -e "require.resolve('playwright')"` only proves a Node module is installed in the current package; it
+can fail while the global or editor-configured Playwright CLI works. Use the available CLI for smoke checks
+and MCP setup. Temporary per-worktree installs are fallback-only, and should stay untracked.
+## Agent workflow (MCP)
+1. **Goal** — what observable outcome proves success?
+2. **Navigate** — open the target URL (often from `/stage` JSON: `mmi-cli stage --json`).
+3. **Snapshot** — a11y tree / DOM snapshot before interacting.
+4. **Act** — click, type, select using refs from the latest snapshot.
+5. **Re-snapshot** after navigation or major DOM changes (refs go stale).
+6. **Vision only if stuck** — one screenshot to disambiguate; then return to tree/DOM.
+Core MCP loop:
+```
+browser_navigate → browser_snapshot → browser_click / browser_type → browser_snapshot
+```
+Use `browser_lock` / `browser_unlock` when the host documents a longer automation sequence (Cursor IDE browser MCP).
+## Artifacts and hygiene
+- **All Playwright MCP output → `tmp/playwright-mcp/`** (pass `--output-dir tmp/playwright-mcp` on the MCP server).
+- **Never** leave traces, screenshots, or reports at the repo root.
+- `.playwright-mcp/` at repo root is gitignored as a **safety net only** — not the canonical path.
+- `mmi-cli doctor` warns on vision caps in MCP config and stray browser artifacts outside `tmp/`.
+## Phase 1 — per-user editor setup (document only)
+These are **developer-machine** steps; Hub does not mutate user configs in Phase 2:
+- **Cursor:** import Playwright MCP in user/team `mcp.json` (no vision caps) or rely on bootstrapped `.cursor/mcp.json`.
+- **Claude Code:** `/plugin install playwright@claude-plugins-official` (or enable via plugin panel).
+- **Codex:** add `[mcp_servers.playwright]` to `config.toml` as above.
+Full per-surface onboarding: [Agentic-Dev-Environment-Guide](https://github.com/mutmutco/MMI-Hub/wiki/Agentic-Dev-Environment-Guide) (wiki — Phase 3 fanout).
+## Anti-patterns (org-wide avoid)
+- `--caps=vision` or screenshot-first wrappers for routine tasks
+- Skyvern, Magnitude, LaVague, or other vision-first agent browsers as org defaults
+- Committing `.playwright-mcp/`, `playwright-report/`, or `test-results/` from agent runs
+- Replacing `/stage` with ad-hoc MCP servers for branch smoke (use `/stage` for the stack, MCP for the browser)
+## Related
+- **`/stage`** — `skills/stage/SKILL.md`
+- **`/grind`** — use DOM-first browser checks in verification when criteria need UI proof
+- **`/bootstrap`** — seeds `.cursor/mcp.json` from `mcp-playwright.template.json`
+- **`mmi-cli doctor`** — vision-cap and artifact hygiene checks
+## Retro — one check before you finish
+Before your final report, answer one question honestly: did **this skill's own instructions** misfire
+this run — ambiguous wording, a misleading MCP snippet, or an artifact path it should have warned
+about? (Process only — never the user's code or task.) If yes, file **one** lesson and move on; a clean run is
+silent (hard cap: one per run). It lands on the Hub board (deduped) and is fixed only via a reviewed PR —
+never edit the skill live; the retro is advisory, so if the call fails, note it and continue:
+`mmi-cli skill-lesson --skill browser-automation --title "<what misfired>" --body "<what; evidence; proposed amendment>"`
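
The "no vision caps" rule in the skill above lends itself to a mechanical check. The sketch below shows one way to scan an `mcp.json`-shaped object for a `vision` capability. It is an illustration only: `findVisionServers` and `enablesVision` are hypothetical names, and how `mmi-cli doctor` actually performs its check is not shown in this diff.

```typescript
interface McpServer {
  command: string;
  args?: string[];
}

interface McpConfig {
  mcpServers: Record<string, McpServer>;
}

// True when the args enable "vision", in either `--caps=vision,pdf`
// or `--caps vision,pdf` form.
function enablesVision(args: string[]): boolean {
  return args.some((arg: string, i: number): boolean => {
    let caps: string | undefined;
    if (arg === "--caps") {
      caps = args[i + 1];
    } else if (arg.indexOf("--caps=") === 0) {
      caps = arg.slice("--caps=".length);
    }
    if (caps === undefined) return false;
    return caps.split(",").map((c: string): string => c.trim()).indexOf("vision") !== -1;
  });
}

// Names of the configured servers that enable the vision capability.
function findVisionServers(config: McpConfig): string[] {
  return Object.keys(config.mcpServers).filter((name: string): boolean => {
    const server = config.mcpServers[name];
    return server !== undefined && enablesVision(server.args ?? []);
  });
}
```

Run against the seeded `.cursor/mcp.json`, this returns an empty list; any non-empty result names the servers to fix.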

package/skills/build/SKILL.md ADDED

@@ -0,0 +1,237 @@
+---
+name: build
+description: Drive a milestone from a finished foundation to merged PRs, autonomously. Auto by default — the framing interview authorizes auto-merging its PRs into `development` only (never `rc`/`main`, never promotion). Use when the user says "/build", "build out the milestone", "construct the pipeline body", "construct <capability>", names a North Star slug, asks you to run a milestone to merge, or invokes /build [--light|--standard|--deep|--ultra].
+---
+# /build — autonomous milestone construction loop
+**Shared doctrine:** Read `skills/_shared/doctrine.md` at session start and on resume. Fusion, parallelism, flat fan-out, classifier-denied spawns, blocker tiers, worktree hygiene, #1595 verify+commit, enforcement matrix — single source; do not duplicate here.
+You own a milestone or capability from a finished foundation to merged PRs, autonomously. Auto is the default mode and the framing interview is the one routine human touchpoint. After the interview locks decisions to the board and the North Star, you run gateless: fan out parallel sites, raise the effort tier per part, fuse cross-vendor planners on the hard parts, verify in nested sub-loops, and auto-merge each PR into `development` until the frontier is exhausted or a hard-decision blocker parks a site.
+## Core idea
+Build is essentially a `/goal` skill with enhancements (effort tiers, fusion, blocker tiers, framing interview, self-learning, no-drift). `grind` drives one item; `build` constructs the plan that drives all of them. Build is the outer loop; grind is the inner one when build chooses to delegate a single PR.
+- **Scope:** one milestone, capability, or North Star slug end-to-end — multiple slices, multiple PRs, often parallel.
+- **Mode:** auto by default; the framing interview is the one routine pause.
+- **Engine:** nested loops where every loop deepens understanding; effort tier per site; fusion on the hard parts; maximum parallelism throughout.
+- **Quality bar:** re-tackle and rebuild rather than patch to green. Holding the bar matters more than closing the loop fast.
+- **Exit:** an honest halt report — what merged, what filed, what's parked on which decision, where verification stopped short of live-verified.
+Build ships its own policy module (`cli/src/build-policy.ts`) and a small CLI surface (`mmi-cli build tier`, `mmi-cli build plan`); the loop itself is agent-driven.
+## Mode model — auto by default
+The framing interview (Phase 0) is the only routine human gate. Completing it is the **durable authorization** to auto-merge this milestone's PRs into `development` — never `rc`, never `main`, never a promotion. The train skills (`/rcand`, `/release`, `/hotfix`) stay human-only.
+- No per-PR gates. Build opens PRs, watches CI, fixes failures, and merges into `development` itself.
+- The real guard remains GitHub branch protection (fail-closed) under the human's own `gh` token. Build cannot promote even if its loop tells it to.
+- Build stops only at **hard-decision blockers** (park the site, keep the frontier moving) and the final halt report.
+- An interactive/gated fallback may exist for the rare case where the user wants per-step confirmation; auto is the norm.
+### Autonomy ladder (shared vocabulary)
+| Level | Meaning | Build mapping |
+|-------|---------|---------------|
+| **L1** suggest only | — | — |
+| **L2** draft for human | — | — |
+| **L3** apply + human approves merge | interactive/gated fallback | rare opt-in |
+| **L4** apply + auto-complete + audit log | post–Phase 0 default | saga notes + halt report |
+**L4 is earned, not assumed.** First-run supervised is an opt-in: run gated once, confirm the loop produces approvable work, then go gateless. Auto remains the default once earned — framing only, not a new default gate. L4 audit expectation: silent when nothing to report; halt report when the run ends.
+## Effort-tier engine
+Build self-selects an effort tier **per site or part** from five signals: scope, risk, ambiguity, blast radius, and foundational-dependency depth. The tier scales agent count, process count, parallelism, verification depth, and per-agent reasoning effort together. The four tiers (`light · standard · deep · ultra`) are the **same shared ladder** `/grind` uses — see `_shared/doctrine.md` § Effort tiers.
+| Tier | Planners | Parallel sites (cap) | Verification depth | Reasoning effort |
+|------|----------|----------------------|--------------------|------------------|
+| **light** | 1 builder + 1 cross-vendor verifier | up to 2 | one checkpoint + final panel | standard |
+| **standard** | builder + cross-vendor verifier + synthesizer | up to 3 | checkpoint sub-loops + full lens panel | standard -> high |
+| **deep** | 2-3 cross-vendor planners -> judge (plan-fusion) + full panel | up to 4 | nested checkpoints + hard-lens double-pass | high |
+| **ultra** | 3 cross-vendor planners -> judge + double-pass hard lenses | up to 5 (policy cap) | integration checkpoints + research fan-out | highest the host exposes |
+Heuristics:
+- **light** — trivial slice, clear spec, narrow blast radius.
+- **standard** — normal slice, real but bounded risk.
+- **deep** — hard or architectural; ambiguous; cross-cutting blast radius; foundational dependency.
+- **ultra** — highest-stakes or riskiest; foundational; multi-system impact; rebuild-tier consequences if wrong.
+**Flags force a tier:** `--light`, `--standard`, `--deep`, `--ultra` override auto-selection for the whole run (explicit override always wins, per `pickEffortTier`).
+**Fusion-model cap:** on opencode `fugu` / `fugu-ultra` or `codex-fugu`, cap at `light` and run the straightforward single-pass method — no planner/lens fan-out. See doctrine § Model economy.
+**Announce + log every tier choice.** Print the tier, the signals that drove it, and a one-line "why" before construction starts; `mmi-cli saga note "build tier=<tier> site=<slug> signals=<…> reason=<…>"`. The heuristics are encoded testably in `build-policy.ts` (fixtures lock signals -> tier/parallelism/planner-count/reasoning).
+The tier is the **concurrency ceiling** and the **verification depth floor** for that site. **Fill the cap** — batch independent sites up to it rather than serializing (doctrine § Parallelism). The tier raises the ceiling; it does not lower the parallel bias.
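
The parallel-site caps and the "fill the cap" rule above can be sketched in a few lines. This is not the content of `build-policy.ts`, which this diff does not show: `resolveTier` and `batchSites` are hypothetical names, and only the cap values and the "explicit flag wins" rule come from the text. The signal-to-tier heuristics are intentionally not modeled.

```typescript
type Tier = "light" | "standard" | "deep" | "ultra";

// Parallel-site caps copied from the tier table above.
const PARALLEL_CAP: Record<Tier, number> = { light: 2, standard: 3, deep: 4, ultra: 5 };

// An explicit --light/--standard/--deep/--ultra flag always wins over auto-selection.
function resolveTier(autoSelected: Tier, flag?: Tier): Tier {
  return flag ?? autoSelected;
}

// Split independent sites into waves that fill the tier's cap instead of serializing.
function batchSites(sites: string[], tier: Tier): string[][] {
  const cap = PARALLEL_CAP[tier];
  const waves: string[][] = [];
  for (let i = 0; i < sites.length; i += cap) {
    waves.push(sites.slice(i, i + cap));
  }
  return waves;
}
```

For example, five independent sites at `light` run as three waves (2, 2, 1), while the same five at `ultra` fit in a single wave.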
+## Phase 0 — Framing interview
+A short, focused discussion with the user **before** the loop runs. The point of Phase 0 is to **prevent drift** — drift is the top failure mode of a long autonomous run.
+Cover:
+- The milestone in one paragraph; the North Star slug it lands under.
+- The finished foundation it builds on (what is true today; what build can assume).
+- Success criteria for the milestone — testable statements, not vibes.
+- The shape of the deliverable (one PR vs many; which surfaces; which artifacts).
+- Hard-decision territory: which calls the user wants to make personally if they come up (auth model, schema migrations, vendor choice, public API shape).
+- Anything explicitly out of scope.
+**Interview checklist** (work through every line; missing answers are drift hazards):
+- Milestone in one paragraph; North Star slug.
+- Finished foundation: what is true today, what build can assume, what is brittle. For deploy/infra/integration/publish chunks, read the existing org mechanism FIRST (central workflows in MMI-Hub, registry coords, the relevant runbook) before framing — never offer a framing option that contradicts discovered org doctrine (e.g. "product repos carry no deploy files").
+- Success criteria: testable statements (numbers, named artifacts, named surfaces).
+- Deliverable shape: one PR vs many; which repos; which surfaces; which generated artifacts must rebuild.
+- Wave shape (best guess): which sites look parallel, which look serialized, which look blocked.
+- Hard-decision territory: auth model, schema migration, public API shape, vendor choice, anything the user wants to call personally.
+- Explicitly out of scope.
+- Verification ceiling the user expects (built / tested / CI green / merged / live-verified) — build cannot deliver live-verified itself; name that gap up front.
+**Lock the decisions in:**
+1. `mmi-cli northstar push <slug>` with the milestone body + criteria.
+2. File or update the umbrella issue for the milestone and any known child issues; **batch-claim everything build will work in ONE call** — `mmi-cli board claim <ref> <ref> … --for <login>` (dedupes + claims in parallel; never one-by-one). (Set urgency with `--priority` — the board Priority **field**, never a `priority:*` label; #416.)
+3. `mmi-cli saga note "build framing: milestone=<…> criteria=<…> out-of-scope=<…> hard-decisions=<…>"`.
+4. **Cost estimate (measure-first).** Run `mmi-cli build estimate` (or `build tier` + mental math) and print the worst-case **agent-call proxy** + ceiling. If the projected cost exceeds the ceiling → lower the tier or halt-and-report before going gateless; log `saga note "build cost-estimate units=<n> ceiling=<n> action=<ok|lower-tier|halt>"`. Phase 0 may override `CAMPAIGN_ITERATION_CAP` (~15 orients default) with explicit human approval — log to saga.
+5. Initialize the **in-hand North Star** from `templates/campaign-northstar.md` via `northstar push` — this is campaign SSOT; saga is session handoff.
+After Phase 0 the loop runs gateless. If the milestone genuinely shifts mid-run, re-interview — do not let the loop redefine the milestone on its own.
+## Loop memory — North Star in hand
+**Two-store rule:** North Star (`plans/<slug>.md`) = campaign position; saga = session audit trail. Halt report = projection of the in-hand North Star, not a third source of truth.
+The North Star is **campaign working memory**, not a doc read once. **Read → act → write-back** every state transition:
+- **Read first:** L0 orient starts with `mmi-cli northstar show <slug>` (or `northstar relevant`), then `mmi-cli saga snapshot show --kind build` for session handoff.
+- **Write last:** after site enter/exit, tier choice, park, solvable-clear, merge, re-frame — `mmi-cli northstar push <slug>` with the canonical structure in `templates/campaign-northstar.md`, then `mmi-cli saga snapshot set --kind build` for session handoff.
+Canonical sections: Milestone + slug; Criteria; Done last turn / In progress / Blocked / Next frontier; Tier ledger; Verification ceiling per site.
+**Hygiene:** short + structured. If state outgrows a scannable snapshot, graduate or split the milestone — do not grow the file.
+**Re-frame:** North Star edit first, then `saga note --decision`, then continue. When a halt fires (iteration cap, cost ceiling, hard-decision), record the reason in Blocked / Next frontier before the halt report.
+**Re-interview triggers** (always pause for these; never let the loop absorb them):
+- A success criterion proves untestable or contradictory in flight.
+- A finished-foundation assumption turns out to be false (a dependency moved).
+- A site uncovers a hard-decision the framing did not flag.
+- The user changes the scope mid-run.
+## Loop engineering (nested) — the primary engine
+Build is loops inside loops. **Every loop must improve understanding** of the problem and of the implementation — iterate to deepen and re-frame, not merely to fix. If a round closes a blocker but did not teach you anything new about the problem, treat that as a signal to widen the next round, not to skip it.
+The nesting:
+- **L0 — Campaign loop (frontier).** Read the in-hand North Star **first**; orient on board + research; select the unblocked frontier; pick a site; construct; learn; write North Star back; re-orient. Runs until **externally confirmed** frontier exhaustion or a halt condition (iteration cap, cost ceiling, hard-decision). Default **~15** L0 orients (`CAMPAIGN_ITERATION_CAP`); on cap without external exhaustion → halt-and-report (honest halt ≠ failure).
+- **L1 — Site loop (slice -> merge).** For one site: plan, build, checkpoint-verify in sub-loops, fix, re-verify, open PR, watch CI, merge into `development`, then tick the site's line in the milestone umbrella (`mmi-cli issue check <umbrella> --item "<text>"`; native sub-issues auto-tick on close) so it doesn't read 0% after slices shipped (#1796). Each L1 turn rolls back into L0.
+- **L2 — Checkpoint verification sub-loops.** Inside a site, at each meaningful increment (not just at the end), spin a verification sub-loop. The effort tier sets how many checkpoints and how deep each one runs.
+- **L3 — Fusion-panel rounds.** Inside a checkpoint, use the host multi-agent panel mechanism when available, then run parallel lenses -> synthesizer -> triage; repeat until **two consecutive synthesis-stable clean rounds** (identical blocker `id` sets; an empty blocker set counts as stable). Degraded fallback follows shared doctrine and must be reported. Empty or controller-authored all-pass lens stubs are invalid evidence. Each blocker/clean verdict must cite an **objective signal** when one exists (failing test, typecheck error, sandbox repro, cited source) — not bare judgment. **Terminal done hierarchy:** (1) repo checks / CI green, (2) two stable clean rounds, (3) merge authority — never skip layer 1.
+- **Research sub-loops** can be spawned at any level when a decision needs grounding. Treat research as first-class, not a detour.
+The effort tier sets the width and depth at each level. `light` may collapse L2/L3 into one pass; `ultra` runs full L3 panels at every L2 checkpoint, with hard-lens double-pass.
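
The L3 stop condition (two consecutive rounds with identical blocker `id` sets, where an empty set counts as stable) can be expressed as a small predicate. The sketch below is one reading of that rule, with hypothetical names (`isSynthesisStable`, `blockerKey`). It only compares id sets; whether any remaining blockers are acceptable is a triage question outside the sketch.

```typescript
// Canonical key for a set of blocker ids: de-duplicated, sorted, serialized.
function blockerKey(ids: string[]): string {
  const unique = ids.filter((id: string, i: number): boolean => ids.indexOf(id) === i);
  return JSON.stringify(unique.sort());
}

// `rounds` holds the blocker ids reported by each synthesis round, oldest first.
// Stable means the last two rounds reported the same set of ids.
function isSynthesisStable(rounds: string[][]): boolean {
  if (rounds.length < 2) return false;
  const last = rounds[rounds.length - 1];
  const previous = rounds[rounds.length - 2];
  if (last === undefined || previous === undefined) return false;
  return blockerKey(last) === blockerKey(previous);
}
```

A single clean round is never enough under this reading: the predicate needs a second round to confirm the first.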
+**What "deepens understanding" means per level (concrete tests) + the distribution-chunk consumer
+rule → `references/loops.md`.** Short version: each level must surface one new fact (L0 a re-framed
+frontier, L1 a sharper plan, L2 a missed coupling, L3 a `PanelReport` that moves the model); a round
+that teaches nothing is a stall signal. Artifacts consumed outside their repo (registry items, npm
+packages, plugin bundles, seeds) need a **consumer-path checkpoint**, not just an in-repo build.
+**Worktree tests:** see shared doctrine — tests must run with `cwd` set to the worktree; orchestrator reproduces subagent green before merge.
+**Stay in the active worktree.** For a coherent build site or same-repo wave, keep implementation,
+verification, commits, PR prep, saga, and North Star updates in the selected feature worktree until
+the site/wave reaches its integration boundary. Do not bounce to the main checkout for routine status,
+fast-forward, or reorientation work. Switch mid-run only for a new unrelated branch, real base drift,
+final integrated validation, cleanup after merge, explicit user request, or a broken/unsafe worktree.
+Keep any local `/stage` attached to the active worktree; do not disrupt it for git bookkeeping alone.
+**Delegated verify+commit:** see shared doctrine § Delegated verify+commit (#1595).
+**Re-framing rule:** when L0 orient detects that the problem has shifted under you (a dependency landed differently, a spec was wrong, a downstream effect was missed), **North Star edit first**, then `saga note --decision`, then update the board, then continue. Never let the loop quietly redefine the milestone.
+**Ralph Wiggum guard:** distrust the loop's own "I'm done" signal. **Frontier exhausted** requires external confirmation: in-hand North Star shows empty next frontier, no open unblocked board claims, no in-flight PRs for this milestone, no unresolved parked sites — checked via `mmi-cli board` / `gh` and `mmi-cli build frontier`, not self-belief. Same orient picture twice = stall signal (counts toward stale orients).
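
The guard above amounts to a conjunction of externally observed facts. A minimal sketch, assuming the four counts have already been gathered from the North Star, the board, and `gh` (the `FrontierEvidence` shape and `outstandingWork` name are hypothetical, and the output of `mmi-cli build frontier` is not shown in this diff):

```typescript
// Hypothetical evidence record, filled from external sources rather than the loop's own belief.
interface FrontierEvidence {
  nextFrontier: string[];        // "Next frontier" entries in the in-hand North Star
  openUnblockedClaims: number;   // open, unblocked board claims for this milestone
  inFlightPrs: number;           // open PRs for this milestone
  unresolvedParkedSites: number; // parked sites still waiting on a decision
}

// List what still blocks a "frontier exhausted" verdict; empty means exhausted.
function outstandingWork(evidence: FrontierEvidence): string[] {
  const reasons: string[] = [];
  if (evidence.nextFrontier.length > 0) reasons.push("next frontier is not empty");
  if (evidence.openUnblockedClaims > 0) reasons.push("open unblocked board claims remain");
  if (evidence.inFlightPrs > 0) reasons.push("PRs are still in flight");
  if (evidence.unresolvedParkedSites > 0) reasons.push("parked sites are unresolved");
  return reasons;
}

function frontierExhausted(evidence: FrontierEvidence): boolean {
  return outstandingWork(evidence).length === 0;
}
```

Returning the reasons, not just a boolean, gives the halt report something concrete to cite when the frontier is not yet exhausted.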
+## Research is first-class
+When a decision needs grounding — an API contract, a third-party behavior, a security CVE, a benchmark, an unfamiliar primitive — spawn a research sub-loop. Bounded web search, doc fetch, or sandboxed repro; output is a short markdown note plus sources fed back into the surrounding loop (L1 plan, L2 checkpoint, L3 lens).
+- Research is **not a detour from building** — it is part of building.
+- Default research budget: 3 queries per spike, hard cap on round trips, allow/deny lists honored (no benchmark-leak domains in verify-time search).
+- Saga-log every spike: `mmi-cli saga note "build research spike=<…> sources=<…> conclusion=<…>"`.
+## Authority + guardrails
+Non-negotiable, in priority order:
+- **Auto-merge to `development` only.** Never `rc`, never `main`, never a promotion (`/rcand`, `/release`, `/hotfix` are human-only).
+- **No-CI repos (#1432).** Run `mmi-cli pr ci-policy --json` before polling. When `policy` is `no-ci`, local verify satisfies terminal layer 1 — merge after local verify. **`(Recommended)` `mmi-cli pr land <n>`** runs the full path. After land, that branch's worktree + branch cleanup is automatic (#1606); for serialized same-repo waves, do not land every issue separately if the active shared worktree is meant to continue. When the user explicitly wants the linked worktree/stage to continue across a same-session batch boundary, use `mmi-cli pr land <n> --preserve-worktree` and clean up the local branch/worktree/stage at the batch end (#1888).
+- **Worktree / node_modules:** see shared doctrine.
+- **Claim `--for <login>`.** Resolve the human from `mmi-cli whoami` or the session banner. Never claim as the bot.
+- **Saga note every phase.** Drop a one-line `mmi-cli saga note "<…>"` after Phase 0, each L0 orient, each tier choice, each L1 site enter/exit, every parked site, and at the halt report.
+- **North Star in hand each L0 turn.** `mmi-cli northstar show <slug>` first; `mmi-cli saga snapshot show --kind build` for handoff; write back via `northstar push` + `saga snapshot set --kind build`.
+- **Two-tier vault only.** `/secrets` — never env files. Harness: `vault-edit-gate.mjs` on Claude Code (see enforcement matrix).
+- **No self-escalation.** Privileged ops → board issue for master.
+- **No fabrication.** Source-of-truth links, paths, CLI flags only.
+**What build never does** (hard floor):
+- Push to `rc` or `main`; promotion is human-only (`/rcand`, `/release`, `/hotfix`).
+- `gh pr merge` on bases other than `development`.
+- Org-tier secret writes; box/runner admin; prod-touching infra.
+- Hand-edit runtime `.env` or commit `.env`.
+- Destructive git — **Claude Code 2.1.183+ harness blocks natively**; skill-enforced elsewhere (see enforcement matrix).
+- Edit a skill live mid-run — skill fixes are PRs to MMI-Hub.
+## Build loop
+```mermaid
+flowchart TD
+  interview["Phase 0: framing interview - lock decisions to board + North Star"] --> orient
+  orient["Orient: read North Star first, board/research, write back — no drift"] --> select
+  select["Select unblocked frontier + pick effort tier per site"] --> construct
+  construct["Construct site at tier: plan-fusion -> build -> checkpoint verify sub-loops"] --> blockers
+  blockers{"Blocker?"}
+  blockers -->|solvable| selfclear["File, claim, fix, PR, merge, continue"]
+  blockers -->|hard decision| park["File, park this site, keep frontier moving"]
+  blockers -->|none| merge["Auto-merge to development"]
+  selfclear --> orient
+  park --> orient
+  merge --> learn["File uplift + friction to MMI-Hub; saga + North Star sync"]
+  learn --> orient
+  orient -->|externally confirmed exhaustion| halt["Honest halt report — projection of in-hand North Star"]
+  orient -->|iteration cap / cost ceiling| halt
+```
+## Halt report
+The end-of-run report is the durable handoff — a **projection** of the in-hand North Star (sections map 1:1; no duplicate prose formats). Required content:
+- **Merged** — every PR landed into `development`, with site/slice name + issue refs.
+- **Filed** — every issue build opened during the run (blockers, uplift, friction) with link and which board.
+- **Parked** — every site build stopped on, the hard decision it waits on, and the options surfaced for the user.
+- **Verification ceiling** — what level of verification was achieved (built / tested / CI green / merged to development). Live verification against a deployed environment is **always** outside build's ceiling; name that explicitly.
+- **Tier ledger** — tier choice per site + the signals that drove it.
+- **Fusion gap** — if any fusion round could not run fully cross-vendor, name the gap.
+- **Next step** — the smallest concrete action the user can take to clear the parked sites and resume the milestone.
+A run that merged nothing but cleanly parked the frontier on hard decisions is still a successful run — the milestone moved, even if no code merged.
+**Halt report skeleton → `references/halt-report.md`** — a fill-in template with all sections
+(Milestone, Merged, Filed, Parked, Verification ceiling, Tier ledger, Fusion coverage, Next step).
+## Worked example → `references/worked-example.md`
+A full illustrative walkthrough of the loop on one real milestone (the Katip v1.0 pipeline body) —
+Phase 0 framing → L0 orient → frontier partition by buildability + effort tier → wave sequencing →
+L1 site loops → auto-merge → self-learning → halt report — lives in `references/worked-example.md`.
+## Retro
+See shared doctrine § Self-learning + retro. File via `mmi-cli skill-lesson --skill build` — at most one lesson per run.
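The Ralph Wiggum guard in this file names four external conditions that must all hold before "frontier exhausted" is accepted, plus a stall rule for repeated orient pictures. A minimal sketch of that check follows. It is not taken from the package: `FrontierEvidence`, its field names, and both functions are hypothetical, and how the values are gathered from `mmi-cli board`, `gh`, or `mmi-cli build frontier` is deliberately left out because their output formats are not shown here.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class FrontierEvidence:
    """Externally gathered facts. Every field name is hypothetical."""
    next_frontier: List[str] = field(default_factory=list)  # North Star "Next frontier" entries
    open_unblocked_claims: int = 0    # open, unblocked board claims
    in_flight_prs: int = 0            # open PRs for this milestone
    unresolved_parked_sites: int = 0  # parked sites still waiting on a decision


def exhaustion_blockers(evidence: FrontierEvidence) -> List[str]:
    """Return the reasons the frontier is NOT exhausted; an empty list means exhausted."""
    reasons: List[str] = []
    if evidence.next_frontier:
        reasons.append("next frontier not empty")
    if evidence.open_unblocked_claims > 0:
        reasons.append("open unblocked board claims")
    if evidence.in_flight_prs > 0:
        reasons.append("in-flight PRs")
    if evidence.unresolved_parked_sites > 0:
        reasons.append("unresolved parked sites")
    return reasons


def is_stale_orient(previous: Optional[str], current: str) -> bool:
    """Same orient picture twice in a row counts as a stall signal."""
    return previous is not None and previous == current
```

Returning the list of reasons rather than a bare boolean keeps the halt decision auditable: the loop can halt only when the list is empty, and otherwise has something concrete to log.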

package/skills/build/references/halt-report.md ADDED

@@ -0,0 +1,38 @@
+# build halt-report skeleton
+Fill-in template for the end-of-run halt report. The required-content list + the rationale stay in
+`SKILL.md` § Halt report; this is the skeleton to fill (an empty section is a positive statement, not
+a missing one).
+```text
+Milestone: <name> (North Star: <slug>)
+Run mode: auto (Phase 0 authorization @ <iso ts>)
+Merged into development:
+  - <site/slice> — PR #<n> — closes #<issue>... — tier=<tier>
+  - ...
+Filed:
+  - blockers: <list with issue links + repo>
+  - process uplift (Hub): <skill-lesson link or "none">
+  - friction (Hub): <issue links or "none">
+Parked:
+  - <site/slice> — waits on: <hard decision> — options surfaced in #<issue>
+  - ...
+Verification ceiling:
+  - <site> -> merged + CI green (not live-verified against <env>)
+  - ...
+Tier ledger:
+  - <site> -> <tier> (signals: <…>)
+  - ...
+Fusion coverage:
+  - rounds run cross-vendor: <n/total>
+  - gaps: <vendor missing on which rounds, or "none">
+Next step:
+  - <smallest concrete action the user can take to clear the parked sites>
+```
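The skeleton above treats an empty section as a positive statement, so a renderer should always emit every section header. A minimal sketch under stated assumptions: `HaltReport`, `render_halt_report`, and the flat list-of-strings shape for each section are hypothetical, and printing `none` for any empty section generalizes the skeleton's own `"none"` placeholders rather than reproducing a documented rule.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class HaltReport:
    """Hypothetical container; one list of pre-formatted lines per skeleton section."""
    milestone: str
    north_star_slug: str
    phase0_authorized_at: str  # ISO timestamp of the Phase 0 authorization
    merged: List[str] = field(default_factory=list)
    filed: List[str] = field(default_factory=list)
    parked: List[str] = field(default_factory=list)
    verification_ceiling: List[str] = field(default_factory=list)
    tier_ledger: List[str] = field(default_factory=list)
    fusion_coverage: List[str] = field(default_factory=list)
    next_step: List[str] = field(default_factory=list)


def _section(title: str, items: List[str]) -> List[str]:
    # An empty section is still printed, with an explicit "none" entry (assumption).
    body = [f"  - {item}" for item in items] or ["  - none"]
    return [f"{title}:"] + body


def render_halt_report(report: HaltReport) -> str:
    """Render all sections in the skeleton's order."""
    lines = [
        f"Milestone: {report.milestone} (North Star: {report.north_star_slug})",
        f"Run mode: auto (Phase 0 authorization @ {report.phase0_authorized_at})",
    ]
    sections = [
        ("Merged into development", report.merged),
        ("Filed", report.filed),
        ("Parked", report.parked),
        ("Verification ceiling", report.verification_ceiling),
        ("Tier ledger", report.tier_ledger),
        ("Fusion coverage", report.fusion_coverage),
        ("Next step", report.next_step),
    ]
    for title, items in sections:
        lines += _section(title, items)
    return "\n".join(lines)
```

A run that merged nothing still produces a complete report this way, which matches the source's point that a cleanly parked frontier is a valid outcome.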

package/skills/build/references/loops.md ADDED

@@ -0,0 +1,13 @@
+# build loop engineering — what "deepens understanding" means per level
+Detail pulled from `SKILL.md` § Loop engineering. The L0–L3 nesting summary stays in `SKILL.md`; this
+file holds the per-level "deepened understanding" tests and the distribution-chunk consumer rule.
+**What "deepens understanding" looks like per level** (concrete tests, not vibes):
+- **L0** — the orient is deeper than the last one: new evidence on the board, a parked decision now clearer, a research spike that re-frames the frontier. If the orient produces the same picture as last time, treat that as a signal the campaign has stalled — escalate effort tier or surface a hard decision.
+- **L1** — the site plan is sharper than the planner's first sketch: edge cases named, dependencies traced, the test surface enumerated. A site that ends with the same understanding it started with built nothing new.
+- **L2** — the checkpoint exposed something the plan missed (a coupling, a contract, a perf cliff, a failure mode). A checkpoint that just confirms "no blockers found" without surfacing one new fact is suspect — widen the next L2.
+- **L3** — the fusion-panel round produced a `PanelReport` whose `consensus`, `contradictions`, `blind_spots`, and `unique_insights` actually move the model of the change. Two clean rounds with identical empty blocker sets prove stability; two clean rounds with the same shallow lens write-ups prove only that the panel is asleep.
+**Distribution chunks need consumer verification.** When any site, slice, or task ships an artifact consumed outside its source repo — shadcn registry items, npm packages, generated plugin bundles, scaffold seeds, or any publish/install surface — an in-repo build is not enough. Add a checkpoint that simulates the consumer path (`shadcn add`, `npm pack` + install, scaffold into a scratch app, or equivalent) and typechecks/runs the consumer. If a full simulation is unavailable, do both of these before merge: research the installer's rewrite/resolution rules, and verify every runtime dependency is declared, including CSS/token/theme layers. Log the exact consumer lens and its ceiling in the saga.
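The fallback in the consumer rule above ("verify every runtime dependency is declared") can be approximated with a small manifest check. This is a sketch, not the package's method: both function names are hypothetical, the list of imported specifiers is assumed to be collected elsewhere, and only `node:`-prefixed builtins are recognized, so bare builtin names such as `fs` would be reported as missing. CSS, token, and theme layers still need a separate check.

```python
from typing import Dict, Iterable, List, Optional


def package_name(specifier: str) -> Optional[str]:
    """Map an import specifier to its npm package name, or None if it is not a package."""
    if specifier.startswith((".", "/", "node:")):
        return None  # relative path, absolute path, or node: builtin
    parts = specifier.split("/")
    if specifier.startswith("@"):
        # Scoped packages keep two segments: @scope/name
        return "/".join(parts[:2]) if len(parts) >= 2 else None
    return parts[0]


def undeclared_runtime_deps(manifest: Dict, imported: Iterable[str]) -> List[str]:
    """List packages that are imported but absent from the manifest's runtime fields."""
    declared = set()
    for key in ("dependencies", "peerDependencies", "optionalDependencies"):
        declared.update(manifest.get(key, {}))
    own_name = manifest.get("name")
    missing = set()
    for specifier in imported:
        name = package_name(specifier)
        if name and name != own_name and name not in declared:
            missing.add(name)
    return sorted(missing)
```

`devDependencies` is left out on purpose: a consumer installing the published artifact does not receive them, so anything imported at runtime must appear in one of the three fields checked here.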

package/skills/build/references/worked-example.md ADDED

@@ -0,0 +1,18 @@
+# build worked example — Katip v1.0 pipeline body
+Illustrative walkthrough of how the `/build` loop chews through one real milestone without going on
+autopilot. The slice names and issue refs are illustrative; the real ones belong to the milestone's
+own North Star.
+- **Phase 0 — Framing.** Lock the milestone: "ship the Katip v1.0 pipeline body end-to-end on top of the finished foundation." Criteria: every slice green to `development`; vocab pack landed; live-Recall integration scoped (not done). Hard-decision territory: slice-B (`#192`) depends on `#156` ratify + a live-Recall field that needs a human call. North Star push; saga note; out-of-scope = the live-Recall integration itself.
+- **L0 orient (first turn).** Read North Star, scan the board for unblocked items, run a brief research spike on the `#156` ratify state. Inventory: slice-F machinery is the foundation; slices C/D/E run on default choices; vocab pack is parallel-safe; slice-B is blocked on a hard decision.
+- **Partition the frontier (buildability + effort tier).** Three groups fall out:
+  - **Group 1 — foundational, buildable.** owner_email machinery, `#159`, slice-F machinery. Effort tier **deep**: foundational, one wrong call cascades.
+  - **Group 2 — slices on default choices, buildable.** Slices C, D, E. Effort tier **standard**: bounded risk, real but not architectural.
+  - **Group 3 — blocked.** Slice-B (`#192`). Hard-decision blocker on `#156` ratify + live-Recall field. File the option set in the issue body, **park the site**, do not guess.
+  - **Vocab pack** — independent, trivial slice, parallel-safe. Effort tier **light**.
+- **Sequence the waves.** Foundational Group 1 first. Then the conceptual order B -> C -> D -> E -> F collapses (B parked) to C -> D -> E -> F in dependency order. Vocab pack runs in parallel from the start (its own worktree, its own PR).
+- **L1 site loops.** Parallel sites cut separate worktrees from `development`; serialized same-repo waves can reuse one shared active worktree until the wave/session ends. Group 1 sites get plan-fusion (2-3 cross-vendor planners -> judge), full L3 panels at every L2 checkpoint, hard-lens double-pass. Group 2 sites get a standard panel at end-of-slice. Vocab pack gets a light pass + final panel.
+- **Auto-merge** each PR into `development` as it goes clean. Watch CI; CI-fix loop bounded; rebuild + re-verify artifacts before merge.
+- **Self-learning.** Two friction notes filed during the run (a Windows worktree lock, an artifact-parity surprise on the vocab pack). One process uplift (`mmi-cli skill-lesson --skill build`) when L2 checkpoints proved too sparse on Group 1 site F.
+- **Halt report.** Merged: foundational machinery + slices C/D/E/F + vocab pack. Parked: slice-B on the `#156` ratify + live-Recall field decision (issue link + options). Verification ceiling: merged to `development` + CI green; **not live-verified** against the deployed Katip env — that next step is named explicitly in the report. The user clears the hard decision, then the next build run picks up slice-B.

package/skills/build/templates/campaign-northstar.md ADDED

@@ -0,0 +1,40 @@
+# Campaign North Star — in-hand template
+Short, structured campaign working memory. Keep scannable; if this outgrows one screen, graduate or split the milestone.
+```markdown
+# <North Star slug>
+## Milestone
+<one paragraph — what this campaign ships>
+## Criteria
+- <testable statement from Phase 0>
+- ...
+## Done last turn
+- <site/slice + outcome, or "none — first orient">
+## In progress
+- <site/slice currently under construction>
+## Blocked (parked)
+- <site> — waits on: <hard decision> — issue #<n>
+## Next frontier
+- <unblocked sites/issues the next L0 turn should select>
+## Tier ledger
+| Site | Tier | Signals |
+|------|------|---------|
+| <slug> | <light\|standard\|deep\|max> | <brief> |
+## Verification ceiling reached
+| Site | Rung |
+|------|------|
+| <slug> | <built\|tested\|CI green\|merged> |
+```
+**Write-back triggers:** Phase 0 push; every L0 orient (read first, write last); site enter/exit; tier choice; park; solvable-clear; merge; re-frame; halt (record reason in Blocked/Next frontier before halt report).
+**Write primitive:** `mmi-cli northstar push <slug>` with this structure — not ad-hoc chat summaries.
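Because the write primitive expects "this structure", a pre-push structural check is one way to catch ad-hoc summaries early. A minimal sketch, assuming the North Star is plain markdown whose second-level headings match the template exactly; `REQUIRED_SECTIONS` is copied from the template above, while `missing_sections` is a hypothetical helper and not an `mmi-cli` feature.

```python
import re
from typing import List

# Second-level headings, in the order the template lists them.
REQUIRED_SECTIONS = [
    "Milestone",
    "Criteria",
    "Done last turn",
    "In progress",
    "Blocked (parked)",
    "Next frontier",
    "Tier ledger",
    "Verification ceiling reached",
]


def missing_sections(markdown: str) -> List[str]:
    """Return template sections that have no matching '## <name>' heading."""
    found = {
        match.group(1).strip()
        for match in re.finditer(r"^##[ \t]+(.+)$", markdown, re.MULTILINE)
    }
    return [name for name in REQUIRED_SECTIONS if name not in found]
```

The check is intentionally shallow: it confirms the headings exist, not that their contents are meaningful or that tier values fall within `light`, `standard`, `deep`, or `max`.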