@zigrivers/scaffold 3.4.0 → 3.4.1

Files changed (2)
  1. package/README.md +320 -7
  2. package/package.json +1 -1
package/README.md CHANGED
@@ -76,6 +76,13 @@ Independent review from Google's model. Can run alongside or instead of Codex.
76
76
  - Requires: Google account (free tier available)
77
77
  - Verify: `gemini --version`
78
78
 
79
+ **mmr** (multi-model review CLI)
80
+ Automates dispatching, monitoring, and reconciling code reviews across multiple AI model CLIs. Works standalone or with Scaffold.
81
+ - Install: `npm install -g @zigrivers/mmr`
82
+ - Verify: `mmr --help`
83
+ - Setup: `mmr config init` (auto-detects installed CLIs)
84
+ - See: [mmr — Multi-Model Review CLI](#mmr--multi-model-review-cli)
85
+
79
86
  **Playwright MCP** (web apps only)
80
87
  Lets Claude control a real browser for visual testing and screenshots.
81
88
  - Install: `claude mcp add playwright npx @playwright/mcp@latest`
@@ -151,6 +158,12 @@ npm update -g @zigrivers/scaffold
151
158
  brew upgrade scaffold
152
159
  ```
153
160
 
161
+ ### mmr
162
+
163
+ ```bash
164
+ npm update -g @zigrivers/mmr
165
+ ```
166
+
154
167
  ### Plugin
155
168
 
156
169
  ```
@@ -546,16 +559,291 @@ You don't need both — Scaffold works with whichever CLIs are available. Having
546
559
 
547
560
  ### mmr — Multi-Model Review CLI
548
561
 
549
- Scaffold includes `mmr`, a standalone CLI that automates multi-model code review dispatch, monitoring, and reconciliation. Instead of manually running each review channel, `mmr` handles it all:
562
+ `mmr` is a standalone CLI that automates multi-model code review. It solves the problems teams hit when manually orchestrating reviews across Claude, Codex, and Gemini: timeouts, auth failures, inconsistent prompts, fragile output parsing, and manual reconciliation.
563
+
564
+ **The core problem it solves:** Without `mmr`, an AI agent dispatching multi-model reviews has to manually construct CLI commands for each model, handle per-tool auth quirks, improvise timeout handling, parse different output formats, and reconcile findings across channels. In practice, this takes 4-6+ minutes per review and frequently fails. `mmr` reduces this to three commands.
565
+
566
+ #### How mmr Works
567
+
568
+ ```
569
+ mmr review --pr 47 ──→ Dispatches to all channels in background
570
+ Returns job ID immediately
571
+ Agent continues working
572
+
573
+ mmr status mmr-a1b2c3 ──→ Poll progress (which channels done?)
574
+ Exit code: 0=done, 1=running, 2=failed
575
+
576
+ mmr results mmr-a1b2c3 ──→ Reconcile findings across channels
577
+ Apply severity gate
578
+ Output unified findings
579
+ Exit code: 0=passed, 1=gate failed
580
+ ```
581
+
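The exit codes listed above make the job easy to script. What follows is a minimal sketch of a polling helper, not something shipped with `mmr`: the function name `mmr_wait`, the default poll interval, the poll limit, and the 124 give-up code are all assumptions introduced for illustration. Only `mmr status <job-id>` and its exit codes (0=done, 1=running, 2=failed) come from the text above.

```bash
# Hypothetical helper (not shipped with mmr): block until a review job finishes.
# Uses only the documented `mmr status` exit codes: 0=done, 1=running, 2=failed.
mmr_wait() {
  local job_id="$1"
  local interval="${2:-15}"    # seconds between polls (assumed default)
  local max_polls="${3:-40}"   # stop after this many polls (assumed default)
  local polls=0
  local rc

  while [ "$polls" -lt "$max_polls" ]; do
    rc=0
    mmr status "$job_id" >/dev/null 2>&1 || rc=$?
    case "$rc" in
      0) return 0 ;;             # every channel has finished
      1) sleep "$interval" ;;    # still running: wait and poll again
      *) return "$rc" ;;         # 2 = failed; pass any other code through
    esac
    polls=$((polls + 1))
  done
  return 124                     # gave up waiting (same code timeout(1) uses)
}
```

A caller could then run `mmr_wait mmr-a1b2c3 && mmr results mmr-a1b2c3 --format text`, so that results are collected only once every channel has finished.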
582
+ **Key features:**
583
+
584
+ - **Async job model** — reviews run in background processes. The agent fires `mmr review` and continues working. No blocking for 4-6 minutes.
585
+ - **Per-channel auth verification** — checks authentication before every dispatch. Auth failures are never silent — `mmr` tells you exactly what expired and the command to fix it.
586
+ - **Immutable core prompt** — every channel gets the same severity definitions (P0-P3), output format spec (JSON), and review criteria. No prompt drift between channels.
587
+ - **Automated reconciliation** — when two channels flag the same location, that's consensus (high confidence). When only one channel flags something, it's unique (medium confidence). P0 from any single source is always high confidence.
588
+ - **Configurable severity gate** — project default in `.mmr.yaml`, override per-review with `--fix-threshold`. Default: P2 (fix P0/P1/P2, skip P3).
589
+ - **Multiple output formats** — JSON (default, for machines), text (terminals), markdown (PR comments).
590
+
591
+ #### Installing mmr
550
592
 
593
+ **npm** (available now):
551
594
  ```bash
552
- mmr review --pr 47 --focus "auth flow" # Dispatch to all channels (async)
553
- mmr status mmr-a1b2c3 # Poll progress
554
- mmr results mmr-a1b2c3 # Reconciled findings + gate
555
- mmr config test # Pre-flight auth check
595
+ npm install -g @zigrivers/mmr
556
596
  ```
557
597
 
558
- `mmr` is installed alongside Scaffold, or independently via `npm install -g @zigrivers/mmr`. Run `mmr config init` to auto-detect installed CLIs and generate a `.mmr.yaml` config. See the [mmr design spec](docs/superpowers/specs/2026-04-05-mmr-multi-model-review-design.md) for full details.
598
+ **Homebrew** (available after next scaffold release):
599
+ ```bash
600
+ brew tap zigrivers/scaffold
601
+ brew install mmr
602
+ ```
603
+
604
+ Verify: `mmr --help`
605
+
606
+ #### Enabling mmr in an Existing Project
607
+
608
+ **Step 1: Install the model CLIs you want to use**
609
+
610
+ You need at least one. More models = more diverse review perspectives.
611
+
612
+ ```bash
613
+ # Claude Code (you probably already have this)
614
+ npm install -g @anthropic-ai/claude-code
615
+
616
+ # Codex CLI (requires ChatGPT Plus/Pro/Team subscription)
617
+ npm install -g @openai/codex
618
+
619
+ # Gemini CLI (free tier available with Google account)
620
+ npm install -g @google/gemini-cli
621
+ ```
622
+
623
+ **Step 2: Authenticate each CLI**
624
+
625
+ Each CLI needs a one-time interactive authentication:
626
+
627
+ ```bash
628
+ # Claude — if not already logged in
629
+ claude login
630
+
631
+ # Codex — opens browser for OAuth
632
+ codex login
633
+
634
+ # Gemini — opens browser for OAuth
635
+ gemini -p "hello"
636
+ ```
637
+
638
+ **Step 3: Initialize mmr in your project**
639
+
640
+ ```bash
641
+ cd your-project
642
+ mmr config init
643
+ ```
644
+
645
+ This auto-detects which CLIs are installed and generates `.mmr.yaml` in your project root:
646
+
647
+ ```
648
+ Detected CLIs:
649
+ ✓ claude (claude -p)
650
+ ✓ gemini (gemini -p)
651
+ ✗ codex (not found)
652
+
653
+ Generated .mmr.yaml with 2 enabled channels.
654
+ Run `mmr config test` to verify authentication.
655
+ ```
656
+
657
+ **Step 4: Verify authentication**
658
+
659
+ ```bash
660
+ mmr config test
661
+ ```
662
+
663
+ ```
664
+ claude ✓ installed ✓ authenticated
665
+ gemini ✓ installed ✓ authenticated
666
+ codex ✗ not installed (skipped)
667
+
668
+ 2/3 channels ready.
669
+ ```
670
+
671
+ If any channel shows an auth failure, `mmr` tells you the exact command to fix it.
672
+
673
+ **Step 5: Commit the config**
674
+
675
+ ```bash
676
+ git add .mmr.yaml
677
+ git commit -m "chore: add mmr multi-model review config"
678
+ ```
679
+
680
+ This ensures your team shares the same channel configuration.
681
+
682
+ **Step 6 (optional): Customize review criteria**
683
+
684
+ Edit `.mmr.yaml` to add project-specific review criteria that get injected into every review prompt:
685
+
686
+ ```yaml
687
+ review_criteria:
688
+ - "Verify all database queries use parameterized statements"
689
+ - "Check that error messages do not leak internal state"
690
+ - "Ensure all API endpoints validate authentication"
691
+ ```
692
+
693
+ You can also adjust per-channel timeouts, the default severity threshold, and named review templates for different review types (PR reviews, implementation plan reviews, etc.).
694
+
695
+ #### Using mmr Day-to-Day
696
+
697
+ **After creating a PR:**
698
+
699
+ ```bash
700
+ mmr review --pr 47 --focus "auth flow, session handling"
701
+ # → Job mmr-a1b2c3 started. 2/2 channels dispatched.
702
+ ```
703
+
704
+ **Continue working, then check back:**
705
+
706
+ ```bash
707
+ mmr status mmr-a1b2c3
708
+ # → claude: completed (47s) | gemini: running (2m12s)
709
+
710
+ # Later:
711
+ mmr status mmr-a1b2c3
712
+ # → All channels complete.
713
+ ```
714
+
715
+ **Collect reconciled results:**
716
+
717
+ ```bash
718
+ mmr results mmr-a1b2c3
719
+ # → JSON output with gate_passed, reconciled_findings, per_channel details
720
+
721
+ mmr results mmr-a1b2c3 --format text
722
+ # → Human-readable terminal output
723
+
724
+ mmr results mmr-a1b2c3 --format markdown
725
+ # → Markdown table for PR comments
726
+ ```
727
+
728
+ **Review staged changes before committing:**
729
+
730
+ ```bash
731
+ mmr review --staged --focus "regression risk"
732
+ ```
733
+
734
+ **Review a diff between branches:**
735
+
736
+ ```bash
737
+ mmr review --base main --head feature/auth
738
+ ```
739
+
740
+ **Override severity gate for a critical path:**
741
+
742
+ ```bash
743
+ mmr review --pr 47 --fix-threshold P1 # Only fix P0 and P1
744
+ mmr review --pr 47 --fix-threshold P0 # Only fix critical/security issues
745
+ ```
746
+
747
+ #### mmr Commands Reference
748
+
749
+ | Command | Purpose |
750
+ |---------|---------|
751
+ | `mmr review` | Dispatch a review job to all configured channels |
752
+ | `mmr status <job-id>` | Check progress of a running job |
753
+ | `mmr results <job-id>` | Collect, reconcile, and output findings |
754
+ | `mmr config init` | Auto-detect CLIs and generate `.mmr.yaml` |
755
+ | `mmr config test` | Verify all channels (installation + auth) |
756
+ | `mmr config channels` | List configured channels |
757
+ | `mmr jobs list` | Show recent review jobs |
758
+ | `mmr jobs prune` | Remove old jobs (default: older than 7 days) |
759
+
760
+ #### mmr Configuration (.mmr.yaml)
761
+
762
+ The config file controls channel definitions, defaults, and project-specific review criteria:
763
+
764
+ ```yaml
765
+ version: 1
766
+
767
+ defaults:
768
+ fix_threshold: P2 # P0/P1/P2 block the gate, P3 is informational
769
+ timeout: 300 # Per-channel timeout in seconds
770
+ format: json # Default output format
771
+ job_retention_days: 7 # Auto-prune old jobs
772
+
773
+ # Project-specific criteria appended to every review prompt
774
+ review_criteria:
775
+ - "Check for SQL injection in all query builders"
776
+ - "Verify RBAC rules match API contract"
777
+
778
+ # Channel definitions (auto-generated by mmr config init)
779
+ channels:
780
+ claude:
781
+ enabled: true
782
+ command: claude -p
783
+ auth:
784
+ check: "claude -p 'respond with ok' 2>/dev/null"
785
+ timeout: 5
786
+ failure_exit_codes: [1]
787
+ recovery: "Run: claude login"
788
+
789
+ gemini:
790
+ enabled: true
791
+ command: gemini -p
792
+ flags:
793
+ - "--approval-mode yolo"
794
+ - "--output-format json"
795
+ env:
796
+ NO_BROWSER: "true"
797
+ auth:
798
+ check: "NO_BROWSER=true gemini -p 'respond with ok' -o json 2>&1"
799
+ timeout: 5
800
+ failure_exit_codes: [41]
801
+ recovery: "Run: gemini -p 'hello' (interactive, opens browser)"
802
+ timeout: 360 # Gemini tends to be slower
803
+
804
+ codex:
805
+ enabled: true
806
+ command: codex exec
807
+ flags:
808
+ - "--skip-git-repo-check"
809
+ - "-s read-only"
810
+ - "--ephemeral"
811
+ auth:
812
+ check: "codex login status 2>/dev/null"
813
+ timeout: 5
814
+ failure_exit_codes: [1]
815
+ recovery: "Run: codex login"
816
+ ```
817
+
818
+ **User-level defaults** can be set in `~/.mmr/config.yaml` for settings that apply across all projects (e.g., which channels are installed on your machine). Project config overrides user config. CLI flags override everything.
819
+
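The precedence described above (CLI flags over project config over user config) can be pictured as "first non-empty value wins". This is only an illustrative sketch, not `mmr`'s actual config loader; the function name `resolve_setting` and its argument order are assumptions.

```bash
# Hypothetical sketch of the precedence rule; not mmr's actual loader.
# Pass candidate values from highest to lowest priority; the first
# non-empty one wins.
resolve_setting() {
  local value
  for value in "$@"; do
    if [ -n "$value" ]; then
      printf '%s\n' "$value"
      return 0
    fi
  done
  return 1   # nothing set at any layer
}
```

For example, `resolve_setting "$cli_flag" "$project_value" "$user_value"` would return the CLI flag when it is set, and otherwise fall back through the project and user values in that order.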
820
+ **Adding a new model CLI** requires only a YAML config change — no code modifications to `mmr`. When a new model CLI ships, add its channel definition to `.mmr.yaml` and you're ready.
821
+
822
+ #### Severity Levels
823
+
824
+ mmr uses a standardized P0-P3 severity classification across all channels:
825
+
826
+ | Level | Name | Definition | Gate Default |
827
+ |-------|------|------------|-------------|
828
+ | **P0** | Critical | Will cause failure, data loss, security vulnerability, or fundamental architectural flaw | Blocks |
829
+ | **P1** | High | Will cause bugs in normal usage, inconsistency, or blocks downstream work | Blocks |
830
+ | **P2** | Medium | Improvement opportunity — style, naming, documentation, minor optimization | Blocks |
831
+ | **P3** | Trivial | Personal preference, trivial nits | Informational |
832
+
833
+ With the default `fix_threshold: P2`, any P0, P1, or P2 finding fails the gate. A review passes only when it has no findings or only P3 findings.
834
+
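The gate rule above reduces to a numeric comparison: a finding blocks when its level number is less than or equal to the threshold's. The sketch below illustrates that comparison under stated assumptions; `gate_check` is a hypothetical name, and this is not `mmr`'s implementation. Its return code mirrors the documented `mmr results` convention (0=passed, 1=gate failed).

```bash
# Hypothetical sketch of the gate rule described above; not mmr's actual code.
# Usage: gate_check <threshold> [severity...]
# Prints "passed" when no finding is at or above the threshold, else "failed".
gate_check() {
  local threshold="${1#P}"     # "P2" -> 2
  shift
  local sev
  for sev in "$@"; do
    # Lower number = more severe, so P0 through the threshold block the gate.
    if [ "${sev#P}" -le "$threshold" ]; then
      echo "failed"
      return 1
    fi
  done
  echo "passed"
}
```

With the default threshold, `gate_check P2 P3 P1` prints `failed`, while `gate_check P1 P2 P3` prints `passed`, which corresponds to overriding the gate with `--fix-threshold P1`.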
835
+ #### Reconciliation Rules
836
+
837
+ When multiple channels return findings, mmr applies consensus rules:
838
+
839
+ | Scenario | Confidence | Action |
840
+ |----------|-----------|--------|
841
+ | 2+ channels flag same location, same severity | **High** | Report at agreed severity |
842
+ | 2+ channels flag same location, different severity | **Medium** | Report at higher severity |
843
+ | All channels approve (no findings) | **High** | Gate passed |
844
+ | One channel flags P0, others approve | **High** | Report P0 (critical from any source) |
845
+ | One channel flags P1/P2, others approve | **Medium** | Report with attribution |
846
+ | Channels contradict each other | **Low** | Present both for user adjudication |
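The confidence column in the table above can be sketched as a small decision function. This is an illustration only, not `mmr`'s reconciler: the name `finding_confidence` and its arguments are assumptions, a single-source P3 is treated as medium confidence (the table lists only P1/P2 for that case), and the "channels contradict each other" row is left out because the text does not define what counts as a contradiction.

```bash
# Hypothetical sketch of the confidence column above; not mmr's actual code.
# $1 = number of channels that flagged the location
# $2 = most severe level any of them reported (P0..P3)
# $3 = "same" if all flagging channels agree on severity, else "different"
finding_confidence() {
  local flagged="$1" top="$2" agreement="${3:-same}"
  if [ "$flagged" -ge 2 ]; then
    if [ "$agreement" = "same" ]; then echo "high"; else echo "medium"; fi
  elif [ "$top" = "P0" ]; then
    echo "high"      # a P0 from any single source is always high confidence
  else
    echo "medium"    # single-source finding, reported with attribution
  fi
}
```

For instance, `finding_confidence 1 P0` prints `high`, matching the rule that a critical finding from any one source is reported at high confidence.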
559
847
 
560
848
  ### How It Works
561
849
 
@@ -771,6 +1059,7 @@ Options: `--dry-run` to preview, `minor`/`major`/`patch` to specify the bump, `c
771
1059
  | **Knowledge base** | 60 domain expertise entries that get injected into prompts. Can be extended with project-local overrides. |
772
1060
  | **MCP** | Model Context Protocol. A way for Claude to use external tools like a headless browser. |
773
1061
  | **Meta-prompt** | A short intent declaration in `content/pipeline/` that gets assembled into a full prompt at runtime. |
1062
+ | **mmr** | Multi-Model Review CLI (`@zigrivers/mmr`). Standalone tool for async multi-model code review dispatch, reconciliation, and severity gating. |
774
1063
  | **Methodology** | A preset (deep, mvp, custom) controlling which steps run and at what depth. |
775
1064
  | **Multi-model review** | Independent validation from Codex/Gemini CLIs at depth 4-5, catching blind spots a single model misses. |
776
1065
  | **PRD** | Product Requirements Document. The foundation for everything Scaffold builds. |
@@ -826,6 +1115,15 @@ NO_BROWSER=true gemini -p "Review this artifact..." --output-format json --appro
826
1115
  ```
827
1116
  These are documented in detail in the `multi-model-dispatch` skill.
828
1117
 
1118
+ **mmr review dispatches but no channels return results**
1119
+ Check auth: `mmr config test`. If channels show auth failures, re-authenticate with the recovery command shown. If channels are installed but the review hangs, check the per-channel timeout in `.mmr.yaml` — some models take 3-5 minutes for large diffs. Increase `timeout` to 360-600 seconds for large PRs.
1120
+
1121
+ **mmr results says "gate failed" but I disagree with the findings**
1122
+ Use `mmr results <job-id> --format text` to see the full reconciled findings with source attribution and confidence scores. Single-source findings with "unique" agreement are less certain than "consensus" findings. Override the threshold for a specific review: `mmr review --pr 47 --fix-threshold P1` (only gate on P0 and P1).
1123
+
1124
+ **How do I add a new AI model CLI to mmr?**
1125
+ Add a channel definition to `.mmr.yaml` with the command, auth check, and output parser. No code changes needed. See the [mmr Configuration](#mmr-configuration-mmryaml) section for the full schema.
1126
+
829
1127
  **I upgraded and my pipeline shows old step names**
830
1128
  Run `scaffold status` — the state manager automatically migrates old step names (e.g., `add-playwright` → `add-e2e-testing`, `multi-model-review` → `automated-pr-review`) and removes retired steps.
831
1129
 
@@ -873,7 +1171,22 @@ content/
873
1171
  ├── tools/ # 10 tool meta-prompts (stateless, category: tool)
874
1172
  ├── knowledge/ # 61 domain expertise entries (core, product, review, validation, finalization, execution, tools)
875
1173
  ├── methodology/ # 3 YAML presets (deep, mvp, custom)
876
- └── skills/ # 3 skill templates with {{markers}} for multi-platform resolution
1174
+ └── skills/ # Skill templates with {{markers}} for multi-platform resolution (includes mmr)
1175
+ ```
1176
+
1177
+ ### mmr package layout
1178
+
1179
+ `@zigrivers/mmr` lives in `packages/mmr/` as an independent workspace package:
1180
+
1181
+ ```
1182
+ packages/mmr/
1183
+ ├── src/
1184
+ │ ├── commands/ # review, status, results, config, jobs (yargs)
1185
+ │ ├── config/ # Zod schema, 4-layer config loader, builtin channel presets
1186
+ │ ├── core/ # job-store, auth, prompt assembly, parser, reconciler, dispatcher
1187
+ │ └── formatters/ # json, text, markdown output formatters
1188
+ ├── templates/ # Immutable core review prompt (severity defs, output format)
1189
+ └── tests/ # 60 tests across 11 files
877
1190
  ```
878
1191
 
879
1192
  Generated output (gitignored):
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "@zigrivers/scaffold",
3
- "version": "3.4.0",
3
+ "version": "3.4.1",
4
4
  "description": "AI-powered software project scaffolding pipeline",
5
5
  "type": "module",
6
6
  "workspaces": ["packages/*"],