@zigrivers/scaffold 3.24.3 → 3.25.0

@@ -14,7 +14,7 @@ Automated code review leverages AI models to provide consistent, thorough code r
 
  See `review-methodology` for severity definitions (P0-P3). See `multi-model-review-dispatch` for finding reconciliation rules.
 
- **Action thresholds:** P0/P1/P2 findings must be fixed before proceeding to the next task. P3 findings are recorded but not actioned.
+ **Action thresholds:** Findings at or above the configured `fix_threshold` (read from `results.fix_threshold` in the verdict JSON; default `P2`) must be fixed before proceeding to the next task. Findings below threshold are recorded as advisory but not actioned.
 
  ### Degraded-Mode Behavior
 
@@ -24,8 +24,8 @@ These are the authoritative verdict definitions. Tool files (`review-code.md`, `
 
  | Verdict | Condition |
  |---------|-----------|
- | `pass` | All channels completed, no unresolved P0/P1/P2 |
- | `degraded-pass` | Some channels unavailable, compensating passes ran, no unresolved P0/P1/P2 |
+ | `pass` | All channels completed, no unresolved findings at or above `fix_threshold` |
+ | `degraded-pass` | Some channels unavailable, compensating passes ran, no unresolved findings at or above `fix_threshold` |
  | `blocked` | Findings at or above fix threshold remain unresolved |
  | `needs-user-decision` | No channels completed — insufficient data for a determination |
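
The phrase "at or above `fix_threshold`" used throughout these hunks can be read as a comparison on severity rank, where P0 is the most severe. The following is a minimal sketch of that reading, not MMR's actual implementation; the function names `severity_rank` and `is_blocking` are hypothetical.

```shell
# Hypothetical helper: map a severity label to a number (lower = more severe).
severity_rank() {
  case "$1" in
    P0) echo 0 ;;
    P1) echo 1 ;;
    P2) echo 2 ;;
    P3) echo 3 ;;
    *)  echo "unknown severity: $1" >&2; return 1 ;;
  esac
}

# Hypothetical helper: succeed when a finding is at or above the threshold.
# The second argument falls back to P2, the default named in the text.
is_blocking() {
  sev=$(severity_rank "$1") || return 2
  thr=$(severity_rank "${2:-P2}") || return 2
  [ "$sev" -le "$thr" ]
}
```

For example, `is_blocking P3 P2` fails (the finding is advisory), while `is_blocking P3 P3` succeeds.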
 
@@ -47,7 +47,7 @@ When a channel (Codex or Gemini) is unavailable, the CLI dispatches a compensati
  - Missing Gemini → focus on architectural patterns, design reasoning, broad context.
  - Missing both → two compensating passes (one per missing channel's strength area).
  - Compensating-pass findings are **single-source confidence** — they do NOT raise to high confidence even if they agree with another channel's findings.
- - Normal mandatory-fix thresholds apply: P0/P1/P2 findings from compensating passes still require fixing.
+ - Normal mandatory-fix thresholds apply: findings at or above `fix_threshold` from compensating passes still require fixing.
 
  #### Foreground-Only Execution
 
@@ -63,7 +63,7 @@ After all channels complete (including compensating passes), reconcile findings
 
  Reconciliation normalizes findings from all channels (real and compensating) to a common schema, then matches findings across channels by location and category. The purpose is to detect when multiple independent channels agree on a finding (raising confidence) and to surface contradictions that require human judgment. A finding reported by Codex alone has lower confidence than the same finding reported by both Codex and Gemini.
 
- The reconciliation output is a deduplicated list of findings with confidence scores. High-confidence findings (agreed by 2+ real channels) are actionable without further discussion. Low-confidence findings (single-source, or from compensating passes) still require action at P0/P1/P2 but should be noted as lower-confidence in the review summary.
+ The reconciliation output is a deduplicated list of findings with confidence scores. High-confidence findings (agreed by 2+ real channels) are actionable without further discussion. Low-confidence findings (single-source, or from compensating passes) still require action when at or above `fix_threshold` but should be noted as lower-confidence in the review summary.
 
  Findings that appear in all three channels (Codex, Gemini, Claude) are considered maximum-confidence and should be surfaced first in the review summary. Findings that appear in only one channel should include the channel name in the finding description to help the developer assess confidence independently.
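
The confidence rules in this hunk can be sketched as a small function. This is an illustration under assumptions: the text does not give the real format of the confidence scores, so the labels `maximum`, `high`, and `low` and the function name `confidence_label` are hypothetical. The second argument is accepted but deliberately unused, because agreement from a compensating pass does not raise confidence.

```shell
# Hypothetical helper.
# $1: number of real channels that reported the finding
# $2: number of compensating passes that reported it (ignored on purpose)
confidence_label() {
  real=${1:-0}
  if [ "$real" -ge 3 ]; then
    echo "maximum"
  elif [ "$real" -ge 2 ]; then
    echo "high"
  else
    echo "low"
  fi
}
```

A finding seen by one real channel and one compensating pass gets `low`: `confidence_label 1 1`.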
 
@@ -112,16 +112,16 @@ Apply the following evaluation order to determine the final verdict. The first m
  ```
  Verdict evaluation order:
  1. No channels completed? → needs-user-decision
- 2. Any unresolved P0/P1/P2 after 3 fix rounds? → blocked
+ 2. Any unresolved findings at or above `fix_threshold` after 3 fix rounds? → blocked
  3. Any channel not at full coverage? → degraded-pass
- 4. All channels completed, no unresolved P0/P1/P2? → pass
+ 4. All channels completed, no unresolved findings at or above `fix_threshold`? → pass
  ```
 
  A channel is "not at full coverage" when: it ran as a compensating pass instead of a real tool, or it timed out.
 
  **Verdict precedence reminder:** `needs-user-decision` > `blocked` > `degraded-pass` > `pass`. When multiple conditions apply simultaneously, the higher-precedence verdict wins.
 
- The verdict is always computed after all fix rounds are exhausted — do not emit a partial verdict mid-cycle. If a fix round resolves all P0/P1/P2 findings, the verdict upgrades from `blocked` to `pass` or `degraded-pass` depending on channel coverage. This upgrade must be verified explicitly by re-running the reconciliation step after each fix round, not assumed from the fact that fixes were applied.
+ The verdict is always computed after all fix rounds are exhausted — do not emit a partial verdict mid-cycle. If a fix round resolves all findings at or above `fix_threshold`, the verdict upgrades from `blocked` to `pass` or `degraded-pass` depending on channel coverage. This upgrade must be verified explicitly by re-running the reconciliation step after each fix round, not assumed from the fact that fixes were applied.
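
The evaluation order above is a first-match-wins chain, so it maps directly onto an `if`/`elif` ladder. The following is a rough sketch of that order only; the function name `compute_verdict` and its three count arguments are assumptions made for illustration, not the tool's interface.

```shell
# Hypothetical helper.
# $1: number of channels that completed
# $2: unresolved findings at or above fix_threshold after the fix rounds
# $3: channels not at full coverage (compensating pass or timeout)
compute_verdict() {
  completed=$1
  unresolved=$2
  degraded=$3
  if [ "$completed" -eq 0 ]; then
    echo "needs-user-decision"
  elif [ "$unresolved" -gt 0 ]; then
    echo "blocked"
  elif [ "$degraded" -gt 0 ]; then
    echo "degraded-pass"
  else
    echo "pass"
  fi
}
```

Because the checks run in precedence order, `compute_verdict 3 1 1` yields `blocked` even though one channel is degraded.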
 
  ### Security-Focused Review Checklist
 
@@ -168,7 +168,7 @@ Once in-progress work is complete (or if there was none):
  - This reviews the local delivery candidate without requiring a PR
  - Surface auth failures immediately and retry after recovery
  - If recovery is not possible, document reduced review coverage and continue with the available channels
- - Fix any P0/P1/P2 findings before proceeding
+ - Fix any findings at or above `fix_threshold` before proceeding
 
  3. **Create PR** (if not already created for in-progress work)
  - Push the branch: `git push -u origin HEAD`
@@ -184,7 +184,7 @@ Once in-progress work is complete (or if there was none):
  4. **Superpowers code-reviewer** (4th channel): dispatch `superpowers:code-reviewer` subagent with BASE_SHA and HEAD_SHA
  - Verify auth before each CLI (`mmr config test` pre-flights all three at once)
  - All four channels should execute. Missing Codex or Gemini → MMR runs a compensating Claude pass in its place (degraded-pass verdict). Missing Claude CLI → review proceeds without compensation.
- - Fix any P0/P1/P2 findings before proceeding
+ - Fix any findings at or above `fix_threshold` before proceeding
  - Do NOT move to the next task until the review completes
 
  5. **Between-task cleanup**
@@ -239,7 +239,7 @@ Once in-progress work is complete (or if there was none):
  5. **TDD is not optional** — Continue the red-green-refactor cycle for any in-progress work.
  6. **Quality gates before PR** — Never create a PR with failing checks.
  7. **Honor pre-push review when requested** — If the user or project workflow asks for pre-push multi-model review, run `scaffold run review-code` after quality gates and before `git push`.
- 8. **Code review before next task** — After creating a PR, run `scaffold run review-pr`: three CLI channels (Codex CLI, Gemini CLI, Claude CLI) via MMR plus the Superpowers code-reviewer agent as a complementary 4th channel. Fix all P0/P1/P2 findings before moving on.
+ 8. **Code review before next task** — After creating a PR, run `scaffold run review-pr`: three CLI channels (Codex CLI, Gemini CLI, Claude CLI) via MMR plus the Superpowers code-reviewer agent as a complementary 4th channel. Fix all findings at or above `fix_threshold` before moving on.
  9. **Follow CLAUDE.md** — It is the authority on project conventions and commands.
 
  ---
@@ -171,7 +171,7 @@ For each task:
  - This reviews the local delivery candidate without requiring a PR
  - Surface auth failures immediately and retry after recovery
  - If recovery is not possible, document reduced review coverage and continue with the available channels
- - Fix any P0/P1/P2 findings before proceeding
+ - Fix any findings at or above `fix_threshold` before proceeding
 
  7. **Create PR**
  - Push the branch: `git push -u origin HEAD`
@@ -188,7 +188,7 @@ For each task:
  4. **Superpowers code-reviewer** (4th channel): dispatch `superpowers:code-reviewer` subagent with BASE_SHA and HEAD_SHA
  - Verify auth before each CLI (`mmr config test` pre-flights all three at once)
  - All four channels should execute. Missing Codex or Gemini → MMR runs a compensating Claude pass in its place (degraded-pass verdict). Missing Claude CLI → review proceeds without compensation.
- - Fix any P0/P1/P2 findings before proceeding
+ - Fix any findings at or above `fix_threshold` before proceeding
  - Do NOT move to the next task until the review completes
 
  9. **Between-task cleanup**
@@ -231,7 +231,7 @@ For each task:
  4. **TDD is not optional** — Write failing tests before implementation. No exceptions.
  5. **Quality gates before PR** — Never create a PR with failing checks.
  6. **Honor pre-push review when requested** — If the user or project workflow asks for pre-push multi-model review, run `scaffold run review-code` after quality gates and before `git push`.
- 7. **Code review before next task** — After creating a PR, run `scaffold run review-pr`: three CLI channels (Codex CLI, Gemini CLI, Claude CLI) via MMR plus the Superpowers code-reviewer agent as a complementary 4th channel. Fix all P0/P1/P2 findings before moving on.
+ 7. **Code review before next task** — After creating a PR, run `scaffold run review-pr`: three CLI channels (Codex CLI, Gemini CLI, Claude CLI) via MMR plus the Superpowers code-reviewer agent as a complementary 4th channel. Fix all findings at or above `fix_threshold` before moving on.
  8. **Avoid task conflicts** — Check what other agents are working on before claiming.
  9. **Follow CLAUDE.md** — It is the authority on project conventions and commands.
 
@@ -145,7 +145,7 @@ Once in-progress work is complete (or if there was none):
  - This reviews the local delivery candidate without requiring a PR
  - Surface auth failures immediately and retry after recovery
  - If recovery is not possible, document reduced review coverage and continue with the available channels
- - Fix any P0/P1/P2 findings before proceeding
+ - Fix any findings at or above `fix_threshold` before proceeding
 
  3. **Create PR** (if not already created for in-progress work)
  - Push the branch: `git push -u origin HEAD`
@@ -161,7 +161,7 @@ Once in-progress work is complete (or if there was none):
  4. **Superpowers code-reviewer** (4th channel): dispatch `superpowers:code-reviewer` subagent with BASE_SHA and HEAD_SHA
  - Verify auth before each CLI (`mmr config test` pre-flights all three at once)
  - All four channels should execute. Missing Codex or Gemini → MMR runs a compensating Claude pass in its place (degraded-pass verdict). Missing Claude CLI → review proceeds without compensation.
- - Fix any P0/P1/P2 findings before proceeding
+ - Fix any findings at or above `fix_threshold` before proceeding
  - Do NOT move to the next task until the review completes
 
  5. **Claim next task**
@@ -204,7 +204,7 @@ Once in-progress work is complete (or if there was none):
  4. **TDD is not optional** — Continue the red-green-refactor cycle for any in-progress work.
  5. **Quality gates before PR** — Never create a PR with failing checks.
  6. **Honor pre-push review when requested** — If the user or project workflow asks for pre-push multi-model review, run `scaffold run review-code` after quality gates and before `git push`.
- 7. **Code review before next task** — After creating a PR, run `scaffold run review-pr`: three CLI channels (Codex CLI, Gemini CLI, Claude CLI) via MMR plus the Superpowers code-reviewer agent as a complementary 4th channel. Fix all P0/P1/P2 findings before moving on.
+ 7. **Code review before next task** — After creating a PR, run `scaffold run review-pr`: three CLI channels (Codex CLI, Gemini CLI, Claude CLI) via MMR plus the Superpowers code-reviewer agent as a complementary 4th channel. Fix all findings at or above `fix_threshold` before moving on.
  8. **Follow CLAUDE.md** — It is the authority on project conventions and commands.
 
  ---
@@ -150,7 +150,7 @@ For each task:
  - This reviews the local delivery candidate without requiring a PR
  - Surface auth failures immediately and retry after recovery
  - If recovery is not possible, document reduced review coverage and continue with the available channels
- - Fix any P0/P1/P2 findings before proceeding
+ - Fix any findings at or above `fix_threshold` before proceeding
 
  7. **Create PR**
  - Push the branch: `git push -u origin HEAD`
@@ -167,7 +167,7 @@ For each task:
  4. **Superpowers code-reviewer** (4th channel): dispatch `superpowers:code-reviewer` subagent with BASE_SHA and HEAD_SHA
  - Verify auth before each CLI (`mmr config test` pre-flights all three at once)
  - All four channels should execute. Missing Codex or Gemini → MMR runs a compensating Claude pass in its place (degraded-pass verdict). Missing Claude CLI → review proceeds without compensation.
- - Fix any P0/P1/P2 findings before proceeding
+ - Fix any findings at or above `fix_threshold` before proceeding
  - Do NOT move to the next task until the review completes
 
  9. **Update status**
@@ -202,7 +202,7 @@ For each task:
  2. **One task at a time** — Complete the current task fully before starting the next.
  3. **Quality gates before PR** — Never create a PR with failing checks.
  4. **Honor pre-push review when requested** — If the user or project workflow asks for pre-push multi-model review, run `scaffold run review-code` after quality gates and before `git push`.
- 5. **Code review before next task** — After creating a PR, run `scaffold run review-pr`: three CLI channels (Codex CLI, Gemini CLI, Claude CLI) via MMR plus the Superpowers code-reviewer agent as a complementary 4th channel. Fix all P0/P1/P2 findings before moving on.
+ 5. **Code review before next task** — After creating a PR, run `scaffold run review-pr`: three CLI channels (Codex CLI, Gemini CLI, Claude CLI) via MMR plus the Superpowers code-reviewer agent as a complementary 4th channel. Fix all findings at or above `fix_threshold` before moving on.
  6. **Update status immediately** — Mark tasks complete as soon as review passes.
  7. **Consult lessons.md** — Check for relevant anti-patterns before each task.
  8. **Follow CLAUDE.md** — It is the authority on project conventions and commands.
@@ -100,6 +100,18 @@ Check if AGENTS.md exists first. If it exists, check for scaffold tracking comme
 
  ## Instructions
 
+ ### MMR Configuration
+
+ If `.mmr.yaml` does not exist in the project root and `mmr` is on `PATH`,
+ run `mmr config init` once to create one. The generated file pins
+ `fix_threshold: P2` (the recommended default for typical software work)
+ with an explanatory comment block describing each severity tier — edit
+ the value if your project warrants a different gate (`P1` for low-friction
+ prototypes; `P3` for security-sensitive work).
+
+ If `mmr` is not installed, install it before running multi-model review;
+ otherwise channels will degrade.
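
The decision described in the added section (create the config only when it is missing and `mmr` is available) might be sketched as follows. The function name `mmr_config_action` and its three output labels are hypothetical; only `.mmr.yaml` and `mmr config init` come from the text.

```shell
# Hypothetical helper: report what the setup step should do.
# $1: project root (defaults to the current directory)
mmr_config_action() {
  root=${1:-.}
  if [ -f "$root/.mmr.yaml" ]; then
    echo "present"        # config already exists; leave it alone
  elif command -v mmr >/dev/null 2>&1; then
    echo "init"           # safe to run `mmr config init` once
  else
    echo "install-mmr"    # mmr is not on PATH; install it first
  fi
}

# Example wiring, left commented so the sketch has no side effects:
# [ "$(mmr_config_action .)" = "init" ] && mmr config init
```

Separating the decision from the action keeps the check testable without `mmr` installed.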
+
  ### Configure Review Enforcement Hook
 
  Add a Claude Code hook to the project's `.claude/settings.json` that fires after
@@ -118,7 +130,7 @@ Add this to `.claude/settings.json`:
  "hooks": [
  {
  "type": "command",
- "command": "if echo \"$CC_BASH_COMMAND\" | grep -q 'gh pr create'; then echo '\\n⚠️ MANDATORY: Run all 3 CLI review channels plus the Superpowers 4th channel before proceeding to the next task:\\n\\n 1. Codex CLI:\\n Auth: codex login status 2>/dev/null\\n Run: codex exec --skip-git-repo-check -s read-only --ephemeral \"REVIEW_PROMPT\" 2>/dev/null\\n\\n 2. Gemini CLI:\\n Auth: NO_BROWSER=true gemini -p \"respond with ok\" -o json 2>&1\\n Run: NO_BROWSER=true gemini -p \"REVIEW_PROMPT\" --output-format json --approval-mode yolo 2>/dev/null\\n\\n 3. Claude CLI:\\n Auth: claude -p \"respond with ok\" 2>/dev/null\\n Run: claude -p \"REVIEW_PROMPT\" --output-format json 2>/dev/null\\n\\n 4. Superpowers code-reviewer (complementary 4th channel):\\n Dispatch superpowers:code-reviewer subagent with BASE_SHA and HEAD_SHA\\n\\nIf auth fails: tell user to run ! codex login, ! gemini -p \"hello\", or ! claude login (as applicable).\\nDo not silently skip channels — surface auth failures and let MMR decide: missing Codex/Gemini get compensating Claude passes (degraded-pass verdict); missing Claude proceeds without compensation.\\nFix all P0/P1/P2 findings before moving on.\\nFull instructions: scaffold run review-pr'; fi"
+ "command": "if echo \"$CC_BASH_COMMAND\" | grep -q 'gh pr create'; then echo '\\n⚠️ MANDATORY: Run all 3 CLI review channels plus the Superpowers 4th channel before proceeding to the next task:\\n\\n 1. Codex CLI:\\n Auth: codex login status 2>/dev/null\\n Run: codex exec --skip-git-repo-check -s read-only --ephemeral \"REVIEW_PROMPT\" 2>/dev/null\\n\\n 2. Gemini CLI:\\n Auth: NO_BROWSER=true gemini -p \"respond with ok\" -o json 2>&1\\n Run: NO_BROWSER=true gemini -p \"REVIEW_PROMPT\" --output-format json --approval-mode yolo 2>/dev/null\\n\\n 3. Claude CLI:\\n Auth: claude -p \"respond with ok\" 2>/dev/null\\n Run: claude -p \"REVIEW_PROMPT\" --output-format json 2>/dev/null\\n\\n 4. Superpowers code-reviewer (complementary 4th channel):\\n Dispatch superpowers:code-reviewer subagent with BASE_SHA and HEAD_SHA\\n\\nIf auth fails: tell user to run ! codex login, ! gemini -p \"hello\", or ! claude login (as applicable).\\nDo not silently skip channels — surface auth failures and let MMR decide: missing Codex/Gemini get compensating Claude passes (degraded-pass verdict); missing Claude proceeds without compensation.\\nFix all findings at or above the configured fix_threshold (see results.fix_threshold in the verdict JSON; default P2).\\nFull instructions: scaffold run review-pr'; fi"
  }
  ]
  }
@@ -151,8 +163,8 @@ markers, replace it in place and add the markers.
  <!-- scaffold:automated-pr-review:claude-md start -->
  **Mandatory after `gh pr create`** — run `/scaffold:review-pr <PR#>` to execute
  all three review channels (Codex CLI, Gemini CLI, Claude CLI), plus the
- Superpowers code-reviewer agent as a complementary 4th channel. Fix P0/P1/P2
- findings before moving to the next task. A post-hook on `gh pr create` will
+ Superpowers code-reviewer agent as a complementary 4th channel. Fix findings
+ at or above `fix_threshold` before moving to the next task. A post-hook on `gh pr create` will
  remind you.
 
  **Optional but supported** for non-PR targets — the review is not PR-gated.
@@ -119,13 +119,20 @@ Re-run `mmr config test` after re-authenticating to verify.
 
  ## Severity Gate
 
- Default threshold is P2 (fix P0/P1/P2, skip P3). Override per-review:
+ Default threshold is `P2` (the verdict gate blocks on P0, P1, and P2;
+ P3 findings are kept in the result as **advisory** but don't cause
+ `blocked`). Override per-review:
 
  ```bash
  mmr review --pr 47 --fix-threshold P1 # Only fix P0 and P1
  mmr review --pr 47 --fix-threshold P0 # Only fix critical issues
  ```
 
+ The verdict JSON includes `advisory_count` (count of findings strictly
+ below the threshold). Formatted output shows `Advisory: N` (text) or
+ `**Advisory:** N` (markdown) when non-zero — useful for spotting real
+ findings that the gate didn't block.
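
As a worked illustration of "strictly below the threshold", the count can be reproduced from a list of severity labels. This is a sketch of the arithmetic only, not how MMR computes `advisory_count`; the function name is hypothetical and the labels are assumed to be exactly `P0` through `P3`.

```shell
# Hypothetical helper: count findings strictly below the threshold.
# $1: threshold (P0..P3); remaining args: one severity label per finding
advisory_count() {
  threshold=${1#P}          # "P2" -> "2"
  shift
  count=0
  for sev in "$@"; do
    if [ "${sev#P}" -gt "$threshold" ]; then
      count=$((count + 1))
    fi
  done
  echo "$count"
}
```

With the default threshold, `advisory_count P2 P0 P3 P2 P3` counts only the two P3 findings.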
+
  ## Output Formats
 
  ```bash
@@ -147,13 +147,13 @@ When dispatching a review, bundle all relevant context into the prompt. Each CLI
  ### Template for Artifact Review
 
  ```
- You are reviewing a project artifact for quality issues. Report P0 (critical), P1 (high), and P2 (medium) issues.
+ You are reviewing a project artifact for quality issues. Report all P0, P1, P2, and P3 findings; the project's fix threshold is applied downstream.
 
  ## Severity Definitions
  - P0: Will cause implementation failure, data loss, security vulnerability, or fundamental architectural flaw
  - P1: Will cause bugs in normal usage, inconsistency across documents, or blocks downstream work
  - P2: Improvement opportunity — style, naming, documentation, minor optimization
- - Do NOT report P3 issues (personal preference, trivial nits)
+ - P3: Personal preference, trivial nits — included so a strict project (`fix_threshold: P3`) can act on them; otherwise advisory
 
  ## Review Standards
  [paste contents of docs/review-standards.md if it exists, otherwise use severity definitions above]
@@ -170,7 +170,7 @@ Respond with a JSON object:
  "approved": true/false,
  "findings": [
  {
- "severity": "P0" or "P1" or "P2",
+ "severity": "P0" or "P1" or "P2" or "P3",
  "location": "section or line reference",
  "description": "what's wrong",
  "suggestion": "specific fix"
@@ -179,13 +179,13 @@ Respond with a JSON object:
  "summary": "one-line assessment"
  }
 
- If no P0/P1/P2 issues found, respond with: { "approved": true, "findings": [], "summary": "No issues found." }
+ If no findings, respond with: { "approved": true, "findings": [], "summary": "No issues found." }
  ```
 
  ### Template for PR Diff Review
 
  ```
- You are reviewing a pull request diff. Report P0, P1, and P2 issues.
+ You are reviewing a pull request diff. Report all P0, P1, P2, and P3 findings; the project's fix threshold is applied downstream.
 
  ## Review Standards
  [paste docs/review-standards.md]
@@ -10,7 +10,7 @@ conditional: null
  stateless: true
  category: tool
  knowledge-base: [multi-model-review-dispatch, automated-review-tooling, post-implementation-review-methodology]
- argument-hint: "[--report-only]"
+ argument-hint: "[--report-only] [--fix-threshold P0|P1|P2|P3]"
  ---
 
  ## Purpose
@@ -30,7 +30,7 @@ The three channels are:
 
  ## Inputs
 
- - `$ARGUMENTS` — `--report-only` flag (optional; omit to review + fix)
+ - `$ARGUMENTS` — `--report-only` flag and/or `--fix-threshold P0|P1|P2|P3` (both optional)
  - `docs/user-stories.md` (required) — user stories with acceptance criteria; organizing manifest for Phase 2
  - `docs/implementation-plan.md` (optional) — implementation tasks; used to cross-check that all planned deliverables were built
  - `docs/coding-standards.md` (required) — coding conventions for review context
@@ -43,13 +43,13 @@ The three channels are:
  ## Expected Outputs
 
  - `docs/reviews/post-implementation-review.md` — consolidated findings report
- - Fixed code (P0/P1/P2 findings resolved) — in review+fix and update modes
+ - Fixed code (findings at or above `fix_threshold` resolved) — in review+fix and update modes
 
  ## Mode Detection
 
  | Condition | Mode |
  |-----------|------|
- | No prior report, no `--report-only` | **Review + Fix** — run all phases, then fix P0/P1/P2 |
+ | No prior report, no `--report-only` | **Review + Fix** — run all phases, then fix findings at or above `fix_threshold` |
  | No prior report, `--report-only` | **Report Only** — run all phases, write report, no code changes |
  | Prior report exists, no `--report-only` | **Update Mode** — load prior findings, skip to Phase 3 fix execution |
  | Prior report exists, `--report-only` | **Re-review** — run full review fresh, overwrite prior report |
@@ -63,6 +63,12 @@ The three channels are:
  REPORT_ONLY=false
  [[ "$ARGUMENTS" == *"--report-only"* ]] && REPORT_ONLY=true
 
+ # Detect --fix-threshold flag
+ FIX_THRESHOLD=""
+ if [[ "$ARGUMENTS" =~ (^|[[:space:]])--fix-threshold[[:space:]]+(P[0-3])($|[[:space:]]) ]]; then
+   FIX_THRESHOLD="${BASH_REMATCH[2]}"
+ fi
+
  # Detect prior report
  PRIOR_REPORT="docs/reviews/post-implementation-review.md"
  [[ -f "$PRIOR_REPORT" ]] && PRIOR_EXISTS=true || PRIOR_EXISTS=false
@@ -482,6 +488,12 @@ diff-only), so it operates independently of `mmr review`. Use `mmr reconcile` on
  when you want to merge post-implementation findings into an existing MMR job for a
  single unified verdict.
 
+ If `$FIX_THRESHOLD` is set and a fresh `mmr review` is dispatched as part
+ of this flow (e.g., to seed a job for `mmr reconcile`), forward it to that
+ invocation: `mmr review … --fix-threshold "$FIX_THRESHOLD" …`. The
+ existing `mmr reconcile` call does not take `--fix-threshold` directly —
+ the job's threshold is set at `mmr review` time.
+
  ### Step 6: Consolidate Findings
 
  Merge all findings from Phase 1 (`CODEX_PHASE1_FINDINGS`, `GEMINI_PHASE1_FINDINGS`,
@@ -495,8 +507,12 @@ entry. Record all source channels in a `sources` array on the merged finding.
 
  **Sorting:** P0 first, then P1, then P2, then P3.
 
- **Fix queue:** P0, P1, and P2 findings enter the fix queue. P3 findings are recorded
- in the report but not actioned.
+ **Fix queue:** Findings at or above the configured `fix_threshold` enter the
+ fix queue. The threshold defaults to `P2` (so P0, P1, P2 enter the queue and
+ P3 is advisory) and is configurable via `.mmr.yaml`, `--fix-threshold`
+ passed to this command, or the user's `~/.mmr/config.yaml`. The agent
+ reads the active threshold from `$FIX_THRESHOLD` if set; otherwise from
+ `.mmr.yaml` or the built-in default.
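
The lookup order in the last sentence (`$FIX_THRESHOLD`, then `.mmr.yaml`, then the built-in default) could be sketched like this. Several things here are assumptions: the function name `resolve_fix_threshold` is hypothetical, the config is assumed to hold an unquoted top-level line such as `fix_threshold: P2`, and the user-level `~/.mmr/config.yaml` is not consulted in this sketch.

```shell
# Hypothetical helper: resolve the active threshold.
# $1: path to the project config (defaults to .mmr.yaml)
resolve_fix_threshold() {
  config=${1:-.mmr.yaml}
  # 1. An explicit flag value parsed earlier wins.
  if [ -n "${FIX_THRESHOLD:-}" ]; then
    echo "$FIX_THRESHOLD"
    return 0
  fi
  # 2. Otherwise read the project config (assumed layout, see lead-in).
  if [ -f "$config" ]; then
    value=$(sed -n 's/^fix_threshold:[[:space:]]*\(P[0-3]\).*$/\1/p' "$config" | head -n 1)
    if [ -n "$value" ]; then
      echo "$value"
      return 0
    fi
  fi
  # 3. Fall back to the built-in default.
  echo "P2"
}
```

A line-oriented `sed` match is used only to keep the sketch dependency-free; a real YAML parser would be more robust.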
 
  ### Step 7: Write the Findings Report
 
@@ -543,7 +559,7 @@ Create `docs/reviews/` if it does not exist. Write the following to
  - [criterion]: satisfied | partial | not-satisfied
 
  **Findings:**
- [P0/P1/P2/P3 findings for this story, or "No findings."]
+ [Findings sorted by severity, or "No findings."]
 
  [Repeat for each story]
 
@@ -574,7 +590,10 @@ PRE_FIX_SHA=$(git rev-parse HEAD)
  This is used in Step 9 to identify all files modified across all fix commits,
  regardless of how many severity-tier commits are made.
 
- Process the fix queue in priority order: all P0s first, then all P1s, then all P2s.
+ Process the fix queue in priority order: iterate severity tiers from most
+ critical to least, processing every tier from `P0` down to and including
+ the configured `fix_threshold` (default `P2`). At threshold `P3` this
+ includes all four tiers; at `P0` only critical findings are processed.
  Within each severity tier, fix high-confidence findings (multi-source) first.
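
The tier iteration described above can be sketched as a helper that lists which tiers a run will process. The function name `tiers_to_process` is hypothetical; the tier labels and the `P2` default come from the text.

```shell
# Hypothetical helper: print the tiers to process, most critical first,
# stopping after the configured threshold.
# $1: threshold (defaults to P2 when empty or unset)
tiers_to_process() {
  threshold=${1:-P2}
  tiers=""
  for tier in P0 P1 P2 P3; do
    tiers="${tiers:+$tiers }$tier"
    if [ "$tier" = "$threshold" ]; then
      break
    fi
  done
  echo "$tiers"
}
```

A fix loop could then iterate with `for tier in $(tiers_to_process "$FIX_THRESHOLD"); do ...; done`, committing once per tier.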
 
  For each finding:
@@ -593,15 +612,16 @@ For each finding:
  - Stop attempting to fix it
  - Continue to the next finding in the queue
 
- After all P0s are fixed, re-read each P0-modified file once to confirm correctness
- before moving to P1s.
+ After all findings in a severity tier are fixed, re-read each modified file
+ once to confirm correctness before moving to the next tier.
 
- Commit after each severity tier:
+ Commit after each severity tier processed (the tier label varies by run —
+ `P0`, `P1`, `P2`, or `P3` depending on the configured threshold):
 
  ```bash
  git add [modified source files only — not the report]
- git commit -m "fix: resolve P0 post-implementation review findings"
- # Replace P0 with P1 or P2 for the respective tiers
+ git commit -m "fix: resolve <tier> post-implementation review findings"
+ # Substitute <tier> with the severity label of the tier you just processed
  ```
 
  ### Step 9: Final Verification Pass
@@ -616,7 +636,7 @@ recorded at the start of Step 8:
  git diff --name-only $PRE_FIX_SHA..HEAD
  ```
 
- This captures files from every severity-tier commit (P0, P1, P2), not just
+ This captures files from every severity-tier commit, not just
  the most recent one.
 
  Dispatch `superpowers:code-reviewer` with:
@@ -701,7 +721,7 @@ the user they require manual attention before the project is ready to release.
  3. **Auth failures are not silent** — always surface to the user with the exact recovery command (`! codex login` or `! gemini -p "hello"`). Wait for user response before queuing a compensating pass.
  4. **Independence** — never share one channel's output with another. Each reviews independently.
  5. **Verify every fix** — run tests (or re-read the file) immediately after each fix before moving on.
- 6. **3-round limit (per finding)** — never attempt to fix the *same* P0/P1/P2 finding more than 3 times. Each round that surfaces a *new, different, fixable* finding is healthy iteration — keep going. Stop only when the same finding recurs across 3 attempts, channels contradict each other, or the user asks to stop. Surface unresolved findings to the user.
+ 6. **3-round limit (per finding)** — never attempt to fix the *same* blocking finding more than 3 times. Each round that surfaces a *new, different, fixable* finding is healthy iteration — keep going. Stop only when the same finding recurs across 3 attempts, channels contradict each other, or the user asks to stop. Surface unresolved findings to the user.
  7. **Document everything** — the report must show which channels ran, which were compensating, which were skipped, and the root cause for any degraded channel.
  8. **No auto-merge** — this tool modifies local files only. It never pushes, merges, or creates PRs.
  9. **Dispatch pattern cross-reference** — Phase 2 parallel dispatch uses `superpowers:dispatching-parallel-agents`. Each story subagent dispatches its own `superpowers:code-reviewer` as Channel 3. This two-level nesting is intentional and supported.
@@ -10,7 +10,7 @@ conditional: null
  stateless: true
  category: tool
  knowledge-base: [multi-model-review-dispatch, automated-review-tooling]
- argument-hint: "[--base <ref>] [--head <ref>] [--staged] [--report-only]"
+ argument-hint: "[--base <ref>] [--head <ref>] [--staged] [--report-only] [--fix-threshold P0|P1|P2|P3]"
  ---
 
  ## Purpose
@@ -44,6 +44,7 @@ brand-new files.
44
44
  - `--head <ref>` — explicit head ref for diff review
45
45
  - `--staged` — review only staged changes (`git diff --cached`)
46
46
  - `--report-only` — collect findings and verdict, but do not apply fixes
47
+ - `--fix-threshold P0|P1|P2|P3` — override the project's configured threshold for this run
47
48
  - `docs/coding-standards.md` (required) — coding conventions for review context
48
49
  - `docs/tdd-standards.md` (optional) — test expectations
49
50
  - `docs/review-standards.md` (optional) — severity definitions and review criteria
@@ -63,6 +64,17 @@ brand-new files.
63
64
  When the MMR CLI is installed, use it as the primary entry point. Pick the
64
65
  invocation that matches the scope the user asked for:
65
66
 
67
+ A common helper across all four invocation modes — set `MMR_FLAGS` once
68
+ and reuse it. **Note:** `FIX_THRESHOLD` is parsed from `$ARGUMENTS` in
69
+ Step 1 below; if you're skipping ahead to the invocations, run Step 1's
70
+ detection block first so the `--fix-threshold` flag actually flows
71
+ through.
72
+
73
+ ```bash
74
+ MMR_FLAGS=(--sync --format json)
75
+ [ -n "$FIX_THRESHOLD" ] && MMR_FLAGS+=(--fix-threshold "$FIX_THRESHOLD")
76
+ ```
77
+
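As a rough illustration of how the `MMR_FLAGS` helper behaves with and without a threshold, the sketch below wraps the same two lines in a function. The name `build_mmr_flags` is hypothetical and not part of the package; only the flags themselves come from the text above.

```bash
#!/usr/bin/env bash
# Hypothetical wrapper around the MMR_FLAGS pattern shown above.
# $1 is the parsed threshold (may be empty).
build_mmr_flags() {
  local threshold="$1"
  MMR_FLAGS=(--sync --format json)
  # Append the override only when a threshold was actually parsed, so an
  # empty value defers to .mmr.yaml or the built-in default.
  if [ -n "$threshold" ]; then
    MMR_FLAGS+=(--fix-threshold "$threshold")
  fi
}

build_mmr_flags ""
echo "${#MMR_FLAGS[@]}"    # prints 3
build_mmr_flags "P1"
echo "${MMR_FLAGS[*]}"     # prints --sync --format json --fix-threshold P1
```

Using an `if` rather than `[ -n ... ] && ...` keeps the function's exit status at zero when no threshold is set, which matters if the helper is the last command in a script run under `set -e`.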
66
78
  ```bash
67
79
  # Default (no flags) — full local delivery candidate:
68
80
  # committed branch diff (vs origin/main or main) + staged + unstaged.
@@ -93,16 +105,16 @@ fi
93
105
  # that covers committed branch work + staged + unstaged edits, with
94
106
  # repeated edits to the same file collapsed into a single final hunk.
95
107
  MERGE_BASE=$(git merge-base "$BASE_REF" HEAD 2>/dev/null || echo "$BASE_REF")
96
- git diff "$MERGE_BASE" | mmr review --diff - --sync --format json
108
+ git diff "$MERGE_BASE" | mmr review --diff - "${MMR_FLAGS[@]}"
97
109
 
98
110
  # Staged changes only:
99
- mmr review --staged --sync --format json
111
+ mmr review --staged "${MMR_FLAGS[@]}"
100
112
 
101
113
  # Branch diff against main (committed only, no staged/unstaged):
102
- mmr review --base main --sync --format json
114
+ mmr review --base main "${MMR_FLAGS[@]}"
103
115
 
104
116
  # Explicit ref range:
105
- mmr review --base <base-ref> --head <head-ref> --sync --format json
117
+ mmr review --base <base-ref> --head <head-ref> "${MMR_FLAGS[@]}"
106
118
  ```
107
119
 
108
120
  Routing rules:
@@ -133,6 +145,14 @@ Parse `$ARGUMENTS` and set:
133
145
  - `STAGED_ONLY=true` if `$ARGUMENTS` contains `--staged`
134
146
  - `BASE_REF` from `--base <ref>` if present
135
147
  - `HEAD_REF` from `--head <ref>` if present
148
+ - `FIX_THRESHOLD` from `--fix-threshold <value>` if present (must match `P0`, `P1`, `P2`, or `P3`); leave empty to defer to `.mmr.yaml`/built-in default
149
+
150
+ ```bash
151
+ FIX_THRESHOLD=""
152
+ if [[ "$ARGUMENTS" =~ (^|[[:space:]])--fix-threshold[[:space:]]+(P[0-3])($|[[:space:]]) ]]; then
153
+ FIX_THRESHOLD="${BASH_REMATCH[2]}"
154
+ fi
155
+ ```
136
156
 
137
157
  If `--head` is provided without `--base`, stop and tell the user both refs are
138
158
  required for explicit-range review.
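One behavior of the detection block above worth noting: a malformed value such as `--fix-threshold P7` does not match the regex, so `FIX_THRESHOLD` stays empty and the run silently falls back to the configured default. A minimal sketch of one way to tell "flag absent" apart from "flag present but invalid" follows; the function name `parse_fix_threshold` and the return code 2 are assumptions for illustration, not part of the tool.

```bash
#!/usr/bin/env bash
# Hypothetical helper: prints the threshold when valid, prints nothing when
# the flag is absent, and returns 2 when the flag is present but malformed.
parse_fix_threshold() {
  local args="$1"
  # Same pattern as the detection block above.
  local re_valid='(^|[[:space:]])--fix-threshold[[:space:]]+(P[0-3])($|[[:space:]])'
  # Matches the bare flag regardless of what follows it.
  local re_present='(^|[[:space:]])--fix-threshold($|[[:space:]])'
  if [[ "$args" =~ $re_valid ]]; then
    printf '%s\n' "${BASH_REMATCH[2]}"
  elif [[ "$args" =~ $re_present ]]; then
    echo "invalid --fix-threshold value (expected P0, P1, P2, or P3)" >&2
    return 2
  fi
}

parse_fix_threshold "--staged --fix-threshold P1"    # prints P1
```

A caller could then stop and ask the user on return code 2 instead of proceeding with a threshold the user did not intend.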
@@ -312,14 +332,14 @@ clean ref range exists.
312
332
  All channels should receive an equivalent prompt bundle built from the local review scope:
313
333
 
314
334
  ```text
315
- You are reviewing local code changes before commit or push. Report only P0, P1,
316
- and P2 issues.
335
+ You are reviewing local code changes before commit or push. Report all P0, P1,
336
+ P2, and P3 findings; the project's fix threshold is applied downstream.
317
337
 
318
338
  ## Scope
319
339
  [scope label]
320
340
 
321
341
  ## Review Standards
322
- [docs/review-standards.md if present, otherwise define P0/P1/P2]
342
+ [docs/review-standards.md if present, otherwise define P0–P3]
323
343
 
324
344
  ## Coding Standards
325
345
  [docs/coding-standards.md]
@@ -342,7 +362,7 @@ Respond with JSON:
342
362
  "approved": true/false,
343
363
  "findings": [
344
364
  {
345
- "severity": "P0" | "P1" | "P2",
365
+ "severity": "P0" | "P1" | "P2" | "P3",
346
366
  "location": "file:line or section",
347
367
  "description": "what is wrong",
348
368
  "suggestion": "specific fix"
@@ -364,7 +384,7 @@ Use these rules:
364
384
  | Any single P2 | Fix unless clearly inapplicable; if disputed, surface to user |
365
385
  | All executed channels approve | Candidate passes review |
366
386
  | Strong contradiction on a medium-severity issue | Verdict becomes `needs-user-decision` |
367
- | Compensating-pass P0/P1/P2 finding | Single-source confidence — fix per normal thresholds, but label as compensating in summary |
387
+ | Compensating-pass blocking finding | Single-source confidence — fix per normal thresholds, but label as compensating in summary |
368
388
 
369
389
  ### Step 7: Apply Fixes Unless in Report-Only Mode
370
390
 
@@ -374,10 +394,10 @@ If `REPORT_ONLY=true`:
374
394
  - Stop
375
395
 
376
396
  Otherwise:
377
- 1. Fix all P0/P1/P2 findings
397
+ 1. Fix all findings at or above `fix_threshold` (read from `results.fix_threshold` in the verdict JSON; default `P2`)
378
398
  2. Re-run the channels that produced findings
379
399
  3. Keep iterating as long as each new round surfaces *different, concrete, fixable* findings — that is healthy review/fix iteration, not a stuck loop
380
- 4. The 3-round limit is **per finding**: stop and surface to the user when the *same* P0/P1/P2 finding (or set) recurs across 3 attempts without progress. Other stop conditions: a finding is genuinely ambiguous (channels contradict each other), or the user explicitly asks to stop. Use verdict `needs-user-decision` for ambiguity, `blocked` for stuck-loop cases.
400
+ 4. The 3-round limit is **per finding**: stop and surface to the user when the *same* blocking finding (or set) recurs across 3 attempts without progress. Other stop conditions: a finding is genuinely ambiguous (channels contradict each other), or the user explicitly asks to stop. Use verdict `needs-user-decision` for ambiguity, `blocked` for stuck-loop cases.
381
401
 
382
402
  **Fix cycle channel rule:** Re-run only channels that originally completed or ran as compensating passes. Never retry a channel marked `not_installed`, `auth_failed`, or `timeout` during fix rounds — its availability does not change within a session.
383
403
 
@@ -385,9 +405,9 @@ Otherwise:
385
405
 
386
406
  Return exactly one verdict:
387
407
 
388
- - `pass` — all channels completed with `full` coverage, no unresolved P0/P1/P2
389
- - `degraded-pass` — at least one channel was skipped/compensated (coverage is not all `full`), but all executed and compensating channels have no unresolved P0/P1/P2
390
- - `blocked` — gate failed: at least one unresolved finding sits at or above the fix threshold (typically the *same* finding(s) remain unresolved after 3 fix attempts; default threshold is `P2`, so this means an unresolved P0/P1/P2)
408
+ - `pass` — all channels completed with `full` coverage, no unresolved findings at or above `fix_threshold`
409
+ - `degraded-pass` — at least one channel was skipped/compensated (coverage is not all `full`), but all executed and compensating channels have no unresolved findings at or above `fix_threshold`
410
+ - `blocked` — gate failed: at least one unresolved finding sits at or above the fix threshold (typically the *same* finding(s) remain unresolved after 3 fix attempts; the threshold defaults to `P2` but is configurable via `.mmr.yaml` or `--fix-threshold`)
391
411
  - `needs-user-decision` — no channels completed (no reconciled result was possible), reviewer disagreement / contradictions, or a finding requires human judgment that automated iteration can't resolve
392
412
 
393
413
  When compensating passes ran for any channel, the maximum achievable verdict is `degraded-pass` — never `pass`, even if all findings are resolved. When both external channels were compensated, the review summary must note: "All findings are single-model (Claude only)."
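The verdict rules above can be sketched as a small decision function. This is only an illustration under stated assumptions: the function name `decide_verdict` and its four inputs are hypothetical, the real verdict is computed by the MMR CLI, and the evaluation order assumes the precedence `needs-user-decision` > `blocked` > `degraded-pass` > `pass`.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of the verdict rules; not the CLI's implementation.
#   $1 completed            number of channels that completed (0 = none)
#   $2 degraded             "true" if any channel was skipped or compensated
#   $3 unresolved_blocking  count of unresolved findings at or above fix_threshold
#   $4 contradiction        "true" if channels contradict each other
decide_verdict() {
  local completed="$1" degraded="$2" unresolved_blocking="$3" contradiction="$4"
  if [ "$completed" -eq 0 ] || [ "$contradiction" = "true" ]; then
    echo "needs-user-decision"
  elif [ "$unresolved_blocking" -gt 0 ]; then
    echo "blocked"
  elif [ "$degraded" = "true" ]; then
    # A compensated run can never reach `pass`, even with zero findings.
    echo "degraded-pass"
  else
    echo "pass"
  fi
}

decide_verdict 3 false 0 false   # prints pass
decide_verdict 2 true 0 false    # prints degraded-pass
```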
@@ -424,5 +444,5 @@ for the next delivery step (commit, push, or PR creation).
424
444
  2. **All 3 channels are mandatory** — skip only when a tool is genuinely not installed, never by choice.
425
445
  3. **Auth failures are not silent** — always surface to the user with recovery instructions.
426
446
  4. **Independence** — never share one channel's output with another.
427
- 5. **Fix before proceeding** — P0/P1/P2 findings must be resolved before moving to the next task.
447
+ 5. **Fix before proceeding** — findings at or above `fix_threshold` must be resolved before moving to the next task.
428
448
  6. **Dispatch pattern** follows `multi-model-review-dispatch` knowledge entry. When modifying channel dispatch in this file, verify consistency with `review-pr.md` and `post-implementation-review.md`.
@@ -9,7 +9,7 @@ conditional: null
9
9
  stateless: true
10
10
  category: tool
11
11
  knowledge-base: [multi-model-review-dispatch, automated-review-tooling]
12
- argument-hint: "<PR number or blank for current branch>"
12
+ argument-hint: "<PR# or blank> [--fix-threshold P0|P1|P2|P3]"
13
13
  ---
14
14
 
15
15
  ## Purpose
@@ -44,7 +44,7 @@ The three channels are:
44
44
 
45
45
  ## Inputs
46
46
 
47
- - $ARGUMENTS — PR number (optional; auto-detected from current branch if omitted)
47
+ - $ARGUMENTS — PR number (optional; auto-detected from current branch if omitted) and/or `--fix-threshold P0|P1|P2|P3` to override the project's configured threshold for this run
48
48
  - `.mmr.yaml` — MMR CLI configuration (channels, review_criteria, defaults)
49
49
 
50
50
  The CLI handles review context via config (`review_criteria` in `.mmr.yaml`).
@@ -54,7 +54,7 @@ in the review criteria config rather than read at dispatch time.
54
54
  ## Expected Outputs
55
55
 
56
56
  - All three CLI review channels executed (or fallback documented) plus the Superpowers code-reviewer 4th channel reconciled via `mmr reconcile`
57
- - P0/P1/P2 findings fixed before proceeding
57
+ - findings at or above the configured `fix_threshold` fixed before proceeding (read from `results.fix_threshold` in the verdict JSON; default `P2`)
58
58
  - Review summary with per-channel results and reconciliation
59
59
 
60
60
  ## Instructions
@@ -62,8 +62,21 @@ in the review criteria config rather than read at dispatch time.
62
62
  ### Step 1: Identify the PR
63
63
 
64
64
  ```bash
65
- # Use argument if provided, otherwise detect from current branch
66
- PR_NUMBER="${ARGUMENTS:-$(gh pr view --json number -q .number 2>/dev/null)}"
65
+ # Strip --fix-threshold from $ARGUMENTS if present; remainder is the PR number.
66
+ # Strip the entire matched span (BASH_REMATCH[0]) — including whatever
67
+ # whitespace separator was used (space, tab, multi-space). Replacing with a
68
+ # single space preserves token boundaries; tr -d '[:space:]' below drops
69
+ # everything else.
70
+ FIX_THRESHOLD=""
71
+ ARGS_REMAINING="$ARGUMENTS"
72
+ if [[ "$ARGS_REMAINING" =~ (^|[[:space:]])--fix-threshold[[:space:]]+(P[0-3])($|[[:space:]]) ]]; then
73
+ FIX_THRESHOLD="${BASH_REMATCH[2]}"
74
+ ARGS_REMAINING="${ARGS_REMAINING//${BASH_REMATCH[0]}/ }"
75
+ fi
76
+
77
+ # Use remaining argument if provided, otherwise detect from current branch
78
+ PR_NUMBER="$(echo "$ARGS_REMAINING" | tr -d '[:space:]')"
79
+ PR_NUMBER="${PR_NUMBER:-$(gh pr view --json number -q .number 2>/dev/null)}"
67
80
  ```
68
81
 
69
82
  If no PR is found, stop and tell the user to create a PR first.
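Because the block above deletes all whitespace from the remainder, input such as `47 extra` would collapse to `47extra` and be passed along as the PR number. The sketch below shows one way to reject that case. It assumes a hypothetical function `split_pr_args`, quotes the matched span inside the substitution so it is treated literally, and leaves out the `gh pr view` fallback so that it runs offline.

```bash
#!/usr/bin/env bash
# Hypothetical helper: sets FIX_THRESHOLD and PR_NUMBER from an argument
# string; returns 2 if anything other than a PR number is left over.
split_pr_args() {
  local args="$1"
  local re='(^|[[:space:]])--fix-threshold[[:space:]]+(P[0-3])($|[[:space:]])'
  FIX_THRESHOLD=""
  if [[ "$args" =~ $re ]]; then
    FIX_THRESHOLD="${BASH_REMATCH[2]}"
    # Replace the whole matched span with one space to keep token boundaries.
    args=${args//"${BASH_REMATCH[0]}"/ }
  fi
  PR_NUMBER="$(printf '%s' "$args" | tr -d '[:space:]')"
  # An empty PR_NUMBER is fine (the caller would auto-detect); anything
  # non-numeric is treated as an error in this sketch.
  if [ -n "$PR_NUMBER" ] && ! [[ "$PR_NUMBER" =~ ^[0-9]+$ ]]; then
    echo "unexpected argument(s): $PR_NUMBER" >&2
    return 2
  fi
}

split_pr_args "47 --fix-threshold P1"
echo "$PR_NUMBER $FIX_THRESHOLD"   # prints 47 P1
```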
@@ -73,7 +86,9 @@ If no PR is found, stop and tell the user to create a PR first.
73
86
  Use the MMR CLI as the primary entry point for automated dispatch, reconciliation, and verdict:
74
87
 
75
88
  ```bash
76
- MMR_RESULT=$(mmr review --pr "$PR_NUMBER" --sync --format json)
89
+ MMR_FLAGS=(--pr "$PR_NUMBER" --sync --format json)
90
+ [ -n "$FIX_THRESHOLD" ] && MMR_FLAGS+=(--fix-threshold "$FIX_THRESHOLD")
91
+ MMR_RESULT=$(mmr review "${MMR_FLAGS[@]}")
77
92
  # Extract job_id from JSON output for use in mmr reconcile
78
93
  JOB_ID=$(echo "$MMR_RESULT" | grep -o '"job_id": "[^"]*"' | head -1 | cut -d'"' -f4)
79
94
  ```
@@ -168,7 +183,7 @@ reconcile findings after all channels complete:
168
183
  | One channel flags P0, others approve | **High** | Fix it — P0 is critical from any source |
169
184
  | One channel flags P1, others approve | **Medium** | Fix it — P1 findings are mandatory regardless of source count |
170
185
  | Channels contradict each other | **Low** | Present to user for adjudication |
171
- | Compensating-pass P0/P1/P2 finding | **Single-source** | Fix per normal thresholds, label as compensating |
186
+ | Compensating-pass blocking finding | **Single-source** | Fix per normal thresholds, label as compensating |
172
187
 
173
188
  ### Step 6: Report Results
174
189
 
@@ -200,7 +215,7 @@ Output a review summary in this format:
200
215
 
201
216
  Return exactly one verdict:
202
217
 
203
- - `pass` — all channels completed and the gate passed (no unresolved findings at or above the configured fix threshold; default threshold is `P2`, so this means no unresolved P0/P1/P2)
218
+ - `pass` — all channels completed and the gate passed (no unresolved findings at or above the configured fix threshold; the threshold defaults to `P2` but is configurable via `.mmr.yaml` or `--fix-threshold`)
204
219
  - `degraded-pass` — gate passed but some channels were skipped or replaced by compensating passes (max achievable verdict when any channel was compensated)
205
220
  - `blocked` — gate failed: at least one unresolved finding sits at or above the fix threshold (typically the *same* finding(s) remain unresolved after 3 fix attempts)
206
221
  - `needs-user-decision` — no channels completed (no reconciled result was possible), reviewer disagreement / contradictions, or a finding requires human judgment that automated iteration can't resolve
@@ -209,15 +224,15 @@ Verdict precedence: `needs-user-decision` > `blocked` > `degraded-pass` > `pass`
209
224
 
210
225
  When compensating passes ran, maximum achievable verdict is `degraded-pass`. When both external channels were compensated, note "All findings are single-model."
211
226
 
212
- ### Step 7: Fix P0/P1/P2 Findings
227
+ ### Step 7: Fix Blocking Findings
213
228
 
214
- If any P0, P1, or P2 findings exist:
229
+ If any findings sit at or above `fix_threshold` (the verdict JSON's `fix_threshold` field; default `P2`):
215
230
  1. Fix them in the code
216
231
  2. Push the fixes: `git push`
217
232
  3. Re-run the review to verify fixes: `mmr review --pr "$PR_NUMBER" --sync --format json`
218
233
  4. The 3-round limit is **per finding**, not total rounds:
219
234
  - **Keep going** when each new round surfaces *different, concrete, fixable* findings — that is healthy review/fix iteration.
220
- - **Stop and ask the user** when (a) the *same* P0/P1/P2 finding (or set) recurs across 3 attempts without progress, (b) a finding is genuinely ambiguous (channels contradict each other), or (c) the user explicitly asks to stop.
235
+ - **Stop and ask the user** when (a) the *same* blocking finding (or set) recurs across 3 attempts without progress, (b) a finding is genuinely ambiguous (channels contradict each other), or (c) the user explicitly asks to stop.
221
236
  - **When stopped**, do NOT merge automatically. Document the unresolved findings (severity, location, attempt count) and let the user decide whether to continue fixing, create follow-up issues, or override.
222
237
 
223
238
  **Note:** Fix cycles are an orchestration concern — the caller (agent or human) handles the fix loop. The CLI provides the review and verdict; the caller decides whether to fix and re-run.
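Since the fix loop is left to the caller, a caller needs some way to count attempts per finding rather than per round. A minimal sketch of that bookkeeping, assuming bash 4+ for associative arrays; the names `FIX_ATTEMPTS`, `record_attempt`, and the `location|severity` key format are hypothetical, not something the CLI provides.

```bash
#!/usr/bin/env bash
# Hypothetical per-finding attempt tracker (requires bash 4+).
declare -A FIX_ATTEMPTS=()
MAX_ATTEMPTS=3

# record_attempt <finding-key>
# Returns 0 and increments the counter if another attempt is allowed;
# returns 1 once the same finding has already been attempted 3 times.
record_attempt() {
  local key="$1"
  local n="${FIX_ATTEMPTS[$key]:-0}"
  if [ "$n" -ge "$MAX_ATTEMPTS" ]; then
    return 1
  fi
  FIX_ATTEMPTS[$key]=$((n + 1))
}

# A new, different finding gets its own counter, so fresh findings
# never exhaust the budget of an older one.
for round in 1 2 3 4; do
  if record_attempt "src/app.ts:42|P1"; then
    echo "round $round: attempting fix"
  else
    echo "round $round: limit reached, surface to user"
  fi
done
```

The loop prints three "attempting fix" lines followed by one "limit reached" line, which is the point at which the text above says to stop and ask the user.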
@@ -261,8 +276,8 @@ In either path, output the message and stop. Do NOT proceed to the next task wit
261
276
  2. **All three CLI channels are mandatory** — Codex CLI, Gemini CLI, and Claude CLI. Plus the Superpowers code-reviewer agent as a complementary 4th channel reconciled via `mmr reconcile` (Step 3). Skip a CLI channel only when a tool is genuinely not installed or auth cannot be recovered (in which case MMR emits a compensating pass for missing Codex/Gemini channels; a missing Claude CLI has no compensator). Never skip by choice.
262
277
  3. **Auth failures are not silent** — always surface to the user with the exact recovery command.
263
278
  4. **Independence** — never share one channel's output with another. Each reviews the diff independently.
264
- 5. **Fix before proceeding** — P0/P1/P2 findings must be resolved before moving to the next task.
265
- 6. **3-round limit (per finding)** — never attempt to fix the *same* P0/P1/P2 finding more than 3 times. Each round that surfaces a *new* fixable finding is healthy iteration — keep going. Stop only when the same finding recurs across 3 attempts, channels contradict each other, or the user asks to stop.
279
+ 5. **Fix before proceeding** — findings at or above `fix_threshold` must be resolved before moving to the next task.
280
+ 6. **3-round limit (per finding)** — never attempt to fix the *same* blocking finding more than 3 times. Each round that surfaces a *new* fixable finding is healthy iteration — keep going. Stop only when the same finding recurs across 3 attempts, channels contradict each other, or the user asks to stop.
266
281
  7. **Document everything** — the review summary must show which channels ran and which were skipped, with reasons.
267
282
  8. **CLI-first** — use `mmr review --sync` as the primary entry point. Manual dispatch is a fallback only.
268
283
  9. **Job storage** — the CLI stores job data at `~/.mmr/jobs/{job-id}/results.json`. Review results are available via `mmr results <job-id>`.
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "@zigrivers/scaffold",
3
- "version": "3.24.3",
3
+ "version": "3.25.0",
4
4
  "description": "AI-powered software project scaffolding pipeline",
5
5
  "type": "module",
6
6
  "workspaces": [
@@ -119,13 +119,20 @@ Re-run `mmr config test` after re-authenticating to verify.
119
119
 
120
120
  ## Severity Gate
121
121
 
122
- Default threshold is P2 (fix P0/P1/P2, skip P3). Override per-review:
122
+ Default threshold is `P2` (the verdict gate blocks on P0, P1, and P2;
123
+ P3 findings are kept in the result as **advisory** but don't cause
124
+ `blocked`). Override per-review:
123
125
 
124
126
  ```bash
125
127
  mmr review --pr 47 --fix-threshold P1 # Only fix P0 and P1
126
128
  mmr review --pr 47 --fix-threshold P0 # Only fix critical issues
127
129
  ```
128
130
 
131
+ The verdict JSON includes `advisory_count` (count of findings strictly
132
+ below the threshold). Formatted output shows `Advisory: N` (text) or
133
+ `**Advisory:** N` (markdown) when non-zero — useful for spotting real
134
+ findings that the gate didn't block.
135
+
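To make "at or above the threshold" concrete: a lower number means higher severity, so a finding blocks when its number is less than or equal to the threshold's number. The sketch below illustrates that comparison and the blocking/advisory split. The helper names are hypothetical, and the real counts come from the CLI's verdict JSON rather than from code like this.

```bash
#!/usr/bin/env bash
# Hypothetical helpers illustrating the gate comparison; not the CLI's code.

# Prints the numeric rank of a severity label (P0 -> 0 ... P3 -> 3).
severity_rank() {
  case "$1" in
    P[0-3]) printf '%s\n' "${1#P}" ;;
    *) return 1 ;;
  esac
}

# is_blocking <severity> [threshold]; threshold defaults to P2.
# Returns 0 if blocking, 1 if advisory, 2 if either label is malformed.
is_blocking() {
  local sev thr
  sev=$(severity_rank "$1") || return 2
  thr=$(severity_rank "${2:-P2}") || return 2
  [ "$sev" -le "$thr" ]
}

# count_gate <threshold> <severity>...
# Note: in this sketch any non-blocking result, including a malformed
# label, is tallied as advisory.
count_gate() {
  local threshold="$1"; shift
  local blocking=0 advisory=0 sev
  for sev in "$@"; do
    if is_blocking "$sev" "$threshold"; then
      blocking=$((blocking + 1))
    else
      advisory=$((advisory + 1))
    fi
  done
  printf 'blocking=%d advisory=%d\n' "$blocking" "$advisory"
}

count_gate P1 P0 P2 P3 P1   # prints blocking=2 advisory=2
```

With `--fix-threshold P1`, the P2 and P3 findings in this example stay in the result as advisory, which corresponds to what `advisory_count` reports.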
129
136
  ## Output Formats
130
137
 
131
138
  ```bash
@@ -147,13 +147,13 @@ When dispatching a review, bundle all relevant context into the prompt. Each CLI
147
147
  ### Template for Artifact Review
148
148
 
149
149
  ```
150
- You are reviewing a project artifact for quality issues. Report P0 (critical), P1 (high), and P2 (medium) issues.
150
+ You are reviewing a project artifact for quality issues. Report all P0, P1, P2, and P3 findings; the project's fix threshold is applied downstream.
151
151
 
152
152
  ## Severity Definitions
153
153
  - P0: Will cause implementation failure, data loss, security vulnerability, or fundamental architectural flaw
154
154
  - P1: Will cause bugs in normal usage, inconsistency across documents, or blocks downstream work
155
155
  - P2: Improvement opportunity — style, naming, documentation, minor optimization
156
- - Do NOT report P3 issues (personal preference, trivial nits)
156
+ - P3: Personal preference, trivial nits — included so a strict project (`fix_threshold: P3`) can act on them; otherwise advisory
157
157
 
158
158
  ## Review Standards
159
159
  [paste contents of docs/review-standards.md if it exists, otherwise use severity definitions above]
@@ -170,7 +170,7 @@ Respond with a JSON object:
170
170
  "approved": true/false,
171
171
  "findings": [
172
172
  {
173
- "severity": "P0" or "P1" or "P2",
173
+ "severity": "P0" or "P1" or "P2" or "P3",
174
174
  "location": "section or line reference",
175
175
  "description": "what's wrong",
176
176
  "suggestion": "specific fix"
@@ -179,13 +179,13 @@ Respond with a JSON object:
179
179
  "summary": "one-line assessment"
180
180
  }
181
181
 
182
- If no P0/P1/P2 issues found, respond with: { "approved": true, "findings": [], "summary": "No issues found." }
182
+ If no findings, respond with: { "approved": true, "findings": [], "summary": "No issues found." }
183
183
  ```
184
184
 
185
185
  ### Template for PR Diff Review
186
186
 
187
187
  ```
188
- You are reviewing a pull request diff. Report P0, P1, and P2 issues.
188
+ You are reviewing a pull request diff. Report all P0, P1, P2, and P3 findings; the project's fix threshold is applied downstream.
189
189
 
190
190
  ## Review Standards
191
191
  [paste docs/review-standards.md]