@fro.bot/systematic 2.3.2 → 2.4.0

Files changed (71)
  1. package/README.md +12 -13
  2. package/agents/design/design-implementation-reviewer.md +2 -19
  3. package/agents/design/design-iterator.md +2 -31
  4. package/agents/design/figma-design-sync.md +2 -22
  5. package/agents/docs/ankane-readme-writer.md +2 -19
  6. package/agents/document-review/adversarial-document-reviewer.md +3 -2
  7. package/agents/document-review/coherence-reviewer.md +5 -7
  8. package/agents/document-review/design-lens-reviewer.md +3 -4
  9. package/agents/document-review/feasibility-reviewer.md +3 -4
  10. package/agents/document-review/product-lens-reviewer.md +25 -6
  11. package/agents/document-review/scope-guardian-reviewer.md +3 -4
  12. package/agents/document-review/security-lens-reviewer.md +3 -4
  13. package/agents/research/best-practices-researcher.md +4 -21
  14. package/agents/research/framework-docs-researcher.md +2 -19
  15. package/agents/research/git-history-analyzer.md +2 -19
  16. package/agents/research/issue-intelligence-analyst.md +2 -24
  17. package/agents/research/learnings-researcher.md +7 -28
  18. package/agents/research/repo-research-analyst.md +3 -32
  19. package/agents/research/slack-researcher.md +128 -0
  20. package/agents/review/agent-native-reviewer.md +109 -195
  21. package/agents/review/architecture-strategist.md +3 -19
  22. package/agents/review/cli-agent-readiness-reviewer.md +1 -27
  23. package/agents/review/code-simplicity-reviewer.md +5 -19
  24. package/agents/review/data-integrity-guardian.md +3 -19
  25. package/agents/review/data-migration-expert.md +3 -19
  26. package/agents/review/deployment-verification-agent.md +3 -19
  27. package/agents/review/pattern-recognition-specialist.md +4 -20
  28. package/agents/review/performance-oracle.md +3 -31
  29. package/agents/review/project-standards-reviewer.md +5 -5
  30. package/agents/review/schema-drift-detector.md +3 -19
  31. package/agents/review/security-sentinel.md +3 -25
  32. package/agents/review/testing-reviewer.md +3 -3
  33. package/agents/workflow/pr-comment-resolver.md +54 -22
  34. package/agents/workflow/spec-flow-analyzer.md +2 -25
  35. package/package.json +1 -1
  36. package/skills/agent-native-architecture/SKILL.md +28 -27
  37. package/skills/agent-native-architecture/references/agent-execution-patterns.md +3 -3
  38. package/skills/agent-native-architecture/references/agent-native-testing.md +1 -1
  39. package/skills/agent-native-architecture/references/mobile-patterns.md +1 -1
  40. package/skills/andrew-kane-gem-writer/SKILL.md +5 -5
  41. package/skills/ce-brainstorm/SKILL.md +43 -181
  42. package/skills/ce-compound/SKILL.md +143 -89
  43. package/skills/ce-compound-refresh/SKILL.md +48 -5
  44. package/skills/ce-ideate/SKILL.md +27 -242
  45. package/skills/ce-plan/SKILL.md +165 -81
  46. package/skills/ce-review/SKILL.md +348 -125
  47. package/skills/ce-review/references/findings-schema.json +5 -0
  48. package/skills/ce-review/references/persona-catalog.md +2 -2
  49. package/skills/ce-review/references/resolve-base.sh +5 -2
  50. package/skills/ce-review/references/subagent-template.md +25 -3
  51. package/skills/ce-work/SKILL.md +95 -242
  52. package/skills/ce-work-beta/SKILL.md +154 -301
  53. package/skills/dhh-rails-style/SKILL.md +13 -12
  54. package/skills/document-review/SKILL.md +56 -109
  55. package/skills/document-review/references/findings-schema.json +0 -23
  56. package/skills/document-review/references/subagent-template.md +13 -18
  57. package/skills/dspy-ruby/SKILL.md +8 -8
  58. package/skills/every-style-editor/SKILL.md +3 -2
  59. package/skills/frontend-design/SKILL.md +2 -3
  60. package/skills/git-commit/SKILL.md +1 -1
  61. package/skills/git-commit-push-pr/SKILL.md +81 -265
  62. package/skills/git-worktree/SKILL.md +20 -21
  63. package/skills/lfg/SKILL.md +10 -17
  64. package/skills/onboarding/SKILL.md +2 -2
  65. package/skills/onboarding/scripts/inventory.mjs +31 -7
  66. package/skills/proof/SKILL.md +134 -28
  67. package/skills/resolve-pr-feedback/SKILL.md +7 -2
  68. package/skills/setup/SKILL.md +1 -1
  69. package/skills/test-browser/SKILL.md +10 -11
  70. package/skills/test-xcode/SKILL.md +6 -3
  71. package/dist/lib/manifest.d.ts +0 -39
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  name: ce:review
3
- description: Structured code review using tiered persona agents, confidence-gated findings, and a merge/dedup pipeline. Use when reviewing code changes before creating a PR.
4
- argument-hint: '[mode:autofix|mode:report-only] [PR number, GitHub URL, or branch name]'
3
+ description: "Structured code review using tiered persona agents, confidence-gated findings, and a merge/dedup pipeline. Use when reviewing code changes before creating a PR."
4
+ argument-hint: "[blank to review current branch, or provide PR link]"
5
5
  ---
6
6
 
7
7
  # Code Review
@@ -16,15 +16,30 @@ Reviews code changes using dynamically selected reviewer personas. Spawns parall
16
16
  - Can be invoked standalone
17
17
  - Can run as a read-only or autofix review step inside larger workflows
18
18
 
19
- ## Mode Detection
19
+ ## Argument Parsing
20
+
21
+ Parse `$ARGUMENTS` for the following optional tokens. Strip each recognized token before interpreting the remainder as the PR number, GitHub URL, or branch name.
22
+
23
+ | Token | Example | Effect |
24
+ |-------|---------|--------|
25
+ | `mode:autofix` | `mode:autofix` | Select autofix mode (see Mode Detection below) |
26
+ | `mode:report-only` | `mode:report-only` | Select report-only mode |
27
+ | `mode:headless` | `mode:headless` | Select headless mode for programmatic callers (see Mode Detection below) |
28
+ | `base:<sha-or-ref>` | `base:abc1234` or `base:origin/main` | Skip scope detection — use this as the diff base directly |
29
+ | `plan:<path>` | `plan:docs/plans/2026-03-25-001-feat-foo-plan.md` | Load this plan for requirements verification |
20
30
 
21
- Check `$ARGUMENTS` for `mode:autofix` or `mode:report-only`. If either token is present, strip it from the remaining arguments before interpreting the rest as the PR number, GitHub URL, or branch name.
31
+ All tokens are optional. Each one present means one less thing to infer. When absent, fall back to existing behavior for that stage.
32
+
33
+ **Conflicting mode flags:** If multiple mode tokens appear in arguments, stop and do not dispatch agents. If `mode:headless` is one of the conflicting tokens, emit the headless error envelope: `Review failed (headless mode). Reason: conflicting mode flags — <mode_a> and <mode_b> cannot be combined.` Otherwise emit the generic form: `Review failed. Reason: conflicting mode flags — <mode_a> and <mode_b> cannot be combined.`
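The token stripping and conflict check described above can be sketched in shell. This is a minimal illustration, not the skill's actual implementation: `parse_review_args` and its `KEY=value` output are hypothetical, and treating a repeated identical mode token as harmless is an assumption. On conflict the sketch only selects which error envelope prefix applies and names the two modes; the full message text is the one quoted in the rule.

```shell
# Hypothetical helper, not part of the package: separates recognized tokens
# from the remaining target (PR number, GitHub URL, or branch name).
parse_review_args() (
  mode="" base="" plan="" target=""
  for tok in "$@"; do
    case "$tok" in
      mode:autofix|mode:report-only|mode:headless)
        if [ -n "$mode" ] && [ "$mode" != "$tok" ]; then
          # Headless envelope applies when either conflicting token is mode:headless.
          case "$mode $tok" in
            *mode:headless*) echo "Review failed (headless mode)." ;;
            *) echo "Review failed." ;;
          esac
          echo "CONFLICT=$mode,$tok"
          exit 1
        fi
        mode="$tok" ;;
      base:*) base="${tok#base:}" ;;
      plan:*) plan="${tok#plan:}" ;;
      # Anything unrecognized is kept as the review target.
      *) target="${target:+$target }$tok" ;;
    esac
  done
  # No mode token means interactive, the default mode.
  [ -n "$mode" ] || mode="mode:interactive"
  echo "MODE=${mode#mode:}"
  echo "BASE_ARG=$base"
  echo "PLAN=$plan"
  echo "TARGET=$target"
)
```

For example, `parse_review_args mode:headless base:origin/main` would print `MODE=headless` and `BASE_ARG=origin/main`, with empty `PLAN` and `TARGET` values.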
34
+
35
+ ## Mode Detection
22
36
 
23
37
  | Mode | When | Behavior |
24
38
  |------|------|----------|
25
- | **Interactive** (default) | No mode token present | Review, present findings, ask for policy decisions when needed, and optionally continue into fix/push/PR next steps |
39
+ | **Interactive** (default) | No mode token present | Review, apply safe_auto fixes automatically, present findings, ask for policy decisions on gated/manual findings, and optionally continue into fix/push/PR next steps |
26
40
  | **Autofix** | `mode:autofix` in arguments | No user interaction. Review, apply only policy-allowed `safe_auto` fixes, re-review in bounded rounds, write a run artifact, and emit residual downstream work when needed |
27
41
  | **Report-only** | `mode:report-only` in arguments | Strictly read-only. Review and report only, then stop with no edits, artifacts, todos, commits, pushes, or PR actions |
42
+ | **Headless** | `mode:headless` in arguments | Programmatic mode for skill-to-skill invocation. Apply `safe_auto` fixes silently (single pass), return all other findings as structured text output, write run artifacts, skip todos, and return "Review complete" signal. No interactive prompts. |
28
43
 
29
44
  ### Autofix mode rules
30
45
 
@@ -42,6 +57,19 @@ Check `$ARGUMENTS` for `mode:autofix` or `mode:report-only`. If either token is
42
57
  - **Do not switch the shared checkout.** If the caller passes an explicit PR or branch target, `mode:report-only` must run in an isolated checkout/worktree or stop instead of running `gh pr checkout` / `git checkout`.
43
58
  - **Do not overlap mutating review with browser testing on the same checkout.** If a future orchestrator wants fixes, run the mutating review phase after browser testing or in an isolated checkout/worktree.
44
59
 
60
+ ### Headless mode rules
61
+
62
+ - **Skip all user questions.** Never use the platform question tool (`question` in OpenCode, `request_user_input` in Codex, `ask_user` in Gemini) or other interactive prompts. Infer intent conservatively if the diff metadata is thin.
63
+ - **Require a determinable diff scope.** If headless mode cannot determine a diff scope (no branch, PR, or `base:` ref determinable without user interaction), emit `Review failed (headless mode). Reason: no diff scope detected. Re-invoke with a branch name, PR number, or base:<ref>.` and stop without dispatching agents.
64
+ - **Apply only `safe_auto -> review-fixer` findings in a single pass.** No bounded re-review rounds. Leave `gated_auto`, `manual`, `human`, and `release` work unresolved and return them in the structured output.
65
+ - **Return all non-auto findings as structured text output.** Use the headless output envelope format (see Stage 6 below) preserving severity, autofix_class, owner, requires_verification, confidence, pre_existing, and suggested_fix per finding. Enrich with detail-tier fields (why_it_matters, evidence[]) from the per-agent artifact files on disk (see Detail enrichment in Stage 6).
66
+ - **Write a run artifact** under `.context/systematic/ce-review/<run-id>/` summarizing findings, applied fixes, and advisory outputs. Include the artifact path in the structured output.
67
+ - **Do not create todo files.** The caller receives structured findings and routes downstream work itself.
68
+ - **Do not switch the shared checkout.** If the caller passes an explicit PR or branch target, `mode:headless` must run in an isolated checkout/worktree or stop instead of running `gh pr checkout` / `git checkout`. When stopping, emit `Review failed (headless mode). Reason: cannot switch shared checkout. Re-invoke with base:<ref> to review the current checkout, or run from an isolated worktree.`
69
+ - **Not safe for concurrent use on a shared checkout.** Unlike `mode:report-only`, headless mutates files (applies `safe_auto` fixes). Callers must not run headless concurrently with other mutating operations on the same checkout.
70
+ - **Never commit, push, or create a PR** from headless mode. The caller owns those decisions.
71
+ - **End with "Review complete" as the terminal signal** so callers can detect completion. If all reviewers fail or time out, emit `Code review degraded (headless mode). Reason: 0 of N reviewers returned results.` followed by "Review complete".
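From the caller's side, the headless rules above imply a small amount of output handling: look for the failure envelope, confirm the terminal signal, and notice the degraded case. The sketch below illustrates one way a caller might do that. `classify_headless_output` is a hypothetical name, and matching on these literal strings assumes they do not also appear inside reported findings.

```shell
# Hypothetical caller-side helper: reads captured review output on stdin
# and prints one of: failed, incomplete, degraded, complete.
classify_headless_output() (
  out=$(cat)
  case "$out" in
    *"Review failed (headless mode)."*) echo "failed"; exit 0 ;;
  esac
  # The terminal signal is expected on the last non-empty line.
  last=$(printf '%s\n' "$out" | sed '/^[[:space:]]*$/d' | tail -n 1)
  case "$last" in
    *"Review complete"*) ;;
    *) echo "incomplete"; exit 0 ;;
  esac
  # A degraded run still ends with the terminal signal.
  case "$out" in
    *"Code review degraded (headless mode)."*) echo "degraded" ;;
    *) echo "complete" ;;
  esac
)
```

A caller would pipe the captured skill output into the function and branch on the printed word, treating `incomplete` as a run that never reached its terminal signal.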
72
+
45
73
  ## Severity Scale
46
74
 
47
75
  All reviewers use P0-P3:
@@ -73,7 +101,7 @@ Routing rules:
73
101
 
74
102
  ## Reviewers
75
103
 
76
- 13 reviewer personas in layered conditionals, plus CE-specific agents. See [persona-catalog.md](./references/persona-catalog.md) for the full catalog.
104
+ 17 reviewer personas in layered conditionals, plus CE-specific agents. See the persona catalog included below for the full catalog.
77
105
 
78
106
  **Always-on (every review):**
79
107
 
@@ -82,6 +110,7 @@ Routing rules:
82
110
  | `systematic:review:correctness-reviewer` | Logic errors, edge cases, state bugs, error propagation |
83
111
  | `systematic:review:testing-reviewer` | Coverage gaps, weak assertions, brittle tests |
84
112
  | `systematic:review:maintainability-reviewer` | Coupling, complexity, naming, dead code, abstraction debt |
113
+ | `systematic:review:project-standards-reviewer` | AGENTS.md compliance -- frontmatter, references, naming, portability |
85
114
  | `systematic:review:agent-native-reviewer` | Verify new features are agent-accessible |
86
115
  | `systematic:research:learnings-researcher` | Search docs/solutions/ for past issues related to this PR |
87
116
 
@@ -94,6 +123,9 @@ Routing rules:
94
123
  | `systematic:review:api-contract-reviewer` | Routes, serializers, type signatures, versioning |
95
124
  | `systematic:review:data-migrations-reviewer` | Migrations, schema changes, backfills |
96
125
  | `systematic:review:reliability-reviewer` | Error handling, retries, timeouts, background jobs |
126
+ | `systematic:review:adversarial-reviewer` | Diff >=50 changed non-test/non-generated/non-lockfile lines, or auth, payments, data mutations, external APIs |
127
+ | `systematic:review:cli-readiness-reviewer` | CLI command definitions, argument parsing, CLI framework usage, command handler implementations |
128
+ | `systematic:review:previous-comments-reviewer` | Reviewing a PR that has existing review comments or threads |
97
129
 
98
130
  **Stack-specific conditional (selected per diff):**
99
131
 
@@ -114,7 +146,7 @@ Routing rules:
114
146
 
115
147
  ## Review Scope
116
148
 
117
- Every review spawns all 3 always-on personas plus the 2 CE always-on agents, then adds whichever cross-cutting and stack-specific conditionals fit the diff. The model naturally right-sizes: a small config change triggers 0 conditionals = 5 reviewers. A Rails auth feature might trigger security + reliability + kieran-rails + dhh-rails = 9 reviewers.
149
+ Every review spawns all 4 always-on personas plus the 2 CE always-on agents, then adds whichever cross-cutting and stack-specific conditionals fit the diff. The model naturally right-sizes: a small config change triggers 0 conditionals = 6 reviewers. A Rails auth feature might trigger security + reliability + kieran-rails + dhh-rails = 10 reviewers.
118
150
 
119
151
  ## Protected Artifacts
120
152
 
@@ -132,9 +164,26 @@ If a reviewer flags any file in these directories for cleanup or removal, discar
132
164
 
133
165
  Compute the diff range, file list, and diff. Minimize permission prompts by combining into as few commands as possible.
134
166
 
167
+ **If `base:` argument is provided (fast path):**
168
+
169
+ The caller already knows the diff base. Skip all base-branch detection, remote resolution, and merge-base computation. Use the provided value directly:
170
+
171
+ ```
172
+ BASE_ARG="{base_arg}"
173
+ BASE=$(git merge-base HEAD "$BASE_ARG" 2>/dev/null) || BASE="$BASE_ARG"
174
+ ```
175
+
176
+ Then produce the same output as the other paths:
177
+
178
+ ```
179
+ echo "BASE:$BASE" && echo "FILES:" && git diff --name-only $BASE && echo "DIFF:" && git diff -U10 $BASE && echo "UNTRACKED:" && git ls-files --others --exclude-standard
180
+ ```
181
+
182
+ This path works with any ref — a SHA, `origin/main`, a branch name. Automated callers (ce:work, lfg, slfg) should prefer this to avoid the detection overhead. **Do not combine `base:` with a PR number or branch target.** If both are present, stop with an error: "Cannot use `base:` with a PR number or branch target — `base:` implies the current checkout is already the correct branch. Pass `base:` alone, or pass the target alone and let scope detection resolve the base." This avoids scope/intent mismatches where the diff base comes from one source but the code and metadata come from another.
183
+
135
184
  **If a PR number or GitHub URL is provided as an argument:**
136
185
 
137
- If `mode:report-only` is active, do **not** run `gh pr checkout <number-or-url>` on the shared checkout. Tell the caller: "mode:report-only cannot switch the shared checkout to review a PR target. Run it from an isolated worktree/checkout for that PR, or run report-only with no target argument on the already checked out branch." Stop here unless the review is already running in an isolated checkout.
186
+ If `mode:report-only` or `mode:headless` is active, do **not** run `gh pr checkout <number-or-url>` on the shared checkout. For `mode:report-only`, tell the caller: "mode:report-only cannot switch the shared checkout to review a PR target. Run it from an isolated worktree/checkout for that PR, or run report-only with no target argument on the already checked out branch." For `mode:headless`, emit `Review failed (headless mode). Reason: cannot switch shared checkout. Re-invoke with base:<ref> to review the current checkout, or run from an isolated worktree.` Stop here unless the review is already running in an isolated checkout.
138
187
 
139
188
  First, verify the worktree is clean before switching branches:
140
189
 
@@ -188,7 +237,7 @@ Extract PR title/body, base branch, and PR URL from `gh pr view`, then extract t
188
237
 
189
238
  Check out the named branch, then diff it against the base branch. Substitute the provided branch name (shown here as `<branch>`).
190
239
 
191
- If `mode:report-only` is active, do **not** run `git checkout <branch>` on the shared checkout. Tell the caller: "mode:report-only cannot switch the shared checkout to review another branch. Run it from an isolated worktree/checkout for `<branch>`, or run report-only on the current checkout with no target argument." Stop here unless the review is already running in an isolated checkout.
240
+ If `mode:report-only` or `mode:headless` is active, do **not** run `git checkout <branch>` on the shared checkout. For `mode:report-only`, tell the caller: "mode:report-only cannot switch the shared checkout to review another branch. Run it from an isolated worktree/checkout for `<branch>`, or run report-only on the current checkout with no target argument." For `mode:headless`, emit `Review failed (headless mode). Reason: cannot switch shared checkout. Re-invoke with base:<ref> to review the current checkout, or run from an isolated worktree.` Stop here unless the review is already running in an isolated checkout.
192
241
 
193
242
  First, verify the worktree is clean before switching branches:
194
243
 
@@ -202,97 +251,45 @@ If the output is non-empty, inform the user: "You have uncommitted changes on th
202
251
  git checkout <branch>
203
252
  ```
204
253
 
205
- Then detect the review base branch before computing the merge-base. When the branch has an open PR, resolve the base ref from the PR's actual base repository (not just `origin`), mirroring the PR-mode logic for fork safety. Fall back to `origin/HEAD`, GitHub metadata, then common branch names:
254
+ Then detect the review base branch and compute the merge-base. Run the `references/resolve-base.sh` script, which handles fork-safe remote resolution with multi-fallback detection (PR metadata -> `origin/HEAD` -> `gh repo view` -> common branch names):
206
255
 
207
256
  ```
208
- REVIEW_BASE_BRANCH=""
209
- PR_BASE_REPO=""
210
- if command -v gh >/dev/null 2>&1; then
211
- PR_META=$(gh pr view --json baseRefName,url 2>/dev/null || true)
212
- if [ -n "$PR_META" ]; then
213
- REVIEW_BASE_BRANCH=$(echo "$PR_META" | jq -r '.baseRefName // empty')
214
- PR_BASE_REPO=$(echo "$PR_META" | jq -r '.url // empty' | sed -n 's#https://github.com/\([^/]*/[^/]*\)/pull/.*#\1#p')
215
- fi
216
- fi
217
- if [ -z "$REVIEW_BASE_BRANCH" ]; then REVIEW_BASE_BRANCH=$(git symbolic-ref --quiet --short refs/remotes/origin/HEAD 2>/dev/null | sed 's#^origin/##'); fi
218
- if [ -z "$REVIEW_BASE_BRANCH" ] && command -v gh >/dev/null 2>&1; then REVIEW_BASE_BRANCH=$(gh repo view --json defaultBranchRef --jq '.defaultBranchRef.name' 2>/dev/null); fi
219
- if [ -z "$REVIEW_BASE_BRANCH" ]; then
220
- for candidate in main master develop trunk; do
221
- if git rev-parse --verify "origin/$candidate" >/dev/null 2>&1 || git rev-parse --verify "$candidate" >/dev/null 2>&1; then
222
- REVIEW_BASE_BRANCH="$candidate"
223
- break
224
- fi
225
- done
226
- fi
227
- if [ -n "$REVIEW_BASE_BRANCH" ]; then
228
- if [ -n "$PR_BASE_REPO" ]; then
229
- PR_BASE_REMOTE=$(git remote -v | awk "index(\$2, \"github.com:$PR_BASE_REPO\") || index(\$2, \"github.com/$PR_BASE_REPO\") {print \$1; exit}")
230
- if [ -n "$PR_BASE_REMOTE" ]; then
231
- git rev-parse --verify "$PR_BASE_REMOTE/$REVIEW_BASE_BRANCH" >/dev/null 2>&1 || git fetch --no-tags "$PR_BASE_REMOTE" "$REVIEW_BASE_BRANCH" 2>/dev/null || true
232
- BASE_REF=$(git rev-parse --verify "$PR_BASE_REMOTE/$REVIEW_BASE_BRANCH" 2>/dev/null || true)
233
- fi
234
- fi
235
- if [ -z "$BASE_REF" ]; then
236
- git rev-parse --verify "origin/$REVIEW_BASE_BRANCH" >/dev/null 2>&1 || git fetch --no-tags origin "$REVIEW_BASE_BRANCH" 2>/dev/null || true
237
- BASE_REF=$(git rev-parse --verify "origin/$REVIEW_BASE_BRANCH" 2>/dev/null || git rev-parse --verify "$REVIEW_BASE_BRANCH" 2>/dev/null || true)
238
- fi
239
- if [ -n "$BASE_REF" ]; then BASE=$(git merge-base HEAD "$BASE_REF" 2>/dev/null) || BASE=""; else BASE=""; fi
240
- else BASE=""; fi
257
+ RESOLVE_OUT=$(bash references/resolve-base.sh) || { echo "ERROR: resolve-base.sh failed"; exit 1; }
258
+ if [ -z "$RESOLVE_OUT" ] || echo "$RESOLVE_OUT" | grep -q '^ERROR:'; then echo "${RESOLVE_OUT:-ERROR: resolve-base.sh produced no output}"; exit 1; fi
259
+ BASE=$(echo "$RESOLVE_OUT" | sed 's/^BASE://')
241
260
  ```
242
261
 
262
+ If the script outputs an error, stop instead of falling back to `git diff HEAD`; a branch review without the base branch would only show uncommitted changes and silently miss all committed work.
263
+
264
+ On success, produce the diff:
265
+
243
266
  ```
244
- if [ -n "$BASE" ]; then echo "BASE:$BASE" && echo "FILES:" && git diff --name-only $BASE && echo "DIFF:" && git diff -U10 $BASE && echo "UNTRACKED:" && git ls-files --others --exclude-standard; else echo "ERROR: Unable to resolve review base branch locally. Fetch the base branch and rerun, or provide a PR number so the review scope can be determined from PR metadata."; fi
267
+ echo "BASE:$BASE" && echo "FILES:" && git diff --name-only $BASE && echo "DIFF:" && git diff -U10 $BASE && echo "UNTRACKED:" && git ls-files --others --exclude-standard
245
268
  ```
246
269
 
247
- If the branch has an open PR, the detection above uses the PR's base repository to resolve the merge-base, which handles fork workflows correctly. You may still fetch additional PR metadata with `gh pr view` for title, body, and linked issues, but do not fail if no PR exists. If the base branch still cannot be resolved after the detection and fetch attempts, stop instead of falling back to `git diff HEAD`; a branch review without the base branch would only show uncommitted changes and silently miss all committed work.
270
+ You may still fetch additional PR metadata with `gh pr view` for title, body, and linked issues, but do not fail if no PR exists.
248
271
 
249
272
  **If no argument (standalone on current branch):**
250
273
 
251
- Detect the review base branch before computing the merge-base. When the current branch has an open PR, resolve the base ref from the PR's actual base repository (not just `origin`), mirroring the PR-mode logic for fork safety. Fall back to `origin/HEAD`, GitHub metadata, then common branch names:
274
+ Detect the review base branch and compute the merge-base using the same `references/resolve-base.sh` script as branch mode:
252
275
 
253
276
  ```
254
- REVIEW_BASE_BRANCH=""
255
- PR_BASE_REPO=""
256
- if command -v gh >/dev/null 2>&1; then
257
- PR_META=$(gh pr view --json baseRefName,url 2>/dev/null || true)
258
- if [ -n "$PR_META" ]; then
259
- REVIEW_BASE_BRANCH=$(echo "$PR_META" | jq -r '.baseRefName // empty')
260
- PR_BASE_REPO=$(echo "$PR_META" | jq -r '.url // empty' | sed -n 's#https://github.com/\([^/]*/[^/]*\)/pull/.*#\1#p')
261
- fi
262
- fi
263
- if [ -z "$REVIEW_BASE_BRANCH" ]; then REVIEW_BASE_BRANCH=$(git symbolic-ref --quiet --short refs/remotes/origin/HEAD 2>/dev/null | sed 's#^origin/##'); fi
264
- if [ -z "$REVIEW_BASE_BRANCH" ] && command -v gh >/dev/null 2>&1; then REVIEW_BASE_BRANCH=$(gh repo view --json defaultBranchRef --jq '.defaultBranchRef.name' 2>/dev/null); fi
265
- if [ -z "$REVIEW_BASE_BRANCH" ]; then
266
- for candidate in main master develop trunk; do
267
- if git rev-parse --verify "origin/$candidate" >/dev/null 2>&1 || git rev-parse --verify "$candidate" >/dev/null 2>&1; then
268
- REVIEW_BASE_BRANCH="$candidate"
269
- break
270
- fi
271
- done
272
- fi
273
- if [ -n "$REVIEW_BASE_BRANCH" ]; then
274
- if [ -n "$PR_BASE_REPO" ]; then
275
- PR_BASE_REMOTE=$(git remote -v | awk "index(\$2, \"github.com:$PR_BASE_REPO\") || index(\$2, \"github.com/$PR_BASE_REPO\") {print \$1; exit}")
276
- if [ -n "$PR_BASE_REMOTE" ]; then
277
- git rev-parse --verify "$PR_BASE_REMOTE/$REVIEW_BASE_BRANCH" >/dev/null 2>&1 || git fetch --no-tags "$PR_BASE_REMOTE" "$REVIEW_BASE_BRANCH" 2>/dev/null || true
278
- BASE_REF=$(git rev-parse --verify "$PR_BASE_REMOTE/$REVIEW_BASE_BRANCH" 2>/dev/null || true)
279
- fi
280
- fi
281
- if [ -z "$BASE_REF" ]; then
282
- git rev-parse --verify "origin/$REVIEW_BASE_BRANCH" >/dev/null 2>&1 || git fetch --no-tags origin "$REVIEW_BASE_BRANCH" 2>/dev/null || true
283
- BASE_REF=$(git rev-parse --verify "origin/$REVIEW_BASE_BRANCH" 2>/dev/null || git rev-parse --verify "$REVIEW_BASE_BRANCH" 2>/dev/null || true)
284
- fi
285
- if [ -n "$BASE_REF" ]; then BASE=$(git merge-base HEAD "$BASE_REF" 2>/dev/null) || BASE=""; else BASE=""; fi
286
- else BASE=""; fi
277
+ RESOLVE_OUT=$(bash references/resolve-base.sh) || { echo "ERROR: resolve-base.sh failed"; exit 1; }
278
+ if [ -z "$RESOLVE_OUT" ] || echo "$RESOLVE_OUT" | grep -q '^ERROR:'; then echo "${RESOLVE_OUT:-ERROR: resolve-base.sh produced no output}"; exit 1; fi
279
+ BASE=$(echo "$RESOLVE_OUT" | sed 's/^BASE://')
287
280
  ```
288
281
 
282
+ If the script outputs an error, stop instead of falling back to `git diff HEAD`; a standalone review without the base branch would only show uncommitted changes and silently miss all committed work on the branch.
283
+
284
+ On success, produce the diff:
285
+
289
286
  ```
290
- if [ -n "$BASE" ]; then echo "BASE:$BASE" && echo "FILES:" && git diff --name-only $BASE && echo "DIFF:" && git diff -U10 $BASE && echo "UNTRACKED:" && git ls-files --others --exclude-standard; else echo "ERROR: Unable to resolve review base branch locally. Fetch the base branch and rerun, or provide a PR number so the review scope can be determined from PR metadata."; fi
287
+ echo "BASE:$BASE" && echo "FILES:" && git diff --name-only $BASE && echo "DIFF:" && git diff -U10 $BASE && echo "UNTRACKED:" && git ls-files --others --exclude-standard
291
288
  ```
292
289
 
293
- Parse: `BASE:` = merge-base SHA, `FILES:` = file list, `DIFF:` = diff, `UNTRACKED:` = files excluded from review scope because they are not staged. Using `git diff $BASE` (without `..HEAD`) diffs the merge-base against the working tree, which includes committed, staged, and unstaged changes together. If the base branch cannot be resolved after the detection and fetch attempts, stop instead of falling back to `git diff HEAD`; a standalone review without the base branch would only show uncommitted changes and silently miss all committed work on the branch.
290
+ Using `git diff $BASE` (without `..HEAD`) diffs the merge-base against the working tree, which includes committed, staged, and unstaged changes together.
294
291
 
295
- **Untracked file handling:** Always inspect the `UNTRACKED:` list, even when `FILES:`/`DIFF:` are non-empty. Untracked files are outside review scope until staged. If the list is non-empty, tell the user which files are excluded. If any of them should be reviewed, stop and tell the user to `git add` them first and rerun. Only continue when the user is intentionally reviewing tracked changes only.
292
+ **Untracked file handling:** Always inspect the `UNTRACKED:` list, even when `FILES:`/`DIFF:` are non-empty. Untracked files are outside review scope until staged. If the list is non-empty, tell the user which files are excluded. If any of them should be reviewed, stop and tell the user to `git add` them first and rerun. Only continue when the user is intentionally reviewing tracked changes only. In `mode:headless` or `mode:autofix`, do not stop to ask — proceed with tracked changes only and note the excluded untracked files in the Coverage section of the output.
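For the non-interactive modes, the untracked list has to be pulled out of the combined Stage 1 output and carried into the Coverage section. A minimal sketch of that extraction follows. Both function names and the wording of the note are hypothetical; only the `UNTRACKED:` marker comes from the commands above.

```shell
# Hypothetical helper: prints the paths listed after the UNTRACKED: marker
# in the combined Stage 1 output read from stdin.
untracked_from_output() {
  awk 'found { print } /^UNTRACKED:$/ { found = 1 }'
}

# Hypothetical helper: formats a coverage note from the same input, or
# prints nothing when no untracked files were excluded.
coverage_note_for_untracked() (
  files=$(untracked_from_output)
  [ -n "$files" ] || exit 0
  echo "Excluded untracked files (not reviewed):"
  printf '%s\n' "$files" | sed 's/^/- /'
)
```

Because `UNTRACKED:` is echoed last, everything after the marker is the file list, and diff lines cannot match the marker exactly since they start with `+`, `-`, or a space.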
296
293
 
297
294
  ### Stage 2: Intent discovery
298
295
 
@@ -308,7 +305,7 @@ Understand what the change is trying to accomplish. The source of intent depends
308
305
  echo "BRANCH:" && git rev-parse --abbrev-ref HEAD && echo "COMMITS:" && git log --oneline ${BASE}..HEAD
309
306
  ```
310
307
 
311
- Combined with conversation context (plan section summary, PR description, caller-provided description), write a 2-3 line intent summary:
308
+ Combined with conversation context (plan section summary, PR description), write a 2-3 line intent summary:
312
309
 
313
310
  ```
314
311
  Intent: Simplify tax calculation by replacing the multi-tier rate lookup
@@ -320,11 +317,31 @@ Pass this to every reviewer in their spawn prompt. Intent shapes *how hard each
320
317
  **When intent is ambiguous:**
321
318
 
322
319
  - **Interactive mode:** Ask one question using the platform's interactive question tool (question in OpenCode, request_user_input in Codex): "What is the primary goal of these changes?" Do not spawn reviewers until intent is established.
323
- - **Autofix/report-only modes:** Infer intent conservatively from the branch name, diff, PR metadata, and caller context. Note the uncertainty in Coverage or Verdict reasoning instead of blocking.
320
+ - **Autofix/report-only/headless modes:** Infer intent conservatively from the branch name, diff, PR metadata, and caller context. Note the uncertainty in Coverage or Verdict reasoning instead of blocking.
321
+
322
+ ### Stage 2b: Plan discovery (requirements verification)
323
+
324
+ Locate the plan document so Stage 6 can verify requirements completeness. Check these sources in priority order — stop at the first hit:
325
+
326
+ 1. **`plan:` argument.** If the caller passed a plan path, use it directly. Read the file to confirm it exists.
327
+ 2. **PR body.** If PR metadata was fetched in Stage 1, scan the body for paths matching `docs/plans/*.md`. If exactly one match is found and the file exists, use it as `plan_source: explicit`. If multiple plan paths appear, treat as ambiguous — demote to `plan_source: inferred` for the most recent match that exists on disk, or skip if none exist or none clearly relate to the PR title/intent. Always verify the selected file exists before using it — stale or copied plan links in PR descriptions are common.
328
+ 3. **Auto-discover.** Extract 2-3 keywords from the branch name (e.g., `feat/onboarding-skill` -> `onboarding`, `skill`). Glob `docs/plans/*` and filter filenames containing those keywords. If exactly one match, use it. If multiple matches or the match looks ambiguous (e.g., generic keywords like `review`, `fix`, `update` that could hit many plans), **skip auto-discovery** — a wrong plan is worse than no plan. If zero matches, skip.
329
+
330
+ **Confidence tagging:** Record how the plan was found:
331
+ - `plan:` argument -> `plan_source: explicit` (high confidence)
332
+ - Single unambiguous PR body match -> `plan_source: explicit` (high confidence)
333
+ - Multiple/ambiguous PR body matches -> `plan_source: inferred` (lower confidence)
334
+ - Auto-discover with single unambiguous match -> `plan_source: inferred` (lower confidence)
335
+
336
+ If a plan is found, read its **Requirements Trace** (R1, R2, etc.) and **Implementation Units** (checkbox items). Store the extracted requirements list and `plan_source` for Stage 6. Do not block the review if no plan is found — requirements verification is additive, not required.
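The auto-discover step (source 3 above) reduces to "exactly one filename match, otherwise nothing". The sketch below shows that rule in isolation. `discover_plan` is a hypothetical name, and requiring a filename to contain every keyword is an assumption, since the text only says filenames are filtered by the keywords. Screening out generic keywords such as `review` or `fix` is left to the caller.

```shell
# Hypothetical helper: prints the single plan file whose name contains every
# keyword, or nothing when zero or several plans match.
# Usage: discover_plan <plans-dir> <keyword>...
discover_plan() (
  plans_dir="$1"
  shift
  # With no keywords every file would match, so skip discovery entirely.
  [ "$#" -gt 0 ] || exit 0
  match=""
  count=0
  for f in "$plans_dir"/*; do
    [ -f "$f" ] || continue
    name=$(basename "$f")
    ok=1
    for kw in "$@"; do
      case "$name" in
        *"$kw"*) ;;
        *) ok=0 ;;
      esac
    done
    if [ "$ok" -eq 1 ]; then
      match="$f"
      count=$((count + 1))
    fi
  done
  # Only an unambiguous single match is returned.
  if [ "$count" -eq 1 ]; then echo "$match"; fi
  exit 0
)
```

For the branch `feat/onboarding-skill`, a caller might run `discover_plan docs/plans onboarding skill` and record `plan_source: inferred` when a path comes back.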
324
337
 
325
338
  ### Stage 3: Select reviewers
326
339
 
327
- Read the diff and file list from Stage 1. The 3 always-on personas and 2 CE always-on agents are automatic. For each cross-cutting and stack-specific conditional persona in [persona-catalog.md](./references/persona-catalog.md), decide whether the diff warrants it. This is agent judgment, not keyword matching.
340
+ Read the diff and file list from Stage 1. The 4 always-on personas and 2 CE always-on agents are automatic. For each cross-cutting and stack-specific conditional persona in the persona catalog included below, decide whether the diff warrants it. This is agent judgment, not keyword matching.
341
+
342
+ **File-type awareness for conditional selection:** Instruction-prose files (Markdown skill definitions, JSON schemas, config files) are product code but do not benefit from runtime-focused reviewers. The adversarial reviewer's techniques (race conditions, cascade failures, abuse cases) target executable code behavior. For diffs that only change instruction-prose files, skip adversarial unless the prose describes auth, payment, or data-mutation behavior. Count only executable code lines toward line-count thresholds.
343
+
344
+ **`previous-comments` is PR-only.** Only select this persona when Stage 1 gathered PR metadata (PR number or URL was provided as an argument, or `gh pr view` returned metadata for the current branch). Skip it entirely for standalone branch reviews with no associated PR -- there are no prior comments to check.
328
345
 
329
346
  Stack-specific personas are additive. A Rails UI change may warrant `kieran-rails` plus `julik-frontend-races`; a TypeScript API diff may warrant `kieran-typescript` plus `api-contract` and `reliability`.
330
347
 
@@ -337,6 +354,7 @@ Review team:
337
354
  - correctness (always)
338
355
  - testing (always)
339
356
  - maintainability (always)
357
+ - project-standards (always)
340
358
  - agent-native-reviewer (always)
341
359
  - learnings-researcher (always)
342
360
  - security -- new endpoint in routes.rb accepts user-provided redirect URL
@@ -348,69 +366,221 @@ Review team:
348
366
 
349
367
  This is progress reporting, not a blocking confirmation.
350
368
 
369
+ ### Stage 3b: Discover project standards paths
370
+
371
+ Before spawning sub-agents, find the file paths (not contents) of all relevant standards files for the `project-standards` persona. Use the native file-search/glob tool to locate:
372
+
373
+ 1. Use the native file-search tool (e.g., Glob in OpenCode) to find every `**/AGENTS.md` in the repo.
374
+ 2. Filter to those whose directory is an ancestor of at least one changed file. A standards file governs all files below it (e.g., `plugins/systematic/AGENTS.md` applies to everything under `plugins/systematic/`).
375
+
376
+ Pass the resulting path list to the `project-standards` persona inside a `<standards-paths>` block in its review context (see Stage 4). The persona reads the files itself, targeting only the sections relevant to the changed file types. This keeps the orchestrator's work cheap (path discovery only) and avoids bloating the subagent prompt with content the reviewer may not fully need.
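
The ancestor filter in step 2 above can be sketched like this. It assumes repo-relative POSIX paths for both lists; `governing_standards` is a hypothetical helper name, not part of the package.

```python
from pathlib import PurePosixPath
from typing import List


def governing_standards(standards_files: List[str], changed_files: List[str]) -> List[str]:
    """Keep standards files whose directory is an ancestor of at least one changed file."""
    selected = []
    for std in standards_files:
        std_dir = PurePosixPath(std).parent  # "." for a repo-root AGENTS.md
        for changed in changed_files:
            # PurePosixPath.parents lists every ancestor directory, ending with "."
            if std_dir in PurePosixPath(changed).parents:
                selected.append(std)
                break
    return selected
```

With this reading, a root-level `AGENTS.md` governs every changed file, while `plugins/systematic/AGENTS.md` is selected only when something under `plugins/systematic/` changed.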
377
+
351
378
  ### Stage 4: Spawn sub-agents
352
379
 
353
- Spawn each selected persona reviewer as a parallel sub-agent using the template in [subagent-template.md](./references/subagent-template.md). Each persona sub-agent receives:
380
+ #### Model tiering
381
+
382
+ Persona sub-agents do focused, scoped work and should use a fast mid-tier model to reduce cost and latency without sacrificing review quality. The orchestrator itself stays on the default (most capable) model.
383
+
384
+ Use the platform's mid-tier model for all persona and CE sub-agents. In OpenCode, pass `model: "sonnet"` in the Agent tool call. On other platforms, use the equivalent mid-tier (e.g., `gpt-4o` in Codex). If the platform has no model override mechanism or the available model names are unknown, omit the model parameter and let agents inherit the default -- a working review on the parent model is better than a broken dispatch from an unrecognized model name.
385
+
386
+ CE always-on agents (agent-native-reviewer, learnings-researcher) and CE conditional agents (schema-drift-detector, deployment-verification-agent) also use the mid-tier model since they perform scoped, focused work.
387
+
388
+ The orchestrator (this skill) stays on the default model because it handles intent discovery, reviewer selection, finding merge/dedup, and synthesis -- tasks that benefit from stronger reasoning.
389
+
390
+ #### Run ID
391
+
392
+ Generate a unique run identifier before dispatching any agents. This ID scopes all agent artifact files and the post-review run artifact to the same directory.
393
+
394
+ ```bash
395
+ RUN_ID=$(date +%Y%m%d-%H%M%S)-$(head -c4 /dev/urandom | od -An -tx1 | tr -d ' ')
396
+ mkdir -p ".context/systematic/ce-review/$RUN_ID"
397
+ ```
398
+
399
+ Pass `{run_id}` to every persona sub-agent so they can write their full analysis to `.context/systematic/ce-review/{run_id}/{reviewer_name}.json`.
400
+
401
+ **Report-only mode:** Skip run-id generation and directory creation. Do not pass `{run_id}` to agents. Agents return compact JSON only with no file write, consistent with report-only's no-write contract.
402
+
403
+ #### Spawning
404
+
405
+ Omit the `mode` parameter when dispatching sub-agents so the user's configured permission settings apply. Do not pass `mode: "auto"`.
406
+
407
+ Spawn each selected persona reviewer as a parallel sub-agent using the subagent template included below. Each persona sub-agent receives:
354
408
 
355
409
  1. Their persona file content (identity, failure modes, calibration, suppress conditions)
356
- 2. Shared diff-scope rules from [diff-scope.md](./references/diff-scope.md)
357
- 3. The JSON output contract from [findings-schema.json](./references/findings-schema.json)
358
- 4. Review context: intent summary, file list, diff
410
+ 2. Shared diff-scope rules from the diff-scope reference included below
411
+ 3. The JSON output contract from the findings schema included below
412
+ 4. PR metadata: title, body, and URL when reviewing a PR (empty string otherwise). Passed in a `<pr-context>` block so reviewers can verify code against stated intent
413
+ 5. Review context: intent summary, file list, diff
414
+ 6. Run ID and reviewer name for the artifact file path
415
+ 7. **For `project-standards` only:** the standards file path list from Stage 3b, wrapped in a `<standards-paths>` block appended to the review context
359
416
 
360
- Persona sub-agents are **read-only**: they review and return structured JSON. They do not edit files or propose refactors.
417
+ Persona sub-agents are **read-only** with respect to the project: they review and return structured JSON. They do not edit project files or propose refactors. The one permitted write is saving their full analysis to the `.context/` artifact path specified in the output contract.
361
418
 
362
- Read-only here means **non-mutating**, not "no shell access." Reviewer sub-agents may use non-mutating inspection commands when needed to gather evidence or verify scope, including read-oriented `git` / `gh` usage such as `git diff`, `git show`, `git blame`, `git log`, and `gh pr view`. They must not edit files, change branches, commit, push, create PRs, or otherwise mutate the checkout or repository state.
419
+ Read-only here means **non-mutating**, not "no shell access." Reviewer sub-agents may use non-mutating inspection commands when needed to gather evidence or verify scope, including read-oriented `git` / `gh` usage such as `git diff`, `git show`, `git blame`, `git log`, and `gh pr view`. They must not edit project files, change branches, commit, push, create PRs, or otherwise mutate the checkout or repository state.
363
420
 
364
- Each persona sub-agent returns JSON matching [findings-schema.json](./references/findings-schema.json):
421
+ Each persona sub-agent writes full JSON (all schema fields) to `.context/systematic/ce-review/{run_id}/{reviewer_name}.json` and returns compact JSON with merge-tier fields only:
365
422
 
366
423
  ```json
367
424
  {
368
425
  "reviewer": "security",
369
- "findings": [...],
426
+ "findings": [
427
+ {
428
+ "title": "User-supplied ID in account lookup without ownership check",
429
+ "severity": "P0",
430
+ "file": "orders_controller.rb",
431
+ "line": 42,
432
+ "confidence": 0.92,
433
+ "autofix_class": "gated_auto",
434
+ "owner": "downstream-resolver",
435
+ "requires_verification": true,
436
+ "pre_existing": false,
437
+ "suggested_fix": "Add current_user.owns?(account) guard before lookup"
438
+ }
439
+ ],
370
440
  "residual_risks": [...],
371
441
  "testing_gaps": [...]
372
442
  }
373
443
  ```
374
444
 
445
+ Detail-tier fields (`why_it_matters`, `evidence`) are in the artifact file only. `suggested_fix` is optional in both tiers -- included in compact returns when present so the orchestrator has fix context for auto-apply decisions. If the file write fails, the compact return still provides everything the merge needs.
446
+
375
447
  **CE always-on agents** (agent-native-reviewer, learnings-researcher) are dispatched as standard Agent calls in parallel with the persona agents. Give them the same review context bundle the personas receive: entry mode, any PR metadata gathered in Stage 1, intent summary, review base branch name when known, `BASE:` marker, file list, diff, and `UNTRACKED:` scope notes. Do not invoke them with a generic "review this" prompt. Their output is unstructured and synthesized separately in Stage 6.
376
448
 
377
449
  **CE conditional agents** (schema-drift-detector, deployment-verification-agent) are also dispatched as standard Agent calls when applicable. Pass the same review context bundle plus the applicability reason (for example, which migration files triggered the agent). For schema-drift-detector specifically, pass the resolved review base branch explicitly so it never assumes `main`. Their output is unstructured and must be preserved for Stage 6 synthesis just like the CE always-on agents.
378
450
 
379
451
  ### Stage 5: Merge findings
380
452
 
381
- Convert multiple reviewer JSON payloads into one deduplicated, confidence-gated finding set.
382
-
383
- 1. **Validate.** Check each output against the schema. Drop malformed findings (missing required fields). Record the drop count.
384
- 2. **Confidence gate.** Suppress findings below 0.60 confidence. Record the suppressed count. This matches the persona instructions: findings below 0.60 are noise and should not survive synthesis.
385
- 3. **Deduplicate.** Compute fingerprint: `normalize(file) + line_bucket(line, +/-3) + normalize(title)`. When fingerprints match, merge: keep highest severity, keep highest confidence with strongest evidence, union evidence, note which reviewers flagged it.
386
- 4. **Separate pre-existing.** Pull out findings with `pre_existing: true` into a separate list.
387
- 5. **Normalize routing.** For each merged finding, set the final `autofix_class`, `owner`, and `requires_verification`. If reviewers disagree, keep the most conservative route. Synthesis may narrow a finding from `safe_auto` to `gated_auto` or `manual`, but must not widen it without new evidence.
388
- 6. **Partition the work.** Build three sets:
453
+ Convert multiple reviewer compact JSON returns into one deduplicated, confidence-gated finding set. The compact returns contain merge-tier fields (title, severity, file, line, confidence, autofix_class, owner, requires_verification, pre_existing) plus the optional suggested_fix. Detail-tier fields (why_it_matters, evidence) are on disk in the per-agent artifact files and are not loaded at this stage.
454
+
455
+ 1. **Validate.** Check each compact return for required top-level and per-finding fields, plus value constraints. Drop malformed returns or findings. Record the drop count.
456
+ - **Top-level required:** reviewer (string), findings (array), residual_risks (array), testing_gaps (array). Drop the entire return if any are missing or wrong type.
457
+ - **Per-finding required:** title, severity, file, line, confidence, autofix_class, owner, requires_verification, pre_existing
458
+ - **Value constraints:**
459
+ - severity: P0 | P1 | P2 | P3
460
+ - autofix_class: safe_auto | gated_auto | manual | advisory
461
+ - owner: review-fixer | downstream-resolver | human | release
462
+ - confidence: numeric, 0.0-1.0
463
+ - line: positive integer
464
+ - pre_existing, requires_verification: boolean
465
+ - Do not validate against the full schema here -- the full schema (including why_it_matters and evidence) applies to the artifact files on disk, not the compact returns.
466
+ 2. **Confidence gate.** Suppress findings below 0.60 confidence. Exception: P0 findings at 0.50+ confidence survive the gate -- critical-but-uncertain issues must not be silently dropped. Record the suppressed count. This matches the persona instructions and the schema's confidence thresholds.
467
+ 3. **Deduplicate.** Compute fingerprint: `normalize(file) + line_bucket(line, +/-3) + normalize(title)`. When fingerprints match, merge: keep highest severity, keep highest confidence, note which reviewers flagged it.
468
+ 4. **Cross-reviewer agreement.** When 2+ independent reviewers flag the same issue (same fingerprint), boost the merged confidence by 0.10 (capped at 1.0). Cross-reviewer agreement is strong signal -- independent reviewers converging on the same issue is more reliable than any single reviewer's confidence. Note the agreement in the Reviewer column of the output (e.g., "security, correctness").
469
+ 5. **Separate pre-existing.** Pull out findings with `pre_existing: true` into a separate list.
470
+ 6. **Resolve disagreements.** When reviewers flag the same code region but disagree on severity, autofix_class, or owner, annotate the Reviewer column with the disagreement (e.g., "security (P0), correctness (P1) -- kept P0"). This transparency helps the user understand why a finding was routed the way it was.
471
+ 7. **Normalize routing.** For each merged finding, set the final `autofix_class`, `owner`, and `requires_verification`. If reviewers disagree, keep the most conservative route. Synthesis may narrow a finding from `safe_auto` to `gated_auto` or `manual`, but must not widen it without new evidence.
472
+ 8. **Partition the work.** Build three sets:
389
473
  - in-skill fixer queue: only `safe_auto -> review-fixer`
390
474
  - residual actionable queue: unresolved `gated_auto` or `manual` findings whose owner is `downstream-resolver`
391
475
  - report-only queue: `advisory` findings plus anything owned by `human` or `release`
392
- 7. **Sort.** Order by severity (P0 first) -> confidence (descending) -> file path -> line number.
393
- 8. **Collect coverage data.** Union residual_risks and testing_gaps across reviewers.
394
- 9. **Preserve CE agent artifacts.** Keep the learnings, agent-native, schema-drift, and deployment-verification outputs alongside the merged finding set. Do not drop unstructured agent output just because it does not match the persona JSON schema.
476
+ 9. **Sort.** Order by severity (P0 first) -> confidence (descending) -> file path -> line number.
477
+ 10. **Collect coverage data.** Union residual_risks and testing_gaps across reviewers.
478
+ 11. **Preserve CE agent artifacts.** Keep the learnings, agent-native, schema-drift, and deployment-verification outputs alongside the merged finding set. Do not drop unstructured agent output just because it does not match the persona JSON schema.
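
A rough sketch of steps 2, 3, 4, and 9 of the merge pipeline above (confidence gate, dedup, cross-reviewer boost, sort). Validation, pre-existing separation, and routing normalization are left out. The text does not define `normalize` or `line_bucket`, so both are assumptions here; in particular, fixed 7-line buckets only approximate a +/-3 tolerance, because two nearby lines can straddle a bucket boundary.

```python
import posixpath
import re
from collections import defaultdict
from typing import Dict, List, Tuple

SEVERITY_RANK = {"P0": 0, "P1": 1, "P2": 2, "P3": 3}


def normalize_title(title: str) -> str:
    """Assumed normalization: lowercase, collapse punctuation and whitespace."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def passes_gate(finding: Dict) -> bool:
    """Keep findings at 0.60+ confidence; P0 findings survive at 0.50+."""
    if finding["confidence"] >= 0.60:
        return True
    return finding["severity"] == "P0" and finding["confidence"] >= 0.50


def fingerprint(finding: Dict) -> Tuple[str, int, str]:
    # Assumption: approximate line_bucket(line, +/-3) with fixed-width buckets.
    path = posixpath.normpath(finding["file"].strip())
    return (path, finding["line"] // 7, normalize_title(finding["title"]))


def merge_findings(returns: List[Dict]) -> Tuple[List[Dict], int]:
    """Return (merged findings sorted for the report, suppressed count)."""
    groups = defaultdict(list)
    suppressed = 0
    for ret in returns:
        for finding in ret["findings"]:
            if not passes_gate(finding):
                suppressed += 1
                continue
            groups[fingerprint(finding)].append((ret["reviewer"], finding))

    merged = []
    for entries in groups.values():
        reviewers = sorted({name for name, _ in entries})
        best = max((f for _, f in entries), key=lambda f: f["confidence"])
        finding = dict(best)
        # Keep the highest severity (lowest rank) across the duplicates.
        finding["severity"] = min(
            (f["severity"] for _, f in entries), key=SEVERITY_RANK.__getitem__
        )
        confidence = best["confidence"]
        if len(reviewers) >= 2:  # cross-reviewer agreement boost, capped at 1.0
            confidence = min(1.0, confidence + 0.10)
        finding["confidence"] = round(confidence, 2)
        finding["reviewers"] = reviewers
        merged.append(finding)

    merged.sort(key=lambda f: (SEVERITY_RANK[f["severity"]], -f["confidence"], f["file"], f["line"]))
    return merged, suppressed
```

The gate runs before dedup here, matching the step order in the list, so a sub-threshold duplicate never contributes to the agreement boost.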
395
479
 
396
480
  ### Stage 6: Synthesize and present
397
481
 
398
- Assemble the final report using the template in [review-output-template.md](./references/review-output-template.md):
482
+ Assemble the final report using **pipe-delimited markdown tables for findings** from the review output template included below. The table format is mandatory for finding rows in interactive mode — do not render findings as freeform text blocks or horizontal-rule-separated prose. Other report sections (Applied Fixes, Learnings, Coverage, etc.) use bullet lists and the `---` separator before the verdict, as shown in the template.
399
483
 
400
484
  1. **Header.** Scope, intent, mode, reviewer team with per-conditional justifications.
401
- 2. **Findings.** Grouped by severity (P0, P1, P2, P3). Each finding shows file, issue, reviewer(s), confidence, and synthesized route.
402
- 3. **Applied Fixes.** Include only if a fix phase ran in this invocation.
403
- 4. **Residual Actionable Work.** Include when unresolved actionable findings were handed off or should be handed off.
404
- 5. **Pre-existing.** Separate section, does not count toward verdict.
405
- 6. **Learnings & Past Solutions.** Surface learnings-researcher results: if past solutions are relevant, flag them as "Known Pattern" with links to docs/solutions/ files.
406
- 7. **Agent-Native Gaps.** Surface agent-native-reviewer results. Omit section if no gaps found.
407
- 8. **Schema Drift Check.** If schema-drift-detector ran, summarize whether drift was found. If drift exists, list the unrelated schema objects and the required cleanup command. If clean, say so briefly.
408
- 9. **Deployment Notes.** If deployment-verification-agent ran, surface the key Go/No-Go items: blocking pre-deploy checks, the most important verification queries, rollback caveats, and monitoring focus areas. Keep the checklist actionable rather than dropping it into Coverage.
409
- 10. **Coverage.** Suppressed count, residual risks, testing gaps, failed/timed-out reviewers, and any intent uncertainty carried by non-interactive modes.
410
- 11. **Verdict.** Ready to merge / Ready with fixes / Not ready. Fix order if applicable.
485
+ 2. **Findings.** Rendered as pipe-delimited tables grouped by severity (`### P0 -- Critical`, `### P1 -- High`, `### P2 -- Moderate`, `### P3 -- Low`). Each finding row shows `#`, file, issue, reviewer(s), confidence, and synthesized route. Omit empty severity levels. Never render findings as freeform text blocks or numbered lists.
486
+ 3. **Requirements Completeness.** Include only when a plan was found in Stage 2b. For each requirement (R1, R2, etc.) and implementation unit in the plan, report whether corresponding work appears in the diff. Use a simple checklist: met / not addressed / partially addressed. Routing depends on `plan_source`:
487
+ - **`explicit`** (caller-provided or PR body): Flag unaddressed requirements as P1 findings with `autofix_class: manual`, `owner: downstream-resolver`. These enter the residual actionable queue and can become todos.
488
+ - **`inferred`** (auto-discovered): Flag unaddressed requirements as P3 findings with `autofix_class: advisory`, `owner: human`. These stay in the report only — no todos, no autonomous follow-up. An inferred plan match is a hint, not a contract.
489
+ Omit this section entirely when no plan was found; do not mention the absence of a plan.
490
+ 4. **Applied Fixes.** Include only if a fix phase ran in this invocation.
491
+ 5. **Residual Actionable Work.** Include when unresolved actionable findings were handed off or should be handed off.
492
+ 6. **Pre-existing.** Separate section, does not count toward verdict.
493
+ 7. **Learnings & Past Solutions.** Surface learnings-researcher results: if past solutions are relevant, flag them as "Known Pattern" with links to docs/solutions/ files.
494
+ 8. **Agent-Native Gaps.** Surface agent-native-reviewer results. Omit section if no gaps found.
495
+ 9. **Schema Drift Check.** If schema-drift-detector ran, summarize whether drift was found. If drift exists, list the unrelated schema objects and the required cleanup command. If clean, say so briefly.
496
+ 10. **Deployment Notes.** If deployment-verification-agent ran, surface the key Go/No-Go items: blocking pre-deploy checks, the most important verification queries, rollback caveats, and monitoring focus areas. Keep the checklist actionable rather than dropping it into Coverage.
497
+ 11. **Coverage.** Suppressed count, residual risks, testing gaps, failed/timed-out reviewers, and any intent uncertainty carried by non-interactive modes.
498
+ 12. **Verdict.** Ready to merge / Ready with fixes / Not ready. Fix order if applicable. When an `explicit` plan has unaddressed requirements, the verdict must reflect it — a PR that's code-clean but missing planned requirements is "Not ready" unless the omission is intentional. When an `inferred` plan has unaddressed requirements, note it in the verdict reasoning but do not block on it alone.
411
499
 
412
500
  Do not include time estimates.
413
501
 
502
+ **Format verification:** Before delivering the report, verify that the findings sections use pipe-delimited table rows (`| # | File | Issue | ... |`), not freeform text. If you catch yourself rendering findings as prose blocks separated by horizontal rules or bullet points, stop and reformat into tables.
503
+
504
+ ### Headless output format
505
+
506
+ In `mode:headless`, replace the interactive pipe-delimited table report with a structured text envelope. The envelope follows the same structural pattern as document-review's headless output (completion header, metadata block, findings grouped by autofix_class, trailing sections) while using ce:review's own section headings and per-finding fields.
507
+
508
+ ```
509
+ Code review complete (headless mode).
510
+
511
+ Scope: <scope-line>
512
+ Intent: <intent-summary>
513
+ Reviewers: <reviewer-list with conditional justifications>
514
+ Verdict: <Ready to merge | Ready with fixes | Not ready>
515
+ Artifact: .context/systematic/ce-review/<run-id>/
516
+
517
+ Applied N safe_auto fixes.
518
+
519
+ Gated-auto findings (concrete fix, changes behavior/contracts):
520
+
521
+ [P1][gated_auto -> downstream-resolver][needs-verification] File: <file:line> -- <title> (<reviewer>, confidence <N>)
522
+ Why: <why_it_matters>
523
+ Suggested fix: <suggested_fix or "none">
524
+ Evidence: <evidence[0]>
525
+ Evidence: <evidence[1]>
526
+
527
+ Manual findings (actionable, needs handoff):
528
+
529
+ [P1][manual -> downstream-resolver] File: <file:line> -- <title> (<reviewer>, confidence <N>)
530
+ Why: <why_it_matters>
531
+ Evidence: <evidence[0]>
532
+
533
+ Advisory findings (report-only):
534
+
535
+ [P2][advisory -> human] File: <file:line> -- <title> (<reviewer>, confidence <N>)
536
+ Why: <why_it_matters>
537
+
538
+ Pre-existing issues:
539
+ [P2][gated_auto -> downstream-resolver] File: <file:line> -- <title> (<reviewer>, confidence <N>)
540
+ Why: <why_it_matters>
541
+
542
+ Residual risks:
543
+ - <risk>
544
+
545
+ Learnings & Past Solutions:
546
+ - <learning>
547
+
548
+ Agent-Native Gaps:
549
+ - <gap description>
550
+
551
+ Schema Drift Check:
552
+ - <drift status>
553
+
554
+ Deployment Notes:
555
+ - <deployment note>
556
+
557
+ Testing gaps:
558
+ - <gap>
559
+
560
+ Coverage:
561
+ - Suppressed: <N> findings below 0.60 confidence (P0 at 0.50+ retained)
562
+ - Untracked files excluded: <file1>, <file2>
563
+ - Failed reviewers: <reviewer>
564
+
565
+ Review complete
566
+ ```
567
+
568
+ **Detail enrichment (headless only):** The headless envelope includes `Why:`, `Evidence:`, and `Suggested fix:` lines. After merge (Stage 5), read the per-agent artifact files from `.context/systematic/ce-review/{run_id}/` for only the findings that survived dedup and confidence gating.
569
+ - **Field tiers:** `Why:` and `Evidence:` are detail-tier -- load from per-agent artifact files. `Suggested fix:` is merge-tier -- use it directly from the compact return without artifact lookup.
570
+ - **Artifact matching:** For each surviving finding, look up its detail-tier fields in the artifact files of the contributing reviewers. Match on `file + line_bucket(line, +/-3)` (the same tolerance used in Stage 5 dedup) within each contributing reviewer's artifact. When multiple artifact entries fall within the line bucket, apply `normalize(title)` to both the merged finding's title and each candidate entry's title as a tie-breaker.
571
+ - **Reviewer order:** Try contributing reviewers in the order they appear in the merged finding's reviewer list; use the first match.
572
+ - **No-match fallback:** If no artifact file contains a match (all writes failed, or the finding was synthesized during merge), omit the `Why:` and `Evidence:` lines for that finding and note the gap in Coverage. The `Suggested fix:` line can still be populated from the compact return since it is merge-tier.
573
+
574
+ **Formatting rules:**
575
+ - The `[needs-verification]` marker appears only on findings where `requires_verification: true`.
576
+ - The `Artifact:` line gives callers the path to the full run artifact for machine-readable access to the complete findings schema. The text envelope is the primary handoff; the artifact is for debugging and full-fidelity access.
577
+ - Findings with `owner: release` appear in the Advisory section (they are operational/rollout items, not code fixes).
578
+ - Findings with `pre_existing: true` appear in the Pre-existing section regardless of autofix_class.
579
+ - The Verdict appears in the metadata header (deliberately reordered from the interactive format where it appears at the bottom) so programmatic callers get the verdict first.
580
+ - Omit any section with zero items.
581
+ - If all reviewers fail or time out, emit `Code review degraded (headless mode). Reason: 0 of N reviewers returned results.` followed by "Review complete".
582
+ - End with "Review complete" as the terminal signal so callers can detect completion.
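
One way to render a single finding in the headless envelope is sketched below. It follows the template lines shown earlier; the dictionary keys mirror the schema fields named in the text, while `reviewers` as a list, the lack of indentation on detail lines, and emitting `Suggested fix:` only for `gated_auto` findings are assumptions drawn from how the template reads.

```python
from typing import Dict, List


def format_headless_finding(finding: Dict) -> List[str]:
    """Render one merged finding as the text lines of the headless envelope."""
    tags = f"[{finding['severity']}][{finding['autofix_class']} -> {finding['owner']}]"
    if finding.get("requires_verification"):
        tags += "[needs-verification]"  # marker only when requires_verification is true

    reviewers = ", ".join(finding["reviewers"])
    lines = [
        f"{tags} File: {finding['file']}:{finding['line']} -- {finding['title']} "
        f"({reviewers}, confidence {finding['confidence']})"
    ]
    # Detail-tier fields may be missing if no artifact matched; omit those lines.
    if finding.get("why_it_matters"):
        lines.append(f"Why: {finding['why_it_matters']}")
    if finding["autofix_class"] == "gated_auto":
        lines.append(f"Suggested fix: {finding.get('suggested_fix') or 'none'}")
    for item in finding.get("evidence", []):
        lines.append(f"Evidence: {item}")
    return lines
```

Section grouping (gated-auto, manual, advisory, pre-existing) and the "omit any section with zero items" rule would sit in a caller around this helper.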
583
+
414
584
  ## Quality Gates
415
585
 
416
586
  Before delivering the review, verify:
@@ -446,17 +616,26 @@ After presenting findings and verdict (Stage 6), route the next steps by mode. R
446
616
 
447
617
  **Interactive mode**
448
618
 
449
- - Ask a single policy question only when actionable work exists.
450
- - Recommended default:
619
+ - Apply `safe_auto -> review-fixer` findings automatically without asking. These are safe by definition.
620
+ - Ask a policy question **using the platform's blocking question tool** (`question` in OpenCode, `request_user_input` in Codex, `ask_user` in Gemini) only when `gated_auto` or `manual` findings remain after safe fixes. Do not replace with a conversational open-ended question. Adapt the options to match what actually remains:
451
621
 
622
+ **When `gated_auto` findings are present** (with or without `manual`):
452
623
  ```
453
- What should I do with the actionable findings?
454
- 1. Apply safe_auto fixes and leave the rest as residual work (Recommended)
455
- 2. Apply safe_auto fixes only
456
- 3. Review report only
624
+ Safe fixes have been applied. What should I do with the remaining findings?
625
+ 1. Review and approve specific gated fixes (Recommended)
626
+ 2. Leave as residual work
627
+ 3. Report only -- no further action
457
628
  ```
458
629
 
459
- - Tailor the prompt to the actual action sets. If the fixer queue is empty, do not offer "Apply safe_auto fixes" options. Ask whether to externalize the residual actionable work or keep the review report-only instead.
630
+ **When only `manual` findings remain** (no `gated_auto`):
631
+ ```
632
+ Safe fixes have been applied. The remaining findings need manual resolution. What should I do?
633
+ 1. Leave as residual work (Recommended)
634
+ 2. Report only -- no further action
635
+ ```
636
+
637
+ If no blocking question tool is available, present the applicable numbered options as text and wait for the user's selection before proceeding.
638
+ - If no `gated_auto` or `manual` findings remain after safe fixes, skip the policy question entirely — report what was fixed and proceed to next steps.
460
639
  - Only include `gated_auto` findings in the fixer queue after the user explicitly approves the specific items. Do not widen the queue based on severity alone.
461
640
 
462
641
  **Autofix mode**
@@ -473,6 +652,15 @@ After presenting findings and verdict (Stage 6), route the next steps by mode. R
473
652
  - Do not create residual todos or `.context` artifacts.
474
653
  - Stop after Stage 6. Everything remains in the report.
475
654
 
655
+ **Headless mode**
656
+
657
+ - Ask no questions.
658
+ - Apply only the `safe_auto -> review-fixer` queue in a single pass. Do not enter the bounded re-review loop (Step 3). Spawn one fixer subagent, apply fixes, then proceed directly to Step 4.
659
+ - Leave `gated_auto`, `manual`, `human`, and `release` items unresolved — they appear in the structured text output.
660
+ - Output the headless output envelope (see Stage 6) instead of the interactive report.
661
+ - Write a run artifact (Step 4) but do not create todo files.
662
+ - Stop after the structured text output and "Review complete" signal. No commit/push/PR.
663
+
476
664
  #### Step 3: Apply fixes with one fixer and bounded rounds
477
665
 
478
666
  - Spawn exactly one fixer subagent for the current fixer queue in the current checkout. That fixer applies all approved changes and runs the relevant targeted tests in one pass against a consistent tree.
@@ -484,11 +672,23 @@ After presenting findings and verdict (Stage 6), route the next steps by mode. R
484
672
 
485
673
  #### Step 4: Emit artifacts and downstream handoff
486
674
 
487
- - In interactive and autofix modes, write a per-run artifact under `.context/systematic/ce-review/<run-id>/` containing:
488
- - synthesized findings
675
+ - In interactive, autofix, and headless modes, write a per-run artifact under `.context/systematic/ce-review/<run-id>/` containing:
676
+ - synthesized findings (merged output from Stage 5)
489
677
  - applied fixes
490
678
  - residual actionable work
491
679
  - advisory-only outputs
680
+ Per-agent full-detail JSON files (`{reviewer_name}.json`) are already present in this directory from Stage 4 dispatch.
681
+ - Also write `metadata.json` alongside the findings so downstream skills (e.g., `ce:polish-beta`) can verify the artifact matches the current branch and HEAD. Minimum fields:
682
+ ```json
683
+ {
684
+ "run_id": "<run-id>",
685
+ "branch": "<git branch --show-current at dispatch time>",
686
+ "head_sha": "<git rev-parse HEAD at dispatch time>",
687
+ "verdict": "<Ready to merge | Ready with fixes | Not ready>",
688
+ "completed_at": "<ISO 8601 UTC timestamp>"
689
+ }
690
+ ```
691
+ Capture `branch` and `head_sha` at dispatch time (before any autofixes land), and write the file after the verdict is finalized. This file is additive -- pre-existing artifacts that predate this field are still valid, and downstream skills fall back to file mtime when it is missing.
492
692
  - In autofix mode, create durable todo files only for unresolved actionable findings whose final owner is `downstream-resolver`. Load the `todo-create` skill for the canonical directory path, naming convention, YAML frontmatter structure, and template. Each todo should map the finding's severity to the todo priority (`P0`/`P1` -> `p1`, `P2` -> `p2`, `P3` -> `p3`) and set `status: ready` since these findings have already been triaged by synthesis.
493
693
  - Do not create todos for `advisory` findings, `owner: human`, `owner: release`, or protected-artifact cleanup suggestions.
494
694
  - If only advisory outputs remain, create no todos.
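
The `metadata.json` write described in Step 4 could look roughly like this. The helper name and signature are hypothetical; the field names come from the JSON shown in that step. `branch` and `head_sha` are passed in rather than read here, on the assumption that the caller captured them at dispatch time (for example from `git branch --show-current` and `git rev-parse HEAD`) before any autofixes landed.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def write_metadata(run_dir: str, run_id: str, branch: str, head_sha: str, verdict: str) -> Path:
    """Write metadata.json into the run directory after the verdict is final."""
    metadata = {
        "run_id": run_id,
        "branch": branch,        # captured at dispatch time by the caller
        "head_sha": head_sha,    # captured at dispatch time by the caller
        "verdict": verdict,
        # ISO 8601 UTC timestamp taken at write time
        "completed_at": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
    }
    path = Path(run_dir) / "metadata.json"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(metadata, indent=2) + "\n", encoding="utf-8")
    return path
```

Separating capture from write keeps the recorded `head_sha` pinned to the reviewed commit even when the fixer modifies the tree afterwards.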
@@ -512,9 +712,32 @@ After presenting findings and verdict (Stage 6), route the next steps by mode. R
512
712
  If "Create a PR": first publish the branch with `git push --set-upstream origin HEAD`, then use `gh pr create` with a title and summary derived from the branch changes.
513
713
  If "Push fixes": push the branch with `git push` to update the existing PR.
514
714
 
515
- **Autofix and report-only modes:** stop after the report, artifact emission, and residual-work handoff. Do not commit, push, or create a PR.
715
+ **Autofix, report-only, and headless modes:** stop after the report, artifact emission, and residual-work handoff. Do not commit, push, or create a PR.
516
716
 
517
717
  ## Fallback
518
718
 
519
719
  If the platform doesn't support parallel sub-agents, run reviewers sequentially. Everything else (stages, output format, merge pipeline) stays the same.
520
720
 
721
+ ---
722
+
723
+ ## Included References
724
+
725
+ ### Persona Catalog
726
+
727
+ @./references/persona-catalog.md
728
+
729
+ ### Subagent Template
730
+
731
+ @./references/subagent-template.md
732
+
733
+ ### Diff Scope Rules
734
+
735
+ @./references/diff-scope.md
736
+
737
+ ### Findings Schema
738
+
739
+ @./references/findings-schema.json
740
+
741
+ ### Review Output Template
742
+
743
+ @./references/review-output-template.md