npm - planflow-ai - Versions diffs - 1.3.0 → 1.3.2 - Mend

planflow-ai 1.3.0 → 1.3.2

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (26)

package/.claude/commands/brainstorm.md +2 -2
package/.claude/commands/heartbeat.md +1 -1
package/.claude/commands/learn.md +1 -1
package/.claude/commands/{brain.md → note.md} +12 -12
package/.claude/commands/review-code.md +53 -0
package/.claude/commands/review-pr.md +53 -0
package/.claude/resources/core/_index.md +50 -2
package/.claude/resources/core/resource-capture.md +1 -1
package/.claude/resources/core/review-adaptive-depth.md +217 -0
package/.claude/resources/core/review-multi-agent.md +289 -0
package/.claude/resources/core/review-severity-ranking.md +149 -0
package/.claude/resources/core/review-verification.md +158 -0
package/.claude/resources/patterns/review-code-templates.md +315 -2
package/.claude/resources/skills/_index.md +9 -1
package/.claude/resources/skills/brain-skill.md +3 -3
package/.claude/resources/skills/review-code-skill.md +73 -0
package/.claude/resources/skills/review-pr-skill.md +58 -0
package/README.md +38 -3
package/dist/cli/handlers/claude.js +20 -12
package/dist/cli/handlers/claude.js.map +1 -1
package/package.json +1 -1
package/rules/skills/brain-skill.mdc +4 -4
package/skills/plan-flow/SKILL.md +1 -1
package/skills/plan-flow/brain/SKILL.md +1 -1
package/templates/shared/AGENTS.md.template +1 -1
package/templates/shared/CLAUDE.md.template +1 -1

package/.claude/resources/patterns/review-code-templates.md CHANGED

@@ -61,15 +61,48 @@ For each changed file, similar implementations in the codebase:
 ---
+## Verification Summary
+| Metric | Count |
+|--------|-------|
+| Initial findings | {N} |
+| Confirmed | {N} |
+| Likely (needs human judgment) | {N} |
+| Dismissed (false positives filtered) | {N} |
+| **False positive rate** | **{N}%** |
+---
+## Executive Summary
+> Only include when total findings ≥ 5. Omit for smaller reviews.
+**Risk level**: {Low | Medium | High}
+**Top issues to address**:
+1. {Finding title} ({Severity}) — `{file}:{line}`
+2. {Finding title} ({Severity}) — `{file}:{line}`
+3. {Finding title} ({Severity}) — `{file}:{line}`
+---
 ## Findings
-### Finding 1: {Finding Name}
+> Findings are grouped by severity (Critical → Major → Minor → Suggestion).
+> For findings classified as "Likely" during verification, prepend `[Likely]` to the heading.
+> Omit empty severity sections.
+> Related findings across files may be grouped — see review-severity-ranking.md.
+### Critical Findings
+#### Finding 1: {Finding Name}
 | Field          | Value                                            |
 | -------------- | ------------------------------------------------ |
 | File           | `{file_path}`                                    |
 | Line           | {line_number}                                    |
-| Severity       | {Critical/Major/Minor/Suggestion}                |
+| Severity       | Critical                                         |
 | Fix Complexity | {X/10} - {Level}                                 |
 | Pattern        | {Reference to pattern from rules, if applicable} |
@@ -84,6 +117,24 @@ See `{reference_file_path}` for how this is handled elsewhere in the codebase.
 // Suggested code improvement
 \`\`\`
+### Major Findings
+#### Finding N: {Finding Name}
+> Same format as Critical Findings.
+### Minor Findings
+#### Finding N: {Finding Name}
+> Same format as Critical Findings.
+### Suggestions
+#### Finding N: {Finding Name}
+> Same format as Critical Findings.
 ---
 ## Pattern Conflicts
@@ -184,6 +235,268 @@ List any particularly well-written code or good practices observed:
 ---
+## Lightweight Review Template (< 50 lines)
+Use this compact template when the changeset is under 50 lines.
+```markdown
+# Local Code Review: {Description}
+**Project**: [[{project-name}]]
+## Review Information
+| Field          | Value                 |
+| -------------- | --------------------- |
+| Date           | {date}                |
+| Files Reviewed | {number_of_files}     |
+| Scope          | {all/staged/unstaged} |
+| Language(s)    | {detected_languages}  |
+---
+## Review Summary
+| Metric | Value |
+|--------|-------|
+| **Review Mode** | Lightweight (< 50 lines) |
+| **Total Findings** | {count} |
+| **Status** | {LGTM / Needs Changes} |
+---
+## Findings
+> Only present if issues were found. Skip this section entirely for LGTM reviews.
+### Finding 1: {Finding Name}
+| Field          | Value                                            |
+| -------------- | ------------------------------------------------ |
+| File           | `{file_path}`                                    |
+| Line           | {line_number}                                    |
+| Severity       | {Critical/Major/Minor}                           |
+| Fix Complexity | {X/10} - {Level}                                 |
+**Description**:
+{Detailed explanation of the issue found}
+**Suggested Fix**:
+\`\`\`{language}
+// Suggested code improvement
+\`\`\`
+---
+## Positive Highlights
+- {Highlight 1}
+- {Highlight 2}
+- {Highlight 3}
+---
+## Commit Readiness
+| Status | {Ready to Commit / Needs Changes} |
+| ------ | --------------------------------- |
+| Reason | {Brief explanation}               |
+```
+> **Note**: Lightweight reviews skip Reference Implementations, Pattern Conflicts, Rule Update Recommendations, and Verification Summary sections.
+---
+## Deep Review Template (500+ lines)
+Use this template for large changesets. Findings are grouped by severity instead of by file, and an executive summary is prepended.
+```markdown
+# Local Code Review: {Description}
+**Project**: [[{project-name}]]
+## Review Information
+| Field          | Value                 |
+| -------------- | --------------------- |
+| Date           | {date}                |
+| Files Reviewed | {number_of_files}     |
+| Scope          | {all/staged/unstaged} |
+| Language(s)    | {detected_languages}  |
+---
+## Executive Summary
+### Files Changed by Category
+| Category | Files | Lines Changed |
+|----------|-------|--------------|
+| Core Logic | {N} | +{add}/-{del} |
+| Infrastructure | {N} | +{add}/-{del} |
+| UI/Presentation | {N} | +{add}/-{del} |
+| Tests | {N} | +{add}/-{del} |
+### Risk Assessment
+**Overall Risk**: {Low | Medium | High}
+{1-2 sentence justification based on scope, categories affected, and finding severity distribution}
+### Top 3 Findings
+1. **[{Severity}]** {Finding title} — {one-line description} (`{file}:{line}`)
+2. **[{Severity}]** {Finding title} — {one-line description} (`{file}:{line}`)
+3. **[{Severity}]** {Finding title} — {one-line description} (`{file}:{line}`)
+---
+## Review Agents
+> Multi-agent parallel review section. Only present in Deep mode (500+ lines).
+| Agent | Model | Findings | After Dedup |
+|-------|-------|----------|-------------|
+| Security | sonnet | {N} | {N} |
+| Logic & Bugs | sonnet | {N} | {N} |
+| Performance | sonnet | {N} | {N} |
+| Pattern Compliance | haiku | {N} | {N} |
+| **Total** | | **{N}** | **{N}** |
+Duplicates removed: {N}
+---
+## Changed Files
+| File          | Category       | Status     | Lines Changed |
+| ------------- | -------------- | ---------- | ------------- |
+| `{file_path}` | {Core/Infra/UI/Tests} | {modified} | +{add}/-{del} |
+| ...           | ...            | ...        | ...           |
+---
+## Reference Implementations Found
+> Same format as standard template — per changed file, similar implementations in the codebase.
+---
+## Review Summary
+| Metric                | Value              |
+| --------------------- | ------------------ |
+| **Review Mode**       | Deep (500+ lines)  |
+| **Total Findings**    | {count}            |
+| Critical              | {critical_count}   |
+| Major                 | {major_count}      |
+| Minor                 | {minor_count}      |
+| Suggestion            | {suggestion_count} |
+| **Pattern Conflicts** | {conflict_count}   |
+| **Total Fix Effort**  | {sum_of_scores}/X  |
+---
+## Verification Summary
+| Metric | Count |
+|--------|-------|
+| Initial findings | {N} |
+| Confirmed | {N} |
+| Likely (needs human judgment) | {N} |
+| Dismissed (false positives filtered) | {N} |
+| **False positive rate** | **{N}%** |
+---
+## Critical Findings
+### Finding 1: {Finding Name}
+| Field          | Value                                            |
+| -------------- | ------------------------------------------------ |
+| File           | `{file_path}`                                    |
+| Line           | {line_number}                                    |
+| Severity       | Critical                                         |
+| Fix Complexity | {X/10} - {Level}                                 |
+| Category       | {Core Logic/Infrastructure/UI/Tests}             |
+| Pattern        | {Reference to pattern from rules, if applicable} |
+**Description**:
+{Detailed explanation}
+**Reference Implementation**:
+See `{reference_file_path}` for how this is handled elsewhere.
+**Suggested Fix**:
+\`\`\`{language}
+// Suggested code improvement
+\`\`\`
+---
+## Major Findings
+### Finding N: {Finding Name}
+> Same format as Critical Findings.
+---
+## Minor Findings
+### Finding N: {Finding Name}
+> Same format as Critical Findings.
+---
+## Suggestions
+### Finding N: {Finding Name}
+> Same format as Critical Findings.
+---
+## Pattern Conflicts
+> Same format as standard template.
+---
+## Rule Update Recommendations
+> Same format as standard template.
+---
+## Positive Highlights
+- {Highlight 1}
+- {Highlight 2}
+---
+## Commit Readiness
+| Status | {Ready to Commit/Needs Changes/Needs Discussion} |
+| ------ | ------------------------------------------------ |
+| Reason | {Brief explanation}                              |
+### Before Committing
+- [ ] Address all Critical findings
+- [ ] Address all Major findings
+- [ ] Review Pattern Conflicts and decide on resolution
+- [ ] Update rules files if new patterns should be documented
+```
+> **Note**: Deep reviews always include the Verification Summary, group findings by severity, and prepend an Executive Summary with risk assessment. The Changed Files table includes a Category column.
+---
 ## Example Output
 ### Pattern Conflict Example
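
The Verification Summary table added in this file can be filled in from the per-finding classifications. The following is a minimal sketch, not the package's implementation: the function name `verification_summary` is hypothetical, and the false positive rate formula (dismissed findings divided by initial findings) is an assumption, since the template does not define it.

```python
from collections import Counter


def verification_summary(classifications):
    """Tally verification outcomes into the rows of the Verification Summary table.

    `classifications` is a list of strings, each one of
    "Confirmed", "Likely", or "Dismissed".
    """
    counts = Counter(classifications)
    initial = len(classifications)
    dismissed = counts["Dismissed"]
    # Assumption: false positive rate = dismissed / initial findings, as a percentage.
    rate = round(100 * dismissed / initial) if initial else 0
    return {
        "Initial findings": initial,
        "Confirmed": counts["Confirmed"],
        "Likely": counts["Likely"],
        "Dismissed": dismissed,
        "False positive rate": f"{rate}%",
    }


summary = verification_summary(["Confirmed", "Likely", "Dismissed", "Confirmed"])
```

With four initial findings and one dismissed, the sketch reports a rate of 25% under the assumed formula.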

package/.claude/resources/skills/_index.md CHANGED

@@ -15,6 +15,14 @@ Skills implement the workflow logic for commands. Each skill orchestrates a spec
 > **Note**: The execute-plan skill supports **Model Routing** — automatic model selection per phase based on complexity scores (0-3 → haiku, 4-5 → sonnet, 6-10 → opus). Controlled by `model_routing` in `flow/.flowconfig`. See `.claude/resources/core/model-routing.md` for full rules.
 >
 > **Note**: The discovery skill also includes **Design Awareness**. During discovery, the LLM asks whether the feature involves UI work and captures structured design tokens (colors, typography, spacing) into a `## Design Context` section. During execution, these tokens are auto-injected into UI phase prompts. See `.claude/resources/core/design-awareness.md` for full rules.
+>
+> **Note**: The review-code and review-pr skills include a **Verification Pass**. After initial analysis, each finding is re-examined against surrounding code context and classified as Confirmed, Likely, or Dismissed. False positives are filtered before output. See `.claude/resources/core/review-verification.md` for full rules.
+>
+> **Note**: The review-code and review-pr skills include **Multi-Agent Parallel Review** for Deep mode (500+ lines). Four specialized subagents (security, logic, performance, patterns) run in parallel, and the coordinator deduplicates, verifies, and ranks the merged results. See `.claude/resources/core/review-multi-agent.md` for full rules.
+>
+> **Note**: The review-code and review-pr skills include **Severity Re-Ranking**. After verification, findings are re-ranked by severity → confidence → fix complexity, related findings across files are grouped, and an executive summary is added when ≥ 5 findings. See `.claude/resources/core/review-severity-ranking.md` for full rules.
+>
+> **Note**: The review-code and review-pr skills include **Adaptive Depth**. Review depth scales automatically based on changeset size: < 50 lines → Lightweight (quick-scan), 50–500 → Standard (no change), 500+ → Deep (multi-pass with executive summary). See `.claude/resources/core/review-adaptive-depth.md` for full rules.
 ---
@@ -188,7 +196,7 @@ Skills implement the workflow logic for commands. Each skill orchestrates a spec
 | Command | Skill | Key Codes |
 |---------|-------|-----------|
 | `/brainstorm` | brainstorm-skill | SKL-BS-1 through SKL-BS-3 |
-| `/brain` | brain-skill | SKL-BR-1 through SKL-BR-3 |
+| `/note` | brain-skill | SKL-BR-1 through SKL-BR-3 |
 | `/flow cost` | flow-cost | SKL-COST-1 through SKL-COST-4 |
 | `/learn` | learn-skill | SKL-LRN-1 through SKL-LRN-4 |
 | `/discovery-plan` | discovery-skill | SKL-DIS-1 through SKL-DIS-4 |
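
The Severity Re-Ranking note above describes a three-level ordering: severity, then confidence, then fix complexity. A minimal sketch of that ordering as a sort key follows. The `Finding` dataclass, its field names, and the `rerank` function are hypothetical and used only for illustration. Placing lower fix complexity first follows the wording of the review skills ("lower first").

```python
from dataclasses import dataclass

# Rank tables: a lower number sorts earlier.
SEVERITY_ORDER = {"Critical": 0, "Major": 1, "Minor": 2, "Suggestion": 3}
CONFIDENCE_ORDER = {"Confirmed": 0, "Likely": 1}


@dataclass
class Finding:
    title: str
    severity: str        # "Critical", "Major", "Minor", or "Suggestion"
    confidence: str      # "Confirmed" or "Likely"
    fix_complexity: int  # the X in "X/10"


def rerank(findings):
    """Sort by severity, then confidence, then fix complexity (lower first)."""
    return sorted(
        findings,
        key=lambda f: (
            SEVERITY_ORDER[f.severity],
            CONFIDENCE_ORDER[f.confidence],
            f.fix_complexity,
        ),
    )


ranked = rerank([
    Finding("Unused import", "Minor", "Confirmed", 1),
    Finding("SQL injection", "Critical", "Likely", 4),
    Finding("Missing null check", "Major", "Confirmed", 2),
    Finding("Hardcoded secret", "Critical", "Confirmed", 3),
])
titles = [f.title for f in ranked]
```

Because Python's `sorted` is stable, findings that tie on all three keys keep their original relative order.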

package/.claude/resources/skills/brain-skill.md CHANGED

@@ -49,7 +49,7 @@ This skill **only writes to `flow/brain/`**. It does NOT:
 ### Free-Text Mode (Default)
-When user runs `/brain {free text}`:
+When user runs `/note {free text}`:
 1. **Parse input** - Read the user's unstructured text
 2. **Extract entities** - Identify:
@@ -63,9 +63,9 @@ When user runs `/brain {free text}`:
 4. **Write** - Create/update the appropriate brain file with `[[wiki-links]]`
 5. **Update index** - Update `flow/brain/index.md` if needed (new feature, error, or decision)
-### Guided Mode (`/brain -guided`)
+### Guided Mode (`/note -guided`)
-When user runs `/brain -guided`:
+When user runs `/note -guided`:
 1. **Ask structured questions** using `AskUserQuestion`:

package/.claude/resources/skills/review-code-skill.md CHANGED

@@ -75,6 +75,26 @@ This skill is **strictly read-only analysis**. The review process:
 4. If `file_path` is provided, filter to only those files
 5. If `scope` is provided, filter to staged or unstaged only
+### Step 1b: Determine Review Depth
+Determine the review mode based on changeset size. See `.claude/resources/core/review-adaptive-depth.md` for full rules.
+1. Count total lines changed (additions + deletions) from `git diff --stat`
+2. Exclude lock files, generated files, and pure whitespace changes from the count
+3. Classify into tier:
+   - **Small** (< 50 lines) → **Lightweight** mode
+   - **Medium** (50–500 lines) → **Standard** mode (no behavior change)
+   - **Large** (500+ lines) → **Deep** mode
+4. Display: `**Review mode**: {Lightweight|Standard|Deep} ({N} lines changed across {M} files)`
+**If Lightweight**: Skip Steps 2–5 (pattern loading, similar implementations, full analysis, pattern conflicts). Perform abbreviated analysis checking ONLY security issues, obvious logic bugs, and breaking changes. Skip verification pass (Step 5b). Generate output using the lightweight template from `review-code-templates.md`.
+**If Deep**: Activate multi-agent parallel review. See `.claude/resources/core/review-multi-agent.md`. Categorize files by type (Core Logic, Infrastructure, UI/Presentation, Tests), then spawn 4 specialized subagents in parallel (security, logic & bugs, performance, pattern compliance). The coordinator collects results, deduplicates overlapping findings, then proceeds to Step 5b (verification), Step 5c (re-ranking), Step 6b (pattern review), and Step 6 (output using deep template). Steps 2–5 are handled by subagents instead of the main agent.
+**If Standard**: Proceed with all steps as defined below (no behavior change).
+---
 ### Step 2: Load Review Patterns
 1. Read `.claude/resources/patterns/review-pr-patterns.md` for general review guidelines
@@ -160,6 +180,42 @@ When a pattern conflict is found:
 4. **Buffer patterns for capture**: Silently append identified patterns (both good patterns and anti-patterns found during review) to `flow/resources/pending-patterns.md`. See `.claude/resources/core/pattern-capture.md` for buffer format and capture triggers.
+---
+### Step 5b: Verify Findings
+After collecting all findings from Steps 4 and 5, run a second-pass verification to filter false positives. See `.claude/resources/core/review-verification.md` for full logic.
+**For each finding**:
+1. **Re-read surrounding context** — Read 15 lines above and 15 below the flagged line
+2. **Ask 3 standard questions**:
+   - Is this actually a bug, or does surrounding code handle it?
+   - Is there a test that covers this case?
+   - Would a senior developer agree this is a real issue?
+3. **Ask 1 category-specific question** (security → exploit path, logic → reachability, performance → hot path, etc.)
+4. **Classify**:
+   - **Confirmed** — Clear issue, 2+ standard questions support it. Keep as-is.
+   - **Likely** — Ambiguous, 1 question supports. Tag with `[Likely]` in output.
+   - **Dismissed** — False positive, all questions fail. Remove from output.
+**Rules**:
+- When in doubt between Likely and Dismissed → choose **Likely**
+- NEVER dismiss a Critical severity finding (downgrade to Likely at most)
+**After verification**: Remove Dismissed findings, tag Likely findings, generate Verification Summary stats.
+### Step 5c: Re-Rank and Group Findings
+After verification, re-rank all remaining findings by impact. See `.claude/resources/core/review-severity-ranking.md` for full rules.
+1. **Sort findings**: Severity (Critical → Major → Minor → Suggestion), then Confidence (Confirmed → Likely), then Fix Complexity (lower first)
+2. **Group related findings**: Scan for same issue type across files, same root cause, or causal chains. Only group when genuinely related (≥ 2 findings). Skip grouping for small reviews (1-3 findings).
+3. **Executive summary**: If total findings ≥ 5 (standard mode) or always (deep mode), prepend an executive summary with risk level and top 3 findings. Skip for lightweight mode.
+4. **Structure output by severity**: Use severity-grouped sections (Critical → Major → Minor → Suggestions) instead of per-file ordering. Omit empty severity sections.
+---
 ### Step 6b: Pattern Review
 After analysis but before generating the review document, run the pattern review protocol:
@@ -175,6 +231,8 @@ See `.claude/resources/core/pattern-capture.md` for the full end-of-skill review
 ### Step 6: Generate Review Document
+**Important**: Only include Confirmed and Likely findings in the output. Dismissed findings are excluded. Add the Verification Summary section after the Review Summary.
 Create a markdown file in `flow/reviewed-code/` with the naming convention:
 ```
@@ -317,3 +375,18 @@ After running this command:
 5. **Commit changes** once review concerns are addressed
 > The goal is not just to review current changes, but to **improve the codebase patterns over time** by documenting good patterns and preventing anti-patterns from spreading.
+### Validation Checklist
+- [ ] All changed files analyzed
+- [ ] Forbidden patterns checked
+- [ ] Allowed patterns verified
+- [ ] Similar implementations searched
+- [ ] Pattern conflicts documented
+- [ ] Verification pass completed — all findings classified
+- [ ] Dismissed findings removed from output
+- [ ] Likely findings tagged with `[Likely]`
+- [ ] Verification Summary section included
+- [ ] Findings sorted by severity (Critical → Major → Minor → Suggestion)
+- [ ] Related findings grouped when applicable (≥ 2 related)
+- [ ] Executive summary included when ≥ 5 findings (standard) or always (deep)
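
The classification rules in Step 5b can be read as a small decision function. The sketch below is one interpretation, not the package's logic: `classify_finding` and its parameters are hypothetical names, and it counts only the three standard questions because the step does not say how the category-specific question affects the outcome.

```python
def classify_finding(severity: str, supporting_answers: int) -> str:
    """Classify a finding from the number of standard questions that support it.

    `supporting_answers` is how many of the 3 standard questions (0 to 3)
    indicate the finding is a real issue.
    """
    if not 0 <= supporting_answers <= 3:
        raise ValueError("supporting_answers must be between 0 and 3")
    if supporting_answers >= 2:
        return "Confirmed"
    if supporting_answers == 1:
        return "Likely"
    # All questions failed. A Critical finding is never dismissed,
    # so it is downgraded to Likely at most.
    if severity == "Critical":
        return "Likely"
    return "Dismissed"


results = [
    classify_finding("Major", 3),
    classify_finding("Minor", 1),
    classify_finding("Minor", 0),
    classify_finding("Critical", 0),
]
```

The rule "when in doubt between Likely and Dismissed, choose Likely" is a judgment call for the reviewer and is not modeled here.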

package/.claude/resources/skills/review-pr-skill.md CHANGED

@@ -162,6 +162,26 @@ After successful authentication, proceed to fetch PR information.
 2. Extract the PR title, description, and list of changed files
 3. Identify the primary language(s) used in the PR
+### Step 1b: Determine Review Depth
+Determine the review mode based on changeset size. See `.claude/resources/core/review-adaptive-depth.md` for full rules.
+1. Count total lines changed (additions + deletions) from `gh pr diff --stat` or Azure DevOps diff API
+2. Exclude lock files, generated files, and pure whitespace changes from the count
+3. Classify into tier:
+   - **Small** (< 50 lines) → **Lightweight** mode
+   - **Medium** (50–500 lines) → **Standard** mode (no behavior change)
+   - **Large** (500+ lines) → **Deep** mode
+4. Display: `**Review mode**: {Lightweight|Standard|Deep} ({N} lines changed across {M} files)`
+**If Lightweight**: Skip Steps 2–3 (pattern loading, full analysis). Perform abbreviated analysis checking ONLY security issues, obvious logic bugs, and breaking changes. Skip verification pass (Step 3b). Generate compact output with no Pattern Conflicts or Reference Implementations sections.
+**If Deep**: Activate multi-agent parallel review. See `.claude/resources/core/review-multi-agent.md`. Categorize files by type (Core Logic, Infrastructure, UI/Presentation, Tests), then spawn 4 specialized subagents in parallel (security, logic & bugs, performance, pattern compliance). The coordinator collects results, deduplicates overlapping findings, then proceeds to Step 3b (verification), Step 3c (re-ranking), and Step 4 (output using deep template with severity grouping and executive summary). Steps 2–3 are handled by subagents instead of the main agent.
+**If Standard**: Proceed with all steps as defined below (no behavior change).
+---
 ### Step 2: Load Review Patterns
 1. Read `.claude/resources/patterns/review-pr-patterns.md` for general review guidelines
@@ -182,8 +202,46 @@ For each file in the PR:
 4. Apply language-specific checks
 5. Identify security, performance, and maintainability concerns
+---
+### Step 3b: Verify Findings
+After collecting all findings from Step 3, run a second-pass verification to filter false positives. See `.claude/resources/core/review-verification.md` for full logic.
+**For each finding**:
+1. **Re-read surrounding context** — Read 15 lines above and 15 below the flagged line
+2. **Ask 3 standard questions**:
+   - Is this actually a bug, or does surrounding code handle it?
+   - Is there a test that covers this case?
+   - Would a senior developer agree this is a real issue?
+3. **Ask 1 category-specific question** (security → exploit path, logic → reachability, performance → hot path, etc.)
+4. **Classify**:
+   - **Confirmed** — Clear issue, 2+ standard questions support it. Keep as-is.
+   - **Likely** — Ambiguous, 1 question supports. Tag with `[Likely]` in output.
+   - **Dismissed** — False positive, all questions fail. Remove from output.
+**Rules**:
+- When in doubt between Likely and Dismissed → choose **Likely**
+- NEVER dismiss a Critical severity finding (downgrade to Likely at most)
+**After verification**: Remove Dismissed findings, tag Likely findings, generate Verification Summary stats.
+### Step 3c: Re-Rank and Group Findings
+After verification, re-rank all remaining findings by impact. See `.claude/resources/core/review-severity-ranking.md` for full rules.
+1. **Sort findings**: Severity (Critical → Major → Minor → Suggestion), then Confidence (Confirmed → Likely), then Fix Complexity (lower first)
+2. **Group related findings**: Scan for same issue type across files, same root cause, or causal chains. Only group when genuinely related (≥ 2 findings). Skip grouping for small reviews (1-3 findings).
+3. **Executive summary**: If total findings ≥ 5 (standard mode) or always (deep mode), prepend an executive summary with risk level and top 3 findings. Skip for lightweight mode.
+4. **Structure output by severity**: Use severity-grouped sections (Critical → Major → Minor → Suggestions) instead of per-file ordering. Omit empty severity sections.
+---
 ### Step 4: Generate or Update Review Document
+**Important**: Only include Confirmed and Likely findings in the output. Dismissed findings are excluded. Add the Verification Summary section after the Review Summary.
 **Check for existing review file** in `flow/reviewed-pr/` before creating a new one.
 #### If reviewing the same PR again:
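
The tier selection in Step 1b and the executive summary rule in Step 3c can be sketched as two small functions. The names `review_mode`, `needs_executive_summary`, and `mode_banner` are hypothetical. The diff gives the tiers as "50–500" and "500+", which overlap at exactly 500 lines; this sketch assumes 500 falls into Deep mode.

```python
def review_mode(lines_changed: int) -> str:
    """Map a changed-line count (additions + deletions) to a review mode."""
    if lines_changed < 50:
        return "Lightweight"
    # Assumption: exactly 500 lines is treated as Deep ("500+" read as >= 500).
    if lines_changed < 500:
        return "Standard"
    return "Deep"


def needs_executive_summary(mode: str, total_findings: int) -> bool:
    """Deep always gets a summary, Lightweight never, Standard at 5+ findings."""
    if mode == "Lightweight":
        return False
    if mode == "Deep":
        return True
    return total_findings >= 5


def mode_banner(lines_changed: int, files_changed: int) -> str:
    """Render the display line described in Step 1b."""
    mode = review_mode(lines_changed)
    return f"**Review mode**: {mode} ({lines_changed} lines changed across {files_changed} files)"


banner = mode_banner(620, 14)
```

The line count passed in is expected to already exclude lock files, generated files, and pure whitespace changes, as Step 1b requires.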

package/README.md CHANGED

@@ -75,11 +75,11 @@ Installs for Claude Code, Cursor, OpenClaw, and Codex CLI simultaneously.
 | `/create-plan` | Create implementation plan with phases |
 | `/execute-plan` | Execute plan phases with verification |
 | `/create-contract` | Create integration contract from API docs |
-| `/review-code` | Review local uncommitted changes |
-| `/review-pr` | Review a Pull Request |
+| `/review-code` | Review local uncommitted changes (adaptive depth + multi-agent) |
+| `/review-pr` | Review a Pull Request (adaptive depth + multi-agent) |
 | `/write-tests` | Generate tests for coverage target |
 | `/flow` | Configure plan-flow settings (autopilot, git control, runtime options) |
-| `/brain` | Capture meeting notes, ideas, brainstorms |
+| `/note` | Capture meeting notes, ideas, brainstorms |
 | `/learn` | Extract reusable patterns or learn a topic step-by-step |
 | `/pattern-validate` | Scan and index global brain patterns |
 | `/heartbeat` | Manage scheduled automated tasks |
@@ -209,6 +209,41 @@ Tasks with `in {N} hours/minutes` schedules run once and auto-disable after exec
 If a task fails because a Claude Code session is already active, the daemon retries up to 5 times at 60-second intervals instead of failing permanently.
+## Code Review
+`/review-code` and `/review-pr` include three layers of intelligence:
+### Adaptive Depth
+Review depth scales automatically based on changeset size:
+| Lines Changed | Mode | Behavior |
+|--------------|------|----------|
+| < 50 | Lightweight | Quick-scan for security, logic bugs, and breaking changes only |
+| 50–500 | Standard | Full review with pattern matching and similar implementation search |
+| 500+ | Deep | Multi-pass review with file categorization, executive summary, and multi-agent analysis |
+### Verification Pass
+Every finding goes through a second-pass verification that re-reads surrounding context and asks structured questions to classify findings as Confirmed, Likely, or Dismissed. False positives are filtered before output.
+### Severity Re-Ranking
+Findings are sorted by impact (Critical > Major > Minor > Suggestion), related findings across files are grouped, and an executive summary is added when there are 5+ findings.
+### Multi-Agent Parallel Review
+In Deep mode (500+ lines), the review is split into 4 specialized subagents running in parallel:
+| Agent | Focus | Model |
+|-------|-------|-------|
+| Security | Vulnerabilities, secrets, injection, auth bypass | sonnet |
+| Logic & Bugs | Edge cases, null handling, race conditions | sonnet |
+| Performance | N+1 queries, memory leaks, blocking I/O | sonnet |
+| Pattern Compliance | Forbidden/allowed patterns, naming consistency | haiku |
+The coordinator merges results, deduplicates overlapping findings, then runs verification and re-ranking.
 ## Complexity Scoring
 Every plan phase has a complexity score (0-10):