planflow-ai 1.3.0 → 1.3.2

@@ -61,15 +61,48 @@ For each changed file, similar implementations in the codebase:
61
61
 
62
62
  ---
63
63
 
64
+ ## Verification Summary
65
+
66
+ | Metric | Count |
67
+ |--------|-------|
68
+ | Initial findings | {N} |
69
+ | Confirmed | {N} |
70
+ | Likely (needs human judgment) | {N} |
71
+ | Dismissed (false positives filtered) | {N} |
72
+ | **False positive rate** | **{N}%** |
73
+
74
+ ---
75
+
76
+ ## Executive Summary
77
+
78
+ > Only include when total findings ≥ 5. Omit for smaller reviews.
79
+
80
+ **Risk level**: {Low | Medium | High}
81
+
82
+ **Top issues to address**:
83
+
84
+ 1. {Finding title} ({Severity}) — `{file}:{line}`
85
+ 2. {Finding title} ({Severity}) — `{file}:{line}`
86
+ 3. {Finding title} ({Severity}) — `{file}:{line}`
87
+
88
+ ---
89
+
64
90
  ## Findings
65
91
 
66
- ### Finding 1: {Finding Name}
92
+ > Findings are grouped by severity (Critical → Major → Minor → Suggestion).
93
+ > For findings classified as "Likely" during verification, prepend `[Likely]` to the heading.
94
+ > Omit empty severity sections.
95
+ > Related findings across files may be grouped — see review-severity-ranking.md.
96
+
97
+ ### Critical Findings
98
+
99
+ #### Finding 1: {Finding Name}
67
100
 
68
101
  | Field | Value |
69
102
  | -------------- | ------------------------------------------------ |
70
103
  | File | `{file_path}` |
71
104
  | Line | {line_number} |
72
- | Severity | {Critical/Major/Minor/Suggestion} |
105
+ | Severity | Critical |
73
106
  | Fix Complexity | {X/10} - {Level} |
74
107
  | Pattern | {Reference to pattern from rules, if applicable} |
75
108
 
@@ -84,6 +117,24 @@ See `{reference_file_path}` for how this is handled elsewhere in the codebase.
84
117
  // Suggested code improvement
85
118
  \`\`\`
86
119
 
120
+ ### Major Findings
121
+
122
+ #### Finding N: {Finding Name}
123
+
124
+ > Same format as Critical Findings.
125
+
126
+ ### Minor Findings
127
+
128
+ #### Finding N: {Finding Name}
129
+
130
+ > Same format as Critical Findings.
131
+
132
+ ### Suggestions
133
+
134
+ #### Finding N: {Finding Name}
135
+
136
+ > Same format as Critical Findings.
137
+
87
138
  ---
88
139
 
89
140
  ## Pattern Conflicts
@@ -184,6 +235,268 @@ List any particularly well-written code or good practices observed:
184
235
 
185
236
  ---
186
237
 
238
+ ## Lightweight Review Template (< 50 lines)
239
+
240
+ Use this compact template when the changeset is under 50 lines.
241
+
242
+ ```markdown
243
+ # Local Code Review: {Description}
244
+
245
+ **Project**: [[{project-name}]]
246
+
247
+ ## Review Information
248
+
249
+ | Field | Value |
250
+ | -------------- | --------------------- |
251
+ | Date | {date} |
252
+ | Files Reviewed | {number_of_files} |
253
+ | Scope | {all/staged/unstaged} |
254
+ | Language(s) | {detected_languages} |
255
+
256
+ ---
257
+
258
+ ## Review Summary
259
+
260
+ | Metric | Value |
261
+ |--------|-------|
262
+ | **Review Mode** | Lightweight (< 50 lines) |
263
+ | **Total Findings** | {count} |
264
+ | **Status** | {LGTM / Needs Changes} |
265
+
266
+ ---
267
+
268
+ ## Findings
269
+
270
+ > Only present if issues were found. Skip this section entirely for LGTM reviews.
271
+
272
+ ### Finding 1: {Finding Name}
273
+
274
+ | Field | Value |
275
+ | -------------- | ------------------------------------------------ |
276
+ | File | `{file_path}` |
277
+ | Line | {line_number} |
278
+ | Severity | {Critical/Major/Minor} |
279
+ | Fix Complexity | {X/10} - {Level} |
280
+
281
+ **Description**:
282
+ {Detailed explanation of the issue found}
283
+
284
+ **Suggested Fix**:
285
+ \`\`\`{language}
286
+ // Suggested code improvement
287
+ \`\`\`
288
+
289
+ ---
290
+
291
+ ## Positive Highlights
292
+
293
+ - {Highlight 1}
294
+ - {Highlight 2}
295
+ - {Highlight 3}
296
+
297
+ ---
298
+
299
+ ## Commit Readiness
300
+
301
+ | Status | {Ready to Commit / Needs Changes} |
302
+ | ------ | --------------------------------- |
303
+ | Reason | {Brief explanation} |
304
+ ```
305
+
306
+ > **Note**: Lightweight reviews skip Reference Implementations, Pattern Conflicts, Rule Update Recommendations, and Verification Summary sections.
307
+
308
+ ---
309
+
310
+ ## Deep Review Template (500+ lines)
311
+
312
+ Use this template for large changesets. Findings are grouped by severity instead of by file, and an executive summary is prepended.
313
+
314
+ ```markdown
315
+ # Local Code Review: {Description}
316
+
317
+ **Project**: [[{project-name}]]
318
+
319
+ ## Review Information
320
+
321
+ | Field | Value |
322
+ | -------------- | --------------------- |
323
+ | Date | {date} |
324
+ | Files Reviewed | {number_of_files} |
325
+ | Scope | {all/staged/unstaged} |
326
+ | Language(s) | {detected_languages} |
327
+
328
+ ---
329
+
330
+ ## Executive Summary
331
+
332
+ ### Files Changed by Category
333
+
334
+ | Category | Files | Lines Changed |
335
+ |----------|-------|--------------|
336
+ | Core Logic | {N} | +{add}/-{del} |
337
+ | Infrastructure | {N} | +{add}/-{del} |
338
+ | UI/Presentation | {N} | +{add}/-{del} |
339
+ | Tests | {N} | +{add}/-{del} |
340
+
341
+ ### Risk Assessment
342
+
343
+ **Overall Risk**: {Low | Medium | High}
344
+
345
+ {1-2 sentence justification based on scope, categories affected, and finding severity distribution}
346
+
347
+ ### Top 3 Findings
348
+
349
+ 1. **[{Severity}]** {Finding title} — {one-line description} (`{file}:{line}`)
350
+ 2. **[{Severity}]** {Finding title} — {one-line description} (`{file}:{line}`)
351
+ 3. **[{Severity}]** {Finding title} — {one-line description} (`{file}:{line}`)
352
+
353
+ ---
354
+
355
+ ## Review Agents
356
+
357
+ > Multi-agent parallel review section. Only present in Deep mode (500+ lines).
358
+
359
+ | Agent | Model | Findings | After Dedup |
360
+ |-------|-------|----------|-------------|
361
+ | Security | sonnet | {N} | {N} |
362
+ | Logic & Bugs | sonnet | {N} | {N} |
363
+ | Performance | sonnet | {N} | {N} |
364
+ | Pattern Compliance | haiku | {N} | {N} |
365
+ | **Total** | | **{N}** | **{N}** |
366
+
367
+ Duplicates removed: {N}
368
+
369
+ ---
370
+
371
+ ## Changed Files
372
+
373
+ | File | Category | Status | Lines Changed |
374
+ | ------------- | -------------- | ---------- | ------------- |
375
+ | `{file_path}` | {Core/Infra/UI/Tests} | {modified} | +{add}/-{del} |
376
+ | ... | ... | ... | ... |
377
+
378
+ ---
379
+
380
+ ## Reference Implementations Found
381
+
382
+ > Same format as standard template — per changed file, similar implementations in the codebase.
383
+
384
+ ---
385
+
386
+ ## Review Summary
387
+
388
+ | Metric | Value |
389
+ | --------------------- | ------------------ |
390
+ | **Review Mode** | Deep (500+ lines) |
391
+ | **Total Findings** | {count} |
392
+ | Critical | {critical_count} |
393
+ | Major | {major_count} |
394
+ | Minor | {minor_count} |
395
+ | Suggestion | {suggestion_count} |
396
+ | **Pattern Conflicts** | {conflict_count} |
397
+ | **Total Fix Effort** | {sum_of_scores}/X |
398
+
399
+ ---
400
+
401
+ ## Verification Summary
402
+
403
+ | Metric | Count |
404
+ |--------|-------|
405
+ | Initial findings | {N} |
406
+ | Confirmed | {N} |
407
+ | Likely (needs human judgment) | {N} |
408
+ | Dismissed (false positives filtered) | {N} |
409
+ | **False positive rate** | **{N}%** |
410
+
411
+ ---
412
+
413
+ ## Critical Findings
414
+
415
+ ### Finding 1: {Finding Name}
416
+
417
+ | Field | Value |
418
+ | -------------- | ------------------------------------------------ |
419
+ | File | `{file_path}` |
420
+ | Line | {line_number} |
421
+ | Severity | Critical |
422
+ | Fix Complexity | {X/10} - {Level} |
423
+ | Category | {Core Logic/Infrastructure/UI/Tests} |
424
+ | Pattern | {Reference to pattern from rules, if applicable} |
425
+
426
+ **Description**:
427
+ {Detailed explanation}
428
+
429
+ **Reference Implementation**:
430
+ See `{reference_file_path}` for how this is handled elsewhere.
431
+
432
+ **Suggested Fix**:
433
+ \`\`\`{language}
434
+ // Suggested code improvement
435
+ \`\`\`
436
+
437
+ ---
438
+
439
+ ## Major Findings
440
+
441
+ ### Finding N: {Finding Name}
442
+
443
+ > Same format as Critical Findings.
444
+
445
+ ---
446
+
447
+ ## Minor Findings
448
+
449
+ ### Finding N: {Finding Name}
450
+
451
+ > Same format as Critical Findings.
452
+
453
+ ---
454
+
455
+ ## Suggestions
456
+
457
+ ### Finding N: {Finding Name}
458
+
459
+ > Same format as Critical Findings.
460
+
461
+ ---
462
+
463
+ ## Pattern Conflicts
464
+
465
+ > Same format as standard template.
466
+
467
+ ---
468
+
469
+ ## Rule Update Recommendations
470
+
471
+ > Same format as standard template.
472
+
473
+ ---
474
+
475
+ ## Positive Highlights
476
+
477
+ - {Highlight 1}
478
+ - {Highlight 2}
479
+
480
+ ---
481
+
482
+ ## Commit Readiness
483
+
484
+ | Status | {Ready to Commit/Needs Changes/Needs Discussion} |
485
+ | ------ | ------------------------------------------------ |
486
+ | Reason | {Brief explanation} |
487
+
488
+ ### Before Committing
489
+
490
+ - [ ] Address all Critical findings
491
+ - [ ] Address all Major findings
492
+ - [ ] Review Pattern Conflicts and decide on resolution
493
+ - [ ] Update rules files if new patterns should be documented
494
+ ```
495
+
496
+ > **Note**: Deep reviews always include the Verification Summary, group findings by severity, and prepend an Executive Summary with risk assessment. The Changed Files table includes a Category column.
497
+
498
+ ---
499
+
187
500
  ## Example Output
188
501
 
189
502
  ### Pattern Conflict Example
@@ -15,6 +15,14 @@ Skills implement the workflow logic for commands. Each skill orchestrates a spec
15
15
  > **Note**: The execute-plan skill supports **Model Routing** — automatic model selection per phase based on complexity scores (0-3 → haiku, 4-5 → sonnet, 6-10 → opus). Controlled by `model_routing` in `flow/.flowconfig`. See `.claude/resources/core/model-routing.md` for full rules.
16
16
  >
17
17
  > **Note**: The discovery skill also includes **Design Awareness**. During discovery, the LLM asks whether the feature involves UI work and captures structured design tokens (colors, typography, spacing) into a `## Design Context` section. During execution, these tokens are auto-injected into UI phase prompts. See `.claude/resources/core/design-awareness.md` for full rules.
18
+ >
19
+ > **Note**: The review-code and review-pr skills include a **Verification Pass**. After initial analysis, each finding is re-examined against surrounding code context and classified as Confirmed, Likely, or Dismissed. False positives are filtered before output. See `.claude/resources/core/review-verification.md` for full rules.
20
+ >
21
+ > **Note**: The review-code and review-pr skills include **Multi-Agent Parallel Review** for Deep mode (500+ lines). Four specialized subagents (security, logic, performance, patterns) run in parallel, and the coordinator deduplicates, verifies, and ranks the merged results. See `.claude/resources/core/review-multi-agent.md` for full rules.
22
+ >
23
+ > **Note**: The review-code and review-pr skills include **Severity Re-Ranking**. After verification, findings are re-ranked by severity → confidence → fix complexity, related findings across files are grouped, and an executive summary is added when ≥ 5 findings. See `.claude/resources/core/review-severity-ranking.md` for full rules.
24
+ >
25
+ > **Note**: The review-code and review-pr skills include **Adaptive Depth**. Review depth scales automatically based on changeset size: < 50 lines → Lightweight (quick-scan), 50–500 → Standard (no change), 500+ → Deep (multi-pass with executive summary). See `.claude/resources/core/review-adaptive-depth.md` for full rules.
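The Model Routing note earlier in this hunk maps a phase's complexity score to a model (0-3 → haiku, 4-5 → sonnet, 6-10 → opus). A minimal sketch of that mapping, assuming integer scores; the function name `route_model` is hypothetical and is not taken from the package:

```python
def route_model(complexity_score: int) -> str:
    """Pick a model tier from a 0-10 complexity score (ranges from the note above)."""
    if not 0 <= complexity_score <= 10:
        raise ValueError("complexity score must be between 0 and 10")
    if complexity_score <= 3:
        return "haiku"
    if complexity_score <= 5:
        return "sonnet"
    return "opus"
```

The actual routing is controlled by `model_routing` in `flow/.flowconfig`, so this only illustrates the score ranges.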
18
26
 
19
27
  ---
20
28
 
@@ -188,7 +196,7 @@ Skills implement the workflow logic for commands. Each skill orchestrates a spec
188
196
  | Command | Skill | Key Codes |
189
197
  |---------|-------|-----------|
190
198
  | `/brainstorm` | brainstorm-skill | SKL-BS-1 through SKL-BS-3 |
191
- | `/brain` | brain-skill | SKL-BR-1 through SKL-BR-3 |
199
+ | `/note` | brain-skill | SKL-BR-1 through SKL-BR-3 |
192
200
  | `/flow cost` | flow-cost | SKL-COST-1 through SKL-COST-4 |
193
201
  | `/learn` | learn-skill | SKL-LRN-1 through SKL-LRN-4 |
194
202
  | `/discovery-plan` | discovery-skill | SKL-DIS-1 through SKL-DIS-4 |
@@ -49,7 +49,7 @@ This skill **only writes to `flow/brain/`**. It does NOT:
49
49
 
50
50
  ### Free-Text Mode (Default)
51
51
 
52
- When user runs `/brain {free text}`:
52
+ When user runs `/note {free text}`:
53
53
 
54
54
  1. **Parse input** - Read the user's unstructured text
55
55
  2. **Extract entities** - Identify:
@@ -63,9 +63,9 @@ When user runs `/brain {free text}`:
63
63
  4. **Write** - Create/update the appropriate brain file with `[[wiki-links]]`
64
64
  5. **Update index** - Update `flow/brain/index.md` if needed (new feature, error, or decision)
65
65
 
66
- ### Guided Mode (`/brain -guided`)
66
+ ### Guided Mode (`/note -guided`)
67
67
 
68
- When user runs `/brain -guided`:
68
+ When user runs `/note -guided`:
69
69
 
70
70
  1. **Ask structured questions** using `AskUserQuestion`:
71
71
 
@@ -75,6 +75,26 @@ This skill is **strictly read-only analysis**. The review process:
75
75
  4. If `file_path` is provided, filter to only those files
76
76
  5. If `scope` is provided, filter to staged or unstaged only
77
77
 
78
+ ### Step 1b: Determine Review Depth
79
+
80
+ Determine the review mode based on changeset size. See `.claude/resources/core/review-adaptive-depth.md` for full rules.
81
+
82
+ 1. Count total lines changed (additions + deletions) from `git diff --stat`
83
+ 2. Exclude lock files, generated files, and pure whitespace changes from the count
84
+ 3. Classify into tier:
85
+ - **Small** (< 50 lines) → **Lightweight** mode
86
+ - **Medium** (50–500 lines) → **Standard** mode (no behavior change)
87
+ - **Large** (500+ lines) → **Deep** mode
88
+ 4. Display: `**Review mode**: {Lightweight|Standard|Deep} ({N} lines changed across {M} files)`
89
+
90
+ **If Lightweight**: Skip Steps 2–5 (pattern loading, similar implementations, full analysis, pattern conflicts). Perform abbreviated analysis checking ONLY security issues, obvious logic bugs, and breaking changes. Skip verification pass (Step 5b). Generate output using the lightweight template from `review-code-templates.md`.
91
+
92
+ **If Deep**: Activate multi-agent parallel review. See `.claude/resources/core/review-multi-agent.md`. Categorize files by type (Core Logic, Infrastructure, UI/Presentation, Tests), then spawn 4 specialized subagents in parallel (security, logic & bugs, performance, pattern compliance). The coordinator collects results, deduplicates overlapping findings, then proceeds to Step 5b (verification), Step 5c (re-ranking), Step 6b (pattern review), and Step 6 (output using deep template). Steps 2–5 are handled by subagents instead of the main agent.
93
+
94
+ **If Standard**: Proceed with all steps as defined below (no behavior change).
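The tier rules in Step 1b can be sketched as a small function. This is an illustration only: the names `review_mode` and `mode_banner` are hypothetical, and because the ranges "50–500" and "500+" overlap at exactly 500, the sketch assumes that 500 lines falls into Deep mode.

```python
def review_mode(lines_changed: int) -> str:
    """Map a changed-line count to a review mode using the Step 1b thresholds."""
    if lines_changed < 0:
        raise ValueError("lines_changed must be non-negative")
    if lines_changed < 50:
        return "Lightweight"
    if lines_changed < 500:  # assumption: exactly 500 is treated as Deep ("500+")
        return "Standard"
    return "Deep"


def mode_banner(lines_changed: int, files_changed: int) -> str:
    """Format the display line described in item 4 of the step."""
    mode = review_mode(lines_changed)
    return (
        f"**Review mode**: {mode} "
        f"({lines_changed} lines changed across {files_changed} files)"
    )
```

The count passed in is expected to already exclude lock files, generated files, and whitespace-only changes, as item 2 requires.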
95
+
96
+ ---
97
+
78
98
  ### Step 2: Load Review Patterns
79
99
 
80
100
  1. Read `.claude/resources/patterns/review-pr-patterns.md` for general review guidelines
@@ -160,6 +180,42 @@ When a pattern conflict is found:
160
180
 
161
181
  4. **Buffer patterns for capture**: Silently append identified patterns (both good patterns and anti-patterns found during review) to `flow/resources/pending-patterns.md`. See `.claude/resources/core/pattern-capture.md` for buffer format and capture triggers.
162
182
 
183
+ ---
184
+
185
+ ### Step 5b: Verify Findings
186
+
187
+ After collecting all findings from Steps 4 and 5, run a second-pass verification to filter false positives. See `.claude/resources/core/review-verification.md` for full logic.
188
+
189
+ **For each finding**:
190
+
191
+ 1. **Re-read surrounding context** — Read 15 lines above and 15 below the flagged line
192
+ 2. **Ask 3 standard questions**:
193
+ - Is this actually a bug, or does surrounding code handle it?
194
+ - Is there a test that covers this case?
195
+ - Would a senior developer agree this is a real issue?
196
+ 3. **Ask 1 category-specific question** (security → exploit path, logic → reachability, performance → hot path, etc.)
197
+ 4. **Classify**:
198
+ - **Confirmed** — Clear issue, 2+ standard questions support it. Keep as-is.
199
+ - **Likely** — Ambiguous, 1 question supports. Tag with `[Likely]` in output.
200
+ - **Dismissed** — False positive, all questions fail. Remove from output.
201
+
202
+ **Rules**:
203
+ - When in doubt between Likely and Dismissed → choose **Likely**
204
+ - NEVER dismiss a Critical severity finding (downgrade to Likely at most)
205
+
206
+ **After verification**: Remove Dismissed findings, tag Likely findings, generate Verification Summary stats.
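The classification rules above can be expressed as a short decision function. This is a sketch under assumptions: `classify_finding` is a hypothetical name, each answer is reduced to a boolean meaning "this question supports the finding being real", and the category-specific question is left out because the step does not say how it is weighted.

```python
def classify_finding(standard_answers: list[bool], severity: str) -> str:
    """Classify one finding from the three standard verification questions.

    standard_answers[i] is True when question i supports the finding.
    """
    if len(standard_answers) != 3:
        raise ValueError("expected answers to exactly 3 standard questions")
    support = sum(standard_answers)
    if support >= 2:
        return "Confirmed"
    if support == 1:
        return "Likely"
    # No question supports the finding: normally a false positive, but a
    # Critical finding is never dismissed, only downgraded to Likely.
    return "Likely" if severity == "Critical" else "Dismissed"
```

The "when in doubt, choose Likely" rule is a judgment call made while answering the questions, so it is not modeled here.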
207
+
208
+ ### Step 5c: Re-Rank and Group Findings
209
+
210
+ After verification, re-rank all remaining findings by impact. See `.claude/resources/core/review-severity-ranking.md` for full rules.
211
+
212
+ 1. **Sort findings**: Severity (Critical → Major → Minor → Suggestion), then Confidence (Confirmed → Likely), then Fix Complexity (lower first)
213
+ 2. **Group related findings**: Scan for same issue type across files, same root cause, or causal chains. Only group when genuinely related (≥ 2 findings). Skip grouping for small reviews (1-3 findings).
214
+ 3. **Executive summary**: If total findings ≥ 5 (standard mode) or always (deep mode), prepend an executive summary with risk level and top 3 findings. Skip for lightweight mode.
215
+ 4. **Structure output by severity**: Use severity-grouped sections (Critical → Major → Minor → Suggestions) instead of per-file ordering. Omit empty severity sections.
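The sort order and the executive-summary rule above amount to a three-part sort key and a threshold check. A minimal sketch, assuming a hypothetical `Finding` record; the field names are illustrative and not taken from the package:

```python
from dataclasses import dataclass

SEVERITY_ORDER = {"Critical": 0, "Major": 1, "Minor": 2, "Suggestion": 3}
CONFIDENCE_ORDER = {"Confirmed": 0, "Likely": 1}


@dataclass
class Finding:
    title: str
    severity: str        # Critical, Major, Minor, or Suggestion
    confidence: str      # Confirmed or Likely (Dismissed is already removed)
    fix_complexity: int  # the X in "X/10"


def rank_findings(findings: list[Finding]) -> list[Finding]:
    """Sort by severity, then confidence, then lower fix complexity first."""
    return sorted(
        findings,
        key=lambda f: (
            SEVERITY_ORDER[f.severity],
            CONFIDENCE_ORDER[f.confidence],
            f.fix_complexity,
        ),
    )


def needs_executive_summary(total_findings: int, mode: str) -> bool:
    """Apply item 3: never for Lightweight, always for Deep, 5+ findings for Standard."""
    if mode == "Lightweight":
        return False
    if mode == "Deep":
        return True
    return total_findings >= 5
```

Grouping related findings (item 2) depends on judging root causes, so it is not covered by this sketch.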
216
+
217
+ ---
218
+
163
219
  ### Step 6b: Pattern Review
164
220
 
165
221
  After analysis but before generating the review document, run the pattern review protocol:
@@ -175,6 +231,8 @@ See `.claude/resources/core/pattern-capture.md` for the full end-of-skill review
175
231
 
176
232
  ### Step 6: Generate Review Document
177
233
 
234
+ **Important**: Only include Confirmed and Likely findings in the output. Dismissed findings are excluded. Add the Verification Summary section after the Review Summary.
235
+
178
236
  Create a markdown file in `flow/reviewed-code/` with the naming convention:
179
237
 
180
238
  ```
@@ -317,3 +375,18 @@ After running this command:
317
375
  5. **Commit changes** once review concerns are addressed
318
376
 
319
377
  > The goal is not just to review current changes, but to **improve the codebase patterns over time** by documenting good patterns and preventing anti-patterns from spreading.
378
+
379
+ ### Validation Checklist
380
+
381
+ - [ ] All changed files analyzed
382
+ - [ ] Forbidden patterns checked
383
+ - [ ] Allowed patterns verified
384
+ - [ ] Similar implementations searched
385
+ - [ ] Pattern conflicts documented
386
+ - [ ] Verification pass completed — all findings classified
387
+ - [ ] Dismissed findings removed from output
388
+ - [ ] Likely findings tagged with `[Likely]`
389
+ - [ ] Verification Summary section included
390
+ - [ ] Findings sorted by severity (Critical → Major → Minor → Suggestion)
391
+ - [ ] Related findings grouped when applicable (≥ 2 related)
392
+ - [ ] Executive summary included when ≥ 5 findings (standard) or always (deep)
@@ -162,6 +162,26 @@ After successful authentication, proceed to fetch PR information.
162
162
  2. Extract the PR title, description, and list of changed files
163
163
  3. Identify the primary language(s) used in the PR
164
164
 
165
+ ### Step 1b: Determine Review Depth
166
+
167
+ Determine the review mode based on changeset size. See `.claude/resources/core/review-adaptive-depth.md` for full rules.
168
+
169
+ 1. Count total lines changed (additions + deletions) from `gh pr diff --stat` or Azure DevOps diff API
170
+ 2. Exclude lock files, generated files, and pure whitespace changes from the count
171
+ 3. Classify into tier:
172
+ - **Small** (< 50 lines) → **Lightweight** mode
173
+ - **Medium** (50–500 lines) → **Standard** mode (no behavior change)
174
+ - **Large** (500+ lines) → **Deep** mode
175
+ 4. Display: `**Review mode**: {Lightweight|Standard|Deep} ({N} lines changed across {M} files)`
176
+
177
+ **If Lightweight**: Skip Steps 2–3 (pattern loading, full analysis). Perform abbreviated analysis checking ONLY security issues, obvious logic bugs, and breaking changes. Skip verification pass (Step 3b). Generate compact output with no Pattern Conflicts or Reference Implementations sections.
178
+
179
+ **If Deep**: Activate multi-agent parallel review. See `.claude/resources/core/review-multi-agent.md`. Categorize files by type (Core Logic, Infrastructure, UI/Presentation, Tests), then spawn 4 specialized subagents in parallel (security, logic & bugs, performance, pattern compliance). The coordinator collects results, deduplicates overlapping findings, then proceeds to Step 3b (verification), Step 3c (re-ranking), and Step 4 (output using deep template with severity grouping and executive summary). Steps 2–3 are handled by subagents instead of the main agent.
180
+
181
+ **If Standard**: Proceed with all steps as defined below (no behavior change).
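Items 1 and 2 of this step (count changed lines, excluding lock and generated files) could be approximated as follows. This is a sketch under assumptions: it takes the PR diff as unified-diff text rather than a stat summary, the exclusion patterns are hypothetical placeholders for the rules in `review-adaptive-depth.md`, and added or removed blank lines are skipped as a crude stand-in for "pure whitespace changes".

```python
import fnmatch

# Hypothetical exclusion list; the real rules live in review-adaptive-depth.md.
EXCLUDED_PATTERNS = ["*.lock", "package-lock.json", "pnpm-lock.yaml", "*.min.js"]


def is_excluded(path: str) -> bool:
    """Return True when the file name matches an excluded pattern."""
    name = path.rsplit("/", 1)[-1]
    return any(fnmatch.fnmatch(name, pattern) for pattern in EXCLUDED_PATTERNS)


def count_changed_lines(diff_text: str) -> tuple[int, int]:
    """Return (lines_changed, files_counted) for a unified diff."""
    lines_changed = 0
    counted_files = set()
    current_path = None
    in_hunk = False
    for line in diff_text.splitlines():
        if line.startswith("diff --git "):
            # Header looks like: diff --git a/<path> b/<path>
            current_path = line.rsplit(" b/", 1)[-1]
            in_hunk = False
        elif line.startswith("@@"):
            in_hunk = True
        elif in_hunk and current_path and not is_excluded(current_path):
            # Inside a hunk, "+" and "-" prefixes mark additions and deletions.
            if line.startswith(("+", "-")) and line[1:].strip():
                lines_changed += 1
                counted_files.add(current_path)
    return lines_changed, len(counted_files)
```

Tracking `in_hunk` keeps the `---` and `+++` file headers from being counted as changes.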
182
+
183
+ ---
184
+
165
185
  ### Step 2: Load Review Patterns
166
186
 
167
187
  1. Read `.claude/resources/patterns/review-pr-patterns.md` for general review guidelines
@@ -182,8 +202,46 @@ For each file in the PR:
182
202
  4. Apply language-specific checks
183
203
  5. Identify security, performance, and maintainability concerns
184
204
 
205
+ ---
206
+
207
+ ### Step 3b: Verify Findings
208
+
209
+ After collecting all findings from Step 3, run a second-pass verification to filter false positives. See `.claude/resources/core/review-verification.md` for full logic.
210
+
211
+ **For each finding**:
212
+
213
+ 1. **Re-read surrounding context** — Read 15 lines above and 15 below the flagged line
214
+ 2. **Ask 3 standard questions**:
215
+ - Is this actually a bug, or does surrounding code handle it?
216
+ - Is there a test that covers this case?
217
+ - Would a senior developer agree this is a real issue?
218
+ 3. **Ask 1 category-specific question** (security → exploit path, logic → reachability, performance → hot path, etc.)
219
+ 4. **Classify**:
220
+ - **Confirmed** — Clear issue, 2+ standard questions support it. Keep as-is.
221
+ - **Likely** — Ambiguous, 1 question supports. Tag with `[Likely]` in output.
222
+ - **Dismissed** — False positive, all questions fail. Remove from output.
223
+
224
+ **Rules**:
225
+ - When in doubt between Likely and Dismissed → choose **Likely**
226
+ - NEVER dismiss a Critical severity finding (downgrade to Likely at most)
227
+
228
+ **After verification**: Remove Dismissed findings, tag Likely findings, generate Verification Summary stats.
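The post-verification bookkeeping can be sketched as a tally over the classification labels. The function name and dictionary keys are hypothetical, and the false positive rate is assumed to be dismissed findings divided by initial findings, which this step does not define explicitly.

```python
from collections import Counter


def verification_summary(classifications: list[str]) -> dict:
    """Tally Confirmed/Likely/Dismissed labels into Verification Summary stats."""
    counts = Counter(classifications)
    initial = len(classifications)
    dismissed = counts["Dismissed"]
    # Assumption: false positive rate = dismissed / initial findings.
    rate = round(100 * dismissed / initial) if initial else 0
    return {
        "initial": initial,
        "confirmed": counts["Confirmed"],
        "likely": counts["Likely"],
        "dismissed": dismissed,
        "false_positive_rate_pct": rate,
    }


def surviving(findings: list[tuple[str, str]]) -> list[tuple[str, str]]:
    """Drop Dismissed entries from (title, classification) pairs."""
    return [(title, label) for title, label in findings if label != "Dismissed"]
```

The resulting numbers correspond to the rows of the Verification Summary table in the review templates.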
229
+
230
+ ### Step 3c: Re-Rank and Group Findings
231
+
232
+ After verification, re-rank all remaining findings by impact. See `.claude/resources/core/review-severity-ranking.md` for full rules.
233
+
234
+ 1. **Sort findings**: Severity (Critical → Major → Minor → Suggestion), then Confidence (Confirmed → Likely), then Fix Complexity (lower first)
235
+ 2. **Group related findings**: Scan for same issue type across files, same root cause, or causal chains. Only group when genuinely related (≥ 2 findings). Skip grouping for small reviews (1-3 findings).
236
+ 3. **Executive summary**: If total findings ≥ 5 (standard mode) or always (deep mode), prepend an executive summary with risk level and top 3 findings. Skip for lightweight mode.
237
+ 4. **Structure output by severity**: Use severity-grouped sections (Critical → Major → Minor → Suggestions) instead of per-file ordering. Omit empty severity sections.
238
+
239
+ ---
240
+
185
241
  ### Step 4: Generate or Update Review Document
186
242
 
243
+ **Important**: Only include Confirmed and Likely findings in the output. Dismissed findings are excluded. Add the Verification Summary section after the Review Summary.
244
+
187
245
  **Check for existing review file** in `flow/reviewed-pr/` before creating a new one.
188
246
 
189
247
  #### If reviewing the same PR again:
package/README.md CHANGED
@@ -75,11 +75,11 @@ Installs for Claude Code, Cursor, OpenClaw, and Codex CLI simultaneously.
75
75
  | `/create-plan` | Create implementation plan with phases |
76
76
  | `/execute-plan` | Execute plan phases with verification |
77
77
  | `/create-contract` | Create integration contract from API docs |
78
- | `/review-code` | Review local uncommitted changes |
79
- | `/review-pr` | Review a Pull Request |
78
+ | `/review-code` | Review local uncommitted changes (adaptive depth + multi-agent) |
79
+ | `/review-pr` | Review a Pull Request (adaptive depth + multi-agent) |
80
80
  | `/write-tests` | Generate tests for coverage target |
81
81
  | `/flow` | Configure plan-flow settings (autopilot, git control, runtime options) |
82
- | `/brain` | Capture meeting notes, ideas, brainstorms |
82
+ | `/note` | Capture meeting notes, ideas, brainstorms |
83
83
  | `/learn` | Extract reusable patterns or learn a topic step-by-step |
84
84
  | `/pattern-validate` | Scan and index global brain patterns |
85
85
  | `/heartbeat` | Manage scheduled automated tasks |
@@ -209,6 +209,41 @@ Tasks with `in {N} hours/minutes` schedules run once and auto-disable after exec
209
209
 
210
210
  If a task fails because a Claude Code session is already active, the daemon retries up to 5 times at 60-second intervals instead of failing permanently.
211
211
 
212
+ ## Code Review
213
+
214
+ `/review-code` and `/review-pr` include four layers of intelligence:
215
+
216
+ ### Adaptive Depth
217
+
218
+ Review depth scales automatically based on changeset size:
219
+
220
+ | Lines Changed | Mode | Behavior |
221
+ |--------------|------|----------|
222
+ | < 50 | Lightweight | Quick-scan for security, logic bugs, and breaking changes only |
223
+ | 50–500 | Standard | Full review with pattern matching and similar implementation search |
224
+ | 500+ | Deep | Multi-pass review with file categorization, executive summary, and multi-agent analysis |
225
+
226
+ ### Verification Pass
227
+
228
+ Every finding goes through a second-pass verification that re-reads surrounding context and asks structured questions to classify findings as Confirmed, Likely, or Dismissed. False positives are filtered before output.
229
+
230
+ ### Severity Re-Ranking
231
+
232
+ Findings are sorted by impact (Critical > Major > Minor > Suggestion), related findings across files are grouped, and an executive summary is added when there are 5+ findings.
233
+
234
+ ### Multi-Agent Parallel Review
235
+
236
+ In Deep mode (500+ lines), the review is split into 4 specialized subagents running in parallel:
237
+
238
+ | Agent | Focus | Model |
239
+ |-------|-------|-------|
240
+ | Security | Vulnerabilities, secrets, injection, auth bypass | sonnet |
241
+ | Logic & Bugs | Edge cases, null handling, race conditions | sonnet |
242
+ | Performance | N+1 queries, memory leaks, blocking I/O | sonnet |
243
+ | Pattern Compliance | Forbidden/allowed patterns, naming consistency | haiku |
244
+
245
+ The coordinator merges results, deduplicates overlapping findings, then runs verification and re-ranking.
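The fan-out and merge described above can be illustrated with a thread pool. This is a sketch only: the four agent functions are stubs returning canned findings (real subagents would call a model), and treating two findings as duplicates when their `(file, line, title)` tuples match exactly is an assumption, not the package's deduplication rule.

```python
from concurrent.futures import ThreadPoolExecutor


# Stub subagents; each returns a list of (file, line, title) findings.
def security_agent(files):
    return [("src/auth.py", 10, "Hardcoded secret")]


def logic_agent(files):
    return [("src/auth.py", 42, "Unchecked None"), ("src/auth.py", 10, "Hardcoded secret")]


def performance_agent(files):
    return []


def pattern_agent(files):
    return [("src/db.py", 7, "Forbidden pattern")]


AGENTS = {
    "Security": security_agent,
    "Logic & Bugs": logic_agent,
    "Performance": performance_agent,
    "Pattern Compliance": pattern_agent,
}


def run_review(files):
    """Run all agents in parallel, then merge and deduplicate their findings."""
    with ThreadPoolExecutor(max_workers=len(AGENTS)) as pool:
        futures = {name: pool.submit(agent, files) for name, agent in AGENTS.items()}
        per_agent = {name: future.result() for name, future in futures.items()}

    merged, seen = [], set()
    for findings in per_agent.values():
        for finding in findings:
            if finding not in seen:  # assumption: exact tuple match means duplicate
                seen.add(finding)
                merged.append(finding)
    return per_agent, merged
```

The difference between the per-agent total and the merged count gives the "Duplicates removed" figure reported in the deep review template; verification and re-ranking would then run on the merged list.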
246
+
212
247
  ## Complexity Scoring
213
248
 
214
249
  Every plan phase has a complexity score (0-10):