planflow-ai 1.3.0 → 1.3.2
- package/.claude/commands/brainstorm.md +2 -2
- package/.claude/commands/heartbeat.md +1 -1
- package/.claude/commands/learn.md +1 -1
- package/.claude/commands/{brain.md → note.md} +12 -12
- package/.claude/commands/review-code.md +53 -0
- package/.claude/commands/review-pr.md +53 -0
- package/.claude/resources/core/_index.md +50 -2
- package/.claude/resources/core/resource-capture.md +1 -1
- package/.claude/resources/core/review-adaptive-depth.md +217 -0
- package/.claude/resources/core/review-multi-agent.md +289 -0
- package/.claude/resources/core/review-severity-ranking.md +149 -0
- package/.claude/resources/core/review-verification.md +158 -0
- package/.claude/resources/patterns/review-code-templates.md +315 -2
- package/.claude/resources/skills/_index.md +9 -1
- package/.claude/resources/skills/brain-skill.md +3 -3
- package/.claude/resources/skills/review-code-skill.md +73 -0
- package/.claude/resources/skills/review-pr-skill.md +58 -0
- package/README.md +38 -3
- package/dist/cli/handlers/claude.js +20 -12
- package/dist/cli/handlers/claude.js.map +1 -1
- package/package.json +1 -1
- package/rules/skills/brain-skill.mdc +4 -4
- package/skills/plan-flow/SKILL.md +1 -1
- package/skills/plan-flow/brain/SKILL.md +1 -1
- package/templates/shared/AGENTS.md.template +1 -1
- package/templates/shared/CLAUDE.md.template +1 -1
|
@@ -61,15 +61,48 @@ For each changed file, similar implementations in the codebase:
|
|
|
61
61
|
|
|
62
62
|
---
|
|
63
63
|
|
|
64
|
+
## Verification Summary
|
|
65
|
+
|
|
66
|
+
| Metric | Count |
|
|
67
|
+
|--------|-------|
|
|
68
|
+
| Initial findings | {N} |
|
|
69
|
+
| Confirmed | {N} |
|
|
70
|
+
| Likely (needs human judgment) | {N} |
|
|
71
|
+
| Dismissed (false positives filtered) | {N} |
|
|
72
|
+
| **False positive rate** | **{N}%** |
|
|
73
|
+
|
|
74
|
+
---
|
|
75
|
+
|
|
76
|
+
## Executive Summary
|
|
77
|
+
|
|
78
|
+
> Only include when total findings ≥ 5. Omit for smaller reviews.
|
|
79
|
+
|
|
80
|
+
**Risk level**: {Low | Medium | High}
|
|
81
|
+
|
|
82
|
+
**Top issues to address**:
|
|
83
|
+
|
|
84
|
+
1. {Finding title} ({Severity}) — `{file}:{line}`
|
|
85
|
+
2. {Finding title} ({Severity}) — `{file}:{line}`
|
|
86
|
+
3. {Finding title} ({Severity}) — `{file}:{line}`
|
|
87
|
+
|
|
88
|
+
---
|
|
89
|
+
|
|
64
90
|
## Findings
|
|
65
91
|
|
|
66
|
-
|
|
92
|
+
> Findings are grouped by severity (Critical → Major → Minor → Suggestion).
|
|
93
|
+
> For findings classified as "Likely" during verification, prepend `[Likely]` to the heading.
|
|
94
|
+
> Omit empty severity sections.
|
|
95
|
+
> Related findings across files may be grouped — see review-severity-ranking.md.
|
|
96
|
+
|
|
97
|
+
### Critical Findings
|
|
98
|
+
|
|
99
|
+
#### Finding 1: {Finding Name}
|
|
67
100
|
|
|
68
101
|
| Field | Value |
|
|
69
102
|
| -------------- | ------------------------------------------------ |
|
|
70
103
|
| File | `{file_path}` |
|
|
71
104
|
| Line | {line_number} |
|
|
72
|
-
| Severity |
|
|
105
|
+
| Severity | Critical |
|
|
73
106
|
| Fix Complexity | {X/10} - {Level} |
|
|
74
107
|
| Pattern | {Reference to pattern from rules, if applicable} |
|
|
75
108
|
|
|
@@ -84,6 +117,24 @@ See `{reference_file_path}` for how this is handled elsewhere in the codebase.
|
|
|
84
117
|
// Suggested code improvement
|
|
85
118
|
\`\`\`
|
|
86
119
|
|
|
120
|
+
### Major Findings
|
|
121
|
+
|
|
122
|
+
#### Finding N: {Finding Name}
|
|
123
|
+
|
|
124
|
+
> Same format as Critical Findings.
|
|
125
|
+
|
|
126
|
+
### Minor Findings
|
|
127
|
+
|
|
128
|
+
#### Finding N: {Finding Name}
|
|
129
|
+
|
|
130
|
+
> Same format as Critical Findings.
|
|
131
|
+
|
|
132
|
+
### Suggestions
|
|
133
|
+
|
|
134
|
+
#### Finding N: {Finding Name}
|
|
135
|
+
|
|
136
|
+
> Same format as Critical Findings.
|
|
137
|
+
|
|
87
138
|
---
|
|
88
139
|
|
|
89
140
|
## Pattern Conflicts
|
|
@@ -184,6 +235,268 @@ List any particularly well-written code or good practices observed:
|
|
|
184
235
|
|
|
185
236
|
---
|
|
186
237
|
|
|
238
|
+
## Lightweight Review Template (< 50 lines)
|
|
239
|
+
|
|
240
|
+
Use this compact template when the changeset is under 50 lines.
|
|
241
|
+
|
|
242
|
+
```markdown
|
|
243
|
+
# Local Code Review: {Description}
|
|
244
|
+
|
|
245
|
+
**Project**: [[{project-name}]]
|
|
246
|
+
|
|
247
|
+
## Review Information
|
|
248
|
+
|
|
249
|
+
| Field | Value |
|
|
250
|
+
| -------------- | --------------------- |
|
|
251
|
+
| Date | {date} |
|
|
252
|
+
| Files Reviewed | {number_of_files} |
|
|
253
|
+
| Scope | {all/staged/unstaged} |
|
|
254
|
+
| Language(s) | {detected_languages} |
|
|
255
|
+
|
|
256
|
+
---
|
|
257
|
+
|
|
258
|
+
## Review Summary
|
|
259
|
+
|
|
260
|
+
| Metric | Value |
|
|
261
|
+
|--------|-------|
|
|
262
|
+
| **Review Mode** | Lightweight (< 50 lines) |
|
|
263
|
+
| **Total Findings** | {count} |
|
|
264
|
+
| **Status** | {LGTM / Needs Changes} |
|
|
265
|
+
|
|
266
|
+
---
|
|
267
|
+
|
|
268
|
+
## Findings
|
|
269
|
+
|
|
270
|
+
> Only present if issues were found. Skip this section entirely for LGTM reviews.
|
|
271
|
+
|
|
272
|
+
### Finding 1: {Finding Name}
|
|
273
|
+
|
|
274
|
+
| Field | Value |
|
|
275
|
+
| -------------- | ------------------------------------------------ |
|
|
276
|
+
| File | `{file_path}` |
|
|
277
|
+
| Line | {line_number} |
|
|
278
|
+
| Severity | {Critical/Major/Minor} |
|
|
279
|
+
| Fix Complexity | {X/10} - {Level} |
|
|
280
|
+
|
|
281
|
+
**Description**:
|
|
282
|
+
{Detailed explanation of the issue found}
|
|
283
|
+
|
|
284
|
+
**Suggested Fix**:
|
|
285
|
+
\`\`\`{language}
|
|
286
|
+
// Suggested code improvement
|
|
287
|
+
\`\`\`
|
|
288
|
+
|
|
289
|
+
---
|
|
290
|
+
|
|
291
|
+
## Positive Highlights
|
|
292
|
+
|
|
293
|
+
- {Highlight 1}
|
|
294
|
+
- {Highlight 2}
|
|
295
|
+
- {Highlight 3}
|
|
296
|
+
|
|
297
|
+
---
|
|
298
|
+
|
|
299
|
+
## Commit Readiness
|
|
300
|
+
|
|
301
|
+
| Status | {Ready to Commit / Needs Changes} |
|
|
302
|
+
| ------ | --------------------------------- |
|
|
303
|
+
| Reason | {Brief explanation} |
|
|
304
|
+
```
|
|
305
|
+
|
|
306
|
+
> **Note**: Lightweight reviews skip Reference Implementations, Pattern Conflicts, Rule Update Recommendations, and Verification Summary sections.
|
|
307
|
+
|
|
308
|
+
---
|
|
309
|
+
|
|
310
|
+
## Deep Review Template (500+ lines)
|
|
311
|
+
|
|
312
|
+
Use this template for large changesets. Findings are grouped by severity instead of by file, and an executive summary is prepended.
|
|
313
|
+
|
|
314
|
+
```markdown
|
|
315
|
+
# Local Code Review: {Description}
|
|
316
|
+
|
|
317
|
+
**Project**: [[{project-name}]]
|
|
318
|
+
|
|
319
|
+
## Review Information
|
|
320
|
+
|
|
321
|
+
| Field | Value |
|
|
322
|
+
| -------------- | --------------------- |
|
|
323
|
+
| Date | {date} |
|
|
324
|
+
| Files Reviewed | {number_of_files} |
|
|
325
|
+
| Scope | {all/staged/unstaged} |
|
|
326
|
+
| Language(s) | {detected_languages} |
|
|
327
|
+
|
|
328
|
+
---
|
|
329
|
+
|
|
330
|
+
## Executive Summary
|
|
331
|
+
|
|
332
|
+
### Files Changed by Category
|
|
333
|
+
|
|
334
|
+
| Category | Files | Lines Changed |
|
|
335
|
+
|----------|-------|--------------|
|
|
336
|
+
| Core Logic | {N} | +{add}/-{del} |
|
|
337
|
+
| Infrastructure | {N} | +{add}/-{del} |
|
|
338
|
+
| UI/Presentation | {N} | +{add}/-{del} |
|
|
339
|
+
| Tests | {N} | +{add}/-{del} |
|
|
340
|
+
|
|
341
|
+
### Risk Assessment
|
|
342
|
+
|
|
343
|
+
**Overall Risk**: {Low | Medium | High}
|
|
344
|
+
|
|
345
|
+
{1-2 sentence justification based on scope, categories affected, and finding severity distribution}
|
|
346
|
+
|
|
347
|
+
### Top 3 Findings
|
|
348
|
+
|
|
349
|
+
1. **[{Severity}]** {Finding title} — {one-line description} (`{file}:{line}`)
|
|
350
|
+
2. **[{Severity}]** {Finding title} — {one-line description} (`{file}:{line}`)
|
|
351
|
+
3. **[{Severity}]** {Finding title} — {one-line description} (`{file}:{line}`)
|
|
352
|
+
|
|
353
|
+
---
|
|
354
|
+
|
|
355
|
+
## Review Agents
|
|
356
|
+
|
|
357
|
+
> Multi-agent parallel review section. Only present in Deep mode (500+ lines).
|
|
358
|
+
|
|
359
|
+
| Agent | Model | Findings | After Dedup |
|
|
360
|
+
|-------|-------|----------|-------------|
|
|
361
|
+
| Security | sonnet | {N} | {N} |
|
|
362
|
+
| Logic & Bugs | sonnet | {N} | {N} |
|
|
363
|
+
| Performance | sonnet | {N} | {N} |
|
|
364
|
+
| Pattern Compliance | haiku | {N} | {N} |
|
|
365
|
+
| **Total** | | **{N}** | **{N}** |
|
|
366
|
+
|
|
367
|
+
Duplicates removed: {N}
|
|
368
|
+
|
|
369
|
+
---
|
|
370
|
+
|
|
371
|
+
## Changed Files
|
|
372
|
+
|
|
373
|
+
| File | Category | Status | Lines Changed |
|
|
374
|
+
| ------------- | -------------- | ---------- | ------------- |
|
|
375
|
+
| `{file_path}` | {Core/Infra/UI/Tests} | {modified} | +{add}/-{del} |
|
|
376
|
+
| ... | ... | ... | ... |
|
|
377
|
+
|
|
378
|
+
---
|
|
379
|
+
|
|
380
|
+
## Reference Implementations Found
|
|
381
|
+
|
|
382
|
+
> Same format as standard template — per changed file, similar implementations in the codebase.
|
|
383
|
+
|
|
384
|
+
---
|
|
385
|
+
|
|
386
|
+
## Review Summary
|
|
387
|
+
|
|
388
|
+
| Metric | Value |
|
|
389
|
+
| --------------------- | ------------------ |
|
|
390
|
+
| **Review Mode** | Deep (500+ lines) |
|
|
391
|
+
| **Total Findings** | {count} |
|
|
392
|
+
| Critical | {critical_count} |
|
|
393
|
+
| Major | {major_count} |
|
|
394
|
+
| Minor | {minor_count} |
|
|
395
|
+
| Suggestion | {suggestion_count} |
|
|
396
|
+
| **Pattern Conflicts** | {conflict_count} |
|
|
397
|
+
| **Total Fix Effort** | {sum_of_scores}/X |
|
|
398
|
+
|
|
399
|
+
---
|
|
400
|
+
|
|
401
|
+
## Verification Summary
|
|
402
|
+
|
|
403
|
+
| Metric | Count |
|
|
404
|
+
|--------|-------|
|
|
405
|
+
| Initial findings | {N} |
|
|
406
|
+
| Confirmed | {N} |
|
|
407
|
+
| Likely (needs human judgment) | {N} |
|
|
408
|
+
| Dismissed (false positives filtered) | {N} |
|
|
409
|
+
| **False positive rate** | **{N}%** |
|
|
410
|
+
|
|
411
|
+
---
|
|
412
|
+
|
|
413
|
+
## Critical Findings
|
|
414
|
+
|
|
415
|
+
### Finding 1: {Finding Name}
|
|
416
|
+
|
|
417
|
+
| Field | Value |
|
|
418
|
+
| -------------- | ------------------------------------------------ |
|
|
419
|
+
| File | `{file_path}` |
|
|
420
|
+
| Line | {line_number} |
|
|
421
|
+
| Severity | Critical |
|
|
422
|
+
| Fix Complexity | {X/10} - {Level} |
|
|
423
|
+
| Category | {Core Logic/Infrastructure/UI/Tests} |
|
|
424
|
+
| Pattern | {Reference to pattern from rules, if applicable} |
|
|
425
|
+
|
|
426
|
+
**Description**:
|
|
427
|
+
{Detailed explanation}
|
|
428
|
+
|
|
429
|
+
**Reference Implementation**:
|
|
430
|
+
See `{reference_file_path}` for how this is handled elsewhere.
|
|
431
|
+
|
|
432
|
+
**Suggested Fix**:
|
|
433
|
+
\`\`\`{language}
|
|
434
|
+
// Suggested code improvement
|
|
435
|
+
\`\`\`
|
|
436
|
+
|
|
437
|
+
---
|
|
438
|
+
|
|
439
|
+
## Major Findings
|
|
440
|
+
|
|
441
|
+
### Finding N: {Finding Name}
|
|
442
|
+
|
|
443
|
+
> Same format as Critical Findings.
|
|
444
|
+
|
|
445
|
+
---
|
|
446
|
+
|
|
447
|
+
## Minor Findings
|
|
448
|
+
|
|
449
|
+
### Finding N: {Finding Name}
|
|
450
|
+
|
|
451
|
+
> Same format as Critical Findings.
|
|
452
|
+
|
|
453
|
+
---
|
|
454
|
+
|
|
455
|
+
## Suggestions
|
|
456
|
+
|
|
457
|
+
### Finding N: {Finding Name}
|
|
458
|
+
|
|
459
|
+
> Same format as Critical Findings.
|
|
460
|
+
|
|
461
|
+
---
|
|
462
|
+
|
|
463
|
+
## Pattern Conflicts
|
|
464
|
+
|
|
465
|
+
> Same format as standard template.
|
|
466
|
+
|
|
467
|
+
---
|
|
468
|
+
|
|
469
|
+
## Rule Update Recommendations
|
|
470
|
+
|
|
471
|
+
> Same format as standard template.
|
|
472
|
+
|
|
473
|
+
---
|
|
474
|
+
|
|
475
|
+
## Positive Highlights
|
|
476
|
+
|
|
477
|
+
- {Highlight 1}
|
|
478
|
+
- {Highlight 2}
|
|
479
|
+
|
|
480
|
+
---
|
|
481
|
+
|
|
482
|
+
## Commit Readiness
|
|
483
|
+
|
|
484
|
+
| Status | {Ready to Commit/Needs Changes/Needs Discussion} |
|
|
485
|
+
| ------ | ------------------------------------------------ |
|
|
486
|
+
| Reason | {Brief explanation} |
|
|
487
|
+
|
|
488
|
+
### Before Committing
|
|
489
|
+
|
|
490
|
+
- [ ] Address all Critical findings
|
|
491
|
+
- [ ] Address all Major findings
|
|
492
|
+
- [ ] Review Pattern Conflicts and decide on resolution
|
|
493
|
+
- [ ] Update rules files if new patterns should be documented
|
|
494
|
+
```
|
|
495
|
+
|
|
496
|
+
> **Note**: Deep reviews always include the Verification Summary, group findings by severity, and prepend an Executive Summary with risk assessment. The Changed Files table includes a Category column.
|
|
497
|
+
|
|
498
|
+
---
|
|
499
|
+
|
|
187
500
|
## Example Output
|
|
188
501
|
|
|
189
502
|
### Pattern Conflict Example
|
|
@@ -15,6 +15,14 @@ Skills implement the workflow logic for commands. Each skill orchestrates a spec
|
|
|
15
15
|
> **Note**: The execute-plan skill supports **Model Routing** — automatic model selection per phase based on complexity scores (0-3 → haiku, 4-5 → sonnet, 6-10 → opus). Controlled by `model_routing` in `flow/.flowconfig`. See `.claude/resources/core/model-routing.md` for full rules.
|
|
16
16
|
>
|
|
17
17
|
> **Note**: The discovery skill also includes **Design Awareness**. During discovery, the LLM asks whether the feature involves UI work and captures structured design tokens (colors, typography, spacing) into a `## Design Context` section. During execution, these tokens are auto-injected into UI phase prompts. See `.claude/resources/core/design-awareness.md` for full rules.
|
|
18
|
+
>
|
|
19
|
+
> **Note**: The review-code and review-pr skills include a **Verification Pass**. After initial analysis, each finding is re-examined against surrounding code context and classified as Confirmed, Likely, or Dismissed. False positives are filtered before output. See `.claude/resources/core/review-verification.md` for full rules.
|
|
20
|
+
>
|
|
21
|
+
> **Note**: The review-code and review-pr skills include **Multi-Agent Parallel Review** for Deep mode (500+ lines). Four specialized subagents (security, logic, performance, patterns) run in parallel, and the coordinator deduplicates, verifies, and ranks the merged results. See `.claude/resources/core/review-multi-agent.md` for full rules.
|
|
22
|
+
>
|
|
23
|
+
> **Note**: The review-code and review-pr skills include **Severity Re-Ranking**. After verification, findings are re-ranked by severity → confidence → fix complexity, related findings across files are grouped, and an executive summary is added when ≥ 5 findings. See `.claude/resources/core/review-severity-ranking.md` for full rules.
|
|
24
|
+
>
|
|
25
|
+
> **Note**: The review-code and review-pr skills include **Adaptive Depth**. Review depth scales automatically based on changeset size: < 50 lines → Lightweight (quick-scan), 50–500 → Standard (no change), 500+ → Deep (multi-pass with executive summary). See `.claude/resources/core/review-adaptive-depth.md` for full rules.
|
|
18
26
|
|
|
19
27
|
---
|
|
20
28
|
|
|
@@ -188,7 +196,7 @@ Skills implement the workflow logic for commands. Each skill orchestrates a spec
|
|
|
188
196
|
| Command | Skill | Key Codes |
|
|
189
197
|
|---------|-------|-----------|
|
|
190
198
|
| `/brainstorm` | brainstorm-skill | SKL-BS-1 through SKL-BS-3 |
|
|
191
|
-
| `/
|
|
199
|
+
| `/note` | brain-skill | SKL-BR-1 through SKL-BR-3 |
|
|
192
200
|
| `/flow cost` | flow-cost | SKL-COST-1 through SKL-COST-4 |
|
|
193
201
|
| `/learn` | learn-skill | SKL-LRN-1 through SKL-LRN-4 |
|
|
194
202
|
| `/discovery-plan` | discovery-skill | SKL-DIS-1 through SKL-DIS-4 |
|
|
@@ -49,7 +49,7 @@ This skill **only writes to `flow/brain/`**. It does NOT:
|
|
|
49
49
|
|
|
50
50
|
### Free-Text Mode (Default)
|
|
51
51
|
|
|
52
|
-
When user runs `/
|
|
52
|
+
When user runs `/note {free text}`:
|
|
53
53
|
|
|
54
54
|
1. **Parse input** - Read the user's unstructured text
|
|
55
55
|
2. **Extract entities** - Identify:
|
|
@@ -63,9 +63,9 @@ When user runs `/brain {free text}`:
|
|
|
63
63
|
4. **Write** - Create/update the appropriate brain file with `[[wiki-links]]`
|
|
64
64
|
5. **Update index** - Update `flow/brain/index.md` if needed (new feature, error, or decision)
|
|
65
65
|
|
|
66
|
-
### Guided Mode (`/
|
|
66
|
+
### Guided Mode (`/note -guided`)
|
|
67
67
|
|
|
68
|
-
When user runs `/
|
|
68
|
+
When user runs `/note -guided`:
|
|
69
69
|
|
|
70
70
|
1. **Ask structured questions** using `AskUserQuestion`:
|
|
71
71
|
|
|
@@ -75,6 +75,26 @@ This skill is **strictly read-only analysis**. The review process:
|
|
|
75
75
|
4. If `file_path` is provided, filter to only those files
|
|
76
76
|
5. If `scope` is provided, filter to staged or unstaged only
|
|
77
77
|
|
|
78
|
+
### Step 1b: Determine Review Depth
|
|
79
|
+
|
|
80
|
+
Determine the review mode based on changeset size. See `.claude/resources/core/review-adaptive-depth.md` for full rules.
|
|
81
|
+
|
|
82
|
+
1. Count total lines changed (additions + deletions) from `git diff --stat`
|
|
83
|
+
2. Exclude lock files, generated files, and pure whitespace changes from the count
|
|
84
|
+
3. Classify into tier:
|
|
85
|
+
- **Small** (< 50 lines) → **Lightweight** mode
|
|
86
|
+
- **Medium** (50–500 lines) → **Standard** mode (no behavior change)
|
|
87
|
+
- **Large** (500+ lines) → **Deep** mode
|
|
88
|
+
4. Display: `**Review mode**: {Lightweight|Standard|Deep} ({N} lines changed across {M} files)`
|
|
89
|
+
|
|
90
|
+
**If Lightweight**: Skip Steps 2–5 (pattern loading, similar implementations, full analysis, pattern conflicts). Perform abbreviated analysis checking ONLY security issues, obvious logic bugs, and breaking changes. Skip verification pass (Step 5b). Generate output using the lightweight template from `review-code-templates.md`.
|
|
91
|
+
|
|
92
|
+
**If Deep**: Activate multi-agent parallel review. See `.claude/resources/core/review-multi-agent.md`. Categorize files by type (Core Logic, Infrastructure, UI/Presentation, Tests), then spawn 4 specialized subagents in parallel (security, logic & bugs, performance, pattern compliance). The coordinator collects results, deduplicates overlapping findings, then proceeds to Step 5b (verification), Step 5c (re-ranking), Step 6b (pattern review), and Step 6 (output using deep template). Steps 2–5 are handled by subagents instead of the main agent.
|
|
93
|
+
|
|
94
|
+
**If Standard**: Proceed with all steps as defined below (no behavior change).
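The tier selection above can be sketched roughly as follows. This is a minimal illustration, not the package's implementation: it parses `git diff --numstat` output (swapped in for `--stat` because it is tab-separated and easier to parse), the exclusion globs are hypothetical, pure-whitespace changes are not filtered, and exactly 500 lines is assumed to fall into Deep because the documented ranges overlap at that value.

```python
from fnmatch import fnmatch

# Hypothetical exclusion list; the real rules live in review-adaptive-depth.md.
EXCLUDED_GLOBS = ("*.lock", "package-lock.json", "pnpm-lock.yaml", "*.min.js")


def count_changed_lines(numstat_output: str) -> tuple[int, int]:
    """Return (lines_changed, files_counted) from `git diff --numstat` text."""
    total = 0
    files = 0
    for row in numstat_output.splitlines():
        parts = row.split("\t")
        if len(parts) != 3:
            continue
        added, deleted, path = parts
        if added == "-" or deleted == "-":
            # Binary files are reported as "-" and carry no line counts.
            continue
        name = path.rsplit("/", 1)[-1]
        if any(fnmatch(name, glob) for glob in EXCLUDED_GLOBS):
            continue
        total += int(added) + int(deleted)
        files += 1
    return total, files


def review_mode(lines_changed: int) -> str:
    """Map a line count to a tier (assumption: exactly 500 counts as Deep)."""
    if lines_changed < 50:
        return "Lightweight"
    if lines_changed < 500:
        return "Standard"
    return "Deep"


def display_line(lines_changed: int, files: int) -> str:
    """Format the status line shown to the user in step 4."""
    mode = review_mode(lines_changed)
    return f"**Review mode**: {mode} ({lines_changed} lines changed across {files} files)"
```

Keeping the counting and the classification in separate functions makes the threshold boundaries easy to adjust once the authoritative rule for the 500-line edge is confirmed.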
|
|
95
|
+
|
|
96
|
+
---
|
|
97
|
+
|
|
78
98
|
### Step 2: Load Review Patterns
|
|
79
99
|
|
|
80
100
|
1. Read `.claude/resources/patterns/review-pr-patterns.md` for general review guidelines
|
|
@@ -160,6 +180,42 @@ When a pattern conflict is found:
|
|
|
160
180
|
|
|
161
181
|
4. **Buffer patterns for capture**: Silently append identified patterns (both good patterns and anti-patterns found during review) to `flow/resources/pending-patterns.md`. See `.claude/resources/core/pattern-capture.md` for buffer format and capture triggers.
|
|
162
182
|
|
|
183
|
+
---
|
|
184
|
+
|
|
185
|
+
### Step 5b: Verify Findings
|
|
186
|
+
|
|
187
|
+
After collecting all findings from Steps 4 and 5, run a second-pass verification to filter false positives. See `.claude/resources/core/review-verification.md` for full logic.
|
|
188
|
+
|
|
189
|
+
**For each finding**:
|
|
190
|
+
|
|
191
|
+
1. **Re-read surrounding context** — Read 15 lines above and 15 below the flagged line
|
|
192
|
+
2. **Ask 3 standard questions**:
|
|
193
|
+
- Is this actually a bug, or does surrounding code handle it?
|
|
194
|
+
- Is there a test that covers this case?
|
|
195
|
+
- Would a senior developer agree this is a real issue?
|
|
196
|
+
3. **Ask 1 category-specific question** (security → exploit path, logic → reachability, performance → hot path, etc.)
|
|
197
|
+
4. **Classify**:
|
|
198
|
+
- **Confirmed** — Clear issue, 2+ standard questions support it. Keep as-is.
|
|
199
|
+
- **Likely** — Ambiguous, 1 question supports. Tag with `[Likely]` in output.
|
|
200
|
+
- **Dismissed** — False positive, all questions fail. Remove from output.
|
|
201
|
+
|
|
202
|
+
**Rules**:
|
|
203
|
+
- When in doubt between Likely and Dismissed → choose **Likely**
|
|
204
|
+
- NEVER dismiss a Critical severity finding (downgrade to Likely at most)
|
|
205
|
+
|
|
206
|
+
**After verification**: Remove Dismissed findings, tag Likely findings, generate Verification Summary stats.
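A minimal sketch of the classification rules above, under stated assumptions: each finding is reduced to its severity plus the number of standard questions that support it, the category-specific question is ignored, and the false positive rate is assumed to be dismissed findings divided by initial findings (the source does not define the formula). The function names are hypothetical.

```python
from collections import Counter


def classify(severity: str, standard_yes: int) -> str:
    """Classify a finding; standard_yes is how many of the 3 standard questions support it."""
    if standard_yes >= 2:
        return "Confirmed"
    if standard_yes == 1:
        return "Likely"
    # All questions failed: dismiss, except Critical findings,
    # which are downgraded to Likely at most.
    return "Likely" if severity == "Critical" else "Dismissed"


def verification_summary(findings: list[tuple[str, int]]) -> dict[str, int]:
    """Build Verification Summary counts from (severity, standard_yes) pairs."""
    counts = Counter(classify(sev, yes) for sev, yes in findings)
    initial = len(findings)
    dismissed = counts["Dismissed"]
    # Assumption: rate = dismissed / initial, as a rounded percentage.
    rate = round(100 * dismissed / initial) if initial else 0
    return {
        "initial": initial,
        "confirmed": counts["Confirmed"],
        "likely": counts["Likely"],
        "dismissed": dismissed,
        "false_positive_rate_pct": rate,
    }
```

For example, a Critical finding with no supporting answers is kept as Likely rather than dropped, which matches the "never dismiss Critical" rule.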
|
|
207
|
+
|
|
208
|
+
### Step 5c: Re-Rank and Group Findings
|
|
209
|
+
|
|
210
|
+
After verification, re-rank all remaining findings by impact. See `.claude/resources/core/review-severity-ranking.md` for full rules.
|
|
211
|
+
|
|
212
|
+
1. **Sort findings**: Severity (Critical → Major → Minor → Suggestion), then Confidence (Confirmed → Likely), then Fix Complexity (lower first)
|
|
213
|
+
2. **Group related findings**: Scan for same issue type across files, same root cause, or causal chains. Only group when genuinely related (≥ 2 findings). Skip grouping for small reviews (1-3 findings).
|
|
214
|
+
3. **Executive summary**: If total findings ≥ 5 (standard mode) or always (deep mode), prepend an executive summary with risk level and top 3 findings. Skip for lightweight mode.
|
|
215
|
+
4. **Structure output by severity**: Use severity-grouped sections (Critical → Major → Minor → Suggestions) instead of per-file ordering. Omit empty severity sections.
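The three-level sort and the executive-summary condition can be illustrated with a short sketch. The dictionary keys (`severity`, `confidence`, `fix_complexity`) are hypothetical names chosen for this example; the authoritative rules are in `review-severity-ranking.md`.

```python
SEVERITY_ORDER = {"Critical": 0, "Major": 1, "Minor": 2, "Suggestion": 3}
CONFIDENCE_ORDER = {"Confirmed": 0, "Likely": 1}


def rank_findings(findings: list[dict]) -> list[dict]:
    """Sort by severity, then confidence, then fix complexity (lower first)."""
    return sorted(
        findings,
        key=lambda f: (
            SEVERITY_ORDER[f["severity"]],
            CONFIDENCE_ORDER[f["confidence"]],
            f["fix_complexity"],
        ),
    )


def needs_executive_summary(mode: str, total_findings: int) -> bool:
    """Deep: always. Standard: only with 5 or more findings. Lightweight: never."""
    if mode == "Lightweight":
        return False
    if mode == "Deep":
        return True
    return total_findings >= 5
```

A tuple key keeps the three criteria in one place, and Python's stable sort preserves the original order of findings that tie on all three.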
|
|
216
|
+
|
|
217
|
+
---
|
|
218
|
+
|
|
163
219
|
### Step 6b: Pattern Review
|
|
164
220
|
|
|
165
221
|
After analysis but before generating the review document, run the pattern review protocol:
|
|
@@ -175,6 +231,8 @@ See `.claude/resources/core/pattern-capture.md` for the full end-of-skill review
|
|
|
175
231
|
|
|
176
232
|
### Step 6: Generate Review Document
|
|
177
233
|
|
|
234
|
+
**Important**: Only include Confirmed and Likely findings in the output. Dismissed findings are excluded. Add the Verification Summary section after the Review Summary.
|
|
235
|
+
|
|
178
236
|
Create a markdown file in `flow/reviewed-code/` with the naming convention:
|
|
179
237
|
|
|
180
238
|
```
|
|
@@ -317,3 +375,18 @@ After running this command:
|
|
|
317
375
|
5. **Commit changes** once review concerns are addressed
|
|
318
376
|
|
|
319
377
|
> The goal is not just to review current changes, but to **improve the codebase patterns over time** by documenting good patterns and preventing anti-patterns from spreading.
|
|
378
|
+
|
|
379
|
+
### Validation Checklist
|
|
380
|
+
|
|
381
|
+
- [ ] All changed files analyzed
|
|
382
|
+
- [ ] Forbidden patterns checked
|
|
383
|
+
- [ ] Allowed patterns verified
|
|
384
|
+
- [ ] Similar implementations searched
|
|
385
|
+
- [ ] Pattern conflicts documented
|
|
386
|
+
- [ ] Verification pass completed — all findings classified
|
|
387
|
+
- [ ] Dismissed findings removed from output
|
|
388
|
+
- [ ] Likely findings tagged with `[Likely]`
|
|
389
|
+
- [ ] Verification Summary section included
|
|
390
|
+
- [ ] Findings sorted by severity (Critical → Major → Minor → Suggestion)
|
|
391
|
+
- [ ] Related findings grouped when applicable (≥ 2 related)
|
|
392
|
+
- [ ] Executive summary included when ≥ 5 findings (standard) or always (deep)
|
|
@@ -162,6 +162,26 @@ After successful authentication, proceed to fetch PR information.
|
|
|
162
162
|
2. Extract the PR title, description, and list of changed files
|
|
163
163
|
3. Identify the primary language(s) used in the PR
|
|
164
164
|
|
|
165
|
+
### Step 1b: Determine Review Depth
|
|
166
|
+
|
|
167
|
+
Determine the review mode based on changeset size. See `.claude/resources/core/review-adaptive-depth.md` for full rules.
|
|
168
|
+
|
|
169
|
+
1. Count total lines changed (additions + deletions) from `gh pr diff --stat` or Azure DevOps diff API
|
|
170
|
+
2. Exclude lock files, generated files, and pure whitespace changes from the count
|
|
171
|
+
3. Classify into tier:
|
|
172
|
+
- **Small** (< 50 lines) → **Lightweight** mode
|
|
173
|
+
- **Medium** (50–500 lines) → **Standard** mode (no behavior change)
|
|
174
|
+
- **Large** (500+ lines) → **Deep** mode
|
|
175
|
+
4. Display: `**Review mode**: {Lightweight|Standard|Deep} ({N} lines changed across {M} files)`
|
|
176
|
+
|
|
177
|
+
**If Lightweight**: Skip Steps 2–3 (pattern loading, full analysis). Perform abbreviated analysis checking ONLY security issues, obvious logic bugs, and breaking changes. Skip verification pass (Step 3b). Generate compact output with no Pattern Conflicts or Reference Implementations sections.
|
|
178
|
+
|
|
179
|
+
**If Deep**: Activate multi-agent parallel review. See `.claude/resources/core/review-multi-agent.md`. Categorize files by type (Core Logic, Infrastructure, UI/Presentation, Tests), then spawn 4 specialized subagents in parallel (security, logic & bugs, performance, pattern compliance). The coordinator collects results, deduplicates overlapping findings, then proceeds to Step 3b (verification), Step 3c (re-ranking), and Step 4 (output using deep template with severity grouping and executive summary). Steps 2–3 are handled by subagents instead of the main agent.
|
|
180
|
+
|
|
181
|
+
**If Standard**: Proceed with all steps as defined below (no behavior change).
|
|
182
|
+
|
|
183
|
+
---
|
|
184
|
+
|
|
165
185
|
### Step 2: Load Review Patterns
|
|
166
186
|
|
|
167
187
|
1. Read `.claude/resources/patterns/review-pr-patterns.md` for general review guidelines
|
|
@@ -182,8 +202,46 @@ For each file in the PR:
|
|
|
182
202
|
4. Apply language-specific checks
|
|
183
203
|
5. Identify security, performance, and maintainability concerns
|
|
184
204
|
|
|
205
|
+
---
|
|
206
|
+
|
|
207
|
+
### Step 3b: Verify Findings
|
|
208
|
+
|
|
209
|
+
After collecting all findings from Step 3, run a second-pass verification to filter false positives. See `.claude/resources/core/review-verification.md` for full logic.
|
|
210
|
+
|
|
211
|
+
**For each finding**:
|
|
212
|
+
|
|
213
|
+
1. **Re-read surrounding context** — Read 15 lines above and 15 below the flagged line
|
|
214
|
+
2. **Ask 3 standard questions**:
|
|
215
|
+
- Is this actually a bug, or does surrounding code handle it?
|
|
216
|
+
- Is there a test that covers this case?
|
|
217
|
+
- Would a senior developer agree this is a real issue?
|
|
218
|
+
3. **Ask 1 category-specific question** (security → exploit path, logic → reachability, performance → hot path, etc.)
|
|
219
|
+
4. **Classify**:
|
|
220
|
+
- **Confirmed** — Clear issue, 2+ standard questions support it. Keep as-is.
|
|
221
|
+
- **Likely** — Ambiguous, 1 question supports. Tag with `[Likely]` in output.
|
|
222
|
+
- **Dismissed** — False positive, all questions fail. Remove from output.
|
|
223
|
+
|
|
224
|
+
**Rules**:
|
|
225
|
+
- When in doubt between Likely and Dismissed → choose **Likely**
|
|
226
|
+
- NEVER dismiss a Critical severity finding (downgrade to Likely at most)
|
|
227
|
+
|
|
228
|
+
**After verification**: Remove Dismissed findings, tag Likely findings, generate Verification Summary stats.
|
|
229
|
+
|
|
230
|
+
### Step 3c: Re-Rank and Group Findings
|
|
231
|
+
|
|
232
|
+
After verification, re-rank all remaining findings by impact. See `.claude/resources/core/review-severity-ranking.md` for full rules.
|
|
233
|
+
|
|
234
|
+
1. **Sort findings**: Severity (Critical → Major → Minor → Suggestion), then Confidence (Confirmed → Likely), then Fix Complexity (lower first)
|
|
235
|
+
2. **Group related findings**: Scan for same issue type across files, same root cause, or causal chains. Only group when genuinely related (≥ 2 findings). Skip grouping for small reviews (1-3 findings).
|
|
236
|
+
3. **Executive summary**: If total findings ≥ 5 (standard mode) or always (deep mode), prepend an executive summary with risk level and top 3 findings. Skip for lightweight mode.
|
|
237
|
+
4. **Structure output by severity**: Use severity-grouped sections (Critical → Major → Minor → Suggestions) instead of per-file ordering. Omit empty severity sections.
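The grouping rule in item 2 could look roughly like this. The sketch covers only the "same issue type" case and assumes each finding carries a hypothetical `issue_type` field; detecting shared root causes or causal chains needs judgment that a simple key lookup does not capture.

```python
from collections import defaultdict


def group_related(findings: list[dict]) -> list[list[dict]]:
    """Return groups of 2 or more findings that share an issue type."""
    # Small reviews (1-3 findings) skip grouping entirely.
    if len(findings) <= 3:
        return []
    by_type: dict[str, list[dict]] = defaultdict(list)
    for finding in findings:
        by_type[finding["issue_type"]].append(finding)
    # Only keep genuine groups; singletons stay as individual findings.
    return [group for group in by_type.values() if len(group) >= 2]
```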
|
|
238
|
+
|
|
239
|
+
---
|
|
240
|
+
|
|
185
241
|
### Step 4: Generate or Update Review Document
|
|
186
242
|
|
|
243
|
+
**Important**: Only include Confirmed and Likely findings in the output. Dismissed findings are excluded. Add the Verification Summary section after the Review Summary.
|
|
244
|
+
|
|
187
245
|
**Check for existing review file** in `flow/reviewed-pr/` before creating a new one.
|
|
188
246
|
|
|
189
247
|
#### If reviewing the same PR again:
|
package/README.md
CHANGED
|
@@ -75,11 +75,11 @@ Installs for Claude Code, Cursor, OpenClaw, and Codex CLI simultaneously.
|
|
|
75
75
|
| `/create-plan` | Create implementation plan with phases |
|
|
76
76
|
| `/execute-plan` | Execute plan phases with verification |
|
|
77
77
|
| `/create-contract` | Create integration contract from API docs |
|
|
78
|
-
| `/review-code` | Review local uncommitted changes |
|
|
79
|
-
| `/review-pr` | Review a Pull Request |
|
|
78
|
+
| `/review-code` | Review local uncommitted changes (adaptive depth + multi-agent) |
|
|
79
|
+
| `/review-pr` | Review a Pull Request (adaptive depth + multi-agent) |
|
|
80
80
|
| `/write-tests` | Generate tests for coverage target |
|
|
81
81
|
| `/flow` | Configure plan-flow settings (autopilot, git control, runtime options) |
|
|
82
|
-
| `/
|
|
82
|
+
| `/note` | Capture meeting notes, ideas, brainstorms |
|
|
83
83
|
| `/learn` | Extract reusable patterns or learn a topic step-by-step |
|
|
84
84
|
| `/pattern-validate` | Scan and index global brain patterns |
|
|
85
85
|
| `/heartbeat` | Manage scheduled automated tasks |
|
|
@@ -209,6 +209,41 @@ Tasks with `in {N} hours/minutes` schedules run once and auto-disable after exec
|
|
|
209
209
|
|
|
210
210
|
If a task fails because a Claude Code session is already active, the daemon retries up to 5 times at 60-second intervals instead of failing permanently.
|
|
211
211
|
|
|
212
|
+
## Code Review
|
|
213
|
+
|
|
214
|
+
`/review-code` and `/review-pr` include three layers of intelligence:
|
|
215
|
+
|
|
216
|
+
### Adaptive Depth
|
|
217
|
+
|
|
218
|
+
Review depth scales automatically based on changeset size:
|
|
219
|
+
|
|
220
|
+
| Lines Changed | Mode | Behavior |
|
|
221
|
+
|--------------|------|----------|
|
|
222
|
+
| < 50 | Lightweight | Quick-scan for security, logic bugs, and breaking changes only |
|
|
223
|
+
| 50–500 | Standard | Full review with pattern matching and similar implementation search |
|
|
224
|
+
| 500+ | Deep | Multi-pass review with file categorization, executive summary, and multi-agent analysis |
|
|
225
|
+
|
|
226
|
+
### Verification Pass
|
|
227
|
+
|
|
228
|
+
Every finding goes through a second-pass verification that re-reads surrounding context and asks structured questions to classify findings as Confirmed, Likely, or Dismissed. False positives are filtered before output.
|
|
229
|
+
|
|
230
|
+
### Severity Re-Ranking
|
|
231
|
+
|
|
232
|
+
Findings are sorted by impact (Critical > Major > Minor > Suggestion), related findings across files are grouped, and an executive summary is added when there are 5+ findings.
|
|
233
|
+
|
|
234
|
+
### Multi-Agent Parallel Review
|
|
235
|
+
|
|
236
|
+
In Deep mode (500+ lines), the review is split into 4 specialized subagents running in parallel:
|
|
237
|
+
|
|
238
|
+
| Agent | Focus | Model |
|
|
239
|
+
|-------|-------|-------|
|
|
240
|
+
| Security | Vulnerabilities, secrets, injection, auth bypass | sonnet |
|
|
241
|
+
| Logic & Bugs | Edge cases, null handling, race conditions | sonnet |
|
|
242
|
+
| Performance | N+1 queries, memory leaks, blocking I/O | sonnet |
|
|
243
|
+
| Pattern Compliance | Forbidden/allowed patterns, naming consistency | haiku |
|
|
244
|
+
|
|
245
|
+
The coordinator merges results, deduplicates overlapping findings, then runs verification and re-ranking.
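As a rough illustration of the merge step, the sketch below combines per-agent findings and drops overlaps. It assumes two findings overlap when they point at the same file and line, which is a simplification; the actual deduplication rules are described in `review-multi-agent.md`, and the field names here are hypothetical.

```python
def merge_agent_findings(agent_results: dict[str, list[dict]]) -> tuple[list[dict], int]:
    """Merge findings from all agents and return (merged, duplicates_removed)."""
    merged: dict[tuple[str, int], dict] = {}
    raw_total = 0
    for agent, findings in agent_results.items():
        raw_total += len(findings)
        for finding in findings:
            # Assumption: same file and line means the same underlying issue.
            key = (finding["file"], finding["line"])
            if key in merged:
                merged[key]["agents"].append(agent)
            else:
                merged[key] = {**finding, "agents": [agent]}
    unique = list(merged.values())
    return unique, raw_total - len(unique)
```

Recording which agents reported each finding keeps the information needed for a per-agent findings table while still emitting each issue once.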
|
|
246
|
+
|
|
212
247
|
## Complexity Scoring
|
|
213
248
|
|
|
214
249
|
Every plan phase has a complexity score (0-10):
|