invar-tools 1.10.0__py3-none-any.whl → 1.11.0__py3-none-any.whl

@@ -1,337 +1,158 @@
1
1
  ---
2
2
  name: review
3
- description: Fault-finding code review with REJECTION-FIRST mindset. Code is GUILTY until proven INNOCENT. Two-step loop (Review→Fix) with full-scope review each round. Use after development, when Guard reports review_suggested, or user explicitly requests review.
3
+ description: Adversarial code review. Code is GUILTY until proven INNOCENT. Every round spawns isolated subagent reviewing FULL scope.
4
4
  _invar:
5
- version: "5.3"
5
+ version: "7.0"
6
6
  managed: skill
7
7
  ---
8
8
  <!--invar:skill-->
9
9
 
10
- # Review Mode (Fault-Finding with Auto-Loop)
10
+ # Review Skill (Adversarial)
11
11
 
12
- > **Purpose:** Find problems that Guard, doctests, and property tests missed.
13
- > **Mindset:** REJECTION-FIRST. Code is GUILTY until proven INNOCENT.
14
- > **Success Metric:** Issues FOUND, not code approved. Zero issues = you failed to look hard enough.
15
- > **Workflow:** Two-step loop: Review → Fix → Review → Fix → ... (full scope each round, no separate "verify" step).
12
+ ## Mandatory Rules (MUST follow, NO exceptions)
16
13
 
17
- ## Depth Levels (DX-70)
14
+ 1. **EVERY round MUST spawn isolated subagent** (Task tool with model=opus)
15
+ 2. **EVERY round reviews FULL scope** (all files, not just changes)
16
+ 3. **Code is GUILTY until proven INNOCENT**
17
+ 4. **NO user confirmation between rounds** — just do it
18
+ 5. **MAX_ROUNDS = 5**
18
19
 
19
- | Level | Context | Use Case |
20
- |-------|---------|----------|
21
- | (default) | Same context | Reviewing **others' code** only |
22
- | `--deep` | **Isolated agent** | Self-review, before merge, maximum objectivity |
20
+ **Violation = Review Invalid.** If you skip subagent or review only changes, the review is worthless.
23
21
 
24
- **Default:** Same context — **only appropriate for code you did NOT write**.
25
-
26
- **`--deep` mode:** Spawns isolated agent with no conversation history. **Required when:**
27
- - You wrote or modified the code being reviewed (self-review)
28
- - Before merge/PR
29
- - Maximum objectivity needed
30
-
31
- ### ⚠️ Same-Context Review Limitations (CRITICAL)
32
-
33
- **Same-context review CANNOT be objective for self-written code because:**
34
-
35
- | Cognitive Bias | Effect |
36
- |----------------|--------|
37
- | **Intent over code** | You "know" what it's supposed to do, so you don't see what it actually does |
38
- | **Context memory** | You "remember" reading code, so you skip re-reading carefully |
39
- | **Confirmation bias** | You look for "code works" evidence, not "code fails" evidence |
40
- | **Completion pressure** | Subconscious goal becomes "finish review" not "find bugs" |
41
-
42
- **Evidence:** In DX-71 review, same-context missed 2 CRITICAL + 4 MAJOR issues that
43
- isolated agent found immediately. "Fresh eyes" claims don't work in same context.
44
-
45
- ### Mandatory Self-Review Detection (DX-72)
46
-
47
- **Before starting review, you MUST check:**
48
-
49
- ```
50
- If ANY file in review scope was edited by agent this session:
51
- ┌──────────────────────────────────────────────────────────────┐
52
- │ 🚨 SELF-REVIEW DETECTED — Isolation Required │
53
- │ │
54
- │ You modified files in the review scope this session. │
55
- │ Same-context review has proven cognitive blind spots. │
56
- │ │
57
- │ Options: │
58
- │ [1] Use --deep (RECOMMENDED) — Spawn isolated agent │
59
- │ [2] Acknowledge risk — User explicitly accepts limitations │
60
- │ │
61
- │ If user says "continue" or "quick review": │
62
- │ → Proceed but add WARNING to final report │
63
- │ → Report MUST state: "Self-review without isolation" │
64
- └──────────────────────────────────────────────────────────────┘
65
- ```
66
-
67
- **Default action:** If user doesn't specify, use `--deep` for self-review.
68
-
69
- ### --deep Mode Execution
70
-
71
- When `--deep` is selected:
72
-
73
- 1. Collect minimal inputs:
74
- - Files to review
75
- - Contracts (if available)
76
- - Test files (if available)
77
-
78
- 2. Spawn Task agent with:
79
- - **Adversarial Code Reviewer persona** (see Appendix)
80
- - NO conversation history
81
- - Only the collected inputs
82
-
83
- 3. Isolated agent returns structured review report
84
-
85
- 4. Main agent fixes issues (if any)
22
+ ---
86
23
 
87
- 5. **CRITICAL: Spawn NEW isolated agent for Round 2+ Review**
24
+ ## Scope Classification (DX-75)
88
25
 
89
- ### --deep Mode Loop (MANDATORY)
26
+ **Before starting, classify the scope:**
90
27
 
91
- ```
92
- while not quality_met:
93
- report = spawn_NEW_isolated_reviewer(files) # new agent every round
94
- if report.has_critical_or_major:
95
- main_agent.fix(report.issues) # main agent applies the fixes
96
- else:
97
- quality_met = True
98
- ```
28
+ | Classification | Criteria | Strategy |
29
+ |----------------|----------|----------|
30
+ | **SMALL** | <5 files AND <1500 lines | THOROUGH (no enumeration) |
31
+ | **MEDIUM** | 5-10 files OR 1500-5000 lines | HYBRID (enum + open) |
32
+ | **LARGE** | >10 files OR >5000 lines | CHUNKED (parallel subagents) |
99
33
 
100
- **Why new agent each round?**
101
- - Main agent has context contamination from fixing
102
- - "Fresh eyes" cannot be achieved in same context
103
- - Round 2 in same context drifts to "verify my fixes" not "find problems"
34
+ **Why different strategies?**
35
+ - SMALL: Pre-enumeration causes "checklist mentality" → you only verify listed items, miss variants
36
+ - LARGE: Without enumeration, attention drifts → later files get less scrutiny
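The classification table in the new version can be sketched as a small function. This is only an illustration, not code shipped in the package: the names `Scope` and `classify_scope` are hypothetical, and the order of the checks (LARGE first, then SMALL, otherwise MEDIUM) is an assumption made to resolve inputs that the table leaves ambiguous, such as 3 files totalling 2000 lines.

```python
from enum import Enum


class Scope(Enum):
    """Scope classes from the table, mapped to their review strategy."""
    SMALL = "THOROUGH"
    MEDIUM = "HYBRID"
    LARGE = "CHUNKED"


def classify_scope(file_count: int, line_count: int) -> Scope:
    """Classify a review scope by file and line counts.

    Assumption: LARGE wins over everything, SMALL needs both limits
    to hold, and anything else is treated as MEDIUM.
    """
    if file_count > 10 or line_count > 5000:
        return Scope.LARGE
    if file_count < 5 and line_count < 1500:
        return Scope.SMALL
    return Scope.MEDIUM


if __name__ == "__main__":
    for files, lines in [(3, 800), (3, 2000), (12, 100)]:
        scope = classify_scope(files, lines)
        print(f"{files} files, {lines} lines: {scope.name} ({scope.value})")
```

Checking LARGE first means a single oversized dimension is enough to trigger chunking, which matches the "OR" in the table.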
104
37
 
105
38
  ---
106
39
 
107
- ## Scope Boundaries
108
-
109
- **This skill IS for:**
110
- - Finding bugs and logic errors in existing code
111
- - Verifying contract semantic value
112
- - Auditing escape hatches
113
- - Security review
114
-
115
- **This skill is NOT for:**
116
- - Implementing new features → switch to `/develop`
117
- - Understanding how code works → switch to `/investigate`
118
- - Deciding on architecture → switch to `/propose`
119
-
120
- **Drift detection:** If you're writing significant new code (not fixes) → STOP, you're in wrong skill.
121
-
122
- ## Auto-Loop Configuration
40
+ ## Strategy: THOROUGH (SMALL scope)
123
41
 
124
42
  ```
125
- MAX_ROUNDS = 5 # Maximum review-fix cycles
126
- AUTO_TRANSITION = true # No human confirmation between roles
127
- ASK_USER = never # NEVER ask user, just do it
43
+ ┌─────────────────────────────────────────────────────────────┐
44
+ │ THOROUGH STRATEGY (for SMALL scope) │
45
+ │ ───────────────────────────────────────────────────────────│
46
+ │ │
47
+ │ ⚠️ DO NOT pre-enumerate issues or patterns │
48
+ │ ⚠️ DO NOT use grep/sig to "find issues first" │
49
+ │ │
50
+ │ Instead: │
51
+ │ 1. Read each file COMPLETELY, line by line │
52
+ │ 2. Apply checklist A-G as you read │
53
+ │ 3. Trust your judgment to find issues │
54
+ │ 4. Look for VARIANTS and EDGE CASES │
55
+ │ │
56
+ │ Why: Pre-enumeration narrows focus to known patterns. │
57
+ │ Small scope = you CAN read everything thoroughly. │
58
+ │ This finds issues that pattern matching misses. │
59
+ └─────────────────────────────────────────────────────────────┘
128
60
  ```
129
61
 
130
- **CRITICAL: After finding issues, IMMEDIATELY switch to FIXER role and fix them.**
131
- **DO NOT ask "Proceed with fixes?" or similar — just fix and continue.**
132
-
133
- ## Prime Directive: Reject Until Proven Correct
134
-
135
- **You are the PROSECUTOR, not the defense attorney.**
136
-
137
- | Trap | Reality Check |
138
- |------|---------------|
139
- | "Seems fine" | You failed to find the bug |
140
- | "Makes sense" | You're rationalizing, not reviewing |
141
- | "Edge case is unlikely" | Edge cases ARE bugs |
142
- | "Comment explains it" | Comments don't fix code |
143
- | "Assessed as acceptable" | "Assessed" ≠ "Fixed" |
144
-
145
- ## Role Separation (CRITICAL)
146
-
147
- **You play TWO distinct roles that cycle AUTOMATICALLY:**
148
-
149
- | Role | Allowed Actions | Forbidden |
150
- |------|-----------------|-----------|
151
- | **REVIEWER** | Find issues (full scope), declare quality_met | Write code, rationalize issues |
152
- | **FIXER** | Implement fixes only | Declare quality_met, dismiss issues |
153
-
154
- **Role Transition Markers (REQUIRED):**
62
+ ## Strategy: HYBRID (MEDIUM scope)
155
63
 
156
64
  ```
157
- ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
158
- 🔍 REVIEWER [Round N] — Full scope review
159
- ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
160
-
161
- ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
162
- 🔧 FIXER [Round N] — Implementing fixes
163
- ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
65
+ ┌─────────────────────────────────────────────────────────────┐
66
+ │ HYBRID STRATEGY (for MEDIUM scope) │
67
+ │ ───────────────────────────────────────────────────────────│
68
+ │ │
69
+ │ Phase 0: ENUMERATE (Main Agent) │
70
+ │ ┌─────────────────────────────────────────────────────┐ │
71
+ │ │ Use grep/invar_sig to find: │ │
72
+ │ │ - All @pre/@post contracts │ │
73
+ │ │ - All @invar:allow escape hatches │ │
74
+ │ │ - Hardcoded strings (secrets?) │ │
75
+ │ │ - subprocess/exec/eval calls │ │
76
+ │ │ - bare except clauses │ │
77
+ │ │ Create issue_map with file:line for each │ │
78
+ │ └─────────────────────────────────────────────────────┘ │
79
+ │ │
80
+ │ Phase 1: GUIDED REVIEW (Isolated Subagent) │
81
+ │ ┌─────────────────────────────────────────────────────┐ │
82
+ │ │ Pass issue_map to subagent │ │
83
+ │ │ Subagent verifies each item │ │
84
+ │ │ Reports: "Checked N/M items from issue_map" │ │
85
+ │ └─────────────────────────────────────────────────────┘ │
86
+ │ │
87
+ │ Phase 2: OPEN DISCOVERY (Same Subagent) │
88
+ │ ┌─────────────────────────────────────────────────────┐ │
89
+ │ │ "Now forget the issue_map. │ │
90
+ │ │ Look for issues NOT in the map: │ │
91
+ │ │ - Variants of listed patterns │ │
92
+ │ │ - Logic errors │ │
93
+ │ │ - Edge cases" │ │
94
+ │ │ Reports: "Found N additional issues" │ │
95
+ │ └─────────────────────────────────────────────────────┘ │
96
+ └─────────────────────────────────────────────────────────────┘
164
97
  ```
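Phase 0 of the HYBRID strategy (enumerate, then hand an `issue_map` to the subagent) could look roughly like the sketch below. It is a plain-regex approximation, not the `invar_sig` tool the skill mentions: the pattern set, the `build_issue_map` name, and the shape of each entry are all assumptions. Hardcoded-string detection is left out because it needs more than a one-line pattern.

```python
import re

# Hypothetical patterns approximating the Phase 0 list in the skill text.
PATTERNS = {
    "contract": re.compile(r"^\s*@(pre|post)\b"),
    "escape_hatch": re.compile(r"@invar:allow"),
    "dangerous_call": re.compile(r"\b(subprocess\.\w+|exec|eval)\s*\("),
    "bare_except": re.compile(r"^\s*except\s*:"),
}


def build_issue_map(sources: dict[str, str]) -> list[dict[str, str]]:
    """Scan source text and record a file:line entry for every pattern hit.

    `sources` maps a file path to that file's full text.
    """
    issue_map = []
    for path, text in sources.items():
        for lineno, line in enumerate(text.splitlines(), start=1):
            for kind, pattern in PATTERNS.items():
                if pattern.search(line):
                    issue_map.append(
                        {
                            "kind": kind,
                            "location": f"{path}:{lineno}",
                            "text": line.strip(),
                        }
                    )
    return issue_map


if __name__ == "__main__":
    sample = {
        "a.py": (
            "@pre(lambda x: x > 0)\n"
            "def f(x):\n"
            "    try:\n"
            "        return eval(x)\n"
            "    except:\n"
            "        pass\n"
        )
    }
    for entry in build_issue_map(sample):
        print(entry["location"], entry["kind"], entry["text"])
```

The map only seeds Pass 1. Pass 2 still has to read the code without it, since a regex will miss variants such as `except Exception: pass`.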
165
98
 
166
- **NO separate "Verify" step.** After Fix, go directly to next round's Review.
167
-
168
- ## Quality Gate Authority
169
-
170
- **ONLY the Reviewer role can declare `quality_met`.**
171
-
172
- Before declaring exit:
173
- 1. Re-read EVERY issue found
174
- 2. For each issue, verify: "Is this ACTUALLY fixed, or did I rationalize it?"
175
- 3. Ask: "Would I accept this excuse from someone else's code?"
176
-
177
- **Self-Check Questions:**
178
- - Did I write code AND declare quality_met? → Role confusion detected
179
- - Did I say "assessed" instead of "fixed"? → Rationalization detected
180
- - Did any MAJOR become a comment instead of code? → Fix failed
181
-
182
- ## Fault-Finding Persona
183
-
184
- Assume:
185
- - The code has bugs until proven otherwise
186
- - The contracts may be meaningless ceremony
187
- - The implementer may have rationalized poor decisions
188
- - Escape hatches may be abused
189
- - **Your own fixes may introduce new bugs**
190
-
191
- You ARE here to:
192
- - Find bugs, logic errors, edge cases
193
- - Challenge whether contracts have semantic value
194
- - Check if code matches contracts (not if code "seems right")
195
-
196
- ## Fresh Eyes Mandate (Round 2+) — ENFORCED
197
-
198
- **For rounds after the first, you MUST adopt "fresh eyes" mindset:**
199
-
200
- > "I am a different reviewer who has never seen this code or the previous fixes."
201
-
202
- | Trap | Correction |
203
- |------|------------|
204
- | "I just fixed this" | Irrelevant. Review it like new code. |
205
- | "This was fine last round" | Maybe you missed something. Check again. |
206
- | "The fix looks correct" | That's FIXER thinking. Find what's WRONG. |
207
-
208
- ### Why This Exists
209
-
210
- Round 2+ in the same context naturally drifts toward "verify my fixes" instead of
211
- "find all problems". This cognitive bias causes issues to slip through:
212
- - Attention focuses on recently-fixed areas
213
- - Brain skips content it "remembers" reading
214
- - Subconscious goal becomes "complete task" not "find bugs"
215
-
216
- ### Mandatory Actions (Round 2+)
217
-
218
- **Before declaring quality_met, you MUST:**
219
-
220
- 1. **RE-READ all files using Read tool**
221
- ```
222
- ❌ WRONG: Rely on context memory ("I already read this")
223
- ✅ RIGHT: Call Read() for each file in scope, every round
224
- ```
225
-
226
- 2. **Systematic audit per code block** (for documentation/examples)
227
- ```
228
- For each code block:
229
- - List all symbols USED (types, functions, classes)
230
- - List all IMPORTS shown
231
- - Verify: every used symbol has corresponding import
232
- ```
233
-
234
- 3. **Section-by-section explicit check**
235
- ```
236
- □ Section 1 checked
237
- □ Section 2 checked
238
- □ Section 3 checked
239
- ... (every section, not "looks fine overall")
240
- ```
241
-
242
- 4. **Verbalize findings before exit**
243
- ```
244
- ❌ WRONG: "Verified fixes, looks good"
245
- ✅ RIGHT: "Re-read 5 files, checked 23 sections, found 0 new issues"
246
- ```
247
-
248
- ### Round 2+ Workflow Diagram
99
+ ## Strategy: CHUNKED (LARGE scope)
249
100
 
250
101
  ```
251
- FIXER [Round N] completes
252
-
253
- ┌─────────────────────────────────────────┐
254
- │ REVIEWER [Round N+1] — MANDATORY STEPS │
255
- │ │
256
- │ 1. Call Read() for EVERY file in scope │
257
- │ (Do NOT skip, do NOT rely on memory)│
258
- │ │
259
- │ 2. For each file: │
260
- │ □ Check section by section │
261
- │ □ Audit imports vs usage │
262
- │ □ Look for issues MISSED before │
263
- │ │
264
- │ 3. Verbalize: "Read X files, checked │
265
- │ Y sections, found Z issues" │
266
- │ │
267
- │ 4. Only THEN: EXIT CHECK │
268
- └─────────────────────────────────────────┘
102
+ ┌─────────────────────────────────────────────────────────────┐
103
+ │ CHUNKED STRATEGY (for LARGE scope) │
104
+ │ ───────────────────────────────────────────────────────────│
105
+ │ │
106
+ │ 1. Split files into chunks of ~3-5 files each │
107
+ │ │
108
+ │ 2. For each chunk (can be parallel): │
109
+ │ - Spawn isolated subagent │
110
+ │ - Use HYBRID strategy within chunk │
111
+ │ │
112
+ │ 3. Cross-chunk analysis: │
113
+ │ - Check cross-file dependencies │
114
+ │ - Check API consistency │
115
+ │ │
116
+ │ 4. Merge all findings, deduplicate │
117
+ │ │
118
+ │ Why: Prevents "attention fatigue" on file 8+ of 15. │
119
+ │ Each chunk gets fresh attention. │
120
+ └─────────────────────────────────────────────────────────────┘
269
121
  ```
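Steps 1 and 4 of the CHUNKED strategy (split into chunks, then merge and deduplicate) are mechanical enough to sketch. The helper names `chunk_files` and `merge_findings`, the default chunk size of 4, and the representation of a finding as a `(location, issue)` tuple are assumptions for illustration. Spawning subagents and the cross-chunk analysis are not modelled here.

```python
def chunk_files(files: list[str], chunk_size: int = 4) -> list[list[str]]:
    """Split the file list into consecutive chunks of at most `chunk_size`.

    The last chunk may be shorter than the "~3-5 files" guideline.
    """
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    return [files[i:i + chunk_size] for i in range(0, len(files), chunk_size)]


def merge_findings(
    per_chunk: list[list[tuple[str, str]]],
) -> list[tuple[str, str]]:
    """Merge per-chunk findings, dropping exact duplicates, keeping order."""
    seen = set()
    merged = []
    for findings in per_chunk:
        for finding in findings:
            if finding not in seen:
                seen.add(finding)
                merged.append(finding)
    return merged


if __name__ == "__main__":
    files = [f"module_{i}.py" for i in range(15)]
    for index, chunk in enumerate(chunk_files(files), start=1):
        print(f"chunk {index}: {len(chunk)} files")
    reports = [
        [("a.py:10", "bare except")],
        [("a.py:10", "bare except"), ("b.py:3", "eval on user input")],
    ]
    print(merge_findings(reports))
```

Exact-match deduplication is the simplest possible rule. Two subagents describing the same defect in different words would still produce two entries.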
270
122
 
271
- **Full scope means:**
272
- 1. Re-run the ENTIRE checklist (A through G)
273
- 2. Review ALL files in scope, not just recent fixes
274
- 3. Check if fixes introduced NEW issues
275
- 4. Look for issues you missed in previous rounds
276
-
277
- ## Entry Actions
278
-
279
- ### Context Refresh (DX-54)
280
-
281
- Before any workflow action:
282
- 1. Read `.invar/context.md` (especially Key Rules section)
283
- 2. Display routing announcement
284
-
285
- ### Routing Announcement
286
-
287
- ```
288
- 📍 Routing: /review — [trigger, e.g. "review_suggested", "user requested review"]
289
- Task: [review scope summary]
290
- ```
291
-
292
- ## Mode Selection
293
-
294
- ### Step 1: Check Self-Review (MANDATORY)
295
-
296
- ```python
297
- # Pseudo-code for self-review detection
298
- files_in_scope = get_review_scope()
299
- files_edited_this_session = get_agent_edits()
300
-
301
- if files_in_scope & files_edited_this_session:
302
- # SELF-REVIEW DETECTED
303
- if user_said("--deep") or user_said("deep review"):
304
- mode = ISOLATED
305
- elif user_said("quick") or user_said("continue"):
306
- mode = SAME_CONTEXT
307
- add_warning_to_report = True # "Self-review without isolation"
308
- else:
309
- # Default: recommend --deep, wait for user choice
310
- show_self_review_warning()
311
- mode = ISOLATED # Default to safe option
312
- ```
123
+ ---
313
124
 
314
- ### Step 2: Check Guard Output
125
+ ## 2-Step Loop (MANDATORY workflow)
315
126
 
316
- Look for `review_suggested` warning:
317
127
  ```
318
- WARNING: review_suggested - High escape hatch count
319
- WARNING: review_suggested - Security-sensitive path detected
320
- WARNING: review_suggested - Low contract coverage
128
+ ┌─────────────────────────────────────────────────────────────┐
129
+ │ Round N: │
130
+ │ │
131
+ │ 1. REVIEWER [Subagent] ─────────────────────────────────── │
132
+ │ • Spawn NEW isolated agent (Task tool) │
133
+ │ • Use strategy based on scope classification │
134
+ │ • Review ALL files in scope (full checklist A-G) │
135
+ │ • Return: issues[] or APPROVED │
136
+ │ │
137
+ │ 2. FIXER [Main Agent] ──────────────────────────────────── │
138
+ │ • Fix CRITICAL/MAJOR issues with CODE │
139
+ │ • Run invar_guard() │
140
+ │ • Cannot declare quality_met │
141
+ │ │
142
+ │ → Loop until: APPROVED OR max_rounds OR no_progress │
143
+ └─────────────────────────────────────────────────────────────┘
321
144
  ```
322
145
 
323
- ### Select Mode (Final Decision)
146
+ **Why new subagent each round?**
147
+ - Main agent has context contamination from fixing
148
+ - "Fresh eyes" impossible in same context
149
+ - Round 2+ drifts to "verify my fixes" not "find problems"
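The control flow of the two-step loop, including its three exit reasons, can be sketched as follows. The `review` and `fix` callables are hypothetical stand-ins: in the skill, `review` would spawn a new isolated subagent over the full scope and `fix` would be the main agent applying code changes. Treating "same issues 2 rounds" as an exact set comparison is also an assumption.

```python
from typing import Callable

MAX_ROUNDS = 5  # value stated in the Mandatory Rules


def run_review_loop(
    review: Callable[[int], list[str]],
    fix: Callable[[list[str]], None],
    max_rounds: int = MAX_ROUNDS,
) -> tuple[str, int]:
    """Alternate review and fix until one of the exit conditions holds.

    `review(round_no)` returns the CRITICAL/MAJOR issue IDs found in that
    round (an empty list means APPROVED). Returns (exit_reason, rounds_used).
    """
    previous = None
    round_no = 0
    while True:
        round_no += 1
        issues = review(round_no)  # fresh reviewer, full scope, every round
        if not issues:
            return "quality_met", round_no
        current = frozenset(issues)
        if current == previous:
            return "no_improvement", round_no
        if round_no >= max_rounds:
            return "max_rounds", round_no
        fix(issues)  # the fixer never declares quality_met
        previous = current


if __name__ == "__main__":
    scripted = {1: ["MAJOR-1", "MAJOR-2"], 2: ["MAJOR-3"], 3: []}
    fixed = []
    result = run_review_loop(lambda n: scripted[n], fixed.extend)
    print(result, fixed)
```

Only the reviewer's return value can end the loop with `quality_met`, which mirrors the rule that the fixer role cannot declare it.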
324
150
 
325
- | Condition | Mode | Notes |
326
- |-----------|------|-------|
327
- | Self-review detected | **Isolated** (default) | Unless user explicitly accepts risk |
328
- | `review_suggested` present | **Isolated** | Guard recommends isolation |
329
- | `--deep` flag | **Isolated** | User requested |
330
- | Others' code, no triggers | **Quick** (same context) | Only valid for non-self code |
151
+ ---
331
152
 
332
- ## Review Checklist
153
+ ## Review Checklist (apply to ALL files)
333
154
 
334
- > **Principle:** Only items requiring semantic judgment. Mechanical checks are handled by Guard.
155
+ > **Principle:** Only items requiring semantic judgment. Mechanical checks handled by Guard.
335
156
 
336
157
  ### A. Contract Semantic Value
337
158
 
@@ -341,273 +162,164 @@ WARNING: review_suggested - Low contract coverage
341
162
  - [ ] Does @post verify meaningful output properties?
342
163
  - Bad: `@post(lambda result: result is not None)`
343
164
  - Good: `@post(lambda result: len(result) == len(input))`
344
-
345
165
  - [ ] Could someone implement correctly from contracts alone?
346
166
  - [ ] Are boundary conditions explicit in contracts?
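The Bad/Good `@post` pair in checklist A is easier to judge with a concrete buggy function. The `post` decorator below is a hypothetical, minimal stand-in written for this example (it is not the contract library the skill refers to), and passing the original arguments to the check is an assumption about how a postcondition might see its input.

```python
import functools


def post(check):
    """Hypothetical postcondition decorator: check(result, *args) must hold."""
    def decorate(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            result = func(*args, **kwargs)
            if not check(result, *args, **kwargs):
                raise AssertionError(f"postcondition failed for {func.__name__}")
            return result
        return wrapper
    return decorate


# Weak contract: true for almost any implementation, so it hides the bug.
@post(lambda result, items: result is not None)
def strip_all_weak(items):
    return [s.strip() for s in items if s.strip()]  # bug: drops blank entries


# Meaningful contract: output must have one entry per input entry.
@post(lambda result, items: len(result) == len(items))
def strip_all_strong(items):
    return [s.strip() for s in items if s.strip()]  # same bug, now caught


if __name__ == "__main__":
    print(strip_all_weak(["a ", "  "]))
    try:
        strip_all_strong(["a ", "  "])
    except AssertionError as exc:
        print("caught:", exc)
```

Both functions share the same defect. Only the length-preserving postcondition exposes it, which is the distinction the checklist item asks the reviewer to look for.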
347
167
 
348
168
  ### B. Doctest Coverage
349
- - [ ] Do doctests cover normal cases?
350
- - [ ] Do doctests cover boundary cases?
351
- - [ ] Do doctests cover error cases?
169
+
170
+ - [ ] Do doctests cover normal, boundary, and error cases?
352
171
  - [ ] Are doctests testing behavior, not just syntax?
353
172
 
354
173
  ### C. Code Quality
174
+
355
175
  - [ ] Is duplicated code worth extracting?
356
176
  - [ ] Is naming consistent and clear?
357
177
  - [ ] Is complexity justified?
358
178
 
359
179
  ### D. Escape Hatch Audit
180
+
360
181
  - [ ] Is each @invar:allow justification valid?
361
182
  - [ ] Could refactoring eliminate the need?
362
- - [ ] Is there a pattern suggesting systematic issues?
363
183
 
364
184
  ### E. Logic Verification
185
+
365
186
  - [ ] Do contracts correctly capture intended behavior?
366
187
  - [ ] Are there paths that bypass contract checks?
367
188
  - [ ] Are there implicit assumptions not in contracts?
368
- - [ ] Is there dead code or unreachable branches?
369
189
 
370
190
  ### F. Security
191
+
371
192
  - [ ] Are inputs validated against security threats (injection, XSS)?
372
193
  - [ ] No hardcoded secrets (API keys, passwords, tokens)?
373
194
  - [ ] Are authentication/authorization checks correct?
374
- - [ ] Is sensitive data properly protected?
375
195
 
376
- ### G. Error Handling & Observability
196
+ ### G. Error Handling
197
+
377
198
  - [ ] Are exceptions caught at appropriate level?
378
199
  - [ ] Are error messages clear without leaking sensitive info?
379
- - [ ] Are critical operations logged for debugging?
380
200
  - [ ] Is there graceful degradation on failure?
381
201
 
382
- ## Excluded (Covered by Guard)
383
-
384
- These are checked by Guard or linters - don't duplicate:
385
- - Core/Shell separation → Guard (forbidden_import, impure_call)
386
- - Shell returns Result[T,E] → Guard (shell_result)
387
- - Missing contracts → Guard (missing_contract)
388
- - File/function size limits → Guard (file_size, function_size)
389
- - Entry point thickness → Guard (entry_point_too_thick)
390
- - Escape hatch count → Guard (review_suggested)
202
+ ---
391
203
 
392
- ## Auto-Loop Workflow (FULLY AUTOMATIC)
204
+ ## Subagent Prompt Templates
393
205
 
394
- **The loop runs AUTOMATICALLY until exit condition is met. NO user interaction.**
206
+ ### THOROUGH (SMALL scope)
395
207
 
396
- **Two-step cycle: Review → Fix → Review → Fix → ...**
208
+ ```
209
+ You are an independent Adversarial Code Reviewer.
397
210
 
398
- ⚠️ **NEVER ask user:**
399
- - "Proceed with fixes?"
400
- - "Should I fix these?"
401
- - "Do you want me to continue?"
211
+ RULES:
212
+ 1. Code is GUILTY until proven INNOCENT
213
+ 2. You did NOT write this code — no emotional attachment
214
+ 3. Find reasons to REJECT, not accept
215
+ 4. Be specific: file:line + concrete fix
402
216
 
403
- **Just do it.** Find issues → Fix them → Review again → Repeat until done.
217
+ STRATEGY: THOROUGH READING
218
+ - Read each file COMPLETELY, line by line
219
+ - DO NOT pre-scan for patterns — just READ
220
+ - Look for VARIANTS and EDGE CASES
221
+ - Trust your judgment
404
222
 
405
- ```
406
- ┌─────────────────────────────────────────────────────────────────┐
407
- │ START: round = 1, issues = [] │
408
- │ │
409
- │ ┌─────────────────────────────────────────────────────────┐ │
410
- │ │ 🔍 REVIEWER [Round N] — Full Scope Review │ │
411
- │ │ 1. Apply FULL checklist (A-G) to ENTIRE scope │ │
412
- │ │ 2. Find ALL issues (don't stop at first) │ │
413
- │ │ 3. Classify: CRITICAL / MAJOR / MINOR │ │
414
- │ │ 4. Check previous fixes: CODE or just COMMENT? │ │
415
- │ │ 5. Check if fixes introduced NEW issues │ │
416
- │ │ 6. Update issues table │ │
417
- │ │ │ │
418
- │ │ EXIT CHECK: │ │
419
- │ │ - IF no CRITICAL/MAJOR found → quality_met, EXIT │ │
420
- │ │ - IF round >= MAX_ROUNDS → max_rounds, EXIT │ │
421
- │ │ - IF no progress (same issues 2 rounds) → EXIT │ │
422
- │ │ - ELSE → AUTO-TRANSITION to FIXER │ │
423
- │ └─────────────────────────────────────────────────────────┘ │
424
- │ ↓ (automatic) │
425
- │ ┌─────────────────────────────────────────────────────────┐ │
426
- │ │ 🔧 FIXER [Round N] │ │
427
- │ │ 1. Fix EACH CRITICAL/MAJOR issue with CODE │ │
428
- │ │ 2. Run invar_guard() after fixes │ │
429
- │ │ 3. NO declaring quality_met (forbidden) │ │
430
- │ │ 4. round++ │ │
431
- │ │ 5. AUTO-TRANSITION to REVIEWER [Round N+1] │ │
432
- │ └─────────────────────────────────────────────────────────┘ │
433
- │ ↓ (automatic, fresh eyes) │
434
- │ [LOOP BACK TO REVIEWER] │
435
- │ │
436
- │ EXIT: Generate final report │
437
- └─────────────────────────────────────────────────────────────────┘
438
- ```
223
+ SCOPE: [list all files]
439
224
 
440
- **Key change from v5.1:** No separate "Verify" step. Each round's Review is a
441
- full-scope audit with the same rigor as Round 1. This prevents the "verification
442
- mindset" trap where standards unconsciously lower after fixing.
225
+ Apply checklist A-G to each file.
443
226
 
444
- ## Loop State Tracking
227
+ OUTPUT FORMAT:
228
+ ## Verdict: APPROVED | NEEDS WORK | REJECTED
229
+ ## Critical Issues (must fix)
230
+ | ID | File:Line | Issue | Fix |
231
+ ## Major Issues (should fix)
232
+ | ID | File:Line | Issue | Fix |
233
+ ## Minor Issues (backlog)
234
+ | ID | File:Line | Issue | Fix |
235
+ ```
445
236
 
446
- **Maintain this state throughout the loop:**
237
+ ### HYBRID (MEDIUM scope)
447
238
 
448
- ```markdown
449
- ## Review State
450
- - **Round:** N / MAX_ROUNDS
451
- - **Role:** REVIEWER | FIXER
452
- - **Issues Found:** [count]
453
- - **Issues Fixed:** [count]
454
- - **Guard Status:** PASS | FAIL
455
239
  ```
240
+ You are an independent Adversarial Code Reviewer.
241
+
242
+ RULES:
243
+ 1. Code is GUILTY until proven INNOCENT
244
+ 2. You did NOT write this code — no emotional attachment
245
+ 3. Find reasons to REJECT, not accept
246
+ 4. Be specific: file:line + concrete fix
456
247
 
457
- ## Issues Table (Updated Each Round)
248
+ STRATEGY: HYBRID (two passes)
458
249
 
459
- | Issue ID | Severity | Round Found | Round Fixed | Status | Evidence |
460
- |----------|----------|-------------|-------------|--------|----------|
461
- | MAJOR-1 | MAJOR | 1 | 1 | ✅ Fixed | Code change at file.py:123 |
462
- | MAJOR-2 | MAJOR | 1 | - | ❌ Unfixed | Fix was comment, not code |
463
- | MAJOR-3 | MAJOR | 2 | - | 🆕 New | Found in Round 2 review |
464
- | MINOR-1 | MINOR | 1 | - | ⏭️ Backlog | Deferred (non-blocking) |
250
+ PASS 1 - GUIDED:
251
+ Using this issue_map, verify each potential issue:
252
+ [issue_map from Phase 0]
465
253
 
466
- **Status Legend:**
467
- - ✅ Fixed — Actually fixed with CODE (not comments)
468
- - ❌ Unfixed — Fix failed, was just a comment, or not addressed
469
- - 🆕 New — Found in a later round (fix may have introduced it, or missed earlier)
470
- - ⏭️ Backlog — MINOR, deferred to later (non-blocking)
254
+ Report: "Verified X/Y items from issue_map"
471
255
 
472
- **Round 2+ Review MUST check:**
473
- 1. Are previous Fixed items ACTUALLY fixed? (Re-verify with fresh eyes)
474
- 2. Did fixes introduce NEW issues?
475
- 3. Did we miss anything in earlier rounds?
256
+ PASS 2 - OPEN DISCOVERY:
257
+ Now FORGET the issue_map. Read the code fresh.
258
+ Look for issues NOT in the map:
259
+ - Variants of listed patterns
260
+ - Logic errors
261
+ - Edge cases
476
262
 
477
- If ANY exists for CRITICAL/MAJOR after MAX_ROUNDS → quality_not_met
263
+ Report: "Found N additional issues not in issue_map"
478
264
 
479
- ## Severity Definitions
265
+ SCOPE: [list all files]
480
266
 
481
- | Level | Meaning | Examples | Exit Blocker? |
482
- |-------|---------|----------|---------------|
483
- | CRITICAL | Security, data loss, crash | SQL injection, unhandled null | **YES** |
484
- | MAJOR | Logic error, missing validation | Wrong calculation, no bounds | **YES** |
485
- | MINOR | Style, documentation | Naming, missing docstring | No (backlog) |
267
+ OUTPUT FORMAT:
268
+ ## Verdict: APPROVED | NEEDS WORK | REJECTED
269
+ ## From Issue Map (Pass 1)
270
+ | ID | File:Line | Issue | Fix |
271
+ ## Additional Findings (Pass 2)
272
+ | ID | File:Line | Issue | Fix |
273
+ ```
486
274
 
487
- ## Exit Conditions (Auto-Loop)
275
+ ---
488
276
 
489
- **Exit is checked at the START of each REVIEWER phase (before finding issues):**
277
+ ## Exit Conditions
490
278
 
491
279
  | Condition | Exit Reason | Result |
492
280
  |-----------|-------------|--------|
493
- | Round N Review finds 0 CRITICAL/MAJOR | `quality_met` | Ready for merge |
494
- | Round >= MAX_ROUNDS | `max_rounds` | ⚠️ Manual review needed |
495
- | No progress (same issues 2 rounds) | `no_improvement` | Architectural issue |
496
-
497
- **quality_met requires ALL of:**
498
- 1. Current round's FULL SCOPE review found zero CRITICAL/MAJOR
499
- 2. All previous issues verified as fixed (with code, not comments)
500
- 3. Guard passes
501
- 4. Issues table complete with evidence
502
-
503
- **Automatic quality_not_met:**
504
- - Any MAJOR "fixed" with comment instead of code
505
- - Any issue marked "assessed" or "acceptable"
506
- - Fixer role declared quality_met (role violation)
507
- - Same CRITICAL/MAJOR persists for 2+ rounds
281
+ | Subagent returns APPROVED | `quality_met` | Ready for merge |
282
+ | round >= 5 | `max_rounds` | Manual review needed |
283
+ | Same issues 2 rounds | `no_improvement` | Architectural issue |
508
284
 
509
- **Important:** quality_met is declared when a Review round finds NO new issues,
510
- not when fixes are applied. This ensures the final state is actually reviewed.
285
+ ---
511
286
 
512
- ## Exit Report (Generated Automatically)
287
+ ## Exit Report
513
288
 
514
- ```markdown
289
+ ```
515
290
  ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
516
291
  📋 REVIEW COMPLETE
517
292
  ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
518
293
 
519
- **Exit Reason:** quality_met | max_rounds | no_improvement
520
- **Total Rounds:** N / MAX_ROUNDS
521
- **Final Round Result:** 0 CRITICAL/MAJOR found (quality_met) | X issues remain
522
- **Guard Status:** PASS | FAIL
523
- **Review Mode:** Isolated | Same-context (self-review⚠️)
294
+ **Scope:** SMALL | MEDIUM | LARGE
295
+ **Strategy:** THOROUGH | HYBRID | CHUNKED
296
+ **Exit:** quality_met | max_rounds | no_improvement
297
+ **Rounds:** N / 5
298
+ **Guard:** PASS | FAIL
524
299
 
525
300
  ## Issues Table
526
-
527
- | Issue | Severity | Found | Fixed | Status | Evidence |
528
- |-------|----------|-------|-------|--------|----------|
529
- | MAJOR-1 | MAJOR | R1 | R1 | ✅ Fixed | Code at file.py:123 |
530
- | MAJOR-2 | MAJOR | R2 | R2 | ✅ Fixed | Added validation |
531
- | ... | ... | ... | ... | ... | ... |
301
+ | Issue | Severity | Round | Status | Evidence |
532
302
 
533
303
  ## Round Summary
304
+ | Round | Found | Fixed |
534
305
 
535
- | Round | Issues Found | Issues Fixed | New from Fixes |
536
- |-------|--------------|--------------|----------------|
537
- | 1 | 3 | 3 | 0 |
538
- | 2 | 1 | 1 | 0 |
539
- | 3 | 0 | - | - | ← quality_met
540
-
541
- ## Self-Check (Final Review Round)
542
-
543
- - [x] Applied FULL checklist (A-G) with fresh eyes
544
- - [x] All fixes are CODE, not comments
545
- - [x] No "assessed as acceptable" rationalizations
546
- - [x] Guard passes after all changes
547
- - [x] Role separation maintained throughout
548
-
549
- ## Self-Review Warning (if applicable)
550
-
551
- ⚠️ **This was a same-context self-review.** Cognitive biases may have caused
552
- issues to be missed. For higher confidence, run `--deep` review before merge.
553
-
554
- Known blind spots in self-review:
555
- - Exception handlers that silently lose data
556
- - Path traversal / security issues in user input
557
- - Edge cases in validation logic
558
- - Documentation-implementation mismatches
559
-
560
- ## Recommendation
561
-
562
- - [x] Ready for merge (quality_met)
563
- - [ ] Needs manual review (max_rounds)
564
- - [ ] Architectural refactor needed (no_improvement)
565
-
566
- **MINOR (Backlog):**
567
- - [list deferred items]
306
+ Final: guard PASS | X errors, Y warnings
568
307
  ```
569
- ## Appendix: Adversarial Code Reviewer Persona
570
308
 
571
- Used in `--deep` mode (isolated agent):
572
-
573
- ```
574
- You are an independent Adversarial Code Reviewer.
309
+ ---
575
310
 
576
- CRITICAL RULES:
577
- 1. Code is GUILTY until proven INNOCENT
578
- 2. You did NOT write this code — no emotional attachment
579
- 3. Find reasons to REJECT, not accept
580
- 4. Be specific and actionable (file:line, concrete fix)
581
- 5. Your job is to find bugs, not approve code
311
+ ## Scope Boundaries
582
312
 
583
- INPUT YOU WILL RECEIVE:
584
- - Code files to review
585
- - Contracts (if available)
586
- - Test files (if available)
313
+ **IS for:** Finding bugs, verifying contracts, security review
314
+ **NOT for:** New features → /develop | Understanding → /investigate
587
315
 
588
- INPUT YOU WILL NOT RECEIVE:
589
- - Development conversation history
590
- - Developer's explanations
591
- - Prior context about design decisions
316
+ ## Excluded (Covered by Guard)
592
317
 
593
- OUTPUT FORMAT:
594
- Produce structured Review Report with:
595
- 1. Verdict: APPROVED / NEEDS WORK / REJECTED
596
- 2. Critical issues (must fix)
597
- 3. Major issues (should fix)
598
- 4. Minor issues (nice to fix)
599
- 5. Positive observations (what's done well)
600
- ```
318
+ Don't duplicate mechanical checks:
319
+ - Core/Shell separation → Guard
320
+ - Missing contracts → Guard
321
+ - File/function size → Guard
601
322
 
602
323
  <!--/invar:skill--><!--invar:extensions-->
603
- <!-- ========================================================================
604
- EXTENSIONS REGION - USER EDITABLE
605
- Add project-specific extensions here. This section is preserved on update.
606
-
607
- Examples of what to add:
608
- - Project-specific security review checklists
609
- - Custom severity definitions
610
- - Domain-specific code patterns to check
611
- - Team code review standards
612
- ======================================================================== -->
324
+ <!-- User extensions preserved on update -->
613
325
  <!--/invar:extensions-->