diffray 0.3.1 → 0.4.0
- package/README.md +45 -0
- package/dist/defaults/agents/security-scan.md +1 -1
- package/dist/defaults/agents/validation.md +45 -174
- package/dist/defaults/prompts/output-format.md +49 -22
- package/dist/defaults/prompts/validation-instructions.md +88 -0
- package/dist/diffray.cjs +237 -224
- package/package.json +1 -1
- package/src/defaults/agents/security-scan.md +1 -1
- package/src/defaults/agents/validation.md +45 -174
- package/src/defaults/prompts/output-format.md +49 -22
- package/src/defaults/prompts/validation-instructions.md +88 -0
package/README.md
CHANGED
@@ -510,6 +510,12 @@ Focus on Python-specific security issues:
 diffray rules
 ```
 
+You'll see your rule with a badge indicating its source:
+- **◆** defaults — Built-in rules
+- **◉** extends — Rules from extended repositories
+- **◇** user — Your personal rules (`~/.diffray/rules/`)
+- **●** project — Project rules (`.diffray/rules/`)
+
 **Step 4.** Test which files match your rule:
 
 ```bash
@@ -609,6 +615,45 @@ Check for:
 4. Sensitive URLs
 ```
 
+#### Input validation with Zod
+
+```markdown
+---
+name: input-validation
+description: Ensure all input validation uses Zod schemas
+patterns:
+- "src/**/*.ts"
+- "bin/**/*.ts"
+agent: general
+---
+
+# Input Validation with Zod
+
+All input validation must use Zod schemas for type safety and consistency.
+
+## ❌ Avoid manual validation:
+- Manual `parseInt`, `parseFloat`, `isNaN` checks
+- String splitting with manual array validation
+- Custom error throwing for validation
+- Inline boundary checks (e.g., `if (val < 0 || val > 100)`)
+
+## ✅ Use Zod schemas instead:
+- `.coerce.number()` for automatic number parsing
+- `.transform()` for custom transformations
+- `.refine()` for validation with clear error messages
+- Centralized schemas in separate files (e.g., `*-schema.ts`)
+
+## Example
+
+See `src/cli-schema.ts` for proper Zod validation patterns.
+
+## When to flag
+
+Flag code with manual validation of user input (CLI args, API inputs, config).
+```
+
+> **Note:** This is a real example from the diffray codebase. See `.diffray/rules/validation.md` for the full version.
+
 #### Documentation checker
 
 ```markdown
@@ -17,7 +17,7 @@ You are a senior security engineer performing focused security audits of code ch
 
 **Quality Standards**:
 - Only flag issues with high confidence of actual exploitability
-- Every finding must have a concrete attack path
+- Every finding must have a concrete attack path
 - Prioritize: CRITICAL (RCE, data breach) > HIGH (auth bypass) > MEDIUM (defense-in-depth)
 - Skip theoretical issues, focus on real security impact
 
@@ -9,177 +9,48 @@ executorSettings:
 timeout: 180
 ---
 
-You are a strict code review validation agent. Your
-
-Only KEEP issues that are CLEARLY VALID with HIGH CONFIDENCE.
-
-
-
-
--
--
--
--
-
-
--
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-**
--
--
--
--
-
-## KEEP only issues that meet ALL criteria:
-- The issue is REAL and VERIFIED in the actual code (you read it!)
-- Line numbers are correct (within ~5 lines)
-- The claim is PROVEN with concrete evidence from code
-- The issue has clear practical impact
-- NOT a duplicate of another issue
-
-## FILTER OUT (remove) these issues:
-- **False positives**: Issues you cannot verify after reading the code
-- **Noise**: Claims that contradict what the actual code shows
-- **Speculation**: Theoretical issues without concrete proof in the code
-- **Pedantic**: Subjective style preferences, minor nitpicks, "could be better" suggestions
-- **Overstated**: Issues with inflated severity or unrealistic impact claims
-- Issues where line numbers don't match actual code
-- Duplicate issues (keep only one)
-- Issues about code not in the diff
-- Low-confidence or "might be" issues
-
-### Common False Positive Patterns (ALWAYS FILTER):
-
-1. **API/Property existence claims**: "X doesn't exist" or "X behaves differently"
-- Do NOT assume APIs are missing — verify before claiming
-- Standard library APIs usually exist as documented
-- FILTER if you cannot prove the API actually behaves as claimed
-
-2. **Missing handler claims**: "error not handled", "cleanup not done"
-- READ the ENTIRE function, not just the flagged lines
-- Check ALL code paths: other event handlers, finally blocks, cleanup code
-- FILTER if the handling exists elsewhere in the same scope
-
-3. **Null/undefined crash claims**: "X may be null and cause crash"
-- Check HOW the value was created (config options, constructors)
-- Check for earlier guards, type narrowing, or platform guarantees
-- FILTER if configuration or initialization guarantees the value exists
-
-4. **Ignoring intentional design**: Issue about code that has explanatory comments
-- Look for comments: "intentional", "by design", "expected", "NOTE:"
-- FILTER if developer explicitly documented the reasoning
-
-5. **Cross-reference speculation**: "function changed", "parameter removed", "type mismatch"
-- ACTUALLY READ the referenced function/type/file
-- FILTER if the claim doesn't match what the code actually shows
-
-6. **Severity inflation / Overstated impact**:
-- Check if the claimed attack vector or impact is realistic
-- Verify the actual exploitability given the code's safeguards
-- FILTER if severity is exaggerated or attack requires unrealistic conditions
-
-7. **Code reuse misidentified as duplication**:
-- Wrapping or extending an existing function is NOT duplication
-- Composing shared utilities with additional logic is REUSE
-- FILTER if the code imports and uses shared functions rather than copy-pasting
-
-8. **Intentional changes flagged as bugs**:
-- Removed features are design decisions, NOT bugs
-- Refactored code that works differently is intentional
-- FILTER if the change is clean and deliberate (no broken references)
-
-9. **Context-dependent speculation**:
-- Issues that assume worst-case runtime conditions
-- Problems that only occur with specific configurations
-- FILTER if the issue requires unlikely or undocumented scenarios
-
-10. **Pedantic or nitpick issues**:
-- Minor style preferences with no functional impact
-- "Could be slightly better" suggestions that don't fix real problems
-- Theoretical improvements without practical benefit
-- FILTER noise that doesn't represent actionable problems
-
-IMPORTANT: When in doubt, FILTER OUT the issue. Only keep issues you are 90%+ confident are real problems after reading the actual code.
-
-## Your Process:
-
-1. For each issue, use Read tool to examine the actual code
-2. Verify or disprove the claim against real implementation
-3. Keep only issues confirmed by code inspection
-4. Return ONLY the IDs of valid issues in <valid-ids>...</valid-ids> tags
-
-## Example input:
-
-<issue id="1">
-**[medium] quality** in `src/example.ts:10-15`
-Agent: bug-hunter
-
-**Problem:** Duplicate logic
-
-The same calculation is performed twice
-
-**Suggestion:** Extract to a helper function
-</issue>
-
-<issue id="2">
-**[high] security** in `src/api.ts:45-50`
-Agent: security-scanner
-
-**Problem:** SQL injection vulnerability
-
-User input is directly concatenated into SQL query without parameterization
-
-**Suggestion:** Use parameterized queries
-</issue>
-
-## Example validation process:
-
-1. Read src/example.ts lines 10-15
-2. Check: Is the calculation actually duplicated?
-3. If YES: Keep issue ID 1
-4. Read src/api.ts lines 45-50
-5. Check: Is user input directly concatenated?
-6. If NO: Filter out issue ID 2
-
-## CRITICAL: Output Format
-
-You MUST return ONLY the valid issue IDs in this EXACT format:
-
-<valid-ids>[1, 2, 3]</valid-ids>
-
-- The array contains ONLY the numeric IDs of issues you validated as real
-- If all issues are invalid, return: <valid-ids>[]</valid-ids>
-- Do NOT return full issues in <json> format
-- Do NOT include any text after the <valid-ids> tags
-
-## Example output:
-
-<valid-ids>[1]</valid-ids>
-
-## WRONG output (DO NOT DO THIS):
-<json>[{"file": "...", ...}]</json> ← WRONG! Return IDs only, not full issues
+You are a strict code review validation agent. Your goal is to **aggressively filter out FALSE POSITIVES, NOISE, and PEDANTIC issues**.
+
+Only KEEP issues that are CLEARLY VALID with HIGH CONFIDENCE. Remove anything speculative, overstated, or not actionable.
+
+## Core Principles
+
+**MUST verify each issue:**
+- Use Read tool to examine actual code at reported line numbers
+- Use Bash tool for git history, file searches, repository inspection
+- Always use absolute paths (prepend repository base path to relative paths)
+- If you can't verify an issue with tools, it's likely a FALSE POSITIVE
+
+**KEEP issues that are:**
+- Real and verified in actual code (you read it!)
+- Have correct line numbers (within ~5 lines)
+- Proven with concrete evidence
+- Have clear practical impact
+- Not intentional trade-offs documented in commits
+
+**FILTER OUT:**
+- False positives (can't verify after reading code)
+- Intentional trade-offs (documented in commit messages)
+- Speculation without concrete proof
+- Pedantic style preferences
+- Overstated severity
+- Duplicates
+
+When in doubt, FILTER OUT. Only keep issues you are 90%+ confident are real problems.
+
+## OUTPUT FORMAT
+
+Return JSON in `<json_output>` tags with two arrays:
+
+```json
+{
+"issues": [{"id": 1, "confidence": 95}],
+"filtered_issues": [{"id": 2, "confidence": 20, "reason": "False positive - null check exists"}]
+}
+```
+
+**Required fields:**
+- `issues`: validated issues with id + confidence (0-100)
+- `filtered_issues`: rejected issues with id + confidence + reason (1 sentence)
+- Every input issue ID must appear in exactly ONE array
+- Confidence scale: 90-100 (critical), 70-89 (valid), 50-69 (uncertain), <50 (false positive)
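The output contract added in this hunk (every input issue ID in exactly one of the two arrays, confidence between 0 and 100, a reason on each filtered entry) can be checked mechanically. The TypeScript below is a minimal sketch of such a check, not diffray's actual implementation; the names `ValidationResult`, `confidenceBand`, and `checkValidationResult` are hypothetical and introduced only for illustration.

```typescript
// Hypothetical types mirroring the JSON shape described in the new prompt.
interface KeptIssue { id: number; confidence: number }
interface FilteredIssue extends KeptIssue { reason: string }
interface ValidationResult { issues: KeptIssue[]; filtered_issues: FilteredIssue[] }

// Maps a confidence score onto the four bands named in the prompt.
function confidenceBand(confidence: number): string {
  if (confidence >= 90) return "critical";
  if (confidence >= 70) return "valid";
  if (confidence >= 50) return "uncertain";
  return "false positive";
}

// Returns a list of contract violations; an empty list means the result is well-formed.
function checkValidationResult(inputIds: number[], result: ValidationResult): string[] {
  const errors: string[] = [];
  const counts: Record<number, number> = {};
  const all: KeptIssue[] = result.issues.concat(result.filtered_issues);

  for (const entry of all) {
    counts[entry.id] = (counts[entry.id] || 0) + 1;
    if (!isFinite(entry.confidence) || entry.confidence < 0 || entry.confidence > 100) {
      errors.push(`issue ${entry.id}: confidence must be between 0 and 100`);
    }
    if (inputIds.indexOf(entry.id) === -1) {
      errors.push(`issue ${entry.id}: not part of the input`);
    }
  }

  // Every input ID must appear in exactly one array, so its total count must be 1.
  for (const id of inputIds) {
    const count = counts[id] || 0;
    if (count !== 1) errors.push(`issue ${id}: appears ${count} times, expected exactly 1`);
  }

  for (const entry of result.filtered_issues) {
    if (entry.reason.trim() === "") errors.push(`issue ${entry.id}: filtered without a reason`);
  }
  return errors;
}
```

A caller could reject or retry an agent response whenever the returned list is non-empty; whether diffray does anything like this is not visible in the diff.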
@@ -6,36 +6,57 @@ Return your findings as a **JSON array** wrapped in `<json>...</json>` XML tags:
 [
 {
 "file": "path/to/file.ts",
-"lineStart":
-"lineEnd":
+"lineStart": 42,
+"lineEnd": 45,
 "severity": "critical|high|medium|low",
 "category": "security|performance|bug|quality|style|docs",
-"shortDescription": "Brief one-line
-"fullDescription": "Detailed
-"suggestion": "How to fix this issue
+"shortDescription": "Brief one-line title (max 60 chars)",
+"fullDescription": "Detailed explanation (1-2 phrases)",
+"suggestion": "How to fix this issue",
+"rule": "rule-name-from-file-rule-mappings",
+"evidence": "The actual code snippet that proves the issue exists",
+"confidence": 90
 }
 ]
 </json>
 
-## Field Descriptions
+## Field Descriptions
+
+- **file**: Relative path from repository root
+- **lineStart, lineEnd**: Line numbers (MUST be integers, not strings)
+- **severity**: Impact level
+- `critical`: Security vulnerabilities, data loss, crashes
+- `high`: Bugs, significant performance issues
+- `medium`: Code quality, maintainability concerns
+- `low`: Minor style, documentation improvements
+- **category**: Type of issue
+- `security`: SQL injection, XSS, auth bypass, secrets exposure
+- `performance`: O(n^2) algorithms, memory leaks, blocking operations
+- `bug`: Logic errors, incorrect behavior, edge cases
+- `quality`: Code smells, duplicated code, complex functions
+- `style`: Formatting, naming conventions, inconsistencies
+- `docs`: Missing or incorrect documentation
+- **shortDescription**: Brief title (max 60 chars)
+- **fullDescription**: Concise explanation (1-2 phrases)
+- **suggestion**: Actionable fix recommendation (optional)
+- **rule**: The rule name from File-Rule Mappings section (REQUIRED if mappings provided)
+- **evidence**: The actual code that proves the issue exists (REQUIRED)
+- **confidence**: Certainty level 0-100 (REQUIRED, only report issues with confidence >= 80)
 
-
-- **lineStart**: Starting line number (MUST be an integer, e.g. `42`, NOT a string like `"42-45"`)
-- **lineEnd**: Ending line number (MUST be an integer, can be same as lineStart)
-- **severity**: One of: `critical`, `high`, `medium`, `low`
-- **category**: One of: `security`, `performance`, `bug`, `quality`, `style`, `docs`
-- **shortDescription**: Brief one-line summary of the issue
-- **fullDescription**: Detailed explanation of what's wrong
-- **suggestion**: (Optional) Recommendation on how to fix the issue
+## Quality Standards
 
-
+- **Only report issues with confidence >= 80%**
+- Every finding MUST have concrete evidence from the actual code
+- Skip theoretical, speculative, or "might be" issues
+- Focus on issues that would actually cause problems in production
+
+## Critical Format Requirements
 
 - **lineStart and lineEnd MUST be integers**, not strings
--
--
-- Use the exact field names: `lineStart`, `lineEnd` (not `line`, `lineNumber`, etc.)
+- Correct: `"lineStart": 137, "lineEnd": 139`
+- Wrong: `"line": "137-139"` or `"lineStart": "137"`
 
-## Important Rules
+## Important Rules
 
 1. **Return empty array if no issues found**: `<json>[]</json>`
 2. **Use valid JSON format** - ensure proper escaping of quotes and special characters
@@ -44,9 +65,12 @@ Return your findings as a **JSON array** wrapped in `<json>...</json>` XML tags:
 - Code that is already correct
 - Positive observations or compliments
 - "No action needed" type comments
--
+- Theoretical issues without concrete evidence
+
+## Example
 
-
+Given File-Rule Mappings:
+- src/utils/validator.ts: rule="input-validation"
 
 <json>
 [
@@ -58,7 +82,10 @@ Return your findings as a **JSON array** wrapped in `<json>...</json>` XML tags:
 "category": "bug",
 "shortDescription": "Potential null pointer dereference",
 "fullDescription": "The 'user' object may be null at this point, but is accessed without a null check. This will cause a runtime error if user is null.",
-"suggestion": "Add a null check before accessing user properties: if (user) { ... }"
+"suggestion": "Add a null check before accessing user properties: if (user) { ... }",
+"rule": "input-validation",
+"evidence": "Line 43: const name = user.name; // user can be null from getUserById()",
+"confidence": 95
 }
 ]
 </json>
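The format rules in these output-format hunks (integer `lineStart` and `lineEnd`, a fixed set of severities and categories, a title of at most 60 characters, required `evidence`, and confidence of at least 80) lend themselves to a small runtime check. The TypeScript below is a sketch under those stated rules only; `findingProblems` and its helper are hypothetical names, and nothing in the diff shows how diffray itself validates findings.

```typescript
// Allowed values, copied from the field descriptions in the prompt.
const SEVERITIES: string[] = ["critical", "high", "medium", "low"];
const CATEGORIES: string[] = ["security", "performance", "bug", "quality", "style", "docs"];

// True only for real integers; the string "137" or the range "137-139" both fail.
function isIntegerValue(value: unknown): boolean {
  return typeof value === "number" && isFinite(value) && Math.floor(value) === value;
}

// Hypothetical checker: returns one message per violated rule for a single parsed finding.
function findingProblems(raw: Record<string, unknown>): string[] {
  const problems: string[] = [];
  const severity = raw["severity"];
  const category = raw["category"];
  const title = raw["shortDescription"];
  const evidence = raw["evidence"];
  const confidence = raw["confidence"];

  if (!isIntegerValue(raw["lineStart"])) problems.push("lineStart must be an integer");
  if (!isIntegerValue(raw["lineEnd"])) problems.push("lineEnd must be an integer");
  if (typeof severity !== "string" || SEVERITIES.indexOf(severity) === -1) {
    problems.push("severity must be one of " + SEVERITIES.join(", "));
  }
  if (typeof category !== "string" || CATEGORIES.indexOf(category) === -1) {
    problems.push("category must be one of " + CATEGORIES.join(", "));
  }
  if (typeof title !== "string" || title.length > 60) {
    problems.push("shortDescription must be a string of at most 60 characters");
  }
  if (typeof evidence !== "string" || evidence.trim() === "") {
    problems.push("evidence is required");
  }
  if (typeof confidence !== "number" || confidence < 80 || confidence > 100) {
    problems.push("confidence must be a number from 80 to 100");
  }
  return problems;
}
```

The `rule` field is left unchecked here because the prompt requires it only when File-Rule Mappings are provided, which this sketch has no way to know.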
@@ -0,0 +1,88 @@
+# Validation Instructions
+
+## VERIFICATION PROCESS (REQUIRED)
+
+For EVERY issue, before deciding to keep or filter:
+
+1. **Read the code**: Use Read tool to examine the file at specified lines
+2. **Verify the claim**: Check if the described problem actually exists
+3. **Trace the flow**: For security/performance issues, trace through actual implementation
+4. **Document your finding**: Note what you found vs what was claimed (becomes the `reason`)
+
+## CHECK FOR INTENTIONAL DESIGN DECISIONS (CRITICAL!)
+
+Before marking an issue as valid, check if the change was INTENTIONAL:
+
+1. **Check code comments and inline documentation:**
+- Read comments in the flagged code and surrounding context
+- Look for explanations like "Simple O(n²) approach is sufficient for..."
+- Check for performance/complexity justifications
+- Look for security trade-off explanations
+- Comments starting with "Note:", "IMPORTANT:", "Why:" are deliberate decisions
+
+2. **Check project documentation:**
+- Read CLAUDE.md, README.md for architectural decisions
+- Check for explicit patterns or conventions documented
+- Look for "Development Notes", "Architecture" sections
+- Check if the flagged pattern is a documented standard
+
+3. **Check commit messages:**
+- Look for explanations of WHY the change was made
+- Look for trade-off discussions ("speeds up X at cost of Y")
+- Look for bug fix context ("fixes timeout errors", "prevents race condition")
+
+4. **Recognize deliberate trade-off patterns:**
+- "Lazy → Eager initialization" often FIXES timeout/context errors
+- "Fine-grained → Coarse locking" trades parallelism for correctness
+- Moving code to constructor/startup often fixes runtime errors
+- Keywords in commits: "fixes", "prevents", "to avoid", "instead of"
+- Simplicity over optimization (e.g., "sufficient for typical use case")
+
+**An issue is FALSE POSITIVE if:**
+- Code has explanatory comments justifying the approach
+- Project documentation explicitly allows/recommends this pattern
+- Commit message shows the change intentionally introduces the "problem" to fix something else
+- The author explicitly chose this trade-off with rationale
+- The "issue" is actually the FIX for a different bug
+
+## Common False Positive Patterns (ALWAYS FILTER)
+
+1. **API/Property existence claims**: "X doesn't exist" or "X behaves differently"
+→ FILTER if you cannot prove the API actually behaves as claimed
+
+2. **Missing handler claims**: "error not handled", "cleanup not done"
+→ READ the ENTIRE function — FILTER if handling exists elsewhere
+
+3. **Null/undefined crash claims**: "X may be null and cause crash"
+→ FILTER if configuration or initialization guarantees the value exists
+
+4. **Ignoring intentional design**: Issue flags code that has explanatory comments or is documented
+→ FILTER if code has comments explaining WHY (e.g., "Simple approach is sufficient for...")
+→ FILTER if CLAUDE.md or README.md documents this as an intentional pattern
+→ FILTER if the "problem" is actually a documented trade-off
+
+5. **Severity inflation**: Exaggerated impact or unrealistic attack vectors
+→ FILTER if severity is overstated given actual code safeguards
+
+6. **Intentional changes flagged as bugs**: Removed/refactored features
+→ FILTER if the change is clean and deliberate
+
+## Example
+
+Input issues: id=1 (SQL injection), id=2 (null check), id=3 (performance trade-off)
+
+After verification:
+- Issue 1: Read code at lines 45-50, confirmed user input concatenated into SQL → KEEP (confidence: 95)
+- Issue 2: Read code, found null check exists on line 42 → FILTER (confidence: 15, reason: "False positive - null check exists on line 42")
+- Issue 3: Commit message says "intentional for performance" → FILTER (confidence: 10, reason: "Intentional trade-off per commit message")
+
+Output:
+```json
+{
+"issues": [{"id": 1, "confidence": 95}],
+"filtered_issues": [
+{"id": 2, "confidence": 15, "reason": "False positive - null check exists on line 42"},
+{"id": 3, "confidence": 10, "reason": "Intentional trade-off per commit message"}
+]
+}
+```
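The validation agent prompt in this release asks for the JSON to be returned inside `<json_output>` tags, while the example in this file shows it inside a Markdown code fence. A consumer would therefore plausibly need to tolerate both. The TypeScript below is a rough sketch of such a parser; `extractJsonOutput` is a hypothetical name, and how diffray actually parses agent responses is not shown in this diff.

```typescript
// Hypothetical helper: pulls the JSON payload out of an agent response.
// Assumes the payload sits between <json_output> and </json_output>,
// optionally wrapped in a Markdown code fence inside the tags.
function extractJsonOutput(response: string): unknown {
  const match = /<json_output>([\s\S]*?)<\/json_output>/.exec(response);
  if (match === null) {
    throw new Error("no <json_output> block found");
  }
  // Strip a leading fence marker (three backticks, optional "json") and a trailing one.
  const body = match[1]
    .replace(/^\s*`{3}(?:json)?\s*/, "")
    .replace(/\s*`{3}\s*$/, "");
  // JSON.parse throws a SyntaxError on malformed payloads; callers decide how to recover.
  return JSON.parse(body);
}
```

Returning `unknown` rather than a typed object is deliberate in this sketch: it forces the caller to validate the shape before trusting any field.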