@vigolium/piolium 0.0.1

Files changed (271)
  1. package/LICENSE +21 -0
  2. package/README.md +117 -0
  3. package/agents/access-auditor.md +300 -0
  4. package/agents/assumption-breaker.md +154 -0
  5. package/agents/attack-designer.md +116 -0
  6. package/agents/code-scanner.md +139 -0
  7. package/agents/concurrency-auditor.md +238 -0
  8. package/agents/confirm-writer.md +257 -0
  9. package/agents/context-reviewer.md +274 -0
  10. package/agents/cross-verifier.md +165 -0
  11. package/agents/cve-scout.md +381 -0
  12. package/agents/env-builder.md +282 -0
  13. package/agents/env-profiler.md +205 -0
  14. package/agents/evidence-collector.md +140 -0
  15. package/agents/finding-grader.md +142 -0
  16. package/agents/finding-writer.md +148 -0
  17. package/agents/flow-tracer.md +106 -0
  18. package/agents/goal-backtracer.md +146 -0
  19. package/agents/history-miner.md +467 -0
  20. package/agents/independent-verifier.md +118 -0
  21. package/agents/intent-mapper.md +183 -0
  22. package/agents/longshot-collector.md +128 -0
  23. package/agents/longshot-prober.md +126 -0
  24. package/agents/patch-auditor.md +73 -0
  25. package/agents/poc-author.md +124 -0
  26. package/agents/poc-runner.md +194 -0
  27. package/agents/probe-lead.md +269 -0
  28. package/agents/red-challenger.md +101 -0
  29. package/agents/report-composer.md +208 -0
  30. package/agents/review-adjudicator.md +216 -0
  31. package/agents/spec-auditor.md +155 -0
  32. package/agents/taint-tracer.md +265 -0
  33. package/agents/test-locator.md +209 -0
  34. package/agents/threat-modeler.md +132 -0
  35. package/agents/variant-scanner.md +108 -0
  36. package/agents/variant-spotter.md +110 -0
  37. package/bin/piolium.mjs +376 -0
  38. package/extensions/piolium/_vendor/yaml.bundle.d.mts +6 -0
  39. package/extensions/piolium/_vendor/yaml.bundle.mjs +139 -0
  40. package/extensions/piolium/agent-runner.ts +322 -0
  41. package/extensions/piolium/agents.ts +266 -0
  42. package/extensions/piolium/audit-state.ts +522 -0
  43. package/extensions/piolium/bundled-resources.ts +97 -0
  44. package/extensions/piolium/candidate-scan.ts +966 -0
  45. package/extensions/piolium/command-target.ts +177 -0
  46. package/extensions/piolium/console-stream.ts +57 -0
  47. package/extensions/piolium/export-results.ts +380 -0
  48. package/extensions/piolium/findings.ts +448 -0
  49. package/extensions/piolium/heartbeat.ts +182 -0
  50. package/extensions/piolium/help.ts +234 -0
  51. package/extensions/piolium/index.ts +1865 -0
  52. package/extensions/piolium/longshot.ts +530 -0
  53. package/extensions/piolium/matcher-suggestions.ts +196 -0
  54. package/extensions/piolium/matcher-utils.ts +83 -0
  55. package/extensions/piolium/modes/balanced.ts +750 -0
  56. package/extensions/piolium/modes/confirm-bootstrap.ts +186 -0
  57. package/extensions/piolium/modes/confirm.ts +697 -0
  58. package/extensions/piolium/modes/deep.ts +917 -0
  59. package/extensions/piolium/modes/diff.ts +177 -0
  60. package/extensions/piolium/modes/lite.ts +540 -0
  61. package/extensions/piolium/modes/longshot.ts +595 -0
  62. package/extensions/piolium/modes/merge.ts +204 -0
  63. package/extensions/piolium/modes/phase-runner.ts +267 -0
  64. package/extensions/piolium/modes/reinvest.ts +546 -0
  65. package/extensions/piolium/modes/revisit.ts +279 -0
  66. package/extensions/piolium/modes.ts +48 -0
  67. package/extensions/piolium/phase-labels.ts +123 -0
  68. package/extensions/piolium/phase-status-strip.ts +92 -0
  69. package/extensions/piolium/prompt-prefix-editor.ts +39 -0
  70. package/extensions/piolium/providers/anthropic-vertex.ts +836 -0
  71. package/extensions/piolium/recon.ts +409 -0
  72. package/extensions/piolium/result-stats.ts +105 -0
  73. package/extensions/piolium/retry.ts +120 -0
  74. package/extensions/piolium/scheduler.ts +212 -0
  75. package/extensions/piolium/secrets.ts +368 -0
  76. package/extensions/piolium/tools/web-tools.ts +148 -0
  77. package/package.json +77 -0
  78. package/skills/agentic-actions-auditor/SKILL.md +327 -0
  79. package/skills/agentic-actions-auditor/references/action-profiles.md +186 -0
  80. package/skills/agentic-actions-auditor/references/cross-file-resolution.md +209 -0
  81. package/skills/agentic-actions-auditor/references/foundations.md +94 -0
  82. package/skills/agentic-actions-auditor/references/vector-a-env-var-intermediary.md +77 -0
  83. package/skills/agentic-actions-auditor/references/vector-b-direct-expression-injection.md +83 -0
  84. package/skills/agentic-actions-auditor/references/vector-c-cli-data-fetch.md +83 -0
  85. package/skills/agentic-actions-auditor/references/vector-d-pr-target-checkout.md +88 -0
  86. package/skills/agentic-actions-auditor/references/vector-e-error-log-injection.md +88 -0
  87. package/skills/agentic-actions-auditor/references/vector-f-subshell-expansion.md +82 -0
  88. package/skills/agentic-actions-auditor/references/vector-g-eval-of-ai-output.md +91 -0
  89. package/skills/agentic-actions-auditor/references/vector-h-dangerous-sandbox-configs.md +102 -0
  90. package/skills/agentic-actions-auditor/references/vector-i-wildcard-allowlists.md +88 -0
  91. package/skills/audit/SKILL.md +562 -0
  92. package/skills/audit/assets/icon.svg +7 -0
  93. package/skills/audit/hooks/scripts/validate_phase_output.py +550 -0
  94. package/skills/audit/references/adversarial-review.md +148 -0
  95. package/skills/audit/references/architecture-aware-sast.md +306 -0
  96. package/skills/audit/references/audit-workflow.md +737 -0
  97. package/skills/audit/references/chamber-protocol.md +384 -0
  98. package/skills/audit/references/creative-attack-modes.md +221 -0
  99. package/skills/audit/references/deep-analysis.md +273 -0
  100. package/skills/audit/references/domain-attack-playbooks.md +1129 -0
  101. package/skills/audit/references/knowledge-base-template.md +513 -0
  102. package/skills/audit/references/real-env-validation.md +191 -0
  103. package/skills/audit/references/report-templates.md +417 -0
  104. package/skills/audit/references/triage-and-prereqs.md +134 -0
  105. package/skills/audit/scripts/consolidate_drafts.py +554 -0
  106. package/skills/audit/scripts/partition_findings.py +152 -0
  107. package/skills/audit/scripts/rg-hotspots.sh +121 -0
  108. package/skills/audit/scripts/stamp_file_state.py +349 -0
  109. package/skills/code-reviewer/SKILL.md +65 -0
  110. package/skills/codeql/SKILL.md +281 -0
  111. package/skills/codeql/references/build-fixes.md +90 -0
  112. package/skills/codeql/references/diagnostic-query-templates.md +339 -0
  113. package/skills/codeql/references/extension-yaml-format.md +209 -0
  114. package/skills/codeql/references/important-only-suite.md +153 -0
  115. package/skills/codeql/references/language-details.md +207 -0
  116. package/skills/codeql/references/macos-arm64e-workaround.md +179 -0
  117. package/skills/codeql/references/performance-tuning.md +111 -0
  118. package/skills/codeql/references/quality-assessment.md +172 -0
  119. package/skills/codeql/references/ruleset-catalog.md +63 -0
  120. package/skills/codeql/references/run-all-suite.md +92 -0
  121. package/skills/codeql/references/sarif-processing.md +79 -0
  122. package/skills/codeql/references/threat-models.md +51 -0
  123. package/skills/codeql/workflows/build-database.md +280 -0
  124. package/skills/codeql/workflows/create-data-extensions.md +261 -0
  125. package/skills/codeql/workflows/run-analysis.md +301 -0
  126. package/skills/differential-review/SKILL.md +220 -0
  127. package/skills/differential-review/adversarial.md +203 -0
  128. package/skills/differential-review/methodology.md +234 -0
  129. package/skills/differential-review/patterns.md +300 -0
  130. package/skills/differential-review/reporting.md +369 -0
  131. package/skills/fp-check/SKILL.md +125 -0
  132. package/skills/fp-check/references/bug-class-verification.md +114 -0
  133. package/skills/fp-check/references/deep-verification.md +143 -0
  134. package/skills/fp-check/references/evidence-templates.md +91 -0
  135. package/skills/fp-check/references/false-positive-patterns.md +115 -0
  136. package/skills/fp-check/references/gate-reviews.md +27 -0
  137. package/skills/fp-check/references/standard-verification.md +78 -0
  138. package/skills/insecure-defaults/SKILL.md +117 -0
  139. package/skills/insecure-defaults/references/examples.md +409 -0
  140. package/skills/last30days/SKILL.md +444 -0
  141. package/skills/sarif-parsing/SKILL.md +483 -0
  142. package/skills/sarif-parsing/resources/jq-queries.md +162 -0
  143. package/skills/sarif-parsing/resources/sarif_helpers.py +331 -0
  144. package/skills/security-threat-model/LICENSE.txt +201 -0
  145. package/skills/security-threat-model/SKILL.md +81 -0
  146. package/skills/security-threat-model/agents/openai.yaml +4 -0
  147. package/skills/security-threat-model/references/prompt-template.md +255 -0
  148. package/skills/security-threat-model/references/security-controls-and-assets.md +32 -0
  149. package/skills/semgrep/SKILL.md +212 -0
  150. package/skills/semgrep/references/rulesets.md +162 -0
  151. package/skills/semgrep/references/scan-modes.md +110 -0
  152. package/skills/semgrep/references/scanner-task-prompt.md +140 -0
  153. package/skills/semgrep/scripts/merge_sarif.py +203 -0
  154. package/skills/semgrep/workflows/scan-workflow.md +311 -0
  155. package/skills/semgrep-rule-creator/SKILL.md +168 -0
  156. package/skills/semgrep-rule-creator/references/quick-reference.md +202 -0
  157. package/skills/semgrep-rule-creator/references/workflow.md +240 -0
  158. package/skills/semgrep-rule-variant-creator/SKILL.md +205 -0
  159. package/skills/semgrep-rule-variant-creator/references/applicability-analysis.md +250 -0
  160. package/skills/semgrep-rule-variant-creator/references/language-syntax-guide.md +324 -0
  161. package/skills/semgrep-rule-variant-creator/references/workflow.md +518 -0
  162. package/skills/sharp-edges/SKILL.md +292 -0
  163. package/skills/sharp-edges/references/auth-patterns.md +252 -0
  164. package/skills/sharp-edges/references/case-studies.md +274 -0
  165. package/skills/sharp-edges/references/config-patterns.md +333 -0
  166. package/skills/sharp-edges/references/crypto-apis.md +190 -0
  167. package/skills/sharp-edges/references/lang-c.md +205 -0
  168. package/skills/sharp-edges/references/lang-csharp.md +285 -0
  169. package/skills/sharp-edges/references/lang-go.md +270 -0
  170. package/skills/sharp-edges/references/lang-java.md +263 -0
  171. package/skills/sharp-edges/references/lang-javascript.md +269 -0
  172. package/skills/sharp-edges/references/lang-kotlin.md +265 -0
  173. package/skills/sharp-edges/references/lang-php.md +245 -0
  174. package/skills/sharp-edges/references/lang-python.md +274 -0
  175. package/skills/sharp-edges/references/lang-ruby.md +273 -0
  176. package/skills/sharp-edges/references/lang-rust.md +272 -0
  177. package/skills/sharp-edges/references/lang-swift.md +287 -0
  178. package/skills/sharp-edges/references/language-specific.md +588 -0
  179. package/skills/spec-to-code-compliance/SKILL.md +357 -0
  180. package/skills/spec-to-code-compliance/resources/COMPLETENESS_CHECKLIST.md +69 -0
  181. package/skills/spec-to-code-compliance/resources/IR_EXAMPLES.md +417 -0
  182. package/skills/spec-to-code-compliance/resources/OUTPUT_REQUIREMENTS.md +105 -0
  183. package/skills/supply-chain-risk-auditor/SKILL.md +67 -0
  184. package/skills/supply-chain-risk-auditor/resources/results-template.md +41 -0
  185. package/skills/variant-analysis/METHODOLOGY.md +327 -0
  186. package/skills/variant-analysis/SKILL.md +142 -0
  187. package/skills/variant-analysis/resources/codeql/cpp.ql +119 -0
  188. package/skills/variant-analysis/resources/codeql/go.ql +69 -0
  189. package/skills/variant-analysis/resources/codeql/java.ql +71 -0
  190. package/skills/variant-analysis/resources/codeql/javascript.ql +63 -0
  191. package/skills/variant-analysis/resources/codeql/python.ql +80 -0
  192. package/skills/variant-analysis/resources/semgrep/cpp.yaml +98 -0
  193. package/skills/variant-analysis/resources/semgrep/go.yaml +63 -0
  194. package/skills/variant-analysis/resources/semgrep/java.yaml +61 -0
  195. package/skills/variant-analysis/resources/semgrep/javascript.yaml +60 -0
  196. package/skills/variant-analysis/resources/semgrep/python.yaml +72 -0
  197. package/skills/variant-analysis/resources/variant-report-template.md +75 -0
  198. package/skills/vuln-report/SKILL.md +137 -0
  199. package/skills/vuln-report/agents/openai.yaml +4 -0
  200. package/skills/vuln-report/references/report-template.md +135 -0
  201. package/skills/wooyun-legacy/SKILL.md +367 -0
  202. package/skills/wooyun-legacy/references/bank-penetration.md +222 -0
  203. package/skills/wooyun-legacy/references/checklists/command-execution-checklist.md +119 -0
  204. package/skills/wooyun-legacy/references/checklists/csrf-checklist.md +74 -0
  205. package/skills/wooyun-legacy/references/checklists/file-upload-checklist.md +108 -0
  206. package/skills/wooyun-legacy/references/checklists/info-disclosure-checklist.md +114 -0
  207. package/skills/wooyun-legacy/references/checklists/logic-flaws-checklist.md +95 -0
  208. package/skills/wooyun-legacy/references/checklists/misconfig-checklist.md +124 -0
  209. package/skills/wooyun-legacy/references/checklists/path-traversal-checklist.md +87 -0
  210. package/skills/wooyun-legacy/references/checklists/rce-checklist.md +93 -0
  211. package/skills/wooyun-legacy/references/checklists/sql-injection-checklist.md +97 -0
  212. package/skills/wooyun-legacy/references/checklists/ssrf-checklist.md +99 -0
  213. package/skills/wooyun-legacy/references/checklists/unauthorized-access-checklist.md +89 -0
  214. package/skills/wooyun-legacy/references/checklists/weak-password-checklist.md +115 -0
  215. package/skills/wooyun-legacy/references/checklists/xss-checklist.md +103 -0
  216. package/skills/wooyun-legacy/references/checklists/xxe-checklist.md +130 -0
  217. package/skills/wooyun-legacy/references/info-disclosure.md +975 -0
  218. package/skills/wooyun-legacy/references/logic-flaws.md +721 -0
  219. package/skills/wooyun-legacy/references/path-traversal.md +1191 -0
  220. package/skills/wooyun-legacy/references/telecom-penetration.md +156 -0
  221. package/skills/wooyun-legacy/references/unauthorized-access.md +980 -0
  222. package/skills/wooyun-legacy/references/xss.md +746 -0
  223. package/skills/zeroize-audit/SKILL.md +371 -0
  224. package/skills/zeroize-audit/configs/c.yaml +21 -0
  225. package/skills/zeroize-audit/configs/default.yaml +128 -0
  226. package/skills/zeroize-audit/configs/rust.yaml +83 -0
  227. package/skills/zeroize-audit/prompts/report_template.md +238 -0
  228. package/skills/zeroize-audit/prompts/system.md +163 -0
  229. package/skills/zeroize-audit/prompts/task.md +97 -0
  230. package/skills/zeroize-audit/references/compile-commands.md +231 -0
  231. package/skills/zeroize-audit/references/detection-strategy.md +191 -0
  232. package/skills/zeroize-audit/references/ir-analysis.md +252 -0
  233. package/skills/zeroize-audit/references/mcp-analysis.md +221 -0
  234. package/skills/zeroize-audit/references/poc-generation.md +470 -0
  235. package/skills/zeroize-audit/references/rust-zeroization-patterns.md +867 -0
  236. package/skills/zeroize-audit/schemas/input.json +83 -0
  237. package/skills/zeroize-audit/schemas/output.json +140 -0
  238. package/skills/zeroize-audit/tools/analyze_asm.sh +202 -0
  239. package/skills/zeroize-audit/tools/analyze_cfg.py +381 -0
  240. package/skills/zeroize-audit/tools/analyze_heap.sh +211 -0
  241. package/skills/zeroize-audit/tools/analyze_ir_semantic.py +429 -0
  242. package/skills/zeroize-audit/tools/diff_ir.sh +135 -0
  243. package/skills/zeroize-audit/tools/diff_rust_mir.sh +189 -0
  244. package/skills/zeroize-audit/tools/emit_asm.sh +67 -0
  245. package/skills/zeroize-audit/tools/emit_ir.sh +77 -0
  246. package/skills/zeroize-audit/tools/emit_rust_asm.sh +178 -0
  247. package/skills/zeroize-audit/tools/emit_rust_ir.sh +150 -0
  248. package/skills/zeroize-audit/tools/emit_rust_mir.sh +158 -0
  249. package/skills/zeroize-audit/tools/extract_compile_flags.py +284 -0
  250. package/skills/zeroize-audit/tools/generate_poc.py +1329 -0
  251. package/skills/zeroize-audit/tools/mcp/apply_confidence_gates.py +113 -0
  252. package/skills/zeroize-audit/tools/mcp/check_mcp.sh +68 -0
  253. package/skills/zeroize-audit/tools/mcp/normalize_mcp_evidence.py +125 -0
  254. package/skills/zeroize-audit/tools/scripts/check_llvm_patterns.py +481 -0
  255. package/skills/zeroize-audit/tools/scripts/check_mir_patterns.py +554 -0
  256. package/skills/zeroize-audit/tools/scripts/check_rust_asm.py +424 -0
  257. package/skills/zeroize-audit/tools/scripts/check_rust_asm_aarch64.py +300 -0
  258. package/skills/zeroize-audit/tools/scripts/check_rust_asm_x86.py +283 -0
  259. package/skills/zeroize-audit/tools/scripts/find_dangerous_apis.py +375 -0
  260. package/skills/zeroize-audit/tools/scripts/semantic_audit.py +923 -0
  261. package/skills/zeroize-audit/tools/track_dataflow.sh +196 -0
  262. package/skills/zeroize-audit/tools/validate_rust_toolchain.sh +298 -0
  263. package/skills/zeroize-audit/workflows/phase-0-preflight.md +150 -0
  264. package/skills/zeroize-audit/workflows/phase-1-source-analysis.md +144 -0
  265. package/skills/zeroize-audit/workflows/phase-2-compiler-analysis.md +139 -0
  266. package/skills/zeroize-audit/workflows/phase-3-interim-report.md +46 -0
  267. package/skills/zeroize-audit/workflows/phase-4-poc-generation.md +46 -0
  268. package/skills/zeroize-audit/workflows/phase-5-poc-validation.md +136 -0
  269. package/skills/zeroize-audit/workflows/phase-6-final-report.md +44 -0
  270. package/skills/zeroize-audit/workflows/phase-7-test-generation.md +42 -0
  271. package/themes/piolium-srcery.json +94 -0
@@ -0,0 +1,125 @@
+ ---
+ name: fp-check
+ description: "Systematically verifies suspected security bugs to eliminate false positives. Produces TRUE POSITIVE or FALSE POSITIVE verdicts with documented evidence for each bug."
+ allowed-tools:
+ - Read
+ - Grep
+ - Glob
+ - LSP
+ - Bash
+ - Task
+ - Write
+ - Edit
+ - AskUserQuestion
+ - TaskCreate
+ - TaskUpdate
+ - TaskList
+ - TaskGet
+ ---
+
+ # False Positive Check
+
+ ## When to Use
+
+ - "Is this bug real?" or "is this a true positive?"
+ - "Is this a false positive?" or "verify this finding"
+ - "Check if this vulnerability is exploitable"
+ - Any request to verify or validate a specific suspected bug
+
+ ## When NOT to Use
+
+ - Finding or hunting for bugs ("find bugs", "security analysis", "audit code")
+ - General code review for style, performance, or maintainability
+ - Feature development, refactoring, or non-security tasks
+ - When the user explicitly asks for a quick scan without verification
+
+ ## Rationalizations to Reject
+
+ If you catch yourself thinking any of these, STOP.
+
+ | Rationalization | Why It's Wrong | Required Action |
+ |---|---|---|
+ | "Rapid analysis of remaining bugs" | Every bug gets full verification | Return to task list, verify next bug through all phases |
+ | "This pattern looks dangerous, so it's a vulnerability" | Pattern recognition is not analysis | Complete data flow tracing before any conclusion |
+ | "Skipping full verification for efficiency" | No partial analysis allowed | Execute all steps per the chosen verification path |
+ | "The code looks unsafe, reporting without tracing data flow" | Unsafe-looking code may have upstream validation | Trace the complete path from source to sink |
+ | "Similar code was vulnerable elsewhere" | Each context has different validation, callers, and protections | Verify this specific instance independently |
+ | "This is clearly critical" | LLMs are biased toward seeing bugs and overrating severity | Complete devil's advocate review; prove it with evidence |
+
+ ---
+
+ ## Step 0: Understand the Claim and Context
+
+ Before any analysis, restate the bug in your own words. If you cannot do this clearly, ask the user for clarification using AskUserQuestion. Half of false positives collapse at this step — the claim doesn't make coherent sense when restated precisely.
+
+ Document:
+
+ - **What is the exact vulnerability claim?** (e.g., "heap buffer overflow in `parse_header()` when `content_length` exceeds 4096")
+ - **What is the alleged root cause?** (e.g., "missing bounds check before `memcpy` at line 142")
+ - **What is the supposed trigger?** (e.g., "attacker sends HTTP request with oversized Content-Length header")
+ - **What is the claimed impact?** (e.g., "remote code execution via controlled heap corruption")
+ - **What is the threat model?** What privilege level does this code run at? Is it sandboxed? What can the attacker already do before triggering this bug? (e.g., "unauthenticated remote attacker vs privileged local user"; "runs inside Chrome renderer sandbox" vs "runs as root with no sandbox")
+ - **What is the bug class?** Classify the bug and consult [bug-class-verification.md]({baseDir}/references/bug-class-verification.md) for class-specific verification requirements that supplement the generic phases below.
+ - **Execution context**: When and how is this code path reached during normal execution?
+ - **Caller analysis**: What functions call this code and what input constraints do they impose?
+ - **Architectural context**: Is this part of a larger security system with multiple protection layers?
+ - **Historical context**: Any recent changes, known issues, or previous security reviews of this code area?
+
+ ## Route: Standard vs Deep Verification
+
+ After Step 0, choose a verification path.
+
+ ### Standard Verification
+
+ Use when ALL of these hold:
+
+ - Clear, specific vulnerability claim (not vague or ambiguous)
+ - Single component — no cross-component interaction in the bug path
+ - Well-understood bug class (buffer overflow, SQL injection, XSS, integer overflow, etc.)
+ - No concurrency or async involved in the trigger
+ - Straightforward data flow from source to sink
+
+ Follow [standard-verification.md]({baseDir}/references/standard-verification.md). No task creation — work through the linear checklist, documenting findings inline.
+
+ ### Deep Verification
+
+ Use when ANY of these hold:
+
+ - Ambiguous claim that could be interpreted multiple ways
+ - Cross-component bug path (data flows through 3+ modules or services)
+ - Race conditions, TOCTOU, or concurrency in the trigger mechanism
+ - Logic bugs without a clear spec to verify against
+ - Standard verification was inconclusive or escalated
+ - User explicitly requests full verification
+
+ Follow [deep-verification.md]({baseDir}/references/deep-verification.md). Create the full task dependency graph and execute phases with the plugin's agents.
+
+ ### Default
+
+ Start with standard. Standard verification has two built-in escalation checkpoints that route to deep when complexity exceeds the linear checklist.
+
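The routing rules in this section can be sketched as a small decision function. This is an illustration only and not part of the package: the `Claim` record, its field names, and the `route` and `meets_standard_criteria` functions are hypothetical names chosen to mirror the two criteria lists.

```python
from dataclasses import dataclass


@dataclass
class Claim:
    """Hypothetical record of Step 0 answers; all field names are illustrative."""
    is_specific: bool                     # clear, specific claim (not vague or ambiguous)
    modules_in_path: int                  # components the data flows through
    known_bug_class: bool                 # e.g. buffer overflow, SQL injection, XSS
    involves_concurrency: bool            # races, TOCTOU, or async in the trigger
    simple_data_flow: bool                # straightforward source-to-sink path
    logic_bug_without_spec: bool = False  # logic bug with no clear spec to check against
    escalated_from_standard: bool = False # a standard pass was inconclusive or escalated
    user_wants_full: bool = False         # user explicitly asked for full verification


def meets_standard_criteria(claim: Claim) -> bool:
    """True only when ALL standard-path conditions hold."""
    return (
        claim.is_specific
        and claim.modules_in_path == 1
        and claim.known_bug_class
        and not claim.involves_concurrency
        and claim.simple_data_flow
    )


def route(claim: Claim) -> str:
    """Return "deep" when ANY deep-path condition holds, else "standard"."""
    needs_deep = (
        not claim.is_specific
        or claim.modules_in_path >= 3
        or claim.involves_concurrency
        or claim.logic_bug_without_spec
        or claim.escalated_from_standard
        or claim.user_wants_full
    )
    # Claims that match neither list exactly fall through to the stated
    # default: start with standard and rely on its escalation checkpoints.
    return "deep" if needs_deep else "standard"
```

A claim that spans two modules matches neither list exactly, so under this sketch it starts on the standard path and is expected to escalate later if the checklist proves insufficient.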
+ ## Batch Triage
+
+ When verifying multiple bugs at once:
+
+ 1. Run Step 0 for all bugs first — restating each claim often collapses obvious false positives immediately
+ 2. Route each bug independently (some may be standard, others deep)
+ 3. Process all standard-routed bugs first, then deep-routed bugs
+ 4. After all bugs are verified, check for **exploit chains** — findings that individually failed gate review may combine to form a viable attack
+
+ ## Final Summary
+
+ After processing ALL suspected bugs, provide:
+
+ 1. **Counts**: X TRUE POSITIVES, Y FALSE POSITIVES
+ 2. **TRUE POSITIVE list**: Each with brief vulnerability description
+ 3. **FALSE POSITIVE list**: Each with brief reason for rejection
+
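The batch ordering and the final summary can be sketched in a few lines. This is a hypothetical illustration, not code from the package: `order_batch` and `summarize` are invented names, and the dictionary shapes (bug ID mapped to a route, bug ID mapped to a verdict and note) are assumptions.

```python
from collections import Counter


def order_batch(routes):
    """Return bug IDs with standard-routed bugs before deep-routed ones.

    `routes` maps a bug ID to "standard" or "deep". sorted() is stable,
    so bugs keep their original relative order within each route.
    """
    return sorted(routes, key=lambda bug_id: routes[bug_id] != "standard")


def summarize(verdicts):
    """Build the final summary text.

    `verdicts` maps a bug ID to a (verdict, note) pair, where verdict is
    "TRUE POSITIVE" or "FALSE POSITIVE".
    """
    counts = Counter(verdict for verdict, _ in verdicts.values())
    lines = [
        f"{counts['TRUE POSITIVE']} TRUE POSITIVES, "
        f"{counts['FALSE POSITIVE']} FALSE POSITIVES"
    ]
    for label in ("TRUE POSITIVE", "FALSE POSITIVE"):
        lines.append(f"{label} list:")
        for bug_id, (verdict, note) in verdicts.items():
            if verdict == label:
                lines.append(f"  - {bug_id}: {note}")
    return "\n".join(lines)
```

The exploit-chain check in step 4 is deliberately left out: it needs judgment about how rejected findings combine, which a sort and a counter cannot capture.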
+ ## References
+
+ - [Standard Verification]({baseDir}/references/standard-verification.md) — Linear single-pass checklist for straightforward bugs
+ - [Deep Verification]({baseDir}/references/deep-verification.md) — Full task-based orchestration for complex bugs
+ - [Gate Reviews]({baseDir}/references/gate-reviews.md) — Six mandatory gates and verdict format
+ - [Bug-Class Verification]({baseDir}/references/bug-class-verification.md) — Class-specific verification requirements for memory corruption, logic bugs, race conditions, integer issues, crypto, injection, info disclosure, DoS, and deserialization
+ - [False Positive Patterns]({baseDir}/references/false-positive-patterns.md) — 13-item checklist and red flags for common false positive patterns
+ - [Evidence Templates]({baseDir}/references/evidence-templates.md) — Documentation templates for data flow, mathematical proofs, attacker control, and devil's advocate reviews
@@ -0,0 +1,114 @@
+ # Bug-Class-Specific Verification
+
+ Different bug classes require different verification approaches. After classifying the bug in Step 0, apply the class-specific requirements below **in addition to** the generic verification phases.
+
+ ## Memory Corruption
+
+ Buffer overflow, heap overflow, stack overflow, out-of-bounds read/write, use-after-free, double-free, type confusion.
+
+ **Language safety check first:** Memory corruption in safe Rust, Go (without `unsafe.Pointer`/cgo), or managed languages (Java, C#, Python) is almost always a false positive — the type system or runtime prevents it. Verify whether the code is in an `unsafe` block (Rust), uses cgo/`unsafe.Pointer` (Go), or calls native code via JNI/P/Invoke. If the code is entirely in the safe subset, reject the memory corruption claim unless it involves a compiler bug or soundness hole.
+
+ **Verify:**
+
+ - What exactly gets corrupted? (which object, field, or memory region)
+ - What is the corruption size and offset? Can the attacker control them?
+ - Is the corruption a useful exploitation primitive (arbitrary read/write, vtable overwrite, function pointer overwrite) or just a crash?
+ - What allocator is in use (glibc, tcmalloc, jemalloc, Windows heap)? Does it have hardening that blocks exploitation?
+ - For UAF: trace the object lifetime — what frees it, what reuses the memory, can the attacker control the replacement object?
+ - For type confusion: prove the type mismatch exists and that misinterpretation of the data leads to a useful primitive.
+
+ ## Logic Bugs
+
+ Authentication bypass, access control errors, incorrect state transitions, confused deputy, privilege escalation through API misuse.
+
+ **Verify:**
+
+ - Check against the specification, RFC, or design docs — not just the code. Does the implementation match the intended behavior?
+ - Map all state transitions. Can the system reach a state the developer didn't anticipate?
+ - Identify implicit assumptions that are never enforced in code.
+ - For auth bugs: verify ALL authentication/authorization paths, not just the one that appears broken. Is there a secondary check that catches it?
+ - Logic bugs pass every bounds check and mathematical proof — don't let clean static analysis convince you it's a false positive.
+
+ ## Race Conditions
+
+ TOCTOU, data races, signal handling races, concurrent state modification.
+
+ **Verify:**
+
+ - What is the actual race window? Is it nanoseconds or seconds?
+ - Can the attacker widen the window (e.g., by stalling a thread with a slow NFS mount, large allocation, or CPU contention)?
+ - Verify the threading model: what threads/processes can actually access this data concurrently?
+ - Check all synchronization primitives in use — mutexes, atomics, RCU, lock-free structures.
+ - For TOCTOU on filesystem: can the attacker control the path between check and use (symlink races)?
+
+ ## Integer Issues
+
+ Overflow, underflow, truncation, signedness errors, wraparound.
+
+ **Verify:**
+
+ - What are the exact integer types and their ranges at every point in the computation?
+ - Is the overflow signed (undefined behavior in C/C++ — compiler may exploit this) or unsigned (defined wraparound)?
+ - Trace the integer through all casts, conversions, and promotions. Where does truncation or sign extension occur?
+ - After the integer issue occurs, is the resulting value actually used in a dangerous way (allocation size, array index, loop bound)?
+ - Check if compiler warnings (`-Wconversion`, `-Wsign-compare`) flag this.
+
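The integer checks above are easier to apply with a worked example. The sketch below models C fixed-width arithmetic in Python using bit masks. It is not from the package: the function names are hypothetical, and it assumes 32-bit unsigned and signed types on a two's-complement platform. It covers only unsigned wraparound, truncation, and reinterpretation; signed overflow in C/C++ is undefined behavior and cannot be modeled with a mask.

```python
U32_MASK = 0xFFFFFFFF


def alloc_size_u32(count, elem_size):
    """Compute count * elem_size as a 32-bit unsigned multiply (defined wraparound)."""
    return (count * elem_size) & U32_MASK


def to_int32(value):
    """Reinterpret the low 32 bits as a signed two's-complement value."""
    value &= U32_MASK
    return value - 0x1_0000_0000 if value & 0x8000_0000 else value


def truncate_u16(value):
    """Keep only the low 16 bits, as a narrowing conversion to a 16-bit unsigned type would."""
    return value & 0xFFFF


# An attacker-chosen element count makes the 32-bit size computation wrap:
# 0x40000001 * 4 = 0x1_0000_0004, which truncates to 4 bytes.
wrapped = alloc_size_u32(0x4000_0001, 4)
```

Here `wrapped` is 4, so a buffer sized by this computation holds one element while a later loop may still iterate over roughly a billion. That gap between the allocated size and the loop bound is the dangerous use the checklist asks you to confirm.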
+ ## Crypto Weaknesses
+
+ Weak algorithms, bad parameters, nonce reuse, padding oracle, insufficient randomness, timing side channels.
+
+ **Verify:**
+
+ - Check parameter choices against current standards (NIST, IETF) and known attacks. "AES-128" is fine; "DES" is not.
+ - Verify randomness sources. Is the PRNG cryptographically secure? Is it properly seeded?
+ - For nonce reuse: prove the same nonce can actually be used twice in practice, not just theoretically.
+ - For timing side channels: is the code actually reachable by an attacker who can measure timing? Network jitter may make remote timing attacks impractical.
+ - Compare the implementation against a reference implementation or test vectors from the spec.
+
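Two of the crypto checks above, the randomness source and timing-safe comparison, can be illustrated with Python's standard library. This is a sketch under assumptions, not code from the package: the function names are hypothetical and the 16-byte token size is arbitrary.

```python
import hmac
import random
import secrets


def predictable_token(seed):
    """Token from a seeded Mersenne Twister.

    Anyone who learns or guesses the seed can reproduce the token, which is
    why a general-purpose PRNG is a finding when used for secrets.
    """
    return random.Random(seed).getrandbits(128).to_bytes(16, "big")


def secure_token():
    """Token drawn from the operating system's CSPRNG via the secrets module."""
    return secrets.token_bytes(16)


def tokens_match(expected, supplied):
    """Compare two byte strings with hmac.compare_digest.

    compare_digest is designed to avoid the early-exit behavior of ==,
    which can leak how many leading bytes matched.
    """
    return hmac.compare_digest(expected, supplied)
```

When verifying a finding, reproducibility is the useful test: if the same seed yields the same token, the claim of insufficient randomness holds regardless of how random the output looks.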
+ ## Injection
+
+ SQL injection, XSS, command injection, server-side template injection, path traversal, LDAP injection.
+
+ **Verify:**
+
+ - Trace attacker input from entry point to the sink (query, command, template, filesystem path). Is there any sanitization or escaping along the way?
+ - Check if the framework provides automatic escaping (e.g., parameterized queries, template auto-escaping). If so, is it actually enabled and not bypassed?
+ - For XSS: what context does the input land in (HTML body, attribute, JavaScript, URL)? Each requires different escaping.
+ - For path traversal: is the path canonicalized before the access check? Can `../` or null bytes bypass validation?
+ - Test actual payload delivery through all intermediate processing — encoding, decoding, and transformation steps may neutralize or enable the payload.
+
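For the path traversal check, "canonicalize before the access check" looks roughly like the sketch below. It is a hypothetical helper rather than code from the package, it assumes a POSIX-style filesystem, and it does not address null bytes or races between the check and the later file access.

```python
import os


def is_within_base(base_dir, user_path):
    """Return True when user_path, resolved against base_dir, stays inside it.

    Both paths are canonicalized first (symlinks and ".." resolved), and the
    containment test compares whole path components rather than string
    prefixes, so "/srv/data-evil" is not mistaken for a child of "/srv/data".
    """
    base = os.path.realpath(base_dir)
    # os.path.join discards base when user_path is absolute, which is
    # exactly the case the containment check must reject.
    target = os.path.realpath(os.path.join(base, user_path))
    return os.path.commonpath([base, target]) == base
```

When reviewing a finding, compare the target code against this shape: a check that runs on the raw string, or that uses `startswith` on an uncanonicalized path, supports the claim; a check of this form usually refutes it.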
80
+ ## Information Disclosure
81
+
82
+ Uninitialized memory reads, error message leaks, timing side channels, padding oracles.
83
+
84
+ **Verify:**
85
+
86
+ - What specific data leaks? Not all leaks are equal — a stack leak revealing ASLR base or canary is critical; one revealing a static string is worthless.
87
+ - Is the leaked data actually useful to an attacker for further exploitation (ASLR bypass, session tokens, crypto keys)?
88
+ - For uninitialized memory: prove the memory is actually uninitialized at the point of read, not just potentially uninitialized on some code path.
89
+ - For timing side channels: can the attacker make enough measurements with sufficient precision? What's the noise level?
90
+ - For error messages: does the error path actually reach the attacker, or is it logged server-side only?
91
+
92
+ ## Denial of Service
93
+
94
+ Algorithmic complexity, resource exhaustion, crash bugs, infinite loops, memory bombs.
95
+
96
+ **Verify:**
97
+
98
+ - What is the resource consumption ratio? Attacker sends X bytes, server consumes Y resources. Is the amplification meaningful?
99
+ - Can the resource be reclaimed (connection closes, memory freed) or is it permanent exhaustion?
100
+ - For algorithmic complexity: what is the actual worst-case input? Prove that it triggers worst-case behavior; don't just claim O(n²).
101
+ - For crash bugs: is the crash reliably triggerable, or does it depend on specific heap/stack layout?
102
+ - Does the service restart automatically? A crash that causes a 100ms restart is different from one that requires manual intervention.
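One way to prove worst-case behavior instead of asserting it is to count operations on the candidate input and compare against the expected bound. The sketch below uses insertion sort purely as a stand-in for the target's algorithm; `insertion_sort_comparisons` is a hypothetical helper, and counting comparisons avoids the noise of wall-clock timing.

```python
def insertion_sort_comparisons(items):
    """Sort a copy of items with insertion sort and return the comparison count."""
    data = list(items)
    comparisons = 0
    for i in range(1, len(data)):
        key = data[i]
        j = i - 1
        while j >= 0:
            comparisons += 1
            if data[j] <= key:
                break  # key is already in position relative to data[j]
            data[j + 1] = data[j]  # shift the larger element right
            j -= 1
        data[j + 1] = key
    return comparisons


n = 200
worst = insertion_sort_comparisons(range(n, 0, -1))  # reversed input
best = insertion_sort_comparisons(range(n))          # already sorted input
```

Here the reversed input costs exactly n(n-1)/2 comparisons and the sorted input costs n-1, which turns "this is O(n²)" into a measured amplification ratio for a specific attacker-supplied input.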
103
+
104
+ ## Deserialization
105
+
106
+ Unsafe deserialization, object injection, gadget chain exploitation.
107
+
108
+ **Verify:**
109
+
110
+ - Does the attacker actually control the serialized data that reaches the deserialization call?
111
+ - Does a usable gadget chain exist in the classpath/import graph? Without a gadget chain, unsafe deserialization is a design smell, not an exploitable bug.
112
+ - What deserialization library and version is in use? Are there known gadget chains for it?
113
+ - Are there type restrictions, allowlists, or look-ahead deserialization filters that block dangerous classes?
114
+ - Account for language-specific behavior: Java `ObjectInputStream`, Python `pickle`, PHP `unserialize`, and .NET `BinaryFormatter` each have different exploitation characteristics.
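As an illustration of the "type restrictions, allowlists" item above, here is what an allowlist looks like for Python `pickle`. It follows the restricted-globals pattern described in the Python standard library documentation. The allowlist contents and the `restricted_loads` name are illustrative choices, not something the package prescribes.

```python
import builtins
import io
import pickle

# Illustrative allowlist: only these builtins may be resolved during unpickling.
SAFE_BUILTINS = {"range", "complex", "set", "frozenset", "slice"}


class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # find_class is called whenever the stream references a global.
        if module == "builtins" and name in SAFE_BUILTINS:
            return getattr(builtins, name)
        # Everything else (os.system, eval, arbitrary classes) is refused.
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")


def restricted_loads(data: bytes):
    """Deserialize data, refusing any global outside the allowlist."""
    return RestrictedUnpickler(io.BytesIO(data)).load()
```

During verification, the presence of such a filter on the path between attacker input and the deserialization call changes the question from "is there a gadget chain?" to "is there a gadget chain built only from allowlisted types?".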
@@ -0,0 +1,143 @@
1
+ # Deep Verification
2
+
3
+ Full task-based verification for complex bugs. Use when routing from SKILL.md selects the deep path, or when standard verification escalates.
4
+
5
+ ## If Escalated from Standard
6
+
7
+ When a bug escalates from standard verification:
8
+
9
+ 1. Review all evidence gathered during the standard pass — do not repeat completed work
10
+ 2. Identify which phases below are already satisfied by existing evidence
11
+ 3. Create tasks only for remaining phases, starting from where standard left off
12
+ 4. Preserve and reference all prior findings in new task descriptions
13
+
14
+ ## Verification Task List
15
+
16
+ For each bug (Bug #N), create tasks with the dependency structure below. After creating all tasks, use the task IDs returned by TaskCreate to wire dependencies with `addBlockedBy` in TaskUpdate.
17
+
18
+ ```
19
+ ── Phase 1: Data Flow Analysis ──────────────────────────────────
20
+ "BUG #N - Phase 1.1: Map trust boundaries and trace data flow"
21
+ Then in parallel (each blocked by 1.1):
22
+ "BUG #N - Phase 1.2: Research API contracts and safety guarantees"
23
+ "BUG #N - Phase 1.3: Environment protection analysis"
24
+ "BUG #N - Phase 1.4: Cross-reference analysis"
25
+
26
+ ── Phase 2: Exploitability Verification (blocked by Phase 1) ───
27
+ In parallel:
28
+ "BUG #N - Phase 2.1: Confirm attacker controls input data"
29
+ "BUG #N - Phase 2.2: Mathematical bounds verification"
30
+ "BUG #N - Phase 2.3: Race condition feasibility proof"
31
+ Then (blocked by 2.1, 2.2, 2.3):
32
+ "BUG #N - Phase 2.4: Adversarial analysis"
33
+
34
+ ── Phase 3: Impact Assessment (blocked by Phase 2) ─────────────
35
+ In parallel:
36
+ "BUG #N - Phase 3.1: Demonstrate real security impact"
37
+ "BUG #N - Phase 3.2: Primary control vs defense-in-depth"
38
+
39
+ ── Phase 4: PoC Creation (blocked by Phase 3) ──────────────────
40
+ "BUG #N - Phase 4.1: Create pseudocode PoC with data flow diagrams"
41
+ Then in parallel (each blocked by 4.1):
42
+ "BUG #N - Phase 4.2: Create executable PoC if feasible"
43
+ "BUG #N - Phase 4.3: Create unit test PoC if feasible"
44
+ "BUG #N - Phase 4.4: Negative PoC — show exploit preconditions"
45
+ Then (blocked by 4.2, 4.3, 4.4):
46
+ "BUG #N - Phase 4.5: Verify PoC demonstrates the vulnerability"
47
+
48
+ ── Phase 5: Devil's Advocate (blocked by Phase 4) ──────────────
49
+ "BUG #N - Phase 5.1: Devil's advocate review"
50
+
51
+ ── Gate Review (blocked by Phase 5) ────────────────────────────
52
+ "BUG #N - GATE REVIEW: Evaluate all six gates before verdict"
53
+ ```
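The dependency structure above is a directed acyclic graph, so the order in which tasks become runnable can be derived mechanically. The sketch below models it with Python's `graphlib` (3.9+); the short task keys and the `parallel_waves` helper are hypothetical, and it does not call `TaskCreate` or `TaskUpdate`, whose interfaces are not shown here.

```python
from graphlib import TopologicalSorter

# Each task maps to the tasks that block it, mirroring the list above.
BLOCKED_BY = {
    "1.1": [],
    "1.2": ["1.1"], "1.3": ["1.1"], "1.4": ["1.1"],
    "2.1": ["1.2", "1.3", "1.4"],
    "2.2": ["1.2", "1.3", "1.4"],
    "2.3": ["1.2", "1.3", "1.4"],
    "2.4": ["2.1", "2.2", "2.3"],
    "3.1": ["2.4"], "3.2": ["2.4"],
    "4.1": ["3.1", "3.2"],
    "4.2": ["4.1"], "4.3": ["4.1"], "4.4": ["4.1"],
    "4.5": ["4.2", "4.3", "4.4"],
    "5.1": ["4.5"],
    "GATE": ["5.1"],
}


def parallel_waves(blocked_by):
    """Group tasks into waves; every task in a wave can run concurrently."""
    sorter = TopologicalSorter(blocked_by)
    sorter.prepare()  # raises CycleError if the dependencies are circular
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)  # mark the whole wave complete before continuing
    return waves


waves = parallel_waves(BLOCKED_BY)
```

For this graph the result is ten waves, starting with `["1.1"]` and ending with `["GATE"]`, which matches the "in parallel" and "then" groupings in the task list.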
54
+
55
+ ## Execution Rules
56
+
57
+ - Mark each task as in-progress when starting, completed only with concrete evidence
58
+ - **Parallel sub-phases**: Launch independent sub-phases concurrently using the plugin's agents. Collect all results before proceeding to the next dependency gate.
59
+ - **Dependency gates**: Never start a phase until all tasks it depends on are completed.
60
+ - Apply all 13 checklist items from [false-positive-patterns.md]({baseDir}/references/false-positive-patterns.md) to each bug
61
+
62
+ ## Agents
63
+
64
+ Spawn these agents via `Task` for their respective phases. Pass the bug description and any prior phase results as context.
65
+
66
+ | Agent | Phases | Purpose |
67
+ |-------|--------|---------|
68
+ | `data-flow-analyzer` | 1.1–1.4 | Trace data flow, map trust boundaries, check API contracts and environment protections |
69
+ | `exploitability-verifier` | 2.1–2.4 | Prove attacker control, mathematical bounds, and race condition feasibility; run the adversarial analysis |
70
+ | `poc-author` | 4.1–4.5 | Create pseudocode, executable, unit test, and negative PoCs |
71
+
72
+ Phase 3 (Impact Assessment), Phase 5 (Devil's Advocate), and the Gate Review are handled directly because they require synthesizing results across phases; do not delegate them.
73
+
74
+ ## Phase Requirements
75
+
76
+ The task list above names every phase. Below are the key pitfalls and decision criteria for each — focus on what you might get wrong.
77
+
78
+ ### Phase 1: Data Flow Analysis
79
+
80
+ **1.1**: Map trust boundaries (internal/trusted vs external/untrusted) and trace data from source to alleged vulnerability. Apply class-specific verification from [bug-class-verification.md]({baseDir}/references/bug-class-verification.md). **Key pitfall**: Analyzing code in isolation without tracing the full validation chain. Conditional logic upstream may make the vulnerable code mathematically unreachable (see [false-positive-patterns.md]({baseDir}/references/false-positive-patterns.md) items 1 and 1a).
81
+
82
+ **1.2**: Check API contracts before claiming overflows — many APIs have built-in bounds protection that prevents the alleged issue regardless of inputs.
83
+
84
+ **1.3**: Before concluding vulnerability, verify that no compiler, runtime, OS, or framework protections prevent exploitation. Note: mitigations like ASLR and stack canaries raise the exploitation bar but do not eliminate the vulnerability itself. Distinguish "prevents exploitation entirely" (e.g., Rust's safe type system) from "makes exploitation harder" (e.g., ASLR).
85
+
86
+ **1.4**: Check if similar code patterns exist elsewhere and are handled safely. Review test coverage, code review history, and design documentation for this area.
87
+
88
+ ### Phase 2: Exploitability Verification
89
+
90
+ **2.1**: Prove attacker controls the data reaching the vulnerability. **Key pitfall**: Assuming network/external data reaches the operation without tracing the actual path — internal storage set by trusted components is not attacker-controlled.
91
+
92
+ **2.2**: Create explicit algebraic proofs for bounds-related issues. Use the template in [evidence-templates.md]({baseDir}/references/evidence-templates.md). Verify: IF validation_check_passes THEN bounds_guarantee_holds.
93
+
94
+ **2.3**: For race conditions, prove concurrent access is actually possible. **Key pitfall**: Assuming race conditions in single-threaded initialization or synchronized contexts.
95
+
96
+ **2.4**: Assess full attack surface: input control, validation bypass paths, timing dependencies, and state manipulation.
97
+
98
+ ### Phase 3: Impact Assessment
99
+
100
+ **3.1**: Distinguish real security impact (RCE, privesc, info disclosure) from operational robustness issues.
101
+
102
+ **3.2**: Distinguish primary security controls from defense-in-depth. Failure of a defense-in-depth measure is not a vulnerability if primary protections remain intact.
103
+
104
+ ### Phase 4: PoC Creation
105
+
106
+ **Always create a pseudocode PoC.** Additionally, create executable and/or unit test PoCs when feasible:
107
+
108
+ 1. **Pseudocode with data flow diagrams** showing the attack path (always)
109
+ 2. **Executable PoC** in the target language demonstrating the vulnerability (if feasible)
110
+ 3. **Unit test PoC** exercising the vulnerable code path with crafted inputs (if feasible)
111
+
112
+ See [evidence-templates.md]({baseDir}/references/evidence-templates.md) for PoC templates.
113
+
114
+ **Negative PoC (Phase 4.4)**: Demonstrate the gap between normal operation and the exploit path — what preconditions must hold for the vulnerability to trigger, and why they don't hold under normal conditions.
115
+
116
+ ### Phase 5: Devil's Advocate Review
117
+
118
+ Before final verdict, systematically challenge the vulnerability claim. Assume you are biased toward finding bugs and rating them as critical — actively work against that bias.
119
+
120
+ **Challenges arguing AGAINST the vulnerability:**
121
+
122
+ 1. What non-vulnerability explanations exist for this code pattern?
123
+ 2. How would the original developers justify this implementation?
124
+ 3. What crucial system architecture context might be missing?
125
+ 4. Am I seeing a vulnerability because the pattern "looks dangerous" rather than because it actually is?
126
+ 5. Even if validation looks insufficient, does it actually prevent the claimed condition?
127
+ 6. Am I incorrectly assuming attacker control over trusted data?
128
+ 7. Have I rigorously proven the mathematical condition for vulnerability can occur?
129
+ 8. Beyond theoretical possibility, is this practically exploitable?
130
+ 9. Am I confusing defense-in-depth failure with a primary security vulnerability?
131
+ 10. What compiler/runtime/OS protections might prevent exploitation?
132
+ 11. Am I hallucinating this vulnerability? LLMs are biased toward seeing bugs everywhere and rating every finding as critical — is this actually a real, exploitable issue or am I pattern-matching on scary-looking code?
133
+
134
+ **Challenges arguing FOR the vulnerability (false-negative protection):**
135
+
136
+ 12. Am I dismissing a real vulnerability because the exploit seems complex or unlikely?
137
+ 13. Am I inventing mitigations or validation logic that I haven't verified in the actual source code? Re-read the code after reaching a conclusion.
138
+
139
+ See [evidence-templates.md]({baseDir}/references/evidence-templates.md) for the devil's advocate documentation template.
140
+
141
+ ## Gate Review
142
+
143
+ Apply the six gates from [gate-reviews.md]({baseDir}/references/gate-reviews.md) to reach a verdict.
@@ -0,0 +1,91 @@
1
+ # Evidence Templates
2
+
3
+ Use these templates when documenting verification evidence for each bug.
4
+
5
+ ## Data Flow Documentation
6
+
7
+ ```
8
+ Bug #N Data Flow Analysis
9
+ Source: [exact location] — Trust Level: [trusted/untrusted]
10
+ Path: Source → Validation1[file:line] → Transform[file:line] → Vulnerability[file:line]
11
+ Validation Points:
12
+ - Check1: [condition] at [file:line] — [passes/fails/bypassed]
13
+ - Check2: [condition] at [file:line] — [passes/fails/bypassed]
14
+ ```
15
+
16
+ ## Mathematical Bounds Proof
17
+
18
+ ```
19
+ Bug #N Mathematical Analysis
20
+ Claim: Operation X is vulnerable to [overflow/underflow/bounds violation]
21
+ Given Constraints: [list all validation conditions]
22
+
23
+ Algebraic Proof:
24
+ 1. [first constraint from validation]
25
+ 2. [constant or known value]
26
+ 3. [derived inequality]
27
+ ...
28
+ N. Therefore: [vulnerability confirmed/debunked] (Q.E.D.)
29
+
30
+ Conclusion: [vulnerability is/is not mathematically possible]
31
+ ```
32
+
33
+ **Example:**
34
+
35
+ ```
36
+ Given: validation ensures (input_size >= MIN_SIZE)
37
+ Given: MIN_SIZE = 16, header_size = 8
38
+ Prove: (input_size - header_size) cannot underflow
39
+
40
+ 1. input_size >= MIN_SIZE (from validation)
41
+ 2. MIN_SIZE = 16 (constant)
42
+ 3. header_size = 8 (constant)
43
+ 4. input_size >= 16 (substitution of 1,2)
44
+ 5. input_size - 8 >= 16 - 8 (subtract header_size from both sides)
45
+ 6. input_size - header_size >= 8 (simplification)
46
+ 7. Therefore: underflow impossible (Q.E.D.)
47
+ ```
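The algebraic proof in the example can be cross-checked by brute force over a range of sizes. The sketch below models the subtraction as C `uint32_t` arithmetic, which is an assumption about the target (the example does not state the integer type); the function names are hypothetical.

```python
MIN_SIZE = 16
HEADER_SIZE = 8
U32_MASK = 0xFFFFFFFF  # assumption: the subtraction happens in uint32_t


def payload_len_u32(input_size: int) -> int:
    """Model `input_size - header_size` with 32-bit unsigned wraparound."""
    return (input_size - HEADER_SIZE) & U32_MASK


def underflows(input_size: int) -> bool:
    """True if the wrapped result differs from the true mathematical result."""
    return payload_len_u32(input_size) != input_size - HEADER_SIZE


def counterexamples(sizes, validated: bool):
    """Sizes that reach the subtraction and underflow there."""
    return [
        s for s in sizes
        if (not validated or s >= MIN_SIZE) and underflows(s)
    ]


with_validation = counterexamples(range(4096), validated=True)
without_validation = counterexamples(range(4096), validated=False)
```

With the validation in place no counterexample exists in the tested range, while removing it exposes sizes 0 through 7. A brute-force check like this complements the proof; it does not replace it, since it only covers the sampled range.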
48
+
49
+ ## Attacker Control Analysis
50
+
51
+ ```
52
+ Bug #N Attacker Control Analysis
53
+ Input Vector: [how attacker provides input]
54
+ Control Level: [full/partial/none]
55
+ Constraints: [what limits exist on attacker input]
56
+ Reachability: [can attacker-controlled data reach vulnerable operation?]
57
+ ```
58
+
59
+ ## PoC — Pseudocode with Data Flow Diagram
60
+
61
+ ```
62
+ PoC for Bug #N: [Brief Description]
63
+
64
+ Data Flow Diagram:
65
+
66
+ [External Input] → [Validation Point] → [Processing] → [Vulnerable Operation]
67
+ | | | |
68
+ Attacker (May be bypassed) (Transforms data) (Unsafe operation)
69
+ Controlled | | |
70
+ | v v v
71
+ [Malicious Data] → [Insufficient Check] → [Processed Data] → [Impact]
72
+
73
+ PSEUDOCODE:
74
+ function vulnerable_operation(user_data):
75
+ validation_result = weak_validation(user_data) // Explain why this fails
76
+ processed_data = transform_data(user_data) // Show transformation
77
+ unsafe_operation(processed_data) // Show vulnerability trigger
78
+ ```
79
+
80
+ ## Devil's Advocate Review
81
+
82
+ ```
83
+ Bug #N Devil's Advocate Review
84
+ Vulnerability Claim: [brief description]
85
+
86
+ For each of the 13 questions from the devil's advocate review, document your answer:
87
+ 1-11. [Challenges arguing AGAINST the vulnerability]
88
+ 12-13. [Challenges arguing FOR the vulnerability — false-negative protection]
89
+
90
+ Final Assessment: [Vulnerability confirmed/debunked with reasoning]
91
+ ```
@@ -0,0 +1,115 @@
1
+ # False Positive Patterns — Lessons Learned
2
+
3
+ Apply ALL items in this checklist to EACH potential bug during verification.
4
+
5
+ ## Checklist
6
+
7
+ ### 1. Trace Full Validation Chain
8
+
9
+ Don't analyze isolated code snippets. Trace backwards to find ALL validation that precedes potentially dangerous operations. Network packet size operations may look dangerous but often have bounds validation earlier in the function.
10
+
11
+ ### 1a. Map Complete Conditional Logic Flow
12
+
13
+ Vulnerable-looking code may be unreachable due to conditional logic that creates mathematical guarantees. Example: array access `buffer[length-4]` appears unsafe when `length < 4`, but if the code is only reachable when `length > 12` due to earlier validation, the vulnerability is impossible.
14
+
15
+ **Verify:**
16
+
17
+ - What conditions must be met for execution to reach the alleged vulnerability?
18
+ - Do those conditions mathematically prevent the vulnerability scenario?
19
+ - Are there minimum size/length requirements that guarantee safe access?
20
+ - Does the conditional flow create impossible-to-violate bounds?
21
+
22
+ ### 2. Identify Defensive Programming Patterns
23
+
24
+ Distinguish between actual vulnerabilities and defensive assertions/validations. `ASSERT(size == expected_size)` followed by size-controlled operations is defensive, not vulnerable. Verify that the checks actually prevent the alleged vulnerability, including in release builds, where assertions may be compiled out.
25
+
26
+ ### 3. Confirm Exploitable Data Paths
27
+
28
+ Only report vulnerabilities with CONFIRMED exploitable data flow paths. Don't assume network-controlled data reaches dangerous functions without tracing the actual path step by step.
29
+
30
+ ### 4. Understand Data Source Context
31
+
32
+ Distinguish between data sources and their trust levels. API return values, compile-time constants, and network data have different risk profiles. Determine the actual source and whether it is attacker-controlled.
33
+
34
+ ### 5. Analyze Bounds Validation Logic
35
+
36
+ Look for mathematical relationships between validation checks and subsequent operations. If `packet_size >= MIN_SIZE` is checked and `MIN_SIZE >= sizeof(header)`, then `packet_size - sizeof(header)` cannot underflow.
37
+
38
+ ### 6. Verify TOCTOU Claims
39
+
40
+ Time-of-check-time-of-use issues require proof that the checked value can change between check and use. If a size is checked and immediately used in the same function with no external modification possible, there is no TOCTOU.
41
+
42
+ ### 7. Understand API Contract and Trust Boundaries
43
+
44
+ Always understand API contracts before claiming buffer overflows. Some APIs have built-in bounds protection and cannot write beyond the buffer regardless of input parameters.
45
+
46
+ ### 8. Distinguish Internal Storage from External Input
47
+
48
+ Internal storage systems (configuration stores, registries) are controlled by trusted components, not attackers. Values set during installation by trusted components are not attacker-controlled.
49
+
50
+ ### 9. Don't Confuse Pattern Recognition with Vulnerability Analysis
51
+
52
+ Code patterns that "look vulnerable" may be safely implemented due to context and API contracts. Size parameters being modified doesn't mean buffer overflow if the API prevents writing beyond bounds.
53
+
54
+ ### 10. Verify Concurrent Access is Actually Possible
55
+
56
+ Don't assume race conditions exist without proving concurrent access patterns. Single-threaded initialization contexts cannot have race conditions. Verify the threading model and synchronization mechanisms.
57
+
58
+ ### 11. Assess Real vs Theoretical Security Impact
59
+
60
+ Focus on vulnerabilities with actual security impact. Storage failure for non-critical data is an operational issue, not a security vulnerability. Ask: would this lead to code execution, privilege escalation, or information disclosure?
61
+
62
+ ### 12. Understand Defense-in-Depth vs Primary Controls
63
+
64
+ Failure of defense-in-depth mechanisms is not always a vulnerability if primary protections exist. Token cleanup failure is not critical if tokens are single-use by design at the server.
65
+
66
+ ### 13. Apply the Checklist Rigorously, Not Superficially
67
+
68
+ Having a checklist doesn't prevent false positives if it isn't applied systematically. For EVERY potential vulnerability, work through ALL checklist items before concluding.
69
+
70
+ ---
71
+
72
+ ## Red Flags for False Positives
73
+
74
+ ### Pattern-Based False Positives
75
+
76
+ - Reporting vulnerabilities in validation/bounds-checking code itself
77
+ - Claiming TOCTOU without proving the value can change
78
+ - Ignoring preceding validation logic
79
+ - Assuming network data reaches operations without tracing the path
80
+ - Confusing defensive programming (assertions/checks) with vulnerabilities
81
+ - Analyzing vulnerable-looking patterns without tracing conditional logic that controls reachability
82
+ - Reporting "vulnerabilities" in error handling or cleanup code
83
+ - Flagging size calculations without understanding mathematical constraints
84
+ - Identifying "dangerous" functions without checking if inputs are bounded
85
+ - Claiming buffer overflows in fixed-size operations with compile-time bounds
86
+ - Reporting race conditions in single-threaded or synchronized contexts
87
+
88
+ ### Context-Blind Analysis False Positives
89
+
90
+ - Analyzing code snippets without understanding broader system design
91
+ - Ignoring architectural guarantees (single-writer, trusted input sources)
92
+ - Missing that "vulnerable" code is unreachable due to earlier validation
93
+ - Confusing debug/development code paths with production paths
94
+ - Reporting issues in code that only runs during trusted installation/setup
95
+ - Flagging theoretical issues that cannot occur due to system architecture
96
+ - Missing that alleged vulnerabilities are prevented by framework or language guarantees
97
+ - Reporting issues in test-only or debug-only code paths as production vulnerabilities
98
+
99
+ ### Mathematical/Bounds Analysis False Positives
100
+
101
+ - Reporting integer underflow without proving the mathematical condition can occur
102
+ - Claiming buffer overflow when bounds are mathematically guaranteed by validation
103
+ - Missing that conditional logic creates mathematical impossibility of vulnerable conditions
104
+ - Reporting off-by-one errors without checking if loop bounds prevent the condition
105
+ - Claiming memory corruption when allocation sizes are verified sufficient
106
+ - Reporting arithmetic overflow without checking if input ranges prevent the condition
107
+
108
+ ### API Contract Misunderstanding False Positives
109
+
110
+ - Claiming buffer overflows when APIs have built-in bounds checking
111
+ - Reporting memory corruption for APIs that manage their own memory safely
112
+ - Missing that return values are already validated by the API contract
113
+ - Confusing API parameter modification with vulnerability when API prevents unsafe modification
114
+ - Reporting issues explicitly handled by the API's safety guarantees
115
+ - Missing that seemingly dangerous operations are safe due to API implementation details
@@ -0,0 +1,27 @@
1
+ # Gate Reviews and Verdicts
2
+
3
+ Before reporting ANY bug as a vulnerability, all six gate reviews must pass. Evaluate these during the GATE REVIEW task after all phases are complete:
4
+
5
+ | Gate | Criterion | Pass | Fail |
6
+ |------|-----------|------|------|
7
+ | **1. Process** | All phases completed with documented evidence | Evidence exists for every phase | Phases lack concrete evidence |
8
+ | **2. Reachability** | Attacker can reach and control data at the vulnerability | Clear evidence of attacker-controlled path + PoC confirms | Cannot demonstrate attacker control or reachability |
9
+ | **3. Real Impact** | Exploitation leads to RCE, privesc, or info disclosure | Direct impact with concrete scenarios | Only operational robustness issue |
10
+ | **4. PoC Validation** | PoC (pseudocode, executable, or unit test) demonstrates the attack path | Shows attacker control, trigger, and impact | PoC fails to show attack path or impact |
11
+ | **5. Math Bounds** | Mathematical analysis confirms vulnerable condition is possible | Algebraic proof shows condition is possible | Math proves validation prevents it |
12
+ | **6. Environment** | No environmental protections entirely prevent exploitation | Protections do not eliminate vulnerability | Environmental protections block it entirely |
13
+
14
+ ## Verdict Format
15
+
16
+ - **TRUE POSITIVE**: All gate reviews pass → `BUG #N TRUE POSITIVE — [brief vulnerability description]`
17
+ - **FALSE POSITIVE**: Any gate review fails → `BUG #N FALSE POSITIVE — [brief reason for rejection]`
18
+
19
+ If any phase fails verification, document the failure with evidence and continue all remaining phases. Issue the FALSE POSITIVE verdict only after all phases are complete.
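The verdict rule above (all six gates pass, otherwise FALSE POSITIVE) is simple enough to express directly. The sketch below is one possible encoding; `GateResult` and `verdict` are hypothetical names, and the gate labels are copied from the table.

```python
from dataclasses import dataclass

GATES = (
    "Process", "Reachability", "Real Impact",
    "PoC Validation", "Math Bounds", "Environment",
)
SEPARATOR = "\u2014"  # the dash used in the verdict format above


@dataclass
class GateResult:
    name: str
    passed: bool
    note: str = ""


def verdict(bug_id: int, summary: str, results: list[GateResult]) -> str:
    """Return the verdict line; requires every gate to be evaluated exactly once."""
    if sorted(r.name for r in results) != sorted(GATES):
        raise ValueError("all six gates must be evaluated exactly once")
    # A single failed gate is enough for a FALSE POSITIVE verdict.
    label = "TRUE POSITIVE" if all(r.passed for r in results) else "FALSE POSITIVE"
    return f"BUG #{bug_id} {label} {SEPARATOR} {summary}"
```

Requiring all six results before producing any verdict reflects the rule that remaining phases continue even after one fails.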
20
+
21
+ ## Example Verdict
22
+
23
+ ```
24
+ BUG #3 FALSE POSITIVE — Integer underflow in packet_handler.c:142
25
+ Gate 5 (Math Bounds) FAIL: validation at line 98 ensures packet_size >= 16 and
26
+ header_size is 8, so (packet_size - header_size) >= 8. Underflow is mathematically impossible.
27
+ ```