@vigolium/piolium 0.0.1

Files changed (271)
  1. package/LICENSE +21 -0
  2. package/README.md +117 -0
  3. package/agents/access-auditor.md +300 -0
  4. package/agents/assumption-breaker.md +154 -0
  5. package/agents/attack-designer.md +116 -0
  6. package/agents/code-scanner.md +139 -0
  7. package/agents/concurrency-auditor.md +238 -0
  8. package/agents/confirm-writer.md +257 -0
  9. package/agents/context-reviewer.md +274 -0
  10. package/agents/cross-verifier.md +165 -0
  11. package/agents/cve-scout.md +381 -0
  12. package/agents/env-builder.md +282 -0
  13. package/agents/env-profiler.md +205 -0
  14. package/agents/evidence-collector.md +140 -0
  15. package/agents/finding-grader.md +142 -0
  16. package/agents/finding-writer.md +148 -0
  17. package/agents/flow-tracer.md +106 -0
  18. package/agents/goal-backtracer.md +146 -0
  19. package/agents/history-miner.md +467 -0
  20. package/agents/independent-verifier.md +118 -0
  21. package/agents/intent-mapper.md +183 -0
  22. package/agents/longshot-collector.md +128 -0
  23. package/agents/longshot-prober.md +126 -0
  24. package/agents/patch-auditor.md +73 -0
  25. package/agents/poc-author.md +124 -0
  26. package/agents/poc-runner.md +194 -0
  27. package/agents/probe-lead.md +269 -0
  28. package/agents/red-challenger.md +101 -0
  29. package/agents/report-composer.md +208 -0
  30. package/agents/review-adjudicator.md +216 -0
  31. package/agents/spec-auditor.md +155 -0
  32. package/agents/taint-tracer.md +265 -0
  33. package/agents/test-locator.md +209 -0
  34. package/agents/threat-modeler.md +132 -0
  35. package/agents/variant-scanner.md +108 -0
  36. package/agents/variant-spotter.md +110 -0
  37. package/bin/piolium.mjs +376 -0
  38. package/extensions/piolium/_vendor/yaml.bundle.d.mts +6 -0
  39. package/extensions/piolium/_vendor/yaml.bundle.mjs +139 -0
  40. package/extensions/piolium/agent-runner.ts +322 -0
  41. package/extensions/piolium/agents.ts +266 -0
  42. package/extensions/piolium/audit-state.ts +522 -0
  43. package/extensions/piolium/bundled-resources.ts +97 -0
  44. package/extensions/piolium/candidate-scan.ts +966 -0
  45. package/extensions/piolium/command-target.ts +177 -0
  46. package/extensions/piolium/console-stream.ts +57 -0
  47. package/extensions/piolium/export-results.ts +380 -0
  48. package/extensions/piolium/findings.ts +448 -0
  49. package/extensions/piolium/heartbeat.ts +182 -0
  50. package/extensions/piolium/help.ts +234 -0
  51. package/extensions/piolium/index.ts +1865 -0
  52. package/extensions/piolium/longshot.ts +530 -0
  53. package/extensions/piolium/matcher-suggestions.ts +196 -0
  54. package/extensions/piolium/matcher-utils.ts +83 -0
  55. package/extensions/piolium/modes/balanced.ts +750 -0
  56. package/extensions/piolium/modes/confirm-bootstrap.ts +186 -0
  57. package/extensions/piolium/modes/confirm.ts +697 -0
  58. package/extensions/piolium/modes/deep.ts +917 -0
  59. package/extensions/piolium/modes/diff.ts +177 -0
  60. package/extensions/piolium/modes/lite.ts +540 -0
  61. package/extensions/piolium/modes/longshot.ts +595 -0
  62. package/extensions/piolium/modes/merge.ts +204 -0
  63. package/extensions/piolium/modes/phase-runner.ts +267 -0
  64. package/extensions/piolium/modes/reinvest.ts +546 -0
  65. package/extensions/piolium/modes/revisit.ts +279 -0
  66. package/extensions/piolium/modes.ts +48 -0
  67. package/extensions/piolium/phase-labels.ts +123 -0
  68. package/extensions/piolium/phase-status-strip.ts +92 -0
  69. package/extensions/piolium/prompt-prefix-editor.ts +39 -0
  70. package/extensions/piolium/providers/anthropic-vertex.ts +836 -0
  71. package/extensions/piolium/recon.ts +409 -0
  72. package/extensions/piolium/result-stats.ts +105 -0
  73. package/extensions/piolium/retry.ts +120 -0
  74. package/extensions/piolium/scheduler.ts +212 -0
  75. package/extensions/piolium/secrets.ts +368 -0
  76. package/extensions/piolium/tools/web-tools.ts +148 -0
  77. package/package.json +77 -0
  78. package/skills/agentic-actions-auditor/SKILL.md +327 -0
  79. package/skills/agentic-actions-auditor/references/action-profiles.md +186 -0
  80. package/skills/agentic-actions-auditor/references/cross-file-resolution.md +209 -0
  81. package/skills/agentic-actions-auditor/references/foundations.md +94 -0
  82. package/skills/agentic-actions-auditor/references/vector-a-env-var-intermediary.md +77 -0
  83. package/skills/agentic-actions-auditor/references/vector-b-direct-expression-injection.md +83 -0
  84. package/skills/agentic-actions-auditor/references/vector-c-cli-data-fetch.md +83 -0
  85. package/skills/agentic-actions-auditor/references/vector-d-pr-target-checkout.md +88 -0
  86. package/skills/agentic-actions-auditor/references/vector-e-error-log-injection.md +88 -0
  87. package/skills/agentic-actions-auditor/references/vector-f-subshell-expansion.md +82 -0
  88. package/skills/agentic-actions-auditor/references/vector-g-eval-of-ai-output.md +91 -0
  89. package/skills/agentic-actions-auditor/references/vector-h-dangerous-sandbox-configs.md +102 -0
  90. package/skills/agentic-actions-auditor/references/vector-i-wildcard-allowlists.md +88 -0
  91. package/skills/audit/SKILL.md +562 -0
  92. package/skills/audit/assets/icon.svg +7 -0
  93. package/skills/audit/hooks/scripts/validate_phase_output.py +550 -0
  94. package/skills/audit/references/adversarial-review.md +148 -0
  95. package/skills/audit/references/architecture-aware-sast.md +306 -0
  96. package/skills/audit/references/audit-workflow.md +737 -0
  97. package/skills/audit/references/chamber-protocol.md +384 -0
  98. package/skills/audit/references/creative-attack-modes.md +221 -0
  99. package/skills/audit/references/deep-analysis.md +273 -0
  100. package/skills/audit/references/domain-attack-playbooks.md +1129 -0
  101. package/skills/audit/references/knowledge-base-template.md +513 -0
  102. package/skills/audit/references/real-env-validation.md +191 -0
  103. package/skills/audit/references/report-templates.md +417 -0
  104. package/skills/audit/references/triage-and-prereqs.md +134 -0
  105. package/skills/audit/scripts/consolidate_drafts.py +554 -0
  106. package/skills/audit/scripts/partition_findings.py +152 -0
  107. package/skills/audit/scripts/rg-hotspots.sh +121 -0
  108. package/skills/audit/scripts/stamp_file_state.py +349 -0
  109. package/skills/code-reviewer/SKILL.md +65 -0
  110. package/skills/codeql/SKILL.md +281 -0
  111. package/skills/codeql/references/build-fixes.md +90 -0
  112. package/skills/codeql/references/diagnostic-query-templates.md +339 -0
  113. package/skills/codeql/references/extension-yaml-format.md +209 -0
  114. package/skills/codeql/references/important-only-suite.md +153 -0
  115. package/skills/codeql/references/language-details.md +207 -0
  116. package/skills/codeql/references/macos-arm64e-workaround.md +179 -0
  117. package/skills/codeql/references/performance-tuning.md +111 -0
  118. package/skills/codeql/references/quality-assessment.md +172 -0
  119. package/skills/codeql/references/ruleset-catalog.md +63 -0
  120. package/skills/codeql/references/run-all-suite.md +92 -0
  121. package/skills/codeql/references/sarif-processing.md +79 -0
  122. package/skills/codeql/references/threat-models.md +51 -0
  123. package/skills/codeql/workflows/build-database.md +280 -0
  124. package/skills/codeql/workflows/create-data-extensions.md +261 -0
  125. package/skills/codeql/workflows/run-analysis.md +301 -0
  126. package/skills/differential-review/SKILL.md +220 -0
  127. package/skills/differential-review/adversarial.md +203 -0
  128. package/skills/differential-review/methodology.md +234 -0
  129. package/skills/differential-review/patterns.md +300 -0
  130. package/skills/differential-review/reporting.md +369 -0
  131. package/skills/fp-check/SKILL.md +125 -0
  132. package/skills/fp-check/references/bug-class-verification.md +114 -0
  133. package/skills/fp-check/references/deep-verification.md +143 -0
  134. package/skills/fp-check/references/evidence-templates.md +91 -0
  135. package/skills/fp-check/references/false-positive-patterns.md +115 -0
  136. package/skills/fp-check/references/gate-reviews.md +27 -0
  137. package/skills/fp-check/references/standard-verification.md +78 -0
  138. package/skills/insecure-defaults/SKILL.md +117 -0
  139. package/skills/insecure-defaults/references/examples.md +409 -0
  140. package/skills/last30days/SKILL.md +444 -0
  141. package/skills/sarif-parsing/SKILL.md +483 -0
  142. package/skills/sarif-parsing/resources/jq-queries.md +162 -0
  143. package/skills/sarif-parsing/resources/sarif_helpers.py +331 -0
  144. package/skills/security-threat-model/LICENSE.txt +201 -0
  145. package/skills/security-threat-model/SKILL.md +81 -0
  146. package/skills/security-threat-model/agents/openai.yaml +4 -0
  147. package/skills/security-threat-model/references/prompt-template.md +255 -0
  148. package/skills/security-threat-model/references/security-controls-and-assets.md +32 -0
  149. package/skills/semgrep/SKILL.md +212 -0
  150. package/skills/semgrep/references/rulesets.md +162 -0
  151. package/skills/semgrep/references/scan-modes.md +110 -0
  152. package/skills/semgrep/references/scanner-task-prompt.md +140 -0
  153. package/skills/semgrep/scripts/merge_sarif.py +203 -0
  154. package/skills/semgrep/workflows/scan-workflow.md +311 -0
  155. package/skills/semgrep-rule-creator/SKILL.md +168 -0
  156. package/skills/semgrep-rule-creator/references/quick-reference.md +202 -0
  157. package/skills/semgrep-rule-creator/references/workflow.md +240 -0
  158. package/skills/semgrep-rule-variant-creator/SKILL.md +205 -0
  159. package/skills/semgrep-rule-variant-creator/references/applicability-analysis.md +250 -0
  160. package/skills/semgrep-rule-variant-creator/references/language-syntax-guide.md +324 -0
  161. package/skills/semgrep-rule-variant-creator/references/workflow.md +518 -0
  162. package/skills/sharp-edges/SKILL.md +292 -0
  163. package/skills/sharp-edges/references/auth-patterns.md +252 -0
  164. package/skills/sharp-edges/references/case-studies.md +274 -0
  165. package/skills/sharp-edges/references/config-patterns.md +333 -0
  166. package/skills/sharp-edges/references/crypto-apis.md +190 -0
  167. package/skills/sharp-edges/references/lang-c.md +205 -0
  168. package/skills/sharp-edges/references/lang-csharp.md +285 -0
  169. package/skills/sharp-edges/references/lang-go.md +270 -0
  170. package/skills/sharp-edges/references/lang-java.md +263 -0
  171. package/skills/sharp-edges/references/lang-javascript.md +269 -0
  172. package/skills/sharp-edges/references/lang-kotlin.md +265 -0
  173. package/skills/sharp-edges/references/lang-php.md +245 -0
  174. package/skills/sharp-edges/references/lang-python.md +274 -0
  175. package/skills/sharp-edges/references/lang-ruby.md +273 -0
  176. package/skills/sharp-edges/references/lang-rust.md +272 -0
  177. package/skills/sharp-edges/references/lang-swift.md +287 -0
  178. package/skills/sharp-edges/references/language-specific.md +588 -0
  179. package/skills/spec-to-code-compliance/SKILL.md +357 -0
  180. package/skills/spec-to-code-compliance/resources/COMPLETENESS_CHECKLIST.md +69 -0
  181. package/skills/spec-to-code-compliance/resources/IR_EXAMPLES.md +417 -0
  182. package/skills/spec-to-code-compliance/resources/OUTPUT_REQUIREMENTS.md +105 -0
  183. package/skills/supply-chain-risk-auditor/SKILL.md +67 -0
  184. package/skills/supply-chain-risk-auditor/resources/results-template.md +41 -0
  185. package/skills/variant-analysis/METHODOLOGY.md +327 -0
  186. package/skills/variant-analysis/SKILL.md +142 -0
  187. package/skills/variant-analysis/resources/codeql/cpp.ql +119 -0
  188. package/skills/variant-analysis/resources/codeql/go.ql +69 -0
  189. package/skills/variant-analysis/resources/codeql/java.ql +71 -0
  190. package/skills/variant-analysis/resources/codeql/javascript.ql +63 -0
  191. package/skills/variant-analysis/resources/codeql/python.ql +80 -0
  192. package/skills/variant-analysis/resources/semgrep/cpp.yaml +98 -0
  193. package/skills/variant-analysis/resources/semgrep/go.yaml +63 -0
  194. package/skills/variant-analysis/resources/semgrep/java.yaml +61 -0
  195. package/skills/variant-analysis/resources/semgrep/javascript.yaml +60 -0
  196. package/skills/variant-analysis/resources/semgrep/python.yaml +72 -0
  197. package/skills/variant-analysis/resources/variant-report-template.md +75 -0
  198. package/skills/vuln-report/SKILL.md +137 -0
  199. package/skills/vuln-report/agents/openai.yaml +4 -0
  200. package/skills/vuln-report/references/report-template.md +135 -0
  201. package/skills/wooyun-legacy/SKILL.md +367 -0
  202. package/skills/wooyun-legacy/references/bank-penetration.md +222 -0
  203. package/skills/wooyun-legacy/references/checklists/command-execution-checklist.md +119 -0
  204. package/skills/wooyun-legacy/references/checklists/csrf-checklist.md +74 -0
  205. package/skills/wooyun-legacy/references/checklists/file-upload-checklist.md +108 -0
  206. package/skills/wooyun-legacy/references/checklists/info-disclosure-checklist.md +114 -0
  207. package/skills/wooyun-legacy/references/checklists/logic-flaws-checklist.md +95 -0
  208. package/skills/wooyun-legacy/references/checklists/misconfig-checklist.md +124 -0
  209. package/skills/wooyun-legacy/references/checklists/path-traversal-checklist.md +87 -0
  210. package/skills/wooyun-legacy/references/checklists/rce-checklist.md +93 -0
  211. package/skills/wooyun-legacy/references/checklists/sql-injection-checklist.md +97 -0
  212. package/skills/wooyun-legacy/references/checklists/ssrf-checklist.md +99 -0
  213. package/skills/wooyun-legacy/references/checklists/unauthorized-access-checklist.md +89 -0
  214. package/skills/wooyun-legacy/references/checklists/weak-password-checklist.md +115 -0
  215. package/skills/wooyun-legacy/references/checklists/xss-checklist.md +103 -0
  216. package/skills/wooyun-legacy/references/checklists/xxe-checklist.md +130 -0
  217. package/skills/wooyun-legacy/references/info-disclosure.md +975 -0
  218. package/skills/wooyun-legacy/references/logic-flaws.md +721 -0
  219. package/skills/wooyun-legacy/references/path-traversal.md +1191 -0
  220. package/skills/wooyun-legacy/references/telecom-penetration.md +156 -0
  221. package/skills/wooyun-legacy/references/unauthorized-access.md +980 -0
  222. package/skills/wooyun-legacy/references/xss.md +746 -0
  223. package/skills/zeroize-audit/SKILL.md +371 -0
  224. package/skills/zeroize-audit/configs/c.yaml +21 -0
  225. package/skills/zeroize-audit/configs/default.yaml +128 -0
  226. package/skills/zeroize-audit/configs/rust.yaml +83 -0
  227. package/skills/zeroize-audit/prompts/report_template.md +238 -0
  228. package/skills/zeroize-audit/prompts/system.md +163 -0
  229. package/skills/zeroize-audit/prompts/task.md +97 -0
  230. package/skills/zeroize-audit/references/compile-commands.md +231 -0
  231. package/skills/zeroize-audit/references/detection-strategy.md +191 -0
  232. package/skills/zeroize-audit/references/ir-analysis.md +252 -0
  233. package/skills/zeroize-audit/references/mcp-analysis.md +221 -0
  234. package/skills/zeroize-audit/references/poc-generation.md +470 -0
  235. package/skills/zeroize-audit/references/rust-zeroization-patterns.md +867 -0
  236. package/skills/zeroize-audit/schemas/input.json +83 -0
  237. package/skills/zeroize-audit/schemas/output.json +140 -0
  238. package/skills/zeroize-audit/tools/analyze_asm.sh +202 -0
  239. package/skills/zeroize-audit/tools/analyze_cfg.py +381 -0
  240. package/skills/zeroize-audit/tools/analyze_heap.sh +211 -0
  241. package/skills/zeroize-audit/tools/analyze_ir_semantic.py +429 -0
  242. package/skills/zeroize-audit/tools/diff_ir.sh +135 -0
  243. package/skills/zeroize-audit/tools/diff_rust_mir.sh +189 -0
  244. package/skills/zeroize-audit/tools/emit_asm.sh +67 -0
  245. package/skills/zeroize-audit/tools/emit_ir.sh +77 -0
  246. package/skills/zeroize-audit/tools/emit_rust_asm.sh +178 -0
  247. package/skills/zeroize-audit/tools/emit_rust_ir.sh +150 -0
  248. package/skills/zeroize-audit/tools/emit_rust_mir.sh +158 -0
  249. package/skills/zeroize-audit/tools/extract_compile_flags.py +284 -0
  250. package/skills/zeroize-audit/tools/generate_poc.py +1329 -0
  251. package/skills/zeroize-audit/tools/mcp/apply_confidence_gates.py +113 -0
  252. package/skills/zeroize-audit/tools/mcp/check_mcp.sh +68 -0
  253. package/skills/zeroize-audit/tools/mcp/normalize_mcp_evidence.py +125 -0
  254. package/skills/zeroize-audit/tools/scripts/check_llvm_patterns.py +481 -0
  255. package/skills/zeroize-audit/tools/scripts/check_mir_patterns.py +554 -0
  256. package/skills/zeroize-audit/tools/scripts/check_rust_asm.py +424 -0
  257. package/skills/zeroize-audit/tools/scripts/check_rust_asm_aarch64.py +300 -0
  258. package/skills/zeroize-audit/tools/scripts/check_rust_asm_x86.py +283 -0
  259. package/skills/zeroize-audit/tools/scripts/find_dangerous_apis.py +375 -0
  260. package/skills/zeroize-audit/tools/scripts/semantic_audit.py +923 -0
  261. package/skills/zeroize-audit/tools/track_dataflow.sh +196 -0
  262. package/skills/zeroize-audit/tools/validate_rust_toolchain.sh +298 -0
  263. package/skills/zeroize-audit/workflows/phase-0-preflight.md +150 -0
  264. package/skills/zeroize-audit/workflows/phase-1-source-analysis.md +144 -0
  265. package/skills/zeroize-audit/workflows/phase-2-compiler-analysis.md +139 -0
  266. package/skills/zeroize-audit/workflows/phase-3-interim-report.md +46 -0
  267. package/skills/zeroize-audit/workflows/phase-4-poc-generation.md +46 -0
  268. package/skills/zeroize-audit/workflows/phase-5-poc-validation.md +136 -0
  269. package/skills/zeroize-audit/workflows/phase-6-final-report.md +44 -0
  270. package/skills/zeroize-audit/workflows/phase-7-test-generation.md +42 -0
  271. package/themes/piolium-srcery.json +94 -0
@@ -0,0 +1,240 @@
1
+ # Semgrep Rule Creation Workflow
2
+
3
+ Detailed workflow for creating production-quality Semgrep rules.
4
+
5
+ ## Step 1: Analyze the Problem
6
+
7
+ Before writing any code:
8
+
9
+ 1. **Fetch external documentation**: See [Documentation](../SKILL.md#documentation) for required reading
10
+ 2. **Understand the exact bug pattern and explain the bug for a junior developer**: What vulnerability, issue or pattern should be detected?
11
+ 3. **Identify the target language**: What is specific to this bug in that language?
12
+ 4. **Determine the approach**:
13
+ - **Pattern matching**: Syntactic patterns without data flow
14
+ - **Taint mode**: Data flows from untrusted source to dangerous sink
15
+
16
+ ### When to Use Taint Mode
17
+
18
+ Taint mode is a powerful feature in Semgrep that can track the flow of data from one location to another. By using taint mode, you can:
19
+
20
+ - **Track data flow across multiple variables**: Trace how data moves across variables, functions, and components, and identify insecure flow paths (e.g., situations where a specific sanitizer is not used).
21
+ - **Find injection vulnerabilities**: Identify injection vulnerabilities such as SQL injection, command injection, and XSS attacks.
22
+ - **Write simple and resilient Semgrep rules**: Write simpler rules that still match when the code pattern is nested in if statements, loops, and other structures.
23
+
24
+ ## Step 2: Write Tests First
25
+
26
+ **Why test-first?** Writing tests before the rule forces you to think about both vulnerable AND safe cases. Rules written without tests often have hidden false positives (matching safe cases) or false negatives (missing vulnerable variants). Tests make these visible immediately.
27
+
28
+ Create a directory and a test file that uses only the `# ruleid:` and `# ok:` annotations. See [quick-reference.md]({baseDir}/references/quick-reference.md#test-file-annotations) for full syntax.
29
+
30
+ ### Directory Structure
31
+
32
+ ```
33
+ <rule-id>/
34
+ ├── <rule-id>.yaml # Semgrep rule
35
+ └── <rule-id>.<ext> # Test file with ruleid/ok annotations
36
+ ```
37
+
38
+ **CRITICAL**:
39
+ 1. The comment (`# ruleid:` or `# ok:`) must be on the line IMMEDIATELY BEFORE the code. Semgrep reports findings on the line after the annotation.
40
+ 2. The comment must contain ONLY the comment marker and annotation (e.g., `# ruleid: my-rule`). No other text, comments, or code on the same line.
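The two placement rules above can be checked mechanically before running Semgrep. The following is a minimal sketch, not part of Semgrep: `check_annotations` is a hypothetical helper, and its regexes assume `#` or `//` comment markers and a single rule ID per annotation.

```python
import re

# A well-formed annotation line: comment marker, ruleid/ok, one rule ID, nothing else.
STRICT = re.compile(r"^\s*(?:#|//)\s*(?:ruleid|ok):\s*[\w.-]+\s*$")
# Anything that looks like an annotation, wherever it sits on the line.
LOOSE = re.compile(r"(?:#|//)\s*(?:ruleid|ok):")


def check_annotations(source: str) -> list:
    """Return (line_number, problem) tuples for misplaced annotations (hypothetical helper)."""
    problems = []
    lines = source.splitlines()
    for index, line in enumerate(lines):
        if not LOOSE.search(line):
            continue
        if not STRICT.match(line):
            # Rule 2: the annotation must be alone on its line.
            problems.append((index + 1, "annotation shares its line with other text"))
            continue
        following = lines[index + 1] if index + 1 < len(lines) else ""
        if not following.strip():
            # Rule 1: the code must sit on the very next line.
            problems.append((index + 1, "no code on the line immediately after"))
    return problems
```

A trailing annotation such as `os.system(cmd)  # ruleid: my-rule` is reported under rule 2, and an annotation followed by a blank line is reported under rule 1. The loose regex can also flag unrelated text that happens to contain `//ok:`, so treat the output as a hint rather than a verdict.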
41
+
42
+ ### Test Case Design
43
+
44
+ You must include test cases for:
45
+ - Clear vulnerable cases (must match)
46
+ - Clear safe cases (must not match)
47
+ - Edge cases and variations
48
+ - Different coding styles
49
+ - Sanitized/validated input (must not match)
50
+ - Unrelated code (must not match) - normal code with no relation to the rule's target pattern
51
+ - Nested structures (e.g., inside if statements, loops, try/catch blocks, callbacks)
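As an illustration of that checklist, here is a sketch of a Python test file for a hypothetical rule with the ID `weak-hash-md5`, assumed to flag `hashlib.new("md5", ...)`. The rule ID, function names, and the choice of vulnerability are all assumptions made for this example.

```python
import hashlib


def vulnerable_direct(data: bytes) -> str:
    # ruleid: weak-hash-md5
    return hashlib.new("md5", data).hexdigest()


def vulnerable_nested(chunks: list) -> list:
    digests = []
    for chunk in chunks:
        if chunk:
            # ruleid: weak-hash-md5
            digests.append(hashlib.new("md5", chunk).hexdigest())
    return digests


def safe_sha256(data: bytes) -> str:
    # ok: weak-hash-md5
    return hashlib.new("sha256", data).hexdigest()


def unrelated(values: list) -> int:
    # ok: weak-hash-md5
    return sum(values)
```

The file covers a clear vulnerable case, a nested variant inside a loop and an `if`, a clear safe case, and unrelated code. Each annotation sits alone on the line directly above the statement it describes.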
52
+
53
+ ## Step 3: Analyze AST Structure
54
+
55
+ **Why analyze AST?** Semgrep matches against the AST, not raw text. Code that looks similar may parse differently (e.g., `foo.bar()` vs `foo().bar`). The AST dump shows exactly what Semgrep sees, preventing patterns that fail due to unexpected tree structure. Understanding how exactly Semgrep parses code is crucial for writing precise patterns.
56
+
57
+ ```bash
58
+ semgrep --dump-ast -l <language> <rule-id>.<ext>
59
+ ```
60
+
61
+ The AST dump helps you understand:
62
+ - How function calls are represented
63
+ - How variables are bound
64
+ - How control flow is structured
65
+
66
+ ## Step 4: Write the Rule
67
+
68
+ Choose the appropriate pattern operators and write the rule.
69
+
70
+ For pattern operator syntax (basic matching, scope operators, metavariable filters, focus), see [quick-reference.md](quick-reference.md).
71
+
72
+ ### Validate and Test
73
+
74
+ #### Validate YAML Syntax
75
+
76
+ ```bash
77
+ semgrep --validate --config <rule-id>.yaml
78
+ ```
79
+
80
+ #### Run Tests
81
+
82
+ ```bash
83
+ cd <rule-directory>
84
+ semgrep --test --config <rule-id>.yaml <rule-id>.<ext>
85
+ ```
86
+
87
+ #### Expected Output
88
+
89
+ ```
90
+ 1/1: ✓ All tests passed
91
+ ```
92
+
93
+ #### Debug Failures
94
+
95
+ If tests fail, check:
96
+ 1. **Missed lines**: Rule didn't match when it should
97
+ - Pattern too specific
98
+ - Missing pattern variant
99
+ 2. **Incorrect lines**: Rule matched when it shouldn't
100
+ - Pattern too broad
101
+ - Need `pattern-not` exclusion
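The distinction between missed and incorrect lines is a set difference between the lines the annotations expect and the lines the rule reported. The sketch below only illustrates that bookkeeping and does not describe how Semgrep implements it; `expected_lines` and `classify` are hypothetical names.

```python
import re


def expected_lines(source: str, rule_id: str) -> set:
    """Line numbers (1-based) that should match: the line after each ruleid annotation."""
    marker = re.compile(r"^\s*(?:#|//)\s*ruleid:\s*" + re.escape(rule_id) + r"\s*$")
    lines = source.splitlines()
    # index is 0-based, so the annotation is line index + 1 and the code is index + 2.
    return {index + 2 for index, line in enumerate(lines) if marker.match(line)}


def classify(expected: set, reported: set) -> dict:
    """Split a test run into false negatives (missed) and false positives (incorrect)."""
    return {
        "missed": sorted(expected - reported),
        "incorrect": sorted(reported - expected),
    }
```

For example, if annotations expect matches on lines 2 and 6 but the rule reports lines 2 and 4, line 6 is missed (the pattern is too specific) and line 4 is incorrect (the pattern is too broad).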
102
+
103
+ #### Debug Taint Mode Rules
104
+
105
+ ```bash
106
+ semgrep --dataflow-traces -f <rule-id>.yaml <rule-id>.<ext>
107
+ ```
108
+
109
+ Shows:
110
+ - Source locations
111
+ - Sink locations
112
+ - Data flow path
113
+ - Why taint didn't propagate (if applicable)
114
+
115
+ ## Step 5: Iterate Until Tests Pass
116
+ Refine the rule's patterns iteratively until the rule behaves correctly.
117
+
118
+ After every change, re-run the tests:
119
+
120
+ ```bash
121
+ semgrep --test --config <rule-id>.yaml <rule-id>.<ext>
122
+ ```
123
+
124
+ For debugging taint mode rules:
125
+ ```bash
126
+ semgrep --dataflow-traces -f <rule-id>.yaml <rule-id>.<ext>
127
+ ```
128
+
129
+ **Verification checkpoint**: Output MUST show "All tests passed". **Only proceed when validation passes**.
130
+
131
+
132
+ **Verification checkpoint**: Proceed to Step 6 (Optimize the Rule) only when the test output shows:
133
+ - "All tests passed"
134
+ - No "missed lines" (false negatives)
135
+ - No "incorrect lines" (false positives)
136
+
137
+ ### Common Fixes
138
+
139
+ | Problem | Solution |
140
+ |---------|----------|
141
+ | Too many matches | Add `pattern-not` exclusions |
142
+ | Missing matches | Add `pattern-either` variants |
143
+ | Wrong line matched | Adjust `focus-metavariable` |
144
+ | Taint not flowing | Check sanitizers aren't too broad |
145
+ | Taint false positive | Add sanitizer pattern |
146
+
147
+ ## Step 6: Optimize the Rule
148
+
149
+ After all tests pass, remove redundant patterns (quote variants, ellipsis subsets, and near-duplicates that a metavariable can consolidate).
150
+
151
+ ### Semgrep Pattern Equivalences
152
+
153
+ Semgrep treats certain patterns as equivalent:
154
+
155
+ | Written | Also Matches | Reason |
156
+ |---------|--------------|--------|
157
+ | `"string"` | `'string'` | Quote style normalized (in languages where both are equivalent) |
158
+ | `func(...)` | `func()`, `func(a)`, `func(a,b)` | Ellipsis matches zero or more |
159
+ | `func($X, ...)` | `func($X)`, `func($X, a, b)` | Trailing ellipsis is optional |
160
+
161
+ ### Common Redundancies to Remove
162
+
163
+ **1. Quote Variants** (depends on the language)
164
+
165
+ Before:
166
+ ```yaml
167
+ pattern-either:
168
+ - pattern: hashlib.new("md5", ...)
169
+ - pattern: hashlib.new('md5', ...)
170
+ ```
171
+
172
+ After:
173
+ ```yaml
174
+ pattern-either:
175
+ - pattern: hashlib.new("md5", ...)
176
+ ```
177
+
178
+ **2. Ellipsis Subsets**
179
+
180
+ Before:
181
+ ```yaml
182
+ pattern-either:
183
+ - pattern: dangerous($X, ...)
184
+ - pattern: dangerous($X)
185
+ - pattern: dangerous($X, $Y)
186
+ ```
187
+
188
+ After:
189
+ ```yaml
190
+ pattern: dangerous($X, ...)
191
+ ```
192
+
193
+ **3. Consolidate with Metavariables**
194
+
195
+ Before:
196
+ ```yaml
197
+ pattern-either:
198
+ - pattern: md5($X)
199
+ - pattern: sha1($X)
200
+ - pattern: sha256($X)
201
+ ```
202
+
203
+ After:
204
+ ```yaml
205
+ patterns:
206
+   - pattern: $FUNC($X)
207
+   - metavariable-regex:
208
+       metavariable: $FUNC
209
+       regex: ^(md5|sha1|sha256)$
210
+ ```
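The `^` and `$` anchors in that regex matter: without them the consolidated rule could match more function names than the three original patterns did. The sketch below uses Python's `re` module as a stand-in for Semgrep's regex engine, which is an assumption, since the two may differ in details; the candidate names are invented for the example.

```python
import re

ANCHORED = re.compile(r"^(md5|sha1|sha256)$")
UNANCHORED = re.compile(r"md5|sha1|sha256")

# Hypothetical function names that $FUNC could bind to.
candidates = ["md5", "sha1", "sha256", "sha256_hmac", "my_md5", "sha512"]

anchored_hits = [name for name in candidates if ANCHORED.search(name)]
unanchored_hits = [name for name in candidates if UNANCHORED.search(name)]

print(anchored_hits)    # ['md5', 'sha1', 'sha256']
print(unanchored_hits)  # ['md5', 'sha1', 'sha256', 'sha256_hmac', 'my_md5']
```

Only the anchored form matches exactly the three names the `pattern-either` version listed. Add a test case such as `sha256_hmac($X)` with an `ok:` annotation to confirm the behavior in Semgrep itself.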
211
+
212
+ ### Optimization Checklist
213
+
214
+ 1. Remove patterns differing only in quote style
215
+ 2. Remove patterns that are subsets of `...` patterns
216
+ 3. Consolidate similar patterns using metavariable-regex
217
+ 4. Remove duplicate patterns in pattern-either
218
+ 5. Simplify nested pattern-either when possible
219
+ 6. Replace complex regex patterns with metavariable-comparison
220
+ 7. **Re-run tests after each optimization**
221
+
222
+ ### Verify After Optimization
223
+
224
+ ```bash
225
+ semgrep --test --config <rule-id>.yaml <rule-id>.<ext>
226
+ ```
227
+
228
+ **CRITICAL**: Always re-run tests after optimization. Some "redundant" patterns may actually be necessary due to AST structure differences. If any test fails, revert the optimization that caused it.
229
+
230
+ **Task complete ONLY when**: All tests pass after optimization.
231
+
232
+
233
+ ## Step 7: Final Run
234
+ Run the Semgrep rule you created using: `semgrep --config <rule-id>.yaml <rule-id>.<ext>`.
235
+
236
+ Ensure that the rule's `message`:
237
+ 1. Contains a short and concise explanation of the matched pattern
238
+ 2. Has no uninterpolated metavariables (e.g., $OP, $VAR). All metavariables referenced in the message must be captured by the pattern so they interpolate to actual code.
239
+
240
+ Fix any message issues and re-run that Semgrep rule after each fix.
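The second message check can be approximated before running the rule. This is a rough sketch under stated assumptions: `uninterpolated` is a hypothetical helper, it takes the message and the pattern strings directly rather than parsing the rule YAML, and it treats any `$` followed by uppercase letters, digits, or underscores as a metavariable.

```python
import re

# Assumed metavariable shape: a dollar sign followed by an uppercase identifier.
METAVAR = re.compile(r"\$[A-Z_][A-Z0-9_]*")


def uninterpolated(message: str, patterns: list) -> list:
    """Return metavariables used in the message that no pattern binds (hypothetical helper)."""
    bound = set()
    for pattern in patterns:
        bound.update(METAVAR.findall(pattern))
    return sorted(set(METAVAR.findall(message)) - bound)
```

Calling `uninterpolated("Weak hash $FUNC applied to $DATA", ["$FUNC($X)"])` returns `['$DATA']`, because the pattern binds `$FUNC` and `$X` but never `$DATA`. An empty list means every metavariable in the message has a binding to interpolate from.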
@@ -0,0 +1,205 @@
1
+ ---
2
+ name: semgrep-rule-variant-creator
3
+ description: Creates language variants of existing Semgrep rules. Use when porting a Semgrep rule to specified target languages. Takes an existing rule and target languages as input, produces independent rule+test directories for each language.
4
+ allowed-tools:
5
+ - Bash
6
+ - Read
7
+ - Write
8
+ - Edit
9
+ - Glob
10
+ - Grep
11
+ - WebFetch
12
+ ---
13
+
14
+ # Semgrep Rule Variant Creator
15
+
16
+ Port existing Semgrep rules to new target languages with proper applicability analysis and test-driven validation.
17
+
18
+ ## When to Use
19
+
20
+ **Ideal scenarios:**
21
+ - Porting an existing Semgrep rule to one or more target languages
22
+ - Creating language-specific variants of a universal vulnerability pattern
23
+ - Expanding rule coverage across a polyglot codebase
24
+ - Translating rules between languages with equivalent constructs
25
+
26
+ ## When NOT to Use
27
+
28
+ Do NOT use this skill for:
29
+ - Creating a new Semgrep rule from scratch (use `semgrep-rule-creator` instead)
30
+ - Running existing rules against code
31
+ - Languages where the vulnerability pattern fundamentally doesn't apply
32
+ - Minor syntax variations within the same language
33
+
34
+ ## Input Specification
35
+
36
+ This skill requires:
37
+ 1. **Existing Semgrep rule** - YAML file path or YAML rule content
38
+ 2. **Target languages** - One or more languages to port to (e.g., "Golang and Java")
39
+
40
+ ## Output Specification
41
+
42
+ For each applicable target language, produces:
43
+ ```
44
+ <original-rule-id>-<language>/
45
+ ├── <original-rule-id>-<language>.yaml # Ported Semgrep rule
46
+ └── <original-rule-id>-<language>.<ext> # Test file with annotations
47
+ ```
48
+
49
+ Example output for porting `sql-injection` to Go and Java:
50
+ ```
51
+ sql-injection-golang/
52
+ ├── sql-injection-golang.yaml
53
+ └── sql-injection-golang.go
54
+
55
+ sql-injection-java/
56
+ ├── sql-injection-java.yaml
57
+ └── sql-injection-java.java
58
+ ```
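The naming convention can be expressed as a small function. This is a sketch only: `variant_paths` is a hypothetical helper, and the language label and file extension are passed in explicitly because the mapping between them (for example `golang` to `go`) is not defined here.

```python
from pathlib import PurePosixPath


def variant_paths(rule_id: str, language: str, ext: str) -> tuple:
    """Return (rule file, test file) paths for one ported variant (hypothetical helper)."""
    name = f"{rule_id}-{language}"
    directory = PurePosixPath(name)
    return directory / f"{name}.yaml", directory / f"{name}.{ext}"


rule_file, test_file = variant_paths("sql-injection", "golang", "go")
print(rule_file)  # sql-injection-golang/sql-injection-golang.yaml
print(test_file)  # sql-injection-golang/sql-injection-golang.go
```

Keeping the directory name, rule file name, and test file name identical apart from the extension keeps each variant self-contained, which matches the one-directory-per-language layout shown above.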
59
+
60
+ ## Rationalizations to Reject
61
+
62
+ When porting Semgrep rules, reject these common shortcuts:
63
+
64
+ | Rationalization | Why It Fails | Correct Approach |
65
+ |-----------------|--------------|------------------|
66
+ | "Pattern structure is identical" | Different ASTs across languages | Always dump AST for target language |
67
+ | "Same vulnerability, same detection" | Data flow differs between languages | Analyze target language idioms |
68
+ | "Rule doesn't need tests since original worked" | Language edge cases differ | Write NEW test cases for target |
69
+ | "Skip applicability - it obviously applies" | Some patterns are language-specific | Complete applicability analysis first |
70
+ | "I'll create all variants then test" | Errors compound, hard to debug | Complete full cycle per language |
71
+ | "Library equivalent is close enough" | Surface similarity hides differences | Verify API semantics match |
72
+ | "Just translate the syntax 1:1" | Languages have different idioms | Research target language patterns |
73
+
74
+ ## Strictness Level
75
+
76
+ This workflow is **strict** - do not skip steps:
77
+ - **Applicability analysis is mandatory**: Don't assume patterns translate
78
+ - **Each language is independent**: Complete full cycle before moving to next
79
+ - **Test-first for each variant**: Never write a rule without test cases
80
+ - **100% test pass required**: "Most tests pass" is not acceptable
81
+
82
+ ## Overview
83
+
84
+ This skill guides the creation of language-specific variants of existing Semgrep rules. Each target language goes through an independent 4-phase cycle:
85
+
86
+ ```
87
+ FOR EACH target language:
88
+ Phase 1: Applicability Analysis → Verdict
89
+ Phase 2: Test Creation (Test-First)
90
+ Phase 3: Rule Creation
91
+ Phase 4: Validation
92
+ (Complete full cycle before moving to next language)
93
+ ```
94
+
95
+ ## Foundational Knowledge
96
+
97
+ **The `semgrep-rule-creator` skill is the authoritative reference for Semgrep rule creation fundamentals.** While this skill focuses on porting existing rules to new languages, the core principles of writing quality rules remain the same.
98
+
99
+ Consult `semgrep-rule-creator` for guidance on:
100
+ - **When to use taint mode vs pattern matching** - Choosing the right approach for the vulnerability type
101
+ - **Test-first methodology** - Why tests come before rules and how to write effective test cases
102
+ - **Anti-patterns to avoid** - Common mistakes like overly broad or overly specific patterns
103
+ - **Iterating until tests pass** - The validation loop and debugging techniques
104
+ - **Rule optimization** - Removing redundant patterns after tests pass
105
+
106
+ When porting a rule, you're applying these same principles in a new language context. If uncertain about rule structure or approach, refer to `semgrep-rule-creator` first.
107
+
108
+ ## Four-Phase Workflow
109
+
110
+ ### Phase 1: Applicability Analysis
111
+
112
+ Before porting, determine if the pattern applies to the target language.
113
+
114
+ **Analysis criteria:**
115
+ 1. Does the vulnerability class exist in the target language?
116
+ 2. Does an equivalent construct exist (function, pattern, library)?
117
+ 3. Are the semantics similar enough for meaningful detection?
118
+
119
+ **Verdict options:**
120
+ - `APPLICABLE` → Proceed with variant creation
121
+ - `APPLICABLE_WITH_ADAPTATION` → Proceed but significant changes needed
122
+ - `NOT_APPLICABLE` → Skip this language, document why
123
+
124
+ See [applicability-analysis.md]({baseDir}/references/applicability-analysis.md) for detailed guidance.
125
+
126
+ ### Phase 2: Test Creation (Test-First)
127
+
128
+ **Always write tests before the rule.**
129
+
130
+ Create a test file that uses target-language idioms:
131
+ - Minimum 2 vulnerable cases (`ruleid:`)
132
+ - Minimum 2 safe cases (`ok:`)
133
+ - Include language-specific edge cases
134
+
135
+ ```go
136
+ // ruleid: sql-injection-golang
137
+ db.Query("SELECT * FROM users WHERE id = " + userInput)
138
+
139
+ // ok: sql-injection-golang
140
+ db.Query("SELECT * FROM users WHERE id = ?", userInput)
141
+ ```
142
+
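The minimums listed above can be checked mechanically before any rule is written. The following is a minimal sketch in Python, assuming annotations appear in line comments as `ruleid: <rule-id>` and `ok: <rule-id>`, as in the Go snippet. `count_annotations` and `meets_minimum` are hypothetical helper names, not part of Semgrep.

```python
import re

# Matches "ruleid: <id>" or "ok: <id>" inside a "#" or "//" line comment.
ANNOTATION = re.compile(r"(?:#|//)\s*(ruleid|ok):\s*([\w.-]+)")


def count_annotations(source: str, rule_id: str) -> dict:
    """Count ruleid:/ok: annotations that refer to the given rule ID."""
    counts = {"ruleid": 0, "ok": 0}
    for line in source.splitlines():
        match = ANNOTATION.search(line)
        if match and match.group(2) == rule_id:
            counts[match.group(1)] += 1
    return counts


def meets_minimum(counts: dict, minimum: int = 2) -> bool:
    """True when both vulnerable and safe cases reach the minimum."""
    return counts["ruleid"] >= minimum and counts["ok"] >= minimum


# The two-case Go snippet from this section, used as sample input.
GO_TEST = '''\
// ruleid: sql-injection-golang
db.Query("SELECT * FROM users WHERE id = " + userInput)

// ok: sql-injection-golang
db.Query("SELECT * FROM users WHERE id = ?", userInput)
'''

counts = count_annotations(GO_TEST, "sql-injection-golang")
print(counts, meets_minimum(counts))
```

Run against the snippet from this section, the check fails: the snippet has one vulnerable and one safe case, so a real test file needs at least one more of each.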
143
+ ### Phase 3: Rule Creation
144
+
145
+ 1. **Analyze AST**: `semgrep --dump-ast -l <lang> test-file`
146
+ 2. **Translate patterns** to target language syntax
147
+ 3. **Update metadata**: language key, message, rule ID
148
+ 4. **Adapt for idioms**: Handle language-specific constructs
149
+
150
+ See [language-syntax-guide.md]({baseDir}/references/language-syntax-guide.md) for translation guidance.
151
+
152
+ ### Phase 4: Validation
153
+
154
+ ```bash
155
+ # Validate YAML
156
+ semgrep --validate --config rule.yaml
157
+
158
+ # Run tests
159
+ semgrep --test --config rule.yaml test-file
160
+ ```
161
+
162
+ **Checkpoint**: Output MUST show `All tests passed`.
163
+
164
+ For taint rule debugging:
165
+ ```bash
166
+ semgrep --dataflow-traces -f rule.yaml test-file
167
+ ```
168
+
169
+ See [workflow.md]({baseDir}/references/workflow.md) for detailed workflow and troubleshooting.
170
+
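The Phase 4 checkpoint can be scripted so that it is never skipped. This is a sketch only: it reuses the two commands shown above, and it assumes the success marker is the literal string `All tests passed` quoted in the checkpoint. `validation_commands`, `checkpoint_passed`, and `run_phase4` are hypothetical names.

```python
import subprocess
from typing import List


def validation_commands(rule: str, test_file: str) -> List[List[str]]:
    """The two Phase 4 commands from this section, as argument lists."""
    return [
        ["semgrep", "--validate", "--config", rule],
        ["semgrep", "--test", "--config", rule, test_file],
    ]


def checkpoint_passed(output: str) -> bool:
    """Assumption: a fully passing test run contains this exact phrase."""
    return "All tests passed" in output


def run_phase4(rule: str, test_file: str) -> bool:
    """Run validation, then tests; stop at the first failing command."""
    output = ""
    for argv in validation_commands(rule, test_file):
        proc = subprocess.run(argv, capture_output=True, text=True)
        if proc.returncode != 0:
            return False
        output = proc.stdout + proc.stderr
    # Only the output of the last command (the test run) is inspected.
    return checkpoint_passed(output)
```

Calling `run_phase4("rule.yaml", "test-file")` requires `semgrep` on the `PATH`; the two helper functions can be used on their own without it.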
171
+ ## Quick Reference
172
+
173
+ | Task | Command |
174
+ |------|---------|
175
+ | Run tests | `semgrep --test --config rule.yaml test-file` |
176
+ | Validate YAML | `semgrep --validate --config rule.yaml` |
177
+ | Dump AST | `semgrep --dump-ast -l <lang> <file>` |
178
+ | Debug taint flow | `semgrep --dataflow-traces -f rule.yaml file` |
179
+
180
+
181
+ ## Key Differences from Rule Creation
182
+
183
+ | Aspect | semgrep-rule-creator | This skill |
184
+ |--------|---------------------|------------|
185
+ | Input | Bug pattern description | Existing rule + target languages |
186
+ | Output | Single rule+test | Multiple rule+test directories |
187
+ | Workflow | Single creation cycle | Independent cycle per language |
188
+ | Phase 1 | Problem analysis | Applicability analysis per language |
189
+ | Library research | Always relevant | Optional (when original uses libraries) |
190
+
191
+ ## Documentation
192
+
193
+ **REQUIRED**: Before porting rules, read relevant Semgrep documentation:
194
+
195
+ - [Rule Syntax](https://semgrep.dev/docs/writing-rules/rule-syntax) - YAML structure and operators
196
+ - [Pattern Syntax](https://semgrep.dev/docs/writing-rules/pattern-syntax) - Pattern matching and metavariables
197
+ - [Pattern Examples](https://semgrep.dev/docs/writing-rules/pattern-examples) - Per-language pattern references
198
+ - [Testing Rules](https://semgrep.dev/docs/writing-rules/testing-rules) - Testing annotations
199
+ - [Trail of Bits Testing Handbook](https://appsec.guide/docs/static-analysis/semgrep/advanced/) - Advanced patterns
200
+
201
+ ## Next Steps
202
+
203
+ - For applicability analysis guidance, see [applicability-analysis.md]({baseDir}/references/applicability-analysis.md)
204
+ - For language translation guidance, see [language-syntax-guide.md]({baseDir}/references/language-syntax-guide.md)
205
+ - For detailed workflow and examples, see [workflow.md]({baseDir}/references/workflow.md)
@@ -0,0 +1,250 @@
1
+ # Applicability Analysis
2
+
3
+ Phase 1 of the variant creation workflow. Before porting a rule, analyze whether the vulnerability pattern applies to the target language.
4
+
5
+ ## Analysis Process
6
+
7
+ For EACH target language, answer these questions:
8
+
9
+ ### 1. Does the Vulnerability Class Exist?
10
+
11
+ **Determine if the vulnerability type is possible in the target language.**
12
+
13
+ Examples:
14
+ - Buffer overflow: Applies to C/C++, may apply to Rust (in unsafe blocks), does NOT apply to pure Python/Java code (native extensions and JNI aside)
15
+ - SQL injection: Applies to any language with database access
16
+ - XSS: Applies to any language generating HTML output
17
+ - Memory leak: Relevant in C/C++, less relevant in garbage-collected languages
18
+ - Type confusion: Relevant in dynamically typed languages and in C/C++ (unchecked casts), less relevant in statically typed, memory-safe languages
19
+
20
+ ### 2. Does an Equivalent Construct Exist?
21
+
22
+ **Identify what the original rule detects and find equivalents.**
23
+
24
+ Parse the original rule to identify:
25
+ - **Sinks**: What dangerous functions/methods does it detect?
26
+ - **Sources**: Where does tainted data originate?
27
+ - **Pattern type**: Is it taint-mode or pattern-matching?
28
+
29
+ Then research the target language:
30
+ - What are the equivalent dangerous functions?
31
+ - What are the common source patterns?
32
+ - Are there language-specific idioms to consider?
33
+
34
+ ### 3. Are the Semantics Similar Enough?
35
+
36
+ **Verify the pattern translates meaningfully.**
37
+
38
+ Consider:
39
+ - Does the vulnerability manifest the same way?
40
+ - Are there language-specific mitigations that change detection needs?
41
+ - Would the ported rule provide actual security value?
42
+
43
+ ## Verdict Format
44
+
45
+ Document your analysis for each target language:
46
+
47
+ ```
48
+ TARGET: <language>
49
+ VERDICT: APPLICABLE | APPLICABLE_WITH_ADAPTATION | NOT_APPLICABLE
50
+ REASONING: <specific analysis>
51
+ ADAPTATIONS_NEEDED: <if APPLICABLE_WITH_ADAPTATION>
52
+ EQUIVALENT_CONSTRUCTS:
53
+ - Original: <function/pattern>
54
+ - Target: <equivalent function/pattern>
55
+ ```
56
+
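One way to keep verdicts uniform across languages is to build them from a small record type rather than free text. The sketch below assumes the field names of the format above; the `Verdict` class and its `render` method are hypothetical and not part of this skill or of Semgrep.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

VERDICTS = ("APPLICABLE", "APPLICABLE_WITH_ADAPTATION", "NOT_APPLICABLE")


@dataclass
class Verdict:
    target: str
    verdict: str
    reasoning: str
    adaptations: List[str] = field(default_factory=list)
    # Pairs of (original construct, target construct).
    constructs: List[Tuple[str, str]] = field(default_factory=list)

    def __post_init__(self) -> None:
        if self.verdict not in VERDICTS:
            raise ValueError(f"unknown verdict: {self.verdict}")
        if self.verdict == "APPLICABLE_WITH_ADAPTATION" and not self.adaptations:
            raise ValueError("list the required adaptations")

    def render(self) -> str:
        """Render the record in the verdict format shown above."""
        lines = [
            f"TARGET: {self.target}",
            f"VERDICT: {self.verdict}",
            f"REASONING: {self.reasoning}",
        ]
        if self.adaptations:
            lines.append("ADAPTATIONS_NEEDED:")
            lines += [f"  - {item}" for item in self.adaptations]
        if self.constructs:
            lines.append("EQUIVALENT_CONSTRUCTS:")
            for original, target in self.constructs:
                lines.append(f"  - Original: {original}")
                lines.append(f"  - Target: {target}")
        return "\n".join(lines)


print(Verdict("Python", "NOT_APPLICABLE",
              "Python manages memory automatically.").render())
```

The constructor rejects an unknown verdict string and an `APPLICABLE_WITH_ADAPTATION` verdict with no listed adaptations, which mirrors the last two checklist items for Phase 1.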
57
+ ## Verdict Definitions
58
+
59
+ ### APPLICABLE
60
+
61
+ The pattern translates directly with minor syntax adjustments.
62
+
63
+ **Criteria:**
64
+ - Equivalent constructs exist with same semantics
65
+ - Vulnerability manifests identically
66
+ - Detection logic remains the same
67
+
68
+ **Example:**
69
+ ```
70
+ Original: Python os.system(user_input)
71
+ Target: Go exec.Command("sh", "-c", user_input)
72
+
73
+ VERDICT: APPLICABLE
74
+ REASONING: Both execute shell commands with user input. Vulnerability is
75
+ identical (command injection). Detection logic (taint from input to exec)
76
+ translates directly.
77
+ ```
78
+
79
+ ### APPLICABLE_WITH_ADAPTATION
80
+
81
+ The pattern can be ported but requires significant changes.
82
+
83
+ **Criteria:**
84
+ - Vulnerability class exists but manifests differently
85
+ - Equivalent constructs exist but with different APIs
86
+ - Additional patterns needed for target language idioms
87
+
88
+ **Example:**
89
+ ```
90
+ Original: Python pickle.loads(untrusted)
91
+ Target: Java ObjectInputStream.readObject()
92
+
93
+ VERDICT: APPLICABLE_WITH_ADAPTATION
94
+ REASONING: Both detect deserialization vulnerabilities but the APIs differ
95
+ significantly. Java requires detection of ObjectInputStream creation and
96
+ readObject() calls, not a single function call.
97
+ ADAPTATIONS_NEEDED:
98
+ - Different sink patterns (readObject vs loads)
99
+ - May need pattern-inside for ObjectInputStream context
100
+ - Consider readUnshared() variant
101
+ ```
102
+
103
+ ### NOT_APPLICABLE
104
+
105
+ The pattern should not be ported to this language.
106
+
107
+ **Criteria:**
108
+ - Vulnerability class doesn't exist in target language
109
+ - No equivalent construct exists
110
+ - Pattern would be meaningless or misleading
111
+
112
+ **Example:**
113
+ ```
114
+ Original: C buffer overflow detection
115
+ Target: Python
116
+
117
+ VERDICT: NOT_APPLICABLE
118
+ REASONING: Python handles memory management automatically. Buffer overflows
119
+ in the traditional C sense don't exist. The vulnerability class is not
120
+ present in the target language.
121
+ ```
122
+
123
+ ## Common Applicability Patterns
124
+
125
+ ### Always Translate (Language-Agnostic Vulnerabilities)
126
+
127
+ These vulnerability classes exist across most languages:
128
+ - SQL injection (any language with DB access)
129
+ - Command injection (any language with shell execution)
130
+ - Path traversal (any language with file operations)
131
+ - SSRF (any language with HTTP clients)
132
+ - XSS (any language generating HTML)
133
+
134
+ ### Sometimes Translate (Context-Dependent)
135
+
136
+ These require careful analysis:
137
+ - Deserialization: Different mechanisms per language
138
+ - Cryptographic weaknesses: Language-specific crypto libraries
139
+ - Race conditions: Depends on concurrency model
140
+ - Integer overflow: Depends on type system
141
+
142
+ ### Rarely Translate (Language-Specific)
143
+
144
+ These are often NOT_APPLICABLE for other languages:
145
+ - Memory corruption (C/C++ specific)
146
+ - Type juggling (PHP specific)
147
+ - Prototype pollution (JavaScript specific)
148
+ - GIL-related issues (Python specific)
149
+
150
+ ## Library-Specific Rules
151
+
152
+ When the original rule targets a third-party library:
153
+
154
+ ### Step 1: Identify the Library's Purpose
155
+
156
+ What functionality does the library provide?
157
+ - ORM / Database access
158
+ - HTTP client/server
159
+ - Serialization
160
+ - Templating
161
+ - etc.
162
+
163
+ ### Step 2: Research Target Language Ecosystem
164
+
165
+ For the target language, identify:
166
+ - Standard library equivalents
167
+ - Popular third-party libraries with same functionality
168
+ - Language-specific idioms for this functionality
169
+
170
+ ### Step 3: Decide on Scope
171
+
172
+ Options:
173
+ - **Native constructs only**: Port to standard library equivalents
174
+ - **Popular library**: Port to the most common library in target ecosystem
175
+ - **Multiple variants**: Create separate rules for multiple libraries
176
+
177
+ **Recommendation**: Start with standard library or most popular option. Additional library variants can be created separately if needed.
178
+
179
+ ## Analysis Checklist
180
+
181
+ Before proceeding past Phase 1:
182
+
183
+ - [ ] Parsed original rule and identified pattern type
184
+ - [ ] Identified sinks, sources, and sanitizers (if taint mode)
185
+ - [ ] Researched equivalent constructs in target language
186
+ - [ ] Documented verdict with specific reasoning
187
+ - [ ] If APPLICABLE_WITH_ADAPTATION, listed required changes
188
+ - [ ] If NOT_APPLICABLE, documented clear explanation
189
+
190
+ ## Example Analysis
191
+
192
+ **Original Rule**: Python command injection via subprocess
193
+
194
+ ```yaml
195
+ rules:
196
+   - id: python-command-injection
197
+     mode: taint
198
+     languages: [python]
199
+     pattern-sources:
200
+       - pattern: request.args.get(...)
201
+     pattern-sinks:
202
+       - pattern: subprocess.call($CMD, shell=True, ...)
203
+ ```
204
+
205
+ **Target**: Go
206
+
207
+ ```
208
+ TARGET: Go
209
+ VERDICT: APPLICABLE_WITH_ADAPTATION
210
+
211
+ REASONING:
212
+ - Command injection exists in Go (vulnerability class present)
213
+ - Go uses exec.Command() and exec.CommandContext() for command execution
214
+ - Go doesn't have shell=True equivalent; commands run directly by default
215
+ - Shell execution in Go requires explicit bash -c wrapping
216
+
217
+ EQUIVALENT_CONSTRUCTS:
218
+ - Original sink: subprocess.call(cmd, shell=True)
219
+ - Target sinks:
220
+ - exec.Command("bash", "-c", cmd)
221
+ - exec.Command("sh", "-c", cmd)
222
+ - exec.Command(cmd) when cmd comes from user input
223
+
224
+ ADAPTATIONS_NEEDED:
225
+ 1. Different sink patterns for Go's exec package
226
+ 2. Source patterns need Go HTTP handler equivalents (r.URL.Query(), r.FormValue())
227
+ 3. Consider both direct exec.Command and shell-wrapped variants
228
+ ```
229
+
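The distinction drawn in the Go reasoning, direct execution versus shell-wrapped execution, is the same one Python draws between an argument list and `shell=True`. The sketch below illustrates it from the Python side and assumes a POSIX `sh` is available; the payload string is made up for the demonstration.

```python
import subprocess
import sys

# Hypothetical attacker-controlled value containing a shell metacharacter.
payload = "hello; echo INJECTED"

# Direct execution: the payload is passed as one argument and never parsed
# by a shell. The child process simply prints its first argument.
direct = subprocess.run(
    [sys.executable, "-c", "import sys; print(sys.argv[1])", payload],
    capture_output=True,
    text=True,
).stdout

# Shell execution: the whole string is parsed by the shell, so ";" ends the
# first command and the rest runs as a second command.
shelled = subprocess.run(
    "echo " + payload,
    shell=True,
    capture_output=True,
    text=True,
).stdout

print(repr(direct))
print(repr(shelled))
```

In the direct case the payload comes back as a single literal string, while in the shell case `INJECTED` appears on its own line as the output of a second command. This is why the Go port needs separate sink patterns for the shell-wrapped and direct forms of `exec.Command`.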
230
+ **Target**: Java
231
+
232
+ ```
233
+ TARGET: Java
234
+ VERDICT: APPLICABLE_WITH_ADAPTATION
235
+
236
+ REASONING:
237
+ - Command injection exists in Java (vulnerability class present)
238
+ - Java uses Runtime.exec() and ProcessBuilder for command execution
239
+ - Like Go, neither API invokes a shell by default; shell execution requires explicit sh -c wrapping
240
+
241
+ EQUIVALENT_CONSTRUCTS:
242
+ - Original sink: subprocess.call(cmd, shell=True)
243
+ - Target sinks:
244
+ - Runtime.getRuntime().exec(cmd)
245
+ - new ProcessBuilder(cmd).start()
246
+
247
+ ADAPTATIONS_NEEDED:
248
+ - Source patterns need Java servlet equivalents (request.getParameter())
249
+ - Consider both Runtime.exec and ProcessBuilder patterns
250
+ ```