@vigolium/piolium 0.0.1
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/LICENSE +21 -0
- package/README.md +117 -0
- package/agents/access-auditor.md +300 -0
- package/agents/assumption-breaker.md +154 -0
- package/agents/attack-designer.md +116 -0
- package/agents/code-scanner.md +139 -0
- package/agents/concurrency-auditor.md +238 -0
- package/agents/confirm-writer.md +257 -0
- package/agents/context-reviewer.md +274 -0
- package/agents/cross-verifier.md +165 -0
- package/agents/cve-scout.md +381 -0
- package/agents/env-builder.md +282 -0
- package/agents/env-profiler.md +205 -0
- package/agents/evidence-collector.md +140 -0
- package/agents/finding-grader.md +142 -0
- package/agents/finding-writer.md +148 -0
- package/agents/flow-tracer.md +106 -0
- package/agents/goal-backtracer.md +146 -0
- package/agents/history-miner.md +467 -0
- package/agents/independent-verifier.md +118 -0
- package/agents/intent-mapper.md +183 -0
- package/agents/longshot-collector.md +128 -0
- package/agents/longshot-prober.md +126 -0
- package/agents/patch-auditor.md +73 -0
- package/agents/poc-author.md +124 -0
- package/agents/poc-runner.md +194 -0
- package/agents/probe-lead.md +269 -0
- package/agents/red-challenger.md +101 -0
- package/agents/report-composer.md +208 -0
- package/agents/review-adjudicator.md +216 -0
- package/agents/spec-auditor.md +155 -0
- package/agents/taint-tracer.md +265 -0
- package/agents/test-locator.md +209 -0
- package/agents/threat-modeler.md +132 -0
- package/agents/variant-scanner.md +108 -0
- package/agents/variant-spotter.md +110 -0
- package/bin/piolium.mjs +376 -0
- package/extensions/piolium/_vendor/yaml.bundle.d.mts +6 -0
- package/extensions/piolium/_vendor/yaml.bundle.mjs +139 -0
- package/extensions/piolium/agent-runner.ts +322 -0
- package/extensions/piolium/agents.ts +266 -0
- package/extensions/piolium/audit-state.ts +522 -0
- package/extensions/piolium/bundled-resources.ts +97 -0
- package/extensions/piolium/candidate-scan.ts +966 -0
- package/extensions/piolium/command-target.ts +177 -0
- package/extensions/piolium/console-stream.ts +57 -0
- package/extensions/piolium/export-results.ts +380 -0
- package/extensions/piolium/findings.ts +448 -0
- package/extensions/piolium/heartbeat.ts +182 -0
- package/extensions/piolium/help.ts +234 -0
- package/extensions/piolium/index.ts +1865 -0
- package/extensions/piolium/longshot.ts +530 -0
- package/extensions/piolium/matcher-suggestions.ts +196 -0
- package/extensions/piolium/matcher-utils.ts +83 -0
- package/extensions/piolium/modes/balanced.ts +750 -0
- package/extensions/piolium/modes/confirm-bootstrap.ts +186 -0
- package/extensions/piolium/modes/confirm.ts +697 -0
- package/extensions/piolium/modes/deep.ts +917 -0
- package/extensions/piolium/modes/diff.ts +177 -0
- package/extensions/piolium/modes/lite.ts +540 -0
- package/extensions/piolium/modes/longshot.ts +595 -0
- package/extensions/piolium/modes/merge.ts +204 -0
- package/extensions/piolium/modes/phase-runner.ts +267 -0
- package/extensions/piolium/modes/reinvest.ts +546 -0
- package/extensions/piolium/modes/revisit.ts +279 -0
- package/extensions/piolium/modes.ts +48 -0
- package/extensions/piolium/phase-labels.ts +123 -0
- package/extensions/piolium/phase-status-strip.ts +92 -0
- package/extensions/piolium/prompt-prefix-editor.ts +39 -0
- package/extensions/piolium/providers/anthropic-vertex.ts +836 -0
- package/extensions/piolium/recon.ts +409 -0
- package/extensions/piolium/result-stats.ts +105 -0
- package/extensions/piolium/retry.ts +120 -0
- package/extensions/piolium/scheduler.ts +212 -0
- package/extensions/piolium/secrets.ts +368 -0
- package/extensions/piolium/tools/web-tools.ts +148 -0
- package/package.json +77 -0
- package/skills/agentic-actions-auditor/SKILL.md +327 -0
- package/skills/agentic-actions-auditor/references/action-profiles.md +186 -0
- package/skills/agentic-actions-auditor/references/cross-file-resolution.md +209 -0
- package/skills/agentic-actions-auditor/references/foundations.md +94 -0
- package/skills/agentic-actions-auditor/references/vector-a-env-var-intermediary.md +77 -0
- package/skills/agentic-actions-auditor/references/vector-b-direct-expression-injection.md +83 -0
- package/skills/agentic-actions-auditor/references/vector-c-cli-data-fetch.md +83 -0
- package/skills/agentic-actions-auditor/references/vector-d-pr-target-checkout.md +88 -0
- package/skills/agentic-actions-auditor/references/vector-e-error-log-injection.md +88 -0
- package/skills/agentic-actions-auditor/references/vector-f-subshell-expansion.md +82 -0
- package/skills/agentic-actions-auditor/references/vector-g-eval-of-ai-output.md +91 -0
- package/skills/agentic-actions-auditor/references/vector-h-dangerous-sandbox-configs.md +102 -0
- package/skills/agentic-actions-auditor/references/vector-i-wildcard-allowlists.md +88 -0
- package/skills/audit/SKILL.md +562 -0
- package/skills/audit/assets/icon.svg +7 -0
- package/skills/audit/hooks/scripts/validate_phase_output.py +550 -0
- package/skills/audit/references/adversarial-review.md +148 -0
- package/skills/audit/references/architecture-aware-sast.md +306 -0
- package/skills/audit/references/audit-workflow.md +737 -0
- package/skills/audit/references/chamber-protocol.md +384 -0
- package/skills/audit/references/creative-attack-modes.md +221 -0
- package/skills/audit/references/deep-analysis.md +273 -0
- package/skills/audit/references/domain-attack-playbooks.md +1129 -0
- package/skills/audit/references/knowledge-base-template.md +513 -0
- package/skills/audit/references/real-env-validation.md +191 -0
- package/skills/audit/references/report-templates.md +417 -0
- package/skills/audit/references/triage-and-prereqs.md +134 -0
- package/skills/audit/scripts/consolidate_drafts.py +554 -0
- package/skills/audit/scripts/partition_findings.py +152 -0
- package/skills/audit/scripts/rg-hotspots.sh +121 -0
- package/skills/audit/scripts/stamp_file_state.py +349 -0
- package/skills/code-reviewer/SKILL.md +65 -0
- package/skills/codeql/SKILL.md +281 -0
- package/skills/codeql/references/build-fixes.md +90 -0
- package/skills/codeql/references/diagnostic-query-templates.md +339 -0
- package/skills/codeql/references/extension-yaml-format.md +209 -0
- package/skills/codeql/references/important-only-suite.md +153 -0
- package/skills/codeql/references/language-details.md +207 -0
- package/skills/codeql/references/macos-arm64e-workaround.md +179 -0
- package/skills/codeql/references/performance-tuning.md +111 -0
- package/skills/codeql/references/quality-assessment.md +172 -0
- package/skills/codeql/references/ruleset-catalog.md +63 -0
- package/skills/codeql/references/run-all-suite.md +92 -0
- package/skills/codeql/references/sarif-processing.md +79 -0
- package/skills/codeql/references/threat-models.md +51 -0
- package/skills/codeql/workflows/build-database.md +280 -0
- package/skills/codeql/workflows/create-data-extensions.md +261 -0
- package/skills/codeql/workflows/run-analysis.md +301 -0
- package/skills/differential-review/SKILL.md +220 -0
- package/skills/differential-review/adversarial.md +203 -0
- package/skills/differential-review/methodology.md +234 -0
- package/skills/differential-review/patterns.md +300 -0
- package/skills/differential-review/reporting.md +369 -0
- package/skills/fp-check/SKILL.md +125 -0
- package/skills/fp-check/references/bug-class-verification.md +114 -0
- package/skills/fp-check/references/deep-verification.md +143 -0
- package/skills/fp-check/references/evidence-templates.md +91 -0
- package/skills/fp-check/references/false-positive-patterns.md +115 -0
- package/skills/fp-check/references/gate-reviews.md +27 -0
- package/skills/fp-check/references/standard-verification.md +78 -0
- package/skills/insecure-defaults/SKILL.md +117 -0
- package/skills/insecure-defaults/references/examples.md +409 -0
- package/skills/last30days/SKILL.md +444 -0
- package/skills/sarif-parsing/SKILL.md +483 -0
- package/skills/sarif-parsing/resources/jq-queries.md +162 -0
- package/skills/sarif-parsing/resources/sarif_helpers.py +331 -0
- package/skills/security-threat-model/LICENSE.txt +201 -0
- package/skills/security-threat-model/SKILL.md +81 -0
- package/skills/security-threat-model/agents/openai.yaml +4 -0
- package/skills/security-threat-model/references/prompt-template.md +255 -0
- package/skills/security-threat-model/references/security-controls-and-assets.md +32 -0
- package/skills/semgrep/SKILL.md +212 -0
- package/skills/semgrep/references/rulesets.md +162 -0
- package/skills/semgrep/references/scan-modes.md +110 -0
- package/skills/semgrep/references/scanner-task-prompt.md +140 -0
- package/skills/semgrep/scripts/merge_sarif.py +203 -0
- package/skills/semgrep/workflows/scan-workflow.md +311 -0
- package/skills/semgrep-rule-creator/SKILL.md +168 -0
- package/skills/semgrep-rule-creator/references/quick-reference.md +202 -0
- package/skills/semgrep-rule-creator/references/workflow.md +240 -0
- package/skills/semgrep-rule-variant-creator/SKILL.md +205 -0
- package/skills/semgrep-rule-variant-creator/references/applicability-analysis.md +250 -0
- package/skills/semgrep-rule-variant-creator/references/language-syntax-guide.md +324 -0
- package/skills/semgrep-rule-variant-creator/references/workflow.md +518 -0
- package/skills/sharp-edges/SKILL.md +292 -0
- package/skills/sharp-edges/references/auth-patterns.md +252 -0
- package/skills/sharp-edges/references/case-studies.md +274 -0
- package/skills/sharp-edges/references/config-patterns.md +333 -0
- package/skills/sharp-edges/references/crypto-apis.md +190 -0
- package/skills/sharp-edges/references/lang-c.md +205 -0
- package/skills/sharp-edges/references/lang-csharp.md +285 -0
- package/skills/sharp-edges/references/lang-go.md +270 -0
- package/skills/sharp-edges/references/lang-java.md +263 -0
- package/skills/sharp-edges/references/lang-javascript.md +269 -0
- package/skills/sharp-edges/references/lang-kotlin.md +265 -0
- package/skills/sharp-edges/references/lang-php.md +245 -0
- package/skills/sharp-edges/references/lang-python.md +274 -0
- package/skills/sharp-edges/references/lang-ruby.md +273 -0
- package/skills/sharp-edges/references/lang-rust.md +272 -0
- package/skills/sharp-edges/references/lang-swift.md +287 -0
- package/skills/sharp-edges/references/language-specific.md +588 -0
- package/skills/spec-to-code-compliance/SKILL.md +357 -0
- package/skills/spec-to-code-compliance/resources/COMPLETENESS_CHECKLIST.md +69 -0
- package/skills/spec-to-code-compliance/resources/IR_EXAMPLES.md +417 -0
- package/skills/spec-to-code-compliance/resources/OUTPUT_REQUIREMENTS.md +105 -0
- package/skills/supply-chain-risk-auditor/SKILL.md +67 -0
- package/skills/supply-chain-risk-auditor/resources/results-template.md +41 -0
- package/skills/variant-analysis/METHODOLOGY.md +327 -0
- package/skills/variant-analysis/SKILL.md +142 -0
- package/skills/variant-analysis/resources/codeql/cpp.ql +119 -0
- package/skills/variant-analysis/resources/codeql/go.ql +69 -0
- package/skills/variant-analysis/resources/codeql/java.ql +71 -0
- package/skills/variant-analysis/resources/codeql/javascript.ql +63 -0
- package/skills/variant-analysis/resources/codeql/python.ql +80 -0
- package/skills/variant-analysis/resources/semgrep/cpp.yaml +98 -0
- package/skills/variant-analysis/resources/semgrep/go.yaml +63 -0
- package/skills/variant-analysis/resources/semgrep/java.yaml +61 -0
- package/skills/variant-analysis/resources/semgrep/javascript.yaml +60 -0
- package/skills/variant-analysis/resources/semgrep/python.yaml +72 -0
- package/skills/variant-analysis/resources/variant-report-template.md +75 -0
- package/skills/vuln-report/SKILL.md +137 -0
- package/skills/vuln-report/agents/openai.yaml +4 -0
- package/skills/vuln-report/references/report-template.md +135 -0
- package/skills/wooyun-legacy/SKILL.md +367 -0
- package/skills/wooyun-legacy/references/bank-penetration.md +222 -0
- package/skills/wooyun-legacy/references/checklists/command-execution-checklist.md +119 -0
- package/skills/wooyun-legacy/references/checklists/csrf-checklist.md +74 -0
- package/skills/wooyun-legacy/references/checklists/file-upload-checklist.md +108 -0
- package/skills/wooyun-legacy/references/checklists/info-disclosure-checklist.md +114 -0
- package/skills/wooyun-legacy/references/checklists/logic-flaws-checklist.md +95 -0
- package/skills/wooyun-legacy/references/checklists/misconfig-checklist.md +124 -0
- package/skills/wooyun-legacy/references/checklists/path-traversal-checklist.md +87 -0
- package/skills/wooyun-legacy/references/checklists/rce-checklist.md +93 -0
- package/skills/wooyun-legacy/references/checklists/sql-injection-checklist.md +97 -0
- package/skills/wooyun-legacy/references/checklists/ssrf-checklist.md +99 -0
- package/skills/wooyun-legacy/references/checklists/unauthorized-access-checklist.md +89 -0
- package/skills/wooyun-legacy/references/checklists/weak-password-checklist.md +115 -0
- package/skills/wooyun-legacy/references/checklists/xss-checklist.md +103 -0
- package/skills/wooyun-legacy/references/checklists/xxe-checklist.md +130 -0
- package/skills/wooyun-legacy/references/info-disclosure.md +975 -0
- package/skills/wooyun-legacy/references/logic-flaws.md +721 -0
- package/skills/wooyun-legacy/references/path-traversal.md +1191 -0
- package/skills/wooyun-legacy/references/telecom-penetration.md +156 -0
- package/skills/wooyun-legacy/references/unauthorized-access.md +980 -0
- package/skills/wooyun-legacy/references/xss.md +746 -0
- package/skills/zeroize-audit/SKILL.md +371 -0
- package/skills/zeroize-audit/configs/c.yaml +21 -0
- package/skills/zeroize-audit/configs/default.yaml +128 -0
- package/skills/zeroize-audit/configs/rust.yaml +83 -0
- package/skills/zeroize-audit/prompts/report_template.md +238 -0
- package/skills/zeroize-audit/prompts/system.md +163 -0
- package/skills/zeroize-audit/prompts/task.md +97 -0
- package/skills/zeroize-audit/references/compile-commands.md +231 -0
- package/skills/zeroize-audit/references/detection-strategy.md +191 -0
- package/skills/zeroize-audit/references/ir-analysis.md +252 -0
- package/skills/zeroize-audit/references/mcp-analysis.md +221 -0
- package/skills/zeroize-audit/references/poc-generation.md +470 -0
- package/skills/zeroize-audit/references/rust-zeroization-patterns.md +867 -0
- package/skills/zeroize-audit/schemas/input.json +83 -0
- package/skills/zeroize-audit/schemas/output.json +140 -0
- package/skills/zeroize-audit/tools/analyze_asm.sh +202 -0
- package/skills/zeroize-audit/tools/analyze_cfg.py +381 -0
- package/skills/zeroize-audit/tools/analyze_heap.sh +211 -0
- package/skills/zeroize-audit/tools/analyze_ir_semantic.py +429 -0
- package/skills/zeroize-audit/tools/diff_ir.sh +135 -0
- package/skills/zeroize-audit/tools/diff_rust_mir.sh +189 -0
- package/skills/zeroize-audit/tools/emit_asm.sh +67 -0
- package/skills/zeroize-audit/tools/emit_ir.sh +77 -0
- package/skills/zeroize-audit/tools/emit_rust_asm.sh +178 -0
- package/skills/zeroize-audit/tools/emit_rust_ir.sh +150 -0
- package/skills/zeroize-audit/tools/emit_rust_mir.sh +158 -0
- package/skills/zeroize-audit/tools/extract_compile_flags.py +284 -0
- package/skills/zeroize-audit/tools/generate_poc.py +1329 -0
- package/skills/zeroize-audit/tools/mcp/apply_confidence_gates.py +113 -0
- package/skills/zeroize-audit/tools/mcp/check_mcp.sh +68 -0
- package/skills/zeroize-audit/tools/mcp/normalize_mcp_evidence.py +125 -0
- package/skills/zeroize-audit/tools/scripts/check_llvm_patterns.py +481 -0
- package/skills/zeroize-audit/tools/scripts/check_mir_patterns.py +554 -0
- package/skills/zeroize-audit/tools/scripts/check_rust_asm.py +424 -0
- package/skills/zeroize-audit/tools/scripts/check_rust_asm_aarch64.py +300 -0
- package/skills/zeroize-audit/tools/scripts/check_rust_asm_x86.py +283 -0
- package/skills/zeroize-audit/tools/scripts/find_dangerous_apis.py +375 -0
- package/skills/zeroize-audit/tools/scripts/semantic_audit.py +923 -0
- package/skills/zeroize-audit/tools/track_dataflow.sh +196 -0
- package/skills/zeroize-audit/tools/validate_rust_toolchain.sh +298 -0
- package/skills/zeroize-audit/workflows/phase-0-preflight.md +150 -0
- package/skills/zeroize-audit/workflows/phase-1-source-analysis.md +144 -0
- package/skills/zeroize-audit/workflows/phase-2-compiler-analysis.md +139 -0
- package/skills/zeroize-audit/workflows/phase-3-interim-report.md +46 -0
- package/skills/zeroize-audit/workflows/phase-4-poc-generation.md +46 -0
- package/skills/zeroize-audit/workflows/phase-5-poc-validation.md +136 -0
- package/skills/zeroize-audit/workflows/phase-6-final-report.md +44 -0
- package/skills/zeroize-audit/workflows/phase-7-test-generation.md +42 -0
- package/themes/piolium-srcery.json +94 -0
|
@@ -0,0 +1,240 @@
# Semgrep Rule Creation Workflow

Detailed workflow for creating production-quality Semgrep rules.

## Step 1: Analyze the Problem

Before writing any code:

1. **Fetch external documentation**: See [Documentation](../SKILL.md#documentation) for required reading
2. **Understand the exact bug pattern well enough to explain it to a junior developer**: What vulnerability, issue, or pattern should be detected?
3. **Identify the target language**: What about the bug is specific to that language?
4. **Determine the approach**:
   - **Pattern matching**: Syntactic patterns without data flow
   - **Taint mode**: Data flows from an untrusted source to a dangerous sink

### When to Use Taint Mode

Taint mode is a Semgrep feature that tracks the flow of data from one location to another. With taint mode, you can:

- **Track data flow across multiple variables**: Trace how data moves across variables, functions, and components, and identify insecure flow paths (e.g., paths where a specific sanitizer is not used).
- **Find injection vulnerabilities**: Identify injection vulnerabilities such as SQL injection, command injection, and XSS.
- **Write simple and resilient Semgrep rules**: Write simpler rules that still match when the code is nested in if statements, loops, and other structures.
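
To make the source, sink, and sanitizer roles concrete, here is a minimal Python sketch of the kind of target code a taint rule reasons about. Everything in it is hypothetical: `run_shell` stands in for a real sink such as `os.system`, and which functions count as sources, sinks, or sanitizers is decided by the rule you write, not by this code.

```python
import shlex

# Stand-in for a real command runner, so the sketch has no side effects.
executed = []


def run_shell(command: str) -> None:
    """Hypothetical sink: a real project would call os.system or subprocess here."""
    executed.append(command)


def list_dir_unsafe(user_path: str) -> None:
    # Taint enters through `user_path`, passes through `target` and `command`,
    # and reaches the sink unchanged. A taint rule should report this call.
    target = user_path
    command = "ls " + target
    run_shell(command)


def list_dir_safe(user_path: str) -> None:
    # shlex.quote plays the sanitizer role, so this flow should not be reported.
    command = "ls " + shlex.quote(user_path)
    run_shell(command)
```

A purely syntactic pattern would need one variant per intermediate variable; a taint rule only names the two endpoints and the sanitizer.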

## Step 2: Write Tests First

**Why test-first?** Writing tests before the rule forces you to think about both vulnerable AND safe cases. Rules written without tests often have hidden false positives (matching safe cases) or false negatives (missing vulnerable variants). Tests make these visible immediately.

Create the rule directory and a test file with annotations (`# ruleid:`, `# ok:` only). See [quick-reference.md]({baseDir}/references/quick-reference.md#test-file-annotations) for full syntax.

### Directory Structure

```
<rule-id>/
├── <rule-id>.yaml     # Semgrep rule
└── <rule-id>.<ext>    # Test file with ruleid/ok annotations
```

**CRITICAL**:
1. The comment (`# ruleid:` or `# ok:`) must be on the line IMMEDIATELY BEFORE the code. Semgrep reports findings on the line after the annotation.
2. The comment must contain ONLY the comment marker and annotation (e.g., `# ruleid: my-rule`). No other text, comments, or code on the same line.
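
The two placement rules above can be checked mechanically before running Semgrep. The following is a rough sketch, not part of Semgrep: `misplaced_annotations` is a hypothetical helper, it only understands `#` comments, and treating "next line is non-empty and not a comment" as "followed by code" is a simplifying assumption.

```python
import re

# A line that holds only a comment marker plus a ruleid/ok annotation.
ANNOTATION_ONLY = re.compile(r"^\s*#\s*(ruleid|ok):\s*[\w.-]+\s*$")
# Any line that mentions an annotation at all.
ANNOTATION_ANYWHERE = re.compile(r"#\s*(ruleid|ok):")


def misplaced_annotations(source: str) -> list[int]:
    """Return 1-based line numbers of annotations that break either rule."""
    lines = source.splitlines()
    problems = []
    for index, line in enumerate(lines):
        if not ANNOTATION_ANYWHERE.search(line):
            continue
        # Rule 2: nothing else may share the line with the annotation.
        only_annotation = ANNOTATION_ONLY.match(line) is not None
        # Rule 1: the very next line must be the code under test.
        next_line = lines[index + 1].strip() if index + 1 < len(lines) else ""
        followed_by_code = bool(next_line) and not next_line.startswith("#")
        if not (only_annotation and followed_by_code):
            problems.append(index + 1)
    return problems
```

An empty result means every annotation sits alone on its line, directly above a line of code.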

### Test Case Design

You must include test cases for:
- Clear vulnerable cases (must match)
- Clear safe cases (must not match)
- Edge cases and variations
- Different coding styles
- Sanitized/validated input (must not match)
- Unrelated code (must not match) - normal code with no relation to the rule's target pattern
- Nested structures (e.g., inside if statements, loops, try/catch blocks, callbacks)
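
As an illustration of several of these categories in one place, here is a sketch of a Python test file for a hypothetical rule with the id `insecure-md5` that flags `hashlib.new("md5", ...)`. The rule id, function names, and choice of cases are assumptions made for the example.

```python
import hashlib


def clear_vulnerable(data: bytes) -> str:
    # ruleid: insecure-md5
    return hashlib.new("md5", data).hexdigest()


def nested_vulnerable(chunks: list[bytes]) -> list[str]:
    digests = []
    for chunk in chunks:
        if chunk:
            # ruleid: insecure-md5
            digests.append(hashlib.new("md5", chunk).hexdigest())
    return digests


def clear_safe(data: bytes) -> str:
    # ok: insecure-md5
    return hashlib.new("sha256", data).hexdigest()


def unrelated(data: bytes) -> int:
    # ok: insecure-md5
    return len(data)
```

The nested case is the one most often forgotten: it checks that the rule still fires inside a loop and a conditional.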

## Step 3: Analyze AST Structure

**Why analyze AST?** Semgrep matches against the AST, not raw text. Code that looks similar may parse differently (e.g., `foo.bar()` vs `foo().bar`). The AST dump shows exactly what Semgrep sees, preventing patterns that fail due to unexpected tree structure. Understanding exactly how Semgrep parses code is crucial for writing precise patterns.

```bash
semgrep --dump-ast -l <language> <rule-id>.<ext>
```

The AST dump helps you understand:
- How function calls are represented
- How variables are bound
- How control flow is structured
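
The `foo.bar()` vs `foo().bar` difference can be seen without Semgrep by using Python's own `ast` module. This is only an analogy: Semgrep has its own AST, and the node names below belong to Python's standard library, not to Semgrep's dump format.

```python
import ast


def top_level_node(expression: str) -> str:
    """Return the class name of the outermost AST node of an expression."""
    return type(ast.parse(expression, mode="eval").body).__name__


# A call whose callee is an attribute lookup.
print(top_level_node("foo.bar()"))  # Call
# An attribute lookup on the result of a call.
print(top_level_node("foo().bar"))  # Attribute
```

The two expressions differ by the position of one pair of parentheses, yet the tree is inverted, which is why a pattern written for one will not match the other.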

## Step 4: Write the Rule

Choose the appropriate pattern operators and write the rule.

For pattern operator syntax (basic matching, scope operators, metavariable filters, focus), see [quick-reference.md](quick-reference.md).

### Validate and Test

#### Validate YAML Syntax

```bash
semgrep --validate --config <rule-id>.yaml
```

#### Run Tests

```bash
cd <rule-directory>
semgrep --test --config <rule-id>.yaml <rule-id>.<ext>
```

#### Expected Output

```
1/1: ✓ All tests passed
```

#### Debug Failures

If tests fail, check:
1. **Missed lines**: Rule didn't match when it should
   - Pattern too specific
   - Missing pattern variant
2. **Incorrect lines**: Rule matched when it shouldn't
   - Pattern too broad
   - Need `pattern-not` exclusion

#### Debug Taint Mode Rules

```bash
semgrep --dataflow-traces -f <rule-id>.yaml <rule-id>.<ext>
```

Shows:
- Source locations
- Sink locations
- Data flow path
- Why taint didn't propagate (if applicable)

## Step 5: Iterate Until Tests Pass

Refine the rule's patterns iteratively until the rule behaves correctly.

After every change, re-run the tests:

```bash
semgrep --test --config <rule-id>.yaml <rule-id>.<ext>
```

For debugging taint mode rules:

```bash
semgrep --dataflow-traces -f <rule-id>.yaml <rule-id>.<ext>
```

**Verification checkpoint**: Output MUST show "All tests passed". **Only proceed when validation passes**.

**Verification checkpoint**: Proceed to Step 6: Optimize the Rule when:
- "All tests passed"
- No "missed lines" (false negatives)
- No "incorrect lines" (false positives)
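
If the checkpoint is automated, the three conditions can be tested against the captured test output. This is a sketch under stated assumptions: `ready_to_optimize` is a hypothetical helper, and it assumes the output contains the literal phrases quoted in this workflow, which may vary between Semgrep versions.

```python
def ready_to_optimize(test_output: str) -> bool:
    """Apply the Step 5 checkpoint to captured `semgrep --test` output."""
    text = test_output.lower()
    return (
        "all tests passed" in text
        and "missed lines" not in text      # no false negatives reported
        and "incorrect lines" not in text   # no false positives reported
    )
```

Treat a `False` result as "keep iterating", not as a diagnosis; the output itself says which lines were missed or incorrect.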

### Common Fixes

| Problem | Solution |
|---------|----------|
| Too many matches | Add `pattern-not` exclusions |
| Missing matches | Add `pattern-either` variants |
| Wrong line matched | Adjust `focus-metavariable` |
| Taint not flowing | Check sanitizers aren't too broad |
| Taint false positive | Add sanitizer pattern |

## Step 6: Optimize the Rule

After all tests pass, remove redundant patterns (quote variants, ellipsis subsets, duplicate patterns).

### Semgrep Pattern Equivalences

Semgrep treats certain patterns as equivalent:

| Written | Also Matches | Reason |
|---------|--------------|--------|
| `"string"` | `'string'` | Quote style normalized (in languages where both are equivalent) |
| `func(...)` | `func()`, `func(a)`, `func(a,b)` | Ellipsis matches zero or more |
| `func($X, ...)` | `func($X)`, `func($X, a, b)` | Trailing ellipsis is optional |

### Common Redundancies to Remove

**1. Quote Variants** (depends on the language)

Before:
```yaml
pattern-either:
  - pattern: hashlib.new("md5", ...)
  - pattern: hashlib.new('md5', ...)
```

After:
```yaml
pattern: hashlib.new("md5", ...)
```

**2. Ellipsis Subsets**

Before:
```yaml
pattern-either:
  - pattern: dangerous($X, ...)
  - pattern: dangerous($X)
  - pattern: dangerous($X, $Y)
```

After:
```yaml
pattern: dangerous($X, ...)
```

**3. Consolidate with Metavariables**

Before:
```yaml
pattern-either:
  - pattern: md5($X)
  - pattern: sha1($X)
  - pattern: sha256($X)
```

After:
```yaml
patterns:
  - pattern: $FUNC($X)
  - metavariable-regex:
      metavariable: $FUNC
      regex: ^(md5|sha1|sha256)$
```
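
The `^` and `$` anchors in the consolidated regex matter: without them the rule would also accept names that merely contain one of the three words. The sketch below checks this with Python's `re` module. Semgrep evaluates `metavariable-regex` with its own regex engine, so this is an approximation that holds for a simple alternation like this one; `is_target_function` is a hypothetical name.

```python
import re

# Same expression as in the consolidated rule above.
FUNC_NAME = re.compile(r"^(md5|sha1|sha256)$")


def is_target_function(name: str) -> bool:
    """True only when the whole name is one of the three functions."""
    return FUNC_NAME.search(name) is not None


for candidate in ["md5", "sha256", "md5sum", "hmac_sha1"]:
    print(candidate, is_target_function(candidate))
```

Only `md5` and `sha256` pass; `md5sum` and `hmac_sha1` are rejected, so the consolidated rule matches exactly the set the three original patterns matched.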

### Optimization Checklist

1. Remove patterns differing only in quote style
2. Remove patterns that are subsets of `...` patterns
3. Consolidate similar patterns using metavariable-regex
4. Remove duplicate patterns in pattern-either
5. Simplify nested pattern-either when possible
6. Replace complex regex patterns with metavariable-comparison
7. **Re-run tests after each optimization**

### Verify After Optimization

```bash
semgrep --test --config <rule-id>.yaml <rule-id>.<ext>
```

**CRITICAL**: Always re-run tests after optimization. Some "redundant" patterns may actually be necessary due to AST structure differences. If any test fails, revert the optimization that caused it.

**Task complete ONLY when**: All tests pass after optimization.

## Step 7: Final Run

Run the Semgrep rule you created using: `semgrep --config <rule-id>.yaml <rule-id>.<ext>`.

Ensure that the rule's message:
1. Contains a short and concise explanation of the matched pattern
2. Has no uninterpolated metavariables (e.g., $OP, $VAR). All metavariables referenced in the message must be captured by the pattern so they interpolate to actual code.

Fix any message issues and re-run the rule after each fix.
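
The second message check can be approximated before the final run by comparing the metavariables named in the message with those that appear in the rule's patterns. This is a rough sketch: `unbound_message_metavariables` is a hypothetical helper, it takes the patterns as plain strings instead of parsing the rule YAML, and it ignores the fact that metavariables used only inside `pattern-not` clauses are not bound.

```python
import re

# Semgrep metavariables look like $X, $FUNC, or $ARG_2.
METAVARIABLE = re.compile(r"\$[A-Z_][A-Z0-9_]*")


def unbound_message_metavariables(message: str, patterns: list[str]) -> set[str]:
    """Return metavariables used in the message but absent from every pattern."""
    bound = set()
    for pattern in patterns:
        bound.update(METAVARIABLE.findall(pattern))
    return set(METAVARIABLE.findall(message)) - bound


leftover = unbound_message_metavariables(
    "Weak hash $FUNC($X) combined with $OP",
    ["$FUNC($X)"],
)
print(leftover)  # {'$OP'}
```

Any name in the result would show up verbatim in the finding, which is exactly the symptom the checklist item describes.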
|
|
@@ -0,0 +1,205 @@
---
name: semgrep-rule-variant-creator
description: Creates language variants of existing Semgrep rules. Use when porting a Semgrep rule to specified target languages. Takes an existing rule and target languages as input, produces independent rule+test directories for each language.
allowed-tools:
  - Bash
  - Read
  - Write
  - Edit
  - Glob
  - Grep
  - WebFetch
---

# Semgrep Rule Variant Creator

Port existing Semgrep rules to new target languages with proper applicability analysis and test-driven validation.

## When to Use

**Ideal scenarios:**
- Porting an existing Semgrep rule to one or more target languages
- Creating language-specific variants of a universal vulnerability pattern
- Expanding rule coverage across a polyglot codebase
- Translating rules between languages with equivalent constructs

## When NOT to Use

Do NOT use this skill for:
- Creating a new Semgrep rule from scratch (use `semgrep-rule-creator` instead)
- Running existing rules against code
- Languages where the vulnerability pattern fundamentally doesn't apply
- Minor syntax variations within the same language

## Input Specification

This skill requires:
1. **Existing Semgrep rule** - YAML file path or YAML rule content
2. **Target languages** - One or more languages to port to (e.g., "Golang and Java")

## Output Specification

For each applicable target language, produces:
```
<original-rule-id>-<language>/
├── <original-rule-id>-<language>.yaml     # Ported Semgrep rule
└── <original-rule-id>-<language>.<ext>    # Test file with annotations
```

Example output for porting `sql-injection` to Go and Java:
```
sql-injection-golang/
├── sql-injection-golang.yaml
└── sql-injection-golang.go

sql-injection-java/
├── sql-injection-java.yaml
└── sql-injection-java.java
```
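
The naming scheme above can be expressed as a small helper, which is useful when several variants are generated in one session. This is a sketch: `variant_paths` and the `EXTENSIONS` table are hypothetical, and the table covers only the two languages from the example.

```python
from pathlib import PurePosixPath

# Assumption: language suffix -> test file extension, for the example languages only.
EXTENSIONS = {"golang": "go", "java": "java"}


def variant_paths(rule_id: str, language: str) -> tuple[PurePosixPath, PurePosixPath]:
    """Return (rule file, test file) paths for one ported variant."""
    name = f"{rule_id}-{language}"
    directory = PurePosixPath(name)
    rule_file = directory / f"{name}.yaml"
    test_file = directory / f"{name}.{EXTENSIONS[language]}"
    return rule_file, test_file


for language in ("golang", "java"):
    rule_file, test_file = variant_paths("sql-injection", language)
    print(rule_file, test_file)
```

Each language gets its own directory, so one variant can be tested or deleted without touching the others.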

## Rationalizations to Reject

When porting Semgrep rules, reject these common shortcuts:

| Rationalization | Why It Fails | Correct Approach |
|-----------------|--------------|------------------|
| "Pattern structure is identical" | Different ASTs across languages | Always dump AST for target language |
| "Same vulnerability, same detection" | Data flow differs between languages | Analyze target language idioms |
| "Rule doesn't need tests since original worked" | Language edge cases differ | Write NEW test cases for target |
| "Skip applicability - it obviously applies" | Some patterns are language-specific | Complete applicability analysis first |
| "I'll create all variants then test" | Errors compound, hard to debug | Complete full cycle per language |
| "Library equivalent is close enough" | Surface similarity hides differences | Verify API semantics match |
| "Just translate the syntax 1:1" | Languages have different idioms | Research target language patterns |

## Strictness Level

This workflow is **strict** - do not skip steps:
- **Applicability analysis is mandatory**: Don't assume patterns translate
- **Each language is independent**: Complete full cycle before moving to next
- **Test-first for each variant**: Never write a rule without test cases
- **100% test pass required**: "Most tests pass" is not acceptable

## Overview

This skill guides the creation of language-specific variants of existing Semgrep rules. Each target language goes through an independent 4-phase cycle:

```
FOR EACH target language:
  Phase 1: Applicability Analysis → Verdict
  Phase 2: Test Creation (Test-First)
  Phase 3: Rule Creation
  Phase 4: Validation
  (Complete full cycle before moving to next language)
```

## Foundational Knowledge

**The `semgrep-rule-creator` skill is the authoritative reference for Semgrep rule creation fundamentals.** While this skill focuses on porting existing rules to new languages, the core principles of writing quality rules remain the same.

Consult `semgrep-rule-creator` for guidance on:
- **When to use taint mode vs pattern matching** - Choosing the right approach for the vulnerability type
- **Test-first methodology** - Why tests come before rules and how to write effective test cases
- **Anti-patterns to avoid** - Common mistakes like overly broad or overly specific patterns
- **Iterating until tests pass** - The validation loop and debugging techniques
- **Rule optimization** - Removing redundant patterns after tests pass

When porting a rule, you're applying these same principles in a new language context. If uncertain about rule structure or approach, refer to `semgrep-rule-creator` first.
|
|
107
|
+
|
|
108
|
+
## Four-Phase Workflow

### Phase 1: Applicability Analysis

Before porting, determine if the pattern applies to the target language.

**Analysis criteria:**
1. Does the vulnerability class exist in the target language?
2. Does an equivalent construct exist (function, pattern, library)?
3. Are the semantics similar enough for meaningful detection?

**Verdict options:**
- `APPLICABLE` → Proceed with variant creation
- `APPLICABLE_WITH_ADAPTATION` → Proceed, but significant changes are needed
- `NOT_APPLICABLE` → Skip this language, document why

See [applicability-analysis.md]({baseDir}/references/applicability-analysis.md) for detailed guidance.

### Phase 2: Test Creation (Test-First)

**Always write tests before the rule.**

Create a test file with target language idioms:
- Minimum 2 vulnerable cases (`ruleid:`)
- Minimum 2 safe cases (`ok:`)
- Include language-specific edge cases

```go
// ruleid: sql-injection-golang
db.Query("SELECT * FROM users WHERE id = " + userInput)

// ok: sql-injection-golang
db.Query("SELECT * FROM users WHERE id = ?", userInput)
```

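
The minimums above can be checked mechanically before any rule is written. The following is a minimal sketch that counts `ruleid:` and `ok:` annotations for one rule ID; the helper names are hypothetical, and the regular expression assumes annotations sit in `//` or `#` line comments.

```python
import re
from typing import Dict

# Matches Semgrep test annotations such as "// ruleid: my-rule" or "# ok: my-rule".
ANNOTATION = re.compile(r"(?://|#)\s*(ruleid|ok):\s*(\S+)")


def count_annotations(source: str, rule_id: str) -> Dict[str, int]:
    """Count ruleid/ok annotations in `source` that name `rule_id`."""
    counts = {"ruleid": 0, "ok": 0}
    for kind, found_id in ANNOTATION.findall(source):
        if found_id == rule_id:
            counts[kind] += 1
    return counts


def meets_minimum(source: str, rule_id: str, minimum: int = 2) -> bool:
    """True when the test file has at least `minimum` vulnerable and safe cases."""
    counts = count_annotations(source, rule_id)
    return counts["ruleid"] >= minimum and counts["ok"] >= minimum
```

Applied to the Go snippet above, `meets_minimum` returns `False`, since that snippet shows only one case of each kind.
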
### Phase 3: Rule Creation

1. **Analyze AST**: `semgrep --dump-ast -l <lang> test-file`
2. **Translate patterns** to target language syntax
3. **Update metadata**: language key, message, rule ID
4. **Adapt for idioms**: Handle language-specific constructs

See [language-syntax-guide.md]({baseDir}/references/language-syntax-guide.md) for translation guidance.

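
Step 3 (updating metadata) is the only mechanical part of this phase and can be sketched as a small function over a rule loaded into a dictionary. `make_variant` and the example rule ID `go-command-injection` are hypothetical; the patterns themselves still have to be translated by hand.

```python
import copy
from typing import Any, Dict


def make_variant(rule: Dict[str, Any], new_id: str, language: str, message: str) -> Dict[str, Any]:
    """Return a copy of `rule` with its id, languages, and message replaced.

    The original rule is left untouched so each language variant starts
    from the same source. Patterns are copied as-is and must be rewritten
    for the target language afterwards.
    """
    variant = copy.deepcopy(rule)
    variant["id"] = new_id
    variant["languages"] = [language]
    variant["message"] = message
    return variant


original = {
    "id": "python-command-injection",
    "languages": ["python"],
    "message": "placeholder message",
    "mode": "taint",
}
go_variant = make_variant(
    original, "go-command-injection", "go", "User input reaches a shell command"
)
```
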
### Phase 4: Validation

```bash
# Validate YAML
semgrep --validate --config rule.yaml

# Run tests
semgrep --test --config rule.yaml test-file
```

**Checkpoint**: Output MUST show `All tests passed`.

For taint rule debugging:
```bash
semgrep --dataflow-traces -f rule.yaml test-file
```

See [workflow.md]({baseDir}/references/workflow.md) for detailed workflow and troubleshooting.

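
When validation is repeated for several languages, the two commands above can be wrapped in a short script. This is a sketch under one stated assumption: that a non-zero exit status from `semgrep` signals a validation or test failure. The function names are hypothetical.

```python
import subprocess
from typing import List


def validation_commands(rule_path: str, test_path: str) -> List[List[str]]:
    """Build the Phase 4 commands exactly as listed above."""
    return [
        ["semgrep", "--validate", "--config", rule_path],
        ["semgrep", "--test", "--config", rule_path, test_path],
    ]


def run_validation(rule_path: str, test_path: str) -> bool:
    """Run both commands; return True only if each exits with status 0.

    Assumption: a non-zero exit status means the rule is invalid or a
    test failed. The captured output is printed so it can be inspected.
    """
    for command in validation_commands(rule_path, test_path):
        result = subprocess.run(command, capture_output=True, text=True)
        if result.returncode != 0:
            print(result.stdout + result.stderr)
            return False
    return True
```

The checkpoint still applies: read the test output itself rather than relying on the boolean alone.
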
## Quick Reference

| Task | Command |
|------|---------|
| Run tests | `semgrep --test --config rule.yaml test-file` |
| Validate YAML | `semgrep --validate --config rule.yaml` |
| Dump AST | `semgrep --dump-ast -l <lang> <file>` |
| Debug taint flow | `semgrep --dataflow-traces -f rule.yaml file` |

## Key Differences from Rule Creation

| Aspect | semgrep-rule-creator | This skill |
|--------|---------------------|------------|
| Input | Bug pattern description | Existing rule + target languages |
| Output | Single rule+test | Multiple rule+test directories |
| Workflow | Single creation cycle | Independent cycle per language |
| Phase 1 | Problem analysis | Applicability analysis per language |
| Library research | Always relevant | Optional (when original uses libraries) |

## Documentation

**REQUIRED**: Before porting rules, read the relevant Semgrep documentation:

- [Rule Syntax](https://semgrep.dev/docs/writing-rules/rule-syntax) - YAML structure and operators
- [Pattern Syntax](https://semgrep.dev/docs/writing-rules/pattern-syntax) - Pattern matching and metavariables
- [Pattern Examples](https://semgrep.dev/docs/writing-rules/pattern-examples) - Per-language pattern references
- [Testing Rules](https://semgrep.dev/docs/writing-rules/testing-rules) - Testing annotations
- [Trail of Bits Testing Handbook](https://appsec.guide/docs/static-analysis/semgrep/advanced/) - Advanced patterns

## Next Steps

- For applicability analysis guidance, see [applicability-analysis.md]({baseDir}/references/applicability-analysis.md)
- For language translation guidance, see [language-syntax-guide.md]({baseDir}/references/language-syntax-guide.md)
- For detailed workflow and examples, see [workflow.md]({baseDir}/references/workflow.md)

@@ -0,0 +1,250 @@
# Applicability Analysis

Phase 1 of the variant creation workflow. Before porting a rule, analyze whether the vulnerability pattern applies to the target language.

## Analysis Process

For EACH target language, answer these questions:

### 1. Does the Vulnerability Class Exist?

**Determine if the vulnerability type is possible in the target language.**

Examples:
- Buffer overflow: Applies to C/C++, may apply to Rust (in unsafe blocks), does NOT apply to pure Python/Java code
- SQL injection: Applies to any language with database access
- XSS: Applies to any language generating HTML output
- Memory leak: Relevant in C/C++, less relevant in garbage-collected languages
- Type confusion: Relevant in dynamically typed languages, less relevant in statically typed languages

### 2. Does an Equivalent Construct Exist?

**Identify what the original rule detects and find equivalents.**

Parse the original rule to identify:
- **Sinks**: What dangerous functions/methods does it detect?
- **Sources**: Where does tainted data originate?
- **Pattern type**: Is it taint-mode or pattern-matching?

Then research the target language:
- What are the equivalent dangerous functions?
- What are the common source patterns?
- Are there language-specific idioms to consider?

### 3. Are the Semantics Similar Enough?

**Verify the pattern translates meaningfully.**

Consider:
- Does the vulnerability manifest the same way?
- Are there language-specific mitigations that change detection needs?
- Would the ported rule provide actual security value?

## Verdict Format

Document your analysis for each target language:

```
TARGET: <language>
VERDICT: APPLICABLE | APPLICABLE_WITH_ADAPTATION | NOT_APPLICABLE
REASONING: <specific analysis>
ADAPTATIONS_NEEDED: <if APPLICABLE_WITH_ADAPTATION>
EQUIVALENT_CONSTRUCTS:
- Original: <function/pattern>
- Target: <equivalent function/pattern>
```

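
If verdicts are produced by a script rather than written by hand, the format above maps naturally onto a small record type. The sketch below is one possible representation; the `Verdict` class and its field names are hypothetical, and only the rendered keys come from the format itself.

```python
from dataclasses import dataclass, field
from typing import List

VERDICTS = ("APPLICABLE", "APPLICABLE_WITH_ADAPTATION", "NOT_APPLICABLE")


@dataclass
class Verdict:
    target: str
    verdict: str
    reasoning: str
    original_construct: str = ""
    target_construct: str = ""
    adaptations: List[str] = field(default_factory=list)

    def render(self) -> str:
        """Render the record in the verdict format shown above."""
        if self.verdict not in VERDICTS:
            raise ValueError(f"unknown verdict: {self.verdict}")
        lines = [
            f"TARGET: {self.target}",
            f"VERDICT: {self.verdict}",
            f"REASONING: {self.reasoning}",
        ]
        # ADAPTATIONS_NEEDED is only emitted for APPLICABLE_WITH_ADAPTATION.
        if self.verdict == "APPLICABLE_WITH_ADAPTATION":
            lines.append("ADAPTATIONS_NEEDED: " + "; ".join(self.adaptations))
        lines += [
            "EQUIVALENT_CONSTRUCTS:",
            f"- Original: {self.original_construct}",
            f"- Target: {self.target_construct}",
        ]
        return "\n".join(lines)
```

Rejecting unknown verdict strings early keeps typos such as `APPLICABLE_WITH_ADAPTION` out of the documented analysis.
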
## Verdict Definitions

### APPLICABLE

The pattern translates directly with minor syntax adjustments.

**Criteria:**
- Equivalent constructs exist with same semantics
- Vulnerability manifests identically
- Detection logic remains the same

**Example:**
```
Original: Python os.system(user_input)
Target: PHP system($user_input)

VERDICT: APPLICABLE
REASONING: Both pass a single command string to the system shell. Vulnerability
is identical (command injection). Detection logic (taint from input to the
shell call) translates directly.
```

### APPLICABLE_WITH_ADAPTATION

The pattern can be ported but requires significant changes.

**Criteria:**
- Vulnerability class exists but manifests differently
- Equivalent constructs exist but with different APIs
- Additional patterns needed for target language idioms

**Example:**
```
Original: Python pickle.loads(untrusted)
Target: Java ObjectInputStream.readObject()

VERDICT: APPLICABLE_WITH_ADAPTATION
REASONING: Both detect deserialization vulnerabilities but the APIs differ
significantly. Java requires detection of ObjectInputStream creation and
readObject() calls, not a single function call.
ADAPTATIONS_NEEDED:
- Different sink patterns (readObject vs loads)
- May need pattern-inside for ObjectInputStream context
- Consider readUnshared() variant
```

### NOT_APPLICABLE

The pattern should not be ported to this language.

**Criteria:**
- Vulnerability class doesn't exist in target language
- No equivalent construct exists
- Pattern would be meaningless or misleading

**Example:**
```
Original: C buffer overflow detection
Target: Python

VERDICT: NOT_APPLICABLE
REASONING: Python handles memory management automatically. Buffer overflows
in the traditional C sense don't exist. The vulnerability class is not
present in the target language.
```

## Common Applicability Patterns

### Always Translate (Language-Agnostic Vulnerabilities)

These vulnerability classes exist across most languages:
- SQL injection (any language with DB access)
- Command injection (any language with shell execution)
- Path traversal (any language with file operations)
- SSRF (any language with HTTP clients)
- XSS (any language generating HTML)

### Sometimes Translate (Context-Dependent)

These require careful analysis:
- Deserialization: Different mechanisms per language
- Cryptographic weaknesses: Language-specific crypto libraries
- Race conditions: Depends on concurrency model
- Integer overflow: Depends on type system

### Rarely Translate (Language-Specific)

These are often NOT_APPLICABLE for other languages:
- Memory corruption (C/C++ specific)
- Type juggling (PHP specific)
- Prototype pollution (JavaScript specific)
- GIL-related issues (Python specific)

## Library-Specific Rules

When the original rule targets a third-party library:

### Step 1: Identify the Library's Purpose

What functionality does the library provide?
- ORM / Database access
- HTTP client/server
- Serialization
- Templating
- etc.

### Step 2: Research Target Language Ecosystem

For the target language, identify:
- Standard library equivalents
- Popular third-party libraries with same functionality
- Language-specific idioms for this functionality

### Step 3: Decide on Scope

Options:
- **Native constructs only**: Port to standard library equivalents
- **Popular library**: Port to the most common library in target ecosystem
- **Multiple variants**: Create separate rules for multiple libraries

**Recommendation**: Start with the standard library or the most popular option. Additional library variants can be created separately if needed.

## Analysis Checklist

Before proceeding past Phase 1:

- [ ] Parsed original rule and identified pattern type
- [ ] Identified sinks, sources, and sanitizers (if taint mode)
- [ ] Researched equivalent constructs in target language
- [ ] Documented verdict with specific reasoning
- [ ] If APPLICABLE_WITH_ADAPTATION, listed required changes
- [ ] If NOT_APPLICABLE, documented clear explanation

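
The checklist can also be applied to a machine-readable analysis record. The function below is a sketch that returns the unmet items; the dictionary keys (`pattern_type`, `sources`, `sinks`, and so on) are hypothetical names chosen for this example. Sanitizers are not checked because a taint rule may legitimately have none.

```python
from typing import Dict, List

VERDICTS = ("APPLICABLE", "APPLICABLE_WITH_ADAPTATION", "NOT_APPLICABLE")


def checklist_gaps(record: Dict[str, object]) -> List[str]:
    """Return the checklist items that `record` does not yet satisfy."""
    gaps: List[str] = []
    pattern_type = record.get("pattern_type")
    if pattern_type not in ("taint", "pattern"):
        gaps.append("pattern type not identified")
    if pattern_type == "taint":
        for key in ("sources", "sinks"):
            if not record.get(key):
                gaps.append(f"{key} not identified")
    verdict = record.get("verdict")
    # A NOT_APPLICABLE verdict may mean no equivalent construct exists at all.
    if verdict != "NOT_APPLICABLE" and not record.get("equivalent_constructs"):
        gaps.append("equivalent constructs not researched")
    if verdict not in VERDICTS:
        gaps.append("verdict missing or unknown")
    if not record.get("reasoning"):
        gaps.append("reasoning not documented")
    if verdict == "APPLICABLE_WITH_ADAPTATION" and not record.get("adaptations"):
        gaps.append("required adaptations not listed")
    return gaps
```

An empty list means Phase 1 is complete for that language.
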
## Example Analysis

**Original Rule**: Python command injection via subprocess

```yaml
rules:
  - id: python-command-injection
    mode: taint
    languages: [python]
    # message and severity are required by Semgrep; the values here are placeholders
    message: User input reaches subprocess.call with shell=True
    severity: ERROR
    pattern-sources:
      - pattern: request.args.get(...)
    pattern-sinks:
      - pattern: subprocess.call($CMD, shell=True, ...)
```

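
Before porting, it helps to have a test file for the original rule, since its cases show what each variant must reproduce in the target language. The following is a sketch of such a file; the function names are hypothetical and `request` stands for a Flask-style request object passed in by the caller. The annotations reflect the matches expected from the source and sink patterns above, not verified Semgrep output.

```python
import subprocess


def ping_concat(request):
    host = request.args.get("host", "")
    # ruleid: python-command-injection
    return subprocess.call("ping -c 1 " + host, shell=True)


def ping_fstring(request):
    host = request.args.get("host", "")
    # ruleid: python-command-injection
    return subprocess.call(f"ping -c 1 {host}", shell=True)


def ping_list(request):
    host = request.args.get("host", "")
    # No shell=True, so the sink pattern does not match.
    # ok: python-command-injection
    return subprocess.call(["ping", "-c", "1", host])


def uptime_constant(request):
    # shell=True with a constant command: the sink matches, but nothing is tainted.
    # ok: python-command-injection
    return subprocess.call("uptime", shell=True)
```

The two safe cases differ on purpose: one avoids the sink, the other reaches the sink without tainted data. A ported rule should preserve both distinctions.
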
**Target**: Go

```
TARGET: Go
VERDICT: APPLICABLE_WITH_ADAPTATION

REASONING:
- Command injection exists in Go (vulnerability class present)
- Go uses exec.Command() and exec.CommandContext() for command execution
- Go has no shell=True equivalent; commands run directly, without a shell, by default
- Shell execution in Go requires explicit wrapping such as sh -c or bash -c

EQUIVALENT_CONSTRUCTS:
- Original sink: subprocess.call(cmd, shell=True)
- Target sinks:
  - exec.Command("bash", "-c", cmd)
  - exec.Command("sh", "-c", cmd)
  - exec.Command(cmd) when cmd comes from user input

ADAPTATIONS_NEEDED:
1. Different sink patterns for Go's exec package
2. Source patterns need Go HTTP handler equivalents (r.URL.Query(), r.FormValue())
3. Consider both direct exec.Command and shell-wrapped variants
```

**Target**: Java

```
TARGET: Java
VERDICT: APPLICABLE_WITH_ADAPTATION

REASONING:
- Command injection exists in Java (vulnerability class present)
- Java uses Runtime.exec() and ProcessBuilder for command execution
- Equivalent functionality is available, but neither API invokes a shell by default, so there is no shell=True flag to match on

EQUIVALENT_CONSTRUCTS:
- Original sink: subprocess.call(cmd, shell=True)
- Target sinks:
  - Runtime.getRuntime().exec(cmd)
  - new ProcessBuilder(cmd).start()

ADAPTATIONS_NEEDED:
- Source patterns need Java servlet equivalents (request.getParameter())
- Consider both Runtime.exec and ProcessBuilder patterns
```