@vigolium/piolium 0.0.1
- package/LICENSE +21 -0
- package/README.md +117 -0
- package/agents/access-auditor.md +300 -0
- package/agents/assumption-breaker.md +154 -0
- package/agents/attack-designer.md +116 -0
- package/agents/code-scanner.md +139 -0
- package/agents/concurrency-auditor.md +238 -0
- package/agents/confirm-writer.md +257 -0
- package/agents/context-reviewer.md +274 -0
- package/agents/cross-verifier.md +165 -0
- package/agents/cve-scout.md +381 -0
- package/agents/env-builder.md +282 -0
- package/agents/env-profiler.md +205 -0
- package/agents/evidence-collector.md +140 -0
- package/agents/finding-grader.md +142 -0
- package/agents/finding-writer.md +148 -0
- package/agents/flow-tracer.md +106 -0
- package/agents/goal-backtracer.md +146 -0
- package/agents/history-miner.md +467 -0
- package/agents/independent-verifier.md +118 -0
- package/agents/intent-mapper.md +183 -0
- package/agents/longshot-collector.md +128 -0
- package/agents/longshot-prober.md +126 -0
- package/agents/patch-auditor.md +73 -0
- package/agents/poc-author.md +124 -0
- package/agents/poc-runner.md +194 -0
- package/agents/probe-lead.md +269 -0
- package/agents/red-challenger.md +101 -0
- package/agents/report-composer.md +208 -0
- package/agents/review-adjudicator.md +216 -0
- package/agents/spec-auditor.md +155 -0
- package/agents/taint-tracer.md +265 -0
- package/agents/test-locator.md +209 -0
- package/agents/threat-modeler.md +132 -0
- package/agents/variant-scanner.md +108 -0
- package/agents/variant-spotter.md +110 -0
- package/bin/piolium.mjs +376 -0
- package/extensions/piolium/_vendor/yaml.bundle.d.mts +6 -0
- package/extensions/piolium/_vendor/yaml.bundle.mjs +139 -0
- package/extensions/piolium/agent-runner.ts +322 -0
- package/extensions/piolium/agents.ts +266 -0
- package/extensions/piolium/audit-state.ts +522 -0
- package/extensions/piolium/bundled-resources.ts +97 -0
- package/extensions/piolium/candidate-scan.ts +966 -0
- package/extensions/piolium/command-target.ts +177 -0
- package/extensions/piolium/console-stream.ts +57 -0
- package/extensions/piolium/export-results.ts +380 -0
- package/extensions/piolium/findings.ts +448 -0
- package/extensions/piolium/heartbeat.ts +182 -0
- package/extensions/piolium/help.ts +234 -0
- package/extensions/piolium/index.ts +1865 -0
- package/extensions/piolium/longshot.ts +530 -0
- package/extensions/piolium/matcher-suggestions.ts +196 -0
- package/extensions/piolium/matcher-utils.ts +83 -0
- package/extensions/piolium/modes/balanced.ts +750 -0
- package/extensions/piolium/modes/confirm-bootstrap.ts +186 -0
- package/extensions/piolium/modes/confirm.ts +697 -0
- package/extensions/piolium/modes/deep.ts +917 -0
- package/extensions/piolium/modes/diff.ts +177 -0
- package/extensions/piolium/modes/lite.ts +540 -0
- package/extensions/piolium/modes/longshot.ts +595 -0
- package/extensions/piolium/modes/merge.ts +204 -0
- package/extensions/piolium/modes/phase-runner.ts +267 -0
- package/extensions/piolium/modes/reinvest.ts +546 -0
- package/extensions/piolium/modes/revisit.ts +279 -0
- package/extensions/piolium/modes.ts +48 -0
- package/extensions/piolium/phase-labels.ts +123 -0
- package/extensions/piolium/phase-status-strip.ts +92 -0
- package/extensions/piolium/prompt-prefix-editor.ts +39 -0
- package/extensions/piolium/providers/anthropic-vertex.ts +836 -0
- package/extensions/piolium/recon.ts +409 -0
- package/extensions/piolium/result-stats.ts +105 -0
- package/extensions/piolium/retry.ts +120 -0
- package/extensions/piolium/scheduler.ts +212 -0
- package/extensions/piolium/secrets.ts +368 -0
- package/extensions/piolium/tools/web-tools.ts +148 -0
- package/package.json +77 -0
- package/skills/agentic-actions-auditor/SKILL.md +327 -0
- package/skills/agentic-actions-auditor/references/action-profiles.md +186 -0
- package/skills/agentic-actions-auditor/references/cross-file-resolution.md +209 -0
- package/skills/agentic-actions-auditor/references/foundations.md +94 -0
- package/skills/agentic-actions-auditor/references/vector-a-env-var-intermediary.md +77 -0
- package/skills/agentic-actions-auditor/references/vector-b-direct-expression-injection.md +83 -0
- package/skills/agentic-actions-auditor/references/vector-c-cli-data-fetch.md +83 -0
- package/skills/agentic-actions-auditor/references/vector-d-pr-target-checkout.md +88 -0
- package/skills/agentic-actions-auditor/references/vector-e-error-log-injection.md +88 -0
- package/skills/agentic-actions-auditor/references/vector-f-subshell-expansion.md +82 -0
- package/skills/agentic-actions-auditor/references/vector-g-eval-of-ai-output.md +91 -0
- package/skills/agentic-actions-auditor/references/vector-h-dangerous-sandbox-configs.md +102 -0
- package/skills/agentic-actions-auditor/references/vector-i-wildcard-allowlists.md +88 -0
- package/skills/audit/SKILL.md +562 -0
- package/skills/audit/assets/icon.svg +7 -0
- package/skills/audit/hooks/scripts/validate_phase_output.py +550 -0
- package/skills/audit/references/adversarial-review.md +148 -0
- package/skills/audit/references/architecture-aware-sast.md +306 -0
- package/skills/audit/references/audit-workflow.md +737 -0
- package/skills/audit/references/chamber-protocol.md +384 -0
- package/skills/audit/references/creative-attack-modes.md +221 -0
- package/skills/audit/references/deep-analysis.md +273 -0
- package/skills/audit/references/domain-attack-playbooks.md +1129 -0
- package/skills/audit/references/knowledge-base-template.md +513 -0
- package/skills/audit/references/real-env-validation.md +191 -0
- package/skills/audit/references/report-templates.md +417 -0
- package/skills/audit/references/triage-and-prereqs.md +134 -0
- package/skills/audit/scripts/consolidate_drafts.py +554 -0
- package/skills/audit/scripts/partition_findings.py +152 -0
- package/skills/audit/scripts/rg-hotspots.sh +121 -0
- package/skills/audit/scripts/stamp_file_state.py +349 -0
- package/skills/code-reviewer/SKILL.md +65 -0
- package/skills/codeql/SKILL.md +281 -0
- package/skills/codeql/references/build-fixes.md +90 -0
- package/skills/codeql/references/diagnostic-query-templates.md +339 -0
- package/skills/codeql/references/extension-yaml-format.md +209 -0
- package/skills/codeql/references/important-only-suite.md +153 -0
- package/skills/codeql/references/language-details.md +207 -0
- package/skills/codeql/references/macos-arm64e-workaround.md +179 -0
- package/skills/codeql/references/performance-tuning.md +111 -0
- package/skills/codeql/references/quality-assessment.md +172 -0
- package/skills/codeql/references/ruleset-catalog.md +63 -0
- package/skills/codeql/references/run-all-suite.md +92 -0
- package/skills/codeql/references/sarif-processing.md +79 -0
- package/skills/codeql/references/threat-models.md +51 -0
- package/skills/codeql/workflows/build-database.md +280 -0
- package/skills/codeql/workflows/create-data-extensions.md +261 -0
- package/skills/codeql/workflows/run-analysis.md +301 -0
- package/skills/differential-review/SKILL.md +220 -0
- package/skills/differential-review/adversarial.md +203 -0
- package/skills/differential-review/methodology.md +234 -0
- package/skills/differential-review/patterns.md +300 -0
- package/skills/differential-review/reporting.md +369 -0
- package/skills/fp-check/SKILL.md +125 -0
- package/skills/fp-check/references/bug-class-verification.md +114 -0
- package/skills/fp-check/references/deep-verification.md +143 -0
- package/skills/fp-check/references/evidence-templates.md +91 -0
- package/skills/fp-check/references/false-positive-patterns.md +115 -0
- package/skills/fp-check/references/gate-reviews.md +27 -0
- package/skills/fp-check/references/standard-verification.md +78 -0
- package/skills/insecure-defaults/SKILL.md +117 -0
- package/skills/insecure-defaults/references/examples.md +409 -0
- package/skills/last30days/SKILL.md +444 -0
- package/skills/sarif-parsing/SKILL.md +483 -0
- package/skills/sarif-parsing/resources/jq-queries.md +162 -0
- package/skills/sarif-parsing/resources/sarif_helpers.py +331 -0
- package/skills/security-threat-model/LICENSE.txt +201 -0
- package/skills/security-threat-model/SKILL.md +81 -0
- package/skills/security-threat-model/agents/openai.yaml +4 -0
- package/skills/security-threat-model/references/prompt-template.md +255 -0
- package/skills/security-threat-model/references/security-controls-and-assets.md +32 -0
- package/skills/semgrep/SKILL.md +212 -0
- package/skills/semgrep/references/rulesets.md +162 -0
- package/skills/semgrep/references/scan-modes.md +110 -0
- package/skills/semgrep/references/scanner-task-prompt.md +140 -0
- package/skills/semgrep/scripts/merge_sarif.py +203 -0
- package/skills/semgrep/workflows/scan-workflow.md +311 -0
- package/skills/semgrep-rule-creator/SKILL.md +168 -0
- package/skills/semgrep-rule-creator/references/quick-reference.md +202 -0
- package/skills/semgrep-rule-creator/references/workflow.md +240 -0
- package/skills/semgrep-rule-variant-creator/SKILL.md +205 -0
- package/skills/semgrep-rule-variant-creator/references/applicability-analysis.md +250 -0
- package/skills/semgrep-rule-variant-creator/references/language-syntax-guide.md +324 -0
- package/skills/semgrep-rule-variant-creator/references/workflow.md +518 -0
- package/skills/sharp-edges/SKILL.md +292 -0
- package/skills/sharp-edges/references/auth-patterns.md +252 -0
- package/skills/sharp-edges/references/case-studies.md +274 -0
- package/skills/sharp-edges/references/config-patterns.md +333 -0
- package/skills/sharp-edges/references/crypto-apis.md +190 -0
- package/skills/sharp-edges/references/lang-c.md +205 -0
- package/skills/sharp-edges/references/lang-csharp.md +285 -0
- package/skills/sharp-edges/references/lang-go.md +270 -0
- package/skills/sharp-edges/references/lang-java.md +263 -0
- package/skills/sharp-edges/references/lang-javascript.md +269 -0
- package/skills/sharp-edges/references/lang-kotlin.md +265 -0
- package/skills/sharp-edges/references/lang-php.md +245 -0
- package/skills/sharp-edges/references/lang-python.md +274 -0
- package/skills/sharp-edges/references/lang-ruby.md +273 -0
- package/skills/sharp-edges/references/lang-rust.md +272 -0
- package/skills/sharp-edges/references/lang-swift.md +287 -0
- package/skills/sharp-edges/references/language-specific.md +588 -0
- package/skills/spec-to-code-compliance/SKILL.md +357 -0
- package/skills/spec-to-code-compliance/resources/COMPLETENESS_CHECKLIST.md +69 -0
- package/skills/spec-to-code-compliance/resources/IR_EXAMPLES.md +417 -0
- package/skills/spec-to-code-compliance/resources/OUTPUT_REQUIREMENTS.md +105 -0
- package/skills/supply-chain-risk-auditor/SKILL.md +67 -0
- package/skills/supply-chain-risk-auditor/resources/results-template.md +41 -0
- package/skills/variant-analysis/METHODOLOGY.md +327 -0
- package/skills/variant-analysis/SKILL.md +142 -0
- package/skills/variant-analysis/resources/codeql/cpp.ql +119 -0
- package/skills/variant-analysis/resources/codeql/go.ql +69 -0
- package/skills/variant-analysis/resources/codeql/java.ql +71 -0
- package/skills/variant-analysis/resources/codeql/javascript.ql +63 -0
- package/skills/variant-analysis/resources/codeql/python.ql +80 -0
- package/skills/variant-analysis/resources/semgrep/cpp.yaml +98 -0
- package/skills/variant-analysis/resources/semgrep/go.yaml +63 -0
- package/skills/variant-analysis/resources/semgrep/java.yaml +61 -0
- package/skills/variant-analysis/resources/semgrep/javascript.yaml +60 -0
- package/skills/variant-analysis/resources/semgrep/python.yaml +72 -0
- package/skills/variant-analysis/resources/variant-report-template.md +75 -0
- package/skills/vuln-report/SKILL.md +137 -0
- package/skills/vuln-report/agents/openai.yaml +4 -0
- package/skills/vuln-report/references/report-template.md +135 -0
- package/skills/wooyun-legacy/SKILL.md +367 -0
- package/skills/wooyun-legacy/references/bank-penetration.md +222 -0
- package/skills/wooyun-legacy/references/checklists/command-execution-checklist.md +119 -0
- package/skills/wooyun-legacy/references/checklists/csrf-checklist.md +74 -0
- package/skills/wooyun-legacy/references/checklists/file-upload-checklist.md +108 -0
- package/skills/wooyun-legacy/references/checklists/info-disclosure-checklist.md +114 -0
- package/skills/wooyun-legacy/references/checklists/logic-flaws-checklist.md +95 -0
- package/skills/wooyun-legacy/references/checklists/misconfig-checklist.md +124 -0
- package/skills/wooyun-legacy/references/checklists/path-traversal-checklist.md +87 -0
- package/skills/wooyun-legacy/references/checklists/rce-checklist.md +93 -0
- package/skills/wooyun-legacy/references/checklists/sql-injection-checklist.md +97 -0
- package/skills/wooyun-legacy/references/checklists/ssrf-checklist.md +99 -0
- package/skills/wooyun-legacy/references/checklists/unauthorized-access-checklist.md +89 -0
- package/skills/wooyun-legacy/references/checklists/weak-password-checklist.md +115 -0
- package/skills/wooyun-legacy/references/checklists/xss-checklist.md +103 -0
- package/skills/wooyun-legacy/references/checklists/xxe-checklist.md +130 -0
- package/skills/wooyun-legacy/references/info-disclosure.md +975 -0
- package/skills/wooyun-legacy/references/logic-flaws.md +721 -0
- package/skills/wooyun-legacy/references/path-traversal.md +1191 -0
- package/skills/wooyun-legacy/references/telecom-penetration.md +156 -0
- package/skills/wooyun-legacy/references/unauthorized-access.md +980 -0
- package/skills/wooyun-legacy/references/xss.md +746 -0
- package/skills/zeroize-audit/SKILL.md +371 -0
- package/skills/zeroize-audit/configs/c.yaml +21 -0
- package/skills/zeroize-audit/configs/default.yaml +128 -0
- package/skills/zeroize-audit/configs/rust.yaml +83 -0
- package/skills/zeroize-audit/prompts/report_template.md +238 -0
- package/skills/zeroize-audit/prompts/system.md +163 -0
- package/skills/zeroize-audit/prompts/task.md +97 -0
- package/skills/zeroize-audit/references/compile-commands.md +231 -0
- package/skills/zeroize-audit/references/detection-strategy.md +191 -0
- package/skills/zeroize-audit/references/ir-analysis.md +252 -0
- package/skills/zeroize-audit/references/mcp-analysis.md +221 -0
- package/skills/zeroize-audit/references/poc-generation.md +470 -0
- package/skills/zeroize-audit/references/rust-zeroization-patterns.md +867 -0
- package/skills/zeroize-audit/schemas/input.json +83 -0
- package/skills/zeroize-audit/schemas/output.json +140 -0
- package/skills/zeroize-audit/tools/analyze_asm.sh +202 -0
- package/skills/zeroize-audit/tools/analyze_cfg.py +381 -0
- package/skills/zeroize-audit/tools/analyze_heap.sh +211 -0
- package/skills/zeroize-audit/tools/analyze_ir_semantic.py +429 -0
- package/skills/zeroize-audit/tools/diff_ir.sh +135 -0
- package/skills/zeroize-audit/tools/diff_rust_mir.sh +189 -0
- package/skills/zeroize-audit/tools/emit_asm.sh +67 -0
- package/skills/zeroize-audit/tools/emit_ir.sh +77 -0
- package/skills/zeroize-audit/tools/emit_rust_asm.sh +178 -0
- package/skills/zeroize-audit/tools/emit_rust_ir.sh +150 -0
- package/skills/zeroize-audit/tools/emit_rust_mir.sh +158 -0
- package/skills/zeroize-audit/tools/extract_compile_flags.py +284 -0
- package/skills/zeroize-audit/tools/generate_poc.py +1329 -0
- package/skills/zeroize-audit/tools/mcp/apply_confidence_gates.py +113 -0
- package/skills/zeroize-audit/tools/mcp/check_mcp.sh +68 -0
- package/skills/zeroize-audit/tools/mcp/normalize_mcp_evidence.py +125 -0
- package/skills/zeroize-audit/tools/scripts/check_llvm_patterns.py +481 -0
- package/skills/zeroize-audit/tools/scripts/check_mir_patterns.py +554 -0
- package/skills/zeroize-audit/tools/scripts/check_rust_asm.py +424 -0
- package/skills/zeroize-audit/tools/scripts/check_rust_asm_aarch64.py +300 -0
- package/skills/zeroize-audit/tools/scripts/check_rust_asm_x86.py +283 -0
- package/skills/zeroize-audit/tools/scripts/find_dangerous_apis.py +375 -0
- package/skills/zeroize-audit/tools/scripts/semantic_audit.py +923 -0
- package/skills/zeroize-audit/tools/track_dataflow.sh +196 -0
- package/skills/zeroize-audit/tools/validate_rust_toolchain.sh +298 -0
- package/skills/zeroize-audit/workflows/phase-0-preflight.md +150 -0
- package/skills/zeroize-audit/workflows/phase-1-source-analysis.md +144 -0
- package/skills/zeroize-audit/workflows/phase-2-compiler-analysis.md +139 -0
- package/skills/zeroize-audit/workflows/phase-3-interim-report.md +46 -0
- package/skills/zeroize-audit/workflows/phase-4-poc-generation.md +46 -0
- package/skills/zeroize-audit/workflows/phase-5-poc-validation.md +136 -0
- package/skills/zeroize-audit/workflows/phase-6-final-report.md +44 -0
- package/skills/zeroize-audit/workflows/phase-7-test-generation.md +42 -0
- package/themes/piolium-srcery.json +94 -0
@@ -0,0 +1,255 @@
# Threat Modeling Prompt Template for LLMs

This reference provides a disciplined, repo-grounded prompt that produces AppSec-usable threat models. Use it when you need a reliable output contract and a consistent process to assemble the threat model output.

## System prompt

Use this as a stable system prompt:

````text
You are a senior application security engineer producing a threat model that will be read by other AppSec engineers.

Primary objective:
- Generate a threat model that is specific to THIS repository and its real-world usage.
- Prefer concrete, evidence-backed findings over generic vulnerability checklists.

Evidence and grounding rules:
- Do not invent components, data stores, endpoints, flows, or controls.
- Every architectural claim must be backed by at least one "Evidence anchor" referencing a repo path
  (and a symbol name, config key, or a short quoted snippet if available).
- If information is missing, state assumptions explicitly and list the open questions needed to validate them.

Security hygiene:
- Never output secrets. If you encounter tokens/keys/passwords, redact them and only describe their presence and location.

Threat modeling approach:
- Model the system using data flows and trust boundaries.
- Enumerate threats and produce attack goals and abuse paths.
- Prioritize threats using explicit likelihood and impact reasoning (qualitative is acceptable: low/medium/high).

Scope discipline:
- Clearly separate: production/runtime behavior vs CI/build/dev tooling vs tests/examples.
- Clearly separate attacker-controlled inputs vs operator-controlled inputs vs developer-controlled inputs.
- If a vulnerability class requires attacker control that likely does not exist for this repo's real usage, say so and downgrade severity.

Communication quality:
- Write for AppSec engineers: concise but specific.
- Use precise terminology. Include mitigations and residual risks.
- Avoid restating large blocks of README/spec; summarize and point to evidence.

Diagram requirements:
- Produce a single compact Mermaid flowchart showing primary components and trust boundaries.
- Mermaid must render cleanly. Use a conservative subset:
  - Use `flowchart TD` or `flowchart LR` and only `-->` arrows.
  - Use simple node IDs (letters/numbers/underscores only) and quoted labels (e.g., `A["Label"]`); avoid `A(Label)` shape syntax.
  - Do not use Mermaid `title` lines or `style` directives.
  - Keep edge labels to plain words/spaces only via `-->|label|`; avoid `{}`, `[]`, `()`, or quotes in edge labels (if needed, drop the label).
  - Keep node labels short and readable: do not include file paths, URLs, or socket paths (put those details in prose outside the diagram).
- Wrap the diagram in a Markdown fenced block:
  ```mermaid
  <mermaid syntax here>
  ```
````
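
The diagram requirements in the system prompt are mechanical enough to check after the model responds. The following is a minimal sketch of such a check, not part of the package: `lint_mermaid` and its regular expressions are hypothetical names introduced here, and the check is line-based, so it only approximates Mermaid's grammar.

```python
import re

# All names below are hypothetical helpers for illustration only.
EDGE_LABEL = re.compile(r"-->\|([^|]*)\|")          # text inside -->|label|
QUOTED = re.compile(r'"[^"]*"')                     # quoted node labels
OTHER_ARROW = re.compile(r"-\.->|==>|---|--[ox]|<-->")  # arrows other than -->
PLAIN_WORDS = re.compile(r"[A-Za-z0-9 ]+")
NODE = re.compile(r"[A-Za-z0-9_]+(\[\])?")          # ID, optionally with a stripped label


def lint_mermaid(diagram: str) -> list[str]:
    """Return a list of violations of the conservative Mermaid subset."""
    lines = [ln.strip() for ln in diagram.strip().splitlines() if ln.strip()]
    problems = []
    if not lines or lines[0] not in ("flowchart TD", "flowchart LR"):
        problems.append("line 1: expected `flowchart TD` or `flowchart LR`")
    # Line numbers count non-blank lines only.
    for number, line in enumerate(lines[1:], start=2):
        if line.split()[0] in ("title", "style"):
            problems.append(f"line {number}: `title`/`style` is not allowed")
            continue
        for label in EDGE_LABEL.findall(line):
            if not PLAIN_WORDS.fullmatch(label):
                problems.append(f"line {number}: edge label {label!r} is not plain words")
        # Drop edge labels and quoted node labels before structural checks.
        bare = QUOTED.sub("", EDGE_LABEL.sub("-->", line))
        if OTHER_ARROW.search(bare):
            problems.append(f"line {number}: only `-->` arrows are allowed")
        if "(" in bare:
            problems.append(f"line {number}: use quoted labels, not A(Label) shapes")
        if "-->" in bare:
            for endpoint in bare.split("-->"):
                if not NODE.fullmatch(endpoint.strip()):
                    problems.append(f"line {number}: {endpoint.strip()!r} is not a simple node ID")
    return problems
```

An empty list means the diagram stays inside the subset; any returned message can be fed back to the model as a correction request.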

## Repository summary prompt

```text
We have a codebase located at {repo_directory/path}, currently on branch {branch_name}.

Please produce a security-oriented summary of the repository (or the specified sub-path) with the goal of helping a follow-on security engineer quickly understand the system well enough to build an initial threat model and investigate potential security hypotheses.

Objectives
1. Project overview
   • Identify the primary programming languages, frameworks, and build system.
   • Summarize the project’s core purpose and high-level architecture.
   • Describe major components, services, or modules and how they interact.
2. Security posture and entry points
   • Identify likely user entry points and trust boundaries.
   • Describe existing security layers (e.g., authentication, authorization, validation, sandboxing, isolation, privilege boundaries).
   • Call out security-critical components and assumptions that must hold for the system to remain secure.

Guidance for Security Analysis

Structure the summary so an application security engineer can quickly answer questions such as:
• Where does user input originate?
• How is untrusted data parsed, validated, and handled?
• What security assumptions should not be violated?
• Where are the most likely choke points for security bugs?

Adapt the analysis to the project type. For example:
• Web applications: where requests enter, how user data is parsed, routed, authenticated, and stored.
• Command-line tools: supported inputs (arguments, files, environment variables, stdin) and how they are processed.
• Network daemons: exposed ports, supported protocols, message formats, and request handling paths.
• Operating system or low-level components: common vulnerability classes (e.g., memory corruption, logic flaws) that could lead to LPE or RCE.

Be thorough but pragmatic: the goal is to help a security engineer quickly determine whether a discovered bug is security-relevant and where deeper investigation should focus.

Tooling Notes

If ripgrep (rg) is available, use it to explore the codebase; it skips binary files by default. When falling back to grep, always include the -I flag to avoid searching through binary files.
```
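
The prompt above carries two curly-brace placeholders, `{repo_directory/path}` and `{branch_name}`. A small sketch of filling them follows; it also reports any placeholder left unfilled. The helper name `fill_placeholders` and the example path are assumptions made for illustration and do not come from the package.

```python
import re

# Matches {name} placeholders; names may contain "/" as in {repo_directory/path}.
# An empty pair of braces ({}) is deliberately not treated as a placeholder.
PLACEHOLDER = re.compile(r"\{([A-Za-z_][\w/]*)\}")


def fill_placeholders(template: str, values: dict[str, str]) -> tuple[str, list[str]]:
    """Substitute known placeholders and return (text, names left unfilled)."""
    missing: list[str] = []

    def substitute(match: re.Match) -> str:
        name = match.group(1)
        if name in values:
            return values[name]
        missing.append(name)
        return match.group(0)  # leave unknown placeholders visible

    return PLACEHOLDER.sub(substitute, template), missing


first_line = "We have a codebase located at {repo_directory/path}, currently on branch {branch_name}."
text, missing = fill_placeholders(first_line, {"repo_directory/path": "/srv/checkouts/example-service"})
```

Leaving unknown placeholders in place, rather than raising, keeps the gap visible in the rendered prompt so it can be handled as an explicit assumption.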



## User prompt template

Use this as the task prompt, filling in what you know and marking the rest as assumptions:

```text
# Inputs
Context (fill as available; otherwise infer and mark assumptions):
- intended_usage: {intended_usage}
- deployment_model: {deployment_model}
- data_sensitivity: {data_sensitivity}
- internet_exposure: {internet_exposure}
- authn_authz_expectations: {authn_authz_expectations}
- out_of_scope: {out_of_scope}

Provided summaries (may be incomplete):
- repository_summary: {repository_summary}


In-scope code locations (if known):
- in_scope_paths: {in_scope_paths}

# Task
Construct a repo-centric threat model that helps AppSec engineers understand the most important security risks and where to focus manual review.

You MUST follow this process and reflect outputs in the final document:

## Process
1) Repo discovery (evidence collection)
   a. Identify the repo shape:
      - languages and frameworks
      - how it runs (server/cli/library), entrypoints, build artifacts
   b. Identify security-relevant surfaces and controls by searching for evidence such as:
      - network listeners/routes/endpoints; RPC handlers; message consumers
      - authentication, session/token handling, authorization checks, RBAC/ACL logic
      - parsing/serialization/deserialization (JSON/YAML/XML/protobuf), template rendering, eval/dynamic code
      - file upload/read paths, archive extraction, image/document parsing
      - database/queue/cache clients and query construction
      - secrets/config loading, environment variables, key management
      - SSRF-capable HTTP clients, webhooks, URL fetchers
      - sandboxing/isolation, privilege boundaries, subprocess execution
      - logging/auditing and error handling paths
      - CI/build/release: pipelines, dependency management, artifact publishing

2) System model
   a. Summarize the primary components (runtime plus critical build/CI components when relevant).
   b. Enumerate data flows and trust boundaries.
      - For each trust boundary, specify:
        * source to destination
        * data types crossing (e.g., credentials, PII, files, tokens, prompts)
        * channel/protocol (HTTP/gRPC/IPC/file/db)
        * security guarantees and validation (auth, mTLS, origin checks, schema validation, rate limits)
   c. Provide a compact Mermaid diagram showing components and trust boundaries.

3) Assets and security objectives
   - List assets (data, credentials, integrity-critical state, availability-critical components, build artifacts).
   - For each asset, state why it matters (confidentiality/integrity/availability, compliance, user harm).

4) Attacker model
   - Capabilities: realistic remote attacker assumptions based on intended usage and exposure.
   - Non-capabilities: things the attacker cannot plausibly do (unless explicitly in scope), to avoid inflated severity.

5) Threat enumeration (concrete, system-specific)
   - Generate threats as attacker stories tied to:
     * entry points
     * trust boundaries
     * privileged components
   - Prefer abuse paths (multi-step sequences) over single-line generic threats.

6) Risk prioritization
   - For each threat:
     * Likelihood: low/medium/high with a 1 to 2 sentence justification
     * Impact: low/medium/high with a 1 to 2 sentence justification
     * Overall priority: critical/high/medium/low (based on likelihood x impact, adjusted for existing controls)
   - Explicitly state which assumptions most affect risk.

7) Validate assumptions and service context with the user (required before final report)
   - Summarize key assumptions that materially affect scope or risk ranking.
   - Ask 1 to 3 targeted questions to resolve missing service meta-context (service owner/environment, scale/users, deployment model, authn/authz, internet exposure, data sensitivity, multi-tenancy).
   - Pause and wait for user feedback before producing the final report.
   - If the user cannot answer, proceed with explicit assumptions and mark any conditional conclusions.

8) Mitigations and recommendations
   - For each high/critical threat:
     * Existing mitigations (with evidence anchors)
     * Gaps/weaknesses
     * Recommended mitigations (code/config/process)
     * Detection/monitoring ideas (logging, metrics, alerts)

9) Focus paths for manual security review
   - Output 2 to 30 repo-relative paths (files or directories) that merit deeper review.
   - For each path, give a one-sentence reason tied to the threat model.

10) Quality check
   - Provide a short checklist confirming you covered:
     * all entry points you discovered
     * each trust boundary at least once in threats
     * runtime vs CI/dev separation
     * user clarifications (or explicit non-responses)
     * assumptions and open questions

## Required output format (exact)
Before producing the final Markdown report, first provide an assumption-validation check-in:
- List the key assumptions in 3 to 6 bullets.
- Ask 1 to 3 targeted context questions.
- Wait for the user response, then produce the final report below using the clarified context.

Produce valid Markdown with these sections in this order:

## Executive summary
- 1 short paragraph on the top risk themes and highest-risk areas.

## Scope and assumptions
- In-scope paths, out-of-scope items, and explicit assumptions.
- A short list of open questions that would materially change the risk ranking.


## System model
### Primary components
### Data flows and trust boundaries
Represent the system as a sequence of arrow-style bullets (e.g., Internet -> API Server, User Input -> Application Logic). For each boundary, document:
• the primary data types crossing the boundary,
• the communication channel or protocol,
• the security guarantees (e.g., authentication, origin checks, encryption, rate limiting), and
• any input validation, normalization, or schema enforcement performed.

#### Diagram
- Include a single, compact Mermaid diagram (`flowchart TD` or `flowchart LR`) showing primary components and trust boundaries (e.g., separate trust zones via subgraphs). Keep it compact, use only `-->`, avoid `title`/`style`, keep node labels short (no paths/URLs), and keep edge labels to plain words only (avoid `{}`, `[]`, `()`, or quotes).


## Assets and security objectives
- A table: Asset | Why it matters | Security objective (C/I/A)

## Attacker model
### Capabilities
### Non-capabilities

## Entry points and attack surfaces
- A table: Surface | How reached | Trust boundary | Notes | Evidence (repo path / symbol)

## Top abuse paths
- 5 to 10 short abuse paths, each as a numbered sequence of steps (attacker goal -> steps -> impact).

## Threat model table
- A Markdown table with columns:
Threat ID | Threat source | Prerequisites | Threat action | Impact | Impacted assets | Existing controls (evidence) | Gaps | Recommended mitigations | Detection ideas | Likelihood | Impact severity | Priority

Rules:
- Threat IDs must be stable and formatted: TM-001, TM-002, ...
- Priority must be one of: critical, high, medium, low.
- Keep prerequisites to 1 to 2 sentences. Keep recommended mitigations concrete.

## Criticality calibration
- Define what counts as critical/high/medium/low for THIS repo and context.
- Include 2 to 3 examples per level (tailored to the repo's assets and exposure).

## Focus paths for security review
- A table: Path | Why it matters | Related Threat IDs
```
## Notes on use

- Fill in known context, but allow the model to infer and mark assumptions.
- Include 1–2 repo-path anchors per major claim; do not dump every match.
|
|
@@ -0,0 +1,32 @@
|
|
|
1
|
+
# Security Controls and Asset Categories
|
|
2
|
+
|
|
3
|
+
Use this as a lightweight checklist to keep outputs consistent across teams. Prefer concrete, system-specific items over generic text.
|
|
4
|
+
|
|
5
|
+
## Asset categories (pick only what applies)
|
|
6
|
+
- User data (PII, content, uploads)
|
|
7
|
+
- Authentication artifacts (passwords, tokens, sessions, cookies)
|
|
8
|
+
- Authorization state (roles, policies, ACLs)
|
|
9
|
+
- Secrets and keys (API keys, signing keys, encryption keys)
|
|
10
|
+
- Configuration and feature flags
|
|
11
|
+
- Models and weights (if ML systems)
|
|
12
|
+
- Source code and build artifacts
|
|
13
|
+
- Audit logs and telemetry
|
|
14
|
+
- Availability-critical resources (queues, caches, rate limits, compute budgets)
|
|
15
|
+
- Tenant isolation boundaries and metadata
|
|
16
|
+
|
|
17
|
+
## Security control categories
|
|
18
|
+
- Identity and access: authN, authZ, session handling, mTLS, key rotation
|
|
19
|
+
- Input protection: schema validation, parsing hardening, upload scanning, sandboxing
|
|
20
|
+
- Network safeguards: TLS, network policies, WAF, rate limiting, DoS controls
|
|
21
|
+
- Data protection: encryption at rest/in transit, tokenization, redaction
|
|
22
|
+
- Isolation: process sandboxing, container boundaries, tenant isolation, seccomp
|
|
23
|
+
- Observability: audit logs, alerting, anomaly detection, tamper resistance
|
|
24
|
+
- Supply chain: dependency pinning, SBOMs, provenance, signing
|
|
25
|
+
- Change control: CI checks, deployment approvals, config guardrails
|
|
26
|
+
|
|
27
|
+
## Mitigation phrasing patterns
|
|
28
|
+
- "Enforce schema at <boundary> for <payload> before <component>."
|
|
29
|
+
- "Require authZ check for <action> on <resource> in <service>."
|
|
30
|
+
- "Isolate <parser/component> in a sandbox with <resource limits>."
|
|
31
|
+
- "Rate limit <endpoint> by <key> and apply burst caps."
|
|
32
|
+
- "Encrypt <data> at rest using <key management> and rotate <keys>."
|
|
@@ -0,0 +1,212 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: semgrep
|
|
3
|
+
description: >-
|
|
4
|
+
Run Semgrep static analysis scan on a codebase using parallel subagents.
|
|
5
|
+
Supports two scan modes — "run all" (full ruleset coverage) and "important
|
|
6
|
+
only" (high-confidence security vulnerabilities). Automatically detects and
|
|
7
|
+
uses Semgrep Pro for cross-file taint analysis when available. Use when asked
|
|
8
|
+
to scan code for vulnerabilities, run a security audit with Semgrep, find
|
|
9
|
+
bugs, or perform static analysis. Spawns parallel workers for multi-language
|
|
10
|
+
codebases.
|
|
11
|
+
allowed-tools:
|
|
12
|
+
- Bash
|
|
13
|
+
- Read
|
|
14
|
+
- Glob
|
|
15
|
+
- Task
|
|
16
|
+
- AskUserQuestion
|
|
17
|
+
- TaskCreate
|
|
18
|
+
- TaskList
|
|
19
|
+
- TaskUpdate
|
|
20
|
+
---
|
|
21
|
+
|
|
22
|
+
# Semgrep Security Scan
|
|
23
|
+
|
|
24
|
+
Run a Semgrep scan with automatic language detection, parallel execution via Task subagents, and merged SARIF output.
|
|
25
|
+
|
|
26
|
+
## Essential Principles
|
|
27
|
+
|
|
28
|
+
1. **Always use `--metrics=off`** — Semgrep sends telemetry by default; `--config auto` also phones home. Every `semgrep` command must include `--metrics=off` to prevent data leakage during security audits.
|
|
29
|
+
2. **User must approve the scan plan (Step 3 is a hard gate)** — The original "scan this codebase" request is NOT approval. Present exact rulesets, target, engine, and mode; wait for explicit "yes"/"proceed" before spawning scanners.
|
|
30
|
+
3. **Third-party rulesets are required, not optional** — Trail of Bits, 0xdea, and Decurity rules catch vulnerabilities absent from the official registry. Include them whenever the detected language matches.
|
|
31
|
+
4. **Spawn all scan Tasks in a single message** — Parallel execution is the core performance advantage. Never spawn Tasks sequentially; always emit all Task tool calls in one response.
|
|
32
|
+
5. **Always check for Semgrep Pro before scanning** — Pro enables cross-file taint tracking and catches ~250% more true positives. Skipping the check means silently missing critical inter-file vulnerabilities.
|
|
33
|
+
|
|
34
|
+
## When to Use
|
|
35
|
+
|
|
36
|
+
- Security audit of a codebase
|
|
37
|
+
- Finding vulnerabilities before code review
|
|
38
|
+
- Scanning for known bug patterns
|
|
39
|
+
- First-pass static analysis
|
|
40
|
+
|
|
41
|
+
## When NOT to Use
|
|
42
|
+
|
|
43
|
+
- Binary analysis → Use binary analysis tools
|
|
44
|
+
- Already have Semgrep CI configured → Use existing pipeline
|
|
45
|
+
- Need cross-file analysis but no Pro license → Consider CodeQL as alternative
|
|
46
|
+
- Creating custom Semgrep rules → Use `semgrep-rule-creator` skill
|
|
47
|
+
- Porting existing rules to other languages → Use `semgrep-rule-variant-creator` skill
|
|
48
|
+
|
|
49
|
+
## Output Directory
|
|
50
|
+
|
|
51
|
+
All scan results, SARIF files, and temporary data are stored in a single output directory.
|
|
52
|
+
|
|
53
|
+
- **If the user specifies an output directory** in their prompt, use it as `OUTPUT_DIR`.
|
|
54
|
+
- **If not specified**, default to `./static_analysis_semgrep_1`. If that already exists, increment to `_2`, `_3`, etc.
|
|
55
|
+
|
|
56
|
+
In both cases, **always create the directory** with `mkdir -p` before writing any files.
|
|
57
|
+
|
|
58
|
+
```bash
|
|
59
|
+
# Resolve output directory
|
|
60
|
+
if [ -n "$USER_SPECIFIED_DIR" ]; then
|
|
61
|
+
  OUTPUT_DIR="$USER_SPECIFIED_DIR"
|
|
62
|
+
else
|
|
63
|
+
  BASE="static_analysis_semgrep"
|
|
64
|
+
  N=1
|
|
65
|
+
  while [ -e "${BASE}_${N}" ]; do
|
|
66
|
+
    N=$((N + 1))
|
|
67
|
+
  done
|
|
68
|
+
  OUTPUT_DIR="${BASE}_${N}"
|
|
69
|
+
fi
|
|
70
|
+
mkdir -p "$OUTPUT_DIR/raw" "$OUTPUT_DIR/results"
|
|
71
|
+
```
|
|
72
|
+
|
|
73
|
+
The output directory is resolved **once** at the start of Step 1 and used throughout all subsequent steps.
|
|
74
|
+
|
|
75
|
+
```
|
|
76
|
+
$OUTPUT_DIR/
|
|
77
|
+
├── rulesets.txt # Approved rulesets (logged after Step 3)
|
|
78
|
+
├── raw/ # Per-scan raw output (unfiltered)
|
|
79
|
+
│ ├── python-python.json
|
|
80
|
+
│ ├── python-python.sarif
|
|
81
|
+
│ ├── python-django.json
|
|
82
|
+
│ ├── python-django.sarif
|
|
83
|
+
│ └── ...
|
|
84
|
+
└── results/ # Final merged output
|
|
85
|
+
    └── results.sarif
|
|
86
|
+
```
|
|
87
|
+
|
|
88
|
+
## Prerequisites
|
|
89
|
+
|
|
90
|
+
**Required:** Semgrep CLI (`semgrep --version`). If not installed, see [Semgrep installation docs](https://semgrep.dev/docs/getting-started/).
|
|
91
|
+
|
|
92
|
+
**Optional:** Semgrep Pro — enables cross-file taint tracking, inter-procedural analysis, and additional languages (Apex, C#, Elixir). Check with:
|
|
93
|
+
|
|
94
|
+
```bash
|
|
95
|
+
semgrep --pro --validate --config p/default --metrics=off 2>/dev/null && echo "Pro available" || echo "OSS only"
|
|
96
|
+
```
|
|
97
|
+
|
|
98
|
+
**Limitations:** OSS mode cannot track data flow across files. Pro mode uses `-j 1` for cross-file analysis (slower per ruleset, but parallel rulesets compensate).
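
As an illustration of how the `--metrics=off` principle and the Pro `-j 1` limitation combine into a single invocation, here is a hedged sketch of a command builder. The function name and its parameters are hypothetical; the actual command used by the scanner subagents is defined in `scanner-task-prompt.md`, which is not shown here. Only standard Semgrep CLI flags (`--config`, `--metrics=off`, `--sarif`, `--output`, `--pro`, `-j`) are used.

```python
def build_semgrep_command(ruleset, target, output_file, pro=False):
    """Build an argv list for one Semgrep scan (illustrative helper only)."""
    command = [
        "semgrep",
        "--config", ruleset,
        "--metrics=off",          # required on every semgrep command
        "--sarif",
        "--output", output_file,
    ]
    if pro:
        # Cross-file analysis in Pro mode runs with a single job.
        command += ["--pro", "-j", "1"]
    command.append(target)        # pass an absolute path as the target
    return command
```

The list form can be handed directly to `subprocess.run` without shell quoting concerns.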
|
|
99
|
+
|
|
100
|
+
## Scan Modes
|
|
101
|
+
|
|
102
|
+
Select mode in Step 2 of the workflow. Mode affects both scanner flags and post-processing.
|
|
103
|
+
|
|
104
|
+
| Mode | Coverage | Findings Reported |
|
|
105
|
+
|------|----------|-------------------|
|
|
106
|
+
| **Run all** | All rulesets, all severity levels | Everything |
|
|
107
|
+
| **Important only** | All rulesets, pre- and post-filtered | Security vulns only, medium-high confidence/impact |
|
|
108
|
+
|
|
109
|
+
**Important only** applies two filter layers:
|
|
110
|
+
1. **Pre-filter**: `--severity MEDIUM --severity HIGH --severity CRITICAL` (CLI flag)
|
|
111
|
+
2. **Post-filter**: JSON metadata — keeps only `category=security`, `confidence∈{MEDIUM,HIGH}`, `impact∈{MEDIUM,HIGH}`
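
The post-filter layer can be sketched as follows. This is a minimal illustration, not the jq filter from `scan-modes.md`: it assumes the rule metadata sits under `extra.metadata` in each entry of the `results` array of Semgrep's JSON output, and the function names are hypothetical.

```python
KEEP_LEVELS = {"MEDIUM", "HIGH"}


def is_important(result):
    """True when a finding is security-related with medium/high confidence and impact."""
    metadata = result.get("extra", {}).get("metadata", {})
    category = str(metadata.get("category", "")).lower()
    confidence = str(metadata.get("confidence", "")).upper()
    impact = str(metadata.get("impact", "")).upper()
    return (
        category == "security"
        and confidence in KEEP_LEVELS
        and impact in KEEP_LEVELS
    )


def filter_important(report):
    """Return a shallow copy of a Semgrep JSON report with only important findings."""
    filtered = dict(report)
    filtered["results"] = [r for r in report.get("results", []) if is_important(r)]
    return filtered
```

Findings with missing metadata are dropped by this sketch, which is the conservative reading of "keeps only"; the unfiltered report should still be preserved in `raw/`.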
|
|
112
|
+
|
|
113
|
+
See [scan-modes.md](references/scan-modes.md) for metadata criteria and jq filter commands.
|
|
114
|
+
|
|
115
|
+
## Orchestration Architecture
|
|
116
|
+
|
|
117
|
+
```
|
|
118
|
+
┌──────────────────────────────────────────────────────────────────┐
|
|
119
|
+
│ MAIN AGENT (this skill) │
|
|
120
|
+
│ Step 1: Detect languages + check Pro availability │
|
|
121
|
+
│ Step 2: Select scan mode + rulesets (ref: rulesets.md) │
|
|
122
|
+
│ Step 3: Present plan + rulesets, get approval [⛔ HARD GATE] │
|
|
123
|
+
│ Step 4: Spawn parallel scan Tasks (approved rulesets + mode) │
|
|
124
|
+
│ Step 5: Merge results and report │
|
|
125
|
+
└──────────────────────────────────────────────────────────────────┘
|
|
126
|
+
│ Step 4
|
|
127
|
+
▼
|
|
128
|
+
┌─────────────────┐
|
|
129
|
+
│ Scan Tasks │
|
|
130
|
+
│ (parallel) │
|
|
131
|
+
├─────────────────┤
|
|
132
|
+
│ Python scanner │
|
|
133
|
+
│ JS/TS scanner │
|
|
134
|
+
│ Go scanner │
|
|
135
|
+
│ Docker scanner │
|
|
136
|
+
└─────────────────┘
|
|
137
|
+
```
|
|
138
|
+
|
|
139
|
+
## Workflow
|
|
140
|
+
|
|
141
|
+
**Follow the detailed workflow in [scan-workflow.md](workflows/scan-workflow.md).** Summary:
|
|
142
|
+
|
|
143
|
+
| Step | Action | Gate | Key Reference |
|
|
144
|
+
|------|--------|------|---------------|
|
|
145
|
+
| 1 | Resolve output dir, detect languages + Pro availability | — | Use Glob, not Bash |
|
|
146
|
+
| 2 | Select scan mode + rulesets | — | [rulesets.md](references/rulesets.md) |
|
|
147
|
+
| 3 | Present plan, get explicit approval | ⛔ HARD | AskUserQuestion |
|
|
148
|
+
| 4 | Spawn parallel scan Tasks | — | [scanner-task-prompt.md](references/scanner-task-prompt.md) |
|
|
149
|
+
| 5 | Merge results and report | — | Merge script (below) |
|
|
150
|
+
|
|
151
|
+
**Task enforcement:** On invocation, create 5 tasks with blockedBy dependencies (each step is blocked by the one before it). Step 3 is a HARD GATE: mark it complete ONLY after the user explicitly approves.
|
|
152
|
+
|
|
153
|
+
**Merge command (Step 5):**
|
|
154
|
+
|
|
155
|
+
```bash
|
|
156
|
+
uv run {baseDir}/scripts/merge_sarif.py "$OUTPUT_DIR/raw" "$OUTPUT_DIR/results/results.sarif"
|
|
157
|
+
```
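
For orientation, merging per-scan SARIF files can be as simple as concatenating their `runs` arrays. The sketch below is an assumption about the general technique, not the contents of the package's `merge_sarif.py` (which is not shown here and may deduplicate or normalize results); the function name is hypothetical.

```python
import json
from pathlib import Path


def merge_sarif_dir(raw_dir, output_path):
    """Concatenate the runs of every *.sarif file in raw_dir into one SARIF file.

    Returns the number of runs written. Unreadable or malformed files are skipped.
    """
    merged = {"version": "2.1.0", "runs": []}
    for path in sorted(Path(raw_dir).glob("*.sarif")):
        try:
            document = json.loads(path.read_text(encoding="utf-8"))
        except (OSError, json.JSONDecodeError):
            continue  # skip files that cannot be read or parsed
        if isinstance(document, dict):
            merged["runs"].extend(document.get("runs", []))
    output = Path(output_path)
    output.parent.mkdir(parents=True, exist_ok=True)
    output.write_text(json.dumps(merged, indent=2), encoding="utf-8")
    return len(merged["runs"])
```

Sorting the input paths keeps the merged output deterministic across runs, which makes diffs between scans easier to read.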
|
|
158
|
+
|
|
159
|
+
## Agents
|
|
160
|
+
|
|
161
|
+
| Agent | Tools | Purpose |
|
|
162
|
+
|-------|-------|---------|
|
|
163
|
+
| `static-analysis:semgrep-scanner` | Bash | Executes parallel semgrep scans for a language category |
|
|
164
|
+
|
|
165
|
+
Use `subagent_type: static-analysis:semgrep-scanner` in Step 4 when spawning Task subagents.
|
|
166
|
+
|
|
167
|
+
## Rationalizations to Reject
|
|
168
|
+
|
|
169
|
+
| Shortcut | Why It's Wrong |
|
|
170
|
+
|----------|----------------|
|
|
171
|
+
| "User asked for scan, that's approval" | Original request ≠ plan approval. Present plan, use AskUserQuestion, await explicit "yes" |
|
|
172
|
+
| "Step 3 task is blocking, just mark complete" | Lying about task status defeats enforcement. Only mark complete after real approval |
|
|
173
|
+
| "I already know what they want" | Assumptions cause scanning wrong directories/rulesets. Present plan for verification |
|
|
174
|
+
| "Just use default rulesets" | User must see and approve exact rulesets before scan |
|
|
175
|
+
| "Add extra rulesets without asking" | Modifying approved list without consent breaks trust |
|
|
176
|
+
| "Third-party rulesets are optional" | Trail of Bits, 0xdea, Decurity catch vulnerabilities not in official registry — REQUIRED |
|
|
177
|
+
| "Use --config auto" | Sends metrics; less control over rulesets |
|
|
178
|
+
| "One Task at a time" | Defeats parallelism; spawn all Tasks together |
|
|
179
|
+
| "Pro is too slow, skip --pro" | Cross-file analysis catches 250% more true positives; worth the time |
|
|
180
|
+
| "Semgrep handles GitHub URLs natively" | URL handling fails on repos with non-standard YAML; always clone first |
|
|
181
|
+
| "Cleanup is optional" | Cloned repos pollute the user's workspace and accumulate across runs |
|
|
182
|
+
| "Use `.` or relative path as target" | Subagents need absolute paths to avoid ambiguity |
|
|
183
|
+
| "Let the user pick an output dir later" | Output directory must be resolved at Step 1, before any files are created |
|
|
184
|
+
|
|
185
|
+
## Reference Index
|
|
186
|
+
|
|
187
|
+
| File | Content |
|
|
188
|
+
|------|---------|
|
|
189
|
+
| [rulesets.md](references/rulesets.md) | Complete ruleset catalog and selection algorithm |
|
|
190
|
+
| [scan-modes.md](references/scan-modes.md) | Pre/post-filter criteria and jq commands |
|
|
191
|
+
| [scanner-task-prompt.md](references/scanner-task-prompt.md) | Template for spawning scanner subagents |
|
|
192
|
+
|
|
193
|
+
| Workflow | Purpose |
|
|
194
|
+
|----------|---------|
|
|
195
|
+
| [scan-workflow.md](workflows/scan-workflow.md) | Complete 5-step scan execution process |
|
|
196
|
+
|
|
197
|
+
## Success Criteria
|
|
198
|
+
|
|
199
|
+
- [ ] Output directory resolved (user-specified or auto-incremented default)
|
|
200
|
+
- [ ] All generated files stored inside `$OUTPUT_DIR`
|
|
201
|
+
- [ ] Languages detected with file counts; Pro status checked
|
|
202
|
+
- [ ] Scan mode selected by user (run all / important only)
|
|
203
|
+
- [ ] Rulesets include third-party rules for all detected languages
|
|
204
|
+
- [ ] User explicitly approved the scan plan (Step 3 gate passed)
|
|
205
|
+
- [ ] All scan Tasks spawned in a single message and completed
|
|
206
|
+
- [ ] Every `semgrep` command used `--metrics=off`
|
|
207
|
+
- [ ] Approved rulesets logged to `$OUTPUT_DIR/rulesets.txt`
|
|
208
|
+
- [ ] Raw per-scan outputs stored in `$OUTPUT_DIR/raw/`
|
|
209
|
+
- [ ] `results.sarif` exists in `$OUTPUT_DIR/results/` and is valid JSON
|
|
210
|
+
- [ ] Important-only mode: post-filter applied before merge; unfiltered results preserved in `raw/`
|
|
211
|
+
- [ ] Results summary reported with severity and category breakdown
|
|
212
|
+
- [ ] Cloned repos (if any) cleaned up from `$OUTPUT_DIR/repos/`
|
|
@@ -0,0 +1,162 @@
|
|
|
1
|
+
# Semgrep Rulesets Reference
|
|
2
|
+
|
|
3
|
+
## Complete Ruleset Catalog
|
|
4
|
+
|
|
5
|
+
### Security-Focused Rulesets
|
|
6
|
+
|
|
7
|
+
| Ruleset | Description | Use Case |
|
|
8
|
+
|---------|-------------|----------|
|
|
9
|
+
| `p/security-audit` | Comprehensive vulnerability detection, higher false positives | Manual audits, security reviews |
|
|
10
|
+
| `p/secrets` | Hardcoded credentials, API keys, tokens | Always include |
|
|
11
|
+
| `p/owasp-top-ten` | OWASP Top 10 web application vulnerabilities | Web app security |
|
|
12
|
+
| `p/cwe-top-25` | CWE Top 25 most dangerous software weaknesses | General security |
|
|
13
|
+
| `p/sql-injection` | SQL injection patterns and tainted data flows | Database security |
|
|
14
|
+
| `p/insecure-transport` | Ensures code uses encrypted channels | Network security |
|
|
15
|
+
| `p/gitleaks` | Hard-coded credentials detection (gitleaks port) | Secrets scanning |
|
|
16
|
+
| `p/findsecbugs` | FindSecBugs rule pack for Java | Java security |
|
|
17
|
+
| `p/phpcs-security-audit` | PHP security audit rules | PHP security |
|
|
18
|
+
|
|
19
|
+
### CI/CD Rulesets
|
|
20
|
+
|
|
21
|
+
| Ruleset | Description | Use Case |
|
|
22
|
+
|---------|-------------|----------|
|
|
23
|
+
| `p/default` | Default ruleset, balanced coverage | First-time users |
|
|
24
|
+
| `p/ci` | High-confidence security + logic bugs, low FP | CI pipelines |
|
|
25
|
+
| `p/r2c-ci` | Low false positives, CI-safe | CI/CD blocking |
|
|
26
|
+
| `p/r2c` | Community favorite, curated by Semgrep (618k+ downloads) | General scanning |
|
|
27
|
+
| `p/auto` | Auto-selects rules based on detected languages/frameworks | Quick scans |
|
|
28
|
+
| `p/comment` | Comment-related rules | Code review |
|
|
29
|
+
|
|
30
|
+
### Third-Party Rulesets
|
|
31
|
+
|
|
32
|
+
| Ruleset | Description | Maintainer |
|
|
33
|
+
|---------|-------------|------------|
|
|
34
|
+
| `p/gitlab` | GitLab-maintained security rules | GitLab |
|
|
35
|
+
|
|
36
|
+
---
|
|
37
|
+
|
|
38
|
+
## Ruleset Selection Algorithm
|
|
39
|
+
|
|
40
|
+
Follow this algorithm to select rulesets based on detected languages and frameworks.
|
|
41
|
+
|
|
42
|
+
### Step 1: Always Include Security Baseline
|
|
43
|
+
|
|
44
|
+
```json
|
|
45
|
+
{
|
|
46
|
+
"baseline": ["p/security-audit", "p/secrets"]
|
|
47
|
+
}
|
|
48
|
+
```
|
|
49
|
+
|
|
50
|
+
- `p/security-audit` - Comprehensive vulnerability detection (always include)
|
|
51
|
+
- `p/secrets` - Hardcoded credentials, API keys, tokens (always include)
|
|
52
|
+
|
|
53
|
+
### Step 2: Add Language-Specific Rulesets
|
|
54
|
+
|
|
55
|
+
For each detected language, add the primary ruleset. If a framework is detected, add its ruleset too.
|
|
56
|
+
|
|
57
|
+
**GA Languages (production-ready):**
|
|
58
|
+
|
|
59
|
+
| Detection | Primary Ruleset | Framework Rulesets | Pro Rule Count |
|
|
60
|
+
|-----------|-----------------|-------------------|----------------|
|
|
61
|
+
| `.py` | `p/python` | `p/django`, `p/flask`, `p/fastapi` | 710+ |
|
|
62
|
+
| `.js`, `.jsx` | `p/javascript` | `p/react`, `p/nodejs`, `p/express`, `p/nextjs`, `p/angular` | 250+ (JS), 70+ (JSX) |
|
|
63
|
+
| `.ts`, `.tsx` | `p/typescript` | `p/react`, `p/nodejs`, `p/express`, `p/nextjs`, `p/angular` | 230+ |
|
|
64
|
+
| `.go` | `p/golang` | `p/go` (alias) | 80+ |
|
|
65
|
+
| `.java` | `p/java` | `p/spring`, `p/findsecbugs` | 190+ |
|
|
66
|
+
| `.kt` | `p/kotlin` | `p/spring` | 60+ |
|
|
67
|
+
| `.rb` | `p/ruby` | `p/rails` | 40+ |
|
|
68
|
+
| `.php` | `p/php` | `p/symfony`, `p/laravel`, `p/phpcs-security-audit` | 50+ |
|
|
69
|
+
| `.c`, `.cpp`, `.h` | `p/c` | - | 150+ |
|
|
70
|
+
| `.rs` | `p/rust` | - | 40+ |
|
|
71
|
+
| `.cs` | `p/csharp` | - | 170+ |
|
|
72
|
+
| `.scala` | `p/scala` | - | Community |
|
|
73
|
+
| `.swift` | `p/swift` | - | 60+ |
|
|
74
|
+
|
|
75
|
+
**Beta Languages (Pro recommended):**
|
|
76
|
+
|
|
77
|
+
| Detection | Primary Ruleset | Notes |
|
|
78
|
+
|-----------|-----------------|-------|
|
|
79
|
+
| `.ex`, `.exs` | `p/elixir` | Requires Pro for best coverage |
|
|
80
|
+
| `.cls`, `.trigger` | `p/apex` | Salesforce; requires Pro |
|
|
81
|
+
|
|
82
|
+
**Experimental Languages:**
|
|
83
|
+
|
|
84
|
+
| Detection | Primary Ruleset | Notes |
|
|
85
|
+
|-----------|-----------------|-------|
|
|
86
|
+
| `.sol` | No official ruleset | Use Decurity third-party rules |
|
|
87
|
+
| `Dockerfile` | `p/dockerfile` | Limited rules |
|
|
88
|
+
| `.yaml`, `.yml` | `p/yaml` | K8s, GitHub Actions, docker-compose patterns |
|
|
89
|
+
| `.json` | `r/json.aws` | AWS IAM policies; use `r/json.*` for specific rules |
|
|
90
|
+
| Bash scripts | - | Community support |
|
|
91
|
+
| Cairo, Circom | - | Experimental, smart contracts |
|
|
92
|
+
|
|
93
|
+
**Framework detection hints:**
|
|
94
|
+
|
|
95
|
+
| Framework | Detection Signals | Ruleset |
|
|
96
|
+
|-----------|------------------|---------|
|
|
97
|
+
| Django | `settings.py`, `urls.py`, `django` in requirements | `p/django` |
|
|
98
|
+
| Flask | `flask` in requirements, `@app.route` | `p/flask` |
|
|
99
|
+
| FastAPI | `fastapi` in requirements, `@app.get/post` | `p/fastapi` |
|
|
100
|
+
| React | `package.json` with react dependency, `.jsx`/`.tsx` files | `p/react` |
|
|
101
|
+
| Next.js | `next.config.js`, `pages/` or `app/` directory | `p/nextjs` |
|
|
102
|
+
| Angular | `angular.json`, `@angular/` dependencies | `p/angular` |
|
|
103
|
+
| Express | `express` in package.json, `app.use()` patterns | `p/express` |
|
|
104
|
+
| NestJS | `@nestjs/` dependencies, `@Controller` decorators | `p/nodejs` |
|
|
105
|
+
| Spring | `pom.xml` with spring, `@SpringBootApplication` | `p/spring` |
|
|
106
|
+
| Rails | `Gemfile` with rails, `config/routes.rb` | `p/rails` |
|
|
107
|
+
| Laravel | `composer.json` with laravel, `artisan` | `p/laravel` |
|
|
108
|
+
| Symfony | `composer.json` with symfony, `config/packages/` | `p/symfony` |
|
|
109
|
+
|
|
110
|
+
### Step 3: Add Infrastructure Rulesets
|
|
111
|
+
|
|
112
|
+
| Detection | Ruleset | Description |
|
|
113
|
+
|-----------|---------|-------------|
|
|
114
|
+
| `Dockerfile` | `p/dockerfile` | Container security, best practices |
|
|
115
|
+
| `.tf`, `.hcl` | `p/terraform` | IaC misconfigurations, CIS benchmarks, AWS/Azure/GCP |
|
|
116
|
+
| k8s manifests | `p/kubernetes` | K8s security, RBAC issues |
|
|
117
|
+
| CloudFormation | `p/cloudformation` | AWS infrastructure security |
|
|
118
|
+
| GitHub Actions | `p/github-actions` | CI/CD security, secrets exposure |
|
|
119
|
+
| `.yaml`, `.yml` | `p/yaml` | Generic YAML patterns (K8s, docker-compose) |
|
|
120
|
+
| AWS IAM JSON | `r/json.aws` | IAM policy misconfigurations (use `--config r/json.aws`) |
|
|
121
|
+
|
|
122
|
+
### Step 4: Add Third-Party Rulesets
|
|
123
|
+
|
|
124
|
+
These are **NOT optional**. Include automatically when language matches:
|
|
125
|
+
|
|
126
|
+
| Languages | Source | Why Required |
|
|
127
|
+
|-----------|--------|--------------|
|
|
128
|
+
| Python, Go, Ruby, JS/TS, Terraform, HCL | [Trail of Bits](https://github.com/trailofbits/semgrep-rules) | Security audit patterns from real engagements (AGPLv3) |
|
|
129
|
+
| C, C++ | [0xdea](https://github.com/0xdea/semgrep-rules) | Memory safety, low-level vulnerabilities |
|
|
130
|
+
| Solidity, Cairo, Rust | [Decurity](https://github.com/Decurity/semgrep-smart-contracts) | Smart contract vulnerabilities, DeFi exploits |
|
|
131
|
+
| Go | [dgryski](https://github.com/dgryski/semgrep-go) | Additional Go-specific patterns |
|
|
132
|
+
| Android (Java/Kotlin) | [MindedSecurity](https://github.com/mindedsecurity/semgrep-rules-android-security) | OWASP MASTG-derived mobile security rules |
|
|
133
|
+
| Java, Go, JS/TS, C#, Python, PHP | [elttam](https://github.com/elttam/semgrep-rules) | Security consulting patterns |
|
|
134
|
+
| Dockerfile, PHP, Go, Java | [kondukto](https://github.com/kondukto-io/semgrep-rules) | Container and web app security |
|
|
135
|
+
| PHP, Kotlin, Java | [dotta](https://github.com/federicodotta/semgrep-rules) | Pentest-derived web/mobile app rules |
|
|
136
|
+
| Terraform, HCL | [HashiCorp](https://github.com/hashicorp-forge/semgrep-rules) | HashiCorp infrastructure patterns |
|
|
137
|
+
| Swift, Java, Cobol | [akabe1](https://github.com/akabe1/akabe1-semgrep-rules) | iOS and legacy system patterns |
|
|
138
|
+
| Java | [Atlassian Labs](https://github.com/atlassian-labs/atlassian-sast-ruleset) | Atlassian-maintained Java rules |
|
|
139
|
+
| Python, JS/TS, Java, Ruby, Go, PHP | [Apiiro](https://github.com/apiiro/malicious-code-ruleset) | Malicious code detection, supply chain |
|
|
140
|
+
|
|
141
|
+
### Step 5: Verify Rulesets
|
|
142
|
+
|
|
143
|
+
Before finalizing, verify official rulesets load:
|
|
144
|
+
|
|
145
|
+
```bash
|
|
146
|
+
# Quick validation (semgrep exits 0 if valid; with the pipe, the shell reports head's exit status)
|
|
147
|
+
semgrep --config p/python --validate --metrics=off 2>&1 | head -3
|
|
148
|
+
```
|
|
149
|
+
|
|
150
|
+
Or browse the [Semgrep Registry](https://semgrep.dev/explore).
|
|
151
|
+
|
|
152
|
+
### Output Format
|
|
153
|
+
|
|
154
|
+
```json
|
|
155
|
+
{
|
|
156
|
+
"baseline": ["p/security-audit", "p/secrets"],
|
|
157
|
+
"python": ["p/python", "p/django"],
|
|
158
|
+
"javascript": ["p/javascript", "p/react", "p/nodejs"],
|
|
159
|
+
"docker": ["p/dockerfile"],
|
|
160
|
+
"third_party": ["https://github.com/trailofbits/semgrep-rules"]
|
|
161
|
+
}
|
|
162
|
+
```
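
The selection steps above can be sketched in a few lines. This is an illustrative reduction, not the package's implementation: it covers only a handful of extensions from the GA table, takes framework rulesets as caller-supplied input rather than detecting them, and adds only the Trail of Bits third-party source. The function name and the `frameworks` parameter are hypothetical.

```python
from collections import Counter
from pathlib import Path

BASELINE = ["p/security-audit", "p/secrets"]
# Subset of the GA table: extension -> (plan key, primary ruleset).
PRIMARY_RULESETS = {
    ".py": ("python", "p/python"),
    ".js": ("javascript", "p/javascript"),
    ".jsx": ("javascript", "p/javascript"),
    ".ts": ("typescript", "p/typescript"),
    ".tsx": ("typescript", "p/typescript"),
    ".go": ("go", "p/golang"),
    ".rb": ("ruby", "p/ruby"),
}
TRAIL_OF_BITS = "https://github.com/trailofbits/semgrep-rules"
TRAIL_OF_BITS_LANGUAGES = {"python", "javascript", "typescript", "go", "ruby"}


def select_rulesets(file_names, frameworks=None):
    """Build a ruleset plan from file names and caller-detected frameworks."""
    frameworks = frameworks or {}
    counts = Counter(Path(name).suffix.lower() for name in file_names)
    plan = {"baseline": list(BASELINE)}
    # Step 2: primary ruleset per detected language, then framework rulesets.
    for suffix, (language, ruleset) in PRIMARY_RULESETS.items():
        if counts[suffix] and ruleset not in plan.setdefault(language, []):
            plan[language].append(ruleset)
    for language, extra in frameworks.items():
        if language in plan:
            plan[language].extend(r for r in extra if r not in plan[language])
    # Step 3: infrastructure rulesets (Dockerfile only in this sketch).
    if any(Path(name).name == "Dockerfile" for name in file_names):
        plan["docker"] = ["p/dockerfile"]
    # Step 4: third-party rules whenever a matching language is present.
    if TRAIL_OF_BITS_LANGUAGES & plan.keys():
        plan["third_party"] = [TRAIL_OF_BITS]
    return plan
```

For example, `select_rulesets(["app/main.py", "Dockerfile"], {"python": ["p/django"]})` yields a plan with `baseline`, `python`, `docker`, and `third_party` keys in the same shape as the output format above.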
|