npm - @intentsolutionsio/penetration-tester - Versions diffs - 2.0.0 → 3.0.4 - Mend

@intentsolutionsio/penetration-tester 2.0.0 → 3.0.4

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (112)

package/skills/composing-vulnerability-report/references/PLAYBOOK.md ADDED

@@ -0,0 +1,180 @@
+# PLAYBOOK — Report Templates and Workflow
+## End-of-engagement workflow
+```bash
+# 1. All cluster 1-4 scans have run; findings are in engagements/<id>/findings/
+# 2. (Optional) Map findings to OWASP Top 10
+python3 plugins/security/penetration-tester/skills/mapping-findings-to-owasp-top10/scripts/map_owasp.py \
+    engagements/acme-2026-q2/
+# 3. Compose the vulnerability report
+python3 ./scripts/compose_report.py engagements/acme-2026-q2/
+# 4. Generate the executive summary
+python3 plugins/security/penetration-tester/skills/generating-executive-summary/scripts/exec_summary.py \
+    engagements/acme-2026-q2/
+# 5. Record the engagement (chain of custody)
+python3 plugins/security/penetration-tester/skills/recording-pentest-engagement/scripts/record_engagement.py \
+    engagements/acme-2026-q2/ --sign --tar engagements/archives/acme-2026-q2.tar.gz
+```
+## Report-template variants per audience
+### Technical (default — what the skill emits)
+Everything: per-finding detail, remediation steps, evidence,
+references. Audience: customer's security engineer or external
+auditor.
+### Manager-style (use `--min-severity high`)
+Only HIGH and CRITICAL findings are shown. The report body uses the
+same per-finding format, but the customer's manager doesn't have
+to scroll past medium and low findings.
+```bash
+python3 ./scripts/compose_report.py engagements/acme-2026-q2/ \
+    --min-severity high \
+    --report-output engagements/acme-2026-q2/reports/manager-brief.md
+```
+### Compliance-evidence (use `--include-info`)
+Includes the INFO-severity operational records (scan ran cleanly,
+no findings in this target, etc.). Audience: SOC2 / ISO 27001
+auditors who want documented evidence the scan ran even where
+nothing was found.
+```bash
+python3 ./scripts/compose_report.py engagements/acme-2026-q2/ \
+    --include-info \
+    --report-output engagements/acme-2026-q2/reports/compliance-evidence.md
+```
+## Per-finding remediation phrasing patterns
+The skill emits each finding's `remediation` field verbatim from
+the source. Cluster 1-4 skills construct these as numbered lists
+with imperative voice:
+```
+1. Run `npm audit fix` in the project root.
+2. If the fix requires a semver-major bump, evaluate the breaking changes.
+3. Commit the updated package-lock.json.
+```
+Patterns to maintain when authoring remediation:
+- **Imperative voice.** "Run X", not "You should run X" or "It is recommended to run X."
+- **Numbered list when there is more than one step.** Bullets blur the order.
+- **Concrete commands when possible.** "Apply security patch" is unactionable; "`apt-get update && apt-get install --only-upgrade libssl3`" is actionable.
+- **Decision points called out.** "If X, do Y; otherwise do Z" makes the operator's choice explicit.
+## Evidence-redaction for distributed reports
+A pentest report sometimes needs distribution beyond the immediate
+customer — to their auditor, their insurance carrier, their board.
+Some evidence shouldn't leave the original engagement: leaked
+credentials, full request/response bodies containing PII, screenshots
+showing customer data.
+The skill doesn't auto-redact. The pattern for human-controlled
+redaction:
+1. Generate the full report with this skill.
+2. Use a separate redaction pass to:
+   - Replace leaked-secret values with `[REDACTED]` while keeping
+     finding context.
+   - Blur or remove customer-data screenshots.
+   - Replace user-identifying URLs with `<USER>/path/...` placeholders.
+3. Distribute the redacted version. Keep the unredacted version in
+   the engagement archive.
+The skill could be extended with a `--redact PATTERN` flag if
+redaction becomes a frequent enough need.
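The separate redaction pass in step 2 could be sketched roughly as follows. This is an illustration only and not part of the package: the `redact_secrets` helper, both regular expressions, and the `/users/<USER>` placeholder shape are hypothetical, and real engagements would need patterns tuned to the secrets and URLs actually present in the evidence.

```python
import re

# Hypothetical patterns; adjust to the secret formats seen in the engagement.
SECRET_PATTERNS = [
    # key/value assignments such as "password=hunter2" or "api_key: abc123"
    re.compile(r"(?i)\b(password|api[_-]?key|token|secret)(\s*[:=]\s*)(\S+)"),
]
# User-identifying path segments such as /users/alice/orders (assumed URL shape)
USER_PATH = re.compile(r"/users/[^/\s]+")


def redact_secrets(text: str) -> str:
    """Replace secret values with [REDACTED] while keeping the key for context."""
    for pattern in SECRET_PATTERNS:
        text = pattern.sub(lambda m: f"{m.group(1)}{m.group(2)}[REDACTED]", text)
    # Swap the user segment for a placeholder, keeping the rest of the path.
    return USER_PATH.sub("/users/<USER>", text)
```

Screenshots still need manual review; a text pass like this one only covers the rendered Markdown.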
+## Cross-reference protocol with OWASP-mapping + exec-summary
+The next two cluster 6 skills consume this skill's unified findings
+file (the JSONL output, not the rendered Markdown).
+The protocol:
+1. **Findings producer:** cluster 1-4 skills emit per-skill findings JSONLs
+2. **Vulnerability report composer (this skill):** composes a single
+   vulnerability-report.md AND can output a unified findings JSONL
+3. **OWASP mapping:** reads the unified JSONL, adds `owasp_category`
+   field to each finding, writes back
+4. **Exec summary:** reads the (potentially OWASP-enriched) unified
+   JSONL, produces the executive-summary.md
+The order can be either:
+- A: cluster-1-4 → OWASP → compose → exec-summary
+- B: cluster-1-4 → compose → OWASP → re-compose → exec-summary
+Pattern B regenerates the report after enrichment. Pattern A only
+composes once. Both are valid; the choice depends on whether the
+operator wants the OWASP tags visible in the report's per-finding
+sections.
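Step 3 of the protocol (read the unified JSONL, add `owasp_category` to each finding, write back) might look like the sketch below. It is not the package's `map_owasp.py`: the `enrich_with_owasp` function, the `cwe` input field, and the three-entry lookup table are assumptions made for illustration.

```python
import json
from pathlib import Path

# Illustrative subset of a CWE -> OWASP Top 10 (2021) lookup; not the skill's table.
CWE_TO_OWASP = {
    "CWE-79": "A03:2021-Injection",
    "CWE-89": "A03:2021-Injection",
    "CWE-327": "A02:2021-Cryptographic Failures",
}


def enrich_with_owasp(jsonl_path: Path) -> int:
    """Add owasp_category to each finding in place; return how many were tagged."""
    findings = [
        json.loads(line)
        for line in jsonl_path.read_text(encoding="utf-8").splitlines()
        if line.strip()
    ]
    tagged = 0
    for finding in findings:
        # "cwe" is an assumed field name; untagged findings pass through unchanged.
        category = CWE_TO_OWASP.get(finding.get("cwe", ""))
        if category:
            finding["owasp_category"] = category
            tagged += 1
    # Write back one JSON object per line, preserving the original order.
    jsonl_path.write_text(
        "".join(json.dumps(f, sort_keys=True) + "\n" for f in findings),
        encoding="utf-8",
    )
    return tagged
```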
+## Common composition mistakes
+| Mistake | Consequence | Fix |
+|---|---|---|
+| Findings without `target` field | Findings dropped from report; HIGH operational finding emitted | Ensure all source skills emit `target` |
+| Findings and operational findings mixed in one source file | Report contains operational noise | Separate skill output and operational output |
+| Identical findings with different IDs | Dedup fails | Ensure source skills use consistent `Finding.fingerprint()` |
+| Re-running with new sources but old report path | Old report overwritten silently | Use timestamped `--report-output` paths for interim reports |
+## Per-engagement directory layout (canonical)
+The skill assumes:
+- `engagement/findings/*.json[l]` — source findings
+- `engagement/reports/` — output reports (auto-created if missing)
+- `engagement/roe.yaml` — has `engagement_id:` for header
+Deviations are supported via flags:
+- `--source FILE` — bypass auto-discovery
+- `--report-output PATH` — write report anywhere
+- `--engagement-id ID` — override ROE detection
+## Integration with cluster 1-4 skills
+The cluster 1-4 scan skills all emit findings via `report.emit()`
+(in `lib/report.py`) which can write JSON / JSONL / Markdown.
+For composition workflow, each scan should run with
+`--format jsonl --output engagement/findings/<skill>-<date>.jsonl`:
+```bash
+python3 plugins/security/penetration-tester/skills/checking-http-security-headers/scripts/check_headers.py \
+    https://app.acme.example --format jsonl \
+    --output engagements/acme-2026-q2/findings/headers-$(date +%Y%m%d).jsonl
+```
+The skill then picks up every file under `findings/` by default.
+## Stable-rendering verification
+To verify that the report renders stably (same input → same output
+except for the timestamp):
+```bash
+# Render twice
+python3 ./scripts/compose_report.py engagements/acme-2026-q2/ \
+    --report-output /tmp/report-1.md
+sleep 2
+python3 ./scripts/compose_report.py engagements/acme-2026-q2/ \
+    --report-output /tmp/report-2.md
+# Diff ignoring the timestamp line
+diff <(grep -v "Generated by" /tmp/report-1.md) \
+     <(grep -v "Generated by" /tmp/report-2.md)
+```
+The diff should be empty. If it isn't, file an issue — stable
+rendering is a contract this skill maintains.
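The same comparison can be scripted in Python, for instance as a CI check. The helper names `strip_timestamp` and `reports_match` are hypothetical; the only detail taken from the shell version is that the timestamp line contains the text `Generated by`.

```python
from pathlib import Path


def strip_timestamp(text: str, marker: str = "Generated by") -> list[str]:
    """Return the report lines, minus any line containing the timestamp marker."""
    return [line for line in text.splitlines() if marker not in line]


def reports_match(first: Path, second: Path) -> bool:
    """True when two rendered reports differ only in their timestamp line."""
    return strip_timestamp(first.read_text(encoding="utf-8")) == strip_timestamp(
        second.read_text(encoding="utf-8")
    )
```

Calling `reports_match(Path("/tmp/report-1.md"), Path("/tmp/report-2.md"))` after the two renders above should return `True` if rendering is stable.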

package/skills/composing-vulnerability-report/references/THEORY.md ADDED

@@ -0,0 +1,178 @@
+# THEORY — Vulnerability Report Structure
+## What a customer actually reads
+A pentest vulnerability report is read by three different people
+in three different ways:
+1. **The customer's security engineer** — reads end to end, in
+   order, looking for specific technical detail per finding.
+2. **The customer's manager** — reads the summary table, then
+   skims headlines of critical / high findings, then trusts the
+   security engineer's recommendation.
+3. **A future reader during compliance audit** — reads to verify
+   the finding existed and the remediation was documented.
+The report has to serve all three audiences with a single layout.
+The structure this skill emits is the result of trial and
+error against those three readers:
+- Top: short header identifying engagement, generation time,
+  sources.
+- Below: summary table — counts by severity.
+- Then: severity-ordered finding sections, CRITICAL first.
+- Each finding: title, severity, target, detail, remediation,
+  evidence, references — in that order. Always.
+The shape is boring on purpose. Surprise in report structure is
+not a feature; the reader should be able to predict where a piece
+of information lives without searching.
+## Why CVSS v3.1 (not CVSS v4 or CWE-only)
+CVSS v3.1 (June 2019) is the de facto industry-standard severity
+scoring system. NVD publishes CVSS v3.x base scores for most recent
+CVEs. SOC2 / ISO 27001 / PCI auditors expect to see CVSS scores.
+CVSS v4.0 was published in November 2023. It addresses real
+problems with v3.1 (better handling of supply-chain attacks,
+clearer attack-vector semantics). Adoption has been slow because
+the existing tooling chain is v3.1-shaped. As of 2026 the practical
+choice is still v3.1.
+CWE alone is a different axis — what KIND of weakness, not how
+SEVERE. Reports include CWE as enrichment when available but use
+CVSS for the severity decision.
+## CVSS vector composition (when the skill derives one)
+When a finding arrives without a CVSS score, the skill does NOT
+guess a precise numeric score (that would falsely imply NVD-level
+authority). Instead it maps severity → severity band, retaining
+the explicit "derived" label so the report's reader knows the score
+isn't NVD-blessed.
+| Severity | Implied CVSS band | Note |
+|---|---|---|
+| CRITICAL | 9.0 - 10.0 | implied by mapping; not authoritative |
+| HIGH | 7.0 - 8.9 | implied |
+| MEDIUM | 4.0 - 6.9 | implied |
+| LOW | 0.1 - 3.9 | implied |
+| INFO | n/a | no score emitted |
+Findings that arrived WITH a CVSS score keep their original value
+unchanged. This is important: a finding with `cvss_score: 7.5`
+from `auditing-python-dependencies` (which extracts it from OSV)
+must not be overwritten by this skill's heuristic.
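The band mapping and the keep-the-original rule can be sketched as below. The `cvss_display` function and its output strings are hypothetical; only the band boundaries, the `cvss_score` field name, and the "derived" label come from the text above.

```python
# Bands follow the table above (the CVSS v3.1 qualitative severity scale).
SEVERITY_BANDS = {
    "CRITICAL": (9.0, 10.0),
    "HIGH": (7.0, 8.9),
    "MEDIUM": (4.0, 6.9),
    "LOW": (0.1, 3.9),
}


def cvss_display(finding: dict) -> str:
    """Render the CVSS text for a finding without inventing a precise score."""
    score = finding.get("cvss_score")
    if score is not None:
        # Source-provided scores are passed through unchanged.
        return f"{score}"
    band = SEVERITY_BANDS.get(str(finding.get("severity", "")).upper())
    if band is None:
        # INFO (or an unknown severity) gets no score at all.
        return "n/a"
    low, high = band
    return f"{low} - {high} (derived)"
```

Returning a labelled band rather than a midpoint keeps the derived value from being mistaken for an authoritative score.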
+## Why fingerprint-based deduplication
+`lib/finding.py` defines `Finding.fingerprint()` as a stable hash
+of `(skill_id, title, target)`. Two findings with the same
+fingerprint are the same finding from the same source about the
+same target, regardless of which scan run produced them.
+The alternative — title-based dedup — fails when:
+- Same skill is run twice (the second invocation produces
+  identical findings)
+- Two different skills produce findings with the same title but
+  different targets (legitimate distinct findings)
+- A skill produces findings with parameterized titles (the
+  title varies but the substance is the same)
+Fingerprint-based dedup is the right primitive. The trade-off:
+fingerprints don't catch SEMANTICALLY-equivalent findings from
+different skills. If `auditing-npm-dependencies` and an external
+tool both surface the same CVE, the fingerprints differ (different
+skill_id). Cross-tool semantic dedup is a separate problem — this
+skill handles within-skill dedup correctly.
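A minimal sketch of fingerprint-based deduplication, under stated assumptions: the actual `Finding.fingerprint()` in `lib/finding.py` is not shown here, so the SHA-256 digest, the separator byte, and the dict-shaped findings are all illustrative choices. Only the `(skill_id, title, target)` key comes from the text above.

```python
import hashlib


def fingerprint(skill_id: str, title: str, target: str) -> str:
    """Stable hex digest of (skill_id, title, target). Assumed scheme: SHA-256."""
    # A unit-separator byte avoids ambiguous joins such as ("ab", "c") vs ("a", "bc").
    payload = "\x1f".join((skill_id, title, target)).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def dedupe(findings: list[dict]) -> list[dict]:
    """Keep the first finding per fingerprint, preserving input order."""
    seen: set[str] = set()
    unique = []
    for f in findings:
        fp = fingerprint(f["skill_id"], f["title"], f["target"])
        if fp not in seen:
            seen.add(fp)
            unique.append(f)
    return unique
```

Because `skill_id` is part of the key, the same CVE reported by two different skills survives as two findings, which matches the trade-off described above.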
+## NIST SP 800-115 vs PTES vs OWASP Testing Guide
+Three frameworks define what a pentest report should contain:
+| Framework | Strengths | Weaknesses |
+|---|---|---|
+| NIST SP 800-115 (US gov-flavored) | Definitive, well-structured | Network-focused; app-pentest sections are thin |
+| PTES | Defines the workflow end-to-end | Report structure is one of many topics; less prescriptive |
+| OWASP Testing Guide (WSTG) | Best on web-app vulnerability classification | Doesn't prescribe report structure deeply |
+The skill's output format is informed by all three but doesn't
+claim conformance to any. The structure is pragmatic: what works
+in real customer engagements as of 2026.
+## Severity-grouping rationale
+The skill groups findings by severity, NOT by skill or by target.
+Reasons:
+- Customers care about severity first. A CRITICAL CVE from a
+  dependency-vulnerability scan, a CRITICAL TLS misconfiguration,
+  and a CRITICAL secret in source all carry the same severity for
+  the reader; they should be co-located.
+- Per-severity grouping makes remediation prioritization
+  trivially visual.
+- Per-skill grouping would hide cross-skill patterns ("multiple
+  high findings in the same target" becomes invisible).
+If the customer specifically wants per-skill grouping (some prefer
+it), this skill's output is structured Markdown that a downstream
+tool can re-group. The skill emits the canonical form; consumers
+re-shape if needed.
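Severity-first grouping can be sketched as follows. The `group_by_severity` function and the dict-shaped findings are hypothetical; the ordering (CRITICAL first) and the INFO section being off by default follow the behaviour described in this document.

```python
from collections import defaultdict

SEVERITY_ORDER = ["CRITICAL", "HIGH", "MEDIUM", "LOW", "INFO"]


def group_by_severity(
    findings: list[dict], include_info: bool = False
) -> list[tuple[str, list[dict]]]:
    """Return (severity, findings) pairs, CRITICAL first, skipping empty groups."""
    buckets = defaultdict(list)
    for f in findings:
        buckets[f["severity"].upper()].append(f)
    # INFO is only rendered when explicitly requested.
    order = SEVERITY_ORDER if include_info else SEVERITY_ORDER[:-1]
    return [(sev, buckets[sev]) for sev in order if buckets[sev]]
```

A downstream tool that prefers per-skill or per-target grouping could apply the same bucket-then-order approach with a different key.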
+## Stable anchors for cross-reference
+Each finding subsection in the report has a stable HTML anchor
+based on the fingerprint:
+```html
+<a id="finding-a1b2c3d4e5f6"></a>
+```
+The executive-summary skill and the OWASP-mapping skill reference
+findings by this anchor:
+```markdown
+[See finding](vulnerability-report.md#finding-a1b2c3d4e5f6)
+```
+The anchor is stable across regenerations of the report (because
+the fingerprint is stable). If the report is regenerated after
+adding new findings, existing finding anchors don't change —
+only new anchors are added.
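The anchor and the cross-reference link can both be derived from the fingerprint, as in this sketch. The function names are hypothetical, and truncating the fingerprint to 12 hex characters is an assumption inferred from the example id shown above.

```python
def finding_anchor_id(fingerprint_hex: str, length: int = 12) -> str:
    """Anchor id derived from a fingerprint; the 12-char prefix is an assumption."""
    return f"finding-{fingerprint_hex[:length]}"


def anchor_tag(fingerprint_hex: str) -> str:
    """HTML anchor placed above a finding's subsection."""
    return f'<a id="{finding_anchor_id(fingerprint_hex)}"></a>'


def cross_reference(fingerprint_hex: str, report: str = "vulnerability-report.md") -> str:
    """Markdown link another document can use to point at the finding."""
    return f"[See finding]({report}#{finding_anchor_id(fingerprint_hex)})"
```

Since the id depends only on the fingerprint, regenerating the report with additional findings leaves existing links valid.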
+## Required-field policy
+The skill refuses to include findings missing any of `title`,
+`severity`, `target`, `detail`, or `remediation`. Each dropped finding
+is reported as a HIGH operational finding from this skill itself,
+which surfaces the problem to the operator instead of silently
+swallowing the broken record.
+This is conservative on purpose. A pentest report that includes
+malformed findings is unprofessional and undermines the engagement.
+Better to surface the problem with a loud error than to ship
+broken records.
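The required-field policy might be implemented along these lines. The `partition_findings` function and the shape of the operational finding are hypothetical; the five required field names and the HIGH severity of the resulting operational finding come from the policy described above.

```python
REQUIRED_FIELDS = ("title", "severity", "target", "detail", "remediation")


def partition_findings(findings: list[dict]) -> tuple[list[dict], list[dict]]:
    """Split findings into (accepted, operational findings for rejected records)."""
    accepted, op_findings = [], []
    for index, f in enumerate(findings):
        # A field counts as missing when absent or empty.
        missing = [name for name in REQUIRED_FIELDS if not f.get(name)]
        if not missing:
            accepted.append(f)
            continue
        # The record is dropped, but the drop itself is reported loudly.
        op_findings.append({
            "severity": "HIGH",
            "title": "Finding dropped: missing required fields",
            "detail": f"Record {index} is missing: {', '.join(missing)}",
        })
    return accepted, op_findings
```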
+## Why `--include-info` defaults to off
+INFO-severity findings are operational chatter (no vulnerabilities
+found, scanner ran cleanly, etc.). Customer reports should NOT
+include them in the main severity sections — the customer cares
+about findings, not about confirmation that the scan ran.
+`--include-info` adds an INFO section at the bottom of the report
+for operational transparency. Useful for SOC2 evidence packages
+where auditors want to confirm every scan ran.
+## Generation timestamp + reproducibility
+The header includes the generation timestamp in ISO-8601. The
+report is reproducible from the same source findings: if you
+re-run with the same inputs, the report content is byte-identical
+except for the timestamp.
+This matters for evidence: the customer's audit team needs to
+verify "report-as-shipped" against "report-from-source-now."
+Stable rendering = trustable comparison.