PyPI - websec-validator - Versions diffs - 0.2.1__tar.gz → 0.2.3__tar.gz - Mend

websec-validator 0.2.1 tar.gz → 0.2.3 tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (62)

{websec_validator-0.2.1/src/websec_validator.egg-info → websec_validator-0.2.3}/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: websec-validator
-Version: 0.2.1
+Version: 0.2.3
 Summary: Local-first security recon that briefs your AI coding agent: facts + tailored probe scripts, code-in / artifacts-out. No LLM, no server, no running app.
 Author: Ricardo Accioly
 License: MIT
@@ -21,7 +21,7 @@ Dynamic: license-file
 It is *not* an autonomous scanner and *not* a SaaS. It's the missing front-half: the thing that
 turns a repo into a precise, fact-grounded security brief an AI agent (with a human in the loop)
 can act on — an auto-filled, repo-aware version of a senior pentester's "here's what to test and
-how" handoff. Full landscape + why this niche is real: [`MARKET-ANALYSIS-AND-VERDICT.md`](MARKET-ANALYSIS-AND-VERDICT.md).
+how" handoff. How it works + the reasoning behind every check: [`docs/METHODOLOGY.md`](docs/METHODOLOGY.md).
 ## Quickstart — just point it at your repo
@@ -37,7 +37,7 @@ local. The four ways to get there, all ending in the same `AGENT-BRIEFING.md` yo
 | **Tell your agent** (simplest) | — | say the line above |
 | **CLI** (a terminal) | `pipx install websec-validator` | `websec run /path/to/your/app` |
 | **Claude Code plugin** (slash) | `/plugin marketplace add raccioly/websec-validator`  →  `/plugin install websec-validator@websec-plugins` | invoke the **security-pass** skill, or just ask |
-| **Docker** (no install) | `docker build -t websec-validator .` | `docker run --rm -v "$PWD:/scan" websec-validator run /scan --out /scan/websec-out` |
+| **Docker** (no install) | `docker build -t websec-validator .` | `docker run --rm --user "$(id -u):$(id -g)" -v "$PWD:/scan" websec-validator run /scan --out /scan/websec-out` |
 ➡️ **Want the reasoning behind every check?** Read **[docs/METHODOLOGY.md](docs/METHODOLOGY.md)** — what each test does and why.
@@ -63,7 +63,7 @@ No need to install Noir or any scanner — the image bundles them all (arch-awar
 ```bash
 docker build -t websec-validator .
-docker run --rm -v "$PWD:/scan" websec-validator run /scan --out /scan/websec-out
+docker run --rm --user "$(id -u):$(id -g)" -v "$PWD:/scan" websec-validator run /scan --out /scan/websec-out
 ```
 The image carries Noir + Trivy + Gitleaks + Semgrep + Checkov; mount your repo at `/scan` and the
@@ -171,9 +171,9 @@ the next dynamic probes (explicitly gated — they mutate).
 ## Validated on
-HugoCross (Next.js), `wu-whatsappinbox` (106-service Express/AWS monorepo), VAmPI, NodeGoat, DVGA —
-independently reproducing a hand-done pentest's findings (tenant boundary, SSO-endpoint SSRF, media
-upload, conversation-BOLA routes, roles).
+A production Next.js app, a large Express/AWS monorepo, and the VAmPI / NodeGoat / DVGA vuln-app
+corpus — independently reproducing a hand-done pentest's findings (tenant boundary, SSRF, file
+upload, cross-tenant BOLA, role/authz gaps).
 ## Tests
@@ -186,10 +186,12 @@ python3 -m unittest discover -s tests    # stdlib only, no Noir/network — 23 t
 Published to PyPI via **Trusted Publishing** (OIDC — no API token in the repo). To cut a release:
 ```bash
-# 1. bump the version in pyproject.toml (e.g. 0.2.0 → 0.2.1)
+# 1. bump the version in pyproject.toml (e.g. 0.2.1 → 0.2.2)
 # 2. tag it and push — the tag must match pyproject's version (CI verifies):
-git tag v0.2.1 && git push origin v0.2.1
-# → .github/workflows/publish.yml builds + publishes to PyPI
+git tag v0.2.2 && git push origin v0.2.2
+# → publish.yml builds, INSTALLS + smoke-tests the wheel (version match,
+#   calibration ships, a real `websec run`), then publishes. A bad build fails
+#   CI instead of reaching PyPI — so you never have to yank after the fact.
 ```
 One-time PyPI setup (before the first release): on pypi.org → **Account → Publishing → Add a pending
@@ -225,8 +227,27 @@ lets you just ask, in plain English, for a security pass: it runs `websec`, read
 works the findings with you. For other agents the universal interface is unchanged: run the CLI, read
 `AGENT-BRIEFING.md`.
+**Install gotchas (field-tested):**
+- The install id is `plugin@marketplace` — `websec-validator@websec-plugins` (the marketplace name
+  from `.claude-plugin/marketplace.json`), **not** `@websec-validator` (the repo).
+- The plugin only delivers the *instructions*; the actual scanning is a **separate Python CLI**
+  (`websec`). The skill's Step 0 installs it (`pipx install websec-validator`) if it's missing.
+- **`/plugin …` only works in the terminal CLI.** In the Claude **app / Agent SDK** (no `/plugin`),
+  configure it in `.claude/settings.json` instead:
+  ```json
+  {
+    "extraKnownMarketplaces": {
+      "websec-plugins": { "source": { "source": "github", "repo": "raccioly/websec-validator" } }
+    },
+    "enabledPlugins": { "websec-validator@websec-plugins": true }
+  }
+  ```
+  This **registers + enables** the plugin but does **not** auto-fetch it — the first download still
+  needs the CLI (`/plugin install websec-validator@websec-plugins`) once. (Project `.claude/settings.json`
+  for a team; `~/.claude/settings.json` for just you.)
 ## Credits
-Methodology + probe library come from a real authenticated pentest pass
-([`base-research/REPLICATION-PLAYBOOK.md`](base-research/REPLICATION-PLAYBOOK.md), not committed).
-This tool productizes that hand-written pass into something an AI agent can run on any repo.
+Methodology + probe library are distilled from a real authenticated penetration-testing pass.
+This tool productizes that hand-written methodology into something an AI agent can run on any repo.
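
The "Install gotchas" hunk above asks app / Agent SDK users to add two keys to `.claude/settings.json`. A minimal sketch of folding that snippet into settings that already exist, without dropping other entries, might look like the following. The helper name `merge_settings` and the pre-existing plugin entry are hypothetical and only illustrate the merge; the two keys and their values are copied from the README.

```python
import json

# The two keys from the README snippet, as a Python dict.
SNIPPET = {
    "extraKnownMarketplaces": {
        "websec-plugins": {"source": {"source": "github", "repo": "raccioly/websec-validator"}}
    },
    "enabledPlugins": {"websec-validator@websec-plugins": True},
}


def merge_settings(existing: dict, snippet: dict) -> dict:
    """Return a copy of `existing` with the snippet's keys merged in.

    Assumes both top-level values are JSON objects (dicts), as in the README.
    """
    merged = dict(existing)
    for key, value in snippet.items():
        merged[key] = {**existing.get(key, {}), **value}
    return merged


# Hypothetical settings file that already enables another plugin.
existing = {"enabledPlugins": {"some-other-plugin@example": True}}
merged = merge_settings(existing, SNIPPET)
print(json.dumps(merged, indent=2))
```

As the README notes, this only registers and enables the plugin; the first download still needs `/plugin install websec-validator@websec-plugins` from the terminal CLI.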

{websec_validator-0.2.1 → websec_validator-0.2.3}/README.md RENAMED

@@ -9,7 +9,7 @@
 It is *not* an autonomous scanner and *not* a SaaS. It's the missing front-half: the thing that
 turns a repo into a precise, fact-grounded security brief an AI agent (with a human in the loop)
 can act on — an auto-filled, repo-aware version of a senior pentester's "here's what to test and
-how" handoff. Full landscape + why this niche is real: [`MARKET-ANALYSIS-AND-VERDICT.md`](MARKET-ANALYSIS-AND-VERDICT.md).
+how" handoff. How it works + the reasoning behind every check: [`docs/METHODOLOGY.md`](docs/METHODOLOGY.md).
 ## Quickstart — just point it at your repo
@@ -25,7 +25,7 @@ local. The four ways to get there, all ending in the same `AGENT-BRIEFING.md` yo
 | **Tell your agent** (simplest) | — | say the line above |
 | **CLI** (a terminal) | `pipx install websec-validator` | `websec run /path/to/your/app` |
 | **Claude Code plugin** (slash) | `/plugin marketplace add raccioly/websec-validator`  →  `/plugin install websec-validator@websec-plugins` | invoke the **security-pass** skill, or just ask |
-| **Docker** (no install) | `docker build -t websec-validator .` | `docker run --rm -v "$PWD:/scan" websec-validator run /scan --out /scan/websec-out` |
+| **Docker** (no install) | `docker build -t websec-validator .` | `docker run --rm --user "$(id -u):$(id -g)" -v "$PWD:/scan" websec-validator run /scan --out /scan/websec-out` |
 ➡️ **Want the reasoning behind every check?** Read **[docs/METHODOLOGY.md](docs/METHODOLOGY.md)** — what each test does and why.
@@ -51,7 +51,7 @@ No need to install Noir or any scanner — the image bundles them all (arch-awar
 ```bash
 docker build -t websec-validator .
-docker run --rm -v "$PWD:/scan" websec-validator run /scan --out /scan/websec-out
+docker run --rm --user "$(id -u):$(id -g)" -v "$PWD:/scan" websec-validator run /scan --out /scan/websec-out
 ```
 The image carries Noir + Trivy + Gitleaks + Semgrep + Checkov; mount your repo at `/scan` and the
@@ -159,9 +159,9 @@ the next dynamic probes (explicitly gated — they mutate).
 ## Validated on
-HugoCross (Next.js), `wu-whatsappinbox` (106-service Express/AWS monorepo), VAmPI, NodeGoat, DVGA —
-independently reproducing a hand-done pentest's findings (tenant boundary, SSO-endpoint SSRF, media
-upload, conversation-BOLA routes, roles).
+A production Next.js app, a large Express/AWS monorepo, and the VAmPI / NodeGoat / DVGA vuln-app
+corpus — independently reproducing a hand-done pentest's findings (tenant boundary, SSRF, file
+upload, cross-tenant BOLA, role/authz gaps).
 ## Tests
@@ -174,10 +174,12 @@ python3 -m unittest discover -s tests    # stdlib only, no Noir/network — 23 t
 Published to PyPI via **Trusted Publishing** (OIDC — no API token in the repo). To cut a release:
 ```bash
-# 1. bump the version in pyproject.toml (e.g. 0.2.0 → 0.2.1)
+# 1. bump the version in pyproject.toml (e.g. 0.2.1 → 0.2.2)
 # 2. tag it and push — the tag must match pyproject's version (CI verifies):
-git tag v0.2.1 && git push origin v0.2.1
-# → .github/workflows/publish.yml builds + publishes to PyPI
+git tag v0.2.2 && git push origin v0.2.2
+# → publish.yml builds, INSTALLS + smoke-tests the wheel (version match,
+#   calibration ships, a real `websec run`), then publishes. A bad build fails
+#   CI instead of reaching PyPI — so you never have to yank after the fact.
 ```
 One-time PyPI setup (before the first release): on pypi.org → **Account → Publishing → Add a pending
@@ -213,8 +215,27 @@ lets you just ask, in plain English, for a security pass: it runs `websec`, read
 works the findings with you. For other agents the universal interface is unchanged: run the CLI, read
 `AGENT-BRIEFING.md`.
+**Install gotchas (field-tested):**
+- The install id is `plugin@marketplace` — `websec-validator@websec-plugins` (the marketplace name
+  from `.claude-plugin/marketplace.json`), **not** `@websec-validator` (the repo).
+- The plugin only delivers the *instructions*; the actual scanning is a **separate Python CLI**
+  (`websec`). The skill's Step 0 installs it (`pipx install websec-validator`) if it's missing.
+- **`/plugin …` only works in the terminal CLI.** In the Claude **app / Agent SDK** (no `/plugin`),
+  configure it in `.claude/settings.json` instead:
+  ```json
+  {
+    "extraKnownMarketplaces": {
+      "websec-plugins": { "source": { "source": "github", "repo": "raccioly/websec-validator" } }
+    },
+    "enabledPlugins": { "websec-validator@websec-plugins": true }
+  }
+  ```
+  This **registers + enables** the plugin but does **not** auto-fetch it — the first download still
+  needs the CLI (`/plugin install websec-validator@websec-plugins`) once. (Project `.claude/settings.json`
+  for a team; `~/.claude/settings.json` for just you.)
 ## Credits
-Methodology + probe library come from a real authenticated pentest pass
-([`base-research/REPLICATION-PLAYBOOK.md`](base-research/REPLICATION-PLAYBOOK.md), not committed).
-This tool productizes that hand-written pass into something an AI agent can run on any repo.
+Methodology + probe library are distilled from a real authenticated penetration-testing pass.
+This tool productizes that hand-written methodology into something an AI agent can run on any repo.

{websec_validator-0.2.1 → websec_validator-0.2.3}/pyproject.toml RENAMED

@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"
 [project]
 name = "websec-validator"
-version = "0.2.1"
+version = "0.2.3"
 description = "Local-first security recon that briefs your AI coding agent: facts + tailored probe scripts, code-in / artifacts-out. No LLM, no server, no running app."
 readme = "README.md"
 requires-python = ">=3.11"

{websec_validator-0.2.1 → websec_validator-0.2.3}/src/websec_validator/briefing.py RENAMED

@@ -164,6 +164,11 @@ Production source maps exposed: {client.get("production_source_maps", False)}
 Scanners available: {avail}
+> ⚠️ The count below is **raw scanner output (pre-triage)** — expect mostly noise (vulnerable-looking
+> patterns that are guarded, intended-public, or not exploitable). The **triaged, calibrated view** is the
+> findings ledger in `REPORT.md` / `findings-ledger.json` — each finding there carries a `P(real)`. Start
+> from the ledger and debate-verify; don't report these raw counts as vulnerabilities.
 {findings_block}
 Install for fuller coverage:

{websec_validator-0.2.1 → websec_validator-0.2.3}/src/websec_validator/cli.py RENAMED

@@ -111,7 +111,7 @@ def cmd_run(args) -> int:
     # 3. probes: choose + stage
     chosen = probes.applicable(facts)
-    manifest = probes.stage(chosen, out)
+    manifest = probes.stage(chosen, out, facts)
     print(f"\n  staged {len([m for m in manifest if 'attack_class' in m])} tailored probe template(s) → {out / 'probes'}")
     # 4. traceable findings ledger (recon + static; dynamic merges in via `websec dynamic`)
@@ -156,12 +156,16 @@ def cmd_dynamic(args) -> int:
         dyn = dynamic.run_unauth(args.target, facts_path, out, probe_writes=args.probe_writes)
         u = dyn["unauth_reachability"]
         print(f"  target: {u['target']}  ·  → {u['summary']}")
+        if u.get("warning"):
+            print(f"\n  {u['warning']}\n")
         for r in u["results"]:
             mark = "🔓" if r["verdict"] == "OPEN-no-auth" else (" ·" if r["verdict"] == "protected" else "  ")
             print(f"    {mark} {str(r['status']):>4}  {r['verdict']:26} {r['path']}")
         if args.probe_writes:
             w = dyn["write_auth_enforcement"]
             print(f"\n  write-verb auth enforcement → {w['summary']}")
+            if w.get("warning"):
+                print(f"\n  {w['warning']}\n")
             for r in w["results"]:
                 mark = "🔓" if r["verdict"] != "auth-enforced" and not r["verdict"].startswith("http-") else " ·"
                 print(f"    {mark} {str(r['status']):>4}  {r['verdict']:42} {r['method']} {r['path']}")

{websec_validator-0.2.1 → websec_validator-0.2.3}/src/websec_validator/dynamic.py RENAMED

@@ -138,6 +138,19 @@ SIDE_EFFECTING = re.compile(
     r"sponsor-post|upload|/refresh|/rebuild|/process|/dispatch|/import|/export|/scrape(?![\w-])", re.I)
+# When NOTHING enforces auth, the likeliest cause in a test env is a fail-OPEN auth
+# provider (unconfigured/erroring), not "the app has no auth". Say so loudly — a naive
+# read of all-200s as "wide open" is a catastrophic false positive.
+FAIL_OPEN_WARNING = (
+    "⚠ NO endpoint enforced auth (none returned 401/403). Before concluding authentication is missing, "
+    "RULE OUT a fail-OPEN test environment: an unconfigured or erroring auth provider "
+    "(Cognito/Auth0/NextAuth/…) can let every request through. Configure a valid (even dummy) provider, or "
+    "mock a session, and RE-RUN — if these flip to 401, the app is fine and the env was the bug. Until an "
+    "auth-enforced response is observed, treat ALL authN/authZ results here as UNTRUSTWORTHY. (If it stays "
+    "open WITH a working provider, that's a real finding: the middleware should fail CLOSED — deny on auth error.)"
+)
 def unauth_reachability(target: str, facts: dict, max_endpoints: int = 50) -> dict:
     """STRICT read-only: GET each genuine data-read endpoint with NO auth, to see
     which are reachable unauthenticated. Skips side-effecting GETs and any path
@@ -170,6 +183,8 @@ def unauth_reachability(target: str, facts: dict, max_endpoints: int = 50) -> di
         results.append({"path": path, "status": code, "bytes": n, "verdict": verdict})
     openish = [r for r in results if r["verdict"] == "OPEN-no-auth"]
+    protected = [r for r in results if r["verdict"] in ("protected", "redirect (likely to login)")]
+    fail_open = len(results) >= 3 and not protected and bool(openish)
     return {
         "target": target,
         "mode": "STRICT read-only · unauthenticated · GET-only · side-effecting paths skipped",
@@ -177,8 +192,12 @@ def unauth_reachability(target: str, facts: dict, max_endpoints: int = 50) -> di
         "skipped_side_effecting": sorted(set(skipped)),
         "open_no_auth": openish,
         "results": results,
+        "fail_open_suspected": fail_open,
+        "authn_trustworthy": not fail_open,
+        "warning": FAIL_OPEN_WARNING if fail_open else "",
         "summary": f"{len(openish)}/{len(results)} data-read GET endpoints reachable WITHOUT auth"
-                   + (" — review whether these should be public" if openish else " — all gated"),
+                   + (" — review whether these should be public" if openish else " — all gated")
+                   + ("  ·  ⚠ FAIL-OPEN SUSPECTED (nothing enforced auth — results untrustworthy)" if fail_open else ""),
     }
@@ -216,6 +235,7 @@ def write_auth_enforcement(target: str, facts: dict, max_endpoints: int = 80) ->
     missing = [r for r in results if r["verdict"] != "auth-enforced" and not r["verdict"].startswith("http-")]
     executed = [r for r in results if r["verdict"] == "EXECUTED-UNAUTH"]
     enforced = sum(1 for r in results if r["verdict"] == "auth-enforced")
+    fail_open = len(results) >= 3 and enforced == 0
     return {
         "note": "Heuristic: a protected route returns 401/403 BEFORE validation; a 400/404 unauth means "
                 "the request reached the handler with no auth gate. VERIFY each — but inconsistency vs "
@@ -225,8 +245,12 @@ def write_auth_enforcement(target: str, facts: dict, max_endpoints: int = 80) ->
         "no_auth_gate": missing,
         "executed_unauth": executed,
         "results": results,
+        "fail_open_suspected": fail_open,
+        "authn_trustworthy": not fail_open,
+        "warning": FAIL_OPEN_WARNING if fail_open else "",
         "summary": f"{enforced}/{len(results)} write endpoints enforce auth · "
-                   f"{len(missing)} reached with no auth gate · {len(executed)} executed unauthenticated",
+                   f"{len(missing)} reached with no auth gate · {len(executed)} executed unauthenticated"
+                   + ("  ·  ⚠ FAIL-OPEN SUSPECTED — results untrustworthy" if fail_open else ""),
     }
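
The fail-open heuristic added to `unauth_reachability` above can be read in isolation. The sketch below restates that condition as a standalone function so its behaviour on small inputs is easy to check. The function name `fail_open_suspected` and the sample result lists are illustrative; the verdict strings and the threshold of three results are taken from the hunk.

```python
def fail_open_suspected(results: list[dict]) -> bool:
    """Mirror of the read-only check: at least 3 probes, none protected, some open."""
    openish = [r for r in results if r["verdict"] == "OPEN-no-auth"]
    protected = [r for r in results
                 if r["verdict"] in ("protected", "redirect (likely to login)")]
    return len(results) >= 3 and not protected and bool(openish)


# Illustrative inputs (paths are made up for the example).
all_open = [{"path": f"/api/item/{i}", "verdict": "OPEN-no-auth"} for i in range(3)]
mixed = all_open + [{"path": "/api/admin", "verdict": "protected"}]
too_few = all_open[:2]

print(fail_open_suspected(all_open))
print(fail_open_suspected(mixed))
print(fail_open_suspected(too_few))
```

One protected response is enough to clear the flag, since it shows the auth layer is resolving in the test environment. The write-verb variant in the same file uses a simpler form of the rule: at least three results and zero `auth-enforced` verdicts.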

{websec_validator-0.2.1 → websec_validator-0.2.3}/src/websec_validator/extractors/auth.py RENAMED

@@ -47,7 +47,7 @@ class AuthExtractor(Extractor):
         # Detect ALL schemes present, then pick a primary by priority. A JWT app
         # that also wires Passport for SSO must read as primary=jwt, not passport
-        # (the bug the WhatsApp app exposed). Priority: nextauth > jwt > session > passport > api-key.
+        # (Passport is often SSO-only). Priority: nextauth > jwt > session > passport > api-key.
         detected = []
         if nextauth:
             detected.append("nextauth (session JWT in cookie)")

{websec_validator-0.2.1 → websec_validator-0.2.3}/src/websec_validator/extractors/authz.py RENAMED

@@ -34,6 +34,14 @@ GLOBAL_AUTH = re.compile(
     r"app\.use\s*\(\s*[\w.]*(?:authenticate|requireAuth|authMiddleware|verifyToken|"
     r"isAuthenticated|jwtMiddleware|ensureAuth)\w*\s*\)", re.I)
+# Does a Next.js middleware/proxy file actually enforce AUTH (vs. i18n/headers only)?
+# `auth((req)=>…)` / `withAuth` / `req.auth` / getToken / getServerSession / redirect-to-login /
+# a 401 / Clerk / Supabase updateSession all signal a global auth gate.
+MW_AUTH = re.compile(
+    r"\bauth\s*\(|withAuth\b|req\.auth\b|getToken\s*\(|getServerSession\s*\(|clerkMiddleware|"
+    r"updateSession\s*\(|NextResponse\.redirect\([^)]*(?:login|signin)|status:\s*401|"
+    r"['\"]Authentication required['\"]", re.I)
 PUBLIC_HINT = re.compile(
     r"/(login|logout|register|signup|signin|health|healthz|ping|status|webhooks?|"
     r"public|\.well-known|robots|favicon|sitemap|callback|refresh|csrf|metrics)\b", re.I)
@@ -46,15 +54,20 @@ ROLE = re.compile(
 def _parse_next_middleware(ctx: RepoContext) -> dict:
-    for cand in ("middleware.ts", "middleware.js", "src/middleware.ts", "src/middleware.js"):
+    # Next 15.5+/16 renamed `middleware.ts` → `proxy.ts` (both filenames are valid; the
+    # framework recognizes either). Missing this made the tool report "no global auth" on
+    # Next 16 apps and flag every handler — the single biggest false-positive cluster.
+    for cand in ("middleware.ts", "middleware.js", "src/middleware.ts", "src/middleware.js",
+                 "proxy.ts", "proxy.js", "src/proxy.ts", "src/proxy.js"):
         txt = ctx.manifest(cand)
         if not txt:
             continue
         matchers = re.findall(r"matcher\s*:\s*\[([^\]]*)\]", txt)
         patterns = re.findall(r"['\"]([^'\"]+)['\"]", matchers[0]) if matchers else []
         roles = [m for grp in ROLE.findall(txt) for m in grp if m]
-        return {"present": True, "file": cand, "matchers": patterns, "role_checks": roles}
-    return {"present": False, "matchers": []}
+        return {"present": True, "file": cand, "matchers": patterns,
+                "is_auth": bool(MW_AUTH.search(txt)), "role_checks": roles}
+    return {"present": False, "matchers": [], "is_auth": False}
 def _matcher_covers(path: str, matchers: list) -> bool:
@@ -85,8 +98,10 @@ class AuthzExtractor(Extractor):
     def extract(self, ctx: RepoContext, facts: dict) -> dict:
         endpoints = (facts.get("routes") or {}).get("endpoints", [])
         mw = _parse_next_middleware(ctx)
+        mw_auth = mw.get("is_auth", False)
-        global_auth = any(GLOBAL_AUTH.search(t) for _p, _r, t in ctx.iter_code())
+        # global auth = an Express path-less auth middleware OR a Next auth middleware/proxy
+        global_auth = mw_auth or any(GLOBAL_AUTH.search(t) for _p, _r, t in ctx.iter_code())
         roles: set = set(mw.get("role_checks", []))
         protected = no_guard = unknown = 0
         no_guard_writes, egs = [], []
@@ -95,7 +110,10 @@ class AuthzExtractor(Extractor):
             cp = e.get("code_path", "")
             text = ctx.text(Path(cp)) if cp else ""
             _collect_roles(text, roles)
-            guarded = bool(text and GUARD.search(text)) or _matcher_covers(e.get("path", ""), mw.get("matchers", []))
+            # a matcher only counts as a guard when the middleware actually does auth — a
+            # non-auth middleware.ts (i18n/headers) must NOT mark routes protected.
+            guarded = bool(text and GUARD.search(text)) or \
+                (mw_auth and _matcher_covers(e.get("path", ""), mw.get("matchers", [])))
             relcp = ctx.rel(Path(cp)) if cp else ""
             egs.append({"method": e.get("method"), "path": e.get("path"), "code_path": relcp,
                         "guarded": bool(guarded), "analyzed": bool(text),
@@ -110,10 +128,12 @@ class AuthzExtractor(Extractor):
                     no_guard_writes.append(f"{e['method']} {e['path']}  ({relcp or '?'})")
         if global_auth:
-            note = ("A GLOBAL auth middleware (`app.use(<auth>)`) was detected — most routes are likely "
-                    "protected by default. The list below is write endpoints with NO guard visible in their "
-                    "own handler file; they MAY be covered globally. Verify each is either covered or an "
-                    "intentional public exemption — don't assume they're vulnerable.")
+            where = f"`{mw['file']}` (matcher {mw.get('matchers') or '—'})" if mw_auth else "`app.use(<auth>)`"
+            note = (f"A GLOBAL auth middleware ({where}) was detected — most routes are protected by default. "
+                    "Endpoints its matcher covers are reported as guarded (defense-in-depth handled centrally). "
+                    "Any list below is write endpoints with NO guard visible in their own handler file AND not "
+                    "covered by the matcher; verify each is either covered or an intentional public exemption — "
+                    "don't assume they're vulnerable.")
         else:
             note = ("No global auth middleware detected. Write endpoints with no visible guard are "
                     "high-signal missing-authz leads — verify each.")
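
The key behavioural change in this file is that a middleware matcher only counts as a guard when the middleware file actually does auth. The sketch below copies the `MW_AUTH` pattern from the hunk and runs it against two made-up middleware files, one i18n-only and one with an auth gate. The sample file contents and the helper names `is_auth_middleware` and `route_guarded` are illustrative and not part of the package.

```python
import re

# Pattern copied from the diff above.
MW_AUTH = re.compile(
    r"\bauth\s*\(|withAuth\b|req\.auth\b|getToken\s*\(|getServerSession\s*\(|clerkMiddleware|"
    r"updateSession\s*\(|NextResponse\.redirect\([^)]*(?:login|signin)|status:\s*401|"
    r"['\"]Authentication required['\"]", re.I)

# Illustrative middleware sources (JavaScript held as Python strings).
I18N_ONLY = '''
import createMiddleware from "next-intl/middleware";
export default createMiddleware({ locales: ["en", "de"], defaultLocale: "en" });
export const config = { matcher: ["/dashboard/:path*"] };
'''

AUTH_GATE = '''
import { auth } from "./auth";
export default auth((req) => {
  if (!req.auth) return Response.redirect(new URL("/login", req.url));
});
export const config = { matcher: ["/dashboard/:path*"] };
'''


def is_auth_middleware(text: str) -> bool:
    """Same test the hunk stores as `is_auth`."""
    return bool(MW_AUTH.search(text))


def route_guarded(handler_guard: bool, mw_auth: bool, matcher_covers: bool) -> bool:
    """The new rule: a matcher hit only protects a route if the middleware does auth."""
    return handler_guard or (mw_auth and matcher_covers)


print(is_auth_middleware(I18N_ONLY), is_auth_middleware(AUTH_GATE))
print(route_guarded(False, is_auth_middleware(I18N_ONLY), True))
print(route_guarded(False, is_auth_middleware(AUTH_GATE), True))
```

Both sample files declare the same matcher, so before this change they would have marked the same routes as protected. With the `mw_auth` condition, only the file that calls `auth(...)` does.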

{websec_validator-0.2.1 → websec_validator-0.2.3}/src/websec_validator/findings.py RENAMED

@@ -111,6 +111,11 @@ def build_ledger(facts: dict, unified: dict | None, dynamic: dict | None = None,
                  ((dynamic or {}).get("write_auth_enforcement", {}) or {}).get("results", [])}
     dyn_get = {r["path"]: r for r in
                ((dynamic or {}).get("unauth_reachability", {}) or {}).get("results", [])}
+    # If the dynamic run suspects a fail-OPEN test env, its unauth "successes" are untrustworthy —
+    # do NOT escalate them to CRITICAL (the catastrophic-false-positive trap). Fall back to the
+    # recon-level hypothesis with a caveat until the operator re-runs with auth resolving.
+    dyn_fail_open = bool(((dynamic or {}).get("write_auth_enforcement", {}) or {}).get("fail_open_suspected")
+                         or ((dynamic or {}).get("unauth_reachability", {}) or {}).get("fail_open_suspected"))
     for eg in authz.get("endpoint_guards", []):
         if eg.get("guarded") or eg.get("public_hint") or not eg.get("analyzed"):
             continue
@@ -121,7 +126,12 @@ def build_ledger(facts: dict, unified: dict | None, dynamic: dict | None = None,
         dv = dyn_write.get((m, p)) or dyn_get.get(p)
         if dv:
             verdict = dv.get("verdict", "")
-            if "EXECUTED-UNAUTH" in verdict:
+            if dyn_fail_open and verdict not in ("auth-enforced", "protected"):
+                ev.append({"layer": "dynamic", "detail": f"reached unauthenticated (HTTP {dv.get('status')}) — "
+                           "BUT fail-open suspected (auth not resolving in the test env); UNTRUSTWORTHY, "
+                           "re-run with a working auth provider before trusting this"})
+                # keep recon-level conf/sev; do not escalate
+            elif "EXECUTED-UNAUTH" in verdict:
                 ev.append({"layer": "dynamic", "detail": f"{m} executed UNAUTHENTICATED (HTTP {dv.get('status')})"})
                 conf, sev = "HIGH", "CRITICAL"
             elif "no-auth-gate" in verdict or verdict == "OPEN-no-auth":
@@ -151,11 +161,22 @@ def build_ledger(facts: dict, unified: dict | None, dynamic: dict | None = None,
                       [{"layer": "static", "detail": f"{'+'.join(t.get('tools', []))}: {t.get('title','')}"}]))
     # ---- 3. Attack-surface sinks (recon hypotheses) ----
+    # On a purely-NoSQL datastore, classic SQL-injection alerts are almost always FPs —
+    # down-rank them (the inflation the field test flagged) rather than ranking them MEDIUM.
+    _ds = {d.lower() for d in (facts.get("stack", {}).get("datastores") or [])}
+    _nosql = {"dynamodb", "dynamo", "mongodb", "mongo", "firestore", "cosmos", "cosmosdb", "couchdb", "cassandra"}
+    _sql = {"postgres", "postgresql", "mysql", "mariadb", "sqlite", "mssql", "sqlserver", "aurora", "oracle", "cockroach"}
+    is_nosql_only = bool(_ds & _nosql) and not (_ds & _sql)
     for cls, info in (facts.get("surface", {}).get("sinks", {}) or {}).items():
+        sev = "MEDIUM"
+        ev = [{"layer": "recon", "detail": f"user-input-gated {cls} in {info.get('count')} file(s)"}]
+        if cls in ("sqli", "sql-injection") and is_nosql_only:
+            sev = "LOW"
+            ev.append({"layer": "recon", "detail": f"datastore is {', '.join(sorted(_ds)) or 'NoSQL'} — "
+                       "classic SQLi is unlikely here; check for NoSQL injection instead (usually a false positive)"})
         out.append(_f(f"{cls} sink ({info.get('count')} site(s))", "attack-surface",
-                      cls if cls in STANDARDS else "sast", "MEDIUM", "LOW",
-                      (info.get("files") or ["?"])[0],
-                      [{"layer": "recon", "detail": f"user-input-gated {cls} in {info.get('count')} file(s)"}]))
+                      cls if cls in STANDARDS else "sast", sev, "LOW",
+                      (info.get("files") or ["?"])[0], ev))
     # ---- 4. Client-side secret exposure (HIGH — ships to browser) ----
     for leak in (facts.get("client_exposure", {}).get("public_secret_leaks", []) +

websec_validator-0.2.3/src/websec_validator/probes.py ADDED

@@ -0,0 +1,161 @@
+"""Stage the probe library, tailored to the extracted attack surface.
+Probe selection is driven by the real recon facts. Staging now also writes a
+`probe-context.json` (the target's REAL routes/auth/sensitive-fields/tenant key,
+from FACTS) next to the probes, prepends a "this is a draft — your surface is in
+probe-context.json" banner to each, and records the real per-probe target endpoints
+in the manifest — so the staged probes describe *this* app, not the reference app
+the templates were authored against.
+"""
+from __future__ import annotations
+import json
+from importlib import resources
+from pathlib import Path
+WRITE_VERBS = ("POST", "PUT", "PATCH", "DELETE")
+# label -> (filename, attack class, what the agent must supply)
+PROBES = {
+    "unauth-baseline": ("unauth-baseline.sh", "Missing authentication (no-creds baseline)",
+                        "just the target base URL — it reads the routes from probe-context.json"),
+    "bola-cross-tenant": ("bola-cross-tenant.sh", "BOLA / cross-tenant read (OWASP API #1)",
+                          "two role tokens in different tenants + the IDOR-candidate routes"),
+    "bola-write-verbs": ("bola-write-verbs.py", "BOLA on PATCH/PUT/POST/DELETE",
+                         "two role tokens + the write endpoints + a sample object id per tenant"),
+    "mass-assignment": ("mass-assignment.py", "BOPLA / mass assignment (OWASP API #3)",
+                        "a low-priv token + a write endpoint that updates a record"),
+    "jwt-attacks": ("jwt-attacks.sh", "JWT: alg:none, tamper, expiry, replay",
+                   "a valid token + the login + a protected endpoint"),
+    "hs256-brute-force": ("hs256-brute-force.py", "Offline HS256 weak-secret brute",
+                         "one HS256 JWT (offline — no live app needed)"),
+    "ssrf-probes": ("ssrf-probes.sh", "SSRF: IMDS / RFC1918 / file://",
+                   "an authorized token + the SSRF-candidate endpoints/params"),
+    "race-conditions": ("race-conditions.py", "Race / claim-collision invariants",
+                       "a token + an endpoint with a single-winner invariant + an idempotency key"),
+    "webhook-forgery": ("webhook-forgery.py", "Inbound webhook signature/replay",
+                       "the webhook path + signature header name + scheme"),
+    "rate-limit-burst": ("rate-limit-burst.sh", "Rate-limit + X-Forwarded-For bypass",
+                        "the login + a rate-limited endpoint"),
+    "compare-roles": ("compare-roles.sh", "Two-role DAST surface diff",
+                     "two SARIF reports from a role-A and role-B scan (dynamic phase)"),
+    "dlp-bypass-offline": ("dlp-bypass-offline.py", "DLP/detection regex encoding bypass",
+                          "your DLP/redaction regexes (offline)"),
+    "s3-assess": ("s3-assess.sh", "S3 bucket posture", "a bucket name + AWS creds"),
+}
+# unauth-baseline is ALWAYS staged: it's the cheapest probe and directly exercises the
+# #1 lead class (missing authentication) — the one a no-creds run can confirm immediately.
+ALWAYS = ["unauth-baseline", "jwt-attacks", "hs256-brute-force", "rate-limit-burst"]
+# which targeting bucket each probe should be pointed at (for the manifest's real targets)
+_TARGET_KEYS = {
+    "unauth-baseline": "write_endpoints",
+    "bola-write-verbs": "write_endpoints",
+    "mass-assignment": "write_endpoints",
+    "bola-cross-tenant": "idor_candidates",
+    "ssrf-probes": "ssrf_candidates",
+    "webhook-forgery": "write_endpoints",
+}
+_BANNER = (
+    "# ─────────────────────────────────────────────────────────────────────────────\n"
+    "# websec-validator — DRAFT probe. Any example endpoints / auth / login below are\n"
+    "# PLACEHOLDERS from the template. THIS target's real surface — routes, auth scheme\n"
+    "# + token location, sensitive fields, tenant key — is in  ./probe-context.json\n"
+    "# (generated from FACTS.json for this app). Use those values before running; the\n"
+    "# agent should finalize this draft against probe-context.json, then fill secrets.\n"
+    "# ─────────────────────────────────────────────────────────────────────────────\n"
+)
+def applicable(facts: dict) -> list:
+    """Pick probes the extracted surface actually justifies."""
+    chosen = list(ALWAYS)
+    targeting = (facts.get("routes") or {}).get("targeting", {})
+    tenant = (facts.get("tenant") or {}).get("candidates")
+    if targeting.get("write_endpoints"):
+        chosen += ["mass-assignment"]
+    if tenant:
+        chosen += ["bola-cross-tenant", "bola-write-verbs", "compare-roles"]
+    if targeting.get("ssrf_candidates") or (facts.get("surface") or {}).get("sinks", {}).get("ssrf-outbound-http"):
+        chosen += ["ssrf-probes"]
+    if targeting.get("write_endpoints"):
+        chosen += ["webhook-forgery", "race-conditions"]
+    seen, ordered = set(), []
+    for k in chosen:
+        if k in PROBES and k not in seen:
+            seen.add(k)
+            ordered.append(k)
+    return ordered
+def build_context(facts: dict) -> dict:
+    """The target's real, probe-ready surface — written to probe-context.json."""
+    routes = facts.get("routes") or {}
+    tgt = routes.get("targeting", {})
+    auth = facts.get("auth") or {}
+    writes = [f"{e.get('method')} {e.get('path')}" for e in routes.get("endpoints", [])
+              if e.get("method") in WRITE_VERBS][:80]
+    return {
+        "target_base_url": "FILL_ME (e.g. http://localhost:3000)",
+        "auth": {
+            "scheme": auth.get("scheme"),
+            "token_location": auth.get("token_location"),
+            "login_endpoints": tgt.get("auth_endpoints", [])[:10],
+            "how_to_authenticate": "cookie-session (e.g. NextAuth) → send the session cookie; "
+                                   "bearer → Authorization: Bearer <jwt>; api-key → the documented key header",
+        },
+        "endpoints": {
+            "writes": writes,
+            "idor_candidates": tgt.get("idor_candidates", [])[:60],
+            "ssrf_candidates": tgt.get("ssrf_candidates", [])[:40],
+            "upload_candidates": tgt.get("upload_candidates", [])[:40],
+            "auth_endpoints": tgt.get("auth_endpoints", [])[:20],
+        },
+        "sensitive_fields": (facts.get("schemas") or {}).get("sensitive_fields", []),
+        "tenant_keys": [c.get("key") for c in (facts.get("tenant") or {}).get("candidates", [])][:5],
+        "datastore_class": (facts.get("surface") or {}).get("datastore_class"),
+        "note": "These are THIS app's real routes/auth (from FACTS.json). Finalize each probe draft "
+                "against this file, supply secrets/tokens, then run against a TEST instance only.",
+    }
+def stage(chosen: list, outdir: Path, facts: dict | None = None) -> list:
+    dest = outdir / "probes"
+    dest.mkdir(parents=True, exist_ok=True)
+    facts = facts or {}
+    ctx = build_context(facts)
+    (dest / "probe-context.json").write_text(json.dumps(ctx, indent=2) + "\n")
+    tgt = (facts.get("routes") or {}).get("targeting", {})
+    manifest = [{"key": "_context", "file": "probes/probe-context.json",
+                 "note": "the target's real routes/auth/fields — finalize the drafts against this"}]
+    src_root = resources.files("websec_validator").joinpath("templates/probes")
+    # always ship the shared helper the Python probes import (load context + env auth)
+    try:
+        (dest / "_lib.py").write_text(src_root.joinpath("_lib.py").read_text())
+    except Exception:
+        pass
+    for key in chosen:
+        fname, attack, needs = PROBES[key]
+        targets = (tgt.get(_TARGET_KEYS[key], []) if key in _TARGET_KEYS else [])[:15]
+        try:
+            body = src_root.joinpath(fname).read_bytes()
+            # prepend the draft banner after any shebang line
+            text = body.decode("utf-8", "replace")
+            if text.startswith("#!"):
+                shebang, _, rest = text.partition("\n")
+                text = f"{shebang}\n{_BANNER}{rest}"
+            else:
+                text = _BANNER + text
+            (dest / fname).write_text(text)
+            manifest.append({"key": key, "file": f"probes/{fname}", "attack_class": attack,
+                             "agent_must_supply": needs, "targets": targets})
+        except Exception as e:
+            manifest.append({"key": key, "file": fname, "status": f"stage-error: {e}"})
+    return manifest
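
The selection logic in `applicable` above can be tried on its own. The sketch below is a minimal standalone version, not the package's code: the `PROBES` registry is reduced to a hypothetical key-only stand-in (the real entries carry filename, attack class, and requirements), `sample_facts` is invented for illustration, and the order-preserving de-duplication uses `dict.fromkeys` in place of the explicit `seen`/`ordered` loop in the diff.

```python
# Assumption: key-only stand-in for the real PROBES registry in the diff.
PROBES = {k: None for k in (
    "unauth-baseline", "jwt-attacks", "hs256-brute-force", "rate-limit-burst",
    "mass-assignment", "bola-cross-tenant", "bola-write-verbs", "compare-roles",
    "ssrf-probes", "webhook-forgery", "race-conditions",
)}
ALWAYS = ["unauth-baseline", "jwt-attacks", "hs256-brute-force", "rate-limit-burst"]


def applicable(facts: dict) -> list:
    """Pick probes the extracted surface justifies, in first-seen order."""
    chosen = list(ALWAYS)
    targeting = (facts.get("routes") or {}).get("targeting", {})
    tenant = (facts.get("tenant") or {}).get("candidates")
    sinks = (facts.get("surface") or {}).get("sinks", {})
    if targeting.get("write_endpoints"):
        chosen += ["mass-assignment"]
    if tenant:
        chosen += ["bola-cross-tenant", "bola-write-verbs", "compare-roles"]
    if targeting.get("ssrf_candidates") or sinks.get("ssrf-outbound-http"):
        chosen += ["ssrf-probes"]
    if targeting.get("write_endpoints"):
        chosen += ["webhook-forgery", "race-conditions"]
    # dict.fromkeys drops duplicates while keeping insertion order.
    return [k for k in dict.fromkeys(chosen) if k in PROBES]


# Hypothetical facts: one write endpoint, no tenant candidates, no SSRF surface.
sample_facts = {
    "routes": {"targeting": {"write_endpoints": ["POST /api/items"]}},
    "tenant": {"candidates": []},
}
selected = applicable(sample_facts)
print(selected)
```

With these inputs the tenant-scoped probes and `ssrf-probes` are skipped, because an empty candidate list is falsy and no SSRF targets or sinks are present.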

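The banner handling in `stage` keeps a script's shebang on the first line and places the draft banner directly beneath it. The following is a small sketch of just that step, under stated assumptions: `with_banner` is a hypothetical helper name (the diff inlines this logic), and `BANNER` is shortened to a single line for readability.

```python
# Assumption: one-line stand-in for the multi-line _BANNER in the diff.
BANNER = "# DRAFT probe: finalize against ./probe-context.json\n"


def with_banner(text: str, banner: str = BANNER) -> str:
    """Insert the banner after a leading shebang line, or at the very top."""
    if text.startswith("#!"):
        # Split off the first line only; the rest of the script is untouched.
        shebang, _, rest = text.partition("\n")
        return f"{shebang}\n{banner}{rest}"
    return banner + text


script = "#!/usr/bin/env bash\necho probing\n"
stamped = with_banner(script)
print(stamped)
```

Keeping the shebang first matters because the kernel only honors `#!` when it is the first two bytes of the file; a banner placed above it would change how a directly executed probe is interpreted.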
{websec_validator-0.2.1 → websec_validator-0.2.3}/src/websec_validator/report.py RENAMED

@@ -71,7 +71,8 @@ def render(facts: dict, scanners: dict, scan_results: list, unified: dict | None
 | Endpoints | **{routes.get('count', 0)}** (via {routes.get('engine','?').split(' ')[0]}) |
 | Auth | {facts.get('auth', {}).get('scheme','?')} · roles: {', '.join(authz.get('roles_detected', [])) or 'none'} |
 | Access control | {gs.get('with_visible_guard', 0)} guarded · **{gs.get('no_visible_guard', 0)} no visible guard** · global-middleware: {authz.get('global_auth_middleware', False)} |
-| Findings (ledger) | {ledger_hdr} |
+| Static scanner (raw, pre-triage) | {sev_line} |
+| **Findings ledger** (triaged + calibrated) | {ledger_hdr} |
 | Attack surface | IDOR: {len(tgt.get('idor_candidates', []))} · SSRF: {len(tgt.get('ssrf_candidates', []))} · upload: {len(tgt.get('upload_candidates', []))} · writes: {len(tgt.get('write_endpoints', []))} |
 ## 1. Findings ledger (ranked · evidence chain · standards · confidence)
