guardvibe 3.1.28 → 3.1.30

package/CHANGELOG.md CHANGED
@@ -5,6 +5,23 @@ All notable changes to GuardVibe are documented in this file.
  The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
  and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 
+ ## [3.1.30] - 2026-06-06
+
+ ### Added — release-integrity foundation
+ - **Metadata consistency guard** (`tests/meta/consistency.test.ts`) — makes the actual `builtinRules.length` the single source of truth and fails CI if any public surface (package.json, README, server.json, CLAUDE.md) advertises a different rule count, if tool-count strings diverge, if rule ids are not unique, if CHANGELOG lacks the current version, or if server.json and package.json versions drift apart. Ends the recurring count-drift (390 → 406 → 422 → 429 each previously needed a manual multi-file fix).
+ - **Release gate** (`npm run gate`, `scripts/release-gate.mjs`) — one command that runs build → lint → full test suite (incl. the consistency guard) → self-audit, and refuses to pass unless GuardVibe scans itself clean (PASS / A / 0). Run before every tag/release.
+ - **CI dogfood step** — `ci.yml` now runs `guardvibe audit . --fail-on high` so every PR must keep the project self-clean, not just green on tests.
+
+ ## [3.1.29] - 2026-06-06
+
+ ### Fixed — deep_scan (LLM) quality
+ - **Determinism**: deep_scan now calls the LLM with `temperature: 0`. The same code previously produced different findings across identical runs (e.g. 0 findings one call, 3 the next); it now returns stable results (verified: 3 identical runs on the same input).
+ - **Precision**: the prompt now enforces "precision over recall" — only report vulnerabilities present in the code shown, never speculate about code that isn't shown (imported middleware/helpers/DB layer), and never emit generic hardening suggestions (add rate limiting, shorten token lifetime) unless their absence is a concrete exploitable flaw. Correctly-handled concerns (ownership filter, validated input, parameterized query) are not flagged. A clean, auth-guarded, ownership-checked endpoint that previously drew 3 speculative findings now returns 0 while real IDOR/business-logic flaws are still caught.
+
+ ### Validation
+ - Live before/after against the Anthropic API (Haiku): determinism 0/3-variance → identical-across-3-runs; clean-code false positives 3 → 0; real semantic vulns (IDOR, business-logic price tampering, TOCTOU race, auth-bypass) still detected
+ - 1 new prompt-builder unit test; full suite 1788 → 1789, self-audit PASS A 100
+
  ## [3.1.28] - 2026-06-06
 
  ### Fixed
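
The metadata consistency guard described in the 3.1.30 entry can be illustrated with a minimal sketch. This is not the contents of `tests/meta/consistency.test.ts`; the helper names (`advertisedRuleCounts`, `findCountDrift`, `findDuplicateIds`) and the `"<N> rules"` matching pattern are assumptions made for illustration. The idea is that the real rule array length is the only trusted number, and every advertised count is compared against it.

```javascript
// Hypothetical helper: collect every "<N> rules" claim in a text surface.
// Qualified counts such as "63 CVE rules" are intentionally not matched.
function advertisedRuleCounts(text) {
  return [...text.matchAll(/\b(\d+) rules\b/g)].map((m) => Number(m[1]));
}

// Compare each advertised count against the actual rule count.
// `surfaces` maps a surface name (assumed: file name) to its text.
function findCountDrift(actualCount, surfaces) {
  const drift = [];
  for (const [name, text] of Object.entries(surfaces)) {
    for (const claimed of advertisedRuleCounts(text)) {
      if (claimed !== actualCount) {
        drift.push({ name, claimed, actual: actualCount });
      }
    }
  }
  return drift;
}

// Return the rule ids that appear more than once.
function findDuplicateIds(rules) {
  const seen = new Set();
  const dupes = new Set();
  for (const { id } of rules) {
    if (seen.has(id)) dupes.add(id);
    seen.add(id);
  }
  return [...dupes];
}

// Usage with made-up data: three rules, one surface advertising a stale count.
const sampleRules = [{ id: "VG1063" }, { id: "VG1068" }, { id: "VG1070" }];
const sampleDrift = findCountDrift(sampleRules.length, {
  "package.json": "Security MCP for vibe coding. 3 rules, 36 tools",
  "README.md": "GuardVibe ships 2 rules",
});
console.log(sampleDrift);
console.log(findDuplicateIds(sampleRules));
```

A test built on helpers like these would fail whenever the drift list or the duplicate list is non-empty, which turns a manual multi-file fix into a CI failure.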
package/README.md CHANGED
@@ -457,7 +457,7 @@ If your AI agent cannot connect to GuardVibe:
 
  1. **Restart your IDE/agent.** MCP servers are started by the host application. After running `npx guardvibe init`, restart Claude Code, Cursor, or Gemini CLI for the config to take effect.
  2. **Check the config path.** Run `npx guardvibe init claude` again and verify the output shows the correct config file location (`.mcp.json` in your project root for Claude Code, `.cursor/mcp.json` for Cursor).
- 3. **Re-run `init` to upgrade.** When upgrading GuardVibe, re-run `npx guardvibe init claude` — `.mcp.json` is pinned to a specific version (e.g. `guardvibe@3.1.28`) at init time for fast deterministic startup. As of v3.1.2 the re-run also rewrites stale pins automatically (`Upgraded GuardVibe pin (3.1.27 → 3.1.28)`); since v3.1.27 the PostToolUse hook command is pinned to the same version (was `@latest`) and re-run upgrades a stale hook too. The same applies to `npx guardvibe hook install` and `npx guardvibe ci github` (since v3.1.3) — both are version-pinned at install/generate time and re-run to upgrade.
+ 3. **Re-run `init` to upgrade.** When upgrading GuardVibe, re-run `npx guardvibe init claude` — `.mcp.json` is pinned to a specific version (e.g. `guardvibe@3.1.30`) at init time for fast deterministic startup. As of v3.1.2 the re-run also rewrites stale pins automatically (`Upgraded GuardVibe pin (3.1.27 → 3.1.28)`); since v3.1.27 the PostToolUse hook command is pinned to the same version (was `@latest`) and re-run upgrades a stale hook too. The same applies to `npx guardvibe hook install` and `npx guardvibe ci github` (since v3.1.3) — both are version-pinned at install/generate time and re-run to upgrade.
  4. **Pre-3.1.1 users won't see the auto-update banner.** GuardVibe started writing a once-per-day "newer version available" notice to stderr in v3.1.1. If your install predates that, you'll never see it — run `npx -y guardvibe@latest init <host>` once to bake in the latest pin and start receiving banners on subsequent sessions.
  5. **Verify Node.js version.** GuardVibe requires Node.js >= 18.0.0. Check with `node --version`.
  6. **Check npx cache.** If you upgraded GuardVibe and the old version is cached, run `npx -y guardvibe@latest` to force the latest version.
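
The stale-pin rewrite mentioned in step 3 can be sketched as a plain string transformation. This is an illustration only, not GuardVibe's `init` code: the function name `upgradePin` and the `args` layout in the sample config are assumptions. Only the `guardvibe@<version>` pin form and the `Upgraded GuardVibe pin (…)` message come from the text above.

```javascript
// Hypothetical helper: rewrite every exact-version pin to the target version
// and report which stale versions were replaced.
function upgradePin(configText, targetVersion) {
  const pinPattern = /guardvibe@(\d+\.\d+\.\d+)/g;
  const stale = new Set();
  const text = configText.replace(pinPattern, (match, version) => {
    if (version !== targetVersion) stale.add(version);
    return `guardvibe@${targetVersion}`;
  });
  const messages = [...stale].map(
    (version) => `Upgraded GuardVibe pin (${version} → ${targetVersion})`
  );
  return { text, messages };
}

// Usage with a made-up .mcp.json body (the args layout is an assumption).
const sampleConfig = JSON.stringify({
  mcpServers: { guardvibe: { command: "npx", args: ["-y", "guardvibe@3.1.28"] } },
});
const upgraded = upgradePin(sampleConfig, "3.1.30");
console.log(upgraded.messages);
console.log(upgraded.text);
```

Note that this sketch only matches numeric pins, so a `guardvibe@latest` reference would be left untouched and would need separate handling.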
@@ -76,6 +76,13 @@ export function buildDeepScanPrompt(code, language, existingFindings, focus = "a
  }
  }
  lines.push("");
+ lines.push("## Rules (precision over recall)");
+ lines.push("- Only report a vulnerability you can point to in the code SHOWN above. Cite the specific line or construct.");
+ lines.push("- Do NOT speculate about code that is not shown (imported helpers, middleware internals, DB layer). If `requireAuth` or a validator is referenced but not defined here, assume it works correctly.");
+ lines.push("- Do NOT report generic hardening or defense-in-depth suggestions (add rate limiting, shorten token lifetime, add logging) unless their absence is a concrete, exploitable flaw in THIS code.");
+ lines.push("- If the code already handles a concern correctly (e.g. an ownership/userId filter is present, input is validated, a parameterized query is used), do NOT flag it.");
+ lines.push("- When uncertain whether something is exploitable, OMIT it. A short, correct list beats a long, speculative one.");
+ lines.push("");
  lines.push("## Response Format");
  lines.push("Return ONLY a JSON object with this structure:");
  lines.push("```json");
@@ -180,6 +187,9 @@ export async function callLLM(prompt, options = {}) {
  body: JSON.stringify({
  model: MODEL_IDS[model],
  max_tokens: 2048,
+ // temperature 0 — maximize determinism so the same code yields stable findings
+ // across runs (deep_scan is the one non-deterministic layer; keep variance minimal).
+ temperature: 0,
  messages: [{ role: "user", content: trimmedPrompt }],
  }),
  });
@@ -198,6 +208,7 @@ export async function callLLM(prompt, options = {}) {
  body: JSON.stringify({
  model: model === "sonnet" ? "gpt-4o" : "gpt-4o-mini",
  max_tokens: 2048,
+ temperature: 0,
  messages: [{ role: "user", content: trimmedPrompt }],
  }),
  });
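
The "3 identical runs" check cited in the changelog can be approximated with a small harness. This is a sketch under stated assumptions: `scan` stands in for any async function that returns an array of finding objects, and `fingerprint` and `isStableAcrossRuns` are hypothetical names. Setting `temperature: 0` reduces sampling variance but is not generally a hard guarantee of identical output, so a check like this is a useful complement.

```javascript
// Order-insensitive fingerprint of a findings array.
// Assumes each finding serializes with a stable key order.
function fingerprint(findings) {
  return findings.map((finding) => JSON.stringify(finding)).sort().join("\n");
}

// Run the scan several times on the same input and report whether
// every run produced the same set of findings.
async function isStableAcrossRuns(scan, code, runs = 3) {
  const prints = new Set();
  for (let i = 0; i < runs; i++) {
    prints.add(fingerprint(await scan(code)));
  }
  return prints.size === 1;
}

// Usage with stub scanners (no network calls are made here).
let flakyCalls = 0;
const stableScan = async () => [{ rule: "IDOR", line: 12 }];
const flakyScan = async () =>
  flakyCalls++ === 0 ? [] : [{ rule: "IDOR", line: 12 }];

isStableAcrossRuns(stableScan, "code").then((ok) => console.log("stable scan:", ok));
isStableAcrossRuns(flakyScan, "code").then((ok) => console.log("flaky scan:", ok));
```

The flaky stub mirrors the pre-3.1.29 behaviour described in the changelog (0 findings on one call, findings on the next), which the harness reports as unstable.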
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "guardvibe",
- "version": "3.1.28",
+ "version": "3.1.30",
  "mcpName": "io.github.goklab/guardvibe",
  "description": "Security MCP for vibe coding. 429 rules, 36 tools, CLI + doctor. Host security, auth coverage mapping, LLM-powered deep scan (IDOR/business logic), taint analysis. 63 CVE rules refreshed daily from GHSA/OSV/CISA KEV — Miasma @redhat-cloud-services compromise, Next.js May 2026 13-advisory cluster, Drizzle/MikroORM/Kysely SQL injection, Axios proxy-auth redirect leak, Hono setCookie attribute injection, Clerk SSRF, tRPC prototype pollution, @tanstack supply-chain, node-ipc protestware, OpenClaude sandbox bypass, plus the full AI-generated stack (Supabase, Stripe, Prisma, Hono, GraphQL, Convex, Turso, Uploadthing, AI SDK). 68 AI-native rules including OWASP MCP Top 10 tool-description prompt injection (VG1068), model-controlled sandbox-disable flag detection (VG1063), Session messenger exfil endpoint IOC (VG1075), and CI/CD supply-chain hardening (VG1070 npm --expect-provenance / --ignore-scripts enforcement).",
  "type": "module",
@@ -33,7 +33,8 @@
  "prepare": "npm run build",
  "lint": "eslint src/",
  "test": "node --import tsx --test tests/**/*.test.ts",
- "test:coverage": "c8 --reporter=lcov --reporter=text node --import tsx --test tests/**/*.test.ts"
+ "test:coverage": "c8 --reporter=lcov --reporter=text node --import tsx --test tests/**/*.test.ts",
+ "gate": "node scripts/release-gate.mjs"
  },
  "keywords": [
  "mcp",
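
The new `gate` script points at `scripts/release-gate.mjs`, whose contents are not part of this diff. As a rough sketch of the build → lint → test → self-audit sequence the changelog describes, the logic could look like the following. The function name `runGate`, the injected `run` callback, and the exact step commands are assumptions; the audit command is borrowed from the CI step in the changelog, not from the gate script itself.

```javascript
// Run each step in order and stop at the first non-zero exit code.
// `run` is injected so the sequencing logic stays testable; a real script
// might pass something like:
//   (cmd) => spawnSync(cmd, { shell: true, stdio: "inherit" }).status
// using spawnSync from node:child_process.
function runGate(steps, run) {
  for (const { name, command } of steps) {
    const exitCode = run(command);
    if (exitCode !== 0) {
      return { passed: false, failedStep: name };
    }
  }
  return { passed: true, failedStep: null };
}

// Assumed step list, following the order given in the changelog.
const gateSteps = [
  { name: "build", command: "npm run build" },
  { name: "lint", command: "npm run lint" },
  { name: "test", command: "npm test" },
  { name: "self-audit", command: "guardvibe audit . --fail-on high" },
];

// Usage with a fake runner that fails the lint step.
const executed = [];
const fakeRun = (command) => {
  executed.push(command);
  return command === "npm run lint" ? 1 : 0;
};
const gateResult = runGate(gateSteps, fakeRun);
console.log(gateResult, executed);
```

Stopping at the first failure keeps the output focused on the step that needs attention and avoids running the slower test and audit steps on a broken build.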