npm - thumbgate - Versions diffs - 1.17.0 → 1.19.0 - Mend

thumbgate 1.17.0 → 1.19.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (27)

package/.claude-plugin/marketplace.json +2 -2
package/.claude-plugin/plugin.json +1 -1
package/.well-known/mcp/server-card.json +1 -1
package/README.md +26 -4
package/adapters/claude/.mcp.json +2 -2
package/adapters/mcp/server-stdio.js +1 -1
package/adapters/opencode/opencode.json +1 -1
package/config/model-candidates.json +31 -0
package/package.json +29 -8
package/public/compare.html +6 -0
package/public/federal.html +375 -0
package/public/guide.html +2 -2
package/public/index.html +49 -19
package/public/learn.html +28 -0
package/public/numbers.html +2 -2
package/public/pro.html +4 -4
package/scripts/activation-tracker.js +127 -0
package/scripts/auto-promote-gates.js +4 -1
package/scripts/feedback-loop.js +14 -1
package/scripts/feedback-to-rules.js +11 -1
package/scripts/feedback_quality_eval.py +725 -0
package/scripts/memory-scope-readiness.js +315 -0
package/scripts/plausible-server-events.js +162 -0
package/scripts/seo-gsd.js +75 -2
package/scripts/statusline-links.js +2 -0
package/scripts/telemetry-analytics.js +1 -0
package/src/api/server.js +557 -12

package/.claude-plugin/marketplace.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "thumbgate-marketplace",
-  "version": "1.17.0",
+  "version": "1.19.0",
   "owner": {
     "name": "Igor Ganapolsky",
     "email": "ig5973700@gmail.com"
@@ -13,7 +13,7 @@
         "source": "npm",
         "package": "thumbgate"
       },
-      "version": "1.17.0",
+      "version": "1.19.0",
       "author": {
         "name": "Igor Ganapolsky"
       },

package/.claude-plugin/plugin.json CHANGED

@@ -1,7 +1,7 @@
 {
   "name": "thumbgate",
   "description": "Type 👍 or 👎 on any agent action. ThumbGate captures it, distills a lesson, and blocks the pattern from repeating. One thumbs-down = the agent physically cannot make that mistake again. 33 pre-action checks, budget enforcement, self-protection, and NIST/SOC2 compliance tags.",
-  "version": "1.17.0",
+  "version": "1.19.0",
   "author": {
     "name": "Igor Ganapolsky"
   },

package/.well-known/mcp/server-card.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "thumbgate",
-  "version": "1.17.0",
+  "version": "1.19.0",
   "description": "ThumbGate — 👍👎 feedback that teaches your AI agent. Thumbs down a mistake, it never happens again.",
   "homepage": "https://thumbgate-production.up.railway.app",
   "transport": "stdio",

package/README.md CHANGED

@@ -8,7 +8,7 @@
 **Your AI coding bill has a leak.**
-**Stop paying $ for the same AI mistake.**
+**Stop paying for the same AI mistake twice.**
 Every retry loop, every hallucinated import, every *"let me try a different approach"* — those are billable tokens on every LLM vendor's bill. Thumbs-down once; ThumbGate blocks that exact mistake on every future call. Across Claude Code, Cursor, Codex, Gemini, Amp, Cline, OpenCode — any MCP-compatible agent, forever.
@@ -24,6 +24,14 @@ Under the hood: your thumbs-down becomes one of your **Pre-Action Checks** that
 ---
+> *"A better dashboard doesn't make the agents more reliable. The hard part isn't visibility. It's trust."*
+>
+> — **Rob May**, CEO & co-founder, Neurometric AI, quoted in [The New Stack](https://thenewstack.io/claude-code-agent-view/) on Anthropic's Claude Code Agent View (May 2026).
+>
+> ThumbGate is the open-source layer that makes the trust part real: PreToolUse gates, thumbs-down to rule, audit trail on every interception.
+---
 ## 🎬 90-second demo
 Watch the force-push scenario: agent tries to `git push --force`, one thumbs-down, next session it's blocked — zero tokens spent on the repeat.
@@ -107,13 +115,26 @@ ThumbGate operates as a 4-layer enforcement stack between your AI agent and your
 Your thumbs-up/down reactions are captured via MCP protocol, CLI, or the ChatGPT GPT surface. Each reaction is stored as a structured lesson with context, timestamp, and severity.
 ### Layer 2: Check Engine
-The check engine converts lessons into enforceable rules using pattern matching, semantic similarity (via LanceDB vectors), and Thompson Sampling for adaptive rule selection. Rules stay in local ThumbGate runtime state.
+The check engine converts lessons into enforceable rules. **The runtime gate decision is deterministic** — literal pattern match → AST match → scoped rule lookup. No LLM call on the enforcement path.
+Where retrieval is needed (an agent is about to run a destructive command not on the literal block list, but semantically similar to one we've blocked before), ThumbGate uses local CPU-only `bge-small` embeddings via LanceDB's built-in pipeline. No external API call, no inference cost beyond CPU. So **"no LLM in enforcement"** holds: the gate decision uses no LLM; the rule corpus is just searchable via local embeddings.
+**Thompson Sampling tunes per-rule confidence weights** for soft-gating rules so high-noise rules quiet down and high-signal rules sharpen. It never decides *whether* a rule fires — a hard rule like "block `git push --force` on main" always fires deterministically. Bandit exploration would be terrifying for hard rules; we don't do it.
+Rules stay in local ThumbGate runtime state.
 ### Layer 3: Pre-Action Interception
 Before any agent action executes, ThumbGate's `PreToolUse` hook intercepts the command and evaluates it against all active checks. This happens at the MCP protocol level — the agent physically cannot bypass it.
-### Layer 4: Multi-Agent Distribution
-Checks are distributed across all connected agents via MCP stdio protocol. One correction in Claude Code protects Cursor, Codex, Gemini CLI, Cline, and any MCP-compatible agent.
+### Layer 4: Multi-Agent Distribution (the actual moat vs hand-rolled hooks)
+Claude Code already ships `permissions.deny` and `PreToolUse` hooks. Cursor and Codex have their own. So why ThumbGate over a hand-written hook?
+Two things hand-written hooks structurally cannot do:
+1. **Cross-agent propagation.** A `permissions.deny` pattern lives in one agent's config and stays there. ThumbGate's checks distribute across every connected agent over MCP stdio — thumbs-down once in Cursor, the same pattern blocks on Claude Code, Codex, Gemini CLI, Cline, OpenCode, Amp in the next session, no copy-paste between configs.
+2. **Learning loop.** A hand-written hook covers exactly the patterns you wrote. ThumbGate promotes every thumbs-down into a fresh rule, tunes existing rules' confidence weights from outcomes (Thompson Sampling, see Layer 2), and pulls semantically-near patterns into scope via local embeddings. The rule corpus sharpens without an operator hand-writing a regex for every new mistake shape.
+Hand-rolled hooks are the right tool for a small, static denylist you maintain by hand. ThumbGate is the right tool when you want corrections from any agent to harden every agent automatically.
 Prompt engineering still matters, but it is only the starting point. ThumbGate adds prompt evaluation on top: proof lanes, benchmarks, and self-heal checks tell you whether your prompt and workflow actually held up under execution instead of leaving you to guess from vibes. Run `npx thumbgate eval --from-feedback --write-report=.thumbgate/prompt-eval-proof.md` to turn real thumbs-up/down feedback into reusable eval cases and a buyer-ready proof report.
@@ -423,6 +444,7 @@ Pro ($19/mo or $149/yr) removes the rule cap and adds history-aware lesson recal
 ## Docs
+- [**ThumbGate for Federal Agencies**](docs/FEDERAL.md) — pilot-ready posture, NIST 800-53 control mapping, OMB M-24-10 / EO 14110 alignment, ThumbGate-Core gov deployment mode, public/Core boundary invariants. Landing page: [thumbgate.ai/federal](https://thumbgate-production.up.railway.app/federal).
 - [First Dollar Playbook](docs/FIRST_DOLLAR_PLAYBOOK.md) — turning one painful workflow into the next booked pilot
 - [Commercial Truth](docs/COMMERCIAL_TRUTH.md) — pricing, claims, what we don't say
 - [Changeset Strategy](docs/CHANGESET_STRATEGY.md) — release notes and version bump enforcement
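
Reviewer note (not part of the package): the new Layer 2 text in this README hunk describes a deterministic gate path of literal pattern match, then AST match, then scoped rule lookup, with no LLM call. The sketch below shows one way such a path could look. It is a minimal illustration under stated assumptions: `gateCheck`, `scopeMatches`, the rule shape, and the two example rules are all hypothetical and are not taken from the thumbgate source, and the AST stage is left out entirely.

```javascript
// Hypothetical sketch only: none of these names come from the thumbgate package.
const exampleRules = [
  // Hard rule keyed on a literal substring; fires whenever scope and text match.
  { id: 'no-force-push', kind: 'literal', pattern: 'git push --force', scope: 'main' },
  // Scoped rule: applies to one tool and is looked up after the literal stage.
  { id: 'no-rm-rf-root', kind: 'scoped', tool: 'shell', pattern: /\brm\s+-rf\s+\/(\s|$)/, scope: '*' },
];

// A rule scope of '*' applies everywhere; otherwise it must equal the branch.
function scopeMatches(ruleScope, branch) {
  return ruleScope === '*' || ruleScope === branch;
}

// Returns a decision object. The same input always yields the same output:
// no randomness, no network call, no model inference.
function gateCheck(call, rules = exampleRules) {
  // Stage 1: literal pattern match.
  for (const rule of rules) {
    if (
      rule.kind === 'literal' &&
      scopeMatches(rule.scope, call.branch) &&
      call.command.includes(rule.pattern)
    ) {
      return { allowed: false, ruleId: rule.id, stage: 'literal' };
    }
  }

  // Stage 2 (AST match) is intentionally omitted from this sketch.

  // Stage 3: scoped rule lookup.
  for (const rule of rules) {
    if (
      rule.kind === 'scoped' &&
      rule.tool === call.tool &&
      scopeMatches(rule.scope, call.branch) &&
      rule.pattern.test(call.command)
    ) {
      return { allowed: false, ruleId: rule.id, stage: 'scoped' };
    }
  }

  return { allowed: true, ruleId: null, stage: null };
}

// Usage: the same command is blocked on main and allowed on another branch.
console.log(gateCheck({ tool: 'shell', command: 'git push --force origin main', branch: 'main' }));
console.log(gateCheck({ tool: 'shell', command: 'git push --force origin dev', branch: 'dev' }));
```

Because each stage is a plain loop over rules, a decision can be replayed from the rule set and the call alone, which is the property the README's "no LLM in enforcement" claim depends on. Confidence weighting for soft rules (the Thompson Sampling part) would sit outside this function and is not modeled here.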

package/adapters/claude/.mcp.json CHANGED

@@ -2,13 +2,13 @@
   "mcpServers": {
     "thumbgate": {
       "command": "npx",
-      "args": ["--yes", "--package", "thumbgate@1.17.0", "thumbgate", "serve"]
+      "args": ["--yes", "--package", "thumbgate@1.19.0", "thumbgate", "serve"]
     }
   },
   "hooks": {
     "preToolUse": {
       "command": "npx",
-      "args": ["--yes", "--package", "thumbgate@1.17.0", "thumbgate", "gate-check"]
+      "args": ["--yes", "--package", "thumbgate@1.19.0", "thumbgate", "gate-check"]
     }
   }
 }

package/adapters/mcp/server-stdio.js CHANGED

@@ -216,7 +216,7 @@ const {
   finalizeSession: finalizeFeedbackSession,
 } = require('../../scripts/feedback-session');
-const SERVER_INFO = { name: 'thumbgate-mcp', version: '1.17.0' };
+const SERVER_INFO = { name: 'thumbgate-mcp', version: '1.19.0' };
 const COMMERCE_CATEGORIES = [
   'product_recommendation',
   'brand_compliance',

package/adapters/opencode/opencode.json CHANGED

@@ -7,7 +7,7 @@
         "npx",
         "--yes",
         "--package",
-        "thumbgate@1.17.0",
+        "thumbgate@1.19.0",
         "thumbgate",
         "serve"
       ],

package/config/model-candidates.json CHANGED

@@ -76,9 +76,40 @@
         "medianLatencyMs",
         "costPerAnalysisUsd"
       ]
+    },
+    "tokenizer-brittleness": {
+      "label": "Tokenizer brittleness and byte-level robustness",
+      "summary": "Evaluate models for malformed JSONL, Unicode confusables, stack traces, secrets, SQL snippets, file paths, and code-symbol-heavy inputs before routing log, code, or security workloads.",
+      "desiredStrengths": ["tokenizer-free", "byte-level", "log-robustness", "code-symbols", "security-scanning", "fast-inference"],
+      "targetContextWindow": 64000,
+      "benchmarkCommands": [
+        "npx thumbgate model-candidates --workload=tokenizer-brittleness --json",
+        "node --test tests/model-candidates.test.js --test-name-pattern=tokenizer",
+        "node scripts/gate-eval.js run"
+      ],
+      "metrics": [
+        "caseCoverage",
+        "symbolPreservationRate",
+        "secretDetectionRecall",
+        "jsonlRepairAccuracy",
+        "medianLatencyMs",
+        "memoryBandwidthEstimate"
+      ]
     }
   },
   "candidates": [
+    {
+      "id": "research/fast-byte-latent-transformer",
+      "vendor": "Meta + Stanford + University of Washington",
+      "family": "blt",
+      "provider": "research",
+      "model": "fast-byte-latent-transformer",
+      "contextWindow": 64000,
+      "costClass": "medium",
+      "researchOnly": true,
+      "strengths": ["tokenizer-free", "byte-level", "log-robustness", "code-symbols", "security-scanning", "fast-inference"],
+      "notes": "Research-only candidate inspired by Fast BLT. Use as an evaluation target for tokenizer-free robustness and memory-bandwidth planning; do not route production traffic until a maintained runtime and model weights exist."
+    },
     {
       "id": "self-hosted/deepseek-v4-flash-sglang",
       "vendor": "DeepSeek",

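Reviewer note (not part of the package): the hunk above adds a `tokenizer-brittleness` workload and a candidate flagged `"researchOnly": true`, whose notes say not to route production traffic to it. The sketch below shows how a consumer of this config might honor that flag when ranking candidates. It is an assumption-laden illustration: `rankCandidates` and its `includeResearch` option are hypothetical, the second candidate in the sample data is invented for the example, and the package's real selection logic is not visible in this diff.

```javascript
// Hypothetical helper; only the field names mirror the JSON added in this diff.
function rankCandidates(workload, candidates, { includeResearch = false } = {}) {
  const wanted = new Set(workload.desiredStrengths);
  return candidates
    // Research-only entries are excluded unless the caller opts in explicitly.
    .filter((c) => includeResearch || !c.researchOnly)
    // Keep only candidates whose context window covers the workload target.
    .filter((c) => c.contextWindow >= workload.targetContextWindow)
    // Score each candidate by how many desired strengths it lists.
    .map((c) => ({
      id: c.id,
      overlap: c.strengths.filter((s) => wanted.has(s)).length,
    }))
    .sort((a, b) => b.overlap - a.overlap);
}

// Values for the workload and the first candidate are copied from the diff.
const workload = {
  desiredStrengths: ['tokenizer-free', 'byte-level', 'log-robustness', 'code-symbols', 'security-scanning', 'fast-inference'],
  targetContextWindow: 64000,
};

const candidates = [
  {
    id: 'research/fast-byte-latent-transformer',
    contextWindow: 64000,
    researchOnly: true,
    strengths: ['tokenizer-free', 'byte-level', 'log-robustness', 'code-symbols', 'security-scanning', 'fast-inference'],
  },
  // Invented placeholder entry, present only so the default ranking is non-empty.
  {
    id: 'example/placeholder-model',
    contextWindow: 128000,
    strengths: ['code-symbols', 'fast-inference'],
  },
];

console.log(rankCandidates(workload, candidates));
console.log(rankCandidates(workload, candidates, { includeResearch: true }));
```

Making the research opt-in an explicit argument keeps the default path aligned with the candidate's own notes: an evaluation run can include it, while a production router that uses the default never sees it.
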
package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "thumbgate",
-  "version": "1.17.0",
+  "version": "1.19.0",
   "description": "ThumbGate self-improving agent governance: thumbs-up/down turns every mistake into a prevention rule and blocks repeat patterns. 33 pre-action checks, budget enforcement, and self-protection for Claude Code, Cursor, Codex, Gemini CLI, and Amp.",
   "homepage": "https://thumbgate-production.up.railway.app",
   "repository": {
@@ -39,6 +39,7 @@
     "public/codex-plugin.html",
     "public/compare.html",
     "public/dashboard.html",
+    "public/federal.html",
     "public/guide.html",
     "public/index.html",
     "public/learn.html",
@@ -46,6 +47,7 @@
     "public/numbers.html",
     "public/pro.html",
     "scripts/access-anomaly-detector.js",
+    "scripts/activation-tracker.js",
     "scripts/agent-audit-trace.js",
     "scripts/agent-design-governance.js",
     "scripts/agent-memory-lifecycle.js",
@@ -110,6 +112,7 @@
     "scripts/feedback-loop.js",
     "scripts/feedback-paths.js",
     "scripts/feedback-quality.js",
+    "scripts/feedback_quality_eval.py",
     "scripts/feedback-schema.js",
     "scripts/feedback-session.js",
     "scripts/feedback-to-rules.js",
@@ -151,6 +154,7 @@
     "scripts/mcp-policy.js",
     "scripts/mcp-transport-strategy.js",
     "scripts/memory-firewall.js",
+    "scripts/memory-scope-readiness.js",
     "scripts/meta-agent-loop.js",
     "scripts/model-access-eligibility.js",
     "scripts/model-migration-readiness.js",
@@ -162,6 +166,7 @@
     "scripts/otel-declarative-config.js",
     "scripts/operational-integrity.js",
     "scripts/perplexity-client.js",
+    "scripts/plausible-server-events.js",
     "scripts/pr-manager.js",
     "scripts/private-core-boundary.js",
     "scripts/pro-local-dashboard.js",
@@ -322,7 +327,16 @@
     "social:prospect:bluesky": "node scripts/social-bluesky-prospecting.js",
     "social:prospect:bluesky:dry": "node scripts/social-bluesky-prospecting.js --dry-run",
     "social:reply-publish:bluesky:dry": "node scripts/social-reply-monitor-bluesky.js --publish-approved --dry-run",
-    "test": "npm run test:schema && npm run test:loop && npm run test:dpo && npm run test:kto && npm run test:api && npm run test:proof && npm run test:e2e && npm run test:rlaif && npm run test:attribution && npm run test:quality && npm run test:intelligence && npm run test:training-export && npm run test:deployment && npm run test:operational-integrity && npm run test:workflow && npm run test:billing && npm run test:cli && npm run test:watcher && npm run test:autoresearch && npm run test:ops && npm run test:session-analyzer && npm run test:tessl && npm run test:gates && npm run test:evoskill && npm run test:gates-hardening && npm run test:workers && npm run test:social-analytics && npm run test:memalign && npm run test:xmemory-lite && npm run test:filesystem-search && npm run test:zernio && npm run test:platform-limits && npm run test:post-video && npm run test:post-everywhere-instagram && npm run test:post-everywhere-channels && npm run test:post-everywhere-zernio-default && npm run test:zernio-canonical-pollers && npm run test:zernio-status && npm run test:obsidian-export && npm run test:lesson-db && npm run test:lesson-rotation && npm run test:memory-dedup && npm run test:feedback-quality && npm run test:sync-version && npm run test:check-congruence && npm run test:tool-registry && npm run test:feedback-to-rules && npm run test:memory-firewall && npm run test:belief-update && npm run test:hosted-config && npm run test:operational-summary && npm run test:operational-dashboard && npm run test:operator-artifacts && npm run test:operator-key-auth && npm run test:cloudflare-sandbox && npm run test:mcp-config && npm run test:plan-gate && npm run test:pulse && npm run test:semantic-layer && npm run test:data-pipeline && npm run test:optimize-context && npm run test:principle-extractor && npm run test:analytics-window && npm run test:funnel-analytics && npm run test:experiment-tracker && npm run test:build-metadata && npm run test:context-engine && npm run 
test:hf-papers && npm run test:marketing-experiment && npm run test:seo-gsd && npm run test:verify-run && npm run test:export-dpo-pairs && npm run test:export-hf-dataset && npm run test:license && npm run test:bot-detector && npm run test:postinstall && npm run test:funnel-invariants && npm run test:cli-telemetry && npm run test:pro-parity && npm run test:model-tier-router && npm run test:computer-use-firewall && npm run test:skill-exporter && npm run test:statusline && npm run test:evolution && npm run test:org-dashboard && npm run test:multi-hop-recall && npm run test:synthetic-dpo && npm run test:thumbgate-skill && npm run test:learn-hub && npm run test:feedback-fallback && npm run test:metaclaw && npm run test:server-lock && npm run test:control-tower && npm run test:pii-scanner && npm run test:data-governance && npm run test:lesson-inference && npm run test:semantic-dedup && npm run test:fs-utils && npm run test:cli-schema && npm run test:explore && npm run test:lesson-reranker && npm run test:lesson-retrieval && npm run test:cross-encoder && npm run test:reflector-agent && npm run test:feedback-session && npm run test:feedback-history-distiller && npm run test:hallucination-detector && npm run test:history-distiller && npm run test:predictive-insights && npm run test:prove-predictive-insights && npm run test:statusbar-cli && npm run test:generate-instagram-card && npm run test:instagram-thumbgate-post && npm run test:publish-instagram-thumbgate && npm run test:lesson-synthesis && npm run test:lesson-canonical && npm run test:background-governance && npm run test:memory-migration && npm run test:prompt-dlp && npm run test:ephemeral-store && npm run test:agent-security && npm run test:skill-progressive && npm run test:per-step-scoring && npm run test:weekly-auto-post && npm run test:social-post-hourly && npm run test:social-quality-gate && npm run test:a2ui-engine && npm run test:gate-satisfy && npm run test:money-watcher && npm run test:budget && npm run 
test:quick-start && npm run test:utm && npm run test:product-feedback && npm run test:feedback-root-consolidator && npm run test:engagement-audit && npm run test:install-growth-automation && npm run test:publish-thumbgate-launch && npm run test:community-course-platform-launch-kit && npm run test:reconcile-thumbgate-campaign && npm run test:reddit-publisher && npm run test:schedule-thumbgate-campaign && npm run test:social-reply-monitor && npm run test:social-dedupe-cleanup && npm run test:sync-launch-assets && npm run test:ai-search-visibility && npm run test:perplexity && npm run test:security-scanner && npm run test:llm-client && npm run test:managed-lesson-agent && npm run test:self-distill && npm run test:meta-agent && npm run test:harness-selector && npm run test:thumbgate-bench && npm run test:seo-guides && npm run test:enforcement-loop && npm run test:cli-agent-experience && npm run test:bot-detection && npm run test:checkout-bot-guard && npm run test:session-health && npm run test:session-episodes && npm run test:spec-gate && npm run test:decision-trace && npm run test:dashboard-insights && npm run test:telemetry-tracked-link-slug && npm run test:prompt-eval && npm run test:demo-voiceover && npm run test:gate-coherence && npm run test:gate-eval && npm run test:high-roi && npm run test:public-static-assets && npm run test:token-savings && npm run test:numbers-page && npm run test:workflow-gate-checkpoint && npm run test:lesson-export-import && npm run test:landing-page-claims && npm run test:competitive-positioning-marketing && npm run test:medium-weekly && npm run test:dashboard-deeplink-e2e && npm run test:public-package-parity && npm run test:token-savings-dashboard && npm run test:cursor-wiring && npm run test:pretooluse-injection && npm run test:recent-corrective-context && npm run test:durability-step && npm run test:mailer && npm run test:brand-assets && npm run test:enforcement-teeth && npm run test:bayes-optimal-gate && npm run 
test:swarm-coordinator && npm run test:session-report && npm run test:agent-reasoning-traces && npm run test:judge-reward && npm run test:llm-behavior-monitor && npm run test:prompting-os && npm run test:single-use-credential-gate && npm run test:structured-prompt-driven && npm run test:require-evidence-gate && npm run test:rule-validator && npm run test:bluesky-atproto && npm run test:social-reply-monitor-bluesky && npm run test:bluesky-delete-replies && npm run test:architect-kit-memory-bridge && npm run test:sonar-review-hotspots && npm run test:actionable-remediations && npm run test:gemini-embedding-policy && npm run test:agent-design-governance && npm run test:public-core-boundary",
+    "test:python": "python3 -m pytest tests/*.py",
+    "test": "npm run test:python && npm run test:schema && npm run test:loop && npm run test:dpo && npm run test:kto && npm run test:api && npm run test:proof && npm run test:e2e && npm run test:rlaif && npm run test:attribution && npm run test:quality && npm run test:intelligence && npm run test:training-export && npm run test:deployment && npm run test:operational-integrity && npm run test:workflow && npm run test:billing && npm run test:cli && npm run test:watcher && npm run test:autoresearch && npm run test:ops && npm run test:session-analyzer && npm run test:tessl && npm run test:gates && npm run test:evoskill && npm run test:gates-hardening && npm run test:workers && npm run test:social-analytics && npm run test:memalign && npm run test:xmemory-lite && npm run test:filesystem-search && npm run test:zernio && npm run test:platform-limits && npm run test:post-video && npm run test:post-everywhere-instagram && npm run test:post-everywhere-channels && npm run test:post-everywhere-zernio-default && npm run test:zernio-canonical-pollers && npm run test:zernio-status && npm run test:obsidian-export && npm run test:lesson-db && npm run test:lesson-rotation && npm run test:memory-dedup && npm run test:feedback-quality && npm run test:sync-version && npm run test:check-congruence && npm run test:tool-registry && npm run test:feedback-to-rules && npm run test:memory-firewall && npm run test:memory-scope-readiness && npm run test:belief-update && npm run test:hosted-config && npm run test:operational-summary && npm run test:operational-dashboard && npm run test:operator-artifacts && npm run test:operator-key-auth && npm run test:cloudflare-sandbox && npm run test:mcp-config && npm run test:plan-gate && npm run test:pulse && npm run test:semantic-layer && npm run test:data-pipeline && npm run test:optimize-context && npm run test:principle-extractor && npm run test:analytics-window && npm run test:funnel-analytics && npm run test:experiment-tracker && npm run 
test:build-metadata && npm run test:context-engine && npm run test:hf-papers && npm run test:marketing-experiment && npm run test:seo-gsd && npm run test:verify-run && npm run test:export-dpo-pairs && npm run test:export-hf-dataset && npm run test:license && npm run test:bot-detector && npm run test:audit-pr-bot-contamination && npm run test:stripe-bootstrap-saas-catalog && npm run test:postinstall && npm run test:funnel-invariants && npm run test:cli-telemetry && npm run test:pro-parity && npm run test:model-tier-router && npm run test:computer-use-firewall && npm run test:skill-exporter && npm run test:statusline && npm run test:evolution && npm run test:org-dashboard && npm run test:multi-hop-recall && npm run test:synthetic-dpo && npm run test:thumbgate-skill && npm run test:learn-hub && npm run test:feedback-fallback && npm run test:metaclaw && npm run test:server-lock && npm run test:control-tower && npm run test:pii-scanner && npm run test:data-governance && npm run test:lesson-inference && npm run test:semantic-dedup && npm run test:fs-utils && npm run test:cli-schema && npm run test:explore && npm run test:lesson-reranker && npm run test:lesson-retrieval && npm run test:cross-encoder && npm run test:reflector-agent && npm run test:feedback-session && npm run test:feedback-history-distiller && npm run test:hallucination-detector && npm run test:history-distiller && npm run test:predictive-insights && npm run test:prove-predictive-insights && npm run test:statusbar-cli && npm run test:generate-instagram-card && npm run test:instagram-thumbgate-post && npm run test:publish-instagram-thumbgate && npm run test:lesson-synthesis && npm run test:lesson-canonical && npm run test:background-governance && npm run test:memory-migration && npm run test:prompt-dlp && npm run test:ephemeral-store && npm run test:agent-security && npm run test:skill-progressive && npm run test:per-step-scoring && npm run test:weekly-auto-post && npm run test:social-post-hourly && npm run 
test:social-quality-gate && npm run test:a2ui-engine && npm run test:gate-satisfy && npm run test:money-watcher && npm run test:budget && npm run test:quick-start && npm run test:utm && npm run test:product-feedback && npm run test:feedback-root-consolidator && npm run test:engagement-audit && npm run test:install-growth-automation && npm run test:publish-thumbgate-launch && npm run test:community-course-platform-launch-kit && npm run test:reconcile-thumbgate-campaign && npm run test:reddit-publisher && npm run test:schedule-thumbgate-campaign && npm run test:social-reply-monitor && npm run test:social-dedupe-cleanup && npm run test:sync-launch-assets && npm run test:ai-search-visibility && npm run test:perplexity && npm run test:security-scanner && npm run test:llm-client && npm run test:managed-lesson-agent && npm run test:self-distill && npm run test:meta-agent && npm run test:harness-selector && npm run test:thumbgate-bench && npm run test:seo-guides && npm run test:enforcement-loop && npm run test:cli-agent-experience && npm run test:bot-detection && npm run test:checkout-bot-guard && npm run test:checkout-pro-confirmation-gate && npm run test:session-health && npm run test:session-episodes && npm run test:spec-gate && npm run test:decision-trace && npm run test:dashboard-insights && npm run test:telemetry-tracked-link-slug && npm run test:prompt-eval && npm run test:demo-voiceover && npm run test:gate-coherence && npm run test:gate-eval && npm run test:high-roi && npm run test:public-static-assets && npm run test:token-savings && npm run test:numbers-page && npm run test:workflow-gate-checkpoint && npm run test:lesson-export-import && npm run test:landing-page-claims && npm run test:competitive-positioning-marketing && npm run test:medium-weekly && npm run test:dashboard-deeplink-e2e && npm run test:public-package-parity && npm run test:token-savings-dashboard && npm run test:cursor-wiring && npm run test:pretooluse-injection && npm run 
test:recent-corrective-context && npm run test:durability-step && npm run test:mailer && npm run test:brand-assets && npm run test:enforcement-teeth && npm run test:bayes-optimal-gate && npm run test:swarm-coordinator && npm run test:session-report && npm run test:agent-reasoning-traces && npm run test:judge-reward && npm run test:llm-behavior-monitor && npm run test:prompting-os && npm run test:single-use-credential-gate && npm run test:structured-prompt-driven && npm run test:require-evidence-gate && npm run test:rule-validator && npm run test:bluesky-atproto && npm run test:social-reply-monitor-bluesky && npm run test:bluesky-delete-replies && npm run test:architect-kit-memory-bridge && npm run test:sonar-review-hotspots && npm run test:actionable-remediations && npm run test:gemini-embedding-policy && npm run test:agent-design-governance && npm run test:public-core-boundary && npm run test:hook-stop-verify-deploy && npm run test:hook-stop-anti-claim && npm run test:plausible-server-events && npm run test:activation-tracker && npm run test:unified-revenue-rollup && npm run test:conversion-rate-stats && npm run test:external-customer-audit && npm run test:telemetry-export",
+    "test:hook-stop-verify-deploy": "node --test tests/hook-stop-verify-deploy.test.js",
+    "test:hook-stop-anti-claim": "node --test tests/hook-stop-anti-claim.test.js",
+    "test:plausible-server-events": "node --test tests/plausible-server-events.test.js",
+    "test:activation-tracker": "node --test tests/activation-tracker.test.js",
+    "test:unified-revenue-rollup": "node --test tests/unified-revenue-rollup.test.js",
+    "test:conversion-rate-stats": "node --test tests/conversion-rate-stats.test.js",
+    "test:external-customer-audit": "node --test tests/external-customer-audit.test.js",
+    "test:telemetry-export": "node --test tests/telemetry-export.test.js",
     "test:swarm-coordinator": "node --test tests/swarm-coordinator.test.js",
     "test:session-report": "node --test tests/session-report.test.js",
     "test:agent-reasoning-traces": "node --test tests/agent-reasoning-traces.test.js tests/agent-stack-survival-audit.test.js",
@@ -346,6 +360,8 @@
     "test:telemetry-tracked-link-slug": "node --test tests/telemetry-tracked-link-slug.test.js",
     "test:prompt-eval": "node --test tests/prompt-eval.test.js",
     "eval:feedback": "node scripts/prompt-eval.js --from-feedback",
+    "eval:feedback-quality": "python3 scripts/feedback_quality_eval.py",
+    "eval:classifier": "python3 scripts/eval_gate_classifier.py",
     "test:decision-trace": "node --test tests/decision-trace.test.js",
     "test:feedback-fallback": "node --test tests/feedback-fallback.test.js",
     "test:metaclaw": "node --test tests/metaclaw-features.test.js",
@@ -365,6 +381,7 @@
     "test:learn-hub": "node --test tests/learn-hub.test.js",
     "test:feedback-to-rules": "node --test tests/feedback-to-rules.test.js",
     "test:memory-firewall": "node --test tests/memory-firewall.test.js",
+    "test:memory-scope-readiness": "node --test tests/memory-scope-readiness.test.js",
     "test:belief-update": "node --test tests/belief-update.test.js",
     "test:hosted-config": "node --test tests/hosted-config.test.js",
     "test:operational-summary": "node --test tests/operational-summary.test.js",
@@ -403,10 +420,10 @@
     "test:kto": "node --test tests/export-kto.test.js",
     "test:api": "node --test --test-concurrency=1 tests/api-server.test.js tests/api-events-sse.test.js tests/api-auth-config.test.js tests/mcp-server.test.js tests/adapters.test.js tests/openapi-parity.test.js tests/budget-guard.test.js tests/context-manager.test.js tests/contextfs.test.js tests/job-api.test.js tests/pack-templates.test.js tests/dashboard.test.js tests/dashboard-render-spec.test.js tests/dashboard-html.test.js tests/agent-readiness.test.js tests/mcp-policy.test.js tests/subagent-profiles.test.js tests/intent-router.test.js tests/internal-agent-bootstrap.test.js tests/lesson-search.test.js tests/thumbgate-search.test.js tests/document-intake.test.js tests/rubric-engine.test.js tests/self-healing-check.test.js tests/self-heal.test.js tests/feedback-schema.test.js tests/thompson-sampling.test.js tests/feedback-sequences.test.js tests/diversity-tracking.test.js tests/vector-store.test.js tests/gemini-embedding-policy.test.js tests/feedback-attribution.test.js tests/hybrid-feedback-context.test.js tests/loop-closure.test.js tests/code-reasoning.test.js tests/feedback-loop.test.js tests/feedback-inbox-read.test.js tests/feedback-to-memory.test.js tests/test-coverage.test.js tests/version-metadata.test.js tests/claude-mcpb.test.js tests/claude-codex-bridge.test.js tests/cursor-plugin.test.js tests/codex-plugin.test.js tests/ide-marketplace-extensions.test.js tests/telemetry-analytics.test.js tests/public-landing.test.js tests/lessons-page.test.js tests/pro-landing.test.js tests/local-model-profile.test.js tests/risk-scorer.test.js tests/context-compaction.test.js tests/reminder-engine.test.js tests/post-to-x.test.js tests/verification-loop.test.js tests/async-job-runner.test.js tests/commerce-quality.test.js tests/recall-limit.test.js tests/problem-detail.test.js tests/natural-language-harness.test.js tests/settings-hierarchy.test.js",
     "test:proof": "node --test tests/prove-adapters.test.js tests/prove-attribution.test.js tests/prove-cloudflare-sandbox.test.js tests/prove-data-quality.test.js tests/prove-intelligence.test.js tests/prove-lancedb.test.js tests/prove-loop-closure.test.js tests/prove-training-export.test.js tests/prove-local-intelligence.test.js tests/prove-workflow-contract.test.js tests/prove-autoresearch.test.js tests/prove-claim-verification.test.js tests/prove-data-pipeline.test.js tests/prove-evolution.test.js tests/prove-harnesses.test.js tests/prove-packaged-runtime.test.js tests/prove-runtime.test.js tests/prove-seo-gsd.test.js tests/prove-settings.test.js tests/prove-xmemory.test.js && node --test tests/prove-automation.test.js",
-    "test:e2e": "node --test tests/e2e-pipeline.test.js tests/e2e-product-flows.test.js tests/e2e-coverage-contract.test.js",
+    "test:e2e": "node --test tests/e2e-pipeline.test.js tests/e2e-product-flows.test.js tests/e2e-coverage-contract.test.js tests/interaction-model-e2e.test.js",
     "test:rlaif": "node --test tests/rlaif-self-audit.test.js tests/dpo-optimizer.test.js tests/meta-policy.test.js tests/agent-reward-model.test.js",
     "test:attribution": "node --test tests/feedback-attribution.test.js tests/hybrid-feedback-context.test.js",
-    "test:quality": "node --test tests/validate-feedback.test.js",
+    "test:quality": "node --test tests/validate-feedback.test.js tests/feedback-quality-eval-python.test.js tests/eval-gate-classifier.test.js",
     "test:intelligence": "node --test tests/intelligence.test.js",
     "test:training-export": "node --test tests/training-export.test.js tests/databricks-export.test.js",
     "test:deployment": "node --test tests/deployment.test.js tests/deploy-policy.test.js tests/publish-decision.test.js tests/changeset-check.test.js tests/release-notes.test.js tests/sonarcloud-workflow.test.js tests/package-boundary.test.js tests/public-package-boundary.test.js tests/revenue-observability-workflow.test.js",
@@ -486,8 +503,11 @@
     "test:zernio-status": "node --test tests/zernio-status.test.js",
     "test:license": "node --test tests/license.test.js",
     "test:bot-detector": "node --test tests/bot-detector.test.js",
+    "test:audit-pr-bot-contamination": "node --test tests/audit-pr-bot-contamination.test.js",
+    "test:stripe-bootstrap-saas-catalog": "node --test tests/stripe-bootstrap-saas-catalog.test.js",
     "test:bot-detection": "node --test tests/bot-detection.test.js",
     "test:checkout-bot-guard": "node --test tests/checkout-bot-guard.test.js",
+    "test:checkout-pro-confirmation-gate": "node --test tests/checkout-pro-confirmation-gate.test.js",
     "test:postinstall": "node --test tests/postinstall.test.js",
     "test:funnel-invariants": "node --test tests/funnel-invariants.test.js",
     "test:cli-telemetry": "node --test tests/cli-telemetry.test.js",
@@ -597,7 +617,8 @@
     "test:gate-eval": "node --test tests/gate-eval.test.js",
     "gate-eval:ci": "node scripts/gate-eval.js run",
     "test:ai-engineering-stack-guardrails": "node --test tests/ai-engineering-stack-guardrails.test.js",
-    "test:high-roi": "node --test tests/high-roi.test.js tests/model-candidates.test.js tests/autonomous-workflow.test.js tests/high-roi-agent-workflows.test.js tests/code-graph-guardrails.test.js tests/proxy-pointer-rag-guardrails.test.js tests/rag-precision-guardrails.test.js tests/ai-engineering-stack-guardrails.test.js tests/long-running-agent-context-guardrails.test.js tests/reasoning-efficiency-guardrails.test.js tests/deepseek-v4-runtime-guardrails.test.js tests/upstream-contribution-engine.test.js tests/proactive-agent-eval-guardrails.test.js tests/reward-hacking-guardrails.test.js tests/chatgpt-ads-readiness-pack.test.js tests/oss-pr-opportunity-scout.test.js tests/agent-design-governance.test.js tests/gemini-embedding-policy.test.js tests/openclaw-agent-governance-kit.test.js",
+    "test:interaction-model": "node --test tests/interaction-model.test.js tests/interaction-model-e2e.test.js",
+    "test:high-roi": "node --test tests/high-roi.test.js tests/model-candidates.test.js tests/autonomous-workflow.test.js tests/high-roi-agent-workflows.test.js tests/interaction-model.test.js tests/interaction-model-e2e.test.js tests/code-graph-guardrails.test.js tests/proxy-pointer-rag-guardrails.test.js tests/rag-precision-guardrails.test.js tests/ai-engineering-stack-guardrails.test.js tests/long-running-agent-context-guardrails.test.js tests/reasoning-efficiency-guardrails.test.js tests/deepseek-v4-runtime-guardrails.test.js tests/upstream-contribution-engine.test.js tests/proactive-agent-eval-guardrails.test.js tests/reward-hacking-guardrails.test.js tests/chatgpt-ads-readiness-pack.test.js tests/oss-pr-opportunity-scout.test.js tests/agent-design-governance.test.js tests/gemini-embedding-policy.test.js tests/openclaw-agent-governance-kit.test.js",
     "test:public-static-assets": "node --test tests/public-static-assets.test.js",
     "test:token-savings": "node --test tests/token-savings.test.js",
     "test:numbers-page": "node --test tests/numbers-page.test.js",
@@ -674,9 +695,9 @@
     "node": ">=18.18.0"
   },
   "dependencies": {
-    "@anthropic-ai/sdk": "0.92.0",
+    "@anthropic-ai/sdk": "0.95.2",
     "@google/genai": "1.49.0",
-    "@huggingface/transformers": "^4.1.0",
+    "@huggingface/transformers": "^4.2.0",
     "@lancedb/lancedb": "^0.27.2",
     "apache-arrow": "^18.1.0",
     "better-sqlite3": "^12.9.0",
@@ -692,7 +713,7 @@
   },
   "mcpName": "io.github.IgorGanapolsky/thumbgate",
   "devDependencies": {
-    "@changesets/changelog-github": "^0.6.0",
+    "@changesets/changelog-github": "^0.7.0",
     "@changesets/cli": "^2.31.0",
     "c8": "^11.0.0",
     "undici": "^8.2.0"

package/public/compare.html CHANGED

@@ -255,6 +255,12 @@
   <p><a href="/compare/agentix-labs" class="cta">Read ThumbGate vs Agentix Labs</a></p>
 </div>
+<div class="card">
+  <h3>Already using a supply-chain scanner?</h3>
+  <p>HEIDI (Meterian) catches AI assistants suggesting vulnerable npm/pip/maven packages — the supply-chain half of AI coding safety. ThumbGate is the behavior half: it blocks the agent's tool call before it fires a known-bad pattern. Different threat surfaces, both local-first, both free at base. Run both.</p>
+  <p><a href="/compare/heidi" class="cta">Read ThumbGate vs HEIDI</a></p>
+</div>
 <h2>How It Works</h2>
 <div class="step-grid">
   <div class="step-card">