thumbgate 1.18.0 → 1.20.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -1,6 +1,6 @@
  {
  "name": "thumbgate-marketplace",
- "version": "1.18.0",
+ "version": "1.20.0",
  "owner": {
  "name": "Igor Ganapolsky",
  "email": "ig5973700@gmail.com"
@@ -13,7 +13,7 @@
  "source": "npm",
  "package": "thumbgate"
  },
- "version": "1.18.0",
+ "version": "1.20.0",
  "author": {
  "name": "Igor Ganapolsky"
  },
@@ -1,7 +1,7 @@
  {
  "name": "thumbgate",
  "description": "Type 👍 or 👎 on any agent action. ThumbGate captures it, distills a lesson, and blocks the pattern from repeating. One thumbs-down = the agent physically cannot make that mistake again. 33 pre-action checks, budget enforcement, self-protection, and NIST/SOC2 compliance tags.",
- "version": "1.18.0",
+ "version": "1.20.0",
  "author": {
  "name": "Igor Ganapolsky"
  },
@@ -1,6 +1,6 @@
  {
  "name": "thumbgate",
- "version": "1.18.0",
+ "version": "1.20.0",
  "description": "ThumbGate — 👍👎 feedback that teaches your AI agent. Thumbs down a mistake, it never happens again.",
  "homepage": "https://thumbgate-production.up.railway.app",
  "transport": "stdio",
package/README.md CHANGED
@@ -8,7 +8,7 @@
 
  **Your AI coding bill has a leak.**

- **Stop paying $ for the same AI mistake.**
+ **Stop paying for the same AI mistake twice.**

  Every retry loop, every hallucinated import, every *"let me try a different approach"* — those are billable tokens on every LLM vendor's bill. Thumbs-down once; ThumbGate blocks that exact mistake on every future call. Across Claude Code, Cursor, Codex, Gemini, Amp, Cline, OpenCode — any MCP-compatible agent, forever.
 
@@ -24,6 +24,14 @@ Under the hood: your thumbs-down becomes one of your **Pre-Action Checks** that
 
  ---

+ > *"A better dashboard doesn't make the agents more reliable. The hard part isn't visibility. It's trust."*
+ >
+ > — **Rob May**, CEO & co-founder, Neurometric AI, quoted in [The New Stack](https://thenewstack.io/claude-code-agent-view/) on Anthropic's Claude Code Agent View (May 2026).
+ >
+ > ThumbGate is the open-source layer that makes the trust part real: PreToolUse gates, thumbs-down to rule, audit trail on every interception.
+
+ ---
+
  ## 🎬 90-second demo

  Watch the force-push scenario: agent tries to `git push --force`, one thumbs-down, next session it's blocked — zero tokens spent on the repeat.
@@ -83,8 +91,8 @@ ThumbGate doesn't make your agent smarter. It makes your agent *cheaper to be wr
  ## Quick Start

  ```bash
- npx thumbgate init # auto-detects your agent, wires everything
- npx thumbgate capture "Never run DROP on production tables"
+ npx thumbgate init # auto-detects your agent, wires everything
+ npx thumbgate capture --feedback=down --context="Never run DROP on production tables"
  ```

  That single command creates a prevention rule. Next time any AI agent tries to run `DROP` on production:
@@ -107,13 +115,26 @@ ThumbGate operates as a 4-layer enforcement stack between your AI agent and your
  Your thumbs-up/down reactions are captured via MCP protocol, CLI, or the ChatGPT GPT surface. Each reaction is stored as a structured lesson with context, timestamp, and severity.

  ### Layer 2: Check Engine
- The check engine converts lessons into enforceable rules using pattern matching, semantic similarity (via LanceDB vectors), and Thompson Sampling for adaptive rule selection. Rules stay in local ThumbGate runtime state.
+ The check engine converts lessons into enforceable rules. **The runtime gate decision is deterministic:** literal pattern match → AST match → scoped rule lookup. No LLM call on the enforcement path.
+
+ Where retrieval is needed (an agent is about to run a destructive command not on the literal block list, but semantically similar to one we've blocked before), ThumbGate uses local CPU-only `bge-small` embeddings via LanceDB's built-in pipeline. No external API call, no inference cost beyond CPU. So **"no LLM in enforcement"** holds: the gate decision uses no LLM; the rule corpus is just searchable via local embeddings.
+
+ **Thompson Sampling tunes per-rule confidence weights** for soft-gating rules so high-noise rules quiet down and high-signal rules sharpen. It never decides *whether* a rule fires — a hard rule like "block `git push --force` on main" always fires deterministically. Bandit exploration would be terrifying for hard rules; we don't do it.
+
+ Rules stay in local ThumbGate runtime state.
 
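Reviewer sketch for the Layer 2 text above: one way Thompson Sampling could tune a soft rule's confidence weight while hard rules skip the bandit entirely. The rule fields (`hard`, `helpful`, `noisy`) and every function name are assumptions made for illustration, not ThumbGate's actual API, and the Beta posterior over "was this rule firing useful" is just one common setup.

```javascript
// Illustrative only: names and rule fields are hypothetical, not ThumbGate's API.

// Draw from Gamma(k, 1) for a positive integer k as a sum of k exponentials.
function sampleGammaInt(k, rng) {
  let sum = 0;
  for (let i = 0; i < k; i++) {
    // Clamp away from 0 so Math.log never returns -Infinity.
    sum += -Math.log(Math.max(rng(), Number.EPSILON));
  }
  return sum;
}

// Draw from Beta(a, b) for positive integers a and b using two gamma draws.
function sampleBeta(a, b, rng) {
  const x = sampleGammaInt(a, rng);
  const y = sampleGammaInt(b, rng);
  return x / (x + y);
}

// Fold one observed outcome into a soft rule's counts (returns a new object).
function recordOutcome(rule, wasHelpful) {
  return wasHelpful
    ? { ...rule, helpful: rule.helpful + 1 }
    : { ...rule, noisy: rule.noisy + 1 };
}

// Hard rules always carry full weight; soft rules get a weight sampled
// from Beta(1 + helpful, 1 + noisy), i.e. a uniform prior plus outcomes.
function confidenceWeight(rule, rng = Math.random) {
  if (rule.hard) return 1;
  return sampleBeta(1 + rule.helpful, 1 + rule.noisy, rng);
}

const forcePush = { id: 'no-force-push', hard: true, helpful: 0, noisy: 0 };
const styleNit = recordOutcome(
  { id: 'style-nit', hard: false, helpful: 3, noisy: 9 },
  false
);
console.log(confidenceWeight(forcePush)); // hard rule: weight is always 1
console.log(confidenceWeight(styleNit));  // soft rule: a random draw between 0 and 1
```

Passing the random source as a parameter keeps the sampler testable with a seeded generator; a rule with many noisy outcomes draws low weights most of the time, which matches the "high-noise rules quiet down" behaviour the README describes.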
  ### Layer 3: Pre-Action Interception
  Before any agent action executes, ThumbGate's `PreToolUse` hook intercepts the command and evaluates it against all active checks. This happens at the MCP protocol level — the agent physically cannot bypass it.
 
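Reviewer sketch for the Layer 3 text above: a minimal deterministic gate that checks a proposed command against active rules before it runs. The rule shape, the two example patterns, and the `gateCheck` name are assumptions for illustration; the package's real `gate-check` command may work quite differently.

```javascript
// Illustrative only: the rule shape and function name are hypothetical.

const activeRules = [
  {
    id: 'no-force-push',
    pattern: /\bgit\s+push\b.*\s--force\b/,
    reason: 'force-push was thumbed down',
  },
  {
    id: 'no-drop-table',
    pattern: /\bDROP\s+TABLE\b/i,
    reason: 'DROP on production was thumbed down',
  },
];

// Evaluate a proposed command against every active rule before it runs.
// Purely deterministic: the first matching pattern blocks, otherwise allow.
function gateCheck(command, rules) {
  for (const rule of rules) {
    if (rule.pattern.test(command)) {
      return { decision: 'block', ruleId: rule.id, reason: rule.reason };
    }
  }
  return { decision: 'allow' };
}

console.log(gateCheck('git push --force origin main', activeRules).decision); // block
console.log(gateCheck('git push origin main', activeRules).decision);         // allow
```

Returning the matching rule id and reason alongside the decision is what would let a caller write the audit-trail entry the README mentions for each interception.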
- ### Layer 4: Multi-Agent Distribution
- Checks are distributed across all connected agents via MCP stdio protocol. One correction in Claude Code protects Cursor, Codex, Gemini CLI, Cline, and any MCP-compatible agent.
+ ### Layer 4: Multi-Agent Distribution (the actual moat vs hand-rolled hooks)
+ Claude Code already ships `permissions.deny` and `PreToolUse` hooks. Cursor and Codex have their own. So why ThumbGate over a hand-written hook?
+
+ Two things hand-written hooks structurally cannot do:
+
+ 1. **Cross-agent propagation.** A `permissions.deny` pattern lives in one agent's config and stays there. ThumbGate's checks distribute across every connected agent over MCP stdio — thumbs-down once in Cursor, the same pattern blocks on Claude Code, Codex, Gemini CLI, Cline, OpenCode, Amp in the next session, no copy-paste between configs.
+ 2. **Learning loop.** A hand-written hook covers exactly the patterns you wrote. ThumbGate promotes every thumbs-down into a fresh rule, tunes existing rules' confidence weights from outcomes (Thompson Sampling, see Layer 2), and pulls semantically-near patterns into scope via local embeddings. The rule corpus sharpens without an operator hand-writing a regex for every new mistake shape.
+
+ Hand-rolled hooks are the right tool for a small, static denylist you maintain by hand. ThumbGate is the right tool when you want corrections from any agent to harden every agent automatically.

  Prompt engineering still matters, but it is only the starting point. ThumbGate adds prompt evaluation on top: proof lanes, benchmarks, and self-heal checks tell you whether your prompt and workflow actually held up under execution instead of leaving you to guess from vibes. Run `npx thumbgate eval --from-feedback --write-report=.thumbgate/prompt-eval-proof.md` to turn real thumbs-up/down feedback into reusable eval cases and a buyer-ready proof report.
 
@@ -232,10 +253,10 @@ ThumbGate sells three concrete outcomes:
  ## CLI Reference

  ```bash
- npx thumbgate init # detect agent, wire hooks
- npx thumbgate doctor # health check
- npx thumbgate capture # create a check from text
- npx thumbgate lessons # see what's been learned
+ npx thumbgate init # detect agent, wire hooks
+ npx thumbgate doctor # health check
+ npx thumbgate capture --feedback=up|down --context="<text>" # capture a signal as a stored lesson
+ npx thumbgate lessons # see what's been learned
  npx thumbgate explore # terminal explorer for lessons, checks, stats
  npx thumbgate background-governance # review background-agent run risk
  npx thumbgate model-candidates --workload=dashboard-analysis --provider=openai --json # evaluate GPT-5.5 routing
@@ -423,6 +444,7 @@ Pro ($19/mo or $149/yr) removes the rule cap and adds history-aware lesson recal
 
  ## Docs

+ - [**ThumbGate for Federal Agencies**](docs/FEDERAL.md) — pilot-ready posture, NIST 800-53 control mapping, OMB M-24-10 / EO 14110 alignment, ThumbGate-Core gov deployment mode, public/Core boundary invariants. Landing page: [thumbgate.ai/federal](https://thumbgate-production.up.railway.app/federal).
  - [First Dollar Playbook](docs/FIRST_DOLLAR_PLAYBOOK.md) — turning one painful workflow into the next booked pilot
  - [Commercial Truth](docs/COMMERCIAL_TRUTH.md) — pricing, claims, what we don't say
  - [Changeset Strategy](docs/CHANGESET_STRATEGY.md) — release notes and version bump enforcement
@@ -2,13 +2,13 @@
  "mcpServers": {
  "thumbgate": {
  "command": "npx",
- "args": ["--yes", "--package", "thumbgate@1.18.0", "thumbgate", "serve"]
+ "args": ["--yes", "--package", "thumbgate@1.20.0", "thumbgate", "serve"]
  }
  },
  "hooks": {
  "preToolUse": {
  "command": "npx",
- "args": ["--yes", "--package", "thumbgate@1.18.0", "thumbgate", "gate-check"]
+ "args": ["--yes", "--package", "thumbgate@1.20.0", "thumbgate", "gate-check"]
  }
  }
  }
@@ -216,7 +216,7 @@ const {
  finalizeSession: finalizeFeedbackSession,
  } = require('../../scripts/feedback-session');

- const SERVER_INFO = { name: 'thumbgate-mcp', version: '1.18.0' };
+ const SERVER_INFO = { name: 'thumbgate-mcp', version: '1.20.0' };
  const COMMERCE_CATEGORIES = [
  'product_recommendation',
  'brand_compliance',
@@ -7,7 +7,7 @@
  "npx",
  "--yes",
  "--package",
- "thumbgate@1.18.0",
+ "thumbgate@1.20.0",
  "thumbgate",
  "serve"
  ],
package/bin/cli.js CHANGED
@@ -141,6 +141,10 @@ function telemetryPing(installId) {
 
  function proNudge(context) {
  if (process.env.THUMBGATE_NO_NUDGE === '1') return;
+ try {
+ const { isProTier } = require(path.join(PKG_ROOT, 'scripts', 'rate-limiter'));
+ if (isProTier()) return;
+ } catch (_) { /* if rate-limiter is unavailable, fall through and nudge */ }
  const messages = [
  `\n 💡 Unlock Pro (${PRO_PRICE_LABEL}): searchable dashboard, DPO export, multi-repo sync\n ${PRO_CHECKOUT_URL}\n`,
  `\n 💡 Pro tip: export your feedback as DPO training pairs to improve your models.\n Get Pro: ${PRO_CHECKOUT_URL}\n`,
@@ -1157,7 +1161,10 @@ function pro() {
  }

  const resolvedKey = resolveProKey();
- if (resolvedKey && resolvedKey.key) {
+ // creator-dev legitimately returns {key: '', source: 'creator-dev', plan: 'enterprise'}
+ // when THUMBGATE_DEV_KEY is unset — startLocalProDashboard accepts the empty
+ // key in that case (see pro-local-dashboard.js:141), so launch on source too.
+ if (resolvedKey && (resolvedKey.key || resolvedKey.source === 'creator-dev')) {
  return launchDashboard(resolvedKey.key, 'pro_dashboard_launch');
  }
 
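Reviewer sketch of what the changed condition in `pro()` accepts and rejects. The helper name `shouldLaunchDashboard` and the sample key and `source: 'env'` values are made up for illustration; only the boolean expression and the creator-dev object shape come from the hunk above.

```javascript
// Illustrative only: wraps the new condition in a hypothetical helper.
function shouldLaunchDashboard(resolvedKey) {
  return Boolean(
    resolvedKey && (resolvedKey.key || resolvedKey.source === 'creator-dev')
  );
}

// A non-empty key launches, as before the change.
console.log(shouldLaunchDashboard({ key: 'example-key', source: 'env' })); // true
// New in this version: an empty key from the creator-dev source also launches.
console.log(shouldLaunchDashboard({ key: '', source: 'creator-dev', plan: 'enterprise' })); // true
// An empty key from any other source still does not launch.
console.log(shouldLaunchDashboard({ key: '', source: 'env' })); // false
console.log(shouldLaunchDashboard(null)); // false
```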
@@ -2299,6 +2306,36 @@ function startApi() {
 
  function help() {
  const v = pkgVersion();
+ const helpArgs = process.argv.slice(3);
+ const showAll = helpArgs.includes('all')
+ || helpArgs.includes('--all')
+ || helpArgs.includes('--full');
+
+ // Default `thumbgate help` shows a curated short list. The full ~70-command
+ // surface lives behind `thumbgate help all` so first-time users aren't hit
+ // with a wall of text. (Pre-2026-05-18 default: dump everything.)
+ if (!showAll) {
+ console.log(`thumbgate v${v} — pre-action checks for AI coding agents`);
+ console.log('');
+ console.log('Common commands:');
+ console.log(' init Detect agent and wire ThumbGate hooks');
+ console.log(' capture --feedback=up|down --context="<text>" Capture a thumbs signal as a stored lesson');
+ console.log(' stats Approval rate, recent trend, blocked-pattern count');
+ console.log(' lessons [query] Search promoted lessons');
+ console.log(' explore Interactive TUI for lessons, gates, stats');
+ console.log(' dashboard Open the local ThumbGate dashboard');
+ console.log(' doctor Audit runtime isolation + bootstrap context');
+ console.log(' pro ThumbGate Pro (dashboard, exports, sync)');
+ console.log('');
+ console.log('More:');
+ console.log(' thumbgate help all Full subcommand surface (~70 commands)');
+ console.log(' thumbgate <cmd> --help Per-command flags (where supported)');
+ console.log('');
+ console.log('Docs: https://github.com/IgorGanapolsky/ThumbGate');
+ proNudge();
+ return;
+ }
+
  const { groupedCommands, commandHelpLine } = require(path.join(PKG_ROOT, 'scripts', 'cli-schema'));
  const groups = groupedCommands();
  const GROUP_LABELS = {
@@ -2310,7 +2347,7 @@ function help() {
  advanced: 'Advanced',
  };

- console.log(`thumbgate v${v} — pre-action checks for AI coding agents`);
+ console.log(`thumbgate v${v} — pre-action checks for AI coding agents (full command surface)`);
  console.log('');

  for (const [groupKey, label] of Object.entries(GROUP_LABELS)) {
@@ -76,9 +76,40 @@
  "medianLatencyMs",
  "costPerAnalysisUsd"
  ]
+ },
+ "tokenizer-brittleness": {
+ "label": "Tokenizer brittleness and byte-level robustness",
+ "summary": "Evaluate models for malformed JSONL, Unicode confusables, stack traces, secrets, SQL snippets, file paths, and code-symbol-heavy inputs before routing log, code, or security workloads.",
+ "desiredStrengths": ["tokenizer-free", "byte-level", "log-robustness", "code-symbols", "security-scanning", "fast-inference"],
+ "targetContextWindow": 64000,
+ "benchmarkCommands": [
+ "npx thumbgate model-candidates --workload=tokenizer-brittleness --json",
+ "node --test tests/model-candidates.test.js --test-name-pattern=tokenizer",
+ "node scripts/gate-eval.js run"
+ ],
+ "metrics": [
+ "caseCoverage",
+ "symbolPreservationRate",
+ "secretDetectionRecall",
+ "jsonlRepairAccuracy",
+ "medianLatencyMs",
+ "memoryBandwidthEstimate"
+ ]
  }
  },
  "candidates": [
+ {
+ "id": "research/fast-byte-latent-transformer",
+ "vendor": "Meta + Stanford + University of Washington",
+ "family": "blt",
+ "provider": "research",
+ "model": "fast-byte-latent-transformer",
+ "contextWindow": 64000,
+ "costClass": "medium",
+ "researchOnly": true,
+ "strengths": ["tokenizer-free", "byte-level", "log-robustness", "code-symbols", "security-scanning", "fast-inference"],
+ "notes": "Research-only candidate inspired by Fast BLT. Use as an evaluation target for tokenizer-free robustness and memory-bandwidth planning; do not route production traffic until a maintained runtime and model weights exist."
+ },
  {
  "id": "self-hosted/deepseek-v4-flash-sglang",
  "vendor": "DeepSeek",
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "thumbgate",
- "version": "1.18.0",
+ "version": "1.20.0",
  "description": "ThumbGate self-improving agent governance: thumbs-up/down turns every mistake into a prevention rule and blocks repeat patterns. 33 pre-action checks, budget enforcement, and self-protection for Claude Code, Cursor, Codex, Gemini CLI, and Amp.",
  "homepage": "https://thumbgate-production.up.railway.app",
  "repository": {
@@ -39,6 +39,7 @@
  "public/codex-plugin.html",
  "public/compare.html",
  "public/dashboard.html",
+ "public/federal.html",
  "public/guide.html",
  "public/index.html",
  "public/learn.html",
@@ -46,6 +47,7 @@
  "public/numbers.html",
  "public/pro.html",
  "scripts/access-anomaly-detector.js",
+ "scripts/activation-tracker.js",
  "scripts/agent-audit-trace.js",
  "scripts/agent-design-governance.js",
  "scripts/agent-memory-lifecycle.js",
@@ -152,6 +154,7 @@
  "scripts/mcp-policy.js",
  "scripts/mcp-transport-strategy.js",
  "scripts/memory-firewall.js",
+ "scripts/memory-scope-readiness.js",
  "scripts/meta-agent-loop.js",
  "scripts/model-access-eligibility.js",
  "scripts/model-migration-readiness.js",
@@ -163,6 +166,7 @@
  "scripts/otel-declarative-config.js",
  "scripts/operational-integrity.js",
  "scripts/perplexity-client.js",
+ "scripts/plausible-server-events.js",
  "scripts/pr-manager.js",
  "scripts/private-core-boundary.js",
  "scripts/pro-local-dashboard.js",
@@ -245,6 +249,7 @@
  "verify:full": "node scripts/verify-run.js full",
  "budget:status": "node scripts/budget-guard.js --status",
  "revenue:status": "node scripts/revenue-status.js",
+ "revenue:doctor": "node scripts/revenue-observability-doctor.js",
  "revenue:plan": "node scripts/may-2026-revenue-machine.js",
  "revenue:status:local": "node bin/cli.js cfo",
  "revenue:repair:github-marketplace": "node bin/cli.js repair-github-marketplace --write",
@@ -323,7 +328,19 @@
  "social:prospect:bluesky": "node scripts/social-bluesky-prospecting.js",
  "social:prospect:bluesky:dry": "node scripts/social-bluesky-prospecting.js --dry-run",
  "social:reply-publish:bluesky:dry": "node scripts/social-reply-monitor-bluesky.js --publish-approved --dry-run",
- "test": "npm run test:schema && npm run test:loop && npm run test:dpo && npm run test:kto && npm run test:api && npm run test:proof && npm run test:e2e && npm run test:rlaif && npm run test:attribution && npm run test:quality && npm run test:intelligence && npm run test:training-export && npm run test:deployment && npm run test:operational-integrity && npm run test:workflow && npm run test:billing && npm run test:cli && npm run test:watcher && npm run test:autoresearch && npm run test:ops && npm run test:session-analyzer && npm run test:tessl && npm run test:gates && npm run test:evoskill && npm run test:gates-hardening && npm run test:workers && npm run test:social-analytics && npm run test:memalign && npm run test:xmemory-lite && npm run test:filesystem-search && npm run test:zernio && npm run test:platform-limits && npm run test:post-video && npm run test:post-everywhere-instagram && npm run test:post-everywhere-channels && npm run test:post-everywhere-zernio-default && npm run test:zernio-canonical-pollers && npm run test:zernio-status && npm run test:obsidian-export && npm run test:lesson-db && npm run test:lesson-rotation && npm run test:memory-dedup && npm run test:feedback-quality && npm run test:sync-version && npm run test:check-congruence && npm run test:tool-registry && npm run test:feedback-to-rules && npm run test:memory-firewall && npm run test:belief-update && npm run test:hosted-config && npm run test:operational-summary && npm run test:operational-dashboard && npm run test:operator-artifacts && npm run test:operator-key-auth && npm run test:cloudflare-sandbox && npm run test:mcp-config && npm run test:plan-gate && npm run test:pulse && npm run test:semantic-layer && npm run test:data-pipeline && npm run test:optimize-context && npm run test:principle-extractor && npm run test:analytics-window && npm run test:funnel-analytics && npm run test:experiment-tracker && npm run test:build-metadata && npm run test:context-engine && npm run test:hf-papers 
&& npm run test:marketing-experiment && npm run test:seo-gsd && npm run test:verify-run && npm run test:export-dpo-pairs && npm run test:export-hf-dataset && npm run test:license && npm run test:bot-detector && npm run test:postinstall && npm run test:funnel-invariants && npm run test:cli-telemetry && npm run test:pro-parity && npm run test:model-tier-router && npm run test:computer-use-firewall && npm run test:skill-exporter && npm run test:statusline && npm run test:evolution && npm run test:org-dashboard && npm run test:multi-hop-recall && npm run test:synthetic-dpo && npm run test:thumbgate-skill && npm run test:learn-hub && npm run test:feedback-fallback && npm run test:metaclaw && npm run test:server-lock && npm run test:control-tower && npm run test:pii-scanner && npm run test:data-governance && npm run test:lesson-inference && npm run test:semantic-dedup && npm run test:fs-utils && npm run test:cli-schema && npm run test:explore && npm run test:lesson-reranker && npm run test:lesson-retrieval && npm run test:cross-encoder && npm run test:reflector-agent && npm run test:feedback-session && npm run test:feedback-history-distiller && npm run test:hallucination-detector && npm run test:history-distiller && npm run test:predictive-insights && npm run test:prove-predictive-insights && npm run test:statusbar-cli && npm run test:generate-instagram-card && npm run test:instagram-thumbgate-post && npm run test:publish-instagram-thumbgate && npm run test:lesson-synthesis && npm run test:lesson-canonical && npm run test:background-governance && npm run test:memory-migration && npm run test:prompt-dlp && npm run test:ephemeral-store && npm run test:agent-security && npm run test:skill-progressive && npm run test:per-step-scoring && npm run test:weekly-auto-post && npm run test:social-post-hourly && npm run test:social-quality-gate && npm run test:a2ui-engine && npm run test:gate-satisfy && npm run test:money-watcher && npm run test:budget && npm run test:quick-start && 
npm run test:utm && npm run test:product-feedback && npm run test:feedback-root-consolidator && npm run test:engagement-audit && npm run test:install-growth-automation && npm run test:publish-thumbgate-launch && npm run test:community-course-platform-launch-kit && npm run test:reconcile-thumbgate-campaign && npm run test:reddit-publisher && npm run test:schedule-thumbgate-campaign && npm run test:social-reply-monitor && npm run test:social-dedupe-cleanup && npm run test:sync-launch-assets && npm run test:ai-search-visibility && npm run test:perplexity && npm run test:security-scanner && npm run test:llm-client && npm run test:managed-lesson-agent && npm run test:self-distill && npm run test:meta-agent && npm run test:harness-selector && npm run test:thumbgate-bench && npm run test:seo-guides && npm run test:enforcement-loop && npm run test:cli-agent-experience && npm run test:bot-detection && npm run test:checkout-bot-guard && npm run test:session-health && npm run test:session-episodes && npm run test:spec-gate && npm run test:decision-trace && npm run test:dashboard-insights && npm run test:telemetry-tracked-link-slug && npm run test:prompt-eval && npm run test:demo-voiceover && npm run test:gate-coherence && npm run test:gate-eval && npm run test:high-roi && npm run test:public-static-assets && npm run test:token-savings && npm run test:numbers-page && npm run test:workflow-gate-checkpoint && npm run test:lesson-export-import && npm run test:landing-page-claims && npm run test:competitive-positioning-marketing && npm run test:medium-weekly && npm run test:dashboard-deeplink-e2e && npm run test:public-package-parity && npm run test:token-savings-dashboard && npm run test:cursor-wiring && npm run test:pretooluse-injection && npm run test:recent-corrective-context && npm run test:durability-step && npm run test:mailer && npm run test:brand-assets && npm run test:enforcement-teeth && npm run test:bayes-optimal-gate && npm run test:swarm-coordinator && npm run 
test:session-report && npm run test:agent-reasoning-traces && npm run test:judge-reward && npm run test:llm-behavior-monitor && npm run test:prompting-os && npm run test:single-use-credential-gate && npm run test:structured-prompt-driven && npm run test:require-evidence-gate && npm run test:rule-validator && npm run test:bluesky-atproto && npm run test:social-reply-monitor-bluesky && npm run test:bluesky-delete-replies && npm run test:architect-kit-memory-bridge && npm run test:sonar-review-hotspots && npm run test:actionable-remediations && npm run test:gemini-embedding-policy && npm run test:agent-design-governance && npm run test:public-core-boundary",
+ "test:python": "python3 -m pytest tests/*.py",
+ "test": "npm run test:python && npm run test:schema && npm run test:loop && npm run test:dpo && npm run test:kto && npm run test:api && npm run test:proof && npm run test:e2e && npm run test:rlaif && npm run test:attribution && npm run test:quality && npm run test:intelligence && npm run test:training-export && npm run test:deployment && npm run test:operational-integrity && npm run test:workflow && npm run test:billing && npm run test:cli && npm run test:watcher && npm run test:autoresearch && npm run test:ops && npm run test:session-analyzer && npm run test:tessl && npm run test:gates && npm run test:evoskill && npm run test:gates-hardening && npm run test:workers && npm run test:social-analytics && npm run test:memalign && npm run test:xmemory-lite && npm run test:filesystem-search && npm run test:zernio && npm run test:platform-limits && npm run test:post-video && npm run test:post-everywhere-instagram && npm run test:post-everywhere-channels && npm run test:post-everywhere-zernio-default && npm run test:zernio-canonical-pollers && npm run test:zernio-status && npm run test:obsidian-export && npm run test:lesson-db && npm run test:lesson-rotation && npm run test:memory-dedup && npm run test:feedback-quality && npm run test:sync-version && npm run test:check-congruence && npm run test:tool-registry && npm run test:feedback-to-rules && npm run test:memory-firewall && npm run test:memory-scope-readiness && npm run test:belief-update && npm run test:hosted-config && npm run test:operational-summary && npm run test:operational-dashboard && npm run test:operator-artifacts && npm run test:operator-key-auth && npm run test:cloudflare-sandbox && npm run test:mcp-config && npm run test:plan-gate && npm run test:pulse && npm run test:semantic-layer && npm run test:data-pipeline && npm run test:optimize-context && npm run test:principle-extractor && npm run test:analytics-window && npm run test:funnel-analytics && npm run test:experiment-tracker && npm run 
test:build-metadata && npm run test:context-engine && npm run test:hf-papers && npm run test:marketing-experiment && npm run test:seo-gsd && npm run test:verify-run && npm run test:export-dpo-pairs && npm run test:export-hf-dataset && npm run test:license && npm run test:bot-detector && npm run test:audit-pr-bot-contamination && npm run test:stripe-bootstrap-saas-catalog && npm run test:postinstall && npm run test:funnel-invariants && npm run test:cli-telemetry && npm run test:pro-parity && npm run test:model-tier-router && npm run test:computer-use-firewall && npm run test:skill-exporter && npm run test:statusline && npm run test:evolution && npm run test:org-dashboard && npm run test:multi-hop-recall && npm run test:synthetic-dpo && npm run test:thumbgate-skill && npm run test:learn-hub && npm run test:feedback-fallback && npm run test:metaclaw && npm run test:server-lock && npm run test:control-tower && npm run test:pii-scanner && npm run test:data-governance && npm run test:lesson-inference && npm run test:semantic-dedup && npm run test:fs-utils && npm run test:cli-schema && npm run test:explore && npm run test:lesson-reranker && npm run test:lesson-retrieval && npm run test:cross-encoder && npm run test:reflector-agent && npm run test:feedback-session && npm run test:feedback-history-distiller && npm run test:hallucination-detector && npm run test:history-distiller && npm run test:predictive-insights && npm run test:prove-predictive-insights && npm run test:statusbar-cli && npm run test:generate-instagram-card && npm run test:instagram-thumbgate-post && npm run test:publish-instagram-thumbgate && npm run test:lesson-synthesis && npm run test:lesson-canonical && npm run test:background-governance && npm run test:memory-migration && npm run test:prompt-dlp && npm run test:ephemeral-store && npm run test:agent-security && npm run test:skill-progressive && npm run test:per-step-scoring && npm run test:weekly-auto-post && npm run test:social-post-hourly && npm run 
test:social-quality-gate && npm run test:a2ui-engine && npm run test:gate-satisfy && npm run test:money-watcher && npm run test:budget && npm run test:quick-start && npm run test:utm && npm run test:product-feedback && npm run test:feedback-root-consolidator && npm run test:engagement-audit && npm run test:install-growth-automation && npm run test:publish-thumbgate-launch && npm run test:community-course-platform-launch-kit && npm run test:reconcile-thumbgate-campaign && npm run test:reddit-publisher && npm run test:schedule-thumbgate-campaign && npm run test:social-reply-monitor && npm run test:social-dedupe-cleanup && npm run test:sync-launch-assets && npm run test:ai-search-visibility && npm run test:perplexity && npm run test:security-scanner && npm run test:llm-client && npm run test:managed-lesson-agent && npm run test:self-distill && npm run test:meta-agent && npm run test:harness-selector && npm run test:thumbgate-bench && npm run test:seo-guides && npm run test:enforcement-loop && npm run test:cli-agent-experience && npm run test:bot-detection && npm run test:checkout-bot-guard && npm run test:checkout-pro-confirmation-gate && npm run test:session-health && npm run test:session-episodes && npm run test:spec-gate && npm run test:decision-trace && npm run test:dashboard-insights && npm run test:telemetry-tracked-link-slug && npm run test:prompt-eval && npm run test:demo-voiceover && npm run test:gate-coherence && npm run test:gate-eval && npm run test:high-roi && npm run test:public-static-assets && npm run test:token-savings && npm run test:numbers-page && npm run test:workflow-gate-checkpoint && npm run test:lesson-export-import && npm run test:landing-page-claims && npm run test:competitive-positioning-marketing && npm run test:medium-weekly && npm run test:dashboard-deeplink-e2e && npm run test:public-package-parity && npm run test:token-savings-dashboard && npm run test:cursor-wiring && npm run test:pretooluse-injection && npm run 
test:recent-corrective-context && npm run test:durability-step && npm run test:mailer && npm run test:brand-assets && npm run test:enforcement-teeth && npm run test:bayes-optimal-gate && npm run test:swarm-coordinator && npm run test:session-report && npm run test:agent-reasoning-traces && npm run test:judge-reward && npm run test:llm-behavior-monitor && npm run test:prompting-os && npm run test:single-use-credential-gate && npm run test:structured-prompt-driven && npm run test:require-evidence-gate && npm run test:rule-validator && npm run test:bluesky-atproto && npm run test:social-reply-monitor-bluesky && npm run test:bluesky-delete-replies && npm run test:architect-kit-memory-bridge && npm run test:sonar-review-hotspots && npm run test:actionable-remediations && npm run test:gemini-embedding-policy && npm run test:agent-design-governance && npm run test:public-core-boundary && npm run test:hook-stop-verify-deploy && npm run test:hook-stop-anti-claim && npm run test:plausible-server-events && npm run test:activation-tracker && npm run test:unified-revenue-rollup && npm run test:conversion-rate-stats && npm run test:external-customer-audit && npm run test:telemetry-export && npm run test:stripe-checkout-diagnostic && npm run test:stripe-business-identity-probe && npm run test:revenue-observability-doctor && npm run test:public-bundle-ratchet && npm run test:ci-cd-hygiene-audit",
+ "test:hook-stop-verify-deploy": "node --test tests/hook-stop-verify-deploy.test.js",
+ "test:hook-stop-anti-claim": "node --test tests/hook-stop-anti-claim.test.js",
+ "test:plausible-server-events": "node --test tests/plausible-server-events.test.js",
+ "test:activation-tracker": "node --test tests/activation-tracker.test.js",
+ "test:unified-revenue-rollup": "node --test tests/unified-revenue-rollup.test.js",
+ "test:conversion-rate-stats": "node --test tests/conversion-rate-stats.test.js",
+ "test:external-customer-audit": "node --test tests/external-customer-audit.test.js",
+ "test:stripe-checkout-diagnostic": "node --test tests/stripe-checkout-diagnostic.test.js",
+ "test:stripe-business-identity-probe": "node --test tests/stripe-business-identity-probe.test.js",
+ "test:ci-cd-hygiene-audit": "node --test tests/ci-cd-hygiene-audit.test.js",
+ "test:telemetry-export": "node --test tests/telemetry-export.test.js",
  "test:swarm-coordinator": "node --test tests/swarm-coordinator.test.js",
  "test:session-report": "node --test tests/session-report.test.js",
  "test:agent-reasoning-traces": "node --test tests/agent-reasoning-traces.test.js tests/agent-stack-survival-audit.test.js",
@@ -348,6 +365,7 @@
348
365
  "test:prompt-eval": "node --test tests/prompt-eval.test.js",
349
366
  "eval:feedback": "node scripts/prompt-eval.js --from-feedback",
350
367
  "eval:feedback-quality": "python3 scripts/feedback_quality_eval.py",
368
+ "eval:classifier": "python3 scripts/eval_gate_classifier.py",
351
369
  "test:decision-trace": "node --test tests/decision-trace.test.js",
352
370
  "test:feedback-fallback": "node --test tests/feedback-fallback.test.js",
353
371
  "test:metaclaw": "node --test tests/metaclaw-features.test.js",
@@ -367,6 +385,7 @@
367
385
  "test:learn-hub": "node --test tests/learn-hub.test.js",
368
386
  "test:feedback-to-rules": "node --test tests/feedback-to-rules.test.js",
369
387
  "test:memory-firewall": "node --test tests/memory-firewall.test.js",
388
+ "test:memory-scope-readiness": "node --test tests/memory-scope-readiness.test.js",
370
389
  "test:belief-update": "node --test tests/belief-update.test.js",
371
390
  "test:hosted-config": "node --test tests/hosted-config.test.js",
372
391
  "test:operational-summary": "node --test tests/operational-summary.test.js",
@@ -405,10 +424,10 @@
405
424
  "test:kto": "node --test tests/export-kto.test.js",
406
425
  "test:api": "node --test --test-concurrency=1 tests/api-server.test.js tests/api-events-sse.test.js tests/api-auth-config.test.js tests/mcp-server.test.js tests/adapters.test.js tests/openapi-parity.test.js tests/budget-guard.test.js tests/context-manager.test.js tests/contextfs.test.js tests/job-api.test.js tests/pack-templates.test.js tests/dashboard.test.js tests/dashboard-render-spec.test.js tests/dashboard-html.test.js tests/agent-readiness.test.js tests/mcp-policy.test.js tests/subagent-profiles.test.js tests/intent-router.test.js tests/internal-agent-bootstrap.test.js tests/lesson-search.test.js tests/thumbgate-search.test.js tests/document-intake.test.js tests/rubric-engine.test.js tests/self-healing-check.test.js tests/self-heal.test.js tests/feedback-schema.test.js tests/thompson-sampling.test.js tests/feedback-sequences.test.js tests/diversity-tracking.test.js tests/vector-store.test.js tests/gemini-embedding-policy.test.js tests/feedback-attribution.test.js tests/hybrid-feedback-context.test.js tests/loop-closure.test.js tests/code-reasoning.test.js tests/feedback-loop.test.js tests/feedback-inbox-read.test.js tests/feedback-to-memory.test.js tests/test-coverage.test.js tests/version-metadata.test.js tests/claude-mcpb.test.js tests/claude-codex-bridge.test.js tests/cursor-plugin.test.js tests/codex-plugin.test.js tests/ide-marketplace-extensions.test.js tests/telemetry-analytics.test.js tests/public-landing.test.js tests/lessons-page.test.js tests/pro-landing.test.js tests/local-model-profile.test.js tests/risk-scorer.test.js tests/context-compaction.test.js tests/reminder-engine.test.js tests/post-to-x.test.js tests/verification-loop.test.js tests/async-job-runner.test.js tests/commerce-quality.test.js tests/recall-limit.test.js tests/problem-detail.test.js tests/natural-language-harness.test.js tests/settings-hierarchy.test.js",
407
426
  "test:proof": "node --test tests/prove-adapters.test.js tests/prove-attribution.test.js tests/prove-cloudflare-sandbox.test.js tests/prove-data-quality.test.js tests/prove-intelligence.test.js tests/prove-lancedb.test.js tests/prove-loop-closure.test.js tests/prove-training-export.test.js tests/prove-local-intelligence.test.js tests/prove-workflow-contract.test.js tests/prove-autoresearch.test.js tests/prove-claim-verification.test.js tests/prove-data-pipeline.test.js tests/prove-evolution.test.js tests/prove-harnesses.test.js tests/prove-packaged-runtime.test.js tests/prove-runtime.test.js tests/prove-seo-gsd.test.js tests/prove-settings.test.js tests/prove-xmemory.test.js && node --test tests/prove-automation.test.js",
408
- "test:e2e": "node --test tests/e2e-pipeline.test.js tests/e2e-product-flows.test.js tests/e2e-coverage-contract.test.js",
427
+ "test:e2e": "node --test tests/e2e-pipeline.test.js tests/e2e-product-flows.test.js tests/e2e-coverage-contract.test.js tests/interaction-model-e2e.test.js",
409
428
  "test:rlaif": "node --test tests/rlaif-self-audit.test.js tests/dpo-optimizer.test.js tests/meta-policy.test.js tests/agent-reward-model.test.js",
410
429
  "test:attribution": "node --test tests/feedback-attribution.test.js tests/hybrid-feedback-context.test.js",
411
- "test:quality": "node --test tests/validate-feedback.test.js tests/feedback-quality-eval-python.test.js",
430
+ "test:quality": "node --test tests/validate-feedback.test.js tests/feedback-quality-eval-python.test.js tests/eval-gate-classifier.test.js",
412
431
  "test:intelligence": "node --test tests/intelligence.test.js",
413
432
  "test:training-export": "node --test tests/training-export.test.js tests/databricks-export.test.js",
414
433
  "test:deployment": "node --test tests/deployment.test.js tests/deploy-policy.test.js tests/publish-decision.test.js tests/changeset-check.test.js tests/release-notes.test.js tests/sonarcloud-workflow.test.js tests/package-boundary.test.js tests/public-package-boundary.test.js tests/revenue-observability-workflow.test.js",
@@ -416,7 +435,7 @@
416
435
  "test:workflow": "node --test tests/workflow-contract.test.js tests/social-marketing-assets.test.js tests/social-pipeline.test.js tests/positioning-contract.test.js tests/docs-claim-hygiene.test.js tests/thumbgate-scope.test.js tests/workflow-runs.test.js tests/workflow-sprint-intake.test.js tests/gtm-revenue-loop.test.js tests/may-2026-revenue-machine.test.js tests/customer-discovery-sprint.test.js tests/revenue-pack-utils.test.js tests/aiventyx-marketplace-plan.test.js tests/cursor-marketplace-revenue-pack.test.js tests/codex-marketplace-revenue-pack.test.js tests/codex-plugin-revenue-pack.test.js tests/gemini-cli-demand-pack.test.js tests/roo-sunset-demand-pack.test.js tests/linkedin-workflow-hardening-pack.test.js tests/chatgpt-gpt-revenue-pack.test.js tests/mcp-directory-revenue-pack.test.js tests/money-marketplace-distribution-pack.test.js tests/autonomous-sales-agent.test.js tests/reddit-dm-workflow-hardening-pack.test.js tests/sales-pipeline.test.js tests/reddit-dm-outreach.test.js tests/github-outreach.test.js tests/enterprise-story.test.js tests/ralph-loop.test.js tests/ralph-mode-ci.test.js tests/guide-conversion-path.test.js tests/roo-sunset-marketing.test.js",
417
436
  "test:sales-pipeline": "node --test tests/sales-pipeline.test.js",
418
437
  "test:billing": "node --test tests/billing.test.js tests/stripe-sync-product-images.test.js",
419
- "test:cli": "node --test tests/analytics-report.test.js tests/agent-design-governance.test.js tests/codex-self-heal.test.js tests/creator-campaigns.test.js tests/cli.test.js tests/codex-bridge-script.test.js tests/dependabot-changeset.test.js tests/dispatch-brief.test.js tests/feedback-normalize.test.js tests/install-mcp.test.js tests/pr-manager.test.js tests/pro-local-dashboard.test.js tests/published-cli.test.js tests/revenue-status.test.js tests/stripe-live-status.test.js",
438
+ "test:cli": "node --test tests/analytics-report.test.js tests/agent-design-governance.test.js tests/codex-self-heal.test.js tests/creator-campaigns.test.js tests/cli.test.js tests/codex-bridge-script.test.js tests/dependabot-changeset.test.js tests/dispatch-brief.test.js tests/feedback-normalize.test.js tests/install-mcp.test.js tests/pr-manager.test.js tests/pro-local-dashboard.test.js tests/published-cli.test.js tests/revenue-status.test.js tests/stripe-live-status.test.js tests/creator-dev-and-prune.test.js",
420
439
  "test:evolution": "node --test tests/workspace-evolver.test.js",
421
440
  "test:watcher": "node --test tests/jsonl-watcher.test.js",
422
441
  "test:autoresearch": "node --test tests/autoresearch.test.js",
@@ -488,8 +507,12 @@
488
507
  "test:zernio-status": "node --test tests/zernio-status.test.js",
489
508
  "test:license": "node --test tests/license.test.js",
490
509
  "test:bot-detector": "node --test tests/bot-detector.test.js",
510
+ "test:audit-pr-bot-contamination": "node --test tests/audit-pr-bot-contamination.test.js",
511
+ "test:stripe-bootstrap-saas-catalog": "node --test tests/stripe-bootstrap-saas-catalog.test.js",
491
512
  "test:bot-detection": "node --test tests/bot-detection.test.js",
492
513
  "test:checkout-bot-guard": "node --test tests/checkout-bot-guard.test.js",
514
+ "test:checkout-pro-confirmation-gate": "node --test tests/checkout-pro-confirmation-gate.test.js",
515
+ "test:revenue-observability-doctor": "node --test tests/revenue-observability-doctor.test.js",
493
516
  "test:postinstall": "node --test tests/postinstall.test.js",
494
517
  "test:funnel-invariants": "node --test tests/funnel-invariants.test.js",
495
518
  "test:cli-telemetry": "node --test tests/cli-telemetry.test.js",
@@ -599,7 +622,8 @@
599
622
  "test:gate-eval": "node --test tests/gate-eval.test.js",
600
623
  "gate-eval:ci": "node scripts/gate-eval.js run",
601
624
  "test:ai-engineering-stack-guardrails": "node --test tests/ai-engineering-stack-guardrails.test.js",
602
- "test:high-roi": "node --test tests/high-roi.test.js tests/model-candidates.test.js tests/autonomous-workflow.test.js tests/high-roi-agent-workflows.test.js tests/code-graph-guardrails.test.js tests/proxy-pointer-rag-guardrails.test.js tests/rag-precision-guardrails.test.js tests/ai-engineering-stack-guardrails.test.js tests/long-running-agent-context-guardrails.test.js tests/reasoning-efficiency-guardrails.test.js tests/deepseek-v4-runtime-guardrails.test.js tests/upstream-contribution-engine.test.js tests/proactive-agent-eval-guardrails.test.js tests/reward-hacking-guardrails.test.js tests/chatgpt-ads-readiness-pack.test.js tests/oss-pr-opportunity-scout.test.js tests/agent-design-governance.test.js tests/gemini-embedding-policy.test.js tests/openclaw-agent-governance-kit.test.js",
625
+ "test:interaction-model": "node --test tests/interaction-model.test.js tests/interaction-model-e2e.test.js",
626
+ "test:high-roi": "node --test tests/high-roi.test.js tests/model-candidates.test.js tests/autonomous-workflow.test.js tests/high-roi-agent-workflows.test.js tests/interaction-model.test.js tests/interaction-model-e2e.test.js tests/code-graph-guardrails.test.js tests/proxy-pointer-rag-guardrails.test.js tests/rag-precision-guardrails.test.js tests/ai-engineering-stack-guardrails.test.js tests/long-running-agent-context-guardrails.test.js tests/reasoning-efficiency-guardrails.test.js tests/deepseek-v4-runtime-guardrails.test.js tests/upstream-contribution-engine.test.js tests/proactive-agent-eval-guardrails.test.js tests/reward-hacking-guardrails.test.js tests/chatgpt-ads-readiness-pack.test.js tests/oss-pr-opportunity-scout.test.js tests/agent-design-governance.test.js tests/gemini-embedding-policy.test.js tests/openclaw-agent-governance-kit.test.js",
603
627
  "test:public-static-assets": "node --test tests/public-static-assets.test.js",
604
628
  "test:token-savings": "node --test tests/token-savings.test.js",
605
629
  "test:numbers-page": "node --test tests/numbers-page.test.js",
@@ -621,7 +645,8 @@
621
645
  "test:brand-assets": "node --test tests/brand-assets.test.js",
622
646
  "test:enforcement-teeth": "node --test tests/enforcement-teeth.test.js",
623
647
  "test:bayes-optimal-gate": "node --test tests/bayes-optimal-gate.test.js",
624
- "test:actionable-remediations": "node --test tests/actionable-remediations.test.js"
648
+ "test:actionable-remediations": "node --test tests/actionable-remediations.test.js",
649
+ "test:public-bundle-ratchet": "node --test tests/public-bundle-ratchet.test.js"
625
650
  },
626
651
  "keywords": [
627
652
  "mcp",
@@ -676,7 +701,7 @@
676
701
  "node": ">=18.18.0"
677
702
  },
678
703
  "dependencies": {
679
- "@anthropic-ai/sdk": "0.92.0",
704
+ "@anthropic-ai/sdk": "0.95.2",
680
705
  "@google/genai": "1.49.0",
681
706
  "@huggingface/transformers": "^4.2.0",
682
707
  "@lancedb/lancedb": "^0.27.2",
@@ -255,6 +255,18 @@
255
255
  <p><a href="/compare/agentix-labs" class="cta">Read ThumbGate vs Agentix Labs</a></p>
256
256
  </div>
257
257
 
258
+ <div class="card">
259
+ <h3>Already using a supply-chain scanner?</h3>
260
+ <p>HEIDI (Meterian) catches AI assistants suggesting vulnerable npm/pip/maven packages — the supply-chain half of AI coding safety. ThumbGate is the behavior half: it blocks the agent's tool call before it fires a known-bad pattern. Different threat surfaces, both local-first, both free at base. Run both.</p>
261
+ <p><a href="/compare/heidi" class="cta">Read ThumbGate vs HEIDI</a></p>
262
+ </div>
263
+
264
+ <div class="card">
265
+ <h3>Evaluating Rein for AI agent governance?</h3>
266
+ <p>Rein is a Python decorator that wraps tool functions with policy checks, audit trails, and circuit breakers — targeted at production apps in trading, healthcare, and legal. ThumbGate is the coding-agent specialist with a learning feedback loop and MIT licensing. Same category (pre-action gates), different layer and stack.</p>
267
+ <p><a href="/compare/rein" class="cta">Read ThumbGate vs Rein</a></p>
268
+ </div>
269
+
258
270
  <h2>How It Works</h2>
259
271
  <div class="step-grid">
260
272
  <div class="step-card">