thumbgate 1.27.11 → 1.27.13

Files changed (131)
  1. package/.claude-plugin/plugin.json +1 -1
  2. package/.well-known/llms.txt +2 -1
  3. package/.well-known/mcp/server-card.json +1 -1
  4. package/README.md +2 -4
  5. package/adapters/claude/.mcp.json +2 -2
  6. package/adapters/mcp/server-stdio.js +1 -1
  7. package/adapters/opencode/opencode.json +1 -1
  8. package/adapters/policy-engine/ethicore-guardian-client.js +68 -0
  9. package/adapters/policy-engine/thumbgate-policy-engine-adapter.js +260 -0
  10. package/bin/cli.js +78 -259
  11. package/config/builtin-lessons.json +23 -0
  12. package/config/gate-templates.json +0 -228
  13. package/config/gates/claim-verification.json +0 -18
  14. package/package.json +35 -25
  15. package/public/assets/brand/thumbgate-logo-transparent.svg +22 -0
  16. package/public/assets/brand/thumbgate-mark-inline-v3.svg +19 -0
  17. package/public/assets/brand/thumbgate-mark.svg +11 -5
  18. package/public/blog.html +0 -30
  19. package/public/brand/thumbgate-mark.svg +9 -5
  20. package/public/chatgpt-app.html +2 -2
  21. package/public/compare.html +2 -1
  22. package/public/dashboard.html +1 -1
  23. package/public/federal.html +1 -1
  24. package/public/index.html +95 -216
  25. package/public/learn.html +59 -35
  26. package/public/lessons.html +1 -1
  27. package/public/numbers.html +2 -2
  28. package/public/pro.html +7 -7
  29. package/scripts/aws-blocks-guardrails.js +228 -0
  30. package/scripts/cli-schema.js +22 -10
  31. package/scripts/dashboard-chat.js +2 -1
  32. package/scripts/document-intake.js +1 -49
  33. package/scripts/durability/step.js +3 -3
  34. package/scripts/gate-stats.js +5 -11
  35. package/scripts/gemini-embedding-policy.js +2 -1
  36. package/scripts/hook-stop-anti-claim.js +116 -184
  37. package/scripts/hosted-config.js +0 -12
  38. package/scripts/llm-client.js +187 -5
  39. package/scripts/plausible-domain-config.js +3 -1
  40. package/scripts/seo-gsd.js +240 -1
  41. package/scripts/tool-registry.js +2 -2
  42. package/scripts/vector-store.js +44 -0
  43. package/scripts/workspace-evolver.js +62 -2
  44. package/src/api/server.js +340 -131
  45. package/public/assets/brand/thumbgate-mark-inline.svg +0 -15
  46. package/public/compare/adopt-ai.html +0 -219
  47. package/public/compare/agentix-labs.html +0 -197
  48. package/public/compare/ai-experience-orchestration.html +0 -216
  49. package/public/compare/anthropic-claude-for-legal.html +0 -260
  50. package/public/compare/anthropic-containment.html +0 -280
  51. package/public/compare/arcade.html +0 -175
  52. package/public/compare/arcjet.html +0 -239
  53. package/public/compare/bumblebee.html +0 -307
  54. package/public/compare/claude-code-hooks.html +0 -294
  55. package/public/compare/databricks-unity-ai-gateway.html +0 -215
  56. package/public/compare/fallow.html +0 -351
  57. package/public/compare/heidi.html +0 -233
  58. package/public/compare/mem0.html +0 -342
  59. package/public/compare/oak-and-sparrow-gatekeeper.html +0 -289
  60. package/public/compare/rein.html +0 -236
  61. package/public/compare/sigmashake.html +0 -256
  62. package/public/compare/speclock.html +0 -342
  63. package/public/guides/agent-harness-optimization.html +0 -342
  64. package/public/guides/agentic-web-governance.html +0 -406
  65. package/public/guides/ai-agent-governance-sprint.html +0 -415
  66. package/public/guides/ai-agent-pre-action-approval-gates.html +0 -401
  67. package/public/guides/ai-agent-workflow-migration-checklist.html +0 -392
  68. package/public/guides/ai-deployment-readiness.html +0 -415
  69. package/public/guides/ai-mode-ads-agent-governance.html +0 -401
  70. package/public/guides/ai-search-topical-presence.html +0 -342
  71. package/public/guides/autoresearch-agent-safety.html +0 -342
  72. package/public/guides/background-agent-governance.html +0 -358
  73. package/public/guides/best-tools-stop-ai-agents-breaking-production.html +0 -363
  74. package/public/guides/browser-automation-safety.html +0 -342
  75. package/public/guides/chatgpt-ads-trust.html +0 -353
  76. package/public/guides/claude-code-feedback.html +0 -339
  77. package/public/guides/claude-code-prevent-repeated-mistakes.html +0 -161
  78. package/public/guides/claude-code-skills-guardrails.html +0 -343
  79. package/public/guides/claude-desktop.html +0 -356
  80. package/public/guides/code-knowledge-graph-guardrails.html +0 -365
  81. package/public/guides/codex-cli-guardrails.html +0 -339
  82. package/public/guides/cursor-agent-guardrails.html +0 -339
  83. package/public/guides/cursor-prevent-repeated-mistakes.html +0 -161
  84. package/public/guides/database-agent-safety.html +0 -406
  85. package/public/guides/deepseek-v4-runtime-guardrails.html +0 -346
  86. package/public/guides/developer-machine-supply-chain-guardrails.html +0 -358
  87. package/public/guides/gcp-mcp-guardrails.html +0 -147
  88. package/public/guides/gemini-cli-feedback-memory.html +0 -339
  89. package/public/guides/gpt-5-5-model-evaluation.html +0 -358
  90. package/public/guides/internal-ai-engineering-stack-guardrails.html +0 -348
  91. package/public/guides/long-running-agent-context-management.html +0 -346
  92. package/public/guides/mcp-tool-governance.html +0 -401
  93. package/public/guides/multica-thumbgate-setup.html +0 -134
  94. package/public/guides/native-messaging-host-security.html +0 -342
  95. package/public/guides/policy-engine-pre-action-gates.html +0 -346
  96. package/public/guides/pre-action-checks.html +0 -342
  97. package/public/guides/pretooluse-hooks-vs-advisory-prompt-rules.html +0 -342
  98. package/public/guides/prompt-tricks-to-workflow-rules.html +0 -365
  99. package/public/guides/proxy-pointer-rag-guardrails.html +0 -352
  100. package/public/guides/rag-precision-tuning-guardrails.html +0 -352
  101. package/public/guides/reasoning-compression-guardrails.html +0 -346
  102. package/public/guides/relational-knowledge-ai-recommendations.html +0 -342
  103. package/public/guides/roo-code-alternative-cline.html +0 -339
  104. package/public/guides/semantic-programmatic-seo-guardrails.html +0 -352
  105. package/public/guides/seo-agent-skills-guardrails.html +0 -344
  106. package/public/guides/stop-repeated-ai-agent-mistakes.html +0 -342
  107. package/public/learn/ac-dc-runtime-enforcement.html +0 -277
  108. package/public/learn/agent-harness-pattern.html +0 -181
  109. package/public/learn/agent-identity-connector-governance.html +0 -146
  110. package/public/learn/agent-swarms-shared-gates.html +0 -173
  111. package/public/learn/agentic-enterprise-context-brain.html +0 -117
  112. package/public/learn/agentic-os-team-governance.html +0 -146
  113. package/public/learn/ai-agent-governance.html +0 -158
  114. package/public/learn/ai-agent-persistent-memory.html +0 -211
  115. package/public/learn/anthropomorphic-claim-gates.html +0 -180
  116. package/public/learn/background-agent-control-layer.html +0 -184
  117. package/public/learn/claude-code-goal-with-rubrics.html +0 -205
  118. package/public/learn/codex-role-plugins-need-governance.html +0 -125
  119. package/public/learn/cost-aware-agent-gate-routing.html +0 -173
  120. package/public/learn/databricks-unity-ai-gateway-runtime-governance.html +0 -157
  121. package/public/learn/deterministic-agent-workflows.html +0 -185
  122. package/public/learn/feedback-loop-vs-decision-layer.html +0 -283
  123. package/public/learn/from-prototype-to-production.html +0 -223
  124. package/public/learn/learn.css +0 -51
  125. package/public/learn/mcp-pre-action-checks-explained.html +0 -172
  126. package/public/learn/pretix-stripe-connect-marketplaces.html +0 -161
  127. package/public/learn/regulated-agent-execution-boundary.html +0 -196
  128. package/public/learn/spec-driven-development.html +0 -168
  129. package/public/learn/stop-ai-agent-force-push.html +0 -134
  130. package/public/learn/vibe-coding-safety-net.html +0 -142
  131. package/scripts/reddit-browser-notification-watch.js +0 -230
package/public/learn/claude-code-goal-with-rubrics.html
@@ -1,205 +0,0 @@
1
- <!DOCTYPE html>
2
- <html lang="en">
3
- <head>
4
- <meta charset="UTF-8">
5
- <meta name="viewport" content="width=device-width, initial-scale=1.0">
6
- <title>Claude Code /goal vs Todo: The 4-Field Pattern That Actually Holds — ThumbGate</title>
7
- <script defer data-domain="thumbgate.ai" src="https://plausible.io/js/script.js"></script>
8
- <meta name="description" content="Treating /goal like a todo wastes Claude Code. The pattern that holds: clear goal, measurable success, shown proof, hard limits. Maps directly to ThumbGate's rubric-engine for enforced verification.">
9
- <meta name="keywords" content="claude code /goal, claude code goal command, agent goal pattern, measurable success criteria, rubric engine, AI agent verification, ThumbGate rubric">
10
- <meta property="og:title" content="Claude Code /goal vs Todo: The 4-Field Pattern That Actually Holds">
11
- <meta property="og:description" content="The /goal command becomes ten times more useful when you treat it as a verifiable contract, not a todo. Here is the 4-field pattern and how to enforce it.">
12
- <meta property="og:type" content="article">
13
- <meta property="og:url" content="https://thumbgate.ai/learn/claude-code-goal-with-rubrics">
14
- <link rel="canonical" href="https://thumbgate.ai/learn/claude-code-goal-with-rubrics">
15
-
16
- <script type="application/ld+json">
17
- {
18
- "@context": "https://schema.org",
19
- "@type": "TechArticle",
20
- "headline": "Claude Code /goal vs Todo: The 4-Field Pattern That Actually Holds",
21
- "description": "Treating /goal like a todo wastes Claude Code. The pattern that holds: clear goal, measurable success, shown proof, hard limits — enforced by ThumbGate rubrics.",
22
- "author": {
23
- "@type": "Person",
24
- "name": "Igor Ganapolsky",
25
- "url": "https://github.com/IgorGanapolsky"
26
- },
27
- "publisher": {
28
- "@type": "Organization",
29
- "name": "ThumbGate",
30
- "url": "https://thumbgate.ai"
31
- },
32
- "datePublished": "2026-05-18",
33
- "dateModified": "2026-05-18",
34
- "mainEntityOfPage": "https://thumbgate.ai/learn/claude-code-goal-with-rubrics",
35
- "about": [
36
- {"@type": "Thing", "name": "Claude Code /goal command"},
37
- {"@type": "Thing", "name": "AI agent rubric"},
38
- {"@type": "Thing", "name": "verifiable AI agent outcomes"}
39
- ]
40
- }
41
- </script>
42
-
43
- <link rel="stylesheet" href="/learn/learn.css">
44
- <style>
45
- table { width: 100%; border-collapse: collapse; margin: 1rem 0; }
46
- th, td { text-align: left; padding: 0.6rem 0.8rem; border-bottom: 1px solid var(--border); font-size: 0.9rem; vertical-align: top; }
47
- th { color: var(--cyan); font-weight: 600; }
48
- .mapping-row td:first-child { color: var(--green); font-weight: 500; }
49
- pre.example { background: var(--bg-raised); padding: 14px 16px; border-radius: 8px; font-size: 13px; border-left: 3px solid var(--cyan); margin: 12px 0; overflow-x: auto; }
50
- </style>
51
- </head>
52
- <body>
53
-
54
- <nav>
55
- <a href="/" class="brand"><img src="/assets/brand/thumbgate-mark-inline.svg" alt="ThumbGate" class="logo-mark" width="28" height="28"><span class="logo-text">ThumbGate</span></a>
56
- <a href="/guide">Setup Guide</a>
57
- <a href="/learn">Learn</a>
58
- <a href="/dashboard">Dashboard</a>
59
- <a href="https://github.com/IgorGanapolsky/ThumbGate" target="_blank" rel="noopener">GitHub</a>
60
- </nav>
61
-
62
- <div class="container">
63
- <div class="breadcrumb"><a href="/learn">Learn</a> / Claude Code /goal With Rubrics</div>
64
- <h1>Claude Code /goal vs Todo: The 4-Field Pattern That Actually Holds</h1>
65
- <p style="color:var(--muted);">6 min read &middot; For developers using Claude Code's /goal command in production</p>
66
-
67
- <div class="tldr"><strong>TL;DR:</strong> Treating <code>/goal</code> like a todo wastes the command. The pattern that holds: <strong>clear goal, measurable success, shown proof, hard limits.</strong> That's a 4-field rubric — the same shape ThumbGate's rubric-engine enforces at gate-fire time. Pair them and the agent cannot fake completion.</div>
68
-
69
- <h2>The todo-shaped /goal anti-pattern</h2>
70
- <p>A common usage of Claude Code's <code>/goal</code> command looks like this:</p>
71
-
72
- <pre class="example">/goal fix the auth bug</pre>
73
-
74
- <p>This is a todo, not a goal. There is no way for the agent or you to know when the work is actually done. The agent will declare success after the first plausible-looking change, you will discover it was wrong an hour later, and the same conversation will re-enter the loop.</p>
75
-
76
- <div class="callout">
77
- <strong>The verifiable-contract reframe:</strong> A goal is not a wish. It is a contract with a clear outcome, a measurable success criterion, an inspectable proof, and a stop condition. Anything short of all four is a todo.
78
- </div>
79
-
80
- <h2>The 4-field pattern</h2>
81
- <p>Each field maps to a specific failure mode the agent commits when the field is missing:</p>
82
-
83
- <table>
84
- <thead>
85
- <tr>
86
- <th>Field</th>
87
- <th>What it answers</th>
88
- <th>Failure mode if missing</th>
89
- </tr>
90
- </thead>
91
- <tbody>
92
- <tr class="mapping-row">
93
- <td>1. Clear goal</td>
94
- <td>What is the outcome, in one sentence, that someone outside the project would understand?</td>
95
- <td>Scope creep. Agent expands the task to whatever it can find.</td>
96
- </tr>
97
- <tr class="mapping-row">
98
- <td>2. Measurable success</td>
99
- <td>What single check, run after the work, returns 0 / pass / a specific number?</td>
100
- <td>"Looks done." Agent declares completion on optimistic intermediate signals.</td>
101
- </tr>
102
- <tr class="mapping-row">
103
- <td>3. Shown proof</td>
104
- <td>What output, file, or screenshot will be in the final message proving the check ran?</td>
105
- <td>Hallucinated completion. Agent reports a green test that was never executed.</td>
106
- </tr>
107
- <tr class="mapping-row">
108
- <td>4. Hard limits</td>
109
- <td>What is the deadline, retry cap, or scope cliff that stops the work even if not yet "done"?</td>
110
- <td>Infinite spin. Agent keeps trying variants past any reasonable budget.</td>
111
- </tr>
112
- </tbody>
113
- </table>
114
-
115
- <h2>Concrete /goal phrasing</h2>
116
- <p>Same task, todo-shape vs goal-shape:</p>
117
-
118
- <pre class="example"><strong>Todo shape (anti-pattern):</strong>
119
- /goal fix the auth bug
120
-
121
- <strong>Goal shape:</strong>
122
- /goal fix the auth bug
123
- success: npm test --testPathPattern=auth returns exit 0 with 12 passing
124
- proof: paste the final 'PASS auth' line of the test output
125
- limit: stop after 3 implementation attempts; if still failing, file a /thumbsdown</pre>
126
-
127
- <p>The agent now has a contract it cannot fake. The success criterion is mechanical (exit code + count). The proof is inspectable (a literal line of output). The limit is a stop condition that triggers a feedback capture instead of a fifth wasted attempt.</p>
128
-
129
- <h2>Where ThumbGate plugs in</h2>
130
- <p>The 4-field pattern is exactly the shape of a rubric. ThumbGate's <code>scripts/rubric-engine.js</code> evaluates each completed agent task against a 4-field rubric:</p>
131
-
132
- <table>
133
- <thead>
134
- <tr>
135
- <th>Claude Code /goal field</th>
136
- <th>ThumbGate rubric field</th>
137
- <th>What ThumbGate does</th>
138
- </tr>
139
- </thead>
140
- <tbody>
141
- <tr class="mapping-row">
142
- <td>Clear goal</td>
143
- <td><code>rubric.goal</code></td>
144
- <td>Stored as the canonical task description in the feedback log</td>
145
- </tr>
146
- <tr class="mapping-row">
147
- <td>Measurable success</td>
148
- <td><code>rubric.verification.check</code></td>
149
- <td>The check is run before the agent's "done" claim is promoted to memory</td>
150
- </tr>
151
- <tr class="mapping-row">
152
- <td>Shown proof</td>
153
- <td><code>rubric.verification.evidence</code></td>
154
- <td>Captured into the lesson DB; missing proof = no promotion</td>
155
- </tr>
156
- <tr class="mapping-row">
157
- <td>Hard limits</td>
158
- <td><code>rubric.budget</code></td>
159
- <td>Tied to budget-guard.js — when limit hit, ThumbGate forces a thumbs-down capture and surfaces it for review</td>
160
- </tr>
161
- </tbody>
162
- </table>
163
-
164
- <div class="callout callout-green">
165
- <strong>Two layers, same shape:</strong> Claude Code's <code>/goal</code> sets the contract at task start. ThumbGate's rubric-engine enforces it at task end. The agent cannot promote a "done" memory unless every rubric field has the proof attached.
166
- </div>
167
-
168
- <h2>What this prevents in practice</h2>
169
- <p>Three concrete agent failure modes the paired pattern blocks:</p>
170
-
171
- <ul>
172
- <li><strong>Test-skipping.</strong> Agent reports "tests pass" but never ran them. Rubric verification.check runs the actual command and refuses the "done" promotion if exit code != 0.</li>
173
- <li><strong>Optimistic completion.</strong> Agent declares success on the first plausible change. The proof field requires an inspectable artifact (test output line, screenshot, exit code) that has to exist before promotion.</li>
174
- <li><strong>Budget run-away.</strong> Agent retries silently across 8+ attempts. Hard limit triggers a budget-guard block, forces feedback capture, and prevents the same pattern from recurring on the next prompt.</li>
175
- </ul>
176
-
177
- <h2>Five minutes to wire it</h2>
178
- <p>In your project root:</p>
179
-
180
- <pre><code>npx thumbgate init</code></pre>
181
-
182
- <p>Then, in any conversation, use the goal-shape phrasing above. ThumbGate's rubric-engine runs as part of the PreToolUse hook chain; no extra config required. The first time the agent tries to promote a "done" claim without proof, you'll see the gate fire in your dashboard.</p>
183
-
184
- <div class="cta-box">
185
- <h2 style="color:var(--text);font-size:1.3rem;margin:0 0 8px;">Pair /goal with a rubric the agent cannot fake.</h2>
186
- <p>Works with Claude Code, Cursor, Codex, Gemini, Amp, OpenCode, and any MCP-compatible agent.</p>
187
- <div class="cta-install">$ npx thumbgate init</div>
188
- </div>
189
-
190
- <div class="related">
191
- <h3>Related articles</h3>
192
- <a href="/learn/agent-harness-pattern">The Agent Harness Pattern: Why Your AI Needs a Seatbelt &rarr;</a>
193
- <a href="/learn/agent-swarms-shared-gates">Agent Swarms: One Gate Layer, Every Model &rarr;</a>
194
- <a href="/learn/mcp-pre-action-checks-explained">MCP Pre-Action Checks Explained &rarr;</a>
195
- </div>
196
- </div>
197
-
198
-
199
- <div class="sticky-cta">
200
- <span style="color:var(--muted)">Try it now:</span>
201
- <code>npx thumbgate init</code>
202
- <a href="https://github.com/IgorGanapolsky/ThumbGate" target="_blank" rel="noopener">GitHub &rarr;</a>
203
- </div>
204
- </body>
205
- </html>
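
The removed page above describes a goal-shape phrasing with four fields (goal, success, proof, limit) and treats anything short of all four as a todo. As a rough illustration only, that check could be sketched as below. `parseGoal` and the `FIELDS` list are hypothetical names introduced here; this is not ThumbGate's rubric-engine or any Claude Code API, just a minimal sketch assuming the `success:` / `proof:` / `limit:` line prefixes shown in the page's example.

```javascript
// Hypothetical helper, not part of ThumbGate or Claude Code.
// Assumes the "/goal ...", "success:", "proof:", "limit:" line shapes
// used in the removed page's example.
const FIELDS = ["goal", "success", "proof", "limit"];

function parseGoal(text) {
  const contract = {};
  for (const rawLine of text.split("\n")) {
    const line = rawLine.trim();
    if (line.startsWith("/goal ")) {
      // Everything after the command name is the one-sentence goal.
      contract.goal = line.slice("/goal ".length).trim();
      continue;
    }
    // Remaining fields are "name: value" pairs on their own lines.
    const match = /^(success|proof|limit):\s*(.+)$/.exec(line);
    if (match) contract[match[1]] = match[2].trim();
  }
  // A contract missing any field is todo-shaped rather than goal-shaped.
  const missing = FIELDS.filter((field) => !contract[field]);
  return { contract, missing, isTodo: missing.length > 0 };
}
```

Calling `parseGoal("/goal fix the auth bug")` reports `success`, `proof`, and `limit` as missing, which matches the page's description of the todo-shaped anti-pattern.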
package/public/learn/codex-role-plugins-need-governance.html
@@ -1,125 +0,0 @@
1
- <!DOCTYPE html>
2
- <html lang="en">
3
- <head>
4
- <meta charset="UTF-8">
5
- <meta name="viewport" content="width=device-width, initial-scale=1.0">
6
- <title>Codex Role Plugins Need Pre-Action Governance — ThumbGate</title>
7
- <script defer data-domain="thumbgate.ai" src="https://plausible.io/js/script.js"></script>
8
- <meta name="description" content="Codex plugins, Sites, and annotations move AI work from code into sales, analytics, design, finance, and documents. Teams need pre-action governance before those workflows publish, share, or modify business systems.">
9
- <meta name="keywords" content="Codex plugins, Codex Sites, Codex annotations, ChatGPT Codex, role-specific AI agents, pre-action governance, Codex plugin governance, ThumbGate">
10
- <meta property="og:title" content="Codex Role Plugins Need Pre-Action Governance">
11
- <meta property="og:description" content="As Codex expands beyond coding into role-specific plugins, ThumbGate is the evidence and policy layer before AI work touches business systems.">
12
- <meta property="og:type" content="article">
13
- <meta property="og:url" content="https://thumbgate.ai/learn/codex-role-plugins-need-governance">
14
- <link rel="canonical" href="https://thumbgate.ai/learn/codex-role-plugins-need-governance">
15
-
16
- <script type="application/ld+json">
17
- {
18
- "@context": "https://schema.org",
19
- "@type": "TechArticle",
20
- "headline": "Codex Role Plugins Need Pre-Action Governance",
21
- "description": "Codex plugins, Sites, and annotations move AI work from code into sales, analytics, design, finance, and documents. Teams need pre-action governance before those workflows publish, share, or modify business systems.",
22
- "author": { "@type": "Person", "name": "Igor Ganapolsky", "url": "https://github.com/IgorGanapolsky" },
23
- "publisher": { "@type": "Organization", "name": "ThumbGate", "url": "https://thumbgate.ai" },
24
- "datePublished": "2026-06-03",
25
- "dateModified": "2026-06-03",
26
- "mainEntityOfPage": "https://thumbgate.ai/learn/codex-role-plugins-need-governance",
27
- "about": [
28
- { "@type": "Thing", "name": "Codex plugins" },
29
- { "@type": "Thing", "name": "Codex Sites" },
30
- { "@type": "Thing", "name": "role-specific AI agents" },
31
- { "@type": "Thing", "name": "pre-action governance" }
32
- ]
33
- }
34
- </script>
35
-
36
- <link rel="stylesheet" href="/learn/learn.css">
37
- <style>
38
- .matrix { width: 100%; border-collapse: collapse; margin: 1rem 0 1.5rem; }
39
- .matrix th, .matrix td { text-align: left; padding: 0.7rem 0.8rem; border-bottom: 1px solid var(--border); vertical-align: top; }
40
- .matrix th { color: var(--cyan); font-weight: 600; }
41
- </style>
42
- </head>
43
- <body>
44
-
45
- <nav>
46
- <a href="/" class="brand"><img src="/assets/brand/thumbgate-mark-inline.svg" alt="ThumbGate" class="logo-mark" width="28" height="28"><span class="logo-text">ThumbGate</span></a>
47
- <a href="/guide">Setup Guide</a>
48
- <a href="/learn">Learn</a>
49
- <a href="/dashboard">Dashboard</a>
50
- <a href="https://github.com/IgorGanapolsky/ThumbGate" target="_blank" rel="noopener">GitHub</a>
51
- </nav>
52
-
53
- <div class="container">
54
- <div class="breadcrumb"><a href="/learn">Learn</a> / Codex Role Plugin Governance</div>
55
- <h1>Codex role plugins need a governance layer before they touch business systems.</h1>
56
- <p style="color:var(--muted);">6 min read &middot; For teams adopting Codex plugins, Sites, annotations, and non-developer AI workflows</p>
57
-
58
- <div class="tldr"><strong>TL;DR:</strong> Codex is becoming a cross-functional work surface, not only a coding tool. OpenAI's Codex docs describe plugins as installable bundles of skills, app integrations, and MCP servers, plus Sites for hosted apps and dashboards. That makes ThumbGate's job sharper: enforce policy, evidence, and feedback-derived blocks before role-specific agents publish, share, edit, deploy, or write into customer systems.</div>
59
-
60
- <h2>The product shift</h2>
61
- <p>Codex plugins package skills, app integrations, and MCP servers into reusable workflows. Sites can turn Codex output into hosted websites, apps, dashboards, and games. Annotations let a user select part of a document, spreadsheet, or slide and ask Codex to work on that selected region.</p>
62
- <p>That is powerful because non-developers can now use the same inspect, edit, verify, report loop on business artifacts. It is risky for the same reason: the action surface expands from code to CRM records, revenue dashboards, design assets, finance decks, sales sequences, and hosted internal tools.</p>
63
-
64
- <div class="callout">
65
- <strong>ThumbGate's wedge:</strong> The more Codex becomes a role-specific operating layer, the more every team needs a pre-action policy layer outside the prompt.
66
- </div>
67
-
68
- <h2>What can go wrong without gates</h2>
69
- <ul>
70
- <li>A sales plugin drafts or updates outreach from stale positioning after a thumbs-down already rejected that claim.</li>
71
- <li>A data plugin publishes a dashboard before the source query, date window, and metric definition are proven.</li>
72
- <li>A Sites workflow deploys a public prototype before access mode, secrets, and intended audience are checked.</li>
73
- <li>A document annotation updates one selected section while breaking a compliance statement elsewhere in the same deck.</li>
74
- <li>A non-developer approves a tool action without knowing it writes to production systems.</li>
75
- </ul>
76
-
77
- <h2>The governance map</h2>
78
- <table class="matrix">
79
- <thead>
80
- <tr><th>Codex surface</th><th>Why it matters</th><th>ThumbGate gate</th></tr>
81
- </thead>
82
- <tbody>
83
- <tr><td>Role plugin</td><td>Bundles repeatable work for sales, analytics, design, finance, and operations.</td><td>Require role-specific allowed tools, scopes, and blocked action patterns before execution.</td></tr>
84
- <tr><td>App integration</td><td>Lets Codex read or write external systems.</td><td>Route CRM, email, billing, data warehouse, and file-share writes through approval and audit checks.</td></tr>
85
- <tr><td>MCP server</td><td>Adds custom tools and shared information.</td><td>Inventory tools, tag high-risk writes, and block unauthorized tool calls before the model invokes them.</td></tr>
86
- <tr><td>Sites</td><td>Turns output into shareable hosted apps and dashboards.</td><td>Require build proof, access mode, secret handling, and deployment evidence before publish.</td></tr>
87
- <tr><td>Annotations</td><td>Targets exact regions of documents, spreadsheets, and slides.</td><td>Require source-region evidence and prevent partial edits from bypassing whole-document policy.</td></tr>
88
- </tbody>
89
- </table>
90
-
91
- <h2>High-ROI implementation</h2>
92
- <ol>
93
- <li><strong>Ship role-specific gate templates:</strong> sales, analytics, design, finance, legal, and customer-support templates with allowed actions and evidence labels.</li>
94
- <li><strong>Make plugin install prove itself:</strong> every Codex plugin install path should end with <code>npx thumbgate feedback-self-test</code> and one real gate check.</li>
95
- <li><strong>Gate Sites deploys:</strong> block public deploy or access widening until build, audience, and secret-handling proof are attached.</li>
96
- <li><strong>Gate annotated edits:</strong> require the selected artifact region, intended edit, and document-level invariant before saving or exporting.</li>
97
- <li><strong>Measure the new buyer metric:</strong> role-workflow repeats blocked before execution, split by role and tool surface.</li>
98
- </ol>
99
-
100
- <div class="callout callout-green">
101
- <strong>Sales wedge:</strong> "Codex plugins make every team faster. ThumbGate makes every team safer before the plugin writes, shares, deploys, or publishes."
102
- </div>
103
-
104
- <div class="cta-box">
105
- <h2 style="color:var(--text);font-size:1.3rem;margin:0 0 8px;">Add gates to one role workflow</h2>
106
- <p>Start with the role, the write surface, and the evidence required before that role's agent can claim success.</p>
107
- <div class="cta-install">$ npx thumbgate init --agent codex</div>
108
- </div>
109
-
110
- <div class="related">
111
- <h3>Related articles</h3>
112
- <a href="/codex-plugin">ThumbGate for Codex &rarr;</a>
113
- <a href="/learn/deterministic-agent-workflows">Deterministic Agent Workflows Need Runtime Gates &rarr;</a>
114
- <a href="/learn/agentic-os-team-governance">Agentic OS Team Governance &rarr;</a>
115
- <a href="/learn/background-agent-control-layer">Background Agents Need a Control Layer &rarr;</a>
116
- </div>
117
- </div>
118
-
119
- <div class="sticky-cta">
120
- <span style="color:var(--muted)">Try it now:</span>
121
- <code>npx thumbgate init --agent codex</code>
122
- <a href="https://github.com/IgorGanapolsky/ThumbGate" target="_blank" rel="noopener">GitHub &rarr;</a>
123
- </div>
124
- </body>
125
- </html>
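
The removed page above proposes blocking a public Sites deploy or access widening until build, audience, and secret-handling proof are attached. A minimal sketch of that rule follows, under stated assumptions: `checkSitesDeploy`, the request fields (`visibility`, `widensAccess`, `evidence`), and the evidence keys are all hypothetical names chosen for illustration, not ThumbGate's or Codex's actual interfaces.

```javascript
// Hypothetical sketch: every name here is illustrative, not a real API.
const REQUIRED_EVIDENCE = ["buildProof", "audience", "secretHandling"];

function checkSitesDeploy(request) {
  const evidence = request.evidence || {};
  // Collect the proof items that have not been attached yet.
  const missing = REQUIRED_EVIDENCE.filter((key) => !evidence[key]);
  // Only public deploys and access widening are gated in this sketch.
  const gated =
    request.visibility === "public" || request.widensAccess === true;
  const decision = gated && missing.length > 0 ? "block" : "allow";
  return { decision, missing };
}
```

A public deploy carrying only a build proof would return `decision: "block"` with `audience` and `secretHandling` listed as missing, so the caller can surface exactly which evidence is still needed.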
package/public/learn/cost-aware-agent-gate-routing.html
@@ -1,173 +0,0 @@
1
- <!DOCTYPE html>
2
- <html lang="en">
3
- <head>
4
- <meta charset="UTF-8">
5
- <meta name="viewport" content="width=device-width, initial-scale=1.0">
6
- <title>Cost-Aware Agent Gate Routing — ThumbGate</title>
7
- <script defer data-domain="thumbgate.ai" src="https://plausible.io/js/script.js"></script>
8
- <meta name="description" content="How ThumbGate routes agent checks through deterministic rules, semantic cache, local classifiers, LLM judges, and human review so teams avoid unnecessary latency, tokens, and provider calls.">
9
- <meta name="keywords" content="AI agent gate routing, LLM classifier, semantic caching, agent governance, pre-action checks, workflow harness, structured data provenance, ThumbGate">
10
- <meta property="og:title" content="Cost-Aware Agent Gate Routing">
11
- <meta property="og:description" content="Use deterministic checks, semantic cache, local classifiers, and LLM judges in the right order before an agent action runs.">
12
- <meta property="og:type" content="article">
13
- <meta property="og:url" content="https://thumbgate.ai/learn/cost-aware-agent-gate-routing">
14
- <link rel="canonical" href="https://thumbgate.ai/learn/cost-aware-agent-gate-routing">
15
-
16
- <script type="application/ld+json">
17
- {
18
- "@context": "https://schema.org",
19
- "@type": "TechArticle",
20
- "headline": "Cost-Aware Agent Gate Routing",
21
- "description": "How ThumbGate routes agent checks through deterministic rules, semantic cache, local classifiers, LLM judges, and human review so teams avoid unnecessary latency, tokens, and provider calls.",
22
- "author": {
23
- "@type": "Person",
24
- "name": "Igor Ganapolsky",
25
- "url": "https://github.com/IgorGanapolsky"
26
- },
27
- "publisher": {
28
- "@type": "Organization",
29
- "name": "ThumbGate",
30
- "url": "https://thumbgate.ai"
31
- },
32
- "datePublished": "2026-06-03",
33
- "dateModified": "2026-06-03",
34
- "mainEntityOfPage": "https://thumbgate.ai/learn/cost-aware-agent-gate-routing",
35
- "about": [
36
- {"@type": "Thing", "name": "pre-action checks"},
37
- {"@type": "Thing", "name": "semantic caching"},
38
- {"@type": "Thing", "name": "LLM classifiers"},
39
- {"@type": "Thing", "name": "agent workflows"}
40
- ]
41
- }
42
- </script>
43
-
44
- <link rel="stylesheet" href="/learn/learn.css">
45
- <style>
46
- .matrix { width: 100%; border-collapse: collapse; margin: 1rem 0 1.5rem; }
47
- .matrix th, .matrix td { text-align: left; padding: 0.7rem 0.8rem; border-bottom: 1px solid var(--border); vertical-align: top; }
48
- .matrix th { color: var(--cyan); font-weight: 600; }
49
- .command { background: var(--bg-card); border: 1px solid var(--border); border-radius: 8px; padding: 1rem; margin: 1rem 0; overflow-x: auto; }
50
- </style>
51
- </head>
52
- <body>
53
-
54
- <nav>
55
- <a href="/" class="brand"><img src="/assets/brand/thumbgate-mark-inline.svg" alt="ThumbGate" class="logo-mark" width="28" height="28"><span class="logo-text">ThumbGate</span></a>
56
- <a href="/guide">Setup Guide</a>
57
- <a href="/learn">Learn</a>
58
- <a href="/dashboard">Dashboard</a>
59
- <a href="https://github.com/IgorGanapolsky/ThumbGate" target="_blank" rel="noopener">GitHub</a>
60
- </nav>
61
-
62
- <div class="container">
63
- <div class="breadcrumb"><a href="/learn">Learn</a> / Cost-Aware Gate Routing</div>
64
- <h1>Cost-aware agent gates: rules first, models last.</h1>
65
- <p style="color:var(--muted);">7 min read &middot; For teams trying to make agent governance fast enough to stay on by default</p>
66
-
67
- <div class="tldr"><strong>TL;DR:</strong> The expensive part of agent governance should not run on every action. ThumbGate should route checks through deterministic rules, semantic cache, local text classifiers, and local semantic recall before using an LLM judge. High-risk private ambiguity should stop for human review instead of calling a cloud model.</div>
-
- <h2>The pattern across the latest agent infrastructure work</h2>
- <p>The same lesson keeps showing up in different forms. Semantic caching cuts repeated LLM calls. Traditional text classifiers beat LLMs on speed and cost when labels are clear. Breadth-first query execution batches similar work instead of walking one branch at a time. Structured live dataset agents only become trustworthy when every row has source provenance. Streaming output removes dead air. Dynamic harnesses work best when critic, tournament, loop, and fan-out patterns are selected deliberately.</p>
- <p>For ThumbGate, these are not separate product bets. They collapse into one control-plane rule: <strong>choose the cheapest reliable gate before the action runs.</strong></p>
-
- <h2>The routing ladder</h2>
- <table class="matrix">
- <thead>
- <tr>
- <th>Lane</th>
- <th>Use when</th>
- <th>Why it is high ROI</th>
- </tr>
- </thead>
- <tbody>
- <tr>
- <td>Deterministic</td>
- <td>Secrets, force-push, destructive SQL, protected files, known repeated commands.</td>
- <td>Near-zero latency, no tokens, no provider call. This is the default for exact policy risk.</td>
- </tr>
- <tr>
- <td>Semantic cache</td>
- <td>A prompt or action is semantically equivalent to a prior rejected or approved pattern.</td>
- <td>Returns the cached decision without rerunning the judge. This is the AISG-style buyer message applied to pre-action checks.</td>
- </tr>
- <tr>
- <td>Rubric gate</td>
- <td>A critic/rubric loop failed a criterion, hit its cap, or lacks done evidence.</td>
- <td>Turns LangChain-style rubric iteration into an enforcement event: block completion claims until the missing proof exists.</td>
- </tr>
- <tr>
- <td>Local classical classifier</td>
- <td>High-volume labels with enough examples and low ambiguity.</td>
- <td>Fast and cheap for routine feedback triage, import classification, and known error families.</td>
- </tr>
- <tr>
- <td>Local semantic recall</td>
- <td>Few examples, fuzzy near-misses, or cross-session recurrence.</td>
- <td>Keeps private context local while catching cases regex and keyword routing miss.</td>
- </tr>
- <tr>
- <td>LLM judge</td>
- <td>High-risk semantic ambiguity with explicit cloud permission and a budget cap.</td>
- <td>Useful for critic/rubric review, multi-document evidence review, and structured provenance checks, but not for every action.</td>
- </tr>
- <tr>
- <td>Human review</td>
- <td>Private, regulated, payment, credential, customer-data, or unbounded external-posting risk.</td>
- <td>Prevents automation from laundering a risky decision through a model call.</td>
- </tr>
- </tbody>
- </table>
-
- <h2>What changed in ThumbGate</h2>
- <p>ThumbGate now has a small, testable routing primitive that makes this policy explicit:</p>
- <div class="command"><code>node scripts/classifier-routing.js --risk=high --ambiguity=0.82 --allow-cloud --latency-ms=5000</code></div>
- <p>That command returns an evidence-requiring LLM judge lane. Add <code>--semantic-cache-hit</code>, and it reuses the prior decision without a provider call. Add <code>--rubric-failed</code> or <code>--structured-dataset --missing-provenance</code>, and it blocks completion through the rubric gate. Change the same high-risk ambiguous input to <code>--privacy-sensitive</code> without <code>--allow-cloud</code>, and it routes to human review instead.</p>
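<p>The lane selection described above can be sketched as a single function. This is an illustrative sketch, not the contents of <code>scripts/classifier-routing.js</code>: the name <code>routeGate</code>, the input field names, the lane strings, and both thresholds are assumptions. Only the lane order (taken from the table) and the flag behaviour (taken from the paragraph above) come from the article.</p>

```javascript
// Hypothetical sketch of a routing ladder. Names and thresholds are assumptions,
// not the real ThumbGate implementation.
const AMBIGUITY_THRESHOLD = 0.7; // assumed cut-off for "ambiguous"
const MIN_EXAMPLES = 50;         // assumed minimum labelled examples for a classifier

// Build a decision object; lanes make no provider call unless stated otherwise.
const decide = (lane, extra = {}) => ({ lane, providerCall: false, ...extra });

function routeGate(signal = {}) {
  const {
    hardRule = false,
    semanticCacheHit = false,
    rubricFailed = false,
    structuredDataset = false,
    missingProvenance = false,
    privacySensitive = false,
    allowCloud = false,
    risk = 'low',
    ambiguity = 0,
    exampleCount = 0,
  } = signal;

  // 1. Exact policy risk is handled by deterministic rules.
  if (hardRule) return decide('deterministic');

  // 2. A semantically equivalent prior decision is reused as-is.
  if (semanticCacheHit) return decide('semantic-cache');

  // 3. A failed rubric or a dataset without provenance blocks completion.
  if (rubricFailed || (structuredDataset && missingProvenance)) {
    return decide('rubric-gate', { blocksCompletion: true });
  }

  // 4. High-risk ambiguity goes to an LLM judge only with cloud permission
  //    and no privacy flag; otherwise it stops for a human (assumed policy).
  const highRisk = risk === 'high' || risk === 'critical';
  if (highRisk && ambiguity >= AMBIGUITY_THRESHOLD) {
    if (allowCloud && !privacySensitive) {
      return decide('llm-judge', { providerCall: true, evidenceRequired: true });
    }
    return decide('human-review');
  }

  // 5. Clear labels with enough examples go to a local classical classifier.
  if (ambiguity < AMBIGUITY_THRESHOLD && exampleCount >= MIN_EXAMPLES) {
    return decide('local-classifier');
  }

  // 6. Everything else falls back to local semantic recall.
  return decide('local-semantic-recall');
}

console.log(routeGate({ risk: 'high', ambiguity: 0.82, allowCloud: true }).lane); // prints "llm-judge"
```

<p>Following the table's order means a cache hit is answered before the rubric gate is consulted. A real router might reasonably check the rubric first, and it would also need the budget cap that the table mentions for the LLM judge lane.</p>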
-
- <h2>How the newer signals map to product work</h2>
- <ul>
- <li><strong>Scikit-LLM vs traditional classifiers:</strong> do not spend LLM calls on low-ambiguity bulk labels.</li>
- <li><strong>Semantic proxy caching:</strong> reuse a prior decision when prompt meaning has not changed.</li>
- <li><strong>LangChain-style rubrics:</strong> turn failed criteria into completion blockers instead of post-hoc scores.</li>
- <li><strong>Shopify Cardinal BFS:</strong> batch and evaluate similar gate scopes together instead of repeatedly traversing the same nested context.</li>
- <li><strong>BigSet-style dataset agents:</strong> require structured rows, source URLs, and retrieval traces before accepting live web data.</li>
- <li><strong>Streaming agent output:</strong> stream progress events during long gate reviews so users know the gate is working.</li>
- <li><strong>Dynamic harness patterns:</strong> use critic/rubric for correctness, tournament for ranking, loop-until-done for open-ended work, and fan-out/synthesize for parallel research.</li>
- </ul>
-
- <div class="callout callout-green">
- <strong>Buyer proof:</strong> show the same risky action going through three routes: exact repeat blocked instantly, fuzzy repeat caught locally, and genuinely ambiguous production change paused for evidence or human review.
- </div>
-
- <h2>Implementation checklist</h2>
- <ol>
- <li>Put exact denials and approval boundaries in deterministic checks.</li>
- <li>Cache semantically equivalent gate decisions with provenance and expiry.</li>
- <li>Use local text classification for routine high-volume feedback labels.</li>
- <li>Use local semantic recall for sparse, fuzzy, or cross-session lessons.</li>
- <li>Treat failed rubrics and missing source provenance as gate failures, not just evaluation notes.</li>
- <li>Reserve LLM judges for ambiguous high-value decisions with evidence requirements.</li>
- <li>Stream progress for long reviews and record every routed decision in the audit trail.</li>
- </ol>
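<p>The second checklist item, caching semantically equivalent decisions with provenance and expiry, might look roughly like the sketch below. It assumes embeddings are already available as equal-length numeric arrays and says nothing about how they are produced. The class name <code>GateDecisionCache</code>, its method names, the similarity threshold, and the default expiry are all hypothetical and not taken from ThumbGate.</p>

```javascript
// Hypothetical sketch of a semantic decision cache. Not ThumbGate's implementation.

// Cosine similarity between two equal-length numeric vectors.
function cosineSimilarity(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // avoid dividing by zero
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

class GateDecisionCache {
  // threshold and ttlMs are assumed defaults chosen for illustration.
  constructor({ threshold = 0.92, ttlMs = 24 * 60 * 60 * 1000 } = {}) {
    this.threshold = threshold;
    this.ttlMs = ttlMs;
    this.entries = [];
  }

  // Record a decision together with where it came from and when it expires.
  store(embedding, decision, provenance, now = Date.now()) {
    this.entries.push({ embedding, decision, provenance, expiresAt: now + this.ttlMs });
  }

  // Return the most similar unexpired entry at or above the threshold, or null.
  lookup(embedding, now = Date.now()) {
    this.entries = this.entries.filter((entry) => entry.expiresAt > now);
    let best = null;
    for (const entry of this.entries) {
      const score = cosineSimilarity(embedding, entry.embedding);
      if (score >= this.threshold && (best === null || score > best.score)) {
        best = { decision: entry.decision, provenance: entry.provenance, score };
      }
    }
    return best;
  }
}
```

<p>A linear scan is enough for a sketch. A cache holding many entries would more likely use an approximate nearest-neighbour index, and the threshold would need tuning against real approved and rejected actions.</p>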
-
- <div class="cta-box">
- <h2 style="color:var(--text);font-size:1.3rem;margin:0 0 8px;">Try the routing primitive</h2>
- <p>Check the gate lane before spending tokens on a risky decision.</p>
- <div class="cta-install">$ node scripts/classifier-routing.js --hard-rule --risk=critical</div>
- </div>
-
- <div class="related">
- <h3>Related articles</h3>
- <a href="/learn/deterministic-agent-workflows">Deterministic Agent Workflows Need Runtime Gates &rarr;</a>
- <a href="/learn/agentic-os-team-governance">Agentic OS Team Governance &rarr;</a>
- <a href="/learn/agentic-enterprise-context-brain">Agentic Enterprise Context Brain &rarr;</a>
- <a href="/learn/mcp-pre-action-checks-explained">MCP Pre-Action Checks Explained &rarr;</a>
- </div>
- </div>
-
- <div class="sticky-cta">
- <span style="color:var(--muted)">Install:</span>
- <code>npx thumbgate init</code>
- <a href="https://github.com/IgorGanapolsky/ThumbGate" target="_blank" rel="noopener">GitHub &rarr;</a>
- </div>
- </body>
- </html>