thumbgate 1.27.12 → 1.27.14

Files changed (133)
  1. package/.claude-plugin/plugin.json +1 -1
  2. package/.well-known/llms.txt +2 -1
  3. package/.well-known/mcp/server-card.json +1 -1
  4. package/README.md +2 -4
  5. package/adapters/claude/.mcp.json +2 -2
  6. package/adapters/mcp/server-stdio.js +1 -1
  7. package/adapters/opencode/opencode.json +1 -1
  8. package/adapters/policy-engine/ethicore-guardian-client.js +68 -0
  9. package/adapters/policy-engine/thumbgate-policy-engine-adapter.js +260 -0
  10. package/bin/cli.js +78 -259
  11. package/config/gate-templates.json +0 -228
  12. package/config/gates/claim-verification.json +0 -18
  13. package/package.json +35 -25
  14. package/public/assets/brand/thumbgate-logo-transparent.svg +22 -0
  15. package/public/assets/brand/thumbgate-mark-inline-v3.svg +19 -0
  16. package/public/assets/brand/thumbgate-mark.svg +11 -5
  17. package/public/blog.html +0 -30
  18. package/public/brand/thumbgate-mark.svg +9 -5
  19. package/public/chatgpt-app.html +2 -2
  20. package/public/compare.html +2 -1
  21. package/public/dashboard.html +1 -1
  22. package/public/federal.html +1 -1
  23. package/public/index.html +95 -216
  24. package/public/learn.html +59 -35
  25. package/public/lessons.html +1 -1
  26. package/public/numbers.html +2 -2
  27. package/public/pro.html +7 -7
  28. package/scripts/agent-readiness.js +142 -0
  29. package/scripts/aws-blocks-guardrails.js +228 -0
  30. package/scripts/cli-schema.js +22 -10
  31. package/scripts/dashboard-chat.js +2 -1
  32. package/scripts/document-intake.js +1 -49
  33. package/scripts/durability/step.js +3 -3
  34. package/scripts/gate-stats.js +5 -11
  35. package/scripts/gates-engine.js +0 -49
  36. package/scripts/gemini-embedding-policy.js +2 -1
  37. package/scripts/hook-stop-anti-claim.js +116 -184
  38. package/scripts/hosted-config.js +0 -12
  39. package/scripts/lesson-search.js +1 -15
  40. package/scripts/llm-client.js +187 -5
  41. package/scripts/plausible-domain-config.js +3 -1
  42. package/scripts/seo-gsd.js +240 -1
  43. package/scripts/tool-registry.js +2 -2
  44. package/scripts/vector-store.js +44 -0
  45. package/scripts/workspace-evolver.js +62 -2
  46. package/src/api/server.js +340 -131
  47. package/public/assets/brand/thumbgate-mark-inline.svg +0 -15
  48. package/public/compare/adopt-ai.html +0 -219
  49. package/public/compare/agentix-labs.html +0 -197
  50. package/public/compare/ai-experience-orchestration.html +0 -216
  51. package/public/compare/anthropic-claude-for-legal.html +0 -260
  52. package/public/compare/anthropic-containment.html +0 -280
  53. package/public/compare/arcade.html +0 -175
  54. package/public/compare/arcjet.html +0 -239
  55. package/public/compare/bumblebee.html +0 -307
  56. package/public/compare/claude-code-hooks.html +0 -294
  57. package/public/compare/databricks-unity-ai-gateway.html +0 -215
  58. package/public/compare/fallow.html +0 -351
  59. package/public/compare/heidi.html +0 -233
  60. package/public/compare/mem0.html +0 -342
  61. package/public/compare/oak-and-sparrow-gatekeeper.html +0 -289
  62. package/public/compare/rein.html +0 -236
  63. package/public/compare/sigmashake.html +0 -256
  64. package/public/compare/speclock.html +0 -342
  65. package/public/guides/agent-harness-optimization.html +0 -342
  66. package/public/guides/agentic-web-governance.html +0 -406
  67. package/public/guides/ai-agent-governance-sprint.html +0 -415
  68. package/public/guides/ai-agent-pre-action-approval-gates.html +0 -401
  69. package/public/guides/ai-agent-workflow-migration-checklist.html +0 -392
  70. package/public/guides/ai-deployment-readiness.html +0 -415
  71. package/public/guides/ai-mode-ads-agent-governance.html +0 -401
  72. package/public/guides/ai-search-topical-presence.html +0 -342
  73. package/public/guides/autoresearch-agent-safety.html +0 -342
  74. package/public/guides/background-agent-governance.html +0 -358
  75. package/public/guides/best-tools-stop-ai-agents-breaking-production.html +0 -363
  76. package/public/guides/browser-automation-safety.html +0 -342
  77. package/public/guides/chatgpt-ads-trust.html +0 -353
  78. package/public/guides/claude-code-feedback.html +0 -339
  79. package/public/guides/claude-code-prevent-repeated-mistakes.html +0 -161
  80. package/public/guides/claude-code-skills-guardrails.html +0 -343
  81. package/public/guides/claude-desktop.html +0 -356
  82. package/public/guides/code-knowledge-graph-guardrails.html +0 -365
  83. package/public/guides/codex-cli-guardrails.html +0 -339
  84. package/public/guides/cursor-agent-guardrails.html +0 -339
  85. package/public/guides/cursor-prevent-repeated-mistakes.html +0 -161
  86. package/public/guides/database-agent-safety.html +0 -406
  87. package/public/guides/deepseek-v4-runtime-guardrails.html +0 -346
  88. package/public/guides/developer-machine-supply-chain-guardrails.html +0 -358
  89. package/public/guides/gcp-mcp-guardrails.html +0 -147
  90. package/public/guides/gemini-cli-feedback-memory.html +0 -339
  91. package/public/guides/gpt-5-5-model-evaluation.html +0 -358
  92. package/public/guides/internal-ai-engineering-stack-guardrails.html +0 -348
  93. package/public/guides/long-running-agent-context-management.html +0 -346
  94. package/public/guides/mcp-tool-governance.html +0 -401
  95. package/public/guides/multica-thumbgate-setup.html +0 -134
  96. package/public/guides/native-messaging-host-security.html +0 -342
  97. package/public/guides/policy-engine-pre-action-gates.html +0 -346
  98. package/public/guides/pre-action-checks.html +0 -342
  99. package/public/guides/pretooluse-hooks-vs-advisory-prompt-rules.html +0 -342
  100. package/public/guides/prompt-tricks-to-workflow-rules.html +0 -365
  101. package/public/guides/proxy-pointer-rag-guardrails.html +0 -352
  102. package/public/guides/rag-precision-tuning-guardrails.html +0 -352
  103. package/public/guides/reasoning-compression-guardrails.html +0 -346
  104. package/public/guides/relational-knowledge-ai-recommendations.html +0 -342
  105. package/public/guides/roo-code-alternative-cline.html +0 -339
  106. package/public/guides/semantic-programmatic-seo-guardrails.html +0 -352
  107. package/public/guides/seo-agent-skills-guardrails.html +0 -344
  108. package/public/guides/stop-repeated-ai-agent-mistakes.html +0 -342
  109. package/public/learn/ac-dc-runtime-enforcement.html +0 -277
  110. package/public/learn/agent-harness-pattern.html +0 -181
  111. package/public/learn/agent-identity-connector-governance.html +0 -146
  112. package/public/learn/agent-swarms-shared-gates.html +0 -173
  113. package/public/learn/agentic-enterprise-context-brain.html +0 -117
  114. package/public/learn/agentic-os-team-governance.html +0 -146
  115. package/public/learn/ai-agent-governance.html +0 -158
  116. package/public/learn/ai-agent-persistent-memory.html +0 -211
  117. package/public/learn/anthropomorphic-claim-gates.html +0 -180
  118. package/public/learn/background-agent-control-layer.html +0 -184
  119. package/public/learn/claude-code-goal-with-rubrics.html +0 -205
  120. package/public/learn/codex-role-plugins-need-governance.html +0 -125
  121. package/public/learn/cost-aware-agent-gate-routing.html +0 -173
  122. package/public/learn/databricks-unity-ai-gateway-runtime-governance.html +0 -157
  123. package/public/learn/deterministic-agent-workflows.html +0 -185
  124. package/public/learn/feedback-loop-vs-decision-layer.html +0 -283
  125. package/public/learn/from-prototype-to-production.html +0 -223
  126. package/public/learn/learn.css +0 -51
  127. package/public/learn/mcp-pre-action-checks-explained.html +0 -172
  128. package/public/learn/pretix-stripe-connect-marketplaces.html +0 -161
  129. package/public/learn/regulated-agent-execution-boundary.html +0 -196
  130. package/public/learn/spec-driven-development.html +0 -168
  131. package/public/learn/stop-ai-agent-force-push.html +0 -134
  132. package/public/learn/vibe-coding-safety-net.html +0 -142
  133. package/scripts/reddit-browser-notification-watch.js +0 -230
@@ -1,205 +0,0 @@
1
- <!DOCTYPE html>
2
- <html lang="en">
3
- <head>
4
- <meta charset="UTF-8">
5
- <meta name="viewport" content="width=device-width, initial-scale=1.0">
6
- <title>Claude Code /goal vs Todo: The 4-Field Pattern That Actually Holds — ThumbGate</title>
7
- <script defer data-domain="thumbgate.ai" src="https://plausible.io/js/script.js"></script>
8
- <meta name="description" content="Treating /goal like a todo wastes Claude Code. The pattern that holds: clear goal, measurable success, shown proof, hard limits. Maps directly to ThumbGate's rubric-engine for enforced verification.">
9
- <meta name="keywords" content="claude code /goal, claude code goal command, agent goal pattern, measurable success criteria, rubric engine, AI agent verification, ThumbGate rubric">
10
- <meta property="og:title" content="Claude Code /goal vs Todo: The 4-Field Pattern That Actually Holds">
11
- <meta property="og:description" content="The /goal command becomes ten times more useful when you treat it as a verifiable contract, not a todo. Here is the 4-field pattern and how to enforce it.">
12
- <meta property="og:type" content="article">
13
- <meta property="og:url" content="https://thumbgate.ai/learn/claude-code-goal-with-rubrics">
14
- <link rel="canonical" href="https://thumbgate.ai/learn/claude-code-goal-with-rubrics">
15
-
16
- <script type="application/ld+json">
17
- {
18
- "@context": "https://schema.org",
19
- "@type": "TechArticle",
20
- "headline": "Claude Code /goal vs Todo: The 4-Field Pattern That Actually Holds",
21
- "description": "Treating /goal like a todo wastes Claude Code. The pattern that holds: clear goal, measurable success, shown proof, hard limits — enforced by ThumbGate rubrics.",
22
- "author": {
23
- "@type": "Person",
24
- "name": "Igor Ganapolsky",
25
- "url": "https://github.com/IgorGanapolsky"
26
- },
27
- "publisher": {
28
- "@type": "Organization",
29
- "name": "ThumbGate",
30
- "url": "https://thumbgate.ai"
31
- },
32
- "datePublished": "2026-05-18",
33
- "dateModified": "2026-05-18",
34
- "mainEntityOfPage": "https://thumbgate.ai/learn/claude-code-goal-with-rubrics",
35
- "about": [
36
- {"@type": "Thing", "name": "Claude Code /goal command"},
37
- {"@type": "Thing", "name": "AI agent rubric"},
38
- {"@type": "Thing", "name": "verifiable AI agent outcomes"}
39
- ]
40
- }
41
- </script>
42
-
43
- <link rel="stylesheet" href="/learn/learn.css">
44
- <style>
45
- table { width: 100%; border-collapse: collapse; margin: 1rem 0; }
46
- th, td { text-align: left; padding: 0.6rem 0.8rem; border-bottom: 1px solid var(--border); font-size: 0.9rem; vertical-align: top; }
47
- th { color: var(--cyan); font-weight: 600; }
48
- .mapping-row td:first-child { color: var(--green); font-weight: 500; }
49
- pre.example { background: var(--bg-raised); padding: 14px 16px; border-radius: 8px; font-size: 13px; border-left: 3px solid var(--cyan); margin: 12px 0; overflow-x: auto; }
50
- </style>
51
- </head>
52
- <body>
53
-
54
- <nav>
55
- <a href="/" class="brand"><img src="/assets/brand/thumbgate-mark-inline.svg" alt="ThumbGate" class="logo-mark" width="28" height="28"><span class="logo-text">ThumbGate</span></a>
56
- <a href="/guide">Setup Guide</a>
57
- <a href="/learn">Learn</a>
58
- <a href="/dashboard">Dashboard</a>
59
- <a href="https://github.com/IgorGanapolsky/ThumbGate" target="_blank" rel="noopener">GitHub</a>
60
- </nav>
61
-
62
- <div class="container">
63
- <div class="breadcrumb"><a href="/learn">Learn</a> / Claude Code /goal With Rubrics</div>
64
- <h1>Claude Code /goal vs Todo: The 4-Field Pattern That Actually Holds</h1>
65
- <p style="color:var(--muted);">6 min read &middot; For developers using Claude Code's /goal command in production</p>
66
-
67
- <div class="tldr"><strong>TL;DR:</strong> Treating <code>/goal</code> like a todo wastes the command. The pattern that holds: <strong>clear goal, measurable success, shown proof, hard limits.</strong> That's a 4-field rubric — the same shape ThumbGate's rubric-engine enforces at gate-fire time. Pair them and the agent cannot fake completion.</div>
68
-
69
- <h2>The todo-shaped /goal anti-pattern</h2>
70
- <p>A common usage of Claude Code's <code>/goal</code> command looks like this:</p>
71
-
72
- <pre class="example">/goal fix the auth bug</pre>
73
-
74
- <p>This is a todo, not a goal. There is no way for the agent or you to know when the work is actually done. The agent will declare success after the first plausible-looking change, you will discover it was wrong an hour later, and the same conversation will re-enter the loop.</p>
75
-
76
- <div class="callout">
77
- <strong>The verifiable-contract reframe:</strong> A goal is not a wish. It is a contract with a clear outcome, a measurable success criterion, an inspectable proof, and a stop condition. Anything short of all four is a todo.
78
- </div>
79
-
80
- <h2>The 4-field pattern</h2>
81
- <p>Each field maps to a specific failure mode the agent commits when the field is missing:</p>
82
-
83
- <table>
84
- <thead>
85
- <tr>
86
- <th>Field</th>
87
- <th>What it answers</th>
88
- <th>Failure mode if missing</th>
89
- </tr>
90
- </thead>
91
- <tbody>
92
- <tr class="mapping-row">
93
- <td>1. Clear goal</td>
94
- <td>What is the outcome, in one sentence, that someone outside the project would understand?</td>
95
- <td>Scope creep. Agent expands the task to whatever it can find.</td>
96
- </tr>
97
- <tr class="mapping-row">
98
- <td>2. Measurable success</td>
99
- <td>What single check, run after the work, returns 0 / pass / a specific number?</td>
100
- <td>"Looks done." Agent declares completion on optimistic intermediate signals.</td>
101
- </tr>
102
- <tr class="mapping-row">
103
- <td>3. Shown proof</td>
104
- <td>What output, file, or screenshot will be in the final message proving the check ran?</td>
105
- <td>Hallucinated completion. Agent reports a green test that was never executed.</td>
106
- </tr>
107
- <tr class="mapping-row">
108
- <td>4. Hard limits</td>
109
- <td>What is the deadline, retry cap, or scope cliff that stops the work even if not yet "done"?</td>
110
- <td>Infinite spin. Agent keeps trying variants past any reasonable budget.</td>
111
- </tr>
112
- </tbody>
113
- </table>
114
-
115
- <h2>Concrete /goal phrasing</h2>
116
- <p>Same task, todo-shape vs goal-shape:</p>
117
-
118
- <pre class="example"><strong>Todo shape (anti-pattern):</strong>
119
- /goal fix the auth bug
120
-
121
- <strong>Goal shape:</strong>
122
- /goal fix the auth bug
123
- success: npm test --testPathPattern=auth returns exit 0 with 12 passing
124
- proof: paste the final 'PASS auth' line of the test output
125
- limit: stop after 3 implementation attempts; if still failing, file a /thumbsdown</pre>
126
-
127
- <p>The agent now has a contract it cannot fake. The success criterion is mechanical (exit code + count). The proof is inspectable (a literal line of output). The limit is a stop condition that triggers a feedback capture instead of a fifth wasted attempt.</p>
128
-
129
- <h2>Where ThumbGate plugs in</h2>
130
- <p>The 4-field pattern is exactly the shape of a rubric. ThumbGate's <code>scripts/rubric-engine.js</code> evaluates each completed agent task against a 4-field rubric:</p>
131
-
132
- <table>
133
- <thead>
134
- <tr>
135
- <th>Claude Code /goal field</th>
136
- <th>ThumbGate rubric field</th>
137
- <th>What ThumbGate does</th>
138
- </tr>
139
- </thead>
140
- <tbody>
141
- <tr class="mapping-row">
142
- <td>Clear goal</td>
143
- <td><code>rubric.goal</code></td>
144
- <td>Stored as the canonical task description in the feedback log</td>
145
- </tr>
146
- <tr class="mapping-row">
147
- <td>Measurable success</td>
148
- <td><code>rubric.verification.check</code></td>
149
- <td>The check is run before the agent's "done" claim is promoted to memory</td>
150
- </tr>
151
- <tr class="mapping-row">
152
- <td>Shown proof</td>
153
- <td><code>rubric.verification.evidence</code></td>
154
- <td>Captured into the lesson DB; missing proof = no promotion</td>
155
- </tr>
156
- <tr class="mapping-row">
157
- <td>Hard limits</td>
158
- <td><code>rubric.budget</code></td>
159
- <td>Tied to budget-guard.js — when limit hit, ThumbGate forces a thumbs-down capture and surfaces it for review</td>
160
- </tr>
161
- </tbody>
162
- </table>
163
-
164
- <div class="callout callout-green">
165
- <strong>Two layers, same shape:</strong> Claude Code's <code>/goal</code> sets the contract at task start. ThumbGate's rubric-engine enforces it at task end. The agent cannot promote a "done" memory unless every rubric field has the proof attached.
166
- </div>
167
-
168
- <h2>What this prevents in practice</h2>
169
- <p>Three concrete agent failure modes the paired pattern blocks:</p>
170
-
171
- <ul>
172
- <li><strong>Test-skipping.</strong> Agent reports "tests pass" but never ran them. Rubric verification.check runs the actual command and refuses the "done" promotion if exit code != 0.</li>
173
- <li><strong>Optimistic completion.</strong> Agent declares success on the first plausible change. The proof field requires an inspectable artifact (test output line, screenshot, exit code) that has to exist before promotion.</li>
174
- <li><strong>Budget run-away.</strong> Agent retries silently across 8+ attempts. Hard limit triggers a budget-guard block, forces feedback capture, and prevents the same pattern from recurring on the next prompt.</li>
175
- </ul>
176
-
177
- <h2>Five minutes to wire it</h2>
178
- <p>In your project root:</p>
179
-
180
- <pre><code>npx thumbgate init</code></pre>
181
-
182
- <p>Then, in any conversation, use the goal-shape phrasing above. ThumbGate's rubric-engine runs as part of the PreToolUse hook chain; no extra config required. The first time the agent tries to promote a "done" claim without proof, you'll see the gate fire in your dashboard.</p>
183
-
184
- <div class="cta-box">
185
- <h2 style="color:var(--text);font-size:1.3rem;margin:0 0 8px;">Pair /goal with a rubric the agent cannot fake.</h2>
186
- <p>Works with Claude Code, Cursor, Codex, Gemini, Amp, OpenCode, and any MCP-compatible agent.</p>
187
- <div class="cta-install">$ npx thumbgate init</div>
188
- </div>
189
-
190
- <div class="related">
191
- <h3>Related articles</h3>
192
- <a href="/learn/agent-harness-pattern">The Agent Harness Pattern: Why Your AI Needs a Seatbelt &rarr;</a>
193
- <a href="/learn/agent-swarms-shared-gates">Agent Swarms: One Gate Layer, Every Model &rarr;</a>
194
- <a href="/learn/mcp-pre-action-checks-explained">MCP Pre-Action Checks Explained &rarr;</a>
195
- </div>
196
- </div>
197
-
198
-
199
- <div class="sticky-cta">
200
- <span style="color:var(--muted)">Try it now:</span>
201
- <code>npx thumbgate init</code>
202
- <a href="https://github.com/IgorGanapolsky/ThumbGate" target="_blank" rel="noopener">GitHub &rarr;</a>
203
- </div>
204
- </body>
205
- </html>
@@ -1,125 +0,0 @@
1
- <!DOCTYPE html>
2
- <html lang="en">
3
- <head>
4
- <meta charset="UTF-8">
5
- <meta name="viewport" content="width=device-width, initial-scale=1.0">
6
- <title>Codex Role Plugins Need Pre-Action Governance — ThumbGate</title>
7
- <script defer data-domain="thumbgate.ai" src="https://plausible.io/js/script.js"></script>
8
- <meta name="description" content="Codex plugins, Sites, and annotations move AI work from code into sales, analytics, design, finance, and documents. Teams need pre-action governance before those workflows publish, share, or modify business systems.">
9
- <meta name="keywords" content="Codex plugins, Codex Sites, Codex annotations, ChatGPT Codex, role-specific AI agents, pre-action governance, Codex plugin governance, ThumbGate">
10
- <meta property="og:title" content="Codex Role Plugins Need Pre-Action Governance">
11
- <meta property="og:description" content="As Codex expands beyond coding into role-specific plugins, ThumbGate is the evidence and policy layer before AI work touches business systems.">
12
- <meta property="og:type" content="article">
13
- <meta property="og:url" content="https://thumbgate.ai/learn/codex-role-plugins-need-governance">
14
- <link rel="canonical" href="https://thumbgate.ai/learn/codex-role-plugins-need-governance">
15
-
16
- <script type="application/ld+json">
17
- {
18
- "@context": "https://schema.org",
19
- "@type": "TechArticle",
20
- "headline": "Codex Role Plugins Need Pre-Action Governance",
21
- "description": "Codex plugins, Sites, and annotations move AI work from code into sales, analytics, design, finance, and documents. Teams need pre-action governance before those workflows publish, share, or modify business systems.",
22
- "author": { "@type": "Person", "name": "Igor Ganapolsky", "url": "https://github.com/IgorGanapolsky" },
23
- "publisher": { "@type": "Organization", "name": "ThumbGate", "url": "https://thumbgate.ai" },
24
- "datePublished": "2026-06-03",
25
- "dateModified": "2026-06-03",
26
- "mainEntityOfPage": "https://thumbgate.ai/learn/codex-role-plugins-need-governance",
27
- "about": [
28
- { "@type": "Thing", "name": "Codex plugins" },
29
- { "@type": "Thing", "name": "Codex Sites" },
30
- { "@type": "Thing", "name": "role-specific AI agents" },
31
- { "@type": "Thing", "name": "pre-action governance" }
32
- ]
33
- }
34
- </script>
35
-
36
- <link rel="stylesheet" href="/learn/learn.css">
37
- <style>
38
- .matrix { width: 100%; border-collapse: collapse; margin: 1rem 0 1.5rem; }
39
- .matrix th, .matrix td { text-align: left; padding: 0.7rem 0.8rem; border-bottom: 1px solid var(--border); vertical-align: top; }
40
- .matrix th { color: var(--cyan); font-weight: 600; }
41
- </style>
42
- </head>
43
- <body>
44
-
45
- <nav>
46
- <a href="/" class="brand"><img src="/assets/brand/thumbgate-mark-inline.svg" alt="ThumbGate" class="logo-mark" width="28" height="28"><span class="logo-text">ThumbGate</span></a>
47
- <a href="/guide">Setup Guide</a>
48
- <a href="/learn">Learn</a>
49
- <a href="/dashboard">Dashboard</a>
50
- <a href="https://github.com/IgorGanapolsky/ThumbGate" target="_blank" rel="noopener">GitHub</a>
51
- </nav>
52
-
53
- <div class="container">
54
- <div class="breadcrumb"><a href="/learn">Learn</a> / Codex Role Plugin Governance</div>
55
- <h1>Codex role plugins need a governance layer before they touch business systems.</h1>
56
- <p style="color:var(--muted);">6 min read &middot; For teams adopting Codex plugins, Sites, annotations, and non-developer AI workflows</p>
57
-
58
- <div class="tldr"><strong>TL;DR:</strong> Codex is becoming a cross-functional work surface, not only a coding tool. OpenAI's Codex docs describe plugins as installable bundles of skills, app integrations, and MCP servers, plus Sites for hosted apps and dashboards. That makes ThumbGate's job sharper: enforce policy, evidence, and feedback-derived blocks before role-specific agents publish, share, edit, deploy, or write into customer systems.</div>
59
-
60
- <h2>The product shift</h2>
61
- <p>Codex plugins package skills, app integrations, and MCP servers into reusable workflows. Sites can turn Codex output into hosted websites, apps, dashboards, and games. Annotations let a user select part of a document, spreadsheet, or slide and ask Codex to work on that selected region.</p>
62
- <p>That is powerful because non-developers can now use the same inspect, edit, verify, report loop on business artifacts. It is risky for the same reason: the action surface expands from code to CRM records, revenue dashboards, design assets, finance decks, sales sequences, and hosted internal tools.</p>
63
-
64
- <div class="callout">
65
- <strong>ThumbGate's wedge:</strong> The more Codex becomes a role-specific operating layer, the more every team needs a pre-action policy layer outside the prompt.
66
- </div>
67
-
68
- <h2>What can go wrong without gates</h2>
69
- <ul>
70
- <li>A sales plugin drafts or updates outreach from stale positioning after a thumbs-down already rejected that claim.</li>
71
- <li>A data plugin publishes a dashboard before the source query, date window, and metric definition are proven.</li>
72
- <li>A Sites workflow deploys a public prototype before access mode, secrets, and intended audience are checked.</li>
73
- <li>A document annotation updates one selected section while breaking a compliance statement elsewhere in the same deck.</li>
74
- <li>A non-developer approves a tool action without knowing it writes to production systems.</li>
75
- </ul>
76
-
77
- <h2>The governance map</h2>
78
- <table class="matrix">
79
- <thead>
80
- <tr><th>Codex surface</th><th>Why it matters</th><th>ThumbGate gate</th></tr>
81
- </thead>
82
- <tbody>
83
- <tr><td>Role plugin</td><td>Bundles repeatable work for sales, analytics, design, finance, and operations.</td><td>Require role-specific allowed tools, scopes, and blocked action patterns before execution.</td></tr>
84
- <tr><td>App integration</td><td>Lets Codex read or write external systems.</td><td>Route CRM, email, billing, data warehouse, and file-share writes through approval and audit checks.</td></tr>
85
- <tr><td>MCP server</td><td>Adds custom tools and shared information.</td><td>Inventory tools, tag high-risk writes, and block unauthorized tool calls before the model invokes them.</td></tr>
86
- <tr><td>Sites</td><td>Turns output into shareable hosted apps and dashboards.</td><td>Require build proof, access mode, secret handling, and deployment evidence before publish.</td></tr>
87
- <tr><td>Annotations</td><td>Targets exact regions of documents, spreadsheets, and slides.</td><td>Require source-region evidence and prevent partial edits from bypassing whole-document policy.</td></tr>
88
- </tbody>
89
- </table>
90
-
91
- <h2>High-ROI implementation</h2>
92
- <ol>
93
- <li><strong>Ship role-specific gate templates:</strong> sales, analytics, design, finance, legal, and customer-support templates with allowed actions and evidence labels.</li>
94
- <li><strong>Make plugin install prove itself:</strong> every Codex plugin install path should end with <code>npx thumbgate feedback-self-test</code> and one real gate check.</li>
95
- <li><strong>Gate Sites deploys:</strong> block public deploy or access widening until build, audience, and secret-handling proof are attached.</li>
96
- <li><strong>Gate annotated edits:</strong> require the selected artifact region, intended edit, and document-level invariant before saving or exporting.</li>
97
- <li><strong>Measure the new buyer metric:</strong> role-workflow repeats blocked before execution, split by role and tool surface.</li>
98
- </ol>
99
-
100
- <div class="callout callout-green">
101
- <strong>Sales wedge:</strong> "Codex plugins make every team faster. ThumbGate makes every team safer before the plugin writes, shares, deploys, or publishes."
102
- </div>
103
-
104
- <div class="cta-box">
105
- <h2 style="color:var(--text);font-size:1.3rem;margin:0 0 8px;">Add gates to one role workflow</h2>
106
- <p>Start with the role, the write surface, and the evidence required before that role's agent can claim success.</p>
107
- <div class="cta-install">$ npx thumbgate init --agent codex</div>
108
- </div>
109
-
110
- <div class="related">
111
- <h3>Related articles</h3>
112
- <a href="/codex-plugin">ThumbGate for Codex &rarr;</a>
113
- <a href="/learn/deterministic-agent-workflows">Deterministic Agent Workflows Need Runtime Gates &rarr;</a>
114
- <a href="/learn/agentic-os-team-governance">Agentic OS Team Governance &rarr;</a>
115
- <a href="/learn/background-agent-control-layer">Background Agents Need a Control Layer &rarr;</a>
116
- </div>
117
- </div>
118
-
119
- <div class="sticky-cta">
120
- <span style="color:var(--muted)">Try it now:</span>
121
- <code>npx thumbgate init --agent codex</code>
122
- <a href="https://github.com/IgorGanapolsky/ThumbGate" target="_blank" rel="noopener">GitHub &rarr;</a>
123
- </div>
124
- </body>
125
- </html>
@@ -1,173 +0,0 @@
1
- <!DOCTYPE html>
2
- <html lang="en">
3
- <head>
4
- <meta charset="UTF-8">
5
- <meta name="viewport" content="width=device-width, initial-scale=1.0">
6
- <title>Cost-Aware Agent Gate Routing — ThumbGate</title>
7
- <script defer data-domain="thumbgate.ai" src="https://plausible.io/js/script.js"></script>
8
- <meta name="description" content="How ThumbGate routes agent checks through deterministic rules, semantic cache, local classifiers, LLM judges, and human review so teams avoid unnecessary latency, tokens, and provider calls.">
9
- <meta name="keywords" content="AI agent gate routing, LLM classifier, semantic caching, agent governance, pre-action checks, workflow harness, structured data provenance, ThumbGate">
10
- <meta property="og:title" content="Cost-Aware Agent Gate Routing">
11
- <meta property="og:description" content="Use deterministic checks, semantic cache, local classifiers, and LLM judges in the right order before an agent action runs.">
12
- <meta property="og:type" content="article">
13
- <meta property="og:url" content="https://thumbgate.ai/learn/cost-aware-agent-gate-routing">
14
- <link rel="canonical" href="https://thumbgate.ai/learn/cost-aware-agent-gate-routing">
15
-
16
- <script type="application/ld+json">
17
- {
18
- "@context": "https://schema.org",
19
- "@type": "TechArticle",
20
- "headline": "Cost-Aware Agent Gate Routing",
21
- "description": "How ThumbGate routes agent checks through deterministic rules, semantic cache, local classifiers, LLM judges, and human review so teams avoid unnecessary latency, tokens, and provider calls.",
22
- "author": {
23
- "@type": "Person",
24
- "name": "Igor Ganapolsky",
25
- "url": "https://github.com/IgorGanapolsky"
26
- },
27
- "publisher": {
28
- "@type": "Organization",
29
- "name": "ThumbGate",
30
- "url": "https://thumbgate.ai"
31
- },
32
- "datePublished": "2026-06-03",
33
- "dateModified": "2026-06-03",
34
- "mainEntityOfPage": "https://thumbgate.ai/learn/cost-aware-agent-gate-routing",
35
- "about": [
36
- {"@type": "Thing", "name": "pre-action checks"},
37
- {"@type": "Thing", "name": "semantic caching"},
38
- {"@type": "Thing", "name": "LLM classifiers"},
39
- {"@type": "Thing", "name": "agent workflows"}
40
- ]
41
- }
42
- </script>
43
-
44
- <link rel="stylesheet" href="/learn/learn.css">
45
- <style>
46
- .matrix { width: 100%; border-collapse: collapse; margin: 1rem 0 1.5rem; }
47
- .matrix th, .matrix td { text-align: left; padding: 0.7rem 0.8rem; border-bottom: 1px solid var(--border); vertical-align: top; }
48
- .matrix th { color: var(--cyan); font-weight: 600; }
49
- .command { background: var(--bg-card); border: 1px solid var(--border); border-radius: 8px; padding: 1rem; margin: 1rem 0; overflow-x: auto; }
50
- </style>
51
- </head>
52
- <body>
53
-
54
- <nav>
55
- <a href="/" class="brand"><img src="/assets/brand/thumbgate-mark-inline.svg" alt="ThumbGate" class="logo-mark" width="28" height="28"><span class="logo-text">ThumbGate</span></a>
56
- <a href="/guide">Setup Guide</a>
57
- <a href="/learn">Learn</a>
58
- <a href="/dashboard">Dashboard</a>
59
- <a href="https://github.com/IgorGanapolsky/ThumbGate" target="_blank" rel="noopener">GitHub</a>
60
- </nav>
61
-
62
- <div class="container">
63
- <div class="breadcrumb"><a href="/learn">Learn</a> / Cost-Aware Gate Routing</div>
64
- <h1>Cost-aware agent gates: rules first, models last.</h1>
65
- <p style="color:var(--muted);">7 min read &middot; For teams trying to make agent governance fast enough to stay on by default</p>
66
-
67
- <div class="tldr"><strong>TL;DR:</strong> The expensive part of agent governance should not run on every action. ThumbGate should route checks through deterministic rules, semantic cache, local text classifiers, and local semantic recall before using an LLM judge. High-risk private ambiguity should stop for human review instead of calling a cloud model.</div>
68
-
69
- <h2>The pattern across the latest agent infrastructure work</h2>
70
- <p>The same lesson keeps showing up in different forms. Semantic caching cuts repeated LLM calls. Traditional text classifiers beat LLMs on speed and cost when labels are clear. Breadth-first query execution batches similar work instead of walking one branch at a time. Structured live dataset agents only become trustworthy when every row has source provenance. Streaming output removes dead air. Dynamic harnesses work best when critic, tournament, loop, and fan-out patterns are selected deliberately.</p>
71
- <p>For ThumbGate, these are not separate product bets. They collapse into one control-plane rule: <strong>choose the cheapest reliable gate before the action runs.</strong></p>
72
-
73
- <h2>The routing ladder</h2>
74
- <table class="matrix">
75
- <thead>
76
- <tr>
77
- <th>Lane</th>
78
- <th>Use when</th>
79
- <th>Why it is high ROI</th>
80
- </tr>
81
- </thead>
82
- <tbody>
83
- <tr>
84
- <td>Deterministic</td>
85
- <td>Secrets, force-push, destructive SQL, protected files, known repeated commands.</td>
86
- <td>Near-zero latency, no tokens, no provider call. This is the default for exact policy risk.</td>
87
- </tr>
88
- <tr>
89
- <td>Semantic cache</td>
90
- <td>A prompt or action is semantically equivalent to a prior rejected or approved pattern.</td>
91
- <td>Returns the cached decision without rerunning the judge. This is the AISG-style buyer message applied to pre-action checks.</td>
92
- </tr>
93
- <tr>
94
- <td>Rubric gate</td>
95
- <td>A critic/rubric loop failed a criterion, hit its cap, or lacks done evidence.</td>
96
- <td>Turns LangChain-style rubric iteration into an enforcement event: block completion claims until the missing proof exists.</td>
97
- </tr>
98
- <tr>
99
- <td>Local classical classifier</td>
100
- <td>High-volume labels with enough examples and low ambiguity.</td>
101
- <td>Fast and cheap for routine feedback triage, import classification, and known error families.</td>
102
- </tr>
103
- <tr>
104
- <td>Local semantic recall</td>
105
- <td>Few examples, fuzzy near-misses, or cross-session recurrence.</td>
106
- <td>Keeps private context local while catching cases regex and keyword routing miss.</td>
107
- </tr>
108
- <tr>
109
- <td>LLM judge</td>
110
- <td>High-risk semantic ambiguity with explicit cloud permission and a budget cap.</td>
111
- <td>Useful for critic/rubric review, multi-document evidence review, and structured provenance checks, but not for every action.</td>
112
- </tr>
113
- <tr>
114
- <td>Human review</td>
115
- <td>Private, regulated, payment, credential, customer-data, or unbounded external-posting risk.</td>
116
- <td>Prevents automation from laundering a risky decision through a model call.</td>
117
- </tr>
118
- </tbody>
119
- </table>
120
-
121
- <h2>What changed in ThumbGate</h2>
122
- <p>ThumbGate now has a small, testable routing primitive that makes this policy explicit:</p>
123
- <div class="command"><code>node scripts/classifier-routing.js --risk=high --ambiguity=0.82 --allow-cloud --latency-ms=5000</code></div>
124
-   <p>That command returns an evidence-requiring LLM judge lane. Add <code>--semantic-cache-hit</code>, and it reuses the prior decision without a provider call. Add <code>--rubric-failed</code> or <code>--structured-dataset --missing-provenance</code>, and it blocks completion through the rubric gate. Replace <code>--allow-cloud</code> with <code>--privacy-sensitive</code> on the same high-risk ambiguous input, and it routes to human review instead.</p>
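The lane selection described above can be sketched as a single pure function. This is an illustration only, not the contents of scripts/classifier-routing.js: the function name routeGate, the option names, the 0.5 ambiguity cutoff, the 50-example minimum, and the order in which the cache and rubric lanes are checked are all assumptions chosen to reproduce the behaviors the paragraph lists.

```javascript
// Hypothetical sketch of the routing ladder. Names, thresholds, and lane
// precedence are assumptions; this is not the actual ThumbGate script.
const AMBIGUITY_CUTOFF = 0.5; // assumed: at or above this, input counts as ambiguous
const MIN_EXAMPLES = 50;      // assumed: labeled examples needed for a classifier

function routeGate(signal = {}) {
  const {
    hardRule = false,
    semanticCacheHit = false,
    rubricFailed = false,
    structuredDataset = false,
    missingProvenance = false,
    risk = 'low',
    ambiguity = 0,
    exampleCount = 0,
    allowCloud = false,
    privacySensitive = false,
  } = signal;

  // 1. Exact policy risk: no tokens, no provider call.
  if (hardRule) return { lane: 'deterministic', providerCall: false };

  // 2. A semantically equivalent decision already exists: reuse it.
  if (semanticCacheHit) return { lane: 'semantic_cache', providerCall: false };

  // 3. Failed rubric or dataset rows without provenance: block completion.
  if (rubricFailed || (structuredDataset && missingProvenance)) {
    return { lane: 'rubric_gate', providerCall: false, blocksCompletion: true };
  }

  // 4. High-risk ambiguity: cloud judge only with permission and no private data.
  const highRisk = risk === 'high' || risk === 'critical';
  if (highRisk && ambiguity >= AMBIGUITY_CUTOFF) {
    if (allowCloud && !privacySensitive) {
      return { lane: 'llm_judge', providerCall: true, requiresEvidence: true };
    }
    return { lane: 'human_review', providerCall: false };
  }

  // 5. Routine traffic: classifier when labels are plentiful and clear,
  //    local semantic recall otherwise.
  if (ambiguity < AMBIGUITY_CUTOFF && exampleCount >= MIN_EXAMPLES) {
    return { lane: 'local_classifier', providerCall: false };
  }
  return { lane: 'local_semantic_recall', providerCall: false };
}

// Usage, mirroring the flags in the command above:
routeGate({ risk: 'high', ambiguity: 0.82, allowCloud: true }).lane;       // 'llm_judge'
routeGate({ risk: 'high', ambiguity: 0.82, privacySensitive: true }).lane; // 'human_review'
```

Only the final cloud lane sets providerCall to true, so every cheaper lane gets a chance to answer before any tokens are spent.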
125
-
126
- <h2>How the newer signals map to product work</h2>
127
- <ul>
128
- <li><strong>Scikit-LLM vs traditional classifiers:</strong> do not spend LLM calls on low-ambiguity bulk labels.</li>
129
- <li><strong>Semantic proxy caching:</strong> reuse a prior decision when prompt meaning has not changed.</li>
130
- <li><strong>LangChain-style rubrics:</strong> turn failed criteria into completion blockers instead of post-hoc scores.</li>
131
- <li><strong>Shopify Cardinal BFS:</strong> batch and evaluate similar gate scopes together instead of repeatedly traversing the same nested context.</li>
132
- <li><strong>BigSet-style dataset agents:</strong> require structured rows, source URLs, and retrieval traces before accepting live web data.</li>
133
- <li><strong>Streaming agent output:</strong> stream progress events during long gate reviews so users know the gate is working.</li>
134
- <li><strong>Dynamic harness patterns:</strong> use critic/rubric for correctness, tournament for ranking, loop-until-done for open-ended work, and fan-out/synthesize for parallel research.</li>
135
- </ul>
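The dataset-agent item in the list above (structured rows, source URLs, retrieval traces) could be enforced with a check along these lines. It is a minimal sketch under assumptions: the field names sourceUrl and retrievalTrace and the function name checkProvenance are hypothetical, and the source does not specify how ThumbGate represents rows.

```javascript
// Hypothetical provenance check for structured dataset rows.
// Field names (sourceUrl, retrievalTrace) are illustrative assumptions.

// True only for strings that parse as absolute http(s) URLs.
function isHttpUrl(value) {
  try {
    const url = new URL(value);
    return url.protocol === 'http:' || url.protocol === 'https:';
  } catch {
    return false; // not a string, or not a parseable absolute URL
  }
}

// Returns { pass, failures }, where failures lists each offending row index
// together with the provenance fields it is missing.
function checkProvenance(rows) {
  const failures = [];
  rows.forEach((row, index) => {
    const missing = [];
    if (!isHttpUrl(row.sourceUrl)) missing.push('sourceUrl');
    if (!Array.isArray(row.retrievalTrace) || row.retrievalTrace.length === 0) {
      missing.push('retrievalTrace');
    }
    if (missing.length > 0) failures.push({ index, missing });
  });
  return { pass: failures.length === 0, failures };
}
```

A failing result would feed the rubric gate as a completion blocker rather than being logged as a score.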
136
-
137
- <div class="callout callout-green">
138
- <strong>Buyer proof:</strong> show the same risky action going through three routes: exact repeat blocked instantly, fuzzy repeat caught locally, and genuinely ambiguous production change paused for evidence or human review.
139
- </div>
140
-
141
- <h2>Implementation checklist</h2>
142
- <ol>
143
- <li>Put exact denials and approval boundaries in deterministic checks.</li>
144
- <li>Cache semantically equivalent gate decisions with provenance and expiry.</li>
145
- <li>Use local text classification for routine high-volume feedback labels.</li>
146
- <li>Use local semantic recall for sparse, fuzzy, or cross-session lessons.</li>
147
- <li>Treat failed rubrics and missing source provenance as gate failures, not just evaluation notes.</li>
148
- <li>Reserve LLM judges for ambiguous high-value decisions with evidence requirements.</li>
149
- <li>Stream progress for long reviews and record every routed decision in the audit trail.</li>
150
- </ol>
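Checklist item 2 (cache semantically equivalent gate decisions with provenance and expiry) can be sketched as a small in-memory cache. This assumes the caller already has embedding vectors for each action; the class name SemanticDecisionCache, the 0.92 similarity threshold, and the 24-hour TTL are hypothetical defaults, not values taken from ThumbGate.

```javascript
// Hypothetical in-memory semantic cache for gate decisions.
// Embeddings are supplied by the caller; threshold and TTL are assumed defaults.

function cosineSimilarity(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i += 1) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // a zero vector matches nothing
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

class SemanticDecisionCache {
  constructor({ threshold = 0.92, ttlMs = 24 * 60 * 60 * 1000, now = () => Date.now() } = {}) {
    this.threshold = threshold;
    this.ttlMs = ttlMs;
    this.now = now; // injectable clock so expiry is testable
    this.entries = [];
  }

  // Record a decision with where it came from and when it stops being valid.
  store(embedding, decision, provenance) {
    this.entries.push({ embedding, decision, provenance, expiresAt: this.now() + this.ttlMs });
  }

  // Return the closest unexpired entry at or above the threshold, or null.
  lookup(embedding) {
    const current = this.now();
    this.entries = this.entries.filter((entry) => entry.expiresAt > current);
    let best = null;
    for (const entry of this.entries) {
      const score = cosineSimilarity(embedding, entry.embedding);
      if (score >= this.threshold && (best === null || score > best.score)) {
        best = { decision: entry.decision, provenance: entry.provenance, score };
      }
    }
    return best;
  }
}
```

Returning the provenance alongside the decision lets the audit trail record which earlier ruling was reused, and the expiry keeps a stale approval from outliving the policy that produced it.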
151
-
152
- <div class="cta-box">
153
- <h2 style="color:var(--text);font-size:1.3rem;margin:0 0 8px;">Try the routing primitive</h2>
154
- <p>Check the gate lane before spending tokens on a risky decision.</p>
155
- <div class="cta-install">$ node scripts/classifier-routing.js --hard-rule --risk=critical</div>
156
- </div>
157
-
158
- <div class="related">
159
- <h3>Related articles</h3>
160
- <a href="/learn/deterministic-agent-workflows">Deterministic Agent Workflows Need Runtime Gates &rarr;</a>
161
- <a href="/learn/agentic-os-team-governance">Agentic OS Team Governance &rarr;</a>
162
- <a href="/learn/agentic-enterprise-context-brain">Agentic Enterprise Context Brain &rarr;</a>
163
- <a href="/learn/mcp-pre-action-checks-explained">MCP Pre-Action Checks Explained &rarr;</a>
164
- </div>
165
- </div>
166
-
167
- <div class="sticky-cta">
168
- <span style="color:var(--muted)">Install:</span>
169
- <code>npx thumbgate init</code>
170
- <a href="https://github.com/IgorGanapolsky/ThumbGate" target="_blank" rel="noopener">GitHub &rarr;</a>
171
- </div>
172
- </body>
173
- </html>