@takk/bayesoutputgate 1.0.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/CHANGELOG.md +92 -0
- package/LICENSE +190 -0
- package/NOTICE +45 -0
- package/README.md +403 -0
- package/SECURITY.md +98 -0
- package/SPEC.md +467 -0
- package/dist/adapter/index.cjs +411 -0
- package/dist/adapter/index.d.cts +29 -0
- package/dist/adapter/index.d.ts +29 -0
- package/dist/adapter/index.js +404 -0
- package/dist/audit/index.cjs +82 -0
- package/dist/audit/index.d.cts +40 -0
- package/dist/audit/index.d.ts +40 -0
- package/dist/audit/index.js +77 -0
- package/dist/bayesfactor/index.cjs +152 -0
- package/dist/bayesfactor/index.d.cts +15 -0
- package/dist/bayesfactor/index.d.ts +15 -0
- package/dist/bayesfactor/index.js +149 -0
- package/dist/beta/index.cjs +180 -0
- package/dist/beta/index.d.cts +45 -0
- package/dist/beta/index.d.ts +45 -0
- package/dist/beta/index.js +178 -0
- package/dist/calibration/index.cjs +339 -0
- package/dist/calibration/index.d.cts +53 -0
- package/dist/calibration/index.d.ts +53 -0
- package/dist/calibration/index.js +333 -0
- package/dist/cli/index.cjs +968 -0
- package/dist/cli/index.d.cts +1 -0
- package/dist/cli/index.d.ts +1 -0
- package/dist/cli/index.js +966 -0
- package/dist/dimensions/index.cjs +106 -0
- package/dist/dimensions/index.d.cts +33 -0
- package/dist/dimensions/index.d.ts +33 -0
- package/dist/dimensions/index.js +104 -0
- package/dist/edge/index.cjs +1141 -0
- package/dist/edge/index.d.cts +12 -0
- package/dist/edge/index.d.ts +12 -0
- package/dist/edge/index.js +1109 -0
- package/dist/gate/index.cjs +803 -0
- package/dist/gate/index.d.cts +77 -0
- package/dist/gate/index.d.ts +77 -0
- package/dist/gate/index.js +799 -0
- package/dist/hypothesis/index.cjs +268 -0
- package/dist/hypothesis/index.d.cts +38 -0
- package/dist/hypothesis/index.d.ts +38 -0
- package/dist/hypothesis/index.js +266 -0
- package/dist/index.cjs +1141 -0
- package/dist/index.d.cts +29 -0
- package/dist/index.d.ts +29 -0
- package/dist/index.js +1109 -0
- package/dist/likelihood/index.cjs +137 -0
- package/dist/likelihood/index.d.cts +23 -0
- package/dist/likelihood/index.d.ts +23 -0
- package/dist/likelihood/index.js +132 -0
- package/dist/node/index.cjs +1282 -0
- package/dist/node/index.d.cts +24 -0
- package/dist/node/index.d.ts +24 -0
- package/dist/node/index.js +1246 -0
- package/dist/policy/index.cjs +88 -0
- package/dist/policy/index.d.cts +11 -0
- package/dist/policy/index.d.ts +11 -0
- package/dist/policy/index.js +85 -0
- package/dist/types-bMjn1j4e.d.cts +159 -0
- package/dist/types-bMjn1j4e.d.ts +159 -0
- package/package.json +142 -0
package/README.md
ADDED
|
@@ -0,0 +1,403 @@
|
|
|
1
|
+
# BayesOutputGate
|
|
2
|
+
|
|
10
|
+
|
|
11
|
+
<p align="center">
|
|
12
|
+
<img src="https://raw.githubusercontent.com/davccavalcante/bayesoutputgate/main/assets/bayesoutputgate-diagram.svg" alt="An output-validation flow. Per-dimension quality scores of an output, from any scorer, feed a Bayes Factor that weighs a high-quality hypothesis against a low-quality one, calibrated from labeled history. The combined evidence lands on the Jeffreys scale and the gate stamps the output pass, fail, or escalate to human review, then seals the decision into a tamper-evident audit trail." width="560">
|
|
13
|
+
</p>
|
|
14
|
+
|
|
16
|
+
|
|
17
|
+
> Universal, zero-runtime-dependency NPM library and CLI that turns the quality scores of an output into a calibrated pass, fail, or escalate decision. Feed in the per-dimension scores of any output, whether they come from an LLM-as-judge, a classifier, or a regex, and the engine weighs a high-quality hypothesis against a low-quality one with a Bayes Factor on the Jeffreys scale. You accept, reject, or route to review on principled evidence instead of an arbitrary threshold. Built for Massive Intelligence (IM) systems and the non-human entities that must validate their own outputs.
|
|
18
|
+
|
|
19
|
+
BayesOutputGate is the passport control for your outputs. A non-human entity generates an answer, a tool call, or a document, and something has to decide whether it is good enough to ship. Today teams hardcode a threshold, "accept if the judge score is above 0.8", and that number is a guess: it ignores how confidently high-quality and low-quality outputs actually separate, it treats every dimension as equally trustworthy, and it cannot say how sure it is. BayesOutputGate reads the labeled history of known-good and known-bad outputs, calibrates a model of each, and for every new output reports the Bayes Factor between the two hypotheses, then stamps it pass, fail, or escalate with the evidence and the rationale attached.
|
|
20
|
+
|
|
21
|
+
**Core promise:** zero required runtime dependencies; a principled Bayes Factor between a high-quality and a low-quality hypothesis rather than a magic threshold; calibrated likelihoods that update online as outcomes arrive; multi-dimension scoring with per-dimension weights; a decision-theoretic policy that minimizes expected loss under asymmetric error costs; goodness-of-fit and dependence diagnostics that tell you when to trust the gate; measured calibration (Brier score, expected calibration error, reliability); a framework-agnostic tool a non-human entity can call to gate its own outputs; a tamper-evident audit trail for compliance; and a node-free core that runs in Node, edge runtimes, and the browser, shipped as a dual ESM plus CJS distribution.
|
|
22
|
+
|
|
23
|
+
---
|
|
24
|
+
|
|
25
|
+
## Why BayesOutputGate
|
|
26
|
+
|
|
27
|
+
Output validation is a hypothesis-testing question, not a threshold question. An ML engineer says: "Our judge returns a 0 to 1 score. We accept above 0.8. We picked 0.8 because it felt right." A platform engineer says: "We have factuality, safety, and relevance scores. We average them and threshold the average, which lets a great factuality score hide a terrible safety score." A compliance lead says: "When the regulator asks why we shipped this output, all we have is a number and a constant." A fixed threshold answers none of these. The Bayes Factor does: it asks how much more likely the observed scores are under a high-quality hypothesis than under a low-quality one, calibrated from your own labeled data, and it interprets that ratio on a scale that has meant something since Jeffreys.
|
|
28
|
+
|
|
29
|
+
What sets it apart from a thresholded score and from ad-hoc guardrail rules:
|
|
30
|
+
|
|
31
|
+
- **A Bayes Factor, not a magic number.** For each dimension the engine fits a Beta density to the scores of known-high outputs and another to known-low outputs. The per-dimension density ratios, weighted and summed in log space, give the combined Bayes Factor. It is interpreted on the Jeffreys scale (boundaries at 3, 10, and 100, mirrored at 1/3, 1/10, and 1/100 for evidence favoring the low-quality hypothesis), so "strong evidence" means strong evidence, not a number you chose.
|
|
32
|
+
- **Calibrated from history, updated online.** Likelihoods are calibrated by method of moments from labeled outputs, regularized by a prior treated as pseudo-observations, and updated online as new labeled outcomes arrive, so the gate gets sharper over time without a full refit.
|
|
33
|
+
- **Utility-aware decisions.** A pure Bayes Factor policy decides on the evidence alone. The decision-theoretic policy converts the Bayes Factor to a posterior probability through an explicit prior, then chooses the action that minimizes expected loss under an asymmetric cost: passing a bad output, rejecting a good one, and escalating to a human each carry their own price. Shipping a wrong medical answer should not cost the same as asking a person to look.
|
|
34
|
+
- **It tells you when not to trust it, and can act on it.** A goodness-of-fit test (Kolmogorov-Smirnov against the fitted Beta) reports whether the score distribution actually matches the model, and a dependence diagnostic measures pairwise correlation between dimensions, because correlated dimensions double-count evidence and inflate the Bayes Factor. Both are measured from your history, and an optional guard makes the gate escalate to human review instead of trusting a Bayes Factor from a misspecified model.
|
|
35
|
+
- **Calibration is measured, never asserted.** Once the gate emits posterior probabilities, the library scores them against realized outcomes with the Brier score, expected calibration error, and a reliability diagram. "Calibrated" is backed by numbers you can compute on your own data.
|
|
36
|
+
- **A tool a non-human entity can call on itself.** The adapter exposes the whole gate as a framework-agnostic tool (name, description, JSON Schema, handler) that drops into an MCP server or an LLM tool-calling API, so a non-human entity (NHE) validates its own outputs before acting on them.
|
|
37
|
+
- **Proves what it decided.** A tamper-evident SHA-256 hash-chained audit trail of every decision, the evidence a compliance review under the EU AI Act asks for.
|
|
38
|
+
|
|
39
|
+
---
|
|
40
|
+
|
|
41
|
+
## Install
|
|
42
|
+
|
|
43
|
+
```bash
|
|
44
|
+
pnpm add @takk/bayesoutputgate
|
|
45
|
+
# or: npm install @takk/bayesoutputgate
|
|
46
|
+
# or: yarn add @takk/bayesoutputgate
|
|
47
|
+
# or: bun add @takk/bayesoutputgate
|
|
48
|
+
```
|
|
49
|
+
|
|
50
|
+
The core has zero required runtime dependencies. Every `@takk` sibling is an optional peer; install only what you compose with.
|
|
51
|
+
|
|
52
|
+
---
|
|
53
|
+
|
|
54
|
+
## Quickstart
|
|
55
|
+
|
|
56
|
+
Calibrate from labeled history, then gate a new output's scores.
|
|
57
|
+
|
|
58
|
+
```ts
|
|
59
|
+
import { evaluate, HypothesisManager } from "@takk/bayesoutputgate";
|
|
60
|
+
|
|
61
|
+
// 1. Calibrate the two hypotheses from labeled outputs (known-high and known-low),
|
|
62
|
+
// each scored across one or more dimensions by any scorer you already have.
|
|
63
|
+
const manager = new HypothesisManager();
|
|
64
|
+
manager.fit([
|
|
65
|
+
{ scores: [{ dimension: "factuality", value: 0.96 }], label: "high" },
|
|
66
|
+
{ scores: [{ dimension: "factuality", value: 0.91 }], label: "high" },
|
|
67
|
+
{ scores: [{ dimension: "factuality", value: 0.12 }], label: "low" },
|
|
68
|
+
{ scores: [{ dimension: "factuality", value: 0.28 }], label: "low" },
|
|
69
|
+
]);
|
|
70
|
+
|
|
71
|
+
// 2. Weigh a new output's scores against the two hypotheses and decide.
|
|
72
|
+
const decision = evaluate(
|
|
73
|
+
[{ dimension: "factuality", value: 0.88 }],
|
|
74
|
+
manager.models(),
|
|
75
|
+
{ kind: "bayes-factor", passAbove: 10, failBelow: 0.1 },
|
|
76
|
+
);
|
|
77
|
+
|
|
78
|
+
console.log(decision.action); // "pass" | "fail" | "escalate"
|
|
79
|
+
console.log(decision.bayesFactor); // evidence ratio for high quality over low quality
|
|
80
|
+
console.log(decision.strength); // Jeffreys-scale label, for example "strong-high"
|
|
81
|
+
console.log(decision.contributions); // per-dimension breakdown of the evidence
|
|
82
|
+
```
|
|
83
|
+
|
|
84
|
+
An output passes when the Bayes Factor is at or above `passAbove`, fails at or below `failBelow`, and escalates to human review in between. The defaults, 10 and 0.1, are the Jeffreys "strong evidence" boundaries.
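The threshold rule described above can be sketched in a few lines. This is a minimal illustration, not the library's implementation: `ThresholdPolicy` and `thresholdDecision` are hypothetical names, and only the `passAbove` and `failBelow` semantics are taken from the text.

```ts
// Hypothetical names for illustration; only passAbove/failBelow come from the README.
type Action = "pass" | "fail" | "escalate";

interface ThresholdPolicy {
  passAbove: number; // pass when the Bayes Factor is at or above this value
  failBelow: number; // fail when the Bayes Factor is at or below this value
}

function thresholdDecision(bayesFactor: number, policy: ThresholdPolicy): Action {
  // A Bayes Factor is a likelihood ratio, so it cannot be negative (this also rejects NaN).
  if (!(bayesFactor >= 0)) throw new RangeError("Bayes Factor must be non-negative");
  if (bayesFactor >= policy.passAbove) return "pass";
  if (bayesFactor <= policy.failBelow) return "fail";
  return "escalate"; // evidence is not strong enough in either direction
}

const strongEvidence: ThresholdPolicy = { passAbove: 10, failBelow: 0.1 };
console.log(thresholdDecision(42.1, strongEvidence)); // "pass"
console.log(thresholdDecision(4.3, strongEvidence)); // "escalate"
console.log(thresholdDecision(0.05, strongEvidence)); // "fail"
```

Widening the gap between the two thresholds sends more outputs to review; narrowing it forces more binary calls on weaker evidence.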
|
|
85
|
+
|
|
86
|
+
---
|
|
87
|
+
|
|
88
|
+
## Utility-aware decisions
|
|
89
|
+
|
|
90
|
+
Two errors are not equally costly. Passing a hallucinated medical claim is worse than asking a person to double-check a good answer. The decision-theoretic policy makes the gate minimize expected loss instead of thresholding evidence.
|
|
91
|
+
|
|
92
|
+
```ts
|
|
93
|
+
import { evaluate } from "@takk/bayesoutputgate";
|
|
94
|
+
|
|
95
|
+
const decision = evaluate(scores, models, {
|
|
96
|
+
kind: "decision-theoretic",
|
|
97
|
+
priorHighQuality: 0.8, // base rate of good outputs, before seeing the scores
|
|
98
|
+
lossFalsePass: 10, // cost of shipping a low-quality output
|
|
99
|
+
lossFalseFail: 1, // cost of rejecting a high-quality output
|
|
100
|
+
escalationCost: 0.5, // cost of routing to a human reviewer
|
|
101
|
+
});
|
|
102
|
+
|
|
103
|
+
console.log(decision.action); // the expected-loss-minimizing action
|
|
104
|
+
console.log(decision.posteriorHighQuality); // P(high quality | the scores)
|
|
105
|
+
console.log(decision.expectedLoss); // { pass, fail, escalate }
|
|
106
|
+
console.log(decision.rationale); // "expected-loss"
|
|
107
|
+
```
|
|
108
|
+
|
|
109
|
+
The Bayes Factor becomes a posterior through your prior (posterior log-odds equal the log-Bayes-Factor plus the prior log-odds), and the action with the lowest expected loss wins. Raise `lossFalsePass` and the gate becomes conservative; raise `escalationCost` and it sends fewer outputs to review.
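To make the arithmetic concrete, here is a minimal sketch of the log-odds update and the expected-loss comparison. The function names are hypothetical, and the loss model is an assumption of this sketch: passing costs `lossFalsePass` only when the output is low quality, failing costs `lossFalseFail` only when it is high quality, and escalating always costs `escalationCost`. The library may define these terms differently.

```ts
// Illustrative sketch; function names and the exact loss model are assumptions.
type Action = "pass" | "fail" | "escalate";

interface Costs {
  lossFalsePass: number;
  lossFalseFail: number;
  escalationCost: number;
}

// posterior log-odds = log Bayes Factor + prior log-odds
function posteriorFromBayesFactor(bayesFactor: number, priorHighQuality: number): number {
  const priorLogOdds = Math.log(priorHighQuality / (1 - priorHighQuality));
  const posteriorLogOdds = Math.log(bayesFactor) + priorLogOdds;
  return 1 / (1 + Math.exp(-posteriorLogOdds)); // logistic function maps log-odds to probability
}

function expectedLosses(posterior: number, costs: Costs): Record<Action, number> {
  return {
    pass: (1 - posterior) * costs.lossFalsePass, // we lose only if the output was actually bad
    fail: posterior * costs.lossFalseFail, // we lose only if the output was actually good
    escalate: costs.escalationCost, // assumed to be a flat review cost
  };
}

function chooseAction(losses: Record<Action, number>): Action {
  let best: Action = "pass";
  for (const action of ["fail", "escalate"] as const) {
    if (losses[action] < losses[best]) best = action;
  }
  return best;
}

const costs: Costs = { lossFalsePass: 10, lossFalseFail: 1, escalationCost: 0.5 };
const weak = posteriorFromBayesFactor(2, 0.8); // prior odds 4, posterior odds 8
console.log(weak, chooseAction(expectedLosses(weak, costs)));
const strong = posteriorFromBayesFactor(50, 0.8); // prior odds 4, posterior odds 200
console.log(strong, chooseAction(expectedLosses(strong, costs)));
```

With a Bayes Factor of 2 the posterior is 8/9, so passing has an expected loss of about 1.11 and escalating (0.5) wins. With a Bayes Factor of 50 the posterior is 200/201 and passing becomes the cheapest action.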
|
|
110
|
+
|
|
111
|
+
---
|
|
112
|
+
|
|
113
|
+
## Multi-dimension scoring and the dependence diagnostic
|
|
114
|
+
|
|
115
|
+
Real outputs are scored on several axes (factuality, fluency, safety, relevance), and those axes should not count equally. Each dimension carries a weight in the combined Bayes Factor, and the engine warns when two dimensions are correlated enough to double-count evidence.
|
|
116
|
+
|
|
117
|
+
```ts
|
|
118
|
+
import { dependenceDiagnostic, HypothesisManager } from "@takk/bayesoutputgate";
|
|
119
|
+
|
|
120
|
+
const manager = new HypothesisManager({
|
|
121
|
+
dimensions: [
|
|
122
|
+
{ dimension: "factuality", weight: 2 }, // weigh factuality twice as heavily
|
|
123
|
+
{ dimension: "safety", weight: 3 }, // and safety more still
|
|
124
|
+
{ dimension: "fluency", weight: 1 },
|
|
125
|
+
],
|
|
126
|
+
});
|
|
127
|
+
manager.fit(labeledHistory);
|
|
128
|
+
|
|
129
|
+
// Are any dimensions correlated enough that the independence assumption is unsafe?
|
|
130
|
+
const dependence = dependenceDiagnostic(labeledHistory.map((o) => o.scores));
|
|
131
|
+
console.log(dependence.independenceAssumptionSafe); // false if a pair crosses the threshold
|
|
132
|
+
console.log(dependence.flagged); // the offending pairs, by Pearson correlation
|
|
133
|
+
```
|
|
134
|
+
|
|
135
|
+
The combined Bayes Factor sums per-dimension log-density differences, which assumes the dimensions are conditionally independent given the hypothesis. When `independenceAssumptionSafe` is false, drop or merge the correlated dimensions, or down-weight them, so the gate does not over-count.
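The weighted sum can be sketched as follows. This is an illustration under stated assumptions, not the library's code: the function names are hypothetical, the model shape mirrors the `{ dimension, high, low, weight }` objects shown in the adapter example, the log-gamma uses a standard Lanczos approximation, and the clamping of scores away from 0 and 1 is a choice made for this sketch.

```ts
// Illustrative sketch only: these names are hypothetical, not the library's exports.
interface BetaParams { a: number; b: number }
interface DimensionModel { dimension: string; high: BetaParams; low: BetaParams; weight: number }
interface Score { dimension: string; value: number }

// Lanczos coefficients (g = 7) for ln Gamma(x), x > 0.
const LANCZOS_TAIL = [
  676.5203681218851, -1259.1392167224028, 771.32342877765313,
  -176.61502916214059, 12.507343278686905, -0.13857109526572012,
  9.9843695780195716e-6, 1.5056327351493116e-7,
];

function logGamma(x: number): number {
  if (x < 0.5) {
    // Reflection formula keeps the approximation accurate for small arguments.
    return Math.log(Math.PI / Math.sin(Math.PI * x)) - logGamma(1 - x);
  }
  const z = x - 1;
  let series = 0.99999999999980993;
  LANCZOS_TAIL.forEach((coefficient, i) => {
    series += coefficient / (z + i + 1);
  });
  const t = z + 7.5;
  return 0.5 * Math.log(2 * Math.PI) + (z + 0.5) * Math.log(t) - t + Math.log(series);
}

function betaLogPdf(x: number, { a, b }: BetaParams): number {
  // Clamp away from 0 and 1 so the logarithms stay finite (an assumption of this sketch).
  const clamped = Math.min(1 - 1e-9, Math.max(1e-9, x));
  const logNormalizer = logGamma(a) + logGamma(b) - logGamma(a + b);
  return (a - 1) * Math.log(clamped) + (b - 1) * Math.log(1 - clamped) - logNormalizer;
}

// Weighted sum of per-dimension log-density differences (high minus low).
function combinedLogBayesFactor(scores: Score[], models: DimensionModel[]): number {
  const byDimension = new Map(scores.map((s) => [s.dimension, s.value] as const));
  let total = 0;
  for (const model of models) {
    const value = byDimension.get(model.dimension);
    if (value === undefined) continue; // unscored dimensions contribute no evidence here
    total += model.weight * (betaLogPdf(value, model.high) - betaLogPdf(value, model.low));
  }
  return total;
}

const exampleModels: DimensionModel[] = [
  { dimension: "factuality", high: { a: 9, b: 2 }, low: { a: 2, b: 9 }, weight: 1 },
];
const logBF = combinedLogBayesFactor([{ dimension: "factuality", value: 0.88 }], exampleModels);
console.log(logBF, Math.exp(logBF)); // log-Bayes-Factor and Bayes Factor
```

Because the sum treats each dimension as a separate piece of evidence, two strongly correlated dimensions would each add nearly the same term, which is the over-counting the diagnostic is meant to catch.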
|
|
136
|
+
|
|
137
|
+
---
|
|
138
|
+
|
|
139
|
+
## Calibration, measured not asserted
|
|
140
|
+
|
|
141
|
+
The gate is only as good as the model behind it. Two measurements tell you whether to trust it: goodness-of-fit asks whether the Beta assumption holds for your scores, and decision calibration asks whether the posterior probabilities match reality.
|
|
142
|
+
|
|
143
|
+
```ts
|
|
144
|
+
import {
|
|
145
|
+
brierScore,
|
|
146
|
+
expectedCalibrationError,
|
|
147
|
+
goodnessOfFit,
|
|
148
|
+
reliability,
|
|
149
|
+
} from "@takk/bayesoutputgate";
|
|
150
|
+
|
|
151
|
+
// 1. Does the per-dimension Beta assumption hold? (Kolmogorov-Smirnov against the fit.)
|
|
152
|
+
const fit = goodnessOfFit(highQualityScores, { a: 9, b: 2 });
|
|
153
|
+
console.log(fit.adequate); // false means the density-ratio likelihood should not be trusted here
|
|
154
|
+
|
|
155
|
+
// 2. Do the posterior probabilities match observed outcomes?
|
|
156
|
+
const predictions = [
|
|
157
|
+
{ probability: 0.92, outcome: 1 },
|
|
158
|
+
{ probability: 0.18, outcome: 0 },
|
|
159
|
+
// ...one row per decided output, outcome 1 if it was truly high quality, else 0
|
|
160
|
+
];
|
|
161
|
+
console.log(brierScore(predictions)); // mean squared error, lower is better
|
|
162
|
+
console.log(expectedCalibrationError(predictions)); // gap between predicted and empirical rates
|
|
163
|
+
console.log(reliability(predictions)); // the reliability diagram, bin by bin
|
|
164
|
+
```
|
|
165
|
+
|
|
166
|
+
When goodness-of-fit fails, the Bayes Factor is computed from the wrong model and no decision metric will reveal it, which is exactly why this check comes first.
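For readers who want to see what the two decision-calibration numbers measure, here is a minimal sketch of the Brier score and expected calibration error. The names `brierSketch` and `eceSketch` are hypothetical, and the ten equal-width bins are an assumption; the library's own `brierScore` and `expectedCalibrationError` may bin or weight differently.

```ts
// Illustrative sketch; names and the equal-width binning are assumptions.
interface Prediction { probability: number; outcome: 0 | 1 }

// Brier score: mean squared difference between predicted probability and outcome.
function brierSketch(predictions: Prediction[]): number {
  if (predictions.length === 0) throw new RangeError("need at least one prediction");
  const total = predictions.reduce((sum, p) => {
    const diff = p.probability - p.outcome;
    return sum + diff * diff;
  }, 0);
  return total / predictions.length;
}

// ECE: per-bin gap between mean predicted probability and empirical rate,
// weighted by the share of predictions that fall in each bin.
function eceSketch(predictions: Prediction[], binCount = 10): number {
  if (predictions.length === 0) throw new RangeError("need at least one prediction");
  const bins = Array.from({ length: binCount }, () => ({ n: 0, sumP: 0, sumY: 0 }));
  for (const p of predictions) {
    const index = Math.min(binCount - 1, Math.floor(p.probability * binCount));
    const bin = bins[index];
    if (!bin) continue; // guards against out-of-range or NaN probabilities
    bin.n += 1;
    bin.sumP += p.probability;
    bin.sumY += p.outcome;
  }
  let ece = 0;
  for (const bin of bins) {
    if (bin.n === 0) continue;
    ece += (bin.n / predictions.length) * Math.abs(bin.sumP / bin.n - bin.sumY / bin.n);
  }
  return ece;
}

const sample: Prediction[] = [
  { probability: 0.92, outcome: 1 },
  { probability: 0.18, outcome: 0 },
];
console.log(brierSketch(sample)); // (0.08^2 + 0.18^2) / 2
console.log(eceSketch(sample)); // 0.5 * 0.08 + 0.5 * 0.18
```

Two rows are far too few for a meaningful estimate; the sample only shows the arithmetic.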
|
|
167
|
+
|
|
168
|
+
---
|
|
169
|
+
|
|
170
|
+
## A gate that recalibrates and seals every decision
|
|
171
|
+
|
|
172
|
+
`OutputGateMonitor` wraps the gate with online recalibration and a tamper-evident audit chain, so every decision a non-human entity acts on is explainable after the fact. With guards, it assesses its modeling assumptions on every `fit` and escalates rather than trusting a Bayes Factor from a misspecified model.
|
|
173
|
+
|
|
174
|
+
```ts
|
|
175
|
+
import { OutputGateMonitor } from "@takk/bayesoutputgate";
|
|
176
|
+
|
|
177
|
+
const monitor = new OutputGateMonitor({
|
|
178
|
+
policy: { kind: "bayes-factor", passAbove: 10, failBelow: 0.1 },
|
|
179
|
+
guards: { requireGoodnessOfFit: true, requireIndependence: true },
|
|
180
|
+
});
|
|
181
|
+
monitor.fit(labeledHistory);
|
|
182
|
+
|
|
183
|
+
console.log(monitor.assumptionReport); // goodness-of-fit and dependence, assessed from the history
|
|
184
|
+
|
|
185
|
+
const { decision, entry } = await monitor.record(
|
|
186
|
+
[{ dimension: "factuality", value: 0.88 }],
|
|
187
|
+
{ timestamp: "2026-06-24T00:00:00Z" },
|
|
188
|
+
);
|
|
189
|
+
|
|
190
|
+
console.log(decision.action); // "escalate" with rationale "assumption-violated" if a guard tripped
|
|
191
|
+
console.log(decision.assumptions); // the report attached to the decision
|
|
192
|
+
console.log(entry.hash); // the seal over this decision
|
|
193
|
+
|
|
194
|
+
// Later, when a true label arrives, fold it back in to sharpen calibration.
|
|
195
|
+
monitor.observe({ scores: [{ dimension: "factuality", value: 0.88 }], label: "high" });
|
|
196
|
+
```
|
|
197
|
+
|
|
198
|
+
You can also guard a stateless `evaluate` by passing a precomputed report from `assessAssumptions(history, models)`: when `requireGoodnessOfFit` or `requireIndependence` is set and the matching assumption fails, the decision is forced to `escalate` with rationale `"assumption-violated"`, and the evidence is preserved on the decision for the reviewer.
|
|
199
|
+
|
|
200
|
+
---
|
|
201
|
+
|
|
202
|
+
## A tool a non-human entity can call
|
|
203
|
+
|
|
204
|
+
The adapter turns the whole gate into a framework-agnostic tool. The shape matches what MCP servers and LLM tool-calling APIs expect, so a non-human entity (NHE) can validate its own output before acting on it. Input arriving from a model is parsed defensively, and the output is made JSON-safe.
|
|
205
|
+
|
|
206
|
+
```ts
|
|
207
|
+
import { bayesOutputGateTool, runTool } from "@takk/bayesoutputgate";
|
|
208
|
+
|
|
209
|
+
bayesOutputGateTool.name; // "bayes_output_gate"
|
|
210
|
+
bayesOutputGateTool.inputSchema; // JSON Schema for the tool input
|
|
211
|
+
|
|
212
|
+
const output = runTool({
|
|
213
|
+
scores: [{ dimension: "factuality", value: 0.88 }],
|
|
214
|
+
models: [{ dimension: "factuality", high: { a: 9, b: 2 }, low: { a: 2, b: 9 }, weight: 1 }],
|
|
215
|
+
policy: { kind: "bayes-factor", passAbove: 10, failBelow: 0.1 },
|
|
216
|
+
});
|
|
217
|
+
console.log(output.action); // "pass" | "fail" | "escalate"
|
|
218
|
+
```
|
|
219
|
+
|
|
220
|
+
---
|
|
221
|
+
|
|
222
|
+
## Entry points
|
|
223
|
+
|
|
224
|
+
Thirteen subpath exports, each importable on its own. The core is node-free; only `node` touches a Node built-in.
|
|
225
|
+
|
|
226
|
+
| Import | What it gives you |
|
|
227
|
+
|---|---|
|
|
228
|
+
| `@takk/bayesoutputgate` | The `evaluate` gate, `OutputGate`, `OutputGateMonitor`, plus the full toolkit and types. |
|
|
229
|
+
| `@takk/bayesoutputgate/beta` | The calibrated `BetaModel`, the online-updatable engine both hypotheses share. |
|
|
230
|
+
| `@takk/bayesoutputgate/hypothesis` | The `HypothesisManager`, per-dimension high and low score models from labeled history. |
|
|
231
|
+
| `@takk/bayesoutputgate/likelihood` | Per-score and marginal likelihoods, density-ratio and Beta-Binomial modes. |
|
|
232
|
+
| `@takk/bayesoutputgate/bayesfactor` | `bayesFactor` and the Jeffreys-scale interpretation. |
|
|
233
|
+
| `@takk/bayesoutputgate/dimensions` | `dependenceDiagnostic`, pairwise correlation between dimensions. |
|
|
234
|
+
| `@takk/bayesoutputgate/policy` | `decide` and `posteriorHighQuality`, the Bayes Factor and decision-theoretic policies. |
|
|
235
|
+
| `@takk/bayesoutputgate/calibration` | `goodnessOfFit`, `brierScore`, `expectedCalibrationError`, `reliability`, `assessAssumptions`. |
|
|
236
|
+
| `@takk/bayesoutputgate/gate` | `evaluate`, `OutputGate`, and `OutputGateMonitor`, with optional assumption guards. |
|
|
237
|
+
| `@takk/bayesoutputgate/audit` | Append-only SHA-256 hash-chained audit log, via Web Crypto. |
|
|
238
|
+
| `@takk/bayesoutputgate/adapter` | The framework-agnostic MCP and LLM tool definition. |
|
|
239
|
+
| `@takk/bayesoutputgate/node` | File loaders for JSON and CSV score history, over `node:fs`. |
|
|
240
|
+
| `@takk/bayesoutputgate/edge` | The node-free core, re-exported for edge runtimes and the browser. |
|
|
241
|
+
|
|
242
|
+
---
|
|
243
|
+
|
|
244
|
+
## Tamper-evident audit trail
|
|
245
|
+
|
|
246
|
+
```ts
|
|
247
|
+
import { AuditChain, verifyChain } from "@takk/bayesoutputgate";
|
|
248
|
+
|
|
249
|
+
const chain = new AuditChain();
|
|
250
|
+
await chain.append({ action: "pass", bayesFactor: 42.1 });
|
|
251
|
+
await chain.append({ action: "escalate", bayesFactor: 4.3 });
|
|
252
|
+
|
|
253
|
+
await verifyChain(chain.toArray()); // { valid: true }, until any entry is altered
|
|
254
|
+
```
|
|
255
|
+
|
|
256
|
+
Each entry is recorded append-only and chained to the previous one through a SHA-256 hash of its canonical form. Any later edit, deletion, or reordering breaks the chain and `verifyChain` reports the first broken index. The seal is an integrity seal, not a digital signature: it makes tampering detectable, not impossible. The chain uses the Web Crypto API, so the audit surface runs in Node, edge runtimes, and the browser. The `OutputGateMonitor` records every decision into such a chain automatically.
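The chaining idea can be sketched with the Web Crypto API alone. This is not the library's `AuditChain`: the entry shape, the `index|previousHash|payload` string that gets hashed, and the all-zero genesis hash are assumptions of this sketch, and the payload is taken as an already-serialized string rather than the canonical form the library uses.

```ts
// Illustrative sketch; entry layout and hashing input are assumptions, not the library's format.
interface SealedEntry {
  index: number;
  payload: string; // caller-serialized decision, for example JSON.stringify(decision)
  previousHash: string;
  hash: string;
}

const GENESIS_HASH = "0".repeat(64);

async function sha256Hex(text: string): Promise<string> {
  const bytes = new TextEncoder().encode(text);
  const digest = await crypto.subtle.digest("SHA-256", bytes);
  return Array.from(new Uint8Array(digest), (b) => b.toString(16).padStart(2, "0")).join("");
}

// Returns a new chain with the payload sealed against the previous entry's hash.
async function appendEntry(chain: SealedEntry[], payload: string): Promise<SealedEntry[]> {
  const last = chain[chain.length - 1];
  const previousHash = last ? last.hash : GENESIS_HASH;
  const index = chain.length;
  const hash = await sha256Hex(`${index}|${previousHash}|${payload}`);
  return [...chain, { index, payload, previousHash, hash }];
}

// Returns the first index whose seal no longer verifies, or -1 if the chain is intact.
async function firstBrokenIndex(chain: SealedEntry[]): Promise<number> {
  let previousHash = GENESIS_HASH;
  for (let i = 0; i < chain.length; i += 1) {
    const entry = chain[i];
    if (!entry) return i;
    const expected = await sha256Hex(`${i}|${previousHash}|${entry.payload}`);
    if (entry.index !== i || entry.previousHash !== previousHash || entry.hash !== expected) return i;
    previousHash = entry.hash;
  }
  return -1;
}
```

Editing, deleting, or reordering an entry changes either its own hash input or the `previousHash` its successor expects, so verification stops at the first affected index. As the text notes, anyone able to rewrite the whole chain can recompute every hash, which is why this is an integrity seal and not a signature.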
|
|
257
|
+
|
|
258
|
+
---
|
|
259
|
+
|
|
260
|
+
## Governance
|
|
261
|
+
|
|
262
|
+
Every decision can be recorded and replayed. The `OutputGateMonitor` appends each verdict to a tamper-evident audit chain, so every output a non-human entity ships or rejects is explainable after the fact, with the Bayes Factor and the policy it was decided on. No runtime dependency is taken; you wire the chain to your own telemetry and compliance stack. This is the governance seam for Massive Intelligence (IM) systems and the validation evidence the EU AI Act expects.
|
|
263
|
+
|
|
264
|
+
---
|
|
265
|
+
|
|
266
|
+
## CLI
|
|
267
|
+
|
|
268
|
+
BayesOutputGate ships a command-line tool that runs the real engine, with every number produced by execution.
|
|
269
|
+
|
|
270
|
+
```bash
|
|
271
|
+
# Fit the two hypotheses from labeled history, then decide each output's scores.
|
|
272
|
+
npx @takk/bayesoutputgate gate history.json scores.json
|
|
273
|
+
|
|
274
|
+
# Compute the Bayes Factor for each output against explicit dimension models.
|
|
275
|
+
npx @takk/bayesoutputgate bayes-factor models.json scores.json
|
|
276
|
+
|
|
277
|
+
# Fit, then report goodness-of-fit per dimension and the dependence diagnostic.
|
|
278
|
+
npx @takk/bayesoutputgate calibrate history.json
|
|
279
|
+
|
|
280
|
+
# Verify the integrity of a sealed decision audit chain.
|
|
281
|
+
npx @takk/bayesoutputgate audit-verify chain.json
|
|
282
|
+
```
|
|
283
|
+
|
|
284
|
+
The history file is a JSON array of labeled observations (`{ scores, label }`). The scores file is either a JSON array of score vectors or a CSV with a header row of dimension names and one row of values per output. Flags: `--pass-above`, `--fail-below`, `--json`. Exit codes: `0` ok, `2` usage or input error, `20` a broken audit chain, `30` a fail decision, `40` an escalate decision.
|
|
285
|
+
|
|
286
|
+
---
|
|
287
|
+
|
|
288
|
+
## The math, in one paragraph
|
|
289
|
+
|
|
290
|
+
For each quality dimension the engine models the score of a high-quality output and the score of a low-quality output as two Beta densities, calibrated from labeled history by method of moments and regularized by a prior treated as pseudo-observations. The Bayes Factor for an output is the ratio of the likelihood of its scores under the high-quality hypothesis to the likelihood under the low-quality one; across dimensions the log-Bayes-Factors add, weighted, which is the conditional-independence assumption the dependence diagnostic checks. The combined log-Bayes-Factor lands on the Jeffreys scale (Kass and Raftery, 1995), whose symmetric boundaries at 3, 10, and 100 separate inconclusive, substantial, strong, and decisive evidence. A pure policy thresholds the Bayes Factor on that scale; the decision-theoretic policy turns it into a posterior through an explicit prior (posterior log-odds equal the log-Bayes-Factor plus the prior log-odds) and picks the expected-loss-minimizing action under an asymmetric loss. Calibration is measured, not assumed: goodness-of-fit by the Kolmogorov-Smirnov statistic against the fitted Beta, and decision calibration by the Brier score, expected calibration error, and a reliability diagram. The data is yours; the calibration is the library's.
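As a worked illustration of the calibration step, the method-of-moments estimate for a Beta density matches the sample mean m and variance v to the Beta mean and variance, giving a = m·k and b = (1 − m)·k with k = m(1 − m)/v − 1. The sketch below shows only that step; `fitBetaByMoments` is a hypothetical name, the use of the unbiased sample variance is an assumption, and the prior regularization by pseudo-observations that the library applies is deliberately left out.

```ts
// Illustrative sketch; omits the library's prior regularization.
interface BetaParams { a: number; b: number }

function fitBetaByMoments(scores: number[]): BetaParams {
  if (scores.length < 2) throw new RangeError("need at least two scores");
  const mean = scores.reduce((sum, x) => sum + x, 0) / scores.length;
  // Unbiased sample variance (dividing by n - 1 is an assumption of this sketch).
  const variance =
    scores.reduce((sum, x) => sum + (x - mean) * (x - mean), 0) / (scores.length - 1);
  const bound = mean * (1 - mean);
  // A Beta density exists for these moments only when 0 < variance < mean * (1 - mean).
  if (!(variance > 0 && variance < bound)) {
    throw new RangeError("moments are not compatible with a Beta density");
  }
  const common = bound / variance - 1;
  return { a: mean * common, b: (1 - mean) * common };
}

// Mean 0.5 and sample variance 0.2 / 3 give k = 2.75, so a = b = 1.375.
const fitted = fitBetaByMoments([0.2, 0.4, 0.6, 0.8]);
console.log(fitted);
```

With only a handful of labeled outputs the raw estimate is unstable, which is one reason a prior acting as pseudo-observations is useful in practice.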
|
|
291
|
+
|
|
292
|
+
---
|
|
293
|
+
|
|
294
|
+
## Benchmark
|
|
295
|
+
|
|
296
|
+
A value benchmark scores the gate against fairly-tuned fixed thresholds on a held-out test set the gate never calibrates on, under an asymmetric loss (a false pass costs 10, a false fail 2, an escalation 1). Each output is two-dimensional, and a "masking" low-quality output scores high on factuality but low on safety, the failure a single averaged threshold hides. Every number is from real execution against the compiled `dist`; run it with `node --import tsx benchmarks/value.ts`.
|
|
297
|
+
|
|
298
|
+
| Regime | Policy | Avg loss | Accuracy | Escalated |
|
|
299
|
+
|---|---|---|---|---|
|
|
300
|
+
| Beta-satisfied | fixed threshold (best of mean, min) | 0.619 | 78.6% | 0% |
|
|
301
|
+
| Beta-satisfied | **BayesOutputGate** | **0.456** | **93.7%** | 23.7% |
|
|
302
|
+
| Beta-violated | fixed threshold (best of mean, min) | 0.986 | 50.8% | 0% |
|
|
303
|
+
| Beta-violated | **BayesOutputGate** (unguarded) | **0.869** | **79.7%** | 46.2% |
|
|
304
|
+
|
|
305
|
+
When the Beta assumption holds, the calibrated multi-dimension gate cuts average loss by about 26 percent relative to the best tuned threshold (0.456 versus 0.619) by escalating the genuinely ambiguous cases to review. When the assumption is violated (a strongly bimodal score distribution), the goodness-of-fit diagnostic detects it, a signal a fixed threshold has no way to produce. The gate degrades gracefully, keeping the lowest loss (0.869 versus 0.986) and zero false passes, and with the goodness-of-fit guard it escalates every output rather than acting on a model it knows is misspecified. The gate sustains roughly 2.7 million two-dimension decisions per second. The point is not that the gate always wins; it is that the gate knows when to defer and tells you when its model does not fit.
|
|
306
|
+
|
|
307
|
+
See [examples/](./examples) for five runnable demos against the compiled `dist`: the basic gate, the decision-theoretic policy, the assumption guards, the MCP tool adapter, and node-free edge usage with a verifiable audit chain.
|
|
308
|
+
|
|
309
|
+
---
|
|
310
|
+
|
|
311
|
+
## Quality
|
|
312
|
+
|
|
313
|
+
- 90 tests across 15 suites, all passing under Vitest, including the special functions verified against closed forms, the Beta fit and online update, the Bayes Factor against the Jeffreys boundaries, the decision-theoretic policy verified against expected loss, the goodness-of-fit and dependence diagnostics, the assumption guards that escalate on a violated assumption, and the audit chain detecting tampering and broken links.
|
|
314
|
+
- Coverage: statements 94.64%, branches 87.58%, functions 96.24%, lines 95.91%.
|
|
315
|
+
- Lint clean under Biome.
|
|
316
|
+
- Typecheck clean under TypeScript in maximum strict mode (`exactOptionalPropertyTypes`, `useUnknownInCatchVariables`, `noUncheckedIndexedAccess`, `noPropertyAccessFromIndexSignature`).
|
|
317
|
+
- `publint` clean and `@arethetypeswrong/cli` green across all thirteen subpaths.
|
|
318
|
+
- `size-limit` under budget on every bundle (brotli core facade 5.29 kB).
|
|
319
|
+
- Distribution smoke test exercising the compiled ESM and CJS artifacts and the compiled CLI spawned as a single Node process. The full suite, the smoke test, and the CLI are re-verified on Node 24.
|
|
320
|
+
- Zero required runtime dependencies; a node-free core that runs in Node, edge runtimes, and the browser; dual ESM plus CJS; TypeScript-first.
|
|
321
|
+
|
|
322
|
+
See [SPEC.md](./SPEC.md) for the formal specification, public surface, and stability promise.
|
|
323
|
+
|
|
324
|
+
---
|
|
325
|
+
|
|
326
|
+
## FAQ
|
|
327
|
+
|
|
328
|
+
**How is this different from thresholding a judge score?**
|
|
329
|
+
A threshold compares one number to a constant you guessed. BayesOutputGate compares how likely the observed scores are under a high-quality hypothesis against a low-quality one, calibrated from your labeled history, and reports the strength of that evidence on the Jeffreys scale. It accounts for how well good and bad outputs actually separate, weighs dimensions, and can express its uncertainty by escalating instead of forcing a binary call.
|
|
330
|
+
|
|
331
|
+
**What if my dimensions are correlated?**
|
|
332
|
+
The dependence diagnostic measures pairwise correlation across your history and flags pairs that cross a threshold. Correlated dimensions double-count evidence and inflate the Bayes Factor, so when a pair is flagged you drop, merge, or down-weight one of them. The check is honest about a real limitation rather than hiding it.
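A minimal sketch of the underlying measurement, the Pearson correlation between two dimensions' scores across the same outputs, is shown below. The names `pearson` and `isFlagged` are hypothetical, and the 0.8 cutoff is an assumed value for illustration; the README does not state the library's threshold.

```ts
// Illustrative sketch; names and the 0.8 cutoff are assumptions.
function pearson(xs: number[], ys: number[]): number {
  if (xs.length !== ys.length || xs.length < 2) {
    throw new RangeError("need two equal-length series with at least two points");
  }
  const n = xs.length;
  const meanX = xs.reduce((sum, x) => sum + x, 0) / n;
  const meanY = ys.reduce((sum, y) => sum + y, 0) / n;
  let covariance = 0;
  let varianceX = 0;
  let varianceY = 0;
  xs.forEach((x, i) => {
    const dx = x - meanX;
    const dy = (ys[i] as number) - meanY; // lengths were checked above
    covariance += dx * dy;
    varianceX += dx * dx;
    varianceY += dy * dy;
  });
  if (varianceX === 0 || varianceY === 0) {
    throw new RangeError("correlation is undefined for a constant dimension");
  }
  return covariance / Math.sqrt(varianceX * varianceY);
}

// Flag a pair when the magnitude of the correlation reaches the cutoff.
function isFlagged(correlation: number, threshold = 0.8): boolean {
  return Math.abs(correlation) >= threshold;
}

const factuality = [0.1, 0.4, 0.6, 0.9];
const relevance = factuality.map((x) => 1 - x); // perfectly anti-correlated by construction
const r = pearson(factuality, relevance);
console.log(r, isFlagged(r));
```

Pearson correlation only captures linear dependence, so a pair can pass this check and still be dependent in a nonlinear way.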
|
|
333
|
+
|
|
334
|
+
**Does it act on the decision?**
|
|
335
|
+
No. BayesOutputGate returns a recommendation, pass, fail, or escalate, with the Bayes Factor, the strength, and the per-dimension contributions. Your orchestrator (an MCP server, an LLM tool loop, a Hermes-style runtime, your own controller) consumes that decision and acts. The gate decides; you wire the action.
|
|
336
|
+
|
|
337
|
+
**Is the calibration real?**
|
|
338
|
+
It is measured. The library computes goodness-of-fit against the fitted Beta and, once outcomes are known, the Brier score, expected calibration error, and a reliability diagram on your own data. "Calibrated" means you can verify it, not that the library asserts it.
|
|
339
|
+
|
|
340
|
+
**Bayes Factor policy or decision-theoretic?**
|
|
341
|
+
Use the Bayes Factor policy when the two error types cost about the same and you just want principled evidence thresholds. Use the decision-theoretic policy when passing a bad output and rejecting a good one have different costs, which is the usual case in production.
|
|
342
|
+
|
|
343
|
+
**Does this work in Cloudflare Workers, Vercel Edge, Bun, and Deno?**
|
|
344
|
+
Yes. The core is node-free; the audit seal uses the Web Crypto API, not `node:crypto`. Import `@takk/bayesoutputgate` or `@takk/bayesoutputgate/edge` anywhere with Web Crypto. Only `@takk/bayesoutputgate/node` requires Node.
|
|
345
|
+
|
|
346
|
+
---
|
|
347
|
+
|
|
348
|
+
## Contributing
|
|
349
|
+
|
|
350
|
+
See [.github/CONTRIBUTING.md](./.github/CONTRIBUTING.md) for the contributor guide. For substantive proposals, open a GitHub issue first; trivial fixes can go straight to a PR. All commits require DCO sign-off (`git commit -s`). Non-trivial contributions are governed by the [Contributor License Agreement](./CLA.md).
|
|
351
|
+
|
|
352
|
+
## Community and support
|
|
353
|
+
|
|
354
|
+
- **Issues and feature requests.** Open a GitHub issue at [`davccavalcante/bayesoutputgate/issues`](https://github.com/davccavalcante/bayesoutputgate/issues). Include the package version, a minimal reproduction, the scores and models you fed in, and where relevant the `evaluate()` decision.
|
|
355
|
+
- **Security disclosures.** Do not open public issues for vulnerabilities. Follow the responsible-disclosure flow in [`SECURITY.md`](./SECURITY.md): email `davcavalcante@proton.me` (or `say@takk.ag`) with a subject line beginning `[SECURITY]`.
|
|
356
|
+
- **Code of Conduct.** This project follows the [Contributor Covenant 2.1](./CODE_OF_CONDUCT.md). Participation in any BayesOutputGate space implies agreement.
|
|
357
|
+
- **Contributions.** All non-trivial contributions go through the [Contributor License Agreement](./CLA.md). Tests, lint, typecheck, and build must be green before review (`pnpm verify`).
|
|
358
|
+
|
|
359
|
+
---
|
|
360
|
+
|
|
361
|
+
## Author
|
|
362
|
+
|
|
363
|
+
Created by **David C Cavalcante**, [davcavalcante@proton.me](mailto:davcavalcante@proton.me) (preferred), [say@takk.ag](mailto:say@takk.ag) (Takk relay), [linkedin.com/in/hellodav](https://linkedin.com/in/hellodav), [x.com/davccavalcante](https://x.com/davccavalcante), [takk.ag](https://takk.ag/).
|
|
364
|
+
|
|
365
|
+
BayesOutputGate is part of a broader portfolio of NPM packages targeting Massive Intelligence (IM) native infrastructure for 2026-2030, built at Takk Innovate Studio. Calibrated output validation with a Bayes Factor is a strong default for that era, when fleets of long-running non-human entities must validate their own outputs before acting on them.
|
|
366
|
+
|
|
367
|
+
---
|
|
368
|
+
|
|
369
|
+
## Related research by the author
|
|
370
|
+
|
|
371
|
+
The architectural philosophy behind BayesOutputGate separates the calibrated models, the Bayes Factor, the decision policy, the calibration measurement, and persistence into composable, independently governed layers. That separation echoes the author's research frameworks:
|
|
372
|
+
|
|
373
|
+
- **MAIC (Massive Artificial Intelligence Consciousness)**, a systemic intelligence framework designed to coordinate, supervise, and govern large-scale Massive Intelligence ecosystems, providing global context awareness, alignment, and orchestration across multiple models, agents, and decision layers.
|
|
374
|
+
- **HIM (Hybrid Entity Intelligence Model)**, a hybrid intelligence layer that integrates Massive Intelligence systems with human-defined logic, rules, heuristics, and strategic intent, interpreting objectives and structuring decision-making before and after model execution.
|
|
375
|
+
- **NHE (Noumenal Higher-order Entity)**, a non-human cognitive entity with a defined functional identity and operational agency within a Massive Intelligence ecosystem, operating through coordinated intelligence layers while maintaining a non-anthropomorphic identity.
|
|
376
|
+
|
|
377
|
+
These frameworks are published independently of BayesOutputGate and are separate works:
|
|
378
|
+
|
|
379
|
+
- Research papers: [The Soul of the Machine](https://philarchive.org/rec/CRTTSO), [Beyond Consciousness in LLMs](https://philarchive.org/rec/CRTBCI), [The Cave of Silence](https://philarchive.org/rec/CRTTCO).
|
|
380
|
+
- PhilPapers profile: [David Cortes Cavalcante](https://philpeople.org/profiles/david-cortes-cavalcante).
|
|
381
|
+
- Hugging Face: [TeleologyHI](https://huggingface.co/TeleologyHI).
|
|
382
|
+
- GitHub: [davccavalcante](https://github.com/davccavalcante), [Takk8IS](https://github.com/Takk8IS).
|
|
383
|
+
|
|
384
|
+
---
|
|
385
|
+
|
|
386
|
+
## Sponsors
|
|
387
|
+
|
|
388
|
+
Join the journey as the portfolio continues to ship Massive Intelligence (IM) native infrastructure. Your support is the cornerstone of this work.
|
|
389
|
+
|
|
390
|
+
- Sponsor on GitHub: [github.com/sponsors/davccavalcante](https://github.com/sponsors/davccavalcante)
|
|
391
|
+
- USDT (TRC-20): `TS1vuhMAhFpbd7y68cu5ZtP9PsXVmZWmeh`
|
|
392
|
+
|
|
393
|
+
---
|
|
394
|
+
|
|
395
|
+
## Privacy
|
|
396
|
+
|
|
397
|
+
BayesOutputGate runs entirely inside your own process and infrastructure. It makes no outbound calls to the author, collects no telemetry, and ships no analytics. The only state it holds is the labeled history, scores, and decisions you feed it. See [PRIVACY.md](./PRIVACY.md) for the full data-handling notice, including how the optional file loaders read history from disk.
|
|
398
|
+
|
|
399
|
+
---
|
|
400
|
+
|
|
401
|
+
## License
|
|
402
|
+
|
|
403
|
+
Licensed under the **Apache License 2.0**. See [LICENSE](./LICENSE) for the full text and [NOTICE](./NOTICE) for attribution and third-party component licenses. You may use, modify, and distribute the code under the terms of that license, including its patent grant and attribution requirements.
|
package/SECURITY.md
ADDED
|
@@ -0,0 +1,98 @@
|
|
|
1
|
+
# Security Policy
|
|
2
|
+
|
|
3
|
+
`@takk/bayesoutputgate` is a stable (1.0.0) library for calibrated output
|
|
4
|
+
validation with the Bayes Factor. We take security reports seriously and aim to
|
|
5
|
+
acknowledge each one within two business days.
|
|
6
|
+
|
|
7
|
+
## Supported versions
|
|
8
|
+
|
|
9
|
+
Each published version follows strict SemVer (see [`SPEC.md`](./SPEC.md) and
|
|
10
|
+
[`.github/RELEASING.md`](./.github/RELEASING.md)). Only the latest minor of the
|
|
11
|
+
current major receives security patches; an older major receives critical-CVE
|
|
12
|
+
fixes for 6 months after the next major lands.
|
|
13
|
+
|
|
14
|
+
| Package | Supported |
|---|---|
| `@takk/bayesoutputgate` | `1.0.x` (current `latest` dist-tag) |
|
|
17
|
+
|
|
18
|
+
## Reporting a vulnerability
|
|
19
|
+
|
|
20
|
+
**Please do not file public GitHub issues for security problems.** Send reports
|
|
21
|
+
to **davcavalcante@proton.me** (preferred) or **say@takk.ag** (Takk relay),
|
|
22
|
+
with the subject line beginning `[SECURITY]`.
|
|
23
|
+
|
|
24
|
+
Include, at minimum:
|
|
25
|
+
|
|
26
|
+
- Affected version (`npm ls @takk/bayesoutputgate`).
|
|
27
|
+
- Reproduction steps or a minimal proof-of-concept.
|
|
28
|
+
- Impact assessment (what an attacker can achieve).
|
|
29
|
+
- Any suggested mitigation.
|
|
30
|
+
|
|
31
|
+
If your report involves a vulnerability in a third-party peer dependency, please
|
|
32
|
+
also link the upstream advisory (CVE, GHSA, etc.) so we can coordinate the
|
|
33
|
+
disclosure.
|
|
34
|
+
|
|
35
|
+
PGP / signed reports are welcome but not required. If you need an out-of-band
|
|
36
|
+
channel, ask in the first message and we will propose one.
|
|
37
|
+
|
|
38
|
+
## Response process
|
|
39
|
+
|
|
40
|
+
1. Acknowledgement within **2 business days**.
2. Triage and severity assignment within **7 days**.
3. Fix targeted for the next release; critical issues ship as an out-of-band
   patch on the affected minor.
4. Coordinated disclosure: the reporter is credited in the changelog and
   advisory unless they request anonymity.
|
|
46
|
+
|
|
47
|
+
## Threat model in scope
|
|
48
|
+
|
|
49
|
+
Findings in any of the following are in scope:
|
|
50
|
+
|
|
51
|
+
- **Audit integrity.** Any way to make `verifyChain` return `valid: true` for a
  chain that was altered after sealing, any hash-chain construction that lets a
  forged entry pass, or any way to defeat the SHA-256 chaining. The seal is an
  integrity seal, not a signature: it proves a log was not altered after sealing,
  not who produced it, and that boundary is documented, not a vulnerability.
- **Untrusted input parsing.** Any malformed tool input, history file, score
  file, or audit chain that bypasses the defensive validation in the adapter or
  the node loaders and yields prototype pollution, an unhandled crash, or a
  non-finite value that propagates into a fit. Input arriving from a model through
  `runTool` and history read from disk are both treated as untrusted.
- **Decision integrity.** Any way to drive a numerically invalid model (NaN or
  Infinity where a finite number is required) past the Beta fit, or any edge case
  in the Bayes Factor, the decision policy, or the calibration measurement that
  returns a probability outside the valid range, a non-finite Bayes Factor, or an
  action that does not follow from the configured thresholds or losses.
- **Supply chain.** Tarball contamination, compromised npm scope, or a published
  artifact whose provenance attestation does not match the source commit.
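
As a rough picture of what the untrusted-input and decision-integrity items guard against, the sketch below validates a score object before it reaches any fit. It is not the adapter's actual validation: the function name `parseScores`, the forbidden-key list, and the assumption that scores live in [0, 1] are illustrative choices.

```typescript
// Keys that commonly carry prototype-pollution payloads in parsed JSON.
const FORBIDDEN_KEYS = new Set(["__proto__", "constructor", "prototype"]);

// Hypothetical validator: accepts only a plain object of finite scores in [0, 1].
function parseScores(input: unknown): Record<string, number> {
  if (typeof input !== "object" || input === null || Array.isArray(input)) {
    throw new TypeError("scores must be a plain object");
  }
  // A null-prototype target cannot be polluted through inherited setters.
  const clean: Record<string, number> = Object.create(null);
  for (const [key, value] of Object.entries(input)) {
    if (FORBIDDEN_KEYS.has(key)) {
      throw new TypeError(`forbidden key: ${key}`);
    }
    // Rejects strings, NaN, Infinity, and out-of-range numbers in one check.
    if (typeof value !== "number" || !Number.isFinite(value) || value < 0 || value > 1) {
      throw new RangeError(`score "${key}" must be a finite number in [0, 1]`);
    }
    clean[key] = value;
  }
  return clean;
}
```

Rejecting the whole input on the first bad value, rather than silently dropping it, keeps a malformed payload from producing a verdict that looks valid.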
|
|
68
|
+
|
|
69
|
+
## Out of scope
|
|
70
|
+
|
|
71
|
+
- The correctness of the scorer you choose. BayesOutputGate weighs the scores you
|
|
72
|
+
provide; it does not produce the scores, and it does not decide what a good
|
|
73
|
+
output is for you.
|
|
74
|
+
- The correctness of the scores and labels you feed it. Garbage scores produce a
|
|
75
|
+
garbage verdict; that is a usage concern, not a vulnerability.
|
|
76
|
+
- Statistical mis-modeling when the assumptions are violated (scores that are not
|
|
77
|
+
Beta-distributed, correlated dimensions that double-count evidence, or labels
|
|
78
|
+
that do not reflect true quality). The library documents these limits, provides
|
|
79
|
+
goodness-of-fit and dependence diagnostics, and measures its calibration; a
|
|
80
|
+
misleading verdict under a violated assumption is expected behavior, not a defect.
|
|
81
|
+
- Theoretical attacks against the cryptographic primitive used for the audit
|
|
82
|
+
chain (SHA-256) and the Web Crypto implementation of the host runtime; report
|
|
83
|
+
those upstream.
|
|
84
|
+
|
|
85
|
+
## Supply-chain assurances
|
|
86
|
+
|
|
87
|
+
- **Zero required runtime dependencies.** The attack surface from transitive
|
|
88
|
+
dependencies is eliminated. Every `@takk` sibling is an optional peer
|
|
89
|
+
dependency you install explicitly.
|
|
90
|
+
- **Node-free core.** The core, including the audit seal, uses the Web Crypto
|
|
91
|
+
API (`globalThis.crypto.subtle`) rather than `node:crypto`. Only the optional
|
|
92
|
+
`@takk/bayesoutputgate/node` loaders touch the Node standard library, and they
|
|
93
|
+
only read the history file you point them at.
|
|
94
|
+
- **Provenance.** Every release is published with `npm publish --provenance`
|
|
95
|
+
(SLSA attestation by GitHub Actions). Verify with
|
|
96
|
+
`npm view @takk/bayesoutputgate@<version> --json | jq .dist.attestations`.
|
|
97
|
+
- **Lockfile committed.** `pnpm-lock.yaml` is tracked in git for reproducible
|
|
98
|
+
installs.
|