manifold-guard 0.1.0__tar.gz
- manifold_guard-0.1.0/LICENSE +61 -0
- manifold_guard-0.1.0/PKG-INFO +602 -0
- manifold_guard-0.1.0/README.md +520 -0
- manifold_guard-0.1.0/manifold_guard.egg-info/PKG-INFO +602 -0
- manifold_guard-0.1.0/manifold_guard.egg-info/SOURCES.txt +33 -0
- manifold_guard-0.1.0/manifold_guard.egg-info/dependency_links.txt +1 -0
- manifold_guard-0.1.0/manifold_guard.egg-info/entry_points.txt +5 -0
- manifold_guard-0.1.0/manifold_guard.egg-info/requires.txt +5 -0
- manifold_guard-0.1.0/manifold_guard.egg-info/top_level.txt +1 -0
- manifold_guard-0.1.0/mbt_ai_tools/__init__.py +48 -0
- manifold_guard-0.1.0/mbt_ai_tools/cli.py +527 -0
- manifold_guard-0.1.0/mbt_ai_tools/data/regression_corpus.jsonl +220 -0
- manifold_guard-0.1.0/mbt_ai_tools/eval.py +321 -0
- manifold_guard-0.1.0/mbt_ai_tools/mbt/__init__.py +46 -0
- manifold_guard-0.1.0/mbt_ai_tools/mbt/consensus.py +67 -0
- manifold_guard-0.1.0/mbt_ai_tools/mbt/embeddings.py +48 -0
- manifold_guard-0.1.0/mbt_ai_tools/mbt/geometry.py +48 -0
- manifold_guard-0.1.0/mbt_ai_tools/mbt/regulator.py +737 -0
- manifold_guard-0.1.0/mbt_ai_tools/mbt/stability.py +80 -0
- manifold_guard-0.1.0/mbt_ai_tools/mbt/tokens.py +81 -0
- manifold_guard-0.1.0/mbt_ai_tools/mbt/utils.py +62 -0
- manifold_guard-0.1.0/pyproject.toml +39 -0
- manifold_guard-0.1.0/setup.cfg +4 -0
- manifold_guard-0.1.0/tests/test_build_eval_report.py +101 -0
- manifold_guard-0.1.0/tests/test_cli.py +527 -0
- manifold_guard-0.1.0/tests/test_docs_quality.py +1449 -0
- manifold_guard-0.1.0/tests/test_evaluate_regulator.py +146 -0
- manifold_guard-0.1.0/tests/test_golden_behavior.py +56 -0
- manifold_guard-0.1.0/tests/test_regression_corpus.py +39 -0
- manifold_guard-0.1.0/tests/test_regulator.py +231 -0
- manifold_guard-0.1.0/tests/test_release_check.py +86 -0
- manifold_guard-0.1.0/tests/test_release_evidence.py +103 -0
- manifold_guard-0.1.0/tests/test_release_readiness.py +120 -0
- manifold_guard-0.1.0/tests/test_tokens.py +79 -0
- manifold_guard-0.1.0/tests/test_validate_examples.py +61 -0
|
@@ -0,0 +1,61 @@
|
|
|
1
|
+
Motion-TimeSpace Non-Commercial License (MTNSL-1.0)
|
|
2
|
+
Copyright (c) 2025 Martin Ollett
|
|
3
|
+
|
|
4
|
+
1. Permission and Non-Commercial Use
|
|
5
|
+
------------------------------------
|
|
6
|
+
This software, including all source code, data files, documentation, and
|
|
7
|
+
any associated materials (collectively, the “Software”), is freely available
|
|
8
|
+
for:
|
|
9
|
+
|
|
10
|
+
• Non-commercial academic research
|
|
11
|
+
• Personal, hobby, or educational projects
|
|
12
|
+
• Open scientific collaboration
|
|
13
|
+
|
|
14
|
+
Users are permitted to use, copy, modify, and distribute the Software for
|
|
15
|
+
non-commercial purposes, provided proper attribution is included.
|
|
16
|
+
|
|
17
|
+
Required attribution:
|
|
18
|
+
“Based on the Motion-TimeSpace Framework by Martin Ollett.”
|
|
19
|
+
|
|
20
|
+
2. Commercial Use Prohibited Without License
|
|
21
|
+
--------------------------------------------
|
|
22
|
+
Commercial use of this Software is strictly prohibited without a signed,
|
|
23
|
+
written commercial license agreement from the copyright holder.
|
|
24
|
+
|
|
25
|
+
“Commercial use” includes, but is not limited to:
|
|
26
|
+
|
|
27
|
+
• Use in any product or service that generates revenue
|
|
28
|
+
• Use by companies, startups, or corporate entities
|
|
29
|
+
• Use in closed-source or proprietary systems intended for profit
|
|
30
|
+
• Use in consulting, paid research, or internal commercial R&D
|
|
31
|
+
|
|
32
|
+
To obtain a commercial license, contact the author.
|
|
33
|
+
|
|
34
|
+
3. Derivative Works
|
|
35
|
+
-------------------
|
|
36
|
+
Modification and derivative works are allowed under the following terms:
|
|
37
|
+
|
|
38
|
+
• Any publicly released derivative must remain open-source
|
|
39
|
+
under this same Motion-TimeSpace Non-Commercial License.
|
|
40
|
+
• Attribution to the original Software and author must be retained.
|
|
41
|
+
• Derivatives must not imply endorsement by the original author.
|
|
42
|
+
|
|
43
|
+
Use of the names “Rat-Trap,” “GMW,” or “Motion-TimeSpace” in commercial
|
|
44
|
+
branding, promotion, or paid products is prohibited without written permission.
|
|
45
|
+
|
|
46
|
+
4. No Warranty
|
|
47
|
+
--------------
|
|
48
|
+
The Software is provided “as is,” without warranty of any kind, express or
|
|
49
|
+
implied. The author is not liable for any damages arising from the use of
|
|
50
|
+
this Software.
|
|
51
|
+
|
|
52
|
+
5. Limitation of Liability
|
|
53
|
+
--------------------------
|
|
54
|
+
In no event shall the author be liable for any claim, loss, or damages,
|
|
55
|
+
including incidental or consequential damages, arising from the use of the
|
|
56
|
+
Software, whether in contract, tort, or otherwise.
|
|
57
|
+
|
|
58
|
+
6. Governing Law
|
|
59
|
+
----------------
|
|
60
|
+
This license shall be governed by the laws of the United Kingdom unless
|
|
61
|
+
otherwise agreed in writing.
|
|
@@ -0,0 +1,602 @@
|
|
|
1
|
+
Metadata-Version: 2.4
|
|
2
|
+
Name: manifold-guard
|
|
3
|
+
Version: 0.1.0
|
|
4
|
+
Summary: ManifoldGuard reference-bounded output regulator utilities
|
|
5
|
+
Author: ManifoldGuard Contributors
|
|
6
|
+
License: Motion-TimeSpace Non-Commercial License (MTNSL-1.0)
|
|
7
|
+
Copyright (c) 2025 Martin Ollett
|
|
8
|
+
|
|
9
|
+
1. Permission and Non-Commercial Use
|
|
10
|
+
------------------------------------
|
|
11
|
+
This software, including all source code, data files, documentation, and
|
|
12
|
+
any associated materials (collectively, the “Software”), is freely available
|
|
13
|
+
for:
|
|
14
|
+
|
|
15
|
+
• Non-commercial academic research
|
|
16
|
+
• Personal, hobby, or educational projects
|
|
17
|
+
• Open scientific collaboration
|
|
18
|
+
|
|
19
|
+
Users are permitted to use, copy, modify, and distribute the Software for
|
|
20
|
+
non-commercial purposes, provided proper attribution is included.
|
|
21
|
+
|
|
22
|
+
Required attribution:
|
|
23
|
+
“Based on the Motion-TimeSpace Framework by Martin Ollett.”
|
|
24
|
+
|
|
25
|
+
2. Commercial Use Prohibited Without License
|
|
26
|
+
--------------------------------------------
|
|
27
|
+
Commercial use of this Software is strictly prohibited without a signed,
|
|
28
|
+
written commercial license agreement from the copyright holder.
|
|
29
|
+
|
|
30
|
+
“Commercial use” includes, but is not limited to:
|
|
31
|
+
|
|
32
|
+
• Use in any product or service that generates revenue
|
|
33
|
+
• Use by companies, startups, or corporate entities
|
|
34
|
+
• Use in closed-source or proprietary systems intended for profit
|
|
35
|
+
• Use in consulting, paid research, or internal commercial R&D
|
|
36
|
+
|
|
37
|
+
To obtain a commercial license, contact the author.
|
|
38
|
+
|
|
39
|
+
3. Derivative Works
|
|
40
|
+
-------------------
|
|
41
|
+
Modification and derivative works are allowed under the following terms:
|
|
42
|
+
|
|
43
|
+
• Any publicly released derivative must remain open-source
|
|
44
|
+
under this same Motion-TimeSpace Non-Commercial License.
|
|
45
|
+
• Attribution to the original Software and author must be retained.
|
|
46
|
+
• Derivatives must not imply endorsement by the original author.
|
|
47
|
+
|
|
48
|
+
Use of the names “Rat-Trap,” “GMW,” or “Motion-TimeSpace” in commercial
|
|
49
|
+
branding, promotion, or paid products is prohibited without written permission.
|
|
50
|
+
|
|
51
|
+
4. No Warranty
|
|
52
|
+
--------------
|
|
53
|
+
The Software is provided “as is,” without warranty of any kind, express or
|
|
54
|
+
implied. The author is not liable for any damages arising from the use of
|
|
55
|
+
this Software.
|
|
56
|
+
|
|
57
|
+
5. Limitation of Liability
|
|
58
|
+
--------------------------
|
|
59
|
+
In no event shall the author be liable for any claim, loss, or damages,
|
|
60
|
+
including incidental or consequential damages, arising from the use of the
|
|
61
|
+
Software, whether in contract, tort, or otherwise.
|
|
62
|
+
|
|
63
|
+
6. Governing Law
|
|
64
|
+
----------------
|
|
65
|
+
This license shall be governed by the laws of the United Kingdom unless
|
|
66
|
+
otherwise agreed in writing.
|
|
67
|
+
|
|
68
|
+
Project-URL: Homepage, https://github.com/Martin123132/ManifoldGuard
|
|
69
|
+
Project-URL: Repository, https://github.com/Martin123132/ManifoldGuard
|
|
70
|
+
Project-URL: Documentation, https://github.com/Martin123132/ManifoldGuard/blob/main/README.md
|
|
71
|
+
Project-URL: Source, https://github.com/Martin123132/ManifoldGuard
|
|
72
|
+
Project-URL: Issues, https://github.com/Martin123132/ManifoldGuard/issues
|
|
73
|
+
Project-URL: Changelog, https://github.com/Martin123132/ManifoldGuard/blob/main/CHANGELOG.md
|
|
74
|
+
Requires-Python: >=3.9
|
|
75
|
+
Description-Content-Type: text/markdown
|
|
76
|
+
License-File: LICENSE
|
|
77
|
+
Requires-Dist: numpy
|
|
78
|
+
Requires-Dist: scipy
|
|
79
|
+
Provides-Extra: embeddings
|
|
80
|
+
Requires-Dist: sentence-transformers<6,>=2.6.0; extra == "embeddings"
|
|
81
|
+
Dynamic: license-file
|
|
82
|
+
|
|
83
|
+
# ManifoldGuard
|
|
84
|
+
|
|
85
|
+

|
|
86
|
+
|
|
87
|
+
> Rebrand note: ManifoldGuard is the public package/product name for the project
|
|
88
|
+
> previously published as `mbt-ai-tools`. The `mbt_ai_tools` import path and
|
|
89
|
+
> `mbt-check` / `mbt-eval` CLI aliases remain available for compatibility.
|
|
90
|
+
|
|
91
|
+
[Tests workflow](https://github.com/Martin123132/ManifoldGuard/actions/workflows/tests.yml)
|
|
92
|
+
[Docs quality workflow](https://github.com/Martin123132/ManifoldGuard/actions/workflows/docs-quality.yml)
|
|
93
|
+
[Package publish workflow](https://github.com/Martin123132/ManifoldGuard/actions/workflows/package-publish.yml)
|
|
94
|
+
|
|
95
|
+
ManifoldGuard tests whether AI candidate outputs remain inside a supplied
|
|
96
|
+
semantic and relational reference manifold.
|
|
97
|
+
|
|
98
|
+
It runs at inference time:
|
|
99
|
+
|
|
100
|
+
- no training
|
|
101
|
+
- no fine-tuning
|
|
102
|
+
- no model-weight inspection
|
|
103
|
+
- no hidden classifier
|
|
104
|
+
|
|
105
|
+
The regulator checks candidate outputs and either emits the safest supported
|
|
106
|
+
candidate or blocks when every candidate is unsafe.
|
|
107
|
+
|
|
108
|
+
## Core Claim
|
|
109
|
+
|
|
110
|
+
ManifoldGuard treats hallucination as semantic or relational drift from supplied
|
|
111
|
+
reference structure. It is not a fact oracle and does not claim direct access
|
|
112
|
+
to external truth.
|
|
113
|
+
|
|
114
|
+
```text
|
|
115
|
+
Universe does facts.
|
|
116
|
+
Humans describe the universe.
|
|
117
|
+
AI describes human descriptions.
|
|
118
|
+
ManifoldGuard regulates AI descriptions against supplied human reference structure.
|
|
119
|
+
```
|
|
120
|
+
|
|
121
|
+
Supported public claim:
|
|
122
|
+
|
|
123
|
+
```text
|
|
124
|
+
ManifoldGuard v11 blocked hallucinated AI outputs against supplied reference manifolds
|
|
125
|
+
and relation constraints. In the frozen EXP20 ledger, it achieved confusion
|
|
126
|
+
[[97, 0], [0, 160]] over 257 labelled candidates across 53 cases.
|
|
127
|
+
```
|
|
128
|
+
|
|
129
|
+
## Claims and Scope
|
|
130
|
+
|
|
131
|
+
See [`CLAIMS.md`](CLAIMS.md) for the claim register. The short version:
|
|
132
|
+
ManifoldGuard regulates outputs against supplied references and the public
|
|
133
|
+
offline corpus; it is not claimed to be a universal fact checker.
|
|
134
|
+
|
|
135
|
+
Continuous integration for the offline corpus lives in [`.github/workflows/tests.yml`](.github/workflows/tests.yml).
|
|
136
|
+
|
|
137
|
+
## Current Locked Result
|
|
138
|
+
|
|
139
|
+
Frozen ledger:
|
|
140
|
+
|
|
141
|
+
```text
|
|
142
|
+
MBT5_EXP20_combined_guarded_master_ledger_v11
|
|
143
|
+
|
|
144
|
+
Candidates: 257
|
|
145
|
+
Cases: 53
|
|
146
|
+
|
|
147
|
+
Candidate-level confusion:
|
|
148
|
+
[[TN, FP], [FN, TP]]
|
|
149
|
+
[[97, 0], [0, 160]]
|
|
150
|
+
|
|
151
|
+
Accuracy: 1.0000
|
|
152
|
+
Precision: 1.0000
|
|
153
|
+
Recall: 1.0000
|
|
154
|
+
F1: 1.0000
|
|
155
|
+
|
|
156
|
+
Case-level:
|
|
157
|
+
Correct: 53 / 53
|
|
158
|
+
Emitted: 28
|
|
159
|
+
Blocked: 25
|
|
160
|
+
```
|
|
161
|
+
|
|
162
|
+
The current public claim is limited to the supplied test suites and reference
|
|
163
|
+
manifolds included in the project.
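
The summary figures in the frozen ledger follow directly from the confusion matrix. As a minimal sketch of that arithmetic (the function name `confusion_metrics` is illustrative and is not part of the package API):

```python
def confusion_metrics(tn, fp, fn, tp):
    """Derive summary metrics from a [[TN, FP], [FN, TP]] confusion matrix."""
    total = tn + fp + fn + tp
    accuracy = (tn + tp) / total
    # Guard each ratio so an empty class does not divide by zero.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {
        "total": total,
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Values taken from the frozen ledger: [[97, 0], [0, 160]].
metrics = confusion_metrics(tn=97, fp=0, fn=0, tp=160)
print(metrics)
```

With zero false positives and zero false negatives every ratio is 1.0, and the four cells sum to the 257 labelled candidates.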
|
|
164
|
+
|
|
165
|
+
## What ManifoldGuard Checks
|
|
166
|
+
|
|
167
|
+
ManifoldGuard combines:
|
|
168
|
+
|
|
169
|
+
- semantic geometry
|
|
170
|
+
- internal consistency scoring
|
|
171
|
+
- token-level shock analysis
|
|
172
|
+
- literal drift guards
|
|
173
|
+
- entity, number, and unit protection
|
|
174
|
+
- overclaim detection
|
|
175
|
+
- copular relation checks
|
|
176
|
+
- non-copular relation checks
|
|
177
|
+
- relation polarity checks
|
|
178
|
+
- unsupported negation clamps
|
|
179
|
+
- abstention when every candidate is unsafe
|
|
180
|
+
|
|
181
|
+
Examples of blocked drift:
|
|
182
|
+
|
|
183
|
+
```text
|
|
184
|
+
The capital of France is London.
|
|
185
|
+
The Sun is a planet.
|
|
186
|
+
Earth is flat.
|
|
187
|
+
Water boils at 90 degrees Celsius at sea level.
|
|
188
|
+
Gravity is fully solved by modern physics.
|
|
189
|
+
Scientific descriptions do not use measurements.
|
|
190
|
+
DNA contains the nucleus.
|
|
191
|
+
The Sun orbits Earth.
|
|
192
|
+
```
|
|
193
|
+
|
|
194
|
+
## Core Mechanisms
|
|
195
|
+
|
|
196
|
+
### Semantic Shock
|
|
197
|
+
|
|
198
|
+
Candidate outputs are embedded into semantic space. ManifoldGuard measures
|
|
199
|
+
distance from the reference manifold:
|
|
200
|
+
|
|
201
|
+
```text
|
|
202
|
+
shock = Gamma * ||candidate_embedding - reference_center||^2
|
|
203
|
+
```
|
|
204
|
+
|
|
205
|
+
Higher shock means stronger semantic drift.
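
To make the formula concrete, here is a small pure-Python sketch. It rests on assumptions: the vectors are toy 2-D stand-ins for real embeddings, `gamma` defaults to 1.0, and the arithmetic mean stands in for the reference center (the project layout lists a geometric median in `geometry.py`, which this sketch does not reproduce). The names `mean_center` and `shock` are illustrative, not the package API.

```python
from math import fsum


def mean_center(vectors):
    """Arithmetic mean of equal-length vectors (assumed stand-in center)."""
    dim = len(vectors[0])
    return [fsum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def shock(candidate, center, gamma=1.0):
    """gamma * squared Euclidean distance from the reference center."""
    return gamma * fsum((c - r) ** 2 for c, r in zip(candidate, center))


# Toy reference embeddings; real embeddings have hundreds of dimensions.
references = [[1.0, 0.0], [0.0, 1.0]]
center = mean_center(references)          # [0.5, 0.5]

near = shock([0.5, 0.5], center)          # sits on the center
far = shock([3.5, 4.5], center)           # offset by (3, 4)
print(near, far)
```

The candidate at the center scores 0.0, while the one offset by (3, 4) scores 25.0, so the quadratic form penalises large drift much more heavily than small drift.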
|
|
206
|
+
|
|
207
|
+
### Literal Drift Guards
|
|
208
|
+
|
|
209
|
+
Geometry alone can miss small but important substitutions. ManifoldGuard
|
|
210
|
+
protects numbers, units, named entities, and key content tokens.
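
One way to picture a literal guard for numbers is a set difference between the numbers in a candidate and those in the references. The sketch below is only an illustration of that idea: `unsupported_numbers` is a hypothetical helper, the regular expression is a simplification, and the reference sentence is assumed for the example. It is not the package's guard.

```python
import re

# Simplified pattern: integers and plain decimals only.
NUMBER = re.compile(r"\d+(?:\.\d+)?")


def unsupported_numbers(candidate, references):
    """Return numbers in the candidate that appear in no reference."""
    supported = set()
    for reference in references:
        supported.update(NUMBER.findall(reference))
    return sorted(set(NUMBER.findall(candidate)) - supported)


references = ["Water boils at 100 degrees Celsius at sea level."]
drift = unsupported_numbers(
    "Water boils at 90 degrees Celsius at sea level.", references
)
clean = unsupported_numbers(
    "At sea level, water boils at 100 degrees Celsius.", references
)
print(drift, clean)
```

The two candidates are nearly identical in wording, which is why a purely geometric score can miss the substitution that the literal comparison catches.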
|
|
211
|
+
|
|
212
|
+
### Relation Clamps
|
|
213
|
+
|
|
214
|
+
ManifoldGuard checks relation structure, not just semantic similarity.
|
|
215
|
+
|
|
216
|
+
Copular examples:
|
|
217
|
+
|
|
218
|
+
```text
|
|
219
|
+
The Sun is a planet.
|
|
220
|
+
A dog is a bird.
|
|
221
|
+
Rome is the capital city of France.
|
|
222
|
+
```
|
|
223
|
+
|
|
224
|
+
Non-copular examples:
|
|
225
|
+
|
|
226
|
+
```text
|
|
227
|
+
Earth orbits the Moon.
|
|
228
|
+
The Sun orbits Earth.
|
|
229
|
+
DNA contains the nucleus.
|
|
230
|
+
Heat produces friction.
|
|
231
|
+
Photosynthesis converts oxygen into carbon dioxide and water.
|
|
232
|
+
```
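
A role swap such as "The Sun orbits Earth." keeps the same words as its reference and only reverses subject and object. The sketch below shows one simplified way to detect that for a single, known verb. The helpers `triple` and `is_role_swap` are hypothetical, the string splitting is deliberately naive, and none of this is taken from the package's regulator.

```python
import re

ARTICLE = re.compile(r"^(the|a|an)\s+", re.IGNORECASE)


def _normalise(phrase):
    """Strip a leading article and lowercase the phrase."""
    return ARTICLE.sub("", phrase.strip()).lower()


def triple(sentence, verb):
    """Split 'X <verb> Y.' into (subject, verb, object), or None."""
    left, sep, right = sentence.rstrip(".").partition(f" {verb} ")
    if not sep:
        return None
    return (_normalise(left), verb, _normalise(right))


def is_role_swap(candidate, reference, verb):
    """True when the candidate reverses the reference's subject and object."""
    c, r = triple(candidate, verb), triple(reference, verb)
    return c is not None and r is not None and c[0] == r[2] and c[2] == r[0]


reference = "Earth orbits the Sun."
swapped = is_role_swap("The Sun orbits Earth.", reference, "orbits")
faithful = is_role_swap("Earth orbits the Sun.", reference, "orbits")
print(swapped, faithful)
```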
|
|
233
|
+
|
|
234
|
+
### Negation Clamp
|
|
235
|
+
|
|
236
|
+
ManifoldGuard blocks unsupported negations of positive reference support.
|
|
237
|
+
|
|
238
|
+
```text
|
|
239
|
+
Water is not liquid at room temperature.
|
|
240
|
+
Sound does not need a material medium to travel.
|
|
241
|
+
Scientific descriptions do not use measurements or predictions.
|
|
242
|
+
General relativity proves gravity has no connection to mass or energy.
|
|
243
|
+
```
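
As a rough illustration of the idea, the sketch below flags a candidate that contains a negation word when none of the references do. This is a crude heuristic under stated assumptions: the negator list is incomplete, the helper names are hypothetical, and a real check would need to tie the negation to the specific relation being negated.

```python
import re

# Assumed, non-exhaustive list of negation words.
NEGATORS = {"not", "no", "never", "cannot"}


def has_negation(sentence):
    """True if the sentence contains a negation word or an n't contraction."""
    tokens = re.findall(r"[a-z']+", sentence.lower())
    return any(t in NEGATORS or t.endswith("n't") for t in tokens)


def unsupported_negation(candidate, references):
    """Flag a negated candidate when no reference is itself negated."""
    return has_negation(candidate) and not any(
        has_negation(r) for r in references
    )


references = ["Water is liquid at room temperature."]
flagged = unsupported_negation(
    "Water is not liquid at room temperature.", references
)
allowed = unsupported_negation(
    "Water is liquid at room temperature.", references
)
print(flagged, allowed)
```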
|
|
244
|
+
|
|
245
|
+
### Abstention
|
|
246
|
+
|
|
247
|
+
When every candidate is unsafe, ManifoldGuard blocks instead of emitting the
|
|
248
|
+
least-bad candidate.
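
The difference between abstaining and picking the least-bad candidate can be sketched as follows. The `Scored` record, the `select` function, and the convention that a lower score is better are all assumptions made for illustration; the package's own result object and scoring may differ.

```python
from typing import NamedTuple, Optional, Sequence, Tuple


class Scored(NamedTuple):
    text: str
    score: float  # assumed: lower means closer to the references
    safe: bool    # assumed: False if any guard or clamp fired


def select(candidates: Sequence[Scored]) -> Tuple[str, Optional[str]]:
    """Emit the best safe candidate, or block when none is safe."""
    safe = [c for c in candidates if c.safe]
    if not safe:
        # Abstain: never fall back to the lowest-scoring unsafe candidate.
        return ("block", None)
    best = min(safe, key=lambda c: c.score)
    return ("emit", best.text)


mixed = select([
    Scored("The capital of France is London.", 0.10, False),
    Scored("The capital of France is Paris.", 0.25, True),
])
all_bad = select([
    Scored("The capital of France is London.", 0.10, False),
    Scored("The capital of France is Rome.", 0.40, False),
])
print(mixed, all_bad)
```

In the mixed case the unsafe candidate has the lower score but is still skipped, and in the all-unsafe case nothing is emitted at all.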
|
|
249
|
+
|
|
250
|
+
## Installation
|
|
251
|
+
|
|
252
|
+
Install modes:
|
|
253
|
+
|
|
254
|
+
- Offline baseline (default): `python -m pip install -e . --no-deps`
|
|
255
|
+
- Optional semantic mode: `python -m pip install -e ".[embeddings]"` (quoted so shells such as zsh do not expand the brackets)
|
|
256
|
+
|
|
257
|
+
If you need a plain editable install for local experimentation, use
|
|
258
|
+
`pip install -e .` or `python -m pip install -e .`; both remain supported for
|
|
259
|
+
compatibility.
|
|
260
|
+
|
|
261
|
+
The optional extra currently installs `sentence-transformers>=2.6.0,<6` for
|
|
262
|
+
model-backed operation.
|
|
263
|
+
|
|
264
|
+
<!-- markdownlint-disable-next-line MD013 -->
|
|
265
|
+
If `sentence-transformers` is unavailable, use offline literal/relation-only regulation with `--no-embeddings` / `use_embeddings=False`.
|
|
266
|
+
|
|
267
|
+
```python
|
|
268
|
+
from mbt_ai_tools import evaluate_candidate
|
|
269
|
+
|
|
270
|
+
evaluate_candidate(
|
|
271
|
+
"Paris is the capital city of France.",
|
|
272
|
+
["The capital of France is Paris."],
|
|
273
|
+
use_embeddings=False,
|
|
274
|
+
)
|
|
275
|
+
```
|
|
276
|
+
|
|
277
|
+
When embedding-backed operation is requested without `sentence-transformers`,
|
|
278
|
+
you get an explicit error telling you to install the dependency or use
|
|
279
|
+
offline mode.
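
If you would rather choose the mode up front than handle that error, one option is to probe for the optional dependency first. This is a sketch, not something the package provides: `embeddings_available` is a hypothetical helper built on the standard library's `importlib.util.find_spec`.

```python
from importlib.util import find_spec


def embeddings_available() -> bool:
    """True if the optional sentence-transformers package can be imported."""
    return find_spec("sentence_transformers") is not None


# Decide once, then pass the flag to the regulator calls, for example
# evaluate_candidate(candidate, references, use_embeddings=use_embeddings).
use_embeddings = embeddings_available()
print("embedding-backed mode" if use_embeddings else "offline mode")
```

The check only looks for the installed package; it does not load any model, so it is cheap to run at start-up.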
|
|
280
|
+
|
|
281
|
+
## Python Usage
|
|
282
|
+
|
|
283
|
+
```python
|
|
284
|
+
from mbt_ai_tools import regulate_candidates
|
|
285
|
+
|
|
286
|
+
references = [
|
|
287
|
+
"The capital of France is Paris.",
|
|
288
|
+
"Paris is the capital city of France.",
|
|
289
|
+
]
|
|
290
|
+
candidates = [
|
|
291
|
+
"The capital of France is London.",
|
|
292
|
+
"The capital of France is Paris.",
|
|
293
|
+
]
|
|
294
|
+
|
|
295
|
+
result = regulate_candidates(candidates, references, use_embeddings=False)
|
|
296
|
+
print(result.action) # emit
|
|
297
|
+
print(result.emitted_text) # The capital of France is Paris.
|
|
298
|
+
```
|
|
299
|
+
|
|
300
|
+
## CLI Usage
|
|
301
|
+
|
|
302
|
+
Check the installed CLI version:
|
|
303
|
+
|
|
304
|
+
```bash
|
|
305
|
+
manifold-check --version
|
|
306
|
+
```
|
|
307
|
+
|
|
308
|
+
```bash
|
|
309
|
+
manifold-check \
|
|
310
|
+
--reference "The capital of France is Paris." \
|
|
311
|
+
--candidate "The capital of France is London." \
|
|
312
|
+
--candidate "The capital of France is Paris." \
|
|
313
|
+
--no-embeddings
|
|
314
|
+
```
|
|
315
|
+
|
|
316
|
+
Expected output:
|
|
317
|
+
|
|
318
|
+
```text
|
|
319
|
+
EMIT | The capital of France is Paris. | score=0.0000
|
|
320
|
+
[0] blocked | ...
|
|
321
|
+
[1] safe | ...
|
|
322
|
+
```
|
|
323
|
+
|
|
324
|
+
JSON report output:
|
|
325
|
+
|
|
326
|
+
```bash
|
|
327
|
+
manifold-check \
|
|
328
|
+
--reference "The capital of France is Paris." \
|
|
329
|
+
--candidate "The capital of France is London." \
|
|
330
|
+
--candidate "The capital of France is Paris." \
|
|
331
|
+
--no-embeddings \
|
|
332
|
+
--format json
|
|
333
|
+
```
|
|
334
|
+
|
|
335
|
+
See `examples/cli_json_report.md` for a complete offline JSON report demo.
|
|
336
|
+
|
|
337
|
+
Optional token-level shock details can be included in regulation reports when
|
|
338
|
+
embedding dependencies are installed:
|
|
339
|
+
|
|
340
|
+
```bash
|
|
341
|
+
manifold-check \
|
|
342
|
+
--reference "The capital of France is Paris." \
|
|
343
|
+
--candidate "The capital of France is London." \
|
|
344
|
+
--format json \
|
|
345
|
+
--token-shock \
|
|
346
|
+
--token-shock-top-k 5
|
|
347
|
+
```
|
|
348
|
+
|
|
349
|
+
Token-level shock is embedding-backed, so keep `--no-embeddings` off when using
|
|
350
|
+
`--token-shock`.
|
|
351
|
+
|
|
352
|
+
Batch JSONL evaluation:
|
|
353
|
+
|
|
354
|
+
```bash
|
|
355
|
+
manifold-check \
|
|
356
|
+
--input-jsonl examples/batch_input.jsonl \
|
|
357
|
+
--no-embeddings \
|
|
358
|
+
--output batch-report.jsonl
|
|
359
|
+
```
|
|
360
|
+
|
|
361
|
+
CI guard mode:
|
|
362
|
+
|
|
363
|
+
```bash
|
|
364
|
+
manifold-check \
|
|
365
|
+
--input-jsonl examples/batch_input.jsonl \
|
|
366
|
+
--no-embeddings \
|
|
367
|
+
--summary \
|
|
368
|
+
--fail-on-block
|
|
369
|
+
```
|
|
370
|
+
|
|
371
|
+
`--fail-on-block` exits with status `2` when a single regulation run blocks or
|
|
372
|
+
any batch row blocks. `--summary` appends a final batch summary JSON object.
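
When the guard runs from a Python-based pipeline step, the exit status can be mapped to an outcome as sketched below. Only status `2` is documented above; treating `0` as "nothing blocked" and every other status as an error are assumptions, and `describe_status` and `run_guard` are hypothetical helper names.

```python
import subprocess

BLOCK_STATUS = 2  # documented for --fail-on-block


def describe_status(returncode: int) -> str:
    """Map a manifold-check exit status to a coarse outcome."""
    if returncode == 0:
        return "passed"    # assumption: 0 means nothing was blocked
    if returncode == BLOCK_STATUS:
        return "blocked"
    return "error"         # assumption: any other status is a failure


def run_guard(input_jsonl: str) -> str:
    """Run the CI guard command shown above and classify the result."""
    completed = subprocess.run(
        [
            "manifold-check",
            "--input-jsonl", input_jsonl,
            "--no-embeddings",
            "--summary",
            "--fail-on-block",
        ],
        check=False,  # inspect the status ourselves instead of raising
    )
    return describe_status(completed.returncode)


print(describe_status(2))
```

`run_guard` is defined but not called here, because it needs the `manifold-check` command on `PATH`; `subprocess.run` raises `FileNotFoundError` when the command is missing.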
|
|
373
|
+
|
|
374
|
+
Markdown audit report:
|
|
375
|
+
|
|
376
|
+
```bash
|
|
377
|
+
manifold-check \
|
|
378
|
+
--input-jsonl examples/batch_input.jsonl \
|
|
379
|
+
--no-embeddings \
|
|
380
|
+
--format markdown \
|
|
381
|
+
--output audit.md
|
|
382
|
+
```
|
|
383
|
+
|
|
384
|
+
See `examples/markdown_audit_report.md` for a complete Markdown audit demo.
|
|
385
|
+
|
|
386
|
+
CSV audit export:
|
|
387
|
+
|
|
388
|
+
```bash
|
|
389
|
+
manifold-check \
|
|
390
|
+
--input-jsonl examples/batch_input.jsonl \
|
|
391
|
+
--no-embeddings \
|
|
392
|
+
--format csv \
|
|
393
|
+
--output audit.csv
|
|
394
|
+
```
|
|
395
|
+
|
|
396
|
+
See `examples/csv_audit_report.csv` for a spreadsheet-friendly batch audit demo.
|
|
397
|
+
|
|
398
|
+
The JSON/Markdown/CSV report schema is documented in `docs/report_schema.md`.
|
|
399
|
+
The release support contract is captured in `docs/product_readiness_manifest.json`.
|
|
400
|
+
Release quality expectations are captured in `docs/quality_gates.md` and `docs/release_checklist.md`.
|
|
401
|
+
The maintainer release flow is documented in `RELEASE_PROCESS.md`.
|
|
402
|
+
|
|
403
|
+
## Local Validation
|
|
404
|
+
|
|
405
|
+
For schema-driven sanity checks before publishing docs or changing report contracts:
|
|
406
|
+
|
|
407
|
+
```bash
|
|
408
|
+
python -m pip install jsonschema
|
|
409
|
+
python scripts/validate_reports.py \
|
|
410
|
+
--schema docs/report_schema.json \
|
|
411
|
+
examples/single_report_example.json \
|
|
412
|
+
examples/batch_report_example.jsonl
|
|
413
|
+
```
|
|
414
|
+
|
|
415
|
+
Run the full local docs-quality check (includes manifest and example consistency):
|
|
416
|
+
|
|
417
|
+
```bash
|
|
418
|
+
python scripts/docs_quality.py
|
|
419
|
+
```
|
|
420
|
+
|
|
421
|
+
Run the full preflight (docs-quality + pytest) before review:
|
|
422
|
+
|
|
423
|
+
```bash
|
|
424
|
+
python scripts/preflight.py
|
|
425
|
+
```
|
|
426
|
+
|
|
427
|
+
Run only docs checks:
|
|
428
|
+
|
|
429
|
+
```bash
|
|
430
|
+
python scripts/preflight.py --docs-only
|
|
431
|
+
```
|
|
432
|
+
|
|
433
|
+
Run only tests:
|
|
434
|
+
|
|
435
|
+
```bash
|
|
436
|
+
python scripts/preflight.py --tests-only
|
|
437
|
+
```
|
|
438
|
+
|
|
439
|
+
Run the frozen offline regression corpus evaluation:
|
|
440
|
+
|
|
441
|
+
```bash
|
|
442
|
+
manifold-eval
|
|
443
|
+
manifold-eval --output regulator-evaluation.json
|
|
444
|
+
python scripts/evaluate_regulator.py
|
|
445
|
+
python scripts/evaluate_regulator.py --output regulator-evaluation.json
|
|
446
|
+
python scripts/build_eval_report.py --input regulator-evaluation.json --output docs/evaluation_report.md
|
|
447
|
+
```
|
|
448
|
+
|
|
449
|
+
`manifold-eval` uses the packaged regression corpus by default and accepts
|
|
450
|
+
`--corpus path/to/corpus.jsonl` for custom offline checks.
|
|
451
|
+
The generated benchmark report lives at `docs/evaluation_report.md`.
|
|
452
|
+
|
|
453
|
+
## Package Build and Publish
|
|
454
|
+
|
|
455
|
+
Package distribution artifacts are built by
|
|
456
|
+
`.github/workflows/package-publish.yml` on version tags and manual runs.
|
|
457
|
+
Publishing is manual-only: run the workflow with `publish=true` and choose
|
|
458
|
+
`testpypi` or `pypi` after configuring PyPI Trusted Publishing environments
|
|
459
|
+
named `testpypi` and `pypi`.
|
|
460
|
+
|
|
461
|
+
The exact Trusted Publishing setup values are documented in `docs/package_publishing.md`.
|
|
462
|
+
Package-index install commands and smoke checks are documented in `docs/package_installation.md`.
|
|
463
|
+
|
|
464
|
+
Recommended release order:
|
|
465
|
+
|
|
466
|
+
```bash
|
|
467
|
+
python -m build
|
|
468
|
+
python -m twine check dist/*
|
|
469
|
+
```
|
|
470
|
+
|
|
471
|
+
Use TestPyPI first for release-candidate publishing. Do not publish to PyPI
|
|
472
|
+
until tag CI, release evidence, and the regulator evaluation report are green.
|
|
473
|
+
|
|
474
|
+
Run the canonical release check sequence:
|
|
475
|
+
|
|
476
|
+
```bash
|
|
477
|
+
python scripts/release_check.py --output release-evidence.json
|
|
478
|
+
```
|
|
479
|
+
|
|
480
|
+
## Release Evidence Snapshot
|
|
481
|
+
|
|
482
|
+
Use this command block before release candidates or public claims:
|
|
483
|
+
|
|
484
|
+
```bash
|
|
485
|
+
python -m pytest -q
|
|
486
|
+
python scripts/docs_quality.py
|
|
487
|
+
python scripts/evaluate_regulator.py
|
|
488
|
+
python scripts/preflight.py
|
|
489
|
+
python scripts/preflight.py --docs-only
|
|
490
|
+
python scripts/release_check.py --output release-evidence.json
|
|
491
|
+
python -m pytest -q tests/test_docs_quality.py
|
|
492
|
+
```
|
|
493
|
+
|
|
494
|
+
Expected evidence in a healthy release path:
|
|
495
|
+
|
|
496
|
+
- A full `pytest` success summary (all tests passed, no failures).
|
|
497
|
+
- docs-quality validator exits successfully.
|
|
498
|
+
- regulator evaluation reports all frozen corpus cases passing, with taxonomy
|
|
499
|
+
metrics in JSON output.
|
|
500
|
+
- preflight prints `Preflight completed successfully.` for both full and
|
|
501
|
+
docs-only modes.
|
|
502
|
+
- no schema or URL drift errors.
|
|
503
|
+
|
|
504
|
+
## Regression Corpus
|
|
505
|
+
|
|
506
|
+
The lightweight public regression corpus lives in
|
|
507
|
+
`examples/regression_corpus.jsonl`. It currently contains 220 offline cases
|
|
508
|
+
covering entity swaps, multi-word capital handling, all-bad abstention, numeric
|
|
509
|
+
drift, unit drift, role swaps, shared-subject relation repair, unsupported
|
|
510
|
+
negation, historical-date drift, supported paraphrase, and overclaim blocking.
|
|
511
|
+
|
|
512
|
+
Regenerate the corpus:
|
|
513
|
+
|
|
514
|
+
```bash
|
|
515
|
+
uv run python examples/build_regression_corpus.py
|
|
516
|
+
```
|
|
517
|
+
|
|
518
|
+
Run the tests:
|
|
519
|
+
|
|
520
|
+
```bash
|
|
521
|
+
uv run --with pytest python -m pytest -q
|
|
522
|
+
```
|
|
523
|
+
|
|
524
|
+
## Experiment Lineage
|
|
525
|
+
|
|
526
|
+
The full EXP01-EXP20 record is in `MBT5_EXP01_EXP20_TECHNICAL_LEDGER.md`.
|
|
527
|
+
The expanded CSV experiment exports live in `data/csv_exports/`.
|
|
528
|
+
|
|
529
|
+
Key frozen output artifacts:
|
|
530
|
+
|
|
531
|
+
```text
|
|
532
|
+
data/csv_exports/mbt5_exp20_master_candidate_ledger.csv
|
|
533
|
+
data/csv_exports/mbt5_exp20_master_case_ledger.csv
|
|
534
|
+
data/csv_exports/mbt5_exp20_summary_metrics.csv
|
|
535
|
+
data/csv_exports/mbt5_exp20_case_summary.csv
|
|
536
|
+
data/csv_exports/mbt5_exp20_clamp_counts.csv
|
|
537
|
+
data/csv_exports/mbt5_exp20_failure_table.csv
|
|
538
|
+
data/csv_exports/mbt5_exp20_patch_lineage.csv
|
|
539
|
+
```
|
|
540
|
+
|
|
541
|
+
A `docs-quality` workflow also validates manifest JSON and referenced
|
|
542
|
+
docs/examples so support artifacts stay consistent.
|
|
543
|
+
|
|
544
|
+
## Project Layout
|
|
545
|
+
|
|
546
|
+
```text
|
|
547
|
+
mbt_ai_tools/
|
|
548
|
+
mbt/
|
|
549
|
+
embeddings.py SentenceTransformer loader
|
|
550
|
+
geometry.py geometric median, shock, distance
|
|
551
|
+
stability.py self-consistency / entropy scoring
|
|
552
|
+
tokens.py leave-one-out token shock
|
|
553
|
+
consensus.py multi-agent / council logic
|
|
554
|
+
regulator.py v11 candidate regulator
|
|
555
|
+
data/
|
|
556
|
+
regression_corpus.jsonl
|
|
557
|
+
cli.py manifold-check command
|
|
558
|
+
eval.py manifold-eval offline frozen corpus evaluator
|
|
559
|
+
.github/workflows/
|
|
560
|
+
tests.yml GitHub Actions offline regression test workflow
|
|
561
|
+
docs-quality.yml docs and manifest quality workflow
|
|
562
|
+
package-publish.yml package build and manual publish workflow
|
|
563
|
+
CHANGELOG.md release notes
|
|
564
|
+
CLAIMS.md scoped public claims register
|
|
565
|
+
RELEASE_PROCESS.md maintainer release flow
|
|
566
|
+
data/csv_exports/ expanded EXP01-EXP20 CSV exports
|
|
567
|
+
scripts/
|
|
568
|
+
docs_quality.py
|
|
569
|
+
build_eval_report.py
|
|
570
|
+
evaluate_regulator.py
|
|
571
|
+
release_evidence.py
|
|
572
|
+
release_readiness.py
|
|
573
|
+
release_check.py
|
|
574
|
+
preflight.py
|
|
575
|
+
validate_reports.py
|
|
576
|
+
docs/
|
|
577
|
+
product_readiness_manifest.json
|
|
578
|
+
report_schema.json
|
|
579
|
+
quality_gates.md
|
|
580
|
+
release_checklist.md
|
|
581
|
+
report_schema.md
|
|
582
|
+
examples/
|
|
583
|
+
batch_input.jsonl
|
|
584
|
+
build_regression_corpus.py
|
|
585
|
+
cli_json_report.md
|
|
586
|
+
csv_audit_report.csv
|
|
587
|
+
markdown_audit_report.md
|
|
588
|
+
batch_report_example.jsonl
|
|
589
|
+
single_report_example.json
|
|
590
|
+
regression_corpus.jsonl
|
|
591
|
+
tests/
|
|
592
|
+
test_regulator.py
|
|
593
|
+
test_regression_corpus.py
|
|
594
|
+
REPLICATION.md local/GitHub/Colab replication instructions
|
|
595
|
+
pyproject.toml
|
|
596
|
+
README.md
|
|
597
|
+
LICENSE
|
|
598
|
+
```
|
|
599
|
+
|
|
600
|
+
## License
|
|
601
|
+
|
|
602
|
+
See `LICENSE`.
|