invarlock 0.3.7__py3-none-any.whl → 0.3.9__py3-none-any.whl
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- invarlock/__init__.py +3 -3
- invarlock/adapters/auto.py +2 -10
- invarlock/adapters/hf_loading.py +7 -7
- invarlock/adapters/hf_mixin.py +28 -5
- invarlock/assurance/__init__.py +15 -23
- invarlock/calibration/spectral_null.py +1 -1
- invarlock/cli/adapter_auto.py +1 -5
- invarlock/cli/app.py +57 -27
- invarlock/cli/commands/__init__.py +2 -2
- invarlock/cli/commands/calibrate.py +48 -4
- invarlock/cli/commands/{certify.py → evaluate.py} +69 -46
- invarlock/cli/commands/explain_gates.py +94 -51
- invarlock/cli/commands/export_html.py +11 -9
- invarlock/cli/commands/report.py +121 -47
- invarlock/cli/commands/run.py +274 -66
- invarlock/cli/commands/verify.py +84 -89
- invarlock/cli/determinism.py +1 -1
- invarlock/cli/provenance.py +3 -3
- invarlock/core/bootstrap.py +1 -1
- invarlock/core/retry.py +14 -14
- invarlock/core/runner.py +1 -1
- invarlock/edits/noop.py +2 -2
- invarlock/edits/quant_rtn.py +2 -2
- invarlock/eval/__init__.py +1 -1
- invarlock/eval/bench.py +11 -7
- invarlock/eval/primary_metric.py +1 -1
- invarlock/guards/spectral.py +2 -2
- invarlock/guards_ref/spectral_ref.py +1 -1
- invarlock/model_profile.py +16 -35
- invarlock/observability/health.py +38 -20
- invarlock/plugins/hf_bnb_adapter.py +32 -21
- invarlock/reporting/__init__.py +18 -4
- invarlock/reporting/html.py +7 -7
- invarlock/reporting/normalizer.py +2 -2
- invarlock/reporting/policy_utils.py +1 -1
- invarlock/reporting/primary_metric_utils.py +11 -11
- invarlock/reporting/render.py +126 -120
- invarlock/reporting/report.py +43 -37
- invarlock/reporting/{certificate.py → report_builder.py} +103 -99
- invarlock/reporting/{certificate_schema.py → report_schema.py} +22 -22
- invarlock-0.3.9.dist-info/METADATA +303 -0
- {invarlock-0.3.7.dist-info → invarlock-0.3.9.dist-info}/RECORD +46 -46
- {invarlock-0.3.7.dist-info → invarlock-0.3.9.dist-info}/WHEEL +1 -1
- invarlock-0.3.7.dist-info/METADATA +0 -602
- {invarlock-0.3.7.dist-info → invarlock-0.3.9.dist-info}/entry_points.txt +0 -0
- {invarlock-0.3.7.dist-info → invarlock-0.3.9.dist-info}/licenses/LICENSE +0 -0
- {invarlock-0.3.7.dist-info → invarlock-0.3.9.dist-info}/top_level.txt +0 -0
--- invarlock-0.3.7.dist-info/METADATA
+++ /dev/null
@@ -1,602 +0,0 @@
-Metadata-Version: 2.4
-Name: invarlock
-Version: 0.3.7
-Summary: Edit‑agnostic robustness certificates for weight edits (InvarLock framework)
-Author-email: InvarLock Team <oss@invarlock.dev>
-Maintainer-email: InvarLock Maintainers <support@invarlock.dev>
-License-Expression: Apache-2.0
-Project-URL: Homepage, https://github.com/invarlock/invarlock
-Project-URL: Repository, https://github.com/invarlock/invarlock
-Project-URL: Documentation, https://github.com/invarlock/invarlock/tree/main/docs
-Project-URL: Issues, https://github.com/invarlock/invarlock/issues
-Project-URL: Changelog, https://github.com/invarlock/invarlock/blob/main/CHANGELOG.md
-Keywords: machine-learning,deep-learning,transformers,pytorch,llm,quantization,safety,evaluation,certification
-Classifier: Development Status :: 4 - Beta
-Classifier: Intended Audience :: Developers
-Classifier: Intended Audience :: Science/Research
-Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
-Classifier: Programming Language :: Python :: 3
-Classifier: Programming Language :: Python :: 3.12
-Classifier: Programming Language :: Python :: 3.13
-Classifier: Operating System :: OS Independent
-Classifier: Typing :: Typed
-Requires-Python: >=3.12
-Description-Content-Type: text/markdown
-License-File: LICENSE
-Requires-Dist: typer>=0.15
-Requires-Dist: click>=8.1
-Requires-Dist: shellingham>=1.5.0
-Requires-Dist: pandas>=2.2
-Requires-Dist: scikit-learn>=1.4
-Requires-Dist: pydantic>=2.0
-Requires-Dist: rich>=13.0
-Requires-Dist: pyyaml>=6.0
-Requires-Dist: markdown>=3.5
-Requires-Dist: psutil>=5.9
-Requires-Dist: hypothesis>=6.98
-Requires-Dist: typing_extensions>=4.7
-Requires-Dist: jsonschema>=4.0
-Provides-Extra: adapters
-Requires-Dist: torch>=2.1.0; extra == "adapters"
-Requires-Dist: transformers>=4.53.0; extra == "adapters"
-Provides-Extra: hf
-Requires-Dist: torch>=2.1.0; extra == "hf"
-Requires-Dist: transformers>=4.53.0; extra == "hf"
-Requires-Dist: datasets>=3.0; extra == "hf"
-Requires-Dist: numpy>=1.24; extra == "hf"
-Requires-Dist: huggingface_hub>=0.23; extra == "hf"
-Requires-Dist: aiohttp>=3.12.14; extra == "hf"
-Requires-Dist: h2>=4.3.0; extra == "hf"
-Requires-Dist: pillow>=11.3.0; extra == "hf"
-Provides-Extra: guards
-Requires-Dist: torch>=2.1.0; extra == "guards"
-Requires-Dist: numpy>=1.24; extra == "guards"
-Provides-Extra: edits
-Requires-Dist: torch>=2.1.0; extra == "edits"
-Provides-Extra: eval
-Requires-Dist: torch>=2.1.0; extra == "eval"
-Requires-Dist: datasets>=3.0; extra == "eval"
-Provides-Extra: gptq
-Requires-Dist: torch>=2.1.0; extra == "gptq"
-Requires-Dist: auto-gptq>=0.7.0; platform_system == "Linux" and extra == "gptq"
-Requires-Dist: triton>=2.3.0; platform_system == "Linux" and extra == "gptq"
-Requires-Dist: transformers>=4.53.0; extra == "gptq"
-Provides-Extra: awq
-Requires-Dist: torch>=2.1.0; extra == "awq"
-Requires-Dist: autoawq>=0.2.0; platform_system == "Linux" and extra == "awq"
-Requires-Dist: transformers>=4.53.0; extra == "awq"
-Requires-Dist: triton>=2.3.0; platform_system == "Linux" and extra == "awq"
-Provides-Extra: gpu
-Requires-Dist: torch>=2.1.0; extra == "gpu"
-Requires-Dist: accelerate>=0.27; extra == "gpu"
-Requires-Dist: bitsandbytes>=0.41; platform_system == "Linux" and extra == "gpu"
-Provides-Extra: all
-Requires-Dist: torch>=2.1.0; extra == "all"
-Requires-Dist: transformers>=4.53.0; extra == "all"
-Requires-Dist: datasets>=3.0; extra == "all"
-Requires-Dist: numpy>=1.24; extra == "all"
-Requires-Dist: huggingface_hub>=0.23; extra == "all"
-Requires-Dist: accelerate>=0.27; extra == "all"
-Requires-Dist: bitsandbytes>=0.41; platform_system == "Linux" and extra == "all"
-Requires-Dist: auto-gptq>=0.7.0; platform_system == "Linux" and extra == "all"
-Requires-Dist: autoawq>=0.2.0; platform_system == "Linux" and extra == "all"
-Requires-Dist: triton>=2.3.0; platform_system == "Linux" and extra == "all"
-Requires-Dist: aiohttp>=3.12.14; extra == "all"
-Requires-Dist: h2>=4.3.0; extra == "all"
-Requires-Dist: pillow>=11.3.0; extra == "all"
-Provides-Extra: onnx
-Requires-Dist: optimum>=1.17.0; extra == "onnx"
-Requires-Dist: onnxruntime>=1.17.0; extra == "onnx"
-Provides-Extra: dev
-Requires-Dist: pytest>=7.0; extra == "dev"
-Requires-Dist: pytest-cov>=4.0; extra == "dev"
-Requires-Dist: ruff>=0.1.0; extra == "dev"
-Requires-Dist: black>=23.0; extra == "dev"
-Requires-Dist: mypy>=1.0; extra == "dev"
-Requires-Dist: hypothesis>=6.98; extra == "dev"
-Requires-Dist: pre-commit>=3.0; extra == "dev"
-Requires-Dist: mkdocs>=1.5; extra == "dev"
-Requires-Dist: mkdocs-material>=9.5; extra == "dev"
-Requires-Dist: mkdocs-mermaid2-plugin>=1.1; extra == "dev"
-Requires-Dist: sphinx>=7.0; extra == "dev"
-Requires-Dist: matplotlib>=3.7; extra == "dev"
-Requires-Dist: bitsandbytes>=0.41; extra == "dev"
-Requires-Dist: build>=0.10.0; extra == "dev"
-Requires-Dist: twine>=4.0.0; extra == "dev"
-Dynamic: license-file
-
-# InvarLock — Edit‑agnostic robustness certificates for weight edits
-
-In short: certify that weight edits (e.g., quantization) preserve quality. If
-they don’t, roll back safely.
-
-Technical: edit‑agnostic guard pipeline (invariants → spectral → RMT →
-variance) producing a machine‑readable Evaluation Certificate.
-
-> **Status:** 0.3.7 (pre‑1.0). Until 1.0, **minor** releases may be
-> breaking. See CLI help and the CHANGELOG for updates.
-
-[](https://github.com/invarlock/invarlock/actions/workflows/ci.yml)
-[](https://pypi.org/project/invarlock/)
-[](https://github.com/invarlock/invarlock/blob/main/docs/user-guide/quickstart.md)
-[](LICENSE)
-[](https://www.python.org/downloads/release/python-3120/)
----
-
-For guidance on where to ask questions, how to report bugs, and what to expect in terms of response times, see
-[SUPPORT.md](https://github.com/invarlock/invarlock/blob/main/SUPPORT.md).
-
-## 🚀 Quick start (no repo clone)
-
-Notebooks (Colab):
-
-- [](https://colab.research.google.com/github/invarlock/invarlock/blob/main/notebooks/invarlock_quickstart_cpu.ipynb)
-  `invarlock_quickstart_cpu.ipynb` — install + certify + verify + HTML export (CPU-friendly)
-- [](https://colab.research.google.com/github/invarlock/invarlock/blob/main/notebooks/invarlock_compare_certify.ipynb)
-  `invarlock_compare_certify.ipynb` — Compare & Certify (BYOE) end-to-end
-- [](https://colab.research.google.com/github/invarlock/invarlock/blob/main/notebooks/invarlock_certificate_deep_dive.ipynb)
-  `invarlock_certificate_deep_dive.ipynb` — reading and interpreting certificates
-- [](https://colab.research.google.com/github/invarlock/invarlock/blob/main/notebooks/invarlock_custom_datasets.ipynb)
-  `invarlock_custom_datasets.ipynb` — Bring Your Own Data (BYOD) with `local_jsonl`
-- [](https://colab.research.google.com/github/invarlock/invarlock/blob/main/notebooks/invarlock_python_api.ipynb)
-  `invarlock_python_api.ipynb` — programmatic Python API usage
-- [](https://colab.research.google.com/github/invarlock/invarlock/blob/main/notebooks/invarlock_policy_tiers.ipynb)
-  `invarlock_policy_tiers.ipynb` — Conservative vs Balanced vs Aggressive tier comparison
-
-```bash
-# Install with HF adapters
-pip install "invarlock[hf]"
-
-# Fast dev self‑cert on GPT‑2 small (tiny‑relax; downloads require explicit network)
-INVARLOCK_ALLOW_NETWORK=1 INVARLOCK_DEDUP_TEXTS=1 INVARLOCK_TINY_RELAX=1 \
-invarlock certify \
-  --baseline gpt2 \
-  --subject gpt2 \
-  --adapter auto \
-  --profile dev
-```
-
-This produces `reports/.../evaluation.cert.json` with paired metrics
-(ppl/accuracy), structural deltas, spectral/RMT stats, variance‑estimator
-provenance, seeds/hashes, pairing metrics, and a policy digest.
-
-> **Calibration note:** tier thresholds and window sizes are piloted on GPT‑2 small
-> and BERT base (see `docs/assurance/09-tier-v1-calibration.md`). For
-> calibrated Balanced/Conservative certs, use the preset‑based CI/Release examples
-> below. `INVARLOCK_TINY_RELAX` dev runs relax sample‑size floors and are intended
-> only for small smoke tests (not release evidence).
-
-> Need presets or matrix scripts? Clone this repo and see Presets & Demos below.
-
----
-
-## 📚 Docs & Guides
-
-- Quickstart: <https://github.com/invarlock/invarlock/blob/main/docs/user-guide/quickstart.md>
-- Compare & Certify (BYOE): <https://github.com/invarlock/invarlock/blob/main/docs/user-guide/compare-and-certify.md>
-- Reading a Certificate: <https://github.com/invarlock/invarlock/blob/main/docs/user-guide/reading-certificate.md>
-- CLI reference: <https://github.com/invarlock/invarlock/blob/main/docs/reference/cli.md>
-
-Quick examples (repo presets, CPU; repo clone required for preset paths):
-
-```bash
-# Install with HF adapters
-pip install "invarlock[hf]"
-
-# Preflight a config (JSON diagnostics)
-invarlock doctor --config configs/presets/causal_lm/wikitext2_512.yaml --json
-
-# Calibrated GPT‑2 small (recommended starting point; repo preset)
-INVARLOCK_ALLOW_NETWORK=1 INVARLOCK_DEDUP_TEXTS=1 \
-invarlock certify \
-  --baseline gpt2 \
-  --subject gpt2 \
-  --adapter auto \
-  --profile release \
-  --preset configs/presets/causal_lm/wikitext2_512.yaml
-
-# Tiny causal LM smoke (out‑of‑calibration, dev‑only)
-INVARLOCK_ALLOW_NETWORK=1 \
-invarlock certify \
-  --baseline hf:sshleifer/tiny-gpt2 \
-  --subject hf:sshleifer/tiny-gpt2 \
-  --profile dev
-```
-
-Notes:
-
-- Presets and scripts live in this repo (`configs/`, `scripts/`) and are not
-  shipped in wheels. Use flag‑only `certify` when installing from PyPI, or clone
-  this repo to use presets and the matrix script.
-- `python -m invarlock` works the same as `invarlock`.
-- InvarLock runs offline by default; enable network per command with `INVARLOCK_ALLOW_NETWORK=1` when fetching.
-
----
-
-## 🔧 Installation
-
-```bash
-# Core + HF adapter
-pip install "invarlock[hf]"
-
-# GPU extras (CUDA wheels if available)
-pip install "invarlock[gpu]"
-
-# Optional edit backends
-pip install "invarlock[awq,gptq]" # AWQ/GPTQ PTQ stacks
-pip install "invarlock[dev]" # dev tooling (ruff, pytest, mkdocs)
-```
-
-> Minimal core installs with `pip install invarlock`. The OSS core is edit‑agnostic
-> (BYOE): supply baseline and subject checkpoints and run Compare & Certify. A
-> small built‑in edit, `quant_rtn`, is provided for CI/quickstart demos only;
-> optional extras (e.g., `gptq`, `awq`, `gpu`) are loaders/runtimes, not edit
-> pipelines. Core installs do not pull in torch/transformers; those are only
-> installed when you opt into extras such as `"invarlock[hf]"` or
-> `"invarlock[adapters]"`.
-
-Run either entry point:
-
-```bash
-invarlock --help
-python -m invarlock --help
-```
-
-Common error (missing torch on adapter-based commands):
-
-```text
-❌ Torch is required for this command.
-Install extras with: pip install "invarlock[hf]" or "invarlock[adapters]".
-```
-
-If you see this, install an appropriate extra (for example, `pip install "invarlock[hf]"`)
-before running `invarlock run` or `invarlock certify` with HF adapters.
-
-### Network Access
-
-- Outbound network is disabled by default for safety. Enable it explicitly (per
-  command) when you need to download models or datasets:
-
-```bash
-INVARLOCK_ALLOW_NETWORK=1 invarlock certify \
-  --baseline gpt2 \
-  --subject gpt2 \
-  --adapter auto \
-  --profile ci \
-  --preset configs/presets/causal_lm/wikitext2_512.yaml
-```
-
-- Offline/air‑gapped usage: pre‑download to a cache, then run with network
-  disabled. You can enforce offline reads with `HF_DATASETS_OFFLINE=1` (and
-  optionally set `HF_HOME`/`HF_DATASETS_CACHE` to your cache location).
-
-See the CLI reference and datasets guide for details:
-
-- <https://github.com/invarlock/invarlock/blob/main/docs/reference/cli.md>
-- <https://github.com/invarlock/invarlock/blob/main/docs/reference/datasets.md>
-
-### Install via pipx (isolated)
-
-```bash
-# Ensure pipx uses Python 3.12+
-pipx install --python python3.12 "invarlock[hf]" # Python 3.12+ recommended
-
-# With GPU extras (if supported on your platform)
-pipx install --python python3.12 "invarlock[hf,gpu]"
-```
-
-### Conda environment recipe
-
-```bash
-conda create -n invarlock python=3.12 -y
-conda activate invarlock
-
-# Core + HF stack
-pip install "invarlock[hf]"
-
-# Optional extras
-# pip install "invarlock[gpu]"
-# pip install "invarlock[awq,gptq]"
-```
-
----
-
-## 💻 Support Matrix
-
-<!-- markdownlint-disable MD060 -->
-| Platform | Status | Notes |
-| ---------------------- | --------------- | ----------------------------------------- |
-| Python 3.12+ | ✅ Required | |
-| Linux | ✅ Full | Primary dev target |
-| macOS (Intel/M-series) | ✅ Full | MPS supported (default on Apple Silicon) |
-| Windows | ❌ Not supported | Use WSL2 or a Linux container if required |
-| CUDA | ✅ Recommended | For larger models |
-| CPU | ✅ Fallback | Slower but functional |
-<!-- markdownlint-enable MD060 -->
-
-**Device selection:** CUDA → MPS → CPU (auto). Override with torch env if
-needed (e.g., `CUDA_VISIBLE_DEVICES`).
-
----
-
-## 🧱 What InvarLock Provides
-
-- **Runner** (torch-agnostic core): `prepare → preview → apply → guards → evaluate → report/rollback`
-
-- **Built-in edit**:
-  - `quant_rtn` (INT8 RTN, per‑channel, clamp/group size)
-
-- **Guards** (policy-tiered; “GuardChain” = ordered guard pipeline):
-
-  1. **Invariants** (pre/post: shapes/finite/tying)
-  2. **Spectral** (per-family z-caps; monitor or gate per tier)
-  3. **RMT** (ε-band on outliers; monitor or gate per tier)
-  4. **Variance (VE)** (predictive paired ΔlogNLL gate; tiered sidedness)
-
-- **Evaluation Certificate (schema v1, PM‑only)**: Primary Metric (ppl or
-  accuracy) with paired statistics, structural deltas, spectral/RMT stats, VE
-  provenance, seeds/hashes, pairing metrics, and **policy digest**. Canonical
-  artifact: `reports/.../evaluation.cert.json`.
-
-**Scope (what InvarLock does / does not do):**
-
-- InvarLock certifies **regression risk from weight edits** (e.g., quantization or
-  pruning) relative to a fixed baseline under a specific configuration.
-- It focuses on **paired primary metrics** (ppl/accuracy) plus structural and
-  guard telemetry (invariants, spectral, RMT, variance) for those edits.
-- It **does not** claim to solve content‑safety problems (toxicity, bias,
-  jailbreaks) or alignment in general, and it does not certify arbitrary
-  training changes or new datasets.
-- It is calibrated and tested on Linux/macOS environments using the HF/PyTorch
-  stack described in the docs; native Windows is not supported.
-- For the detailed assurance case and threat model, see
-  `docs/assurance/00-safety-case.md` and `docs/security/threat-model.md`.
-
-Minimal excerpt (redacted):
-
-```json
-{
-  "schema_version": "v1",
-  "run_id": "...",
-  "validation": {
-    "primary_metric_acceptable": true,
-    "guard_overhead_acceptable": true
-  },
-  "primary_metric": {
-    "kind": "ppl_causal",
-    "preview": 12.3,
-    "final": 12.1,
-    "ratio_vs_baseline": 0.98,
-    "display_ci": [0.97, 0.99]
-  },
-  "structure": {"layers_modified": 0, "params_changed": 0},
-  "spectral": {"caps_applied": 0},
-  "rmt": {"stable": true},
-  "auto": {"tier": "balanced"}
-}
-```
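Reviewer annotation (not part of the diffed file): as a reading aid for the removed excerpt above, the following Python sketch parses a certificate-shaped JSON document and summarizes the fields a reviewer would likely check first. The field names are taken from the excerpt. The helper name `summarize_certificate` and the rule that a certificate passes only when every `validation` flag is true are assumptions made for illustration, not InvarLock's API.

```python
import json

# Sample copied from the redacted excerpt above. A real certificate would be
# loaded from reports/.../evaluation.cert.json instead.
SAMPLE_CERT = """
{
  "schema_version": "v1",
  "run_id": "...",
  "validation": {
    "primary_metric_acceptable": true,
    "guard_overhead_acceptable": true
  },
  "primary_metric": {
    "kind": "ppl_causal",
    "preview": 12.3,
    "final": 12.1,
    "ratio_vs_baseline": 0.98,
    "display_ci": [0.97, 0.99]
  },
  "structure": {"layers_modified": 0, "params_changed": 0},
  "spectral": {"caps_applied": 0},
  "rmt": {"stable": true},
  "auto": {"tier": "balanced"}
}
"""


def summarize_certificate(cert: dict) -> dict:
    """Hypothetical helper: pull out the fields a reviewer checks first."""
    validation = cert.get("validation", {})
    metric = cert.get("primary_metric", {})
    ci = metric.get("display_ci") or [None, None]
    return {
        # Assumption: a certificate passes only if every validation flag is true.
        "all_checks_pass": bool(validation) and all(validation.values()),
        "failed_checks": sorted(k for k, v in validation.items() if not v),
        "metric_kind": metric.get("kind"),
        "ratio_vs_baseline": metric.get("ratio_vs_baseline"),
        "ci_upper": ci[1],
        "tier": cert.get("auto", {}).get("tier"),
    }


summary = summarize_certificate(json.loads(SAMPLE_CERT))
print(summary)
```

Returning the failed flag names rather than a single boolean keeps the summary useful when a certificate does not pass.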
-
----
-
-## 🛡️ Guard Order & Balanced Defaults
-
-**Canonical order**: `["invariants", "spectral", "rmt", "variance", "invariants"]`
-
-**Balanced profile (example)**
-
-```yaml
-guards:
-  spectral:
-    mode: monitor
-    sigma_quantile: 0.95
-    deadband: 0.10
-    scope: all
-    max_caps: 5
-    max_spectral_norm: null # disable absolute clamp; rely on calibrated κ_f
-    multiple_testing: { method: bh, alpha: 0.05, m: 4 }
-    family_caps: { ffn: 2.5, attn: 2.8, embed: 3.0, other: 3.0 } # z-caps (FPR-derived)
-  rmt:
-    mode: monitor
-    epsilon_by_family: { ffn: 0.10, attn: 0.08, embed: 0.12, other: 0.12 }
-  variance:
-    tap: "post mlp.c_proj (pre-residual)"
-    targets: "edited_modules_only"
-    discovery:
-      deadband: 0.02
-      min_abs_adjust: 0.012
-      max_scale_step: 0.03
-    gating:
-      sided: "one-sided" # improvement-only
-      min_effect_lognll: 9e-4 # pilot-derived power threshold
-```
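Reviewer annotation (not part of the diffed file): the removed config sets `multiple_testing: { method: bh, alpha: 0.05, m: 4 }`. Assuming `bh` refers to the standard Benjamini-Hochberg procedure and `m` to the number of tests (one per family), the sketch below shows how that procedure decides which families to flag. The p-values are invented for illustration, and how InvarLock derives per-family p-values is not shown in this diff.

```python
def benjamini_hochberg(p_values, alpha=0.05, m=None):
    """Standard BH step-up procedure; returns one reject flag per p-value."""
    m = len(p_values) if m is None else m
    # Indices of the p-values in ascending order.
    order = sorted(range(len(p_values)), key=lambda i: p_values[i])
    # Find the largest 1-based rank k with p_(k) <= k * alpha / m.
    cutoff = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * alpha / m:
            cutoff = rank
    # Reject every hypothesis ranked at or below the cutoff.
    rejected = [False] * len(p_values)
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff:
            rejected[idx] = True
    return rejected


# Hypothetical per-family p-values (not taken from any real run).
p_by_family = {"ffn": 0.004, "attn": 0.030, "embed": 0.200, "other": 0.020}
flags = benjamini_hochberg(list(p_by_family.values()), alpha=0.05, m=4)
flagged = [name for name, hit in zip(p_by_family, flags) if hit]
print(flagged)
```

With these inputs the sorted p-values are compared against thresholds 0.0125, 0.025, 0.0375, and 0.05, so three of the four families are flagged.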
-
-> **Conservative** raises z-caps/ε/deadband/min-effect and uses **two-sided** VE; **Aggressive** relaxes accordingly.
-
----
-
-> 🔍 For development and CI commands (pytest, mkdocs, generators), see CONTRIBUTING.md.
-
----
-
-## ✂️ Edits & Plugins
-
-- **Quant RTN** (built‑in): INT8 RTN, per‑channel, group size, percentile clamp
-- **Compare & Certify (BYOE, recommended)**: Bring your baseline + subject checkpoints and certify with InvarLock
-- **Plugins (optional)**: Adapters and guards via entry points. Adapters extend
-  model loading/inference (e.g., GPTQ/AWQ formats); plugins do not add edit
-  algorithms beyond RTN. List components with:
-
-```bash
-invarlock plugins --help # summary
-invarlock plugins guards # guard plugins
-invarlock plugins edits # edit plugins
-invarlock plugins adapters # adapters and backend hints
-```
-
----
-
-## 🔁 Certification Criteria (balanced profile)
-
-Key checks enforced by balanced policy (summary):
-
-- **Pairing invariants**: preview = final counts; `match=1.00`, `overlap=0.00` (fail-fast in CI/Release)
-- **PM ratio gate** (ppl or accuracy): upper CI ≤ **1.10**
-- **Drift**: 0.95–1.05 (paired log-space)
-- **Spectral/RMT**: within tier FPR/ε band
-- **Catastrophe rollback**: automatic revert if PPL > **2.0×**
-- **Guard overhead**: a bare-vs-guarded comparison records `validation.guard_overhead_acceptable=true` when ≤ 1 % PPL overhead
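Reviewer annotation (not part of the diffed file): the numeric thresholds in the removed list above can be read as three simple comparisons. The sketch below encodes them under stated assumptions: the names `BalancedThresholds` and `evaluate_gates` are hypothetical, the result keys other than `primary_metric_acceptable` are invented, and the 2.0× rollback threshold is assumed to be measured relative to the baseline PPL. It is not InvarLock's implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BalancedThresholds:
    """Threshold values quoted in the list above (balanced profile)."""
    ratio_ci_upper_max: float = 1.10   # PM ratio gate: upper CI bound
    drift_min: float = 0.95            # drift band, lower edge
    drift_max: float = 1.05            # drift band, upper edge
    catastrophe_ratio: float = 2.0     # rollback if PPL exceeds this multiple


def evaluate_gates(ratio_ci_upper, drift, ppl_ratio, t=BalancedThresholds()):
    """Hypothetical gate check; inputs are ratios, with 1.0 meaning no change."""
    return {
        "primary_metric_acceptable": ratio_ci_upper <= t.ratio_ci_upper_max,
        "drift_acceptable": t.drift_min <= drift <= t.drift_max,
        # Assumption: the 2.0x threshold is relative to the baseline PPL.
        "rollback_required": ppl_ratio > t.catastrophe_ratio,
    }


ok = evaluate_gates(ratio_ci_upper=0.99, drift=0.984, ppl_ratio=0.98)
bad = evaluate_gates(ratio_ci_upper=1.15, drift=1.00, ppl_ratio=2.3)
print(ok)
print(bad)
```

Keeping the thresholds in a frozen dataclass makes it easy to define a stricter or looser variant for the other tiers without touching the comparison logic.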
-
-
----
-
-## 🧾 Minimal Config (balanced GPT-2, CI profile)
-
-```yaml
-model:
-  id: "<set-your-model-id>" # e.g., gpt2
-  adapter: "hf_causal"
-  device: "cpu"
-dataset:
-  provider: "wikitext2"
-  split: "validation"
-  seq_len: 512
-  stride: 512
-  preview_n: 64
-  final_n: 64
-  seed: 42
-edit:
-  # Optional: built-in quant demo. Omit for Compare & Certify/BYOE.
-  name: quant_rtn
-  plan:
-    bitwidth: 8
-    per_channel: true
-    scope: attn
-eval:
-  metric:
-    kind: ppl_causal
-  loss:
-    type: causal
-guards:
-  order: [invariants, spectral, rmt, variance, invariants]
-  spectral: { mode: monitor }
-  rmt: { mode: monitor }
-  variance:
-    tap: "post mlp.c_proj (pre-residual)"
-    targets: "edited_modules_only"
-    discovery: { deadband: 0.02, min_abs_adjust: 0.012, max_scale_step: 0.03 }
-    gating: { sided: one-sided, min_effect_lognll: 9e-4 }
-auto:
-  enabled: true
-  tier: balanced
-  probes: 0
-output:
-  dir: runs
-  save_model: false
-  save_report: true
-```
-
----
-
-## 🩺 Doctor (preflight)
-
-Run preflight checks before a run to catch misconfigurations early:
-
-```bash
-invarlock doctor --config configs/presets/causal_lm/wikitext2_512.yaml --json
-```
-
-Text mode emits lines prefixed with `ERROR:`, `WARNING:`, or `NOTE:` and stable
-codes like `[INVARLOCK:D001]`. JSON mode includes `summary`, `policy`,
-`findings[]`, `resolution`, and `format_version`.
-
----
-
-## 🏗️ Source Layout (Single Distribution)
-
-```text
-invarlock/
-├─ src/
-│  ├─ invarlock/ # core + unified namespace
-│  │  ├─ core/ # runner, registry, contracts, events, ABI
-│  │  ├─ cli/ # console app + command wrappers (unified import path)
-│  │  ├─ adapters/ # model adapters (HF causal/MLM/seq2seq/onnx)
-│  │  ├─ edits/ # quant_rtn
-│  │  ├─ guards/ # invariants, spectral, rmt, variance
-│  │  ├─ eval/ # evaluation metrics and helpers
-│  │  ├─ reporting/ # report assembly, certificate generation/validation
-│  │  ├─ assurance/ # assurance surface aggregating cert helpers
-│  │  ├─ plugins/ # built-in example plugins
-│  │  └─ observability/ # monitoring/metrics/tracing wrappers
-├─ configs/ # presets (repo‑only; clone to use)
-├─ docs/ # user guides, reference, assurance notes
-├─ scripts/ # automation / QA helpers
-└─ tests/ # unit/integration/property tests
-
-Note: The package exposes a single import namespace (`invarlock.*`). Presets/scripts are repo resources and not packaged in wheels.
-```
-
----
-
-## 📚 Documentation
-
-- User Guide: <https://github.com/invarlock/invarlock/blob/main/docs/user-guide/getting-started.md>
-- Quickstart: <https://github.com/invarlock/invarlock/blob/main/docs/user-guide/quickstart.md>
-- Compare & Certify (BYOE): <https://github.com/invarlock/invarlock/blob/main/docs/user-guide/compare-and-certify.md>
-- Reading a Certificate: <https://github.com/invarlock/invarlock/blob/main/docs/user-guide/reading-certificate.md>
-- Assurance (proof notes): <https://github.com/invarlock/invarlock/tree/main/docs/assurance>
-  - eval math, spectral FPR, RMT ε, VE gate power, determinism
-- Config Schema: <https://github.com/invarlock/invarlock/blob/main/docs/reference/config-schema.md>
-- Guard Reference: <https://github.com/invarlock/invarlock/blob/main/docs/reference/guards.md>
-
----
-
-## ⚡ Quick CPU Demos (dev)
-
-For tiny, CPU‑only demos that produce readable PASS banners in dev, enable
-tiny‑relax and run the matrix script (repo clone required). This mode relaxes
-primary‑metric token floors and is intended for smoke testing only (not release
-evidence):
-
-```bash
-export INVARLOCK_TINY_RELAX=1 INVARLOCK_ALLOW_NETWORK=1 INVARLOCK_DEDUP_TEXTS=1 \
-  TRANSFORMERS_NO_TORCHVISION=1 TOKENIZERS_PARALLELISM=false
-RUN=1 NET=1 bash scripts/run_tiny_all_matrix.sh
-```
-
-Add `INCLUDE_MEASURED_CLS=1` to include a measured classification step (requires warmed HF caches/network).
-
----
-
-## 🧪 Determinism & Provenance
-
-- Seeds: `{python, numpy, torch}` recorded in certs
-- Dataset/tokenizer hashes recorded
-- Paired non-overlapping windows (fail-fast if counts mismatch or pairing < 1.0)
-- Cert math checks: `ppl_ratio.point == exp(mean ΔlogNLL)` and CI from the **same** paired Δ array
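Reviewer annotation (not part of the diffed file): the identity `ppl_ratio.point == exp(mean ΔlogNLL)` in the removed list above can be checked with a few lines of arithmetic. The sketch below assumes one log-NLL value per paired window and equal weighting across windows. Any token weighting InvarLock applies is not shown in this diff, and the function name and sample values are hypothetical.

```python
import math


def ppl_ratio_from_paired(baseline_lognll, subject_lognll):
    """Return (point estimate, paired deltas) for subject vs. baseline.

    Assumes equal-weight windows; each list holds one mean log-NLL per window,
    and position i in both lists refers to the same evaluation window.
    """
    if not baseline_lognll or len(baseline_lognll) != len(subject_lognll):
        raise ValueError("paired windows must be non-empty and equal in count")
    # Paired differences: subject minus baseline, window by window.
    deltas = [s - b for b, s in zip(baseline_lognll, subject_lognll)]
    mean_delta = sum(deltas) / len(deltas)
    # A perplexity ratio is the exponential of the mean log-NLL difference.
    return math.exp(mean_delta), deltas


# Hypothetical per-window values, for illustration only.
baseline = [2.50, 2.40, 2.60]
subject = [2.48, 2.41, 2.57]
point, deltas = ppl_ratio_from_paired(baseline, subject)
print(round(point, 4))
```

Returning the delta array alongside the point estimate reflects the second half of the check: a confidence interval would be computed from the same paired deltas rather than from separately aggregated baseline and subject statistics.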
-
----
-
-## 🤝 Contributing
-
-```bash
-make dev-install # editable + dev tools (pytest, ruff, mypy, mkdocs, etc.)
-make test # run tests
-make lint # ruff + mypy
-make format # ruff format/fix
-make docs # build docs (mkdocs)
-make verify # tests, lint, format, markdownlint
-```
-
-Please see `CONTRIBUTING.md` for guidelines and `Makefile` for more targets.
-
----
-
-## 📄 License
-
-Apache-2.0 — see `LICENSE`.
-
----
-
-### Notes
-
-- PPL levels depend on `seq_len` (e.g., 768-token windows typically reduce PPL vs shorter contexts).
File without changes
File without changes
File without changes