omnibioai-model-registry 0.1.0__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (28)
  1. omnibioai_model_registry-0.1.0/PKG-INFO +405 -0
  2. omnibioai_model_registry-0.1.0/README.md +377 -0
  3. omnibioai_model_registry-0.1.0/omnibioai_model_registry/__init__.py +10 -0
  4. omnibioai_model_registry-0.1.0/omnibioai_model_registry/api.py +190 -0
  5. omnibioai_model_registry-0.1.0/omnibioai_model_registry/audit/__init__.py +0 -0
  6. omnibioai_model_registry-0.1.0/omnibioai_model_registry/audit/audit_log.py +27 -0
  7. omnibioai_model_registry-0.1.0/omnibioai_model_registry/cli/__init__.py +0 -0
  8. omnibioai_model_registry-0.1.0/omnibioai_model_registry/cli/main.py +211 -0
  9. omnibioai_model_registry-0.1.0/omnibioai_model_registry/config.py +23 -0
  10. omnibioai_model_registry-0.1.0/omnibioai_model_registry/errors.py +27 -0
  11. omnibioai_model_registry-0.1.0/omnibioai_model_registry/package/__init__.py +0 -0
  12. omnibioai_model_registry-0.1.0/omnibioai_model_registry/package/layout.py +67 -0
  13. omnibioai_model_registry-0.1.0/omnibioai_model_registry/package/manifest.py +74 -0
  14. omnibioai_model_registry-0.1.0/omnibioai_model_registry/package/validate.py +11 -0
  15. omnibioai_model_registry-0.1.0/omnibioai_model_registry/refs.py +25 -0
  16. omnibioai_model_registry-0.1.0/omnibioai_model_registry/service/app/main.py +192 -0
  17. omnibioai_model_registry-0.1.0/omnibioai_model_registry/storage/__init__.py +0 -0
  18. omnibioai_model_registry-0.1.0/omnibioai_model_registry/storage/base.py +26 -0
  19. omnibioai_model_registry-0.1.0/omnibioai_model_registry/storage/localfs.py +33 -0
  20. omnibioai_model_registry-0.1.0/omnibioai_model_registry.egg-info/PKG-INFO +405 -0
  21. omnibioai_model_registry-0.1.0/omnibioai_model_registry.egg-info/SOURCES.txt +26 -0
  22. omnibioai_model_registry-0.1.0/omnibioai_model_registry.egg-info/dependency_links.txt +1 -0
  23. omnibioai_model_registry-0.1.0/omnibioai_model_registry.egg-info/entry_points.txt +2 -0
  24. omnibioai_model_registry-0.1.0/omnibioai_model_registry.egg-info/requires.txt +7 -0
  25. omnibioai_model_registry-0.1.0/omnibioai_model_registry.egg-info/top_level.txt +1 -0
  26. omnibioai_model_registry-0.1.0/pyproject.toml +74 -0
  27. omnibioai_model_registry-0.1.0/setup.cfg +4 -0
  28. omnibioai_model_registry-0.1.0/tests/test_registry_localfs.py +183 -0
@@ -0,0 +1,405 @@
1
+ Metadata-Version: 2.4
2
+ Name: omnibioai-model-registry
3
+ Version: 0.1.0
4
+ Summary: Production-grade model registry for the OmniBioAI ecosystem.
5
+ Author: Manish Kumar
6
+ License: MIT
7
+ Keywords: model-registry,mlops,bioinformatics,ai,machine-learning,artifact-management,reproducibility
8
+ Classifier: Development Status :: 3 - Alpha
9
+ Classifier: Intended Audience :: Science/Research
10
+ Classifier: Intended Audience :: Developers
11
+ Classifier: License :: OSI Approved :: MIT License
12
+ Classifier: Programming Language :: Python :: 3
13
+ Classifier: Programming Language :: Python :: 3.10
14
+ Classifier: Programming Language :: Python :: 3.11
15
+ Classifier: Programming Language :: Python :: 3.12
16
+ Classifier: Operating System :: OS Independent
17
+ Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
18
+ Classifier: Topic :: Scientific/Engineering :: Bio-Informatics
19
+ Classifier: Topic :: Software Development :: Libraries
20
+ Requires-Python: >=3.10
21
+ Description-Content-Type: text/markdown
22
+ Provides-Extra: dev
23
+ Requires-Dist: pytest>=7.0; extra == "dev"
24
+ Requires-Dist: pytest-cov>=4.0; extra == "dev"
25
+ Requires-Dist: black>=23.0; extra == "dev"
26
+ Requires-Dist: ruff>=0.1.0; extra == "dev"
27
+ Requires-Dist: mypy>=1.0; extra == "dev"
28
+
29
+ # OmniBioAI Model Registry
30
+
31
+ **OmniBioAI Model Registry** is a **production-grade lifecycle management system for AI/ML models** within the **OmniBioAI ecosystem**.
32
+
33
+ It provides:
34
+
35
+ - **Immutable model versioning** (write-once versions)
36
+ - **Cryptographic integrity verification (SHA256)**
37
+ - **Provenance-friendly metadata capture**
38
+ - **Staged promotion workflows (latest → staging → production)**
39
+ - **Deterministic resolution by stable reference** (`model@alias` or `model@version`)
40
+ - **Local-first design** with a clean path to future backends (S3/Azure/on-prem)
41
+
42
+ The registry is implemented as a **standalone Python library** and includes:
43
+ - a **CLI** (`omr`)
44
+ - a **minimal REST service** (FastAPI)
45
+
46
+ ---
47
+
48
+ ## Why This Exists
49
+
50
+ Biomedical AI requires:
51
+
52
+ - **Reproducibility**
53
+ - **Auditability**
54
+ - **Governance**
55
+ - **Offline / air-gapped deployment**
56
+ - **Cross-infrastructure execution parity**
57
+
58
+ Traditional ML tooling often assumes:
59
+ - cloud-first infrastructure
60
+ - mutable artifacts
61
+ - weak provenance guarantees
62
+
63
+ **OmniBioAI Model Registry is designed differently.**
64
+
65
+ > It treats AI models as **scientific artifacts** that must be **immutable, verifiable, and reproducible** across environments.
66
+
67
+ ---
68
+
69
+ ## Role in the OmniBioAI Architecture
70
+
71
+ OmniBioAI follows a **four-plane architecture**:
72
+
73
+ | Plane | Responsibility |
74
+ | ----------------- | -------------------------------------- |
75
+ | **Control Plane** | UI, registries, metadata, governance |
76
+ | **Compute Plane** | Workflow execution, HPC/cloud adapters |
77
+ | **Data Plane** | Artifacts, datasets, outputs |
78
+ | **AI Plane** | Reasoning, RAG, agents, interpretation |
79
+
80
+ The **Model Registry** belongs to the **Control Plane** and provides:
81
+
82
+ - AI artifact governance
83
+ - deterministic inference references
84
+ - promotion and audit workflows
85
+ - infrastructure-independent model resolution
86
+
87
+ ---
88
+
89
+ ## Core Design Principles
90
+
91
+ ### 1) Immutability
92
+ Each model version is **write-once**:
93
+ - no overwrites
94
+ - no silent mutation
95
+ - full historical trace
96
+
97
+ A given version therefore always refers to the same artifacts, which is a prerequisite for **scientific reproducibility**.
98
+
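The write-once rule above can be illustrated with a minimal sketch. It is not the package's implementation: `register_version` is a hypothetical helper, and it assumes that a version is a plain directory copied from an artifacts directory.

```python
import shutil
from pathlib import Path


def register_version(artifacts_dir: Path, version_dir: Path) -> Path:
    """Copy artifacts into a new version directory; never overwrite one.

    Hypothetical helper for illustration, not the package's API.
    """
    version_dir.parent.mkdir(parents=True, exist_ok=True)
    # copytree creates version_dir itself and raises FileExistsError if it
    # is already present, so an existing version is left untouched.
    shutil.copytree(artifacts_dir, version_dir)
    return version_dir
```

Registering the same version a second time raises `FileExistsError` instead of silently replacing files. A stricter implementation would probably also stage the copy in a temporary directory so that a partially copied version never becomes visible.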
99
+ ### 2) Integrity Verification
100
+ Every model package includes a SHA256 manifest:
101
+
102
+ - `sha256sums.txt` records SHA256 hashes of the package contents (excluding itself)
103
+
104
+ This enables:
105
+ - bit-level verification of stored artifacts
106
+ - tamper detection
107
+ - trustworthy deployment in regulated environments
108
+
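A manifest of this kind could be built and checked roughly as follows. This is a sketch under assumptions: the line format (`<hex digest>  <relative path>`, as produced by the `sha256sum` tool) and the function names are illustrative, since the actual format written by the registry is not shown here.

```python
import hashlib
from pathlib import Path

MANIFEST_NAME = "sha256sums.txt"


def sha256_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large model files need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compute_sums(package_dir: Path) -> dict[str, str]:
    """Map relative POSIX path -> hex digest, excluding the manifest itself."""
    sums = {}
    for path in sorted(package_dir.rglob("*")):
        rel = path.relative_to(package_dir).as_posix()
        if path.is_file() and rel != MANIFEST_NAME:
            sums[rel] = sha256_file(path)
    return sums


def write_manifest(package_dir: Path) -> None:
    """Write one '<digest>  <path>' line per file (assumed format)."""
    lines = [f"{d}  {rel}" for rel, d in compute_sums(package_dir).items()]
    (package_dir / MANIFEST_NAME).write_text("\n".join(lines) + "\n", encoding="utf-8")


def verify_manifest(package_dir: Path) -> bool:
    """Return True only if the files on disk match the manifest exactly."""
    expected = {}
    text = (package_dir / MANIFEST_NAME).read_text(encoding="utf-8")
    for line in text.splitlines():
        if line.strip():
            digest, rel = line.split(maxsplit=1)
            expected[rel] = digest
    return expected == compute_sums(package_dir)
```

Comparing the full mapping, rather than only the listed files, means that added or missing files are detected as well as modified ones.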
109
+ ### 3) Provenance-Friendly Metadata
110
+ Each model stores structured metadata via `model_meta.json`, such as:
111
+ - training code version (git commit)
112
+ - dataset reference (e.g., DVC / object store ref)
113
+ - hyperparameters and preprocessing
114
+ - creator and timestamp
115
+
116
+ ### 4) Promotion Workflow
117
+ Models move through controlled stages:
118
+
119
+ ```
120
+
121
+ latest → staging → production
122
+
123
+ ```
124
+
125
+ All promotions are:
126
+ - explicit
127
+ - append-only
128
+ - audited (`audit/promotions.jsonl`)
129
+
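One way to picture an explicit, append-only, audited promotion is the sketch below. It assumes the per-model layout described in this README (`versions/`, `aliases/<alias>.json`, `audit/promotions.jsonl`); the `promote` helper and the JSON field names are hypothetical, not taken from the package.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def promote(model_dir: Path, alias: str, version: str, actor: str, reason: str) -> dict:
    """Point an alias at an existing version and log the change."""
    if not (model_dir / "versions" / version).is_dir():
        raise FileNotFoundError(f"unknown version: {version}")
    record = {
        "ts": datetime.now(timezone.utc).isoformat(),
        "alias": alias,
        "version": version,
        "actor": actor,
        "reason": reason,
    }
    # The alias file is the only mutable pointer (field name is an assumption).
    alias_path = model_dir / "aliases" / f"{alias}.json"
    alias_path.parent.mkdir(parents=True, exist_ok=True)
    alias_path.write_text(json.dumps({"version": version}), encoding="utf-8")
    # Audit trail: one JSON object per line, opened in append mode only.
    audit_path = model_dir / "audit" / "promotions.jsonl"
    audit_path.parent.mkdir(parents=True, exist_ok=True)
    with audit_path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record, sort_keys=True) + "\n")
    return record
```

Because earlier lines of `promotions.jsonl` are never rewritten, the file preserves the full promotion history even after an alias has moved on to a newer version.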
130
+ ### 5) Storage Abstraction
131
+ v0.1.0 supports:
132
+ - **local filesystem backend** (`localfs`)
133
+
134
+ Planned:
135
+ - S3 / Azure Blob / enterprise on-prem backends
136
+
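A storage abstraction of this sort might look like the following sketch. The class and method names are hypothetical and are not taken from the package's `storage/base.py` or `storage/localfs.py`; the point is only that registry logic can talk to a small interface while the local filesystem is one implementation among several.

```python
from abc import ABC, abstractmethod
from pathlib import Path


class StorageBackend(ABC):
    """Minimal key/value view of a registry store (hypothetical interface)."""

    @abstractmethod
    def exists(self, key: str) -> bool: ...

    @abstractmethod
    def read_bytes(self, key: str) -> bytes: ...

    @abstractmethod
    def write_bytes(self, key: str, data: bytes) -> None: ...


class LocalFSBackend(StorageBackend):
    """Stores each key as a file below a root directory."""

    def __init__(self, root: Path) -> None:
        self.root = Path(root).resolve()

    def _path(self, key: str) -> Path:
        path = (self.root / key).resolve()
        # Reject keys such as "../x" that would escape the registry root.
        if not path.is_relative_to(self.root):
            raise ValueError(f"key escapes registry root: {key!r}")
        return path

    def exists(self, key: str) -> bool:
        return self._path(key).exists()

    def read_bytes(self, key: str) -> bytes:
        return self._path(key).read_bytes()

    def write_bytes(self, key: str, data: bytes) -> None:
        path = self._path(key)
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(data)
```

An object-store backend would implement the same three methods against bucket keys, leaving callers unchanged.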
137
+ ---
138
+
139
+ ## Repository Structure
140
+
141
+ ```
142
+
143
+ omnibioai-model-registry/
144
+ ├── omnibioai_model_registry/
145
+ │ ├── api.py
146
+ │ ├── config.py
147
+ │ ├── refs.py
148
+ │ ├── storage/
149
+ │ ├── package/
150
+ │ ├── audit/
151
+ │ ├── cli/
152
+ │ └── service/
153
+ ├── tests/
154
+ ├── pyproject.toml
155
+ └── README.md
156
+
157
+ ```
158
+
159
+ ---
160
+
161
+ ## Canonical Model Package Layout
162
+
163
+ Registered models follow a strict, portable structure:
164
+
165
+ ```
166
+
167
+ <OMNIBIOAI_MODEL_REGISTRY_ROOT>/
168
+ tasks/<task>/models/<model_name>/
169
+ versions/<version>/
170
+ model.pt
171
+ model_genes.txt
172
+ label_map.json
173
+ model_meta.json
174
+ metrics.json
175
+ feature_schema.json
176
+ sha256sums.txt
177
+ aliases/
178
+ latest.json
179
+ staging.json
180
+ production.json
181
+ audit/
182
+ promotions.jsonl
183
+
184
+ ```
185
+
186
+ This guarantees:
187
+ - deterministic loading
188
+ - integrity validation
189
+ - cross-environment portability
190
+
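Given this layout, resolving a `model@alias` or `model@version` reference reduces to a path lookup. The sketch below is illustrative only: `parse_ref` and `resolve` are hypothetical names, and the alias file is assumed to contain a JSON object with a `version` field.

```python
import json
from pathlib import Path

ALIASES = {"latest", "staging", "production"}


def parse_ref(ref: str) -> tuple[str, str]:
    """Split 'model@selector' into (model_name, selector)."""
    name, sep, selector = ref.partition("@")
    if not sep or not name or not selector:
        raise ValueError(f"expected 'model@alias' or 'model@version', got {ref!r}")
    return name, selector


def resolve(root: Path, task: str, ref: str) -> Path:
    """Return the version directory that a reference points to."""
    name, selector = parse_ref(ref)
    model_dir = root / "tasks" / task / "models" / name
    version = selector
    if selector in ALIASES:
        # Follow the alias pointer (file content format is an assumption).
        alias_file = model_dir / "aliases" / f"{selector}.json"
        version = json.loads(alias_file.read_text(encoding="utf-8"))["version"]
    version_dir = model_dir / "versions" / version
    if not version_dir.is_dir():
        raise FileNotFoundError(f"no such version: {version_dir}")
    return version_dir
```

A version reference always maps to the same directory, while an alias reference maps to whatever version the alias file currently names.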
191
+ ---
192
+
193
+ ## Install, Build, and Use as a Python Package
194
+
195
+ ### 1) Configure registry root
196
+ The registry requires a root directory:
197
+
198
+ ```bash
199
+ export OMNIBIOAI_MODEL_REGISTRY_ROOT=~/Desktop/machine/local_registry/model_registry
200
+ ```
201
+
202
+ ### 2) Install (editable) for development
203
+
204
+ From this repository root:
205
+
206
+ ```bash
207
+ pip install -e .
208
+ ```
209
+
210
+ Verify:
211
+
212
+ ```bash
213
+ python -c "import omnibioai_model_registry as m; print('OK', m.__file__)"
214
+ omr --help
215
+ ```
216
+
217
+ ### 3) Build a wheel (distribution)
218
+
219
+ Install build tooling:
220
+
221
+ ```bash
222
+ pip install build
223
+ ```
224
+
225
+ Build:
226
+
227
+ ```bash
228
+ python -m build
229
+ ```
230
+
231
+ Artifacts are written to `dist/`:
232
+
233
+ * `dist/omnibioai_model_registry-0.1.0-py3-none-any.whl`
234
+ * `dist/omnibioai_model_registry-0.1.0.tar.gz`
235
+
236
+ Install the wheel:
237
+
238
+ ```bash
239
+ pip install dist/*.whl
240
+ ```
241
+
242
+ ---
243
+
244
+ ## CLI Usage (`omr`)
245
+
246
+ ### Register a model package
247
+
248
+ ```bash
249
+ omr register \
250
+ --task celltype_sc \
251
+ --model human_pbmc \
252
+ --version 2026-02-14_001 \
253
+ --artifacts /tmp/model_pkg \
254
+ --set-alias latest
255
+ ```
256
+
257
+ ### Resolve a model reference
258
+
259
+ ```bash
260
+ omr resolve --task celltype_sc --ref human_pbmc@latest
261
+ ```
262
+
263
+ ### Promote a version to production
264
+
265
+ ```bash
266
+ omr promote --task celltype_sc --model human_pbmc --version 2026-02-14_001 --alias production
267
+ ```
268
+
269
+ ### Verify integrity
270
+
271
+ ```bash
272
+ omr verify --task celltype_sc --ref human_pbmc@production
273
+ ```
274
+
275
+ ### Show metadata
276
+
277
+ ```bash
278
+ omr show --task celltype_sc --ref human_pbmc@production --json
279
+ ```
280
+
281
+ ---
282
+
283
+ ## Python API Usage
284
+
285
+ ```python
286
+ from omnibioai_model_registry import register_model, resolve_model, promote_model
287
+
288
+ register_model(
289
+ task="celltype_sc",
290
+ model_name="human_pbmc",
291
+ version="2026-02-14_001",
292
+ artifacts_dir="/tmp/model_pkg",
293
+ metadata={
294
+ "framework": "pytorch",
295
+ "model_type": "classifier",
296
+ "provenance": {
297
+ "git_commit": "abc123",
298
+ "training_data_ref": "s3://bucket/datasets/pbmc_v1",
299
+ "trainer_version": "0.1.0",
300
+ },
301
+ },
302
+ set_alias="latest",
303
+ actor="manish",
304
+ reason="initial training",
305
+ )
306
+
307
+ # Resolve by alias (or version)
308
+ path = resolve_model("celltype_sc", "human_pbmc@latest", verify=True)
309
+ print("Resolved model dir:", path)
310
+
311
+ # Promote to production
312
+ promote_model(
313
+ task="celltype_sc",
314
+ model_name="human_pbmc",
315
+ alias="production",
316
+ version="2026-02-14_001",
317
+ actor="manish",
318
+ reason="validated metrics",
319
+ )
320
+ ```
321
+
322
+ ---
323
+
324
+ ## Minimal REST Service (FastAPI)
325
+
326
+ ### Run locally
327
+
328
+ ```bash
329
+ pip install -r omnibioai_model_registry/service/requirements.txt
330
+ uvicorn omnibioai_model_registry.service.app.main:app --host 0.0.0.0 --port 8095
331
+ ```
332
+
333
+ Test:
334
+
335
+ ```bash
336
+ curl -s http://127.0.0.1:8095/health | python -m json.tool
337
+ ```
338
+
339
+ Endpoints:
340
+
341
+ * `POST /v1/register`
342
+ * `GET /v1/resolve`
343
+ * `POST /v1/promote`
344
+ * `POST /v1/verify`
345
+ * `GET /v1/show`
346
+
347
+ ---
348
+
349
+ ## Testing
350
+
351
+ ```bash
352
+ pip install -e ".[dev]"
353
+ pytest -q
354
+ ```
355
+
356
+ ---
357
+
358
+ ## Relationship to OmniBioAI Ecosystem
359
+
360
+ This registry is a **control-plane component** of OmniBioAI.
361
+
362
+ Companion repositories:
363
+
364
+ * **omnibioai** → AI-powered bioinformatics workbench
365
+ * **omnibioai-tes** → execution orchestration across local/HPC/cloud
366
+ * **omnibioai-rag** → reasoning and literature intelligence
367
+ * **omnibioai-lims** → laboratory data management
368
+ * **omnibioai-workflow-bundles** → reproducible pipelines
369
+ * **omnibioai-sdk** → Python client access
370
+
371
+ The **Model Registry** provides the **AI artifact governance layer** shared by all.
372
+
373
+ ---
374
+
375
+ ## Roadmap
376
+
377
+ ### Near Term
378
+
379
+ * additional storage backends (S3 / Azure)
380
+ * expanded metadata validation + schemas
381
+ * model listing and metadata search APIs
382
+
383
+ ### Mid Term
384
+
385
+ * RBAC and governance controls
386
+ * richer registry service APIs (auth, pagination, filtering)
387
+ * comparison and promotion policies
388
+
389
+ ### Long Term
390
+
391
+ * enterprise biomedical AI governance platform
392
+ * regulatory-ready audit and lineage
393
+ * deeper integration with experiment tracking and clinical pipelines
394
+
395
+ ---
396
+
397
+ ## Status
398
+
399
+ * ✅ Immutable and verifiable model storage
400
+ * ✅ Audit-ready promotion workflow
401
+ * ✅ CLI + minimal REST service
402
+ * ✅ Local-first, cloud-ready design
403
+
404
+ **OmniBioAI Model Registry establishes the foundation for trustworthy, reproducible biomedical AI deployment.**
405
+