coreai-catalog 2.0.2__tar.gz
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- coreai_catalog-2.0.2/PKG-INFO +658 -0
- coreai_catalog-2.0.2/README.md +632 -0
- coreai_catalog-2.0.2/coreai_catalog/__init__.py +12 -0
- coreai_catalog-2.0.2/coreai_catalog/__main__.py +6 -0
- coreai_catalog-2.0.2/coreai_catalog/api.py +254 -0
- coreai_catalog-2.0.2/coreai_catalog/catalog.py +546 -0
- coreai_catalog-2.0.2/coreai_catalog/cli.py +1176 -0
- coreai_catalog-2.0.2/coreai_catalog/data/artifacts.yaml +1110 -0
- coreai_catalog-2.0.2/coreai_catalog/data/benchmarks.yaml +933 -0
- coreai_catalog-2.0.2/coreai_catalog/data/catalog.yaml +3885 -0
- coreai_catalog-2.0.2/coreai_catalog/data/schema/artifact.schema.json +107 -0
- coreai_catalog-2.0.2/coreai_catalog/data/schema/benchmark.schema.json +77 -0
- coreai_catalog-2.0.2/coreai_catalog/data/schema/model.schema.json +240 -0
- coreai_catalog-2.0.2/coreai_catalog/data/schema/term.schema.json +68 -0
- coreai_catalog-2.0.2/coreai_catalog/data/schema/upstream.schema.json +59 -0
- coreai_catalog-2.0.2/coreai_catalog/data/sources.yaml +213 -0
- coreai_catalog-2.0.2/coreai_catalog/data/terms.yaml +391 -0
- coreai_catalog-2.0.2/coreai_catalog/data/upstreams.yaml +673 -0
- coreai_catalog-2.0.2/coreai_catalog/exports.py +194 -0
- coreai_catalog-2.0.2/coreai_catalog/installer.py +280 -0
- coreai_catalog-2.0.2/coreai_catalog/task_pages.py +193 -0
- coreai_catalog-2.0.2/coreai_catalog.egg-info/PKG-INFO +658 -0
- coreai_catalog-2.0.2/coreai_catalog.egg-info/SOURCES.txt +31 -0
- coreai_catalog-2.0.2/coreai_catalog.egg-info/dependency_links.txt +1 -0
- coreai_catalog-2.0.2/coreai_catalog.egg-info/entry_points.txt +3 -0
- coreai_catalog-2.0.2/coreai_catalog.egg-info/requires.txt +6 -0
- coreai_catalog-2.0.2/coreai_catalog.egg-info/top_level.txt +2 -0
- coreai_catalog-2.0.2/mcp_server/__init__.py +1 -0
- coreai_catalog-2.0.2/mcp_server/server.py +554 -0
- coreai_catalog-2.0.2/pyproject.toml +50 -0
- coreai_catalog-2.0.2/setup.cfg +4 -0
- coreai_catalog-2.0.2/tests/test_error_resilience.py +386 -0
- coreai_catalog-2.0.2/tests/test_public_api.py +142 -0
@@ -0,0 +1,658 @@
Metadata-Version: 2.4
Name: coreai-catalog
Version: 2.0.2
Summary: Discover, compare, and install Core AI models for Apple Silicon.
Author: Kevin Saltarelli
License: MIT
Project-URL: Homepage, https://github.com/kevinqz/coreai-catalog
Project-URL: Repository, https://github.com/kevinqz/coreai-catalog
Project-URL: Issues, https://github.com/kevinqz/coreai-catalog/issues
Project-URL: Changelog, https://github.com/kevinqz/coreai-catalog/blob/main/CHANGELOG.md
Keywords: apple,core-ai,core-ml,on-device,ai,model-catalog,mcp,apple-silicon,llm
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.10
Description-Content-Type: text/markdown
Requires-Dist: PyYAML>=6.0
Requires-Dist: requests>=2.28
Requires-Dist: jsonschema>=4.0
Provides-Extra: mcp
Requires-Dist: mcp>=1.0; extra == "mcp"

# Core AI Catalog

[License: MIT](https://opensource.org/licenses/MIT)
[Python](https://www.python.org/downloads/)

**[Live site: kevinqz.github.io/coreai-catalog](https://kevinqz.github.io/coreai-catalog/)**: searchable web UI with model cards, filters, and benchmarks.

A compact, source-grounded catalog of Apple Core AI models, artifacts, upstreams, benchmarks, provenance and a verified Apple AI terminology layer.

Core AI Catalog maps Apple Core AI-compatible model artifacts with granular metadata, source links, Hugging Face artifact references, GitHub/Hugging Face attribution, runtime requirements, device support, benchmark records and verification status.

> YAML is the source of truth. Markdown is the human view. JSON is the generated machine/API export.

## Scope and disclaimer

This catalog tracks **open-source models and their Apple Core AI artifacts** (provenance, runtime, licenses and benchmarks), plus a verified reference layer of Apple AI terminology grounded in official Apple sources. It does not redistribute model weights, re-document Apple's APIs, or treat Apple's proprietary Foundation Models as downloadable artifacts.

Not affiliated with or endorsed by Apple. `commercial_use` fields are triage labels, not legal advice or permissions; always verify the upstream model, code and artifact licenses yourself.

## Status

**Version:** v2.0.2

79 Apple Core AI models with artifact provenance, benchmarks, verified terminology, readiness scores, and an MCP server for agent-native model discovery, comparison, and recommendation. Agent-ready: CLI, MCP server, JSON exports, llms.txt and openapi.yaml, all from the same engine.

## Quick Start

```bash
# Install from GitHub (PyPI coming soon)
pip install git+https://github.com/kevinqz/coreai-catalog.git

# Find the right model for your task
coreai-catalog recommend --task "private OCR on iPhone" --license likely

# Install it (downloads .aimodel from Hugging Face)
coreai-catalog install unlimited-ocr

# Compare alternatives
coreai-catalog compare unlimited-ocr qwen3-vl-2b
```

See [`examples/`](./examples/) for Swift integration snippets (OCR, VLM chat, embeddings/RAG).

## Why this exists

Apple Core AI model artifacts are spread across upstream repositories, model cards, official recipe conversions, community ports and Hugging Face artifact repos. This project organizes that information into a compact, machine-readable catalog that can be consumed by humans, agents and automation.

The goal is not to run models directly. The goal is to know, precisely and traceably:

- what model exists
- where it came from
- what it can do
- what it receives and outputs
- where the artifact is hosted
- who should be credited
- whether it is an official Apple recipe conversion or a community zoo port
- what runtime/device constraints are known
- which benchmark records exist
- which fields are confirmed and which remain unknown

## Current scope

| Area | Count / status |
|---|---:|
| Model records | 79 |
| Artifact provenance records | 79 |
| Source records | 21 |
| Main upstreams | 2 |
| Upstream taxonomy entries | 66 |
| Benchmark records | 66 |
| Terminology records | 42 |
| JSON exports | generated via script |

Main upstreams:

- `john-rocky/coreai-model-zoo`
- `apple/coreai-models`

Primary Hugging Face artifact owner currently mapped:

- `mlboydaisuke`

## Repository structure

```txt
coreai-catalog/
├── README.md
├── AGENTS.md
├── CONTRIBUTING.md
├── CREDITS.md
├── pyproject.toml
├── catalog.yaml
├── artifacts.yaml
├── sources.yaml
├── upstreams.yaml
├── benchmarks.yaml
├── terms.yaml
├── requirements.txt
├── schema/
│   ├── model.schema.json
│   ├── artifact.schema.json
│   ├── upstream.schema.json
│   ├── benchmark.schema.json
│   └── term.schema.json
├── scripts/
│   ├── validate.py
│   ├── audit.py
│   ├── deep_audit.py
│   ├── derive_fields.py
│   ├── generate.py
│   ├── sync_upstream.py
│   └── check_sources.sh
├── coreai_catalog/
│   ├── __init__.py
│   ├── __main__.py
│   ├── cli.py
│   ├── catalog.py
│   ├── exports.py
│   └── installer.py
├── mcp_server/
│   ├── __init__.py
│   └── server.py
├── skills/
│   ├── coreai-model-selection/
│   └── coreai-license-triage/
├── llms.txt
├── docs/
│   ├── index.md
│   ├── model-registry.md
│   ├── capability-matrix.md
│   ├── runtime-matrix.md
│   ├── artifact-provenance.md
│   ├── upstream-map.md
│   ├── benchmark-map.md
│   ├── source-map.md
│   ├── apple-terminology-map.md
│   ├── data-model.md
│   ├── compare/
│   ├── v0.3-verification.md
│   ├── sota-maintenance.md
│   └── generated-files.md
└── .github/
    └── workflows/
        └── validate.yml
```

JSON exports are generated by `scripts/generate.py` and committed to `dist/`. They are available via raw GitHub URLs (e.g. `https://raw.githubusercontent.com/kevinqz/coreai-catalog/main/dist/catalog.json`) without cloning the repo.

## Source of truth

| File | Purpose |
|---|---|
| `catalog.yaml` | Model facts: name, family, capabilities, modalities, size, runtime, device support, license status and verification status. Measurements live in `benchmarks.yaml`, not here. |
| `artifacts.yaml` | Converted artifact provenance: GitHub conversion source, Hugging Face owner/repo/url and official recipe status. |
| `sources.yaml` | Compact registry of primary/supporting sources already used by the catalog. |
| `upstreams.yaml` | Source taxonomy for framework, conversion, artifact host, benchmark, sample, original model and license sources. |
| `benchmarks.yaml` | Normalized benchmark records by model, metric, device, compute unit and source. |
| `terms.yaml` | Verified Apple AI terminology, tagged by ecosystem layer, each citing an official Apple source. |
| `CREDITS.md` | Human-readable attribution for GitHub and Hugging Face users/repositories. |
| `schema/*.json` | Validation contracts for model, artifact, upstream, benchmark and term records. |
| `docs/*.md` | Generated or curated human views. |
| `dist/*.json` | Generated machine-readable exports. |

## Core data model

A model entry in `catalog.yaml` represents model metadata:

```yaml
- id: qwen3-5-0-8b
  name: Qwen3.5-0.8B
  family: Qwen
  source_group: zoo
  capabilities:
    - chat
    - text-generation
  modalities:
    input:
      - text
    output:
      - text
  artifact:
    format: aimodel
    availability: available
  runtime:
    runtime_name: apple-core-ai
    runner: CoreAIRunner
  status: confirmed
  confidence: medium
```

An artifact entry in `artifacts.yaml` represents converted artifact provenance and hosting:

```yaml
- id: qwen3-5-0-8b
  group: zoo
  github:
    owner: john-rocky
    repo: coreai-model-zoo
    path: https://github.com/john-rocky/coreai-model-zoo/blob/main/zoo/qwen3.5.md
  huggingface:
    owner: mlboydaisuke
    repo: qwen3.5-0.8B-CoreAI
    url: https://huggingface.co/mlboydaisuke/qwen3.5-0.8B-CoreAI
  officiality:
    apple_export_recipe: false
    apple_hosted_artifact: false
    community_packaged: true
```
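
The two records above share the same `id`, which suggests that model facts and artifact provenance can be joined on that key. The following is a minimal sketch under that assumption; the in-memory `MODELS` and `ARTIFACTS` lists and the helper names `join_by_id` and `huggingface_url` are hypothetical and are not part of the package's API.

```python
# Hypothetical in-memory copies of the two example records shown above.
MODELS = [
    {"id": "qwen3-5-0-8b", "name": "Qwen3.5-0.8B", "source_group": "zoo"},
]
ARTIFACTS = [
    {
        "id": "qwen3-5-0-8b",
        "huggingface": {"owner": "mlboydaisuke", "repo": "qwen3.5-0.8B-CoreAI"},
    },
]


def join_by_id(models, artifacts):
    """Pair each model with its artifact record (None when no artifact matches)."""
    artifacts_by_id = {artifact["id"]: artifact for artifact in artifacts}
    return [(model, artifacts_by_id.get(model["id"])) for model in models]


def huggingface_url(artifact):
    """Rebuild the Hugging Face repo URL from the owner and repo fields."""
    hf = artifact["huggingface"]
    return f"https://huggingface.co/{hf['owner']}/{hf['repo']}"


for model, artifact in join_by_id(MODELS, ARTIFACTS):
    location = huggingface_url(artifact) if artifact else "no artifact record"
    print(model["name"], "->", location)
```

Keeping the join outside the records themselves mirrors the split described here: `catalog.yaml` stays focused on model facts and `artifacts.yaml` on hosting.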

An upstream entry in `upstreams.yaml` represents source taxonomy:

```yaml
- id: qwen
  title: Qwen original model family
  category: original_model
  platform: huggingface
  owner: Qwen
  url: https://huggingface.co/Qwen
  trust: original_model_primary
  applies_to:
    - qwen3-5-0-8b
    - qwen3-vl-2b
```

A benchmark entry in `benchmarks.yaml` represents a normalized measurement:

```yaml
- id: qwen3-5-0-8b-iphone17pro-gpu-toks
  model_id: qwen3-5-0-8b
  metric: decode_throughput
  unit: tokens_per_second
  value: 71.9
  device: iPhone 17 Pro
  compute_unit: GPU
  environment: iOS 27 beta, coreai-pipelined engine
  observed: '2026-06-25'
  source: john-rocky-coreai-model-zoo
  confidence: medium
```

Measurements are the single source of truth in `benchmarks.yaml` (model records carry no inline numbers). Each row is environment-scoped and append-only: values that differ across OS/runtime versions are kept as separate dated records, and a superseded value is retained with `confidence: needs_review` and a `superseded_by` pointer rather than overwritten.
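
One way to read an append-only registry like this is to follow `superseded_by` pointers until a record without one is reached. The sketch below assumes `superseded_by` holds the `id` of the replacing record; the record ids and values are made-up placeholders, and `current_record` is a hypothetical helper rather than a function from the package.

```python
# Placeholder records: "bench-a" was superseded by "bench-b".
RECORDS = [
    {"id": "bench-a", "value": 1.0, "confidence": "needs_review",
     "superseded_by": "bench-b"},
    {"id": "bench-b", "value": 2.0, "confidence": "medium"},
]


def current_record(records, record_id):
    """Follow superseded_by pointers to the newest record in the chain."""
    by_id = {record["id"]: record for record in records}
    seen = set()
    record = by_id[record_id]
    while record.get("superseded_by"):
        if record["id"] in seen:
            # Guard against a malformed chain that points back at itself.
            raise ValueError("superseded_by cycle at " + record["id"])
        seen.add(record["id"])
        record = by_id[record["superseded_by"]]
    return record


print(current_record(RECORDS, "bench-a")["id"])
```

Because the older row is kept, the history of a measurement stays inspectable while readers can still resolve the value that currently applies.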

## Source layers

| Layer | File/category | Purpose |
|---|---|---|
| Model facts | `catalog.yaml` | What the model is and what it does. |
| Converted artifact | `artifacts.yaml` | Where the Core AI artifact lives and who converted/hosts it. |
| Framework/runtime | `upstreams.yaml > framework_sources` | Apple Core AI, Core ML and tooling context. |
| Original model | `upstreams.yaml > original_model_sources` | Original creators/model-family sources. |
| License | `upstreams.yaml > license_sources` | License documents and review flags. |
| Benchmarks | `benchmarks.yaml` | Measurement rows, source IDs and confidence. |
| Human docs | `docs/*.md` | Tables, maps and curated summaries. |
| Machine exports | `dist/*.json` | Generated JSON outputs for agents/APIs. |

## Model groups

| Group | Meaning |
|---|---|
| `zoo` | Community model port from `john-rocky/coreai-model-zoo`. |
| `official` | Artifact described upstream as an Apple official recipe conversion from `apple/coreai-models`. |
| `external` | External source, not yet used by the current catalog. |
| `unknown` | Not classified yet. |

## Official Apple recipe conversions

Entries with `source_group: official` in `catalog.yaml` and `officiality.apple_export_recipe: true` in `artifacts.yaml` are treated as official Apple recipe conversion artifacts. The `officiality` block disambiguates *official of what*: `apple_export_recipe` (converted via an Apple recipe), `apple_hosted_artifact` (Apple hosts the artifact; `false` for all current entries), and `community_packaged` (packaged/hosted by the community).
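
The rule in the paragraph above can be expressed as a small check over a model record and its artifact record. This is only a sketch of that rule: `describe_officiality` is a hypothetical helper, and the second test record is invented to show the `official` case.

```python
def describe_officiality(model, artifact):
    """Apply the stated rule: official group plus apple_export_recipe flag."""
    flags = artifact.get("officiality", {})
    is_recipe = flags.get("apple_export_recipe") is True
    return {
        "official_recipe_conversion":
            model.get("source_group") == "official" and is_recipe,
        "apple_hosted": flags.get("apple_hosted_artifact") is True,
        "community_packaged": flags.get("community_packaged") is True,
    }


# The community zoo example from the data model section.
zoo_model = {"id": "qwen3-5-0-8b", "source_group": "zoo"}
zoo_artifact = {
    "id": "qwen3-5-0-8b",
    "officiality": {
        "apple_export_recipe": False,
        "apple_hosted_artifact": False,
        "community_packaged": True,
    },
}
print(describe_officiality(zoo_model, zoo_artifact))
```

Returning three separate booleans, instead of a single "official" flag, keeps the distinction between recipe origin, hosting and packaging visible to the caller.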

These entries credit:

- GitHub source: `apple/coreai-models`
- Artifact host: `mlboydaisuke` on Hugging Face

Current official entries include:

- gpt-oss-20B
- Qwen3 0.6B
- Qwen3 4B
- Qwen3 8B
- Gemma 3 4B IT
- Gemma 3 12B IT
- Mistral 7B v0.3
- FLUX.2 klein 4B
- SAM 3
- Whisper large-v3-turbo

## Original model attribution

Original model creators are tracked separately from converted artifact hosts. This avoids conflating:

- original model creator
- Apple official recipe source
- community conversion source
- Hugging Face artifact host
- license source

Examples:

| Model family | Original upstream | Converted artifact host |
|---|---|---|
| Qwen | `Qwen` | `mlboydaisuke` |
| Gemma | `google` | `mlboydaisuke` |
| Mistral | `mistralai` | `mlboydaisuke` |
| SAM | `facebook` / Meta | `mlboydaisuke` |
| RF-DETR | `Roboflow` | `mlboydaisuke` |

See `upstreams.yaml` and `docs/upstream-map.md`.

## Capabilities covered

The catalog currently covers:

- chat / text generation
- instruction following
- reasoning / agentic LLMs
- MoE LLMs
- 1.58-bit ternary LLMs
- vision-language models
- GUI grounding / computer use
- document OCR
- visual document retrieval (ColBERT / MaxSim)
- audio understanding
- text-to-speech
- speech-to-text (ASR + transducer / TDT)
- embeddings
- reranking
- image-text similarity (CLIP)
- object detection
- instance segmentation
- promptable segmentation
- monocular depth
- image generation
- super-resolution
- text-to-video
- image-to-3D (Gaussian splatting)
- text-to-audio (generative music)
- diffusion LLMs (dLLM)
- vision-language-action (VLA / robotics)

## Devices and runtime metadata

The catalog tracks known runtime/device facts when available:

- Apple Core AI artifact format
- `.aimodel` availability
- stock runtime vs community runtime
- runner name
- tokenizer requirement
- processor requirement
- custom Metal kernel requirement
- patch/workaround requirement
- AOT requirement
- iPhone/iPad/Mac support
- Mac-only status

Unknown or unverified values are intentionally kept as `unknown` instead of guessed.

## Validation and generation

Install dependencies:

```bash
pip install -r requirements.txt
```

Validate records:

```bash
python scripts/validate.py
```

Regenerate Markdown docs:

```bash
python scripts/generate.py --docs
```

Export JSON, search indexes, and readiness scores:

```bash
python scripts/generate.py --json
```

Or generate everything at once:

```bash
python scripts/generate.py
```

The GitHub Actions workflow runs validation, generation, a CLI smoke test, and an MCP assertion on every push and pull request.
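
As an illustration of the kind of consistency check a validation step could include, the sketch below reports benchmark rows whose `model_id` has no matching model record. It is not taken from `scripts/validate.py`, whose actual checks are not shown here; `dangling_model_ids` and the sample records are hypothetical.

```python
def dangling_model_ids(models, benchmarks):
    """Return benchmark model_id values that match no model record, sorted."""
    known_ids = {model["id"] for model in models}
    missing = {
        row["model_id"] for row in benchmarks if row["model_id"] not in known_ids
    }
    return sorted(missing)


# Placeholder data: the second benchmark row points at a model that does not exist.
MODELS = [{"id": "qwen3-5-0-8b"}]
BENCHMARKS = [
    {"id": "qwen3-5-0-8b-iphone17pro-gpu-toks", "model_id": "qwen3-5-0-8b"},
    {"id": "orphan-row", "model_id": "missing-model"},
]

problems = dangling_model_ids(MODELS, BENCHMARKS)
if problems:
    print("benchmarks reference unknown models:", ", ".join(problems))
```

A cross-file check like this complements per-record schema validation, which cannot see references between `benchmarks.yaml` and `catalog.yaml`.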

## CLI

Install the CLI for the full experience:

```bash
pip install -e .
```

### Commands

```bash
# Discover models
coreai-catalog search --capability vision-language --device iphone
coreai-catalog list           # all models, sorted by readiness score
coreai-catalog scores         # 0-100 readiness scores with grade distribution
coreai-catalog capabilities   # list all capabilities with model counts

# Inspect a model
coreai-catalog show qwen3-vl-2b                    # full details: caps, devices, runtime, provenance, benchmarks
coreai-catalog show qwen3-vl-2b -v                 # verbose: full notes, not truncated
coreai-catalog compare qwen3-vl-2b unlimited-ocr   # side-by-side

# Get recommendations
coreai-catalog recommend --task "robot vision" --device iphone
coreai-catalog recommend --task "private on-device OCR" --device iphone
coreai-catalog recommend --task "voice assistant" --device mac

# Install a model (downloads from Hugging Face, writes manifest + Swift snippet)
coreai-catalog install qwen3-vl-2b             # downloads artifact, generates snippet.swift
coreai-catalog install qwen3-vl-2b --dry-run   # preview download size without downloading
coreai-catalog installed                       # list locally installed models
coreai-catalog uninstall qwen3-vl-2b

# Check your environment
coreai-catalog doctor   # checks Python, Xcode, coreai-torch, coreai-opt, HF CLI, disk
```

All commands support `--json` for programmatic consumption by agents and automation.
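
For automation, the `--json` flag can be combined with a subprocess call. The sketch below assumes the `coreai-catalog` executable is on `PATH`, that `--json` is accepted after the other arguments, and that stdout contains a single JSON document; the shape of that document is not described here, so the result is returned undecoded beyond `json.loads`. `build_command` and `run_json` are hypothetical helper names.

```python
import json
import subprocess


def build_command(subcommand, *args):
    """Assemble the argument list for one CLI call with --json appended."""
    return ["coreai-catalog", subcommand, *args, "--json"]


def run_json(subcommand, *args):
    """Run the CLI and parse its stdout as JSON (raises if the command fails)."""
    completed = subprocess.run(
        build_command(subcommand, *args),
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(completed.stdout)


# Show the command that would be executed, without running it.
print(build_command("search", "--capability", "vision-language", "--device", "iphone"))
```

With the CLI installed, `run_json("search", "--capability", "vision-language", "--device", "iphone")` would execute the first command from the list above and return the parsed output.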

## MCP server (Agent API)

The catalog ships an [MCP server](https://modelcontextprotocol.io/) that exposes 11 tools to AI agents (Claude Desktop, Cursor, any MCP-compatible client).

### Setup

```bash
pip install -e ".[mcp]"
```

### Configure in Claude Desktop

Add to `~/Library/Application Support/Claude/claude_desktop_config.json`:

```json
{
  "mcpServers": {
    "coreai-catalog": {
      "command": "python",
      "args": ["mcp_server/server.py"]
    }
  }
}
```

Or use the installed entry point:

```json
{
  "mcpServers": {
    "coreai-catalog": {
      "command": "coreai-catalog-mcp"
    }
  }
}
```
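
If the config file already lists other servers, the entry has to be merged rather than pasted over the file. The sketch below shows one way to do that with the standard library; `add_mcp_server` is a hypothetical helper, and the only structure it assumes is the `mcpServers` mapping shown in the two snippets above.

```python
import json
from pathlib import Path


def add_mcp_server(config_path, name, command, args=None):
    """Add or update one entry under mcpServers, keeping existing entries."""
    path = Path(config_path)
    text = path.read_text() if path.exists() else ""
    config = json.loads(text) if text.strip() else {}

    entry = {"command": command}
    if args:
        entry["args"] = list(args)

    # setdefault creates the mapping on first use and reuses it afterwards.
    config.setdefault("mcpServers", {})[name] = entry
    path.write_text(json.dumps(config, indent=2) + "\n")
    return config
```

For example, `add_mcp_server(path, "coreai-catalog", "coreai-catalog-mcp")` writes the entry-point variant shown above, where `path` is the Claude Desktop config location given earlier in this section.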

### Available tools

| Tool | Description |
|---|---|
| `search_models` | Filter by capability, device, license, family, source, modality |
| `get_model` | Full model details: capabilities, runtime, provenance, benchmarks |
| `compare_models` | Side-by-side comparison of 2+ models |
| `recommend_model` | Task-based recommendations (89 task synonyms mapped) |
| `check_license` | License and commercial use triage for a model |
| `get_benchmarks` | All benchmark records for a model |
| `get_artifact` | Artifact provenance and download info |
| `explain_term` | Apple AI terminology lookup (42 verified terms) |
| `get_capabilities` | List all capabilities with model counts |
| `get_tasks` | List all supported task synonyms and their mappings |
| `get_version` | Catalog version, model count, last-verified date |

### Example agent interaction

```
User: I need a vision-language model that runs on iPhone for robot perception.

Agent calls: search_models(capability="vision-language", device="iphone")
→ Returns 6 candidates with readiness scores

Agent calls: compare_models(["qwen3-vl-2b", "minicpm-v-4-6"])
→ Returns side-by-side comparison

Agent calls: check_license("qwen3-vl-2b")
→ Returns Apache-2.0, commercial_use: likely

Agent recommends: Qwen3-VL 2B (benchmarked, iPhone-supported, Apache-2.0)
```

## Query and decision

All query and decision tools are built into the CLI and the MCP server (both described above). There is no separate `scripts/query.py` or `scripts/recommend.py`; the CLI is the single entry point for both humans and automation.

## Documentation

`generated` docs are produced from the YAML source by scripts and must not be hand-edited; `curated` docs are maintained manually (see `docs/generated-files.md`).

| Doc | Type | Description |
|---|---|---|
| `docs/getting-started.md` | curated | 60-second to 10-minute walkthrough |
| `docs/index.md` | generated | Docs entry point and counts (`scripts/generate.py`). |
| `docs/model-registry.md` | generated | Human-readable model table (`scripts/generate.py`). |
| `docs/artifact-provenance.md` | generated | Artifact ownership and hosting view (`scripts/generate.py`). |
| `docs/apple-terminology-map.md` | generated | Verified Apple AI terminology by layer (`scripts/generate.py`). |
| `docs/tasks/` | generated | Per-capability task pages with model tables (`scripts/generate.py`). |
| `docs/concepts/` | curated | Model vs artifact, runtime landscape, license risk, benchmark quality. |
| `docs/data-model.md` | curated | Entity model and relationship documentation. |
| `docs/capability-matrix.md` | curated | Models grouped by capability. |
| `docs/runtime-matrix.md` | curated | Runtime concepts and flags. |
| `docs/upstream-map.md` | curated | Framework/original-model/license upstream map. |
| `docs/benchmark-map.md` | curated | Benchmark registry explanation. |
| `docs/source-map.md` | curated | Source and upstream map. |
| `docs/sota-maintenance.md` | curated | Maintenance plan and data-model direction. |
| `docs/generated-files.md` | curated | Generated vs curated file policy. |
| `PROJECT_PHILOSOPHY.md` | curated | Why the project exists, design principles, non-goals. |

## Attribution

This project is a catalog and attribution layer. It does not claim ownership of upstream model artifacts or source repositories.

Primary credits are recorded in:

- `CREDITS.md`
- `sources.yaml`
- `artifacts.yaml`
- `upstreams.yaml`

Key credited sources include:

- `john-rocky/coreai-model-zoo`
- `john-rocky/CoreML-Models`
- `apple/coreai-models`
- `apple/coremltools`
- `john-rocky/apple-silicon-llm-bench`
- `john-rocky/coreai-samples`
- Hugging Face user `mlboydaisuke`
- original model creators listed in `upstreams.yaml`

## License handling

Licenses are tracked per model when known. Some entries are marked as `check_license` when commercial-use terms need explicit review.

Important rule:

> The repository license, upstream code license, model license and artifact-hosting license may differ.

For sensitive licenses such as Gemma Terms, Meta SAM License, LFM Open License or OpenRAIL-style licenses, treat `commercial_use: check_license` as requiring manual review before use.
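
A consumer of the catalog could turn this rule into a simple gate. The sketch below assumes each record exposes a flat `commercial_use` field and treats anything other than `likely` (including a missing value) as needing manual review; the field location and the helper name `requires_manual_review` are assumptions, and `likely` and `check_license` are the only values taken from this document.

```python
def requires_manual_review(record):
    """Return True unless commercial_use is explicitly 'likely'."""
    status = record.get("commercial_use", "unknown")
    # 'check_license', 'unknown' and any unrecognized label all fall through
    # to manual review, so the gate fails closed.
    return status != "likely"


for record in (
    {"id": "example-a", "commercial_use": "likely"},
    {"id": "example-b", "commercial_use": "check_license"},
    {"id": "example-c"},
):
    label = "manual review" if requires_manual_review(record) else "triaged as likely"
    print(record["id"], "->", label)
```

Even a `likely` result is only a triage label: the upstream model, code and artifact licenses still need to be verified before use.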

## Maintenance rules

1. One meaningful model variant should have one catalog entry.
2. Do not collapse variants when size, device support, runtime, quantization, license or artifact changes.
3. Use `unknown` instead of guessing.
4. Keep `catalog.yaml` focused on model facts.
5. Keep `artifacts.yaml` focused on converted artifact provenance and hosting.
6. Keep `upstreams.yaml` focused on original model, framework, license and benchmark sources.
7. Keep `benchmarks.yaml` focused on normalized measurement records.
8. Keep `sources.yaml` focused on the compact source registry.
9. Generate Markdown and JSON views from YAML whenever possible.
10. Credit original model creator, conversion source and artifact host separately.
11. Update `last_verified` when a source is rechecked.

## Roadmap

Current milestone:

- v2.0.0: Web UI (GitHub Pages): model explorer, task browser, filters, search.

Earlier:

- v1.7.0: Public Python library API (`from coreai_catalog import Catalog`), schema versioning docs.
- v1.6.0: Task-first discovery: `tasks` command, `recommend --explain`, enriched MCP get_tasks.
- v1.5.0: Structured docs (philosophy, getting-started, concepts, task pages), community templates, issue templates.
- v1.4.0: PyPI-ready, 60-second demo, Swift examples, recommend redesign.
- v1.3.x: RWKV-7 Goose 1.5B, source-monitor cron, 3-round red-team, dist/ committed, docs sync.
- v1.3.0: CLI↔MCP parity, TASK_MAP expanded 40→89, `version` command, terminology alignment ("Core AI").
- v1.2.x: Fuzzy search, capability aliases, ANSI auto-detect, recommend --license, installer hardening, DX improvements.
- v1.0: Error resilience: 8 crash fixes + 63-test suite + CI integration.
- v0.6: Technical backfill (precision, quantization, runtime flags), non-LLM benchmarks, terminology to 42 terms.
- v0.5: Expanded model coverage: ternary LLM, GUI grounding, visual retrieval, transducer ASR, video, 3D, diffusion LLM, VLA.
- v0.4: Verified Apple AI terminology layer, artifact officiality, benchmark provenance.
- v0.3: Validation depth, upstream taxonomy, benchmark registry.

Later:

- Split large YAML files into `data/models/*.yaml` if the catalog grows significantly.
- Richer model cards, per-model pages, and SEO optimization on the web UI.
- Additional filters: runtime, maturity, confidence, artifact availability, modality.
- Publish to PyPI for `pip install coreai-catalog` (currently `pip install git+...`).
- Automated source verification (in progress via `scripts/check_sources.sh`).

## Non-goals

This repository does not currently define:

- model workflows
- app logic
- inference pipelines
- benchmarking harnesses
- model conversion scripts
- runtime implementations

Those belong in separate repositories or future layers.

## Upstream

Primary community upstream:

- https://github.com/john-rocky/coreai-model-zoo

Official Apple recipe upstream:

- https://github.com/apple/coreai-models

Additional upstream taxonomy:

- `upstreams.yaml`
- `docs/upstream-map.md`