macafm 0.9.5__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
macafm-0.9.5/LICENSE ADDED
MIT License

Copyright (c) 2025 MacLocalAPI Contributors

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
macafm-0.9.5/MANIFEST.in ADDED
include LICENSE
include README.md
recursive-include macafm/bin *
recursive-include macafm/share *
macafm-0.9.5/PKG-INFO ADDED
Metadata-Version: 2.4
Name: macafm
Version: 0.9.5
Summary: Access Apple's on-device Foundation Models via CLI and OpenAI-compatible API
Author: Sylvain Cousineau
License-Expression: MIT
Project-URL: Homepage, https://github.com/scouzi1966/maclocal-api
Project-URL: Documentation, https://github.com/scouzi1966/maclocal-api#readme
Project-URL: Repository, https://github.com/scouzi1966/maclocal-api
Project-URL: Issues, https://github.com/scouzi1966/maclocal-api/issues
Keywords: apple,foundation-models,llm,openai,api,macos,apple-silicon,ai,machine-learning,cli
Classifier: Development Status :: 4 - Beta
Classifier: Environment :: Console
Classifier: Environment :: MacOS X
Classifier: Intended Audience :: Developers
Classifier: Operating System :: MacOS :: MacOS X
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Requires-Python: >=3.9
Description-Content-Type: text/markdown
License-File: LICENSE
Provides-Extra: dev
Requires-Dist: build; extra == "dev"
Requires-Dist: twine; extra == "dev"
Dynamic: license-file

If you find this useful, please ⭐ the repo! Also check out [Vesta AI Explorer](https://kruks.ai/) — my full-featured native macOS AI app.

> [!NOTE]
> **Attention M-series Mac AI enthusiasts!** You don't need to be a Swift developer to explore; vibe coding lets anyone participate in this project.
>
> [Fork this repo](https://github.com/scouzi1966/maclocal-api/fork) first, then clone your fork to submit PRs:
>
> ```bash
> git clone https://github.com/<your-username>/maclocal-api.git
> cd maclocal-api
> claude
> /build-afm
> ```
>
> Start vibe coding! I will add support for skills with more coding agents in the future.

# afm — Run Any LLM on Your Mac, 100% Local

Qwen3.5-35B-A3B has been tested extensively with afm, using an experimental evaluation technique that scores results with Claude and Codex as judges. Click the link below to view the test results.

### [afm-next Nightly Test Report — Qwen3.5-35B-A3B Focus](https://kruks.ai/macafm/)

Run open-source MLX models **or** Apple's on-device Foundation Model through an OpenAI-compatible API. Built entirely in Swift for maximum Metal GPU performance. No Python runtime, no cloud, no API keys.

## Install

| | Stable (v0.9.4) | Nightly (afm-next) |
|---|---|---|
| **Homebrew** | `brew install scouzi1966/afm/afm` | `brew install scouzi1966/afm/afm-next` |
| **pip** | `pip install macafm` | — |
| **Release notes** | [v0.9.4](https://github.com/scouzi1966/maclocal-api/releases/tag/v0.9.4) | [Latest nightly](https://github.com/scouzi1966/maclocal-api/releases) |

> [!TIP]
> **Switching between stable and nightly:**
> ```bash
> brew unlink afm && brew install scouzi1966/afm/afm-next  # switch to nightly
> brew unlink afm-next && brew link afm                    # switch back to stable
> # assumes you previously installed stable with: brew install scouzi1966/afm/afm
> ```

## What's new in afm-next

> [!IMPORTANT]
> The nightly build is the future stable release. It includes everything in v0.9.4 plus:
> - the `test-reports` folder is unpolished, but it contains the full set of test reports
> - **Qwen3.5-35B-A3B MoE** — run a 35B model with only 3B active parameters (`--vlm` for image and video input)
> - **Full tool calling** — Qwen3-Coder, Gemma, GLM, Kimi-K2.5, and more
> - **Prompt prefix caching** for faster repeat inference
> - **Stop sequences** with `<think>` model support
> - **New architectures** — Qwen3.5, Gemma 3n, Kimi-K2.5, MiniMax M2.5, Nemotron
> - `--guided-json` for structured output
> - Stop sequences through the API
> - Pass image objects to the API (following the OpenAI API SDK standard)
> - logprobs for agentic interpretability testing
> - top-k, min-p, and presence-penalty parameters
> - `--tool-call-parser` (experimental): hermes, llama3_json, gemma, mistral, qwen3_xml
> - Many more: see `afm mlx -h` (not all features are wired up yet)

## Quick Start

```bash
# Run any MLX model with WebUI
afm mlx -m mlx-community/Qwen3.5-35B-A3B-4bit -w

# Or any smaller model
afm mlx -m mlx-community/gemma-3-4b-it-8bit -w

# Chat from the terminal (auto-downloads from Hugging Face)
afm mlx -m Qwen3-0.6B-4bit -s "Explain quantum computing"

# Interactive model picker (lists your downloaded models)
MACAFM_MLX_MODEL_CACHE=/path/to/models afm mlx -w

# Apple's on-device Foundation Model with WebUI
afm -w
```

## Use with OpenCode

[OpenCode](https://opencode.ai/) is a terminal-based AI coding assistant. Connect it to afm for a fully local coding experience: no cloud, no API keys, and no Internet connection required (beyond the initial model download, of course).

**1. Configure OpenCode** (`~/.config/opencode/opencode.json`):

```json
{
  "$schema": "https://opencode.ai/config.json",
  "provider": {
    "ollama": {
      "npm": "@ai-sdk/openai-compatible",
      "name": "macafm (local)",
      "options": {
        "baseURL": "http://localhost:9999/v1"
      },
      "models": {
        "mlx-community/Qwen3-Coder-Next-4bit": {
          "name": "mlx-community/Qwen3-Coder-Next-4bit"
        }
      }
    }
  }
}
```

**2. Start afm with a coding model:**
```bash
afm mlx -m mlx-community/Qwen3-Coder-Next-4bit -t 1.0 --top-p 0.95 --max-tokens 8192
```

**3. Launch OpenCode** and type `/connect`. Scroll down to the very bottom of the provider list — `macafm (local)` will likely be the last entry. Select it, and when prompted for an API key, enter any value (e.g. `x`); tokenized access is not yet implemented in afm, so the key is ignored. All inference runs locally on your Mac's GPU.

---

## 28+ MLX Models Tested

![MLX Models](test-reports/MLX-Models.png)

28 models tested and verified, including Qwen3, Gemma 3/3n, GLM-4/5, DeepSeek V3, LFM2, SmolLM3, Llama 3.2, MiniMax M2.5, Nemotron, and more. See [test reports](test-reports/).

---

[![Swift](https://img.shields.io/badge/Swift-6.2+-orange.svg)](https://swift.org)
[![macOS](https://img.shields.io/badge/macOS-26+-blue.svg)](https://developer.apple.com/macos/)
[![License](https://img.shields.io/badge/License-MIT-green.svg)](LICENSE)

## ⭐ Star History

[![Star History Chart](https://api.star-history.com/svg?repos=scouzi1966/maclocal-api&type=Date)](https://star-history.com/#scouzi1966/maclocal-api&Date)

## Related Projects

- [Vesta AI Explorer](https://kruks.ai/) — full-featured native macOS AI chat app
- [AFMTrainer](https://github.com/scouzi1966/AFMTrainer) — LoRA fine-tuning wrapper for Apple's toolkit (Mac M-series & Linux CUDA)
- [Apple Foundation Model Adapters](https://developer.apple.com/apple-intelligence/foundation-models-adapter/) — Apple's adapter training toolkit

## 🌟 Features

- **🔗 OpenAI API Compatible** - Works with existing OpenAI client libraries and applications
- **🧠 MLX Local Models** - Run any Hugging Face MLX model locally (Qwen, Gemma, Llama, DeepSeek, GLM, and 28+ tested models)
- **🌐 API Gateway** - Auto-discovers and proxies Ollama, LM Studio, Jan, and other local backends into a single API
- **⚡ LoRA Adapter Support** - Supports fine-tuning with LoRA adapters using Apple's adapter training toolkit
- **📱 Apple Foundation Models** - Uses Apple's on-device 3B parameter language model
- **👁️ Vision OCR** - Extract text from images and PDFs using Apple Vision (`afm vision`)
- **🖥️ Built-in WebUI** - Chat interface with model selection (`afm -w`)
- **🔒 Privacy-First** - All processing happens locally on your device
- **⚡ Fast & Lightweight** - No network calls, no API keys required
- **🛠️ Easy Integration** - Drop-in replacement for OpenAI API endpoints
- **📊 Token Usage Tracking** - Provides accurate token consumption metrics

## 📋 Requirements

- **macOS 26 (Tahoe) or later**
- **Apple Silicon Mac** (M1/M2/M3/M4 series)
- **Apple Intelligence enabled** in System Settings
- **Xcode 26** (for building from source)

## 🚀 Quick Start

### Installation

#### Option 1: Homebrew (Recommended)

```bash
# Add the tap
brew tap scouzi1966/afm

# Install AFM
brew install afm

# Verify installation
afm --version
```

#### Option 2: pip (PyPI)

```bash
# Install from PyPI
pip install macafm

# Verify installation
afm --version
```

#### Option 3: Build from Source

```bash
# Clone the repository with submodules
git clone --recurse-submodules https://github.com/scouzi1966/maclocal-api.git
cd maclocal-api

# Build everything from scratch (patches + webui + release build)
./Scripts/build-from-scratch.sh

# Or skip webui if you don't have Node.js
./Scripts/build-from-scratch.sh --skip-webui

# Or use make (patches + release build, no webui)
make

# Run
./.build/release/afm --version
```

### Running

```bash
# API server only (Apple Foundation Model on port 9999)
afm

# API server with WebUI chat interface
afm -w

# WebUI + API gateway (auto-discovers Ollama, LM Studio, Jan, etc.)
afm -w -g

# Custom port with verbose logging
afm -p 8080 -v

# Show help
afm -h
```

### MLX Local Models

Run open-source models locally on Apple Silicon using MLX:

```bash
# Run a model with single prompt
afm mlx -m mlx-community/Qwen2.5-0.5B-Instruct-4bit -s "Explain gravity"

# Start MLX model with WebUI
afm mlx -m mlx-community/gemma-3-4b-it-8bit -w

# Interactive model picker (lists downloaded models)
afm mlx -w

# MLX model as API server
afm mlx -m mlx-community/Llama-3.2-1B-Instruct-4bit -p 8080

# Pipe mode
cat essay.txt | afm mlx -m mlx-community/Qwen3-0.6B-4bit -i "Summarize this"

# MLX help
afm mlx --help
```

Models are downloaded from Hugging Face on first use and cached locally. Any model from the [mlx-community](https://huggingface.co/mlx-community) collection is supported.

## 📡 API Endpoints

### Chat Completions
**POST** `/v1/chat/completions`

Compatible with OpenAI's chat completions API.

```bash
curl -X POST http://localhost:9999/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "foundation",
    "messages": [
      {"role": "user", "content": "Hello, how are you?"}
    ]
  }'
```
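
A successful response follows OpenAI's chat-completion schema (`choices`, `message`, `usage`). A minimal sketch of pulling the reply and token usage out of a decoded response — the sample payload below is illustrative, not actual afm output:

```python
# Extract the assistant reply and token usage from an OpenAI-style
# chat-completion response. The sample payload is illustrative only.
sample = {
    "id": "chatcmpl-123",
    "object": "chat.completion",
    "model": "foundation",
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "Hello! I'm doing well."},
            "finish_reason": "stop",
        }
    ],
    "usage": {"prompt_tokens": 12, "completion_tokens": 8, "total_tokens": 20},
}

def extract_reply(response: dict):
    """Return (assistant text, total tokens) from a chat-completion response."""
    text = response["choices"][0]["message"]["content"]
    total = response.get("usage", {}).get("total_tokens", 0)
    return text, total

if __name__ == "__main__":
    reply, tokens = extract_reply(sample)
    print(reply)   # Hello! I'm doing well.
    print(tokens)  # 20
```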

### List Models
**GET** `/v1/models`

Returns available Foundation Models.

```bash
curl http://localhost:9999/v1/models
```

### Health Check
**GET** `/health`

Server health status endpoint.

```bash
curl http://localhost:9999/health
```
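
From a script, it can help to confirm the server is up before sending requests. A small standard-library sketch (the `/health` path comes from the endpoint documented above; the helper name is my own):

```python
import urllib.error
import urllib.request

def wait_for_server(base_url: str, timeout: float = 2.0) -> bool:
    """Return True if the /health endpoint answers with HTTP 200,
    False on connection failure or timeout. Illustrative helper."""
    try:
        with urllib.request.urlopen(f"{base_url}/health", timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False

if __name__ == "__main__":
    # Prints False immediately if no server is listening on the port.
    print(wait_for_server("http://127.0.0.1:9999", timeout=1.0))
```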

## 💻 Usage Examples

### Python with OpenAI Library

```python
from openai import OpenAI

# Point to your local MacLocalAPI server
client = OpenAI(
    api_key="not-needed-for-local",
    base_url="http://localhost:9999/v1"
)

response = client.chat.completions.create(
    model="foundation",
    messages=[
        {"role": "user", "content": "Explain quantum computing in simple terms"}
    ]
)

print(response.choices[0].message.content)
```

### JavaScript/Node.js

```javascript
import OpenAI from 'openai';

const openai = new OpenAI({
  apiKey: 'not-needed-for-local',
  baseURL: 'http://localhost:9999/v1',
});

const completion = await openai.chat.completions.create({
  messages: [{ role: 'user', content: 'Write a haiku about programming' }],
  model: 'foundation',
});

console.log(completion.choices[0].message.content);
```

### curl Examples

```bash
# Basic chat completion
curl -X POST http://localhost:9999/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "foundation",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "What is the capital of France?"}
    ]
  }'

# With temperature control
curl -X POST http://localhost:9999/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "foundation",
    "messages": [{"role": "user", "content": "Be creative!"}],
    "temperature": 0.8
  }'
```

### Single Prompt & Pipe Examples

```bash
# Single prompt mode
afm -s "Explain quantum computing"

# Piped input from other commands
echo "What is the meaning of life?" | afm
cat file.txt | afm
git log --oneline | head -5 | afm

# Custom instructions with pipe
echo "Review this code" | afm -i "You are a senior software engineer"
```

## 🏗️ Architecture

```
MacLocalAPI/
├── Package.swift                            # Swift Package Manager config
├── Sources/MacLocalAPI/
│   ├── main.swift                           # CLI entry point & ArgumentParser
│   ├── Server.swift                         # Vapor web server configuration
│   ├── Controllers/
│   │   └── ChatCompletionsController.swift  # OpenAI API endpoints
│   └── Models/
│       ├── FoundationModelService.swift     # Apple Foundation Models wrapper
│       ├── OpenAIRequest.swift              # Request data models
│       └── OpenAIResponse.swift             # Response data models
└── README.md
```

## 🔧 Configuration

### Command Line Options

```
OVERVIEW: macOS server that exposes Apple's Foundation Models through
OpenAI-compatible API

Use -w to enable the WebUI, -g to enable API gateway mode (auto-discovers and
proxies to Ollama, LM Studio, Jan, and other local LLM backends).

USAGE: afm <options>
       afm mlx [<options>]     Run local MLX models from Hugging Face
       afm vision <image>      OCR text extraction from images/PDFs

OPTIONS:
  -s, --single-prompt <single-prompt>
                          Run a single prompt without starting the server
  -i, --instructions <instructions>
                          Custom instructions for the AI assistant (default:
                          You are a helpful assistant)
  -v, --verbose           Enable verbose logging
  --no-streaming          Disable streaming responses (streaming is enabled by
                          default)
  -a, --adapter <adapter> Path to a .fmadapter file for LoRA adapter fine-tuning
  -p, --port <port>       Port to run the server on (default: 9999)
  -H, --hostname <hostname>
                          Hostname to bind server to (default: 127.0.0.1)
  -t, --temperature <temperature>
                          Temperature for response generation (0.0-1.0)
  -r, --randomness <randomness>
                          Sampling mode: 'greedy', 'random',
                          'random:top-p=<0.0-1.0>', 'random:top-k=<int>', with
                          optional ':seed=<int>'
  -P, --permissive-guardrails
                          Permissive guardrails for unsafe or inappropriate
                          responses
  -w, --webui             Enable webui and open in default browser
  -g, --gateway           Enable API gateway mode: discover and proxy to local
                          LLM backends (Ollama, LM Studio, Jan, etc.)
  --prewarm <prewarm>     Pre-warm the model on server startup for faster first
                          response (y/n, default: y)
  --version               Show the version.
  -h, --help              Show help information.

Note: afm also accepts piped input from other commands, equivalent to using -s
with the piped content as the prompt.
```
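
For context on the `-r` sampling modes above: greedy always picks the most likely token, while top-p (nucleus) sampling restricts choices to the smallest set of tokens whose cumulative probability reaches `p`, and a seed makes random sampling reproducible. A minimal sketch of the idea (illustrative only, not afm's actual sampler):

```python
import random
from typing import Optional

def top_p_filter(probs: dict, p: float) -> dict:
    """Keep the smallest set of tokens whose cumulative probability
    reaches p, then renormalize. Illustrative sketch only."""
    kept, total = {}, 0.0
    for token, prob in sorted(probs.items(), key=lambda kv: kv[1], reverse=True):
        kept[token] = prob
        total += prob
        if total >= p:
            break
    return {t: pr / total for t, pr in kept.items()}

def sample(probs: dict, p: float, seed: Optional[int] = None) -> str:
    """Draw one token from the top-p filtered distribution.
    A fixed seed (like ':seed=<int>') makes the draw reproducible."""
    rng = random.Random(seed)
    filtered = top_p_filter(probs, p)
    tokens = list(filtered)
    weights = [filtered[t] for t in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]

if __name__ == "__main__":
    probs = {"the": 0.5, "a": 0.3, "cat": 0.15, "xyz": 0.05}
    print(top_p_filter(probs, 0.7))  # keeps 'the' and 'a', renormalized
```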

### Environment Variables

The server respects standard logging environment variables:
- `LOG_LEVEL` - Set logging level (trace, debug, info, notice, warning, error, critical)

## ⚠️ Limitations & Notes

- **Model Scope**: Apple Foundation Model is a 3B parameter model (optimized for on-device performance)
- **macOS 26+ Only**: Requires the latest macOS with Foundation Models framework
- **Apple Intelligence Required**: Must be enabled in System Settings
- **Token Estimation**: Uses word-based approximation for token counting (Foundation model only; proxied backends report real counts)
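
A word-based approximation like the one mentioned above can be sketched as follows. A common rule of thumb is roughly 0.75 English words per token; the exact ratio here is an assumption for illustration, not afm's actual formula:

```python
import math

def estimate_tokens(text: str) -> int:
    """Rough token estimate from word count, using the common heuristic
    of ~0.75 words per token. Illustrative, not afm's implementation."""
    words = text.split()
    return max(1, math.ceil(len(words) / 0.75)) if words else 0

if __name__ == "__main__":
    print(estimate_tokens("Explain quantum computing in simple terms"))  # 8
```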

## 🔍 Troubleshooting

### "Foundation Models framework is not available"
1. Ensure you're running **macOS 26 or later**
2. Enable **Apple Intelligence** in System Settings → Apple Intelligence & Siri
3. Verify you're on an **Apple Silicon Mac**
4. Restart the application after enabling Apple Intelligence

### Server Won't Start
1. Check if the port is already in use: `lsof -i :9999`
2. Try a different port: `afm -p 8080`
3. Enable verbose logging: `afm -v`

### Build Issues
1. Ensure you have **Xcode 26** installed
2. Install or refresh the command line tools: `xcode-select --install`
3. Clean and rebuild: `swift package clean && swift build -c release`

## 🤝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.

### Development Setup

```bash
# Clone the repo with submodules
git clone --recurse-submodules https://github.com/scouzi1966/maclocal-api.git
cd maclocal-api

# Full build from scratch (submodules + patches + webui + release)
./Scripts/build-from-scratch.sh

# Or for debug builds during development
./Scripts/build-from-scratch.sh --debug --skip-webui

# Run with verbose logging
./.build/debug/afm -w -g -v
```

## 📄 License

This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.

## 🙏 Acknowledgments

- Apple for the Foundation Models framework
- The Vapor Swift web framework team
- OpenAI for the API specification standard
- The Swift community for excellent tooling

## 📞 Support

If you encounter any issues or have questions:

1. Check the [Troubleshooting](#-troubleshooting) section
2. Search existing [GitHub Issues](https://github.com/scouzi1966/maclocal-api/issues)
3. Create a new issue with detailed information about your problem

## 🗺️ Roadmap

- [x] Streaming response support
- [x] MLX local model support (28+ models tested)
- [x] Multiple model support (API gateway mode)
- [x] Web UI for testing (llama.cpp WebUI integration)
- [x] Vision OCR subcommand
- [x] Function/tool calling (OpenAI-compatible, multiple formats)
- [ ] Performance optimizations
- [ ] Docker containerization (when supported)

---

**Made with ❤️ for the Apple Silicon community**

*Bringing the power of local AI to your fingertips.*