macafm 0.9.5__tar.gz
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- macafm-0.9.5/LICENSE +21 -0
- macafm-0.9.5/MANIFEST.in +4 -0
- macafm-0.9.5/PKG-INFO +546 -0
- macafm-0.9.5/README.md +514 -0
- macafm-0.9.5/macafm/__init__.py +21 -0
- macafm-0.9.5/macafm.egg-info/PKG-INFO +546 -0
- macafm-0.9.5/macafm.egg-info/SOURCES.txt +11 -0
- macafm-0.9.5/macafm.egg-info/dependency_links.txt +1 -0
- macafm-0.9.5/macafm.egg-info/entry_points.txt +2 -0
- macafm-0.9.5/macafm.egg-info/requires.txt +4 -0
- macafm-0.9.5/macafm.egg-info/top_level.txt +1 -0
- macafm-0.9.5/pyproject.toml +63 -0
- macafm-0.9.5/setup.cfg +4 -0
macafm-0.9.5/LICENSE
ADDED
@@ -0,0 +1,21 @@
MIT License

Copyright (c) 2025 MacLocalAPI Contributors

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
macafm-0.9.5/MANIFEST.in
ADDED
macafm-0.9.5/PKG-INFO
ADDED
@@ -0,0 +1,546 @@
Metadata-Version: 2.4
Name: macafm
Version: 0.9.5
Summary: Access Apple's on-device Foundation Models via CLI and OpenAI-compatible API
Author: Sylvain Cousineau
License-Expression: MIT
Project-URL: Homepage, https://github.com/scouzi1966/maclocal-api
Project-URL: Documentation, https://github.com/scouzi1966/maclocal-api#readme
Project-URL: Repository, https://github.com/scouzi1966/maclocal-api
Project-URL: Issues, https://github.com/scouzi1966/maclocal-api/issues
Keywords: apple,foundation-models,llm,openai,api,macos,apple-silicon,ai,machine-learning,cli
Classifier: Development Status :: 4 - Beta
Classifier: Environment :: Console
Classifier: Environment :: MacOS X
Classifier: Intended Audience :: Developers
Classifier: Operating System :: MacOS :: MacOS X
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Requires-Python: >=3.9
Description-Content-Type: text/markdown
License-File: LICENSE
Provides-Extra: dev
Requires-Dist: build; extra == "dev"
Requires-Dist: twine; extra == "dev"
Dynamic: license-file

If you find this useful, please ⭐ the repo! Also check out [Vesta AI Explorer](https://kruks.ai/) — my full-featured native macOS AI app.

> [!NOTE]
> **Attention M-series Mac AI enthusiasts!** You don't need to be a Swift developer to explore. Vibe coding makes it possible for anyone to contribute to this project.
>
> [Fork this repo](https://github.com/scouzi1966/maclocal-api/fork) first, then clone your fork to submit PRs:
>
> ```bash
> git clone https://github.com/<your-username>/maclocal-api.git
> cd maclocal-api
> claude
> /build-afm
> ```
>
> Start vibe coding! Support for skills with more coding agents is planned.

# afm — Run Any LLM on Your Mac, 100% Local

Qwen3.5-35B-A3B has been tested extensively with afm, using an experimental technique with Claude and Codex as judges for evaluation scoring. Click the link below to view the test results.

### [afm-next Nightly Test Report — Qwen3.5-35B-A3B Focus](https://kruks.ai/macafm/)

Run open-source MLX models **or** Apple's on-device Foundation Model through an OpenAI-compatible API. Built entirely in Swift for maximum Metal GPU performance. No Python runtime, no cloud, no API keys.

## Install

| | Stable (v0.9.4) | Nightly (afm-next) |
|---|---|---|
| **Homebrew** | `brew install scouzi1966/afm/afm` | `brew install scouzi1966/afm/afm-next` |
| **pip** | `pip install macafm` | — |
| **Release notes** | [v0.9.4](https://github.com/scouzi1966/maclocal-api/releases/tag/v0.9.4) | [Latest nightly](https://github.com/scouzi1966/maclocal-api/releases) |

> [!TIP]
> **Switching between stable and nightly** (assumes you previously ran `brew install scouzi1966/afm/afm`):
> ```bash
> brew unlink afm && brew install scouzi1966/afm/afm-next   # switch to nightly
> brew unlink afm-next && brew link afm                     # switch back to stable
> ```

## What's new in afm-next

> [!IMPORTANT]
> The nightly build is the future stable release. It includes everything in v0.9.4 plus:
> - Extensive test reports in the test-reports folder (still loosely organized)
> - **Qwen3.5-35B-A3B MoE** — run a 35B model with only 3B active parameters (`--vlm` for image/video)
> - **Full tool calling** — Qwen3-Coder, Gemma, GLM, Kimi-K2.5, and more
> - **Prompt prefix caching** for faster repeat inference
> - **Stop sequences** with `<think>` model support
> - **New architectures** — Qwen3.5, Gemma 3n, Kimi-K2.5, MiniMax M2.5, Nemotron
> - `--guided-json` for structured output
> - Stop sequences through the API
> - Image objects in API requests (following OpenAI API SDK standards)
> - `logprobs` for agentic interpretability testing
> - `top-k`, `min-p`, and presence-penalty parameters
> - `--tool-call-parser` (experimental): hermes, llama3_json, gemma, mistral, qwen3_xml
> - Many more! See `afm mlx -h` (not all features are wired up yet)
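Several of the new knobs (stop sequences, top-k, min-p, presence penalty) are exposed through the OpenAI-compatible API as well as the CLI. Below is a hedged sketch of a request body using them: the extra field names (`top_k`, `min_p`, `presence_penalty`) follow common OpenAI-compatible server conventions and are assumptions here, so check `afm mlx -h` for the exact names afm accepts.

```python
import json

# Hypothetical /v1/chat/completions body exercising afm-next's new sampling
# controls. "stop" is standard OpenAI; the other field names are assumed.
payload = {
    "model": "mlx-community/Qwen3.5-35B-A3B-4bit",
    "messages": [{"role": "user", "content": "Name three prime numbers."}],
    "stop": ["</answer>"],      # stop sequences through the API
    "top_k": 40,                # assumed extension field name
    "min_p": 0.05,              # assumed extension field name
    "presence_penalty": 0.5,
}
print(json.dumps(payload, indent=2))
```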

## Quick Start

```bash
# Run any MLX model with WebUI
afm mlx -m mlx-community/Qwen3.5-35B-A3B-4bit -w

# Or any smaller model
afm mlx -m mlx-community/gemma-3-4b-it-8bit -w

# Chat from the terminal (auto-downloads from Hugging Face)
afm mlx -m Qwen3-0.6B-4bit -s "Explain quantum computing"

# Interactive model picker (lists your downloaded models)
MACAFM_MLX_MODEL_CACHE=/path/to/models afm mlx -w

# Apple's on-device Foundation Model with WebUI
afm -w
```

## Use with OpenCode

[OpenCode](https://opencode.ai/) is a terminal-based AI coding assistant. Connect it to afm for a fully local coding experience — no cloud, no API keys, and no Internet required (beyond the initial model download, of course!)

**1. Configure OpenCode** (`~/.config/opencode/opencode.json`):

```json
{
  "$schema": "https://opencode.ai/config.json",
  "provider": {
    "ollama": {
      "npm": "@ai-sdk/openai-compatible",
      "name": "macafm (local)",
      "options": {
        "baseURL": "http://localhost:9999/v1"
      },
      "models": {
        "mlx-community/Qwen3-Coder-Next-4bit": {
          "name": "mlx-community/Qwen3-Coder-Next-4bit"
        }
      }
    }
  }
}
```

**2. Start afm with a coding model:**
```bash
afm mlx -m mlx-community/Qwen3-Coder-Next-4bit -t 1.0 --top-p 0.95 --max-tokens 8192
```

**3. Launch OpenCode** and type `/connect`. Scroll to the very bottom of the provider list — `macafm (local)` will likely be the last entry. Select it, and when prompted for an API key, enter any value (e.g. `x`); afm does not yet implement tokenized access, so the key is ignored. All inference runs locally on your Mac's GPU.

---

## 28+ MLX Models Tested

28 models tested and verified, including Qwen3, Gemma 3/3n, GLM-4/5, DeepSeek V3, LFM2, SmolLM3, Llama 3.2, MiniMax M2.5, Nemotron, and more. See [test reports](test-reports/).

---

[](https://swift.org)
[](https://developer.apple.com/macos/)
[](LICENSE)

## ⭐ Star History

[](https://star-history.com/#scouzi1966/maclocal-api&Date)

## Related Projects

- [Vesta AI Explorer](https://kruks.ai/) — full-featured native macOS AI chat app
- [AFMTrainer](https://github.com/scouzi1966/AFMTrainer) — LoRA fine-tuning wrapper for Apple's toolkit (Mac M-series & Linux CUDA)
- [Apple Foundation Model Adapters](https://developer.apple.com/apple-intelligence/foundation-models-adapter/) — Apple's adapter training toolkit

## 🌟 Features

- **🔗 OpenAI API Compatible** - Works with existing OpenAI client libraries and applications
- **🧠 MLX Local Models** - Run any Hugging Face MLX model locally (Qwen, Gemma, Llama, DeepSeek, GLM, and 28+ tested models)
- **🌐 API Gateway** - Auto-discovers and proxies Ollama, LM Studio, Jan, and other local backends into a single API
- **⚡ LoRA Adapter Support** - Fine-tune with LoRA adapters built using Apple's adapter training toolkit
- **📱 Apple Foundation Models** - Uses Apple's on-device 3B-parameter language model
- **👁️ Vision OCR** - Extract text from images and PDFs using Apple Vision (`afm vision`)
- **🖥️ Built-in WebUI** - Chat interface with model selection (`afm -w`)
- **🔒 Privacy-First** - All processing happens locally on your device
- **⚡ Fast & Lightweight** - No network calls, no API keys required
- **🛠️ Easy Integration** - Drop-in replacement for OpenAI API endpoints
- **📊 Token Usage Tracking** - Provides accurate token consumption metrics

## 📋 Requirements

- **macOS 26 (Tahoe) or later**
- **Apple Silicon Mac** (M1/M2/M3/M4 series)
- **Apple Intelligence enabled** in System Settings
- **Xcode 26** (for building from source)

## 🚀 Quick Start

### Installation

#### Option 1: Homebrew (Recommended)

```bash
# Add the tap
brew tap scouzi1966/afm

# Install AFM
brew install afm

# Verify installation
afm --version
```

#### Option 2: pip (PyPI)

```bash
# Install from PyPI
pip install macafm

# Verify installation
afm --version
```

#### Option 3: Build from Source

```bash
# Clone the repository with submodules
git clone --recurse-submodules https://github.com/scouzi1966/maclocal-api.git
cd maclocal-api

# Build everything from scratch (patches + webui + release build)
./Scripts/build-from-scratch.sh

# Or skip the webui if you don't have Node.js
./Scripts/build-from-scratch.sh --skip-webui

# Or use make (patches + release build, no webui)
make

# Run
./.build/release/afm --version
```

### Running

```bash
# API server only (Apple Foundation Model on port 9999)
afm

# API server with WebUI chat interface
afm -w

# WebUI + API gateway (auto-discovers Ollama, LM Studio, Jan, etc.)
afm -w -g

# Custom port with verbose logging
afm -p 8080 -v

# Show help
afm -h
```

### MLX Local Models

Run open-source models locally on Apple Silicon using MLX:

```bash
# Run a model with a single prompt
afm mlx -m mlx-community/Qwen2.5-0.5B-Instruct-4bit -s "Explain gravity"

# Start an MLX model with the WebUI
afm mlx -m mlx-community/gemma-3-4b-it-8bit -w

# Interactive model picker (lists downloaded models)
afm mlx -w

# MLX model as an API server
afm mlx -m mlx-community/Llama-3.2-1B-Instruct-4bit -p 8080

# Pipe mode
cat essay.txt | afm mlx -m mlx-community/Qwen3-0.6B-4bit -i "Summarize this"

# MLX help
afm mlx --help
```

Models are downloaded from Hugging Face on first use and cached locally. Any model from the [mlx-community](https://huggingface.co/mlx-community) collection is supported.
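If you keep a Hugging Face-style cache (or point `MACAFM_MLX_MODEL_CACHE` at one), a quick way to see which mlx-community models are already on disk is to scan the hub directory. This is an illustrative sketch assuming the standard Hugging Face hub layout (`models--<org>--<name>` folders); afm's own cache location may differ.

```python
from pathlib import Path

def list_cached_mlx_models(
    cache_dir: Path = Path.home() / ".cache" / "huggingface" / "hub",
) -> list:
    """Return repo IDs of mlx-community models found in a HF-style hub cache."""
    if not cache_dir.is_dir():
        return []
    models = []
    for entry in cache_dir.iterdir():
        # The HF hub stores each repo as a "models--<org>--<name>" directory
        if entry.name.startswith("models--mlx-community--"):
            models.append(entry.name[len("models--"):].replace("--", "/"))
    return sorted(models)

print(list_cached_mlx_models())
```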

## 📡 API Endpoints

### Chat Completions
**POST** `/v1/chat/completions`

Compatible with OpenAI's chat completions API.

```bash
curl -X POST http://localhost:9999/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "foundation",
    "messages": [
      {"role": "user", "content": "Hello, how are you?"}
    ]
  }'
```

### List Models
**GET** `/v1/models`

Returns available Foundation Models.

```bash
curl http://localhost:9999/v1/models
```

### Health Check
**GET** `/health`

Server health status endpoint.

```bash
curl http://localhost:9999/health
```
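In scripts it can be handy to confirm the server is up before sending requests. Here is a small stdlib-only readiness check against the `/health` endpoint; it only assumes the endpoint answers HTTP 200 when healthy, not any particular response body.

```python
import urllib.error
import urllib.request

def is_healthy(base_url: str = "http://localhost:9999", timeout: float = 2.0) -> bool:
    """Return True if the afm /health endpoint answers with HTTP 200."""
    try:
        with urllib.request.urlopen(f"{base_url}/health", timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Server not running, refused connection, or timed out
        return False

print(is_healthy())
```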

## 💻 Usage Examples

### Python with OpenAI Library

```python
from openai import OpenAI

# Point to your local MacLocalAPI server
client = OpenAI(
    api_key="not-needed-for-local",
    base_url="http://localhost:9999/v1"
)

response = client.chat.completions.create(
    model="foundation",
    messages=[
        {"role": "user", "content": "Explain quantum computing in simple terms"}
    ]
)

print(response.choices[0].message.content)
```
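Streaming is enabled by default on the server side; with the same client you can pass `stream=True` and reassemble the content deltas yourself. The helper below shows the reassembly logic, driven here by stubbed chunks so it runs without a server; with a live server you would pass it the iterator returned by `client.chat.completions.create(..., stream=True)`.

```python
from types import SimpleNamespace

def collect_stream(chunks) -> str:
    """Concatenate the content deltas of streamed chat-completion chunks."""
    parts = []
    for chunk in chunks:
        delta = chunk.choices[0].delta.content
        if delta is not None:  # the final chunk carries no content
            parts.append(delta)
    return "".join(parts)

# Stub chunks standing in for a real stream from the server
fake_chunks = [
    SimpleNamespace(choices=[SimpleNamespace(delta=SimpleNamespace(content=c))])
    for c in ("Hello", ", ", "world", None)
]
print(collect_stream(fake_chunks))  # prints "Hello, world"
```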

### JavaScript/Node.js

```javascript
import OpenAI from 'openai';

const openai = new OpenAI({
  apiKey: 'not-needed-for-local',
  baseURL: 'http://localhost:9999/v1',
});

const completion = await openai.chat.completions.create({
  messages: [{ role: 'user', content: 'Write a haiku about programming' }],
  model: 'foundation',
});

console.log(completion.choices[0].message.content);
```

### curl Examples

```bash
# Basic chat completion
curl -X POST http://localhost:9999/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "foundation",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "What is the capital of France?"}
    ]
  }'

# With temperature control
curl -X POST http://localhost:9999/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "foundation",
    "messages": [{"role": "user", "content": "Be creative!"}],
    "temperature": 0.8
  }'
```
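When generating request bodies programmatically (for curl, tests, or scripts), a small builder keeps the JSON well-formed and catches out-of-range parameters early. This sketch enforces the 0.0-1.0 temperature range that afm documents for the Foundation model; the helper itself is illustrative, not part of afm.

```python
import json
from typing import Optional

def chat_payload(prompt: str, temperature: float = 0.7,
                 system: Optional[str] = None) -> str:
    """Build a /v1/chat/completions request body for the Foundation model."""
    if not 0.0 <= temperature <= 1.0:
        raise ValueError("temperature must be between 0.0 and 1.0")
    messages = []
    if system is not None:
        messages.append({"role": "system", "content": system})
    messages.append({"role": "user", "content": prompt})
    return json.dumps({"model": "foundation", "messages": messages,
                       "temperature": temperature})

print(chat_payload("Be creative!", temperature=0.8))
```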

### Single Prompt & Pipe Examples

```bash
# Single prompt mode
afm -s "Explain quantum computing"

# Piped input from other commands
echo "What is the meaning of life?" | afm
cat file.txt | afm
git log --oneline | head -5 | afm

# Custom instructions with a pipe
echo "Review this code" | afm -i "You are a senior software engineer"
```

## 🏗️ Architecture

```
MacLocalAPI/
├── Package.swift                        # Swift Package Manager config
├── Sources/MacLocalAPI/
│   ├── main.swift                       # CLI entry point & ArgumentParser
│   ├── Server.swift                     # Vapor web server configuration
│   ├── Controllers/
│   │   └── ChatCompletionsController.swift  # OpenAI API endpoints
│   └── Models/
│       ├── FoundationModelService.swift # Apple Foundation Models wrapper
│       ├── OpenAIRequest.swift          # Request data models
│       └── OpenAIResponse.swift         # Response data models
└── README.md
```

## 🔧 Configuration

### Command Line Options

```
OVERVIEW: macOS server that exposes Apple's Foundation Models through
OpenAI-compatible API

Use -w to enable the WebUI, -g to enable API gateway mode (auto-discovers and
proxies to Ollama, LM Studio, Jan, and other local LLM backends).

USAGE: afm <options>
       afm mlx [<options>]     Run local MLX models from Hugging Face
       afm vision <image>      OCR text extraction from images/PDFs

OPTIONS:
  -s, --single-prompt <single-prompt>
                          Run a single prompt without starting the server
  -i, --instructions <instructions>
                          Custom instructions for the AI assistant (default:
                          You are a helpful assistant)
  -v, --verbose           Enable verbose logging
  --no-streaming          Disable streaming responses (streaming is enabled by
                          default)
  -a, --adapter <adapter> Path to a .fmadapter file for LoRA adapter fine-tuning
  -p, --port <port>       Port to run the server on (default: 9999)
  -H, --hostname <hostname>
                          Hostname to bind server to (default: 127.0.0.1)
  -t, --temperature <temperature>
                          Temperature for response generation (0.0-1.0)
  -r, --randomness <randomness>
                          Sampling mode: 'greedy', 'random',
                          'random:top-p=<0.0-1.0>', 'random:top-k=<int>', with
                          optional ':seed=<int>'
  -P, --permissive-guardrails
                          Permissive guardrails for unsafe or inappropriate
                          responses
  -w, --webui             Enable webui and open in default browser
  -g, --gateway           Enable API gateway mode: discover and proxy to local
                          LLM backends (Ollama, LM Studio, Jan, etc.)
  --prewarm <prewarm>     Pre-warm the model on server startup for faster first
                          response (y/n, default: y)
  --version               Show the version.
  -h, --help              Show help information.

Note: afm also accepts piped input from other commands, equivalent to using -s
with the piped content as the prompt.
```

### Environment Variables

The server respects standard logging environment variables:
- `LOG_LEVEL` - Set logging level (trace, debug, info, notice, warning, error, critical)

## ⚠️ Limitations & Notes

- **Model Scope**: The Apple Foundation Model is a 3B-parameter model (optimized for on-device performance)
- **macOS 26+ Only**: Requires the latest macOS with the Foundation Models framework
- **Apple Intelligence Required**: Must be enabled in System Settings
- **Token Estimation**: Uses word-based approximation for token counting (Foundation model only; proxied backends report real counts)
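For reference, a word-based approximation of the kind described above can be sketched as follows. The 0.75 words-per-token ratio is a common rule of thumb for English text, not necessarily the exact heuristic afm uses.

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate: English text averages ~0.75 words per token."""
    words = len(text.split())
    return max(1, round(words / 0.75))

print(estimate_tokens("The quick brown fox jumps over the lazy dog"))  # 9 words -> 12
```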

## 🔍 Troubleshooting

### "Foundation Models framework is not available"
1. Ensure you're running **macOS 26 or later**
2. Enable **Apple Intelligence** in System Settings → Apple Intelligence & Siri
3. Verify you're on an **Apple Silicon Mac**
4. Restart the application after enabling Apple Intelligence

### Server Won't Start
1. Check if the port is already in use: `lsof -i :9999`
2. Try a different port: `afm -p 8080`
3. Enable verbose logging: `afm -v`

### Build Issues
1. Ensure you have **Xcode 26** installed
2. Update the Swift toolchain: `xcode-select --install`
3. Clean and rebuild: `swift package clean && swift build -c release`

## 🤝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.

### Development Setup

```bash
# Clone the repo with submodules
git clone --recurse-submodules https://github.com/scouzi1966/maclocal-api.git
cd maclocal-api

# Full build from scratch (submodules + patches + webui + release)
./Scripts/build-from-scratch.sh

# Or use debug builds during development
./Scripts/build-from-scratch.sh --debug --skip-webui

# Run with verbose logging
./.build/debug/afm -w -g -v
```

## 📄 License

This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.

## 🙏 Acknowledgments

- Apple for the Foundation Models framework
- The Vapor Swift web framework team
- OpenAI for the API specification standard
- The Swift community for excellent tooling

## 📞 Support

If you encounter any issues or have questions:

1. Check the [Troubleshooting](#-troubleshooting) section
2. Search existing [GitHub Issues](https://github.com/scouzi1966/maclocal-api/issues)
3. Create a new issue with detailed information about your problem

## 🗺️ Roadmap

- [x] Streaming response support
- [x] MLX local model support (28+ models tested)
- [x] Multiple model support (API gateway mode)
- [x] Web UI for testing (llama.cpp WebUI integration)
- [x] Vision OCR subcommand
- [x] Function/tool calling (OpenAI-compatible, multiple formats)
- [ ] Performance optimizations
- [ ] Docker containerization (when supported)

---

**Made with ❤️ for the Apple Silicon community**

*Bringing the power of local AI to your fingertips.*