mlx-code 0.0.20__tar.gz → 0.0.22__tar.gz
- {mlx_code-0.0.20 → mlx_code-0.0.22}/PKG-INFO +109 -66
- {mlx_code-0.0.20 → mlx_code-0.0.22}/README.md +106 -65
- mlx_code-0.0.20/mlx_code/ntui.py → mlx_code-0.0.22/mlx_code/bare.py +1 -0
- mlx_code-0.0.22/mlx_code/bats.py +300 -0
- {mlx_code-0.0.20 → mlx_code-0.0.22}/mlx_code/main.py +73 -12
- {mlx_code-0.0.20 → mlx_code-0.0.22}/mlx_code/repl.py +93 -31
- {mlx_code-0.0.20 → mlx_code-0.0.22}/mlx_code/view_log.py +1 -1
- mlx_code-0.0.22/mlx_code/web.py +485 -0
- {mlx_code-0.0.20 → mlx_code-0.0.22}/mlx_code.egg-info/PKG-INFO +109 -66
- {mlx_code-0.0.20 → mlx_code-0.0.22}/mlx_code.egg-info/SOURCES.txt +3 -1
- {mlx_code-0.0.20 → mlx_code-0.0.22}/mlx_code.egg-info/requires.txt +2 -0
- {mlx_code-0.0.20 → mlx_code-0.0.22}/setup.py +3 -1
- {mlx_code-0.0.20 → mlx_code-0.0.22}/LICENSE +0 -0
- {mlx_code-0.0.20 → mlx_code-0.0.22}/mlx_code/__init__.py +0 -0
- {mlx_code-0.0.20 → mlx_code-0.0.22}/mlx_code/apis.py +0 -0
- {mlx_code-0.0.20 → mlx_code-0.0.22}/mlx_code/gits.py +0 -0
- {mlx_code-0.0.20 → mlx_code-0.0.22}/mlx_code/lsp_tool.py +0 -0
- {mlx_code-0.0.20 → mlx_code-0.0.22}/mlx_code/mcb.py +0 -0
- {mlx_code-0.0.20 → mlx_code-0.0.22}/mlx_code/mcb_tool.py +0 -0
- {mlx_code-0.0.20 → mlx_code-0.0.22}/mlx_code/stream_log.py +0 -0
- {mlx_code-0.0.20 → mlx_code-0.0.22}/mlx_code/tools.py +0 -0
- {mlx_code-0.0.20 → mlx_code-0.0.22}/mlx_code/util.py +0 -0
- {mlx_code-0.0.20 → mlx_code-0.0.22}/mlx_code/view_git.py +0 -0
- {mlx_code-0.0.20 → mlx_code-0.0.22}/mlx_code.egg-info/dependency_links.txt +0 -0
- {mlx_code-0.0.20 → mlx_code-0.0.22}/mlx_code.egg-info/entry_points.txt +0 -0
- {mlx_code-0.0.20 → mlx_code-0.0.22}/mlx_code.egg-info/top_level.txt +0 -0
- {mlx_code-0.0.20 → mlx_code-0.0.22}/setup.cfg +0 -0
- {mlx_code-0.0.20 → mlx_code-0.0.22}/tests/__init__.py +0 -0
- {mlx_code-0.0.20 → mlx_code-0.0.22}/tests/test.py +0 -0
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
Metadata-Version: 2.4
|
|
2
2
|
Name: mlx-code
|
|
3
|
-
Version: 0.0.
|
|
3
|
+
Version: 0.0.22
|
|
4
4
|
Summary: Coding Agent for Mac
|
|
5
5
|
Home-page: https://josefalbers.github.io/mlx-code/
|
|
6
6
|
Author: J Joe
|
|
@@ -17,6 +17,8 @@ Requires-Dist: httpx
|
|
|
17
17
|
Requires-Dist: pydantic
|
|
18
18
|
Requires-Dist: textual>=8.2.7
|
|
19
19
|
Requires-Dist: rich>=15.0.0
|
|
20
|
+
Requires-Dist: starlette
|
|
21
|
+
Requires-Dist: uvicorn
|
|
20
22
|
Provides-Extra: all
|
|
21
23
|
Requires-Dist: python-lsp-server[all]; extra == "all"
|
|
22
24
|
Requires-Dist: GitPython; extra == "all"
|
|
@@ -38,16 +40,16 @@ Dynamic: summary
|
|
|
38
40
|
|
|
39
41
|
A Git-native coding agent that can run entirely on your Mac. No API keys, no cloud, and no data leaving your machine. Powered by Apple MLX, it turns commits, branches, and worktrees into the agent’s state, history, and execution model
|
|
40
42
|
|
|
41
|
-
https://github.com/user-attachments/assets/
|
|
43
|
+
[](https://youtube.com/shorts/1LuifKFKixc)
|
|
42
44
|
|
|
43
45
|
---
|
|
44
46
|
|
|
45
47
|
## Architecture
|
|
46
48
|
|
|
47
49
|
```
|
|
48
|
-
|
|
50
|
+
Worktrees:
|
|
49
51
|
|
|
50
|
-
main
|
|
52
|
+
main ──●──●──●──●──●──●──●──●──●──●──●──●──●──●───────────► Node = git commit + chat hx
|
|
51
53
|
│ │
|
|
52
54
|
│ └── branch-1 ──●──●──●
|
|
53
55
|
│ │ ┌────────────┐
|
|
@@ -55,32 +57,30 @@ Conversation tree (nodes = git commits with embedded chat history)
|
|
|
55
57
|
│ └─────┬──────┘
|
|
56
58
|
└── branch-0 ──●──●──● │
|
|
57
59
|
│
|
|
60
|
+
Tabs: ├────────────► Tab = git branch + Agent
|
|
58
61
|
│
|
|
59
|
-
|
|
60
|
-
│
|
|
61
|
-
│
|
|
62
|
-
┌──────────────────────────────────────────────┼─────────┐
|
|
62
|
+
┌──────────────────────────────────────────────│─────────┐
|
|
63
63
|
│ TUI tabs │ │
|
|
64
64
|
│ ┌──────┐ ┌──────────┐ ┌──────────┐ ┌─────┴──────┐ │
|
|
65
65
|
│ │ main │ │ branch-0 │ │ branch-1 │ │ branch-1-0 │ │
|
|
66
66
|
│ └──────┘ └────┬─────┘ └──────────┘ └────────────┘ │
|
|
67
|
-
|
|
67
|
+
└─────────────────│──────────────────────────────────────┘
|
|
68
68
|
│
|
|
69
|
-
|
|
69
|
+
Agents: ├─────────────────────────────────────────► Each tab runs its own Agent
|
|
70
70
|
│
|
|
71
|
-
|
|
72
|
-
│ Agent
|
|
73
|
-
│
|
|
74
|
-
│ │ API:
|
|
75
|
-
│ │
|
|
76
|
-
│ │
|
|
77
|
-
│ │
|
|
78
|
-
│ │
|
|
79
|
-
│
|
|
80
|
-
│
|
|
81
|
-
│ Git worktree
|
|
82
|
-
│ (isolation + session state)
|
|
83
|
-
|
|
71
|
+
┌────┴─────────────────────────────────────┐
|
|
72
|
+
│ Agent │
|
|
73
|
+
│ ┌────────────────┐ ┌────────────────┐ │
|
|
74
|
+
│ │ API: │ │ Tools: │ │
|
|
75
|
+
│ │ Local (mlx-lm) │ │ Read Write │ │
|
|
76
|
+
│ │ Gemini │ │ Edit Bash │ │
|
|
77
|
+
│ │ Claude │ │ Grep Find │ │
|
|
78
|
+
│ │ Codex │ │ Ls Skill │ │
|
|
79
|
+
│ │ DeepSeek │ │ Agent ─────────┼──┼───► Recursively spawns sub-Agents
|
|
80
|
+
│ └────────────────┘ └────────────────┘ │
|
|
81
|
+
│ Git worktree │
|
|
82
|
+
│ (isolation + session state) │
|
|
83
|
+
└──────────────────────────────────────────┘
|
|
84
84
|
```
|
|
85
85
|
|
|
86
86
|
Each layer is importable and composable on its own. A commit records state, a branch records an alternative path, and a tab is just a live view over an `Agent`.
|
|
@@ -95,28 +95,31 @@ result = await agent.run('refactor utils.py to use dataclasses')
|
|
|
95
95
|
|
|
96
96
|
---
|
|
97
97
|
|
|
98
|
+
## Core ideas
|
|
99
|
+
|
|
100
|
+
- **Git is the state machine.** Every file-changing agent step is committed with the conversation that produced it, so you can inspect, resume, and branch from any checkpoint.
|
|
101
|
+
- **Branches are alternative futures.** A branch is not just a Git branch; it is a different reasoning path with its own worktree and session state.
|
|
102
|
+
- **Agents are the primitive.** Tabs, branches, and delegated subtasks are all instances of the same `Agent` abstraction.
|
|
103
|
+
- **Worktrees provide isolation.** The agent edits in a separate worktree, so your main checkout stays clean and recoverable.
|
|
104
|
+
|
|
105
|
+
---
|
|
106
|
+
|
|
98
107
|
## Quick start
|
|
99
108
|
|
|
100
109
|
```bash
|
|
110
|
+
# ephemeral run (no installation)
|
|
111
|
+
uvx --from mlx-code mlc
|
|
112
|
+
|
|
113
|
+
# or install into the current environment
|
|
101
114
|
pip install mlx-code
|
|
102
|
-
|
|
115
|
+
|
|
116
|
+
# launch
|
|
117
|
+
mlc # with a local MLX model
|
|
103
118
|
mlc-run --api gemini # or use a remote provider
|
|
104
|
-
mlc-run --api deepseek --model deepseek-v4-flash
|
|
105
119
|
```
|
|
106
120
|
|
|
107
121
|
That's it. The first run starts a local inference server and drops you into the REPL.
|
|
108
122
|
|
|
109
|
-
[](https://youtu.be/0lkY7YQCyCo)
|
|
110
|
-
|
|
111
|
-
---
|
|
112
|
-
|
|
113
|
-
## Core ideas
|
|
114
|
-
|
|
115
|
-
- **Git is the state machine.** Every file-changing agent step is committed with the conversation that produced it, so you can inspect, resume, and branch from any checkpoint.
|
|
116
|
-
- **Branches are alternative futures.** A branch is not just a Git branch; it is a different reasoning path with its own worktree and session state.
|
|
117
|
-
- **Agents are the primitive.** Tabs, branches, and delegated subtasks are all instances of the same `Agent` abstraction.
|
|
118
|
-
- **Worktrees provide isolation.** The agent edits in a separate worktree, so your main checkout stays clean and recoverable.
|
|
119
|
-
|
|
120
123
|
---
|
|
121
124
|
|
|
122
125
|
## Why mlx-code
|
|
@@ -125,12 +128,12 @@ That's it. The first run starts a local inference server and drops you into the
|
|
|
125
128
|
|
|
126
129
|
**Git is the database.** When the agent makes file changes, they’re committed to a git worktree with the full conversation embedded in the commit message. Resume any past session by hash, branch from any checkpoint, and inspect the agent timeline with `git log`. No proprietary state files, just Git.
|
|
127
130
|
|
|
128
|
-
**Your working directory is never at risk
|
|
129
|
-
|
|
130
|
-
**Built-in safety nets.** Subprocess environment variables go through an explicit allowlist, so secrets in your shell are never leaked to agent-spawned processes.
|
|
131
|
+
**Built-in safety nets.** Your working directory is never at risk. The agent operates inside a `git worktree`, not your checkout. It can make a mess, and you can inspect or discard it without ever touching `main`. Subprocess environment variables go through an explicit allowlist, so secrets in your shell are never leaked to agent-spawned processes.
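The environment allowlist mentioned above could look roughly like the following. This is a minimal sketch, not mlx-code's actual implementation: the `ALLOWED_ENV` set and the `filtered_env` and `run_sandboxed` helpers are hypothetical names chosen for illustration.

```python
import os
import subprocess

# Hypothetical allowlist: only these variables reach agent-spawned processes.
ALLOWED_ENV = {"PATH", "HOME", "LANG", "TERM"}

def filtered_env(environ=None):
    """Return a copy of the environment containing only allowlisted keys."""
    source = os.environ if environ is None else environ
    return {k: v for k, v in source.items() if k in ALLOWED_ENV}

def run_sandboxed(argv, cwd=None):
    """Run a command with the filtered environment and capture its output."""
    return subprocess.run(
        argv, cwd=cwd, env=filtered_env(),
        capture_output=True, text=True, check=False,
    )
```

An allowlist fails closed: a variable that nobody thought about is dropped by default, whereas a denylist would forward it.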
|
|
131
132
|
|
|
132
133
|
**Batteries included.** Everything ships in one pip install: the MLX inference engine, the multi-protocol API server, the agent loop, the tools, and the TUI. No llama.cpp, no ollama, no vLLM bridge to find and configure. And the server natively speaks OpenAI, Anthropic, Gemini, and Codex wire formats simultaneously, so `claude`, `codex`, and `gemini` CLIs can all work against your local model without a translation layer.
|
|
133
134
|
|
|
135
|
+
**Continuous batching.** The local inference server runs a continuous batching engine that processes multiple sequences concurrently. When you spawn parallel agents (e.g., multiple tabs, `asyncio.gather` pipelines, or delegated sub-tasks), they all share the same GPU context and are stepped together each tick. A prefix cache persists KV snapshots to disk, so repeated system prompts and conversation prefixes are prefilled once and reused across sessions. No request queueing, no waiting for the previous agent to finish.
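To make the prefix-cache idea concrete, here is a toy sketch of an on-disk cache keyed by a hash of the token prefix. It is an assumption-laden illustration only: `ToyPrefixCache`, its file layout, and the JSON "state" stand in for whatever KV snapshot format the real server uses, which this README does not specify.

```python
import hashlib
import json
from pathlib import Path

class ToyPrefixCache:
    """Toy on-disk prefix cache keyed by a hash of the token prefix."""

    def __init__(self, directory):
        self.dir = Path(directory)
        self.dir.mkdir(parents=True, exist_ok=True)

    def _path(self, tokens):
        # Hash the token list so each distinct prefix maps to one file.
        digest = hashlib.sha256(json.dumps(list(tokens)).encode()).hexdigest()
        return self.dir / f"{digest}.json"

    def save(self, tokens, state):
        """Persist a (hypothetical) snapshot for this exact prefix."""
        self._path(tokens).write_text(json.dumps(state))

    def longest_prefix(self, tokens):
        """Return (length, state) for the longest cached prefix, or (0, None)."""
        for n in range(len(tokens), 0, -1):
            path = self._path(tokens[:n])
            if path.exists():
                return n, json.loads(path.read_text())
        return 0, None
```

On a hit, only the tokens after the returned length would need to be prefilled. The linear scan over prefix lengths keeps the example short and is not meant to be efficient.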
|
|
136
|
+
|
|
134
137
|
---
|
|
135
138
|
|
|
136
139
|
## Agent primitive
|
|
@@ -168,12 +171,12 @@ agent.messages = messages
|
|
|
168
171
|
await agent.run("now add unit tests")
|
|
169
172
|
```
|
|
170
173
|
|
|
171
|
-
Branch from any point in the conversation
|
|
174
|
+
Branch from any point in the conversation. Each branch gets its own worktree:
|
|
172
175
|
|
|
173
176
|
```
|
|
174
177
|
/branch # branch from current state
|
|
175
178
|
/branch --rev 2 # branch from the 2nd user turn
|
|
176
|
-
/branch --rev 3
|
|
179
|
+
/branch --rev 3 make it use httpx instead
|
|
177
180
|
```
|
|
178
181
|
|
|
179
182
|
Since it's just git, you can inspect the timeline outside the REPL:
|
|
@@ -238,6 +241,43 @@ Reliability comes from specialization plus constraint. A read-only reviewer can'
|
|
|
238
241
|
|
|
239
242
|
---
|
|
240
243
|
|
|
244
|
+
## Continuous batching
|
|
245
|
+
|
|
246
|
+
The local server can run multiple inference sequences concurrently inside a single batch step. Instead of a global lock that serialises one request at a time, the batching engine maintains a live set of active sequences and yields tokens for all of them on every step.
|
|
247
|
+
|
|
248
|
+
```bash
|
|
249
|
+
mlc --engine batch # continuous batching + built-in REPL
|
|
250
|
+
```
|
|
251
|
+
|
|
252
|
+
This unlocks true parallelism for multi-agent workloads:
|
|
253
|
+
|
|
254
|
+
```python
|
|
255
|
+
import asyncio
|
|
256
|
+
from mlx_code.repl import Agent
|
|
257
|
+
|
|
258
|
+
async def main():
|
|
259
|
+
agents = [Agent() for _ in range(4)]
|
|
260
|
+
await asyncio.gather(*[
|
|
261
|
+
a.run(f"Research topic: {t}")
|
|
262
|
+
for a, t in zip(agents, ["consensus", "cryptography", "networking", "storage"])
|
|
263
|
+
])
|
|
264
|
+
|
|
265
|
+
asyncio.run(main())
|
|
266
|
+
```
|
|
267
|
+
|
|
268
|
+
All four agents generate simultaneously inside the same batch. No sequential blocking.
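The scheduling idea behind this can be sketched without any model at all. The toy engine below is a hypothetical illustration (`ToyBatchEngine` is not part of mlx-code): each call to `step` admits waiting requests into free slots and then advances every active sequence by one token. A real engine would replace the inner loop with a single batched forward pass.

```python
from collections import deque

class ToyBatchEngine:
    """Toy continuous-batching loop: every active sequence advances once per step."""

    def __init__(self, max_batch=4):
        self.max_batch = max_batch
        self.waiting = deque()   # requests not yet admitted
        self.active = {}         # request id -> tokens still to emit
        self.outputs = {}        # request id -> tokens emitted so far

    def submit(self, req_id, tokens):
        self.waiting.append((req_id, list(tokens)))
        self.outputs[req_id] = []

    def step(self):
        # Admit waiting requests into free batch slots before stepping.
        while self.waiting and len(self.active) < self.max_batch:
            req_id, tokens = self.waiting.popleft()
            self.active[req_id] = deque(tokens)
        # Emit one token for every active sequence in this step.
        for req_id in list(self.active):
            pending = self.active[req_id]
            if pending:
                self.outputs[req_id].append(pending.popleft())
            if not pending:
                del self.active[req_id]  # finished sequences free their slot

    def run(self):
        """Step until all requests finish; return the number of steps taken."""
        steps = 0
        while self.active or self.waiting:
            self.step()
            steps += 1
        return steps
```

With `max_batch=2` and three requests of lengths 3, 1, and 2, the loop finishes in 3 steps instead of the 6 a one-at-a-time server would need, because the third request takes over the slot freed by the second.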
|
|
269
|
+
|
|
270
|
+
### Health endpoint
|
|
271
|
+
|
|
272
|
+
```bash
|
|
273
|
+
curl http://127.0.0.1:8000/health
|
|
274
|
+
# {"status":"ok","model":"mlx-community/Qwen3.5-4B-OptiQ-4bit","active_sequences":2,"prefix_cache_files":5}
|
|
275
|
+
```
|
|
276
|
+
|
|
277
|
+
`active_sequences` shows how many agents are generating right now; `prefix_cache_files` shows how many prefix KV snapshots are stored on disk.
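For scripting, the same endpoint can be read from Python using only the standard library. This is a small sketch that assumes the response shape shown in the `curl` example above and the default address `127.0.0.1:8000`; `parse_health` and `fetch_health` are illustrative helper names, not part of the package.

```python
import json
from urllib.request import urlopen

def parse_health(body):
    """Extract the documented fields from a /health JSON response body."""
    data = json.loads(body)
    return {
        "ok": data.get("status") == "ok",
        "active_sequences": int(data.get("active_sequences", 0)),
        "prefix_cache_files": int(data.get("prefix_cache_files", 0)),
    }

def fetch_health(url="http://127.0.0.1:8000/health", timeout=2.0):
    """Query a running server; raises URLError if nothing is listening."""
    with urlopen(url, timeout=timeout) as resp:
        return parse_health(resp.read().decode("utf-8"))
```

Polling `fetch_health()["active_sequences"]` is one way to check whether parallel agents are actually being batched together.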
|
|
278
|
+
|
|
279
|
+
---
|
|
280
|
+
|
|
241
281
|
## Command Line
|
|
242
282
|
|
|
243
283
|
### `mlc`: local server + harness
|
|
@@ -245,20 +285,20 @@ Reliability comes from specialization plus constraint. A read-only reviewer can'
|
|
|
245
285
|
Starts the MLX inference server and launches the built-in TUI harness against it.
|
|
246
286
|
|
|
247
287
|
```bash
|
|
248
|
-
# Default: local server + default
|
|
288
|
+
# Default: local server + default harness
|
|
249
289
|
mlc
|
|
250
290
|
|
|
251
|
-
#
|
|
252
|
-
mlc --
|
|
291
|
+
# Continuous batching mode (default is sequential caching mode)
|
|
292
|
+
mlc --engine batch
|
|
293
|
+
|
|
294
|
+
# Server only, no harness
|
|
295
|
+
mlc --leash none
|
|
253
296
|
|
|
254
297
|
# Use a different harness (routes traffic through the local server)
|
|
255
298
|
mlc --leash claude
|
|
256
299
|
mlc --leash gemini
|
|
257
300
|
mlc --leash codex
|
|
258
301
|
|
|
259
|
-
# Server only, no harness
|
|
260
|
-
mlc --leash none
|
|
261
|
-
|
|
262
302
|
# Specify a model
|
|
263
303
|
mlc --model mlx-community/Qwen3.5-4B-OptiQ-4bit
|
|
264
304
|
|
|
@@ -309,10 +349,9 @@ mlc-run --api codex
|
|
|
309
349
|
echo "explain lsp.py" | mlc-run -a deepseek | cat - PLAN.md | mlc-run --url http://localhost:9000
|
|
310
350
|
|
|
311
351
|
# Simple terminal REPL (no TUI)
|
|
312
|
-
mlc-run --
|
|
352
|
+
mlc-run --bare
|
|
313
353
|
```
|
|
314
354
|
|
|
315
|
-
|
|
316
355
|
---
|
|
317
356
|
|
|
318
357
|
## Using as a Library
|
|
@@ -435,18 +474,19 @@ agent = Agent(extra_tool_classes=[LiveDBTool], tool_names=["QueryDB"])
|
|
|
435
474
|
|
|
436
475
|
| Command | Description |
|
|
437
476
|
|---|---|
|
|
438
|
-
| `/
|
|
477
|
+
| `/branch [--rev N] [prompt]` | Open a new branch tab from the current (or earlier) checkpoint |
|
|
478
|
+
| `/diff [--all]` | Show a side-by-side diff of changes in the worktree |
|
|
439
479
|
| `/clear [--config F]` | Clear conversation; `--config` reloads agent from a JSON/YAML file |
|
|
480
|
+
| `/tab [N]` | Jump to tab N |
|
|
440
481
|
| `/history [--raw]` | Show conversation transcript; `--raw` shows the raw API message log |
|
|
441
|
-
| `/diff [--all]` | Show a side-by-side diff of changes in the worktree |
|
|
442
|
-
| `/errors` | Show timestamped error log for the current tab |
|
|
443
482
|
| `/tools` | List active tools |
|
|
444
|
-
| `/branch [--rev N] [prompt]` | Open a new branch tab from the current (or earlier) checkpoint |
|
|
445
483
|
| `/abort` | Abort the running agent |
|
|
484
|
+
| `/errors` | Show timestamped error log for the current tab |
|
|
446
485
|
| `/export [path]` | Export session to JSON |
|
|
447
|
-
| `/exit
|
|
448
|
-
|
|
|
449
|
-
|
|
|
486
|
+
| `/exit [--all]` | Close branch tab, or exit the app |
|
|
487
|
+
| `/help` | Show command reference |
|
|
488
|
+
| `!command` | Run a shell command; output captured in the TUI (e.g., `ls`, `cat hello.c`) |
|
|
489
|
+
| `$command` | Run an interactive command (e.g., `vim`, `yazi`, `less hello.c`) |
|
|
450
490
|
|
|
451
491
|
### Key bindings
|
|
452
492
|
|
|
@@ -454,9 +494,9 @@ agent = Agent(extra_tool_classes=[LiveDBTool], tool_names=["QueryDB"])
|
|
|
454
494
|
|---|---|
|
|
455
495
|
| `Enter` | Submit |
|
|
456
496
|
| `Ctrl-J` | Insert newline |
|
|
457
|
-
| `
|
|
458
|
-
| `
|
|
459
|
-
| `Ctrl-C` |
|
|
497
|
+
| `Ctrl-1` … `Ctrl-9` | Jump to tab N |
|
|
498
|
+
| `Ctrl-,` / `Ctrl-.` | Cycle through tabs |
|
|
499
|
+
| `Ctrl-C` | Clear input, or abort running agent |
|
|
460
500
|
| `Ctrl-D` | Close branch tab, or exit app |
|
|
461
501
|
| `Ctrl-R` | Recall last prompt into editor |
|
|
462
502
|
|
|
@@ -474,16 +514,16 @@ agent = Agent(extra_tool_classes=[LiveDBTool], tool_names=["QueryDB"])
|
|
|
474
514
|
| `Skill` | Retrieve named skill instructions from config |
|
|
475
515
|
| `Agent` | Spawn an autonomous sub-agent for delegated work |
|
|
476
516
|
|
|
477
|
-
All file tools enforce path sandboxing
|
|
517
|
+
All file tools enforce path sandboxing. The agent cannot read or write outside the worktree.
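A path-sandboxing check of this kind is commonly written by resolving the requested path and verifying that it still sits under the sandbox root. The helper below is a hedged sketch of that general technique, not the package's own code; `resolve_in_sandbox` is a hypothetical name.

```python
from pathlib import Path

def resolve_in_sandbox(root, candidate):
    """Resolve `candidate` against `root`; raise if the result escapes `root`."""
    root_path = Path(root).resolve()
    # resolve() collapses ".." segments and follows symlinks, so tricks like
    # "a/../../etc/passwd" are judged by where they really point.
    target = (root_path / candidate).resolve()
    if target != root_path and root_path not in target.parents:
        raise PermissionError(f"{candidate!r} is outside the sandbox")
    return target
```

Absolute paths are handled by the same check: joining a root with an absolute path yields that absolute path, which then fails the containment test unless it already lies inside the root.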
|
|
478
518
|
|
|
479
519
|
### Backends
|
|
480
520
|
|
|
481
521
|
| Backend | Flag | Notes |
|
|
482
522
|
|---------|------|-------|
|
|
483
|
-
| MLX (local) | `--api noapi` | Default. Runs on-device, no API key needed |
|
|
523
|
+
| MLX-LM (local) | `--api noapi` | Default. Runs on-device, no API key needed |
|
|
484
524
|
| Claude | `--api claude` | Requires `ANTHROPIC_API_KEY` |
|
|
485
525
|
| Gemini | `--api gemini` | Requires `GOOGLE_API_KEY` |
|
|
486
|
-
| DeepSeek | `--api deepseek` |
|
|
526
|
+
| DeepSeek | `--api deepseek` | Requires `DEEPSEEK_API_KEY` |
|
|
487
527
|
| Codex | `--api codex` | OpenAI Codex CLI integration |
|
|
488
528
|
| OpenAI | `--api openai` | Any OpenAI-compatible endpoint |
|
|
489
529
|
|
|
@@ -492,10 +532,13 @@ All file tools enforce path sandboxing — the agent cannot read or write outsid
|
|
|
492
532
|
The local MLX server speaks OpenAI, Anthropic, and Gemini wire formats simultaneously, so you can use any compatible CLI as the frontend:
|
|
493
533
|
|
|
494
534
|
```bash
|
|
495
|
-
mlc
|
|
496
|
-
mlc --
|
|
497
|
-
mlc --
|
|
498
|
-
mlc --leash none #
|
|
535
|
+
mlc # default
|
|
536
|
+
mlc --web # web UI (api.mlx-code.com)
|
|
537
|
+
mlc --bare # no TUI
|
|
538
|
+
mlc --leash none # no harness
|
|
539
|
+
mlc --leash codex # codex CLI
|
|
540
|
+
mlc --leash gemini # gemini CLI
|
|
541
|
+
mlc --leash claude # claude code
|
|
499
542
|
```
|
|
500
543
|
|
|
501
544
|
---
|
|
@@ -2,16 +2,16 @@
|
|
|
2
2
|
|
|
3
3
|
A Git-native coding agent that can run entirely on your Mac. No API keys, no cloud, and no data leaving your machine. Powered by Apple MLX, it turns commits, branches, and worktrees into the agent’s state, history, and execution model
|
|
4
4
|
|
|
5
|
-
https://github.com/user-attachments/assets/
|
|
5
|
+
[](https://youtube.com/shorts/1LuifKFKixc)
|
|
6
6
|
|
|
7
7
|
---
|
|
8
8
|
|
|
9
9
|
## Architecture
|
|
10
10
|
|
|
11
11
|
```
|
|
12
|
-
|
|
12
|
+
Worktrees:
|
|
13
13
|
|
|
14
|
-
main
|
|
14
|
+
main ──●──●──●──●──●──●──●──●──●──●──●──●──●──●───────────► Node = git commit + chat hx
|
|
15
15
|
│ │
|
|
16
16
|
│ └── branch-1 ──●──●──●
|
|
17
17
|
│ │ ┌────────────┐
|
|
@@ -19,32 +19,30 @@ Conversation tree (nodes = git commits with embedded chat history)
|
|
|
19
19
|
│ └─────┬──────┘
|
|
20
20
|
└── branch-0 ──●──●──● │
|
|
21
21
|
│
|
|
22
|
+
Tabs: ├────────────► Tab = git branch + Agent
|
|
22
23
|
│
|
|
23
|
-
|
|
24
|
-
│
|
|
25
|
-
│
|
|
26
|
-
┌──────────────────────────────────────────────┼─────────┐
|
|
24
|
+
┌──────────────────────────────────────────────│─────────┐
|
|
27
25
|
│ TUI tabs │ │
|
|
28
26
|
│ ┌──────┐ ┌──────────┐ ┌──────────┐ ┌─────┴──────┐ │
|
|
29
27
|
│ │ main │ │ branch-0 │ │ branch-1 │ │ branch-1-0 │ │
|
|
30
28
|
│ └──────┘ └────┬─────┘ └──────────┘ └────────────┘ │
|
|
31
|
-
|
|
29
|
+
└─────────────────│──────────────────────────────────────┘
|
|
32
30
|
│
|
|
33
|
-
|
|
31
|
+
Agents: ├─────────────────────────────────────────► Each tab runs its own Agent
|
|
34
32
|
│
|
|
35
|
-
|
|
36
|
-
│ Agent
|
|
37
|
-
│
|
|
38
|
-
│ │ API:
|
|
39
|
-
│ │
|
|
40
|
-
│ │
|
|
41
|
-
│ │
|
|
42
|
-
│ │
|
|
43
|
-
│
|
|
44
|
-
│
|
|
45
|
-
│ Git worktree
|
|
46
|
-
│ (isolation + session state)
|
|
47
|
-
|
|
33
|
+
┌────┴─────────────────────────────────────┐
|
|
34
|
+
│ Agent │
|
|
35
|
+
│ ┌────────────────┐ ┌────────────────┐ │
|
|
36
|
+
│ │ API: │ │ Tools: │ │
|
|
37
|
+
│ │ Local (mlx-lm) │ │ Read Write │ │
|
|
38
|
+
│ │ Gemini │ │ Edit Bash │ │
|
|
39
|
+
│ │ Claude │ │ Grep Find │ │
|
|
40
|
+
│ │ Codex │ │ Ls Skill │ │
|
|
41
|
+
│ │ DeepSeek │ │ Agent ─────────┼──┼───► Recursively spawns sub-Agents
|
|
42
|
+
│ └────────────────┘ └────────────────┘ │
|
|
43
|
+
│ Git worktree │
|
|
44
|
+
│ (isolation + session state) │
|
|
45
|
+
└──────────────────────────────────────────┘
|
|
48
46
|
```
|
|
49
47
|
|
|
50
48
|
Each layer is importable and composable on its own. A commit records state, a branch records an alternative path, and a tab is just a live view over an `Agent`.
|
|
@@ -59,28 +57,31 @@ result = await agent.run('refactor utils.py to use dataclasses')
|
|
|
59
57
|
|
|
60
58
|
---
|
|
61
59
|
|
|
60
|
+
## Core ideas
|
|
61
|
+
|
|
62
|
+
- **Git is the state machine.** Every file-changing agent step is committed with the conversation that produced it, so you can inspect, resume, and branch from any checkpoint.
|
|
63
|
+
- **Branches are alternative futures.** A branch is not just a Git branch; it is a different reasoning path with its own worktree and session state.
|
|
64
|
+
- **Agents are the primitive.** Tabs, branches, and delegated subtasks are all instances of the same `Agent` abstraction.
|
|
65
|
+
- **Worktrees provide isolation.** The agent edits in a separate worktree, so your main checkout stays clean and recoverable.
|
|
66
|
+
|
|
67
|
+
---
|
|
68
|
+
|
|
62
69
|
## Quick start
|
|
63
70
|
|
|
64
71
|
```bash
|
|
72
|
+
# ephemeral run (no installation)
|
|
73
|
+
uvx --from mlx-code mlc
|
|
74
|
+
|
|
75
|
+
# or install into the current environment
|
|
65
76
|
pip install mlx-code
|
|
66
|
-
|
|
77
|
+
|
|
78
|
+
# launch
|
|
79
|
+
mlc # with a local MLX model
|
|
67
80
|
mlc-run --api gemini # or use a remote provider
|
|
68
|
-
mlc-run --api deepseek --model deepseek-v4-flash
|
|
69
81
|
```
|
|
70
82
|
|
|
71
83
|
That's it. The first run starts a local inference server and drops you into the REPL.
|
|
72
84
|
|
|
73
|
-
[](https://youtu.be/0lkY7YQCyCo)
|
|
74
|
-
|
|
75
|
-
---
|
|
76
|
-
|
|
77
|
-
## Core ideas
|
|
78
|
-
|
|
79
|
-
- **Git is the state machine.** Every file-changing agent step is committed with the conversation that produced it, so you can inspect, resume, and branch from any checkpoint.
|
|
80
|
-
- **Branches are alternative futures.** A branch is not just a Git branch; it is a different reasoning path with its own worktree and session state.
|
|
81
|
-
- **Agents are the primitive.** Tabs, branches, and delegated subtasks are all instances of the same `Agent` abstraction.
|
|
82
|
-
- **Worktrees provide isolation.** The agent edits in a separate worktree, so your main checkout stays clean and recoverable.
|
|
83
|
-
|
|
84
85
|
---
|
|
85
86
|
|
|
86
87
|
## Why mlx-code
|
|
@@ -89,12 +90,12 @@ That's it. The first run starts a local inference server and drops you into the
|
|
|
89
90
|
|
|
90
91
|
**Git is the database.** When the agent makes file changes, they’re committed to a git worktree with the full conversation embedded in the commit message. Resume any past session by hash, branch from any checkpoint, and inspect the agent timeline with `git log`. No proprietary state files, just Git.
|
|
91
92
|
|
|
92
|
-
**Your working directory is never at risk
|
|
93
|
-
|
|
94
|
-
**Built-in safety nets.** Subprocess environment variables go through an explicit allowlist, so secrets in your shell are never leaked to agent-spawned processes.
|
|
93
|
+
**Built-in safety nets.** Your working directory is never at risk. The agent operates inside a `git worktree`, not your checkout. It can make a mess, and you can inspect or discard it without ever touching `main`. Subprocess environment variables go through an explicit allowlist, so secrets in your shell are never leaked to agent-spawned processes.
|
|
95
94
|
|
|
96
95
|
**Batteries included.** Everything ships in one pip install: the MLX inference engine, the multi-protocol API server, the agent loop, the tools, and the TUI. No llama.cpp, no ollama, no vLLM bridge to find and configure. And the server natively speaks OpenAI, Anthropic, Gemini, and Codex wire formats simultaneously, so `claude`, `codex`, and `gemini` CLIs can all work against your local model without a translation layer.
|
|
97
96
|
|
|
97
|
+
**Continuous batching.** The local inference server runs a continuous batching engine that processes multiple sequences concurrently. When you spawn parallel agents (e.g., multiple tabs, `asyncio.gather` pipelines, or delegated sub-tasks), they all share the same GPU context and are stepped together each tick. A prefix cache persists KV snapshots to disk, so repeated system prompts and conversation prefixes are prefilled once and reused across sessions. No request queueing, no waiting for the previous agent to finish.
|
|
98
|
+
|
|
98
99
|
---
|
|
99
100
|
|
|
100
101
|
## Agent primitive
|
|
@@ -132,12 +133,12 @@ agent.messages = messages
|
|
|
132
133
|
await agent.run("now add unit tests")
|
|
133
134
|
```
|
|
134
135
|
|
|
135
|
-
Branch from any point in the conversation
|
|
136
|
+
Branch from any point in the conversation. Each branch gets its own worktree:
|
|
136
137
|
|
|
137
138
|
```
|
|
138
139
|
/branch # branch from current state
|
|
139
140
|
/branch --rev 2 # branch from the 2nd user turn
|
|
140
|
-
/branch --rev 3
|
|
141
|
+
/branch --rev 3 make it use httpx instead
|
|
141
142
|
```
|
|
142
143
|
|
|
143
144
|
Since it's just git, you can inspect the timeline outside the REPL:
|
|
@@ -202,6 +203,43 @@ Reliability comes from specialization plus constraint. A read-only reviewer can'
|
|
|
202
203
|
|
|
203
204
|
---
|
|
204
205
|
|
|
206
|
+
## Continuous batching
|
|
207
|
+
|
|
208
|
+
The local server can run multiple inference sequences concurrently inside a single batch step. Instead of a global lock that serialises one request at a time, the batching engine maintains a live set of active sequences and yields tokens for all of them on every step.
|
|
209
|
+
|
|
210
|
+
```bash
|
|
211
|
+
mlc --engine batch # continuous batching + built-in REPL
|
|
212
|
+
```
|
|
213
|
+
|
|
214
|
+
This unlocks true parallelism for multi-agent workloads:
|
|
215
|
+
|
|
216
|
+
```python
|
|
217
|
+
import asyncio
|
|
218
|
+
from mlx_code.repl import Agent
|
|
219
|
+
|
|
220
|
+
async def main():
|
|
221
|
+
agents = [Agent() for _ in range(4)]
|
|
222
|
+
await asyncio.gather(*[
|
|
223
|
+
a.run(f"Research topic: {t}")
|
|
224
|
+
for a, t in zip(agents, ["consensus", "cryptography", "networking", "storage"])
|
|
225
|
+
])
|
|
226
|
+
|
|
227
|
+
asyncio.run(main())
|
|
228
|
+
```
|
|
229
|
+
|
|
230
|
+
All four agents generate simultaneously inside the same batch. No sequential blocking.
|
|
231
|
+
|
|
232
|
+
### Health endpoint
|
|
233
|
+
|
|
234
|
+
```bash
|
|
235
|
+
curl http://127.0.0.1:8000/health
|
|
236
|
+
# {"status":"ok","model":"mlx-community/Qwen3.5-4B-OptiQ-4bit","active_sequences":2,"prefix_cache_files":5}
|
|
237
|
+
```
|
|
238
|
+
|
|
239
|
+
`active_sequences` shows how many agents are generating right now; `prefix_cache_files` shows how many prefix KV snapshots are stored on disk.
|
|
240
|
+
|
|
241
|
+
---
|
|
242
|
+
|
|
205
243
|
## Command Line
|
|
206
244
|
|
|
207
245
|
### `mlc`: local server + harness
|
|
@@ -209,20 +247,20 @@ Reliability comes from specialization plus constraint. A read-only reviewer can'
|
|
|
209
247
|
Starts the MLX inference server and launches the built-in TUI harness against it.
|
|
210
248
|
|
|
211
249
|
```bash
|
|
212
|
-
# Default: local server + default
|
|
250
|
+
# Default: local server + default harness
|
|
213
251
|
mlc
|
|
214
252
|
|
|
215
|
-
#
|
|
216
|
-
mlc --
|
|
253
|
+
# Continuous batching mode (default is sequential caching mode)
|
|
254
|
+
mlc --engine batch
|
|
255
|
+
|
|
256
|
+
# Server only, no harness
|
|
257
|
+
mlc --leash none
|
|
217
258
|
|
|
218
259
|
# Use a different harness (routes traffic through the local server)
|
|
219
260
|
mlc --leash claude
|
|
220
261
|
mlc --leash gemini
|
|
221
262
|
mlc --leash codex
|
|
222
263
|
|
|
223
|
-
# Server only, no harness
|
|
224
|
-
mlc --leash none
|
|
225
|
-
|
|
226
264
|
# Specify a model
|
|
227
265
|
mlc --model mlx-community/Qwen3.5-4B-OptiQ-4bit
|
|
228
266
|
|
|
@@ -273,10 +311,9 @@ mlc-run --api codex
|
|
|
273
311
|
echo "explain lsp.py" | mlc-run -a deepseek | cat - PLAN.md | mlc-run --url http://localhost:9000
|
|
274
312
|
|
|
275
313
|
# Simple terminal REPL (no TUI)
|
|
276
|
-
mlc-run --
|
|
314
|
+
mlc-run --bare
|
|
277
315
|
```
|
|
278
316
|
|
|
279
|
-
|
|
280
317
|
---
|
|
281
318
|
|
|
282
319
|
## Using as a Library
|
|
@@ -399,18 +436,19 @@ agent = Agent(extra_tool_classes=[LiveDBTool], tool_names=["QueryDB"])
|
|
|
399
436
|
|
|
400
437
|
| Command | Description |
|
|
401
438
|
|---|---|
|
|
402
|
-
| `/
|
|
439
|
+
| `/branch [--rev N] [prompt]` | Open a new branch tab from the current (or earlier) checkpoint |
|
|
440
|
+
| `/diff [--all]` | Show a side-by-side diff of changes in the worktree |
|
|
403
441
|
| `/clear [--config F]` | Clear conversation; `--config` reloads agent from a JSON/YAML file |
|
|
442
|
+
| `/tab [N]` | Jump to tab N |
|
|
404
443
|
| `/history [--raw]` | Show conversation transcript; `--raw` shows the raw API message log |
|
|
405
|
-
| `/diff [--all]` | Show a side-by-side diff of changes in the worktree |
|
|
406
|
-
| `/errors` | Show timestamped error log for the current tab |
|
|
407
444
|
| `/tools` | List active tools |
|
|
408
|
-
| `/branch [--rev N] [prompt]` | Open a new branch tab from the current (or earlier) checkpoint |
|
|
409
445
|
| `/abort` | Abort the running agent |
|
|
446
|
+
| `/errors` | Show timestamped error log for the current tab |
|
|
410
447
|
| `/export [path]` | Export session to JSON |
|
|
411
|
-
| `/exit
|
|
412
|
-
|
|
|
413
|
-
|
|
|
448
|
+
| `/exit [--all]` | Close branch tab, or exit the app |
|
|
449
|
+
| `/help` | Show command reference |
|
|
450
|
+
| `!command` | Run a shell command; output captured in the TUI (e.g., `ls`, `cat hello.c`) |
|
|
451
|
+
| `$command` | Run an interactive command (e.g., `vim`, `yazi`, `less hello.c`) |
|
|
414
452
|
|
|
415
453
|
### Key bindings
|
|
416
454
|
|
|
@@ -418,9 +456,9 @@ agent = Agent(extra_tool_classes=[LiveDBTool], tool_names=["QueryDB"])
|
|
|
418
456
|
|---|---|
|
|
419
457
|
| `Enter` | Submit |
|
|
420
458
|
| `Ctrl-J` | Insert newline |
|
|
421
|
-
| `
|
|
422
|
-
| `
|
|
423
|
-
| `Ctrl-C` |
|
|
459
|
+
| `Ctrl-1` … `Ctrl-9` | Jump to tab N |
|
|
460
|
+
| `Ctrl-,` / `Ctrl-.` | Cycle through tabs |
|
|
461
|
+
| `Ctrl-C` | Clear input, or abort running agent |
|
|
424
462
|
| `Ctrl-D` | Close branch tab, or exit app |
|
|
425
463
|
| `Ctrl-R` | Recall last prompt into editor |
|
|
426
464
|
|
|
@@ -438,16 +476,16 @@ agent = Agent(extra_tool_classes=[LiveDBTool], tool_names=["QueryDB"])
|
|
|
438
476
|
| `Skill` | Retrieve named skill instructions from config |
|
|
439
477
|
| `Agent` | Spawn an autonomous sub-agent for delegated work |
|
|
440
478
|
|
|
441
|
-
All file tools enforce path sandboxing
|
|
479
|
+
All file tools enforce path sandboxing. The agent cannot read or write outside the worktree.
|
|
442
480
|
|
|
443
481
|
### Backends
|
|
444
482
|
|
|
445
483
|
| Backend | Flag | Notes |
|
|
446
484
|
|---------|------|-------|
|
|
447
|
-
| MLX (local) | `--api noapi` | Default. Runs on-device, no API key needed |
|
|
485
|
+
| MLX-LM (local) | `--api noapi` | Default. Runs on-device, no API key needed |
|
|
448
486
|
| Claude | `--api claude` | Requires `ANTHROPIC_API_KEY` |
|
|
449
487
|
| Gemini | `--api gemini` | Requires `GOOGLE_API_KEY` |
|
|
450
|
-
| DeepSeek | `--api deepseek` |
|
|
488
|
+
| DeepSeek | `--api deepseek` | Requires `DEEPSEEK_API_KEY` |
|
|
451
489
|
| Codex | `--api codex` | OpenAI Codex CLI integration |
|
|
452
490
|
| OpenAI | `--api openai` | Any OpenAI-compatible endpoint |
|
|
453
491
|
|
|
@@ -456,10 +494,13 @@ All file tools enforce path sandboxing — the agent cannot read or write outsid
|
|
|
456
494
|
The local MLX server speaks OpenAI, Anthropic, and Gemini wire formats simultaneously, so you can use any compatible CLI as the frontend:
|
|
457
495
|
|
|
458
496
|
```bash
|
|
459
|
-
mlc
|
|
460
|
-
mlc --
|
|
461
|
-
mlc --
|
|
462
|
-
mlc --leash none #
|
|
497
|
+
mlc # default
|
|
498
|
+
mlc --web # web UI (api.mlx-code.com)
|
|
499
|
+
mlc --bare # no TUI
|
|
500
|
+
mlc --leash none # no harness
|
|
501
|
+
mlc --leash codex # codex CLI
|
|
502
|
+
mlc --leash gemini # gemini CLI
|
|
503
|
+
mlc --leash claude # claude code
|
|
463
504
|
```
|
|
464
505
|
|
|
465
506
|
---
|