mlx-code 0.0.1__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -0,0 +1,209 @@
+ Metadata-Version: 2.4
+ Name: mlx-code
+ Version: 0.0.1
+ Summary: Local Claude Code for Mac
+ Home-page: https://github.com/JosefAlbers/mlx-code
+ Author: J Joe
+ Author-email: albersj66@gmail.com
+ License: Apache-2.0
+ Requires-Python: >=3.11
+ Description-Content-Type: text/markdown
+ Requires-Dist: mlx-lm>=0.19.0
+ Requires-Dist: PyYAML
+ Dynamic: author
+ Dynamic: author-email
+ Dynamic: description
+ Dynamic: description-content-type
+ Dynamic: home-page
+ Dynamic: license
+ Dynamic: requires-dist
+ Dynamic: requires-python
+ Dynamic: summary
+
+ # mlx-code: Local Claude Code for Mac
+
+ <img width="1024" height="1536" alt="mlxcodelogo" src="https://github.com/user-attachments/assets/df605c1d-f8e7-433c-9aaf-176c73e3d628" />
+ ![mlxcode](https://github.com/user-attachments/assets/86004bd2-549f-44b8-9a05-a9b6d9f1c4a2)
+
+ ### Quick Start
+
+ ```bash
+ brew install --cask claude-code
+ pip install mlx-code
+ ```
+
+ ### Commands
+ | Command | What it does | Example |
+ |--------|--------------|--------|
+ | `mlx-code` | Start interactive mode | `mlx-code` |
+ | `mlx-code "task"` | Run a one-time task | `mlx-code "fix the build error"` |
+ | `mlx-code -p "query"` | Run one-off query, then exit | `mlx-code -p "explain this function"` |
+ | `mlx-code -c` | Continue most recent conversation in current directory | `mlx-code -c` |
+ | `mlx-code -r` | Resume a previous conversation | `mlx-code -r` |
+ | `mlx-code commit` | Create a Git commit | `mlx-code commit` |
+ | `/clear` | Clear conversation history | `/clear` |
+ | `/help` | Show available commands | `/help` |
+ | `exit` or `Ctrl+C` | Exit Claude Code | `exit` |
+
+ <details><summary>Click to expand</summary><pre>
+ ╭─── Claude Code v2.1.74 ────────────────────────────────────────────────────────────────────────────────╮
+ │ │ Tips for getting started │
+ │ Welcome back! │ Run /init to create a CLAUDE.md file with instru… │
+ │ │ ───────────────────────────────────────────────── │
+ │ ▐▛███▜▌ │ Recent activity │
+ │ ▝▜█████▛▘ │ No recent activity │
+ │ ▘▘ ▝▝ │ │
+ │ │ │
+ │ mlx-community/Qwen3.5-4B-Opti… · API Usage Billing │ │
+ │ /…/T/tmpg4sztu_h/workspace │ │
+ ╰────────────────────────────────────────────────────────────────────────────────────────────────────────╯
+
+ ❯ explain the folder structure
+
+
+
+ ⏺ Searched for 1 pattern (ctrl+o to expand)
+
+ ⏺ Here's the folder structure of your workspace:
+
+ workspace/
+ ├── .DS_Store
+ ├── cache/
+ │   └── cache.safetensors
+ ├── cc/
+ │   ├── .DS_Store
+ │   ├── mlx-code-v1.py
+ │   └── mlx-code-v2.py
+ ├── cyberpunk/
+ │   ├── build.zig
+ │   ├── help.py
+ │   ├── map.c
+ │   └── main.zig
+ ├── rag/
+ │   └── jina.py
+ ├── rts/
+ │   ├── build.zig
+ │   ├── main.zig
+ │   └── battle_sim.c
+ ├── splat_zig_raylib/
+ │   ├── README.md
+ │   ├── .DS_Store
+ │   ├── build.zig
+ │   ├── build_web.sh
+ │   ├── index.html
+ │   ├── main.zig
+ │   ├── gsplat_transform.py
+ │   ├── src/
+ │   │   └── main.zig
+ │   └── splat_zig/
+ │       ├── .DS_Store
+ │       ├── build.zig
+ │       ├── build_web.sh
+ │       ├── gsplat_transform.py
+ │       ├── index.html
+ │       ├── main.zig
+ │       └── splat/
+ │           └── build.zig
+ ├── tarot/
+ │   ├── index.html
+ │   └── tarot.py
+ └── tmp.md
+
+ Summary of projects:
+
+ ┌───────────────────┬────────────────────────────────────────────┐
+ │ Folder            │ Purpose                                    │
+ ├───────────────────┼────────────────────────────────────────────┤
+ │ cc/               │ MLX code versions (v1, v2)                 │
+ ├───────────────────┼────────────────────────────────────────────┤
+ │ cyberpunk/        │ Zig game project with C map integration    │
+ ├───────────────────┼────────────────────────────────────────────┤
+ │ rag/              │ RAG (Retrieval-Augmented Generation) setup │
+ ├───────────────────┼────────────────────────────────────────────┤
+ │ rts/              │ Real-time strategy battle simulation       │
+ ├───────────────────┼────────────────────────────────────────────┤
+ │ splat_zig_raylib/ │ Raylib game with splat transform features  │
+ ├───────────────────┼────────────────────────────────────────────┤
+ │ tarot/            │ Tarot card application                     │
+ ├───────────────────┼────────────────────────────────────────────┤
+ │ cache/            │ Model cache files                          │
+ ├───────────────────┼────────────────────────────────────────────┤
+ │ tmp.md            │ Temporary notes                            │
+ └───────────────────┴────────────────────────────────────────────┘
+
+ ❯ how is output parsed in mlx-code.py?
+
+
+
+ ⏺ Read 1 file (ctrl+o to expand)
+
+ ⏺ Based on the code, here's how output is parsed in mlx-code.py:
+
+ Output Parsing Flow
+
+ 1. decode() function (lines 109-143)
+
+ This is the core parsing function that converts raw model output into structured blocks:
+
+ - Input: Raw text from the model
+ - Output: List of blocks with types: thinking, text, or tool_use
+
+ Key parsing logic:
+ - Splits text by `<think>...</think>` markers for thinking content
+ - Extracts tool calls wrapped in <tool_call>...</tool_call> tags
+ - Parses function name and parameters from XML-like tags
+
+ 2. Tool Call Extraction (lines 125-141)
+
+ When parsing tool calls:
+ tool_pattern = re.compile(r'<tool_call>(.*?)</tool_call>', re.DOTALL)
+ - Matches content between <tool_call> and </tool_call>
+ - Extracts function name: <function=name>
+ - Extracts parameters: <parameter=name>value</parameter>
+ - Creates tool_use block with:
+   - type: "tool_use"
+   - id: UUID
+   - name: function name
+   - input: params dict
+
+ 3. blocks_to_sse() function (lines 145-177)
+
+ Converts parsed blocks to Server-Sent Events (SSE) format:
+
+ def event(name: str, data: dict) -> bytes:
+     return f"event: {name}\ndata: {json.dumps(data)}\n\n".encode()
+
+ Event types sent:
+ - message_start - Initial message header
+ - content_block_start - Start of a content block
+ - content_block_delta - Streaming content (text/thinking/tool)
+ - content_block_stop - End of block
+ - message_delta - Stop reason
+ - message_stop - End of message
+
+ 4. HTTP Response (lines 288-297)
+
+ Sends SSE stream with headers:
+ self.send_header("Content-Type", "text/event-stream")
+ self.send_header("Cache-Control", "no-cache")
+
+ 5. Endpoint (/v1/messages)
+
+ - POST request with JSON body
+ - Returns streaming SSE response
+ - Supports max_tokens parameter
+
+ 6. Stop Reason Detection
+
+ - tool_use - If tool calls were generated
+ - end_turn - If just text/thinking was generated
+
+ ✻ Brewed for 35s
+
+ ❯ cool mate, bye
+
+ ⏺ Got it! Feel free to come back anytime if you need help. Have a great day! 👋
+ </pre></details><br>
+
mlx_code-0.0.1/main.py ADDED
@@ -0,0 +1,350 @@
+ import tempfile
+ import argparse
+ import json
+ import logging
+ import os
+ import re
+ import subprocess
+ import sys
+ import threading
+ import time
+ import uuid
+ from http.server import BaseHTTPRequestHandler, HTTPServer
+ from pathlib import Path
+ import mlx.core as mx
+ import mlx_lm
+ from mlx_lm.generate import generate_step
+
+ prompt_cache = {}
+ logging.basicConfig(filename="mlx_trace.log", filemode='w', level=logging.DEBUG, format="%(message)s") # format="【\n%(message)s\n】\n")
+ logger = logging.getLogger(__name__)
+ gen_lock = threading.Lock()
+
+ def parse_tool(tools, names):
+     qwen_tools = []
+     for tool in tools:
+         if tool["name"] in names:
+             qwen_tool = {
+                 "type": "function",
+                 "function": {
+                     "name": tool["name"],
+                     "description": tool["description"],
+                     "parameters": tool.get("input_schema", {
+                         "type": "object",
+                         "properties": {}
+                     })
+                 }
+             }
+             params = qwen_tool["function"]["parameters"]
+             params.pop("$schema", None)
+             qwen_tools.append(qwen_tool)
+     return qwen_tools
+
+ def encode(body, tokenizer, system, names, skips):
+     msgs = []
+     sys_parts = []
+     if isinstance(system, str):
+         env = "\n".join(l.strip() for l in next((b["text"] for b in body.get("system", []) if "Primary working directory" in b.get("text", "")), "").splitlines() if "Primary working directory" in l or "Shell:" in l)
+         sys_parts.append(system.replace("{env}", env))
+     else:
+         raw_system = body.get("system")
+         if isinstance(raw_system, str) and raw_system.strip():
+             sys_parts.append(raw_system.strip())
+         elif isinstance(raw_system, list):
+             for block in raw_system:
+                 if block.get("type") != "text":
+                     continue
+                 text = block.get("text", "").strip()
+                 if re.match(r'^\S+:\s', text) and '\n' not in text:
+                     continue
+                 if text:
+                     sys_parts.append(text)
+     if sys_parts:
+         msgs.append({"role": "system", "content": "\n\n".join(sys_parts)})
+     calls = {}
+     def skip(text, show_skipped=False):
+         if skips is None:
+             return text
+         lines = []
+         for pattern in skips:
+             found = re.findall(pattern, text)
+             if found:
+                 lines.append(
+                     f"{pattern}\n" +
+                     "\n".join(re.sub(r"\S", ".", m) for m in found)
+                 )
+         if lines and show_skipped:
+             logger.debug("\n".join(lines))
+         for pattern in skips:
+             text = re.sub(pattern, "", text)
+         return text
+     for msg in body.get("messages", []):
+         role = msg["role"]
+         content = msg["content"]
+         if isinstance(content, str):
+             content = [dict(type='text', text=content)]
+         parts = {}
+         for block in content:
+             t = block.get("type")
+             if t == "text":
+                 parts['content'] = parts.get('content', '').rstrip() + '\n' + skip(block['text']).rstrip()
+             elif t == "thinking":
+                 parts['reasoning_content'] = block['thinking']
+             elif t == "tool_use":
+                 calls[block["id"]] = block
+             elif t == "tool_result":
+                 # Fall back to an empty call so an unknown tool_use_id does not raise TypeError.
+                 tu = calls.get(block["tool_use_id"]) or {"name": "", "input": {}}
+                 rc = block.get("content", "")
+                 if isinstance(rc, list):
+                     rc = skip("\n".join(c.get("text", "") for c in rc if c.get("type") == "text"))
+                 parts |= {"role": "tool", "name": tu['name'], "content": f"{tu['input']}\n{rc}"}
+         if parts:
+             msgs.append({"role": role}|parts)
+     # Guard against an empty message list before indexing the last entry.
+     if not msgs or not msgs[-1].get('content', '').strip():
+         return None
+     return tokenizer.apply_chat_template(msgs, tools = parse_tool(body.get("tools", []), names), tokenize=False, add_generation_prompt=True)
+
+ def decode(raw_text, tokenizer, parse_think=True):
+     raw_text = '<think>' + raw_text if (c := raw_text.find('</think>')) != -1 and ((o := raw_text.find('<think>')) == -1 or c < o) else raw_text
+     blocks = []
+     if parse_think:
+         parts = re.split(r'(<think>.*?</think>)', raw_text, flags=re.DOTALL)
+     else:
+         parts = [raw_text]
+     for part in parts:
+         if not part:
+             continue
+         if parse_think and part.startswith('<think>') and part.endswith('</think>'):
+             thinking_content = part[7:-8].strip()
+             if thinking_content:
+                 blocks.append({"type": "thinking", "thinking": thinking_content})
+         else:
+             blocks.append({"type": "text", "text": part})
+             tool_pattern = re.compile(r'<tool_call>(.*?)</tool_call>', re.DOTALL)
+             for match in tool_pattern.finditer(part):
+                 content = match.group(1).strip()
+                 if not ("<function=" in content and "<parameter=" in content and "</parameter>" in content):
+                     continue
+
+                 fn_match = re.search(r"<function=([^\s>]+)>", content)
+                 if fn_match:
+                     name = fn_match.group(1)
+                     params = re.findall(r"<parameter=([^\s>]+)>\s*(.*?)\s*</parameter>", content, re.DOTALL)
+                     args = {k: v.strip() for k, v in params}
+                     blocks.append({
+                         "type": "tool_use",
+                         "id": f"toolu_{uuid.uuid4().hex[:8]}",
+                         "name": name,
+                         "input": args,
+                     })
+     stop_reason = "tool_use" if any(b["type"] == "tool_use" for b in blocks) else "end_turn"
+     return blocks, stop_reason
+
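Reviewer sketch, not part of the package: the tool-call extraction in `decode()` can be tried without loading a model. The helper name `extract_tool_calls` and the sample string are assumptions made for illustration; the three regular expressions are the ones used in `decode()` above.

```python
import re

# Same patterns as decode(): outer element, function name, parameters.
TOOL_CALL = re.compile(r"<tool_call>(.*?)</tool_call>", re.DOTALL)
FUNCTION = re.compile(r"<function=([^\s>]+)>")
PARAMETER = re.compile(r"<parameter=([^\s>]+)>\s*(.*?)\s*</parameter>", re.DOTALL)


def extract_tool_calls(text: str) -> list[dict]:
    """Return one {"name", "input"} dict per <tool_call> that names a function."""
    calls = []
    for match in TOOL_CALL.finditer(text):
        body = match.group(1)
        fn = FUNCTION.search(body)
        if fn is None:
            continue  # no function name, so there is nothing to dispatch
        calls.append({"name": fn.group(1), "input": dict(PARAMETER.findall(body))})
    return calls


# Hand-written sample; the exact markup a model emits is an assumption here.
sample = (
    "Let me look.\n"
    "<tool_call>\n<function=Read>\n"
    "<parameter=file_path>\nmain.py\n</parameter>\n"
    "</function>\n</tool_call>"
)
calls = extract_tool_calls(sample)
```

Every parameter value comes back as a string, so a tool that expects a number or boolean has to convert it on the receiving side. Unlike `decode()`, this sketch also accepts a call that has no parameters.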
+ def blocks_to_sse(blocks: list[dict], msg_id: str, in_tokens: int, out_tokens: int, stop_reason='end_turn') -> bytes:
+     def event(name: str, data: dict) -> bytes:
+         return f"event: {name}\ndata: {json.dumps(data)}\n\n".encode()
+     out = bytearray()
+     out += event("message_start", {"type": "message_start", "message": {
+         "id": msg_id, "type": "message", "role": "assistant",
+         "model": "local", "content": [], "stop_reason": None, "stop_sequence": None,
+         "usage": {"input_tokens": in_tokens, "output_tokens": 0},
+     }})
+     for i, block in enumerate(blocks):
+         bt = block["type"]
+         if bt == "text":
+             out += event("content_block_start", {"type": "content_block_start", "index": i,
+                 "content_block": {"type": "text", "text": ""}})
+             out += event("content_block_delta", {"type": "content_block_delta", "index": i,
+                 "delta": {"type": "text_delta", "text": block["text"]}})
+         elif bt == "thinking":
+             out += event("content_block_start", {"type": "content_block_start", "index": i,
+                 "content_block": {"type": "thinking", "thinking": ""}})
+             out += event("content_block_delta", {"type": "content_block_delta", "index": i,
+                 "delta": {"type": "thinking_delta", "thinking": block["thinking"]}})
+         elif bt == "tool_use":
+             out += event("content_block_start", {"type": "content_block_start", "index": i,
+                 "content_block": {"type": "tool_use", "id": block["id"],
+                     "name": block["name"], "input": {}}})
+             out += event("content_block_delta", {"type": "content_block_delta", "index": i,
+                 "delta": {"type": "input_json_delta", "partial_json": json.dumps(block["input"])}})
+         out += event("content_block_stop", {"type": "content_block_stop", "index": i})
+     out += event("message_delta", {"type": "message_delta",
+         "delta": {"stop_reason": stop_reason, "stop_sequence": None},
+         "usage": {"output_tokens": out_tokens}})
+     out += event("message_stop", {"type": "message_stop"})
+     return bytes(out)
+
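Reviewer sketch, not part of the package: a quick way to check the framing produced by `blocks_to_sse()` is to parse a payload back into events. `sse_event` copies the inner `event()` helper above; `parse_sse` is a hypothetical helper that assumes the whole payload is available at once, which matches how the handler sends it with a `Content-Length`.

```python
import json


def sse_event(name: str, data: dict) -> bytes:
    """Frame one event the same way blocks_to_sse() does."""
    return f"event: {name}\ndata: {json.dumps(data)}\n\n".encode()


def parse_sse(payload: bytes) -> list[tuple[str, dict]]:
    """Split a complete SSE payload into (event name, decoded JSON data) pairs."""
    events = []
    for frame in payload.decode().split("\n\n"):
        if not frame.strip():
            continue  # empty remainder after the final separator
        # Each frame holds exactly two "field: value" lines: event and data.
        fields = dict(line.split(": ", 1) for line in frame.split("\n"))
        events.append((fields["event"], json.loads(fields["data"])))
    return events


stream = (
    sse_event("content_block_start", {"type": "content_block_start", "index": 0,
                                      "content_block": {"type": "text", "text": ""}})
    + sse_event("content_block_delta", {"type": "content_block_delta", "index": 0,
                                        "delta": {"type": "text_delta", "text": "hi\n\nthere"}})
    + sse_event("content_block_stop", {"type": "content_block_stop", "index": 0})
)
events = parse_sse(stream)
```

Splitting on a blank line is safe here because `json.dumps` escapes newlines inside strings, so the `data:` field always stays on a single line even when the text itself contains blank lines.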
+ def dmca(p_str):
+     pattern1 = r"(<\|im_start\|>system\n)(.*?)(<\|im_end\|>)"
+     def mask_system(match):
+         return match.group(1) + re.sub(r"\S", ".", match.group(2)) + match.group(3)
+     p_str = re.sub(pattern1, mask_system, p_str, flags=re.DOTALL)
+     block_patterns = [
+         r"(?m)^<system-reminder>[\s\S]*?^</system-reminder>\s*",
+         r"(?m)^\[SUGGESTION MODE[\s\S]*",
+     ]
+     for pattern in block_patterns:
+         p_str = re.sub(pattern, lambda m: re.sub(r"\S", ".", m.group(0)), p_str)
+     return p_str
+
+ def generate(model, tokenizer, prompt, hook=None, max_tokens=256, helper_max_tokens=64, **kwargs):
+     global prompt_cache
+     if prompt is None:
+         return '', 0, 0
+     if not isinstance(tokenizer, mlx_lm.tokenizer_utils.TokenizerWrapper):
+         tokenizer = mlx_lm.tokenizer_utils.TokenizerWrapper(tokenizer)
+     detokenizer = tokenizer.detokenizer
+     if isinstance(prompt, str):
+         add_special_tokens = tokenizer.bos_token is None or not prompt.startswith(tokenizer.bos_token)
+         prompt_s = prompt
+         prompt = tokenizer.encode(prompt, add_special_tokens=add_special_tokens)
+     else:
+         prompt_s = tokenizer.decode(prompt)
+     logger.debug(dmca(prompt_s))
+     stream = logging.getLogger().handlers[0].stream
+     common_len = 0
+     if prompt_cache.get('cache', None):
+         for p, h in zip(prompt, prompt_cache['hx']):
+             if p == h:
+                 common_len += 1
+             else:
+                 break
+     else:
+         prompt_cache['hx'] = []
+         prompt_cache['cache'] = mlx_lm.models.cache.make_prompt_cache(model)
+     trim_amount = len(prompt_cache['hx']) - common_len
+     mlx_lm.models.cache.trim_prompt_cache(prompt_cache['cache'], trim_amount)
+     prompt_cache['hx'] = prompt_cache['hx'][:common_len]
+     token_gen = generate_step(
+         mx.array(prompt[common_len:]),
+         model,
+         prompt_cache=prompt_cache['cache'],
+         max_tokens=max_tokens,
+         **kwargs,
+     )
+     text = ""
+     tic_non = time.perf_counter()
+     # Initialise tic_inp so the timing log below cannot hit an unbound name
+     # when the first token is EOS or no token is produced.
+     tic_inp = tic_non
+     gens = []
+     for token, _ in token_gen:
+         gens.append(token)
+         if token in tokenizer.eos_token_ids:
+             break
+         detokenizer.add_token(token)
+         seg = detokenizer.last_segment
+         stream.write(seg)
+         stream.flush()
+         text += seg
+         if len(gens) == 1:
+             tic_inp = time.perf_counter()
+             if len(prompt_cache['hx']) ==0:
+                 mlx_lm.models.cache.save_prompt_cache(prompt_cache['file_name'], prompt_cache['cache'], metadata=dict(model_name=prompt_cache['model_name'], hx=json.dumps(prompt)))
+         if len(gens) >= max_tokens:
+             break
+     tic_out = time.perf_counter()
+     detokenizer.finalize()
+     text += detokenizer.last_segment
+     prompt_cache['hx'] = prompt+gens
+     logger.debug(f'{len(prompt)} in {tic_inp-tic_non:.0f}; {len(gens)} in {tic_out-tic_inp:.0f}; {common_len}\n=== INP ===\n{dmca(prompt_s)}\n=== OUT ===\n{text}\n')
+     return text, len(prompt), len(gens)
+
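Reviewer sketch, not part of the package: the prompt-cache reuse in `generate()` comes down to a longest-common-prefix computation over token ids. The function `plan_cache_reuse` and the token values below are hypothetical; the sketch works on plain lists and makes no mlx-lm calls.

```python
def plan_cache_reuse(prompt: list[int], cached: list[int]) -> tuple[int, int, list[int]]:
    """Return (common_len, trim_amount, tokens that still need processing).

    common_len  - number of leading tokens shared by the new prompt and the history
    trim_amount - number of trailing cache entries to drop before reuse
    """
    common_len = 0
    for new, old in zip(prompt, cached):
        if new != old:
            break
        common_len += 1
    trim_amount = len(cached) - common_len
    return common_len, trim_amount, prompt[common_len:]


cached = [1, 2, 3, 4, 5]       # tokens already held in the cache (prompt plus generated)
prompt = [1, 2, 3, 9, 10, 11]  # new request sharing a three-token prefix
common_len, trim_amount, todo = plan_cache_reuse(prompt, cached)
```

Here three tokens are reused, two stale cache entries are trimmed, and only `[9, 10, 11]` are fed to the model. When the new prompt is entirely contained in the cached history, `todo` comes back empty; the sketch leaves that case to the caller.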
+ def make_handler(model, tokenizer, system, names, skips, parse_think=True):
+     class Handler(BaseHTTPRequestHandler):
+         def log_message(self, fmt, *args):
+             pass
+
+         def send_json(self, code: int, obj: dict):
+             body = json.dumps(obj).encode()
+             self.send_response(code)
+             self.send_header("Content-Type", "application/json")
+             self.send_header("Content-Length", str(len(body)))
+             self.end_headers()
+             self.wfile.write(body)
+
+         def do_GET(self):
+             if self.path.rstrip("/") == "/v1/models":
+                 self.send_json(200, {"data": [{"id": "local", "object": "model",
+                     "created": int(time.time()), "owned_by": "local"}]})
+             else:
+                 self.send_json(404, {"error": "not found"})
+
+         def do_POST(self):
+             path = self.path.split("?")[0].rstrip("/")
+             if path == "/v1/messages/count_tokens":
+                 self.send_json(200, {"input_tokens": 0})
+                 return
+             if path != "/v1/messages":
+                 self.send_json(404, {"error": f"unknown endpoint {path}"})
+                 return
+             n = int(self.headers.get("Content-Length", 0))
+             body = json.loads(self.rfile.read(n))
+             prompt = encode(body, tokenizer, system, names, skips)
+             with gen_lock:
+                 raw, in_tokens, out_tokens = generate(model, tokenizer, prompt=prompt, max_tokens=body.get("max_tokens", 8192))
+             blocks, stop_reason = decode(raw, tokenizer, parse_think=parse_think)
+             msg_id = f"msg_{uuid.uuid4().hex}"
+             sse = blocks_to_sse(blocks, msg_id, in_tokens, out_tokens, stop_reason)
+             self.send_response(200)
+             self.send_header("Content-Type", "text/event-stream")
+             self.send_header("Cache-Control", "no-cache")
+             self.send_header("Content-Length", str(len(sse)))
+             self.end_headers()
+             try:
+                 self.wfile.write(sse)
+                 self.wfile.flush()
+             except BrokenPipeError:
+                 pass
+     return Handler
+
+ def main():
+     parser = argparse.ArgumentParser()
+     parser.add_argument("--model", default="mlx-community/Qwen3.5-4B-OptiQ-4bit")
+     # parser.add_argument("--model", default="mlx-community/Qwen3.5-2B-OptiQ-4bit")
+     # parser.add_argument("--model", default="mlx-community/Qwen3.5-0.8B-MLX-bf16")
+     parser.add_argument("--system", type=str, default=None)
+     # parser.add_argument("--system", type=str, default='# Env\n{env}')
+     parser.add_argument("--cache", type=str, default='cache/cache.safetensors')
+     parser.add_argument("--names", nargs="+", default=['Read','Edit','Write','Grep','Glob','Bash','Agent','Skill'])
+     parser.add_argument("--skips", nargs="+", default=[
+         r'(?m)^\[SUGGESTION MODE[\s\S]*'
+         # r'(?m)^<system-reminder>[\s\S]*?^</system-reminder>\s*',
+     ])
+     parser.add_argument("--port", type=int, default=8000)
+     parser.add_argument("--host", default="127.0.0.1")
+     parser.add_argument("--home", default=tempfile.mkdtemp())
+     parser.add_argument("--work", default=os.getcwd())
+     args, claude_args = parser.parse_known_args()
+     global prompt_cache
+     if os.path.exists(args.cache):
+         cache, metadata = mlx_lm.models.cache.load_prompt_cache(args.cache, return_metadata=True)
+         mx.eval(cache)
+         model_name = metadata.pop("model_name", "")
+         tokens_str = metadata.pop("hx", "[]")
+         tokens = json.loads(tokens_str)
+         prompt_cache = dict(cache=cache, hx=tokens, model_name=model_name)
+         if prompt_cache.get('model_name') != args.model:
+             prompt_cache = dict(model_name=args.model)
+     else:
+         Path(args.cache).parent.mkdir(parents=True, exist_ok=True)
+         prompt_cache = dict(model_name=args.model)
+     prompt_cache['file_name']=args.cache
+     model, tokenizer = mlx_lm.load(args.model)
+     server = HTTPServer((args.host, args.port), make_handler(model, tokenizer, args.system, args.names, args.skips))
+     threading.Thread(target=server.serve_forever, daemon=True).start()
+     env = os.environ.copy()
+     env["ANTHROPIC_BASE_URL"] = f"http://{args.host}:{args.port}"
+     env["ANTHROPIC_AUTH_TOKEN"] = "local"
+     env["ANTHROPIC_MODEL"] = args.model
+     env["ANTHROPIC_SMALL_FAST_MODEL"] = args.model
+     env["HOME"] = args.home
+     def mirror_workspace(src: str, dst: str):
+         for root, dirs, files in os.walk(src):
+             rel = os.path.relpath(root, src)
+             os.makedirs(os.path.join(dst, rel), exist_ok=True)
+             for f in files:
+                 target = os.path.join(dst, rel, f)
+                 # Skip files that are already mirrored (os.link raises if the target exists).
+                 if not os.path.exists(target):
+                     os.link(os.path.join(root, f), target)
+     workspace = os.path.join(args.home, "workspace")
+     mirror_workspace(args.work, workspace)
+     sys.exit(subprocess.run(["claude"] + claude_args, env=env, cwd=workspace).returncode)
+
+ if __name__ == "__main__":
+     main()
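Reviewer sketch, not part of the package: `mirror_workspace()` builds the temporary workspace out of hard links, which fails when the source and destination sit on different filesystems. A more defensive variant might fall back to copying. The name `mirror_tree`, the copy fallback, and the return value are assumptions added for illustration.

```python
import os
import shutil
import tempfile


def mirror_tree(src: str, dst: str) -> int:
    """Mirror src into dst with hard links, copying when linking fails.

    Returns the number of files newly mirrored by this call.
    """
    count = 0
    for root, _dirs, files in os.walk(src):
        rel = os.path.relpath(root, src)
        os.makedirs(os.path.join(dst, rel), exist_ok=True)
        for name in files:
            source = os.path.join(root, name)
            target = os.path.join(dst, rel, name)
            if os.path.exists(target):
                continue  # already mirrored by an earlier run
            try:
                os.link(source, target)
            except OSError:
                # For example, src and dst are on different filesystems.
                shutil.copy2(source, target)
            count += 1
    return count


# Small self-contained demonstration inside a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "src")
    os.makedirs(os.path.join(src, "pkg"))
    with open(os.path.join(src, "pkg", "a.txt"), "w") as fh:
        fh.write("hello")
    dst = os.path.join(tmp, "home", "workspace")
    first = mirror_tree(src, dst)
    second = mirror_tree(src, dst)
    with open(os.path.join(dst, "pkg", "a.txt")) as fh:
        mirrored_text = fh.read()
```

A hard-linked file shares its contents with the original, so an in-place write in the workspace is visible in the source tree, while a copied file is independent. Which of the two behaviours is wanted is a design choice this sketch does not settle.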
152
+ - Splits text by `
153
+
154
+ ⏺ ` markers for thinking content
155
+ - Extracts tool calls wrapped in <tool_call>...</tool_call> tags
156
+ - Parses function name and parameters from XML-like tags
157
+
158
+ 2. Tool Call Extraction (lines 125-141)
159
+
160
+ When parsing tool calls:
161
+ tool_pattern = re.compile(r'<tool_call>(.*?)</tool_call>', re.DOTALL)
162
+ - Matches content between <tool_call> and </tool_call>
163
+ - Extracts function name: <function=name>
164
+ - Extracts parameters: <parameter=name>value</parameter>
165
+ - Creates tool_use block with:
166
+ - type: "tool_use"
167
+ - id: UUID
168
+ - name: function name
169
+ - input: params dict
170
+
171
+ 3. blocks_to_sse() function (lines 145-177)
172
+
173
+ Converts parsed blocks to Server-Sent Events (SSE) format:
174
+
175
+ def event(name: str, data: dict) -> bytes:
176
+ return f"event: {name}\ndata: {json.dumps(data)}\n\n".encode()
177
+
178
+ Event types sent:
179
+ - message_start - Initial message header
180
+ - content_block_start - Start of a content block
181
+ - content_block_delta - Streaming content (text/thinking/tool)
182
+ - content_block_stop - End of block
183
+ - message_delta - Stop reason
184
+ - message_stop - End of message
185
+
186
+ 4. HTTP Response (lines 288-297)
187
+
188
+ Sends SSE stream with headers:
189
+ self.send_header("Content-Type", "text/event-stream")
190
+ self.send_header("Cache-Control", "no-cache")
191
+
192
+ 5. Endpoint (/v1/messages)
193
+
194
+ - POST request with JSON body
195
+ - Returns streaming SSE response
196
+ - Supports max_tokens parameter
197
+
198
+ 6. Stop Reason Detection
199
+
200
+ - tool_use - If tool calls were generated
201
+ - end_turn - If just text/thinking was generated
202
+
203
+ ✻ Brewed for 35s
204
+
205
+ ❯ cool mate, bye
206
+
207
+ ⏺ Got it! Feel free to come back anytime if you need help. Have a great day! 👋
208
+ </pre></details><br>
209
+
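The transcript above describes the parsing flow only in prose, so here is a minimal sketch of it, written from that description and not copied from `main.py`. The helper names `parse_blocks` and `stop_reason`, the two extra regexes, and the sample tool name `Read` with its `file_path` parameter are assumptions for illustration; only `tool_pattern` and `event` appear in the transcript. Thinking blocks are left out because the marker they are split on did not survive in the transcript.

```python
import json
import re
import uuid

# The first pattern is quoted in the transcript; the other two are assumed
# from the tag shapes it names (<function=name>, <parameter=name>value</parameter>).
TOOL_PATTERN = re.compile(r"<tool_call>(.*?)</tool_call>", re.DOTALL)
FUNCTION_PATTERN = re.compile(r"<function=([^>]+)>")
PARAMETER_PATTERN = re.compile(r"<parameter=([^>]+)>(.*?)</parameter>", re.DOTALL)


def parse_blocks(raw: str) -> list[dict]:
    """Split raw model output into text and tool_use blocks (hypothetical helper)."""
    blocks = []
    # Everything outside <tool_call> tags is treated as plain text.
    text = TOOL_PATTERN.sub("", raw).strip()
    if text:
        blocks.append({"type": "text", "text": text})
    for match in TOOL_PATTERN.finditer(raw):
        body = match.group(1)
        function = FUNCTION_PATTERN.search(body)
        if function is None:
            continue  # skip a malformed call that has no function tag
        params = {name: value.strip() for name, value in PARAMETER_PATTERN.findall(body)}
        blocks.append({
            "type": "tool_use",
            "id": str(uuid.uuid4()),
            "name": function.group(1),
            "input": params,
        })
    return blocks


def stop_reason(blocks: list[dict]) -> str:
    """Return tool_use if any tool call was parsed, otherwise end_turn."""
    return "tool_use" if any(b["type"] == "tool_use" for b in blocks) else "end_turn"


def event(name: str, data: dict) -> bytes:
    """Encode one Server-Sent Event frame, as shown in the transcript."""
    return f"event: {name}\ndata: {json.dumps(data)}\n\n".encode()


# Illustrative input; the tool name and parameter are made up for this sketch.
raw = (
    "Let me look at the file.\n"
    "<tool_call>\n<function=Read>\n"
    "<parameter=file_path>\nmain.py\n</parameter>\n"
    "</function>\n</tool_call>"
)
blocks = parse_blocks(raw)
frame = event("message_delta", {"delta": {"stop_reason": stop_reason(blocks)}})
```

In this sketch, `blocks` holds one text block followed by one `tool_use` block, and `frame` is the byte string that would be written to a `text/event-stream` response. The payload shape passed to `event` here is simplified and may not match what the real server sends.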
@@ -0,0 +1,9 @@
1
+ README.md
2
+ main.py
3
+ setup.py
4
+ mlx_code.egg-info/PKG-INFO
5
+ mlx_code.egg-info/SOURCES.txt
6
+ mlx_code.egg-info/dependency_links.txt
7
+ mlx_code.egg-info/entry_points.txt
8
+ mlx_code.egg-info/requires.txt
9
+ mlx_code.egg-info/top_level.txt
@@ -0,0 +1,2 @@
1
+ [console_scripts]
2
+ mlx-code = main:main
@@ -0,0 +1,2 @@
1
+ mlx-lm>=0.19.0
2
+ PyYAML
@@ -0,0 +1 @@
1
+ main
@@ -0,0 +1,4 @@
1
+ [egg_info]
2
+ tag_build =
3
+ tag_date = 0
4
+
@@ -0,0 +1,18 @@
1
+ from setuptools import setup
2
+
3
+ setup(
4
+ name="mlx-code",
5
+ url='https://github.com/JosefAlbers/mlx-code',
6
+ author_email="albersj66@gmail.com",
7
+ author="J Joe",
8
+ license="Apache-2.0",
9
+ version="0.0.1",
10
+ readme="README.md",
11
+ description="Local Claude Code for Mac",
12
+ long_description=open("README.md", encoding="utf-8").read(),
13
+ long_description_content_type="text/markdown",
14
+ python_requires=">=3.11",
15
+ install_requires=["mlx-lm>=0.19.0", "PyYAML"],
16
+ py_modules=["main"],
17
+ entry_points={"console_scripts": ["mlx-code=main:main"]},
18
+ )