PyPI - mlx-code - Versions diffs - 0.0.1__tar.gz - Mend

mlx-code 0.0.1__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (11)

mlx_code-0.0.1/PKG-INFO +209 -0
mlx_code-0.0.1/README.md +187 -0
mlx_code-0.0.1/main.py +350 -0
mlx_code-0.0.1/mlx_code.egg-info/PKG-INFO +209 -0
mlx_code-0.0.1/mlx_code.egg-info/SOURCES.txt +9 -0
mlx_code-0.0.1/mlx_code.egg-info/dependency_links.txt +1 -0
mlx_code-0.0.1/mlx_code.egg-info/entry_points.txt +2 -0
mlx_code-0.0.1/mlx_code.egg-info/requires.txt +2 -0
mlx_code-0.0.1/mlx_code.egg-info/top_level.txt +1 -0
mlx_code-0.0.1/setup.cfg +4 -0
mlx_code-0.0.1/setup.py +18 -0

mlx_code-0.0.1/PKG-INFO ADDED

@@ -0,0 +1,209 @@
+Metadata-Version: 2.4
+Name: mlx-code
+Version: 0.0.1
+Summary: Local Claude Code for Mac
+Home-page: https://github.com/JosefAlbers/mlx-code
+Author: J Joe
+Author-email: albersj66@gmail.com
+License: Apache-2.0
+Requires-Python: >=3.11
+Description-Content-Type: text/markdown
+Requires-Dist: mlx-lm>=0.19.0
+Requires-Dist: PyYAML
+Dynamic: author
+Dynamic: author-email
+Dynamic: description
+Dynamic: description-content-type
+Dynamic: home-page
+Dynamic: license
+Dynamic: requires-dist
+Dynamic: requires-python
+Dynamic: summary
+# mlx-code: Local Claude Code for Mac
+<img width="1024" height="1536" alt="mlxcodelogo" src="https://github.com/user-attachments/assets/df605c1d-f8e7-433c-9aaf-176c73e3d628" />
+![mlxcode](https://github.com/user-attachments/assets/86004bd2-549f-44b8-9a05-a9b6d9f1c4a2)
+### Quick Start
+```bash
+brew install --cask claude-code
+pip install mlx-code
+```
+### Commands
+| Command | What it does | Example |
+|--------|--------------|--------|
+| `mlx-code` | Start interactive mode | `mlx-code` |
+| `mlx-code "task"` | Run a one-time task | `mlx-code "fix the build error"` |
+| `mlx-code -p "query"` | Run one-off query, then exit | `mlx-code -p "explain this function"` |
+| `mlx-code -c` | Continue most recent conversation in current directory | `mlx-code -c` |
+| `mlx-code -r` | Resume a previous conversation | `mlx-code -r` |
+| `mlx-code commit` | Create a Git commit | `mlx-code commit` |
+| `/clear` | Clear conversation history | `/clear` |
+| `/help` | Show available commands | `/help` |
+| `exit` or `Ctrl+C` | Exit Claude Code | `exit` |
+<details><summary>Click to expand</summary><pre>
+╭─── Claude Code v2.1.74 ────────────────────────────────────────────────────────────────────────────────╮
+│                                                    │ Tips for getting started                          │
+│                    Welcome back!                   │ Run /init to create a CLAUDE.md file with instru… │
+│                                                    │ ───────────────────────────────────────────────── │
+│                       ▐▛███▜▌                      │ Recent activity                                   │
+│                      ▝▜█████▛▘                     │ No recent activity                                │
+│                        ▘▘ ▝▝                       │                                                   │
+│                                                    │                                                   │
+│ mlx-community/Qwen3.5-4B-Opti… · API Usage Billing │                                                   │
+│             /…/T/tmpg4sztu_h/workspace             │                                                   │
+╰────────────────────────────────────────────────────────────────────────────────────────────────────────╯
+❯ explain the folder structure
+⏺
+⏺ Searched for 1 pattern (ctrl+o to expand)
+⏺ Here's the folder structure of your workspace:
+  workspace/
+  ├── .DS_Store
+  ├── cache/
+  │   └── cache.safetensors
+  ├── cc/
+  │   ├── .DS_Store
+  │   ├── mlx-code-v1.py
+  │   └── mlx-code-v2.py
+  ├── cyberpunk/
+  │   ├── build.zig
+  │   ├── help.py
+  │   ├── map.c
+  │   └── main.zig
+  ├── rag/
+  │   └── jina.py
+  ├── rts/
+  │   ├── build.zig
+  │   ├── main.zig
+  │   └── battle_sim.c
+  ├── splat_zig_raylib/
+  │   ├── README.md
+  │   ├── .DS_Store
+  │   ├── build.zig
+  │   ├── build_web.sh
+  │   ├── index.html
+  │   ├── main.zig
+  │   ├── gsplat_transform.py
+  │   ├── src/
+  │   │   └── main.zig
+  │   └── splat_zig/
+  │       ├── .DS_Store
+  │       ├── build.zig
+  │       ├── build_web.sh
+  │       ├── gsplat_transform.py
+  │       ├── index.html
+  │       ├── main.zig
+  │       └── splat/
+  │           └── build.zig
+  ├── tarot/
+  │   ├── index.html
+  │   └── tarot.py
+  └── tmp.md
+  Summary of projects:
+  ┌───────────────────┬────────────────────────────────────────────┐
+  │      Folder       │                  Purpose                   │
+  ├───────────────────┼────────────────────────────────────────────┤
+  │ cc/               │ MLX code versions (v1, v2)                 │
+  ├───────────────────┼────────────────────────────────────────────┤
+  │ cyberpunk/        │ Zig game project with C map integration    │
+  ├───────────────────┼────────────────────────────────────────────┤
+  │ rag/              │ RAG (Retrieval-Augmented Generation) setup │
+  ├───────────────────┼────────────────────────────────────────────┤
+  │ rts/              │ Real-time strategy battle simulation       │
+  ├───────────────────┼────────────────────────────────────────────┤
+  │ splat_zig_raylib/ │ Raylib game with splat transform features  │
+  ├───────────────────┼────────────────────────────────────────────┤
+  │ tarot/            │ Tarot card application                     │
+  ├───────────────────┼────────────────────────────────────────────┤
+  │ cache/            │ Model cache files                          │
+  ├───────────────────┼────────────────────────────────────────────┤
+  │ tmp.md            │ Temporary notes                            │
+  └───────────────────┴────────────────────────────────────────────┘
+❯ how is output parsed in mlx-code.py?
+⏺
+⏺ Read 1 file (ctrl+o to expand)
+⏺ Based on the code, here's how output is parsed in mlx-code.py:
+  Output Parsing Flow
+  1. decode() function (lines 109-143)
+  This is the core parsing function that converts raw model output into structured blocks:
+  - Input: Raw text from the model
+  - Output: List of blocks with types: thinking, text, or tool_use
+  Key parsing logic:
+  - Splits text by `
+⏺ ` markers for thinking content
+  - Extracts tool calls wrapped in <tool_call>...</tool_call> tags
+  - Parses function name and parameters from XML-like tags
+  2. Tool Call Extraction (lines 125-141)
+  When parsing tool calls:
+  tool_pattern = re.compile(r'<tool_call>(.*?)</tool_call>', re.DOTALL)
+  - Matches content between <tool_call> and </tool_call>
+  - Extracts function name: <function=name>
+  - Extracts parameters: <parameter=name>value</parameter>
+  - Creates tool_use block with:
+    - type: "tool_use"
+    - id: UUID
+    - name: function name
+    - input: params dict
+  3. blocks_to_sse() function (lines 145-177)
+  Converts parsed blocks to Server-Sent Events (SSE) format:
+  def event(name: str, data: dict) -> bytes:
+      return f"event: {name}\ndata: {json.dumps(data)}\n\n".encode()
+  Event types sent:
+  - message_start - Initial message header
+  - content_block_start - Start of a content block
+  - content_block_delta - Streaming content (text/thinking/tool)
+  - content_block_stop - End of block
+  - message_delta - Stop reason
+  - message_stop - End of message
+  4. HTTP Response (lines 288-297)
+  Sends SSE stream with headers:
+  self.send_header("Content-Type", "text/event-stream")
+  self.send_header("Cache-Control", "no-cache")
+  5. Endpoint (/v1/messages)
+  - POST request with JSON body
+  - Returns streaming SSE response
+  - Supports max_tokens parameter
+  6. Stop Reason Detection
+  - tool_use - If tool calls were generated
+  - end_turn - If just text/thinking was generated
+✻ Brewed for 35s
+❯ cool mate, bye
+⏺ Got it! Feel free to come back anytime if you need help. Have a great day! 👋
+</pre></details><br>
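
The transcript above describes tool calls being pulled out of `<tool_call>...</tool_call>` spans, with the function name and parameters read from XML-like tags. The following is a minimal, standalone sketch of that idea, using the same regular expressions that appear in `main.py` in this diff. The helper name `extract_tool_calls` and the sample text are illustrative assumptions, not part of the package.

```python
import re
import uuid

# Patterns copied from the decode() function shown in main.py in this diff.
TOOL_CALL = re.compile(r"<tool_call>(.*?)</tool_call>", re.DOTALL)
FUNCTION = re.compile(r"<function=([^\s>]+)>")
PARAMETER = re.compile(r"<parameter=([^\s>]+)>\s*(.*?)\s*</parameter>", re.DOTALL)


def extract_tool_calls(text: str) -> list[dict]:
    """Hypothetical helper: return one tool_use block per well-formed tool call."""
    blocks = []
    for match in TOOL_CALL.finditer(text):
        body = match.group(1)
        fn = FUNCTION.search(body)
        if fn is None:
            continue  # no function tag, so there is nothing to call
        params = {name: value for name, value in PARAMETER.findall(body)}
        blocks.append({
            "type": "tool_use",
            "id": f"toolu_{uuid.uuid4().hex[:8]}",
            "name": fn.group(1),
            "input": params,
        })
    return blocks


# Made-up model output, for illustration only.
sample = (
    "I will read the file.\n"
    "<tool_call>\n<function=Read>\n"
    "<parameter=file_path>\n/tmp/example.txt\n</parameter>\n"
    "</function>\n</tool_call>"
)
calls = extract_tool_calls(sample)
print(calls[0]["name"], calls[0]["input"])
```

Because the parameter values come out of a regular expression, every value is a string. A tool whose schema expects a number or a boolean would need its own conversion step, which this sketch does not attempt.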

mlx_code-0.0.1/README.md ADDED

@@ -0,0 +1,187 @@
+# mlx-code: Local Claude Code for Mac
+<img width="1024" height="1536" alt="mlxcodelogo" src="https://github.com/user-attachments/assets/df605c1d-f8e7-433c-9aaf-176c73e3d628" />
+![mlxcode](https://github.com/user-attachments/assets/86004bd2-549f-44b8-9a05-a9b6d9f1c4a2)
+### Quick Start
+```bash
+brew install --cask claude-code
+pip install mlx-code
+```
+### Commands
+| Command | What it does | Example |
+|--------|--------------|--------|
+| `mlx-code` | Start interactive mode | `mlx-code` |
+| `mlx-code "task"` | Run a one-time task | `mlx-code "fix the build error"` |
+| `mlx-code -p "query"` | Run one-off query, then exit | `mlx-code -p "explain this function"` |
+| `mlx-code -c` | Continue most recent conversation in current directory | `mlx-code -c` |
+| `mlx-code -r` | Resume a previous conversation | `mlx-code -r` |
+| `mlx-code commit` | Create a Git commit | `mlx-code commit` |
+| `/clear` | Clear conversation history | `/clear` |
+| `/help` | Show available commands | `/help` |
+| `exit` or `Ctrl+C` | Exit Claude Code | `exit` |
+<details><summary>Click to expand</summary><pre>
+╭─── Claude Code v2.1.74 ────────────────────────────────────────────────────────────────────────────────╮
+│                                                    │ Tips for getting started                          │
+│                    Welcome back!                   │ Run /init to create a CLAUDE.md file with instru… │
+│                                                    │ ───────────────────────────────────────────────── │
+│                       ▐▛███▜▌                      │ Recent activity                                   │
+│                      ▝▜█████▛▘                     │ No recent activity                                │
+│                        ▘▘ ▝▝                       │                                                   │
+│                                                    │                                                   │
+│ mlx-community/Qwen3.5-4B-Opti… · API Usage Billing │                                                   │
+│             /…/T/tmpg4sztu_h/workspace             │                                                   │
+╰────────────────────────────────────────────────────────────────────────────────────────────────────────╯
+❯ explain the folder structure
+⏺
+⏺ Searched for 1 pattern (ctrl+o to expand)
+⏺ Here's the folder structure of your workspace:
+  workspace/
+  ├── .DS_Store
+  ├── cache/
+  │   └── cache.safetensors
+  ├── cc/
+  │   ├── .DS_Store
+  │   ├── mlx-code-v1.py
+  │   └── mlx-code-v2.py
+  ├── cyberpunk/
+  │   ├── build.zig
+  │   ├── help.py
+  │   ├── map.c
+  │   └── main.zig
+  ├── rag/
+  │   └── jina.py
+  ├── rts/
+  │   ├── build.zig
+  │   ├── main.zig
+  │   └── battle_sim.c
+  ├── splat_zig_raylib/
+  │   ├── README.md
+  │   ├── .DS_Store
+  │   ├── build.zig
+  │   ├── build_web.sh
+  │   ├── index.html
+  │   ├── main.zig
+  │   ├── gsplat_transform.py
+  │   ├── src/
+  │   │   └── main.zig
+  │   └── splat_zig/
+  │       ├── .DS_Store
+  │       ├── build.zig
+  │       ├── build_web.sh
+  │       ├── gsplat_transform.py
+  │       ├── index.html
+  │       ├── main.zig
+  │       └── splat/
+  │           └── build.zig
+  ├── tarot/
+  │   ├── index.html
+  │   └── tarot.py
+  └── tmp.md
+  Summary of projects:
+  ┌───────────────────┬────────────────────────────────────────────┐
+  │      Folder       │                  Purpose                   │
+  ├───────────────────┼────────────────────────────────────────────┤
+  │ cc/               │ MLX code versions (v1, v2)                 │
+  ├───────────────────┼────────────────────────────────────────────┤
+  │ cyberpunk/        │ Zig game project with C map integration    │
+  ├───────────────────┼────────────────────────────────────────────┤
+  │ rag/              │ RAG (Retrieval-Augmented Generation) setup │
+  ├───────────────────┼────────────────────────────────────────────┤
+  │ rts/              │ Real-time strategy battle simulation       │
+  ├───────────────────┼────────────────────────────────────────────┤
+  │ splat_zig_raylib/ │ Raylib game with splat transform features  │
+  ├───────────────────┼────────────────────────────────────────────┤
+  │ tarot/            │ Tarot card application                     │
+  ├───────────────────┼────────────────────────────────────────────┤
+  │ cache/            │ Model cache files                          │
+  ├───────────────────┼────────────────────────────────────────────┤
+  │ tmp.md            │ Temporary notes                            │
+  └───────────────────┴────────────────────────────────────────────┘
+❯ how is output parsed in mlx-code.py?
+⏺
+⏺ Read 1 file (ctrl+o to expand)
+⏺ Based on the code, here's how output is parsed in mlx-code.py:
+  Output Parsing Flow
+  1. decode() function (lines 109-143)
+  This is the core parsing function that converts raw model output into structured blocks:
+  - Input: Raw text from the model
+  - Output: List of blocks with types: thinking, text, or tool_use
+  Key parsing logic:
+  - Splits text by `
+⏺ ` markers for thinking content
+  - Extracts tool calls wrapped in <tool_call>...</tool_call> tags
+  - Parses function name and parameters from XML-like tags
+  2. Tool Call Extraction (lines 125-141)
+  When parsing tool calls:
+  tool_pattern = re.compile(r'<tool_call>(.*?)</tool_call>', re.DOTALL)
+  - Matches content between <tool_call> and </tool_call>
+  - Extracts function name: <function=name>
+  - Extracts parameters: <parameter=name>value</parameter>
+  - Creates tool_use block with:
+    - type: "tool_use"
+    - id: UUID
+    - name: function name
+    - input: params dict
+  3. blocks_to_sse() function (lines 145-177)
+  Converts parsed blocks to Server-Sent Events (SSE) format:
+  def event(name: str, data: dict) -> bytes:
+      return f"event: {name}\ndata: {json.dumps(data)}\n\n".encode()
+  Event types sent:
+  - message_start - Initial message header
+  - content_block_start - Start of a content block
+  - content_block_delta - Streaming content (text/thinking/tool)
+  - content_block_stop - End of block
+  - message_delta - Stop reason
+  - message_stop - End of message
+  4. HTTP Response (lines 288-297)
+  Sends SSE stream with headers:
+  self.send_header("Content-Type", "text/event-stream")
+  self.send_header("Cache-Control", "no-cache")
+  5. Endpoint (/v1/messages)
+  - POST request with JSON body
+  - Returns streaming SSE response
+  - Supports max_tokens parameter
+  6. Stop Reason Detection
+  - tool_use - If tool calls were generated
+  - end_turn - If just text/thinking was generated
+✻ Brewed for 35s
+❯ cool mate, bye
+⏺ Got it! Feel free to come back anytime if you need help. Have a great day! 👋
+</pre></details><br>
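
The README transcript lists the Server-Sent Events that the local server emits (`message_start` through `message_stop`), each framed as an `event:` line, a `data:` line, and a blank line. As a rough illustration of that framing, the sketch below builds a tiny stream with the same `event()` formatting shown in the transcript and then splits it back into frames. The `parse_sse` helper and the sample payloads are assumptions made for the example. A real client would normally use a full SSE parser.

```python
import json


def sse_event(name: str, data: dict) -> bytes:
    # Same framing as the event() helper quoted in the README transcript.
    return f"event: {name}\ndata: {json.dumps(data)}\n\n".encode()


def parse_sse(stream: bytes) -> list[tuple[str, dict]]:
    """Hypothetical minimal parser: one (event name, JSON payload) pair per frame."""
    events = []
    for frame in stream.decode().split("\n\n"):
        if not frame.strip():
            continue  # skip the empty tail after the final blank line
        name, payload = None, None
        for line in frame.split("\n"):
            if line.startswith("event: "):
                name = line[len("event: "):]
            elif line.startswith("data: "):
                payload = json.loads(line[len("data: "):])
        events.append((name, payload))
    return events


# A single text block, framed in the order the transcript lists.
stream = b"".join([
    sse_event("message_start", {"type": "message_start"}),
    sse_event("content_block_start", {"type": "content_block_start", "index": 0}),
    sse_event("content_block_delta", {"type": "content_block_delta", "index": 0,
                                      "delta": {"type": "text_delta", "text": "hello\n\nworld"}}),
    sse_event("content_block_stop", {"type": "content_block_stop", "index": 0}),
    sse_event("message_delta", {"type": "message_delta", "delta": {"stop_reason": "end_turn"}}),
    sse_event("message_stop", {"type": "message_stop"}),
])
events = parse_sse(stream)
print([name for name, _ in events])
```

Splitting on a blank line is safe here because `json.dumps` escapes newlines inside string values, so the payload always stays on a single `data:` line.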

mlx_code-0.0.1/main.py ADDED

@@ -0,0 +1,350 @@
+import tempfile
+import argparse
+import json
+import logging
+import os
+import re
+import subprocess
+import sys
+import threading
+import time
+import uuid
+from http.server import BaseHTTPRequestHandler, HTTPServer
+from pathlib import Path
+import mlx.core as mx
+import mlx_lm
+from mlx_lm.generate import generate_step
+prompt_cache = {}
+logging.basicConfig(filename="mlx_trace.log", filemode='w', level=logging.DEBUG, format="%(message)s") # format="【\n%(message)s\n】\n")
+logger = logging.getLogger(__name__)
+gen_lock = threading.Lock()
+def parse_tool(tools, names):
+    qwen_tools = []
+    for tool in tools:
+        if tool["name"] in names:
+            qwen_tool = {
+                "type": "function",
+                "function": {
+                    "name": tool["name"],
+                    "description": tool["description"],
+                    "parameters": tool.get("input_schema", {
+                        "type": "object",
+                        "properties": {}
+                    })
+                }
+            }
+            params = qwen_tool["function"]["parameters"]
+            params.pop("$schema", None)
+            qwen_tools.append(qwen_tool)
+    return qwen_tools
+def encode(body, tokenizer, system, names, skips):
+    msgs = []
+    sys_parts = []
+    if isinstance(system, str):
+        env = "\n".join(l.strip() for l in next((b["text"] for b in body.get("system", []) if "Primary working directory" in b.get("text", "")), "").splitlines() if "Primary working directory" in l or "Shell:" in l)
+        sys_parts.append(system.replace("{env}", env))
+    else:
+        raw_system = body.get("system")
+        if isinstance(raw_system, str) and raw_system.strip():
+            sys_parts.append(raw_system.strip())
+        elif isinstance(raw_system, list):
+            for block in raw_system:
+                if block.get("type") != "text":
+                    continue
+                text = block.get("text", "").strip()
+                if re.match(r'^\S+:\s', text) and '\n' not in text:
+                    continue
+                if text:
+                    sys_parts.append(text)
+    if sys_parts:
+        msgs.append({"role": "system", "content": "\n\n".join(sys_parts)})
+    calls = {}
+    def skip(text, show_skipped=False):
+        if skips is None:
+            return text
+        lines = []
+        for pattern in skips:
+            found = re.findall(pattern, text)
+            if found:
+                lines.append(
+                    f"{pattern}\n" +
+                    "\n".join(re.sub(r"\S", ".", m) for m in found)
+                )
+        if lines and show_skipped:
+            logger.debug("\n".join(lines))
+        for pattern in skips:
+            text = re.sub(pattern, "", text)
+        return text
+    for msg in body.get("messages", []):
+        role = msg["role"]
+        content = msg["content"]
+        if isinstance(content, str):
+            content = [dict(type='text', text=content)]
+        parts = {}
+        for block in content:
+            t = block.get("type")
+            if t == "text":
+                parts['content'] = parts.get('content', '').rstrip() + '\n' + skip(block['text']).rstrip()
+            elif t == "thinking":
+                parts['reasoning_content'] = block['thinking']
+            elif t == "tool_use":
+                calls[block["id"]] = block
+            elif t == "tool_result":
+                tu = calls.get(block["tool_use_id"])
+                rc = block.get("content", "")
+                if isinstance(rc, list):
+                    rc = skip("\n".join(c.get("text", "") for c in rc if c.get("type") == "text"))
+                parts |= {"role": "tool", "name": tu['name'], "content": f"{tu['input']}\n{rc}"}
+        if parts:
+            msgs.append({"role": role}|parts)
+    if not msgs or not msgs[-1].get('content', '').strip():
+        return None
+    return tokenizer.apply_chat_template(msgs, tools = parse_tool(body.get("tools", []), names), tokenize=False, add_generation_prompt=True)
+def decode(raw_text, tokenizer, parse_think=True):
+    raw_text = '<think>' + raw_text if (c := raw_text.find('</think>')) != -1 and ((o := raw_text.find('<think>')) == -1 or c < o) else raw_text
+    blocks = []
+    if parse_think:
+        parts = re.split(r'(<think>.*?</think>)', raw_text, flags=re.DOTALL)
+    else:
+        parts = [raw_text]
+    for part in parts:
+        if not part:
+            continue
+        if parse_think and part.startswith('<think>') and part.endswith('</think>'):
+            thinking_content = part[7:-8].strip()
+            if thinking_content:
+                blocks.append({"type": "thinking", "thinking": thinking_content})
+        else:
+            blocks.append({"type": "text", "text": part})
+            tool_pattern = re.compile(r'<tool_call>(.*?)</tool_call>', re.DOTALL)
+            for match in tool_pattern.finditer(part):
+                content = match.group(1).strip()
+                if not ("<function=" in content and "<parameter=" in content and "</parameter>" in content):
+                    continue
+                fn_match = re.search(r"<function=([^\s>]+)>", content)
+                if fn_match:
+                    name = fn_match.group(1)
+                    params = re.findall(r"<parameter=([^\s>]+)>\s*(.*?)\s*</parameter>", content, re.DOTALL)
+                    args = {k: v.strip() for k, v in params}
+                    blocks.append({
+                        "type": "tool_use",
+                        "id": f"toolu_{uuid.uuid4().hex[:8]}",
+                        "name": name,
+                        "input": args,
+                    })
+    stop_reason = "tool_use" if any(b["type"] == "tool_use" for b in blocks) else "end_turn"
+    return blocks, stop_reason
+def blocks_to_sse(blocks: list[dict], msg_id: str, in_tokens: int, out_tokens: int, stop_reason='end_turn') -> bytes:
+    def event(name: str, data: dict) -> bytes:
+        return f"event: {name}\ndata: {json.dumps(data)}\n\n".encode()
+    out = bytearray()
+    out += event("message_start", {"type": "message_start", "message": {
+        "id": msg_id, "type": "message", "role": "assistant",
+        "model": "local", "content": [], "stop_reason": None, "stop_sequence": None,
+        "usage": {"input_tokens": in_tokens, "output_tokens": 0},
+    }})
+    for i, block in enumerate(blocks):
+        bt = block["type"]
+        if bt == "text":
+            out += event("content_block_start", {"type": "content_block_start", "index": i,
+                "content_block": {"type": "text", "text": ""}})
+            out += event("content_block_delta", {"type": "content_block_delta", "index": i,
+                "delta": {"type": "text_delta", "text": block["text"]}})
+        elif bt == "thinking":
+            out += event("content_block_start", {"type": "content_block_start", "index": i,
+                "content_block": {"type": "thinking", "thinking": ""}})
+            out += event("content_block_delta", {"type": "content_block_delta", "index": i,
+                "delta": {"type": "thinking_delta", "thinking": block["thinking"]}})
+        elif bt == "tool_use":
+            out += event("content_block_start", {"type": "content_block_start", "index": i,
+                "content_block": {"type": "tool_use", "id": block["id"],
+                    "name": block["name"], "input": {}}})
+            out += event("content_block_delta", {"type": "content_block_delta", "index": i,
+                "delta": {"type": "input_json_delta", "partial_json": json.dumps(block["input"])}})
+        out += event("content_block_stop", {"type": "content_block_stop", "index": i})
+    out += event("message_delta", {"type": "message_delta",
+        "delta": {"stop_reason": stop_reason, "stop_sequence": None},
+        "usage": {"output_tokens": out_tokens}})
+    out += event("message_stop", {"type": "message_stop"})
+    return bytes(out)
+def dmca(p_str):
+    pattern1 = r"(<\|im_start\|>system\n)(.*?)(<\|im_end\|>)"
+    def mask_system(match):
+        return match.group(1) + re.sub(r"\S", ".", match.group(2)) + match.group(3)
+    p_str = re.sub(pattern1, mask_system, p_str, flags=re.DOTALL)
+    block_patterns = [
+        r"(?m)^<system-reminder>[\s\S]*?^</system-reminder>\s*",
+        r"(?m)^\[SUGGESTION MODE[\s\S]*",
+    ]
+    for pattern in block_patterns:
+        p_str = re.sub(pattern, lambda m: re.sub(r"\S", ".", m.group(0)), p_str)
+    return p_str
+def generate(model, tokenizer, prompt, hook=None, max_tokens=256, helper_max_tokens=64, **kwargs):
+    global prompt_cache, hx
+    if prompt is None:
+        return '', 0, 0
+    if not isinstance(tokenizer, mlx_lm.tokenizer_utils.TokenizerWrapper):
+        tokenizer = mlx_lm.tokenizer_utils.TokenizerWrapper(tokenizer)
+    detokenizer = tokenizer.detokenizer
+    if isinstance(prompt, str):
+        add_special_tokens = tokenizer.bos_token is None or not prompt.startswith(tokenizer.bos_token)
+        prompt_s = prompt
+        prompt = tokenizer.encode(prompt, add_special_tokens=add_special_tokens)
+    else:
+        prompt_s = tokenizer.decode(prompt)
+    logger.debug(dmca(prompt_s))
+    stream = logging.getLogger().handlers[0].stream
+    common_len = 0
+    if prompt_cache.get('cache', None):
+        for p, h in zip(prompt, prompt_cache['hx']):
+            if p == h:
+                common_len += 1
+            else:
+                break
+    else:
+        prompt_cache['hx'] = []
+        prompt_cache['cache'] = mlx_lm.models.cache.make_prompt_cache(model)
+    trim_amount = len(prompt_cache['hx']) - common_len
+    mlx_lm.models.cache.trim_prompt_cache(prompt_cache['cache'], trim_amount)
+    prompt_cache['hx'] = prompt_cache['hx'][:common_len]
+    token_gen = generate_step(
+        mx.array(prompt[common_len:]),
+        model,
+        prompt_cache=prompt_cache['cache'],
+        max_tokens=max_tokens,
+        **kwargs,
+    )
+    text = ""
+    tic_non = tic_inp = time.perf_counter()  # pre-set tic_inp: the loop may break on EOS before assigning it
+    gens = []
+    for token, _ in token_gen:
+        gens.append(token)
+        if token in tokenizer.eos_token_ids:
+            break
+        detokenizer.add_token(token)
+        seg = detokenizer.last_segment
+        stream.write(seg)
+        stream.flush()
+        text += seg
+        if len(gens) == 1:
+            tic_inp = time.perf_counter()
+            if len(prompt_cache['hx']) ==0:
+                mlx_lm.models.cache.save_prompt_cache(prompt_cache['file_name'], prompt_cache['cache'], metadata=dict(model_name=prompt_cache['model_name'], hx=json.dumps(prompt)))
+        if len(gens) >= max_tokens:
+            break
+    tic_out = time.perf_counter()
+    detokenizer.finalize()
+    text += detokenizer.last_segment
+    prompt_cache['hx'] = prompt+gens
+    logger.debug(f'{len(prompt)} in {tic_inp-tic_non:.0f}; {len(gens)} in {tic_out-tic_inp:.0f}; {common_len}\n=== INP ===\n{dmca(prompt_s)}\n=== OUT ===\n{text}\n')
+    return text, len(prompt), len(gens)
+def make_handler(model, tokenizer, system, names, skips, parse_think=True):
+    class Handler(BaseHTTPRequestHandler):
+        def log_message(self, fmt, *args):
+            pass
+        def send_json(self, code: int, obj: dict):
+            body = json.dumps(obj).encode()
+            self.send_response(code)
+            self.send_header("Content-Type", "application/json")
+            self.send_header("Content-Length", str(len(body)))
+            self.end_headers()
+            self.wfile.write(body)
+        def do_GET(self):
+            if self.path.rstrip("/") == "/v1/models":
+                self.send_json(200, {"data": [{"id": "local", "object": "model",
+                    "created": int(time.time()), "owned_by": "local"}]})
+            else:
+                self.send_json(404, {"error": "not found"})
+        def do_POST(self):
+            path = self.path.split("?")[0].rstrip("/")
+            if path == "/v1/messages/count_tokens":
+                self.send_json(200, {"input_tokens": 0})
+                return
+            if path != "/v1/messages":
+                self.send_json(404, {"error": f"unknown endpoint {path}"})
+                return
+            n = int(self.headers.get("Content-Length", 0))
+            body = json.loads(self.rfile.read(n))
+            prompt = encode(body, tokenizer, system, names, skips)
+            with gen_lock:
+                raw, in_tokens, out_tokens = generate(model, tokenizer, prompt=prompt, max_tokens=body.get("max_tokens", 8192))
+            blocks, stop_reason = decode(raw, tokenizer, parse_think=parse_think)
+            msg_id = f"msg_{uuid.uuid4().hex}"
+            sse = blocks_to_sse(blocks, msg_id, in_tokens, out_tokens, stop_reason)
+            self.send_response(200)
+            self.send_header("Content-Type", "text/event-stream")
+            self.send_header("Cache-Control", "no-cache")
+            self.send_header("Content-Length", str(len(sse)))
+            self.end_headers()
+            try:
+                self.wfile.write(sse)
+                self.wfile.flush()
+            except BrokenPipeError:
+                pass
+    return Handler
+def main():
+    parser = argparse.ArgumentParser()
+    parser.add_argument("--model", default="mlx-community/Qwen3.5-4B-OptiQ-4bit")
+    # parser.add_argument("--model", default="mlx-community/Qwen3.5-2B-OptiQ-4bit")
+    # parser.add_argument("--model", default="mlx-community/Qwen3.5-0.8B-MLX-bf16")
+    parser.add_argument("--system", type=str, default=None)
+    # parser.add_argument("--system", type=str, default='# Env\n{env}')
+    parser.add_argument("--cache", type=str, default='cache/cache.safetensors')
+    parser.add_argument("--names", nargs="+", default=['Read','Edit','Write','Grep','Glob','Bash','Agent','Skill'])
+    parser.add_argument("--skips", nargs="+", default=[
+        r'(?m)^\[SUGGESTION MODE[\s\S]*'
+        # r'(?m)^<system-reminder>[\s\S]*?^</system-reminder>\s*',
+    ])
+    parser.add_argument("--port", type=int, default=8000)
+    parser.add_argument("--host", default="127.0.0.1")
+    parser.add_argument("--home", default=tempfile.mkdtemp())
+    parser.add_argument("--work", default=os.getcwd())
+    args, claude_args = parser.parse_known_args()
+    global prompt_cache
+    if os.path.exists(args.cache):
+        cache, metadata = mlx_lm.models.cache.load_prompt_cache(args.cache, return_metadata=True)
+        mx.eval(cache)
+        model_name = metadata.pop("model_name", "")
+        tokens_str = metadata.pop("hx", "[]")
+        tokens = json.loads(tokens_str)
+        prompt_cache = dict(cache=cache, hx=tokens, model_name=model_name)
+        if prompt_cache.get('model_name') != args.model:
+            prompt_cache = dict(model_name=args.model)
+    else:
+        Path(args.cache).parent.mkdir(parents=True, exist_ok=True)
+        prompt_cache = dict(model_name=args.model)
+    prompt_cache['file_name']=args.cache
+    model, tokenizer = mlx_lm.load(args.model)
+    server = HTTPServer((args.host, args.port), make_handler(model, tokenizer, args.system, args.names, args.skips))
+    threading.Thread(target=server.serve_forever, daemon=True).start()
+    env = os.environ.copy()
+    env["ANTHROPIC_BASE_URL"] = f"http://{args.host}:{args.port}"
+    env["ANTHROPIC_AUTH_TOKEN"] = "local"
+    env["ANTHROPIC_MODEL"] = args.model
+    env["ANTHROPIC_SMALL_FAST_MODEL"] = args.model
+    env["HOME"] = args.home
+    def mirror_workspace(src: str, dst: str):
+        for root, dirs, files in os.walk(src):
+            rel = os.path.relpath(root, src)
+            os.makedirs(os.path.join(dst, rel), exist_ok=True)
+            for f in files:
+                os.link(os.path.join(root, f), os.path.join(dst, rel, f))
+    workspace = os.path.join(args.home, "workspace")
+    mirror_workspace(args.work, workspace)
+    sys.exit(subprocess.run(["claude"] + claude_args, env=env, cwd=workspace).returncode)
+if __name__ == "__main__":
+    main()
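
The `generate()` function above reuses the prompt cache by comparing the new prompt's tokens with the token history stored in `prompt_cache['hx']`, trimming the cache back to the shared prefix, and feeding only the remaining tokens to the model. The sketch below isolates that bookkeeping in plain Python, with no MLX dependency. The function names `common_prefix_len` and `plan_cache_reuse` and the token values are assumptions made for illustration.

```python
def common_prefix_len(prompt: list[int], cached: list[int]) -> int:
    """Count how many leading tokens the new prompt shares with the cached history."""
    n = 0
    for p, h in zip(prompt, cached):
        if p != h:
            break
        n += 1
    return n


def plan_cache_reuse(prompt: list[int], cached: list[int]) -> dict:
    """Hypothetical helper: how many cache entries to trim and which tokens to feed."""
    common = common_prefix_len(prompt, cached)
    return {
        "common": common,
        "trim": len(cached) - common,  # stale entries past the shared prefix
        "feed": prompt[common:],       # tokens the model has not seen yet
    }


# Illustrative token ids: the first three tokens match the cached history.
plan = plan_cache_reuse(prompt=[1, 2, 3, 9, 10], cached=[1, 2, 3, 4, 5])
print(plan)
```

One edge case in this sketch: when the new prompt is entirely contained in the cached history, `feed` is empty, so a caller would have to decide whether to re-feed the last token or skip generation.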

mlx_code-0.0.1/mlx_code.egg-info/PKG-INFO ADDED

@@ -0,0 +1,209 @@
+Metadata-Version: 2.4
+Name: mlx-code
+Version: 0.0.1
+Summary: Local Claude Code for Mac
+Home-page: https://github.com/JosefAlbers/mlx-code
+Author: J Joe
+Author-email: albersj66@gmail.com
+License: Apache-2.0
+Requires-Python: >=3.11
+Description-Content-Type: text/markdown
+Requires-Dist: mlx-lm>=0.19.0
+Requires-Dist: PyYAML
+Dynamic: author
+Dynamic: author-email
+Dynamic: description
+Dynamic: description-content-type
+Dynamic: home-page
+Dynamic: license
+Dynamic: requires-dist
+Dynamic: requires-python
+Dynamic: summary
+# mlx-code: Local Claude Code for Mac
+<img width="1024" height="1536" alt="mlxcodelogo" src="https://github.com/user-attachments/assets/df605c1d-f8e7-433c-9aaf-176c73e3d628" />
+![mlxcode](https://github.com/user-attachments/assets/86004bd2-549f-44b8-9a05-a9b6d9f1c4a2)
+### Quick Start
+```bash
+brew install --cask claude-code
+pip install mlx-code
+```
+### Commands
+| Command | What it does | Example |
+|--------|--------------|--------|
+| `mlx-code` | Start interactive mode | `mlx-code` |
+| `mlx-code "task"` | Run a one-time task | `mlx-code "fix the build error"` |
+| `mlx-code -p "query"` | Run one-off query, then exit | `mlx-code -p "explain this function"` |
+| `mlx-code -c` | Continue most recent conversation in current directory | `mlx-code -c` |
+| `mlx-code -r` | Resume a previous conversation | `mlx-code -r` |
+| `mlx-code commit` | Create a Git commit | `mlx-code commit` |
+| `/clear` | Clear conversation history | `/clear` |
+| `/help` | Show available commands | `/help` |
+| `exit` or `Ctrl+C` | Exit Claude Code | `exit` |
+<details><summary>Click to expand</summary><pre>
+╭─── Claude Code v2.1.74 ────────────────────────────────────────────────────────────────────────────────╮
+│                                                    │ Tips for getting started                          │
+│                    Welcome back!                   │ Run /init to create a CLAUDE.md file with instru… │
+│                                                    │ ───────────────────────────────────────────────── │
+│                       ▐▛███▜▌                      │ Recent activity                                   │
+│                      ▝▜█████▛▘                     │ No recent activity                                │
+│                        ▘▘ ▝▝                       │                                                   │
+│                                                    │                                                   │
+│ mlx-community/Qwen3.5-4B-Opti… · API Usage Billing │                                                   │
+│             /…/T/tmpg4sztu_h/workspace             │                                                   │
+╰────────────────────────────────────────────────────────────────────────────────────────────────────────╯
+❯ explain the folder structure
+⏺
+⏺ Searched for 1 pattern (ctrl+o to expand)
+⏺ Here's the folder structure of your workspace:
+  workspace/
+  ├── .DS_Store
+  ├── cache/
+  │   └── cache.safetensors
+  ├── cc/
+  │   ├── .DS_Store
+  │   ├── mlx-code-v1.py
+  │   └── mlx-code-v2.py
+  ├── cyberpunk/
+  │   ├── build.zig
+  │   ├── help.py
+  │   ├── map.c
+  │   └── main.zig
+  ├── rag/
+  │   └── jina.py
+  ├── rts/
+  │   ├── build.zig
+  │   ├── main.zig
+  │   └── battle_sim.c
+  ├── splat_zig_raylib/
+  │   ├── README.md
+  │   ├── .DS_Store
+  │   ├── build.zig
+  │   ├── build_web.sh
+  │   ├── index.html
+  │   ├── main.zig
+  │   ├── gsplat_transform.py
+  │   ├── src/
+  │   │   └── main.zig
+  │   └── splat_zig/
+  │       ├── .DS_Store
+  │       ├── build.zig
+  │       ├── build_web.sh
+  │       ├── gsplat_transform.py
+  │       ├── index.html
+  │       ├── main.zig
+  │       └── splat/
+  │           └── build.zig
+  ├── tarot/
+  │   ├── index.html
+  │   └── tarot.py
+  └── tmp.md
+  Summary of projects:
+  ┌───────────────────┬────────────────────────────────────────────┐
+  │      Folder       │                  Purpose                   │
+  ├───────────────────┼────────────────────────────────────────────┤
+  │ cc/               │ MLX code versions (v1, v2)                 │
+  ├───────────────────┼────────────────────────────────────────────┤
+  │ cyberpunk/        │ Zig game project with C map integration    │
+  ├───────────────────┼────────────────────────────────────────────┤
+  │ rag/              │ RAG (Retrieval-Augmented Generation) setup │
+  ├───────────────────┼────────────────────────────────────────────┤
+  │ rts/              │ Real-time strategy battle simulation       │
+  ├───────────────────┼────────────────────────────────────────────┤
+  │ splat_zig_raylib/ │ Raylib game with splat transform features  │
+  ├───────────────────┼────────────────────────────────────────────┤
+  │ tarot/            │ Tarot card application                     │
+  ├───────────────────┼────────────────────────────────────────────┤
+  │ cache/            │ Model cache files                          │
+  ├───────────────────┼────────────────────────────────────────────┤
+  │ tmp.md            │ Temporary notes                            │
+  └───────────────────┴────────────────────────────────────────────┘
+❯ how is output parsed in mlx-code.py?
+⏺
+⏺ Read 1 file (ctrl+o to expand)
+⏺ Based on the code, here's how output is parsed in mlx-code.py:
+  Output Parsing Flow
+  1. decode() function (lines 109-143)
+  This is the core parsing function that converts raw model output into structured blocks:
+  - Input: Raw text from the model
+  - Output: List of blocks with types: thinking, text, or tool_use
+  Key parsing logic:
+  - Splits text by `
+⏺ ` markers for thinking content
+  - Extracts tool calls wrapped in <tool_call>...</tool_call> tags
+  - Parses function name and parameters from XML-like tags
+  2. Tool Call Extraction (lines 125-141)
+  When parsing tool calls:
+  tool_pattern = re.compile(r'<tool_call>(.*?)</tool_call>', re.DOTALL)
+  - Matches content between <tool_call> and </tool_call>
+  - Extracts function name: <function=name>
+  - Extracts parameters: <parameter=name>value</parameter>
+  - Creates tool_use block with:
+    - type: "tool_use"
+    - id: UUID
+    - name: function name
+    - input: params dict
+  3. blocks_to_sse() function (lines 145-177)
+  Converts parsed blocks to Server-Sent Events (SSE) format:
+  def event(name: str, data: dict) -> bytes:
+      return f"event: {name}\ndata: {json.dumps(data)}\n\n".encode()
+  Event types sent:
+  - message_start - Initial message header
+  - content_block_start - Start of a content block
+  - content_block_delta - Streaming content (text/thinking/tool)
+  - content_block_stop - End of block
+  - message_delta - Stop reason
+  - message_stop - End of message
+  4. HTTP Response (lines 288-297)
+  Sends SSE stream with headers:
+  self.send_header("Content-Type", "text/event-stream")
+  self.send_header("Cache-Control", "no-cache")
+  5. Endpoint (/v1/messages)
+  - POST request with JSON body
+  - Returns streaming SSE response
+  - Supports max_tokens parameter
+  6. Stop Reason Detection
+  - tool_use - If tool calls were generated
+  - end_turn - If just text/thinking was generated
+✻ Brewed for 35s
+❯ cool mate, bye
+⏺ Got it! Feel free to come back anytime if you need help. Have a great day! 👋
+</pre></details><br>
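
The parsing steps described in the transcript above can be sketched as follows. This is an illustration under assumptions, not the package's `decode()` or `blocks_to_sse()`: the `<tool_call>` regex and the `event` helper are quoted from the transcript, while `parse_tool_calls`, `stop_reason`, and the inner patterns for `<function=name>` and `<parameter=name>value</parameter>` are hypothetical.

```python
import json
import re
import uuid

# Quoted in the transcript.
tool_pattern = re.compile(r'<tool_call>(.*?)</tool_call>', re.DOTALL)

# Assumed shapes for the inner tags; the real patterns may differ.
function_pattern = re.compile(r'<function=([^>]+)>')
parameter_pattern = re.compile(r'<parameter=([^>]+)>(.*?)</parameter>', re.DOTALL)


def parse_tool_calls(text: str) -> list[dict]:
    """Turn each <tool_call> span into a tool_use block."""
    blocks = []
    for body in tool_pattern.findall(text):
        fn = function_pattern.search(body)
        if fn is None:
            continue  # no function tag, so there is nothing to call
        params = {
            key.strip(): value.strip()
            for key, value in parameter_pattern.findall(body)
        }
        blocks.append({
            "type": "tool_use",
            "id": str(uuid.uuid4()),
            "name": fn.group(1).strip(),
            "input": params,
        })
    return blocks


def stop_reason(blocks: list[dict]) -> str:
    """tool_use if any tool call was produced, otherwise end_turn."""
    if any(block["type"] == "tool_use" for block in blocks):
        return "tool_use"
    return "end_turn"


def event(name: str, data: dict) -> bytes:
    """Frame one Server-Sent Event, as quoted in the transcript."""
    return f"event: {name}\ndata: {json.dumps(data)}\n\n".encode()
```

For instance, passing a string that contains `<function=Read>` and a `<parameter=file_path>` tag inside a `<tool_call>` span to `parse_tool_calls` would yield one block named `Read` whose `input` holds the `file_path` value. `stop_reason` then reports `tool_use` for that list.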

mlx_code-0.0.1/mlx_code.egg-info/SOURCES.txt ADDED Viewed

@@ -0,0 +1,9 @@
+README.md
+main.py
+setup.py
+mlx_code.egg-info/PKG-INFO
+mlx_code.egg-info/SOURCES.txt
+mlx_code.egg-info/dependency_links.txt
+mlx_code.egg-info/entry_points.txt
+mlx_code.egg-info/requires.txt
+mlx_code.egg-info/top_level.txt

mlx_code-0.0.1/mlx_code.egg-info/dependency_links.txt ADDED Viewed

@@ -0,0 +1 @@
+

mlx_code-0.0.1/mlx_code.egg-info/entry_points.txt ADDED Viewed

@@ -0,0 +1,2 @@
+[console_scripts]
+mlx-code = main:main

mlx_code-0.0.1/mlx_code.egg-info/requires.txt ADDED Viewed

@@ -0,0 +1,2 @@
+mlx-lm>=0.19.0
+PyYAML

mlx_code-0.0.1/mlx_code.egg-info/top_level.txt ADDED Viewed

@@ -0,0 +1 @@
+main

mlx_code-0.0.1/setup.cfg ADDED Viewed

@@ -0,0 +1,4 @@
+[egg_info]
+tag_build =
+tag_date = 0

mlx_code-0.0.1/setup.py ADDED Viewed

@@ -0,0 +1,18 @@
+from setuptools import setup
+setup(
+    name="mlx-code",
+    url='https://github.com/JosefAlbers/mlx-code',
+    author_email="albersj66@gmail.com",
+    author="J Joe",
+    license="Apache-2.0",
+    version="0.0.1",
+    readme="README.md",
+    description="Local Claude Code for Mac",
+    long_description=open("README.md").read(),
+    long_description_content_type="text/markdown",
+    python_requires=">=3.11",
+    install_requires=["mlx-lm>=0.19.0", "PyYAML"],
+    py_modules=["main"],
+    entry_points={"console_scripts": ["mlx-code=main:main"]},
+)