@halfagiraf/clawx 0.1.13 → 0.1.14

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (2)
  1. package/README.md +111 -10
  2. package/package.json +1 -1
package/README.md CHANGED
@@ -6,7 +6,7 @@
 
  Terminal-first coding agent — runs locally with Ollama, DeepSeek, OpenAI, or any OpenAI-compatible endpoint.
 
- Clawx started because [OpenClaw](https://github.com/openclaw/openclaw) kept getting heavier. Prompts ballooned, context windows filled up, and local models choked. We wanted the good parts — the tool-calling loop, the terminal UI, the coding tools — without the bloat. So we stripped it back to the essentials: a lean agent that runs local models on modest hardware, hits DeepSeek when you need more muscle, and scales up to frontier models when the task calls for it. No token budget wasted on platform overhead. Just the model, the tools, and your prompt.
+ Clawx started because tools like OpenClaw kept getting heavier. Prompts ballooned, context windows filled up, and local models choked. We wanted the good parts — the tool-calling loop, the terminal UI, the coding tools — without the bloat. So we built something lean on top of the open-source [pi-coding-agent](https://github.com/badlogic/pi-mono) SDK: an agent that runs local models on modest hardware, hits DeepSeek when you need more muscle, and scales up to frontier models when the task calls for it. No token budget wasted on platform overhead. Just the model, the tools, and your prompt.
 
  > **Fair warning:** Clawx runs with the guardrails off. It will create files, delete files, install packages, and execute shell commands — all without asking you first. That's the point. No confirmation dialogs, no "are you sure?", no waiting around. You give it a task, it gets on with it. This makes it ideal for disposable environments, home labs, Raspberry Pis, VMs, and machines you're happy to let rip. If you're pointing it at a production server with your life's work on it... maybe don't do that. Or do.
 
@@ -82,7 +82,8 @@ Tested on Windows 11, RTX 3060 12GB, 2026-03-15.
 
  | Model | Provider | Tool calling | VRAM | Benchmark | Status |
  |-------|----------|--------------|------|-----------|--------|
- | **glm-4.7-flash:latest** | Ollama | Structured `tool_calls` | ~5 GB | 12 turns, 13 tool calls — write file + run python | **Recommended** |
+ | **glm-4.7-flash:latest** | Ollama | Structured `tool_calls` | ~5 GB | 12 turns, 13 tool calls — write file + run python | **Recommended local** |
+ | **Qwen3.5-35B-A3B** (MoE) | Ollama | Structured `tool_calls` | ~12 GB | 35B params, only 3B active per token | **Best local if you have the VRAM** |
  | Qwen2.5-Coder-14B-abliterated Q4_K_M | Ollama | Text-only `<tool_call>` tags | ~9 GB | Tool loop never starts — model returns text, not structured calls | Not compatible |
  | Qwen2.5-Coder-14B-abliterated Q4_K_M | llama-server `--jinja` | Text-only `<tool_call>` tags | ~9 GB | Same as above | Not compatible |
  | GPT-4o / GPT-4-turbo | OpenAI API | Structured `tool_calls` | — | N/A (cloud) | Works |
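The Status column above comes down to one thing: whether the endpoint returns a structured `tool_calls` array or inlines `<tool_call>` tags as plain text. A minimal sketch of that distinction, using illustrative response payloads rather than captured output (with a live server you would inspect a real `curl` response from `/v1/chat/completions` instead):

```bash
# Two illustrative /v1/chat/completions responses (not captured output):
structured='{"choices":[{"message":{"tool_calls":[{"function":{"name":"write_file"}}]}}]}'
text_only='{"choices":[{"message":{"content":"<tool_call>{\"name\":\"write_file\"}</tool_call>"}}]}'

# A compatible model puts calls in a top-level "tool_calls" array;
# an incompatible one buries <tool_call> tags inside "content".
check() {
  case "$1" in
    *'"tool_calls"'*) echo "structured tool calls: compatible" ;;
    *'<tool_call>'*)  echo "text-only tool calls: not compatible" ;;
    *)                echo "no tool use detected" ;;
  esac
}

check "$structured"   # → structured tool calls: compatible
check "$text_only"    # → text-only tool calls: not compatible
```

A `grep`-style substring check like this is only a quick smoke test; the agent loop itself needs the full JSON structure.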
@@ -134,7 +135,82 @@ EOF
  clawx run "Create a Python script that prints the first 20 Fibonacci numbers"
  ```
 
- ### Option 2: Qwen2.5-Coder-14B via Ollama (import GGUF)
+ ### Option 2: Qwen3.5-35B-A3B via Ollama (MoE — best local)
+
+ A 35B Mixture-of-Experts model that only activates 3B parameters per token. Fits in 12GB VRAM and punches well above its weight for coding tasks.
+
+ ```bash
+ # 1. Download the GGUF (or use one you already have)
+ # Example: Qwen3.5-35B-A3B-UD-Q2_K_XL.gguf (~12GB)
+
+ # 2. Create a Modelfile
+ cat > Modelfile-qwen35 << 'EOF'
+ FROM /path/to/Qwen3.5-35B-A3B-UD-Q2_K_XL.gguf
+
+ PARAMETER temperature 0.7
+ PARAMETER num_ctx 32768
+ PARAMETER stop <|im_end|>
+ PARAMETER stop <|endoftext|>
+
+ TEMPLATE """{{- if .System }}<|im_start|>system
+ {{ .System }}<|im_end|>
+ {{ end }}{{- range .Messages }}<|im_start|>{{ .Role }}
+ {{ .Content }}<|im_end|>
+ {{ end }}<|im_start|>assistant
+ """
+ EOF
+
+ # 3. Import into Ollama
+ ollama create qwen35-35b -f Modelfile-qwen35
+
+ # 4. Configure clawx
+ clawx init
+ # Choose: Ollama → model: qwen35-35b
+
+ # Or set it directly:
+ # ~/.clawx/config
+ # CLAWDEX_PROVIDER=ollama
+ # CLAWDEX_BASE_URL=http://localhost:11434/v1
+ # CLAWDEX_MODEL=qwen35-35b
+ # CLAWDEX_API_KEY=not-needed
+ # CLAWDEX_MAX_TOKENS=16384
+ ```
+
+ ### Importing any GGUF into Ollama
+
+ Got a GGUF from HuggingFace or elsewhere? Here's how to use it with clawx:
+
+ ```bash
+ # 1. Create a Modelfile (adjust the FROM path and template for your model)
+ cat > Modelfile << 'EOF'
+ FROM /path/to/your-model.gguf
+
+ PARAMETER temperature 0.7
+ PARAMETER num_ctx 16384
+ PARAMETER stop <|im_end|>
+ PARAMETER stop <|endoftext|>
+
+ TEMPLATE """{{- if .System }}<|im_start|>system
+ {{ .System }}<|im_end|>
+ {{ end }}{{- range .Messages }}<|im_start|>{{ .Role }}
+ {{ .Content }}<|im_end|>
+ {{ end }}<|im_start|>assistant
+ """
+ EOF
+
+ # 2. Import it
+ ollama create my-model -f Modelfile
+
+ # 3. Verify
+ ollama list
+
+ # 4. Use with clawx
+ clawx run -m my-model -p ollama -u http://localhost:11434/v1 "Your prompt here"
+ ```
+
+ > **Note:** The template above uses the ChatML format (`<|im_start|>`/`<|im_end|>`) which works with most Qwen, GLM, and many other models. Check your model's docs if it uses a different chat template (e.g. Llama, Mistral).
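For contrast with the ChatML template above, here is a hypothetical Modelfile sketch for a Mistral-style `[INST]` model. The exact template and stop tokens vary by model variant, so verify against your model's card before using it:

```bash
# Hypothetical Modelfile for a Mistral-style model ([INST] format instead
# of ChatML). The FROM path is a placeholder; the template is a sketch —
# check your model's card for the exact format it was trained with.
cat > Modelfile-mistral << 'EOF'
FROM /path/to/your-mistral-model.gguf

PARAMETER temperature 0.7
PARAMETER num_ctx 16384

TEMPLATE """[INST] {{ if .System }}{{ .System }} {{ end }}{{ .Prompt }} [/INST]"""
EOF

ollama create my-mistral -f Modelfile-mistral
```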
+
+ ### Option 3: Qwen2.5-Coder-14B via Ollama (reference only)
 
  > **Warning:** This model does NOT produce structured tool calls. It is listed here for reference only. Tool-using agent tasks will fail. You can still use it for plain chat without tools.
 
@@ -177,7 +253,7 @@ CLAWDEX_MAX_TOKENS=8192
  EOF
  ```
 
- ### Option 2b: Qwen2.5-Coder-14B via llama-server (llama.cpp)
+ ### Option 3b: Qwen2.5-Coder-14B via llama-server (llama.cpp)
 
  > **Warning:** Same limitation — text-only tool calls, not compatible with Clawx agent loop.
 
@@ -208,7 +284,7 @@ CLAWDEX_MAX_TOKENS=8192
  EOF
  ```
 
- ### Option 3: DeepSeek API
+ ### Option 4: DeepSeek API
 
  DeepSeek is OpenAI-compatible with full structured tool calling support, including thinking mode.
  Pricing: ~$0.27/1M input, $1.10/1M output (deepseek-chat). Very cost-effective.
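At those rates a session estimate is just `tokens / 1M * rate` per direction. A quick back-of-envelope, with made-up token counts (not a benchmark):

```bash
# Back-of-envelope session cost at the deepseek-chat rates quoted above.
# The token counts are made-up example numbers.
awk 'BEGIN {
  input_tokens  = 500000     # prompt + context over a session
  output_tokens = 120000     # generated code and chatter
  cost = input_tokens/1e6 * 0.27 + output_tokens/1e6 * 1.10
  printf "estimated session cost: $%.2f\n", cost
}'
```

Prints `estimated session cost: $0.27` — half a million input tokens costs roughly as much as a coffee refill.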
@@ -239,7 +315,7 @@ EOF
  clawx run "Create a FastAPI app with SQLite and JWT auth"
  ```
 
- ### Option 4: OpenAI API
+ ### Option 5: OpenAI API
 
  ```bash
  cat > .env << 'EOF'
@@ -252,7 +328,7 @@ CLAWDEX_MAX_TOKENS=16384
  EOF
  ```
 
- ### Option 5: Anthropic API
+ ### Option 6: Anthropic API
 
  ```bash
  cat > .env << 'EOF'
@@ -267,12 +343,37 @@ EOF
 
  ### GPU / VRAM notes
 
- - **RTX 3060 12GB**: Can run glm-4.7-flash (~5GB) or Qwen-14B Q4_K_M (~9GB), but not both simultaneously
- - To free VRAM when switching models: `ollama stop glm-4.7-flash:latest` or `ollama stop qwen-coder-abliterated:latest`
+ - **RTX 3060 12GB**: Can run Qwen3.5-35B-A3B (~12GB), glm-4.7-flash (~5GB), or Qwen-14B (~9GB), but only one at a time
+ - **8GB cards (e.g. RTX 3070)**: glm-4.7-flash (~5GB) fits comfortably; 14B models are tight
+ - **RTX 4090 24GB**: Can run most models, including full (non-MoE) 30B+ models
+ - To free VRAM when switching models: `ollama stop <model-name>`
  - Ollama auto-loads models on first request and keeps them in VRAM until timeout or manual stop
+ - Check VRAM usage: `nvidia-smi` (Linux/Windows) or `ollama ps`
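The `ollama stop` tip generalizes: `ollama ps` lists loaded models with the name in the first column (an assumption worth checking on your build), so you can stop everything at once.

```bash
# Sketch: free VRAM by stopping every loaded model. With a live daemon:
#   ollama ps | awk 'NR>1 {print $1}' | xargs -r -n1 ollama stop
# The parsing step, demonstrated on sample `ollama ps` output
# (the ID below is made up for illustration):
sample='NAME                   ID              SIZE     PROCESSOR    UNTIL
glm-4.7-flash:latest   9f3a1c2d4e5f    5.1 GB   100% GPU     4 minutes from now'
echo "$sample" | awk 'NR>1 {print $1}'   # → glm-4.7-flash:latest
```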
 
  ## Configuration reference
 
+ ### Where config lives
+
+ Clawx looks for config in this order (first match wins):
+
+ | Priority | Location | Created by | Notes |
+ |----------|----------|------------|-------|
+ | 1 | `.env` in current directory | You | Per-project overrides |
+ | 2 | `~/.clawx/config` | `clawx init` | Global config (recommended) |
+ | 3 | `.env` in package install dir | Dev only | Fallback for development |
+ | 4 | `clawx.json` in current directory | You | JSON format, supports `systemPrompt` |
+ | 5 | Built-in defaults | — | Ollama on localhost |
+
+ **Config file paths by OS:**
+
+ | OS | Global config | Sessions |
+ |----|---------------|----------|
+ | **Windows** | `C:\Users\<you>\.clawx\config` | `C:\Users\<you>\.clawx\sessions\` |
+ | **Linux** | `~/.clawx/config` | `~/.clawx/sessions/` |
+ | **macOS** | `~/.clawx/config` | `~/.clawx/sessions/` |
+
+ The fastest way to set up is `clawx init` — it writes `~/.clawx/config` for you. To override per-project, drop a `.env` or `clawx.json` in the project directory.
+
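Only `systemPrompt` is confirmed by the config table above; the other keys in this per-project `clawx.json` sketch are guesses mirroring the `CLAWDEX_*` environment variables, so check the real schema (e.g. what `clawx init` writes) before relying on them:

```bash
# Hypothetical per-project clawx.json. "systemPrompt" comes from the config
# table; "provider", "baseUrl", "model", and "maxTokens" are assumed names
# mirroring the CLAWDEX_* variables — verify against the actual schema.
cat > clawx.json << 'EOF'
{
  "provider": "ollama",
  "baseUrl": "http://localhost:11434/v1",
  "model": "qwen35-35b",
  "maxTokens": 16384,
  "systemPrompt": "You are a careful coding agent. Prefer small, reviewable changes."
}
EOF
```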
  ### Environment variables
 
  ```bash
@@ -548,4 +649,4 @@ If you set up clawx via `clawx init`, your configured model should appear in `/m
 
  ## License
 
- MIT extracted and adapted from [OpenClaw](https://github.com/openclaw/openclaw) (MIT).
+ MIT. Built on the open-source [pi-coding-agent](https://github.com/badlogic/pi-mono) SDK (MIT).
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "@halfagiraf/clawx",
- "version": "0.1.13",
+ "version": "0.1.14",
  "description": "Terminal-first coding agent — runs locally with Ollama, DeepSeek, OpenAI, or any OpenAI-compatible endpoint",
  "type": "module",
  "bin": {