nex-code 0.4.41 → 0.5.3

package/README.md CHANGED
@@ -10,7 +10,7 @@
10
10
  </p>
11
11
 
12
12
  <p align="center">
13
- <a href="https://github.com/hybridpicker/nex-code/stargazers">⭐ If this saves you time, a star helps others find it.</a>
13
+ <a href="https://github.com/hybridpicker/nex-code/stargazers">If this saves you time, a star helps others find it.</a>
14
14
  </p>
15
15
 
16
16
  <p align="center">
@@ -22,24 +22,14 @@
22
22
  <img src="https://img.shields.io/badge/Ollama_Cloud-supported-brightgreen.svg" alt="Ollama Cloud: supported">
23
23
  <img src="https://img.shields.io/badge/node-%3E%3D18-brightgreen.svg" alt="Node >= 18">
24
24
  <img src="https://img.shields.io/badge/dependencies-2-green.svg" alt="Dependencies: 2">
25
- <img src="https://img.shields.io/badge/tests-3719-blue.svg" alt="Tests: 3719">
25
+ <img src="https://img.shields.io/badge/tests-3920-blue.svg" alt="Tests: 3920">
26
26
  <img src="https://img.shields.io/badge/VS_Code-extension-007ACC.svg" alt="VS Code extension">
27
27
  </p>
28
28
 
29
29
  ---
30
30
 
31
- ```
32
- ██▄▄██ nex-code v0.3.54
33
- █▀██▀█ devstral-2:123b · /help
34
- ▀████▀
35
- ```
36
-
37
- ---
38
-
39
31
  ## Demo
40
32
 
41
-
42
-
43
33
  https://github.com/user-attachments/assets/68a6c134-2d13-4d66-bc5e-befea3acb794
44
34
 
45
35
  ---
@@ -48,234 +38,128 @@ https://github.com/user-attachments/assets/68a6c134-2d13-4d66-bc5e-befea3acb794
48
38
 
49
39
  ```bash
50
40
  npx nex-code
41
+ # or install globally:
42
+ npm install -g nex-code && cd ~/your-project && nex-code
51
43
  ```
52
44
 
53
- Or install globally:
54
-
55
- ```bash
56
- npm install -g nex-code
57
- cd ~/your-project
58
- nex-code
59
- ```
60
-
61
- That's it. You'll see the banner, your project context, and the `>` prompt. Start typing.
62
-
63
- ---
64
-
65
- ## Automatic Updates
66
-
67
- Nex Code automatically checks for new versions when you start it. If a newer version is available, you'll see a notification with instructions on how to update:
68
-
69
- ```
70
- 💡 New version available! Run npm update -g nex-code to upgrade from x.x.x to x.x.x
71
- ```
72
-
73
- To update to the latest version:
74
-
75
- ```bash
76
- npm update -g nex-code
77
- ```
45
+ On first launch, an interactive setup wizard guides you through provider and credential configuration. Re-run anytime with `/setup`.
78
46
 
79
47
  ---
80
48
 
81
49
  ## Why nex-code?
82
50
 
51
+ **Ollama Cloud first.** Built and optimized for [Ollama Cloud](https://ollama.com) — the flat-rate platform running devstral, Kimi K2, Qwen3-Coder, and 47+ models. Other providers (OpenAI, Anthropic, Gemini) work via the same interface.
52
+
83
53
  | Feature | nex-code | Closed-source alternatives |
84
54
  |---|---|---|
85
- | Free tier | Ollama Cloud flat-rate | subscription or limited quota |
86
- | Open models | devstral, Kimi K2, Qwen3 | vendor-locked |
87
- | Local Ollama | | |
88
- | Multi-provider | swap with one env var | |
89
- | VS Code sidebar | built-in | partial |
90
- | Startup time | ~100ms | 1-4s |
91
- | Runtime deps | 2 | heavy | heavy |
92
- | Infra tools | SSH, Docker, K8s built-in | | ❌ |
93
-
94
- ---
95
-
96
- ## Why nex-code?
97
-
98
- **Ollama Cloud first.** nex-code is built and optimized for [Ollama Cloud](https://ollama.com) — the flat-rate platform that runs devstral, Kimi K2, Qwen3-Coder, and 47+ other open models. All behavioral tuning (loop detection, context compression, tool-call repair) is done against real Ollama Cloud sessions. Other providers (OpenAI, Anthropic, Gemini) work via the same interface but are not the primary target.
55
+ | Free tier | Ollama Cloud flat-rate | subscription or limited quota |
56
+ | Open models | devstral, Kimi K2, Qwen3 | vendor-locked |
57
+ | Local Ollama | yes | no |
58
+ | Multi-provider | swap with one env var | no |
59
+ | VS Code sidebar | built-in | partial |
60
+ | Startup time | ~100ms | 1-4s |
61
+ | Runtime deps | 2 | heavy |
62
+ | Infra tools | SSH, Docker, K8s built-in | no |
63
+ <<<<<<< Updated upstream
99
64
 
100
- **Recommended model: `devstral-2:123b`**, purpose-built for agentic coding, highest score on nex-code's own benchmark, best tool-call reliability.
65
+ **Smart model routing.** The built-in `/benchmark` tests all configured models across 62 tool-calling tasks in 5 categories and auto-routes to the best model per task type.
101
66
 
102
- **Open-model first.** Not locked to any single vendor. Tool tiers (`essential / standard / full`) adapt automatically to the model's capability level, so smaller models don't receive tool schemas they can't handle. A 5-layer auto-fix loop catches and retries malformed tool calls without user intervention.
67
+ **Phase-based execution.** Tasks run through Plan (analyze) -> Implement (code) -> Verify (test) phases, each with the optimal model. Auto-loops back on test failures.
103
68
 
104
- **Smart model routing.** The built-in `/benchmark` system tests all configured models against 62 real nex-code tool-calling tasks across 5 task categories. The results feed a routing table so nex-code can automatically switch to the best model for the detected task type:
69
+ **45 built-in tools** across file ops, git, SSH, Docker, Kubernetes, deploy, browser, GitHub Actions, and visual review. See [Tools](#tools) for the full list.
105
70
 
106
- | Detected task | Routed model (example) |
107
- | ------------------------- | --------------------------- |
108
- | Frontend / CSS / React | `qwen3-coder:480b` |
109
- | Sysadmin / Docker / nginx | `devstral-2:123b` |
110
- | Data / SQL / migrations | `devstral-2:123b` |
111
- | Agentic swarms | `minimax-m2.7:cloud` |
112
- | General coding | `devstral-2:123b` (default) |
71
+ =======
113
72
 
114
- **Phase-based execution.** On Ollama Cloud, each task automatically runs through three phases, each with the optimal model:
73
+ **Smart model routing.** The built-in `/benchmark` tests all configured models across 62 tool-calling tasks in 5 categories and auto-routes to the best model per task type.
115
74
 
116
- | Phase | Purpose | Default model |
117
- | ------------- | -------------------------------- | ------------------------ |
118
- | **Plan** | Analyze codebase, find root cause | `qwen3-coder:480b` |
119
- | **Implement** | Write code, edit files | active model (default) |
120
- | **Verify** | Run tests, check correctness | `devstral-small-2:24b` |
75
+ **Phase-based execution.** Tasks run through Plan (analyze) -> Implement (code) -> Verify (test) phases, each with the optimal model. Auto-loops back on test failures.
121
76
 
122
- The verify phase catches incomplete work before reporting "done": if tests fail, it loops back to implement automatically. Phase models are auto-updated by `/benchmark`. Disable with `NEX_PHASE_ROUTING=0`.
77
+ **45 built-in tools** across file ops, git, SSH, Docker, Kubernetes, deploy, browser, GitHub Actions, and visual review. See [Tools](#tools) for the full list.
123
78
 
124
- **Built-in VS Code extension.** A sidebar chat panel with streaming output, collapsible tool cards, and native VS Code theme support — shipped in the same repo, no separate install.
125
-
126
- **Lightweight.** 2 runtime dependencies (`axios`, `dotenv`). Starts in ~100ms. No Python, no heavy runtime.
127
-
128
- **Daemon / watch mode.** Run `nex-code --daemon` to keep the process alive and fire tasks automatically on filesystem changes, git commits, or a cron schedule — configured via `.nex/daemon.json`. No extra dependencies; uses Node's built-in `fs.watch` and `setInterval`.
129
-
130
- **Server-aware from the first message.** When your prompt contains a URL whose domain matches a configured SSH profile (e.g. `server.example.com` → profile `server`), nex-code probes the server before responding — listing ports, running processes, and data directories. The model receives this topology before its first token, so it goes straight to `ssh_exec` instead of reading local files.
131
-
132
- **Few-shot behavior injection.** On each session start, nex-code injects a short example of the correct tool sequence for the detected task type (sysadmin → check remote logs first; coding → read file before editing; data → explain before rewriting). Works across all models without fine-tuning. Customize with your own high-scoring sessions via `npm run extract-examples`.
133
-
134
- **Infrastructure tools built in:**
135
-
136
- - SSH server management (AlmaLinux, macOS, any Linux)
137
- - Docker tools — local and remote via SSH
138
- - Kubernetes overview (`/k8s`)
139
- - GitHub Actions tools (trigger, monitor runs)
140
- - Named deploy configs (`rsync`-based, `/deploy`)
141
- - Browser agent via Playwright (optional, not bundled)
142
- - Grounded web search via Perplexity or DuckDuckGo
143
-
144
- **Developer safety:**
145
-
146
- - Pre-push secret detection — blocks commits that contain API keys or tokens
147
- - Full audit log (JSONL + sanitization)
148
- - Undo/Redo with persistence across restarts
149
- - Cost tracking and per-provider budget limits
150
- - Plan mode — analysis-only pass before any file changes
151
-
152
- **Extensible.** Plugin API (`registerTool` + lifecycle hooks), skill system (install from any git URL), MCP server support.
153
-
154
- **Tested.** 3719 tests, 83% coverage, CI on every push.
79
+ >>>>>>> Stashed changes
80
+ **2 runtime dependencies** (`axios`, `dotenv`). Starts in ~100ms. No Python, no heavy runtime.
155
81
 
156
82
  ---
157
83
 
158
- ## Ollama Cloud — Recommended Model Setup
159
-
160
- nex-code was built with Ollama Cloud as its primary provider. No subscription, no billing surprises.
161
- Rankings are based on nex-code's own `/benchmark` — 14-task quick benchmark against real nex-code schemas (62 tasks full run).
84
+ ## Ollama Cloud Model Rankings
162
85
 
163
- ### Flat-Rate / Pay-as-you-go
86
+ Rankings from nex-code's own `/benchmark` — 62 tasks testing tool selection, argument validity, and schema compliance.
164
87
 
165
88
  <!-- nex-benchmark-start -->
166
- <!-- Updated: 2026-04-01 — run `/benchmark --discover` after new Ollama Cloud releases -->
89
+ <!-- Updated: 2026-04-05 — run `/benchmark --discover` after new Ollama Cloud releases -->
167
90
 
168
91
  | Rank | Model | Score | Avg Latency | Context | Best For |
169
92
  |---|---|---|---|---|---|
170
- | 🥇 | `qwen3-vl:235b-instruct` | **79.9** | 3.8s | 131K | Best latency/score balance; recommended default |
171
- | 🥈 | `qwen3-vl:235b` | 79.4 | 12.3s | 131K | Overall #1: frontier tool selection, data + agentic tasks |
172
- | 🥉 | `qwen3-coder-next` | 74.9 | 1.7s | 256K | — |
173
- | — | `rnj-1:8b` | 74.6 | 2.5s | 131K | — |
174
- | — | `ministral-3:8b` | 74.2 | 1.2s | 131K | Fastest strong model: 2.2s latency, 70+ score |
175
- | — | `qwen3.5:397b` | 72.8 | 2.1s | 256K | |
176
- | — | `qwen3-next:80b` | 71.3 | 10.3s | 131K | — |
177
- | — | `devstral-2:123b` | 69.9 | 1.6s | 131K | Sysadmin + SSH tasks, reliable coding |
178
- | — | `minimax-m2.7` | 69.4 | 4.1s | 200K | — |
179
- | — | `glm-5` | 69 | 7.6s | 131K | — |
180
- | — | `glm-4.7` | 67.8 | 3.7s | 131K | |
181
- | — | `kimi-k2-thinking` | 62 | 2.4s | 256K | |
93
+ | 🥇 | `qwen3-vl:235b` | **79** | 12.4s | 131K | Overall #1: frontier tool selection, data + agentic tasks |
94
+ | 🥈 | `qwen3-vl:235b-instruct` | 78.2 | 5.3s | 131K | Best latency/score balance; recommended default |
95
+ | 🥉 | `nemotron-3-super` | 78.1 | 3.5s | 256K | — |
96
+ | — | `rnj-1:8b` | 77.4 | 3.9s | 131K | — |
97
+ | — | `mistral-large-3:675b` | 76.5 | 3.9s | 131K | — |
98
+ | — | `gpt-oss:20b` | 76.5 | 1.9s | 131K | Fast small model, good overall score |
99
+ | — | `qwen3-coder-next` | 75.7 | 2.2s | 256K | — |
100
+ | — | `qwen3-next:80b` | 75.1 | 11.1s | 131K | |
101
+ | — | `ministral-3:8b` | 73.8 | 2.0s | 131K | Fastest strong model: 2.2s latency, 70+ score |
102
+ | — | `deepseek-v3.1:671b` | 73.6 | 2.9s | 131K | — |
103
+ | — | `devstral-2:123b` | 73.2 | 2.0s | 131K | Sysadmin + SSH tasks, reliable coding |
104
+ | — | `kimi-k2:1t` | 72.2 | 5.6s | 256K | Large repos (>100K tokens) |
105
+ | — | `ministral-3:3b` | 72 | 1.6s | 32K | — |
106
+ | — | `devstral-small-2:24b` | 71.7 | 2.6s | 131K | Fast sub-agents, simple lookups |
107
+ | — | `qwen3.5:397b` | 70.7 | 4.2s | 256K | — |
108
+ | — | `qwen3-coder:480b` | 70.1 | 6.0s | 131K | Heavy coding sessions, large context |
109
+ | — | `minimax-m2.1` | 69.9 | 3.0s | 200K | — |
110
+ | — | `gemma4:31b` | 69.3 | 2.8s | ? | — |
111
+ | — | `glm-4.7` | 69.1 | 5.3s | 131K | — |
112
+ | — | `kimi-k2-thinking` | 69 | 3.1s | 256K | — |
113
+ | — | `ministral-3:14b` | 68.8 | 2.0s | 131K | — |
114
+ | — | `kimi-k2.5` | 68.7 | 3.4s | 256K | Large repos — faster than k2:1t |
115
+ | — | `minimax-m2.7` | 68.4 | 5.5s | 200K | — |
116
+ | — | `glm-4.6` | 67.8 | 4.7s | 131K | — |
117
+ | — | `glm-5` | 67.4 | 5.0s | 131K | — |
118
+ | — | `gpt-oss:120b` | 64.8 | 3.4s | 131K | — |
119
+ | — | `nemotron-3-nano:30b` | 64.7 | 2.3s | 131K | — |
120
+ | — | `minimax-m2.5` | 61.9 | 2.7s | 131K | Multi-agent, large context |
121
+ | — | `minimax-m2` | 60.6 | 4.3s | 200K | — |
182
122
 
183
123
  > Rankings are nex-code-specific: tool name accuracy, argument validity, schema compliance.
184
124
  > Toolathon (Minimax SOTA) measures different task types — run `/benchmark --discover` after model releases.
185
125
  <!-- nex-benchmark-end -->
186
126
 
187
- ### Recommended `.env` for Ollama Cloud (Flat-Rate)
127
+ **Recommended `.env`:**
188
128
 
189
129
  ```env
190
130
  DEFAULT_PROVIDER=ollama
191
- DEFAULT_MODEL=devstral-2:123b # nex-code benchmark winner (84/100, 1.5s)
192
-
193
- # Sub-agent routing
194
- NEX_HEAVY_MODEL=qwen3-coder:480b # complex multi-step coding
195
- NEX_STANDARD_MODEL=devstral-2:123b # routine tasks
196
- NEX_FAST_MODEL=devstral-small-2:24b # quick lookups, fast sub-agents
131
+ DEFAULT_MODEL=devstral-2:123b
132
+ NEX_HEAVY_MODEL=qwen3-coder:480b
133
+ NEX_STANDARD_MODEL=devstral-2:123b
134
+ NEX_FAST_MODEL=devstral-small-2:24b
197
135
  ```
198
136
 
199
- ### Run the benchmark yourself
200
-
201
- ```bash
202
- /benchmark # full run: 62 tasks × 5 models
203
- /benchmark --quick # fast run: 14 tasks × 3 models (doubled from 7 for better resolution)
204
- /benchmark --discover # detect new Ollama Cloud models, benchmark + auto-update README
205
- /benchmark --models=minimax-m2.7:cloud,qwen3-coder:480b
206
- /benchmark --history # show OpenClaw nightly trend
207
- ```
208
-
209
- Switch anytime: `/model devstral-2:123b` or update `DEFAULT_MODEL` in `.env`.
210
- The best models discovered are automatically saved to `~/.nex-code/.env` to persist globally across all your projects.
211
- Auto-discovery runs weekly via the scheduled improvement task and updates this table automatically.
212
-
213
137
  ---
214
138
 
215
139
  ## Setup
216
140
 
217
- ### Prerequisites
218
-
219
- - Node.js 18+
220
- - At least one API key **or** a local [Ollama](https://ollama.com/download) server
221
-
222
- ### Install from npm
141
+ **Prerequisites:** Node.js 18+ and at least one API key (or local Ollama).
223
142
 
224
143
  ```bash
225
- npm install -g nex-code
226
- ```
227
-
228
- Or run directly without installing:
144
+ # .env (or set environment variables)
145
+ OLLAMA_API_KEY=your-key # Ollama Cloud
146
+ OPENAI_API_KEY=your-key # OpenAI
147
+ ANTHROPIC_API_KEY=your-key # Anthropic
148
+ GEMINI_API_KEY=your-key # Gemini
149
+ PERPLEXITY_API_KEY=your-key # optional — enables grounded web search
229
150
 
230
- ```bash
231
- npx nex-code
151
+ DEFAULT_PROVIDER=ollama
152
+ DEFAULT_MODEL=devstral-2:123b
232
153
  ```
233
154
 
234
- ### Install from source (for contributors)
155
+ **Install from source:**
235
156
 
236
157
  ```bash
237
158
  git clone https://github.com/hybridpicker/nex-code.git
238
- cd nex-code
239
- npm install
240
- npm run build # Build the high-performance bundle
241
- cp .env.example .env
242
- npm link
243
- npm run install-hooks
244
- ```
245
-
246
- ### Configure a Provider
247
-
248
- Create a `.env` file in your project directory (or set environment variables):
249
-
250
- ```bash
251
- # Pick any — only one is required
252
- OLLAMA_API_KEY=your-key # Ollama Cloud (Qwen3 Coder, Qwen3.5, DeepSeek R1, Devstral, Kimi K2.5, Llama 4, MiniMax M2.5, GLM 4.7)
253
- OPENAI_API_KEY=your-key # OpenAI (GPT-4o, GPT-4.1, o1, o3, o4-mini)
254
- ANTHROPIC_API_KEY=your-key # Anthropic (Claude Sonnet 4.6, Opus 4.6, Haiku 4.5)
255
- GEMINI_API_KEY=your-key # Google Gemini (3.1 Pro Preview, 2.5 Pro/Flash, 2.0 Flash)
256
- PERPLEXITY_API_KEY=your-key # Perplexity (optional — enables grounded web search)
257
- # No key needed for local Ollama — just have it running
258
-
259
- # Optional tuning
260
- DEFAULT_PROVIDER=ollama # Active provider on startup
261
- DEFAULT_MODEL=devstral-2:123b # Active model on startup (see /benchmark for ranking)
262
- FALLBACK_CHAIN=anthropic,openai # Providers tried on failure (comma-separated)
263
- NEX_STALE_WARN_MS=60000 # Warn if no tokens received for N ms (default: 60000)
264
- NEX_STALE_ABORT_MS=120000 # Abort and retry stream after N ms of silence (default: 120000)
265
- NEX_LANGUAGE=auto # Response language: "auto" (mirrors user's language, default) or e.g. "English", "Deutsch"
266
- NEX_THEME=dark # Force dark/light theme (overrides auto-detection). Use if colours look wrong for your terminal profile.
267
- FOOTER_DEBUG=1 # Write terminal layout debug log to /tmp/footer-debug.log
159
+ cd nex-code && npm install && npm run build
160
+ cp .env.example .env && npm link && npm run install-hooks
268
161
  ```
269
162
 
270
- ### Verify
271
-
272
- ```bash
273
- cd ~/any-project
274
- nex-code
275
- ```
276
-
277
- You should see the banner, your project context, and the `>` prompt.
278
-
279
163
  ---
280
164
 
281
165
  ## Usage
@@ -284,1307 +168,250 @@ You should see the banner, your project context, and the `>` prompt.
284
168
  > explain the main function in index.js
285
169
  > add input validation to the createUser handler
286
170
  > run the tests and fix any failures
287
- > refactor this to use async/await instead of callbacks
288
- ```
289
-
290
- ### Try These Scenarios
291
-
292
- **Understand an unfamiliar codebase:**
293
-
294
- ```
295
- > give me an overview of this project — architecture, key files, tech stack
296
- > how does authentication work here? trace the flow from login to session
297
- > find all API endpoints and list them with their HTTP methods
298
- ```
299
-
300
- **Fix bugs with context:**
301
-
302
- ```
303
171
  > the /users endpoint returns 500 — find the bug and fix it
304
- > tests are failing in auth.test.js — figure out why and fix it
305
- > there's a memory leak somewhere — profile the app and find it
306
- ```
307
-
308
- **Add features end-to-end:**
309
-
310
- ```
311
- > add rate limiting to all API routes (100 req/min per IP)
312
- > add a /health endpoint that checks DB connectivity
313
- > implement pagination for the GET /products endpoint
314
- ```
315
-
316
- **Refactor and improve:**
317
-
318
172
  ```
319
- > refactor the database queries to use a connection pool
320
- > this function is 200 lines — break it into smaller functions
321
- > migrate these callbacks to async/await
322
- ```
323
-
324
- **DevOps and CI:**
325
173
 
326
- ```
327
- > write a Dockerfile for this project
328
- > set up GitHub Actions CI that runs tests on push
329
- > add a pre-commit hook that runs linting
330
- ```
174
+ ### YOLO Mode
331
175
 
332
- **Multi-step autonomous work (YOLO mode):**
176
+ Skip all confirmations — file changes, dangerous commands, and tool permissions are auto-approved. Auto-runs `caffeinate` on macOS.
333
177
 
334
178
  ```bash
335
179
  nex-code -yolo
336
180
  ```
337
181
 
338
- ```
339
- > read the entire src/ directory, run the tests, fix all failures, then commit
340
- > add input validation to every POST endpoint, add tests, run them
341
- > upgrade all dependencies to latest, fix any breaking changes, run tests
342
- ```
343
-
344
- The agent decides autonomously whether to use tools or just respond with text. Simple questions get direct answers. Coding tasks trigger the agentic tool loop.
345
-
346
- **Vision / Screenshot → Code** — drop an image path anywhere in your message and nex-code will send it to a vision-capable model automatically:
347
-
348
- ```
349
- > /path/to/screenshot.png implement this UI in React
350
- > describe the layout in mockup.png and generate the CSS
351
- ```
352
-
353
- Supported formats: PNG, JPG, GIF, WebP, BMP. Works with Anthropic, OpenAI, Gemini, and Ollama vision models (llava, qwen2-vl, etc.).
354
-
355
- ### YOLO Mode
356
-
357
- Skip all confirmation prompts — file changes, dangerous commands, and tool permissions are auto-approved. The banner shows a `⚡ YOLO` indicator. Toggle at runtime with `/autoconfirm`.
358
-
359
- On macOS, nex-code automatically runs `caffeinate` for the duration of the session (idle sleep and disk sleep are suppressed), so long autonomous tasks won't be interrupted by the system going to sleep. This applies to all modes, not just YOLO.
360
-
361
- You can also enable YOLO mode permanently for a project via `.nex/config.json`:
362
-
363
- ```json
364
- { "yolo": true }
365
- ```
366
-
367
182
  ### Headless / Programmatic Mode
368
183
 
369
- Run nex-code non-interactively from scripts, CI pipelines, or other processes:
370
-
371
184
  ```bash
372
- # Inline prompt
373
- nex-code --task "refactor src/index.js to use async/await" --yolo
374
-
375
- # Prompt from file (avoids shell-escaping issues with special characters)
376
- nex-code --prompt-file /tmp/task.txt --yolo
377
-
378
- # Delete the file after reading
379
- nex-code --prompt-file /tmp/task.txt --delete-prompt-file --yolo
380
-
381
- # JSON output for programmatic parsing
185
+ nex-code --task "refactor src/index.js to async/await" --yolo
382
186
  nex-code --prompt-file /tmp/task.txt --yolo --json
383
- # {"success":true,"response":"..."}
187
+ nex-code --daemon # watch mode: fires tasks on file changes, git commits, or cron
384
188
  ```
385
189
 
386
- | Flag | Description |
387
- | -------------------------- | ------------------------------------------------------------------------------------------------------------- |
388
- | `--task <prompt>` | Run a single prompt and exit |
389
- | `--prompt-file <path>` | Read prompt from a UTF-8 file and run headless |
390
- | `--delete-prompt-file` | Delete the prompt file after reading (use with `--prompt-file`) |
391
- | `--auto` | Skip confirmations (non-interactive, no REPL banner) |
392
- | `--yolo` | Skip all confirmations including dangerous commands (also configurable via `.nex/config.json` `"yolo": true`) |
393
- | `--server` | Start JSON-lines IPC server (used by the VS Code extension) |
394
- | `--daemon [config]` | Run as background watcher — fires tasks on file changes, git commits, or schedule (reads `.nex/daemon.json`) |
395
- | `--watch [config]` | Alias for `--daemon` |
396
- | `--flatrate` | Flatrate mode: 100 turns, 6 parallel agents, 5 retries (auto-enabled when `OLLAMA_API_KEY` is set) |
397
- | `--json` | Output `{"success":true,"response":"..."}` to stdout |
398
- | `--max-turns <n>` | Override the agentic loop iteration limit |
399
- | `--model <spec>` | Use a specific model (e.g. `anthropic:claude-sonnet-4-6`) |
400
- | `--debug` | Show internal diagnostic messages (compression, loop detection, guards) |
401
- | `--no-auto-orchestrate` | Disable auto-orchestration for multi-goal prompts (on by default; also: `NEX_AUTO_ORCHESTRATE=false`) |
402
- | `--orchestrator-model <m>` | Model for decomposition/synthesis step (default: `kimi-k2.5`) |
190
+ | Flag | Description |
191
+ |---|---|
192
+ | `--task <prompt>` | Run a single prompt and exit |
193
+ | `--prompt-file <path>` | Read prompt from file |
194
+ | `--yolo` | Skip all confirmations |
195
+ | `--server` | JSON-lines IPC server (VS Code extension) |
196
+ | `--daemon` | Background watcher (reads `.nex/daemon.json`) |
197
+ | `--flatrate` | 100 turns, 6 parallel agents, 5 retries |
198
+ | `--json` | JSON output to stdout |
199
+ | `--max-turns <n>` | Override agentic loop limit |
200
+ | `--model <spec>` | Use specific model (e.g. `anthropic:claude-sonnet-4-6`) |
201
+ | `--debug` | Show diagnostic messages |
202
+ <<<<<<< Updated upstream
403
203
 
404
- ---
405
-
406
- ## VS Code Extension
204
+ ### Vision / Screenshot
407
205
 
408
- nex-code ships with a built-in VS Code extension (`vscode/`) — no separate repo needed. It adds a sidebar chat panel with streaming output, collapsible tool cards, and confirmation dialogs, all styled with VS Code's native theme variables.
206
+ =======
409
207
 
410
- **Architecture:** The extension spawns `nex-code --server` as a child process and communicates over a JSON-lines protocol on stdin/stdout. No agent logic is duplicated — the CLI is the single source of truth.
208
+ ### Vision / Screenshot
411
209
 
412
- **Requirements:** nex-code must be in `$PATH` — either `npm install -g nex-code` or `npm link` for local development.
413
-
414
- **Install:**
415
-
416
- ```bash
417
- cd vscode
418
- npm install
419
- npm run package # syncs version, builds, and creates .vsix
420
- # Cmd+Shift+P → Extensions: Install from VSIX...
210
+ >>>>>>> Stashed changes
211
+ ```
212
+ > /path/to/screenshot.png implement this UI in React
213
+ > analyze https://example.com/mockup.png and implement it
214
+ > what's wrong with the layout in my clipboard # macOS clipboard capture
215
+ > screenshot localhost:3000 and review the navbar spacing
421
216
  ```
422
217
 
423
- **Settings** (`Settings > Extensions > Nex Code`):
424
-
425
- | Setting | Default | Description |
426
- | ------------------------- | ----------------- | --------------------------- |
427
- | `nexCode.executablePath` | `nex-code` | Path to the nex-code binary |
428
- | `nexCode.defaultProvider` | `ollama` | LLM provider |
429
- | `nexCode.defaultModel` | `devstral-2:123b` | Model name |
430
- | `nexCode.anthropicApiKey` | — | Anthropic API key |
431
- | `nexCode.openaiApiKey` | — | OpenAI API key |
432
- | `nexCode.ollamaApiKey` | — | Ollama Cloud API key |
433
- | `nexCode.geminiApiKey` | — | Google Gemini API key |
434
- | `nexCode.maxTurns` | `50` | Max agentic loop iterations |
435
-
436
- **Commands** (`Cmd+Shift+P`):
437
-
438
- | Command | Description |
439
- | ------------------------- | ----------------------------------------------------- |
440
- | `Nex Code: Clear Chat` | Clear conversation history |
441
- | `Nex Code: Switch Model` | Pick a different model |
442
- | `Nex Code: Restart Agent` | Restart the child process (e.g. after source changes) |
218
+ Works with Anthropic, OpenAI, Gemini, and Ollama vision models. Formats: PNG, JPG, GIF, WebP, BMP.
443
219
 
444
220
  ---
445
221
 
446
222
  ## Providers & Models
447
223
 
448
- Switch providers and models at runtime:
449
-
450
224
  ```
451
- /model # interactive model picker (arrow keys + Enter)
225
+ /model # interactive picker
452
226
  /model openai:gpt-4o # switch directly
453
- /model anthropic:claude-sonnet
454
- /model gemini:gemini-2.5-pro
455
- /model local:llama3
456
- /providers # see all available providers & models
457
- ```
458
-
459
- | Provider | Models | Env Variable |
460
- | ------------- | ---------------------------------------------------------------------------------------------------------------------------------------- | ------------------- |
461
- | **ollama** | Qwen3 Coder, Qwen3.5 (397B, 122B-A10B, 35B-A3B, 27B, 9B, 4B, 2B, 0.8B), DeepSeek R1, Devstral, Kimi K2.5, MiniMax M2.5, GLM 4.7, Llama 4 | `OLLAMA_API_KEY` |
462
- | **openai** | GPT-4o, GPT-4.1, o1, o3, o4-mini | `OPENAI_API_KEY` |
463
- | **anthropic** | Claude Sonnet 4.6, Opus 4.6, Haiku 4.5, Sonnet 4.5, Sonnet 4 | `ANTHROPIC_API_KEY` |
464
- | **gemini** | Gemini 3.1 Pro Preview, 2.5 Pro/Flash, 1.5 Pro/Flash | `GEMINI_API_KEY` |
465
- | **local** | Any model on your local Ollama server | (none) |
466
-
467
- Fallback chains let you auto-switch when a provider fails:
468
-
469
- ```
470
- /fallback anthropic,openai,local
227
+ /providers # list all
228
+ /fallback anthropic,openai # auto-switch on failure
471
229
  ```
472
230
 
473
- **Wire Protocol Layer:** All 5 providers share 3 wire protocol implementations (OpenAI-compatible SSE, Anthropic Messages SSE, Ollama NDJSON). Stream parsing, tool call accumulation, and response normalization are handled by reusable `StreamParser` classes — eliminating duplicated protocol code across providers.
231
+ | Provider | Models | Env Variable |
232
+ |---|---|---|
233
+ | **ollama** | Qwen3, DeepSeek R1, Devstral, Kimi K2, MiniMax, GLM, Llama 4 | `OLLAMA_API_KEY` |
234
+ | **openai** | GPT-4o, GPT-4.1, o1, o3, o4-mini | `OPENAI_API_KEY` |
235
+ | **anthropic** | Claude Opus 4.6, Sonnet 4.6, Haiku 4.5 | `ANTHROPIC_API_KEY` |
236
+ | **gemini** | Gemini 3.1 Pro, 2.5 Pro/Flash | `GEMINI_API_KEY` |
237
+ | **local** | Any local Ollama model | (none) |
474
238
 
475
239
  ---
476
240
 
477
241
  ## Commands
478
242
 
479
- Type `/` to see inline suggestions as you type. Tab completion is supported for slash commands and file paths (type a path containing `/` and press Tab).
480
-
481
- | Command | Description |
482
- | --------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------- |
483
- | `/help` | Full help |
484
- | `/model [spec]` | Show/switch model (e.g. `openai:gpt-4o`) |
485
- | `/providers` | List all providers and models |
486
- | `/fallback [chain]` | Show/set fallback chain |
487
- | `/tokens` | Token usage and context budget |
488
- | `/costs` | Session token costs |
489
- | `/budget` | Show/set per-provider cost limits |
490
- | `/clear` | Clear conversation |
491
- | `/context` | Show project context |
492
- | `/autoconfirm` | Toggle auto-confirm for file changes |
493
- | `/save [name]` | Save current session |
494
- | `/load <name>` | Load a saved session |
495
- | `/sessions` | List saved sessions |
496
- | `/resume` | Resume last session |
497
- | `/branches` | Show session tree (all conversation branches) |
498
- | `/timeline [n]` | Show message timeline of current branch |
499
- | `/goto <index>` | Jump to a message index (truncates later messages) |
500
- | `/fork [index] [name]` | Create a new branch at the given message index |
501
- | `/switch-branch <name>` | Switch to a different conversation branch |
502
- | `/delete-branch <name>` | Delete a conversation branch |
503
- | `/remember <text>` | Save a memory (persists across sessions) |
504
- | `/forget <key>` | Delete a memory |
505
- | `/memory` | Show all memories |
506
- | `/brain add <name>` | Add a document to the knowledge base |
507
- | `/brain list` | List all brain documents |
508
- | `/brain search <query>` | Search the knowledge base |
509
- | `/brain show <name>` | Show a brain document |
510
- | `/brain remove <name>` | Remove a brain document |
511
- | `/brain rebuild` | Rebuild keyword index |
512
- | `/brain embed` | Build/rebuild embedding index |
513
- | `/brain status` | Show brain status (docs, index, embeddings) |
514
- | `/brain review` | Review pending brain changes (git diff) |
515
- | `/brain undo` | Undo last brain write |
516
- | `/learn` | Reflect on session and auto-update memory + NEX.md |
517
- | `/permissions` | Show tool permissions |
518
- | `/allow <tool>` | Auto-allow a tool |
519
- | `/deny <tool>` | Block a tool |
520
- | `/plan [task]` | Plan mode (analyze before executing) |
521
- | `/plan edit` | Open current plan in `$EDITOR` for review/modification |
522
- | `/plans` | List saved plans |
523
- | `/auto [level]` | Set autonomy: interactive/semi-auto/autonomous |
524
- | `/commit [msg]` | Smart commit (analyze diff, suggest message) |
525
- | `/diff` | Show current diff |
526
- | `/branch [name]` | Create feature branch |
527
- | `/mcp` | MCP servers and tools |
528
- | `/hooks` | Show configured hooks |
529
- | `/skills` | List, enable, disable skills |
530
- | `/tree [depth]` | Show project file tree (default depth 3) |
531
- | `/retry [--model <id>]` | Retry the last user turn; optionally switch model first (`/retry --model kimi-k2.5`) |
532
- | `/undo` | Undo last file change |
533
- | `/redo` | Redo last undone change |
534
- | `/history` | Show file change history |
535
- | `/snapshot [name]` | Create a named git snapshot of current changes |
536
- | `/restore [name\|last]` | Restore a previously created snapshot |
537
- | `/review [--strict] [file]` | Deep code review: 3-phase protocol (broad scan → grep deep-dive → report), score table, diff fix snippets. `--strict` forces ≥3 critical findings. |
538
- | `/k8s [user@host]` | Kubernetes overview: namespaces + pod health (remote via SSH optional) |
539
- | `/setup` | Interactive setup wizard — configure provider, API keys, web search |
540
- | `/benchmark [--quick\|--discover\|--history]` | Rank models on nex-code tool-calling tasks, auto-update routing |
541
- | `/install-skill <url>` | Install a skill from a git repo |
542
- | `/search-skill <query>` | Search GitHub for nex-code skills |
543
- | `/remove-skill <name>` | Remove an installed skill |
544
- | `/audit` | Show tool execution audit summary |
545
- | `/exit` | Quit |
243
+ Type `/` to see inline suggestions. Tab completion for slash commands and file paths.
244
+
245
+ | Command | Description |
246
+ |---|---|
247
+ | `/help` | Full help |
248
+ | `/model [spec]` | Show/switch model |
249
+ | `/providers` | List providers |
250
+ | `/clear` | Clear conversation |
251
+ | `/save` / `/load` / `/sessions` / `/resume` | Session management |
252
+ | `/branches` / `/fork` / `/switch-branch` / `/goto` | Session tree navigation |
253
+ | `/remember` / `/forget` / `/memory` | Persistent memory |
254
+ | `/brain add\|list\|search\|show\|remove` | Knowledge base |
255
+ | `/plan [task]` / `/plan edit` / `/plan approve` | Plan mode |
256
+ | `/commit [msg]` / `/diff` / `/branch` | Git intelligence |
257
+ | `/undo` / `/redo` / `/history` | Persistent undo/redo |
258
+ | `/snapshot [name]` / `/restore` | Git snapshots |
259
+ | `/permissions` / `/allow` / `/deny` | Tool permissions |
260
+ | `/costs` / `/budget` | Cost tracking and limits |
261
+ | `/review [--strict]` | Deep code review |
262
+ | `/benchmark` | Model ranking (62 tasks) |
263
+ | `/autoresearch` / `/ar-self-improve` | Autonomous optimization loops |
264
+ | `/servers` / `/docker` / `/deploy` / `/k8s` | Infrastructure management |
265
+ | `/skills` / `/install-skill` / `/mcp` / `/hooks` | Extensibility |
266
+ | `/tree [depth]` | Project file tree |
267
+ | `/audit` | Tool execution audit |
268
+ | `/setup` | Interactive setup wizard |
546
269
 
547
270
  ---
548
271
 
549
272
  ## Tools
550
273
 
551
- The agent has 45 built-in tools:
552
-
553
- ### Core & File System
554
-
555
- | Tool | Description |
556
- | ---------------- | ------------------------------------------------------------ |
557
- | `bash` | Execute shell commands (90s timeout, 5MB buffer) |
558
- | `read_file` | Read files with optional line range (binary detection) |
559
- | `write_file` | Create or overwrite files (with diff preview + confirmation) |
560
- | `edit_file` | Targeted text replacement (with diff preview + confirmation) |
561
- | `patch_file` | Atomic multi-replacement in a single operation |
562
- | `list_directory` | Tree view with depth control and glob filtering |
563
- | `search_files` | Regex search across files (like grep) |
564
- | `glob` | Fast file search by name/extension pattern |
565
- | `grep` | Content search with regex and line numbers |
566
-
567
- ### Git & Web
568
-
569
- | Tool | Description |
570
- | ------------ | -------------------------------------------------------------------------- |
571
- | `git_status` | Git working tree status |
572
- | `git_diff` | Git diff with optional path filter |
573
- | `git_log` | Git commit history with configurable count |
574
- | `web_fetch` | Fetch content from a URL |
575
- | `web_search` | Grounded search via Perplexity (if `PERPLEXITY_API_KEY` set) or DuckDuckGo |
576
-
577
- ### Interaction & Agents
578
-
579
- | Tool | Description |
580
- | -------------- | ------------------------------------------------------ |
581
- | `ask_user` | Ask the user a question and wait for input |
582
- | `task_list` | Create and manage task lists for multi-step operations |
583
- | `spawn_agents` | Run parallel sub-agents with auto model routing |
584
- | `switch_model` | Switch active model mid-conversation |
585
-
586
- ### Browser (optional — requires Playwright)
587
-
588
- | Tool | Description |
589
- | -------------------- | ------------------------------------------------------------------ |
590
- | `browser_open` | Open URL in headless browser, return text + links (JS-heavy pages) |
591
- | `browser_screenshot` | Screenshot a URL → saved file + vision-ready path |
592
- | `browser_click` | Click element by CSS selector or visible text |
593
- | `browser_fill` | Fill form field and optionally submit |
594
-
595
- ### GitHub Actions
596
-
597
- | Tool | Description |
598
- | --------------------- | -------------------------------------------------------------------- |
599
- | `gh_run_list` | List GitHub Actions workflow runs |
600
- | `gh_run_view` | View run details and step logs |
601
- | `gh_workflow_trigger` | Trigger a workflow dispatch event |
602
- | `k8s_pods` | List Kubernetes pods (local kubectl or remote via SSH) |
603
- | `k8s_logs` | Fetch pod logs with `--tail` / `--since` filtering |
604
- | `k8s_exec` | Run a command inside a pod (with confirmation) |
605
- | `k8s_apply` | Apply a manifest file — `dry_run` mode supported (with confirmation) |
606
- | `k8s_rollout` | Rollout status / restart / history / undo for deployments |
607
-
608
- ### SSH & Server Management
609
-
610
- Requires `.nex/servers.json` — run `/init` to configure. See [Server Management](#server-management).
611
-
612
- | Tool | Description |
613
- | ---------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
614
- | `ssh_exec` | Execute a command on a remote server via SSH |
615
- | `ssh_upload` | Upload a file or directory via SCP |
616
- | `ssh_download` | Download a file or directory via SCP |
617
- | `service_manage` | Start/stop/restart/reload/enable/disable a systemd service (local or remote) |
618
- | `service_logs` | Fetch journalctl logs (local or remote, with `--since` support) |
619
- | `sysadmin` | Senior sysadmin operations on any Linux server (local or SSH). Actions: `audit` (full health overview), `disk_usage`, `process_list`, `network_status`, `package` (dnf/apt auto-detect), `user_manage` (list/create/delete/add_ssh_key), `firewall` (firewalld/ufw/iptables auto-detect), `cron` (list/add/remove), `ssl_check` (domain or cert file), `log_tail` (any log), `find_large` (big files by size). Read-only actions run without confirmation; state-changing actions require approval. |
620
- | `remote_agent` | Delegate a full coding task to a **nex-code instance running on a remote server** via SSH. Writes the task to a temp file, executes `nex-code --prompt-file ... --auto` on the remote, and streams back the result. Requires `.nex/servers.json`. Optional `project_path` (defaults to remote home dir) and `model` override. Timeout: 5 minutes. |
621
-
622
- ### Docker
623
-
624
- | Tool | Description |
625
- | ------------------ | ------------------------------------------------- |
626
- | `container_list` | List Docker containers (local or remote via SSH) |
627
- | `container_logs` | Fetch Docker container logs (`--tail`, `--since`) |
628
- | `container_exec` | Execute a command inside a running container |
629
- | `container_manage` | Start/stop/restart/remove/inspect a container |
630
-
631
- ### Deploy
632
-
633
- | Tool | Description |
634
- | ------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
635
- | `deploy` | Deploy to a remote server via **rsync** (sync local files) or **git** (git pull on remote) + optional post-deploy script + optional health check. Supports named configs from `.nex/deploy.json`. |
636
- | `deployment_status` | Check deployment health across all configured servers — server reachability, service status, health checks. Reads `.nex/deploy.json`. |
637
-
638
- ### Frontend Design
639
-
640
- | Tool | Description |
641
- | ---------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
642
- | `frontend_recon` | **Mandatory first step before any frontend work.** Scans the project and returns: (1) design tokens — CSS custom properties (`:root`), Tailwind theme colors/fonts, (2) main layout/index page structure, (3) a reference component of the same type (`type=` hint), (4) detected JS/CSS framework stack — Vue/React, Alpine.js v2 vs v3, HTMX, Tailwind, Django. Call this before writing any markup or styles so the agent uses the project's actual design system instead of inventing one. |
643
-
644
- **Interactive commands** (vim, top, htop, ssh, tmux, fzf, etc.) are automatically detected and spawned with full TTY passthrough — no separate handling required.
645
-
646
- **Browser tools** require Playwright (`npm install playwright && npx playwright install chromium`). nex-code works without it — browser tools return a helpful install message if missing.
647
-
648
- Additional tools can be added via [MCP servers](#mcp) or [Skills](#skills).
649
-
650
- ---
274
+ 45 built-in tools organized by category:
651
275
 
652
- ## Server Management
276
+ **Core:** `bash`, `read_file`, `write_file`, `edit_file`, `patch_file`, `list_directory`, `search_files`, `glob`, `grep`
653
277
 
654
- nex-code has first-class support for remote server management via SSH, optimised for **AlmaLinux 9** and **macOS**.
278
+ **Git & Web:** `git_status`, `git_diff`, `git_log`, `web_fetch`, `web_search`
655
279
 
656
- ### Setup
280
+ **Agents:** `ask_user`, `task_list`, `spawn_agents`, `switch_model`
657
281
 
658
- Run `/init` inside nex-code to interactively configure your servers:
282
+ **Browser** (optional, requires Playwright): `browser_open`, `browser_screenshot`, `browser_click`, `browser_fill`
659
283
 
660
- ```
661
- > /init
662
- ```
284
+ **GitHub Actions & K8s:** `gh_run_list`, `gh_run_view`, `gh_workflow_trigger`, `k8s_pods`, `k8s_logs`, `k8s_exec`, `k8s_apply`, `k8s_rollout`
663
285
 
664
- Or create `.nex/servers.json` manually:
665
-
666
- ```json
667
- {
668
- "prod": {
669
- "host": "94.130.37.43",
670
- "user": "deploy",
671
- "port": 22,
672
- "key": "~/.ssh/id_rsa",
673
- "os": "almalinux9",
674
- "sudo": true
675
- },
676
- "macbook": {
677
- "host": "192.168.1.10",
678
- "user": "lukas",
679
- "os": "macos"
680
- }
681
- }
682
- ```
286
+ **SSH & Server:** `ssh_exec`, `ssh_upload`, `ssh_download`, `service_manage`, `service_logs`, `sysadmin`, `remote_agent`
683
287
 
684
- **OS values**: `almalinux9`, `almalinux8`, `ubuntu`, `debian`, `macos`
685
-
686
- When `.nex/servers.json` exists, the agent automatically receives OS-aware context:
687
-
688
- - **AlmaLinux 9**: `dnf`, `firewalld`, `systemctl`, SELinux hints
689
- - **macOS**: `brew`, `launchctl`, `log show` instead of `journalctl`
690
-
691
- If the project also has a `NEX.md` containing deployment indicators (`ssh`, `server`, `systemctl`, etc.), nex-code injects a **Deployment Context** block at the top of its system prompt. This tells the model that the application runs on the remote server — not locally — and steers debugging toward `ssh_exec`/`service_logs` instead of local `read_file` calls on server-side paths like `logs/`.
692
-
693
- ### Slash Commands
694
-
695
- | Command | Description |
696
- | -------------------------- | -------------------------------------------------- |
697
- | `/servers` | List all configured server profiles |
698
- | `/servers ping` | Check SSH connectivity for all servers in parallel |
699
- | `/servers ping <name>` | Check a specific server |
700
- | `/docker` | List running containers across all servers + local |
701
- | `/docker -a` | Include stopped containers |
702
- | `/deploy` | List all named deploy configs |
703
- | `/deploy <name>` | Run a named deploy (with confirmation) |
704
- | `/deploy <name> --dry-run` | Preview without syncing |
705
- | `/init` | Interactive wizard: create `.nex/servers.json` |
706
- | `/init deploy` | Interactive wizard: create `.nex/deploy.json` |
707
-
708
- ### Named Deploy Configs
709
-
710
- Create `.nex/deploy.json` (or use `/init deploy`):
711
-
712
- ```json
713
- {
714
- "prod": {
715
- "server": "prod",
716
- "method": "rsync",
717
- "local_path": "dist/",
718
- "remote_path": "/var/www/app",
719
- "exclude": ["node_modules", ".env"],
720
- "deploy_script": "systemctl restart gunicorn",
721
- "health_check": "https://myapp.example.com/health"
722
- },
723
- "api": {
724
- "server": "prod",
725
- "method": "git",
726
- "remote_path": "/home/deploy/my-api",
727
- "branch": "main",
728
- "deploy_script": "npm ci --omit=dev && sudo systemctl restart my-api",
729
- "health_check": "systemctl is-active my-api"
730
- }
731
- }
732
- ```
288
+ **Docker:** `container_list`, `container_logs`, `container_exec`, `container_manage`
733
289
 
734
- Then deploy with:
290
+ **Deploy:** `deploy`, `deployment_status`
735
291
 
736
- ```
737
- > /deploy prod
738
- ```
292
+ **Frontend:** `frontend_recon` — scans design tokens, layout, framework stack before any frontend work
739
293
 
740
- Or from within a conversation:
294
+ **Visual:** `visual_diff`, `responsive_sweep`, `visual_annotate`, `visual_watch`, `design_tokens`, `design_compare`
741
295
 
742
- ```
743
- deploy the latest build to prod
744
- ```
296
+ Additional tools via [MCP servers](#mcp) or [Skills](#skills).
745
297
 
746
298
  ---
747
299
 
748
- ## Features
749
-
750
- ### Compact Output
751
-
752
- The agent loop uses a bouncing-ball spinner (`● · · · ·` → `· ● · · ·` → …) during tool execution, then prints compact 1-line summaries:
753
-
754
- ```
755
- ⏺ Read, Grep, Edit
756
- ⎿ Read 45 lines from app.js
757
- ⎿ 12 matches "TODO"
758
- ⎿ old_text not found
759
- ```
760
-
761
- After multi-step tasks, a résumé and context-aware follow-up suggestions are shown:
762
-
763
- ```
764
- ── 3 steps · 8 tools · 2 files modified · 37s ──
765
- 💡 /diff · /commit · /undo
766
- ```
767
-
768
- Step counts match between inline `── step N ──` markers and the résumé. Elapsed time is included. Read-heavy sessions (analysis, status checks) suggest `/save · /clear` instead.
769
-
770
- When the model runs tools but produces no visible text, an automatic nudge forces it to summarize findings — preventing silent completions where the user sees nothing.
771
-
772
- ### Response Quality
773
-
774
- The system prompt enforces substantive responses: the model always presents findings as formatted text after using tools (users only see 1-line tool summaries). Responses use markdown with headers, bullet lists, and code blocks. The model states its approach before non-trivial tasks and summarizes results after completing work.
775
-
776
- **Language:** By default (`NEX_LANGUAGE=auto`), the model mirrors the language of the user's message — write in German, get a German response; write in English, get an English response. Set `NEX_LANGUAGE=English` (or any language) to force a fixed response language.
777
-
778
- **Code examples:** The model is instructed to always show actual, working code — never pseudocode or placeholder snippets.
779
-
780
- ### Performance
781
-
782
- - **Asynchronous I/O**: The entire CLI is built on non-blocking I/O. File reads, writes, and git operations never block the main thread, keeping the UI responsive even during heavy tasks.
783
- - **Fast Startup**: Pre-bundled with `esbuild` to minimize module loading overhead, achieving sub-100ms startup times.
784
- - **In-Memory Indexing**: A background indexing engine (using `ripgrep` or a fast fallback) keeps project file paths in RAM for instant file discovery, path auto-fixing, and glob searches.
785
-
786
- ### Streaming Output
787
-
788
- Tokens appear live as the model generates them. Bouncing-ball spinner during connection, then real-time line-by-line rendering via `StreamRenderer` with terminal width-aware word wrapping, markdown formatting, and syntax highlighting (JS, TS, Python, Go, Rust, CSS, HTML, and more).
789
-
790
- ### Paste Detection
791
-
792
- Automatic bracketed paste mode: pasting multi-line text into the prompt is detected and combined into a single input. A `[Pasted content — N lines]` indicator is shown with a preview of the first line. The user must press Enter to send — pasted content never auto-fires. The paste handler stores the combined text and waits for explicit submission.
793
-
794
- ### Ctrl+C Cancellation
795
-
796
- Pressing Ctrl+C during a running request immediately cancels the active HTTP stream and returns to the prompt:
797
-
798
- - An `AbortController` signal flows from the readline SIGINT handler through the agent loop to the provider's HTTP request
799
- - All providers (Ollama, OpenAI, Anthropic, Gemini, local) destroy the response stream on abort
800
- - No EPIPE errors after cancellation (stdout writes are EPIPE-guarded)
801
- - During processing: first Ctrl+C aborts the task and returns to prompt; second Ctrl+C force-exits
802
- - At the idle prompt: first Ctrl+C shows `(Press Ctrl+C again to exit)`, second Ctrl+C exits (hint resets after 2 s)
803
- - readline intercepts Ctrl+C on TTY (`rl.on('SIGINT')`) to prevent readline close → `process.exit(0)` race
804
-
805
- ### Diff Preview
806
-
807
- Every file change is shown in a diff-style format before being applied:
808
-
809
- - **Header**: `⏺ Update(file)` or `⏺ Create(file)` with relative path
810
- - **Summary**: `⎿ Added N lines, removed M lines`
811
- - **Numbered lines**: right-justified line numbers with red `-` / green `+` markers
812
- - **Context**: 3 lines of surrounding context per change, multiple hunks separated by `···`
813
- - OOM-safe: large diffs (>2000 lines) fall back to add/remove instead of LCS
814
- - All changes require `[y/n]` confirmation (toggle with `/autoconfirm` or start with `-yolo`)
815
-
816
- ### Terminal Theme Detection
817
-
818
- Nex Code automatically adapts all colours to your terminal's background:
819
-
820
- - **Dark terminals** — bright, saturated palette with `\x1b[2m` dim for muted text
821
- - **Light/white terminals** — darker, high-contrast palette; dim replaced with explicit grey to stay visible on white backgrounds; command echo uses a light blue-grey highlight instead of dark grey
822
-
823
- Detection priority:
824
-
825
- 1. `NEX_THEME=light|dark` env var — explicit override, useful if auto-detection is wrong
826
- 2. `COLORFGBG` env var — set by iTerm2 and other terminals
827
- 3. **OSC 11 query** — asks the terminal emulator directly for its background colour (works with Apple Terminal, iTerm2, WezTerm, Ghostty, and most xterm-compatible terminals). Result is cached per terminal session in `~/.nex-code/.theme_cache.json`, so the one-time ~100 ms startup cost only occurs on first launch in each terminal window.
828
- 4. Default → dark
829
-
830
- If you use multiple Apple Terminal profiles (e.g. white, dark teal, dark green), each window is detected independently — no manual configuration needed.
300
+ ## Key Features
831
301
 
832
- ### Auto-Context
833
-
834
- On startup, the CLI reads your project and injects context into the system prompt:
835
-
836
- - `package.json` — name, version, scripts, dependencies
837
- - `README.md` — first 50 lines
838
- - Git info — branch, status, recent commits
839
- - `.gitignore` content
840
- - **Merge conflicts** — detected and shown as a red warning; included in LLM context so the agent avoids editing conflicted files
841
-
842
- ### Context Engine
843
-
844
- Automatic token management with compression when the context window gets full. Tracks token usage across system prompt, conversation, tool results, and tool definitions.
845
-
846
- ### Safety Layer
847
-
848
- Three tiers of command protection, plus additional safeguards:
849
-
850
- - **Forbidden** (blocked): `rm -rf /`, `rm -rf .`, `mkfs`, `dd if=`, fork bombs, `curl|sh`, `cat .env`, `chmod 777`, reverse shells — 30+ patterns
851
- - **Critical** (always re-prompted, even in YOLO mode): `rm -rf`, `sudo`, `--no-verify` (hook bypass), `git reset --hard`, `git clean -f`, `git checkout --`, `git push --force` — every run requires explicit confirmation, no exceptions
852
- - **Notable** (confirmation on first use): `git push`, `npm publish`, `ssh`, `HUSKY=0`, `SKIP_HUSKY=1` — first-time prompt, then respects "always allow"
853
- - **SSH read-only safe list**: Common read-only SSH commands (`systemctl status`, `journalctl`, `tail`, `cat`, `git pull`, etc.) skip the dangerous-command confirmation
854
- - **Path protection**: Sensitive paths (`.ssh/`, `.aws/`, `.env`, credentials) are blocked from file operations
855
- - **Loop detection**: Edit-loop abort after 4 edits to the same file (warn at 2); bash-command loop abort after 8 identical commands (warn at 5); consecutive-error abort after 10 failures (warn at 6)
856
- - **Stale-stream detection**: Warns after 60 s without tokens (shows retry count + seconds until auto-abort); auto-switches to the fast model on retry 1 and offers interactive recovery when all retries are exhausted
857
- - **Auto Plan Mode**: Implementation tasks (`implement`, `refactor`, `create`, `build`, `add`, `write`, …) automatically activate plan mode — read-only analysis first, approve before any writes. Disable with `NEX_AUTO_PLAN=0`
858
- - **Intent-first behavior**: Before executing, the agent understands why you asked. If it finds something that contradicts or already satisfies the task, it asks instead of proceeding blindly
859
- - **Pre-push secret detection**: Git hook scans diffs for API keys, private keys, hardcoded secrets, SSH+IP patterns, and `.env` leaks before allowing push
860
- - **Post-merge automation**: Auto-bumps patch version on `devel→main` merge; runs `npm install` when `package.json` changes
861
-
862
- ### Sessions
863
-
864
- nex-code automatically saves your conversation after every turn. If the process crashes or is closed unexpectedly, the next startup will detect the autosave and offer to restore it:
865
-
866
- ```
867
- Previous session found. Resume? (y/n)
868
- ```
869
-
870
- Only sessions from the last 24 hours are offered for auto-resume. Older autosaves are silently skipped.
871
-
872
- **Session commands:**
873
-
874
- | Command | Description |
875
- | -------------- | -------------------------------------------------------- |
876
- | `/save <name>` | Save current conversation under a named slot |
877
- | `/load <name>` | Restore a previously saved session |
878
- | `/sessions` | List all saved sessions with message count and timestamp |
879
- | `/resume` | Resume the most recently saved session |
880
-
881
- ```
882
- /save my-feature # save with name
883
- /load my-feature # restore by name
884
- /sessions # list all saved sessions
885
- /resume # restore the latest session
886
- ```
887
-
888
- Sessions are stored in `.nex/sessions/` as JSON files. Auto-saves always write to `_autosave` (overwritten each turn). Writes are atomic — a temp file is written and renamed, so a crash mid-write never corrupts the saved state.
889
-
890
- ### Session Trees
302
+ ### Multi-Agent Orchestrator
891
303
 
892
- Navigate your conversation history like git branches. Fork at any point, explore alternative approaches, and switch between branches:
304
+ Multi-goal prompts auto-decompose into parallel sub-agents. Up to 5 agents run simultaneously with file locking.
893
305
 
894
- ```
895
- /timeline # see message indices
896
- /fork 5 experiment # branch from message 5
897
- /branches # see all branches
898
- /switch-branch main # go back to main
899
- /goto 3 # jump to message 3 (truncates later messages)
900
- /delete-branch experiment
306
+ ```bash
307
+ nex-code --task "fix type errors in src/, add JSDoc to utils/, update CHANGELOG"
901
308
  ```
902
309
 
903
- This enables non-linear conversations: try an approach, and if it doesn't work, fork from an earlier point and try something different — without losing the original attempt.
904
-
905
310
  ### Autoresearch
906
311
 
907
- Autonomous optimization loops inspired by [Karpathy's autoresearch](https://github.com/karpathy/autoresearch). The agent creates a dedicated experiment branch, edits code, runs experiments, and automatically keeps improvements or reverts failures:
312
+ Autonomous optimization loops: edit -> experiment -> keep/revert, on a dedicated branch.
908
313
 
909
314
  ```
910
315
  /autoresearch reduce test runtime while maintaining correctness
911
- /autoresearch optimize bundle size under 500kb
912
- ```
913
-
914
- The agent follows a repeating cycle on a dedicated `autoresearch/<tag>` branch: **setup branch** -> **checkpoint** -> **edit** -> **run experiment** -> **log result** -> **keep or revert (git reset)**. Runs indefinitely until you interrupt. Experiments are logged to `.nex/autoresearch/experiments.json` with metrics, resource usage, and complexity tracking. Output can be redirected to log files with metric extraction via grep patterns to protect context.
915
-
916
- ```
917
- /ar-self-improve # self-improvement loop using nex-code's benchmark as metric
918
- /ar-self-improve sysadmin # focus on a specific weak category
919
- /ar-status # show experiment history with trends
920
- /ar-clear # reset experiment history
921
- ```
922
-
923
- The loop can also run **headless** — useful for unattended overnight sessions:
924
-
925
- ```bash
926
- nex-code --task "/ar-self-improve" --no-auto-orchestrate --max-turns 200
927
- ```
928
-
929
- `/ar-self-improve` uses nex-code's own 14-task quick benchmark as the fitness metric. Each experiment that raises the average score above the session baseline is kept; all others are reverted with `git reset`. The benchmark output includes a **Failing tasks** section that names which tasks each model got wrong, making root causes immediately visible.
930
-
931
- > **Self-improvement history** (2026-03-31): baseline 86.7 → **92.9** (+6.2 pts) in one session. Key fix: rewording the `edit_file` tool description so models call it directly instead of first calling `read_file`. `rnj-1:8b` jumped from 77.1 → 97.9 on that change alone.
932
-
933
- ### Daemon / Watch Mode
934
-
935
- Keep nex-code running in the background and fire tasks automatically when things change. Reads config from `.nex/daemon.json` (or a path passed after the flag):
936
-
937
- ```bash
938
- nex-code --daemon # reads .nex/daemon.json
939
- nex-code --daemon /path/to/config.json
940
- nex-code --watch # alias
941
- ```
942
-
943
- **Example `.nex/daemon.json`:**
944
-
945
- ```json
946
- {
947
- "triggers": [
948
- {
949
- "on": "file-change",
950
- "glob": "**/*.{js,ts}",
951
- "ignore": ["dist/**", "node_modules/**"],
952
- "task": "run npm test -- --testPathPattern={changedFile} and report results",
953
- "debounceMs": 2000,
954
- "auto": true
955
- },
956
- {
957
- "on": "git-commit",
958
- "task": "review the last commit for bugs or security issues — commit: {commitHash} \"{commitMessage}\"",
959
- "auto": true
960
- },
961
- {
962
- "on": "schedule",
963
- "cron": "0 8 * * *",
964
- "task": "check npm audit and report any new vulnerabilities",
965
- "auto": true
966
- }
967
- ],
968
- "notify": ["desktop", "matrix"],
969
- "logFile": ".nex/daemon.log"
970
- }
971
- ```
972
-
973
- **Trigger types:**
974
-
975
- | Type | Mechanism | Template vars |
976
- |------|-----------|---------------|
977
- | `file-change` | `fs.watch({ recursive: true })` + debounce | `{changedFile}`, `{changedFiles}` |
978
- | `git-commit` | polls `git log` every 10 s | `{commitHash}`, `{commitMessage}` |
979
- | `schedule` | `setInterval` — supports `*/N * * * *` and `0 H * * *` | — |
980
-
981
- **Notifications:** `"desktop"` fires a macOS notification via `osascript`. `"matrix"` posts to the room set in `NEX_MATRIX_URL` / `NEX_MATRIX_TOKEN` / `NEX_MATRIX_ROOM`.
982
-
983
- Each triggered task runs in its own isolated session (`clearConversation()` is called after every task). Events are appended as one-line JSON to `logFile`; the file is truncated (not deleted) when it exceeds 5 MB.
984
-
985
- No new npm dependencies — uses only Node.js built-ins (`fs`, `https`, `child_process`).
986
-
987
- ### Memory
988
-
989
- Persistent project memory that survives across sessions:
990
-
991
- ```
992
- /remember lang=TypeScript
993
- /remember always use yarn instead of npm
994
- /memory
995
- /forget lang
996
- ```
997
-
998
- Also loads `NEX.md` from project root for project-level instructions.
999
-
1000
- ### Brain — Persistent Knowledge Base
1001
-
1002
- A project-scoped knowledge base stored in `.nex/brain/`. The agent automatically retrieves relevant documents for each query and can write new entries as it discovers useful patterns, decisions, or context:
1003
-
316
+ /ar-self-improve # self-improvement using nex-code's benchmark
1004
317
  ```
1005
- /brain add auth-flow # add a document (prompted for content)
1006
- /brain search "jwt token" # keyword + semantic search
1007
- /brain list # list all documents
1008
- /brain show auth-flow # display a document
1009
- /brain remove auth-flow # delete a document
1010
- /brain status # index health (docs, keywords, embeddings)
1011
- /brain review # git diff of recent brain writes
1012
- /brain undo # undo last brain write
1013
- ```
1014
-
1015
- The agent uses the `brain_write` tool to save discoveries automatically. All writes are tracked in git so you can review, revert, or audit what the agent has stored.
1016
318
 
1017
319
  ### Plan Mode
1018
320
 
1019
- Analyze before executing: the agent explores the codebase with read-only tools and produces a structured plan, which you approve before any changes are made.
1020
-
1021
- **Auto Plan Mode** — nex-code automatically activates plan mode when it detects an implementation task (prompts containing `implement`, `refactor`, `create`, `build`, `add`, `write`, etc.). No manual `/plan` needed:
1022
-
1023
- ```
1024
- > implement a search endpoint # → Auto Plan Mode activates immediately
1025
- > refactor the auth module # → Auto Plan Mode activates immediately
1026
- > how does auth work? # → normal mode (question, not implementation)
1027
- ```
1028
-
1029
- Disable with `NEX_AUTO_PLAN=0` if you prefer manual control.
1030
-
1031
- ```
1032
- /plan refactor the auth module # manual: enter plan mode with optional task
1033
- /plan status # show extracted steps with status icons
1034
- /plan edit # open plan in $EDITOR (nano/vim/code) to modify
1035
- /plan approve # approve and exit plan mode (all tools re-enabled)
1036
- /auto semi-auto # set autonomy level
1037
- ```
1038
-
1039
- Plan mode is **hard-enforced**: only read-only tools (`read_file`, `list_directory`, `search_files`, `glob`, `grep`, `web_search`, `web_fetch`, `git_status`, `git_diff`, `git_log`, `git_show`, `ask_user`) are available. Any attempt to call a write tool is blocked at the API level.
1040
-
1041
- **Step extraction**: when the LLM outputs a numbered plan, steps are automatically parsed into a structured list. During execution the spinner shows `Plan step 2/4: Implement tests` and `/plan status` shows per-step progress (`○` pending, `→` in progress, `✓` done). The plan text is saved to `.nex/plans/current-plan.md`.
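Step extraction from a numbered plan can be sketched as a line-by-line parse. The helper names `extractSteps` and `progressLabel`, and the accepted numbering formats (`1.` and `1)`), are assumptions; only the `Plan step N/M: title` label format comes from the text above.

```javascript
// Hypothetical sketch: parse "1. Do X" / "2) Do Y" lines into structured steps.
function extractSteps(planText) {
  const steps = [];
  for (const line of planText.split("\n")) {
    const match = line.match(/^\s*(\d+)[.)]\s+(.+)$/);
    if (match) {
      steps.push({ index: Number(match[1]), title: match[2].trim(), status: "pending" });
    }
  }
  return steps;
}

// Build the spinner label for the step currently executing (1-based).
function progressLabel(steps, current) {
  return `Plan step ${current}/${steps.length}: ${steps[current - 1].title}`;
}
```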
1042
-
1043
- ### Snapshots
1044
-
1045
- Named git snapshots — save and restore working-tree state at any point:
1046
-
1047
- ```
1048
- /snapshot before-refactor # create snapshot named "before-refactor"
1049
- /snapshot list # list all saved snapshots
1050
- /restore last # restore most recent snapshot
1051
- /restore before-refactor # restore by name
1052
- /restore list # show all available snapshots
1053
- ```
1054
-
1055
- Snapshots use `git stash` internally — no extra state files. The working tree is restored immediately after stashing so your changes are preserved. Use `/restore` when you want to roll back to a known-good state.
1056
-
1057
- ### File Tree
1058
-
1059
- Visualize the project structure:
1060
-
1061
- ```
1062
- /tree # show tree at depth 3
1063
- /tree 2 # shallower view
1064
- /tree 5 # deeper view (max 8)
1065
- ```
1066
-
1067
- Automatically excludes `node_modules`, `.git`, `dist`, `build`, `coverage`, and all entries listed in `.gitignore`. Directories are sorted before files.
1068
-
1069
- ### Undo / Redo (Persistent)
1070
-
1071
- Undo/redo for all file changes (write, edit, patch) — **survives restart**:
1072
-
1073
- ```
1074
- /undo # undo last file change
1075
- /redo # redo last undone change
1076
- /history # show file change history
1077
- ```
1078
-
1079
- Undo stack holds up to 50 changes, persisted to `.nex/history/`. Large files (>100KB) are deduplicated via SHA-256 blob storage. History is auto-pruned after 7 days. `/clear` resets the in-memory stack.
1080
-
1081
- > **Snapshots vs Undo**: `/undo` operates on the persistent change stack for fine-grained per-file rollback across sessions. `/snapshot` + `/restore` use git stash for broader checkpoints across multiple files.
1082
-
1083
- ### Desktop Notifications
1084
-
1085
- On macOS, nex-code fires a system notification when a task completes after ≥ 30 seconds — useful when running long autonomous tasks in the background. No configuration needed; requires macOS Notification Center access.
1086
-
1087
- ### Task Management
1088
-
1089
- Create structured task lists for complex multi-step operations:
1090
-
1091
- ```
1092
- /tasks # show current task list
1093
- /tasks clear # clear all tasks
1094
- ```
1095
-
1096
- The agent uses `task_list` to create, update, and track progress on tasks with dependency support.
1097
-
1098
- When the agent creates a task list, a **live animated display** replaces the static output:
1099
-
1100
- ```
1101
- ✽ Adding cost limit functions… (1m 35s · ↓ 2.6k tokens)
1102
- ⎿ ✔ Create cli/picker.js — Interactive Terminal Picker
1103
- ◼ Add cost limits to cli/costs.js
1104
- ◻ Add budget gate to cli/providers/registry.js
1105
- ◻ Update cli/index.js
1106
- ◻ Run tests
1107
- ```
1108
-
1109
- - Bouncing-ball spinner (⏺ ping-pong across 5 positions) with elapsed time display
1110
- - Per-task status icons: `✔` done, `◼` in progress, `◻` pending, `✗` failed
1111
- - Automatically pauses during text streaming and resumes during tool execution
1112
- - Falls back to the static `/tasks` view when no live display is active
1113
-
1114
- ### Sub-Agents
1115
-
1116
- Spawn parallel sub-agents for independent tasks:
1117
-
1118
- - Up to 5 agents run simultaneously with their own conversation contexts
1119
- - File locking prevents concurrent writes to the same file (intra-process sub-agents)
1120
- - Multi-progress display shows real-time status of each agent
1121
- - Good for: reading multiple files, analyzing separate modules, independent research
1122
-
1123
- ### Multi-Agent Orchestrator
1124
-
1125
- For complex tasks with multiple independent goals (e.g. "fix all TypeScript errors in auth/, add tests for utils/, and update the README"), the orchestrator decomposes the prompt into parallel sub-tasks, runs dedicated sub-agents on each, and synthesizes the results.
1126
-
1127
- **Auto-orchestration is on by default** for prompts with ≥3 goals.
321
+ Auto-activates for implementation tasks. Read-only analysis first, approve before writes. Hard-enforced tool restrictions.
1128
322
 
1129
- ```bash
1130
- # Just use it — multi-goal prompts auto-decompose into parallel agents
1131
- nex-code --task "fix all type errors in src/, add JSDoc to utils/, update CHANGELOG"
1132
-
1133
- # Custom orchestrator model
1134
- nex-code --orchestrator-model kimi-k2.5 --task "..."
1135
-
1136
- # Disable auto-orchestration
1137
- NEX_AUTO_ORCHESTRATE=false nex-code
1138
-
1139
- # Lower the goal threshold (default: 3)
1140
- NEX_ORCHESTRATE_THRESHOLD=2 nex-code
1141
- ```
1142
-
1143
- **Interactive:** type `/orchestrate <task>` at the prompt.
1144
-
1145
- **Example output:**
1146
-
1147
- ```
1148
- Orchestrator model: kimi-k2.5 | workers: devstral-2:123b | max parallel: 3
1149
-
1150
- Phase 1: Decomposing prompt into sub-tasks...
1151
- Decomposed into 3 sub-tasks:
1152
- t1: Fix TypeScript errors in src/auth/
1153
- scope: src/auth/
1154
- t2: Add JSDoc comments to src/utils/
1155
- scope: src/utils/
1156
- t3: Update CHANGELOG with recent changes
1157
- scope: CHANGELOG.md
1158
-
1159
- Phase 2: Running 3 sub-agents (max 3 parallel)...
1160
-
1161
- ✓ Agent 1 [devstral-2:123b]: Fix TypeScript errors in src/auth/: fixed 4 type errors in login.ts, ...
1162
- ✓ Agent 2 [devstral-2:123b]: Add JSDoc comments to src/utils/: documented 12 functions across 3 files
1163
- ✓ Agent 3 [devstral-2:123b]: Update CHANGELOG: added entries for v1.2.0 changes
1164
-
1165
- Phase 3: Synthesizing results...
1166
-
1167
- Summary: Fixed 4 TS errors in auth module, added JSDoc to 12 utility functions, updated CHANGELOG.
1168
- Suggested commit: fix: resolve auth type errors and add utility docs
1169
- ```
1170
-
1171
- **Env vars:**
1172
-
1173
- | Variable | Default | Description |
1174
- | --------------------------- | ------- | ------------------------------------------------------- |
1175
- | `NEX_AUTO_ORCHESTRATE` | `true` | Set to `false` to disable auto-orchestration |
1176
- | `NEX_ORCHESTRATE_THRESHOLD` | `3` | Minimum number of detected goals before auto-triggering |
1177
-
1178
- **Model roles in orchestration:**
1179
-
1180
- | Role | Default model | Purpose |
1181
- | ------------ | ----------------- | ------------------------------------------- |
1182
- | Orchestrator | `kimi-k2.5` | Decomposes prompt, synthesizes results |
1183
- | Worker | `devstral-2:123b` | Executes each sub-task (one agent per task) |
1184
-
1185
- Override via `--orchestrator-model` (orchestrator) or `DEFAULT_MODEL` / `NEX_STANDARD_MODEL` (workers).
1186
-
1187
- ---
1188
-
1189
- ### Parallel Sessions
1190
-
1191
- Running multiple nex-code instances in the same project directory is safe. All shared state files (`.nex/memory/memory.json`, `.nex/config.json`, `NEX.md`, brain index) use advisory inter-process locking (`O_EXCL` lock files with stale-lock reclaim) and atomic writes (temp file + `rename`). A session in Terminal A and a session in Terminal B can both call `/remember`, `/allow`, or `/learn` simultaneously without data corruption.
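The two mechanisms named above, `O_EXCL` lock files with stale-lock reclaim and temp file plus `rename`, can be sketched with Node's `fs` module. The README's architecture listing mentions `atomicWrite` and `withFileLockSync`, but the signatures, the `.lock` suffix, the stale threshold, and the retry policy below are assumptions, not the project's implementation.

```javascript
const fs = require("node:fs");

// Sleep synchronously without spinning the CPU.
function sleepSync(ms) {
  Atomics.wait(new Int32Array(new SharedArrayBuffer(4)), 0, 0, ms);
}

// Remove the lock file if it is older than staleMs (its owner probably crashed).
function reclaimIfStale(lockPath, staleMs) {
  try {
    if (Date.now() - fs.statSync(lockPath).mtimeMs > staleMs) fs.unlinkSync(lockPath);
  } catch (err) {
    if (err.code !== "ENOENT") throw err; // lock vanished in the meantime: fine
  }
}

// Hypothetical signature: run fn while holding an advisory lock on `file`.
function withFileLockSync(file, fn, { staleMs = 10000, retries = 100 } = {}) {
  const lockPath = `${file}.lock`;
  let held = false;
  for (let i = 0; i < retries && !held; i++) {
    try {
      // The "wx" flag maps to O_CREAT | O_EXCL: creation fails if the file exists.
      fs.closeSync(fs.openSync(lockPath, "wx"));
      held = true;
    } catch (err) {
      if (err.code !== "EEXIST") throw err;
      reclaimIfStale(lockPath, staleMs);
      sleepSync(20);
    }
  }
  if (!held) throw new Error(`could not lock ${file}`);
  try {
    return fn();
  } finally {
    fs.unlinkSync(lockPath);
  }
}

// Write to a temp file, then rename over the target so readers never see a partial file.
function atomicWriteSync(file, data) {
  const tmp = `${file}.${process.pid}.tmp`;
  fs.writeFileSync(tmp, data);
  fs.renameSync(tmp, file);
}
```

The lock is advisory: it only protects writers that go through the same helper, which is why every shared state file has to use it consistently.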
1192
-
1193
- **Multi-Model Routing** — Sub-agents auto-select the best model per task based on complexity:
1194
-
1195
- - **Read/search/list** tasks → fast models (essential tier)
1196
- - **Edit/fix/analyze** tasks → capable models (standard tier)
1197
- - **Refactor/implement/generate** tasks → most powerful models (full tier)
1198
-
1199
- The LLM can also explicitly override with `model: "provider:model"` in the agent definition. When multiple providers are configured, the system prompt includes a routing table showing all available models and their tiers.
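As a rough illustration of the routing rules listed above, a task description can be mapped to a tier by keyword. The rule table, the `pickTier` name, the precedence (most capable tier wins), and the `standard` fallback are assumptions; the README does not say how nex-code classifies task complexity.

```javascript
// Hypothetical sketch of tier routing; checked from most to least capable tier.
const TIER_RULES = [
  { tier: "full", pattern: /\b(refactor|implement|generate)\b/i },
  { tier: "standard", pattern: /\b(edit|fix|analyze)\b/i },
  { tier: "essential", pattern: /\b(read|search|list)\b/i },
];

function pickTier(task, fallback = "standard") {
  const rule = TIER_RULES.find((r) => r.pattern.test(task));
  return rule ? rule.tier : fallback;
}
```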
1200
-
1201
- ### Git Intelligence
1202
-
1203
- ```
1204
- /commit # analyze diff, suggest commit message
1205
- /commit feat: add login
1206
- /diff # show current diff summary
1207
- /branch my-feature # create and switch to branch
1208
- ```
1209
-
1210
- ### Permissions
1211
-
1212
- Control which tools the agent can use:
1213
-
1214
- ```
1215
- /permissions # show current settings
1216
- /allow read_file # auto-allow without asking
1217
- /deny bash # block completely
1218
- ```
1219
-
1220
- Persisted in `.nex/config.json`.
1221
-
1222
- ### Cost Tracking
1223
-
1224
- Track token usage and costs per provider:
323
+ ### Daemon / Watch Mode
324
+ <<<<<<< Updated upstream
325
+ Background process that fires tasks on file changes, git commits, or cron schedule. Configured via `.nex/daemon.json`. Desktop and Matrix notifications.
326
+ =======
1225
327
 
1226
- ```
1227
- /costs
1228
- /costs reset
1229
- ```
328
+ Background process that fires tasks on file changes, git commits, or cron schedule. Configured via `.nex/daemon.json`. Desktop and Matrix notifications.
1230
329
 
1231
- ### Cost Limits
330
+ >>>>>>> Stashed changes
331
+ ### Session Trees
1232
332
 
1233
- Set per-provider spending limits. When a provider exceeds its budget, calls automatically fall back to the next provider in the fallback chain:
333
+ Navigate conversation history like git branches: fork, switch, goto, and delete branches.
1234
334
 
1235
- ```
1236
- /budget # show all limits + current spend
1237
- /budget anthropic 5 # set $5 limit for Anthropic
1238
- /budget openai 10 # set $10 limit for OpenAI
1239
- /budget anthropic off # remove limit
1240
- ```
335
+ ### Safety
1241
336
 
1242
- Limits are persisted in `.nex/config.json`. You can also set them directly:
337
+ | Layer | What it guards | Bypass? |
338
+ |---|---|---|
339
+ | **Forbidden patterns** | `rm -rf /`, fork bombs, reverse shells, `cat .env` | No |
340
+ | **Protected paths** | Destructive ops on `.env`, `.ssh/`, `.aws/`, `.git/` | `NEX_UNPROTECT=1` |
341
+ | **Sensitive file tools** | read/write/edit on `.env`, `.ssh/`, `.npmrc`, `.kube/` | No |
342
+ | **Critical commands** | `rm -rf`, `sudo`, `git push --force`, `git reset --hard` | Explicit confirmation |
1243
343
 
1244
- ```json
1245
- // .nex/config.json
1246
- {
1247
- "costLimits": {
1248
- "anthropic": 5,
1249
- "openai": 10
1250
- }
1251
- }
1252
- ```
344
+ Pre-push secret detection, audit logging (JSONL), persistent undo/redo, cost limits, auto plan mode.
1253
345
 
1254
346
  ### Open-Source Model Robustness
1255
347
 
1256
- Four features that make Nex Code significantly more reliable with open-source models:
1257
-
1258
- **Tool Call Retry with Schema Hints** — When a model sends malformed tool arguments, instead of a bare error, the agent sends back the expected JSON schema so the model can self-correct on the next loop iteration.
1259
-
1260
- **Smart Argument Parsing** — 5 fallback strategies for parsing tool arguments: direct JSON, trailing comma/quote fixes, JSON extraction from surrounding text, unquoted key repair, and markdown code fence stripping (common with DeepSeek R1, Llama).
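A fallback chain of this kind can be sketched as a list of string repairs tried in order until one yields a JSON object. The function name `parseToolArgs` and the specific regular expressions are assumptions. Each repair is applied to the raw input on its own rather than cumulatively, which may differ from the real parser.

```javascript
// Hypothetical sketch of a five-strategy argument parser.
function parseToolArgs(raw) {
  const tryParse = (s) => {
    try {
      return JSON.parse(s);
    } catch {
      return undefined;
    }
  };
  const strategies = [
    (s) => s, // 1. direct JSON
    (s) => s.replace(/,\s*([}\]])/g, "$1"), // 2. drop trailing commas
    (s) => (s.match(/\{[\s\S]*\}/) || [""])[0], // 3. extract {...} from surrounding text
    (s) => s.replace(/([{,]\s*)([A-Za-z_]\w*)\s*:/g, '$1"$2":'), // 4. quote bare keys
    (s) => s.replace(/^\s*```(?:json)?\s*|\s*```\s*$/g, ""), // 5. strip code fences
  ];
  for (const repair of strategies) {
    const parsed = tryParse(repair(raw.trim()));
    if (parsed !== null && typeof parsed === "object") return parsed;
  }
  return null; // caller can fall back to a schema-hint retry
}
```

The bare-key repair is deliberately naive: it can corrupt string values that themselves contain `, word:` sequences, so a production parser would need a tokenizer rather than a regex.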
1261
-
1262
- **Tool Argument Validation** — Validates arguments against tool schemas before execution. Auto-corrects similar parameter names (Levenshtein distance), fixes type mismatches (string↔number↔boolean), and provides "did you mean?" suggestions.
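The name correction and type coercion described above might be approximated as follows. This is an illustrative sketch under stated assumptions: `correctArgs`, `coerce`, the distance threshold of 2, and the choice to drop unmatched parameters are not taken from nex-code's source.

```javascript
// Classic single-row Levenshtein edit distance.
function levenshtein(a, b) {
  const row = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    let diag = row[0];
    row[0] = i;
    for (let j = 1; j <= b.length; j++) {
      const above = row[j];
      row[j] = Math.min(above + 1, row[j - 1] + 1, diag + (a[i - 1] === b[j - 1] ? 0 : 1));
      diag = above;
    }
  }
  return row[b.length];
}

// Convert between string, number, and boolean when the schema asks for another type.
function coerce(value, type) {
  if (type === "number" && typeof value === "string" && value.trim() !== "" && !Number.isNaN(Number(value))) {
    return Number(value);
  }
  if (type === "boolean" && (value === "true" || value === "false")) return value === "true";
  if (type === "string" && (typeof value === "number" || typeof value === "boolean")) return String(value);
  return value;
}

// Hypothetical: rename near-miss keys to the closest schema property, then coerce types.
function correctArgs(args, schema, maxDistance = 2) {
  const known = Object.keys(schema.properties);
  const fixed = {};
  for (const [key, value] of Object.entries(args)) {
    let target = key;
    if (!known.includes(key)) {
      const best = known
        .map((name) => ({ name, distance: levenshtein(key, name) }))
        .sort((x, y) => x.distance - y.distance)[0];
      if (!best || best.distance > maxDistance) continue; // no plausible match: drop it
      target = best.name;
    }
    fixed[target] = coerce(value, schema.properties[target].type);
  }
  return fixed;
}
```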
1263
-
1264
- **Auto-Fix Engine** — Three layers of automatic error recovery that silently fix common tool failures:
1265
-
1266
- - **Path auto-fix**: Wrong extension? Finds the right one (`.js` → `.ts`). File moved? Globs for it by basename. Double slashes, missing extensions — all auto-resolved.
1267
- - **Edit auto-fix**: Close match (≤5% Levenshtein distance) in `edit_file`/`patch_file` is auto-applied instead of erroring. Stacks with fuzzy whitespace matching.
1268
- - **Bash error hints**: Enriches error output with actionable hints — "command not found" → install suggestion, `MODULE_NOT_FOUND` → `npm install <pkg>`, port in use, syntax errors, TypeScript errors, test failures, and more.
1269
-
1270
- **Stale Stream Recovery** — Progressive retry strategy when streams stall (common with large Ollama models after many agent steps):
1271
-
1272
- - 1st retry: 3s backoff delay, resend same context (handles transient stalls)
1273
- - 2nd retry: force-compress conversation (~80k tokens freed), 5s delay, retry with smaller context
1274
- - Last resort: if retries exhausted, one final force-compress + reset for fresh attempts
1275
- - Broader context-too-long detection catches Ollama-specific error formats (`num_ctx`, `prompt`, `size`, `exceeds`)
1276
-
1277
- **Tool Tiers** — Dynamically reduces the tool set based on model capability:
1278
-
1279
- - **essential** (5 tools): bash, read_file, write_file, edit_file, list_directory
1280
- - **standard** (21 tools): + search_files, glob, grep, ask_user, git_status, git_diff, git_log, task_list, ssh_exec, service_manage, service_logs, container_list, container_logs, container_exec, container_manage, deploy
1281
- - **full** (45 tools): all tools
1282
-
1283
- Models are auto-classified, or override per-model in `.nex/config.json`:
1284
-
1285
- ```json
1286
- {
1287
- "toolTiers": {
1288
- "deepseek-r1": "essential",
1289
- "local:*": "essential",
1290
- "qwen3-coder": "full"
1291
- },
1292
- "maxIterations": 100
1293
- }
1294
- ```
1295
-
1296
- `maxIterations` sets the agentic loop limit project-wide (default: 50). The `--max-turns <n>` CLI flag overrides it per run.
1297
-
1298
- Tiers are also used by sub-agent routing — when a sub-agent auto-selects a model, its tool set is filtered to match that model's tier.
348
+ - **5-layer argument parsing**: JSON, trailing fix, extraction, key repair, fence stripping
349
+ - **Tool call retry with schema hints** — malformed args get the expected schema for self-correction
350
+ - **Auto-fix engine** — path resolution, edit fuzzy matching (Levenshtein), bash error hints
351
+ - **Tool tiers** — essential (5) / standard (21) / full (45), auto-selected per model capability
352
+ - **Stale stream recovery** — progressive retry with context compression on stall
353
+ <<<<<<< Updated upstream
354
+ ### Visual Development Tools
355
+ Pixel-level before/after comparison, responsive sweeps (320-1920px), annotation overlays, design token extraction, and live-reload diff watching. Pure image tools work standalone; browser-based tools need Playwright.
1299
356
 
1300
357
  ---
1301
358
 
1302
- ## Skills
1303
-
1304
- Extend Nex Code with project-specific knowledge, commands, and tools via `.nex/skills/`.
1305
-
1306
- ### Prompt Skills (`.md`)
1307
-
1308
- Drop a Markdown file and its content is injected into the system prompt:
1309
-
1310
- ```markdown
1311
- <!-- .nex/skills/code-style.md -->
1312
-
1313
- # Code Style
1314
-
1315
- - Always use semicolons
1316
- - Prefer const over let
1317
- - Use TypeScript strict mode
1318
- ```
1319
-
1320
- ### Script Skills (`.js`)
1321
-
1322
- CommonJS modules that can provide instructions, slash commands, and tools:
1323
-
1324
- ```javascript
1325
- // .nex/skills/deploy.js
1326
- module.exports = {
1327
- name: "deploy",
1328
- description: "Deployment helper",
1329
- instructions: "When deploying, always run tests first...",
1330
- commands: [
1331
- {
1332
- cmd: "/deploy",
1333
- desc: "Run deployment",
1334
- handler: (args) => {
1335
- /* ... */
1336
- },
1337
- },
1338
- ],
1339
- tools: [
1340
- {
1341
- type: "function",
1342
- function: {
1343
- name: "deploy_status",
1344
- description: "Check status",
1345
- parameters: { type: "object", properties: {} },
1346
- },
1347
- execute: async (args) => "deployed",
1348
- },
1349
- ],
1350
- };
1351
- ```
1352
-
1353
- ### Management
1354
-
1355
- ```
1356
- /skills # list loaded skills
1357
- /skills enable code-style # enable a skill
1358
- /skills disable code-style # disable a skill
1359
- ```
1360
-
1361
- Skills are loaded on startup. All enabled by default. Disabled skills tracked in `.nex/config.json`.
1362
-
1363
- ### Global Skills (`~/.nex-code/skills/`)
359
+ ## Extensibility
1364
360
 
1365
- Skills placed in `~/.nex-code/skills/` are loaded globally across all projects. Useful for cross-project workflows.
361
+ ### Skills
1366
362
 
1367
- **Example: `server-agent.md`** instructs nex-code on your Mac to delegate tasks to a nex-code instance on a remote server using the `remote_agent` tool. Define a project→server mapping table in the skill so the agent knows which path to use for each project name.
363
+ Drop `.md` or `.js` files in `.nex/skills/` for project-specific knowledge, commands, and tools. Global skills in `~/.nex-code/skills/`. Install from git: `/install-skill user/repo`.
1368
364
 
1369
- ### Skill Marketplace
365
+ ### Plugins
1370
366
 
1371
- Install community skills directly from git:
367
+ Custom tools and lifecycle hooks via `.nex/plugins/`. Events: `onToolResult`, `onModelResponse`, `onSessionStart`, `onSessionEnd`, `onFileChange`, `beforeToolExec`, `afterToolExec`.
1372
368
 
1373
- ```
1374
- /install-skill https://github.com/user/nex-skill-deploy
1375
- /install-skill user/nex-skill-deploy # shorthand
1376
- /search-skill kubernetes # search GitHub
1377
- /remove-skill deploy # uninstall
1378
- ```
1379
-
1380
- Skills are cloned to `.nex/skills/{name}/` and validated (must contain `skill.json`, `.md`, or `.js` files).
1381
-
1382
- ### Built-in Skills
369
+ ### MCP
1383
370
 
1384
- nex-code ships with built-in skills in `cli/skills/`:
371
+ Connect external tool servers via [Model Context Protocol](https://modelcontextprotocol.io). Configure in `.nex/mcp.json` with env var interpolation.
1385
372
 
1386
- - **devops** — DevOps agent instructions for SSH, Docker, deploy, and infrastructure tools
373
+ ### Hooks
1387
374
 
1388
- Built-in skills are loaded automatically. Project skills with the same name override built-ins.
375
+ Run custom scripts on CLI events (`pre-tool`, `post-tool`, `pre-commit`, `post-response`, `session-start`, `session-end`). Configure in `.nex/config.json` or `.nex/hooks/`.
1389
376
 
1390
377
  ---
1391
378
 
1392
- ## Plugins
1393
-
1394
- Extend nex-code with custom tools and lifecycle hooks via `.nex/plugins/`:
1395
-
1396
- ```javascript
1397
- // .nex/plugins/my-plugin.js
1398
- module.exports = function setup(api) {
1399
- api.registerTool(
1400
- {
1401
- type: "function",
1402
- function: {
1403
- name: "my_tool",
1404
- description: "Custom tool",
1405
- parameters: { type: "object", properties: {} },
1406
- },
1407
- },
1408
- async (args) => {
1409
- return "result";
1410
- },
1411
- );
1412
-
1413
- api.registerHook("onToolResult", (data) => {
1414
- console.log(`Tool ${data.tool} completed`);
1415
- return data;
1416
- });
1417
- };
1418
- ```
379
+ =======
1419
380
 
1420
- **Events:** `onToolResult`, `onModelResponse`, `onSessionStart`, `onSessionEnd`, `onFileChange`, `beforeToolExec`, `afterToolExec`
381
+ ### Visual Development Tools
1421
382
 
1422
- Plugins are loaded automatically on startup. Hook handlers can modify event data (return the modified object).
383
+ Pixel-level before/after comparison, responsive sweeps (320-1920px), annotation overlays, design token extraction, and live-reload diff watching. Pure image tools work standalone; browser-based tools need Playwright.
1423
384
 
1424
385
  ---
1425
386
 
1426
- ## Audit Logging
387
+ ## Extensibility
1427
388
 
1428
- When `NEX_AUDIT=1` is set, all tool executions are logged to `.nex/audit/YYYY-MM-DD.jsonl`:
389
+ ### Skills
1429
390
 
1430
- ```
1431
- /audit # show summary (total calls, success rate, per-tool breakdown)
1432
- ```
391
+ Drop `.md` or `.js` files in `.nex/skills/` for project-specific knowledge, commands, and tools. Global skills in `~/.nex-code/skills/`. Install from git: `/install-skill user/repo`.
1433
392
 
1434
- Arguments are automatically sanitized — keys matching `key`, `token`, `password`, `secret`, or `credential` are masked. Long values (>500 chars) are truncated.
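The sanitization rule above (mask sensitive keys, truncate values over 500 characters) is simple enough to sketch. The mask string, the truncation suffix, the helper names, and the JSONL record fields are assumptions. Only the key patterns and the 500-character limit come from the text.

```javascript
const SENSITIVE_KEY = /key|token|password|secret|credential/i;
const MAX_VALUE_LENGTH = 500;

// Hypothetical sketch: return a copy of args that is safe to write to an audit log.
function sanitizeArgs(args) {
  const out = {};
  for (const [name, value] of Object.entries(args)) {
    if (SENSITIVE_KEY.test(name)) {
      out[name] = "***";
    } else if (typeof value === "string" && value.length > MAX_VALUE_LENGTH) {
      out[name] = value.slice(0, MAX_VALUE_LENGTH) + "...[truncated]";
    } else {
      out[name] = value;
    }
  }
  return out;
}

// One JSONL record per tool execution (field names are illustrative).
function auditLine(tool, args, success, now = new Date()) {
  return JSON.stringify({ ts: now.toISOString(), tool, args: sanitizeArgs(args), success });
}
```

Note that this sketch only inspects top-level keys; nested objects would need a recursive walk.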
393
+ ### Plugins
1435
394
 
1436
- ---
1437
-
1438
- ## Safety
1439
-
1440
- nex-code includes multi-layer protections to prevent accidental damage — even in `--auto` and `--yolo` mode:
1441
-
1442
- | Layer | What it guards | Bypass possible? |
1443
- | ------------------------ | ------------------------------------------------------------------------------------------------------------------------------------ | ------------------------------ |
1444
- | **Forbidden patterns** | `rm -rf /`, fork bombs, reverse shells, `cat .env` | No |
1445
- | **Protected paths** | Destructive bash ops (`rm`, `mv`, `truncate`, …) on `.env`, `credentials/`, `venv/`, `.ssh/`, `.aws/`, `.sqlite3`, `.git/` internals | Only via `NEX_UNPROTECT=1` |
1446
- | **Sensitive file tools** | `read_file` / `write_file` / `edit_file` on `.env`, `.ssh/`, `.npmrc`, `.kube/config`, etc. | No |
1447
- | **Critical commands** | `rm -rf`, `sudo`, `git push --force`, `git reset --hard` | Requires explicit confirmation |
1448
-
1449
- **Override:** If you intentionally need to modify a protected path via bash (e.g. rotating credentials in a deploy script), set `NEX_UNPROTECT=1`:
1450
-
1451
- ```bash
1452
- NEX_UNPROTECT=1 nex-code
1453
- ```
1454
-
1455
- This disables the protected-path check only — forbidden patterns and critical-command prompts remain active.
1456
-
1457
- ### Reporting Vulnerabilities
1458
-
1459
- If you discover a security vulnerability, please report it responsibly:
1460
-
1461
- - **Do not** open a public GitHub issue
1462
- - Email: **security@schoensgibl.com**
1463
- - Include: description, reproduction steps, and potential impact
1464
- - Allow up to 72 hours for initial response
1465
-
1466
- ---
395
+ Custom tools and lifecycle hooks via `.nex/plugins/`. Events: `onToolResult`, `onModelResponse`, `onSessionStart`, `onSessionEnd`, `onFileChange`, `beforeToolExec`, `afterToolExec`.
1467
396
 
1468
- ## Team Permissions
397
+ ### MCP
1469
398
 
1470
- Permission presets for team environments:
399
+ Connect external tool servers via [Model Context Protocol](https://modelcontextprotocol.io). Configure in `.nex/mcp.json` with env var interpolation.
1471
400
 
1472
- | Preset | Description |
1473
- | ----------- | -------------------------------------------------- |
1474
- | `readonly` | Search and read tools only — no writes, no deploys |
1475
- | `developer` | All tools except deploy, ssh_exec, service_manage |
1476
- | `admin` | Full access to all tools |
1477
-
1478
- Configure in `.nex/config.json`:
1479
-
1480
- ```json
1481
- {
1482
- "permissionPreset": "developer"
1483
- }
1484
- ```
1485
-
1486
- Works alongside the existing per-tool `/allow` and `/deny` system.
1487
-
1488
- ---
1489
-
1490
- ## MCP
1491
-
1492
- Connect external tool servers via the [Model Context Protocol](https://modelcontextprotocol.io):
1493
-
1494
- ```json
1495
- // .nex/config.json
1496
- {
1497
- "mcpServers": {
1498
- "my-server": {
1499
- "command": "node",
1500
- "args": ["path/to/server.js"]
1501
- }
1502
- }
1503
- }
1504
- ```
1505
-
1506
- ```
1507
- /mcp # show servers and tools
1508
- /mcp connect # connect all configured servers
1509
- /mcp disconnect # disconnect all
1510
- ```
401
+ ### Hooks
1511
402
 
1512
- MCP tools appear with the `mcp_` prefix and are available to the agent alongside built-in tools.
403
+ Run custom scripts on CLI events (`pre-tool`, `post-tool`, `pre-commit`, `post-response`, `session-start`, `session-end`). Configure in `.nex/config.json` or `.nex/hooks/`.
1513
404
 
1514
405
  ---
1515
406
 
1516
- ## MCP Servers
1517
-
1518
- nex-code supports a dedicated `.nex/mcp.json` file (or `~/.nex/mcp.json` for global config) for
1519
- connecting MCP tool servers. This format supports environment variable interpolation so you can keep
1520
- API keys out of the config file.
1521
-
1522
- ```json
1523
- // .nex/mcp.json
1524
- {
1525
- "servers": {
1526
- "brave-search": {
1527
- "command": "npx",
1528
- "args": ["-y", "@modelcontextprotocol/server-brave-search"],
1529
- "env": { "BRAVE_API_KEY": "${BRAVE_API_KEY}" }
1530
- },
1531
- "filesystem": {
1532
- "command": "npx",
1533
- "args": ["-y", "@modelcontextprotocol/server-filesystem", "/tmp"]
1534
- }
1535
- }
1536
- }
1537
- ```
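The `${VAR}` interpolation used in the config above can be sketched as a recursive string replacement. The helper name `interpolateEnv` and the choice to leave unknown variables untouched are assumptions; the README only states that `${VAR}` in the `env` block is replaced from `process.env` at startup.

```javascript
// Hypothetical sketch: expand ${VAR} references in strings, arrays, and nested objects.
function interpolateEnv(value, env = process.env) {
  if (typeof value === "string") {
    return value.replace(/\$\{([A-Za-z_][A-Za-z0-9_]*)\}/g, (match, name) =>
      Object.hasOwn(env, name) ? env[name] : match // unknown variable: keep as written
    );
  }
  if (Array.isArray(value)) return value.map((item) => interpolateEnv(item, env));
  if (value !== null && typeof value === "object") {
    return Object.fromEntries(
      Object.entries(value).map(([key, item]) => [key, interpolateEnv(item, env)])
    );
  }
  return value;
}
```

Keeping unresolved placeholders visible, rather than replacing them with an empty string, makes a missing API key easier to spot when a server fails to start.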
1538
-
1539
- **Search order for config:** `.nex/mcp.json` → `~/.nex/mcp.json`. Override with `--mcp-config <path>`.
1540
-
1541
- **Slash commands:**
1542
-
1543
- ```
1544
- /mcp list — list connected MCP servers and their exposed tools
1545
- /mcp status — show which servers are running / stopped
1546
- ```
1547
-
1548
- **How it works:**
1549
-
1550
- 1. nex-code spawns each server as a child process using stdio JSON-RPC transport.
1551
- 2. It sends `initialize` + `tools/list` to discover available tools.
1552
- 3. All discovered tools are merged into the nex-code tool registry under the name `mcp_<server>_<tool>`.
1553
- 4. The agent can call them transparently alongside built-in tools.
1554
- 5. All server processes are shut down cleanly when nex-code exits.
407
+ >>>>>>> Stashed changes
408
+ ## VS Code Extension
1555
409
 
1556
- **Env var interpolation:** `${VAR}` in the `env` block is replaced from `process.env` at startup so you
1557
- can store the actual key in your shell environment or `.env` file:
410
+ Built-in sidebar chat panel (`vscode/`) with streaming output, collapsible tool cards, and native theme support. Spawns `nex-code --server` over JSON-lines IPC.
1558
411
 
1559
412
  ```bash
1560
- export BRAVE_API_KEY=my-key
1561
- nex-code --mcp-config .nex/mcp.json
1562
- ```
1563
-
1564
- ---
1565
-
1566
- ## Hooks
1567
-
1568
- Run custom scripts on CLI events:
1569
-
1570
- ```json
1571
- // .nex/config.json
1572
- {
1573
- "hooks": {
1574
- "pre-tool": ["echo 'before tool'"],
1575
- "post-tool": ["echo 'after tool'"],
1576
- "pre-commit": ["npm test"]
1577
- }
1578
- }
1579
- ```
1580
-
1581
- Events: `pre-tool`, `post-tool`, `pre-commit`, `post-response`, `session-start`, `session-end`.
1582
-
1583
- Or place executable scripts in `.nex/hooks/`:
1584
-
1585
- ```
1586
- .nex/hooks/pre-tool
1587
- .nex/hooks/post-tool
413
+ cd vscode && npm install && npm run package
414
+ # Cmd+Shift+P -> Extensions: Install from VSIX...
1588
415
  ```
1589
416
 
1590
417
  ---
@@ -1592,161 +419,49 @@ Or place executable scripts in `.nex/hooks/`:
1592
419
  ## Architecture
1593
420
 
1594
421
  ```
1595
- bin/nex-code.js # Entrypoint (shebang, .env, startREPL)
422
+ bin/nex-code.js # Entrypoint
1596
423
  cli/
1597
- ├── index.js # REPL + ~45 slash commands + history persistence + AbortController
1598
- ├── agent.js # Agentic loop + conversation state + compact output + résumé + abort handling
1599
- ├── providers/ # Multi-provider abstraction
1600
- │ ├── base.js # Abstract provider interface
1601
- │ ├── ollama.js # Ollama Cloud provider
1602
- │ ├── openai.js # OpenAI provider
1603
- │ ├── anthropic.js # Anthropic provider
1604
- │ ├── gemini.js # Google Gemini provider
1605
- │ ├── local.js # Local Ollama server
1606
- │ └── registry.js # Provider registry + model resolution + provider routing
1607
- ├── tools.js # 45 tool definitions + implementations + auto-fix engine
1608
- ├── sub-agent.js # Parallel sub-agent runner with file locking + model routing
1609
- ├── tasks.js # Task list management (create, update, render, onChange callbacks)
1610
- ├── skills.js # Skills system (prompt + script + marketplace)
1611
- ├── plugins.js # Plugin API (registerTool, registerHook, event system)
1612
- ├── audit.js # Tool execution audit logging (JSONL + sanitization)
1613
- ├── mcp.js # MCP client (JSON-RPC over stdio)
1614
- ├── hooks.js # Hook system (pre/post events)
1615
- ├── context.js # Auto-context (package.json, git, README) + generateFileTree()
1616
- ├── context-engine.js # Token management + relevance-based context compression
1617
- ├── session.js # Session persistence (.nex/sessions/)
1618
- ├── memory.js # Project memory (.nex/memory/ + NEX.md)
1619
- ├── filelock.js # Inter-process file locking (atomicWrite + withFileLockSync)
1620
- ├── permissions.js # Tool permission system + team presets (readonly/developer/admin)
1621
- ├── planner.js # Plan mode, step extraction, step cursor, autonomy levels
1622
- ├── git.js # Git intelligence (commit, diff, branch)
1623
- ├── render.js # Markdown + syntax highlighting + StreamRenderer + EPIPE guard
1624
- ├── format.js # Tool call formatting, result formatting, compact summaries
1625
- ├── spinner.js # Spinner, MultiProgress, TaskProgress, ToolProgress display components
1626
- ├── diff.js # LCS diff (Myers + Hirschberg) + colored output + side-by-side view
1627
- ├── fuzzy-match.js # Fuzzy text matching for edit auto-fix (Levenshtein, whitespace normalization)
1628
- ├── file-history.js # Persistent undo/redo + named git snapshots + blob storage
1629
- ├── picker.js # Interactive terminal picker (model selection)
1630
- ├── costs.js # Token cost tracking + per-provider budget limits
1631
- ├── safety.js # Forbidden/dangerous pattern detection
1632
- ├── tool-validator.js # Tool argument validation + auto-correction
1633
- ├── tool-tiers.js # Dynamic tool set selection per model + model tier lookup + edit mode
1634
- ├── footer.js # Sticky footer (scroll region, status bar, input row, resize, FOOTER_DEBUG)
1635
- ├── ui.js # ANSI colors, banner + re-exports from format.js/spinner.js
1636
- ├── index-engine.js # In-memory file index (ripgrep/fallback) + semantic content index
1637
- ├── skills/devops.md # Built-in DevOps agent skill
1638
- ├── auto-fix.js # Path resolution, edit matching, bash error hints
1639
- ├── tool-retry.js # Malformed argument retry with schema hints
1640
- └── ollama.js # Backward-compatible wrapper
1641
- ```
1642
-
1643
- ### Agentic Loop
1644
-
1645
- ```
1646
- User Input --> [AbortController created]
1647
- |
1648
- [System Prompt + Context + Memory + Skills + Conversation]
1649
- |
1650
- [Filter tools by model tier (essential/standard/full)]
1651
- |
1652
- Provider API (streaming + abort signal) --> Text tokens --> rendered to terminal
1653
- | \--> Tool calls --> parse args (5 strategies)
1654
- | |
1655
- | [Validate against schema + auto-correct]
1656
- | |
1657
- | Execute (skill / MCP / built-in)
1658
- | |
1659
- | [Auto-fix: path resolution, edit matching, bash hints]
1660
- |
1661
- [Tool results added to history]
1662
- |
1663
- Loop until: no more tool calls OR 50 iterations OR Ctrl+C abort
1664
- ```
1665
-
1666
- ---
1667
-
1668
- ## .nex/ Directory
1669
-
1670
- Project-local configuration and state (gitignored):
1671
-
1672
- ```
1673
- .nex/
1674
- ├── config.json # Permissions, MCP servers, hooks, skills, cost limits
1675
- ├── sessions/ # Saved conversations
1676
- ├── memory/ # Persistent project knowledge
1677
- ├── plans/ # Saved plans
1678
- ├── hooks/ # Custom hook scripts
1679
- ├── skills/ # Skill files (.md and .js)
1680
- └── push-allowlist # False-positive allowlist for pre-push secret detection
1681
- ```
1682
-
1683
- ---
1684
-
1685
- ## Performance
1686
-
1687
- Nex Code v0.3.45+ includes comprehensive performance optimizations:
1688
-
1689
- | Optimization | Improvement | Impact |
1690
- | ---------------------------- | --------------- | ------------------------- |
1691
- | **System Prompt Caching** | 4.3× faster | 77µs → 18µs |
1692
- | **Token Estimation Caching** | 3.5× faster | Cached after first call |
1693
- | **Context File Caching** | 10-20× faster | 50-200ms → 5-10ms |
1694
- | **Debounced Auto-Save** | 0ms in hot path | Saves after 5s inactivity |
1695
- | **Tool Filter Caching** | 1.7× faster | Cached per model |
1696
- | **Schema Cache** | 3.4× faster | 2.51µs → 0.73µs |
1697
-
1698
- **Average speedup:** 2.7× (micro-benchmarks)
1699
- **Real-world improvement:** ~10× faster per turn
1700
-
1701
- Run benchmarks yourself:
1702
-
1703
- ```bash
1704
- node benchmark.js
1705
- ```
424
+ agent.js # Agentic loop + conversation state + guards
425
+ providers/ # Ollama, OpenAI, Anthropic, Gemini, Local + wire protocols
426
+ tools/index.js # 45 tool definitions + auto-fix engine
427
+ context-engine.js # Token management + 5-phase compression
428
+ sub-agent.js # Parallel sub-agents with file locking
429
+ orchestrator.js # Multi-agent decompose -> execute -> synthesize
434
+ session-tree.js # Session branching
435
+ visual.js # Visual dev tools (pixelmatch-based)
436
+ browser.js # Playwright browser agent
437
+ skills/ # Built-in + user skills
438
+ ```
439
+
440
+ See [DEVELOPMENT.md](DEVELOPMENT.md) for full architecture details.
1706
441
 
1707
442
  ---
1708
443
 
1709
444
  ## Testing
1710
445
 
1711
446
  ```bash
1712
- npm test # Run all tests with coverage
1713
- npm run test:watch # Watch mode
447
+ npm test # 97 suites, 3920 tests
448
+ npm run typecheck # TypeScript noEmit check
449
+ npm run benchmark:gate # 7-task smoke test (blocks push on regression)
450
+ npm run benchmark:reallife # 35 real-world tasks across 7 categories
1714
451
  ```
1715
452
 
1716
- 91 test suites, 3719 tests, 83% statement / 74% branch coverage.
1717
-
1718
- CI runs on GitHub Actions (Node 20 LTS).
1719
-
1720
- **Type checking:** `npm run typecheck` runs TypeScript in `noEmit` mode with `allowJs`. Core type definitions live in `types/index.d.ts` (Message, ToolCall, IProvider, IWireProtocol, Session, Skill, etc.). The codebase uses incremental TypeScript adoption — new modules can be written in `.ts` while existing `.js` files are gradually migrated.
1721
-
1722
453
  ---
1723
454
 
1724
- ## Dependencies
1725
-
1726
- 2 runtime dependencies:
1727
-
1728
- ```json
1729
- {
1730
- "axios": "^1.7.0",
1731
- "dotenv": "^16.4.0"
1732
- }
1733
- ```
1734
-
1735
- Everything else is Node.js built-in.
1736
-
1737
- ## Installation
1738
-
1739
- ```bash
1740
- npm install -g nex-code # global install
1741
- npx nex-code # or run without installing
1742
- ```
1743
-
1744
- On first launch with no API keys configured, nex-code starts an **interactive setup wizard** that guides you through choosing a provider and entering credentials. You can re-run it anytime with `/setup`.
455
+ ## Security
1745
456
 
1746
- ## Roadmap
457
+ - Pre-push secret detection (API keys, private keys, hardcoded credentials)
458
+ - Audit logging with automatic argument sanitization
459
+ - Sensitive path blocking (`.ssh/`, `.aws/`, `.env`, credentials)
460
+ - Shell injection protection via `execFileSync` with argument arrays
461
+ - SSRF protection on `web_fetch`
462
+ - MCP environment isolation
1747
463
 
1748
- See [ROADMAP.md](ROADMAP.md) for planned features: VS Code extension, browser agent, PTY support, and more.
1749
- Community contributions are welcome on all roadmap items.
464
+ **Reporting vulnerabilities:** Email **security@schoensgibl.com** (not a public issue). Allow 72h for initial response.
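
To illustrate the `execFileSync` item in the list above: passing arguments as an array means no shell parses them, so metacharacters such as `;` arrive at the program as literal text. The sketch below is not taken from nex-code; `runSafely` is a hypothetical helper, and it uses the Node binary itself as the child program so the example runs anywhere.

```javascript
const { execFileSync } = require("node:child_process");

// Hypothetical helper (not nex-code's API): run a program with an argument
// array. No shell is spawned, so arguments are never re-parsed as commands.
function runSafely(program, args) {
  return execFileSync(program, args, { encoding: "utf8" });
}

// A value that would run a second command if it were spliced into a shell string.
const hostile = "hello; echo INJECTED";

// The child script simply echoes its first argument back.
const out = runSafely(process.execPath, [
  "-e",
  "console.log(process.argv[1])",
  hostile,
]);

// The whole string comes back as one argument; `echo INJECTED` never ran.
console.log(out.trim() === hostile);
```

By contrast, building a single command string and handing it to a shell would let the `;` terminate the first command and start another.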
1750
465
 
1751
466
  ---
1752
467