nex-code 0.5.12 → 0.5.14

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/README.md CHANGED
@@ -1,461 +1,159 @@
- <h1 align="center">nex-code</h1>
+ # nex-code
 
- <p align="center">
- <b>Run 400B+ open coding models on your codebase — without the hardware bill.</b><br>
- Ollama Cloud first. OpenAI, Anthropic, and Gemini when you need them.
- </p>
+ **A CLI coding assistant for production development workflows.**
 
- <p align="center">
- <code>npx nex-code</code>
- </p>
+ `nex-code` is an AI-powered developer tool that works in the terminal, reasons through tasks in phases, and routes work across multiple model providers. It is built for engineers who want an assistant that can operate on a real codebase, use real tools, and stay aligned with the way software is actually built and maintained.
 
- <p align="center">
- <a href="https://github.com/hybridpicker/nex-code/stargazers">If this saves you time, a star helps others find it.</a>
- </p>
+ ## Overview
 
- <p align="center">
- <a href="https://www.npmjs.com/package/nex-code"><img src="https://img.shields.io/npm/v/nex-code.svg" alt="npm version"></a>
- <a href="https://www.npmjs.com/package/nex-code"><img src="https://img.shields.io/npm/dm/nex-code.svg" alt="npm downloads"></a>
- <a href="https://github.com/hybridpicker/nex-code/stargazers"><img src="https://img.shields.io/github/stars/hybridpicker/nex-code.svg" alt="GitHub Stars"></a>
- <a href="https://github.com/hybridpicker/nex-code/actions/workflows/ci.yml"><img src="https://github.com/hybridpicker/nex-code/actions/workflows/ci.yml/badge.svg" alt="CI"></a>
- <a href="https://opensource.org/licenses/MIT"><img src="https://img.shields.io/badge/License-MIT-blue.svg" alt="License: MIT"></a>
- <img src="https://img.shields.io/badge/Ollama_Cloud-supported-brightgreen.svg" alt="Ollama Cloud: supported">
- <img src="https://img.shields.io/badge/node-%3E%3D18-brightgreen.svg" alt="Node >= 18">
- <img src="https://img.shields.io/badge/dependencies-2-green.svg" alt="Dependencies: 2">
- <img src="https://img.shields.io/badge/tests-3920-blue.svg" alt="Tests: 3920">
- <img src="https://img.shields.io/badge/VS_Code-extension-007ACC.svg" alt="VS Code extension">
- </p>
+ Most AI coding tools are optimized for short demos: generate a file, suggest a snippet, answer a question. Real development work is different. It involves understanding an existing repository, planning changes, editing carefully, running verification, and working with the operational tools around the code.
 
- ---
+ `nex-code` exists to close that gap. It is designed as a serious CLI-first system that can:
 
- ## Demo
+ - work across OpenAI, Anthropic, Gemini, Ollama, and local models
+ - move through a structured plan -> implement -> verify loop
+ - use developer tooling such as Git, SSH, Docker, and Kubernetes
+ - adapt model choice to the kind of work being done
 
- https://github.com/user-attachments/assets/68a6c134-2d13-4d66-bc5e-befea3acb794
+ The result is not just "chat in the terminal." It is an agentic workflow engine for software delivery.
 
- ---
+ ## Core Concept
 
- ## Quickstart
+ ### Agentic Workflow: Plan -> Implement -> Verify
 
- ```bash
- npx nex-code
- # or install globally:
- npm install -g nex-code && cd ~/your-project && nex-code
- ```
+ `nex-code` treats coding tasks as execution flows rather than single prompts.
 
- On first launch, an interactive setup wizard guides you through provider and credential configuration. Re-run anytime with `/setup`.
-
- ---
-
- ## Why nex-code?
-
- **Ollama Cloud first.** Built and optimized for [Ollama Cloud](https://ollama.com) — the flat-rate platform running devstral, Kimi K2, Qwen3-Coder, and 47+ models. Other providers (OpenAI, Anthropic, Gemini) work via the same interface.
-
- | Feature | nex-code | Closed-source alternatives |
- |---|---|---|
- | Free tier | Ollama Cloud flat-rate | subscription or limited quota |
- | Open models | devstral, Kimi K2, Qwen3 | vendor-locked |
- | Local Ollama | yes | no |
- | Multi-provider | swap with one env var | no |
- | VS Code sidebar | built-in | partial |
- | Startup time | ~100ms | 1-4s |
- | Runtime deps | 2 | heavy |
- | Infra tools | SSH, Docker, K8s built-in | no |
-
- **Smart model routing.** The built-in `/benchmark` tests all configured models across 62 tool-calling tasks in 5 categories and auto-routes to the best model per task type.
-
- **Phase-based execution.** Tasks run through Plan (analyze) -> Implement (code) -> Verify (test) phases, each with the optimal model. Auto-loops back on test failures.
-
- **45 built-in tools** across file ops, git, SSH, Docker, Kubernetes, deploy, browser, GitHub Actions, and visual review. See [Tools](#tools) for the full list.
-
- **2 runtime dependencies** (`axios`, `dotenv`). Starts in ~100ms. No Python, no heavy runtime.
-
- ---
-
- ## Ollama Cloud Model Rankings
-
- Rankings from nex-code's own `/benchmark` — 62 tasks testing tool selection, argument validity, and schema compliance.
-
- <!-- nex-benchmark-start -->
- <!-- Updated: 2026-04-12 — run `/benchmark --discover` after new Ollama Cloud releases -->
-
- | Rank | Model | Score | Avg Latency | Context | Best For |
- |---|---|---|---|---|---|
- | 🥇 | `qwen3-vl:235b` | **100** | 13.4s | 131K | Overall #1 — frontier tool selection, data + agentic tasks |
- | 🥈 | `qwen3-vl:235b-instruct` | 97.5 | 7.7s | 131K | Best latency/score balance — recommended default |
- | 🥉 | `glm-4.6` | 97.5 | 26.8s | 131K | — |
- | — | `qwen3-next:80b` | 97.2 | 8.0s | 131K | — |
- | — | `deepseek-v3.1:671b` | 94.5 | 3.1s | 131K | — |
- | — | `qwen3-coder-next` | 94.3 | 2.2s | 256K | — |
- | — | `qwen3.5:397b` | 94.3 | 4.2s | 256K | — |
- | — | `ministral-3:8b` | 94.3 | 1.6s | 131K | Fastest strong model — 1.6s latency, 94+ score |
- | — | `minimax-m2.7` | 92.9 | 4.7s | 200K | — |
- | — | `rnj-1:8b` | 92.2 | 2.1s | 131K | — |
- | — | `glm-5` | 91.7 | 3.6s | 131K | — |
- | — | `nemotron-3-super` | 91.4 | 1.7s | 256K | — |
- | — | `ministral-3:14b` | 91.2 | 1.5s | 131K | — |
- | — | `qwen3-coder:480b` | 91 | 8.3s | 131K | Heavy coding sessions, large context |
- | — | `glm-4.7` | 90.7 | 4.1s | 131K | — |
- | — | `devstral-2:123b` | 90.3 | 8.1s | 131K | Sysadmin + SSH tasks, reliable coding |
- | — | `kimi-k2:1t` | 90.3 | 3.7s | 256K | Large repos (>100K tokens) |
- | — | `minimax-m2` | 90 | 3.4s | 200K | — |
- | — | `devstral-small-2:24b` | 88.8 | 6.8s | 131K | Fast sub-agents, simple lookups |
- | — | `kimi-k2-thinking` | 88.7 | 4.3s | 256K | — |
- | — | `minimax-m2.1` | 88.1 | 2.5s | 200K | — |
- | — | `glm-5.1` | 87.2 | 5.0s | ? | — |
- | — | `kimi-k2.5` | 86.2 | 4.8s | 256K | Large repos — faster than k2:1t |
- | — | `gemma4:31b` | 85.2 | 4.8s | ? | — |
- | — | `minimax-m2.5` | 84.2 | 6.8s | 131K | Multi-agent, large context |
- | — | `gpt-oss:120b` | 83.9 | 2.8s | 131K | — |
- | — | `mistral-large-3:675b` | 82.5 | 7.0s | 131K | — |
- | — | `ministral-3:3b` | 82.4 | 1.3s | 32K | — |
- | — | `gpt-oss:20b` | 81.1 | 1.5s | 131K | Fast small model, good overall score |
- | — | `nemotron-3-nano:30b` | 78.3 | 2.3s | 131K | — |
- | — | `gemini-3-flash-preview` | 76.5 | 3.3s | 131K | — |
- | — | `deepseek-v3.2` | 65.4 | 14.3s | 131K | — |
- | — | `cogito-2.1:671b` | 65.2 | 3.4s | 131K | — |
-
- > Rankings are nex-code-specific: tool name accuracy, argument validity, schema compliance.
- > Toolathon (Minimax SOTA) measures different task types — run `/benchmark --discover` after model releases.
- <!-- nex-benchmark-end -->
-
- <!-- nex-routing-start -->
- <!-- Updated: 2026-04-12 -->
-
- **Model routing by task type** (auto-updated by `/benchmark --all`):
-
- | Category | Model | Score |
- |---|---|---|
- | coding | `new` | 90/100 |
- <!-- nex-routing-end -->
-
- **Recommended `.env`:**
-
- ```env
- DEFAULT_PROVIDER=ollama
- DEFAULT_MODEL=devstral-2:123b
- NEX_HEAVY_MODEL=qwen3-coder:480b
- NEX_STANDARD_MODEL=devstral-2:123b
- NEX_FAST_MODEL=devstral-small-2:24b
- ```
+ - **Plan**: understand the request, inspect the codebase, identify the relevant files and likely change strategy
+ - **Implement**: make the code changes with access to the right tools and repository context
+ - **Verify**: run tests, inspect outputs, and loop back if the change does not hold up
 
- ---
+ This matters because the failure mode of many coding assistants is not generation quality alone. It is premature action. A useful assistant must know when to inspect first, when to change code, and when to stop and verify before claiming success.
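The three phases above can be pictured as a small control loop. A minimal sketch only; the helper names `plan`, `implement`, and `verify` and the retry shape are assumptions for illustration, not nex-code's actual internals:

```javascript
// Sketch of a plan -> implement -> verify loop with bounded retries.
// The phase functions are injected so each phase could use a different model.
async function runTask(task, { maxAttempts = 3, plan, implement, verify }) {
  const strategy = await plan(task); // inspect first, act later
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const change = await implement(task, strategy);
    const result = await verify(change); // e.g. run the test suite
    if (result.ok) return { ok: true, attempt, change };
    strategy.feedback = result.errors; // loop back with failure context
  }
  return { ok: false, attempt: maxAttempts };
}
```

The key property is that "done" is decided by the verify step, not by the generation step.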
 
- ## Setup
+ ### Multi-Model Routing
 
- **Prerequisites:** Node.js 18+ and at least one API key (or local Ollama).
+ Different models are good at different things. Some are better at fast repo exploration, some at careful implementation, and some at structured verification or longer-context reasoning.
 
- ```bash
- # .env (or set environment variables)
- OLLAMA_API_KEY=your-key # Ollama Cloud
- OPENAI_API_KEY=your-key # OpenAI
- ANTHROPIC_API_KEY=your-key # Anthropic
- GEMINI_API_KEY=your-key # Gemini
- PERPLEXITY_API_KEY=your-key # optional — enables grounded web search
+ `nex-code` is built around that reality. Instead of binding the entire session to one model, it can route work by phase, task type, or provider availability. In practice, this means:
 
- DEFAULT_PROVIDER=ollama
- DEFAULT_MODEL=devstral-2:123b
- ```
+ - using one model for planning and another for implementation
+ - switching providers without changing the workflow model
+ - falling back across providers when a model is unavailable or unsuitable
+ - benchmarking configured models to improve routing decisions over time
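One way to picture this kind of routing is a per-phase preference list with provider fallback. A hedged sketch: the `ROUTES` table and `pickModel` helper are hypothetical, and the model names are simply reused from elsewhere in this README:

```javascript
// Hypothetical phase-based routing table. Each entry is "provider:model",
// ordered by preference; unavailable providers fall through to the next entry.
const ROUTES = {
  plan:      ["anthropic:claude-sonnet-4-6", "ollama:devstral-2:123b"],
  implement: ["ollama:qwen3-coder:480b", "openai:gpt-4o"],
  verify:    ["ollama:devstral-small-2:24b", "openai:gpt-4o"],
};

function pickModel(phase, availableProviders) {
  const candidates = ROUTES[phase] ?? [];
  // First candidate whose provider is configured wins.
  return candidates.find((m) => availableProviders.has(m.split(":")[0])) ?? null;
}
```

Benchmark results could then rewrite `ROUTES` over time, which is what makes model choice operational rather than fixed.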
 
- **Env file precedence.** nex-code loads `.env` from three places in this order:
+ The goal is not provider abstraction for its own sake. The goal is to make model choice operational rather than ideological.
 
- 1. Install directory `.env` — non-override, fills blanks only
- 2. `~/.nex-code/.env` — **override**, wins over ambient `process.env`
- 3. Current working directory `.env` — non-override, cannot clobber the global config
-
- `~/.nex-code/.env` is the authoritative location for long-lived config like `OLLAMA_API_KEY`. The `override:true` on that file exists so that a rotated key written there takes effect on the next `nex-code` launch, even when nex-code is spawned by a long-running parent process (systemd daemon, supervisor agent, test runner) whose own environment was captured earlier and is now stale. If you rotate an API key, update `~/.nex-code/.env` **and** restart any long-running daemon that spawns nex-code — the `override:true` fixes subprocess launches but cannot refresh the parent's own captured `process.env`.
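The three-step precedence can be expressed as a small merge function. This is a sketch of the documented semantics, not the actual loader code; each argument stands for the already-parsed key/value pairs of one `.env` source:

```javascript
// Sketch of the three-step .env precedence described above.
function resolveEnv(ambient, installDir, globalDir, cwdDir) {
  const env = { ...ambient }; // start from the captured process.env
  // 1. Install directory: non-override, fills blanks only.
  for (const [k, v] of Object.entries(installDir)) if (!(k in env)) env[k] = v;
  // 2. ~/.nex-code/.env: override, wins over ambient process.env.
  for (const [k, v] of Object.entries(globalDir)) env[k] = v;
  // 3. CWD .env: non-override, cannot clobber the global config.
  for (const [k, v] of Object.entries(cwdDir)) if (!(k in env)) env[k] = v;
  return env;
}
```

This is why a key rotated in `~/.nex-code/.env` beats a stale value inherited from a long-running parent process, while a project-local `.env` never can.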
-
- **Install from source:**
-
- ```bash
- git clone https://github.com/hybridpicker/nex-code.git
- cd nex-code && npm install && npm run build
- cp .env.example .env && npm link && npm run install-hooks
- ```
+ ## Key Features
 
- ---
+ - **CLI-first operation** with low overhead and a workflow that fits existing terminal habits
+ - **Phase-based execution** that separates planning, implementation, and verification
+ - **Multi-provider support** for OpenAI, Anthropic, Gemini, Ollama Cloud, and local Ollama
+ - **Tool-integrated execution** across files, shell commands, Git, SSH, Docker, and Kubernetes
+ - **Headless and interactive modes** for both conversational use and automated task runs
+ - **Sub-agent orchestration** for decomposing larger tasks into parallel workstreams
+ - **Benchmark-driven routing** to select stronger models for specific task categories
+ - **Repository-aware behavior** including context from the current project, config, and Git state
+ - **Safety controls** around confirmations, sensitive operations, and destructive commands
 
- ## Usage
+ ## Architecture
 
- ```
- > explain the main function in index.js
- > add input validation to the createUser handler
- > run the tests and fix any failures
- > the /users endpoint returns 500 — find the bug and fix it
- ```
+ At a high level, `nex-code` is organized as an orchestration layer on top of model providers and developer tools.
 
- ### YOLO Mode
+ 1. **CLI and session layer**
+ Accepts prompts, commands, flags, and session state from the terminal or editor integration.
 
- Skip all confirmations — file changes, dangerous commands, and tool permissions are auto-approved. Auto-runs `caffeinate` on macOS.
+ 2. **Agent loop**
+ Runs the task through a controlled execution cycle: inspect, plan, act, verify, and retry when needed.
 
- ```bash
- nex-code --yolo
- ```
+ 3. **Routing and provider layer**
+ Resolves which provider and model should handle the next step, based on configuration, task type, and fallback logic.
 
- ### Headless / Programmatic Mode
+ 4. **Tool execution layer**
+ Exposes filesystem, shell, Git, browser, SSH, Docker, Kubernetes, and related capabilities to the agent.
 
- ```bash
- nex-code --task "refactor src/index.js to async/await" --yolo
- nex-code --prompt-file /tmp/task.txt --yolo --json
- nex-code --daemon # watch mode: fires tasks on file changes, git commits, or cron
- ```
+ 5. **Verification layer**
+ Runs tests, evaluates outcomes, and decides whether the task is complete or needs another pass.
 
- | Flag | Description |
- |---|---|
- | `--task <prompt>` | Run a single prompt and exit |
- | `--prompt-file <path>` | Read prompt from file |
- | `--yolo` | Skip all confirmations |
- | `--server` | JSON-lines IPC server (VS Code extension) |
- | `--daemon` | Background watcher (reads `.nex/daemon.json`) |
- | `--flatrate` | 100 turns, 6 parallel agents, 5 retries |
- | `--json` | JSON output to stdout |
- | `--max-turns <n>` | Override agentic loop limit |
- | `--model <spec>` | Use specific model (e.g. `anthropic:claude-sonnet-4-6`) |
- | `--debug` | Show diagnostic messages |
- | `--gemini` | Local Gemini test mode (`gemini-3.1-pro-preview` by default, requires `GEMINI_API_KEY`) |
- | `--gemini-model <id>` | Pin a specific Gemini model (implies `--gemini`) |
-
- ### Vision / Screenshot
+ In practice, this makes `nex-code` closer to a local orchestration system than a thin wrapper around an LLM API.
 
- ```
- > /path/to/screenshot.png implement this UI in React
- > analyze https://example.com/mockup.png and implement it
- > what's wrong with the layout in my clipboard # macOS clipboard capture
- > screenshot localhost:3000 and review the navbar spacing
- ```
+ ## Example Workflow
 
- Works with Anthropic, OpenAI, Gemini, and Ollama vision models. Formats: PNG, JPG, GIF, WebP, BMP.
+ A typical developer flow with `nex-code` looks like this:
 
- ---
+ 1. Start in a repository and describe the task in plain English.
+ 2. `nex-code` inspects the project structure, relevant files, and surrounding context.
+ 3. It forms a plan or enters a planning phase before editing.
+ 4. It makes the implementation changes with tool access.
+ 5. It runs tests or other verification steps.
+ 6. If verification fails, it loops back, adjusts the implementation, and re-runs checks.
+ 7. When the task is complete, it leaves the repository in a verifiable state rather than stopping at code generation.
 
- ## Providers & Models
+ Example prompts:
 
+ ```text
+ explain why the user creation flow is failing in production
+ add input validation to the createUser handler and update the tests
+ refactor this module to async/await and verify the endpoint behavior
+ review the recent changes and look for regressions before I push
 ```
- /model # interactive picker
- /model openai:gpt-4o # switch directly
- /providers # list all
- /fallback anthropic,openai # auto-switch on failure
- ```
-
- | Provider | Models | Env Variable |
- |---|---|---|
- | **ollama** | Qwen3, DeepSeek R1, Devstral, Kimi K2, MiniMax, GLM, Llama 4 | `OLLAMA_API_KEY` |
- | **openai** | GPT-4o, GPT-4.1, o1, o3, o4-mini | `OPENAI_API_KEY` |
- | **anthropic** | Claude Opus 4.6, Sonnet 4.6, Haiku 4.5 | `ANTHROPIC_API_KEY` |
- | **gemini** | Gemini 3.1 Pro, 2.5 Pro/Flash | `GEMINI_API_KEY` |
- | **local** | Any local Ollama model | (none) |
-
- ---
-
- ## Commands
-
- Type `/` to see inline suggestions. Tab completion for slash commands and file paths.
-
- | Command | Description |
- |---|---|
- | `/help` | Full help |
- | `/model [spec]` | Show/switch model |
- | `/providers` | List providers |
- | `/clear` | Clear conversation |
- | `/save` / `/load` / `/sessions` / `/resume` | Session management |
- | `/branches` / `/fork` / `/switch-branch` / `/goto` | Session tree navigation |
- | `/remember` / `/forget` / `/memory` | Persistent memory |
- | `/brain add\|list\|search\|show\|remove` | Knowledge base |
- | `/plan [task]` / `/plan edit` / `/plan approve` | Plan mode |
- | `/commit [msg]` / `/diff` / `/branch` | Git intelligence |
- | `/undo` / `/redo` / `/history` | Persistent undo/redo |
- | `/snapshot [name]` / `/restore` | Git snapshots |
- | `/permissions` / `/allow` / `/deny` | Tool permissions |
- | `/costs` / `/budget` | Cost tracking and limits |
- | `/review [--strict]` | Deep code review |
- | `/benchmark` | Model ranking (62 tasks) |
- | `/autoresearch` / `/ar-self-improve` | Autonomous optimization loops |
- | `/servers` / `/docker` / `/deploy` / `/k8s` | Infrastructure management |
- | `/skills` / `/install-skill` / `/mcp` / `/hooks` | Extensibility |
- | `/tree [depth]` | Project file tree |
- | `/audit` | Tool execution audit |
- | `/setup` | Interactive setup wizard |
-
- ---
-
- ## Tools
 
- 45 built-in tools organized by category:
+ ## Design Philosophy
 
- **Core:** `bash`, `read_file`, `write_file`, `edit_file`, `patch_file`, `list_directory`, `search_files`, `glob`, `grep`
+ ### CLI-first
 
- **Git & Web:** `git_status`, `git_diff`, `git_log`, `web_fetch`, `web_search`
+ The terminal remains the most capable interface for real development work. `nex-code` is designed to operate where developers already inspect code, run tests, check diffs, and manage environments.
 
- **Agents:** `ask_user`, `task_list`, `spawn_agents`, `switch_model`
+ ### Developer-centric
 
- **Browser** (optional, requires Playwright): `browser_open`, `browser_screenshot`, `browser_click`, `browser_fill`
+ The product assumes a professional engineering workflow: existing repositories, mixed tooling, imperfect environments, partial context, and the need to verify outcomes. It is meant to assist a developer, not replace the surrounding engineering discipline.
 
- **GitHub Actions & K8s:** `gh_run_list`, `gh_run_view`, `gh_workflow_trigger`, `k8s_pods`, `k8s_logs`, `k8s_exec`, `k8s_apply`, `k8s_rollout`
+ ### Real-world workflows
 
- **SSH & Server:** `ssh_exec`, `ssh_upload`, `ssh_download`, `service_manage`, `service_logs`, `sysadmin`, `remote_agent`
+ A credible coding assistant must handle more than code generation. It needs to interact with source control, infrastructure, shells, CI-like verification, and operational context. `nex-code` is built around those constraints instead of treating them as edge cases.
 
- **Docker:** `container_list`, `container_logs`, `container_exec`, `container_manage`
+ ## Installation / Getting Started
 
- **Deploy:** `deploy`, `deployment_status`
-
- **Frontend:** `frontend_recon` — scans design tokens, layout, framework stack before any frontend work
-
- **Visual:** `visual_diff`, `responsive_sweep`, `visual_annotate`, `visual_watch`, `design_tokens`, `design_compare`
-
- Additional tools via [MCP servers](#mcp) or [Skills](#skills).
-
- ---
-
- ## Key Features
-
- ### Multi-Agent Orchestrator
-
- Multi-goal prompts auto-decompose into parallel sub-agents. Up to 5 agents run simultaneously with file locking.
+ Quick start:
 
 ```bash
- nex-code --task "fix type errors in src/, add JSDoc to utils/, update CHANGELOG"
- ```
-
- ### Background Agents
-
- Sub-agents can run non-blocking in isolated forked processes. The main agent continues working while background workers complete, then results are automatically injected into the conversation.
-
- ```
- # The model decides when to use background:true — no extra syntax needed.
- # Example: the model might run the linter in background while explaining code.
- spawn_agents([
- { task: "run the linter and report errors", background: true },
- { task: "explain the auth module" } # main agent answers this immediately
- ])
- ```
-
- Background agents are shown in the spinner: `● Thinking [1 bg agent running]`. Results appear as `✓ Background agent done: …` when workers finish.
-
- ### Autoresearch
-
- Autonomous optimization loops: edit -> experiment -> keep/revert, on a dedicated branch.
-
- ```
- /autoresearch reduce test runtime while maintaining correctness
- /ar-self-improve # self-improvement using nex-code's benchmark
+ npx nex-code
 ```
 
- ### Plan Mode
-
- Auto-activates for implementation tasks. Read-only analysis first, approve before writes. Hard-enforced tool restrictions.
-
- ### Daemon / Watch Mode
-
- Background process that fires tasks on file changes, git commits, or cron schedule. Configured via `.nex/daemon.json`. Desktop and Matrix notifications.
-
- ### Session Trees
-
- Navigate conversation history like git branches — fork, switch, goto, delete branches.
-
- ### Safety
-
- | Layer | What it guards | Bypass? |
- |---|---|---|
- | **Forbidden patterns** | `rm -rf /`, fork bombs, reverse shells, `cat .env` | No |
- | **Protected paths** | Destructive ops on `.env`, `.ssh/`, `.aws/`, `.git/` | `NEX_UNPROTECT=1` |
- | **Sensitive file tools** | read/write/edit on `.env`, `.ssh/`, `.npmrc`, `.kube/` | No |
- | **Critical commands** | `rm -rf`, `sudo`, `git push --force`, `git reset --hard` | Explicit confirmation |
-
- Pre-push secret detection, audit logging (JSONL), persistent undo/redo, cost limits, auto plan mode.
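The layered guards in the table above can be approximated as an ordered classifier: hard blocks are checked before confirmation prompts. A sketch with a handful of example patterns only; the real pattern set and helper name are assumptions:

```javascript
// Illustrative command guard: "block" is non-bypassable, "confirm" requires
// explicit approval, everything else is allowed. Patterns are examples only.
const FORBIDDEN = [/rm\s+-rf\s+\/(\s|$)/, /cat\s+\.env\b/];
const NEEDS_CONFIRM = [/\brm\s+-rf\b/, /\bsudo\b/, /git\s+push\s+--force/, /git\s+reset\s+--hard/];

function classifyCommand(cmd) {
  if (FORBIDDEN.some((re) => re.test(cmd))) return "block";
  if (NEEDS_CONFIRM.some((re) => re.test(cmd))) return "confirm";
  return "allow";
}
```

Checking the forbidden list first matters: `rm -rf /` matches both lists, and it must be blocked rather than merely confirmed.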
-
- ### Open-Source Model Robustness
-
- - **5-layer argument parsing** — JSON, trailing fix, extraction, key repair, fence stripping
- - **Tool call retry with schema hints** — malformed args get the expected schema for self-correction
- - **Auto-fix engine** — path resolution, edit fuzzy matching (Levenshtein), bash error hints
- - **Tool tiers** — essential (5) / standard (21) / full (45), auto-selected per model capability
- - **Stale stream recovery** — progressive retry with context compression on stall
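Layered argument parsing of this kind can be illustrated as a cascade of progressively looser recovery attempts. A sketch covering three of the listed layers (fence stripping, trailing-comma fix, extraction); `parseToolArgs` is a hypothetical name, not nex-code's parser:

```javascript
// Cascade of recovery layers for model-emitted tool arguments.
function parseToolArgs(raw) {
  let s = raw.trim();
  // Layer: fence stripping. Drop markdown code fences around the JSON.
  s = s.replace(/^```(?:json)?\s*/i, "").replace(/```\s*$/, "");
  try { return JSON.parse(s); } catch {}
  // Layer: trailing fix. Remove trailing commas before } or ].
  const fixed = s.replace(/,\s*([}\]])/g, "$1");
  try { return JSON.parse(fixed); } catch {}
  // Layer: extraction. Pull the first {...} block out of surrounding prose.
  const m = fixed.match(/\{[\s\S]*\}/);
  if (m) { try { return JSON.parse(m[0]); } catch {} }
  return null; // give up; caller retries with a schema hint
}
```

Each layer only fires when the stricter ones fail, so well-formed arguments pay no recovery cost.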
-
- ### Visual Development Tools
-
- Pixel-level before/after comparison, responsive sweeps (320-1920px), annotation overlays, design token extraction, and live-reload diff watching. Pure image tools work standalone; browser-based tools need Playwright.
-
- ---
-
- ## Extensibility
-
- ### Skills
-
- Drop `.md` or `.js` files in `.nex/skills/` for project-specific knowledge, commands, and tools. Global skills in `~/.nex-code/skills/`. Install from git: `/install-skill user/repo`.
-
- ### Plugins
-
- Custom tools and lifecycle hooks via `.nex/plugins/`. Events: `onToolResult`, `onModelResponse`, `onSessionStart`, `onSessionEnd`, `onFileChange`, `beforeToolExec`, `afterToolExec`.
-
- ### MCP
-
- Connect external tool servers via [Model Context Protocol](https://modelcontextprotocol.io). Configure in `.nex/mcp.json` with env var interpolation.
-
- ### Hooks
-
- Run custom scripts on CLI events (`pre-tool`, `post-tool`, `pre-commit`, `post-response`, `session-start`, `session-end`). Configure in `.nex/config.json` or `.nex/hooks/`.
-
- ---
-
- ## VS Code Extension
-
- Built-in sidebar chat panel (`vscode/`) with streaming output, collapsible tool cards, and native theme support. Spawns `nex-code --server` over JSON-lines IPC.
+ Or install globally:
 
 ```bash
- cd vscode && npm install && npm run package
- # Cmd+Shift+P -> Extensions: Install from VSIX...
+ npm install -g nex-code
+ nex-code
 ```
 
- ---
+ Basic requirements:
 
- ## Architecture
-
- ```
- bin/nex-code.js # Entrypoint
- cli/
- agent.js # Agentic loop + conversation state + guards
- providers/ # Ollama, OpenAI, Anthropic, Gemini, Local + wire protocols
- tools/index.js # 45 tool definitions + auto-fix engine
- context-engine.js # Token management + 5-phase compression
- sub-agent.js # Parallel sub-agents with file locking
- orchestrator.js # Multi-agent decompose -> execute -> synthesize
- session-tree.js # Session branching
- visual.js # Visual dev tools (pixelmatch-based)
- browser.js # Playwright browser agent
- skills/ # Built-in + user skills
- ```
-
- See [DEVELOPMENT.md](DEVELOPMENT.md) for full architecture details.
+ - Node.js 18+
+ - at least one configured provider key, or a local Ollama setup
 
- ---
+ Typical environment configuration:
 
- ## Testing
+ ```env
+ OLLAMA_API_KEY=your-key
+ OPENAI_API_KEY=your-key
+ ANTHROPIC_API_KEY=your-key
+ GEMINI_API_KEY=your-key
 
- ```bash
- npm test # 97 suites, 3920 tests
- npm run typecheck # TypeScript noEmit check
- npm run benchmark:gate # 7-task smoke test (blocks push on regression)
- npm run benchmark:reallife # 35 real-world tasks across 7 categories
+ DEFAULT_PROVIDER=ollama
+ DEFAULT_MODEL=devstral-2:123b
 ```
 
- ---
-
- ## Security
-
- - Pre-push secret detection (API keys, private keys, hardcoded credentials)
- - Audit logging with automatic argument sanitization
- - Sensitive path blocking (`.ssh/`, `.aws/`, `.env`, credentials)
- - Shell injection protection via `execFileSync` with argument arrays
- - SSRF protection on `web_fetch`
- - MCP environment isolation
+ On first launch, `nex-code` can guide setup interactively. More detailed installation, provider setup, and advanced runtime configuration can be expanded here as the project documentation matures.
 
- **Reporting vulnerabilities:** Email **security@schoensgibl.com** (not a public issue). Allow 72h for initial response.
+ ## Future Direction
 
- ---
+ The long-term value of `nex-code` is not only broader model support. It is better orchestration.
 
- ## License
+ Likely areas of continued investment include:
 
- MIT
+ - stronger benchmark-based routing across task categories
+ - deeper editor and automation integrations
+ - more robust multi-agent coordination for larger changes
+ - tighter verification loops for tests, diffs, and deployment workflows
+ - better support for persistent project knowledge and reusable team workflows
 
- <!-- Keywords: ollama cli, ollama coding assistant, claude code alternative, gemini cli alternative,
- agentic coding cli, open source ai terminal, free coding ai, qwen3 coder cli, devstral terminal,
- kimi k2 cli, multi-provider ai cli, local llm coding tool -->
+ The direction is clear: make AI assistance behave more like a disciplined engineering system and less like an isolated chat interface.