nex-code 0.5.12 → 0.5.14
- package/README.md +98 -400
- package/dist/background-worker.js +569 -558
- package/dist/benchmark.js +512 -501
- package/dist/nex-code.js +806 -786
- package/package.json +2 -1
package/README.md
CHANGED
|
@@ -1,461 +1,159 @@
|
|
|
1
|
-
|
|
1
|
+
# nex-code
|
|
2
2
|
|
|
3
|
-
|
|
4
|
-
<b>Run 400B+ open coding models on your codebase — without the hardware bill.</b><br>
|
|
5
|
-
Ollama Cloud first. OpenAI, Anthropic, and Gemini when you need them.
|
|
6
|
-
</p>
|
|
3
|
+
**A CLI coding assistant for production development workflows.**
|
|
7
4
|
|
|
8
|
-
|
|
9
|
-
<code>npx nex-code</code>
|
|
10
|
-
</p>
|
|
5
|
+
`nex-code` is an AI-powered developer tool that works in the terminal, reasons through tasks in phases, and routes work across multiple model providers. It is built for engineers who want an assistant that can operate on a real codebase, use real tools, and stay aligned with the way software is actually built and maintained.
|
|
11
6
|
|
|
12
|
-
|
|
13
|
-
<a href="https://github.com/hybridpicker/nex-code/stargazers">If this saves you time, a star helps others find it.</a>
|
|
14
|
-
</p>
|
|
7
|
+
## Overview
|
|
15
8
|
|
|
16
|
-
|
|
17
|
-
<a href="https://www.npmjs.com/package/nex-code"><img src="https://img.shields.io/npm/v/nex-code.svg" alt="npm version"></a>
|
|
18
|
-
<a href="https://www.npmjs.com/package/nex-code"><img src="https://img.shields.io/npm/dm/nex-code.svg" alt="npm downloads"></a>
|
|
19
|
-
<a href="https://github.com/hybridpicker/nex-code/stargazers"><img src="https://img.shields.io/github/stars/hybridpicker/nex-code.svg" alt="GitHub Stars"></a>
|
|
20
|
-
<a href="https://github.com/hybridpicker/nex-code/actions/workflows/ci.yml"><img src="https://github.com/hybridpicker/nex-code/actions/workflows/ci.yml/badge.svg" alt="CI"></a>
|
|
21
|
-
<a href="https://opensource.org/licenses/MIT"><img src="https://img.shields.io/badge/License-MIT-blue.svg" alt="License: MIT"></a>
|
|
22
|
-
<img src="https://img.shields.io/badge/Ollama_Cloud-supported-brightgreen.svg" alt="Ollama Cloud: supported">
|
|
23
|
-
<img src="https://img.shields.io/badge/node-%3E%3D18-brightgreen.svg" alt="Node >= 18">
|
|
24
|
-
<img src="https://img.shields.io/badge/dependencies-2-green.svg" alt="Dependencies: 2">
|
|
25
|
-
<img src="https://img.shields.io/badge/tests-3929-blue.svg" alt="Tests: 3920">
|
|
26
|
-
<img src="https://img.shields.io/badge/VS_Code-extension-007ACC.svg" alt="VS Code extension">
|
|
27
|
-
</p>
|
|
9
|
+
Most AI coding tools are optimized for short demos: generate a file, suggest a snippet, answer a question. Real development work is different. It involves understanding an existing repository, planning changes, editing carefully, running verification, and working with the operational tools around the code.
|
|
28
10
|
|
|
29
|
-
|
|
11
|
+
`nex-code` exists to close that gap. It is designed as a serious CLI-first system that can:
|
|
30
12
|
|
|
31
|
-
|
|
13
|
+
- work across OpenAI, Anthropic, Gemini, Ollama, and local models
|
|
14
|
+
- move through a structured plan -> implement -> verify loop
|
|
15
|
+
- use developer tooling such as Git, SSH, Docker, and Kubernetes
|
|
16
|
+
- adapt model choice to the kind of work being done
|
|
32
17
|
|
|
33
|
-
|
|
18
|
+
The result is not just "chat in the terminal." It is an agentic workflow engine for software delivery.
|
|
34
19
|
|
|
35
|
-
|
|
20
|
+
## Core Concept
|
|
36
21
|
|
|
37
|
-
|
|
22
|
+
### Agentic Workflow: Plan -> Implement -> Verify
|
|
38
23
|
|
|
39
|
-
|
|
40
|
-
npx nex-code
|
|
41
|
-
# or install globally:
|
|
42
|
-
npm install -g nex-code && cd ~/your-project && nex-code
|
|
43
|
-
```
|
|
24
|
+
`nex-code` treats coding tasks as execution flows rather than single prompts.
|
|
44
25
|
|
|
45
|
-
|
|
46
|
-
|
|
47
|
-
|
|
48
|
-
|
|
49
|
-
## Why nex-code?
|
|
50
|
-
|
|
51
|
-
**Ollama Cloud first.** Built and optimized for [Ollama Cloud](https://ollama.com) — the flat-rate platform running devstral, Kimi K2, Qwen3-Coder, and 47+ models. Other providers (OpenAI, Anthropic, Gemini) work via the same interface.
|
|
52
|
-
|
|
53
|
-
| Feature | nex-code | Closed-source alternatives |
|
|
54
|
-
|---|---|---|
|
|
55
|
-
| Free tier | Ollama Cloud flat-rate | subscription or limited quota |
|
|
56
|
-
| Open models | devstral, Kimi K2, Qwen3 | vendor-locked |
|
|
57
|
-
| Local Ollama | yes | no |
|
|
58
|
-
| Multi-provider | swap with one env var | no |
|
|
59
|
-
| VS Code sidebar | built-in | partial |
|
|
60
|
-
| Startup time | ~100ms | 1-4s |
|
|
61
|
-
| Runtime deps | 2 | heavy |
|
|
62
|
-
| Infra tools | SSH, Docker, K8s built-in | no |
|
|
63
|
-
|
|
64
|
-
**Smart model routing.** The built-in `/benchmark` tests all configured models across 62 tool-calling tasks in 5 categories and auto-routes to the best model per task type.
|
|
65
|
-
|
|
66
|
-
**Phase-based execution.** Tasks run through Plan (analyze) -> Implement (code) -> Verify (test) phases, each with the optimal model. Auto-loops back on test failures.
|
|
67
|
-
|
|
68
|
-
**45 built-in tools** across file ops, git, SSH, Docker, Kubernetes, deploy, browser, GitHub Actions, and visual review. See [Tools](#tools) for the full list.
|
|
69
|
-
|
|
70
|
-
**2 runtime dependencies** (`axios`, `dotenv`). Starts in ~100ms. No Python, no heavy runtime.
|
|
71
|
-
|
|
72
|
-
---
|
|
73
|
-
|
|
74
|
-
## Ollama Cloud Model Rankings
|
|
75
|
-
|
|
76
|
-
Rankings from nex-code's own `/benchmark` — 62 tasks testing tool selection, argument validity, and schema compliance.
|
|
77
|
-
|
|
78
|
-
<!-- nex-benchmark-start -->
|
|
79
|
-
<!-- Updated: 2026-04-12 — run `/benchmark --discover` after new Ollama Cloud releases -->
|
|
80
|
-
|
|
81
|
-
| Rank | Model | Score | Avg Latency | Context | Best For |
|
|
82
|
-
|---|---|---|---|---|---|
|
|
83
|
-
| 🥇 | `qwen3-vl:235b` | **100** | 13.4s | 131K | Overall #1 — frontier tool selection, data + agentic tasks |
|
|
84
|
-
| 🥈 | `qwen3-vl:235b-instruct` | 97.5 | 7.7s | 131K | Best latency/score balance — recommended default |
|
|
85
|
-
| 🥉 | `glm-4.6` | 97.5 | 26.8s | 131K | — |
|
|
86
|
-
| — | `qwen3-next:80b` | 97.2 | 8.0s | 131K | — |
|
|
87
|
-
| — | `deepseek-v3.1:671b` | 94.5 | 3.1s | 131K | — |
|
|
88
|
-
| — | `qwen3-coder-next` | 94.3 | 2.2s | 256K | — |
|
|
89
|
-
| — | `qwen3.5:397b` | 94.3 | 4.2s | 256K | — |
|
|
90
|
-
| — | `ministral-3:8b` | 94.3 | 1.6s | 131K | Fastest strong model — 2.2s latency, 70+ score |
|
|
91
|
-
| — | `minimax-m2.7` | 92.9 | 4.7s | 200K | — |
|
|
92
|
-
| — | `rnj-1:8b` | 92.2 | 2.1s | 131K | — |
|
|
93
|
-
| — | `glm-5` | 91.7 | 3.6s | 131K | — |
|
|
94
|
-
| — | `nemotron-3-super` | 91.4 | 1.7s | 256K | — |
|
|
95
|
-
| — | `ministral-3:14b` | 91.2 | 1.5s | 131K | — |
|
|
96
|
-
| — | `qwen3-coder:480b` | 91 | 8.3s | 131K | Heavy coding sessions, large context |
|
|
97
|
-
| — | `glm-4.7` | 90.7 | 4.1s | 131K | — |
|
|
98
|
-
| — | `devstral-2:123b` | 90.3 | 8.1s | 131K | Sysadmin + SSH tasks, reliable coding |
|
|
99
|
-
| — | `kimi-k2:1t` | 90.3 | 3.7s | 256K | Large repos (>100K tokens) |
|
|
100
|
-
| — | `minimax-m2` | 90 | 3.4s | 200K | — |
|
|
101
|
-
| — | `devstral-small-2:24b` | 88.8 | 6.8s | 131K | Fast sub-agents, simple lookups |
|
|
102
|
-
| — | `kimi-k2-thinking` | 88.7 | 4.3s | 256K | — |
|
|
103
|
-
| — | `minimax-m2.1` | 88.1 | 2.5s | 200K | — |
|
|
104
|
-
| — | `glm-5.1` | 87.2 | 5.0s | ? | — |
|
|
105
|
-
| — | `kimi-k2.5` | 86.2 | 4.8s | 256K | Large repos — faster than k2:1t |
|
|
106
|
-
| — | `gemma4:31b` | 85.2 | 4.8s | ? | — |
|
|
107
|
-
| — | `minimax-m2.5` | 84.2 | 6.8s | 131K | Multi-agent, large context |
|
|
108
|
-
| — | `gpt-oss:120b` | 83.9 | 2.8s | 131K | — |
|
|
109
|
-
| — | `mistral-large-3:675b` | 82.5 | 7.0s | 131K | — |
|
|
110
|
-
| — | `ministral-3:3b` | 82.4 | 1.3s | 32K | — |
|
|
111
|
-
| — | `gpt-oss:20b` | 81.1 | 1.5s | 131K | Fast small model, good overall score |
|
|
112
|
-
| — | `nemotron-3-nano:30b` | 78.3 | 2.3s | 131K | — |
|
|
113
|
-
| — | `gemini-3-flash-preview` | 76.5 | 3.3s | 131K | — |
|
|
114
|
-
| — | `deepseek-v3.2` | 65.4 | 14.3s | 131K | — |
|
|
115
|
-
| — | `cogito-2.1:671b` | 65.2 | 3.4s | 131K | — |
|
|
116
|
-
|
|
117
|
-
> Rankings are nex-code-specific: tool name accuracy, argument validity, schema compliance.
|
|
118
|
-
> Toolathon (Minimax SOTA) measures different task types — run `/benchmark --discover` after model releases.
|
|
119
|
-
<!-- nex-benchmark-end -->
|
|
120
|
-
|
|
121
|
-
<!-- nex-routing-start -->
|
|
122
|
-
<!-- Updated: 2026-04-12 -->
|
|
123
|
-
|
|
124
|
-
**Model routing by task type** (auto-updated by `/benchmark --all`):
|
|
125
|
-
|
|
126
|
-
| Category | Model | Score |
|
|
127
|
-
|---|---|---|
|
|
128
|
-
| coding | `new` | 90/100 |
|
|
129
|
-
<!-- nex-routing-end -->
|
|
130
|
-
|
|
131
|
-
**Recommended `.env`:**
|
|
132
|
-
|
|
133
|
-
```env
|
|
134
|
-
DEFAULT_PROVIDER=ollama
|
|
135
|
-
DEFAULT_MODEL=devstral-2:123b
|
|
136
|
-
NEX_HEAVY_MODEL=qwen3-coder:480b
|
|
137
|
-
NEX_STANDARD_MODEL=devstral-2:123b
|
|
138
|
-
NEX_FAST_MODEL=devstral-small-2:24b
|
|
139
|
-
```
|
|
26
|
+
- **Plan**: understand the request, inspect the codebase, identify the relevant files and likely change strategy
|
|
27
|
+
- **Implement**: make the code changes with access to the right tools and repository context
|
|
28
|
+
- **Verify**: run tests, inspect outputs, and loop back if the change does not hold up
|
|
140
29
|
|
|
141
|
-
|
|
30
|
+
This matters because the failure mode of many coding assistants is not generation quality alone. It is premature action. A useful assistant must know when to inspect first, when to change code, and when to stop and verify before claiming success.
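
The plan -> implement -> verify cycle described in the added README text can be sketched as a small control loop. This is only an illustration under stated assumptions: `runTask`, the `phases` callbacks, the retry limit, and the example file path are hypothetical names invented here, not part of the `nex-code` source.

```javascript
// Hypothetical sketch of a plan -> implement -> verify loop.
// None of these names come from nex-code; they only illustrate the control flow.
function runTask(task, phases, maxAttempts = 3) {
  // Plan once: inspect the request and decide on a change strategy.
  const plan = phases.plan(task);
  const history = [];
  let feedback = null;

  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    // Implement: apply changes, using feedback from the last failed check.
    const change = phases.implement(plan, feedback);
    // Verify: run tests or other checks before claiming success.
    const result = phases.verify(change);
    history.push({ attempt, ok: result.ok });
    if (result.ok) return { done: true, attempts: attempt, history };
    feedback = result.reason; // loop back with the failure reason
  }
  // Give up without claiming success once the attempt budget is spent.
  return { done: false, attempts: maxAttempts, history };
}

// Usage with stub phases: the first verification fails, the second passes.
let calls = 0;
const outcome = runTask("add input validation to the createUser handler", {
  plan: (task) => ({ task, files: ["handlers/createUser.js"] }), // made-up path
  implement: (plan, feedback) => ({ plan, fixedAfter: feedback }),
  verify: () => (++calls < 2 ? { ok: false, reason: "test failed" } : { ok: true }),
});
console.log(outcome.done, outcome.attempts); // true 2
```

Keeping verification inside the loop is what stops the sketch from reporting success after the first edit, which is the "premature action" failure mode the README text describes.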
|
|
142
31
|
|
|
143
|
-
|
|
32
|
+
### Multi-Model Routing
|
|
144
33
|
|
|
145
|
-
|
|
34
|
+
Different models are good at different things. Some are better at fast repo exploration, some at careful implementation, and some at structured verification or longer-context reasoning.
|
|
146
35
|
|
|
147
|
-
|
|
148
|
-
# .env (or set environment variables)
|
|
149
|
-
OLLAMA_API_KEY=your-key # Ollama Cloud
|
|
150
|
-
OPENAI_API_KEY=your-key # OpenAI
|
|
151
|
-
ANTHROPIC_API_KEY=your-key # Anthropic
|
|
152
|
-
GEMINI_API_KEY=your-key # Gemini
|
|
153
|
-
PERPLEXITY_API_KEY=your-key # optional — enables grounded web search
|
|
36
|
+
`nex-code` is built around that reality. Instead of binding the entire session to one model, it can route work by phase, task type, or provider availability. In practice, this means:
|
|
154
37
|
|
|
155
|
-
|
|
156
|
-
|
|
157
|
-
|
|
38
|
+
- using one model for planning and another for implementation
|
|
39
|
+
- switching providers without changing the workflow model
|
|
40
|
+
- falling back across providers when a model is unavailable or unsuitable
|
|
41
|
+
- benchmarking configured models to improve routing decisions over time
|
|
158
42
|
|
|
159
|
-
|
|
43
|
+
The goal is not provider abstraction for its own sake. The goal is to make model choice operational rather than ideological.
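
Routing by phase with provider fallback, as the added text describes it, might look roughly like the following. It is a sketch under assumptions: `pickModel`, the shape of the `routes` table, and the availability check are hypothetical, and pairing the `ollama:` prefix with these model names is a guess based on the `provider:model` spec format shown elsewhere in this README.

```javascript
// Hypothetical helper: return the first available "provider:model" spec for a phase.
// The routing table shape is an assumption, not nex-code's actual configuration.
function pickModel(phase, routes, isAvailable) {
  const candidates = routes[phase] ?? routes.default ?? [];
  for (const spec of candidates) {
    if (isAvailable(spec)) return spec; // first healthy candidate wins
  }
  return null; // nothing usable: the caller decides how to fail
}

// Illustrative table: each phase lists candidates in order of preference.
const routes = {
  plan: ["ollama:devstral-2:123b", "openai:gpt-4o"],
  implement: ["ollama:qwen3-coder:480b", "anthropic:claude-sonnet-4-6"],
  default: ["ollama:devstral-small-2:24b"],
};

// Pretend one model is currently unavailable.
const down = new Set(["ollama:qwen3-coder:480b"]);
const isAvailable = (spec) => !down.has(spec);

console.log(pickModel("plan", routes, isAvailable));      // ollama:devstral-2:123b
console.log(pickModel("implement", routes, isAvailable)); // anthropic:claude-sonnet-4-6
console.log(pickModel("verify", routes, isAvailable));    // ollama:devstral-small-2:24b
```

An ordered candidate list keeps the workflow identical whichever provider ends up serving a phase, which is the property the text calls switching providers "without changing the workflow model".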
|
|
160
44
|
|
|
161
|
-
|
|
162
|
-
2. `~/.nex-code/.env` — **override**, wins over ambient `process.env`
|
|
163
|
-
3. Current working directory `.env` — non-override, cannot clobber the global config
|
|
164
|
-
|
|
165
|
-
`~/.nex-code/.env` is the authoritative location for long-lived config like `OLLAMA_API_KEY`. The `override:true` on that file exists so that a rotated key written there takes effect on the next `nex-code` launch, even when nex-code is spawned by a long-running parent process (systemd daemon, supervisor agent, test runner) whose own environment was captured earlier and is now stale. If you rotate an API key, update `~/.nex-code/.env` **and** restart any long-running daemon that spawns nex-code — the `override:true` fixes subprocess launches but cannot refresh the parent's own captured `process.env`.
|
|
166
|
-
|
|
167
|
-
**Install from source:**
|
|
168
|
-
|
|
169
|
-
```bash
|
|
170
|
-
git clone https://github.com/hybridpicker/nex-code.git
|
|
171
|
-
cd nex-code && npm install && npm run build
|
|
172
|
-
cp .env.example .env && npm link && npm run install-hooks
|
|
173
|
-
```
|
|
45
|
+
## Key Features
|
|
174
46
|
|
|
175
|
-
|
|
47
|
+
- **CLI-first operation** with low overhead and a workflow that fits existing terminal habits
|
|
48
|
+
- **Phase-based execution** that separates planning, implementation, and verification
|
|
49
|
+
- **Multi-provider support** for OpenAI, Anthropic, Gemini, Ollama Cloud, and local Ollama
|
|
50
|
+
- **Tool-integrated execution** across files, shell commands, Git, SSH, Docker, and Kubernetes
|
|
51
|
+
- **Headless and interactive modes** for both conversational use and automated task runs
|
|
52
|
+
- **Sub-agent orchestration** for decomposing larger tasks into parallel workstreams
|
|
53
|
+
- **Benchmark-driven routing** to select stronger models for specific task categories
|
|
54
|
+
- **Repository-aware behavior** including context from the current project, config, and Git state
|
|
55
|
+
- **Safety controls** around confirmations, sensitive operations, and destructive commands
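
As a rough illustration of the benchmark-driven routing bullet above, the sketch below picks the highest-scoring model per task category. The function name, the record shape, the category names, and the scores are all made up for the example; the diff does not show how `nex-code` stores or consumes benchmark results.

```javascript
// Hypothetical: reduce benchmark records to the best-scoring model per category.
function bestModelPerCategory(results) {
  const best = {};
  for (const { category, model, score } of results) {
    // Keep the first record seen for a category, then replace it only on a strictly higher score.
    if (!best[category] || score > best[category].score) {
      best[category] = { model, score };
    }
  }
  return best;
}

// Made-up sample data, not real benchmark output.
const sample = [
  { category: "coding", model: "model-a", score: 90 },
  { category: "coding", model: "model-b", score: 84 },
  { category: "sysadmin", model: "model-b", score: 88 },
];

const routing = bestModelPerCategory(sample);
console.log(routing.coding.model, routing.sysadmin.model); // model-a model-b
```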
|
|
176
56
|
|
|
177
|
-
##
|
|
57
|
+
## Architecture
|
|
178
58
|
|
|
179
|
-
|
|
180
|
-
> explain the main function in index.js
|
|
181
|
-
> add input validation to the createUser handler
|
|
182
|
-
> run the tests and fix any failures
|
|
183
|
-
> the /users endpoint returns 500 — find the bug and fix it
|
|
184
|
-
```
|
|
59
|
+
At a high level, `nex-code` is organized as an orchestration layer on top of model providers and developer tools.
|
|
185
60
|
|
|
186
|
-
|
|
61
|
+
1. **CLI and session layer**
|
|
62
|
+
Accepts prompts, commands, flags, and session state from the terminal or editor integration.
|
|
187
63
|
|
|
188
|
-
|
|
64
|
+
2. **Agent loop**
|
|
65
|
+
Runs the task through a controlled execution cycle: inspect, plan, act, verify, and retry when needed.
|
|
189
66
|
|
|
190
|
-
|
|
191
|
-
|
|
192
|
-
```
|
|
67
|
+
3. **Routing and provider layer**
|
|
68
|
+
Resolves which provider and model should handle the next step, based on configuration, task type, and fallback logic.
|
|
193
69
|
|
|
194
|
-
|
|
70
|
+
4. **Tool execution layer**
|
|
71
|
+
Exposes filesystem, shell, Git, browser, SSH, Docker, Kubernetes, and related capabilities to the agent.
|
|
195
72
|
|
|
196
|
-
|
|
197
|
-
|
|
198
|
-
nex-code --prompt-file /tmp/task.txt --yolo --json
|
|
199
|
-
nex-code --daemon # watch mode: fires tasks on file changes, git commits, or cron
|
|
200
|
-
```
|
|
73
|
+
5. **Verification layer**
|
|
74
|
+
Runs tests, evaluates outcomes, and decides whether the task is complete or needs another pass.
|
|
201
75
|
|
|
202
|
-
|
|
203
|
-
|---|---|
|
|
204
|
-
| `--task <prompt>` | Run a single prompt and exit |
|
|
205
|
-
| `--prompt-file <path>` | Read prompt from file |
|
|
206
|
-
| `--yolo` | Skip all confirmations |
|
|
207
|
-
| `--server` | JSON-lines IPC server (VS Code extension) |
|
|
208
|
-
| `--daemon` | Background watcher (reads `.nex/daemon.json`) |
|
|
209
|
-
| `--flatrate` | 100 turns, 6 parallel agents, 5 retries |
|
|
210
|
-
| `--json` | JSON output to stdout |
|
|
211
|
-
| `--max-turns <n>` | Override agentic loop limit |
|
|
212
|
-
| `--model <spec>` | Use specific model (e.g. `anthropic:claude-sonnet-4-6`) |
|
|
213
|
-
| `--debug` | Show diagnostic messages |
|
|
214
|
-
| `--gemini` | Local Gemini test mode (`gemini-3.1-pro-preview` by default, requires `GEMINI_API_KEY`) |
|
|
215
|
-
| `--gemini-model <id>` | Pin a specific Gemini model (implies `--gemini`) |
|
|
216
|
-
|
|
217
|
-
### Vision / Screenshot
|
|
76
|
+
In practice, this makes `nex-code` closer to a local orchestration system than a thin wrapper around an LLM API.
|
|
218
77
|
|
|
219
|
-
|
|
220
|
-
> /path/to/screenshot.png implement this UI in React
|
|
221
|
-
> analyze https://example.com/mockup.png and implement it
|
|
222
|
-
> what's wrong with the layout in my clipboard # macOS clipboard capture
|
|
223
|
-
> screenshot localhost:3000 and review the navbar spacing
|
|
224
|
-
```
|
|
78
|
+
## Example Workflow
|
|
225
79
|
|
|
226
|
-
|
|
80
|
+
A typical developer flow with `nex-code` looks like this:
|
|
227
81
|
|
|
228
|
-
|
|
82
|
+
1. Start in a repository and describe the task in plain English.
|
|
83
|
+
2. `nex-code` inspects the project structure, relevant files, and surrounding context.
|
|
84
|
+
3. It forms a plan or enters a planning phase before editing.
|
|
85
|
+
4. It makes the implementation changes with tool access.
|
|
86
|
+
5. It runs tests or other verification steps.
|
|
87
|
+
6. If verification fails, it loops back, adjusts the implementation, and re-runs checks.
|
|
88
|
+
7. When the task is complete, it leaves the repository in a verifiable state rather than stopping at code generation.
|
|
229
89
|
|
|
230
|
-
|
|
90
|
+
Example prompts:
|
|
231
91
|
|
|
92
|
+
```text
|
|
93
|
+
explain why the user creation flow is failing in production
|
|
94
|
+
add input validation to the createUser handler and update the tests
|
|
95
|
+
refactor this module to async/await and verify the endpoint behavior
|
|
96
|
+
review the recent changes and look for regressions before I push
|
|
232
97
|
```
|
|
233
|
-
/model # interactive picker
|
|
234
|
-
/model openai:gpt-4o # switch directly
|
|
235
|
-
/providers # list all
|
|
236
|
-
/fallback anthropic,openai # auto-switch on failure
|
|
237
|
-
```
|
|
238
|
-
|
|
239
|
-
| Provider | Models | Env Variable |
|
|
240
|
-
|---|---|---|
|
|
241
|
-
| **ollama** | Qwen3, DeepSeek R1, Devstral, Kimi K2, MiniMax, GLM, Llama 4 | `OLLAMA_API_KEY` |
|
|
242
|
-
| **openai** | GPT-4o, GPT-4.1, o1, o3, o4-mini | `OPENAI_API_KEY` |
|
|
243
|
-
| **anthropic** | Claude Opus 4.6, Sonnet 4.6, Haiku 4.5 | `ANTHROPIC_API_KEY` |
|
|
244
|
-
| **gemini** | Gemini 3.1 Pro, 2.5 Pro/Flash | `GEMINI_API_KEY` |
|
|
245
|
-
| **local** | Any local Ollama model | (none) |
|
|
246
|
-
|
|
247
|
-
---
|
|
248
|
-
|
|
249
|
-
## Commands
|
|
250
|
-
|
|
251
|
-
Type `/` to see inline suggestions. Tab completion for slash commands and file paths.
|
|
252
|
-
|
|
253
|
-
| Command | Description |
|
|
254
|
-
|---|---|
|
|
255
|
-
| `/help` | Full help |
|
|
256
|
-
| `/model [spec]` | Show/switch model |
|
|
257
|
-
| `/providers` | List providers |
|
|
258
|
-
| `/clear` | Clear conversation |
|
|
259
|
-
| `/save` / `/load` / `/sessions` / `/resume` | Session management |
|
|
260
|
-
| `/branches` / `/fork` / `/switch-branch` / `/goto` | Session tree navigation |
|
|
261
|
-
| `/remember` / `/forget` / `/memory` | Persistent memory |
|
|
262
|
-
| `/brain add\|list\|search\|show\|remove` | Knowledge base |
|
|
263
|
-
| `/plan [task]` / `/plan edit` / `/plan approve` | Plan mode |
|
|
264
|
-
| `/commit [msg]` / `/diff` / `/branch` | Git intelligence |
|
|
265
|
-
| `/undo` / `/redo` / `/history` | Persistent undo/redo |
|
|
266
|
-
| `/snapshot [name]` / `/restore` | Git snapshots |
|
|
267
|
-
| `/permissions` / `/allow` / `/deny` | Tool permissions |
|
|
268
|
-
| `/costs` / `/budget` | Cost tracking and limits |
|
|
269
|
-
| `/review [--strict]` | Deep code review |
|
|
270
|
-
| `/benchmark` | Model ranking (62 tasks) |
|
|
271
|
-
| `/autoresearch` / `/ar-self-improve` | Autonomous optimization loops |
|
|
272
|
-
| `/servers` / `/docker` / `/deploy` / `/k8s` | Infrastructure management |
|
|
273
|
-
| `/skills` / `/install-skill` / `/mcp` / `/hooks` | Extensibility |
|
|
274
|
-
| `/tree [depth]` | Project file tree |
|
|
275
|
-
| `/audit` | Tool execution audit |
|
|
276
|
-
| `/setup` | Interactive setup wizard |
|
|
277
|
-
|
|
278
|
-
---
|
|
279
|
-
|
|
280
|
-
## Tools
|
|
281
98
|
|
|
282
|
-
|
|
99
|
+
## Design Philosophy
|
|
283
100
|
|
|
284
|
-
|
|
101
|
+
### CLI-first
|
|
285
102
|
|
|
286
|
-
|
|
103
|
+
The terminal remains the most capable interface for real development work. `nex-code` is designed to operate where developers already inspect code, run tests, check diffs, and manage environments.
|
|
287
104
|
|
|
288
|
-
|
|
105
|
+
### Developer-centric
|
|
289
106
|
|
|
290
|
-
|
|
107
|
+
The product assumes a professional engineering workflow: existing repositories, mixed tooling, imperfect environments, partial context, and the need to verify outcomes. It is meant to assist a developer, not replace the surrounding engineering discipline.
|
|
291
108
|
|
|
292
|
-
|
|
109
|
+
### Real-world workflows
|
|
293
110
|
|
|
294
|
-
|
|
111
|
+
A credible coding assistant must handle more than code generation. It needs to interact with source control, infrastructure, shells, CI-like verification, and operational context. `nex-code` is built around those constraints instead of treating them as edge cases.
|
|
295
112
|
|
|
296
|
-
|
|
113
|
+
## Installation / Getting Started
|
|
297
114
|
|
|
298
|
-
|
|
299
|
-
|
|
300
|
-
**Frontend:** `frontend_recon` — scans design tokens, layout, framework stack before any frontend work
|
|
301
|
-
|
|
302
|
-
**Visual:** `visual_diff`, `responsive_sweep`, `visual_annotate`, `visual_watch`, `design_tokens`, `design_compare`
|
|
303
|
-
|
|
304
|
-
Additional tools via [MCP servers](#mcp) or [Skills](#skills).
|
|
305
|
-
|
|
306
|
-
---
|
|
307
|
-
|
|
308
|
-
## Key Features
|
|
309
|
-
|
|
310
|
-
### Multi-Agent Orchestrator
|
|
311
|
-
|
|
312
|
-
Multi-goal prompts auto-decompose into parallel sub-agents. Up to 5 agents run simultaneously with file locking.
|
|
115
|
+
Quick start:
|
|
313
116
|
|
|
314
117
|
```bash
|
|
315
|
-
nex-code
|
|
316
|
-
```
|
|
317
|
-
|
|
318
|
-
### Background Agents
|
|
319
|
-
|
|
320
|
-
Sub-agents can run non-blocking in isolated forked processes. The main agent continues working while background workers complete, then results are automatically injected into the conversation.
|
|
321
|
-
|
|
322
|
-
```
|
|
323
|
-
# The model decides when to use background:true — no extra syntax needed.
|
|
324
|
-
# Example: the model might run the linter in background while explaining code.
|
|
325
|
-
spawn_agents([
|
|
326
|
-
{ task: "run the linter and report errors", background: true },
|
|
327
|
-
{ task: "explain the auth module" } ← main agent answers this immediately
|
|
328
|
-
])
|
|
329
|
-
```
|
|
330
|
-
|
|
331
|
-
Background agents are shown in the spinner: `● Thinking [1 bg agent running]`. Results appear as `✓ Background agent done: …` when workers finish.
|
|
332
|
-
|
|
333
|
-
### Autoresearch
|
|
334
|
-
|
|
335
|
-
Autonomous optimization loops: edit -> experiment -> keep/revert, on a dedicated branch.
|
|
336
|
-
|
|
337
|
-
```
|
|
338
|
-
/autoresearch reduce test runtime while maintaining correctness
|
|
339
|
-
/ar-self-improve # self-improvement using nex-code's benchmark
|
|
118
|
+
npx nex-code
|
|
340
119
|
```
|
|
341
120
|
|
|
342
|
-
|
|
343
|
-
|
|
344
|
-
Auto-activates for implementation tasks. Read-only analysis first, approve before writes. Hard-enforced tool restrictions.
|
|
345
|
-
|
|
346
|
-
### Daemon / Watch Mode
|
|
347
|
-
|
|
348
|
-
Background process that fires tasks on file changes, git commits, or cron schedule. Configured via `.nex/daemon.json`. Desktop and Matrix notifications.
|
|
349
|
-
|
|
350
|
-
### Session Trees
|
|
351
|
-
|
|
352
|
-
Navigate conversation history like git branches — fork, switch, goto, delete branches.
|
|
353
|
-
|
|
354
|
-
### Safety
|
|
355
|
-
|
|
356
|
-
| Layer | What it guards | Bypass? |
|
|
357
|
-
|---|---|---|
|
|
358
|
-
| **Forbidden patterns** | `rm -rf /`, fork bombs, reverse shells, `cat .env` | No |
|
|
359
|
-
| **Protected paths** | Destructive ops on `.env`, `.ssh/`, `.aws/`, `.git/` | `NEX_UNPROTECT=1` |
|
|
360
|
-
| **Sensitive file tools** | read/write/edit on `.env`, `.ssh/`, `.npmrc`, `.kube/` | No |
|
|
361
|
-
| **Critical commands** | `rm -rf`, `sudo`, `git push --force`, `git reset --hard` | Explicit confirmation |
|
|
362
|
-
|
|
363
|
-
Pre-push secret detection, audit logging (JSONL), persistent undo/redo, cost limits, auto plan mode.
|
|
364
|
-
|
|
365
|
-
### Open-Source Model Robustness
|
|
366
|
-
|
|
367
|
-
- **5-layer argument parsing** — JSON, trailing fix, extraction, key repair, fence stripping
|
|
368
|
-
- **Tool call retry with schema hints** — malformed args get the expected schema for self-correction
|
|
369
|
-
- **Auto-fix engine** — path resolution, edit fuzzy matching (Levenshtein), bash error hints
|
|
370
|
-
- **Tool tiers** — essential (5) / standard (21) / full (45), auto-selected per model capability
|
|
371
|
-
- **Stale stream recovery** — progressive retry with context compression on stall
|
|
372
|
-
|
|
373
|
-
### Visual Development Tools
|
|
374
|
-
|
|
375
|
-
Pixel-level before/after comparison, responsive sweeps (320-1920px), annotation overlays, design token extraction, and live-reload diff watching. Pure image tools work standalone; browser-based tools need Playwright.
|
|
376
|
-
|
|
377
|
-
---
|
|
378
|
-
|
|
379
|
-
## Extensibility
|
|
380
|
-
|
|
381
|
-
### Skills
|
|
382
|
-
|
|
383
|
-
Drop `.md` or `.js` files in `.nex/skills/` for project-specific knowledge, commands, and tools. Global skills in `~/.nex-code/skills/`. Install from git: `/install-skill user/repo`.
|
|
384
|
-
|
|
385
|
-
### Plugins
|
|
386
|
-
|
|
387
|
-
Custom tools and lifecycle hooks via `.nex/plugins/`. Events: `onToolResult`, `onModelResponse`, `onSessionStart`, `onSessionEnd`, `onFileChange`, `beforeToolExec`, `afterToolExec`.
|
|
388
|
-
|
|
389
|
-
### MCP
|
|
390
|
-
|
|
391
|
-
Connect external tool servers via [Model Context Protocol](https://modelcontextprotocol.io). Configure in `.nex/mcp.json` with env var interpolation.
|
|
392
|
-
|
|
393
|
-
### Hooks
|
|
394
|
-
|
|
395
|
-
Run custom scripts on CLI events (`pre-tool`, `post-tool`, `pre-commit`, `post-response`, `session-start`, `session-end`). Configure in `.nex/config.json` or `.nex/hooks/`.
|
|
396
|
-
|
|
397
|
-
---
|
|
398
|
-
|
|
399
|
-
## VS Code Extension
|
|
400
|
-
|
|
401
|
-
Built-in sidebar chat panel (`vscode/`) with streaming output, collapsible tool cards, and native theme support. Spawns `nex-code --server` over JSON-lines IPC.
|
|
121
|
+
Or install globally:
|
|
402
122
|
|
|
403
123
|
```bash
|
|
404
|
-
|
|
405
|
-
|
|
124
|
+
npm install -g nex-code
|
|
125
|
+
nex-code
|
|
406
126
|
```
|
|
407
127
|
|
|
408
|
-
|
|
128
|
+
Basic requirements:
|
|
409
129
|
|
|
410
|
-
|
|
411
|
-
|
|
412
|
-
```
|
|
413
|
-
bin/nex-code.js # Entrypoint
|
|
414
|
-
cli/
|
|
415
|
-
agent.js # Agentic loop + conversation state + guards
|
|
416
|
-
providers/ # Ollama, OpenAI, Anthropic, Gemini, Local + wire protocols
|
|
417
|
-
tools/index.js # 45 tool definitions + auto-fix engine
|
|
418
|
-
context-engine.js # Token management + 5-phase compression
|
|
419
|
-
sub-agent.js # Parallel sub-agents with file locking
|
|
420
|
-
orchestrator.js # Multi-agent decompose -> execute -> synthesize
|
|
421
|
-
session-tree.js # Session branching
|
|
422
|
-
visual.js # Visual dev tools (pixelmatch-based)
|
|
423
|
-
browser.js # Playwright browser agent
|
|
424
|
-
skills/ # Built-in + user skills
|
|
425
|
-
```
|
|
426
|
-
|
|
427
|
-
See [DEVELOPMENT.md](DEVELOPMENT.md) for full architecture details.
|
|
130
|
+
- Node.js 18+
|
|
131
|
+
- at least one configured provider key, or a local Ollama setup
|
|
428
132
|
|
|
429
|
-
|
|
133
|
+
Typical environment configuration:
|
|
430
134
|
|
|
431
|
-
|
|
135
|
+
```env
|
|
136
|
+
OLLAMA_API_KEY=your-key
|
|
137
|
+
OPENAI_API_KEY=your-key
|
|
138
|
+
ANTHROPIC_API_KEY=your-key
|
|
139
|
+
GEMINI_API_KEY=your-key
|
|
432
140
|
|
|
433
|
-
|
|
434
|
-
|
|
435
|
-
npm run typecheck # TypeScript noEmit check
|
|
436
|
-
npm run benchmark:gate # 7-task smoke test (blocks push on regression)
|
|
437
|
-
npm run benchmark:reallife # 35 real-world tasks across 7 categories
|
|
141
|
+
DEFAULT_PROVIDER=ollama
|
|
142
|
+
DEFAULT_MODEL=devstral-2:123b
|
|
438
143
|
```
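
A quick sanity check for the environment configuration above could be sketched as follows. The variable names are taken from the README; the helper functions, the rule that an empty key counts as unset, and the treatment of `local` as a key-free provider are assumptions made for illustration.

```javascript
// Env variable names as listed in the README; the mapping itself is illustrative.
const PROVIDER_KEYS = {
  ollama: "OLLAMA_API_KEY",
  openai: "OPENAI_API_KEY",
  anthropic: "ANTHROPIC_API_KEY",
  gemini: "GEMINI_API_KEY",
};

// Assumption: a provider counts as configured when its key is a non-empty string.
function configuredProviders(env) {
  return Object.entries(PROVIDER_KEYS)
    .filter(([, key]) => typeof env[key] === "string" && env[key].trim() !== "")
    .map(([provider]) => provider);
}

// Hypothetical check: DEFAULT_PROVIDER should have a key unless it is "local".
function checkConfig(env) {
  const providers = configuredProviders(env);
  const wanted = env.DEFAULT_PROVIDER;
  if (wanted && wanted !== "local" && !providers.includes(wanted)) {
    return { ok: false, providers, problem: `no API key set for DEFAULT_PROVIDER=${wanted}` };
  }
  return { ok: true, providers, problem: null };
}

// Example: an OpenAI key is present, but the default provider is ollama.
const report = checkConfig({ OPENAI_API_KEY: "your-key", DEFAULT_PROVIDER: "ollama" });
console.log(report.ok, report.problem);
```

In real use the argument would more likely be `process.env` after the `.env` file has been loaded; a literal object is used here so the example stays deterministic.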
|
|
439
144
|
|
|
440
|
-
|
|
441
|
-
|
|
442
|
-
## Security
|
|
443
|
-
|
|
444
|
-
- Pre-push secret detection (API keys, private keys, hardcoded credentials)
|
|
445
|
-
- Audit logging with automatic argument sanitization
|
|
446
|
-
- Sensitive path blocking (`.ssh/`, `.aws/`, `.env`, credentials)
|
|
447
|
-
- Shell injection protection via `execFileSync` with argument arrays
|
|
448
|
-
- SSRF protection on `web_fetch`
|
|
449
|
-
- MCP environment isolation
|
|
145
|
+
On first launch, `nex-code` can guide setup interactively. More detailed installation, provider setup, and advanced runtime configuration can be expanded here as the project documentation matures.
|
|
450
146
|
|
|
451
|
-
|
|
147
|
+
## Future Direction
|
|
452
148
|
|
|
453
|
-
|
|
149
|
+
The long-term value of `nex-code` is not only broader model support. It is better orchestration.
|
|
454
150
|
|
|
455
|
-
|
|
151
|
+
Likely areas of continued investment include:
|
|
456
152
|
|
|
457
|
-
|
|
153
|
+
- stronger benchmark-based routing across task categories
|
|
154
|
+
- deeper editor and automation integrations
|
|
155
|
+
- more robust multi-agent coordination for larger changes
|
|
156
|
+
- tighter verification loops for tests, diffs, and deployment workflows
|
|
157
|
+
- better support for persistent project knowledge and reusable team workflows
|
|
458
158
|
|
|
459
|
-
|
|
460
|
-
agentic coding cli, open source ai terminal, free coding ai, qwen3 coder cli, devstral terminal,
|
|
461
|
-
kimi k2 cli, multi-provider ai cli, local llm coding tool -->
|
|
159
|
+
The direction is clear: make AI assistance behave more like a disciplined engineering system and less like an isolated chat interface.
|