reasonix 0.18.0 → 0.19.0
- package/README.md +76 -884
- package/README.zh-CN.md +79 -804
- package/dashboard/app.css +1987 -2416
- package/dashboard/dist/app.js +24639 -0
- package/dashboard/dist/app.js.map +1 -0
- package/dashboard/dist/vendor-hljs.css +10 -0
- package/dashboard/dist/vendor-uplot.css +1 -0
- package/dashboard/index.html +2 -2
- package/dist/cli/index.js +5881 -3511
- package/dist/cli/index.js.map +1 -1
- package/package.json +10 -28
- package/dashboard/app.js +0 -4768
- package/dashboard/codemirror.js +0 -36
package/README.md
CHANGED
|
@@ -3,7 +3,7 @@
|
|
|
3
3
|
</p>
|
|
4
4
|
|
|
5
5
|
<p align="center">
|
|
6
|
-
<em>Cache-first agent loop for DeepSeek V4
|
|
6
|
+
<em>Cache-first agent loop for DeepSeek V4 — terminal-native, MCP first-class, no LangChain.</em>
|
|
7
7
|
</p>
|
|
8
8
|
|
|
9
9
|
<p align="center">
|
|
@@ -18,39 +18,25 @@
|
|
|
18
18
|
[](https://www.npmjs.com/package/reasonix)
|
|
19
19
|
[](./package.json)
|
|
20
20
|
|
|
21
|
-
**A DeepSeek-native AI coding agent
|
|
22
|
-
per task than Claude Code, with a cache-first loop engineered for
|
|
23
|
-
DeepSeek's pricing model. Edits as reviewable SEARCH/REPLACE blocks.
|
|
24
|
-
MIT-licensed. No IDE lock-in. MCP first-class.
|
|
21
|
+
**A DeepSeek-native AI coding agent for your terminal.** ~30× cheaper per task than Claude Code, engineered around DeepSeek's prefix-cache so the savings are real (94% live cache hit, not theoretical). MIT-licensed, no IDE lock-in, MCP first-class.
|
|
25
22
|
|
|
26
23
|
---
|
|
27
24
|
|
|
28
|
-
## Quick start
|
|
29
|
-
|
|
30
|
-
**1. Get a DeepSeek API key.** Free credit on signup:
|
|
31
|
-
<https://platform.deepseek.com/api_keys>
|
|
32
|
-
|
|
33
|
-
**2. Point it at a project.** No install needed.
|
|
25
|
+
## Quick start
|
|
34
26
|
|
|
35
27
|
```bash
|
|
36
28
|
cd my-project
|
|
37
29
|
npx reasonix code
|
|
38
30
|
```
|
|
39
31
|
|
|
40
|
-
First run
|
|
41
|
-
preset → multi-select MCP servers). Every run after that drops you
|
|
42
|
-
straight in.
|
|
43
|
-
|
|
44
|
-
**3. Ask it to edit.** The model proposes edits as SEARCH/REPLACE
|
|
45
|
-
blocks — nothing hits disk until you `/apply`.
|
|
32
|
+
First run: paste a [DeepSeek API key](https://platform.deepseek.com/api_keys), pick a preset, optionally select MCP servers. Every run after drops you straight in.
|
|
46
33
|
|
|
47
34
|
```
|
|
48
|
-
reasonix code ›
|
|
35
|
+
reasonix code › fix the case-sensitivity bug in findByEmail
|
|
49
36
|
|
|
50
37
|
assistant
|
|
51
38
|
▸ tool<search_files> → src/users.ts, src/users.test.ts
|
|
52
|
-
▸ tool<read_file> →
|
|
53
|
-
▸ Found it. findByEmail compares with === directly. Switching to lowercase normalization and adding a test.
|
|
39
|
+
▸ tool<read_file> → src/users.ts (412 chars)
|
|
54
40
|
|
|
55
41
|
src/users.ts
|
|
56
42
|
<<<<<<< SEARCH
|
|
@@ -60,918 +46,124 @@ src/users.ts
|
|
|
60
46
|
return users.find(u => u.email.toLowerCase() === needle);
|
|
61
47
|
>>>>>>> REPLACE
|
|
62
48
|
|
|
63
|
-
▸ 1 pending edit
|
|
64
|
-
|
|
65
|
-
reasonix code › /apply
|
|
66
|
-
▸ ✓ applied src/users.ts
|
|
49
|
+
▸ 1 pending edit · /apply to write, /discard to drop
|
|
67
50
|
```
|
|
68
51
|
|
|
69
|
-
Requires Node ≥ 20.10. macOS, Linux, Windows (PowerShell
|
|
70
|
-
Windows Terminal). Press `Esc` anytime to abort; `/help` for the full
|
|
71
|
-
command list.
|
|
52
|
+
Edits stay in memory until you type `/apply` — nothing hits disk by default. Requires Node ≥ 20.10. Tested on macOS, Linux, and Windows (PowerShell, Git Bash, Windows Terminal).
|
|
72
53
|
|
|
73
54
|
---
|
|
74
55
|
|
|
75
|
-
##
|
|
76
|
-
|
|
77
|
-
| | Reasonix | Claude Code | Cursor | Aider |
|
|
78
|
-
|------------------------------------|----------------|----------------|----------------|----------------|
|
|
79
|
-
| Backend | DeepSeek V4 | Anthropic | OpenAI / Anthropic | any |
|
|
80
|
-
| Cost / typical task | **~$0.001–0.005** | ~$0.05–0.50 | $20/mo + usage | varies |
|
|
81
|
-
| Where it runs | terminal | terminal + IDE | IDE (Electron) | terminal |
|
|
82
|
-
| License | **MIT** | closed | closed | Apache 2 |
|
|
83
|
-
| DeepSeek prefix-cache hit rate | **90.2%** | n/a | n/a | ~33% |
|
|
84
|
-
| Reviewable edits (no auto-write) | **yes** (`/apply`) | yes | partial | yes |
|
|
85
|
-
| MCP servers | **first-class**| first-class | — | — |
|
|
86
|
-
|
|
87
|
-
Numbers from `benchmarks/tau-bench-lite` (8 multi-turn coding tasks ×
|
|
88
|
-
3 repeats, live `deepseek-chat`). Same workload, sole variable is
|
|
89
|
-
prefix stability — committed transcripts in [`benchmarks/`](./benchmarks/).
|
|
90
|
-
The full feature comparison [is below](#why-reasonix-vs-cursor--claude-code--cline--aider).
|
|
56
|
+
## How it compares
|
|
91
57
|
|
|
92
|
-
|
|
93
|
-
|
|
94
|
-
|
|
58
|
+
| | Reasonix | Claude Code | Cursor | Aider |
|
|
59
|
+
|----------------------------------|------------------|-----------------|--------------------|------------------|
|
|
60
|
+
| Backend | DeepSeek V4 | Anthropic | OpenAI / Anthropic | any (OpenRouter) |
|
|
61
|
+
| **Cost / typical task** | **~¥0.01–0.04** | ~¥0.40–4 | ¥150/mo + usage | varies |
|
|
62
|
+
| Surface | terminal | terminal + IDE | IDE (Electron) | terminal |
|
|
63
|
+
| License | **MIT** | closed | closed | Apache 2 |
|
|
64
|
+
| **DeepSeek prefix-cache hit** | **94%** (live) | n/a | n/a | ~33% (baseline) |
|
|
65
|
+
| Plan mode (read-only audit gate) | yes | yes | — | yes |
|
|
66
|
+
| Edit review (`/apply`, no auto-write) | yes | yes | partial | yes |
|
|
67
|
+
| MCP servers | first-class | first-class | — | — |
|
|
68
|
+
| User-authored skills | yes | yes | — | — |
|
|
69
|
+
| Embedded web dashboard | yes | — | n/a (IDE) | — |
|
|
70
|
+
| Hooks (`PreToolUse`, etc.) | yes | yes | — | — |
|
|
71
|
+
| Sandbox boundary | strict | yes | partial | yes |
|
|
72
|
+
| Persistent per-workspace sessions | yes | partial | n/a | — |
|
|
95
73
|
|
|
96
|
-
|
|
97
|
-
URL with a one-time token. Open it for a 13-tab control surface that
|
|
98
|
-
mirrors the running TUI — chat (with live streaming), the editor (file
|
|
99
|
-
tree + CodeMirror, syntax highlighting + autocomplete + side-by-side
|
|
100
|
-
diff for pending edits), Usage / Sessions / Plans / Tools /
|
|
101
|
-
Permissions / System / MCP / Skills / Memory / Hooks / Settings.
|
|
74
|
+
Numbers from `benchmarks/tau-bench-lite` (8 multi-turn tasks × 3 repeats, live `deepseek-chat`). [Committed transcripts →](./benchmarks/)
|
|
102
75
|
|
|
103
|
-
|
|
104
|
-
|
|
105
|
-
▸ http://127.0.0.1:54219/?token=… (open in browser)
|
|
106
|
-
```
|
|
76
|
+
<details>
|
|
77
|
+
<summary><strong>Why DeepSeek-only? — the cache economics</strong></summary>
|
|
107
78
|
|
|
108
|
-
|
|
109
|
-
mutation is CSRF-checked. The TUI keeps working — modals (shell
|
|
110
|
-
confirms, plan reviews, edit gates) mirror to whichever surface you
|
|
111
|
-
look at first. No build step, no Electron, no separate process to
|
|
112
|
-
keep alive.
|
|
79
|
+
Cheap tokens alone are only half the story. DeepSeek's prefix-cache is **byte-stable**: the cache fingerprints from byte 0 of the prompt. Reasonix's loop is engineered around that (append-only growth, no re-ordering, no marker-based compaction), so the cache prefix survives every tool call.
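A minimal sketch of why append-only growth matters for a prefix cache, under stated assumptions: `sharedPrefixLength`, `serialize`, and the message strings are hypothetical and not taken from Reasonix's source, and the comparison runs over string characters rather than raw bytes. It measures how much of the previous prompt is still a literal prefix of the next one.

```typescript
// Hypothetical illustration; not Reasonix's actual implementation.
// Returns the number of leading characters two prompts share.
function sharedPrefixLength(a: string, b: string): number {
  const limit = Math.min(a.length, b.length);
  let i = 0;
  while (i < limit && a[i] === b[i]) i++;
  return i;
}

// Assumed serialization: one message per line.
const serialize = (messages: string[]): string => messages.join("\n");

const turn1 = ["system: you are a coding agent", "user: fix the bug"];

// Append-only growth: earlier messages are left untouched.
const appended = [...turn1, "tool: read_file -> src/users.ts"];

// Compaction-style rewrite: an early message is replaced with a summary.
const compacted = [
  "system: you are a coding agent",
  "summary: user wants a bug fix",
  "tool: read_file -> src/users.ts",
];

const base = serialize(turn1);
const appendShared = sharedPrefixLength(base, serialize(appended));
const compactShared = sharedPrefixLength(base, serialize(compacted));

console.log(appendShared === base.length); // whole previous prompt is reusable
console.log(compactShared < base.length);  // the rewrite cuts the shared prefix short
```

With append-only growth the entire earlier prompt remains a prefix of the new one; once an early message is rewritten, everything after the first differing character can no longer be served from a prefix cache.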
|
|
113
80
|
|
|
114
|
-
|
|
81
|
+
By comparison, Claude Code is built around Anthropic's `cache_control` markers (a fundamentally different mechanic). Pointing it at DeepSeek's Anthropic-compat endpoint keeps the cheap tokens but loses the cache hits — markers are ignored, and the underlying prefix isn't byte-stable. Generic-backend tools (Aider / Cline / Continue) hit the same wall from the other direction: their compaction patterns destroy byte stability.
|
|
115
82
|
|
|
116
|
-
|
|
117
|
-
|
|
118
|
-
Three things you'd come to Reasonix for, that nothing else combines:
|
|
119
|
-
|
|
120
|
-
- **Cost economics that land in your bill.** DeepSeek V4 is ~30×
|
|
121
|
-
cheaper than Claude Sonnet per token. Cheap tokens alone isn't the
|
|
122
|
-
win — *cheap tokens with a 90%+ prefix-cache hit* is. Reasonix's
|
|
123
|
-
loop is engineered around append-only prompt growth so the
|
|
124
|
-
cache-stable prefix survives every tool call. The benchmarks
|
|
125
|
-
section verifies this end-to-end: 90.2% live cache hit, versus
|
|
126
|
-
32.8% for a generic harness on the same workload. The `/stats`
|
|
127
|
-
panel surfaces "vs Claude Sonnet 4.6" savings on every turn.
|
|
128
|
-
|
|
129
|
-
- **It lives in your terminal.** Pure CLI — no Electron, no VS Code
|
|
130
|
-
extension, no IDE plugin to wedge into your editor. Sits next to
|
|
131
|
-
git, tmux, and your shell history. macOS / Linux / Windows
|
|
132
|
-
(PowerShell, Git Bash, Windows Terminal all tested). The only
|
|
133
|
-
network call is to the DeepSeek API itself; no vendor server in
|
|
134
|
-
the middle.
|
|
135
|
-
|
|
136
|
-
- **Open source and hackable, end to end.** MIT-licensed TypeScript.
|
|
137
|
-
The entire loop, tool registry, cache-stable prefix, TUI, MCP
|
|
138
|
-
bridge — all in `src/` under 30k lines. Fork it, ship a private
|
|
139
|
-
build, drop it into CI. No SaaS layer, no enterprise tier, no
|
|
140
|
-
feature gates.
|
|
141
|
-
|
|
142
|
-
| | Reasonix | Claude Code | Cursor | Cline | Aider |
|
|
143
|
-
|---|---|---|---|---|---|
|
|
144
|
-
| Backend | DeepSeek V4 only | Anthropic only | OpenAI / Anthropic | any (OpenRouter) | any (OpenRouter) |
|
|
145
|
-
| Cost / typical task | **~$0.001–$0.005** | ~$0.05–$0.50 | $20/mo + usage | varies | varies |
|
|
146
|
-
| Where it runs | terminal | terminal + IDE | IDE (Electron) | VS Code only | terminal |
|
|
147
|
-
| License | **MIT** | closed | closed | Apache 2 | Apache 2 |
|
|
148
|
-
| Cache-first prefix loop | **engineered (94% hit)** | basic | n/a | n/a | basic |
|
|
149
|
-
| MCP servers | **first-class** | first-class | — | beta | — |
|
|
150
|
-
| Plan mode (read-only audit gate) | **yes** | yes | — | yes | — |
|
|
151
|
-
| User-authored skills | **yes** | yes | — | — | — |
|
|
152
|
-
| Edit review (no auto-write) | **yes** (`/apply`) | yes | partial | yes | yes |
|
|
153
|
-
| Workspace switch (`/cwd`, `change_workspace`) | **yes** | — | n/a (per-window) | — | — |
|
|
154
|
-
| Cross-session cost dashboard | **yes** (`/stats`) | — | — | — | — |
|
|
155
|
-
| Sandbox boundary enforcement | **strict** (refuses `..` escape) | yes | partial | yes | partial |
|
|
83
|
+
At DeepSeek's pricing — $0.07/Mtok uncached, $0.014/Mtok cached — **the difference between 50% and 94% hit is roughly 2.5× on input cost alone.** Same model, same API; the loop's invariants are what changed.
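The arithmetic behind that ratio can be checked with a short sketch. The two prices are the ones quoted above; `blendedInputPrice` is a hypothetical helper name, and the model assumes input cost is a simple weighted average of cached and uncached tokens.

```typescript
// Prices quoted in the text, in USD per million input tokens.
const UNCACHED_PER_MTOK = 0.07;
const CACHED_PER_MTOK = 0.014;

// Blended input price per Mtok for a given cache-hit rate (0..1).
// Assumption: cost is a plain weighted average of the two prices.
function blendedInputPrice(hitRate: number): number {
  return hitRate * CACHED_PER_MTOK + (1 - hitRate) * UNCACHED_PER_MTOK;
}

const at50 = blendedInputPrice(0.5);  // 0.5 * 0.014 + 0.5 * 0.07
const at94 = blendedInputPrice(0.94); // 0.94 * 0.014 + 0.06 * 0.07
const ratio = at50 / at94;

console.log(at50.toFixed(5), at94.toFixed(5), ratio.toFixed(2));
```

Under this model a 50% hit rate costs $0.042 per Mtok of input and a 94% hit rate costs about $0.0174, a ratio of roughly 2.4, which is in line with the "roughly 2.5×" figure.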
|
|
156
84
|
|
|
157
|
-
|
|
158
|
-
<summary><strong>When reasonix is the wrong choice · DeepSeek/Anthropic-compat caveats · vs Aider/Cline/Continue</strong></summary>
|
|
159
|
-
|
|
160
|
-
### Pick something else when
|
|
161
|
-
|
|
162
|
-
- **You want multi-provider flexibility** (mix Claude / GPT / Gemini /
|
|
163
|
-
local Llama in one tool). Try [Aider](https://aider.chat) or
|
|
164
|
-
[Cline](https://cline.bot). Reasonix is DeepSeek-only on purpose —
|
|
165
|
-
every layer (cache-first loop, R1 harvesting, JSON-mode tool repair,
|
|
166
|
-
reasoning-effort cap) is tuned against DeepSeek-specific behavior
|
|
167
|
-
and economics. Coupling to one backend is the feature, not a
|
|
168
|
-
limitation we'll grow out of.
|
|
169
|
-
- **You want IDE integration** (inline diff in your gutter,
|
|
170
|
-
multi-cursor, ghost text, refactor previews). Try
|
|
171
|
-
[Cursor](https://cursor.com) or Claude Code's IDE mode. Reasonix
|
|
172
|
-
is terminal-first; the diff lives in `git diff`, the file tree
|
|
173
|
-
lives in `ls`, the chat lives in your shell.
|
|
174
|
-
- **You're chasing the hardest reasoning benchmarks.** Claude Opus
|
|
175
|
-
4.6 still wins some leaderboards. DeepSeek V4-pro is competitive
|
|
176
|
-
on most coding tasks but doesn't lead every benchmark. If your
|
|
177
|
-
task is "solve this PhD-level proof" rather than "fix this auth
|
|
178
|
-
bug," start with Claude.
|
|
179
|
-
- **You need fully-local / fully-free**. DeepSeek's API has free
|
|
180
|
-
credit on signup, but isn't free forever. For air-gapped or
|
|
181
|
-
always-free, look at Aider + Ollama or [Continue](https://continue.dev).
|
|
182
|
-
|
|
183
|
-
### "But DeepSeek now has an Anthropic-compatible API — can't I just point Claude Code at it?"
|
|
184
|
-
|
|
185
|
-
You can. DeepSeek ships an official Anthropic-compatible endpoint at
|
|
186
|
-
`https://api.deepseek.com/anthropic`, and Claude Code (or any Anthropic
|
|
187
|
-
SDK client) talks to it without modification. The protocol works. The
|
|
188
|
-
**caching economics** don't transfer, and that's the whole point.
|
|
189
|
-
|
|
190
|
-
Look at DeepSeek's [own compatibility table](https://api-docs.deepseek.com/guides/anthropic_api):
|
|
191
|
-
|
|
192
|
-
| Field | Status on DeepSeek's compat endpoint |
|
|
193
|
-
|---|---|
|
|
194
|
-
| `cache_control` markers | **Ignored** |
|
|
195
|
-
| `mcp_servers` (API-level) | Ignored |
|
|
196
|
-
| `thinking.budget_tokens` | Ignored |
|
|
197
|
-
| Images / documents / citations | Not supported |
|
|
198
|
-
|
|
199
|
-
`cache_control: Ignored` is the load-bearing line. Two completely
|
|
200
|
-
different cache mechanics are colliding here:
|
|
201
|
-
|
|
202
|
-
| | Anthropic native | DeepSeek auto-cache |
|
|
203
|
-
|---|---|---|
|
|
204
|
-
| Model | **Marker-based.** You put `cache_control` on a message; Anthropic caches "everything up to this marker" as a content-addressed unit. Multiple markers = multiple independent breakpoints. | **Byte-stable prefix.** The cache fingerprints the literal byte stream from byte 0. |
|
|
205
|
-
| Claude Code's design | Built around this. Markers on system prompt + tool defs let the loop reorder, compact, or insert metadata after the markers without losing the cache. | n/a — Claude Code wasn't designed for byte-stable prefixes. |
|
|
206
|
-
| What happens when Claude Code → DeepSeek compat | Markers stripped (ignored). Claude Code's main caching strategy disappears. | Falls back to auto-cache. But Claude Code's prefix isn't byte-stable (markers were the *substitute* for byte-stability), so auto-cache misses too. |
|
|
207
|
-
|
|
208
|
-
Net effect: **Claude Code's loop, redirected at DeepSeek, gets the
|
|
209
|
-
cheap tokens and loses the cache hit it depended on.** A loop running
|
|
210
|
-
at 80%+ cache hit on Anthropic's marker cache lands somewhere in the
|
|
211
|
-
40-60% range on DeepSeek's auto-cache (matches the generic-harness
|
|
212
|
-
baseline in our benchmarks). Same model, same API, same workload —
|
|
213
|
-
the loop's invariants don't fit the cache mechanic it's now talking
|
|
214
|
-
to.
|
|
215
|
-
|
|
216
|
-
Reasonix's loop was designed around byte-stable prefix from line one.
|
|
217
|
-
No markers, no breakpoints — append-only is the invariant. That's why
|
|
218
|
-
the same τ-bench workload lands at **90.2% cache hit** on Reasonix
|
|
219
|
-
and **32.8%** on a cache-hostile baseline (committed transcripts;
|
|
220
|
-
benchmarks section below). At DeepSeek's pricing — $0.07/Mtok
|
|
221
|
-
uncached, ~$0.014/Mtok cached — the difference between 50% and 94%
|
|
222
|
-
hit is **roughly 2.5× on input cost alone**.
|
|
223
|
-
|
|
224
|
-
### "What about Aider / Cline / Continue?"
|
|
225
|
-
|
|
226
|
-
They support DeepSeek natively (no compat layer needed) and you do
|
|
227
|
-
get the cheap token price. What you don't get is the DeepSeek-
|
|
228
|
-
specific loop work — those tools' loops support every backend
|
|
229
|
-
generically (OpenAI / Anthropic / local Llama / ...) and use
|
|
230
|
-
compaction + summarization patterns that destroy byte-stability. They
|
|
231
|
-
land in the same 40-60% cache-hit range as the baseline. Plus a
|
|
232
|
-
handful of DeepSeek-specific quirks generic loops don't handle:
|
|
85
|
+
A few DeepSeek-specific fixes generic loops miss:
|
|
233
86
|
|
|
234
87
|
| Generic loops assume | DeepSeek actually does | Reasonix's fix |
|
|
235
88
|
|---|---|---|
|
|
236
|
-
| Reasoning emitted as a structured `thinking` block | R1 sometimes leaks tool-call JSON inside `<think>` tags | a `scavenge` pass that pulls escaped tool calls back out
|
|
237
|
-
| Tool schemas validated strictly | DeepSeek silently drops deeply-nested object/array params | auto-flatten — nested params get rewritten to single-level prefixed names
|
|
238
|
-
| Tool-call args are well-formed JSON | DeepSeek occasionally produces `string="false"` and other malformed fragments | dedicated `ToolCallRepair` heals the common shapes before
|
|
239
|
-
| Reasoning depth tuned via system-level switches | V4 exposes a `reasoning_effort` knob (`max` / `high`) | `/effort` slash + `--effort` flag
|
|
240
|
-
| Old tool results kept in full forever | 1M context — don't compact pre-emptively, but most agents do | call-storm breaker + result token cap, but the prefix is *never* rewritten; compaction lands as new turns at the tail |
|
|
89
|
+
| Reasoning emitted as a structured `thinking` block | R1 sometimes leaks tool-call JSON inside `<think>` tags | a `scavenge` pass that pulls escaped tool calls back out |
|
|
90
|
+
| Tool schemas validated strictly | DeepSeek silently drops deeply-nested object/array params | auto-flatten — nested params get rewritten to single-level prefixed names |
|
|
91
|
+
| Tool-call args are well-formed JSON | DeepSeek occasionally produces `string="false"` and other malformed fragments | dedicated `ToolCallRepair` heals the common shapes before dispatch |
|
|
92
|
+
| Reasoning depth tuned via system-level switches | V4 exposes a `reasoning_effort` knob (`max` / `high`) | `/effort` slash + `--effort` flag for cheap turns |
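To make the auto-flatten row concrete, here is a minimal sketch of rewriting nested parameters to single-level prefixed names. The function name `flattenParams`, the `_` separator, and the choice to keep arrays as leaf values are all assumptions for illustration; the source does not describe Reasonix's actual naming scheme.

```typescript
// Hypothetical sketch; the separator and naming are assumptions, not Reasonix's scheme.
type Json = string | number | boolean | null | Json[] | { [key: string]: Json };

function flattenParams(
  value: { [key: string]: Json },
  prefix = "",
  out: Record<string, Json> = {},
): Record<string, Json> {
  for (const [key, child] of Object.entries(value)) {
    const name = prefix ? `${prefix}_${key}` : key;
    if (child !== null && typeof child === "object" && !Array.isArray(child)) {
      flattenParams(child, name, out); // recurse into nested objects
    } else {
      out[name] = child; // scalars and arrays are kept as leaf values
    }
  }
  return out;
}

const flat = flattenParams({
  path: "src/users.ts",
  options: { encoding: "utf8", range: { start: 1, end: 20 } },
});

console.log(flat);
```

Here the nested `options.range.start` becomes the single-level key `options_range_start`, so no parameter sits more than one level deep.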
|
|
241
93
|
|
|
242
|
-
|
|
243
|
-
> the loop is designed around. Reasonix isn't yet-another agent
|
|
244
|
-
> CLI — it's an agent CLI built around DeepSeek's specific cache
|
|
245
|
-
> mechanic and pricing model.
|
|
94
|
+
Cache stability isn't a feature you turn on; it's an invariant the loop is designed around. That's the entire reason Reasonix is DeepSeek-only.
|
|
246
95
|
|
|
247
96
|
</details>
|
|
248
97
|
|
|
249
98
|
---
|
|
250
99
|
|
|
251
|
-
##
|
|
252
|
-
|
|
253
|
-
Scoped to the directory you launch from. The model has native
|
|
254
|
-
`read_file` / `write_file` / `edit_file` / `list_directory` /
|
|
255
|
-
`search_files` / `directory_tree` / `get_file_info` /
|
|
256
|
-
`create_directory` / `move_file` tools, all sandboxed — any path that
|
|
257
|
-
resolves outside the launch root (including `..` and symlink escapes)
|
|
258
|
-
is refused. Plus `run_command` with a read-only allowlist; anything
|
|
259
|
-
state-mutating (`npm install`, `git commit`, …) is gated behind a
|
|
260
|
-
confirmation picker.
|
|
261
|
-
|
|
262
|
-
### Walkthrough: explore before editing
|
|
263
|
-
|
|
264
|
-
For "what does this code do?" questions the model uses the read-side
|
|
265
|
-
tools and replies in prose — no SEARCH/REPLACE blocks, no file
|
|
266
|
-
writes. Ask to change something only when you mean it:
|
|
267
|
-
|
|
268
|
-
```
|
|
269
|
-
reasonix code › How is routing organized in this project?
|
|
270
|
-
assistant
|
|
271
|
-
▸ tool<directory_tree> → (src/ tree, 47 entries)
|
|
272
|
-
▸ tool<read_file> → (src/router.ts, 1.2 KB)
|
|
273
|
-
▸ Routing has three layers: the top-level AppRouter registers tabs, each tab uses React Router's
|
|
274
|
-
nested routes for its sub-paths, and finally …
|
|
275
|
-
```
|
|
100
|
+
## What's in the box
|
|
276
101
|
|
|
277
|
-
|
|
278
|
-
|
|
279
|
-
the error and retries — silent wrong edits are worse than visible
|
|
280
|
-
rejections.
|
|
102
|
+
### Cache-first agent loop
|
|
103
|
+
The loop preserves prefix stability across tool dispatches. R1-style reasoning is supported, with a scavenge pass that pulls escaped tool calls back out of `<think>` blocks. Tool-call repair handles malformed args before they hit dispatch. `/effort` lets you step reasoning depth down for cheap turns.
|
|
281
104
|
|
|
282
|
-
###
|
|
105
|
+
### Tool registry
|
|
106
|
+
Native: `read_file`, `write_file`, `edit_file` (SEARCH/REPLACE), `list_directory`, `search_files`, `grep_files`, `run_command`, `run_background`, `web_search`, `web_fetch`. All sandboxed to the launch directory. **MCP first-class** — `--mcp 'name=cmd args'` adds external servers (stdio / Streamable HTTP / SSE), tools merge into the registry under a prefix.
|
|
283
107
|
|
|
284
|
-
|
|
285
|
-
|
|
286
|
-
Cancel**:
|
|
108
|
+
### Plan mode + edit review
|
|
109
|
+
`/plan` enters a read-only audit gate where the model can't dispatch edits until you approve a written plan. Edits emerge as SEARCH/REPLACE blocks; nothing hits disk until `/apply`. `/walk` steps through pending edits one at a time. `/discard` drops them all.
|
|
287
110
|
|
|
288
|
-
|
|
289
|
-
reasonix
|
|
290
|
-
|
|
291
|
-
▸ plan submitted — awaiting your review
|
|
292
|
-
────────────────────────────────────────
|
|
293
|
-
## Summary
|
|
294
|
-
Swap JWT middleware for session cookies, keep user table intact.
|
|
295
|
-
|
|
296
|
-
## Files
|
|
297
|
-
- src/auth/middleware.ts — replace `verifyJwt` with `readSession`
|
|
298
|
-
- src/auth/session.ts — new file, in-memory store + signed cookie
|
|
299
|
-
- src/routes/login.ts — return Set-Cookie instead of a token
|
|
300
|
-
- tests/auth/*.test.ts — update fixtures
|
|
301
|
-
|
|
302
|
-
## Risks
|
|
303
|
-
- Existing logged-in users get logged out (no migration).
|
|
304
|
-
- Session store is in-memory; restart clears sessions.
|
|
305
|
-
────────────────────────────────────────
|
|
306
|
-
▸ Approve and implement
|
|
307
|
-
Refine — explore more
|
|
308
|
-
Cancel
|
|
309
|
-
```
|
|
111
|
+
### Sessions, scoped per workspace
|
|
112
|
+
Sessions persist in `~/.reasonix/sessions/` and are filtered by launch directory. `--new` preserves the previous session under a timestamped name; `--resume` finds the latest. `/sessions` switches mid-chat without quitting.
|
|
310
113
|
|
|
311
|
-
|
|
312
|
-
the
|
|
313
|
-
shell call will execute. Use for high-stakes changes you want to
|
|
314
|
-
audit before the model touches disk. `/plan off` or picker
|
|
315
|
-
Approve/Cancel exits.
|
|
114
|
+
### Embedded web dashboard
|
|
115
|
+
`/dashboard` opens a localhost SPA mirroring the running TUI — chat (with full composer fallback when the TUI's renderer breaks down on legacy PowerShell), editor (file tree + CodeMirror), Sessions / Plans / Usage / Tools / MCP / Memory / Hooks / Settings. Token-gated, CSRF-checked, ephemeral. [Design mockup →](./design/agent-dashboard.html)
|
|
316
116
|
|
|
317
|
-
###
|
|
117
|
+
### Hooks
|
|
118
|
+
Configurable shell scripts that fire on `PreToolUse`, `PostToolUse`, `UserPromptSubmit`, `Stop`, `Notification`, `SessionEnd`. They live in `.reasonix/settings.json` (per-project) or `~/.reasonix/settings.json` (per-user). The harness executes them, not the model.
|
|
318
119
|
|
|
319
|
-
|
|
120
|
+
### Memory + skills
|
|
121
|
+
Two layers: project-scoped `REASONIX.md` (committed, repo conventions) and user-scoped `~/.reasonix/memory/` (per-user, the model can write to it via the `remember` tool). Skills are user-authored prompt packs with optional sub-agent execution.
|
|
320
122
|
|
|
321
|
-
|
|
322
|
-
|
|
323
|
-
log AND in the session so the model's next turn reasons about it:
|
|
123
|
+
### Permissions
|
|
124
|
+
`allow` / `ask` / `deny` patterns on commands and tools. `npm publish` defaults to `ask`; `rm -rf *` and `git push --force *` default to `deny`. Approved-once decisions can be remembered for a prefix.
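A minimal sketch of how `allow` / `ask` / `deny` patterns could be evaluated. Everything here is an assumption for illustration: the `Rule` shape, treating `*` as a wildcard, the precedence (deny over ask over allow), the `ask` fallback for unmatched commands, and the `git status*` rule. Only the `npm publish`, `rm -rf *`, and `git push --force *` decisions come from the text.

```typescript
// Hypothetical rule shape and matcher; Reasonix's real settings format may differ.
type Decision = "allow" | "ask" | "deny";
interface Rule {
  pattern: string;
  decision: Decision;
}

// Treat "*" as "any run of characters" and match the whole command.
function matches(pattern: string, command: string): boolean {
  const escaped = pattern
    .replace(/[.+?^${}()|[\]\\]/g, "\\$&") // escape regex metacharacters except "*"
    .replace(/\*/g, ".*");
  return new RegExp(`^${escaped}$`).test(command);
}

// Assumed precedence: deny wins over ask, ask wins over allow;
// commands that match no rule fall back to "ask".
function decide(rules: Rule[], command: string): Decision {
  const hits = rules
    .filter((rule) => matches(rule.pattern, command))
    .map((rule) => rule.decision);
  if (hits.includes("deny")) return "deny";
  if (hits.includes("ask")) return "ask";
  return hits.includes("allow") ? "allow" : "ask";
}

const rules: Rule[] = [
  { pattern: "git status*", decision: "allow" }, // assumed example rule
  { pattern: "npm publish*", decision: "ask" },
  { pattern: "rm -rf *", decision: "deny" },
  { pattern: "git push --force *", decision: "deny" },
];

console.log(decide(rules, "rm -rf build"));
console.log(decide(rules, "npm publish"));
```

Falling back to `ask` for unknown commands keeps the sketch conservative: nothing runs silently unless a rule explicitly allows it.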
|
|
324
125
|
|
|
325
|
-
|
|
326
|
-
reasonix code › !git status --short
|
|
327
|
-
▸ M src/users.ts
|
|
328
|
-
▸ M src/users.test.ts
|
|
126
|
+
[Full feature docs on the website →](https://esengine.github.io/reasonix/) · [Architecture →](./docs/ARCHITECTURE.md) · [TUI design mockup →](./design/agent-tui-terminal.html)
|
|
329
127
|
|
|
330
|
-
reasonix code › Explain the changes in these two files
|
|
331
|
-
assistant
|
|
332
|
-
▸ tool<read_file> → src/users.ts, src/users.test.ts
|
|
333
|
-
▸ …
|
|
334
|
-
```
|
|
335
|
-
|
|
336
|
-
No allowlist gate — user-typed shell = explicit consent. 60s timeout,
|
|
337
|
-
32k char cap, survives session resume.
|
|
338
|
-
|
|
339
|
-
**`@path/to/file` — inline a file under "Referenced files."** Start
|
|
340
|
-
typing `@` and a picker appears (↑/↓ navigate, Tab/Enter to insert).
|
|
341
|
-
Good for "what does @src/users.ts do?" without making the model
|
|
342
|
-
`read_file` it first. Sandboxed: relative paths only, no `..` escape,
|
|
343
|
-
64KB per-file cap. Recent files rank higher.
|
|
344
|
-
|
|
345
|
-
### `/commit` — stage + commit in one step
|
|
346
|
-
|
|
347
|
-
```
|
|
348
|
-
reasonix code › /commit "fix: findByEmail case-insensitive"
|
|
349
|
-
▸ git add -A && git commit -m "fix: findByEmail case-insensitive"
|
|
350
|
-
[main a1b2c3d] fix: findByEmail case-insensitive
|
|
351
|
-
```
|
|
352
|
-
|
|
353
|
-
### Things to try
|
|
354
|
-
|
|
355
|
-
- `/tool 1` — dump the last tool call's full output (when the 400-char
|
|
356
|
-
inline clip isn't enough).
|
|
357
|
-
- `/think` — see the model's full reasoning for the last turn
|
|
358
|
-
(thinking-mode models: v4-flash / v4-pro / reasoner alias).
|
|
359
|
-
- `/undo` — roll back the last applied edit batch.
|
|
360
|
-
- `/new` — start fresh in the same directory without losing the
|
|
361
|
-
session file.
|
|
362
|
-
- `/effort high` — step down from the default `max` agent-class
|
|
363
|
-
reasoning_effort for cheaper/faster turns on simple tasks.
|
|
364
|
-
- `npx reasonix code --preset pro` — v4-pro for the whole session,
|
|
365
|
-
no auto-downgrade to flash. Pair with `--branch 3` if you want
|
|
366
|
-
3-way self-consistency on gnarly refactors.
|
|
367
|
-
- `npx reasonix code src/` — narrower sandbox (only `src/` is
|
|
368
|
-
writable).
|
|
369
|
-
- `npx reasonix code --no-session` — ephemeral; nothing saved.
|
|
370
|
-
|
|
371
|
-
### `reasonix stats` — how much did you actually save?
|
|
372
|
-
|
|
373
|
-
Every turn `reasonix chat|code|run` runs appends a compact record
|
|
374
|
-
(tokens + cost + what Claude Sonnet 4.6 would have charged) to
|
|
375
|
-
`~/.reasonix/usage.jsonl`. `reasonix stats` with no args rolls that
|
|
376
|
-
log into today / week / month / all-time windows:
|
|
377
|
-
|
|
378
|
-
```
|
|
379
|
-
Reasonix usage — /Users/you/.reasonix/usage.jsonl
|
|
380
|
-
|
|
381
|
-
turns cache hit cost (USD) vs Claude saved
|
|
382
|
-
----------------------------------------------------------------------
|
|
383
|
-
today 8 95.1% $0.004821 $0.1348 96.4%
|
|
384
|
-
week 34 93.8% $0.023104 $0.6081 96.2%
|
|
385
|
-
month 127 94.2% $0.081530 $2.1452 96.2%
|
|
386
|
-
all-time 342 94.0% $0.210881 $5.8934 96.4%
|
|
387
|
-
```
|
|
388
|
-
|
|
389
|
-
Privacy: only tokens, costs, and the session name you chose land
|
|
390
|
-
in the file. No prompts, no completions, no tool arguments.
|
|
391
|
-
`reasonix stats <transcript>` keeps the old per-file summary
|
|
392
|
-
(assistant turns + tool calls) for scripts that already use it.
|
|
393
|
-
|
|
394
|
-
### Staying current
|
|
395
|
-
|
|
396
|
-
The panel header shows the running version next to `Reasonix` (e.g.
|
|
397
|
-
`Reasonix 0.12.6 · v4-flash · AUTO · max …`, the trailing `max` is
|
|
398
|
-
the reasoning-effort badge — `/effort high` to step down).
|
|
399
|
-
A quiet 24-hour background check against
|
|
400
|
-
the npm registry surfaces a yellow `update: X.Y.Z` on the right side
|
|
401
|
-
of the same row when a newer version has been published. No blocking,
|
|
402
|
-
no nagging — the check runs once per day max and is silent on failure
|
|
403
|
-
(offline, firewall, etc.).
|
|
404
|
-
|
|
405
|
-
```bash
|
|
406
|
-
reasonix update # print current vs latest, run `npm i -g reasonix@latest`
|
|
407
|
-
reasonix update --dry-run # print the plan without running anything
|
|
408
|
-
```
|
|
409
|
-
|
|
410
|
-
Running via `npx`? The command detects that and prints a
|
|
411
|
-
cache-refresh hint instead — npx picks up the newest version on
|
|
412
|
-
its next invocation automatically.
|
|
413
|
-
|
|
414
|
-
### Project conventions — `REASONIX.md`
|
|
415
|
-
|
|
416
|
-
Drop a `REASONIX.md` in the project root and its contents are pinned
|
|
417
|
-
into the system prompt every launch. Committable team memory — house
|
|
418
|
-
conventions, domain glossary, things the model keeps forgetting:
|
|
419
|
-
|
|
420
|
-
```bash
|
|
421
|
-
cat > REASONIX.md <<'EOF'
|
|
422
|
-
# Notes for Reasonix
|
|
423
|
-
- Use snake_case for new Python modules; legacy camelCase modules keep their style.
|
|
424
|
-
- `cargo check` is in the auto-run allowlist; full `cargo test` needs confirmation.
|
|
425
|
-
- The `api/` dir mirrors `backend/` — keep schemas in sync.
|
|
426
|
-
EOF
|
|
427
|
-
```
|
|
428
|
-
|
|
429
|
-
Re-launch (or `/new`) to pick it up; the prefix is hashed once per
|
|
430
|
-
session to keep the DeepSeek cache warm. `/memory` prints what's
|
|
431
|
-
currently pinned. `REASONIX_MEMORY=off` disables every memory source
|
|
432
|
-
for CI / offline repro.
|
|
433
|
-
|
|
434
|
-
### User memory — `~/.reasonix/memory/`
|
|
435
|
-
|
|
436
|
-
A second, **private per-user** memory layer lives under your home
|
|
437
|
-
directory. Unlike `REASONIX.md` it's never committed, and the model
|
|
438
|
-
can write to it itself via the `remember` tool. Two scopes:
|
|
439
|
-
|
|
440
|
-
- `~/.reasonix/memory/global/` — cross-project (your preferences,
|
|
441
|
-
tooling).
|
|
442
|
-
- `~/.reasonix/memory/<project-hash>/` — scoped to one sandbox root
|
|
443
|
-
in `reasonix code` (decisions, local facts, per-repo shortcuts).
|
|
444
|
-
|
|
445
|
-
Each scope keeps an always-loaded `MEMORY.md` index of one-liners
|
|
446
|
-
plus zero or more `<name>.md` detail files (loaded on demand via
|
|
447
|
-
`recall_memory`). Writes land immediately; pinning into the system
|
|
448
|
-
prompt takes effect on next `/new` or launch so the cache prefix
|
|
449
|
-
stays stable for the current session.
|
|
450
|
-
|
|
451
|
-
```
|
|
452
|
-
reasonix code › I use bun instead of npm; please run builds with bun from now on
|
|
453
|
-
|
|
454
|
-
assistant
|
|
455
|
-
▸ tool<remember> → project/bun_build saved
|
|
456
|
-
"Build command on this machine is `bun run build`"
|
|
457
|
-
```
|
|
458
|
-
|
|
459
|
-
**Slash**: `/memory` · `/memory list` · `/memory show <name>` ·
|
|
460
|
-
`/memory forget <name>` · `/memory clear <scope> confirm`.
|
|
461
|
-
**Model tools**: `remember(type, scope, name, description, content)` ·
|
|
462
|
-
`forget(scope, name)` · `recall_memory(scope, name)`.
|
|
463
|
-
|
|
464
|
-
Project scope is only available inside `reasonix code` (needs a real
|
|
465
|
-
sandbox root to hash); plain `reasonix` gets the global scope only.
|
|
466
|
-
|
|
467
|
-
### Skills — user-authored prompt packs
|
|
468
|
-
|
|
469
|
-
Skills are prose instruction blocks you drop on disk. Reasonix pins
|
|
470
|
-
their names + one-line descriptions into the system prompt; the
|
|
471
|
-
model can call `run_skill({name: "..."})` on its own when a match
|
|
472
|
-
fits, or you can type `/skill <name> [args]` to run one manually.
|
|
473
|
-
|
|
474
|
-
Two scopes, same layout as user memory:
|
|
475
|
-
|
|
476
|
-
- `<project>/.reasonix/skills/` — per-project skills (commit them to
|
|
477
|
-
share with your team, or add to `.gitignore` for personal drafts).
|
|
478
|
-
- `~/.reasonix/skills/` — global skills available everywhere.
|
|
479
|
-
|
|
480
|
-
Either layout works: `<name>/SKILL.md` (preferred — can bundle
|
|
481
|
-
additional assets alongside) or flat `<name>.md`.
|
|
482
|
-
|
|
483
|
-
```markdown
|
|
484
|
-
---
|
|
485
|
-
name: review
|
|
486
|
-
description: Review uncommitted changes and flag risks
|
|
487
128
|
---
|
|
488
129
|
|
|
489
|
-
|
|
490
|
-
hunk does, call out potential regressions, and list files that might
|
|
491
|
-
need additional tests. Don't propose edits unless I ask.
|
|
492
|
-
```
|
|
493
|
-
|
|
494
|
-
Use it:
|
|
495
|
-
|
|
496
|
-
```
|
|
497
|
-
reasonix code › /skill review
|
|
498
|
-
▸ running skill: review
|
|
499
|
-
assistant
|
|
500
|
-
▸ tool<run_command> → git diff --cached
|
|
501
|
-
▸ 3 changes, 1 needs a regression test …
|
|
502
|
-
```
|
|
503
|
-
|
|
504
|
-
Or let the model pick autonomously — because the skill's name +
|
|
505
|
-
description are pinned in the prefix, asking "Can you check whether my
|
|
506
|
-
uncommitted changes are risky?" triggers `run_skill({name: "review"})` without you typing the
|
|
507
|
-
slash command.
|
|
508
|
-
|
|
509
|
-
**Slash**: `/skill` (list) · `/skill show <name>` · `/skill <name>
|
|
510
|
-
[args]` (inject body as user turn).
|
|
511
|
-
|
|
512
|
-
**Deliberately not tied** to any other client's directory convention
|
|
513
|
-
(`.claude/skills`, etc.) — Reasonix is model-agnostic at the
|
|
514
|
-
conversation layer. Any SKILL.md you author works; the body is
|
|
515
|
-
prose, so skills authored for other tools usually port over unchanged
|
|
516
|
-
(Reasonix's tool names differ — `filesystem` / `shell` / `web` — but
|
|
517
|
-
the model reads the instructions and picks our equivalents).
|
|
518
|
-
|
|
519
|
-
### Hooks — automate around tool calls and turns
|
|
520
|
-
|
|
521
|
-
Drop a `settings.json` under `.reasonix/` (project or `~/`) and
|
|
522
|
-
Reasonix will fire shell commands at four well-known points in
|
|
523
|
-
the loop: before a tool runs, after a tool returns, before your
|
|
524
|
-
prompt reaches the model, and after the turn ends.
|
|
525
|
-
|
|
526
|
-
```json
|
|
527
|
-
// <project>/.reasonix/settings.json ← committable
|
|
528
|
-
// ~/.reasonix/settings.json ← per-user
|
|
529
|
-
{
|
|
530
|
-
"hooks": {
|
|
531
|
-
"PreToolUse": [{ "match": "edit_file|write_file", "command": "bun scripts/guard.ts" }],
|
|
532
|
-
"PostToolUse": [{ "match": "edit_file", "command": "biome format --write" }],
|
|
533
|
-
"UserPromptSubmit": [{ "command": "echo $(date +%s) >> ~/.reasonix/prompts.log" }],
|
|
534
|
-
"Stop": [{ "command": "bun test --run", "timeout": 60000 }]
|
|
535
|
-
}
|
|
536
|
-
}
|
|
537
|
-
```
|
|
538
|
-
|
|
539
|
-
Each hook is a shell command. Reasonix invokes it with stdin = a
|
|
540
|
-
JSON envelope describing the event:
|
|
130
|
+
## Contributing
|
|
541
131
|
|
|
542
|
-
```
|
|
543
|
-
{ "event": "PreToolUse", "cwd": "/path/to/project",
|
|
544
|
-
"toolName": "edit_file", "toolArgs": { "path": "src/x.ts", "..." } }
|
|
545
|
-
```
|
|
546
|
-
|
|
547
|
-
Exit code drives the decision:
|
|
548
|
-
|
|
549
|
-
- **0** — pass; loop continues normally
|
|
550
|
-
- **2** — block (only on `PreToolUse` / `UserPromptSubmit`); the
|
|
551
|
-
hook's stderr becomes the synthetic tool result the model sees,
|
|
552
|
-
or the prompt is dropped entirely
|
|
553
|
-
- **anything else** — warn; loop continues, stderr renders as a
|
|
554
|
-
yellow row inline
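
As a rough illustration of the exit-code contract above, a `PreToolUse` guard might be structured like the sketch below. The envelope field names (`event`, `cwd`, `toolName`, `toolArgs`) are taken from the example envelope; the `HookEnvelope` and `HookDecision` types, the `decide` and `runHook` helpers, and the protected-path rule are hypothetical and not part of Reasonix.

```typescript
// Sketch of a PreToolUse guard. Field names follow the envelope shown above;
// the types, helper names, and protected-path policy are assumptions.

interface HookEnvelope {
  event: string;
  cwd: string;
  toolName?: string;
  toolArgs?: Record<string, unknown>;
}

interface HookDecision {
  exitCode: number; // 0 = pass, 2 = block, anything else = warn
  stderr: string;   // text the hook would write to standard error
}

// Hypothetical policy: never let the agent edit these locations.
const PROTECTED_PREFIXES = [".git/", ".env"];

function decide(envelope: HookEnvelope): HookDecision {
  const path = envelope.toolArgs?.["path"];
  if (envelope.event !== "PreToolUse" || typeof path !== "string") {
    return { exitCode: 0, stderr: "" };
  }
  const hit = PROTECTED_PREFIXES.find((prefix) => path.startsWith(prefix));
  return hit !== undefined
    ? { exitCode: 2, stderr: `blocked: ${path} is under protected prefix ${hit}` }
    : { exitCode: 0, stderr: "" };
}

// Parse the raw stdin payload; malformed input yields a non-blocking warning.
function runHook(rawStdin: string): HookDecision {
  try {
    return decide(JSON.parse(rawStdin) as HookEnvelope);
  } catch {
    return { exitCode: 1, stderr: "guard: could not parse hook envelope" };
  }
}
```

To use something like this as a real hook, the script would read all of stdin, pass it to `runHook`, write `stderr` to standard error, and exit with `exitCode`.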
|
|
132
|
+
Reasonix is solo-maintained but designed to grow. Scoped starter issues:
|
|
555
133
|
|
|
556
|
-
|
|
557
|
-
|
|
558
|
-
|
|
559
|
-
`
|
|
134
|
+
- [#15 — `reasonix doctor --json` flag](https://github.com/esengine/reasonix/issues/15) · CLI · 2-3h
|
|
135
|
+
- [#16 — `web_search` / `web_fetch` actionable error messages](https://github.com/esengine/reasonix/issues/16) · tools · 2-3h
|
|
136
|
+
- [#17 — Slash command "did you mean?" suggestion](https://github.com/esengine/reasonix/issues/17) · TUI · 2-3h
|
|
137
|
+
- [#18 — Unit tests for `clipboard.ts`](https://github.com/esengine/reasonix/issues/18) · tests · 2-3h
|
|
560
138
|
|
|
561
|
-
|
|
562
|
-
`settings.json` from disk without losing your session).
|
|
139
|
+
Each issue includes background, code pointers, acceptance criteria, and hints. Browse all [`good first issue`](https://github.com/esengine/reasonix/labels/good%20first%20issue)s.
|
|
563
140
|
|
|
564
|
-
|
|
565
|
-
|
|
566
|
-
|
|
567
|
-
|
|
568
|
-
and the shell command to run. The slash does *not* spawn
|
|
569
|
-
`npm install` — stdio:inherit into a running Ink renderer corrupts
|
|
570
|
-
the display. Exit the session and run `reasonix update` in a
|
|
571
|
-
fresh shell when you actually want to install.
|
|
572
|
-
|
|
573
|
-
---
|
|
141
|
+
**Open Discussions** — opinions wanted:
|
|
142
|
+
- [#20 · CLI / TUI design](https://github.com/esengine/reasonix/discussions/20) — what's broken, what's missing, what would you change?
|
|
143
|
+
- [#21 · Dashboard design](https://github.com/esengine/reasonix/discussions/21) — react against the [proposed mockup](./design/agent-dashboard.html)
|
|
144
|
+
- [#22 · Future feature wishlist](https://github.com/esengine/reasonix/discussions/22) — what would you build into Reasonix next?
|
|
574
145
|
|
|
575
|
-
|
|
576
|
-
|
|
577
|
-
Same TUI, no filesystem tools unless you opt in via MCP. Good for
|
|
578
|
-
drafting, Q&A, schema design, architecture discussions, or driving
|
|
579
|
-
your own MCP servers. Sessions persist per name under
|
|
580
|
-
`~/.reasonix/sessions/`.
|
|
581
|
-
|
|
582
|
-
```bash
|
|
583
|
-
npx reasonix # uses saved config + wizard-selected MCP
|
|
584
|
-
npx reasonix --preset pro # pin v4-pro for the whole run (no auto-downgrade)
|
|
585
|
-
npx reasonix --session design # named session — resume later with --session design
|
|
586
|
-
```
|
|
587
|
-
|
|
588
|
-
Bridge your own MCP servers on the fly:
|
|
146
|
+
**Before your first PR**: read [`CONTRIBUTING.md`](./CONTRIBUTING.md). Short, strict project rules (comments, errors, libraries-over-hand-rolled); `tests/comment-policy.test.ts` enforces the comment ones and `npm run verify` is the pre-push gate.
|
|
589
147
|
|
|
590
148
|
```bash
|
|
591
|
-
|
|
592
|
-
|
|
593
|
-
|
|
594
|
-
|
|
595
|
-
|
|
596
|
-
MCP tools go through the same Cache-First + repair + context-safety
|
|
597
|
-
plumbing as native tools — 32k result cap, live progress-notification
|
|
598
|
-
rendering, retries.
|
|
599
|
-
|
|
600
|
-
---
|
|
601
|
-
|
|
602
|
-
## Commands inside the session
|
|
603
|
-
|
|
604
|
-
<details>
|
|
605
|
-
<summary><strong>Slash command reference</strong> (click to expand)</summary>
|
|
606
|
-
|
|
607
|
-
**Core**
|
|
608
|
-
|
|
609
|
-
| command | what it does |
|
|
610
|
-
|---|---|
|
|
611
|
-
| `/help` · `/?` | full command reference with hints |
|
|
612
|
-
| `/status` | current model · flags · context · session |
|
|
613
|
-
| `/new` · `/reset` | fresh conversation in the same session |
|
|
614
|
-
| `/clear` | clear visible scrollback only (log kept) |
|
|
615
|
-
| `/retry` | truncate and resend your last message (fresh sample) |
|
|
616
|
-
| `/exit` · `/quit` | quit |
|
|
617
|
-
|
|
618
|
-
**Model**
|
|
619
|
-
|
|
620
|
-
| command | what it does |
|
|
621
|
-
|---|---|
|
|
622
|
-
| `/preset <auto\|flash\|pro>` | model commitment — `auto` = flash with escalation, `flash` = locked flash, `pro` = locked pro |
|
|
623
|
-
| `/model <id>` | switch DeepSeek model (`deepseek-v4-flash`, `deepseek-v4-pro`, plus `deepseek-chat` / `deepseek-reasoner` compat aliases) |
|
|
624
|
-
| `/models` | list live models from DeepSeek `/models` endpoint |
|
|
625
|
-
| `/harvest [on\|off]` | toggle R1 plan-state extraction |
|
|
626
|
-
| `/branch <N\|off>` | run N parallel samples per turn, pick best (N ≥ 2) |
|
|
627
|
-
| `/effort <high\|max>` | reasoning_effort cap — `max` is the agent default, `high` is cheaper/faster |
|
|
628
|
-
| `/think` | dump the last turn's full thinking-mode reasoning |
|
|
629
|
-
|
|
630
|
-
**Context & tools**
|
|
631
|
-
|
|
632
|
-
| command | what it does |
|
|
633
|
-
|---|---|
|
|
634
|
-
| `/mcp` | list attached MCP servers and their tools / resources / prompts |
|
|
635
|
-
| `/resource [uri]` | browse + read MCP resources (no arg → list URIs; `<uri>` → fetch) |
|
|
636
|
-
| `/prompt [name]` | browse + fetch MCP prompts |
|
|
637
|
-
| `/tool [N]` | dump the Nth tool call's full output (1 = latest) |
|
|
638
|
-
| `/compact [tokens]` | shrink oversized tool results in the log (default 4000 tokens/result) |
|
|
639
|
-
| `/context` | break down where context tokens are going (system / tools / log) |
|
|
640
|
-
| `/stats` | cross-session cost dashboard (today / week / month / all-time) |
|
|
641
|
-
| `/keys` | keyboard shortcuts + prompt prefixes (`!` / `@` / `/`) cheatsheet |
|
|
642
|
-
|
|
643
|
-
**Memory & skills**
|
|
644
|
-
|
|
645
|
-
| command | what it does |
|
|
646
|
-
|---|---|
|
|
647
|
-
| `/memory` | show pinned memory (REASONIX.md + ~/.reasonix/memory) |
|
|
648
|
-
| `/memory list` · `show <name>` · `forget <name>` · `clear <scope> confirm` | manage the store |
|
|
649
|
-
| `/skill` · `/skill list` | list discovered skills (project + global) |
|
|
650
|
-
| `/skill show <name>` | dump one skill's body |
|
|
651
|
-
| `/skill <name> [args]` | run a skill (inject body as user turn) |
|
|
652
|
-
|
|
653
|
-
**Sessions**
|
|
654
|
-
|
|
655
|
-
| command | what it does |
|
|
656
|
-
|---|---|
|
|
657
|
-
| `/sessions` | list saved sessions (current marked with `▸`) |
|
|
658
|
-
| `/forget` | delete the current session from disk |
|
|
659
|
-
| `/setup` | reconfigure (exit and run `reasonix setup`) |
|
|
660
|
-
|
|
661
|
-
**Code mode only** (`reasonix code`)
|
|
662
|
-
|
|
663
|
-
| command | what it does |
|
|
664
|
-
|---|---|
|
|
665
|
-
| `/apply` | commit the pending SEARCH/REPLACE blocks to disk |
|
|
666
|
-
| `/discard` | drop the pending edit blocks without writing |
|
|
667
|
-
| `/undo` | roll back the last applied edit batch |
|
|
668
|
-
| `/commit "msg"` | `git add -A && git commit -m "msg"` |
|
|
669
|
-
| `/plan [on\|off]` | toggle read-only plan mode |
|
|
670
|
-
| `/apply-plan` | force-approve a pending plan |
|
|
671
|
-
|
|
672
|
-
**Keyboard**
|
|
673
|
-
|
|
674
|
-
- `Enter` — submit
|
|
675
|
-
- `Shift+Enter` / `Ctrl+J` — newline (multi-line paste also supported;
|
|
676
|
-
`\` + Enter as a portable fallback)
|
|
677
|
-
- `↑` / `↓` — walk prompt history while idle; navigate slash-autocomplete
|
|
678
|
-
- `Tab` / `Enter` on a `/foo` prefix — accept the highlighted suggestion
|
|
679
|
-
- `Esc` — abort the current turn (stops the API call, cancels any
|
|
680
|
-
in-flight tool, rejects pending MCP requests)
|
|
681
|
-
- `y` / `n` on confirm prompts — hotkey accept / reject
|
|
682
|
-
|
|
683
|
-
</details>
|
|
684
|
-
|
|
685
|
-
---
|
|
686
|
-
|
|
687
|
-
## Sessions and safety nets
|
|
688
|
-
|
|
689
|
-
- Sessions live as JSONL under `~/.reasonix/sessions/<name>.jsonl`
|
|
690
|
-
(per directory for `reasonix code`). Every message is appended
|
|
691
|
-
atomically; `Ctrl+C` never loses context.
|
|
692
|
-
- Tool results are capped at 32k chars per call. Oversized sessions
|
|
693
|
-
self-heal on load (shrinks + rewrites the file).
|
|
694
|
-
- Malformed `assistant.tool_calls` / `tool` pairing is validated on
|
|
695
|
-
every outgoing API call so a corrupted session can't keep 400ing.
|
|
696
|
-
- Context gauge turns yellow at 50%, red at 80% with a `/compact`
|
|
697
|
-
nudge. Approaching the 1M-token window (V4 flash + pro) triggers an
|
|
698
|
-
automatic compaction attempt before falling back to a forced summary.
|
|
699
|
-
- The `reasonix code` sandbox refuses any path that resolves outside
|
|
700
|
-
the launch directory, including symlink escape and `..` traversal.
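
Two of the safety nets above are simple enough to sketch. The 50% / 80% thresholds and the 32k cap come from the bullets; the function names, the `"normal"` level, the reading of "32k" as 32,000 characters, and the truncation marker are assumptions for illustration, not Reasonix's actual implementation.

```typescript
// Sketch only: thresholds and cap come from the bullets above; names are assumed.

type GaugeLevel = "normal" | "yellow" | "red";

// Yellow once 50% of the context window is used, red once 80% is used.
function gaugeLevel(usedTokens: number, windowTokens: number): GaugeLevel {
  const ratio = usedTokens / windowTokens;
  if (ratio >= 0.8) return "red";
  if (ratio >= 0.5) return "yellow";
  return "normal";
}

// Assumption: "32k chars" is read here as 32,000 characters.
const TOOL_RESULT_CAP = 32_000;

// Keep the first `cap` characters and append a marker noting how much was dropped.
function capToolResult(result: string, cap: number = TOOL_RESULT_CAP): string {
  if (result.length <= cap) return result;
  const dropped = result.length - cap;
  return `${result.slice(0, cap)}\n[truncated ${dropped} chars]`;
}
```

Whether the thresholds are inclusive, and what the real truncation notice looks like, is not stated in the README; the sketch picks one plausible reading.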
|
|
701
|
-
|
|
702
|
-
### Troubleshooting: duplicate rows / ghost rendering
|
|
703
|
-
|
|
704
|
-
Some Windows terminals (Git Bash / MINTTY / winpty-wrapped shells)
|
|
705
|
-
don't fully implement the ANSI cursor-up escapes Ink uses to repaint
|
|
706
|
-
the live spinner region. Symptom: spinners, streaming previews, or
|
|
707
|
-
tool-result rows print multiple copies into scrollback instead of
|
|
708
|
-
overwriting in place.
|
|
709
|
-
|
|
710
|
-
If you hit this, run with plain mode:
|
|
711
|
-
|
|
712
|
-
```bash
|
|
713
|
-
REASONIX_UI=plain npx reasonix code
|
|
714
|
-
```
|
|
715
|
-
|
|
716
|
-
Plain mode suppresses live/animated rows and disables the internal
|
|
717
|
-
tick timer. You lose the streaming preview and spinners but gain
|
|
718
|
-
stable scrollback. Windows Terminal, PowerShell 7 in Windows
|
|
719
|
-
Terminal, and WezTerm don't need this opt-out.
|
|
720
|
-
|
|
721
|
-
---
|
|
722
|
-
|
|
723
|
-
## Web search — on by default
|
|
724
|
-
|
|
725
|
-
The model has two web tools the moment you launch: `web_search` and
|
|
726
|
-
`web_fetch`. No flag, no API key, no signup. When you ask about
|
|
727
|
-
something the model wasn't trained on (new releases, current events,
|
|
728
|
-
obscure APIs), it decides to call `web_search` on its own; if a
|
|
729
|
-
snippet isn't enough it follows up with `web_fetch`.
|
|
730
|
-
|
|
731
|
-
Backed by **Mojeek**'s public search page — an independent web
|
|
732
|
-
index, bot-friendly, no cookies/sessions. Coverage on niche or very
|
|
733
|
-
recent queries can be thinner than Google/Bing, but it's reliable
|
|
734
|
-
from scripts. (DDG was the original backend but started serving
|
|
735
|
-
anti-bot pages in 2026.)
|
|
736
|
-
|
|
737
|
-
**Turn it off** (offline mode / privacy / CI):
|
|
738
|
-
|
|
739
|
-
```json
|
|
740
|
-
// ~/.reasonix/config.json
|
|
741
|
-
{ "apiKey": "sk-…", "search": false }
|
|
742
|
-
```
|
|
743
|
-
|
|
744
|
-
```bash
|
|
745
|
-
REASONIX_SEARCH=off npx reasonix code
|
|
746
|
-
```
|
|
747
|
-
|
|
748
|
-
**Bring your own** (Kagi, SearXNG, internal caches): implement the
|
|
749
|
-
`WebSearchProvider` interface and call
|
|
750
|
-
`registerWebTools(registry, { provider })` yourself, or bridge an
|
|
751
|
-
existing MCP search server via `--mcp`.
|
|
752
|
-
|
|
753
|
-
---
|
|
754
|
-
|
|
755
|
-
## MCP — bring your own tools
|
|
756
|
-
|
|
757
|
-
Any [MCP](https://spec.modelcontextprotocol.io/) server works. The
|
|
758
|
-
wizard lets you pick from a catalog, or drive it by flag:
|
|
759
|
-
|
|
760
|
-
```bash
|
|
761
|
-
# stdio (local subprocess)
|
|
762
|
-
npx reasonix --mcp "fs=npx -y @modelcontextprotocol/server-filesystem /tmp/safe"
|
|
763
|
-
|
|
764
|
-
# multiple at once
|
|
765
|
-
npx reasonix \
|
|
766
|
-
--mcp "fs=npx -y @modelcontextprotocol/server-filesystem /tmp/safe" \
|
|
767
|
-
--mcp "demo=npx tsx examples/mcp-server-demo.ts"
|
|
768
|
-
|
|
769
|
-
# HTTP+SSE (remote / hosted)
|
|
770
|
-
npx reasonix --mcp "kb=https://mcp.example.com/sse"
|
|
771
|
-
```
|
|
772
|
-
|
|
773
|
-
`reasonix mcp list` shows the curated catalog. `reasonix mcp inspect
|
|
774
|
-
<spec>` connects once and dumps the server's tools / resources /
|
|
775
|
-
prompts without starting a chat. Progress notifications from
|
|
776
|
-
long-running tools (2025-03-26 spec) render live as a progress bar
|
|
777
|
-
in the spinner.
|
|
778
|
-
|
|
779
|
-
Supported transports: **stdio** (local command) and **HTTP+SSE**
|
|
780
|
-
(remote, MCP 2024-11-05 spec).
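
The `--mcp "name=..."` strings above come in two shapes: a local command (stdio) or an `http(s)` URL (HTTP+SSE). A minimal sketch of telling them apart could look like this. `McpSpec` and `parseMcpSpec` are hypothetical names, and this is not Reasonix's actual parser.

```typescript
// Hypothetical parser for the --mcp spec strings shown above.

type McpSpec =
  | { name: string; transport: "stdio"; command: string; args: string[] }
  | { name: string; transport: "sse"; url: string };

function parseMcpSpec(spec: string): McpSpec {
  // Split on the first "=" only, so URLs with query strings stay intact.
  const eq = spec.indexOf("=");
  if (eq <= 0) throw new Error(`expected "name=target", got: ${spec}`);
  const name = spec.slice(0, eq);
  const target = spec.slice(eq + 1).trim();
  if (target === "") throw new Error(`missing target for MCP server "${name}"`);

  // An http(s) URL is treated as a remote HTTP+SSE server.
  if (/^https?:\/\//.test(target)) {
    return { name, transport: "sse", url: target };
  }

  // Anything else is treated as a local command line.
  // Naive whitespace split: quoted arguments are not handled in this sketch.
  const parts = target.split(/\s+/);
  const command = parts[0] ?? "";
  return { name, transport: "stdio", command, args: parts.slice(1) };
}
```

For example, `parseMcpSpec("kb=https://mcp.example.com/sse")` yields an `sse` entry, while the `fs=npx -y ...` form yields a `stdio` entry with `npx` as the command.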
|
|
781
|
-
|
|
782
|
-
---
|
|
783
|
-
|
|
784
|
-
## CLI reference
|
|
785
|
-
|
|
786
|
-
<details>
|
|
787
|
-
<summary><strong>Commands, flags, env vars</strong> (click to expand)</summary>
|
|
788
|
-
|
|
789
|
-
```bash
|
|
790
|
-
npx reasonix code [path] # coding mode scoped to path (default: cwd)
|
|
791
|
-
npx reasonix # chat (uses saved config)
|
|
792
|
-
npx reasonix setup # reconfigure the wizard
|
|
793
|
-
npx reasonix chat --session work # named session
|
|
794
|
-
npx reasonix chat --no-session # ephemeral
|
|
795
|
-
npx reasonix run "ask anything" # one-shot, streams to stdout
|
|
796
|
-
npx reasonix stats session.jsonl # summarize a transcript
|
|
797
|
-
npx reasonix replay chat.jsonl # rebuild cost/cache from a transcript
|
|
798
|
-
npx reasonix diff a.jsonl b.jsonl --md # compare two transcripts
|
|
799
|
-
npx reasonix mcp list # curated MCP catalog
|
|
800
|
-
npx reasonix mcp inspect <spec> # probe a single MCP server
|
|
801
|
-
npx reasonix sessions # list saved sessions
|
|
802
|
-
```
|
|
803
|
-
|
|
804
|
-
Common flags:
|
|
805
|
-
|
|
806
|
-
```bash
|
|
807
|
-
--preset <auto|flash|pro> # model commitment (auto / locked-flash / locked-pro)
|
|
808
|
-
--model <id> # explicit model id
|
|
809
|
-
--harvest / --no-harvest # R1 plan-state extraction
|
|
810
|
-
--branch <N> # self-consistency budget
|
|
811
|
-
--mcp "name=cmd args…" # attach an MCP server (repeatable)
|
|
812
|
-
--transcript path.jsonl # write a JSONL transcript on the side
|
|
813
|
-
--session <name> # named session (default: per-dir for code mode)
|
|
814
|
-
--no-session # ephemeral
|
|
815
|
-
--no-config # ignore ~/.reasonix/config.json (CI-friendly)
|
|
816
|
-
```
|
|
817
|
-
|
|
818
|
-
Env vars (win over config):
|
|
819
|
-
|
|
820
|
-
```bash
|
|
821
|
-
export DEEPSEEK_API_KEY=sk-...
|
|
822
|
-
export DEEPSEEK_BASE_URL=https://... # optional alternate endpoint
|
|
823
|
-
export REASONIX_MEMORY=off # disable REASONIX.md + user memory
|
|
824
|
-
export REASONIX_SEARCH=off # disable web_search / web_fetch
|
|
825
|
-
export REASONIX_UI=plain # disable live rows (ghosting workaround)
|
|
826
|
-
```
|
|
827
|
-
|
|
828
|
-
</details>
|
|
829
|
-
|
|
830
|
-
---
|
|
831
|
-
|
|
832
|
-
## Library usage
|
|
833
|
-
|
|
834
|
-
<details>
|
|
835
|
-
<summary><strong>Programmatic API — embed reasonix in your own Node project</strong> (click to expand)</summary>
|
|
836
|
-
|
|
837
|
-
|
|
838
|
-
```ts
|
|
839
|
-
import {
|
|
840
|
-
CacheFirstLoop,
|
|
841
|
-
DeepSeekClient,
|
|
842
|
-
ImmutablePrefix,
|
|
843
|
-
ToolRegistry,
|
|
844
|
-
} from "reasonix";
|
|
845
|
-
|
|
846
|
-
const client = new DeepSeekClient(); // reads DEEPSEEK_API_KEY from env
|
|
847
|
-
const tools = new ToolRegistry();
|
|
848
|
-
|
|
849
|
-
tools.register({
|
|
850
|
-
name: "add",
|
|
851
|
-
description: "Add two integers",
|
|
852
|
-
parameters: {
|
|
853
|
-
type: "object",
|
|
854
|
-
properties: { a: { type: "integer" }, b: { type: "integer" } },
|
|
855
|
-
required: ["a", "b"],
|
|
856
|
-
},
|
|
857
|
-
fn: ({ a, b }: { a: number; b: number }) => a + b,
|
|
858
|
-
});
|
|
859
|
-
|
|
860
|
-
const loop = new CacheFirstLoop({
|
|
861
|
-
client,
|
|
862
|
-
tools,
|
|
863
|
-
prefix: new ImmutablePrefix({
|
|
864
|
-
system: "You are a math helper.",
|
|
865
|
-
toolSpecs: tools.specs(),
|
|
866
|
-
}),
|
|
867
|
-
harvest: true,
|
|
868
|
-
branch: 3,
|
|
869
|
-
});
|
|
870
|
-
|
|
871
|
-
for await (const ev of loop.step("What is 17 + 25?")) {
|
|
872
|
-
if (ev.role === "assistant_final") console.log(ev.content);
|
|
873
|
-
}
|
|
874
|
-
console.log(loop.stats.summary());
|
|
875
|
-
```
|
|
876
|
-
|
|
877
|
-
`ChatOptions.seedTools` accepts a pre-built `ToolRegistry` for
|
|
878
|
-
callers who want the `reasonix code` loop wiring without the CLI
|
|
879
|
-
wrapper. See [docs/ARCHITECTURE.md](docs/ARCHITECTURE.md) for
|
|
880
|
-
internals.
|
|
881
|
-
|
|
882
|
-
</details>
|
|
883
|
-
|
|
884
|
-
---
|
|
885
|
-
|
|
886
|
-
## Benchmarks — verify the cache-hit claim yourself
|
|
887
|
-
|
|
888
|
-
Every abstraction here earns its weight against a DeepSeek-specific
|
|
889
|
-
property — dirt-cheap tokens, R1 reasoning traces, automatic prefix
|
|
890
|
-
caching, JSON mode. Generic wrappers leave these on the table.
|
|
891
|
-
|
|
892
|
-
| | Reasonix default | generic frameworks |
|
|
893
|
-
|---|---|---|
|
|
894
|
-
| Prefix-stable loop (→ 85–95% cache hit) | yes | no (prompts rebuilt each turn) |
|
|
895
|
-
| Auto-flatten deep tool schemas | yes | no (DeepSeek drops args) |
|
|
896
|
-
| Retry with jittered backoff (429/503) | yes | no (custom callbacks) |
|
|
897
|
-
| Scavenge tool calls leaked into `<think>` | yes | no |
|
|
898
|
-
| Call-storm breaker on identical-arg repeats | yes | no |
|
|
899
|
-
| Live cache-hit / cost / vs-Claude panel | yes | no |
|
|
900
|
-
|
|
901
|
-
On the same τ-bench-lite workload — 8 multi-turn tool-use tasks × 3
|
|
902
|
-
repeats = 24 runs per side (48 total), live DeepSeek `deepseek-chat`, sole
|
|
903
|
-
variable prefix stability:
|
|
904
|
-
|
|
905
|
-
| metric | baseline (cache-hostile) | Reasonix | delta |
|
|
906
|
-
|---|---:|---:|---:|
|
|
907
|
-
| cache hit | 32.8% | **90.2%** | +57.4 pp |
|
|
908
|
-
| cost / task | $0.000992 | $0.000593 | **−40%** |
|
|
909
|
-
| pass rate | 100% (24/24) | **100% (24/24)** | — |
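
The delta column can be recomputed from the table's own figures. The sketch below does that arithmetic; the variable names and the `roundTo` helper are illustrative only.

```typescript
// Recompute the "delta" column from the numbers in the table above.
const baseline = { cacheHitPct: 32.8, costPerTask: 0.000992 };
const reasonix = { cacheHitPct: 90.2, costPerTask: 0.000593 };

// Round to a fixed number of decimals to hide floating-point noise.
function roundTo(value: number, decimals: number): number {
  const factor = Math.pow(10, decimals);
  return Math.round(value * factor) / factor;
}

// Cache hit is compared in percentage points (a plain difference).
const cacheDeltaPp = roundTo(reasonix.cacheHitPct - baseline.cacheHitPct, 1);

// Cost is compared as a relative change against the baseline.
const costDeltaPct = roundTo(
  ((reasonix.costPerTask - baseline.costPerTask) / baseline.costPerTask) * 100,
  0,
);

console.log(`cache hit delta: +${cacheDeltaPp} pp`);
console.log(`cost / task delta: ${costDeltaPct}%`);
```

Note that the cache-hit figure is an absolute difference in percentage points, while the cost figure is a relative change.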
|
|
910
|
-
|
|
911
|
-
**Reproduce without spending an API credit:**
|
|
912
|
-
|
|
913
|
-
```bash
|
|
914
|
-
git clone https://github.com/esengine/reasonix.git && cd reasonix && npm install
|
|
915
|
-
npx reasonix replay benchmarks/tau-bench/transcripts/t01_address_happy.reasonix.r1.jsonl
|
|
916
|
-
npx reasonix diff \
|
|
917
|
-
benchmarks/tau-bench/transcripts/t01_address_happy.baseline.r1.jsonl \
|
|
918
|
-
benchmarks/tau-bench/transcripts/t01_address_happy.reasonix.r1.jsonl
|
|
149
|
+
git clone https://github.com/esengine/reasonix.git
|
|
150
|
+
cd reasonix
|
|
151
|
+
npm install
|
|
152
|
+
npm run dev code # run from source via tsx
|
|
153
|
+
npm run verify # lint + typecheck + 1665 tests
|
|
919
154
|
```
|
|
920
155
|
|
|
921
|
-
The committed JSONL transcripts carry per-turn `usage`, `cost`, and
|
|
922
|
-
`prefixHash`. Reasonix's prefix hash stays byte-stable across every
|
|
923
|
-
model call; baseline's churns on every turn. The cache delta is
|
|
924
|
-
*mechanically* attributable to log stability, not to a different
|
|
925
|
-
system prompt.
|
|
926
|
-
|
|
927
|
-
Full 48-run report:
|
|
928
|
-
[`benchmarks/tau-bench/report.md`](./benchmarks/tau-bench/report.md).
|
|
929
|
-
Reproduce with your own API key: `npx tsx
|
|
930
|
-
benchmarks/tau-bench/runner.ts --repeats 3`.
|
|
931
|
-
|
|
932
|
-
MCP reference runs (one single prefix hash across all 5 turns even
|
|
933
|
-
with two concurrent MCP subprocesses):
|
|
934
|
-
|
|
935
|
-
| server | turns | cache hit | cost | vs Claude |
|
|
936
|
-
|---|---:|---:|---:|---:|
|
|
937
|
-
| bundled demo (`add` / `echo` / `get_time`) | 2 | **96.6%** (turn 2) | $0.000254 | −94.0% |
|
|
938
|
-
| official `server-filesystem` | 5 | **96.7%** | $0.001235 | −97.0% |
|
|
939
|
-
| **both concurrently** | 5 | **81.1%** | $0.001852 | −95.9% |
|
|
940
|
-
|
|
941
156
|
---
|
|
942
157
|
|
|
943
158
|
## Non-goals
|
|
944
159
|
|
|
945
|
-
- **Multi-
|
|
946
|
-
- **
|
|
947
|
-
|
|
948
|
-
|
|
949
|
-
- **Multi-provider abstraction** (use LiteLLM). Reasonix is
|
|
950
|
-
DeepSeek-only on purpose — every pillar (cache-first loop, R1
|
|
951
|
-
harvesting, tool-call repair) is tuned against DeepSeek-specific
|
|
952
|
-
behavior and economics. Coupling to one backend is the feature.
|
|
953
|
-
- **RAG / vector stores** (use LlamaIndex).
|
|
954
|
-
- **Web UI / SaaS.**
|
|
955
|
-
|
|
956
|
-
Reasonix does DeepSeek, deeply.
|
|
957
|
-
|
|
958
|
-
---
|
|
959
|
-
|
|
960
|
-
## Development
|
|
961
|
-
|
|
962
|
-
```bash
|
|
963
|
-
git clone https://github.com/esengine/reasonix.git
|
|
964
|
-
cd reasonix
|
|
965
|
-
npm install
|
|
966
|
-
npm run dev code # run CLI from source via tsx
|
|
967
|
-
npm run build # tsup to dist/
|
|
968
|
-
npm test # vitest (1482 tests)
|
|
969
|
-
npm run lint # biome
|
|
970
|
-
npm run typecheck # tsc --noEmit
|
|
971
|
-
```
|
|
160
|
+
- **Multi-provider flexibility.** DeepSeek-only on purpose — every layer is tuned around DeepSeek's specific cache mechanic and pricing. Coupling to one backend is the feature.
|
|
161
|
+
- **IDE integration.** Terminal-first; the diff lives in `git diff`, the file tree in `ls`. The dashboard is a companion, not a Cursor replacement.
|
|
162
|
+
- **Hardest-leaderboard reasoning.** Claude Opus still wins some benchmarks. DeepSeek V4 is competitive on coding; if your work is "solve this PhD proof" rather than "fix this auth bug," start with Claude.
|
|
163
|
+
- **Air-gapped / fully-free.** DeepSeek's API has free credit on signup but isn't free forever. For air-gapped, see Aider + Ollama or [Continue](https://continue.dev).
|
|
972
164
|
|
|
973
165
|
---
|
|
974
166
|
|
|
975
167
|
## License
|
|
976
168
|
|
|
977
|
-
MIT
|
|
169
|
+
MIT — see [LICENSE](./LICENSE).
|