reasonix 0.4.13 → 0.4.14
- package/README.md +371 -167
- package/dist/cli/index.js +164 -173
- package/dist/cli/index.js.map +1 -1
- package/dist/index.d.ts +1 -1
- package/dist/index.js +1 -1
- package/dist/index.js.map +1 -1
- package/package.json +1 -1
package/README.md
CHANGED
|
@@ -9,227 +9,388 @@
|
|
|
9
9
|
**A DeepSeek-native AI coding assistant in your terminal.** Ink TUI. MCP
|
|
10
10
|
first-class. No LangChain.
|
|
11
11
|
|
|
12
|
+
---
|
|
13
|
+
|
|
14
|
+
## Quick start (60 seconds)
|
|
15
|
+
|
|
16
|
+
**1. Get a DeepSeek API key.** Free credit on signup:
|
|
17
|
+
<https://platform.deepseek.com/api_keys>
|
|
18
|
+
|
|
19
|
+
**2. Run it.** No install needed.
|
|
20
|
+
|
|
12
21
|
```bash
|
|
13
22
|
npx reasonix
|
|
14
23
|
```
|
|
15
24
|
|
|
16
|
-
|
|
17
|
-
preset → pick MCP servers from a checklist); every run after that drops
|
|
18
|
-
straight into chat with your tools wired up. Inside the chat, type `/help`.
|
|
25
|
+
First run walks you through a 30-second wizard:
|
|
19
26
|
|
|
20
|
-
|
|
21
|
-
|
|
22
|
-
|
|
23
|
-
Generic wrappers treat DeepSeek as "OpenAI with a different base URL"
|
|
24
|
-
and leave these advantages on the table. Reasonix leans into them:
|
|
25
|
-
on the same τ-bench-lite workload,
|
|
26
|
-
[**94.4% cache hit, ~40% cheaper tokens, 100% pass rate**](#validated-numbers)
|
|
27
|
-
vs. a cache-hostile baseline.
|
|
27
|
+
- paste your API key (saved to `~/.reasonix/config.json`)
|
|
28
|
+
- pick a preset — `fast` (cheap chat, default), `smart` (+R1 reasoning), `max` (+self-consistency branching)
|
|
29
|
+
- multi-select MCP servers from a catalog (filesystem, memory, github, puppeteer, …)
|
|
28
30
|
|
|
29
|
-
|
|
31
|
+
Every run after that drops you straight into chat.
|
|
30
32
|
|
|
31
|
-
|
|
33
|
+
**3. Inside the chat.** Type anything and hit Enter. Type `/help` to see
|
|
34
|
+
every command. The status bar at the top shows cache hit %, cost so far,
|
|
35
|
+
balance, and context usage. Press `Esc` to cancel whatever is running.
|
|
32
36
|
|
|
33
|
-
|
|
34
|
-
|
|
35
|
-
|
|
36
|
-
|
|
37
|
-
|
|
38
|
-
|
|
39
|
-
|
|
40
|
-
|
|
41
|
-
| **Tool-Call Repair** | Auto-flattens deep/wide schemas, scavenges tool calls leaked into `<think>`, repairs truncated JSON, breaks call-storms | always on |
|
|
42
|
-
| **Retry layer** | Exponential backoff + jitter on 408/429/500/502/503/504 and network errors. 4xx auth errors don't retry | always on |
|
|
43
|
-
| **Ink TUI** | Live cache-hit / cost / context panel. Streams R1 thinking to a compact preview. Renders Markdown (bold / lists / code / stripped LaTeX) | always on |
|
|
37
|
+
```
|
|
38
|
+
reasonix › explain what this project does
|
|
39
|
+
assistant
|
|
40
|
+
…streams R1 reasoning into a dim preview, then writes the answer…
|
|
41
|
+
status bar: cache hit 92% · cost $0.001 · ctx 8k/131k (6%) · balance 12.34 CNY
|
|
42
|
+
```
|
|
43
|
+
|
|
44
|
+
Requires Node ≥ 18. Works on macOS, Linux, Windows (Git Bash + PowerShell).
|
|
44
45
|
|
|
45
46
|
---
|
|
46
47
|
|
|
47
|
-
##
|
|
48
|
+
## Using `reasonix code` — your terminal pair programmer
|
|
48
49
|
|
|
49
|
-
|
|
50
|
-
|
|
50
|
+
Scoped to the directory you launch from. The model has native
|
|
51
|
+
`read_file` / `write_file` / `edit_file` / `list_directory` /
|
|
52
|
+
`search_files` / `directory_tree` / `get_file_info` /
|
|
53
|
+
`create_directory` / `move_file` tools, all sandboxed — any path that
|
|
54
|
+
resolves outside the launch root (including `..` and symlink escapes)
|
|
55
|
+
is refused.
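A minimal sketch of the containment check described above, assuming POSIX-style paths and an absolute launch root. The helper names `normalizeSegments` and `resolveInside` are hypothetical and not taken from the Reasonix source. The check is purely lexical, so catching symlink escapes would additionally require resolving real paths on disk, which is left out here.

```ts
// Hypothetical helpers: lexical sandbox check for POSIX-style paths.
// Assumption: `root` is an absolute path such as "/home/me/proj".

// Collapse "." and ".." and drop empty segments.
function normalizeSegments(p: string): string[] {
  const out: string[] = [];
  for (const seg of p.split("/")) {
    if (seg === "" || seg === ".") continue;
    if (seg === "..") {
      out.pop(); // popping an empty list keeps us at the filesystem root
      continue;
    }
    out.push(seg);
  }
  return out;
}

// Return the normalized absolute path if it stays inside `root`, else null.
function resolveInside(root: string, candidate: string): string | null {
  const rootSegs = normalizeSegments(root);
  const joined = candidate.startsWith("/") ? candidate : `${root}/${candidate}`;
  const segs = normalizeSegments(joined);
  // Compare whole segments so "/home/me/proj2" is not mistaken for "/home/me/proj".
  const inside = rootSegs.every((seg, i) => segs[i] === seg);
  return inside ? "/" + segs.join("/") : null;
}

console.log(resolveInside("/home/me/proj", "src/users.ts"));
console.log(resolveInside("/home/me/proj", "../secrets.txt"));
```

Comparing whole path segments rather than string prefixes avoids accepting sibling directories whose names merely start with the root's name.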
|
|
51
56
|
|
|
52
|
-
|
|
53
|
-
|
|
54
|
-
|
|
55
|
-
|
|
56
|
-
| Retry with jittered backoff (429/503) | ✅ | ❌ custom callbacks |
|
|
57
|
-
| Scavenge tool calls leaked into `<think>` | ✅ | ❌ |
|
|
58
|
-
| Call-storm breaker on identical-arg repeats | ✅ | ❌ |
|
|
59
|
-
| Live cache-hit / cost / vs-Claude panel | ✅ | ❌ |
|
|
60
|
-
| First-run config prompt + Markdown TUI | ✅ | ❌ |
|
|
57
|
+
```bash
|
|
58
|
+
cd my-project
|
|
59
|
+
npx reasonix code
|
|
60
|
+
```
|
|
61
61
|
|
|
62
|
-
|
|
63
|
-
win is that **a plain chat with Reasonix already pays for ~40% less tokens
|
|
64
|
-
than the same chat through a naive LangChain setup**, because the prefix
|
|
65
|
-
actually stays byte-stable.
|
|
62
|
+
### Walkthrough: fix a bug
|
|
66
63
|
|
|
67
|
-
|
|
64
|
+
```
|
|
65
|
+
reasonix code › findByEmail in users.ts is case-sensitive, which breaks login; please make it case-insensitive
|
|
68
66
|
|
|
69
|
-
|
|
70
|
-
|
|
71
|
-
|
|
67
|
+
assistant
|
|
68
|
+
▸ R1 reasoning · 1243 chars of thought
|
|
69
|
+
▸ tool<search_files> → src/users.ts, src/users.test.ts
|
|
70
|
+
▸ tool<read_file> → (src/users.ts, 412 chars)
|
|
71
|
+
▸ tool<read_file> → (src/users.test.ts, 820 chars)
|
|
72
|
+
▸ Found it. findByEmail compares directly with ===. I'll normalize both the query and the stored value to lowercase and add a test.
|
|
72
73
|
|
|
73
|
-
|
|
74
|
-
|
|
75
|
-
|
|
76
|
-
|
|
77
|
-
|
|
78
|
-
|
|
79
|
-
|
|
74
|
+
src/users.ts
|
|
75
|
+
<<<<<<< SEARCH
|
|
76
|
+
return users.find(u => u.email === email);
|
|
77
|
+
=======
|
|
78
|
+
const needle = email.toLowerCase();
|
|
79
|
+
return users.find(u => u.email.toLowerCase() === needle);
|
|
80
|
+
>>>>>>> REPLACE
|
|
80
81
|
|
|
81
|
-
|
|
82
|
+
src/users.test.ts
|
|
83
|
+
<<<<<<< SEARCH
|
|
84
|
+
it("finds by email", () => {
|
|
85
|
+
expect(findByEmail("a@b.com")).toBeDefined();
|
|
86
|
+
});
|
|
87
|
+
=======
|
|
88
|
+
it("finds by email", () => {
|
|
89
|
+
expect(findByEmail("a@b.com")).toBeDefined();
|
|
90
|
+
});
|
|
82
91
|
|
|
83
|
-
|
|
84
|
-
|
|
85
|
-
|
|
86
|
-
|
|
87
|
-
benchmarks/tau-bench/transcripts/t01_address_happy.baseline.r1.jsonl \
|
|
88
|
-
benchmarks/tau-bench/transcripts/t01_address_happy.reasonix.r1.jsonl
|
|
89
|
-
```
|
|
92
|
+
it("is case-insensitive", () => {
|
|
93
|
+
expect(findByEmail("A@B.COM")).toBeDefined();
|
|
94
|
+
});
|
|
95
|
+
>>>>>>> REPLACE
|
|
90
96
|
|
|
91
|
-
|
|
92
|
-
|
|
93
|
-
stays byte-stable across every model call; baseline's prefix churns on
|
|
94
|
-
every turn. The cache delta is *mechanically* attributable to log
|
|
95
|
-
stability, not to a different system prompt.
|
|
97
|
+
▸ 2 pending edits across 2 files — /apply to write · /discard to drop
|
|
98
|
+
```
|
|
96
99
|
|
|
97
|
-
|
|
98
|
-
|
|
100
|
+
**At this point nothing has been written to disk.** You have three
|
|
101
|
+
options:
|
|
99
102
|
|
|
100
|
-
|
|
103
|
+
- `/apply` — commit the pending blocks. You'll see `✓ applied
|
|
104
|
+
src/users.ts` and `✓ applied src/users.test.ts`. A snapshot of the
|
|
105
|
+
pre-edit file is kept so `/undo` can roll back.
|
|
106
|
+
- `/discard` — throw the blocks away without writing.
|
|
107
|
+
- Keep chatting — ask for adjustments. Say "also cover the empty
|
|
108
|
+
string case" and the model proposes another block set.
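The propose, `/apply`, `/discard`, and `/undo` flow above can be sketched as a small state machine. This is an illustration only: the `EditSession` class and `PendingEdit` type are hypothetical, a `Map` stands in for the disk, and each pending edit is assumed to already carry the file's full new text.

```ts
// Hypothetical model of the pending-edit workflow; a Map stands in for the disk.
type PendingEdit = { path: string; after: string };

class EditSession {
  private files: Map<string, string>;
  private pending: PendingEdit[] = [];
  // One snapshot per applied batch; undefined means the file did not exist before.
  private snapshots: Array<Map<string, string | undefined>> = [];

  constructor(files: Map<string, string>) {
    this.files = files;
  }

  // Queue an edit; nothing is written yet.
  propose(edit: PendingEdit): void {
    this.pending.push(edit);
  }

  pendingCount(): number {
    return this.pending.length;
  }

  // Drop the queued edits without writing.
  discard(): void {
    this.pending = [];
  }

  // Write all queued edits, remembering the pre-edit content for undo.
  apply(): string[] {
    if (this.pending.length === 0) return [];
    const snapshot = new Map<string, string | undefined>();
    const applied: string[] = [];
    this.pending.forEach((edit) => {
      if (!snapshot.has(edit.path)) snapshot.set(edit.path, this.files.get(edit.path));
      this.files.set(edit.path, edit.after);
      applied.push(edit.path);
    });
    this.snapshots.push(snapshot);
    this.pending = [];
    return applied;
  }

  // Restore the files touched by the most recent applied batch.
  undo(): boolean {
    const snapshot = this.snapshots.pop();
    if (!snapshot) return false;
    snapshot.forEach((before, path) => {
      if (before === undefined) this.files.delete(path);
      else this.files.set(path, before);
    });
    return true;
  }
}
```

Keeping one snapshot per batch, rather than per file, matches the idea that `/undo` rolls back the last applied batch as a unit.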
|
|
101
109
|
|
|
102
|
-
|
|
110
|
+
After applying:
|
|
103
111
|
|
|
104
|
-
|
|
105
|
-
|
|
106
|
-
|
|
107
|
-
|
|
112
|
+
```
|
|
113
|
+
reasonix code › /commit "fix: findByEmail case-insensitive"
|
|
114
|
+
▸ git add -A && git commit -m "fix: findByEmail case-insensitive"
|
|
115
|
+
[main a1b2c3d] fix: findByEmail case-insensitive
|
|
116
|
+
```
|
|
108
117
|
|
|
109
|
-
|
|
110
|
-
|---|---:|---:|---:|---:|---:|
|
|
111
|
-
| bundled demo (`add` / `echo` / `get_time`) | 2 | 1 | **96.6%** (turn 2) | $0.000254 | −94.0% |
|
|
112
|
-
| official `@modelcontextprotocol/server-filesystem` | 5 | 4 | **96.7%** overall | $0.001235 | −97.0% |
|
|
113
|
-
| **both concurrently** (`demo_add` + `fs_write_file`) | 5 | 4 | **81.1%** | $0.001852 | −95.9% |
|
|
118
|
+
`/commit` runs `git add -A && git commit -m ...` from the sandbox root.
|
|
114
119
|
|
|
115
|
-
|
|
116
|
-
separate subprocesses, tools from both exercised in one conversation.
|
|
117
|
-
**One single prefix hash across all 5 turns** — byte-stability survives
|
|
118
|
-
concurrent MCP subprocesses.
|
|
120
|
+
### Walkthrough: explore before editing
|
|
119
121
|
|
|
120
|
-
|
|
122
|
+
For "what does this code do?" questions the model uses the read-side
|
|
123
|
+
tools and replies in prose — no SEARCH/REPLACE blocks, no file writes.
|
|
124
|
+
Ask to change something only when you mean it:
|
|
121
125
|
|
|
122
|
-
```bash
|
|
123
|
-
npx reasonix replay benchmarks/tau-bench/transcripts/mcp-demo.add.jsonl
|
|
124
|
-
npx reasonix replay benchmarks/tau-bench/transcripts/mcp-filesystem.jsonl
|
|
125
126
|
```
|
|
127
|
+
reasonix code › How is routing organized in this project?
|
|
128
|
+
assistant
|
|
129
|
+
▸ tool<directory_tree> → (src/ tree, 47 entries)
|
|
130
|
+
▸ tool<read_file> → (src/router.ts, 1.2 KB)
|
|
131
|
+
▸ Routing has three layers: the top-level AppRouter registers tabs, each tab uses React Router's
|
|
132
|
+
nested routes for its sub-paths, and finally …
|
|
133
|
+
```
|
|
134
|
+
|
|
135
|
+
If the SEARCH text doesn't match the file byte-for-byte, `edit_file`
|
|
136
|
+
refuses the edit loudly rather than fuzzy-matching. The model sees the
|
|
137
|
+
error and retries with the correct search text — silent wrong edits are
|
|
138
|
+
worse than visible rejections.
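The strict-match behavior described above can be illustrated with a short function. This is a sketch, not the actual `edit_file` code: `applySearchReplace` is a hypothetical name, and exact string comparison on JavaScript strings stands in for the byte-for-byte comparison.

```ts
// Hypothetical sketch of a strict SEARCH/REPLACE step: exact match or loud failure.
function applySearchReplace(content: string, search: string, replace: string): string {
  if (search.length === 0) {
    throw new Error("SEARCH text must not be empty");
  }
  const at = content.indexOf(search);
  if (at === -1) {
    // No fuzzy matching: the caller sees the error and can retry with corrected text.
    throw new Error("SEARCH text not found: edit refused");
  }
  // Slice-based splice, so "$" sequences in `replace` are inserted literally.
  return content.slice(0, at) + replace + content.slice(at + search.length);
}

const before = "return users.find(u => u.email === email);";
const after = applySearchReplace(
  before,
  "u.email === email",
  "u.email.toLowerCase() === email.toLowerCase()",
);
console.log(after);
```

A SEARCH string that differs only in whitespace, for example a doubled space, is rejected here instead of being matched approximately.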
|
|
126
139
|
|
|
127
|
-
|
|
128
|
-
(remote / hosted servers, MCP 2024-11-05 spec). Pass an `http(s)://`
|
|
129
|
-
URL to `--mcp` and Reasonix opens the SSE stream and POSTs JSON-RPC
|
|
130
|
-
to the endpoint the server advertises.
|
|
140
|
+
### Things to try
|
|
131
141
|
|
|
132
|
-
|
|
142
|
+
- `/tool 1` — dump the last tool call's full output (when the 400-char
|
|
143
|
+
inline clip isn't enough).
|
|
144
|
+
- `/think` — see the model's full R1 reasoning for the last turn
|
|
145
|
+
(reasoner preset only).
|
|
146
|
+
- `/undo` — roll back the last applied edit batch.
|
|
147
|
+
- `/new` — start fresh in the same directory without losing the
|
|
148
|
+
session file.
|
|
149
|
+
- Pass `--no-session` for an ephemeral session that doesn't persist.
|
|
150
|
+
|
|
151
|
+
```bash
|
|
152
|
+
npx reasonix code src/ # narrower sandbox (only src/ is writable)
|
|
153
|
+
npx reasonix code --no-session # ephemeral — nothing saved to disk
|
|
154
|
+
npx reasonix code --preset max # R1 reasoning + 3-way self-consistency
|
|
155
|
+
```
|
|
133
156
|
|
|
134
157
|
---
|
|
135
158
|
|
|
136
|
-
##
|
|
159
|
+
## Using `reasonix` — general chat
|
|
137
160
|
|
|
138
|
-
|
|
161
|
+
Same TUI, no filesystem tools unless you opt in via MCP. Good for
|
|
162
|
+
drafting, Q&A, schema design, architecture discussions, or driving
|
|
163
|
+
your own MCP servers. Sessions persist per name under
|
|
164
|
+
`~/.reasonix/sessions/`.
|
|
139
165
|
|
|
140
166
|
```bash
|
|
141
|
-
npx reasonix
|
|
167
|
+
npx reasonix # uses saved config + wizard-selected MCP
|
|
168
|
+
npx reasonix --preset smart # one-shot override
|
|
169
|
+
npx reasonix --session design # named session
|
|
170
|
+
npx reasonix --session design # resume it later — history intact
|
|
171
|
+
```
|
|
172
|
+
|
|
173
|
+
### Walkthrough: a multi-turn session with R1 reasoning
|
|
174
|
+
|
|
142
175
|
```
|
|
176
|
+
reasonix › /preset smart
|
|
177
|
+
▸ switched to smart · model deepseek-reasoner · harvest on · branch off
|
|
178
|
+
|
|
179
|
+
reasonix › I need to design the display rules for a limited-time discount popup in a Flutter app. Goal:
|
|
180
|
+
show it once on the first open each day, and after 3 consecutive days go dormant for 7 days. How do I implement that?
|
|
181
|
+
|
|
182
|
+
assistant
|
|
183
|
+
▸ R1 reasoning · 2410 chars of thought
|
|
184
|
+
‹ subgoals (3): persist the display count · check whether 24h have passed · check the dormancy window
|
|
185
|
+
‹ hypotheses (2): store the count in SharedPreferences · lastShownAt timestamp
|
|
186
|
+
‹ uncertainties (1): reset policy when the user switches devices
|
|
187
|
+
|
|
188
|
+
Suggested data model:
|
|
189
|
+
lastShownAt: DateTime
|
|
190
|
+
consecutiveShows: int (0..3)
|
|
191
|
+
sleepUntil: DateTime?
|
|
192
|
+
…
|
|
193
|
+
```
|
|
194
|
+
|
|
195
|
+
`/think` dumps the full R1 thought trace; `/status` shows the current
|
|
196
|
+
model / flags / context use; `/retry` re-samples the same prompt with
|
|
197
|
+
a fresh random seed (useful when the first answer missed something).
|
|
143
198
|
|
|
144
|
-
|
|
145
|
-
(fast / smart / max), then offers a multi-select checklist of MCP
|
|
146
|
-
servers — filesystem, memory, github, puppeteer, everything. Everything
|
|
147
|
-
is saved to `~/.reasonix/config.json`. Subsequent runs drop straight
|
|
148
|
-
into chat.
|
|
199
|
+
### Walkthrough: attach MCP tools on the fly
|
|
149
200
|
|
|
150
|
-
|
|
201
|
+
```bash
|
|
202
|
+
# Attach the official filesystem server sandboxed to /tmp/scratch,
|
|
203
|
+
# plus a remote knowledge-base over SSE.
|
|
204
|
+
npx reasonix \
|
|
205
|
+
--mcp "fs=npx -y @modelcontextprotocol/server-filesystem /tmp/scratch" \
|
|
206
|
+
--mcp "kb=https://mcp.example.com/sse"
|
|
207
|
+
```
|
|
151
208
|
|
|
152
|
-
|
|
153
|
-
the **context gauge** (`ctx 42k/131k (32%)` — yellow at 50%, red + a
|
|
154
|
-
`/compact` nudge at 80%). A command strip under the input lists the
|
|
155
|
-
slash commands:
|
|
209
|
+
Inside the chat:
|
|
156
210
|
|
|
157
211
|
```
|
|
158
|
-
|
|
159
|
-
|
|
160
|
-
|
|
161
|
-
|
|
162
|
-
|
|
163
|
-
|
|
164
|
-
|
|
212
|
+
reasonix › /mcp
|
|
213
|
+
▸ fs (stdio, 11 tools) fs_read_file · fs_list_directory · fs_write_file · …
|
|
214
|
+
▸ kb (sse, 4 tools) kb_search · kb_get · kb_list_collections · kb_stat
|
|
215
|
+
|
|
216
|
+
reasonix › Under /tmp/scratch, collect every line containing "ERROR" from all .log files into errors.txt
|
|
217
|
+
assistant
|
|
218
|
+
▸ tool<fs_search_files> → 4 matches
|
|
219
|
+
▸ tool<fs_read_file> → …
|
|
220
|
+
▸ tool<fs_write_file> → wrote 2.4 KB to errors.txt
|
|
221
|
+
▸ Wrote errors.txt: 38 lines in total, drawn from 4 source files.
|
|
165
222
|
```
|
|
166
223
|
|
|
167
|
-
|
|
168
|
-
|
|
169
|
-
|
|
224
|
+
MCP tools go through the same Cache-First + repair + context-safety
|
|
225
|
+
plumbing as native tools, including the 32k result cap and live
|
|
226
|
+
progress-notification rendering.
|
|
170
227
|
|
|
171
|
-
|
|
172
|
-
every message appended atomically, so killing the CLI never loses
|
|
173
|
-
context. Oversized tool results auto-heal on load, so poisoning a
|
|
174
|
-
session with one giant `read_file` doesn't brick your history.
|
|
228
|
+
### When to use `reasonix` vs `reasonix code`
|
|
175
229
|
|
|
176
|
-
|
|
230
|
+
| situation | command |
|
|
231
|
+
|---|---|
|
|
232
|
+
| Editing files in the current project | `reasonix code` |
|
|
233
|
+
| Exploring a project without writing files | `reasonix code` (it only writes on `/apply`) |
|
|
234
|
+
| Design / architecture / research chat | `reasonix` |
|
|
235
|
+
| Driving your own MCP servers | `reasonix --mcp "..."` |
|
|
236
|
+
| One-shot question, no TUI | `reasonix run "..."` |
|
|
237
|
+
| Reproducing a prior session / benchmark | `reasonix replay path.jsonl` |
|
|
177
238
|
|
|
178
|
-
|
|
179
|
-
`cwd`, coding system prompt, reasoner preset, per-directory session.
|
|
180
|
-
The model proposes edits as **SEARCH/REPLACE blocks**:
|
|
239
|
+
---
|
|
181
240
|
|
|
241
|
+
## Commands inside the session
|
|
242
|
+
|
|
243
|
+
| command | what it does |
|
|
244
|
+
|---|---|
|
|
245
|
+
| `/help` | full command reference with hints |
|
|
246
|
+
| `/status` | current model · flags · context · session |
|
|
247
|
+
| `/preset <fast\|smart\|max>` | one-tap bundle (model + harvest + branch) |
|
|
248
|
+
| `/model <id>` | switch DeepSeek model (`deepseek-chat`, `deepseek-reasoner`) |
|
|
249
|
+
| `/harvest [on\|off]` | toggle R1 plan-state extraction |
|
|
250
|
+
| `/branch <N\|off>` | run N parallel samples per turn, pick best (N ≥ 2) |
|
|
251
|
+
| `/mcp` | list attached MCP servers and their tools |
|
|
252
|
+
| `/tool [N]` | dump the Nth tool call's full output (1 = latest) |
|
|
253
|
+
| `/think` | dump the last turn's full R1 reasoning |
|
|
254
|
+
| `/retry` | truncate and resend your last message (fresh sample) |
|
|
255
|
+
| `/compact [cap]` | shrink oversized tool results in the log |
|
|
256
|
+
| `/sessions` | list saved sessions (current marked with `▸`) |
|
|
257
|
+
| `/forget` | delete the current session from disk |
|
|
258
|
+
| `/new` (alias `/reset`) | start a fresh conversation in the same session |
|
|
259
|
+
| `/clear` | clear visible scrollback only (log kept) |
|
|
260
|
+
| `/setup` | reconfigure (exit and run `reasonix setup`) |
|
|
261
|
+
| `/exit` | quit |
|
|
262
|
+
|
|
263
|
+
Additional commands in `reasonix code`:
|
|
264
|
+
|
|
265
|
+
| command | what it does |
|
|
266
|
+
|---|---|
|
|
267
|
+
| `/apply` | commit the pending SEARCH/REPLACE blocks to disk |
|
|
268
|
+
| `/discard` | drop the pending edit blocks without writing |
|
|
269
|
+
| `/undo` | roll back the last applied edit batch |
|
|
270
|
+
| `/commit "msg"` | `git add -A && git commit -m "msg"` |
|
|
271
|
+
|
|
272
|
+
**Keyboard:**
|
|
273
|
+
|
|
274
|
+
- `Enter` — submit
|
|
275
|
+
- `Shift+Enter` / `Ctrl+J` — newline (multi-line paste also supported; `\` + Enter as a portable fallback)
|
|
276
|
+
- `↑ / ↓` — walk prompt history while idle; navigate slash-autocomplete matches
|
|
277
|
+
- `Tab` / `Enter` on a `/foo` prefix — accept the highlighted suggestion
|
|
278
|
+
- `Esc` — abort the current turn (stops the API call, cancels any in-flight tool, rejects pending MCP requests)
|
|
279
|
+
- `y` / `n` on confirm prompts — hotkey accept / reject
|
|
280
|
+
|
|
281
|
+
---
|
|
282
|
+
|
|
283
|
+
## Sessions and safety nets
|
|
284
|
+
|
|
285
|
+
- Sessions live as JSONL under `~/.reasonix/sessions/<name>.jsonl` (per
|
|
286
|
+
directory for `reasonix code`). Every message is appended atomically; `Ctrl+C`
|
|
287
|
+
never loses context.
|
|
288
|
+
- Tool results are capped at 32k chars per call. Oversized sessions
|
|
289
|
+
self-heal on load (shrinks + rewrites the file).
|
|
290
|
+
- Malformed `assistant.tool_calls` / `tool` pairing is validated on
|
|
291
|
+
every outgoing API call so a corrupted session can't keep 400ing.
|
|
292
|
+
- Context gauge turns yellow at 50%, red at 80% with a `/compact` nudge.
|
|
293
|
+
Approaching the 131k window triggers an automatic compaction attempt
|
|
294
|
+
before falling back to a forced summary.
|
|
295
|
+
- The model's sandbox in `reasonix code` refuses any path that resolves
|
|
296
|
+
outside the launch directory, including symlink escape and `..` traversal.
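Two of the safety nets listed above, the per-call tool-result cap and the context-gauge thresholds, can be sketched as follows. The names `capToolResult` and `gaugeLevel` are hypothetical. Treating "32k" as 32000 characters, the inclusive threshold comparisons, and the truncation marker text are all assumptions made for illustration.

```ts
// Hypothetical sketch; the constants and boundary behavior are assumptions.
type GaugeLevel = "normal" | "yellow" | "red";

// Map context usage to a gauge color: yellow from 50%, red from 80%.
function gaugeLevel(usedTokens: number, windowTokens: number): GaugeLevel {
  const ratio = usedTokens / windowTokens;
  if (ratio >= 0.8) return "red";
  if (ratio >= 0.5) return "yellow";
  return "normal";
}

// Clip an oversized tool result and say how much was dropped.
// Note: the marker is appended after the cap, so the result is slightly longer than `cap`.
function capToolResult(text: string, cap: number = 32000): string {
  if (text.length <= cap) return text;
  const dropped = text.length - cap;
  return text.slice(0, cap) + `\n[truncated ${dropped} chars]`;
}

console.log(gaugeLevel(42000, 131000));
console.log(capToolResult("x".repeat(40000)).length);
```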
|
|
297
|
+
|
|
298
|
+
### Troubleshooting: duplicate rows / ghost rendering
|
|
299
|
+
|
|
300
|
+
Some Windows terminals (Git Bash / MINTTY / winpty-wrapped shells)
|
|
301
|
+
don't fully implement the ANSI cursor-up escapes Ink uses to repaint
|
|
302
|
+
the live spinner region. Symptom: spinners, streaming previews, or
|
|
303
|
+
tool-result rows print multiple copies into scrollback instead of
|
|
304
|
+
overwriting in place.
|
|
305
|
+
|
|
306
|
+
If you hit this, run with plain mode:
|
|
307
|
+
|
|
308
|
+
```bash
|
|
309
|
+
REASONIX_UI=plain npx reasonix code
|
|
310
|
+
# or
|
|
311
|
+
REASONIX_UI=plain npx reasonix
|
|
182
312
|
```
|
|
183
|
-
src/foo.ts
|
|
184
|
-
<<<<<<< SEARCH
|
|
185
|
-
const x = 1;
|
|
186
|
-
=======
|
|
187
|
-
const x = 2;
|
|
188
|
-
>>>>>>> REPLACE
|
|
189
|
-
```
|
|
190
313
|
|
|
191
|
-
|
|
192
|
-
|
|
193
|
-
|
|
194
|
-
|
|
314
|
+
Plain mode suppresses every live/animated row and disables the
|
|
315
|
+
internal tick timer. You lose the streaming preview and spinners
|
|
316
|
+
but gain stable scrollback. Committed events (your prompts, tool
|
|
317
|
+
results, the model's final responses) still render normally via
|
|
318
|
+
Ink's `<Static>` append path.
|
|
319
|
+
|
|
320
|
+
Windows Terminal, PowerShell 7 in Windows Terminal, and WezTerm
|
|
321
|
+
don't need this opt-out.
|
|
322
|
+
|
|
323
|
+
---
|
|
324
|
+
|
|
325
|
+
## MCP — bring your own tools
|
|
326
|
+
|
|
327
|
+
Any [MCP](https://spec.modelcontextprotocol.io/) server works. The wizard
|
|
328
|
+
lets you pick from a catalog, or drive it by flag:
|
|
195
329
|
|
|
196
330
|
```bash
|
|
197
|
-
|
|
198
|
-
npx reasonix
|
|
199
|
-
|
|
200
|
-
|
|
331
|
+
# stdio (local subprocess)
|
|
332
|
+
npx reasonix --mcp "fs=npx -y @modelcontextprotocol/server-filesystem /tmp/safe"
|
|
333
|
+
|
|
334
|
+
# multiple servers at once
|
|
335
|
+
npx reasonix \
|
|
336
|
+
--mcp "fs=npx -y @modelcontextprotocol/server-filesystem /tmp/safe" \
|
|
337
|
+
--mcp "demo=npx tsx examples/mcp-server-demo.ts"
|
|
338
|
+
|
|
339
|
+
# HTTP+SSE (remote / hosted)
|
|
340
|
+
npx reasonix --mcp "kb=https://mcp.example.com/sse"
|
|
201
341
|
```
|
|
202
342
|
|
|
203
|
-
|
|
204
|
-
|
|
205
|
-
|
|
206
|
-
|
|
343
|
+
`reasonix mcp list` shows the curated catalog. `reasonix mcp inspect <spec>`
|
|
344
|
+
connects once and dumps the server's tools / resources / prompts without
|
|
345
|
+
starting a chat. Progress notifications from long-running tools (2025-03-26
|
|
346
|
+
spec) render live as a progress bar in the spinner.
|
|
347
|
+
|
|
348
|
+
Supported transports: **stdio** (local command) and **HTTP+SSE** (remote,
|
|
349
|
+
MCP 2024-11-05 spec).
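As an illustration of how a `--mcp "name=…"` value might be split into a server name and a transport, here is a minimal sketch. `parseMcpSpec` and the `McpSpec` type are hypothetical, and the naive whitespace splitting is an assumption: the real CLI may handle quoting and escaping differently.

```ts
// Hypothetical parser for "name=command args…" and "name=https://…" specs.
type McpSpec =
  | { name: string; transport: "stdio"; command: string; args: string[] }
  | { name: string; transport: "sse"; url: string };

function parseMcpSpec(spec: string): McpSpec {
  const eq = spec.indexOf("=");
  if (eq <= 0) {
    throw new Error(`expected "name=command" or "name=url", got: ${spec}`);
  }
  const name = spec.slice(0, eq).trim();
  const target = spec.slice(eq + 1).trim();
  if (target.length === 0) {
    throw new Error(`missing command or URL for server "${name}"`);
  }
  // An http(s) URL selects the HTTP+SSE transport; anything else is a local command.
  if (/^https?:\/\//i.test(target)) {
    return { name, transport: "sse", url: target };
  }
  // Assumption: arguments are separated by plain whitespace (no quoting support).
  const [command, ...args] = target.split(/\s+/);
  return { name, transport: "stdio", command, args };
}

console.log(parseMcpSpec("fs=npx -y @modelcontextprotocol/server-filesystem /tmp/safe"));
console.log(parseMcpSpec("kb=https://mcp.example.com/sse"));
```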
|
|
350
|
+
|
|
351
|
+
---
|
|
207
352
|
|
|
208
|
-
|
|
353
|
+
## CLI reference
|
|
209
354
|
|
|
210
355
|
```bash
|
|
211
|
-
npx reasonix
|
|
212
|
-
npx reasonix
|
|
213
|
-
npx reasonix
|
|
356
|
+
npx reasonix # chat (uses saved config)
|
|
357
|
+
npx reasonix code [path] # coding mode scoped to path (default: cwd)
|
|
358
|
+
npx reasonix setup # reconfigure the wizard
|
|
359
|
+
npx reasonix chat --session work # named session
|
|
360
|
+
npx reasonix chat --no-session # ephemeral
|
|
214
361
|
npx reasonix run "ask anything" # one-shot, streams to stdout
|
|
215
362
|
npx reasonix stats session.jsonl # summarize a transcript
|
|
216
|
-
npx reasonix replay chat.jsonl #
|
|
363
|
+
npx reasonix replay chat.jsonl # rebuild cost/cache from a transcript
|
|
217
364
|
npx reasonix diff a.jsonl b.jsonl --md # compare two transcripts
|
|
218
|
-
npx reasonix mcp list # curated MCP
|
|
365
|
+
npx reasonix mcp list # curated MCP catalog
|
|
366
|
+
npx reasonix mcp inspect <spec> # probe a single MCP server
|
|
367
|
+
npx reasonix sessions # list saved sessions
|
|
368
|
+
```
|
|
369
|
+
|
|
370
|
+
Common flags:
|
|
371
|
+
|
|
372
|
+
```bash
|
|
373
|
+
--preset <fast|smart|max> # bundle (model + harvest + branch)
|
|
374
|
+
--model <id> # explicit model id
|
|
375
|
+
--harvest / --no-harvest # R1 plan-state extraction
|
|
376
|
+
--branch <N> # self-consistency budget
|
|
377
|
+
--mcp "name=cmd args…" # attach an MCP server (repeatable)
|
|
378
|
+
--transcript path.jsonl # write a JSONL transcript on the side
|
|
379
|
+
--session <name> # named session (default: per-dir for code mode)
|
|
380
|
+
--no-session # ephemeral
|
|
381
|
+
--no-config # ignore ~/.reasonix/config.json (CI-friendly)
|
|
219
382
|
```
|
|
220
383
|
|
|
221
|
-
|
|
384
|
+
Env vars (win over config):
|
|
222
385
|
|
|
223
386
|
```bash
|
|
224
|
-
|
|
225
|
-
|
|
226
|
-
--mcp "filesystem=npx -y @modelcontextprotocol/server-filesystem /tmp/safe" \
|
|
227
|
-
--mcp "kb=https://mcp.example.com/sse" \
|
|
228
|
-
--transcript session.jsonl \
|
|
229
|
-
--no-config # ignore ~/.reasonix/config.json (for CI / reproducing issues)
|
|
387
|
+
export DEEPSEEK_API_KEY=sk-...
|
|
388
|
+
export DEEPSEEK_BASE_URL=https://... # optional alternate endpoint
|
|
230
389
|
```
|
|
231
390
|
|
|
232
|
-
|
|
391
|
+
---
|
|
392
|
+
|
|
393
|
+
## Library usage
|
|
233
394
|
|
|
234
395
|
```ts
|
|
235
396
|
import {
|
|
@@ -261,7 +422,7 @@ const loop = new CacheFirstLoop({
|
|
|
261
422
|
toolSpecs: tools.specs(),
|
|
262
423
|
}),
|
|
263
424
|
harvest: true,
|
|
264
|
-
branch: 3,
|
|
425
|
+
branch: 3,
|
|
265
426
|
});
|
|
266
427
|
|
|
267
428
|
for await (const ev of loop.step("What is 17 + 25?")) {
|
|
@@ -270,27 +431,72 @@ for await (const ev of loop.step("What is 17 + 25?")) {
|
|
|
270
431
|
console.log(loop.stats.summary());
|
|
271
432
|
```
|
|
272
433
|
|
|
273
|
-
|
|
434
|
+
`ChatOptions.seedTools` accepts a pre-built `ToolRegistry` for callers
|
|
435
|
+
who want the `reasonix code` loop wiring without the CLI wrapper.
|
|
436
|
+
See [docs/ARCHITECTURE.md](docs/ARCHITECTURE.md) for internals.
|
|
437
|
+
|
|
438
|
+
---
|
|
274
439
|
|
|
275
|
-
|
|
276
|
-
|
|
440
|
+
## Why Reasonix (not LangChain)
|
|
441
|
+
|
|
442
|
+
Every abstraction here earns its weight against a DeepSeek-specific
|
|
443
|
+
property — dirt-cheap tokens, R1 reasoning traces, automatic prefix
|
|
444
|
+
caching, JSON mode. Generic wrappers leave these on the table.
|
|
445
|
+
|
|
446
|
+
| | Reasonix default | generic frameworks |
|
|
447
|
+
|---|---|---|
|
|
448
|
+
| Prefix-stable loop (→ 85–95% cache hit) | yes | no (prompts rebuilt each turn) |
|
|
449
|
+
| Auto-flatten deep tool schemas | yes | no (DeepSeek drops args) |
|
|
450
|
+
| Retry with jittered backoff (429/503) | yes | no (custom callbacks) |
|
|
451
|
+
| Scavenge tool calls leaked into `<think>` | yes | no |
|
|
452
|
+
| Call-storm breaker on identical-arg repeats | yes | no |
|
|
453
|
+
| Live cache-hit / cost / vs-Claude panel | yes | no |
|
|
454
|
+
| First-run config prompt + Markdown TUI | yes | no |
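The "retry with jittered backoff" row can be made concrete with a short sketch. The retryable status list (408/429/500/502/503/504) comes from the 0.4.13 README wording; the function names, the base delay, the cap, and the choice of full jitter are assumptions rather than Reasonix's actual values.

```ts
// Hypothetical sketch of exponential backoff with full jitter.
const RETRYABLE_STATUSES = new Set<number>([408, 429, 500, 502, 503, 504]);

// Auth-style 4xx errors (401, 403, ...) are not in the set, so they fail fast.
function shouldRetry(status: number): boolean {
  return RETRYABLE_STATUSES.has(status);
}

// attempt = 0 for the first retry. baseMs and capMs are illustrative defaults.
function backoffDelayMs(
  attempt: number,
  baseMs: number = 500,
  capMs: number = 8000,
  rand: () => number = Math.random,
): number {
  const ceiling = Math.min(capMs, baseMs * Math.pow(2, attempt));
  // Full jitter: pick uniformly in [0, ceiling) so concurrent clients spread out.
  return Math.floor(rand() * ceiling);
}

console.log(shouldRetry(429), shouldRetry(401));
console.log(backoffDelayMs(3, 500, 8000, () => 0.5));
```

Passing the random source as a parameter keeps the delay calculation deterministic under test.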
|
|
455
|
+
|
|
456
|
+
On the same τ-bench-lite workload — 8 multi-turn tool-use tasks × 3
|
|
457
|
+
repeats = 24 runs per side (48 in total), live DeepSeek `deepseek-chat`, sole variable
|
|
458
|
+
prefix stability:
|
|
459
|
+
|
|
460
|
+
| metric | baseline (cache-hostile) | Reasonix | delta |
|
|
461
|
+
|---|---:|---:|---:|
|
|
462
|
+
| cache hit | 46.6% | **94.4%** | +47.7 pp |
|
|
463
|
+
| cost / task | $0.002599 | $0.001579 | **−39%** |
|
|
464
|
+
| pass rate | 96% (23/24) | **100% (24/24)** | — |
|
|
465
|
+
|
|
466
|
+
**Verify it yourself — no API key, zero cost:**
|
|
277
467
|
|
|
278
468
|
```bash
|
|
279
|
-
|
|
280
|
-
|
|
469
|
+
git clone https://github.com/esengine/reasonix.git && cd reasonix && npm install
|
|
470
|
+
npx reasonix replay benchmarks/tau-bench/transcripts/t01_address_happy.reasonix.r1.jsonl
|
|
471
|
+
npx reasonix diff \
|
|
472
|
+
benchmarks/tau-bench/transcripts/t01_address_happy.baseline.r1.jsonl \
|
|
473
|
+
benchmarks/tau-bench/transcripts/t01_address_happy.reasonix.r1.jsonl
|
|
281
474
|
```
|
|
282
475
|
|
|
283
|
-
|
|
476
|
+
The committed JSONL transcripts carry per-turn `usage`, `cost`, and
|
|
477
|
+
`prefixHash`. Reasonix's prefix hash stays byte-stable across every
|
|
478
|
+
model call; baseline's churns on every turn. The cache delta is
|
|
479
|
+
*mechanically* attributable to log stability, not to a different
|
|
480
|
+
system prompt.
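To show what a byte-stable prefix hash means in practice, here is a toy sketch. The FNV-1a hash, the `prefixHash` helper, and the per-turn counter used for the churning case are all assumptions chosen for illustration; the hash Reasonix records in its transcripts and the exact behavior of the baseline are not specified here.

```ts
// Toy 32-bit FNV-1a over UTF-16 code units (illustrative, not cryptographic).
function fnv1a(text: string): string {
  let h = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    h ^= text.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h.toString(16);
}

// Hash only the part of the request meant to stay fixed: system prompt + tool specs.
function prefixHash(systemPrompt: string, toolSpecs: string[]): string {
  return fnv1a(JSON.stringify([systemPrompt, toolSpecs]));
}

const tools = ["add", "echo", "get_time"];

// Stable loop: the prefix is identical on every turn; only the message log grows.
const stable = [1, 2, 3].map(() => prefixHash("You are a helpful assistant.", tools));

// Churning loop (assumed): something turn-specific leaks into the prompt.
const churning = [1, 2, 3].map((turn) =>
  prefixHash(`You are a helpful assistant. Turn ${turn}.`, tools),
);

console.log(new Set(stable).size);   // number of distinct prefix hashes, stable loop
console.log(new Set(churning).size); // number of distinct prefix hashes, churning loop
```

A prompt prefix that hashes identically on every turn is what allows the provider's automatic prefix cache to be reused from one call to the next.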
|
|
481
|
+
|
|
482
|
+
Full 48-run report: [`benchmarks/tau-bench/report.md`](./benchmarks/tau-bench/report.md).
|
|
483
|
+
Reproduce with your own API key: `npx tsx benchmarks/tau-bench/runner.ts --repeats 3`.
|
|
284
484
|
|
|
285
|
-
|
|
286
|
-
|
|
485
|
+
MCP reference runs (one single prefix hash across all 5 turns even
|
|
486
|
+
with two concurrent MCP subprocesses):
|
|
487
|
+
|
|
488
|
+
| server | turns | cache hit | cost | vs Claude |
|
|
489
|
+
|---|---:|---:|---:|---:|
|
|
490
|
+
| bundled demo (`add` / `echo` / `get_time`) | 2 | **96.6%** (turn 2) | $0.000254 | −94.0% |
|
|
491
|
+
| official `server-filesystem` | 5 | **96.7%** | $0.001235 | −97.0% |
|
|
492
|
+
| **both concurrently** | 5 | **81.1%** | $0.001852 | −95.9% |
|
|
287
493
|
|
|
288
494
|
---
|
|
289
495
|
|
|
290
496
|
## Non-goals
|
|
291
497
|
|
|
292
498
|
- Multi-agent orchestration (use LangGraph).
|
|
293
|
-
- RAG / vector stores (use LlamaIndex
|
|
499
|
+
- RAG / vector stores (use LlamaIndex).
|
|
294
500
|
- Multi-provider abstraction (use LiteLLM).
|
|
295
501
|
- Web UI / SaaS.
|
|
296
502
|
|
|
@@ -306,13 +512,11 @@ cd reasonix
|
|
|
306
512
|
npm install
|
|
307
513
|
npm run dev chat # run CLI from source via tsx
|
|
308
514
|
npm run build # tsup to dist/
|
|
309
|
-
npm test # vitest (
|
|
515
|
+
npm test # vitest (444 tests)
|
|
310
516
|
npm run lint # biome
|
|
311
517
|
npm run typecheck # tsc --noEmit
|
|
312
518
|
```
|
|
313
519
|
|
|
314
|
-
See [docs/ARCHITECTURE.md](docs/ARCHITECTURE.md) for internals.
|
|
315
|
-
|
|
316
520
|
---
|
|
317
521
|
|
|
318
522
|
## License
|