reasonix 0.18.1 → 0.20.0
- package/README.md +63 -917
- package/README.zh-CN.md +66 -837
- package/dashboard/app.css +1987 -2416
- package/dashboard/dist/app.js +24639 -0
- package/dashboard/dist/app.js.map +1 -0
- package/dashboard/dist/vendor-hljs.css +10 -0
- package/dashboard/dist/vendor-uplot.css +1 -0
- package/dashboard/index.html +2 -2
- package/dist/cli/{chunk-RTVI2CLX.js → chunk-R2L5YEEF.js} +23 -11
- package/dist/cli/chunk-R2L5YEEF.js.map +1 -0
- package/dist/cli/index.js +5907 -3505
- package/dist/cli/index.js.map +1 -1
- package/dist/cli/{prompt-P54FIQAH.js → prompt-YUL7CYKY.js} +2 -2
- package/dist/index.d.ts +7 -1
- package/dist/index.js +23 -11
- package/dist/index.js.map +1 -1
- package/package.json +11 -29
- package/dashboard/app.js +0 -4768
- package/dashboard/codemirror.js +0 -36
- package/dist/cli/chunk-RTVI2CLX.js.map +0 -1
- package/dist/cli/{prompt-P54FIQAH.js.map → prompt-YUL7CYKY.js.map} +0 -0
package/README.md
CHANGED
|
@@ -3,975 +3,121 @@
|
|
|
3
3
|
</p>
|
|
4
4
|
|
|
5
5
|
<p align="center">
|
|
6
|
-
<
|
|
6
|
+
<strong>English</strong> · <a href="./README.zh-CN.md">简体中文</a> · <a href="https://esengine.github.io/reasonix/">Website</a>
|
|
7
7
|
</p>
|
|
8
8
|
|
|
9
9
|
<p align="center">
|
|
10
|
-
<
|
|
10
|
+
<a href="https://www.npmjs.com/package/reasonix"><img src="https://img.shields.io/npm/v/reasonix.svg" alt="npm version"/></a>
|
|
11
|
+
<a href="https://github.com/esengine/reasonix/actions/workflows/ci.yml"><img src="https://github.com/esengine/reasonix/actions/workflows/ci.yml/badge.svg" alt="CI"/></a>
|
|
12
|
+
<a href="./LICENSE"><img src="https://img.shields.io/npm/l/reasonix.svg" alt="license"/></a>
|
|
13
|
+
<a href="https://www.npmjs.com/package/reasonix"><img src="https://img.shields.io/npm/dm/reasonix.svg" alt="downloads"/></a>
|
|
14
|
+
<a href="./package.json"><img src="https://img.shields.io/node/v/reasonix.svg" alt="node"/></a>
|
|
15
|
+
<a href="https://github.com/esengine/reasonix/stargazers"><img src="https://img.shields.io/github/stars/esengine/reasonix.svg?style=flat&logo=github&label=stars" alt="GitHub stars"/></a>
|
|
16
|
+
<a href="https://github.com/esengine/reasonix/discussions"><img src="https://img.shields.io/github/discussions/esengine/reasonix.svg?logo=github&label=discussions" alt="Discussions"/></a>
|
|
11
17
|
</p>
|
|
12
18
|
|
|
13
|
-
|
|
14
|
-
|
|
15
|
-
|
|
16
|
-
[](https://github.com/esengine/reasonix/actions/workflows/ci.yml)
|
|
17
|
-
[](./LICENSE)
|
|
18
|
-
[](https://www.npmjs.com/package/reasonix)
|
|
19
|
-
[](./package.json)
|
|
19
|
+
<p align="center">
|
|
20
|
+
<strong>A DeepSeek-native AI coding agent for your terminal.</strong> Engineered around DeepSeek's prefix-cache, so the savings are real and the loop stays cheap enough to leave on.
|
|
21
|
+
</p>
|
|
20
22
|
|
|
21
|
-
|
|
22
|
-
per task
|
|
23
|
-
|
|
24
|
-
MIT-licensed. No IDE lock-in. MCP first-class.
|
|
23
|
+
<p align="center">
|
|
24
|
+
<img src="docs/assets/hero-stats.svg" alt="94% live prefix-cache hit · ~30× cheaper per task vs Claude Code · MIT terminal-native" width="860"/>
|
|
25
|
+
</p>
|
|
25
26
|
|
|
26
27
|
---
|
|
27
28
|
|
|
28
|
-
## Quick start
|
|
29
|
-
|
|
30
|
-
**1. Get a DeepSeek API key.** Free credit on signup:
|
|
31
|
-
<https://platform.deepseek.com/api_keys>
|
|
32
|
-
|
|
33
|
-
**2. Point it at a project.** No install needed.
|
|
29
|
+
## Quick start
|
|
34
30
|
|
|
35
31
|
```bash
|
|
36
32
|
cd my-project
|
|
37
|
-
npx reasonix code
|
|
38
|
-
```
|
|
39
|
-
|
|
40
|
-
First run walks you through a 30-second wizard (paste API key → pick
|
|
41
|
-
preset → multi-select MCP servers). Every run after that drops you
|
|
42
|
-
straight in.
|
|
43
|
-
|
|
44
|
-
**3. Ask it to edit.** The model proposes edits as SEARCH/REPLACE
|
|
45
|
-
blocks — nothing hits disk until you `/apply`.
|
|
46
|
-
|
|
33
|
+
npx reasonix code # paste a DeepSeek API key on first run; persists after
|
|
47
34
|
```
|
|
48
|
-
reasonix code › findByEmail in users.ts is case-sensitive and breaks login, please fix it
|
|
49
35
|
|
|
50
|
-
|
|
51
|
-
|
|
52
|
-
|
|
53
|
-
▸ Found it. findByEmail compares with === directly. Switching to lowercase normalization and adding a test.
|
|
54
|
-
|
|
55
|
-
src/users.ts
|
|
56
|
-
<<<<<<< SEARCH
|
|
57
|
-
return users.find(u => u.email === email);
|
|
58
|
-
=======
|
|
59
|
-
const needle = email.toLowerCase();
|
|
60
|
-
return users.find(u => u.email.toLowerCase() === needle);
|
|
61
|
-
>>>>>>> REPLACE
|
|
62
|
-
|
|
63
|
-
▸ 1 pending edit across 1 file — /apply to write · /discard to drop
|
|
64
|
-
|
|
65
|
-
reasonix code › /apply
|
|
66
|
-
▸ ✓ applied src/users.ts
|
|
67
|
-
```
|
|
36
|
+
<p align="center">
|
|
37
|
+
<img src="docs/assets/hero-terminal.svg" alt="Reasonix code mode — assistant proposes a SEARCH/REPLACE edit; nothing on disk until /apply" width="860"/>
|
|
38
|
+
</p>
|
|
68
39
|
|
|
69
|
-
Requires Node ≥
|
|
70
|
-
Windows Terminal). Press `Esc` anytime to abort; `/help` for the full
|
|
71
|
-
command list.
|
|
40
|
+
Requires Node ≥ 22. Tested on macOS, Linux, and Windows (PowerShell, Git Bash, Windows Terminal). Get a [DeepSeek API key →](https://platform.deepseek.com/api_keys) · `reasonix code --help` for flags.
|
|
72
41
|
|
|
73
42
|
---
|
|
74
43
|
|
|
75
|
-
##
|
|
44
|
+
## How it compares
|
|
76
45
|
|
|
77
|
-
|
|
|
78
|
-
|
|
79
|
-
| Backend
|
|
80
|
-
| Cost / typical task
|
|
81
|
-
|
|
|
82
|
-
|
|
|
83
|
-
|
|
|
84
|
-
|
|
|
85
|
-
| MCP servers | **first-class**| first-class | — | — |
|
|
46
|
+
| | Reasonix | Claude Code | Cursor | Aider |
|
|
47
|
+
|-----------------------------------|------------------|-----------------|--------------------|------------------|
|
|
48
|
+
| Backend | DeepSeek V4 | Anthropic | OpenAI / Anthropic | any (OpenRouter) |
|
|
49
|
+
| **Cost / typical task** | **~¥0.01–0.04** | ~¥0.40–4 | ¥150/mo + usage | varies |
|
|
50
|
+
| License | **MIT** | closed | closed | Apache 2 |
|
|
51
|
+
| **DeepSeek prefix-cache hit** | **94%** (live) | n/a | n/a | ~33% (baseline) |
|
|
52
|
+
| Embedded web dashboard | yes | — | n/a (IDE) | — |
|
|
53
|
+
| Persistent per-workspace sessions | yes | partial | n/a | — |
|
|
86
54
|
|
|
87
|
-
|
|
88
|
-
3 repeats, live `deepseek-chat`). Same workload, sole variable is
|
|
89
|
-
prefix stability — committed transcripts in [`benchmarks/`](./benchmarks/).
|
|
90
|
-
The full feature comparison [is below](#why-reasonix-vs-cursor--claude-code--cline--aider).
|
|
55
|
+
Plan mode, edit review, MCP, skills, hooks, and sandboxing are all `yes` for Reasonix and most peers — see the feature grid below for what they actually do here.
|
|
91
56
|
|
|
92
|
-
|
|
93
|
-
|
|
94
|
-
## Web dashboard
|
|
95
|
-
|
|
96
|
-
Type `/dashboard` inside any session and Reasonix prints a localhost
|
|
97
|
-
URL with a one-time token. Open it for a 13-tab control surface that
|
|
98
|
-
mirrors the running TUI — chat (with live streaming), the editor (file
|
|
99
|
-
tree + CodeMirror, syntax highlighting + autocomplete + side-by-side
|
|
100
|
-
diff for pending edits), Usage / Sessions / Plans / Tools /
|
|
101
|
-
Permissions / System / MCP / Skills / Memory / Hooks / Settings.
|
|
57
|
+
Numbers from `benchmarks/tau-bench-lite` (8 multi-turn tasks × 3 repeats, live `deepseek-chat`). [Committed transcripts →](./benchmarks/)
|
|
102
58
|
|
|
103
|
-
|
|
104
|
-
|
|
105
|
-
▸ http://127.0.0.1:54219/?token=… (open in browser)
|
|
106
|
-
```
|
|
59
|
+
<details>
|
|
60
|
+
<summary><strong>Why DeepSeek-only? — the cache economics</strong></summary>
|
|
107
61
|
|
|
108
|
-
|
|
109
|
-
mutation is CSRF-checked. The TUI keeps working — modals (shell
|
|
110
|
-
confirms, plan reviews, edit gates) mirror to whichever surface you
|
|
111
|
-
look at first. No build step, no Electron, no separate process to
|
|
112
|
-
keep alive.
|
|
62
|
+
Cheap tokens alone are only half the story. DeepSeek's prefix-cache is **byte-stable**: the cache fingerprints the prompt from byte 0. Reasonix's loop is engineered around that (append-only growth, no re-ordering, no marker-based compaction), so the cache prefix survives every tool call.
|
|
113
63
|
|
|
114
|
-
|
|
64
|
+
By comparison, Claude Code is built around Anthropic's `cache_control` markers (a fundamentally different mechanic). Pointing it at DeepSeek's Anthropic-compat endpoint keeps the cheap tokens but loses the cache hits — markers are ignored, and the underlying prefix isn't byte-stable. Generic-backend tools (Aider / Cline / Continue) hit the same wall from the other direction: their compaction patterns destroy byte stability.
|
|
115
65
|
|
|
116
|
-
|
|
117
|
-
|
|
118
|
-
Three things you'd come to Reasonix for, that nothing else combines:
|
|
119
|
-
|
|
120
|
-
- **Cost economics that land in your bill.** DeepSeek V4 is ~30×
|
|
121
|
-
cheaper than Claude Sonnet per token. Cheap tokens alone isn't the
|
|
122
|
-
win — *cheap tokens with a 90%+ prefix-cache hit* is. Reasonix's
|
|
123
|
-
loop is engineered around append-only prompt growth so the
|
|
124
|
-
cache-stable prefix survives every tool call. The benchmarks
|
|
125
|
-
section verifies this end-to-end: 90.2% live cache hit, versus
|
|
126
|
-
32.8% for a generic harness on the same workload. The `/stats`
|
|
127
|
-
panel surfaces "vs Claude Sonnet 4.6" savings on every turn.
|
|
128
|
-
|
|
129
|
-
- **It lives in your terminal.** Pure CLI — no Electron, no VS Code
|
|
130
|
-
extension, no IDE plugin to wedge into your editor. Sits next to
|
|
131
|
-
git, tmux, and your shell history. macOS / Linux / Windows
|
|
132
|
-
(PowerShell, Git Bash, Windows Terminal all tested). The only
|
|
133
|
-
network call is to the DeepSeek API itself; no vendor server in
|
|
134
|
-
the middle.
|
|
135
|
-
|
|
136
|
-
- **Open source and hackable, end to end.** MIT-licensed TypeScript.
|
|
137
|
-
The entire loop, tool registry, cache-stable prefix, TUI, MCP
|
|
138
|
-
bridge — all in `src/` under 30k lines. Fork it, ship a private
|
|
139
|
-
build, drop it into CI. No SaaS layer, no enterprise tier, no
|
|
140
|
-
feature gates.
|
|
141
|
-
|
|
142
|
-
| | Reasonix | Claude Code | Cursor | Cline | Aider |
|
|
143
|
-
|---|---|---|---|---|---|
|
|
144
|
-
| Backend | DeepSeek V4 only | Anthropic only | OpenAI / Anthropic | any (OpenRouter) | any (OpenRouter) |
|
|
145
|
-
| Cost / typical task | **~$0.001–$0.005** | ~$0.05–$0.50 | $20/mo + usage | varies | varies |
|
|
146
|
-
| Where it runs | terminal | terminal + IDE | IDE (Electron) | VS Code only | terminal |
|
|
147
|
-
| License | **MIT** | closed | closed | Apache 2 | Apache 2 |
|
|
148
|
-
| Cache-first prefix loop | **engineered (94% hit)** | basic | n/a | n/a | basic |
|
|
149
|
-
| MCP servers | **first-class** | first-class | — | beta | — |
|
|
150
|
-
| Plan mode (read-only audit gate) | **yes** | yes | — | yes | — |
|
|
151
|
-
| User-authored skills | **yes** | yes | — | — | — |
|
|
152
|
-
| Edit review (no auto-write) | **yes** (`/apply`) | yes | partial | yes | yes |
|
|
153
|
-
| Workspace switch (`/cwd`, `change_workspace`) | **yes** | — | n/a (per-window) | — | — |
|
|
154
|
-
| Cross-session cost dashboard | **yes** (`/stats`) | — | — | — | — |
|
|
155
|
-
| Sandbox boundary enforcement | **strict** (refuses `..` escape) | yes | partial | yes | partial |
|
|
66
|
+
At DeepSeek's pricing — $0.07/Mtok uncached, $0.014/Mtok cached — **the difference between 50% and 94% hit is roughly 2.5× on input cost alone.** Same model, same API; the loop's invariants are what changed.
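
The arithmetic behind that ratio can be checked with a small sketch. It uses only the two prices quoted above; `blendedInputPrice` is a hypothetical helper written for illustration, not part of Reasonix. With these inputs the ratio comes out near 2.4, which the README rounds to "roughly 2.5×".

```typescript
// Prices quoted in the paragraph above, in USD per million input tokens.
const UNCACHED_PER_MTOK = 0.07;
const CACHED_PER_MTOK = 0.014;

// Blended input price for a given cache-hit rate in the range 0..1.
// Hypothetical helper for illustration only.
function blendedInputPrice(hitRate: number): number {
  if (hitRate < 0 || hitRate > 1) {
    throw new RangeError("hitRate must be between 0 and 1");
  }
  return hitRate * CACHED_PER_MTOK + (1 - hitRate) * UNCACHED_PER_MTOK;
}

const at50 = blendedInputPrice(0.5); // half of the input tokens served from cache
const at94 = blendedInputPrice(0.94); // the live hit rate claimed in the README
const ratio = at50 / at94;

console.log(at50.toFixed(4), at94.toFixed(4), ratio.toFixed(2));
```

Output tokens are not cached, so this ratio applies to input cost only, as the paragraph says.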
|
|
156
67
|
|
|
157
|
-
|
|
158
|
-
<summary><strong>When reasonix is the wrong choice · DeepSeek/Anthropic-compat caveats · vs Aider/Cline/Continue</strong></summary>
|
|
159
|
-
|
|
160
|
-
### Pick something else when
|
|
161
|
-
|
|
162
|
-
- **You want multi-provider flexibility** (mix Claude / GPT / Gemini /
|
|
163
|
-
local Llama in one tool). Try [Aider](https://aider.chat) or
|
|
164
|
-
[Cline](https://cline.bot). Reasonix is DeepSeek-only on purpose —
|
|
165
|
-
every layer (cache-first loop, R1 harvesting, JSON-mode tool repair,
|
|
166
|
-
reasoning-effort cap) is tuned against DeepSeek-specific behavior
|
|
167
|
-
and economics. Coupling to one backend is the feature, not a
|
|
168
|
-
limitation we'll grow out of.
|
|
169
|
-
- **You want IDE integration** (inline diff in your gutter,
|
|
170
|
-
multi-cursor, ghost text, refactor previews). Try
|
|
171
|
-
[Cursor](https://cursor.com) or Claude Code's IDE mode. Reasonix
|
|
172
|
-
is terminal-first; the diff lives in `git diff`, the file tree
|
|
173
|
-
lives in `ls`, the chat lives in your shell.
|
|
174
|
-
- **You're chasing the hardest reasoning benchmarks.** Claude Opus
|
|
175
|
-
4.6 still wins some leaderboards. DeepSeek V4-pro is competitive
|
|
176
|
-
on most coding tasks but doesn't lead every benchmark. If your
|
|
177
|
-
task is "solve this PhD-level proof" rather than "fix this auth
|
|
178
|
-
bug," start with Claude.
|
|
179
|
-
- **You need fully-local / fully-free**. DeepSeek's API has free
|
|
180
|
-
credit on signup, but isn't free forever. For air-gapped or
|
|
181
|
-
always-free, look at Aider + Ollama or [Continue](https://continue.dev).
|
|
182
|
-
|
|
183
|
-
### "But DeepSeek now has an Anthropic-compatible API — can't I just point Claude Code at it?"
|
|
184
|
-
|
|
185
|
-
You can. DeepSeek ships an official Anthropic-compatible endpoint at
|
|
186
|
-
`https://api.deepseek.com/anthropic`, and Claude Code (or any Anthropic
|
|
187
|
-
SDK client) talks to it without modification. The protocol works. The
|
|
188
|
-
**caching economics** don't transfer, and that's the whole point.
|
|
189
|
-
|
|
190
|
-
Look at DeepSeek's [own compatibility table](https://api-docs.deepseek.com/guides/anthropic_api):
|
|
191
|
-
|
|
192
|
-
| Field | Status on DeepSeek's compat endpoint |
|
|
193
|
-
|---|---|
|
|
194
|
-
| `cache_control` markers | **Ignored** |
|
|
195
|
-
| `mcp_servers` (API-level) | Ignored |
|
|
196
|
-
| `thinking.budget_tokens` | Ignored |
|
|
197
|
-
| Images / documents / citations | Not supported |
|
|
198
|
-
|
|
199
|
-
`cache_control: Ignored` is the load-bearing line. Two completely
|
|
200
|
-
different cache mechanics are colliding here:
|
|
201
|
-
|
|
202
|
-
| | Anthropic native | DeepSeek auto-cache |
|
|
203
|
-
|---|---|---|
|
|
204
|
-
| Model | **Marker-based.** You put `cache_control` on a message; Anthropic caches "everything up to this marker" as a content-addressed unit. Multiple markers = multiple independent breakpoints. | **Byte-stable prefix.** The cache fingerprints the literal byte stream from byte 0. |
|
|
205
|
-
| Claude Code's design | Built around this. Markers on system prompt + tool defs let the loop reorder, compact, or insert metadata after the markers without losing the cache. | n/a — Claude Code wasn't designed for byte-stable prefixes. |
|
|
206
|
-
| What happens when Claude Code → DeepSeek compat | Markers stripped (ignored). Claude Code's main caching strategy disappears. | Falls back to auto-cache. But Claude Code's prefix isn't byte-stable (markers were the *substitute* for byte-stability), so auto-cache misses too. |
|
|
207
|
-
|
|
208
|
-
Net effect: **Claude Code's loop, redirected at DeepSeek, gets the
|
|
209
|
-
cheap tokens and loses the cache hit it depended on.** A loop running
|
|
210
|
-
at 80%+ cache hit on Anthropic's marker cache lands somewhere in the
|
|
211
|
-
40-60% range on DeepSeek's auto-cache (matches the generic-harness
|
|
212
|
-
baseline in our benchmarks). Same model, same API, same workload —
|
|
213
|
-
the loop's invariants don't fit the cache mechanic it's now talking
|
|
214
|
-
to.
|
|
215
|
-
|
|
216
|
-
Reasonix's loop was designed around byte-stable prefix from line one.
|
|
217
|
-
No markers, no breakpoints — append-only is the invariant. That's why
|
|
218
|
-
the same τ-bench workload lands at **90.2% cache hit** on Reasonix
|
|
219
|
-
and **32.8%** on a cache-hostile baseline (committed transcripts;
|
|
220
|
-
benchmarks section below). At DeepSeek's pricing — $0.07/Mtok
|
|
221
|
-
uncached, ~$0.014/Mtok cached — the difference between 50% and 94%
|
|
222
|
-
hit is **roughly 2.5× on input cost alone**.
|
|
223
|
-
|
|
224
|
-
### "What about Aider / Cline / Continue?"
|
|
225
|
-
|
|
226
|
-
They support DeepSeek natively (no compat layer needed) and you do
|
|
227
|
-
get the cheap token price. What you don't get is the DeepSeek-
|
|
228
|
-
specific loop work — those tools' loops support every backend
|
|
229
|
-
generically (OpenAI / Anthropic / local Llama / ...) and use
|
|
230
|
-
compaction + summarization patterns that destroy byte-stability. They
|
|
231
|
-
land in the same 40-60% cache-hit range as the baseline. Plus a
|
|
232
|
-
handful of DeepSeek-specific quirks generic loops don't handle:
|
|
68
|
+
A few DeepSeek-specific fixes generic loops miss:
|
|
233
69
|
|
|
234
70
|
| Generic loops assume | DeepSeek actually does | Reasonix's fix |
|
|
235
71
|
|---|---|---|
|
|
236
|
-
| Reasoning emitted as a structured `thinking` block | R1 sometimes leaks tool-call JSON inside `<think>` tags | a `scavenge` pass that pulls escaped tool calls back out
|
|
237
|
-
| Tool schemas validated strictly | DeepSeek silently drops deeply-nested object/array params | auto-flatten — nested params get rewritten to single-level prefixed names
|
|
238
|
-
| Tool-call args are well-formed JSON | DeepSeek occasionally produces `string="false"` and other malformed fragments | dedicated `ToolCallRepair` heals the common shapes before
|
|
239
|
-
| Reasoning depth tuned via system-level switches | V4 exposes a `reasoning_effort` knob (`max` / `high`) | `/effort` slash + `--effort` flag
|
|
240
|
-
| Old tool results kept in full forever | 1M context — don't compact pre-emptively, but most agents do | call-storm breaker + result token cap, but the prefix is *never* rewritten; compaction lands as new turns at the tail |
|
|
241
|
-
|
|
242
|
-
> Cache-stability isn't a feature you turn on; it's an invariant
|
|
243
|
-
> the loop is designed around. Reasonix isn't yet-another agent
|
|
244
|
-
> CLI — it's an agent CLI built around DeepSeek's specific cache
|
|
245
|
-
> mechanic and pricing model.
|
|
246
|
-
|
|
247
|
-
</details>
|
|
248
|
-
|
|
249
|
-
---
|
|
250
|
-
|
|
251
|
-
## `reasonix code` — pair programmer in your terminal
|
|
252
|
-
|
|
253
|
-
Scoped to the directory you launch from. The model has native
|
|
254
|
-
`read_file` / `write_file` / `edit_file` / `list_directory` /
|
|
255
|
-
`search_files` / `directory_tree` / `get_file_info` /
|
|
256
|
-
`create_directory` / `move_file` tools, all sandboxed — any path that
|
|
257
|
-
resolves outside the launch root (including `..` and symlink escapes)
|
|
258
|
-
is refused. Plus `run_command` with a read-only allowlist; anything
|
|
259
|
-
state-mutating (`npm install`, `git commit`, …) is gated behind a
|
|
260
|
-
confirmation picker.
|
|
261
|
-
|
|
262
|
-
### Walkthrough: explore before editing
|
|
263
|
-
|
|
264
|
-
For "what does this code do?" questions the model uses the read-side
|
|
265
|
-
tools and replies in prose — no SEARCH/REPLACE blocks, no file
|
|
266
|
-
writes. Ask to change something only when you mean it:
|
|
267
|
-
|
|
268
|
-
```
|
|
269
|
-
reasonix code › How is routing organized in this project?
|
|
270
|
-
assistant
|
|
271
|
-
▸ tool<directory_tree> → (src/ tree, 47 entries)
|
|
272
|
-
▸ tool<read_file> → (src/router.ts, 1.2 KB)
|
|
273
|
-
▸ Routing has three layers: the top-level AppRouter registers tabs, each tab uses React Router's
|
|
274
|
-
nested routes for its sub-paths, and finally …
|
|
275
|
-
```
|
|
276
|
-
|
|
277
|
-
If an `edit_file` SEARCH block doesn't match the file byte-for-byte,
|
|
278
|
-
the edit is refused loudly rather than fuzzy-matched. The model sees
|
|
279
|
-
the error and retries — silent wrong edits are worse than visible
|
|
280
|
-
rejections.
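
The refuse-rather-than-fuzzy-match rule described above could look roughly like the sketch below. This is an illustration under assumptions, not Reasonix's actual implementation: the function name, the result shape, and the choice to also refuse ambiguous (multiple) matches are all hypothetical. It compares JavaScript strings exactly, which stands in for the byte-for-byte comparison the text describes.

```typescript
// Hypothetical result shape; the real types are not shown in the README.
interface EditResult {
  ok: boolean;
  content?: string; // new file content when ok is true
  error?: string; // reason shown to the model when ok is false
}

// Apply one SEARCH/REPLACE pair only if SEARCH occurs exactly, with no
// whitespace normalization or fuzzy matching.
function applyStrictEdit(content: string, search: string, replace: string): EditResult {
  if (search.length === 0) {
    return { ok: false, error: "SEARCH block is empty" };
  }
  const first = content.indexOf(search);
  if (first === -1) {
    return { ok: false, error: "SEARCH block does not match the file exactly" };
  }
  // Assumption of this sketch: an ambiguous match is also refused.
  if (content.indexOf(search, first + 1) !== -1) {
    return { ok: false, error: "SEARCH block matches more than once" };
  }
  const updated = content.slice(0, first) + replace + content.slice(first + search.length);
  return { ok: true, content: updated };
}

const before = "function findByEmail(email) {\n  return users.find(u => u.email === email);\n}\n";
const result = applyStrictEdit(
  before,
  "  return users.find(u => u.email === email);",
  "  const needle = email.toLowerCase();\n  return users.find(u => u.email.toLowerCase() === needle);",
);
console.log(result.ok ? result.content : `refused: ${result.error}`);
```

Returning the error as data, rather than throwing, matches the described flow where the model sees the rejection and retries.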
|
|
281
|
-
|
|
282
|
-
### Plan mode — review before executing
|
|
283
|
-
|
|
284
|
-
For anything bigger than a typo, the model is encouraged to propose a
|
|
285
|
-
markdown plan first. You'll see a picker with **Approve / Refine /
|
|
286
|
-
Cancel**:
|
|
287
|
-
|
|
288
|
-
```
|
|
289
|
-
reasonix code › Migrate auth from JWT to session cookies
|
|
290
|
-
|
|
291
|
-
▸ plan submitted — awaiting your review
|
|
292
|
-
────────────────────────────────────────
|
|
293
|
-
## Summary
|
|
294
|
-
Swap JWT middleware for session cookies, keep user table intact.
|
|
295
|
-
|
|
296
|
-
## Files
|
|
297
|
-
- src/auth/middleware.ts — replace `verifyJwt` with `readSession`
|
|
298
|
-
- src/auth/session.ts — new file, in-memory store + signed cookie
|
|
299
|
-
- src/routes/login.ts — return Set-Cookie instead of a token
|
|
300
|
-
- tests/auth/*.test.ts — update fixtures
|
|
301
|
-
|
|
302
|
-
## Risks
|
|
303
|
-
- Existing logged-in users get logged out (no migration).
|
|
304
|
-
- Session store is in-memory; restart clears sessions.
|
|
305
|
-
────────────────────────────────────────
|
|
306
|
-
▸ Approve and implement
|
|
307
|
-
Refine — explore more
|
|
308
|
-
Cancel
|
|
309
|
-
```
|
|
310
|
-
|
|
311
|
-
**Force it** with `/plan` — enters an explicit read-only phase where
|
|
312
|
-
the model *must* submit a plan before any edit or non-allowlisted
|
|
313
|
-
shell call will execute. Use for high-stakes changes you want to
|
|
314
|
-
audit before the model touches disk. `/plan off` or picker
|
|
315
|
-
Approve/Cancel exits.
|
|
316
|
-
|
|
317
|
-
### Prompt prefixes — `!cmd` and `@path`
|
|
318
|
-
|
|
319
|
-
Two inline shortcuts that don't need a slash:
|
|
320
|
-
|
|
321
|
-
**`!<cmd>` — run a shell command in the sandbox and feed it to the
|
|
322
|
-
model.** Typed at the prompt, like bash. Output lands in the visible
|
|
323
|
-
log AND in the session so the model's next turn reasons about it:
|
|
324
|
-
|
|
325
|
-
```
|
|
326
|
-
reasonix code › !git status --short
|
|
327
|
-
▸ M src/users.ts
|
|
328
|
-
▸ M src/users.test.ts
|
|
329
|
-
|
|
330
|
-
reasonix code › Explain the changes in these two files
|
|
331
|
-
assistant
|
|
332
|
-
▸ tool<read_file> → src/users.ts, src/users.test.ts
|
|
333
|
-
▸ …
|
|
334
|
-
```
|
|
335
|
-
|
|
336
|
-
No allowlist gate — user-typed shell = explicit consent. 60s timeout,
|
|
337
|
-
32k char cap, survives session resume.
|
|
338
|
-
|
|
339
|
-
**`@path/to/file` — inline a file under "Referenced files."** Start
|
|
340
|
-
typing `@` and a picker appears (↑/↓ navigate, Tab/Enter to insert).
|
|
341
|
-
Good for "what does @src/users.ts do?" without making the model
|
|
342
|
-
`read_file` it first. Sandboxed: relative paths only, no `..` escape,
|
|
343
|
-
64KB per-file cap. Recent files rank higher.
|
|
344
|
-
|
|
345
|
-
### `/commit` — stage + commit in one step
|
|
346
|
-
|
|
347
|
-
```
|
|
348
|
-
reasonix code › /commit "fix: findByEmail case-insensitive"
|
|
349
|
-
▸ git add -A && git commit -m "fix: findByEmail case-insensitive"
|
|
350
|
-
[main a1b2c3d] fix: findByEmail case-insensitive
|
|
351
|
-
```
|
|
352
|
-
|
|
353
|
-
### Things to try
|
|
354
|
-
|
|
355
|
-
- `/tool 1` — dump the last tool call's full output (when the 400-char
|
|
356
|
-
inline clip isn't enough).
|
|
357
|
-
- `/think` — see the model's full reasoning for the last turn
|
|
358
|
-
(thinking-mode models: v4-flash / v4-pro / reasoner alias).
|
|
359
|
-
- `/undo` — roll back the last applied edit batch.
|
|
360
|
-
- `/new` — start fresh in the same directory without losing the
|
|
361
|
-
session file.
|
|
362
|
-
- `/effort high` — step down from the default `max` agent-class
|
|
363
|
-
reasoning_effort for cheaper/faster turns on simple tasks.
|
|
364
|
-
- `npx reasonix code --preset pro` — v4-pro for the whole session,
|
|
365
|
-
no auto-downgrade to flash. Pair with `--branch 3` if you want
|
|
366
|
-
3-way self-consistency on gnarly refactors.
|
|
367
|
-
- `npx reasonix code src/` — narrower sandbox (only `src/` is
|
|
368
|
-
writable).
|
|
369
|
-
- `npx reasonix code --no-session` — ephemeral; nothing saved.
|
|
370
|
-
|
|
371
|
-
### `reasonix stats` — how much did you actually save?
|
|
372
|
-
|
|
373
|
-
Every turn `reasonix chat|code|run` runs appends a compact record
|
|
374
|
-
(tokens + cost + what Claude Sonnet 4.6 would have charged) to
|
|
375
|
-
`~/.reasonix/usage.jsonl`. `reasonix stats` with no args rolls that
|
|
376
|
-
log into today / week / month / all-time windows:
|
|
377
|
-
|
|
378
|
-
```
|
|
379
|
-
Reasonix usage — /Users/you/.reasonix/usage.jsonl
|
|
380
|
-
|
|
381
|
-
turns cache hit cost (USD) vs Claude saved
|
|
382
|
-
----------------------------------------------------------------------
|
|
383
|
-
today 8 95.1% $0.004821 $0.1348 96.4%
|
|
384
|
-
week 34 93.8% $0.023104 $0.6081 96.2%
|
|
385
|
-
month 127 94.2% $0.081530 $2.1452 96.2%
|
|
386
|
-
all-time 342 94.0% $0.210881 $5.8934 96.4%
|
|
387
|
-
```
|
|
388
|
-
|
|
389
|
-
Privacy: only tokens, costs, and the session name you chose land
|
|
390
|
-
in the file. No prompts, no completions, no tool arguments.
|
|
391
|
-
`reasonix stats <transcript>` keeps the old per-file summary
|
|
392
|
-
(assistant turns + tool calls) for scripts that already use it.
|
|
393
|
-
|
|
394
|
-
### Staying current
|
|
395
|
-
|
|
396
|
-
The panel header shows the running version next to `Reasonix` (e.g.
|
|
397
|
-
`Reasonix 0.12.6 · v4-flash · AUTO · max …`, the trailing `max` is
|
|
398
|
-
the reasoning-effort badge — `/effort high` to step down).
|
|
399
|
-
A quiet 24-hour background check against
|
|
400
|
-
the npm registry surfaces a yellow `update: X.Y.Z` on the right side
|
|
401
|
-
of the same row when a newer version has been published. No blocking,
|
|
402
|
-
no nagging — the check runs once per day max and is silent on failure
|
|
403
|
-
(offline, firewall, etc.).
|
|
404
|
-
|
|
405
|
-
```bash
|
|
406
|
-
reasonix update # print current vs latest, run `npm i -g reasonix@latest`
|
|
407
|
-
reasonix update --dry-run # print the plan without running anything
|
|
408
|
-
```
|
|
409
|
-
|
|
410
|
-
Running via `npx`? The command detects that and prints a
|
|
411
|
-
cache-refresh hint instead — npx picks up the newest version on
|
|
412
|
-
its next invocation automatically.
|
|
413
|
-
|
|
414
|
-
### Project conventions — `REASONIX.md`
|
|
415
|
-
|
|
416
|
-
Drop a `REASONIX.md` in the project root and its contents are pinned
|
|
417
|
-
into the system prompt every launch. Committable team memory — house
|
|
418
|
-
conventions, domain glossary, things the model keeps forgetting:
|
|
419
|
-
|
|
420
|
-
```bash
|
|
421
|
-
cat > REASONIX.md <<'EOF'
|
|
422
|
-
# Notes for Reasonix
|
|
423
|
-
- Use snake_case for new Python modules; legacy camelCase modules keep their style.
|
|
424
|
-
- `cargo check` is in the auto-run allowlist; full `cargo test` needs confirmation.
|
|
425
|
-
- The `api/` dir mirrors `backend/` — keep schemas in sync.
|
|
426
|
-
EOF
|
|
427
|
-
```
|
|
428
|
-
|
|
429
|
-
Re-launch (or `/new`) to pick it up; the prefix is hashed once per
|
|
430
|
-
session to keep the DeepSeek cache warm. `/memory` prints what's
|
|
431
|
-
currently pinned. `REASONIX_MEMORY=off` disables every memory source
|
|
432
|
-
for CI / offline repro.
|
|
433
|
-
|
|
434
|
-
### User memory — `~/.reasonix/memory/`
|
|
435
|
-
|
|
436
|
-
A second, **private per-user** memory layer lives under your home
|
|
437
|
-
directory. Unlike `REASONIX.md` it's never committed, and the model
|
|
438
|
-
can write to it itself via the `remember` tool. Two scopes:
|
|
439
|
-
|
|
440
|
-
- `~/.reasonix/memory/global/` — cross-project (your preferences,
|
|
441
|
-
tooling).
|
|
442
|
-
- `~/.reasonix/memory/<project-hash>/` — scoped to one sandbox root
|
|
443
|
-
in `reasonix code` (decisions, local facts, per-repo shortcuts).
|
|
444
|
-
|
|
445
|
-
Each scope keeps an always-loaded `MEMORY.md` index of one-liners
|
|
446
|
-
plus zero or more `<name>.md` detail files (loaded on demand via
|
|
447
|
-
`recall_memory`). Writes land immediately; pinning into the system
|
|
448
|
-
prompt takes effect on next `/new` or launch so the cache prefix
|
|
449
|
-
stays stable for the current session.
|
|
450
|
-
|
|
451
|
-
```
|
|
452
|
-
reasonix code › I use bun instead of npm; from now on please run builds with bun
|
|
453
|
-
|
|
454
|
-
assistant
|
|
455
|
-
▸ tool<remember> → project/bun_build saved
|
|
456
|
-
"Build command on this machine is `bun run build`"
|
|
457
|
-
```
|
|
458
|
-
|
|
459
|
-
**Slash**: `/memory` · `/memory list` · `/memory show <name>` ·
|
|
460
|
-
`/memory forget <name>` · `/memory clear <scope> confirm`.
|
|
461
|
-
**Model tools**: `remember(type, scope, name, description, content)` ·
|
|
462
|
-
`forget(scope, name)` · `recall_memory(scope, name)`.
|
|
463
|
-
|
|
464
|
-
Project scope is only available inside `reasonix code` (needs a real
|
|
465
|
-
sandbox root to hash); plain `reasonix` gets the global scope only.
|
|
466
|
-
|
|
467
|
-
### Skills — user-authored prompt packs
|
|
468
|
-
|
|
469
|
-
Skills are prose instruction blocks you drop on disk. Reasonix pins
|
|
470
|
-
their names + one-line descriptions into the system prompt; the
|
|
471
|
-
model can call `run_skill({name: "..."})` on its own when a match
|
|
472
|
-
fits, or you can type `/skill <name> [args]` to run one manually.
|
|
473
|
-
|
|
474
|
-
Two scopes, same layout as user memory:
|
|
475
|
-
|
|
476
|
-
- `<project>/.reasonix/skills/` — per-project skills (commit them to
|
|
477
|
-
share with your team, or add to `.gitignore` for personal drafts).
|
|
478
|
-
- `~/.reasonix/skills/` — global skills available everywhere.
|
|
479
|
-
|
|
480
|
-
Either layout works: `<name>/SKILL.md` (preferred — can bundle
|
|
481
|
-
additional assets alongside) or flat `<name>.md`.
|
|
482
|
-
|
|
483
|
-
```markdown
|
|
484
|
-
---
|
|
485
|
-
name: review
|
|
486
|
-
description: Review uncommitted changes and flag risks
|
|
487
|
-
---
|
|
488
|
-
|
|
489
|
-
Run `git diff` on staged and unstaged changes. Summarize what each
|
|
490
|
-
hunk does, call out potential regressions, and list files that might
|
|
491
|
-
need additional tests. Don't propose edits unless I ask.
|
|
492
|
-
```
|
|
493
|
-
|
|
494
|
-
Use it:
|
|
495
|
-
|
|
496
|
-
```
|
|
497
|
-
reasonix code › /skill review
|
|
498
|
-
▸ running skill: review
|
|
499
|
-
assistant
|
|
500
|
-
▸ tool<run_command> → git diff --cached
|
|
501
|
-
▸ 3 changes, 1 needs a regression test …
|
|
502
|
-
```
|
|
72
|
+
| Reasoning emitted as a structured `thinking` block | R1 sometimes leaks tool-call JSON inside `<think>` tags | a `scavenge` pass that pulls escaped tool calls back out |
|
|
73
|
+
| Tool schemas validated strictly | DeepSeek silently drops deeply-nested object/array params | auto-flatten — nested params get rewritten to single-level prefixed names |
|
|
74
|
+
| Tool-call args are well-formed JSON | DeepSeek occasionally produces `string="false"` and other malformed fragments | dedicated `ToolCallRepair` heals the common shapes before dispatch |
|
|
75
|
+
| Reasoning depth tuned via system-level switches | V4 exposes a `reasoning_effort` knob (`max` / `high`) | `/effort` slash + `--effort` flag for cheap turns |
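
As an illustration of the auto-flatten row in the table above, the sketch below rewrites nested object parameters into single-level prefixed names. It is a sketch under assumptions rather than Reasonix's code: the `ParamSchema` shape, the `flattenParams` name, and the `_` separator are hypothetical, and arrays of objects are not handled.

```typescript
// Minimal JSON-schema-like shape, assumed for this sketch.
interface ParamSchema {
  type: string;
  properties?: Record<string, ParamSchema>;
}

// Recursively lift nested object properties to the top level, joining the
// path segments with a separator so every parameter is a single-level name.
function flattenParams(
  properties: Record<string, ParamSchema>,
  prefix: string = "",
  separator: string = "_",
): Record<string, ParamSchema> {
  const flat: Record<string, ParamSchema> = {};
  for (const name of Object.keys(properties)) {
    const schema = properties[name];
    const key = prefix ? `${prefix}${separator}${name}` : name;
    if (schema.type === "object" && schema.properties) {
      const inner = flattenParams(schema.properties, key, separator);
      for (const innerKey of Object.keys(inner)) {
        flat[innerKey] = inner[innerKey];
      }
    } else {
      flat[key] = schema;
    }
  }
  return flat;
}

const nested: Record<string, ParamSchema> = {
  query: { type: "string" },
  filter: {
    type: "object",
    properties: {
      author: { type: "object", properties: { name: { type: "string" } } },
      limit: { type: "number" },
    },
  },
};

console.log(Object.keys(flattenParams(nested)));
// → [ 'query', 'filter_author_name', 'filter_limit' ]
```

A real implementation would also need the inverse mapping, so that flat arguments returned by the model can be rebuilt into the nested shape the tool expects.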
|
|
503
76
|
|
|
504
|
-
|
|
505
|
-
description are pinned in the prefix, asking "check my uncommitted changes for
|
|
506
|
-
risks" triggers `run_skill({name: "review"})` without you typing the
|
|
507
|
-
slash command.
|
|
508
|
-
|
|
509
|
-
**Slash**: `/skill` (list) · `/skill show <name>` · `/skill <name>
|
|
510
|
-
[args]` (inject body as user turn).
|
|
511
|
-
|
|
512
|
-
**Deliberately not tied** to any other client's directory convention
|
|
513
|
-
(`.claude/skills`, etc.) — Reasonix is model-agnostic at the
|
|
514
|
-
conversation layer. Any SKILL.md you author works; the body is
|
|
515
|
-
prose, so skills authored for other tools usually port over unchanged
|
|
516
|
-
(Reasonix's tool names differ — `filesystem` / `shell` / `web` — but
|
|
517
|
-
the model reads the instructions and picks our equivalents).
|
|
518
|
-
|
|
519
|
-
### Hooks — automate around tool calls and turns
|
|
520
|
-
|
|
521
|
-
Drop a `settings.json` under `.reasonix/` (project or `~/`) and
|
|
522
|
-
Reasonix will fire shell commands at four well-known points in
|
|
523
|
-
the loop: before a tool runs, after a tool returns, before your
|
|
524
|
-
prompt reaches the model, and after the turn ends.
|
|
525
|
-
|
|
526
|
-
```json
|
|
527
|
-
// <project>/.reasonix/settings.json ← committable
|
|
528
|
-
// ~/.reasonix/settings.json ← per-user
|
|
529
|
-
{
|
|
530
|
-
"hooks": {
|
|
531
|
-
"PreToolUse": [{ "match": "edit_file|write_file", "command": "bun scripts/guard.ts" }],
|
|
532
|
-
"PostToolUse": [{ "match": "edit_file", "command": "biome format --write" }],
|
|
533
|
-
"UserPromptSubmit": [{ "command": "echo $(date +%s) >> ~/.reasonix/prompts.log" }],
|
|
534
|
-
"Stop": [{ "command": "bun test --run", "timeout": 60000 }]
|
|
535
|
-
}
|
|
536
|
-
}
|
|
537
|
-
```
|
|
538
|
-
|
|
539
|
-
Each hook is a shell command. Reasonix invokes it with stdin = a
|
|
540
|
-
JSON envelope describing the event:
|
|
541
|
-
|
|
542
|
-
```json
|
|
543
|
-
{ "event": "PreToolUse", "cwd": "/path/to/project",
|
|
544
|
-
"toolName": "edit_file", "toolArgs": { "path": "src/x.ts", "..." } }
|
|
545
|
-
```
|
|
546
|
-
|
|
547
|
-
Exit code drives the decision:
|
|
548
|
-
|
|
549
|
-
- **0** — pass; loop continues normally
|
|
550
|
-
- **2** — block (only on `PreToolUse` / `UserPromptSubmit`); the
|
|
551
|
-
hook's stderr becomes the synthetic tool result the model sees,
|
|
552
|
-
or the prompt is dropped entirely
|
|
553
|
-
- **anything else** — warn; loop continues, stderr renders as a
|
|
554
|
-
yellow row inline
|
|
555
|
-
|
|
556
|
-
`match` is an anchored regex on the tool name; `*` or omitted matches
|
|
557
|
-
every tool. Project hooks fire before global hooks. Default
|
|
558
|
-
timeouts: 5s for blocking events, 30s for logging events; per-hook
|
|
559
|
-
`timeout` overrides.
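The exit-code and `match` rules above can be sketched as two small pure functions. This is an illustration only, not Reasonix's own code: `interpretExit` and `matchesTool` are hypothetical names, anchoring is assumed to apply at both ends of the tool name, and exit code 2 on a non-blocking event is assumed to fall under "anything else".

```ts
type HookEvent = "PreToolUse" | "PostToolUse" | "UserPromptSubmit" | "Stop";
type HookOutcome = "pass" | "block" | "warn";

// 0 passes; 2 blocks, but only on the two blocking events; any other
// combination is treated as a warning (assumption for exit 2 elsewhere).
function interpretExit(event: HookEvent, exitCode: number): HookOutcome {
  if (exitCode === 0) return "pass";
  const blocking = event === "PreToolUse" || event === "UserPromptSubmit";
  if (exitCode === 2 && blocking) return "block";
  return "warn";
}

// `match` is wrapped so the regex must cover the whole tool name;
// "*" or an omitted `match` accepts every tool.
function matchesTool(match: string | undefined, toolName: string): boolean {
  if (match === undefined || match === "*") return true;
  return new RegExp(`^(?:${match})$`).test(toolName);
}
```

Under these assumptions, `"edit_file|write_file"` matches `write_file` but not `write_file_v2`, and a `Stop` hook exiting with 2 only produces a warning.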
|
|
560
|
-
|
|
561
|
-
**Slash**: `/hooks` (list active hooks) · `/hooks reload` (re-read
|
|
562
|
-
`settings.json` from disk without losing your session).
|
|
563
|
-
|
|
564
|
-
### Staying current from inside the TUI
|
|
565
|
-
|
|
566
|
-
`/update` inside a running session shows your current version, the
|
|
567
|
-
last-resolved latest version (from the quiet 24h background check),
|
|
568
|
-
and the shell command to run. The slash does *not* spawn
|
|
569
|
-
`npm install` — stdio:inherit into a running Ink renderer corrupts
|
|
570
|
-
the display. Exit the session and run `reasonix update` in a
|
|
571
|
-
fresh shell when you actually want to install.
|
|
572
|
-
|
|
573
|
-
---
|
|
574
|
-
|
|
575
|
-
## `reasonix` — also works as general chat
|
|
576
|
-
|
|
577
|
-
Same TUI, no filesystem tools unless you opt in via MCP. Good for
|
|
578
|
-
drafting, Q&A, schema design, architecture discussions, or driving
|
|
579
|
-
your own MCP servers. Sessions persist per name under
|
|
580
|
-
`~/.reasonix/sessions/`.
|
|
581
|
-
|
|
582
|
-
```bash
|
|
583
|
-
npx reasonix # uses saved config + wizard-selected MCP
|
|
584
|
-
npx reasonix --preset pro # pin v4-pro for the whole run (no auto-downgrade)
|
|
585
|
-
npx reasonix --session design # named session — resume later with --session design
|
|
586
|
-
```
|
|
587
|
-
|
|
588
|
-
Bridge your own MCP servers on the fly:
|
|
589
|
-
|
|
590
|
-
```bash
|
|
591
|
-
npx reasonix \
|
|
592
|
-
--mcp "fs=npx -y @modelcontextprotocol/server-filesystem /tmp/safe" \
|
|
593
|
-
--mcp "kb=https://mcp.example.com/sse"
|
|
594
|
-
```
|
|
595
|
-
|
|
596
|
-
MCP tools go through the same Cache-First + repair + context-safety
|
|
597
|
-
plumbing as native tools — 32k result cap, live progress-notification
|
|
598
|
-
rendering, retries.
|
|
599
|
-
|
|
600
|
-
---
|
|
601
|
-
|
|
602
|
-
## Commands inside the session
|
|
603
|
-
|
|
604
|
-
<details>
|
|
605
|
-
<summary><strong>Slash command reference</strong> (click to expand)</summary>
|
|
606
|
-
|
|
607
|
-
**Core**
|
|
608
|
-
|
|
609
|
-
| command | what it does |
|
|
610
|
-
|---|---|
|
|
611
|
-
| `/help` · `/?` | full command reference with hints |
|
|
612
|
-
| `/status` | current model · flags · context · session |
|
|
613
|
-
| `/new` · `/reset` | fresh conversation in the same session |
|
|
614
|
-
| `/clear` | clear visible scrollback only (log kept) |
|
|
615
|
-
| `/retry` | truncate and resend your last message (fresh sample) |
|
|
616
|
-
| `/exit` · `/quit` | quit |
|
|
617
|
-
|
|
618
|
-
**Model**
|
|
619
|
-
|
|
620
|
-
| command | what it does |
|
|
621
|
-
|---|---|
|
|
622
|
-
| `/preset <auto\|flash\|pro>` | model commitment — `auto` = flash with escalation, `flash` = locked flash, `pro` = locked pro |
|
|
623
|
-
| `/model <id>` | switch DeepSeek model (`deepseek-v4-flash`, `deepseek-v4-pro`, plus `deepseek-chat` / `deepseek-reasoner` compat aliases) |
|
|
624
|
-
| `/models` | list live models from DeepSeek `/models` endpoint |
|
|
625
|
-
| `/harvest [on\|off]` | toggle R1 plan-state extraction |
|
|
626
|
-
| `/branch <N\|off>` | run N parallel samples per turn, pick best (N ≥ 2) |
|
|
627
|
-
| `/effort <high\|max>` | reasoning_effort cap — `max` is the agent default, `high` is cheaper/faster |
|
|
628
|
-
| `/think` | dump the last turn's full thinking-mode reasoning |
|
|
629
|
-
|
|
630
|
-
**Context & tools**
|
|
631
|
-
|
|
632
|
-
| command | what it does |
|
|
633
|
-
|---|---|
|
|
634
|
-
| `/mcp` | list attached MCP servers and their tools / resources / prompts |
|
|
635
|
-
| `/resource [uri]` | browse + read MCP resources (no arg → list URIs; `<uri>` → fetch) |
|
|
636
|
-
| `/prompt [name]` | browse + fetch MCP prompts |
|
|
637
|
-
| `/tool [N]` | dump the Nth tool call's full output (1 = latest) |
|
|
638
|
-
| `/compact [tokens]` | shrink oversized tool results in the log (default 4000 tokens/result) |
|
|
639
|
-
| `/context` | break down where context tokens are going (system / tools / log) |
|
|
640
|
-
| `/stats` | cross-session cost dashboard (today / week / month / all-time) |
|
|
641
|
-
| `/keys` | keyboard shortcuts + prompt prefixes (`!` / `@` / `/`) cheatsheet |
|
|
642
|
-
|
|
643
|
-
**Memory & skills**
|
|
644
|
-
|
|
645
|
-
| command | what it does |
|
|
646
|
-
|---|---|
|
|
647
|
-
| `/memory` | show pinned memory (REASONIX.md + ~/.reasonix/memory) |
|
|
648
|
-
| `/memory list` · `show <name>` · `forget <name>` · `clear <scope> confirm` | manage the store |
|
|
649
|
-
| `/skill` · `/skill list` | list discovered skills (project + global) |
|
|
650
|
-
| `/skill show <name>` | dump one skill's body |
|
|
651
|
-
| `/skill <name> [args]` | run a skill (inject body as user turn) |
|
|
652
|
-
|
|
653
|
-
**Sessions**
|
|
654
|
-
|
|
655
|
-
| command | what it does |
|
|
656
|
-
|---|---|
|
|
657
|
-
| `/sessions` | list saved sessions (current marked with `▸`) |
|
|
658
|
-
| `/forget` | delete the current session from disk |
|
|
659
|
-
| `/setup` | reconfigure (exit and run `reasonix setup`) |
|
|
660
|
-
|
|
661
|
-
**Code mode only** (`reasonix code`)
|
|
662
|
-
|
|
663
|
-
| command | what it does |
|
|
664
|
-
|---|---|
|
|
665
|
-
| `/apply` | commit the pending SEARCH/REPLACE blocks to disk |
|
|
666
|
-
| `/discard` | drop the pending edit blocks without writing |
|
|
667
|
-
| `/undo` | roll back the last applied edit batch |
|
|
668
|
-
| `/commit "msg"` | `git add -A && git commit -m "msg"` |
|
|
669
|
-
| `/plan [on\|off]` | toggle read-only plan mode |
|
|
670
|
-
| `/apply-plan` | force-approve a pending plan |
|
|
671
|
-
|
|
672
|
-
**Keyboard**
|
|
673
|
-
|
|
674
|
-
- `Enter` — submit
|
|
675
|
-
- `Shift+Enter` / `Ctrl+J` — newline (multi-line paste also supported;
|
|
676
|
-
`\` + Enter as a portable fallback)
|
|
677
|
-
- `↑` / `↓` — walk prompt history while idle; navigate slash-autocomplete
|
|
678
|
-
- `Tab` / `Enter` on a `/foo` prefix — accept the highlighted suggestion
|
|
679
|
-
- `Esc` — abort the current turn (stops the API call, cancels any
|
|
680
|
-
in-flight tool, rejects pending MCP requests)
|
|
681
|
-
- `y` / `n` on confirm prompts — hotkey accept / reject
|
|
682
|
-
|
|
683
|
-
</details>
|
|
684
|
-
|
|
685
|
-
---
|
|
686
|
-
|
|
687
|
-
## Sessions and safety nets
|
|
688
|
-
|
|
689
|
-
- Sessions live as JSONL under `~/.reasonix/sessions/<name>.jsonl`
|
|
690
|
-
(per directory for `reasonix code`). Every message is appended
|
|
691
|
-
atomically; `Ctrl+C` never loses context.
|
|
692
|
-
- Tool results are capped at 32k chars per call. Oversized sessions
|
|
693
|
-
self-heal on load (shrinks + rewrites the file).
|
|
694
|
-
- Malformed `assistant.tool_calls` / `tool` pairing is validated on
|
|
695
|
-
every outgoing API call so a corrupted session can't keep 400ing.
|
|
696
|
-
- Context gauge turns yellow at 50%, red at 80% with a `/compact`
|
|
697
|
-
nudge. Approaching the 1M-token window (V4 flash + pro) triggers an
|
|
698
|
-
automatic compaction attempt before falling back to a forced summary.
|
|
699
|
-
- The `reasonix code` sandbox refuses any path that resolves outside
|
|
700
|
-
the launch directory, including symlink escape and `..` traversal.
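The `..` traversal part of the sandbox rule in the last bullet can be sketched as a purely lexical check. This is a hypothetical illustration for POSIX-style paths, not the actual sandbox: `isInsideRoot` is an invented name, the root is assumed to be absolute, and symlink escapes are not covered here because they would additionally require resolving real paths on disk.

```ts
// Collapse "." and ".." lexically and return the remaining path segments.
function normalizeSegments(path: string): string[] {
  const out: string[] = [];
  for (const seg of path.split("/")) {
    if (seg === "" || seg === ".") continue;
    if (seg === "..") out.pop();
    else out.push(seg);
  }
  return out;
}

// True when `candidate` (absolute, or relative to `root`) stays under `root`.
// Assumes `root` is an absolute POSIX path; symlinks are NOT resolved.
function isInsideRoot(root: string, candidate: string): boolean {
  const rootSegs = normalizeSegments(root);
  const absolute = candidate.startsWith("/") ? candidate : `${root}/${candidate}`;
  const candSegs = normalizeSegments(absolute);
  if (candSegs.length < rootSegs.length) return false;
  return rootSegs.every((seg, i) => candSegs[i] === seg);
}
```

Comparing whole segments rather than string prefixes keeps a sibling directory such as `/work/app2` from passing as a child of `/work/app`.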
|
|
701
|
-
|
|
702
|
-
### Troubleshooting: duplicate rows / ghost rendering
|
|
703
|
-
|
|
704
|
-
Some Windows terminals (Git Bash / MINTTY / winpty-wrapped shells)
|
|
705
|
-
don't fully implement the ANSI cursor-up escapes Ink uses to repaint
|
|
706
|
-
the live spinner region. Symptom: spinners, streaming previews, or
|
|
707
|
-
tool-result rows print multiple copies into scrollback instead of
|
|
708
|
-
overwriting in place.
|
|
709
|
-
|
|
710
|
-
If you hit this, run with plain mode:
|
|
711
|
-
|
|
712
|
-
```bash
|
|
713
|
-
REASONIX_UI=plain npx reasonix code
|
|
714
|
-
```
|
|
715
|
-
|
|
716
|
-
Plain mode suppresses live/animated rows and disables the internal
|
|
717
|
-
tick timer. You lose the streaming preview and spinners but gain
|
|
718
|
-
stable scrollback. Windows Terminal, PowerShell 7 in Windows
|
|
719
|
-
Terminal, and WezTerm don't need this opt-out.
|
|
720
|
-
|
|
721
|
-
---
|
|
722
|
-
|
|
723
|
-
## Web search — on by default
|
|
724
|
-
|
|
725
|
-
The model has two web tools the moment you launch: `web_search` and
|
|
726
|
-
`web_fetch`. No flag, no API key, no signup. When you ask about
|
|
727
|
-
something the model wasn't trained on (new releases, current events,
|
|
728
|
-
obscure APIs), it decides to call `web_search` on its own; if a
|
|
729
|
-
snippet isn't enough it follows up with `web_fetch`.
|
|
730
|
-
|
|
731
|
-
Backed by **Mojeek**'s public search page — an independent web
|
|
732
|
-
index, bot-friendly, no cookies/sessions. Coverage on niche or very
|
|
733
|
-
recent queries can be thinner than Google/Bing, but it's reliable
|
|
734
|
-
from scripts. (DDG was the original backend but started serving
|
|
735
|
-
anti-bot pages in 2026.)
|
|
736
|
-
|
|
737
|
-
**Turn it off** (offline mode / privacy / CI):
|
|
738
|
-
|
|
739
|
-
```json
|
|
740
|
-
// ~/.reasonix/config.json
|
|
741
|
-
{ "apiKey": "sk-…", "search": false }
|
|
742
|
-
```
|
|
743
|
-
|
|
744
|
-
```bash
|
|
745
|
-
REASONIX_SEARCH=off npx reasonix code
|
|
746
|
-
```
|
|
747
|
-
|
|
748
|
-
**Bring your own** (Kagi, SearXNG, internal caches): implement the
|
|
749
|
-
`WebSearchProvider` interface and call
|
|
750
|
-
`registerWebTools(registry, { provider })` yourself, or bridge an
|
|
751
|
-
existing MCP search server via `--mcp`.
|
|
752
|
-
|
|
753
|
-
---
|
|
754
|
-
|
|
755
|
-
## MCP — bring your own tools
|
|
756
|
-
|
|
757
|
-
Any [MCP](https://spec.modelcontextprotocol.io/) server works. The
|
|
758
|
-
wizard lets you pick from a catalog, or drive it by flag:
|
|
759
|
-
|
|
760
|
-
```bash
|
|
761
|
-
# stdio (local subprocess)
|
|
762
|
-
npx reasonix --mcp "fs=npx -y @modelcontextprotocol/server-filesystem /tmp/safe"
|
|
763
|
-
|
|
764
|
-
# multiple at once
|
|
765
|
-
npx reasonix \
|
|
766
|
-
--mcp "fs=npx -y @modelcontextprotocol/server-filesystem /tmp/safe" \
|
|
767
|
-
--mcp "demo=npx tsx examples/mcp-server-demo.ts"
|
|
768
|
-
|
|
769
|
-
# HTTP+SSE (remote / hosted)
|
|
770
|
-
npx reasonix --mcp "kb=https://mcp.example.com/sse"
|
|
771
|
-
```
|
|
772
|
-
|
|
773
|
-
`reasonix mcp list` shows the curated catalog. `reasonix mcp inspect
|
|
774
|
-
<spec>` connects once and dumps the server's tools / resources /
|
|
775
|
-
prompts without starting a chat. Progress notifications from
|
|
776
|
-
long-running tools (2025-03-26 spec) render live as a progress bar
|
|
777
|
-
in the spinner.
|
|
778
|
-
|
|
779
|
-
Supported transports: **stdio** (local command) and **HTTP+SSE**
|
|
780
|
-
(remote, MCP 2024-11-05 spec).
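The `--mcp "name=…"` specs shown above can be read as a name plus either a local command or a URL. A minimal sketch of that reading follows; `parseMcpSpec` and the `McpSpec` type are hypothetical names, the URL test is an assumed heuristic, and the whitespace split does not handle quoted arguments.

```ts
type McpSpec =
  | { name: string; transport: "stdio"; command: string; args: string[] }
  | { name: string; transport: "sse"; url: string };

// Split on the first "=": the left side is the server name, the right side
// is either an http(s) URL (HTTP+SSE) or a command line (stdio).
function parseMcpSpec(spec: string): McpSpec {
  const eq = spec.indexOf("=");
  if (eq <= 0) throw new Error(`expected "name=target", got: ${spec}`);
  const name = spec.slice(0, eq).trim();
  const target = spec.slice(eq + 1).trim();
  if (target === "") throw new Error(`missing target for MCP server "${name}"`);
  if (/^https?:\/\//.test(target)) {
    return { name, transport: "sse", url: target };
  }
  // Naive whitespace split; quoted arguments are out of scope for this sketch.
  const parts = target.split(/\s+/);
  return { name, transport: "stdio", command: parts[0] ?? target, args: parts.slice(1) };
}
```

For example, `parseMcpSpec("kb=https://mcp.example.com/sse")` is classified as `sse`, while the `fs=npx -y …` spec above becomes a `stdio` entry with `npx` as the command.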
|
|
781
|
-
|
|
782
|
-
---
|
|
783
|
-
|
|
784
|
-
## CLI reference
|
|
785
|
-
|
|
786
|
-
<details>
|
|
787
|
-
<summary><strong>Commands, flags, env vars</strong> (click to expand)</summary>
|
|
788
|
-
|
|
789
|
-
```bash
|
|
790
|
-
npx reasonix code [path] # coding mode scoped to path (default: cwd)
|
|
791
|
-
npx reasonix # chat (uses saved config)
|
|
792
|
-
npx reasonix setup # reconfigure the wizard
|
|
793
|
-
npx reasonix chat --session work # named session
|
|
794
|
-
npx reasonix chat --no-session # ephemeral
|
|
795
|
-
npx reasonix run "ask anything" # one-shot, streams to stdout
|
|
796
|
-
npx reasonix stats session.jsonl # summarize a transcript
|
|
797
|
-
npx reasonix replay chat.jsonl # rebuild cost/cache from a transcript
|
|
798
|
-
npx reasonix diff a.jsonl b.jsonl --md # compare two transcripts
|
|
799
|
-
npx reasonix mcp list # curated MCP catalog
|
|
800
|
-
npx reasonix mcp inspect <spec> # probe a single MCP server
|
|
801
|
-
npx reasonix sessions # list saved sessions
|
|
802
|
-
```
|
|
803
|
-
|
|
804
|
-
Common flags:
|
|
805
|
-
|
|
806
|
-
```bash
|
|
807
|
-
--preset <auto|flash|pro> # model commitment (auto / locked-flash / locked-pro)
|
|
808
|
-
--model <id> # explicit model id
|
|
809
|
-
--harvest / --no-harvest # R1 plan-state extraction
|
|
810
|
-
--branch <N> # self-consistency budget
|
|
811
|
-
--mcp "name=cmd args…" # attach an MCP server (repeatable)
|
|
812
|
-
--transcript path.jsonl # write a JSONL transcript on the side
|
|
813
|
-
--session <name> # named session (default: per-dir for code mode)
|
|
814
|
-
--no-session # ephemeral
|
|
815
|
-
--no-config # ignore ~/.reasonix/config.json (CI-friendly)
|
|
816
|
-
```
|
|
817
|
-
|
|
818
|
-
Env vars (win over config):
|
|
819
|
-
|
|
820
|
-
```bash
|
|
821
|
-
export DEEPSEEK_API_KEY=sk-...
|
|
822
|
-
export DEEPSEEK_BASE_URL=https://... # optional alternate endpoint
|
|
823
|
-
export REASONIX_MEMORY=off # disable REASONIX.md + user memory
|
|
824
|
-
export REASONIX_SEARCH=off # disable web_search / web_fetch
|
|
825
|
-
export REASONIX_UI=plain # disable live rows (ghosting workaround)
|
|
826
|
-
```
|
|
77
|
+
Cache stability isn't a feature you turn on; it's an invariant the loop is designed around. That's the entire reason Reasonix is DeepSeek-only.
|
|
827
78
|
|
|
828
79
|
</details>
|
|
829
80
|
|
|
830
81
|
---
|
|
831
82
|
|
|
832
|
-
##
|
|
833
|
-
|
|
834
|
-
<details>
|
|
835
|
-
<summary><strong>Programmatic API — embed reasonix in your own Node project</strong> (click to expand)</summary>
|
|
836
|
-
|
|
837
|
-
|
|
838
|
-
```ts
|
|
839
|
-
import {
|
|
840
|
-
CacheFirstLoop,
|
|
841
|
-
DeepSeekClient,
|
|
842
|
-
ImmutablePrefix,
|
|
843
|
-
ToolRegistry,
|
|
844
|
-
} from "reasonix";
|
|
845
|
-
|
|
846
|
-
const client = new DeepSeekClient(); // reads DEEPSEEK_API_KEY from env
|
|
847
|
-
const tools = new ToolRegistry();
|
|
848
|
-
|
|
849
|
-
tools.register({
|
|
850
|
-
name: "add",
|
|
851
|
-
description: "Add two integers",
|
|
852
|
-
parameters: {
|
|
853
|
-
type: "object",
|
|
854
|
-
properties: { a: { type: "integer" }, b: { type: "integer" } },
|
|
855
|
-
required: ["a", "b"],
|
|
856
|
-
},
|
|
857
|
-
fn: ({ a, b }: { a: number; b: number }) => a + b,
|
|
858
|
-
});
|
|
859
|
-
|
|
860
|
-
const loop = new CacheFirstLoop({
|
|
861
|
-
client,
|
|
862
|
-
tools,
|
|
863
|
-
prefix: new ImmutablePrefix({
|
|
864
|
-
system: "You are a math helper.",
|
|
865
|
-
toolSpecs: tools.specs(),
|
|
866
|
-
}),
|
|
867
|
-
harvest: true,
|
|
868
|
-
branch: 3,
|
|
869
|
-
});
|
|
870
|
-
|
|
871
|
-
for await (const ev of loop.step("What is 17 + 25?")) {
|
|
872
|
-
if (ev.role === "assistant_final") console.log(ev.content);
|
|
873
|
-
}
|
|
874
|
-
console.log(loop.stats.summary());
|
|
875
|
-
```
|
|
83
|
+
## What's in the box
|
|
876
84
|
|
|
877
|
-
|
|
878
|
-
|
|
879
|
-
|
|
880
|
-
internals.
|
|
85
|
+
<p align="center">
|
|
86
|
+
<img src="docs/assets/feature-grid.svg" alt="Feature grid — cache-first loop, plan mode, MCP first-class, sessions and dashboard, hooks, memory and skills" width="860"/>
|
|
87
|
+
</p>
|
|
881
88
|
|
|
882
|
-
|
|
89
|
+
Permissions (`allow` / `ask` / `deny`), tool-call repair (flatten · scavenge · truncation · storm), and `/effort` for cheap turns round out the loop. [Architecture →](./docs/ARCHITECTURE.md) · [Dashboard mockup →](https://esengine.github.io/reasonix/design/agent-dashboard.html) · [TUI mockup →](https://esengine.github.io/reasonix/design/agent-tui-terminal.html) · [Website →](https://esengine.github.io/reasonix/)
|
|
883
90
|
|
|
884
91
|
---
|
|
885
92
|
|
|
886
|
-
##
|
|
93
|
+
## Contributing
|
|
887
94
|
|
|
888
|
-
|
|
889
|
-
property — dirt-cheap tokens, R1 reasoning traces, automatic prefix
|
|
890
|
-
caching, JSON mode. Generic wrappers leave these on the table.
|
|
95
|
+
Reasonix is solo-maintained but designed to grow. Scoped starter tickets — each with background, code pointers, acceptance criteria, and hints — live under the [`good first issue`](https://github.com/esengine/reasonix/labels/good%20first%20issue) label. Pick anything open.
|
|
891
96
|
|
|
892
|
-
|
|
893
|
-
|
|
894
|
-
|
|
895
|
-
|
|
896
|
-
| Retry with jittered backoff (429/503) | yes | no (custom callbacks) |
|
|
897
|
-
| Scavenge tool calls leaked into `<think>` | yes | no |
|
|
898
|
-
| Call-storm breaker on identical-arg repeats | yes | no |
|
|
899
|
-
| Live cache-hit / cost / vs-Claude panel | yes | no |
|
|
97
|
+
**Open Discussions** — opinions wanted:
|
|
98
|
+
- [#20 · CLI / TUI design](https://github.com/esengine/reasonix/discussions/20) — what's broken, what's missing, what would you change?
|
|
99
|
+
- [#21 · Dashboard design](https://github.com/esengine/reasonix/discussions/21) — react against the [proposed mockup](https://esengine.github.io/reasonix/design/agent-dashboard.html)
|
|
100
|
+
- [#22 · Future feature wishlist](https://github.com/esengine/reasonix/discussions/22) — what would you build into Reasonix next?
|
|
900
101
|
|
|
901
|
-
|
|
902
|
-
repeats = 48 runs per side, live DeepSeek `deepseek-chat`, sole
|
|
903
|
-
variable prefix stability:
|
|
102
|
+
**Before your first PR**: read [`CONTRIBUTING.md`](./CONTRIBUTING.md) — short, strict project rules (comments, errors, libraries-over-hand-rolled). `tests/comment-policy.test.ts` enforces the comment ones; `npm run verify` is the pre-push gate. By participating you agree to the [Code of Conduct](./CODE_OF_CONDUCT.md). Security issues → [SECURITY.md](./SECURITY.md).
|
|
904
103
|
|
|
905
|
-
|
|
906
|
-
|---|---:|---:|---:|
|
|
907
|
-
| cache hit | 32.8% | **90.2%** | +57.4 pp |
|
|
908
|
-
| cost / task | $0.000992 | $0.000593 | **−40%** |
|
|
909
|
-
| pass rate | 100% (24/24) | **100% (24/24)** | — |
|
|
104
|
+
### Contributors
|
|
910
105
|
|
|
911
|
-
|
|
912
|
-
|
|
913
|
-
|
|
914
|
-
git clone https://github.com/esengine/reasonix.git && cd reasonix && npm install
|
|
915
|
-
npx reasonix replay benchmarks/tau-bench/transcripts/t01_address_happy.reasonix.r1.jsonl
|
|
916
|
-
npx reasonix diff \
|
|
917
|
-
benchmarks/tau-bench/transcripts/t01_address_happy.baseline.r1.jsonl \
|
|
918
|
-
benchmarks/tau-bench/transcripts/t01_address_happy.reasonix.r1.jsonl
|
|
919
|
-
```
|
|
920
|
-
|
|
921
|
-
The committed JSONL transcripts carry per-turn `usage`, `cost`, and
|
|
922
|
-
`prefixHash`. Reasonix's prefix hash stays byte-stable across every
|
|
923
|
-
model call; baseline's churns on every turn. The cache delta is
|
|
924
|
-
*mechanically* attributable to log stability, not to a different
|
|
925
|
-
system prompt.
|
|
926
|
-
|
|
927
|
-
Full 48-run report:
|
|
928
|
-
[`benchmarks/tau-bench/report.md`](./benchmarks/tau-bench/report.md).
|
|
929
|
-
Reproduce with your own API key: `npx tsx
|
|
930
|
-
benchmarks/tau-bench/runner.ts --repeats 3`.
|
|
931
|
-
|
|
932
|
-
MCP reference runs (one single prefix hash across all 5 turns even
|
|
933
|
-
with two concurrent MCP subprocesses):
|
|
934
|
-
|
|
935
|
-
| server | turns | cache hit | cost | vs Claude |
|
|
936
|
-
|---|---:|---:|---:|---:|
|
|
937
|
-
| bundled demo (`add` / `echo` / `get_time`) | 2 | **96.6%** (turn 2) | $0.000254 | −94.0% |
|
|
938
|
-
| official `server-filesystem` | 5 | **96.7%** | $0.001235 | −97.0% |
|
|
939
|
-
| **both concurrently** | 5 | **81.1%** | $0.001852 | −95.9% |
|
|
106
|
+
<a href="https://github.com/esengine/reasonix/graphs/contributors">
|
|
107
|
+
<img src="https://contrib.rocks/image?repo=esengine/reasonix" alt="Contributors to esengine/reasonix"/>
|
|
108
|
+
</a>
|
|
940
109
|
|
|
941
110
|
---
|
|
942
111
|
|
|
943
112
|
## Non-goals
|
|
944
113
|
|
|
945
|
-
- **Multi-
|
|
946
|
-
- **
|
|
947
|
-
|
|
948
|
-
|
|
949
|
-
- **Multi-provider abstraction** (use LiteLLM). Reasonix is
|
|
950
|
-
DeepSeek-only on purpose — every pillar (cache-first loop, R1
|
|
951
|
-
harvesting, tool-call repair) is tuned against DeepSeek-specific
|
|
952
|
-
behavior and economics. Coupling to one backend is the feature.
|
|
953
|
-
- **RAG / vector stores** (use LlamaIndex).
|
|
954
|
-
- **Web UI / SaaS.**
|
|
955
|
-
|
|
956
|
-
Reasonix does DeepSeek, deeply.
|
|
957
|
-
|
|
958
|
-
---
|
|
959
|
-
|
|
960
|
-
## Development
|
|
961
|
-
|
|
962
|
-
```bash
|
|
963
|
-
git clone https://github.com/esengine/reasonix.git
|
|
964
|
-
cd reasonix
|
|
965
|
-
npm install
|
|
966
|
-
npm run dev code # run CLI from source via tsx
|
|
967
|
-
npm run build # tsup to dist/
|
|
968
|
-
npm test # vitest (1482 tests)
|
|
969
|
-
npm run lint # biome
|
|
970
|
-
npm run typecheck # tsc --noEmit
|
|
971
|
-
```
|
|
114
|
+
- **Multi-provider flexibility.** DeepSeek-only on purpose — every layer is tuned around DeepSeek's specific cache mechanic and pricing. Coupling to one backend is the feature.
|
|
115
|
+
- **IDE integration.** Terminal-first; the diff lives in `git diff`, the file tree in `ls`. The dashboard is a companion, not a Cursor replacement.
|
|
116
|
+
- **Hardest-leaderboard reasoning.** Claude Opus still wins some benchmarks. DeepSeek V4 is competitive on coding; if your work is "solve this PhD proof" rather than "fix this auth bug," start with Claude.
|
|
117
|
+
- **Air-gapped / fully-free.** Reasonix needs a paid DeepSeek API key. For air-gapped or zero-cost runs see Aider + Ollama or [Continue](https://continue.dev).
|
|
972
118
|
|
|
973
119
|
---
|
|
974
120
|
|
|
975
121
|
## License
|
|
976
122
|
|
|
977
|
-
MIT
|
|
123
|
+
MIT — see [LICENSE](./LICENSE).
|