reasonix 0.3.0-alpha.6 → 0.3.2

package/README.md CHANGED
@@ -6,20 +6,25 @@
  [![downloads](https://img.shields.io/npm/dm/reasonix.svg)](https://www.npmjs.com/package/reasonix)
  [![node](https://img.shields.io/node/v/reasonix.svg)](./package.json)
 
- **The DeepSeek-native agent framework.** TypeScript. Ink TUI. No LangChain.
-
- Reasonix is not another generic agent wrapper. Every abstraction is justified
- by a DeepSeek-specific property — dirt-cheap tokens, R1 reasoning traces,
- automatic prefix caching, JSON mode. Generic frameworks treat DeepSeek as
- "OpenAI with a different base URL" and leave these advantages on the table.
- Reasonix leans into them.
+ **A DeepSeek-native AI coding assistant in your terminal.** Ink TUI. MCP
+ first-class. No LangChain.
 
  ```bash
- npx reasonix chat # first run prompts for your DeepSeek key
- # inside the TUI, type /help for everything else
+ npx reasonix
  ```
 
- No flag soup. All feature toggles live behind slash commands in the TUI.
+ One command. First run walks you through a 30-second wizard (API key →
+ preset → pick MCP servers from a checklist); every run after that drops
+ straight into chat with your tools wired up. Inside the chat, type `/help`.
+
+ Why bother with yet another agent framework? Because every abstraction
+ here earns its weight against a DeepSeek-specific property — dirt-cheap
+ tokens, R1 reasoning traces, automatic prefix caching, JSON mode.
+ Generic wrappers treat DeepSeek as "OpenAI with a different base URL"
+ and leave these advantages on the table. Reasonix leans into them:
+ on the same τ-bench-lite workload,
+ [**94.4% cache hit, ~40% cheaper tokens, 100% pass rate**](#validated-numbers)
+ vs. a cache-hostile baseline.
 
  ---
 
@@ -27,12 +32,15 @@ No flag soup. All feature toggles live behind slash commands in the TUI.
 
  | Feature | How it works | Opt in |
  |---|---|---|
+ | **Setup wizard** | First run of `npx reasonix`: pick preset, multi-select MCP servers from a curated catalog, saved to config so the next run just launches chat | always on (first run) |
+ | **MCP (stdio + SSE)** | Multi-server bridge — every MCP tool inherits Cache-First + repair + context-safety automatically. `reasonix mcp list` shows the catalog | always on |
  | **Cache-First Loop** | Immutable prefix + append-only log = prefix byte-stable across turns → DeepSeek's automatic prefix cache hits at 70–95% | always on |
- | **R1 Thought Harvesting** | Parses `reasoning_content` into typed `{ subgoals, hypotheses, uncertainties, rejectedPaths }` via a cheap V3 call | `--harvest` |
- | **Self-Consistency Branching** | Runs N parallel samples at spread temperatures; picks the one with the fewest flagged uncertainties | `--branch <N>` |
+ | **Context safety net** | Tool results capped at 32k chars · oversized sessions auto-heal on load · `/compact` to shrink further · ctx gauge in the status bar · Esc to abort exploration and get a forced summary | always on |
+ | **R1 Thought Harvesting** | Parses `reasoning_content` into typed `{ subgoals, hypotheses, uncertainties, rejectedPaths }` via a cheap V3 call | `/preset smart` |
+ | **Self-Consistency Branching** | Runs N parallel samples at spread temperatures; picks the one with the fewest flagged uncertainties | `/preset max` / `/branch N` |
  | **Tool-Call Repair** | Auto-flattens deep/wide schemas, scavenges tool calls leaked into `<think>`, repairs truncated JSON, breaks call-storms | always on |
  | **Retry layer** | Exponential backoff + jitter on 408/429/500/502/503/504 and network errors. 4xx auth errors don't retry | always on |
- | **Ink TUI** | Live cache-hit / cost panel. Streams R1 thinking to a compact preview. Renders Markdown (bold / lists / code / stripped LaTeX) | always on |
+ | **Ink TUI** | Live cache-hit / cost / context panel. Streams R1 thinking to a compact preview. Renders Markdown (bold / lists / code / stripped LaTeX) | always on |
 
  ---
 
@@ -91,10 +99,12 @@ with your own API key: `npx tsx benchmarks/tau-bench/runner.ts --repeats 3`.
 
  [r]: ./benchmarks/tau-bench/report.md
 
- ### Extends to MCP (v0.3-alpha)
+ ### MCP works out of the box
 
  Any [MCP](https://spec.modelcontextprotocol.io/) server's tools inherit
- the same Cache-First benefits. Two live runs, two data points:
+ Cache-First + repair + context-safety automatically. The wizard (`npx
+ reasonix`) lets you multi-select from a curated catalog — no flags, no
+ JSON-by-hand. Three live reference runs:
 
  | server | turns | tool calls | cache hit | cost | vs Claude |
  |---|---:|---:|---:|---:|---:|
@@ -103,40 +113,21 @@ the same Cache-First benefits. Two live runs, two data points:
  | **both concurrently** (`demo_add` + `fs_write_file`) | 5 | 4 | **81.1%** | $0.001852 | −95.9% |
 
  The third row is the ecosystem proof: two MCP servers running as
- separate subprocesses, tools from both exercised in one conversation
- (compute `17+25` with the demo server, write the result to a real file
- via the filesystem server). **One single prefix hash across all 5
- turns** — byte-stability survives concurrent MCP subprocesses.
+ separate subprocesses, tools from both exercised in one conversation.
+ **One single prefix hash across all 5 turns**: byte-stability survives
+ concurrent MCP subprocesses.
 
- **Reproduce without an API key** (replay the committed transcripts):
+ Reproduce without an API key (replay the committed transcripts):
 
  ```bash
  npx reasonix replay benchmarks/tau-bench/transcripts/mcp-demo.add.jsonl
  npx reasonix replay benchmarks/tau-bench/transcripts/mcp-filesystem.jsonl
  ```
 
- **Reproduce with your own key** (live, ~$0.002):
-
- ```bash
- # Don't know what MCP servers exist? Start here:
- reasonix mcp list
- # Prints a curated catalog (filesystem, fetch, github, sqlite, …) with
- # ready-to-paste --mcp commands.
-
- # One server:
- reasonix chat --mcp "filesystem=npx -y @modelcontextprotocol/server-filesystem /tmp/safe"
-
- # Multiple servers at once — each gets its own namespace prefix:
- reasonix chat \
- --mcp "fs=npx -y @modelcontextprotocol/server-filesystem /tmp/safe" \
- --mcp "mem=npx -y @modelcontextprotocol/server-memory"
- # Tools land in a shared registry as fs_read_file, mem_set, etc.
-
- # Remote / hosted MCP server — pass an http(s) URL instead of a command.
- # Reasonix opens an SSE stream and POSTs JSON-RPC to the endpoint the
- # server advertises (MCP 2024-11-05 HTTP+SSE transport).
- reasonix chat --mcp "kb=https://mcp.example.com/sse"
- ```
+ Supported transports: **stdio** (local `npx` or binary) and **HTTP+SSE**
+ (remote / hosted servers, MCP 2024-11-05 spec). Pass an `http(s)://`
+ URL to `--mcp` and Reasonix opens the SSE stream and POSTs JSON-RPC
+ to the endpoint the server advertises.
 
  [mcp]: ./benchmarks/tau-bench/transcripts/mcp-demo.add.jsonl
 
@@ -144,55 +135,66 @@ reasonix chat --mcp "kb=https://mcp.example.com/sse"
  ## Usage
 
- ### CLI
+ ### One command
 
  ```bash
- npx reasonix chat # auto-saves to session 'default'; resumes next time
- npx reasonix chat --session work # use a different named session
- npx reasonix chat --no-session # ephemeral — nothing persisted
- npx reasonix run "ask anything" # one-shot, streams to stdout
- npx reasonix stats session.jsonl # quick summary of a transcript
- npx reasonix replay chat.jsonl # pretty-print a transcript + rebuild cost/cache offline
- npx reasonix diff a.jsonl b.jsonl --md diff.md # compare two transcripts: cache/cost delta + first divergence
+ npx reasonix
  ```
 
- Sessions live as JSONL under `~/.reasonix/sessions/<name>.jsonl` every
- turn's message log is appended atomically, so killing the CLI never loses
- context. Inside the TUI: `/sessions` to list, `/forget` to delete the
- current one.
+ First run: a wizard asks for your API key, lets you pick a preset
+ (fast / smart / max), then offers a multi-select checklist of MCP
+ servers: filesystem, memory, github, puppeteer, and more. Your choices
+ are saved to `~/.reasonix/config.json`. Subsequent runs drop straight
+ into chat.
 
- ### Inside the chat — slash commands
+ ### Inside the chat
 
- A command strip runs under the input box so you don't have to memorize
- anything. Type `/help` for the full list. The biggest shortcut:
+ A status bar at the top shows cache hit %, cost, Claude-equivalent, and
+ the **context gauge** (`ctx 42k/131k (32%)`; yellow at 50%, red plus a
+ `/compact` nudge at 80%). A command strip under the input lists the
+ slash commands:
 
  ```
- /preset fast deepseek-chat, no harvest, no branch (default)
- /preset smart reasoner + harvest (~10x cost)
- /preset max reasoner + harvest + branch 3 (~30x cost, slowest)
+ /help full list + hints
+ /preset <fast|smart|max> one-tap bundles (model + harvest + branch)
+ /mcp list attached MCP servers and tools
+ /compact [cap] shrink oversized tool results in history
+ /sessions · /forget list / delete saved sessions
+ /setup reconfigure (exits and tells you to run `reasonix setup`)
+ /clear · /exit
  ```
 
- One-tap switch between fast daily driver, careful thinker, and max-quality
- self-consistency. Individual knobs are available too:
+ **Esc while thinking** aborts the current exploration and forces the
+ model to summarize what it already found. No more "model ran 24 tool
+ calls and gave up" — you get an answer every time.
 
- ```
- /status show current model / harvest / branch / stream
- /model <id> deepseek-chat or deepseek-reasoner
- /harvest [on|off] Pillar 2 parse R1 reasoning into typed plan state
- /branch <N|off> run N parallel samples per turn, pick most confident
- /clear clear displayed history (log is kept)
- /exit quit
- ```
+ Sessions live as JSONL under `~/.reasonix/sessions/<name>.jsonl` —
+ every message appended atomically, so killing the CLI never loses
+ context. Oversized tool results auto-heal on load, so poisoning a
+ session with one giant `read_file` doesn't brick your history.
 
- The top panel shows active flags live: `· harvest · branch3` appear next to
- the model once enabled.
+ ### Advanced CLI subcommands and flags
 
- ### Flags (for automation / CI)
+ ```bash
+ npx reasonix setup # reconfigure any time
+ npx reasonix chat --session work # a different named session
+ npx reasonix chat --no-session # ephemeral — nothing persisted
+ npx reasonix run "ask anything" # one-shot, streams to stdout
+ npx reasonix stats session.jsonl # summarize a transcript
+ npx reasonix replay chat.jsonl # scrub a transcript + rebuild cost/cache
+ npx reasonix diff a.jsonl b.jsonl --md # compare two transcripts
+ npx reasonix mcp list # curated MCP server catalog
+ ```
 
- The same knobs are also available as CLI flags if you're scripting:
+ Power users can still bypass config and drive Reasonix with flags:
 
  ```bash
- npx reasonix chat -m deepseek-reasoner --harvest --branch 3 --transcript session.jsonl
+ npx reasonix chat \
+ --preset max \
+ --mcp "filesystem=npx -y @modelcontextprotocol/server-filesystem /tmp/safe" \
+ --mcp "kb=https://mcp.example.com/sse" \
+ --transcript session.jsonl \
+ --no-config # ignore ~/.reasonix/config.json (for CI / reproducing issues)
  ```
 
  ### Library
@@ -238,16 +240,19 @@ console.log(loop.stats.summary());
 
  ### Configuration
 
- On first run the CLI prompts for your DeepSeek API key and saves it to
- `~/.reasonix/config.json`. Alternatives:
+ The wizard handles everything on first run. If you'd rather use env vars
+ (CI, shared boxes, etc.):
 
  ```bash
- export DEEPSEEK_API_KEY=sk-... # env var (wins over config file)
+ export DEEPSEEK_API_KEY=sk-... # wins over ~/.reasonix/config.json
  export DEEPSEEK_BASE_URL=https://... # optional alternate endpoint
  ```
 
  Get a key (free credit on signup): <https://platform.deepseek.com/api_keys>
 
+ Re-run `npx reasonix setup` any time to add/remove MCP servers or switch
+ preset — your existing selections are pre-checked.
+
 
  ---
 
@@ -269,7 +274,7 @@ cd reasonix
  npm install
  npm run dev chat # run CLI from source via tsx
  npm run build # tsup to dist/
- npm test # vitest (89 tests)
+ npm test # vitest (279 tests)
  npm run lint # biome
  npm run typecheck # tsc --noEmit
  ```