reasonix 0.0.3 → 0.0.5

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/README.md CHANGED
@@ -1,66 +1,128 @@
  # Reasonix
 
+ [![npm version](https://img.shields.io/npm/v/reasonix.svg)](https://www.npmjs.com/package/reasonix)
+ [![CI](https://github.com/esengine/reasonix/actions/workflows/ci.yml/badge.svg)](https://github.com/esengine/reasonix/actions/workflows/ci.yml)
+ [![license](https://img.shields.io/npm/l/reasonix.svg)](./LICENSE)
+ [![downloads](https://img.shields.io/npm/dm/reasonix.svg)](https://www.npmjs.com/package/reasonix)
+ [![node](https://img.shields.io/node/v/reasonix.svg)](./package.json)
+
  **The DeepSeek-native agent framework.** TypeScript. Ink TUI. No LangChain.
 
- Reasonix is not another generic agent framework. It does one thing: take DeepSeek's
- unusual economic and behavioral profile — dirt-cheap tokens, R1 reasoning traces,
- automatic prefix caching and turn them into agent-loop superpowers that generic
- frameworks leave on the table.
+ Reasonix is not another generic agent wrapper. Every abstraction is justified
+ by a DeepSeek-specific property — dirt-cheap tokens, R1 reasoning traces,
+ automatic prefix caching, JSON mode. Generic frameworks treat DeepSeek as
+ "OpenAI with a different base URL" and leave these advantages on the table.
+ Reasonix leans into them.
 
  ```bash
- npx reasonix chat # prompts for your DeepSeek key on first run,
- # then live TUI with real-time cache/cost panel
+ npx reasonix chat # first run prompts for your DeepSeek key
+ # inside the TUI, type /help for everything else
  ```
 
- On first run the TUI asks for your DeepSeek API key (get one at
- [platform.deepseek.com/api_keys](https://platform.deepseek.com/api_keys)) and
- saves it to `~/.reasonix/config.json`. Set `DEEPSEEK_API_KEY` in the
- environment to override.
+ No flag soup. All feature toggles live behind slash commands in the TUI.
 
- ## Why Reasonix?
+ ---
 
- Every other framework treats DeepSeek as an OpenAI-compatible endpoint with a
- different base URL. That works, but it leaves most of DeepSeek's advantages
- unused. Reasonix is opinionated about three things:
+ ## What you get
 
- ### 1. Cache-First Loop
- DeepSeek bills cached input tokens at **~10% of the miss rate**. Reasonix
- structures the agent loop as `[Immutable Prefix] + [Append-Only Log] +
- [Volatile Scratch]` so every turn reuses the exact byte prefix.
+ | Feature | How it works | Opt in |
+ |---|---|---|
+ | **Cache-First Loop** | Immutable prefix + append-only log = prefix byte-stable across turns → DeepSeek's automatic prefix cache hits at 70–95% | always on |
+ | **R1 Thought Harvesting** | Parses `reasoning_content` into typed `{ subgoals, hypotheses, uncertainties, rejectedPaths }` via a cheap V3 call | `--harvest` |
+ | **Self-Consistency Branching** | Runs N parallel samples at spread temperatures; picks the one with the fewest flagged uncertainties | `--branch <N>` |
+ | **Tool-Call Repair** | Auto-flattens deep/wide schemas, scavenges tool calls leaked into `<think>`, repairs truncated JSON, breaks call-storms | always on |
+ | **Retry layer** | Exponential backoff + jitter on 408/429/500/502/503/504 and network errors. 4xx auth errors don't retry | always on |
+ | **Ink TUI** | Live cache-hit / cost panel. Streams R1 thinking to a compact preview. Renders Markdown (bold / lists / code / stripped LaTeX) | always on |
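The Cache-First Loop layout (immutable prefix + append-only log + volatile scratch) can be sketched in a few lines. `buildMessages` below is a hypothetical illustration of the byte-stability invariant, not Reasonix's actual API:

```typescript
// Hypothetical sketch of a cache-first message layout: system prompt and tool
// specs form an immutable prefix, history is append-only, and only the new
// user turn changes. Because earlier bytes never change, DeepSeek's automatic
// prefix cache can re-serve them on every subsequent turn.
type Msg = { role: "system" | "user" | "assistant"; content: string };

function buildMessages(prefix: Msg[], log: Msg[], scratch: Msg): Msg[] {
  // Never mutate prefix or log in place; byte stability is the whole point.
  return [...prefix, ...log, scratch];
}

const prefix: Msg[] = [{ role: "system", content: "You are a math helper." }];
const log: Msg[] = [];

const turn1 = buildMessages(prefix, log, { role: "user", content: "hi" });
log.push(turn1[turn1.length - 1]); // append the finished turn to the log

const turn2 = buildMessages(prefix, log, { role: "user", content: "2+2?" });
// turn2 starts with exactly the same bytes as turn1, so the prefix is a cache hit.
console.log(JSON.stringify(turn2[0]) === JSON.stringify(turn1[0])); // true
```

Any loop that rebuilds or reorders the early messages per turn (as prompt-templating frameworks often do) silently forfeits the cached-token discount.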
 
- **Validated on real DeepSeek API (`deepseek-chat`):**
+ ---
 
- | scenario | turns | cache hit | cost | cost on Claude Sonnet 4.6 | savings |
- |---|---|---|---|---|---|
- | Chinese multi-turn chat | 5 | **85.2%** | $0.000923 | $0.015174 | **93.9%** |
- | Tool-use (calculator) | 2 | **94.9%** | $0.000142 | $0.003351 | **95.8%** |
+ ## Why not just use LangChain?
 
- ### 2. R1 Thought Harvesting
- R1's `reasoning_content` contains a *plan*, not just trivia to display. Reasonix
- pipes it through a cheap V3 call (~$0.0001 / turn) in JSON mode and extracts
- a typed plan state:
+ Even on the default `fast` preset (no harvest, no branching), Reasonix bakes
+ in seven DeepSeek-specific defaults that generic agent frameworks leave to you:
 
- ```ts
- { subgoals: string[], hypotheses: string[], uncertainties: string[], rejectedPaths: string[] }
- ```
+ | | Reasonix default | generic frameworks |
+ |---|---|---|
+ | Prefix-stable loop (→ 85–95% cache hit) | ✅ | ❌ prompts rebuilt each turn |
+ | Auto-flatten deep tool schemas | ✅ | ❌ DeepSeek drops args |
+ | Retry with jittered backoff (429/503) | ✅ | ❌ custom callbacks |
+ | Scavenge tool calls leaked into `<think>` | ✅ | ❌ |
+ | Call-storm breaker on identical-arg repeats | ✅ | ❌ |
+ | Live cache-hit / cost / vs-Claude panel | ✅ | ❌ |
+ | First-run config prompt + Markdown TUI | ✅ | ❌ |
+
+ Harvest and self-consistency branching are bonuses on top. The everyday
+ win is that **a plain chat with Reasonix already pays for ~40% fewer tokens
+ than the same chat through a naive LangChain setup**, because the prefix
+ actually stays byte-stable.
+
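The retry row describes a standard pattern (jittered exponential backoff on 408/429/5xx and network errors, no retry on auth-style 4xx). A generic sketch of that pattern, assuming errors carry an optional `status` field, and not Reasonix's internal code:

```typescript
// Generic retry-with-jittered-backoff sketch. Status codes mirror the list in
// the table: 408/429/5xx retry, other 4xx do not; errors with no status are
// treated as network failures and retried.
const RETRYABLE = new Set([408, 429, 500, 502, 503, 504]);

async function withRetry<T>(
  call: () => Promise<T>,
  maxAttempts = 4,
  baseMs = 250,
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await call();
    } catch (err) {
      const status = (err as { status?: number }).status;
      const retryable = status === undefined || RETRYABLE.has(status);
      if (!retryable || attempt + 1 >= maxAttempts) throw err;
      // Full jitter: sleep a random amount up to the exponential cap.
      const cap = baseMs * 2 ** attempt;
      await new Promise((r) => setTimeout(r, Math.random() * cap));
    }
  }
}
```

Full jitter (random delay up to the exponential cap) spreads concurrent retries apart, which matters once several agent turns hit a 429 at the same moment.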
+ ## Validated numbers
 
- Opt-in to keep default cost identical: `reasonix chat --harvest` or
- `new CacheFirstLoop({ harvest: true })`. The TUI renders the harvested state
- as a compact magenta block above the answer.
+ Measured on live DeepSeek API:
 
- ### 3. Tool-Call Repair
- R1/V3 have known quirks — tool calls leaking into `<think>`, dropped arguments
- on deep schemas, truncated JSON, call-storm loops. Reasonix ships a full repair
- pipeline: **scavenge + flatten + truncation recovery + storm breaker**.
+ | scenario | model | turns | cache hit | cost | Claude 4.6 would be | savings |
+ |---|---|---|---|---|---|---|
+ | Chinese multi-turn chat | `deepseek-chat` | 5 | **85.2%** | $0.000923 | $0.015174 | **93.9%** |
+ | Tool-use (calculator) | `deepseek-chat` | 2 | **94.9%** | $0.000142 | $0.003351 | **95.8%** |
+ | R1 math + harvest | `deepseek-reasoner` | 1 | 72.7% | $0.006478 | $0.044484 | 85.4% |
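The savings column is simply `1 - deepseekCost / claudeCost`; recomputing it from the table's own cost columns confirms the percentages:

```typescript
// Recompute the savings column from the two cost columns in the table above.
const savings = (deepseek: number, claude: number) =>
  (1 - deepseek / claude) * 100;

console.log(savings(0.000923, 0.015174).toFixed(1)); // "93.9"
console.log(savings(0.000142, 0.003351).toFixed(1)); // "95.8"
console.log(savings(0.006478, 0.044484).toFixed(1)); // "85.4"
```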
+
+ ---
 
  ## Usage
 
+ ### CLI
+
+ ```bash
+ npx reasonix chat                # just chat — everything else is inside
+ npx reasonix run "ask anything"  # one-shot, streams to stdout
+ npx reasonix stats session.jsonl # read back a saved transcript
+ ```
+
+ ### Inside the chat — slash commands
+
+ A command strip runs under the input box so you don't have to memorize
+ anything. Type `/help` for the full list. The biggest shortcut:
+
+ ```
+ /preset fast    deepseek-chat, no harvest, no branch (default)
+ /preset smart   reasoner + harvest (~10x cost)
+ /preset max     reasoner + harvest + branch 3 (~30x cost, slowest)
+ ```
+
+ One-tap switch between fast daily driver, careful thinker, and max-quality
+ self-consistency. Individual knobs are available too:
+
+ ```
+ /status            show current model / harvest / branch / stream
+ /model <id>        deepseek-chat or deepseek-reasoner
+ /harvest [on|off]  Pillar 2 — parse R1 reasoning into typed plan state
+ /branch <N|off>    run N parallel samples per turn, pick most confident
+ /clear             clear displayed history (log is kept)
+ /exit              quit
+ ```
+
+ The top panel shows active flags live: `· harvest · branch3` appears next to
+ the model once enabled.
+
+ ### Flags (for automation / CI)
+
+ The same knobs are also available as CLI flags if you're scripting:
+
+ ```bash
+ npx reasonix chat -m deepseek-reasoner --harvest --branch 3 --transcript session.jsonl
+ ```
+
  ### Library
 
  ```ts
- import { CacheFirstLoop, DeepSeekClient, ImmutablePrefix, ToolRegistry } from "reasonix";
-
- const client = new DeepSeekClient();
+ import {
+   CacheFirstLoop,
+   DeepSeekClient,
+   ImmutablePrefix,
+   ToolRegistry,
+ } from "reasonix";
+
+ const client = new DeepSeekClient(); // reads DEEPSEEK_API_KEY from env
  const tools = new ToolRegistry();
 
  tools.register({
@@ -71,55 +133,68 @@ tools.register({
      properties: { a: { type: "integer" }, b: { type: "integer" } },
      required: ["a", "b"],
    },
-   fn: ({ a, b }) => a + b,
+   fn: ({ a, b }: { a: number; b: number }) => a + b,
  });
 
  const loop = new CacheFirstLoop({
    client,
+   tools,
    prefix: new ImmutablePrefix({
      system: "You are a math helper.",
      toolSpecs: tools.specs(),
    }),
-   tools,
+   harvest: true,
+   branch: 3, // self-consistency budget
  });
 
  for await (const ev of loop.step("What is 17 + 25?")) {
-   console.log(ev);
+   if (ev.role === "assistant_final") console.log(ev.content);
  }
  console.log(loop.stats.summary());
  ```
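The `branch: 3` selection rule ("run N samples, keep the one with the fewest flagged uncertainties") can be illustrated on its own. The `BranchSample` shape and `pickMostConfident` name below are hypothetical, not Reasonix exports:

```typescript
// Standalone illustration of the self-consistency rule described above:
// given N branch samples, each with a harvested plan state, keep the answer
// whose harvest flagged the fewest uncertainties. Hypothetical shapes only.
interface BranchSample {
  answer: string;
  uncertainties: string[]; // from the harvested plan state
}

function pickMostConfident(samples: BranchSample[]): BranchSample {
  return samples.reduce((best, s) =>
    s.uncertainties.length < best.uncertainties.length ? s : best,
  );
}

const winner = pickMostConfident([
  { answer: "42", uncertainties: ["units unclear"] },
  { answer: "42", uncertainties: [] },
  { answer: "41", uncertainties: ["rounding", "sign"] },
]);
console.log(winner.answer); // "42"
```

Counting flagged uncertainties is a cheap proxy for agreement: it avoids comparing free-form answers directly, at the cost of trusting the harvest step's output.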
 
- ### CLI / TUI
+ ### Configuration
+
+ On first run the CLI prompts for your DeepSeek API key and saves it to
+ `~/.reasonix/config.json`. Alternatives:
 
  ```bash
- reasonix chat # full-screen Ink TUI, live cache/cost panel
- reasonix run "task" # one-shot, streaming output
- reasonix stats <file> # summarize transcript JSONL
- reasonix version
+ export DEEPSEEK_API_KEY=sk-... # env var (wins over config file)
+ export DEEPSEEK_BASE_URL=https://... # optional alternate endpoint
  ```
 
- ## Status
+ Get a key (free credit on signup): <https://platform.deepseek.com/api_keys>
 
- Pre-alpha. All three pillars ship working end-to-end as of v0.0.3.
- See [docs/ARCHITECTURE.md](docs/ARCHITECTURE.md).
+ ---
 
  ## Non-goals
 
- - Multi-agent orchestration (use LangGraph if you need it).
- - RAG / vector stores.
- - Multi-provider abstraction. **Reasonix does DeepSeek, deeply.**
+ - Multi-agent orchestration (use LangGraph).
+ - RAG / vector stores (use LlamaIndex or do it yourself).
+ - Multi-provider abstraction (use LiteLLM).
  - Web UI / SaaS.
 
+ Reasonix does DeepSeek, deeply.
+
+ ---
+
  ## Development
 
  ```bash
+ git clone https://github.com/esengine/reasonix.git
+ cd reasonix
  npm install
- npm run dev chat # run CLI directly from TS (tsx)
- npm run build # bundle to dist/
- npm test # vitest
- npm run lint # biome
+ npm run dev chat  # run CLI from source via tsx
+ npm run build     # tsup to dist/
+ npm test          # vitest (89 tests)
+ npm run lint      # biome
+ npm run typecheck # tsc --noEmit
  ```
 
+ See [docs/ARCHITECTURE.md](docs/ARCHITECTURE.md) for internals.
+
+ ---
+
  ## License
 
  MIT