reasonix 0.2.2 → 0.3.0-alpha.2

package/README.md CHANGED
@@ -91,6 +91,39 @@ with your own API key: `npx tsx benchmarks/tau-bench/runner.ts --repeats 3`.
 
  [r]: ./benchmarks/tau-bench/report.md
 
+ ### Extends to MCP (v0.3-alpha)
+
+ Any [MCP](https://spec.modelcontextprotocol.io/) server's tools inherit
+ the same Cache-First benefits. Two live runs, two data points:
+
+ | server | turns | tool calls | cache hit | cost | vs Claude |
+ |---|---:|---:|---:|---:|---:|
+ | bundled demo (`add` / `echo` / `get_time`) | 2 | 1 | **96.6%** (turn 2) | $0.000254 | −94.0% |
+ | official `@modelcontextprotocol/server-filesystem` | 5 | 4 | **96.7%** overall | $0.001235 | −97.0% |
+
+ The second run is the interesting one: it goes through an *external*,
+ production MCP server (no code we control). Its five turns include
+ `list_directory`, a permission-denied recovery via
+ `list_allowed_directories`, a successful retry, and `read_text_file`.
+ The byte-stable prefix held across every turn; the overall cache hit rate was 96.7%.
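The two properties claimed above can be sketched roughly as follows. This is an illustrative sketch only: `TurnUsage`, its field names, and both helper functions are hypothetical and are not taken from Reasonix. It assumes "overall" means a token-weighted ratio across turns, and it takes one plausible reading of "byte-stable prefix", namely that each turn's prompt begins with the previous turn's prompt unchanged.

```typescript
// Hypothetical per-turn usage record; these field names are illustrative
// and are not taken from Reasonix's API.
interface TurnUsage {
  cacheHitTokens: number;  // prompt tokens served from the provider's cache
  cacheMissTokens: number; // prompt tokens that had to be processed fresh
}

// Overall hit rate: cached prompt tokens divided by all prompt tokens,
// summed over every turn (not an average of per-turn percentages).
function overallCacheHitRate(turns: TurnUsage[]): number {
  let hit = 0;
  let total = 0;
  for (const t of turns) {
    hit += t.cacheHitTokens;
    total += t.cacheHitTokens + t.cacheMissTokens;
  }
  return total === 0 ? 0 : hit / total;
}

// One reading of "byte-stable prefix": each turn's prompt begins with the
// previous turn's prompt, unchanged. This compares UTF-16 code units, which
// is a simplification of a true byte-level comparison.
function isAppendOnly(prompts: string[]): boolean {
  for (let i = 1; i < prompts.length; i++) {
    const prev = prompts[i - 1];
    if (prompts[i].slice(0, prev.length) !== prev) return false;
  }
  return true;
}
```

Under this reading, a first turn with no cache hits still counts against the overall figure, which may be why the demo row reports its rate for turn 2 only.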
+
+ **Reproduce without an API key** (replay the committed transcripts):
+
+ ```bash
+ npx reasonix replay benchmarks/tau-bench/transcripts/mcp-demo.add.jsonl
+ npx reasonix replay benchmarks/tau-bench/transcripts/mcp-filesystem.jsonl
+ ```
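For readers who want to inspect a transcript by hand, the `.jsonl` extension suggests the JSON Lines layout: one JSON value per line. The record schema is not shown in this diff, so the sketch below treats each record as opaque; `parseJsonl` is a hypothetical helper, not part of Reasonix.

```typescript
// Parse JSON Lines text: one JSON value per non-empty line.
// Records are returned as `unknown` because the transcript schema is not
// shown here; callers would narrow the type themselves.
function parseJsonl(text: string): unknown[] {
  const records: unknown[] = [];
  const lines = text.split(/\r?\n/);
  for (let i = 0; i < lines.length; i++) {
    const line = lines[i].trim();
    if (line === "") continue; // tolerate blank lines and a trailing newline
    try {
      records.push(JSON.parse(line));
    } catch {
      // Report the 1-based line number so a damaged transcript is easy to locate.
      throw new Error(`line ${i + 1}: invalid JSON`);
    }
  }
  return records;
}
```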
+
+ **Reproduce with your own key** (live, ~$0.002):
+
+ ```bash
+ reasonix chat --mcp "node --import tsx examples/mcp-server-demo.ts"
+ # or against the real filesystem server:
+ reasonix chat --mcp "npx -y @modelcontextprotocol/server-filesystem /path/to/safe/dir"
+ ```
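As background on what travels between a client and a server launched this way: MCP is built on JSON-RPC 2.0, and the spec defines `tools/list` and `tools/call` methods. The sketch below only builds those two request objects. It omits the `initialize` handshake and all transport handling, the helper names are hypothetical, and the `a` / `b` argument names for the demo `add` tool are assumptions, since its schema is not shown here.

```typescript
// Minimal JSON-RPC 2.0 request shape, as used by MCP.
interface JsonRpcRequest {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params?: Record<string, unknown>;
}

// Ask a server which tools it exposes.
function listToolsRequest(id: number): JsonRpcRequest {
  return { jsonrpc: "2.0", id, method: "tools/list" };
}

// Invoke one tool by name with JSON arguments.
function callToolRequest(
  id: number,
  name: string,
  args: Record<string, unknown>,
): JsonRpcRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

// Example: calling the demo server's `add` tool.
// The argument names `a` and `b` are assumed for illustration.
const exampleCall = callToolRequest(2, "add", { a: 2, b: 3 });
```

On the stdio transport, each such message would be serialized with `JSON.stringify` and written as a single line.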
+
+ [mcp]: ./benchmarks/tau-bench/transcripts/mcp-demo.add.jsonl
+
  ---
 
  ## Usage