@unerr-ai/unerr 0.2.7 → 0.2.9
- package/README.md +90 -63
- package/dist/cli.js +1816 -2354
- package/package.json +10 -22
- package/scripts/postinstall.mjs +0 -312
package/README.md
CHANGED
|
@@ -1,15 +1,18 @@
|
|
|
1
|
-
<
|
|
2
|
-
<a href="https://www.unerr.dev/"><img src="https://unerr.dev/icon-wordmark.svg" alt="unerr
|
|
3
|
-
</
|
|
1
|
+
<h1 align="center">
|
|
2
|
+
<a href="https://www.unerr.dev/"><img src="https://unerr.dev/icon-wordmark.svg" alt="unerr" width="320" /></a>
|
|
3
|
+
</h1>
|
|
4
4
|
|
|
5
5
|
<p align="center">
|
|
6
|
-
<strong>Your AI agent has read your codebase. It
|
|
6
|
+
<strong>Your AI agent has read your codebase. It still can't safely change it.</strong>
|
|
7
7
|
</p>
|
|
8
8
|
|
|
9
9
|
<p align="center">
|
|
10
|
-
|
|
11
|
-
|
|
12
|
-
|
|
10
|
+
Every tool built to help hands your agent <em>advice it can ignore</em> — a memory it has to remember to check,<br/>
|
|
11
|
+
a graph it has to choose to query, a reviewer that only speaks up after the break is already written.<br/>
|
|
12
|
+
<strong>unerr is the guardrail it can't skip.</strong> The moment your agent edits a function, unerr puts the live call graph<br/>
|
|
13
|
+
and the rule you pinned to that exact function <em>into the edit itself</em> — automatically, not on request — and re-anchors<br/>
|
|
14
|
+
that rule when the code moves, so it never goes quietly stale. The 24 callers and the standard it's about to break are<br/>
|
|
15
|
+
on screen <em>before</em> the function changes. Every time. Whether or not the agent thought to ask.
|
|
13
16
|
</p>
|
|
14
17
|
|
|
15
18
|
<p align="center">
|
|
@@ -31,39 +34,59 @@
|
|
|
31
34
|
<sub>Zero configuration. Install, restart your IDE, and the next prompt already knows your repo.</sub>
|
|
32
35
|
</p>
|
|
33
36
|
|
|
34
|
-
|
|
35
|
-
|
|
36
|
-
|
|
37
|
-
</
|
|
37
|
+
---
|
|
38
|
+
|
|
39
|
+
<details>
|
|
40
|
+
<summary><strong>Contents</strong></summary>
|
|
41
|
+
|
|
42
|
+
- [The gap nobody else closes](#the-gap-nobody-else-closes)
|
|
43
|
+
- [The pains this fixes](#the-pains-this-fixes)
|
|
44
|
+
- [What changes when you install it](#what-changes-when-you-install-it)
|
|
45
|
+
- [See it in action](#see-it-in-action)
|
|
46
|
+
- [Quick Start](#quick-start)
|
|
47
|
+
- [Who it's for](#who-its-for)
|
|
48
|
+
- [Why a guardrail has to be one runtime, not five tools](#why-a-guardrail-has-to-be-one-runtime-not-five-tools)
|
|
49
|
+
- [How the runtime works](#how-the-runtime-works)
|
|
50
|
+
- [Fewer tokens, as a side effect](#fewer-tokens-as-a-side-effect)
|
|
51
|
+
- [License](#license)
|
|
52
|
+
|
|
53
|
+
</details>
|
|
38
54
|
|
|
39
55
|
---
|
|
40
56
|
|
|
41
|
-
## The
|
|
57
|
+
## The gap nobody else closes
|
|
42
58
|
|
|
43
|
-
|
|
59
|
+
On a small or greenfield project the agent holds the whole repo in its head and reading the live code is enough — you don't need us. The wall is the *large, existing, multi-contributor* codebase, and it's the same wall every time: the agent can't fit the whole thing in context, so it acts on the slice it can see and never reads the rest. It changes a signature and breaks 7 of 24 callers it never read. It writes a fourth copy of a pattern your team standardized months ago — even with the rule spelled out in `.cursorrules`. Neither shows up as an error. They show up as a senior engineer's afternoon.
|
|
44
60
|
|
|
45
|
-
|
|
61
|
+
The knowledge that would have stopped it — who calls this function, which pattern is load-bearing — already exists. The whole market is built on getting that knowledge to the agent. And it falls into two shapes, both of which leak:
|
|
62
|
+
|
|
63
|
+
| What it does | The shape | Why it leaks |
|
|
64
|
+
|---|---|---|
|
|
65
|
+
| **Tells the agent things.** Memory stores, code-graph servers, context packers, rule files. | A tool the agent calls *when it remembers to.* | Optional context is optional. Agents skip the retrieval tool **~58% of the time even when explicitly told to use it** ([CodeCompass, 2026](https://arxiv.org/abs/2602.20048)). Advice it can ignore, it ignores. |
|
|
66
|
+
| **Checks the agent afterward.** Reviewers, linters, CI gates. | A pass over the diff *after the code is written.* | The break already happened. Now it's a comment on a pull request and a second round of work — not a change that never broke anything. |
|
|
46
67
|
|
|
47
|
-
|
|
68
|
+
There's a third shape, and almost no one ships it: **guidance wired into the moment of the edit, that the agent can't route around, and that re-anchors itself when the code moves so it never goes quietly stale.** Not a tool it chooses to consult. Not a review after the fact. A guardrail that fires *as it edits* — and stays true to the code because it's recomputed from the code, not from a doc that rots.
|
|
69
|
+
|
|
70
|
+
That's unerr. The agent doesn't have to ask. Before the edit lands, it already sees the callers it would break and the standard it's about to violate.
|
|
48
71
|
|
|
49
72
|
| The old way | With unerr |
|
|
50
73
|
|---|---|
|
|
51
|
-
| The agent
|
|
52
|
-
|
|
|
53
|
-
|
|
|
74
|
+
| The agent changes a function without reading its 24 callers — 7 sites break silently. | **Cascade guard** puts the call graph in front of the edit *before it runs* — every caller on screen, no asking required. |
|
|
75
|
+
| You wrote the rule in `.cursorrules`. The agent acknowledged it, then ignored it once context filled up. | **Anchored rules** surface the standard the instant the agent touches that scope — and re-anchor when the code moves instead of going stale. |
|
|
76
|
+
| A rule or spec stays confident long after the code moved out from under it. Nothing recomputes it. | Every fact is pinned to a live entity in the graph. When the code moves, the fact **fails loud** instead of staying silently wrong. |
|
|
54
77
|
|
|
55
78
|
---
|
|
56
79
|
|
|
57
80
|
## The pains this fixes
|
|
58
81
|
|
|
59
|
-
You
|
|
82
|
+
You know this feeling, and it gets *worse* as the repo grows, not better:
|
|
60
83
|
|
|
61
|
-
-
|
|
62
|
-
-
|
|
63
|
-
- The
|
|
64
|
-
-
|
|
84
|
+
- **You're babysitting it.** You can't fire-and-forget, because the one time you look away is the time it quietly breaks something load-bearing. You've become its scheduler and its safety net at once.
|
|
85
|
+
- **You don't trust it to touch anything important.** It treats your codebase as a flat wall of text — locally correct, globally wrong — so the load-bearing changes still land on you.
|
|
86
|
+
- **The rule you wrote gets acknowledged, then dropped.** A few turns later the context fills up and your `.cursorrules` line may as well not exist.
|
|
87
|
+
- **Approval fatigue.** You approve so many reasonable edits that the dangerous one slides through — the hundredth confirmation looks exactly like the first.
|
|
65
88
|
|
|
66
|
-
These aren't four problems. They're one: **
|
|
89
|
+
These aren't four problems. They're one: **the agent acts on a codebase it can't hold in its head, and nothing it can't bypass is watching the change.** You babysit because there's no guardrail it can't skip. unerr is that guardrail — so you can look away.
|
|
67
90
|
|
|
68
91
|
---
|
|
69
92
|
|
|
@@ -71,16 +94,16 @@ These aren't four problems. They're one: **your agent acts on your codebase with
|
|
|
71
94
|
|
|
72
95
|
| You feel | What unerr does |
|
|
73
96
|
|---|---|
|
|
74
|
-
| **
|
|
75
|
-
| **
|
|
76
|
-
| **
|
|
77
|
-
| **
|
|
97
|
+
| **You stop babysitting.** The agent runs for an hour and you're not bracing for a silent break. | Every edit is preceded — automatically — by a graph lookup. All 24 callers are visible *before* it touches the function. The guardrail fires whether or not the agent thought to ask. |
|
|
98
|
+
| **Your rules finally get honored.** The standard you set is applied at the edit, not acknowledged and forgotten. | unerr pins each rule and decision to the file or entity it governs and surfaces it the instant the agent touches that scope — then re-anchors it when the code moves. Keep your `.cursorrules` and specs; unerr makes sure they're actually applied. |
|
|
99
|
+
| **It stops thrashing.** No more watching it retry the same broken fix three times. | A **loop breaker** watches the timeline and stops the agent re-trying a change that already failed twice — before it burns your turn and your patience. |
|
|
100
|
+
| **The agent stays sharp at turn 50.** | `file_read({entity})` returns 200 lines instead of 3,000; shell output is trimmed automatically. The window stays uncluttered, so the model isn't fighting "lost in the middle." |
|
|
78
101
|
|
|
79
|
-
**What it looks like in your chat** — before the Edit tool runs, unerr injects this into the agent's context:
|
|
102
|
+
**What it looks like in your chat** — before the Edit tool runs, unerr injects this into the agent's context, on its own:
|
|
80
103
|
|
|
81
104
|
> ⚡ unerr · cascade guard: editing `src/payments/gateway.ts` changes a signature with callers that must be updated in the same change — `processPayment`: **24 callers at risk across 6 files** (19 source, 5 test). Call `get_references({key:'processPayment', direction:'callers'})` and update every caller before finishing.
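
To make the notice above concrete, here is a minimal TypeScript sketch of how a count like "24 callers at risk across 6 files" could be derived from a call graph. The `CallEdge` shape, the `callersAtRisk` helper, and the `.test.` file-name heuristic are illustrative assumptions, not unerr's actual data model or API.

```typescript
// Hypothetical call-graph edge: `caller` (defined in `file`) invokes `callee`.
interface CallEdge {
  caller: string;
  callee: string;
  file: string;
}

interface CascadeReport {
  callers: number;
  files: number;
  source: number;
  test: number;
}

// Count the distinct call sites that would need updating if `target` changed.
function callersAtRisk(edges: CallEdge[], target: string): CascadeReport {
  const sites = new Set<string>();
  const files = new Set<string>();
  let test = 0;
  for (const edge of edges) {
    if (edge.callee !== target) continue;
    const site = `${edge.file}#${edge.caller}`;
    if (sites.has(site)) continue; // count each caller once per file
    sites.add(site);
    files.add(edge.file);
    // Assumed heuristic: test files contain ".test." in their name.
    if (edge.file.includes(".test.")) test += 1;
  }
  return { callers: sites.size, files: files.size, source: sites.size - test, test };
}

const edges: CallEdge[] = [
  { caller: "checkout", callee: "processPayment", file: "src/cart/checkout.ts" },
  { caller: "retryCharge", callee: "processPayment", file: "src/billing/retry.ts" },
  { caller: "chargesCard", callee: "processPayment", file: "src/payments/gateway.test.ts" },
  { caller: "checkout", callee: "applyCoupon", file: "src/cart/checkout.ts" },
];

const report = callersAtRisk(edges, "processPayment");
console.log(
  `processPayment: ${report.callers} callers at risk across ${report.files} files ` +
    `(${report.source} source, ${report.test} test)`,
);
```

The point of the sketch is only that the number in the notice is a graph query, not a guess: the same lookup can run before any edit without the agent asking for it.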
|
|
82
105
|
|
|
83
|
-
The outcome
|
|
106
|
+
The outcome: **agents that behave like senior engineers** — checking dependencies before editing, honoring the standard, and refusing to thrash on a function they've already failed on three times.
|
|
84
107
|
|
|
85
108
|
---
|
|
86
109
|
|
|
@@ -95,19 +118,19 @@ The outcome you get is **agents that behave like senior engineers** — checking
|
|
|
95
118
|
|
|
96
119
|
Two places unerr shows up so you know it's working — inside the chat, and in a browser.
|
|
97
120
|
|
|
98
|
-
**Inside the chat.** Every coding turn opens with one line naming what unerr loaded ("loaded a convention you wrote yesterday for `src/payments/gateway.ts`…") and closes with one line totalling what it saved
|
|
121
|
+
**Inside the chat.** Every coding turn opens with one line naming what unerr loaded ("loaded a convention you wrote yesterday for `src/payments/gateway.ts`…") and closes with one line totalling what it caught and saved ("this turn: 2 catches · ≈ 4.2k tokens saved · +5 turns of headroom this session"). Catches are *named, countable events*, not a ratio.
|
|
99
122
|
|
|
100
|
-
**In a browser.** A live dashboard at `http://localhost:9847` reads from the same store the agent reads from over MCP — the graph it navigates, the facts it remembers, the
|
|
123
|
+
**In a browser.** A live dashboard at `http://localhost:9847` reads from the same store the agent reads from over MCP — the graph it navigates, the facts it remembers, the breaks it caught, and the score showing which of those facts actually shaped the next answer.
|
|
101
124
|
|
|
102
125
|
<p align="center">
|
|
103
126
|
<img src="https://unerr.dev/open-cli/screenshots/end-of-turn-receipt.png" alt="unerr end-of-turn receipt — tokens saved and headroom kept open this turn" width="380" />
|
|
104
127
|
<img src="https://unerr.dev/open-cli/screenshots/end-of-turn-receipt-2.png" alt="unerr end-of-turn receipt — named, countable catches totalled at the close of a turn" width="380" />
|
|
105
|
-
<br/><sub><strong>End-of-turn receipt</strong> · every coding turn closes with one line totalling what unerr saved you — named, countable catches, not a ratio.</sub>
|
|
128
|
+
<br/><sub><strong>End-of-turn receipt</strong> · every coding turn closes with one line totalling what unerr caught and saved you — named, countable catches, not a ratio.</sub>
|
|
106
129
|
</p>
|
|
107
130
|
|
|
108
131
|
<p align="center">
|
|
109
132
|
<img src="https://unerr.dev/open-cli/screenshots/dashboard.png" alt="unerr dashboard — live overview" width="300" />
|
|
110
|
-
<br/><sub><strong>Dashboard</strong> · live overview — active sessions, recent tool calls, tokens the agent skipped this turn.</sub>
|
|
133
|
+
<br/><sub><strong>Dashboard</strong> · live overview — active sessions, recent tool calls, breaks caught, tokens the agent skipped this turn.</sub>
|
|
111
134
|
</p>
|
|
112
135
|
|
|
113
136
|
<p align="center">
|
|
@@ -153,9 +176,9 @@ Install multiple agents in the same repo — each writes its own config. Idempot
|
|
|
153
176
|
|
|
154
177
|
### 3. Restart your IDE
|
|
155
178
|
|
|
156
|
-
Close and reopen your IDE (or start a new chat session). Your agent picks up unerr through MCP — graph-backed tools, persistent memory,
|
|
179
|
+
Close and reopen your IDE (or start a new chat session). Your agent picks up unerr through MCP: graph-backed tools, persistent memory, and the edit-time guardrail are all available immediately.
|
|
157
180
|
|
|
158
|
-
> **Dashboard:** <http://localhost:9847> — open any time to watch unerr
|
|
181
|
+
> **Dashboard:** <http://localhost:9847> — open any time to watch unerr at work in real time.
|
|
159
182
|
|
|
160
183
|
> Need manual setup or any other MCP client? `unerr install --show-instructions <agent>` prints copy-pasteable steps.
|
|
161
184
|
|
|
@@ -163,50 +186,54 @@ Close and reopen your IDE (or start a new chat session). Your agent picks up une
|
|
|
163
186
|
|
|
164
187
|
## Who it's for
|
|
165
188
|
|
|
166
|
-
- **
|
|
167
|
-
- **
|
|
168
|
-
- **
|
|
189
|
+
- **Engineers on large, existing codebases.** The dependency graph, the load-bearing patterns, and the prior incidents a senior engineer carries in their head — handed to the agent before every edit, so it stops breaking callers it never read.
|
|
190
|
+
- **Teams with conventions worth enforcing.** The standard you agreed on once, applied every time the agent touches that scope — no `.cursorrules` file to hand-maintain, re-paste, or merge-conflict over, and no hoping the agent remembers to look.
|
|
191
|
+
- **Solo builders shipping into a codebase that's already grown.** The continuous thread across tools — switch from Claude Code in the terminal to Cursor in the IDE and the graph, rules, and history come with you, instead of relearning the repo every session.
|
|
169
192
|
|
|
170
193
|
---
|
|
171
194
|
|
|
172
|
-
## Why one runtime, not five
|
|
195
|
+
## Why a guardrail has to be one runtime, not five tools
|
|
173
196
|
|
|
174
|
-
|
|
197
|
+
A guardrail the agent *can't skip* can't be a tool the agent chooses to call. That's the whole reason unerr is one local runtime sitting *behind* the MCP every agent already speaks — not a fifth server in the agent's tool list.
|
|
175
198
|
|
|
176
|
-
|
|
199
|
+
Every coding agent on your machine — Claude Code, Cursor, Windsurf, Antigravity — speaks MCP. MCP carries tool calls; it does not carry context, and it does not fire anything on its own. So a memory server, a graph server, and a compressor sit there *waiting to be invoked* — and an agent under context pressure skips them. unerr instead intercepts at the moment that matters — the read, the edit — and injects the one scoped thing that's relevant, automatically. The agent can't forget to call something that isn't waiting to be called.
|
|
177
200
|
|
|
178
|
-
|
|
179
|
-
|---|---|---|
|
|
180
|
-
| Memory across sessions | dedicated memory tools | Memory tied to the *current* state of the code — facts get drift signals when the file they're about moves. |
|
|
181
|
-
| Code-graph navigation | dedicated code-graph tools | The graph is read *before every file read* — surgical context instead of 3,000-line dumps. |
|
|
182
|
-
| Output compression | dedicated compression tools | Compression is fed through the same MCP runtime as the graph and memory, not a separate tool the agent has to remember to invoke. |
|
|
183
|
-
| Convention enforcement | `.cursorrules`, CLAUDE.md hand-maintained | Conventions auto-detected from ≥70% adherence in the code. No file to maintain. |
|
|
201
|
+
That only works if the pieces live in **one** process. The guardrails worth having each fire on a *join* no single tool can make:
|
|
184
202
|
|
|
185
|
-
|
|
203
|
+
- **Cascade guard** needs the call graph *and* the edit-intent ledger on the same process, at the same instant.
|
|
204
|
+
- **Drift** needs memory that's anchored to a live graph — so the fact knows the moment its code moved.
|
|
205
|
+
- **Convention drift** needs the auto-detected pattern store *and* the new-code stream in the same memory space.
|
|
206
|
+
- **Loop breaker** needs the full timeline of what the agent already tried.
|
|
186
207
|
|
|
187
|
-
|
|
188
|
-
|
|
189
|
-
- **~84%** of an AI coding agent's tokens are tool output, mostly file reads ([JetBrains, NeurIPS 2025](https://blog.jetbrains.com/research/2025/12/efficient-context-management/)) — unerr intercepts at the read layer, so attention isn't diluted.
|
|
190
|
-
- **Tool-selection accuracy collapses 58% → 26% as MCP tools go from 9 to 51** ([LangChain ReAct benchmark](https://blog.langchain.com/react-agent-benchmarking/)) — unerr is one MCP runtime instead of five, freeing the agent's tool-selection budget. Anthropic itself acknowledged this in Jan 2026 by shipping [MCP Tool Search](https://www.anthropic.com/engineering/code-execution-with-mcp) to hide tool definitions until queried.
|
|
191
|
-
- **0** LLM calls per query in the core — facts, conventions, drift signals, and graph lookups are all algorithmic. No API keys, no per-turn inference cost, no telemetry.
|
|
192
|
-
- **86–90%** of an agent's code-navigation tokens removed in head-to-head benchmarks vs grep+read — real tokenizer, fidelity-gated, reproducible on any repo ([benchmarks](./benchmarks/README.md)).
|
|
208
|
+
These aren't features you can buy individually and bolt together. They're emergent properties of one runtime — and they're exactly what turns "context the agent might read" into "a guardrail it can't skip."
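
As one illustration of a guardrail that depends on shared state, the loop-breaker idea can be sketched in TypeScript as a counter over the attempt timeline. The `Attempt` shape, the `LoopBreaker` class, and the threshold of two failures are assumptions chosen for the example; unerr's real module is not shown in this README.

```typescript
// Hypothetical record of one edit attempt in the session timeline.
interface Attempt {
  target: string; // entity the agent tried to change
  fingerprint: string; // stable hash or summary of the attempted change
  ok: boolean; // whether the attempt succeeded
}

class LoopBreaker {
  private readonly failures = new Map<string, number>();
  private readonly maxFailures: number;

  constructor(maxFailures = 2) {
    this.maxFailures = maxFailures;
  }

  // Record the outcome of an attempt; a success clears the failure count.
  record(attempt: Attempt): void {
    const key = `${attempt.target}::${attempt.fingerprint}`;
    if (attempt.ok) {
      this.failures.delete(key);
      return;
    }
    this.failures.set(key, (this.failures.get(key) ?? 0) + 1);
  }

  // True when the same change has already failed `maxFailures` times.
  shouldBlock(target: string, fingerprint: string): boolean {
    return (this.failures.get(`${target}::${fingerprint}`) ?? 0) >= this.maxFailures;
  }
}

const breaker = new LoopBreaker();
breaker.record({ target: "processPayment", fingerprint: "add-currency-arg", ok: false });
breaker.record({ target: "processPayment", fingerprint: "add-currency-arg", ok: false });
console.log(breaker.shouldBlock("processPayment", "add-currency-arg"));
console.log(breaker.shouldBlock("processPayment", "rename-param"));
```

The sketch only works because the breaker sees every attempt; a tool that is invoked occasionally would miss failures and never reach the threshold.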
|
|
193
209
|
|
|
194
210
|
---
|
|
195
211
|
|
|
196
212
|
## How the runtime works
|
|
197
213
|
|
|
198
|
-
One local process per repo. Four
|
|
214
|
+
One local process per repo. Four mechanisms, joined deterministically — the **mechanisms** are how; the **guardrail** is what you get.
|
|
199
215
|
|
|
200
|
-
|
|
|
216
|
+
| Mechanism (the how) | What's inside | What it powers (the what) |
|
|
201
217
|
|---|---|---|
|
|
202
|
-
| **Live code graph** | CozoDB · tree-sitter ASTs · SCIP-verified call graphs · 18+ languages · <5ms queries |
|
|
218
|
+
| **Live code graph** | CozoDB · tree-sitter ASTs · SCIP-verified call graphs · 18+ languages · <5 ms queries | The agent opens 50 targeted lines and a caller list, not 3,000 lines and a guess. The graph is consulted *before every file read*, so cascade guard knows what an edit breaks. |
|
|
203
219
|
| **Anchored memory** | Typed facts · conventions auto-detected at ≥70% adherence · decay-adjusted confidence | Every fact is pinned to a file or entity in the graph. When the code moves, the fact gets a **drift signal** — never silent staleness. |
|
|
204
|
-
| **Context delivery** | Shell output compression (
|
|
205
|
-
| **Behaviour modules** | cascade guard · convention drift · loop breaker · session continuity · auto-doc · change narrative · architecture guard | Each guardrail fires on a
|
|
220
|
+
| **Context delivery** | Shell output compression (645+ command classifiers) · web fetches (5–10× smaller via Defuddle + BM25) · entity-targeted file reads | The relevant slice arrives automatically at the read, so the agent never has to remember which tool to invoke for which content. |
|
|
221
|
+
| **Behaviour modules** | cascade guard · convention drift · loop breaker · session continuity · auto-doc · change narrative · architecture guard | Each guardrail fires on a join of the three above, *at the moment of the edit* — not as a tool the agent chose, not as a review after the fact. |
|
|
222
|
+
|
|
223
|
+
**The unifying point.** Drift detection requires memory anchored to a live graph. Cascade guard requires the graph and the edit-intent ledger on one process. Convention drift requires the pattern store and the new-code stream in the same memory space. Spread these across five disconnected MCP servers and none of them can fire — they can only sit and wait to be called, which is the failure mode this whole thing exists to fix. That's the difference between a stack of tools and a guardrail.
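
The drift idea in particular is easy to picture in code. The sketch below is a TypeScript illustration under stated assumptions: each fact remembers a content hash of the entity it was pinned to, and it is flagged when the entity's current hash differs or the entity is gone. The `Fact` and `Entity` shapes, `checkDrift`, and the status names are hypothetical, not unerr's schema.

```typescript
// Hypothetical graph entity, identified by a key with a hash of its current source.
interface Entity {
  key: string;
  hash: string;
}

// Hypothetical fact pinned to an entity, remembering the hash it was written against.
interface Fact {
  text: string;
  anchorKey: string;
  anchorHash: string;
}

type DriftStatus = "fresh" | "drifted" | "orphaned";

// Compare each fact's anchor against the live graph instead of trusting it blindly.
function checkDrift(facts: Fact[], graph: Map<string, Entity>): Map<string, DriftStatus> {
  const result = new Map<string, DriftStatus>();
  for (const fact of facts) {
    const entity = graph.get(fact.anchorKey);
    if (!entity) {
      result.set(fact.text, "orphaned"); // the anchored entity no longer exists
    } else if (entity.hash !== fact.anchorHash) {
      result.set(fact.text, "drifted"); // the code changed after the fact was written
    } else {
      result.set(fact.text, "fresh");
    }
  }
  return result;
}

const graph = new Map<string, Entity>([
  ["processPayment", { key: "processPayment", hash: "b2" }],
  ["refund", { key: "refund", hash: "c1" }],
]);

const facts: Fact[] = [
  { text: "processPayment must be idempotent", anchorKey: "processPayment", anchorHash: "a1" },
  { text: "refund requires an audit log entry", anchorKey: "refund", anchorHash: "c1" },
  { text: "legacyCharge is deprecated", anchorKey: "legacyCharge", anchorHash: "d4" },
];

checkDrift(facts, graph).forEach((status, text) => {
  console.log(`${status}: ${text}`);
});
```

A memory store with no access to the graph has nothing to compare against, which is why this check needs both halves in one process.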
|
|
206
224
|
|
|
207
|
-
|
|
225
|
+
---
|
|
226
|
+
|
|
227
|
+
## Fewer tokens, as a side effect
|
|
228
|
+
|
|
229
|
+
unerr was built to stop bad changes, not to save tokens. But a guardrail that only ever hands over *the one scoped fact that matters* — the rule for the entity in front of the agent, 50 lines instead of 3,000 — spends far fewer tokens almost by accident. So you get this for free:
|
|
230
|
+
|
|
231
|
+
- **86–90%** of an agent's code-navigation tokens removed in head-to-head benchmarks vs grep+read — real tokenizer, fidelity-gated (any "saving" that lost the answer is discarded), reproducible on any repo. [See the benchmarks →](./benchmarks/README.md)
|
|
232
|
+
- **~84%** of an AI coding agent's tokens are tool output, mostly file reads ([JetBrains, NeurIPS 2025](https://blog.jetbrains.com/research/2025/12/efficient-context-management/)) — unerr intercepts at the read layer, so the window isn't diluted.
|
|
233
|
+
- **Tool-selection accuracy collapses 58% → 26% as MCP tools go from 9 to 51** ([LangChain ReAct benchmark](https://blog.langchain.com/react-agent-benchmarking/)) — unerr is one runtime instead of five servers, so it doesn't eat the agent's tool-selection budget. Anthropic itself acknowledged this in Jan 2026 by shipping [MCP Tool Search](https://www.anthropic.com/engineering/code-execution-with-mcp).
|
|
234
|
+
- **0** LLM calls per query in the core — facts, conventions, drift signals, and graph lookups are all algorithmic. No API keys, no per-turn inference cost, no telemetry.
|
|
208
235
|
|
|
209
|
-
|
|
236
|
+
The point was never the token number. It's that the agent lands on the right code, sees the right guardrail, and you stop paying — in tokens *and* in afternoons — for the changes it would otherwise have to undo.
|
|
210
237
|
|
|
211
238
|
---
|
|
212
239
|
|
|
@@ -240,7 +267,7 @@ One local DB per repo. Zero network calls. No API keys. No cloud. Your code neve
|
|
|
240
267
|
|
|
241
268
|
Full module map and source-tree breakdown: **[ARCHITECTURE.md](./docs/ARCHITECTURE.md)**.
|
|
242
269
|
|
|
243
|
-
**Design principles** — zero network calls; stdout is sacred (MCP JSON-RPC only, everything else to stderr); <5 ms query responses; first useful output <5 s (shallow index first, deep enrichment in background); graceful degradation (the agent still works if unerr is down, you just lose the
|
|
270
|
+
**Design principles** — zero network calls; stdout is sacred (MCP JSON-RPC only, everything else to stderr); <5 ms query responses; first useful output <5 s (shallow index first, deep enrichment in background); graceful degradation (the agent still works if unerr is down, you just lose the guardrail layer).
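
The "stdout is sacred" principle can be illustrated with a small TypeScript sketch: protocol frames are the only thing written to stdout, and diagnostics go to stderr. The `formatResponse` and `log` helpers are hypothetical names for this example, and single-line framing is an assumption about the transport; only the JSON-RPC 2.0 envelope and the Node.js behavior of `console.log` (stdout) and `console.error` (stderr) are standard.

```typescript
// JSON-RPC 2.0 success response envelope.
interface JsonRpcResponse {
  jsonrpc: "2.0";
  id: number | string;
  result: unknown;
}

// Serialize one response as a single line of JSON (no embedded newlines).
function formatResponse(id: number | string, result: unknown): string {
  const message: JsonRpcResponse = { jsonrpc: "2.0", id, result };
  return JSON.stringify(message);
}

// Diagnostics never touch stdout; console.error writes to stderr in Node.js.
function log(message: string): void {
  console.error(`[sketch] ${message}`);
}

log("answering request 1");
// The only write to stdout is the protocol frame itself.
console.log(formatResponse(1, { callers: 24 }));
```

Keeping the two streams separate means a stray debug line can never be parsed by the client as a malformed protocol message.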
|
|
244
271
|
|
|
245
272
|
**Tech stack** TypeScript (ESM) · CozoDB (Rust/NAPI) · web-tree-sitter (WASM) · MCP SDK · Ink (React CLI) · React + Vite (dashboard) · tsup · Vitest
|
|
246
273
|
|