brain-cache 2.1.0 → 3.0.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -0,0 +1,52 @@
+ ---
+ name: brain-cache
+ description: "Local codebase embeddings that save tokens and money. Use brain-cache MCP tools instead of reading files or grepping — they return better results with fewer tokens sent to Claude."
+ allowed-tools: Bash, Read, Grep
+ ---
+
+ ## What brain-cache does
+
+ brain-cache indexes your codebase locally using Ollama embeddings — no data leaves your machine. When you need to understand code, it retrieves only the relevant parts and fits them to a token budget before sending anything to Claude. This means Claude sees better, more focused context while you spend fewer tokens per query.
+
+ Use brain-cache tools before reading files or using Grep/Glob for codebase questions. They return better, token-efficient results.
+
+ ## Tool routing
+
+ | Query type | Tool | NOT this |
+ |-----------|------|---------|
+ | Locate a function, type, or symbol | `search_codebase` | `build_context` |
+ | Understand how specific code works across files | `build_context` | file reads |
+ | Diagnose brain-cache failures | `doctor` | -- |
+ | Reindex the project | `index_repo` | -- |
+
+ ## search_codebase (locate code)
+
+ Call `mcp__brain-cache__search_codebase` to find functions, types, definitions, or implementations by meaning rather than keyword match.
+
+ Use for: "Where is X defined?", "Find the auth middleware", "Which file handles request validation?"
+
+ Do NOT use for understanding how code works — use `build_context` once you have located the symbol.
+
+ ## build_context (understand behavior)
+
+ Call `mcp__brain-cache__build_context` with a focused question about how specific code works. It retrieves semantically relevant code, deduplicates results, and fits them to a token budget.
+
+ Use for: "How does X work?", "What does this function do?", debugging unfamiliar code paths.
+
+ Do NOT use for locating symbols — use `search_codebase` first to find where code lives.
+
+ Do NOT use just to get a file overview — ask a specific behavioral question.
+
+ ## index_repo (reindex)
+
+ Call `mcp__brain-cache__index_repo` only when the user explicitly asks to reindex, or after major code changes such as a large refactor or pulling a significant upstream diff.
+
+ Do not call proactively. Do not call at the start of each session.
+
+ ## doctor (diagnose issues)
+
+ Call `mcp__brain-cache__doctor` when any brain-cache tool fails or returns unexpected results. It checks index health and Ollama connectivity and tells you what to fix.
+
+ ## Status line
+
+ brain-cache displays cumulative token savings in the Claude Code status bar. After tool calls you will see `brain-cache ↓{pct}% {n} saved` — this confirms cost savings are working. If the status bar shows idle, no tools have been called yet in the current session.
package/README.md CHANGED
@@ -1,65 +1,42 @@
 # brain-cache
 
- > Stop sending your entire repo to Claude.
+ > Your local GPU finally has a job.
 
- brain-cache is an MCP server that gives Claude local, indexed access to your codebase — so it finds what matters instead of reading everything.
-
- → ~90% fewer tokens sent to Claude
- → Sharper, grounded answers
- → No data leaves your machine
+ brain-cache is a local AI runtime that sits between your codebase and Claude. It runs embeddings and retrieval on your machine — so Claude only sees what actually matters. Fewer tokens. Better answers. Your API bill stops looking like a mortgage payment.
 
 ![brain-cache only sends the parts of your codebase that matter — not everything.](assets/brain-cache.svg)
 
 ---
 
- ## Use inside Claude Code (MCP)
-
- The primary way to use brain-cache is as an MCP server. Run `brain-cache init` once — it auto-configures `.mcp.json` in your project root so Claude Code connects immediately. No manual JSON setup needed.
+ ## How it works
 
- Claude then has access to:
-
- - **`build_context`** Assembles relevant context for any question. Use this instead of reading files.
- - **`search_codebase`** Finds functions, types, and symbols by meaning, not keyword. Use this instead of grep.
- - **`index_repo`** — Rebuilds the local vector index.
- - **`doctor`** — Diagnoses index health and Ollama connectivity.
-
- No copy/pasting code into prompts. No manual file opens. Claude knows where to look.
+ 1. Embeds your query locally via Ollama (fast, free, no API calls)
+ 2. Retrieves the most relevant code chunks from its local vector index
+ 3. Trims and deduplicates the context to fit a tight token budget
+ 4. Hands Claude a clean, minimal context — not your entire repo
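The four-step pipeline above can be sketched in plain JavaScript. This is an illustrative toy, not brain-cache's implementation: `embed` stands in for a real Ollama embedding call (here it is just a bag-of-words count), and every name below is invented for the example.

```javascript
// Toy retrieval pipeline: embed -> rank -> dedupe/trim to budget -> return context.
// Illustrative only; none of these names come from brain-cache's actual API.

function embed(text) {
  // Stand-in for a local Ollama embedding: word-count vector.
  const vec = Object.create(null);
  for (const word of text.toLowerCase().split(/\W+/).filter(Boolean)) {
    vec[word] = (vec[word] ?? 0) + 1;
  }
  return vec;
}

function cosine(a, b) {
  let dot = 0, na = 0, nb = 0;
  for (const k in a) { na += a[k] * a[k]; if (k in b) dot += a[k] * b[k]; }
  for (const k in b) nb += b[k] * b[k];
  return dot / (Math.sqrt(na) * Math.sqrt(nb) || 1);
}

function buildContext(query, index, tokenBudget) {
  const q = embed(query);
  // 2. Rank chunks by similarity to the query (most similar first).
  const ranked = [...index].sort(
    (x, y) => cosine(q, embed(y.text)) - cosine(q, embed(x.text))
  );
  // 3. Deduplicate and trim to the token budget (tokens approximated as words).
  const seen = new Set();
  const picked = [];
  let used = 0;
  for (const chunk of ranked) {
    const cost = chunk.text.split(/\s+/).length;
    if (seen.has(chunk.text) || used + cost > tokenBudget) continue;
    seen.add(chunk.text);
    picked.push(chunk);
    used += cost;
  }
  // 4. Hand back a minimal context instead of the whole "repo".
  return { context: picked.map((c) => c.text).join("\n"), tokens: used };
}

const index = [
  { file: "auth.js", text: "function verifyToken(token) { return jwt.verify(token, SECRET); }" },
  { file: "db.js", text: "function connectDb(url) { return new Pool(url); }" },
];
const result = buildContext("how does token verification work", index, 50);
console.log(result.tokens <= 50, result.context.includes("verifyToken")); // true true
```

The real system differs in every detail (actual embeddings, a persisted vector index, real tokenizer counts), but the shape — rank, dedupe, stop at the budget — is the point.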
 
 ---
 
- ## The problem
-
- When you ask Claude about your codebase, you either:
-
- - paste huge chunks of code ❌
- - rely on vague context ❌
- - or let tools send way too much ❌
-
- Result:
-
- - worse answers
- - hallucinations
- - massive token usage
+ ## Use inside Claude Code (MCP)
 
- ---
+ The primary way to use brain-cache is as an MCP server. Run `brain-cache init` once — it auto-configures `.mcp.json` in your project root so Claude Code connects immediately. No manual JSON setup needed.
 
- ## 🧠 How it works
+ Claude then has access to:
 
- brain-cache is the layer between your codebase and Claude.
+ - **`build_context`** — Assembles relevant context for any question. Use instead of reading files.
+ - **`search_codebase`** — Finds functions, types, and symbols by meaning, not keyword. Use instead of grep.
+ - **`index_repo`** — Rebuilds the local vector index.
 
- 1. Your code is indexed locally using Ollama embeddings — nothing leaves your machine
- 2. When you ask Claude a question, it calls `build_context` or `search_codebase` automatically
- 3. brain-cache retrieves only the relevant files, trims duplicates, and fits them to a token budget
- 4. Claude gets tight, useful context — not your entire repo
+ Also included: **`doctor`** — diagnoses index health and Ollama connectivity.
 
- AI should read the right parts and nothing else. brain-cache is the layer that makes that possible.
+ No copy/pasting code into prompts. No manual file opens. Claude knows where to look.
 
 ---
 
- ## 🔥 Example
+ ## Example
 
 ```
- > "Explain the overall architecture of this project"
+ > "How does the auth middleware work?"
 
 brain-cache: context assembled (74 tokens, 97% reduction)
 
@@ -68,11 +45,11 @@ Estimated without: ~2,795
 Reduction: 97%
 ```
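The reduction figure can be checked from the two numbers in the output: 74 tokens sent against an estimated ~2,795 without brain-cache is a ~97% reduction. A quick sketch of that arithmetic (the formula is inferred from the printed stats, not taken from brain-cache's source; the 93% inputs below are hypothetical):

```javascript
// reduction = 1 - tokensSent / estimatedWithout, shown as a rounded percentage.
function reductionPct(tokensSent, estimatedWithout) {
  return Math.round((1 - tokensSent / estimatedWithout) * 100);
}

console.log(reductionPct(74, 2795));    // → 97  (the example above)
console.log(reductionPct(1240, 17714)); // → 93  (hypothetical inputs)
```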
 
- Claude gets only what matters answers are sharper and grounded.
+ Claude gets only what matters — answers are sharper and grounded.
 
 ---
 
- ## Quick start
+ ## Quick start
 
 **Step 1: Install**
 
@@ -87,11 +64,11 @@ brain-cache init
 brain-cache index
 ```
 
- `brain-cache init` sets up your project: configures `.mcp.json` so Claude Code connects to brain-cache automatically, and appends MCP tool instructions to `CLAUDE.md`. Runs once; idempotent.
+ `brain-cache init` sets up your project: configures `.mcp.json` so Claude Code connects to brain-cache automatically, appends MCP tool instructions to `CLAUDE.md`, installs the brain-cache skill to `.claude/skills/brain-cache/SKILL.md`, and installs a status line in Claude Code that shows cumulative token savings. Runs once; idempotent.
 
 **Step 3: Use Claude normally**
 
- brain-cache tools are called automatically. You don’t change how you work — the context just gets better.
+ brain-cache tools are called automatically. You don't change how you work — the context just gets better.
 
 > **Advanced:** `init` creates `.mcp.json` automatically. If you need to customise it manually, the expected shape is:
 > ```json
@@ -107,7 +84,24 @@ brain-cache tools are called automatically. You don’t change how you work —
 
 ---
 
- ## 📊 Optional: Token savings footer
+ ## Install as Claude Code skill
+
+ brain-cache ships as a Claude Code skill. After `brain-cache init`, the skill is
+ installed at `.claude/skills/brain-cache/SKILL.md` in your project. Claude
+ automatically learns when and how to use brain-cache tools.
+
+ To install manually, copy the `.claude/skills/brain-cache/` directory into your
+ project root.
+
+ ---
+
+ ## Status line
+
+ After `brain-cache init`, the status line in Claude Code's bottom bar shows your cumulative token savings for the current session. You see the reduction without doing anything different.
+
+ ---
+
+ ## Optional: Token savings footer
 
 brain-cache returns token usage stats in its tool responses (tokens sent, estimated without, reduction %). By default, Claude decides whether to surface these — no footer is forced.
 
@@ -119,7 +113,7 @@ When using brain-cache build_context, include the token savings summary from the
 
 This keeps it transparent and under your control.
 
- ## 🎛 Tuning how much Claude uses brain-cache
+ ## Tuning how much Claude uses brain-cache
 
 `brain-cache init` adds a section to your project's `CLAUDE.md` with clear instructions to use brain-cache tools first. This works well for most users.
 
@@ -134,37 +128,7 @@ Or soften it if you prefer Claude to decide on its own. It's your `CLAUDE.md`
 
 ---
 
- ## 🧩 Core capabilities
-
- - 🧠 Local embeddings via Ollama — no API calls, no data sent out
- - 🔍 Semantic vector search over your codebase
- - ✂️ Context trimming and deduplication
- - 🎯 Token budget optimisation
- - 🤖 MCP server for Claude Code integration
- - ⚡ CLI for setup, debugging, and admin
-
- ---
-
- ## 🧠 Why it’s different
-
- Most AI coding tools:
-
- - send too much context
- - hide retrieval behind hosted services
- - require you to prompt-engineer your way to good answers
-
- brain-cache is:
-
- - 🏠 Local-first — embeddings run on your machine
- - 🔍 Transparent — you can inspect exactly what context gets sent
- - 🎯 Token-aware — every call shows the reduction
- - ⚙️ Developer-controlled — no vendor lock-in, no cloud dependency
-
- Think: **Vite, but for LLM context.**
-
- ---
-
- ## 🧪 CLI commands
+ ## CLI commands
 
 The CLI is the setup and admin interface. Use it to init, index, debug, and diagnose — not as the primary interface.
 
@@ -174,12 +138,13 @@ brain-cache index Build/rebuild the vector index
 brain-cache search "auth middleware" Manual search (useful for debugging)
 brain-cache context "auth flow" Manual context building (useful for debugging)
 brain-cache ask "how does auth work?" Direct Claude query via CLI
+ brain-cache status Show index and system status
 brain-cache doctor Check system health
 ```
 
 ---
 
- ## 📊 Token savings
+ ## Token savings
 
 Every call shows exactly what was saved:
 
@@ -187,41 +152,25 @@ Every call shows exactly what was saved:
 context: 1,240 tokens (93% reduction)
 ```
 
- Less noise better reasoning cheaper usage.
-
- ---
-
- ## 🧠 Built with GSD
-
- This project uses the GSD (Get Shit Done) framework — an AI-driven workflow for going from idea → research → plan → execution. brain-cache is both a product of that philosophy and a tool that makes it work better: tight context, better outcomes.
-
- ---
-
- ## ⚠️ Status
-
- Early stage — actively improving:
-
- - ⏳ reranking (planned)
- - ⏳ context compression
- - ⏳ live indexing (watch mode)
+ Less noise, better reasoning, cheaper usage.
 
 ---
 
- ## 🛠 Requirements
+ ## Requirements
 
- - Node.js 22+
- - Ollama running locally (`nomic-embed-text` model)
+ - Node.js >= 22
+ - Ollama running locally (`nomic-embed-text` model recommended)
 - Anthropic API key (for `ask` command only)
 
 ---
 
- ## ⭐️ If this is useful
+ ## If this is useful
 
 Give it a star — or try it on your repo and let me know what breaks.
 
 ---
 
- ## 📄 License
+ ## License
 
 MIT — see LICENSE for details.
 
@@ -10,8 +10,6 @@ Use brain-cache tools before reading files or using Grep/Glob for codebase quest
 |-----------|------|---------|
 | Locate a function, type, or symbol | \`search_codebase\` | \`build_context\` |
 | Understand how specific code works across files | \`build_context\` | file reads |
- | Trace a call path across files | \`trace_flow\` | \`build_context\` |
- | Explain project architecture or structure | \`explain_codebase\` | \`build_context\` |
 | Diagnose brain-cache failures | \`doctor\` | -- |
 | Reindex the project | \`index_repo\` | -- |
 
@@ -27,28 +25,6 @@ Call \`mcp__brain-cache__build_context\` with a focused question about how speci
 
 Use for: "How does X work?", "What does this function do?", debugging unfamiliar code paths.
 
- Do NOT use for architecture overviews (use explain_codebase) or call-path tracing (use trace_flow).
-
- ### trace_flow (trace call paths)
-
- Call \`mcp__brain-cache__trace_flow\` to trace how a function call propagates through the codebase. Returns structured hops showing the call chain across files.
-
- Use for: "How does X flow to Y?", "Trace how X calls Y across files", "What happens when X is called?", "Call path from X to Y".
-
- Do NOT use for code understanding queries like "how does X work" or "what does X do" \u2014 use build_context instead.
-
- Use trace_flow instead of build_context when the question is about call propagation or execution flow across files.
-
- ### explain_codebase (architecture overview)
-
- Call \`mcp__brain-cache__explain_codebase\` to get a module-grouped architecture overview. No follow-up question needed.
-
- Use for: "Explain the project architecture", "How is this project structured?", "What does this project do?", "Give me an overview of the codebase".
-
- Do NOT use for questions about specific code behavior or how a particular function works \u2014 use build_context instead.
-
- Use explain_codebase instead of build_context when the question is about overall structure or getting oriented.
-
 ### doctor (diagnose issues)
 
 Call \`mcp__brain-cache__doctor\` when any brain-cache tool fails or returns unexpected results.
package/dist/cli.js CHANGED
@@ -5,11 +5,11 @@ import {
 
 // src/cli/index.ts
 import { Command } from "commander";
- var version = true ? "2.1.0" : "dev";
+ var version = true ? "3.0.0" : "dev";
 var program = new Command();
 program.name("brain-cache").description("Local AI runtime \u2014 GPU cache layer for Claude").version(version);
 program.command("init").description("Detect hardware, pull embedding model, create config directory").action(async () => {
- const { runInit } = await import("./init-BCMT64T2.js");
+ const { runInit } = await import("./init-2E4JMZZC.js");
 await runInit();
 });
 program.command("doctor").description("Report system health: GPU, VRAM tier, Ollama status").action(async () => {
@@ -13,9 +13,10 @@ import {
 import "./chunk-TXLCXXKY.js";
 
 // src/workflows/init.ts
- import { existsSync, readFileSync, writeFileSync, appendFileSync, chmodSync, mkdirSync } from "fs";
- import { join } from "path";
+ import { existsSync, readFileSync, writeFileSync, appendFileSync, chmodSync, mkdirSync, copyFileSync } from "fs";
+ import { join, dirname } from "path";
 import { homedir } from "os";
+ import { fileURLToPath } from "url";
 async function runInit() {
 process.stderr.write("brain-cache: detecting hardware capabilities...\n");
 const profile = await detectCapabilities();
@@ -88,7 +89,7 @@ async function runInit() {
 process.stderr.write("brain-cache: created .mcp.json with brain-cache MCP server.\n");
 }
 const claudeMdPath = "CLAUDE.md";
- const { CLAUDE_MD_SECTION: brainCacheSection } = await import("./claude-md-section-O5LMKH4O.js");
+ const { CLAUDE_MD_SECTION: brainCacheSection } = await import("./claude-md-section-K47HUTE4.js");
 if (existsSync(claudeMdPath)) {
 const content = readFileSync(claudeMdPath, "utf-8");
 if (content.includes("## Brain-Cache MCP Tools")) {
@@ -101,6 +102,20 @@ async function runInit() {
 writeFileSync(claudeMdPath, brainCacheSection.trimStart());
 process.stderr.write("brain-cache: created CLAUDE.md with Brain-Cache MCP Tools section.\n");
 }
+ const currentFile = fileURLToPath(import.meta.url);
+ const packageRoot = join(dirname(currentFile), "..", "..");
+ const skillSource = join(packageRoot, ".claude", "skills", "brain-cache", "SKILL.md");
+ const skillTargetDir = join(process.cwd(), ".claude", "skills", "brain-cache");
+ const skillTarget = join(skillTargetDir, "SKILL.md");
+ if (existsSync(skillTarget)) {
+ process.stderr.write("brain-cache: skill already installed at .claude/skills/brain-cache/SKILL.md, skipping.\n");
+ } else if (!existsSync(skillSource)) {
+ process.stderr.write("brain-cache: Warning: skill source not found in package. Copy .claude/skills/brain-cache/ manually from the repo.\n");
+ } else {
+ mkdirSync(skillTargetDir, { recursive: true });
+ copyFileSync(skillSource, skillTarget);
+ process.stderr.write("brain-cache: installed skill to .claude/skills/brain-cache/SKILL.md\n");
+ }
 const { STATUSLINE_SCRIPT_CONTENT } = await import("./statusline-script-NFUDFOWK.js");
 const statuslinePath = join(homedir(), ".brain-cache", "statusline.mjs");
 if (existsSync(statuslinePath)) {
package/dist/mcp.js CHANGED
@@ -2359,7 +2359,7 @@ function accumulateStats(delta, ttlMs) {
 }
 
 // src/mcp/index.ts
- var version = true ? "2.1.0" : "dev";
+ var version = true ? "3.0.0" : "dev";
 var log13 = childLogger("mcp");
 var server = new McpServer({ name: "brain-cache", version });
 server.registerTool(
package/package.json CHANGED
@@ -1,6 +1,6 @@
 {
 "name": "brain-cache",
- "version": "2.1.0",
+ "version": "3.0.0",
 "description": "Local MCP-first context engine for Claude. Index your codebase, retrieve only what matters, and cut token usage.",
 "license": "MIT",
 "type": "module",
@@ -9,6 +9,7 @@
 },
 "files": [
 "dist/",
+ ".claude/skills/",
 "README.md",
 "LICENSE"
 ],