brain-cache 2.1.0 → 3.0.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -0,0 +1,52 @@
+ ---
+ name: brain-cache
+ description: "Local codebase embeddings that save tokens and money. Use brain-cache MCP tools instead of reading files or grepping — they return better results with fewer tokens sent to Claude."
+ allowed-tools: Bash, Read, Grep
+ ---
+
+ ## What brain-cache does
+
+ brain-cache indexes your codebase locally using Ollama embeddings — no data leaves your machine. When you need to understand code, it retrieves only the relevant parts and fits them to a token budget before sending anything to Claude. This means Claude sees better, more focused context while you spend fewer tokens per query.
+
+ Use brain-cache tools before reading files or using Grep/Glob for codebase questions. They return better, token-efficient results.
+
+ ## Tool routing
+
+ | Query type | Tool | NOT this |
+ |-----------|------|---------|
+ | Locate a function, type, or symbol | `search_codebase` | `build_context` |
+ | Understand how specific code works across files | `build_context` | file reads |
+ | Diagnose brain-cache failures | `doctor` | -- |
+ | Reindex the project | `index_repo` | -- |
+
+ ## search_codebase (locate code)
+
+ Call `mcp__brain-cache__search_codebase` to find functions, types, definitions, or implementations by meaning rather than keyword match.
+
+ Use for: "Where is X defined?", "Find the auth middleware", "Which file handles request validation?"
+
+ Do NOT use for understanding how code works — use `build_context` once you have located the symbol.
+
+ ## build_context (understand behavior)
+
+ Call `mcp__brain-cache__build_context` with a focused question about how specific code works. It retrieves semantically relevant code, deduplicates results, and fits them to a token budget.
+
+ Use for: "How does X work?", "What does this function do?", debugging unfamiliar code paths.
+
+ Do NOT use for locating symbols — use `search_codebase` first to find where code lives.
+
+ Do NOT use just to get a file overview — ask a specific behavioral question.
+
+ ## index_repo (reindex)
+
+ Call `mcp__brain-cache__index_repo` only when the user explicitly asks to reindex, or after major code changes such as a large refactor or pulling a significant upstream diff.
+
+ Do not call proactively. Do not call at the start of each session.
+
+ ## doctor (diagnose issues)
+
+ Call `mcp__brain-cache__doctor` when any brain-cache tool fails or returns unexpected results. It checks index health and Ollama connectivity and tells you what to fix.
+
+ ## Status line
+
+ brain-cache displays cumulative token savings in the Claude Code status bar. After tool calls you will see `brain-cache ↓{pct}% {n} saved` — this confirms cost savings are working. If the status bar shows idle, no tools have been called yet in the current session.
package/README.md CHANGED
@@ -1,65 +1,42 @@
 # brain-cache
 
- > Stop sending your entire repo to Claude.
+ > Your local GPU finally has a job.
 
- brain-cache is an MCP server that gives Claude local, indexed access to your codebase — so it finds what matters instead of reading everything.
-
- → ~90% fewer tokens sent to Claude
- → Sharper, grounded answers
- → No data leaves your machine
+ brain-cache is a local AI runtime that sits between your codebase and Claude. It runs embeddings and retrieval on your machine — so Claude only sees what actually matters. Fewer tokens. Better answers. Your API bill stops looking like a mortgage payment.
 
 ![brain-cache only sends the parts of your codebase that matter — not everything.](assets/brain-cache.svg)
 
 ---
 
- ## Use inside Claude Code (MCP)
-
- The primary way to use brain-cache is as an MCP server. Run `brain-cache init` once — it auto-configures `.mcp.json` in your project root so Claude Code connects immediately. No manual JSON setup needed.
+ ## How it works
 
- Claude then has access to:
-
- - **`build_context`** Assembles relevant context for any question. Use this instead of reading files.
- - **`search_codebase`** Finds functions, types, and symbols by meaning, not keyword. Use this instead of grep.
- - **`index_repo`** — Rebuilds the local vector index.
- - **`doctor`** — Diagnoses index health and Ollama connectivity.
-
- No copy/pasting code into prompts. No manual file opens. Claude knows where to look.
+ 1. Embeds your query locally via Ollama (fast, free, no API calls)
+ 2. Retrieves the most relevant code chunks from its local vector index
+ 3. Trims and deduplicates the context to fit a tight token budget
+ 4. Hands Claude a clean, minimal context — not your entire repo
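The four-step pipeline above can be sketched in plain JavaScript. This is an illustrative toy, not brain-cache's implementation: `embed` stands in for a real Ollama embedding call (here it is just a bag-of-words count), and every name below is invented for the example.

```javascript
// Toy retrieval pipeline: embed -> rank -> dedupe/trim to budget -> return context.
// Illustrative only; none of these names come from brain-cache's actual API.

function embed(text) {
  // Stand-in for a local Ollama embedding: word-count vector.
  const vec = Object.create(null);
  for (const word of text.toLowerCase().split(/\W+/).filter(Boolean)) {
    vec[word] = (vec[word] ?? 0) + 1;
  }
  return vec;
}

function cosine(a, b) {
  let dot = 0, na = 0, nb = 0;
  for (const k in a) { na += a[k] * a[k]; if (k in b) dot += a[k] * b[k]; }
  for (const k in b) nb += b[k] * b[k];
  return dot / (Math.sqrt(na) * Math.sqrt(nb) || 1);
}

function buildContext(query, index, tokenBudget) {
  const q = embed(query);
  // 2. Rank chunks by similarity to the query (most similar first).
  const ranked = [...index].sort(
    (x, y) => cosine(q, embed(y.text)) - cosine(q, embed(x.text))
  );
  // 3. Deduplicate and trim to the token budget (tokens approximated as words).
  const seen = new Set();
  const picked = [];
  let used = 0;
  for (const chunk of ranked) {
    const cost = chunk.text.split(/\s+/).length;
    if (seen.has(chunk.text) || used + cost > tokenBudget) continue;
    seen.add(chunk.text);
    picked.push(chunk);
    used += cost;
  }
  // 4. Hand back a minimal context instead of the whole "repo".
  return { context: picked.map((c) => c.text).join("\n"), tokens: used };
}

const index = [
  { file: "auth.js", text: "function verifyToken(token) { return jwt.verify(token, SECRET); }" },
  { file: "db.js", text: "function connectDb(url) { return new Pool(url); }" },
];
const result = buildContext("how does token verification work", index, 50);
console.log(result.tokens <= 50, result.context.includes("verifyToken")); // true true
```

The real system differs in every detail (actual embeddings, a persisted vector index, real tokenizer counts), but the shape — rank, dedupe, stop at the budget — is the point.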
 
 ---
 
- ## The problem
-
- When you ask Claude about your codebase, you either:
-
- - paste huge chunks of code ❌
- - rely on vague context ❌
- - or let tools send way too much ❌
-
- Result:
-
- - worse answers
- - hallucinations
- - massive token usage
+ ## Use inside Claude Code (MCP)
 
- ---
+ The primary way to use brain-cache is as an MCP server. Run `brain-cache init` once — it auto-configures `.mcp.json` in your project root so Claude Code connects immediately. No manual JSON setup needed.
 
- ## 🧠 How it works
+ Claude then has access to:
 
- brain-cache is the layer between your codebase and Claude.
+ - **`build_context`** — Assembles relevant context for any question. Use instead of reading files.
+ - **`search_codebase`** — Finds functions, types, and symbols by meaning, not keyword. Use instead of grep.
+ - **`index_repo`** — Rebuilds the local vector index.
 
- 1. Your code is indexed locally using Ollama embeddings — nothing leaves your machine
- 2. When you ask Claude a question, it calls `build_context` or `search_codebase` automatically
- 3. brain-cache retrieves only the relevant files, trims duplicates, and fits them to a token budget
- 4. Claude gets tight, useful context — not your entire repo
+ Also included: **`doctor`** — diagnoses index health and Ollama connectivity.
 
- AI should read the right parts and nothing else. brain-cache is the layer that makes that possible.
+ No copy/pasting code into prompts. No manual file opens. Claude knows where to look.
 
 ---
 
- ## 🔥 Example
+ ## Example
 
 ```
- > "Explain the overall architecture of this project"
+ > "How does the auth middleware work?"
 
 brain-cache: context assembled (74 tokens, 97% reduction)
 
@@ -68,11 +45,11 @@ Estimated without: ~2,795
 Reduction: 97%
 ```
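The reduction figure can be checked from the two numbers in the output: 74 tokens sent against an estimated ~2,795 without brain-cache is a ~97% reduction. A quick sketch of that arithmetic (the formula is inferred from the printed stats, not taken from brain-cache's source; the 93% inputs below are hypothetical):

```javascript
// reduction = 1 - tokensSent / estimatedWithout, shown as a rounded percentage.
function reductionPct(tokensSent, estimatedWithout) {
  return Math.round((1 - tokensSent / estimatedWithout) * 100);
}

console.log(reductionPct(74, 2795));    // → 97  (the example above)
console.log(reductionPct(1240, 17714)); // → 93  (hypothetical inputs)
```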
 
- Claude gets only what matters answers are sharper and grounded.
+ Claude gets only what matters — answers are sharper and grounded.
 
 ---
 
- ## Quick start
+ ## Quick start
 
 **Step 1: Install**
 
@@ -87,11 +64,11 @@ brain-cache init
 brain-cache index
 ```
 
- `brain-cache init` sets up your project: configures `.mcp.json` so Claude Code connects to brain-cache automatically, and appends MCP tool instructions to `CLAUDE.md`. Runs once; idempotent.
+ `brain-cache init` sets up your project: configures `.mcp.json` so Claude Code connects to brain-cache automatically, appends MCP tool instructions to `CLAUDE.md`, installs the brain-cache skill to `.claude/skills/brain-cache/SKILL.md`, and installs a status line in Claude Code that shows cumulative token savings. Runs once; idempotent.
 
 **Step 3: Use Claude normally**
 
- brain-cache tools are called automatically. You don’t change how you work — the context just gets better.
+ brain-cache tools are called automatically. You don't change how you work — the context just gets better.
 
 > **Advanced:** `init` creates `.mcp.json` automatically. If you need to customise it manually, the expected shape is:
 > ```json
@@ -107,7 +84,24 @@ brain-cache tools are called automatically. You don’t change how you work —
 
 ---
 
- ## 📊 Optional: Token savings footer
+ ## Install as Claude Code skill
+
+ brain-cache ships as a Claude Code skill. After `brain-cache init`, the skill is
+ installed at `.claude/skills/brain-cache/SKILL.md` in your project. Claude
+ automatically learns when and how to use brain-cache tools.
+
+ To install manually, copy the `.claude/skills/brain-cache/` directory into your
+ project root.
+
+ ---
+
+ ## Status line
+
+ After `brain-cache init`, the status line in Claude Code's bottom bar shows your cumulative token savings for the current session. You see the reduction without doing anything different.
+
+ ---
+
+ ## Optional: Token savings footer
 
 brain-cache returns token usage stats in its tool responses (tokens sent, estimated without, reduction %). By default, Claude decides whether to surface these — no footer is forced.
 
@@ -119,7 +113,7 @@ When using brain-cache build_context, include the token savings summary from the
 
 This keeps it transparent and under your control.
 
- ## 🎛 Tuning how much Claude uses brain-cache
+ ## Tuning how much Claude uses brain-cache
 
 `brain-cache init` adds a section to your project's `CLAUDE.md` with clear instructions to use brain-cache tools first. This works well for most users.
 
@@ -134,37 +128,7 @@ Or soften it if you prefer Claude to decide on its own. It's your `CLAUDE.md`
 
 ---
 
- ## 🧩 Core capabilities
-
- - 🧠 Local embeddings via Ollama — no API calls, no data sent out
- - 🔍 Semantic vector search over your codebase
- - ✂️ Context trimming and deduplication
- - 🎯 Token budget optimisation
- - 🤖 MCP server for Claude Code integration
- - ⚡ CLI for setup, debugging, and admin
-
- ---
-
- ## 🧠 Why it’s different
-
- Most AI coding tools:
-
- - send too much context
- - hide retrieval behind hosted services
- - require you to prompt-engineer your way to good answers
-
- brain-cache is:
-
- - 🏠 Local-first — embeddings run on your machine
- - 🔍 Transparent — you can inspect exactly what context gets sent
- - 🎯 Token-aware — every call shows the reduction
- - ⚙️ Developer-controlled — no vendor lock-in, no cloud dependency
-
- Think: **Vite, but for LLM context.**
-
- ---
-
- ## 🧪 CLI commands
+ ## CLI commands
 
 The CLI is the setup and admin interface. Use it to init, index, debug, and diagnose — not as the primary interface.
 
@@ -174,12 +138,13 @@ brain-cache index Build/rebuild the vector index
 brain-cache search "auth middleware" Manual search (useful for debugging)
 brain-cache context "auth flow" Manual context building (useful for debugging)
 brain-cache ask "how does auth work?" Direct Claude query via CLI
+ brain-cache status Show index and system status
 brain-cache doctor Check system health
 ```
 
 ---
 
- ## 📊 Token savings
+ ## Token savings
 
 Every call shows exactly what was saved:
 
@@ -187,41 +152,25 @@ Every call shows exactly what was saved:
 context: 1,240 tokens (93% reduction)
 ```
 
- Less noise better reasoning cheaper usage.
-
- ---
-
- ## 🧠 Built with GSD
-
- This project uses the GSD (Get Shit Done) framework — an AI-driven workflow for going from idea → research → plan → execution. brain-cache is both a product of that philosophy and a tool that makes it work better: tight context, better outcomes.
-
- ---
-
- ## ⚠️ Status
-
- Early stage — actively improving:
-
- - ⏳ reranking (planned)
- - ⏳ context compression
- - ⏳ live indexing (watch mode)
+ Less noise, better reasoning, cheaper usage.
 
 ---
 
- ## 🛠 Requirements
+ ## Requirements
 
- - Node.js 22+
- - Ollama running locally (`nomic-embed-text` model)
+ - Node.js >= 22
+ - Ollama running locally (`nomic-embed-text` model recommended)
 - Anthropic API key (for `ask` command only)
 
 ---
 
- ## ⭐️ If this is useful
+ ## If this is useful
 
 Give it a star — or try it on your repo and let me know what breaks.
 
 ---
 
- ## 📄 License
+ ## License
 
 MIT — see LICENSE for details.
 
@@ -10,8 +10,6 @@ Use brain-cache tools before reading files or using Grep/Glob for codebase quest
 |-----------|------|---------|
 | Locate a function, type, or symbol | \`search_codebase\` | \`build_context\` |
 | Understand how specific code works across files | \`build_context\` | file reads |
- | Trace a call path across files | \`trace_flow\` | \`build_context\` |
- | Explain project architecture or structure | \`explain_codebase\` | \`build_context\` |
 | Diagnose brain-cache failures | \`doctor\` | -- |
 | Reindex the project | \`index_repo\` | -- |
 
@@ -27,28 +25,6 @@ Call \`mcp__brain-cache__build_context\` with a focused question about how speci
 
 Use for: "How does X work?", "What does this function do?", debugging unfamiliar code paths.
 
- Do NOT use for architecture overviews (use explain_codebase) or call-path tracing (use trace_flow).
-
- ### trace_flow (trace call paths)
-
- Call \`mcp__brain-cache__trace_flow\` to trace how a function call propagates through the codebase. Returns structured hops showing the call chain across files.
-
- Use for: "How does X flow to Y?", "Trace how X calls Y across files", "What happens when X is called?", "Call path from X to Y".
-
- Do NOT use for code understanding queries like "how does X work" or "what does X do" \u2014 use build_context instead.
-
- Use trace_flow instead of build_context when the question is about call propagation or execution flow across files.
-
- ### explain_codebase (architecture overview)
-
- Call \`mcp__brain-cache__explain_codebase\` to get a module-grouped architecture overview. No follow-up question needed.
-
- Use for: "Explain the project architecture", "How is this project structured?", "What does this project do?", "Give me an overview of the codebase".
-
- Do NOT use for questions about specific code behavior or how a particular function works \u2014 use build_context instead.
-
- Use explain_codebase instead of build_context when the question is about overall structure or getting oriented.
-
 ### doctor (diagnose issues)
 
 Call \`mcp__brain-cache__doctor\` when any brain-cache tool fails or returns unexpected results.
package/dist/cli.js CHANGED
@@ -5,11 +5,11 @@ import {
 
 // src/cli/index.ts
 import { Command } from "commander";
- var version = true ? "2.1.0" : "dev";
+ var version = true ? "3.0.0" : "dev";
 var program = new Command();
 program.name("brain-cache").description("Local AI runtime \u2014 GPU cache layer for Claude").version(version);
 program.command("init").description("Detect hardware, pull embedding model, create config directory").action(async () => {
- const { runInit } = await import("./init-BCMT64T2.js");
+ const { runInit } = await import("./init-2E4JMZZC.js");
 await runInit();
 });
 program.command("doctor").description("Report system health: GPU, VRAM tier, Ollama status").action(async () => {
@@ -13,9 +13,10 @@ import {
 import "./chunk-TXLCXXKY.js";
 
 // src/workflows/init.ts
- import { existsSync, readFileSync, writeFileSync, appendFileSync, chmodSync, mkdirSync } from "fs";
- import { join } from "path";
+ import { existsSync, readFileSync, writeFileSync, appendFileSync, chmodSync, mkdirSync, copyFileSync } from "fs";
+ import { join, dirname } from "path";
 import { homedir } from "os";
+ import { fileURLToPath } from "url";
 async function runInit() {
 process.stderr.write("brain-cache: detecting hardware capabilities...\n");
 const profile = await detectCapabilities();
@@ -88,7 +89,7 @@ async function runInit() {
 process.stderr.write("brain-cache: created .mcp.json with brain-cache MCP server.\n");
 }
 const claudeMdPath = "CLAUDE.md";
- const { CLAUDE_MD_SECTION: brainCacheSection } = await import("./claude-md-section-O5LMKH4O.js");
+ const { CLAUDE_MD_SECTION: brainCacheSection } = await import("./claude-md-section-K47HUTE4.js");
 if (existsSync(claudeMdPath)) {
 const content = readFileSync(claudeMdPath, "utf-8");
 if (content.includes("## Brain-Cache MCP Tools")) {
@@ -101,6 +102,20 @@ async function runInit() {
 writeFileSync(claudeMdPath, brainCacheSection.trimStart());
 process.stderr.write("brain-cache: created CLAUDE.md with Brain-Cache MCP Tools section.\n");
 }
+ const currentFile = fileURLToPath(import.meta.url);
+ const packageRoot = join(dirname(currentFile), "..", "..");
+ const skillSource = join(packageRoot, ".claude", "skills", "brain-cache", "SKILL.md");
+ const skillTargetDir = join(process.cwd(), ".claude", "skills", "brain-cache");
+ const skillTarget = join(skillTargetDir, "SKILL.md");
+ if (existsSync(skillTarget)) {
+ process.stderr.write("brain-cache: skill already installed at .claude/skills/brain-cache/SKILL.md, skipping.\n");
+ } else if (!existsSync(skillSource)) {
+ process.stderr.write("brain-cache: Warning: skill source not found in package. Copy .claude/skills/brain-cache/ manually from the repo.\n");
+ } else {
+ mkdirSync(skillTargetDir, { recursive: true });
+ copyFileSync(skillSource, skillTarget);
+ process.stderr.write("brain-cache: installed skill to .claude/skills/brain-cache/SKILL.md\n");
+ }
 const { STATUSLINE_SCRIPT_CONTENT } = await import("./statusline-script-NFUDFOWK.js");
 const statuslinePath = join(homedir(), ".brain-cache", "statusline.mjs");
 if (existsSync(statuslinePath)) {
package/dist/mcp.js CHANGED
@@ -2359,7 +2359,7 @@ function accumulateStats(delta, ttlMs) {
 }
 
 // src/mcp/index.ts
- var version = true ? "2.1.0" : "dev";
+ var version = true ? "3.0.0" : "dev";
 var log13 = childLogger("mcp");
 var server = new McpServer({ name: "brain-cache", version });
 server.registerTool(
package/package.json CHANGED
@@ -1,6 +1,6 @@
 {
 "name": "brain-cache",
- "version": "2.1.0",
+ "version": "3.0.0",
 "description": "Local MCP-first context engine for Claude. Index your codebase, retrieve only what matters, and cut token usage.",
 "license": "MIT",
 "type": "module",
@@ -9,6 +9,7 @@
 },
 "files": [
 "dist/",
+ ".claude/skills/",
 "README.md",
 "LICENSE"
 ],