brainbank 0.4.1 → 0.6.0
- package/README.md +155 -91
- package/assets/architecture.png +0 -0
- package/dist/{base-4SUgeRWT.d.ts → base-B_vJSAbj.d.ts} +41 -45
- package/dist/chunk-424UFCY7.js +78 -0
- package/dist/chunk-424UFCY7.js.map +1 -0
- package/dist/{chunk-5VUYPNH3.js → chunk-7EZR47JV.js} +6 -3
- package/dist/chunk-7EZR47JV.js.map +1 -0
- package/dist/chunk-B77KABWH.js +41 -0
- package/dist/chunk-B77KABWH.js.map +1 -0
- package/dist/{chunk-Y3JKI6QN.js → chunk-C4KDZGRX.js} +236 -59
- package/dist/chunk-C4KDZGRX.js.map +1 -0
- package/dist/{chunk-FI7GWG4W.js → chunk-HPNUMUIF.js} +5 -2
- package/dist/chunk-HPNUMUIF.js.map +1 -0
- package/dist/{chunk-MGIFEPYZ.js → chunk-PXK62M5W.js} +38 -28
- package/dist/chunk-PXK62M5W.js.map +1 -0
- package/dist/{chunk-QNHBCOKB.js → chunk-U2Q2XGPZ.js} +7 -2
- package/dist/{chunk-QNHBCOKB.js.map → chunk-U2Q2XGPZ.js.map} +1 -1
- package/dist/{chunk-FINIFKAY.js → chunk-VVXYZIIB.js} +9 -8
- package/dist/chunk-VVXYZIIB.js.map +1 -0
- package/dist/{chunk-2BEWWQL2.js → chunk-YC4ZQLDN.js} +528 -436
- package/dist/chunk-YC4ZQLDN.js.map +1 -0
- package/dist/{chunk-E6WQM4DN.js → chunk-YOLKSYWK.js} +1 -1
- package/dist/chunk-YOLKSYWK.js.map +1 -0
- package/dist/chunk-ZNLN2VWV.js +110 -0
- package/dist/chunk-ZNLN2VWV.js.map +1 -0
- package/dist/cli.js +24 -31
- package/dist/cli.js.map +1 -1
- package/dist/code.d.ts +2 -2
- package/dist/code.js +2 -1
- package/dist/docs.d.ts +2 -2
- package/dist/docs.js +2 -1
- package/dist/git.d.ts +2 -2
- package/dist/git.js +2 -1
- package/dist/index.d.ts +81 -18
- package/dist/index.js +23 -11
- package/dist/index.js.map +1 -1
- package/dist/local-embedding-ZIMTK6PU.js +8 -0
- package/dist/local-embedding-ZIMTK6PU.js.map +1 -0
- package/dist/memory.d.ts +2 -2
- package/dist/memory.js +2 -2
- package/dist/notes.d.ts +2 -2
- package/dist/notes.js +3 -2
- package/dist/qwen3-reranker-3MHEENT5.js +8 -0
- package/dist/qwen3-reranker-3MHEENT5.js.map +1 -0
- package/dist/resolve-CUJWY6HP.js +10 -0
- package/dist/resolve-CUJWY6HP.js.map +1 -0
- package/package.json +9 -8
- package/dist/chunk-2BEWWQL2.js.map +0 -1
- package/dist/chunk-5VUYPNH3.js.map +0 -1
- package/dist/chunk-E6WQM4DN.js.map +0 -1
- package/dist/chunk-FI7GWG4W.js.map +0 -1
- package/dist/chunk-FINIFKAY.js.map +0 -1
- package/dist/chunk-MGIFEPYZ.js.map +0 -1
- package/dist/chunk-Y3JKI6QN.js.map +0 -1
package/README.md
CHANGED
|
@@ -5,13 +5,14 @@
|
|
|
5
5
|
BrainBank gives LLMs a long-term memory that persists between sessions.
|
|
6
6
|
|
|
7
7
|
- **All-in-one** — core + code + git + docs + CLI in a single `brainbank` package
|
|
8
|
-
- **Pluggable
|
|
8
|
+
- **Pluggable plugins** — `.use()` only what you need (code, git, docs, or custom)
|
|
9
9
|
- **Dynamic collections** — `brain.collection('errors')` for any structured data
|
|
10
10
|
- **Hybrid search** — vector + BM25 fused with Reciprocal Rank Fusion
|
|
11
11
|
- **Pluggable embeddings** — local WASM (free), OpenAI, or Perplexity (standard & contextualized)
|
|
12
12
|
- **Multi-repo** — index multiple repositories into one shared database
|
|
13
13
|
- **Portable** — single `.brainbank/brainbank.db` file
|
|
14
|
-
- **Optional packages** — [`@brainbank/memory`](#memory) (fact extraction + entity graph), [`@brainbank/
|
|
14
|
+
- **Optional packages** — [`@brainbank/memory`](#memory) (fact extraction + entity graph), [`@brainbank/mcp`](#mcp-server) (MCP server)
|
|
15
|
+
- **Optional reranker** — Qwen3-0.6B cross-encoder via `Qwen3Reranker` (opt-in)
|
|
15
16
|
|
|
16
17
|

|
|
17
18
|
|
|
@@ -28,7 +29,7 @@ Most AI memory solutions (mem0, Zep, LangMem) require cloud services, external d
|
|
|
28
29
|
| Infrastructure | **SQLite file** | Vector DB + cloud | Neo4j + cloud | LangGraph Platform |
|
|
29
30
|
| LLM required to write | **No**¹ | Yes | Yes | Yes |
|
|
30
31
|
| Code-aware | **19 AST-parsed languages (tree-sitter), git, co-edits** | ✗ | ✗ | ✗ |
|
|
31
|
-
| Custom
|
|
32
|
+
| Custom plugins | **`.use()` plugin system** | ✗ | ✗ | ✗ |
|
|
32
33
|
| Search | **Vector + BM25 + RRF** | Vector + graph² | Vector + BM25 + graph | Vector only |
|
|
33
34
|
| Framework lock-in | **None** | Optional | Zep cloud | LangChain |
|
|
34
35
|
| Portable | **Copy one file** | Tied to DB | Tied to cloud | Tied to platform |
|
|
@@ -50,12 +51,12 @@ Most AI memory solutions (mem0, Zep, LangMem) require cloud services, external d
|
|
|
50
51
|
- [Quick Start](#quick-start)
|
|
51
52
|
- [CLI](#cli)
|
|
52
53
|
- [Programmatic API](#programmatic-api)
|
|
53
|
-
- [
|
|
54
|
+
- [Plugins](#plugins)
|
|
54
55
|
- [Collections](#collections)
|
|
55
56
|
- [Search](#search)
|
|
56
57
|
- [Document Collections](#document-collections)
|
|
57
58
|
- [Context Generation](#context-generation)
|
|
58
|
-
- [Custom
|
|
59
|
+
- [Custom Plugins](#custom-plugins)
|
|
59
60
|
- [AI Agent Integration](#ai-agent-integration)
|
|
60
61
|
- [Examples](#examples)
|
|
61
62
|
- [Watch Mode](#watch-mode)
|
|
@@ -73,6 +74,7 @@ Most AI memory solutions (mem0, Zep, LangMem) require cloud services, external d
|
|
|
73
74
|
- [Benchmarks](#benchmarks)
|
|
74
75
|
- [Search Quality: AST vs Sliding Window](#search-quality-ast-vs-sliding-window)
|
|
75
76
|
- [Grammar Support](#grammar-support)
|
|
77
|
+
- [RAG Retrieval Quality](#rag-retrieval-quality) · [Full Results →](./BENCHMARKS.md)
|
|
76
78
|
|
|
77
79
|
---
|
|
78
80
|
|
|
@@ -87,20 +89,48 @@ npm install brainbank
|
|
|
87
89
|
| Package | When to install |
|
|
88
90
|
|---------|----------------|
|
|
89
91
|
| `@brainbank/memory` | Deterministic memory extraction + entity graph for LLM conversations |
|
|
90
|
-
| `@brainbank/reranker` | Cross-encoder reranker (Qwen3-0.6B, ~640MB model) |
|
|
91
92
|
| `@brainbank/mcp` | MCP server for AI tool integration |
|
|
92
93
|
|
|
93
94
|
```bash
|
|
94
95
|
# Memory — automatic fact extraction & dedup for chatbots/agents
|
|
95
96
|
npm install @brainbank/memory
|
|
96
97
|
|
|
97
|
-
# Reranker —
|
|
98
|
-
npm install
|
|
98
|
+
# Reranker — built-in, install the runtime dependency to enable
|
|
99
|
+
npm install node-llama-cpp
|
|
99
100
|
|
|
100
101
|
# MCP server — for Antigravity, Claude Desktop, etc.
|
|
101
102
|
npm install @brainbank/mcp
|
|
102
103
|
```
|
|
103
104
|
|
|
105
|
+
### Tree-Sitter Grammars
|
|
106
|
+
|
|
107
|
+
BrainBank uses [tree-sitter](https://tree-sitter.github.io/) for AST-aware code chunking. **JavaScript and TypeScript grammars are included by default.** Other languages require installing the corresponding grammar package:
|
|
108
|
+
|
|
109
|
+
```bash
|
|
110
|
+
# Install only the grammars you need
|
|
111
|
+
npm install tree-sitter-python tree-sitter-go tree-sitter-rust
|
|
112
|
+
```
|
|
113
|
+
|
|
114
|
+
If you index a file whose grammar isn't installed, BrainBank will throw a clear error:
|
|
115
|
+
|
|
116
|
+
```
|
|
117
|
+
BrainBank: Grammar 'tree-sitter-python' is not installed. Run: npm install tree-sitter-python
|
|
118
|
+
```
|
|
119
|
+
|
|
120
|
+
<details>
|
|
121
|
+
<summary>All available grammars (19 languages)</summary>
|
|
122
|
+
|
|
123
|
+
| Category | Packages |
|
|
124
|
+
|----------|----------|
|
|
125
|
+
| **Included** | `tree-sitter-javascript`, `tree-sitter-typescript` |
|
|
126
|
+
| Web | `tree-sitter-html`, `tree-sitter-css` |
|
|
127
|
+
| Systems | `tree-sitter-go`, `tree-sitter-rust`, `tree-sitter-c`, `tree-sitter-cpp`, `tree-sitter-swift` |
|
|
128
|
+
| JVM | `tree-sitter-java`, `tree-sitter-kotlin`, `tree-sitter-scala` |
|
|
129
|
+
| Scripting | `tree-sitter-python`, `tree-sitter-ruby`, `tree-sitter-php`, `tree-sitter-lua`, `tree-sitter-bash`, `tree-sitter-elixir` |
|
|
130
|
+
| .NET | `tree-sitter-c-sharp` |
|
|
131
|
+
|
|
132
|
+
</details>
|
|
133
|
+
|
|
104
134
|
---
|
|
105
135
|
|
|
106
136
|
## Quick Start
|
|
@@ -177,10 +207,10 @@ brainbank watch # Watch repo, auto re-index on save
|
|
|
177
207
|
# Watching /path/to/repo for changes...
|
|
178
208
|
# 14:30:02 ✓ code: src/api.ts
|
|
179
209
|
# 14:30:05 ✓ code: src/routes.ts
|
|
180
|
-
# 14:30:08 ✓ csv: data/metrics.csv ← custom
|
|
210
|
+
# 14:30:08 ✓ csv: data/metrics.csv ← custom plugin
|
|
181
211
|
```
|
|
182
212
|
|
|
183
|
-
> Watch mode monitors **code files** by default. [Custom
|
|
213
|
+
> Watch mode monitors **code files** by default. [Custom plugins](#custom-plugins) that implement `watchPatterns()` and `onFileChange()` are automatically picked up — their name appears in the console output alongside the built-in `code` plugin. Git history and document collections are not affected by file-system changes and must be re-indexed explicitly with `brainbank index` / `brainbank docs`.
|
|
184
214
|
|
|
185
215
|
### Document Collections
|
|
186
216
|
|
|
@@ -234,11 +264,11 @@ brainbank serve # Start MCP server (stdio)
|
|
|
234
264
|
|
|
235
265
|
Use BrainBank as a library in your TypeScript/Node.js project.
|
|
236
266
|
|
|
237
|
-
###
|
|
267
|
+
### Plugins
|
|
238
268
|
|
|
239
|
-
BrainBank uses pluggable
|
|
269
|
+
BrainBank uses pluggable plugins. Register only what you need with `.use()`:
|
|
240
270
|
|
|
241
|
-
|
|
|
271
|
+
| Plugin | Import | Description |
|
|
242
272
|
|---------|--------|-------------|
|
|
243
273
|
| `code` | `brainbank/code` | AST-aware code chunking via tree-sitter (19 languages) |
|
|
244
274
|
| `git` | `brainbank/git` | Git commit history, diffs, co-edit relationships |
|
|
@@ -250,7 +280,7 @@ import { code } from 'brainbank/code';
|
|
|
250
280
|
import { git } from 'brainbank/git';
|
|
251
281
|
import { docs } from 'brainbank/docs';
|
|
252
282
|
|
|
253
|
-
// Pick only the
|
|
283
|
+
// Pick only the plugins you need
|
|
254
284
|
const brain = new BrainBank({ repoPath: '.' })
|
|
255
285
|
.use(code())
|
|
256
286
|
.use(git())
|
|
@@ -291,7 +321,7 @@ decisions.prune({ olderThan: '30d' }); // remove older than 30 days
|
|
|
291
321
|
brain.listCollectionNames(); // → ['decisions', ...]
|
|
292
322
|
```
|
|
293
323
|
|
|
294
|
-
> 📂 See [examples/
|
|
324
|
+
> 📂 See [examples/collection](examples/collection/) for a complete runnable demo with cross-collection linking and metadata.
|
|
295
325
|
|
|
296
326
|
### Watch Mode
|
|
297
327
|
|
|
@@ -301,7 +331,7 @@ Auto-re-index when files change:
|
|
|
301
331
|
// API
|
|
302
332
|
const watcher = brain.watch({
|
|
303
333
|
debounceMs: 2000,
|
|
304
|
-
onIndex: (file,
|
|
334
|
+
onIndex: (file, plugin) => console.log(`${plugin}: ${file}`),
|
|
305
335
|
onError: (err) => console.error(err.message),
|
|
306
336
|
});
|
|
307
337
|
|
|
@@ -317,15 +347,15 @@ brainbank watch
|
|
|
317
347
|
# 14:30:05 ✓ code: src/routes.ts
|
|
318
348
|
```
|
|
319
349
|
|
|
320
|
-
#### Custom
|
|
350
|
+
#### Custom Plugin Watch
|
|
321
351
|
|
|
322
|
-
Custom
|
|
352
|
+
Custom plugins can hook into watch mode by implementing `onFileChange` and `watchPatterns`:
|
|
323
353
|
|
|
324
354
|
```typescript
|
|
325
|
-
import type {
|
|
355
|
+
import type { Plugin, PluginContext } from 'brainbank';
|
|
326
356
|
|
|
327
|
-
function
|
|
328
|
-
let ctx:
|
|
357
|
+
function csvPlugin(): Plugin {
|
|
358
|
+
let ctx: PluginContext;
|
|
329
359
|
|
|
330
360
|
return {
|
|
331
361
|
name: 'csv',
|
|
@@ -334,7 +364,7 @@ function csvIndexer(): Indexer {
|
|
|
334
364
|
ctx = context;
|
|
335
365
|
},
|
|
336
366
|
|
|
337
|
-
// Tell watch which files this
|
|
367
|
+
// Tell watch which files this plugin cares about
|
|
338
368
|
watchPatterns() {
|
|
339
369
|
return ['**/*.csv', '**/*.tsv'];
|
|
340
370
|
},
|
|
@@ -356,7 +386,7 @@ function csvIndexer(): Indexer {
|
|
|
356
386
|
|
|
357
387
|
const brain = new BrainBank({ dbPath: './brain.db' })
|
|
358
388
|
.use(code())
|
|
359
|
-
.use(
|
|
389
|
+
.use(csvPlugin());
|
|
360
390
|
|
|
361
391
|
await brain.initialize();
|
|
362
392
|
brain.watch(); // Now watches .ts, .py, etc. AND .csv, .tsv
|
|
@@ -423,16 +453,16 @@ const context = await brain.getContext('add rate limiting to the API', {
|
|
|
423
453
|
// Returns: ## Relevant Code, ## Git History, ## Relevant Documents
|
|
424
454
|
```
|
|
425
455
|
|
|
426
|
-
### Custom
|
|
456
|
+
### Custom Plugins
|
|
427
457
|
|
|
428
|
-
Implement the `
|
|
458
|
+
Implement the `Plugin` interface to build your own:
|
|
429
459
|
|
|
430
460
|
```typescript
|
|
431
|
-
import type {
|
|
461
|
+
import type { Plugin, PluginContext } from 'brainbank';
|
|
432
462
|
|
|
433
|
-
const
|
|
463
|
+
const myPlugin: Plugin = {
|
|
434
464
|
name: 'custom',
|
|
435
|
-
async initialize(ctx:
|
|
465
|
+
async initialize(ctx: PluginContext) {
|
|
436
466
|
// ctx.db — shared SQLite database
|
|
437
467
|
// ctx.embedding — shared embedding provider
|
|
438
468
|
// ctx.collection() — create dynamic collections
|
|
@@ -441,10 +471,10 @@ const myIndexer: Indexer = {
|
|
|
441
471
|
},
|
|
442
472
|
};
|
|
443
473
|
|
|
444
|
-
brain.use(
|
|
474
|
+
brain.use(myPlugin);
|
|
445
475
|
```
|
|
446
476
|
|
|
447
|
-
#### Using custom
|
|
477
|
+
#### Using custom plugins with the CLI
|
|
448
478
|
|
|
449
479
|
Drop `.ts` files into `.brainbank/indexers/` — the CLI auto-discovers them:
|
|
450
480
|
|
|
@@ -456,11 +486,11 @@ Drop `.ts` files into `.brainbank/indexers/` — the CLI auto-discovers them:
|
|
|
456
486
|
└── jira.ts
|
|
457
487
|
```
|
|
458
488
|
|
|
459
|
-
Each file exports a default `
|
|
489
|
+
Each file exports a default `Plugin`:
|
|
460
490
|
|
|
461
491
|
```typescript
|
|
462
492
|
// .brainbank/indexers/slack.ts
|
|
463
|
-
import type {
|
|
493
|
+
import type { Plugin } from 'brainbank';
|
|
464
494
|
|
|
465
495
|
export default {
|
|
466
496
|
name: 'slack',
|
|
@@ -468,14 +498,14 @@ export default {
|
|
|
468
498
|
const msgs = ctx.collection('slack_messages');
|
|
469
499
|
// ... fetch and index slack messages
|
|
470
500
|
},
|
|
471
|
-
} satisfies
|
|
501
|
+
} satisfies Plugin;
|
|
472
502
|
```
|
|
473
503
|
|
|
474
|
-
That's it — all CLI commands automatically pick up your
|
|
504
|
+
That's it — all CLI commands automatically pick up your plugins:
|
|
475
505
|
|
|
476
506
|
```bash
|
|
477
507
|
brainbank index # runs code + git + docs + slack + jira
|
|
478
|
-
brainbank stats # shows all
|
|
508
|
+
brainbank stats # shows all plugins
|
|
479
509
|
brainbank kv search slack_messages "deploy" # search slack data
|
|
480
510
|
```
|
|
481
511
|
|
|
@@ -493,18 +523,18 @@ export default {
|
|
|
493
523
|
};
|
|
494
524
|
```
|
|
495
525
|
|
|
496
|
-
Everything lives in `.brainbank/` — DB, config, and custom
|
|
526
|
+
Everything lives in `.brainbank/` — DB, config, and custom plugins:
|
|
497
527
|
|
|
498
528
|
```
|
|
499
529
|
.brainbank/
|
|
500
530
|
├── brainbank.db # SQLite database (auto-created)
|
|
501
531
|
├── config.ts # Optional project config
|
|
502
|
-
└── indexers/ # Optional custom
|
|
532
|
+
└── indexers/ # Optional custom plugin files
|
|
503
533
|
├── slack.ts
|
|
504
534
|
└── jira.ts
|
|
505
535
|
```
|
|
506
536
|
|
|
507
|
-
No folder and no config file? The CLI uses the built-in
|
|
537
|
+
No folder and no config file? The CLI uses the built-in plugins (`code`, `git`, `docs`).
|
|
508
538
|
|
|
509
539
|
---
|
|
510
540
|
|
|
@@ -555,19 +585,19 @@ Teach your AI coding agent to use BrainBank as persistent memory. Add an `AGENTS
|
|
|
555
585
|
| **Cursor** | Add rules in `.cursor/rules` |
|
|
556
586
|
| **MCP** (any agent) | See [MCP Server](#mcp-server) config below |
|
|
557
587
|
|
|
558
|
-
#### Custom
|
|
588
|
+
#### Custom Plugin: Auto-Ingest Conversation Logs
|
|
559
589
|
|
|
560
590
|
For agents that produce structured logs (e.g. Antigravity's `brain/` directory), auto-index them:
|
|
561
591
|
|
|
562
592
|
```typescript
|
|
563
593
|
// .brainbank/indexers/conversations.ts
|
|
564
|
-
import type {
|
|
594
|
+
import type { Plugin, PluginContext } from 'brainbank';
|
|
565
595
|
import * as fs from 'node:fs';
|
|
566
596
|
import * as path from 'node:path';
|
|
567
597
|
|
|
568
598
|
export default {
|
|
569
599
|
name: 'conversations',
|
|
570
|
-
async initialize(ctx:
|
|
600
|
+
async initialize(ctx: PluginContext) {
|
|
571
601
|
const conversations = ctx.collection('conversations');
|
|
572
602
|
const logsDir = path.join(ctx.config.repoPath, '.gemini/antigravity/brain');
|
|
573
603
|
if (!fs.existsSync(logsDir)) return;
|
|
@@ -583,7 +613,7 @@ export default {
|
|
|
583
613
|
});
|
|
584
614
|
}
|
|
585
615
|
},
|
|
586
|
-
} satisfies
|
|
616
|
+
} satisfies Plugin;
|
|
587
617
|
```
|
|
588
618
|
|
|
589
619
|
```bash
|
|
@@ -595,8 +625,9 @@ brainbank kv search conversations "what did we decide about auth"
|
|
|
595
625
|
|
|
596
626
|
| Example | Description | Run |
|
|
597
627
|
|---------|-------------|-----|
|
|
598
|
-
| [
|
|
599
|
-
| [
|
|
628
|
+
| [rag](examples/rag/) | RAG chatbot — docs retrieval + generation | `OPENAI_API_KEY=sk-... PERPLEXITY_API_KEY=pplx-... npx tsx examples/rag/rag.ts --docs <path>` |
|
|
629
|
+
| [memory](examples/memory/) | Memory chatbot — fact extraction + entity graph | `OPENAI_API_KEY=sk-... npx tsx examples/memory/memory.ts` |
|
|
630
|
+
| [collection](examples/collection/) | Collections, semantic search, tags, metadata linking | `npx tsx examples/collection/collection.ts` |
|
|
600
631
|
|
|
601
632
|
---
|
|
602
633
|
|
|
@@ -617,48 +648,36 @@ Add to your MCP config (`~/.gemini/antigravity/mcp_config.json` or Claude Deskto
|
|
|
617
648
|
"mcpServers": {
|
|
618
649
|
"brainbank": {
|
|
619
650
|
"command": "npx",
|
|
620
|
-
"args": ["-y", "@brainbank/mcp"]
|
|
621
|
-
"env": {
|
|
622
|
-
"BRAINBANK_EMBEDDING": "openai"
|
|
623
|
-
}
|
|
651
|
+
"args": ["-y", "@brainbank/mcp"]
|
|
624
652
|
}
|
|
625
653
|
}
|
|
626
654
|
}
|
|
627
655
|
```
|
|
628
656
|
|
|
629
|
-
The
|
|
657
|
+
**Zero-config.** The MCP server auto-detects:
|
|
658
|
+
- **Repo path** — from `repo` tool param > `BRAINBANK_REPO` env > `findRepoRoot(cwd)`
|
|
659
|
+
- **Embedding provider** — from `provider_key` stored in the DB (set during `brainbank index --embedding openai`)
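The repo-path precedence listed above (`repo` tool param, then `BRAINBANK_REPO`, then `findRepoRoot(cwd)`) can be sketched as a simple fallback chain. This is only an illustration: `resolveRepoPath` and its parameters are hypothetical names, and the real logic inside `@brainbank/mcp` is not shown in this diff.

```typescript
// Hypothetical helper illustrating the documented precedence:
// tool param > BRAINBANK_REPO env > findRepoRoot(cwd).
// The injected `findRepoRoot` stands in for whatever the MCP server actually calls.
function resolveRepoPath(
  toolParam: string | undefined,
  env: Record<string, string | undefined>,
  findRepoRoot: () => string,
): string {
  // `||` rather than `??`, so an empty string also falls through (an assumption).
  return toolParam || env["BRAINBANK_REPO"] || findRepoRoot();
}

// Example: no tool param, so the env var wins over auto-detection.
const resolvedRepo = resolveRepoPath(
  undefined,
  { BRAINBANK_REPO: "/work/api" },
  () => "/work/detected",
);
console.log(resolvedRepo);
```

Passing the environment and the root finder as arguments keeps the sketch free of process-level state, which makes the precedence easy to check in isolation.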
|
|
630
660
|
|
|
631
|
-
>
|
|
661
|
+
> [!TIP]
|
|
662
|
+
> Index your repo once with the CLI to set up the embedding provider:
|
|
663
|
+
> ```bash
|
|
664
|
+
> brainbank index . --embedding openai # stores provider_key=openai in DB
|
|
665
|
+
> ```
|
|
666
|
+
> After that, the MCP server (and any future CLI runs) auto-resolve the correct provider from the DB — no env vars needed.
|
|
632
667
|
|
|
633
|
-
>
|
|
634
|
-
|
|
635
|
-
> [!CAUTION]
|
|
636
|
-
> **Embedding Provider Consistency is Critical**
|
|
637
|
-
>
|
|
638
|
-
> The embedding provider used by the MCP server **must match** the one used during indexing. Mismatched dimensions cause `initialize()` to throw or search to return empty results.
|
|
639
|
-
>
|
|
640
|
-
> **Common failure scenario:**
|
|
641
|
-
> 1. You index via CLI with `BRAINBANK_EMBEDDING=openai` (1536 dims)
|
|
642
|
-
> 2. MCP server starts without `BRAINBANK_EMBEDDING` env var → defaults to local (384 dims)
|
|
643
|
-
> 3. **Result:** BrainBank throws `Embedding dimension mismatch` on every search
|
|
644
|
-
>
|
|
645
|
-
> **Fix:** Always set `BRAINBANK_EMBEDDING` consistently in your MCP config, CLI, and API usage. If you indexed with OpenAI, your MCP config **must** include `"BRAINBANK_EMBEDDING": "openai"`. Same for `perplexity` or `perplexity-context`. If you switch providers, run `brainbank reembed` to regenerate all vectors.
|
|
668
|
+
> [!NOTE]
|
|
669
|
+
> If you switch embedding providers (e.g. local → OpenAI), run `brainbank reembed` to regenerate all vectors. BrainBank auto-detects dimension mismatches and warns you.
|
|
646
670
|
|
|
647
671
|
### Available Tools
|
|
648
672
|
|
|
649
673
|
| Tool | Description |
|
|
650
674
|
|------|-------------|
|
|
651
|
-
| `
|
|
652
|
-
| `
|
|
653
|
-
| `
|
|
654
|
-
| `
|
|
655
|
-
| `
|
|
656
|
-
| `
|
|
657
|
-
| `brainbank_history` | Git history for a file |
|
|
658
|
-
| `brainbank_coedits` | Files that change together |
|
|
659
|
-
| `brainbank_collection_add` | Add item to a KV collection |
|
|
660
|
-
| `brainbank_collection_search` | Search a KV collection |
|
|
661
|
-
| `brainbank_collection_trim` | Trim a KV collection |
|
|
675
|
+
| `brainbank_search` | Unified search — `mode: hybrid` (default), `vector`, or `keyword` |
|
|
676
|
+
| `brainbank_context` | Formatted context block for a task (code + git + co-edits) |
|
|
677
|
+
| `brainbank_index` | Trigger incremental code/git/docs indexing |
|
|
678
|
+
| `brainbank_stats` | Index statistics (files, commits, chunks, collections) |
|
|
679
|
+
| `brainbank_history` | Git history for a specific file |
|
|
680
|
+
| `brainbank_collection` | KV collection ops — `action: add`, `search`, or `trim` |
|
|
662
681
|
|
|
663
682
|
---
|
|
664
683
|
|
|
@@ -666,7 +685,7 @@ The agent passes the `repo` parameter on each tool call based on the active work
|
|
|
666
685
|
|
|
667
686
|
```typescript
|
|
668
687
|
import { BrainBank, OpenAIEmbedding } from 'brainbank';
|
|
669
|
-
import { Qwen3Reranker } from '
|
|
688
|
+
import { Qwen3Reranker } from 'brainbank'; // built-in, requires node-llama-cpp
|
|
670
689
|
|
|
671
690
|
const brain = new BrainBank({
|
|
672
691
|
repoPath: '.',
|
|
@@ -752,7 +771,12 @@ Real benchmarks on a production NestJS backend (1052 code chunks + git history):
|
|
|
752
771
|
|
|
753
772
|
### Reranker
|
|
754
773
|
|
|
755
|
-
BrainBank
|
|
774
|
+
BrainBank ships with an optional cross-encoder reranker using **Qwen3-Reranker-0.6B** via `node-llama-cpp`. It runs 100% locally — no API keys needed. The reranker is **disabled by default**.
|
|
775
|
+
|
|
776
|
+
```bash
|
|
777
|
+
# Only requirement — the LLM runtime (model auto-downloads on first use)
|
|
778
|
+
npm install node-llama-cpp
|
|
779
|
+
```
|
|
756
780
|
|
|
757
781
|
#### When to Use It
|
|
758
782
|
|
|
@@ -775,7 +799,7 @@ The reranker runs local neural inference on every search result, which improves
|
|
|
775
799
|
|
|
776
800
|
```typescript
|
|
777
801
|
import { BrainBank } from 'brainbank';
|
|
778
|
-
import { Qwen3Reranker } from '
|
|
802
|
+
import { Qwen3Reranker } from 'brainbank';
|
|
779
803
|
|
|
780
804
|
const brain = new BrainBank({
|
|
781
805
|
reranker: new Qwen3Reranker(), // ~640MB model, auto-downloaded on first use
|
|
@@ -835,7 +859,7 @@ const brain = new BrainBank({ repoPath: '.' });
|
|
|
835
859
|
brain.use(notes());
|
|
836
860
|
await brain.initialize();
|
|
837
861
|
|
|
838
|
-
const notesPlugin = brain.
|
|
862
|
+
const notesPlugin = brain.plugin('notes');
|
|
839
863
|
|
|
840
864
|
// Store a conversation digest
|
|
841
865
|
await notesPlugin.remember({
|
|
@@ -877,7 +901,7 @@ const brain = new BrainBank({ repoPath: '.' });
|
|
|
877
901
|
brain.use(memory());
|
|
878
902
|
await brain.initialize();
|
|
879
903
|
|
|
880
|
-
const mem = brain.
|
|
904
|
+
const mem = brain.plugin('memory');
|
|
881
905
|
|
|
882
906
|
// Record a learning pattern
|
|
883
907
|
await mem.learn({
|
|
@@ -961,7 +985,7 @@ The `LLMProvider` interface works with any framework:
|
|
|
961
985
|
| Vercel AI SDK | `generateText()` → string |
|
|
962
986
|
| Any LLM | Implement `{ generate(messages) → string }` |
|
|
963
987
|
|
|
964
|
-
> 📂 See [examples/
|
|
988
|
+
> 📂 See [examples/memory](examples/memory/) for a runnable demo. All three LLM backends supported via `--llm` flag.
|
|
965
989
|
|
|
966
990
|
> 📦 Full docs: [packages/memory/README.md](packages/memory/README.md)
|
|
967
991
|
|
|
@@ -972,10 +996,12 @@ The `LLMProvider` interface works with any framework:
|
|
|
972
996
|
| Variable | Description |
|
|
973
997
|
|----------|-------------|
|
|
974
998
|
| `BRAINBANK_REPO` | Default repository path (optional — auto-detected from `.git/` or passed per tool call) |
|
|
975
|
-
| `
|
|
999
|
+
| `BRAINBANK_RERANKER` | Reranker: `none` (default), `qwen3` |
|
|
976
1000
|
| `BRAINBANK_DEBUG` | Show full stack traces |
|
|
977
|
-
| `OPENAI_API_KEY` | Required when using
|
|
978
|
-
| `PERPLEXITY_API_KEY` | Required when using
|
|
1001
|
+
| `OPENAI_API_KEY` | Required when using `--embedding openai` |
|
|
1002
|
+
| `PERPLEXITY_API_KEY` | Required when using `--embedding perplexity` or `perplexity-context` |
|
|
1003
|
+
|
|
1004
|
+
> **Note:** `BRAINBANK_EMBEDDING` env var has been removed. Use `brainbank index --embedding <provider>` on first index — the provider is stored in the DB and auto-resolved on subsequent runs.
|
|
979
1005
|
|
|
980
1006
|
---
|
|
981
1007
|
|
|
@@ -985,7 +1011,7 @@ BrainBank can index multiple repositories into a **single shared database**. Thi
|
|
|
985
1011
|
|
|
986
1012
|
### How It Works
|
|
987
1013
|
|
|
988
|
-
When you point BrainBank at a directory that contains multiple Git repositories (subdirectories with `.git/`), the CLI **auto-detects** them and creates namespaced
|
|
1014
|
+
When you point BrainBank at a directory that contains multiple Git repositories (subdirectories with `.git/`), the CLI **auto-detects** them and creates namespaced plugins:
|
|
989
1015
|
|
|
990
1016
|
```bash
|
|
991
1017
|
~/projects/
|
|
@@ -1019,9 +1045,9 @@ brainbank hsearch "cancel job confirmation" --repo ~/projects
|
|
|
1019
1045
|
# and shared utilities — all in one search.
|
|
1020
1046
|
```
|
|
1021
1047
|
|
|
1022
|
-
### Namespaced
|
|
1048
|
+
### Namespaced Plugins
|
|
1023
1049
|
|
|
1024
|
-
Each sub-repository gets its own namespaced
|
|
1050
|
+
Each sub-repository gets its own namespaced plugin instances (e.g., `code:frontend`, `git:backend`). Same-type plugins share a single HNSW vector index for efficient memory usage and unified search.
|
|
1025
1051
|
|
|
1026
1052
|
### Programmatic API
|
|
1027
1053
|
|
|
@@ -1084,7 +1110,7 @@ For large classes (>80 lines), the chunker descends into the class body and extr
|
|
|
1084
1110
|
|
|
1085
1111
|
All indexing is **incremental by default** — only new or changed content is processed:
|
|
1086
1112
|
|
|
1087
|
-
|
|
|
1113
|
+
| Plugin | How it detects changes | What gets skipped |
|
|
1088
1114
|
|---------|----------------------|-------------------|
|
|
1089
1115
|
| **Code** | FNV-1a hash of file content | Unchanged files |
|
|
1090
1116
|
| **Git** | Unique commit hash | Already-indexed commits |
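To make the content-hash check for code files concrete, here is a minimal sketch of FNV-1a change detection. It assumes the 32-bit variant over UTF-8 bytes; the hash width BrainBank uses and where it stores hashes are not stated here, so `fnv1a32`, `hasChanged`, and the in-memory `Map` are illustrative only.

```typescript
// 32-bit FNV-1a over the UTF-8 bytes of a string.
function fnv1a32(text: string): number {
  let hash = 0x811c9dc5; // FNV offset basis (32-bit)
  for (const byte of new TextEncoder().encode(text)) {
    hash ^= byte;
    hash = Math.imul(hash, 0x01000193) >>> 0; // multiply by FNV prime, keep unsigned 32-bit
  }
  return hash >>> 0;
}

// Returns true when the file is new or its content hash differs from the stored one,
// and records the new hash. A real index would persist hashes instead of using a Map.
function hasChanged(path: string, content: string, seen: Map<string, number>): boolean {
  const hash = fnv1a32(content);
  if (seen.get(path) === hash) return false; // unchanged: skip re-indexing
  seen.set(path, hash);
  return true;
}

const seenHashes = new Map<string, number>();
const firstPass = hasChanged("src/api.ts", "export const a = 1;", seenHashes);
const secondPass = hasChanged("src/api.ts", "export const a = 1;", seenHashes);
const afterEdit = hasChanged("src/api.ts", "export const a = 2;", seenHashes);
console.log(firstPass, secondPass, afterEdit);
```

A non-cryptographic hash is enough for this purpose because the goal is only to notice edits cheaply, not to resist deliberate collisions.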
|
|
@@ -1233,6 +1259,41 @@ All 9 core grammars verified, each parsing in **<0.05ms**:
|
|
|
1233
1259
|
|
|
1234
1260
|
> Additional grammars available: C++, Swift, C#, Kotlin, Scala, Lua, Elixir, Bash, HTML, CSS
|
|
1235
1261
|
|
|
1262
|
+
### RAG Retrieval Quality
|
|
1263
|
+
|
|
1264
|
+
BrainBank's hybrid search pipeline (Vector + BM25 → RRF) with Perplexity Context embeddings (2560d):
|
|
1265
|
+
|
|
1266
|
+
| Benchmark | Metric | Score |
|
|
1267
|
+
|---|---|:---:|
|
|
1268
|
+
| **BEIR SciFact** (5,183 docs, 300 queries) | NDCG@10 | **0.761** |
|
|
1269
|
+
| **Custom semantic** (69 docs, 20 queries) | R@5 | **83%** |
|
|
1270
|
+
|
|
1271
|
+
The hybrid pipeline improved R@5 by **+26pp over vector-only** retrieval on our custom eval.
|
|
1272
|
+
|
|
1273
|
+
#### BrainBank vs QMD (Head-to-Head)
|
|
1274
|
+
|
|
1275
|
+
Compared against [QMD](https://github.com/tobi/qmd), a local-first search engine using GGUF models (embeddinggemma-300M + query expansion + reranker) — same corpus, same 20 queries:
|
|
1276
|
+
|
|
1277
|
+
| Metric | BrainBank + Reranker | QMD + Reranker |
|
|
1278
|
+
|---|:---:|:---:|
|
|
1279
|
+
| **R@5** | **83%** | 65% |
|
|
1280
|
+
| **MRR** | **0.57** | 0.45 |
|
|
1281
|
+
| **Misses** | **1/20** | 6/20 |
|
|
1282
|
+
|
|
1283
|
+
> BrainBank wins by +18pp R@5. QMD is competitive on semantic queries (81% vs 94%) and ties on broad queries (83% vs 83%) — impressive for a fully local pipeline with zero API calls.
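For readers unfamiliar with the metrics in the table, the sketch below shows one common way to compute R@k and MRR. It assumes R@k is the mean, over queries, of the fraction of each query's relevant documents found in the top k; the exact definitions used by the eval scripts are not shown in this diff, and all names here are illustrative.

```typescript
interface QueryResult {
  retrieved: string[];   // ranked document ids, best first (assumed free of duplicates)
  relevant: Set<string>; // ground-truth relevant ids for the query
}

function mean(values: number[]): number {
  return values.length === 0 ? 0 : values.reduce((a, b) => a + b, 0) / values.length;
}

// Recall@k: share of a query's relevant documents that appear in its top k, averaged over queries.
function recallAtK(results: QueryResult[], k: number): number {
  return mean(
    results.map(({ retrieved, relevant }) => {
      if (relevant.size === 0) return 0;
      const hits = retrieved.slice(0, k).filter((id) => relevant.has(id)).length;
      return hits / relevant.size;
    }),
  );
}

// MRR: reciprocal of the rank of the first relevant document (0 if none), averaged over queries.
function meanReciprocalRank(results: QueryResult[]): number {
  return mean(
    results.map(({ retrieved, relevant }) => {
      const index = retrieved.findIndex((id) => relevant.has(id));
      return index === -1 ? 0 : 1 / (index + 1);
    }),
  );
}

const sampleRuns: QueryResult[] = [
  { retrieved: ["a", "b", "c"], relevant: new Set(["b"]) },
  { retrieved: ["x", "y", "z"], relevant: new Set(["z", "w"]) },
];
console.log(recallAtK(sampleRuns, 2), meanReciprocalRank(sampleRuns));
```

In the sample, the first query finds its only relevant document at rank 2 and the second finds one of two at rank 3, so Recall@2 is 0.5 and MRR is the average of 1/2 and 1/3.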
|
|
1284
|
+
|
|
1285
|
+
See **[BENCHMARKS.md](./BENCHMARKS.md)** for full pipeline progression, per-technique impact, QMD comparison details, and reproduction instructions.
|
|
1286
|
+
|
|
1287
|
+
#### Running the RAG Eval
|
|
1288
|
+
|
|
1289
|
+
```bash
|
|
1290
|
+
# Custom eval on your own docs
|
|
1291
|
+
PERPLEXITY_API_KEY=pplx-... npx tsx test/benchmarks/rag/eval.ts --docs ~/path/to/docs
|
|
1292
|
+
|
|
1293
|
+
# BEIR standard benchmark
|
|
1294
|
+
PERPLEXITY_API_KEY=pplx-... npx tsx test/benchmarks/rag/beir-eval.ts --dataset scifact
|
|
1295
|
+
```
|
|
1296
|
+
|
|
1236
1297
|
### Running Benchmarks
|
|
1237
1298
|
|
|
1238
1299
|
```bash
|
|
@@ -1241,6 +1302,9 @@ node test/benchmarks/grammar-support.mjs
|
|
|
1241
1302
|
|
|
1242
1303
|
# Search quality A/B (uses BrainBank's own source files)
|
|
1243
1304
|
node test/benchmarks/search-quality.mjs
|
|
1305
|
+
|
|
1306
|
+
# RAG retrieval quality (requires Perplexity API key + docs folder)
|
|
1307
|
+
PERPLEXITY_API_KEY=pplx-... npx tsx test/benchmarks/rag/eval.ts --docs ~/path/to/docs
|
|
1244
1308
|
```
|
|
1245
1309
|
|
|
1246
1310
|
---
|
|
@@ -1259,7 +1323,7 @@ node test/benchmarks/search-quality.mjs
|
|
|
1259
1323
|
│ │
|
|
1260
1324
|
│ ┌─────────┐ ┌─────────┐ ┌─────────┐ ┌────────────┐│
|
|
1261
1325
|
│ │ Code │ │ Git │ │ Docs │ │ Collection ││
|
|
1262
|
-
│ │
|
|
1326
|
+
│ │ Plugin │ │ Indexer │ │ Indexer │ │ (dynamic) ││
|
|
1263
1327
|
│ └────┬────┘ └────┬────┘ └────┬────┘ └─────┬──────┘│
|
|
1264
1328
|
│ │ │ │ │ │
|
|
1265
1329
|
│ ┌────▼────┐ ┌────▼────┐ ┌────▼────┐ ┌─────▼──────┐│
|
|
@@ -1309,7 +1373,7 @@ Final results (sorted by blended score)
|
|
|
1309
1373
|
|
|
1310
1374
|
### Data Flow
|
|
1311
1375
|
|
|
1312
|
-
1. **Index** —
|
|
1376
|
+
1. **Index** — Plugins parse files into chunks (tree-sitter AST for code, heading-based for docs)
|
|
1313
1377
|
2. **Embed** — Each chunk gets a vector (local WASM or OpenAI)
|
|
1314
1378
|
3. **Store** — Chunks + vectors → SQLite, vectors → HNSW index
|
|
1315
1379
|
4. **Search** — Query → HNSW k-NN + BM25 keyword → RRF fusion → optional reranker
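The RRF fusion step in the search stage can be illustrated with a short sketch of Reciprocal Rank Fusion. This is the textbook formulation, not BrainBank's implementation: the constant `k = 60` is the value commonly used for RRF, and `reciprocalRankFusion` is a hypothetical name.

```typescript
// Reciprocal Rank Fusion: score(d) = sum over rankings of 1 / (k + rank of d),
// with ranks starting at 1. Documents missing from a ranking contribute nothing for it.
function reciprocalRankFusion(
  rankings: string[][],
  k = 60, // commonly used default; BrainBank's actual constant is not shown here
): Array<{ id: string; score: number }> {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      const rank = index + 1;
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  return [...scores.entries()]
    .map(([id, score]) => ({ id, score }))
    .sort((a, b) => b.score - a.score);
}

// Illustrative inputs: one list from vector k-NN, one from BM25 keyword search.
const vectorHits = ["auth.ts", "routes.ts", "db.ts"];
const keywordHits = ["routes.ts", "limits.ts", "auth.ts"];
const fused = reciprocalRankFusion([vectorHits, keywordHits]);
console.log(fused.map((entry) => entry.id)); // routes.ts, auth.ts, limits.ts, db.ts
```

Because RRF uses only ranks, the vector similarities and BM25 scores never need to be normalized onto a common scale, which is the usual reason for choosing it when combining heterogeneous retrievers.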
|
|
@@ -1320,8 +1384,8 @@ Final results (sorted by blended score)
|
|
|
1320
1384
|
## Testing
|
|
1321
1385
|
|
|
1322
1386
|
```bash
|
|
1323
|
-
npm test # Unit tests (
|
|
1324
|
-
npm test -- --integration # Full suite (
|
|
1387
|
+
npm test # Unit tests (172 tests)
|
|
1388
|
+
npm test -- --integration # Full suite (includes real models + all domains)
|
|
1325
1389
|
npm test -- --filter code # Filter by test name
|
|
1326
1390
|
npm test -- --verbose # Show assertion details
|
|
1327
1391
|
```
|
package/assets/architecture.png
CHANGED
|
Binary file
|