stellavault 0.8.2 → 0.8.4

package/README.md CHANGED
@@ -1,6 +1,6 @@
1
1
  # Stellavault
2
2
 
3
- [![CI](https://github.com/Evanciel/stellavault/actions/workflows/ci.yml/badge.svg)](https://github.com/Evanciel/stellavault/actions/workflows/ci.yml) [![npm](https://img.shields.io/npm/v/stellavault)](https://www.npmjs.com/package/stellavault) [![tests](https://img.shields.io/badge/tests-223%20passing-brightgreen)]()
3
+ [![CI](https://github.com/Evanciel/stellavault/actions/workflows/ci.yml/badge.svg)](https://github.com/Evanciel/stellavault/actions/workflows/ci.yml) [![npm](https://img.shields.io/npm/v/stellavault)](https://www.npmjs.com/package/stellavault) [![tests](https://img.shields.io/badge/tests-245%20passing-brightgreen)]() [![node](https://img.shields.io/badge/node-%E2%89%A520-339933?logo=node.js&logoColor=white)]() [![license](https://img.shields.io/badge/license-MIT-blue)](LICENSE)
4
4
 
5
5
  > **Drop anything. It compiles itself into knowledge.** Claude remembers everything you know.
6
6
 
@@ -11,6 +11,10 @@ Self-compiling knowledge base with a full-featured editor, 3D neural graph, AI-p
11
11
  <br><em>Your vault as a neural network. Local-first, no cloud required.</em>
12
12
  </p>
13
13
 
14
+ ## Contents
15
+
16
+ [Install](#install) · [Editor](#editor) · [Pipeline](#the-pipeline) · [Intelligence](#intelligence-what-makes-stellavault-unique) · [Search & Ranking](#search--ranking) · [MCP Integration](#mcp-integration-21-tools) · [3D Visualization](#3d-visualization) · [Configuration](#configuration) · [Performance](#performance) · [Tech Stack](#tech-stack) · [Security](#security) · [Troubleshooting](#troubleshooting)
17
+
14
18
  ## Install
15
19
 
16
20
  ### Desktop App (Recommended — one click)
@@ -120,6 +124,24 @@ These features do **not exist** in Obsidian — even with plugins.
120
124
 
121
125
  ---
122
126
 
127
+ ## Search & Ranking
128
+
129
+ Hybrid retrieval that fuses multiple signals with **weighted Reciprocal Rank Fusion (RRF)** — tuned for a personal knowledge vault, fully local, zero API keys:
130
+
131
+ | Signal | What it captures | Default weight |
132
+ |--------|------------------|---------------:|
133
+ | **Semantic** (dense) | meaning; multilingual (50+ languages) | `1.0` |
134
+ | **BM25** (keyword) | exact terms, code, names | `1.0` |
135
+ | **Entity-linking** | your `[[wikilinks]]`, `#tags`, headings, titles — the curated graph | `1.5` |
136
+ | **FSRS recency** | gently surfaces notes you're actively using / forgetting | `±10%` |
137
+
138
+ - **Entity matching** resolves natural-language queries via fuzzy substring + punctuation-normalized matching (Korean / CJK friendly), with a **per-document diversity cap** so one large note can't flood the top results.
139
+ - **Recency** reuses the same FSRS memory model as the decay engine (not raw file mtime) — a note you're forgetting resurfaces; a mastered evergreen note isn't buried just for being old.
140
+ - **Adaptive rerank** (long-running MCP server) further boosts results by your current session context (recent tags / paths).
141
+ - Every weight is **tunable** per vault or via env vars — see [Configuration](#configuration).
142
+
143
+ ---
144
+
123
145
  ## MCP Integration (21 Tools)
124
146
 
125
147
  ```bash
@@ -128,13 +150,13 @@ stellavault setup # one command → Claude Code, Claude Desktop, Curs
128
150
  claude mcp add stellavault -- stellavault serve
129
151
  ```
130
152
 
131
- Claude can search, ask, draft, lint, and analyze your vault directly. Search
132
- fuses **semantic + BM25 + entity-linking** — your `[[wikilinks]]`, tags, and
133
- headings become retrieval signals — with session-adaptive reranking.
153
+ Claude can search, ask, draft, lint, and analyze your vault directly. Search runs
154
+ the full hybrid pipeline — **weighted RRF** over semantic + BM25 + entity-linking,
155
+ plus **FSRS recency** and session-adaptive reranking (see [Search & Ranking](#search--ranking)).
134
156
 
135
157
  | Tool | What it does |
136
158
  |------|-------------|
137
- | `search` | Hybrid semantic + BM25 + entity-linking, adaptive rerank |
159
+ | `search` | Weighted RRF (semantic + BM25 + entity) + FSRS recency + adaptive rerank |
138
160
  | `ask` | Vault-grounded Q&A |
139
161
  | `generate-draft` | AI drafts from your knowledge |
140
162
  | `get-decay-status` | Memory decay report (FSRS) |
@@ -218,6 +240,34 @@ stellavault decay # What are you forgetting?
218
240
 
219
241
  ---
220
242
 
243
+ ## Configuration
244
+
245
+ Stellavault reads `./.stellavault.json` (or `~/.stellavault.json`). Search ranking is fully tunable — sensible defaults work out of the box:
246
+
247
+ ```jsonc
248
+ {
249
+ "search": {
250
+ "rrfK": 60,
251
+ "weights": { "semantic": 1.0, "bm25": 1.0, "entity": 1.5 },
252
+ "recencyWeight": 0.2, // FSRS recency strength; 0 = off
253
+ "entityAliases": { "k8s": ["kubernetes"] } // synonym / cross-lingual groups (exact-only)
254
+ }
255
+ }
256
+ ```
257
+
258
+ Environment variables override config (parsed with guards):
259
+
260
+ | Env var | Effect |
261
+ |---------|--------|
262
+ | `STELLAVAULT_W_SEMANTIC` / `_BM25` / `_ENTITY` | per-signal RRF weight (e.g. `STELLAVAULT_W_ENTITY=2.0` for aggressive entity surfacing) |
263
+ | `STELLAVAULT_RECENCY_WEIGHT` | recency strength `0`–`1` (`0` disables) |
264
+ | `STELLAVAULT_DB_PATH` | override the index DB location |
265
+ | `STELLAVAULT_WATCH` | `0` to disable the auto-reindex file watcher while `serve` runs |
266
+
267
+ > Note: cross-lingual recall (e.g. a Korean query finding English notes) is handled automatically by the multilingual embedding model — `entityAliases` is an optional precision boost for the curated entity graph (tags / wikilinks) and abbreviations.
268
+
269
+ ---
270
+
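The README says environment variables override the config file and are "parsed with guards", but the diff does not show that parser. The following is a minimal sketch of what such a guard could look like, not the package's implementation. The helper names `envNumber` and `applyEnvOverrides` are hypothetical, the expanded name `STELLAVAULT_W_BM25` is inferred from the table's shorthand, and falling back to the configured value on invalid input (rather than clamping) is an assumption.

```javascript
// Hypothetical guard: accept only finite numbers inside [min, max]; otherwise keep the fallback.
function envNumber(env, name, fallback, { min = 0, max = Infinity } = {}) {
  const raw = env[name];
  if (raw === undefined || raw.trim() === "") return fallback;
  const n = Number(raw);
  if (!Number.isFinite(n) || n < min || n > max) return fallback;
  return n;
}

// Hypothetical: layer env overrides on top of an already-loaded `search` config object.
function applyEnvOverrides(search, env) {
  return {
    ...search,
    weights: {
      semantic: envNumber(env, "STELLAVAULT_W_SEMANTIC", search.weights.semantic),
      bm25: envNumber(env, "STELLAVAULT_W_BM25", search.weights.bm25),
      entity: envNumber(env, "STELLAVAULT_W_ENTITY", search.weights.entity),
    },
    // The README documents a 0 to 1 range for recency strength (0 disables it).
    recencyWeight: envNumber(env, "STELLAVAULT_RECENCY_WEIGHT", search.recencyWeight, { max: 1 }),
  };
}

const defaults = { rrfK: 60, weights: { semantic: 1.0, bm25: 1.0, entity: 1.5 }, recencyWeight: 0.2 };
const tuned = applyEnvOverrides(defaults, {
  STELLAVAULT_W_ENTITY: "2.0",                // valid: overrides 1.5
  STELLAVAULT_RECENCY_WEIGHT: "not-a-number", // invalid: the config value is kept
});
console.log(tuned.weights.entity, tuned.recencyWeight); // 2 0.2
```

In a real process the second argument would be `process.env`; passing a plain object here keeps the sketch easy to test.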
221
271
  ## Performance
222
272
 
223
273
  Tested on synthetic vaults — all operations under 1 second for typical use cases:
@@ -253,7 +303,7 @@ Key optimizations:
253
303
  | Runtime | Node.js 20+ (ESM, TypeScript) |
254
304
  | Vector Store | SQLite-vec (local, zero config) |
255
305
  | Embedding | MiniLM-L12-v2 (local, 50+ languages, batch processing) |
256
- | Search | BM25 + Cosine + RRF Fusion |
306
+ | Search | Weighted RRF (semantic + BM25 + entity) + FSRS recency |
257
307
  | Math | KaTeX (inline + display) |
258
308
  | Code | lowlight / highlight.js (40+ languages) |
259
309
  | 3D | React Three Fiber + Three.js |
@@ -40,7 +40,9 @@ function mergeConfig(defaults, overrides) {
40
40
  ...defaults.search,
41
41
  ...overrides.search,
42
42
  // B3 §4 — deep-merge weights so a partial override keeps the other defaults.
43
- weights: { ...defaults.search.weights, ...overrides.search?.weights }
43
+ weights: { ...defaults.search.weights, ...overrides.search?.weights },
44
+ // B2.2 — merge alias groups (override wins per-key).
45
+ entityAliases: { ...defaults.search.entityAliases, ...overrides.search?.entityAliases }
44
46
  },
45
47
  mcp: { ...defaults.mcp, ...overrides.mcp }
46
48
  };
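To see why the nested spreads in this hunk matter, here is a reduced sketch that models only the `search` branch of the merge. The function name `mergeSearch` and the sample objects are illustrative; the two nested spread expressions are taken from the hunk above.

```javascript
// Reduced sketch of the merge shown above; only the `search` branch is modelled.
function mergeSearch(defaults, overrides) {
  return {
    ...defaults.search,
    ...overrides.search,
    // Spreading undefined inside an object literal is a no-op, so a missing override is safe.
    weights: { ...defaults.search.weights, ...overrides.search?.weights },
    entityAliases: { ...defaults.search.entityAliases, ...overrides.search?.entityAliases },
  };
}

const baseConfig = {
  search: { rrfK: 60, weights: { semantic: 1, bm25: 1, entity: 1.5 }, recencyWeight: 0.2, entityAliases: {} },
};
const userConfig = { search: { weights: { entity: 2 }, entityAliases: { k8s: ["kubernetes"] } } };

const merged = mergeSearch(baseConfig, userConfig);
console.log(merged.weights);       // { semantic: 1, bm25: 1, entity: 2 }
console.log(merged.entityAliases); // { k8s: [ 'kubernetes' ] }
```

Without the nested spread, the shallow `...overrides.search` would replace `weights` wholesale and the partial override `{ entity: 2 }` would drop `semantic` and `bm25`. Note that alias groups are merged per key: an override for `k8s` replaces the default `k8s` array rather than concatenating with it.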
@@ -96,8 +98,10 @@ var init_config = __esm({
96
98
  rrfK: 60,
97
99
  weights: { semantic: 1, bm25: 1, entity: 1.5 },
98
100
  // B2.1: entity leads (per-doc cap prevents flooding)
99
- recencyWeight: 0.2
101
+ recencyWeight: 0.2,
100
102
  // B3 §1.3 (±10% bound)
103
+ entityAliases: {}
104
+ // B2.2 — user-defined synonym groups
101
105
  },
102
106
  mcp: {
103
107
  mode: "stdio",
@@ -496,6 +500,34 @@ function extractQueryTerms(query) {
496
500
  }
497
501
  return [...set].slice(0, MAX_QUERY_TERMS);
498
502
  }
503
+ function buildAliasIndex(aliases) {
504
+ const index = /* @__PURE__ */ new Map();
505
+ if (!aliases)
506
+ return index;
507
+ for (const [key, arr] of Object.entries(aliases)) {
508
+ const group = [normalize(key), ...(Array.isArray(arr) ? arr : []).map(normalize)].filter(Boolean);
509
+ const uniq = [...new Set(group)];
510
+ if (uniq.length < 2)
511
+ continue;
512
+ for (const term of uniq) {
513
+ const others = uniq.filter((t2) => t2 !== term);
514
+ index.set(term, [.../* @__PURE__ */ new Set([...index.get(term) ?? [], ...others])]);
515
+ }
516
+ }
517
+ return index;
518
+ }
519
+ function expandWithAliases(terms, aliasIndex) {
520
+ if (!aliasIndex || aliasIndex.size === 0)
521
+ return terms;
522
+ const out = new Set(terms);
523
+ for (const t2 of terms) {
524
+ const syn = aliasIndex.get(t2);
525
+ if (syn)
526
+ for (const s of syn)
527
+ out.add(s);
528
+ }
529
+ return [...out].slice(0, MAX_QUERY_TERMS);
530
+ }
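A small usage sketch may help show what the two new functions produce. `normalize` and `MAX_QUERY_TERMS` are defined elsewhere in the bundle and are not visible in this diff, so the stand-ins below (lowercase plus trim, and a cap of 12) are assumptions. The two function bodies are condensed from the added lines above.

```javascript
// Assumptions: the real `normalize` and MAX_QUERY_TERMS are not shown in the diff.
const MAX_QUERY_TERMS = 12;
const normalize = (s) => String(s).trim().toLowerCase();

function buildAliasIndex(aliases) {
  const index = new Map();
  if (!aliases) return index;
  for (const [key, arr] of Object.entries(aliases)) {
    const group = [normalize(key), ...(Array.isArray(arr) ? arr : []).map(normalize)].filter(Boolean);
    const uniq = [...new Set(group)];
    if (uniq.length < 2) continue; // a group needs at least two distinct terms
    for (const term of uniq) {
      const others = uniq.filter((t) => t !== term);
      index.set(term, [...new Set([...(index.get(term) ?? []), ...others])]);
    }
  }
  return index;
}

function expandWithAliases(terms, aliasIndex) {
  if (!aliasIndex || aliasIndex.size === 0) return terms;
  const out = new Set(terms);
  for (const t of terms) for (const s of aliasIndex.get(t) ?? []) out.add(s);
  return [...out].slice(0, MAX_QUERY_TERMS);
}

const index = buildAliasIndex({ k8s: ["Kubernetes", "kube"], solo: [] });
console.log(index.get("kubernetes"));                     // [ 'k8s', 'kube' ]
console.log(index.has("solo"));                           // false (single-term group is skipped)
console.log(expandWithAliases(["deploy", "k8s"], index)); // [ 'deploy', 'k8s', 'kubernetes', 'kube' ]
```

The index is symmetric: every member of a group maps to all the other members, so a query for `kubernetes` expands to `k8s` just as `k8s` expands to `kubernetes`. Expansion is a single hop and is capped at `MAX_QUERY_TERMS`.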
499
531
  var MAX_ENTITIES_PER_CHUNK, MAX_QUERY_TERMS, STOPWORDS;
500
532
  var init_entity_extractor = __esm({
501
533
  "packages/core/dist/indexer/entity-extractor.js"() {
@@ -3905,16 +3937,17 @@ function createSqliteVecStore(dbPath, dimensions = 384) {
3905
3937
  // FTS5 rank is negative (lower = better)
3906
3938
  }));
3907
3939
  },
3908
- async searchEntities(entities, limit) {
3909
- if (!entities || entities.length === 0)
3940
+ async searchEntities(entities, limit, exactExtra = []) {
3941
+ if ((!entities || entities.length === 0) && exactExtra.length === 0)
3910
3942
  return [];
3911
- const exactPH = entities.map(() => "?").join(",");
3943
+ const allExact = [...entities, ...exactExtra];
3944
+ const exactPH = allExact.map(() => "?").join(",");
3912
3945
  const fuzzy = entities.filter((t2) => t2.length >= 4 && (/\s/.test(t2) || /[^\x00-\x7f]/.test(t2) || t2.length >= 6)).slice(0, 16);
3913
3946
  let matched;
3914
3947
  let matchedParams;
3915
3948
  if (fuzzy.length === 0) {
3916
3949
  matched = `SELECT chunk_id, CAST(COUNT(*) AS REAL) AS score FROM chunk_entities WHERE entity IN (${exactPH}) GROUP BY chunk_id`;
3917
- matchedParams = [...entities];
3950
+ matchedParams = [...allExact];
3918
3951
  } else {
3919
3952
  const esc = (t2) => t2.replace(/[\\%_]/g, "\\$&");
3920
3953
  const likeClause = fuzzy.map(() => `entity LIKE ? ESCAPE '\\'`).join(" OR ");
@@ -3925,7 +3958,7 @@ function createSqliteVecStore(dbPath, dimensions = 384) {
3925
3958
  SELECT chunk_id, 0.4 AS w FROM chunk_entities
3926
3959
  WHERE (${likeClause}) AND entity NOT IN (${exactPH})
3927
3960
  ) GROUP BY chunk_id`;
3928
- matchedParams = [...entities, ...fuzzy.map((t2) => `%${esc(t2)}%`), ...entities];
3961
+ matchedParams = [...allExact, ...fuzzy.map((t2) => `%${esc(t2)}%`), ...allExact];
3929
3962
  }
3930
3963
  const rows = db.prepare(`
3931
3964
  SELECT chunk_id, score FROM (
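The hunk above sends alias terms only to the exact `IN (...)` match (`allExact`), while the fuzzy `LIKE` branch is still built from the original `entities`. The sketch below isolates the fuzzy-candidate predicate and the `LIKE` escape helper from that code to show which terms qualify. The constant name `isFuzzyCandidate` and the sample terms are made up for illustration.

```javascript
// Predicate and escape helper lifted from the hunk above; the sample terms are invented.
const isFuzzyCandidate = (t) =>
  t.length >= 4 && (/\s/.test(t) || /[^\x00-\x7f]/.test(t) || t.length >= 6);

// Escape the LIKE wildcards (% and _) and the escape character itself.
const esc = (t) => t.replace(/[\\%_]/g, "\\$&");

const sampleTerms = ["k8s", "rust", "docker", "머신러닝 기초", "100%_done"];
const fuzzyTerms = sampleTerms.filter(isFuzzyCandidate).slice(0, 16);
const likeParams = fuzzyTerms.map((t) => `%${esc(t)}%`);

console.log(fuzzyTerms); // [ 'docker', '머신러닝 기초', '100%_done' ]
console.log(likeParams); // [ '%docker%', '%머신러닝 기초%', '%100\\%\\_done%' ]
```

Short ASCII terms such as `k8s` and `rust` are excluded from substring matching, which is presumably meant to avoid noisy partial hits. Since alias terms never enter the fuzzy list, an alias like `kube` only matches entities stored exactly as `kube`, consistent with the README's "exact-only" note on `entityAliases`.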
@@ -4143,13 +4176,14 @@ async function searchSemantic(store, embedder, query, limit) {
4143
4176
 
4144
4177
  // packages/core/dist/search/entity.js
4145
4178
  init_entity_extractor();
4146
- async function searchEntities(store, query, limit) {
4179
+ async function searchEntities(store, query, limit, aliasIndex) {
4147
4180
  if (typeof store.searchEntities !== "function")
4148
4181
  return [];
4149
4182
  const terms = extractQueryTerms(query);
4150
4183
  if (terms.length === 0)
4151
4184
  return [];
4152
- return store.searchEntities(terms, limit);
4185
+ const aliasExact = expandWithAliases(terms, aliasIndex).filter((t2) => !terms.includes(t2));
4186
+ return store.searchEntities(terms, limit, aliasExact);
4153
4187
  }
4154
4188
 
4155
4189
  // packages/core/dist/search/rrf.js
@@ -4173,6 +4207,9 @@ function rrfFusionN(lists, k = 60, limit = 10, opts = {}) {
4173
4207
  return [...scores.entries()].sort((a, b) => b[1] - a[1]).slice(0, limit).map(([chunkId, score]) => ({ chunkId, score }));
4174
4208
  }
4175
4209
 
4210
+ // packages/core/dist/search/index.js
4211
+ init_entity_extractor();
4212
+
4176
4213
  // packages/core/dist/search/adaptive.js
4177
4214
  function createAdaptiveSearch(deps) {
4178
4215
  const { baseSearch } = deps;
@@ -4243,6 +4280,7 @@ var DEFAULT_SIGNAL_WEIGHTS = {
4243
4280
  function createSearchEngine(deps) {
4244
4281
  const { store, embedder, rrfK = 60, getDecayEngine } = deps;
4245
4282
  const baseWeights = { ...DEFAULT_SIGNAL_WEIGHTS, ...deps.weights };
4283
+ const aliasIndex = buildAliasIndex(deps.entityAliases);
4246
4284
  const FETCH_LIMIT = 30;
4247
4285
  return {
4248
4286
  async search(options) {
@@ -4251,7 +4289,7 @@ function createSearchEngine(deps) {
4251
4289
  const [bm25Results, semanticResults, entityResults] = await Promise.all([
4252
4290
  searchBm25(store, query, FETCH_LIMIT),
4253
4291
  searchSemantic(store, embedder, query, FETCH_LIMIT),
4254
- searchEntities(store, query, FETCH_LIMIT)
4292
+ searchEntities(store, query, FETCH_LIMIT, aliasIndex)
4255
4293
  ]);
4256
4294
  const lists = [semanticResults, bm25Results, entityResults];
4257
4295
  const weights = [w.semantic, w.bm25, w.entity];
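The `lists` and `weights` arrays assembled above are the inputs to the weighted RRF step. The body of `rrfFusionN` is mostly outside this diff, so the following is a minimal sketch of weighted Reciprocal Rank Fusion in general, not the package's code. The function name `weightedRrf`, its argument order, and the 1-based rank convention are assumptions; the final sort/slice/map mirrors the one visible line of `rrfFusionN`.

```javascript
// Hypothetical helper, not the package's rrfFusionN.
// Each list is an array of { chunkId } ordered best-first.
function weightedRrf(lists, weights, k = 60, limit = 10) {
  const scores = new Map();
  lists.forEach((list, i) => {
    const w = weights[i] ?? 1;
    list.forEach((item, idx) => {
      // idx is 0-based, so the top hit of a list contributes w / (k + 1).
      const prev = scores.get(item.chunkId) ?? 0;
      scores.set(item.chunkId, prev + w / (k + idx + 1));
    });
  });
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1])
    .slice(0, limit)
    .map(([chunkId, score]) => ({ chunkId, score }));
}

const semanticHits = [{ chunkId: "a" }, { chunkId: "b" }];
const bm25Hits = [{ chunkId: "b" }, { chunkId: "c" }];
const entityHits = [{ chunkId: "c" }];

const fused = weightedRrf([semanticHits, bm25Hits, entityHits], [1.0, 1.0, 1.5]);
console.log(fused.map((r) => r.chunkId)); // [ 'c', 'b', 'a' ]
```

With the default weights, chunk `c` scores 1/62 + 1.5/61 ≈ 0.0407 and overtakes `b` (1/61 + 1/62 ≈ 0.0325) even though `b` ranks higher in two lists. That is the effect of the comment "entity leads" next to the `1.5` default.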
@@ -5757,7 +5795,7 @@ function createMcpServer(options) {
5757
5795
  const askTool = createAskTool(searchEngine, vaultPath);
5758
5796
  const generateDraftTool = createGenerateDraftTool(searchEngine, vaultPath);
5759
5797
  const agenticTools = embedder ? createAgenticGraphTools(store, embedder, vaultPath) : [];
5760
- const server = new Server({ name: "stellavault", version: "0.8.2" }, { capabilities: { tools: {} } });
5798
+ const server = new Server({ name: "stellavault", version: "0.8.4" }, { capabilities: { tools: {} } });
5761
5799
  server.setRequestHandler(ListToolsRequestSchema, async () => ({
5762
5800
  tools: [
5763
5801
  searchToolDef,
@@ -7963,7 +8001,9 @@ function createKnowledgeHub(config, options = {}) {
7963
8001
  embedder,
7964
8002
  rrfK: config.search.rrfK,
7965
8003
  weights: { semantic: sw.semantic, bm25: sw.bm25, entity: sw.entity, recency: sw.recency },
7966
- getDecayEngine
8004
+ getDecayEngine,
8005
+ entityAliases: config.search.entityAliases
8006
+ // B2.2 — cross-lingual/synonym groups
7967
8007
  });
7968
8008
  const mcpServer = createMcpServer({ store, searchEngine, vaultPath: config.vaultPath, ready: options.ready });
7969
8009
  return { store, embedder, searchEngine, mcpServer, config };
@@ -8100,7 +8140,9 @@ async function searchCommand(query, options, cmd) {
8100
8140
  store,
8101
8141
  embedder,
8102
8142
  rrfK: config.search.rrfK,
8103
- weights: { semantic: sw.semantic, bm25: sw.bm25, entity: sw.entity, recency: sw.recency }
8143
+ weights: { semantic: sw.semantic, bm25: sw.bm25, entity: sw.entity, recency: sw.recency },
8144
+ entityAliases: config.search.entityAliases
8145
+ // B2.2
8104
8146
  });
8105
8147
  const results = await engine.search({ query, limit });
8106
8148
  await store.close();
@@ -10934,7 +10976,7 @@ if (nodeVersion < 20) {
10934
10976
  process.exit(1);
10935
10977
  }
10936
10978
  var program = new Command();
10937
- var SV_VERSION = true ? "0.8.2" : "0.0.0-dev";
10979
+ var SV_VERSION = true ? "0.8.4" : "0.0.0-dev";
10938
10980
  program.name("stellavault").description("Stellavault \u2014 Self-compiling knowledge base for your Obsidian vault").version(SV_VERSION).option("--json", "Output in JSON format (for scripting)").option("--quiet", "Suppress non-essential output");
10939
10981
  program.command("init").description("Interactive setup wizard \u2014 get started in 3 minutes").action(initCommand);
10940
10982
  program.command("doctor").description("Diagnose setup issues (config, vault, DB, model, Node version)").action(doctorCommand);
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "stellavault",
3
- "version": "0.8.2",
3
+ "version": "0.8.4",
4
4
  "description": "Drop anything. It compiles itself into knowledge. Claude remembers everything you know. Local-first MCP server, vault files never modified.",
5
5
  "repository": {
6
6
  "type": "git",