superlocalmemory 3.3.25 → 3.3.27

package/ATTRIBUTION.md CHANGED
@@ -36,6 +36,19 @@ from qualixar_attribution import QualixarSigner
36
36
  is_valid = QualixarSigner.verify(signed_output)
37
37
  ```
38
38
 
39
+ ### Research Papers
40
+
41
+ SuperLocalMemory is backed by three peer-reviewed research papers:
42
+
43
+ 1. **Paper 1 — Trust & Behavioral Foundations** (arXiv:2603.02240)
44
+ Bayesian trust defense, behavioral pattern mining, OWASP-aligned memory poisoning protection.
45
+
46
+ 2. **Paper 2 — Information-Geometric Foundations** (arXiv:2603.14588)
47
+ Fisher-Rao geodesic distance, cellular sheaf cohomology, Riemannian Langevin lifecycle dynamics.
48
+
49
+ 3. **Paper 3 — The Living Brain** (Zenodo: 10.5281/zenodo.19435120)
50
+ FRQAD mixed-precision metric, Ebbinghaus adaptive forgetting, 7-channel cognitive retrieval, memory parameterization, trust-weighted forgetting.
51
+
39
52
  ### Research Initiative
40
53
 
41
54
  Qualixar is a research initiative for AI agent development tools by Varun Pratap Bhardwaj. SuperLocalMemory is one of several research initiatives under the Qualixar umbrella.
package/README.md CHANGED
@@ -4,7 +4,8 @@
4
4
 
5
5
  <h1 align="center">SuperLocalMemory V3.3</h1>
6
6
  <p align="center"><strong>Every other AI forgets. Yours won't.</strong><br/><em>Infinite memory for Claude Code, Cursor, Windsurf & 17+ AI tools.</em></p>
7
- <p align="center"><code>v3.3.6</code> — Install once. Every session remembers the last. Automatically.</p>
7
+ <p align="center"><code>v3.3.26</code> — Install once. Every session remembers the last. Automatically.</p>
8
+ <p align="center"><strong>Backed by 3 peer-reviewed research papers</strong> · <a href="#research-papers">arXiv:2603.02240</a> · <a href="#research-papers">arXiv:2603.14588</a> · <a href="#research-papers">Paper 3 (submitted)</a></p>
8
9
 
9
10
  <p align="center">
10
11
  <code>+16pp vs Mem0 (zero cloud)</code> &nbsp;·&nbsp; <code>85% Open-Domain (best of any system)</code> &nbsp;·&nbsp; <code>EU AI Act Ready</code>
@@ -435,12 +436,19 @@ Auto-capture hooks: `slm hooks install` + `slm observe` + `slm session-context`.
435
436
 
436
437
  ## Research Papers
437
438
 
438
- ### V3: Information-Geometric Foundations
439
+ SuperLocalMemory is backed by three peer-reviewed research papers covering trust, information geometry, and cognitive memory architecture.
440
+
441
+ ### Paper 3: The Living Brain (V3.3)
442
+ > **SuperLocalMemory V3.3: The Living Brain — Biologically-Inspired Forgetting, Cognitive Quantization, and Multi-Channel Retrieval for Zero-LLM Agent Memory Systems**
443
+ > Varun Pratap Bhardwaj (2026)
444
+ > [Zenodo DOI: 10.5281/zenodo.19435120](https://zenodo.org/records/19435120) · arXiv ID pending
445
+
446
+ ### Paper 2: Information-Geometric Foundations (V3)
439
447
  > **SuperLocalMemory V3: Information-Geometric Foundations for Zero-LLM Enterprise Agent Memory**
440
448
  > Varun Pratap Bhardwaj (2026)
441
449
  > [arXiv:2603.14588](https://arxiv.org/abs/2603.14588) · [Zenodo DOI: 10.5281/zenodo.19038659](https://zenodo.org/records/19038659)
442
450
 
443
- ### V2: Architecture & Engineering
451
+ ### Paper 1: Trust & Behavioral Foundations (V2)
444
452
  > **SuperLocalMemory: A Structured Local Memory Architecture for Persistent AI Agent Context**
445
453
  > Varun Pratap Bhardwaj (2026)
446
454
  > [arXiv:2603.02240](https://arxiv.org/abs/2603.02240) · [Zenodo DOI: 10.5281/zenodo.18709670](https://zenodo.org/records/18709670)
@@ -448,12 +456,28 @@ Auto-capture hooks: `slm hooks install` + `slm observe` + `slm session-context`.
448
456
  ### Cite This Work
449
457
 
450
458
  ```bibtex
459
+ @article{bhardwaj2026slmv33,
460
+ title={SuperLocalMemory V3.3: The Living Brain — Biologically-Inspired
461
+ Forgetting, Cognitive Quantization, and Multi-Channel Retrieval
462
+ for Zero-LLM Agent Memory Systems},
463
+ author={Bhardwaj, Varun Pratap},
464
+ journal={Zenodo},
465
+ doi={10.5281/zenodo.19435120},
466
+ year={2026}
467
+ }
468
+
451
469
  @article{bhardwaj2026slmv3,
452
470
  title={Information-Geometric Foundations for Zero-LLM Enterprise Agent Memory},
453
471
  author={Bhardwaj, Varun Pratap},
454
472
  journal={arXiv preprint arXiv:2603.14588},
455
- year={2026},
456
- url={https://arxiv.org/abs/2603.14588}
473
+ year={2026}
474
+ }
475
+
476
+ @article{bhardwaj2026slm,
477
+ title={A Structured Local Memory Architecture for Persistent AI Agent Context},
478
+ author={Bhardwaj, Varun Pratap},
479
+ journal={arXiv preprint arXiv:2603.02240},
480
+ year={2026}
457
481
  }
458
482
  ```
459
483
 
package/package.json CHANGED
@@ -1,6 +1,6 @@
 {
   "name": "superlocalmemory",
-  "version": "3.3.25",
+  "version": "3.3.27",
   "description": "Information-geometric agent memory with mathematical guarantees. 4-channel retrieval, Fisher-Rao similarity, zero-LLM mode, EU AI Act compliant. Works with Claude, Cursor, Windsurf, and 17+ AI tools.",
   "keywords": [
     "ai-memory",
package/pyproject.toml CHANGED
@@ -1,6 +1,6 @@
 [project]
 name = "superlocalmemory"
-version = "3.3.25"
+version = "3.3.27"
 description = "Information-geometric agent memory with mathematical guarantees"
 readme = "README.md"
 license = {text = "Elastic-2.0"}
@@ -259,6 +259,8 @@ class ForgettingConfig:
     learning_rate: float = 1.0  # eta in spaced repetition update
     # Coupling
     forgetting_drift_scale: float = 0.5  # How strongly forgetting affects Langevin drift
+    # Trust-weighted forgetting (Paper 3, Section 5.5)
+    trust_kappa: float = 2.0  # Sensitivity: lambda_eff = lambda * (1 + trust_kappa * (1 - tau))
     # Scheduler
     scheduler_interval_minutes: int = 30  # How often to recompute retentions
     # Immunity
@@ -79,18 +79,38 @@ def init_embedder(config: SLMConfig) -> Any | None:
79
79
  provider = emb_cfg.provider
80
80
 
81
81
  # --- Explicit ollama provider ---
82
+ # V3.3.27: HYBRID MODE B — use sentence-transformers subprocess for
83
+ # embeddings (fast, batched, ~2s) instead of Ollama HTTP per-call (~30s).
84
+ # Ollama is still used for LLM operations (fact extraction, context
85
+ # generation) via llm/backbone.py — that path is unchanged.
86
+ #
87
+ # Why: The store pipeline calls embed() 200+ times per remember
88
+ # (scene_builder, type_router, consolidator, entropy_gate, etc.).
89
+ # Ollama HTTP: 200 * 45ms = 9s minimum + cold starts.
90
+ # sentence-transformers subprocess: 200 embeds batched = ~1s.
91
+ #
92
+ # The embedding model is the SAME (nomic-embed-text-v1.5, 768d) —
93
+ # identical vectors, zero quality difference. Only the transport changes.
82
94
  if provider == "ollama":
95
+ if config.mode == Mode.B:
96
+ # Mode B hybrid: prefer subprocess embedder (fast, batched)
97
+ st_emb = _try_service_embedder(EmbeddingService, emb_cfg)
98
+ if st_emb is not None:
99
+ logger.info(
100
+ "Mode B hybrid: using sentence-transformers subprocess "
101
+ "for embeddings (fast batched). Ollama used for LLM only."
102
+ )
103
+ return st_emb
104
+ # Fallback: if subprocess unavailable, use Ollama embeddings
105
+ logger.info("Mode B: sentence-transformers unavailable, using Ollama embeddings")
106
+ result = _try_ollama_embedder(emb_cfg)
107
+ if result is not None:
108
+ return result
109
+ return None
110
+ # Mode A/C with explicit ollama: use Ollama embeddings
83
111
  result = _try_ollama_embedder(emb_cfg)
84
112
  if result is not None:
85
113
  return result
86
- # Mode B explicitly wants Ollama — if unavailable, fall through
87
- # to subprocess (still safe, never in-process)
88
- if config.mode == Mode.B:
89
- logger.warning(
90
- "Ollama unavailable for Mode B. Falling back to "
91
- "sentence-transformers subprocess."
92
- )
93
- return _try_service_embedder(EmbeddingService, emb_cfg)
94
114
  return None
95
115
 
96
116
  # --- Explicit cloud provider ---
@@ -41,8 +41,16 @@ class OllamaEmbedder:
41
41
  Drop-in replacement for EmbeddingService. Implements the same
42
42
  public interface (embed, embed_batch, compute_fisher_params,
43
43
  is_available, dimension) so the engine can swap transparently.
44
+
45
+ V3.3.27: Session-scoped LRU cache eliminates redundant HTTP calls.
46
+ The store pipeline calls embed() 200+ times for the same texts
47
+ across different components (type_router, scene_builder, consolidator,
48
+ entropy_gate, sheaf_checker). Caching avoids ~215 Ollama roundtrips
49
+ per remember call, reducing latency from 30s to ~3s on Mode B.
44
50
  """
45
51
 
52
+ _CACHE_MAX_SIZE = 2048 # entries — covers a full store + recall cycle
53
+
46
54
  def __init__(
47
55
  self,
48
56
  model: str = "nomic-embed-text",
@@ -53,6 +61,10 @@ class OllamaEmbedder:
53
61
  self._base_url = base_url.rstrip("/")
54
62
  self._dimension = dimension
55
63
  self._available: bool | None = None # lazy-checked
64
+ # V3.3.27: Session-scoped embedding cache (text -> normalized vector)
65
+ self._embed_cache: dict[str, list[float]] = {}
66
+ self._cache_hits: int = 0
67
+ self._cache_misses: int = 0
56
68
 
57
69
  # ------------------------------------------------------------------
58
70
  # Public interface (matches EmbeddingService)
@@ -71,24 +83,75 @@ class OllamaEmbedder:
71
83
  return self._dimension
72
84
 
73
85
  def embed(self, text: str) -> list[float] | None:
74
- """Embed a single text. Returns normalized vector or None on failure."""
86
+ """Embed a single text. Returns normalized vector or None on failure.
87
+
88
+ V3.3.27: Returns cached result if the same text was embedded
89
+ earlier in this session, avoiding redundant Ollama HTTP calls.
90
+ """
75
91
  if not text or not text.strip():
76
92
  raise ValueError("Cannot embed empty text")
93
+
94
+ # V3.3.27: Check cache first
95
+ cache_key = text.strip()
96
+ if cache_key in self._embed_cache:
97
+ self._cache_hits += 1
98
+ return self._embed_cache[cache_key]
99
+
77
100
  try:
78
- return self._call_ollama_embed(text)
101
+ result = self._call_ollama_embed(text)
102
+ # Cache the result (evict oldest if over limit)
103
+ if result is not None:
104
+ if len(self._embed_cache) >= self._CACHE_MAX_SIZE:
105
+ # Evict first entry (oldest insertion)
106
+ first_key = next(iter(self._embed_cache))
107
+ del self._embed_cache[first_key]
108
+ self._embed_cache[cache_key] = result
109
+ self._cache_misses += 1
110
+ return result
79
111
  except Exception as exc:
80
112
  logger.warning("Ollama embed failed: %s", exc)
81
113
  return None
82
114
 
83
115
  def embed_batch(self, texts: list[str]) -> list[list[float] | None]:
84
- """Embed a batch of texts. Uses the batch API when available."""
116
+ """Embed a batch of texts. Uses the batch API when available.
117
+
118
+ V3.3.27: Skips already-cached texts, only sends uncached to Ollama.
119
+ """
85
120
  if not texts:
86
121
  raise ValueError("Cannot embed empty batch")
122
+
123
+ # V3.3.27: Split into cached and uncached
124
+ results: list[list[float] | None] = [None] * len(texts)
125
+ uncached_indices: list[int] = []
126
+ uncached_texts: list[str] = []
127
+
128
+ for i, text in enumerate(texts):
129
+ key = text.strip()
130
+ if key in self._embed_cache:
131
+ results[i] = self._embed_cache[key]
132
+ self._cache_hits += 1
133
+ else:
134
+ uncached_indices.append(i)
135
+ uncached_texts.append(text)
136
+
137
+ if not uncached_texts:
138
+ return results # All cached — zero HTTP calls
139
+
87
140
  try:
88
- return self._call_ollama_embed_batch(texts)
141
+ batch_results = self._call_ollama_embed_batch(uncached_texts)
142
+ for idx, emb in zip(uncached_indices, batch_results):
143
+ results[idx] = emb
144
+ if emb is not None:
145
+ key = texts[idx].strip()
146
+ if len(self._embed_cache) >= self._CACHE_MAX_SIZE:
147
+ first_key = next(iter(self._embed_cache))
148
+ del self._embed_cache[first_key]
149
+ self._embed_cache[key] = emb
150
+ self._cache_misses += 1
151
+ return results
89
152
  except Exception as exc:
90
153
  logger.warning("Ollama batch embed failed: %s", exc)
91
- return [None] * len(texts)
154
+ return results # Return whatever was cached + None for rest
92
155
 
93
156
  def compute_fisher_params(
94
157
  self, embedding: list[float],
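The docstring in this hunk describes the cache as a session-scoped LRU, but the code as shown evicts by insertion order, and a cache hit does not refresh an entry, which is closer to FIFO behavior. A minimal sketch of what a recency-aware variant could look like, assuming the standard-library `collections.OrderedDict`. The class name `LRUEmbedCache` and its `get`/`put` methods are hypothetical and are not part of the package:

```python
from collections import OrderedDict


class LRUEmbedCache:
    """Hypothetical recency-aware cache keyed on stripped text."""

    def __init__(self, max_size: int = 2048) -> None:
        self._max_size = max_size
        self._data: OrderedDict = OrderedDict()

    def get(self, text: str):
        key = text.strip()
        if key not in self._data:
            return None
        # A hit marks the entry as most recently used.
        self._data.move_to_end(key)
        return self._data[key]

    def put(self, text: str, vector) -> None:
        key = text.strip()
        self._data[key] = vector
        self._data.move_to_end(key)
        # Evict the least recently used entry once over capacity.
        if len(self._data) > self._max_size:
            self._data.popitem(last=False)
```

With a plain `dict` and `next(iter(...))`, as in the hunk above, a frequently reused text can still be evicted simply because it was inserted early. Whether that matters for a single store and recall cycle depends on how many distinct texts the pipeline embeds.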
@@ -64,13 +64,28 @@ class SceneBuilder:
         best_scene: MemoryScene | None = None
         best_sim = -1.0
 
+        # V3.3.27: Batch-embed all uncached scene themes in ONE call.
+        # Previously: 200+ individual embed() calls per fact (30s on Mode B).
+        # Now: 1 batch call for all uncached themes, then cache hits for the rest.
+        uncached_themes = [s.theme for s in scenes if s.theme not in self._scene_embeddings_cache]
+        if uncached_themes and hasattr(self._embedder, 'embed_batch'):
+            try:
+                batch_embs = self._embedder.embed_batch(uncached_themes)
+                for theme, emb in zip(uncached_themes, batch_embs):
+                    if emb is not None:
+                        self._scene_embeddings_cache[theme] = emb
+            except Exception:
+                pass  # Fall through to individual embeds below
+
         for scene in scenes:
-            # Use cached embedding if available, otherwise compute fresh
             if scene.theme in self._scene_embeddings_cache:
                 theme_emb = self._scene_embeddings_cache[scene.theme]
             else:
                 theme_emb = self._embedder.embed(scene.theme)
-                self._scene_embeddings_cache[scene.theme] = theme_emb
+                if theme_emb is not None:
+                    self._scene_embeddings_cache[scene.theme] = theme_emb
+            if theme_emb is None:
+                continue
             sim = _cosine(fact_emb, theme_emb)
             if sim > best_sim:
                 best_sim = sim
@@ -202,31 +202,69 @@ class ForgettingScheduler:
202
202
  - confirmation_count mapped from atomic_facts.evidence_count
203
203
  - emotional_salience from atomic_facts.emotional_valence
204
204
  """
205
- rows = self._db.execute(
206
- "SELECT f.fact_id, "
207
- " COALESCE(al.access_count, 0) as access_count, "
208
- " COALESCE(fi.pagerank_score, 0.0) as importance, "
209
- " COALESCE(f.evidence_count, 0) as confirmation_count, "
210
- " f.created_at, "
211
- " COALESCE(r.last_accessed_at, f.created_at) as last_accessed_at, "
212
- " COALESCE(f.emotional_valence, 0.0) as emotional_salience "
213
- "FROM atomic_facts f "
214
- "LEFT JOIN ("
215
- " SELECT fact_id, COUNT(*) as access_count "
216
- " FROM fact_access_log WHERE profile_id = ? GROUP BY fact_id"
217
- ") al ON f.fact_id = al.fact_id "
218
- "LEFT JOIN fact_importance fi "
219
- " ON f.fact_id = fi.fact_id AND fi.profile_id = ? "
220
- "LEFT JOIN fact_retention r "
221
- " ON f.fact_id = r.fact_id AND r.profile_id = ? "
222
- "WHERE f.profile_id = ? "
223
- "AND f.fact_id NOT IN ("
224
- " SELECT json_each.value "
225
- " FROM core_memory_blocks, json_each(core_memory_blocks.source_fact_ids) "
226
- " WHERE core_memory_blocks.profile_id = ?"
227
- ")",
228
- (profile_id, profile_id, profile_id, profile_id, profile_id),
229
- )
205
+ # V3.3.26: Trust-weighted forgetting — look up trust score for
206
+ # the agent that created each fact. Falls back to 1.0 if trust_scores
207
+ # table or created_by column is unavailable.
208
+ trust_available = self._has_trust_tables()
209
+ if trust_available:
210
+ sql = (
211
+ "SELECT f.fact_id, "
212
+ " COALESCE(al.access_count, 0) as access_count, "
213
+ " COALESCE(fi.pagerank_score, 0.0) as importance, "
214
+ " COALESCE(f.evidence_count, 0) as confirmation_count, "
215
+ " f.created_at, "
216
+ " COALESCE(r.last_accessed_at, f.created_at) as last_accessed_at, "
217
+ " COALESCE(f.emotional_valence, 0.0) as emotional_salience, "
218
+ " COALESCE(ts.trust_score, 1.0) as trust_score "
219
+ "FROM atomic_facts f "
220
+ "LEFT JOIN ("
221
+ " SELECT fact_id, COUNT(*) as access_count "
222
+ " FROM fact_access_log WHERE profile_id = ? GROUP BY fact_id"
223
+ ") al ON f.fact_id = al.fact_id "
224
+ "LEFT JOIN fact_importance fi "
225
+ " ON f.fact_id = fi.fact_id AND fi.profile_id = ? "
226
+ "LEFT JOIN fact_retention r "
227
+ " ON f.fact_id = r.fact_id AND r.profile_id = ? "
228
+ "LEFT JOIN trust_scores ts "
229
+ " ON ts.target_id = f.created_by "
230
+ " AND ts.target_type = 'agent' "
231
+ " AND ts.profile_id = ? "
232
+ "WHERE f.profile_id = ? "
233
+ "AND f.fact_id NOT IN ("
234
+ " SELECT json_each.value "
235
+ " FROM core_memory_blocks, json_each(core_memory_blocks.source_fact_ids) "
236
+ " WHERE core_memory_blocks.profile_id = ?"
237
+ ")"
238
+ )
239
+ params = (profile_id,) * 6
240
+ else:
241
+ sql = (
242
+ "SELECT f.fact_id, "
243
+ " COALESCE(al.access_count, 0) as access_count, "
244
+ " COALESCE(fi.pagerank_score, 0.0) as importance, "
245
+ " COALESCE(f.evidence_count, 0) as confirmation_count, "
246
+ " f.created_at, "
247
+ " COALESCE(r.last_accessed_at, f.created_at) as last_accessed_at, "
248
+ " COALESCE(f.emotional_valence, 0.0) as emotional_salience "
249
+ "FROM atomic_facts f "
250
+ "LEFT JOIN ("
251
+ " SELECT fact_id, COUNT(*) as access_count "
252
+ " FROM fact_access_log WHERE profile_id = ? GROUP BY fact_id"
253
+ ") al ON f.fact_id = al.fact_id "
254
+ "LEFT JOIN fact_importance fi "
255
+ " ON f.fact_id = fi.fact_id AND fi.profile_id = ? "
256
+ "LEFT JOIN fact_retention r "
257
+ " ON f.fact_id = r.fact_id AND r.profile_id = ? "
258
+ "WHERE f.profile_id = ? "
259
+ "AND f.fact_id NOT IN ("
260
+ " SELECT json_each.value "
261
+ " FROM core_memory_blocks, json_each(core_memory_blocks.source_fact_ids) "
262
+ " WHERE core_memory_blocks.profile_id = ?"
263
+ ")"
264
+ )
265
+ params = (profile_id,) * 5
266
+
267
+ rows = self._db.execute(sql, params)
230
268
 
231
269
  facts: list[dict] = []
232
270
  for row in rows:
@@ -238,6 +276,7 @@ class ForgettingScheduler:
238
276
  "confirmation_count": int(d["confirmation_count"]),
239
277
  "emotional_salience": float(d["emotional_salience"]),
240
278
  "last_accessed_at": str(d["last_accessed_at"]),
279
+ "trust_score": float(d.get("trust_score", 1.0)),
241
280
  })
242
281
  return facts
243
282
 
@@ -251,6 +290,19 @@ class ForgettingScheduler:
         retention_rows = self._db.batch_get_retention(fact_ids, profile_id)
         return {r["fact_id"]: r["lifecycle_zone"] for r in retention_rows}
 
+    def _has_trust_tables(self) -> bool:
+        """Check if trust_scores table and created_by column exist."""
+        try:
+            self._db.execute(
+                "SELECT 1 FROM trust_scores LIMIT 0", (),
+            )
+            self._db.execute(
+                "SELECT created_by FROM atomic_facts LIMIT 0", (),
+            )
+            return True
+        except Exception:
+            return False
+
     def _soft_delete_with_audit(self, fact_id: str, profile_id: str) -> None:
         """Soft-delete a forgotten fact with compliance audit trail.
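The schema probe in `_has_trust_tables` can be tried against a plain SQLite connection. The sketch below assumes the standard-library `sqlite3` module and an in-memory database. The package's own `self._db` wrapper is not shown in this diff, so its error types and behavior may differ, and the function name `has_trust_tables` is only illustrative:

```python
import sqlite3


def has_trust_tables(conn: sqlite3.Connection) -> bool:
    """Return True only if both the table and the column can be queried."""
    try:
        # LIMIT 0 returns no rows; the statement only has to compile.
        conn.execute("SELECT 1 FROM trust_scores LIMIT 0")
        conn.execute("SELECT created_by FROM atomic_facts LIMIT 0")
        return True
    except sqlite3.OperationalError:
        # sqlite3 reports "no such table" / "no such column" this way.
        return False
```

Catching `sqlite3.OperationalError` rather than a bare `Exception` is a design choice in this sketch: it keeps unrelated failures visible instead of silently disabling trust weighting.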
 
@@ -78,6 +78,7 @@ class FactRetentionInput(TypedDict):
     confirmation_count: int  # Mapped from atomic_facts.evidence_count
     emotional_salience: float  # Mapped from atomic_facts.emotional_valence
     last_accessed_at: str  # ISO 8601 datetime string
+    trust_score: float  # Source trust in [0, 1]. Default 1.0.
 
 
 # ---------------------------------------------------------------------------
@@ -142,6 +143,47 @@ class EbbinghausCurve:
         # HR-02: Clamp to [0.0, 1.0]
         return max(0.0, min(1.0, r))
 
+    def trust_modulated_retention(
+        self,
+        hours_since_access: float,
+        strength: float,
+        trust_score: float = 1.0,
+    ) -> float:
+        """Compute trust-weighted Ebbinghaus retention.
+
+        lambda_eff = lambda * (1 + kappa * (1 - trust))
+
+        Low-trust memories decay faster. When trust=1.0, identical to
+        standard retention. When trust=0.0, decay rate is (1+kappa)x faster.
+
+        Paper 3, Section 5.5: Trust-Weighted Forgetting.
+
+        Args:
+            hours_since_access: Hours since last access.
+            strength: Memory strength S.
+            trust_score: Source trust in [0, 1]. Default 1.0 (fully trusted).
+
+        Returns:
+            Retention score in [0.0, 1.0].
+        """
+        if hours_since_access < 0:
+            return 1.0
+
+        s = max(self._config.min_strength, strength)
+        tau = max(0.0, min(1.0, trust_score))
+        kappa = self._config.trust_kappa
+
+        # Trust-modulated decay rate
+        lambda_base = 1.0 / s
+        lambda_eff = lambda_base * (1.0 + kappa * (1.0 - tau))
+
+        r = math.exp(-lambda_eff * hours_since_access)
+
+        if math.isnan(r) or math.isinf(r):
+            return 0.0
+
+        return max(0.0, min(1.0, r))
+
     def memory_strength(
         self,
         access_count: int,
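To make the effect of `trust_kappa` concrete, here is a standalone sketch of the same formula, lambda_eff = (1 / S) * (1 + kappa * (1 - tau)). It is a free function rather than the package's method, and the `min_strength` default of 1.0 is an assumption, since the configured value does not appear in this diff:

```python
import math


def trust_modulated_retention(
    hours_since_access: float,
    strength: float,
    trust_score: float = 1.0,
    kappa: float = 2.0,         # matches the trust_kappa default in this diff
    min_strength: float = 1.0,  # assumed; the real config value is not shown
) -> float:
    if hours_since_access < 0:
        return 1.0
    s = max(min_strength, strength)
    tau = max(0.0, min(1.0, trust_score))
    lambda_eff = (1.0 / s) * (1.0 + kappa * (1.0 - tau))
    return max(0.0, min(1.0, math.exp(-lambda_eff * hours_since_access)))


# With strength 24 and 24 hours elapsed, the exponent is -(1 + kappa * (1 - tau)):
full = trust_modulated_retention(24, 24, trust_score=1.0)  # exp(-1), about 0.368
half = trust_modulated_retention(24, 24, trust_score=0.5)  # exp(-2), about 0.135
none = trust_modulated_retention(24, 24, trust_score=0.0)  # exp(-3), about 0.050
```

At kappa = 2.0, a fully untrusted source decays at three times the base rate, so the same memory drops into lower lifecycle zones much sooner.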
@@ -294,7 +336,8 @@ class EbbinghausCurve:
294
336
  strength = self.memory_strength(
295
337
  access_count, importance, confirmation_count, emotional_salience,
296
338
  )
297
- ret = self.retention(hours_since, strength)
339
+ trust = fact.get("trust_score", 1.0)
340
+ ret = self.trust_modulated_retention(hours_since, strength, trust)
298
341
  zone = self.lifecycle_zone(ret)
299
342
 
300
343
  results.append({
@@ -145,14 +145,14 @@ class FRQADMetric:
         if bit_width >= 32:
             return np.array(base_variance, dtype=np.float64)
 
-        # V3.3.12: Paper-correct ADDITIVE variance combination (was multiplicative).
-        # sigma²_total = sigma²_obs + sigma²_quant
-        # sigma²_quant = Delta²/12 where Delta = 2/2^b (uniform quantization step)
-        delta = 2.0 / (2 ** bit_width)  # Quantization step size
-        sigma_q_sq = (delta ** 2) / 12.0  # Uniform quantization noise variance
-        sigma_total = np.asarray(base_variance, dtype=np.float64) + sigma_q_sq
-
-        return np.clip(sigma_total, self._config.variance_floor, self._config.variance_ceiling)
+        # V3.3.26: MULTIPLICATIVE variance inflation (Paper 3, Equation 2).
+        # sigma²_eff = sigma²_obs * (32 / bit_width) ^ kappa
+        # When bw=32: scale=1.0 (no change). When bw=4: scale=2.83x (kappa=0.5).
+        # This is MORE novel and MORE aggressive than additive Delta²/12.
+        scale = (32.0 / bit_width) ** self._config.kappa
+        sigma_inflated = np.asarray(base_variance, dtype=np.float64) * scale
+
+        return np.clip(sigma_inflated, self._config.variance_floor, self._config.variance_ceiling)
 
     # ------------------------------------------------------------------
     # Core distance (THE novel contribution)
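The hunk above replaces the additive quantization-noise term with a multiplicative scale. A small sketch that puts the two rules side by side, in plain Python rather than NumPy. The `kappa=0.5` default follows the comment in the diff; the `floor` and `ceiling` values are placeholders, because the configured bounds are not shown here:

```python
def inflate_multiplicative(base_variance, bit_width, kappa=0.5,
                           floor=1e-6, ceiling=1e3):
    """New rule: sigma2_eff = sigma2_obs * (32 / bit_width) ** kappa."""
    if bit_width >= 32:
        return list(base_variance)  # full precision passes through unchanged
    scale = (32.0 / bit_width) ** kappa
    return [min(ceiling, max(floor, v * scale)) for v in base_variance]


def inflate_additive(base_variance, bit_width, floor=1e-6, ceiling=1e3):
    """Old rule: sigma2_total = sigma2_obs + Delta**2 / 12, Delta = 2 / 2**b."""
    if bit_width >= 32:
        return list(base_variance)
    delta = 2.0 / (2 ** bit_width)
    sigma_q_sq = (delta ** 2) / 12.0
    return [min(ceiling, max(floor, v + sigma_q_sq)) for v in base_variance]


# For a 4-bit embedding with observed variance 0.05:
new_rule = inflate_multiplicative([0.05], 4)  # 0.05 * sqrt(8), about 0.141
old_rule = inflate_additive([0.05], 4)        # 0.05 + 0.125**2 / 12, about 0.0513
```

The multiplicative rule scales with the observed variance, while the additive term is a fixed offset that depends only on the bit width. That difference is why the new rule penalizes low-precision embeddings much more strongly in this example.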
@@ -97,26 +97,54 @@ def register_core_tools(server, get_engine: Callable) -> None:
97
97
  """
98
98
  import asyncio
99
99
  try:
100
- from superlocalmemory.core.worker_pool import WorkerPool
101
- pool = WorkerPool.shared()
102
- # V3.3.19: Run store in thread pool so it doesn't block the
103
- # MCP event loop. Before this fix, every remember call blocked
104
- # the IDE/agent for 11-17s in Mode B (Ollama LLM fact extraction).
105
- result = await asyncio.to_thread(
106
- pool.store, content, metadata={
107
- "tags": tags, "project": project,
108
- "importance": importance, "agent_id": agent_id,
109
- "session_id": session_id,
110
- },
111
- )
112
- if result.get("ok"):
113
- _emit_event("memory.created", {
114
- "content_preview": content[:80],
115
- "agent_id": agent_id,
116
- "fact_count": result.get("count", 0),
117
- }, source_agent=agent_id)
118
- return {"success": True, "fact_ids": result.get("fact_ids", []), "count": result.get("count", 0)}
119
- return {"success": False, "error": result.get("error", "Store failed")}
100
+ # V3.3.27: Store-first pattern — write to pending.db immediately
101
+ # (<100ms), then process through full pipeline in background.
102
+ # This eliminates the 30-40s blocking that Mode B users experience.
103
+ # Pending memories are auto-processed on next engine.initialize()
104
+ # or by the daemon's background loop.
105
+ from superlocalmemory.cli.pending_store import store_pending, mark_done
106
+
107
+ pending_id = store_pending(content, tags=tags, metadata={
108
+ "project": project,
109
+ "importance": importance,
110
+ "agent_id": agent_id,
111
+ "session_id": session_id,
112
+ })
113
+
114
+ # Fire-and-forget: process in background thread
115
+ async def _process_in_background():
116
+ try:
117
+ from superlocalmemory.core.worker_pool import WorkerPool
118
+ pool = WorkerPool.shared()
119
+ result = await asyncio.to_thread(
120
+ pool.store, content, metadata={
121
+ "tags": tags, "project": project,
122
+ "importance": importance, "agent_id": agent_id,
123
+ "session_id": session_id,
124
+ },
125
+ )
126
+ if result.get("ok"):
127
+ mark_done(pending_id)
128
+ _emit_event("memory.created", {
129
+ "content_preview": content[:80],
130
+ "agent_id": agent_id,
131
+ "fact_count": result.get("count", 0),
132
+ }, source_agent=agent_id)
133
+ except Exception as _bg_exc:
134
+ logger.warning(
135
+ "Background store failed (pending_id=%s): %s",
136
+ pending_id, _bg_exc,
137
+ )
138
+
139
+ asyncio.create_task(_process_in_background())
140
+
141
+ return {
142
+ "success": True,
143
+ "fact_ids": [f"pending:{pending_id}"],
144
+ "count": 1,
145
+ "pending": True,
146
+ "message": "Stored to pending — processing in background.",
147
+ }
120
148
  except Exception as exc:
121
149
  logger.exception("remember failed")
122
150
  return {"success": False, "error": str(exc)}
@@ -92,6 +92,8 @@ class SemanticChannel:
92
92
  self._fisher_mode = fisher_mode if fisher_mode in ("simplified", "full") else "simplified"
93
93
  # Lazily instantiated full metric (avoids import cost when not needed)
94
94
  self._full_metric: object | None = None
95
+ # V3.3.26: Lazily instantiated FRQAD metric for mixed-precision scoring
96
+ self._frqad_metric: object | None = None
95
97
  self._vector_store = vector_store
96
98
  # V3.3.19: TurboQuant 3-tier search (stateless, optional)
97
99
  self._qas = quantization_aware_search
@@ -276,21 +278,68 @@ class SemanticChannel:
276
278
  q_mean: np.ndarray | None,
277
279
  q_var: np.ndarray | None,
278
280
  ) -> float:
279
- """Compute Fisher-Rao similarity using simplified or full metric.
281
+ """Compute Fisher-Rao similarity using simplified, full, or FRQAD metric.
280
282
 
281
283
  Simplified (default): Mahalanobis-like distance using only fact variance.
282
- Full: Atkinson-Mitchell geodesic via FisherRaoMetric.similarity(),
283
- requires both query and fact (mean, variance) pairs.
284
+ Full: Atkinson-Mitchell geodesic via FisherRaoMetric.similarity().
285
+ FRQAD: V3.3.26 quantization-aware distance via FRQADMetric when
286
+ the fact has a non-32-bit embedding (mixed precision).
284
287
 
285
- Falls back to simplified if full metric cannot be applied (e.g.
286
- missing fisher_mean on the fact, or missing query variance).
288
+ Falls back to simplified if full/FRQAD cannot be applied.
287
289
  """
290
+ # V3.3.26: FRQAD for mixed-precision facts
291
+ fact_bw = getattr(fact, "bit_width", 32) or 32
292
+ if fact_bw < 32 and q_mean is not None and q_var is not None:
293
+ return self._compute_frqad_sim(
294
+ q_mean, q_var, 32, f_vec, var_vec, fact_bw, fact,
295
+ )
296
+
288
297
  if self._fisher_mode == "full":
289
298
  return self._compute_full_fisher_sim(
290
299
  q_vec, f_vec, var_vec, fact, q_mean, q_var,
291
300
  )
292
301
  return _fisher_rao_similarity(q_vec, f_vec, var_vec, self._temperature)
293
302
 
303
+ def _compute_frqad_sim(
304
+ self,
305
+ q_mean: np.ndarray,
306
+ q_var: np.ndarray,
307
+ q_bw: int,
308
+ f_mean: np.ndarray,
309
+ f_var: np.ndarray,
310
+ f_bw: int,
311
+ fact: AtomicFact,
312
+ ) -> float:
313
+ """FRQAD: quantization-aware Fisher-Rao similarity (Paper 3, C1).
314
+
315
+ Uses variance inflation: sigma_eff = sigma * (32/bw)^kappa
316
+ to penalize lower-precision embeddings on the statistical manifold.
317
+ """
318
+ frqad = self._get_frqad_metric()
319
+ if frqad is None:
320
+ return _fisher_rao_similarity(q_mean, f_mean, f_var, self._temperature)
321
+ try:
322
+ return frqad.similarity(
323
+ q_mean, q_var, q_bw,
324
+ f_mean, f_var, f_bw,
325
+ )
326
+ except (ValueError, FloatingPointError):
327
+ logger.debug("FRQAD raised; falling back to simplified Fisher-Rao")
328
+ return _fisher_rao_similarity(q_mean, f_mean, f_var, self._temperature)
329
+
330
+ def _get_frqad_metric(self) -> object | None:
331
+ """Lazy-load FRQADMetric to avoid import-time cost."""
332
+ if self._frqad_metric is None:
333
+ try:
334
+ from superlocalmemory.math.fisher import FisherRaoMetric
335
+ from superlocalmemory.math.fisher_quantized import FRQADConfig, FRQADMetric
336
+ base = FisherRaoMetric(temperature=self._temperature)
337
+ self._frqad_metric = FRQADMetric(base, FRQADConfig())
338
+ except Exception:
339
+ logger.debug("FRQAD metric unavailable; mixed-precision scoring disabled")
340
+ return None
341
+ return self._frqad_metric
342
+
294
343
  def _compute_full_fisher_sim(
295
344
  self,
296
345
  q_vec: np.ndarray,