superlocalmemory 3.4.37 → 3.4.38

package/CHANGELOG.md CHANGED
@@ -10,6 +10,64 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 
  ---
 
+ ## [3.4.38] - 2026-04-26
+
+ **P0 silent data loss fix.** The async `/remember` pipeline had been broken
+ since v3.4.32: memories were marked "queued" and acknowledged but never
+ persisted to memory.db at runtime. Only a daemon restart drained
+ the pending queue (limit 20 per restart). Between April 15-26, 2026, 18
+ memories were marked permanently failed after a NoneType iterable crash; all
+ are recoverable because their content was preserved in pending.db.
+
+ ### Fixed
+ - **Materializer `_engine` NameError** (`unified_daemon.py`). The background
+ pending materializer thread referenced a module-level `_engine` global
+ that was never declared. Result: every iteration raised
+ `NameError: name '_engine' is not defined`, which was caught and logged as
+ "materializer loop error"; the thread then slept 5s and retried forever
+ without ever processing pending memories. Bug present since v3.4.32.
+ Fixed by declaring `_engine = None` at module level and assigning
+ `_engine = engine` in the FastAPI lifespan after `engine.initialize()`.
+ - **scene_builder NoneType crash** (`encoding/scene_builder.py:assign_to_scene`).
+ When the embedding worker was unavailable (cold-start timeout, crash),
+ `embedder.embed()` returned None. The code checked `theme_emb is None`
+ but never checked `fact_emb is None`, so `_cosine(None, theme_emb)`
+ called `zip(None, theme_emb)` → `'NoneType' object is not iterable`,
+ propagating up through `engine.store()` → `mark_failed` → permanent failure.
+ Fixed by guarding `fact_emb is None` (skip scene matching, still create a
+ new scene) and adding a defensive `None` check to `_cosine()` itself.
+ - **Retry-aware mark_failed** (`cli/pending_store.py`). Previously, ANY
+ exception during materialization permanently marked the memory as
+ failed, even transient errors like an embedding worker timeout. Now uses
+ the existing `retry_count` column: keeps status as `pending` for the first
+ two failures and marks `failed` only on the third failed attempt.
+
+ ### Added
+ - **Diagnostic logging in materializer**: "Materializer: waiting for
+ engine to init...", "engine acquired, starting drain loop", "processing
+ N pending memories". Operators can now verify the materializer is alive
+ without grepping for the absence of error messages.
+ - **`tests/test_integration/test_async_remember_e2e.py`**: full
+ production pipeline test: POST `/remember` (async, default mode) →
+ wait up to 60s → verify content in `memory.db` → recall returns it.
+ This is the test that was missing for 8+ months. The 4,501 existing
+ test functions test components in isolation (mocking `store_pending`)
+ and never exercise the full async flow that real users hit.
+
+ ### Recovery
+ On install, if you have existing failed records in `pending.db`, they will
+ be auto-retried on the next daemon restart by `engine._process_pending_memories()`.
+ To manually recover, run:
+ ```python
+ import os, sqlite3  # sqlite3 does not expand '~'; expanduser() resolves it
+ db = sqlite3.connect(os.path.expanduser('~/.superlocalmemory/pending.db'))
+ db.execute("UPDATE pending_memories SET status='pending', retry_count=0, error=NULL WHERE status='failed'")
+ db.commit()
+ ```
+ Then `slm restart`.
+
+ ---
+
  ## [3.4.37] - 2026-04-26
 
  **P0 RAM fix.** Total SLM footprint reduced from ~14 GB peak to ~2.3 GB peak
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "superlocalmemory",
- "version": "3.4.37",
+ "version": "3.4.38",
  "description": "Information-geometric agent memory with mathematical guarantees. 4-channel retrieval, Fisher-Rao similarity, zero-LLM mode, EU AI Act compliant. Works with Claude, Cursor, Windsurf, and 17+ AI tools.",
  "keywords": [
  "ai-memory",
package/pyproject.toml CHANGED
@@ -1,6 +1,6 @@
  [project]
  name = "superlocalmemory"
- version = "3.4.37"
+ version = "3.4.38"
  description = "Information-geometric agent memory with mathematical guarantees"
  readme = "README.md"
  license = {text = "AGPL-3.0-or-later"}
@@ -1,3 +1,3 @@
  """SuperLocalMemory — information-geometric agent memory."""
 
- __version__ = "3.4.37"
+ __version__ = "3.4.38"
@@ -122,13 +122,22 @@ def mark_done(row_id: int, base_dir: Path | None = None) -> None:
 
 
  def mark_failed(row_id: int, error: str, base_dir: Path | None = None) -> None:
-     """Mark a pending memory as failed with error message."""
+     """Mark a pending memory as failed with error message.
+
+     v3.4.38: Now retry-aware. If retry_count + 1 < _MAX_RETRIES, keeps status as
+     'pending' so the materializer will retry on the next iteration. Only marks
+     permanently failed after _MAX_RETRIES (3) attempts. The previous behavior
+     permanently failed 18 memories between April 15-26, 2026 on transient errors.
+     """
      conn = _get_db(base_dir)
      try:
+         # Increment retry count and conditionally update status
          conn.execute(
-             "UPDATE pending_memories SET status = 'failed', error = ?, "
-             "retry_count = retry_count + 1 WHERE id = ?",
-             (error, row_id),
+             "UPDATE pending_memories SET error = ?, "
+             "retry_count = retry_count + 1, "
+             "status = CASE WHEN retry_count + 1 >= ? THEN 'failed' ELSE 'pending' END "
+             "WHERE id = ?",
+             (error, _MAX_RETRIES, row_id),
          )
          conn.commit()
      finally:
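
The retry-aware `UPDATE` in this hunk can be exercised on its own. The following is a minimal sketch, assuming an in-memory SQLite table that contains only the columns the hunk touches and a hypothetical `MAX_RETRIES = 3` standing in for `_MAX_RETRIES`; the real `pending_memories` schema and `_get_db()` helper are not shown in the diff and may differ.

```python
import sqlite3

MAX_RETRIES = 3  # assumption: mirrors the "_MAX_RETRIES (3)" mentioned in the docstring


def mark_failed(conn: sqlite3.Connection, row_id: int, error: str) -> None:
    # SQLite evaluates every right-hand side of SET against the row's
    # pre-update values, so `retry_count + 1` inside CASE equals the new count.
    conn.execute(
        "UPDATE pending_memories SET error = ?, "
        "retry_count = retry_count + 1, "
        "status = CASE WHEN retry_count + 1 >= ? THEN 'failed' ELSE 'pending' END "
        "WHERE id = ?",
        (error, MAX_RETRIES, row_id),
    )
    conn.commit()


def demo_statuses() -> list:
    # Hypothetical minimal schema, for illustration only.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE pending_memories ("
        "id INTEGER PRIMARY KEY, "
        "status TEXT NOT NULL DEFAULT 'pending', "
        "error TEXT, "
        "retry_count INTEGER NOT NULL DEFAULT 0)"
    )
    conn.execute("INSERT INTO pending_memories (id) VALUES (1)")
    history = []
    for attempt in range(1, 4):
        mark_failed(conn, 1, f"embedding worker timeout #{attempt}")
        row = conn.execute(
            "SELECT status, retry_count FROM pending_memories WHERE id = 1"
        ).fetchone()
        history.append(row)
    conn.close()
    return history


if __name__ == "__main__":
    print(demo_statuses())
```

Under these assumptions the row stays `pending` after the first two failures and flips to `failed` on the third, which matches the behavior the changelog entry describes.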
@@ -167,6 +167,15 @@ def run_store(
          session_date=parsed_date, speaker_a=speaker,
      )
 
+     # v3.4.38: Defensive None guard. extract_facts() returns None on transient
+     # failures (embedding worker timeout, LLM call failure). Without this guard,
+     # line 201's `{f.content for f in facts}` raises 'NoneType' object is not
+     # iterable, causing the caller to mark_failed permanently, even though
+     # the content is still recoverable. 18 memories were lost to this between
+     # April 15-26, 2026.
+     if facts is None:
+         facts = []
+
      # V3.3.11: Also store raw content as a verbatim fact to preserve details
      # that fact extraction may abstract away (dates, names, specifics).
      # This ensures BM25 and semantic search can always find the original text.
@@ -56,6 +56,15 @@ class SceneBuilder:
          # Always compute fact embedding first: needed for comparisons
          fact_emb = self._embedder.embed(new_fact.content)
 
+         # v3.4.38: Defensive None guard. embedder.embed() returns None when
+         # the embedding worker is unavailable (timeout, crash). Without this
+         # guard, _cosine(None, theme_emb) → zip(None, ...) → 'NoneType'
+         # object is not iterable, propagating up to engine.store() and
+         # causing the entire memory to be lost. Better to skip scene
+         # assignment than lose the memory.
+         if fact_emb is None:
+             return self._create_scene(new_fact, profile_id)
+
          scenes = self._get_scenes(profile_id)
          if not scenes:
              return self._create_scene(new_fact, profile_id)
@@ -189,7 +198,12 @@ class SceneBuilder:
          )
 
 
- def _cosine(a: list[float], b: list[float]) -> float:
+ def _cosine(a: list[float] | None, b: list[float] | None) -> float:
+     # v3.4.38: Defensive None guard. The embedder can return None on worker
+     # unavailability. Returning 0.0 is correct: zero similarity means no
+     # match, which falls back to creating a new scene.
+     if a is None or b is None:
+         return 0.0
      dot = sum(x * y for x, y in zip(a, b))
      na = sum(x * x for x in a) ** 0.5
      nb = sum(x * x for x in b) ** 0.5
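
The hunk cuts off before `_cosine` returns, so the sketch below is a self-contained approximation rather than the package's code. It assumes plain Python sequences as embeddings and adds a zero-norm guard that is an assumption on my part; the name `cosine_or_zero` is hypothetical.

```python
from typing import Optional, Sequence


def cosine_or_zero(a: Optional[Sequence[float]], b: Optional[Sequence[float]]) -> float:
    """Cosine similarity that treats a missing embedding as 'no match'."""
    # Mirrors the guard in the hunk: a None embedding yields zero similarity.
    if a is None or b is None:
        return 0.0
    dot = sum(x * y for x, y in zip(a, b))
    na = sum(x * x for x in a) ** 0.5
    nb = sum(x * x for x in b) ** 0.5
    # Assumption: avoid dividing by zero for all-zero vectors.
    if na == 0.0 or nb == 0.0:
        return 0.0
    return dot / (na * nb)


if __name__ == "__main__":
    print(cosine_or_zero(None, [1.0, 0.0]))
    print(cosine_or_zero([1.0, 0.0], [1.0, 0.0]))
    print(cosine_or_zero([1.0, 0.0], [0.0, 1.0]))
```

Returning `0.0` rather than raising keeps the caller on its normal "no scene matched" path, which is the trade-off the comment in the hunk argues for.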
@@ -148,6 +148,13 @@ from superlocalmemory.core.recall_gate import (
      in_flight as _recalls_in_flight,
  )
 
+ # v3.4.38: Module-level engine reference for the pending materializer.
+ # Set by the FastAPI lifespan after engine.initialize(). Was missing before,
+ # causing "name '_engine' is not defined" errors that blocked materialization
+ # of pending memories: they accumulated forever, only being processed at
+ # daemon startup via engine._process_pending_memories().
+ _engine = None
+
 
  # ---------------------------------------------------------------------------
  # Observation debounce buffer (migrated from daemon.py)
@@ -420,6 +427,9 @@ async def lifespan(application: FastAPI):
 
      application.state.engine = engine
      application.state.config = config
+     # v3.4.38: Wire module-level _engine for the pending materializer.
+     global _engine
+     _engine = engine
      logger.info("Unified daemon: MemoryEngine initialized (mode=%s)", config.mode.value)
 
      # LLD-07 §4: deferred migrations (e.g. M006 reward column) need to
@@ -1378,16 +1388,31 @@ def _start_pending_materializer() -> None:
          from superlocalmemory.cli.pending_store import (
              get_pending, mark_done, mark_failed,
          )
+         # v3.4.38: log first engine acquisition so we know materializer is alive
+         _engine_logged = False
+         _waiting_logged = False
          while not _materializer_stop.is_set():
              try:
-                 engine = _engine  # may be None briefly at startup
+                 # v3.4.38: Read fresh module global on every iteration so we
+                 # pick up the engine after lifespan sets it. Use the import
+                 # trick to ensure we're reading the live module attribute,
+                 # not a stale local reference.
+                 import superlocalmemory.server.unified_daemon as _ud
+                 engine = _ud._engine
                  if engine is None:
+                     if not _waiting_logged:
+                         logger.info("Materializer: waiting for engine to init...")
+                         _waiting_logged = True
                      time.sleep(2.0)
                      continue
+                 if not _engine_logged:
+                     logger.info("Materializer: engine acquired, starting drain loop")
+                     _engine_logged = True
                  pending = get_pending(limit=5)
                  if not pending:
                      time.sleep(2.0)
                      continue
+                 logger.info("Materializer: processing %d pending memories", len(pending))
                  for item in pending:
                      if _materializer_stop.is_set():
                          break
@@ -1,6 +1,6 @@
  Metadata-Version: 2.4
  Name: superlocalmemory
- Version: 3.4.37
+ Version: 3.4.38
  Summary: Information-geometric agent memory with mathematical guarantees
  Author-email: Varun Pratap Bhardwaj <admin@superlocalmemory.com>
  License: AGPL-3.0-or-later