superlocalmemory 3.4.40 → 3.4.42

package/CHANGELOG.md CHANGED
@@ -9,6 +9,98 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 
  ---
 
+ ## [3.4.42] - 2026-05-11
+
+ Operational reliability release. Three latent bugs in the daemon /
+ worker-singleton paths that surfaced together when running on a
+ fresh-install machine and produced misleading "failed" output despite
+ the system actually working. None of them affected the core recall or
+ remember pipelines on a healthy daemon — they only broke `slm restart`,
+ `slm warmup`, and `slm health` cosmetically — but the resulting noise
+ eroded trust and made real failures harder to diagnose. All three are
+ fixed without changing public APIs.
+
+ ### Fixed
+ - **`slm restart` Step 3 false-negative.** Step 2 of `cmd_restart`
+ acquires `daemon.lock` via `fcntl.flock(LOCK_EX | LOCK_NB)` to block
+ other CLI/MCP processes from racing to start a daemon during the
+ restart window. Step 3 then called `ensure_daemon()`, which itself
+ attempts to acquire the same lock from a separate file descriptor in
+ the SAME process. BSD-style flock blocks per-fd even within one
+ process, so the second flock failed with `EWOULDBLOCK`,
+ `ensure_daemon` fell into its "wait for someone else to start it"
+ branch, timed out at 60 s, and reported "failed to start" — even
+ though no actual error occurred and a follow-up CLI call would
+ successfully start the daemon. Fixed by extracting
+ `_start_daemon_subprocess()` from `ensure_daemon()`. The new helper
+ performs the raw `subprocess.Popen` + PID/port file write +
+ `_wait_for_daemon` polling without taking the lock. `cmd_restart`
+ Step 3 now calls the helper directly (it already holds the lock);
+ `ensure_daemon()` itself is unchanged for external callers — it
+ acquires the lock and then delegates to the same helper. (`B1`)
+
+ - **`slm warmup` "embedding verification failed" when daemon is up.**
+ `EmbeddingService._ensure_worker` enforces a machine-wide singleton
+ via a PID file (v3.4.13): only one embedding worker can exist per
+ machine, normally owned by the unified daemon. A fresh
+ `EmbeddingService` started by `slm warmup` saw the singleton, set
+ `_available = False`, returned `None` from `_subprocess_embed`, and
+ printed "Model loaded but embedding verification failed" with a
+ diagnostic that incorrectly guessed at a "Node.js wrapper Python-path
+ mismatch" (no Node.js is involved when running `slm warmup` from the
+ shell). Fixed by making `cmd_warmup` daemon-aware: when the daemon
+ is reachable and reports `engine=initialized`, the model is already
+ loaded inside the daemon's worker — print a `[PASS]` summary and
+ return without spawning a redundant local worker. The original
+ local-spawn path is preserved as a fall-through for the daemon-down
+ case. (`B2a`)
+
+ - **Reranker false-positive "warmup failed" warning in CLI processes.**
+ Any CLI process that wires a `RetrievalEngine` while the daemon is
+ running (`slm health`, `slm doctor`, `slm recall`) would log
+ `"Cross-encoder reranker warmup failed — recalls will use fallback
+ scoring"` even though the daemon's reranker was healthy and serving
+ fine. The CLI process's own warmup was correctly blocked by the
+ reranker singleton, but the message did not distinguish the benign
+ singleton case from a real model-load failure. Fixed in
+ `engine_wiring.init_engine`: when `warmup_sync` returns `False`,
+ probe `_is_reranker_worker_alive()`. If another process owns the
+ worker, log an `INFO` line describing the singleton ownership;
+ reserve the `WARNING` for the genuine no-owner failure case. The
+ diagnostic value of the warning is preserved — only the false
+ positive is removed. (`B2b`)
+
+ ### Added
+ - 17 new unit tests covering the three fixes (`tests/test_cli/test_v3442_*`,
+ `tests/test_core/test_v3442_reranker_warmup_singleton.py`). Tests are
+ fully mocked (no real subprocess spawn, no DB) and run in <1 s.
+ - `pytest-asyncio>=0.21` added to both `[project.optional-dependencies].dev`
+ and `[dependency-groups].dev` in `pyproject.toml`. `asyncio_mode = "auto"`
+ configured in `[tool.pytest.ini_options]`, and the `asyncio` marker is now
+ registered. Resolves a local-vs-CI environment drift where 6 async adapter
+ tests (`tests/test_adapters/test_sync_loop.py`) failed locally for anyone
+ who installed via `pip install -e ".[dev]"` without separately installing
+ `pytest-asyncio` — the CI publish workflow installs the plugin explicitly,
+ so PyPI builds were not blocked, but the failures were noisy and
+ contributor-hostile.
+
+ ---
+
+ ## [3.4.41] - 2026-05-09
+
+ Hotfix release. Pins `tree-sitter-language-pack` to the `<1` line. The
+ upstream 1.x rewrite (Rust-backed) ships an incompatible Parser API — the
+ language-pack's bundled `Parser` no longer exposes `.parse()`, breaking the
+ code-graph extractor and its test suite. Pinning to the 0.x line restores
+ the documented API. A migration to the 1.x API will follow in a later
+ release once call-site changes are validated.
+
+ ### Fixed
+ - `code_graph` extractor and tests broken by `tree-sitter-language-pack 1.x`.
+ Constraint changed from `>=0.3,<2` to `>=0.5,<1`.
+
+ ---
+
  ## [3.4.40] - 2026-05-09

  Recall performance and entity-profile hygiene. Two scaling issues surfaced
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "superlocalmemory",
- "version": "3.4.40",
+ "version": "3.4.42",
  "description": "Information-geometric agent memory with mathematical guarantees. 4-channel retrieval, Fisher-Rao similarity, zero-LLM mode, EU AI Act compliant. Works with Claude, Cursor, Windsurf, and 17+ AI tools.",
  "keywords": [
  "ai-memory",
package/pyproject.toml CHANGED
@@ -1,6 +1,6 @@
  [project]
  name = "superlocalmemory"
- version = "3.4.40"
+ version = "3.4.42"
  description = "Information-geometric agent memory with mathematical guarantees"
  readme = "README.md"
  license = {text = "AGPL-3.0-or-later"}
@@ -46,7 +46,7 @@ dependencies = [
  "orjson>=3.9.0",
  # CodeGraph — code knowledge graph (v3.4)
  "tree-sitter>=0.23.0,<1",
- "tree-sitter-language-pack>=0.3,<2",
+ "tree-sitter-language-pack>=0.5,<1",
  "rustworkx>=0.15,<1",
  "watchdog>=4.0,<6",
  # V3.4.3: Unified Brain
@@ -98,6 +98,7 @@ full = [
  dev = [
  "pytest>=8.0",
  "pytest-cov>=4.1",
+ "pytest-asyncio>=0.21",
  "sqlite-vec>=0.1.6",
  ]
 
@@ -124,10 +125,12 @@ superlocalmemory = ["ui/**/*", "skills/**/*"]
  testpaths = ["tests"]
  pythonpath = ["src"]
  addopts = "-m 'not slow and not ollama and not benchmark'"
+ asyncio_mode = "auto"
  markers = [
  "slow: marks tests as slow — real engine/model loading (run with: pytest -m slow)",
  "ollama: marks tests that require a running Ollama instance",
  "benchmark: marks CI-only evo-memory benchmark tests (run with: pytest tests/test_benchmarks/ -m benchmark)",
+ "asyncio: marks tests as async — runs via pytest-asyncio (auto-mode in this project)",
  ]
  filterwarnings = [
  "ignore::DeprecationWarning:vaderSentiment",
@@ -167,5 +170,6 @@ select = ["E", "F", "I", "W"]
  dev = [
  "build>=1.4.0",
  "pytest>=9.0.2",
+ "pytest-asyncio>=0.21",
  "twine>=6.2.0",
  ]
@@ -316,9 +316,16 @@ def cmd_restart(args: Namespace) -> None:
          f"removed: {', '.join(cleaned)}" if cleaned else "already clean")

      # Step 3: Start fresh daemon (lock still held — no races)
+     # v3.4.42: Call _start_daemon_subprocess() directly instead of
+     # ensure_daemon(). The latter tries to acquire daemon.lock itself,
+     # which the SAME PROCESS holds via restart_lock_fd above — BSD-style
+     # flock blocks per-fd even within one process, so ensure_daemon would
+     # fall into its lock-fail branch and time out after 60s while the
+     # actual daemon never gets started. Calling the helper directly
+     # bypasses that self-deadlock and starts the daemon as intended.
      time.sleep(1)
-     from superlocalmemory.cli.daemon import ensure_daemon
-     started = ensure_daemon()
+     from superlocalmemory.cli.daemon import _start_daemon_subprocess
+     started = _start_daemon_subprocess()

      # Release restart lock — daemon is now running with its own lock
      if restart_lock_fd:
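The same-process lock conflict described in the comment above can be reproduced in isolation. The following is a minimal sketch, not code from the package: the function name and the lock path are hypothetical, and it assumes Linux or macOS with a local filesystem, where `flock` locks belong to the open file description rather than to the process.

```python
import fcntl
import os
import tempfile


def second_flock_is_refused(path: str) -> bool:
    """Return True if a second descriptor in this process cannot take the lock.

    Hypothetical helper for illustration only.
    """
    # Two separate os.open() calls create two independent open file descriptions.
    holder = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    contender = os.open(path, os.O_RDWR)
    try:
        fcntl.flock(holder, fcntl.LOCK_EX | fcntl.LOCK_NB)
        try:
            # Same process, different descriptor: the lock is still refused.
            fcntl.flock(contender, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:  # EWOULDBLOCK / EAGAIN
            return True
        return False
    finally:
        os.close(contender)
        os.close(holder)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(second_flock_is_refused(os.path.join(tmp, "demo.lock")))
```

On network filesystems where `flock` is emulated with per-process locks the result may differ, so treat this as a demonstration of the common local case only.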
@@ -1662,7 +1669,19 @@ def cmd_mcp(_args: Namespace) -> None:


  def cmd_warmup(_args: Namespace) -> None:
-     """Pre-download the embedding model so first use is instant."""
+     """Pre-download the embedding model so first use is instant.
+
+     v3.4.42: daemon-aware. The embedding worker is a machine-wide
+     singleton (`_is_embedding_worker_alive` + PID file), so when the
+     unified daemon is running it OWNS the worker. A fresh
+     `EmbeddingService` started here would see the singleton, set
+     `_available = False`, return None from `_subprocess_embed`, and
+     print "embedding verification failed" — even though the daemon's
+     worker is already happily serving the same model. The fix: detect
+     the daemon, verify via its health endpoint, and skip the local
+     spawn. Only fall through to the original local-worker path when
+     the daemon is genuinely unreachable.
+     """
      import superlocalmemory.core.embeddings as _emb_mod

      print("SuperLocalMemory V3 — Embedding Model Warmup")
@@ -1671,7 +1690,37 @@ def cmd_warmup(_args: Namespace) -> None:
      print(f" Model: nomic-ai/nomic-embed-text-v1.5 (~500MB)")
      print()

-     # Increase timeout for first-time download
+     # v3.4.42 daemon-aware fast path. If the daemon is up and reports
+     # engine=initialized, the embedding model is already loaded inside
+     # the daemon's worker subprocess. No need to spawn a redundant one;
+     # in fact, the machine-wide singleton would refuse to do so anyway.
+     try:
+         from superlocalmemory.cli.daemon import (
+             is_daemon_running, daemon_request,
+         )
+         if is_daemon_running():
+             health = daemon_request("GET", "/health")
+             if health and health.get("engine") == "initialized":
+                 from superlocalmemory.core.config import EmbeddingConfig
+                 cfg = EmbeddingConfig()
+                 print("[PASS] Daemon is running with embedding model loaded.")
+                 print(f" Model: {cfg.model_name} ({cfg.dimension}-dim)")
+                 print("Semantic search is fully operational.")
+                 return
+             # Daemon up but engine not yet initialized — warn and return
+             # rather than racing the daemon for the singleton lock.
+             engine_state = (health or {}).get("engine", "unknown")
+             print(f"[INFO] Daemon is up but engine state is '{engine_state}'.")
+             print(" Wait ~30s and retry, or run: slm doctor")
+             return
+     except Exception:
+         # Any failure in the daemon path falls through to local warmup —
+         # better to spawn a local worker than block warmup entirely.
+         pass
+
+     # Local-warmup fallback path: daemon is unreachable, so it's safe
+     # to spawn our own embedding worker (no singleton conflict).
+     # Increase timeout for first-time download.
      original_timeout = _emb_mod._SUBPROCESS_RESPONSE_TIMEOUT
      _emb_mod._SUBPROCESS_RESPONSE_TIMEOUT = 180 # 3 min for cold start
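The three outcomes of the fast path in this hunk can be summarised as a small decision function. This is a sketch under assumptions: `choose_warmup_action` and its return labels are hypothetical names that do not appear in the package, and the exception fall-through of the real code is not modelled.

```python
from typing import Optional


def choose_warmup_action(daemon_running: bool, health: Optional[dict]) -> str:
    """Map daemon state to one of three hypothetical action labels."""
    if not daemon_running:
        # Daemon unreachable: no singleton conflict, warm up locally.
        return "local-warmup"
    if health and health.get("engine") == "initialized":
        # The model is already loaded inside the daemon's worker.
        return "pass"
    # Daemon is up but the engine is not ready: do not race it for the singleton.
    return "retry-later"


if __name__ == "__main__":
    print(choose_warmup_action(True, {"engine": "initialized"}))
    print(choose_warmup_action(True, None))
    print(choose_warmup_action(False, None))
```

Keeping the decision separate from the printing makes each branch straightforward to cover with mocked inputs, which seems in line with the fully mocked tests the changelog mentions.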
 
@@ -137,6 +137,50 @@ def daemon_request(method: str, path: str, body: dict | None = None) -> dict | N
  _LOCK_FILE = Path.home() / ".superlocalmemory" / "daemon.lock"


+ def _start_daemon_subprocess() -> bool:
+     """Spawn the unified daemon subprocess and wait for readiness.
+
+     v3.4.42: Extracted from ensure_daemon() so callers that already hold
+     daemon.lock (e.g. cmd_restart Step 2) can start the daemon WITHOUT
+     triggering a second flock acquisition. BSD-style flock blocks per-fd
+     even within the same process, so the previous code path produced a
+     self-deadlock when called from Step 3 of `slm restart`: the lock held
+     by Step 2 caused ensure_daemon's own flock to fail with EWOULDBLOCK,
+     falling into the wait-for-someone-else branch and timing out at 60s
+     even though the daemon would have started cleanly.
+
+     PRECONDITION: caller has either acquired daemon.lock OR is certain no
+     other CLI/MCP process is racing to start a daemon (e.g. we just killed
+     everything in `slm restart` Step 1).
+
+     Returns True if daemon is reachable on the health endpoint within
+     60 seconds, False otherwise.
+     """
+     if is_daemon_running():
+         return True
+
+     import subprocess
+     cmd = [sys.executable, "-m", "superlocalmemory.server.unified_daemon", "--start"]
+     log_dir = Path.home() / ".superlocalmemory" / "logs"
+     log_dir.mkdir(parents=True, exist_ok=True)
+     log_file = log_dir / "daemon.log"
+
+     kwargs: dict = {}
+     if sys.platform == "win32":
+         kwargs["creationflags"] = subprocess.CREATE_NO_WINDOW
+     else:
+         kwargs["start_new_session"] = True
+
+     with open(log_file, "a") as lf:
+         proc = subprocess.Popen(cmd, stdout=lf, stderr=lf, **kwargs)
+
+     # Write PID immediately so other callers see it during warmup
+     _PID_FILE.write_text(str(proc.pid))
+     _PORT_FILE.write_text(str(_DEFAULT_PORT))
+
+     return _wait_for_daemon(timeout=60)
+
+
  def ensure_daemon() -> bool:
      """Start daemon if not running. Returns True if daemon is ready.

@@ -145,6 +189,12 @@ def ensure_daemon() -> bool:
      2. File lock prevents two callers from starting concurrent daemons
      3. After starting, waits for PID file (not health check) — fast detection
      4. Cross-platform: macOS + Windows + Linux
+
+     v3.4.42: Refactored to delegate the actual subprocess start to
+     `_start_daemon_subprocess()`. Callers that already hold daemon.lock
+     (e.g. `slm restart` Step 3) should call that helper directly to avoid
+     the same-process flock self-deadlock that returned a false-negative
+     "failed to start" while the daemon was actually starting cleanly.
      """
      if is_daemon_running():
          return True
@@ -176,27 +226,9 @@ def ensure_daemon() -> bool:
          if is_daemon_running():
              return True

-         # Start unified daemon in background
-         import subprocess
-         cmd = [sys.executable, "-m", "superlocalmemory.server.unified_daemon", "--start"]
-         log_dir = Path.home() / ".superlocalmemory" / "logs"
-         log_dir.mkdir(parents=True, exist_ok=True)
-         log_file = log_dir / "daemon.log"
-
-         kwargs: dict = {}
-         if sys.platform == "win32":
-             kwargs["creationflags"] = subprocess.CREATE_NO_WINDOW
-         else:
-             kwargs["start_new_session"] = True
-
-         with open(log_file, "a") as lf:
-             proc = subprocess.Popen(cmd, stdout=lf, stderr=lf, **kwargs)
-
-         # Write PID immediately so other callers see it during warmup
-         _PID_FILE.write_text(str(proc.pid))
-         _PORT_FILE.write_text(str(_DEFAULT_PORT))
-
-         return _wait_for_daemon(timeout=60)
+         # Start unified daemon in background — delegated to helper so the
+         # same logic can be reused by callers that already hold the lock.
+         return _start_daemon_subprocess()

      except Exception as exc:
          # Daemon auto-start is the entry point for dashboard / mesh /
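The refactor in this hunk follows a common shape: a public entry point that takes the lock and a lock-free helper that callers already holding the lock can invoke directly. A minimal sketch of that shape follows. The names `start_unlocked` and `start_with_lock` are hypothetical, the `start` callable stands in for the real subprocess spawn, and the sketch returns `False` on lock contention where the real `ensure_daemon()` waits for another process to finish starting the daemon.

```python
import fcntl
import os
from typing import Callable


def start_unlocked(start: Callable[[], bool]) -> bool:
    """Lock-free helper: the caller is responsible for holding the lock."""
    return start()


def start_with_lock(lock_path: str, start: Callable[[], bool]) -> bool:
    """Public entry point: take the lock, then delegate to the helper."""
    fd = os.open(lock_path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        try:
            fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            # Simplification: report failure instead of waiting for the holder.
            return False
        return start_unlocked(start)
    finally:
        # Closing the only descriptor for this open file also drops the flock.
        os.close(fd)
```

A caller that already holds the lock, such as a restart routine, would call `start_unlocked` directly, while every other caller goes through `start_with_lock`.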
@@ -559,14 +559,39 @@ def init_retrieval(
      # The CrossEncoderReranker constructor starts background warmup, but
      # callers can also call warmup_sync() to block until ready.
      # Here we just log warmup status — benchmark scripts call warmup_sync() explicitly.
+     #
+     # v3.4.42: Distinguish the legitimate "another process owns the
+     # reranker worker" case (machine-wide singleton — usually the unified
+     # daemon) from a real warmup failure. Before this fix, any CLI process
+     # that wired an Engine while the daemon was up would log
+     # "reranker warmup failed — recalls will use fallback scoring" even
+     # though the daemon's reranker was healthy and serving fine. The
+     # warning was a false positive that masked real failures and eroded
+     # trust in slm health / slm doctor output.
      if reranker is not None:
          import threading
          def _log_warmup_status() -> None:
              ready = reranker.warmup_sync(timeout=180)
              if ready:
                  logger.info("Cross-encoder reranker warm and ready")
-             else:
-                 logger.warning("Cross-encoder reranker warmup failed recalls will use fallback scoring")
+                 return
+             # warmup_sync returned False. Could be (a) singleton held by
+             # another process (benign), or (b) actual model load failure.
+             # Disambiguate by probing the singleton PID file.
+             try:
+                 from superlocalmemory.retrieval.reranker import _is_reranker_worker_alive
+                 if _is_reranker_worker_alive():
+                     logger.info(
+                         "Cross-encoder reranker worker held by another process "
+                         "(machine-wide singleton — usually the unified daemon); "
+                         "this process will route reranking through that worker"
+                     )
+                     return
+             except Exception:
+                 pass
+             logger.warning(
+                 "Cross-encoder reranker warmup failed — recalls will use fallback scoring"
+             )
          t = threading.Thread(target=_log_warmup_status, daemon=True, name="ce-init-warmup")
          t.start()
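The disambiguation added in this hunk reduces to choosing a log level from two inputs. The sketch below isolates that choice. `warmup_log_level` is a hypothetical name, and the `worker_alive` callable stands in for the package's `_is_reranker_worker_alive()` probe, whose internals are not shown in this diff.

```python
import logging
from typing import Callable


def warmup_log_level(ready: bool, worker_alive: Callable[[], bool]) -> int:
    """Pick INFO for the ready and benign-singleton cases, WARNING otherwise."""
    if ready:
        return logging.INFO
    try:
        if worker_alive():
            # Another process owns the worker: benign, not a failure.
            return logging.INFO
    except Exception:
        # A failing probe is treated like "no owner", as in the hunk above.
        pass
    return logging.WARNING
```

Note that a probe which raises still ends in `WARNING`, so an unknown state is reported rather than hidden.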
 
@@ -30,12 +30,12 @@ DB_PATH = MEMORY_DIR / "memory.db"
  def _get_agent_id(default: str = "mcp_client") -> str:
      """Resolve the calling agent's ID for attribution.

-     Each Avenger (Claude, Codex, Gemini, Kimi, GLM, Qwen, etc.) sets the
-     ``SLM_AGENT_ID`` env var in its MCP server config so that memories,
+     Each MCP client (Claude Code, Codex, Gemini CLI, Kimi, etc.) can set
+     the ``SLM_AGENT_ID`` env var in its MCP server config so that memories,
      observations, and registry entries are tagged with the actual source
      agent — not the legacy ``"mcp_client"`` default.

-     v3.4.39+: enables proper cross-Avenger attribution in ``session_init``,
+     v3.4.39+: enables proper per-agent attribution in ``session_init``,
      ``observe``, and event emissions.
      """
      return os.environ.get("SLM_AGENT_ID", default)
@@ -174,8 +174,8 @@ def register_active_tools(server, get_engine: Callable) -> None:
          The system will NOT store low-confidence or irrelevant content.

          v3.4.39: ``agent_id`` now defaults to the ``SLM_AGENT_ID`` env var
-         (set by each Avenger's MCP config) so observations carry proper
-         cross-Avenger attribution.
+         (set by each MCP client's config) so observations carry proper
+         per-agent attribution.
          """
          if agent_id is None:
              agent_id = _get_agent_id()