npm - superlocalmemory - Versions diffs - 3.4.40 → 3.4.42 - Mend

superlocalmemory 3.4.40 → 3.4.42

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (7)

package/CHANGELOG.md +92 -0
package/package.json +1 -1
package/pyproject.toml +6 -2
package/src/superlocalmemory/cli/commands.py +53 -4
package/src/superlocalmemory/cli/daemon.py +53 -21
package/src/superlocalmemory/core/engine_wiring.py +27 -2
package/src/superlocalmemory/mcp/tools_active.py +5 -5

package/CHANGELOG.md CHANGED

@@ -9,6 +9,98 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 ---
+## [3.4.42] - 2026-05-11
+Operational reliability release. Three latent bugs in the daemon /
+worker-singleton paths that surfaced together when running on a
+fresh-install machine and produced misleading "failed" output despite
+the system actually working. None of them affected the core recall or
+remember pipelines on a healthy daemon — they only broke `slm restart`,
+`slm warmup`, and `slm health` cosmetically — but the resulting noise
+eroded trust and made real failures harder to diagnose. All three are
+fixed without changing public APIs.
+### Fixed
+- **`slm restart` Step 3 false-negative.** Step 2 of `cmd_restart`
+  acquires `daemon.lock` via `fcntl.flock(LOCK_EX | LOCK_NB)` to block
+  other CLI/MCP processes from racing to start a daemon during the
+  restart window. Step 3 then called `ensure_daemon()`, which itself
+  attempts to acquire the same lock from a separate file descriptor in
+  the SAME process. BSD-style flock blocks per-fd even within one
+  process, so the second flock failed with `EWOULDBLOCK`,
+  `ensure_daemon` fell into its "wait for someone else to start it"
+  branch, timed out at 60 s, and reported "failed to start" — even
+  though no actual error occurred and a follow-up CLI call would
+  successfully start the daemon. Fixed by extracting
+  `_start_daemon_subprocess()` from `ensure_daemon()`. The new helper
+  performs the raw `subprocess.Popen` + PID/port file write +
+  `_wait_for_daemon` polling without taking the lock. `cmd_restart`
+  Step 3 now calls the helper directly (it already holds the lock);
+  `ensure_daemon()` itself is unchanged for external callers — it
+  acquires the lock and then delegates to the same helper. (`B1`)
+- **`slm warmup` "embedding verification failed" when daemon is up.**
+  `EmbeddingService._ensure_worker` enforces a machine-wide singleton
+  via a PID file (v3.4.13): only one embedding worker can exist per
+  machine, normally owned by the unified daemon. A fresh
+  `EmbeddingService` started by `slm warmup` saw the singleton, set
+  `_available = False`, returned `None` from `_subprocess_embed`, and
+  printed "Model loaded but embedding verification failed" with a
+  diagnostic that incorrectly guessed at a "Node.js wrapper Python-path
+  mismatch" (no Node.js is involved when running `slm warmup` from the
+  shell). Fixed by making `cmd_warmup` daemon-aware: when the daemon
+  is reachable and reports `engine=initialized`, the model is already
+  loaded inside the daemon's worker — print a `[PASS]` summary and
+  return without spawning a redundant local worker. The original
+  local-spawn path is preserved as a fall-through for the daemon-down
+  case. (`B2a`)
+- **Reranker false-positive "warmup failed" warning in CLI processes.**
+  Any CLI process that wires a `RetrievalEngine` while the daemon is
+  running (`slm health`, `slm doctor`, `slm recall`) would log
+  `"Cross-encoder reranker warmup failed — recalls will use fallback
+  scoring"` even though the daemon's reranker was healthy and serving
+  fine. The CLI process's own warmup was correctly blocked by the
+  reranker singleton, but the message did not distinguish the benign
+  singleton case from a real model-load failure. Fixed in
+  `engine_wiring.init_engine`: when `warmup_sync` returns `False`,
+  probe `_is_reranker_worker_alive()`. If another process owns the
+  worker, log an `INFO` line describing the singleton ownership;
+  reserve the `WARNING` for the genuine no-owner failure case. The
+  diagnostic value of the warning is preserved — only the false
+  positive is removed. (`B2b`)
+### Added
+- 17 new unit tests covering the three fixes (`tests/test_cli/test_v3442_*`,
+  `tests/test_core/test_v3442_reranker_warmup_singleton.py`). Tests are
+  fully mocked (no real subprocess spawn, no DB) and run in <1 s.
+- `pytest-asyncio>=0.21` added to both `[project.optional-dependencies].dev`
+  and `[dependency-groups].dev` in `pyproject.toml`. `asyncio_mode = "auto"`
+  configured in `[tool.pytest.ini_options]`, and the `asyncio` marker is now
+  registered. Resolves a local-vs-CI environment drift where 6 async adapter
+  tests (`tests/test_adapters/test_sync_loop.py`) failed locally for anyone
+  who installed via `pip install -e ".[dev]"` without separately installing
+  `pytest-asyncio` — the CI publish workflow installs the plugin explicitly,
+  so PyPI builds were not blocked, but the failures were noisy and
+  contributor-hostile.
+---
+## [3.4.41] - 2026-05-09
+Hotfix release. Pins `tree-sitter-language-pack` to the `<1` line. The
+upstream 1.x rewrite (Rust-backed) ships an incompatible Parser API — the
+language-pack's bundled `Parser` no longer exposes `.parse()`, breaking the
+code-graph extractor and its test suite. Pinning to the 0.x line restores
+the documented API. A migration to the 1.x API will follow in a later
+release once call-site changes are validated.
+### Fixed
+- `code_graph` extractor and tests broken by `tree-sitter-language-pack 1.x`.
+  Constraint changed from `>=0.3,<2` to `>=0.5,<1`.
+---
 ## [3.4.40] - 2026-05-09
 Recall performance and entity-profile hygiene. Two scaling issues surfaced
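
The `slm restart` entry in the 3.4.42 changelog rests on one property of `flock`: the lock belongs to the open file description, so a second `open()` of the same file inside the same process cannot take it. The following is a minimal sketch of that behaviour, not code from the package. It assumes a POSIX system and a local filesystem, and the function name `second_flock_blocked` is hypothetical.

```python
import fcntl
import os
import tempfile


def second_flock_blocked(path: str) -> bool:
    """Return True if a second descriptor in this process cannot take the lock."""
    fd1 = os.open(path, os.O_RDWR | os.O_CREAT)
    fd2 = os.open(path, os.O_RDWR | os.O_CREAT)  # separate open file description
    try:
        # First descriptor takes the exclusive lock without blocking.
        fcntl.flock(fd1, fcntl.LOCK_EX | fcntl.LOCK_NB)
        try:
            # Second descriptor asks for the same lock from the same process.
            fcntl.flock(fd2, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            return True  # EWOULDBLOCK: the lock is held via fd1
        return False
    finally:
        os.close(fd2)
        os.close(fd1)  # closing the descriptor releases the lock


with tempfile.TemporaryDirectory() as tmp:
    blocked = second_flock_blocked(os.path.join(tmp, "daemon.lock"))
    print("second flock blocked:", blocked)
```

On filesystems where `flock` is emulated with `fcntl` record locks (some network mounts), the result may differ, because those locks are owned per process rather than per open file description.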

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "superlocalmemory",
-  "version": "3.4.40",
+  "version": "3.4.42",
   "description": "Information-geometric agent memory with mathematical guarantees. 4-channel retrieval, Fisher-Rao similarity, zero-LLM mode, EU AI Act compliant. Works with Claude, Cursor, Windsurf, and 17+ AI tools.",
   "keywords": [
     "ai-memory",

package/pyproject.toml CHANGED

@@ -1,6 +1,6 @@
 [project]
 name = "superlocalmemory"
-version = "3.4.40"
+version = "3.4.42"
 description = "Information-geometric agent memory with mathematical guarantees"
 readme = "README.md"
 license = {text = "AGPL-3.0-or-later"}
@@ -46,7 +46,7 @@ dependencies = [
     "orjson>=3.9.0",
     # CodeGraph — code knowledge graph (v3.4)
     "tree-sitter>=0.23.0,<1",
-    "tree-sitter-language-pack>=0.3,<2",
+    "tree-sitter-language-pack>=0.5,<1",
     "rustworkx>=0.15,<1",
     "watchdog>=4.0,<6",
     # V3.4.3: Unified Brain
@@ -98,6 +98,7 @@ full = [
 dev = [
     "pytest>=8.0",
     "pytest-cov>=4.1",
+    "pytest-asyncio>=0.21",
     "sqlite-vec>=0.1.6",
 ]
@@ -124,10 +125,12 @@ superlocalmemory = ["ui/**/*", "skills/**/*"]
 testpaths = ["tests"]
 pythonpath = ["src"]
 addopts = "-m 'not slow and not ollama and not benchmark'"
+asyncio_mode = "auto"
 markers = [
     "slow: marks tests as slow — real engine/model loading (run with: pytest -m slow)",
     "ollama: marks tests that require a running Ollama instance",
     "benchmark: marks CI-only evo-memory benchmark tests (run with: pytest tests/test_benchmarks/ -m benchmark)",
+    "asyncio: marks tests as async — runs via pytest-asyncio (auto-mode in this project)",
 ]
 filterwarnings = [
     "ignore::DeprecationWarning:vaderSentiment",
@@ -167,5 +170,6 @@ select = ["E", "F", "I", "W"]
 dev = [
     "build>=1.4.0",
     "pytest>=9.0.2",
+    "pytest-asyncio>=0.21",
     "twine>=6.2.0",
 ]

package/src/superlocalmemory/cli/commands.py CHANGED

@@ -316,9 +316,16 @@ def cmd_restart(args: Namespace) -> None:
          f"removed: {', '.join(cleaned)}" if cleaned else "already clean")
     # Step 3: Start fresh daemon (lock still held — no races)
+    # v3.4.42: Call _start_daemon_subprocess() directly instead of
+    # ensure_daemon(). The latter tries to acquire daemon.lock itself,
+    # which the SAME PROCESS holds via restart_lock_fd above — BSD-style
+    # flock blocks per-fd even within one process, so ensure_daemon would
+    # fall into its lock-fail branch and time out after 60s while the
+    # actual daemon never gets started. Calling the helper directly
+    # bypasses that self-deadlock and starts the daemon as intended.
     time.sleep(1)
-    from superlocalmemory.cli.daemon import ensure_daemon
-    started = ensure_daemon()
+    from superlocalmemory.cli.daemon import _start_daemon_subprocess
+    started = _start_daemon_subprocess()
     # Release restart lock — daemon is now running with its own lock
     if restart_lock_fd:
@@ -1662,7 +1669,19 @@ def cmd_mcp(_args: Namespace) -> None:
 def cmd_warmup(_args: Namespace) -> None:
-    """Pre-download the embedding model so first use is instant."""
+    """Pre-download the embedding model so first use is instant.
+    v3.4.42: daemon-aware. The embedding worker is a machine-wide
+    singleton (`_is_embedding_worker_alive` + PID file), so when the
+    unified daemon is running it OWNS the worker. A fresh
+    `EmbeddingService` started here would see the singleton, set
+    `_available = False`, return None from `_subprocess_embed`, and
+    print "embedding verification failed" — even though the daemon's
+    worker is already happily serving the same model. The fix: detect
+    the daemon, verify via its health endpoint, and skip the local
+    spawn. Only fall through to the original local-worker path when
+    the daemon is genuinely unreachable.
+    """
     import superlocalmemory.core.embeddings as _emb_mod
     print("SuperLocalMemory V3 — Embedding Model Warmup")
@@ -1671,7 +1690,37 @@ def cmd_warmup(_args: Namespace) -> None:
     print(f"  Model:  nomic-ai/nomic-embed-text-v1.5 (~500MB)")
     print()
-    # Increase timeout for first-time download
+    # v3.4.42 — daemon-aware fast path. If the daemon is up and reports
+    # engine=initialized, the embedding model is already loaded inside
+    # the daemon's worker subprocess. No need to spawn a redundant one;
+    # in fact, the machine-wide singleton would refuse to do so anyway.
+    try:
+        from superlocalmemory.cli.daemon import (
+            is_daemon_running, daemon_request,
+        )
+        if is_daemon_running():
+            health = daemon_request("GET", "/health")
+            if health and health.get("engine") == "initialized":
+                from superlocalmemory.core.config import EmbeddingConfig
+                cfg = EmbeddingConfig()
+                print("[PASS] Daemon is running with embedding model loaded.")
+                print(f"       Model: {cfg.model_name} ({cfg.dimension}-dim)")
+                print("Semantic search is fully operational.")
+                return
+            # Daemon up but engine not yet initialized — warn and return
+            # rather than racing the daemon for the singleton lock.
+            engine_state = (health or {}).get("engine", "unknown")
+            print(f"[INFO] Daemon is up but engine state is '{engine_state}'.")
+            print("       Wait ~30s and retry, or run: slm doctor")
+            return
+    except Exception:
+        # Any failure in the daemon path falls through to local warmup —
+        # better to spawn a local worker than block warmup entirely.
+        pass
+    # Local-warmup fallback path: daemon is unreachable, so it's safe
+    # to spawn our own embedding worker (no singleton conflict).
+    # Increase timeout for first-time download.
     original_timeout = _emb_mod._SUBPROCESS_RESPONSE_TIMEOUT
     _emb_mod._SUBPROCESS_RESPONSE_TIMEOUT = 180  # 3 min for cold start
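
The branching that the `cmd_warmup` hunk adds can be summarised as a small decision function. This is an illustrative sketch only: `warmup_action` and its return labels are hypothetical names, and the real command prints its result and returns rather than handing back a value.

```python
from typing import Optional


def warmup_action(daemon_running: bool, health: Optional[dict]) -> str:
    """Pick the warmup path from the daemon state (hypothetical helper)."""
    if not daemon_running:
        # No daemon, so no singleton conflict: spawn a local worker.
        return "local_warmup"
    if health and health.get("engine") == "initialized":
        # The daemon's worker already has the model loaded.
        return "pass"
    # Daemon is up but the engine is not ready: do not race it for the singleton.
    return "wait"


print(warmup_action(True, {"engine": "initialized"}))
print(warmup_action(True, None))
print(warmup_action(False, None))
```

In the diff, any exception raised while probing the daemon also falls through to the local warmup path, which this sketch leaves out.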

package/src/superlocalmemory/cli/daemon.py CHANGED

@@ -137,6 +137,50 @@ def daemon_request(method: str, path: str, body: dict | None = None) -> dict | N
 _LOCK_FILE = Path.home() / ".superlocalmemory" / "daemon.lock"
+def _start_daemon_subprocess() -> bool:
+    """Spawn the unified daemon subprocess and wait for readiness.
+    v3.4.42: Extracted from ensure_daemon() so callers that already hold
+    daemon.lock (e.g. cmd_restart Step 2) can start the daemon WITHOUT
+    triggering a second flock acquisition. BSD-style flock blocks per-fd
+    even within the same process, so the previous code path produced a
+    self-deadlock when called from Step 3 of `slm restart`: the lock held
+    by Step 2 caused ensure_daemon's own flock to fail with EWOULDBLOCK,
+    falling into the wait-for-someone-else branch and timing out at 60s
+    even though the daemon would have started cleanly.
+    PRECONDITION: caller has either acquired daemon.lock OR is certain no
+    other CLI/MCP process is racing to start a daemon (e.g. we just killed
+    everything in `slm restart` Step 1).
+    Returns True if daemon is reachable on the health endpoint within
+    60 seconds, False otherwise.
+    """
+    if is_daemon_running():
+        return True
+    import subprocess
+    cmd = [sys.executable, "-m", "superlocalmemory.server.unified_daemon", "--start"]
+    log_dir = Path.home() / ".superlocalmemory" / "logs"
+    log_dir.mkdir(parents=True, exist_ok=True)
+    log_file = log_dir / "daemon.log"
+    kwargs: dict = {}
+    if sys.platform == "win32":
+        kwargs["creationflags"] = subprocess.CREATE_NO_WINDOW
+    else:
+        kwargs["start_new_session"] = True
+    with open(log_file, "a") as lf:
+        proc = subprocess.Popen(cmd, stdout=lf, stderr=lf, **kwargs)
+    # Write PID immediately so other callers see it during warmup
+    _PID_FILE.write_text(str(proc.pid))
+    _PORT_FILE.write_text(str(_DEFAULT_PORT))
+    return _wait_for_daemon(timeout=60)
 def ensure_daemon() -> bool:
     """Start daemon if not running. Returns True if daemon is ready.
@@ -145,6 +189,12 @@ def ensure_daemon() -> bool:
       2. File lock prevents two callers from starting concurrent daemons
       3. After starting, waits for PID file (not health check) — fast detection
       4. Cross-platform: macOS + Windows + Linux
+    v3.4.42: Refactored to delegate the actual subprocess start to
+    `_start_daemon_subprocess()`. Callers that already hold daemon.lock
+    (e.g. `slm restart` Step 3) should call that helper directly to avoid
+    the same-process flock self-deadlock that returned a false-negative
+    "failed to start" while the daemon was actually starting cleanly.
     """
     if is_daemon_running():
         return True
@@ -176,27 +226,9 @@ def ensure_daemon() -> bool:
         if is_daemon_running():
             return True
-        # Start unified daemon in background
-        import subprocess
-        cmd = [sys.executable, "-m", "superlocalmemory.server.unified_daemon", "--start"]
-        log_dir = Path.home() / ".superlocalmemory" / "logs"
-        log_dir.mkdir(parents=True, exist_ok=True)
-        log_file = log_dir / "daemon.log"
-        kwargs: dict = {}
-        if sys.platform == "win32":
-            kwargs["creationflags"] = subprocess.CREATE_NO_WINDOW
-        else:
-            kwargs["start_new_session"] = True
-        with open(log_file, "a") as lf:
-            proc = subprocess.Popen(cmd, stdout=lf, stderr=lf, **kwargs)
-        # Write PID immediately so other callers see it during warmup
-        _PID_FILE.write_text(str(proc.pid))
-        _PORT_FILE.write_text(str(_DEFAULT_PORT))
-        return _wait_for_daemon(timeout=60)
+        # Start unified daemon in background — delegated to helper so the
+        # same logic can be reused by callers that already hold the lock.
+        return _start_daemon_subprocess()
     except Exception as exc:
         # Daemon auto-start is the entry point for dashboard / mesh /
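
The refactor in this file follows a common shape: a locked entry point for callers that hold nothing, and an unlocked helper for callers that already hold the lock. The sketch below illustrates that shape under stated assumptions. All three function names are hypothetical, the spawn step is a stand-in callable, and the lock-fail branch simply returns `False` where the real `ensure_daemon()` waits for another process to finish starting the daemon. POSIX only.

```python
import fcntl
import os
import tempfile
from typing import Callable, List


def start_unlocked(spawn: Callable[[], bool]) -> bool:
    """Hypothetical helper: start the service; the caller handles locking."""
    return spawn()


def ensure_started(lock_path: str, spawn: Callable[[], bool]) -> bool:
    """Hypothetical locked entry point for callers that hold no lock."""
    fd = os.open(lock_path, os.O_RDWR | os.O_CREAT)
    try:
        try:
            fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            # Simplification: report failure instead of waiting for the holder.
            return False
        return start_unlocked(spawn)
    finally:
        os.close(fd)  # closing the descriptor releases the flock


def restart(lock_path: str, spawn: Callable[[], bool]) -> bool:
    """Hypothetical restart: hold the lock across the whole sequence."""
    fd = os.open(lock_path, os.O_RDWR | os.O_CREAT)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        # ensure_started() would contend with the lock held on fd,
        # so call the unlocked helper directly.
        return start_unlocked(spawn)
    finally:
        os.close(fd)


with tempfile.TemporaryDirectory() as tmp:
    lock = os.path.join(tmp, "daemon.lock")
    calls: List[str] = []

    def fake_spawn() -> bool:
        calls.append("spawned")
        return True

    restarted = restart(lock, fake_spawn)
    ensured = ensure_started(lock, fake_spawn)
    print(restarted, ensured, calls)
```

Keeping the lock out of the helper means the precondition (lock held, or no competing starter) has to be documented for every caller, which is what the new docstring in the diff does.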

package/src/superlocalmemory/core/engine_wiring.py CHANGED

@@ -559,14 +559,39 @@ def init_retrieval(
     # The CrossEncoderReranker constructor starts background warmup, but
     # callers can also call warmup_sync() to block until ready.
     # Here we just log warmup status — benchmark scripts call warmup_sync() explicitly.
+    #
+    # v3.4.42: Distinguish the legitimate "another process owns the
+    # reranker worker" case (machine-wide singleton — usually the unified
+    # daemon) from a real warmup failure. Before this fix, any CLI process
+    # that wired an Engine while the daemon was up would log
+    # "reranker warmup failed — recalls will use fallback scoring" even
+    # though the daemon's reranker was healthy and serving fine. The
+    # warning was a false positive that masked real failures and eroded
+    # trust in slm health / slm doctor output.
     if reranker is not None:
         import threading
         def _log_warmup_status() -> None:
             ready = reranker.warmup_sync(timeout=180)
             if ready:
                 logger.info("Cross-encoder reranker warm and ready")
-            else:
-                logger.warning("Cross-encoder reranker warmup failed — recalls will use fallback scoring")
+                return
+            # warmup_sync returned False. Could be (a) singleton held by
+            # another process (benign), or (b) actual model load failure.
+            # Disambiguate by probing the singleton PID file.
+            try:
+                from superlocalmemory.retrieval.reranker import _is_reranker_worker_alive
+                if _is_reranker_worker_alive():
+                    logger.info(
+                        "Cross-encoder reranker worker held by another process "
+                        "(machine-wide singleton — usually the unified daemon); "
+                        "this process will route reranking through that worker"
+                    )
+                    return
+            except Exception:
+                pass
+            logger.warning(
+                "Cross-encoder reranker warmup failed — recalls will use fallback scoring"
+            )
         t = threading.Thread(target=_log_warmup_status, daemon=True, name="ce-init-warmup")
         t.start()
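
The diff does not show how `_is_reranker_worker_alive()` is implemented. One plausible way to probe a PID-file singleton, and to choose the log level from the result, is sketched below. Everything here is an assumption made for illustration: the helper names, the PID-file format (a bare integer), and the use of signal 0, which is only safe on POSIX systems.

```python
import logging
import os
import tempfile
from pathlib import Path


def pid_file_owner_alive(pid_file: Path) -> bool:
    """Hypothetical probe: does the PID recorded in pid_file name a live process?"""
    try:
        pid = int(pid_file.read_text().strip())
    except (OSError, ValueError):
        return False  # missing, unreadable, or malformed PID file
    if pid <= 0:
        return False  # never signal a process group
    try:
        os.kill(pid, 0)  # signal 0 checks for existence without delivering anything
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def warmup_log_level(ready: bool, owner_alive: bool) -> int:
    """Hypothetical mapping from warmup outcome to log level."""
    if ready or owner_alive:
        return logging.INFO  # warm locally, or another process owns the worker
    return logging.WARNING  # nobody owns the worker: genuine failure


with tempfile.TemporaryDirectory() as tmp:
    pid_path = Path(tmp) / "reranker.pid"
    missing = pid_file_owner_alive(pid_path)
    pid_path.write_text(str(os.getpid()))
    alive = pid_file_owner_alive(pid_path)
    print(missing, alive)
```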

package/src/superlocalmemory/mcp/tools_active.py CHANGED

@@ -30,12 +30,12 @@ DB_PATH = MEMORY_DIR / "memory.db"
 def _get_agent_id(default: str = "mcp_client") -> str:
     """Resolve the calling agent's ID for attribution.
-    Each Avenger (Claude, Codex, Gemini, Kimi, GLM, Qwen, etc.) sets the
-    ``SLM_AGENT_ID`` env var in its MCP server config so that memories,
+    Each MCP client (Claude Code, Codex, Gemini CLI, Kimi, etc.) can set
+    the ``SLM_AGENT_ID`` env var in its MCP server config so that memories,
     observations, and registry entries are tagged with the actual source
     agent — not the legacy ``"mcp_client"`` default.
-    v3.4.39+: enables proper cross-Avenger attribution in ``session_init``,
+    v3.4.39+: enables proper per-agent attribution in ``session_init``,
     ``observe``, and event emissions.
     """
     return os.environ.get("SLM_AGENT_ID", default)
@@ -174,8 +174,8 @@ def register_active_tools(server, get_engine: Callable) -> None:
         The system will NOT store low-confidence or irrelevant content.
         v3.4.39: ``agent_id`` now defaults to the ``SLM_AGENT_ID`` env var
-        (set by each Avenger's MCP config) so observations carry proper
-        cross-Avenger attribution.
+        (set by each MCP client's config) so observations carry proper
+        per-agent attribution.
         """
         if agent_id is None:
             agent_id = _get_agent_id()