npm - superlocalmemory - Versions diffs - 3.4.42 → 3.4.44 - Mend

superlocalmemory 3.4.42 → 3.4.44

Files changed (16)

package/CHANGELOG.md +102 -0
package/README.md +41 -0
package/package.json +1 -1
package/pyproject.toml +43 -38
package/scripts/install.ps1 +19 -10
package/scripts/install.sh +15 -21
package/scripts/postinstall.js +9 -77
package/src/superlocalmemory/__init__.py +1 -1
package/src/superlocalmemory/cli/commands.py +57 -27
package/src/superlocalmemory/core/embedding_worker.py +9 -8
package/src/superlocalmemory/core/engine_wiring.py +10 -29
package/src/superlocalmemory/hooks/before_web_hook.py +128 -0
package/src/superlocalmemory/hooks/claude_code_hooks.py +57 -15
package/src/superlocalmemory/hooks/hook_handlers.py +27 -15
package/src/superlocalmemory/hooks/topic_shift_hook.py +272 -0
package/src/superlocalmemory/server/unified_daemon.py +36 -3

package/CHANGELOG.md CHANGED

@@ -9,6 +9,108 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 ---
+## [3.4.43] - 2026-05-12
+Smart-hook architecture release. Replaces the time-based 15-minute recall
+reminder with event-based detection that only fires when there's a real
+signal to recall against. Adds a pre-web-search recall hook so SLM's local
+memories are always surfaced before paying for external research.
+Both additions are perf-budgeted, fail-open, and idempotent. They activate
+on the next `slm hooks install` (or `slm init`); existing installations
+keep working unchanged until upgraded.
+### Added
+- **`slm hook topic_shift`** — UserPromptSubmit handler that keeps a 5-prompt
+  sliding window of content-word lists per session and emits a single-line
+  recall reminder ONLY when the current prompt's content-word set has zero
+  overlap with EVERY recent prompt (the strictest defensible signal for a
+  genuine topic pivot). Per-prompt max-overlap algorithm; not jaccard-vs-union
+  which over-fires on natural conversational drift. Stdlib-only, latency
+  <10ms p99. State file at `/tmp/slm-topicstate-{sha256(session_id)[:16]}.json`,
+  auto-purged after 24h. Observability log at `~/.superlocalmemory/logs/
+  topic-shift.log` (TSV: timestamp, session_hash, current_words_count,
+  window_depth, max_overlap, fired, prompt_preview). Disable with
+  `SLM_TOPIC_SHIFT_LOG=0`. Module: `superlocalmemory/hooks/topic_shift_hook.py`.
+- **`slm hook before_web`** — PreToolUse handler wired on
+  `matcher="WebSearch|WebFetch"`. Extracts the search query / URL / prompt
+  from Claude Code stdin, runs `slm recall <query> --limit 5`, injects
+  results as a `<system-reminder>` with the standard untrusted-boundary
+  markers so Claude reads local memory BEFORE the web call fires. Cost:
+  ~500-800ms warm per fire, but only on web tool calls (5-20x per typical
+  session). Fail-open on SLM-down / timeout / empty results. Module:
+  `superlocalmemory/hooks/before_web_hook.py`.
+- **`HOOKS_VERSION = "3.4.43"`** — bumped so `slm hooks status` flags
+  pre-3.4.43 wirings as outdated. Run `slm hooks install` to upgrade
+  to the new wiring.
+### Changed
+- **`_hook_checkpoint` periodic nag REMOVED.** The 15-minute "[SLM] 15+ min
+  since last context refresh" and 30-minute "[SLM] Call
+  mcp__superlocalmemory__get_learned_patterns" reminders previously emitted
+  by `slm hook checkpoint` are gone. Time-based reminders were noisy on
+  focused sessions and blind to quick topic pivots within a window. The
+  event-based topic_shift hook is the replacement; on-demand
+  `get_learned_patterns` MCP calls cover the learning side.
+  `_hook_checkpoint`'s real value — auto-observe on file-change events —
+  is unchanged. The `_RECALL_INTERVAL` and `_LEARN_INTERVAL` constants
+  are retained for backward import compatibility.
+### Fixed
+- **`slm mode <X>` CLI no longer clobbers embedding / retrieval / evolution /
+  forgetting / math settings.** Before this release the CLI handler called
+  `SLMConfig.for_mode(...)` passing only `llm_*` kwargs — silently
+  re-deriving every other field from mode defaults. A user with a tuned
+  cross-encoder (`cross-encoder/ms-marco-MiniLM-L-12-v2`) or a custom
+  embedding endpoint would lose their settings on every `slm mode b`.
+  The v3.4.34 `mode_change=True` guard only protected the `mode` field
+  itself; surrounding fields were lost. v3.4.43 reworks `cmd_mode` to
+  mutate only `config.mode` and save — preserving all other config
+  byte-for-byte. Mode-appropriate LLM defaults are populated ONLY when
+  the user has no provider set (so the daemon can still come up on a
+  fresh install). Tests: `tests/test_mode_switch_preservation.py` (7 new
+  regression tests covering A↔B, B↔A, anchor preservation, JSON path,
+  no-write-on-read, and the "Embedding model changed" warning that
+  used to fire on every benign mode switch).
+- **Default `PreToolUse` entry added on `slm hooks install`**. Previously
+  PreToolUse was empty unless `include_gate=True`. Now it contains one
+  entry (`before_web` on `WebSearch|WebFetch`) by default; gating users
+  get that PLUS the firewall entry. Existing settings are merged
+  idempotently — `_is_slm_hook_entry` recognises the new wiring so
+  `slm hooks remove` cleans it up properly.
+### Security
+- **CVE-2025-69872 closed (diskcache pickle deserialization RCE).** `diskcache`
+  was declared in `pyproject.toml` but never imported anywhere in `src/` or
+  `tests/` — a phantom dependency. Removed entirely. The `slm doctor`
+  performance-deps check no longer references it. Zero behavior change for
+  users; lower attack surface; smaller install.
+- **CVE-2026-1839 (transformers Trainer torch.load RCE) — UNREACHABLE in SLM,
+  upstream-pinned.** The vulnerable method `Trainer._load_rng_state` is in
+  training code paths. SLM is inference-only (uses `sentence-transformers`
+  with ONNX backend; never instantiates `Trainer`). pip-audit flags the dep
+  version because the vulnerable bytes are installed, but the code path is
+  never executed by SLM. We CANNOT pin `transformers>=5.0.0` (the upstream
+  fix) yet because `optimum-onnx 0.1.0` (the latest upstream release as of
+  v3.4.43) caps `transformers<4.58.0` — and `embedding_worker.py` requires
+  the ONNX backend. Will tighten the pin when optimum-onnx ships a
+  transformers-5.x-compatible build. Tracking issue: see project changelog
+  for v3.4.44+. Sentence-transformers minimum bumped to `>=5.2.0` to lock
+  out 5.0.0-5.1.2 (which capped transformers `<5.0.0` even more strictly)
+  and give the resolver maximum headroom for when the upstream pin lifts.
+### Migration
+- Existing v3.4.42 users: run `slm hooks install` (or `slm init`) once
+  after upgrading to pull in the new UserPromptSubmit and PreToolUse
+  entries. `slm hooks status` will flag the version mismatch.
+- The settings.json merge is idempotent; running install twice is safe.
+- Topic-shift detection works immediately on first new session — no DB
+  or state migration required.
+- `pip install -U superlocalmemory` will pull `transformers>=5.0.0` and
+  drop the unused `diskcache` dep automatically.
+---
 ## [3.4.42] - 2026-05-11
 Operational reliability release. Three latent bugs in the daemon /
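
Reviewer note on the `slm hook topic_shift` entry above: the per-prompt max-overlap rule can be sketched as follows. This is a minimal illustration and not the code in `topic_shift_hook.py`, which this diff does not show in full. The stopword list, the tokenizer, the `TopicShiftDetector` class, and the choice not to fire on an empty window or an empty word set are all assumptions; only the window size of 5, the zero-overlap rule, and the state-file naming scheme come from the changelog.

```python
import hashlib
import re
from collections import deque

# Hypothetical stopword list; the real module's list is not shown in this diff.
STOPWORDS = frozenset({
    "the", "and", "for", "with", "this", "that", "what", "how", "are", "was",
})
WINDOW_SIZE = 5  # "5-prompt sliding window" per the changelog


def content_words(prompt: str) -> set[str]:
    """Lowercase tokens of 3+ characters, minus stopwords (assumed tokenizer)."""
    tokens = re.findall(r"[a-z0-9_]{3,}", prompt.lower())
    return {t for t in tokens if t not in STOPWORDS}


def max_overlap(current: set[str], window) -> int:
    """Largest number of words shared with any single earlier prompt."""
    return max((len(current & prev) for prev in window), default=0)


def state_path(session_id: str) -> str:
    """State-file path using the naming scheme quoted in the changelog."""
    digest = hashlib.sha256(session_id.encode("utf-8")).hexdigest()[:16]
    return f"/tmp/slm-topicstate-{digest}.json"


class TopicShiftDetector:
    """Keeps the last WINDOW_SIZE content-word sets for one session."""

    def __init__(self) -> None:
        self.window: deque = deque(maxlen=WINDOW_SIZE)

    def observe(self, prompt: str) -> bool:
        words = content_words(prompt)
        # Assumption: never fire without history or without content words.
        fired = bool(words) and bool(self.window) and max_overlap(words, self.window) == 0
        if words:
            self.window.append(words)
        return fired


detector = TopicShiftDetector()
detector.observe("fix the sqlite migration bug")             # no history yet
detector.observe("sqlite migration still fails on startup")  # shares words
shifted = detector.observe("recommend pasta recipes for dinner")
```

Comparing against each earlier prompt separately means a single shared word with any one of them suppresses the reminder, which is why this rule fires less often than a Jaccard score computed against the union of the window.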

package/README.md CHANGED

@@ -234,6 +234,47 @@ All `--json` responses follow a consistent envelope with `success`, `command`, `
 ---
+## Smart-hook architecture (v3.4.43)
+SLM ships a small set of Claude Code hooks that fire memory operations only
+when there's a real signal — not on a timer, not on every keystroke. The
+hooks are perf-budgeted (<10ms p99 for the hot path) and fail-open (any
+crash → silent exit, never blocks your prompt). Install them with one
+command:
+```bash
+slm hooks install      # wires hooks into ~/.claude/settings.json
+slm hooks status       # shows what's installed
+slm hooks remove       # cleans up, preserves non-SLM hooks
+```
+| Hook | Event | When it fires | Why |
+|---|---|---|---|
+| `slm hook start` | SessionStart | Once at session boot | Injects core memory + recent context + learned patterns. ~80ms. |
+| `slm hook user_prompt_rehash` | UserPromptSubmit | Every prompt | Detects re-queries within 60s (negative signal that prior recall didn't satisfy). <10ms hot path. |
+| **`slm hook topic_shift`** *(new in 3.4.43)* | UserPromptSubmit | When current prompt shares zero content words with every prompt in a 5-turn sliding window | Surfaces a one-line "consider recall" hint on real topic pivots. Replaces the time-based 15-min nag — event-based, not timer-based. <10ms. |
+| **`slm hook before_web`** *(new in 3.4.43)* | PreToolUse on `WebSearch\|WebFetch` | Every web search/fetch | Runs `slm recall <query> --limit 5` and injects local memories as a system-reminder BEFORE the web call. Cost: ~500-800ms per fire, fires 5-20× per session. |
+| `slm hook checkpoint` | PostToolUse on `Write\|Edit` | Every file write/edit | Auto-observes file changes into SLM. No periodic nag (removed in v3.4.43). |
+| `slm hook post_tool_outcome` | PostToolUse (all tools) | Every tool call | Tracks which recalled facts got used (learning signal). |
+| `slm hook stop` | Stop | Session end | Saves rich session summary with git context. |
+**What "smart" means here:** the hooks don't interrupt you on a schedule.
+They watch for specific events that indicate memory work would add value —
+a topic pivot, a web call about to fire, a re-asked question, a file edit.
+Otherwise they stay out of your way.
+**Observability for the new hooks:**
+`topic_shift` writes one TSV line per decision to
+`~/.superlocalmemory/logs/topic-shift.log`
+(`timestamp | session_hash | current_words_count | window_depth | max_overlap |
+fired | prompt_preview`). Disable with `SLM_TOPIC_SHIFT_LOG=0`.
+**Upgrading from v3.4.42 or older:** Run `slm hooks install` once after
+upgrade to pull in the new wiring. `slm hooks status` will flag the
+version mismatch. Merge is idempotent — safe to run twice.
+---
 ## Three Operating Modes
 | Mode | What | Cloud? | EU AI Act | Best For |
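
Reviewer note on the `slm hook before_web` row above: the extract, recall, inject, fail-open flow might look roughly like the sketch below. It is not the code in `before_web_hook.py`. The payload field names (`tool_input`, `query`, `prompt`, `url`), their priority order, and the injected `recall` callable are assumptions made for illustration; the real hook shells out to `slm recall <query> --limit 5` and wraps results in untrusted-boundary markers whose exact form is not shown in this diff, so they are omitted here.

```python
import json
from typing import Callable, Optional


def extract_query(payload: dict) -> Optional[str]:
    """Pick the most query-like field from a hook payload (field names assumed)."""
    tool_input = payload.get("tool_input")
    if not isinstance(tool_input, dict):
        return None
    for key in ("query", "prompt", "url"):  # assumed priority order
        value = tool_input.get(key)
        if isinstance(value, str) and value.strip():
            return value.strip()
    return None


def before_web(raw_stdin: str, recall: Callable[[str, int], list]) -> str:
    """Return text to inject, or "" when there is nothing to add.

    Fail-open: malformed input, a recall error, or empty results all
    produce "" so the web call proceeds untouched.
    """
    try:
        query = extract_query(json.loads(raw_stdin))
        if not query:
            return ""
        hits = recall(query, 5)  # limit of 5 per the changelog
        if not hits:
            return ""
        lines = "\n".join(f"- {hit}" for hit in hits)
        return (
            "<system-reminder>\n"
            f"Local memories for {query!r}:\n{lines}\n"
            "</system-reminder>"
        )
    except Exception:
        return ""
```

Passing `recall` in as a parameter keeps the sketch testable without a running daemon; a real handler would also need a timeout around the recall call to stay within the stated latency budget.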

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "superlocalmemory",
-  "version": "3.4.42",
+  "version": "3.4.44",
   "description": "Information-geometric agent memory with mathematical guarantees. 4-channel retrieval, Fisher-Rao similarity, zero-LLM mode, EU AI Act compliant. Works with Claude, Cursor, Windsurf, and 17+ AI tools.",
   "keywords": [
     "ai-memory",

package/pyproject.toml CHANGED

@@ -1,6 +1,6 @@
 [project]
 name = "superlocalmemory"
-version = "3.4.42"
+version = "3.4.44"
 description = "Information-geometric agent memory with mathematical guarantees"
 readme = "README.md"
 license = {text = "AGPL-3.0-or-later"}
@@ -29,37 +29,42 @@ classifiers = [
 ]
 dependencies = [
-    "httpx>=0.24.0",
-    "numpy>=1.26.0,<3.0.0",
-    "scipy>=1.12.0,<2.0.0",
-    "networkx>=3.0",
-    "mcp>=1.0.0",
-    "python-dateutil>=2.9.0.post0",
-    "rank-bm25>=0.2.2",
-    "vadersentiment>=3.3.2",
-    "einops>=0.8.2",
-    "fastapi[all]>=0.135.1",
-    "uvicorn>=0.42.0",
-    "websockets>=16.0",
-    "lightgbm>=4.0.0",
-    "diskcache>=5.6.0",
-    "orjson>=3.9.0",
-    # CodeGraph — code knowledge graph (v3.4)
-    "tree-sitter>=0.23.0,<1",
-    "tree-sitter-language-pack>=0.5,<1",
-    "rustworkx>=0.15,<1",
-    "watchdog>=4.0,<6",
-    # V3.4.3: Unified Brain
-    "psutil>=5.9.0",
-    "structlog>=24.0.0,<27.0.0",
-    # Cross-platform file locking for single-daemon enforcement.
-    "portalocker>=2.7.0,<4.0.0",
-    # V3.4.18: Semantic search + cross-encoder reranker (npm install parity).
-    # Previously under [search] extra — pip users silently lost 30pp of recall
-    # quality vs. npm users. Now ships by default for both install paths.
-    "sentence-transformers[onnx]>=5.0.0",
-    "torch>=2.2.0",
-    "scikit-learn>=1.3.0,<2.0.0",
+    # All versions hard-pinned to the verified-good combination.
+    # Mixing versions outside these pins triggers per-batch memory
+    # blow-up in the embedding worker on Apple Silicon and breaks
+    # recall/remember latency targets. Update only after benchmarking.
+    "httpx==0.28.1",
+    "numpy==2.4.4",
+    "scipy==1.17.1",
+    "networkx==3.6.1",
+    "mcp==1.27.1",
+    "python-dateutil==2.9.0.post0",
+    "rank-bm25==0.2.2",
+    "vadersentiment==3.3.2",
+    "einops==0.8.2",
+    "fastapi[all]==0.136.1",
+    "uvicorn==0.46.0",
+    "websockets==16.0",
+    "lightgbm==4.6.0",
+    "orjson==3.11.9",
+    "tree-sitter==0.25.2",
+    "tree-sitter-language-pack==0.13.0",
+    "rustworkx==0.17.1",
+    "watchdog==5.0.3",
+    "psutil==7.2.2",
+    "structlog==25.5.0",
+    "portalocker==3.2.0",
+    # Semantic search + cross-encoder reranker. Embedding stack is
+    # extremely sensitive to version drift on Apple Silicon — newer
+    # versions allocate dramatically more per-batch memory.
+    "sentence-transformers[onnx]==5.3.0",
+    "onnxruntime==1.24.4",
+    "transformers==4.57.6",
+    "huggingface_hub==0.36.2",
+    "torch==2.11.0",
+    "scikit-learn==1.8.0",
+    # Vector KNN extension for the semantic channel.
+    "sqlite-vec==0.1.9",
 ]
 [project.optional-dependencies]
@@ -67,12 +72,13 @@ dependencies = [
 # moved into core in v3.4.18. ``pip install superlocalmemory[search]`` still
 # works but installs nothing extra.
 search = [
-    "sentence-transformers[onnx]>=5.0.0",
-    "einops>=0.8.2",
-    "torch>=2.2.0",
-    "scikit-learn>=1.3.0,<2.0.0",
+    # Same hard pin as core deps — see comment above.
+    "sentence-transformers[onnx]==5.3.0",
+    "einops==0.8.2",
+    "torch==2.11.0",
+    "scikit-learn==1.8.0",
     "geoopt>=0.5.0",
-    "onnxruntime>=1.17.0",
+    "onnxruntime==1.24.4",
 ]
 ui = [
     "fastapi[all]>=0.135.1",
@@ -83,7 +89,6 @@ learning = [
     "lightgbm>=4.0.0",
 ]
 performance = [
-    "diskcache>=5.6.0",
     "orjson>=3.9.0",
 ]
 ingestion = [
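
Reviewer note on the hard pins above: since the new comments warn against mixing versions outside the pinned set, a quick drift check against an environment could be sketched like this. The helper names (`parse_pin`, `installed_version`, `find_drift`) are hypothetical and not part of the package; only `importlib.metadata.version` and `PackageNotFoundError` are standard-library APIs. The comparison is a plain string match, so `1.0` and `1.0.0` would be reported as different.

```python
import re
from importlib import metadata
from typing import Callable, Optional

# Matches exact pins such as "fastapi[all]==0.136.1"; extras are skipped.
PIN_RE = re.compile(r"^\s*([A-Za-z0-9._-]+)(?:\[[^\]]*\])?\s*==\s*([^\s;]+)")


def parse_pin(spec: str) -> Optional[tuple]:
    """Return (name, version) for an exact pin, or None for any other spec."""
    match = PIN_RE.match(spec)
    return (match.group(1), match.group(2)) if match else None


def installed_version(name: str) -> Optional[str]:
    """Version of an installed distribution, or None if it is absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def find_drift(specs, lookup: Callable[[str], Optional[str]] = installed_version) -> dict:
    """Map each drifted package to (pinned_version, found_version)."""
    drift = {}
    for spec in specs:
        pin = parse_pin(spec)
        if pin is None:
            continue  # range specifiers such as ">=" are not checked
        name, wanted = pin
        found = lookup(name)
        if found != wanted:
            drift[name] = (wanted, found)
    return drift
```

Calling `find_drift(["numpy==2.4.4", "torch==2.11.0"])` in the target environment returns an empty dict when both pins are satisfied.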

package/scripts/install.ps1 CHANGED

@@ -233,22 +233,31 @@ print('Database ready')
     Write-Host "WARNING: setup_validator.py not found, skipping database init" -ForegroundColor Yellow
 }
-# Install core dependencies (required for graph & dashboard)
+# Install SuperLocalMemory and all dependencies via pyproject.toml (single source of truth)
 Write-Host ""
-Write-Host "Installing core dependencies..."
-Write-Host "INFO: This ensures graph visualization and patterns work out-of-box" -ForegroundColor Yellow
+Write-Host "Installing SuperLocalMemory and all dependencies..."
+Write-Host "INFO: Versions are pinned in pyproject.toml -- same versions for every install path" -ForegroundColor Yellow
+# Find pyproject.toml (parent of scripts/ or scripts/ itself)
+$ParentDir = Split-Path -Parent $REPO_DIR
+if (Test-Path (Join-Path $ParentDir "pyproject.toml")) {
+    $ProjRoot = $ParentDir
+} elseif (Test-Path (Join-Path $REPO_DIR "pyproject.toml")) {
+    $ProjRoot = $REPO_DIR
+} else {
+    $ProjRoot = $null
+}
-$coreRequirements = Join-Path $REPO_DIR "requirements-core.txt"
-if (Test-Path $coreRequirements) {
+if ($ProjRoot) {
     try {
-        & python -m pip install -q -r $coreRequirements 2>$null
-        Write-Host "OK Core dependencies installed (graph, dashboard, patterns)" -ForegroundColor Green
+        & python -m pip install -q -e $ProjRoot 2>$null
+        Write-Host "OK SuperLocalMemory and all dependencies installed (pinned versions)" -ForegroundColor Green
     } catch {
-        Write-Host "WARNING: Core dependency installation failed. Some features may not work." -ForegroundColor Yellow
-        Write-Host "   Install manually: python -m pip install -r $coreRequirements" -ForegroundColor Yellow
+        Write-Host "WARNING: Dependency installation failed." -ForegroundColor Yellow
+        Write-Host "   Install manually: python -m pip install -e $ProjRoot" -ForegroundColor Yellow
     }
 } else {
-    Write-Host "WARNING: requirements-core.txt not found, skipping dependency installation" -ForegroundColor Yellow
+    Write-Host "WARNING: pyproject.toml not found, cannot install dependencies" -ForegroundColor Yellow
 }
 # Initialize knowledge graph and pattern learning

package/scripts/install.sh CHANGED

@@ -358,8 +358,8 @@ except Exception as e:
 # Install core dependencies (required for graph & dashboard)
 echo ""
-echo "Installing core dependencies..."
-echo "⏳ This ensures graph visualization and patterns work out-of-box"
+echo "Installing SuperLocalMemory and all dependencies..."
+echo "⏳ Versions are pinned in pyproject.toml — same versions for every install path"
 # Detect pip installation method
 if pip3 install --help | grep -q "break-system-packages"; then
@@ -368,31 +368,25 @@ else
     PIP_FLAGS=""
 fi
-if [ -f "${REPO_DIR}/requirements-core.txt" ]; then
-    if pip3 install $PIP_FLAGS -q -r "${REPO_DIR}/requirements-core.txt"; then
-        echo "✓ Core dependencies installed (graph, dashboard, patterns)"
-    else
-        echo "⚠️  Core dependency installation failed. Some features may not work."
-        echo "   Install manually: pip3 install -r ${REPO_DIR}/requirements-core.txt"
-    fi
+# Find the repo root (parent of scripts/)
+PKG_ROOT="$(cd "${REPO_DIR}/.." && pwd)"
+if [ -f "${PKG_ROOT}/pyproject.toml" ]; then
+    PROJ_ROOT="${PKG_ROOT}"
+elif [ -f "${REPO_DIR}/pyproject.toml" ]; then
+    PROJ_ROOT="${REPO_DIR}"
 else
-    echo "⚠️  requirements-core.txt not found, skipping dependency installation"
+    PROJ_ROOT=""
 fi
-# Install learning dependencies (v2.7+)
-echo ""
-echo "Installing learning dependencies..."
-echo "  Enables intelligent pattern learning and personalized recall"
-if [ -f "${REPO_DIR}/requirements-learning.txt" ]; then
-    if pip3 install $PIP_FLAGS -q -r "${REPO_DIR}/requirements-learning.txt" 2>/dev/null; then
-        echo "✓ Learning dependencies installed (personalized ranking enabled)"
+if [ -n "${PROJ_ROOT}" ]; then
+    if pip3 install $PIP_FLAGS -q -e "${PROJ_ROOT}"; then
+        echo "✓ SuperLocalMemory and all dependencies installed (pinned versions)"
     else
-        echo "○ Learning dependencies skipped (core features unaffected)"
-        echo "  To install later: pip3 install lightgbm scipy"
+        echo "⚠️  Dependency installation failed."
+        echo "   Install manually: pip3 install -e ${PROJ_ROOT}"
     fi
 else
-    echo "○ requirements-learning.txt not found (learning features will use rule-based ranking)"
+    echo "⚠️  pyproject.toml not found, cannot install dependencies"
 fi
 # Initialize knowledge graph and pattern learning

package/scripts/postinstall.js CHANGED

@@ -97,83 +97,15 @@ function pipInstall(packages, label) {
     return false;
 }
-// Core dependencies (REQUIRED — product won't work without these)
-const coreDeps = [
-    'numpy>=1.26.0', 'scipy>=1.12.0', 'networkx>=3.0',
-    'httpx>=0.24.0', 'python-dateutil>=2.9.0',
-    'rank-bm25>=0.2.2', 'vaderSentiment>=3.3.2',
-    'einops>=0.8.2', 'mcp>=1.0.0',
-];
-if (pipInstall(coreDeps, 'core')) {
-    console.log('✓ Core dependencies installed (math, search, NLP)');
-} else {
-    console.log('⚠ Core dependency installation failed.');
-    console.log('  Run manually: pip install ' + coreDeps.join(' '));
-}
-// Search + ONNX reranking (V3.3.2 — enables 6-channel retrieval + cross-encoder)
-const searchDeps = [
-    'sentence-transformers[onnx]>=4.0.0',
-    'einops>=0.7.0', 'geoopt>=0.5.0',
-    'onnxruntime>=1.17.0',
-];
-console.log('\nInstalling semantic search + ONNX reranking engine...');
-console.log('  (sentence-transformers 4+, ONNX Runtime, Fisher-Rao geometry)');
-if (pipInstall(searchDeps, 'search')) {
-    console.log('✓ Search engine installed (sentence-transformers + ONNX + Fisher-Rao)');
-    console.log('  Cross-encoder reranking enabled for ALL modes (+30pp quality)');
-    console.log('');
-    console.log('  Models auto-download on first use:');
-    console.log('    - Embedding: nomic-ai/nomic-embed-text-v1.5 (~500MB)');
-    console.log('    - Reranker: cross-encoder/ms-marco-MiniLM-L-6-v2 (~90MB)');
-    console.log('  To pre-download now, run: slm warmup');
-} else {
-    console.log('⚠ Search engine installation failed (BM25 keyword search still works).');
-    console.log('  For full 6-channel retrieval + reranking, run:');
-    console.log('  pip install "sentence-transformers[onnx]>=4.0.0" einops geoopt onnxruntime');
-}
-// Dashboard dependencies (IMPORTANT — enables web dashboard + MCP server)
-const dashboardDeps = ['fastapi[all]>=0.135.1', 'uvicorn>=0.42.0', 'websockets>=16.0'];
-console.log('\nInstalling dashboard & server dependencies...');
-if (pipInstall(dashboardDeps, 'dashboard')) {
-    console.log('✓ Dashboard & MCP server dependencies installed (fastapi + uvicorn)');
-} else {
-    console.log('⚠ Dashboard installation failed.');
-    console.log('  Run manually: pip install \'fastapi[all]\' uvicorn websockets');
-}
-// Learning dependencies (enables adaptive retrieval after 200+ signals)
-const learningDeps = ['lightgbm>=4.0.0'];
-console.log('\nInstalling learning engine...');
-if (pipInstall(learningDeps, 'learning')) {
-    console.log('✓ Learning engine installed (lightgbm — adaptive ranking)');
-} else {
-    console.log('⚠ Learning installation failed (retrieval still works without it).');
-    console.log('  Run manually: pip install lightgbm');
-}
-// Performance dependencies (optional — improves caching and JSON speed)
-const perfDeps = ['diskcache>=5.6.0', 'orjson>=3.9.0'];
-console.log('\nInstalling performance optimizations...');
-if (pipInstall(perfDeps, 'performance')) {
-    console.log('✓ Performance optimizations installed (diskcache + orjson)');
-} else {
-    console.log('⚠ Performance deps skipped (system works fine without them).');
-}
-// V3.4.3: Unified Brain dependencies (health monitor, structured logging, file watching)
-const brainDeps = ['psutil>=5.9.0', 'structlog>=24.0.0', 'watchdog>=4.0.0'];
-console.log('\nInstalling Unified Brain dependencies (health monitor, file watcher)...');
-if (pipInstall(brainDeps, 'brain')) {
-    console.log('✓ Unified Brain deps installed (psutil + structlog + watchdog)');
-    console.log('  Health monitoring, structured logging, and file watching enabled');
-} else {
-    console.log('⚠ Unified Brain deps partially installed (health monitoring may be limited).');
-    console.log('  Run manually: pip install psutil structlog watchdog');
-}
+// Install the superlocalmemory package and all its pinned dependencies
+// in one shot. pyproject.toml is the single source of truth for versions,
+// so users via npm get exactly the same dep set as users via pip.
+console.log('\nInstalling SuperLocalMemory and all dependencies...');
+console.log('  (Single pip install — versions pinned in pyproject.toml)');
+console.log('  This may take 1-3 minutes (downloads ~500MB of models on first use).');
+console.log('');
+console.log('  Includes: numpy, scipy, fastapi, sentence-transformers, onnxruntime,');
+console.log('           torch, transformers, sqlite-vec, lightgbm, mcp, and more.');
 // --- Step 3b: Install the superlocalmemory package itself ---
 // This ensures `python -m superlocalmemory.cli.main` always resolves the

package/src/superlocalmemory/__init__.py CHANGED

@@ -1,3 +1,3 @@
 """SuperLocalMemory — information-geometric agent memory."""
-__version__ = "3.4.39"
+__version__ = "3.4.44"

package/src/superlocalmemory/cli/commands.py CHANGED

@@ -629,24 +629,53 @@ def cmd_setup(args: Namespace) -> None:
 def cmd_mode(args: Namespace) -> None:
-    """Get or set the operating mode."""
+    """Get or set the operating mode.
+    v3.4.43 behavior change: switching modes via this CLI now PRESERVES the
+    user's existing embedding, retrieval, evolution, forgetting, and math
+    settings. Previously the CLI called ``SLMConfig.for_mode(...)`` which
+    re-derived every field from mode defaults — silently clobbering user
+    customizations (e.g. a tuned cross-encoder model, a custom embedding
+    endpoint, or custom forgetting half-lives). The v3.4.34 ``mode_change=True``
+    guard only protected the ``mode`` field itself; everything else was lost.
+    New rules:
+      - Only ``config.mode`` changes.
+      - If the user has NO LLM provider configured AND is switching to a mode
+        that typically needs one (B or C), mode-appropriate LLM defaults are
+        populated to avoid the daemon coming up dead. Existing LLM config
+        is preserved as-is.
+      - Embedding / retrieval / evolution / forgetting / math: untouched.
+    """
     from superlocalmemory.core.config import SLMConfig
     from superlocalmemory.storage.models import Mode
     config = SLMConfig.load()
+    def _apply_mode_change(new_value: str) -> tuple[SLMConfig, bool]:
+        """Mutate-in-place mode switch. Returns (updated_config, llm_was_set).
+        Only changes ``config.mode``. If the user has no LLM provider
+        configured AND is moving to Mode B or C, populates the mode's
+        default LLM block so the daemon has something to talk to.
+        Everything else (embedding, retrieval, evolution, forgetting,
+        math, profile) is preserved byte-for-byte.
+        """
+        new_mode = Mode(new_value)
+        llm_was_set = False
+        if new_mode != Mode.A and not config.llm.provider:
+            defaults = SLMConfig.for_mode(new_mode)
+            config.llm = defaults.llm
+            llm_was_set = True
+        config.mode = new_mode
+        config.save(mode_change=True)
+        return config, llm_was_set
     if getattr(args, 'json', False):
         from superlocalmemory.cli.json_output import json_print
         if args.value:
             old_mode = config.mode.value.upper()
-            updated = SLMConfig.for_mode(
-                Mode(args.value),
-                llm_provider=config.llm.provider,
-                llm_model=config.llm.model,
-                llm_api_key=config.llm.api_key,
-                llm_api_base=config.llm.api_base,
-            )
-            updated.save(mode_change=True)
+            updated, _ = _apply_mode_change(args.value)
             json_print("mode", data={
                 "previous_mode": old_mode, "current_mode": args.value.upper(),
             }, next_actions=[
@@ -661,20 +690,18 @@ def cmd_mode(args: Namespace) -> None:
         return
     if args.value:
-        updated = SLMConfig.for_mode(
-            Mode(args.value),
-            llm_provider=config.llm.provider,
-            llm_model=config.llm.model,
-            llm_api_key=config.llm.api_key,
-            llm_api_base=config.llm.api_base,
-        )
-        updated.save(mode_change=True)
+        updated, llm_was_set = _apply_mode_change(args.value)
         print(f"Mode set to: {args.value.upper()}")
-        # V3.3: Check if embedding model changed — inform about re-indexing
-        if (config.embedding.provider != updated.embedding.provider
-                or config.embedding.model_name != updated.embedding.model_name):
-            print("  ⚠ Embedding model changed. Re-indexing will run on next recall.")
+        # v3.4.43: embedding/retrieval are now preserved, so the old
+        # "Embedding model changed. Re-indexing will run on next recall."
+        # warning no longer fires from a CLI mode switch — that was the
+        # symptom of the bug. The warning is retained ONLY as an
+        # informational note when LLM defaults were freshly populated.
+        if llm_was_set:
+            print(f"  ℹ LLM provider populated from mode defaults: "
+                  f"{updated.llm.provider}/{updated.llm.model}. "
+                  f"Run `slm provider set` to customize.")
         # V3.3.4: Warn if Mode C lacks cloud API key
         if args.value == "c" and not updated.llm.api_key:
@@ -1422,19 +1449,22 @@ def cmd_doctor(args: Namespace) -> None:
                "brew install libomp && pip install --force-reinstall lightgbm")
     # 6. Performance deps
+    # v3.4.43: diskcache removed from this check — it was a phantom dependency
+    # (declared in pyproject.toml but never imported anywhere in src/ or tests/).
+    # Dropping it closes CVE-2025-69872 (pickle deserialization RCE) without any
+    # behavior change. orjson remains a real performance dep.
     perf_ok = []
-    for mod in ["diskcache", "orjson"]:
+    for mod in ["orjson"]:
         try:
             __import__(mod)
             perf_ok.append(mod)
         except ImportError:
             pass
-    if len(perf_ok) == 2:
-        _check("Performance deps", "PASS", "diskcache, orjson")
+    if perf_ok:
+        _check("Performance deps", "PASS", "orjson")
     else:
-        missing = {"diskcache", "orjson"} - set(perf_ok)
-        _check("Performance deps", "WARN", f"Missing: {', '.join(missing)}",
-               "pip install diskcache orjson")
+        _check("Performance deps", "WARN", "Missing: orjson",
+               "pip install orjson")
     # 7. Embedding worker functional test — skipped under --quick.
     if quick:
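
Reviewer note on the `cmd_mode` change above: the difference between rebuilding the config from mode defaults and mutating only the mode can be reproduced with a toy model. Everything below is a stand-in: `Config`, `LLM`, `MODE_LLM_DEFAULTS`, `for_mode`, and `switch_mode` are hypothetical simplifications of `SLMConfig` and `_apply_mode_change`, with a single `reranker` field representing the embedding, retrieval, evolution, forgetting, and math settings.

```python
from dataclasses import dataclass, field


@dataclass
class LLM:
    provider: str = ""
    model: str = ""


@dataclass
class Config:
    mode: str = "a"
    reranker: str = "default-reranker"  # stands in for all non-LLM settings
    llm: LLM = field(default_factory=LLM)


# Hypothetical per-mode LLM defaults; the real values are not in this diff.
MODE_LLM_DEFAULTS = {
    "b": LLM("local-provider", "example-model"),
    "c": LLM("cloud-provider", "example-model"),
}


def for_mode(mode: str, llm: LLM) -> Config:
    """Old behaviour: rebuild from defaults, carrying over only the LLM block."""
    return Config(mode=mode, llm=llm)


def switch_mode(config: Config, mode: str) -> bool:
    """New behaviour: change only `mode`; fill the LLM block if it is empty.

    Returns True when LLM defaults were populated.
    """
    llm_was_set = False
    if mode != "a" and not config.llm.provider:
        defaults = MODE_LLM_DEFAULTS[mode]
        config.llm = LLM(defaults.provider, defaults.model)
        llm_was_set = True
    config.mode = mode
    return llm_was_set


tuned = Config(reranker="cross-encoder/ms-marco-MiniLM-L-12-v2")
rebuilt = for_mode("b", tuned.llm)      # reranker silently reset to default
populated = switch_mode(tuned, "b")     # reranker kept, LLM defaults filled
```

After the two calls, `rebuilt.reranker` is back to the default while `tuned.reranker` still holds the tuned cross-encoder, which is the regression the changelog describes.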

package/src/superlocalmemory/core/embedding_worker.py CHANGED

@@ -53,24 +53,25 @@ def _start_parent_watchdog() -> None:
 def _load_embedding_model(name: str) -> tuple:
-    """Load embedding model. ONNX first (no memory leak), PyTorch fallback.
-    V3.3.17: PyTorch SentenceTransformer on ARM64 Mac leaks memory —
-    grows from 300MB to 17GB after ~200 encode calls. ONNX Runtime
-    has no such issue. Same approach as CrossEncoder ONNX migration.
+    """Load embedding model. ONNX CPU-only first, PyTorch fallback.
     Returns (model, backend_name) or (None, "").
     """
     from sentence_transformers import SentenceTransformer
-    # Tier 1: ONNX (stable memory; ~1.1 GB for nomic-embed-text-v1.5)
+    # ONNX with explicit CPU provider — avoids CoreML EP memory overhead.
     try:
-        m = SentenceTransformer(name, backend="onnx", trust_remote_code=True)
+        m = SentenceTransformer(
+            name,
+            backend="onnx",
+            trust_remote_code=True,
+            model_kwargs={"provider": "CPUExecutionProvider"},
+        )
         return m, "onnx"
     except Exception:
         pass
-    # Tier 2: PyTorch CPU (stable at ~1.4GB after 100+ calls, verified)
+    # PyTorch CPU fallback.
     try:
         import torch
         with torch.inference_mode():