nexo-brain 7.14.0 → 7.15.1
- package/.claude-plugin/plugin.json +1 -1
- package/README.md +8 -4
- package/bin/nexo-brain.js +27 -6
- package/package.json +1 -1
- package/src/agent_runner.py +86 -0
- package/src/claim_graph.py +19 -4
- package/src/cli.py +27 -0
- package/src/cognitive/_core.py +124 -12
- package/src/cognitive/_search.py +156 -0
- package/src/db/_learnings.py +22 -10
- package/src/db/_schema.py +68 -0
- package/src/db/_semantic_similarity.py +4 -0
- package/src/doctor/providers/runtime.py +88 -0
- package/src/email_sent_events.py +14 -3
- package/src/enforcement_engine.py +52 -0
- package/src/hnsw_index.py +15 -3
- package/src/hook_observability.py +71 -0
- package/src/local_model_manifest.json +16 -13
- package/src/local_models.py +3 -0
- package/src/migrate_embeddings.py +17 -6
- package/src/plugins/cognitive_memory.py +1 -1
- package/src/script_registry.py +68 -0
- package/src/scripts/nexo-daily-self-audit.py +133 -65
- package/src/scripts/nexo-email-monitor.py +212 -4
- package/src/scripts/nexo-morning-agent.py +191 -0
- package/src/scripts/nexo-send-reply.py +1 -0
- package/src/server.py +4 -2
- package/src/tools_learnings.py +37 -15
- package/templates/core-prompts/interactive-startup.md +1 -1
- package/templates/core-prompts/server-mcp-instructions.md +3 -0
package/.claude-plugin/plugin.json
CHANGED
@@ -1,6 +1,6 @@
 {
   "name": "nexo-brain",
-  "version": "7.
+  "version": "7.15.1",
   "description": "Local cognitive runtime for Claude Code \u2014 persistent memory, overnight learning, doctor diagnostics, personal scripts, recovery-aware jobs, startup preflight, and optional dashboard/power helper.",
   "author": {
     "name": "NEXO Brain",
package/README.md
CHANGED
@@ -18,7 +18,11 @@
 
 [Watch the overview video](https://nexo-brain.com/watch/) · [Watch on YouTube](https://www.youtube.com/watch?v=i2lkGhKyVqI) · [Open the infographic](https://nexo-brain.com/assets/nexo-brain-infographic-v5.png)
 
-Version `7.
+Version `7.15.1` is the current packaged-runtime line. Patch release over v7.15.0 - Brain drains larger self-audit clusters, bounds hook history with update-time cleanup, filters normal Codex bootstrap reads, routes email-monitor effort by message complexity, and locks morning briefings by local date and recipient.
+
+Previously in `7.15.0`: minor release — Brain unifies sent-email continuity across send paths, moves cognitive recall to multilingual embeddings, forces tagged learnings into context, hardens email loop guards and headless runners, exposes learning creation dates, and adds AUTO-N burst postmortems.
+
+Previously in `7.14.0`: minor release — Brain closes the install/reliability loop with update-path venv recovery, platform-gated wheels, WSL Desktop-managed flag preservation, startup memory authority warnings, legacy MEMORY write blocking, post-action real-world verification, and stale followup triage.
 
 Previously in `7.13.9`: patch release — Brain moves aside an existing managed `.venv` when it was created with unsupported Python <3.10, then recreates it with the supported interpreter prepared by Desktop.
 
@@ -383,7 +387,7 @@ That keeps the core Ebbinghaus model, but makes decay more individual and less p
 
 ### Semantic Search (Finding by Meaning)
 
-NEXO Brain doesn't search by keywords. It searches by **meaning** using vector embeddings (fastembed,
+NEXO Brain doesn't search by keywords. It searches by **meaning** using multilingual vector embeddings (fastembed, 384 dimensions).
 
 Example: If you search for "deploy problems", NEXO Brain will find a memory about "SSH connection timeout on production server" — even though they share zero words. This is how human associative memory works.
 
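Review note on the README hunk above: the "search by meaning" claim comes down to ranking stored memories by vector similarity. The sketch below shows that idea with cosine similarity. The three-dimensional vectors are made-up stand-ins for real 384-dimensional embeddings, and `cosine` and `rank` are hypothetical helper names, not NEXO Brain functions.

```python
import math


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)


def rank(query_vec, memories):
    """Return memory texts sorted from most to least similar to the query."""
    scored = [(cosine(query_vec, vec), text) for text, vec in memories.items()]
    return [text for _, text in sorted(scored, reverse=True)]


# Toy vectors, assumed for illustration; a real embedding model would produce these.
memories = {
    "SSH connection timeout on production server": [0.9, 0.1, 0.0],
    "Lunch order for Friday": [0.0, 0.2, 0.9],
}
query = [0.8, 0.2, 0.1]  # stands in for the embedding of "deploy problems"
print(rank(query, memories)[0])
```

The two texts share no words with the query; only the vector geometry decides the order.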
@@ -601,7 +605,7 @@ NEXO Brain was evaluated on [LoCoMo](https://github.com/snap-research/locomo) (A
 - 93.3% adversarial rejection rate — reliably says "I don't know" when information isn't available
 - 74.9% recall across 1,986 questions
 - Open-domain F1: 0.637 | Multi-hop F1: 0.333 | Temporal F1: 0.326
-- Runs on CPU with
+- Runs on CPU with local multilingual embeddings — no GPU required
 - First MCP memory server benchmarked on a peer-reviewed dataset
 
 Full results in [`benchmarks/locomo/results/`](benchmarks/locomo/results/).
@@ -1447,7 +1451,7 @@ See [benchmarks/results/memory-recall-vs-static.md](benchmarks/results/memory-re
 
 ### v0.9.0 — Cognitive Memory (2026-03-15)
 - Atkinson-Shiffrin memory model (STM → LTM promotion)
-- Semantic RAG with
+- Semantic RAG with pinned local multilingual fastembed models
 - Trust scoring, sentiment detection, adaptive personality modes
 - Ebbinghaus decay, sister detection, quarantine system
 
package/bin/nexo-brain.js
CHANGED
@@ -24,7 +24,26 @@ const readline = require("readline");
 require = createRequire(path.join(__dirname, "nexo-brain.js"));
 const { runViaWsl } = require("./windows-wsl-bridge");
 
-
+function isCliEntrypoint() {
+  const invoked = process.argv && process.argv[1] ? String(process.argv[1]) : "";
+  if (!invoked) return false;
+
+  const normalize = (candidate) => {
+    try {
+      return fs.realpathSync.native(candidate);
+    } catch {
+      try {
+        return fs.realpathSync(candidate);
+      } catch {
+        return path.resolve(candidate);
+      }
+    }
+  };
+
+  return normalize(invoked) === normalize(__filename);
+}
+
+if (process.platform === "win32" && isCliEntrypoint()) {
   const bridged = runViaWsl({
     scriptPath: __filename,
     args: process.argv.slice(2),
@@ -4983,8 +5002,10 @@ async function main() {
   }
 }
 
-
-
-
-
-
+if (isCliEntrypoint()) {
+  Promise.resolve(main()).catch((err) => {
+    closeReadline();
+    console.error("Setup failed:", err.message);
+    process.exit(1);
+  });
+}
package/package.json
CHANGED
@@ -1,6 +1,6 @@
 {
   "name": "nexo-brain",
-  "version": "7.
+  "version": "7.15.1",
   "mcpName": "io.github.wazionapps/nexo",
   "description": "NEXO Brain — Shared brain for AI agents. Persistent memory, semantic RAG, natural forgetting, metacognitive guard, trust scoring, 150+ MCP tools. Works with Claude Code, Codex, Claude Desktop & any MCP client. 100% local, free.",
   "homepage": "https://nexo-brain.com",
package/src/agent_runner.py
CHANGED
@@ -4,6 +4,7 @@ from __future__ import annotations
 
 import json
 import os
+import re
 import paths
 import shlex
 import shutil
@@ -385,6 +386,79 @@ def _headless_env(env: dict | None = None) -> dict:
     return merged
 
 
+_MUTATING_TOOL_NAMES = frozenset({
+    "write",
+    "edit",
+    "multiedit",
+    "notebookedit",
+    "delete",
+    "bash",
+    "shell",
+})
+
+
+def _runner_mutating_tools_allowed(allowed_tools: str) -> bool:
+    text = str(allowed_tools or "").strip().lower()
+    if not text:
+        return True
+    parts = {part.strip().split(":", 1)[0].lower() for part in re.split(r"[,;\s]+", text) if part.strip()}
+    return bool(parts & _MUTATING_TOOL_NAMES)
+
+
+def _extract_runner_guard_paths(prompt: str, cwd: Path) -> list[str]:
+    found: set[str] = set()
+    text = str(prompt or "")
+    for match in re.findall(r"(?<![A-Za-z0-9_])(?:/[^\s'\"`<>]+|[A-Za-z]:\\[^\s'\"`<>]+)", text):
+        cleaned = match.rstrip(".,);:]")
+        if cleaned:
+            found.add(cleaned)
+    for match in re.findall(r"(?<![A-Za-z0-9_])(?:src|scripts|tests|docs|lib|renderer|app)/[A-Za-z0-9_./-]+\.[A-Za-z0-9]+", text):
+        found.add(str((cwd / match.rstrip(".,);:]")).resolve()))
+    try:
+        resolved_cwd = cwd.resolve()
+    except Exception:
+        resolved_cwd = cwd
+    runtime_core = NEXO_HOME / "core"
+    try:
+        if resolved_cwd == runtime_core or runtime_core in resolved_cwd.parents:
+            found.add(str(resolved_cwd))
+    except Exception:
+        pass
+    return sorted(found)
+
+
+def _run_headless_runner_guard(*, caller: str, cwd: Path, prompt: str, allowed_tools: str) -> dict:
+    if not _runner_mutating_tools_allowed(allowed_tools):
+        return {"blocked": False, "skipped": "read_only_tools"}
+    guard_paths = _extract_runner_guard_paths(prompt, cwd)
+    if not guard_paths:
+        return {"blocked": False, "skipped": "no_explicit_paths"}
+    try:
+        runtime_root = str(NEXO_HOME)
+        if runtime_root and runtime_root not in sys.path:
+            sys.path.insert(0, runtime_root)
+        from plugins.guard import handle_guard_check  # type: ignore
+
+        output = handle_guard_check(
+            files=",".join(guard_paths),
+            area=f"runner:{caller or 'headless'}",
+            project_hint=f"headless runner caller={caller or 'unknown'} cwd={cwd}",
+            include_schemas="true",
+        )
+    except Exception as exc:
+        return {
+            "blocked": True,
+            "summary": f"Runner guard unavailable: {exc}",
+            "paths": guard_paths,
+        }
+    blocked = "BLOCKING RULES" in str(output or "")
+    return {
+        "blocked": blocked,
+        "summary": str(output or ""),
+        "paths": guard_paths,
+    }
+
+
 def _load_client_bootstrap_prompt(client: str) -> str:
     try:
         from bootstrap_docs import load_bootstrap_prompt
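Review note on `_runner_mutating_tools_allowed` in the hunk above: the allowlist string is split on commas, semicolons and whitespace, each entry is cut at its first colon, and the result is intersected with the mutating tool names. The sketch below copies that logic into a standalone function so it can be run on sample inputs. The sample allowlist strings are assumptions chosen for illustration.

```python
import re

# Copied from the hunk above so the sketch runs on its own.
MUTATING_TOOL_NAMES = frozenset(
    {"write", "edit", "multiedit", "notebookedit", "delete", "bash", "shell"}
)


def mutating_tools_allowed(allowed_tools):
    text = str(allowed_tools or "").strip().lower()
    if not text:
        # An empty allowlist rules nothing out, so mutation is assumed possible.
        return True
    parts = {
        part.strip().split(":", 1)[0].lower()
        for part in re.split(r"[,;\s]+", text)
        if part.strip()
    }
    return bool(parts & MUTATING_TOOL_NAMES)


# Sample allowlists (illustrative only).
for sample in ["", "Read,Grep", "Read, Edit", "Bash:git status", "Bash(git:*)"]:
    print(repr(sample), mutating_tools_allowed(sample))
```

One consequence of this parsing is that a parenthesised entry such as `Bash(git:*)` reduces to `bash(git` instead of `bash`, so in this sketch it is classified as read-only and the guard would be skipped.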
@@ -1000,6 +1074,18 @@ def run_automation_prompt(
         reasoning_effort=reasoning_effort,
         preferences=prefs,
     )
+    guard_result = _run_headless_runner_guard(
+        caller=caller,
+        cwd=cwd_path,
+        prompt=prompt,
+        allowed_tools=allowed_tools,
+    )
+    if guard_result.get("blocked"):
+        stderr = "NEXO runner guard blocked this automation before editing shared files.\n"
+        summary = str(guard_result.get("summary") or "").strip()
+        if summary:
+            stderr = _append_stderr(stderr, summary)
+        return subprocess.CompletedProcess(["nexo-runner-guard"], 2, "", stderr)
     started_at = time.perf_counter()
 
     if selected_backend == CLIENT_CLAUDE_CODE:
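Review note on the early return in the hunk above: a blocked run is reported as a `subprocess.CompletedProcess` with `args` set to `["nexo-runner-guard"]`, return code 2, empty stdout, and the reason in stderr. The sketch below shows how a caller might tell that apart from a normal backend result. `fake_guarded_run` is a hypothetical stand-in for `run_automation_prompt`, and the backend name and stderr text are made up.

```python
import subprocess


def fake_guarded_run(blocked):
    """Hypothetical stand-in that mimics the early-return shape from the diff."""
    if blocked:
        stderr = "NEXO runner guard blocked this automation before editing shared files.\n"
        return subprocess.CompletedProcess(["nexo-runner-guard"], 2, "", stderr)
    return subprocess.CompletedProcess(["some-backend"], 0, "ok", "")


def was_blocked_by_guard(result):
    """True when the result carries the guard's marker args and exit code."""
    return result.returncode == 2 and list(result.args) == ["nexo-runner-guard"]


result = fake_guarded_run(blocked=True)
if was_blocked_by_guard(result):
    print("guard blocked the run:", result.stderr.strip())
```

Checking `args` as well as the return code avoids confusing the guard with a backend that happens to exit with status 2.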
package/src/claim_graph.py
CHANGED
@@ -22,13 +22,28 @@ def _get_db():
 
 
 def _embed(text: str) -> np.ndarray:
-
-
+    try:
+        import cognitive
+        return cognitive.embed(text)
+    except Exception:
+        try:
+            import cognitive
+            dim = int(getattr(cognitive, "EMBEDDING_DIM", 384) or 384)
+        except Exception:
+            dim = 768
+        return np.zeros(dim, dtype=np.float32)
 
 
 def _cosine_similarity(a, b) -> float:
-
-
+    try:
+        import cognitive
+        return cognitive.cosine_similarity(a, b)
+    except Exception:
+        norm_a = np.linalg.norm(a)
+        norm_b = np.linalg.norm(b)
+        if norm_a == 0 or norm_b == 0:
+            return 0.0
+        return float(np.dot(a, b) / (norm_a * norm_b))
 
 
 def _array_to_blob(arr: np.ndarray) -> bytes:
package/src/cli.py
CHANGED
@@ -19,6 +19,7 @@ Entry points:
 nexo scripts run NAME_OR_PATH [-- args...]
 nexo scripts doctor [NAME_OR_PATH] [--json]
 nexo scripts call TOOL --input JSON [--json-output]
+nexo automations reactivate NAME [--test-run] [--json]
 nexo skills list [--level ...] [--source-kind ...] [--json]
 nexo skills get ID [--json]
 nexo skills apply ID [--params JSON] [--mode ...] [--dry-run] [--json]
@@ -688,6 +689,22 @@ def _automations_set_enabled(args, enabled):
     return 0
 
 
+def _automations_reactivate(args):
+    from script_registry import reactivate_automation
+
+    result = reactivate_automation(args.name, test_run=bool(getattr(args, "test_run", False)))
+    if args.json:
+        print(json.dumps(result, indent=2, ensure_ascii=False))
+        return 0 if result.get("ok") else 1
+    if not result.get("ok"):
+        print(result.get("error", "Could not reactivate automation"), file=sys.stderr)
+        return 1
+    print(f"Automation {result['name']} enabled.")
+    if result.get("test_run"):
+        print("Test run completed.")
+    return 0
+
+
 def _automations_status(args):
     from script_registry import get_automation_status
 
@@ -2956,6 +2973,14 @@ def main():
     automations_disable_p.add_argument("name", help="Automation name or path")
     automations_disable_p.add_argument("--json", action="store_true", help="JSON output")
 
+    automations_reactivate_p = automations_sub.add_parser(
+        "reactivate",
+        help="Enable an automation and optionally run a check",
+    )
+    automations_reactivate_p.add_argument("name", help="Automation name or path")
+    automations_reactivate_p.add_argument("--test-run", action="store_true", help="Run the automation's check without sending")
+    automations_reactivate_p.add_argument("--json", action="store_true", help="JSON output")
+
     automations_status_p = automations_sub.add_parser("status", help="Read automation status")
     automations_status_p.add_argument("name", help="Automation name or path")
     automations_status_p.add_argument("--json", action="store_true", help="JSON output")
@@ -3439,6 +3464,8 @@ def main():
             return _automations_set_enabled(args, True)
         elif args.automations_command == "disable":
             return _automations_set_enabled(args, False)
+        elif args.automations_command == "reactivate":
+            return _automations_reactivate(args)
         elif args.automations_command == "status":
             return _automations_status(args)
         elif args.automations_command == "instructions":
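Review note on the new `reactivate` subcommand in `cli.py`: the three hunks above add a docstring entry, an argparse subparser, and a dispatch branch. The sketch below rebuilds only the parser part with the standard library to show how `nexo automations reactivate NAME [--test-run] [--json]` would parse. The top-level `dest="command"` and the automation name `morning-agent` are assumptions, not taken from the diff.

```python
import argparse

parser = argparse.ArgumentParser(prog="nexo")
commands = parser.add_subparsers(dest="command")  # top-level dest name is assumed
automations = commands.add_parser("automations")
automations_sub = automations.add_subparsers(dest="automations_command")

reactivate = automations_sub.add_parser(
    "reactivate",
    help="Enable an automation and optionally run a check",
)
reactivate.add_argument("name", help="Automation name or path")
reactivate.add_argument("--test-run", action="store_true")
reactivate.add_argument("--json", action="store_true")

# "morning-agent" is a made-up automation name used only for this example.
args = parser.parse_args(["automations", "reactivate", "morning-agent", "--test-run"])
print(args.automations_command, args.name, args.test_run, args.json)
```

argparse maps `--test-run` to the attribute `test_run`. That is why the handler in the diff reads `getattr(args, "test_run", False)`.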
package/src/cognitive/_core.py
CHANGED
@@ -3,8 +3,10 @@
 import base64
 import json
 import math
+import hashlib
 import os
 import re
+import shutil
 import sqlite3
 import numpy as np
 from datetime import datetime, timedelta
@@ -18,7 +20,19 @@ _cognitive_dir = paths.cognitive_dir()
 _cognitive_dir.mkdir(parents=True, exist_ok=True)
 
 COGNITIVE_DB = str(_cognitive_dir / "cognitive.db")
-
+def _configured_embedding_dim() -> int:
+    try:
+        from local_models import get_local_model_spec
+
+        dim = int(get_local_model_spec("bge-base-embeddings").dimension or 0)
+        if dim > 0:
+            return dim
+    except Exception:
+        pass
+    return 384
+
+
+EMBEDDING_DIM = _configured_embedding_dim()
 LAMBDA_STM = 0.004126  # half-life = ln(2) / (7 * 24) ≈ 7 days
 LAMBDA_LTM = 0.000481  # half-life = ln(2) / (60 * 24) ≈ 60 days
 DEFAULT_MEMORY_STABILITY = 1.0
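Review note on the `LAMBDA_STM` and `LAMBDA_LTM` context lines above: the inline comments give `ln(2) / (days * 24)`, which is the per-hour decay rate for a given half-life, not the half-life itself. The sketch below recomputes both constants. It assumes they are applied as `exp(-rate * hours)`, which the diff does not show, and `decay_rate` and `retained` are hypothetical helper names.

```python
import math


def decay_rate(half_life_days):
    """Per-hour exponential decay rate for a half-life given in days."""
    return math.log(2) / (half_life_days * 24)


def retained(rate, hours):
    """Fraction remaining after `hours`, assuming exp(-rate * hours) decay."""
    return math.exp(-rate * hours)


lambda_stm = decay_rate(7)    # short-term memory, 7-day half-life
lambda_ltm = decay_rate(60)   # long-term memory, 60-day half-life

print(round(lambda_stm, 6), round(lambda_ltm, 6))
print(retained(lambda_stm, 7 * 24))  # fraction left after one half-life
```

Rounded to six decimals, the two rates match the constants in the file.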
@@ -307,20 +321,37 @@ def _migrate_memory_personalization(conn: sqlite3.Connection):
 
 
 def _auto_migrate_embeddings(conn: sqlite3.Connection):
-    """
+    """Re-embed when vector dimension or pinned embedding model changes."""
     try:
-
+        conn.execute("""
+            CREATE TABLE IF NOT EXISTS embedding_model_state (
+                key TEXT PRIMARY KEY,
+                value TEXT NOT NULL,
+                updated_at TEXT DEFAULT (datetime('now'))
+            )
+        """)
+        current_marker = _current_embedding_model_marker()
+        stored = conn.execute(
+            "SELECT value FROM embedding_model_state WHERE key = 'embedding_model_marker'"
+        ).fetchone()
+        stored_marker = stored["value"] if stored else ""
+
+        row = None
+        for table in ("stm_memories", "ltm_memories", "quarantine"):
+            row = conn.execute(f"SELECT embedding FROM {table} LIMIT 1").fetchone()
+            if row:
+                break
         if not row:
-
+            _write_embedding_model_marker(conn, current_marker)
+            return
 
         vec = np.frombuffer(row["embedding"], dtype=np.float32)
-
-
-
-
-        return  # Unknown dimension, don't touch
+        dimension_matches = len(vec) == EMBEDDING_DIM
+        model_matches = stored_marker == current_marker
+        if dimension_matches and model_matches:
+            return
 
-
+        _backup_cognitive_db_for_embedding_migration(stored_marker, current_marker)
         model = _get_model()
 
         for table in ("stm_memories", "ltm_memories", "quarantine"):
@@ -333,14 +364,75 @@ def _auto_migrate_embeddings(conn: sqlite3.Connection):
 
             embeddings = list(model.embed(contents))
             for mem_id, emb in zip(ids, embeddings):
-
+                arr = np.array(emb, dtype=np.float32)
+                if len(arr) != EMBEDDING_DIM:
+                    raise ValueError(f"embedding dimension mismatch: {len(arr)} != {EMBEDDING_DIM}")
+                blob = arr.tobytes()
                 conn.execute(f"UPDATE {table} SET embedding = ? WHERE id = ?", (blob, mem_id))
 
+        _write_embedding_model_marker(conn, current_marker)
         conn.commit()
     except Exception:
         pass  # Don't break startup if migration fails
 
 
+def _current_embedding_model_marker() -> str:
+    try:
+        from local_models import get_local_model_spec
+
+        spec = get_local_model_spec("bge-base-embeddings")
+        return "|".join([
+            spec.name,
+            spec.kind,
+            spec.model_id,
+            spec.source_repo,
+            spec.revision,
+            str(EMBEDDING_DIM),
+        ])
+    except Exception:
+        return f"unknown|{EMBEDDING_DIM}"
+
+
+def _write_embedding_model_marker(conn: sqlite3.Connection, marker: str) -> None:
+    conn.execute(
+        """
+        INSERT INTO embedding_model_state (key, value, updated_at)
+        VALUES ('embedding_model_marker', ?, datetime('now'))
+        ON CONFLICT(key) DO UPDATE SET
+            value = excluded.value,
+            updated_at = excluded.updated_at
+        """,
+        (marker,),
+    )
+    conn.commit()
+
+
+def _backup_cognitive_db_for_embedding_migration(old_marker: str, new_marker: str) -> None:
+    db_path = Path(COGNITIVE_DB)
+    if not db_path.exists():
+        return
+    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
+    backup = db_path.with_name(f"{db_path.name}.bak-embedding-{stamp}")
+    meta = backup.with_suffix(backup.suffix + ".json")
+    try:
+        shutil.copy2(db_path, backup)
+        meta.write_text(
+            json.dumps(
+                {
+                    "old_marker": old_marker,
+                    "new_marker": new_marker,
+                    "created_at": datetime.now().isoformat(timespec="seconds"),
+                },
+                indent=2,
+                ensure_ascii=True,
+                sort_keys=True,
+            ) + "\n",
+            encoding="utf-8",
+        )
+    except Exception:
+        pass
+
+
 def _init_tables(conn: sqlite3.Connection):
     """Create tables if they don't exist."""
     conn.executescript("""
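Review note on `_write_embedding_model_marker` in the hunk above: the marker is stored with a SQLite upsert, so repeated writes keep a single row and the migration can compare the stored value with the current one. The sketch below runs the same table definition and upsert against an in-memory database. The marker strings are made up, `write_marker` and `read_marker` are hypothetical names, and the `ON CONFLICT` clause assumes SQLite 3.24 or newer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # allows row["value"] access, as in the diff
conn.execute("""
    CREATE TABLE IF NOT EXISTS embedding_model_state (
        key TEXT PRIMARY KEY,
        value TEXT NOT NULL,
        updated_at TEXT DEFAULT (datetime('now'))
    )
""")


def write_marker(conn, marker):
    """Insert the marker, or overwrite it if the key already exists."""
    conn.execute(
        """
        INSERT INTO embedding_model_state (key, value, updated_at)
        VALUES ('embedding_model_marker', ?, datetime('now'))
        ON CONFLICT(key) DO UPDATE SET
            value = excluded.value,
            updated_at = excluded.updated_at
        """,
        (marker,),
    )
    conn.commit()


def read_marker(conn):
    """Return the stored marker, or an empty string when none is stored."""
    row = conn.execute(
        "SELECT value FROM embedding_model_state WHERE key = 'embedding_model_marker'"
    ).fetchone()
    return row["value"] if row else ""


empty_before = read_marker(conn)
write_marker(conn, "model-a|384")  # made-up marker strings
write_marker(conn, "model-b|384")
count = conn.execute("SELECT COUNT(*) FROM embedding_model_state").fetchone()[0]
print(count, read_marker(conn))
```

A stored marker that differs from the current one is what triggers the backup and re-embedding path in `_auto_migrate_embeddings`.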
@@ -558,6 +650,8 @@ def _get_model():
     """Lazy-load fastembed TextEmbedding model."""
     global _model
     if _model is None:
+        if _model_download_disabled():
+            raise RuntimeError("cognitive model loading disabled for this environment")
         from local_models import build_fastembed_embedding
 
         _model = build_fastembed_embedding("bge-base-embeddings")
@@ -577,6 +671,22 @@ def _get_reranker():
     return _reranker if _reranker is not False else None
 
 
+def _model_download_disabled() -> bool:
+    return os.environ.get("NEXO_SKIP_COGNITIVE_MODEL_DOWNLOAD", "").strip().lower() in {"1", "true", "yes"}
+
+
+def _deterministic_fallback_embedding(text: str) -> np.ndarray:
+    """Return a stable vector for tests/offline fallback paths."""
+    digest = hashlib.sha256(str(text or "").encode("utf-8", errors="ignore")).digest()
+    arr = np.zeros(EMBEDDING_DIM, dtype=np.float32)
+    for index, byte in enumerate(digest):
+        arr[index] = (float(byte) / 255.0) - 0.5
+    norm = np.linalg.norm(arr)
+    if norm > 0:
+        arr = arr / norm
+    return arr.astype(np.float32)
+
+
 def rerank_results(query: str, results: list[dict], top_k: int = 5) -> list[dict]:
     """Rerank search results using cross-encoder for precise top-k.
 
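Review note on `_deterministic_fallback_embedding` in the hunk above: when `NEXO_SKIP_COGNITIVE_MODEL_DOWNLOAD` is set, `embed()` returns a vector derived from a SHA-256 digest instead of loading the model. The sketch below is a NumPy-free re-implementation of the same steps, written only to show the properties of that vector. `fallback_embedding` is a hypothetical name, and `EMBEDDING_DIM = 384` is assumed from the default in the diff.

```python
import hashlib
import math

EMBEDDING_DIM = 384  # assumed; matches the default returned in the diff


def fallback_embedding(text):
    """Hash-derived unit vector; stable across runs but not semantic."""
    digest = hashlib.sha256(str(text or "").encode("utf-8", errors="ignore")).digest()
    vec = [0.0] * EMBEDDING_DIM
    # A SHA-256 digest has 32 bytes, so only the first 32 slots are filled.
    for index, byte in enumerate(digest):
        vec[index] = byte / 255.0 - 0.5
    norm = math.sqrt(sum(x * x for x in vec))
    if norm > 0:
        vec = [x / norm for x in vec]
    return vec


a = fallback_embedding("deploy problems")
b = fallback_embedding("deploy problems")
c = fallback_embedding("lunch order")
print(a == b, a == c, sum(1 for x in a if x != 0.0))
```

The same text always yields the same vector, which suits tests and offline runs. Because the values come from a hash, similarity between two such vectors says nothing about meaning.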
@@ -603,9 +713,11 @@ def rerank_results(query: str, results: list[dict], top_k: int = 5) -> list[dict
 
 
 def embed(text: str) -> np.ndarray:
-    """Embed text into a
+    """Embed text into a float32 vector. Returns zeros for empty text."""
     if not text or not text.strip():
         return np.zeros(EMBEDDING_DIM, dtype=np.float32)
+    if _model_download_disabled():
+        return _deterministic_fallback_embedding(text)
     model = _get_model()
     embeddings = list(model.embed([text]))
     return np.array(embeddings[0], dtype=np.float32)