npm - nexo-brain - Versions diffs - 7.8.1 → 7.9.0 - Mend

nexo-brain 7.8.1 → 7.9.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (10)

package/.claude-plugin/plugin.json +1 -1
package/README.md +5 -1
package/bin/nexo-brain.js +12 -0
package/package.json +1 -1
package/src/hooks/compact_session_resolver.py +135 -0
package/src/hooks/post_compact.py +17 -2
package/src/hooks/pre_compact.py +20 -1
package/src/semantic_reasoner.py +581 -0
package/src/semantic_router.py +452 -0
package/tool-enforcement-map.json +1 -1

package/.claude-plugin/plugin.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "nexo-brain",
-  "version": "7.8.1",
+  "version": "7.9.0",
   "description": "Local cognitive runtime for Claude Code \u2014 persistent memory, overnight learning, doctor diagnostics, personal scripts, recovery-aware jobs, startup preflight, and optional dashboard/power helper.",
   "author": {
     "name": "NEXO Brain",

package/README.md CHANGED

@@ -18,7 +18,11 @@
 [Watch the overview video](https://nexo-brain.com/watch/) · [Watch on YouTube](https://www.youtube.com/watch?v=i2lkGhKyVqI) · [Open the infographic](https://nexo-brain.com/assets/nexo-brain-infographic-v5.png)
-Version `7.8.1` is the current packaged-runtime line. Patch release that closes the last compaction-continuity gap Francisco flagged after v7.8.0: `pre-compact.sh` Layer 2 emergency auto-diary and Layer 3 `compaction_memory.record_auto_flush` now use the exact `TARGET_SID` resolved from `CLAUDE_SESSION_ID` instead of falling back to `ORDER BY last_update_epoch DESC LIMIT 1` ("latest active session"). In multi-conversation Desktop that fallback routinely wrote the emergency diary against the wrong conversation even though the main restore path was already exact-SID in v7.8.0. `last_diary_ts` is also scoped by `session_id` now. Fail-closed when no `CLAUDE_SESSION_ID` resolves. New behavioural tests drive the real shell script with two sessions in the DB to pin the invariant. Fixed a latent bash-escape bug in `pre-compact.sh` where a double-quoted string inside a Python comment silently closed the `python3 -c "..."` argument early — caught by adding the behavioural tests. Pytest 2092 passing (+2 new behavioural). No Desktop bump.
+Version `7.9.0` is the current packaged-runtime line. Minor release that ships the foundation of the semantic stack (router + reasoner + CLI) under the ONEPASS LLM Coverage plan, plus two product-bug fixes observed in the wild on 2026-04-23. New `src/semantic_router.py` exposes 18 named `decision_kinds` (13 textual + 5 code-aware) with a per-kind policy table and the layer chain `fast_local → semantic_reasoner → remote_fallback`. New `src/semantic_reasoner.py` adds Mode A (`multipass_local`: reuses the mDeBERTa pin with three prompt-perturbed passes + majority vote + 0.75 floor) and Mode B (`cached_llm`: wrapper over `call_model_raw` with a pid+uuid atomic-write 24h-TTL disk cache at `~/.nexo/runtime/operations/semantic-reasoner-cache.json`, SHA-256 keyed by `decision_kind` + normalized input, LRU-bounded at 2000 entries, corrupt entries dropped on read). New `scripts/semantic-classify.py` JSON-in JSON-out CLI lets external MCP clients (including the closed-source NEXO Desktop companion) query Brain as the single semantic authority. New `NEXO_SEMANTIC_REASONER` kill switch (`0`/`off`/`false`/`no`/`disable`/`disabled`) honours the plan mandate for a runtime opt-out separate from `NEXO_LOCAL_CLASSIFIER`. Bug fixes: `bin/nexo-brain.js` upgrade flow now copies `templates/` root the same way fresh install and same-version refresh already did (Maria iMac 7.1.10→7.8.1 upgrade had lost 27 core-prompts templates and broken post-update import verification); and `tool-enforcement-map.json` `nexo_startup.enforcement.inject_prompt` now instructs the model to preload the 13 `mcp__nexo__*` protocol tools via `ToolSearch` before calling `nexo_startup` when the host MCP client defers tool schemas (Claude Code with many MCPs installed). Audit-driven hardening: router/reasoner defensively use `getattr` over the `call_model_raw` module and add a trailing `except Exception` so provider errors degrade with `remote_error` instead of propagating; cache writes use pid+uuid tmp + `fsync` + `os.replace` to survive concurrent writers; `NEXO_SEMANTIC_REASONER_TTL` parse tolerates malformed values. Tests: +50 (22 router, 20 reasoner, 8 CLI). Per-site migration of existing callers (`session_end_intent`, `r14`, `r16`, `r17`, `r20`, `r34`, T4 gates, `tools_drive`, `nexo-followup-runner`) is explicitly deferred to follow-up patch releases and tracked as `NF-SEMANTIC-ROUTER-SITE-MIGRATION`; nothing in this release changes the behaviour of the existing callers. Companion coordinated release: NEXO Desktop v0.28.0.
+Previously in `7.8.2`: patch release that fixes the compact-hook observability gap Francisco flagged after v7.8.1: `hook_runs.session_id` was empty for 7 out of 8 recent compaction rows (and when populated it stored the raw Claude Code token instead of the NEXO sid), so per-session queries over `hook_runs` for compact events could not be joined back to the NEXO session that actually compacted. v7.8.2 adds `src/hooks/compact_session_resolver.py` with `resolve_nexo_sid(claude_session_id)`, which walks the same rails the shell already uses: `sessions.claude_session_id` match, then `session_claude_aliases.claude_session_id` (most recent `last_seen` wins), then the per-conversation sidecar under `runtime/data/compacting/<safe-claude-id>.txt`, then the legacy global sidecar for single-conversation setups. `src/hooks/pre_compact.py` and `src/hooks/post_compact.py` now call the resolver and store the real NEXO sid in `hook_runs.session_id`; both wrappers also stash `{claude_session_id, sid_source}` in `hook_runs.metadata` so "why is this row still empty?" has a one-query answer. Nine new tests in `tests/test_hook_runs_compact_sid_resolution.py` pin the five resolver rails (sessions / alias / sidecar / legacy / none), malformed-sidecar rejection, the pre- and post-compact wrapper end-to-end paths, and the empty-state wrapper rail so a clean audit trail is written even when nothing resolves. No Desktop bump.
+Previously in `7.8.1`: patch release that closed the last compaction-continuity gap Francisco flagged after v7.8.0: `pre-compact.sh` Layer 2 emergency auto-diary and Layer 3 `compaction_memory.record_auto_flush` now use the exact `TARGET_SID` resolved from `CLAUDE_SESSION_ID` instead of falling back to `ORDER BY last_update_epoch DESC LIMIT 1` ("latest active session"). In multi-conversation Desktop that fallback routinely wrote the emergency diary against the wrong conversation even though the main restore path was already exact-SID in v7.8.0. `last_diary_ts` is also scoped by `session_id` now. Fail-closed when no `CLAUDE_SESSION_ID` resolves. New behavioural tests drive the real shell script with two sessions in the DB to pin the invariant. Fixed a latent bash-escape bug in `pre-compact.sh` where a double-quoted string inside a Python comment silently closed the `python3 -c "..."` argument early — caught by adding the behavioural tests. Pytest 2092 passing (+2 new behavioural). No Desktop bump.
 Previously in `7.8.0`: minor release that closed the PostCompact continuity work Francisco requested after v7.7: `src/hooks/post_compact.py` is a real registered hook (part of the canonical 9-hook set, was 8), `pre-compact.sh` resolves the exact NEXO SID from `CLAUDE_SESSION_ID` instead of falling back to "latest active session" (that was actively wrong in multi-conversation Desktop), the sidecar moves from `/tmp` to `$NEXO_HOME/runtime/data/compacting-sid.txt` so two concurrent compactions on two conversations cannot race on `/tmp`, `post-compact.sh` removes its "latest checkpoint" fallback (fail-closed to a diagnostic systemMessage instead of restoring the wrong conversation), and the hook cross-checks the sidecar SID against the env-resolved one so a "SID mismatch" is logged as such. Pre- and post-compact now emit NDJSON events the engine drains on every periodic tick via `_consume_pending_hook_events()`; the queue file is truncated after read so an event never fires twice. A new contract test (`tests/test_v78_compaction_continuity.py`) pins 11 invariants across ten rails including the hook registration, the exact-SID resolution path, fail-closed behaviour, and that `compaction_count` only increments on real restore. Pytest 2086 passing (+16 vs v7.7). No Desktop bump — v0.27.0 continues to ship.
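
The 7.9.0 README text lists the values that switch the semantic reasoner off through `NEXO_SEMANTIC_REASONER`. The code that reads the variable is not part of the hunks shown here, so the following is only a minimal sketch of how such a kill switch could be parsed. The helper name `reasoner_enabled` is hypothetical, and two behaviours are assumptions rather than facts from the release: matching is case-insensitive, and an unset variable leaves the reasoner enabled.

```python
import os

# Values the 7.9.0 README lists as disabling the reasoner.
_DISABLED_VALUES = frozenset({"0", "off", "false", "no", "disable", "disabled"})


def reasoner_enabled(env=None):
    """Return False when NEXO_SEMANTIC_REASONER holds a disabling value.

    Assumptions (not confirmed by the diff): matching ignores case and
    surrounding whitespace; an unset or empty variable keeps the
    reasoner enabled.
    """
    env = os.environ if env is None else env
    raw = str(env.get("NEXO_SEMANTIC_REASONER", "")).strip().lower()
    return raw not in _DISABLED_VALUES
```

Passing a plain dict instead of `os.environ` keeps the check easy to exercise in tests, for example `reasoner_enabled({"NEXO_SEMANTIC_REASONER": "off"})`.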

package/bin/nexo-brain.js CHANGED

@@ -2127,6 +2127,18 @@ async function main() {
         writeRuntimeCoreArtifactsManifest(NEXO_HOME, srcDir);
         log("  Scripts updated.");
+        // Update templates/ root (core-prompts/, CLAUDE.md.template, etc.) — recursive
+        // Managed surface: copyDirRec overwrites without diffing, so any
+        // hand-edited template under ~/.nexo/templates/ is replaced on
+        // upgrade. Keep local forks under personal/ or outside the runtime
+        // home to avoid silent loss.
+        const migTemplatesSrc = path.join(__dirname, "..", "templates");
+        const migTemplatesDest = path.join(NEXO_HOME, "templates");
+        if (fs.existsSync(migTemplatesSrc)) {
+          copyDirRec(migTemplatesSrc, migTemplatesDest);
+          log("  Templates updated (user-edited templates/ files are overwritten).");
+        }
         // Register ALL 8 core hooks in settings.json (additive — don't remove user's custom hooks)
         let settings = {};
         if (fs.existsSync(CLAUDE_SETTINGS)) {
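
The added comment in the hunk above warns that the upgrade overwrites hand-edited files under `templates/` without diffing. The body of `copyDirRec` is not part of this diff, so the sketch below is only a rough Python analogue of that overwrite-in-place behaviour, built on `shutil.copytree(..., dirs_exist_ok=True)`. The names `refresh_templates` and `demo` are hypothetical, and whether `copyDirRec` also keeps destination files that have no counterpart in the package is not shown here.

```python
import shutil
import tempfile
from pathlib import Path


def refresh_templates(src: Path, dest: Path) -> bool:
    """Copy src over dest, overwriting same-named files without diffing.

    Returns False (and copies nothing) when src is missing, mirroring
    the fs.existsSync guard in the hunk above.
    """
    if not src.is_dir():
        return False
    shutil.copytree(src, dest, dirs_exist_ok=True)  # requires Python 3.8+
    return True


def demo() -> str:
    """Show that a hand-edited template is replaced by the shipped one."""
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "package" / "templates"
        dest = Path(tmp) / "home" / "templates"
        src.mkdir(parents=True)
        dest.mkdir(parents=True)
        (src / "CLAUDE.md.template").write_text("shipped", encoding="utf-8")
        (dest / "CLAUDE.md.template").write_text("hand-edited", encoding="utf-8")
        refresh_templates(src, dest)
        return (dest / "CLAUDE.md.template").read_text(encoding="utf-8")
```

After `demo()` runs, the destination file holds the shipped content, which is the loss the comment tells users to avoid by keeping local forks outside the managed directory.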

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "nexo-brain",
-  "version": "7.8.1",
+  "version": "7.9.0",
   "mcpName": "io.github.wazionapps/nexo",
   "description": "NEXO Brain \u2014 Shared brain for AI agents. Persistent memory, semantic RAG, natural forgetting, metacognitive guard, trust scoring, 150+ MCP tools. Works with Claude Code, Codex, Claude Desktop & any MCP client. 100% local, free.",
   "homepage": "https://nexo-brain.com",

package/src/hooks/compact_session_resolver.py ADDED

@@ -0,0 +1,135 @@
+"""Resolve the NEXO sid for compaction hook observability.
+`hook_runs.session_id` must hold the NEXO sid (`nexo-NNNNNN-N`) so that a
+query like "every compaction of session X" works without joining on the
+raw Claude Code token. Pre-v7.8.2 the two Python wrappers stored
+`os.environ.get("CLAUDE_SESSION_ID", "")` directly, which produced two
+problems at once: rows with `session_id=''` when the env was missing,
+and rows with the raw Claude token (not a NEXO sid) when it was
+present. This helper centralises the resolution against the same rails
+the shell scripts use.
+Resolution order:
+  1. ENV `CLAUDE_SESSION_ID` with `sessions.claude_session_id` match.
+  2. ENV `CLAUDE_SESSION_ID` with `session_claude_aliases.claude_session_id`
+     match (most recent `last_seen` wins).
+  3. Per-conversation sidecar written by pre-compact.sh at
+     the compacting folder under the runtime data dir.
+  4. Legacy global sidecar at compacting-sid.txt (single-conv legacy path).
+Returns (nexo_sid, source) so the caller can stash `source` in the
+`hook_runs.metadata` JSON for debugging "why is this row still empty".
+"""
+from __future__ import annotations
+import os
+import re
+import sqlite3
+from pathlib import Path
+_NEXO_SID_RE = re.compile(r"^nexo-[0-9]+-[0-9]+$")
+_SAFE_CLAUDE_ID_RE = re.compile(r"[^a-zA-Z0-9._-]")
+def _nexo_home() -> Path:
+    return Path(os.environ.get("NEXO_HOME", str(Path.home() / ".nexo")))
+def _candidate_data_dirs() -> list[Path]:
+    home = _nexo_home()
+    dirs: list[Path] = []
+    for cand in (home / "runtime" / "data", home / "data"):
+        if cand not in dirs:
+            dirs.append(cand)
+    return dirs
+def _db_path() -> Path | None:
+    for d in _candidate_data_dirs():
+        p = d / "nexo.db"
+        if p.is_file():
+            return p
+    return None
+def _safe_claude_id(claude_session_id: str) -> str:
+    return _SAFE_CLAUDE_ID_RE.sub("_", claude_session_id or "")
+def _read_sidecar(path: Path) -> str:
+    try:
+        text = path.read_text(encoding="utf-8").strip()
+    except Exception:
+        return ""
+    return text if _NEXO_SID_RE.match(text) else ""
+def _db_lookup(claude_session_id: str) -> tuple[str, str]:
+    if not claude_session_id:
+        return "", ""
+    db = _db_path()
+    if db is None:
+        return "", ""
+    try:
+        conn = sqlite3.connect(str(db), timeout=3)
+    except Exception:
+        return "", ""
+    try:
+        try:
+            row = conn.execute(
+                "SELECT sid FROM sessions WHERE claude_session_id = ? LIMIT 1",
+                (claude_session_id,),
+            ).fetchone()
+        except Exception:
+            row = None
+        if row and row[0] and _NEXO_SID_RE.match(row[0]):
+            return row[0], "sessions"
+        try:
+            row = conn.execute(
+                "SELECT sid FROM session_claude_aliases "
+                "WHERE claude_session_id = ? "
+                "ORDER BY last_seen DESC LIMIT 1",
+                (claude_session_id,),
+            ).fetchone()
+        except Exception:
+            row = None
+        if row and row[0] and _NEXO_SID_RE.match(row[0]):
+            return row[0], "alias"
+    finally:
+        try:
+            conn.close()
+        except Exception:
+            pass
+    return "", ""
+def resolve_nexo_sid(claude_session_id: str = "") -> tuple[str, str]:
+    """Resolve the NEXO sid for the current compaction invocation.
+    Returns ``(nexo_sid, source)`` where ``source`` is one of:
+    - ``sessions``       resolved via sessions table claude_session_id.
+    - ``alias``          resolved via session_claude_aliases.
+    - ``sidecar``        per-conversation sidecar file.
+    - ``sidecar_legacy`` legacy global sidecar (single-conv path).
+    - ``none``           no rail matched; caller stores empty string.
+    """
+    token = (claude_session_id or os.environ.get("CLAUDE_SESSION_ID", "") or "").strip()
+    if token:
+        sid, source = _db_lookup(token)
+        if sid:
+            return sid, source
+        safe_id = _safe_claude_id(token)
+        if safe_id:
+            for base in _candidate_data_dirs():
+                side = _read_sidecar(base / "compacting" / f"{safe_id}.txt")
+                if side:
+                    return side, "sidecar"
+    for base in _candidate_data_dirs():
+        side = _read_sidecar(base / "compacting-sid.txt")
+        if side:
+            return side, "sidecar_legacy"
+    return "", "none"
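
The sidecar rails in the resolver above rest on two small checks: the Claude session id is sanitised before it becomes a file name, and the file content is accepted only when it looks like a NEXO sid. The sketch below isolates those two checks with the same regular expressions as the diff. The names `sidecar_path`, `read_sidecar` and `demo` are hypothetical stand-ins for the private helpers, and the sample ids are invented for illustration.

```python
import re
import tempfile
from pathlib import Path

# Same patterns as _NEXO_SID_RE and _SAFE_CLAUDE_ID_RE in the diff.
NEXO_SID_RE = re.compile(r"^nexo-[0-9]+-[0-9]+$")
UNSAFE_CHARS_RE = re.compile(r"[^a-zA-Z0-9._-]")


def sidecar_path(data_dir: Path, claude_session_id: str) -> Path:
    """Build the per-conversation sidecar path from a sanitised id."""
    safe = UNSAFE_CHARS_RE.sub("_", claude_session_id)
    return data_dir / "compacting" / f"{safe}.txt"


def read_sidecar(path: Path) -> str:
    """Return the sid stored in the sidecar, or "" if missing or malformed."""
    try:
        text = path.read_text(encoding="utf-8").strip()
    except (OSError, ValueError):
        return ""
    return text if NEXO_SID_RE.match(text) else ""


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        data_dir = Path(tmp)
        # Path separators in the raw id are replaced, so the name stays flat.
        good = sidecar_path(data_dir, "abc/../123")
        good.parent.mkdir(parents=True)
        good.write_text("nexo-1714000000-2\n", encoding="utf-8")
        bad = sidecar_path(data_dir, "other")
        bad.write_text("not-a-sid", encoding="utf-8")
        return good.name, read_sidecar(good), read_sidecar(bad)
```

Rejecting malformed content at read time means a stale or truncated sidecar falls through to the next rail instead of writing a bogus value into `hook_runs.session_id`.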

package/src/hooks/post_compact.py CHANGED

@@ -22,15 +22,30 @@ from pathlib import Path
 _DIR = Path(__file__).resolve().parent
-def _record(duration_ms: int, exit_code: int, session_id: str) -> None:
+def _record(duration_ms: int, exit_code: int, claude_session_id: str) -> None:
+    """Log a hook_runs row with the resolved NEXO sid.
+    v7.8.2 — see the matching docstring in pre_compact.py. Post-compact
+    runs after the shell script has already consumed the per-conv
+    sidecar, but the DB rails (sessions/aliases) stay valid, so the
+    resolver still returns a sid in the common case. `sid_source` goes
+    into metadata for empty-row triage.
+    """
     try:
         sys.path.insert(0, str(_DIR.parent))
+        sys.path.insert(0, str(_DIR))
         import hook_observability  # type: ignore
+        from compact_session_resolver import resolve_nexo_sid  # type: ignore
+        nexo_sid, sid_source = resolve_nexo_sid(claude_session_id)
         hook_observability.record_hook_run(
             "post_compact",
             duration_ms=duration_ms,
             exit_code=exit_code,
-            session_id=session_id,
+            session_id=nexo_sid,
+            metadata={
+                "claude_session_id": claude_session_id,
+                "sid_source": sid_source,
+            },
         )
     except Exception:
         pass

package/src/hooks/pre_compact.py CHANGED

@@ -18,14 +18,33 @@ _DIR = Path(__file__).resolve().parent
 def _record(duration_ms: int, exit_code: int) -> None:
+    """Log a hook_runs row with the resolved NEXO sid.
+    v7.8.2 — the raw `CLAUDE_SESSION_ID` env token is not a NEXO sid, so
+    storing it in `hook_runs.session_id` made per-session queries useless
+    and left the column empty whenever Claude Code did not forward the
+    env. `compact_session_resolver.resolve_nexo_sid` walks the same
+    rails the shell script uses (sessions → aliases → per-conv sidecar
+    → legacy global sidecar) and returns `(nexo_sid, source)`. The raw
+    Claude token and the resolution source end up in `metadata` so an
+    operator can debug why a given row is still empty.
+    """
     try:
         sys.path.insert(0, str(_DIR.parent))
+        sys.path.insert(0, str(_DIR))
         import hook_observability  # type: ignore
+        from compact_session_resolver import resolve_nexo_sid  # type: ignore
+        claude_id = os.environ.get("CLAUDE_SESSION_ID", "")
+        nexo_sid, sid_source = resolve_nexo_sid(claude_id)
         hook_observability.record_hook_run(
             "pre_compact",
             duration_ms=duration_ms,
             exit_code=exit_code,
-            session_id=os.environ.get("CLAUDE_SESSION_ID", ""),
+            session_id=nexo_sid,
+            metadata={
+                "claude_session_id": claude_id,
+                "sid_source": sid_source,
+            },
         )
     except Exception:
         pass
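
Both wrappers now store `sid_source` in `hook_runs.metadata`, which is what makes empty rows easy to triage. The sketch below shows one way to count compaction rows per resolution source. The real `hook_runs` schema is not part of this diff, so the table layout is an assumption: only `session_id` and `metadata` are named in the source, while the `hook_name` column and the function names are hypothetical. Metadata is assumed to be stored as JSON text.

```python
import json
import sqlite3
from collections import Counter


def sid_source_counts(conn: sqlite3.Connection) -> dict:
    """Count compact-hook rows by the sid_source stored in metadata."""
    rows = conn.execute(
        "SELECT metadata FROM hook_runs "
        "WHERE hook_name IN ('pre_compact', 'post_compact')"
    ).fetchall()
    counts = Counter()
    for (raw,) in rows:
        try:
            meta = json.loads(raw or "{}")
        except ValueError:
            meta = {}
        source = meta.get("sid_source") if isinstance(meta, dict) else None
        counts[source or "missing"] += 1
    return dict(counts)


def demo() -> dict:
    conn = sqlite3.connect(":memory:")
    # Hypothetical minimal schema, for illustration only.
    conn.execute(
        "CREATE TABLE hook_runs (hook_name TEXT, session_id TEXT, metadata TEXT)"
    )
    conn.executemany(
        "INSERT INTO hook_runs VALUES (?, ?, ?)",
        [
            ("pre_compact", "nexo-1-1", json.dumps({"sid_source": "sessions"})),
            ("post_compact", "nexo-1-1", json.dumps({"sid_source": "alias"})),
            ("pre_compact", "", json.dumps({"sid_source": "none"})),
            ("pre_compact", "", None),  # row written before metadata existed
        ],
    )
    try:
        return sid_source_counts(conn)
    finally:
        conn.close()
```

Parsing the JSON in Python rather than in SQL keeps the sketch independent of whether the local SQLite build ships the JSON functions.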

package/src/semantic_reasoner.py ADDED

@@ -0,0 +1,581 @@
+"""semantic_reasoner — second-layer semantic decision maker.
+Plan ONEPASS LLM Coverage. Called through ``src/semantic_router.py``.
+Exposes a single ``reason()`` entrypoint with two modes:
+    Mode A  — ``multipass_local``  (textual decision kinds)
+        Reuses the already-pinned ``LocalZeroShotClassifier`` (see
+        ``docs/classifier-model-notes.md``) but with stricter behaviour:
+        three inference passes with mild prompt perturbations, then
+        majority vote across passes. A decision is only accepted if at
+        least two of three passes agree AND the agreed confidence is
+        above the stricter threshold. This kills single-pass false
+        positives without adding a new model dependency.
+    Mode B  — ``cached_llm``  (code-aware decision kinds)
+        Thin wrapper around ``call_model_raw`` with a disk cache scoped
+        by (decision_kind, sha256(normalized_prompt)). TTL = 24h. The
+        cache lives under ``~/.nexo/runtime/operations/semantic-reasoner-cache.json``
+        alongside the existing classifier install state. Cache hits
+        return instantly and are flagged in ``meta.cache_hit``. Misses
+        call the LLM; the response and its normalized verdict are
+        written back to the cache atomically.
+Pin notes: this module does not introduce a new downloaded model.
+Mode A reuses ``MODEL_ID``/``MODEL_REVISION`` from ``classifier_local``.
+Mode B resolves the LLM through the standard resonance map with
+``caller='semantic_reasoner'`` and ``tier='muy_bajo'``; the pin lives
+in ``resonance_map`` like every other LLM caller.
+See ``docs/semantic-reasoner-model-notes.md`` for the rationale behind
+this "upgrade-in-place, pin-by-reuse" strategy, and why a dedicated
+stronger local LLM (Llama 3.1 8B, etc.) is explicitly deferred to a
+future release.
+"""
+from __future__ import annotations
+import hashlib
+import json
+import logging
+import os
+import re
+import time
+from dataclasses import dataclass, field
+from pathlib import Path
+from typing import Any
+_logger = logging.getLogger(__name__)
+# ---------------------------------------------------------------------------
+# Shared dataclass imported from the router
+# ---------------------------------------------------------------------------
+def _import_router_result():
+    """Lazy import to avoid circular dependency on semantic_router."""
+    from semantic_router import RouterResult
+    return RouterResult
+# ---------------------------------------------------------------------------
+# Mode A — multi-pass local
+# ---------------------------------------------------------------------------
+_PROMPT_PERTURBATIONS: tuple[str, ...] = (
+    "{q}",
+    "Decide: {q}",
+    "Classify this utterance: {q}",
+)
+def _collect_local_votes(
+    question: str, labels: tuple[str, ...]
+) -> list[tuple[str, float, dict[str, float]]]:
+    """Run the local classifier three times with mild prompt variations.
+    Returns a list of ``(label, confidence, scores)`` triples. Any
+    pass that fails silently returns a zero-confidence entry so the
+    vote aggregator can still detect quorum problems.
+    """
+    try:
+        from classifier_local import LocalZeroShotClassifier
+    except Exception as exc:  # pragma: no cover
+        _logger.debug("semantic_reasoner: classifier_local unavailable (%s)", exc)
+        return []
+    clf = LocalZeroShotClassifier(confidence_floor=0.0)
+    votes: list[tuple[str, float, dict[str, float]]] = []
+    for template in _PROMPT_PERTURBATIONS:
+        prompt = template.format(q=question)
+        result = clf.classify(prompt, labels)
+        if result is None:
+            votes.append(("", 0.0, {}))
+            continue
+        votes.append((result.label, float(result.confidence), dict(result.scores)))
+    return votes
+def _aggregate_votes(
+    votes: list[tuple[str, float, dict[str, float]]],
+    confidence_floor: float,
+) -> tuple[str | None, float, dict[str, Any]]:
+    """Majority vote across passes. Returns (label_or_none, confidence, meta)."""
+    if not votes:
+        return None, 0.0, {"reason": "no_votes"}
+    counts: dict[str, int] = {}
+    confidences: dict[str, list[float]] = {}
+    for label, confidence, _scores in votes:
+        if not label:
+            continue
+        counts[label] = counts.get(label, 0) + 1
+        confidences.setdefault(label, []).append(confidence)
+    if not counts:
+        return None, 0.0, {"reason": "all_passes_failed", "votes": len(votes)}
+    best_label = max(counts, key=lambda lbl: (counts[lbl], max(confidences[lbl])))
+    vote_count = counts[best_label]
+    avg_confidence = sum(confidences[best_label]) / len(confidences[best_label])
+    meta: dict[str, Any] = {
+        "votes_total": len(votes),
+        "votes_for_best": vote_count,
+        "avg_confidence": round(avg_confidence, 4),
+        "per_label_counts": dict(counts),
+    }
+    if vote_count < 2:
+        meta["reason"] = "no_majority"
+        return None, avg_confidence, meta
+    if avg_confidence < confidence_floor:
+        meta["reason"] = "below_threshold"
+        return None, avg_confidence, meta
+    return best_label, avg_confidence, meta
+def _reason_multipass_local(
+    *,
+    decision_kind: str,
+    question: str,
+    labels: tuple[str, ...] | None,
+    confidence_floor: float,
+):
+    RouterResult = _import_router_result()
+    if not labels:
+        return RouterResult(
+            ok=False,
+            decision_kind=decision_kind,
+            route_used="semantic_reasoner",
+            degraded=True,
+            error="multipass_local requires labels",
+        )
+    votes = _collect_local_votes(question, labels)
+    label, confidence, meta = _aggregate_votes(votes, confidence_floor)
+    if label is None:
+        return RouterResult(
+            ok=False,
+            decision_kind=decision_kind,
+            route_used="semantic_reasoner",
+            degraded=True,
+            error=meta.get("reason", "aggregation_failed"),
+            meta={"mode": "multipass_local", "aggregate": meta},
+        )
+    return RouterResult(
+        ok=True,
+        decision_kind=decision_kind,
+        verdict=label,
+        label=label,
+        confidence=round(float(confidence), 4),
+        route_used="semantic_reasoner",
+        degraded=False,
+        meta={"mode": "multipass_local", "aggregate": meta},
+    )
+# ---------------------------------------------------------------------------
+# Mode B — cached LLM
+# ---------------------------------------------------------------------------
+_DEFAULT_CACHE_TTL_SECONDS = 24 * 3600
+def _cache_path() -> Path:
+    """Resolve the on-disk cache location.
+    Reuses ``paths.operations_dir()`` so the reasoner state lives next to
+    the existing ``classifier-install-state.json``. If ``paths`` is not
+    importable (heavy module; test context), fall back to a deterministic
+    location under ``NEXO_HOME``.
+    """
+    override = os.environ.get("NEXO_SEMANTIC_REASONER_CACHE_PATH", "").strip()
+    if override:
+        return Path(override)
+    try:
+        import paths
+        return paths.operations_dir() / "semantic-reasoner-cache.json"
+    except Exception:
+        home = os.environ.get("NEXO_HOME", "").strip()
+        root = Path(home) if home else Path.home() / ".nexo"
+        return root / "runtime" / "operations" / "semantic-reasoner-cache.json"
+def _normalize_for_hash(text: str) -> str:
+    """Normalise whitespace/case so equivalent prompts hit the same cache
+    entry. Does not touch content semantics beyond whitespace collapse."""
+    return re.sub(r"\s+", " ", (text or "").strip().lower())
+def _cache_key(
+    *,
+    decision_kind: str,
+    question: str,
+    labels: tuple[str, ...] | None,
+    context: str,
+) -> str:
+    payload = json.dumps(
+        {
+            "kind": decision_kind,
+            "q": _normalize_for_hash(question),
+            "ctx": _normalize_for_hash(context)[:400],
+            "labels": list(labels) if labels else [],
+        },
+        sort_keys=True,
+        ensure_ascii=False,
+    )
+    return hashlib.sha256(payload.encode("utf-8")).hexdigest()
+def _read_cache() -> dict[str, Any]:
+    try:
+        path = _cache_path()
+        if not path.is_file():
+            return {}
+        data = json.loads(path.read_text() or "{}")
+        if isinstance(data, dict):
+            return data
+    except Exception as exc:  # pragma: no cover — corrupt cache
+        _logger.warning("semantic_reasoner: cache read failed (%s); starting fresh", exc)
+    return {}
+def _write_cache(cache: dict[str, Any]) -> None:
+    """Atomic write with pid+uuid suffix so concurrent Brain / Desktop CLI
+    writers do not stomp each other's temp file."""
+    try:
+        path = _cache_path()
+        path.parent.mkdir(parents=True, exist_ok=True)
+        import os as _os
+        import uuid as _uuid
+        tmp = path.with_name(
+            f"{path.name}.tmp.{_os.getpid()}.{_uuid.uuid4().hex[:8]}"
+        )
+        payload = json.dumps(cache, ensure_ascii=False, sort_keys=True)
+        with open(tmp, "w", encoding="utf-8") as handle:
+            handle.write(payload)
+            handle.flush()
+            try:
+                _os.fsync(handle.fileno())
+            except OSError:
+                pass
+        _os.replace(tmp, path)
+    except Exception as exc:  # pragma: no cover
+        _logger.warning("semantic_reasoner: cache write failed (%s)", exc)
+def _cache_get(key: str, ttl_seconds: int) -> dict[str, Any] | None:
+    cache = _read_cache()
+    entry = cache.get(key)
+    if not isinstance(entry, dict):
+        return None
+    ts = float(entry.get("ts", 0.0) or 0.0)
+    if ts <= 0.0:
+        return None
+    if (time.time() - ts) > ttl_seconds:
+        return None
+    return entry
+def _cache_put(key: str, entry: dict[str, Any]) -> None:
+    cache = _read_cache()
+    cache[key] = {**entry, "ts": time.time()}
+    if len(cache) > 2000:
+        # Keep the 1800 most-recent entries to avoid unbounded growth. The
+        # bound is advisory; callers should keep reasoner prompts small.
+        items = sorted(cache.items(), key=lambda kv: float(kv[1].get("ts", 0.0) or 0.0))
+        cache = dict(items[-1800:])
+    _write_cache(cache)
+def _parse_ttl_env() -> int:
+    """Read ``NEXO_SEMANTIC_REASONER_TTL`` defensively.
+    Malformed values (non-integer, negative) fall back to the default so
+    operator typos never crash the reasoner on first call.
+    """
+    raw = os.environ.get("NEXO_SEMANTIC_REASONER_TTL", "")
+    if not raw:
+        return _DEFAULT_CACHE_TTL_SECONDS
+    try:
+        parsed = int(raw)
+    except (TypeError, ValueError):
+        _logger.warning(
+            "semantic_reasoner: invalid NEXO_SEMANTIC_REASONER_TTL=%r; "
+            "using default %d",
+            raw,
+            _DEFAULT_CACHE_TTL_SECONDS,
+        )
+        return _DEFAULT_CACHE_TTL_SECONDS
+    if parsed <= 0:
+        return _DEFAULT_CACHE_TTL_SECONDS
+    return parsed
+def _reason_cached_llm(
+    *,
+    decision_kind: str,
+    question: str,
+    labels: tuple[str, ...] | None,
+    context: str,
+    confidence_floor: float,
+):
+    RouterResult = _import_router_result()
+    ttl = _parse_ttl_env()
+    key = _cache_key(
+        decision_kind=decision_kind,
+        question=question,
+        labels=labels,
+        context=context,
+    )
+    cached = _cache_get(key, ttl)
+    if cached is not None:
+        cached_verdict = cached.get("verdict")
+        if isinstance(cached_verdict, str) and cached_verdict.strip():
+            return RouterResult(
+                ok=True,
+                decision_kind=decision_kind,
+                verdict=cached_verdict,
+                label=cached_verdict,
+                confidence=float(cached.get("confidence", 0.6)),
+                route_used="semantic_reasoner",
+                degraded=False,
+                meta={
+                    "mode": "cached_llm",
+                    "cache_hit": True,
+                    "cache_key": key[:12],
+                },
+            )
+        # Corrupt entry (verdict missing or non-string). Drop it and fall
+        # through to a live call so the caller is never handed a cached
+        # "ok=True, verdict=None" sentinel.
+        _logger.warning(
+            "semantic_reasoner: dropping corrupt cache entry for key=%s",
+            key[:12],
+        )
+    try:
+        import call_model_raw as _cmr
+    except Exception as exc:  # pragma: no cover
+        return RouterResult(
+            ok=False,
+            decision_kind=decision_kind,
+            route_used="semantic_reasoner",
+            degraded=True,
+            error=f"call_model_raw unavailable: {exc}",
+            meta={"mode": "cached_llm", "cache_hit": False},
+        )
+    call_model_raw_fn = getattr(_cmr, "call_model_raw", None)
+    classifier_unavailable_cls = getattr(
+        _cmr, "ClassifierUnavailableError", Exception
+    )
+    if call_model_raw_fn is None:
+        return RouterResult(
+            ok=False,
+            decision_kind=decision_kind,
+            route_used="semantic_reasoner",
+            degraded=True,
+            error="call_model_raw callable missing",
+            meta={"mode": "cached_llm", "cache_hit": False},
+        )
+    prompt = _build_reasoner_prompt(
+        decision_kind=decision_kind,
+        question=question,
+        labels=labels,
+        context=context,
+    )
+    system = (
+        "You are NEXO's code-aware semantic reasoner. Answer with the "
+        "single best label from the provided list (no prose). If no "
+        "label fits, answer 'unknown'."
+    )
+    try:
+        raw = call_model_raw_fn(
+            prompt,
+            system=system,
+            caller="semantic_reasoner",
+            tier="muy_bajo",
+            max_tokens=32,
+            temperature=0.0,
+        )
+    except classifier_unavailable_cls as exc:
+        return RouterResult(
+            ok=False,
+            decision_kind=decision_kind,
+            route_used="semantic_reasoner",
+            degraded=True,
+            error=f"remote_unavailable: {exc}",
+            meta={"mode": "cached_llm", "cache_hit": False},
+        )
+    except Exception as exc:  # noqa: BLE001 — fail-closed
+        return RouterResult(
+            ok=False,
+            decision_kind=decision_kind,
+            route_used="semantic_reasoner",
+            degraded=True,
+            error=f"remote_error: {exc}",
+            meta={"mode": "cached_llm", "cache_hit": False},
+        )
+    verdict = _normalize_verdict(raw, labels)
+    if verdict is None:
+        return RouterResult(
+            ok=False,
+            decision_kind=decision_kind,
+            route_used="semantic_reasoner",
+            degraded=True,
+            error="llm_returned_unknown_or_unparseable",
+            meta={
+                "mode": "cached_llm",
+                "cache_hit": False,
+                "raw": (raw or "")[:80],
+            },
+        )
+    _cache_put(
+        key,
+        {
+            "verdict": verdict,
+            "confidence": max(confidence_floor, 0.6),
+            "decision_kind": decision_kind,
+        },
+    )
+    return RouterResult(
+        ok=True,
+        decision_kind=decision_kind,
+        verdict=verdict,
+        label=verdict,
+        confidence=max(confidence_floor, 0.6),
+        route_used="semantic_reasoner",
+        degraded=False,
+        meta={"mode": "cached_llm", "cache_hit": False, "cache_key": key[:12]},
+    )
+def _build_reasoner_prompt(
+    *,
+    decision_kind: str,
+    question: str,
+    labels: tuple[str, ...] | None,
+    context: str,
+) -> str:
+    parts = [
+        f"decision_kind: {decision_kind}",
+        f"question: {question}",
+    ]
+    if context:
+        parts.append(f"context: {context[:600]}")
+    if labels:
+        parts.append("candidate_labels: " + ", ".join(labels))
+        parts.append("Reply with exactly one of the labels above.")
+    else:
+        parts.append("Reply with the shortest phrase answering the question.")
+    return "\n".join(parts)
+def _normalize_verdict(
+    raw: str, labels: tuple[str, ...] | None
+) -> str | None:
+    text = (raw or "").strip().lower()
+    if not text:
+        return None
+    if text == "unknown":
+        return None
+    if labels:
+        for label in labels:
+            if label.lower() == text:
+                return label
+        for label in labels:
+            if label.lower() in text:
+                return label
+        return None
+    return text
+# ---------------------------------------------------------------------------
+# Public entrypoint
+# ---------------------------------------------------------------------------
+_REASONER_OFF_VALUES = {"0", "off", "false", "no", "disable", "disabled"}
+def _is_reasoner_disabled() -> bool:
+    """Honour the ``NEXO_SEMANTIC_REASONER`` runtime kill switch.
+    The plan (ONEPASS LLM Coverage) explicitly required an env opt-out
+    dedicated to the reasoner, separate from ``NEXO_LOCAL_CLASSIFIER``
+    (which only gates install-time provisioning). Operators who hit a
+    reasoner regression in production can set ``NEXO_SEMANTIC_REASONER=0``
+    to force every ``reason()`` call to refuse; the router then falls
+    through to ``remote_fallback`` on its own.
+    """
+    raw = os.environ.get("NEXO_SEMANTIC_REASONER", "").strip().lower()
+    return raw in _REASONER_OFF_VALUES
+def reason(
+    *,
+    decision_kind: str,
+    question: str,
+    labels: tuple[str, ...] | list[str] | None,
+    context: str = "",
+    mode: str = "multipass_local",
+    confidence_floor: float = 0.75,
+):
+    """Dispatch to the configured mode. Called by ``semantic_router.route``.
+    Returns a ``RouterResult``. The router knows how to keep going to
+    ``remote_fallback`` if this layer refuses.
+    """
+    RouterResult = _import_router_result()
+    if _is_reasoner_disabled():
+        return RouterResult(
+            ok=False,
+            decision_kind=decision_kind,
+            route_used="semantic_reasoner",
+            degraded=True,
+            error="reasoner_disabled_by_env",
+            meta={"env": "NEXO_SEMANTIC_REASONER"},
+        )
+    labels_tuple: tuple[str, ...] | None = tuple(labels) if labels else None
+    if mode == "multipass_local":
+        return _reason_multipass_local(
+            decision_kind=decision_kind,
+            question=question,
+            labels=labels_tuple,
+            confidence_floor=confidence_floor,
+        )
+    if mode == "cached_llm":
+        return _reason_cached_llm(
+            decision_kind=decision_kind,
+            question=question,
+            labels=labels_tuple,
+            context=context,
+            confidence_floor=confidence_floor,
+        )
+    return RouterResult(
+        ok=False,
+        decision_kind=decision_kind,
+        route_used="semantic_reasoner",
+        degraded=True,
+        error=f"unknown reasoner mode: {mode}",
+    )
+__all__ = ["reason"]
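
The label matching in `_normalize_verdict` above can be exercised on its own. The following is a minimal sketch, not the packaged module: `normalize_verdict` is a local copy of the rules shown in the diff, and the `safe`/`unsafe` labels are hypothetical values chosen for illustration.

```python
from __future__ import annotations


def normalize_verdict(raw: str | None, labels: tuple[str, ...] | None) -> str | None:
    """Local copy of the matching rules shown in ``_normalize_verdict``."""
    text = (raw or "").strip().lower()
    if not text or text == "unknown":
        return None
    if labels:
        # Pass 1: exact, case-insensitive match; the label's own casing is returned.
        for label in labels:
            if label.lower() == text:
                return label
        # Pass 2: the first listed label that occurs inside the reply wins.
        for label in labels:
            if label.lower() in text:
                return label
        return None
    # No label list: the lowercased free-text reply is the verdict.
    return text


labels = ("safe", "unsafe")  # hypothetical labels, not taken from the package

print(normalize_verdict(" Unsafe ", labels))               # unsafe
print(normalize_verdict("unknown", labels))                # None
print(normalize_verdict("The change is unsafe.", labels))  # safe
```

The last call returns `safe` because the substring pass walks the labels in order and `safe` occurs inside `unsafe`. With these rules, label order seems to matter whenever one label is a substring of another and the model replies with anything other than a bare label.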

package/src/semantic_router.py ADDED

@@ -0,0 +1,452 @@
+"""semantic_router — Plan ONEPASS LLM Coverage.
+Central router for every model-backed semantic decision in NEXO Brain. Call
+sites declare a *decision_kind* and pass question/context; the router
+applies the policy for that kind and dispatches through the stack:
+    fast_local  ->  semantic_reasoner  ->  remote_fallback
+Design contract (from ~/Desktop/NEXO-ONEPASS-LLM-COVERAGE-RELEASE-PLAN.md):
+- Brain owns the semantic contract, model pins and routing policy.
+- Every call site passes a *named* decision_kind; policy lives here, not in
+  the caller. This replaces the previous pattern where each caller invented
+  its own policy tree.
+- The existing ``LocalZeroShotClassifier`` stays as the cheap multilingual
+  first pass (``fast_local``).
+- ``semantic_reasoner`` is the second, stronger layer. Its implementation
+  lives in ``src/semantic_reasoner.py`` with two modes: Mode A (strict
+  multi-pass over the same local classifier with tighter thresholds) and
+  Mode B (LLM-cached reasoner for code-aware decisions).
+- ``remote_fallback`` is the existing ``call_model_raw`` chain. It is no
+  longer the default path for local-friendly decisions; it only fires if
+  the upstream layers refuse or degrade.
+The router returns a ``RouterResult`` dataclass so callers can inspect
+which route was used, whether degraded mode is active, and what confidence
+the decision carries. This is also what Desktop will consume via the
+``brain-semantic-router.js`` bridge shipped in the companion PR.
+"""
+from __future__ import annotations
+import logging
+from dataclasses import dataclass, field
+from typing import Any
+_logger = logging.getLogger(__name__)
+# ---------------------------------------------------------------------------
+# Contract dataclasses
+# ---------------------------------------------------------------------------
+@dataclass
+class RouterResult:
+    """Outcome of a ``route()`` call.
+    Fields match the minimum contract documented in the plan (section
+    "Minimum router output contract"):
+      - ``ok``: overall success (at least one layer produced a decision)
+      - ``decision_kind``: the kind the caller passed
+      - ``verdict``: the chosen label when the caller used zero-shot
+        classification; None when the underlying layer returned free text
+      - ``label``: alias for ``verdict`` to match the plan's wording; kept
+        consistent to simplify Desktop bridge mapping
+      - ``confidence``: [0.0, 1.0]
+      - ``route_used``: one of ``fast_local``, ``semantic_reasoner``,
+        ``remote_fallback``, or ``no_route`` when every layer refused
+      - ``degraded``: True when the chosen layer could not meet its normal
+        bar (fallback fired, stricter threshold not met, cache-only, etc.)
+      - ``error``: short human-readable reason when ``ok`` is False
+      - ``meta``: free-form layer-specific evidence (scores dict, cache
+        key, latency, model id) — Desktop uses it for telemetry
+    """
+    ok: bool
+    decision_kind: str
+    verdict: str | None = None
+    label: str | None = None
+    confidence: float = 0.0
+    route_used: str = "no_route"
+    degraded: bool = False
+    error: str | None = None
+    meta: dict[str, Any] = field(default_factory=dict)
+# ---------------------------------------------------------------------------
+# Decision kinds + policy table
+# ---------------------------------------------------------------------------
+#
+# The plan enumerates 18 decision_kinds that need to route through here. They
+# fall into two families:
+#
+#   TEXTUAL     — the first-line local classifier is good enough; the
+#                 reasoner adds a stricter multi-pass check for ambiguous
+#                 cases. Remote is only a last-resort safety net.
+#
+#   CODE_AWARE  — the fast local classifier is not designed for code-aware
+#                 semantics (T4 R15/R23e/R23f/R23h, r20). The reasoner
+#                 routes those straight to a cached LLM call.
+#
+# Any decision_kind not listed here is rejected by ``route()`` with
+# ``ok=False``, ``route_used="no_route"`` and ``degraded=True``, so that
+# accidental misuse is visible in telemetry instead of silent.
+#
+# Keep this map in lockstep with ``docs/semantic-reasoner-model-notes.md``.
+TEXTUAL_KINDS: tuple[str, ...] = (
+    "session_end_intent",
+    "autonomy_mandate",
+    "guard_verbal_ack",
+    "r14_correction",
+    "r16_declared_done",
+    "r17_promise_debt",
+    "r34_identity_coherence",
+    "followup_operator_attention",
+    "drive_signal_type",
+    "drive_area",
+    "reply_event_type",
+    "query_intent",
+    "sentiment_intent",
+)
+CODE_AWARE_KINDS: tuple[str, ...] = (
+    "r20_constant_change",
+    "t4_r15",
+    "t4_r23e",
+    "t4_r23f",
+    "t4_r23h",
+)
+ALL_DECISION_KINDS: tuple[str, ...] = TEXTUAL_KINDS + CODE_AWARE_KINDS
+# Per-kind policy. Explicit, human-readable, no defaults that silently
+# expand coverage. Changing policy = editing this dict + updating the
+# model-notes doc + bumping tests.
+_POLICY: dict[str, dict[str, Any]] = {
+    kind: {
+        "family": "textual",
+        "fast_local_threshold": 0.60,
+        "reasoner_mode": "multipass_local",
+        "reasoner_threshold": 0.75,
+        "allow_remote_fallback": True,
+    }
+    for kind in TEXTUAL_KINDS
+}
+_POLICY.update(
+    {
+        kind: {
+            "family": "code_aware",
+            "fast_local_threshold": None,  # skip fast_local
+            "reasoner_mode": "cached_llm",
+            "reasoner_threshold": 0.60,
+            "allow_remote_fallback": True,
+        }
+        for kind in CODE_AWARE_KINDS
+    }
+)
+def policy_for(decision_kind: str) -> dict[str, Any] | None:
+    """Return the policy entry for a kind, or None if unknown."""
+    return _POLICY.get(decision_kind)
+# ---------------------------------------------------------------------------
+# Layer adapters
+# ---------------------------------------------------------------------------
+#
+# The router does not import the heavy modules at the top of the file so
+# that a caller who only wants ``policy_for`` or ``ALL_DECISION_KINDS`` does
+# not pay the import cost. The adapters below resolve the dependencies
+# lazily and wrap failures as ``None`` so the router can advance to the
+# next layer deterministically.
+def _run_fast_local(
+    *,
+    question: str,
+    labels: tuple[str, ...],
+    confidence_floor: float,
+) -> RouterResult | None:
+    """Try ``LocalZeroShotClassifier``. Return None on unavailable or
+    below-threshold so the router advances."""
+    try:
+        from classifier_local import LocalZeroShotClassifier
+    except Exception as exc:  # pragma: no cover — install not ready
+        _logger.debug("semantic_router: classifier_local unavailable (%s)", exc)
+        return None
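+    try:
+        clf = LocalZeroShotClassifier(confidence_floor=confidence_floor)
+        result = clf.classify(question, labels)
+    except Exception as exc:  # noqa: BLE001 - degrade to the next layer
+        _logger.warning("semantic_router: fast_local classify failed (%s)", exc)
+        return None
+    if result is None:
+        return None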
+    if result.confidence < confidence_floor:
+        return None
+    return RouterResult(
+        ok=True,
+        decision_kind="",  # filled by caller
+        verdict=result.label,
+        label=result.label,
+        confidence=float(result.confidence),
+        route_used="fast_local",
+        degraded=False,
+        meta={
+            "scores": dict(result.scores),
+            "latency_ms": float(result.latency_ms),
+            "threshold": confidence_floor,
+        },
+    )
+def _run_semantic_reasoner(
+    *,
+    decision_kind: str,
+    question: str,
+    labels: tuple[str, ...] | None,
+    context: str,
+    mode: str,
+    confidence_floor: float,
+) -> RouterResult | None:
+    """Delegate to ``src/semantic_reasoner.py``. Return None on unavailable
+    so the router advances to remote_fallback."""
+    try:
+        from semantic_reasoner import reason
+    except Exception as exc:  # pragma: no cover
+        _logger.debug("semantic_router: semantic_reasoner unavailable (%s)", exc)
+        return None
+    try:
+        return reason(
+            decision_kind=decision_kind,
+            question=question,
+            labels=labels,
+            context=context,
+            mode=mode,
+            confidence_floor=confidence_floor,
+        )
+    except Exception as exc:  # noqa: BLE001 — fail-closed, degrade to remote
+        _logger.warning("semantic_reasoner.reason raised: %s", exc)
+        return None
+def _run_remote_fallback(
+    *,
+    decision_kind: str,
+    question: str,
+    labels: tuple[str, ...] | None,
+    context: str,
+) -> RouterResult | None:
+    """Last-resort LLM call via ``call_model_raw``. The router marks the
+    result as ``degraded=True`` so telemetry shows when the stack fell
+    through."""
+    try:
+        import call_model_raw as _cmr
+    except Exception as exc:  # pragma: no cover
+        _logger.debug("semantic_router: call_model_raw unavailable (%s)", exc)
+        return None
+    # Resolve symbols defensively. Tests sometimes stub only ``call_model_raw``
+    # and forget ``ClassifierUnavailableError`` (or vice versa); without this
+    # guard a missing attribute later becomes NameError at ``except`` time and
+    # crashes the router instead of degrading.
+    call_model_raw_fn = getattr(_cmr, "call_model_raw", None)
+    classifier_unavailable_cls = getattr(
+        _cmr, "ClassifierUnavailableError", Exception
+    )
+    if call_model_raw_fn is None:
+        return RouterResult(
+            ok=False,
+            decision_kind=decision_kind,
+            route_used="remote_fallback",
+            degraded=True,
+            error="call_model_raw callable missing",
+        )
+    prompt = _build_remote_prompt(
+        decision_kind=decision_kind,
+        question=question,
+        labels=labels,
+        context=context,
+    )
+    system = (
+        "You are NEXO's remote semantic fallback. Answer with the single "
+        "best label from the provided list, or with 'unknown' if none fit. "
+        "No prose, no explanation."
+    )
+    try:
+        raw = call_model_raw_fn(
+            prompt,
+            system=system,
+            caller="semantic_reasoner",
+            tier="muy_bajo",
+            max_tokens=32,
+            temperature=0.0,
+        )
+    except classifier_unavailable_cls as exc:
+        return RouterResult(
+            ok=False,
+            decision_kind=decision_kind,
+            route_used="remote_fallback",
+            degraded=True,
+            error=f"remote_unavailable: {exc}",
+        )
+    except Exception as exc:  # noqa: BLE001 — fail-closed, never re-raise
+        return RouterResult(
+            ok=False,
+            decision_kind=decision_kind,
+            route_used="remote_fallback",
+            degraded=True,
+            error=f"remote_error: {exc}",
+        )
+    verdict = _normalize_remote_answer(raw, labels)
+    raw_preview = (raw or "")[:120]
+    return RouterResult(
+        ok=verdict is not None,
+        decision_kind=decision_kind,
+        verdict=verdict,
+        label=verdict,
+        confidence=0.55 if verdict is not None else 0.0,
+        route_used="remote_fallback",
+        degraded=True,  # always degraded relative to the local-first ideal
+        meta={"raw_response": raw_preview},
+    )
+def _build_remote_prompt(
+    *,
+    decision_kind: str,
+    question: str,
+    labels: tuple[str, ...] | None,
+    context: str,
+) -> str:
+    parts = [
+        f"Decision kind: {decision_kind}",
+        f"Question: {question}",
+    ]
+    if context:
+        parts.append(f"Context: {context[:400]}")
+    if labels:
+        parts.append("Candidate labels: " + ", ".join(labels))
+        parts.append("Reply with exactly one of the labels above.")
+    else:
+        parts.append("Reply with the shortest phrase that answers the question.")
+    return "\n".join(parts)
+def _normalize_remote_answer(
+    raw: str, labels: tuple[str, ...] | None
+) -> str | None:
+    text = (raw or "").strip().lower()
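+    # Treat an explicit "unknown" as a refusal, as _normalize_verdict does;
+    # otherwise a label such as "no" would match it as a substring.
+    if not text or text == "unknown":
+        return None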
+    if labels:
+        for label in labels:
+            if label.lower() == text:
+                return label
+        for label in labels:
+            if label.lower() in text:
+                return label
+        return None
+    return text
+# ---------------------------------------------------------------------------
+# Public entrypoint
+# ---------------------------------------------------------------------------
+def route(
+    *,
+    decision_kind: str,
+    question: str,
+    context: str = "",
+    labels: tuple[str, ...] | list[str] | None = None,
+    allow_remote_fallback: bool = True,
+) -> RouterResult:
+    """Route a semantic decision through the stack.
+    The caller names the *kind* of decision. The router looks up the policy,
+    dispatches through fast_local -> semantic_reasoner -> remote_fallback,
+    and returns the first layer that produced a decision above its
+    threshold.
+    ``allow_remote_fallback=False`` forces local-only behaviour; the router
+    will return ``ok=False, route_used='no_route'`` if every local layer
+    refused. Useful for strict-offline automation or pytest.
+    """
+    policy = policy_for(decision_kind)
+    if policy is None:
+        return RouterResult(
+            ok=False,
+            decision_kind=decision_kind,
+            route_used="no_route",
+            degraded=True,
+            error=f"unknown decision_kind: {decision_kind}",
+        )
+    labels_tuple: tuple[str, ...] | None = (
+        tuple(labels) if labels else None
+    )
+    # Step 1 — fast_local for textual families only.
+    if policy["fast_local_threshold"] is not None and labels_tuple:
+        fast = _run_fast_local(
+            question=question,
+            labels=labels_tuple,
+            confidence_floor=float(policy["fast_local_threshold"]),
+        )
+        if fast is not None:
+            fast.decision_kind = decision_kind
+            return fast
+    # Step 2 — semantic_reasoner (Mode A or B depending on policy).
+    reasoned = _run_semantic_reasoner(
+        decision_kind=decision_kind,
+        question=question,
+        labels=labels_tuple,
+        context=context,
+        mode=str(policy["reasoner_mode"]),
+        confidence_floor=float(policy["reasoner_threshold"]),
+    )
+    if reasoned is not None and reasoned.ok:
+        return reasoned
+    # Step 3 — remote_fallback if allowed.
+    if allow_remote_fallback and policy.get("allow_remote_fallback", True):
+        remote = _run_remote_fallback(
+            decision_kind=decision_kind,
+            question=question,
+            labels=labels_tuple,
+            context=context,
+        )
+        if remote is not None:
+            return remote
+    return RouterResult(
+        ok=False,
+        decision_kind=decision_kind,
+        route_used="no_route",
+        degraded=True,
+        error="every layer refused or was unavailable",
+    )
+__all__ = [
+    "ALL_DECISION_KINDS",
+    "CODE_AWARE_KINDS",
+    "RouterResult",
+    "TEXTUAL_KINDS",
+    "policy_for",
+    "route",
+]
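
The three-step cascade in `route()` above can be followed in isolation with stubbed layers. This is a minimal sketch under stated assumptions: `Result` is a trimmed, hypothetical stand-in for `RouterResult`, `route_sketch` is not part of the package, and each layer is reduced to a zero-argument callable so that only the ordering and stopping rules are shown.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Result:
    """Trimmed stand-in for RouterResult (hypothetical, illustration only)."""

    ok: bool
    route_used: str = "no_route"
    verdict: Optional[str] = None
    degraded: bool = False


Layer = Callable[[], Optional[Result]]


def route_sketch(
    fast_local: Optional[Layer],
    reasoner: Layer,
    remote: Layer,
    allow_remote_fallback: bool = True,
) -> Result:
    # Step 1: fast_local is skipped when the policy gives it no threshold.
    if fast_local is not None:
        fast = fast_local()
        if fast is not None:
            return fast
    # Step 2: the reasoner's result is used only when it reports ok=True.
    reasoned = reasoner()
    if reasoned is not None and reasoned.ok:
        return reasoned
    # Step 3: any non-None remote result is returned, even with ok=False.
    if allow_remote_fallback:
        fallback = remote()
        if fallback is not None:
            return fallback
    return Result(ok=False, route_used="no_route", degraded=True)


def below_threshold() -> Optional[Result]:
    return None  # fast_local refuses


def reasoner_refuses() -> Optional[Result]:
    return Result(ok=False, route_used="semantic_reasoner", degraded=True)


def remote_answers() -> Optional[Result]:
    return Result(ok=True, route_used="remote_fallback", verdict="yes", degraded=True)


online = route_sketch(below_threshold, reasoner_refuses, remote_answers)
offline = route_sketch(
    below_threshold, reasoner_refuses, remote_answers, allow_remote_fallback=False
)
print(online.route_used)   # remote_fallback
print(offline.route_used)  # no_route
```

The second call mirrors the strict-offline behaviour described in the `route()` docstring: with the remote layer disallowed and both local layers refusing, the sketch returns `ok=False` with `route_used="no_route"`.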

package/tool-enforcement-map.json CHANGED

@@ -3214,7 +3214,7 @@
             "threshold": 1
           }
         ],
-        "inject_prompt": "You must start by calling nexo_startup to register this session. Execute it now with a brief task description. Do not produce visible text.",
+        "inject_prompt": "You must start by calling nexo_startup to register this session. If mcp__nexo__* tools appear as deferred in the tool list (names visible but JSONSchemas not loaded), first call ToolSearch with query \"select:mcp__nexo__nexo_startup,mcp__nexo__nexo_heartbeat,mcp__nexo__nexo_session_diary_read,mcp__nexo__nexo_reminders,mcp__nexo__nexo_smart_startup,mcp__nexo__nexo_task_open,mcp__nexo__nexo_task_close,mcp__nexo__nexo_task_acknowledge_guard,mcp__nexo__nexo_guard_check,mcp__nexo__nexo_learning_add,mcp__nexo__nexo_confidence_check,mcp__nexo__nexo_followup_create,mcp__nexo__nexo_protocol_debt_resolve\" to load the schemas — deferred is not absent. If more nexo_* tools appear deferred later in the session, preload them the same way instead of giving up on them. Then execute nexo_startup with a brief task description. Do not produce visible text.",
         "triggers_after": [
           "nexo_smart_startup",
           "nexo_session_diary_read",