npm - superlocalmemory - Versions diffs - 2.7.1 → 2.7.3 - Mend

superlocalmemory 2.7.1 → 2.7.3

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (15)

package/CHANGELOG.md +13 -1
package/README.md +1 -1
package/docs/ARCHITECTURE.md +8 -8
package/docs/COMPRESSION-README.md +1 -1
package/docs/SEARCH-ENGINE-V2.2.0.md +1 -0
package/mcp_server.py +77 -0
package/package.json +1 -1
package/src/agent_registry.py +3 -3
package/src/graph_engine.py +15 -11
package/src/learning/feature_extractor.py +77 -16
package/src/learning/feedback_collector.py +9 -2
package/src/learning/tests/test_synthetic_bootstrap.py +1 -1
package/src/trust_scorer.py +288 -74
package/ui/app.js +4 -4
package/ui/js/agents.js +4 -4

package/CHANGELOG.md CHANGED

@@ -16,6 +16,18 @@ SuperLocalMemory V2 - Intelligent local memory system for AI coding assistants.
 ---
+## [2.7.3] - 2026-02-16
+### Improved
+- Enhanced trust scoring accuracy
+- Improved search result relevance across all access methods
+- Better error handling for optional components
+### Fixed
+- Corrected outdated performance references in documentation
+---
 ## [2.7.1] - 2026-02-16
 ### Added
@@ -270,7 +282,7 @@ SuperLocalMemory V2 represents a complete architectural rewrite with intelligent
 - **Profile Management** - Multi-profile support with isolated databases
 ### Performance
-- 3.3x faster search (45ms vs 150ms in V1)
+- Improved search performance over V1 (see Performance Benchmarks)
 - 60-96% storage reduction with compression
 ### Research Foundation

package/README.md CHANGED

@@ -425,7 +425,7 @@ WAL mode + serialized write queue = zero "database is locked" errors, ever.
 ### Storage
-10,000 memories = **13.6 MB** on disk (~1.9 KB per memory). Your entire AI memory history takes less space than a photo.
+10,000 memories = **13.6 MB** on disk (~1.4 KB per memory). Your entire AI memory history takes less space than a photo.
 ### Graph Construction

package/docs/ARCHITECTURE.md CHANGED

@@ -381,7 +381,7 @@ SuperLocalMemory V2 uses a hierarchical, additive architecture where each layer
    - Cold storage archival
 **Performance:**
-- Full-text search: ~45ms (avg)
+- Full-text search: Sub-11ms median for typical databases (see wiki Performance Benchmarks for measured data)
 - Insert: <10ms
 - Tag search: ~30ms
@@ -737,7 +737,7 @@ Vue: 10% confidence → Low priority, exploratory
 | Operation | Complexity | Typical Time |
 |-----------|-----------|--------------|
 | Add memory | O(1) | <10ms |
-| Search (FTS5) | O(log n) | ~45ms |
+| Search (FTS5) | O(log n) | ~11ms median (100 memories) |
 | Graph build | O(n²) worst, O(n log n) avg | ~2s (100 memories) |
 | Pattern update | O(n) | <2s (100 memories) |
 | Find related | O(1) | <10ms |
@@ -794,12 +794,12 @@ Vue: 10% confidence → Low priority, exploratory
 ### Current Limits (Tested)
-| Memories | Build Time | Search Time | Database Size |
-|----------|-----------|-------------|---------------|
-| 20 | <0.03s | ~30ms | ~30KB |
-| 100 | ~2s | ~45ms | ~150KB |
-| 500 | ~15s | ~60ms | ~700KB |
-| 1000 | ~45s | ~100ms | ~1.5MB |
+| Memories | Build Time | Search Time (median) | Database Size |
+|----------|-----------|----------------------|---------------|
+| 100 | 0.28s | 10.6ms | ~150KB |
+| 500 | ~5s | 65.2ms | ~700KB |
+| 1,000 | 10.6s | 124.3ms | 1.50 MB |
+| 5,000 | 277s | 1,172ms | ~6.8 MB |
 ### Scaling Strategies

package/docs/COMPRESSION-README.md CHANGED

@@ -245,7 +245,7 @@ Priority order:
 - Tier 2 (40 memories @ 10KB): 400KB
 - Tier 3 (30 memories @ 2KB): 60KB
 - **Total: 1.96MB (61% reduction)**
-- **Search time: 45ms** (only scan Tier 1+2)
+- **Search time: Sub-11ms median for typical databases** (only scan Tier 1+2, see wiki Performance Benchmarks)
 - **Memory load: 1.9MB** (Tier 3 loaded on-demand)
 ### Space Savings Scale

package/docs/SEARCH-ENGINE-V2.2.0.md CHANGED

@@ -418,6 +418,7 @@ weights = {'bm25': 0.4, 'semantic': 0.3, 'graph': 0.3}  # Default
 - Index time is one-time cost
 - Search time scales sub-linearly (inverted index efficiency)
 - Hybrid search includes fusion overhead (~10-15ms)
+- These are projected estimates for the optional BM25 engine. See wiki Performance Benchmarks for measured end-to-end search latency.
 ---

package/mcp_server.py CHANGED

@@ -72,6 +72,70 @@ try:
 except ImportError:
     LEARNING_AVAILABLE = False
+# ============================================================================
+# Synthetic Bootstrap Auto-Trigger (v2.7 — P1-12)
+# Runs ONCE on first recall if: memory count > 50, no model, LightGBM available.
+# Spawns in background thread — never blocks recall. All errors swallowed.
+# ============================================================================
+_bootstrap_checked = False
+def _maybe_bootstrap():
+    """Check if synthetic bootstrap is needed and run it in a background thread.
+    Called once from the first recall invocation. Sets _bootstrap_checked = True
+    immediately to prevent re-entry. The actual bootstrap runs in a daemon thread
+    so it never blocks the recall response.
+    Conditions for bootstrap:
+        1. LEARNING_AVAILABLE and ML_RANKING_AVAILABLE flags are True
+        2. SyntheticBootstrapper.should_bootstrap() returns True (checks:
+           - LightGBM + NumPy installed
+           - No existing model file at ~/.claude-memory/models/ranker.txt
+           - Memory count > 50)
+    CRITICAL: This function wraps everything in try/except. Bootstrap failure
+    must NEVER break recall. It is purely an optimization — first-time ML
+    model creation so users don't have to wait 200+ recalls for personalization.
+    """
+    global _bootstrap_checked
+    _bootstrap_checked = True  # Set immediately to prevent re-entry
+    try:
+        if not LEARNING_AVAILABLE:
+            return
+        if not ML_RANKING_AVAILABLE:
+            return
+        from learning.synthetic_bootstrap import SyntheticBootstrapper
+        bootstrapper = SyntheticBootstrapper(memory_db_path=DB_PATH)
+        if not bootstrapper.should_bootstrap():
+            return
+        # Run bootstrap in background thread — never block recall
+        import threading
+        def _run_bootstrap():
+            try:
+                result = bootstrapper.bootstrap_model()
+                if result:
+                    import logging
+                    logging.getLogger("superlocalmemory.mcp").info(
+                        "Synthetic bootstrap complete: %d samples",
+                        result.get('training_samples', 0)
+                    )
+            except Exception:
+                pass  # Bootstrap failure is never critical
+        thread = threading.Thread(target=_run_bootstrap, daemon=True)
+        thread.start()
+    except Exception:
+        pass  # Any failure in bootstrap setup is swallowed silently
 def _sanitize_error(error: Exception) -> str:
     """Strip internal paths and structure from error messages."""
     msg = str(error)
@@ -356,6 +420,10 @@ async def recall(
         else:
             results = store.search(query, limit=limit)
+        # v2.7: Auto-trigger synthetic bootstrap on first recall (P1-12)
+        if not _bootstrap_checked:
+            _maybe_bootstrap()
         # v2.7: Learning-based re-ranking (optional, graceful fallback)
         if LEARNING_AVAILABLE:
             try:
@@ -868,6 +936,15 @@ async def search(query: str) -> dict:
         store = get_store()
         raw_results = store.search(query, limit=20)
+        # v2.7: Learning-based re-ranking (optional, graceful fallback)
+        if LEARNING_AVAILABLE:
+            try:
+                ranker = get_adaptive_ranker()
+                if ranker:
+                    raw_results = ranker.rerank(raw_results, query)
+            except Exception:
+                pass  # Re-ranking failure must never break search
         results = []
         for r in raw_results:
             if r.get('score', 0) < 0.2:
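
The auto-trigger added to `mcp_server.py` above follows a "check once, run in a daemon thread, swallow every error" pattern. A minimal standalone sketch of that pattern is shown below. The names `maybe_bootstrap`, `should_run`, and `job` are hypothetical stand-ins for `_maybe_bootstrap`, `SyntheticBootstrapper.should_bootstrap`, and `bootstrap_model`. The lock is an assumption added here for illustration; the hunk above guards re-entry with a plain module-level flag only.

```python
import threading

_checked = False
_check_lock = threading.Lock()  # assumption: not present in the diffed code


def maybe_bootstrap(should_run, job):
    """Start `job` in a daemon thread at most once per process.

    Returns the started Thread, or None if nothing was started.
    Never raises: any failure is swallowed so the caller is unaffected.
    """
    global _checked
    with _check_lock:
        if _checked:
            return None
        _checked = True  # flip before doing any work to prevent re-entry

    try:
        if not should_run():
            return None

        def _runner():
            try:
                job()
            except Exception:
                pass  # a failed background job must not surface anywhere

        thread = threading.Thread(target=_runner, daemon=True)
        thread.start()
        return thread
    except Exception:
        return None  # setup failure is swallowed as well
```

Because the flag is set before any condition is evaluated, a failed or skipped check is not retried until the process restarts, which matches the "runs ONCE on first recall" comment in the hunk.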

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "superlocalmemory",
-  "version": "2.7.1",
+  "version": "2.7.3",
   "description": "Your AI Finally Remembers You - Local-first intelligent memory system for AI assistants. Works with Claude, Cursor, Windsurf, VS Code/Copilot, Codex, and 17+ AI tools. 100% local, zero cloud dependencies.",
   "keywords": [
     "ai-memory",

package/src/agent_registry.py CHANGED

@@ -98,7 +98,7 @@ class AgentRegistry:
                         last_seen TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
                         memories_written INTEGER DEFAULT 0,
                         memories_recalled INTEGER DEFAULT 0,
-                        trust_score REAL DEFAULT 1.0,
+                        trust_score REAL DEFAULT 0.667,
                         metadata TEXT DEFAULT '{}'
                     )
                 ''')
@@ -126,7 +126,7 @@ class AgentRegistry:
                     last_seen TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
                     memories_written INTEGER DEFAULT 0,
                     memories_recalled INTEGER DEFAULT 0,
-                    trust_score REAL DEFAULT 1.0,
+                    trust_score REAL DEFAULT 0.667,
                     metadata TEXT DEFAULT '{}'
                 )
             ''')
@@ -150,7 +150,7 @@ class AgentRegistry:
         Register or update an agent in the registry.
         If the agent already exists, updates last_seen and metadata.
-        If new, creates the entry with trust_score=1.0.
+        If new, creates the entry with trust_score=0.667 (Beta(2,1) prior).
         Args:
             agent_id: Unique identifier (e.g., "mcp:claude-desktop")

package/src/graph_engine.py CHANGED

@@ -297,12 +297,11 @@ class ClusterBuilder:
         Returns:
             Number of clusters created
         """
-        # Import igraph modules here to avoid conflicts
-        try:
-            import igraph as ig
-            import leidenalg
-        except ImportError:
-            raise ImportError("python-igraph and leidenalg required. Install: pip install python-igraph leidenalg")
+        if not IGRAPH_AVAILABLE:
+            logger.warning("igraph/leidenalg not installed. Graph clustering disabled. Install with: pip3 install python-igraph leidenalg")
+            return 0
+        import igraph as ig
+        import leidenalg
         conn = sqlite3.connect(self.db_path)
         cursor = conn.cursor()
@@ -457,11 +456,11 @@ class ClusterBuilder:
         Returns:
             Dictionary with hierarchical clustering statistics
         """
-        try:
-            import igraph as ig
-            import leidenalg
-        except ImportError:
-            raise ImportError("python-igraph and leidenalg required. Install: pip install python-igraph leidenalg")
+        if not IGRAPH_AVAILABLE:
+            logger.warning("igraph/leidenalg not installed. Hierarchical clustering disabled. Install with: pip3 install python-igraph leidenalg")
+            return {'subclusters_created': 0, 'depth_reached': 0}
+        import igraph as ig
+        import leidenalg
         conn = sqlite3.connect(self.db_path)
         cursor = conn.cursor()
@@ -512,6 +511,8 @@ class ClusterBuilder:
                                profile: str, min_size: int, max_depth: int,
                                current_depth: int) -> Tuple[int, int]:
         """Recursively sub-cluster a community using Leiden."""
+        if not IGRAPH_AVAILABLE:
+            return 0, current_depth - 1
         import igraph as ig
         import leidenalg
@@ -1038,6 +1039,9 @@ class GraphEngine:
                 'summaries_generated': summaries,
                 'time_seconds': round(elapsed, 2)
             }
+            if not IGRAPH_AVAILABLE:
+                stats['warning'] = 'igraph/leidenalg not installed — graph built without clustering. Install with: pip3 install python-igraph leidenalg'
             logger.info(f"Graph build complete: {stats}")
             return stats
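
The `graph_engine.py` hunks above replace a hard `ImportError` with a check of an `IGRAPH_AVAILABLE` flag and a neutral return value. The definition of that flag is not part of this diff, so the sketch below assumes it is set by a module-level guarded import. `cluster_count` is a hypothetical helper, not a function from the package.

```python
import logging

logger = logging.getLogger("example.graph")

# Assumption: the flag is computed once at import time, roughly like this.
try:
    import igraph  # noqa: F401
    import leidenalg  # noqa: F401
    IGRAPH_AVAILABLE = True
except ImportError:
    IGRAPH_AVAILABLE = False


def cluster_count(edges):
    """Return the number of Leiden communities, or 0 if the deps are missing."""
    if not IGRAPH_AVAILABLE:
        logger.warning("igraph/leidenalg not installed. Clustering disabled.")
        return 0  # degrade gracefully instead of raising

    import igraph as ig
    import leidenalg

    graph = ig.Graph(edges=edges)
    partition = leidenalg.find_partition(
        graph, leidenalg.ModularityVertexPartition
    )
    return len(partition)
```

Returning a neutral value keeps callers such as the graph build working on installations without the optional packages, at the cost of the caller having to distinguish "no clusters found" from "clustering unavailable". The `stats['warning']` field added in the last hunk appears to serve that purpose.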

package/src/learning/feature_extractor.py CHANGED

@@ -12,22 +12,23 @@ Attribution must be preserved in all copies or derivatives.
 """
 """
-FeatureExtractor — Extracts 9-dimensional feature vectors for candidate memories.
+FeatureExtractor — Extracts 10-dimensional feature vectors for candidate memories.
 Each memory retrieved during recall gets a feature vector that feeds into
 the AdaptiveRanker. In Phase 1 (rule-based), features drive boosting weights.
 In Phase 2 (ML), features become LightGBM input columns.
-Feature Vector (9 dimensions):
-    [0] bm25_score       — Existing retrieval score from search results
-    [1] tfidf_score      — TF-IDF cosine similarity from search results
-    [2] tech_match       — Does memory match user's tech preferences?
-    [3] project_match    — Is memory from the current project?
-    [4] workflow_fit     — Does memory fit current workflow phase?
-    [5] source_quality   — Quality score of the source that created this memory
-    [6] importance_norm  — Normalized importance (importance / 10.0)
-    [7] recency_score    — Exponential decay based on age (180-day half-life)
-    [8] access_frequency — How often this memory was accessed (capped at 1.0)
+Feature Vector (10 dimensions):
+    [0] bm25_score          — Existing retrieval score from search results
+    [1] tfidf_score         — TF-IDF cosine similarity from search results
+    [2] tech_match          — Does memory match user's tech preferences?
+    [3] project_match       — Is memory from the current project?
+    [4] workflow_fit        — Does memory fit current workflow phase?
+    [5] source_quality      — Quality score of the source that created this memory
+    [6] importance_norm     — Normalized importance (importance / 10.0)
+    [7] recency_score       — Exponential decay based on age (180-day half-life)
+    [8] access_frequency    — How often this memory was accessed (capped at 1.0)
+    [9] pattern_confidence  — Max Beta-Binomial confidence from learned patterns
 Design Principles:
     - All features normalized to [0.0, 1.0] range for ML compatibility
@@ -59,6 +60,7 @@ FEATURE_NAMES = [
     'importance_norm',     # 6: Normalized importance (importance / 10.0)
     'recency_score',       # 7: Exponential decay based on age
     'access_frequency',    # 8: How often this memory was accessed (capped at 1.0)
+    'pattern_confidence',  # 9: Max Beta-Binomial confidence from learned patterns
 ]
 NUM_FEATURES = len(FEATURE_NAMES)
@@ -100,7 +102,7 @@ _MAX_ACCESS_COUNT = 10
 class FeatureExtractor:
     """
-    Extracts 9-dimensional feature vectors for candidate memories.
+    Extracts 10-dimensional feature vectors for candidate memories.
     Usage:
         extractor = FeatureExtractor()
@@ -111,7 +113,7 @@ class FeatureExtractor:
             workflow_phase='testing',
         )
         features = extractor.extract_batch(memories, query="search optimization")
-        # features is List[List[float]], shape (n_memories, 9)
+        # features is List[List[float]], shape (n_memories, 10)
     """
     FEATURE_NAMES = FEATURE_NAMES
@@ -125,6 +127,8 @@ class FeatureExtractor:
         self._current_project_lower: Optional[str] = None
         self._workflow_phase: Optional[str] = None
         self._workflow_keywords: List[str] = []
+        # Pattern confidence cache: maps lowercased pattern value -> confidence
+        self._pattern_cache: Dict[str, float] = {}
     def set_context(
         self,
@@ -132,6 +136,7 @@ class FeatureExtractor:
         tech_preferences: Optional[Dict[str, dict]] = None,
         current_project: Optional[str] = None,
         workflow_phase: Optional[str] = None,
+        pattern_confidences: Optional[Dict[str, float]] = None,
     ):
         """
         Set context for feature extraction. Called once per recall query.
@@ -146,6 +151,9 @@ class FeatureExtractor:
                               From cross_project_aggregator or pattern_learner.
             current_project: Name of the currently active project (if detected).
             workflow_phase: Current workflow phase (planning, coding, testing, etc).
+            pattern_confidences: Map of lowercased pattern value -> confidence (0.0-1.0).
+                                 From pattern_learner.PatternStore.get_patterns().
+                                 Used for feature [9] pattern_confidence.
         """
         self._source_scores = source_scores or {}
         self._tech_preferences = tech_preferences or {}
@@ -166,9 +174,12 @@ class FeatureExtractor:
             if workflow_phase else []
         )
+        # Cache pattern confidences for feature [9]
+        self._pattern_cache = pattern_confidences or {}
     def extract_features(self, memory: dict, query: str) -> List[float]:
         """
-        Extract 9-dimensional feature vector for a single memory.
+        Extract 10-dimensional feature vector for a single memory.
         Args:
             memory: Memory dict from search results. Expected keys:
@@ -177,7 +188,7 @@ class FeatureExtractor:
             query: The recall query string.
         Returns:
-            List of 9 floats in [0.0, 1.0] range, one per feature.
+            List of 10 floats in [0.0, 1.0] range, one per feature.
         """
         return [
             self._compute_bm25_score(memory),
@@ -189,6 +200,7 @@ class FeatureExtractor:
             self._compute_importance_norm(memory),
             self._compute_recency_score(memory),
             self._compute_access_frequency(memory),
+            self._compute_pattern_confidence(memory),
         ]
     def extract_batch(
@@ -204,7 +216,7 @@ class FeatureExtractor:
             query: The recall query string.
         Returns:
-            List of feature vectors (List[List[float]]), shape (n, 9).
+            List of feature vectors (List[List[float]]), shape (n, 10).
             Returns empty list if memories is empty.
         """
         if not memories:
@@ -447,6 +459,55 @@ class FeatureExtractor:
         return min(access_count / float(_MAX_ACCESS_COUNT), 1.0)
+    def _compute_pattern_confidence(self, memory: dict) -> float:
+        """
+        Compute max Beta-Binomial confidence from learned patterns matching this memory.
+        Looks up the cached pattern_confidences (set via set_context) and checks
+        if any pattern value appears in the memory's content or tags. Returns the
+        maximum confidence among all matching patterns.
+        Returns:
+            Max confidence (0.0-1.0) from matching patterns
+            0.5 if no patterns loaded (neutral — unknown)
+            0.0 if patterns loaded but none match
+        """
+        if not self._pattern_cache:
+            return 0.5  # No patterns available — neutral
+        content = memory.get('content', '')
+        if not content:
+            return 0.0
+        content_lower = content.lower()
+        # Also check tags
+        tags_str = ''
+        tags = memory.get('tags', [])
+        if isinstance(tags, list):
+            tags_str = ' '.join(t.lower() for t in tags)
+        elif isinstance(tags, str):
+            tags_str = tags.lower()
+        searchable = content_lower + ' ' + tags_str
+        max_confidence = 0.0
+        for pattern_value, confidence in self._pattern_cache.items():
+            # Pattern values are already lowercased in the cache
+            pattern_lower = pattern_value.lower() if pattern_value else ''
+            if not pattern_lower:
+                continue
+            # Word-boundary check for short patterns to avoid false positives
+            if len(pattern_lower) <= 3:
+                if re.search(r'\b' + re.escape(pattern_lower) + r'\b', searchable):
+                    max_confidence = max(max_confidence, confidence)
+            else:
+                if pattern_lower in searchable:
+                    max_confidence = max(max_confidence, confidence)
+        return max(0.0, min(max_confidence, 1.0))
 # ============================================================================
 # Module-level convenience functions
 # ============================================================================
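
The matching rule in `_compute_pattern_confidence` above can be exercised in isolation. The following sketch restates the same logic as a free function so its edge cases are easy to try. The function name and signature are hypothetical, and tags are assumed to be a list of strings.

```python
import re


def pattern_confidence(content, tags, pattern_confidences):
    """Max confidence among patterns found in content or tags.

    Mirrors the hunk above: 0.5 when no patterns are loaded, 0.0 when the
    content is empty or nothing matches.
    """
    if not pattern_confidences:
        return 0.5  # neutral: nothing is known yet
    if not content:
        return 0.0

    searchable = content.lower() + " " + " ".join(t.lower() for t in tags)

    best = 0.0
    for pattern, confidence in pattern_confidences.items():
        p = pattern.lower()
        if not p:
            continue
        if len(p) <= 3:
            # Short patterns need word boundaries ("go" must not hit "going").
            matched = re.search(r"\b" + re.escape(p) + r"\b", searchable) is not None
        else:
            matched = p in searchable
        if matched:
            best = max(best, confidence)
    return max(0.0, min(best, 1.0))
```

Two behaviors follow from this rule. A memory with empty content scores 0.0 even when its tags would match. Short patterns that end in punctuation, such as `c++`, do not match under `\b`, because there is no word boundary between `+` and a following space.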

package/src/learning/feedback_collector.py CHANGED

@@ -122,9 +122,16 @@ class FeedbackCollector:
         """
         Args:
             learning_db: LearningDB instance for persisting feedback.
-                         If None, feedback is logged but not stored.
+                         If None, auto-creates a LearningDB instance.
         """
-        self.learning_db = learning_db
+        if learning_db is None:
+            try:
+                from .learning_db import LearningDB
+                self.learning_db = LearningDB()
+            except Exception:
+                self.learning_db = None
+        else:
+            self.learning_db = learning_db
         # In-memory buffer for passive decay tracking.
         # Structure: {query_hash: {memory_id: times_returned_count}}

package/src/learning/tests/test_synthetic_bootstrap.py CHANGED

@@ -258,7 +258,7 @@ class TestGenerateSyntheticData:
             assert "label" in r
             assert "source" in r
             assert "features" in r
-            assert len(r["features"]) == 9  # 9-dimensional feature vector
+            assert len(r["features"]) == 10  # 10-dimensional feature vector
     def test_labels_in_range(self, bootstrapper_with_data):
         records = bootstrapper_with_data.generate_synthetic_training_data()

package/src/trust_scorer.py CHANGED

@@ -12,10 +12,23 @@ Attribution must be preserved in all copies or derivatives.
 """
 """
-TrustScorer — Silent trust signal collection for AI agents.
+TrustScorer — Bayesian Beta-Binomial trust scoring for AI agents.
+Scoring Model:
+    Each agent's trust is modeled as a Beta(alpha, beta) distribution.
+    - alpha accumulates evidence of trustworthy behavior
+    - beta accumulates evidence of untrustworthy behavior
+    - Trust score = alpha / (alpha + beta)  (posterior mean)
+    Prior: Beta(2.0, 1.0) → initial trust = 0.667
+    This gives new agents a positive-but-not-maximal starting trust,
+    well above the 0.3 enforcement threshold but with room to grow.
+    This follows the MACLA Beta-Binomial approach (arXiv:2512.18950)
+    already used in pattern_learner.py for confidence scoring.
 v2.5 BEHAVIOR (this version):
-    - All agents start at trust 1.0
+    - All agents start at Beta(2.0, 1.0) → trust 0.667
     - Signals are collected silently (no enforcement, no ranking, no blocking)
     - Trust scores are updated in agent_registry.trust_score
     - Dashboard shows scores but they don't affect recall ordering yet
@@ -30,31 +43,26 @@ v3.0 BEHAVIOR (future):
     - Admin approval workflow for untrusted agents
 Trust Signals (all silently collected):
-    POSITIVE (increase trust):
+    POSITIVE (increase alpha — build trust):
         - Memory recalled by other agents (cross-agent validation)
         - Memory updated (shows ongoing relevance)
         - High importance memories (agent writes valuable content)
         - Consistent write patterns (not spam-like)
-    NEGATIVE (decrease trust):
+    NEGATIVE (increase beta — erode trust):
         - Memory deleted shortly after creation (low quality)
         - Very high write volume in short time (potential spam/poisoning)
         - Content flagged or overwritten by user
     NEUTRAL:
-        - Normal read/write patterns
+        - Normal read/write patterns (tiny alpha nudge to reward activity)
         - Agent disconnects/reconnects
-Scoring Algorithm:
-    Bayesian-inspired moving average. Each signal adjusts the score
-    by a small delta. Score is clamped to [0.0, 1.0].
-    new_score = old_score + (delta * decay_factor)
-    decay_factor = 1 / (1 + signal_count * 0.01)  # Stabilizes over time
-    This means early signals have more impact, and the score converges
-    as more data is collected. Similar to MACLA Beta-Binomial approach
-    (arXiv:2512.18950) but simplified for local computation.
+Decay:
+    Every DECAY_INTERVAL signals per agent, both alpha and beta are
+    multiplied by DECAY_FACTOR (0.995). This slowly forgets very old
+    signals so recent behavior matters more. Floors prevent total
+    information loss: alpha >= 1.0, beta >= 0.5.
 Security (OWASP for Agentic AI):
     - Memory poisoning (#1 threat): Trust scoring is the first defense layer
@@ -64,7 +72,6 @@ Security (OWASP for Agentic AI):
 import json
 import logging
-import math
 import threading
 from datetime import datetime, timedelta
 from pathlib import Path
@@ -72,24 +79,59 @@ from typing import Optional, Dict, List
 logger = logging.getLogger("superlocalmemory.trust")
-# Signal deltas (how much each signal moves the trust score)
-SIGNAL_DELTAS = {
-    # Positive signals
-    "memory_recalled_by_others": +0.02,
-    "memory_updated": +0.01,
-    "high_importance_write": +0.015,   # importance >= 7
-    "consistent_pattern": +0.01,
-    # Negative signals
-    "quick_delete": -0.03,             # deleted within 1 hour of creation
-    "high_volume_burst": -0.02,        # >20 writes in 5 minutes
-    "content_overwritten_by_user": -0.01,
-    # Neutral (logged but no score change)
-    "normal_write": 0.0,
-    "normal_recall": 0.0,
+# ---------------------------------------------------------------------------
+# Beta-Binomial signal weights
+# ---------------------------------------------------------------------------
+# Positive signals increment alpha (building trust).
+# Negative signals increment beta (eroding trust).
+# Neutral signals give a tiny alpha nudge to reward normal activity.
+#
+# Asymmetry: negative weights are larger than positive weights.
+# This means it's harder to build trust than to lose it — the system
+# is intentionally skeptical. One poisoning event takes many good
+# actions to recover from.
+# ---------------------------------------------------------------------------
+SIGNAL_WEIGHTS = {
+    # Positive signals → alpha += weight
+    "memory_recalled_by_others": ("positive", 0.30),   # cross-agent validation
+    "memory_updated":            ("positive", 0.15),   # ongoing relevance
+    "high_importance_write":     ("positive", 0.20),   # valuable content (importance >= 7)
+    "consistent_pattern":        ("positive", 0.15),   # stable write behavior
+    # Negative signals → beta += weight
+    "quick_delete":              ("negative", 0.50),   # deleted within 1 hour
+    "high_volume_burst":         ("negative", 0.40),   # >20 writes in 5 minutes
+    "content_overwritten_by_user": ("negative", 0.25), # user had to fix output
+    # Neutral signals → tiny alpha nudge
+    "normal_write":              ("neutral", 0.01),
+    "normal_recall":             ("neutral", 0.01),
 }
+# Backward-compatible: expose SIGNAL_DELTAS as a derived dict so that
+# bm6_trust.py (which imports SIGNAL_DELTAS) and any other consumer
+# continues to work. The values represent the *direction* and *magnitude*
+# of each signal: positive for alpha, negative for beta, zero for neutral.
+SIGNAL_DELTAS = {}
+for _sig, (_direction, _weight) in SIGNAL_WEIGHTS.items():
+    if _direction == "positive":
+        SIGNAL_DELTAS[_sig] = +_weight
+    elif _direction == "negative":
+        SIGNAL_DELTAS[_sig] = -_weight
+    else:
+        SIGNAL_DELTAS[_sig] = 0.0
+# ---------------------------------------------------------------------------
+# Beta prior and decay parameters
+# ---------------------------------------------------------------------------
+INITIAL_ALPHA = 2.0        # Slight positive prior
+INITIAL_BETA = 1.0         # → initial trust = 2/(2+1) = 0.667
+DECAY_FACTOR = 0.995       # Multiply alpha & beta every DECAY_INTERVAL signals
+DECAY_INTERVAL = 50        # Apply decay every N signals per agent
+ALPHA_FLOOR = 1.0          # Never decay alpha below this
+BETA_FLOOR = 0.5           # Never decay beta below this
 # Thresholds
 QUICK_DELETE_HOURS = 1       # Delete within 1 hour = negative signal
 BURST_THRESHOLD = 20         # >20 writes in burst window = negative
@@ -98,9 +140,12 @@ BURST_WINDOW_MINUTES = 5     # Burst detection window
 class TrustScorer:
     """
-    Silent trust signal collector for AI agents.
+    Bayesian Beta-Binomial trust scorer for AI agents.
+    Each agent is modeled as Beta(alpha, beta). Positive signals
+    increment alpha, negative signals increment beta. The trust
+    score is the posterior mean: alpha / (alpha + beta).
-    v2.5: Collection only, no enforcement. All agents start at 1.0.
     Thread-safe singleton per database path.
     """
@@ -136,19 +181,26 @@ class TrustScorer:
         self._write_timestamps: Dict[str, list] = {}
         self._timestamps_lock = threading.Lock()
-        # Signal count per agent (for decay factor calculation)
+        # Signal count per agent (for decay interval tracking)
         self._signal_counts: Dict[str, int] = {}
+        # In-memory cache of Beta parameters per agent
+        # Key: agent_id, Value: (alpha, beta)
+        self._beta_params: Dict[str, tuple] = {}
+        self._beta_lock = threading.Lock()
         self._init_schema()
-        logger.info("TrustScorer initialized (v2.5 — silent collection, no enforcement)")
+        logger.info("TrustScorer initialized (Beta-Binomial — alpha=%.1f, beta=%.1f prior)",
+                     INITIAL_ALPHA, INITIAL_BETA)
     def _init_schema(self):
-        """Create trust_signals table for audit trail."""
+        """Create trust_signals table and add alpha/beta columns to agent_registry."""
         try:
             from db_connection_manager import DbConnectionManager
             mgr = DbConnectionManager.get_instance(self.db_path)
             def _create(conn):
+                # Trust signals audit trail
                 conn.execute('''
                     CREATE TABLE IF NOT EXISTS trust_signals (
                         id INTEGER PRIMARY KEY AUTOINCREMENT,
@@ -169,6 +221,18 @@ class TrustScorer:
                     CREATE INDEX IF NOT EXISTS idx_trust_created
                     ON trust_signals(created_at)
                 ''')
+                # Add trust_alpha and trust_beta columns to agent_registry
+                # (backward compatible — old databases get these columns added)
+                for col_name, col_default in [("trust_alpha", INITIAL_ALPHA),
+                                               ("trust_beta", INITIAL_BETA)]:
+                    try:
+                        conn.execute(
+                            f'ALTER TABLE agent_registry ADD COLUMN {col_name} REAL DEFAULT {col_default}'
+                        )
+                    except Exception:
+                        pass  # Column already exists
                 conn.commit()
             mgr.execute_write(_create)
@@ -189,11 +253,108 @@ class TrustScorer:
             ''')
             conn.execute('CREATE INDEX IF NOT EXISTS idx_trust_agent ON trust_signals(agent_id)')
             conn.execute('CREATE INDEX IF NOT EXISTS idx_trust_created ON trust_signals(created_at)')
+            # Add trust_alpha and trust_beta columns (backward compatible)
+            for col_name, col_default in [("trust_alpha", INITIAL_ALPHA),
+                                           ("trust_beta", INITIAL_BETA)]:
+                try:
+                    conn.execute(
+                        f'ALTER TABLE agent_registry ADD COLUMN {col_name} REAL DEFAULT {col_default}'
+                    )
+                except sqlite3.OperationalError:
+                    pass  # Column already exists
             conn.commit()
             conn.close()
     # =========================================================================
-    # Signal Recording
+    # Beta Parameter Management
+    # =========================================================================
+    def _get_beta_params(self, agent_id: str) -> tuple:
+        """
+        Get (alpha, beta) for an agent. Checks in-memory cache first,
+        then database, then falls back to prior defaults.
+        Returns:
+            (alpha, beta) tuple
+        """
+        with self._beta_lock:
+            if agent_id in self._beta_params:
+                return self._beta_params[agent_id]
+        # Not in cache — read from database
+        alpha, beta = None, None
+        try:
+            from db_connection_manager import DbConnectionManager
+            mgr = DbConnectionManager.get_instance(self.db_path)
+            with mgr.read_connection() as conn:
+                cursor = conn.cursor()
+                cursor.execute(
+                    "SELECT trust_alpha, trust_beta FROM agent_registry WHERE agent_id = ?",
+                    (agent_id,)
+                )
+                row = cursor.fetchone()
+                if row:
+                    alpha = row[0]
+                    beta = row[1]
+        except Exception:
+            pass
+        # Fall back to defaults if NULL or missing
+        if alpha is None or beta is None:
+            alpha = INITIAL_ALPHA
+            beta = INITIAL_BETA
+        with self._beta_lock:
+            self._beta_params[agent_id] = (alpha, beta)
+        return (alpha, beta)
+    def _set_beta_params(self, agent_id: str, alpha: float, beta: float):
+        """
+        Update (alpha, beta) in cache and persist to agent_registry.
+        Also computes and stores the derived trust_score = alpha/(alpha+beta).
+        """
+        trust_score = alpha / (alpha + beta) if (alpha + beta) > 0 else 0.0
+        with self._beta_lock:
+            self._beta_params[agent_id] = (alpha, beta)
+        try:
+            from db_connection_manager import DbConnectionManager
+            mgr = DbConnectionManager.get_instance(self.db_path)
+            def _update(conn):
+                conn.execute(
+                    """UPDATE agent_registry
+                       SET trust_score = ?, trust_alpha = ?, trust_beta = ?
+                       WHERE agent_id = ?""",
+                    (round(trust_score, 4), round(alpha, 4), round(beta, 4), agent_id)
+                )
+                conn.commit()
+            mgr.execute_write(_update)
+        except Exception as e:
+            logger.error("Failed to persist Beta params for %s: %s", agent_id, e)
+    def _apply_decay(self, agent_id: str, alpha: float, beta: float) -> tuple:
+        """
+        Apply periodic decay to alpha and beta to forget very old signals.
+        Called every DECAY_INTERVAL signals per agent.
+        Multiplies both by DECAY_FACTOR with floor constraints.
+        Returns:
+            (decayed_alpha, decayed_beta)
+        """
+        new_alpha = max(ALPHA_FLOOR, alpha * DECAY_FACTOR)
+        new_beta = max(BETA_FLOOR, beta * DECAY_FACTOR)
+        return (new_alpha, new_beta)
+    # =========================================================================
+    # Signal Recording (Beta-Binomial Update)
     # =========================================================================
     def record_signal(
@@ -203,50 +364,68 @@ class TrustScorer:
         context: Optional[dict] = None,
     ) -> bool:
         """
-        Record a trust signal for an agent.
+        Record a trust signal for an agent using Beta-Binomial update.
-        Silently adjusts the agent's trust score based on the signal type.
-        The signal and score change are logged to trust_signals table.
+        Positive signals increment alpha (trust evidence).
+        Negative signals increment beta (distrust evidence).
+        Neutral signals give a tiny alpha nudge.
+        Trust score = alpha / (alpha + beta) — the posterior mean.
         Args:
             agent_id: Agent that generated the signal
-            signal_type: One of SIGNAL_DELTAS keys
+            signal_type: One of SIGNAL_WEIGHTS keys
             context: Additional context (memory_id, etc.)
+        Returns:
+            True if signal was recorded successfully
         """
-        if signal_type not in SIGNAL_DELTAS:
+        if signal_type not in SIGNAL_WEIGHTS:
             logger.warning("Unknown trust signal: %s", signal_type)
-            return
+            return False
+        direction, weight = SIGNAL_WEIGHTS[signal_type]
-        delta = SIGNAL_DELTAS[signal_type]
+        # Get current Beta parameters
+        alpha, beta = self._get_beta_params(agent_id)
+        old_score = alpha / (alpha + beta) if (alpha + beta) > 0 else 0.0
-        # Get current trust score from agent registry
-        old_score = self._get_agent_trust(agent_id)
-        if old_score is None:
-            old_score = 1.0  # Default for unknown agents
+        # Apply Beta-Binomial update
+        if direction == "positive":
+            alpha += weight
+        elif direction == "negative":
+            beta += weight
+        else:  # neutral — tiny alpha nudge
+            alpha += weight
-        # Apply decay factor (score stabilizes over time)
-        count = self._signal_counts.get(agent_id, 0)
-        decay = 1.0 / (1.0 + count * 0.01)
-        adjusted_delta = delta * decay
+        # Apply periodic decay
+        count = self._signal_counts.get(agent_id, 0) + 1
+        self._signal_counts[agent_id] = count
-        # Calculate new score (clamped to [0.0, 1.0])
-        new_score = max(0.0, min(1.0, old_score + adjusted_delta))
+        if count % DECAY_INTERVAL == 0:
+            alpha, beta = self._apply_decay(agent_id, alpha, beta)
-        # Update signal count
-        self._signal_counts[agent_id] = count + 1
+        # Compute new trust score (posterior mean)
+        new_score = alpha / (alpha + beta) if (alpha + beta) > 0 else 0.0
+        # Compute delta for audit trail (backward compatible with trust_signals table)
+        delta = new_score - old_score
         # Persist signal to audit trail
-        self._persist_signal(agent_id, signal_type, adjusted_delta, old_score, new_score, context)
+        self._persist_signal(agent_id, signal_type, delta, old_score, new_score, context)
-        # Update agent trust score (if score actually changed)
-        if abs(new_score - old_score) > 0.0001:
-            self._update_agent_trust(agent_id, new_score)
+        # Persist updated Beta parameters and derived trust_score
+        self._set_beta_params(agent_id, alpha, beta)
         logger.debug(
-            "Trust signal: agent=%s, type=%s, delta=%.4f, score=%.4f→%.4f",
-            agent_id, signal_type, adjusted_delta, old_score, new_score
+            "Trust signal: agent=%s, type=%s (%s, w=%.2f), "
+            "alpha=%.2f, beta=%.2f, score=%.4f->%.4f",
+            agent_id, signal_type, direction, weight,
+            alpha, beta, old_score, new_score
         )
+        return True
     def _persist_signal(self, agent_id, signal_type, delta, old_score, new_score, context):
         """Save signal to trust_signals table."""
         try:
@@ -265,7 +444,12 @@ class TrustScorer:
             logger.error("Failed to persist trust signal: %s", e)
     def _get_agent_trust(self, agent_id: str) -> Optional[float]:
-        """Get current trust score from agent_registry."""
+        """
+        Get current trust score from agent_registry.
+        This reads the derived trust_score column (which is always kept
+        in sync with alpha/(alpha+beta) by _set_beta_params).
+        """
         try:
             from db_connection_manager import DbConnectionManager
             mgr = DbConnectionManager.get_instance(self.db_path)
@@ -282,7 +466,13 @@ class TrustScorer:
             return None
     def _update_agent_trust(self, agent_id: str, new_score: float):
-        """Update trust score in agent_registry."""
+        """
+        Update trust score in agent_registry (legacy compatibility method).
+        In Beta-Binomial mode, record_signal no longer calls this method
+        because _set_beta_params already updates trust_score alongside
+        alpha and beta. Kept for backward compatibility if any external
+        code calls it directly.
+        """
         try:
             from db_connection_manager import DbConnectionManager
             mgr = DbConnectionManager.get_instance(self.db_path)
@@ -373,16 +563,40 @@ class TrustScorer:
     # =========================================================================
     def get_trust_score(self, agent_id: str) -> float:
-        """Get current trust score for an agent. Returns 1.0 if unknown."""
-        score = self._get_agent_trust(agent_id)
-        return score if score is not None else 1.0
+        """
+        Get current trust score for an agent.
+        Computes alpha/(alpha+beta) from cached or stored Beta params.
+        Returns INITIAL_ALPHA/(INITIAL_ALPHA+INITIAL_BETA) ≈ 0.667 for
+        unknown agents.
+        """
+        alpha, beta = self._get_beta_params(agent_id)
+        if (alpha + beta) > 0:
+            return alpha / (alpha + beta)
+        return INITIAL_ALPHA / (INITIAL_ALPHA + INITIAL_BETA)
+    def get_beta_params(self, agent_id: str) -> Dict[str, float]:
+        """
+        Get the Beta distribution parameters for an agent.
+        Returns:
+            {"alpha": float, "beta": float, "trust_score": float}
+        """
+        alpha, beta = self._get_beta_params(agent_id)
+        score = alpha / (alpha + beta) if (alpha + beta) > 0 else 0.0
+        return {
+            "alpha": round(alpha, 4),
+            "beta": round(beta, 4),
+            "trust_score": round(score, 4),
+        }
     def check_trust(self, agent_id: str, operation: str = "write") -> bool:
         """
         Check if agent is trusted enough for the given operation.
         v2.6 enforcement: blocks write/delete for agents with trust < 0.3.
-        New agents start at 1.0 — only repeated bad behavior triggers blocking.
+        New agents start at Beta(2,1) → trust 0.667 — only repeated bad
+        behavior triggers blocking.
         Args:
             agent_id: The agent identifier
@@ -394,14 +608,12 @@ class TrustScorer:
         if operation == "read":
             return True  # Reads are always allowed
-        score = self._get_agent_trust(agent_id)
-        if score is None:
-            return True  # Unknown agent = first-time = allowed (starts at 1.0)
+        score = self.get_trust_score(agent_id)
         threshold = 0.3  # Block write/delete below this
         if score < threshold:
             logger.warning(
-                "Trust enforcement: agent '%s' blocked from '%s' (trust=%.2f < %.2f)",
+                "Trust enforcement: agent '%s' blocked from '%s' (trust=%.4f < %.2f)",
                 agent_id, operation, score, threshold
             )
             return False
@@ -479,7 +691,9 @@ class TrustScorer:
                 "total_signals": total_signals,
                 "by_signal_type": by_type,
                 "by_agent": by_agent,
-                "avg_trust_score": round(avg, 4) if avg else 1.0,
+                "avg_trust_score": round(avg, 4) if avg else INITIAL_ALPHA / (INITIAL_ALPHA + INITIAL_BETA),
+                "scoring_model": "Beta-Binomial",
+                "prior": f"Beta({INITIAL_ALPHA}, {INITIAL_BETA})",
                 "enforcement": "enabled (v2.6 — write/delete blocked below 0.3 trust)",
             }
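
The Beta-Binomial update introduced in `record_signal` can be illustrated in isolation. The following is a minimal sketch, not the package's implementation: the prior Beta(2, 1) and the 0.3 blocking threshold come from the diff above, while the values of `DECAY_INTERVAL`, `DECAY_FACTOR`, `ALPHA_FLOOR`, and `BETA_FLOOR`, as well as the `BetaTrust` class and `negatives_until_blocked` helper, are assumptions made for illustration (the real constants and `SIGNAL_WEIGHTS` are defined outside this excerpt).

```python
from dataclasses import dataclass

# Confirmed by the diff: new agents start at Beta(2, 1), i.e. trust 2/3.
INITIAL_ALPHA = 2.0
INITIAL_BETA = 1.0
BLOCK_THRESHOLD = 0.3  # write/delete blocked below this in check_trust()

# ASSUMED values for illustration only; the real constants live in a part
# of trust_scorer.py that this diff excerpt does not show.
DECAY_INTERVAL = 50
DECAY_FACTOR = 0.995
ALPHA_FLOOR = 1.0
BETA_FLOOR = 0.5


@dataclass
class BetaTrust:
    """Hypothetical stand-alone model of one agent's trust state."""
    alpha: float = INITIAL_ALPHA
    beta: float = INITIAL_BETA
    signals: int = 0

    @property
    def score(self) -> float:
        # Posterior mean of Beta(alpha, beta).
        total = self.alpha + self.beta
        return self.alpha / total if total > 0 else 0.0

    def update(self, direction: str, weight: float) -> float:
        # Negative evidence grows beta; positive and neutral grow alpha.
        if direction == "negative":
            self.beta += weight
        else:
            self.alpha += weight
        self.signals += 1
        # Periodically shrink both parameters so old evidence fades.
        if self.signals % DECAY_INTERVAL == 0:
            self.alpha = max(ALPHA_FLOOR, self.alpha * DECAY_FACTOR)
            self.beta = max(BETA_FLOOR, self.beta * DECAY_FACTOR)
        return self.score


def negatives_until_blocked(weight: float) -> int:
    """Count consecutive negative signals needed to fall below the threshold."""
    trust = BetaTrust()
    count = 0
    while trust.score >= BLOCK_THRESHOLD:
        trust.update("negative", weight)
        count += 1
    return count
```

Under these assumptions, a fresh agent needs four consecutive negative signals of weight 1.0 before its score drops below 0.3 (2/7 ≈ 0.286), which is consistent with the docstring's claim that only repeated bad behavior triggers blocking. The actual count depends on the real signal weights.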

package/ui/app.js CHANGED

@@ -1445,9 +1445,9 @@ async function loadAgents() {
             // Trust score
             var tdTrust = document.createElement('td');
-            var trustScore = agent.trust_score != null ? agent.trust_score : 1.0;
-            tdTrust.className = trustScore < 0.7 ? 'text-danger fw-bold'
-                : trustScore < 0.9 ? 'text-warning fw-bold' : 'text-success fw-bold';
+            var trustScore = agent.trust_score != null ? agent.trust_score : 0.667;
+            tdTrust.className = trustScore < 0.3 ? 'text-danger fw-bold'
+                : trustScore < 0.5 ? 'text-warning fw-bold' : 'text-success fw-bold';
             tdTrust.textContent = trustScore.toFixed(2);
             tr.appendChild(tdTrust);
@@ -1524,7 +1524,7 @@ async function loadTrustOverview() {
         card2.className = 'border rounded p-3 text-center';
         var val2 = document.createElement('div');
         val2.className = 'fs-4 fw-bold';
-        val2.textContent = (stats.avg_trust_score || 1.0).toFixed(3);
+        val2.textContent = (stats.avg_trust_score || 0.667).toFixed(3);
         card2.appendChild(val2);
         var lbl2 = document.createElement('small');
         lbl2.className = 'text-muted';

package/ui/js/agents.js CHANGED

@@ -78,9 +78,9 @@ async function loadAgents() {
             tr.appendChild(tdProto);
             var tdTrust = document.createElement('td');
-            var trustScore = agent.trust_score != null ? agent.trust_score : 1.0;
-            tdTrust.className = trustScore < 0.7 ? 'text-danger fw-bold'
-                : trustScore < 0.9 ? 'text-warning fw-bold' : 'text-success fw-bold';
+            var trustScore = agent.trust_score != null ? agent.trust_score : 0.667;
+            tdTrust.className = trustScore < 0.3 ? 'text-danger fw-bold'
+                : trustScore < 0.5 ? 'text-warning fw-bold' : 'text-success fw-bold';
             tdTrust.textContent = trustScore.toFixed(2);
             tr.appendChild(tdTrust);
@@ -133,7 +133,7 @@ async function loadTrustOverview() {
         var cardData = [
             { value: (stats.total_signals || 0).toLocaleString(), label: 'Total Signals Collected', cls: '' },
-            { value: (stats.avg_trust_score || 1.0).toFixed(3), label: 'Average Trust Score', cls: '' },
+            { value: (stats.avg_trust_score || 0.667).toFixed(3), label: 'Average Trust Score', cls: '' },
             { value: stats.enforcement || 'disabled', label: 'Enforcement Status', cls: 'text-info' }
         ];
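
The UI changes realign the color bands with the new scoring model: the danger band now matches the 0.3 enforcement threshold, and the fallback for a missing score is the Beta(2, 1) prior mean rather than 1.0. A minimal Python sketch of the same mapping follows; the function names `trust_css_class` and `display_average` are hypothetical and do not appear in the package.

```python
from typing import Optional

DEFAULT_TRUST = 2.0 / 3.0  # Beta(2, 1) prior mean, shown as 0.667 in the UI
BLOCK_THRESHOLD = 0.3      # same cut-off as check_trust() in trust_scorer.py
WARN_THRESHOLD = 0.5       # warning cut-off used in the updated UI code


def trust_css_class(trust_score: Optional[float]) -> str:
    """Map a trust score to the CSS classes used in the agents table."""
    score = DEFAULT_TRUST if trust_score is None else trust_score
    if score < BLOCK_THRESHOLD:
        return "text-danger fw-bold"
    if score < WARN_THRESHOLD:
        return "text-warning fw-bold"
    return "text-success fw-bold"


def display_average(avg: Optional[float]) -> str:
    """Format the average trust score, falling back only when it is missing."""
    value = DEFAULT_TRUST if avg is None else avg
    return f"{value:.3f}"
```

One design difference is deliberate in this sketch: it falls back to the default only when the value is missing. The JavaScript expression `stats.avg_trust_score || 0.667` also substitutes the default when the average is exactly 0, because 0 is falsy in JavaScript.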