PyPI - knowledge-rag - Versions diffs - 3.6.1__tar.gz → 3.7.0__tar.gz - Mend

knowledge-rag 3.6.1__tar.gz → 3.7.0__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (19)

{knowledge_rag-3.6.1 → knowledge_rag-3.7.0}/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: knowledge-rag
-Version: 3.6.1
+Version: 3.7.0
 Summary: Local RAG System for Claude Code — Hybrid search + Cross-encoder Reranking + 12 MCP Tools + 20 Format Parsers. Zero external servers.
 Project-URL: Homepage, https://github.com/lyonzin/knowledge-rag
 Project-URL: Repository, https://github.com/lyonzin/knowledge-rag
@@ -30,7 +30,7 @@ Requires-Dist: python-docx>=1.0.0
 Requires-Dist: python-pptx>=1.0.0
 Requires-Dist: pyyaml>=6.0
 Requires-Dist: rank-bm25>=0.2.2
-Requires-Dist: requests>=2.31.0
+Requires-Dist: requests>=2.33.0
 Requires-Dist: watchdog>=4.0.0
 Provides-Extra: gpu
 Requires-Dist: onnxruntime-gpu>=1.14.0; extra == 'gpu'
@@ -40,7 +40,9 @@ Description-Content-Type: text/markdown
 <div align="center">
-![Version](https://img.shields.io/badge/version-3.5.2-blue.svg)
+[![PyPI](https://img.shields.io/pypi/v/knowledge-rag)](https://pypi.org/project/knowledge-rag/)
+[![NPM](https://img.shields.io/npm/v/knowledge-rag)](https://www.npmjs.com/package/knowledge-rag)
+[![Downloads](https://static.pepy.tech/badge/knowledge-rag/month)](https://pepy.tech/project/knowledge-rag)
 ![Python](https://img.shields.io/badge/python-3.11%2B-green.svg)
 ![License](https://img.shields.io/badge/license-MIT-yellow.svg)
 ![Platform](https://img.shields.io/badge/platform-Windows%20%7C%20Linux%20%7C%20macOS-lightgrey.svg)
@@ -48,7 +50,6 @@ Description-Content-Type: text/markdown
 [![CI](https://github.com/lyonzin/knowledge-rag/actions/workflows/ci.yml/badge.svg)](https://github.com/lyonzin/knowledge-rag/actions/workflows/ci.yml)
 [![CodeQL](https://github.com/lyonzin/knowledge-rag/actions/workflows/security.yml/badge.svg)](https://github.com/lyonzin/knowledge-rag/actions/workflows/security.yml)
 [![Glama Score](https://glama.ai/mcp/servers/lyonzin/knowledge-rag/badges/score.svg)](https://glama.ai/mcp/servers/lyonzin/knowledge-rag)
-[![PyPI](https://img.shields.io/pypi/v/knowledge-rag)](https://pypi.org/project/knowledge-rag/)
 ### Your docs, your machine, zero cloud. Claude Code searches them natively.
@@ -809,7 +810,7 @@ models:
     dimensions: 384
     gpu: false                         # Set true + pip install knowledge-rag[gpu]
   reranker:
-    enabled: true                      # Set false on low-resource machines
+    enabled: true                      # Falls back to RRF if model is unavailable
     model: "Xenova/ms-marco-MiniLM-L-6-v2"
     top_k_multiplier: 3               # Candidates fetched before reranking
@@ -896,6 +897,8 @@ For `.md` files, chunking splits at `##` and `###` header boundaries first. Sect
 | `models.reranker.model` | `Xenova/ms-marco-MiniLM-L-6-v2` | Reranker model |
 | `models.reranker.top_k_multiplier` | 3 | Fetch N*multiplier candidates for reranking |
+If the reranker model is not available locally and the machine cannot download it, search now falls back to the RRF order from hybrid semantic+BM25 retrieval. This keeps `search_knowledge` available offline, but result ordering may be less precise for ambiguous queries until the reranker model is cached.
 **Embedding model options** (fastest → most accurate):
 - `BAAI/bge-small-en-v1.5` — 384D, ~33MB (default)
 - `BAAI/bge-base-en-v1.5` — 768D, ~130MB
@@ -1026,6 +1029,31 @@ rm -rf models_cache
 # Then restart the MCP server
 ```
+### Reranker model download fails
+The reranker is lazy-loaded on the first query. If the model is not cached and the machine is offline, search continues without reranking and uses the RRF order from hybrid retrieval. To keep reranking enabled offline, run one query while online or pre-populate `models_cache/` on the target machine.
+You can still disable reranking explicitly in `config.yaml`:
+```yaml
+models:
+  reranker:
+    enabled: false
+```
+Disabling reranking reduces memory use and avoids first-query model loading. The tradeoff is lower ranking precision, especially when several chunks match the same terms but only one is the best answer.
+### ChromaDB index crashes on startup
+Native ChromaDB failures can terminate Python before normal exception handling runs. Startup now probes ChromaDB in a child process before initializing the MCP server. If the probe crashes, the active `chroma_db/` and `index_metadata.json` are moved to `data/backups/auto-repair-*`, and the next startup can rebuild a clean index.
+The same guarded behavior is available through either console script:
+```bash
+knowledge-rag
+knowledge-rag-guarded
+```
 ### Index is empty
 ```bash
@@ -1056,7 +1084,7 @@ pip install --upgrade knowledge-rag
 ### Slow first query
-The cross-encoder reranker model is lazy-loaded on the first query. This adds a one-time ~2-3 second delay for model download and loading. Subsequent queries are fast.
+The cross-encoder reranker model is lazy-loaded on the first query. This adds a one-time ~2-3 second delay for model download and loading. Subsequent queries are fast. If the model cannot be loaded, search falls back to RRF ordering and does not retry loading the reranker until the server restarts.
 ### Memory usage
@@ -1066,8 +1094,16 @@ With ~200 documents, expect ~300-500MB RAM. The embedding model (~50MB) and rera
 ## Changelog
-### v3.6.1 (2026-04-23)
+### Unreleased
+- **FIX**: Startup preflight probes ChromaDB in a child process and moves crashing persistent indexes to `data/backups/auto-repair-*` before MCP initialization.
+- **FIX**: Reranker load failures now fall back to RRF ordering instead of failing `search_knowledge` on offline machines.
+- **FIX**: Virtualenv project-root detection now handles Python symlinks that resolve to the system interpreter.
+- **NEW**: `knowledge-rag-guarded` console script kept as an explicit guarded startup alias.
+### v3.6.2 (2026-04-23)
+- **INFRA**: NPM provenance attestation (SLSA supply chain security), full README on npm page
 - **DOCS**: Reorganize Installation section — add NPX and Docker install methods, update What's New to v3.6.0
 ### v3.6.0 (2026-04-23)

{knowledge_rag-3.6.1 → knowledge_rag-3.7.0}/README.md RENAMED

@@ -2,7 +2,9 @@
 <div align="center">
-![Version](https://img.shields.io/badge/version-3.5.2-blue.svg)
+[![PyPI](https://img.shields.io/pypi/v/knowledge-rag)](https://pypi.org/project/knowledge-rag/)
+[![NPM](https://img.shields.io/npm/v/knowledge-rag)](https://www.npmjs.com/package/knowledge-rag)
+[![Downloads](https://static.pepy.tech/badge/knowledge-rag/month)](https://pepy.tech/project/knowledge-rag)
 ![Python](https://img.shields.io/badge/python-3.11%2B-green.svg)
 ![License](https://img.shields.io/badge/license-MIT-yellow.svg)
 ![Platform](https://img.shields.io/badge/platform-Windows%20%7C%20Linux%20%7C%20macOS-lightgrey.svg)
@@ -10,7 +12,6 @@
 [![CI](https://github.com/lyonzin/knowledge-rag/actions/workflows/ci.yml/badge.svg)](https://github.com/lyonzin/knowledge-rag/actions/workflows/ci.yml)
 [![CodeQL](https://github.com/lyonzin/knowledge-rag/actions/workflows/security.yml/badge.svg)](https://github.com/lyonzin/knowledge-rag/actions/workflows/security.yml)
 [![Glama Score](https://glama.ai/mcp/servers/lyonzin/knowledge-rag/badges/score.svg)](https://glama.ai/mcp/servers/lyonzin/knowledge-rag)
-[![PyPI](https://img.shields.io/pypi/v/knowledge-rag)](https://pypi.org/project/knowledge-rag/)
 ### Your docs, your machine, zero cloud. Claude Code searches them natively.
@@ -771,7 +772,7 @@ models:
     dimensions: 384
     gpu: false                         # Set true + pip install knowledge-rag[gpu]
   reranker:
-    enabled: true                      # Set false on low-resource machines
+    enabled: true                      # Falls back to RRF if model is unavailable
     model: "Xenova/ms-marco-MiniLM-L-6-v2"
     top_k_multiplier: 3               # Candidates fetched before reranking
@@ -858,6 +859,8 @@ For `.md` files, chunking splits at `##` and `###` header boundaries first. Sect
 | `models.reranker.model` | `Xenova/ms-marco-MiniLM-L-6-v2` | Reranker model |
 | `models.reranker.top_k_multiplier` | 3 | Fetch N*multiplier candidates for reranking |
+If the reranker model is not available locally and the machine cannot download it, search now falls back to the RRF order from hybrid semantic+BM25 retrieval. This keeps `search_knowledge` available offline, but result ordering may be less precise for ambiguous queries until the reranker model is cached.
 **Embedding model options** (fastest → most accurate):
 - `BAAI/bge-small-en-v1.5` — 384D, ~33MB (default)
 - `BAAI/bge-base-en-v1.5` — 768D, ~130MB
@@ -988,6 +991,31 @@ rm -rf models_cache
 # Then restart the MCP server
 ```
+### Reranker model download fails
+The reranker is lazy-loaded on the first query. If the model is not cached and the machine is offline, search continues without reranking and uses the RRF order from hybrid retrieval. To keep reranking enabled offline, run one query while online or pre-populate `models_cache/` on the target machine.
+You can still disable reranking explicitly in `config.yaml`:
+```yaml
+models:
+  reranker:
+    enabled: false
+```
+Disabling reranking reduces memory use and avoids first-query model loading. The tradeoff is lower ranking precision, especially when several chunks match the same terms but only one is the best answer.
+### ChromaDB index crashes on startup
+Native ChromaDB failures can terminate Python before normal exception handling runs. Startup now probes ChromaDB in a child process before initializing the MCP server. If the probe crashes, the active `chroma_db/` and `index_metadata.json` are moved to `data/backups/auto-repair-*`, and the next startup can rebuild a clean index.
+The same guarded behavior is available through either console script:
+```bash
+knowledge-rag
+knowledge-rag-guarded
+```
 ### Index is empty
 ```bash
@@ -1018,7 +1046,7 @@ pip install --upgrade knowledge-rag
 ### Slow first query
-The cross-encoder reranker model is lazy-loaded on the first query. This adds a one-time ~2-3 second delay for model download and loading. Subsequent queries are fast.
+The cross-encoder reranker model is lazy-loaded on the first query. This adds a one-time ~2-3 second delay for model download and loading. Subsequent queries are fast. If the model cannot be loaded, search falls back to RRF ordering and does not retry loading the reranker until the server restarts.
 ### Memory usage
@@ -1028,8 +1056,16 @@ With ~200 documents, expect ~300-500MB RAM. The embedding model (~50MB) and rera
 ## Changelog
-### v3.6.1 (2026-04-23)
+### Unreleased
+- **FIX**: Startup preflight probes ChromaDB in a child process and moves crashing persistent indexes to `data/backups/auto-repair-*` before MCP initialization.
+- **FIX**: Reranker load failures now fall back to RRF ordering instead of failing `search_knowledge` on offline machines.
+- **FIX**: Virtualenv project-root detection now handles Python symlinks that resolve to the system interpreter.
+- **NEW**: `knowledge-rag-guarded` console script kept as an explicit guarded startup alias.
+### v3.6.2 (2026-04-23)
+- **INFRA**: NPM provenance attestation (SLSA supply chain security), full README on npm page
 - **DOCS**: Reorganize Installation section — add NPX and Docker install methods, update What's New to v3.6.0
 ### v3.6.0 (2026-04-23)
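
The README text in this diff refers to the "RRF order" that search falls back to when the reranker is unavailable. As a rough illustration of reciprocal rank fusion in general, not the package's actual implementation, the sketch below fuses two ranked lists of document IDs. The helper name `rrf_fuse`, the sample IDs, and the smoothing constant `k=60` are assumptions; the diff does not show which constant or data structures knowledge-rag uses.

```python
from collections import defaultdict
from typing import Dict, List


def rrf_fuse(rankings: List[List[str]], k: int = 60) -> List[str]:
    """Fuse ranked ID lists with reciprocal rank fusion (hypothetical helper)."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        # Each list contributes 1 / (k + rank) for every document it contains.
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID so the output is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists from a semantic search and a BM25 search.
semantic = ["a", "b", "c"]
bm25 = ["b", "d", "a"]
fused = rrf_fuse([semantic, bm25])
print(fused)
```

Documents that appear near the top of both lists end up first, which is why the fallback order stays usable even without the cross-encoder.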

{knowledge_rag-3.6.1 → knowledge_rag-3.7.0}/mcp_server/config.py RENAMED

@@ -54,10 +54,11 @@ def _has_documents(path: Path) -> bool:
 def _venv_project_dir():
     """Detect project root from venv location (pip install from PyPI)."""
-    exe = Path(sys.executable).resolve()
-    for parent in exe.parents:
-        if parent.name in ("venv", ".venv", "env", ".env"):
-            return parent.parent
+    candidates = [Path(sys.prefix), Path(sys.executable), Path(sys.executable).resolve()]
+    for candidate in candidates:
+        for parent in (candidate, *candidate.parents):
+            if parent.name in ("venv", ".venv", "env", ".env"):
+                return parent.parent
     return None
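
The hunk above widens the search from the resolved interpreter path to three candidates. A minimal sketch of why that matters, using pure paths so it runs anywhere: when a venv's `python` is a symlink to the system interpreter, the resolved path no longer contains the venv directory. The helper name `find_project_dir` and the example paths are hypothetical.

```python
from pathlib import PurePosixPath
from typing import Iterable, Optional

VENV_NAMES = ("venv", ".venv", "env", ".env")


def find_project_dir(candidates: Iterable[PurePosixPath]) -> Optional[PurePosixPath]:
    """Return the parent of the first venv-named directory on any candidate path."""
    for candidate in candidates:
        # Include the candidate itself, since a prefix path *is* the venv directory.
        for parent in (candidate, *candidate.parents):
            if parent.name in VENV_NAMES:
                return parent.parent
    return None


# Hypothetical layout: .venv/bin/python is a symlink to the system interpreter.
prefix = PurePosixPath("/home/user/project/.venv")
unresolved = PurePosixPath("/home/user/project/.venv/bin/python")
resolved = PurePosixPath("/usr/bin/python3.12")

old_result = find_project_dir([resolved])                      # resolved path only
new_result = find_project_dir([prefix, unresolved, resolved])  # all three candidates
print(old_result, new_result)
```

With only the resolved path the lookup returns `None`; with the prefix or the unresolved executable path it finds the project directory.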

knowledge_rag-3.7.0/mcp_server/guarded.py ADDED

@@ -0,0 +1,10 @@
+"""Backward-compatible guarded console entry point for knowledge-rag."""
+from __future__ import annotations
+from .server import main
+def guarded_main() -> None:
+    """Run the MCP server; server.main performs startup preflight."""
+    main()

knowledge_rag-3.7.0/mcp_server/preflight.py ADDED

@@ -0,0 +1,74 @@
+"""Startup preflight checks for persistent ChromaDB state."""
+from __future__ import annotations
+import os
+import shutil
+import subprocess
+import sys
+from datetime import datetime
+from pathlib import Path
+from .config import BASE_DIR, config
+def _backup_active_index(reason: str) -> Path:
+    """Move active ChromaDB state aside so the server can rebuild cleanly."""
+    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
+    backup_dir = config.data_dir / "backups" / f"auto-repair-{stamp}"
+    backup_dir.mkdir(parents=True, exist_ok=False)
+    if config.chroma_dir.exists():
+        shutil.move(str(config.chroma_dir), str(backup_dir / f"chroma_db.{reason}"))
+    metadata_file = config.data_dir / "index_metadata.json"
+    if metadata_file.exists():
+        shutil.move(str(metadata_file), str(backup_dir / f"index_metadata.{reason}.json"))
+    return backup_dir
+def _probe_chroma(timeout_seconds: int = 30) -> subprocess.CompletedProcess[str]:
+    """Check Chroma in a child process so native crashes do not kill MCP startup."""
+    code = r"""
+import chromadb
+from mcp_server.config import config
+if not config.chroma_dir.exists():
+    print("missing")
+    raise SystemExit(0)
+client = chromadb.PersistentClient(path=str(config.chroma_dir))
+collection = client.get_or_create_collection(name=config.collection_name)
+print(collection.count())
+"""
+    env = os.environ.copy()
+    env.setdefault("KNOWLEDGE_RAG_DIR", str(BASE_DIR))
+    return subprocess.run(
+        [sys.executable, "-c", code],
+        cwd=str(BASE_DIR),
+        env=env,
+        text=True,
+        stdout=subprocess.PIPE,
+        stderr=subprocess.PIPE,
+        timeout=timeout_seconds,
+        check=False,
+    )
+def run_preflight(timeout_seconds: int = 30) -> bool:
+    """Return True when active Chroma state was moved aside for repair."""
+    result = _probe_chroma(timeout_seconds=timeout_seconds)
+    if result.returncode == 0:
+        return False
+    reason = "segfault" if result.returncode in (-11, 139) else "failed"
+    backup_dir = _backup_active_index(reason)
+    print(
+        f"[RECOVERY] Chroma preflight failed with code {result.returncode}; moved active index to {backup_dir}",
+        file=sys.stderr,
+    )
+    if result.stderr:
+        print(result.stderr[-2000:], file=sys.stderr)
+    return True
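
The probe above relies on the child's exit status to tell a clean run from a crash. The following sketch isolates that idea without ChromaDB: it runs a snippet in a child interpreter and classifies the result. The names `classify_exit` and `probe`, and the extra `"ok"` and `"timeout"` labels, are assumptions added for illustration; the diff itself only uses `"segfault"` and `"failed"`. `subprocess.run` raises `subprocess.TimeoutExpired` when the timeout elapses, so the sketch handles that case explicitly.

```python
import subprocess
import sys


def classify_exit(returncode: int) -> str:
    """Map a child exit status to a short label (hypothetical helper)."""
    if returncode == 0:
        return "ok"
    # -11 is how subprocess reports death by SIGSEGV on POSIX; 139 is the
    # shell-style encoding (128 + 11) of the same signal.
    return "segfault" if returncode in (-11, 139) else "failed"


def probe(code: str, timeout_seconds: int = 30) -> str:
    """Run `code` in a child interpreter and classify how it ended."""
    try:
        result = subprocess.run(
            [sys.executable, "-c", code],
            text=True,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            timeout=timeout_seconds,
            check=False,  # inspect returncode ourselves instead of raising
        )
    except subprocess.TimeoutExpired:
        # Assumption: a hung probe is reported as its own outcome.
        return "timeout"
    return classify_exit(result.returncode)


healthy = probe("print('fine')")
broken = probe("raise SystemExit(3)")
print(healthy, broken)
```

Because the risky work happens in the child, a native crash there shows up only as a return code in the parent, which can then decide whether to move the index aside.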

{knowledge_rag-3.6.1 → knowledge_rag-3.7.0}/mcp_server/server.py RENAMED

@@ -248,13 +248,22 @@ class CrossEncoderReranker:
     def __init__(self, model: str = None):
         self.model_name = model or config.reranker_model
         self._model = None  # Lazy init
+        self._load_failed = False
-    def _ensure_model(self):
+    def _ensure_model(self) -> bool:
         """Lazy initialization of cross-encoder model"""
+        if self._load_failed:
+            return False
         if self._model is None:
             print(f"[INFO] Loading reranker model: {self.model_name}...")
-            self._model = TextCrossEncoder(model_name=self.model_name, cache_dir=str(config.models_cache_dir))
-            print("[INFO] Reranker model loaded successfully")
+            try:
+                self._model = TextCrossEncoder(model_name=self.model_name, cache_dir=str(config.models_cache_dir))
+                print("[INFO] Reranker model loaded successfully")
+            except Exception as e:
+                self._load_failed = True
+                print(f"[WARN] Reranker unavailable, using RRF order: {e}")
+                return False
+        return True
     def rerank(self, query: str, documents: List[Dict[str, Any]], top_k: int = 5) -> List[Dict[str, Any]]:
         """
@@ -271,7 +280,8 @@ class CrossEncoderReranker:
         if not documents or not config.reranker_enabled:
             return documents[:top_k]
-        self._ensure_model()
+        if not self._ensure_model():
+            return documents[:top_k]
         texts = [doc.get("document", "") for doc in documents]
@@ -1924,6 +1934,10 @@ def main():
         _handle_init()
         return
+    from .preflight import run_preflight
+    run_preflight()
     orchestrator = get_orchestrator()
     # Migration: check dimension mismatch AFTER full init (avoids segfault during __init__)
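
The reranker change in this file combines lazy loading with a sticky failure flag: one failed load disables further attempts until restart, and callers receive the incoming order truncated to `top_k`. A self-contained sketch of that pattern follows. `LazyReranker`, the injectable `loader`, the `load_attempts` counter, and `offline_loader` are hypothetical stand-ins so the example runs without fastembed.

```python
from typing import Any, Callable, Dict, List, Optional

Scorer = Callable[[str, str], float]


class LazyReranker:
    """Hypothetical stand-in for the load-once-or-give-up pattern."""

    def __init__(self, loader: Callable[[], Scorer]):
        self._loader = loader
        self._score: Optional[Scorer] = None  # lazy init
        self._load_failed = False
        self.load_attempts = 0  # for demonstration only

    def _ensure_model(self) -> bool:
        if self._load_failed:
            return False  # do not retry after a failed load
        if self._score is None:
            self.load_attempts += 1
            try:
                self._score = self._loader()
            except Exception:
                self._load_failed = True
                return False
        return True

    def rerank(self, query: str, documents: List[Dict[str, Any]], top_k: int = 5) -> List[Dict[str, Any]]:
        if not documents or not self._ensure_model():
            return documents[:top_k]  # keep the incoming order
        score = self._score
        ranked = sorted(documents, key=lambda d: score(query, d.get("document", "")), reverse=True)
        return ranked[:top_k]


def offline_loader() -> Scorer:
    raise OSError("model not cached and network unavailable")


docs = [{"document": "first"}, {"document": "second"}, {"document": "third"}]
reranker = LazyReranker(offline_loader)
first_call = reranker.rerank("q", docs, top_k=2)
second_call = reranker.rerank("q", docs, top_k=2)
print(first_call, second_call, reranker.load_attempts)
```

The second call returns immediately without another load attempt, which matches the README note that loading is not retried until the server restarts.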

{knowledge_rag-3.6.1 → knowledge_rag-3.7.0}/npm/README.md RENAMED

@@ -2,7 +2,7 @@
 # Knowledge RAG
-Local RAG system for Claude Code. Hybrid BM25 + semantic search with cross-encoder reranking. 12 MCP tools. Zero external servers. Everything runs on your machine.
+Local RAG system for Claude Code. Hybrid BM25 + semantic search with cross-encoder reranking. 12 MCP tools, 20 format parsers. Zero external servers. Everything runs on your machine.
 ## Quick Start

{knowledge_rag-3.6.1 → knowledge_rag-3.7.0}/pyproject.toml RENAMED

@@ -4,7 +4,7 @@ build-backend = "hatchling.build"
 [project]
 name = "knowledge-rag"
-version = "3.6.1"
+version = "3.7.0"
 description = "Local RAG System for Claude Code — Hybrid search + Cross-encoder Reranking + 12 MCP Tools + 20 Format Parsers. Zero external servers."
 readme = "README.md"
 license = {text = "MIT"}
@@ -34,7 +34,7 @@ dependencies = [
     "fastembed[reranking]>=0.4.0",
     "mcp>=1.0.0",
     "rank-bm25>=0.2.2",
-    "requests>=2.31.0",
+    "requests>=2.33.0",
     "beautifulsoup4>=4.12.0",
     "python-docx>=1.0.0",
     "openpyxl>=3.1.0",
@@ -54,6 +54,7 @@ Changelog = "https://github.com/lyonzin/knowledge-rag/releases"
 [project.scripts]
 knowledge-rag = "mcp_server.server:main"
+knowledge-rag-guarded = "mcp_server.guarded:guarded_main"
 [tool.hatch.build.targets.wheel]
 packages = ["mcp_server"]

{knowledge_rag-3.6.1 → knowledge_rag-3.7.0}/requirements.txt RENAMED

@@ -1,6 +1,6 @@
 # Knowledge RAG System - Python Dependencies
 # ==========================================
-# Requires Python 3.11 or 3.12 (NOT 3.13+ due to onnxruntime)
+# Requires Python 3.11+ (3.11, 3.12, 3.13, 3.14 supported)
 # Vector Database (uses new PersistentClient API)
 chromadb>=1.4.0
@@ -19,7 +19,7 @@ mcp>=1.0.0
 rank-bm25>=0.2.2
 # URL content fetching (add_from_url tool)
-requests>=2.31.0
+requests>=2.33.0
 # HTML parsing (add_from_url tool)
 beautifulsoup4>=4.12.0
@@ -44,6 +44,6 @@ watchdog>=4.0.0
 # 2. Default embedding model: BAAI/bge-small-en-v1.5 (384-dim)
 #    Cached in ~/.cache/fastembed/
 #
-# 3. Python 3.13+ is NOT supported because chromadb
-#    depends on onnxruntime which has no 3.13 wheels
+# 3. Python 3.13+ is supported since v3.5.1
+#    (onnxruntime now ships wheels for 3.13 and 3.14)
 # ==========================================

{knowledge_rag-3.6.1 → knowledge_rag-3.7.0}/.gitignore RENAMED

File without changes

{knowledge_rag-3.6.1 → knowledge_rag-3.7.0}/LICENSE RENAMED

File without changes

{knowledge_rag-3.6.1 → knowledge_rag-3.7.0}/config.example.yaml RENAMED

File without changes

{knowledge_rag-3.6.1 → knowledge_rag-3.7.0}/documents/examples/sample-document.md RENAMED

File without changes

{knowledge_rag-3.6.1 → knowledge_rag-3.7.0}/mcp_server/__init__.py RENAMED

File without changes

{knowledge_rag-3.6.1 → knowledge_rag-3.7.0}/mcp_server/ingestion.py RENAMED

File without changes

{knowledge_rag-3.6.1 → knowledge_rag-3.7.0}/presets/cybersecurity.yaml RENAMED

File without changes

{knowledge_rag-3.6.1 → knowledge_rag-3.7.0}/presets/developer.yaml RENAMED

File without changes

{knowledge_rag-3.6.1 → knowledge_rag-3.7.0}/presets/general.yaml RENAMED

File without changes

{knowledge_rag-3.6.1 → knowledge_rag-3.7.0}/presets/research.yaml RENAMED

File without changes
