PyPI - halflife-rag - Versions diffs - 0.1.0__tar.gz - Mend

halflife-rag 0.1.0__tar.gz

Files changed (35)

halflife_rag-0.1.0/PKG-INFO +15 -0
halflife_rag-0.1.0/README.md +140 -0
halflife_rag-0.1.0/api/main.py +142 -0
halflife_rag-0.1.0/engine/classifier/doc_type.py +54 -0
halflife_rag-0.1.0/engine/classifier/query_intent.py +91 -0
halflife_rag-0.1.0/engine/decay/base.py +14 -0
halflife_rag-0.1.0/engine/decay/exponential.py +18 -0
halflife_rag-0.1.0/engine/decay/learned.py +46 -0
halflife_rag-0.1.0/engine/decay/learned_model.py +249 -0
halflife_rag-0.1.0/engine/decay/piecewise.py +21 -0
halflife_rag-0.1.0/engine/decay/registry.py +16 -0
halflife_rag-0.1.0/engine/events/bus.py +33 -0
halflife_rag-0.1.0/engine/feedback/updater.py +43 -0
halflife_rag-0.1.0/engine/fusion/consistency.py +46 -0
halflife_rag-0.1.0/engine/fusion/reranker.py +199 -0
halflife_rag-0.1.0/engine/ingestion/pipeline.py +204 -0
halflife_rag-0.1.0/engine/store/redis_store.py +149 -0
halflife_rag-0.1.0/halflife_rag.egg-info/PKG-INFO +15 -0
halflife_rag-0.1.0/halflife_rag.egg-info/SOURCES.txt +33 -0
halflife_rag-0.1.0/halflife_rag.egg-info/dependency_links.txt +1 -0
halflife_rag-0.1.0/halflife_rag.egg-info/entry_points.txt +2 -0
halflife_rag-0.1.0/halflife_rag.egg-info/requires.txt +9 -0
halflife_rag-0.1.0/halflife_rag.egg-info/top_level.txt +3 -0
halflife_rag-0.1.0/pyproject.toml +30 -0
halflife_rag-0.1.0/scripts/benchmark.py +461 -0
halflife_rag-0.1.0/scripts/cli.py +63 -0
halflife_rag-0.1.0/scripts/corpus.py +445 -0
halflife_rag-0.1.0/scripts/generate_benchmark_data.py +103 -0
halflife_rag-0.1.0/scripts/quickstart.py +98 -0
halflife_rag-0.1.0/scripts/run_benchmark.py +24 -0
halflife_rag-0.1.0/scripts/train_mlp.py +335 -0
halflife_rag-0.1.0/scripts/visualize_decay.py +51 -0
halflife_rag-0.1.0/setup.cfg +4 -0
halflife_rag-0.1.0/tests/test_benchmark.py +393 -0
halflife_rag-0.1.0/tests/test_decay.py +46 -0

halflife_rag-0.1.0/PKG-INFO ADDED

@@ -0,0 +1,15 @@
+Metadata-Version: 2.4
+Name: halflife-rag
+Version: 0.1.0
+Summary: Temporal-aware re-ranking engine for RAG
+Author-email: Your Name <you@example.com>
+Requires-Python: >=3.10
+Requires-Dist: fastapi
+Requires-Dist: uvicorn
+Requires-Dist: redis
+Requires-Dist: qdrant-client
+Requires-Dist: numpy
+Requires-Dist: scipy
+Requires-Dist: sentence-transformers
+Requires-Dist: pydantic
+Requires-Dist: python-dotenv

halflife_rag-0.1.0/README.md ADDED

@@ -0,0 +1,140 @@
+# HalfLife
+**Temporal-Aware Chunk Re-Ranking Engine for Retrieval-Augmented Generation (RAG)**
+HalfLife is a plug-and-play middleware that enhances any RAG pipeline by re-ranking retrieved chunks using **temporal signals**, **decay functions**, and **multi-factor scoring**.
+Instead of relying solely on semantic similarity, HalfLife introduces a **time-aware ranking layer** that improves freshness, relevance, and contextual correctness in generated responses.
+---
+## ✨ Why HalfLife?
+Traditional RAG systems rank documents using:
+```
+relevance ≈ semantic similarity(query, document)
+```
+HalfLife extends this to:
+```
+relevance = f(semantic_similarity, temporal_decay, trust, priors)
+```
+This enables:
+* Better handling of **time-sensitive queries**
+* Reduced reliance on **outdated information**
+* Improved **context diversity across time**
+* More **robust and explainable retrieval pipelines**
+---
+## 🧠 Core Idea
+HalfLife sits between your retriever (e.g., Qdrant) and your LLM:
+```
+Retriever → HalfLife → LLM
+```
+It **re-scores and reorders chunks** before they are passed into the model.
+---
+## ⚙️ Core Features
+### 🔍 1. Plug-and-Play Reranking
+HalfLife sits between your retriever (e.g., Qdrant) and your LLM. It **re-scores and reorders chunks** before they reach the model.
+### ⏳ 2. Multi-Strategy Decay
+Supports modular decay functions via a central registry:
+*   **Exponential**: Standard time-based decay.
+*   **Piecewise**: Different decay rates for recent vs. historical windows.
+*   **Learned (NEW)**: Features a pure-NumPy MLP (`DecayMLP`) that predicts the optimal $\lambda$ at ingestion time based on document type, source, and feedback.
+### 🧠 3. Intent-Aware Fusion
+HalfLife automatically classifies user queries into **Fresh**, **Historical**, or **Static** intents and adapts its scoring weights accordingly:
+*   **Fresh Query**: Penalizes older results to surface recent breakthroughs.
+*   **Historical Query**: Inverts the decay signal to pull older source documents to the top.
+---
+## 🏗️ Architecture
+```
+User Query
+    ↓
+Intent Classifier (Fresh / Historical / Static)
+    ↓
+Vector Retrieval (Qdrant)
+    ↓
+HalfLife Engine
+    ├── Score Fetch (Redis-backed)
+    ├── Learned λ Prediction (MLP)
+    └── Intent-Aware Fusion
+    ↓
+Re-ranked Chunks
+```
+---
+## 🛠️ Getting Started (Developer Experience)
+### 1. Install via Pip (Package Mode)
+HalfLife is now a standard Python package. You can install it and use the `halflife` CLI:
+```bash
+git clone https://github.com/yourusername/halflife.git
+pip install -e ./halflife
+```
+### 2. Launch Services
+Infrastructure is managed via Docker:
+```bash
+docker-compose up -d
+```
+### 3. Unified CLI
+Use the `halflife` command for all common tasks:
+```bash
+# Run the end-to-end quickstart
+halflife quickstart
+# Start the API server
+halflife serve --port 8000
+# Run evaluation benchmarks
+halflife benchmark --output results.json
+```
+---
+## 🧪 Evaluation & Rigour: The Decoy Mechanism
+To ensure HalfLife's effectiveness, we built a **108-chunk synthetic corpus** containing "Decoys". For every relevant chunk, there is a decoy with **identical text but a different timestamp**.
+Because their embeddings are identical, standard cosine similarity cannot separate them. Only HalfLife's temporal engine can correctly surface the right chunk, providing a rigorous test for your RAG pipeline's time-awareness.
+---
+## 🧬 Learned Decay Workflow
+1.  **Collect Baseline**: Run `halflife benchmark --output run_001.json`.
+2.  **Train the MLP**: Run `halflife train --results run_001.json`.
+3.  **Deploy**: The engine automatically loads `decay_mlp.npz` and starts predicting $\lambda$ for all new ingested chunks.
+---
+## 🧩 Status & Roadmap
+*   [x] **Phase 1**: Core Decay Engine & Redis Metadata Store.
+*   [x] **Phase 2**: Intent-Aware Fusion & Historical Inversion.
+*   [x] **Phase 3**: Learned Decay MLP & Benchmark Harness.
+*   [ ] **Phase 4**: Event-Driven Fact Supersession (In Progress).
+*   [ ] **Phase 5**: Multi-Vector Store SDKs (Pinecone, Weaviate).
+---
+## 📄 License & Contributing
+MIT License. Contributions are welcome for new decay functions and integration plugins!
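
The decoy mechanism described in the README can be illustrated with a small sketch. It assumes toy three-dimensional embeddings, a plain weighted-sum fusion, and the 0.3 / 0.6 weights that `query_intent.py` assigns to fresh queries. The helper names `cosine` and `exp_decay` are hypothetical, and the package's own reranker may combine the signals differently.

```python
import math
from datetime import datetime, timedelta, timezone


def cosine(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def exp_decay(timestamp, now, lam=1e-6):
    """e^(-lambda * age_in_seconds), with negative ages clamped to zero."""
    age = max(0.0, (now - timestamp).total_seconds())
    return math.exp(-lam * age)


now = datetime(2025, 1, 1, tzinfo=timezone.utc)
query_vec = [0.2, 0.9, 0.1]
embedding = [0.25, 0.85, 0.05]  # shared by both chunks: identical text

chunks = [
    {"id": "relevant", "vec": embedding, "timestamp": now - timedelta(days=2)},
    {"id": "decoy", "vec": embedding, "timestamp": now - timedelta(days=400)},
]

for c in chunks:
    c["vector_score"] = cosine(query_vec, c["vec"])
    c["temporal_score"] = exp_decay(c["timestamp"], now)
    # Hypothetical linear fusion; 0.3 / 0.6 mirror the "fresh" weights.
    c["fused"] = 0.3 * c["vector_score"] + 0.6 * c["temporal_score"]

ranked = sorted(chunks, key=lambda c: c["fused"], reverse=True)
print([c["id"] for c in ranked])  # ['relevant', 'decoy']
```

Both chunks receive the same `vector_score`, so only the temporal term decides the order.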

halflife_rag-0.1.0/api/main.py ADDED

@@ -0,0 +1,142 @@
+import os
+import logging
+from typing import List, Dict, Optional
+from fastapi import FastAPI, HTTPException
+from pydantic import BaseModel, Field
+from engine.store.redis_store import RedisStore
+from engine.fusion.reranker import Reranker
+from engine.classifier.query_intent import QueryIntentClassifier
+logging.basicConfig(level=logging.INFO)
+logger = logging.getLogger(__name__)
+app = FastAPI(title="HalfLife Re-ranking API", version="0.2.0")
+REDIS_URL = os.getenv("REDIS_URL", "redis://localhost:6379")
+store      = RedisStore(url=REDIS_URL)
+reranker   = Reranker(store)
+classifier = QueryIntentClassifier()
+# ------------------------------------------------------------------ #
+#  Request / response models                                          #
+# ------------------------------------------------------------------ #
+class ChunkInput(BaseModel):
+    id:      str
+    score:   float = Field(..., ge=0.0, le=1.0)
+    payload: Dict  = Field(default_factory=dict,
+                           description="Qdrant payload — must include 'timestamp'")
+class RerankRequest(BaseModel):
+    query:   str
+    chunks:  List[ChunkInput]
+    top_k:   int             = Field(10, ge=1, le=100)
+    weights: Optional[Dict]  = None   # override auto-weights from classifier
+class MetadataIngestRequest(BaseModel):
+    chunk_id:     str
+    decay_type:   str   = "exponential"
+    decay_params: Dict  = Field(default_factory=lambda: {"lambda": 1e-6})
+    trust_score:  float = Field(0.5, ge=0.0, le=1.0)
+class FeedbackRequest(BaseModel):
+    chunk_id:   str
+    was_useful: bool
+# ------------------------------------------------------------------ #
+#  Endpoints                                                          #
+# ------------------------------------------------------------------ #
+@app.get("/health")
+def health_check():
+    redis_ok = False
+    try:
+        redis_ok = store.client.ping() if store.client else False
+    except Exception:
+        pass
+    return {"status": "ok", "redis": redis_ok}
+@app.post("/rerank")
+def rerank_endpoint(req: RerankRequest):
+    """
+    Main middleware endpoint.
+    """
+    try:
+        # Classify query intent → weights + intent label
+        classification = classifier.classify(req.query)
+        weights = req.weights or classification["weights"]
+        intent  = classification["intent"]
+        chunks_as_dicts = [c.model_dump() for c in req.chunks]
+        result = reranker.rerank(
+            query=req.query,
+            chunks=chunks_as_dicts,
+            top_k=req.top_k,
+            weights=weights,
+            intent=intent,
+        )
+        return {
+            **result,
+            "query_intent": intent,
+        }
+    except Exception as e:
+        logger.exception("Rerank failed")
+        raise HTTPException(status_code=500, detail=str(e))
+@app.post("/ingest/metadata")
+def ingest_metadata(req: MetadataIngestRequest):
+    """
+    Directly write Redis metadata for a chunk.
+    """
+    metadata = RedisStore.build_metadata(
+        chunk_id=req.chunk_id,
+        decay_type=req.decay_type,
+        decay_params=req.decay_params,
+        trust_score=req.trust_score,
+    )
+    store.set_chunk(req.chunk_id, metadata)
+    store.mark_dirty(req.chunk_id)
+    return {"status": "ingested", "chunk_id": req.chunk_id}
+@app.post("/feedback")
+def feedback_endpoint(req: FeedbackRequest):
+    """
+    Log chunk utility signal. Marks cache dirty.
+    """
+    from engine.feedback.updater import FeedbackUpdater
+    updater = FeedbackUpdater(store)
+    updater.log_feedback(req.chunk_id, req.was_useful)
+    store.mark_dirty(req.chunk_id)
+    store.increment_feedback(req.chunk_id, req.was_useful)
+    return {"status": "recorded", "chunk_id": req.chunk_id}
+@app.get("/chunks/{chunk_id}/debug")
+def debug_chunk(chunk_id: str):
+    """
+    Inspect the full Redis state for a chunk.
+    """
+    metadata = store.get_chunk(chunk_id)
+    if not metadata:
+        raise HTTPException(status_code=404, detail=f"No metadata for chunk {chunk_id}")
+    cached_score    = store.get_cached_score(chunk_id)
+    feedback_counts = store.get_feedback_counts(chunk_id)
+    return {
+        "chunk_id":       chunk_id,
+        "metadata":       metadata,
+        "cached_score":   cached_score,
+        "feedback_counts": feedback_counts,
+        "dirty":          cached_score is None,
+    }
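
A client has to shape its retriever hits to match `RerankRequest` before calling `/rerank`. The sketch below is one way to do that, under several assumptions: `build_rerank_payload` is a hypothetical helper, the `hits` dicts are an illustrative intermediate format rather than a Qdrant type, and the ISO-8601 timestamp strings are a guess because the expected timestamp format is not stated in `api/main.py`.

```python
import json


def build_rerank_payload(query, hits, top_k=10):
    """Shape retriever hits into the JSON body that RerankRequest describes.

    `hits` is assumed to be a list of dicts with 'id', 'score' and 'payload'
    keys (a hypothetical intermediate format).
    """
    if not 1 <= top_k <= 100:
        raise ValueError("top_k must be between 1 and 100")
    chunks = []
    for hit in hits:
        payload = dict(hit.get("payload") or {})
        if "timestamp" not in payload:
            raise ValueError(f"chunk {hit['id']!r} has no 'timestamp' in payload")
        # ChunkInput rejects scores outside [0, 1]; clamping is one option,
        # rescaling would be another.
        score = min(1.0, max(0.0, float(hit["score"])))
        chunks.append({"id": str(hit["id"]), "score": score, "payload": payload})
    return {"query": query, "chunks": chunks, "top_k": top_k}


hits = [
    {"id": 1, "score": 0.91, "payload": {"timestamp": "2025-01-01T00:00:00Z"}},
    {"id": 2, "score": -0.05, "payload": {"timestamp": "2023-06-01T00:00:00Z"}},
]
body = build_rerank_payload("latest BERT papers", hits, top_k=5)
print(json.dumps(body, indent=2))
```

The resulting dict can be sent as the JSON body of a `POST /rerank` request. Per the endpoint above, the response adds a `query_intent` field to whatever the reranker returns.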

halflife_rag-0.1.0/engine/classifier/doc_type.py ADDED

@@ -0,0 +1,54 @@
+from typing import Dict
+class DocTypeClassifier:
+    """
+    Classifies a text chunk into a document category:
+    - news: Fast decay (exponential, lambda=1e-5)
+    - documentation: Stable/Step decay (piecewise, default params)
+    - research: Slow/Landmark decay (exponential, lambda=1e-7)
+    - generic: Fallback when no keyword matches (exponential, lambda=1e-6)
+    """
+    NEWS_KEYWORDS = {"breaking", "today", "flash", "update", "newsworthy"}
+    DOCS_KEYWORDS = {"version", "release", "api", "usage", "compatibility"}
+    RESEARCH_KEYWORDS = {"abstract", "paper", "methodology", "citation", "experiment"}
+    def classify(self, text: str) -> Dict:
+        """
+        Classifies the document type and returns initial decay settings.
+        """
+        text_lower = text.lower()
+        # Check for News
+        if any(kw in text_lower for kw in self.NEWS_KEYWORDS):
+            return {
+                "doc_type": "news",
+                "decay_type": "exponential",
+                "decay_params": {"lambda": 1e-5}, # ~19 hour half-life (ln 2 / lambda)
+                "trust_score": 0.6
+            }
+        # Check for Documentation
+        if any(kw in text_lower for kw in self.DOCS_KEYWORDS):
+            return {
+                "doc_type": "documentation",
+                "decay_type": "piecewise",
+                "decay_params": {}, # Using defaults in piecewise
+                "trust_score": 0.8
+            }
+        # Check for Research
+        if any(kw in text_lower for kw in self.RESEARCH_KEYWORDS):
+            return {
+                "doc_type": "research",
+                "decay_type": "exponential",
+                "decay_params": {"lambda": 1e-7}, # Landmark papers
+                "trust_score": 0.9
+            }
+        # Default fallback
+        return {
+            "doc_type": "generic",
+            "decay_type": "exponential",
+            "decay_params": {"lambda": 1e-6}, # ~8 days half-life
+            "trust_score": 0.5
+        }
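
The classifier above tests keywords with `kw in text_lower`, which is a substring check, so a keyword such as "api" also fires inside words like "rapid" or "therapist". The sketch below shows a word-boundary alternative using the standard `re` module. `compile_keywords` and `DOCS_PATTERN` are hypothetical names and this is not the package's implementation.

```python
import re

DOCS_KEYWORDS = {"version", "release", "api", "usage", "compatibility"}


def compile_keywords(keywords):
    """Build one regex that matches any keyword as a whole word or phrase."""
    ordered = sorted(keywords, key=len, reverse=True)
    alternatives = "|".join(re.escape(kw) for kw in ordered)
    return re.compile(rf"\b(?:{alternatives})\b", re.IGNORECASE)


DOCS_PATTERN = compile_keywords(DOCS_KEYWORDS)

text = "Rapid prototyping for therapists"

# Substring check, as in the classifier: "api" is found inside "Rapid".
substring_hit = any(kw in text.lower() for kw in DOCS_KEYWORDS)
# Whole-word check: no keyword appears as a standalone word.
boundary_hit = DOCS_PATTERN.search(text) is not None

print(substring_hit, boundary_hit)  # True False
```

The trade-off is that whole-word matching no longer catches inflected forms such as "releases" or "versions", so the keyword sets would need those variants added explicitly.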

halflife_rag-0.1.0/engine/classifier/query_intent.py ADDED

@@ -0,0 +1,91 @@
+import logging
+from typing import Dict
+logger = logging.getLogger(__name__)
+class QueryIntentClassifier:
+    """
+    Classifies the temporal intent of a query and returns fusion weights.
+    Intent categories:
+        fresh      — user wants current information ("latest", "recent")
+                     β (temporal) is high, α (vector) is lower
+        historical — user wants evolution or past state ("history of", "how did X evolve")
+                     β is kept moderate but the reranker INVERTS temporal_score
+                     so that older chunks rank higher (see reranker.py)
+        static     — time-agnostic ("what is", "define", "explain")
+                     α (vector) dominates, temporal signal is minimal
+    The reranker consumes both 'weights' and 'intent' from this output.
+    Weights alone are not enough for historical queries — the inversion
+    flag is what actually surfaces old content.
+    """
+    FRESH_KEYWORDS = {
+        "latest", "recent", "newest", "current", "now",
+        "today", "this week", "this month", "breaking", "just",
+        "updated", "new", "2024", "2025",
+    }
+    HISTORICAL_KEYWORDS = {
+        "history", "historical", "evolution", "evolved", "origins",
+        "background", "originally", "used to", "how did", "first version",
+        "early", "founded", "invented", "introduced", "over the years",
+        "timeline", "progression",
+    }
+    def classify(self, query: str) -> Dict:
+        """
+        Returns:
+            {
+                "intent":  "fresh" | "historical" | "static",
+                "weights": {"vector": float, "temporal": float, "trust": float},
+            }
+        """
+        q = query.lower()
+        if any(kw in q for kw in self.FRESH_KEYWORDS):
+            return {
+                "intent": "fresh",
+                "weights": {
+                    "vector":   0.3,
+                    "temporal": 0.6,
+                    "trust":    0.1,
+                },
+            }
+        if any(kw in q for kw in self.HISTORICAL_KEYWORDS):
+            # NOTE: weights here look similar to static, but the reranker
+            # receives intent="historical" and inverts temporal_score.
+            # The result: temporal weight still matters, but it now rewards
+            # old chunks instead of fresh ones.
+            return {
+                "intent": "historical",
+                "weights": {
+                    "vector":   0.5,
+                    "temporal": 0.3,
+                    "trust":    0.2,
+                },
+            }
+        # Default: static / time-agnostic
+        return {
+            "intent": "static",
+            "weights": {
+                "vector":   0.7,
+                "temporal": 0.1,
+                "trust":    0.2,
+            },
+        }
+if __name__ == "__main__":
+    clf = QueryIntentClassifier()
+    for q in [
+        "latest BERT papers",
+        "history of transformer architectures",
+        "what is attention mechanism",
+    ]:
+        result = clf.classify(q)
+        print(f"{q!r:45s} → intent={result['intent']}, weights={result['weights']}")
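
The docstring above says the reranker inverts `temporal_score` for historical queries, but `reranker.py` is not shown in this section. The sketch below is therefore only an assumption about how such a fusion could work: it treats the combination as a weighted sum and the inversion as `1 - temporal_score`. The function name `fuse` is hypothetical.

```python
def fuse(vector_score, temporal_score, trust_score, weights, intent):
    """Weighted sum of the three signals; historical intent flips the temporal one.

    Assumption: inversion means 1 - temporal_score. The package's reranker.py
    may implement it differently.
    """
    if intent == "historical":
        temporal_score = 1.0 - temporal_score
    return (
        weights["vector"] * vector_score
        + weights["temporal"] * temporal_score
        + weights["trust"] * trust_score
    )


# Weights returned by QueryIntentClassifier for historical queries.
historical = {"vector": 0.5, "temporal": 0.3, "trust": 0.2}

# Two chunks with equal similarity and trust; only their freshness differs.
old = fuse(0.8, temporal_score=0.05, trust_score=0.5,
           weights=historical, intent="historical")
new = fuse(0.8, temporal_score=0.95, trust_score=0.5,
           weights=historical, intent="historical")
print(old > new)  # True: the older chunk wins under historical intent
```

With the same weights but any other intent label, the ordering reverses, which is why the weights alone cannot surface old content.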

halflife_rag-0.1.0/engine/decay/base.py ADDED

@@ -0,0 +1,14 @@
+from abc import ABC, abstractmethod
+from datetime import datetime
+class DecayFunction(ABC):
+    def __init__(self, params: dict):
+        self.params = params
+    @abstractmethod
+    def compute(self, timestamp: datetime, now: datetime) -> float:
+        """
+        Compute the decay score for a given timestamp and the current time.
+        Returns a value between 0.0 and 1.0.
+        """
+        pass
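
A new decay strategy only needs to subclass `DecayFunction` and implement `compute`. The sketch below shows what that could look like. `LinearDecay` and its `max_age_seconds` parameter are hypothetical and not part of the package, and the base class is copied locally so the example runs on its own.

```python
from abc import ABC, abstractmethod
from datetime import datetime, timedelta, timezone


class DecayFunction(ABC):
    """Local copy of the interface from engine/decay/base.py."""

    def __init__(self, params: dict):
        self.params = params

    @abstractmethod
    def compute(self, timestamp: datetime, now: datetime) -> float:
        ...


class LinearDecay(DecayFunction):
    """Hypothetical example: score falls linearly to 0 at `max_age_seconds`."""

    def compute(self, timestamp: datetime, now: datetime) -> float:
        max_age = self.params.get("max_age_seconds", 30 * 86400)
        # Clamp negative ages (future timestamps) to zero, as ExponentialDecay does.
        age = max(0.0, (now - timestamp).total_seconds())
        return max(0.0, 1.0 - age / max_age)


now = datetime(2025, 1, 31, tzinfo=timezone.utc)
decay = LinearDecay({"max_age_seconds": 30 * 86400})
print(decay.compute(now - timedelta(days=15), now))  # 0.5
print(decay.compute(now - timedelta(days=90), now))  # 0.0
```

Both clamps keep the result inside the 0.0 to 1.0 range that the base-class docstring requires.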

halflife_rag-0.1.0/engine/decay/exponential.py ADDED

@@ -0,0 +1,18 @@
+import math
+from datetime import datetime
+from .base import DecayFunction
+class ExponentialDecay(DecayFunction):
+    """
+    Exponential decay: score = e^(-lambda * delta_time)
+    Good for news and fast-moving trends.
+    """
+    def compute(self, timestamp: datetime, now: datetime) -> float:
+        delta_seconds = (now - timestamp).total_seconds()
+        # Ensure delta_seconds is not negative (e.g., if there's a minor clock drift)
+        delta_seconds = max(0, delta_seconds)
+        # lambda_ is the decay constant. Default: 1e-6 (roughly half-life of 8 days)
+        lambda_ = self.params.get("lambda", 1e-6)
+        return math.exp(-lambda_ * delta_seconds)
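
Because the score is e^(-λ·Δt) with Δt in seconds, the half-life is ln 2 / λ seconds. The sketch below converts in both directions so that a λ can be chosen from a target half-life. The helper names `half_life_days` and `lambda_for_half_life` are hypothetical.

```python
import math

SECONDS_PER_DAY = 86_400


def half_life_days(lam):
    """Half-life in days for a decay constant given in 1/seconds."""
    return math.log(2) / lam / SECONDS_PER_DAY


def lambda_for_half_life(days):
    """Decay constant (1/seconds) that halves the score after `days` days."""
    return math.log(2) / (days * SECONDS_PER_DAY)


for lam in (1e-5, 1e-6, 1e-7):
    print(f"lambda={lam:g}  half-life={half_life_days(lam):.1f} days")
```

For the three constants used by the document-type classifier, this gives half-lives of roughly 0.8, 8.0 and 80.2 days.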

halflife_rag-0.1.0/engine/decay/learned.py ADDED

@@ -0,0 +1,46 @@
+"""
+learned.py — LearnedDecay function (Option A: chunk-level λ predictor).
+This class is used by the DecayRegistry like any other decay function.
+The key difference from ExponentialDecay: λ was predicted by the MLP
+at ingestion time (not hand-tuned), stored in Redis, and loaded here.
+At query time this is pure exponential decay — no ML inference.
+The MLP runs only at ingestion time (or when feedback updates trigger
+a λ re-prediction via FeedbackUpdater).
+The decay_params dict in Redis must contain:
+    {"lambda": float}    — predicted by LearnedDecayEngine.predict_lambda()
+If lambda is missing, falls back to the MLP's cold-start default for
+generic doc type — equivalent to ExponentialDecay with λ=1e-6.
+"""
+import math
+from datetime import datetime
+from .base import DecayFunction
+class LearnedDecay(DecayFunction):
+    """
+    Exponential decay using an MLP-predicted λ stored in params.
+    Identical runtime behaviour to ExponentialDecay — the "learned"
+    part is in how λ was set, not in how it's used.
+    decay(Δt) = e^(-λ · Δt)
+    where λ ∈ [1e-8, 1e-4] was predicted by DecayMLP from chunk features.
+    """
+    def compute(self, timestamp: datetime, now: datetime) -> float:
+        delta_seconds = (now - timestamp).total_seconds()
+        delta_seconds = max(0.0, delta_seconds)
+        lambda_ = self.params.get("lambda", 1e-6)
+        # Safety clamp — λ should always be in the MLP's output range,
+        # but guard against stale Redis values or hand-edited metadata.
+        lambda_ = max(1e-8, min(lambda_, 1e-4))
+        return math.exp(-lambda_ * delta_seconds)
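
The effect of the safety clamp is easiest to see with out-of-range inputs. The sketch below repeats the clamp from `LearnedDecay.compute` in a standalone function. `clamped_decay` is a hypothetical name, and the bad λ values are made-up examples of stale or hand-edited metadata.

```python
import math

LAMBDA_MIN, LAMBDA_MAX = 1e-8, 1e-4


def clamped_decay(lam, delta_seconds):
    """Exponential decay with lambda clamped to [1e-8, 1e-4], as in LearnedDecay."""
    lam = max(LAMBDA_MIN, min(lam, LAMBDA_MAX))
    return math.exp(-lam * max(0.0, delta_seconds))


one_hour = 3600.0

# A corrupted lambda of 1.0 would score e^-3600 (effectively 0) if unclamped;
# the clamp treats it as 1e-4 instead.
print(clamped_decay(1.0, one_hour) == clamped_decay(1e-4, one_hour))

# A negative lambda would make scores grow with age; it is raised to 1e-8.
print(clamped_decay(-5.0, one_hour) == clamped_decay(1e-8, one_hour))
```

Both comparisons print `True`, and every result stays within the 0.0 to 1.0 range that the decay interface promises.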