haiku.rag 0.9.2__tar.gz → 0.10.0__tar.gz
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Potentially problematic release: this version of haiku.rag has been flagged as potentially problematic; see the registry's advisory page for details.
- {haiku_rag-0.9.2 → haiku_rag-0.10.0}/.pre-commit-config.yaml +0 -10
- {haiku_rag-0.9.2 → haiku_rag-0.10.0}/PKG-INFO +37 -1
- {haiku_rag-0.9.2 → haiku_rag-0.10.0}/README.md +35 -0
- haiku_rag-0.10.0/docs/agents.md +104 -0
- {haiku_rag-0.9.2 → haiku_rag-0.10.0}/docs/cli.md +18 -0
- {haiku_rag-0.9.2 → haiku_rag-0.10.0}/mkdocs.yml +5 -1
- {haiku_rag-0.9.2 → haiku_rag-0.10.0}/pyproject.toml +4 -1
- {haiku_rag-0.9.2 → haiku_rag-0.10.0}/src/haiku/rag/app.py +50 -14
- {haiku_rag-0.9.2 → haiku_rag-0.10.0}/src/haiku/rag/cli.py +16 -4
- {haiku_rag-0.9.2 → haiku_rag-0.10.0}/src/haiku/rag/client.py +3 -5
- {haiku_rag-0.9.2 → haiku_rag-0.10.0}/src/haiku/rag/reranking/mxbai.py +1 -1
- haiku_rag-0.10.0/src/haiku/rag/research/__init__.py +20 -0
- haiku_rag-0.10.0/src/haiku/rag/research/common.py +53 -0
- {haiku_rag-0.9.2 → haiku_rag-0.10.0}/src/haiku/rag/research/dependencies.py +5 -3
- haiku_rag-0.10.0/src/haiku/rag/research/graph.py +29 -0
- haiku_rag-0.10.0/src/haiku/rag/research/models.py +70 -0
- haiku_rag-0.10.0/src/haiku/rag/research/nodes/evaluate.py +80 -0
- haiku_rag-0.10.0/src/haiku/rag/research/nodes/plan.py +63 -0
- haiku_rag-0.10.0/src/haiku/rag/research/nodes/search.py +91 -0
- haiku_rag-0.10.0/src/haiku/rag/research/nodes/synthesize.py +51 -0
- haiku_rag-0.10.0/src/haiku/rag/research/prompts.py +113 -0
- haiku_rag-0.10.0/src/haiku/rag/research/state.py +25 -0
- {haiku_rag-0.9.2 → haiku_rag-0.10.0}/src/haiku/rag/store/engine.py +42 -17
- {haiku_rag-0.9.2 → haiku_rag-0.10.0}/src/haiku/rag/store/models/chunk.py +1 -0
- {haiku_rag-0.9.2 → haiku_rag-0.10.0}/src/haiku/rag/store/repositories/chunk.py +60 -39
- {haiku_rag-0.9.2 → haiku_rag-0.10.0}/src/haiku/rag/store/repositories/document.py +2 -2
- {haiku_rag-0.9.2 → haiku_rag-0.10.0}/src/haiku/rag/store/repositories/settings.py +12 -5
- haiku_rag-0.10.0/src/haiku/rag/store/upgrades/__init__.py +60 -0
- haiku_rag-0.10.0/src/haiku/rag/store/upgrades/v0_9_3.py +112 -0
- {haiku_rag-0.9.2 → haiku_rag-0.10.0}/tests/generate_benchmark_db.py +1 -1
- {haiku_rag-0.9.2 → haiku_rag-0.10.0}/tests/test_app.py +1 -1
- {haiku_rag-0.9.2 → haiku_rag-0.10.0}/tests/test_chunk.py +4 -6
- {haiku_rag-0.9.2 → haiku_rag-0.10.0}/tests/test_client.py +64 -57
- {haiku_rag-0.9.2 → haiku_rag-0.10.0}/tests/test_document.py +2 -3
- haiku_rag-0.10.0/tests/test_research_graph.py +26 -0
- haiku_rag-0.10.0/tests/test_research_graph_integration.py +89 -0
- {haiku_rag-0.9.2 → haiku_rag-0.10.0}/uv.lock +3 -1
- haiku_rag-0.9.2/docs/agents.md +0 -85
- haiku_rag-0.9.2/src/haiku/rag/research/__init__.py +0 -37
- haiku_rag-0.9.2/src/haiku/rag/research/base.py +0 -130
- haiku_rag-0.9.2/src/haiku/rag/research/evaluation_agent.py +0 -42
- haiku_rag-0.9.2/src/haiku/rag/research/orchestrator.py +0 -300
- haiku_rag-0.9.2/src/haiku/rag/research/presearch_agent.py +0 -34
- haiku_rag-0.9.2/src/haiku/rag/research/prompts.py +0 -129
- haiku_rag-0.9.2/src/haiku/rag/research/search_agent.py +0 -65
- haiku_rag-0.9.2/src/haiku/rag/research/synthesis_agent.py +0 -40
- haiku_rag-0.9.2/src/haiku/rag/store/upgrades/__init__.py +0 -1
- haiku_rag-0.9.2/tests/research/test_evaluation_agent.py +0 -14
- haiku_rag-0.9.2/tests/research/test_orchestrator.py +0 -178
- haiku_rag-0.9.2/tests/research/test_search_agent.py +0 -11
- haiku_rag-0.9.2/tests/research/test_synthesis_agent.py +0 -11
- {haiku_rag-0.9.2 → haiku_rag-0.10.0}/.github/FUNDING.yml +0 -0
- {haiku_rag-0.9.2 → haiku_rag-0.10.0}/.github/workflows/build-docs.yml +0 -0
- {haiku_rag-0.9.2 → haiku_rag-0.10.0}/.github/workflows/build-publish.yml +0 -0
- {haiku_rag-0.9.2 → haiku_rag-0.10.0}/.gitignore +0 -0
- {haiku_rag-0.9.2 → haiku_rag-0.10.0}/.python-version +0 -0
- {haiku_rag-0.9.2 → haiku_rag-0.10.0}/LICENSE +0 -0
- {haiku_rag-0.9.2 → haiku_rag-0.10.0}/docs/benchmarks.md +0 -0
- {haiku_rag-0.9.2 → haiku_rag-0.10.0}/docs/configuration.md +0 -0
- {haiku_rag-0.9.2 → haiku_rag-0.10.0}/docs/index.md +0 -0
- {haiku_rag-0.9.2 → haiku_rag-0.10.0}/docs/installation.md +0 -0
- {haiku_rag-0.9.2 → haiku_rag-0.10.0}/docs/mcp.md +0 -0
- {haiku_rag-0.9.2 → haiku_rag-0.10.0}/docs/python.md +0 -0
- {haiku_rag-0.9.2 → haiku_rag-0.10.0}/docs/server.md +0 -0
- {haiku_rag-0.9.2 → haiku_rag-0.10.0}/src/haiku/rag/__init__.py +0 -0
- {haiku_rag-0.9.2 → haiku_rag-0.10.0}/src/haiku/rag/chunker.py +0 -0
- {haiku_rag-0.9.2 → haiku_rag-0.10.0}/src/haiku/rag/config.py +0 -0
- {haiku_rag-0.9.2 → haiku_rag-0.10.0}/src/haiku/rag/embeddings/__init__.py +0 -0
- {haiku_rag-0.9.2 → haiku_rag-0.10.0}/src/haiku/rag/embeddings/base.py +0 -0
- {haiku_rag-0.9.2 → haiku_rag-0.10.0}/src/haiku/rag/embeddings/ollama.py +0 -0
- {haiku_rag-0.9.2 → haiku_rag-0.10.0}/src/haiku/rag/embeddings/openai.py +0 -0
- {haiku_rag-0.9.2 → haiku_rag-0.10.0}/src/haiku/rag/embeddings/vllm.py +0 -0
- {haiku_rag-0.9.2 → haiku_rag-0.10.0}/src/haiku/rag/embeddings/voyageai.py +0 -0
- {haiku_rag-0.9.2 → haiku_rag-0.10.0}/src/haiku/rag/logging.py +0 -0
- {haiku_rag-0.9.2 → haiku_rag-0.10.0}/src/haiku/rag/mcp.py +0 -0
- {haiku_rag-0.9.2 → haiku_rag-0.10.0}/src/haiku/rag/migration.py +0 -0
- {haiku_rag-0.9.2 → haiku_rag-0.10.0}/src/haiku/rag/monitor.py +0 -0
- {haiku_rag-0.9.2 → haiku_rag-0.10.0}/src/haiku/rag/qa/__init__.py +0 -0
- {haiku_rag-0.9.2 → haiku_rag-0.10.0}/src/haiku/rag/qa/agent.py +0 -0
- {haiku_rag-0.9.2 → haiku_rag-0.10.0}/src/haiku/rag/qa/prompts.py +0 -0
- {haiku_rag-0.9.2 → haiku_rag-0.10.0}/src/haiku/rag/reader.py +0 -0
- {haiku_rag-0.9.2 → haiku_rag-0.10.0}/src/haiku/rag/reranking/__init__.py +0 -0
- {haiku_rag-0.9.2 → haiku_rag-0.10.0}/src/haiku/rag/reranking/base.py +0 -0
- {haiku_rag-0.9.2 → haiku_rag-0.10.0}/src/haiku/rag/reranking/cohere.py +0 -0
- {haiku_rag-0.9.2 → haiku_rag-0.10.0}/src/haiku/rag/reranking/vllm.py +0 -0
- {haiku_rag-0.9.2 → haiku_rag-0.10.0}/src/haiku/rag/store/__init__.py +0 -0
- {haiku_rag-0.9.2 → haiku_rag-0.10.0}/src/haiku/rag/store/models/__init__.py +0 -0
- {haiku_rag-0.9.2 → haiku_rag-0.10.0}/src/haiku/rag/store/models/document.py +0 -0
- {haiku_rag-0.9.2 → haiku_rag-0.10.0}/src/haiku/rag/store/repositories/__init__.py +0 -0
- {haiku_rag-0.9.2 → haiku_rag-0.10.0}/src/haiku/rag/utils.py +0 -0
- {haiku_rag-0.9.2 → haiku_rag-0.10.0}/tests/__init__.py +0 -0
- {haiku_rag-0.9.2 → haiku_rag-0.10.0}/tests/conftest.py +0 -0
- {haiku_rag-0.9.2 → haiku_rag-0.10.0}/tests/llm_judge.py +0 -0
- {haiku_rag-0.9.2 → haiku_rag-0.10.0}/tests/test_chunker.py +0 -0
- {haiku_rag-0.9.2 → haiku_rag-0.10.0}/tests/test_cli.py +0 -0
- {haiku_rag-0.9.2 → haiku_rag-0.10.0}/tests/test_embedder.py +0 -0
- {haiku_rag-0.9.2 → haiku_rag-0.10.0}/tests/test_lancedb_connection.py +0 -0
- {haiku_rag-0.9.2 → haiku_rag-0.10.0}/tests/test_monitor.py +0 -0
- {haiku_rag-0.9.2 → haiku_rag-0.10.0}/tests/test_preprocessor.py +0 -0
- {haiku_rag-0.9.2 → haiku_rag-0.10.0}/tests/test_qa.py +0 -0
- {haiku_rag-0.9.2 → haiku_rag-0.10.0}/tests/test_reader.py +0 -0
- {haiku_rag-0.9.2 → haiku_rag-0.10.0}/tests/test_rebuild.py +0 -0
- {haiku_rag-0.9.2 → haiku_rag-0.10.0}/tests/test_reranker.py +0 -0
- {haiku_rag-0.9.2 → haiku_rag-0.10.0}/tests/test_search.py +0 -0
- {haiku_rag-0.9.2 → haiku_rag-0.10.0}/tests/test_settings.py +0 -0
- {haiku_rag-0.9.2 → haiku_rag-0.10.0}/tests/test_utils.py +0 -0
- {haiku_rag-0.9.2 → haiku_rag-0.10.0}/tests/test_versioning.py +0 -0
**{haiku_rag-0.9.2 → haiku_rag-0.10.0}/.pre-commit-config.yaml**

````diff
@@ -20,13 +20,3 @@ repos:
     rev: v1.1.399
     hooks:
       - id: pyright
-
-  - repo: https://github.com/RodrigoGonzalez/check-mkdocs
-    rev: v1.2.0
-    hooks:
-      - id: check-mkdocs
-        name: check-mkdocs
-        args: ["--config", "mkdocs.yml"] # Optional, mkdocs.yml is the default
-        # If you have additional plugins or libraries that are not included in
-        # check-mkdocs, add them here
-        additional_dependencies: ["mkdocs-material"]
````
**{haiku_rag-0.9.2 → haiku_rag-0.10.0}/PKG-INFO**

````diff
@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: haiku.rag
-Version: 0.9.2
+Version: 0.10.0
 Summary: Agentic Retrieval Augmented Generation (RAG) with LanceDB
 Author-email: Yiorgis Gozadinos <ggozadinos@gmail.com>
 License: MIT
@@ -23,6 +23,7 @@ Requires-Dist: fastmcp>=2.12.3
 Requires-Dist: httpx>=0.28.1
 Requires-Dist: lancedb>=0.25.0
 Requires-Dist: pydantic-ai>=1.0.8
+Requires-Dist: pydantic-graph>=1.0.8
 Requires-Dist: pydantic>=2.11.9
 Requires-Dist: python-dotenv>=1.1.1
 Requires-Dist: rich>=14.1.0
@@ -48,6 +49,7 @@ Retrieval-Augmented Generation (RAG) library built on LanceDB.
 - **Local LanceDB**: No external servers required, supports also LanceDB cloud storage, S3, Google Cloud & Azure
 - **Multiple embedding providers**: Ollama, VoyageAI, OpenAI, vLLM
 - **Multiple QA providers**: Any provider/model supported by Pydantic AI
+- **Research graph (multi‑agent)**: Plan → Search → Evaluate → Synthesize with agentic AI
 - **Native hybrid search**: Vector + full-text search with native LanceDB RRF reranking
 - **Reranking**: Default search result reranking with MixedBread AI, Cohere, or vLLM
 - **Question answering**: Built-in QA agents on your documents
@@ -75,6 +77,14 @@ haiku-rag ask "Who is the author of haiku.rag?"
 # Ask questions with citations
 haiku-rag ask "Who is the author of haiku.rag?" --cite
 
+# Multi‑agent research (iterative plan/search/evaluate)
+haiku-rag research \
+  "What are the main drivers and trends of global temperature anomalies since 1990?" \
+  --max-iterations 2 \
+  --confidence-threshold 0.8 \
+  --max-concurrency 3 \
+  --verbose
+
 # Rebuild database (re-chunk and re-embed all documents)
 haiku-rag rebuild
 
@@ -90,6 +100,13 @@ haiku-rag serve
 
 ```python
 from haiku.rag.client import HaikuRAG
+from haiku.rag.research import (
+    ResearchContext,
+    ResearchDeps,
+    ResearchState,
+    build_research_graph,
+    PlanNode,
+)
 
 async with HaikuRAG("database.lancedb") as client:
     # Add document
@@ -107,6 +124,25 @@
     # Ask questions with citations
     answer = await client.ask("Who is the author of haiku.rag?", cite=True)
     print(answer)
+
+    # Multi‑agent research pipeline (Plan → Search → Evaluate → Synthesize)
+    graph = build_research_graph()
+    state = ResearchState(
+        question=(
+            "What are the main drivers and trends of global temperature "
+            "anomalies since 1990?"
+        ),
+        context=ResearchContext(original_question="…"),
+        max_iterations=2,
+        confidence_threshold=0.8,
+        max_concurrency=3,
+    )
+    deps = ResearchDeps(client=client)
+    start = PlanNode(provider=None, model=None)
+    result = await graph.run(start, state=state, deps=deps)
+    report = result.output
+    print(report.title)
+    print(report.executive_summary)
 ```
 
 ## MCP Server
````
**{haiku_rag-0.9.2 → haiku_rag-0.10.0}/README.md**

````diff
@@ -11,6 +11,7 @@ Retrieval-Augmented Generation (RAG) library built on LanceDB.
 - **Local LanceDB**: No external servers required, supports also LanceDB cloud storage, S3, Google Cloud & Azure
 - **Multiple embedding providers**: Ollama, VoyageAI, OpenAI, vLLM
 - **Multiple QA providers**: Any provider/model supported by Pydantic AI
+- **Research graph (multi‑agent)**: Plan → Search → Evaluate → Synthesize with agentic AI
 - **Native hybrid search**: Vector + full-text search with native LanceDB RRF reranking
 - **Reranking**: Default search result reranking with MixedBread AI, Cohere, or vLLM
 - **Question answering**: Built-in QA agents on your documents
@@ -38,6 +39,14 @@ haiku-rag ask "Who is the author of haiku.rag?"
 # Ask questions with citations
 haiku-rag ask "Who is the author of haiku.rag?" --cite
 
+# Multi‑agent research (iterative plan/search/evaluate)
+haiku-rag research \
+  "What are the main drivers and trends of global temperature anomalies since 1990?" \
+  --max-iterations 2 \
+  --confidence-threshold 0.8 \
+  --max-concurrency 3 \
+  --verbose
+
 # Rebuild database (re-chunk and re-embed all documents)
 haiku-rag rebuild
 
@@ -53,6 +62,13 @@ haiku-rag serve
 
 ```python
 from haiku.rag.client import HaikuRAG
+from haiku.rag.research import (
+    ResearchContext,
+    ResearchDeps,
+    ResearchState,
+    build_research_graph,
+    PlanNode,
+)
 
 async with HaikuRAG("database.lancedb") as client:
     # Add document
@@ -70,6 +86,25 @@
     # Ask questions with citations
     answer = await client.ask("Who is the author of haiku.rag?", cite=True)
     print(answer)
+
+    # Multi‑agent research pipeline (Plan → Search → Evaluate → Synthesize)
+    graph = build_research_graph()
+    state = ResearchState(
+        question=(
+            "What are the main drivers and trends of global temperature "
+            "anomalies since 1990?"
+        ),
+        context=ResearchContext(original_question="…"),
+        max_iterations=2,
+        confidence_threshold=0.8,
+        max_concurrency=3,
+    )
+    deps = ResearchDeps(client=client)
+    start = PlanNode(provider=None, model=None)
+    result = await graph.run(start, state=state, deps=deps)
+    report = result.output
+    print(report.title)
+    print(report.executive_summary)
 ```
 
 ## MCP Server
````
**haiku_rag-0.10.0/docs/agents.md** (new file)

````diff
@@ -0,0 +1,104 @@
+## Agents
+
+Two agentic flows are provided by haiku.rag:
+
+- Simple QA Agent — a focused question answering agent
+- Research Multi‑Agent — a multi‑step, analyzable research workflow
+
+
+### Simple QA Agent
+
+The simple QA agent answers a single question using the knowledge base. It retrieves relevant chunks, optionally expands context around them, and asks the model to answer strictly based on that context.
+
+Key points:
+
+- Uses a single `search_documents` tool to fetch relevant chunks
+- Can be run with or without inline citations in the prompt
+- Returns a plain string answer
+
+Python usage:
+
+```python
+from haiku.rag.client import HaikuRAG
+from haiku.rag.qa.agent import QuestionAnswerAgent
+
+client = HaikuRAG(path_to_db)
+
+# Choose a provider and model (see Configuration for env defaults)
+agent = QuestionAnswerAgent(
+    client=client,
+    provider="openai",  # or "ollama", "vllm", etc.
+    model="gpt-4o-mini",
+    use_citations=False,  # set True to bias prompt towards citing sources
+)
+
+answer = await agent.answer("What is climate change?")
+print(answer)
+```
+
+### Research Graph
+
+The research workflow is implemented as a typed pydantic‑graph. It plans, searches (in parallel batches), evaluates, and synthesizes into a final report — with clear stop conditions and shared state.
+
+```mermaid
+---
+title: Research graph
+---
+stateDiagram-v2
+    PlanNode --> SearchDispatchNode
+    SearchDispatchNode --> EvaluateNode
+    EvaluateNode --> SearchDispatchNode
+    EvaluateNode --> SynthesizeNode
+    SynthesizeNode --> [*]
+```
+
+Key nodes:
+
+- Plan: builds up to 3 standalone sub‑questions (uses an internal presearch tool)
+- Search (batched): answers sub‑questions using the KB with minimal, verbatim context
+- Evaluate: extracts insights, proposes new questions, and checks sufficiency/confidence
+- Synthesize: generates a final structured report
+
+Primary models:
+
+- `SearchAnswer` — one per sub‑question (query, answer, context, sources)
+- `EvaluationResult` — insights, new questions, sufficiency, confidence
+- `ResearchReport` — final report (title, executive summary, findings, conclusions, …)
+
+CLI usage:
+
+```bash
+haiku-rag research "How does haiku.rag organize and query documents?" \
+  --max-iterations 2 \
+  --confidence-threshold 0.8 \
+  --max-concurrency 3 \
+  --verbose
+```
+
+Python usage:
+
+```python
+from haiku.rag.client import HaikuRAG
+from haiku.rag.research import (
+    ResearchContext,
+    ResearchDeps,
+    ResearchState,
+    build_research_graph,
+    PlanNode,
+)
+
+async with HaikuRAG(path_to_db) as client:
+    graph = build_research_graph()
+    state = ResearchState(
+        question="What are the main drivers and trends of global temperature anomalies since 1990?",
+        context=ResearchContext(original_question=...),
+        max_iterations=2,
+        confidence_threshold=0.8,
+        max_concurrency=3,
+    )
+    deps = ResearchDeps(client=client)
+    result = await graph.run(PlanNode(provider=None, model=None), state=state, deps=deps)
+    report = result.output
+    print(report.title)
+    print(report.executive_summary)
+```
````
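The new docs above construct the QA agent directly; the README quickstart earlier in this diff also shows the `client.ask` convenience path. A minimal runnable sketch of that path (the database path is a placeholder):

```python
import asyncio

from haiku.rag.client import HaikuRAG

async def main() -> None:
    async with HaikuRAG("database.lancedb") as client:
        # client.ask wraps the QA agent with the configured defaults
        print(await client.ask("Who is the author of haiku.rag?", cite=True))

asyncio.run(main())
```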
**{haiku_rag-0.9.2 → haiku_rag-0.10.0}/docs/cli.md**

````diff
@@ -84,6 +84,24 @@ haiku-rag ask "Who is the author of haiku.rag?" --cite
 
 The QA agent will search your documents for relevant information and provide a comprehensive answer. With `--cite`, responses include citations showing which documents were used.
 
+## Research
+
+Run the multi-step research graph:
+
+```bash
+haiku-rag research "How does haiku.rag organize and query documents?" \
+  --max-iterations 2 \
+  --confidence-threshold 0.8 \
+  --max-concurrency 3 \
+  --verbose
+```
+
+Flags:
+- `--max-iterations, -n`: maximum search/evaluate cycles (default: 3)
+- `--confidence-threshold`: stop once evaluation confidence meets/exceeds this (default: 0.8)
+- `--max-concurrency`: number of sub-questions searched in parallel each iteration (default: 3)
+- `--verbose`: show planning, searching previews, evaluation summary, and stop reason
+
 ## Server
 
 Start the MCP server:
````
**{haiku_rag-0.9.2 → haiku_rag-0.10.0}/mkdocs.yml**

````diff
@@ -76,4 +76,8 @@ markdown_extensions:
       use_pygments: true
   - pymdownx.inlinehilite
   - pymdownx.snippets
-  - pymdownx.superfences
+  - pymdownx.superfences:
+      custom_fences:
+        - name: mermaid
+          class: mermaid
+          format: !!python/name:pymdownx.superfences.fence_code_format
````
**{haiku_rag-0.9.2 → haiku_rag-0.10.0}/pyproject.toml**

````diff
@@ -1,7 +1,8 @@
 [project]
+
 name = "haiku.rag"
-version = "0.9.2"
 description = "Agentic Retrieval Augmented Generation (RAG) with LanceDB"
+version = "0.10.0"
 authors = [{ name = "Yiorgis Gozadinos", email = "ggozadinos@gmail.com" }]
 license = { text = "MIT" }
 readme = { file = "README.md", content-type = "text/markdown" }
@@ -28,6 +29,7 @@ dependencies = [
     "lancedb>=0.25.0",
     "pydantic>=2.11.9",
     "pydantic-ai>=1.0.8",
+    "pydantic-graph>=1.0.8",
     "python-dotenv>=1.1.1",
     "rich>=14.1.0",
     "tiktoken>=0.11.0",
@@ -89,6 +91,7 @@ line-ending = "auto"
 [tool.pyright]
 venvPath = "."
 venv = ".venv"
+pythonVersion = "3.12"
 
 [tool.pytest.ini_options]
 asyncio_default_fixture_loop_scope = "session"
````
**{haiku_rag-0.9.2 → haiku_rag-0.10.0}/src/haiku/rag/app.py**

````diff
@@ -9,7 +9,13 @@ from haiku.rag.client import HaikuRAG
 from haiku.rag.config import Config
 from haiku.rag.mcp import create_mcp_server
 from haiku.rag.monitor import FileWatcher
-from haiku.rag.research.
+from haiku.rag.research.dependencies import ResearchContext
+from haiku.rag.research.graph import (
+    PlanNode,
+    ResearchDeps,
+    ResearchState,
+    build_research_graph,
+)
 from haiku.rag.store.models.chunk import Chunk
 from haiku.rag.store.models.document import Document
 
@@ -80,30 +86,54 @@ class HaikuRAGApp:
             self.console.print(f"[red]Error: {e}[/red]")
 
     async def research(
-        self,
+        self,
+        question: str,
+        max_iterations: int = 3,
+        confidence_threshold: float = 0.8,
+        max_concurrency: int = 1,
+        verbose: bool = False,
     ):
-        """Run
+        """Run research via the pydantic-graph pipeline (default)."""
         async with HaikuRAG(db_path=self.db_path) as client:
             try:
-                # Create orchestrator with default config or fallback to QA
-                orchestrator = ResearchOrchestrator()
-
                 if verbose:
-                    self.console.print(
-                        f"[bold cyan]Starting research with {orchestrator.provider}:{orchestrator.model}[/bold cyan]"
-                    )
+                    self.console.print("[bold cyan]Starting research[/bold cyan]")
                     self.console.print(f"[bold blue]Question:[/bold blue] {question}")
                     self.console.print()
 
-
-
+                graph = build_research_graph()
+                state = ResearchState(
                     question=question,
-
+                    context=ResearchContext(original_question=question),
                     max_iterations=max_iterations,
-
-
+                    confidence_threshold=confidence_threshold,
+                    max_concurrency=max_concurrency,
+                )
+                deps = ResearchDeps(
+                    client=client, console=self.console if verbose else None
                 )
 
+                start = PlanNode(
+                    provider=Config.RESEARCH_PROVIDER or Config.QA_PROVIDER,
+                    model=Config.RESEARCH_MODEL or Config.QA_MODEL,
+                )
+                # Prefer graph.run; fall back to iter if unavailable
+                report = None
+                try:
+                    result = await graph.run(start, state=state, deps=deps)
+                    report = result.output
+                except Exception:
+                    from pydantic_graph import End
+
+                    async with graph.iter(start, state=state, deps=deps) as run:
+                        node = run.next_node
+                        while not isinstance(node, End):
+                            node = await run.next(node)
+                    if run.result:
+                        report = run.result.output
+                if report is None:
+                    raise RuntimeError("Graph did not produce a report")
+
                 # Display the report
                 self.console.print("[bold green]Research Report[/bold green]")
                 self.console.rule()
@@ -115,6 +145,12 @@ class HaikuRAGApp:
                 self.console.print(report.executive_summary)
                 self.console.print()
 
+                # Confidence (from last evaluation)
+                if state.last_eval:
+                    conf = state.last_eval.confidence_score  # type: ignore[attr-defined]
+                    self.console.print(f"[bold cyan]Confidence:[/bold cyan] {conf:.1%}")
+                    self.console.print()
+
                 # Main Findings
                 if report.main_findings:
                     self.console.print("[bold cyan]Main Findings:[/bold cyan]")
````
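Stripped of the Rich display code, the run/iter fallback in the hunk above reduces to the following pattern. This is a sketch assuming the pydantic-graph 1.0.x API that the diff itself uses (`graph.iter`, `run.next`, `End`):

```python
from pydantic_graph import End

from haiku.rag.research import (
    PlanNode,
    ResearchContext,
    ResearchDeps,
    ResearchState,
    build_research_graph,
)

async def run_research(client, question: str):
    graph = build_research_graph()
    state = ResearchState(
        question=question,
        context=ResearchContext(original_question=question),
    )
    deps = ResearchDeps(client=client)
    start = PlanNode(provider=None, model=None)
    # graph.iter drives the graph node by node, which is what the CLI's
    # fallback path relies on; graph.run does the same in a single call.
    async with graph.iter(start, state=state, deps=deps) as run:
        node = run.next_node
        while not isinstance(node, End):
            node = await run.next(node)
    return run.result.output if run.result else None
```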
**{haiku_rag-0.9.2 → haiku_rag-0.10.0}/src/haiku/rag/cli.py**

````diff
@@ -13,10 +13,10 @@ from haiku.rag.logging import configure_cli_logging
 from haiku.rag.migration import migrate_sqlite_to_lancedb
 from haiku.rag.utils import is_up_to_date
 
-
-logfire.
-
-
+if Config.ENV == "development":
+    logfire.configure(send_to_logfire="if-token-present")
+    logfire.instrument_pydantic_ai()
+else:
     warnings.filterwarnings("ignore")
 
 cli = typer.Typer(
@@ -250,6 +250,16 @@ def research(
         "-n",
         help="Maximum search/analyze iterations",
     ),
+    confidence_threshold: float = typer.Option(
+        0.8,
+        "--confidence-threshold",
+        help="Minimum confidence (0-1) to stop",
+    ),
+    max_concurrency: int = typer.Option(
+        1,
+        "--max-concurrency",
+        help="Max concurrent searches per iteration (planned)",
+    ),
     db: Path = typer.Option(
         Config.DEFAULT_DATA_DIR / "haiku.rag.lancedb",
         "--db",
@@ -266,6 +276,8 @@ def research(
         app.research(
             question=question,
             max_iterations=max_iterations,
+            confidence_threshold=confidence_threshold,
+            max_concurrency=max_concurrency,
             verbose=verbose,
         )
     )
````
**{haiku_rag-0.9.2 → haiku_rag-0.10.0}/src/haiku/rag/client.py**

````diff
@@ -388,7 +388,7 @@ class HaikuRAG:
             all_chunks = adjacent_chunks + [chunk]
 
             # Get the range of orders for this expanded chunk
-            orders = [c.metadata.get("order", 0) for c in all_chunks]
+            orders = [c.order for c in all_chunks]
             min_order = min(orders)
             max_order = max(orders)
 
@@ -398,9 +398,7 @@ class HaikuRAG:
                     "score": score,
                     "min_order": min_order,
                     "max_order": max_order,
-                    "all_chunks": sorted(
-                        all_chunks, key=lambda c: c.metadata.get("order", 0)
-                    ),
+                    "all_chunks": sorted(all_chunks, key=lambda c: c.order),
                 }
             )
 
@@ -459,7 +457,7 @@ class HaikuRAG:
             # Merge all_chunks and deduplicate by order
             all_chunks_dict = {}
             for chunk in current["all_chunks"] + range_info["all_chunks"]:
-                order = chunk.metadata.get("order", 0)
+                order = chunk.order
                 all_chunks_dict[order] = chunk
             current["all_chunks"] = [
                 all_chunks_dict[order] for order in sorted(all_chunks_dict.keys())
````
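The switch from `chunk.metadata.get("order", 0)` to a first-class `chunk.order` attribute lines up with the one-line addition to `store/models/chunk.py` in the file list. The merge step above is plain dedupe-by-order; a toy illustration with a hypothetical `SimpleChunk` stand-in:

```python
from dataclasses import dataclass

@dataclass
class SimpleChunk:  # hypothetical stand-in for haiku.rag's Chunk
    order: int
    content: str

def merge_by_order(a: list[SimpleChunk], b: list[SimpleChunk]) -> list[SimpleChunk]:
    by_order = {c.order: c for c in a + b}  # later duplicates win, one chunk per order
    return [by_order[o] for o in sorted(by_order)]

merged = merge_by_order(
    [SimpleChunk(0, "intro"), SimpleChunk(1, "body")],
    [SimpleChunk(1, "body"), SimpleChunk(2, "end")],
)
assert [c.order for c in merged] == [0, 1, 2]
```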
**haiku_rag-0.10.0/src/haiku/rag/research/__init__.py** (new file)

````diff
@@ -0,0 +1,20 @@
+from haiku.rag.research.dependencies import ResearchContext, ResearchDependencies
+from haiku.rag.research.graph import (
+    PlanNode,
+    ResearchDeps,
+    ResearchState,
+    build_research_graph,
+)
+from haiku.rag.research.models import EvaluationResult, ResearchReport, SearchAnswer
+
+__all__ = [
+    "ResearchDependencies",
+    "ResearchContext",
+    "SearchAnswer",
+    "EvaluationResult",
+    "ResearchReport",
+    "ResearchDeps",
+    "ResearchState",
+    "PlanNode",
+    "build_research_graph",
+]
````
**haiku_rag-0.10.0/src/haiku/rag/research/common.py** (new file)

````diff
@@ -0,0 +1,53 @@
+from typing import Any
+
+from pydantic_ai import format_as_xml
+from pydantic_ai.models.openai import OpenAIChatModel
+from pydantic_ai.providers.ollama import OllamaProvider
+from pydantic_ai.providers.openai import OpenAIProvider
+
+from haiku.rag.config import Config
+from haiku.rag.research.dependencies import ResearchContext
+
+
+def get_model(provider: str, model: str) -> Any:
+    if provider == "ollama":
+        return OpenAIChatModel(
+            model_name=model,
+            provider=OllamaProvider(base_url=f"{Config.OLLAMA_BASE_URL}/v1"),
+        )
+    elif provider == "vllm":
+        return OpenAIChatModel(
+            model_name=model,
+            provider=OpenAIProvider(
+                base_url=f"{Config.VLLM_RESEARCH_BASE_URL or Config.VLLM_QA_BASE_URL}/v1",
+                api_key="none",
+            ),
+        )
+    else:
+        return f"{provider}:{model}"
+
+
+def log(console, msg: str) -> None:
+    if console:
+        console.print(msg)
+
+
+def format_context_for_prompt(context: ResearchContext) -> str:
+    """Format the research context as XML for inclusion in prompts."""
+
+    context_data = {
+        "original_question": context.original_question,
+        "unanswered_questions": context.sub_questions,
+        "qa_responses": [
+            {
+                "question": qa.query,
+                "answer": qa.answer,
+                "context_snippets": qa.context,
+                "sources": qa.sources,  # pyright: ignore[reportAttributeAccessIssue]
+            }
+            for qa in context.qa_responses
+        ],
+        "insights": context.insights,
+        "gaps": context.gaps,
+    }
+    return format_as_xml(context_data, root_tag="research_context")
````
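A sketch of how these helpers compose, assuming a local Ollama endpoint and a placeholder model name. `get_model` returns either a configured `OpenAIChatModel` or a plain `provider:model` string for Pydantic AI to resolve:

```python
from pydantic_ai import Agent

from haiku.rag.research.common import format_context_for_prompt, get_model
from haiku.rag.research.dependencies import ResearchContext

# "ollama"/"vllm" get an explicit OpenAI-compatible client; anything else
# falls through as a "provider:model" string for pydantic-ai to resolve.
model = get_model("ollama", "qwen3")  # placeholder model name
agent = Agent(model, output_type=str)

context = ResearchContext(original_question="What is climate change?")
prompt = format_context_for_prompt(context)  # <research_context>…</research_context>
```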
**{haiku_rag-0.9.2 → haiku_rag-0.10.0}/src/haiku/rag/research/dependencies.py**

````diff
@@ -1,7 +1,8 @@
 from pydantic import BaseModel, Field
+from rich.console import Console
 
 from haiku.rag.client import HaikuRAG
-from haiku.rag.research.
+from haiku.rag.research.models import SearchAnswer
 
 
 class ResearchContext(BaseModel):
@@ -11,7 +12,7 @@ class ResearchContext(BaseModel):
     sub_questions: list[str] = Field(
         default_factory=list, description="Decomposed sub-questions"
     )
-    qa_responses: list[
+    qa_responses: list[SearchAnswer] = Field(
         default_factory=list, description="Structured QA pairs used during research"
     )
     insights: list[str] = Field(
@@ -21,7 +22,7 @@ class ResearchContext(BaseModel):
         default_factory=list, description="Identified information gaps"
     )
 
-    def add_qa_response(self, qa:
+    def add_qa_response(self, qa: SearchAnswer) -> None:
         """Add a structured QA response (minimal context already included)."""
         self.qa_responses.append(qa)
 
@@ -43,3 +44,4 @@ class ResearchDependencies(BaseModel):
 
     client: HaikuRAG = Field(description="RAG client for document operations")
     context: ResearchContext = Field(description="Shared research context")
+    console: Console | None = None
````
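The new `console` field is how verbose mode reaches the graph: the app.py hunk above passes `console=self.console if verbose else None`, and the `log` helper in `common.py` prints only when a console is present. A small sketch (the database path is a placeholder):

```python
from rich.console import Console

from haiku.rag.client import HaikuRAG
from haiku.rag.research import ResearchDeps
from haiku.rag.research.common import log

client = HaikuRAG("database.lancedb")
deps = ResearchDeps(client=client, console=Console())

log(deps.console, "[bold cyan]Starting research[/bold cyan]")  # prints
log(None, "silent")  # no console attached, so nothing is printed
```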
**haiku_rag-0.10.0/src/haiku/rag/research/graph.py** (new file)

````diff
@@ -0,0 +1,29 @@
+from pydantic_graph import Graph
+
+from haiku.rag.research.models import ResearchReport
+from haiku.rag.research.nodes.evaluate import EvaluateNode
+from haiku.rag.research.nodes.plan import PlanNode
+from haiku.rag.research.nodes.search import SearchDispatchNode
+from haiku.rag.research.nodes.synthesize import SynthesizeNode
+from haiku.rag.research.state import ResearchDeps, ResearchState
+
+__all__ = [
+    "PlanNode",
+    "SearchDispatchNode",
+    "EvaluateNode",
+    "SynthesizeNode",
+    "ResearchState",
+    "ResearchDeps",
+    "build_research_graph",
+]
+
+
+def build_research_graph() -> Graph[ResearchState, ResearchDeps, ResearchReport]:
+    return Graph(
+        nodes=[
+            PlanNode,
+            SearchDispatchNode,
+            EvaluateNode,
+            SynthesizeNode,
+        ]
+    )
````
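Because the pipeline is a plain `pydantic_graph.Graph`, the mermaid diagram in `docs/agents.md` can presumably be regenerated from the graph itself; a sketch assuming pydantic-graph's `mermaid_code` helper:

```python
from haiku.rag.research.graph import PlanNode, build_research_graph

graph = build_research_graph()
# Emits mermaid source for the Plan → Search → Evaluate → Synthesize graph
print(graph.mermaid_code(start_node=PlanNode))
```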