haiku.rag 0.8.0.tar.gz → 0.9.0.tar.gz
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- {haiku_rag-0.8.0 → haiku_rag-0.9.0}/.gitignore +2 -0
- {haiku_rag-0.8.0 → haiku_rag-0.9.0}/PKG-INFO +10 -10
- {haiku_rag-0.8.0 → haiku_rag-0.9.0}/README.md +1 -0
- haiku_rag-0.9.0/docs/agents.md +83 -0
- {haiku_rag-0.8.0 → haiku_rag-0.9.0}/docs/benchmarks.md +2 -2
- {haiku_rag-0.8.0 → haiku_rag-0.9.0}/docs/cli.md +4 -2
- {haiku_rag-0.8.0 → haiku_rag-0.9.0}/docs/configuration.md +32 -0
- {haiku_rag-0.8.0 → haiku_rag-0.9.0}/docs/index.md +1 -0
- {haiku_rag-0.8.0 → haiku_rag-0.9.0}/docs/python.md +10 -0
- {haiku_rag-0.8.0 → haiku_rag-0.9.0}/mkdocs.yml +2 -1
- {haiku_rag-0.8.0 → haiku_rag-0.9.0}/pyproject.toml +16 -16
- {haiku_rag-0.8.0 → haiku_rag-0.9.0}/src/haiku/rag/app.py +80 -0
- {haiku_rag-0.8.0 → haiku_rag-0.9.0}/src/haiku/rag/cli.py +36 -0
- {haiku_rag-0.8.0 → haiku_rag-0.9.0}/src/haiku/rag/config.py +11 -1
- {haiku_rag-0.8.0 → haiku_rag-0.9.0}/src/haiku/rag/qa/agent.py +4 -2
- {haiku_rag-0.8.0 → haiku_rag-0.9.0}/src/haiku/rag/qa/prompts.py +2 -2
- haiku_rag-0.9.0/src/haiku/rag/research/__init__.py +35 -0
- haiku_rag-0.9.0/src/haiku/rag/research/base.py +122 -0
- haiku_rag-0.9.0/src/haiku/rag/research/dependencies.py +45 -0
- haiku_rag-0.9.0/src/haiku/rag/research/evaluation_agent.py +40 -0
- haiku_rag-0.9.0/src/haiku/rag/research/orchestrator.py +265 -0
- haiku_rag-0.9.0/src/haiku/rag/research/prompts.py +116 -0
- haiku_rag-0.9.0/src/haiku/rag/research/search_agent.py +64 -0
- haiku_rag-0.9.0/src/haiku/rag/research/synthesis_agent.py +39 -0
- {haiku_rag-0.8.0 → haiku_rag-0.9.0}/src/haiku/rag/store/engine.py +15 -0
- {haiku_rag-0.8.0 → haiku_rag-0.9.0}/src/haiku/rag/store/repositories/chunk.py +25 -1
- {haiku_rag-0.8.0 → haiku_rag-0.9.0}/src/haiku/rag/store/repositories/document.py +48 -28
- {haiku_rag-0.8.0 → haiku_rag-0.9.0}/src/haiku/rag/utils.py +54 -0
- {haiku_rag-0.8.0 → haiku_rag-0.9.0}/tests/generate_benchmark_db.py +3 -1
- haiku_rag-0.9.0/tests/research/test_evaluation_agent.py +14 -0
- haiku_rag-0.9.0/tests/research/test_orchestrator.py +179 -0
- haiku_rag-0.9.0/tests/research/test_search_agent.py +11 -0
- haiku_rag-0.9.0/tests/research/test_synthesis_agent.py +11 -0
- {haiku_rag-0.8.0 → haiku_rag-0.9.0}/tests/test_client.py +91 -95
- haiku_rag-0.9.0/tests/test_preprocessor.py +71 -0
- haiku_rag-0.9.0/tests/test_versioning.py +94 -0
- haiku_rag-0.9.0/uv.lock +4647 -0
- haiku_rag-0.8.0/uv.lock +0 -3830
- {haiku_rag-0.8.0 → haiku_rag-0.9.0}/.github/FUNDING.yml +0 -0
- {haiku_rag-0.8.0 → haiku_rag-0.9.0}/.github/workflows/build-docs.yml +0 -0
- {haiku_rag-0.8.0 → haiku_rag-0.9.0}/.github/workflows/build-publish.yml +0 -0
- {haiku_rag-0.8.0 → haiku_rag-0.9.0}/.pre-commit-config.yaml +0 -0
- {haiku_rag-0.8.0 → haiku_rag-0.9.0}/.python-version +0 -0
- {haiku_rag-0.8.0 → haiku_rag-0.9.0}/LICENSE +0 -0
- {haiku_rag-0.8.0 → haiku_rag-0.9.0}/docs/installation.md +0 -0
- {haiku_rag-0.8.0 → haiku_rag-0.9.0}/docs/mcp.md +0 -0
- {haiku_rag-0.8.0 → haiku_rag-0.9.0}/docs/server.md +0 -0
- {haiku_rag-0.8.0 → haiku_rag-0.9.0}/src/haiku/rag/__init__.py +0 -0
- {haiku_rag-0.8.0 → haiku_rag-0.9.0}/src/haiku/rag/chunker.py +0 -0
- {haiku_rag-0.8.0 → haiku_rag-0.9.0}/src/haiku/rag/client.py +0 -0
- {haiku_rag-0.8.0 → haiku_rag-0.9.0}/src/haiku/rag/embeddings/__init__.py +0 -0
- {haiku_rag-0.8.0 → haiku_rag-0.9.0}/src/haiku/rag/embeddings/base.py +0 -0
- {haiku_rag-0.8.0 → haiku_rag-0.9.0}/src/haiku/rag/embeddings/ollama.py +0 -0
- {haiku_rag-0.8.0 → haiku_rag-0.9.0}/src/haiku/rag/embeddings/openai.py +0 -0
- {haiku_rag-0.8.0 → haiku_rag-0.9.0}/src/haiku/rag/embeddings/vllm.py +0 -0
- {haiku_rag-0.8.0 → haiku_rag-0.9.0}/src/haiku/rag/embeddings/voyageai.py +0 -0
- {haiku_rag-0.8.0 → haiku_rag-0.9.0}/src/haiku/rag/logging.py +0 -0
- {haiku_rag-0.8.0 → haiku_rag-0.9.0}/src/haiku/rag/mcp.py +0 -0
- {haiku_rag-0.8.0 → haiku_rag-0.9.0}/src/haiku/rag/migration.py +0 -0
- {haiku_rag-0.8.0 → haiku_rag-0.9.0}/src/haiku/rag/monitor.py +0 -0
- {haiku_rag-0.8.0 → haiku_rag-0.9.0}/src/haiku/rag/qa/__init__.py +0 -0
- {haiku_rag-0.8.0 → haiku_rag-0.9.0}/src/haiku/rag/reader.py +0 -0
- {haiku_rag-0.8.0 → haiku_rag-0.9.0}/src/haiku/rag/reranking/__init__.py +0 -0
- {haiku_rag-0.8.0 → haiku_rag-0.9.0}/src/haiku/rag/reranking/base.py +0 -0
- {haiku_rag-0.8.0 → haiku_rag-0.9.0}/src/haiku/rag/reranking/cohere.py +0 -0
- {haiku_rag-0.8.0 → haiku_rag-0.9.0}/src/haiku/rag/reranking/mxbai.py +0 -0
- {haiku_rag-0.8.0 → haiku_rag-0.9.0}/src/haiku/rag/reranking/vllm.py +0 -0
- {haiku_rag-0.8.0 → haiku_rag-0.9.0}/src/haiku/rag/store/__init__.py +0 -0
- {haiku_rag-0.8.0 → haiku_rag-0.9.0}/src/haiku/rag/store/models/__init__.py +0 -0
- {haiku_rag-0.8.0 → haiku_rag-0.9.0}/src/haiku/rag/store/models/chunk.py +0 -0
- {haiku_rag-0.8.0 → haiku_rag-0.9.0}/src/haiku/rag/store/models/document.py +0 -0
- {haiku_rag-0.8.0 → haiku_rag-0.9.0}/src/haiku/rag/store/repositories/__init__.py +0 -0
- {haiku_rag-0.8.0 → haiku_rag-0.9.0}/src/haiku/rag/store/repositories/settings.py +0 -0
- {haiku_rag-0.8.0 → haiku_rag-0.9.0}/src/haiku/rag/store/upgrades/__init__.py +0 -0
- {haiku_rag-0.8.0 → haiku_rag-0.9.0}/tests/__init__.py +0 -0
- {haiku_rag-0.8.0 → haiku_rag-0.9.0}/tests/conftest.py +0 -0
- {haiku_rag-0.8.0 → haiku_rag-0.9.0}/tests/llm_judge.py +0 -0
- {haiku_rag-0.8.0 → haiku_rag-0.9.0}/tests/test_app.py +0 -0
- {haiku_rag-0.8.0 → haiku_rag-0.9.0}/tests/test_chunk.py +0 -0
- {haiku_rag-0.8.0 → haiku_rag-0.9.0}/tests/test_chunker.py +0 -0
- {haiku_rag-0.8.0 → haiku_rag-0.9.0}/tests/test_cli.py +0 -0
- {haiku_rag-0.8.0 → haiku_rag-0.9.0}/tests/test_document.py +0 -0
- {haiku_rag-0.8.0 → haiku_rag-0.9.0}/tests/test_embedder.py +0 -0
- {haiku_rag-0.8.0 → haiku_rag-0.9.0}/tests/test_lancedb_connection.py +0 -0
- {haiku_rag-0.8.0 → haiku_rag-0.9.0}/tests/test_monitor.py +0 -0
- {haiku_rag-0.8.0 → haiku_rag-0.9.0}/tests/test_qa.py +0 -0
- {haiku_rag-0.8.0 → haiku_rag-0.9.0}/tests/test_reader.py +0 -0
- {haiku_rag-0.8.0 → haiku_rag-0.9.0}/tests/test_rebuild.py +0 -0
- {haiku_rag-0.8.0 → haiku_rag-0.9.0}/tests/test_reranker.py +0 -0
- {haiku_rag-0.8.0 → haiku_rag-0.9.0}/tests/test_search.py +0 -0
- {haiku_rag-0.8.0 → haiku_rag-0.9.0}/tests/test_settings.py +0 -0
- {haiku_rag-0.8.0 → haiku_rag-0.9.0}/tests/test_utils.py +0 -0
```diff
--- haiku_rag-0.8.0/PKG-INFO
+++ haiku_rag-0.9.0/PKG-INFO
@@ -1,7 +1,7 @@
 Metadata-Version: 2.4
 Name: haiku.rag
-Version: 0.8.0
-Summary: Retrieval Augmented Generation (RAG) with LanceDB
+Version: 0.9.0
+Summary: Agentic Retrieval Augmented Generation (RAG) with LanceDB
 Author-email: Yiorgis Gozadinos <ggozadinos@gmail.com>
 License: MIT
 License-File: LICENSE
@@ -18,14 +18,13 @@ Classifier: Programming Language :: Python :: 3.11
 Classifier: Programming Language :: Python :: 3.12
 Classifier: Typing :: Typed
 Requires-Python: >=3.12
-Requires-Dist: docling>=2.
-Requires-Dist: fastmcp>=2.
+Requires-Dist: docling>=2.52.0
+Requires-Dist: fastmcp>=2.12.3
 Requires-Dist: httpx>=0.28.1
-Requires-Dist: lancedb>=0.
-Requires-Dist:
-Requires-Dist: pydantic
-Requires-Dist:
-Requires-Dist: python-dotenv>=1.1.0
+Requires-Dist: lancedb>=0.25.0
+Requires-Dist: pydantic-ai>=1.0.8
+Requires-Dist: pydantic>=2.11.9
+Requires-Dist: python-dotenv>=1.1.1
 Requires-Dist: rich>=14.1.0
 Requires-Dist: tiktoken>=0.11.0
 Requires-Dist: typer>=0.16.1
@@ -33,7 +32,7 @@ Requires-Dist: watchfiles>=1.1.0
 Provides-Extra: mxbai
 Requires-Dist: mxbai-rerank>=0.1.6; extra == 'mxbai'
 Provides-Extra: voyageai
-Requires-Dist: voyageai>=0.3.
+Requires-Dist: voyageai>=0.3.5; extra == 'voyageai'
 Description-Content-Type: text/markdown
 
 # Haiku RAG
@@ -128,4 +127,5 @@ Full documentation at: https://ggozad.github.io/haiku.rag/
 - [Configuration](https://ggozad.github.io/haiku.rag/configuration/) - Environment variables
 - [CLI](https://ggozad.github.io/haiku.rag/cli/) - Command reference
 - [Python API](https://ggozad.github.io/haiku.rag/python/) - Complete API docs
+- [Agents](https://ggozad.github.io/haiku.rag/agents/) - QA agent and multi-agent research
 - [Benchmarks](https://ggozad.github.io/haiku.rag/benchmarks/) - Performance Benchmarks
```
```diff
--- haiku_rag-0.8.0/README.md
+++ haiku_rag-0.9.0/README.md
@@ -90,4 +90,5 @@ Full documentation at: https://ggozad.github.io/haiku.rag/
 - [Configuration](https://ggozad.github.io/haiku.rag/configuration/) - Environment variables
 - [CLI](https://ggozad.github.io/haiku.rag/cli/) - Command reference
 - [Python API](https://ggozad.github.io/haiku.rag/python/) - Complete API docs
+- [Agents](https://ggozad.github.io/haiku.rag/agents/) - QA agent and multi-agent research
 - [Benchmarks](https://ggozad.github.io/haiku.rag/benchmarks/) - Performance Benchmarks
```
```diff
--- /dev/null
+++ haiku_rag-0.9.0/docs/agents.md
@@ -0,0 +1,83 @@
+## Agents
+
+Two agentic flows are provided by haiku.rag:
+
+- Simple QA Agent — a focused question answering agent
+- Research Multi‑Agent — a multi‑step, analyzable research workflow
+
+
+### Simple QA Agent
+
+The simple QA agent answers a single question using the knowledge base. It retrieves relevant chunks, optionally expands context around them, and asks the model to answer strictly based on that context.
+
+Key points:
+
+- Uses a single `search_documents` tool to fetch relevant chunks
+- Can be run with or without inline citations in the prompt
+- Returns a plain string answer
+
+Python usage:
+
+```python
+from haiku.rag.client import HaikuRAG
+from haiku.rag.qa.agent import QuestionAnswerAgent
+
+client = HaikuRAG(path_to_db)
+
+# Choose a provider and model (see Configuration for env defaults)
+agent = QuestionAnswerAgent(
+    client=client,
+    provider="openai",  # or "ollama", "vllm", etc.
+    model="gpt-4o-mini",
+    use_citations=False,  # set True to bias prompt towards citing sources
+)
+
+answer = await agent.answer("What is climate change?")
+print(answer)
+```
+
+### Research Multi‑Agent
+
+The research workflow coordinates specialized agents to plan, search, analyze, and synthesize a comprehensive answer. It is designed for deeper questions that benefit from iterative investigation and structured reporting.
+
+Components:
+
+- Orchestrator: Plans, coordinates, and loops until confidence is sufficient
+- Search Specialist: Performs targeted RAG searches and answers sub‑questions
+- Analysis & Evaluation: Extracts insights, identifies gaps, proposes new questions
+- Synthesis: Produces a final structured research report
+
+Primary models:
+
+- `ResearchPlan` — produced by the orchestrator when planning
+  - `main_question: str`
+  - `sub_questions: list[str]` (standalone, self‑contained queries)
+- `SearchAnswer` — produced by the search specialist for each sub‑question
+  - `query: str` — the executed sub‑question
+  - `answer: str` — the agent’s answer grounded in retrieved context
+  - `context: list[str]` — minimal verbatim snippets used for the answer
+  - `sources: list[str]` — document URIs aligned with `context`
+- `EvaluationResult` — insights, new standalone questions, sufficiency & confidence
+- `ResearchReport` — the final synthesized report
+
+
+Python usage:
+
+```python
+from haiku.rag.client import HaikuRAG
+from haiku.rag.research import ResearchOrchestrator
+
+client = HaikuRAG(path_to_db)
+orchestrator = ResearchOrchestrator(provider="openai", model="gpt-4o-mini")
+
+report = await orchestrator.conduct_research(
+    question="What are the main drivers and recent trends of global temperature anomalies since 1990?",
+    client=client,
+    max_iterations=2,
+    confidence_threshold=0.8,
+    verbose=False,
+)
+
+print(report.title)
+print(report.executive_summary)
+```
```
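For intuition, the loop the orchestrator runs over these models can be pictured roughly as below. This is a hedged sketch of the flow described in the new docs, not the actual orchestrator code; the agent objects and the `sufficient`, `confidence`, and `new_questions` attributes on `EvaluationResult` are illustrative assumptions.

```python
# Illustrative sketch only: the agent helpers and the EvaluationResult
# attributes (sufficient, confidence, new_questions) are assumptions,
# not haiku.rag's real internal API.
async def research_loop(question, planner, searcher, evaluator, synthesizer,
                        max_iterations=3, confidence_threshold=0.8):
    plan = await planner.plan(question)  # -> ResearchPlan
    answers = []
    for _ in range(max_iterations):
        # Answer each standalone sub-question against the knowledge base.
        answers += [await searcher.answer(q) for q in plan.sub_questions]  # -> SearchAnswer
        evaluation = await evaluator.evaluate(question, answers)  # -> EvaluationResult
        if evaluation.sufficient or evaluation.confidence >= confidence_threshold:
            break
        # Gaps found: iterate with the newly proposed standalone questions.
        plan.sub_questions = evaluation.new_questions
    return await synthesizer.synthesize(question, answers)  # -> ResearchReport
```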
```diff
--- haiku_rag-0.8.0/docs/benchmarks.md
+++ haiku_rag-0.9.0/docs/benchmarks.md
@@ -16,8 +16,8 @@ The recall obtained is ~0.79 for matching in the top result, raising to ~0.91 fo
 |---------------------------------------|-------------------|-------------------|------------------------|
 | Ollama / `mxbai-embed-large`          | 0.79              | 0.91              | None                   |
 | Ollama / `mxbai-embed-large`          | 0.90              | 0.95              | `mxbai-rerank-base-v2` |
-
-| OpenAI / `text-embeddings-3-small`    | 0.75              | 0.88              | None                   |
+| Ollama / `nomic-embed-text-v1.5`      | 0.74              | 0.90              | None                   |
+<!-- | OpenAI / `text-embeddings-3-small` | 0.75 | 0.88 | None |
 | OpenAI / `text-embeddings-3-small`    | 0.75              | 0.88              | None                   |
 | OpenAI / `text-embeddings-3-small`    | 0.83              | 0.90              | Cohere / `rerank-v3.5` | -->
 
```
```diff
--- haiku_rag-0.8.0/docs/cli.md
+++ haiku_rag-0.9.0/docs/cli.md
@@ -36,8 +36,10 @@ haiku-rag add-src https://example.com/article.html
 ```
 
 !!! note
-    As you add documents to `haiku.rag` the database keeps growing. By default,
-    of your data.
+    As you add documents to `haiku.rag` the database keeps growing. By default, LanceDB supports versioning
+    of your data. Create/update operations are atomic‑feeling: if anything fails during chunking or embedding,
+    the database rolls back to the pre‑operation snapshot using LanceDB table versioning. You can optimize and
+    compact the database by running the [vacuum](#vacuum-optimize-and-cleanup) command.
 
 ### Get Document
 
```
```diff
--- haiku_rag-0.8.0/docs/configuration.md
+++ haiku_rag-0.9.0/docs/configuration.md
@@ -223,3 +223,35 @@ CHUNK_SIZE=256
 # into single chunks with continuous content to eliminate duplication
 CONTEXT_CHUNK_RADIUS=0
 ```
+
+#### Markdown Preprocessor
+
+Optionally preprocess Markdown before chunking by pointing to a callable that receives and returns Markdown text. This is useful for normalizing content, stripping boilerplate, or applying custom transformations before chunk boundaries are computed.
+
+```bash
+# A callable path in one of these formats:
+# - package.module:func
+# - package.module.func
+# - /abs/or/relative/path/to/file.py:func
+MARKDOWN_PREPROCESSOR="my_pkg.preprocess:clean_md"
+```
+
+!!! note
+    - The function signature should be `def clean_md(text: str) -> str` or `async def clean_md(text: str) -> str`.
+    - If the function raises or returns a non-string, haiku.rag logs a warning and proceeds without preprocessing.
+    - The preprocessor affects only the chunking pipeline. The stored document content remains unchanged.
+
+Example implementation:
+
+```python
+# my_pkg/preprocess.py
+def clean_md(text: str) -> str:
+    # strip HTML comments and collapse multiple blank lines
+    lines = [line for line in text.splitlines() if not line.strip().startswith("<!--")]
+    out = []
+    for line in lines:
+        if line.strip() == "" and (out and out[-1] == ""):
+            continue
+        out.append(line)
+    return "\n".join(out)
+```
```
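For intuition, the three accepted path formats might be resolved along these lines. This is a minimal sketch under stated assumptions: the helper name `load_callable` is illustrative, and haiku.rag's actual loader (added to `utils.py` in this release) may differ.

```python
# Hypothetical resolver for MARKDOWN_PREPROCESSOR-style callable paths.
import importlib
import importlib.util
from pathlib import Path
from typing import Callable


def load_callable(spec: str) -> Callable[[str], str]:
    if ":" in spec:
        target, func_name = spec.rsplit(":", 1)   # "pkg.mod:func" or "file.py:func"
    else:
        target, func_name = spec.rsplit(".", 1)   # "pkg.mod.func"
    if target.endswith(".py") or "/" in target:
        # File path form: load the module from its file location.
        module_spec = importlib.util.spec_from_file_location(Path(target).stem, target)
        assert module_spec is not None and module_spec.loader is not None
        module = importlib.util.module_from_spec(module_spec)
        module_spec.loader.exec_module(module)
    else:
        # Dotted module form: import normally.
        module = importlib.import_module(target)
    return getattr(module, func_name)
```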
```diff
--- haiku_rag-0.8.0/docs/index.md
+++ haiku_rag-0.9.0/docs/index.md
@@ -55,6 +55,7 @@ haiku-rag migrate old_database.sqlite # Migrate from SQLite
 - [Server](server.md) - File monitoring and server mode
 - [MCP](mcp.md) - Model Context Protocol integration
 - [Python](python.md) - Python API reference
+- [Agents](agents.md) - QA agent and multi-agent research
 
 ## License
 
```
```diff
--- haiku_rag-0.8.0/docs/python.md
+++ haiku_rag-0.9.0/docs/python.md
@@ -109,6 +109,14 @@ await client.vacuum()
 
 This compacts tables and removes historical versions to keep disk usage in check. It’s safe to run anytime, for example after bulk imports or periodically in long‑running apps.
 
+### Atomic Writes and Rollback
+
+Document create and update operations take a snapshot of table versions before any write and automatically roll back to that snapshot if something fails (for example, during chunking or embedding). This restores both the `documents` and `chunks` tables to their pre‑operation state using LanceDB’s table versioning.
+
+- Applies to: `create_document(...)`, `create_document_from_source(...)`, `update_document(...)`, and internal rebuild/update flows.
+- Scope: Both document rows and all associated chunks are rolled back together.
+- Vacuum: Running `vacuum()` later prunes old versions for disk efficiency; rollbacks occur immediately during the failing operation and are not impacted.
+
 ## Searching Documents
 
 The search method performs native hybrid search (vector + full-text) using LanceDB with optional reranking for improved relevance:
```
|
|
|
196
204
|
The QA agent will search your documents for relevant information and use the configured LLM to generate a comprehensive answer. With `cite=True`, responses include citations showing which documents were used as sources.
|
|
197
205
|
|
|
198
206
|
The QA provider and model can be configured via environment variables (see [Configuration](configuration.md)).
|
|
207
|
+
|
|
208
|
+
See also: [Agents](agents.md) for details on the QA agent and the multi‑agent research workflow.
|
|
```diff
--- haiku_rag-0.8.0/pyproject.toml
+++ haiku_rag-0.9.0/pyproject.toml
@@ -1,7 +1,7 @@
 [project]
 name = "haiku.rag"
-version = "0.8.0"
-description = "Retrieval Augmented Generation (RAG) with LanceDB"
+version = "0.9.0"
+description = "Agentic Retrieval Augmented Generation (RAG) with LanceDB"
 authors = [{ name = "Yiorgis Gozadinos", email = "ggozadinos@gmail.com" }]
 license = { text = "MIT" }
 readme = { file = "README.md", content-type = "text/markdown" }
@@ -22,14 +22,13 @@ classifiers = [
 ]
 
 dependencies = [
-    "docling>=2.
-    "fastmcp>=2.
+    "docling>=2.52.0",
+    "fastmcp>=2.12.3",
     "httpx>=0.28.1",
-    "lancedb>=0.
-    "
-    "pydantic>=
-    "
-    "python-dotenv>=1.1.0",
+    "lancedb>=0.25.0",
+    "pydantic>=2.11.9",
+    "pydantic-ai>=1.0.8",
+    "python-dotenv>=1.1.1",
     "rich>=14.1.0",
     "tiktoken>=0.11.0",
     "typer>=0.16.1",
@@ -37,7 +36,7 @@ dependencies = [
 ]
 
 [project.optional-dependencies]
-voyageai = ["voyageai>=0.3.
+voyageai = ["voyageai>=0.3.5"]
 mxbai = ["mxbai-rerank>=0.1.6"]
 
 [project.scripts]
@@ -52,15 +51,16 @@ packages = ["src/haiku"]
 
 [dependency-groups]
 dev = [
-    "datasets>=
+    "datasets>=4.1.0",
+    "logfire>=4.7.0",
     "mkdocs>=1.6.1",
     "mkdocs-material>=9.6.14",
     "pre-commit>=4.2.0",
-    "pyright>=1.1.
-    "pytest>=8.4.
-    "pytest-asyncio>=1.
-    "pytest-cov>=
-    "ruff>=0.
+    "pyright>=1.1.405",
+    "pytest>=8.4.2",
+    "pytest-asyncio>=1.2.0",
+    "pytest-cov>=7.0.0",
+    "ruff>=0.13.0",
 ]
 
 [tool.ruff]
```
```diff
--- haiku_rag-0.8.0/src/haiku/rag/app.py
+++ haiku_rag-0.9.0/src/haiku/rag/app.py
@@ -9,6 +9,7 @@ from haiku.rag.client import HaikuRAG
 from haiku.rag.config import Config
 from haiku.rag.mcp import create_mcp_server
 from haiku.rag.monitor import FileWatcher
+from haiku.rag.research.orchestrator import ResearchOrchestrator
 from haiku.rag.store.models.chunk import Chunk
 from haiku.rag.store.models.document import Document
 
@@ -78,6 +79,85 @@ class HaikuRAGApp:
             except Exception as e:
                 self.console.print(f"[red]Error: {e}[/red]")
 
+    async def research(
+        self, question: str, max_iterations: int = 3, verbose: bool = False
+    ):
+        """Run multi-agent research on a question."""
+        async with HaikuRAG(db_path=self.db_path) as client:
+            try:
+                # Create orchestrator with default config or fallback to QA
+                orchestrator = ResearchOrchestrator()
+
+                if verbose:
+                    self.console.print(
+                        f"[bold cyan]Starting research with {orchestrator.provider}:{orchestrator.model}[/bold cyan]"
+                    )
+                    self.console.print(f"[bold blue]Question:[/bold blue] {question}")
+                    self.console.print()
+
+                # Conduct research
+                report = await orchestrator.conduct_research(
+                    question=question,
+                    client=client,
+                    max_iterations=max_iterations,
+                    verbose=verbose,
+                    console=self.console if verbose else None,
+                )
+
+                # Display the report
+                self.console.print("[bold green]Research Report[/bold green]")
+                self.console.rule()
+
+                # Title and Executive Summary
+                self.console.print(f"[bold]{report.title}[/bold]")
+                self.console.print()
+                self.console.print("[bold cyan]Executive Summary:[/bold cyan]")
+                self.console.print(report.executive_summary)
+                self.console.print()
+
+                # Main Findings
+                if report.main_findings:
+                    self.console.print("[bold cyan]Main Findings:[/bold cyan]")
+                    for finding in report.main_findings:
+                        self.console.print(f"• {finding}")
+                    self.console.print()
+
+                # Themes
+                if report.themes:
+                    self.console.print("[bold cyan]Key Themes:[/bold cyan]")
+                    for theme, explanation in report.themes.items():
+                        self.console.print(f"• [bold]{theme}[/bold]: {explanation}")
+                    self.console.print()
+
+                # Conclusions
+                if report.conclusions:
+                    self.console.print("[bold cyan]Conclusions:[/bold cyan]")
+                    for conclusion in report.conclusions:
+                        self.console.print(f"• {conclusion}")
+                    self.console.print()
+
+                # Recommendations
+                if report.recommendations:
+                    self.console.print("[bold cyan]Recommendations:[/bold cyan]")
+                    for rec in report.recommendations:
+                        self.console.print(f"• {rec}")
+                    self.console.print()
+
+                # Limitations
+                if report.limitations:
+                    self.console.print("[bold yellow]Limitations:[/bold yellow]")
+                    for limitation in report.limitations:
+                        self.console.print(f"• {limitation}")
+                    self.console.print()
+
+                # Sources Summary
+                if report.sources_summary:
+                    self.console.print("[bold cyan]Sources:[/bold cyan]")
+                    self.console.print(report.sources_summary)
+
+            except Exception as e:
+                self.console.print(f"[red]Error during research: {e}[/red]")
+
     async def rebuild(self):
         async with HaikuRAG(db_path=self.db_path, skip_validation=True) as client:
             try:
```
```diff
--- haiku_rag-0.8.0/src/haiku/rag/cli.py
+++ haiku_rag-0.9.0/src/haiku/rag/cli.py
@@ -3,6 +3,7 @@ import warnings
 from importlib.metadata import version
 from pathlib import Path
 
+import logfire
 import typer
 from rich.console import Console
 
@@ -12,6 +13,9 @@ from haiku.rag.logging import configure_cli_logging
 from haiku.rag.migration import migrate_sqlite_to_lancedb
 from haiku.rag.utils import is_up_to_date
 
+logfire.configure(send_to_logfire="if-token-present")
+logfire.instrument_pydantic_ai()
+
 if not Config.ENV == "development":
     warnings.filterwarnings("ignore")
 
@@ -235,6 +239,38 @@ def ask(
     asyncio.run(app.ask(question=question, cite=cite))
 
 
+@cli.command("research", help="Run multi-agent research and output a concise report")
+def research(
+    question: str = typer.Argument(
+        help="The research question to investigate",
+    ),
+    max_iterations: int = typer.Option(
+        3,
+        "--max-iterations",
+        "-n",
+        help="Maximum search/analyze iterations",
+    ),
+    db: Path = typer.Option(
+        Config.DEFAULT_DATA_DIR / "haiku.rag.lancedb",
+        "--db",
+        help="Path to the LanceDB database file",
+    ),
+    verbose: bool = typer.Option(
+        False,
+        "--verbose",
+        help="Show verbose progress output",
+    ),
+):
+    app = HaikuRAGApp(db_path=db)
+    asyncio.run(
+        app.research(
+            question=question,
+            max_iterations=max_iterations,
+            verbose=verbose,
+        )
+    )
+
+
 @cli.command("settings", help="Display current configuration settings")
 def settings():
     app = HaikuRAGApp(db_path=Path())  # Don't need actual DB for settings
```
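The new command is a thin wrapper over the `HaikuRAGApp.research(...)` method added in `app.py` above. A programmatic equivalent, with an illustrative database path:

```python
# Programmatic equivalent of `haiku-rag research` (the db path is illustrative).
import asyncio
from pathlib import Path

from haiku.rag.app import HaikuRAGApp

app = HaikuRAGApp(db_path=Path("haiku.rag.lancedb"))
asyncio.run(
    app.research(
        question="What is climate change?",
        max_iterations=3,
        verbose=True,
    )
)
```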
```diff
--- haiku_rag-0.8.0/src/haiku/rag/config.py
+++ haiku_rag-0.9.0/src/haiku/rag/config.py
@@ -27,15 +27,25 @@ class AppConfig(BaseModel):
     RERANK_MODEL: str = ""
 
     QA_PROVIDER: str = "ollama"
-    QA_MODEL: str = "
+    QA_MODEL: str = "gpt-oss"
+
+    # Research defaults (fallback to QA if not provided via env)
+    RESEARCH_PROVIDER: str = "ollama"
+    RESEARCH_MODEL: str = "gpt-oss"
 
     CHUNK_SIZE: int = 256
     CONTEXT_CHUNK_RADIUS: int = 0
 
+    # Optional dotted path or file path to a callable that preprocesses
+    # markdown content before chunking. Examples:
+    MARKDOWN_PREPROCESSOR: str = ""
+
     OLLAMA_BASE_URL: str = "http://localhost:11434"
+
     VLLM_EMBEDDINGS_BASE_URL: str = ""
     VLLM_RERANK_BASE_URL: str = ""
     VLLM_QA_BASE_URL: str = ""
+    VLLM_RESEARCH_BASE_URL: str = ""
 
     # Provider keys
     VOYAGE_API_KEY: str = ""
```
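The comment above documents that research settings fall back to the QA settings when not provided via environment variables. In effect, the resolution behaves like the following sketch (an assumption about the behavior, not the library's actual code):

```python
# Hedged sketch of the documented fallback, not haiku.rag's real resolution logic:
# RESEARCH_* falls back to the corresponding QA_* value, then to the defaults.
import os

research_provider = os.getenv("RESEARCH_PROVIDER") or os.getenv("QA_PROVIDER", "ollama")
research_model = os.getenv("RESEARCH_MODEL") or os.getenv("QA_MODEL", "gpt-oss")
```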
```diff
--- haiku_rag-0.8.0/src/haiku/rag/qa/agent.py
+++ haiku_rag-0.9.0/src/haiku/rag/qa/agent.py
@@ -6,7 +6,7 @@ from pydantic_ai.providers.openai import OpenAIProvider
 
 from haiku.rag.client import HaikuRAG
 from haiku.rag.config import Config
-from haiku.rag.qa.prompts import
+from haiku.rag.qa.prompts import QA_SYSTEM_PROMPT, QA_SYSTEM_PROMPT_WITH_CITATIONS
 
 
 class SearchResult(BaseModel):
@@ -31,7 +31,9 @@ class QuestionAnswerAgent:
     ):
         self._client = client
 
-        system_prompt =
+        system_prompt = (
+            QA_SYSTEM_PROMPT_WITH_CITATIONS if use_citations else QA_SYSTEM_PROMPT
+        )
         model_obj = self._get_model(provider, model)
 
         self._agent = Agent(
```
```diff
--- haiku_rag-0.8.0/src/haiku/rag/qa/prompts.py
+++ haiku_rag-0.9.0/src/haiku/rag/qa/prompts.py
@@ -1,4 +1,4 @@
-
+QA_SYSTEM_PROMPT = """
 You are a knowledgeable assistant that helps users find information from a document knowledge base.
 
 Your process:
@@ -21,7 +21,7 @@ Be concise, and always maintain accuracy over completeness. Prefer short, direct
 /no_think
 """
 
-
+QA_SYSTEM_PROMPT_WITH_CITATIONS = """
 You are a knowledgeable assistant that helps users find information from a document knowledge base.
 
 IMPORTANT: You MUST use the search_documents tool for every question. Do not answer any question without first searching the knowledge base.
```
```diff
--- /dev/null
+++ haiku_rag-0.9.0/src/haiku/rag/research/__init__.py
@@ -0,0 +1,35 @@
+"""Multi-agent research workflow for advanced RAG queries."""
+
+from haiku.rag.research.base import (
+    BaseResearchAgent,
+    ResearchOutput,
+    SearchAnswer,
+    SearchResult,
+)
+from haiku.rag.research.dependencies import ResearchContext, ResearchDependencies
+from haiku.rag.research.evaluation_agent import (
+    AnalysisEvaluationAgent,
+    EvaluationResult,
+)
+from haiku.rag.research.orchestrator import ResearchOrchestrator, ResearchPlan
+from haiku.rag.research.search_agent import SearchSpecialistAgent
+from haiku.rag.research.synthesis_agent import ResearchReport, SynthesisAgent
+
+__all__ = [
+    # Base classes
+    "BaseResearchAgent",
+    "ResearchDependencies",
+    "ResearchContext",
+    "SearchResult",
+    "ResearchOutput",
+    # Specialized agents
+    "SearchAnswer",
+    "SearchSpecialistAgent",
+    "AnalysisEvaluationAgent",
+    "EvaluationResult",
+    "SynthesisAgent",
+    "ResearchReport",
+    # Orchestrator
+    "ResearchOrchestrator",
+    "ResearchPlan",
+]
```