PyPI - haiku.rag - Versions diffs - 0.8.0__tar.gz → 0.8.1__tar.gz - Mend

haiku.rag 0.8.0tar.gz → 0.8.1tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Potentially problematic release.


Files changed (78)

{haiku_rag-0.8.0 → haiku_rag-0.8.1}/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: haiku.rag
-Version: 0.8.0
+Version: 0.8.1
 Summary: Retrieval Augmented Generation (RAG) with LanceDB
 Author-email: Yiorgis Gozadinos <ggozadinos@gmail.com>
 License: MIT

{haiku_rag-0.8.0 → haiku_rag-0.8.1}/docs/benchmarks.md RENAMED

@@ -16,8 +16,8 @@ The recall obtained is ~0.79 for matching in the top result, raising to ~0.91 fo
 |---------------------------------------|-------------------|-------------------|------------------------|
 | Ollama / `mxbai-embed-large`          | 0.79              | 0.91              | None                   |
 | Ollama / `mxbai-embed-large`          | 0.90              | 0.95              | `mxbai-rerank-base-v2` |
-<!-- | Ollama / `nomic-embed-text`           | 0.74              | 0.88              | None                   |
-| OpenAI / `text-embeddings-3-small`    | 0.75              | 0.88              | None                   |
+| Ollama / `nomic-embed-text-v1.5`      | 0.74              | 0.90              | None                   |
+<!-- | OpenAI / `text-embeddings-3-small`    | 0.75              | 0.88              | None                   |
 | OpenAI / `text-embeddings-3-small`    | 0.75              | 0.88              | None                   |
 | OpenAI / `text-embeddings-3-small`    | 0.83              | 0.90              | Cohere / `rerank-v3.5` | -->

{haiku_rag-0.8.0 → haiku_rag-0.8.1}/docs/cli.md RENAMED

@@ -36,8 +36,10 @@ haiku-rag add-src https://example.com/article.html
 ```
 !!! note
-    As you add documents to `haiku.rag` the database keeps growing. By default, `lanceDB` supports versioning
-    of your data. You can optimize and compact the database by running the [vaccum](#vacuum-optimize-and-cleanup) command.
+    As you add documents to `haiku.rag` the database keeps growing. By default, LanceDB supports versioning
+    of your data. Create/update operations are atomic‑feeling: if anything fails during chunking or embedding,
+    the database rolls back to the pre‑operation snapshot using LanceDB table versioning. You can optimize and
+    compact the database by running the [vacuum](#vacuum-optimize-and-cleanup) command.
 ### Get Document

{haiku_rag-0.8.0 → haiku_rag-0.8.1}/docs/configuration.md RENAMED

@@ -223,3 +223,35 @@ CHUNK_SIZE=256
 # into single chunks with continuous content to eliminate duplication
 CONTEXT_CHUNK_RADIUS=0
 ```
+#### Markdown Preprocessor
+Optionally preprocess Markdown before chunking by pointing to a callable that receives and returns Markdown text. This is useful for normalizing content, stripping boilerplate, or applying custom transformations before chunk boundaries are computed.
+```bash
+# A callable path in one of these formats:
+# - package.module:func
+# - package.module.func
+# - /abs/or/relative/path/to/file.py:func
+MARKDOWN_PREPROCESSOR="my_pkg.preprocess:clean_md"
+```
+!!! note
+    - The function signature should be `def clean_md(text: str) -> str` or `async def clean_md(text: str) -> str`.
+    - If the function raises or returns a non-string, haiku.rag logs a warning and proceeds without preprocessing.
+    - The preprocessor affects only the chunking pipeline. The stored document content remains unchanged.
+Example implementation:
+```python
+# my_pkg/preprocess.py
+def clean_md(text: str) -> str:
+    # strip HTML comments and collapse multiple blank lines
+    lines = [line for line in text.splitlines() if not line.strip().startswith("<!--")]
+    out = []
+    for line in lines:
+        if line.strip() == "" and (out and out[-1] == ""):
+            continue
+        out.append(line)
+    return "\n".join(out)
+```
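To make the documented example concrete, here is a small walk-through that repeats `clean_md` from the docs above and runs it on a made-up sample string. The sample input is an assumption chosen for illustration; it does not come from the package.

```python
def clean_md(text: str) -> str:
    # Same logic as the documented example: drop lines that start an HTML
    # comment, then collapse consecutive empty lines into a single one.
    lines = [line for line in text.splitlines() if not line.strip().startswith("<!--")]
    out: list[str] = []
    for line in lines:
        if line.strip() == "" and (out and out[-1] == ""):
            continue
        out.append(line)
    return "\n".join(out)


# Hypothetical input, used only to illustrate the behavior.
sample = "# Title\n<!-- draft note -->\n\n\n\nBody text"
cleaned = clean_md(sample)
print(repr(cleaned))  # '# Title\n\nBody text'
```

Because the filter works line by line, a comment that spans several lines loses only its first line; a preprocessor that needs to remove multi-line comments would have to handle that case itself.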

{haiku_rag-0.8.0 → haiku_rag-0.8.1}/docs/python.md RENAMED

@@ -109,6 +109,14 @@ await client.vacuum()
 This compacts tables and removes historical versions to keep disk usage in check. It’s safe to run anytime, for example after bulk imports or periodically in long‑running apps.
+### Atomic Writes and Rollback
+Document create and update operations take a snapshot of table versions before any write and automatically roll back to that snapshot if something fails (for example, during chunking or embedding). This restores both the `documents` and `chunks` tables to their pre‑operation state using LanceDB’s table versioning.
+- Applies to: `create_document(...)`, `create_document_from_source(...)`, `update_document(...)`, and internal rebuild/update flows.
+- Scope: Both document rows and all associated chunks are rolled back together.
+- Vacuum: Running `vacuum()` later prunes old versions for disk efficiency; rollbacks occur immediately during the failing operation and are not impacted.
 ## Searching Documents
 The search method performs native hybrid search (vector + full-text) using LanceDB with optional reranking for improved relevance:

{haiku_rag-0.8.0 → haiku_rag-0.8.1}/pyproject.toml RENAMED

@@ -1,6 +1,6 @@
 [project]
 name = "haiku.rag"
-version = "0.8.0"
+version = "0.8.1"
 description = "Retrieval Augmented Generation (RAG) with LanceDB"
 authors = [{ name = "Yiorgis Gozadinos", email = "ggozadinos@gmail.com" }]
 license = { text = "MIT" }
@@ -53,6 +53,7 @@ packages = ["src/haiku"]
 [dependency-groups]
 dev = [
     "datasets>=3.6.0",
+    "logfire>=4.6.0",
     "mkdocs>=1.6.1",
     "mkdocs-material>=9.6.14",
     "pre-commit>=4.2.0",

{haiku_rag-0.8.0 → haiku_rag-0.8.1}/src/haiku/rag/config.py RENAMED

@@ -32,6 +32,10 @@ class AppConfig(BaseModel):
     CHUNK_SIZE: int = 256
     CONTEXT_CHUNK_RADIUS: int = 0
+    # Optional dotted path or file path to a callable that preprocesses
+    # markdown content before chunking. Examples:
+    MARKDOWN_PREPROCESSOR: str = ""
     OLLAMA_BASE_URL: str = "http://localhost:11434"
     VLLM_EMBEDDINGS_BASE_URL: str = ""
     VLLM_RERANK_BASE_URL: str = ""

{haiku_rag-0.8.0 → haiku_rag-0.8.1}/src/haiku/rag/store/engine.py RENAMED

@@ -209,6 +209,21 @@ class Store:
         # LanceDB connections are automatically managed
         pass
+    def current_table_versions(self) -> dict[str, int]:
+        """Capture current versions of key tables for rollback using LanceDB's API."""
+        return {
+            "documents": int(self.documents_table.version),
+            "chunks": int(self.chunks_table.version),
+            "settings": int(self.settings_table.version),
+        }
+    def restore_table_versions(self, versions: dict[str, int]) -> bool:
+        """Restore tables to the provided versions using LanceDB's API."""
+        self.documents_table.restore(int(versions["documents"]))
+        self.chunks_table.restore(int(versions["chunks"]))
+        self.settings_table.restore(int(versions["settings"]))
+        return True
     @property
     def _connection(self):
         """Compatibility property for repositories expecting _connection."""

{haiku_rag-0.8.0 → haiku_rag-0.8.1}/src/haiku/rag/store/repositories/chunk.py RENAMED

@@ -1,4 +1,5 @@
 import asyncio
+import inspect
 import json
 import logging
 from uuid import uuid4
@@ -11,6 +12,7 @@ from haiku.rag.config import Config
 from haiku.rag.embeddings import get_embedder
 from haiku.rag.store.engine import DocumentRecord, Store
 from haiku.rag.store.models.chunk import Chunk
+from haiku.rag.utils import load_callable, text_to_docling_document
 logger = logging.getLogger(__name__)
@@ -152,7 +154,28 @@ class ChunkRepository:
         self, document_id: str, document: DoclingDocument
     ) -> list[Chunk]:
         """Create chunks and embeddings for a document from DoclingDocument."""
-        chunk_texts = await chunker.chunk(document)
+        # Optionally preprocess markdown before chunking
+        processed_document = document
+        preprocessor_path = Config.MARKDOWN_PREPROCESSOR
+        if preprocessor_path:
+            try:
+                pre_fn = load_callable(preprocessor_path)
+                markdown = document.export_to_markdown()
+                result = pre_fn(markdown)
+                if inspect.isawaitable(result):
+                    result = await result  # type: ignore[assignment]
+                processed_markdown = result
+                if not isinstance(processed_markdown, str):
+                    raise ValueError("Preprocessor must return a markdown string")
+                processed_document = text_to_docling_document(
+                    processed_markdown, name="content.md"
+                )
+            except Exception as e:
+                logger.warning(
+                    f"Failed to apply MARKDOWN_PREPROCESSOR '{preprocessor_path}': {e}. Proceeding without preprocessing."
+                )
+        chunk_texts = await chunker.chunk(processed_document)
         embeddings = await self.embedder.embed(chunk_texts)
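The hunk above calls the preprocessor once and awaits the result only if it is awaitable, so one code path serves both `def` and `async def` callables. A minimal standalone sketch of that pattern follows; `apply_preprocessor`, `shout`, `strip_async`, and `broken` are hypothetical names for illustration and are not part of haiku.rag.

```python
import asyncio
import inspect
from collections.abc import Callable


async def apply_preprocessor(fn: Callable[[str], object], markdown: str) -> str:
    """Call fn, await the result if needed, and fall back to the input on failure."""
    try:
        result = fn(markdown)
        if inspect.isawaitable(result):
            # An async preprocessor returns a coroutine; resolve it here.
            result = await result
        if not isinstance(result, str):
            raise ValueError("Preprocessor must return a markdown string")
        return result
    except Exception:
        # Mirror the "proceed without preprocessing" behavior of the hunk above.
        return markdown


def shout(text: str) -> str:  # synchronous preprocessor
    return text.upper()


async def strip_async(text: str) -> str:  # asynchronous preprocessor
    return text.strip()


def broken(text: str) -> int:  # returns a non-string, so the input is kept
    return 42


async def main() -> None:
    print(await apply_preprocessor(shout, "abc"))        # ABC
    print(await apply_preprocessor(strip_async, " x "))  # x
    print(await apply_preprocessor(broken, "keep me"))   # keep me


asyncio.run(main())
```

Checking the result with `inspect.isawaitable` rather than inspecting the function up front also covers synchronous wrappers that return a coroutine.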

{haiku_rag-0.8.0 → haiku_rag-0.8.1}/src/haiku/rag/store/repositories/document.py RENAMED

@@ -171,44 +171,64 @@ class DocumentRepository:
         chunks: list["Chunk"] | None = None,
     ) -> Document:
         """Create a document with its chunks and embeddings."""
+        # Snapshot table versions for versioned rollback (if supported)
+        versions = self.store.current_table_versions()
         # Create the document
         created_doc = await self.create(entity)
-        # Create chunks if not provided
-        if chunks is None:
-            assert created_doc.id is not None, (
-                "Document ID should not be None after creation"
-            )
-            await self.chunk_repository.create_chunks_for_document(
-                created_doc.id, docling_document
-            )
-        else:
-            # Use provided chunks, set order from list position
-            assert created_doc.id is not None, (
-                "Document ID should not be None after creation"
-            )
-            for order, chunk in enumerate(chunks):
-                chunk.document_id = created_doc.id
-                chunk.metadata["order"] = order
-                await self.chunk_repository.create(chunk)
-        return created_doc
+        # Attempt to create chunks; on failure, prefer version rollback
+        try:
+            # Create chunks if not provided
+            if chunks is None:
+                assert created_doc.id is not None, (
+                    "Document ID should not be None after creation"
+                )
+                await self.chunk_repository.create_chunks_for_document(
+                    created_doc.id, docling_document
+                )
+            else:
+                # Use provided chunks, set order from list position
+                assert created_doc.id is not None, (
+                    "Document ID should not be None after creation"
+                )
+                for order, chunk in enumerate(chunks):
+                    chunk.document_id = created_doc.id
+                    chunk.metadata["order"] = order
+                    await self.chunk_repository.create(chunk)
+            return created_doc
+        except Exception:
+            # Roll back to the captured versions and re-raise
+            self.store.restore_table_versions(versions)
+            raise
     async def _update_with_docling(
         self, entity: Document, docling_document: DoclingDocument
     ) -> Document:
         """Update a document and regenerate its chunks."""
-        # Delete existing chunks
         assert entity.id is not None, "Document ID is required for update"
+        # Snapshot table versions for versioned rollback
+        versions = self.store.current_table_versions()
+        # Delete existing chunks before writing new ones
         await self.chunk_repository.delete_by_document_id(entity.id)
-        # Update the document
-        updated_doc = await self.update(entity)
+        try:
+            # Update the document
+            updated_doc = await self.update(entity)
-        # Create new chunks
-        assert updated_doc.id is not None, "Document ID should not be None after update"
-        await self.chunk_repository.create_chunks_for_document(
-            updated_doc.id, docling_document
-        )
+            # Create new chunks
+            assert updated_doc.id is not None, (
+                "Document ID should not be None after update"
+            )
+            await self.chunk_repository.create_chunks_for_document(
+                updated_doc.id, docling_document
+            )
-        return updated_doc
+            return updated_doc
+        except Exception:
+            # Roll back to the captured versions and re-raise
+            self.store.restore_table_versions(versions)
+            raise
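The snapshot-then-restore flow in this hunk can be illustrated without a database. The sketch below uses a toy in-memory `VersionedTable` as a stand-in for a versioned table; the class, `create_with_rollback`, and the "empty chunk" failure are all assumptions made for illustration and say nothing about how LanceDB stores versions internally.

```python
class VersionedTable:
    """Toy stand-in for a versioned table: every write creates a new version."""

    def __init__(self) -> None:
        self._history: list[list[str]] = [[]]

    @property
    def version(self) -> int:
        return len(self._history) - 1

    @property
    def rows(self) -> list[str]:
        return list(self._history[-1])

    def add(self, row: str) -> None:
        self._history.append(self._history[-1] + [row])

    def restore(self, version: int) -> None:
        # Make the state recorded at `version` the current state again.
        self._history.append(list(self._history[version]))


def create_with_rollback(
    documents: VersionedTable, chunks: VersionedTable, doc: str, chunk_texts: list[str]
) -> None:
    # Capture versions before any write, as the hunk above does.
    snapshot = {"documents": documents.version, "chunks": chunks.version}
    try:
        documents.add(doc)
        for text in chunk_texts:
            if not text:
                raise ValueError("empty chunk")  # simulated chunking failure
            chunks.add(text)
    except Exception:
        # Roll both tables back together, then re-raise for the caller.
        documents.restore(snapshot["documents"])
        chunks.restore(snapshot["chunks"])
        raise


docs, chs = VersionedTable(), VersionedTable()
create_with_rollback(docs, chs, "doc1", ["a", "b"])
try:
    create_with_rollback(docs, chs, "doc2", ["c", ""])
except ValueError:
    pass
print(docs.rows, chs.rows)  # ['doc1'] ['a', 'b']
```

The failed second call leaves neither the `doc2` row nor the partial chunk `c` behind, which is the property the diff aims for. Note that in `_update_with_docling` the chunk deletion runs before the `try` block, so only failures in the update and re-chunking steps trigger the restore.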

{haiku_rag-0.8.0 → haiku_rag-0.8.1}/src/haiku/rag/utils.py RENAMED

@@ -1,10 +1,13 @@
 import asyncio
+import importlib
+import importlib.util
 import sys
 from collections.abc import Callable
 from functools import wraps
 from importlib import metadata
 from io import BytesIO
 from pathlib import Path
+from types import ModuleType
 import httpx
 from docling.document_converter import DocumentConverter
@@ -106,3 +109,54 @@ def text_to_docling_document(text: str, name: str = "content.md") -> DoclingDocu
     converter = DocumentConverter()
     result = converter.convert(doc_stream)
     return result.document
+def load_callable(path: str):
+    """Load a callable from a dotted path or file path.
+    Supported formats:
+    - "package.module:func" or "package.module.func"
+    - "path/to/file.py:func"
+    Returns the loaded callable. Raises ValueError on failure.
+    """
+    if not path:
+        raise ValueError("Empty callable path provided")
+    module_part = None
+    func_name = None
+    if ":" in path:
+        module_part, func_name = path.split(":", 1)
+    else:
+        # split by last dot for module.attr
+        if "." in path:
+            module_part, func_name = path.rsplit(".", 1)
+        else:
+            raise ValueError(
+                "Invalid callable path format. Use 'module:func' or 'module.func' or 'file.py:func'."
+            )
+    # Try file path first
+    mod: ModuleType | None = None
+    module_path = Path(module_part)
+    if module_path.suffix == ".py" and module_path.exists():
+        spec = importlib.util.spec_from_file_location(module_path.stem, module_path)
+        if spec and spec.loader:
+            mod = importlib.util.module_from_spec(spec)
+            spec.loader.exec_module(mod)
+    else:
+        # Import as a module path
+        try:
+            mod = importlib.import_module(module_part)
+        except Exception as e:
+            raise ValueError(f"Failed to import module '{module_part}': {e}")
+    if not hasattr(mod, func_name):
+        raise ValueError(f"Callable '{func_name}' not found in module '{module_part}'")
+    func = getattr(mod, func_name)
+    if not callable(func):
+        raise ValueError(
+            f"Attribute '{func_name}' in module '{module_part}' is not callable"
+        )
+    return func
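As a reading aid for the path parsing in `load_callable`, here is a reduced sketch that handles only the two module forms (`module:func` and `module.func`) and resolves them against the standard library. The name `resolve` is hypothetical, and the file-path branch of the real function is deliberately left out.

```python
import importlib


def resolve(path: str):
    """Resolve 'module:func' or 'module.func' to a callable (simplified sketch)."""
    if ":" in path:
        # Explicit separator: everything before the colon is the module.
        module_part, func_name = path.split(":", 1)
    elif "." in path:
        # No colon: treat the last dotted component as the attribute name.
        module_part, func_name = path.rsplit(".", 1)
    else:
        raise ValueError("Use 'module:func' or 'module.func'")
    module = importlib.import_module(module_part)
    func = getattr(module, func_name, None)
    if not callable(func):
        raise ValueError(f"'{func_name}' is not a callable in '{module_part}'")
    return func


dumps = resolve("json:dumps")
sqrt = resolve("math.sqrt")
print(dumps({"a": 1}))  # {"a": 1}
print(sqrt(9.0))        # 3.0
```

The colon form is the less ambiguous of the two, since a dotted path has to guess that the final component is the attribute rather than a submodule.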

{haiku_rag-0.8.0 → haiku_rag-0.8.1}/tests/generate_benchmark_db.py RENAMED

@@ -1,6 +1,7 @@
 import asyncio
 from pathlib import Path
+import logfire
 from datasets import Dataset, load_dataset
 from llm_judge import LLMJudge
 from rich.console import Console
@@ -11,6 +12,8 @@ from haiku.rag.client import HaikuRAG
 from haiku.rag.logging import configure_cli_logging
 from haiku.rag.qa import get_qa_agent
+logfire.configure()
+logfire.instrument_pydantic_ai()
 configure_cli_logging()
 console = Console()
@@ -119,7 +122,6 @@ async def run_qa_benchmark(k: int | None = None):
         async with HaikuRAG(db_path) as rag:
             qa = get_qa_agent(rag)
             for doc in corpus:
                 question = doc["question"]  # type: ignore
                 expected_answer = doc["answer"]  # type: ignore

{haiku_rag-0.8.0 → haiku_rag-0.8.1}/tests/test_client.py RENAMED

@@ -526,121 +526,117 @@ async def test_client_ask_with_cite(temp_db_path):
 @pytest.mark.asyncio
 async def test_client_expand_context(temp_db_path):
     """Test expanding search results with adjacent chunks."""
-    # Mock Config to have CONTEXT_CHUNK_RADIUS = 2
-    with patch("haiku.rag.client.Config.CONTEXT_CHUNK_RADIUS", 2):
-        async with HaikuRAG(temp_db_path) as client:
-            # Create chunks manually
-            manual_chunks = [
-                Chunk(content="Chunk 0 content", metadata={"order": 0}),
-                Chunk(content="Chunk 1 content", metadata={"order": 1}),
-                Chunk(content="Chunk 2 content", metadata={"order": 2}),
-                Chunk(content="Chunk 3 content", metadata={"order": 3}),
-                Chunk(content="Chunk 4 content", metadata={"order": 4}),
-            ]
-            doc = await client.create_document(
-                content="Full document content",
-                uri="test_doc.txt",
-                chunks=manual_chunks,
-            )
+    async with HaikuRAG(temp_db_path) as client:
+        # Create chunks manually
+        manual_chunks = [
+            Chunk(content="Chunk 0 content", metadata={"order": 0}),
+            Chunk(content="Chunk 1 content", metadata={"order": 1}),
+            Chunk(content="Chunk 2 content", metadata={"order": 2}),
+            Chunk(content="Chunk 3 content", metadata={"order": 3}),
+            Chunk(content="Chunk 4 content", metadata={"order": 4}),
+        ]
-            # Get all chunks for the document
-            assert doc.id is not None
-            chunks = await client.chunk_repository.get_by_document_id(doc.id)
-            assert len(chunks) == 5
+        doc = await client.create_document(
+            content="Full document content",
+            uri="test_doc.txt",
+            chunks=manual_chunks,
+        )
-            # Find the middle chunk (order=2)
-            middle_chunk = next(c for c in chunks if c.metadata.get("order") == 2)
-            search_results = [(middle_chunk, 0.8)]
+        # Get all chunks for the document
+        assert doc.id is not None
+        chunks = await client.chunk_repository.get_by_document_id(doc.id)
+        assert len(chunks) == 5
-            # Test expand_context
-            expanded_results = await client.expand_context(search_results)
+        # Find the middle chunk (order=2)
+        middle_chunk = next(c for c in chunks if c.metadata.get("order") == 2)
+        search_results = [(middle_chunk, 0.8)]
-            assert len(expanded_results) == 1
-            expanded_chunk, score = expanded_results[0]
+        # Test expand_context with radius=2
+        expanded_results = await client.expand_context(search_results, radius=2)
-            # Check that the expanded chunk has combined content
-            assert expanded_chunk.id == middle_chunk.id
-            assert score == 0.8
-            assert "Chunk 2 content" in expanded_chunk.content
+        assert len(expanded_results) == 1
+        expanded_chunk, score = expanded_results[0]
-            # Should include all chunks (radius=2 from chunk 2 = chunks 0,1,2,3,4)
-            assert "Chunk 0 content" in expanded_chunk.content
-            assert "Chunk 1 content" in expanded_chunk.content
-            assert "Chunk 2 content" in expanded_chunk.content
-            assert "Chunk 3 content" in expanded_chunk.content
-            assert "Chunk 4 content" in expanded_chunk.content
+        # Check that the expanded chunk has combined content
+        assert expanded_chunk.id == middle_chunk.id
+        assert score == 0.8
+        assert "Chunk 2 content" in expanded_chunk.content
+        # Should include all chunks (radius=2 from chunk 2 = chunks 0,1,2,3,4)
+        assert "Chunk 0 content" in expanded_chunk.content
+        assert "Chunk 1 content" in expanded_chunk.content
+        assert "Chunk 2 content" in expanded_chunk.content
+        assert "Chunk 3 content" in expanded_chunk.content
+        assert "Chunk 4 content" in expanded_chunk.content
 @pytest.mark.asyncio
 async def test_client_expand_context_radius_zero(temp_db_path):
     """Test expand_context with radius 0 returns original results."""
-    with patch("haiku.rag.client.Config.CONTEXT_CHUNK_RADIUS", 0):
-        async with HaikuRAG(temp_db_path) as client:
-            # Create a simple document
-            doc = await client.create_document(content="Simple test content")
-            assert doc.id is not None
-            chunks = await client.chunk_repository.get_by_document_id(doc.id)
+    async with HaikuRAG(temp_db_path) as client:
+        # Create a simple document
+        doc = await client.create_document(content="Simple test content")
+        assert doc.id is not None
+        chunks = await client.chunk_repository.get_by_document_id(doc.id)
-            search_results = [(chunks[0], 0.9)]
-            expanded_results = await client.expand_context(search_results)
+        search_results = [(chunks[0], 0.9)]
+        expanded_results = await client.expand_context(search_results, radius=0)
-            # Should return exactly the same results
-            assert expanded_results == search_results
+        # Should return exactly the same results
+        assert expanded_results == search_results
 @pytest.mark.asyncio
 async def test_client_expand_context_multiple_chunks(temp_db_path):
     """Test expand_context with multiple search results."""
-    with patch("haiku.rag.client.Config.CONTEXT_CHUNK_RADIUS", 1):
-        async with HaikuRAG(temp_db_path) as client:
-            # Create first document with manual chunks
-            doc1_chunks = [
-                Chunk(content="Doc1 Part A", metadata={"order": 0}),
-                Chunk(content="Doc1 Part B", metadata={"order": 1}),
-                Chunk(content="Doc1 Part C", metadata={"order": 2}),
-            ]
-            doc1 = await client.create_document(
-                content="Doc1 content", uri="doc1.txt", chunks=doc1_chunks
-            )
+    async with HaikuRAG(temp_db_path) as client:
+        # Create first document with manual chunks
+        doc1_chunks = [
+            Chunk(content="Doc1 Part A", metadata={"order": 0}),
+            Chunk(content="Doc1 Part B", metadata={"order": 1}),
+            Chunk(content="Doc1 Part C", metadata={"order": 2}),
+        ]
+        doc1 = await client.create_document(
+            content="Doc1 content", uri="doc1.txt", chunks=doc1_chunks
+        )
-            # Create second document with manual chunks
-            doc2_chunks = [
-                Chunk(content="Doc2 Section X", metadata={"order": 0}),
-                Chunk(content="Doc2 Section Y", metadata={"order": 1}),
-            ]
-            doc2 = await client.create_document(
-                content="Doc2 content", uri="doc2.txt", chunks=doc2_chunks
-            )
+        # Create second document with manual chunks
+        doc2_chunks = [
+            Chunk(content="Doc2 Section X", metadata={"order": 0}),
+            Chunk(content="Doc2 Section Y", metadata={"order": 1}),
+        ]
+        doc2 = await client.create_document(
+            content="Doc2 content", uri="doc2.txt", chunks=doc2_chunks
+        )
-            assert doc1.id is not None
-            assert doc2.id is not None
-            chunks1 = await client.chunk_repository.get_by_document_id(doc1.id)
-            chunks2 = await client.chunk_repository.get_by_document_id(doc2.id)
-            # Get middle chunk from doc1 (order=1) and first chunk from doc2 (order=0)
-            chunk1 = next(c for c in chunks1 if c.metadata.get("order") == 1)
-            chunk2 = next(c for c in chunks2 if c.metadata.get("order") == 0)
-            search_results = [(chunk1, 0.8), (chunk2, 0.7)]
-            expanded_results = await client.expand_context(search_results)
-            assert len(expanded_results) == 2
-            # Check first expanded result (should include chunks 0,1,2 from doc1)
-            expanded1, score1 = expanded_results[0]
-            assert expanded1.id == chunk1.id
-            assert score1 == 0.8
-            assert "Doc1 Part A" in expanded1.content
-            assert "Doc1 Part B" in expanded1.content
-            assert "Doc1 Part C" in expanded1.content
-            # Check second expanded result (should include chunks 0,1 from doc2)
-            expanded2, score2 = expanded_results[1]
-            assert expanded2.id == chunk2.id
-            assert score2 == 0.7
-            assert "Doc2 Section X" in expanded2.content
-            assert "Doc2 Section Y" in expanded2.content
+        assert doc1.id is not None
+        assert doc2.id is not None
+        chunks1 = await client.chunk_repository.get_by_document_id(doc1.id)
+        chunks2 = await client.chunk_repository.get_by_document_id(doc2.id)
+        # Get middle chunk from doc1 (order=1) and first chunk from doc2 (order=0)
+        chunk1 = next(c for c in chunks1 if c.metadata.get("order") == 1)
+        chunk2 = next(c for c in chunks2 if c.metadata.get("order") == 0)
+        search_results = [(chunk1, 0.8), (chunk2, 0.7)]
+        expanded_results = await client.expand_context(search_results, radius=1)
+        assert len(expanded_results) == 2
+        # Check first expanded result (should include chunks 0,1,2 from doc1)
+        expanded1, score1 = expanded_results[0]
+        assert expanded1.id == chunk1.id
+        assert score1 == 0.8
+        assert "Doc1 Part A" in expanded1.content
+        assert "Doc1 Part B" in expanded1.content
+        assert "Doc1 Part C" in expanded1.content
+        # Check second expanded result (should include chunks 0,1 from doc2)
+        expanded2, score2 = expanded_results[1]
+        assert expanded2.id == chunk2.id
+        assert score2 == 0.7
+        assert "Doc2 Section X" in expanded2.content
+        assert "Doc2 Section Y" in expanded2.content
 @pytest.mark.asyncio
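The tests above pass `radius` directly to `expand_context` and expect neighbouring chunks to be merged into the hit. The toy function below shows one way such a window could be computed over an ordered list of chunk texts; `expand` and the newline join are assumptions for illustration only, since the diff does not show how haiku.rag combines the content.

```python
def expand(chunks: list[str], index: int, radius: int) -> str:
    """Join the chunk at `index` with up to `radius` neighbours on each side."""
    if radius <= 0:
        # Radius 0 leaves the original chunk untouched.
        return chunks[index]
    start = max(0, index - radius)                # clamp at the first chunk
    end = min(len(chunks), index + radius + 1)    # clamp at the last chunk
    return "\n".join(chunks[start:end])


doc1 = [f"Chunk {i} content" for i in range(5)]
print(expand(doc1, 2, 2))  # all five chunks, as in the radius=2 test
print(expand(["Doc2 Section X", "Doc2 Section Y"], 0, 1))  # both sections
```

The clamping explains why the second search result in `test_client_expand_context_multiple_chunks` contains only two sections: there is no chunk before order 0 to include.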
