PyPI - kiln-ai - Versions diffs - 0.21.0__tar.gz → 0.22.1__tar.gz - Mend

Potentially problematic release.

This version of kiln-ai might be problematic.

Files changed (257)

{kiln_ai-0.21.0 → kiln_ai-0.22.1}/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: kiln-ai
-Version: 0.21.0
+Version: 0.22.1
 Summary: Kiln AI
 Project-URL: Homepage, https://kiln.tech
 Project-URL: Repository, https://github.com/Kiln-AI/kiln
@@ -28,8 +28,10 @@ Requires-Dist: llama-index-vector-stores-lancedb>=0.3.3
 Requires-Dist: llama-index>=0.13.3
 Requires-Dist: openai>=1.53.0
 Requires-Dist: pdoc>=15.0.0
+Requires-Dist: pillow>=11.1.0
 Requires-Dist: pydantic>=2.9.2
 Requires-Dist: pypdf>=6.0.0
+Requires-Dist: pypdfium2>=4.30.0
 Requires-Dist: pytest-benchmark>=5.1.0
 Requires-Dist: pytest-cov>=6.0.0
 Requires-Dist: pyyaml>=6.0.2
@@ -83,6 +85,10 @@ The library has a [comprehensive set of docs](https://kiln-ai.github.io/Kiln/kil
   - [Building and Running a Kiln Task from Code](#building-and-running-a-kiln-task-from-code)
   - [Tagging Task Runs Programmatically](#tagging-task-runs-programmatically)
   - [Adding Custom Model or AI Provider from Code](#adding-custom-model-or-ai-provider-from-code)
+- [Taking Kiln RAG to production](#taking-kiln-rag-to-production)
+  - [Load a LlamaIndex Vector Store](#load-a-llamaindex-vector-store)
+  - [Example: LanceDB Cloud](#example-lancedb-cloud)
+  - [Deploy RAG without LlamaIndex](#deploy-rag-without-llamaindex)
 - [Full API Reference](#full-api-reference)
 ## Installation
@@ -350,6 +356,78 @@ custom_model_ids.append(new_model)
 Config.shared().custom_models = custom_model_ids
 ```
+## Taking Kiln RAG to production
+When you're ready to deploy your RAG system, you can export your processed documents to any vector store supported by LlamaIndex. This allows you to use your Kiln-configured chunking and embedding settings in production.
+### Load a LlamaIndex Vector Store
+Kiln provides a `VectorStoreLoader` that yields your processed document chunks as LlamaIndex `TextNode` objects. These nodes contain the same metadata, chunking and embedding data as your Kiln Search Tool configuration.
+```py
+from kiln_ai.datamodel import Project
+from kiln_ai.datamodel.rag import RagConfig
+from kiln_ai.adapters.vector_store_loaders import VectorStoreLoader
+# Load your project and RAG configuration
+project = Project.load_from_file("path/to/your/project.kiln")
+rag_config = RagConfig.from_id_and_parent_path("rag-config-id", project.path)
+# Create the loader
+loader = VectorStoreLoader(project=project, rag_config=rag_config)
+# Export chunks to any LlamaIndex vector store
+async for batch in loader.iter_llama_index_nodes(batch_size=10):
+    # Insert into your chosen vector store
+    # Examples: LanceDB, Pinecone, Chroma, Qdrant, etc.
+    pass
+```
+**Supported Vector Stores:** LlamaIndex supports 20+ vector stores including LanceDB, Pinecone, Weaviate, Chroma, Qdrant, and more. See the [full list](https://developers.llamaindex.ai/python/framework/module_guides/storing/vector_stores/).
+### Example: LanceDB Cloud
+Kiln uses LanceDB internally, so LanceDB Cloud gives you the same indexing behaviour as the Kiln app.
+Here's a complete example using LanceDB Cloud:
+```py
+from kiln_ai.datamodel import Project
+from kiln_ai.datamodel.rag import RagConfig
+from kiln_ai.datamodel.vector_store import VectorStoreConfig
+from kiln_ai.adapters.vector_store_loaders import VectorStoreLoader
+from kiln_ai.adapters.vector_store.lancedb_adapter import lancedb_construct_from_config
+# Load configurations
+project = Project.load_from_file("path/to/your/project.kiln")
+rag_config = RagConfig.from_id_and_parent_path("rag-config-id", project.path)
+vector_store_config = VectorStoreConfig.from_id_and_parent_path(
+    rag_config.vector_store_config_id, project.path,
+)
+# Create LanceDB vector store
+lancedb_store = lancedb_construct_from_config(
+    vector_store_config=vector_store_config,
+    uri="db://my-project",
+    api_key="sk_...",
+    region="us-east-1",
+    table_name="my-documents",  # Created automatically
+)
+# Export and insert your documents
+loader = VectorStoreLoader(project=project, rag_config=rag_config)
+async for batch in loader.iter_llama_index_nodes(batch_size=100):
+    await lancedb_store.async_add(batch)
+print("Documents successfully exported to LanceDB!")
+```
+After export, query your data using [LlamaIndex](https://developers.llamaindex.ai/python/framework-api-reference/storage/vector_store/lancedb/) or the [LanceDB client](https://lancedb.github.io/lancedb/).
+### Deploy RAG without LlamaIndex
+While Kiln is designed for deploying to LlamaIndex, you don't need to use it. `iter_llama_index_nodes` yields batches of `TextNode` objects, which include all the data you need to build a RAG index in any stack: embedding, text, document name, chunk ID, etc.
 ## Full API Reference
 The library can do a lot more than the examples we've shown here.
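
Note on the README snippets in this hunk: they use `async for` at module level, which only runs inside an async context such as a notebook or an asyncio REPL. In a plain script the loop would need to live in a coroutine started with `asyncio.run`. The sketch below shows that shape only. `fake_iter_nodes` and `export_all` are hypothetical stand-ins and are not part of Kiln; in real use the loop would iterate over `loader.iter_llama_index_nodes(...)` as shown in the README.

```python
import asyncio
from typing import AsyncIterator, List


async def fake_iter_nodes(total: int, batch_size: int) -> AsyncIterator[List[int]]:
    """Hypothetical stand-in for loader.iter_llama_index_nodes(batch_size=...)."""
    for start in range(0, total, batch_size):
        # Yield one batch at a time, like the loader in the README example.
        yield list(range(start, min(start + batch_size, total)))


async def export_all(total: int, batch_size: int) -> int:
    """Consume every batch and return how many items were seen."""
    exported = 0
    async for batch in fake_iter_nodes(total, batch_size):
        # A real script would insert the batch into a vector store here.
        exported += len(batch)
    return exported


if __name__ == "__main__":
    # asyncio.run creates the event loop that `async for` needs.
    print(asyncio.run(export_all(total=25, batch_size=10)))
```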

{kiln_ai-0.21.0 → kiln_ai-0.22.1}/README.md RENAMED

@@ -43,6 +43,10 @@ The library has a [comprehensive set of docs](https://kiln-ai.github.io/Kiln/kil
   - [Building and Running a Kiln Task from Code](#building-and-running-a-kiln-task-from-code)
   - [Tagging Task Runs Programmatically](#tagging-task-runs-programmatically)
   - [Adding Custom Model or AI Provider from Code](#adding-custom-model-or-ai-provider-from-code)
+- [Taking Kiln RAG to production](#taking-kiln-rag-to-production)
+  - [Load a LlamaIndex Vector Store](#load-a-llamaindex-vector-store)
+  - [Example: LanceDB Cloud](#example-lancedb-cloud)
+  - [Deploy RAG without LlamaIndex](#deploy-rag-without-llamaindex)
 - [Full API Reference](#full-api-reference)
 ## Installation
@@ -310,6 +314,78 @@ custom_model_ids.append(new_model)
 Config.shared().custom_models = custom_model_ids
 ```
+## Taking Kiln RAG to production
+When you're ready to deploy your RAG system, you can export your processed documents to any vector store supported by LlamaIndex. This allows you to use your Kiln-configured chunking and embedding settings in production.
+### Load a LlamaIndex Vector Store
+Kiln provides a `VectorStoreLoader` that yields your processed document chunks as LlamaIndex `TextNode` objects. These nodes contain the same metadata, chunking and embedding data as your Kiln Search Tool configuration.
+```py
+from kiln_ai.datamodel import Project
+from kiln_ai.datamodel.rag import RagConfig
+from kiln_ai.adapters.vector_store_loaders import VectorStoreLoader
+# Load your project and RAG configuration
+project = Project.load_from_file("path/to/your/project.kiln")
+rag_config = RagConfig.from_id_and_parent_path("rag-config-id", project.path)
+# Create the loader
+loader = VectorStoreLoader(project=project, rag_config=rag_config)
+# Export chunks to any LlamaIndex vector store
+async for batch in loader.iter_llama_index_nodes(batch_size=10):
+    # Insert into your chosen vector store
+    # Examples: LanceDB, Pinecone, Chroma, Qdrant, etc.
+    pass
+```
+**Supported Vector Stores:** LlamaIndex supports 20+ vector stores including LanceDB, Pinecone, Weaviate, Chroma, Qdrant, and more. See the [full list](https://developers.llamaindex.ai/python/framework/module_guides/storing/vector_stores/).
+### Example: LanceDB Cloud
+Kiln uses LanceDB internally, so LanceDB Cloud gives you the same indexing behaviour as the Kiln app.
+Here's a complete example using LanceDB Cloud:
+```py
+from kiln_ai.datamodel import Project
+from kiln_ai.datamodel.rag import RagConfig
+from kiln_ai.datamodel.vector_store import VectorStoreConfig
+from kiln_ai.adapters.vector_store_loaders import VectorStoreLoader
+from kiln_ai.adapters.vector_store.lancedb_adapter import lancedb_construct_from_config
+# Load configurations
+project = Project.load_from_file("path/to/your/project.kiln")
+rag_config = RagConfig.from_id_and_parent_path("rag-config-id", project.path)
+vector_store_config = VectorStoreConfig.from_id_and_parent_path(
+    rag_config.vector_store_config_id, project.path,
+)
+# Create LanceDB vector store
+lancedb_store = lancedb_construct_from_config(
+    vector_store_config=vector_store_config,
+    uri="db://my-project",
+    api_key="sk_...",
+    region="us-east-1",
+    table_name="my-documents",  # Created automatically
+)
+# Export and insert your documents
+loader = VectorStoreLoader(project=project, rag_config=rag_config)
+async for batch in loader.iter_llama_index_nodes(batch_size=100):
+    await lancedb_store.async_add(batch)
+print("Documents successfully exported to LanceDB!")
+```
+After export, query your data using [LlamaIndex](https://developers.llamaindex.ai/python/framework-api-reference/storage/vector_store/lancedb/) or the [LanceDB client](https://lancedb.github.io/lancedb/).
+### Deploy RAG without LlamaIndex
+While Kiln is designed for deploying to LlamaIndex, you don't need to use it. `iter_llama_index_nodes` yields batches of `TextNode` objects, which include all the data you need to build a RAG index in any stack: embedding, text, document name, chunk ID, etc.
 ## Full API Reference
 The library can do a lot more than the examples we've shown here.
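
Note on the "Deploy RAG without LlamaIndex" section added in this hunk: one way to use the nodes in another stack is to flatten each one into a plain record. The following is a minimal sketch, not Kiln's own method. It assumes the standard LlamaIndex `TextNode` attributes `node_id`, `text`, `embedding` and `metadata`; which metadata keys Kiln populates is not shown in this diff, so the `doc` key below is made up. `node_to_record` is a hypothetical helper, and `SimpleNamespace` stands in for a real node so the snippet runs without LlamaIndex installed.

```python
from types import SimpleNamespace
from typing import Any, Dict


def node_to_record(node: Any) -> Dict[str, Any]:
    """Flatten a TextNode-like object into a plain dict (hypothetical helper)."""
    return {
        "id": node.node_id,
        "text": node.text,
        # Copy the vector so the record does not share state with the node.
        "embedding": list(node.embedding) if node.embedding is not None else None,
        "metadata": dict(node.metadata),
    }


# Stand-in for a node yielded by the loader; the field values are invented.
example = SimpleNamespace(
    node_id="chunk-1",
    text="hello",
    embedding=[0.1, 0.2],
    metadata={"doc": "a.pdf"},
)
record = node_to_record(example)
```

Records in this shape can be written to any store that accepts an ID, a text payload, a vector and a metadata mapping.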

{kiln_ai-0.21.0 → kiln_ai-0.22.1}/kiln_ai/adapters/extractors/litellm_extractor.py RENAMED

@@ -1,6 +1,7 @@
 import asyncio
 import hashlib
 import logging
+from functools import cached_property
 from pathlib import Path
 from typing import Any, List
@@ -13,23 +14,16 @@ from kiln_ai.adapters.extractors.base_extractor import (
     ExtractionOutput,
 )
 from kiln_ai.adapters.extractors.encoding import to_base64_url
-from kiln_ai.adapters.ml_model_list import built_in_models_from_provider
+from kiln_ai.adapters.ml_model_list import (
+    KilnModelProvider,
+    built_in_models_from_provider,
+)
 from kiln_ai.adapters.provider_tools import LiteLlmCoreConfig
 from kiln_ai.datamodel.datamodel_enums import ModelProviderName
 from kiln_ai.datamodel.extraction import ExtractorConfig, ExtractorType, Kind
 from kiln_ai.utils.filesystem_cache import FilesystemCache
 from kiln_ai.utils.litellm import get_litellm_provider_info
-from kiln_ai.utils.pdf_utils import split_pdf_into_pages
-def max_pdf_page_concurrency_for_model(model_name: str) -> int:
-    # we assume each batch takes ~5s to complete (likely more in practice)
-    # lowest rate limit is 150 RPM for Tier 1 accounts for gemini-2.5-pro
-    if model_name == "gemini/gemini-2.5-pro":
-        return 2
-    # other models support at least 500 RPM for lowest tier accounts
-    return 5
+from kiln_ai.utils.pdf_utils import convert_pdf_to_images, split_pdf_into_pages
 logger = logging.getLogger(__name__)
@@ -74,11 +68,11 @@ def encode_file_litellm_format(path: Path, mime_type: str) -> dict[str, Any]:
         "text/markdown",
         "text/plain",
     ] or any(mime_type.startswith(m) for m in ["video/", "audio/"]):
-        pdf_bytes = path.read_bytes()
+        file_bytes = path.read_bytes()
         return {
             "type": "file",
             "file": {
-                "file_data": to_base64_url(mime_type, pdf_bytes),
+                "file_data": to_base64_url(mime_type, file_bytes),
             },
         }
@@ -101,6 +95,7 @@ class LitellmExtractor(BaseExtractor):
         extractor_config: ExtractorConfig,
         litellm_core_config: LiteLlmCoreConfig,
         filesystem_cache: FilesystemCache | None = None,
+        default_max_parallel_requests: int = 5,
     ):
         if extractor_config.extractor_type != ExtractorType.LITELLM:
             raise ValueError(
@@ -133,6 +128,7 @@ class LitellmExtractor(BaseExtractor):
         }
         self.litellm_core_config = litellm_core_config
+        self.default_max_parallel_requests = default_max_parallel_requests
     def pdf_page_cache_key(self, pdf_path: Path, page_number: int) -> str:
         """
@@ -171,13 +167,35 @@ class LitellmExtractor(BaseExtractor):
         logger.debug(f"Cache miss for page {page_number} of {pdf_path}")
         return None
+    async def convert_pdf_page_to_image_input(
+        self, page_path: Path, page_number: int
+    ) -> ExtractionInput:
+        image_paths = await convert_pdf_to_images(page_path, page_path.parent)
+        if len(image_paths) != 1:
+            raise ValueError(
+                f"Expected 1 image, got {len(image_paths)} for page {page_number} in {page_path}"
+            )
+        image_path = image_paths[0]
+        page_input = ExtractionInput(path=str(image_path), mime_type="image/png")
+        return page_input
     async def _extract_single_pdf_page(
-        self, pdf_path: Path, page_path: Path, prompt: str, page_number: int
+        self,
+        pdf_path: Path,
+        page_path: Path,
+        prompt: str,
+        page_number: int,
     ) -> str:
         try:
-            page_input = ExtractionInput(
-                path=str(page_path), mime_type="application/pdf"
-            )
+            if self.model_provider.multimodal_requires_pdf_as_image:
+                page_input = await self.convert_pdf_page_to_image_input(
+                    page_path, page_number
+                )
+            else:
+                page_input = ExtractionInput(
+                    path=str(page_path), mime_type="application/pdf"
+                )
             completion_kwargs = self._build_completion_kwargs(prompt, page_input)
             response = await litellm.acompletion(**completion_kwargs)
         except Exception as e:
@@ -201,11 +219,6 @@ class LitellmExtractor(BaseExtractor):
             )
         content = response.choices[0].message.content
-        if not content:
-            raise ValueError(
-                f"No text returned from extraction model when extracting page {page_number} for {page_path}"
-            )
         if self.filesystem_cache is not None:
             # we don't want to fail the whole extraction just because cache write fails
             # as that would block the whole flow
@@ -242,13 +255,14 @@ class LitellmExtractor(BaseExtractor):
                     continue
                 extract_page_jobs.append(
-                    self._extract_single_pdf_page(pdf_path, page_path, prompt, i)
+                    self._extract_single_pdf_page(
+                        pdf_path, page_path, prompt, page_number=i
+                    )
                 )
                 page_indices_for_jobs.append(i)
                 if (
-                    len(extract_page_jobs)
-                    >= max_pdf_page_concurrency_for_model(self.litellm_model_slug())
+                    len(extract_page_jobs) >= self.max_parallel_requests_for_model
                     or i == len(page_paths) - 1
                 ):
                     extraction_results = await asyncio.gather(
@@ -295,7 +309,7 @@ class LitellmExtractor(BaseExtractor):
         self, prompt: str, extraction_input: ExtractionInput
     ) -> dict[str, Any]:
         completion_kwargs = {
-            "model": self.litellm_model_slug(),
+            "model": self.litellm_model_slug,
             "messages": [
                 {
                     "role": "user",
@@ -367,20 +381,26 @@ class LitellmExtractor(BaseExtractor):
             content_format=self.extractor_config.output_format,
         )
-    def litellm_model_slug(self) -> str:
+    @cached_property
+    def model_provider(self) -> KilnModelProvider:
         kiln_model_provider = built_in_models_from_provider(
             ModelProviderName(self.extractor_config.model_provider_name),
             self.extractor_config.model_name,
         )
         if kiln_model_provider is None:
             raise ValueError(
                 f"Model provider {self.extractor_config.model_provider_name} not found in the list of built-in models"
             )
+        return kiln_model_provider
+    @cached_property
+    def max_parallel_requests_for_model(self) -> int:
+        value = self.model_provider.max_parallel_requests
+        return value if value is not None else self.default_max_parallel_requests
-        # need to translate into LiteLLM model slug
+    @cached_property
+    def litellm_model_slug(self) -> str:
         litellm_provider_name = get_litellm_provider_info(
-            kiln_model_provider,
+            self.model_provider,
         )
         return litellm_provider_name.litellm_model_id
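
Note on this file's changes: the hard-coded `max_pdf_page_concurrency_for_model` helper is replaced by a per-provider `max_parallel_requests` value with a constructor-level default, and `litellm_model_slug` becomes a `cached_property`, so callers read it without parentheses. The pattern can be sketched in isolation as follows. This is an illustrative reduction under stated assumptions, not the extractor's code: `BatchRunner`, `provider_limit`, `default_limit` and `demo` are hypothetical names, and the sketch assumes a positive limit.

```python
import asyncio
from functools import cached_property
from typing import Awaitable, Callable, List, Optional


class BatchRunner:
    """Hypothetical reduction of the 'provider limit or default' pattern."""

    def __init__(self, provider_limit: Optional[int], default_limit: int = 5):
        self.provider_limit = provider_limit
        self.default_limit = default_limit

    @cached_property
    def max_parallel(self) -> int:
        # Fall back to the default only when the provider sets no limit.
        if self.provider_limit is not None:
            return self.provider_limit
        return self.default_limit

    async def run(self, jobs: List[Callable[[], Awaitable[str]]]) -> List[str]:
        results: List[str] = []
        # Run at most max_parallel jobs at once; gather keeps input order.
        for start in range(0, len(jobs), self.max_parallel):
            batch = jobs[start : start + self.max_parallel]
            results.extend(await asyncio.gather(*(job() for job in batch)))
        return results


async def demo() -> List[str]:
    async def page(i: int) -> str:
        await asyncio.sleep(0)  # placeholder for a model call
        return f"page-{i}"

    jobs = [lambda i=i: page(i) for i in range(7)]
    return await BatchRunner(provider_limit=None, default_limit=3).run(jobs)
```

Because `max_parallel` is a `cached_property`, it is computed once per instance and then accessed as an attribute, which mirrors the switch from `self.litellm_model_slug()` to `self.litellm_model_slug` in the hunk above.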
