PyPI - natural-pdf - Versions diffs - 0.2.1.dev0__tar.gz → 0.2.3__tar.gz - Mend

natural-pdf 0.2.1.dev0__tar.gz → 0.2.3__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (277)

{natural_pdf-0.2.1.dev0 → natural_pdf-0.2.3}/.gitignore RENAMED

@@ -292,4 +292,4 @@ build/
 # Ignore evaluation results generated by bad_pdf_eval suite
 eval_results/
-bad_pdf_analysis
+bad_pdf_analysis

natural_pdf-0.2.3/CLAUDE.md ADDED

@@ -0,0 +1,85 @@
+# Natural PDF Library Analysis
+## Library Overview
+Natural PDF is a Python library for intelligent PDF document processing that combines traditional PDF parsing with modern AI capabilities. It provides a jQuery-like API for selecting and manipulating PDF elements with spatial awareness.
+## Core Goals & Purpose
+- **Intelligent PDF Processing**: Goes beyond simple text extraction to understand document structure and spatial relationships
+- **AI-Enhanced Workflows**: Integrates OCR, document Q&A, classification, and LLM-based data extraction
+- **Spatial Navigation**: Provides methods like `.below()`, `.above()`, `.left()` for intuitive element selection
+- **Multi-format Support**: Handles both text-based PDFs and image-based (OCR-required) documents
+## Key Use Cases & Workflows
+### 1. Basic Text and Table Extraction
+- Load PDFs from local files or URLs
+- Extract text with layout preservation
+- Find and extract tables automatically
+- Use spatial selectors: `page.find('text:contains(Violations)').below()`
+### 2. OCR Integration
+- Multiple OCR engines supported: EasyOCR (default), Surya, PaddleOCR, DocTR
+- Configurable resolution and detection modes
+- OCR correction using LLMs
+- Human-in-the-loop correction workflows with exportable packages
+### 3. AI-Powered Data Extraction
+- **Document Q&A**: Extractive question answering with confidence scores
+- **Structured Data**: Extract specific fields with schema validation using Pydantic
+- **LLM Integration**: OpenAI/Gemini compatible for advanced extraction
+- **Classification**: Document/page categorization using text or vision models
+### 4. Advanced Document Processing
+- **Multi-column/Page Flows**: Reflow content across columns or pages for proper reading order
+- **Layout Analysis**: YOLO, TATR for automatic document structure detection
+- **Visual Element Detection**: Checkbox classification, form field extraction
+- **Table Structure Detection**: Manual line detection for complex tables
+### 5. Visualization and Display
+- **Page Limit for show()**: By default, `pdf.show()` displays only the first 30 pages to prevent overwhelming displays
+  - Use `pdf.show(limit=10)` to show fewer pages
+  - Use `pdf.show(limit=None)` to display all pages
+  - Works with all layout options: `pdf.show(limit=20, layout='grid', columns=4)`
+- **Exclusion Zone Visualization**: Use `exclusions='red'` parameter to visualize exclusion zones
+  - `page.show(exclusions='red')` highlights exclusions in red
+  - `page.show(exclusions='blue')` highlights exclusions in blue
+  - `page.show(exclusions=True)` uses default red color
+  - Works at PDF level too: `pdf.show(exclusions='green')`
+### 6. Directional Navigation Improvements
+- **Smart defaults for spatial methods**:
+  - `.left()` and `.right()` now default to `height='element'` (matches element height)
+  - `.above()` and `.below()` continue to default to `width='full'` (full page width)
+  - This matches common use cases: looking sideways usually wants same height, looking up/down wants full width
+- **Enhanced discoverability**:
+  - Docstrings include examples showing different height/width options
+  - Clear parameter names ('height' for left/right, 'width' for above/below)
+### 6a. Enhanced Exclusion Support
+- **ElementCollection support in callable exclusions**: `pdf.add_exclusion(lambda page: page.find_all('text:contains("Header")'))` now works
+- **List/iterable support**: Callable exclusions can return lists or other iterables of elements
+- **Automatic conversion**: Elements from iterables are automatically converted to exclusion regions
+- **Backward compatibility**: Existing Region and callable exclusions continue to work unchanged
+### 7. Page Grouping with groupby()
+- **Simple grouping by selector text**: `pages.groupby('text[size=16]')` groups by header text
+- **Callable functions for complex logic**: `pages.groupby(lambda p: p.find('text:contains("CITY")').extract_text())`
+- **Pandas-style iteration**: `for title, pages in grouped:` (no `.items()` needed)
+- **Dict-like access**: `grouped.get('CITY OF MADISON')` or `grouped.get_group('key')`
+- **Index-based access**: `grouped[0]` (first group), `grouped[-1]` (last group), `grouped['key']` (by name)
+- **Group exploration**: `grouped.info()` shows all groups with indexes and page counts
+- **Batch operations**: `grouped.apply(lambda pages: len(pages.find_all('table')))`
+- **Visual inspection**: `grouped.show(limit=2)` shows first 2 pages of each group
+- **Progress bar support**: Automatic progress bars for large collections, disable with `show_progress=False`
+- **None handling**: Pages with no matching elements group under `None` key
+## Development Best Practices
+### File and Resource Management
+- When making temp files, put them in temp/
+- When creating test files, put them in tests/
+- Most fixes and changes need a test, and should be done with test-driven development
+### Environment and Tooling
+- Always use the virtual environment in .venv
+- Use uv when possible for efficient package management
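
To make the `groupby()` semantics listed in section 7 of the added CLAUDE.md concrete, here is a minimal pure-Python sketch. It is not natural-pdf's implementation: `GroupedPages`, `group_pages`, and the dictionaries standing in for pages are hypothetical names chosen for illustration. It only shows grouping by a key function, the `None` key for pages without a match, tuple-style iteration, and positional versus keyed access.

```python
from typing import Any, Callable, Dict, Hashable, Iterator, List, Optional, Tuple


class GroupedPages:
    """Hypothetical stand-in for the object returned by pages.groupby()."""

    def __init__(self, groups: Dict[Optional[Hashable], List[Any]]):
        # dict preserves insertion order, so the first-seen key is group 0
        self._groups = groups

    def __iter__(self) -> Iterator[Tuple[Optional[Hashable], List[Any]]]:
        # Pandas-style iteration: yields (key, pages) pairs directly
        return iter(self._groups.items())

    def __len__(self) -> int:
        return len(self._groups)

    def __getitem__(self, key: Any) -> List[Any]:
        # Integers select a group by position; anything else is a key lookup
        if isinstance(key, int):
            return list(self._groups.values())[key]
        return self._groups[key]

    def get(self, key: Any, default: Any = None) -> Any:
        return self._groups.get(key, default)


def group_pages(pages: List[Any], key_func: Callable[[Any], Optional[Hashable]]) -> GroupedPages:
    """Group pages by key_func(page); a None result becomes the None group."""
    groups: Dict[Optional[Hashable], List[Any]] = {}
    for page in pages:
        groups.setdefault(key_func(page), []).append(page)
    return GroupedPages(groups)


# Plain dicts stand in for pages; "title" plays the role of the matched header text
pages = [
    {"number": 1, "title": "CITY OF MADISON"},
    {"number": 2, "title": None},
    {"number": 3, "title": "CITY OF MADISON"},
]
grouped = group_pages(pages, lambda p: p["title"])

for title, members in grouped:
    print(title, [p["number"] for p in members])
```

One design consequence of mixing positional and keyed access in `__getitem__` is that integer group keys would be ambiguous; this sketch simply treats every integer as a position.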

{natural_pdf-0.2.1.dev0/natural_pdf.egg-info → natural_pdf-0.2.3}/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: natural-pdf
-Version: 0.2.1.dev0
+Version: 0.2.3
 Summary: A more intuitive interface for working with PDFs
 Author-email: Jonathan Soma <jonathan.soma@gmail.com>
 License-Expression: MIT
@@ -14,7 +14,7 @@ License-File: LICENSE
 Requires-Dist: scikit-learn
 Requires-Dist: markdown
 Requires-Dist: pandas
-Requires-Dist: pdfplumber
+Requires-Dist: pdfplumber>=0.11.7
 Requires-Dist: colormath2
 Requires-Dist: pillow
 Requires-Dist: colour

{natural_pdf-0.2.1.dev0 → natural_pdf-0.2.3}/docs/layout-analysis/index.md RENAMED

@@ -105,7 +105,7 @@ page.find_all('region[model=tatr]').show(group_by='region_type', width=700)
 # page.analyze_layout(engine="docling")
 # page.find_all('region[model=docling]').show(group_by='region_type')
-# page.to_image(width=700)
+# page.render(width=700)
 ```
 ```python

{natural_pdf-0.2.1.dev0 → natural_pdf-0.2.3}/docs/quick-reference/index.md RENAMED

@@ -156,11 +156,25 @@ elements.show(color="red")                      # Single collection
 elements.show(color="blue", label="Headers")    # With label
 elements.show(group_by='type')                  # Color by type
-# Multiple collections together
+# Quick highlighting (one-liner)
+page.highlight(elements1, elements2, elements3)  # Multiple elements
+page.highlight(                                  # With custom colors
+    (elements1, 'red'),
+    (elements2, 'blue'),
+    (elements3, 'green')
+)
+# Multiple collections with context manager
 with page.highlights() as h:
     h.add(elements1, color="red", label="Type 1")
     h.add(elements2, color="blue", label="Type 2")
     h.show()
+# Auto-display in Jupyter/Colab
+with page.highlights(show=True) as h:
+    h.add(elements1, label="Headers")
+    h.add(elements2, label="Content")
+    # Displays automatically when exiting context
 ```
 ### Viewing

{natural_pdf-0.2.1.dev0 → natural_pdf-0.2.3}/docs/visual-debugging/index.md RENAMED

@@ -83,6 +83,47 @@ with page.highlights() as h:
     h.show()
 ```
+### Jupyter/Colab Support
+In Jupyter notebooks and Google Colab, you can use `show=True` to automatically display the highlights when exiting the context:
+```python
+# Automatically displays the image in Jupyter/Colab
+with page.highlights(show=True) as h:
+    h.add(summary_elements, label='Summary')
+    h.add(date_elements, label='Date')
+    h.add(line_elements, label='Lines')
+    # No need to call h.show() - displays automatically!
+```
+### Quick Highlighting with `.highlight()`
+For simple highlighting tasks, use the `.highlight()` convenience method:
+```python
+# Highlight multiple elements in one line
+page.highlight(summary_elements, date_elements, line_elements)
+# With custom colors
+page.highlight(
+    (summary_elements, 'red'),
+    (date_elements, 'blue'),
+    (line_elements, 'green')
+)
+# With colors and labels
+page.highlight(
+    (summary_elements, 'red', 'Summary Text'),
+    (date_elements, 'blue', 'Date Fields'),
+    (line_elements, 'green', 'Separator Lines')
+)
+# Pass additional parameters like width or resolution
+page.highlight(summary_elements, date_elements, width=800, labels=True)
+```
+This method is particularly useful in Jupyter/Colab environments where the image displays automatically as the cell output.
 ## Customizing Multiple Highlights
 Customize the appearance of multiple highlights using the context manager:
@@ -133,7 +174,7 @@ content = title.below(height=200)
 content.show()
 ```
-Or look at just the region by itself
+Or look at just the region by itself:
 ```python
 # Find a title and create a region below it
@@ -144,6 +185,27 @@ content = title.below(height=200)
 content.show(crop=True)
 ```
+### Highlighting Multiple Regions
+The `.highlight()` method works with regions too:
+```python
+# Create multiple regions
+left = page.region(left=0, right=page.width/3, top=0, bottom=page.height)
+mid = page.region(left=page.width/3, right=page.width/3*2, top=0, bottom=page.height)
+right = page.region(left=page.width/3*2, right=page.width, top=0, bottom=page.height)
+# Highlight all three regions
+page.highlight(left, mid, right)
+# Or with custom colors
+page.highlight(
+    (left, 'red', 'Left Column'),
+    (mid, 'green', 'Middle Column'),
+    (right, 'blue', 'Right Column')
+)
+```
 ## Working with Text Styles
 Visualize text styles to understand the document structure:
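
The documentation added in this file shows `.highlight()` accepting bare targets, `(target, color)` pairs, and `(target, color, label)` triples. The diff does not show how the method parses them, so the following is only a sketch of one way such arguments could be normalized. `normalize_highlight_args` is a hypothetical helper, not part of natural-pdf, and strings stand in for element collections and regions.

```python
from typing import Any, List, Optional, Tuple

HighlightSpec = Tuple[Any, Optional[str], Optional[str]]


def normalize_highlight_args(*items: Any) -> List[HighlightSpec]:
    """Turn mixed highlight arguments into uniform (target, color, label) triples."""
    normalized: List[HighlightSpec] = []
    for item in items:
        if isinstance(item, tuple):
            if not 1 <= len(item) <= 3:
                raise ValueError(f"Expected 1 to 3 values, got {len(item)}")
            # Pad missing color/label with None, then unpack exactly three values
            target, color, label = (item + (None, None))[:3]
        else:
            # A bare target: color and label are left for the caller to default
            target, color, label = item, None, None
        normalized.append((target, color, label))
    return normalized


specs = normalize_highlight_args(
    "summary_elements",
    ("date_elements", "blue"),
    ("line_elements", "green", "Separator Lines"),
)
for target, color, label in specs:
    print(target, color, label)
```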

{natural_pdf-0.2.1.dev0 → natural_pdf-0.2.3}/natural_pdf/analyzers/guides.py RENAMED

@@ -3,7 +3,7 @@
 import json
 import logging
 from collections import UserList
-from typing import TYPE_CHECKING, Any, Dict, List, Literal, Optional, Tuple, Union
+from typing import TYPE_CHECKING, Any, Callable, Dict, List, Literal, Optional, Tuple, Union
 import numpy as np
 from PIL import Image, ImageDraw
@@ -16,6 +16,7 @@ if TYPE_CHECKING:
     from natural_pdf.elements.element_collection import ElementCollection
     from natural_pdf.elements.region import Region
     from natural_pdf.flows.region import FlowRegion
+    from natural_pdf.tables.result import TableResult
 logger = logging.getLogger(__name__)
@@ -131,6 +132,15 @@ class GuidesList(UserList):
         self._parent = parent_guides
         self._axis = axis
+    def __getitem__(self, i):
+        """Override to handle slicing properly."""
+        if isinstance(i, slice):
+            # Return a new GuidesList with the sliced data
+            return self.__class__(self._parent, self._axis, self.data[i])
+        else:
+            # For single index, return the value directly
+            return self.data[i]
     def from_content(
         self,
         markers: Union[str, List[str], "ElementCollection", None],
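
The `__getitem__` override in the hunk above is likely needed because, in current CPython versions, `collections.UserList` builds a slice result by calling `self.__class__(self.data[i])`. That call passes only the sliced data, which does not fit a subclass whose constructor takes extra leading arguments. The sketch below reproduces the pattern with a hypothetical `AxisList` class; it is not the library's `GuidesList`, and the `parent` and `axis` values are placeholders.

```python
from collections import UserList
from typing import Any, List, Optional


class AxisList(UserList):
    """Hypothetical coordinate list that remembers its owner and axis."""

    def __init__(self, parent: Any, axis: str, data: Optional[List[float]] = None):
        super().__init__(data or [])
        self._parent = parent
        self._axis = axis

    def __getitem__(self, i):
        if isinstance(i, slice):
            # Rebuild with every constructor argument so the slice keeps its metadata
            return self.__class__(self._parent, self._axis, self.data[i])
        # A single index returns the stored value itself
        return self.data[i]


coords = AxisList("guides", "vertical", [10.0, 20.0, 30.0])
tail = coords[1:]
print(type(tail).__name__, tail.data, tail._axis)
print(coords[0])
```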
@@ -140,6 +150,7 @@ class GuidesList(UserList):
         tolerance: float = 5,
         *,
         append: bool = False,
+        apply_exclusions: bool = True,
     ) -> "Guides":
         """
         Create guides from content markers and add to this axis.
@@ -154,6 +165,7 @@ class GuidesList(UserList):
             align: How to align guides relative to found elements
             outer: Whether to add outer boundary guides
             tolerance: Tolerance for snapping to element edges
+            apply_exclusions: Whether to apply exclusion zones when searching for text
         Returns:
             Parent Guides object for chaining
@@ -178,6 +190,7 @@ class GuidesList(UserList):
                     align=align,
                     outer=outer,
                     tolerance=tolerance,
+                    apply_exclusions=apply_exclusions,
                 )
                 # Collect guides from this region
@@ -260,6 +273,7 @@ class GuidesList(UserList):
             align=align,
             outer=outer,
             tolerance=tolerance,
+            apply_exclusions=apply_exclusions,
         )
         # Replace or append based on parameter
@@ -1398,6 +1412,7 @@ class Guides:
         align: Literal["left", "right", "center", "between"] = "left",
         outer: bool = True,
         tolerance: float = 5,
+        apply_exclusions: bool = True,
     ) -> "Guides":
         """
         Create guides based on text content positions.
@@ -1413,6 +1428,7 @@ class Guides:
             align: Where to place guides relative to found text
             outer: Whether to add guides at the boundaries
             tolerance: Maximum distance to search for text
+            apply_exclusions: Whether to apply exclusion zones when searching for text
         Returns:
             New Guides object aligned to text content
@@ -1431,6 +1447,7 @@ class Guides:
                     align=align,
                     outer=outer,
                     tolerance=tolerance,
+                    apply_exclusions=apply_exclusions,
                 )
                 # Store in flow guides
@@ -1469,7 +1486,7 @@ class Guides:
         # Find each marker and determine guide position
         for marker in marker_texts:
             if hasattr(obj, "find"):
-                element = obj.find(f'text:contains("{marker}")')
+                element = obj.find(f'text:contains("{marker}")', apply_exclusions=apply_exclusions)
                 if element:
                     if axis == "vertical":
                         if align == "left":
@@ -1498,7 +1515,9 @@ class Guides:
             marker_bounds = []
             for marker in marker_texts:
                 if hasattr(obj, "find"):
-                    element = obj.find(f'text:contains("{marker}")')
+                    element = obj.find(
+                        f'text:contains("{marker}")', apply_exclusions=apply_exclusions
+                    )
                     if element:
                         if axis == "vertical":
                             marker_bounds.append((element.x0, element.x1))
@@ -3285,6 +3304,7 @@ class Guides:
         align: Literal["left", "right", "center", "between"] = "left",
         outer: bool = True,
         tolerance: float = 5,
+        apply_exclusions: bool = True,
     ) -> "Guides":
         """
         Instance method: Add guides from content, allowing chaining.
@@ -3301,6 +3321,7 @@ class Guides:
             align: How to align guides relative to found elements
             outer: Whether to add outer boundary guides
             tolerance: Tolerance for snapping to element edges
+            apply_exclusions: Whether to apply exclusion zones when searching for text
         Returns:
             Self for method chaining
@@ -3318,6 +3339,7 @@ class Guides:
             align=align,
             outer=outer,
             tolerance=tolerance,
+            apply_exclusions=apply_exclusions,
         )
         # Add the appropriate coordinates to this object
@@ -3421,6 +3443,140 @@ class Guides:
         return self
+    def extract_table(
+        self,
+        target: Optional[Union["Page", "Region"]] = None,
+        source: str = "guides_temp",
+        cell_padding: float = 0.5,
+        include_outer_boundaries: bool = False,
+        method: Optional[str] = None,
+        table_settings: Optional[dict] = None,
+        use_ocr: bool = False,
+        ocr_config: Optional[dict] = None,
+        text_options: Optional[Dict] = None,
+        cell_extraction_func: Optional[Callable[["Region"], Optional[str]]] = None,
+        show_progress: bool = False,
+        content_filter: Optional[Union[str, Callable[[str], bool], List[str]]] = None,
+        *,
+        multi_page: Literal["auto", True, False] = "auto",
+    ) -> "TableResult":
+        """
+        Extract table data directly from guides without leaving temporary regions.
+        This method:
+        1. Creates table structure using build_grid()
+        2. Extracts table data from the created table region
+        3. Cleans up all temporary regions
+        4. Returns the TableResult
+        Args:
+            target: Page or Region to create regions on (uses self.context if None)
+            source: Source label for temporary regions (will be cleaned up)
+            cell_padding: Internal padding for cell regions in points
+            include_outer_boundaries: Whether to add boundaries at edges if missing
+            method: Table extraction method ('tatr', 'pdfplumber', 'text', etc.)
+            table_settings: Settings for pdfplumber table extraction
+            use_ocr: Whether to use OCR for text extraction
+            ocr_config: OCR configuration parameters
+            text_options: Dictionary of options for the 'text' method
+            cell_extraction_func: Optional callable for custom cell text extraction
+            show_progress: Controls progress bar for text method
+            content_filter: Content filtering function or patterns
+            multi_page: Controls multi-region table creation for FlowRegions
+        Returns:
+            TableResult: Extracted table data
+        Raises:
+            ValueError: If no table region is created from the guides
+        Example:
+            ```python
+            from natural_pdf.analyzers import Guides
+            # Create guides from detected lines
+            guides = Guides.from_lines(page, source_label="detected")
+            # Extract table directly - no temporary regions left behind
+            table_data = guides.extract_table()
+            # Convert to pandas DataFrame
+            df = table_data.to_df()
+            ```
+        """
+        target_obj = target or self.context
+        if not target_obj:
+            raise ValueError("No target object available. Provide target parameter or context.")
+        # Get the page for cleanup later
+        if hasattr(target_obj, "x0") and hasattr(target_obj, "top"):  # Region
+            page = target_obj._page
+            element_manager = page._element_mgr
+        elif hasattr(target_obj, "_element_mgr"):  # Page
+            page = target_obj
+            element_manager = page._element_mgr
+        else:
+            raise ValueError(f"Target object {target_obj} is not a Page or Region")
+        try:
+            # Step 1: Build grid structure (creates temporary regions)
+            grid_result = self.build_grid(
+                target=target_obj,
+                source=source,
+                cell_padding=cell_padding,
+                include_outer_boundaries=include_outer_boundaries,
+                multi_page=multi_page,
+            )
+            # Step 2: Get the table region and extract table data
+            table_region = grid_result["regions"]["table"]
+            if table_region is None:
+                raise ValueError(
+                    "No table region was created from the guides. Check that you have both vertical and horizontal guides."
+                )
+            # Handle multi-page case where table_region might be a list
+            if isinstance(table_region, list):
+                if not table_region:
+                    raise ValueError("No table regions were created from the guides.")
+                # Use the first table region for extraction
+                table_region = table_region[0]
+            # Step 3: Extract table data using the region's extract_table method
+            table_result = table_region.extract_table(
+                method=method,
+                table_settings=table_settings,
+                use_ocr=use_ocr,
+                ocr_config=ocr_config,
+                text_options=text_options,
+                cell_extraction_func=cell_extraction_func,
+                show_progress=show_progress,
+                content_filter=content_filter,
+            )
+            return table_result
+        finally:
+            # Step 4: Clean up all temporary regions created by build_grid
+            # This ensures no regions are left behind regardless of success/failure
+            try:
+                regions_to_remove = [
+                    r
+                    for r in element_manager.regions
+                    if getattr(r, "source", None) == source
+                    and getattr(r, "region_type", None)
+                    in {"table", "table_row", "table_column", "table_cell"}
+                ]
+                for region in regions_to_remove:
+                    element_manager.remove_element(region, element_type="regions")
+                if regions_to_remove:
+                    logger.debug(f"Cleaned up {len(regions_to_remove)} temporary regions")
+            except Exception as cleanup_err:
+                logger.warning(f"Failed to clean up temporary regions: {cleanup_err}")
     def _get_flow_orientation(self) -> Literal["vertical", "horizontal", "unknown"]:
         """Determines if a FlowRegion's constituent parts are arranged vertically or horizontally."""
         if not self.is_flow_region or len(self.context.constituent_regions) < 2:
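
The cleanup strategy in the new `extract_table()` method (create temporary regions, extract, then remove them in a `finally` block) can be illustrated without natural-pdf. In the sketch below, `Registry`, `TempRegion`, and `extract_with_cleanup` are hypothetical stand-ins; only the control flow mirrors the diff, including the filter on both `source` and `region_type`.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, List

TABLE_TYPES = {"table", "table_row", "table_column", "table_cell"}


@dataclass
class TempRegion:
    source: str
    region_type: str


@dataclass
class Registry:
    """Hypothetical stand-in for a page's element manager."""
    regions: List[TempRegion] = field(default_factory=list)


def extract_with_cleanup(
    registry: Registry,
    extract: Callable[[Registry], Any],
    source: str = "guides_temp",
) -> Any:
    try:
        # Step 1: create temporary regions tagged with `source`
        registry.regions.append(TempRegion(source, "table"))
        registry.regions.append(TempRegion(source, "table_cell"))
        # Step 2: run the extraction, which may raise
        return extract(registry)
    finally:
        # Step 3: runs on success and on failure, so nothing temporary survives
        registry.regions = [
            r
            for r in registry.regions
            if not (r.source == source and r.region_type in TABLE_TYPES)
        ]


registry = Registry([TempRegion("manual", "table")])
count_during = extract_with_cleanup(registry, lambda reg: len(reg.regions))
print(count_during, [r.source for r in registry.regions])
```

Because the filter matches on `source` as well as `region_type`, the pre-existing `"manual"` table region is left in place.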

{natural_pdf-0.2.1.dev0 → natural_pdf-0.2.3}/natural_pdf/collections/mixins.py RENAMED

@@ -29,9 +29,22 @@ class DirectionalCollectionMixin:
         """Find regions to the right of all elements in this collection."""
         return self.apply(lambda element: element.right(**kwargs))
-    def expand(self, **kwargs) -> "ElementCollection":
-        """Expand all elements in this collection."""
-        return self.apply(lambda element: element.expand(**kwargs))
+    def expand(self, *args, **kwargs) -> "ElementCollection":
+        """Expand all elements in this collection.
+        Args:
+            *args: If a single positional argument is provided, expands all elements
+                   by that amount in all directions.
+            **kwargs: Keyword arguments for directional expansion (left, right, top, bottom, etc.)
+        Examples:
+            # Expand all elements by 5 pixels in all directions
+            collection.expand(5)
+            # Expand with different amounts in each direction
+            collection.expand(left=10, right=5, top=3, bottom=7)
+        """
+        return self.apply(lambda element: element.expand(*args, **kwargs))
 class ApplyMixin:
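
The change to `expand()` forwards positional arguments as well as keyword arguments, so `collection.expand(5)` reaches each element instead of raising a `TypeError`. The sketch below shows the forwarding pattern with hypothetical `Box` and `BoxCollection` classes. The signature of `Box.expand` is an assumption made for illustration, since the diff does not show the real element-level `expand()`.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class Box:
    x0: float
    top: float
    x1: float
    bottom: float

    def expand(
        self,
        amount: float = 0,
        *,
        left: float = 0,
        right: float = 0,
        top: float = 0,
        bottom: float = 0,
    ) -> "Box":
        # `amount` applies to every side; the keywords add per-side offsets
        return Box(
            self.x0 - amount - left,
            self.top - amount - top,
            self.x1 + amount + right,
            self.bottom + amount + bottom,
        )


class BoxCollection:
    def __init__(self, boxes: Iterable[Box]):
        self.boxes: List[Box] = list(boxes)

    def apply(self, func: Callable[[Box], Box]) -> "BoxCollection":
        return BoxCollection(func(box) for box in self.boxes)

    def expand(self, *args, **kwargs) -> "BoxCollection":
        # Forward both positional and keyword arguments to each element
        return self.apply(lambda box: box.expand(*args, **kwargs))


collection = BoxCollection([Box(10, 10, 20, 20), Box(30, 30, 50, 40)])
uniform = collection.expand(5)
directional = collection.expand(left=10, right=5, top=3, bottom=7)
print(uniform.boxes[0], directional.boxes[0])
```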

{natural_pdf-0.2.1.dev0 → natural_pdf-0.2.3}/natural_pdf/core/highlighting_service.py RENAMED

@@ -335,6 +335,7 @@ class HighlightContext:
         self.show_on_exit = show_on_exit
         self.highlight_groups = []
         self._color_manager = ColorManager()
+        self._exit_image = None  # Store image for Jupyter display
     def add(
         self,
@@ -421,6 +422,11 @@ class HighlightContext:
             )
             return None
+    @property
+    def image(self) -> Optional[Image.Image]:
+        """Get the last generated image (useful after context exit)."""
+        return self._exit_image
     def __enter__(self) -> "HighlightContext":
         """Enter the context."""
         return self
@@ -428,7 +434,25 @@ class HighlightContext:
     def __exit__(self, exc_type, exc_val, exc_tb):
         """Exit the context, optionally showing highlights."""
         if self.show_on_exit and not exc_type:
-            self.show()
+            self._exit_image = self.show()
+            # Check if we're in a Jupyter/IPython environment
+            try:
+                # Try to get IPython instance
+                from IPython import get_ipython
+                ipython = get_ipython()
+                if ipython is not None:
+                    # We're in IPython/Jupyter
+                    from IPython.display import display
+                    if self._exit_image is not None:
+                        display(self._exit_image)
+            except (ImportError, NameError):
+                # Not in Jupyter or IPython not available - that's OK
+                pass
+        # __exit__ must return False to not suppress exceptions
         return False
@@ -689,7 +713,7 @@ class HighlightingService:
         logger.debug(f"Added highlight to page {page_index}: {highlight}")
         # --- Invalidate page-level image cache --------------------------------
-        # The Page.to_image method maintains an internal cache keyed by rendering
+        # The Page.render method maintains an internal cache keyed by rendering
         # parameters.  Because the cache key currently does **not** incorporate
         # any information about the highlights themselves, it can return stale
         # images after highlights are added or removed.  To ensure the next
@@ -700,11 +724,11 @@ class HighlightingService:
             if hasattr(page_obj, "_to_image_cache"):
                 page_obj._to_image_cache.clear()
                 logger.debug(
-                    f"Cleared cached to_image renders for page {page_index} after adding a highlight."
+                    f"Cleared cached render images for page {page_index} after adding a highlight."
                 )
         except Exception as cache_err:  # pragma: no cover – never fail highlight creation
             logger.warning(
-                f"Failed to invalidate to_image cache for page {page_index}: {cache_err}",
+                f"Failed to invalidate render cache for page {page_index}: {cache_err}",
                 exc_info=True,
             )
@@ -737,11 +761,11 @@ class HighlightingService:
             if hasattr(page_obj, "_to_image_cache"):
                 page_obj._to_image_cache.clear()
                 logger.debug(
-                    f"Cleared cached to_image renders for page {page_index} after removing highlights."
+                    f"Cleared cached render images for page {page_index} after removing highlights."
                 )
         except Exception as cache_err:  # pragma: no cover
             logger.warning(
-                f"Failed to invalidate to_image cache for page {page_index}: {cache_err}",
+                f"Failed to invalidate render cache for page {page_index}: {cache_err}",
                 exc_info=True,
             )
@@ -760,7 +784,7 @@ class HighlightingService:
         labels: bool = True,
         legend_position: str = "right",
         render_ocr: bool = False,
-        **kwargs,  # Pass other args to pdfplumber.page.to_image if needed
+        **kwargs,  # Pass other args to pdfplumber.page.to_image if needed (internal API)
     ) -> Optional[Image.Image]:
         """
         Renders a specific page with its highlights.
@@ -773,7 +797,7 @@ class HighlightingService:
             labels: Whether to include a legend for highlights.
             legend_position: Position of the legend.
             render_ocr: Whether to render OCR text on the image.
-            kwargs: Additional keyword arguments for pdfplumber's page.to_image (e.g., width, height).
+            kwargs: Additional keyword arguments for pdfplumber's internal page.to_image (e.g., width, height).
         Returns:
             A PIL Image object of the rendered page, or None if rendering fails.
@@ -957,7 +981,7 @@ class HighlightingService:
             crop_bbox: Optional bounding box (x0, top, x1, bottom) in PDF coordinate
                 space to crop the output image to, before legends or other overlays are
                 applied. If None, no cropping is performed.
-            **kwargs: Additional args for pdfplumber's to_image (e.g., width, height).
+            **kwargs: Additional args for pdfplumber's internal to_image (e.g., width, height).
         Returns:
             PIL Image of the preview, or None if rendering fails.
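
The `HighlightContext.__exit__` change in this file follows a common pattern: render once on a clean exit, keep the result for later access, and display it only when an IPython shell is active. The sketch below isolates that pattern. `ShowOnExit` and its `render` callable are hypothetical, while `IPython.get_ipython` and `IPython.display.display` are the same calls the diff uses. The code also runs when IPython is not installed.

```python
from typing import Any, Callable, Optional


class ShowOnExit:
    """Hypothetical context manager that renders on exit and keeps the result."""

    def __init__(self, render: Callable[[], Any], show_on_exit: bool = True):
        self._render = render
        self.show_on_exit = show_on_exit
        self.image: Optional[Any] = None  # available after the with-block ends

    def __enter__(self) -> "ShowOnExit":
        return self

    def __exit__(self, exc_type, exc_val, exc_tb) -> bool:
        # Render only when the block finished without an exception
        if self.show_on_exit and exc_type is None:
            self.image = self._render()
            try:
                from IPython import get_ipython
                from IPython.display import display

                # get_ipython() returns None outside an interactive IPython shell
                if get_ipython() is not None and self.image is not None:
                    display(self.image)
            except ImportError:
                # IPython is not installed: skip display, keep the stored image
                pass
        # Returning False lets any exception from the block propagate
        return False


with ShowOnExit(lambda: "rendered image") as ctx:
    pass
print(ctx.image)
```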
