PyPI - kreuzberg - Versions diffs - 3.3.0__tar.gz → 3.4.0__tar.gz - Mend

kreuzberg 3.3.0tar.gz → 3.4.0tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (188) hide show

kreuzberg-3.4.0/.docker/Dockerfile ADDED Viewed

@@ -0,0 +1,21 @@
+FROM ghcr.io/astral-sh/uv:python3.13-bookworm as app
+ARG EXTRAS=""
+WORKDIR /app
+ENV PYTHONDONTWRITEBYTECODE 1
+ENV PYTHONUNBUFFERED 1
+ENV UV_LINK_MODE=copy
+RUN apt-get update && apt-get install -y --no-install-recommends \
+    pandoc \
+    tesseract-ocr \
+    && apt-get clean && rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*
+COPY pyproject.toml uv.lock README.md ./
+COPY kreuzberg kreuzberg
+RUN uv sync --extra api${EXTRAS:+ --extra ${EXTRAS}} --no-editable --no-dev --compile-bytecode
+RUN groupadd -r appuser && useradd -r -g appuser -d /app -s /sbin/nologin appuser
+USER appuser
+CMD ["litestar", "--app", "kreuzberg._api.main:app", "run", "--host", "0.0.0.0"]

kreuzberg-3.4.0/.docker/README.md ADDED Viewed

@@ -0,0 +1,87 @@
+# Kreuzberg Docker Images
+[![GitHub](https://img.shields.io/badge/GitHub-Goldziher%2Fkreuzberg-blue)](https://github.com/Goldziher/kreuzberg)
+[![PyPI](https://badge.fury.io/py/kreuzberg.svg)](https://badge.fury.io/py/kreuzberg)
+[![Documentation](https://img.shields.io/badge/docs-GitHub_Pages-blue)](https://goldziher.github.io/kreuzberg/)
+[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://github.com/Goldziher/kreuzberg/blob/main/LICENSE)
+High-performance Python library for text extraction from documents, available as Docker images.
+**Source Code**: [github.com/Goldziher/kreuzberg](https://github.com/Goldziher/kreuzberg)
+## Quick Start
+```bash
+docker run -p 8000:8000 goldziher/kreuzberg:latest
+```
+## Available Tags
+- `latest` - Latest stable release with API server and Tesseract OCR
+- `X.Y.Z` - Specific version (e.g., `3.0.0`)
+- `X.Y.Z-easyocr` - With EasyOCR support
+- `X.Y.Z-paddle` - With PaddleOCR support
+- `X.Y.Z-gmft` - With GMFT table extraction
+- `X.Y.Z-all` - With all optional dependencies
+## Usage
+### Extract Files via API
+```bash
+# Single file
+curl -X POST http://localhost:8000/extract \
+  -F "data=@document.pdf"
+# Multiple files
+curl -X POST http://localhost:8000/extract \
+  -F "data=@document1.pdf" \
+  -F "data=@document2.docx"
+```
+### Docker Compose
+```yaml
+version: '3.8'
+services:
+  kreuzberg:
+    image: goldziher/kreuzberg:latest
+    ports:
+      - "8000:8000"
+    restart: unless-stopped
+```
+## Features
+- **🚀 High Performance**: Optimized for speed and efficiency
+- **📄 Multiple Formats**: PDF, DOCX, images, HTML, and more
+- **🔍 OCR Support**: Built-in Tesseract, optional EasyOCR/PaddleOCR
+- **📊 Table Extraction**: Extract tables with GMFT
+- **🔒 Secure**: Runs as non-root user, no external API calls
+- **📦 Ready to Use**: Pre-configured API server
+## Documentation
+- **[GitHub Repository](https://github.com/Goldziher/kreuzberg)** - Source code and issue tracking
+- **[Full Documentation](https://goldziher.github.io/kreuzberg/)** - Complete user guide and API reference
+- **[API Documentation](https://goldziher.github.io/kreuzberg/user-guide/api-server/)** - REST API endpoints and usage
+- **[Docker Guide](https://goldziher.github.io/kreuzberg/user-guide/docker/)** - Detailed Docker usage guide
+## Support
+- **Issues**: [github.com/Goldziher/kreuzberg/issues](https://github.com/Goldziher/kreuzberg/issues)
+- **Discussions**: [github.com/Goldziher/kreuzberg/discussions](https://github.com/Goldziher/kreuzberg/discussions)
+- **Discord**: [Join our community](https://discord.gg/pXxagNK2zN)
+## Contributing
+Contributions are welcome! See our [Contributing Guide](https://github.com/Goldziher/kreuzberg/blob/main/docs/contributing.md).
+## License
+MIT License - see [LICENSE](https://github.com/Goldziher/kreuzberg/blob/main/LICENSE) for details.
+______________________________________________________________________
+Made with ❤️ by the [Kreuzberg contributors](https://github.com/Goldziher/kreuzberg/graphs/contributors)

kreuzberg-3.4.0/.dockerignore ADDED Viewed

@@ -0,0 +1,15 @@
+*.pyc
+*.pyd
+*.pyo
+.git
+.github/
+.gitignore
+.idea
+.mypy_cache
+.pytest_cache
+.ruff_cache
+.vscode
+__pycache__
+benchmarks/
+docs/
+tests/

kreuzberg-3.4.0/.github/workflows/publish-docker.yml ADDED Viewed

@@ -0,0 +1,101 @@
+# .github/workflows/publish-docker.yml
+name: Publish Docker Images
+on:
+  workflow_run:
+    workflows: ["Release"]
+    types:
+      - completed
+    branches:
+      - main
+jobs:
+  build-and-push:
+    runs-on: ubuntu-latest
+    if: ${{ github.event.workflow_run.conclusion == 'success' }}
+    permissions:
+      contents: read
+      packages: write
+    strategy:
+      matrix:
+        include:
+          - name: core
+            extras: ""
+            tag_suffix: "" # The base image tag (includes API + tesseract)
+          - name: easyocr
+            extras: "easyocr"
+            tag_suffix: "-easyocr"
+          - name: paddle
+            extras: "paddleocr"
+            tag_suffix: "-paddle"
+          - name: gmft
+            extras: "gmft"
+            tag_suffix: "-gmft"
+          - name: all
+            extras: "all"
+            tag_suffix: "-all"
+    steps:
+      - name: Checkout repository
+        uses: actions/checkout@v4
+        with:
+          ref: ${{ github.event.workflow_run.head_branch }}
+      - name: Get release version
+        id: get_version
+        run: |
+          echo "VERSION=${{ github.event.workflow_run.head_branch }}" >> $GITHUB_OUTPUT
+          # If triggered by a tag, extract version
+          if [[ "${{ github.event.workflow_run.head_branch }}" =~ ^v[0-9]+\.[0-9]+\.[0-9]+ ]]; then
+            echo "VERSION=${{ github.event.workflow_run.head_branch }}" >> $GITHUB_OUTPUT
+          else
+            # Get the latest tag
+            git fetch --tags
+            echo "VERSION=$(git describe --tags --abbrev=0)" >> $GITHUB_OUTPUT
+          fi
+      - name: Set up QEMU
+        uses: docker/setup-qemu-action@v3
+      - name: Set up Docker Buildx
+        uses: docker/setup-buildx-action@v3
+      - name: Log in to Docker Hub
+        uses: docker/login-action@v3
+        with:
+          username: ${{ secrets.DOCKERHUB_USERNAME }}
+          password: ${{ secrets.DOCKERHUB_TOKEN }}
+      - name: Extract metadata (tags, labels) for Docker
+        id: meta
+        uses: docker/metadata-action@v5
+        with:
+          images: goldziher/kreuzberg
+          tags: |
+            # Release version tag (e.g., v3.0.0-easyocr)
+            type=raw,value=${{ steps.get_version.outputs.VERSION }}${{ matrix.tag_suffix }}
+            # Latest tag for each variant (e.g., latest-easyocr)
+            type=raw,value=latest${{ matrix.tag_suffix }}
+      - name: Build and push Docker image
+        uses: docker/build-push-action@v5
+        with:
+          context: .
+          file: ./.docker/Dockerfile
+          platforms: linux/amd64,linux/arm64
+          push: true
+          build-args: |
+            EXTRAS=${{ matrix.extras }}
+          tags: ${{ steps.meta.outputs.tags }}
+          labels: ${{ steps.meta.outputs.labels }}
+      - name: Update Docker Hub README
+        uses: peter-evans/dockerhub-description@v4
+        if: matrix.name == 'core'
+        with:
+          username: ${{ secrets.DOCKERHUB_USERNAME }}
+          password: ${{ secrets.DOCKERHUB_TOKEN }}
+          repository: goldziher/kreuzberg
+          readme-filepath: ./.docker/README.md

{kreuzberg-3.3.0 → kreuzberg-3.4.0}/.gitignore RENAMED Viewed

@@ -6,28 +6,29 @@
 *.py[cod]
 *.suo
 *.user
-.DS_store
+*temp/
 .coverage
 .coverage*
+.cursorrules
 .dist/
+.DS_store
 .env
 .idea/
+.kreuzberg/
 .mypy_cache/
 .pytest_cache/
 .python-version
+.ropeproject
 .ruff_cache/
 .run/
 .venv/
 .vscode/
 .windsurfrules
-.cursorrules
-CLAUDE.md
-GEMINI.md
 __pycache__/
+benchmark_results.json
+CLAUDE.md
 coverage.xml
+docker-compose.yaml
+GEMINI.md
 prompt_template.egg-info/
 requirements.txt
-Dockerfile
-docker-compose.yaml
-benchmark_results.json
-.kreuzberg/

{kreuzberg-3.3.0 → kreuzberg-3.4.0}/PKG-INFO RENAMED Viewed

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: kreuzberg
-Version: 3.3.0
+Version: 3.4.0
 Summary: A text extraction library supporting PDFs, images, office documents and more
 Project-URL: homepage, https://github.com/Goldziher/kreuzberg
 Author-email: Na'aman Hirschfeld <nhirschfed@gmail.com>
@@ -34,12 +34,15 @@ Provides-Extra: all
 Requires-Dist: click>=8.2.1; extra == 'all'
 Requires-Dist: easyocr>=1.7.2; extra == 'all'
 Requires-Dist: gmft>=0.4.2; extra == 'all'
+Requires-Dist: litestar[opentelemetry,standard,structlog]>=2.1.6; extra == 'all'
 Requires-Dist: paddleocr>=3.1.0; extra == 'all'
 Requires-Dist: paddlepaddle>=3.1.0; extra == 'all'
 Requires-Dist: rich>=14.0.0; extra == 'all'
 Requires-Dist: semantic-text-splitter>=0.27.0; extra == 'all'
 Requires-Dist: setuptools>=80.9.0; extra == 'all'
 Requires-Dist: tomli>=2.0.0; (python_version < '3.11') and extra == 'all'
+Provides-Extra: api
+Requires-Dist: litestar[opentelemetry,standard,structlog]>=2.1.6; extra == 'api'
 Provides-Extra: chunking
 Requires-Dist: semantic-text-splitter>=0.27.0; extra == 'chunking'
 Provides-Extra: cli
@@ -63,10 +66,14 @@ Description-Content-Type: text/markdown
 [![Documentation](https://img.shields.io/badge/docs-GitHub_Pages-blue)](https://goldziher.github.io/kreuzberg/)
 [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
-Kreuzberg is a Python library for text extraction from documents. It provides a unified interface for extracting text from PDFs, images, office documents, and more, with both async and sync APIs.
+Kreuzberg is a **high-performance** Python library for text extraction from documents. **Benchmarked as one of the fastest text extraction libraries available**, it provides a unified interface for extracting text from PDFs, images, office documents, and more, with both async and sync APIs optimized for speed and efficiency.
 ## Why Kreuzberg?
+- **🚀 Substantially Faster**: Extraction speeds that significantly outperform other text extraction libraries
+- **⚡ Unique Dual API**: The only framework supporting both sync and async APIs for maximum flexibility
+- **💾 Memory Efficient**: Lower memory footprint compared to competing libraries
+- **📊 Proven Performance**: [Comprehensive benchmarks](https://github.com/Goldziher/python-text-extraction-libs-benchmarks) demonstrate superior performance across formats
 - **Simple and Hassle-Free**: Clean API that just works, without complex configuration
 - **Local Processing**: No external API calls or cloud dependencies required
 - **Resource Efficient**: Lightweight processing without GPU requirements
@@ -85,6 +92,9 @@ pip install kreuzberg
 # Or install with CLI support
 pip install "kreuzberg[cli]"
+# Or install with API server
+pip install "kreuzberg[api]"
 ```
 Install pandoc:
@@ -134,6 +144,31 @@ async def main():
 asyncio.run(main())
 ```
+## Docker
+Docker images are available for easy deployment:
+```bash
+# Run the API server
+docker run -p 8000:8000 goldziher/kreuzberg:latest
+# Extract files via API
+curl -X POST http://localhost:8000/extract -F "data=@document.pdf"
+```
+See the [Docker documentation](https://goldziher.github.io/kreuzberg/user-guide/docker/) for more options.
+## REST API
+Run Kreuzberg as a REST API server:
+```bash
+pip install "kreuzberg[api]"
+litestar --app kreuzberg._api.main:app run
+```
+See the [API documentation](https://goldziher.github.io/kreuzberg/user-guide/api-server/) for endpoints and usage.
 ## Command Line Interface
 Kreuzberg includes a powerful CLI for processing documents from the command line:
@@ -208,7 +243,31 @@ For comparison and selection guidance, see the [OCR Backends](https://goldziher.
 ## Performance
-Kreuzberg offers both sync and async APIs. Choose the right one based on your use case:
+Kreuzberg delivers **exceptional performance** compared to other text extraction libraries:
+### 🏆 Competitive Benchmarks
+[Comprehensive benchmarks](https://github.com/Goldziher/python-text-extraction-libs-benchmarks) comparing Kreuzberg against other popular Python text extraction libraries show:
+- **Fastest Extraction**: Consistently fastest processing times across file formats
+- **Lowest Memory Usage**: Most memory-efficient text extraction solution
+- **100% Success Rate**: Reliable extraction across all tested document types
+- **Optimal for High-Throughput**: Designed for real-time, production applications
+### 💾 Installation Size Efficiency
+Kreuzberg delivers maximum performance with minimal overhead:
+1. **Kreuzberg**: 71.0 MB (20 deps) - Most lightweight
+1. **Unstructured**: 145.8 MB (54 deps) - Moderate footprint
+1. **MarkItDown**: 250.7 MB (25 deps) - ML inference overhead
+1. **Docling**: 1,031.9 MB (88 deps) - Full ML stack included
+**Kreuzberg is up to 14x smaller** than competing solutions while delivering superior performance.
+### ⚡ Sync vs Async Performance
+Kreuzberg is the only library offering both sync and async APIs. Choose based on your use case:
 | Operation              | Sync Time | Async Time | Async Advantage    |
 | ---------------------- | --------- | ---------- | ------------------ |
@@ -218,11 +277,7 @@ Kreuzberg offers both sync and async APIs. Choose the right one based on your us
 | OCR processing         | 0.4s      | 0.7s       | **✅ 1.7x faster** |
 | Batch operations       | 38.6s     | 8.5s       | **✅ 4.5x faster** |
-**Rule of thumb:**
-- Use **sync** for simple documents and CLI applications
-- Use **async** for complex PDFs, OCR, and batch processing
-- Use **batch operations** for multiple files
+**Rule of thumb:** Use async for complex documents, OCR, batch processing, and backend APIs.
 For detailed benchmarks and methodology, see our [Performance Documentation](https://goldziher.github.io/kreuzberg/advanced/performance/).

{kreuzberg-3.3.0 → kreuzberg-3.4.0}/README.md RENAMED Viewed

@@ -5,10 +5,14 @@
 [![Documentation](https://img.shields.io/badge/docs-GitHub_Pages-blue)](https://goldziher.github.io/kreuzberg/)
 [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
-Kreuzberg is a Python library for text extraction from documents. It provides a unified interface for extracting text from PDFs, images, office documents, and more, with both async and sync APIs.
+Kreuzberg is a **high-performance** Python library for text extraction from documents. **Benchmarked as one of the fastest text extraction libraries available**, it provides a unified interface for extracting text from PDFs, images, office documents, and more, with both async and sync APIs optimized for speed and efficiency.
 ## Why Kreuzberg?
+- **🚀 Substantially Faster**: Extraction speeds that significantly outperform other text extraction libraries
+- **⚡ Unique Dual API**: The only framework supporting both sync and async APIs for maximum flexibility
+- **💾 Memory Efficient**: Lower memory footprint compared to competing libraries
+- **📊 Proven Performance**: [Comprehensive benchmarks](https://github.com/Goldziher/python-text-extraction-libs-benchmarks) demonstrate superior performance across formats
 - **Simple and Hassle-Free**: Clean API that just works, without complex configuration
 - **Local Processing**: No external API calls or cloud dependencies required
 - **Resource Efficient**: Lightweight processing without GPU requirements
@@ -27,6 +31,9 @@ pip install kreuzberg
 # Or install with CLI support
 pip install "kreuzberg[cli]"
+# Or install with API server
+pip install "kreuzberg[api]"
 ```
 Install pandoc:
@@ -76,6 +83,31 @@ async def main():
 asyncio.run(main())
 ```
+## Docker
+Docker images are available for easy deployment:
+```bash
+# Run the API server
+docker run -p 8000:8000 goldziher/kreuzberg:latest
+# Extract files via API
+curl -X POST http://localhost:8000/extract -F "data=@document.pdf"
+```
+See the [Docker documentation](https://goldziher.github.io/kreuzberg/user-guide/docker/) for more options.
+## REST API
+Run Kreuzberg as a REST API server:
+```bash
+pip install "kreuzberg[api]"
+litestar --app kreuzberg._api.main:app run
+```
+See the [API documentation](https://goldziher.github.io/kreuzberg/user-guide/api-server/) for endpoints and usage.
 ## Command Line Interface
 Kreuzberg includes a powerful CLI for processing documents from the command line:
@@ -150,7 +182,31 @@ For comparison and selection guidance, see the [OCR Backends](https://goldziher.
 ## Performance
-Kreuzberg offers both sync and async APIs. Choose the right one based on your use case:
+Kreuzberg delivers **exceptional performance** compared to other text extraction libraries:
+### 🏆 Competitive Benchmarks
+[Comprehensive benchmarks](https://github.com/Goldziher/python-text-extraction-libs-benchmarks) comparing Kreuzberg against other popular Python text extraction libraries show:
+- **Fastest Extraction**: Consistently fastest processing times across file formats
+- **Lowest Memory Usage**: Most memory-efficient text extraction solution
+- **100% Success Rate**: Reliable extraction across all tested document types
+- **Optimal for High-Throughput**: Designed for real-time, production applications
+### 💾 Installation Size Efficiency
+Kreuzberg delivers maximum performance with minimal overhead:
+1. **Kreuzberg**: 71.0 MB (20 deps) - Most lightweight
+1. **Unstructured**: 145.8 MB (54 deps) - Moderate footprint
+1. **MarkItDown**: 250.7 MB (25 deps) - ML inference overhead
+1. **Docling**: 1,031.9 MB (88 deps) - Full ML stack included
+**Kreuzberg is up to 14x smaller** than competing solutions while delivering superior performance.
+### ⚡ Sync vs Async Performance
+Kreuzberg is the only library offering both sync and async APIs. Choose based on your use case:
 | Operation              | Sync Time | Async Time | Async Advantage    |
 | ---------------------- | --------- | ---------- | ------------------ |
@@ -160,11 +216,7 @@ Kreuzberg offers both sync and async APIs. Choose the right one based on your us
 | OCR processing         | 0.4s      | 0.7s       | **✅ 1.7x faster** |
 | Batch operations       | 38.6s     | 8.5s       | **✅ 4.5x faster** |
-**Rule of thumb:**
-- Use **sync** for simple documents and CLI applications
-- Use **async** for complex PDFs, OCR, and batch processing
-- Use **batch operations** for multiple files
+**Rule of thumb:** Use async for complex documents, OCR, batch processing, and backend APIs.
 For detailed benchmarks and methodology, see our [Performance Documentation](https://goldziher.github.io/kreuzberg/advanced/performance/).

{kreuzberg-3.3.0 → kreuzberg-3.4.0}/docs/advanced/performance.md RENAMED Viewed

@@ -4,18 +4,29 @@ Kreuzberg provides both synchronous and asynchronous APIs, each optimized for di
 ## Quick Reference
-| Use Case            | Recommended API              | Reason                       |
-| ------------------- | ---------------------------- | ---------------------------- |
-| CLI tools           | `extract_file_sync()`        | Lower overhead, simpler code |
-| Web applications    | `await extract_file()`       | Better concurrency           |
-| Simple documents    | `extract_file_sync()`        | Faster for small files       |
-| Complex PDFs        | `await extract_file()`       | Parallelized processing      |
-| Batch processing    | `await batch_extract_file()` | Concurrent execution         |
-| OCR-heavy workloads | `await extract_file()`       | Multiprocessing benefits     |
+| Use Case            | Recommended API              | Reason                                 |
+| ------------------- | ---------------------------- | -------------------------------------- |
+| CLI tools           | `extract_file_sync()`        | Lower overhead, simpler code           |
+| **Backend APIs**    | `await extract_file()`       | **Always use async in async contexts** |
+| Web applications    | `await extract_file()`       | Better concurrency                     |
+| Simple documents    | `extract_file_sync()`        | Faster for small files                 |
+| Complex PDFs        | `await extract_file()`       | Parallelized processing                |
+| Batch processing    | `await batch_extract_file()` | Concurrent execution                   |
+| OCR-heavy workloads | `await extract_file()`       | Multiprocessing benefits               |
-## Benchmark Results
+## Competitive Performance
-All benchmarks were conducted on macOS 15.5 with ARM64 (14 cores, 48GB RAM) using Python 3.13.3.
+[Comprehensive benchmarks](https://github.com/Goldziher/python-text-extraction-libs-benchmarks) comparing Kreuzberg against other popular Python text extraction libraries demonstrate:
+- **Fastest Extraction**: Consistently fastest processing times across file formats
+- **Lowest Memory Usage**: Most memory-efficient text extraction solution
+- **Smallest Installation**: 71.0 MB vs competitors ranging from 145.8 MB to 1,031.9 MB
+- **100% Success Rate**: Reliable extraction across all tested document types
+- **Optimal for High-Throughput**: Designed for real-time, production applications
+## Internal Benchmark Results
+All internal benchmarks were conducted on macOS 15.5 with ARM64 (14 cores, 48GB RAM) using Python 3.13.3.
 ### Single Document Processing
@@ -50,6 +61,29 @@ All benchmarks were conducted on macOS 15.5 with ARM64 (14 cores, 48GB RAM) usin
 1. **Simpler Path**: Direct execution without thread/process coordination
 1. **Fast Startup**: Immediate execution for quick operations
+### Backend API Considerations
+**Important**: When working in an async context (like FastAPI, Django async views, aiohttp), **always use the async API** even for simple documents:
+```python
+# ✅ Correct: Use async in async contexts
+async def extract_endpoint(file_path: str):
+    result = await extract_file(file_path)  # Non-blocking
+    return result
+# ❌ Wrong: Sync in async context blocks the event loop
+async def extract_endpoint(file_path: str):
+    result = extract_file_sync(file_path)  # Blocks event loop!
+    return result
+```
+**Why this matters:**
+- Sync operations in async contexts block the entire event loop
+- This prevents other requests from being processed concurrently
+- Backend throughput drops dramatically
+- Use async consistently throughout your async application stack
 ### The Crossover Point
 The performance crossover occurs around **10KB file size** or when **OCR is required**:
@@ -219,6 +253,17 @@ Choose your API based on your specific needs:
 - **Sync for simplicity**: CLI tools, simple documents, single-threaded applications
 - **Async for scale**: Web applications, batch processing, complex documents
+- **Async for backends**: **Always use async in async contexts** (FastAPI, Django async, etc.)
 - **Batch for efficiency**: Multiple files, concurrent processing requirements
+### Key Decision Points
+1. **Are you in an async context?** → Use async API
+1. **Processing multiple files?** → Use batch operations
+1. **Simple single document in sync context?** → Sync may be faster
+1. **Complex documents or OCR required?** → Use async API
+1. **Building a web API?** → Use async API
 The performance characteristics will vary based on your specific documents, hardware, and usage patterns. We recommend benchmarking with your actual data to make informed decisions.
+**Remember**: Kreuzberg is benchmarked as one of the fastest text extraction libraries available, delivering superior performance regardless of which API you choose.

kreuzberg 3.3.0__tar.gz → 3.4.0__tar.gz

kreuzberg 3.3.0tar.gz → 3.4.0tar.gz