PyPI: mdify-cli 1.5.0 (tar.gz) → 2.0.0 (tar.gz)

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (20)

{mdify_cli-1.5.0/mdify_cli.egg-info → mdify_cli-2.0.0}/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: mdify-cli
-Version: 1.5.0
+Version: 2.0.0
 Summary: Convert PDFs and document images into structured Markdown for LLM workflows
 Author: tiroq
 License-Expression: MIT
@@ -24,6 +24,9 @@ Classifier: Topic :: Utilities
 Requires-Python: >=3.8
 Description-Content-Type: text/markdown
 License-File: LICENSE
+Requires-Dist: requests
+Provides-Extra: dev
+Requires-Dist: pytest>=7.0; extra == "dev"
 Dynamic: license-file
 # mdify
@@ -98,15 +101,32 @@ Recursively convert files:
 mdify /path/to/documents -r -g "*.pdf"
 ```
-### Masking sensitive content
+### GPU Acceleration
-Mask PII and sensitive content in images:
+For faster processing with NVIDIA GPU:
 ```bash
-mdify document.pdf -m
-mdify document.pdf --mask
+mdify --gpu documents/*.pdf
 ```
-This uses Docling's content-aware masking to obscure sensitive information in embedded images.
+Requires NVIDIA GPU with CUDA support and nvidia-container-toolkit.
+### ⚠️ PII Masking (Deprecated)
+The `--mask` flag is deprecated and will be ignored in this version. PII masking functionality was available in older versions using a custom runtime but is not supported with the current docling-serve backend.
+If PII masking is critical for your use case, please use mdify v1.5.x or earlier versions.
+## Performance
+mdify now uses docling-serve for significantly faster batch processing:
+- **Single model load**: Models are loaded once per session, not per file
+- **~10-20x speedup** for multiple file conversions compared to previous versions
+- **GPU acceleration**: Use `--gpu` for additional 2-6x speedup (requires NVIDIA GPU)
+### First Run Behavior
+The first conversion takes longer (~30-60s) as the container loads ML models into memory. Subsequent files in the same batch process quickly, typically in 1-3 seconds per file.
 ## Options
@@ -119,9 +139,11 @@ This uses Docling's content-aware masking to obscure sensitive information in em
 | `--flat` | Disable directory structure preservation |
 | `--overwrite` | Overwrite existing output files |
 | `-q, --quiet` | Suppress progress messages |
-| `-m, --mask` | Mask PII and sensitive content in images |
+| `-m, --mask` | ⚠️ **Deprecated**: PII masking not supported in current version |
+| `--gpu` | Use GPU-accelerated container (requires NVIDIA GPU and nvidia-container-toolkit) |
+| `--port PORT` | Container port (default: 5001) |
 | `--runtime RUNTIME` | Container runtime: docker or podman (auto-detected) |
-| `--image IMAGE` | Custom container image (default: ghcr.io/tiroq/mdify-runtime:latest) |
+| `--image IMAGE` | Custom container image (default: ghcr.io/docling-project/docling-serve-cpu:main) |
 | `--pull POLICY` | Image pull policy: always, missing, never (default: missing) |
 | `--check-update` | Check for available updates and exit |
 | `--version` | Show version and exit |
@@ -175,19 +197,22 @@ The CLI:
 - Pulls the runtime container on first use
 - Mounts files and runs conversions in the container
-## Container Image
+## Container Images
+mdify uses official docling-serve containers:
-The runtime container is hosted at:
+**CPU Version** (default):
 ```
-ghcr.io/tiroq/mdify-runtime:latest
+ghcr.io/docling-project/docling-serve-cpu:main
 ```
-To build locally:
-```bash
-cd runtime
-docker build -t mdify-runtime .
+**GPU Version** (use with `--gpu` flag):
+```
+ghcr.io/docling-project/docling-serve-cu126:main
 ```
+These are official images from the [docling-serve project](https://github.com/DS4SD/docling-serve).
 ## Updates
 mdify checks for updates daily. When a new version is available:
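
Note (not part of the diff): the timing figures added to the README in this hunk can be turned into a rough batch estimate. The sketch below assumes the first conversion costs the quoted 30-60 s in total and every later file in the same batch costs 1-3 s. The function name `estimate_batch_seconds` is hypothetical and does not exist in mdify.

```python
def estimate_batch_seconds(n_files, first_run=(30.0, 60.0), per_file=(1.0, 3.0)):
    """Return a (low, high) estimate in seconds for converting n_files in one batch.

    Assumption: the first file pays the model-load cost (first_run range),
    and each remaining file pays only the per_file range quoted in the README.
    """
    if n_files <= 0:
        return (0.0, 0.0)
    remaining = n_files - 1
    low = first_run[0] + remaining * per_file[0]
    high = first_run[1] + remaining * per_file[1]
    return (low, high)


if __name__ == "__main__":
    low, high = estimate_batch_seconds(50)
    print(f"50 files: roughly {low:.0f}-{high:.0f} s")
```

Under these assumptions a 50-file batch lands between 79 s and 207 s; real timings depend on hardware, document size, and whether `--gpu` is used.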

{mdify_cli-1.5.0 → mdify_cli-2.0.0}/README.md RENAMED

@@ -70,15 +70,32 @@ Recursively convert files:
 mdify /path/to/documents -r -g "*.pdf"
 ```
-### Masking sensitive content
+### GPU Acceleration
-Mask PII and sensitive content in images:
+For faster processing with NVIDIA GPU:
 ```bash
-mdify document.pdf -m
-mdify document.pdf --mask
+mdify --gpu documents/*.pdf
 ```
-This uses Docling's content-aware masking to obscure sensitive information in embedded images.
+Requires NVIDIA GPU with CUDA support and nvidia-container-toolkit.
+### ⚠️ PII Masking (Deprecated)
+The `--mask` flag is deprecated and will be ignored in this version. PII masking functionality was available in older versions using a custom runtime but is not supported with the current docling-serve backend.
+If PII masking is critical for your use case, please use mdify v1.5.x or earlier versions.
+## Performance
+mdify now uses docling-serve for significantly faster batch processing:
+- **Single model load**: Models are loaded once per session, not per file
+- **~10-20x speedup** for multiple file conversions compared to previous versions
+- **GPU acceleration**: Use `--gpu` for additional 2-6x speedup (requires NVIDIA GPU)
+### First Run Behavior
+The first conversion takes longer (~30-60s) as the container loads ML models into memory. Subsequent files in the same batch process quickly, typically in 1-3 seconds per file.
 ## Options
@@ -91,9 +108,11 @@ This uses Docling's content-aware masking to obscure sensitive information in em
 | `--flat` | Disable directory structure preservation |
 | `--overwrite` | Overwrite existing output files |
 | `-q, --quiet` | Suppress progress messages |
-| `-m, --mask` | Mask PII and sensitive content in images |
+| `-m, --mask` | ⚠️ **Deprecated**: PII masking not supported in current version |
+| `--gpu` | Use GPU-accelerated container (requires NVIDIA GPU and nvidia-container-toolkit) |
+| `--port PORT` | Container port (default: 5001) |
 | `--runtime RUNTIME` | Container runtime: docker or podman (auto-detected) |
-| `--image IMAGE` | Custom container image (default: ghcr.io/tiroq/mdify-runtime:latest) |
+| `--image IMAGE` | Custom container image (default: ghcr.io/docling-project/docling-serve-cpu:main) |
 | `--pull POLICY` | Image pull policy: always, missing, never (default: missing) |
 | `--check-update` | Check for available updates and exit |
 | `--version` | Show version and exit |
@@ -147,19 +166,22 @@ The CLI:
 - Pulls the runtime container on first use
 - Mounts files and runs conversions in the container
-## Container Image
+## Container Images
+mdify uses official docling-serve containers:
-The runtime container is hosted at:
+**CPU Version** (default):
 ```
-ghcr.io/tiroq/mdify-runtime:latest
+ghcr.io/docling-project/docling-serve-cpu:main
 ```
-To build locally:
-```bash
-cd runtime
-docker build -t mdify-runtime .
+**GPU Version** (use with `--gpu` flag):
+```
+ghcr.io/docling-project/docling-serve-cu126:main
 ```
+These are official images from the [docling-serve project](https://github.com/DS4SD/docling-serve).
 ## Updates
 mdify checks for updates daily. When a new version is available:
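
Note (not part of the diff): the README changes above describe a CPU image used by default, a GPU image selected with `--gpu`, and an `--image` override. A minimal sketch of that selection logic follows. The image names and the default port come from the README; the function `resolve_image` and the rule that an explicit `--image` takes precedence over `--gpu` are assumptions, not confirmed by this diff.

```python
# Image names and port as documented in the 2.0.0 README.
CPU_IMAGE = "ghcr.io/docling-project/docling-serve-cpu:main"
GPU_IMAGE = "ghcr.io/docling-project/docling-serve-cu126:main"
DEFAULT_PORT = 5001


def resolve_image(gpu=False, image=None):
    """Pick the container image for a run (hypothetical helper).

    Assumption: a user-supplied image wins over the --gpu flag.
    """
    if image:
        return image
    return GPU_IMAGE if gpu else CPU_IMAGE


if __name__ == "__main__":
    print(resolve_image())                 # CPU image (documented default)
    print(resolve_image(gpu=True))         # GPU image
    print(resolve_image(image="my/img:1")) # custom override
```

The actual precedence in mdify 2.0.0 may differ; check the CLI source before relying on it.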

{mdify_cli-1.5.0 → mdify_cli-2.0.0}/mdify/__init__.py RENAMED

@@ -1,3 +1,3 @@
 """mdify - Convert documents to Markdown via Docling container."""
-__version__ = "1.5.0"
+__version__ = "2.0.0"
