docling-ocr-onnxtr 0.1.3 (tar.gz) → 0.2.1 (tar.gz)

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (22)

{docling_ocr_onnxtr-0.1.3 → docling_ocr_onnxtr-0.2.1}/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: docling-ocr-onnxtr
-Version: 0.1.3
+Version: 0.2.1
 Summary: Onnx Text Recognition (OnnxTR) OCR plugin for docling
 Author-email: Felix Dittrich <felixdittrich92@gmail.com>
 Maintainer: Felix Dittrich
@@ -266,7 +266,7 @@ Dynamic: license-file
 [![codecov](https://codecov.io/gh/felixdittrich92/docling-OCR-OnnxTR/graph/badge.svg?token=L3AHXKV86A)](https://codecov.io/gh/felixdittrich92/docling-OCR-OnnxTR)
 [![Codacy Badge](https://app.codacy.com/project/badge/Grade/0d250447650240ee9ca573950fea8b99)](https://app.codacy.com/gh/felixdittrich92/docling-OCR-OnnxTR/dashboard?utm_source=gh&utm_medium=referral&utm_content=&utm_campaign=Badge_grade)
 [![CodeFactor](https://www.codefactor.io/repository/github/felixdittrich92/docling-ocr-onnxtr/badge)](https://www.codefactor.io/repository/github/felixdittrich92/docling-ocr-onnxtr)
-[![Pypi](https://img.shields.io/badge/pypi-v0.1.3-blue.svg)](https://pypi.org/project/docling-ocr-onnxtr/)
+[![Pypi](https://img.shields.io/badge/pypi-v0.2.0-blue.svg)](https://pypi.org/project/docling-ocr-onnxtr/)
 ![PyPI - Downloads](https://img.shields.io/pypi/dm/docling-ocr-onnxtr)
 The `docling-OCR-OnnxTR` repository provides a plugin that integrates the [OnnxTR OCR engine](https://github.com/felixdittrich92/OnnxTR) into the [Docling framework](https://github.com/docling-project/docling), enhancing document processing capabilities with efficient and accurate text recognition.
@@ -358,6 +358,62 @@ def main():
     print(md)
+if __name__ == "__main__":
+    main()
+```
+It is also possible to load the models from local files instead of using the Hugging Face Hub or downloading them from the repo:
+```python
+from docling.datamodel.pipeline_options import PdfPipelineOptions
+from docling.document_converter import (
+    ConversionResult,
+    DocumentConverter,
+    InputFormat,
+    PdfFormatOption,
+)
+from docling_ocr_onnxtr import OnnxtrOcrOptions
+from onnxtr.models import db_mobilenet_v3_large, parseq
+def main():
+    # Source document to convert
+    source = "https://arxiv.org/pdf/2408.09869v4"
+    # Load models from local files
+    # NOTE: You need to download the models first and then adjust the paths accordingly.
+    det_model = db_mobilenet_v3_large("/home/felix/.cache/onnxtr/models/db_mobilenet_v3_large-1866973f.onnx")
+    reco_model = parseq("/home/felix/.cache/onnxtr/models/parseq-00b40714.onnx")
+    ocr_options = OnnxtrOcrOptions(
+        # Text detection model
+        det_arch=det_model,
+        # Text recognition model
+        reco_arch=reco_model,
+        # This can be set to `True` to auto-correct the orientation of the pages
+        auto_correct_orientation=False,
+    )
+    pipeline_options = PdfPipelineOptions(
+        ocr_options=ocr_options,
+    )
+    pipeline_options.allow_external_plugins = True  # <-- enabled the external plugins
+    # Convert the document
+    converter = DocumentConverter(
+        format_options={
+            InputFormat.PDF: PdfFormatOption(
+                pipeline_options=pipeline_options,
+            ),
+        },
+    )
+    conversion_result: ConversionResult = converter.convert(source=source)
+    doc = conversion_result.document
+    md = doc.export_to_markdown()
+    print(md)
 if __name__ == "__main__":
     main()
 ```

{docling_ocr_onnxtr-0.1.3 → docling_ocr_onnxtr-0.2.1}/README.md RENAMED

@@ -7,7 +7,7 @@
 [![codecov](https://codecov.io/gh/felixdittrich92/docling-OCR-OnnxTR/graph/badge.svg?token=L3AHXKV86A)](https://codecov.io/gh/felixdittrich92/docling-OCR-OnnxTR)
 [![Codacy Badge](https://app.codacy.com/project/badge/Grade/0d250447650240ee9ca573950fea8b99)](https://app.codacy.com/gh/felixdittrich92/docling-OCR-OnnxTR/dashboard?utm_source=gh&utm_medium=referral&utm_content=&utm_campaign=Badge_grade)
 [![CodeFactor](https://www.codefactor.io/repository/github/felixdittrich92/docling-ocr-onnxtr/badge)](https://www.codefactor.io/repository/github/felixdittrich92/docling-ocr-onnxtr)
-[![Pypi](https://img.shields.io/badge/pypi-v0.1.3-blue.svg)](https://pypi.org/project/docling-ocr-onnxtr/)
+[![Pypi](https://img.shields.io/badge/pypi-v0.2.0-blue.svg)](https://pypi.org/project/docling-ocr-onnxtr/)
 ![PyPI - Downloads](https://img.shields.io/pypi/dm/docling-ocr-onnxtr)
 The `docling-OCR-OnnxTR` repository provides a plugin that integrates the [OnnxTR OCR engine](https://github.com/felixdittrich92/OnnxTR) into the [Docling framework](https://github.com/docling-project/docling), enhancing document processing capabilities with efficient and accurate text recognition.
@@ -99,6 +99,62 @@ def main():
     print(md)
+if __name__ == "__main__":
+    main()
+```
+It is also possible to load the models from local files instead of using the Hugging Face Hub or downloading them from the repo:
+```python
+from docling.datamodel.pipeline_options import PdfPipelineOptions
+from docling.document_converter import (
+    ConversionResult,
+    DocumentConverter,
+    InputFormat,
+    PdfFormatOption,
+)
+from docling_ocr_onnxtr import OnnxtrOcrOptions
+from onnxtr.models import db_mobilenet_v3_large, parseq
+def main():
+    # Source document to convert
+    source = "https://arxiv.org/pdf/2408.09869v4"
+    # Load models from local files
+    # NOTE: You need to download the models first and then adjust the paths accordingly.
+    det_model = db_mobilenet_v3_large("/home/felix/.cache/onnxtr/models/db_mobilenet_v3_large-1866973f.onnx")
+    reco_model = parseq("/home/felix/.cache/onnxtr/models/parseq-00b40714.onnx")
+    ocr_options = OnnxtrOcrOptions(
+        # Text detection model
+        det_arch=det_model,
+        # Text recognition model
+        reco_arch=reco_model,
+        # This can be set to `True` to auto-correct the orientation of the pages
+        auto_correct_orientation=False,
+    )
+    pipeline_options = PdfPipelineOptions(
+        ocr_options=ocr_options,
+    )
+    pipeline_options.allow_external_plugins = True  # <-- enabled the external plugins
+    # Convert the document
+    converter = DocumentConverter(
+        format_options={
+            InputFormat.PDF: PdfFormatOption(
+                pipeline_options=pipeline_options,
+            ),
+        },
+    )
+    conversion_result: ConversionResult = converter.convert(source=source)
+    doc = conversion_result.document
+    md = doc.export_to_markdown()
+    print(md)
 if __name__ == "__main__":
     main()
 ```

{docling_ocr_onnxtr-0.1.3 → docling_ocr_onnxtr-0.2.1}/docling_ocr_onnxtr/onnxtr_model.py RENAMED

@@ -1,4 +1,4 @@
-# Copyright (C) 2021-2025, Felix Dittrich.
+# Copyright (C) 2021-2026, Felix Dittrich.
 # This program is licensed under the Apache License 2.0.
 # See LICENSE or go to <https://opensource.org/licenses/Apache-2.0> for full license details.
@@ -12,7 +12,7 @@ import numpy
 import numpy as np
 from docling.datamodel.base_models import Page
 from docling.datamodel.document import ConversionResult
-from docling.datamodel.pipeline_options import (
+from docling.datamodel.pipeline_options import (  # type: ignore[attr-defined]
     AcceleratorOptions,
     OcrOptions,
 )
@@ -91,11 +91,13 @@ class OnnxtrOcrModel(BaseOcrModel):
             self.reader = ocr_predictor(
                 det_arch=(
-                    from_hub(self.options.det_arch) if self.options.det_arch.count("/") == 1 else self.options.det_arch
+                    from_hub(self.options.det_arch)
+                    if isinstance(self.options.det_arch, str) and self.options.det_arch.count("/") == 1
+                    else self.options.det_arch
                 ),
                 reco_arch=(
                     from_hub(self.options.reco_arch)
-                    if self.options.reco_arch.count("/") == 1
+                    if isinstance(self.options.reco_arch, str) and self.options.reco_arch.count("/") == 1
                     else self.options.reco_arch
                 ),
                 det_bs=1,  # NOTE: Should be always 1, because docling handles batching
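
Note on the hunk above: the added `isinstance(..., str)` guard means `.count("/")` is only called on strings, so a pre-built model instance falls through to the `else` branch unchanged. The following is a minimal sketch of that selection logic, not the plugin's code: `resolve_arch`, `FakeModel`, and `fake_from_hub` are hypothetical names introduced here for illustration, and the stub stands in for the real `from_hub` loader.

```python
from typing import Any, Callable


def resolve_arch(arch: Any, from_hub: Callable[[str], Any]) -> Any:
    """Mirror the 0.2.1 conditional (hypothetical helper, not part of the plugin)."""
    # Only a string with exactly one "/" is treated as a hub repo id.
    if isinstance(arch, str) and arch.count("/") == 1:
        return from_hub(arch)
    # Plain architecture names and model instances are passed through as-is.
    return arch


class FakeModel:
    """Hypothetical stand-in for a pre-built model instance."""


def fake_from_hub(repo_id: str) -> str:
    # Hypothetical stub; it only tags the id so the branch taken is visible.
    return f"hub:{repo_id}"


model = FakeModel()
print(resolve_arch("fast_base", fake_from_hub))             # built-in name, passed through
print(resolve_arch("some-user/some-model", fake_from_hub))  # routed to the hub loader
print(resolve_arch(model, fake_from_hub) is model)          # instance, passed through
```

Without the `isinstance` check, the third call would attempt `model.count("/")` on an object that is not a string.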

{docling_ocr_onnxtr-0.1.3 → docling_ocr_onnxtr-0.2.1}/docling_ocr_onnxtr/options.py RENAMED

@@ -1,4 +1,4 @@
-# Copyright (C) 2021-2025, Felix Dittrich.
+# Copyright (C) 2021-2026, Felix Dittrich.
 # This program is licensed under the Apache License 2.0.
 # See LICENSE or go to <https://opensource.org/licenses/Apache-2.0> for full license details.
@@ -22,9 +22,9 @@ class OnnxtrOcrOptions(OcrOptions):
     # detection model objectness score threshold 'fast algorithm'
     objectness_score: float = 0.3
-    # NOTE: This can be also a hf hub model
-    det_arch: str = "fast_base"
-    reco_arch: str = "crnn_vgg16_bn"
+    # NOTE: This can be also a hf hub model or an instance of a model class.
+    det_arch: Any = "fast_base"
+    reco_arch: Any = "crnn_vgg16_bn"
     reco_bs: int = 512
     auto_correct_orientation: bool = False
     preserve_aspect_ratio: bool = True

{docling_ocr_onnxtr-0.1.3 → docling_ocr_onnxtr-0.2.1}/docling_ocr_onnxtr/plugin.py RENAMED

@@ -1,4 +1,4 @@
-# Copyright (C) 2021-2025, Felix Dittrich.
+# Copyright (C) 2021-2026, Felix Dittrich.
 # This program is licensed under the Apache License 2.0.
 # See LICENSE or go to <https://opensource.org/licenses/Apache-2.0> for full license details.

docling_ocr_onnxtr-0.2.1/docling_ocr_onnxtr/version.py ADDED

@@ -0,0 +1 @@
+__version__ = 'v0.2.1'

{docling_ocr_onnxtr-0.1.3 → docling_ocr_onnxtr-0.2.1}/docling_ocr_onnxtr.egg-info/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: docling-ocr-onnxtr
-Version: 0.1.3
+Version: 0.2.1
 Summary: Onnx Text Recognition (OnnxTR) OCR plugin for docling
 Author-email: Felix Dittrich <felixdittrich92@gmail.com>
 Maintainer: Felix Dittrich
@@ -266,7 +266,7 @@ Dynamic: license-file
 [![codecov](https://codecov.io/gh/felixdittrich92/docling-OCR-OnnxTR/graph/badge.svg?token=L3AHXKV86A)](https://codecov.io/gh/felixdittrich92/docling-OCR-OnnxTR)
 [![Codacy Badge](https://app.codacy.com/project/badge/Grade/0d250447650240ee9ca573950fea8b99)](https://app.codacy.com/gh/felixdittrich92/docling-OCR-OnnxTR/dashboard?utm_source=gh&utm_medium=referral&utm_content=&utm_campaign=Badge_grade)
 [![CodeFactor](https://www.codefactor.io/repository/github/felixdittrich92/docling-ocr-onnxtr/badge)](https://www.codefactor.io/repository/github/felixdittrich92/docling-ocr-onnxtr)
-[![Pypi](https://img.shields.io/badge/pypi-v0.1.3-blue.svg)](https://pypi.org/project/docling-ocr-onnxtr/)
+[![Pypi](https://img.shields.io/badge/pypi-v0.2.0-blue.svg)](https://pypi.org/project/docling-ocr-onnxtr/)
 ![PyPI - Downloads](https://img.shields.io/pypi/dm/docling-ocr-onnxtr)
 The `docling-OCR-OnnxTR` repository provides a plugin that integrates the [OnnxTR OCR engine](https://github.com/felixdittrich92/OnnxTR) into the [Docling framework](https://github.com/docling-project/docling), enhancing document processing capabilities with efficient and accurate text recognition.
@@ -358,6 +358,62 @@ def main():
     print(md)
+if __name__ == "__main__":
+    main()
+```
+It is also possible to load the models from local files instead of using the Hugging Face Hub or downloading them from the repo:
+```python
+from docling.datamodel.pipeline_options import PdfPipelineOptions
+from docling.document_converter import (
+    ConversionResult,
+    DocumentConverter,
+    InputFormat,
+    PdfFormatOption,
+)
+from docling_ocr_onnxtr import OnnxtrOcrOptions
+from onnxtr.models import db_mobilenet_v3_large, parseq
+def main():
+    # Source document to convert
+    source = "https://arxiv.org/pdf/2408.09869v4"
+    # Load models from local files
+    # NOTE: You need to download the models first and then adjust the paths accordingly.
+    det_model = db_mobilenet_v3_large("/home/felix/.cache/onnxtr/models/db_mobilenet_v3_large-1866973f.onnx")
+    reco_model = parseq("/home/felix/.cache/onnxtr/models/parseq-00b40714.onnx")
+    ocr_options = OnnxtrOcrOptions(
+        # Text detection model
+        det_arch=det_model,
+        # Text recognition model
+        reco_arch=reco_model,
+        # This can be set to `True` to auto-correct the orientation of the pages
+        auto_correct_orientation=False,
+    )
+    pipeline_options = PdfPipelineOptions(
+        ocr_options=ocr_options,
+    )
+    pipeline_options.allow_external_plugins = True  # <-- enabled the external plugins
+    # Convert the document
+    converter = DocumentConverter(
+        format_options={
+            InputFormat.PDF: PdfFormatOption(
+                pipeline_options=pipeline_options,
+            ),
+        },
+    )
+    conversion_result: ConversionResult = converter.convert(source=source)
+    doc = conversion_result.document
+    md = doc.export_to_markdown()
+    print(md)
 if __name__ == "__main__":
     main()
 ```

{docling_ocr_onnxtr-0.1.3 → docling_ocr_onnxtr-0.2.1}/setup.py RENAMED

@@ -1,4 +1,4 @@
-# Copyright (C) 2021-2025, Felix Dittrich.
+# Copyright (C) 2021-2026, Felix Dittrich.
 # This program is licensed under the Apache License 2.0.
 # See LICENSE or go to <https://opensource.org/licenses/Apache-2.0> for full license details.
@@ -9,7 +9,7 @@ from pathlib import Path
 from setuptools import setup
 PKG_NAME = "docling_ocr_onnxtr"
-VERSION = os.getenv("BUILD_VERSION", "0.1.3a0")
+VERSION = os.getenv("BUILD_VERSION", "0.2.1a0")
 if __name__ == "__main__":

{docling_ocr_onnxtr-0.1.3 → docling_ocr_onnxtr-0.2.1}/tests/test_plugin.py RENAMED

@@ -15,9 +15,8 @@ from docling.document_converter import DocumentConverter, PdfFormatOption
 from docling_ocr_onnxtr import OnnxtrOcrOptions
 from .test_data_gen_flag import GEN_TEST_DATA
-from .verify_utils import verify_conversion_result_v1, verify_conversion_result_v2
+from .verify_utils import verify_conversion_result_v2
-GENERATE_V1 = GEN_TEST_DATA
 GENERATE_V2 = GEN_TEST_DATA
@@ -73,7 +72,7 @@ def test_e2e_conversions(ocr_options: OcrOptions):
     print(f"Converting with ocr_engine: {ocr_options.kind}, language: {ocr_options.lang}")
     converter = get_converter(ocr_options=ocr_options)
     for pdf_path in pdf_paths:
-        if not ocr_options.auto_correct_orientation and "rotated" in pdf_path.name:
+        if not ocr_options.auto_correct_orientation and ("rotated" in pdf_path.name or "rotation" in pdf_path.name):
             # Skip rotated PDFs if orientation correction is disabled
             print(f"Skipping {pdf_path} due to orientation correction settings.")
             continue
@@ -82,12 +81,6 @@ def test_e2e_conversions(ocr_options: OcrOptions):
         doc_result: ConversionResult = converter.convert(pdf_path)
         try:
-            verify_conversion_result_v1(
-                input_path=pdf_path,
-                doc_result=doc_result,
-                generate=GENERATE_V1,
-                fuzzy=True,
-            )
             verify_conversion_result_v2(
                 input_path=pdf_path,
                 doc_result=doc_result,
@@ -95,7 +88,7 @@ def test_e2e_conversions(ocr_options: OcrOptions):
                 fuzzy=True,
             )
         except AssertionError as e:
-            if "rotated" in pdf_path.name:
+            if "rotated" in pdf_path.name or "rotation" in pdf_path.name or ocr_options.auto_correct_orientation:
                 pytest.xfail(f"Skipping {pdf_path} due to orientation correction settings: {e}")
             else:
                 raise  # Unexpected failure — re-raise the error
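
Note on the test changes above: the two conditions touched in `test_e2e_conversions` can be read as a pair of predicates, one deciding whether a PDF is skipped up front and one deciding whether a failed verification is downgraded to an expected failure. The sketch below restates them in isolation; `should_skip` and `should_xfail` are hypothetical helper names that do not exist in the test module, and the file names are made up.

```python
from pathlib import Path


def _is_rotation_case(pdf_path: Path) -> bool:
    # 0.2.1 matches both "rotated" and "rotation" in the file name.
    return "rotated" in pdf_path.name or "rotation" in pdf_path.name


def should_skip(pdf_path: Path, auto_correct_orientation: bool) -> bool:
    """Skip rotation PDFs when orientation correction is disabled."""
    return not auto_correct_orientation and _is_rotation_case(pdf_path)


def should_xfail(pdf_path: Path, auto_correct_orientation: bool) -> bool:
    """On AssertionError, xfail for rotation PDFs or when correction is enabled."""
    return _is_rotation_case(pdf_path) or auto_correct_orientation


for name in ("plain.pdf", "page_rotated_90.pdf", "rotation_sample.pdf"):
    for flag in (False, True):
        path = Path(name)
        print(name, flag, should_skip(path, flag), should_xfail(path, flag))
```

Read this way, 0.2.1 tolerates verification failures for every document whenever `auto_correct_orientation` is enabled, not only for rotation files.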