natural-pdf 0.1.20__tar.gz → 0.1.22__tar.gz
- {natural_pdf-0.1.20/natural_pdf.egg-info → natural_pdf-0.1.22}/PKG-INFO +19 -6
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/README.md +18 -4
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/installation/index.md +32 -0
- natural_pdf-0.1.22/docs/tutorials/01-loading-and-extraction.ipynb +320 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/tutorials/02-finding-elements.ipynb +42 -42
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/tutorials/03-extracting-blocks.ipynb +17 -17
- natural_pdf-0.1.22/docs/tutorials/04-table-extraction.ipynb +557 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/tutorials/05-excluding-content.ipynb +30 -30
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/tutorials/06-document-qa.ipynb +28 -28
- natural_pdf-0.1.22/docs/tutorials/07-layout-analysis.ipynb +615 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/tutorials/07-working-with-regions.ipynb +58 -58
- natural_pdf-0.1.22/docs/tutorials/08-spatial-navigation.ipynb +512 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/tutorials/09-section-extraction.ipynb +93 -93
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/tutorials/10-form-field-extraction.ipynb +50 -50
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/tutorials/11-enhanced-table-processing.ipynb +6 -6
- natural_pdf-0.1.22/docs/tutorials/12-ocr-integration.ipynb +4197 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/tutorials/13-semantic-search.ipynb +174 -174
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/cli.py +8 -27
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/core/pdf.py +31 -45
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/describe/base.py +2 -2
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/describe/elements.py +1 -1
- {natural_pdf-0.1.20 → natural_pdf-0.1.22/natural_pdf.egg-info}/PKG-INFO +19 -6
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf.egg-info/requires.txt +0 -1
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/pyproject.toml +0 -1
- natural_pdf-0.1.20/docs/tutorials/01-loading-and-extraction.ipynb +0 -320
- natural_pdf-0.1.20/docs/tutorials/04-table-extraction.ipynb +0 -557
- natural_pdf-0.1.20/docs/tutorials/07-layout-analysis.ipynb +0 -615
- natural_pdf-0.1.20/docs/tutorials/08-spatial-navigation.ipynb +0 -512
- natural_pdf-0.1.20/docs/tutorials/12-ocr-integration.ipynb +0 -4197
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/.cursor/rules/analysis_framework.mdc +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/.cursor/rules/coding-style.mdc +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/.cursor/rules/edit-md-instead-of-ipynb.mdc +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/.cursor/rules/minimal-comments.mdc +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/.cursor/rules/natural-pdf-overview.mdc +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/.cursor/rules/user-friendly-library-code.mdc +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/.github/workflows/docs.yml +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/.gitignore +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/.pre-commit-config.yaml +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/01-execute_notebooks.py +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/02-run_all_tutorials.sh +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/CLAUDE.md +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/LICENSE +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/MANIFEST.in +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/audit_packaging.py +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/check_run_md.sh +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/api/index.md +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/assets/favicon.png +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/assets/favicon.svg +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/assets/javascripts/custom.js +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/assets/logo.svg +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/assets/sample-screen.png +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/assets/social-preview.png +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/assets/social-preview.svg +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/assets/stylesheets/custom.css +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/categorizing-documents/index.md +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/data-extraction/index.md +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/describe/index.ipynb +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/describe/index.md +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/document-qa/index.ipynb +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/document-qa/index.md +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/element-selection/index.ipynb +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/element-selection/index.md +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/finetuning/index.md +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/index.md +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/interactive-widget/index.ipynb +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/interactive-widget/index.md +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/layout-analysis/index.ipynb +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/layout-analysis/index.md +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/loops-and-groups/index.ipynb +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/loops-and-groups/index.md +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/ocr/index.md +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/pdf-navigation/index.ipynb +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/pdf-navigation/index.md +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/reflowing-pages/index.ipynb +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/reflowing-pages/index.md +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/regions/index.ipynb +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/regions/index.md +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/tables/index.ipynb +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/tables/index.md +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/text-analysis/index.ipynb +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/text-analysis/index.md +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/text-extraction/index.ipynb +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/text-extraction/index.md +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/tutorials/01-loading-and-extraction.md +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/tutorials/02-finding-elements.md +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/tutorials/03-extracting-blocks.md +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/tutorials/04-table-extraction.md +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/tutorials/05-excluding-content.md +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/tutorials/06-document-qa.md +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/tutorials/07-layout-analysis.md +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/tutorials/07-working-with-regions.md +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/tutorials/08-spatial-navigation.md +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/tutorials/09-section-extraction.md +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/tutorials/10-form-field-extraction.md +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/tutorials/11-enhanced-table-processing.md +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/tutorials/12-ocr-integration.md +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/tutorials/13-semantic-search.md +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/tutorials/14-categorizing-documents.ipynb +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/tutorials/14-categorizing-documents.md +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/visual-debugging/index.ipynb +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/visual-debugging/index.md +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/visual-debugging/region.png +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/mkdocs.yml +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/__init__.py +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/analyzers/__init__.py +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/analyzers/layout/__init__.py +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/analyzers/layout/base.py +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/analyzers/layout/docling.py +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/analyzers/layout/gemini.py +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/analyzers/layout/layout_analyzer.py +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/analyzers/layout/layout_manager.py +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/analyzers/layout/layout_options.py +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/analyzers/layout/paddle.py +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/analyzers/layout/pdfplumber_table_finder.py +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/analyzers/layout/surya.py +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/analyzers/layout/table_structure_utils.py +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/analyzers/layout/tatr.py +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/analyzers/layout/yolo.py +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/analyzers/shape_detection_mixin.py +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/analyzers/text_options.py +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/analyzers/text_structure.py +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/analyzers/utils.py +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/classification/manager.py +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/classification/mixin.py +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/classification/results.py +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/collections/mixins.py +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/collections/pdf_collection.py +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/core/__init__.py +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/core/element_manager.py +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/core/highlighting_service.py +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/core/page.py +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/describe/__init__.py +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/describe/mixin.py +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/describe/summary.py +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/elements/__init__.py +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/elements/base.py +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/elements/collections.py +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/elements/line.py +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/elements/rect.py +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/elements/region.py +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/elements/text.py +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/export/mixin.py +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/exporters/__init__.py +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/exporters/base.py +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/exporters/data/__init__.py +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/exporters/data/pdf.ttf +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/exporters/data/sRGB.icc +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/exporters/hocr.py +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/exporters/hocr_font.py +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/exporters/original_pdf.py +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/exporters/paddleocr.py +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/exporters/searchable_pdf.py +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/extraction/manager.py +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/extraction/mixin.py +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/extraction/result.py +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/flows/__init__.py +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/flows/collections.py +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/flows/element.py +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/flows/flow.py +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/flows/region.py +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/ocr/__init__.py +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/ocr/engine.py +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/ocr/engine_doctr.py +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/ocr/engine_easyocr.py +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/ocr/engine_paddle.py +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/ocr/engine_surya.py +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/ocr/ocr_factory.py +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/ocr/ocr_manager.py +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/ocr/ocr_options.py +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/ocr/utils.py +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/qa/__init__.py +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/qa/document_qa.py +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/search/__init__.py +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/search/lancedb_search_service.py +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/search/numpy_search_service.py +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/search/search_options.py +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/search/search_service_protocol.py +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/search/searchable_mixin.py +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/selectors/__init__.py +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/selectors/parser.py +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/templates/__init__.py +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/templates/finetune/fine_tune_paddleocr.md +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/templates/spa/css/style.css +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/templates/spa/index.html +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/templates/spa/js/app.js +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/templates/spa/words.txt +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/utils/__init__.py +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/utils/debug.py +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/utils/highlighting.py +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/utils/identifiers.py +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/utils/locks.py +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/utils/packaging.py +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/utils/reading_order.py +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/utils/text_extraction.py +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/utils/visualization.py +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/widgets/__init__.py +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/widgets/viewer.py +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf.egg-info/SOURCES.txt +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf.egg-info/dependency_links.txt +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf.egg-info/entry_points.txt +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf.egg-info/top_level.txt +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/noxfile.py +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/pdfs/.gitkeep +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/pdfs/01-practice.pdf +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/pdfs/0500000US42001.pdf +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/pdfs/0500000US42007.pdf +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/pdfs/2014 Statistics.pdf +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/pdfs/2019 Statistics.pdf +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/pdfs/30.pdf +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/pdfs/Atlanta_Public_Schools_GA_sample.pdf +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/pdfs/anexo_edital_6604_1743480-table.pdf +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/pdfs/appendix_fy2026.pdf +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/pdfs/cia-doc.pdf +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/pdfs/geometry.pdf +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/pdfs/image.png +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/pdfs/image.png.pdf +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/pdfs/multicolumn.pdf +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/pdfs/needs-ocr.pdf +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/pdfs/red.pdf +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/pdfs/tiny-ocr-2.pdf +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/pdfs/tiny-ocr-3.pdf +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/pdfs/tiny-ocr-small.jpg +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/pdfs/tiny-ocr-wide.jpg +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/pdfs/tiny-ocr.pdf +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/pdfs/tiny.pdf +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/pdfs/word-counter.pdf +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/publish.sh +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/sample-screen.png +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/setup.cfg +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/test_install.sh +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/tests/conftest.py +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/tests/exporters/test_paddleocr_exporter.py +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/tests/test_core/test_containment_geometry.py +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/tests/test_core/test_elements.py +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/tests/test_core/test_loading.py +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/tests/test_core/test_spatial.py +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/tests/test_core/test_text_extraction.py +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/tests/test_loading_original.py +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/tests/test_optional_deps.py +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/tests/test_tutorials.py +0 -0
- {natural_pdf-0.1.20 → natural_pdf-0.1.22}/uv.lock +0 -0
{natural_pdf-0.1.20/natural_pdf.egg-info → natural_pdf-0.1.22}/PKG-INFO
@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: natural-pdf
-Version: 0.1.
+Version: 0.1.22
 Summary: A more intuitive interface for working with PDFs
 Author-email: Jonathan Soma <jonathan.soma@gmail.com>
 License-Expression: MIT
@@ -21,7 +21,6 @@ Requires-Dist: urllib3
 Requires-Dist: tqdm
 Requires-Dist: pydantic
 Requires-Dist: jenkspy
-Requires-Dist: pikepdf
 Requires-Dist: scipy
 Requires-Dist: torch
 Requires-Dist: torchvision
@@ -88,14 +87,28 @@ Natural PDF lets you find and extract content from PDFs using simple code that m
 pip install natural-pdf
 ```
 
-
+Need OCR engines, layout models, or other heavy add-ons? Install the **core** once, then use the helper CLI to pull in exactly what you need:
 
 ```bash
-#
-
+# add PaddleOCR (+paddlex) after the fact
+npdf install paddle
+
+# Surya OCR and the YOLO Doc-Layout detector in one go
+npdf install surya yolo
+
+# see what's already on your machine
+npdf list
+```
+
+Light-weight extras such as `deskew` or `search` can still be added with
+classic PEP-508 markers if you prefer:
+
+```bash
+pip install "natural-pdf[deskew]"
+pip install "natural-pdf[search]"
 ```
 
-
+More details in the [installation guide](https://jsoma.github.io/natural-pdf/installation/).
 
 ## Quick Start
 
{natural_pdf-0.1.20 → natural_pdf-0.1.22}/README.md
@@ -15,14 +15,28 @@ Natural PDF lets you find and extract content from PDFs using simple code that m
 pip install natural-pdf
 ```
 
-
+Need OCR engines, layout models, or other heavy add-ons? Install the **core** once, then use the helper CLI to pull in exactly what you need:
 
 ```bash
-#
-
+# add PaddleOCR (+paddlex) after the fact
+npdf install paddle
+
+# Surya OCR and the YOLO Doc-Layout detector in one go
+npdf install surya yolo
+
+# see what's already on your machine
+npdf list
+```
+
+Light-weight extras such as `deskew` or `search` can still be added with
+classic PEP-508 markers if you prefer:
+
+```bash
+pip install "natural-pdf[deskew]"
+pip install "natural-pdf[search]"
 ```
 
-
+More details in the [installation guide](https://jsoma.github.io/natural-pdf/installation/).
 
 ## Quick Start
 
{natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/installation/index.md
@@ -30,6 +30,38 @@ pip install natural-pdf[favorites]
 
 Other OCR and layout analysis engines like `surya`, `easyocr`, `paddle`, `doctr`, and `docling` can be installed via `pip` as needed. The library will provide you with an error message and installation command if you try to use an engine that isn't installed.
 
+After the core install you have two ways to add **optional engines**:
+
+### 1 · Helper CLI (recommended)
+
+```bash
+# list optional groups and their install-status
+npdf list
+
+# install PaddleOCR stack
+npdf install paddle
+
+# install Surya OCR + YOLO layout detector
+npdf install surya yolo
+```
+
+The CLI runs each wheel in its own resolver pass, so it avoids strict
+version pins like `paddleocr → paddlex==3.0.1` while still upgrading to
+`paddlex 3.0.2`.
+
+### 2 · Classic extras (for the light stuff)
+
+```bash
+# Deskewing
+pip install "natural-pdf[deskew]"
+
+# Semantic search service
+pip install "natural-pdf[search]"
+```
+
+If you attempt to use an engine that is missing, the library will raise an
+error that tells you which `npdf install …` command to run.
+
 ## Your First PDF Extraction
 
 Here's a quick example to make sure everything is working:
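Note on the installation text added above: it describes two behaviours, installing each optional package in its own resolver pass and raising an error that names the `npdf install …` command when an engine is missing. The Python sketch below illustrates both ideas in a minimal form. It is not natural-pdf's implementation: the function names, the group-to-package mapping, and the error wording are all assumptions made for illustration.

```python
import importlib.util
import sys

# Hypothetical mapping from an install group to pip requirement strings.
# The groups and packages actually used by `npdf install` may differ.
EXAMPLE_GROUPS = {
    "paddle": ["paddlepaddle", "paddleocr", "paddlex"],
    "surya": ["surya-ocr"],
}


def build_install_commands(group, groups=EXAMPLE_GROUPS):
    """Return one pip command per requirement, so each gets its own pip run."""
    return [
        [sys.executable, "-m", "pip", "install", requirement]
        for requirement in groups[group]
    ]


def require_module(module_name, group):
    """Raise ImportError with an install hint if a top-level module is missing."""
    if importlib.util.find_spec(module_name) is None:
        raise ImportError(
            f"'{module_name}' is not installed. Try: npdf install {group}"
        )
```

The sketch only builds argument lists, so it can be inspected without changing the environment. To actually install, each list could be passed to `subprocess.run(cmd, check=True)` in turn.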