natural-pdf 0.2.10__tar.gz → 0.2.11__tar.gz
- {natural_pdf-0.2.10/natural_pdf.egg-info → natural_pdf-0.2.11}/PKG-INFO +1 -1
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/natural_pdf/analyzers/guides.py +122 -30
- {natural_pdf-0.2.10 → natural_pdf-0.2.11/natural_pdf.egg-info}/PKG-INFO +1 -1
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/tests/test_guides.py +64 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/.cursor/rules/analysis_framework.mdc +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/.cursor/rules/coding-style.mdc +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/.cursor/rules/edit-md-instead-of-ipynb.mdc +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/.cursor/rules/minimal-comments.mdc +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/.cursor/rules/natural-pdf-overview.mdc +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/.cursor/rules/user-friendly-library-code.mdc +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/.github/workflows/ci.yml +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/.github/workflows/docs.yml +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/.github/workflows/nightly-tutorials.yml +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/.gitignore +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/.pre-commit-config.yaml +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/01-execute_notebooks.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/02-run_all_tutorials.sh +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/CLAUDE.md +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/LICENSE +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/MANIFEST.in +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/README.md +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/audit_packaging.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/check_run_md.sh +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/docs/api/index.md +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/docs/assets/favicon.png +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/docs/assets/favicon.svg +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/docs/assets/javascripts/custom.js +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/docs/assets/logo.svg +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/docs/assets/sample-screen.png +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/docs/assets/social-preview.png +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/docs/assets/social-preview.svg +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/docs/assets/stylesheets/custom.css +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/docs/categorizing-documents/index.md +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/docs/data-extraction/index.md +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/docs/describe/index.md +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/docs/document-qa/index.md +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/docs/element-selection/index.md +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/docs/extracting-clean-text/index.md +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/docs/finetuning/index.md +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/docs/fix-messy-tables/index.md +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/docs/fix-messy-tables/table_1.csv +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/docs/fix-messy-tables/table_2.csv +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/docs/fix-messy-tables/table_3.csv +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/docs/index.md +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/docs/installation/index.md +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/docs/interactive-widget/index.md +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/docs/layout-analysis/index.md +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/docs/loops-and-groups/index.md +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/docs/ocr/index.md +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/docs/pdf-navigation/index.md +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/docs/process-forms-and-invoices/extracted_form_data.csv +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/docs/process-forms-and-invoices/index.md +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/docs/quick-reference/index.md +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/docs/reflowing-pages/index.md +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/docs/regions/index.md +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/docs/tables/index.md +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/docs/text-analysis/index.md +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/docs/tutorials/01-loading-and-extraction.md +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/docs/tutorials/02-finding-elements.md +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/docs/tutorials/03-extracting-blocks.md +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/docs/tutorials/04-table-extraction.md +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/docs/tutorials/05-excluding-content.md +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/docs/tutorials/06-document-qa.md +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/docs/tutorials/07-layout-analysis.md +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/docs/tutorials/07-working-with-regions.md +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/docs/tutorials/08-spatial-navigation.md +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/docs/tutorials/09-section-extraction.md +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/docs/tutorials/10-form-field-extraction.md +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/docs/tutorials/11-enhanced-table-processing.md +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/docs/tutorials/12-ocr-integration.md +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/docs/tutorials/13-semantic-search.md +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/docs/tutorials/14-categorizing-documents.md +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/docs/visual-debugging/index.md +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/docs/visual-debugging/region.png +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/mkdocs.yml +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/natural_pdf/__init__.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/natural_pdf/analyzers/__init__.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/natural_pdf/analyzers/layout/__init__.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/natural_pdf/analyzers/layout/base.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/natural_pdf/analyzers/layout/docling.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/natural_pdf/analyzers/layout/gemini.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/natural_pdf/analyzers/layout/layout_analyzer.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/natural_pdf/analyzers/layout/layout_manager.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/natural_pdf/analyzers/layout/layout_options.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/natural_pdf/analyzers/layout/paddle.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/natural_pdf/analyzers/layout/pdfplumber_table_finder.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/natural_pdf/analyzers/layout/surya.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/natural_pdf/analyzers/layout/table_structure_utils.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/natural_pdf/analyzers/layout/tatr.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/natural_pdf/analyzers/layout/yolo.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/natural_pdf/analyzers/shape_detection_mixin.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/natural_pdf/analyzers/text_options.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/natural_pdf/analyzers/text_structure.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/natural_pdf/analyzers/utils.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/natural_pdf/classification/manager.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/natural_pdf/classification/mixin.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/natural_pdf/classification/results.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/natural_pdf/cli.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/natural_pdf/collections/mixins.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/natural_pdf/core/__init__.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/natural_pdf/core/element_manager.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/natural_pdf/core/highlighting_service.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/natural_pdf/core/page.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/natural_pdf/core/page_collection.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/natural_pdf/core/page_groupby.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/natural_pdf/core/pdf.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/natural_pdf/core/pdf_collection.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/natural_pdf/core/render_spec.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/natural_pdf/describe/__init__.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/natural_pdf/describe/base.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/natural_pdf/describe/elements.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/natural_pdf/describe/mixin.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/natural_pdf/describe/summary.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/natural_pdf/elements/__init__.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/natural_pdf/elements/base.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/natural_pdf/elements/element_collection.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/natural_pdf/elements/image.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/natural_pdf/elements/line.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/natural_pdf/elements/rect.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/natural_pdf/elements/region.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/natural_pdf/elements/text.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/natural_pdf/export/mixin.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/natural_pdf/exporters/__init__.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/natural_pdf/exporters/base.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/natural_pdf/exporters/data/__init__.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/natural_pdf/exporters/data/pdf.ttf +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/natural_pdf/exporters/data/sRGB.icc +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/natural_pdf/exporters/hocr.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/natural_pdf/exporters/hocr_font.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/natural_pdf/exporters/original_pdf.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/natural_pdf/exporters/paddleocr.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/natural_pdf/exporters/searchable_pdf.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/natural_pdf/extraction/manager.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/natural_pdf/extraction/mixin.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/natural_pdf/extraction/result.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/natural_pdf/flows/__init__.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/natural_pdf/flows/collections.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/natural_pdf/flows/element.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/natural_pdf/flows/flow.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/natural_pdf/flows/region.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/natural_pdf/ocr/__init__.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/natural_pdf/ocr/engine.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/natural_pdf/ocr/engine_doctr.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/natural_pdf/ocr/engine_easyocr.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/natural_pdf/ocr/engine_paddle.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/natural_pdf/ocr/engine_surya.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/natural_pdf/ocr/ocr_factory.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/natural_pdf/ocr/ocr_manager.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/natural_pdf/ocr/ocr_options.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/natural_pdf/ocr/utils.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/natural_pdf/qa/__init__.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/natural_pdf/qa/document_qa.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/natural_pdf/qa/qa_result.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/natural_pdf/search/__init__.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/natural_pdf/search/lancedb_search_service.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/natural_pdf/search/numpy_search_service.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/natural_pdf/search/search_options.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/natural_pdf/search/search_service_protocol.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/natural_pdf/search/searchable_mixin.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/natural_pdf/selectors/__init__.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/natural_pdf/selectors/parser.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/natural_pdf/tables/__init__.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/natural_pdf/tables/result.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/natural_pdf/templates/__init__.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/natural_pdf/templates/finetune/fine_tune_paddleocr.md +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/natural_pdf/templates/spa/css/style.css +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/natural_pdf/templates/spa/index.html +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/natural_pdf/templates/spa/js/app.js +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/natural_pdf/templates/spa/words.txt +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/natural_pdf/text_mixin.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/natural_pdf/utils/__init__.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/natural_pdf/utils/bidi_mirror.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/natural_pdf/utils/color_utils.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/natural_pdf/utils/debug.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/natural_pdf/utils/highlighting.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/natural_pdf/utils/identifiers.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/natural_pdf/utils/layout.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/natural_pdf/utils/locks.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/natural_pdf/utils/packaging.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/natural_pdf/utils/reading_order.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/natural_pdf/utils/text_extraction.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/natural_pdf/utils/visualization.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/natural_pdf/vision/__init__.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/natural_pdf/vision/mixin.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/natural_pdf/vision/results.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/natural_pdf/vision/similarity.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/natural_pdf/widgets/__init__.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/natural_pdf/widgets/viewer.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/natural_pdf.egg-info/SOURCES.txt +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/natural_pdf.egg-info/dependency_links.txt +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/natural_pdf.egg-info/entry_points.txt +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/natural_pdf.egg-info/requires.txt +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/natural_pdf.egg-info/top_level.txt +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/noxfile.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/optimization/memory_comparison.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/optimization/pdf_analyzer.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/optimization/performance_analysis.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/optimization/performance_results/image_heavy_snapshots.csv +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/optimization/performance_results/image_heavy_snapshots.json +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/optimization/performance_results/text_heavy_snapshots.csv +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/optimization/performance_results/text_heavy_snapshots.json +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/optimization/test_cleanup_methods.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/optimization/test_memory_fix.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/publish.sh +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/pyproject.toml +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/sample-screen.png +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/setup.cfg +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/tests/conftest.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/tests/exporters/test_paddleocr_exporter.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/tests/test_annotate.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/tests/test_arabic_performance.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/tests/test_arabic_real_world.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/tests/test_color_conversion.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/tests/test_color_hex_display.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/tests/test_core/test_containment_geometry.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/tests/test_core/test_elements.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/tests/test_core/test_loading.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/tests/test_core/test_spatial.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/tests/test_core/test_text_extraction.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/tests/test_core/test_text_layer.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/tests/test_crop_enhancements.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/tests/test_crop_region_highlights.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/tests/test_directional_defaults.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/tests/test_dissolve.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/tests/test_dissolve_cross_page_bug.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/tests/test_dissolve_debug_issue.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/tests/test_dissolve_real_world_issue.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/tests/test_dissolve_single_elements.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/tests/test_dissolve_vertical_offset_issue.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/tests/test_document_qa.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/tests/test_element_addition.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/tests/test_element_collection_show_cols.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/tests/test_element_collection_slicing.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/tests/test_element_show_crop_highlights.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/tests/test_empty_pseudo_class.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/tests/test_exclusions.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/tests/test_expand.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/tests/test_extraction_error.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/tests/test_extraction_mixin_fix.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/tests/test_extraction_text_and_vision.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/tests/test_extraction_working.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/tests/test_find_similar.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/tests/test_first_last_selectors.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/tests/test_fix_get_sections_zero_height.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/tests/test_flow_region_directional.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/tests/test_get_sections_fix_comprehensive.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/tests/test_get_sections_zero_height.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/tests/test_groupby.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/tests/test_guides_apply_exclusions.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/tests/test_guides_apply_exclusions_simple.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/tests/test_guides_extract_table.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/tests/test_guides_extract_table_collections.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/tests/test_guides_extract_table_exclusions.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/tests/test_guides_extract_table_real.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/tests/test_guides_integration.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/tests/test_highlight_detection.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/tests/test_highlight_detection_comprehensive.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/tests/test_highlight_protocol.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/tests/test_highlight_protocol_simple.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/tests/test_highlight_regions.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/tests/test_include_boundaries_comprehensive.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/tests/test_include_boundaries_debug.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/tests/test_include_boundaries_final.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/tests/test_include_boundaries_final_verification.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/tests/test_include_boundaries_fix.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/tests/test_include_boundaries_mock.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/tests/test_include_boundaries_simple.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/tests/test_include_boundaries_types_pdf.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/tests/test_include_boundaries_verification.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/tests/test_include_boundaries_with_real_text.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/tests/test_loading_original.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/tests/test_merge_connected.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/tests/test_merge_connected_real_world.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/tests/test_merge_method.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/tests/test_multi_page_table_discovery.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/tests/test_optional_deps.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/tests/test_page_exclusion_lists.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/tests/test_pdf_add_exclusion_elementcollection.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/tests/test_region_show_crop_highlights.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/tests/test_region_viewer.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/tests/test_sections_end_only.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/tests/test_sections_with_start_and_end.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/tests/test_show_column_layout.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/tests/test_show_edge_cases.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/tests/test_show_exclusions.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/tests/test_show_exclusions_feature.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/tests/test_show_limit.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/tests/test_skip_repeating_headers_multipage.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/tests/test_slice_cache_reuse.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/tests/test_slice_exclusion_fix.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/tests/test_slice_exclusion_issue.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/tests/test_slice_exclusion_mock.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/tests/test_sliced_collection_exclusions.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/tests/test_strikethrough_detection.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/tests/test_table_result_header_mismatch.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/tests/test_table_result_keep_blank.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/tests/test_tiny_text_tables.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/tests/test_tiny_text_tables_table.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/tests/test_tutorials.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/tests/test_underline_detection.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/tests/test_update_text.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/todo/bad_pdf_analysis.md +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/todo/evaluation.md +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/tools/bad_pdf_eval/IMPROVEMENTS_SUMMARY.md +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/tools/bad_pdf_eval/LLM_NaturalPDF_CheatSheet.md +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/tools/bad_pdf_eval/LLM_NaturalPDF_Workflows.md +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/tools/bad_pdf_eval/README.md +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/tools/bad_pdf_eval/__init__.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/tools/bad_pdf_eval/analyser.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/tools/bad_pdf_eval/collate_summaries.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/tools/bad_pdf_eval/compile_attempts_markdown.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/tools/bad_pdf_eval/eval_suite.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/tools/bad_pdf_eval/evaluate_quality.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/tools/bad_pdf_eval/export_enrichment_csv.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/tools/bad_pdf_eval/extraction_decision_tree.md +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/tools/bad_pdf_eval/llm_enrich.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/tools/bad_pdf_eval/llm_enrich_with_retry.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/tools/bad_pdf_eval/reporter.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/tools/bad_pdf_eval/utils.py +0 -0
- {natural_pdf-0.2.10 → natural_pdf-0.2.11}/uv.lock +0 -0
@@ -4246,12 +4246,26 @@ class _ColumnAccessor:
         """Return number of columns (vertical guides - 1)."""
         return max(0, len(self._guides.vertical) - 1)

-    def __getitem__(self, index: int) -> "Region":
-        """Get column at the specified index."""
-
-
-
-
+    def __getitem__(self, index: Union[int, slice]) -> Union["Region", "ElementCollection"]:
+        """Get column at the specified index or slice."""
+        from natural_pdf.elements.element_collection import ElementCollection
+
+        if isinstance(index, slice):
+            # Handle slice notation - return multiple columns
+            columns = []
+            num_cols = len(self)
+
+            # Convert slice to range of indices
+            start, stop, step = index.indices(num_cols)
+            for i in range(start, stop, step):
+                columns.append(self._guides.column(i))
+
+            return ElementCollection(columns)
+        else:
+            # Handle negative indexing
+            if index < 0:
+                index = len(self) + index
+            return self._guides.column(index)


 class _RowAccessor:
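The slice branch in this hunk relies on Python's built-in `slice.indices(length)`, which normalizes a slice against a sequence length into a concrete `(start, stop, step)` triple. The following is a minimal standalone sketch of that behaviour, independent of natural-pdf; the helper name `resolve` is made up for illustration and does not exist in the package.

```python
from typing import List


def resolve(index: slice, length: int) -> List[int]:
    """Expand a slice into the concrete indices it selects for a given length."""
    # slice.indices() fills in defaults, maps negative bounds, and clamps to length
    start, stop, step = index.indices(length)
    return list(range(start, stop, step))


# Five columns, i.e. the situation where len(guides.columns) == 5
print(resolve(slice(None), 5))           # [:]     -> [0, 1, 2, 3, 4]
print(resolve(slice(1, 3), 5))           # [1:3]   -> [1, 2]
print(resolve(slice(-2, None), 5))       # [-2:]   -> [3, 4]
print(resolve(slice(None, None, 2), 5))  # [::2]   -> [0, 2, 4]
print(resolve(slice(2, 100), 5))         # [2:100] -> [2, 3, 4] (stop is clamped)
```

Because the bounds are clamped, an out-of-range slice yields a shorter (possibly empty) result instead of raising, which matches how slicing behaves on built-in lists.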
@@ -4264,12 +4278,26 @@ class _RowAccessor:
         """Return number of rows (horizontal guides - 1)."""
         return max(0, len(self._guides.horizontal) - 1)

-    def __getitem__(self, index: int) -> "Region":
-        """Get row at the specified index."""
-
-
-
-
+    def __getitem__(self, index: Union[int, slice]) -> Union["Region", "ElementCollection"]:
+        """Get row at the specified index or slice."""
+        from natural_pdf.elements.element_collection import ElementCollection
+
+        if isinstance(index, slice):
+            # Handle slice notation - return multiple rows
+            rows = []
+            num_rows = len(self)
+
+            # Convert slice to range of indices
+            start, stop, step = index.indices(num_rows)
+            for i in range(start, stop, step):
+                rows.append(self._guides.row(i))
+
+            return ElementCollection(rows)
+        else:
+            # Handle negative indexing
+            if index < 0:
+                index = len(self) + index
+            return self._guides.row(index)


 class _CellAccessor:
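The row and column accessors share one dispatch pattern: a slice returns a collection, an integer returns a single item, and negative integers are shifted by the length first. The sketch below mirrors that pattern with a hypothetical stand-in class (`RowAccessorSketch`, not part of natural-pdf) that holds plain strings instead of `Region` objects and returns a `list` where the real code returns an `ElementCollection`.

```python
from typing import List, Union


class RowAccessorSketch:
    """Hypothetical stand-in: rows are plain strings instead of Region objects."""

    def __init__(self, rows: List[str]):
        self._rows = rows

    def __len__(self) -> int:
        return len(self._rows)

    def __getitem__(self, index: Union[int, slice]) -> Union[str, List[str]]:
        if isinstance(index, slice):
            # Slice: resolve against the current length and collect each item
            start, stop, step = index.indices(len(self))
            return [self._rows[i] for i in range(start, stop, step)]
        # Integer: map a negative index onto its non-negative equivalent
        if index < 0:
            index = len(self) + index
        return self._rows[index]


rows = RowAccessorSketch(["header", "row 1", "row 2", "total"])
print(rows[0])     # header
print(rows[-1])    # total
print(rows[1:3])   # ['row 1', 'row 2']
print(rows[::-1])  # ['total', 'row 2', 'row 1', 'header']
```

Returning a different type for slices than for integers follows the convention of built-in sequences, so callers that already index with integers are unaffected.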
@@ -4278,33 +4306,82 @@ class _CellAccessor:
     def __init__(self, guides: "Guides"):
         self._guides = guides

-    def __getitem__(self, key) -> Union["Region", "_CellRowAccessor"]:
+    def __getitem__(self, key) -> Union["Region", "_CellRowAccessor", "ElementCollection"]:
         """
         Get cell(s) at the specified position.

         Supports:
-        - guides.cells[row, col] -
-        - guides.cells[row][col] - nested
+        - guides.cells[row, col] - single cell
+        - guides.cells[row][col] - single cell (nested)
+        - guides.cells[row, :] - all cells in a row
+        - guides.cells[:, col] - all cells in a column
+        - guides.cells[:, :] - all cells
+        - guides.cells[row][:] - all cells in a row (nested)
         """
+        from natural_pdf.elements.element_collection import ElementCollection
+
         if isinstance(key, tuple) and len(key) == 2:
-            # Direct tuple access: guides.cells[row, col]
             row, col = key
-
-
-
-
-
-
+
+            # Handle slices for row and/or column
+            if isinstance(row, slice) or isinstance(col, slice):
+                cells = []
+                num_rows = len(self._guides.rows)
+                num_cols = len(self._guides.columns)
+
+                # Convert slices to ranges
+                if isinstance(row, slice):
+                    row_indices = range(*row.indices(num_rows))
+                else:
+                    # Single row index
+                    if row < 0:
+                        row = num_rows + row
+                    row_indices = [row]
+
+                if isinstance(col, slice):
+                    col_indices = range(*col.indices(num_cols))
+                else:
+                    # Single column index
+                    if col < 0:
+                        col = num_cols + col
+                    col_indices = [col]
+
+                # Collect all cells in the specified ranges
+                for r in row_indices:
+                    for c in col_indices:
+                        cells.append(self._guides.cell(r, c))
+
+                return ElementCollection(cells)
+            else:
+                # Both are integers - single cell access
+                # Handle negative indexing for both row and col
+                if row < 0:
+                    row = len(self._guides.rows) + row
+                if col < 0:
+                    col = len(self._guides.columns) + col
+                return self._guides.cell(row, col)
+        elif isinstance(key, slice):
+            # First level slice: guides.cells[:] - return all rows as accessors
+            # For now, let's return all cells flattened
+            cells = []
+            num_rows = len(self._guides.rows)
+            row_indices = range(*key.indices(num_rows))
+
+            for r in row_indices:
+                for c in range(len(self._guides.columns)):
+                    cells.append(self._guides.cell(r, c))
+
+            return ElementCollection(cells)
         elif isinstance(key, int):
             # First level of nested access: guides.cells[row]
             # Handle negative indexing for row
             if key < 0:
                 key = len(self._guides.rows) + key
-            # Return a row accessor that allows [col] indexing
+            # Return a row accessor that allows [col] or [:] indexing
             return _CellRowAccessor(self._guides, key)
         else:
             raise TypeError(
-                f"Cell indices must be integers or tuple of two integers, got {type(key)}"
+                f"Cell indices must be integers, slices, or tuple of two integers/slices, got {type(key)}"
             )


@@ -4315,9 +4392,24 @@ class _CellRowAccessor:
|
|
4315
4392
|
self._guides = guides
|
4316
4393
|
self._row = row
|
4317
4394
|
|
4318
|
-
def __getitem__(self, col: int) -> "Region":
|
4319
|
-
"""Get cell at [row][col]."""
|
4320
|
-
|
4321
|
-
|
4322
|
-
|
4323
|
-
|
4395
|
+
def __getitem__(self, col: Union[int, slice]) -> Union["Region", "ElementCollection"]:
|
4396
|
+
"""Get cell at [row][col] or all cells in row with [row][:]."""
|
4397
|
+
from natural_pdf.elements.element_collection import ElementCollection
|
4398
|
+
|
4399
|
+
if isinstance(col, slice):
|
4400
|
+
# Handle slice notation - return all cells in this row
|
4401
|
+
cells = []
|
4402
|
+
num_cols = len(self._guides.columns)
|
4403
|
+
|
4404
|
+
# Convert slice to range of indices
|
4405
|
+
start, stop, step = col.indices(num_cols)
|
4406
|
+
for c in range(start, stop, step):
|
4407
|
+
cells.append(self._guides.cell(self._row, c))
|
4408
|
+
|
4409
|
+
return ElementCollection(cells)
|
4410
|
+
else:
|
4411
|
+
# Handle single column index
|
4412
|
+
# Handle negative indexing for column
|
4413
|
+
if col < 0:
|
4414
|
+
col = len(self._guides.columns) + col
|
4415
|
+
return self._guides.cell(self._row, col)
|
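Reviewer note: the dispatch added to the cells accessor above can be summarized with a small stand-alone sketch. `ToyCells`, `_ToyRow` and `resolve` are hypothetical names used only for illustration; they are not part of natural-pdf. Cells are modelled as plain `(row, col)` tuples and a list stands in for `ElementCollection`. Bounds checking for integer indices is left out, on the assumption that the real `cell()` method handles it.

```python
from typing import List, Tuple, Union

Index = Union[int, slice]


def resolve(index: Index, size: int) -> Tuple[List[int], bool]:
    """Return (indices, is_slice) for one axis that has `size` entries."""
    if isinstance(index, slice):
        # slice.indices fills in defaults and clamps bounds to the axis size
        return list(range(*index.indices(size))), True
    # Integer: wrap a negative index once, as the diff does
    return [index + size if index < 0 else index], False


class _ToyRow:
    """Hypothetical row accessor enabling cells[row][col] and cells[row][:]."""

    def __init__(self, parent: "ToyCells", row: int) -> None:
        self._parent = parent
        self._row = row

    def __getitem__(self, col: Index):
        return self._parent[self._row, col]


class ToyCells:
    """Hypothetical stand-in for the cells accessor of a rows x cols grid."""

    def __init__(self, num_rows: int, num_cols: int) -> None:
        self.num_rows = num_rows
        self.num_cols = num_cols

    def __getitem__(self, key):
        if isinstance(key, tuple) and len(key) == 2:
            rows, row_is_slice = resolve(key[0], self.num_rows)
            cols, col_is_slice = resolve(key[1], self.num_cols)
            cells = [(r, c) for r in rows for c in cols]  # row-major order
            # Two integers give one cell; any slice gives a list of cells
            return cells if (row_is_slice or col_is_slice) else cells[0]
        if isinstance(key, slice):
            # cells[:] is treated like cells[:, :] (flattened)
            return self[key, slice(None)]
        if isinstance(key, int):
            return _ToyRow(self, key)
        raise TypeError(f"unsupported index type: {type(key)}")


grid = ToyCells(3, 3)
print(grid[0, :])   # all cells in the first row
print(grid[:, 0])   # all cells in the first column
print(grid[-1][:])  # last row via nested access
```

The sketch folds the integer and slice cases into one helper, whereas the diff spells them out in separate branches; the resulting order (rows outer, columns inner) is the same.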
{natural_pdf-0.2.10 → natural_pdf-0.2.11}/tests/test_guides.py
RENAMED
@@ -637,5 +637,69 @@ def test_property_accessors_with_negative_indexing():
|
|
637
637
|
_ = guides.cells[0, -4] # Column index out of bounds
|
638
638
|
|
639
639
|
|
640
|
+
def test_property_accessors_with_slicing():
|
641
|
+
"""Test property-based accessors with slice notation."""
|
642
|
+
pdf = PDF("pdfs/01-practice.pdf")
|
643
|
+
page = pdf.pages[0]
|
644
|
+
|
645
|
+
# Create guides with 3x3 grid
|
646
|
+
guides = Guides(page)
|
647
|
+
guides.vertical.divide(3) # Creates 4 vertical guides = 3 columns
|
648
|
+
guides.horizontal.divide(3) # Creates 4 horizontal guides = 3 rows
|
649
|
+
|
650
|
+
# Test getting all cells in a row
|
651
|
+
row_cells = guides.cells[0][:]
|
652
|
+
assert hasattr(row_cells, "__len__") # Should be an ElementCollection
|
653
|
+
assert len(row_cells) == 3 # 3 cells in a row
|
654
|
+
|
655
|
+
# Test getting all cells in a row with tuple notation
|
656
|
+
row_cells_tuple = guides.cells[0, :]
|
657
|
+
assert len(row_cells_tuple) == 3
|
658
|
+
# Should contain same cells
|
659
|
+
assert all(c1.x0 == c2.x0 and c1.top == c2.top for c1, c2 in zip(row_cells, row_cells_tuple))
|
660
|
+
|
661
|
+
# Test getting all cells in a column
|
662
|
+
col_cells = guides.cells[:, 0]
|
663
|
+
assert len(col_cells) == 3 # 3 cells in a column
|
664
|
+
|
665
|
+
# Test getting all cells
|
666
|
+
all_cells = guides.cells[:, :]
|
667
|
+
assert len(all_cells) == 9 # 3x3 = 9 cells
|
668
|
+
|
669
|
+
# Test getting all rows
|
670
|
+
all_rows = guides.rows[:]
|
671
|
+
assert len(all_rows) == 3
|
672
|
+
|
673
|
+
# Test getting all columns
|
674
|
+
all_cols = guides.columns[:]
|
675
|
+
assert len(all_cols) == 3
|
676
|
+
|
677
|
+
# Test slice with step
|
678
|
+
every_other_col = guides.columns[::2]
|
679
|
+
assert len(every_other_col) == 2 # columns 0 and 2
|
680
|
+
|
681
|
+
# Test negative indices in slices
|
682
|
+
last_row_cells = guides.cells[-1, :]
|
683
|
+
assert len(last_row_cells) == 3
|
684
|
+
|
685
|
+
# Test partial slices
|
686
|
+
first_two_rows = guides.rows[:2]
|
687
|
+
assert len(first_two_rows) == 2
|
688
|
+
|
689
|
+
last_two_cols = guides.columns[-2:]
|
690
|
+
assert len(last_two_cols) == 2
|
691
|
+
|
692
|
+
# Test that cells are in correct order
|
693
|
+
# First row cells should be ordered left to right
|
694
|
+
first_row = guides.cells[0, :]
|
695
|
+
for i in range(len(first_row) - 1):
|
696
|
+
assert first_row[i].x0 < first_row[i + 1].x0
|
697
|
+
|
698
|
+
# First column cells should be ordered top to bottom
|
699
|
+
first_col = guides.cells[:, 0]
|
700
|
+
for i in range(len(first_col) - 1):
|
701
|
+
assert first_col[i].top < first_col[i + 1].top
|
702
|
+
|
703
|
+
|
640
704
|
if __name__ == "__main__":
|
641
705
|
pytest.main([__file__])
|
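Reviewer note: the new test exercises in-range slices only. Because the implementation relies on `slice.indices`, a few edge cases behave differently from integer indexing. The snippet below uses only built-in Python and an assumed width of 3 columns, matching the 3x3 grid in the test; it does not call natural-pdf.

```python
num_cols = 3  # assumed grid width, matching the 3x3 grid in the test

# Out-of-range slice bounds are clamped rather than rejected
clamped = list(range(*slice(-10, 10).indices(num_cols)))

# A step of 2 selects every other column
stepped = list(range(*slice(None, None, 2).indices(num_cols)))

# A negative step walks the columns right to left
reversed_cols = list(range(*slice(None, None, -1).indices(num_cols)))

# A start past the end yields an empty selection instead of an error
empty = list(range(*slice(5, None).indices(num_cols)))

print(clamped, stepped, reversed_cols, empty)
```

If the accessor behaves as the diff suggests, an expression such as `guides.cells[0, 5:]` would presumably return an empty collection rather than raise, and `guides.cells[0, ::-1]` would return the row right to left. Neither case is covered by the test above, so treat both as assumptions.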
File without changes
|
{natural_pdf-0.2.10 → natural_pdf-0.2.11}/docs/process-forms-and-invoices/extracted_form_data.csv
RENAMED
File without changes
|
{natural_pdf-0.2.10 → natural_pdf-0.2.11}/natural_pdf/analyzers/layout/pdfplumber_table_finder.py
RENAMED
File without changes
|
{natural_pdf-0.2.10 → natural_pdf-0.2.11}/natural_pdf/analyzers/layout/table_structure_utils.py
RENAMED
File without changes
|