natural-pdf 0.1.19__tar.gz → 0.1.21__tar.gz

Files changed (241)
  1. {natural_pdf-0.1.19/natural_pdf.egg-info → natural_pdf-0.1.21}/PKG-INFO +26 -25
  2. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/README.md +18 -4
  3. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/docs/installation/index.md +32 -0
  4. natural_pdf-0.1.21/docs/tutorials/01-loading-and-extraction.ipynb +320 -0
  5. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/docs/tutorials/02-finding-elements.ipynb +42 -42
  6. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/docs/tutorials/03-extracting-blocks.ipynb +17 -17
  7. natural_pdf-0.1.21/docs/tutorials/04-table-extraction.ipynb +557 -0
  8. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/docs/tutorials/05-excluding-content.ipynb +30 -30
  9. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/docs/tutorials/06-document-qa.ipynb +28 -28
  10. natural_pdf-0.1.21/docs/tutorials/07-layout-analysis.ipynb +615 -0
  11. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/docs/tutorials/07-working-with-regions.ipynb +58 -58
  12. natural_pdf-0.1.21/docs/tutorials/08-spatial-navigation.ipynb +512 -0
  13. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/docs/tutorials/09-section-extraction.ipynb +92 -92
  14. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/docs/tutorials/10-form-field-extraction.ipynb +50 -50
  15. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/docs/tutorials/11-enhanced-table-processing.ipynb +6 -6
  16. natural_pdf-0.1.21/docs/tutorials/12-ocr-integration.ipynb +4197 -0
  17. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/docs/tutorials/13-semantic-search.ipynb +148 -148
  18. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/docs/tutorials/14-categorizing-documents.ipynb +596 -596
  19. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/natural_pdf/analyzers/layout/layout_manager.py +86 -80
  20. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/natural_pdf/analyzers/layout/yolo.py +2 -2
  21. natural_pdf-0.1.21/natural_pdf/cli.py +134 -0
  22. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/natural_pdf/exporters/__init__.py +1 -0
  23. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/natural_pdf/ocr/engine_paddle.py +1 -1
  24. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/natural_pdf/ocr/ocr_factory.py +9 -9
  25. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/natural_pdf/ocr/ocr_manager.py +1 -9
  26. {natural_pdf-0.1.19 → natural_pdf-0.1.21/natural_pdf.egg-info}/PKG-INFO +26 -25
  27. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/natural_pdf.egg-info/SOURCES.txt +2 -0
  28. natural_pdf-0.1.21/natural_pdf.egg-info/entry_points.txt +3 -0
  29. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/natural_pdf.egg-info/requires.txt +1 -17
  30. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/noxfile.py +4 -1
  31. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/pyproject.toml +12 -26
  32. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/tests/test_core/test_loading.py +0 -1
  33. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/tests/test_loading_original.py +0 -1
  34. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/tests/test_optional_deps.py +0 -9
  35. natural_pdf-0.1.19/docs/tutorials/01-loading-and-extraction.ipynb +0 -320
  36. natural_pdf-0.1.19/docs/tutorials/04-table-extraction.ipynb +0 -557
  37. natural_pdf-0.1.19/docs/tutorials/07-layout-analysis.ipynb +0 -615
  38. natural_pdf-0.1.19/docs/tutorials/08-spatial-navigation.ipynb +0 -512
  39. natural_pdf-0.1.19/docs/tutorials/12-ocr-integration.ipynb +0 -4197
  40. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/.cursor/rules/analysis_framework.mdc +0 -0
  41. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/.cursor/rules/coding-style.mdc +0 -0
  42. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/.cursor/rules/edit-md-instead-of-ipynb.mdc +0 -0
  43. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/.cursor/rules/minimal-comments.mdc +0 -0
  44. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/.cursor/rules/natural-pdf-overview.mdc +0 -0
  45. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/.cursor/rules/user-friendly-library-code.mdc +0 -0
  46. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/.github/workflows/docs.yml +0 -0
  47. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/.gitignore +0 -0
  48. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/.pre-commit-config.yaml +0 -0
  49. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/01-execute_notebooks.py +0 -0
  50. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/02-run_all_tutorials.sh +0 -0
  51. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/CLAUDE.md +0 -0
  52. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/LICENSE +0 -0
  53. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/MANIFEST.in +0 -0
  54. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/audit_packaging.py +0 -0
  55. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/check_run_md.sh +0 -0
  56. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/docs/api/index.md +0 -0
  57. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/docs/assets/favicon.png +0 -0
  58. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/docs/assets/favicon.svg +0 -0
  59. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/docs/assets/javascripts/custom.js +0 -0
  60. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/docs/assets/logo.svg +0 -0
  61. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/docs/assets/sample-screen.png +0 -0
  62. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/docs/assets/social-preview.png +0 -0
  63. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/docs/assets/social-preview.svg +0 -0
  64. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/docs/assets/stylesheets/custom.css +0 -0
  65. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/docs/categorizing-documents/index.md +0 -0
  66. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/docs/data-extraction/index.md +0 -0
  67. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/docs/describe/index.ipynb +0 -0
  68. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/docs/describe/index.md +0 -0
  69. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/docs/document-qa/index.ipynb +0 -0
  70. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/docs/document-qa/index.md +0 -0
  71. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/docs/element-selection/index.ipynb +0 -0
  72. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/docs/element-selection/index.md +0 -0
  73. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/docs/finetuning/index.md +0 -0
  74. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/docs/index.md +0 -0
  75. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/docs/interactive-widget/index.ipynb +0 -0
  76. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/docs/interactive-widget/index.md +0 -0
  77. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/docs/layout-analysis/index.ipynb +0 -0
  78. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/docs/layout-analysis/index.md +0 -0
  79. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/docs/loops-and-groups/index.ipynb +0 -0
  80. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/docs/loops-and-groups/index.md +0 -0
  81. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/docs/ocr/index.md +0 -0
  82. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/docs/pdf-navigation/index.ipynb +0 -0
  83. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/docs/pdf-navigation/index.md +0 -0
  84. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/docs/reflowing-pages/index.ipynb +0 -0
  85. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/docs/reflowing-pages/index.md +0 -0
  86. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/docs/regions/index.ipynb +0 -0
  87. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/docs/regions/index.md +0 -0
  88. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/docs/tables/index.ipynb +0 -0
  89. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/docs/tables/index.md +0 -0
  90. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/docs/text-analysis/index.ipynb +0 -0
  91. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/docs/text-analysis/index.md +0 -0
  92. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/docs/text-extraction/index.ipynb +0 -0
  93. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/docs/text-extraction/index.md +0 -0
  94. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/docs/tutorials/01-loading-and-extraction.md +0 -0
  95. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/docs/tutorials/02-finding-elements.md +0 -0
  96. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/docs/tutorials/03-extracting-blocks.md +0 -0
  97. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/docs/tutorials/04-table-extraction.md +0 -0
  98. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/docs/tutorials/05-excluding-content.md +0 -0
  99. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/docs/tutorials/06-document-qa.md +0 -0
  100. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/docs/tutorials/07-layout-analysis.md +0 -0
  101. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/docs/tutorials/07-working-with-regions.md +0 -0
  102. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/docs/tutorials/08-spatial-navigation.md +0 -0
  103. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/docs/tutorials/09-section-extraction.md +0 -0
  104. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/docs/tutorials/10-form-field-extraction.md +0 -0
  105. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/docs/tutorials/11-enhanced-table-processing.md +0 -0
  106. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/docs/tutorials/12-ocr-integration.md +0 -0
  107. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/docs/tutorials/13-semantic-search.md +0 -0
  108. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/docs/tutorials/14-categorizing-documents.md +0 -0
  109. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/docs/visual-debugging/index.ipynb +0 -0
  110. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/docs/visual-debugging/index.md +0 -0
  111. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/docs/visual-debugging/region.png +0 -0
  112. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/mkdocs.yml +0 -0
  113. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/natural_pdf/__init__.py +0 -0
  114. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/natural_pdf/analyzers/__init__.py +0 -0
  115. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/natural_pdf/analyzers/layout/__init__.py +0 -0
  116. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/natural_pdf/analyzers/layout/base.py +0 -0
  117. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/natural_pdf/analyzers/layout/docling.py +0 -0
  118. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/natural_pdf/analyzers/layout/gemini.py +0 -0
  119. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/natural_pdf/analyzers/layout/layout_analyzer.py +0 -0
  120. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/natural_pdf/analyzers/layout/layout_options.py +0 -0
  121. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/natural_pdf/analyzers/layout/paddle.py +0 -0
  122. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/natural_pdf/analyzers/layout/pdfplumber_table_finder.py +0 -0
  123. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/natural_pdf/analyzers/layout/surya.py +0 -0
  124. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/natural_pdf/analyzers/layout/table_structure_utils.py +0 -0
  125. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/natural_pdf/analyzers/layout/tatr.py +0 -0
  126. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/natural_pdf/analyzers/shape_detection_mixin.py +0 -0
  127. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/natural_pdf/analyzers/text_options.py +0 -0
  128. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/natural_pdf/analyzers/text_structure.py +0 -0
  129. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/natural_pdf/analyzers/utils.py +0 -0
  130. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/natural_pdf/classification/manager.py +0 -0
  131. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/natural_pdf/classification/mixin.py +0 -0
  132. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/natural_pdf/classification/results.py +0 -0
  133. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/natural_pdf/collections/mixins.py +0 -0
  134. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/natural_pdf/collections/pdf_collection.py +0 -0
  135. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/natural_pdf/core/__init__.py +0 -0
  136. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/natural_pdf/core/element_manager.py +0 -0
  137. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/natural_pdf/core/highlighting_service.py +0 -0
  138. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/natural_pdf/core/page.py +0 -0
  139. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/natural_pdf/core/pdf.py +0 -0
  140. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/natural_pdf/describe/__init__.py +0 -0
  141. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/natural_pdf/describe/base.py +0 -0
  142. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/natural_pdf/describe/elements.py +0 -0
  143. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/natural_pdf/describe/mixin.py +0 -0
  144. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/natural_pdf/describe/summary.py +0 -0
  145. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/natural_pdf/elements/__init__.py +0 -0
  146. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/natural_pdf/elements/base.py +0 -0
  147. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/natural_pdf/elements/collections.py +0 -0
  148. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/natural_pdf/elements/line.py +0 -0
  149. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/natural_pdf/elements/rect.py +0 -0
  150. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/natural_pdf/elements/region.py +0 -0
  151. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/natural_pdf/elements/text.py +0 -0
  152. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/natural_pdf/export/mixin.py +0 -0
  153. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/natural_pdf/exporters/base.py +0 -0
  154. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/natural_pdf/exporters/data/__init__.py +0 -0
  155. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/natural_pdf/exporters/data/pdf.ttf +0 -0
  156. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/natural_pdf/exporters/data/sRGB.icc +0 -0
  157. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/natural_pdf/exporters/hocr.py +0 -0
  158. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/natural_pdf/exporters/hocr_font.py +0 -0
  159. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/natural_pdf/exporters/original_pdf.py +0 -0
  160. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/natural_pdf/exporters/paddleocr.py +0 -0
  161. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/natural_pdf/exporters/searchable_pdf.py +0 -0
  162. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/natural_pdf/extraction/manager.py +0 -0
  163. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/natural_pdf/extraction/mixin.py +0 -0
  164. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/natural_pdf/extraction/result.py +0 -0
  165. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/natural_pdf/flows/__init__.py +0 -0
  166. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/natural_pdf/flows/collections.py +0 -0
  167. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/natural_pdf/flows/element.py +0 -0
  168. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/natural_pdf/flows/flow.py +0 -0
  169. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/natural_pdf/flows/region.py +0 -0
  170. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/natural_pdf/ocr/__init__.py +0 -0
  171. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/natural_pdf/ocr/engine.py +0 -0
  172. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/natural_pdf/ocr/engine_doctr.py +0 -0
  173. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/natural_pdf/ocr/engine_easyocr.py +0 -0
  174. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/natural_pdf/ocr/engine_surya.py +0 -0
  175. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/natural_pdf/ocr/ocr_options.py +0 -0
  176. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/natural_pdf/ocr/utils.py +0 -0
  177. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/natural_pdf/qa/__init__.py +0 -0
  178. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/natural_pdf/qa/document_qa.py +0 -0
  179. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/natural_pdf/search/__init__.py +0 -0
  180. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/natural_pdf/search/lancedb_search_service.py +0 -0
  181. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/natural_pdf/search/numpy_search_service.py +0 -0
  182. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/natural_pdf/search/search_options.py +0 -0
  183. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/natural_pdf/search/search_service_protocol.py +0 -0
  184. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/natural_pdf/search/searchable_mixin.py +0 -0
  185. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/natural_pdf/selectors/__init__.py +0 -0
  186. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/natural_pdf/selectors/parser.py +0 -0
  187. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/natural_pdf/templates/__init__.py +0 -0
  188. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/natural_pdf/templates/finetune/fine_tune_paddleocr.md +0 -0
  189. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/natural_pdf/templates/spa/css/style.css +0 -0
  190. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/natural_pdf/templates/spa/index.html +0 -0
  191. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/natural_pdf/templates/spa/js/app.js +0 -0
  192. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/natural_pdf/templates/spa/words.txt +0 -0
  193. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/natural_pdf/utils/__init__.py +0 -0
  194. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/natural_pdf/utils/debug.py +0 -0
  195. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/natural_pdf/utils/highlighting.py +0 -0
  196. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/natural_pdf/utils/identifiers.py +0 -0
  197. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/natural_pdf/utils/locks.py +0 -0
  198. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/natural_pdf/utils/packaging.py +0 -0
  199. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/natural_pdf/utils/reading_order.py +0 -0
  200. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/natural_pdf/utils/text_extraction.py +0 -0
  201. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/natural_pdf/utils/visualization.py +0 -0
  202. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/natural_pdf/widgets/__init__.py +0 -0
  203. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/natural_pdf/widgets/viewer.py +0 -0
  204. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/natural_pdf.egg-info/dependency_links.txt +0 -0
  205. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/natural_pdf.egg-info/top_level.txt +0 -0
  206. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/pdfs/.gitkeep +0 -0
  207. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/pdfs/01-practice.pdf +0 -0
  208. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/pdfs/0500000US42001.pdf +0 -0
  209. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/pdfs/0500000US42007.pdf +0 -0
  210. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/pdfs/2014 Statistics.pdf +0 -0
  211. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/pdfs/2019 Statistics.pdf +0 -0
  212. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/pdfs/30.pdf +0 -0
  213. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/pdfs/Atlanta_Public_Schools_GA_sample.pdf +0 -0
  214. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/pdfs/anexo_edital_6604_1743480-table.pdf +0 -0
  215. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/pdfs/appendix_fy2026.pdf +0 -0
  216. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/pdfs/cia-doc.pdf +0 -0
  217. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/pdfs/geometry.pdf +0 -0
  218. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/pdfs/image.png +0 -0
  219. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/pdfs/image.png.pdf +0 -0
  220. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/pdfs/multicolumn.pdf +0 -0
  221. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/pdfs/needs-ocr.pdf +0 -0
  222. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/pdfs/red.pdf +0 -0
  223. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/pdfs/tiny-ocr-2.pdf +0 -0
  224. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/pdfs/tiny-ocr-3.pdf +0 -0
  225. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/pdfs/tiny-ocr-small.jpg +0 -0
  226. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/pdfs/tiny-ocr-wide.jpg +0 -0
  227. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/pdfs/tiny-ocr.pdf +0 -0
  228. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/pdfs/tiny.pdf +0 -0
  229. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/pdfs/word-counter.pdf +0 -0
  230. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/publish.sh +0 -0
  231. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/sample-screen.png +0 -0
  232. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/setup.cfg +0 -0
  233. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/test_install.sh +0 -0
  234. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/tests/conftest.py +0 -0
  235. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/tests/exporters/test_paddleocr_exporter.py +0 -0
  236. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/tests/test_core/test_containment_geometry.py +0 -0
  237. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/tests/test_core/test_elements.py +0 -0
  238. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/tests/test_core/test_spatial.py +0 -0
  239. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/tests/test_core/test_text_extraction.py +0 -0
  240. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/tests/test_tutorials.py +0 -0
  241. {natural_pdf-0.1.19 → natural_pdf-0.1.21}/uv.lock +0 -0
@@ -1,6 +1,6 @@
  Metadata-Version: 2.4
  Name: natural-pdf
- Version: 0.1.19
+ Version: 0.1.21
  Summary: A more intuitive interface for working with PDFs
  Author-email: Jonathan Soma <jonathan.soma@gmail.com>
  License-Expression: MIT
@@ -34,13 +34,6 @@ Provides-Extra: test
  Requires-Dist: pytest; extra == "test"
  Requires-Dist: pytest-xdist; extra == "test"
  Requires-Dist: setuptools; extra == "test"
- Provides-Extra: search
- Requires-Dist: lancedb; extra == "search"
- Requires-Dist: pyarrow; extra == "search"
- Provides-Extra: favorites
- Requires-Dist: natural-pdf[deskew]; extra == "favorites"
- Requires-Dist: natural-pdf[ocr-export]; extra == "favorites"
- Requires-Dist: natural-pdf[paddle]; extra == "favorites"
  Provides-Extra: dev
  Requires-Dist: black; extra == "dev"
  Requires-Dist: isort; extra == "dev"
@@ -58,25 +51,19 @@ Requires-Dist: nbclient; extra == "dev"
  Requires-Dist: ipykernel; extra == "dev"
  Requires-Dist: pre-commit; extra == "dev"
  Requires-Dist: setuptools; extra == "dev"
- Provides-Extra: deskew
- Requires-Dist: deskew>=1.5; extra == "deskew"
- Requires-Dist: img2pdf; extra == "deskew"
  Provides-Extra: all
  Requires-Dist: natural-pdf[ocr-export]; extra == "all"
  Requires-Dist: natural-pdf[deskew]; extra == "all"
  Requires-Dist: natural-pdf[test]; extra == "all"
  Requires-Dist: natural-pdf[search]; extra == "all"
- Requires-Dist: natural-pdf[extras]; extra == "all"
  Requires-Dist: natural-pdf[favorites]; extra == "all"
- Provides-Extra: paddle
- Requires-Dist: paddlepaddle>=3.0.0; extra == "paddle"
- Requires-Dist: paddleocr>=3.0.1; extra == "paddle"
- Requires-Dist: paddlex>=3.0.1; extra == "paddle"
- Provides-Extra: extras
- Requires-Dist: surya-ocr>=0.13.0; extra == "extras"
- Requires-Dist: doclayout_yolo; extra == "extras"
- Requires-Dist: easyocr; extra == "extras"
- Requires-Dist: natural-pdf[paddle]; extra == "extras"
+ Requires-Dist: natural-pdf[export-extras]; extra == "all"
+ Provides-Extra: deskew
+ Requires-Dist: deskew>=1.5; extra == "deskew"
+ Requires-Dist: img2pdf; extra == "deskew"
+ Provides-Extra: search
+ Requires-Dist: lancedb; extra == "search"
+ Requires-Dist: pyarrow; extra == "search"
  Provides-Extra: ocr-export
  Requires-Dist: pikepdf; extra == "ocr-export"
  Provides-Extra: export-extras
@@ -101,14 +88,28 @@ Natural PDF lets you find and extract content from PDFs using simple code that m
  pip install natural-pdf
  ```

- For optional features like specific OCR engines, layout analysis models, or the interactive Jupyter widget, you can install one to two million different extras. If you just want the greatest hits:
+ Need OCR engines, layout models, or other heavy add-ons? Install the **core** once, then use the helper CLI to pull in exactly what you need:
+
+ ```bash
+ # add PaddleOCR (+paddlex) after the fact
+ npdf install paddle
+
+ # Surya OCR and the YOLO Doc-Layout detector in one go
+ npdf install surya yolo
+
+ # see what's already on your machine
+ npdf list
+ ```
+
+ Light-weight extras such as `deskew` or `search` can still be added with
+ classic PEP-508 markers if you prefer:

  ```bash
- # deskewing, OCR (surya) + layout analysis (yolo), interactive browsing
- pip install natural-pdf[favorites]
+ pip install "natural-pdf[deskew]"
+ pip install "natural-pdf[search]"
  ```

- See the [installation guide](https://jsoma.github.io/natural-pdf/installation/) for more details on extras.
+ More details in the [installation guide](https://jsoma.github.io/natural-pdf/installation/).

  ## Quick Start

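The `Provides-Extra` changes in the PKG-INFO hunks above can be checked mechanically. The following is a minimal sketch, not part of the package: it uses the standard-library `email.parser` to read metadata headers, and the two sample strings are hand-trimmed excerpts containing only the `Provides-Extra` lines visible in this diff (the helper name `declared_extras` and both excerpt constants are made up for illustration).

```python
from email.parser import Parser


def declared_extras(metadata_text: str) -> set[str]:
    """Collect every name declared in a Provides-Extra header."""
    message = Parser().parsestr(metadata_text, headersonly=True)
    return set(message.get_all("Provides-Extra") or [])


# Trimmed excerpts: only the Provides-Extra headers visible in the hunks above,
# not the complete PKG-INFO files.
OLD_EXCERPT = """\
Name: natural-pdf
Version: 0.1.19
Provides-Extra: test
Provides-Extra: search
Provides-Extra: favorites
Provides-Extra: dev
Provides-Extra: deskew
Provides-Extra: all
Provides-Extra: paddle
Provides-Extra: extras
Provides-Extra: ocr-export
Provides-Extra: export-extras
"""

NEW_EXCERPT = """\
Name: natural-pdf
Version: 0.1.21
Provides-Extra: test
Provides-Extra: dev
Provides-Extra: all
Provides-Extra: deskew
Provides-Extra: search
Provides-Extra: ocr-export
Provides-Extra: export-extras
"""

removed = sorted(declared_extras(OLD_EXCERPT) - declared_extras(NEW_EXCERPT))
print(removed)  # ['extras', 'favorites', 'paddle']
```

Run against the real files instead of the excerpts, the same helper would list every extra that one release declares and the other does not.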
@@ -15,14 +15,28 @@ Natural PDF lets you find and extract content from PDFs using simple code that m
  pip install natural-pdf
  ```

- For optional features like specific OCR engines, layout analysis models, or the interactive Jupyter widget, you can install one to two million different extras. If you just want the greatest hits:
+ Need OCR engines, layout models, or other heavy add-ons? Install the **core** once, then use the helper CLI to pull in exactly what you need:

  ```bash
- # deskewing, OCR (surya) + layout analysis (yolo), interactive browsing
- pip install natural-pdf[favorites]
+ # add PaddleOCR (+paddlex) after the fact
+ npdf install paddle
+
+ # Surya OCR and the YOLO Doc-Layout detector in one go
+ npdf install surya yolo
+
+ # see what's already on your machine
+ npdf list
+ ```
+
+ Light-weight extras such as `deskew` or `search` can still be added with
+ classic PEP-508 markers if you prefer:
+
+ ```bash
+ pip install "natural-pdf[deskew]"
+ pip install "natural-pdf[search]"
  ```

- See the [installation guide](https://jsoma.github.io/natural-pdf/installation/) for more details on extras.
+ More details in the [installation guide](https://jsoma.github.io/natural-pdf/installation/).

  ## Quick Start

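The README change above describes `npdf list` as showing what is already on the machine. How the CLI does this is not visible in this part of the diff; one plausible approach, sketched here under that assumption, is to probe for importable modules with `importlib.util.find_spec`. The group-to-module mapping below is hypothetical and chosen only for illustration.

```python
from importlib.util import find_spec

# Hypothetical mapping from an optional group to the top-level modules it
# would provide; the real CLI may use different names or a different check.
OPTIONAL_GROUPS = {
    "paddle": ["paddleocr", "paddlex"],
    "surya": ["surya"],
    "yolo": ["doclayout_yolo"],
}


def install_status(groups: dict[str, list[str]]) -> dict[str, bool]:
    """Report, per group, whether every listed module can be located."""
    return {
        name: all(find_spec(module) is not None for module in modules)
        for name, modules in groups.items()
    }


if __name__ == "__main__":
    for group, installed in install_status(OPTIONAL_GROUPS).items():
        print(f"{group}: {'installed' if installed else 'missing'}")
```

`find_spec` only locates a module without importing it, which keeps the check cheap even when the optional packages are heavy.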
@@ -30,6 +30,38 @@ pip install natural-pdf[favorites]

  Other OCR and layout analysis engines like `surya`, `easyocr`, `paddle`, `doctr`, and `docling` can be installed via `pip` as needed. The library will provide you with an error message and installation command if you try to use an engine that isn't installed.

+ After the core install you have two ways to add **optional engines**:
+
+ ### 1&nbsp;·&nbsp;Helper CLI (recommended)
+
+ ```bash
+ # list optional groups and their install-status
+ npdf list
+
+ # install PaddleOCR stack
+ npdf install paddle
+
+ # install Surya OCR + YOLO layout detector
+ npdf install surya yolo
+ ```
+
+ The CLI runs each wheel in its own resolver pass, so it avoids strict
+ version pins like `paddleocr → paddlex==3.0.1` while still upgrading to
+ `paddlex 3.0.2`.
+
+ ### 2&nbsp;·&nbsp;Classic extras (for the light stuff)
+
+ ```bash
+ # Deskewing
+ pip install "natural-pdf[deskew]"
+
+ # Semantic search service
+ pip install "natural-pdf[search]"
+ ```
+
+ If you attempt to use an engine that is missing, the library will raise an
+ error that tells you which `npdf install …` command to run.
+
  ## Your First PDF Extraction

  Here's a quick example to make sure everything is working:
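The installation guide change above says the CLI installs each package in its own resolver pass. The new `natural_pdf/cli.py` is not shown in this part of the diff, so the following is only a sketch of that idea, assuming one `pip install` subprocess per package. The function names and the example package list are illustrative, not taken from the package.

```python
import subprocess
import sys


def build_install_commands(packages: list[str]) -> list[list[str]]:
    """Build one `pip install` command per package.

    Separate invocations mean pip resolves each package on its own instead of
    solving all requirements together in a single pass.
    """
    return [[sys.executable, "-m", "pip", "install", pkg] for pkg in packages]


def install_each(packages: list[str], dry_run: bool = True) -> None:
    """Run the commands in order; with dry_run=True they are only printed."""
    for command in build_install_commands(packages):
        if dry_run:
            print(" ".join(command))
        else:
            subprocess.check_call(command)


if __name__ == "__main__":
    # Illustrative package list, taken from the old `paddle` extra in PKG-INFO.
    install_each(["paddlepaddle", "paddleocr", "paddlex"])
```

Using `sys.executable -m pip` targets the interpreter that is running the script, so packages land in the same environment as the core install.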