natural-pdf 0.1.20__tar.gz → 0.1.22__tar.gz

Files changed (241)
  1. {natural_pdf-0.1.20/natural_pdf.egg-info → natural_pdf-0.1.22}/PKG-INFO +19 -6
  2. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/README.md +18 -4
  3. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/installation/index.md +32 -0
  4. natural_pdf-0.1.22/docs/tutorials/01-loading-and-extraction.ipynb +320 -0
  5. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/tutorials/02-finding-elements.ipynb +42 -42
  6. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/tutorials/03-extracting-blocks.ipynb +17 -17
  7. natural_pdf-0.1.22/docs/tutorials/04-table-extraction.ipynb +557 -0
  8. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/tutorials/05-excluding-content.ipynb +30 -30
  9. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/tutorials/06-document-qa.ipynb +28 -28
  10. natural_pdf-0.1.22/docs/tutorials/07-layout-analysis.ipynb +615 -0
  11. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/tutorials/07-working-with-regions.ipynb +58 -58
  12. natural_pdf-0.1.22/docs/tutorials/08-spatial-navigation.ipynb +512 -0
  13. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/tutorials/09-section-extraction.ipynb +93 -93
  14. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/tutorials/10-form-field-extraction.ipynb +50 -50
  15. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/tutorials/11-enhanced-table-processing.ipynb +6 -6
  16. natural_pdf-0.1.22/docs/tutorials/12-ocr-integration.ipynb +4197 -0
  17. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/tutorials/13-semantic-search.ipynb +174 -174
  18. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/cli.py +8 -27
  19. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/core/pdf.py +31 -45
  20. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/describe/base.py +2 -2
  21. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/describe/elements.py +1 -1
  22. {natural_pdf-0.1.20 → natural_pdf-0.1.22/natural_pdf.egg-info}/PKG-INFO +19 -6
  23. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf.egg-info/requires.txt +0 -1
  24. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/pyproject.toml +0 -1
  25. natural_pdf-0.1.20/docs/tutorials/01-loading-and-extraction.ipynb +0 -320
  26. natural_pdf-0.1.20/docs/tutorials/04-table-extraction.ipynb +0 -557
  27. natural_pdf-0.1.20/docs/tutorials/07-layout-analysis.ipynb +0 -615
  28. natural_pdf-0.1.20/docs/tutorials/08-spatial-navigation.ipynb +0 -512
  29. natural_pdf-0.1.20/docs/tutorials/12-ocr-integration.ipynb +0 -4197
  30. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/.cursor/rules/analysis_framework.mdc +0 -0
  31. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/.cursor/rules/coding-style.mdc +0 -0
  32. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/.cursor/rules/edit-md-instead-of-ipynb.mdc +0 -0
  33. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/.cursor/rules/minimal-comments.mdc +0 -0
  34. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/.cursor/rules/natural-pdf-overview.mdc +0 -0
  35. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/.cursor/rules/user-friendly-library-code.mdc +0 -0
  36. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/.github/workflows/docs.yml +0 -0
  37. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/.gitignore +0 -0
  38. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/.pre-commit-config.yaml +0 -0
  39. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/01-execute_notebooks.py +0 -0
  40. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/02-run_all_tutorials.sh +0 -0
  41. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/CLAUDE.md +0 -0
  42. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/LICENSE +0 -0
  43. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/MANIFEST.in +0 -0
  44. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/audit_packaging.py +0 -0
  45. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/check_run_md.sh +0 -0
  46. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/api/index.md +0 -0
  47. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/assets/favicon.png +0 -0
  48. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/assets/favicon.svg +0 -0
  49. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/assets/javascripts/custom.js +0 -0
  50. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/assets/logo.svg +0 -0
  51. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/assets/sample-screen.png +0 -0
  52. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/assets/social-preview.png +0 -0
  53. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/assets/social-preview.svg +0 -0
  54. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/assets/stylesheets/custom.css +0 -0
  55. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/categorizing-documents/index.md +0 -0
  56. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/data-extraction/index.md +0 -0
  57. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/describe/index.ipynb +0 -0
  58. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/describe/index.md +0 -0
  59. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/document-qa/index.ipynb +0 -0
  60. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/document-qa/index.md +0 -0
  61. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/element-selection/index.ipynb +0 -0
  62. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/element-selection/index.md +0 -0
  63. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/finetuning/index.md +0 -0
  64. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/index.md +0 -0
  65. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/interactive-widget/index.ipynb +0 -0
  66. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/interactive-widget/index.md +0 -0
  67. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/layout-analysis/index.ipynb +0 -0
  68. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/layout-analysis/index.md +0 -0
  69. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/loops-and-groups/index.ipynb +0 -0
  70. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/loops-and-groups/index.md +0 -0
  71. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/ocr/index.md +0 -0
  72. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/pdf-navigation/index.ipynb +0 -0
  73. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/pdf-navigation/index.md +0 -0
  74. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/reflowing-pages/index.ipynb +0 -0
  75. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/reflowing-pages/index.md +0 -0
  76. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/regions/index.ipynb +0 -0
  77. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/regions/index.md +0 -0
  78. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/tables/index.ipynb +0 -0
  79. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/tables/index.md +0 -0
  80. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/text-analysis/index.ipynb +0 -0
  81. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/text-analysis/index.md +0 -0
  82. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/text-extraction/index.ipynb +0 -0
  83. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/text-extraction/index.md +0 -0
  84. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/tutorials/01-loading-and-extraction.md +0 -0
  85. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/tutorials/02-finding-elements.md +0 -0
  86. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/tutorials/03-extracting-blocks.md +0 -0
  87. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/tutorials/04-table-extraction.md +0 -0
  88. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/tutorials/05-excluding-content.md +0 -0
  89. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/tutorials/06-document-qa.md +0 -0
  90. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/tutorials/07-layout-analysis.md +0 -0
  91. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/tutorials/07-working-with-regions.md +0 -0
  92. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/tutorials/08-spatial-navigation.md +0 -0
  93. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/tutorials/09-section-extraction.md +0 -0
  94. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/tutorials/10-form-field-extraction.md +0 -0
  95. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/tutorials/11-enhanced-table-processing.md +0 -0
  96. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/tutorials/12-ocr-integration.md +0 -0
  97. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/tutorials/13-semantic-search.md +0 -0
  98. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/tutorials/14-categorizing-documents.ipynb +0 -0
  99. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/tutorials/14-categorizing-documents.md +0 -0
  100. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/visual-debugging/index.ipynb +0 -0
  101. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/visual-debugging/index.md +0 -0
  102. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/docs/visual-debugging/region.png +0 -0
  103. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/mkdocs.yml +0 -0
  104. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/__init__.py +0 -0
  105. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/analyzers/__init__.py +0 -0
  106. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/analyzers/layout/__init__.py +0 -0
  107. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/analyzers/layout/base.py +0 -0
  108. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/analyzers/layout/docling.py +0 -0
  109. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/analyzers/layout/gemini.py +0 -0
  110. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/analyzers/layout/layout_analyzer.py +0 -0
  111. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/analyzers/layout/layout_manager.py +0 -0
  112. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/analyzers/layout/layout_options.py +0 -0
  113. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/analyzers/layout/paddle.py +0 -0
  114. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/analyzers/layout/pdfplumber_table_finder.py +0 -0
  115. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/analyzers/layout/surya.py +0 -0
  116. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/analyzers/layout/table_structure_utils.py +0 -0
  117. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/analyzers/layout/tatr.py +0 -0
  118. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/analyzers/layout/yolo.py +0 -0
  119. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/analyzers/shape_detection_mixin.py +0 -0
  120. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/analyzers/text_options.py +0 -0
  121. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/analyzers/text_structure.py +0 -0
  122. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/analyzers/utils.py +0 -0
  123. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/classification/manager.py +0 -0
  124. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/classification/mixin.py +0 -0
  125. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/classification/results.py +0 -0
  126. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/collections/mixins.py +0 -0
  127. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/collections/pdf_collection.py +0 -0
  128. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/core/__init__.py +0 -0
  129. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/core/element_manager.py +0 -0
  130. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/core/highlighting_service.py +0 -0
  131. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/core/page.py +0 -0
  132. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/describe/__init__.py +0 -0
  133. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/describe/mixin.py +0 -0
  134. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/describe/summary.py +0 -0
  135. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/elements/__init__.py +0 -0
  136. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/elements/base.py +0 -0
  137. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/elements/collections.py +0 -0
  138. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/elements/line.py +0 -0
  139. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/elements/rect.py +0 -0
  140. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/elements/region.py +0 -0
  141. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/elements/text.py +0 -0
  142. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/export/mixin.py +0 -0
  143. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/exporters/__init__.py +0 -0
  144. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/exporters/base.py +0 -0
  145. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/exporters/data/__init__.py +0 -0
  146. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/exporters/data/pdf.ttf +0 -0
  147. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/exporters/data/sRGB.icc +0 -0
  148. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/exporters/hocr.py +0 -0
  149. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/exporters/hocr_font.py +0 -0
  150. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/exporters/original_pdf.py +0 -0
  151. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/exporters/paddleocr.py +0 -0
  152. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/exporters/searchable_pdf.py +0 -0
  153. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/extraction/manager.py +0 -0
  154. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/extraction/mixin.py +0 -0
  155. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/extraction/result.py +0 -0
  156. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/flows/__init__.py +0 -0
  157. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/flows/collections.py +0 -0
  158. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/flows/element.py +0 -0
  159. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/flows/flow.py +0 -0
  160. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/flows/region.py +0 -0
  161. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/ocr/__init__.py +0 -0
  162. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/ocr/engine.py +0 -0
  163. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/ocr/engine_doctr.py +0 -0
  164. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/ocr/engine_easyocr.py +0 -0
  165. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/ocr/engine_paddle.py +0 -0
  166. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/ocr/engine_surya.py +0 -0
  167. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/ocr/ocr_factory.py +0 -0
  168. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/ocr/ocr_manager.py +0 -0
  169. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/ocr/ocr_options.py +0 -0
  170. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/ocr/utils.py +0 -0
  171. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/qa/__init__.py +0 -0
  172. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/qa/document_qa.py +0 -0
  173. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/search/__init__.py +0 -0
  174. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/search/lancedb_search_service.py +0 -0
  175. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/search/numpy_search_service.py +0 -0
  176. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/search/search_options.py +0 -0
  177. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/search/search_service_protocol.py +0 -0
  178. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/search/searchable_mixin.py +0 -0
  179. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/selectors/__init__.py +0 -0
  180. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/selectors/parser.py +0 -0
  181. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/templates/__init__.py +0 -0
  182. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/templates/finetune/fine_tune_paddleocr.md +0 -0
  183. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/templates/spa/css/style.css +0 -0
  184. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/templates/spa/index.html +0 -0
  185. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/templates/spa/js/app.js +0 -0
  186. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/templates/spa/words.txt +0 -0
  187. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/utils/__init__.py +0 -0
  188. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/utils/debug.py +0 -0
  189. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/utils/highlighting.py +0 -0
  190. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/utils/identifiers.py +0 -0
  191. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/utils/locks.py +0 -0
  192. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/utils/packaging.py +0 -0
  193. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/utils/reading_order.py +0 -0
  194. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/utils/text_extraction.py +0 -0
  195. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/utils/visualization.py +0 -0
  196. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/widgets/__init__.py +0 -0
  197. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf/widgets/viewer.py +0 -0
  198. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf.egg-info/SOURCES.txt +0 -0
  199. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf.egg-info/dependency_links.txt +0 -0
  200. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf.egg-info/entry_points.txt +0 -0
  201. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/natural_pdf.egg-info/top_level.txt +0 -0
  202. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/noxfile.py +0 -0
  203. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/pdfs/.gitkeep +0 -0
  204. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/pdfs/01-practice.pdf +0 -0
  205. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/pdfs/0500000US42001.pdf +0 -0
  206. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/pdfs/0500000US42007.pdf +0 -0
  207. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/pdfs/2014 Statistics.pdf +0 -0
  208. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/pdfs/2019 Statistics.pdf +0 -0
  209. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/pdfs/30.pdf +0 -0
  210. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/pdfs/Atlanta_Public_Schools_GA_sample.pdf +0 -0
  211. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/pdfs/anexo_edital_6604_1743480-table.pdf +0 -0
  212. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/pdfs/appendix_fy2026.pdf +0 -0
  213. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/pdfs/cia-doc.pdf +0 -0
  214. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/pdfs/geometry.pdf +0 -0
  215. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/pdfs/image.png +0 -0
  216. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/pdfs/image.png.pdf +0 -0
  217. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/pdfs/multicolumn.pdf +0 -0
  218. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/pdfs/needs-ocr.pdf +0 -0
  219. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/pdfs/red.pdf +0 -0
  220. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/pdfs/tiny-ocr-2.pdf +0 -0
  221. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/pdfs/tiny-ocr-3.pdf +0 -0
  222. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/pdfs/tiny-ocr-small.jpg +0 -0
  223. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/pdfs/tiny-ocr-wide.jpg +0 -0
  224. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/pdfs/tiny-ocr.pdf +0 -0
  225. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/pdfs/tiny.pdf +0 -0
  226. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/pdfs/word-counter.pdf +0 -0
  227. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/publish.sh +0 -0
  228. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/sample-screen.png +0 -0
  229. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/setup.cfg +0 -0
  230. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/test_install.sh +0 -0
  231. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/tests/conftest.py +0 -0
  232. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/tests/exporters/test_paddleocr_exporter.py +0 -0
  233. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/tests/test_core/test_containment_geometry.py +0 -0
  234. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/tests/test_core/test_elements.py +0 -0
  235. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/tests/test_core/test_loading.py +0 -0
  236. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/tests/test_core/test_spatial.py +0 -0
  237. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/tests/test_core/test_text_extraction.py +0 -0
  238. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/tests/test_loading_original.py +0 -0
  239. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/tests/test_optional_deps.py +0 -0
  240. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/tests/test_tutorials.py +0 -0
  241. {natural_pdf-0.1.20 → natural_pdf-0.1.22}/uv.lock +0 -0
@@ -1,6 +1,6 @@
  Metadata-Version: 2.4
  Name: natural-pdf
- Version: 0.1.20
+ Version: 0.1.22
  Summary: A more intuitive interface for working with PDFs
  Author-email: Jonathan Soma <jonathan.soma@gmail.com>
  License-Expression: MIT
@@ -21,7 +21,6 @@ Requires-Dist: urllib3
  Requires-Dist: tqdm
  Requires-Dist: pydantic
  Requires-Dist: jenkspy
- Requires-Dist: pikepdf
  Requires-Dist: scipy
  Requires-Dist: torch
  Requires-Dist: torchvision
@@ -88,14 +87,28 @@ Natural PDF lets you find and extract content from PDFs using simple code that m
  pip install natural-pdf
  ```
 
- For optional features like specific OCR engines, layout analysis models, or the interactive Jupyter widget, you can install one to two million different extras. If you just want the greatest hits:
+ Need OCR engines, layout models, or other heavy add-ons? Install the **core** once, then use the helper CLI to pull in exactly what you need:
 
  ```bash
- # deskewing, OCR (surya) + layout analysis (yolo), interactive browsing
- pip install natural-pdf[favorites]
+ # add PaddleOCR (+paddlex) after the fact
+ npdf install paddle
+
+ # Surya OCR and the YOLO Doc-Layout detector in one go
+ npdf install surya yolo
+
+ # see what's already on your machine
+ npdf list
+ ```
+
+ Light-weight extras such as `deskew` or `search` can still be added with
+ classic PEP-508 markers if you prefer:
+
+ ```bash
+ pip install "natural-pdf[deskew]"
+ pip install "natural-pdf[search]"
  ```
 
- See the [installation guide](https://jsoma.github.io/natural-pdf/installation/) for more details on extras.
+ More details in the [installation guide](https://jsoma.github.io/natural-pdf/installation/).
 
  ## Quick Start
 
@@ -15,14 +15,28 @@ Natural PDF lets you find and extract content from PDFs using simple code that m
  pip install natural-pdf
  ```
 
- For optional features like specific OCR engines, layout analysis models, or the interactive Jupyter widget, you can install one to two million different extras. If you just want the greatest hits:
+ Need OCR engines, layout models, or other heavy add-ons? Install the **core** once, then use the helper CLI to pull in exactly what you need:
 
  ```bash
- # deskewing, OCR (surya) + layout analysis (yolo), interactive browsing
- pip install natural-pdf[favorites]
+ # add PaddleOCR (+paddlex) after the fact
+ npdf install paddle
+
+ # Surya OCR and the YOLO Doc-Layout detector in one go
+ npdf install surya yolo
+
+ # see what's already on your machine
+ npdf list
+ ```
+
+ Light-weight extras such as `deskew` or `search` can still be added with
+ classic PEP-508 markers if you prefer:
+
+ ```bash
+ pip install "natural-pdf[deskew]"
+ pip install "natural-pdf[search]"
  ```
 
- See the [installation guide](https://jsoma.github.io/natural-pdf/installation/) for more details on extras.
+ More details in the [installation guide](https://jsoma.github.io/natural-pdf/installation/).
 
  ## Quick Start
 
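Note on the hunk above: the new README text says `npdf list` shows which optional groups are already installed. The diff does not show how the CLI does this, so the following is only a rough sketch of one way such a check could work from Python, using the standard library's `importlib.util.find_spec`. The `OPTIONAL_GROUPS` mapping and the `group_status` helper are hypothetical names for illustration, and the module names in the mapping are assumptions, not taken from natural-pdf.

```python
from importlib.util import find_spec

# Hypothetical mapping of install group -> importable module name.
# These module names are assumptions for illustration only.
OPTIONAL_GROUPS = {
    "paddle": "paddleocr",
    "surya": "surya",
    "yolo": "doclayout_yolo",
}


def group_status(groups):
    """Return {group: bool}, True when the group's module can be located."""
    status = {}
    for group, module in groups.items():
        try:
            # find_spec returns None when a top-level module is not installed.
            status[group] = find_spec(module) is not None
        except (ImportError, ValueError):
            # Raised for some broken or partially installed packages.
            status[group] = False
    return status


if __name__ == "__main__":
    for group, ok in group_status(OPTIONAL_GROUPS).items():
        print(f"{group:8s} {'installed' if ok else 'missing'}")
```

Because `find_spec` only locates a module without importing it, a check like this stays fast even when the optional packages are heavy.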
@@ -30,6 +30,38 @@ pip install natural-pdf[favorites]
 
  Other OCR and layout analysis engines like `surya`, `easyocr`, `paddle`, `doctr`, and `docling` can be installed via `pip` as needed. The library will provide you with an error message and installation command if you try to use an engine that isn't installed.
 
+ After the core install you have two ways to add **optional engines**:
+
+ ### 1&nbsp;·&nbsp;Helper CLI (recommended)
+
+ ```bash
+ # list optional groups and their install-status
+ npdf list
+
+ # install PaddleOCR stack
+ npdf install paddle
+
+ # install Surya OCR + YOLO layout detector
+ npdf install surya yolo
+ ```
+
+ The CLI runs each wheel in its own resolver pass, so it avoids strict
+ version pins like `paddleocr → paddlex==3.0.1` while still upgrading to
+ `paddlex 3.0.2`.
+
+ ### 2&nbsp;·&nbsp;Classic extras (for the light stuff)
+
+ ```bash
+ # Deskewing
+ pip install "natural-pdf[deskew]"
+
+ # Semantic search service
+ pip install "natural-pdf[search]"
+ ```
+
+ If you attempt to use an engine that is missing, the library will raise an
+ error that tells you which `npdf install …` command to run.
+
  ## Your First PDF Extraction
 
  Here's a quick example to make sure everything is working: