PyPI - doctra - Versions diffs - 0.1.1__tar.gz → 0.3.0__tar.gz - Mend

doctra 0.1.1.tar.gz → 0.3.0.tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (56)

{doctra-0.1.1/doctra.egg-info → doctra-0.3.0}/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: doctra
-Version: 0.1.1
+Version: 0.3.0
 Summary: Parse, extract, and analyze documents with ease
 Home-page: https://github.com/AdemBoukhris457/Doctra
 Author: Adem Boukhris
@@ -241,6 +241,8 @@ Provides-Extra: openai
 Requires-Dist: openai>=1.0.0; extra == "openai"
 Provides-Extra: gemini
 Requires-Dist: google-generativeai>=0.3.0; extra == "gemini"
+Provides-Extra: anthropic
+Requires-Dist: anthropic>=0.40.0; extra == "anthropic"
 Provides-Extra: dev
 Requires-Dist: pytest>=6.0; extra == "dev"
 Requires-Dist: pytest-cov>=2.0; extra == "dev"
@@ -329,7 +331,7 @@ parser = StructuredPDFParser()
 # Parser with VLM for structured data extraction
 parser = StructuredPDFParser(
     use_vlm=True,
-    vlm_provider="openai",  # or "gemini"
+    vlm_provider="openai",  # or "gemini" or "anthropic" or "openrouter"
     vlm_api_key="your_api_key_here"
 )
@@ -344,7 +346,7 @@ parser = StructuredPDFParser(
     # VLM Settings
     use_vlm=True,
     vlm_provider="openai",
-    vlm_model="gpt-4o",
+    vlm_model="gpt-5",
     vlm_api_key="your_api_key",
     # Layout Detection Settings
@@ -406,7 +408,7 @@ parser = ChartTablePDFParser(
     # VLM Settings
     use_vlm=True,
     vlm_provider="openai",
-    vlm_model="gpt-4o",
+    vlm_model="gpt-5",
     vlm_api_key="your_api_key",
     # Layout Detection Settings
@@ -545,7 +547,7 @@ parser = StructuredPDFParser(
     use_vlm=True,
     vlm_provider="openai",
     vlm_api_key="your_openai_api_key",
-    vlm__model="gpt-4o",
+    vlm__model="gpt-5",
     layout_model_name="PP-DocLayout_plus-L",
     dpi=300,  # Higher DPI for better quality
     min_score=0.5,  # Higher confidence threshold
@@ -623,4 +625,41 @@ parser.display_pages_with_boxes("document.pdf")
 - **Pandas**: Data manipulation
 - **OpenPyXL**: Excel file generation
 - **Google Generative AI**: For Gemini VLM integration
-- **OpenAI**: For GPT-4 VLM integration
+- **OpenAI**: For GPT-5 VLM integration
+## 🖥️ Web Interface (Gradio)
+You can try Doctra in a simple web UI powered by Gradio.
+### Run locally
+```bash
+pip install -U gradio
+python gradio_app.py
+```
+Then open the printed URL (default `http://127.0.0.1:7860`).
+Notes:
+- If using VLM, set the API key field in the UI or export `VLM_API_KEY`.
+- Outputs are saved under `outputs/<pdf_stem>/` and previewed in the UI.
+### Deploy on Hugging Face Spaces
+1) Create a new Space (type: Gradio, SDK: Python).
+2) Add these files to the Space repo:
+   - Your package code (or install from PyPI).
+   - `gradio_app.py` (entry point).
+   - `requirements.txt` with at least:
+```text
+doctra
+gradio
+```
+3) Set a secret named `VLM_API_KEY` if you want VLM features.
+4) In Space settings, set `python gradio_app.py` as the run command (or rely on auto-detect).
+The Space will build and expose the same interface for uploads and processing.
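
The metadata hunk above adds an `anthropic` extra. One way to confirm which extras a given `PKG-INFO` declares is to parse it with the standard library, since core metadata uses an email-header style layout. The following is a minimal sketch: the `PKG_INFO` string is only an excerpt of the fields visible in this diff (the real file declares more, such as `dev`), and `extras_with_requirements` is a hypothetical helper name, not part of doctra.

```python
from email import message_from_string

# Excerpt of the 0.3.0 PKG-INFO fields visible in the diff (not the full file).
PKG_INFO = """\
Metadata-Version: 2.4
Name: doctra
Version: 0.3.0
Provides-Extra: openai
Requires-Dist: openai>=1.0.0; extra == "openai"
Provides-Extra: gemini
Requires-Dist: google-generativeai>=0.3.0; extra == "gemini"
Provides-Extra: anthropic
Requires-Dist: anthropic>=0.40.0; extra == "anthropic"
"""


def extras_with_requirements(pkg_info_text):
    """Hypothetical helper: map each declared extra to the requirements gated on it."""
    msg = message_from_string(pkg_info_text)
    # Repeated headers are returned in file order by get_all().
    extras = {name: [] for name in msg.get_all("Provides-Extra", [])}
    for req in msg.get_all("Requires-Dist", []):
        # Split "name>=x; extra == "y"" into the specifier and its marker.
        spec, _, marker = req.partition(";")
        for name in extras:
            if f'extra == "{name}"' in marker:
                extras[name].append(spec.strip())
    return extras


extras = extras_with_requirements(PKG_INFO)
print(msg_version := message_from_string(PKG_INFO)["Version"])  # 0.3.0
print(extras["anthropic"])  # ['anthropic>=0.40.0']
```

The simple substring match on the marker is enough for the one-marker-per-requirement lines shown here; a full marker parser would be needed for compound markers.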

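The README hunks above widen the `vlm_provider` comment to four names and mention a `VLM_API_KEY` variable for the Gradio UI. As a rough sketch of how a caller might assemble those settings, the hypothetical helper below builds the keyword arguments seen in the README examples (`use_vlm`, `vlm_provider`, `vlm_api_key`, `vlm_model`). It is an assumption that reading `VLM_API_KEY` yourself and passing it as `vlm_api_key` is appropriate outside the Gradio app; the diff does not show the parser reading the variable on its own, and it does not show the import path for `StructuredPDFParser`, so the sketch stops at the dictionary.

```python
import os

# Provider names taken from the updated README comment in the diff.
KNOWN_PROVIDERS = ("openai", "gemini", "anthropic", "openrouter")


def build_vlm_kwargs(provider, model=None, env=None):
    """Hypothetical helper: collect the VLM keyword arguments used in the README examples."""
    env = os.environ if env is None else env
    if provider not in KNOWN_PROVIDERS:
        raise ValueError(f"unknown VLM provider: {provider!r}")
    api_key = env.get("VLM_API_KEY")
    if not api_key:
        raise RuntimeError("VLM_API_KEY is not set")
    kwargs = {"use_vlm": True, "vlm_provider": provider, "vlm_api_key": api_key}
    if model is not None:
        # Only pass a model when the caller chose one explicitly.
        kwargs["vlm_model"] = model
    return kwargs


# A plain dict stands in for os.environ so the example has no side effects.
kwargs = build_vlm_kwargs("anthropic", env={"VLM_API_KEY": "your_api_key_here"})
print(kwargs)
```

The resulting dictionary could then be unpacked into the parser constructor shown in the README, for example `StructuredPDFParser(**kwargs)`.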