datalab-python-sdk 0.1.1__tar.gz → 0.1.3__tar.gz
- {datalab_python_sdk-0.1.1 → datalab_python_sdk-0.1.3}/.github/workflows/ci.yml +1 -1
- {datalab_python_sdk-0.1.1 → datalab_python_sdk-0.1.3}/.github/workflows/publish.yml +0 -1
- datalab_python_sdk-0.1.3/PKG-INFO +68 -0
- datalab_python_sdk-0.1.3/README.md +53 -0
- {datalab_python_sdk-0.1.1 → datalab_python_sdk-0.1.3}/datalab_sdk/__init__.py +5 -7
- {datalab_python_sdk-0.1.1 → datalab_python_sdk-0.1.3}/datalab_sdk/cli.py +4 -4
- {datalab_python_sdk-0.1.1 → datalab_python_sdk-0.1.3}/datalab_sdk/client.py +11 -11
- {datalab_python_sdk-0.1.1 → datalab_python_sdk-0.1.3}/datalab_sdk/models.py +22 -14
- {datalab_python_sdk-0.1.1 → datalab_python_sdk-0.1.3}/datalab_sdk/settings.py +1 -0
- datalab_python_sdk-0.1.3/integration/README.md +36 -0
- {datalab_python_sdk-0.1.1 → datalab_python_sdk-0.1.3}/integration/test_live_api.py +7 -7
- {datalab_python_sdk-0.1.1 → datalab_python_sdk-0.1.3}/integration/test_readme_examples.py +22 -22
- {datalab_python_sdk-0.1.1 → datalab_python_sdk-0.1.3}/pyproject.toml +17 -8
- {datalab_python_sdk-0.1.1 → datalab_python_sdk-0.1.3}/tests/conftest.py +2 -3
- {datalab_python_sdk-0.1.1 → datalab_python_sdk-0.1.3}/tests/test_client_methods.py +3 -3
- {datalab_python_sdk-0.1.1 → datalab_python_sdk-0.1.3}/uv.lock +24 -119
- datalab_python_sdk-0.1.1/PKG-INFO +0 -17
- datalab_python_sdk-0.1.1/README.md +0 -178
- datalab_python_sdk-0.1.1/integration/README.md +0 -71
- {datalab_python_sdk-0.1.1 → datalab_python_sdk-0.1.3}/.gitignore +0 -0
- {datalab_python_sdk-0.1.1 → datalab_python_sdk-0.1.3}/.pre-commit-config.yaml +0 -0
- {datalab_python_sdk-0.1.1 → datalab_python_sdk-0.1.3}/.python-version +0 -0
- {datalab_python_sdk-0.1.1 → datalab_python_sdk-0.1.3}/LICENSE +0 -0
- {datalab_python_sdk-0.1.1 → datalab_python_sdk-0.1.3}/data/08-Lambda-Calculus.pptx +0 -0
- {datalab_python_sdk-0.1.1 → datalab_python_sdk-0.1.3}/data/adversarial.pdf +0 -0
- {datalab_python_sdk-0.1.1 → datalab_python_sdk-0.1.3}/data/bid_evaluation.docx +0 -0
- {datalab_python_sdk-0.1.1 → datalab_python_sdk-0.1.3}/data/book_review.ppt +0 -0
- {datalab_python_sdk-0.1.1 → datalab_python_sdk-0.1.3}/data/book_store.xls +0 -0
- {datalab_python_sdk-0.1.1 → datalab_python_sdk-0.1.3}/data/chi_hind.png +0 -0
- {datalab_python_sdk-0.1.1 → datalab_python_sdk-0.1.3}/data/how_to_read.doc +0 -0
- {datalab_python_sdk-0.1.1 → datalab_python_sdk-0.1.3}/data/normandy.epub +0 -0
- {datalab_python_sdk-0.1.1 → datalab_python_sdk-0.1.3}/data/sample-1-sheet.xlsx +0 -0
- {datalab_python_sdk-0.1.1 → datalab_python_sdk-0.1.3}/data/thinkpython.pdf +0 -0
- {datalab_python_sdk-0.1.1 → datalab_python_sdk-0.1.3}/data/vibe.html +0 -0
- {datalab_python_sdk-0.1.1 → datalab_python_sdk-0.1.3}/datalab_sdk/exceptions.py +0 -0
- {datalab_python_sdk-0.1.1 → datalab_python_sdk-0.1.3}/datalab_sdk/mimetypes.py +0 -0
- {datalab_python_sdk-0.1.1 → datalab_python_sdk-0.1.3}/integration/__init__.py +0 -0
- {datalab_python_sdk-0.1.1 → datalab_python_sdk-0.1.3}/poetry.lock +0 -0
- {datalab_python_sdk-0.1.1 → datalab_python_sdk-0.1.3}/pytest.ini +0 -0
- {datalab_python_sdk-0.1.1 → datalab_python_sdk-0.1.3}/tests/__init__.py +0 -0
- {datalab_python_sdk-0.1.1 → datalab_python_sdk-0.1.3}/tests/test_cli_simple.py +0 -0
datalab_python_sdk-0.1.3/PKG-INFO
@@ -0,0 +1,68 @@
+Metadata-Version: 2.4
+Name: datalab-python-sdk
+Version: 0.1.3
+Summary: SDK for the Datalab document intelligence API
+Author-email: Datalab Team <hi@datalab.to>
+License-Expression: MIT
+License-File: LICENSE
+Keywords: api,datalab,document-intelligence,sdk
+Requires-Python: >=3.10
+Requires-Dist: aiohttp>=3.12.14
+Requires-Dist: click>=8.2.1
+Requires-Dist: pydantic-settings<3.0.0,>=2.10.1
+Requires-Dist: pydantic<3.0.0,>=2.11.7
+Description-Content-Type: text/markdown
+
+# Datalab SDK
+
+A Python SDK for the [Datalab API](https://www.datalab.to) - a document intelligence platform powered by [marker](https://github.com/VikParuchuri/marker) and [surya](https://github.com/VikParuchuri/surya).
+
+See the full documentation at [https://documentation.datalab.to](https://documentation.datalab.to).
+
+## Installation
+
+```bash
+pip install datalab-python-sdk
+```
+
+## Quick Start
+
+### Authentication
+
+Get your API key from [https://www.datalab.to/app/keys](https://www.datalab.to/app/keys):
+
+```bash
+export DATALAB_API_KEY="your_api_key_here"
+```
+
+### Basic Usage
+
+```python
+from datalab_sdk import DatalabClient
+
+client = DatalabClient() # use env var from above, or pass api_key="your_api_key_here"
+
+# Convert PDF to markdown
+result = client.convert("document.pdf")
+print(result.markdown)
+
+# OCR a document
+ocr_result = client.ocr("document.pdf")
+print(ocr_result.pages) # Get all text as string
+```
+
+## CLI Usage
+
+The SDK includes a command-line interface:
+
+```bash
+# Convert document to markdown
+datalab convert document.pdf
+
+# OCR with JSON output
+datalab ocr document.pdf --output-format json
+```
+
+## License
+
+MIT License
datalab_python_sdk-0.1.3/README.md
@@ -0,0 +1,53 @@
+# Datalab SDK
+
+A Python SDK for the [Datalab API](https://www.datalab.to) - a document intelligence platform powered by [marker](https://github.com/VikParuchuri/marker) and [surya](https://github.com/VikParuchuri/surya).
+
+See the full documentation at [https://documentation.datalab.to](https://documentation.datalab.to).
+
+## Installation
+
+```bash
+pip install datalab-python-sdk
+```
+
+## Quick Start
+
+### Authentication
+
+Get your API key from [https://www.datalab.to/app/keys](https://www.datalab.to/app/keys):
+
+```bash
+export DATALAB_API_KEY="your_api_key_here"
+```
+
+### Basic Usage
+
+```python
+from datalab_sdk import DatalabClient
+
+client = DatalabClient() # use env var from above, or pass api_key="your_api_key_here"
+
+# Convert PDF to markdown
+result = client.convert("document.pdf")
+print(result.markdown)
+
+# OCR a document
+ocr_result = client.ocr("document.pdf")
+print(ocr_result.pages) # Get all text as string
+```
+
+## CLI Usage
+
+The SDK includes a command-line interface:
+
+```bash
+# Convert document to markdown
+datalab convert document.pdf
+
+# OCR with JSON output
+datalab ocr document.pdf --output-format json
+```
+
+## License
+
+MIT License
{datalab_python_sdk-0.1.1 → datalab_python_sdk-0.1.3}/datalab_sdk/__init__.py
@@ -7,13 +7,10 @@ supporting document conversion, OCR, layout analysis, and table recognition.
 
 from .client import DatalabClient, AsyncDatalabClient
 from .exceptions import DatalabError, DatalabAPIError, DatalabTimeoutError
-from .models import
-
-    OCRResult,
-    ProcessingOptions,
-)
+from .models import ConversionResult, OCRResult, ConvertOptions, OCROptions
+from .settings import settings
 
-__version__ =
+__version__ = settings.VERSION
 __all__ = [
     "DatalabClient",
     "AsyncDatalabClient",
@@ -22,5 +19,6 @@ __all__ = [
     "DatalabTimeoutError",
     "ConversionResult",
     "OCRResult",
-    "
+    "ConvertOptions",
+    "OCROptions",
 ]
{datalab_python_sdk-0.1.1 → datalab_python_sdk-0.1.3}/datalab_sdk/cli.py
@@ -12,7 +12,7 @@ import click
 
 from datalab_sdk.client import DatalabClient, AsyncDatalabClient
 from datalab_sdk.mimetypes import SUPPORTED_EXTENSIONS
-from datalab_sdk.models import ProcessingOptions
+from datalab_sdk.models import OCROptions, ConvertOptions, ProcessingOptions
 from datalab_sdk.exceptions import DatalabError
 from datalab_sdk.settings import settings
 
@@ -186,7 +186,7 @@ def process_single_file_sync(
 
 
 @click.group()
-@click.version_option(version=
+@click.version_option(version=settings.VERSION)
 def cli():
     """Datalab SDK - Command line interface for document processing"""
     pass
@@ -242,7 +242,7 @@ def convert(
     ]
 
     # Create processing options
-    options =
+    options = ConvertOptions(
         output_format=output_format,
         max_pages=max_pages,
         force_ocr=force_ocr,
@@ -366,7 +366,7 @@ def ocr(
         click.echo(f"❌ Skipping {path}: unsupported file type", err=True)
         sys.exit(1)
 
-    options =
+    options = OCROptions(
         max_pages=max_pages,
         page_range=page_range,
         skip_cache=skip_cache,
{datalab_python_sdk-0.1.1 → datalab_python_sdk-0.1.3}/datalab_sdk/client.py
@@ -14,7 +14,13 @@ from datalab_sdk.exceptions import (
     DatalabFileError,
 )
 from datalab_sdk.mimetypes import MIMETYPE_MAP
-from datalab_sdk.models import
+from datalab_sdk.models import (
+    ConversionResult,
+    OCRResult,
+    ProcessingOptions,
+    ConvertOptions,
+    OCROptions,
+)
 from datalab_sdk.settings import settings
 
 
@@ -62,7 +68,7 @@ class AsyncDatalabClient:
                 timeout=timeout,
                 headers={
                     "X-Api-Key": self.api_key,
-                    "User-Agent": "datalab-python-sdk/
+                    "User-Agent": f"datalab-python-sdk/{settings.VERSION}",
                 },
             )
 
@@ -170,7 +176,7 @@ class AsyncDatalabClient:
     ) -> ConversionResult:
         """Convert a document using the marker endpoint"""
         if options is None:
-            options =
+            options = ConvertOptions()
 
         initial_data = await self._make_request(
             "POST", "/api/v1/marker", data=self.get_form_params(file_path, options)
@@ -212,7 +218,7 @@ class AsyncDatalabClient:
     ) -> OCRResult:
         """Perform OCR on a document"""
         if options is None:
-            options =
+            options = OCROptions()
 
         initial_data = await self._make_request(
             "POST", "/api/v1/ocr", data=self.get_form_params(file_path, options)
@@ -263,13 +269,7 @@ class DatalabClient:
 
     def _run_async(self, coro):
         """Run async coroutine in sync context"""
-
-            loop = asyncio.get_event_loop()
-        except RuntimeError:
-            loop = asyncio.new_event_loop()
-            asyncio.set_event_loop(loop)
-
-        return loop.run_until_complete(self._async_wrapper(coro))
+        return asyncio.run(self._async_wrapper(coro))
 
     async def _async_wrapper(self, coro):
         """Wrapper to ensure session management"""
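The last client.py hunk replaces the manual get_event_loop / new_event_loop / run_until_complete sequence with a single asyncio.run call. A minimal standalone sketch of what that implies follows. The names fetch_value, run_sync, and nested_call are illustrative and are not SDK functions. asyncio.run creates a fresh event loop for each call and closes it afterwards, and it raises RuntimeError when called from code that is already running inside an event loop.

```python
import asyncio


async def fetch_value() -> int:
    # Placeholder coroutine standing in for an SDK request.
    await asyncio.sleep(0)
    return 42


def run_sync(coro):
    """Run a coroutine to completion on a fresh event loop."""
    return asyncio.run(coro)


async def nested_call() -> str:
    # Calling the sync wrapper from inside a running loop is refused.
    coro = fetch_value()
    try:
        run_sync(coro)
    except RuntimeError:
        coro.close()  # avoid a "coroutine was never awaited" warning
        return "refused"
    return "ran"


if __name__ == "__main__":
    print(run_sync(fetch_value()))
    print(asyncio.run(nested_call()))
```

One consequence of this design is that a sync wrapper built this way likely cannot be called from environments that already run a loop (for example some notebook kernels); code in that situation would presumably use AsyncDatalabClient directly.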
{datalab_python_sdk-0.1.1 → datalab_python_sdk-0.1.3}/datalab_sdk/models.py
@@ -11,25 +11,11 @@ import base64
 
 @dataclass
 class ProcessingOptions:
-    """Options for document processing"""
-
     # Common options
     max_pages: Optional[int] = None
     skip_cache: bool = True
     page_range: Optional[str] = None
 
-    # Marker specific options
-    force_ocr: bool = False
-    format_lines: bool = False
-    paginate: bool = False
-    use_llm: bool = False
-    strip_existing_ocr: bool = False
-    disable_image_extraction: bool = False
-    block_correction_prompt: Optional[str] = None
-    additional_config: Optional[Dict[str, Any]] = None
-    page_schema: Optional[Dict[str, Any]] = None
-    output_format: str = "markdown" # markdown, json, html
-
     def to_form_data(self) -> Dict[str, Any]:
         """Convert to form data format for API requests"""
         form_data = {}
@@ -47,6 +33,28 @@ class ProcessingOptions:
         return form_data
 
 
+@dataclass
+class ConvertOptions(ProcessingOptions):
+    """Options for marker conversion"""
+
+    # Marker specific options
+    force_ocr: bool = False
+    format_lines: bool = False
+    paginate: bool = False
+    use_llm: bool = False
+    strip_existing_ocr: bool = False
+    disable_image_extraction: bool = False
+    block_correction_prompt: Optional[str] = None
+    additional_config: Optional[Dict[str, Any]] = None
+    page_schema: Optional[Dict[str, Any]] = None
+    output_format: str = "markdown" # markdown, json, html
+
+
+@dataclass
+class OCROptions(ProcessingOptions):
+    pass
+
+
 @dataclass
 class ConversionResult:
     """Result from document conversion (marker endpoint)"""
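The models.py hunks split the former all-in-one ProcessingOptions into a shared base plus two subclasses, so marker-only fields no longer exist on OCR options. The sketch below reproduces that shape with standard-library dataclasses. It keeps only a subset of the fields, and its to_form_data body is a hypothetical simplification, because the real method body is not part of this diff.

```python
from dataclasses import asdict, dataclass
from typing import Any, Dict, Optional


@dataclass
class ProcessingOptions:
    # Common options shared by both endpoints (fields as listed in the diff)
    max_pages: Optional[int] = None
    skip_cache: bool = True
    page_range: Optional[str] = None

    def to_form_data(self) -> Dict[str, Any]:
        # Hypothetical simplification: drop unset values, keep the rest.
        return {k: v for k, v in asdict(self).items() if v is not None}


@dataclass
class ConvertOptions(ProcessingOptions):
    # Subset of the marker-specific fields from the diff
    force_ocr: bool = False
    use_llm: bool = False
    output_format: str = "markdown"


@dataclass
class OCROptions(ProcessingOptions):
    pass


convert_opts = ConvertOptions(output_format="html", max_pages=1)
ocr_opts = OCROptions(max_pages=2)
```

Because every base field has a default, the subclasses can add their own defaulted fields without hitting the dataclass rule that non-default fields may not follow default ones. Callers that previously built ProcessingOptions(force_ocr=True) would now construct ConvertOptions(force_ocr=True), as the updated tests in this diff do.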
datalab_python_sdk-0.1.3/integration/README.md
@@ -0,0 +1,36 @@
+# Integration Tests
+
+This directory contains integration tests that run against the live Datalab API.
+
+## Setup
+
+1. **Set your API key** as an environment variable:
+   ```bash
+   export DATALAB_API_KEY="your_api_key_here"
+   ```
+
+2. **Optional: Set custom base URL** if testing against a different server:
+   ```bash
+   export DATALAB_BASE_URL="https://custom.datalab.to"
+   ```
+
+
+## Running the Tests
+
+Run all integration tests:
+```bash
+pytest integration/ -v
+```
+
+Run specific test classes:
+```bash
+pytest integration/test_live_api.py::TestMarkerIntegration -v
+pytest integration/test_live_api.py::TestOCRIntegration -v
+```
+
+Run individual tests:
+```bash
+pytest integration/test_live_api.py::TestMarkerIntegration::test_convert_pdf_basic -v
+```
+
+Set `-n 4` to run 4 parallel test workers.
{datalab_python_sdk-0.1.1 → datalab_python_sdk-0.1.3}/integration/test_live_api.py
@@ -14,7 +14,7 @@ import pytest
 import os
 from pathlib import Path
 from datalab_sdk import DatalabClient, AsyncDatalabClient
-from datalab_sdk.models import
+from datalab_sdk.models import ConversionResult, OCRResult, ConvertOptions, OCROptions
 from datalab_sdk.exceptions import DatalabError
 
 # Test data files
@@ -32,7 +32,7 @@ class TestMarkerIntegration:
         pdf_file = DATA_DIR / "adversarial.pdf"
 
         # Convert with limited pages to keep test fast
-        options =
+        options = ConvertOptions(max_pages=2)
         result = client.convert(pdf_file, options=options)
 
         # Verify result
@@ -52,7 +52,7 @@ class TestMarkerIntegration:
         doc_file = DATA_DIR / "bid_evaluation.docx"
 
         # Convert to HTML format
-        options =
+        options = ConvertOptions(output_format="html", max_pages=1)
         result = client.convert(doc_file, options=options)
 
         # Verify result
@@ -70,7 +70,7 @@ class TestMarkerIntegration:
         ppt_file = DATA_DIR / "08-Lambda-Calculus.pptx"
 
         # Convert to JSON format
-        options =
+        options = ConvertOptions(output_format="json", max_pages=1)
         result = await client.convert(ppt_file, options=options)
 
         # Verify result
@@ -94,7 +94,7 @@ class TestOCRIntegration:
         pdf_file = DATA_DIR / "thinkpython.pdf"
 
         # OCR with limited pages
-        options =
+        options = OCROptions(max_pages=1)
         result = client.ocr(pdf_file, options)
 
         # Verify result
@@ -149,7 +149,7 @@ class TestOCRIntegration:
         pdf_file = DATA_DIR / "adversarial.pdf"
 
         # OCR with limited pages
-        options =
+        options = OCROptions(max_pages=2)
         result = await client.ocr(pdf_file, options)
 
         # Verify result
@@ -223,7 +223,7 @@ class TestSaveOutput:
         output_path = tmp_path / "test_output"
 
         # Convert with save_output
-        options =
+        options = ConvertOptions(max_pages=1)
         result = client.convert(pdf_file, options=options, save_output=output_path)
 
         # Verify result
{datalab_python_sdk-0.1.1 → datalab_python_sdk-0.1.3}/integration/test_readme_examples.py
@@ -15,7 +15,7 @@ import json
 import tempfile
 from pathlib import Path
 from datalab_sdk import DatalabClient, AsyncDatalabClient
-from datalab_sdk.models import
+from datalab_sdk.models import ConvertOptions, OCROptions
 from datalab_sdk.settings import settings
 
 # Test data files
@@ -34,7 +34,7 @@ class TestBasicUsageExamples:
 
         # Convert PDF to markdown (using test data)
         result = client.convert(
-            DATA_DIR / "adversarial.pdf", options=
+            DATA_DIR / "adversarial.pdf", options=ConvertOptions(max_pages=1)
         )
         print(result.markdown)
 
@@ -61,7 +61,7 @@ class TestBasicUsageExamples:
         async with AsyncDatalabClient() as client:
             # Convert PDF to markdown
             result = await client.convert(
-                DATA_DIR / "adversarial.pdf", options=
+                DATA_DIR / "adversarial.pdf", options=ConvertOptions(max_pages=1)
             )
             print(result.markdown)
 
@@ -85,19 +85,19 @@ class TestAPIMethodExamples:
 
     def test_document_conversion_examples(self):
         """Test Document Conversion section examples"""
-        from datalab_sdk import DatalabClient,
+        from datalab_sdk import DatalabClient, ConvertOptions
 
         client = DatalabClient()
 
         # Basic conversion
         result = client.convert(
-            DATA_DIR / "adversarial.pdf", options=
+            DATA_DIR / "adversarial.pdf", options=ConvertOptions(max_pages=1)
         )
         assert result.success is True
         assert result.markdown is not None
 
         # With options
-        options =
+        options = ConvertOptions(
             force_ocr=True,
             output_format="html",
             use_llm=False, # Keep false for cost reasons
@@ -112,7 +112,7 @@ class TestAPIMethodExamples:
             output_path = Path(tmp_dir) / "result"
             result = client.convert(
                 DATA_DIR / "adversarial.pdf",
-                options=
+                options=ConvertOptions(max_pages=1),
                 save_output=output_path,
             )
             assert result.success is True
@@ -132,7 +132,7 @@ class TestAPIMethodExamples:
         assert isinstance(text, str)
 
         # OCR with options
-        options =
+        options = OCROptions(max_pages=1)
         result = client.ocr(DATA_DIR / "adversarial.pdf", options)
         assert result.success is True
         assert len(result.pages) > 0
@@ -158,7 +158,7 @@ class TestErrorHandlingExamples:
         # Test with valid file (should not raise error)
         try:
             result = client.convert(
-                DATA_DIR / "adversarial.pdf", options=
+                DATA_DIR / "adversarial.pdf", options=ConvertOptions(max_pages=1)
             )
             assert result.success is True
         except DatalabAPIError as e:
@@ -171,7 +171,7 @@ class TestErrorHandlingExamples:
 
         with pytest.raises(DatalabAPIError):
             invalid_client.convert(
-                DATA_DIR / "adversarial.pdf", options=
+                DATA_DIR / "adversarial.pdf", options=ConvertOptions(max_pages=1)
             )
 
 
@@ -180,10 +180,10 @@ class TestExamplesSectionFromReadme:
 
     def test_extract_json_data_example(self):
         """Test Extract JSON Data example"""
-        from datalab_sdk import DatalabClient,
+        from datalab_sdk import DatalabClient, ConvertOptions
 
         client = DatalabClient()
-        options =
+        options = ConvertOptions(output_format="json", max_pages=1)
         result = client.convert(DATA_DIR / "adversarial.pdf", options=options)
 
         # Parse JSON to find equations (modified to not fail if no equations)
@@ -221,7 +221,7 @@ class TestExamplesSectionFromReadme:
                 if file.suffix == ".pdf":
                     result = await client.convert(
                         str(file),
-                        options=
+                        options=ConvertOptions(max_pages=1),
                         save_output=output_path,
                     )
                     print(f"{file.name}: {result.page_count} pages")
@@ -247,7 +247,7 @@ class TestClientInitializationVariations:
 
         client = DatalabClient()
         result = client.convert(
-            DATA_DIR / "adversarial.pdf", options=
+            DATA_DIR / "adversarial.pdf", options=ConvertOptions(max_pages=1)
         )
         assert result.success is True
 
@@ -262,7 +262,7 @@ class TestClientInitializationVariations:
             # Client should use environment variable
             client = DatalabClient()
             result = client.convert(
-                DATA_DIR / "adversarial.pdf", options=
+                DATA_DIR / "adversarial.pdf", options=ConvertOptions(max_pages=1)
             )
             assert result.success is True
         finally:
@@ -280,7 +280,7 @@ class TestClientInitializationVariations:
             api_key=settings.DATALAB_API_KEY, base_url=settings.DATALAB_HOST
         ) as client:
             result = await client.convert(
-                DATA_DIR / "adversarial.pdf", options=
+                DATA_DIR / "adversarial.pdf", options=ConvertOptions(max_pages=1)
             )
             assert result.success is True
             assert result.markdown is not None
@@ -291,9 +291,9 @@ class TestProcessingOptionsVariations:
 
     def test_processing_options_defaults(self):
         """Test ProcessingOptions with default values"""
-        from datalab_sdk import
+        from datalab_sdk import ConvertOptions
 
-        options =
+        options = ConvertOptions()
         assert options.force_ocr is False
         assert options.output_format == "markdown"
         assert options.use_llm is False
@@ -301,9 +301,9 @@ class TestProcessingOptionsVariations:
 
     def test_processing_options_custom_values(self):
         """Test ProcessingOptions with custom values"""
-        from datalab_sdk import
+        from datalab_sdk import ConvertOptions
 
-        options =
+        options = ConvertOptions(
             force_ocr=True,
             output_format="html",
             use_llm=False, # Keep false for cost reasons
@@ -319,9 +319,9 @@ class TestProcessingOptionsVariations:
 
     def test_processing_options_json_output(self):
         """Test ProcessingOptions with JSON output"""
-        from datalab_sdk import
+        from datalab_sdk import ConvertOptions
 
-        options =
+        options = ConvertOptions(output_format="json", max_pages=1)
 
         client = DatalabClient()
         result = client.convert(DATA_DIR / "adversarial.pdf", options=options)
{datalab_python_sdk-0.1.1 → datalab_python_sdk-0.1.3}/pyproject.toml
@@ -1,24 +1,29 @@
 [project]
 name = "datalab-python-sdk"
-
-
+authors = [
+    {name = "Datalab Team", email = "hi@datalab.to"}
+]
+readme = "README.md"
+license = "MIT"
+repository = "https://github.com/datalab-to/sdk"
+keywords = ["datalab", "sdk", "document-intelligence", "api"]
+version = "0.1.3"
+description = "SDK for the Datalab document intelligence API"
 requires-python = ">=3.10"
 dependencies = [
     "aiohttp>=3.12.14",
     "click>=8.2.1",
-    "pydantic
-    "pydantic-settings
-    "pytest-asyncio>=1.0.0",
+    "pydantic>=2.11.7,<3.0.0",
+    "pydantic-settings>=2.10.1,<3.0.0",
 ]
 
-
 [project.scripts]
 datalab = "datalab_sdk.cli:cli"
 
-[project.
+[project.dev-dependencies]
 test = [
     "pytest>=7.4.0",
-    "pytest-asyncio>=0.
+    "pytest-asyncio>=1.0.0",
     "pytest-mock>=3.11.0",
     "pytest-cov>=4.1.0",
     "aiofiles>=23.2.0",
@@ -33,7 +38,11 @@ packages = ["datalab_sdk"]
 
 [dependency-groups]
 dev = [
+    "aiohttp>=3.12.14",
+    "click>=8.2.1",
     "pre-commit>=4.2.0",
     "pytest>=8.4.1",
+    "pytest-asyncio>=1.0.0",
+    "pytest-xdist>=3.8.0",
     "ruff>=0.12.2",
 ]
{datalab_python_sdk-0.1.1 → datalab_python_sdk-0.1.3}/tests/conftest.py
@@ -10,8 +10,7 @@ from aiohttp.test_utils import TestServer
 import json
 import tempfile
 
-from datalab_sdk import DatalabClient, AsyncDatalabClient
-from datalab_sdk.models import ProcessingOptions
+from datalab_sdk import DatalabClient, AsyncDatalabClient, ConvertOptions
 
 
 @pytest.fixture
@@ -144,7 +143,7 @@ async def mock_async_client(mock_server):
 @pytest.fixture
 def processing_options():
     """Create sample processing options"""
-    return
+    return ConvertOptions(
         force_ocr=True, output_format="markdown", use_llm=False, max_pages=10
     )
 
{datalab_python_sdk-0.1.1 → datalab_python_sdk-0.1.3}/tests/test_client_methods.py
@@ -7,7 +7,7 @@ from unittest.mock import patch, AsyncMock
 import json
 
 from datalab_sdk import DatalabClient, AsyncDatalabClient
-from datalab_sdk.models import
+from datalab_sdk.models import ConversionResult, OCRResult, ConvertOptions, OCROptions
 from datalab_sdk.exceptions import DatalabAPIError, DatalabFileError
 
 
@@ -124,7 +124,7 @@ class TestConvertMethod:
         pdf_file.write_bytes(b"%PDF-1.4\n%Test PDF content\n%%EOF\n")
 
         # Create processing options
-        options =
+        options = ConvertOptions(
             force_ocr=True, output_format="html", use_llm=True, max_pages=5
         )
 
@@ -338,7 +338,7 @@ class TestOCRMethod:
         mock_request.return_value = mock_initial_response
         mock_poll.return_value = mock_result_response
 
-        options =
+        options = OCROptions(
             max_pages=2,
         )
 
@@ -1,19 +1,6 @@
|
|
|
1
1
|
version = 1
|
|
2
2
|
revision = 2
|
|
3
3
|
requires-python = ">=3.10"
|
|
4
|
-
resolution-markers = [
|
|
5
|
-
"python_full_version >= '3.11'",
|
|
6
|
-
"python_full_version < '3.11'",
|
|
7
|
-
]
|
|
8
|
-
|
|
9
|
-
[[package]]
|
|
10
|
-
name = "aiofiles"
|
|
11
|
-
version = "24.1.0"
|
|
12
|
-
source = { registry = "https://pypi.org/simple" }
|
|
13
|
-
sdist = { url = "https://files.pythonhosted.org/packages/0b/03/a88171e277e8caa88a4c77808c20ebb04ba74cc4681bf1e9416c862de237/aiofiles-24.1.0.tar.gz", hash = "sha256:22a075c9e5a3810f0c2e48f3008c94d68c65d763b9b03857924c99e57355166c", size = 30247, upload-time = "2024-06-24T11:02:03.584Z" }
|
|
14
|
-
wheels = [
|
|
15
|
-
{ url = "https://files.pythonhosted.org/packages/a5/45/30bb92d442636f570cb5651bc661f52b610e2eec3f891a5dc3a4c3667db0/aiofiles-24.1.0-py3-none-any.whl", hash = "sha256:b4ec55f4195e3eb5d7abd1bf7e061763e864dd4954231fb8539a0ef8bb8260e5", size = 15896, upload-time = "2024-06-24T11:02:01.529Z" },
|
|
16
|
-
]
|
|
17
4
|
|
|
18
5
|
[[package]]
|
|
19
6
|
name = "aiohappyeyeballs"
|
|
@@ -180,122 +167,44 @@ wheels = [
|
|
|
180
167
|
{ url = "https://files.pythonhosted.org/packages/d1/d6/3965ed04c63042e047cb6a3e6ed1a63a35087b6a609aa3a15ed8ac56c221/colorama-0.4.6-py2.py3-none-any.whl", hash = "sha256:4f1d9991f5acc0ca119f9d443620b77f9d6b33703e51011c16baf57afb285fc6", size = 25335, upload-time = "2022-10-25T02:36:20.889Z" },
|
|
181
168
|
]
|
|
182
169
|
|
|
183
|
-
[[package]]
|
|
184
|
-
name = "coverage"
|
|
185
|
-
version = "7.9.2"
|
|
186
|
-
source = { registry = "https://pypi.org/simple" }
|
|
187
|
-
sdist = { url = "https://files.pythonhosted.org/packages/04/b7/c0465ca253df10a9e8dae0692a4ae6e9726d245390aaef92360e1d6d3832/coverage-7.9.2.tar.gz", hash = "sha256:997024fa51e3290264ffd7492ec97d0690293ccd2b45a6cd7d82d945a4a80c8b", size = 813556, upload-time = "2025-07-03T10:54:15.101Z" }
|
|
188
|
-
wheels = [
|
|
189
|
-
{ url = "https://files.pythonhosted.org/packages/a1/0d/5c2114fd776c207bd55068ae8dc1bef63ecd1b767b3389984a8e58f2b926/coverage-7.9.2-cp310-cp310-macosx_10_9_x86_64.whl", hash = "sha256:66283a192a14a3854b2e7f3418d7db05cdf411012ab7ff5db98ff3b181e1f912", size = 212039, upload-time = "2025-07-03T10:52:38.955Z" },
-    { url = "https://files.pythonhosted.org/packages/cf/ad/dc51f40492dc2d5fcd31bb44577bc0cc8920757d6bc5d3e4293146524ef9/coverage-7.9.2-cp310-cp310-macosx_11_0_arm64.whl", hash = "sha256:4e01d138540ef34fcf35c1aa24d06c3de2a4cffa349e29a10056544f35cca15f", size = 212428, upload-time = "2025-07-03T10:52:41.36Z" },
-    { url = "https://files.pythonhosted.org/packages/a2/a3/55cb3ff1b36f00df04439c3993d8529193cdf165a2467bf1402539070f16/coverage-7.9.2-cp310-cp310-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:f22627c1fe2745ee98d3ab87679ca73a97e75ca75eb5faee48660d060875465f", size = 241534, upload-time = "2025-07-03T10:52:42.956Z" },
-    { url = "https://files.pythonhosted.org/packages/eb/c9/a8410b91b6be4f6e9c2e9f0dce93749b6b40b751d7065b4410bf89cb654b/coverage-7.9.2-cp310-cp310-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:4b1c2d8363247b46bd51f393f86c94096e64a1cf6906803fa8d5a9d03784bdbf", size = 239408, upload-time = "2025-07-03T10:52:44.199Z" },
-    { url = "https://files.pythonhosted.org/packages/ff/c4/6f3e56d467c612b9070ae71d5d3b114c0b899b5788e1ca3c93068ccb7018/coverage-7.9.2-cp310-cp310-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:c10c882b114faf82dbd33e876d0cbd5e1d1ebc0d2a74ceef642c6152f3f4d547", size = 240552, upload-time = "2025-07-03T10:52:45.477Z" },
-    { url = "https://files.pythonhosted.org/packages/fd/20/04eda789d15af1ce79bce5cc5fd64057c3a0ac08fd0576377a3096c24663/coverage-7.9.2-cp310-cp310-musllinux_1_2_aarch64.whl", hash = "sha256:de3c0378bdf7066c3988d66cd5232d161e933b87103b014ab1b0b4676098fa45", size = 240464, upload-time = "2025-07-03T10:52:46.809Z" },
-    { url = "https://files.pythonhosted.org/packages/a9/5a/217b32c94cc1a0b90f253514815332d08ec0812194a1ce9cca97dda1cd20/coverage-7.9.2-cp310-cp310-musllinux_1_2_i686.whl", hash = "sha256:1e2f097eae0e5991e7623958a24ced3282676c93c013dde41399ff63e230fcf2", size = 239134, upload-time = "2025-07-03T10:52:48.149Z" },
-    { url = "https://files.pythonhosted.org/packages/34/73/1d019c48f413465eb5d3b6898b6279e87141c80049f7dbf73fd020138549/coverage-7.9.2-cp310-cp310-musllinux_1_2_x86_64.whl", hash = "sha256:28dc1f67e83a14e7079b6cea4d314bc8b24d1aed42d3582ff89c0295f09b181e", size = 239405, upload-time = "2025-07-03T10:52:49.687Z" },
-    { url = "https://files.pythonhosted.org/packages/49/6c/a2beca7aa2595dad0c0d3f350382c381c92400efe5261e2631f734a0e3fe/coverage-7.9.2-cp310-cp310-win32.whl", hash = "sha256:bf7d773da6af9e10dbddacbf4e5cab13d06d0ed93561d44dae0188a42c65be7e", size = 214519, upload-time = "2025-07-03T10:52:51.036Z" },
-    { url = "https://files.pythonhosted.org/packages/fc/c8/91e5e4a21f9a51e2c7cdd86e587ae01a4fcff06fc3fa8cde4d6f7cf68df4/coverage-7.9.2-cp310-cp310-win_amd64.whl", hash = "sha256:0c0378ba787681ab1897f7c89b415bd56b0b2d9a47e5a3d8dc0ea55aac118d6c", size = 215400, upload-time = "2025-07-03T10:52:52.313Z" },
-    { url = "https://files.pythonhosted.org/packages/39/40/916786453bcfafa4c788abee4ccd6f592b5b5eca0cd61a32a4e5a7ef6e02/coverage-7.9.2-cp311-cp311-macosx_10_9_x86_64.whl", hash = "sha256:a7a56a2964a9687b6aba5b5ced6971af308ef6f79a91043c05dd4ee3ebc3e9ba", size = 212152, upload-time = "2025-07-03T10:52:53.562Z" },
-    { url = "https://files.pythonhosted.org/packages/9f/66/cc13bae303284b546a030762957322bbbff1ee6b6cb8dc70a40f8a78512f/coverage-7.9.2-cp311-cp311-macosx_11_0_arm64.whl", hash = "sha256:123d589f32c11d9be7fe2e66d823a236fe759b0096f5db3fb1b75b2fa414a4fa", size = 212540, upload-time = "2025-07-03T10:52:55.196Z" },
-    { url = "https://files.pythonhosted.org/packages/0f/3c/d56a764b2e5a3d43257c36af4a62c379df44636817bb5f89265de4bf8bd7/coverage-7.9.2-cp311-cp311-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:333b2e0ca576a7dbd66e85ab402e35c03b0b22f525eed82681c4b866e2e2653a", size = 245097, upload-time = "2025-07-03T10:52:56.509Z" },
-    { url = "https://files.pythonhosted.org/packages/b1/46/bd064ea8b3c94eb4ca5d90e34d15b806cba091ffb2b8e89a0d7066c45791/coverage-7.9.2-cp311-cp311-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:326802760da234baf9f2f85a39e4a4b5861b94f6c8d95251f699e4f73b1835dc", size = 242812, upload-time = "2025-07-03T10:52:57.842Z" },
-    { url = "https://files.pythonhosted.org/packages/43/02/d91992c2b29bc7afb729463bc918ebe5f361be7f1daae93375a5759d1e28/coverage-7.9.2-cp311-cp311-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:19e7be4cfec248df38ce40968c95d3952fbffd57b400d4b9bb580f28179556d2", size = 244617, upload-time = "2025-07-03T10:52:59.239Z" },
-    { url = "https://files.pythonhosted.org/packages/b7/4f/8fadff6bf56595a16d2d6e33415841b0163ac660873ed9a4e9046194f779/coverage-7.9.2-cp311-cp311-musllinux_1_2_aarch64.whl", hash = "sha256:0b4a4cb73b9f2b891c1788711408ef9707666501ba23684387277ededab1097c", size = 244263, upload-time = "2025-07-03T10:53:00.601Z" },
-    { url = "https://files.pythonhosted.org/packages/9b/d2/e0be7446a2bba11739edb9f9ba4eff30b30d8257370e237418eb44a14d11/coverage-7.9.2-cp311-cp311-musllinux_1_2_i686.whl", hash = "sha256:2c8937fa16c8c9fbbd9f118588756e7bcdc7e16a470766a9aef912dd3f117dbd", size = 242314, upload-time = "2025-07-03T10:53:01.932Z" },
-    { url = "https://files.pythonhosted.org/packages/9d/7d/dcbac9345000121b8b57a3094c2dfcf1ccc52d8a14a40c1d4bc89f936f80/coverage-7.9.2-cp311-cp311-musllinux_1_2_x86_64.whl", hash = "sha256:42da2280c4d30c57a9b578bafd1d4494fa6c056d4c419d9689e66d775539be74", size = 242904, upload-time = "2025-07-03T10:53:03.478Z" },
-    { url = "https://files.pythonhosted.org/packages/41/58/11e8db0a0c0510cf31bbbdc8caf5d74a358b696302a45948d7c768dfd1cf/coverage-7.9.2-cp311-cp311-win32.whl", hash = "sha256:14fa8d3da147f5fdf9d298cacc18791818f3f1a9f542c8958b80c228320e90c6", size = 214553, upload-time = "2025-07-03T10:53:05.174Z" },
-    { url = "https://files.pythonhosted.org/packages/3a/7d/751794ec8907a15e257136e48dc1021b1f671220ecccfd6c4eaf30802714/coverage-7.9.2-cp311-cp311-win_amd64.whl", hash = "sha256:549cab4892fc82004f9739963163fd3aac7a7b0df430669b75b86d293d2df2a7", size = 215441, upload-time = "2025-07-03T10:53:06.472Z" },
-    { url = "https://files.pythonhosted.org/packages/62/5b/34abcedf7b946c1c9e15b44f326cb5b0da852885312b30e916f674913428/coverage-7.9.2-cp311-cp311-win_arm64.whl", hash = "sha256:c2667a2b913e307f06aa4e5677f01a9746cd08e4b35e14ebcde6420a9ebb4c62", size = 213873, upload-time = "2025-07-03T10:53:07.699Z" },
-    { url = "https://files.pythonhosted.org/packages/53/d7/7deefc6fd4f0f1d4c58051f4004e366afc9e7ab60217ac393f247a1de70a/coverage-7.9.2-cp312-cp312-macosx_10_13_x86_64.whl", hash = "sha256:ae9eb07f1cfacd9cfe8eaee6f4ff4b8a289a668c39c165cd0c8548484920ffc0", size = 212344, upload-time = "2025-07-03T10:53:09.3Z" },
-    { url = "https://files.pythonhosted.org/packages/95/0c/ee03c95d32be4d519e6a02e601267769ce2e9a91fc8faa1b540e3626c680/coverage-7.9.2-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:9ce85551f9a1119f02adc46d3014b5ee3f765deac166acf20dbb851ceb79b6f3", size = 212580, upload-time = "2025-07-03T10:53:11.52Z" },
-    { url = "https://files.pythonhosted.org/packages/8b/9f/826fa4b544b27620086211b87a52ca67592622e1f3af9e0a62c87aea153a/coverage-7.9.2-cp312-cp312-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:f8f6389ac977c5fb322e0e38885fbbf901743f79d47f50db706e7644dcdcb6e1", size = 246383, upload-time = "2025-07-03T10:53:13.134Z" },
-    { url = "https://files.pythonhosted.org/packages/7f/b3/4477aafe2a546427b58b9c540665feff874f4db651f4d3cb21b308b3a6d2/coverage-7.9.2-cp312-cp312-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:ff0d9eae8cdfcd58fe7893b88993723583a6ce4dfbfd9f29e001922544f95615", size = 243400, upload-time = "2025-07-03T10:53:14.614Z" },
-    { url = "https://files.pythonhosted.org/packages/f8/c2/efffa43778490c226d9d434827702f2dfbc8041d79101a795f11cbb2cf1e/coverage-7.9.2-cp312-cp312-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:fae939811e14e53ed8a9818dad51d434a41ee09df9305663735f2e2d2d7d959b", size = 245591, upload-time = "2025-07-03T10:53:15.872Z" },
-    { url = "https://files.pythonhosted.org/packages/c6/e7/a59888e882c9a5f0192d8627a30ae57910d5d449c80229b55e7643c078c4/coverage-7.9.2-cp312-cp312-musllinux_1_2_aarch64.whl", hash = "sha256:31991156251ec202c798501e0a42bbdf2169dcb0f137b1f5c0f4267f3fc68ef9", size = 245402, upload-time = "2025-07-03T10:53:17.124Z" },
-    { url = "https://files.pythonhosted.org/packages/92/a5/72fcd653ae3d214927edc100ce67440ed8a0a1e3576b8d5e6d066ed239db/coverage-7.9.2-cp312-cp312-musllinux_1_2_i686.whl", hash = "sha256:d0d67963f9cbfc7c7f96d4ac74ed60ecbebd2ea6eeb51887af0f8dce205e545f", size = 243583, upload-time = "2025-07-03T10:53:18.781Z" },
-    { url = "https://files.pythonhosted.org/packages/5c/f5/84e70e4df28f4a131d580d7d510aa1ffd95037293da66fd20d446090a13b/coverage-7.9.2-cp312-cp312-musllinux_1_2_x86_64.whl", hash = "sha256:49b752a2858b10580969ec6af6f090a9a440a64a301ac1528d7ca5f7ed497f4d", size = 244815, upload-time = "2025-07-03T10:53:20.168Z" },
-    { url = "https://files.pythonhosted.org/packages/39/e7/d73d7cbdbd09fdcf4642655ae843ad403d9cbda55d725721965f3580a314/coverage-7.9.2-cp312-cp312-win32.whl", hash = "sha256:88d7598b8ee130f32f8a43198ee02edd16d7f77692fa056cb779616bbea1b355", size = 214719, upload-time = "2025-07-03T10:53:21.521Z" },
-    { url = "https://files.pythonhosted.org/packages/9f/d6/7486dcc3474e2e6ad26a2af2db7e7c162ccd889c4c68fa14ea8ec189c9e9/coverage-7.9.2-cp312-cp312-win_amd64.whl", hash = "sha256:9dfb070f830739ee49d7c83e4941cc767e503e4394fdecb3b54bfdac1d7662c0", size = 215509, upload-time = "2025-07-03T10:53:22.853Z" },
-    { url = "https://files.pythonhosted.org/packages/b7/34/0439f1ae2593b0346164d907cdf96a529b40b7721a45fdcf8b03c95fcd90/coverage-7.9.2-cp312-cp312-win_arm64.whl", hash = "sha256:4e2c058aef613e79df00e86b6d42a641c877211384ce5bd07585ed7ba71ab31b", size = 213910, upload-time = "2025-07-03T10:53:24.472Z" },
-    { url = "https://files.pythonhosted.org/packages/94/9d/7a8edf7acbcaa5e5c489a646226bed9591ee1c5e6a84733c0140e9ce1ae1/coverage-7.9.2-cp313-cp313-macosx_10_13_x86_64.whl", hash = "sha256:985abe7f242e0d7bba228ab01070fde1d6c8fa12f142e43debe9ed1dde686038", size = 212367, upload-time = "2025-07-03T10:53:25.811Z" },
-    { url = "https://files.pythonhosted.org/packages/e8/9e/5cd6f130150712301f7e40fb5865c1bc27b97689ec57297e568d972eec3c/coverage-7.9.2-cp313-cp313-macosx_11_0_arm64.whl", hash = "sha256:82c3939264a76d44fde7f213924021ed31f55ef28111a19649fec90c0f109e6d", size = 212632, upload-time = "2025-07-03T10:53:27.075Z" },
-    { url = "https://files.pythonhosted.org/packages/a8/de/6287a2c2036f9fd991c61cefa8c64e57390e30c894ad3aa52fac4c1e14a8/coverage-7.9.2-cp313-cp313-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:ae5d563e970dbe04382f736ec214ef48103d1b875967c89d83c6e3f21706d5b3", size = 245793, upload-time = "2025-07-03T10:53:28.408Z" },
-    { url = "https://files.pythonhosted.org/packages/06/cc/9b5a9961d8160e3cb0b558c71f8051fe08aa2dd4b502ee937225da564ed1/coverage-7.9.2-cp313-cp313-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:bdd612e59baed2a93c8843c9a7cb902260f181370f1d772f4842987535071d14", size = 243006, upload-time = "2025-07-03T10:53:29.754Z" },
-    { url = "https://files.pythonhosted.org/packages/49/d9/4616b787d9f597d6443f5588619c1c9f659e1f5fc9eebf63699eb6d34b78/coverage-7.9.2-cp313-cp313-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:256ea87cb2a1ed992bcdfc349d8042dcea1b80436f4ddf6e246d6bee4b5d73b6", size = 244990, upload-time = "2025-07-03T10:53:31.098Z" },
-    { url = "https://files.pythonhosted.org/packages/48/83/801cdc10f137b2d02b005a761661649ffa60eb173dcdaeb77f571e4dc192/coverage-7.9.2-cp313-cp313-musllinux_1_2_aarch64.whl", hash = "sha256:f44ae036b63c8ea432f610534a2668b0c3aee810e7037ab9d8ff6883de480f5b", size = 245157, upload-time = "2025-07-03T10:53:32.717Z" },
-    { url = "https://files.pythonhosted.org/packages/c8/a4/41911ed7e9d3ceb0ffb019e7635468df7499f5cc3edca5f7dfc078e9c5ec/coverage-7.9.2-cp313-cp313-musllinux_1_2_i686.whl", hash = "sha256:82d76ad87c932935417a19b10cfe7abb15fd3f923cfe47dbdaa74ef4e503752d", size = 243128, upload-time = "2025-07-03T10:53:34.009Z" },
-    { url = "https://files.pythonhosted.org/packages/10/41/344543b71d31ac9cb00a664d5d0c9ef134a0fe87cb7d8430003b20fa0b7d/coverage-7.9.2-cp313-cp313-musllinux_1_2_x86_64.whl", hash = "sha256:619317bb86de4193debc712b9e59d5cffd91dc1d178627ab2a77b9870deb2868", size = 244511, upload-time = "2025-07-03T10:53:35.434Z" },
-    { url = "https://files.pythonhosted.org/packages/d5/81/3b68c77e4812105e2a060f6946ba9e6f898ddcdc0d2bfc8b4b152a9ae522/coverage-7.9.2-cp313-cp313-win32.whl", hash = "sha256:0a07757de9feb1dfafd16ab651e0f628fd7ce551604d1bf23e47e1ddca93f08a", size = 214765, upload-time = "2025-07-03T10:53:36.787Z" },
-    { url = "https://files.pythonhosted.org/packages/06/a2/7fac400f6a346bb1a4004eb2a76fbff0e242cd48926a2ce37a22a6a1d917/coverage-7.9.2-cp313-cp313-win_amd64.whl", hash = "sha256:115db3d1f4d3f35f5bb021e270edd85011934ff97c8797216b62f461dd69374b", size = 215536, upload-time = "2025-07-03T10:53:38.188Z" },
-    { url = "https://files.pythonhosted.org/packages/08/47/2c6c215452b4f90d87017e61ea0fd9e0486bb734cb515e3de56e2c32075f/coverage-7.9.2-cp313-cp313-win_arm64.whl", hash = "sha256:48f82f889c80af8b2a7bb6e158d95a3fbec6a3453a1004d04e4f3b5945a02694", size = 213943, upload-time = "2025-07-03T10:53:39.492Z" },
-    { url = "https://files.pythonhosted.org/packages/a3/46/e211e942b22d6af5e0f323faa8a9bc7c447a1cf1923b64c47523f36ed488/coverage-7.9.2-cp313-cp313t-macosx_10_13_x86_64.whl", hash = "sha256:55a28954545f9d2f96870b40f6c3386a59ba8ed50caf2d949676dac3ecab99f5", size = 213088, upload-time = "2025-07-03T10:53:40.874Z" },
-    { url = "https://files.pythonhosted.org/packages/d2/2f/762551f97e124442eccd907bf8b0de54348635b8866a73567eb4e6417acf/coverage-7.9.2-cp313-cp313t-macosx_11_0_arm64.whl", hash = "sha256:cdef6504637731a63c133bb2e6f0f0214e2748495ec15fe42d1e219d1b133f0b", size = 213298, upload-time = "2025-07-03T10:53:42.218Z" },
-    { url = "https://files.pythonhosted.org/packages/7a/b7/76d2d132b7baf7360ed69be0bcab968f151fa31abe6d067f0384439d9edb/coverage-7.9.2-cp313-cp313t-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:bcd5ebe66c7a97273d5d2ddd4ad0ed2e706b39630ed4b53e713d360626c3dbb3", size = 256541, upload-time = "2025-07-03T10:53:43.823Z" },
-    { url = "https://files.pythonhosted.org/packages/a0/17/392b219837d7ad47d8e5974ce5f8dc3deb9f99a53b3bd4d123602f960c81/coverage-7.9.2-cp313-cp313t-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:9303aed20872d7a3c9cb39c5d2b9bdbe44e3a9a1aecb52920f7e7495410dfab8", size = 252761, upload-time = "2025-07-03T10:53:45.19Z" },
-    { url = "https://files.pythonhosted.org/packages/d5/77/4256d3577fe1b0daa8d3836a1ebe68eaa07dd2cbaf20cf5ab1115d6949d4/coverage-7.9.2-cp313-cp313t-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:bc18ea9e417a04d1920a9a76fe9ebd2f43ca505b81994598482f938d5c315f46", size = 254917, upload-time = "2025-07-03T10:53:46.931Z" },
-    { url = "https://files.pythonhosted.org/packages/53/99/fc1a008eef1805e1ddb123cf17af864743354479ea5129a8f838c433cc2c/coverage-7.9.2-cp313-cp313t-musllinux_1_2_aarch64.whl", hash = "sha256:6406cff19880aaaadc932152242523e892faff224da29e241ce2fca329866584", size = 256147, upload-time = "2025-07-03T10:53:48.289Z" },
-    { url = "https://files.pythonhosted.org/packages/92/c0/f63bf667e18b7f88c2bdb3160870e277c4874ced87e21426128d70aa741f/coverage-7.9.2-cp313-cp313t-musllinux_1_2_i686.whl", hash = "sha256:2d0d4f6ecdf37fcc19c88fec3e2277d5dee740fb51ffdd69b9579b8c31e4232e", size = 254261, upload-time = "2025-07-03T10:53:49.99Z" },
-    { url = "https://files.pythonhosted.org/packages/8c/32/37dd1c42ce3016ff8ec9e4b607650d2e34845c0585d3518b2a93b4830c1a/coverage-7.9.2-cp313-cp313t-musllinux_1_2_x86_64.whl", hash = "sha256:c33624f50cf8de418ab2b4d6ca9eda96dc45b2c4231336bac91454520e8d1fac", size = 255099, upload-time = "2025-07-03T10:53:51.354Z" },
-    { url = "https://files.pythonhosted.org/packages/da/2e/af6b86f7c95441ce82f035b3affe1cd147f727bbd92f563be35e2d585683/coverage-7.9.2-cp313-cp313t-win32.whl", hash = "sha256:1df6b76e737c6a92210eebcb2390af59a141f9e9430210595251fbaf02d46926", size = 215440, upload-time = "2025-07-03T10:53:52.808Z" },
-    { url = "https://files.pythonhosted.org/packages/4d/bb/8a785d91b308867f6b2e36e41c569b367c00b70c17f54b13ac29bcd2d8c8/coverage-7.9.2-cp313-cp313t-win_amd64.whl", hash = "sha256:f5fd54310b92741ebe00d9c0d1d7b2b27463952c022da6d47c175d246a98d1bd", size = 216537, upload-time = "2025-07-03T10:53:54.273Z" },
-    { url = "https://files.pythonhosted.org/packages/1d/a0/a6bffb5e0f41a47279fd45a8f3155bf193f77990ae1c30f9c224b61cacb0/coverage-7.9.2-cp313-cp313t-win_arm64.whl", hash = "sha256:c48c2375287108c887ee87d13b4070a381c6537d30e8487b24ec721bf2a781cb", size = 214398, upload-time = "2025-07-03T10:53:56.715Z" },
-    { url = "https://files.pythonhosted.org/packages/d7/85/f8bbefac27d286386961c25515431482a425967e23d3698b75a250872924/coverage-7.9.2-pp39.pp310.pp311-none-any.whl", hash = "sha256:8a1166db2fb62473285bcb092f586e081e92656c7dfa8e9f62b4d39d7e6b5050", size = 204013, upload-time = "2025-07-03T10:54:12.084Z" },
-    { url = "https://files.pythonhosted.org/packages/3c/38/bbe2e63902847cf79036ecc75550d0698af31c91c7575352eb25190d0fb3/coverage-7.9.2-py3-none-any.whl", hash = "sha256:e425cd5b00f6fc0ed7cdbd766c70be8baab4b7839e4d4fe5fac48581dd968ea4", size = 204005, upload-time = "2025-07-03T10:54:13.491Z" },
-]
-
-[package.optional-dependencies]
-toml = [
-    { name = "tomli", marker = "python_full_version <= '3.11'" },
-]
-
 [[package]]
 name = "datalab-python-sdk"
-version = "0.1.
+version = "0.1.3"
 source = { editable = "." }
 dependencies = [
     { name = "aiohttp" },
     { name = "click" },
     { name = "pydantic" },
     { name = "pydantic-settings" },
-    { name = "pytest-asyncio" },
-]
-
-[package.optional-dependencies]
-test = [
-    { name = "aiofiles" },
-    { name = "pytest" },
-    { name = "pytest-asyncio" },
-    { name = "pytest-cov" },
-    { name = "pytest-mock" },
 ]
 
 [package.dev-dependencies]
 dev = [
+    { name = "aiohttp" },
+    { name = "click" },
     { name = "pre-commit" },
     { name = "pytest" },
+    { name = "pytest-asyncio" },
+    { name = "pytest-xdist" },
     { name = "ruff" },
 ]
 
 [package.metadata]
 requires-dist = [
-    { name = "aiofiles", marker = "extra == 'test'", specifier = ">=23.2.0" },
     { name = "aiohttp", specifier = ">=3.12.14" },
     { name = "click", specifier = ">=8.2.1" },
     { name = "pydantic", specifier = ">=2.11.7,<3.0.0" },
     { name = "pydantic-settings", specifier = ">=2.10.1,<3.0.0" },
-    { name = "pytest", marker = "extra == 'test'", specifier = ">=7.4.0" },
-    { name = "pytest-asyncio", specifier = ">=1.0.0" },
-    { name = "pytest-asyncio", marker = "extra == 'test'", specifier = ">=0.21.0" },
-    { name = "pytest-cov", marker = "extra == 'test'", specifier = ">=4.1.0" },
-    { name = "pytest-mock", marker = "extra == 'test'", specifier = ">=3.11.0" },
 ]
-provides-extras = ["test"]
 
 [package.metadata.requires-dev]
 dev = [
+    { name = "aiohttp", specifier = ">=3.12.14" },
+    { name = "click", specifier = ">=8.2.1" },
     { name = "pre-commit", specifier = ">=4.2.0" },
     { name = "pytest", specifier = ">=8.4.1" },
+    { name = "pytest-asyncio", specifier = ">=1.0.0" },
+    { name = "pytest-xdist", specifier = ">=3.8.0" },
     { name = "ruff", specifier = ">=0.12.2" },
 ]
 
@@ -313,13 +222,22 @@ name = "exceptiongroup"
 version = "1.3.0"
 source = { registry = "https://pypi.org/simple" }
 dependencies = [
-    { name = "typing-extensions", marker = "python_full_version < '3.
+    { name = "typing-extensions", marker = "python_full_version < '3.13'" },
 ]
 sdist = { url = "https://files.pythonhosted.org/packages/0b/9f/a65090624ecf468cdca03533906e7c69ed7588582240cfe7cc9e770b50eb/exceptiongroup-1.3.0.tar.gz", hash = "sha256:b241f5885f560bc56a59ee63ca4c6a8bfa46ae4ad651af316d4e81817bb9fd88", size = 29749, upload-time = "2025-05-10T17:42:51.123Z" }
 wheels = [
     { url = "https://files.pythonhosted.org/packages/36/f4/c6e662dade71f56cd2f3735141b265c3c79293c109549c1e6933b0651ffc/exceptiongroup-1.3.0-py3-none-any.whl", hash = "sha256:4d111e6e0c13d0644cad6ddaa7ed0261a0b36971f6d23e7ec9b4b9097da78a10", size = 16674, upload-time = "2025-05-10T17:42:49.33Z" },
 ]
 
+[[package]]
+name = "execnet"
+version = "2.1.1"
+source = { registry = "https://pypi.org/simple" }
+sdist = { url = "https://files.pythonhosted.org/packages/bb/ff/b4c0dc78fbe20c3e59c0c7334de0c27eb4001a2b2017999af398bf730817/execnet-2.1.1.tar.gz", hash = "sha256:5189b52c6121c24feae288166ab41b32549c7e2348652736540b9e6e7d4e72e3", size = 166524, upload-time = "2024-04-08T09:04:19.245Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/43/09/2aea36ff60d16dd8879bdb2f5b3ee0ba8d08cbbdcdfe870e695ce3784385/execnet-2.1.1-py3-none-any.whl", hash = "sha256:26dee51f1b80cebd6d0ca8e74dd8745419761d3bef34163928cbebbdc4749fdc", size = 40612, upload-time = "2024-04-08T09:04:17.414Z" },
+]
+
 [[package]]
 name = "filelock"
 version = "3.18.0"
@@ -849,29 +767,16 @@ wheels = [
 ]
 
 [[package]]
-name = "pytest-
-version = "
-source = { registry = "https://pypi.org/simple" }
-dependencies = [
-    { name = "coverage", extra = ["toml"] },
-    { name = "pluggy" },
-    { name = "pytest" },
-]
-sdist = { url = "https://files.pythonhosted.org/packages/18/99/668cade231f434aaa59bbfbf49469068d2ddd945000621d3d165d2e7dd7b/pytest_cov-6.2.1.tar.gz", hash = "sha256:25cc6cc0a5358204b8108ecedc51a9b57b34cc6b8c967cc2c01a4e00d8a67da2", size = 69432, upload-time = "2025-06-12T10:47:47.684Z" }
-wheels = [
-    { url = "https://files.pythonhosted.org/packages/bc/16/4ea354101abb1287856baa4af2732be351c7bee728065aed451b678153fd/pytest_cov-6.2.1-py3-none-any.whl", hash = "sha256:f5bc4c23f42f1cdd23c70b1dab1bbaef4fc505ba950d53e0081d0730dd7e86d5", size = 24644, upload-time = "2025-06-12T10:47:45.932Z" },
-]
-
-[[package]]
-name = "pytest-mock"
-version = "3.14.1"
+name = "pytest-xdist"
+version = "3.8.0"
 source = { registry = "https://pypi.org/simple" }
 dependencies = [
+    { name = "execnet" },
     { name = "pytest" },
 ]
-sdist = { url = "https://files.pythonhosted.org/packages/
+sdist = { url = "https://files.pythonhosted.org/packages/78/b4/439b179d1ff526791eb921115fca8e44e596a13efeda518b9d845a619450/pytest_xdist-3.8.0.tar.gz", hash = "sha256:7e578125ec9bc6050861aa93f2d59f1d8d085595d6551c2c90b6f4fad8d3a9f1", size = 88069, upload-time = "2025-07-01T13:30:59.346Z" }
 wheels = [
-    { url = "https://files.pythonhosted.org/packages/
+    { url = "https://files.pythonhosted.org/packages/ca/31/d4e37e9e550c2b92a9cbc2e4d0b7420a27224968580b5a447f420847c975/pytest_xdist-3.8.0-py3-none-any.whl", hash = "sha256:202ca578cfeb7370784a8c33d6d05bc6e13b4f25b5053c30a152269fd10f0b88", size = 46396, upload-time = "2025-07-01T13:30:56.632Z" },
 ]
 
 [[package]]
@@ -1,17 +0,0 @@
-Metadata-Version: 2.4
-Name: datalab-python-sdk
-Version: 0.1.1
-Summary: Auto-generated SDK for Datalab API
-License-File: LICENSE
-Requires-Python: >=3.10
-Requires-Dist: aiohttp>=3.12.14
-Requires-Dist: click>=8.2.1
-Requires-Dist: pydantic-settings<3.0.0,>=2.10.1
-Requires-Dist: pydantic<3.0.0,>=2.11.7
-Requires-Dist: pytest-asyncio>=1.0.0
-Provides-Extra: test
-Requires-Dist: aiofiles>=23.2.0; extra == 'test'
-Requires-Dist: pytest-asyncio>=0.21.0; extra == 'test'
-Requires-Dist: pytest-cov>=4.1.0; extra == 'test'
-Requires-Dist: pytest-mock>=3.11.0; extra == 'test'
-Requires-Dist: pytest>=7.4.0; extra == 'test'
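The removed 0.1.1 PKG-INFO above lists pytest-asyncio as an unconditional requirement alongside the `test` extra. A minimal sketch of separating unconditional requirements from extra-gated ones, assuming the metadata can be read as RFC 822-style headers with the standard-library `email` module; the helper name `split_requirements` is hypothetical and the sample text is an abbreviated copy of the fields shown in the hunk:

```python
from email import message_from_string

# Abbreviated copy of the 0.1.1 metadata fields shown in the removed PKG-INFO.
PKG_INFO = """\
Metadata-Version: 2.4
Name: datalab-python-sdk
Version: 0.1.1
Requires-Dist: aiohttp>=3.12.14
Requires-Dist: pytest-asyncio>=1.0.0
Provides-Extra: test
Requires-Dist: pytest-cov>=4.1.0; extra == 'test'
"""


def split_requirements(pkg_info_text: str) -> tuple[list[str], list[str]]:
    """Split Requires-Dist entries into (unconditional, extra-gated) lists."""
    message = message_from_string(pkg_info_text)
    runtime, extras = [], []
    for requirement in message.get_all("Requires-Dist", []):
        # Simplification: any marker mentioning "extra ==" counts as extra-gated.
        if "extra ==" in requirement:
            extras.append(requirement)
        else:
            runtime.append(requirement)
    return runtime, extras


if __name__ == "__main__":
    runtime, extras = split_requirements(PKG_INFO)
    print("unconditional:", runtime)
    print("extra-gated:", extras)
```

The substring check on the marker is a shortcut for illustration; a stricter approach would parse the environment marker properly.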
@@ -1,178 +0,0 @@
-# Datalab SDK
-
-A Python SDK for the [Datalab API](https://www.datalab.to) - a document intelligence platform powered by [marker](https://github.com/VikParuchuri/marker) and [surya](https://github.com/VikParuchuri/surya).
-
-## Installation
-
-```bash
-pip install datalab-sdk
-```
-
-## Quick Start
-
-### Authentication
-
-Get your API key from [https://www.datalab.to/app/keys](https://www.datalab.to/app/keys):
-
-```bash
-export DATALAB_API_KEY="your_api_key_here"
-```
-
-### Basic Usage
-
-```python
-from datalab_sdk import DatalabClient
-
-client = DatalabClient() # use env var from above, or pass api_key="your_api_key_here"
-
-# Convert PDF to markdown
-result = client.convert("document.pdf")
-print(result.markdown)
-
-# OCR a document
-ocr_result = client.ocr("document.pdf")
-print(ocr_result.get_text()) # Get all text as string
-```
-
-### Async Usage
-
-```python
-import asyncio
-from datalab_sdk import AsyncDatalabClient
-
-async def main():
-    async with AsyncDatalabClient(api_key="YOUR_API_KEY") as client:
-        # Convert PDF to markdown
-        result = await client.convert("document.pdf")
-        print(result.markdown)
-
-        # OCR a document
-        ocr_result = await client.ocr("document.pdf")
-        print(f"OCR found {len(ocr_result.pages)} pages")
-
-asyncio.run(main())
-```
-
-## API Methods
-
-### Document Conversion
-
-Convert PDFs, Office documents, and images to markdown, HTML, or JSON.
-
-```python
-# Basic conversion
-result = client.convert("document.pdf")
-
-# With options
-from datalab_sdk import ProcessingOptions
-options = ProcessingOptions(
-    force_ocr=True,
-    output_format="html",
-    use_llm=True,
-    max_pages=10
-)
-result = client.convert("document.pdf", options=options)
-
-# Convert and save automatically
-result = client.convert("document.pdf", save_output="output/result")
-```
-
-### OCR
-
-Extract text with bounding boxes from documents.
-
-```python
-# Basic OCR
-result = client.ocr("document.pdf")
-print(result.get_text())
-
-# OCR with options
-from datalab_sdk import ProcessingOptions
-options = ProcessingOptions(
-    max_pages=2
-)
-result = client.ocr("document.pdf", options)
-
-# OCR and save automatically
-result = client.ocr("document.pdf", save_output="output/ocr_result")
-```
-
-## CLI Usage
-
-The SDK includes a command-line interface:
-
-```bash
-# Convert document to markdown
-datalab convert document.pdf
-
-# OCR with JSON output
-datalab ocr document.pdf --output-format json
-```
-
-## Error Handling
-
-```python
-from datalab_sdk import DatalabAPIError, DatalabTimeoutError
-
-try:
-    result = client.convert("document.pdf")
-except DatalabAPIError as e:
-    print(f"API Error: {e}")
-except DatalabTimeoutError as e:
-    print(f"Timeout: {e}")
-```
-
-## Supported File Types
-
-- **PDF**: `pdf`
-- **Images**: `png`, `jpeg`, `webp`, `gif`, `tiff`
-- **Office Documents**: `docx`, `xlsx`, `pptx`, `doc`, `xls`, `ppt`
-- **Other**: `html`, `epub`, `odt`, `ods`, `odp`
-
-## Rate Limits
-
-- 200 requests per 60 seconds
-- Maximum 200 concurrent requests
-- 200MB file size limit
-
-* email hi@datalab.to for higher limits.
-
-## Examples
-
-### Extract JSON Data
-
-```python
-from datalab_sdk import DatalabClient, ProcessingOptions
-
-client = DatalabClient(api_key="YOUR_API_KEY")
-options = ProcessingOptions(output_format="json")
-result = client.convert("research_paper.pdf", options=options)
-
-# Parse JSON to find equations
-import json
-data = json.loads(result.json)
-equations = [block for block in data if block.get('block_type') == 'Formula']
-print(f"Found {len(equations)} equations")
-```
-
-### Batch Process Documents
-
-```python
-import asyncio
-from pathlib import Path
-from datalab_sdk import AsyncDatalabClient
-
-async def process_documents():
-    files = list(Path("documents/").glob("*.pdf"))
-
-    async with AsyncDatalabClient(api_key="YOUR_API_KEY") as client:
-        for file in files[:5]:
-            result = await client.convert(str(file), save_output=f"output/{file.stem}")
-            print(f"{file.name}: {result.page_count} pages")
-
-asyncio.run(process_documents())
-```
-
-## License
-
-MIT License
@@ -1,71 +0,0 @@
-# Integration Tests
-
-This directory contains integration tests that run against the live Datalab API.
-
-## Setup
-
-1. **Set your API key** as an environment variable:
-   ```bash
-   export DATALAB_API_KEY="your_api_key_here"
-   ```
-
-2. **Optional: Set custom base URL** if testing against a different server:
-   ```bash
-   export DATALAB_BASE_URL="https://custom.datalab.to"
-   ```
-
-## Running the Tests
-
-Run all integration tests:
-```bash
-pytest integration/ -v
-```
-
-Run specific test classes:
-```bash
-pytest integration/test_live_api.py::TestMarkerIntegration -v
-pytest integration/test_live_api.py::TestOCRIntegration -v
-```
-
-Run individual tests:
-```bash
-pytest integration/test_live_api.py::TestMarkerIntegration::test_convert_pdf_basic -v
-```
-
-## Test Coverage
-
-### Marker/Convert Tests
-- **test_convert_pdf_basic**: Basic PDF to markdown conversion
-- **test_convert_office_document**: Word document to HTML conversion
-- **test_convert_async_with_json**: Async PowerPoint to JSON conversion
-
-### OCR Tests
-- **test_ocr_pdf_basic**: Basic PDF OCR with text extraction
-- **test_ocr_image_file**: OCR on PNG image file
-- **test_ocr_async_multiple_pages**: Async OCR with multiple pages
-
-### Error Handling Tests
-- **test_invalid_api_key**: Invalid API key handling
-- **test_nonexistent_file**: Nonexistent file handling
-- **test_unsupported_file_type**: Unsupported file type handling
-
-### Save Output Tests
-- **test_convert_with_save_output**: Automatic file saving for conversion
-- **test_ocr_with_save_output**: Automatic file saving for OCR
-
-## Test Data Files Used
-
-The tests use sample files from the `data/` directory:
-- `adversarial.pdf` - PDF document
-- `bid_evaluation.docx` - Word document
-- `08-Lambda-Calculus.pptx` - PowerPoint presentation
-- `thinkpython.pdf` - PDF book
-- `chi_hind.png` - Image file
-
-## Notes
-
-- Tests use `max_pages=1` or `max_pages=2` to keep API usage minimal
-- LLM mode is disabled to avoid extra costs
-- All tests require a valid API key and will be skipped if not provided
-- Tests make actual API calls and will consume API credits
-- Some tests may take time to complete due to processing delays
File without changes