sinapsis-deepseek-ocr 0.1.0__py3-none-any.whl

@@ -0,0 +1,295 @@
+ Metadata-Version: 2.4
+ Name: sinapsis-deepseek-ocr
+ Version: 0.1.0
+ Summary: A powerful DeepSeek-based Optical Character Recognition (OCR) implementation supporting text extraction and grounding.
+ Author-email: SinapsisAI <dev@sinapsis.tech>
+ Project-URL: Homepage, https://sinapsis.tech
+ Project-URL: Documentation, https://docs.sinapsis.tech/docs
+ Project-URL: Tutorials, https://docs.sinapsis.tech/tutorials
+ Project-URL: Repository, https://github.com/Sinapsis-AI/sinapsis-ocr.git
+ Requires-Python: >=3.10
+ Description-Content-Type: text/markdown
+ License-File: LICENSE
+ Requires-Dist: sinapsis>=0.2.24
+ Requires-Dist: flash-attn>=2.8.3
+ Requires-Dist: torch==2.8.0
+ Requires-Dist: transformers==4.46.3
+ Requires-Dist: tokenizers==0.20.3
+ Requires-Dist: einops>=0.8.1
+ Requires-Dist: addict>=2.4.0
+ Requires-Dist: easydict>=1.13
+ Requires-Dist: accelerate>=1.12.0
+ Requires-Dist: sinapsis-generic-data-tools>=0.1.11
+ Provides-Extra: sinapsis-data-readers
+ Requires-Dist: sinapsis-data-readers[opencv]>=0.1.0; extra == "sinapsis-data-readers"
+ Provides-Extra: sinapsis-data-writers
+ Requires-Dist: sinapsis-data-writers>=0.1.0; extra == "sinapsis-data-writers"
+ Provides-Extra: sinapsis-data-visualization
+ Requires-Dist: sinapsis-data-visualization[visualization-matplotlib]>=0.1.0; extra == "sinapsis-data-visualization"
+ Provides-Extra: all
+ Requires-Dist: sinapsis-deepseek-ocr[sinapsis-data-readers]; extra == "all"
+ Requires-Dist: sinapsis-deepseek-ocr[sinapsis-data-writers]; extra == "all"
+ Requires-Dist: sinapsis-deepseek-ocr[sinapsis-data-visualization]; extra == "all"
+ Dynamic: license-file
+
+ <h1 align="center">
+ <br>
+ <a href="https://sinapsis.tech/">
+ <img
+ src="https://github.com/Sinapsis-AI/brand-resources/blob/main/sinapsis_logo/4x/logo.png?raw=true"
+ alt="" width="300">
+ </a><br>
+ Sinapsis DeepSeek OCR
+ <br>
+ </h1>
+
+ <h4 align="center">DeepSeek-based Optical Character Recognition (OCR) for images</h4>
+
+ <p align="center">
+ <a href="#installation">🐍 Installation</a> •
+ <a href="#features">🚀 Features</a> •
+ <a href="#usage">📚 Usage example</a> •
+ <a href="#webapp">🌐 Webapp</a> •
+ <a href="#documentation">📙 Documentation</a> •
+ <a href="#license">🔍 License</a>
+ </p>
+
+ **Sinapsis DeepSeek OCR** provides a powerful implementation for extracting text from images using DeepSeek's OCR model. It supports optional grounding for bounding box extraction.
+
+ <h2 id="installation">🐍 Installation</h2>
+
+ Install using your package manager of choice. We encourage the use of <code>uv</code>.
+
+ Example with <code>uv</code>:
+
+ ```bash
+ uv pip install sinapsis-deepseek-ocr --extra-index-url https://pypi.sinapsis.tech
+ ```
+ or with raw <code>pip</code>:
+ ```bash
+ pip install sinapsis-deepseek-ocr --extra-index-url https://pypi.sinapsis.tech
+ ```
+
+ > [!IMPORTANT]
+ > Templates may require extra dependencies. For development, we recommend installing the package with all the optional dependencies:
+ >
+
+ with <code>uv</code>:
+
+ ```bash
+ uv pip install sinapsis-deepseek-ocr[all] --extra-index-url https://pypi.sinapsis.tech
+ ```
+ or with raw <code>pip</code>:
+ ```bash
+ pip install sinapsis-deepseek-ocr[all] --extra-index-url https://pypi.sinapsis.tech
+ ```
+
+ > [!TIP]
+ > Use the CLI command ```sinapsis info --all-template-names``` to list the names of all available Templates installed with Sinapsis OCR.
+
+ > [!TIP]
+ > Use the CLI command ```sinapsis info --example-template-config DeepSeekOCRInference``` to produce an example Agent config for the DeepSeekOCRInference template.
+
+ <h2 id="features">🚀 Features</h2>
+
+ <h3>Templates Supported</h3>
+
+ This module includes a template tailored for the DeepSeek OCR engine:
+
+ - **DeepSeekOCRInference**: Uses DeepSeek's OCR model to extract text from images. Supports optional grounding for bounding box extraction.
+
+ <details>
+ <summary><strong><span style="font-size: 1.25em;">DeepSeekOCRInference Attributes</span></strong></summary>
+
+ - **`prompt`** (str): The prompt to send to the model. Defaults to `"OCR the image."`.
+ - **`enable_grounding`** (bool): Whether to enable grounding for bbox extraction. Defaults to `False`.
+ - **`mode`** (str): The inference mode. Options: `"tiny"`, `"small"`, `"gundam"`, `"base"`, `"large"`. Defaults to `"base"`.
+ - **`init_args`** (DeepSeekOCRInitArgs): Initialization arguments for the model, including:
+   - `pretrained_model_name_or_path`: Model identifier. Defaults to `"deepseek-ai/DeepSeek-OCR"`.
+   - `torch_dtype`: Model precision (`"float16"`, `"bfloat16"`, `"auto"`). Defaults to `"auto"`.
+   - `attn_implementation`: Attention implementation. Defaults to `"flash_attention_2"`.
+ - **Note**: This model requires CUDA. CPU inference is not supported.
+
+ </details>
+
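The attribute list above can be mirrored in a small validation sketch. The class names `InitArgsSketch` and `InferenceAttributesSketch` are hypothetical stand-ins, not the package's own `DeepSeekOCRInitArgs` or template classes; only the field names, allowed values, and defaults are taken from the list.

```python
from dataclasses import dataclass, field

# Allowed values copied from the documented attribute list.
VALID_MODES = ("tiny", "small", "gundam", "base", "large")
VALID_DTYPES = ("float16", "bfloat16", "auto")


@dataclass
class InitArgsSketch:
    """Hypothetical stand-in for DeepSeekOCRInitArgs (not the package's class)."""

    pretrained_model_name_or_path: str = "deepseek-ai/DeepSeek-OCR"
    torch_dtype: str = "auto"
    attn_implementation: str = "flash_attention_2"

    def __post_init__(self) -> None:
        if self.torch_dtype not in VALID_DTYPES:
            raise ValueError(f"torch_dtype must be one of {VALID_DTYPES}")


@dataclass
class InferenceAttributesSketch:
    """Hypothetical mirror of the documented DeepSeekOCRInference attributes."""

    prompt: str = "OCR the image."
    enable_grounding: bool = False
    mode: str = "base"
    init_args: InitArgsSketch = field(default_factory=InitArgsSketch)

    def __post_init__(self) -> None:
        if self.mode not in VALID_MODES:
            raise ValueError(f"mode must be one of {VALID_MODES}")


# Override two attributes; everything else keeps its documented default.
attrs = InferenceAttributesSketch(prompt="Perform OCR.", mode="gundam")
print(attrs.mode, attrs.init_args.torch_dtype)
```

Rejecting an unknown `mode` or `torch_dtype` before a config is written is one way to catch typos early; the real template may validate differently.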
+ <h2 id="usage">📚 Usage example</h2>
+
+ <details>
+ <summary><strong><span style="font-size: 1.4em;">Text Extraction (No Grounding)</span></strong></summary>
+
+ ```yaml
+ agent:
+   name: deepseek_ocr_agent
+   description: Agent to run inference with DeepSeek OCR
+
+ templates:
+   - template_name: InputTemplate
+     class_name: InputTemplate
+     attributes: {}
+
+   - template_name: FolderImageDatasetCV2
+     class_name: FolderImageDatasetCV2
+     template_input: InputTemplate
+     attributes:
+       data_dir: dataset/input
+
+   - template_name: DeepSeekOCRInference
+     class_name: DeepSeekOCRInference
+     template_input: FolderImageDatasetCV2
+     attributes:
+       prompt: "Perform OCR."
+       enable_grounding: false
+       mode: base
+ ```
+ </details>
+
+ <details>
+ <summary><strong><span style="font-size: 1.4em;">With Grounding (Bounding Boxes)</span></strong></summary>
+
+ ```yaml
+ agent:
+   name: deepseek_ocr_grounding_agent
+   description: Agent with grounding for bbox extraction
+
+ templates:
+   - template_name: InputTemplate
+     class_name: InputTemplate
+     attributes: {}
+
+   - template_name: FolderImageDatasetCV2
+     class_name: FolderImageDatasetCV2
+     template_input: InputTemplate
+     attributes:
+       data_dir: dataset/input
+
+   - template_name: DeepSeekOCRInference
+     class_name: DeepSeekOCRInference
+     template_input: FolderImageDatasetCV2
+     attributes:
+       prompt: "Convert the document to markdown."
+       enable_grounding: true
+       mode: base
+
+   - template_name: BBoxDrawer
+     class_name: BBoxDrawer
+     template_input: DeepSeekOCRInference
+     attributes:
+       draw_confidence: True
+       draw_extra_labels: True
+
+   - template_name: ImageSaver
+     class_name: ImageSaver
+     template_input: BBoxDrawer
+     attributes:
+       save_dir: output
+       root_dir: dataset
+ ```
+ </details>
+
+ To run, simply use:
+
+ ```bash
+ sinapsis run name_of_the_config.yml
+ ```
+
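Both example configs chain templates through `template_input`. As a rough illustration, the sketch below checks that each `template_input` names a template defined earlier in the list. That ordering rule is an assumption drawn from how the two examples are written, not a documented Sinapsis requirement, and `check_template_chain` is a hypothetical helper rather than part of the package.

```python
def check_template_chain(templates: list[dict]) -> list[str]:
    """Return template names in order; raise if a template_input is undefined."""
    seen: list[str] = []
    for entry in templates:
        source = entry.get("template_input")
        # Assumption: a template may only read from one defined before it.
        if source is not None and source not in seen:
            name = entry["template_name"]
            raise ValueError(f"{name} reads from unknown template {source!r}")
        seen.append(entry["template_name"])
    return seen


# The "Text Extraction (No Grounding)" config, written as plain Python data.
templates = [
    {"template_name": "InputTemplate", "class_name": "InputTemplate", "attributes": {}},
    {
        "template_name": "FolderImageDatasetCV2",
        "class_name": "FolderImageDatasetCV2",
        "template_input": "InputTemplate",
        "attributes": {"data_dir": "dataset/input"},
    },
    {
        "template_name": "DeepSeekOCRInference",
        "class_name": "DeepSeekOCRInference",
        "template_input": "FolderImageDatasetCV2",
        "attributes": {"prompt": "Perform OCR.", "enable_grounding": False, "mode": "base"},
    },
]

order = check_template_chain(templates)
print(" -> ".join(order))
```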
+ <h2 id="webapp">🌐 Webapp</h2>
+
+ The webapp provides a simple interface to extract text from images using DeepSeek OCR. Upload your image, and the app will process it and display the detected text.
+
+ > [!IMPORTANT]
+ > To run the app, first clone the sinapsis-ocr repository:
+
+ ```bash
+ git clone https://github.com/Sinapsis-ai/sinapsis-ocr.git
+ cd sinapsis-ocr
+ ```
+
+ > [!NOTE]
+ > If you'd like to enable external app sharing in Gradio, run `export GRADIO_SHARE_APP=True`.
+
+ > [!IMPORTANT]
+ > To use DeepSeek OCR in the webapp, set the environment variable:
+ > `AGENT_CONFIG_PATH=/app/packages/sinapsis_deepseek_ocr/src/sinapsis_deepseek_ocr/configs/inference.yaml`
+
+ <details>
+ <summary id="docker"><strong><span style="font-size: 1.4em;">🐳 Docker</span></strong></summary>
+
+ **IMPORTANT**: This Docker image depends on the `sinapsis:base` image. Please refer to the official [sinapsis](https://github.com/Sinapsis-ai/sinapsis?tab=readme-ov-file#docker) instructions to build it with Docker.
+
+ 1. **Build the sinapsis-ocr image**:
+
+ ```bash
+ docker compose -f docker/compose.yaml build
+ ```
+
+ 2. **Start the app container**:
+
+ ```bash
+ docker compose -f docker/compose_app.yaml up
+ ```
+
+ 3. **Check the status**:
+
+ ```bash
+ docker logs -f sinapsis-ocr-app
+ ```
+
+ 4. The logs will display the URL to access the webapp, e.g.:
+
+ **NOTE**: The URL may differ; check the output of the logs.
+
+ ```bash
+ Running on local URL: http://127.0.0.1:7860
+ ```
+
+ 5. To stop the app:
+
+ ```bash
+ docker compose -f docker/compose_app.yaml down
+ ```
+
+ </details>
+
+ <details>
+ <summary id="uv"><strong><span style="font-size: 1.4em;">💻 UV</span></strong></summary>
+
+ To run the webapp using the <code>uv</code> package manager:
+
+ 1. **Create the virtual environment and sync the dependencies**:
+
+ ```bash
+ uv sync --frozen
+ ```
+
+ 2. **Install packages**:
+ ```bash
+ uv pip install sinapsis-deepseek-ocr[all] --extra-index-url https://pypi.sinapsis.tech
+ ```
+ 3. **Run the webapp**:
+
+ ```bash
+ uv run webapps/gradio_ocr.py
+ ```
+
+ 4. **The terminal will display the URL to access the webapp, e.g.**:
+
+ ```bash
+ Running on local URL: http://127.0.0.1:7860
+ ```
+ NOTE: The URL may differ; check the terminal output.
+
+ 5. To stop the app, press `Control + C` in the terminal.
+
+ </details>
+
+ <h2 id="documentation">📙 Documentation</h2>
+
+ Documentation for this and other sinapsis packages is available on the [sinapsis website](https://docs.sinapsis.tech/docs).
+
+ Tutorials for different projects within sinapsis are available on the [sinapsis tutorials page](https://docs.sinapsis.tech/tutorials).
+
+ <h2 id="license">🔍 License</h2>
+
+ This project is licensed under the AGPLv3 license, which encourages open collaboration and sharing. For more details, please refer to the [LICENSE](LICENSE) file.
+
+ For commercial use, please refer to our [official Sinapsis website](https://sinapsis.tech) for information on obtaining a commercial license.
@@ -0,0 +1,14 @@
+ sinapsis_deepseek_ocr/__init__.py,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0
+ sinapsis_deepseek_ocr/helpers/__init__.py,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0
+ sinapsis_deepseek_ocr/helpers/bbox_utils.py,sha256=3lkYXPUP_bM0G1PClBA8364dtU9x9JCq_r_FaDyqX7I,565
+ sinapsis_deepseek_ocr/helpers/grounding_parser.py,sha256=nFKjP1oR1wMgIGhPImM49I2DJ_8ThESsUER8rDUaVpA,1176
+ sinapsis_deepseek_ocr/helpers/mode_registry.py,sha256=Waj27UsI9dOB9ATjfIxJgM-vOXUM_9JPlfqKzfmvBnY,1267
+ sinapsis_deepseek_ocr/helpers/schemas.py,sha256=b6TsFzVii6wYe5FvoW4r1y1wg2906D8anDboDRqTQw4,2396
+ sinapsis_deepseek_ocr/helpers/tags.py,sha256=-drXXyKOvPDlA5ykdpNMGrRd1nJJ7JZQ6xNUmU1FI-I,547
+ sinapsis_deepseek_ocr/templates/__init__.py,sha256=4bVYh-bWmJj-g9ysLoFLcfIlHQzqlg-Pl-BADYuBl74,498
+ sinapsis_deepseek_ocr/templates/deepseek_ocr_inference.py,sha256=PrHYjzP9TzjKVZg7bwApocq7MSmoyv2Mi8mVluWt6KM,7809
+ sinapsis_deepseek_ocr-0.1.0.dist-info/licenses/LICENSE,sha256=ILBn-G3jdarm2w8oOrLmXeJNU3czuJvVhDLBASWdhM8,34522
+ sinapsis_deepseek_ocr-0.1.0.dist-info/METADATA,sha256=2hFdUbA425nS20ZrRobECG0D57jZg86m13uHh6enWkk,9304
+ sinapsis_deepseek_ocr-0.1.0.dist-info/WHEEL,sha256=_zCd3N1l69ArxyTb8rzEoP9TpbYXkqRFSNOD5OuxnTs,91
+ sinapsis_deepseek_ocr-0.1.0.dist-info/top_level.txt,sha256=hb_Gw4JQgFdjuF9wjm4wwajLTiEhI79jzug9vgRbHv0,22
+ sinapsis_deepseek_ocr-0.1.0.dist-info/RECORD,,
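Each RECORD entry above has the form `path,sha256=<digest>,size`. In the wheel binary distribution format, the digest is URL-safe base64 with the trailing `=` padding removed. The sketch below reproduces that encoding and checks it against the zero-byte `__init__.py` entry; `record_hash` and `parse_record_line` are hypothetical helper names, not part of any packaging library.

```python
import base64
import hashlib
from typing import Optional


def record_hash(data: bytes) -> str:
    """Encode a SHA-256 digest the way wheel RECORD files do."""
    digest = hashlib.sha256(data).digest()
    # URL-safe base64 with the trailing '=' padding removed.
    encoded = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return f"sha256={encoded}"


def parse_record_line(line: str) -> tuple[str, str, Optional[int]]:
    """Split one RECORD line into (path, hash, size); size is None if empty."""
    # Split from the right so a comma inside the path would not break parsing.
    path, digest, size = line.rsplit(",", 2)
    return path, digest, int(size) if size else None


entry = (
    "sinapsis_deepseek_ocr/__init__.py,"
    "sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0"
)
path, digest, size = parse_record_line(entry)

# The recorded size is 0, so the digest should match that of empty content.
print(path, size, digest == record_hash(b""))
```

The same comparison works for any other entry once the corresponding file's bytes are read from the unpacked wheel. The final `RECORD,,` line carries no hash or size because a file cannot record its own digest.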
@@ -0,0 +1,5 @@
+ Wheel-Version: 1.0
+ Generator: setuptools (80.9.0)
+ Root-Is-Purelib: true
+ Tag: py3-none-any
+