mistral-ai-ocr 1.0__py3-none-any.whl → 1.1__py3-none-any.whl
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- {mistral_ai_ocr-1.0.dist-info → mistral_ai_ocr-1.1.dist-info}/METADATA +38 -38
- mistral_ai_ocr-1.1.dist-info/RECORD +7 -0
- mistral_ai_ocr-1.0.dist-info/RECORD +0 -7
- {mistral_ai_ocr-1.0.dist-info → mistral_ai_ocr-1.1.dist-info}/WHEEL +0 -0
- {mistral_ai_ocr-1.0.dist-info → mistral_ai_ocr-1.1.dist-info}/entry_points.txt +0 -0
- {mistral_ai_ocr-1.0.dist-info → mistral_ai_ocr-1.1.dist-info}/top_level.txt +0 -0
@@ -1,12 +1,48 @@
 Metadata-Version: 2.1
 Name: mistral-ai-ocr
-Version: 1.0
+Version: 1.1
 Description-Content-Type: text/markdown
 Requires-Dist: mistralai
 Requires-Dist: python-dotenv
 
 # Mistral AI OCR
-This is a simple script that uses the Mistral AI OCR API to
+This is a simple script that uses the Mistral AI OCR API to get the Markdown text from a PDF or image file
+
+# Usage
+
+## Install the Requirements
+
+To install the necessary requirements, run the following command:
+
+```sh
+pip install mistral-ai-ocr
+```
+
+## Typical Usage
+
+```sh
+mistral-ai-ocr paper.pdf
+mistral-ai-ocr paper.pdf --api-key jrWjJE5lFketfB2sA6vvhQK2SoHQ6R39
+mistral-ai-ocr paper.pdf -o revision
+mistral-ai-ocr paper.pdf -e
+mistral-ai-ocr paper.pdf -m FULL
+mistral-ai-ocr page74.jpg -e
+mistral-ai-ocr -j paper.json
+mistral-ai-ocr -j paper.json -m TEXT_NO_PAGES -n
+```
+
+## Arguments
+
+| Argument || Description |
+|-|-|-|
+| | | input PDF or image file |
+| -k API_KEY | --api-key API_KEY | Mistral API key, can be set via the **MISTRAL_API_KEY** environment variable |
+| -o OUTPUT | --output OUTPUT | output directory path. If not set, a directory will be created in the current working directory using the same stem (filename without extension) as the input file |
+| -j JSON_OCR_RESPONSE | --json-ocr-response JSON_OCR_RESPONSE | path from which to load a pre-existing JSON OCR response (any input file will be ignored) |
+| -m MODE | --mode MODE | mode of operation: either the name or numerical value of the mode. _Defaults to FULL_NO_PAGES_ |
+| -s PAGE_SEPARATOR | --page-separator PAGE_SEPARATOR | page separator to use when writing the Markdown file. _Defaults to `\n`_ |
+| -n | --no-json | do not write the JSON OCR response to a file. By default, the response is written |
+| -e | --load-dot-env | load the .env file from the current directory using [`python-dotenv`](https://pypi.org/project/python-dotenv/), to retrieve the Mistral API key |
 
 ## Modes
 
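The Arguments table added in this hunk maps naturally onto an `argparse` parser. The block below is a minimal sketch of that mapping, not the package's actual `__main__.py` (its source is not part of this diff). The flags, defaults, and the `MISTRAL_API_KEY` variable come from the table; the function names `build_parser` and `resolve_output_dir`, the positional name `input`, and the fallback to the JSON file's stem when no input file is given are assumptions made for illustration.

```python
import argparse
import os
from pathlib import Path


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented flags (not the package's own code)."""
    parser = argparse.ArgumentParser(prog="mistral-ai-ocr")
    # Optional positional: it may be omitted when -j is used.
    parser.add_argument("input", nargs="?", type=Path, help="input PDF or image file")
    # The key may come from the command line or the MISTRAL_API_KEY environment variable.
    parser.add_argument("-k", "--api-key", default=os.environ.get("MISTRAL_API_KEY"))
    parser.add_argument("-o", "--output", type=Path, help="output directory path")
    parser.add_argument("-j", "--json-ocr-response", type=Path)
    parser.add_argument("-m", "--mode", default="FULL_NO_PAGES")
    parser.add_argument("-s", "--page-separator", default="\n")
    parser.add_argument("-n", "--no-json", action="store_true")
    # Only parsed here; actually loading the .env file is left out of this sketch.
    parser.add_argument("-e", "--load-dot-env", action="store_true")
    return parser


def resolve_output_dir(args: argparse.Namespace) -> Path:
    """Return --output if given, else <current directory>/<stem of the source file>."""
    if args.output is not None:
        return args.output
    # Assumption: with -j and no input file, fall back to the JSON file's stem.
    source = args.input if args.input is not None else args.json_ocr_response
    if source is None:
        raise SystemExit("either an input file or --json-ocr-response is required")
    return Path.cwd() / source.stem
```

Under these assumptions, `build_parser().parse_args(["paper.pdf", "-o", "revision"])` yields an output directory of `revision`, while omitting `-o` yields a `paper` directory under the current working directory, matching the behaviour the table describes.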
@@ -108,42 +144,6 @@ paper
 
 By default, the JSON response from the Mistral AI OCR API is saved in the output directory. To disable JSON output, use the `-n` or `--no-json` argument. To experiment with a different **mode** without using additional API calls, reuse an existing JSON response instead of the original input file
 
-# Usage
-
-## Install the Requirements
-
-To install the necessary requirements, run the following command:
-
-```sh
-pip install mistral-ai-ocr
-```
-
-## Typical Usage
-
-```sh
-mistral-ai-ocr paper.pdf
-mistral-ai-ocr paper.pdf --api-key jrWjJE5lFketfB2sA6vvhQK2SoHQ6R39
-mistral-ai-ocr paper.pdf -o revision
-mistral-ai-ocr paper.pdf -e
-mistral-ai-ocr paper.pdf -m FULL
-mistral-ai-ocr page74.jpg -e
-mistral-ai-ocr -j paper.json
-mistral-ai-ocr -j paper.json -m TEXT_NO_PAGES -n
-```
-
-## Arguments
-
-| Argument || Description |
-|-|-|-|
-| | | input PDF or image file |
-| -k API_KEY | --api-key API_KEY | Mistral API key, can be set via the **MISTRAL_API_KEY** environment variable |
-| -o OUTPUT | --output OUTPUT | output directory path. If not set, a directory will be created in the current working directory using the same stem (filename without extension) as the input file |
-| -j JSON_OCR_RESPONSE | --json-ocr-response JSON_OCR_RESPONSE | path from which to load a pre-existing JSON OCR response (any input file will be ignored) |
-| -m MODE | --mode MODE | mode of operation: either the name or numerical value of the mode. _Defaults to FULL_NO_PAGES_ |
-| -s PAGE_SEPARATOR | --page-separator PAGE_SEPARATOR | page separator to use when writing the Markdown file. _Defaults to `\n`_ |
-| -n | --no-json | do not write the JSON OCR response to a file. By default, the response is written |
-| -e | --load-dot-env | load the .env file from the current directory using [`python-dotenv`](https://pypi.org/project/python-dotenv/), to retrieve the Mistral API key |
-
 ### Mistral AI API Key
 
 To obtain an API key, you need a [Mistral AI](https://auth.mistral.ai/ui/registration) account. Then visit [https://admin.mistral.ai/organization/api-keys](https://admin.mistral.ai/organization/api-keys) and click the **Create new key** button
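The context paragraph in this hunk describes reusing a saved JSON response to try another mode without a new API call. The block below is a rough sketch of that idea, not the package's implementation. It assumes the saved file has a top-level `pages` list whose items each carry a `markdown` string; that layout is an assumption about the response, since the diff does not show it. `load_response` and `join_pages` are hypothetical names, and the package's modes are not reproduced here.

```python
import json
from pathlib import Path


def load_response(json_path: Path) -> dict:
    """Read a previously saved OCR response from disk (no API call involved)."""
    return json.loads(json_path.read_text(encoding="utf-8"))


def join_pages(response: dict, page_separator: str = "\n") -> str:
    """Join the per-page Markdown with the chosen separator.

    Assumes response["pages"][i]["markdown"] holds each page's text;
    missing keys are treated as empty rather than raising.
    """
    pages = response.get("pages", [])
    return page_separator.join(page.get("markdown", "") for page in pages)
```

A call such as `join_pages(load_response(Path("paper.json")), "\n")` would then rebuild the text from the file on disk, which is the kind of offline re-processing the `-j` option is described as enabling.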
@@ -0,0 +1,7 @@
+mistral_ai_ocr/__init__.py,sha256=wOwicDbjQMcMWubEnPogXRyAiV6JVrJhiXZmWRkPsVw,9248
+mistral_ai_ocr/__main__.py,sha256=5Jrno0r448BT2HdNrXLi2H6RLkUqP76IV4kmB3HuJ6g,3247
+mistral_ai_ocr-1.1.dist-info/METADATA,sha256=lxFIzL-zbSIqM-_IUVRuhniCmWzM6PwlPACeKbkdVmk,4410
+mistral_ai_ocr-1.1.dist-info/WHEEL,sha256=G16H4A3IeoQmnOrYV4ueZGKSjhipXx8zc8nu9FGlvMA,92
+mistral_ai_ocr-1.1.dist-info/entry_points.txt,sha256=m-ENd87vam6706-mmfzVfBq5q028TKM-7SMLUakWd-U,64
+mistral_ai_ocr-1.1.dist-info/top_level.txt,sha256=4X0WShtu4WEMtVriRP9X2Fia0ORjbAK03bRYimMvRHA,15
+mistral_ai_ocr-1.1.dist-info/RECORD,,
@@ -1,7 +0,0 @@
-mistral_ai_ocr/__init__.py,sha256=wOwicDbjQMcMWubEnPogXRyAiV6JVrJhiXZmWRkPsVw,9248
-mistral_ai_ocr/__main__.py,sha256=5Jrno0r448BT2HdNrXLi2H6RLkUqP76IV4kmB3HuJ6g,3247
-mistral_ai_ocr-1.0.dist-info/METADATA,sha256=hqq21JC0r4mFIy0vY7UYm9R_Mbcr4ebgYawnyGNiPyo,4401
-mistral_ai_ocr-1.0.dist-info/WHEEL,sha256=G16H4A3IeoQmnOrYV4ueZGKSjhipXx8zc8nu9FGlvMA,92
-mistral_ai_ocr-1.0.dist-info/entry_points.txt,sha256=m-ENd87vam6706-mmfzVfBq5q028TKM-7SMLUakWd-U,64
-mistral_ai_ocr-1.0.dist-info/top_level.txt,sha256=4X0WShtu4WEMtVriRP9X2Fia0ORjbAK03bRYimMvRHA,15
-mistral_ai_ocr-1.0.dist-info/RECORD,,
File without changes
File without changes
File without changes
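The RECORD listings above follow the wheel convention of `path,sha256=<digest>,size`, where the digest is the URL-safe base64 encoding of the SHA-256 hash with the trailing `=` padding removed. The block below is a small sketch for comparing two such listings while ignoring the version embedded in the `.dist-info` directory name. The helper names (`record_hash`, `parse_record`, `strip_version`, `changed_entries`) are hypothetical and not part of any packaging tool.

```python
import base64
import csv
import hashlib
import io
import re
from typing import Optional


def record_hash(data: bytes) -> str:
    """Hash bytes the way RECORD stores them: urlsafe base64 SHA-256, padding stripped."""
    digest = hashlib.sha256(data).digest()
    return "sha256=" + base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def parse_record(text: str) -> dict[str, tuple[str, Optional[int]]]:
    """Parse RECORD text into {path: (hash, size)}; RECORD's own row has empty fields."""
    entries = {}
    for row in csv.reader(io.StringIO(text)):
        if not row:
            continue
        path, file_hash, size = row
        entries[path] = (file_hash, int(size) if size else None)
    return entries


def strip_version(path: str) -> str:
    """Turn 'name-1.1.dist-info/X' into 'name.dist-info/X' so versions line up."""
    return re.sub(r"-[^-/]+\.dist-info/", ".dist-info/", path)


def changed_entries(old_text: str, new_text: str) -> list[str]:
    """List normalized paths whose hash or size differs between two RECORD files."""
    old = {strip_version(p): v for p, v in parse_record(old_text).items()}
    new = {strip_version(p): v for p, v in parse_record(new_text).items()}
    return sorted(p for p in old.keys() | new.keys() if old.get(p) != new.get(p))
```

Comparing the two listings in this diff by eye gives the same picture such a check would: every hash and size is identical between 1.0 and 1.1 except the METADATA entry, which is consistent with the three files reported as unchanged.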