parseur-py 1.5.2__tar.gz → 1.6.0__tar.gz
- parseur_py-1.6.0/PKG-INFO +462 -0
- parseur_py-1.6.0/README.md +441 -0
- parseur_py-1.6.0/parseur/__init__.py +67 -0
- {parseur_py-1.5.2 → parseur_py-1.6.0}/parseur/cli.py +222 -11
- parseur_py-1.6.0/parseur/client.py +71 -0
- parseur_py-1.6.0/parseur/document.py +428 -0
- parseur_py-1.6.0/parseur/export_config.py +171 -0
- parseur_py-1.6.0/parseur/mailbox.py +599 -0
- parseur_py-1.6.0/parseur/mcp_server.py +1327 -0
- parseur_py-1.6.0/parseur/parser_field.py +214 -0
- {parseur_py-1.5.2 → parseur_py-1.6.0}/parseur/schemas/document.py +16 -0
- parseur_py-1.6.0/parseur/schemas/export_config.py +61 -0
- parseur_py-1.6.0/parseur/schemas/mailbox.py +345 -0
- parseur_py-1.6.0/parseur/schemas/paserfield.py +93 -0
- {parseur_py-1.5.2 → parseur_py-1.6.0}/parseur/server.py +1 -1
- {parseur_py-1.5.2 → parseur_py-1.6.0}/parseur/utils.py +19 -0
- {parseur_py-1.5.2 → parseur_py-1.6.0}/parseur/webhook.py +30 -13
- parseur_py-1.6.0/parseur_py.egg-info/PKG-INFO +462 -0
- {parseur_py-1.5.2 → parseur_py-1.6.0}/parseur_py.egg-info/SOURCES.txt +13 -0
- parseur_py-1.6.0/parseur_py.egg-info/entry_points.txt +4 -0
- {parseur_py-1.5.2 → parseur_py-1.6.0}/parseur_py.egg-info/requires.txt +4 -0
- {parseur_py-1.5.2 → parseur_py-1.6.0}/pyproject.toml +12 -2
- parseur_py-1.6.0/tests/test_api_key_override.py +72 -0
- parseur_py-1.6.0/tests/test_cli.py +73 -0
- {parseur_py-1.5.2 → parseur_py-1.6.0}/tests/test_document.py +115 -16
- parseur_py-1.6.0/tests/test_export_config.py +165 -0
- parseur_py-1.6.0/tests/test_mailbox.py +766 -0
- parseur_py-1.6.0/tests/test_mcp_server.py +143 -0
- parseur_py-1.6.0/tests/test_parser_field.py +142 -0
- parseur_py-1.5.2/PKG-INFO +0 -282
- parseur_py-1.5.2/README.md +0 -264
- parseur_py-1.5.2/parseur/__init__.py +0 -31
- parseur_py-1.5.2/parseur/client.py +0 -51
- parseur_py-1.5.2/parseur/document.py +0 -217
- parseur_py-1.5.2/parseur/mailbox.py +0 -77
- parseur_py-1.5.2/parseur/schemas/mailbox.py +0 -177
- parseur_py-1.5.2/parseur/schemas/paserfield.py +0 -25
- parseur_py-1.5.2/parseur_py.egg-info/PKG-INFO +0 -282
- parseur_py-1.5.2/parseur_py.egg-info/entry_points.txt +0 -2
- parseur_py-1.5.2/tests/test_mailbox.py +0 -198
- {parseur_py-1.5.2 → parseur_py-1.6.0}/LICENSE +0 -0
- {parseur_py-1.5.2 → parseur_py-1.6.0}/parseur/config.py +0 -0
- {parseur_py-1.5.2 → parseur_py-1.6.0}/parseur/decorator.py +0 -0
- {parseur_py-1.5.2 → parseur_py-1.6.0}/parseur/event.py +0 -0
- {parseur_py-1.5.2 → parseur_py-1.6.0}/parseur/schemas/__init__.py +0 -0
- {parseur_py-1.5.2 → parseur_py-1.6.0}/parseur/schemas/webhook.py +0 -0
- {parseur_py-1.5.2 → parseur_py-1.6.0}/parseur_py.egg-info/dependency_links.txt +0 -0
- {parseur_py-1.5.2 → parseur_py-1.6.0}/parseur_py.egg-info/top_level.txt +0 -0
- {parseur_py-1.5.2 → parseur_py-1.6.0}/setup.cfg +0 -0
- {parseur_py-1.5.2 → parseur_py-1.6.0}/tests/test_webhook.py +0 -0
@@ -0,0 +1,462 @@
Metadata-Version: 2.4
Name: parseur-py
Version: 1.6.0
Summary: A Python client for the Parseur.com API to manage mailboxes, documents, uploads, and listen for new parsing events.
Author-email: Parseur Team <admin@parseur.com>
Project-URL: Homepage, https://github.com/parseur/parseur-py
Project-URL: Repository, https://github.com/parseur/parseur-py
Project-URL: Issues, https://github.com/parseur/parseur-py/issues
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: click>=8.2.1
Requires-Dist: requests>=2.31.0
Requires-Dist: marshmallow>=4.0.0
Provides-Extra: listener
Requires-Dist: Flask>=3.1.0; extra == "listener"
Provides-Extra: mcp
Requires-Dist: mcp>=1.18.0; extra == "mcp"
Requires-Dist: pydantic>=2.0; extra == "mcp"
Dynamic: license-file
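The metadata above uses RFC 822-style headers, so the standard library's header parser can read it. The sketch below groups the `Requires-Dist` entries by the extra that gates them; the helper name `requirements_by_extra` and the simple marker handling (only `extra == "..."` markers are recognised) are assumptions made for illustration, not part of the package.

```python
from email.parser import HeaderParser

# A trimmed copy of the metadata above; only the fields used here.
PKG_INFO = """\
Metadata-Version: 2.4
Name: parseur-py
Version: 1.6.0
Requires-Python: >=3.10
Requires-Dist: click>=8.2.1
Requires-Dist: requests>=2.31.0
Requires-Dist: marshmallow>=4.0.0
Provides-Extra: listener
Requires-Dist: Flask>=3.1.0; extra == "listener"
Provides-Extra: mcp
Requires-Dist: mcp>=1.18.0; extra == "mcp"
Requires-Dist: pydantic>=2.0; extra == "mcp"
"""


def requirements_by_extra(pkg_info: str) -> dict[str, list[str]]:
    """Group Requires-Dist entries under the extra that gates them ('' = base)."""
    headers = HeaderParser().parsestr(pkg_info)
    grouped: dict[str, list[str]] = {"": []}
    for extra in headers.get_all("Provides-Extra", []):
        grouped[extra] = []
    for entry in headers.get_all("Requires-Dist", []):
        requirement, _, marker = entry.partition(";")
        extra = ""
        if "extra ==" in marker:
            # Hypothetical simplification: handles only `extra == "name"` markers.
            extra = marker.split("==", 1)[1].strip().strip('"')
        grouped.setdefault(extra, []).append(requirement.strip())
    return grouped


deps = requirements_by_extra(PKG_INFO)
```

Here `deps[""]` holds the base dependencies, while `deps["listener"]` and `deps["mcp"]` hold what each optional extra adds.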

<!-- mcp-name: io.github.parseur/parseur-py -->

# 🤖🧙parseur-py

**parseur-py** is a modern Python client for the [Parseur](https://parseur.com) API.

It lets you **manage mailboxes, documents, uploads, and webhooks** programmatically or from the command line.

Built to help you automate document parsing at scale, parseur-py makes integrating with Parseur fast, easy, and Pythonic.

[](https://github.com/parseur/parseur-py)
[](https://badge.fury.io/py/parseur-py)
[](https://opensource.org/licenses/MIT)
[](https://parseur-py.readthedocs.io/en/latest/?badge=latest)
[](https://pepy.tech/projects/parseur-py)

---

## ✨ Features

- ✅ List, search, and sort mailboxes
- ✅ Get mailbox details and schema
- ✅ List, search, filter, and sort documents
- ✅ Upload documents by file or email content
- ✅ Reprocess, skip, copy, or delete documents
- ✅ Manage custom webhooks for real-time events
- ✅ Listen to events in real time with a temporary webhook & tunnel
- ✅ Fully-featured **Command Line Interface (CLI)**
- ✅ Built-in **MCP server** to drive Parseur from AI assistants (Claude, Cursor, …)

---

## ⚠️ Disclaimer about Localtunnel

When using the `parseur listen` command (with event listener support), your data is forwarded through **localtunnel servers**.

These servers are **not affiliated with Parseur** and are **not covered** by Parseur’s [Privacy Policy](https://parseur.com/privacy) or [Data Processing Agreement](https://parseur.com/dpa).

Data transmitted through localtunnel is **not encrypted end-to-end**.

➡️ **Use this feature at your own risk.**

For production-grade setups, we strongly recommend configuring your own secure webhook endpoint instead of relying on localtunnel.

---

## 🚀 Quick Start

### Install the package

```bash
pip install parseur-py
```

With event listener support (Flask + localtunnel)

```bash
pip install "parseur-py[listener]"
```

With MCP server support (use Parseur from AI assistants)

```bash
pip install "parseur-py[mcp]"
```

### Install the package from source

```bash
pip install -e .
```

### Build documentation

```bash
pip install -r requirements-doc.txt
cd docs
make html
```

### Run the tests

Unit tests run fully offline:

```bash
pytest
```

Integration tests hit the real Parseur API. They create and delete real
resources, so they're skipped unless credentials are provided **via the
environment** (never committed or stored in `~/.parseur.conf`):

```bash
PARSEUR_API_BASE=https://api.parseur.com \
PARSEUR_API_KEY=sk_your_key \
pytest tests/integration -v
```
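One way such an environment gate can be written is sketched below with the standard library's `unittest`. It is not the project's actual test code: the helper `missing_credentials`, the test class, and the assumption that both variables are required are illustrative only. Run under a test runner, the class is skipped whenever either variable is unset or empty.

```python
import os
import unittest

# Assumption: both variables from the command above must be set.
REQUIRED_VARS = ("PARSEUR_API_BASE", "PARSEUR_API_KEY")


def missing_credentials(environ=os.environ) -> list[str]:
    """Return the required variable names that are unset or empty."""
    return [name for name in REQUIRED_VARS if not environ.get(name)]


@unittest.skipIf(missing_credentials(), "integration credentials not set in the environment")
class IntegrationSmokeTest(unittest.TestCase):
    def test_credentials_are_visible(self):
        # Hypothetical placeholder: a real test would call the Parseur API here.
        self.assertEqual(missing_credentials(), [])
```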

---

### Initialize your configuration

Store your Parseur API credentials securely:

```bash
parseur init --api-key YOUR_PARSEUR_API_KEY
```

Your config is saved (by default) in:

```
~/.parseur.conf
```

---

### Example usage

List all your mailboxes:

```bash
parseur list-mailboxes
```

List documents in a mailbox:

```bash
parseur list-documents 12345
```

Upload a file to a mailbox (add `--wait` to block until it is parsed, with a live progress bar):

```bash
parseur upload-file 12345 ./path/to/document.pdf
parseur upload-file 12345 ./path/to/document.pdf --wait
```

Download a mailbox's results as a file (written to stdout by default, or to the path given with `--output` / `-o`):

```bash
parseur download-mailbox 12345 --format csv -o results.csv         # whole mailbox
parseur list-parser-fields 12345                                    # find a table field id
parseur download-field 12345 PF951 --format xlsx -o lines.xlsx      # a table field
```
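Once a CSV export such as `results.csv` is on disk, it can be inspected with the standard library. The sketch below only reports the column names and the row count; the helper name `summarize_csv` and the sample columns are made up, since the real columns depend on the mailbox's fields.

```python
import csv
import io


def summarize_csv(handle) -> tuple[list[str], int]:
    """Return the column names and the number of data rows in a CSV export."""
    reader = csv.DictReader(handle)
    row_count = sum(1 for _ in reader)
    return list(reader.fieldnames or []), row_count


# Stand-in for open("results.csv", newline=""); the columns are made up.
sample = io.StringIO("DocumentID,Total\nabc123,42.00\ndef456,13.37\n")
columns, rows = summarize_csv(sample)
print(columns, rows)
```

For a real export, replace `sample` with a file opened using `newline=""`, as the `csv` module recommends.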

Register a custom webhook:

```bash
parseur create-webhook --event document.processed --target-url https://yourserver.com/webhook --mailbox-id 12345
```

Listen to events in real time (requires the `[listener]` extra):

```bash
parseur listen --event document.processed --mailbox-id 12345
```

Listen and forward each event to a local URL and port:

```bash
parseur listen --event document.processed --mailbox-id 12345 --redirect-url http://localhost --redirect-port 8000
```

---

## 📜 CLI Commands

Run:

```bash
parseur --help
```

for a full list of available commands.

### Highlights

- **init**: Set your API token and (optional) base URL
- **list-mailboxes**: Search and sort mailboxes
- **get-mailbox**: Fetch a mailbox by ID
- **get-mailbox-schema**: Get the mailbox parsing schema
- **list-parser-fields**: List the fields extracted by a mailbox
- **download-mailbox / download-field / download-export**: Download results as a file (csv/json/xlsx)
- **list-export-configs**: List a mailbox's custom export configurations
- **list-documents**: Advanced document search, filtering, sorting
- **get-document / get-document-logs**: Fetch document details and processing logs
- **reprocess-document / skip-document / copy-document / split-document / reverse-split-document / delete-document**: Document lifecycle operations
- **upload-file / upload-text**: Upload new documents (add `--wait` for synchronous parsing)
- **upload-folder**: Upload every file matching a glob path
- **create-webhook / get-webhook / list-webhooks / delete-webhook**: Create, get, list, and delete custom webhook integrations
- **enable-webhook / pause-webhook**: Activate or pause a webhook for a specific mailbox
- **listen**: Create a temporary webhook and listen to events in real time (with optional redirect & silent mode)
- **mcp**: Run the Parseur MCP server so AI assistants can manage your account (requires `[mcp]`)

---

## 🔎 Advanced Search & Filtering

**Mailbox listing supports:**

- **Search** by name or email prefix
- **Sort** by:
  - name
  - document_count
  - template_count
  - PARSEDOK_count (processed)
  - PARSEDKO_count (failed)
  - QUOTAEXC_count (quota exceeded)
  - EXPORTKO_count (export failed)

**Document listing supports:**

- **Search** in:
  - document ID
  - document name
  - template name
  - email addresses (from, to, cc, bcc)
  - document metadata header
- **Sort** by:
  - name
  - created (received date)
  - processed date
  - status
- **Filter** by:
  - received_after / received_before dates
- **Include** parsed result in response

---

## ⚡ Webhooks Support

Easily register custom webhooks for events like:

- `document.processed`
- `document.processed.flattened`
- `document.template_needed`
- `document.export_failed`
- `table.processed`
- `table.processed.flattened`

Your webhook endpoint will receive POST notifications with Parseur payloads, enabling real-time integrations with your systems.
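A receiving endpoint can be as small as the standard-library sketch below. It assumes the POST body is a JSON object, which is not stated above, and the names `decode_event`, `WebhookHandler`, and `serve` are illustrative. Calling `serve(8000)` would block and handle requests on port 8000; for production traffic an HTTPS endpoint behind a real web server is the safer choice.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def decode_event(raw_body: bytes) -> dict:
    """Decode a JSON webhook body; raise ValueError if it is not a JSON object."""
    payload = json.loads(raw_body.decode("utf-8"))
    if not isinstance(payload, dict):
        raise ValueError("expected a JSON object")
    return payload


class WebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        try:
            event = decode_event(self.rfile.read(length))
        except ValueError:  # also covers JSONDecodeError and UnicodeDecodeError
            self.send_response(400)
            self.end_headers()
            return
        # Assumption: the payload layout is unknown here, so only list its keys.
        print("received fields:", sorted(event))
        self.send_response(200)
        self.end_headers()


def serve(port: int = 8000) -> None:
    """Block forever, handling webhook POSTs on the given port."""
    HTTPServer(("", port), WebhookHandler).serve_forever()
```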

---

## 🤖 MCP Server (AI assistants)

**parseur-py** ships an [MCP](https://modelcontextprotocol.io) server that exposes Parseur as tools any MCP-compatible AI assistant (Claude Desktop, Cursor, Claude Code, …) can call directly: listing mailboxes, uploading documents, reading parsed results, managing webhooks, and more.

### Install

The MCP server is an **optional** feature, behind the `mcp` extra (it pulls in
the `mcp` SDK and `pydantic`; the base `parseur-py` install does not):

```bash
pip install "parseur-py[mcp]"
```

> Zero-install: if you have [`uv`](https://docs.astral.sh/uv/), you don't need to
> install anything: `uvx --from "parseur-py[mcp]" parseur-py` runs the server in
> a throwaway environment. This is the recommended way to wire it into an MCP
> client (see below).

### Run

The server speaks MCP over **stdio**. It reads your API key from
`~/.parseur.conf` (run `parseur init` first) or from the `PARSEUR_API_KEY`
environment variable (which takes precedence; MCP clients usually inject it).

```bash
parseur mcp                              # via the main CLI (needs the [mcp] extra)
parseur-mcp                              # dedicated console script
python -m parseur.mcp_server             # as a module
uvx --from "parseur-py[mcp]" parseur-py  # zero-install with uv
```
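The precedence rule described above (environment variable first, config file second) can be pictured with a few lines of Python. This is an illustration of the rule, not the server's code: `resolve_api_key` is a hypothetical helper, and treating an empty variable as unset is an assumption.

```python
import os


def resolve_api_key(config_value, environ=os.environ):
    """Prefer PARSEUR_API_KEY from the environment, else the config-file value."""
    # Assumption: an empty environment variable counts as "not set".
    return environ.get("PARSEUR_API_KEY") or config_value


print(resolve_api_key("key_from_file", {"PARSEUR_API_KEY": "key_from_env"}))
print(resolve_api_key("key_from_file", {}))
```

The first call prints the environment value and the second falls back to the file value.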

### Configure your client

Add Parseur to your MCP client config. Example for **Claude Desktop**
(`claude_desktop_config.json`).

Zero-install with `uv` (recommended: nothing to install or keep updated):

```json
{
  "mcpServers": {
    "parseur": {
      "command": "uvx",
      "args": ["--from", "parseur-py[mcp]", "parseur-py"],
      "env": {
        "PARSEUR_API_KEY": "YOUR_PARSEUR_API_KEY"
      }
    }
  }
}
```

Or, if you installed `parseur-py[mcp]` yourself, point at the console script:

```json
{
  "mcpServers": {
    "parseur": {
      "command": "parseur-mcp",
      "env": {
        "PARSEUR_API_KEY": "YOUR_PARSEUR_API_KEY"
      }
    }
  }
}
```
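If the client config already lists other servers, the Parseur entry has to be merged in rather than pasted over the file. A minimal sketch of that merge follows; `add_parseur_server` and the `other` server entry are hypothetical, and reading or writing the actual config file is left out.

```python
import json


def add_parseur_server(config: dict, api_key: str) -> dict:
    """Return a copy of an MCP client config with the 'parseur' server entry set."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers["parseur"] = {
        "command": "uvx",
        "args": ["--from", "parseur-py[mcp]", "parseur-py"],
        "env": {"PARSEUR_API_KEY": api_key},
    }
    merged["mcpServers"] = servers
    return merged


# Hypothetical existing config with one unrelated server.
existing = {"mcpServers": {"other": {"command": "other-server"}}}
updated = add_parseur_server(existing, "YOUR_PARSEUR_API_KEY")
print(json.dumps(updated, indent=2))
```

The function copies the top-level mapping and the `mcpServers` mapping, so the input config is left unchanged.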

### Available tools

Every tool carries a title, a description, per-argument descriptions, behavioral
annotations (read-only / destructive / idempotent / open-world hints), and a
structured JSON output schema, so the assistant understands exactly what each
command does, what it expects, and what it returns. Destructive tools
(`delete_*`) are flagged so clients can ask for confirmation first.

The server exposes the full client surface as **54 MCP tools**:

- **Mailboxes**: `list_mailboxes`, `get_mailbox`, `get_mailbox_schema`, `create_mailbox`, `delete_mailbox`
- **Mailbox settings** (one tool per setting, instead of a generic update): `rename_mailbox`, `set_ai_engine`, `set_ai_instructions`, `set_email_processing`, `set_metadata`, `set_timezone`, `set_date_format`, `set_decimal_separator`, `set_allowed_extensions`, `set_sender_filter`, `split_by_ai`, `split_by_page`, `split_by_page_range`, `split_by_keywords`, `process_page_range`, `process_odd_pages`, `process_even_pages`
- **Parser fields**: `list_parser_fields`, `add_parser_field`, `update_parser_field`, `delete_parser_field`
- **Documents**: `list_documents`, `get_document`, `get_document_logs`, `reprocess_document`, `skip_document`, `copy_document`, `split_document`, `reverse_split_document`, `delete_document`
- **Uploads**: `upload_file`, `upload_text` (asynchronous) and `upload_file_and_wait`, `upload_text_and_wait`, `wait_for_document` (block until parsed)
- **Webhooks**: `list_webhooks`, `get_webhook`, `create_webhook`, `delete_webhook`, `enable_webhook`, `pause_webhook`
- **Exports**: `get_mailbox_export`, `get_table_export`, `list_export_fields`, `list_export_configs`, `get_export_config`, `create_export_config`, `update_export_config`, `delete_export_config`

> **Uploading files over MCP:** `upload_file` takes a **file path** and the server reads it directly: just give the absolute path, no base64. When the desktop app launches the server locally (the usual setup) it runs on your machine, so it can read your files; no extra permission/config is needed (this server is not sandboxed to specific folders). To copy a document that is already in another mailbox, use `copy_document` instead, which needs no file transfer at all.

> **Getting results out as a file:** three kinds of export, each returning ready-to-use, self-authenticating `csv` / `json` / `xlsx` download links you can hand to the user. `get_mailbox_export` exports the whole mailbox (one row per document); `get_table_export` exports a single table field's rows (one row per line item); for a custom column selection, build an export config with `list_export_fields` + `create_export_config`.

### Workflow

The tools follow the lifecycle of a mailbox; the server ships these same instructions so the assistant can follow the flow on its own:

1. **Create a mailbox** with just a title (`create_mailbox`). Don't define fields up front: Parseur auto-detects them from the first documents during its identification phase. Adjust them afterwards with `add_parser_field` / `update_parser_field` / `delete_parser_field`.
2. **Send documents** to parse with `upload_file` (a path on the server's machine) or `upload_text` (email/HTML content).
3. **Wait for the result.** Parsing is asynchronous: a document is pending while its status is `INCOMING` / `ANALYZING` / `PROGRESS` and finished at `PARSEDOK` (or `PARSEDKO` / `EXPORTKO`). The parsed data is in the document's `result` field, populated only once it reaches `PARSEDOK`. Prefer `upload_file_and_wait` / `upload_text_and_wait` to get it in one call, or `wait_for_document` to wait on an existing document.
4. **Get the data out** as a file via the three exports above, or push each parsed document to a URL in real time with `create_webhook`.
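The waiting step in the list above amounts to a polling loop over the document status. The sketch below shows the idea under stated assumptions: `fetch_status` is a hypothetical callable returning the current status string (how it is obtained is up to the caller), any status outside the three pending ones is treated as final, and the timeout and interval values are arbitrary. It is not how `wait_for_document` is implemented.

```python
import time

# Pending statuses as listed in the workflow above.
PENDING = {"INCOMING", "ANALYZING", "PROGRESS"}


def wait_until_finished(fetch_status, timeout=120.0, interval=2.0, sleep=time.sleep):
    """Poll fetch_status() until it returns a non-pending status or time runs out."""
    deadline = time.monotonic() + timeout
    while True:
        status = fetch_status()
        if status not in PENDING:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"still {status} after {timeout} seconds")
        sleep(interval)


# Offline demo: a fake status sequence and a no-op sleep.
fake = iter(["INCOMING", "ANALYZING", "PROGRESS", "PARSEDOK"])
final = wait_until_finished(lambda: next(fake), sleep=lambda _seconds: None)
print(final)
```

A caller would then read the parsed data only when the returned status is `PARSEDOK`, since the `result` field is populated only at that point.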

IDs follow a simple convention: mailboxes and webhooks use an integer id, documents a string id, parser/table fields a `PF...` string id, and export configs an integer id.

### Publishing to the MCP Registry

The server is described by [`server.json`](server.json) and can be published to the official [MCP Registry](https://registry.modelcontextprotocol.io) under the `io.github.parseur/parseur-py` name. Ownership is verified through the GitHub `parseur` organization and the `<!-- mcp-name: io.github.parseur/parseur-py -->` marker shipped in this README (so it is present in the PyPI package).

```bash
# 1. Install the publisher CLI
curl -L "https://github.com/modelcontextprotocol/registry/releases/latest/download/mcp-publisher_$(uname -s | tr '[:upper:]' '[:lower:]')_$(uname -m | sed 's/x86_64/amd64/;s/aarch64/arm64/').tar.gz" | tar xz mcp-publisher

# 2. Authenticate (opens GitHub; you must be a member of the `parseur` org)
./mcp-publisher login github

# 3. Publish the version described in server.json
./mcp-publisher publish
```

Before publishing, make sure the version in `server.json` (both the top-level `version` and `packages[].version`) matches a release of `parseur-py` already on PyPI whose README contains the `mcp-name` marker. A GitHub Actions workflow (`.github/workflows/publish-mcp-registry.yml`) does this automatically on each GitHub release using GitHub OIDC.
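The version check described above can be done mechanically before running the publisher. The sketch below compares the top-level `version` and every `packages[].version` against a release number; `version_mismatches` is a hypothetical helper and the sample document is a heavily trimmed stand-in, not the real `server.json`.

```python
import json


def version_mismatches(server: dict, released_version: str) -> list[str]:
    """List the version fields in a server.json document that differ from the release."""
    problems = []
    if server.get("version") != released_version:
        problems.append(f"version is {server.get('version')!r}")
    for index, package in enumerate(server.get("packages", [])):
        if package.get("version") != released_version:
            problems.append(f"packages[{index}].version is {package.get('version')!r}")
    return problems


# Hypothetical, heavily trimmed server.json content for illustration only.
example = json.loads('{"version": "1.6.0", "packages": [{"version": "1.5.2"}]}')
issues = version_mismatches(example, "1.6.0")
print(issues)
```

An empty list means both fields agree with the release; whether that release is actually on PyPI still has to be checked separately.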

Once published, clients install and run the server with:

```bash
uvx --from "parseur-py[mcp]" parseur-py
```

---

## 🛠️ Configuration

Your API token and settings are stored in a simple INI file:

```
[parseur]
api_token = YOUR_API_KEY
base_url = https://api.parseur.com
```

You can customize the path by passing `--config-path` in your calls if needed.
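Because the file is plain INI, it can also be read from Python with `configparser`. The sketch below is an assumption-laden illustration rather than the package's own loader: `read_parseur_config` is a hypothetical helper, and falling back to `https://api.parseur.com` when `base_url` is absent is a guess based on the example above. The demo writes a throwaway file instead of touching `~/.parseur.conf`.

```python
import configparser
import tempfile
from pathlib import Path


def read_parseur_config(path) -> dict:
    """Read api_token and base_url from the [parseur] section of an INI file."""
    parser = configparser.ConfigParser()
    if not parser.read(Path(path).expanduser()):
        raise FileNotFoundError(f"no config file at {path}")
    section = parser["parseur"]
    return {
        "api_token": section["api_token"],
        # Assumption: default to the public API base when base_url is missing.
        "base_url": section.get("base_url", fallback="https://api.parseur.com"),
    }


# Demo against a throwaway file rather than the real ~/.parseur.conf.
with tempfile.TemporaryDirectory() as folder:
    demo_path = Path(folder) / "parseur.conf"
    demo_path.write_text(
        "[parseur]\napi_token = YOUR_API_KEY\nbase_url = https://api.parseur.com\n"
    )
    settings = read_parseur_config(demo_path)
```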

---

## 🐍 Python Client Usage

Beyond the CLI, **parseur-py** is a standard Python library. Example:

```python
import parseur

parseur.api_key = "YOUR_API_KEY"

for mailbox in parseur.Mailbox.list():
    print(mailbox.name)
```

### Per-call API key override

Every method accepts an optional `api_key` argument that **takes priority over
the global `parseur.api_key`** for that single call, which is useful for
multi-account or multi-tenant code:

```python
parseur.Mailbox.list(api_key="sk_account_a")
parseur.Document.upload_file(123, "invoice.pdf", api_key="sk_account_b")
```
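In multi-account code the override is typically applied in a loop over keys. The sketch below takes the listing function as a parameter so it runs offline; in real use `parseur.Mailbox.list` would be passed as `list_mailboxes`, assuming (as in the example above) that it returns objects with a `name` attribute. The helper `mailbox_names_by_account` and the fake lister are hypothetical.

```python
from types import SimpleNamespace


def mailbox_names_by_account(api_keys: dict, list_mailboxes) -> dict:
    """Map each account label to the names of the mailboxes its key can see."""
    return {
        label: [mailbox.name for mailbox in list_mailboxes(api_key=key)]
        for label, key in api_keys.items()
    }


def fake_list(api_key=None):
    """Stand-in for parseur.Mailbox.list so the sketch runs offline."""
    return [SimpleNamespace(name=f"inbox-for-{api_key}")]


names = mailbox_names_by_account(
    {"a": "sk_account_a", "b": "sk_account_b"}, fake_list
)
print(names)
```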

---

## 📖 Documentation

- [Parseur Official API Docs](https://help.parseur.com/en/articles/3566128-use-parseur-document-parsing-api)
- This package mirrors Parseur’s REST API, adding pagination handling, schema support, and convenient CLI commands.

---

## 💼 License

MIT License

---

## 🤝 Contributing

We welcome contributions! Please:

1. Fork the repo
2. Create your feature branch (`git checkout -b feature/foo`)
3. Commit your changes (`git commit -am 'Add foo'`)
4. Push to the branch (`git push origin feature/foo`)
5. Open a pull request

---

## ✨ Credits

Developed with ❤️ by the [Parseur](https://parseur.com) team.

---

*Parseur is the easiest way to automatically extract data from emails and documents. Stop copy-pasting data and automate your workflows!*