@sylphx/pdf-reader-mcp 2.5.2 → 2.5.4
- package/README.md +67 -4
- package/dist/index.js +1964 -1669
- package/package.json +8 -3
package/README.md
CHANGED
|
@@ -11,7 +11,7 @@
|
|
|
11
11
|
[](https://www.typescriptlang.org/)
|
|
12
12
|
[](https://www.npmjs.com/package/@sylphx/pdf-reader-mcp)
|
|
13
13
|
|
|
14
|
-
**
|
|
14
|
+
**PDF inspection** • **Structured element output** • **Semantic citation chunks** • **Local-first MCP**
|
|
15
15
|
|
|
16
16
|
<a href="https://mseep.ai/app/SylphxAI-pdf-reader-mcp">
|
|
17
17
|
<img src="https://mseep.net/pr/SylphxAI-pdf-reader-mcp-badge.png" alt="Security Validated" width="200"/>
|
|
@@ -23,7 +23,7 @@
|
|
|
23
23
|
|
|
24
24
|
## 🚀 Overview
|
|
25
25
|
|
|
26
|
-
PDF Reader MCP is a **production-ready** Model Context Protocol server that empowers AI agents with **structured, local-first PDF processing capabilities**.
|
|
26
|
+
PDF Reader MCP is a **production-ready** Model Context Protocol server that empowers AI agents with **structured, local-first PDF processing capabilities**. Inspect PDFs before extraction, then extract text, Markdown, semantic citation chunks, images, tables, annotations, outlines, structure trees, form fields, attachment metadata, and agent-ready document elements with strong performance and reliability.
|
|
27
27
|
|
|
28
28
|
**The Problem:**
|
|
29
29
|
```typescript
|
|
@@ -37,6 +37,7 @@ PDF Reader MCP is a **production-ready** Model Context Protocol server that empo
|
|
|
37
37
|
**The Solution:**
|
|
38
38
|
```typescript
|
|
39
39
|
// PDF Reader MCP
|
|
40
|
+
- Preflight PDF inspection for agent extraction planning 🔎
|
|
40
41
|
- 5-10x faster parallel processing ⚡
|
|
41
42
|
- Structured element output for agent workflows 🧩
|
|
42
43
|
- Markdown rendering for RAG and summarization 📝
|
|
@@ -64,6 +65,7 @@ PDF Reader MCP is a **production-ready** Model Context Protocol server that empo
|
|
|
64
65
|
### Developer Experience
|
|
65
66
|
|
|
66
67
|
- 🎯 **Path Flexibility** - Absolute & relative paths, Windows/Unix support (v1.3.0)
|
|
68
|
+
- 🔎 **PDF Inspection** - Profile PDFs before extraction and get recommended `read_pdf` arguments for agent workflows
|
|
67
69
|
- 🧩 **Structured Elements** - Optional page-level elements with stable IDs, provenance, and best-effort bounding boxes
|
|
68
70
|
- 📝 **Markdown Rendering** - Optional page-aware Markdown for RAG, summarization, and agent context
|
|
69
71
|
- 🔗 **Citation Chunks** - Optional page, semantic, size, and table chunks with element IDs and best-effort bounding boxes
|
|
@@ -71,7 +73,7 @@ PDF Reader MCP is a **production-ready** Model Context Protocol server that empo
|
|
|
71
73
|
- 🖼️ **Smart Ordering** - Column-aware content ordering improves natural reading flow
|
|
72
74
|
- 🛡️ **Type Safe** - Full TypeScript with strict mode enabled
|
|
73
75
|
- 📚 **Battle-tested** - Automated tests, strict TypeScript, and CI validation
|
|
74
|
-
- 🎨 **Simple API** -
|
|
76
|
+
- 🎨 **Simple API** - `inspect_pdf` plans extraction, `read_pdf` performs extraction
|
|
75
77
|
|
|
76
78
|
---
|
|
77
79
|
|
|
@@ -202,6 +204,29 @@ npm install -g @sylphx/pdf-reader-mcp
|
|
|
202
204
|
|
|
203
205
|
## 🎯 Quick Start
|
|
204
206
|
|
|
207
|
+
### Inspect Before Extraction
|
|
208
|
+
|
|
209
|
+
Use `inspect_pdf` when an agent needs to decide how to process an unfamiliar
|
|
210
|
+
PDF. It samples a bounded number of pages, detects selectable-text versus
|
|
211
|
+
image-like pages, surfaces document signals, and recommends useful `read_pdf`
|
|
212
|
+
arguments without extracting image bytes.
|
|
213
|
+
|
|
214
|
+
```json
|
|
215
|
+
{
|
|
216
|
+
"sources": [{
|
|
217
|
+
"path": "documents/report.pdf"
|
|
218
|
+
}],
|
|
219
|
+
"sample_pages": 5,
|
|
220
|
+
"include_metadata": true
|
|
221
|
+
}
|
|
222
|
+
```
|
|
223
|
+
|
|
224
|
+
**Result:**
|
|
225
|
+
- PDF profile such as `digital_text`, `scanned_or_image_only`, or `mixed_text_and_scan`
|
|
226
|
+
- Page-level text density, token estimates, and image paint-operation counts
|
|
227
|
+
- Signals for outlines, page labels, forms, attachments, permissions, and structure trees
|
|
228
|
+
- Recommended `read_pdf` arguments for citation chunks, safety findings, tables, or OCR triage
|
|
229
|
+
|
|
205
230
|
### Basic Usage
|
|
206
231
|
|
|
207
232
|
```json
|
|
@@ -383,6 +408,7 @@ npm install -g @sylphx/pdf-reader-mcp
|
|
|
383
408
|
## ✨ Features
|
|
384
409
|
|
|
385
410
|
### Core Capabilities
|
|
411
|
+
- ✅ **PDF Inspection** - Profile PDFs before extraction, detect low-text/scanned pages, and recommend `read_pdf` options
|
|
386
412
|
- ✅ **Text Extraction** - Full document or specific pages with intelligent parsing
|
|
387
413
|
- ✅ **Image Extraction** - Base64-encoded with complete metadata (width, height, format)
|
|
388
414
|
- ✅ **Structured Elements** - Agent-ready elements with stable IDs, provenance, and best-effort bounding boxes
|
|
@@ -408,6 +434,18 @@ npm install -g @sylphx/pdf-reader-mcp
|
|
|
408
434
|
|
|
409
435
|
## 🆕 Latest Improvements
|
|
410
436
|
|
|
437
|
+
### Agent-Native PDF Inspection
|
|
438
|
+
|
|
439
|
+
`inspect_pdf` adds a lightweight planning tool for agent workflows. It samples
|
|
440
|
+
up to 20 pages per source, counts selectable text and image paint operations,
|
|
441
|
+
surfaces document-level signals, and returns a recommendation with the next
|
|
442
|
+
best `read_pdf` arguments.
|
|
443
|
+
|
|
444
|
+
Inspection is intentionally low overhead: it does not decode image bytes and it
|
|
445
|
+
does not perform OCR. When sampled pages look scanned or image-only, the tool
|
|
446
|
+
marks `needs_ocr: true` so agents do not mistake an image-based PDF for a text
|
|
447
|
+
extraction failure.
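The added paragraphs describe inspection as counting selectable text and image paint operations per sampled page. A minimal sketch of how such counts could map to a profile follows. It is not the package's implementation: the `PageSample` field names, the `MIN_TEXT_CHARS` threshold, and the decision rules are all assumptions chosen for illustration. Only the profile names come from the README.

```typescript
// Hypothetical per-page counts; field names are assumptions, not the package's schema.
interface PageSample {
  textChars: number; // selectable characters found on the page
  imagePaintOps: number; // image paint operations counted on the page
}

// Profile names as listed in the README's response-field table.
type Profile =
  | "digital_text"
  | "scanned_or_image_only"
  | "mixed_text_and_scan"
  | "low_text_or_form"
  | "unknown";

// Assumed threshold: pages below this many characters are treated as low-text.
const MIN_TEXT_CHARS = 50;

// A page "looks scanned" here when it has little text but paints at least one image.
function looksScanned(page: PageSample): boolean {
  return page.textChars < MIN_TEXT_CHARS && page.imagePaintOps > 0;
}

function classifyProfile(pages: PageSample[]): Profile {
  if (pages.length === 0) return "unknown";
  const textPages = pages.filter((p) => p.textChars >= MIN_TEXT_CHARS).length;
  const scanPages = pages.filter(looksScanned).length;
  if (textPages === pages.length) return "digital_text";
  if (scanPages === pages.length) return "scanned_or_image_only";
  if (textPages > 0 && scanPages > 0) return "mixed_text_and_scan";
  return "low_text_or_form";
}

// Flag OCR whenever at least one sampled page looks scanned.
function needsOcr(pages: PageSample[]): boolean {
  return pages.some(looksScanned);
}
```

Because nothing here decodes image bytes or runs OCR, a classifier of this shape stays cheap, which matches the low-overhead intent stated in the hunk.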
|
|
448
|
+
|
|
411
449
|
### Agent-Ready Structured Output
|
|
412
450
|
|
|
413
451
|
`include_elements` adds structured document elements to the JSON response while keeping the existing text, metadata, image, and table outputs backward compatible.
|
|
@@ -479,9 +517,34 @@ The extraction pipeline also separates distant same-line text into independent s
|
|
|
479
517
|
|
|
480
518
|
## 📖 API Reference
|
|
481
519
|
|
|
520
|
+
### `inspect_pdf` Tool
|
|
521
|
+
|
|
522
|
+
Plan PDF extraction before running a heavier read. This is useful for agents
|
|
523
|
+
that need to choose between metadata review, citation-ready extraction, mixed
|
|
524
|
+
PDF handling, or OCR-capable workflows.
|
|
525
|
+
|
|
526
|
+
#### Parameters
|
|
527
|
+
|
|
528
|
+
| Parameter | Type | Description | Default |
|
|
529
|
+
|-----------|------|-------------|---------|
|
|
530
|
+
| `sources` | Array | List of PDF sources to inspect | Required |
|
|
531
|
+
| `sample_pages` | number | Maximum pages to sample per source, capped at 20 | `5` |
|
|
532
|
+
| `include_metadata` | boolean | Include PDF metadata and info objects | `true` |
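To illustrate the parameter table added in this hunk, here is a small sketch of a client-side helper that assembles `inspect_pdf` arguments with the documented defaults. `buildInspectArgs` is a hypothetical name, and the `PdfSource` shape includes only the `path` key shown in the Quick Start example. The README says `sample_pages` is capped at 20 but does not say whether the server clamps or rejects larger values, so clamping on the client is an assumption.

```typescript
// Only `path` appears in the README example; other source keys are not assumed here.
interface PdfSource {
  path: string;
}

interface InspectPdfArgs {
  sources: PdfSource[];
  sample_pages: number;
  include_metadata: boolean;
}

const DEFAULT_SAMPLE_PAGES = 5; // documented default
const MAX_SAMPLE_PAGES = 20; // documented cap

// Hypothetical helper: builds arguments and clamps sample_pages into 1..20.
function buildInspectArgs(
  paths: string[],
  samplePages: number = DEFAULT_SAMPLE_PAGES,
  includeMetadata: boolean = true,
): InspectPdfArgs {
  if (paths.length === 0) {
    throw new Error("inspect_pdf requires at least one source");
  }
  // Fall back to the default when the caller passes NaN or Infinity.
  const requested = Number.isFinite(samplePages)
    ? Math.floor(samplePages)
    : DEFAULT_SAMPLE_PAGES;
  const clamped = Math.min(MAX_SAMPLE_PAGES, Math.max(1, requested));
  return {
    sources: paths.map((path) => ({ path })),
    sample_pages: clamped,
    include_metadata: includeMetadata,
  };
}
```

Calling `buildInspectArgs(["documents/report.pdf"])` yields the same argument object as the Quick Start JSON example.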
|
|
533
|
+
|
|
534
|
+
#### Response Fields
|
|
535
|
+
|
|
536
|
+
| Field | Description |
|
|
537
|
+
|-------|-------------|
|
|
538
|
+
| `profile` | `digital_text`, `scanned_or_image_only`, `mixed_text_and_scan`, `low_text_or_form`, or `unknown` |
|
|
539
|
+
| `sampled_pages` | Pages used for the bounded inspection sample |
|
|
540
|
+
| `page_signals` | Text chars, text items, token estimate, image paint operations, and scan/low-text flags |
|
|
541
|
+
| `document_signals` | Outline, labels, permissions, forms, attachments, and structure-tree availability |
|
|
542
|
+
| `recommendation` | Suggested workflow, OCR need, reason, and ready-to-use `read_pdf` arguments |
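The response fields above suggest a simple agent-side decision: follow the recommendation unless OCR is needed. The sketch below shows one way to express that. The top-level field names and `needs_ocr` appear in the README, but the nested keys `reason` and `read_pdf_arguments`, the `number[]` type for `sampled_pages`, and the `planNextStep` helper are assumptions for illustration, not the package's published schema.

```typescript
type Profile =
  | "digital_text"
  | "scanned_or_image_only"
  | "mixed_text_and_scan"
  | "low_text_or_form"
  | "unknown";

interface Recommendation {
  needs_ocr: boolean; // named in the README
  reason: string; // assumed key name
  read_pdf_arguments: Record<string, unknown>; // assumed key name
}

// Partial, assumed shape of one inspection result.
interface InspectResult {
  profile: Profile;
  sampled_pages: number[]; // assumed to be page numbers
  recommendation: Recommendation;
}

type NextStep =
  | { kind: "read_pdf"; args: Record<string, unknown> }
  | { kind: "ocr_pipeline"; reason: string };

// Hypothetical planner: route image-based PDFs to OCR, everything else to read_pdf.
function planNextStep(result: InspectResult): NextStep {
  if (result.recommendation.needs_ocr) {
    return { kind: "ocr_pipeline", reason: result.recommendation.reason };
  }
  return { kind: "read_pdf", args: result.recommendation.read_pdf_arguments };
}
```

Modeling the next step as a discriminated union keeps the OCR branch explicit, so an agent cannot silently treat an image-only PDF as a failed text extraction.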
|
|
543
|
+
|
|
482
544
|
### `read_pdf` Tool
|
|
483
545
|
|
|
484
|
-
The
|
|
546
|
+
The extraction tool that handles PDF content, structure, citations, images,
|
|
547
|
+
tables, and document signals.
|
|
485
548
|
|
|
486
549
|
#### Parameters
|
|
487
550
|
|