@sylphx/pdf-reader-mcp 2.5.2 → 2.5.4

Files changed (3)
  1. package/README.md +67 -4
  2. package/dist/index.js +1964 -1669
  3. package/package.json +8 -3
package/README.md CHANGED
@@ -11,7 +11,7 @@
  [![TypeScript](https://img.shields.io/badge/TypeScript-6.0-blue.svg?style=flat-square)](https://www.typescriptlang.org/)
  [![Downloads](https://img.shields.io/npm/dm/@sylphx/pdf-reader-mcp?style=flat-square)](https://www.npmjs.com/package/@sylphx/pdf-reader-mcp)
 
- **5-10x faster parallel processing** • **Structured element output** • **Semantic citation chunks** • **CI-backed quality**
+ **PDF inspection** • **Structured element output** • **Semantic citation chunks** • **Local-first MCP**
 
  <a href="https://mseep.ai/app/SylphxAI-pdf-reader-mcp">
  <img src="https://mseep.net/pr/SylphxAI-pdf-reader-mcp-badge.png" alt="Security Validated" width="200"/>
@@ -23,7 +23,7 @@
 
  ## 🚀 Overview
 
- PDF Reader MCP is a **production-ready** Model Context Protocol server that empowers AI agents with **structured, local-first PDF processing capabilities**. Extract text, Markdown, semantic citation chunks, images, tables, annotations, outlines, structure trees, form fields, attachment metadata, and agent-ready document elements with strong performance and reliability.
+ PDF Reader MCP is a **production-ready** Model Context Protocol server that empowers AI agents with **structured, local-first PDF processing capabilities**. Inspect PDFs before extraction, then extract text, Markdown, semantic citation chunks, images, tables, annotations, outlines, structure trees, form fields, attachment metadata, and agent-ready document elements with strong performance and reliability.
 
  **The Problem:**
  ```typescript
@@ -37,6 +37,7 @@ PDF Reader MCP is a **production-ready** Model Context Protocol server that empo
  **The Solution:**
  ```typescript
  // PDF Reader MCP
+ - Preflight PDF inspection for agent extraction planning 🔎
  - 5-10x faster parallel processing ⚡
  - Structured element output for agent workflows 🧩
  - Markdown rendering for RAG and summarization 📝
@@ -64,6 +65,7 @@ PDF Reader MCP is a **production-ready** Model Context Protocol server that empo
  ### Developer Experience
 
  - 🎯 **Path Flexibility** - Absolute & relative paths, Windows/Unix support (v1.3.0)
+ - 🔎 **PDF Inspection** - Profile PDFs before extraction and get recommended `read_pdf` arguments for agent workflows
  - 🧩 **Structured Elements** - Optional page-level elements with stable IDs, provenance, and best-effort bounding boxes
  - 📝 **Markdown Rendering** - Optional page-aware Markdown for RAG, summarization, and agent context
  - 🔗 **Citation Chunks** - Optional page, semantic, size, and table chunks with element IDs and best-effort bounding boxes
@@ -71,7 +73,7 @@ PDF Reader MCP is a **production-ready** Model Context Protocol server that empo
  - 🖼️ **Smart Ordering** - Column-aware content ordering improves natural reading flow
  - 🛡️ **Type Safe** - Full TypeScript with strict mode enabled
  - 📚 **Battle-tested** - Automated tests, strict TypeScript, and CI validation
- - 🎨 **Simple API** - Single tool handles all operations elegantly
+ - 🎨 **Simple API** - `inspect_pdf` plans extraction, `read_pdf` performs extraction
 
  ---
 
@@ -202,6 +204,29 @@ npm install -g @sylphx/pdf-reader-mcp
 
  ## 🎯 Quick Start
 
+ ### Inspect Before Extraction
+
+ Use `inspect_pdf` when an agent needs to decide how to process an unfamiliar
+ PDF. It samples a bounded number of pages, detects selectable-text versus
+ image-like pages, surfaces document signals, and recommends useful `read_pdf`
+ arguments without extracting image bytes.
+
+ ```json
+ {
+   "sources": [{
+     "path": "documents/report.pdf"
+   }],
+   "sample_pages": 5,
+   "include_metadata": true
+ }
+ ```
+
+ **Result:**
+ - PDF profile such as `digital_text`, `scanned_or_image_only`, or `mixed_text_and_scan`
+ - Page-level text density, token estimates, and image paint-operation counts
+ - Signals for outlines, page labels, forms, attachments, permissions, and structure trees
+ - Recommended `read_pdf` arguments for citation chunks, safety findings, tables, or OCR triage
+
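The `inspect_pdf` request shown in the JSON example can also be built in code. The TypeScript sketch below is illustrative only: `InspectPdfArgs` and `buildInspectArgs` are hypothetical names that are not part of the package, and the argument shape is inferred from the README example rather than from the published tool schema.

```typescript
// Argument shape inferred from the README's JSON example for `inspect_pdf`.
// This is an assumption, not the package's exported type.
interface InspectPdfArgs {
  sources: { path: string }[];
  sample_pages?: number;
  include_metadata?: boolean;
}

// Hypothetical helper: builds the arguments object for one or more local PDFs.
// `sample_pages` is only set when the caller provides it, so the server-side
// default applies otherwise.
function buildInspectArgs(paths: string[], samplePages?: number): InspectPdfArgs {
  if (paths.length === 0) {
    throw new Error("at least one PDF path is required");
  }
  const inspectArgs: InspectPdfArgs = {
    sources: paths.map((path) => ({ path })),
    include_metadata: true,
  };
  if (samplePages !== undefined) {
    inspectArgs.sample_pages = samplePages;
  }
  return inspectArgs;
}

// Reproduces the request from the README example.
console.log(JSON.stringify(buildInspectArgs(["documents/report.pdf"], 5), null, 2));
```

How the object is sent depends on the MCP client in use; the sketch only covers constructing the arguments.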
  ### Basic Usage
 
  ```json
@@ -383,6 +408,7 @@ npm install -g @sylphx/pdf-reader-mcp
  ## ✨ Features
 
  ### Core Capabilities
+ - ✅ **PDF Inspection** - Profile PDFs before extraction, detect low-text/scanned pages, and recommend `read_pdf` options
  - ✅ **Text Extraction** - Full document or specific pages with intelligent parsing
  - ✅ **Image Extraction** - Base64-encoded with complete metadata (width, height, format)
  - ✅ **Structured Elements** - Agent-ready elements with stable IDs, provenance, and best-effort bounding boxes
@@ -408,6 +434,18 @@ npm install -g @sylphx/pdf-reader-mcp
 
  ## 🆕 Latest Improvements
 
+ ### Agent-Native PDF Inspection
+
+ `inspect_pdf` adds a lightweight planning tool for agent workflows. It samples
+ up to 20 pages per source, counts selectable text and image paint operations,
+ surfaces document-level signals, and returns a recommendation with the next
+ best `read_pdf` arguments.
+
+ Inspection is intentionally low overhead: it does not decode image bytes and it
+ does not perform OCR. When sampled pages look scanned or image-only, the tool
+ marks `needs_ocr: true` so agents do not mistake an image-based PDF for a text
+ extraction failure.
+
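The `needs_ocr` flag described above lets a client branch before it attempts text extraction. The following TypeScript sketch assumes the flag sits on the `recommendation` object; `Recommendation`, `NextStep`, `planNextStep`, and the `reason` and `read_pdf_arguments` field names are hypothetical and may not match the actual response schema.

```typescript
// Assumed subset of the `recommendation` object. Only `needs_ocr` is named in
// the README; the other field names are placeholders for illustration.
interface Recommendation {
  needs_ocr: boolean;
  reason?: string;
  // Suggested `read_pdf` arguments, kept opaque because their schema is not shown here.
  read_pdf_arguments?: Record<string, unknown>;
}

type NextStep =
  | { kind: "ocr"; reason: string }
  | { kind: "read_pdf"; args: Record<string, unknown> };

// Routes scanned or image-only documents to an OCR-capable workflow instead of
// treating empty text output as an extraction failure.
function planNextStep(rec: Recommendation): NextStep {
  if (rec.needs_ocr) {
    return {
      kind: "ocr",
      reason: rec.reason ?? "sampled pages look scanned or image-only",
    };
  }
  return { kind: "read_pdf", args: rec.read_pdf_arguments ?? {} };
}
```

Modelling the result as a discriminated union keeps the OCR branch explicit, so a caller cannot silently fall through to `read_pdf` on an image-based PDF.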
  ### Agent-Ready Structured Output
 
  `include_elements` adds structured document elements to the JSON response while keeping the existing text, metadata, image, and table outputs backward compatible.
@@ -479,9 +517,34 @@ The extraction pipeline also separates distant same-line text into independent s
 
  ## 📖 API Reference
 
+ ### `inspect_pdf` Tool
+
+ Plan PDF extraction before running a heavier read. This is useful for agents
+ that need to choose between metadata review, citation-ready extraction, mixed
+ PDF handling, or OCR-capable workflows.
+
+ #### Parameters
+
+ | Parameter | Type | Description | Default |
+ |-----------|------|-------------|---------|
+ | `sources` | Array | List of PDF sources to inspect | Required |
+ | `sample_pages` | number | Maximum pages to sample per source, capped at 20 | `5` |
+ | `include_metadata` | boolean | Include PDF metadata and info objects | `true` |
+
+ #### Response Fields
+
+ | Field | Description |
+ |-------|-------------|
+ | `profile` | `digital_text`, `scanned_or_image_only`, `mixed_text_and_scan`, `low_text_or_form`, or `unknown` |
+ | `sampled_pages` | Pages used for the bounded inspection sample |
+ | `page_signals` | Text chars, text items, token estimate, image paint operations, and scan/low-text flags |
+ | `document_signals` | Outline, labels, permissions, forms, attachments, and structure-tree availability |
+ | `recommendation` | Suggested workflow, OCR need, reason, and ready-to-use `read_pdf` arguments |
+
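The documented `sample_pages` default and cap, together with the listed `profile` values, can be mirrored on the client side. This TypeScript sketch is an assumption-laden illustration: `PROFILES`, `PdfProfile`, `isPdfProfile`, and `effectiveSamplePages` are hypothetical names, and the lower bound of 1 and the rounding behaviour are guesses that the server may not share.

```typescript
// Profile values as listed in the Response Fields table.
const PROFILES = [
  "digital_text",
  "scanned_or_image_only",
  "mixed_text_and_scan",
  "low_text_or_form",
  "unknown",
] as const;

type PdfProfile = (typeof PROFILES)[number];

// Type guard for validating an untyped `profile` value from a tool response.
function isPdfProfile(value: unknown): value is PdfProfile {
  return typeof value === "string" && PROFILES.some((p) => p === value);
}

// Mirrors the documented default (5) and cap (20) for `sample_pages`.
// Assumption: values below 1 are raised to 1 and fractions are rounded down.
function effectiveSamplePages(requested?: number): number {
  if (requested === undefined || !isFinite(requested)) {
    return 5;
  }
  return Math.min(20, Math.max(1, Math.floor(requested)));
}
```

Validating `profile` before branching on it protects a client from values added in later releases, which would otherwise pass through unchecked.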
  ### `read_pdf` Tool
 
- The single tool that handles all PDF operations.
+ The extraction tool that handles PDF content, structure, citations, images,
+ tables, and document signals.
 
  #### Parameters
487
550