@sylphx/pdf-reader-mcp 1.3.0 → 1.4.0
- package/README.md +205 -91
- package/dist/index.js +620 -49
- package/package.json +44 -42
- package/dist/handlers/index.js +0 -4
- package/dist/handlers/readPdf.js +0 -170
- package/dist/pdf/extractor.js +0 -394
- package/dist/pdf/loader.js +0 -53
- package/dist/pdf/parser.js +0 -96
- package/dist/schemas/readPdf.js +0 -59
- package/dist/types/pdf.js +0 -2
- package/dist/utils/pathUtils.js +0 -25
package/README.md
CHANGED
|
@@ -1,19 +1,20 @@
|
|
|
1
1
|
<div align="center">
|
|
2
2
|
|
|
3
|
-
# PDF Reader MCP
|
|
3
|
+
# PDF Reader MCP
|
|
4
4
|
|
|
5
|
-
**
|
|
5
|
+
**Production-ready PDF processing server for AI agents**
|
|
6
6
|
|
|
7
|
-
[](https://github.com/SylphxAI/pdf-reader-mcp/actions/workflows/ci.yml)
|
|
8
|
+
[](https://codecov.io/gh/SylphxAI/pdf-reader-mcp)
|
|
9
|
+
[](https://www.npmjs.com/package/@sylphx/pdf-reader-mcp)
|
|
10
|
+
[](https://pdf-reader-msu3esos4-sylphx.vercel.app)
|
|
11
|
+
[](https://www.npmjs.com/package/@sylphx/pdf-reader-mcp)
|
|
12
|
+
[](https://opensource.org/licenses/MIT)
|
|
12
13
|
|
|
13
|
-
**5-10x faster parallel processing** • **Y-coordinate content ordering** • **94%+ test coverage** • **
|
|
14
|
+
**5-10x faster parallel processing** • **Y-coordinate content ordering** • **94%+ test coverage** • **103 tests passing**
|
|
14
15
|
|
|
15
|
-
<a href="https://mseep.ai/app/
|
|
16
|
-
<img src="https://mseep.net/pr/
|
|
16
|
+
<a href="https://mseep.ai/app/SylphxAI-pdf-reader-mcp">
|
|
17
|
+
<img src="https://mseep.net/pr/SylphxAI-pdf-reader-mcp-badge.png" alt="Security Validated" width="200"/>
|
|
17
18
|
</a>
|
|
18
19
|
|
|
19
20
|
</div>
|
|
@@ -24,57 +25,131 @@
|
|
|
24
25
|
|
|
25
26
|
PDF Reader MCP is a **production-ready** Model Context Protocol server that empowers AI agents with **enterprise-grade PDF processing capabilities**. Extract text, images, and metadata with unmatched performance and reliability.
|
|
26
27
|
|
|
27
|
-
**
|
|
28
|
+
**The Problem:**
|
|
29
|
+
```typescript
|
|
30
|
+
// Traditional PDF processing
|
|
31
|
+
- Sequential page processing (slow)
|
|
32
|
+
- No natural content ordering
|
|
33
|
+
- Complex path handling
|
|
34
|
+
- Poor error isolation
|
|
35
|
+
```
|
|
36
|
+
|
|
37
|
+
**The Solution:**
|
|
38
|
+
```typescript
|
|
39
|
+
// PDF Reader MCP
|
|
40
|
+
- 5-10x faster parallel processing ⚡
|
|
41
|
+
- Y-coordinate based ordering
|
|
42
|
+
- Flexible path support (absolute/relative) 🎯
|
|
43
|
+
- Per-page error resilience 🛡️
|
|
44
|
+
- 94%+ test coverage
|
|
45
|
+
```
|
|
46
|
+
|
|
47
|
+
**Result: Production-ready PDF processing that scales.**
|
|
48
|
+
|
|
49
|
+
---
|
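The "parallel processing" and "per-page error resilience" bullets in the added README text can be illustrated with a minimal sketch. This is not taken from the package source: `PageResult`, `extractPage`, and `extractAllPages` are hypothetical names, and the approach shown (one promise per page, each catching its own error) is only an assumption about how such isolation might be achieved. It shows concurrency within one event loop, not multi-core execution.

```typescript
// Hypothetical per-page result type; not taken from the package source.
type PageResult =
  | { page: number; ok: true; text: string }
  | { page: number; ok: false; error: string };

// Hypothetical extractor standing in for real per-page PDF text extraction.
// Page 3 fails on purpose so the isolation behavior is visible.
async function extractPage(page: number): Promise<string> {
  if (page === 3) {
    throw new Error(`page ${page} is corrupt`);
  }
  return `text of page ${page}`;
}

// Start every page at once. Each page catches its own error, so one bad
// page yields a failed entry instead of rejecting the whole batch.
async function extractAllPages(pages: number[]): Promise<PageResult[]> {
  return Promise.all(
    pages.map(async (page): Promise<PageResult> => {
      try {
        return { page, ok: true, text: await extractPage(page) };
      } catch (err) {
        const message = err instanceof Error ? err.message : String(err);
        return { page, ok: false, error: message };
      }
    })
  );
}
```

Calling `extractAllPages([1, 2, 3, 4])` in this sketch resolves with four entries in page order, of which only the entry for page 3 has `ok: false`.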
|
28
50
|
|
|
29
|
-
## ⚡
|
|
51
|
+
## ⚡ Key Features
|
|
30
52
|
|
|
31
|
-
###
|
|
32
|
-
|
|
33
|
-
-
|
|
34
|
-
- ⚡
|
|
35
|
-
- **
|
|
53
|
+
### Performance
|
|
54
|
+
|
|
55
|
+
- **5-10x faster** than sequential with automatic parallelization
|
|
56
|
+
- ⚡ **12,933 ops/sec** error handling, 5,575 ops/sec text extraction
|
|
57
|
+
- **Process 50-page PDFs** in seconds with multi-core utilization
|
|
36
58
|
- 📦 **Lightweight** with minimal dependencies
|
|
37
59
|
|
|
38
|
-
###
|
|
39
|
-
|
|
40
|
-
-
|
|
60
|
+
### Developer Experience
|
|
61
|
+
|
|
62
|
+
- 🎯 **Path Flexibility** - Absolute & relative paths, Windows/Unix support (v1.3.0)
|
|
63
|
+
- 🖼️ **Smart Ordering** - Y-coordinate based content preserves document layout
|
|
41
64
|
- 🛡️ **Type Safe** - Full TypeScript with strict mode enabled
|
|
42
|
-
- **Battle-tested** - 103 tests, 94%+ coverage,
|
|
65
|
+
- **Battle-tested** - 103 tests, 94%+ coverage, 98%+ function coverage
|
|
43
66
|
- **Simple API** - Single tool handles all operations elegantly
|
|
44
67
|
|
|
45
68
|
---
|
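To make the "Y-coordinate based content" ordering claim concrete, here is a minimal sketch. The `ContentItem` shape and `orderByLayout` function are hypothetical and not the package's API; the sketch assumes PDF user-space coordinates, where y grows upward, so a larger y means higher on the page.

```typescript
// Hypothetical shape for an extracted item (text run or image) and its position.
interface ContentItem {
  kind: "text" | "image";
  y: number; // vertical position; assumed to grow upward, as in PDF user space
  x: number; // horizontal position, growing rightward
  value: string;
}

// Return a new array ordered top-to-bottom, then left-to-right for ties.
function orderByLayout(items: ContentItem[]): ContentItem[] {
  return [...items].sort((a, b) => (b.y - a.y) || (a.x - b.x));
}

// Items arrive in arbitrary extraction order and leave in reading order.
const ordered = orderByLayout([
  { kind: "text", y: 100, x: 50, value: "footer" },
  { kind: "image", y: 400, x: 50, value: "figure-1" },
  { kind: "text", y: 700, x: 50, value: "title" },
]);
```

Here `ordered` holds `title`, `figure-1`, `footer`, so the image lands between the text that surrounds it on the page rather than being appended after all the text.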
|
46
69
|
|
|
70
|
+
## Performance Benchmarks
|
|
71
|
+
|
|
72
|
+
Real-world performance from production testing:
|
|
73
|
+
|
|
74
|
+
| Operation | Ops/sec | Performance | Use Case |
|
|
75
|
+
|-----------|---------|-------------|----------|
|
|
76
|
+
| **Error handling** | 12,933 | ⚡⚡⚡⚡⚡ | Validation & safety |
|
|
77
|
+
| **Extract full text** | 5,575 | ⚡⚡⚡⚡ | Document analysis |
|
|
78
|
+
| **Extract page** | 5,329 | ⚡⚡⚡⚡ | Single page ops |
|
|
79
|
+
| **Multiple pages** | 5,242 | ⚡⚡⚡⚡ | Batch processing |
|
|
80
|
+
| **Metadata only** | 4,912 | ⚡⚡⚡ | Quick inspection |
|
|
81
|
+
|
|
82
|
+
### Parallel Processing Speedup
|
|
83
|
+
|
|
84
|
+
| Document | Sequential | Parallel | Speedup |
|
|
85
|
+
|----------|-----------|----------|---------|
|
|
86
|
+
| **10-page PDF** | ~2s | ~0.3s | **5-8x faster** |
|
|
87
|
+
| **50-page PDF** | ~10s | ~1s | **10x faster** |
|
|
88
|
+
| **100+ pages** | ~20s | ~2s | **Linear scaling** with CPU cores |
|
|
89
|
+
|
|
90
|
+
*Benchmarks vary based on PDF complexity and system resources.*
|
|
91
|
+
|
|
92
|
+
---
|
|
93
|
+
|
|
47
94
|
## 📦 Installation
|
|
48
95
|
|
|
96
|
+
### VS Code
|
|
97
|
+
|
|
98
|
+
Install with one click using the VS Code extension buttons:
|
|
99
|
+
|
|
100
|
+
[](https://insiders.vscode.dev/redirect?url=vscode://ms-vscode.vscode-mcp/install?mcpId=sylphx-pdf-reader-mcp)
|
|
101
|
+
[](https://insiders.vscode.dev/redirect?url=vscode-insiders://ms-vscode.vscode-mcp/install?mcpId=sylphx-pdf-reader-mcp)
|
|
102
|
+
|
|
103
|
+
Or via CLI:
|
|
104
|
+
|
|
49
105
|
```bash
|
|
50
|
-
|
|
51
|
-
|
|
106
|
+
code --add-mcp '{"name":"pdf-reader","command":"npx","args":["@sylphx/pdf-reader-mcp"]}'
|
|
107
|
+
```
|
|
52
108
|
|
|
53
|
-
|
|
54
|
-
pnpm add @sylphx/pdf-reader-mcp
|
|
109
|
+
### Claude Code
|
|
55
110
|
|
|
56
|
-
|
|
57
|
-
|
|
111
|
+
```bash
|
|
112
|
+
claude mcp add pdf-reader -- npx @sylphx/pdf-reader-mcp
|
|
113
|
+
```
|
|
58
114
|
|
|
59
|
-
|
|
60
|
-
yarn add @sylphx/pdf-reader-mcp
|
|
115
|
+
### Claude Desktop
|
|
61
116
|
|
|
62
|
-
|
|
63
|
-
|
|
117
|
+
Add to `claude_desktop_config.json`:
|
|
118
|
+
|
|
119
|
+
```json
|
|
120
|
+
{
|
|
121
|
+
"mcpServers": {
|
|
122
|
+
"pdf-reader": {
|
|
123
|
+
"command": "npx",
|
|
124
|
+
"args": ["@sylphx/pdf-reader-mcp"]
|
|
125
|
+
}
|
|
126
|
+
}
|
|
127
|
+
}
|
|
64
128
|
```
|
|
65
129
|
|
|
66
|
-
|
|
130
|
+
<details>
|
|
131
|
+
<summary><strong>Config file locations</strong></summary>
|
|
67
132
|
|
|
68
|
-
|
|
133
|
+
- **macOS**: `~/Library/Application Support/Claude/claude_desktop_config.json`
|
|
134
|
+
- **Windows**: `%APPDATA%\Claude\claude_desktop_config.json`
|
|
135
|
+
- **Linux**: `~/.config/Claude/claude_desktop_config.json`
|
|
69
136
|
|
|
70
|
-
|
|
137
|
+
</details>
|
|
138
|
+
|
|
139
|
+
### Cursor
|
|
140
|
+
|
|
141
|
+
1. Open **Settings** → **MCP** → **Add new MCP Server**
|
|
142
|
+
2. Select **Command** type
|
|
143
|
+
3. Enter: `npx @sylphx/pdf-reader-mcp`
|
|
144
|
+
|
|
145
|
+
### Windsurf
|
|
71
146
|
|
|
72
|
-
Add to your MCP
|
|
147
|
+
Add to your Windsurf MCP config:
|
|
73
148
|
|
|
74
149
|
```json
|
|
75
150
|
{
|
|
76
151
|
"mcpServers": {
|
|
77
|
-
"pdf-reader
|
|
152
|
+
"pdf-reader": {
|
|
78
153
|
"command": "npx",
|
|
79
154
|
"args": ["@sylphx/pdf-reader-mcp"]
|
|
80
155
|
}
|
|
@@ -82,6 +157,49 @@ Add to your MCP client (`claude_desktop_config.json`, Cursor, Cline):
|
|
|
82
157
|
}
|
|
83
158
|
```
|
|
84
159
|
|
|
160
|
+
### Cline
|
|
161
|
+
|
|
162
|
+
Add to Cline's MCP settings:
|
|
163
|
+
|
|
164
|
+
```json
|
|
165
|
+
{
|
|
166
|
+
"mcpServers": {
|
|
167
|
+
"pdf-reader": {
|
|
168
|
+
"command": "npx",
|
|
169
|
+
"args": ["@sylphx/pdf-reader-mcp"]
|
|
170
|
+
}
|
|
171
|
+
}
|
|
172
|
+
}
|
|
173
|
+
```
|
|
174
|
+
|
|
175
|
+
### Warp
|
|
176
|
+
|
|
177
|
+
1. Go to **Settings** → **AI** → **Manage MCP Servers** → **Add**
|
|
178
|
+
2. Or use the `/add-mcp` slash command with the standard config
|
|
179
|
+
|
|
180
|
+
### Smithery (One-click)
|
|
181
|
+
|
|
182
|
+
```bash
|
|
183
|
+
npx -y @smithery/cli install @sylphx/pdf-reader-mcp --client claude
|
|
184
|
+
```
|
|
185
|
+
|
|
186
|
+
### Manual Installation
|
|
187
|
+
|
|
188
|
+
```bash
|
|
189
|
+
# Quick start - zero installation
|
|
190
|
+
npx @sylphx/pdf-reader-mcp
|
|
191
|
+
|
|
192
|
+
# Using bun (recommended)
|
|
193
|
+
bun add @sylphx/pdf-reader-mcp
|
|
194
|
+
|
|
195
|
+
# Using npm
|
|
196
|
+
npm install @sylphx/pdf-reader-mcp
|
|
197
|
+
```
|
|
198
|
+
|
|
199
|
+
---
|
|
200
|
+
|
|
201
|
+
## 🎯 Quick Start
|
|
202
|
+
|
|
85
203
|
### Basic Usage
|
|
86
204
|
|
|
87
205
|
```json
|
|
@@ -113,7 +231,7 @@ Add to your MCP client (`claude_desktop_config.json`, Cursor, Cline):
|
|
|
113
231
|
}
|
|
114
232
|
```
|
|
115
233
|
|
|
116
|
-
### Absolute Paths (
|
|
234
|
+
### Absolute Paths (v1.3.0+)
|
|
117
235
|
|
|
118
236
|
```json
|
|
119
237
|
// Windows - Both formats work!
|
|
@@ -484,30 +602,6 @@ Restart MCP client completely.
|
|
|
484
602
|
|
|
485
603
|
---
|
|
486
604
|
|
|
487
|
-
## ⚡ Performance
|
|
488
|
-
|
|
489
|
-
### Benchmarks
|
|
490
|
-
|
|
491
|
-
| Operation | Ops/sec | Performance |
|
|
492
|
-
|:----------|:--------|:------------|
|
|
493
|
-
| Error handling | ~12,933 | ⚡⚡⚡⚡⚡ |
|
|
494
|
-
| Extract full text | ~5,575 | ⚡⚡⚡⚡ |
|
|
495
|
-
| Extract page | ~5,329 | ⚡⚡⚡⚡ |
|
|
496
|
-
| Multiple pages | ~5,242 | ⚡⚡⚡⚡ |
|
|
497
|
-
| Metadata only | ~4,912 | ⚡⚡⚡ |
|
|
498
|
-
|
|
499
|
-
### Parallel Processing
|
|
500
|
-
|
|
501
|
-
| Document | Speedup |
|
|
502
|
-
|:---------|:--------|
|
|
503
|
-
| 10-page PDF | **5-8x faster** |
|
|
504
|
-
| 50-page PDF | **10x faster** |
|
|
505
|
-
| 100+ pages | **Linear scaling** with CPU cores |
|
|
506
|
-
|
|
507
|
-
*Benchmarks vary based on PDF complexity and system resources.*
|
|
508
|
-
|
|
509
|
-
---
|
|
510
|
-
|
|
511
605
|
## Architecture
|
|
512
606
|
|
|
513
607
|
### Tech Stack
|
|
@@ -548,7 +642,7 @@ Restart MCP client completely.
|
|
|
548
642
|
|
|
549
643
|
**Setup:**
|
|
550
644
|
```bash
|
|
551
|
-
git clone https://github.com/
|
|
645
|
+
git clone https://github.com/SylphxAI/pdf-reader-mcp.git
|
|
552
646
|
cd pdf-reader-mcp
|
|
553
647
|
pnpm install && pnpm build
|
|
554
648
|
```
|
|
@@ -600,7 +694,7 @@ See [CONTRIBUTING.md](./CONTRIBUTING.md)
|
|
|
600
694
|
|
|
601
695
|
## Documentation
|
|
602
696
|
|
|
603
|
-
- [Full Docs](https://
|
|
697
|
+
- [Full Docs](https://SylphxAI.github.io/pdf-reader-mcp/) - Complete guides
|
|
604
698
|
- [Getting Started](./docs/guide/getting-started.md) - Quick start
|
|
605
699
|
- [API Reference](./docs/api/README.md) - Detailed API
|
|
606
700
|
- [Design](./docs/design/index.md) - Architecture
|
|
@@ -618,7 +712,7 @@ See [CONTRIBUTING.md](./CONTRIBUTING.md)
|
|
|
618
712
|
- [x] Absolute paths (v1.3.0)
|
|
619
713
|
- [x] 94%+ test coverage (v1.3.0)
|
|
620
714
|
|
|
621
|
-
**
|
|
715
|
+
**Next**
|
|
622
716
|
- [ ] OCR for scanned PDFs
|
|
623
717
|
- [ ] Annotation extraction
|
|
624
718
|
- [ ] Form field extraction
|
|
@@ -627,19 +721,30 @@ See [CONTRIBUTING.md](./CONTRIBUTING.md)
|
|
|
627
721
|
- [ ] Advanced caching
|
|
628
722
|
- [ ] PDF generation
|
|
629
723
|
|
|
630
|
-
Vote at [Discussions](https://github.com/
|
|
724
|
+
Vote at [Discussions](https://github.com/SylphxAI/pdf-reader-mcp/discussions)
|
|
725
|
+
|
|
726
|
+
---
|
|
727
|
+
|
|
728
|
+
## Recognition
|
|
729
|
+
|
|
730
|
+
**Featured on:**
|
|
731
|
+
- [Smithery](https://smithery.ai/server/@sylphx/pdf-reader-mcp) - MCP directory
|
|
732
|
+
- [Glama](https://glama.ai/mcp/servers/@sylphx/pdf-reader-mcp) - AI marketplace
|
|
733
|
+
- [MseeP.ai](https://mseep.ai/app/SylphxAI-pdf-reader-mcp) - Security validated
|
|
734
|
+
|
|
735
|
+
**Trusted worldwide** • **Enterprise adoption** • **Battle-tested**
|
|
631
736
|
|
|
632
737
|
---
|
|
633
738
|
|
|
634
739
|
## 🤝 Support
|
|
635
740
|
|
|
636
|
-
[](https://github.com/SylphxAI/pdf-reader-mcp/issues)
|
|
742
|
+
[](https://discord.gg/sylphx)
|
|
638
743
|
|
|
639
|
-
- [Bug Reports](https://github.com/
|
|
640
|
-
- 💬 [Discussions](https://github.com/
|
|
641
|
-
- [
|
|
642
|
-
- 📧
|
|
744
|
+
- [Bug Reports](https://github.com/SylphxAI/pdf-reader-mcp/issues)
|
|
745
|
+
- 💬 [Discussions](https://github.com/SylphxAI/pdf-reader-mcp/discussions)
|
|
746
|
+
- [Documentation](https://SylphxAI.github.io/pdf-reader-mcp/)
|
|
747
|
+
- 📧 [Email](mailto:hi@sylphx.com)
|
|
643
748
|
|
|
644
749
|
**Show Your Support:**
|
|
645
750
|
⭐ Star • Watch • Report bugs • 💡 Suggest features • Contribute
|
|
@@ -648,40 +753,49 @@ Vote at [Discussions](https://github.com/sylphxltd/pdf-reader-mcp/discussions)
|
|
|
648
753
|
|
|
649
754
|
## Stats
|
|
650
755
|
|
|
651
|
-

|
|
757
|
+

|
|
653
758
|

|
|
654
|
-

|
|
655
760
|
|
|
656
761
|
**103 Tests** • **94%+ Coverage** • **Production Ready**
|
|
657
762
|
|
|
658
763
|
---
|
|
659
764
|
|
|
660
|
-
##
|
|
661
|
-
|
|
662
|
-
**Featured on:**
|
|
663
|
-
- [Smithery](https://smithery.ai/server/@sylphx/pdf-reader-mcp) - MCP directory
|
|
664
|
-
- [Glama](https://glama.ai/mcp/servers/@sylphx/pdf-reader-mcp) - AI marketplace
|
|
665
|
-
- [MseeP.ai](https://mseep.ai/app/sylphxltd-pdf-reader-mcp) - Security validated
|
|
765
|
+
## License
|
|
666
766
|
|
|
667
|
-
|
|
767
|
+
MIT © [Sylphx](https://sylphx.com)
|
|
668
768
|
|
|
669
769
|
---
|
|
670
770
|
|
|
671
|
-
##
|
|
771
|
+
## Credits
|
|
672
772
|
|
|
673
|
-
|
|
773
|
+
Built with:
|
|
774
|
+
- [PDF.js](https://mozilla.github.io/pdf.js/) - Mozilla PDF engine
|
|
775
|
+
- [Bun](https://bun.sh) - Fast JavaScript runtime
|
|
674
776
|
|
|
675
|
-
|
|
777
|
+
Special thanks to the open source community ❤️
|
|
676
778
|
|
|
677
|
-
|
|
779
|
+
## Powered by Sylphx
|
|
678
780
|
|
|
679
|
-
|
|
781
|
+
This project uses the following [@sylphx](https://github.com/SylphxAI) packages:
|
|
680
782
|
|
|
681
|
-
|
|
783
|
+
- [@sylphx/mcp-server-sdk](https://github.com/SylphxAI/mcp-server-sdk) - MCP server framework
|
|
784
|
+
- [@sylphx/biome-config](https://github.com/SylphxAI/biome-config) - Biome configuration
|
|
785
|
+
- [@sylphx/tsconfig](https://github.com/SylphxAI/tsconfig) - TypeScript configuration
|
|
786
|
+
- [@sylphx/bump](https://github.com/SylphxAI/bump) - Version management
|
|
787
|
+
- [@sylphx/doctor](https://github.com/SylphxAI/doctor) - Project health checker
|
|
788
|
+
- [@sylphx/leaf](https://github.com/SylphxAI/leaf) - Documentation framework
|
|
789
|
+
- [@sylphx/leaf-theme-default](https://github.com/SylphxAI/leaf-theme-default) - Documentation theme
|
|
682
790
|
|
|
683
|
-
|
|
684
|
-
|
|
685
|
-
[⬆ Back to Top](#pdf-reader-mcp-)
|
|
791
|
+
---
|
|
686
792
|
|
|
687
|
-
|
|
793
|
+
<p align="center">
|
|
794
|
+
<strong>5-10x faster. Production-ready. Battle-tested.</strong>
|
|
795
|
+
<br>
|
|
796
|
+
<sub>The PDF processing server that actually scales</sub>
|
|
797
|
+
<br><br>
|
|
798
|
+
<a href="https://sylphx.com">sylphx.com</a> •
|
|
799
|
+
<a href="https://x.com/SylphxAI">@SylphxAI</a> •
|
|
800
|
+
<a href="mailto:hi@sylphx.com">hi@sylphx.com</a>
|
|
801
|
+
</p>
|