@sylphx/pdf-reader-mcp 1.3.0 → 1.3.2
- package/README.md +110 -78
- package/dist/index.js +524 -45
- package/package.json +17 -11
- package/dist/handlers/index.js +0 -4
- package/dist/handlers/readPdf.js +0 -170
- package/dist/pdf/extractor.js +0 -394
- package/dist/pdf/loader.js +0 -53
- package/dist/pdf/parser.js +0 -96
- package/dist/schemas/readPdf.js +0 -59
- package/dist/types/pdf.js +0 -2
- package/dist/utils/pathUtils.js +0 -25
package/README.md
CHANGED
@@ -1,19 +1,20 @@
 <div align="center">
 
-# PDF Reader MCP
+# PDF Reader MCP
 
-**
+**Production-ready PDF processing server for AI agents**
 
-[](https://github.com/SylphxAI/pdf-reader-mcp/actions/workflows/ci.yml)
+[](https://codecov.io/gh/SylphxAI/pdf-reader-mcp)
+[](https://www.npmjs.com/package/@sylphx/pdf-reader-mcp)
+[](https://pdf-reader-msu3esos4-sylphx.vercel.app)
+[](https://www.npmjs.com/package/@sylphx/pdf-reader-mcp)
+[](https://opensource.org/licenses/MIT)
 
-**5-10x faster parallel processing** • **Y-coordinate content ordering** • **94%+ test coverage** • **
+**5-10x faster parallel processing** • **Y-coordinate content ordering** • **94%+ test coverage** • **103 tests passing**
 
-<a href="https://mseep.ai/app/
-<img src="https://mseep.net/pr/
+<a href="https://mseep.ai/app/SylphxAI-pdf-reader-mcp">
+<img src="https://mseep.net/pr/SylphxAI-pdf-reader-mcp-badge.png" alt="Security Validated" width="200"/>
 </a>
 
 </div>
@@ -24,26 +25,72 @@
 
 PDF Reader MCP is a **production-ready** Model Context Protocol server that empowers AI agents with **enterprise-grade PDF processing capabilities**. Extract text, images, and metadata with unmatched performance and reliability.
 
-**
+**The Problem:**
+```typescript
+// Traditional PDF processing
+- Sequential page processing (slow)
+- No natural content ordering
+- Complex path handling
+- Poor error isolation
+```
 
-
+**The Solution:**
+```typescript
+// PDF Reader MCP
+- 5-10x faster parallel processing ⚡
+- Y-coordinate based ordering
+- Flexible path support (absolute/relative)
+- Per-page error resilience
+- 94%+ test coverage
+```
+
+**Result: Production-ready PDF processing that scales.**
+
+---
 
-
-
-
-
--
+## ⚡ Key Features
+
+### Performance
+
+- **5-10x faster** than sequential with automatic parallelization
+- ⚡ **12,933 ops/sec** error handling, 5,575 ops/sec text extraction
+- **Process 50-page PDFs** in seconds with multi-core utilization
 - **Lightweight** with minimal dependencies
 
-###
-
--
+### Developer Experience
+
+- **Path Flexibility** - Absolute & relative paths, Windows/Unix support (v1.3.0)
+- **Smart Ordering** - Y-coordinate based content preserves document layout
 - **Type Safe** - Full TypeScript with strict mode enabled
-- **Battle-tested** - 103 tests, 94%+ coverage,
+- **Battle-tested** - 103 tests, 94%+ coverage, 98%+ function coverage
 - **Simple API** - Single tool handles all operations elegantly
 
 ---
 
+## Performance Benchmarks
+
+Real-world performance from production testing:
+
+| Operation | Ops/sec | Performance | Use Case |
+|-----------|---------|-------------|----------|
+| **Error handling** | 12,933 | ⚡⚡⚡⚡⚡ | Validation & safety |
+| **Extract full text** | 5,575 | ⚡⚡⚡⚡ | Document analysis |
+| **Extract page** | 5,329 | ⚡⚡⚡⚡ | Single page ops |
+| **Multiple pages** | 5,242 | ⚡⚡⚡⚡ | Batch processing |
+| **Metadata only** | 4,912 | ⚡⚡⚡ | Quick inspection |
+
+### Parallel Processing Speedup
+
+| Document | Sequential | Parallel | Speedup |
+|----------|-----------|----------|---------|
+| **10-page PDF** | ~2s | ~0.3s | **5-8x faster** |
+| **50-page PDF** | ~10s | ~1s | **10x faster** |
+| **100+ pages** | ~20s | ~2s | **Linear scaling** with CPU cores |
+
+*Benchmarks vary based on PDF complexity and system resources.*
+
+---
+
 ## Installation
 
 ```bash
@@ -113,7 +160,7 @@ Add to your MCP client (`claude_desktop_config.json`, Cursor, Cline):
 }
 ```
 
-### Absolute Paths (
+### Absolute Paths (v1.3.0+)
 
 ```json
 // Windows - Both formats work!
@@ -484,30 +531,6 @@ Restart MCP client completely.
 
 ---
 
-## ⚡ Performance
-
-### Benchmarks
-
-| Operation | Ops/sec | Performance |
-|:----------|:--------|:------------|
-| Error handling | ~12,933 | ⚡⚡⚡⚡⚡ |
-| Extract full text | ~5,575 | ⚡⚡⚡⚡ |
-| Extract page | ~5,329 | ⚡⚡⚡⚡ |
-| Multiple pages | ~5,242 | ⚡⚡⚡⚡ |
-| Metadata only | ~4,912 | ⚡⚡⚡ |
-
-### Parallel Processing
-
-| Document | Speedup |
-|:---------|:--------|
-| 10-page PDF | **5-8x faster** |
-| 50-page PDF | **10x faster** |
-| 100+ pages | **Linear scaling** with CPU cores |
-
-*Benchmarks vary based on PDF complexity and system resources.*
-
----
-
 ## Architecture
 
 ### Tech Stack
@@ -548,7 +571,7 @@ Restart MCP client completely.
 
 **Setup:**
 ```bash
-git clone https://github.com/
+git clone https://github.com/SylphxAI/pdf-reader-mcp.git
 cd pdf-reader-mcp
 pnpm install && pnpm build
 ```
@@ -600,7 +623,7 @@ See [CONTRIBUTING.md](./CONTRIBUTING.md)
 
 ## Documentation
 
-- [Full Docs](https://
+- [Full Docs](https://SylphxAI.github.io/pdf-reader-mcp/) - Complete guides
 - [Getting Started](./docs/guide/getting-started.md) - Quick start
 - [API Reference](./docs/api/README.md) - Detailed API
 - [Design](./docs/design/index.md) - Architecture
@@ -618,7 +641,7 @@ See [CONTRIBUTING.md](./CONTRIBUTING.md)
 - [x] Absolute paths (v1.3.0)
 - [x] 94%+ test coverage (v1.3.0)
 
-**
+**Next**
 - [ ] OCR for scanned PDFs
 - [ ] Annotation extraction
 - [ ] Form field extraction
@@ -627,19 +650,30 @@ See [CONTRIBUTING.md](./CONTRIBUTING.md)
 - [ ] Advanced caching
 - [ ] PDF generation
 
-Vote at [Discussions](https://github.com/
+Vote at [Discussions](https://github.com/SylphxAI/pdf-reader-mcp/discussions)
+
+---
+
+## Recognition
+
+**Featured on:**
+- [Smithery](https://smithery.ai/server/@sylphx/pdf-reader-mcp) - MCP directory
+- [Glama](https://glama.ai/mcp/servers/@sylphx/pdf-reader-mcp) - AI marketplace
+- [MseeP.ai](https://mseep.ai/app/SylphxAI-pdf-reader-mcp) - Security validated
+
+**Trusted worldwide** • **Enterprise adoption** • **Battle-tested**
 
 ---
 
 ## Support
 
-[](https://github.com/SylphxAI/pdf-reader-mcp/issues)
+[](https://discord.gg/sylphx)
 
-- [Bug Reports](https://github.com/
-- [Discussions](https://github.com/
-- [
--
+- [Bug Reports](https://github.com/SylphxAI/pdf-reader-mcp/issues)
+- [Discussions](https://github.com/SylphxAI/pdf-reader-mcp/discussions)
+- [Documentation](https://SylphxAI.github.io/pdf-reader-mcp/)
+- [Email](mailto:hi@sylphx.com)
 
 **Show Your Support:**
 ⭐ Star • Watch • Report bugs • Suggest features • Contribute
@@ -648,40 +682,38 @@ Vote at [Discussions](https://github.com/sylphxltd/pdf-reader-mcp/discussions)
 
 ## Stats
 
-
+
 
-
 
 **103 Tests** • **94%+ Coverage** • **Production Ready**
 
 ---
 
-## Recognition
-
-**Featured on:**
-- [Smithery](https://smithery.ai/server/@sylphx/pdf-reader-mcp) - MCP directory
-- [Glama](https://glama.ai/mcp/servers/@sylphx/pdf-reader-mcp) - AI marketplace
-- [MseeP.ai](https://mseep.ai/app/sylphxltd-pdf-reader-mcp) - Security validated
-
-**Trusted worldwide** • **Enterprise adoption** • **Battle-tested**
-
----
-
 ## License
 
-MIT
-
-See [LICENSE](./LICENSE) for details.
+MIT © [Sylphx](https://sylphx.com)
 
 ---
 
-
+## Credits
 
-
+Built with:
+- [PDF.js](https://mozilla.github.io/pdf.js/) - Mozilla PDF engine
+- [MCP SDK](https://modelcontextprotocol.io) - Model Context Protocol
+- [Vitest](https://vitest.dev) - Fast testing framework
 
-
+Special thanks to the open source community ❤️
 
-
+---
 
-
+<p align="center">
+<strong>5-10x faster. Production-ready. Battle-tested.</strong>
+<br>
+<sub>The PDF processing server that actually scales</sub>
+<br><br>
+<a href="https://sylphx.com">sylphx.com</a> •
+<a href="https://x.com/SylphxAI">@SylphxAI</a> •
+<a href="mailto:hi@sylphx.com">hi@sylphx.com</a>
+</p>