webpeel 0.8.0 → 0.9.0

This diff shows the contents of publicly available package versions released to one of the supported registries. It is provided for informational purposes only and reflects the changes between versions as they appear in their public registries.
Files changed (99)
  1. package/README.md +39 -5
  2. package/dist/cli.js +1299 -85
  3. package/dist/cli.js.map +1 -1
  4. package/dist/core/application-tracker.d.ts +85 -0
  5. package/dist/core/application-tracker.d.ts.map +1 -0
  6. package/dist/core/application-tracker.js +184 -0
  7. package/dist/core/application-tracker.js.map +1 -0
  8. package/dist/core/apply.d.ts +163 -0
  9. package/dist/core/apply.d.ts.map +1 -0
  10. package/dist/core/apply.js +817 -0
  11. package/dist/core/apply.js.map +1 -0
  12. package/dist/core/branding.d.ts +1 -1
  13. package/dist/core/branding.d.ts.map +1 -1
  14. package/dist/core/budget.d.ts +43 -0
  15. package/dist/core/budget.d.ts.map +1 -0
  16. package/dist/core/budget.js +325 -0
  17. package/dist/core/budget.js.map +1 -0
  18. package/dist/core/challenge-detection.d.ts +27 -0
  19. package/dist/core/challenge-detection.d.ts.map +1 -0
  20. package/dist/core/challenge-detection.js +436 -0
  21. package/dist/core/challenge-detection.js.map +1 -0
  22. package/dist/core/change-tracking.d.ts.map +1 -1
  23. package/dist/core/change-tracking.js +10 -1
  24. package/dist/core/change-tracking.js.map +1 -1
  25. package/dist/core/crawler.d.ts.map +1 -1
  26. package/dist/core/crawler.js +17 -4
  27. package/dist/core/crawler.js.map +1 -1
  28. package/dist/core/diff.d.ts +62 -0
  29. package/dist/core/diff.d.ts.map +1 -0
  30. package/dist/core/diff.js +289 -0
  31. package/dist/core/diff.js.map +1 -0
  32. package/dist/core/extract-listings.d.ts +39 -0
  33. package/dist/core/extract-listings.d.ts.map +1 -0
  34. package/dist/core/extract-listings.js +331 -0
  35. package/dist/core/extract-listings.js.map +1 -0
  36. package/dist/core/extract.d.ts.map +1 -1
  37. package/dist/core/extract.js +15 -2
  38. package/dist/core/extract.js.map +1 -1
  39. package/dist/core/fetcher.d.ts +29 -3
  40. package/dist/core/fetcher.d.ts.map +1 -1
  41. package/dist/core/fetcher.js +158 -20
  42. package/dist/core/fetcher.js.map +1 -1
  43. package/dist/core/human.d.ts +176 -0
  44. package/dist/core/human.d.ts.map +1 -0
  45. package/dist/core/human.js +681 -0
  46. package/dist/core/human.js.map +1 -0
  47. package/dist/core/jobs.d.ts +12 -2
  48. package/dist/core/jobs.d.ts.map +1 -1
  49. package/dist/core/jobs.js +124 -2
  50. package/dist/core/jobs.js.map +1 -1
  51. package/dist/core/map.d.ts.map +1 -1
  52. package/dist/core/map.js +14 -2
  53. package/dist/core/map.js.map +1 -1
  54. package/dist/core/paginate.d.ts +32 -0
  55. package/dist/core/paginate.d.ts.map +1 -0
  56. package/dist/core/paginate.js +107 -0
  57. package/dist/core/paginate.js.map +1 -0
  58. package/dist/core/rate-governor.d.ts +81 -0
  59. package/dist/core/rate-governor.d.ts.map +1 -0
  60. package/dist/core/rate-governor.js +238 -0
  61. package/dist/core/rate-governor.js.map +1 -0
  62. package/dist/core/search-provider.d.ts +5 -0
  63. package/dist/core/search-provider.d.ts.map +1 -1
  64. package/dist/core/search-provider.js +81 -2
  65. package/dist/core/search-provider.js.map +1 -1
  66. package/dist/core/site-search.d.ts +45 -0
  67. package/dist/core/site-search.d.ts.map +1 -0
  68. package/dist/core/site-search.js +253 -0
  69. package/dist/core/site-search.js.map +1 -0
  70. package/dist/core/strategies.d.ts +8 -0
  71. package/dist/core/strategies.d.ts.map +1 -1
  72. package/dist/core/strategies.js +185 -45
  73. package/dist/core/strategies.js.map +1 -1
  74. package/dist/core/strategy-hooks.d.ts +6 -0
  75. package/dist/core/strategy-hooks.d.ts.map +1 -1
  76. package/dist/core/strategy-hooks.js.map +1 -1
  77. package/dist/core/table-format.d.ts +31 -0
  78. package/dist/core/table-format.d.ts.map +1 -0
  79. package/dist/core/table-format.js +147 -0
  80. package/dist/core/table-format.js.map +1 -0
  81. package/dist/core/user-agents.d.ts +58 -0
  82. package/dist/core/user-agents.d.ts.map +1 -0
  83. package/dist/core/user-agents.js +159 -0
  84. package/dist/core/user-agents.js.map +1 -0
  85. package/dist/core/watch.d.ts +100 -0
  86. package/dist/core/watch.d.ts.map +1 -0
  87. package/dist/core/watch.js +368 -0
  88. package/dist/core/watch.js.map +1 -0
  89. package/dist/index.d.ts +13 -2
  90. package/dist/index.d.ts.map +1 -1
  91. package/dist/index.js +41 -4
  92. package/dist/index.js.map +1 -1
  93. package/dist/mcp/server.js +3 -0
  94. package/dist/mcp/server.js.map +1 -1
  95. package/dist/types.d.ts +73 -0
  96. package/dist/types.d.ts.map +1 -1
  97. package/dist/types.js.map +1 -1
  98. package/llms.txt +1 -1
  99. package/package.json +3 -3
package/README.md CHANGED
@@ -61,10 +61,21 @@ First 25 fetches work instantly, no signup. After that, [sign up free](https://a
  | **Firecrawl-compatible** | ✅ Drop-in replacement | ✅ Native | ❌ | ❌ |
  | **Self-hosting** | ✅ Docker compose | ⚠️ Complex | ❌ | N/A |
  | **Autonomous agent** | ✅ BYOK any LLM | ⚠️ Locked | ❌ | ❌ |
- | **MCP tools** | ✅ 9 tools | 3 | 0 | 1 |
- | **License** | ✅ MIT | AGPL-3.0 | Proprietary | MIT |
+ | **MCP tools** | ✅ 11 tools | 3 | 0 | 1 |
+ | **License** | ✅ AGPL-3.0 | AGPL-3.0 | Proprietary | MIT |
  | **Pricing** | **Free / $9 / $29** | $0 / $16 / $83 | Custom | Free |
 
+ ## Benchmarks
+
+ Evaluated on 30 real-world URLs across 6 categories (static, dynamic, SPA, protected, documents, international) against 6 competing web fetching APIs.
+
+ | Metric | WebPeel | Next best |
+ |--------|:-------:|:---------:|
+ | **Success rate** | **100%** (30/30) | 93.3% (Firecrawl, Exa, LinkUp) |
+ | **Content quality** | **92.3%** | 83.2% (Exa) |
+
+ WebPeel is the only tool that successfully extracted content from all 30 test URLs. Full methodology and per-category breakdown: [webpeel.dev/blog/benchmarks](https://webpeel.dev/blog/benchmarks)
+
  ## Install
 
  ```bash
@@ -117,9 +128,9 @@ Zero dependencies. Pure Python 3.8+. [Full SDK docs →](python-sdk/README.md)
 
  ### MCP Server
 
- 9 tools for Claude Desktop, Cursor, VS Code, and Windsurf:
+ 11 tools for Claude Desktop, Cursor, VS Code, and Windsurf:
 
- `webpeel_fetch` · `webpeel_search` · `webpeel_crawl` · `webpeel_map` · `webpeel_extract` · `webpeel_batch` · `webpeel_agent` · `webpeel_summarize` · `webpeel_brand`
+ `webpeel_fetch` · `webpeel_search` · `webpeel_crawl` · `webpeel_map` · `webpeel_extract` · `webpeel_batch` · `webpeel_brand` · `webpeel_change_track` · `webpeel_summarize` · `webpeel_answer` · `webpeel_screenshot`
 
  ```json
  {
@@ -144,7 +155,7 @@ git clone https://github.com/webpeel/webpeel.git
  cd webpeel && docker compose up
  ```
 
- Full API at `http://localhost:3000`. MIT licensed no restrictions.
+ Full API at `http://localhost:3000`. AGPL-3.0 licensed. [Commercial licensing available](mailto:support@webpeel.dev).
 
  ## Features
 
@@ -224,6 +235,29 @@ curl "https://api.webpeel.dev/v1/fetch?url=https://example.com" \
 
  Extra credit costs: fetch $0.002, search $0.001, stealth $0.01. Resets every Monday. All features on all plans. [Compare with Firecrawl →](https://webpeel.dev/migrate-from-firecrawl)
 
+ ## Project Structure
+
+ ```
+ webpeel/
+ ├── src/
+ │ ├── core/ # Core library (fetcher, strategies, markdown, crawl, search)
+ │ ├── mcp/ # MCP server (11 tools for AI assistants)
+ │ ├── server/ # Express API server (hosted version)
+ │ │ ├── routes/ # API route handlers
+ │ │ ├── middleware/ # Auth, rate limiting, SSRF protection
+ │ │ └── premium/ # Server-only premium features
+ │ ├── tests/ # Vitest test suites
+ │ ├── cli.ts # CLI entry point
+ │ ├── index.ts # Library exports
+ │ └── types.ts # TypeScript type definitions
+ ├── python-sdk/ # Python SDK (PyPI: webpeel)
+ ├── integrations/ # LangChain, LlamaIndex, CrewAI, Dify, n8n
+ ├── site/ # Landing page (webpeel.dev)
+ ├── dashboard/ # Next.js dashboard (app.webpeel.dev)
+ ├── benchmarks/ # Performance comparison suite
+ └── skills/ # AI agent skills (Claude Code, etc.)
+ ```
+
  ## Development
 
  ```bash