research-powerpack-mcp 3.5.0 → 3.6.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (68)
  1. package/README.md +674 -63
  2. package/dist/clients/reddit.d.ts +6 -1
  3. package/dist/clients/reddit.d.ts.map +1 -1
  4. package/dist/clients/reddit.js +60 -24
  5. package/dist/clients/reddit.js.map +1 -1
  6. package/dist/clients/scraper.d.ts +6 -1
  7. package/dist/clients/scraper.d.ts.map +1 -1
  8. package/dist/clients/scraper.js +77 -38
  9. package/dist/clients/scraper.js.map +1 -1
  10. package/dist/clients/search.d.ts +2 -2
  11. package/dist/clients/search.d.ts.map +1 -1
  12. package/dist/clients/search.js +11 -6
  13. package/dist/clients/search.js.map +1 -1
  14. package/dist/config/index.d.ts.map +1 -1
  15. package/dist/config/index.js +5 -1
  16. package/dist/config/index.js.map +1 -1
  17. package/dist/config/loader.d.ts.map +1 -1
  18. package/dist/config/loader.js +6 -1
  19. package/dist/config/loader.js.map +1 -1
  20. package/dist/index.js +28 -86
  21. package/dist/index.js.map +1 -1
  22. package/dist/schemas/web-search.js +1 -1
  23. package/dist/schemas/web-search.js.map +1 -1
  24. package/dist/services/file-attachment.d.ts.map +1 -1
  25. package/dist/services/file-attachment.js +25 -22
  26. package/dist/services/file-attachment.js.map +1 -1
  27. package/dist/tools/reddit.d.ts.map +1 -1
  28. package/dist/tools/reddit.js +43 -55
  29. package/dist/tools/reddit.js.map +1 -1
  30. package/dist/tools/registry.js +2 -2
  31. package/dist/tools/registry.js.map +1 -1
  32. package/dist/tools/research.d.ts +1 -2
  33. package/dist/tools/research.d.ts.map +1 -1
  34. package/dist/tools/research.js +69 -59
  35. package/dist/tools/research.js.map +1 -1
  36. package/dist/tools/scrape.d.ts +1 -2
  37. package/dist/tools/scrape.d.ts.map +1 -1
  38. package/dist/tools/scrape.js +74 -96
  39. package/dist/tools/scrape.js.map +1 -1
  40. package/dist/tools/search.d.ts +1 -2
  41. package/dist/tools/search.d.ts.map +1 -1
  42. package/dist/tools/search.js +19 -21
  43. package/dist/tools/search.js.map +1 -1
  44. package/dist/tools/utils.d.ts +68 -16
  45. package/dist/tools/utils.d.ts.map +1 -1
  46. package/dist/tools/utils.js +75 -22
  47. package/dist/tools/utils.js.map +1 -1
  48. package/dist/utils/concurrency.d.ts +29 -0
  49. package/dist/utils/concurrency.d.ts.map +1 -0
  50. package/dist/utils/concurrency.js +73 -0
  51. package/dist/utils/concurrency.js.map +1 -0
  52. package/dist/utils/logger.d.ts +26 -23
  53. package/dist/utils/logger.d.ts.map +1 -1
  54. package/dist/utils/logger.js +41 -24
  55. package/dist/utils/logger.js.map +1 -1
  56. package/dist/utils/response.d.ts +49 -62
  57. package/dist/utils/response.d.ts.map +1 -1
  58. package/dist/utils/response.js +102 -134
  59. package/dist/utils/response.js.map +1 -1
  60. package/dist/utils/url-aggregator.d.ts.map +1 -1
  61. package/dist/utils/url-aggregator.js +6 -4
  62. package/dist/utils/url-aggregator.js.map +1 -1
  63. package/package.json +2 -8
  64. package/dist/config/env.d.ts +0 -75
  65. package/dist/config/env.d.ts.map +0 -1
  66. package/dist/config/env.js +0 -87
  67. package/dist/config/env.js.map +0 -1
  68. package/dist/config/yaml/tools-enhanced.yaml +0 -0
package/README.md CHANGED
@@ -1,29 +1,137 @@
1
- # Research Powerpack MCP
1
+ <h1 align="center">🔬 Research Powerpack MCP 🔬</h1>
2
+ <h3 align="center">Stop tab-hopping for research. Start getting god-tier context.</h3>
2
3
 
3
- > The ultimate research toolkit for AI assistants. Stop tab-hopping. Start getting better context.
4
+ <p align="center">
5
+ <strong>
6
+ <em>The ultimate research toolkit for your AI coding assistant. It searches the web, mines Reddit, scrapes any URL, and synthesizes everything into perfectly structured context your LLM actually understands.</em>
7
+ </strong>
8
+ </p>
4
9
 
5
- [![npm](https://img.shields.io/npm/v/research-powerpack-mcp.svg)](https://www.npmjs.com/package/research-powerpack-mcp)
6
- [![license](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
7
- [![install-mcp](https://img.shields.io/badge/install--mcp-compatible-blue.svg)](https://github.com/wong2/install-mcp)
10
+ <p align="center">
11
+ <!-- Package Info -->
12
+ <a href="https://www.npmjs.com/package/research-powerpack-mcp"><img alt="npm" src="https://img.shields.io/npm/v/research-powerpack-mcp.svg?style=flat-square&color=4D87E6"></a>
13
+ <a href="#"><img alt="node" src="https://img.shields.io/badge/node-18+-4D87E6.svg?style=flat-square"></a>
14
+ &nbsp;&nbsp;•&nbsp;&nbsp;
15
+ <!-- Features -->
16
+ <a href="https://opensource.org/licenses/MIT"><img alt="license" src="https://img.shields.io/badge/License-MIT-F9A825.svg?style=flat-square"></a>
17
+ <a href="#"><img alt="platform" src="https://img.shields.io/badge/platform-macOS_|_Linux_|_Windows-2ED573.svg?style=flat-square"></a>
18
+ </p>
8
19
 
9
- **Search the web, mine Reddit, scrape any URL, and synthesize everything into LLM-ready context.**
20
+ <p align="center">
21
+ <img alt="modular" src="https://img.shields.io/badge/🧩_modular-use_1_tool_or_all_5-2ED573.svg?style=for-the-badge">
22
+ <img alt="zero crash" src="https://img.shields.io/badge/💪_zero_crash-missing_keys_=_helpful_errors-2ED573.svg?style=for-the-badge">
23
+ </p>
10
24
 
11
- ## Quick Start
25
+ <div align="center">
12
26
 
13
- ```bash
14
- # One-line install (auto-detects your MCP client)
15
- npx install-mcp research-powerpack-mcp --client claude-desktop
27
+ ### 🧭 Quick Navigation
28
+
29
+ [**⚡ Get Started**](#-get-started-in-60-seconds) •
30
+ [**✨ Key Features**](#-feature-breakdown-the-secret-sauce) •
31
+ [**🎮 Usage & Examples**](#-tool-reference) •
32
+ [**⚙️ API Key Setup**](#-api-key-setup-guides) •
33
+ [**🆚 Why This Slaps**](#-why-this-slaps-other-methods)
34
+
35
+ </div>
36
+
37
+ ---
38
+
39
+ **`research-powerpack-mcp`** is the research assistant your AI wishes it had. Stop asking your LLM to guess about things it doesn't know. This MCP server acts like a senior researcher, searching the web, mining Reddit discussions, scraping documentation, and synthesizing everything into perfectly structured context so your AI can actually give you answers worth a damn.
40
+
41
+ <div align="center">
42
+ <table>
43
+ <tr>
44
+ <td align="center">
45
+ <h3>🔍</h3>
46
+ <b>Batch Web Search</b><br/>
47
+ <sub>100 keywords in parallel</sub>
48
+ </td>
49
+ <td align="center">
50
+ <h3>💬</h3>
51
+ <b>Reddit Mining</b><br/>
52
+ <sub>Real opinions, not marketing</sub>
53
+ </td>
54
+ <td align="center">
55
+ <h3>🌐</h3>
56
+ <b>Universal Scraping</b><br/>
57
+ <sub>JS rendering + geo-targeting</sub>
58
+ </td>
59
+ <td align="center">
60
+ <h3>🧠</h3>
61
+ <b>Deep Research</b><br/>
62
+ <sub>AI synthesis with citations</sub>
63
+ </td>
64
+ </tr>
65
+ </table>
66
+ </div>
67
+
68
+ How it slaps:
69
+ - **You:** "What's the best database for my use case?"
70
+ - **AI + Powerpack:** Searches Google, mines Reddit threads, scrapes docs, synthesizes findings.
71
+ - **You:** Get an actually informed answer with real community opinions and citations.
72
+ - **Result:** Ship better decisions. Skip the 47 browser tabs.
73
+
74
+ ---
75
+
76
+ ## 💥 Why This Slaps Other Methods
77
+
78
+ Manually researching is a vibe-killer. `research-powerpack-mcp` makes other methods look ancient.
79
+
80
+ <table align="center">
81
+ <tr>
82
+ <td align="center"><b>❌ The Old Way (Pain)</b></td>
83
+ <td align="center"><b>✅ The Powerpack Way (Glory)</b></td>
84
+ </tr>
85
+ <tr>
86
+ <td>
87
+ <ol>
88
+ <li>Open 15 browser tabs.</li>
89
+ <li>Skim Stack Overflow answers from 2019.</li>
90
+ <li>Search Reddit, get distracted by drama.</li>
91
+ <li>Copy-paste random snippets to your AI.</li>
92
+ <li>Get a mediocre answer from confused context.</li>
93
+ </ol>
94
+ </td>
95
+ <td>
96
+ <ol>
97
+ <li>Ask your AI to research it.</li>
98
+ <li>AI searches, scrapes, mines Reddit automatically.</li>
99
+ <li>Receive synthesized insights with sources.</li>
100
+ <li>Make an informed decision.</li>
101
+ <li>Go grab a coffee. ☕</li>
102
+ </ol>
103
+ </td>
104
+ </tr>
105
+ </table>
106
+
107
+ We're not just fetching random pages. We're building **high-signal, low-noise context** with CTR-weighted ranking, smart comment allocation, and intelligent token distribution that prevents massive responses from breaking your LLM's context window.
108
+
109
+ ---
16
110
 
17
- # Or use our install script
18
- curl -fsSL https://raw.githubusercontent.com/yigitkonur/research-powerpack-mcp/main/install.sh | bash
111
+ ## 🚀 Get Started in 60 Seconds
19
112
 
20
- # Manual install
113
+ ### 1. Install
114
+
115
+ ```bash
21
116
  npm install research-powerpack-mcp
22
117
  ```
23
118
 
24
- ### Manual Configuration
119
+ ### 2. Configure Your MCP Client
120
+
121
+ <div align="center">
122
+
123
+ | Client | Config File | Docs |
124
+ |:------:|:-----------:|:----:|
125
+ | 🖥️ **Claude Desktop** | `claude_desktop_config.json` | [Setup](#claude-desktop) |
126
+ | ⌨️ **Claude Code** | `~/.claude.json` or CLI | [Setup](#claude-code-cli) |
127
+ | 🎯 **Cursor** | `.cursor/mcp.json` | [Setup](#cursorwindsurf) |
128
+ | 🏄 **Windsurf** | MCP settings | [Setup](#cursorwindsurf) |
129
+
130
+ </div>
131
+
132
+ #### Claude Desktop
25
133
 
26
- Add to your MCP client config:
134
+ Add to your `claude_desktop_config.json`:
27
135
 
28
136
  ```json
29
137
  {
@@ -43,77 +151,580 @@ Add to your MCP client config:
43
151
  }
44
152
  ```
45
153
 
46
- **Config locations:**
47
- - **Claude Desktop**: `~/Library/Application Support/Claude/claude_desktop_config.json`
48
- - **Claude Code**: `~/.claude.json`
49
- - **Cursor**: `.cursor/mcp.json`
154
+ Or, for a quick install on macOS:
155
+
156
+ ```bash
157
+ jq '.mcpServers["research-powerpack"] = {
158
+ "command": "npx",
159
+ "args": ["research-powerpack-mcp@latest"],
160
+ "disabled": false,
161
+ "env": {
162
+ "OPENROUTER_API_KEY": "xxx",
163
+ "REDDIT_CLIENT_ID": "xxx",
164
+ "REDDIT_CLIENT_SECRET": "xxx",
165
+ "RESEARCH_MODEL": "xxx",
166
+ "SCRAPEDO_API_KEY": "xxx",
167
+ "SERPER_API_KEY": "xxx"
168
+ }
169
+ }' ~/Library/Application\ Support/Claude/claude_desktop_config.json > /tmp/claude_config.json && mv /tmp/claude_config.json ~/Library/Application\ Support/Claude/claude_desktop_config.json
170
+ ```
171
+
172
+ #### Claude Code (CLI)
173
+
174
+ One command to rule them all:
175
+
176
+ ```bash
177
+ claude mcp add research-powerpack \
178
+ --scope user \
179
+ --env SERPER_API_KEY=your_key \
180
+ --env REDDIT_CLIENT_ID=your_id \
181
+ --env REDDIT_CLIENT_SECRET=your_secret \
182
+ --env OPENROUTER_API_KEY=your_key \
183
+ --env OPENROUTER_BASE_URL=https://openrouter.ai/api/v1 \
184
+ --env RESEARCH_MODEL=x-ai/grok-4.1-fast \
185
+ -- npx research-powerpack-mcp
186
+ ```
187
+
188
+ Or manually add to `~/.claude.json`:
189
+
190
+ ```json
191
+ {
192
+ "mcpServers": {
193
+ "research-powerpack": {
194
+ "command": "npx",
195
+ "args": ["research-powerpack-mcp"],
196
+ "env": {
197
+ "SERPER_API_KEY": "your_key",
198
+ "REDDIT_CLIENT_ID": "your_id",
199
+ "REDDIT_CLIENT_SECRET": "your_secret",
200
+ "OPENROUTER_API_KEY": "your_key",
201
+ "OPENROUTER_BASE_URL": "https://openrouter.ai/api/v1",
202
+ "RESEARCH_MODEL": "x-ai/grok-4.1-fast"
203
+ }
204
+ }
205
+ }
206
+ }
207
+ ```
208
+
209
+ #### Cursor/Windsurf
210
+
211
+ Add to `.cursor/mcp.json` or equivalent:
212
+
213
+ ```json
214
+ {
215
+ "mcpServers": {
216
+ "research-powerpack": {
217
+ "command": "npx",
218
+ "args": ["research-powerpack-mcp"],
219
+ "env": {
220
+ "SERPER_API_KEY": "your_key"
221
+ }
222
+ }
223
+ }
224
+ }
225
+ ```
226
+
227
+ > **✨ Zero Crash Promise:** Missing API keys? No problem. The server always starts. Tools just return helpful setup instructions instead of exploding.
228
+
229
+ ---
230
+
231
+ ## ✨ Feature Breakdown: The Secret Sauce
232
+
233
+ <div align="center">
234
+
235
+ | Feature | What It Does | Why You Care |
236
+ | :---: | :--- | :--- |
237
+ | **🔍 Batch Search**<br/>`100 keywords parallel` | Search Google for up to 100 queries simultaneously | Cover every angle of a topic in one shot |
238
+ | **📊 CTR Ranking**<br/>`Smart URL scoring` | Identifies URLs that appear across multiple searches | Surfaces high-consensus authoritative sources |
239
+ | **💬 Reddit Mining**<br/>`Real human opinions` | Google-powered Reddit search + native API fetching | Get actual user experiences, not marketing fluff |
240
+ | **🎯 Smart Allocation**<br/>`Token-aware budgets` | 1,000 comment budget distributed across posts | Deep dive on 2 posts or quick scan on 50 |
241
+ | **🌐 Universal Scraping**<br/>`Works on everything` | Auto-fallback: basic → JS render → geo-targeting | Handles SPAs, paywalls, and geo-restricted content |
242
+ | **🧠 Deep Research**<br/>`AI-powered synthesis` | Batch research with web search and citations | Get comprehensive answers to complex questions |
243
+ | **🧩 Modular Design**<br/>`Use what you need` | Each tool works independently | Pay only for the APIs you actually use |
244
+
245
+ </div>
246
+
247
+ ---
248
+
249
+ ## 🎮 Tool Reference
250
+
251
+ <div align="center">
252
+ <table>
253
+ <tr>
254
+ <td align="center">
255
+ <h3>🔍</h3>
256
+ <b><code>web_search</code></b><br/>
257
+ <sub>Batch Google search</sub>
258
+ </td>
259
+ <td align="center">
260
+ <h3>💬</h3>
261
+ <b><code>search_reddit</code></b><br/>
262
+ <sub>Find Reddit discussions</sub>
263
+ </td>
264
+ <td align="center">
265
+ <h3>📖</h3>
266
+ <b><code>get_reddit_post</code></b><br/>
267
+ <sub>Fetch posts + comments</sub>
268
+ </td>
269
+ <td align="center">
270
+ <h3>🌐</h3>
271
+ <b><code>scrape_links</code></b><br/>
272
+ <sub>Extract any URL</sub>
273
+ </td>
274
+ <td align="center">
275
+ <h3>🧠</h3>
276
+ <b><code>deep_research</code></b><br/>
277
+ <sub>AI synthesis</sub>
278
+ </td>
279
+ </tr>
280
+ </table>
281
+ </div>
282
+
283
+ ### `web_search`
284
+
285
+ **Batch web search** using Google via Serper API. Search up to 100 keywords in parallel.
286
+
287
+ | Parameter | Type | Required | Description |
288
+ |-----------|------|----------|-------------|
289
+ | `keywords` | `string[]` | Yes | Search queries (1-100). Use distinct keywords for maximum coverage. |
290
+
291
+ **Supports Google operators:** `site:`, `-exclusion`, `"exact phrase"`, `filetype:`
292
+
293
+ ```json
294
+ {
295
+ "keywords": [
296
+ "best IDE 2025",
297
+ "VS Code alternatives",
298
+ "Cursor vs Windsurf comparison"
299
+ ]
300
+ }
301
+ ```
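Results from a batch like this feed the CTR-weighted ranking mentioned in the feature table: each URL gets a position-weighted score per query, and scores are summed across queries so URLs that recur in several searches rise to the top. A hypothetical sketch — the weight values and function name are assumptions, not the package's actual `url-aggregator` logic:

```typescript
// Assumed click-through-style weights for ranks 1-5; the long tail gets a
// small flat weight. Illustrative values only, not the package's real table.
const CTR_BY_RANK = [0.28, 0.15, 0.11, 0.08, 0.06];

// Sum position-weighted scores across every query's result list, so a URL
// that appears in multiple searches accumulates a higher "consensus" score.
function rankUrls(resultsPerQuery: string[][]): [string, number][] {
  const scores = new Map<string, number>();
  for (const results of resultsPerQuery) {
    results.forEach((url, rank) => {
      const weight = CTR_BY_RANK[rank] ?? 0.03;
      scores.set(url, (scores.get(url) ?? 0) + weight);
    });
  }
  // Highest consensus score first.
  return [...scores.entries()].sort((a, b) => b[1] - a[1]);
}
```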
302
+
303
+ ---
304
+
305
+ ### `search_reddit`
306
+
307
+ **Search Reddit** via Google with automatic `site:reddit.com` filtering.
308
+
309
+ | Parameter | Type | Required | Description |
310
+ |-----------|------|----------|-------------|
311
+ | `queries` | `string[]` | Yes | Search queries (max 10) |
312
+ | `date_after` | `string` | No | Filter results after date (YYYY-MM-DD) |
313
+
314
+ **Search operators:** `intitle:keyword`, `"exact phrase"`, `OR`, `-exclude`
315
+
316
+ ```json
317
+ {
318
+ "queries": [
319
+ "best mechanical keyboard 2025",
320
+ "intitle:keyboard recommendation"
321
+ ],
322
+ "date_after": "2024-01-01"
323
+ }
324
+ ```
325
+
326
+ ---
327
+
328
+ ### `get_reddit_post`
329
+
330
+ **Fetch Reddit posts** with smart comment allocation (1,000 comment budget distributed automatically).
331
+
332
+ | Parameter | Type | Required | Default | Description |
333
+ |-----------|------|----------|---------|-------------|
334
+ | `urls` | `string[]` | Yes | — | Reddit post URLs (2-50) |
335
+ | `fetch_comments` | `boolean` | No | `true` | Whether to fetch comments |
336
+ | `max_comments` | `number` | No | auto | Override comment allocation |
337
+
338
+ **Smart Allocation:**
339
+ - 2 posts → ~500 comments/post (deep dive)
340
+ - 10 posts → ~100 comments/post
341
+ - 50 posts → ~20 comments/post (quick scan)
342
+
343
+ ```json
344
+ {
345
+ "urls": [
346
+ "https://reddit.com/r/programming/comments/abc123/post_title",
347
+ "https://reddit.com/r/webdev/comments/def456/another_post"
348
+ ]
349
+ }
350
+ ```
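The allocation above is a straight even split of a fixed budget. A minimal sketch of the idea — `allocateBudget` is a hypothetical name, and the package's real allocator may round or clamp differently:

```typescript
// Even split of a fixed budget across N items, floored to an integer.
// Hypothetical sketch, not the package's actual implementation.
function allocateBudget(totalBudget: number, itemCount: number): number {
  if (itemCount <= 0) throw new Error("itemCount must be positive");
  return Math.floor(totalBudget / itemCount);
}

// Matches the numbers above: 1,000 comments over 2 posts → 500 each,
// over 50 posts → 20 each.
console.log(allocateBudget(1000, 2), allocateBudget(1000, 50));
```

The same even-split pattern describes `deep_research`'s 32,000-token budget (e.g. 10 questions → 3,200 tokens each).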
351
+
352
+ ---
50
353
 
51
- ## Features
354
+ ### `scrape_links`
52
355
 
53
- | Tool | Description | Required ENV |
54
- |------|-------------|--------------|
55
- | `web_search` | Batch Google search (100 keywords parallel) | `SERPER_API_KEY` |
56
- | `search_reddit` | Find Reddit discussions | `SERPER_API_KEY` |
57
- | `get_reddit_post` | Fetch posts + comments (1K comment budget) | `REDDIT_CLIENT_ID` + `SECRET` |
58
- | `scrape_links` | Extract any URL (JS rendering, geo-targeting) | `SCRAPEDO_API_KEY` |
59
- | `deep_research` | AI-powered synthesis with citations | `OPENROUTER_API_KEY` |
356
+ **Universal URL content extraction** with automatic fallback modes.
60
357
 
61
- **Zero crash promise:** Missing API keys? Server starts anyway. Tools return helpful setup instructions.
358
+ | Parameter | Type | Required | Default | Description |
359
+ |-----------|------|----------|---------|-------------|
360
+ | `urls` | `string[]` | Yes | — | URLs to scrape (3-50) |
361
+ | `timeout` | `number` | No | `30` | Timeout per URL (seconds) |
362
+ | `use_llm` | `boolean` | No | `false` | Enable AI extraction |
363
+ | `what_to_extract` | `string` | No | — | Extraction instructions for AI |
62
364
 
63
- ## API Keys
365
+ **Automatic Fallback:** Basic → JS rendering → JS + US geo-targeting
64
366
 
65
- | Service | Free Tier | Get Key |
66
- |---------|-----------|---------|
67
- | **Serper** (Search) | 2,500 queries/mo | [serper.dev](https://serper.dev) |
68
- | **Reddit** (Posts) | Unlimited | [reddit.com/prefs/apps](https://reddit.com/prefs/apps) |
69
- | **Scrape.do** (Scraping) | 1,000 credits/mo | [scrape.do](https://scrape.do) |
70
- | **OpenRouter** (AI) | Pay-as-you-go | [openrouter.ai](https://openrouter.ai) |
367
+ ```json
368
+ {
369
+ "urls": ["https://example.com/article1", "https://example.com/article2"],
370
+ "use_llm": true,
371
+ "what_to_extract": "Extract the main arguments and key statistics"
372
+ }
373
+ ```
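The fallback behavior described above amounts to trying modes in order of cost until one returns content. A minimal sketch, assuming a pluggable `attempt` function — the mode names and signature here are illustrative, not Scrape.do's or the package's actual API:

```typescript
type ScrapeMode = "basic" | "js" | "js+geo";

// Escalate through modes cheapest-first; return the first non-empty result.
// Hypothetical sketch of the fallback chain, not the real scraper client.
async function scrapeWithFallback(
  url: string,
  attempt: (url: string, mode: ScrapeMode) => Promise<string | null>,
): Promise<{ mode: ScrapeMode; content: string } | null> {
  for (const mode of ["basic", "js", "js+geo"] as const) {
    const content = await attempt(url, mode).catch(() => null);
    if (content) return { mode, content };
  }
  return null; // every mode failed or returned nothing
}
```

Cheapest-first ordering matters because, per the Scrape.do credit costs later in this README, JS rendering and geo-targeting cost several times more than a basic scrape.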
374
+
375
+ ---
376
+
377
+ ### `deep_research`
378
+
379
+ **AI-powered batch research** with web search and citations.
71
380
 
72
- ## Usage Examples
381
+ | Parameter | Type | Required | Description |
382
+ |-----------|------|----------|-------------|
383
+ | `questions` | `object[]` | Yes | Research questions (2-10) |
384
+ | `questions[].question` | `string` | Yes | The research question |
385
+ | `questions[].file_attachments` | `object[]` | No | Files to include as context |
73
386
 
74
- ### Web Search
75
- ```typescript
76
- // Search multiple keywords in parallel
77
- web_search({ keywords: ["React hooks 2025", "Vue 3 composition API", "Svelte stores"] })
387
+ **Token Allocation:** 32,000 tokens distributed across questions:
388
+ - 2 questions → 16,000 tokens/question (deep dive)
389
+ - 10 questions → 3,200 tokens/question (rapid multi-topic)
390
+
391
+ ```json
392
+ {
393
+ "questions": [
394
+ { "question": "What are the current best practices for React Server Components in 2025?" },
395
+ { "question": "Compare Bun vs Node.js for production workloads with benchmarks." }
396
+ ]
397
+ }
78
398
  ```
79
399
 
80
- ### Reddit Research
81
- ```typescript
82
- // Find discussions
83
- search_reddit({ queries: ["best mechanical keyboard", "keyboard recommendations"] })
400
+ ---
401
+
402
+ ## ⚙️ Environment Variables & Tool Availability
403
+
404
+ Research Powerpack uses a **modular architecture**. Tools are automatically enabled based on which API keys you provide:
84
405
 
85
- // Fetch with auto comment allocation
86
- get_reddit_post({ urls: ["https://reddit.com/r/..."] })
406
+ <div align="center">
407
+
408
+ | ENV Variable | Tools Enabled | Free Tier |
409
+ |:------------:|:-------------:|:---------:|
410
+ | `SERPER_API_KEY` | `web_search`, `search_reddit` | 2,500 queries/mo |
411
+ | `REDDIT_CLIENT_ID` + `SECRET` | `get_reddit_post` | Unlimited |
412
+ | `SCRAPEDO_API_KEY` | `scrape_links` | 1,000 credits/mo |
413
+ | `OPENROUTER_API_KEY` | `deep_research` + AI in `scrape_links` | Pay-as-you-go |
414
+ | `RESEARCH_MODEL` | Model for `deep_research` | Default: `perplexity/sonar-deep-research` |
415
+ | `LLM_EXTRACTION_MODEL` | Model for AI extraction in `scrape_links` | Default: `openrouter/gpt-oss-120b:nitro` |
416
+
417
+ </div>
418
+
419
+ ### Configuration Examples
420
+
421
+ ```bash
422
+ # Search-only mode (just web_search and search_reddit)
423
+ SERPER_API_KEY=xxx
424
+
425
+ # Reddit research mode (search + fetch posts)
426
+ SERPER_API_KEY=xxx
427
+ REDDIT_CLIENT_ID=xxx
428
+ REDDIT_CLIENT_SECRET=xxx
429
+
430
+ # Full research mode (all 5 tools)
431
+ SERPER_API_KEY=xxx
432
+ REDDIT_CLIENT_ID=xxx
433
+ REDDIT_CLIENT_SECRET=xxx
434
+ SCRAPEDO_API_KEY=xxx
435
+ OPENROUTER_API_KEY=xxx
87
436
  ```
88
437
 
89
- ### Web Scraping
90
- ```typescript
91
- // With AI extraction
92
- scrape_links({
93
- urls: ["https://example.com"],
94
- use_llm: true,
95
- what_to_extract: "Extract pricing tiers | features | user reviews"
96
- })
438
+ ---
439
+
440
+ ## 🔑 API Key Setup Guides
441
+
442
+ <details>
443
+ <summary><b>🔍 Serper API (Google Search) — FREE: 2,500 queries/month</b></summary>
444
+
445
+ ### What you get
446
+ - Fast Google search results via API
447
+ - Enables `web_search` and `search_reddit` tools
448
+
449
+ ### Setup Steps
450
+ 1. Go to [serper.dev](https://serper.dev)
451
+ 2. Click **"Get API Key"** (top right)
452
+ 3. Sign up with email or Google
453
+ 4. Copy your API key from the dashboard
454
+ 5. Add to your config:
455
+ ```
456
+ SERPER_API_KEY=your_key_here
457
+ ```
458
+
459
+ ### Pricing
460
+ - **Free**: 2,500 queries/month
461
+ - **Paid**: $50/month for 50,000 queries
462
+
463
+ </details>
464
+
465
+ <details>
466
+ <summary><b>🤖 Reddit OAuth — FREE: Unlimited access</b></summary>
467
+
468
+ ### What you get
469
+ - Full Reddit API access
470
+ - Fetch posts and comments with upvote sorting
471
+ - Enables `get_reddit_post` tool
472
+
473
+ ### Setup Steps
474
+ 1. Go to [reddit.com/prefs/apps](https://www.reddit.com/prefs/apps)
475
+ 2. Scroll down and click **"create another app..."**
476
+ 3. Fill in:
477
+ - **Name**: `research-powerpack` (or any name)
478
+ - **App type**: Select **"script"** (important!)
479
+ - **Redirect URI**: `http://localhost:8080`
480
+ 4. Click **"create app"**
481
+ 5. Copy your credentials:
482
+ - **Client ID**: The string under your app name
483
+ - **Client Secret**: The "secret" field
484
+ 6. Add to your config:
485
+ ```
486
+ REDDIT_CLIENT_ID=your_client_id
487
+ REDDIT_CLIENT_SECRET=your_client_secret
488
+ ```
489
+
490
+ </details>
491
+
492
+ <details>
493
+ <summary><b>🌐 Scrape.do (Web Scraping) — FREE: 1,000 credits/month</b></summary>
494
+
495
+ ### What you get
496
+ - JavaScript rendering support
497
+ - Geo-targeting and CAPTCHA handling
498
+ - Enables `scrape_links` tool
499
+
500
+ ### Setup Steps
501
+ 1. Go to [scrape.do](https://scrape.do)
502
+ 2. Click **"Start Free"**
503
+ 3. Sign up with email
504
+ 4. Copy your API key from the dashboard
505
+ 5. Add to your config:
506
+ ```
507
+ SCRAPEDO_API_KEY=your_key_here
508
+ ```
509
+
510
+ ### Credit Usage
511
+ - **Basic scrape**: 1 credit
512
+ - **JavaScript rendering**: 5 credits
513
+ - **Geo-targeting**: +25 credits
514
+
515
+ </details>
516
+
517
+ <details>
518
+ <summary><b>🧠 OpenRouter (AI Models) — Pay-as-you-go</b></summary>
519
+
520
+ ### What you get
521
+ - Access to 100+ AI models via one API
522
+ - Enables `deep_research` tool
523
+ - Enables AI extraction in `scrape_links`
524
+
525
+ ### Setup Steps
526
+ 1. Go to [openrouter.ai](https://openrouter.ai)
527
+ 2. Sign up with Google/GitHub/email
528
+ 3. Go to [openrouter.ai/keys](https://openrouter.ai/keys)
529
+ 4. Click **"Create Key"**
530
+ 5. Copy the key (starts with `sk-or-...`)
531
+ 6. Add to your config:
532
+ ```
533
+ OPENROUTER_API_KEY=sk-or-v1-xxxxx
534
+ ```
535
+
536
+ ### Recommended Models for Deep Research
537
+ ```bash
538
+ # Default (optimized for research)
539
+ RESEARCH_MODEL=perplexity/sonar-deep-research
540
+
541
+ # Fast and capable
542
+ RESEARCH_MODEL=x-ai/grok-4.1-fast
543
+
544
+ # High quality
545
+ RESEARCH_MODEL=anthropic/claude-3.5-sonnet
546
+
547
+ # Budget-friendly
548
+ RESEARCH_MODEL=openai/gpt-4o-mini
97
549
  ```
98
550
 
99
- ### Deep Research
100
- ```typescript
101
- deep_research({
102
- questions: [{
103
- question: "Compare PostgreSQL vs MySQL for production workloads with benchmarks"
104
- }]
105
- })
551
+ ### Recommended Models for AI Extraction (`use_llm` in `scrape_links`)
552
+ ```bash
553
+ # Default (fast and cost-effective for extraction)
554
+ LLM_EXTRACTION_MODEL=openrouter/gpt-oss-120b:nitro
555
+
556
+ # High quality extraction
557
+ LLM_EXTRACTION_MODEL=anthropic/claude-3.5-sonnet
558
+
559
+ # Budget-friendly
560
+ LLM_EXTRACTION_MODEL=openai/gpt-4o-mini
106
561
  ```
107
562
 
108
- ## Development
563
+ > **Note:** `RESEARCH_MODEL` and `LLM_EXTRACTION_MODEL` are independent. You can use a powerful model for deep research and a faster/cheaper model for content extraction, or vice versa.
564
+
565
+ </details>
566
+
567
+ ---
568
+
569
+ ## 🔥 Recommended Workflows
570
+
571
+ ### Research a Technology Decision
572
+
573
+ ```
574
+ 1. web_search → ["React vs Vue 2025", "Next.js vs Nuxt comparison"]
575
+ 2. search_reddit → ["best frontend framework 2025", "Next.js production experience"]
576
+ 3. get_reddit_post → [URLs from step 2]
577
+ 4. scrape_links → [Documentation and blog URLs from step 1]
578
+ 5. deep_research → [Synthesize findings into specific questions]
579
+ ```
580
+
581
+ ### Competitive Analysis
582
+
583
+ ```
584
+ 1. web_search → ["competitor name review", "competitor vs alternatives"]
585
+ 2. scrape_links → [Competitor websites, review sites]
586
+ 3. search_reddit → ["competitor name experience", "switching from competitor"]
587
+ 4. get_reddit_post → [URLs from step 3]
588
+ ```
589
+
590
+ ### Debug an Obscure Error
591
+
592
+ ```
593
+ 1. web_search → ["exact error message", "error + framework name"]
594
+ 2. search_reddit → ["error message", "framework + error type"]
595
+ 3. get_reddit_post → [URLs with solutions]
596
+ 4. scrape_links → [Stack Overflow answers, GitHub issues]
597
+ ```
598
+
599
+ ---
600
+
601
+ ## 🔥 Enable Full Power Mode
602
+
603
+ For the best research experience, configure keys for all four services:
604
+
605
+ ```bash
606
+ SERPER_API_KEY=your_serper_key # Free: 2,500 queries/month
607
+ REDDIT_CLIENT_ID=your_reddit_id # Free: Unlimited
608
+ REDDIT_CLIENT_SECRET=your_reddit_secret
609
+ SCRAPEDO_API_KEY=your_scrapedo_key # Free: 1,000 credits/month
610
+ OPENROUTER_API_KEY=your_openrouter_key # Pay-as-you-go
611
+ ```
612
+
613
+ This unlocks:
614
+ - **5 research tools** working together
615
+ - **AI-powered content extraction** in scrape_links
616
+ - **Deep research with web search** and citations
617
+ - **Complete Reddit mining** (search → fetch → analyze)
618
+
619
+ **Total setup time:** ~10 minutes. **Total free tier value:** ~$50/month equivalent.
620
+
621
+ ---
622
+
623
+ ## 🛠️ Development
109
624
 
110
625
  ```bash
626
+ # Clone
111
627
  git clone https://github.com/yigitkonur/research-powerpack-mcp.git
112
628
  cd research-powerpack-mcp
629
+
630
+ # Install
113
631
  npm install
632
+
633
+ # Development
114
634
  npm run dev
635
+
636
+ # Build
637
+ npm run build
638
+
639
+ # Type check
640
+ npm run typecheck
115
641
  ```
116
642
 
117
- ## License
643
+ ---
644
+
645
+ ## 🏗️ Architecture (v3.4.0+)
646
+
647
+ The codebase uses a **YAML-driven configuration system** with **aggressive LLM optimization** (v3.5.0+):
648
+
649
+ ### Core Architecture
650
+
651
+ | Component | File | Purpose |
652
+ |-----------|------|---------|
653
+ | **Tool Definitions** | `src/config/yaml/tools.yaml` | Single source of truth for all tool metadata |
654
+ | **Handler Registry** | `src/tools/registry.ts` | Declarative tool registration + `executeTool` wrapper |
655
+ | **YAML Loader** | `src/config/loader.ts` | Parses YAML, generates MCP-compatible definitions (cached) |
656
+ | **Concurrency Utils** | `src/utils/concurrency.ts` | Bounded parallel execution (`pMap`/`pMapSettled`) |
657
+ | **Shared Utils** | `src/tools/utils.ts` | Common utility functions |
658
+
659
+ **Adding a new tool:**
660
+ 1. Add tool definition to `tools.yaml`
661
+ 2. Create handler in `src/tools/`
662
+ 3. Register in `src/tools/registry.ts`
663
+
664
+ See `docs/refactoring/04-migration-guide.md` for detailed instructions.
665
+
666
+ ### Performance & Stability (v3.5.1+)
667
+
668
+ All parallel operations use **bounded concurrency** to prevent CPU spikes and API rate limits:
669
+
670
+ | Operation | Before | After |
671
+ |-----------|--------|-------|
672
+ | Reddit search queries | 50 concurrent | 8 concurrent |
673
+ | Web scraping batches | 30 concurrent | 10 concurrent |
674
+ | Deep research questions | Unbounded | 3 concurrent |
675
+ | Reddit post fetching | 10 concurrent | 5 concurrent |
676
+ | File attachments | Unbounded | 5 concurrent |
677
+
678
+ Additional optimizations:
679
+ - YAML config cached in memory (no repeated disk reads)
680
+ - Async file I/O (no event loop blocking)
681
+ - Pre-compiled regex patterns for hot paths
682
+ - Reddit auth token deduplication (prevents concurrent token requests)
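The bounded-concurrency primitive behind these limits can be sketched as a fixed pool of workers pulling from a shared index. This is a minimal illustration of the `pMap` idea, not the actual `src/utils/concurrency.ts` code (which also exposes a `pMapSettled` variant):

```typescript
// Run `fn` over `items` with at most `concurrency` promises in flight.
// Minimal sketch of a bounded pMap; results keep input order.
async function pMap<T, R>(
  items: T[],
  fn: (item: T, index: number) => Promise<R>,
  concurrency: number,
): Promise<R[]> {
  const results: R[] = new Array(items.length);
  let next = 0;
  // Each worker claims the next unprocessed index (synchronously, so no
  // two workers ever get the same item) until the list is exhausted.
  async function worker(): Promise<void> {
    while (next < items.length) {
      const i = next++;
      results[i] = await fn(items[i], i);
    }
  }
  const poolSize = Math.max(1, Math.min(concurrency, items.length));
  await Promise.all(Array.from({ length: poolSize }, () => worker()));
  return results;
}
```

Capping the pool at 8 for Reddit searches (or 3 for deep research) is what turns "50 requests at once" into a steady, rate-limit-friendly stream.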
683
+
684
+ ### LLM Optimization (v3.5.0+)
685
+
686
+ All tools include **aggressive guidance** to force LLMs to use them optimally:
687
+
688
+ | Feature | Description |
689
+ |---------|-------------|
690
+ | **Configurable Limits** | All min/max values in YAML (`limits` section) |
691
+ | **BAD vs GOOD Examples** | Every tool shows anti-patterns and perfect usage |
692
+ | **Aggressive Phrasing** | Changed from "you can" to "you MUST" |
693
+ | **Visual Formatting** | Emoji headers, section dividers, icons for visual scanning |
694
+ | **Templates** | Structured formats for questions, extractions, file descriptions |
695
+
696
+ **Key Enhancements:**
697
+ - `search_reddit`: Minimum 10 queries (was 3), 10-category formula
698
+ - `deep_research`: 7-section question template, file attachment requirements
699
+ - `scrape_links`: Extraction template with OR statements, use_llm=true push
700
+ - `web_search`: Minimum 3 keywords, search operator examples
701
+ - `file_attachments`: Numbered 5-section description template
702
+
703
+ See `docs/refactoring/07-llm-optimization-summary.md` for full details.
704
+
705
+ ---
706
+
707
+ ## 🔥 Common Issues & Quick Fixes
708
+
709
+ <details>
710
+ <summary><b>Expand for troubleshooting tips</b></summary>
711
+
712
+ | Problem | Solution |
713
+ | :--- | :--- |
714
+ | **Tool returns "API key not configured"** | Add the required ENV variable to your MCP config. The error message tells you exactly which key is missing. |
715
+ | **Reddit posts returning empty** | Check your `REDDIT_CLIENT_ID` and `REDDIT_CLIENT_SECRET`. Make sure you created a "script" type app. |
716
+ | **Scraping fails on JavaScript sites** | This is expected on the first attempt: the tool auto-retries with JS rendering. If it still fails, the site may be blocking scrapers. |
717
+ | **Deep research taking too long** | Use a faster model like `x-ai/grok-4.1-fast` instead of `perplexity/sonar-deep-research`. |
718
+ | **Token limit errors** | Reduce the number of URLs/questions per request. The tool distributes a fixed token budget. |
719
+
720
+ </details>
721
+
722
+ ---
723
+
724
+ <div align="center">
725
+
726
+ **Built with 🔥 because manually researching for your AI is a soul-crushing waste of time.**
118
727
 
119
728
  MIT © [Yiğit Konur](https://github.com/yigitkonur)
729
+
730
+ </div>