mcp-researchpowerpack 3.9.5 → 4.3.0

Files changed (188)
  1. package/README.md +116 -208
  2. package/dist/index.js +280 -337
  3. package/dist/index.js.map +7 -1
  4. package/dist/mcp-use.json +7 -0
  5. package/dist/src/clients/reddit.js +278 -0
  6. package/dist/src/clients/reddit.js.map +7 -0
  7. package/dist/src/clients/scraper.js +326 -0
  8. package/dist/src/clients/scraper.js.map +7 -0
  9. package/dist/src/clients/search.js +217 -0
  10. package/dist/src/clients/search.js.map +7 -0
  11. package/dist/src/config/index.js +138 -0
  12. package/dist/src/config/index.js.map +7 -0
  13. package/dist/src/prompts/deep-research.js +41 -0
  14. package/dist/src/prompts/deep-research.js.map +7 -0
  15. package/dist/src/prompts/reddit-sentiment.js +47 -0
  16. package/dist/src/prompts/reddit-sentiment.js.map +7 -0
  17. package/dist/src/schemas/reddit.js +21 -0
  18. package/dist/src/schemas/reddit.js.map +7 -0
  19. package/dist/src/schemas/scrape-links.js +26 -0
  20. package/dist/src/schemas/scrape-links.js.map +7 -0
  21. package/dist/src/schemas/start-research.js +17 -0
  22. package/dist/src/schemas/start-research.js.map +7 -0
  23. package/dist/src/schemas/web-search.js +53 -0
  24. package/dist/src/schemas/web-search.js.map +7 -0
  25. package/dist/src/services/llm-processor.js +675 -0
  26. package/dist/src/services/llm-processor.js.map +7 -0
  27. package/dist/src/services/markdown-cleaner.js +62 -0
  28. package/dist/src/services/markdown-cleaner.js.map +7 -0
  29. package/dist/src/services/workflow-state.js +116 -0
  30. package/dist/src/services/workflow-state.js.map +7 -0
  31. package/dist/src/tools/mcp-helpers.js +50 -0
  32. package/dist/src/tools/mcp-helpers.js.map +7 -0
  33. package/dist/src/tools/reddit.js +277 -0
  34. package/dist/src/tools/reddit.js.map +7 -0
  35. package/dist/src/tools/registry.js +18 -0
  36. package/dist/src/tools/registry.js.map +7 -0
  37. package/dist/src/tools/scrape.js +334 -0
  38. package/dist/src/tools/scrape.js.map +7 -0
  39. package/dist/src/tools/search.js +423 -0
  40. package/dist/src/tools/search.js.map +7 -0
  41. package/dist/src/tools/start-research.js +199 -0
  42. package/dist/src/tools/start-research.js.map +7 -0
  43. package/dist/src/tools/utils.js +29 -0
  44. package/dist/src/tools/utils.js.map +7 -0
  45. package/dist/src/utils/bootstrap-guard.js +27 -0
  46. package/dist/src/utils/bootstrap-guard.js.map +7 -0
  47. package/dist/src/utils/concurrency.js +62 -0
  48. package/dist/src/utils/concurrency.js.map +7 -0
  49. package/dist/src/utils/content-extractor.js +61 -0
  50. package/dist/src/utils/content-extractor.js.map +7 -0
  51. package/dist/src/utils/errors.js +211 -0
  52. package/dist/src/utils/errors.js.map +7 -0
  53. package/dist/src/utils/logger.js +25 -0
  54. package/dist/src/utils/logger.js.map +7 -0
  55. package/dist/src/utils/markdown-formatter.js +15 -0
  56. package/dist/src/utils/markdown-formatter.js.map +7 -0
  57. package/dist/src/utils/reddit-keyword-guard.js +29 -0
  58. package/dist/src/utils/reddit-keyword-guard.js.map +7 -0
  59. package/dist/src/utils/response.js +81 -0
  60. package/dist/src/utils/response.js.map +7 -0
  61. package/dist/src/utils/retry.js +13 -0
  62. package/dist/src/utils/retry.js.map +7 -0
  63. package/dist/src/utils/sanitize.js +10 -0
  64. package/dist/src/utils/sanitize.js.map +7 -0
  65. package/dist/src/utils/source-type.js +41 -0
  66. package/dist/src/utils/source-type.js.map +7 -0
  67. package/dist/src/utils/url-aggregator.js +227 -0
  68. package/dist/src/utils/url-aggregator.js.map +7 -0
  69. package/dist/src/utils/workflow-key.js +14 -0
  70. package/dist/src/utils/workflow-key.js.map +7 -0
  71. package/dist/src/version.js +32 -0
  72. package/dist/src/version.js.map +7 -0
  73. package/package.json +33 -28
  74. package/dist/clients/reddit.d.ts +0 -69
  75. package/dist/clients/reddit.d.ts.map +0 -1
  76. package/dist/clients/reddit.js +0 -369
  77. package/dist/clients/reddit.js.map +0 -1
  78. package/dist/clients/research.d.ts +0 -67
  79. package/dist/clients/research.d.ts.map +0 -1
  80. package/dist/clients/research.js +0 -290
  81. package/dist/clients/research.js.map +0 -1
  82. package/dist/clients/scraper.d.ts +0 -72
  83. package/dist/clients/scraper.d.ts.map +0 -1
  84. package/dist/clients/scraper.js +0 -351
  85. package/dist/clients/scraper.js.map +0 -1
  86. package/dist/clients/search.d.ts +0 -57
  87. package/dist/clients/search.d.ts.map +0 -1
  88. package/dist/clients/search.js +0 -223
  89. package/dist/clients/search.js.map +0 -1
  90. package/dist/config/index.d.ts +0 -78
  91. package/dist/config/index.d.ts.map +0 -1
  92. package/dist/config/index.js +0 -201
  93. package/dist/config/index.js.map +0 -1
  94. package/dist/config/loader.d.ts +0 -40
  95. package/dist/config/loader.d.ts.map +0 -1
  96. package/dist/config/loader.js +0 -322
  97. package/dist/config/loader.js.map +0 -1
  98. package/dist/config/types.d.ts +0 -81
  99. package/dist/config/types.d.ts.map +0 -1
  100. package/dist/config/types.js +0 -6
  101. package/dist/config/types.js.map +0 -1
  102. package/dist/config/yaml/tools.yaml +0 -146
  103. package/dist/index.d.ts +0 -7
  104. package/dist/index.d.ts.map +0 -1
  105. package/dist/schemas/deep-research.d.ts +0 -64
  106. package/dist/schemas/deep-research.d.ts.map +0 -1
  107. package/dist/schemas/deep-research.js +0 -224
  108. package/dist/schemas/deep-research.js.map +0 -1
  109. package/dist/schemas/scrape-links.d.ts +0 -32
  110. package/dist/schemas/scrape-links.d.ts.map +0 -1
  111. package/dist/schemas/scrape-links.js +0 -34
  112. package/dist/schemas/scrape-links.js.map +0 -1
  113. package/dist/schemas/web-search.d.ts +0 -22
  114. package/dist/schemas/web-search.d.ts.map +0 -1
  115. package/dist/schemas/web-search.js +0 -21
  116. package/dist/schemas/web-search.js.map +0 -1
  117. package/dist/services/file-attachment.d.ts +0 -30
  118. package/dist/services/file-attachment.d.ts.map +0 -1
  119. package/dist/services/file-attachment.js +0 -207
  120. package/dist/services/file-attachment.js.map +0 -1
  121. package/dist/services/llm-processor.d.ts +0 -29
  122. package/dist/services/llm-processor.d.ts.map +0 -1
  123. package/dist/services/llm-processor.js +0 -244
  124. package/dist/services/llm-processor.js.map +0 -1
  125. package/dist/services/markdown-cleaner.d.ts +0 -8
  126. package/dist/services/markdown-cleaner.d.ts.map +0 -1
  127. package/dist/services/markdown-cleaner.js +0 -74
  128. package/dist/services/markdown-cleaner.js.map +0 -1
  129. package/dist/tools/definitions.d.ts +0 -16
  130. package/dist/tools/definitions.d.ts.map +0 -1
  131. package/dist/tools/definitions.js +0 -17
  132. package/dist/tools/definitions.js.map +0 -1
  133. package/dist/tools/reddit.d.ts +0 -14
  134. package/dist/tools/reddit.d.ts.map +0 -1
  135. package/dist/tools/reddit.js +0 -265
  136. package/dist/tools/reddit.js.map +0 -1
  137. package/dist/tools/registry.d.ts +0 -71
  138. package/dist/tools/registry.d.ts.map +0 -1
  139. package/dist/tools/registry.js +0 -252
  140. package/dist/tools/registry.js.map +0 -1
  141. package/dist/tools/research.d.ts +0 -14
  142. package/dist/tools/research.d.ts.map +0 -1
  143. package/dist/tools/research.js +0 -196
  144. package/dist/tools/research.js.map +0 -1
  145. package/dist/tools/scrape.d.ts +0 -14
  146. package/dist/tools/scrape.d.ts.map +0 -1
  147. package/dist/tools/scrape.js +0 -234
  148. package/dist/tools/scrape.js.map +0 -1
  149. package/dist/tools/search.d.ts +0 -10
  150. package/dist/tools/search.d.ts.map +0 -1
  151. package/dist/tools/search.js +0 -158
  152. package/dist/tools/search.js.map +0 -1
  153. package/dist/tools/utils.d.ts +0 -105
  154. package/dist/tools/utils.d.ts.map +0 -1
  155. package/dist/tools/utils.js +0 -159
  156. package/dist/tools/utils.js.map +0 -1
  157. package/dist/utils/concurrency.d.ts +0 -28
  158. package/dist/utils/concurrency.d.ts.map +0 -1
  159. package/dist/utils/concurrency.js +0 -92
  160. package/dist/utils/concurrency.js.map +0 -1
  161. package/dist/utils/errors.d.ts +0 -95
  162. package/dist/utils/errors.d.ts.map +0 -1
  163. package/dist/utils/errors.js +0 -390
  164. package/dist/utils/errors.js.map +0 -1
  165. package/dist/utils/logger.d.ts +0 -39
  166. package/dist/utils/logger.d.ts.map +0 -1
  167. package/dist/utils/logger.js +0 -57
  168. package/dist/utils/logger.js.map +0 -1
  169. package/dist/utils/markdown-formatter.d.ts +0 -5
  170. package/dist/utils/markdown-formatter.d.ts.map +0 -1
  171. package/dist/utils/markdown-formatter.js +0 -15
  172. package/dist/utils/markdown-formatter.js.map +0 -1
  173. package/dist/utils/response.d.ts +0 -93
  174. package/dist/utils/response.d.ts.map +0 -1
  175. package/dist/utils/response.js +0 -170
  176. package/dist/utils/response.js.map +0 -1
  177. package/dist/utils/retry.d.ts +0 -43
  178. package/dist/utils/retry.d.ts.map +0 -1
  179. package/dist/utils/retry.js +0 -57
  180. package/dist/utils/retry.js.map +0 -1
  181. package/dist/utils/url-aggregator.d.ts +0 -90
  182. package/dist/utils/url-aggregator.d.ts.map +0 -1
  183. package/dist/utils/url-aggregator.js +0 -538
  184. package/dist/utils/url-aggregator.js.map +0 -1
  185. package/dist/version.d.ts +0 -29
  186. package/dist/version.d.ts.map +0 -1
  187. package/dist/version.js +0 -55
  188. package/dist/version.js.map +0 -1
package/README.md CHANGED
@@ -1,263 +1,171 @@
1
- <h1 align="center">🔬 MCP Research Powerpack</h1>
1
+ # mcp-researchpowerpack
2
2
 
3
- <p align="center">
4
- <strong>Five research tools for AI assistants — search, scrape, mine Reddit, and synthesize with LLMs.</strong>
5
- </p>
3
+ HTTP MCP server for research. Orientation-first search, Reddit mining, and scraping — all over `/mcp`.
6
4
 
7
- <p align="center">
8
- <a href="https://www.npmjs.com/package/mcp-research-powerpack"><img src="https://img.shields.io/npm/v/mcp-research-powerpack.svg?style=flat-square&color=cb3837" alt="npm"></a>
9
- <a href="https://www.npmjs.com/package/mcp-research-powerpack"><img src="https://img.shields.io/npm/dm/mcp-research-powerpack.svg?style=flat-square&color=blue" alt="downloads"></a>
10
- <a href="https://nodejs.org/"><img src="https://img.shields.io/badge/node-%3E%3D20-93450a.svg?style=flat-square" alt="node"></a>
11
- <a href="https://opensource.org/licenses/MIT"><img src="https://img.shields.io/badge/license-MIT-grey.svg?style=flat-square" alt="license"></a>
12
- <a href="https://modelcontextprotocol.io"><img src="https://img.shields.io/badge/MCP-compatible-5a67d8.svg?style=flat-square" alt="MCP"></a>
13
- </p>
5
+ Built on [mcp-use](https://github.com/nicepkg/mcp-use). No stdio, HTTP only.
14
6
 
15
- <p align="center">
16
- <code>npx mcp-research-powerpack</code>
17
- </p>
7
+ ## tools
18
8
 
19
- ---
9
+ | tool | what it does | needs |
10
+ |------|-------------|-------|
11
+ | `start-research` | one-time orientation step that unlocks the research workflow for the current conversation/session. Emits the companion `run-research` skill install hint on every boot. | none |
12
+ | `web-search` | parallel Google search across 1–100 queries with URL aggregation, hostname-heuristic `source_type` tagging, and follow-up suggestions. `scope: "reddit"` filters to post permalinks (subreddit homepages dropped). `verbose: true` restores per-row metadata + Signals block. | `SERPER_API_KEY` |
13
+ | `get-reddit-post` | fetch 1–100 Reddit posts with full comment trees. Returns `isError: true` when every URL fails. | `REDDIT_CLIENT_ID` + `REDDIT_CLIENT_SECRET` |
14
+ | `scrape-links` | scrape 1–100 URLs with optional LLM extraction. HTML chrome stripped server-side via Readability. Reddit URLs are rejected with `UNSUPPORTED_URL_TYPE` — use `get-reddit-post`. | `SCRAPEDO_API_KEY` |
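
For orientation, a `tools/call` request against `web-search` might look like the sketch below. The argument names (`queries`, `scope`, `verbose`) follow the table above; treat the exact schema as an assumption until you check it against `tools/list`.

```typescript
// Hypothetical JSON-RPC payload for calling web-search over /mcp.
// Argument names mirror the table above; they are not a verified schema.
const callWebSearch = {
  jsonrpc: "2.0",
  id: 1,
  method: "tools/call",
  params: {
    name: "web-search",
    arguments: {
      queries: ["best static site generators 2025", "astro vs next.js"],
      scope: "reddit", // "web" | "reddit" | "both"
      verbose: false,  // true restores per-row metadata plus the Signals block
    },
  },
};

console.log(callWebSearch.params.name);
```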
20
15
 
21
- An [MCP](https://modelcontextprotocol.io) server that gives Claude, Cursor, Windsurf, and any MCP-compatible AI assistant a complete research toolkit. Google search, Reddit deep-dives, web scraping with AI extraction, and multi-model deep research — all as tools that chain into each other.
16
+ Also exposes `/health`, `health://status`, and two optional MCP prompts: `deep-research` and `reddit-sentiment`.
22
17
 
23
- Zero config to start. Each API key you add unlocks more capabilities.
18
+ ## workflow
24
19
 
25
- ## Tools
20
+ Call `start-research` once at the beginning of each conversation/session.
26
21
 
27
- | Tool | What it does | Requires |
28
- |:-----|:-------------|:---------|
29
- | **`web_search`** | Parallel Google search across 3–100 keywords with CTR-weighted ranking and consensus detection | `SERPER_API_KEY` |
30
- | **`search_reddit`** | Same search engine filtered to reddit.com — 10–50 queries in parallel | `SERPER_API_KEY` |
31
- | **`get_reddit_post`** | Fetch 2–50 Reddit posts with full comment trees, smart comment budget allocation | `REDDIT_CLIENT_ID` + `REDDIT_CLIENT_SECRET` |
32
- | **`scrape_links`** | Scrape 1–50 URLs with JS rendering fallback, HTML→Markdown, optional AI extraction | `SCRAPEDO_API_KEY` |
33
- | **`deep_research`** | Send questions to research-capable models (Grok, Gemini) with web search, file attachments | `OPENROUTER_API_KEY` |
22
+ It returns an orientation brief explaining how to route between:
34
23
 
35
- Tools are designed to **chain**: `web_search` `scrape_links` `search_reddit` `get_reddit_post` `deep_research` for synthesis. Each tool suggests the next logical step in its output.
24
+ - `web-search` (with `scope: "web" | "reddit" | "both"`)
25
+ - `get-reddit-post`
26
+ - `scrape-links`
36
27
 
37
- ## Quick Start
28
+ All three gated tools advertise this precondition via `_meta.requires: ["start-research"]` in `tools/list`, so capability-aware clients can skip pre-bootstrap calls.
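
A capability-aware client can act on that hint with a filter along these lines. The tool shapes are illustrative, not the SDK's exact types:

```typescript
// Sketch: honor the _meta.requires hint from tools/list by only surfacing
// tools whose preconditions have already been satisfied this session.
interface ToolEntry {
  name: string;
  _meta?: { requires?: string[] };
}

function callableNow(tools: ToolEntry[], completed: Set<string>): string[] {
  return tools
    .filter((t) => (t._meta?.requires ?? []).every((r) => completed.has(r)))
    .map((t) => t.name);
}

const tools: ToolEntry[] = [
  { name: "start-research" },
  { name: "web-search", _meta: { requires: ["start-research"] } },
  { name: "get-reddit-post", _meta: { requires: ["start-research"] } },
];

// Before bootstrap only start-research is callable; after it, everything is.
const before = callableNow(tools, new Set());                   // ["start-research"]
const after = callableNow(tools, new Set(["start-research"])); // all three names
```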
38
29
 
39
- ### Claude Desktop / Claude Code
30
+ Pair the server with the [`run-research`](https://github.com/yigitkonur/skills-by-yigitkonur/tree/main/skills/run-research) skill for the full agentic playbook:
40
31
 
41
- Add to your MCP config (`~/Library/Application Support/Claude/claude_desktop_config.json`):
42
-
43
- ```json
44
- {
45
- "mcpServers": {
46
- "research-powerpack": {
47
- "command": "npx",
48
- "args": ["-y", "mcp-research-powerpack"],
49
- "env": {
50
- "SERPER_API_KEY": "your-key-here",
51
- "OPENROUTER_API_KEY": "your-key-here"
52
- }
53
- }
54
- }
55
- }
32
+ ```bash
33
+ npx -y skills add -y -g yigitkonur/skills-by-yigitkonur/skills/run-research
56
34
  ```
57
35
 
58
- ### Cursor
36
+ ## quickstart
37
+
38
+ ```bash
39
+ # from npm
40
+ HOST=127.0.0.1 PORT=3000 npx -y mcp-researchpowerpack
41
+
42
+ # from source
43
+ git clone https://github.com/yigitkonur/mcp-researchpowerpack.git
44
+ cd mcp-researchpowerpack
45
+ pnpm install && pnpm dev
46
+ ```
59
47
 
60
- Add to `.cursor/mcp.json` in your project:
48
+ Connect your client to `http://localhost:3000/mcp`:
61
49
 
62
50
  ```json
63
51
  {
64
52
  "mcpServers": {
65
53
  "research-powerpack": {
66
- "command": "npx",
67
- "args": ["-y", "mcp-research-powerpack"],
68
- "env": {
69
- "SERPER_API_KEY": "your-key-here"
70
- }
54
+ "url": "http://localhost:3000/mcp"
71
55
  }
72
56
  }
73
57
  }
74
58
  ```
75
59
 
76
- ### From Source
77
-
78
- ```bash
79
- git clone https://github.com/yigitkonur/mcp-research-powerpack.git
80
- cd mcp-research-powerpack
81
- pnpm install && pnpm build
82
- pnpm start
83
- ```
84
-
85
- ### HTTP Transport
86
-
87
- ```bash
88
- MCP_TRANSPORT=http MCP_PORT=3000 npx mcp-research-powerpack
89
- ```
90
-
91
- Exposes `/mcp` endpoint (POST/GET/DELETE with session headers) and `/health`.
92
-
93
- ## API Keys
94
-
95
- Each key unlocks a capability. Missing keys silently disable their tools — the server never crashes.
96
-
97
- | Variable | Enables | Free Tier |
98
- |:---------|:--------|:----------|
99
- | `SERPER_API_KEY` | `web_search`, `search_reddit` | 2,500 searches/mo — [serper.dev](https://serper.dev) |
100
- | `REDDIT_CLIENT_ID` + `REDDIT_CLIENT_SECRET` | `get_reddit_post` | Unlimited — [reddit.com/prefs/apps](https://www.reddit.com/prefs/apps) (script type) |
101
- | `SCRAPEDO_API_KEY` | `scrape_links` | 1,000 credits/mo — [scrape.do](https://scrape.do) |
102
- | `OPENROUTER_API_KEY` | `deep_research`, LLM extraction | Pay-per-token — [openrouter.ai](https://openrouter.ai) |
103
- | `CEREBRAS_API_KEY` | Cerebras LLM extraction | — |
104
- | `USE_CEREBRAS` | Enable Cerebras for extraction (set `true`) | `false` |
105
-
106
- ## Configuration
107
-
108
- Optional tuning via environment variables:
109
-
110
- | Variable | Default | Description |
111
- |:---------|:--------|:------------|
112
- | `RESEARCH_MODEL` | `x-ai/grok-4-fast` | Primary deep research model |
113
- | `RESEARCH_FALLBACK_MODEL` | `google/gemini-2.5-flash` | Fallback when primary fails |
114
- | `LLM_EXTRACTION_MODEL` | `openai/gpt-oss-120b:nitro` | Model for scrape/reddit AI extraction |
115
- | `DEFAULT_REASONING_EFFORT` | `high` | Research depth: `low`, `medium`, `high` |
116
- | `DEFAULT_MAX_URLS` | `100` | Max search results per research question (10–200) |
117
- | `API_TIMEOUT_MS` | `1800000` | Request timeout in ms (default: 30 min) |
118
- | `MCP_TRANSPORT` | `stdio` | Transport mode: `stdio` or `http` |
119
- | `MCP_PORT` | `3000` | Port for HTTP mode |
120
- | `USE_CEREBRAS` | `false` | Set to `true` to use Cerebras for extraction instead of OpenRouter |
121
- | `CEREBRAS_API_KEY` | — | API key for Cerebras cloud — [cloud.cerebras.ai](https://cloud.cerebras.ai) |
122
-
123
- ### Cerebras Support
124
-
125
- When `USE_CEREBRAS=true` and `CEREBRAS_API_KEY` are set, the `scrape_links` tool uses Cerebras (Z.ai GLM 4.7) for AI content extraction instead of OpenRouter. This provides:
126
-
127
- - **Ultra-fast extraction** — Cerebras inference is optimized for speed
128
- - **Independent from OpenRouter** — extraction works even without `OPENROUTER_API_KEY`
129
- - **Automatic fallback** — if Cerebras is not configured, falls back to OpenRouter
130
-
131
- ```bash
132
- # Enable Cerebras for extraction
133
- USE_CEREBRAS=true CEREBRAS_API_KEY=your-key npx mcp-research-powerpack
134
- ```
135
-
136
- ### Network Resilience
137
-
138
- All LLM API calls include built-in stability protections:
139
-
140
- - **Request deadlines** — hard timeout prevents calls from hanging indefinitely
141
- - **Stall detection** — if no response arrives within a threshold, the request is aborted and retried
142
- - **Exponential backoff** — transient failures (429, 5xx) retry with jitter to avoid thundering herd
143
- - **Connection loss recovery** — network errors (ECONNRESET, ECONNREFUSED) trigger automatic retry
144
- - **Graceful degradation** — all tools return structured errors instead of crashing
60
+ ## config
145
61
 
146
- ## How It Works
62
+ Copy `.env.example`, set only what you need. Missing keys don't crash the server — they disable the affected capability with a clear error.
147
63
 
148
- ### Search Ranking
64
+ ### server
149
65
 
150
- Results from multiple queries are deduplicated by normalized URL and scored using **CTR-weighted position values** (position 1 = 100.0, position 10 = 12.56). URLs appearing across multiple queries get a consensus marker. Frequency threshold starts at ≥3, falls back to ≥2, then ≥1 to ensure results.
66
+ | var | default | |
67
+ |-----|---------|---|
68
+ | `PORT` | `3000` | HTTP port |
69
+ | `HOST` | `127.0.0.1` | bind address |
70
+ | `ALLOWED_ORIGINS` | unset | comma-separated origins for host validation |
71
+ | `MCP_URL` | unset | fallback public MCP URL used by the production origin-protection guard |
72
+ | `REDIS_URL` | unset | Redis-backed MCP sessions, distributed SSE, and workflow state |
151
73
 
152
- ### Reddit Comment Budget
74
+ ### providers
153
75
 
154
- Global budget of **1,000 comments**, max 200 per post. After the first pass, surplus from posts with fewer comments is redistributed to truncated posts in a second fetch pass.
76
+ | var | enables |
77
+ |-----|---------|
78
+ | `SERPER_API_KEY` | `web-search` (open web + `scope: "reddit"`) |
79
+ | `REDDIT_CLIENT_ID` + `REDDIT_CLIENT_SECRET` | `get-reddit-post` |
80
+ | `SCRAPEDO_API_KEY` | `scrape-links` |
81
+ | `LLM_API_KEY` | AI extraction, search classification, and raw-mode refine suggestions |
155
82
 
156
- ### Scraping Pipeline
83
+ ### llm (AI extraction + classification)
157
84
 
158
- **Three-mode fallback** per URL: basic → JS rendering → JS + US geo-targeting. Results go through HTML→Markdown conversion (Turndown), then optional AI extraction with a 100K char input cap and 8,000 token output per URL.
85
+ Any OpenAI-compatible provider works: OpenRouter, Cerebras, Together, etc.
159
86
 
160
- ### Deep Research
87
+ | var | default | |
88
+ |-----|---------|---|
89
+ | `LLM_API_KEY` | *(required for LLM features)* | API key for the LLM provider |
90
+ | `LLM_BASE_URL` | `https://openrouter.ai/api/v1` | base URL |
91
+ | `LLM_MODEL` | `openai/gpt-5.4-mini` | model identifier |
92
+ | `LLM_MAX_TOKENS` | `8000` | max output tokens |
93
+ | `LLM_REASONING` | `low` | `none` \| `low` \| `medium` \| `high` |
94
+ | `LLM_CONCURRENCY` | `50` | parallel LLM calls |
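
As a sketch, resolving that table into a config object with its documented defaults could look like this. Only the variable names and defaults come from the table; the parsing details are assumptions:

```typescript
// Illustrative env resolution for the LLM settings above.
interface LlmConfig {
  apiKey?: string; // absent key disables LLM features rather than crashing
  baseUrl: string;
  model: string;
  maxTokens: number;
  reasoning: "none" | "low" | "medium" | "high";
  concurrency: number;
}

function loadLlmConfig(env: Record<string, string | undefined>): LlmConfig {
  return {
    apiKey: env.LLM_API_KEY,
    baseUrl: env.LLM_BASE_URL ?? "https://openrouter.ai/api/v1",
    model: env.LLM_MODEL ?? "openai/gpt-5.4-mini",
    maxTokens: Number(env.LLM_MAX_TOKENS ?? 8000),
    reasoning: (env.LLM_REASONING ?? "low") as LlmConfig["reasoning"],
    concurrency: Number(env.LLM_CONCURRENCY ?? 50),
  };
}

const cfg = loadLlmConfig({}); // empty env: every documented default applies
```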
161
95
 
162
- **32,000 token budget** divided across questions (1 question = 32K, 10 questions = 3.2K each). Gemini models get `google_search` tool access. Grok/Perplexity get `search_parameters` with citations. Primary model fails → automatic fallback to secondary model.
96
+ ### evals
163
97
 
164
- ### File Attachments
98
+ `pnpm test:evals` writes a JSON artifact to `test-results/eval-runs/<timestamp>.json`.
165
99
 
166
- `deep_research` can read **local files** and include them as context. Files over 600 lines are smart-truncated (first 500 + last 100 lines). Line ranges supported. Line numbers preserved in output.
100
+ When an OpenAI API key is present, it performs a live Responses API + remote MCP evaluation.
101
+ Without an API key, it exits successfully in explicit skip mode and records that skip in the artifact.
167
102
 
168
- ## Concurrency
103
+ Useful env vars:
169
104
 
170
- | Operation | Parallel Limit |
171
- |:----------|:---------------|
172
- | Web search keywords | 8 |
173
- | Reddit search queries | 8 |
174
- | Reddit post fetches per batch | 5 (batches of 10) |
175
- | URL scraping per batch | 10 (batches of 30) |
176
- | LLM extraction | 3 |
177
- | Deep research questions | 3 |
105
+ - `EVAL_MCP_URL`
106
+ - `EVAL_MODEL`
107
+ - `EVAL_API_KEY` or `OPENAI_API_KEY`
178
108
 
179
- All clients use **manual retry with exponential backoff and jitter**. The OpenAI SDK's built-in retry is disabled (`maxRetries: 0`).
109
+ ## dev
180
110
 
181
- ## Architecture
182
-
183
- ```
184
- src/
185
- ├── index.ts Entry point — STDIO + HTTP transport, graceful shutdown
186
- ├── worker.ts Cloudflare Workers entry (Durable Objects)
187
- ├── config/
188
- │ ├── index.ts Env parsing, capability detection, lazy Proxy config
189
- │ ├── loader.ts YAML → Zod → JSON Schema pipeline
190
- │ └── yaml/tools.yaml Single source of truth for tool definitions
191
- ├── schemas/ Zod input validation (deep-research, scrape-links, web-search)
192
- ├── tools/
193
- │ ├── registry.ts Tool lookup → capability check → validate → execute
194
- │ ├── search.ts web_search handler
195
- │ ├── reddit.ts search_reddit + get_reddit_post handlers
196
- │ ├── scrape.ts scrape_links handler
197
- │ └── research.ts deep_research handler
198
- ├── clients/
199
- │ ├── search.ts Google Serper API client
200
- │ ├── reddit.ts Reddit OAuth + comment tree parser
201
- │ ├── scraper.ts Scrape.do client with fallback modes
202
- │ └── research.ts OpenRouter client with model-specific handling
203
- ├── services/
204
- │ ├── llm-processor.ts Shared LLM extraction (singleton OpenAI client)
205
- │ ├── markdown-cleaner.ts HTML → Markdown via Turndown
206
- │ └── file-attachment.ts Local file reading with line ranges
207
- └── utils/
208
- ├── retry.ts Shared backoff + retry constants
209
- ├── concurrency.ts Bounded parallel execution (pMap, pMapSettled)
210
- ├── url-aggregator.ts CTR-weighted scoring + consensus detection
211
- ├── errors.ts Error classification + structured errors
212
- ├── logger.ts MCP logging protocol
213
- └── response.ts Standardized 70/20/10 output formatting
111
+ ```bash
112
+ pnpm install
113
+ pnpm dev # watch mode, serves :3000/mcp
114
+ pnpm typecheck # tsc --noEmit
115
+ pnpm test # unit + http integration tests
116
+ pnpm build # compile to dist/
117
+ pnpm inspect # mcp-use inspector
214
118
  ```
215
119
 
216
- ## Deploy
217
-
218
- ### Cloudflare Workers
120
+ ## deploy
219
121
 
220
122
  ```bash
221
- npx wrangler deploy
123
+ pnpm build
124
+ pnpm deploy # manufact cloud
222
125
  ```
223
126
 
224
- Uses Durable Objects with SQLite storage. YAML-based tool definitions are replaced with inline definitions since there's no filesystem in Workers.
225
-
226
- ### npm
227
-
228
- Published as [`mcp-research-powerpack`](https://www.npmjs.com/package/mcp-research-powerpack). Binary names: `mcp-research-powerpack`, `research-powerpack-mcp`.
229
-
230
- ## Development
127
+ Or self-host anywhere with Node 20.19+ / 22.12+:
231
128
 
232
129
  ```bash
233
- pnpm install # Install dependencies
234
- pnpm dev # Run with tsx (live TypeScript)
235
- pnpm build # Compile to dist/
236
- pnpm typecheck # Type-check without emitting
237
- pnpm start # Run compiled output
130
+ HOST=0.0.0.0 ALLOWED_ORIGINS=https://app.example.com pnpm start
238
131
  ```
239
132
 
240
- ### Testing
133
+ ## architecture
241
134
 
242
- ```bash
243
- pnpm test:web-search # Test web search tool
244
- pnpm test:reddit-search # Test Reddit search
245
- pnpm test:scrape-links # Test scraping
246
- pnpm test:deep-research # Test deep research
247
- pnpm test:all # Run all tests
248
- pnpm test:check # Check environment setup
135
+ ```
136
+ index.ts server startup, CORS, health, shutdown
137
+ src/
138
+ config/ env parsing, capability detection, lazy proxy config
139
+ clients/ provider API clients (serper, reddit, scrapedo)
140
+ prompts/ optional MCP prompts for deep-research and reddit-sentiment
141
+ tools/
142
+ registry.ts registerAllTools() — wires tools to MCP server
143
+ start-research.ts workflow orientation entrypoint
144
+ search.ts web-search handler
145
+ reddit.ts get-reddit-post
146
+ scrape.ts scrape-links handler
147
+ mcp-helpers.ts response builders (markdown + structured MCP output)
148
+ utils.ts shared formatters, token budget allocation
149
+ services/
150
+ workflow-state.ts conversation-aware workflow state with memory/Redis backends
151
+ llm-processor.ts AI extraction/synthesis via OpenAI-compatible API
152
+ markdown-cleaner.ts HTML/markdown cleanup
153
+ schemas/ zod v4 input validation per tool
154
+ utils/
155
+ workflow-key.ts workflow identity derivation from user/session context
156
+ bootstrap-guard.ts hard gate enforcing start-research first
157
+ reddit-keyword-guard.ts one-shot redirect for reddit-first web-search misuse
158
+ sanitize.ts strips URL/control-char injection from follow-up suggestions
159
+ errors.ts structured error codes (retryable classification)
160
+ concurrency.ts pMap/pMapSettled — bounded parallel execution
161
+ retry.ts exponential backoff with jitter
162
+ url-aggregator.ts CTR-weighted URL ranking for search consensus
163
+ response.ts formatSuccess/formatError/formatBatchHeader
164
+ logger.ts mcpLog() — stderr-only (MCP-safe)
249
165
  ```
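
The bounded-parallelism helper noted for `utils/concurrency.ts` (`pMap`) can be sketched as follows. This is a minimal illustration of the pattern, not the package's actual implementation:

```typescript
// Run fn over items with at most `limit` promises in flight, preserving
// input order in the results array.
async function pMap<T, R>(
  items: readonly T[],
  fn: (item: T) => Promise<R>,
  limit: number,
): Promise<R[]> {
  const results: R[] = new Array(items.length);
  let next = 0;
  // Each worker repeatedly claims the next unprocessed index until done.
  const worker = async (): Promise<void> => {
    while (next < items.length) {
      const i = next++;
      results[i] = await fn(items[i]);
    }
  };
  await Promise.all(
    Array.from({ length: Math.min(limit, items.length) }, worker),
  );
  return results;
}
```

Usage follows the batch limits the tools describe, e.g. `await pMap(urls, scrapeOne, 10)`.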
250
166
 
251
- ## Contributing
252
-
253
- 1. Fork the repository
254
- 2. Create a feature branch (`git checkout -b feature/amazing-feature`)
255
- 3. Make your changes
256
- 4. Run `pnpm typecheck && pnpm build` to verify
257
- 5. Commit (`git commit -m 'feat: add amazing feature'`)
258
- 6. Push to your branch (`git push origin feature/amazing-feature`)
259
- 7. Open a Pull Request
167
+ Key patterns: capability detection at startup, conversation-aware workflow gating via `start-research`, always-on structured MCP tool output, raw and classified follow-up guidance in `web-search`, bounded concurrency, CTR-based URL ranking, tools never throw (always return `toolFailure`), and structured errors with retry classification.
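
The CTR-weighted URL ranking pattern can be illustrated with a small sketch. The weight curve, normalization, and field names here are assumptions modeled on the behavior described, not the package's actual numbers:

```typescript
// Illustrative CTR-weighted aggregation across multiple query result sets:
// dedupe by normalized URL, weight by SERP position, surface consensus URLs.
interface Hit {
  url: string;
  position: number; // 1-based SERP position
}

function aggregate(
  hitsPerQuery: Hit[][],
): { url: string; score: number; queries: number }[] {
  const acc = new Map<string, { score: number; queries: number }>();
  for (const hits of hitsPerQuery) {
    for (const { url, position } of hits) {
      const key = url.replace(/\/$/, "").toLowerCase(); // naive normalization
      const weight = 100 / position; // CTR-like decay: rank 1 outweighs rank 10
      const e = acc.get(key) ?? { score: 0, queries: 0 };
      acc.set(key, { score: e.score + weight, queries: e.queries + 1 });
    }
  }
  return [...acc.entries()]
    .map(([url, e]) => ({ url, ...e }))
    .sort((a, b) => b.score - a.score); // consensus URLs float to the top
}

const ranked = aggregate([
  [{ url: "https://a.com", position: 1 }],
  [{ url: "https://a.com/", position: 2 }], // same URL modulo trailing slash
]);
```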
260
168
 
261
- ## License
169
+ ## license
262
170
 
263
- [MIT](https://opensource.org/licenses/MIT) © [Yiğit Konur](https://github.com/yigitkonur)
171
+ MIT