mcp-researchpowerpack 6.0.18 → 7.0.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/README.md CHANGED
@@ -2,29 +2,46 @@
 
  # mcp-researchpowerpack
 
- HTTP MCP server for research. Five tools, orientation-first, built for agents that run multi-pass research loops.
+ http mcp server for research. five tools, orientation-first, built for agents
+ that run multi-pass research loops.
 
- Built on [mcp-use](https://github.com/nicepkg/mcp-use). No stdio, HTTP only.
+ ships on [`mcp-use`](https://github.com/nicepkg/mcp-use). every external call
+ flows through an effect ts service layer for typed concurrency, typed errors,
+ and timeouts that actually mean something. no stdio — http only.
 
  ## tools
 
  | tool | what it does | needs |
  |------|-------------|-------|
- | `start-research` | returns a goal-tailored brief: `primary_branch` (reddit / web / both), exact `first_call_sequence`, 25–50 keyword seeds, iteration hints, gaps to watch, stop criteria. Call FIRST every session. | `LLM_API_KEY` + `LLM_BASE_URL` + `LLM_MODEL` for non-degraded brief generation (optional) |
- | `raw-web-search` | parallel search, up to 50 `keywords` per call. Serper is primary; Jina Search is fallback when Serper is missing, fails, or yields empty query results. Returns the raw ranked markdown list directly. Use for broad discovery, audit trails, and Reddit permalink discovery via explicit `site:reddit.com/r/.../comments` probes. | `SERPER_API_KEY` or `JINA_API_KEY` |
- | `smart-web-search` | parallel search, up to 50 `keywords` per call, plus required `extract`. Serper/Jina provider order matches raw search. Always runs LLM classification and returns tiered markdown (HIGHLY_RELEVANT / MAYBE_RELEVANT / OTHER) + grounded synthesis + gaps + refine suggestions. Supports `scope: "web" \| "reddit" \| "both"`. | `SERPER_API_KEY` or `JINA_API_KEY` + LLM env |
- | `raw-scrape-links` | fetch URLs in parallel and return full markdown directly. Reddit post permalinks route through the Reddit API with threaded comments. Non-Reddit URLs use Jina Reader first, then Jina Reader through Scrape.do proxy mode, then optional Kernel browser rendering for web pages. | optional `REDDIT_CLIENT_ID` / `REDDIT_CLIENT_SECRET`, `SCRAPEDO_API_KEY`, `JINA_API_KEY`, `KERNEL_API_KEY` |
- | `smart-scrape-links` | same fetch stack as raw scrape, then always runs per-URL LLM extraction with required `extract`. Use for focused evidence packs with `## Source`, `## Matches`, `## Not found`, and `## Follow-up signals`. | raw scrape providers + LLM env |
+ | `start-research` | returns a goal-tailored brief: `primary_branch` (reddit / web / both), exact `first_call_sequence`, 25–50 keyword seeds, iteration hints, gaps to watch, stop criteria. call first every session. | `LLM_API_KEY` + `LLM_BASE_URL` + `LLM_MODEL` for the non-degraded brief (optional — server falls back to a static playbook) |
+ | `raw-web-search` | parallel search, up to 50 `keywords` per call. serper is primary; jina search is the fallback when serper is missing, fails, or returns empty. returns the raw ranked markdown list directly — no llm pass. use for broad discovery, audit trails, and reddit permalink probes via explicit `site:reddit.com/r/.../comments`. | `SERPER_API_KEY` or `JINA_API_KEY` |
+ | `smart-web-search` | parallel search, up to 50 `keywords` per call, plus required `extract`. same provider order as raw. always runs llm classification and returns tiered markdown (HIGHLY_RELEVANT / MAYBE_RELEVANT / OTHER) + grounded synthesis + gaps + refine suggestions. supports `scope: "web" \| "reddit" \| "both"`. | `SERPER_API_KEY` or `JINA_API_KEY` + llm env triple |
+ | `raw-scrape-links` | fetch urls in parallel, return full markdown directly. reddit post permalinks route through the reddit api with threaded comments. non-reddit urls hit jina reader first, then jina reader through scrape.do proxy mode, then optional kernel browser rendering for web pages. pdf / docx / pptx / xlsx urls go straight through jina reader. | optional `REDDIT_CLIENT_ID` / `REDDIT_CLIENT_SECRET`, `SCRAPEDO_API_KEY`, `JINA_API_KEY`, `KERNEL_API_KEY` |
+ | `smart-scrape-links` | same fetch stack as raw scrape, then per-url llm extraction with required `extract`. returns focused evidence packs with `## Source`, `## Matches`, `## Not found`, and `## Follow-up signals`. | raw scrape providers + llm env triple |
 
- Also exposes `/health` and `health://status`.
+ also exposes `/health` (simplified for proxies) and `health://status` (full
+ json: planner/extractor reachability, consecutive failure counters, uptime,
+ active sessions).
 
  ## workflow
 
- Call `start-research` once at the beginning of each session with your goal. The server returns a brief that tells the agent exactly which tool to call first (reddit-first for sentiment/migration, web-first for spec/bug/pricing, both when opinion-heavy AND needs official sources), what keyword seeds to fire, and when to stop.
+ call `start-research` once per session with your goal. the server returns a
+ brief that names the first tool to fire (reddit-first for sentiment/migration,
+ web-first for spec/bug/pricing, both when opinion-heavy and you also need
+ official sources), the keyword seeds to fan out, and stop criteria.
 
- For search fan-out, use bad → better rewrite thinking before calling `raw-web-search` or `smart-web-search`: turn broad phrases like `<feature> support`, `<product> pricing`, `<library> bug fix`, or `<tool> reviews` into source-aware probes such as `site:<official-docs-domain> "<feature>" "<platform-or-version>"`, `site:<vendor-domain> "<product>" pricing "enterprise" OR "free tier"`, `"<exact error text>" "<library-or-package>" "<version>" site:github.com`, or `site:reddit.com/r/<community>/comments "<tool>" "migration" OR "regression"`.
+ for search fan-out, think bad → better before calling `raw-web-search` or
+ `smart-web-search`. turn broad phrases like `<feature> support`, `<product>
+ pricing`, `<library> bug fix`, or `<tool> reviews` into source-aware probes
+ like:
 
- Pair the server with the [`run-research`](https://github.com/yigitkonur/skills-by-yigitkonur/tree/main/skills/run-research) skill for the full agentic playbook:
+ - `site:<official-docs-domain> "<feature>" "<platform-or-version>"`
+ - `site:<vendor-domain> "<product>" pricing "enterprise" OR "free tier"`
+ - `"<exact error text>" "<library-or-package>" "<version>" site:github.com`
+ - `site:reddit.com/r/<community>/comments "<tool>" "migration" OR "regression"`
+
+ pair the server with the [`run-research`](https://github.com/yigitkonur/skills-by-yigitkonur/tree/main/skills/run-research)
+ skill for the full agentic playbook:
 
  ```bash
  npx -y skills add -y -g https://github.com/yigitkonur/skills-by-yigitkonur --skill /run-research
@@ -42,7 +59,7 @@ cd mcp-researchpowerpack
  pnpm install && pnpm dev
  ```
 
- Connect your client to `http://localhost:3000/mcp`:
+ point your client at `http://localhost:3000/mcp`:
 
  ```json
  {
@@ -54,51 +71,70 @@ Connect your client to `http://localhost:3000/mcp`:
  }
  ```
 
+ or skip the install entirely and hit the hosted deployment at
+ `https://research.yigitkonur.com/mcp`.
+
  ## config
 
- Copy `.env.example`, set only what you need. Missing keys don't crash the server — they disable the affected capability with a clear error.
+ copy `.env.example`, set only what you need. missing keys don't crash the
+ server — they disable the affected capability with a clear error at call time.
 
  ### server
 
  | var | default | |
  |-----|---------|---|
- | `PORT` | `3000` | HTTP port |
- | `HOST` | `127.0.0.1` | bind address |
- | `ALLOWED_ORIGINS` | unset | comma-separated origins for host validation |
- | `MCP_URL` | unset | fallback public MCP URL used by the production origin-protection guard |
+ | `PORT` | `3000` | http port |
+ | `HOST` | `127.0.0.1` | bind address; cloud runtimes that set `PORT` auto-switch to `0.0.0.0` |
+ | `ALLOWED_ORIGINS` | unset | comma-separated origins for host validation / cors |
+ | `MCP_URL` | unset | fallback public mcp url used by the production origin-protection guard |
+ | `NODE_ENV` | unset | set to `production` to enforce `ALLOWED_ORIGINS` or `MCP_URL` (server exits otherwise) |
+ | `DEBUG` | unset | `1` or `2` to bump mcp-use debug verbosity |
 
  ### providers
 
  | var | enables |
  |-----|---------|
  | `SERPER_API_KEY` | primary raw/smart web search provider |
- | `SCRAPEDO_API_KEY` | Scrape.do proxy-mode retry for Jina Reader (`X-Proxy-Url`) |
+ | `SCRAPEDO_API_KEY` | scrape.do proxy-mode retry for jina reader (`X-Proxy-Url`) |
  | `REDDIT_CLIENT_ID` + `REDDIT_CLIENT_SECRET` | raw/smart scrape for reddit.com permalinks (threaded post + comments) |
- | `JINA_API_KEY` | Jina Search fallback and authenticated Jina Reader requests |
- | `KERNEL_API_KEY` | optional Kernel browser-render fallback after Jina direct + proxy fail |
- | `KERNEL_PROJECT` | optional Kernel project scoping header |
+ | `JINA_API_KEY` | jina search fallback and authenticated jina reader requests |
+ | `KERNEL_API_KEY` | optional kernel browser-render fallback after jina direct + proxy fail |
+ | `KERNEL_PROJECT` | optional kernel project scoping header for org-wide api keys |
  | `LLM_API_KEY` + `LLM_BASE_URL` + `LLM_MODEL` | goal-tailored brief, `smart-web-search`, `smart-scrape-links` |
 
- ### llm (AI extraction + classification)
+ ### llm
 
- Any OpenAI-compatible endpoint. `LLM_API_KEY`, `LLM_BASE_URL`, and `LLM_MODEL` are all required together. Reasoning effort is always `low`.
+ any openai-compatible endpoint. `LLM_API_KEY`, `LLM_BASE_URL`, and `LLM_MODEL`
+ are required together. reasoning effort is hardcoded to `low`.
 
  | var | required? | |
  |-----|-----------|---|
- | `LLM_API_KEY` | yes | API key for the endpoint |
- | `LLM_BASE_URL` | yes | base URL for the OpenAI-compatible endpoint (e.g. `https://server.up.railway.app/v1`) |
+ | `LLM_API_KEY` | yes | api key for the endpoint |
+ | `LLM_BASE_URL` | yes | base url for the openai-compatible endpoint (e.g. `https://server.up.railway.app/v1`) |
  | `LLM_MODEL` | yes | primary model (e.g. `gpt-5.4-mini`) |
- | `LLM_FALLBACK_MODEL` | no | model to use after primary exhausts all retries — gets 3 additional attempts (e.g. `gpt-5.4`) |
- | `LLM_CONCURRENCY` | no (default `50`) | parallel LLM calls |
+ | `LLM_FALLBACK_MODEL` | no | model to use after primary exhausts retries — gets 3 more attempts (e.g. `gpt-5.4`). also receives oversized inputs that exceed the primary's context window |
 
- ### evals
+ ### concurrency
 
- `pnpm test:evals` writes a JSON artifact to `test-results/eval-runs/<timestamp>.json`.
+ all optional. provider limits are clamped 1–200; kernel is clamped 1–20.
+
+ | var | default | controls |
+ |-----|---------|----------|
+ | `CONCURRENCY_SEARCH` | `50` | parallel serper / jina search queries |
+ | `CONCURRENCY_SCRAPER` | `50` | parallel scrape.do (proxy mode) requests |
+ | `CONCURRENCY_JINA_READER` | `50` | parallel jina reader fetches |
+ | `CONCURRENCY_REDDIT` | `50` | parallel reddit api fetches |
+ | `CONCURRENCY_KERNEL` | `3` | parallel kernel browser-render fallbacks |
+ | `LLM_CONCURRENCY` | `50` | parallel llm extraction / classification calls |
+
+ ### evals
 
- When an OpenAI API key is present, it performs a live Responses API + remote MCP evaluation.
- Without an API key, it exits successfully in explicit skip mode and records that skip in the artifact.
+ `pnpm test:evals` writes a json artifact to `test-results/eval-runs/<timestamp>.json`.
+ when an openai api key is present, it runs a live responses-api + remote-mcp
+ eval. without one, it exits successfully in explicit skip mode and records
+ the skip in the artifact.
 
- Useful env vars:
+ useful env vars:
 
  - `EVAL_MCP_URL`
  - `EVAL_MODEL`
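The concurrency section added in the hunk above says provider limits are clamped to 1–200 (kernel to 1–20), with per-provider defaults. A minimal sketch of that clamping behavior — `clampConcurrency` is a hypothetical helper name, not the package's actual code:

```typescript
// Hypothetical sketch of the env-var clamping the README's concurrency
// table describes: unset or unparsable values fall back to the default,
// and parsed values are clamped into [1, max].
function clampConcurrency(
  raw: string | undefined,
  fallback: number,
  max = 200,
): number {
  const n = Number.parseInt(raw ?? "", 10);
  if (Number.isNaN(n)) return fallback;
  return Math.min(Math.max(n, 1), max);
}

// e.g. clampConcurrency(process.env.CONCURRENCY_KERNEL, 3, 20)
// keeps kernel browser-render fan-out inside the tighter 1–20 band.
```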
@@ -117,13 +153,13 @@ pnpm inspect # mcp-use inspector
 
  ## deploy
 
- Deploy to Manufact Cloud via the `mcp-use` CLI (GitHub-backed):
+ deploy to manufact cloud via the `mcp-use` cli (github-backed):
 
  ```bash
  pnpm deploy # runs the package script: mcp-use deploy
  ```
 
- Or self-host anywhere with Node 20.19+ / 22.12+:
+ or self-host anywhere with node 20.19+ / 22.12+:
 
  ```bash
  HOST=0.0.0.0 ALLOWED_ORIGINS=https://app.example.com pnpm start
@@ -135,30 +171,39 @@ HOST=0.0.0.0 ALLOWED_ORIGINS=https://app.example.com pnpm start
  index.ts server startup, cors, health, shutdown
  src/
  config/ env parsing, capability detection, lazy proxy config
- clients/ provider API clients (serper, reddit, scrapedo, jina)
+ effect/ typed service tags + Live layers; runExternalEffect()
+ is the single boundary tool handlers cross to talk
+ to the outside world
+ clients/ provider api clients (serper, jina, kernel, reddit,
+ scrapedo) — wrapped by Live layers in src/effect/
  tools/
- registry.ts registerAllTools() — wires 5 tools
- start-research.ts goal-tailored brief + static playbook
- search.ts raw/smart search handlers (CTR ranking + optional LLM classification)
- scrape.ts raw/smart scrape handlers (Reddit API, Jina Reader, Scrape.do proxy,
- optional Kernel, optional LLM extraction)
+ registry.ts registerAllTools() — wires the five tools
+ start-research.ts goal-tailored brief + static playbook + planner
+ circuit-breaker
+ search.ts raw/smart search handlers (ctr ranking + optional
+ llm classification)
+ scrape.ts raw/smart scrape handlers (reddit api, jina reader,
+ scrape.do proxy retry, optional kernel, optional
+ llm extraction)
  mcp-helpers.ts markdown response builders
- utils.ts shared formatters
  services/
- llm-processor.ts AI extraction, classification, brief generation — primary + fallback model, always low reasoning
- markdown-cleaner.ts HTML/markdown cleanup
+ llm-processor.ts llm extraction, classification, brief generation —
+ primary + fallback model, always low reasoning,
+ oversized inputs route straight to fallback
+ markdown-cleaner.ts html/markdown cleanup (readability + turndown)
  schemas/ zod v4 input validation per tool
- utils/
- sanitize.ts strips URL/control-char injection from follow-up suggestions
- errors.ts structured error codes (retryable classification)
- concurrency.ts pMap/pMapSettled — thin wrappers over p-map@7
- retry.ts exponential backoff with jitter
- url-aggregator.ts CTR-weighted URL ranking for search consensus
- response.ts formatSuccess/formatError/formatBatchHeader
- logger.ts mcpLog() — stderr-only (MCP-safe)
+ utils/ errors, retry, ctr aggregator, response builders,
+ logger (stderr-only, mcp-safe)
  ```
 
- Key patterns: capability detection at startup, description-led tool routing (no bootstrap gate), markdown-only MCP tool output, raw/smart tool split, tiered classified output in `smart-web-search`, Reddit API routing in scrape tools, Jina Reader first for non-Reddit URLs, Scrape.do proxy-mode retry through `X-Proxy-Url`, optional Kernel browser-render fallback, bounded concurrency via `p-map`, CTR-based URL ranking, tools never throw (always return `toolFailure`), and structured errors with retry classification.
+ key patterns: capability detection at startup, description-led tool routing
+ (no bootstrap gate), markdown-only mcp tool output for search/scrape,
+ raw/smart tool split, tiered classified output in `smart-web-search`, reddit
+ api routing in scrape tools, jina reader first for non-reddit urls,
+ scrape.do proxy-mode retry through `X-Proxy-Url`, optional kernel
+ browser-render fallback, bounded concurrency via `Effect.forEach`, ctr-based
+ url ranking, tools never throw (always return `toolFailure`), and structured
+ errors with retry classification.
 
  ## license
 
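The "capability detection at startup" and serper-primary / jina-fallback behavior the new README describes can be sketched roughly like this — names here are hypothetical, not the package's actual code:

```typescript
// Hypothetical sketch of the capability-detection pattern the README
// describes: missing keys disable a capability with a clear error at
// call time instead of crashing the server.
type Env = Partial<Record<"SERPER_API_KEY" | "JINA_API_KEY", string>>;

type SearchCapability =
  | { ok: true; provider: "serper" | "jina" }
  | { ok: false; error: string };

function detectSearchCapability(env: Env): SearchCapability {
  // serper is the primary search provider; jina search is the fallback
  if (env.SERPER_API_KEY) return { ok: true, provider: "serper" };
  if (env.JINA_API_KEY) return { ok: true, provider: "jina" };
  return {
    ok: false,
    error: "search disabled: set SERPER_API_KEY or JINA_API_KEY",
  };
}
```

A tool handler built on this shape can return the structured error as a `toolFailure` payload rather than throwing, matching the "tools never throw" pattern listed above.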
package/dist/mcp-use.json CHANGED
@@ -1,7 +1,7 @@
  {
  "includeInspector": false,
- "buildTime": "2026-05-05T10:02:24.122Z",
- "buildId": "94ba0cb513ead054",
+ "buildTime": "2026-05-05T10:52:07.548Z",
+ "buildId": "db9ed26f9820f7b5",
  "entryPoint": "dist/index.js",
  "widgets": {}
  }
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "mcp-researchpowerpack",
- "version": "6.0.18",
+ "version": "7.0.0",
  "description": "HTTP-first MCP research server: start-research plus raw/smart search and scrape tools — built on mcp-use.",
  "type": "module",
  "main": "dist/index.js",
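The "bounded concurrency" pattern the README diff names (attributed there to `Effect.forEach` with a concurrency option) behaves like this plain-Promise analogue — a sketch of the shape, not the package's implementation:

```typescript
// Plain-Promise analogue of bounded fan-out: at most `limit` of the
// `fn` calls are in flight at once, and results keep input order.
// (The package itself routes this through Effect; this is just the idea.)
async function mapBounded<A, B>(
  items: readonly A[],
  limit: number,
  fn: (item: A) => Promise<B>,
): Promise<B[]> {
  const results: B[] = new Array(items.length);
  let next = 0;
  async function worker(): Promise<void> {
    // each worker pulls the next unclaimed index until items run out
    while (next < items.length) {
      const i = next++;
      results[i] = await fn(items[i]);
    }
  }
  await Promise.all(
    Array.from({ length: Math.min(limit, items.length) }, worker),
  );
  return results;
}
```

This is the shape behind the `CONCURRENCY_*` knobs: a 50-keyword search call fans out through a pool like this rather than firing 50 unbounded requests.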