mcp-researchpowerpack 6.0.18 → 7.0.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/README.md +97 -52
- package/dist/mcp-use.json +2 -2
- package/package.json +1 -1
package/README.md
CHANGED
````diff
@@ -2,29 +2,46 @@
 
 # mcp-researchpowerpack
 
-
+http mcp server for research. five tools, orientation-first, built for agents
+that run multi-pass research loops.
 
-
+ships on [`mcp-use`](https://github.com/nicepkg/mcp-use). every external call
+flows through an effect ts service layer for typed concurrency, typed errors,
+and timeouts that actually mean something. no stdio — http only.
 
 ## tools
 
 | tool | what it does | needs |
 |------|-------------|-------|
-| `start-research` | returns a goal-tailored brief: `primary_branch` (reddit / web / both), exact `first_call_sequence`, 25–50 keyword seeds, iteration hints, gaps to watch, stop criteria.
-| `raw-web-search` | parallel search, up to 50 `keywords` per call.
-| `smart-web-search` | parallel search, up to 50 `keywords` per call, plus required `extract`.
-| `raw-scrape-links` | fetch
-| `smart-scrape-links` | same fetch stack as raw scrape, then
+| `start-research` | returns a goal-tailored brief: `primary_branch` (reddit / web / both), exact `first_call_sequence`, 25–50 keyword seeds, iteration hints, gaps to watch, stop criteria. call first every session. | `LLM_API_KEY` + `LLM_BASE_URL` + `LLM_MODEL` for the non-degraded brief (optional — server falls back to a static playbook) |
+| `raw-web-search` | parallel search, up to 50 `keywords` per call. serper is primary; jina search is the fallback when serper is missing, fails, or returns empty. returns the raw ranked markdown list directly — no llm pass. use for broad discovery, audit trails, and reddit permalink probes via explicit `site:reddit.com/r/.../comments`. | `SERPER_API_KEY` or `JINA_API_KEY` |
+| `smart-web-search` | parallel search, up to 50 `keywords` per call, plus required `extract`. same provider order as raw. always runs llm classification and returns tiered markdown (HIGHLY_RELEVANT / MAYBE_RELEVANT / OTHER) + grounded synthesis + gaps + refine suggestions. supports `scope: "web" \| "reddit" \| "both"`. | `SERPER_API_KEY` or `JINA_API_KEY` + llm env triple |
+| `raw-scrape-links` | fetch urls in parallel, return full markdown directly. reddit post permalinks route through the reddit api with threaded comments. non-reddit urls hit jina reader first, then jina reader through scrape.do proxy mode, then optional kernel browser rendering for web pages. pdf / docx / pptx / xlsx urls go straight through jina reader. | optional `REDDIT_CLIENT_ID` / `REDDIT_CLIENT_SECRET`, `SCRAPEDO_API_KEY`, `JINA_API_KEY`, `KERNEL_API_KEY` |
+| `smart-scrape-links` | same fetch stack as raw scrape, then per-url llm extraction with required `extract`. returns focused evidence packs with `## Source`, `## Matches`, `## Not found`, and `## Follow-up signals`. | raw scrape providers + llm env triple |
 
-
+also exposes `/health` (simplified for proxies) and `health://status` (full
+json: planner/extractor reachability, consecutive failure counters, uptime,
+active sessions).
 
 ## workflow
 
-
+call `start-research` once per session with your goal. the server returns a
+brief that names the first tool to fire (reddit-first for sentiment/migration,
+web-first for spec/bug/pricing, both when opinion-heavy and you also need
+official sources), the keyword seeds to fan out, and stop criteria.
 
-
+for search fan-out, think bad → better before calling `raw-web-search` or
+`smart-web-search`. turn broad phrases like `<feature> support`, `<product>
+pricing`, `<library> bug fix`, or `<tool> reviews` into source-aware probes
+like:
 
-
+- `site:<official-docs-domain> "<feature>" "<platform-or-version>"`
+- `site:<vendor-domain> "<product>" pricing "enterprise" OR "free tier"`
+- `"<exact error text>" "<library-or-package>" "<version>" site:github.com`
+- `site:reddit.com/r/<community>/comments "<tool>" "migration" OR "regression"`
+
+pair the server with the [`run-research`](https://github.com/yigitkonur/skills-by-yigitkonur/tree/main/skills/run-research)
+skill for the full agentic playbook:
 
 ```bash
 npx -y skills add -y -g https://github.com/yigitkonur/skills-by-yigitkonur --skill /run-research
````
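The bad → better fan-out in the new workflow section is mechanical enough to sketch. A minimal helper, assuming hypothetical names (`buildProbes` is illustrative and not an export of the package):

```typescript
// Illustrative sketch of the broad-phrase -> source-aware-probe expansion
// described in the README's workflow section. `buildProbes` and `ProbeHints`
// are hypothetical names, not part of mcp-researchpowerpack.
interface ProbeHints {
  docsDomain?: string;   // official docs host, e.g. "docs.example.com"
  errorText?: string;    // exact error message, quoted verbatim
  library?: string;      // package name to pair with the error text
  subreddit?: string;    // community for reddit permalink probes
}

function buildProbes(phrase: string, hints: ProbeHints): string[] {
  const probes: string[] = [];
  if (hints.docsDomain) {
    probes.push(`site:${hints.docsDomain} "${phrase}"`);
  }
  if (hints.errorText && hints.library) {
    probes.push(`"${hints.errorText}" "${hints.library}" site:github.com`);
  }
  if (hints.subreddit) {
    probes.push(`site:reddit.com/r/${hints.subreddit}/comments "${phrase}"`);
  }
  // fall back to the broad phrase only when nothing sharper is available
  return probes.length > 0 ? probes : [phrase];
}
```

Strings shaped like these feed the `keywords` array of `raw-web-search` or `smart-web-search` (up to 50 per call).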
````diff
@@ -42,7 +59,7 @@ cd mcp-researchpowerpack
 pnpm install && pnpm dev
 ```
 
-
+point your client at `http://localhost:3000/mcp`:
 
 ```json
 {
````
````diff
@@ -54,51 +71,70 @@ Connect your client to `http://localhost:3000/mcp`:
 }
 ```
 
+or skip the install entirely and hit the hosted deployment at
+`https://research.yigitkonur.com/mcp`.
+
 ## config
 
-
+copy `.env.example`, set only what you need. missing keys don't crash the
+server — they disable the affected capability with a clear error at call time.
 
 ### server
 
 | var | default | |
 |-----|---------|---|
-| `PORT` | `3000` |
-| `HOST` | `127.0.0.1` | bind address |
-| `ALLOWED_ORIGINS` | unset | comma-separated origins for host validation |
-| `MCP_URL` | unset | fallback public
+| `PORT` | `3000` | http port |
+| `HOST` | `127.0.0.1` | bind address; cloud runtimes that set `PORT` auto-switch to `0.0.0.0` |
+| `ALLOWED_ORIGINS` | unset | comma-separated origins for host validation / cors |
+| `MCP_URL` | unset | fallback public mcp url used by the production origin-protection guard |
+| `NODE_ENV` | unset | set to `production` to enforce `ALLOWED_ORIGINS` or `MCP_URL` (server exits otherwise) |
+| `DEBUG` | unset | `1` or `2` to bump mcp-use debug verbosity |
 
 ### providers
 
 | var | enables |
 |-----|---------|
 | `SERPER_API_KEY` | primary raw/smart web search provider |
-| `SCRAPEDO_API_KEY` |
+| `SCRAPEDO_API_KEY` | scrape.do proxy-mode retry for jina reader (`X-Proxy-Url`) |
 | `REDDIT_CLIENT_ID` + `REDDIT_CLIENT_SECRET` | raw/smart scrape for reddit.com permalinks (threaded post + comments) |
-| `JINA_API_KEY` |
-| `KERNEL_API_KEY` | optional
-| `KERNEL_PROJECT` | optional
+| `JINA_API_KEY` | jina search fallback and authenticated jina reader requests |
+| `KERNEL_API_KEY` | optional kernel browser-render fallback after jina direct + proxy fail |
+| `KERNEL_PROJECT` | optional kernel project scoping header for org-wide api keys |
 | `LLM_API_KEY` + `LLM_BASE_URL` + `LLM_MODEL` | goal-tailored brief, `smart-web-search`, `smart-scrape-links` |
 
-### llm
+### llm
 
-
+any openai-compatible endpoint. `LLM_API_KEY`, `LLM_BASE_URL`, and `LLM_MODEL`
+are required together. reasoning effort is hardcoded to `low`.
 
 | var | required? | |
 |-----|-----------|---|
-| `LLM_API_KEY` | yes |
-| `LLM_BASE_URL` | yes | base
+| `LLM_API_KEY` | yes | api key for the endpoint |
+| `LLM_BASE_URL` | yes | base url for the openai-compatible endpoint (e.g. `https://server.up.railway.app/v1`) |
 | `LLM_MODEL` | yes | primary model (e.g. `gpt-5.4-mini`) |
-| `LLM_FALLBACK_MODEL` | no | model to use after primary exhausts
-| `LLM_CONCURRENCY` | no (default `50`) | parallel LLM calls |
+| `LLM_FALLBACK_MODEL` | no | model to use after primary exhausts retries — gets 3 more attempts (e.g. `gpt-5.4`). also receives oversized inputs that exceed the primary's context window |
 
-###
+### concurrency
 
-
+all optional. provider limits are clamped 1–200; kernel is clamped 1–20.
+
+| var | default | controls |
+|-----|---------|----------|
+| `CONCURRENCY_SEARCH` | `50` | parallel serper / jina search queries |
+| `CONCURRENCY_SCRAPER` | `50` | parallel scrape.do (proxy mode) requests |
+| `CONCURRENCY_JINA_READER` | `50` | parallel jina reader fetches |
+| `CONCURRENCY_REDDIT` | `50` | parallel reddit api fetches |
+| `CONCURRENCY_KERNEL` | `3` | parallel kernel browser-render fallbacks |
+| `LLM_CONCURRENCY` | `50` | parallel llm extraction / classification calls |
+
+### evals
 
-
-
+`pnpm test:evals` writes a json artifact to `test-results/eval-runs/<timestamp>.json`.
+when an openai api key is present, it runs a live responses-api + remote-mcp
+eval. without one, it exits successfully in explicit skip mode and records
+the skip in the artifact.
 
-
+useful env vars:
 
 - `EVAL_MCP_URL`
 - `EVAL_MODEL`
````
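The clamping rule the new concurrency section states (1–200 for providers, 1–20 for kernel) can be sketched as a small env parser. This is a minimal sketch with assumed names and defaults; the package's real parsing lives in `src/config/`:

```typescript
// Sketch of the documented clamping: provider concurrency env vars are
// clamped to 1-200, kernel to 1-20. `clampConcurrency` is illustrative,
// not the package's actual implementation.
function clampConcurrency(
  raw: string | undefined,
  fallback: number,
  max = 200, // provider cap; pass 20 for kernel
): number {
  const n = Number(raw);
  if (!Number.isFinite(n)) return fallback; // unset or garbage -> default
  return Math.min(max, Math.max(1, Math.floor(n)));
}

const searchLimit = clampConcurrency(process.env.CONCURRENCY_SEARCH, 50);    // 1-200
const kernelLimit = clampConcurrency(process.env.CONCURRENCY_KERNEL, 3, 20); // 1-20
```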
````diff
@@ -117,13 +153,13 @@ pnpm inspect # mcp-use inspector
 
 ## deploy
 
-
+deploy to manufact cloud via the `mcp-use` cli (github-backed):
 
 ```bash
 pnpm deploy # runs the package script: mcp-use deploy
 ```
 
-
+or self-host anywhere with node 20.19+ / 22.12+:
 
 ```bash
 HOST=0.0.0.0 ALLOWED_ORIGINS=https://app.example.com pnpm start
````
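The config table documents that with `NODE_ENV=production` the server exits unless `ALLOWED_ORIGINS` or `MCP_URL` is set, which matters for the self-host command above. A minimal sketch of that startup guard, assuming hypothetical names (the real check lives in the package's startup code):

```typescript
// Sketch of the documented production origin guard: in production the
// server refuses to start unless ALLOWED_ORIGINS or MCP_URL is set.
// `checkProductionOrigins` is an illustrative name, not the package's.
function checkProductionOrigins(env: Record<string, string | undefined>): boolean {
  if (env.NODE_ENV !== "production") return true;     // dev: no enforcement
  return Boolean(env.ALLOWED_ORIGINS || env.MCP_URL); // prod: need one of the two
}

if (!checkProductionOrigins(process.env)) {
  console.error("NODE_ENV=production requires ALLOWED_ORIGINS or MCP_URL");
  process.exit(1);
}
```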
````diff
@@ -135,25 +171,27 @@ HOST=0.0.0.0 ALLOWED_ORIGINS=https://app.example.com pnpm start
 index.ts          server startup, cors, health, shutdown
 src/
   config/         env parsing, capability detection, lazy proxy config
-
+  effect/         typed service tags + Live layers; runExternalEffect()
+                  is the single boundary tool handlers cross to talk
+                  to the outside world
+  clients/        provider api clients (serper, jina, kernel, reddit,
+                  scrapedo) — wrapped by Live layers in src/effect/
   tools/
-    registry.ts       registerAllTools() — wires
-    start-research.ts goal-tailored brief + static playbook
-
-
-
+    registry.ts       registerAllTools() — wires the five tools
+    start-research.ts goal-tailored brief + static playbook + planner
+                      circuit-breaker
+    search.ts         raw/smart search handlers (ctr ranking + optional
+                      llm classification)
+    scrape.ts         raw/smart scrape handlers (reddit api, jina reader,
+                      scrape.do proxy retry, optional kernel, optional
+                      llm extraction)
     mcp-helpers.ts    markdown response builders
-    utils.ts          shared formatters
   services/
-    llm-processor.ts
-
+    llm-processor.ts    llm extraction, classification, brief generation —
+                        primary + fallback model, always low reasoning,
+                        oversized inputs route straight to fallback
+    markdown-cleaner.ts html/markdown cleanup (readability + turndown)
   schemas/        zod v4 input validation per tool
-  utils/
-
-    errors.ts          structured error codes (retryable classification)
-    concurrency.ts     pMap/pMapSettled — thin wrappers over p-map@7
-    retry.ts           exponential backoff with jitter
-    url-aggregator.ts  CTR-weighted URL ranking for search consensus
-    response.ts        formatSuccess/formatError/formatBatchHeader
-    logger.ts          mcpLog() — stderr-only (MCP-safe)
+  utils/          errors, retry, ctr aggregator, response builders,
+                  logger (stderr-only, mcp-safe)
 ```
````
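The architecture tree mentions ctr ranking in the search handlers. A minimal sketch of CTR-weighted consensus ranking across parallel keyword searches; the weight table and function names are assumptions, not the package's actual values:

```typescript
// Sketch of CTR-weighted url consensus ranking: each url earns a
// position-based click-through weight from every keyword's result list
// that surfaced it, and the summed score orders the final list.
// CTR_BY_RANK values are illustrative assumptions.
const CTR_BY_RANK = [0.3, 0.16, 0.1, 0.07, 0.05]; // weight by result position

function rankUrls(resultsPerKeyword: string[][]): string[] {
  const scores = new Map<string, number>();
  for (const results of resultsPerKeyword) {
    results.forEach((url, i) => {
      const w = CTR_BY_RANK[i] ?? 0.02; // small long-tail weight
      scores.set(url, (scores.get(url) ?? 0) + w);
    });
  }
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([url]) => url);
}
```

A url that shows up across many keyword result lists outranks one that ranks first for a single keyword, which is the "consensus" part.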
````diff
@@ -160,5 +198,12 @@
 
-
+key patterns: capability detection at startup, description-led tool routing
+(no bootstrap gate), markdown-only mcp tool output for search/scrape,
+raw/smart tool split, tiered classified output in `smart-web-search`, reddit
+api routing in scrape tools, jina reader first for non-reddit urls,
+scrape.do proxy-mode retry through `X-Proxy-Url`, optional kernel
+browser-render fallback, bounded concurrency via `Effect.forEach`, ctr-based
+url ranking, tools never throw (always return `toolFailure`), and structured
+errors with retry classification.
 
 ## license
 
````
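The structured errors with retry classification that the README highlights usually pair with exponential backoff. A minimal sketch with full jitter, using assumed base and cap constants rather than the package's actual values:

```typescript
// Sketch of exponential backoff with full jitter for retryable provider
// errors. baseMs/capMs/maxAttempts are illustrative assumptions.
function backoffDelay(attempt: number, baseMs = 250, capMs = 10_000): number {
  const exp = Math.min(capMs, baseMs * 2 ** attempt); // capped exponential growth
  return Math.random() * exp;                         // full jitter: uniform in [0, exp)
}

async function withRetry<T>(fn: () => Promise<T>, maxAttempts = 3): Promise<T> {
  let lastErr: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastErr = err; // a real implementation would bail early on non-retryable errors
      await new Promise((resolve) => setTimeout(resolve, backoffDelay(attempt)));
    }
  }
  throw lastErr;
}
```

Full jitter spreads simultaneous retries across the whole window, which avoids synchronized retry bursts against a rate-limited provider.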
package/dist/mcp-use.json
CHANGED
package/package.json
CHANGED