mcp-researchpowerpack 6.0.18 → 7.0.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/README.md CHANGED
@@ -2,29 +2,46 @@
 
  # mcp-researchpowerpack
 
- HTTP MCP server for research. Five tools, orientation-first, built for agents that run multi-pass research loops.
+ http mcp server for research. five tools, orientation-first, built for agents
+ that run multi-pass research loops.
 
- Built on [mcp-use](https://github.com/nicepkg/mcp-use). No stdio, HTTP only.
+ ships on [`mcp-use`](https://github.com/nicepkg/mcp-use). every external call
+ flows through an effect ts service layer for typed concurrency, typed errors,
+ and timeouts that actually mean something. no stdio — http only.
 
  ## tools
 
  | tool | what it does | needs |
  |------|-------------|-------|
- | `start-research` | returns a goal-tailored brief: `primary_branch` (reddit / web / both), exact `first_call_sequence`, 25–50 keyword seeds, iteration hints, gaps to watch, stop criteria. Call FIRST every session. | `LLM_API_KEY` + `LLM_BASE_URL` + `LLM_MODEL` for non-degraded brief generation (optional) |
- | `raw-web-search` | parallel search, up to 50 `keywords` per call. Serper is primary; Jina Search is fallback when Serper is missing, fails, or yields empty query results. Returns the raw ranked markdown list directly. Use for broad discovery, audit trails, and Reddit permalink discovery via explicit `site:reddit.com/r/.../comments` probes. | `SERPER_API_KEY` or `JINA_API_KEY` |
- | `smart-web-search` | parallel search, up to 50 `keywords` per call, plus required `extract`. Serper/Jina provider order matches raw search. Always runs LLM classification and returns tiered markdown (HIGHLY_RELEVANT / MAYBE_RELEVANT / OTHER) + grounded synthesis + gaps + refine suggestions. Supports `scope: "web" \| "reddit" \| "both"`. | `SERPER_API_KEY` or `JINA_API_KEY` + LLM env |
- | `raw-scrape-links` | fetch URLs in parallel and return full markdown directly. Reddit post permalinks route through the Reddit API with threaded comments. Non-Reddit URLs use Jina Reader first, then Jina Reader through Scrape.do proxy mode, then optional Kernel browser rendering for web pages. | optional `REDDIT_CLIENT_ID` / `REDDIT_CLIENT_SECRET`, `SCRAPEDO_API_KEY`, `JINA_API_KEY`, `KERNEL_API_KEY` |
- | `smart-scrape-links` | same fetch stack as raw scrape, then always runs per-URL LLM extraction with required `extract`. Use for focused evidence packs with `## Source`, `## Matches`, `## Not found`, and `## Follow-up signals`. | raw scrape providers + LLM env |
+ | `start-research` | returns a goal-tailored brief: `primary_branch` (reddit / web / both), exact `first_call_sequence`, 25–50 keyword seeds, iteration hints, gaps to watch, stop criteria. call first every session. | `LLM_API_KEY` + `LLM_BASE_URL` + `LLM_MODEL` for the non-degraded brief (optional — server falls back to a static playbook) |
+ | `raw-web-search` | parallel search, up to 50 `keywords` per call. serper is primary; jina search is the fallback when serper is missing, fails, or returns empty. returns the raw ranked markdown list directly — no llm pass. use for broad discovery, audit trails, and reddit permalink probes via explicit `site:reddit.com/r/.../comments`. | `SERPER_API_KEY` or `JINA_API_KEY` |
+ | `smart-web-search` | parallel search, up to 50 `keywords` per call, plus required `extract`. same provider order as raw. always runs llm classification and returns tiered markdown (HIGHLY_RELEVANT / MAYBE_RELEVANT / OTHER) + grounded synthesis + gaps + refine suggestions. supports `scope: "web" \| "reddit" \| "both"`. | `SERPER_API_KEY` or `JINA_API_KEY` + llm env triple |
+ | `raw-scrape-links` | fetch urls in parallel, return full markdown directly. reddit post permalinks route through the reddit api with threaded comments. non-reddit urls hit jina reader first, then jina reader through scrape.do proxy mode, then optional kernel browser rendering for web pages. pdf / docx / pptx / xlsx urls go straight through jina reader. | optional `REDDIT_CLIENT_ID` / `REDDIT_CLIENT_SECRET`, `SCRAPEDO_API_KEY`, `JINA_API_KEY`, `KERNEL_API_KEY` |
+ | `smart-scrape-links` | same fetch stack as raw scrape, then per-url llm extraction with required `extract`. returns focused evidence packs with `## Source`, `## Matches`, `## Not found`, and `## Follow-up signals`. | raw scrape providers + llm env triple |
 
- Also exposes `/health` and `health://status`.
+ also exposes `/health` (simplified for proxies) and `health://status` (full
+ json: planner/extractor reachability, consecutive failure counters, uptime,
+ active sessions).
 
  ## workflow
 
- Call `start-research` once at the beginning of each session with your goal. The server returns a brief that tells the agent exactly which tool to call first (reddit-first for sentiment/migration, web-first for spec/bug/pricing, both when opinion-heavy AND needs official sources), what keyword seeds to fire, and when to stop.
+ call `start-research` once per session with your goal. the server returns a
+ brief that names the first tool to fire (reddit-first for sentiment/migration,
+ web-first for spec/bug/pricing, both when opinion-heavy and you also need
+ official sources), the keyword seeds to fan out, and stop criteria.
 
- For search fan-out, use bad → better rewrite thinking before calling `raw-web-search` or `smart-web-search`: turn broad phrases like `<feature> support`, `<product> pricing`, `<library> bug fix`, or `<tool> reviews` into source-aware probes such as `site:<official-docs-domain> "<feature>" "<platform-or-version>"`, `site:<vendor-domain> "<product>" pricing "enterprise" OR "free tier"`, `"<exact error text>" "<library-or-package>" "<version>" site:github.com`, or `site:reddit.com/r/<community>/comments "<tool>" "migration" OR "regression"`.
+ for search fan-out, think bad → better before calling `raw-web-search` or
+ `smart-web-search`. turn broad phrases like `<feature> support`, `<product>
+ pricing`, `<library> bug fix`, or `<tool> reviews` into source-aware probes
+ like:
 
- Pair the server with the [`run-research`](https://github.com/yigitkonur/skills-by-yigitkonur/tree/main/skills/run-research) skill for the full agentic playbook:
+ - `site:<official-docs-domain> "<feature>" "<platform-or-version>"`
+ - `site:<vendor-domain> "<product>" pricing "enterprise" OR "free tier"`
+ - `"<exact error text>" "<library-or-package>" "<version>" site:github.com`
+ - `site:reddit.com/r/<community>/comments "<tool>" "migration" OR "regression"`
+
+ pair the server with the [`run-research`](https://github.com/yigitkonur/skills-by-yigitkonur/tree/main/skills/run-research)
+ skill for the full agentic playbook:
 
  ```bash
  npx -y skills add -y -g https://github.com/yigitkonur/skills-by-yigitkonur --skill /run-research
@@ -42,7 +59,7 @@ cd mcp-researchpowerpack
  pnpm install && pnpm dev
  ```
 
- Connect your client to `http://localhost:3000/mcp`:
+ point your client at `http://localhost:3000/mcp`:
 
  ```json
  {
@@ -54,51 +71,70 @@ Connect your client to `http://localhost:3000/mcp`:
  }
  ```
 
+ or skip the install entirely and hit the hosted deployment at
+ `https://research.yigitkonur.com/mcp`.
+
  ## config
 
- Copy `.env.example`, set only what you need. Missing keys don't crash the server — they disable the affected capability with a clear error.
+ copy `.env.example`, set only what you need. missing keys don't crash the
+ server — they disable the affected capability with a clear error at call time.
 
  ### server
 
  | var | default | |
  |-----|---------|---|
- | `PORT` | `3000` | HTTP port |
- | `HOST` | `127.0.0.1` | bind address |
- | `ALLOWED_ORIGINS` | unset | comma-separated origins for host validation |
- | `MCP_URL` | unset | fallback public MCP URL used by the production origin-protection guard |
+ | `PORT` | `3000` | http port |
+ | `HOST` | `127.0.0.1` | bind address; cloud runtimes that set `PORT` auto-switch to `0.0.0.0` |
+ | `ALLOWED_ORIGINS` | unset | comma-separated origins for host validation / cors |
+ | `MCP_URL` | unset | fallback public mcp url used by the production origin-protection guard |
+ | `NODE_ENV` | unset | set to `production` to enforce `ALLOWED_ORIGINS` or `MCP_URL` (server exits otherwise) |
+ | `DEBUG` | unset | `1` or `2` to bump mcp-use debug verbosity |
 
  ### providers
 
  | var | enables |
  |-----|---------|
  | `SERPER_API_KEY` | primary raw/smart web search provider |
- | `SCRAPEDO_API_KEY` | Scrape.do proxy-mode retry for Jina Reader (`X-Proxy-Url`) |
+ | `SCRAPEDO_API_KEY` | scrape.do proxy-mode retry for jina reader (`X-Proxy-Url`) |
  | `REDDIT_CLIENT_ID` + `REDDIT_CLIENT_SECRET` | raw/smart scrape for reddit.com permalinks (threaded post + comments) |
- | `JINA_API_KEY` | Jina Search fallback and authenticated Jina Reader requests |
- | `KERNEL_API_KEY` | optional Kernel browser-render fallback after Jina direct + proxy fail |
- | `KERNEL_PROJECT` | optional Kernel project scoping header |
+ | `JINA_API_KEY` | jina search fallback and authenticated jina reader requests |
+ | `KERNEL_API_KEY` | optional kernel browser-render fallback after jina direct + proxy fail |
+ | `KERNEL_PROJECT` | optional kernel project scoping header for org-wide api keys |
  | `LLM_API_KEY` + `LLM_BASE_URL` + `LLM_MODEL` | goal-tailored brief, `smart-web-search`, `smart-scrape-links` |
 
- ### llm (AI extraction + classification)
+ ### llm
 
- Any OpenAI-compatible endpoint. `LLM_API_KEY`, `LLM_BASE_URL`, and `LLM_MODEL` are all required together. Reasoning effort is always `low`.
+ any openai-compatible endpoint. `LLM_API_KEY`, `LLM_BASE_URL`, and `LLM_MODEL`
+ are required together. reasoning effort is hardcoded to `low`.
 
  | var | required? | |
  |-----|-----------|---|
- | `LLM_API_KEY` | yes | API key for the endpoint |
- | `LLM_BASE_URL` | yes | base URL for the OpenAI-compatible endpoint (e.g. `https://server.up.railway.app/v1`) |
+ | `LLM_API_KEY` | yes | api key for the endpoint |
+ | `LLM_BASE_URL` | yes | base url for the openai-compatible endpoint (e.g. `https://server.up.railway.app/v1`) |
  | `LLM_MODEL` | yes | primary model (e.g. `gpt-5.4-mini`) |
- | `LLM_FALLBACK_MODEL` | no | model to use after primary exhausts all retries — gets 3 additional attempts (e.g. `gpt-5.4`) |
- | `LLM_CONCURRENCY` | no (default `50`) | parallel LLM calls |
+ | `LLM_FALLBACK_MODEL` | no | model to use after primary exhausts retries — gets 3 more attempts (e.g. `gpt-5.4`). also receives oversized inputs that exceed the primary's context window |
 
- ### evals
+ ### concurrency
 
- `pnpm test:evals` writes a JSON artifact to `test-results/eval-runs/<timestamp>.json`.
+ all optional. provider limits are clamped 1–200; kernel is clamped 1–20.
+
+ | var | default | controls |
+ |-----|---------|----------|
+ | `CONCURRENCY_SEARCH` | `50` | parallel serper / jina search queries |
+ | `CONCURRENCY_SCRAPER` | `50` | parallel scrape.do (proxy mode) requests |
+ | `CONCURRENCY_JINA_READER` | `50` | parallel jina reader fetches |
+ | `CONCURRENCY_REDDIT` | `50` | parallel reddit api fetches |
+ | `CONCURRENCY_KERNEL` | `3` | parallel kernel browser-render fallbacks |
+ | `LLM_CONCURRENCY` | `50` | parallel llm extraction / classification calls |
+
+ ### evals
 
- When an OpenAI API key is present, it performs a live Responses API + remote MCP evaluation.
- Without an API key, it exits successfully in explicit skip mode and records that skip in the artifact.
+ `pnpm test:evals` writes a json artifact to `test-results/eval-runs/<timestamp>.json`.
+ when an openai api key is present, it runs a live responses-api + remote-mcp
+ eval. without one, it exits successfully in explicit skip mode and records
+ the skip in the artifact.
 
- Useful env vars:
+ useful env vars:
 
  - `EVAL_MCP_URL`
  - `EVAL_MODEL`
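The concurrency section added in the hunk above says provider limits are clamped to 1–200 (kernel to 1–20), with per-provider defaults. A minimal sketch of that clamping behavior — `clampConcurrency` is a hypothetical helper name, not the package's actual code:

```typescript
// Hypothetical sketch of the env-var clamping the README's concurrency
// table describes: unset or unparsable values fall back to the default,
// and parsed values are clamped into [1, max].
function clampConcurrency(
  raw: string | undefined,
  fallback: number,
  max = 200,
): number {
  const n = Number.parseInt(raw ?? "", 10);
  if (Number.isNaN(n)) return fallback;
  return Math.min(Math.max(n, 1), max);
}

// e.g. clampConcurrency(process.env.CONCURRENCY_KERNEL, 3, 20)
// keeps kernel browser-render fan-out inside the tighter 1–20 band.
```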
@@ -117,13 +153,13 @@ pnpm inspect # mcp-use inspector
 
  ## deploy
 
- Deploy to Manufact Cloud via the `mcp-use` CLI (GitHub-backed):
+ deploy to manufact cloud via the `mcp-use` cli (github-backed):
 
  ```bash
  pnpm deploy # runs the package script: mcp-use deploy
  ```
 
- Or self-host anywhere with Node 20.19+ / 22.12+:
+ or self-host anywhere with node 20.19+ / 22.12+:
 
  ```bash
  HOST=0.0.0.0 ALLOWED_ORIGINS=https://app.example.com pnpm start
@@ -135,30 +171,39 @@ HOST=0.0.0.0 ALLOWED_ORIGINS=https://app.example.com pnpm start
  index.ts server startup, cors, health, shutdown
  src/
  config/ env parsing, capability detection, lazy proxy config
- clients/ provider API clients (serper, reddit, scrapedo, jina)
+ effect/ typed service tags + Live layers; runExternalEffect()
+ is the single boundary tool handlers cross to talk
+ to the outside world
+ clients/ provider api clients (serper, jina, kernel, reddit,
+ scrapedo) — wrapped by Live layers in src/effect/
  tools/
- registry.ts registerAllTools() — wires 5 tools
- start-research.ts goal-tailored brief + static playbook
- search.ts raw/smart search handlers (CTR ranking + optional LLM classification)
- scrape.ts raw/smart scrape handlers (Reddit API, Jina Reader, Scrape.do proxy,
- optional Kernel, optional LLM extraction)
+ registry.ts registerAllTools() — wires the five tools
+ start-research.ts goal-tailored brief + static playbook + planner
+ circuit-breaker
+ search.ts raw/smart search handlers (ctr ranking + optional
+ llm classification)
+ scrape.ts raw/smart scrape handlers (reddit api, jina reader,
+ scrape.do proxy retry, optional kernel, optional
+ llm extraction)
  mcp-helpers.ts markdown response builders
- utils.ts shared formatters
  services/
- llm-processor.ts AI extraction, classification, brief generation — primary + fallback model, always low reasoning
- markdown-cleaner.ts HTML/markdown cleanup
+ llm-processor.ts llm extraction, classification, brief generation —
+ primary + fallback model, always low reasoning,
+ oversized inputs route straight to fallback
+ markdown-cleaner.ts html/markdown cleanup (readability + turndown)
  schemas/ zod v4 input validation per tool
- utils/
- sanitize.ts strips URL/control-char injection from follow-up suggestions
- errors.ts structured error codes (retryable classification)
- concurrency.ts pMap/pMapSettled — thin wrappers over p-map@7
- retry.ts exponential backoff with jitter
- url-aggregator.ts CTR-weighted URL ranking for search consensus
- response.ts formatSuccess/formatError/formatBatchHeader
- logger.ts mcpLog() — stderr-only (MCP-safe)
+ utils/ errors, retry, ctr aggregator, response builders,
+ logger (stderr-only, mcp-safe)
  ```
 
- Key patterns: capability detection at startup, description-led tool routing (no bootstrap gate), markdown-only MCP tool output, raw/smart tool split, tiered classified output in `smart-web-search`, Reddit API routing in scrape tools, Jina Reader first for non-Reddit URLs, Scrape.do proxy-mode retry through `X-Proxy-Url`, optional Kernel browser-render fallback, bounded concurrency via `p-map`, CTR-based URL ranking, tools never throw (always return `toolFailure`), and structured errors with retry classification.
+ key patterns: capability detection at startup, description-led tool routing
+ (no bootstrap gate), markdown-only mcp tool output for search/scrape,
+ raw/smart tool split, tiered classified output in `smart-web-search`, reddit
+ api routing in scrape tools, jina reader first for non-reddit urls,
+ scrape.do proxy-mode retry through `X-Proxy-Url`, optional kernel
+ browser-render fallback, bounded concurrency via `Effect.forEach`, ctr-based
+ url ranking, tools never throw (always return `toolFailure`), and structured
+ errors with retry classification.
 
  ## license
 
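The "capability detection at startup" and serper-primary / jina-fallback behavior the new README describes can be sketched roughly like this — names here are hypothetical, not the package's actual code:

```typescript
// Hypothetical sketch of the capability-detection pattern the README
// describes: missing keys disable a capability with a clear error at
// call time instead of crashing the server.
type Env = Partial<Record<"SERPER_API_KEY" | "JINA_API_KEY", string>>;

type SearchCapability =
  | { ok: true; provider: "serper" | "jina" }
  | { ok: false; error: string };

function detectSearchCapability(env: Env): SearchCapability {
  // serper is the primary search provider; jina search is the fallback
  if (env.SERPER_API_KEY) return { ok: true, provider: "serper" };
  if (env.JINA_API_KEY) return { ok: true, provider: "jina" };
  return {
    ok: false,
    error: "search disabled: set SERPER_API_KEY or JINA_API_KEY",
  };
}
```

A tool handler built on this shape can return the structured error as a `toolFailure` payload rather than throwing, matching the "tools never throw" pattern listed above.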
package/dist/mcp-use.json CHANGED
@@ -1,7 +1,7 @@
  {
  "includeInspector": false,
- "buildTime": "2026-05-05T10:02:24.122Z",
- "buildId": "94ba0cb513ead054",
+ "buildTime": "2026-05-05T10:52:07.548Z",
+ "buildId": "db9ed26f9820f7b5",
  "entryPoint": "dist/index.js",
  "widgets": {}
  }
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "mcp-researchpowerpack",
- "version": "6.0.18",
+ "version": "7.0.0",
  "description": "HTTP-first MCP research server: start-research plus raw/smart search and scrape tools — built on mcp-use.",
  "type": "module",
  "main": "dist/index.js",
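The "bounded concurrency" pattern the README diff names (attributed there to `Effect.forEach` with a concurrency option) behaves like this plain-Promise analogue — a sketch of the shape, not the package's implementation:

```typescript
// Plain-Promise analogue of bounded fan-out: at most `limit` of the
// `fn` calls are in flight at once, and results keep input order.
// (The package itself routes this through Effect; this is just the idea.)
async function mapBounded<A, B>(
  items: readonly A[],
  limit: number,
  fn: (item: A) => Promise<B>,
): Promise<B[]> {
  const results: B[] = new Array(items.length);
  let next = 0;
  async function worker(): Promise<void> {
    // each worker pulls the next unclaimed index until items run out
    while (next < items.length) {
      const i = next++;
      results[i] = await fn(items[i]);
    }
  }
  await Promise.all(
    Array.from({ length: Math.min(limit, items.length) }, worker),
  );
  return results;
}
```

This is the shape behind the `CONCURRENCY_*` knobs: a 50-keyword search call fans out through a pool like this rather than firing 50 unbounded requests.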