mcp-researchpowerpack 5.0.1 → 6.0.1
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/README.md +25 -41
- package/dist/index.js +6 -52
- package/dist/index.js.map +2 -2
- package/dist/mcp-use.json +2 -2
- package/dist/src/config/index.js +19 -8
- package/dist/src/config/index.js.map +2 -2
- package/dist/src/prompts/deep-research.js +14 -10
- package/dist/src/prompts/deep-research.js.map +2 -2
- package/dist/src/prompts/reddit-sentiment.js +6 -6
- package/dist/src/prompts/reddit-sentiment.js.map +1 -1
- package/dist/src/schemas/scrape-links.js +1 -1
- package/dist/src/schemas/scrape-links.js.map +2 -2
- package/dist/src/schemas/start-research.js +2 -2
- package/dist/src/schemas/start-research.js.map +1 -1
- package/dist/src/services/llm-processor.js +124 -111
- package/dist/src/services/llm-processor.js.map +2 -2
- package/dist/src/tools/registry.js +0 -2
- package/dist/src/tools/registry.js.map +2 -2
- package/dist/src/tools/scrape.js +235 -127
- package/dist/src/tools/scrape.js.map +3 -3
- package/dist/src/tools/search.js +3 -23
- package/dist/src/tools/search.js.map +2 -2
- package/dist/src/tools/start-research.js +63 -72
- package/dist/src/tools/start-research.js.map +2 -2
- package/dist/src/tools/utils.js +0 -14
- package/dist/src/tools/utils.js.map +2 -2
- package/dist/src/utils/concurrency.js +11 -46
- package/dist/src/utils/concurrency.js.map +2 -2
- package/package.json +3 -3
- package/dist/src/schemas/reddit.js +0 -21
- package/dist/src/schemas/reddit.js.map +0 -7
- package/dist/src/services/workflow-state.js +0 -116
- package/dist/src/services/workflow-state.js.map +0 -7
- package/dist/src/tools/reddit.js +0 -277
- package/dist/src/tools/reddit.js.map +0 -7
- package/dist/src/utils/bootstrap-guard.js +0 -27
- package/dist/src/utils/bootstrap-guard.js.map +0 -7
- package/dist/src/utils/reddit-keyword-guard.js +0 -29
- package/dist/src/utils/reddit-keyword-guard.js.map +0 -7
- package/dist/src/utils/workflow-key.js +0 -14
- package/dist/src/utils/workflow-key.js.map +0 -7
package/README.md
CHANGED
@@ -1,6 +1,6 @@
 # mcp-researchpowerpack
 
-HTTP MCP server for research.
+HTTP MCP server for research. Three tools, orientation-first, built for agents that run multi-pass research loops.
 
 Built on [mcp-use](https://github.com/nicepkg/mcp-use). No stdio, HTTP only.
 
@@ -8,24 +8,15 @@ Built on [mcp-use](https://github.com/nicepkg/mcp-use). No stdio, HTTP only.
 
 | tool | what it does | needs |
 |------|-------------|-------|
-| `start-research` |
-| `web-search` | parallel Google search
-| `
-| `scrape-links` | scrape 1–100 URLs with optional LLM extraction. HTML chrome stripped server-side via Readability. Reddit URLs are rejected with `UNSUPPORTED_URL_TYPE` — use `get-reddit-post`. | `SCRAPEDO_API_KEY` |
+| `start-research` | returns a goal-tailored brief: `primary_branch` (reddit / web / both), exact `first_call_sequence`, 25–50 keyword seeds, iteration hints, gaps to watch, stop criteria. Call FIRST every session. | `LLM_API_KEY` (brief generation) |
+| `web-search` | parallel Google search, up to 50 queries per call, parallel-callable across turns. `scope: "web" \| "reddit" \| "both"` — reddit mode filters to post permalinks. Returns tiered markdown (HIGHLY_RELEVANT / MAYBE_RELEVANT / OTHER) + grounded synthesis + gaps + refine suggestions. | `SERPER_API_KEY` |
+| `scrape-links` | fetch URLs in parallel with per-URL LLM extraction. Auto-detects `reddit.com/r/.../comments/` permalinks and routes them through the Reddit API (threaded post + comments); every other URL flows through the HTTP scraper. Parallel-callable. | `SCRAPEDO_API_KEY` (+ `REDDIT_CLIENT_ID` / `REDDIT_CLIENT_SECRET` for reddit URLs) |
 
 Also exposes `/health`, `health://status`, and two optional MCP prompts: `deep-research` and `reddit-sentiment`.
 
 ## workflow
 
-Call `start-research` once at the beginning of each
-
-It returns the orientation brief that teaches how to route between:
-
-- `web-search` (with `scope: "web" | "reddit" | "both"`)
-- `get-reddit-post`
-- `scrape-links`
-
-All three gated tools advertise this precondition via `_meta.requires: ["start-research"]` in `tools/list`, so capability-aware clients can skip pre-bootstrap calls.
+Call `start-research` once at the beginning of each session with your goal. The server returns a brief that tells the agent exactly which tool to call first (reddit-first for sentiment/migration, web-first for spec/bug/pricing, both when opinion-heavy AND needs official sources), what keyword seeds to fire, and when to stop.
 
 Pair the server with the [`run-research`](https://github.com/yigitkonur/skills-by-yigitkonur/tree/main/skills/run-research) skill for the full agentic playbook:
 
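The orientation-first loop the new README describes is straightforward to drive from any MCP client. Below is a minimal sketch using the official TypeScript SDK against the HTTP endpoint; the tool names come from the diff above, but the argument shapes (`goal`, `queries`, `scope`, `urls`) are assumptions inferred from the tool descriptions, not a confirmed schema.

```ts
// Sketch: one research pass against the HTTP server (assumes default port 3000).
import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { StreamableHTTPClientTransport } from "@modelcontextprotocol/sdk/client/streamableHttp.js";

const client = new Client({ name: "research-demo", version: "0.0.1" });
await client.connect(new StreamableHTTPClientTransport(new URL("http://127.0.0.1:3000/mcp")));

// 1. Orientation brief first: primary_branch, first_call_sequence, keyword seeds.
const brief = await client.callTool({
  name: "start-research",
  arguments: { goal: "Is Bun production-ready for HTTP servers?" }, // `goal` is an assumed param name
});

// 2. Fire parallel searches along whichever branch the brief recommends.
const results = await client.callTool({
  name: "web-search",
  arguments: { queries: ["bun production http server", "bun vs node reliability"], scope: "both" },
});

// 3. Scrape the HIGHLY_RELEVANT tier; reddit permalinks route through the Reddit API automatically.
const pages = await client.callTool({
  name: "scrape-links",
  arguments: { urls: ["https://example.com/some-result"] },
});
```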
@@ -69,29 +60,27 @@ Copy `.env.example`, set only what you need. Missing keys don't crash the server
 | `HOST` | `127.0.0.1` | bind address |
 | `ALLOWED_ORIGINS` | unset | comma-separated origins for host validation |
 | `MCP_URL` | unset | fallback public MCP URL used by the production origin-protection guard |
-| `REDIS_URL` | unset | Redis-backed MCP sessions, distributed SSE, and workflow state |
 
 ### providers
 
 | var | enables |
 |-----|---------|
-| `SERPER_API_KEY` | `web-search` (
-| `
-| `
-| `LLM_API_KEY` | AI extraction, search classification,
+| `SERPER_API_KEY` | `web-search` (all scopes) |
+| `SCRAPEDO_API_KEY` | `scrape-links` for non-reddit URLs |
+| `REDDIT_CLIENT_ID` + `REDDIT_CLIENT_SECRET` | `scrape-links` for reddit.com permalinks (threaded post + comments) |
+| `LLM_API_KEY` | goal-tailored brief, AI extraction, search classification, raw-mode refine suggestions |
 
 ### llm (AI extraction + classification)
 
-Any OpenAI-compatible
+Any OpenAI-compatible endpoint — OpenRouter, OpenAI, Cerebras, Together, etc. All three vars are required together.
 
-| var |
-|
-| `LLM_API_KEY` |
-| `LLM_BASE_URL` | `https://openrouter.ai/api/v1`
-| `LLM_MODEL` |
-| `
-| `
-| `LLM_CONCURRENCY` | `50` | parallel LLM calls |
+| var | required? | |
+|-----|-----------|---|
+| `LLM_API_KEY` | yes (for LLM features) | API key for the provider |
+| `LLM_BASE_URL` | yes (for LLM features) | e.g. `https://openrouter.ai/api/v1`, `https://api.openai.com/v1`, `https://api.cerebras.ai/v1` |
+| `LLM_MODEL` | yes (for LLM features) | model identifier your endpoint accepts |
+| `LLM_REASONING` | no (default `none`) | `none` \| `low` \| `medium` \| `high` — opt-in per endpoint support |
+| `LLM_CONCURRENCY` | no (default `50`) | parallel LLM calls |
 
 ### evals
 
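The providers table maps one env var to one capability, and the README notes that missing keys disable features rather than crash the server. A minimal sketch of that detection pattern, mirroring the `getCapabilities()` logic visible in this package's `src/config/index.ts` (the interface name here is illustrative):

```ts
// Sketch of startup capability detection: each missing key disables a feature
// instead of failing the whole server. Variable names match the table above.
interface Capabilities {
  search: boolean;   // SERPER_API_KEY  -> web-search
  scraping: boolean; // SCRAPEDO_API_KEY -> scrape-links (non-reddit URLs)
  reddit: boolean;   // REDDIT_CLIENT_ID + REDDIT_CLIENT_SECRET -> reddit permalinks
  llm: boolean;      // LLM_API_KEY -> brief generation, extraction, classification
}

function detectCapabilities(env: NodeJS.ProcessEnv = process.env): Capabilities {
  return {
    search: Boolean(env.SERPER_API_KEY),
    scraping: Boolean(env.SCRAPEDO_API_KEY),
    reddit: Boolean(env.REDDIT_CLIENT_ID && env.REDDIT_CLIENT_SECRET),
    llm: Boolean(env.LLM_API_KEY),
  };
}
```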
@@ -139,32 +128,27 @@ src/
   clients/           provider API clients (serper, reddit, scrapedo)
   prompts/           optional MCP prompts for deep-research and reddit-sentiment
   tools/
-    registry.ts           registerAllTools() — wires tools
-    start-research.ts
-    search.ts             web-search handler
-
-    scrape.ts             scrape-links handler
+    registry.ts           registerAllTools() — wires 3 tools + 2 prompts
+    start-research.ts     goal-tailored brief + static playbook
+    search.ts             web-search handler (with CTR-weighted URL aggregation + LLM classification)
+    scrape.ts             scrape-links handler (reddit + web branches in parallel)
     mcp-helpers.ts        response builders (markdown + structured MCP output)
-    utils.ts              shared formatters
+    utils.ts              shared formatters
   services/
-
-    llm-processor.ts      AI extraction/synthesis via OpenAI-compatible API
+    llm-processor.ts      AI extraction, classification, brief generation via OpenAI-compatible API
     markdown-cleaner.ts   HTML/markdown cleanup
   schemas/           zod v4 input validation per tool
   utils/
-    workflow-key.ts         workflow identity derivation from user/session context
-    bootstrap-guard.ts      hard gate enforcing start-research first
-    reddit-keyword-guard.ts one-shot redirect for reddit-first web-search misuse
     sanitize.ts             strips URL/control-char injection from follow-up suggestions
     errors.ts               structured error codes (retryable classification)
-    concurrency.ts          pMap/pMapSettled —
+    concurrency.ts          pMap/pMapSettled — thin wrappers over p-map@7
     retry.ts                exponential backoff with jitter
     url-aggregator.ts       CTR-weighted URL ranking for search consensus
     response.ts             formatSuccess/formatError/formatBatchHeader
     logger.ts               mcpLog() — stderr-only (MCP-safe)
 ```
 
-Key patterns: capability detection at startup,
+Key patterns: capability detection at startup, description-led tool routing (no bootstrap gate), always-on structured MCP tool output, tiered classified output in `web-search`, parallel reddit + web branches in `scrape-links`, bounded concurrency via `p-map`, CTR-based URL ranking, tools never throw (always return `toolFailure`), and structured errors with retry classification.
 
 ## license
 
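The "Key patterns" line above says tools never throw and always return `toolFailure`. A minimal sketch of that wrapper pattern; the result type follows the MCP tool-result shape, but the `toolFailure` signature is an assumption (the real helper lives in this package's `utils/response.ts` / `tools/mcp-helpers.ts` and may differ):

```ts
// Sketch: a handler that converts every exception into a structured tool result
// instead of letting it escape as a protocol-level error.
type ToolResult = { content: { type: "text"; text: string }[]; isError?: boolean };

// Assumed shape of the package's toolFailure helper.
function toolFailure(code: string, message: string, retryable: boolean): ToolResult {
  return {
    content: [{ type: "text", text: `❌ ${code}: ${message} (retryable: ${retryable})` }],
    isError: true,
  };
}

async function safeHandler(run: () => Promise<ToolResult>): Promise<ToolResult> {
  try {
    return await run();
  } catch (err) {
    // Structured errors carry a retryable classification (see utils/errors.ts).
    const message = err instanceof Error ? err.message : String(err);
    return toolFailure("INTERNAL_ERROR", message, false);
  }
}
```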
package/dist/index.js
CHANGED
@@ -7,18 +7,10 @@ import {
   InMemorySessionStore,
   InMemoryStreamManager,
   MCPServer,
-  RedisSessionStore,
-  RedisStreamManager,
   object
 } from "mcp-use/server";
-import { createClient } from "redis";
 import { SERVER } from "./src/config/index.js";
 import { getLLMHealth } from "./src/services/llm-processor.js";
-import {
-  closeWorkflowStateStore,
-  configureWorkflowStateStore,
-  getWorkflowStateStore
-} from "./src/services/workflow-state.js";
 import { registerAllTools } from "./src/tools/registry.js";
 const DEFAULT_PORT = 3e3;
 const SHUTDOWN_TIMEOUT_MS = 1e4;
@@ -99,48 +91,17 @@ function resolveAllowedOrigins() {
   }
   return void 0;
 }
-async function buildSessionConfig() {
-  const redisUrl = process.env.REDIS_URL?.trim();
-  if (!redisUrl) {
-    return {
-      sessionConfig: {
-        sessionStore: new InMemorySessionStore(),
-        streamManager: new InMemoryStreamManager()
-      },
-      cleanupFns: []
-    };
-  }
-  const commandClient = createClient({ url: redisUrl });
-  const pubSubClient = commandClient.duplicate();
-  await Promise.all([commandClient.connect(), pubSubClient.connect()]);
+function buildSessionConfig() {
   return {
     sessionConfig: {
-      sessionStore: new RedisSessionStore({
-        client: commandClient
-      }),
-      streamManager: new RedisStreamManager({
-        client: commandClient,
-        pubSubClient
-      })
+      sessionStore: new InMemorySessionStore(),
+      streamManager: new InMemoryStreamManager()
     },
-    cleanupFns: [
-      async () => {
-        await pubSubClient.quit();
-      },
-      async () => {
-        await commandClient.quit();
-      }
-    ]
+    cleanupFns: []
   };
 }
 function buildHealthPayload(server, startedAt) {
   const llm = getLLMHealth();
-  let workflowStateSize = null;
-  try {
-    const store = getWorkflowStateStore();
-    workflowStateSize = store.size?.() ?? null;
-  } catch {
-  }
   return {
     status: "ok",
     name: SERVER.NAME,
@@ -148,9 +109,6 @@ function buildHealthPayload(server, startedAt) {
     transport: "http",
     uptime_seconds: Math.floor((Date.now() - startedAt) / 1e3),
     active_sessions: server.getActiveSessions().length,
-    workflow_state_size: workflowStateSize,
-    // LLM health — surfaced so capability-aware clients render degraded mode
-    // once at session start instead of parsing per-call footers.
     llm_planner_ok: llm.lastPlannerOk,
     llm_extractor_ok: llm.lastExtractorOk,
     llm_planner_checked_at: llm.lastPlannerCheckedAt,
@@ -169,8 +127,7 @@ async function main() {
   const port = resolvePort();
   const baseUrl = process.env.MCP_URL?.trim() || void 0;
   const allowedOrigins = resolveAllowedOrigins();
-  const { sessionConfig, cleanupFns } = await buildSessionConfig();
-  await configureWorkflowStateStore(process.env.REDIS_URL?.trim());
+  const { sessionConfig, cleanupFns } = buildSessionConfig();
   startupLogger.info(`Starting ${SERVER.NAME} v${SERVER.VERSION}`);
   startupLogger.info(`Binding HTTP server to ${host}:${port}`);
   if (allowedOrigins && allowedOrigins.length > 0) {
@@ -215,9 +172,7 @@ async function main() {
             planner_available: llm.plannerConfigured,
             extractor_available: llm.extractorConfigured,
             planner_model: process.env.LLM_MODEL ?? process.env.LLM_EXTRACTION_MODEL ?? null,
-            extractor_model: process.env.LLM_MODEL ?? process.env.LLM_EXTRACTION_MODEL ?? null,
-            // Tools that require start-research to bootstrap the session first.
-            requires_bootstrap: ["web-search", "scrape-links", "get-reddit-post"]
+            extractor_model: process.env.LLM_MODEL ?? process.env.LLM_EXTRACTION_MODEL ?? null
           }
         }
       });
@@ -261,7 +216,6 @@ async function main() {
     try {
       startupLogger.warn(`Shutdown signal received: ${signal}`);
      await server.close();
-      await closeWorkflowStateStore();
      for (const cleanupFn of cleanupFns) {
        await cleanupFn();
      }
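With the Redis path gone, `/health` no longer reports `workflow_state_size`, but the LLM health fields built by `buildHealthPayload` above remain. A quick probe sketch — the field names are taken from the diff, while the value types and the default host/port are assumptions:

```ts
// Sketch: poll /health once at session start and branch into degraded mode
// up front, instead of parsing per-call footers.
const res = await fetch("http://127.0.0.1:3000/health");
const health = (await res.json()) as {
  status: string;
  llm_planner_ok: boolean | null;      // assumed nullable before first check
  llm_extractor_ok: boolean | null;
  planner_configured: boolean;
  extractor_configured: boolean;
};

if (!health.extractor_configured || health.llm_extractor_ok === false) {
  // Per the config messages, scraping still works without LLM keys,
  // just without intelligent content filtering.
  console.warn("LLM extraction unavailable — expect raw content from scrape-links.");
}
```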
package/dist/index.js.map
CHANGED
Generated source map. The embedded `sourcesContent` (the original `index.ts`) and `mappings` strings are regenerated to match the `index.js` changes above: the Redis session/stream stores, the `redis` client import, the workflow-state store, and the `requires_bootstrap` capability advertisement are removed; `version`, `sources`, and `names` are unchanged.
package/dist/mcp-use.json
CHANGED
package/dist/src/config/index.js
CHANGED
@@ -93,11 +93,11 @@ const CTR_WEIGHTS = {
   10: 12.56
 };
 function parseLlmReasoningEffort(value) {
-  if (value === "none") return "none";
-  if (
+  if (!value || value === "none") return "none";
+  if (VALID_REASONING_EFFORTS.includes(value)) {
     return value;
   }
-  return "
+  return "none";
 }
 function envWithFallback(...names) {
   for (const name of names) {
@@ -109,12 +109,23 @@ function envWithFallback(...names) {
 let cachedLlmExtraction = null;
 function getLlmExtraction() {
   if (cachedLlmExtraction) return cachedLlmExtraction;
+  const apiKey = envWithFallback("LLM_API_KEY", "LLM_EXTRACTION_API_KEY", "OPENROUTER_API_KEY") || "";
+  const baseUrl = envWithFallback("LLM_BASE_URL", "LLM_EXTRACTION_BASE_URL", "OPENROUTER_BASE_URL");
+  const model = envWithFallback("LLM_MODEL", "LLM_EXTRACTION_MODEL");
+  if (apiKey && !baseUrl) {
+    throw new Error(
+      "LLM_BASE_URL is required when LLM_API_KEY is set. Set LLM_BASE_URL to your OpenAI-compatible endpoint (e.g. https://openrouter.ai/api/v1, https://api.openai.com/v1, https://api.cerebras.ai/v1)."
+    );
+  }
+  if (apiKey && !model) {
+    throw new Error(
+      "LLM_MODEL is required when LLM_API_KEY is set. Set LLM_MODEL to a model identifier your endpoint accepts (e.g. openai/gpt-4.1-mini, gpt-4o, llama-3.3-70b)."
+    );
+  }
   cachedLlmExtraction = {
-    API_KEY:
-    BASE_URL:
-    MODEL:
-    FALLBACK_MODEL: envWithFallback("LLM_FALLBACK_MODEL", "RESEARCH_FALLBACK_MODEL") || "google/gemini-2.5-flash",
-    MAX_TOKENS: safeParseInt(envWithFallback("LLM_MAX_TOKENS", "LLM_EXTRACTION_MAX_TOKENS"), 8e3, 1e3, 32e3),
+    API_KEY: apiKey,
+    BASE_URL: baseUrl || "",
+    MODEL: model || "",
     REASONING_EFFORT: parseLlmReasoningEffort(envWithFallback("LLM_REASONING", "LLM_EXTRACTION_REASONING"))
   };
   return cachedLlmExtraction;
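The net effect of this hunk: partial LLM configuration now fails fast instead of silently falling back to defaults (the 5.x README showed `https://openrouter.ai/api/v1` as the `LLM_BASE_URL` default), and invalid `LLM_REASONING` values now resolve to `none`. A small sketch of the new contract; the deep `dist/` import path is an assumption for illustration, not a documented entry point:

```ts
// Sketch of the fail-fast behavior of getLlmExtraction() after this change.
// LLM_EXTRACTION is a lazy Proxy, so the check fires on first property access.
import { LLM_EXTRACTION } from "mcp-researchpowerpack/dist/src/config/index.js"; // assumed path

process.env.LLM_API_KEY = "sk-test";
delete process.env.LLM_BASE_URL; // partial config: key set, endpoint missing

try {
  void LLM_EXTRACTION.API_KEY;
} catch (err) {
  // "LLM_BASE_URL is required when LLM_API_KEY is set. ..."
  console.error((err as Error).message);
}
```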
@@ -1,7 +1,7 @@
|
|
|
1
1
|
{
|
|
2
2
|
"version": 3,
|
|
3
3
|
"sources": ["../../../src/config/index.ts"],
|
|
4
|
-
"sourcesContent": ["/**\n * Consolidated configuration\n * All environment variables, constants, and LLM config in one place\n */\n\nimport { Logger } from 'mcp-use';\n\nimport { VERSION, PACKAGE_NAME, PACKAGE_DESCRIPTION } from '../version.js';\n\n// ============================================================================\n// Safe Integer Parsing Helper\n// ============================================================================\n\n/**\n * Safely parse an integer from environment variable with bounds checking\n */\nfunction safeParseInt(\n value: string | undefined,\n defaultVal: number,\n min: number,\n max: number\n): number {\n const logger = Logger.get('config');\n\n if (!value) {\n return defaultVal;\n }\n\n const parsed = parseInt(value, 10);\n\n if (isNaN(parsed)) {\n logger.warn(`Invalid number \"${value}\", using default ${defaultVal}`);\n return defaultVal;\n }\n\n if (parsed < min) {\n logger.warn(`Value ${parsed} below minimum ${min}, clamping to ${min}`);\n return min;\n }\n\n if (parsed > max) {\n logger.warn(`Value ${parsed} above maximum ${max}, clamping to ${max}`);\n return max;\n }\n\n return parsed;\n}\n\n// ============================================================================\n// Reasoning Effort Validation\n// ============================================================================\n\nconst VALID_REASONING_EFFORTS = ['low', 'medium', 'high'] as const;\ntype ReasoningEffort = typeof VALID_REASONING_EFFORTS[number];\n\n// ============================================================================\n// Environment Parsing\n// ============================================================================\n\ninterface EnvConfig {\n SCRAPER_API_KEY: string;\n SEARCH_API_KEY: string | undefined;\n REDDIT_CLIENT_ID: string | undefined;\n REDDIT_CLIENT_SECRET: string | undefined;\n}\n\nlet cachedEnv: EnvConfig | null = null;\n\nexport function parseEnv(): EnvConfig {\n if (cachedEnv) return cachedEnv;\n cachedEnv = {\n SCRAPER_API_KEY: process.env.SCRAPEDO_API_KEY || '',\n SEARCH_API_KEY: process.env.SERPER_API_KEY || undefined,\n REDDIT_CLIENT_ID: process.env.REDDIT_CLIENT_ID || undefined,\n REDDIT_CLIENT_SECRET: process.env.REDDIT_CLIENT_SECRET || undefined,\n };\n return cachedEnv;\n}\n\n// ============================================================================\n// MCP Server Configuration\n// ============================================================================\n\nexport const SERVER = {\n NAME: PACKAGE_NAME,\n VERSION: VERSION,\n DESCRIPTION: PACKAGE_DESCRIPTION,\n} as const;\n\n// ============================================================================\n// Capability Detection (which features are available based on ENV)\n// ============================================================================\n\nexport interface Capabilities {\n reddit: boolean; // REDDIT_CLIENT_ID + REDDIT_CLIENT_SECRET\n search: boolean; // SERPER_API_KEY\n scraping: boolean; // SCRAPEDO_API_KEY\n llmExtraction: boolean; // LLM_API_KEY (or legacy: LLM_EXTRACTION_API_KEY, OPENROUTER_API_KEY)\n}\n\nexport function getCapabilities(): Capabilities {\n const env = parseEnv();\n return {\n reddit: !!(env.REDDIT_CLIENT_ID && env.REDDIT_CLIENT_SECRET),\n search: !!env.SEARCH_API_KEY,\n scraping: !!env.SCRAPER_API_KEY,\n llmExtraction: !!LLM_EXTRACTION.API_KEY,\n };\n}\n\nexport function getMissingEnvMessage(capability: keyof Capabilities): string {\n const messages: Record<keyof Capabilities, string> = {\n reddit: '\u274C **Reddit tools unavailable.** Set 
`REDDIT_CLIENT_ID` and `REDDIT_CLIENT_SECRET` to enable `get-reddit-post`.\\n\\n\uD83D\uDC49 Create a Reddit app at: https://www.reddit.com/prefs/apps (select \"script\" type)',\n search: '\u274C **Search unavailable.** Set `SERPER_API_KEY` to enable `web-search` (including `scope: \"reddit\"`).\\n\\n\uD83D\uDC49 Get your free API key at: https://serper.dev (2,500 free queries)',\n scraping: '\u274C **Web scraping unavailable.** Set `SCRAPEDO_API_KEY` to enable `scrape-links`.\\n\\n\uD83D\uDC49 Sign up at: https://scrape.do (1,000 free credits)',\n llmExtraction: '\u26A0\uFE0F **AI extraction disabled.** Set `LLM_API_KEY` to enable AI-powered content extraction and search classification.\\n\\nScraping will work but without intelligent content filtering.',\n };\n return messages[capability];\n}\n\n// ============================================================================\n// Concurrency Limits\n// ============================================================================\n\nexport const CONCURRENCY = {\n SEARCH: safeParseInt(process.env.CONCURRENCY_SEARCH, 50, 1, 200),\n SCRAPER: safeParseInt(process.env.CONCURRENCY_SCRAPER, 50, 1, 200),\n REDDIT: safeParseInt(process.env.CONCURRENCY_REDDIT, 50, 1, 200),\n LLM_EXTRACTION: safeParseInt(\n process.env.LLM_CONCURRENCY || process.env.LLM_EXTRACTION_CONCURRENCY,\n 50, 1, 200,\n ),\n} as const;\n\nexport const SCRAPER = {\n BATCH_SIZE: 30,\n EXTRACTION_PREFIX: 'Extract from document only \u2014 never hallucinate or add external knowledge.',\n EXTRACTION_SUFFIX: 'First line = content, not preamble. No confirmation messages.',\n} as const;\n\n// ============================================================================\n// Reddit Configuration\n// ============================================================================\n\nexport const REDDIT = {\n BATCH_SIZE: 10,\n MAX_WORDS_PER_POST: 50_000,\n MAX_WORDS_TOTAL: 500_000,\n MIN_POSTS: 1,\n MAX_POSTS: 50,\n RETRY_COUNT: 5,\n RETRY_DELAYS: [2000, 4000, 8000, 16000, 32000] as const,\n} as const;\n\n// ============================================================================\n// CTR Weights for URL Ranking (inspired from CTR research)\n// ============================================================================\n\nexport const CTR_WEIGHTS: Record<number, number> = {\n 1: 100.00,\n 2: 60.00,\n 3: 48.89,\n 4: 33.33,\n 5: 28.89,\n 6: 26.44,\n 7: 24.44,\n 8: 17.78,\n 9: 13.33,\n 10: 12.56,\n} as const;\n\n// ============================================================================\n// LLM Configuration\n//\n// Env var naming: LLM_* (canonical) with backwards-compatible fallbacks.\n// Fallback chain per variable:\n// LLM_API_KEY \u2190 LLM_EXTRACTION_API_KEY \u2190 OPENROUTER_API_KEY\n// LLM_BASE_URL \u2190 LLM_EXTRACTION_BASE_URL \u2190 OPENROUTER_BASE_URL
|
|
5
|
-
"mappings": "AAKA,SAAS,cAAc;AAEvB,SAAS,SAAS,cAAc,2BAA2B;AAS3D,SAAS,aACP,OACA,YACA,KACA,KACQ;AACR,QAAM,SAAS,OAAO,IAAI,QAAQ;AAElC,MAAI,CAAC,OAAO;AACV,WAAO;AAAA,EACT;AAEA,QAAM,SAAS,SAAS,OAAO,EAAE;AAEjC,MAAI,MAAM,MAAM,GAAG;AACjB,WAAO,KAAK,mBAAmB,KAAK,oBAAoB,UAAU,EAAE;AACpE,WAAO;AAAA,EACT;AAEA,MAAI,SAAS,KAAK;AAChB,WAAO,KAAK,SAAS,MAAM,kBAAkB,GAAG,iBAAiB,GAAG,EAAE;AACtE,WAAO;AAAA,EACT;AAEA,MAAI,SAAS,KAAK;AAChB,WAAO,KAAK,SAAS,MAAM,kBAAkB,GAAG,iBAAiB,GAAG,EAAE;AACtE,WAAO;AAAA,EACT;AAEA,SAAO;AACT;AAMA,MAAM,0BAA0B,CAAC,OAAO,UAAU,MAAM;AAcxD,IAAI,YAA8B;AAE3B,SAAS,WAAsB;AACpC,MAAI,UAAW,QAAO;AACtB,cAAY;AAAA,IACV,iBAAiB,QAAQ,IAAI,oBAAoB;AAAA,IACjD,gBAAgB,QAAQ,IAAI,kBAAkB;AAAA,IAC9C,kBAAkB,QAAQ,IAAI,oBAAoB;AAAA,IAClD,sBAAsB,QAAQ,IAAI,wBAAwB;AAAA,EAC5D;AACA,SAAO;AACT;AAMO,MAAM,SAAS;AAAA,EACpB,MAAM;AAAA,EACN;AAAA,EACA,aAAa;AACf;AAaO,SAAS,kBAAgC;AAC9C,QAAM,MAAM,SAAS;AACrB,SAAO;AAAA,IACL,QAAQ,CAAC,EAAE,IAAI,oBAAoB,IAAI;AAAA,IACvC,QAAQ,CAAC,CAAC,IAAI;AAAA,IACd,UAAU,CAAC,CAAC,IAAI;AAAA,IAChB,eAAe,CAAC,CAAC,eAAe;AAAA,EAClC;AACF;AAEO,SAAS,qBAAqB,YAAwC;AAC3E,QAAM,WAA+C;AAAA,IACnD,QAAQ;AAAA,IACR,QAAQ;AAAA,IACR,UAAU;AAAA,IACV,eAAe;AAAA,EACjB;AACA,SAAO,SAAS,UAAU;AAC5B;AAMO,MAAM,cAAc;AAAA,EACzB,QAAQ,aAAa,QAAQ,IAAI,oBAAoB,IAAI,GAAG,GAAG;AAAA,EAC/D,SAAS,aAAa,QAAQ,IAAI,qBAAqB,IAAI,GAAG,GAAG;AAAA,EACjE,QAAQ,aAAa,QAAQ,IAAI,oBAAoB,IAAI,GAAG,GAAG;AAAA,EAC/D,gBAAgB;AAAA,IACd,QAAQ,IAAI,mBAAmB,QAAQ,IAAI;AAAA,IAC3C;AAAA,IAAI;AAAA,IAAG;AAAA,EACT;AACF;AAEO,MAAM,UAAU;AAAA,EACrB,YAAY;AAAA,EACZ,mBAAmB;AAAA,EACnB,mBAAmB;AACrB;AAMO,MAAM,SAAS;AAAA,EACpB,YAAY;AAAA,EACZ,oBAAoB;AAAA,EACpB,iBAAiB;AAAA,EACjB,WAAW;AAAA,EACX,WAAW;AAAA,EACX,aAAa;AAAA,EACb,cAAc,CAAC,KAAM,KAAM,KAAM,MAAO,IAAK;AAC/C;AAMO,MAAM,cAAsC;AAAA,EACjD,GAAG;AAAA,EACH,GAAG;AAAA,EACH,GAAG;AAAA,EACH,GAAG;AAAA,EACH,GAAG;AAAA,EACH,GAAG;AAAA,EACH,GAAG;AAAA,EACH,GAAG;AAAA,EACH,GAAG;AAAA,EACH,IAAI;AACN;AAiBA,SAAS,wBAAwB,OAA+C;AAC9E,MAAI,UAAU,OAAQ,QAAO;
|
|
4
|
+
"sourcesContent": ["/**\n * Consolidated configuration\n * All environment variables, constants, and LLM config in one place\n */\n\nimport { Logger } from 'mcp-use';\n\nimport { VERSION, PACKAGE_NAME, PACKAGE_DESCRIPTION } from '../version.js';\n\n// ============================================================================\n// Safe Integer Parsing Helper\n// ============================================================================\n\n/**\n * Safely parse an integer from environment variable with bounds checking\n */\nfunction safeParseInt(\n value: string | undefined,\n defaultVal: number,\n min: number,\n max: number\n): number {\n const logger = Logger.get('config');\n\n if (!value) {\n return defaultVal;\n }\n\n const parsed = parseInt(value, 10);\n\n if (isNaN(parsed)) {\n logger.warn(`Invalid number \"${value}\", using default ${defaultVal}`);\n return defaultVal;\n }\n\n if (parsed < min) {\n logger.warn(`Value ${parsed} below minimum ${min}, clamping to ${min}`);\n return min;\n }\n\n if (parsed > max) {\n logger.warn(`Value ${parsed} above maximum ${max}, clamping to ${max}`);\n return max;\n }\n\n return parsed;\n}\n\n// ============================================================================\n// Reasoning Effort Validation\n// ============================================================================\n\nconst VALID_REASONING_EFFORTS = ['low', 'medium', 'high'] as const;\ntype ReasoningEffort = typeof VALID_REASONING_EFFORTS[number];\n\n// ============================================================================\n// Environment Parsing\n// ============================================================================\n\ninterface EnvConfig {\n SCRAPER_API_KEY: string;\n SEARCH_API_KEY: string | undefined;\n REDDIT_CLIENT_ID: string | undefined;\n REDDIT_CLIENT_SECRET: string | undefined;\n}\n\nlet cachedEnv: EnvConfig | null = null;\n\nexport function parseEnv(): EnvConfig {\n if (cachedEnv) return cachedEnv;\n cachedEnv = {\n SCRAPER_API_KEY: process.env.SCRAPEDO_API_KEY || '',\n SEARCH_API_KEY: process.env.SERPER_API_KEY || undefined,\n REDDIT_CLIENT_ID: process.env.REDDIT_CLIENT_ID || undefined,\n REDDIT_CLIENT_SECRET: process.env.REDDIT_CLIENT_SECRET || undefined,\n };\n return cachedEnv;\n}\n\n// ============================================================================\n// MCP Server Configuration\n// ============================================================================\n\nexport const SERVER = {\n NAME: PACKAGE_NAME,\n VERSION: VERSION,\n DESCRIPTION: PACKAGE_DESCRIPTION,\n} as const;\n\n// ============================================================================\n// Capability Detection (which features are available based on ENV)\n// ============================================================================\n\nexport interface Capabilities {\n reddit: boolean; // REDDIT_CLIENT_ID + REDDIT_CLIENT_SECRET\n search: boolean; // SERPER_API_KEY\n scraping: boolean; // SCRAPEDO_API_KEY\n llmExtraction: boolean; // LLM_API_KEY (or legacy: LLM_EXTRACTION_API_KEY, OPENROUTER_API_KEY)\n}\n\nexport function getCapabilities(): Capabilities {\n const env = parseEnv();\n return {\n reddit: !!(env.REDDIT_CLIENT_ID && env.REDDIT_CLIENT_SECRET),\n search: !!env.SEARCH_API_KEY,\n scraping: !!env.SCRAPER_API_KEY,\n llmExtraction: !!LLM_EXTRACTION.API_KEY,\n };\n}\n\nexport function getMissingEnvMessage(capability: keyof Capabilities): string {\n const messages: Record<keyof Capabilities, string> = {\n reddit: '\u274C **Reddit tools unavailable.** Set 
`REDDIT_CLIENT_ID` and `REDDIT_CLIENT_SECRET` to enable `get-reddit-post`.\\n\\n\uD83D\uDC49 Create a Reddit app at: https://www.reddit.com/prefs/apps (select \"script\" type)',\n search: '\u274C **Search unavailable.** Set `SERPER_API_KEY` to enable `web-search` (including `scope: \"reddit\"`).\\n\\n\uD83D\uDC49 Get your free API key at: https://serper.dev (2,500 free queries)',\n scraping: '\u274C **Web scraping unavailable.** Set `SCRAPEDO_API_KEY` to enable `scrape-links`.\\n\\n\uD83D\uDC49 Sign up at: https://scrape.do (1,000 free credits)',\n llmExtraction: '\u26A0\uFE0F **AI extraction disabled.** Set `LLM_API_KEY` to enable AI-powered content extraction and search classification.\\n\\nScraping will work but without intelligent content filtering.',\n };\n return messages[capability];\n}\n\n// ============================================================================\n// Concurrency Limits\n// ============================================================================\n\nexport const CONCURRENCY = {\n SEARCH: safeParseInt(process.env.CONCURRENCY_SEARCH, 50, 1, 200),\n SCRAPER: safeParseInt(process.env.CONCURRENCY_SCRAPER, 50, 1, 200),\n REDDIT: safeParseInt(process.env.CONCURRENCY_REDDIT, 50, 1, 200),\n LLM_EXTRACTION: safeParseInt(\n process.env.LLM_CONCURRENCY || process.env.LLM_EXTRACTION_CONCURRENCY,\n 50, 1, 200,\n ),\n} as const;\n\nexport const SCRAPER = {\n BATCH_SIZE: 30,\n EXTRACTION_PREFIX: 'Extract from document only \u2014 never hallucinate or add external knowledge.',\n EXTRACTION_SUFFIX: 'First line = content, not preamble. No confirmation messages.',\n} as const;\n\n// ============================================================================\n// Reddit Configuration\n// ============================================================================\n\nexport const REDDIT = {\n BATCH_SIZE: 10,\n MAX_WORDS_PER_POST: 50_000,\n MAX_WORDS_TOTAL: 500_000,\n MIN_POSTS: 1,\n MAX_POSTS: 50,\n RETRY_COUNT: 5,\n RETRY_DELAYS: [2000, 4000, 8000, 16000, 32000] as const,\n} as const;\n\n// ============================================================================\n// CTR Weights for URL Ranking (inspired from CTR research)\n// ============================================================================\n\nexport const CTR_WEIGHTS: Record<number, number> = {\n 1: 100.00,\n 2: 60.00,\n 3: 48.89,\n 4: 33.33,\n 5: 28.89,\n 6: 26.44,\n 7: 24.44,\n 8: 17.78,\n 9: 13.33,\n 10: 12.56,\n} as const;\n\n// ============================================================================\n// LLM Configuration\n//\n// Env var naming: LLM_* (canonical) with backwards-compatible fallbacks.\n// LLM_API_KEY + LLM_BASE_URL + LLM_MODEL are all required together \u2014 no defaults.\n// Fallback chain per variable:\n// LLM_API_KEY \u2190 LLM_EXTRACTION_API_KEY \u2190 OPENROUTER_API_KEY\n// LLM_BASE_URL \u2190 LLM_EXTRACTION_BASE_URL \u2190 OPENROUTER_BASE_URL\n// LLM_MODEL \u2190 LLM_EXTRACTION_MODEL\n// LLM_REASONING \u2190 LLM_EXTRACTION_REASONING \u2190 'none'\n// LLM_CONCURRENCY \u2190 LLM_EXTRACTION_CONCURRENCY \u2190 50\n// ============================================================================\n\ntype LlmReasoningEffort = ReasoningEffort | 'none';\n\nfunction parseLlmReasoningEffort(value: string | undefined): LlmReasoningEffort {\n if (!value || value === 'none') return 'none';\n if (VALID_REASONING_EFFORTS.includes(value as ReasoningEffort)) {\n return value as ReasoningEffort;\n }\n return 'none';\n}\n\ninterface LlmExtractionConfig {\n readonly MODEL: string;\n readonly BASE_URL: 
string;\n readonly API_KEY: string;\n readonly REASONING_EFFORT: LlmReasoningEffort;\n}\n\n/** Read an env var with a backwards-compatible fallback chain */\nfunction envWithFallback(...names: string[]): string | undefined {\n for (const name of names) {\n const val = process.env[name]?.trim();\n if (val) return val;\n }\n return undefined;\n}\n\nlet cachedLlmExtraction: LlmExtractionConfig | null = null;\n\nfunction getLlmExtraction(): LlmExtractionConfig {\n if (cachedLlmExtraction) return cachedLlmExtraction;\n\n const apiKey = envWithFallback('LLM_API_KEY', 'LLM_EXTRACTION_API_KEY', 'OPENROUTER_API_KEY') || '';\n const baseUrl = envWithFallback('LLM_BASE_URL', 'LLM_EXTRACTION_BASE_URL', 'OPENROUTER_BASE_URL');\n const model = envWithFallback('LLM_MODEL', 'LLM_EXTRACTION_MODEL');\n\n if (apiKey && !baseUrl) {\n throw new Error(\n 'LLM_BASE_URL is required when LLM_API_KEY is set. ' +\n 'Set LLM_BASE_URL to your OpenAI-compatible endpoint (e.g. https://openrouter.ai/api/v1, https://api.openai.com/v1, https://api.cerebras.ai/v1).',\n );\n }\n if (apiKey && !model) {\n throw new Error(\n 'LLM_MODEL is required when LLM_API_KEY is set. ' +\n 'Set LLM_MODEL to a model identifier your endpoint accepts (e.g. openai/gpt-4.1-mini, gpt-4o, llama-3.3-70b).',\n );\n }\n\n cachedLlmExtraction = {\n API_KEY: apiKey,\n BASE_URL: baseUrl || '',\n MODEL: model || '',\n REASONING_EFFORT: parseLlmReasoningEffort(envWithFallback('LLM_REASONING', 'LLM_EXTRACTION_REASONING')),\n };\n return cachedLlmExtraction;\n}\n\nexport const LLM_EXTRACTION: LlmExtractionConfig = new Proxy({} as LlmExtractionConfig, {\n get(_target, prop: string) {\n return getLlmExtraction()[prop as keyof LlmExtractionConfig];\n },\n});\n"],
+
"mappings": "AAKA,SAAS,cAAc;AAEvB,SAAS,SAAS,cAAc,2BAA2B;AAS3D,SAAS,aACP,OACA,YACA,KACA,KACQ;AACR,QAAM,SAAS,OAAO,IAAI,QAAQ;AAElC,MAAI,CAAC,OAAO;AACV,WAAO;AAAA,EACT;AAEA,QAAM,SAAS,SAAS,OAAO,EAAE;AAEjC,MAAI,MAAM,MAAM,GAAG;AACjB,WAAO,KAAK,mBAAmB,KAAK,oBAAoB,UAAU,EAAE;AACpE,WAAO;AAAA,EACT;AAEA,MAAI,SAAS,KAAK;AAChB,WAAO,KAAK,SAAS,MAAM,kBAAkB,GAAG,iBAAiB,GAAG,EAAE;AACtE,WAAO;AAAA,EACT;AAEA,MAAI,SAAS,KAAK;AAChB,WAAO,KAAK,SAAS,MAAM,kBAAkB,GAAG,iBAAiB,GAAG,EAAE;AACtE,WAAO;AAAA,EACT;AAEA,SAAO;AACT;AAMA,MAAM,0BAA0B,CAAC,OAAO,UAAU,MAAM;AAcxD,IAAI,YAA8B;AAE3B,SAAS,WAAsB;AACpC,MAAI,UAAW,QAAO;AACtB,cAAY;AAAA,IACV,iBAAiB,QAAQ,IAAI,oBAAoB;AAAA,IACjD,gBAAgB,QAAQ,IAAI,kBAAkB;AAAA,IAC9C,kBAAkB,QAAQ,IAAI,oBAAoB;AAAA,IAClD,sBAAsB,QAAQ,IAAI,wBAAwB;AAAA,EAC5D;AACA,SAAO;AACT;AAMO,MAAM,SAAS;AAAA,EACpB,MAAM;AAAA,EACN;AAAA,EACA,aAAa;AACf;AAaO,SAAS,kBAAgC;AAC9C,QAAM,MAAM,SAAS;AACrB,SAAO;AAAA,IACL,QAAQ,CAAC,EAAE,IAAI,oBAAoB,IAAI;AAAA,IACvC,QAAQ,CAAC,CAAC,IAAI;AAAA,IACd,UAAU,CAAC,CAAC,IAAI;AAAA,IAChB,eAAe,CAAC,CAAC,eAAe;AAAA,EAClC;AACF;AAEO,SAAS,qBAAqB,YAAwC;AAC3E,QAAM,WAA+C;AAAA,IACnD,QAAQ;AAAA,IACR,QAAQ;AAAA,IACR,UAAU;AAAA,IACV,eAAe;AAAA,EACjB;AACA,SAAO,SAAS,UAAU;AAC5B;AAMO,MAAM,cAAc;AAAA,EACzB,QAAQ,aAAa,QAAQ,IAAI,oBAAoB,IAAI,GAAG,GAAG;AAAA,EAC/D,SAAS,aAAa,QAAQ,IAAI,qBAAqB,IAAI,GAAG,GAAG;AAAA,EACjE,QAAQ,aAAa,QAAQ,IAAI,oBAAoB,IAAI,GAAG,GAAG;AAAA,EAC/D,gBAAgB;AAAA,IACd,QAAQ,IAAI,mBAAmB,QAAQ,IAAI;AAAA,IAC3C;AAAA,IAAI;AAAA,IAAG;AAAA,EACT;AACF;AAEO,MAAM,UAAU;AAAA,EACrB,YAAY;AAAA,EACZ,mBAAmB;AAAA,EACnB,mBAAmB;AACrB;AAMO,MAAM,SAAS;AAAA,EACpB,YAAY;AAAA,EACZ,oBAAoB;AAAA,EACpB,iBAAiB;AAAA,EACjB,WAAW;AAAA,EACX,WAAW;AAAA,EACX,aAAa;AAAA,EACb,cAAc,CAAC,KAAM,KAAM,KAAM,MAAO,IAAK;AAC/C;AAMO,MAAM,cAAsC;AAAA,EACjD,GAAG;AAAA,EACH,GAAG;AAAA,EACH,GAAG;AAAA,EACH,GAAG;AAAA,EACH,GAAG;AAAA,EACH,GAAG;AAAA,EACH,GAAG;AAAA,EACH,GAAG;AAAA,EACH,GAAG;AAAA,EACH,IAAI;AACN;AAiBA,SAAS,wBAAwB,OAA+C;AAC9E,MAAI,CAAC,SAAS,UAAU,OAAQ,QAAO;AACvC,MAAI,wBAAwB,SAAS,KAAwB,GAAG;AAC9D,WAAO;AAAA,EACT;AACA,SAAO;AACT;AAUA,SAAS,mBAAmB,OAAqC;AAC/D,aAAW,QAAQ,OAAO;AACxB,UAAM,MAAM,QAAQ,IAAI,IAAI,GAAG,KAAK;AACpC,QAAI,IAAK,QAAO;AAAA,EAClB;AACA,SAAO;AACT;AAEA,IAAI,sBAAkD;AAEtD,SAAS,mBAAwC;AAC/C,MAAI,oBAAqB,QAAO;AAEhC,QAAM,SAAS,gBAAgB,eAAe,0BAA0B,oBAAoB,KAAK;AACjG,QAAM,UAAU,gBAAgB,gBAAgB,2BAA2B,qBAAqB;AAChG,QAAM,QAAQ,gBAAgB,aAAa,sBAAsB;AAEjE,MAAI,UAAU,CAAC,SAAS;AACtB,UAAM,IAAI;AAAA,MACR;AAAA,IAEF;AAAA,EACF;AACA,MAAI,UAAU,CAAC,OAAO;AACpB,UAAM,IAAI;AAAA,MACR;AAAA,IAEF;AAAA,EACF;AAEA,wBAAsB;AAAA,IACpB,SAAS;AAAA,IACT,UAAU,WAAW;AAAA,IACrB,OAAO,SAAS;AAAA,IAChB,kBAAkB,wBAAwB,gBAAgB,iBAAiB,0BAA0B,CAAC;AAAA,EACxG;AACA,SAAO;AACT;AAEO,MAAM,iBAAsC,IAAI,MAAM,CAAC,GAA0B;AAAA,EACtF,IAAI,SAAS,MAAc;AACzB,WAAO,iBAAiB,EAAE,IAAiC;AAAA,EAC7D;AACF,CAAC;",
  "names": []
  }
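The substance of the config change sits in the `sourcesContent` above: `LLM_BASE_URL` and `LLM_MODEL` no longer have defaults and must accompany `LLM_API_KEY`, while the legacy `LLM_EXTRACTION_*` and `OPENROUTER_*` names survive as fallbacks. A minimal sketch of that resolution logic, distilled from the source above (caching, logging, and the lazy `Proxy` wrapper omitted):

```ts
// Sketch of the LLM env resolution in the new config module.
// First non-empty name in the chain wins; canonical names come first.
function envWithFallback(...names: string[]): string | undefined {
  for (const name of names) {
    const val = process.env[name]?.trim();
    if (val) return val;
  }
  return undefined;
}

function resolveLlmConfig() {
  const apiKey = envWithFallback('LLM_API_KEY', 'LLM_EXTRACTION_API_KEY', 'OPENROUTER_API_KEY') ?? '';
  const baseUrl = envWithFallback('LLM_BASE_URL', 'LLM_EXTRACTION_BASE_URL', 'OPENROUTER_BASE_URL');
  const model = envWithFallback('LLM_MODEL', 'LLM_EXTRACTION_MODEL');

  // Required-together invariant from the source above: an API key without
  // an endpoint or model is a configuration error, not a silent default.
  if (apiKey && !baseUrl) throw new Error('LLM_BASE_URL is required when LLM_API_KEY is set.');
  if (apiKey && !model) throw new Error('LLM_MODEL is required when LLM_API_KEY is set.');

  return { apiKey, baseUrl: baseUrl ?? '', model: model ?? '' };
}
```

Because the exported `LLM_EXTRACTION` object is a `Proxy` over this resolver, a misconfigured key fails on first property access rather than at import time.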
package/dist/src/prompts/deep-research.js
CHANGED
@@ -12,25 +12,29 @@ function registerDeepResearchPrompt(server) {
  },
  async ({ topic }) => text(
  [
- "You are a research agent using the research-powerpack MCP tools. You are running a research LOOP, not answering from memory \u2014 every claim in your final answer must be traceable to a
+ "You are a research agent using the research-powerpack MCP tools (3 tools: `start-research`, `web-search`, `scrape-links`). You are running a research LOOP, not answering from memory \u2014 every non-trivial claim in your final answer must be traceable to a `scrape-links` excerpt. Never cite a URL from a `web-search` snippet alone.",
  "",
  `Research goal: ${topic}`,
  "",
  "## Workflow",
  "",
- "1. **Call `start-research` with `goal` = the research goal above.** The server returns a goal-tailored brief: classified goal type,
-
-
-
- '
-
+ "1. **Call `start-research` with `goal` = the research goal above.** The server returns a goal-tailored brief: classified goal type, `primary_branch` (reddit / web / both), the exact `first_call_sequence`, 25\u201350 keyword seeds for your first `web-search` call, iteration hints, gaps to watch, and stop criteria.",
+ "2. **Fire `first_call_sequence` in order.**",
+ ' - `primary_branch: web` \u2192 one `web-search` (scope: "web") with all keyword seeds in a flat `queries` array, then one `scrape-links` on the HIGHLY_RELEVANT + 2\u20133 best MAYBE_RELEVANT URLs.',
+ ' - `primary_branch: reddit` \u2192 one `web-search` (scope: "reddit") with the seeds, then one `scrape-links` on the best post permalinks (auto-detected \u2192 Reddit API threaded post + comments).',
+ ' - `primary_branch: both` \u2192 two parallel `web-search` calls in one turn (scope: "web" + scope: "reddit"), then one merged `scrape-links`.',
+ ' Set `extract` on `web-search` to a specific description of what "relevant" means for this goal (not just a keyword).',
+ "3. **Read the classifier output**: `synthesis` (grounded in `[rank]` citations), `gaps` (each with an id), `refine_queries` (follow-ups linked to gap ids). If confidence is `low`, trust the `gaps` list more than the synthesis.",
+ "4. **Read every scrape extract**. Each page returns `## Source`, `## Matches` (verbatim facts), `## Not found` (admitted gaps), `## Follow-up signals` (new terms + referenced-but-unscraped URLs). Harvest from `## Follow-up signals` \u2014 those terms seed your next `web-search` round.",
+ "5. **Loop**: build the next `web-search` with the harvested terms + classifier-suggested refines. Scrape HIGHLY_RELEVANT URLs in contextually grouped parallel `scrape-links` calls (docs in one call, reddit threads in another). Stop when every `gaps_to_watch` item is closed AND no new terms appeared, OR after 4 passes \u2014 whichever comes first.",
  "",
  "## Output discipline",
  "",
- "- Cite URL (or Reddit
+ "- Cite URL (or Reddit permalink) for every non-trivial claim.",
+ "- Quote verbatim: numbers, versions, API names, prices, error messages, stacktraces, people's words.",
  "- Separate documented facts from inferred conclusions explicitly.",
- "- Include scrape dates
- "- If any
+ "- Include scrape dates on time-sensitive claims.",
+ "- If any `stop_criteria` item from the brief is unmet, say so \u2014 do not paper over it."
  ].join("\n")
  )
  );
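Since the rewritten workflow is concrete about call shapes, it may help to see pass one of `primary_branch: both` written out. A sketch only: the goal and queries are hypothetical, `goal` / `queries` / `scope` / `extract` come from the prompt text itself, and the `urls` field name on `scrape-links` is an assumption this diff does not confirm:

```ts
// Hypothetical first_call_sequence for primary_branch: "both".
// Shapes follow the prompt text above; treat field names as illustrative.
const firstPass = [
  // Always first: generates the goal-tailored brief.
  { tool: 'start-research', args: { goal: 'why teams migrate off LibX' } }, // hypothetical goal
  // Two parallel web-search calls in one turn:
  {
    tool: 'web-search',
    args: {
      scope: 'web',
      queries: ['LibX migration post-mortem', 'LibX alternatives' /* ...25-50 seeds from the brief */],
      extract: 'concrete migration drivers | version numbers | what replaced it', // describe "relevant", not a keyword
    },
  },
  {
    tool: 'web-search',
    args: { scope: 'reddit', queries: [/* same seeds */], extract: 'lived experience | dissent | quotes' },
  },
  // Then one merged scrape-links over HIGHLY_RELEVANT + 2-3 best MAYBE_RELEVANT URLs:
  { tool: 'scrape-links', args: { urls: [/* winners from both searches */] } }, // `urls` is an assumed name
];
```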
package/dist/src/prompts/deep-research.js.map
CHANGED
@@ -1,7 +1,7 @@
  {
  "version": 3,
  "sources": ["../../../src/prompts/deep-research.ts"],
-
"sourcesContent": ["import { text, type MCPServer } from 'mcp-use/server';\nimport { z } from 'zod';\n\nexport function registerDeepResearchPrompt(server: MCPServer): void {\n server.prompt(\n {\n name: 'deep-research',\n title: 'Deep Research',\n description: 'Multi-pass research loop on a topic using the research-powerpack tools.',\n schema: z.object({\n topic: z.string().describe('Topic to research. Be specific about what \"done\" looks like \u2014 the first tool call will generate a goal-tailored research brief from it.'),\n }),\n },\n async ({ topic }) => text(\n [\n 'You are a research agent using the research-powerpack MCP tools. You are running a research LOOP, not answering from memory \u2014 every claim in your final answer must be traceable to a
-
"mappings": "AAAA,SAAS,YAA4B;AACrC,SAAS,SAAS;AAEX,SAAS,2BAA2B,QAAyB;AAClE,SAAO;AAAA,IACL;AAAA,MACE,MAAM;AAAA,MACN,OAAO;AAAA,MACP,aAAa;AAAA,MACb,QAAQ,EAAE,OAAO;AAAA,QACf,OAAO,EAAE,OAAO,EAAE,SAAS,8IAAyI;AAAA,MACtK,CAAC;AAAA,IACH;AAAA,IACA,OAAO,EAAE,MAAM,MAAM;AAAA,MACnB;AAAA,QACE;AAAA,QACA;AAAA,QACA,kBAAkB,KAAK;AAAA,QACvB;AAAA,QACA;AAAA,QACA;AAAA,QACA;AAAA,QACA;AAAA,QACA;AAAA,QACA;AAAA,QACA;AAAA,QACA;AAAA,QACA;AAAA,QACA;AAAA,QACA;AAAA,QACA;AAAA,QACA;AAAA,QACA;AAAA,QACA;AAAA,MACF,EAAE,KAAK,IAAI;AAAA,IACb;AAAA,EACF;AACF;",
+
"sourcesContent": ["import { text, type MCPServer } from 'mcp-use/server';\nimport { z } from 'zod';\n\nexport function registerDeepResearchPrompt(server: MCPServer): void {\n server.prompt(\n {\n name: 'deep-research',\n title: 'Deep Research',\n description: 'Multi-pass research loop on a topic using the research-powerpack tools.',\n schema: z.object({\n topic: z.string().describe('Topic to research. Be specific about what \"done\" looks like \u2014 the first tool call will generate a goal-tailored research brief from it.'),\n }),\n },\n async ({ topic }) => text(\n [\n 'You are a research agent using the research-powerpack MCP tools (3 tools: `start-research`, `web-search`, `scrape-links`). You are running a research LOOP, not answering from memory \u2014 every non-trivial claim in your final answer must be traceable to a `scrape-links` excerpt. Never cite a URL from a `web-search` snippet alone.',\n '',\n `Research goal: ${topic}`,\n '',\n '## Workflow',\n '',\n '1. **Call `start-research` with `goal` = the research goal above.** The server returns a goal-tailored brief: classified goal type, `primary_branch` (reddit / web / both), the exact `first_call_sequence`, 25\u201350 keyword seeds for your first `web-search` call, iteration hints, gaps to watch, and stop criteria.',\n '2. **Fire `first_call_sequence` in order.**',\n ' - `primary_branch: web` \u2192 one `web-search` (scope: \"web\") with all keyword seeds in a flat `queries` array, then one `scrape-links` on the HIGHLY_RELEVANT + 2\u20133 best MAYBE_RELEVANT URLs.',\n ' - `primary_branch: reddit` \u2192 one `web-search` (scope: \"reddit\") with the seeds, then one `scrape-links` on the best post permalinks (auto-detected \u2192 Reddit API threaded post + comments).',\n ' - `primary_branch: both` \u2192 two parallel `web-search` calls in one turn (scope: \"web\" + scope: \"reddit\"), then one merged `scrape-links`.',\n ' Set `extract` on `web-search` to a specific description of what \"relevant\" means for this goal (not just a keyword).',\n '3. **Read the classifier output**: `synthesis` (grounded in `[rank]` citations), `gaps` (each with an id), `refine_queries` (follow-ups linked to gap ids). If confidence is `low`, trust the `gaps` list more than the synthesis.',\n '4. **Read every scrape extract**. Each page returns `## Source`, `## Matches` (verbatim facts), `## Not found` (admitted gaps), `## Follow-up signals` (new terms + referenced-but-unscraped URLs). Harvest from `## Follow-up signals` \u2014 those terms seed your next `web-search` round.',\n '5. **Loop**: build the next `web-search` with the harvested terms + classifier-suggested refines. Scrape HIGHLY_RELEVANT URLs in contextually grouped parallel `scrape-links` calls (docs in one call, reddit threads in another). Stop when every `gaps_to_watch` item is closed AND no new terms appeared, OR after 4 passes \u2014 whichever comes first.',\n '',\n '## Output discipline',\n '',\n '- Cite URL (or Reddit permalink) for every non-trivial claim.',\n '- Quote verbatim: numbers, versions, API names, prices, error messages, stacktraces, people\\'s words.',\n '- Separate documented facts from inferred conclusions explicitly.',\n '- Include scrape dates on time-sensitive claims.',\n '- If any `stop_criteria` item from the brief is unmet, say so \u2014 do not paper over it.',\n ].join('\\n'),\n ),\n );\n}\n"],
+
"mappings": "AAAA,SAAS,YAA4B;AACrC,SAAS,SAAS;AAEX,SAAS,2BAA2B,QAAyB;AAClE,SAAO;AAAA,IACL;AAAA,MACE,MAAM;AAAA,MACN,OAAO;AAAA,MACP,aAAa;AAAA,MACb,QAAQ,EAAE,OAAO;AAAA,QACf,OAAO,EAAE,OAAO,EAAE,SAAS,8IAAyI;AAAA,MACtK,CAAC;AAAA,IACH;AAAA,IACA,OAAO,EAAE,MAAM,MAAM;AAAA,MACnB;AAAA,QACE;AAAA,QACA;AAAA,QACA,kBAAkB,KAAK;AAAA,QACvB;AAAA,QACA;AAAA,QACA;AAAA,QACA;AAAA,QACA;AAAA,QACA;AAAA,QACA;AAAA,QACA;AAAA,QACA;AAAA,QACA;AAAA,QACA;AAAA,QACA;AAAA,QACA;AAAA,QACA;AAAA,QACA;AAAA,QACA;AAAA,QACA;AAAA,QACA;AAAA,QACA;AAAA,QACA;AAAA,MACF,EAAE,KAAK,IAAI;AAAA,IACb;AAAA,EACF;AACF;",
  "names": []
  }
package/dist/src/prompts/reddit-sentiment.js
CHANGED
@@ -16,18 +16,18 @@ function registerRedditSentimentPrompt(server) {
  const subredditScope = subredditList.length ? ` Scope Reddit searches to ${subredditList.map((s) => `r/${s}`).join(", ")} when possible.` : "";
  return text(
  [
- "You are a research agent using the research-powerpack MCP tools to characterize Reddit sentiment. You are running a research LOOP, not answering from memory. Sentiment claims must be traceable to specific
+ "You are a research agent using the research-powerpack MCP tools (3 tools: `start-research`, `web-search`, `scrape-links`) to characterize Reddit sentiment. You are running a research LOOP, not answering from memory. Sentiment claims must be traceable to specific Reddit threads you expanded via `scrape-links` \u2014 never cite a thread you have not scraped.",
  "",
  `Research goal: Reddit sentiment on "${topic}" \u2014 agreement distribution, dissent distribution, representative verbatim quotes with attribution, and the strongest causal explanations.${subredditScope}`,
  "",
  "## Workflow",
  "",
-
- '2. **Fire `web-search`
+ "1. **Call `start-research` with `goal` = the research goal above.** The brief will classify this as `sentiment`, set `primary_branch` to `reddit` (or `both` if official sources also matter), and list 25\u201350 seed queries ready for `web-search`.",
+ '2. **Fire two parallel `web-search` calls in one turn** \u2014 one with `scope: "reddit"` for post-permalink discovery, one with `scope: "web"` for supporting evidence (post-mortems, blog write-ups, GitHub issues). Set `extract` to describe the shape of the sentiment answer: "agreement reasons | dissent reasons | representative quotes | migration drivers".',
  "3. **Shortlist the strongest Reddit threads** \u2014 those with (a) high comment count, (b) visible disagreement in replies, (c) specific stack/environment details from the OP. Avoid single-comment threads.",
- "4. **Fetch with `
- '5. **Scrape supporting evidence with `scrape-links
- '6. **Loop**: if the classifier flags gaps ("no dissent voices captured", "no migration timeline") or
+ "4. **Fetch with `scrape-links`** \u2014 batch 3\u201310 reddit.com post permalinks in one call. `scrape-links` auto-detects `reddit.com/r/.../comments/` URLs and routes them through the Reddit API (threaded post + full comment tree). Read every comment tree end-to-end, not just the top-voted reply.",
+ '5. **Scrape supporting evidence** with another `scrape-links` call (in parallel, different call from the reddit batch) \u2014 blog post-mortems, GitHub issues, HN discussions referenced in the threads. Use `extract` = "concrete reasons | stack details | version numbers | outcome". The extractor preserves verbatim quotes and surfaces referenced-but-unscraped URLs under `## Follow-up signals`.',
+ '6. **Loop**: if the classifier flags gaps ("no dissent voices captured", "no migration timeline") or brief `gaps_to_watch` are unmet, build new queries and run another pass. Stop after 4 passes or when sentiment distribution stabilizes across two passes.',
  "",
  "## Output discipline",
  "",
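One detail of the unchanged argument handling worth noting: subreddit filters are normalized, so callers can pass `webdev`, `r/webdev`, or `/r/webdev` interchangeably. A runnable sketch of that normalization, lifted verbatim from the prompt source embedded in the map file below:

```ts
// Normalization used for the optional `subreddits` argument: split on commas,
// trim, strip a leading "r/" or "/r/" (case-insensitive), drop empties.
function parseSubreddits(subreddits?: string): string[] {
  return subreddits
    ? subreddits
        .split(',')
        .map((value) => value.trim().replace(/^\/?r\//i, ''))
        .filter(Boolean)
    : [];
}

console.log(parseSubreddits('webdev, /r/javascript, r/node'));
// -> ["webdev", "javascript", "node"]
```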
package/dist/src/prompts/reddit-sentiment.js.map
CHANGED
@@ -1,7 +1,7 @@
  {
  "version": 3,
  "sources": ["../../../src/prompts/reddit-sentiment.ts"],
-
"sourcesContent": ["import { text, type MCPServer } from 'mcp-use/server';\nimport { z } from 'zod';\n\nexport function registerRedditSentimentPrompt(server: MCPServer): void {\n server.prompt(\n {\n name: 'reddit-sentiment',\n title: 'Reddit Sentiment',\n description: 'Research Reddit sentiment for a topic using the research-powerpack tools \u2014 lived experience, migration stories, agreement/dissent distribution.',\n schema: z.object({\n topic: z.string().describe('Topic to evaluate. Phrase it as a sentiment question \u2014 \"what developers actually think about X\", \"why teams moved from X to Y\".'),\n subreddits: z.string().optional().describe('Optional comma-separated subreddit filters, e.g. \"webdev,javascript\".'),\n }),\n },\n async ({ topic, subreddits }) => {\n const subredditList = subreddits\n ? subreddits\n .split(',')\n .map((value) => value.trim().replace(/^\\/?r\\//i, ''))\n .filter(Boolean)\n : [];\n const subredditScope = subredditList.length\n ? ` Scope Reddit searches to ${subredditList.map((s) => `r/${s}`).join(', ')} when possible.`\n : '';\n\n return text(\n [\n 'You are a research agent using the research-powerpack MCP tools to characterize Reddit sentiment. You are running a research LOOP, not answering from memory. Sentiment claims must be traceable to specific
+
"sourcesContent": ["import { text, type MCPServer } from 'mcp-use/server';\nimport { z } from 'zod';\n\nexport function registerRedditSentimentPrompt(server: MCPServer): void {\n server.prompt(\n {\n name: 'reddit-sentiment',\n title: 'Reddit Sentiment',\n description: 'Research Reddit sentiment for a topic using the research-powerpack tools \u2014 lived experience, migration stories, agreement/dissent distribution.',\n schema: z.object({\n topic: z.string().describe('Topic to evaluate. Phrase it as a sentiment question \u2014 \"what developers actually think about X\", \"why teams moved from X to Y\".'),\n subreddits: z.string().optional().describe('Optional comma-separated subreddit filters, e.g. \"webdev,javascript\".'),\n }),\n },\n async ({ topic, subreddits }) => {\n const subredditList = subreddits\n ? subreddits\n .split(',')\n .map((value) => value.trim().replace(/^\\/?r\\//i, ''))\n .filter(Boolean)\n : [];\n const subredditScope = subredditList.length\n ? ` Scope Reddit searches to ${subredditList.map((s) => `r/${s}`).join(', ')} when possible.`\n : '';\n\n return text(\n [\n 'You are a research agent using the research-powerpack MCP tools (3 tools: `start-research`, `web-search`, `scrape-links`) to characterize Reddit sentiment. You are running a research LOOP, not answering from memory. Sentiment claims must be traceable to specific Reddit threads you expanded via `scrape-links` \u2014 never cite a thread you have not scraped.',\n '',\n `Research goal: Reddit sentiment on \"${topic}\" \u2014 agreement distribution, dissent distribution, representative verbatim quotes with attribution, and the strongest causal explanations.${subredditScope}`,\n '',\n '## Workflow',\n '',\n '1. **Call `start-research` with `goal` = the research goal above.** The brief will classify this as `sentiment`, set `primary_branch` to `reddit` (or `both` if official sources also matter), and list 25\u201350 seed queries ready for `web-search`.',\n '2. **Fire two parallel `web-search` calls in one turn** \u2014 one with `scope: \"reddit\"` for post-permalink discovery, one with `scope: \"web\"` for supporting evidence (post-mortems, blog write-ups, GitHub issues). Set `extract` to describe the shape of the sentiment answer: \"agreement reasons | dissent reasons | representative quotes | migration drivers\".',\n '3. **Shortlist the strongest Reddit threads** \u2014 those with (a) high comment count, (b) visible disagreement in replies, (c) specific stack/environment details from the OP. Avoid single-comment threads.',\n '4. **Fetch with `scrape-links`** \u2014 batch 3\u201310 reddit.com post permalinks in one call. `scrape-links` auto-detects `reddit.com/r/.../comments/` URLs and routes them through the Reddit API (threaded post + full comment tree). Read every comment tree end-to-end, not just the top-voted reply.',\n '5. **Scrape supporting evidence** with another `scrape-links` call (in parallel, different call from the reddit batch) \u2014 blog post-mortems, GitHub issues, HN discussions referenced in the threads. Use `extract` = \"concrete reasons | stack details | version numbers | outcome\". The extractor preserves verbatim quotes and surfaces referenced-but-unscraped URLs under `## Follow-up signals`.',\n '6. **Loop**: if the classifier flags gaps (\"no dissent voices captured\", \"no migration timeline\") or brief `gaps_to_watch` are unmet, build new queries and run another pass. 
Stop after 4 passes or when sentiment distribution stabilizes across two passes.',\n '',\n '## Output discipline',\n '',\n '- Report sentiment as a distribution (\"~N of M replies agreed / ~K dissented / rest off-topic\"), not a single mood label.',\n '- Cite every quote with the Reddit thread permalink plus `u/username` attribution.',\n '- Separate OP claims from reply-thread consensus \u2014 they often diverge.',\n '- If dissent is present, surface the strongest dissenting quote verbatim, even if the majority view dominates.',\n '- Include the scrape date on every time-sensitive claim.',\n ].join('\\n'),\n );\n },\n );\n}\n"],
"mappings": "AAAA,SAAS,YAA4B;AACrC,SAAS,SAAS;AAEX,SAAS,8BAA8B,QAAyB;AACrE,SAAO;AAAA,IACL;AAAA,MACE,MAAM;AAAA,MACN,OAAO;AAAA,MACP,aAAa;AAAA,MACb,QAAQ,EAAE,OAAO;AAAA,QACf,OAAO,EAAE,OAAO,EAAE,SAAS,sIAAiI;AAAA,QAC5J,YAAY,EAAE,OAAO,EAAE,SAAS,EAAE,SAAS,uEAAuE;AAAA,MACpH,CAAC;AAAA,IACH;AAAA,IACA,OAAO,EAAE,OAAO,WAAW,MAAM;AAC/B,YAAM,gBAAgB,aAClB,WACG,MAAM,GAAG,EACT,IAAI,CAAC,UAAU,MAAM,KAAK,EAAE,QAAQ,YAAY,EAAE,CAAC,EACnD,OAAO,OAAO,IACjB,CAAC;AACL,YAAM,iBAAiB,cAAc,SACjC,6BAA6B,cAAc,IAAI,CAAC,MAAM,KAAK,CAAC,EAAE,EAAE,KAAK,IAAI,CAAC,oBAC1E;AAEJ,aAAO;AAAA,QACL;AAAA,UACE;AAAA,UACA;AAAA,UACA,uCAAuC,KAAK,iJAA4I,cAAc;AAAA,UACtM;AAAA,UACA;AAAA,UACA;AAAA,UACA;AAAA,UACA;AAAA,UACA;AAAA,UACA;AAAA,UACA;AAAA,UACA;AAAA,UACA;AAAA,UACA;AAAA,UACA;AAAA,UACA;AAAA,UACA;AAAA,UACA;AAAA,UACA;AAAA,UACA;AAAA,QACF,EAAE,KAAK,IAAI;AAAA,MACb;AAAA,IACF;AAAA,EACF;AACF;",
  "names": []
  }