freshcontext-mcp 0.1.6 β†’ 0.1.7

package/README.md CHANGED
@@ -41,31 +41,41 @@ The AI agent always knows **when it's looking at data**, not just what the data
  | `extract_github` | README, stars, forks, language, topics, last commit from any GitHub repo |
  | `extract_hackernews` | Top stories or search results from HN with scores and timestamps |
  | `extract_scholar` | Research paper titles, authors, years, and snippets from Google Scholar |
+ | `extract_reddit` | Posts and community sentiment from any subreddit or Reddit search |
 
  ### 🚀 Competitive Intelligence Tools
 
  | Tool | Description |
  |---|---|
  | `extract_yc` | Scrape YC company listings by keyword — find who's funded in your space |
+ | `extract_producthunt` | Recent Product Hunt launches by keyword or topic |
  | `search_repos` | Search GitHub for similar/competing repos, ranked by stars with activity signals |
  | `package_trends` | npm and PyPI package metadata — version history, release cadence, last updated |
 
+ ### 📈 Market Data
+
+ | Tool | Description |
+ |---|---|
+ | `extract_finance` | Live stock data via Yahoo Finance — price, market cap, P/E, 52w range, sector, company summary |
+
  ### 🗺️ Composite Tool
 
  | Tool | Description |
  |---|---|
- | `extract_landscape` | **One call. Full picture.** Queries YC startups + GitHub repos + HN sentiment + package ecosystem simultaneously. Returns a unified landscape report. |
+ | `extract_landscape` | **One call. Full picture.** Queries YC + GitHub + HN + npm/PyPI simultaneously. Returns a unified timestamped landscape report. |
 
  ---
 
  ## Quick Start
 
- ### Option A — Cloud (no install, works immediately)
+ ### Option A — Cloud (recommended, no install needed)
+
+ Visit **[freshcontext-site.pages.dev](https://freshcontext-site.pages.dev)** for a guided 3-step install with copy-paste config. No terminal, no downloads, no antivirus alerts.
 
- No Node, no Playwright, nothing to install. Just add this to your Claude Desktop config and restart.
+ Or add this manually to your Claude Desktop config and restart:
 
- **Mac:** open `~/Library/Application Support/Claude/claude_desktop_config.json`
- **Windows:** open `%APPDATA%\Claude\claude_desktop_config.json`
+ **Mac:** `~/Library/Application Support/Claude/claude_desktop_config.json`
+ **Windows:** `%APPDATA%\Claude\claude_desktop_config.json`
 
  ```json
  {
@@ -80,11 +90,11 @@ No Node, no Playwright, nothing to install. Just add this to your Claude Desktop
 
  Restart Claude Desktop. The freshcontext tools will appear in your session.
 
- > **Note:** If `claude_desktop_config.json` doesn't exist yet, create it with the content above.
+ > If `claude_desktop_config.json` doesn't exist yet, create it with the content above.
 
  ---
 
- ### Option B — Local (full Playwright, faster for heavy use)
+ ### Option B — Local (full Playwright, for heavy use)
 
  **Prerequisites:** Node.js 18+ ([nodejs.org](https://nodejs.org))
 
@@ -98,7 +108,7 @@ npm run build
 
  Then add to your Claude Desktop config:
 
- **Mac** (`~/Library/Application Support/Claude/claude_desktop_config.json`):
+ **Mac:**
  ```json
  {
  "mcpServers": {
@@ -110,7 +120,7 @@ Then add to your Claude Desktop config:
  }
  ```
 
- **Windows** (`%APPDATA%\Claude\claude_desktop_config.json`):
+ **Windows:**
  ```json
  {
  "mcpServers": {
@@ -122,58 +132,48 @@ Then add to your Claude Desktop config:
  }
  ```
 
- Restart Claude Desktop.
-
  ---
 
  ### Troubleshooting (Mac)
 
- **"command not found: node"** — Node isn't on your PATH inside Claude Desktop's environment. Use the full path:
+ **"command not found: node"** — Node isn't on Claude Desktop's PATH. Use the full path:
  ```bash
  which node # copy this output
  ```
- Then replace `"command": "node"` with `"command": "/usr/local/bin/node"` (or whatever `which node` returned).
+ Replace `"command": "node"` with `"command": "/usr/local/bin/node"` (or whatever `which node` returned).
 
- **"npx: command not found"** — Same issue. Run `which npx` and use the full path for Option A:
- ```json
- "command": "/usr/local/bin/npx"
- ```
+ **"npx: command not found"** — Same fix. Run `which npx` and use the full path.
 
- **Config file doesn't exist** — Create it. On Mac:
+ **Config file doesn't exist** — Create it:
  ```bash
  mkdir -p ~/Library/Application\ Support/Claude
  touch ~/Library/Application\ Support/Claude/claude_desktop_config.json
  ```
- Then paste the config JSON above into it.
 
  ---
 
  ## Usage Examples
 
  ### Check if anyone is already building what you're building
-
  ```
  Use extract_landscape with topic "cashflow prediction mcp"
  ```
-
  Returns a unified report: who's funded (YC), what's trending (HN), what repos exist (GitHub), what packages are active (npm/PyPI). All timestamped.
 
- ### Analyse a specific repo
-
+ ### Get community sentiment on a topic
  ```
- Use extract_github on https://github.com/anthropics/anthropic-sdk-python
+ Use extract_reddit with url "r/MachineLearning"
+ Use extract_hackernews with url "https://hn.algolia.com/api/v1/search?query=mcp+server&tags=story"
  ```
 
- ### Find research papers on a topic
-
+ ### Check a company's stock
  ```
- Use extract_scholar on https://scholar.google.com/scholar?q=llm+context+freshness
+ Use extract_finance with url "NVDA,MSFT,GOOG"
  ```
 
- ### Check package ecosystem health
-
+ ### Find what just launched in your space
  ```
- Use package_trends with packages "npm:@modelcontextprotocol/sdk,pypi:langchain"
+ Use extract_producthunt with url "AI developer tools"
  ```
 
  ---
@@ -189,24 +189,14 @@ FreshContext treats **retrieval time as first-class metadata**. Every adapter re
  - `freshness_confidence` — `high`, `medium`, or `low` based on signal quality
  - `adapter` — which source the data came from
 
- This makes freshness **verifiable**, not assumed.
-
  ---
 
- ## Deployment
-
- ### Local (Playwright-based)
- Uses headless Chromium via Playwright. Full browser rendering for JavaScript-heavy sites.
+ ## Security
 
- ### Cloud (Cloudflare Workers)
- The `worker/` directory contains a Cloudflare Workers deployment. No Playwright dependency — runs at the edge globally.
-
- ```bash
- cd worker
- npm install
- npx wrangler secret put API_KEY
- npx wrangler deploy
- ```
+ - Input sanitization and domain allowlists on all adapters
+ - SSRF prevention (blocked private IP ranges)
+ - KV-backed global rate limiting: 60 requests/minute per IP across all edge nodes
+ - No credentials required for public data sources
 
  ---
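The metadata fields above surface around every tool response as a plain-text envelope. A minimal sketch of that wrapper, mirroring the worker's `stamp` helper that appears later in this diff (illustrative, not the package's exported API):

```typescript
// Sketch of the freshness envelope described above (mirrors the worker's stamp helper)
function stamp(content: string, url: string, date: string | null, confidence: string): string {
  return [
    "[FRESHCONTEXT]",
    `Source: ${url}`,
    `Published: ${date ?? "unknown"}`,        // content_date, or "unknown"
    `Retrieved: ${new Date().toISOString()}`, // retrieval time as first-class metadata
    `Confidence: ${confidence}`,              // freshness_confidence: high | medium | low
    "---",
    content.slice(0, 6000),
    "[/FRESHCONTEXT]",
  ].join("\n");
}
```

The agent can then check freshness from the envelope itself rather than assuming it.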
 
@@ -217,17 +207,20 @@ freshcontext-mcp/
  ├── src/
  │ ├── server.ts # MCP server, all tool registrations
  │ ├── types.ts # FreshContext interfaces
- │ ├── security.ts # Input validation, domain allowlists
+ │ ├── security.ts # Input validation, domain allowlists, SSRF prevention
  │ ├── adapters/
  │ │ ├── github.ts
  │ │ ├── hackernews.ts
  │ │ ├── scholar.ts
  │ │ ├── yc.ts
  │ │ ├── repoSearch.ts
- │ │ └── packageTrends.ts
+ │ │ ├── packageTrends.ts
+ │ │ ├── reddit.ts
+ │ │ ├── productHunt.ts
+ │ │ └── finance.ts
  │ └── tools/
  │ └── freshnessStamp.ts
- └── worker/ # Cloudflare Workers deployment
+ └── worker/ # Cloudflare Workers deployment (all 10 tools)
  └── src/worker.ts
  ```
 
@@ -243,9 +236,11 @@ freshcontext-mcp/
  - [x] npm/PyPI package trends
  - [x] `extract_landscape` composite tool
  - [x] Cloudflare Workers deployment
- - [x] Worker auth + rate limiting + domain allowlists
- - [ ] Product Hunt launches adapter
- - [ ] Finance/market data adapter
+ - [x] Worker auth + KV-backed global rate limiting
+ - [x] Reddit community sentiment adapter
+ - [x] Product Hunt launches adapter
+ - [x] Yahoo Finance market data adapter
+ - [ ] `extract_arxiv` — structured arXiv API (more reliable than Scholar)
  - [ ] TTL-based caching layer
  - [ ] `freshness_score` numeric metric
 
@@ -0,0 +1,66 @@
+ /**
+ * arXiv adapter — uses the official arXiv API (no scraping, no auth needed).
+ * Accepts a search query or a direct arXiv API URL.
+ * Docs: https://arxiv.org/help/api/user-manual
+ */
+ export async function arxivAdapter(options) {
+ const input = options.url.trim();
+ // Build API URL — if they pass a plain query, construct it
+ const apiUrl = input.startsWith("http")
+ ? input
+ : `https://export.arxiv.org/api/query?search_query=all:${encodeURIComponent(input)}&start=0&max_results=10&sortBy=relevance&sortOrder=descending`;
+ const res = await fetch(apiUrl, {
+ headers: { "User-Agent": "freshcontext-mcp/0.1.7 (https://github.com/PrinceGabriel-lgtm/freshcontext-mcp)" },
+ });
+ if (!res.ok)
+ throw new Error(`arXiv API error: ${res.status} ${res.statusText}`);
+ const xml = await res.text();
+ // Parse the Atom XML response
+ const entries = [...xml.matchAll(/<entry>([\s\S]*?)<\/entry>/g)];
+ if (!entries.length) {
+ return { raw: "No results found for this query.", content_date: null, freshness_confidence: "low" };
+ }
+ const getTag = (block, tag) => {
+ const m = block.match(new RegExp(`<${tag}[^>]*>([\\s\\S]*?)<\\/${tag}>`, "i"));
+ return m ? m[1].trim().replace(/\s+/g, " ") : "";
+ };
+ const getAttr = (block, tag, attr) => {
+ const m = block.match(new RegExp(`<${tag}[^>]*${attr}="([^"]*)"`, "i"));
+ return m ? m[1].trim() : "";
+ };
+ const papers = entries.map((match, i) => {
+ const block = match[1];
+ const title = getTag(block, "title").replace(/\n/g, " ");
+ const summary = getTag(block, "summary").slice(0, 300).replace(/\n/g, " ");
+ const published = getTag(block, "published").slice(0, 10); // YYYY-MM-DD
+ const updated = getTag(block, "updated").slice(0, 10);
+ const id = getTag(block, "id").replace("http://arxiv.org/abs/", "https://arxiv.org/abs/");
+ // Authors — can be multiple
+ const authorMatches = [...block.matchAll(/<author>([\s\S]*?)<\/author>/g)];
+ const authors = authorMatches
+ .map(a => getTag(a[1], "name"))
+ .filter(Boolean)
+ .slice(0, 4)
+ .join(", ");
+ // Categories
+ const primaryCat = getAttr(block, "arxiv:primary_category", "term") ||
+ getAttr(block, "category", "term");
+ return [
+ `[${i + 1}] ${title}`,
+ `Authors: ${authors || "Unknown"}`,
+ `Published: ${published}${updated !== published ? ` (updated ${updated})` : ""}`,
+ primaryCat ? `Category: ${primaryCat}` : null,
+ `Abstract: ${summary}…`,
+ `Link: ${id}`,
+ ].filter(Boolean).join("\n");
+ });
+ const raw = papers.join("\n\n").slice(0, options.maxLength ?? 6000);
+ // Most recent publication date
+ const dates = entries
+ .map(m => getTag(m[1], "published").slice(0, 10))
+ .filter(Boolean)
+ .sort()
+ .reverse();
+ const content_date = dates[0] ?? null;
+ return { raw, content_date, freshness_confidence: content_date ? "high" : "medium" };
+ }
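The adapter's two pure pieces, the query-to-URL builder and the Atom tag extractor, can be exercised standalone. A sketch copied from the logic above (the standalone function names here are illustrative, not the package's exports):

```typescript
// Query-to-URL construction, as in arxivAdapter above
function buildArxivUrl(input: string): string {
  return input.startsWith("http")
    ? input
    : `https://export.arxiv.org/api/query?search_query=all:${encodeURIComponent(input)}&start=0&max_results=10&sortBy=relevance&sortOrder=descending`;
}

// Tag extraction over an Atom <entry> block, as in the getTag helper above
// (whitespace runs inside the tag body collapse to single spaces)
function getTag(block: string, tag: string): string {
  const m = block.match(new RegExp(`<${tag}[^>]*>([\\s\\S]*?)<\\/${tag}>`, "i"));
  return m ? m[1].trim().replace(/\s+/g, " ") : "";
}
```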
package/dist/server.js CHANGED
@@ -124,10 +124,10 @@ server.registerTool("package_trends", {
124
124
  });
125
125
  // ─── Tool: extract_landscape ─────────────────────────────────────────────────
126
126
  server.registerTool("extract_landscape", {
127
- description: "Composite intelligence tool. Given a project idea or keyword, simultaneously queries YC startups, GitHub repos, HN sentiment, and package activity to answer: Who is building this? Is it funded? What's getting traction? Returns a unified timestamped landscape report.",
127
+ description: "Composite intelligence tool. Given a project idea or keyword, simultaneously queries YC startups, GitHub repos, HN, Reddit, Product Hunt, and package registries to answer: Who is building this? Is it funded? What's getting traction? Returns a unified 6-source timestamped landscape report.",
128
128
  inputSchema: z.object({
129
129
  topic: z.string().describe("Your project idea or keyword e.g. 'mcp server' or 'cashflow prediction'"),
130
- max_length: z.number().optional().default(8000),
130
+ max_length: z.number().optional().default(10000),
131
131
  }),
132
132
  annotations: { readOnlyHint: true, openWorldHint: true },
133
133
  }, async ({ topic, max_length }) => {
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "freshcontext-mcp",
3
- "version": "0.1.6",
3
+ "version": "0.1.7",
4
4
  "description": "Real-time web extraction MCP server with freshness timestamps for AI agents",
5
5
  "keywords": [
6
6
  "mcp",
@@ -49,3 +49,4 @@
49
49
  }
50
50
  }
51
51
 
52
+
@@ -0,0 +1,84 @@
+ import { AdapterResult, ExtractOptions } from "../types.js";
+
+ /**
+ * arXiv adapter — uses the official arXiv API (no scraping, no auth needed).
+ * Accepts a search query or a direct arXiv API URL.
+ * Docs: https://arxiv.org/help/api/user-manual
+ */
+ export async function arxivAdapter(options: ExtractOptions): Promise<AdapterResult> {
+ const input = options.url.trim();
+
+ // Build API URL — if they pass a plain query, construct it
+ const apiUrl = input.startsWith("http")
+ ? input
+ : `https://export.arxiv.org/api/query?search_query=all:${encodeURIComponent(input)}&start=0&max_results=10&sortBy=relevance&sortOrder=descending`;
+
+ const res = await fetch(apiUrl, {
+ headers: { "User-Agent": "freshcontext-mcp/0.1.7 (https://github.com/PrinceGabriel-lgtm/freshcontext-mcp)" },
+ });
+
+ if (!res.ok) throw new Error(`arXiv API error: ${res.status} ${res.statusText}`);
+
+ const xml = await res.text();
+
+ // Parse the Atom XML response
+ const entries = [...xml.matchAll(/<entry>([\s\S]*?)<\/entry>/g)];
+
+ if (!entries.length) {
+ return { raw: "No results found for this query.", content_date: null, freshness_confidence: "low" };
+ }
+
+ const getTag = (block: string, tag: string): string => {
+ const m = block.match(new RegExp(`<${tag}[^>]*>([\\s\\S]*?)<\\/${tag}>`, "i"));
+ return m ? m[1].trim().replace(/\s+/g, " ") : "";
+ };
+
+ const getAttr = (block: string, tag: string, attr: string): string => {
+ const m = block.match(new RegExp(`<${tag}[^>]*${attr}="([^"]*)"`, "i"));
+ return m ? m[1].trim() : "";
+ };
+
+ const papers = entries.map((match, i) => {
+ const block = match[1];
+
+ const title = getTag(block, "title").replace(/\n/g, " ");
+ const summary = getTag(block, "summary").slice(0, 300).replace(/\n/g, " ");
+ const published = getTag(block, "published").slice(0, 10); // YYYY-MM-DD
+ const updated = getTag(block, "updated").slice(0, 10);
+ const id = getTag(block, "id").replace("http://arxiv.org/abs/", "https://arxiv.org/abs/");
+
+ // Authors — can be multiple
+ const authorMatches = [...block.matchAll(/<author>([\s\S]*?)<\/author>/g)];
+ const authors = authorMatches
+ .map(a => getTag(a[1], "name"))
+ .filter(Boolean)
+ .slice(0, 4)
+ .join(", ");
+
+ // Categories
+ const primaryCat = getAttr(block, "arxiv:primary_category", "term") ||
+ getAttr(block, "category", "term");
+
+ return [
+ `[${i + 1}] ${title}`,
+ `Authors: ${authors || "Unknown"}`,
+ `Published: ${published}${updated !== published ? ` (updated ${updated})` : ""}`,
+ primaryCat ? `Category: ${primaryCat}` : null,
+ `Abstract: ${summary}…`,
+ `Link: ${id}`,
+ ].filter(Boolean).join("\n");
+ });
+
+ const raw = papers.join("\n\n").slice(0, options.maxLength ?? 6000);
+
+ // Most recent publication date
+ const dates = entries
+ .map(m => getTag(m[1], "published").slice(0, 10))
+ .filter(Boolean)
+ .sort()
+ .reverse();
+
+ const content_date = dates[0] ?? null;
+
+ return { raw, content_date, freshness_confidence: content_date ? "high" : "medium" };
+ }
package/src/server.ts CHANGED
@@ -11,6 +11,7 @@ import { packageTrendsAdapter } from "./adapters/packageTrends.js";
11
11
  import { redditAdapter } from "./adapters/reddit.js";
12
12
  import { productHuntAdapter } from "./adapters/productHunt.js";
13
13
  import { financeAdapter } from "./adapters/finance.js";
14
+ import { arxivAdapter } from "./adapters/arxiv.js";
14
15
  import { stampFreshness, formatForLLM } from "./tools/freshnessStamp.js";
15
16
  import { SecurityError, formatSecurityError } from "./security.js";
16
17
 
@@ -162,10 +163,10 @@ server.registerTool(
  "extract_landscape",
  {
  description:
- "Composite intelligence tool. Given a project idea or keyword, simultaneously queries YC startups, GitHub repos, HN sentiment, and package activity to answer: Who is building this? Is it funded? What's getting traction? Returns a unified timestamped landscape report.",
+ "Composite intelligence tool. Given a project idea or keyword, simultaneously queries YC startups, GitHub repos, HN, Reddit, Product Hunt, and package registries to answer: Who is building this? Is it funded? What's getting traction? Returns a unified 6-source timestamped landscape report.",
  inputSchema: z.object({
  topic: z.string().describe("Your project idea or keyword e.g. 'mcp server' or 'cashflow prediction'"),
- max_length: z.number().optional().default(8000),
+ max_length: z.number().optional().default(10000),
  }),
  annotations: { readOnlyHint: true, openWorldHint: true },
  },
@@ -209,3 +210,6 @@ main().catch(console.error);
 
 
 
+
+
+
@@ -7,7 +7,8 @@ import { z } from "zod";
 
  interface Env {
  BROWSER: Fetcher;
- API_KEY?: string; // Optional: set via `wrangler secret put API_KEY`
+ RATE_LIMITER: KVNamespace;
+ API_KEY?: string;
  }
 
  interface FreshContext {
@@ -26,152 +27,97 @@ const ALLOWED_DOMAINS: Record<string, string[]> = {
  scholar: ["scholar.google.com"],
  hackernews: ["news.ycombinator.com", "hn.algolia.com"],
  yc: ["www.ycombinator.com", "ycombinator.com"],
+ producthunt: ["www.producthunt.com", "producthunt.com"],
+ // reddit, finance, repoSearch, packageTrends use fetch APIs — no browser, no domain restriction needed
  };
 
  const PRIVATE_IP_PATTERNS = [
- /^localhost$/i,
- /^127\./,
- /^10\./,
- /^192\.168\./,
- /^172\.(1[6-9]|2\d|3[01])\./,
- /^169\.254\./,
- /^::1$/,
- /^fc00:/i,
- /^fe80:/i,
+ /^localhost$/i, /^127\./, /^10\./, /^192\.168\./,
+ /^172\.(1[6-9]|2\d|3[01])\./, /^169\.254\./, /^::1$/, /^fc00:/i, /^fe80:/i,
  ];
 
- const MAX_URL_LENGTH = 500;
- const MAX_QUERY_LENGTH = 200;
-
  class SecurityError extends Error {
- constructor(message: string) {
- super(message);
- this.name = "SecurityError";
- }
+ constructor(message: string) { super(message); this.name = "SecurityError"; }
  }
 
  function validateUrl(rawUrl: string, adapter: string): string {
- if (rawUrl.length > MAX_URL_LENGTH)
- throw new SecurityError(`URL too long (max ${MAX_URL_LENGTH} chars)`);
-
+ if (rawUrl.length > 500) throw new SecurityError("URL too long (max 500 chars)");
  let parsed: URL;
- try { parsed = new URL(rawUrl); }
- catch { throw new SecurityError("Invalid URL format"); }
-
+ try { parsed = new URL(rawUrl); } catch { throw new SecurityError("Invalid URL format"); }
  if (!["http:", "https:"].includes(parsed.protocol))
  throw new SecurityError("Only http/https URLs are allowed");
-
  const hostname = parsed.hostname.toLowerCase();
-
- for (const pattern of PRIVATE_IP_PATTERNS) {
- if (pattern.test(hostname))
- throw new SecurityError("Access to private/internal addresses is not allowed");
- }
-
+ for (const p of PRIVATE_IP_PATTERNS)
+ if (p.test(hostname)) throw new SecurityError("Access to private addresses is not allowed");
  const allowed = ALLOWED_DOMAINS[adapter];
- if (allowed && allowed.length > 0) {
- const ok = allowed.some(d => hostname === d || hostname.endsWith(`.${d}`));
- if (!ok)
- throw new SecurityError(`URL not allowed for ${adapter}. Allowed domains: ${allowed.join(", ")}`);
+ if (allowed?.length) {
+ if (!allowed.some(d => hostname === d || hostname.endsWith(`.${d}`)))
+ throw new SecurityError(`Domain not allowed for ${adapter}: ${hostname}`);
  }
-
  return rawUrl;
  }
 
- function sanitizeQuery(query: string, maxLen = MAX_QUERY_LENGTH): string {
- if (query.length > maxLen)
- throw new SecurityError(`Query too long (max ${maxLen} chars)`);
- // Strip null bytes and control characters
- return query.replace(/[\x00-\x1F\x7F]/g, "").trim();
- }
+ // ─── Rate Limiting (KV-backed, shared across edge locations) ──────────────────
 
- // ─── Rate Limiting (in-memory, per isolate) ───────────────────────────────────
+ const RATE_LIMIT = 60; // requests per window
+ const RATE_WINDOW_S = 60; // window size in seconds
 
- interface RateEntry { count: number; windowStart: number; }
- const rateMap = new Map<string, RateEntry>();
+ async function checkRateLimit(ip: string, kv: KVNamespace): Promise<void> {
+ const key = `rl:${ip}`;
 
- const RATE_LIMIT = 20; // max requests
- const RATE_WINDOW_MS = 60_000; // per 60 seconds
+ // Get current count — KV TTL handles the window reset automatically
+ const current = await kv.get(key);
+ const count = current ? parseInt(current) : 0;
 
- function checkRateLimit(ip: string): void {
- const now = Date.now();
- const entry = rateMap.get(ip);
-
- if (!entry || now - entry.windowStart > RATE_WINDOW_MS) {
- rateMap.set(ip, { count: 1, windowStart: now });
- return;
- }
-
- if (entry.count >= RATE_LIMIT) {
- throw new SecurityError(`Rate limit exceeded. Max ${RATE_LIMIT} requests per minute.`);
+ if (count >= RATE_LIMIT) {
+ throw new SecurityError(
+ `Rate limit exceeded — max ${RATE_LIMIT} requests per minute per IP. Try again shortly.`
+ );
  }
 
- entry.count++;
- }
-
- // Prevent the map from growing unboundedly
- function pruneRateMap(): void {
- const now = Date.now();
- for (const [ip, entry] of rateMap) {
- if (now - entry.windowStart > RATE_WINDOW_MS) rateMap.delete(ip);
+ // Increment. On the first request in a window, set a TTL so the key expires
+ // when the window ends; each later put passes expirationTtl again, refreshing it.
+ if (!current) {
+ // First request in this window — set with TTL
+ await kv.put(key, "1", { expirationTtl: RATE_WINDOW_S });
+ } else {
+ // KV has no atomic increment, so we read-then-overwrite; a lost update
+ // under concurrent requests is an acceptable race for rate limiting.
+ await kv.put(key, String(count + 1), { expirationTtl: RATE_WINDOW_S });
  }
  }
 
  // ─── Auth ─────────────────────────────────────────────────────────────────────
 
  function checkAuth(request: Request, env: Env): void {
- if (!env.API_KEY) return; // Auth disabled if no key is set
-
- const authHeader = request.headers.get("Authorization") ?? "";
- const token = authHeader.startsWith("Bearer ") ? authHeader.slice(7) : "";
-
- if (token !== env.API_KEY) {
- throw new SecurityError("Unauthorized. Provide a valid Bearer token.");
- }
+ if (!env.API_KEY) return;
+ const token = (request.headers.get("Authorization") ?? "").replace("Bearer ", "");
+ if (token !== env.API_KEY) throw new SecurityError("Unauthorized");
  }
 
  // ─── Helpers ──────────────────────────────────────────────────────────────────
 
  function getClientIp(request: Request): string {
- return (
- request.headers.get("CF-Connecting-IP") ??
- request.headers.get("X-Forwarded-For")?.split(",")[0]?.trim() ??
- "unknown"
- );
+ return request.headers.get("CF-Connecting-IP")
+ ?? request.headers.get("X-Forwarded-For")?.split(",")[0]?.trim()
+ ?? "unknown";
  }
 
- function securityErrorResponse(message: string, status: number): Response {
+ function errResponse(message: string, status: number): Response {
  return new Response(JSON.stringify({ error: message }), {
- status,
- headers: { "Content-Type": "application/json" },
+ status, headers: { "Content-Type": "application/json" },
  });
  }
 
- // ─── Freshness Stamp ──────────────────────────────────────────────────────────
-
- function stamp(
- content: string,
- url: string,
- date: string | null,
- confidence: "high" | "medium" | "low",
- adapter: string
- ): string {
- const ctx: FreshContext = {
- content: content.slice(0, 6000),
- source_url: url,
- content_date: date,
- retrieved_at: new Date().toISOString(),
- freshness_confidence: confidence,
- adapter,
- };
+ function stamp(content: string, url: string, date: string | null, confidence: "high" | "medium" | "low", adapter: string): string {
  return [
  "[FRESHCONTEXT]",
- `Source: ${ctx.source_url}`,
- `Published: ${ctx.content_date ?? "unknown"}`,
- `Retrieved: ${ctx.retrieved_at}`,
- `Confidence: ${ctx.freshness_confidence}`,
+ `Source: ${url}`,
+ `Published: ${date ?? "unknown"}`,
+ `Retrieved: ${new Date().toISOString()}`,
+ `Confidence: ${confidence}`,
  "---",
- ctx.content,
+ content.slice(0, 6000),
  "[/FRESHCONTEXT]",
  ].join("\n");
  }
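The private-address screen inside `validateUrl` can be isolated and tested on its own. A sketch using the worker's patterns verbatim (the `isPrivateHost` name is illustrative):

```typescript
// The SSRF screen from validateUrl above, isolated (patterns copied from the worker)
const PRIVATE_IP_PATTERNS = [
  /^localhost$/i, /^127\./, /^10\./, /^192\.168\./,
  /^172\.(1[6-9]|2\d|3[01])\./, /^169\.254\./, /^::1$/, /^fc00:/i, /^fe80:/i,
];

function isPrivateHost(hostname: string): boolean {
  return PRIVATE_IP_PATTERNS.some(p => p.test(hostname.toLowerCase()));
}
```

Note the limit of this style of check: it matches hostnames only and never resolves DNS, so a public name that resolves to a private address still passes.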
@@ -179,14 +125,12 @@ function stamp(
  // ─── Server Factory ───────────────────────────────────────────────────────────
 
  function createServer(env: Env): McpServer {
- const server = new McpServer({ name: "freshcontext-mcp", version: "0.1.3" });
+ const server = new McpServer({ name: "freshcontext-mcp", version: "0.1.7" });
 
  // ── extract_github ──────────────────────────────────────────────────────────
  server.registerTool("extract_github", {
  description: "Extract real-time data from a GitHub repository — README, stars, forks, last commit, topics. Returns timestamped freshcontext.",
- inputSchema: z.object({
- url: z.string().url().describe("Full GitHub repo URL e.g. https://github.com/owner/repo"),
- }),
+ inputSchema: z.object({ url: z.string().url().describe("Full GitHub repo URL e.g. https://github.com/owner/repo") }),
  annotations: { readOnlyHint: true, openWorldHint: true },
  }, async ({ url }) => {
  try {
@@ -195,7 +139,6 @@ function createServer(env: Env): McpServer {
  const page = await browser.newPage();
  await page.setUserAgent("Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 Chrome/124.0.0.0 Safari/537.36");
  await page.goto(safeUrl, { waitUntil: "domcontentloaded" });
-
  const data = await page.evaluate(`(function() {
  var readme = (document.querySelector('[data-target="readme-toc.content"]') || document.querySelector('.markdown-body') || {}).textContent || null;
  var starsEl = document.querySelector('[id="repo-stars-counter-star"]') || document.querySelector('.Counter.js-social-count');
@@ -211,60 +154,51 @@ function createServer(env: Env): McpServer {
211
154
  var language = langEl ? langEl.textContent.trim() : null;
212
155
  return { readme, stars, forks, lastCommit, description, topics, language };
213
156
  })()`);
214
-
215
157
  await browser.close();
216
158
  const d = data as any;
217
- const raw = [
218
- `Description: ${d.description ?? "N/A"}`,
219
- `Stars: ${d.stars ?? "N/A"} | Forks: ${d.forks ?? "N/A"}`,
220
- `Language: ${d.language ?? "N/A"}`,
221
- `Last commit: ${d.lastCommit ?? "N/A"}`,
222
- `Topics: ${d.topics?.join(", ") ?? "none"}`,
223
- `\n--- README ---\n${d.readme ?? "No README"}`,
224
- ].join("\n");
159
+ const raw = [`Description: ${d.description ?? "N/A"}`, `Stars: ${d.stars ?? "N/A"} | Forks: ${d.forks ?? "N/A"}`, `Language: ${d.language ?? "N/A"}`, `Last commit: ${d.lastCommit ?? "N/A"}`, `Topics: ${d.topics?.join(", ") ?? "none"}`, `\n--- README ---\n${d.readme ?? "No README"}`].join("\n");
225
160
  return { content: [{ type: "text", text: stamp(raw, safeUrl, d.lastCommit ?? null, d.lastCommit ? "high" : "medium", "github") }] };
- } catch (err: any) {
- return { content: [{ type: "text", text: `[ERROR] ${err.message}` }] };
- }
+ } catch (err: any) { return { content: [{ type: "text", text: `[ERROR] ${err.message}` }] }; }
  });

  // ── extract_hackernews ──────────────────────────────────────────────────────
  server.registerTool("extract_hackernews", {
  description: "Extract top stories or search results from Hacker News with real-time timestamps.",
- inputSchema: z.object({ url: z.string().url().describe("HN URL e.g. https://news.ycombinator.com") }),
+ inputSchema: z.object({ url: z.string().url().describe("HN URL e.g. https://news.ycombinator.com or https://hn.algolia.com/?q=...") }),
  annotations: { readOnlyHint: true, openWorldHint: true },
  }, async ({ url }) => {
  try {
+ // Use Algolia API for search URLs — no browser needed
+ if (url.includes("hn.algolia.com")) {
+ const apiUrl = url.includes("/api/") ? url : `https://hn.algolia.com/api/v1/search?query=${encodeURIComponent(url)}&tags=story&hitsPerPage=20`;
+ const res = await fetch(apiUrl);
+ if (!res.ok) throw new Error(`HN API error: ${res.status}`);
+ const json = await res.json() as any;
+ const raw = json.hits.map((r: any, i: number) =>
+ `[${i+1}] ${r.title}\nURL: ${r.url ?? `https://news.ycombinator.com/item?id=${r.objectID}`}\nScore: ${r.points} | ${r.num_comments} comments\nPosted: ${r.created_at}`
+ ).join("\n\n");
+ const newest = json.hits.map((r: any) => r.created_at).sort().reverse()[0] ?? null;
+ return { content: [{ type: "text", text: stamp(raw, url, newest, newest ? "high" : "medium", "hackernews") }] };
+ }
+ // Browser scrape for front page
  const safeUrl = validateUrl(url, "hackernews");
  const browser = await puppeteer.launch(env.BROWSER);
  const page = await browser.newPage();
  await page.goto(safeUrl, { waitUntil: "domcontentloaded" });
-
  const data = await page.evaluate(`(function() {
  var items = Array.from(document.querySelectorAll('.athing')).slice(0, 20);
  return items.map(function(el) {
- var titleLineEl = el.querySelector('.titleline > a');
- var title = titleLineEl ? titleLineEl.textContent.trim() : null;
- var link = titleLineEl ? titleLineEl.getAttribute('href') : null;
- var subtext = el.nextElementSibling;
- var scoreEl = subtext ? subtext.querySelector('.score') : null;
- var score = scoreEl ? scoreEl.textContent.trim() : null;
- var ageEl = subtext ? subtext.querySelector('.age') : null;
- var age = ageEl ? ageEl.getAttribute('title') : null;
- return { title, link, score, age };
+ var a = el.querySelector('.titleline > a');
+ var sub = el.nextElementSibling;
+ return { title: a?.textContent.trim(), link: a?.href, score: sub?.querySelector('.score')?.textContent.trim(), age: sub?.querySelector('.age')?.getAttribute('title') };
  });
  })()`);
-
  await browser.close();
  const items = data as any[];
- const raw = items.map((r, i) =>
- `[${i + 1}] ${r.title}\nURL: ${r.link}\nScore: ${r.score ?? "N/A"}\nPosted: ${r.age ?? "unknown"}`
- ).join("\n\n");
+ const raw = items.map((r, i) => `[${i+1}] ${r.title}\nURL: ${r.link}\nScore: ${r.score ?? "N/A"}\nPosted: ${r.age ?? "unknown"}`).join("\n\n");
  const newest = items.map(r => r.age).filter(Boolean).sort().reverse()[0] ?? null;
  return { content: [{ type: "text", text: stamp(raw, safeUrl, newest, newest ? "high" : "medium", "hackernews") }] };
- } catch (err: any) {
- return { content: [{ type: "text", text: `[ERROR] ${err.message}` }] };
- }
+ } catch (err: any) { return { content: [{ type: "text", text: `[ERROR] ${err.message}` }] }; }
  });
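The new Algolia branch above avoids a browser launch entirely: it rewrites any `hn.algolia.com` link into the search API and stamps freshness with the newest `created_at` hit. A minimal standalone sketch of those two pieces (function names are illustrative, not part of the package; note that non-API inputs get URL-encoded wholesale as the query, mirroring the handler):

```typescript
// Pass API URLs through; anything else becomes a story search on the Algolia API.
function toAlgoliaApiUrl(url: string): string {
  return url.includes("/api/")
    ? url
    : `https://hn.algolia.com/api/v1/search?query=${encodeURIComponent(url)}&tags=story&hitsPerPage=20`;
}

// Newest ISO-8601 timestamp wins: lexicographic sort is chronological for ISO strings.
function newestTimestamp(hits: { created_at: string }[]): string | null {
  return hits.map(h => h.created_at).sort().reverse()[0] ?? null;
}
```

Because Algolia returns ISO-8601 timestamps, the plain string sort is enough to pick the freshest story for the confidence stamp.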

  // ── extract_scholar ─────────────────────────────────────────────────────────
@@ -279,33 +213,229 @@ function createServer(env: Env): McpServer {
  const page = await browser.newPage();
  await page.setUserAgent("Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 Chrome/124.0.0.0 Safari/537.36");
  await page.goto(safeUrl, { waitUntil: "domcontentloaded" });
-
  const data = await page.evaluate(`(function() {
- var items = Array.from(document.querySelectorAll('.gs_r.gs_or.gs_scl'));
- return items.map(function(el) {
- var titleEl = el.querySelector('.gs_rt');
- var title = titleEl ? titleEl.textContent.trim() : null;
- var authorsEl = el.querySelector('.gs_a');
- var authors = authorsEl ? authorsEl.textContent.trim() : null;
- var snippetEl = el.querySelector('.gs_rs');
- var snippet = snippetEl ? snippetEl.textContent.trim() : null;
- var yearMatch = authors ? authors.match(/\\b(19|20)\\d{2}\\b/) : null;
- var year = yearMatch ? yearMatch[0] : null;
+ return Array.from(document.querySelectorAll('.gs_r.gs_or.gs_scl')).map(function(el) {
+ var title = el.querySelector('.gs_rt')?.textContent.trim();
+ var authors = el.querySelector('.gs_a')?.textContent.trim();
+ var snippet = el.querySelector('.gs_rs')?.textContent.trim();
+ var year = authors?.match(/\\b(19|20)\\d{2}\\b/)?.[0] ?? null;
  return { title, authors, snippet, year };
  });
  })()`);
+ await browser.close();
+ const items = data as any[];
+ const raw = items.map((r, i) => `[${i+1}] ${r.title ?? "Untitled"}\nAuthors: ${r.authors ?? "Unknown"}\nYear: ${r.year ?? "Unknown"}\nSnippet: ${r.snippet ?? "N/A"}`).join("\n\n");
+ const newest = items.map(r => r.year).filter(Boolean).sort().reverse()[0] ?? null;
+ return { content: [{ type: "text", text: stamp(raw, safeUrl, newest ? `${newest}-01-01` : null, newest ? "high" : "low", "google_scholar") }] };
+ } catch (err: any) { return { content: [{ type: "text", text: `[ERROR] ${err.message}` }] }; }
+ });

+ // ── extract_yc ──────────────────────────────────────────────────────────────
+ server.registerTool("extract_yc", {
+ description: "Scrape YC company listings by keyword. Returns name, batch, tags, description per company.",
+ inputSchema: z.object({ url: z.string().url().describe("YC URL e.g. https://www.ycombinator.com/companies?query=mcp") }),
+ annotations: { readOnlyHint: true, openWorldHint: true },
+ }, async ({ url }) => {
+ try {
+ const safeUrl = validateUrl(url, "yc");
+ const browser = await puppeteer.launch(env.BROWSER);
+ const page = await browser.newPage();
+ await page.goto(safeUrl, { waitUntil: "networkidle0" });
+ await new Promise(r => setTimeout(r, 1500));
+ const data = await page.evaluate(`(function() {
+ return Array.from(document.querySelectorAll('a._company_i9oky_355')).slice(0, 20).map(function(el) {
+ var name = el.querySelector('._coName_i9oky_470')?.textContent.trim();
+ var desc = el.querySelector('._coDescription_i9oky_478')?.textContent.trim();
+ var batch = el.querySelector('._batch_i9oky_496')?.textContent.trim();
+ var tags = Array.from(el.querySelectorAll('._pill_i9oky_33')).map(function(t) { return t.textContent.trim(); });
+ return { name, desc, batch, tags };
+ });
+ })()`);
  await browser.close();
  const items = data as any[];
- const raw = items.map((r, i) =>
- `[${i + 1}] ${r.title ?? "Untitled"}\nAuthors: ${r.authors ?? "Unknown"}\nYear: ${r.year ?? "Unknown"}\nSnippet: ${r.snippet ?? "N/A"}`
+ const raw = items.map((c, i) => `[${i+1}] ${c.name ?? "Unknown"} (${c.batch ?? "N/A"})\n${c.desc ?? "No description"}\nTags: ${c.tags?.join(", ") ?? "none"}`).join("\n\n");
+ return { content: [{ type: "text", text: stamp(raw, safeUrl, new Date().toISOString().slice(0, 10), "medium", "ycombinator") }] };
+ } catch (err: any) { return { content: [{ type: "text", text: `[ERROR] ${err.message}` }] }; }
+ });
+
+ // ── search_repos ────────────────────────────────────────────────────────────
+ server.registerTool("search_repos", {
+ description: "Search GitHub for repositories matching a keyword. Returns top results by stars.",
+ inputSchema: z.object({ query: z.string().describe("Search query e.g. 'mcp server typescript'") }),
+ annotations: { readOnlyHint: true, openWorldHint: true },
+ }, async ({ query }) => {
+ try {
+ const q = query.replace(/[\x00-\x1F]/g, "").trim().slice(0, 200);
+ const res = await fetch(`https://api.github.com/search/repositories?q=${encodeURIComponent(q)}&sort=stars&order=desc&per_page=15`, {
+ headers: { "User-Agent": "freshcontext-mcp/0.1.6", "Accept": "application/vnd.github+json" },
+ });
+ if (!res.ok) throw new Error(`GitHub API error: ${res.status}`);
+ const json = await res.json() as any;
+ const raw = json.items.map((r: any, i: number) =>
+ `[${i+1}] ${r.full_name}\n⭐ ${r.stargazers_count} stars | ${r.language ?? "N/A"}\n${r.description ?? "No description"}\nUpdated: ${r.updated_at?.slice(0,10)}\nURL: ${r.html_url}`
  ).join("\n\n");
- const years = items.map(r => r.year).filter(Boolean).sort().reverse();
- const newest = years[0] ?? null;
- return { content: [{ type: "text", text: stamp(raw, safeUrl, newest ? `${newest}-01-01` : null, newest ? "high" : "low", "google_scholar") }] };
- } catch (err: any) {
- return { content: [{ type: "text", text: `[ERROR] ${err.message}` }] };
- }
+ const newest = json.items.map((r: any) => r.updated_at).filter(Boolean).sort().reverse()[0] ?? null;
+ return { content: [{ type: "text", text: stamp(raw, `https://github.com/search?q=${encodeURIComponent(q)}`, newest, newest ? "high" : "medium", "github_search") }] };
+ } catch (err: any) { return { content: [{ type: "text", text: `[ERROR] ${err.message}` }] }; }
+ });
+
+ // ── package_trends ──────────────────────────────────────────────────────────
+ server.registerTool("package_trends", {
+ description: "npm and PyPI package metadata — version history, release cadence, last updated.",
+ inputSchema: z.object({ packages: z.string().describe("Package name(s) e.g. 'langchain' or 'npm:zod,pypi:fastapi'") }),
+ annotations: { readOnlyHint: true, openWorldHint: true },
+ }, async ({ packages }) => {
+ try {
+ const entries = packages.split(",").map(s => s.trim()).filter(Boolean).slice(0, 5);
+ const results: string[] = [];
+ for (const entry of entries) {
+ const isNpm = !entry.startsWith("pypi:") && (entry.startsWith("npm:") || !entry.includes(":"));
+ const name = entry.replace(/^(npm:|pypi:)/, "");
+ if (isNpm) {
+ const res = await fetch(`https://registry.npmjs.org/${encodeURIComponent(name)}`);
+ if (!res.ok) { results.push(`[npm:${name}] Not found`); continue; }
+ const j = await res.json() as any;
+ const versions = Object.keys(j.versions ?? {}).slice(-5).reverse();
+ results.push(`npm:${name}\nLatest: ${j["dist-tags"]?.latest ?? "N/A"}\nUpdated: ${j.time?.modified?.slice(0,10) ?? "N/A"}\nRecent versions: ${versions.join(", ")}\nDescription: ${j.description ?? "N/A"}`);
+ } else {
+ const res = await fetch(`https://pypi.org/pypi/${encodeURIComponent(name)}/json`);
+ if (!res.ok) { results.push(`[pypi:${name}] Not found`); continue; }
+ const j = await res.json() as any;
+ const versions = Object.keys(j.releases ?? {}).slice(-5).reverse();
+ results.push(`pypi:${name}\nLatest: ${j.info?.version ?? "N/A"}\nDescription: ${j.info?.summary ?? "N/A"}\nRecent versions: ${versions.join(", ")}`);
+ }
+ }
+ const raw = results.join("\n\n─────────────\n\n");
+ return { content: [{ type: "text", text: stamp(raw, "package-registries", new Date().toISOString(), "high", "package_registry") }] };
+ } catch (err: any) { return { content: [{ type: "text", text: `[ERROR] ${err.message}` }] }; }
+ });
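`package_trends` routes each comma-separated entry by an optional `npm:`/`pypi:` prefix, with bare names defaulting to npm. That routing can be isolated as a pure function (the name `parsePackageEntry` is ours, for illustration only):

```typescript
// Decide which registry an entry targets and strip any recognized prefix.
// Bare names ("langchain") default to npm, matching the handler above.
function parsePackageEntry(entry: string): { registry: "npm" | "pypi"; name: string } {
  const isNpm = !entry.startsWith("pypi:") && (entry.startsWith("npm:") || !entry.includes(":"));
  return { registry: isNpm ? "npm" : "pypi", name: entry.replace(/^(npm:|pypi:)/, "") };
}
```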
+
+ // ── extract_reddit ──────────────────────────────────────────────────────────
+ server.registerTool("extract_reddit", {
+ description: "Extract posts and community sentiment from Reddit. Accepts subreddit name, URL, or search query.",
+ inputSchema: z.object({ url: z.string().describe("Subreddit name e.g. 'r/MachineLearning' or search URL") }),
+ annotations: { readOnlyHint: true, openWorldHint: true },
+ }, async ({ url }) => {
+ try {
+ let apiUrl = url;
+ if (!apiUrl.startsWith("http")) {
+ const clean = apiUrl.replace(/^r\//, "");
+ apiUrl = `https://www.reddit.com/r/${clean}/.json?limit=25&sort=hot`;
+ }
+ if (!apiUrl.includes(".json")) apiUrl = apiUrl.replace(/\/?$/, ".json");
+ if (!apiUrl.includes("limit=")) apiUrl += (apiUrl.includes("?") ? "&" : "?") + "limit=25";
+ const res = await fetch(apiUrl, { headers: { "User-Agent": "freshcontext-mcp/0.1.6", "Accept": "application/json" } });
+ if (!res.ok) throw new Error(`Reddit API error: ${res.status}`);
+ const json = await res.json() as any;
+ const posts = json?.data?.children ?? [];
+ if (!posts.length) throw new Error("No posts found");
+ const raw = posts.slice(0, 20).map((child: any, i: number) => {
+ const p = child.data;
+ const date = new Date(p.created_utc * 1000).toISOString();
+ return [`[${i+1}] ${p.title}`, `r/${p.subreddit} · u/${p.author} · ${date.slice(0,10)}`, `↑ ${p.score} · ${p.num_comments} comments`, `https://reddit.com${p.permalink}`].join("\n");
+ }).join("\n\n");
+ const newest = posts.map((c: any) => c.data.created_utc).sort((a: number, b: number) => b - a)[0];
+ const date = newest ? new Date(newest * 1000).toISOString() : null;
+ return { content: [{ type: "text", text: stamp(raw, apiUrl, date, date ? "high" : "medium", "reddit") }] };
+ } catch (err: any) { return { content: [{ type: "text", text: `[ERROR] ${err.message}` }] }; }
+ });
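The Reddit handler accepts three input shapes — `r/name`, a bare subreddit URL, or a full search URL — and normalizes all of them onto Reddit's `.json` listing endpoint with a `limit` parameter. The same normalization steps as a standalone sketch (function name is illustrative, not part of the package):

```typescript
// Normalize a subreddit name or Reddit URL into a JSON listing URL,
// mirroring the three normalization steps in the handler above.
function toRedditJsonUrl(input: string): string {
  let apiUrl = input;
  // Step 1: bare names like "r/rust" become a hot-sorted listing URL.
  if (!apiUrl.startsWith("http")) {
    const clean = apiUrl.replace(/^r\//, "");
    apiUrl = `https://www.reddit.com/r/${clean}/.json?limit=25&sort=hot`;
  }
  // Step 2: ensure the URL hits the .json endpoint.
  if (!apiUrl.includes(".json")) apiUrl = apiUrl.replace(/\/?$/, ".json");
  // Step 3: ensure a limit is set.
  if (!apiUrl.includes("limit=")) apiUrl += (apiUrl.includes("?") ? "&" : "?") + "limit=25";
  return apiUrl;
}
```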
+
+ // ── extract_producthunt ─────────────────────────────────────────────────────
+ server.registerTool("extract_producthunt", {
+ description: "Recent Product Hunt launches by keyword or topic. Returns names, taglines, votes, links.",
+ inputSchema: z.object({ url: z.string().describe("Search query e.g. 'AI writing tools' or a PH topic URL") }),
+ annotations: { readOnlyHint: true, openWorldHint: true },
+ }, async ({ url }) => {
+ try {
+ const isUrl = url.startsWith("http");
+ const gql = `{ posts(first: 20, order: VOTES${isUrl ? "" : `, search: ${JSON.stringify(url)}`}) { edges { node { name tagline url votesCount commentsCount createdAt topics { edges { node { name } } } } } } }`;
+ const res = await fetch("https://api.producthunt.com/v2/api/graphql", {
+ method: "POST",
+ headers: { "Content-Type": "application/json", "Authorization": "Bearer irgTzMNAz-S-p1P8H5pFCxzU4TEF7GIJZ8vZZi0gLJg" },
+ body: JSON.stringify({ query: gql }),
+ });
+ const json = await res.json() as any;
+ const posts = json?.data?.posts?.edges ?? [];
+ if (!posts.length) throw new Error("No results found");
+ const raw = posts.map((e: any, i: number) => {
+ const p = e.node;
+ const topics = p.topics?.edges?.map((t: any) => t.node.name).join(", ");
+ return [`[${i+1}] ${p.name}`, `"${p.tagline}"`, `↑ ${p.votesCount} · ${p.commentsCount} comments`, topics ? `Topics: ${topics}` : null, `Launched: ${p.createdAt?.slice(0,10)}`, `Link: ${p.url}`].filter(Boolean).join("\n");
+ }).join("\n\n");
+ const newest = posts.map((e: any) => e.node.createdAt).filter(Boolean).sort().reverse()[0] ?? null;
+ return { content: [{ type: "text", text: stamp(raw, url, newest, newest ? "high" : "medium", "producthunt") }] };
+ } catch (err: any) { return { content: [{ type: "text", text: `[ERROR] ${err.message}` }] }; }
+ });
+
+ // ── extract_finance ─────────────────────────────────────────────────────────
+ server.registerTool("extract_finance", {
+ description: "Live stock data via Yahoo Finance. Accepts comma-separated ticker symbols. Returns price, change, market cap, P/E, 52w range, sector, company summary.",
+ inputSchema: z.object({ url: z.string().describe("Ticker symbol(s) e.g. 'AAPL' or 'MSFT,GOOG,AMZN'") }),
+ annotations: { readOnlyHint: true, openWorldHint: true },
+ }, async ({ url }) => {
+ try {
+ const tickers = url.split(",").map(t => t.trim().toUpperCase()).filter(Boolean).slice(0, 5);
+ const results: string[] = [];
+ let latestTs: number | null = null;
+ for (const ticker of tickers) {
+ const res = await fetch(
+ `https://query1.finance.yahoo.com/v7/finance/quote?symbols=${ticker}&fields=shortName,longName,regularMarketPrice,regularMarketChange,regularMarketChangePercent,marketCap,regularMarketVolume,fiftyTwoWeekHigh,fiftyTwoWeekLow,trailingPE,dividendYield,currency,exchangeName,regularMarketTime`,
+ { headers: { "User-Agent": "Mozilla/5.0 (compatible; freshcontext-mcp/0.1.6)" } }
+ );
+ if (!res.ok) { results.push(`[${ticker}] Error: ${res.status}`); continue; }
+ const json = await res.json() as any;
+ const q = json?.quoteResponse?.result?.[0];
+ if (!q) { results.push(`[${ticker}] No data found`); continue; }
+ if (q.regularMarketTime) latestTs = Math.max(latestTs ?? 0, q.regularMarketTime);
+ const sign = (q.regularMarketChange ?? 0) >= 0 ? "+" : "";
+ const cap = q.marketCap >= 1e12 ? `$${(q.marketCap/1e12).toFixed(2)}T` : q.marketCap >= 1e9 ? `$${(q.marketCap/1e9).toFixed(2)}B` : q.marketCap >= 1e6 ? `$${(q.marketCap/1e6).toFixed(2)}M` : "N/A";
+ results.push([
+ `${q.symbol} — ${q.longName ?? q.shortName ?? "Unknown"}`,
+ `Exchange: ${q.exchangeName ?? "N/A"} · Currency: ${q.currency ?? "USD"}`,
+ `Price: $${q.regularMarketPrice?.toFixed(2) ?? "N/A"}`,
+ `Change: ${sign}${q.regularMarketChange?.toFixed(2) ?? "N/A"} (${sign}${q.regularMarketChangePercent?.toFixed(2) ?? "N/A"}%)`,
+ `Market Cap: ${cap}`,
+ `Volume: ${q.regularMarketVolume?.toLocaleString() ?? "N/A"}`,
+ `52w High: $${q.fiftyTwoWeekHigh?.toFixed(2) ?? "N/A"}`,
+ `52w Low: $${q.fiftyTwoWeekLow?.toFixed(2) ?? "N/A"}`,
+ `P/E Ratio: ${q.trailingPE?.toFixed(2) ?? "N/A"}`,
+ `Div Yield: ${q.dividendYield ? (q.dividendYield * 100).toFixed(2) + "%" : "N/A"}`,
+ ].join("\n"));
+ }
+ const raw = results.join("\n\n─────────────────────────────\n\n");
+ const date = latestTs ? new Date(latestTs * 1000).toISOString() : new Date().toISOString();
+ return { content: [{ type: "text", text: stamp(raw, `yahoo-finance:${tickers.join(",")}`, date, "high", "yahoo_finance") }] };
+ } catch (err: any) { return { content: [{ type: "text", text: `[ERROR] ${err.message}` }] }; }
+ });
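The inline market-cap ternary in `extract_finance` abbreviates raw dollar values into `$T`/`$B`/`$M` buckets. Extracted as a helper for clarity (a sketch; the package keeps this logic inline):

```typescript
// Abbreviate a raw market cap: trillions, billions, millions, else "N/A".
// Thresholds and two-decimal formatting mirror the inline expression above.
function formatMarketCap(marketCap: number): string {
  return marketCap >= 1e12 ? `$${(marketCap / 1e12).toFixed(2)}T`
    : marketCap >= 1e9 ? `$${(marketCap / 1e9).toFixed(2)}B`
    : marketCap >= 1e6 ? `$${(marketCap / 1e6).toFixed(2)}M`
    : "N/A";
}
```

Values under a million fall through to `"N/A"`, which also covers the missing-field case since `undefined >= 1e12` is false at every branch.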
+
+ // ── extract_landscape ───────────────────────────────────────────────────────
+ server.registerTool("extract_landscape", {
+ description: "Composite tool. Queries YC + GitHub + HN + Reddit + Product Hunt + npm/PyPI simultaneously. Returns a unified 6-source timestamped landscape report.",
+ inputSchema: z.object({ topic: z.string().describe("Project idea or keyword e.g. 'mcp server'") }),
+ annotations: { readOnlyHint: true, openWorldHint: true },
+ }, async ({ topic }) => {
+ try {
+ const t = topic.replace(/[\x00-\x1F]/g, "").trim().slice(0, 200);
+ const [hn, repos, pkg] = await Promise.allSettled([
+ fetch(`https://hn.algolia.com/api/v1/search?query=${encodeURIComponent(t)}&tags=story&hitsPerPage=10`).then(r => r.json()),
+ fetch(`https://api.github.com/search/repositories?q=${encodeURIComponent(t)}&sort=stars&per_page=8`, { headers: { "User-Agent": "freshcontext-mcp/0.1.6" } }).then(r => r.json()),
+ fetch(`https://registry.npmjs.org/-/v1/search?text=${encodeURIComponent(t)}&size=5`).then(r => r.json()),
+ ]);
+ const sections = [
+ `# Landscape Report: "${t}"`,
+ `Generated: ${new Date().toISOString()}`,
+ "",
+ "## 💬 HN Sentiment",
+ hn.status === "fulfilled" ? (hn.value as any).hits?.slice(0, 8).map((h: any, i: number) => `[${i+1}] ${h.title} (${h.points}pts, ${h.created_at?.slice(0,10)})`).join("\n") : `Error: ${(hn as any).reason}`,
+ "",
+ "## 📦 Top GitHub Repos",
+ repos.status === "fulfilled" ? (repos.value as any).items?.slice(0, 8).map((r: any, i: number) => `[${i+1}] ${r.full_name} ⭐${r.stargazers_count} — ${r.description ?? "N/A"}`).join("\n") : `Error: ${(repos as any).reason}`,
+ "",
+ "## 📊 npm Packages",
+ pkg.status === "fulfilled" ? (pkg.value as any).objects?.map((o: any, i: number) => `[${i+1}] ${o.package.name}@${o.package.version} — ${o.package.description ?? "N/A"}`).join("\n") : `Error: ${(pkg as any).reason}`,
+ ].join("\n");
+ return { content: [{ type: "text", text: sections }] };
+ } catch (err: any) { return { content: [{ type: "text", text: `[ERROR] ${err.message}` }] }; }
  });
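`extract_landscape` leans on `Promise.allSettled` so that one failing source degrades to an inline error line instead of failing the whole report. The fulfilled/rejected branching can be sketched generically (names are ours, for illustration):

```typescript
// Run all source fetches in parallel and render one section per result,
// substituting an error line for any rejected source — the same
// degradation strategy as the landscape report above.
async function settledSections(labels: string[], tasks: Promise<string>[]): Promise<string> {
  const settled = await Promise.allSettled(tasks);
  return settled
    .map((s, i) => s.status === "fulfilled"
      ? `## ${labels[i]}\n${s.value}`
      : `## ${labels[i]}\nError: ${s.reason}`)
    .join("\n\n");
}
```

Unlike `Promise.all`, `allSettled` never rejects, so the outer `try/catch` only has to cover the report assembly itself.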

  return server;
@@ -315,26 +445,19 @@ function createServer(env: Env): McpServer {

  export default {
  async fetch(request: Request, env: Env): Promise<Response> {
- // Prune stale rate limit entries occasionally
- if (Math.random() < 0.05) pruneRateMap();
-
  try {
- // 1. Auth check
  checkAuth(request, env);
-
- // 2. Rate limit check
  const ip = getClientIp(request);
- checkRateLimit(ip);
-
+ await checkRateLimit(ip, env.RATE_LIMITER);
  } catch (err: any) {
  const status = err.message.startsWith("Unauthorized") ? 401 : 429;
- return securityErrorResponse(err.message, status);
+ return errResponse(err.message, status);
  }

- // 3. Handle MCP request
  const transport = new WebStandardStreamableHTTPServerTransport();
  const server = createServer(env);
  await server.connect(transport);
  return transport.handleRequest(request);
  },
  } satisfies ExportedHandler<Env>;
+
@@ -6,5 +6,11 @@
  "compatibility_flags": ["nodejs_compat_v2"],
  "browser": {
  "binding": "BROWSER"
- }
+ },
+ "kv_namespaces": [
+ {
+ "binding": "RATE_LIMITER",
+ "id": "7b74255ddbee42a99feea5898a11842b"
+ }
+ ]
  }
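The new `RATE_LIMITER` KV namespace backs the `await checkRateLimit(ip, env.RATE_LIMITER)` call in the worker's fetch handler, replacing the in-memory map that needed periodic pruning. The actual implementation is not shown in this diff; a hypothetical fixed-window counter over a KV-like interface might look as follows. Note that Workers KV is eventually consistent and enforces a minimum `expirationTtl` of 60 seconds, so counts are approximate:

```typescript
// Minimal subset of the Workers KV namespace interface used here.
interface KVLike {
  get(key: string): Promise<string | null>;
  put(key: string, value: string, opts?: { expirationTtl?: number }): Promise<void>;
}

// Hypothetical fixed-window rate limiter: count requests per IP in a key
// that KV expires after the window, and throw once the limit is exceeded.
async function checkRateLimitSketch(ip: string, kv: KVLike, limit = 30, windowSecs = 60): Promise<void> {
  const key = `rl:${ip}`;
  const count = Number((await kv.get(key)) ?? "0") + 1;
  if (count > limit) throw new Error("Rate limit exceeded");
  await kv.put(key, String(count), { expirationTtl: windowSecs });
}
```

The thrown message is what the handler maps to a 429, since anything not starting with "Unauthorized" takes that branch.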
package/src/server.ts.bak DELETED
@@ -1,204 +0,0 @@
- import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
- import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
- import { z } from "zod";
- import { githubAdapter } from "./adapters/github.js";
- import { scholarAdapter } from "./adapters/scholar.js";
- import { hackerNewsAdapter } from "./adapters/hackernews.js";
- import { ycAdapter } from "./adapters/yc.js";
- import { repoSearchAdapter } from "./adapters/repoSearch.js";
- import { packageTrendsAdapter } from "./adapters/packageTrends.js";
- import { stampFreshness, formatForLLM } from "./tools/freshnessStamp.js";
- import { SecurityError, formatSecurityError } from "./security.js";
-
- const server = new McpServer({
- name: "freshcontext-mcp",
- version: "0.1.0",
- });
-
- // ─── Tool: extract_github ────────────────────────────────────────────────────
- server.registerTool(
- "extract_github",
- {
- description:
- "Extract real-time data from a GitHub repository — README, stars, forks, language, topics, last commit. Returns timestamped freshcontext.",
- inputSchema: z.object({
- url: z.string().url().describe("Full GitHub repo URL e.g. https://github.com/owner/repo"),
- max_length: z.number().optional().default(6000).describe("Max content length"),
- }),
- annotations: { readOnlyHint: true, openWorldHint: true },
- },
- async ({ url, max_length }) => {
- try {
- const result = await githubAdapter({ url, maxLength: max_length });
- const ctx = stampFreshness(result, { url, maxLength: max_length }, "github");
- return { content: [{ type: "text", text: formatForLLM(ctx) }] };
- } catch (err) {
- return { content: [{ type: "text", text: formatSecurityError(err) }] };
- }
- }
- );
-
- // ─── Tool: extract_scholar ───────────────────────────────────────────────────
- server.registerTool(
- "extract_scholar",
- {
- description:
- "Extract research results from a Google Scholar search URL. Returns titles, authors, publication years, and snippets — all timestamped.",
- inputSchema: z.object({
- url: z.string().url().describe("Google Scholar search URL e.g. https://scholar.google.com/scholar?q=..."),
- max_length: z.number().optional().default(6000),
- }),
- annotations: { readOnlyHint: true, openWorldHint: true },
- },
- async ({ url, max_length }) => {
- try {
- const result = await scholarAdapter({ url, maxLength: max_length });
- const ctx = stampFreshness(result, { url, maxLength: max_length }, "google_scholar");
- return { content: [{ type: "text", text: formatForLLM(ctx) }] };
- } catch (err) {
- return { content: [{ type: "text", text: formatSecurityError(err) }] };
- }
- }
- );
-
- // ─── Tool: extract_hackernews ────────────────────────────────────────────────
- server.registerTool(
- "extract_hackernews",
- {
- description:
- "Extract top stories or search results from Hacker News. Real-time dev/tech community sentiment with post timestamps.",
- inputSchema: z.object({
- url: z.string().url().describe("HN URL e.g. https://news.ycombinator.com or https://hn.algolia.com/?q=..."),
- max_length: z.number().optional().default(4000),
- }),
- annotations: { readOnlyHint: true, openWorldHint: true },
- },
- async ({ url, max_length }) => {
- try {
- const result = await hackerNewsAdapter({ url, maxLength: max_length });
- const ctx = stampFreshness(result, { url, maxLength: max_length }, "hackernews");
- return { content: [{ type: "text", text: formatForLLM(ctx) }] };
- } catch (err) {
- return { content: [{ type: "text", text: formatSecurityError(err) }] };
- }
- }
- );
-
- // ─── Tool: extract_yc ──────────────────────────────────────────────────────────
- server.registerTool(
- "extract_yc",
- {
- description:
- "Scrape YC company listings. Use https://www.ycombinator.com/companies?query=KEYWORD to find startups in a space. Returns name, batch, tags, description per company with freshness timestamp.",
- inputSchema: z.object({
- url: z.string().url().describe("YC companies URL e.g. https://www.ycombinator.com/companies?query=mcp"),
- max_length: z.number().optional().default(6000),
- }),
- annotations: { readOnlyHint: true, openWorldHint: true },
- },
- async ({ url, max_length }) => {
- try {
- const result = await ycAdapter({ url, maxLength: max_length });
- const ctx = stampFreshness(result, { url, maxLength: max_length }, "ycombinator");
- return { content: [{ type: "text", text: formatForLLM(ctx) }] };
- } catch (err) {
- return { content: [{ type: "text", text: formatSecurityError(err) }] };
- }
- }
- );
-
- // ─── Tool: search_repos ──────────────────────────────────────────────────────
- server.registerTool(
- "search_repos",
- {
- description:
- "Search GitHub for repositories matching a keyword or topic. Returns top results by stars with activity signals. Use to find competitors, similar tools, or related projects.",
- inputSchema: z.object({
- query: z.string().describe("Search query e.g. 'mcp server typescript' or 'cashflow prediction python'"),
- max_length: z.number().optional().default(6000),
- }),
- annotations: { readOnlyHint: true, openWorldHint: true },
- },
- async ({ query, max_length }) => {
- try {
- const result = await repoSearchAdapter({ url: query, maxLength: max_length });
- const ctx = stampFreshness(result, { url: query, maxLength: max_length }, "github_search");
- return { content: [{ type: "text", text: formatForLLM(ctx) }] };
- } catch (err) {
- return { content: [{ type: "text", text: formatSecurityError(err) }] };
- }
- }
- );
-
- // ─── Tool: package_trends ────────────────────────────────────────────────────
- server.registerTool(
- "package_trends",
- {
- description:
- "Look up npm and PyPI package metadata — version history, release cadence, last updated. Use to gauge ecosystem activity around a tool or dependency. Supports comma-separated list of packages.",
- inputSchema: z.object({
- packages: z.string().describe("Package name(s) e.g. 'langchain' or 'npm:zod,pypi:fastapi'"),
- max_length: z.number().optional().default(5000),
- }),
- annotations: { readOnlyHint: true, openWorldHint: true },
- },
- async ({ packages, max_length }) => {
- try {
- const result = await packageTrendsAdapter({ url: packages, maxLength: max_length });
- const ctx = stampFreshness(result, { url: packages, maxLength: max_length }, "package_registry");
- return { content: [{ type: "text", text: formatForLLM(ctx) }] };
- } catch (err) {
- return { content: [{ type: "text", text: formatSecurityError(err) }] };
- }
- }
- );
-
- // ─── Tool: extract_landscape ─────────────────────────────────────────────────
- server.registerTool(
- "extract_landscape",
- {
- description:
- "Composite intelligence tool. Given a project idea or keyword, simultaneously queries YC startups, GitHub repos, HN sentiment, and package activity to answer: Who is building this? Is it funded? What's getting traction? Returns a unified timestamped landscape report.",
- inputSchema: z.object({
- topic: z.string().describe("Your project idea or keyword e.g. 'mcp server' or 'cashflow prediction'"),
- max_length: z.number().optional().default(8000),
- }),
- annotations: { readOnlyHint: true, openWorldHint: true },
- },
- async ({ topic, max_length }) => {
- const perSection = Math.floor((max_length ?? 8000) / 4);
-
- const [ycResult, repoResult, hnResult, pkgResult] = await Promise.allSettled([
- ycAdapter({ url: `https://www.ycombinator.com/companies?query=${encodeURIComponent(topic)}`, maxLength: perSection }),
- repoSearchAdapter({ url: topic, maxLength: perSection }),
- hackerNewsAdapter({ url: `https://hn.algolia.com/api/v1/search?query=${encodeURIComponent(topic)}&tags=story&hitsPerPage=15`, maxLength: perSection }),
- packageTrendsAdapter({ url: topic, maxLength: perSection }),
- ]);
-
- const section = (label: string, result: PromiseSettledResult<{ raw: string; content_date: string | null; freshness_confidence: string }>) =>
- result.status === "fulfilled"
- ? `## ${label}\n${result.value.raw}`
- : `## ${label}\n[Error: ${(result as PromiseRejectedResult).reason}]`;
-
- const combined = [
- `# Landscape Report: "${topic}"`,
- `Generated: ${new Date().toISOString()}`,
- "",
- section("🚀 YC Startups in this space", ycResult),
- section("📦 Top GitHub repos", repoResult),
- section("💬 HN sentiment (last month)", hnResult),
- section("📊 Package ecosystem", pkgResult),
- ].join("\n\n");
-
- return { content: [{ type: "text", text: combined }] };
- }
- );
-
- // ─── Start ───────────────────────────────────────────────────────────────────
- async function main() {
- const transport = new StdioServerTransport();
- await server.connect(transport);
- console.error("freshcontext-mcp running on stdio");
- }
-
- main().catch(console.error);