freshcontext-mcp 0.3.1 → 0.3.3
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/.actor/actor.json +8 -0
- package/.actor/output_schema.json +13 -0
- package/FRESHCONTEXT_SPEC.md +178 -0
- package/ROADMAP.md +174 -0
- package/dataset_schema.json +41 -0
- package/dist/adapters/changelog.js +207 -0
- package/dist/adapters/govcontracts.js +196 -0
- package/dist/server.js +38 -0
- package/input_schema.json +49 -0
- package/package.json +1 -1
|
@@ -0,0 +1,13 @@
|
|
|
1
|
+
{
|
|
2
|
+
"actorOutputSchemaVersion": 1,
|
|
3
|
+
"title": "FreshContext MCP Output",
|
|
4
|
+
"description": "Timestamped web intelligence results wrapped in FreshContext envelopes.",
|
|
5
|
+
"properties": {
|
|
6
|
+
"results": {
|
|
7
|
+
"type": "string",
|
|
8
|
+
"title": "Results",
|
|
9
|
+
"description": "FreshContext envelopes with content, source URL, retrieval timestamp, and freshness confidence.",
|
|
10
|
+
"template": "{{links.apiDefaultDatasetUrl}}/items"
|
|
11
|
+
}
|
|
12
|
+
}
|
|
13
|
+
}
|
|
@@ -0,0 +1,178 @@
|
|
|
1
|
+
# The FreshContext Specification
|
|
2
|
+
**Version 1.0 — March 2026**
|
|
3
|
+
*Authored by Immanuel Gabriel (Prince Gabriel) — Grootfontein, Namibia*
|
|
4
|
+
|
|
5
|
+
---
|
|
6
|
+
|
|
7
|
+
## What This Is
|
|
8
|
+
|
|
9
|
+
The FreshContext Specification defines a standard envelope format for AI-retrieved web data.
|
|
10
|
+
|
|
11
|
+
It exists to solve one problem: **AI models present stale data with the same confidence as fresh data, and users have no way to tell the difference.**
|
|
12
|
+
|
|
13
|
+
FreshContext fixes this by wrapping every piece of retrieved content in a structured envelope that carries three guarantees:
|
|
14
|
+
|
|
15
|
+
1. **When** the data was retrieved (exact ISO 8601 timestamp)
|
|
16
|
+
2. **Where** it came from (canonical source URL)
|
|
17
|
+
3. **How confident** we are that the content date is accurate (freshness confidence)
|
|
18
|
+
|
|
19
|
+
Any tool, agent, or system that implements this spec is **FreshContext-compatible**.
|
|
20
|
+
|
|
21
|
+
---
|
|
22
|
+
|
|
23
|
+
## The Envelope Format
|
|
24
|
+
|
|
25
|
+
Every FreshContext-compatible response MUST wrap its content in the following envelope:
|
|
26
|
+
|
|
27
|
+
```
|
|
28
|
+
[FRESHCONTEXT]
|
|
29
|
+
Source: <canonical_url>
|
|
30
|
+
Published: <content_date_or_"unknown">
|
|
31
|
+
Retrieved: <iso8601_timestamp>
|
|
32
|
+
Confidence: <high|medium|low>
|
|
33
|
+
---
|
|
34
|
+
<content>
|
|
35
|
+
[/FRESHCONTEXT]
|
|
36
|
+
```
|
|
37
|
+
|
|
38
|
+
### Field Definitions
|
|
39
|
+
|
|
40
|
+
| Field | Required | Format | Description |
|
|
41
|
+
|---|---|---|---|
|
|
42
|
+
| `Source` | Yes | Valid URL | The canonical URL of the original source |
|
|
43
|
+
| `Published` | Yes | ISO 8601 date or `"unknown"` | Best estimate of when the content was originally published |
|
|
44
|
+
| `Retrieved` | Yes | ISO 8601 datetime with timezone | Exact timestamp when this data was fetched |
|
|
45
|
+
| `Confidence` | Yes | `high`, `medium`, or `low` | Confidence level of the `Published` date estimate |
|
|
46
|
+
|
|
47
|
+
---
|
|
48
|
+
|
|
49
|
+
## Confidence Levels
|
|
50
|
+
|
|
51
|
+
### `high`
|
|
52
|
+
The publication date was sourced from a structured, machine-readable field — an API response, HTML metadata tag, RSS feed, or official timestamp. The date is reliable.
|
|
53
|
+
|
|
54
|
+
*Examples: GitHub API `pushed_at`, arXiv submission date, Hacker News `created_at`*
|
|
55
|
+
|
|
56
|
+
### `medium`
|
|
57
|
+
The publication date was inferred from page signals — visible date strings, URL patterns, or content heuristics. Likely correct but not guaranteed.
|
|
58
|
+
|
|
59
|
+
*Examples: Blog post date parsed from HTML, URL containing `/2025/03/`, footer copyright year*
|
|
60
|
+
|
|
61
|
+
### `low`
|
|
62
|
+
No reliable date signal was found. The date is an estimate based on indirect signals or is entirely unknown.
|
|
63
|
+
|
|
64
|
+
*Examples: Static page with no date, scraped content with no metadata, cached result of unknown age*
|
|
65
|
+
|
|
66
|
+
---
|
|
67
|
+
|
|
68
|
+
## Structured Form (JSON)
|
|
69
|
+
|
|
70
|
+
Implementations MAY additionally expose freshness metadata as structured JSON alongside the text envelope:
|
|
71
|
+
|
|
72
|
+
```json
|
|
73
|
+
{
|
|
74
|
+
"freshcontext": {
|
|
75
|
+
"source_url": "https://github.com/owner/repo",
|
|
76
|
+
"content_date": "2026-03-05",
|
|
77
|
+
"retrieved_at": "2026-03-16T09:19:00.000Z",
|
|
78
|
+
"freshness_confidence": "high",
|
|
79
|
+
"adapter": "github",
|
|
80
|
+
"freshness_score": 94
|
|
81
|
+
},
|
|
82
|
+
"content": "..."
|
|
83
|
+
}
|
|
84
|
+
```
|
|
85
|
+
|
|
86
|
+
### `freshness_score` (optional)
|
|
87
|
+
|
|
88
|
+
A numeric representation of data freshness from 0–100, calculated as:
|
|
89
|
+
|
|
90
|
+
```
|
|
91
|
+
freshness_score = max(0, 100 - (days_since_retrieved × decay_rate))
|
|
92
|
+
```
|
|
93
|
+
|
|
94
|
+
Where `decay_rate` defaults to `1.5` for general web content. Implementations MAY use domain-specific decay rates (e.g., financial data decays faster than academic papers).
|
|
95
|
+
|
|
96
|
+
| Score | Interpretation |
|
|
97
|
+
|---|---|
|
|
98
|
+
| 90–100 | Retrieved within hours — treat as current |
|
|
99
|
+
| 70–89 | Retrieved within days — reliable for most uses |
|
|
100
|
+
| 50–69 | Retrieved within weeks — verify before acting |
|
|
101
|
+
| Below 50 | Retrieved more than a month ago — use with caution |
|
|
102
|
+
|
|
103
|
+
---
|
|
104
|
+
|
|
105
|
+
## Adapter Contract
|
|
106
|
+
|
|
107
|
+
Any data source that feeds into a FreshContext-compatible system is called an **adapter**. Adapters MUST:
|
|
108
|
+
|
|
109
|
+
1. Return raw content plus a `content_date` (or `null` if unknown)
|
|
110
|
+
2. Set a `freshness_confidence` level based on how the date was determined
|
|
111
|
+
3. Never fabricate or forward-date content timestamps
|
|
112
|
+
4. Clearly identify which source system produced the data via the `adapter` field
|
|
113
|
+
|
|
114
|
+
Adapters SHOULD:
|
|
115
|
+
|
|
116
|
+
- Prefer structured API sources over scraped content when both are available
|
|
117
|
+
- Log retrieval errors without silently returning cached or stale data
|
|
118
|
+
- Surface rate-limit or access-denied errors explicitly rather than returning empty content
|
|
119
|
+
|
|
120
|
+
---
|
|
121
|
+
|
|
122
|
+
## Why This Matters for AI Agents
|
|
123
|
+
|
|
124
|
+
Large language models have no internal clock. When an agent retrieves web data, it cannot distinguish between something published this morning and something published three years ago — unless that information is explicitly surfaced.
|
|
125
|
+
|
|
126
|
+
Without FreshContext (or equivalent):
|
|
127
|
+
- An agent recommending job listings may recommend roles that no longer exist
|
|
128
|
+
- An agent summarising market trends may cite conditions from a previous cycle
|
|
129
|
+
- An agent checking a competitor's pricing may act on outdated information
|
|
130
|
+
|
|
131
|
+
With FreshContext:
|
|
132
|
+
- Every piece of retrieved data carries its own timestamp
|
|
133
|
+
- The agent can reason about data age before acting
|
|
134
|
+
- Users can see exactly how fresh their AI's information is
|
|
135
|
+
|
|
136
|
+
---
|
|
137
|
+
|
|
138
|
+
## Compatibility
|
|
139
|
+
|
|
140
|
+
A tool, server, or API is **FreshContext-compatible** if:
|
|
141
|
+
|
|
142
|
+
- Its responses include the `[FRESHCONTEXT]...[/FRESHCONTEXT]` envelope, OR
|
|
143
|
+
- Its responses include the structured JSON form with `freshcontext.retrieved_at` and `freshcontext.freshness_confidence` fields
|
|
144
|
+
|
|
145
|
+
Partial implementations that include only `retrieved_at` without `freshness_confidence` are considered **FreshContext-aware** but not fully compatible.
|
|
146
|
+
|
|
147
|
+
---
|
|
148
|
+
|
|
149
|
+
## Reference Implementation
|
|
150
|
+
|
|
151
|
+
The canonical reference implementation of this specification is:
|
|
152
|
+
|
|
153
|
+
**freshcontext-mcp** — an MCP server with 11 adapters covering GitHub, Hacker News, Google Scholar, arXiv, Reddit, YC Companies, Product Hunt, npm/PyPI, financial markets, job search, and a composite landscape tool.
|
|
154
|
+
|
|
155
|
+
- npm: `freshcontext-mcp`
|
|
156
|
+
- GitHub: https://github.com/PrinceGabriel-lgtm/freshcontext-mcp
|
|
157
|
+
- Cloud endpoint: `https://freshcontext-mcp.gimmanuel73.workers.dev/mcp`
|
|
158
|
+
|
|
159
|
+
---
|
|
160
|
+
|
|
161
|
+
## Versioning
|
|
162
|
+
|
|
163
|
+
This document is version 1.0 of the FreshContext Specification.
|
|
164
|
+
|
|
165
|
+
Future versions will be tagged in this repository. Breaking changes to the envelope format will increment the major version. Additive changes (new optional fields, new confidence levels) will increment the minor version.
|
|
166
|
+
|
|
167
|
+
---
|
|
168
|
+
|
|
169
|
+
## License
|
|
170
|
+
|
|
171
|
+
This specification is published under the MIT License.
|
|
172
|
+
Implementations may be proprietary or open source.
|
|
173
|
+
Attribution to the FreshContext Specification is appreciated but not required.
|
|
174
|
+
|
|
175
|
+
---
|
|
176
|
+
|
|
177
|
+
*"The work isn't gone. It's just waiting to be continued."*
|
|
178
|
+
*— Prince Gabriel, Grootfontein, Namibia*
|
package/ROADMAP.md
ADDED
|
@@ -0,0 +1,174 @@
|
|
|
1
|
+
# FreshContext Roadmap
|
|
2
|
+
|
|
3
|
+
> *This document describes what FreshContext is becoming — not just what it is today.*
|
|
4
|
+
> *Built by Prince Gabriel — Grootfontein, Namibia 🇳🇦*
|
|
5
|
+
|
|
6
|
+
---
|
|
7
|
+
|
|
8
|
+
## Where We Are Today
|
|
9
|
+
|
|
10
|
+
FreshContext is a working, deployed, monetized web intelligence engine for AI agents.
|
|
11
|
+
|
|
12
|
+
**What's live and functional right now:**
|
|
13
|
+
|
|
14
|
+
- 11 MCP adapters — GitHub, Hacker News, Google Scholar, arXiv, Reddit, YC Companies, Product Hunt, npm/PyPI trends, finance, job search, and `extract_landscape` (all 6 sources in one call)
|
|
15
|
+
- Cloudflare Worker deployed globally at the edge with KV caching and rate limiting
|
|
16
|
+
- D1 database with 18 active watched queries running on a 6-hour cron schedule
|
|
17
|
+
- `GET /briefing` and `POST /briefing/now` endpoints for scheduled AI synthesis (synthesis paused pending Anthropic credits — infrastructure fully built)
|
|
18
|
+
- Listed on npm (`freshcontext-mcp@0.3.1`) and the official MCP Registry
|
|
19
|
+
- Published FreshContext Specification v1.0 — the standard this project is authoring
|
|
20
|
+
- Apify Store listing pending approval (account under manual review)
|
|
21
|
+
|
|
22
|
+
---
|
|
23
|
+
|
|
24
|
+
## Layer 5 — Dashboard (Next Build)
|
|
25
|
+
|
|
26
|
+
**Status: Designed, not yet built**
|
|
27
|
+
|
|
28
|
+
A React frontend that makes the intelligence pipeline visible and beautiful.
|
|
29
|
+
|
|
30
|
+
The dashboard pulls from live endpoints already built:
|
|
31
|
+
|
|
32
|
+
- `GET /briefing` → renders the latest AI-generated briefing with per-adapter sections
|
|
33
|
+
- `POST /briefing/now` → force-triggers a fresh synthesis on demand
|
|
34
|
+
- `GET /watched-queries` → manage what topics are being monitored
|
|
35
|
+
- User profile editor → update skills, targets, and context that shape briefing personalization
|
|
36
|
+
|
|
37
|
+
**Design targets:**
|
|
38
|
+
- Freshness confidence indicators on every source card (high/medium/low with color coding)
|
|
39
|
+
- Briefing history timeline showing how signal has evolved over time
|
|
40
|
+
- Watched query manager — add, pause, delete, and score queries by signal quality
|
|
41
|
+
- "Force refresh" button with live streaming output
|
|
42
|
+
|
|
43
|
+
**Deployment:** Cloudflare Pages — stays entirely within the Cloudflare free tier ecosystem.
|
|
44
|
+
|
|
45
|
+
---
|
|
46
|
+
|
|
47
|
+
## Layer 6 — Personalization Engine
|
|
48
|
+
|
|
49
|
+
**Status: Schema designed in D1, logic not yet built**
|
|
50
|
+
|
|
51
|
+
The `user_profiles` table already exists in D1 with fields for skills, certifications, targets, location, and context. The synthesis prompt already uses this data. What's missing is the user-facing surface:
|
|
52
|
+
|
|
53
|
+
- Onboarding flow — build your profile in the app in under 3 minutes
|
|
54
|
+
- Multiple profiles — team mode where each member gets their own briefing
|
|
55
|
+
- Custom briefing schedules — not just every 6h, but user-defined intervals
|
|
56
|
+
- Notification delivery — push briefings to Slack, email, or SMS via webhook
|
|
57
|
+
|
|
58
|
+
---
|
|
59
|
+
|
|
60
|
+
## Layer 7 — Watched Query Intelligence
|
|
61
|
+
|
|
62
|
+
**Status: Data accumulating, intelligence layer not yet built**
|
|
63
|
+
|
|
64
|
+
Every query run leaves a result in `scrape_results`. Over time this becomes a dataset with genuine historical value. The intelligence layer turns it into signal:
|
|
65
|
+
|
|
66
|
+
- **Relevance scoring** — each result is scored against the user profile (0–100) before inclusion in briefings
|
|
67
|
+
- **Deduplication** — same story appearing on HN and Reddit counts as one signal, not two
|
|
68
|
+
- **Query performance scoring** — which watched queries are generating signal vs. noise? Surface the top performers.
|
|
69
|
+
- **Smart suggestions** — "Based on your profile, you should also watch: mcp server rust, cloudflare workers ai"
|
|
70
|
+
- **Trend detection** — alert when a topic spikes across multiple adapters simultaneously
|
|
71
|
+
|
|
72
|
+
---
|
|
73
|
+
|
|
74
|
+
## Layer 8 — New Adapters
|
|
75
|
+
|
|
76
|
+
**Status: Planned, prioritised by acquisition value**
|
|
77
|
+
|
|
78
|
+
These adapters extend FreshContext into new intelligence categories with zero API key requirements:
|
|
79
|
+
|
|
80
|
+
| Adapter | Source | What it adds |
|
|
81
|
+
|---|---|---|
|
|
82
|
+
| `extract_devto` | dev.to public API | Developer article sentiment with clean publish dates |
|
|
83
|
+
| `extract_changelog` | Any `/changelog` or `/releases` URL | Track any product's update cadence |
|
|
84
|
+
| `extract_crunchbase_free` | Crunchbase public feed | Funding announcements with date signals |
|
|
85
|
+
| `extract_govcontracts` | USASpending.gov API | Government contract awards — unique GTM signal |
|
|
86
|
+
| `extract_npm_releases` | npm registry API | Package release velocity and adoption signals |
|
|
87
|
+
| `extract_twitter_trends` | Nitter public endpoints | Real-time trending topics with no auth |
|
|
88
|
+
| `extract_linkedin_jobs` | LinkedIn public job search | Job freshness — the origin story, completed |
|
|
89
|
+
|
|
90
|
+
The `extract_changelog` and `extract_govcontracts` adapters are not available in any other MCP server. They represent a genuine capability gap in the market.
|
|
91
|
+
|
|
92
|
+
---
|
|
93
|
+
|
|
94
|
+
## Layer 9 — The Freshness Score Standard
|
|
95
|
+
|
|
96
|
+
**Status: Spec written (FRESHCONTEXT_SPEC.md), numeric score not yet implemented**
|
|
97
|
+
|
|
98
|
+
The FreshContext Specification v1.0 defines an optional `freshness_score` field (0–100) calculated as:
|
|
99
|
+
|
|
100
|
+
```
|
|
101
|
+
freshness_score = max(0, 100 - (days_since_retrieved × decay_rate))
|
|
102
|
+
```
|
|
103
|
+
|
|
104
|
+
Domain-specific decay rates will allow different categories of data to age at appropriate speeds:
|
|
105
|
+
|
|
106
|
+
| Category | Decay Rate | Half-life |
|
|
107
|
+
|---|---|---|
|
|
108
|
+
| Financial data | 5.0 | ~10 days |
|
|
109
|
+
| Job listings | 3.0 | ~17 days |
|
|
110
|
+
| News / HN | 2.0 | ~25 days |
|
|
111
|
+
| GitHub repos | 1.0 | ~50 days |
|
|
112
|
+
| Academic papers | 0.3 | ~167 days |
|
|
113
|
+
|
|
114
|
+
Once implemented, agents can filter results by `freshness_score > threshold` instead of relying on string confidence levels. This makes FreshContext usable as a query parameter, not just a label.
|
|
115
|
+
|
|
116
|
+
---
|
|
117
|
+
|
|
118
|
+
## Layer 10 — API + Monetization Infrastructure
|
|
119
|
+
|
|
120
|
+
**Status: Pricing designed, billing not yet built**
|
|
121
|
+
|
|
122
|
+
The monetization architecture planned for FreshContext:
|
|
123
|
+
|
|
124
|
+
**Free tier**
|
|
125
|
+
- 1 user profile
|
|
126
|
+
- 5 watched queries
|
|
127
|
+
- Daily briefings
|
|
128
|
+
- All 11 adapters via MCP
|
|
129
|
+
|
|
130
|
+
**Pro ($19/month)**
|
|
131
|
+
- Unlimited watched queries
|
|
132
|
+
- 6-hour briefings
|
|
133
|
+
- All adapters including new ones
|
|
134
|
+
- Freshness score on every result
|
|
135
|
+
- API access (100k calls/month)
|
|
136
|
+
|
|
137
|
+
**Team ($79/month)**
|
|
138
|
+
- Multiple user profiles
|
|
139
|
+
- Shared briefing feed
|
|
140
|
+
- Slack / email delivery
|
|
141
|
+
- 500k API calls/month
|
|
142
|
+
- Priority support
|
|
143
|
+
|
|
144
|
+
**Enterprise (custom)**
|
|
145
|
+
- Dedicated Cloudflare Worker deployment
|
|
146
|
+
- Custom adapter development
|
|
147
|
+
- SLA-backed uptime
|
|
148
|
+
- White-label briefing output
|
|
149
|
+
|
|
150
|
+
**Billing implementation:** Lemon Squeezy (Namibia-compatible, merchant-of-record, no Stripe required)
|
|
151
|
+
|
|
152
|
+
---
|
|
153
|
+
|
|
154
|
+
## The Bigger Picture
|
|
155
|
+
|
|
156
|
+
FreshContext started as a fix for a personal problem — AI giving stale job listings with no warning. It's becoming something more structural: a **data freshness layer for the AI agent ecosystem.**
|
|
157
|
+
|
|
158
|
+
Every agent needs to know how old its data is. Right now, none of them do — not reliably, not with a standard format, not with a confidence signal. FreshContext is the first project to address this as a named, specified, open standard with a working reference implementation.
|
|
159
|
+
|
|
160
|
+
The opportunity is to become the layer that other AI tools plug into when they need grounded, timestamped intelligence — not a scraper, not a search engine, but the envelope that makes retrieved data trustworthy.
|
|
161
|
+
|
|
162
|
+
**The unfair advantage:** One developer, Cloudflare's global edge, a working spec, and a dataset that grows every six hours whether or not anyone is watching. The longer FreshContext runs, the more historical signal accumulates, and the harder it becomes to replicate from scratch.
|
|
163
|
+
|
|
164
|
+
---
|
|
165
|
+
|
|
166
|
+
## Contribution
|
|
167
|
+
|
|
168
|
+
The FreshContext Specification is open. New adapters are the highest-value contribution — see `src/adapters/` for the pattern and `FRESHCONTEXT_SPEC.md` for the contract any adapter must fulfill.
|
|
169
|
+
|
|
170
|
+
If you're building something FreshContext-compatible, open an issue and we'll add you to the ecosystem list.
|
|
171
|
+
|
|
172
|
+
---
|
|
173
|
+
|
|
174
|
+
*"The work isn't gone. It's just waiting to be continued."*
|
|
@@ -0,0 +1,41 @@
|
|
|
1
|
+
{
|
|
2
|
+
"actorSpecification": 1,
|
|
3
|
+
"fields": [
|
|
4
|
+
{
|
|
5
|
+
"fieldId": "adapter",
|
|
6
|
+
"fieldType": "String",
|
|
7
|
+
"title": "Adapter",
|
|
8
|
+
"description": "The source adapter used to retrieve the data (e.g. github, hackernews, reddit, yc, scholar)."
|
|
9
|
+
},
|
|
10
|
+
{
|
|
11
|
+
"fieldId": "source_url",
|
|
12
|
+
"fieldType": "String",
|
|
13
|
+
"title": "Source URL",
|
|
14
|
+
"description": "The URL of the original source the data was retrieved from."
|
|
15
|
+
},
|
|
16
|
+
{
|
|
17
|
+
"fieldId": "content",
|
|
18
|
+
"fieldType": "String",
|
|
19
|
+
"title": "Content",
|
|
20
|
+
"description": "The retrieved content from the source, truncated to max_length characters."
|
|
21
|
+
},
|
|
22
|
+
{
|
|
23
|
+
"fieldId": "retrieved_at",
|
|
24
|
+
"fieldType": "String",
|
|
25
|
+
"title": "Retrieved at",
|
|
26
|
+
"description": "ISO 8601 timestamp of when FreshContext fetched this data. Always reflects the actual retrieval time."
|
|
27
|
+
},
|
|
28
|
+
{
|
|
29
|
+
"fieldId": "content_date",
|
|
30
|
+
"fieldType": "String",
|
|
31
|
+
"title": "Content date",
|
|
32
|
+
"description": "Best estimate of when the original content was published. Null if unknown."
|
|
33
|
+
},
|
|
34
|
+
{
|
|
35
|
+
"fieldId": "freshness_confidence",
|
|
36
|
+
"fieldType": "String",
|
|
37
|
+
"title": "Freshness confidence",
|
|
38
|
+
"description": "Confidence level of the content_date estimate. One of: high (from structured API/metadata), medium (inferred from page signals), low (estimated or unknown)."
|
|
39
|
+
}
|
|
40
|
+
]
|
|
41
|
+
}
|
|
@@ -0,0 +1,207 @@
|
|
|
1
|
+
/**
|
|
2
|
+
* Changelog adapter — extracts update history from any product or repo.
|
|
3
|
+
*
|
|
4
|
+
* Accepts:
|
|
5
|
+
* - Any URL: https://example.com → auto-discovers /changelog, /releases, /CHANGELOG.md
|
|
6
|
+
* - GitHub repo URL: https://github.com/owner/repo → uses Releases API
|
|
7
|
+
* - Direct changelog URL: https://example.com/changelog
|
|
8
|
+
* - npm package name: e.g. "freshcontext-mcp" → fetches from npm registry
|
|
9
|
+
*
|
|
10
|
+
* What it returns:
|
|
11
|
+
* - Most recent changelog entries with dates
|
|
12
|
+
* - Version numbers when available
|
|
13
|
+
* - Content of each entry (truncated)
|
|
14
|
+
* - freshness_confidence based on how the date was sourced
|
|
15
|
+
*
|
|
16
|
+
* Why this matters for AI agents:
|
|
17
|
+
* Agents checking "is this tool still maintained?" or "did they ship X feature?"
|
|
18
|
+
* need to know WHEN changes happened — not just that they happened.
|
|
19
|
+
* This adapter makes update cadence a first-class signal.
|
|
20
|
+
*/
|
|
21
|
+
// Candidate URL paths probed, in order, when auto-discovering a site's
// changelog page (used by discoverChangelog below).
const CHANGELOG_PATHS = [
  "/changelog",
  "/CHANGELOG",
  "/CHANGELOG.md",
  "/CHANGELOG.txt",
  "/releases",
  "/blog/changelog",
  "/blog/releases",
  "/updates",
  "/whats-new",
  "/what-s-new",
  "/release-notes",
];
|
|
34
|
+
/**
 * Strip every character outside printable ASCII (newlines are kept),
 * then trim leading/trailing whitespace.
 *
 * @param {string} s - raw text scraped from a page or API body
 * @returns {string} cleaned text
 */
function sanitize(s) {
  const printableOnly = s.replace(/[^\x20-\x7E\n]/g, "");
  return printableOnly.trim();
}
|
|
37
|
+
// ─── GitHub Releases API ──────────────────────────────────────────────────────
/**
 * Fetch up to 10 releases for a GitHub repository and format the newest
 * (up to 8) as plain text, preferring stable releases over prereleases
 * and drafts.
 *
 * @param {string} owner - GitHub org/user name
 * @param {string} repo - repository name
 * @param {number} maxLength - hard cap on the formatted output length
 * @returns {Promise<{raw: string, content_date: string|null, freshness_confidence: string}>}
 * @throws {Error} on a non-OK API response or when the repo has no releases
 */
async function fetchGitHubReleases(owner, repo, maxLength) {
  const apiUrl = `https://api.github.com/repos/${owner}/${repo}/releases?per_page=10`;
  const response = await fetch(apiUrl, {
    headers: { "Accept": "application/vnd.github.v3+json", "User-Agent": "freshcontext-mcp" },
  });
  if (!response.ok) {
    throw new Error(`GitHub releases API error: ${response.status}`);
  }
  const releases = await response.json();
  if (!releases.length) {
    throw new Error("No releases found");
  }
  // Prefer stable releases; fall back to the full list when none qualify.
  const stable = releases.filter((rel) => !rel.prerelease && !rel.draft);
  const items = stable.length ? stable : releases;

  const sections = [];
  for (const [idx, rel] of items.slice(0, 8).entries()) {
    const notes = sanitize(rel.body ?? "").slice(0, 500);
    // Only show the release name when it adds information beyond the tag.
    const header = rel.name && rel.name !== rel.tag_name
      ? `[${idx + 1}] ${rel.tag_name} — ${rel.name}`
      : `[${idx + 1}] ${rel.tag_name}`;
    const released = `Released: ${rel.published_at?.slice(0, 10) ?? "unknown"}`;
    const bodyLine = notes ? `\n${notes}` : "(no release notes)";
    sections.push([header, released, bodyLine].join("\n"));
  }

  return {
    raw: sections.join("\n\n").slice(0, maxLength),
    // GitHub orders releases newest-first, so items[0] carries the newest date.
    content_date: items[0]?.published_at ?? null,
    freshness_confidence: "high",
  };
}
|
|
62
|
+
// ─── npm Registry ─────────────────────────────────────────────────────────────
/**
 * Build a release summary for an npm package from the public registry's
 * per-version publish timestamps.
 *
 * @param {string} packageName - package name as published on npm
 * @param {number} maxLength - hard cap on the formatted output length
 * @returns {Promise<{raw: string, content_date: string|null, freshness_confidence: string}>}
 * @throws {Error} on a non-OK registry response
 */
async function fetchNpmChangelog(packageName, maxLength) {
  const registryUrl = `https://registry.npmjs.org/${encodeURIComponent(packageName)}`;
  const response = await fetch(registryUrl);
  if (!response.ok) {
    throw new Error(`npm registry error: ${response.status}`);
  }
  const data = await response.json();
  const times = data.time ?? {};
  // Keep only real version keys (drop the registry's "created"/"modified"
  // bookkeeping entries), newest publish date first, at most 10.
  const byNewestFirst = (a, b) => new Date(times[b]).getTime() - new Date(times[a]).getTime();
  const versions = Object.keys(times)
    .filter((key) => key !== "created" && key !== "modified" && /^\d/.test(key))
    .sort(byNewestFirst)
    .slice(0, 10);
  const latest = data["dist-tags"]?.latest ?? versions[0];

  const lines = [
    `Package: ${data.name}`,
    `Description: ${data.description ?? "N/A"}`,
    `Latest: ${latest} (${times[latest]?.slice(0, 10) ?? "unknown"})`,
    ``,
    `Recent versions:`,
  ];
  for (const version of versions) {
    lines.push(` ${version} — ${times[version]?.slice(0, 10) ?? "unknown"}`);
  }

  const newestDate = versions[0] ? times[versions[0]] : null;
  return {
    raw: lines.join("\n").slice(0, maxLength),
    content_date: newestDate ?? null,
    freshness_confidence: newestDate ? "high" : "medium",
  };
}
|
|
85
|
+
// ─── Browser-based changelog discovery ───────────────────────────────────────
/**
 * Discover and scrape a changelog page starting from `baseUrl`.
 *
 * If `baseUrl` already looks like a changelog page it is fetched directly;
 * otherwise the site root and every common changelog path are probed in order.
 * The first page that has real content and plausibly is a changelog (by URL,
 * title, or visible date/version patterns) is summarized and returned.
 *
 * Fix vs. original: the headless browser is now closed via try/finally, so it
 * can no longer leak if `browser.newPage()` rejects or `page.close()` throws
 * inside the per-URL error handler; each candidate page is likewise always
 * closed before moving to the next URL.
 *
 * @param {string} baseUrl - site root or direct changelog URL
 * @param {number} maxLength - hard cap on the formatted output length
 * @returns {Promise<{raw: string, content_date: string|null, freshness_confidence: string}>}
 * @throws {Error} when no candidate URL yields a changelog-like page
 */
async function discoverChangelog(baseUrl, maxLength) {
  const { chromium } = await import("playwright");
  const urlObj = new URL(baseUrl);
  // If the URL already looks like a changelog page, go directly; otherwise
  // probe the original URL plus every common changelog path off the origin.
  const isDirectChangelog = CHANGELOG_PATHS.some((p) => urlObj.pathname.toLowerCase().includes(p.replace("/", "")));
  const targetUrls = isDirectChangelog
    ? [baseUrl]
    : [baseUrl, ...CHANGELOG_PATHS.map((p) => `${urlObj.origin}${p}`)];
  const browser = await chromium.launch({ headless: true });
  try {
    for (const url of targetUrls) {
      const page = await browser.newPage();
      let result = null;
      try {
        const res = await page.goto(url, { waitUntil: "domcontentloaded", timeout: 15000 });
        if (res && res.ok()) {
          // Runs in the browser context; returns extracted text plus
          // date/version signals used for the changelog heuristic below.
          result = await page.evaluate(`(function() {
            // Try to find changelog-like content
            var selectors = [
              'article', 'main', '.changelog', '.releases', '.release-notes',
              '[class*="changelog"]', '[class*="release"]', '[id*="changelog"]',
              '[id*="release"]', '.prose', '.content', '.markdown-body'
            ];

            var el = null;
            for (var i = 0; i < selectors.length; i++) {
              el = document.querySelector(selectors[i]);
              if (el && el.innerText && el.innerText.length > 100) break;
            }

            if (!el) el = document.body;

            var text = el ? el.innerText : '';

            // Extract dates — look for version/date patterns
            var datePattern = /\\b(20\\d{2}[-/.](0[1-9]|1[0-2])[-/.](0[1-9]|[12]\\d|3[01]))\\b/g;
            var versionPattern = /v?\\d+\\.\\d+(\\.\\d+)?(-\\w+)?/g;

            var dates = (text.match(datePattern) || []).slice(0, 5);
            var versions = (text.match(versionPattern) || []).slice(0, 5);

            // Truncate to first 3000 chars of meaningful content
            var truncated = text
              .split('\\n')
              .filter(function(l) { return l.trim().length > 0; })
              .slice(0, 60)
              .join('\\n');

            return {
              text: truncated,
              dates: dates,
              versions: versions,
              title: document.title,
              url: window.location.href,
              hasContent: text.length > 200
            };
          })`);
        }
      } catch {
        // Navigation or evaluation failed for this candidate — try the next.
        result = null;
      } finally {
        // Always release the tab; swallow close errors so one broken page
        // cannot abort the whole discovery loop.
        await page.close().catch(() => {});
      }
      if (!result || !result.hasContent) {
        continue;
      }
      // Heuristic: accept discovered paths only when the page plausibly is a
      // changelog; the user-supplied URL itself is always accepted.
      const looksLikeChangelog = result.url.toLowerCase().includes("changelog") ||
        result.url.toLowerCase().includes("release") ||
        result.url.toLowerCase().includes("update") ||
        result.title.toLowerCase().includes("changelog") ||
        result.title.toLowerCase().includes("release") ||
        result.dates.length > 0 ||
        result.versions.length > 1;
      if (!looksLikeChangelog && url !== baseUrl) {
        continue;
      }
      const raw = [
        `Source: ${result.url}`,
        `Title: ${result.title}`,
        result.versions.length ? `Versions found: ${result.versions.join(", ")}` : null,
        result.dates.length ? `Dates found: ${result.dates.join(", ")}` : null,
        ``,
        sanitize(result.text),
      ].filter(Boolean).join("\n").slice(0, maxLength);
      // Best date is the newest one found (date strings sort lexicographically).
      const newestDate = result.dates.length > 0
        ? result.dates.sort().reverse()[0]
        : null;
      const confidence = result.dates.length > 0 ? "medium" : "low";
      return { raw, content_date: newestDate, freshness_confidence: confidence };
    }
    throw new Error(`No changelog found at ${baseUrl} or common changelog paths`);
  } finally {
    await browser.close();
  }
}
|
|
187
|
+
// ─── Main export ──────────────────────────────────────────────────────────────
/**
 * Changelog adapter entry point. Routes the input to the right fetcher:
 *  - bare or scoped npm package name → npm registry
 *  - GitHub repo URL → Releases API (falls back to browser scrape on failure)
 *  - any other URL → browser-based changelog discovery
 *
 * Fixes vs. original:
 *  - scoped npm names ("@scope/pkg") contain "/" and previously fell through
 *    to URL discovery, where `new URL("@scope/pkg")` threw; they now route
 *    to the npm registry like bare names.
 *  - empty input now fails with a descriptive error instead of an opaque
 *    `new URL("")` TypeError from deep inside discovery.
 *
 * @param {{url?: string, maxLength?: number}} options
 * @returns {Promise<{raw: string, content_date: string|null, freshness_confidence: string}>}
 * @throws {Error} when the input is empty or no changelog can be found
 */
export async function changelogAdapter(options) {
  const input = (options.url ?? "").trim();
  const maxLength = options.maxLength ?? 6000;
  if (input.length === 0) {
    throw new Error("changelog adapter requires a URL or npm package name");
  }
  // npm package name: no scheme, and either a bare name (no slash) or a
  // scoped name starting with "@".
  if (!input.startsWith("http") && (input.startsWith("@") || !input.includes("/"))) {
    return fetchNpmChangelog(input, maxLength);
  }
  // GitHub repo URL → use releases API
  const ghMatch = input.match(/github\.com\/([^/]+)\/([^/?\s]+)/);
  if (ghMatch) {
    try {
      return await fetchGitHubReleases(ghMatch[1], ghMatch[2], maxLength);
    }
    catch {
      // Fall through to browser scrape if API fails
    }
  }
  // Any other URL → discover changelog
  return discoverChangelog(input, maxLength);
}
|
|
@@ -0,0 +1,196 @@
|
|
|
1
|
+
/**
|
|
2
|
+
* Government Contracts adapter — fetches awarded contract data from USASpending.gov
|
|
3
|
+
*
|
|
4
|
+
* Why this is unique:
|
|
5
|
+
* No other MCP server exposes government contract data.
|
|
6
|
+
* For GTM teams, VC investors, and competitive researchers, knowing when a
|
|
7
|
+
* company wins a government contract is a high-signal buying intent indicator.
|
|
8
|
+
* A company that just won a $2M DoD contract is hiring, spending, and building.
|
|
9
|
+
*
|
|
10
|
+
* Accepts:
|
|
11
|
+
* - Company name: "Cloudflare" → finds contracts awarded to that company
|
|
12
|
+
* - NAICS code: "541511" → software publishers contracts
|
|
13
|
+
* - Agency name: "Department of Defense" → all DoD contracts
|
|
14
|
+
* - Keyword: "AI infrastructure" → contracts with that keyword
|
|
15
|
+
* - A URL: https://api.usaspending.gov/... → direct API call
|
|
16
|
+
*
|
|
17
|
+
* Data source: USASpending.gov public API (no API key required)
|
|
18
|
+
* Coverage: All US federal contracts, grants, and awards
|
|
19
|
+
* Freshness: Updated daily by the US Treasury
|
|
20
|
+
*
|
|
21
|
+
* What it returns:
|
|
22
|
+
* - Award recipient name and location
|
|
23
|
+
* - Contract amount (obligated)
|
|
24
|
+
* - Award date (high confidence timestamp)
|
|
25
|
+
* - Awarding agency and sub-agency
|
|
26
|
+
* - Contract description / award title
|
|
27
|
+
* - NAICS code and description
|
|
28
|
+
* - Period of performance dates
|
|
29
|
+
*/
|
|
30
|
+
/**
 * Strip non-printable and non-ASCII characters from an API text field.
 *
 * USASpending payloads occasionally carry control characters or mis-encoded
 * bytes; this keeps only printable ASCII (0x20–0x7E) and trims the ends.
 *
 * @param {string} s - Raw text from the API.
 * @returns {string} Printable-ASCII-only, trimmed text.
 */
function sanitize(s) {
  const asciiOnly = s.replace(/[^\x20-\x7E]/g, "");
  return asciiOnly.trim();
}
|
|
33
|
+
/**
 * Format a dollar amount for compact human-readable display.
 *
 * @param {number|null|undefined} amount - Obligated amount from the API.
 * @returns {string} "$2.50M", "$1.5K", "$999"-style text, or "N/A" when the
 *   value is missing or not a finite number.
 */
function formatUSD(amount) {
  // Number.isFinite rejects null/undefined, NaN, Infinity, AND non-number
  // payloads. The previous coercing `isNaN` check let numeric strings
  // through the guard and then crashed on `.toFixed`, and rendered
  // Infinity as "$InfinityM".
  if (!Number.isFinite(amount))
    return "N/A";
  if (Math.abs(amount) >= 1_000_000)
    return `$${(amount / 1_000_000).toFixed(2)}M`;
  if (Math.abs(amount) >= 1_000)
    return `$${(amount / 1_000).toFixed(1)}K`;
  return `$${amount.toFixed(0)}`;
}
|
|
42
|
+
// ─── Search by recipient (company name) ──────────────────────────────────────
/**
 * Query USASpending for contracts awarded to a named recipient over the
 * last two years, largest awards first (max 10).
 *
 * @param {string} query - Company / recipient name to search for.
 * @param {number} maxLength - Max characters of formatted output.
 * @returns {Promise<{raw: string, content_date: ?string, freshness_confidence: string}>}
 * @throws {Error} When the USASpending API responds non-2xx.
 */
async function searchByRecipient(query, maxLength) {
  const DAY_MS = 86400000;
  const endDate = new Date().toISOString().slice(0, 10);
  const startDate = new Date(Date.now() - 365 * 2 * DAY_MS).toISOString().slice(0, 10);
  const payload = {
    filters: {
      recipient_search_text: [query],
      time_period: [
        {
          start_date: startDate,
          end_date: endDate,
        },
      ],
      award_type_codes: ["A", "B", "C", "D"], // contracts only
    },
    fields: [
      "Award_ID", "Recipient_Name", "Award_Amount", "Description",
      "Award_Date", "Start_Date", "End_Date",
      "Awarding_Agency_Name", "Awarding_Sub_Agency_Name",
      "recipient_location_state_name", "recipient_location_city_name",
      "naics_code", "naics_description",
    ],
    page: 1,
    limit: 10,
    sort: "Award_Amount",
    order: "desc",
    subawards: false,
  };
  const response = await fetch("https://api.usaspending.gov/api/v2/search/spending_by_award/", {
    method: "POST",
    headers: { "Content-Type": "application/json", "User-Agent": "freshcontext-mcp" },
    body: JSON.stringify(payload),
  });
  if (!response.ok) {
    throw new Error(`USASpending API error: ${response.status}`);
  }
  const data = await response.json();
  if (!data.results?.length) {
    // An empty result set is still a definitive, fresh answer from the API.
    return {
      raw: `No federal contracts found for "${query}" in the last 2 years.\n\nThis could mean:\n- The company name differs from the registered recipient name\n- The company operates under a subsidiary name\n- No contracts awarded in this period\n\nTry searching by parent company name or NAICS code.`,
      content_date: null,
      freshness_confidence: "high",
    };
  }
  return formatResults(data.results, `Federal contracts — ${query}`, maxLength);
}
|
|
85
|
+
// ─── Search by keyword ────────────────────────────────────────────────────────
/**
 * Query USASpending for contracts whose text matches a keyword over the
 * last year, largest awards first (max 10).
 *
 * @param {string} keyword - Free-text keyword (or NAICS code used as one).
 * @param {number} maxLength - Max characters of formatted output.
 * @returns {Promise<{raw: string, content_date: ?string, freshness_confidence: string}>}
 * @throws {Error} When the USASpending API responds non-2xx.
 */
async function searchByKeyword(keyword, maxLength) {
  const DAY_MS = 86400000;
  const endDate = new Date().toISOString().slice(0, 10);
  const startDate = new Date(Date.now() - 365 * DAY_MS).toISOString().slice(0, 10);
  const payload = {
    filters: {
      keywords: [keyword],
      time_period: [
        {
          start_date: startDate,
          end_date: endDate,
        },
      ],
      award_type_codes: ["A", "B", "C", "D"],
    },
    fields: [
      "Award_ID", "Recipient_Name", "Award_Amount", "Description",
      "Award_Date", "Start_Date", "End_Date",
      "Awarding_Agency_Name", "Awarding_Sub_Agency_Name",
      "recipient_location_state_name", "naics_code", "naics_description",
    ],
    page: 1,
    limit: 10,
    sort: "Award_Amount",
    order: "desc",
    subawards: false,
  };
  const response = await fetch("https://api.usaspending.gov/api/v2/search/spending_by_award/", {
    method: "POST",
    headers: { "Content-Type": "application/json", "User-Agent": "freshcontext-mcp" },
    body: JSON.stringify(payload),
  });
  if (!response.ok) {
    throw new Error(`USASpending keyword search error: ${response.status}`);
  }
  const data = await response.json();
  if (!data.results?.length) {
    return {
      raw: `No federal contracts found matching keyword "${keyword}" in the last year.`,
      content_date: null,
      freshness_confidence: "high",
    };
  }
  return formatResults(data.results, `Federal contracts matching "${keyword}"`, maxLength);
}
|
|
127
|
+
// ─── Format results ───────────────────────────────────────────────────────────
/**
 * Render USASpending award records as readable text and derive a freshness
 * timestamp from the newest award date in the batch.
 *
 * @param {Array<object>} results - Award records from spending_by_award.
 * @param {string} title - Heading for the rendered list.
 * @param {number} maxLength - Hard cap on output characters.
 * @returns {{raw: string, content_date: ?string, freshness_confidence: string}}
 */
function formatResults(results, title, maxLength) {
  const lines = [title, ""];
  for (const [index, award] of results.entries()) {
    const desc = sanitize(award.Description ?? "No description");
    const locationParts = [award.recipient_location_city_name, award.recipient_location_state_name];
    const location = locationParts.filter(Boolean).join(", ") || "N/A";
    lines.push(`[${index + 1}] ${sanitize(award.Recipient_Name ?? "Unknown")}`);
    lines.push(`  Amount: ${formatUSD(award.Award_Amount)}`);
    lines.push(`  Awarded: ${award.Award_Date?.slice(0, 10) ?? "unknown"}`);
    lines.push(`  Period: ${award.Start_Date?.slice(0, 10) ?? "?"} → ${award.End_Date?.slice(0, 10) ?? "?"}`);
    lines.push(`  Agency: ${sanitize(award.Awarding_Agency_Name ?? "N/A")}`);
    if (award.Awarding_Sub_Agency_Name && award.Awarding_Sub_Agency_Name !== award.Awarding_Agency_Name) {
      lines.push(`  Sub-agency: ${sanitize(award.Awarding_Sub_Agency_Name)}`);
    }
    if (award.naics_code) {
      lines.push(`  NAICS: ${award.naics_code} — ${sanitize(award.naics_description ?? "")}`);
    }
    lines.push(`  Location: ${location}`);
    lines.push(`  Description: ${desc.slice(0, 200)}`);
    lines.push("");
  }
  const raw = lines.join("\n").slice(0, maxLength);
  // ISO date strings sort lexicographically, so the last element after an
  // ascending sort is the newest award date — our best content timestamp.
  const awardDates = results.map((r) => r.Award_Date).filter(Boolean).sort();
  return {
    raw,
    content_date: awardDates.at(-1) ?? null,
    freshness_confidence: "high", // USASpending dates are structured API fields
  };
}
|
|
162
|
+
// ─── Main export ──────────────────────────────────────────────────────────────
/**
 * Government-contracts adapter entry point.
 *
 * Routes the input to the right USASpending lookup:
 *   - direct api.usaspending.gov URL → raw JSON passthrough
 *   - six-digit NAICS code           → keyword search
 *   - anything else                  → recipient search, falling back to a
 *                                      keyword search when nothing matches
 *
 * @param {{url?: string, maxLength?: number}} options - `url` carries the
 *   query text; `maxLength` caps output characters (default 6000).
 * @returns {Promise<{raw: string, content_date: ?string, freshness_confidence: string}>}
 * @throws {Error} When the query is empty or the API call fails.
 */
export async function govContractsAdapter(options) {
  const input = (options.url ?? "").trim();
  const maxLength = options.maxLength ?? 6000;
  if (!input) {
    throw new Error("Query required: company name, keyword, or NAICS code");
  }
  // Direct API URL → fetch verbatim and return pretty-printed JSON.
  if (input.startsWith("https://api.usaspending.gov")) {
    const response = await fetch(input, { headers: { "User-Agent": "freshcontext-mcp" } });
    if (!response.ok) {
      throw new Error(`USASpending direct fetch error: ${response.status}`);
    }
    const payload = await response.json();
    return {
      raw: JSON.stringify(payload, null, 2).slice(0, maxLength),
      content_date: new Date().toISOString(),
      freshness_confidence: "high",
    };
  }
  // Six-digit NAICS code → keyword search.
  if (/^\d{6}$/.test(input)) {
    return searchByKeyword(input, maxLength);
  }
  // Default: treat as a recipient name first; if that finds nothing,
  // retry as a keyword before giving up.
  try {
    const byRecipient = await searchByRecipient(input, maxLength);
    if (byRecipient.raw.includes("No federal contracts found")) {
      const byKeyword = await searchByKeyword(input, maxLength);
      if (!byKeyword.raw.includes("No federal contracts found")) {
        return byKeyword;
      }
    }
    return byRecipient;
  } catch {
    return searchByKeyword(input, maxLength);
  }
}
|
package/dist/server.js
CHANGED
|
@@ -9,6 +9,8 @@ import { ycAdapter } from "./adapters/yc.js";
|
|
|
9
9
|
import { repoSearchAdapter } from "./adapters/repoSearch.js";
|
|
10
10
|
import { packageTrendsAdapter } from "./adapters/packageTrends.js";
|
|
11
11
|
import { jobsAdapter } from "./adapters/jobs.js";
|
|
12
|
+
import { changelogAdapter } from "./adapters/changelog.js";
|
|
13
|
+
import { govContractsAdapter } from "./adapters/govcontracts.js";
|
|
12
14
|
import { stampFreshness, formatForLLM } from "./tools/freshnessStamp.js";
|
|
13
15
|
import { formatSecurityError } from "./security.js";
|
|
14
16
|
const server = new McpServer({
|
|
@@ -182,6 +184,42 @@ server.registerTool("search_jobs", {
|
|
|
182
184
|
return { content: [{ type: "text", text: formatSecurityError(err) }] };
|
|
183
185
|
}
|
|
184
186
|
});
|
|
187
|
+
// ─── Tool: extract_changelog ────────────────────────────────────────────────
// Registers the changelog-extraction tool on the MCP server. The adapter
// resolves a GitHub repo URL, npm package name, or arbitrary site URL to its
// release/changelog history; the result is wrapped in a freshness-stamped
// envelope before being returned to the model.
server.registerTool("extract_changelog", {
  description: "Extract update history from any product, repo, or package. Accepts a GitHub URL (uses Releases API), an npm package name, or any website URL (auto-discovers /changelog, /releases, /CHANGELOG.md). Returns version numbers, release dates, and entry content — all timestamped. Use this to check if a tool is actively maintained, when a feature shipped, or how fast a team moves.",
  inputSchema: z.object({
    url: z.string().describe("GitHub repo URL (https://github.com/owner/repo), npm package name (e.g. 'freshcontext-mcp'), or any website URL (https://example.com). Auto-discovers changelog paths."),
    // max_length caps the returned text; defaults to 6000 characters.
    max_length: z.number().optional().default(6000).describe("Max content length"),
  }),
  // Read-only tool that reaches out to arbitrary external URLs.
  annotations: { readOnlyHint: true, openWorldHint: true },
}, async ({ url, max_length }) => {
  try {
    const result = await changelogAdapter({ url, maxLength: max_length });
    // stampFreshness attaches retrieval-time metadata; formatForLLM renders
    // the envelope as model-readable text.
    const ctx = stampFreshness(result, { url, maxLength: max_length }, "changelog");
    return { content: [{ type: "text", text: formatForLLM(ctx) }] };
  }
  catch (err) {
    // Errors are rendered via formatSecurityError (from ./security.js)
    // rather than rethrown, so the tool call itself never fails the session.
    return { content: [{ type: "text", text: formatSecurityError(err) }] };
  }
});
|
|
205
|
+
// ─── Tool: extract_govcontracts ────────────────────────────────────────────
// Registers the US federal contract-awards tool on the MCP server. The
// adapter queries the public USASpending.gov API (no key required) by
// company name, keyword, NAICS code, or direct API URL, and the result is
// wrapped in a freshness-stamped envelope before being returned.
server.registerTool("extract_govcontracts", {
  description: "Fetch US federal government contract awards from USASpending.gov. No API key required. Search by company name (e.g. 'Palantir'), keyword (e.g. 'AI infrastructure'), or NAICS code (e.g. '541511'). Returns award amounts, dates, awarding agency, NAICS code, and contract descriptions — all timestamped. Use this to find buying intent signals (a company that just won a $5M DoD contract is actively hiring and spending), competitive intelligence, or GTM targeting.",
  inputSchema: z.object({
    // NOTE: the parameter is named `url` for consistency with the other
    // tools, but it also accepts plain query text (company/keyword/NAICS).
    url: z.string().describe("Company name (e.g. 'Cloudflare'), keyword (e.g. 'machine learning'), NAICS code (e.g. '541511'), or direct USASpending API URL."),
    // max_length caps the returned text; defaults to 6000 characters.
    max_length: z.number().optional().default(6000).describe("Max content length"),
  }),
  // Read-only tool that reaches out to an external public API.
  annotations: { readOnlyHint: true, openWorldHint: true },
}, async ({ url, max_length }) => {
  try {
    const result = await govContractsAdapter({ url, maxLength: max_length });
    // stampFreshness attaches retrieval-time metadata; formatForLLM renders
    // the envelope as model-readable text.
    const ctx = stampFreshness(result, { url, maxLength: max_length }, "govcontracts");
    return { content: [{ type: "text", text: formatForLLM(ctx) }] };
  }
  catch (err) {
    // Errors are rendered via formatSecurityError (from ./security.js)
    // rather than rethrown, so the tool call itself never fails the session.
    return { content: [{ type: "text", text: formatSecurityError(err) }] };
  }
});
|
|
185
223
|
// ─── Start ───────────────────────────────────────────────────────────────────
|
|
186
224
|
async function main() {
|
|
187
225
|
const transport = new StdioServerTransport();
|
|
@@ -0,0 +1,49 @@
|
|
|
1
|
+
{
|
|
2
|
+
"title": "FreshContext MCP Input",
|
|
3
|
+
"type": "object",
|
|
4
|
+
"schemaVersion": 1,
|
|
5
|
+
"properties": {
|
|
6
|
+
"tool": {
|
|
7
|
+
"title": "Tool",
|
|
8
|
+
"type": "string",
|
|
9
|
+
"description": "The FreshContext tool to run.",
|
|
10
|
+
"enum": [
|
|
11
|
+
"extract_github",
|
|
12
|
+
"extract_hackernews",
|
|
13
|
+
"extract_scholar",
|
|
14
|
+
"extract_arxiv",
|
|
15
|
+
"extract_reddit",
|
|
16
|
+
"extract_yc",
|
|
17
|
+
"extract_producthunt",
|
|
18
|
+
"search_repos",
|
|
19
|
+
"package_trends",
|
|
20
|
+
"extract_finance",
|
|
21
|
+
"extract_landscape"
|
|
22
|
+
],
|
|
23
|
+
"default": "extract_landscape",
|
|
24
|
+
"editor": "select"
|
|
25
|
+
},
|
|
26
|
+
"url": {
|
|
27
|
+
"title": "URL",
|
|
28
|
+
"type": "string",
|
|
29
|
+
"description": "URL to extract from. Required for: extract_github, extract_hackernews, extract_scholar, extract_reddit. E.g. https://github.com/owner/repo",
|
|
30
|
+
"editor": "textfield"
|
|
31
|
+
},
|
|
32
|
+
"query": {
|
|
33
|
+
"title": "Query",
|
|
34
|
+
"type": "string",
|
|
35
|
+
"description": "Search query. Required for: extract_landscape, search_repos, extract_yc, extract_producthunt, package_trends, extract_finance.",
|
|
36
|
+
"editor": "textfield"
|
|
37
|
+
},
|
|
38
|
+
"max_length": {
|
|
39
|
+
"title": "Max content length",
|
|
40
|
+
"type": "integer",
|
|
41
|
+
"description": "Maximum characters returned per result. Default: 6000.",
|
|
42
|
+
"default": 6000,
|
|
43
|
+
"minimum": 500,
|
|
44
|
+
"maximum": 20000,
|
|
45
|
+
"editor": "number"
|
|
46
|
+
}
|
|
47
|
+
},
|
|
48
|
+
"required": ["tool"]
|
|
49
|
+
}
|
package/package.json
CHANGED