npm - freshcontext-mcp - Versions diffs - 0.3.14 → 0.3.16 - Mend

freshcontext-mcp 0.3.14 → 0.3.16

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (34)

package/.env.example +8 -0
package/README.md +117 -125
package/RESEARCH.md +487 -0
package/RISKS.md +137 -0
package/cleanup.ps1 +99 -0
package/demo/README.md +70 -0
package/demo/data.json +88 -0
package/demo/generate.mjs +199 -0
package/demo/index.html +513 -0
package/demo/logo-export.html +61 -0
package/demo/logo.svg +23 -0
package/dist/server.js +124 -66
package/dist/tools/freshnessStamp.js +30 -22
package/freshcontext-validate.js +196 -0
package/freshcontext.schema.json +103 -0
package/package.json +2 -2
package/server.json +3 -3
package/time-check.ps1 +46 -0
package/.actor/Dockerfile +0 -16
package/.actor/actor.json +0 -9
package/.actor/output_schema.json +0 -13
package/ARCHITECTURE_UPGRADE_CHECKLIST.md +0 -88
package/ARCHITECTURE_UPGRADE_ROADMAP_V1.md +0 -174
package/FRESHCONTEXT_SPEC.md +0 -178
package/HANDOFF.md +0 -184
package/ROADMAP.md +0 -174
package/SESSION_SAVE_ARCHITECTURE_V1.md +0 -67
package/SESSION_SAVE_ARCHITECTURE_V2.md +0 -142
package/SESSION_SAVE_V4.md +0 -60
package/SESSION_SAVE_V5.md +0 -121
package/USAGE.md +0 -294
package/add-cache.cjs +0 -86
package/dataset_schema.json +0 -41
package/input_schema.json +0 -48

package/.env.example ADDED

@@ -0,0 +1,8 @@
+# freshcontext-mcp environment variables
+# Copy to .env and fill in
+# Optional: GitHub Personal Access Token (increases rate limits for GitHub API fallback)
+GITHUB_TOKEN=
+# Optional: Proxy URL if needed for certain extractions
+# PROXY_URL=http://user:pass@host:port

package/README.md CHANGED

@@ -8,12 +8,46 @@ That's the problem freshcontext fixes.
 [![npm version](https://img.shields.io/npm/v/freshcontext-mcp)](https://www.npmjs.com/package/freshcontext-mcp)
 [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
+[![MCP Registry](https://img.shields.io/badge/MCP%20Registry-Listed-blue)](https://registry.modelcontextprotocol.io)
+> **Live demo:** [freshcontext-mcp.gimmanuel73.workers.dev/demo](https://freshcontext-mcp.gimmanuel73.workers.dev/demo) — same model, same query, two completely different answers. Only the temporal layer changed.
+---
+## The problem
+Large language models retrieve web data semantically. Cosine similarity finds the documents that match a query best — but cosine doesn't know when a document was written.
+So a 2022 blog post and a 2026 paper can score nearly identically. The model gets a context window full of stale documents and faithfully summarizes 2022 advice for a 2026 question.
+That's not hallucination. That's correct summarization of corrupted retrieval.
+> **Most RAG pipelines rank context correctly semantically but incorrectly temporally.**
+---
+## The layer
+FreshContext is a **temporal correction layer for retrieval systems**. One math correction applied before context reaches the LLM:
+```
+R_t = R_0 · e^(−λt)
+```
+- `R_0` — base semantic relevancy (whatever your retriever already gives you)
+- `λ` — source-specific decay constant (HN ≈14h half-life, blogs ≈29d, academic papers ≈1.6y)
+- `t` — hours elapsed since publication
+- `R_t` — decay-adjusted relevancy at query time
+That's the whole fix. No model swap. No re-embedding. No re-indexing. The layer drops onto whatever retrieval pipeline you already have.
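Reviewer note: the decay formula added above can be illustrated with a minimal sketch. This is not the package's code. The README gives only approximate half-lives, so the conversion λ = ln(2) / half-life is the standard exponential-decay relation applied here as an assumption, and the names `HALF_LIFE_HOURS`, `decayedRelevancy`, and `rerank`, along with the document fields `source`, `score`, and `publishedAt`, are hypothetical.

```javascript
// Approximate half-lives from the README, converted to hours.
// The keys are hypothetical labels, not identifiers from the package.
const HALF_LIFE_HOURS = {
  hackernews: 14,
  blog: 29 * 24,
  academic: 1.6 * 365 * 24,
};

// λ = ln(2) / half-life, so the score halves after one half-life.
function decayConstant(halfLifeHours) {
  return Math.LN2 / halfLifeHours;
}

// R_t = R_0 · e^(−λt), with t in hours since publication.
function decayedRelevancy(baseScore, hoursElapsed, halfLifeHours) {
  if (hoursElapsed < 0) {
    throw new RangeError("hoursElapsed must be non-negative");
  }
  return baseScore * Math.exp(-decayConstant(halfLifeHours) * hoursElapsed);
}

// Re-rank retrieved documents by decay-adjusted score, highest first.
function rerank(docs, now = Date.now()) {
  return docs
    .map((doc) => {
      const halfLife = HALF_LIFE_HOURS[doc.source];
      if (halfLife === undefined) {
        throw new Error(`no half-life configured for source "${doc.source}"`);
      }
      // Treat future-dated documents as brand new rather than boosting them.
      const hours = Math.max(0, now - Date.parse(doc.publishedAt)) / 3600000;
      return { ...doc, rt: decayedRelevancy(doc.score, hours, halfLife) };
    })
    .sort((a, b) => b.rt - a.rt);
}
```

With these assumed constants, a 2022 blog post with a base score of 90 decays to effectively zero by 2026, so a day-old post scoring 85 ranks above it. That is the reordering the README describes.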
+**The layer is the product.** The 20 adapters shipped with this repo are reference implementations demonstrating compatibility — useful, but commodity. The DAR engine, the freshness envelope, and the FreshContext Specification are the moat.
 ---
-## What it does
+## The standard
-Every MCP server returns data. freshcontext returns data **plus when it was retrieved and how confident that date is** — wrapped in a FreshContext envelope:
+Every FreshContext-compatible response wraps content in a structured envelope:
 ```
 [FRESHCONTEXT]
@@ -26,83 +60,76 @@ Confidence: high
 [/FRESHCONTEXT]
 ```
-Claude now knows the difference between something from this morning and something from two years ago. You do too.
+**When** it was retrieved. **Where** it came from. **How confident** we are the date is accurate.
+The FreshContext Specification v1.1 is published as an open standard under MIT licence. Any tool, agent, or system that wraps retrieved data in this envelope is FreshContext-compatible. → [Read the spec](./FRESHCONTEXT_SPEC.md) · [Read the methodology](./METHODOLOGY.md)
+---
+## The intelligence feed
+Beyond the per-call envelope, the production FreshContext deployment exposes a continuous, decay-scored, deduplicated feed:
+```
+GET /v1/intel/feed/:profile_id?limit=20&min_rt=0
+```
+Every signal is stamped with `base_score`, `rt_score`, `entropy_level` (low / stable / high), `ha_pri_sig` (SHA-256 provenance), `semantic_fingerprint` (cross-adapter dedup), and `published_at`. Ready for direct LLM or agent consumption — no synthesis required.
+Production endpoint: `https://freshcontext-mcp.gimmanuel73.workers.dev`
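Reviewer note: a small sketch of how a client might build the feed request described above. The path, query parameters, field names, and base URL come from the README text. The helper names `feedUrl` and `isCompleteSignal` are hypothetical, and the README does not document the JSON shape of the response, so the completeness check only assumes that each signal is an object carrying the six listed fields.

```javascript
// Production base URL as stated in the README.
const BASE_URL = "https://freshcontext-mcp.gimmanuel73.workers.dev";

// Build GET /v1/intel/feed/:profile_id?limit=..&min_rt=..
function feedUrl(profileId, { limit = 20, minRt = 0 } = {}) {
  const url = new URL(
    `/v1/intel/feed/${encodeURIComponent(profileId)}`,
    BASE_URL
  );
  url.searchParams.set("limit", String(limit));
  url.searchParams.set("min_rt", String(minRt));
  return url.toString();
}

// Fields the README says every signal is stamped with.
const REQUIRED_FIELDS = [
  "base_score",
  "rt_score",
  "entropy_level",
  "ha_pri_sig",
  "semantic_fingerprint",
  "published_at",
];

// True when a signal object carries every listed field (assumed shape).
function isCompleteSignal(signal) {
  return (
    typeof signal === "object" &&
    signal !== null &&
    REQUIRED_FIELDS.every((field) => field in signal)
  );
}
```

A caller could pass `feedUrl("my-profile", { limit: 5, minRt: 10 })` to `fetch` and filter the parsed result with `isCompleteSignal` before handing signals to an agent. The profile id here is a placeholder.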
 ---
-## 19 tools. No API keys.
+## Reference adapters
+The repo ships 20 adapters demonstrating how to make any data source FreshContext-compatible. Useful as drop-in tools, but the value is the layer above them.
 ### Intelligence
-| Tool | What it gets you |
+| Adapter | What it returns |
 |---|---|
 | `extract_github` | README, stars, forks, language, topics, last commit |
 | `extract_hackernews` | Top stories or search results with scores and timestamps |
 | `extract_scholar` | Research papers — titles, authors, years, snippets |
-| `extract_arxiv` | arXiv papers via official API — more reliable than Scholar |
+| `extract_arxiv` | arXiv papers via official API |
 | `extract_reddit` | Posts and community sentiment from any subreddit |
 ### Competitive research
-| Tool | What it gets you |
+| Adapter | What it returns |
 |---|---|
-| `extract_yc` | YC company listings by keyword — who's funded in your space |
+| `extract_yc` | YC company listings by keyword |
 | `extract_producthunt` | Recent launches by topic |
 | `search_repos` | GitHub repos ranked by stars with activity signals |
 | `package_trends` | npm and PyPI metadata — version history, release cadence |
 ### Market data
-| Tool | What it gets you |
+| Adapter | What it returns |
 |---|---|
 | `extract_finance` | Live stock data — price, market cap, P/E, 52w range. Up to 5 tickers. |
-| `search_jobs` | Remote job listings from Remotive + HN "Who is Hiring" — every listing dated |
+| `search_jobs` | Remote job listings from Remotive, RemoteOK, HN "Who is Hiring" |
 ### Composites — multiple sources, one call
-| Tool | Sources | What it gets you |
+| Adapter | Sources | Purpose |
 |---|---|---|
 | `extract_landscape` | 6 | YC + GitHub + HN + Reddit + Product Hunt + npm in parallel |
-| `extract_gov_landscape` | 4 | Gov contracts + HN + GitHub repos + changelog |
+| `extract_idea_landscape` | 6 | HN + YC + GitHub + Jobs + npm + Product Hunt — full idea validation |
+| `extract_gov_landscape` | 4 | Gov contracts + HN + GitHub + changelog |
 | `extract_finance_landscape` | 5 | Finance + HN + Reddit + GitHub + changelog |
-| `extract_company_landscape` | 5 | **The full picture on any company** — see below |
+| `extract_company_landscape` | 5 | The full picture on any company |
 ### Unique — not available in any other MCP server
-| Tool | Source | What it gets you |
+| Adapter | Source | What it returns |
 |---|---|---|
-| `extract_changelog` | GitHub Releases API / npm / auto-discover | Update history from any repo, package, or website |
+| `extract_changelog` | GitHub Releases / npm / auto-discover | Update history from any repo, package, or website |
 | `extract_govcontracts` | USASpending.gov | US federal contract awards — company, amount, agency, period |
 | `extract_sec_filings` | SEC EDGAR | 8-K filings — legally mandated material event disclosures |
-| `extract_gdelt` | GDELT Project | Global news intelligence — 100+ languages, every country, 15-min updates |
-| `extract_gebiz` | data.gov.sg | Singapore Government procurement tenders — open dataset, no auth |
----
-## extract_company_landscape
-The most complete single-call company analysis available in any MCP server. Five sources fired in parallel:
-1. **SEC EDGAR** — what did they legally just disclose (8-K filings)
-2. **USASpending.gov** — who is giving them government money
-3. **GDELT** — what is global news saying right now
-4. **Changelog** — are they actually shipping product
-5. **Yahoo Finance** — what is the market pricing in
-```
-Use extract_company_landscape with company "Palantir" and ticker "PLTR"
-```
-Real output from March 26, 2026:
-> **Q4 2025:** Revenue $1.407B (+70% YoY). US commercial +137%. Rule of 40 score: **127%**.
-> **Federal contracts:** $292.7M Army Maven Smart System · $252.5M CDAO · $145M ICE · $130M Air Force · more
-> **SEC filing:** Q4 earnings 8-K filed Feb 3, 2026 — GAAP net income $609M, 43% margin
-> **GDELT:** ICE/Medicaid data controversy, UK MoD security warning, NHS opposition — all timestamped
-> **PLTR:** ~$154–157 · Market cap ~$370B · P/E 244x · 52w range $66 → $207
-Bloomberg Terminal doesn't read commit history as a company health signal. This does.
+| `extract_gdelt` | GDELT Project | Global news intelligence — 100+ languages, 15-min updates |
+| `extract_gebiz` | data.gov.sg | Singapore Government procurement tenders — open dataset |
 ---
-## Quick Start
+## Quick start
-### Option A — Cloud (no install)
+### Cloud (no install)
 Add to your Claude Desktop config and restart:
@@ -124,9 +151,7 @@ Restart Claude. Done.
 > Prefer a guided setup? Visit **[freshcontext-site.pages.dev](https://freshcontext-site.pages.dev)** — 3 steps, no terminal.
----
-### Option B — Local (full Playwright)
+### Local (full Playwright)
 **Requires:** Node.js 18+ ([nodejs.org](https://nodejs.org))
@@ -164,16 +189,14 @@ Add to Claude Desktop config:
 }
 ```
----
-### Troubleshooting (Mac)
+#### Mac troubleshooting
 **"command not found: node"** — Use the full path:
 ```bash
 which node  # copy this output, replace "node" in config
 ```
-**Config file doesn't exist** — Create it:
+**Config file doesn't exist:**
 ```bash
 mkdir -p ~/Library/Application\ Support/Claude
 touch ~/Library/Application\ Support/Claude/claude_desktop_config.json
@@ -183,23 +206,17 @@ touch ~/Library/Application\ Support/Claude/claude_desktop_config.json
 ## Usage examples
-**Is anyone already building what you're building?**
+**Should I build this idea?**
 ```
-Use extract_landscape with topic "cashflow prediction saas"
+Use extract_idea_landscape with idea "procurement intelligence saas"
 ```
-Returns who's funded, what's trending, what repos exist, what packages are moving — all timestamped.
+Returns funding signal, pain signal, crowding signal, market signal, ecosystem signal, and launch signal — all timestamped.
 **Full company intelligence in one call:**
 ```
 Use extract_company_landscape with company "Palantir" and ticker "PLTR"
 ```
-SEC filings + federal contracts + global news + changelog + market data. The complete picture.
-**What's Singapore's government procuring right now?**
-```
-Use extract_gebiz with url "artificial intelligence"
-```
-Returns live tenders from the Ministry of Finance open dataset — agency, amount, closing date, all timestamped.
+SEC filings + federal contracts + global news + changelog + market data.
 **Did that company just disclose something material?**
 ```
@@ -207,90 +224,61 @@ Use extract_sec_filings with url "Palantir Technologies"
 ```
 8-K filings are legally mandated within 4 business days of any material event — CEO change, acquisition, breach, major contract.
-**What is global news saying about a company?**
-```
-Use extract_gdelt with url "Palantir"
-```
-100+ languages, every country, updated every 15 minutes. Surfaces what Western sources miss.
-**What's the community actually saying right now?**
-```
-Use extract_reddit on r/MachineLearning
-Use extract_hackernews to search "mcp server 2026"
-```
 **Is this dependency still actively maintained?**
 ```
 Use extract_changelog with url "https://github.com/org/repo"
 ```
 Returns the last 8 releases with exact dates. If the last release was 18 months ago, you'll know before you pin the version.
-**Which companies just won government contracts in AI?**
-```
-Use extract_govcontracts with url "artificial intelligence"
-```
-Largest recent federal contract awards matching that keyword — company, amount, agency, award date.
 ---
-## How freshness works
-Most AI tools retrieve data silently. No timestamp, no signal, no way for the agent to know how old it is.
-freshcontext treats **retrieval time as first-class metadata**. Every adapter returns:
-- `retrieved_at` — exact ISO timestamp of the fetch
-- `content_date` — best estimate of when the content was originally published
-- `freshness_confidence` — `high`, `medium`, or `low` based on signal quality
-- `freshness_score` — numeric 0–100 score with domain-specific decay rates
-- `adapter` — which source the data came from
-When confidence is `high`, the date came from a structured field (API, metadata). When it's `medium` or `low`, freshcontext tells you why.
-The FreshContext Specification v1.0 is published as an open standard under MIT license. Any tool or agent that wraps retrieved data in the `[FRESHCONTEXT]` envelope is FreshContext-compatible.
-→ [Read the spec](./FRESHCONTEXT_SPEC.md)
+## Deployment & infrastructure
----
-## Security
+The reference implementation runs on Cloudflare's global edge:
-- Input sanitization and domain allowlists on all adapters
-- SSRF prevention (blocked private IP ranges)
-- KV-backed global rate limiting: 60 req/min per IP across all edge nodes
-- No credentials required — all public data sources
+| Endpoint | Method | Purpose |
+|---|---|---|
+| `/` | GET | Service info + endpoint list |
+| `/health` | GET | Liveness check |
+| `/mcp` | POST | MCP JSON-RPC transport |
+| `/demo` | GET | Live before/after demo (no API key required) |
+| `/briefing` | GET | Latest stored briefing |
+| `/v1/intel/feed/:profile_id` | GET | DAR-scored intelligence feed |
+| `/watched-queries` | GET | List all watched queries |
+- **D1 database** — 18 watched queries running on 6-hour cron with relevancy scoring
+- **KV-backed rate limiting** — 60 req/min per IP across all edge nodes
+- **Defensive valves** — clock-skew rejection (5min tolerance), hard floor at R_t<5, lazy decay at read time
+- **Provenance** — Ha-Pri SHA-256 audit signatures on every signal
+- **Schema migrations** — promise-gated, idempotent, run on first request after deploy
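Reviewer note: the "defensive valves" bullet is terse, so here is one possible reading as a sketch. It assumes that clock-skew rejection means dropping signals timestamped more than 5 minutes in the future, that the hard floor means discarding signals whose R_t falls below 5, and that lazy decay means computing R_t from the stored `base_score` when the signal is read instead of rewriting stored rows. The README confirms none of these interpretations, and `scoreAtRead` is a hypothetical name.

```javascript
const CLOCK_SKEW_TOLERANCE_MS = 5 * 60 * 1000; // 5 min tolerance (from the README)
const RT_FLOOR = 5; // hard floor: R_t < 5 is dropped (assumed meaning)

// Returns the decay-adjusted score at read time, or null when a valve trips.
// lambdaPerHour is the source-specific decay constant, supplied by the caller.
function scoreAtRead(signal, lambdaPerHour, now = Date.now()) {
  const publishedMs = Date.parse(signal.published_at);
  if (Number.isNaN(publishedMs)) return null; // unparseable timestamp

  // Clock-skew valve: reject timestamps too far in the future.
  if (publishedMs - now > CLOCK_SKEW_TOLERANCE_MS) return null;

  // Lazy decay: derive R_t from the stored base score at read time.
  const hours = Math.max(0, now - publishedMs) / 3600000;
  const rt = signal.base_score * Math.exp(-lambdaPerHour * hours);

  // Hard floor: signals that have decayed below the floor are not served.
  return rt < RT_FLOOR ? null : rt;
}
```

Computing the score at read time means stored rows never go stale, at the cost of one exponential per signal per request.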
+Production: `https://freshcontext-mcp.gimmanuel73.workers.dev`
 ---
 ## Roadmap
-- [x] GitHub, HN, Scholar, YC, Reddit, Product Hunt, Finance, arXiv, Jobs adapters
-- [x] `extract_landscape` — 6-source composite tool
-- [x] `extract_changelog` — update cadence from any repo, package, or website
-- [x] `extract_govcontracts` — US federal contract intelligence via USASpending.gov
-- [x] `extract_sec_filings` — SEC EDGAR 8-K material event filings
-- [x] `extract_gdelt` — GDELT global news intelligence (100+ languages)
-- [x] `extract_gebiz` — Singapore Government procurement via data.gov.sg
-- [x] `extract_gov_landscape` — gov contracts + HN + GitHub + changelog composite
-- [x] `extract_finance_landscape` — finance + HN + Reddit + GitHub + changelog composite
-- [x] `extract_company_landscape` — 5-source company intelligence composite
-- [x] `freshness_score` numeric metric (0–100) with domain-specific decay rates
-- [x] Cloudflare Workers deployment — global edge with KV caching
-- [x] D1 database — 18 watched queries running on 6-hour cron
-- [x] Listed on official MCP Registry
-- [x] Listed on Apify Store
-- [x] FreshContext Specification v1.0 published
-- [x] GitHub Actions CI/CD — auto-publish to npm on every push
-- [ ] GKG upgrade for `extract_gdelt` — tone scores, goldstein scale, event codes
-- [ ] TTL-based caching layer
+- [x] FreshContext Specification v1.1 published (MIT, open standard)
+- [x] DAR engine with proprietary λ constants (v0.3.15)
+- [x] Ha-Pri audit signatures on every signal
+- [x] Semantic deduplication via fingerprinting
+- [x] Live before/after demo at `/demo`
+- [x] METHODOLOGY.md — formal IP and engineering documentation
+- [x] 20 reference adapters across intelligence, competitive research, market data, and composites
+- [x] Cloudflare Workers deployment — global edge, KV cache, KV rate limiting
+- [x] Listed on official MCP Registry, Apify Store, npm
+- [x] GitHub Actions CI/CD — auto-publish on every push
+- [ ] Webhook triggers — push high-entropy signals on threshold
 - [ ] Dashboard — React frontend for the D1 intelligence pipeline
-- [ ] Synthesis endpoint — `/briefing/now` AI-generated intelligence briefings
+- [ ] GKG upgrade for `extract_gdelt` — tone scores, goldstein scale, event codes
 ---
 ## Contributing
-PRs welcome. New adapters are the highest-value contribution — see `src/adapters/` for the pattern and `FRESHCONTEXT_SPEC.md` for the contract any adapter must fulfill.
+PRs welcome. New adapters are the highest-value contribution — see `src/adapters/` for the pattern and [`FRESHCONTEXT_SPEC.md`](./FRESHCONTEXT_SPEC.md) for the contract any adapter must fulfil.
+If you're building something FreshContext-compatible, open an issue and we'll add you to the ecosystem list.
 ---
@@ -302,3 +290,7 @@ MIT
 *Built by Prince Gabriel — Grootfontein, Namibia 🇳🇦*
 *"The work isn't gone. It's just waiting to be continued."*
+---
+**Also on:** [Apify Store](https://apify.com/prince_gabriel/freshcontext-mcp) · [MCP Registry](https://registry.modelcontextprotocol.io) · [npm](https://www.npmjs.com/package/freshcontext-mcp)