freshcontext-mcp 0.3.18 → 0.3.20

@@ -0,0 +1,317 @@
1
+ # The FreshContext Specification
2
+ **Version 1.2 — May 2026**
3
+ *Authored by Immanuel Gabriel (Prince Gabriel) — Grootfontein, Namibia*
4
+
5
+ ---
6
+
7
+ ## What This Is
8
+
9
+ The FreshContext Specification defines a standard envelope format for AI-retrieved web data and a temporal correction layer that scores it.
10
+
11
+ It exists to solve one problem: **AI models present stale data with the same confidence as fresh data, and users have no way to tell the difference.**
12
+
13
+ > **Illustrative demonstration:** [freshcontext-mcp.gimmanuel73.workers.dev/demo](https://freshcontext-mcp.gimmanuel73.workers.dev/demo) — same model, same query, different answers from different ranked context. The demo shows how FreshContext treats freshness signals.
14
+
15
+ FreshContext addresses this by wrapping retrieved content in a structured envelope that carries three explicit properties:
16
+
17
+ 1. **When** the data was retrieved (exact ISO 8601 timestamp)
18
+ 2. **Where** it came from (canonical source URL)
19
+ 3. **How confident** we are that the content date is accurate (freshness confidence)
20
+
21
+ Any tool, agent, or system that implements this spec is **FreshContext-compatible**.
22
+
23
+ The specification is one half of the system. The other half — the Decay-Adjusted Relevancy (DAR) engine that scores wrapped signals — is documented separately in [METHODOLOGY.md](./METHODOLOGY.md).
24
+
25
+ ---
26
+
27
+ ## The Envelope Format
28
+
29
+ Every FreshContext-compatible response MUST wrap its content in the following envelope:
30
+
31
+ ```
32
+ [FRESHCONTEXT]
33
+ Source: <canonical_url>
34
+ Published: <content_date_or_"unknown">
35
+ Retrieved: <iso8601_timestamp>
36
+ Confidence: <high|medium|low>
37
+ ---
38
+ <content>
39
+ [/FRESHCONTEXT]
40
+ ```
41
+
42
+ ### Field Definitions
43
+
44
+ | Field | Required | Format | Description |
45
+ |---|---|---|---|
46
+ | `Source` | Yes | Valid URL | The canonical URL of the original source |
47
+ | `Published` | Yes | ISO 8601 date or `"unknown"` | Best estimate of when the content was originally published |
48
+ | `Retrieved` | Yes | ISO 8601 datetime with timezone | Exact timestamp when this data was fetched |
49
+ | `Confidence` | Yes | `high`, `medium`, or `low` | Confidence level of the `Published` date estimate |
50
+
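A minimal sketch of how an implementation might render and re-parse the text envelope. The helper names `render_envelope` and `parse_envelope` are hypothetical and are not taken from the reference implementation; only the field names and delimiters come from the format above. The sketch also assumes the unknown marker is written without surrounding quotes.

```python
from datetime import datetime, timezone

FIELDS = ("Source", "Published", "Retrieved", "Confidence")


def render_envelope(source, published, retrieved, confidence, content):
    """Wrap content in a [FRESHCONTEXT] envelope (hypothetical helper)."""
    if confidence not in ("high", "medium", "low"):
        raise ValueError(f"invalid confidence: {confidence!r}")
    if retrieved.tzinfo is None:
        raise ValueError("Retrieved must carry a timezone")
    return "\n".join([
        "[FRESHCONTEXT]",
        f"Source: {source}",
        # Assumption: an absent date is written as the bare word unknown.
        f"Published: {published or 'unknown'}",
        f"Retrieved: {retrieved.isoformat()}",
        f"Confidence: {confidence}",
        "---",
        content,
        "[/FRESHCONTEXT]",
    ])


def parse_envelope(text):
    """Split an envelope back into its header fields and content."""
    lines = text.strip().split("\n")
    if lines[0] != "[FRESHCONTEXT]" or lines[-1] != "[/FRESHCONTEXT]":
        raise ValueError("missing envelope delimiters")
    sep = lines.index("---")  # the first separator ends the header
    header = dict(line.split(": ", 1) for line in lines[1:sep])
    missing = [f for f in FIELDS if f not in header]
    if missing:
        raise ValueError(f"missing required fields: {missing}")
    return header, "\n".join(lines[sep + 1:-1])


envelope = render_envelope(
    "https://github.com/owner/repo",
    "2026-03-05",
    datetime(2026, 3, 16, 9, 19, tzinfo=timezone.utc),
    "high",
    "Example repository summary.",
)
```

Rejecting a timezone-naive `Retrieved` value at render time keeps the "ISO 8601 datetime with timezone" requirement from being silently violated downstream.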
51
+ ---
52
+
53
+ ## Confidence Levels
54
+
55
+ ### `high`
56
+ The publication date was sourced from a structured, machine-readable field — an API response, HTML metadata tag, RSS feed, or official timestamp. The date is reliable.
57
+
58
+ *Examples: GitHub API `pushed_at`, arXiv submission date, Hacker News `created_at`, SEC EDGAR filing date, USASpending.gov award date*
59
+
60
+ ### `medium`
61
+ The publication date was inferred from page signals — visible date strings, URL patterns, or content heuristics. Likely correct but not guaranteed.
62
+
63
+ *Examples: Blog post date parsed from HTML, URL containing `/2025/03/`, footer copyright year*
64
+
65
+ ### `low`
66
+ No reliable date signal was found. The date is an estimate based on indirect signals or is entirely unknown.
67
+
68
+ *Examples: Static page with no date, scraped content with no metadata, cached result of unknown age*
69
+
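One way to apply these definitions is to record how a date was obtained and derive the confidence level from that. The `DateSource` categories below are a hypothetical grouping chosen for illustration; the specification itself defines only the three confidence levels.

```python
from enum import Enum


class DateSource(Enum):
    # Hypothetical categories for how a publication date was obtained.
    STRUCTURED_FIELD = "structured_field"  # API field, HTML meta tag, RSS
    PAGE_HEURISTIC = "page_heuristic"      # visible date string, URL pattern
    NONE = "none"                          # no usable date signal


def confidence_for(date_source: DateSource) -> str:
    """Map the origin of a date to a FreshContext confidence level."""
    if date_source is DateSource.STRUCTURED_FIELD:
        return "high"
    if date_source is DateSource.PAGE_HEURISTIC:
        return "medium"
    return "low"
```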
70
+ ---
71
+
72
+ ## Structured Form (JSON)
73
+
74
+ Implementations MAY additionally expose freshness metadata as structured JSON alongside the text envelope:
75
+
76
+ ```json
77
+ {
78
+ "freshcontext": {
79
+ "source_url": "https://github.com/owner/repo",
80
+ "content_date": "2026-03-05",
81
+ "retrieved_at": "2026-03-16T09:19:00.000Z",
82
+ "freshness_confidence": "high",
83
+ "adapter": "github",
84
+ "freshness_score": 94
85
+ },
86
+ "content": "..."
87
+ }
88
+ ```
89
+
90
+ ### `freshness_score` (optional)
91
+
92
+ A numeric representation of data freshness on a scale of 0 to 100. The canonical FreshContext model is exponential Decay-Adjusted Relevancy (DAR):
93
+
94
+ ```
95
+ R_t = R_0 · e^(-λt)
96
+ ```
97
+
98
+ Where:
99
+ - `R_0` is the starting relevance or freshness value.
100
+ - `λ` is the source-specific decay constant.
101
+ - `t` is the age of the signal, normally measured from `content_date` / `published_at` to retrieval or query time.
102
+ - `R_t` is the decay-adjusted relevance or freshness value.
103
+
104
+ For a simple envelope-level `freshness_score`, `R_0` MAY be normalised to `100`. For ranked intelligence feeds, `R_0` MAY instead be semantic base relevance, profile relevance, or adapter-specific utility. The displayed `freshness_score` remains normalised to 0-100 for compatibility.
105
+
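The DAR formula can be sketched in a few lines. This is an illustration under stated assumptions, not the reference implementation: the function names are hypothetical, `t` is measured in hours to match a per-hour `λ`, the result is rounded to the nearest integer, and the content date is assumed not to lie in the future.

```python
import math
from datetime import datetime, timezone


def dar(r0: float, decay_per_hour: float, age_hours: float) -> float:
    """Decay-adjusted relevancy: R_t = R_0 * e^(-lambda * t)."""
    return r0 * math.exp(-decay_per_hour * age_hours)


def freshness_score(content_date: datetime, now: datetime,
                    decay_per_hour: float) -> int:
    """Envelope-level score with R_0 normalised to 100.

    Assumes content_date <= now; future dates need separate handling.
    """
    age_hours = (now - content_date).total_seconds() / 3600
    return round(dar(100.0, decay_per_hour, age_hours))


published = datetime(2026, 3, 5, tzinfo=timezone.utc)
retrieved = datetime(2026, 3, 16, 9, 19, tzinfo=timezone.utc)
score = freshness_score(published, retrieved, decay_per_hour=0.0002)
```

The rounding rule is a choice made for this sketch; an implementation that truncates instead will report a score one point lower in some cases.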
106
+ FreshContext scoring MAY be context-conditioned. Implementations that rank multiple signals MAY compute `R_0` from the requesting user, query, agent, workflow, semantic similarity, profile relevance, or adapter-specific relevance before applying DAR. The envelope contract remains the same: systems SHOULD expose source, publication time, retrieval time, confidence, and freshness metadata regardless of how `R_0` is computed.
107
+
108
+ Implementations SHOULD use source-specific `λ` values to reflect how quickly different categories of data lose temporal utility.
109
+
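A sketch of context-conditioned ranking, assuming each signal already carries a base relevance `r0` computed elsewhere (for example from semantic similarity). The dictionary keys and the sample values are hypothetical and chosen only to show that an older signal from a slow-decaying source can outrank a newer one from a fast-decaying source.

```python
import math


def rank_signals(signals: list) -> list:
    """Sort signals by decay-adjusted relevancy, highest first."""
    def score(signal: dict) -> float:
        decay = math.exp(-signal["decay_per_hour"] * signal["age_hours"])
        return signal["r0"] * decay
    return sorted(signals, key=score, reverse=True)


# Illustrative values only: a two-day-old discussion thread with high base
# relevance versus a month-old paper with lower base relevance.
signals = [
    {"id": "hn-thread", "r0": 0.9, "decay_per_hour": 0.050, "age_hours": 48},
    {"id": "arxiv-paper", "r0": 0.6, "decay_per_hour": 0.00005, "age_hours": 720},
]
ranked = rank_signals(signals)
```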
110
+ #### Reference Decay Classes
111
+
112
+ These are reference/default classes. Implementations MAY tune them by source, user profile, or deployment while preserving the exponential DAR model.
113
+
114
+ | Source class | Example λ (per hour) | Approximate half-life | Examples |
115
+ |---|---|---|---|
116
+ | Fast discussion | 0.050 | ~14 hours | Hacker News front-page stories |
117
+ | News cycle | 0.020 | ~35 hours | GDELT global news |
118
+ | Community / launch signals | 0.010 | ~3 days | Reddit, Product Hunt |
119
+ | Jobs / material events | 0.005 | ~6 days | Job listings, SEC 8-K filings |
120
+ | Procurement / market context | 0.001 | ~29 days | Stooq quotes, gov contracts, YC company data |
121
+ | Package / changelog cadence | 0.0005 | ~58 days | npm/PyPI, GitHub Releases |
122
+ | Repository activity | 0.0002 | ~5 months | GitHub repositories |
123
+ | Academic work | 0.00005 | ~1.6 years | arXiv, Google Scholar |
124
+
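The half-life column follows from the decay constant: setting `R_t = R_0 / 2` in the DAR formula gives a half-life of `ln(2) / λ`. The short sketch below recomputes a few rows of the table; the helper name `half_life_hours` is hypothetical.

```python
import math


def half_life_hours(decay_per_hour: float) -> float:
    """Time for R_t to fall to half of R_0 under exponential decay."""
    return math.log(2) / decay_per_hour


# Example decay constants copied from the reference table above.
reference_lambdas = {
    "Fast discussion": 0.050,
    "News cycle": 0.020,
    "Jobs / material events": 0.005,
    "Academic work": 0.00005,
}

for name, lam in reference_lambdas.items():
    hours = half_life_hours(lam)
    print(f"{name}: {hours:.1f} h ({hours / 24:.1f} days)")
```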
125
+ #### Score Interpretation
126
+
127
+ | Score | Interpretation |
128
+ |---|---|
129
+ | 90–100 | Very recent for its source class — treat as current |
130
+ | 70–89 | Still fresh enough for most uses |
131
+ | 50–69 | Decayed but potentially useful — verify before acting |
132
+ | Below 50 | Low temporal utility — use with caution |
133
+
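The bands above can be expressed as a small lookup. The function name and the returned labels are hypothetical shorthand for the table's wording, and the label for a missing score is an assumption of this sketch.

```python
from typing import Optional


def interpret(score: Optional[float]) -> str:
    """Map a 0-100 freshness score to the interpretation bands above."""
    if score is None:
        return "unknown freshness"  # assumption: label for a null score
    if score >= 90:
        return "treat as current"
    if score >= 70:
        return "fresh enough for most uses"
    if score >= 50:
        return "verify before acting"
    return "use with caution"
```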
134
+ ### Missing, Invalid, and Future Timestamps
135
+
136
+ FreshContext-compatible systems MUST NOT treat missing or invalid timestamps as fresh.
137
+
138
+ - If no reliable timestamp exists, implementations SHOULD set `freshness_score` to `null` and `freshness_confidence` to `low`.
139
+ - Missing timestamps MUST NOT be treated as current.
140
+ - Invalid timestamps MUST NOT be treated as current.
141
+ - Future timestamps beyond a small clock-skew tolerance MUST NOT be treated as current.
142
+ - Implementations SHOULD surface timestamp uncertainty explicitly.
143
+
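The rules above might be enforced with a guard like the following sketch. The function name `safe_score` is hypothetical, the five-minute tolerance is an illustrative value (the specification does not fix one), and timezone-naive dates are assumed to be UTC.

```python
import math
from datetime import datetime, timedelta, timezone
from typing import Optional

# Assumption: the spec does not fix a tolerance; five minutes is illustrative.
CLOCK_SKEW = timedelta(minutes=5)


def safe_score(content_date: Optional[str], now: datetime,
               decay_per_hour: float) -> Optional[int]:
    """Return a 0-100 score, or None when the timestamp cannot be trusted.

    `now` must be timezone-aware.
    """
    if not content_date:
        return None  # missing timestamp: never treated as current
    try:
        published = datetime.fromisoformat(content_date)
    except ValueError:
        return None  # invalid timestamp
    if published.tzinfo is None:
        published = published.replace(tzinfo=timezone.utc)  # assume UTC
    if published > now + CLOCK_SKEW:
        return None  # future-dated beyond the tolerance
    # Clamp small negative ages (within the tolerance) to zero.
    age_hours = max((now - published).total_seconds(), 0) / 3600
    return round(100 * math.exp(-decay_per_hour * age_hours))
```

A caller receiving `None` would also set `freshness_confidence` to `low`, so that the missing score is surfaced rather than hidden.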
144
+ ---
145
+
146
+ ## Adapter Contract
147
+
148
+ Any data source that feeds into a FreshContext-compatible system is called an **adapter**. Adapters MUST:
149
+
150
+ 1. Return raw content plus a `content_date` (or `null` if unknown)
151
+ 2. Set a `freshness_confidence` level based on how the date was determined
152
+ 3. Never fabricate or forward-date content timestamps
153
+ 4. Clearly identify which source system produced the data via the `adapter` field
154
+ 5. Surface retrieval failures explicitly
155
+
156
+ Adapters SHOULD:
157
+
158
+ - Prefer structured API sources over scraped content when both are available
159
+ - Log retrieval errors rather than silently returning cached or stale data
160
+ - Surface rate-limit or access-denied errors explicitly rather than returning empty content
161
+ - Use source-specific decay constants from the reference decay classes above
162
+
163
+ ### Failure Honesty
164
+
165
+ Adapters MUST NOT wrap failed, empty, blocked, timeout, rate-limited, access-denied, or malformed output as high-confidence fresh context.
166
+
167
+ If adapter output looks like an error, implementations SHOULD:
168
+
169
+ - Downgrade `freshness_confidence` to `low`
170
+ - Set `freshness_score` to `null`
171
+ - Preserve enough error context for diagnosis
172
+ - Avoid presenting the result as successful current context
173
+
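A sketch of a failure guard along these lines. The pattern list is a crude, hypothetical heuristic and the `error` key is not part of the specification; real adapters should prefer explicit status codes from the upstream source where they exist.

```python
import re

# Hypothetical heuristics; real adapters should prefer explicit status codes.
ERROR_PATTERN = re.compile(
    r"rate limit|access denied|forbidden|timed? ?out|too many requests",
    re.IGNORECASE,
)


def guard_result(content: str, freshness: dict) -> dict:
    """Downgrade freshness metadata when adapter output looks like an error."""
    text = content.strip()
    match = ERROR_PATTERN.search(text)
    if text and not match:
        return {**freshness, "content": content}
    return {
        **freshness,
        "freshness_confidence": "low",
        "freshness_score": None,
        # Hypothetical field: keeps enough context for diagnosis.
        "error": match.group(0) if match else "empty content",
        "content": content,
    }
```

Text matching can misfire on legitimate content that merely mentions rate limits or timeouts, which is one reason to treat this only as a last line of defence.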
174
+ ---
175
+
176
+ ## Composite Adapters
177
+
178
+ A **composite adapter** is a FreshContext-compatible adapter that calls multiple upstream adapters in parallel and combines their results into a single unified response. Each upstream result MUST retain its own FreshContext envelope — the composite wrapper MUST NOT collapse individual timestamps into a single envelope.
179
+
180
+ Composite adapters SHOULD:
181
+
182
+ - Fire all upstream calls in parallel (e.g. `Promise.allSettled`)
183
+ - Handle partial failures gracefully — if one upstream fails, return the rest
184
+ - Label each section clearly with its source adapter name
185
+ - Include a composite `retrieved_at` timestamp representing the time the composite call was initiated
186
+
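The same pattern can be sketched in Python, where `asyncio.gather(..., return_exceptions=True)` plays a role similar to `Promise.allSettled`. The function `run_composite`, the result layout, and the two fake adapters are hypothetical and exist only to exercise the sketch.

```python
import asyncio
from datetime import datetime, timezone


async def run_composite(adapters: dict) -> dict:
    """Call every upstream adapter concurrently and keep per-source results."""
    started = datetime.now(timezone.utc)  # composite retrieved_at
    names = list(adapters)
    # return_exceptions=True collects failures instead of aborting the batch.
    results = await asyncio.gather(
        *(adapters[name]() for name in names), return_exceptions=True
    )
    sections = {}
    for name, result in zip(names, results):
        if isinstance(result, Exception):
            sections[name] = {"error": str(result)}  # partial failure surfaced
        else:
            sections[name] = result  # upstream envelope kept intact
    return {"retrieved_at": started.isoformat(), "sections": sections}


# Hypothetical upstream adapters, used only to exercise the sketch.
async def fake_github():
    return "[FRESHCONTEXT]...[/FRESHCONTEXT]"


async def fake_news():
    raise TimeoutError("upstream timed out")


report = asyncio.run(run_composite({"github": fake_github, "news": fake_news}))
```

Keying the sections by adapter name labels each result with its source, and the failed upstream appears as an explicit error rather than being dropped.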
187
+ *Examples in the reference implementation: `extract_landscape`, `extract_company_landscape`, `extract_idea_landscape`*
188
+
189
+ ---
190
+
191
+ ## Why This Matters for AI Agents
192
+
193
+ Large language models have no internal clock. When an agent retrieves web data, it cannot distinguish between something published this morning and something published three years ago — unless that information is explicitly surfaced.
194
+
195
+ Without FreshContext (or equivalent):
196
+ - An agent recommending job listings may recommend roles that no longer exist
197
+ - An agent summarising market trends may cite conditions from a previous cycle
198
+ - An agent checking a competitor's pricing may act on outdated information
199
+ - An agent synthesising news may present last year's controversy as current
200
+
201
+ With FreshContext:
202
+ - Every piece of retrieved data carries its own timestamp
203
+ - The agent can reason about data age before acting
204
+ - Users can see exactly how fresh their AI's information is
205
+ - Composite intelligence reports carry per-source freshness signals
206
+
207
+ ---
208
+
209
+ ## Compatibility
210
+
211
+ A tool, server, or API is **FreshContext-compatible** if:
212
+
213
+ - Its responses include the `[FRESHCONTEXT]...[/FRESHCONTEXT]` envelope, OR
214
+ - Its responses include the structured JSON form with `freshcontext.retrieved_at` and `freshcontext.freshness_confidence` fields
215
+
216
+ Partial implementations that include only `retrieved_at` without `freshness_confidence` are considered **FreshContext-aware** but not fully compatible.
217
+
218
+ ### Compatibility Levels
219
+
220
+ | Level | Requirements |
221
+ |---|---|
222
+ | **FreshContext-compatible** | Full envelope OR full JSON form with `retrieved_at` + `freshness_confidence` |
223
+ | **FreshContext-aware** | Includes `retrieved_at` but not `freshness_confidence` |
224
+ | **FreshContext-scored** | Fully compatible, plus a numeric `freshness_score` computed with domain-specific decay |
225
+
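A sketch of how a consumer might classify a structured JSON response against these levels. The function name and the `"none"` fallback label are hypothetical. The check covers only the JSON form, and it cannot verify that a score was produced with domain-specific decay, so a numeric `freshness_score` is taken at face value.

```python
def compatibility_level(payload: dict) -> str:
    """Classify a structured JSON response against the levels above."""
    meta = payload.get("freshcontext") or {}
    has_retrieved = bool(meta.get("retrieved_at"))
    has_confidence = meta.get("freshness_confidence") in ("high", "medium", "low")
    if has_retrieved and has_confidence:
        if isinstance(meta.get("freshness_score"), (int, float)):
            return "FreshContext-scored"
        return "FreshContext-compatible"
    if has_retrieved:
        return "FreshContext-aware"
    return "none"  # assumption: label for responses below the aware level
```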
226
+ MCP is one interface, not the whole FreshContext model. The envelope and structured JSON metadata are the compatibility contract. FreshContext-compatible systems may be MCP servers, APIs, CLIs, npm packages, dashboards, agents, or internal services.
227
+
228
+ ---
229
+
230
+ ## Core-backed Implementation Status
231
+
232
+ FreshContext Core now owns the envelope/scoring layer in the reference implementation:
233
+
234
+ - Freshness scoring
235
+ - Envelope generation
236
+ - Confidence metadata
237
+ - Failure guards
238
+ - Shared types
239
+ - Rank/explain primitives
240
+
241
+ The live MCP Worker uses Core-backed envelope generation while Cloudflare-specific runtime behavior, MCP transport, cache policy, rate limits, and deployment concerns remain outside Core.
242
+
243
+ Claude Desktop is a supported MCP client, but FreshContext Core is not Claude-dependent. MCP is one interface over the methodology; FreshContext-compatible systems may also expose results through npm packages, APIs, CLIs, dashboards, or other agent runtimes.
244
+
245
+ ---
246
+
247
+ ## Reference Implementation
248
+
249
+ The canonical reference implementation of this specification is:
250
+
251
+ **freshcontext-mcp@0.3.19** — an MCP server with `evaluate_context` for caller-provided candidate context plus 21 read-only reference adapters covering:
252
+
253
+ **Intelligence:** GitHub, Hacker News, Google Scholar, arXiv, Reddit
254
+
255
+ **Competitive research:** YC Companies, Product Hunt, GitHub repo search, npm/PyPI package trends
256
+
257
+ **Market data:** Stooq quote data (up to 5 tickers), job listings (Remotive, RemoteOK, HN Hiring)
258
+
259
+ **Official, regulatory, and procurement sources:**
260
+ - `extract_changelog` — release history from any repo, npm package, or website
261
+ - `extract_govcontracts` — US federal contract awards (USASpending.gov)
262
+ - `extract_sec_filings` — SEC 8-K material event disclosures (EDGAR)
263
+ - `extract_gdelt` — global news intelligence, 100+ languages, updated every 15 minutes
264
+ - `extract_gebiz` — Singapore Government procurement (data.gov.sg)
265
+
266
+ **Composite landscapes:** `extract_landscape`, `extract_idea_landscape`, `extract_gov_landscape`, `extract_finance_landscape`, `extract_company_landscape`
267
+
268
+ **Deployment:**
269
+ - npm: `freshcontext-mcp`
270
+ - GitHub: https://github.com/PrinceGabriel-lgtm/freshcontext-mcp
271
+ - Cloud endpoint: `https://freshcontext-mcp.gimmanuel73.workers.dev/mcp`
272
+ - MCP Registry: `io.github.PrinceGabriel-lgtm/freshcontext`
273
+
274
+ ---
275
+
276
+ ## Changelog
277
+
278
+ ### Version 1.2 — May 2026
279
+ - Clarified exponential DAR scoring as the canonical freshness model.
280
+ - Added missing/invalid/future timestamp handling guidance.
281
+ - Added failure-honesty requirements for adapter output.
282
+ - Added Core-backed envelope/scoring implementation status.
283
+ - Added optional context-conditioned scoring language without changing the envelope contract.
284
+ - Clarified MCP as one interface over the FreshContext methodology.
285
+ - Updated reference implementation language for freshcontext-mcp@0.3.19, `evaluate_context`, and 21 read-only reference adapters.
286
+
287
+ ### Version 1.1 — April 2026
288
+ - Added Composite Adapters section
289
+ - Added domain-specific decay rate table with recommended values
290
+ - Added Compatibility Levels table (compatible / aware / scored)
291
+ - Updated reference implementation to 21 reference adapter tools
292
+ - Added `extract_gdelt`, `extract_gebiz`, `extract_sec_filings` to high-confidence examples
293
+ - Added Apify Store and MCP Registry to reference implementation listings
294
+
295
+ ### Version 1.0 — March 2026
296
+ - Initial specification published
297
+
298
+ ---
299
+
300
+ ## Versioning
301
+
302
+ This document is version 1.2 of the FreshContext Specification.
303
+
304
+ Future versions will be tagged in this repository. Breaking changes to the envelope format will increment the major version. Additive changes (new optional fields, new confidence levels, new recommended values) will increment the minor version.
305
+
306
+ ---
307
+
308
+ ## License
309
+
310
+ This specification is published under the MIT License.
311
+ Implementations may be proprietary or open source.
312
+ Attribution to the FreshContext Specification is appreciated but not required.
313
+
314
+ ---
315
+
316
+ *"The work isn't gone. It's just waiting to be continued."*
317
+ *— Prince Gabriel, Grootfontein, Namibia*