freshcontext-mcp 0.3.16 → 0.3.18

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (60)
  1. package/.env.example +3 -0
  2. package/LICENSE +21 -0
  3. package/NOTICE.md +17 -0
  4. package/README.md +395 -296
  5. package/SECURITY.md +34 -0
  6. package/TRADEMARKS.md +9 -0
  7. package/dist/adapters/arxiv.js +92 -48
  8. package/dist/adapters/finance.js +87 -101
  9. package/dist/adapters/gdelt.js +1 -1
  10. package/dist/adapters/gebiz.js +1 -1
  11. package/dist/adapters/hackernews.js +59 -29
  12. package/dist/adapters/productHunt.js +8 -4
  13. package/dist/adapters/registry.js +232 -0
  14. package/dist/adapters/repoSearch.js +1 -1
  15. package/dist/adapters/secFilings.js +1 -1
  16. package/dist/core/decay.js +61 -0
  17. package/dist/core/decision.js +176 -0
  18. package/dist/core/envelope.js +59 -0
  19. package/dist/core/explain.js +28 -0
  20. package/dist/core/guards.js +17 -0
  21. package/dist/core/index.js +11 -0
  22. package/dist/core/pipeline.js +101 -0
  23. package/dist/core/provenance.js +73 -0
  24. package/dist/core/rank.js +84 -0
  25. package/dist/core/signal.js +101 -0
  26. package/dist/core/sourceProfiles.js +126 -0
  27. package/dist/core/types.js +1 -0
  28. package/dist/core/utility.js +90 -0
  29. package/dist/rest/handler.js +126 -0
  30. package/dist/security.js +1 -1
  31. package/dist/server.js +10 -10
  32. package/dist/tools/freshnessStamp.js +1 -117
  33. package/dist/types.js +0 -1
  34. package/docs/API_DESIGN.md +434 -0
  35. package/docs/CODEX_MCP_USAGE.md +116 -0
  36. package/docs/CORE_API.md +224 -0
  37. package/docs/DEPENDENCY_DILIGENCE.md +63 -0
  38. package/docs/HA_PRI_V2_DESIGN.md +279 -0
  39. package/docs/OPERATIONAL_DEMO_RUNBOOK.md +458 -0
  40. package/docs/RELEASE_INTEGRITY.md +53 -0
  41. package/docs/RELEASE_NOTES.md +38 -0
  42. package/docs/SIGNAL_CONTRACT.md +89 -0
  43. package/docs/SOURCE_PROFILES.md +427 -0
  44. package/freshcontext.schema.json +103 -103
  45. package/package-script-guard.mjs +140 -0
  46. package/package.json +92 -52
  47. package/server.json +27 -28
  48. package/.github/workflows/publish.yml +0 -32
  49. package/RESEARCH.md +0 -487
  50. package/RISKS.md +0 -137
  51. package/cleanup.ps1 +0 -99
  52. package/demo/README.md +0 -70
  53. package/demo/data.json +0 -88
  54. package/demo/generate.mjs +0 -199
  55. package/demo/index.html +0 -513
  56. package/demo/logo-export.html +0 -61
  57. package/demo/logo.svg +0 -23
  58. package/dist/apify.js +0 -133
  59. package/freshcontext-validate.js +0 -196
  60. package/time-check.ps1 +0 -46
package/docs/API_DESIGN.md
@@ -0,0 +1,434 @@
+ # FreshContext REST API Design
+
+ Status: design only
+
+ FreshContext REST is a future host around FreshContext Core. It should make the Core pipeline easy to use over HTTP without moving runtime, cache, storage, billing, dashboard, or adapter behavior into Core.
+
+ ## Purpose
+
+ FreshContext turns raw retrieval results into freshness-ranked, provenance-aware context for agents.
+
+ The REST host should expose the simplest useful path:
+
+ 1. User has raw retrieved signals.
+ 2. User sends those signals to FreshContext.
+ 3. Core evaluates freshness, confidence, ranking, explanation, optional envelope, and optional provenance.
+ 4. Host returns ranked context.
+ 5. Agent or app uses best context first.
+
+ ## Product Spine
+
+ ```text
+ raw signals
+ -> FreshContext Core evaluation
+ -> freshness-ranked, explained, provenance-aware context
+ -> agent / app / workflow
+ ```
+
+ REST should wrap Core. It should not become a crawler, dashboard, cache, Store, billing system, or Worker replacement.
+
+ ## Endpoint Table
+
+ | Method | Path | Purpose | Core function |
+ |---|---|---|---|
+ | POST | `/v1/evaluate` | Evaluate one signal | `evaluateSignal` |
+ | POST | `/v1/evaluate-batch` | Evaluate and rank multiple signals | `evaluateSignals` |
+ | POST | `/v1/stamp` | Produce FreshContext envelope text and JSON | `evaluateSignal` with `includeEnvelope: true` |
+ | GET | `/v1/health` | Return host health/version | host-only |
+ | GET | `/v1/spec` | Return static spec metadata and docs links | host-only |
+
+ ## POST /v1/evaluate
+
+ Evaluates one signal.
+
+ ### Request
+
+ ```json
+ {
+   "signal": {
+     "id": "sig_001",
+     "source": "https://example.com/article",
+     "source_type": "blog",
+     "title": "Example retrieved result",
+     "content": "Raw retrieved content...",
+     "published_at": "2026-05-24T12:00:00.000Z",
+     "retrieved_at": "2026-05-24T13:00:00.000Z",
+     "semantic_score": 0.84,
+     "date_confidence": "high",
+     "status": "success",
+     "metadata": {
+       "query": "browser agents"
+     }
+   },
+   "options": {
+     "includeEnvelope": true,
+     "includeProvenance": false
+   }
+ }
+ ```
+
+ ### Response
+
+ Shape: `CoreSignalEvaluationResult`.
+
+ ```json
+ {
+   "signal": {
+     "contract_version": "freshcontext.signal.v1",
+     "id": "sig_001",
+     "source": "https://example.com/article",
+     "source_type": "blog",
+     "title": "Example retrieved result",
+     "content": "Raw retrieved content...",
+     "published_at": "2026-05-24T12:00:00.000Z",
+     "retrieved_at": "2026-05-24T13:00:00.000Z",
+     "semantic_score": 0.84,
+     "date_confidence": "high",
+     "status": "success",
+     "metadata": {
+       "query": "browser agents"
+     },
+     "reasons": []
+   },
+   "freshness_score": 98,
+   "utility": {
+     "score": 83.6,
+     "contextualRelevance": 84,
+     "decayFactor": 0.995,
+     "dateConfidenceFactor": 1,
+     "statusFactor": 1,
+     "lambda": 0.001,
+     "ageHours": 1,
+     "status": "success",
+     "reasons": []
+   },
+   "ranked": {
+     "id": "sig_001",
+     "source": "https://example.com/article",
+     "source_type": "blog",
+     "title": "Example retrieved result",
+     "content": "Raw retrieved content...",
+     "published_at": "2026-05-24T12:00:00.000Z",
+     "retrieved_at": "2026-05-24T13:00:00.000Z",
+     "semantic_score": 0.84,
+     "date_confidence": "high",
+     "status": "success",
+     "metadata": {
+       "query": "browser agents"
+     },
+     "freshness_score": 98,
+     "final_score": 0.882,
+     "confidence": "high",
+     "reason": "Strong semantic match and current freshness for blog."
+   },
+   "explanation": "Strong semantic match and current freshness for blog.",
+   "envelope": {
+     "context": {},
+     "text": "[FRESHCONTEXT]...",
+     "structured": {}
+   },
+   "reasons": []
+ }
+ ```
+
+ The REST host must not fetch upstream data, cache results, write D1, enforce Ha-Pri, or alter ranking policy.
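
Editorial sketch (not part of the package diff): a minimal client-side illustration of the `/v1/evaluate` request shown above. The REST host is design-only, so `BASE_URL` is a hypothetical placeholder and `build_evaluate_request` is an illustrative helper name, not a FreshContext API. The code only builds the request object and does not send it.

```python
import json
import urllib.request

# Hypothetical base URL: the REST host is design-only, so nothing listens here.
BASE_URL = "https://freshcontext.example/v1"


def build_evaluate_request(signal: dict, include_envelope: bool = False,
                           include_provenance: bool = False) -> urllib.request.Request:
    """Build (but do not send) a POST /v1/evaluate request for one signal."""
    body = {
        "signal": signal,
        "options": {
            "includeEnvelope": include_envelope,
            "includeProvenance": include_provenance,
        },
    }
    return urllib.request.Request(
        url=f"{BASE_URL}/evaluate",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_evaluate_request(
    {
        "id": "sig_001",
        "source": "https://example.com/article",
        "source_type": "blog",
        "content": "Raw retrieved content...",
        "published_at": "2026-05-24T12:00:00.000Z",
        "retrieved_at": "2026-05-24T13:00:00.000Z",
        "semantic_score": 0.84,
    },
    include_envelope=True,
)
print(request.get_method(), request.full_url)
```

Both options default to off here because the design treats the envelope and provenance as optional.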
+
+ ## POST /v1/evaluate-batch
+
+ Evaluates and ranks multiple signals.
+
+ ### Request
+
+ ```json
+ {
+   "signals": [
+     {
+       "id": "sig_a",
+       "source": "https://example.com/fresh",
+       "source_type": "blog",
+       "content": "Fresh relevant content...",
+       "published_at": "2026-05-24T12:00:00.000Z",
+       "retrieved_at": "2026-05-24T13:00:00.000Z",
+       "semantic_score": 0.8
+     },
+     {
+       "id": "sig_b",
+       "source": "https://example.com/old",
+       "source_type": "blog",
+       "content": "Older relevant content...",
+       "published_at": "2025-01-01T00:00:00.000Z",
+       "retrieved_at": "2026-05-24T13:00:00.000Z",
+       "semantic_score": 0.9
+     }
+   ],
+   "options": {
+     "includeEnvelope": false,
+     "includeProvenance": false
+   }
+ }
+ ```
+
+ ### Response
+
+ ```json
+ {
+   "evaluations": [
+     {
+       "signal": {},
+       "freshness_score": 98,
+       "utility": {},
+       "ranked": {
+         "final_score": 0.85
+       },
+       "explanation": "Strong semantic match and current freshness for blog.",
+       "reasons": []
+     }
+   ]
+ }
+ ```
+
+ `evaluate-batch` must use `evaluateSignals`. Ordering follows `ranked.final_score`, with stable tie ordering. `utility.score` is sidecar output and must not be used as default ordering.
+
+ The REST host must not fetch upstream data, add host-specific scoring, replace Core ranking, or silently enable utility-weighted ranking.
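
Editorial sketch (not part of the package diff): the ordering rule above, expressed as a reference check that a parity test could compare against `evaluateSignals` output. It assumes highest `ranked.final_score` comes first, which the doc implies but does not state. The host should return Core's ordering and not re-sort. The function name and the scores below are made up for illustration.

```python
def order_by_final_score(evaluations: list[dict]) -> list[dict]:
    """Sort highest ranked.final_score first; ties keep their input order."""
    # sorted() is stable, and reverse=True preserves the input order of equal keys.
    return sorted(evaluations, key=lambda e: e["ranked"]["final_score"], reverse=True)


# Illustrative values only. Note that utility.score disagrees with final_score
# on purpose: it is sidecar output and plays no part in the ordering.
evaluations = [
    {"ranked": {"id": "sig_b", "final_score": 0.70}, "utility": {"score": 90.0}},
    {"ranked": {"id": "sig_a", "final_score": 0.85}, "utility": {"score": 80.0}},
    {"ranked": {"id": "sig_c", "final_score": 0.70}, "utility": {"score": 95.0}},
]

ordered_ids = [e["ranked"]["id"] for e in order_by_final_score(evaluations)]
print(ordered_ids)  # ['sig_a', 'sig_b', 'sig_c']
```

`sig_b` stays ahead of `sig_c` because the two tie on `final_score` and the sort is stable, even though `sig_c` has the higher `utility.score`.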
+
+ ## POST /v1/stamp
+
+ Produces a FreshContext envelope and structured JSON for one result.
+
+ ### Request
+
+ ```json
+ {
+   "signal": {
+     "source": "https://example.com/source",
+     "source_type": "blog",
+     "content": "Raw content to wrap...",
+     "published_at": "2026-05-24T12:00:00.000Z",
+     "retrieved_at": "2026-05-24T13:00:00.000Z",
+     "semantic_score": 0.75,
+     "date_confidence": "high"
+   },
+   "options": {
+     "envelopeMaxLength": 8000,
+     "envelopeFormat": {
+       "publishedLabel": "Published",
+       "unknownDateText": "Publish date: unknown"
+     }
+   }
+ }
+ ```
+
+ ### Response
+
+ ```json
+ {
+   "context": {
+     "content": "Raw content to wrap...",
+     "source_url": "https://example.com/source",
+     "content_date": "2026-05-24T12:00:00.000Z",
+     "retrieved_at": "2026-05-24T13:00:00.000Z",
+     "freshness_confidence": "high",
+     "freshness_score": 98,
+     "adapter": "blog"
+   },
+   "text": "[FRESHCONTEXT]...",
+   "structured": {
+     "freshcontext": {
+       "source_url": "https://example.com/source",
+       "content_date": "2026-05-24T12:00:00.000Z",
+       "retrieved_at": "2026-05-24T13:00:00.000Z",
+       "freshness_confidence": "high",
+       "freshness_score": 98,
+       "adapter": "blog"
+     },
+     "content": "Raw content to wrap..."
+   }
+ }
+ ```
+
+ Implementation should prefer `evaluateSignal(..., { includeEnvelope: true })`.
+
+ This endpoint must not become a cache metadata endpoint. Cache status, cache age, TTL, key version, KV/D1 state, and Worker-specific cache metadata belong to hosts that own cache policy.
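
Editorial sketch (not part of the package diff): in the example response above, `structured.freshcontext` carries the same fields as `context` minus `content`, and `structured.content` repeats `context.content`. The snippet below reproduces only that observed relationship. It is inferred from the example alone; Core's actual envelope generation may differ, and `structured_from_context` is a hypothetical name.

```python
def structured_from_context(context: dict) -> dict:
    """Split a context object into metadata and content, as in the example response."""
    metadata = {key: value for key, value in context.items() if key != "content"}
    return {"freshcontext": metadata, "content": context["content"]}


context = {
    "content": "Raw content to wrap...",
    "source_url": "https://example.com/source",
    "content_date": "2026-05-24T12:00:00.000Z",
    "retrieved_at": "2026-05-24T13:00:00.000Z",
    "freshness_confidence": "high",
    "freshness_score": 98,
    "adapter": "blog",
}

structured = structured_from_context(context)
print(sorted(structured["freshcontext"]))
```

No cache fields appear in either object, which matches the rule that cache metadata belongs to hosts that own cache policy.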
+
+ ## GET /v1/health
+
+ Returns host health and version metadata.
+
+ ### Response
+
+ ```json
+ {
+   "ok": true,
+   "service": "freshcontext-rest",
+   "version": "0.1.0",
+   "core_available": true
+ }
+ ```
+
+ The health endpoint must not expose secrets, tokens, environment values, private diagnostics, Cloudflare internals, or account metadata.
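
Editorial sketch (not part of the package diff): one way a host could honor the rule above is to build the health body from an explicit allowlist, so any extra internal state is dropped. This is only a sketch under that assumption; `HEALTH_FIELDS` and `health_payload` are hypothetical names, and the extra keys in the example state are invented to show the filtering.

```python
# Only the four fields shown in the example health response are allowed through.
HEALTH_FIELDS = ("ok", "service", "version", "core_available")


def health_payload(state: dict) -> dict:
    """Return only allowlisted health fields; everything else is discarded."""
    return {field: state[field] for field in HEALTH_FIELDS if field in state}


# Hypothetical internal state that includes values which must never be returned.
internal_state = {
    "ok": True,
    "service": "freshcontext-rest",
    "version": "0.1.0",
    "core_available": True,
    "api_token": "not-for-clients",
    "account_id": "not-for-clients",
}

print(health_payload(internal_state))
```

An allowlist fails closed: a field added to internal state later stays private until someone deliberately adds it to `HEALTH_FIELDS`.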
+
+ ## GET /v1/spec
+
+ Returns static spec metadata and documentation links.
+
+ ### Response
+
+ ```json
+ {
+   "service": "freshcontext-rest",
+   "spec_version": "1.2",
+   "signal_contract": "freshcontext.signal.v1",
+   "docs": {
+     "core_api": "/docs/CORE_API.md",
+     "signal_contract": "/docs/SIGNAL_CONTRACT.md",
+     "freshcontext_spec": "/FRESHCONTEXT_SPEC.md",
+     "methodology": "/METHODOLOGY.md"
+   }
+ }
+ ```
+
+ This endpoint must not become a dynamic registry, dashboard, account page, or source catalog.
+
+ ## Error Shape
+
+ REST errors should use one stable JSON shape:
+
+ ```json
+ {
+   "error": {
+     "code": "invalid_request",
+     "message": "Request body must include signal.",
+     "details": []
+   }
+ }
+ ```
+
+ Recommended initial codes:
+
+ | HTTP | Code | Meaning |
+ |---:|---|---|
+ | 400 | `invalid_request` | Missing or malformed JSON/body fields |
+ | 405 | `method_not_allowed` | Wrong method for endpoint |
+ | 413 | `payload_too_large` | Request exceeds host limit |
+ | 415 | `unsupported_media_type` | Non-JSON request body |
+ | 500 | `internal_error` | Unexpected host error |
+
+ Do not encode Core ranking/freshness uncertainty as HTTP errors. Missing timestamps, failed content, low confidence, and omitted provenance are valid evaluation outcomes and should appear in the Core result with reasons.
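
Editorial sketch (not part of the package diff): a small helper that produces the stable error shape and pairs each recommended code with its HTTP status from the table above. `ERROR_STATUS` and `error_response` are illustrative names, and rejecting unknown codes with `KeyError` is an assumption, not something the design specifies.

```python
# Code-to-status mapping copied from the recommended initial codes table.
ERROR_STATUS = {
    "invalid_request": 400,
    "method_not_allowed": 405,
    "payload_too_large": 413,
    "unsupported_media_type": 415,
    "internal_error": 500,
}


def error_response(code: str, message: str, details=None) -> tuple:
    """Return (http_status, body) using the single stable error shape."""
    status = ERROR_STATUS[code]  # KeyError for codes outside the table (assumed behavior)
    body = {
        "error": {
            "code": code,
            "message": message,
            "details": list(details) if details else [],
        }
    }
    return status, body


status, body = error_response("invalid_request", "Request body must include signal.")
print(status, body)
```

A signal with a missing timestamp would not pass through this helper at all: under the rule above it is a valid evaluation outcome, returned in the Core result with reasons.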
+
+ ## REST Non-Goals
+
+ The first REST design does not include:
+
+ - dashboard
+ - auth
+ - tenancy
+ - billing
+ - webhooks
+ - D1 persistence
+ - cache policy
+ - feed replacement
+ - production Ha-Pri enforcement
+ - utility-weighted ranking
+ - vector database
+ - adapter fetching
+ - crawler/scraper orchestration
+ - Worker runtime migration
+
+ ## Host / Core Boundary
+
+ Core owns:
+
+ - signal normalization
+ - timestamp/future-date/failure guards
+ - freshness scoring
+ - context utility sidecar
+ - default rank/explain behavior
+ - optional envelope generation
+ - optional Ha-Pri v2 material preparation
+
+ REST host owns:
+
+ - HTTP routing
+ - JSON parsing and response formatting
+ - HTTP status codes
+ - payload size limits
+ - request IDs/logging policy
+ - CORS policy if needed
+ - documentation examples
+
+ MCP host owns:
+
+ - MCP tool schemas
+ - reference adapter invocation
+ - MCP response shape
+ - client compatibility
+
+ Worker runtime owns:
+
+ - Cloudflare runtime behavior
+ - MCP transport
+ - KV cache policy
+ - D1/feed/cron/rate limiting
+ - cache metadata injection
+
+ Adapters own:
+
+ - fetching
+ - source-specific parsing
+ - raw source normalization before Core
+
+ Ops Pulse owns:
+
+ - runtime diagnostics
+ - Cloudflare health checks
+ - operational assays
+
+ Trust Scanner owns:
+
+ - repo/package/release integrity
+ - public-claim checks
+ - trust gate reporting
+
+ ## Security and Privacy Notes
+
+ - Do not log full request bodies by default.
+ - Do not return environment variables, tokens, API keys, Cloudflare metadata, or private diagnostics.
+ - Treat submitted content as client data.
+ - Keep provenance optional and explicit.
+ - Do not claim Ha-Pri v2 production enforcement unless a future Store/read-time verification path is implemented.
+ - Do not accept non-JSON request bodies for evaluation endpoints.
+ - Apply conservative host-level request size limits before invoking Core.
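
Editorial sketch (not part of the package diff): the last two notes can be read as a pre-Core guard that rejects wrong methods, non-JSON bodies, and oversized payloads before any evaluation runs. The 256 KiB limit, the check order, and the names `MAX_BODY_BYTES` and `check_request` are assumptions for illustration; the design does not fix a limit.

```python
import json

# Hypothetical limit; the design only asks for a "conservative" host-level value.
MAX_BODY_BYTES = 256 * 1024


def check_request(method: str, content_type, body: bytes):
    """Return (status, code) for a rejected request, or None if it may reach Core."""
    if method != "POST":
        return 405, "method_not_allowed"
    # Ignore parameters such as "; charset=utf-8" when comparing the media type.
    media_type = (content_type or "").split(";", 1)[0].strip().lower()
    if media_type != "application/json":
        return 415, "unsupported_media_type"
    # Enforce the size limit before parsing, so large bodies are never decoded.
    if len(body) > MAX_BODY_BYTES:
        return 413, "payload_too_large"
    try:
        parsed = json.loads(body)
    except ValueError:  # covers JSONDecodeError and UnicodeDecodeError
        return 400, "invalid_request"
    if not isinstance(parsed, dict):
        return 400, "invalid_request"
    return None


print(check_request("POST", "application/json; charset=utf-8", b'{"signal": {}}'))
print(check_request("POST", "text/plain", b"{}"))
```

The guard returns only a status and code and never echoes the body, which fits the note about not logging full request bodies by default.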
+
+ ## Tests Needed Before Implementation
+
+ Before adding REST route code, add tests for:
+
+ - `POST /v1/evaluate` request/response fixtures
+ - `POST /v1/evaluate-batch` ordering parity with `evaluateSignals`
+ - invalid JSON and missing body errors
+ - unsupported content type
+ - payload-too-large handling
+ - missing timestamp and future timestamp outcomes
+ - failed content cannot rank as fresh/high confidence
+ - envelope opt-in behavior
+ - provenance opt-in behavior and missing-material reasons
+ - no cache/D1/fetch side effects
+ - stable error shape
+
+ ## Future Expansion
+
+ Later phases may add:
+
+ - auth
+ - tenancy
+ - billing
+ - dashboard
+ - webhooks
+ - D1 persistence
+ - cache policy
+ - utility-weighted ranking as an explicit mode
+ - production Ha-Pri enforcement
+ - vector database integration
+ - adapter fetching
+
+ Each expansion should be opt-in and separately designed. The first REST host should remain a thin wrapper around Core evaluation.
package/docs/CODEX_MCP_USAGE.md
@@ -0,0 +1,116 @@
+ # FreshContext MCP Usage with Codex
+
+ This note documents the verified Codex-compatible MCP setup for the local FreshContext repository.
+
+ ## What Codex can use
+
+ Codex can launch FreshContext as a local MCP server over stdio.
+
+ The verified local server entrypoint is:
+
+ ```powershell
+ & '<node-executable>' '<repo-root>\dist\server.js'
+ ```
+
+ The MCP server exposes 21 tools. The local smoke test verifies the package version, server version, expected tool count, and representative tool calls.
+
+ No credential is required for the local stdio smoke path.
+
+ ## Local stdio setup
+
+ Prerequisites:
+
+ - Node.js 20 or newer
+ - Repository dependencies installed with `npm install`
+ - Built server output at `dist/server.js`
+
+ From the repository root:
+
+ ```powershell
+ cd '<repo-root>'
+ npm install
+ npm run build
+ npm run smoke:stdio
+ ```
+
+ For a local Codex setup, use the same Node executable and built server path validated by the smoke test:
+
+ ```toml
+ [mcp_servers.freshcontext]
+ command = '<node-executable>'
+ args = ['<repo-root>\dist\server.js']
+ ```
+
+ A more portable variant is also valid when `node` is available on Codex's PATH:
+
+ ```toml
+ [mcp_servers.freshcontext]
+ command = "node"
+ args = ['<repo-root>\dist\server.js']
+ ```
+
+ Keep this configuration in the local Codex config file, not in the repository. Do not commit machine-local paths.
+
+ ## Remote /mcp setup
+
+ The repository declares a remote Streamable HTTP MCP endpoint in `server.json` and the README:
+
+ ```text
+ https://freshcontext-mcp.gimmanuel73.workers.dev/mcp
+ ```
+
+ For clients that need a stdio bridge to a remote MCP endpoint, the README uses `mcp-remote`:
+
+ ```toml
+ [mcp_servers.freshcontext_remote]
+ command = "npx"
+ args = ["-y", "mcp-remote", "https://freshcontext-mcp.gimmanuel73.workers.dev/mcp"]
+ ```
+
+ This remote path was identified from repository metadata. The validation in this task verified local stdio only, not remote Codex compatibility, Worker availability, or Codex Cloud support.
+
+ ## Verification steps
+
+ Run the local smoke test:
+
+ ```powershell
+ cd '<repo-root>'
+ npm run smoke:stdio
+ ```
+
+ Expected result:
+
+ ```json
+ {
+   "ok": true,
+   "package_version": "0.3.18",
+   "server_version": "0.3.18",
+   "tool_count": 21
+ }
+ ```
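
Editorial sketch (not part of the package diff): a script could check a captured smoke result against the expectations above. This assumes the smoke script's result is available as exactly the JSON object shown, which the doc does not confirm. `smoke_result_ok` and `EXPECTED_TOOL_COUNT` are hypothetical names.

```python
import json

# Tool count stated in this note; update it if the server's tool set changes.
EXPECTED_TOOL_COUNT = 21


def smoke_result_ok(raw: str) -> bool:
    """Check ok=true, matching package/server versions, and the expected tool count."""
    result = json.loads(raw)
    return (
        result.get("ok") is True
        and result.get("package_version") is not None
        and result.get("package_version") == result.get("server_version")
        and result.get("tool_count") == EXPECTED_TOOL_COUNT
    )


captured = '{"ok": true, "package_version": "0.3.18", "server_version": "0.3.18", "tool_count": 21}'
print(smoke_result_ok(captured))  # True
```

Comparing the two version fields to each other, without pinning `0.3.18`, keeps the check valid after the next release.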
+
+ Run whitespace validation before committing docs:
+
+ ```powershell
+ git diff --check
+ ```
+
+ Expected result: no output and exit code 0.
+
+ ## Safety notes
+
+ - Do not place secrets, credentials, registry tokens, npm tokens, GitHub tokens, or Cloudflare tokens in Codex MCP config.
+ - Do not read, edit, print, or commit local token files, local environment files, registry credentials, Cloudflare local state, or Wrangler state.
+ - Do not commit local Codex config or machine-specific paths.
+ - Prefer the local stdio path for this compatibility check because it is verified by `npm run smoke:stdio`.
+ - Do not claim Codex Cloud support unless it is separately tested and documented.
+
+ ## Troubleshooting
+
+ If Codex cannot start the server:
+
+ - Confirm `dist/server.js` exists. If not, run `npm run build`.
+ - Confirm Node is installed with `node -v`. The package requires Node.js 20 or newer.
+ - If `node` is not found by Codex, use the full executable path from `node -p "process.execPath"`.
+ - Run `npm run smoke:stdio` from the repository root and confirm `tool_count` is 21.
+ - If the remote setup fails, verify network access, `npx` availability, and the remote endpoint separately. Do not treat remote failure as evidence that local stdio is broken.