freshcontext-mcp 0.3.16 → 0.3.18

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (60)
  1. package/.env.example +3 -0
  2. package/LICENSE +21 -0
  3. package/NOTICE.md +17 -0
  4. package/README.md +395 -296
  5. package/SECURITY.md +34 -0
  6. package/TRADEMARKS.md +9 -0
  7. package/dist/adapters/arxiv.js +92 -48
  8. package/dist/adapters/finance.js +87 -101
  9. package/dist/adapters/gdelt.js +1 -1
  10. package/dist/adapters/gebiz.js +1 -1
  11. package/dist/adapters/hackernews.js +59 -29
  12. package/dist/adapters/productHunt.js +8 -4
  13. package/dist/adapters/registry.js +232 -0
  14. package/dist/adapters/repoSearch.js +1 -1
  15. package/dist/adapters/secFilings.js +1 -1
  16. package/dist/core/decay.js +61 -0
  17. package/dist/core/decision.js +176 -0
  18. package/dist/core/envelope.js +59 -0
  19. package/dist/core/explain.js +28 -0
  20. package/dist/core/guards.js +17 -0
  21. package/dist/core/index.js +11 -0
  22. package/dist/core/pipeline.js +101 -0
  23. package/dist/core/provenance.js +73 -0
  24. package/dist/core/rank.js +84 -0
  25. package/dist/core/signal.js +101 -0
  26. package/dist/core/sourceProfiles.js +126 -0
  27. package/dist/core/types.js +1 -0
  28. package/dist/core/utility.js +90 -0
  29. package/dist/rest/handler.js +126 -0
  30. package/dist/security.js +1 -1
  31. package/dist/server.js +10 -10
  32. package/dist/tools/freshnessStamp.js +1 -117
  33. package/dist/types.js +0 -1
  34. package/docs/API_DESIGN.md +434 -0
  35. package/docs/CODEX_MCP_USAGE.md +116 -0
  36. package/docs/CORE_API.md +224 -0
  37. package/docs/DEPENDENCY_DILIGENCE.md +63 -0
  38. package/docs/HA_PRI_V2_DESIGN.md +279 -0
  39. package/docs/OPERATIONAL_DEMO_RUNBOOK.md +458 -0
  40. package/docs/RELEASE_INTEGRITY.md +53 -0
  41. package/docs/RELEASE_NOTES.md +38 -0
  42. package/docs/SIGNAL_CONTRACT.md +89 -0
  43. package/docs/SOURCE_PROFILES.md +427 -0
  44. package/freshcontext.schema.json +103 -103
  45. package/package-script-guard.mjs +140 -0
  46. package/package.json +92 -52
  47. package/server.json +27 -28
  48. package/.github/workflows/publish.yml +0 -32
  49. package/RESEARCH.md +0 -487
  50. package/RISKS.md +0 -137
  51. package/cleanup.ps1 +0 -99
  52. package/demo/README.md +0 -70
  53. package/demo/data.json +0 -88
  54. package/demo/generate.mjs +0 -199
  55. package/demo/index.html +0 -513
  56. package/demo/logo-export.html +0 -61
  57. package/demo/logo.svg +0 -23
  58. package/dist/apify.js +0 -133
  59. package/freshcontext-validate.js +0 -196
  60. package/time-check.ps1 +0 -46
@@ -0,0 +1,224 @@
1
+ # FreshContext Core API
2
+
3
+ FreshContext Core is the reusable engine layer in the current integrated MCP/Core package. It owns signal normalization, envelope creation, freshness scoring, failure honesty, rank/explain primitives, the context-utility primitive, and pure provenance helpers.
4
+
5
+ MCP, Worker HTTP, future REST, and future CLI/SDK surfaces should use Core as the contract center instead of redefining freshness or envelope behavior per host.
6
+
7
+ ## Stable Public Core API
8
+
9
+ Import stable Core functions from:
10
+
11
+ ```ts
12
+ import {
13
+ calculateFreshnessScore,
14
+ formatForLLM,
15
+ looksLikeFailedAdapterContent,
16
+ scoreLabel,
17
+ stampFreshness,
18
+ toStructuredJSON,
19
+ } from "./src/core/index.js";
20
+ ```
21
+
22
+ ### Envelope
23
+
24
+ - `stampFreshness(result, options, adapter)` creates a `FreshContext` object from adapter output.
25
+ - `formatForLLM(ctx, options?)` renders the text envelope and trailing structured JSON block.
26
+ - `toStructuredJSON(ctx)` returns the machine-readable FreshContext JSON shape.
27
+
28
+ ### Scoring
29
+
30
+ - `calculateFreshnessScore(content_date, retrieved_at, adapter)` returns a freshness score from `0..100`, or `null` when the content date cannot be trusted.
31
+ - `scoreLabel(score)` maps a numeric freshness score to a human-readable label.
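
As a rough illustration of the scoring contract above, the sketch below maps content age to a `0..100` score and returns `null` when the date cannot be trusted. The exponential decay form, the `lambdaPerDay` parameter, the label thresholds, and the names `sketchFreshnessScore` and `sketchScoreLabel` are assumptions made for this example; the real formula and the `LAMBDA` policy values live in Core and are not reproduced here.

```typescript
// Hypothetical sketch, not the Core implementation.
function sketchFreshnessScore(
  contentDate: string | null,
  retrievedAt: string,
  lambdaPerDay: number = 0.01, // assumed decay rate, not a real LAMBDA entry
): number | null {
  if (!contentDate) return null; // no date: nothing to score
  const published = Date.parse(contentDate);
  const retrieved = Date.parse(retrievedAt);
  if (Number.isNaN(published) || Number.isNaN(retrieved)) return null;
  const ageDays = (retrieved - published) / 86400000; // milliseconds per day
  if (ageDays < 0) return null; // future-dated content is not trusted
  return Math.round(100 * Math.exp(-lambdaPerDay * ageDays));
}

// Assumed thresholds, for illustration only.
function sketchScoreLabel(score: number | null): string {
  if (score === null) return "unknown";
  if (score >= 75) return "fresh";
  if (score >= 40) return "aging";
  return "stale";
}
```

Returning `null` instead of a low number keeps "untrusted date" distinguishable from "old content", which is the distinction the contract above draws.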
32
+
33
+ ### Guards
34
+
35
+ - `looksLikeFailedAdapterContent(raw)` detects empty, security, timeout, and error-like adapter output so failed content is not stamped as fresh high-confidence context.
36
+
37
+ ## Core Evaluation Pipeline
38
+
39
+ The Core evaluation pipeline is the pure orchestration layer over the existing primitives.
40
+
41
+ Public exports:
42
+
43
+ - `evaluateSignal(input, options?)`
44
+ - `evaluateSignals(inputs, options?)`
45
+ - `CoreSignalEvaluationOptions`
46
+ - `CoreSignalEvaluationResult`
47
+ - `CoreSignalEnvelopeResult`
48
+ - `CoreSignalProvenanceOptions`
49
+
50
+ `evaluateSignal` normalizes a signal, applies timestamp/failure guards, computes `freshness_score`, computes context-conditioned utility, ranks/explains the signal, optionally creates a FreshContext envelope, and optionally prepares Ha-Pri v2 provenance material.
51
+
52
+ It does not fetch, cache, write to D1, inspect Worker bindings, know MCP tool schemas, deploy, or publish. Hosts decide whether to store, cache, transmit, or expose the returned result.
53
+
54
+ `evaluateSignals` evaluates each input and returns evaluations sorted by the existing `rankSignal` final score, preserving input order when scores tie.
55
+
56
+ Context utility is returned as sidecar output in the current pipeline; it does not replace or modify the default `rankSignal` / `evaluateSignals` ordering. A future pass may add an explicit utility-weighted ranking mode.
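
The ordering rule described above (descending `final_score`, input order preserved on ties, utility carried along but ignored for ordering) can be sketched as follows. The `SketchEvaluation` shape and the `sortByFinalScore` name are hypothetical and only stand in for the real `CoreSignalEvaluationResult`.

```typescript
// Hypothetical shape: only the fields needed to show the ordering rule.
interface SketchEvaluation {
  id: string;
  final_score: number; // 0..1
  utility?: number; // sidecar value, deliberately not used for ordering
}

function sortByFinalScore<T extends { final_score: number }>(items: T[]): T[] {
  return items
    .map((item, index) => ({ item, index }))
    // Higher final_score first; the original index breaks ties explicitly.
    .sort((a, b) => b.item.final_score - a.item.final_score || a.index - b.index)
    .map(({ item }) => item);
}
```

Carrying the original index makes the tie-break explicit rather than relying on the engine's sort stability, and the input array is left unmodified.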
57
+
58
+ Local demo:
59
+
60
+ ```bash
61
+ npm run demo:evaluate:file -- examples/sources.academic.example.json
62
+ npm run demo:evaluate:file -- examples/sources.jobs.example.json
63
+ ```
64
+
65
+ The demo reads caller-provided JSON with `profile`, `intent`, and `signals`, then returns decision-first output. It does not fetch URLs, crawl, read folders, deploy REST, or implement Operator mode.
66
+
67
+ ### Stable Types
68
+
69
+ - `FreshContext`
70
+ - `AdapterResult`
71
+ - `ExtractOptions`
72
+ - `EnvelopeFormatOptions`
73
+ - `SignalConfidence`
74
+
75
+ These types describe the stable envelope and adapter result contract.
76
+
77
+ ## Signal Contract v1
78
+
79
+ Signal Contract v1 is the additive Core shape for a retrieved signal before it is ranked, wrapped, stored, or passed to an agent workflow.
80
+
81
+ Public exports:
82
+
83
+ - `SIGNAL_CONTRACT_VERSION`
84
+ - `normalizeSignal(input, options?)`
85
+ - `FreshContextSignalInput`
86
+ - `FreshContextSignal`
87
+ - `SignalDateConfidence`
88
+ - `SignalContractVersion`
89
+ - `SignalNormalizeOptions`
90
+
91
+ `published_at` is the canonical signal timestamp. `content_date` is accepted as an adapter/envelope compatibility alias. Normalization clears invalid or meaningfully future-dated timestamps, marks failed/error-looking content as `status: "failed"`, clamps `semantic_score` into `0..1`, and records normalization reasons.
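
A minimal sketch of the normalization steps just described, under stated assumptions: the one-day future tolerance, the failure pattern, the `"ok"` status value, the reason strings, and every `Sketch*` name are invented for illustration. Only `published_at`, `content_date`, `semantic_score`, and `status: "failed"` come from the contract text.

```typescript
interface SketchSignalInput {
  content: string;
  published_at?: string | null;
  content_date?: string | null; // compatibility alias
  semantic_score?: number;
}

interface SketchSignal {
  content: string;
  published_at: string | null;
  semantic_score: number;
  status: "ok" | "failed"; // "ok" is an assumed name
  reasons: string[];
}

const FUTURE_TOLERANCE_MS = 24 * 60 * 60 * 1000; // assumed meaning of "meaningfully future-dated"

function sketchNormalizeSignal(input: SketchSignalInput, now: Date = new Date()): SketchSignal {
  const reasons: string[] = [];

  // published_at wins; content_date is only a fallback alias.
  let publishedAt: string | null = input.published_at ?? input.content_date ?? null;
  if (publishedAt !== null) {
    const t = Date.parse(publishedAt);
    if (Number.isNaN(t)) {
      publishedAt = null;
      reasons.push("invalid_timestamp");
    } else if (t - now.getTime() > FUTURE_TOLERANCE_MS) {
      publishedAt = null;
      reasons.push("future_timestamp");
    }
  }

  // Clamp semantic_score into 0..1, treating missing or non-finite values as 0.
  const raw = input.semantic_score;
  const base = typeof raw === "number" && Number.isFinite(raw) ? raw : 0;
  const semanticScore = Math.min(1, Math.max(0, base));
  if (raw !== undefined && semanticScore !== raw) reasons.push("semantic_score_clamped");

  // Assumed failure heuristic: empty or error-looking content.
  const trimmed = input.content.trim();
  const failed = trimmed.length === 0 || /^(error|timeout)\b/i.test(trimmed);
  if (failed) reasons.push("failed_content");

  return {
    content: input.content,
    published_at: publishedAt,
    semantic_score: semanticScore,
    status: failed ? "failed" : "ok",
    reasons,
  };
}
```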
92
+
93
+ See [Signal Contract v1](./SIGNAL_CONTRACT.md).
94
+
95
+ ## Source Profiles
96
+
97
+ Source Profiles are early public Core metadata for describing how classes of information age, fail, rank, and explain.
98
+
99
+ Public exports:
100
+
101
+ - `BUILT_IN_SOURCE_PROFILES`
102
+ - `getSourceProfile(profileId)`
103
+ - `listSourceProfiles()`
104
+ - `SourceProfile`
105
+ - `SourceProfileId`
106
+ - `SourceAuthorityHint`
107
+ - `SourceDatePolicy`
108
+ - `SourceFailurePolicy`
109
+ - `SourceSurface`
110
+
111
+ They reframe the 21 MCP tools as reference adapters and source-profile examples instead of the product identity. They do not implement `retrieve(...)`, Operator mode, adapter selection, crawling, local file search, or any host/runtime behavior.
112
+
113
+ ## Decision Helper
114
+
115
+ The decision helper translates a Core evaluation result into user-facing action meaning.
116
+
117
+ Public exports:
118
+
119
+ - `interpretEvaluation(evaluation, options?)`
120
+ - `interpretEvaluations(evaluations, options?)`
121
+ - `ContextDecision`
122
+ - `IntentProfileId`
123
+ - `ContextDecisionOptions`
124
+ - `ContextDecisionResult`
125
+
126
+ Supported decisions:
127
+
128
+ - `use_first`
129
+ - `cite_as_primary`
130
+ - `cite_as_supporting`
131
+ - `use_as_background`
132
+ - `needs_verification`
133
+ - `needs_refresh`
134
+ - `watch_only`
135
+ - `exclude`
136
+
137
+ Supported intent profiles:
138
+
139
+ - `citation_check`
140
+ - `student_research`
141
+ - `developer_adoption`
142
+ - `job_search`
143
+ - `market_watch`
144
+ - `business_due_diligence`
145
+ - `medical_literature_triage`
146
+
147
+ The helper consumes existing `CoreSignalEvaluationResult` fields plus optional Source Profile metadata. It does not change `evaluateSignal`, `evaluateSignals`, `rankSignal`, ranking order, freshness math, utility math, envelopes, provenance, or host behavior.
148
+
149
+ FreshContext decisions judge citation readiness, context usefulness, freshness, traceability, and uncertainty. They do not certify truth or provide legal, medical, tax, employment, academic, or investment advice.
150
+
151
+ Demo output will be updated separately so presentation stays separate from Core decision logic.
152
+
153
+ ## Public Ranking Primitives
154
+
155
+ The ranking primitives are public, but consumers should treat their score scales carefully:
156
+
157
+ - `rankSignal(signal, options?)`
158
+ - `rankSignals(signals, options?)`
159
+ - `explainSignal(rankedSignalLike)`
160
+ - `FreshSignal`
161
+ - `RankedSignal`
162
+ - `RankOptions`
163
+
164
+ Score scales:
165
+
166
+ - `semantic_score`: normalized `0..1`
167
+ - `final_score`: normalized `0..1`
168
+ - `freshness_score`: FreshContext freshness score `0..100`, or `null`
169
+
170
+ Ranking combines semantic relevance and freshness into a deterministic order. It does not own retrieval, embedding, vector search, storage, or host-specific scoring policy.
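
Because `semantic_score` and `final_score` are `0..1` while `freshness_score` is `0..100`, any combination has to rescale freshness first. The weighted blend below is only a sketch of that scale handling: the `semanticWeight` default, the treatment of a `null` freshness score as zero, and the `sketchFinalScore` name are assumptions, not the `rankSignal` formula.

```typescript
interface SketchRankInput {
  semantic_score: number; // 0..1
  freshness_score: number | null; // 0..100, or null
}

// Hypothetical blend; the real rankSignal weighting is not shown in this document.
function sketchFinalScore(signal: SketchRankInput, semanticWeight: number = 0.7): number {
  // Rescale freshness to 0..1; a null freshness contributes nothing (assumed policy).
  const freshness01 = signal.freshness_score === null ? 0 : signal.freshness_score / 100;
  const combined = semanticWeight * signal.semantic_score + (1 - semanticWeight) * freshness01;
  return Math.min(1, Math.max(0, combined)); // keep the result in 0..1
}
```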
171
+
172
+ ## Experimental Utility Primitive
173
+
174
+ The context-conditioned utility primitive is pure and tested, but it is not production-wired into MCP ranking, Worker feeds, Store scoring, or runtime behavior.
175
+
176
+ Experimental exports:
177
+
178
+ - `calculateContextUtility`
179
+ - `ContextUtilityStatus`
180
+ - `ContextUtilityInput`
181
+ - `ContextUtilityResult`
182
+
183
+ These are pure Core math. They are now connected inside `evaluateSignal` as sidecar utility output; the production-wiring limits stated above still apply.
184
+
185
+ ## Provenance Helpers
186
+
187
+ Ha-Pri v2 is available as pure Core helper functionality:
188
+
189
+ - `canonicalizeHaPriContent`
190
+ - `sha256Hex`
191
+ - `calculateHaPriV2`
192
+ - `verifyHaPriV2`
193
+ - `HaPriV2Input`
194
+ - `HaPriV2Result`
195
+ - `HaPriV2VerificationResult`
196
+
197
+ `evaluateSignal` can optionally prepare Ha-Pri v2 material when `includeProvenance` is set and required input material is present. Core does not persist provenance, add D1 columns, verify rows on read, reject rows, or replace Worker Ha-Pri v1 behavior.
198
+
199
+ ## Internal, Policy, and Compatibility Exports
200
+
201
+ - `clampScore` is an internal ranking helper. It is currently exported for tests and utility use, but it should not be presented as a primary buyer-facing API.
202
+ - `LAMBDA` is the current policy constant table used by freshness scoring. It documents the reference decay policy, but it is not a buyer-facing tuning API.
203
+
204
+ Compatibility lanes should remain:
205
+
206
+ - `src/types.ts` re-exports legacy adapter types from Core.
207
+ - `src/tools/freshnessStamp.ts` re-exports envelope helpers for older MCP/npm import paths.
208
+
209
+ These lanes protect existing imports while Core becomes the center. Do not remove them until downstream imports have been migrated intentionally.
210
+
211
+ ## What Core Does Not Own
212
+
213
+ Core does not own:
214
+
215
+ - MCP transport
216
+ - Cloudflare runtime behavior
217
+ - KV cache policy
218
+ - Cache metadata injection
219
+ - D1, feed, or cron behavior
220
+ - Store/feed scoring and provenance persistence
221
+ - Hosted dashboard, API, deployment, or runtime concerns
222
+
223
+ Hosts may wrap Core outputs with their own transport, cache, session, rate-limit, or persistence metadata, but they should not fork the Core envelope and freshness contract without an explicit compatibility reason.
224
+
@@ -0,0 +1,63 @@
1
+ # FreshContext Dependency Diligence Notes
2
+
3
+ This document records dependency and license diligence notes from the Trust L4/L5 cleanup. It is not legal advice and does not replace professional review when external review, distribution, or formal diligence requires one.
4
+
5
+ ## Current Audit Status
6
+
7
+ As of Pass 8-AB:
8
+
9
+ - `npm audit --omit=dev`: clean.
10
+ - `npm audit`: clean.
11
+ - The published MCP npm package excludes the Apify Actor entrypoint and does not install Apify/Crawlee in normal consumer installs.
12
+ - The previous moderate `qs` and `ws` advisories were resolved with narrow transitive overrides in the source checkout.
13
+ - No package version change was made.
14
+
15
+ ## Resolved Advisories
16
+
17
+ `qs`
18
+
19
+ - Previous severity: moderate.
20
+ - Path: `@modelcontextprotocol/sdk -> express/body-parser -> qs`.
21
+ - Resolution: pinned through npm `overrides` to `qs@6.15.2`.
22
+
23
+ `ws`
24
+
25
+ - Previous severity: moderate.
26
+ - Historical source-checkout path: `apify -> ws`.
27
+ - Resolution: pinned through npm `overrides` to `ws@8.20.1` for source-checkout Apify Actor workflows.
28
+
29
+ `file-type`
30
+
31
+ - Previous severity: moderate in fresh consumer installs.
32
+ - Historical consumer path: `apify -> @crawlee/utils -> file-type`.
33
+ - Resolution: Apify/Crawlee were removed from the normal published MCP package dependency surface. Apify remains a source-checkout / separate-actor concern.
34
+
35
+ ## License Inventory Notes
36
+
37
+ The Trust L4 license inventory was broadly permissive, including MIT, Apache-2.0, BSD variants, ISC, 0BSD, BlueOak-1.0.0, and similar permissive variants.
38
+
39
+ No GPL, AGPL, LGPL, MPL, EPL, CDDL, or similar copyleft licenses were reported in the Trust L4 scan.
40
+
41
+ `map-stream@0.1.0`
42
+
43
+ - Scanner result: `UNKNOWN`.
44
+ - Path observed during L4: transitive through source-checkout `apify` / Crawlee-related dependencies.
45
+ - Diligence note: package metadata appears incomplete, but the installed package includes an MIT-style license file.
46
+ - Action: keep as a source-checkout / actor-packaging diligence note and recheck before external diligence or Apify Actor distribution.
47
+
48
+ `caniuse-lite`
49
+
50
+ - Scanner result: `CC-BY-4.0`.
51
+ - Diligence note: preserve and review attribution requirements before external diligence, bundled distribution, or a formal review package.
52
+
53
+ ## Before External Diligence or Distribution
54
+
55
+ - Rerun `npm audit --omit=dev`.
56
+ - Rerun `npm audit`.
57
+ - Rerun dependency license inventory.
58
+ - Review scanner-unknown packages.
59
+ - Review `caniuse-lite` attribution requirements.
60
+ - Generate an SBOM if requested by an evaluator, reviewer, or downstream distributor.
61
+ - Review dependency and license posture with qualified counsel when external distribution or formal diligence requires it.
62
+
63
+ Do not treat this document as a legal conclusion.
@@ -0,0 +1,279 @@
1
+ # Ha-Pri v2 Design
2
+
3
+ Status: design + pure Core helper
4
+ Phase: Math Spine Phase 3-A / 3-B
5
+ Runtime impact: none
6
+
7
+ ## Purpose
8
+
9
+ Ha-Pri v2 is an additive provenance-hardening model for FreshContext Store/Ledger rows.
10
+
11
+ The goal is to keep Ha-Pri v1 readable while designing a stronger future signature that binds a row to canonical content, semantic identity, source metadata, timestamps, and engine version.
12
+
13
+ Phase 3-B adds pure Core helper functions and deterministic tests for the v2 model. Phase 3-C adds `examples/ha-pri-v2-example.ts`, a deterministic developer fixture showing `calculateHaPriV2` and `verifyHaPriV2` returning valid, invalid, and unknown verification states. Production Store wiring remains future work. This document does not change the D1 schema, change Worker write paths, migrate old rows, add HMAC secrets, or alter production scoring.
14
+
15
+ ## Current Ha-Pri v1 Audit
16
+
17
+ Ha-Pri v1 is implemented today as a provenance stamp and audit reference, not yet hard tamper enforcement.
18
+
19
+ ### Where v1 Lives
20
+
21
+ Current implementation points:
22
+
23
+ - `worker/src/intelligence.ts`
24
+ - `PROVENANCE_SALT = "FRESHCONTEXT_DAR_V1"`
25
+ - `generateAuditSig(resultId, contentHash)`
26
+ - `scoreSignal(...)` computes `ha_pri_sig`
27
+ - `worker/src/worker.ts`
28
+ - migration adds `ha_pri_sig TEXT`
29
+ - cron write path stores `ha_pri_sig` in `scrape_results`
30
+ - `/v1/intel/feed/:profile_id` returns `ha_pri_sig` in `intelligence_stamps`
31
+ - `tests/mathSpine.test.ts`
32
+ - checks that `generateAuditSig` matches the documented v1 formula
33
+
34
+ ### v1 Formula
35
+
36
+ ```text
37
+ ha_pri_sig = SHA-256(
38
+ result_id + ":" +
39
+ content_hash + ":" +
40
+ "FRESHCONTEXT_DAR_V1"
41
+ )
42
+ ```
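
For readers who want to recompute a v1 signature by hand, the formula above can be sketched with Node's `crypto` module. The function name `sketchAuditSigV1` and the lowercase hex output encoding are assumptions; the Worker's actual `generateAuditSig` is not reproduced here and may use a different hashing API or encoding.

```typescript
import { createHash } from "node:crypto";

const PROVENANCE_SALT = "FRESHCONTEXT_DAR_V1";

// Hypothetical re-implementation of the documented v1 formula.
function sketchAuditSigV1(resultId: string, contentHash: string): string {
  return createHash("sha256")
    .update(`${resultId}:${contentHash}:${PROVENANCE_SALT}`, "utf8")
    .digest("hex"); // hex encoding is an assumption
}
```

The salt is a public constant, so anyone holding `result_id` and `content_hash` can reproduce the value.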
43
+
44
+ ### What v1 Binds
45
+
46
+ Ha-Pri v1 binds:
47
+
48
+ - the generated `result_id`
49
+ - the current `content_hash` argument passed into `scoreSignal`
50
+ - the engine/version salt string `FRESHCONTEXT_DAR_V1`
51
+
52
+ In the current Worker cron path, `content_hash` is the value named `result_hash`, produced by `simpleHash(raw)`.
53
+
54
+ ### Current Hash Input
55
+
56
+ The current `result_hash` is a small rolling hash:
57
+
58
+ ```ts
59
+ let h = 0;
60
+ for (let i = 0; i < str.length; i++) h = Math.imul(31, h) + str.charCodeAt(i) | 0;
61
+ return Math.abs(h).toString(36);
62
+ ```
63
+
64
+ This is useful for cheap change detection, but it is not a cryptographic content digest.
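
To make the gap concrete, the sketch below wraps the snippet above in a function and compares it with SHA-256. The wrapper signature and the `sketchSha256Hex` helper are assumptions added for the example. The multiply-by-31 scheme has well-known short collisions such as `"Aa"` and `"BB"`, while their SHA-256 digests differ.

```typescript
import { createHash } from "node:crypto";

// The rolling hash from the snippet above, wrapped so it can be called directly.
function simpleHash(str: string): string {
  let h = 0;
  // `| 0` truncates the running value to a signed 32-bit integer on every step.
  for (let i = 0; i < str.length; i++) h = (Math.imul(31, h) + str.charCodeAt(i)) | 0;
  return Math.abs(h).toString(36);
}

// Hypothetical helper for comparison; not the Core `sha256Hex` export.
function sketchSha256Hex(text: string): string {
  return createHash("sha256").update(text, "utf8").digest("hex");
}

const rollingCollides = simpleHash("Aa") === simpleHash("BB");
const sha256Collides = sketchSha256Hex("Aa") === sketchSha256Hex("BB");
console.log({ rollingCollides, sha256Collides });
```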
65
+
66
+ ### Storage and Output
67
+
68
+ Ha-Pri v1 is stored in D1:
69
+
70
+ - `scrape_results.ha_pri_sig`
71
+
72
+ It is returned through the live intelligence feed:
73
+
74
+ - `signals[].intelligence_stamps.ha_pri_sig`
75
+
76
+ ### Verification Status
77
+
78
+ Current v1 behavior:
79
+
80
+ - generated on write: yes
81
+ - stored in D1: yes
82
+ - returned in feed/API output: yes
83
+ - recomputed on read: no
84
+ - used to reject tampered rows: no
85
+ - tied to canonical raw content SHA-256: no
86
+
87
+ So Ha-Pri v1 works as a provenance stamp and audit reference. It does not yet work as hard tamper enforcement.
88
+
89
+ ## Weaknesses in v1
90
+
91
+ 1. The signature uses a weak content-hash input.
92
+
93
+ `ha_pri_sig` is SHA-256, but it currently binds to the rolling `result_hash`, not to canonical raw content bytes. The v1 signature inherits the collision risk and ambiguity of the weaker input.
94
+
95
+ 2. No read-time verification exists.
96
+
97
+ Feed and debug reads return the stored signature, but they do not recompute it and compare stored vs recomputed values.
98
+
99
+ 3. No canonicalization contract exists for signed content.
100
+
101
+ The current signature signs a hash value, not a documented canonical representation of the row content.
102
+
103
+ 4. v1 does not bind all fields needed for provenance.
104
+
105
+ It does not directly bind adapter, published timestamp, scraped timestamp, semantic fingerprint, or a schema marker beyond the fixed salt.
106
+
107
+ 5. v1 is not authentication.
108
+
109
+ The salt is public. Anyone with row fields can compute the v1 signature. That is acceptable for a provenance reference, but it should not be presented as proof of origin from a private signing authority.
110
+
111
+ ## Ha-Pri v2 Design Goals
112
+
113
+ Ha-Pri v2 should be:
114
+
115
+ - additive, not a breaking migration
116
+ - deterministic
117
+ - recomputable
118
+ - explicit about canonicalization
119
+ - stronger than v1 for content integrity
120
+ - safe to run without secrets
121
+ - compatible with future HMAC signing, without requiring it now
122
+ - clear about verification status: valid, invalid, or unknown
123
+
124
+ ## Proposed v2 Fields
125
+
126
+ Future Store/Ledger rows may add:
127
+
128
+ ```text
129
+ canonical_content_sha256 TEXT
130
+ semantic_fingerprint_sha256 TEXT
131
+ ha_pri_sig_v2 TEXT
132
+ ha_pri_v2_status TEXT
133
+ ha_pri_v2_checked_at TEXT
134
+ ```
135
+
136
+ These are design-level names only. No schema change is made in this phase.
137
+
138
+ ### canonical_content_sha256
139
+
140
+ `canonical_content_sha256` is:
141
+
142
+ ```text
143
+ SHA-256(canonical raw content)
144
+ ```
145
+
146
+ It binds the actual content after deterministic normalization.
147
+
148
+ ### semantic_fingerprint_sha256
149
+
150
+ `semantic_fingerprint_sha256` is:
151
+
152
+ ```text
153
+ SHA-256(normalized title + canonical URL + publication date)
154
+ ```
155
+
156
+ It is a full SHA-256 version of the current shorter semantic fingerprint idea.
157
+
158
+ ## Canonicalization Rules
159
+
160
+ All canonicalization should be deterministic.
161
+
162
+ Recommended rules:
163
+
164
+ 1. Use UTF-8.
165
+ 2. Normalize line endings to `\n`.
166
+ 3. Trim trailing whitespace on each line.
167
+ 4. Preserve meaningful internal whitespace.
168
+ 5. Normalize null or missing optional fields to the literal string `"null"`.
169
+ 6. Use stable field order.
170
+ 7. Use ISO-8601 timestamps where available.
171
+ 8. Do not include fields whose values change during read-time verification unless they are explicitly part of the signed record.
172
+ 9. Version the canonicalization contract.
173
+
174
+ For future implementation, canonicalization should live in a pure helper with deterministic fixtures.
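
A minimal sketch of rules 2, 3, and 5 as a pure helper pair follows. The names `sketchCanonicalizeContent` and `sketchFieldOrNull` are hypothetical, the choice to strip only trailing spaces and tabs is an assumption, and the Core `canonicalizeHaPriContent` export may behave differently.

```typescript
// Rules 2 and 3: normalize line endings, then trim trailing whitespace per line.
function sketchCanonicalizeContent(raw: string): string {
  return raw
    .replace(/\r\n?/g, "\n") // CRLF and lone CR both become LF
    .split("\n")
    .map((line) => line.replace(/[ \t]+$/, "")) // internal whitespace is preserved (rule 4)
    .join("\n");
}

// Rule 5: missing optional fields become the literal string "null".
function sketchFieldOrNull(value: string | null | undefined): string {
  return value === null || value === undefined ? "null" : value;
}
```

Because both helpers are pure string functions, they can be pinned with fixed input/output fixtures, which is what the deterministic-fixture requirement above asks for.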
175
+
176
+ ## Proposed v2 Formula
177
+
178
+ ```text
179
+ ha_pri_sig_v2 = SHA-256(signingPayload)
180
+ ```
181
+
182
+ Where `signingPayload` is exactly:
183
+
184
+ ```text
185
+ FRESHCONTEXT_HA_PRI_V2
186
+ result_id=<resultId>
187
+ canonical_content_sha256=<canonicalContentSha256>
188
+ semantic_fingerprint_sha256=<semanticFingerprintSha256>
189
+ adapter=<adapter>
190
+ published_at=<publishedAt-or-null>
191
+ retrieved_at=<retrievedAt-or-null>
192
+ engine_version=<engineVersion>
193
+ ```
194
+
195
+ ### Field Meaning
196
+
197
+ - `FRESHCONTEXT_HA_PRI_V2`: schema/version string
198
+ - `result_id`: stable row identifier
199
+ - `canonical_content_sha256`: cryptographic digest of canonical raw content
200
+ - `semantic_fingerprint_sha256`: cryptographic digest of semantic identity fields
201
+ - `adapter`: source adapter name
202
+ - `published_at`: source/content publication timestamp, or explicit null sentinel
203
+ - `retrieved_at`: retrieval or collection timestamp, or explicit null sentinel
204
+ - `engine_version`: scoring/signature engine version
205
+
206
+ Store/Ledger systems may map `scraped_at` to the v2 `retrieved_at` signing field.
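
The payload layout and formula above can be sketched as below. This assumes the payload lines are joined with `\n` and no trailing newline, that the digest is hex-encoded, and that inputs are already canonicalized; the interface and function names are hypothetical and are not the `calculateHaPriV2` signature.

```typescript
import { createHash } from "node:crypto";

interface SketchHaPriV2Input {
  resultId: string;
  canonicalContentSha256: string;
  semanticFingerprintSha256: string;
  adapter: string;
  publishedAt: string | null;
  retrievedAt: string | null;
  engineVersion: string;
}

// Builds the signing payload in the fixed field order listed above.
function sketchSigningPayload(input: SketchHaPriV2Input): string {
  return [
    "FRESHCONTEXT_HA_PRI_V2",
    `result_id=${input.resultId}`,
    `canonical_content_sha256=${input.canonicalContentSha256}`,
    `semantic_fingerprint_sha256=${input.semanticFingerprintSha256}`,
    `adapter=${input.adapter}`,
    `published_at=${input.publishedAt ?? "null"}`, // explicit null sentinel
    `retrieved_at=${input.retrievedAt ?? "null"}`,
    `engine_version=${input.engineVersion}`,
  ].join("\n"); // separator is an assumption
}

function sketchHaPriSigV2(input: SketchHaPriV2Input): string {
  return createHash("sha256").update(sketchSigningPayload(input), "utf8").digest("hex");
}
```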
207
+
208
+ ## Verification Model
209
+
210
+ Future read or audit verification should:
211
+
212
+ 1. Load the stored row.
213
+ 2. Recompute canonical raw content from stored content fields.
214
+ 3. Recompute `canonical_content_sha256`.
215
+ 4. Recompute semantic identity fields.
216
+ 5. Recompute `semantic_fingerprint_sha256`.
217
+ 6. Recompute `ha_pri_sig_v2` from the canonical field sequence.
218
+ 7. Compare stored vs recomputed values.
219
+ 8. Mark the result:
220
+ - `valid`
221
+ - `invalid`
222
+ - `unknown`
223
+ 9. Surface verification status to internal/debug paths first.
224
+ 10. Avoid silently trusting unverifiable rows.
225
+
226
+ Verification must not mutate old rows during read unless a dedicated migration explicitly allows it.
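
A reduced sketch of this model, covering only the content-digest comparison (steps 1 to 3, 7, and 8), is shown below. The row shape and the `sketchVerifyContentDigest` name are hypothetical, the stored content is assumed to be already canonical, and the real `verifyHaPriV2` helper also covers the semantic fingerprint and signature.

```typescript
import { createHash } from "node:crypto";

type SketchVerificationStatus = "valid" | "invalid" | "unknown";

interface SketchStoredRow {
  content: string | null; // assumed to hold canonical content
  canonical_content_sha256?: string | null; // absent on pre-v2 rows
}

function sketchVerifyContentDigest(row: SketchStoredRow): SketchVerificationStatus {
  // Rows without v2 material cannot be checked: report unknown, never invalid.
  if (!row.canonical_content_sha256 || row.content === null) return "unknown";
  const recomputed = createHash("sha256").update(row.content, "utf8").digest("hex");
  return recomputed === row.canonical_content_sha256 ? "valid" : "invalid";
}
```

The function only reads the row and returns a status, so running it on a read path does not mutate stored data.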
227
+
228
+ ## Backward Compatibility
229
+
230
+ Ha-Pri v2 should not remove or reinterpret v1.
231
+
232
+ Rules:
233
+
234
+ - Keep `ha_pri_sig` readable.
235
+ - Add `ha_pri_sig_v2` separately.
236
+ - Treat old rows without v2 fields as `unknown`, not invalid.
237
+ - Do not reject old rows solely because they lack v2.
238
+ - Preserve v1 formula tests.
239
+ - Add v2 fixtures before any production write path changes.
240
+
241
+ ## Future HMAC Boundary
242
+
243
+ HMAC-SHA256 may be useful later if FreshContext needs origin authentication rather than only tamper evidence.
244
+
245
+ That would require:
246
+
247
+ - a private deployment key
248
+ - secret rotation
249
+ - key identifiers
250
+ - verification policy for old key versions
251
+ - clear trust-boundary documentation
252
+
253
+ This phase does not add HMAC, secrets, or key management.
254
+
255
+ ## Suggested Future Patch Sequence
256
+
257
+ 1. Add pure canonicalization helpers and deterministic tests.
258
+ 2. Add pure v2 signature helper and fixtures.
259
+ 3. Add optional verification helper that returns `valid`, `invalid`, or `unknown`.
260
+ 4. Add D1 columns in a separate schema phase.
261
+ 5. Write v2 fields for new rows only.
262
+ 6. Expose verification status on debug/internal endpoints first.
263
+ 7. Decide later whether public feed output should include v2 status.
264
+
265
+ ## Non-Goals
266
+
267
+ This design does not:
268
+
269
+ - change runtime behavior
270
+ - change scoring
271
+ - change MCP tool schemas
272
+ - change D1 schema
273
+ - change Worker write paths
274
+ - migrate old rows
275
+ - add HMAC
276
+ - add secrets
277
+ - reject rows in production
278
+ - publish to npm
279
+ - deploy the Worker