freshcontext-mcp 0.3.19 → 0.3.21

Files changed (75)
  1. package/FRESHCONTEXT_SPEC.md +317 -0
  2. package/METHODOLOGY.md +381 -0
  3. package/README.md +55 -5
  4. package/SECURITY.md +9 -7
  5. package/dist/adapters/arxiv.d.ts +15 -0
  6. package/dist/adapters/arxiv.js +3 -2
  7. package/dist/adapters/changelog.d.ts +2 -0
  8. package/dist/adapters/changelog.js +4 -2
  9. package/dist/adapters/finance.d.ts +2 -0
  10. package/dist/adapters/finance.js +1 -1
  11. package/dist/adapters/gdelt.d.ts +2 -0
  12. package/dist/adapters/gdelt.js +1 -1
  13. package/dist/adapters/gebiz.d.ts +2 -0
  14. package/dist/adapters/gebiz.js +1 -1
  15. package/dist/adapters/github.d.ts +2 -0
  16. package/dist/adapters/govcontracts.d.ts +2 -0
  17. package/dist/adapters/hackernews.d.ts +2 -0
  18. package/dist/adapters/jobs.d.ts +2 -0
  19. package/dist/adapters/jobs.js +6 -6
  20. package/dist/adapters/packageTrends.d.ts +2 -0
  21. package/dist/adapters/productHunt.d.ts +2 -0
  22. package/dist/adapters/reddit.d.ts +8 -0
  23. package/dist/adapters/reddit.js +12 -5
  24. package/dist/adapters/registry.d.ts +19 -0
  25. package/dist/adapters/repoSearch.d.ts +2 -0
  26. package/dist/adapters/repoSearch.js +1 -1
  27. package/dist/adapters/scholar.d.ts +2 -0
  28. package/dist/adapters/secFilings.d.ts +2 -0
  29. package/dist/adapters/secFilings.js +1 -1
  30. package/dist/adapters/yc.d.ts +2 -0
  31. package/dist/core/decay.d.ts +5 -0
  32. package/dist/core/decision.d.ts +3 -0
  33. package/dist/core/decision.js +1 -3
  34. package/dist/core/envelope.d.ts +5 -0
  35. package/dist/core/envelope.js +9 -1
  36. package/dist/core/explain.d.ts +12 -0
  37. package/dist/core/guards.d.ts +1 -0
  38. package/dist/core/index.d.ts +14 -0
  39. package/dist/core/index.js +2 -0
  40. package/dist/core/pipeline.d.ts +3 -0
  41. package/dist/core/pipeline.js +8 -0
  42. package/dist/core/provenance.d.ts +5 -0
  43. package/dist/core/provenanceReadiness.d.ts +2 -0
  44. package/dist/core/provenanceReadiness.js +220 -0
  45. package/dist/core/rank.d.ts +4 -0
  46. package/dist/core/readable.d.ts +2 -0
  47. package/dist/core/readable.js +75 -0
  48. package/dist/core/signal.d.ts +3 -0
  49. package/dist/core/sourceProfiles.d.ts +4 -0
  50. package/dist/core/types.d.ts +239 -0
  51. package/dist/core/utility.d.ts +2 -0
  52. package/dist/rest/handler.d.ts +1 -0
  53. package/dist/security.d.ts +15 -0
  54. package/dist/security.js +3 -1
  55. package/dist/server.d.ts +2 -0
  56. package/dist/server.js +2 -2
  57. package/dist/tools/evaluateContext.d.ts +21 -0
  58. package/dist/tools/evaluateContext.js +22 -1
  59. package/dist/tools/freshnessStamp.d.ts +1 -0
  60. package/dist/types.d.ts +1 -0
  61. package/docs/API_DESIGN.md +28 -1
  62. package/docs/CLIENT_SETUP.md +166 -0
  63. package/docs/CODEX_MCP_USAGE.md +4 -4
  64. package/docs/CORE_API.md +69 -5
  65. package/docs/CORE_MCP_BOUNDARY.md +13 -4
  66. package/docs/FUTURE_LANES.md +26 -3
  67. package/docs/HA_PRI_V2_DESIGN.md +7 -1
  68. package/docs/HA_PRI_V2_PRODUCTION_ENFORCEMENT_PLAN.md +414 -0
  69. package/docs/HUMAN_READABLE_OUTPUT_CONTRACT.md +293 -0
  70. package/docs/RELEASE_INTEGRITY.md +1 -1
  71. package/docs/RELEASE_NOTES.md +33 -5
  72. package/docs/SIGNAL_CONTRACT.md +200 -2
  73. package/package-script-guard.mjs +59 -3
  74. package/package.json +33 -13
  75. package/server.json +2 -2
package/docs/CORE_API.md CHANGED
@@ -13,14 +13,21 @@ Import stable Core functions from:
13
13
  ```ts
14
14
  import {
15
15
  calculateFreshnessScore,
16
+ evaluateSignals,
16
17
  formatForLLM,
18
+ getSourceProfile,
19
+ interpretEvaluations,
17
20
  looksLikeFailedAdapterContent,
21
+ normalizeSignal,
22
+ prepareProvenanceReadiness,
18
23
  scoreLabel,
19
24
  stampFreshness,
20
25
  toStructuredJSON,
21
- } from "./src/core/index.js";
26
+ } from "freshcontext-mcp/core";
22
27
  ```
23
28
 
29
+ The `freshcontext-mcp/core` subpath is the direct Core import boundary inside the current MCP package. It does not create a standalone `freshcontext-core` package yet; that remains a future package-split lane.
30
+
24
31
  ### Envelope
25
32
 
26
33
  - `stampFreshness(result, options, adapter)` creates a `FreshContext` object from adapter output.
@@ -55,7 +62,7 @@ It does not fetch, cache, write D1, inspect Worker bindings, know MCP tool schem
55
62
 
56
63
  `evaluateSignals` evaluates each input and returns evaluations sorted by existing `rankSignal` final score, preserving input order when scores tie. Context utility is returned as a sidecar and does not replace `final_score`.
57
64
 
58
- Context utility is returned as sidecar output in the current pipeline; it does not replace or modify the default `rankSignal` / `evaluateSignals` ordering. A future pass may add an explicit utility-weighted ranking mode.
65
+ Context utility is returned as sidecar output in the current pipeline. It does not replace or modify the default `rankSignal` / `evaluateSignals` ordering, and it does not control default decision labels. A future pass may add an explicit utility-weighted ranking or decision-support mode.
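The ordering rule described here can be sketched roughly as follows: sort by `final_score` in descending order, keep input order on ties, and leave utility as a sidecar. The `Evaluation` shape and the `sortEvaluations` name are hypothetical and only illustrate the rule; they are not the package's actual types or implementation.

```ts
// Hypothetical shape for illustration; not the package's real evaluation type.
interface Evaluation {
  id: string;
  final_score: number;
  utility?: { score: number }; // sidecar only; never consulted for ordering here
}

// Sort by final_score, highest first. Array.prototype.sort is stable
// (ES2019 and later), so entries with equal scores keep their input order.
function sortEvaluations(evaluations: Evaluation[]): Evaluation[] {
  return [...evaluations].sort((a, b) => b.final_score - a.final_score);
}

const ranked = sortEvaluations([
  { id: "a", final_score: 0.7, utility: { score: 0.1 } },
  { id: "b", final_score: 0.9 },
  { id: "c", final_score: 0.7, utility: { score: 0.99 } },
]);
const rankedIds = ranked.map((e) => e.id);
```

In this sketch, `c` has a much higher utility score than `a` but still ranks after it, because the two tie on `final_score` and `a` came first in the input.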
59
66
 
60
67
  Local demo:
61
68
 
@@ -78,7 +85,7 @@ These types describe the stable envelope and adapter result contract.
78
85
 
79
86
  ## Signal Contract v1
80
87
 
81
- Signal Contract v1 is the additive Core shape for a retrieved signal before it is ranked, wrapped, stored, or passed to an agent workflow.
88
+ Signal Contract v1 is the current FreshContext input standard: the stable shape for candidate context before it is ranked, wrapped, stored, judged by `evaluate_context`, or passed to an agent workflow.
82
89
 
83
90
  Public exports:
84
91
 
@@ -92,6 +99,8 @@ Public exports:
92
99
 
93
100
  `published_at` is the canonical signal timestamp. `content_date` is accepted as an adapter/envelope compatibility alias. Normalization clears invalid or meaningfully future-dated timestamps, marks failed/error-looking content as `status: "failed"`, clamps `semantic_score` into `0..1`, and records normalization reasons.
94
101
 
102
+ Future context signals and control signals are optional future metadata layers, not replacements for Signal Contract v1 and not required public input fields today.
103
+
95
104
  See [Signal Contract v1](./SIGNAL_CONTRACT.md).
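A minimal sketch of the normalization behavior described above, under stated assumptions: the one-day future tolerance, the reason strings, the `"ok"` status name, and the error-prefix check are all placeholders chosen for illustration. Only the field names and the general rules (clear bad timestamps, mark failed content, clamp `semantic_score`, record reasons) come from the text.

```ts
// Hypothetical shapes for illustration; field names follow the contract text.
interface RawSignal {
  title: string;
  content: string;
  source: string;
  source_type: string;
  published_at?: string | null;
  content_date?: string | null; // compatibility alias for published_at
  retrieved_at: string;
  semantic_score: number;
}

interface NormalizedSketch {
  published_at: string | null;
  semantic_score: number;
  status: "ok" | "failed"; // "ok" is an assumed name for the non-failed state
  reasons: string[];
}

// ASSUMPTION: one day of slack stands in for "meaningfully future-dated".
const FUTURE_SLACK_MS = 24 * 60 * 60 * 1000;

function normalizeSketch(signal: RawSignal, now: Date = new Date()): NormalizedSketch {
  const reasons: string[] = [];

  // Prefer the canonical field, then fall back to the alias.
  let publishedAt = signal.published_at ?? signal.content_date ?? null;
  if (publishedAt !== null) {
    const parsed = Date.parse(publishedAt);
    if (Number.isNaN(parsed)) {
      publishedAt = null;
      reasons.push("invalid_timestamp_cleared");
    } else if (parsed - now.getTime() > FUTURE_SLACK_MS) {
      publishedAt = null;
      reasons.push("future_timestamp_cleared");
    }
  }

  // Clamp into 0..1; non-finite scores are treated as 0 in this sketch.
  const raw = Number.isFinite(signal.semantic_score) ? signal.semantic_score : 0;
  const semanticScore = Math.min(1, Math.max(0, raw));
  if (semanticScore !== signal.semantic_score) reasons.push("semantic_score_clamped");

  // ASSUMPTION: a simple prefix test stands in for the real failed-content guard.
  const failed = /^\s*error\b/i.test(signal.content);
  if (failed) reasons.push("content_looks_failed");

  return {
    published_at: publishedAt,
    semantic_score: semanticScore,
    status: failed ? "failed" : "ok",
    reasons,
  };
}
```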
96
105
 
97
106
  ## Source Profiles
@@ -152,6 +161,51 @@ FreshContext decisions judge citation readiness, context usefulness, freshness,
152
161
 
153
162
  Demo output will be updated separately so presentation stays separate from Core decision logic.
154
163
 
164
+ ## Human-Readable Output
165
+
166
+ The human-readable output helper adds a small reader-facing layer on top of existing Core evaluations and decisions.
167
+
168
+ Public exports:
169
+
170
+ - `toReadableContextResult(evaluation, decision)`
171
+ - `HumanReadableContextResult`
172
+
173
+ The helper returns a bounded object with:
174
+
175
+ - `label`
176
+ - `summary`
177
+ - `why`
178
+ - `action`
179
+ - `warnings`
180
+
181
+ This is additive. It does not change machine decisions, ranking, freshness math, utility math, Source Profiles, envelopes, provenance, or host behavior. Utility may appear in `why` when it is already part of the decision reasons, but utility still does not control default decision labels.
182
+
183
+ Example structured result fragment:
184
+
185
+ ```json
186
+ {
187
+ "decision": "cite_as_primary",
188
+ "label": "Cite as primary",
189
+ "readable": {
190
+ "label": "Primary source",
191
+ "summary": "This source is strong enough to use as main evidence.",
192
+ "why": [
193
+ "Strong semantic match and current freshness for arxiv.",
194
+ "source profile academic_research uses lenient date policy",
195
+ "intent profile citation_check selected"
196
+ ],
197
+ "action": "Use this as main evidence while preserving citation and provenance.",
198
+ "warnings": [
199
+ "FreshContext judges citation readiness and context usefulness; it does not certify truth."
200
+ ]
201
+ }
202
+ }
203
+ ```
204
+
205
+ FreshContext does not certify truth. It records why context was used, supported, questioned, refreshed, watched, or excluded before it reaches a model.
206
+
207
+ See [Human-Readable Output Contract](./HUMAN_READABLE_OUTPUT_CONTRACT.md).
208
+
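For a consumer of the `readable` object, a sketch along these lines may help. The field names come from the list above, but the exact types are an assumption, and `ReadableSketch` and `renderReadable` are hypothetical names for a consumer-side helper, not package exports.

```ts
// Field names come from the documented list; the exact types are assumed.
interface ReadableSketch {
  label: string;
  summary: string;
  why: string[];
  action: string;
  warnings: string[];
}

// Hypothetical consumer-side helper: flatten a readable result into text lines.
function renderReadable(readable: ReadableSketch): string {
  const lines = [
    `${readable.label}: ${readable.summary}`,
    ...readable.why.map((reason) => `  why: ${reason}`),
    `  action: ${readable.action}`,
    ...readable.warnings.map((warning) => `  warning: ${warning}`),
  ];
  return lines.join("\n");
}

// Values copied from the example fragment above.
const readableText = renderReadable({
  label: "Primary source",
  summary: "This source is strong enough to use as main evidence.",
  why: [
    "Strong semantic match and current freshness for arxiv.",
    "source profile academic_research uses lenient date policy",
    "intent profile citation_check selected",
  ],
  action: "Use this as main evidence while preserving citation and provenance.",
  warnings: [
    "FreshContext judges citation readiness and context usefulness; it does not certify truth.",
  ],
});
```

The helper only formats what it is given; it makes no decision of its own, which matches the additive role described for the readable layer.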
155
209
  ## Public Ranking Primitives
156
210
 
157
211
  The ranking primitives are public, but consumers should treat their score scales carefully:
@@ -173,7 +227,7 @@ Ranking combines semantic relevance and freshness into a deterministic order. It
173
227
 
174
228
  ## Experimental Utility Primitive
175
229
 
176
- The context-conditioned utility primitive is pure and tested, but it is not production-wired into MCP ranking, Worker feeds, Store scoring, or runtime behavior.
230
+ The context-conditioned utility primitive is pure and tested. It is surfaced in Core evaluation output and decision reasons, but it does not control default ranking or decision labels.
177
231
 
178
232
  Experimental exports:
179
233
 
@@ -182,7 +236,7 @@ Experimental exports:
182
236
  - `ContextUtilityInput`
183
237
  - `ContextUtilityResult`
184
238
 
185
- These are pure Core math. They are now connected inside `evaluateSignal` as sidecar utility output, but they are not production-wired into MCP ranking, Worker feeds, Store scoring, or runtime behavior.
239
+ These are pure Core math. They are now connected inside `evaluateSignal` as sidecar utility output and explanatory material, but they are not production-wired into MCP ranking, Worker feeds, Store scoring, or default decision thresholds.
186
240
 
187
241
  ## Provenance Helpers
188
242
 
@@ -198,6 +252,16 @@ Ha-Pri v2 is available as pure Core helper functionality:
198
252
 
199
253
  `evaluateSignal` can optionally prepare Ha-Pri v2 material when `includeProvenance` is set and required input material is present. Core does not persist provenance, add D1 columns, verify rows on read, reject rows, or replace Worker Ha-Pri v1 behavior.
200
254
 
255
+ Provenance readiness is available as a pure Core sidecar:
256
+
257
+ - `prepareProvenanceReadiness(input, options?)`
258
+ - `ProvenanceReadinessState`
259
+ - `ProvenanceReadinessResult`
260
+
261
+ It classifies caller-provided signal provenance as `complete`, `partial`, `incomplete`, `unknown`, or `derived`. The result exposes source identity completeness, timing completeness, normalized source and timestamp fields, canonical content hash material, optional semantic fingerprint hash material, optional Ha-Pri v2 identity material when the caller supplies enough inputs, warnings, and reasons.
262
+
263
+ `provenance_readiness` is included additively on `evaluateSignal` results and in the structured `evaluate_context` JSON result. It does not fetch, crawl, scrape, read folders, call adapters, change ranking, change decisions, certify truth, or enforce rejection policy.
264
+
201
265
  ## Internal, Policy, and Compatibility Exports
202
266
 
203
267
  - `clampScore` is an internal ranking helper. It is currently exported for tests and utility use, but it should not be presented as a primary buyer-facing API.
package/docs/CORE_MCP_BOUNDARY.md CHANGED
@@ -50,18 +50,27 @@ Worker/site surfaces own deployment concerns:
50
50
 
51
51
  Live today:
52
52
 
53
- - npm package: `freshcontext-mcp@0.3.19`
53
+ - npm package: `freshcontext-mcp@0.3.21`
54
54
  - MCP stdio server and published binary: `freshcontext-mcp`
55
+ - Core subpath export: `freshcontext-mcp/core`
55
56
  - `evaluate_context` MCP tool for caller-provided candidate context
56
57
  - 21 named read-only reference adapters
57
58
  - Core signal evaluation
58
59
  - Source Profiles
59
- - Decision Helper
60
- - adapter registry metadata
60
+ - Decision Helper
61
+ - human-readable `readable` output on structured `evaluate_context` results
62
+ - provenance readiness and readable handoff safety
63
+ - adapter registry metadata
61
64
  - arXiv signal-to-decision proof
62
65
  - bring-your-own-context local demos
63
66
  - Trust Scanner release gate
64
67
 
68
+ Network boundary:
69
+
70
+ - `evaluate_context` does not fetch, crawl, scrape, browse, read folders, or call adapters.
71
+ - The 21 named reference adapters are optional read-only network tools and use network access only when invoked.
72
+ - FreshContext Core remains the no-network judgment layer.
73
+
65
74
  Not live today:
66
75
 
67
76
  - standalone Core npm package
@@ -78,7 +87,7 @@ Not live today:
78
87
  The safe split path is staged:
79
88
 
80
89
  1. Keep `freshcontext-mcp` stable for current users.
81
- 2. Maintain Core as a pure internal export surface.
90
+ 2. Maintain Core as a pure package subpath export surface.
82
91
  3. Audit Core dependencies, Node/browser compatibility, and API stability.
83
92
  4. Publish a standalone Core package only after compatibility tests exist.
84
93
  5. Make `freshcontext-mcp` depend on the standalone Core package.
package/docs/FUTURE_LANES.md CHANGED
@@ -10,9 +10,10 @@ The current package boundary is documented in [Core / MCP Boundary](./CORE_MCP_B
10
10
 
11
11
  Live today:
12
12
 
13
- - npm package: `freshcontext-mcp@0.3.19`
13
+ - npm package: `freshcontext-mcp@0.3.21`
14
14
  - MCP stdio server
15
15
  - `evaluate_context` MCP tool for caller-provided candidate context
16
+ - Signal Contract v1 as the stable candidate-context input shape
16
17
  - 21 read-only reference adapters
17
18
  - Core signal evaluation
18
19
  - Source Profiles
@@ -32,6 +33,26 @@ Not live today:
32
33
  - standalone Core SDK package
33
34
  - full adapter ingestion
34
35
 
36
+ ## Phase 0: Stabilize The Signal Contract
37
+
38
+ Goal:
39
+
40
+ ```text
41
+ Treat Signal Contract v1 as the stable input boundary for FreshContext.
42
+ ```
43
+
44
+ Current contract:
45
+
46
+ ```text
47
+ title + content + source + source_type + published_at + retrieved_at + semantic_score
48
+ ```
49
+
50
+ This is live today. It is not the same thing as future context signals or control signals.
51
+
52
+ Tasks in this lane should document examples, invalid-input behavior, and normalization expectations. Do not expand required fields unless tests prove the new metadata improves decisions.
53
+
54
+ Future context signals, control signals, ingestion quality signals, structure preservation signals, and provenance confidence signals belong to later Decision Layer upgrades. They should remain optional metadata, not public required fields.
55
+
35
56
  ## Lane 1: Client Setup Reliability
36
57
 
37
58
  Goal:
@@ -121,9 +142,11 @@ Goal:
121
142
  Make decisions more useful without silently changing ranking.
122
143
  ```
123
144
 
124
- Possible inputs include context utility, control signal, future context signal, confidence tiers, and source-profile-specific thresholds.
145
+ Possible inputs include context utility, control signal, future context signal, ingestion quality, structure preservation, provenance confidence, confidence tiers, and source-profile-specific thresholds.
146
+
147
+ These are optional future metadata upgrades on top of Signal Contract v1. They should only be exposed when they make decisions clearer without making the caller-facing contract harder to use.
125
148
 
126
- Do not make `utility.score` affect ranking by default without a dedicated ranking policy pass.
149
+ Do not make `utility.score` affect ranking or decision labels by default without a dedicated policy pass.
127
150
 
128
151
  ## Lane 7: Ha-Pri v2 Production Path
129
152
 
package/docs/HA_PRI_V2_DESIGN.md CHANGED
@@ -10,7 +10,13 @@ Ha-Pri v2 is an additive provenance-hardening model for FreshContext Store/Ledge
10
10
 
11
11
  The goal is to keep Ha-Pri v1 readable while designing a stronger future signature that binds a row to canonical content, semantic identity, source metadata, timestamps, and engine version.
12
12
 
13
- Phase 3-B adds pure Core helper functions and deterministic tests for the v2 model. Phase 3-C adds `examples/ha-pri-v2-example.ts`, a deterministic developer fixture showing `calculateHaPriV2` and `verifyHaPriV2` returning valid, invalid, and unknown verification states. Production Store wiring remains future work. This document does not change the D1 schema, change Worker write paths, migrate old rows, add HMAC secrets, or alter production scoring.
13
+ Phase 3-B adds pure Core helper functions and deterministic tests for the v2 model. Phase 3-C adds `examples/ha-pri-v2-example.ts`, a deterministic developer fixture showing `calculateHaPriV2` and `verifyHaPriV2` returning valid, invalid, and unknown verification states. Production Store wiring remains future work. This document does not change the D1 schema, change Worker write paths, migrate old rows, add HMAC secrets, or alter production scoring.
14
+
15
+ Pass 11-J adds golden test vectors for the pure Core helpers. Ha-Pri v2 golden vectors prove deterministic Core provenance behavior: canonicalization, SHA-256 hashes, signing payload construction, signature generation, and verification status are stable and repeatable. They do not mean Ha-Pri v2 is production-enforced on Worker/D1 reads.
16
+
17
+ Plain SHA-256 provides deterministic integrity and audit checks. HMAC or private-key signing would be needed later for stronger origin-authentication guarantees.
18
+
19
+ Pass 11-K adds a design-only production enforcement plan in `docs/HA_PRI_V2_PRODUCTION_ENFORCEMENT_PLAN.md`. That plan covers the future D1/storage, write-path, read/debug verification, compatibility, backfill, threat model, and rollout path. It does not implement Worker/D1 enforcement.
14
20
 
15
21
  ## Current Ha-Pri v1 Audit
16
22
 
package/docs/HA_PRI_V2_PRODUCTION_ENFORCEMENT_PLAN.md ADDED
@@ -0,0 +1,414 @@
1
+ # Ha-Pri v2 Production Enforcement Plan
2
+
3
+ Status: design only
4
+ Phase: Pass 11-K
5
+ Runtime impact: none
6
+
7
+ Ha-Pri v2 production enforcement is a future rollout path. Current FreshContext releases only include the Core helper and deterministic golden vectors unless a later implementation pass explicitly wires Worker/D1 enforcement.
8
+
9
+ This plan describes how Ha-Pri v2 could move from pure Core provenance helper to production Store/Worker verification without overclaiming current behavior.
10
+
11
+ ## 1. Current State
12
+
13
+ Current FreshContext behavior:
14
+
15
+ - Ha-Pri v1 is the current Worker/feed audit stamp.
16
+ - Ha-Pri v1 is stored as `scrape_results.ha_pri_sig`.
17
+ - Ha-Pri v1 is returned in Worker feed `intelligence_stamps`.
18
+ - Ha-Pri v1 is a provenance stamp and audit reference, not hard row rejection.
19
+ - Ha-Pri v2 exists as a pure Core helper in `src/core/provenance.ts`.
20
+ - Ha-Pri v2 has deterministic golden vectors in `tests/fixtures/ha-pri-v2-golden-vectors.json`.
21
+ - Ha-Pri v2 is not production-enforced on Worker/D1 reads.
22
+ - Ha-Pri v2 does not currently reject rows.
23
+ - Ha-Pri v2 does not currently provide private-key origin authentication.
24
+
25
+ The current v2 helper provides deterministic canonicalization, SHA-256 hashing, signing payload construction, signature calculation, and verification status:
26
+
27
+ ```txt
28
+ valid
29
+ invalid
30
+ unknown
31
+ ```
32
+
33
+ That is real Core behavior. It is not yet Worker/D1 enforcement.
34
+
35
+ ## 2. Target Future State
36
+
37
+ Future production enforcement should make stored context rows independently reviewable after write.
38
+
39
+ Target behavior:
40
+
41
+ - Worker write path stores Ha-Pri v2 provenance material for new rows.
42
+ - Rows can later be verified as `valid`, `invalid`, or `unknown`.
43
+ - Debug/internal read paths can report verification status.
44
+ - Safe public read paths can expose limited verification status without leaking internals.
45
+ - Invalid rows are not silently treated as trusted.
46
+ - Old rows remain compatible through `unknown` and optional backfill behavior.
47
+ - Strict rejection remains optional and staged, not the first rollout.
48
+
49
+ The target is not "FreshContext proves truth." The target is "FreshContext can detect whether stored provenance material still matches the stored row material under the documented v2 contract."
50
+
51
+ ## 3. Proposed D1 / Storage Fields
52
+
53
+ Essential fields:
54
+
55
+ ```txt
56
+ ha_pri_sig_v2 TEXT
57
+ ha_pri_canonical_content_sha256 TEXT
58
+ ha_pri_semantic_fingerprint_sha256 TEXT
59
+ ha_pri_signing_payload_version TEXT
60
+ ha_pri_engine_version TEXT
61
+ ```
62
+
63
+ Recommended operational fields:
64
+
65
+ ```txt
66
+ ha_pri_verification_status TEXT
67
+ ha_pri_verified_at TEXT
68
+ ha_pri_backfill_status TEXT
69
+ ```
70
+
71
+ Likely unnecessary as separate stored fields:
72
+
73
+ ```txt
74
+ ha_pri_adapter
75
+ ha_pri_published_at
76
+ ha_pri_retrieved_at
77
+ ```
78
+
79
+ Reason: adapter, published timestamp, and retrieved/scraped timestamp already exist or should exist as first-class row fields. Duplicating them inside Ha-Pri-specific columns risks drift. The signing payload should read those canonical row fields directly during verification.
80
+
81
+ Minimum practical schema:
82
+
83
+ ```sql
84
+ ALTER TABLE scrape_results ADD COLUMN ha_pri_sig_v2 TEXT;
85
+ ALTER TABLE scrape_results ADD COLUMN ha_pri_canonical_content_sha256 TEXT;
86
+ ALTER TABLE scrape_results ADD COLUMN ha_pri_semantic_fingerprint_sha256 TEXT;
87
+ ALTER TABLE scrape_results ADD COLUMN ha_pri_signing_payload_version TEXT;
88
+ ALTER TABLE scrape_results ADD COLUMN ha_pri_engine_version TEXT;
89
+ ALTER TABLE scrape_results ADD COLUMN ha_pri_verification_status TEXT;
90
+ ALTER TABLE scrape_results ADD COLUMN ha_pri_verified_at TEXT;
91
+ ALTER TABLE scrape_results ADD COLUMN ha_pri_backfill_status TEXT;
92
+ ```
93
+
94
+ Storage status values should be boring and explicit:
95
+
96
+ ```txt
97
+ valid
98
+ invalid
99
+ unknown
100
+ not_checked
101
+ ```
102
+
103
+ Backfill status values should avoid pretending old rows were originally v2-stamped:
104
+
105
+ ```txt
106
+ none
107
+ backfilled
108
+ unknown_origin
109
+ failed
110
+ ```
111
+
112
+ ## 4. Write-Path Design
113
+
114
+ The future Worker write path should dual-stamp v1 and v2 for new rows.
115
+
116
+ Recommended write sequence:
117
+
118
+ 1. Adapter returns raw candidate content.
119
+ 2. Existing Worker scoring computes current DAR fields and Ha-Pri v1.
120
+ 3. Write path prepares Ha-Pri v2 input:
121
+ - `resultId`: row id that will be stored
122
+ - `rawContent`: canonical row raw content
123
+ - `semanticFingerprint`: semantic fingerprint material or stored fingerprint
124
+ - `adapter`: adapter id
125
+ - `publishedAt`: normalized source publication timestamp or `null`
126
+ - `retrievedAt`: normalized scrape/retrieval timestamp
127
+ - `engineVersion`: FreshContext engine/package version or explicit Worker engine version
128
+ 4. If required v2 material is complete, calculate:
129
+ - canonical content SHA-256
130
+ - semantic fingerprint SHA-256
131
+ - signing payload version
132
+ - Ha-Pri v2 signature
133
+ 5. Store v1 fields as today.
134
+ 6. Store v2 fields alongside v1 fields.
135
+ 7. If material is incomplete, store unknown-compatible metadata and do not pretend the row is valid.
136
+
137
+ Recommended incomplete-material behavior:
138
+
139
+ ```txt
140
+ ha_pri_sig_v2 = null
141
+ ha_pri_verification_status = "unknown"
142
+ ha_pri_backfill_status = "none"
143
+ ```
144
+
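The write sequence and the incomplete-material fallback above could look roughly like the sketch below. It is illustrative only: `stampSketch`, the payload layout, and the choice of `not_checked` for a freshly stamped row are assumptions. The real canonicalization and signing payload live in the Core helper and are not reproduced here.

```ts
import { createHash } from "node:crypto";

// Input field names follow the plan's step 3 list.
interface HaPriV2Input {
  resultId: string;
  rawContent: string | null;
  semanticFingerprint: string | null;
  adapter: string;
  publishedAt: string | null; // may legitimately be null
  retrievedAt: string | null;
  engineVersion: string | null;
}

interface V2Columns {
  ha_pri_sig_v2: string | null;
  ha_pri_verification_status: "valid" | "invalid" | "unknown" | "not_checked";
  ha_pri_backfill_status: "none" | "backfilled" | "unknown_origin" | "failed";
}

const sha256Hex = (text: string): string =>
  createHash("sha256").update(text, "utf8").digest("hex");

function stampSketch(input: HaPriV2Input): V2Columns {
  const { rawContent, semanticFingerprint, retrievedAt, engineVersion } = input;

  // Incomplete material: store unknown-compatible metadata, never a fake signature.
  if (!rawContent || !semanticFingerprint || !retrievedAt || !engineVersion) {
    return {
      ha_pri_sig_v2: null,
      ha_pri_verification_status: "unknown",
      ha_pri_backfill_status: "none",
    };
  }

  // ASSUMPTION: newline-joined fields stand in for the real signing payload.
  const payload = [
    input.resultId,
    sha256Hex(rawContent),
    sha256Hex(semanticFingerprint),
    input.adapter,
    input.publishedAt ?? "",
    retrievedAt,
    engineVersion,
  ].join("\n");

  return {
    ha_pri_sig_v2: sha256Hex(payload),
    ha_pri_verification_status: "not_checked", // assumed initial state for new rows
    ha_pri_backfill_status: "none",
  };
}
```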
145
+ Canonical content should be produced by the same pure helper behavior used in Core golden vectors. Do not invent a second Worker-only canonicalization contract.
146
+
147
+ Semantic fingerprint should be produced before signing and should be stable across retries for the same underlying source item. If the fingerprint is missing, v2 signing should fall back to `unknown`, not a fake valid signature.
148
+
149
+ Engine version should be explicit. The safest initial choice is the package/server version used by the running Worker build.
150
+
151
+ ## 5. Read / Debug Verification Design
152
+
153
+ Future read verification should be a pure recomputation:
154
+
155
+ ```txt
156
+ verifyHaPriV2(row) -> valid | invalid | unknown
157
+ ```
158
+
159
+ Suggested internal verification input:
160
+
161
+ ```ts
162
+ {
163
+ resultId: row.id,
164
+ rawContent: row.raw_content,
165
+ semanticFingerprint: row.semantic_fingerprint,
166
+ adapter: row.adapter,
167
+ publishedAt: row.published_at,
168
+ retrievedAt: row.scraped_at,
169
+ engineVersion: row.ha_pri_engine_version
170
+ }
171
+ ```
172
+
173
+ Debug output may include:
174
+
175
+ ```json
176
+ {
177
+ "ha_pri_v2": {
178
+ "status": "valid",
179
+ "checked_at": "2026-06-11T12:00:00.000Z",
180
+ "payload_version": "FRESHCONTEXT_HA_PRI_V2",
181
+ "canonical_content_sha256": "sha256...",
182
+ "semantic_fingerprint_sha256": "sha256..."
183
+ }
184
+ }
185
+ ```
186
+
187
+ Safe public output should be smaller:
188
+
189
+ ```json
190
+ {
191
+ "provenance": {
192
+ "ha_pri_v2_status": "valid"
193
+ }
194
+ }
195
+ ```
196
+
197
+ Do not expose signing payloads or debug hashes in broad public outputs unless there is a clear user need.
198
+
199
+ Suggested staged behavior:
200
+
201
+ - Phase 1: report-only verification in internal/debug output.
202
+ - Phase 2: warning in read/debug path when invalid.
203
+ - Phase 3: optional strict mode for private deployments.
204
+ - Phase 4: possible reject/block policy only after replay data and operational evidence.
205
+
206
+ Invalid should not become automatic rejection first. A migration bug, canonicalization mismatch, or schema rollout issue could otherwise hide useful rows during rollout.
207
+
208
+ ## 6. Compatibility And Backfill
209
+
210
+ Old rows must remain readable.
211
+
212
+ Compatibility rules:
213
+
214
+ - v1-only rows verify as `unknown` for v2.
215
+ - Rows with missing v2 fields verify as `unknown`.
216
+ - Rows with malformed v2 signatures verify as `invalid`.
217
+ - Rows with present but mismatched v2 signatures verify as `invalid`.
218
+ - Missing `ha_pri_sig_v2` is not the same as tampering.
219
+ - Existing `ha_pri_sig` remains readable for historical continuity.
220
+
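The compatibility rules above reduce to a small classification, sketched here with a hypothetical `classifyV2` function. The 64-character lowercase hex check is an assumption about what a well-formed signature looks like; the real helper may use a different encoding, and this sketch treats a blank signature as malformed, which may differ from current helper behavior.

```ts
type V2Status = "valid" | "invalid" | "unknown";

// ASSUMPTION: a well-formed signature is a lowercase hex SHA-256 digest.
const SHA256_HEX = /^[0-9a-f]{64}$/;

// storedSig: the row's ha_pri_sig_v2 value, if any.
// expectedSig: the signature recomputed from row fields, or null when the
// required v2 material is missing and nothing can be recomputed.
function classifyV2(
  storedSig: string | null | undefined,
  expectedSig: string | null,
): V2Status {
  // v1-only rows and rows with missing v2 material are unknown, not tampered.
  if (storedSig == null || expectedSig === null) return "unknown";
  // Malformed signatures are invalid.
  if (!SHA256_HEX.test(storedSig)) return "invalid";
  // Present but mismatched signatures are invalid.
  return storedSig === expectedSig ? "valid" : "invalid";
}
```

Keeping `unknown` separate from `invalid` is what lets old rows stay readable while tampered or corrupted rows still surface.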
221
+ Backfill rules:
222
+
223
+ - Backfilled provenance must be marked as `backfilled` or `unknown_origin`.
224
+ - Backfill must not imply the row was v2-stamped at original write time.
225
+ - Backfill should preserve original row timestamps.
226
+ - Backfill should record when verification/backfill happened.
227
+ - Backfill should be reversible or repeatable where practical.
228
+
229
+ Possible backfill process:
230
+
231
+ 1. Select rows missing `ha_pri_sig_v2`.
232
+ 2. Reconstruct v2 input from stored row fields.
233
+ 3. If required material is complete, calculate v2 fields.
234
+ 4. Store v2 fields with `ha_pri_backfill_status = "backfilled"`.
235
+ 5. If required material is incomplete, store `ha_pri_verification_status = "unknown"` and `ha_pri_backfill_status = "unknown_origin"`.
236
+ 6. Report counts for backfilled, unknown, invalid, and failed rows.
237
+
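One way to picture the backfill steps is as a planning pass that produces updates and counts without touching storage. Everything in this sketch is hypothetical: `planBackfill`, the row shape, the injected `sign` callback, and the use of `not_checked` for newly backfilled rows. It is a sketch of the process, not a proposed script.

```ts
// Hypothetical subset of a stored row; column names follow the plan.
interface StoredRow {
  id: string;
  raw_content: string | null;
  semantic_fingerprint: string | null;
  ha_pri_sig_v2: string | null;
}

interface BackfillUpdate {
  id: string;
  ha_pri_sig_v2: string | null;
  ha_pri_verification_status: "unknown" | "not_checked";
  ha_pri_backfill_status: "backfilled" | "unknown_origin";
}

interface BackfillPlan {
  updates: BackfillUpdate[];
  counts: { backfilled: number; unknown: number; skipped: number };
}

// `sign` is injected so the sketch does not invent a signing payload.
function planBackfill(rows: StoredRow[], sign: (row: StoredRow) => string): BackfillPlan {
  const plan: BackfillPlan = { updates: [], counts: { backfilled: 0, unknown: 0, skipped: 0 } };

  for (const row of rows) {
    // Step 1: only rows missing ha_pri_sig_v2 are candidates.
    if (row.ha_pri_sig_v2 !== null) {
      plan.counts.skipped += 1;
      continue;
    }
    // Step 5: incomplete material is recorded as unknown, never signed.
    if (row.raw_content === null || row.semantic_fingerprint === null) {
      plan.updates.push({
        id: row.id,
        ha_pri_sig_v2: null,
        ha_pri_verification_status: "unknown",
        ha_pri_backfill_status: "unknown_origin",
      });
      plan.counts.unknown += 1;
      continue;
    }
    // Steps 3 and 4: complete material is signed and explicitly marked backfilled.
    plan.updates.push({
      id: row.id,
      ha_pri_sig_v2: sign(row),
      ha_pri_verification_status: "not_checked",
      ha_pri_backfill_status: "backfilled",
    });
    plan.counts.backfilled += 1;
  }
  return plan;
}
```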
238
+ ## 7. Security Boundary
239
+
240
+ Plain SHA-256 gives deterministic integrity and audit checks.
241
+
242
+ Plain SHA-256 does not prove private origin authentication. Anyone with all payload fields can recompute a plain SHA-256 signature.
243
+
244
+ Ha-Pri v2 helps detect:
245
+
246
+ - accidental row corruption
247
+ - changed content after write
248
+ - changed semantic fingerprint material
249
+ - changed adapter/timestamp/version fields included in the signing payload
250
+ - malformed stored signatures
251
+
252
+ Ha-Pri v2 does not solve:
253
+
254
+ - truth certification
255
+ - legal, medical, tax, employment, academic, or investment correctness
256
+ - private origin authentication without a secret or private key
257
+ - compromise of the write path before signing
258
+ - compromise of all row fields plus signature under plain SHA-256
259
+
260
+ Recommendation:
261
+
262
+ Do not add HMAC/private signing immediately to the open package. Keep the open package deterministic and stateless.
263
+
264
+ Consider HMAC or private-key signing later for:
265
+
266
+ - hosted FreshContext endpoints
267
+ - private production deployments
268
+ - paid/tenant-specific infrastructure
269
+ - environments where the verifier must know the row was stamped by a trusted FreshContext deployment
270
+
271
+ If HMAC/private signing is added later, it requires:
272
+
273
+ - secret storage outside the repo
274
+ - key ids
275
+ - key rotation
276
+ - old-key verification policy
277
+ - signer/verifier boundary documentation
278
+ - tests proving secrets are never logged or returned
279
+
280
+ ## 8. Threat Model
281
+
282
+ Threats considered:
283
+
284
+ ### Accidental row corruption
285
+
286
+ Ha-Pri v2 helps by recomputing the expected signature and surfacing `invalid`.
287
+
288
+ ### Stale or partial provenance
289
+
290
+ Ha-Pri v2 helps by returning `unknown` when required material is missing. The system should not pretend such rows are valid.
291
+
292
+ ### Tampered D1 rows
293
+
294
+ Ha-Pri v2 helps if an attacker changes stored content or bound fields without also updating all matching v2 fields.
295
+
296
+ Plain SHA-256 does not help if an attacker can rewrite all row fields and recompute the public signature.
297
+
298
+ ### Malformed signatures
299
+
300
+ Malformed, blank, or nonmatching signatures should produce `invalid` or `unknown` according to current helper behavior. They should not crash reads.
301
+
302
+ ### Recomputed public SHA-256 signatures
303
+
304
+ Because v2 currently uses plain SHA-256, a party with all fields can recompute a matching signature. HMAC/private signing is the later answer if origin authentication becomes necessary.
305
+
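The difference between a plain hash and a keyed MAC can be shown with Node's built-in `node:crypto` module. The payload string and both keys below are placeholders, not the real v2 payload layout or any real secret; an actual secret would live in deployment configuration, never in source.

```ts
import { createHash, createHmac, timingSafeEqual } from "node:crypto";

// Placeholder payload; not the real Ha-Pri v2 signing payload.
const payload = "example-signing-payload";

// Plain SHA-256: anyone who holds the payload fields gets the same digest.
const writerDigest = createHash("sha256").update(payload).digest("hex");
const outsiderDigest = createHash("sha256").update(payload).digest("hex");
const plainHashMatches = writerDigest === outsiderDigest;

// HMAC-SHA-256: reproducing the tag also requires the secret key.
const writerMac = createHmac("sha256", "placeholder-deployment-secret")
  .update(payload)
  .digest();
const outsiderMac = createHmac("sha256", "wrong-guess").update(payload).digest();

// Both buffers are 32 bytes, so timingSafeEqual can compare them directly.
const macMatches = timingSafeEqual(writerMac, outsiderMac);
```

Here `plainHashMatches` is `true` and `macMatches` is `false`, which is the gap the plan describes between integrity checking and origin authentication.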
306
+ ### Debug endpoint leakage
307
+
308
+ Debug routes should remain authenticated. Public outputs should avoid exposing full signing payloads or internal row material unless deliberately needed.
309
+
310
+ ### Secret exposure if HMAC is added later
311
+
312
+ HMAC/private signing introduces secret-management risk. Secrets must live in deployment configuration, never in docs, fixtures, npm package output, or client-visible responses.
313
+
314
+ ## 9. Tests Needed Before Implementation
315
+
316
+ Future implementation should add tests before production rollout:
317
+
318
+ - D1 migration tests if the migration harness supports them.
319
+ - Write-path stamping tests for new rows.
320
+ - Dual-stamp tests proving v1 remains unchanged.
321
+ - Read-path verification tests for `valid`, `invalid`, and `unknown`.
322
+ - Old-row tests proving missing v2 fields are `unknown`, not invalid.
323
+ - Tampered-row tests for changed content, semantic fingerprint, adapter, timestamps, and engine version.
324
+ - Malformed signature tests.
325
+ - Debug output safety tests.
326
+ - Public output minimization tests.
327
+ - Backfill tests with complete and incomplete rows.
328
+ - Worker dry-run validation.
329
+ - HMAC/private signing tests if that later lands.
330
+
331
+ Do not add fake production-enforcement tests before implementation exists.
332
+
333
+ ## 10. Rollout Phases
334
+
335
+ ### Phase 0: Core helper and golden vectors
336
+
337
+ Done.
338
+
339
+ Includes:
340
+
341
+ - pure Core Ha-Pri v2 helper
342
+ - deterministic golden vectors
343
+ - valid / invalid / unknown verification behavior
344
+
345
+ ### Phase 1: Storage schema design
346
+
347
+ This document.
348
+
349
+ No migration yet.
350
+
351
+ ### Phase 2: Write-path dual-stamp v1 + v2
352
+
353
+ Add D1 columns and write v2 fields for new rows only. Keep v1 intact.
354
+
355
+ ### Phase 3: Report-only read/debug verification
356
+
357
+ Recompute v2 on read/debug paths and report status. Do not reject rows yet.
358
+
359
+ ### Phase 4: Backfill tooling
360
+
361
+ Backfill historical rows only with explicit `backfilled` or `unknown_origin` markers.
362
+
363
+ ### Phase 5: Optional strict mode
364
+
365
+ Private deployments may opt into warnings or blocking for invalid rows after enough evidence.
366
+
367
+ ### Phase 6: Hosted/private HMAC signing
368
+
369
+ Add origin-authenticated signing only if hosted/private use cases require it.
370
+
371
+ ## 11. Non-Goals
372
+
373
+ This pass does not implement:
374
+
375
+ - D1 migration
376
+ - Worker enforcement
377
+ - read-path rejection
378
+ - debug endpoint changes
379
+ - HMAC/private signing
380
+ - backfill script
381
+ - npm publish
382
+ - version bump
383
+ - Cloudflare deploy
384
+ - public security claim upgrade
385
+ - new MCP tools
386
+ - ranking/scoring changes
387
+ - Operator/retrieve behavior
388
+
389
+ ## Release Gates For Future Implementation
390
+
391
+ Before any implementation pass can claim production enforcement:
392
+
393
+ - migrations must be reviewed and dry-run
394
+ - new rows must dual-stamp v1 and v2
395
+ - v1 compatibility tests must pass
396
+ - v2 read verification tests must pass
397
+ - old-row `unknown` behavior must be proven
398
+ - tampered-row `invalid` behavior must be proven
399
+ - debug output must avoid leaking secrets or excessive internals
400
+ - Worker dry-run must pass
401
+ - Trust gate must pass with effective fail 0
402
+ - public docs must say exactly what is implemented
403
+
404
+ ## Product Interpretation
405
+
406
+ Ha-Pri v2 is a provenance hardening lane for stored FreshContext rows. It supports the larger product story only when it stays honest:
407
+
408
+ ```txt
409
+ candidate context in
410
+ decision-ready context out
411
+ optional provenance verification for stored rows
412
+ ```
413
+
414
+ It is not a truth engine and not a substitute for authentication, authorization, or hosted tenant isolation.