freshcontext-mcp 0.3.19 → 0.3.21
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/FRESHCONTEXT_SPEC.md +317 -0
- package/METHODOLOGY.md +381 -0
- package/README.md +55 -5
- package/SECURITY.md +9 -7
- package/dist/adapters/arxiv.d.ts +15 -0
- package/dist/adapters/arxiv.js +3 -2
- package/dist/adapters/changelog.d.ts +2 -0
- package/dist/adapters/changelog.js +4 -2
- package/dist/adapters/finance.d.ts +2 -0
- package/dist/adapters/finance.js +1 -1
- package/dist/adapters/gdelt.d.ts +2 -0
- package/dist/adapters/gdelt.js +1 -1
- package/dist/adapters/gebiz.d.ts +2 -0
- package/dist/adapters/gebiz.js +1 -1
- package/dist/adapters/github.d.ts +2 -0
- package/dist/adapters/govcontracts.d.ts +2 -0
- package/dist/adapters/hackernews.d.ts +2 -0
- package/dist/adapters/jobs.d.ts +2 -0
- package/dist/adapters/jobs.js +6 -6
- package/dist/adapters/packageTrends.d.ts +2 -0
- package/dist/adapters/productHunt.d.ts +2 -0
- package/dist/adapters/reddit.d.ts +8 -0
- package/dist/adapters/reddit.js +12 -5
- package/dist/adapters/registry.d.ts +19 -0
- package/dist/adapters/repoSearch.d.ts +2 -0
- package/dist/adapters/repoSearch.js +1 -1
- package/dist/adapters/scholar.d.ts +2 -0
- package/dist/adapters/secFilings.d.ts +2 -0
- package/dist/adapters/secFilings.js +1 -1
- package/dist/adapters/yc.d.ts +2 -0
- package/dist/core/decay.d.ts +5 -0
- package/dist/core/decision.d.ts +3 -0
- package/dist/core/decision.js +1 -3
- package/dist/core/envelope.d.ts +5 -0
- package/dist/core/envelope.js +9 -1
- package/dist/core/explain.d.ts +12 -0
- package/dist/core/guards.d.ts +1 -0
- package/dist/core/index.d.ts +14 -0
- package/dist/core/index.js +2 -0
- package/dist/core/pipeline.d.ts +3 -0
- package/dist/core/pipeline.js +8 -0
- package/dist/core/provenance.d.ts +5 -0
- package/dist/core/provenanceReadiness.d.ts +2 -0
- package/dist/core/provenanceReadiness.js +220 -0
- package/dist/core/rank.d.ts +4 -0
- package/dist/core/readable.d.ts +2 -0
- package/dist/core/readable.js +75 -0
- package/dist/core/signal.d.ts +3 -0
- package/dist/core/sourceProfiles.d.ts +4 -0
- package/dist/core/types.d.ts +239 -0
- package/dist/core/utility.d.ts +2 -0
- package/dist/rest/handler.d.ts +1 -0
- package/dist/security.d.ts +15 -0
- package/dist/security.js +3 -1
- package/dist/server.d.ts +2 -0
- package/dist/server.js +2 -2
- package/dist/tools/evaluateContext.d.ts +21 -0
- package/dist/tools/evaluateContext.js +22 -1
- package/dist/tools/freshnessStamp.d.ts +1 -0
- package/dist/types.d.ts +1 -0
- package/docs/API_DESIGN.md +28 -1
- package/docs/CLIENT_SETUP.md +166 -0
- package/docs/CODEX_MCP_USAGE.md +4 -4
- package/docs/CORE_API.md +69 -5
- package/docs/CORE_MCP_BOUNDARY.md +13 -4
- package/docs/FUTURE_LANES.md +26 -3
- package/docs/HA_PRI_V2_DESIGN.md +7 -1
- package/docs/HA_PRI_V2_PRODUCTION_ENFORCEMENT_PLAN.md +414 -0
- package/docs/HUMAN_READABLE_OUTPUT_CONTRACT.md +293 -0
- package/docs/RELEASE_INTEGRITY.md +1 -1
- package/docs/RELEASE_NOTES.md +33 -5
- package/docs/SIGNAL_CONTRACT.md +200 -2
- package/package-script-guard.mjs +59 -3
- package/package.json +33 -13
- package/server.json +2 -2
package/docs/CORE_API.md
CHANGED
|
@@ -13,14 +13,21 @@ Import stable Core functions from:
|
|
|
13
13
|
```ts
|
|
14
14
|
import {
|
|
15
15
|
calculateFreshnessScore,
|
|
16
|
+
evaluateSignals,
|
|
16
17
|
formatForLLM,
|
|
18
|
+
getSourceProfile,
|
|
19
|
+
interpretEvaluations,
|
|
17
20
|
looksLikeFailedAdapterContent,
|
|
21
|
+
normalizeSignal,
|
|
22
|
+
prepareProvenanceReadiness,
|
|
18
23
|
scoreLabel,
|
|
19
24
|
stampFreshness,
|
|
20
25
|
toStructuredJSON,
|
|
21
|
-
} from "
|
|
26
|
+
} from "freshcontext-mcp/core";
|
|
22
27
|
```
|
|
23
28
|
|
|
29
|
+
The `freshcontext-mcp/core` subpath is the direct Core import boundary inside the current MCP package. It does not create a standalone `freshcontext-core` package yet; that remains a future package-split lane.
|
|
30
|
+
|
|
24
31
|
### Envelope
|
|
25
32
|
|
|
26
33
|
- `stampFreshness(result, options, adapter)` creates a `FreshContext` object from adapter output.
|
|
@@ -55,7 +62,7 @@ It does not fetch, cache, write D1, inspect Worker bindings, know MCP tool schem
|
|
|
55
62
|
|
|
56
63
|
`evaluateSignals` evaluates each input and returns evaluations sorted by existing `rankSignal` final score, preserving input order when scores tie. Context utility is returned as a sidecar and does not replace `final_score`.
|
|
57
64
|
|
|
58
|
-
Context utility is returned as sidecar output in the current pipeline
|
|
65
|
+
Context utility is returned as sidecar output in the current pipeline. It does not replace or modify the default `rankSignal` / `evaluateSignals` ordering, and it does not control default decision labels. A future pass may add an explicit utility-weighted ranking or decision-support mode.
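The ordering rule above can be sketched as follows. This is a minimal illustration, not the package's implementation: the `ScoredEvaluation` shape and the `sortByFinalScore` name are hypothetical, and only `final_score` and the idea of a utility sidecar come from the text.

```ts
// Hypothetical shape for illustration; the real evaluation type is exported by freshcontext-mcp/core.
interface ScoredEvaluation {
  id: string;
  final_score: number;
  utility?: { score: number }; // sidecar only: never consulted for ordering in this sketch
}

// Sort descending by final_score on a copy of the input.
// Array.prototype.sort is stable (ES2019+), so equal scores keep their input order.
function sortByFinalScore(evaluations: ScoredEvaluation[]): ScoredEvaluation[] {
  return [...evaluations].sort((a, b) => b.final_score - a.final_score);
}

const ranked = sortByFinalScore([
  { id: "a", final_score: 0.7, utility: { score: 0.1 } },
  { id: "b", final_score: 0.9 },
  { id: "c", final_score: 0.7, utility: { score: 0.99 } },
]);
const rankedIds = ranked.map((e) => e.id);
```

In this sketch `c` carries a much higher utility score than `a`, yet it stays behind `a` because the two tie on `final_score` and `a` came first in the input.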
|
|
59
66
|
|
|
60
67
|
Local demo:
|
|
61
68
|
|
|
@@ -78,7 +85,7 @@ These types describe the stable envelope and adapter result contract.
|
|
|
78
85
|
|
|
79
86
|
## Signal Contract v1
|
|
80
87
|
|
|
81
|
-
Signal Contract v1 is the
|
|
88
|
+
Signal Contract v1 is the current FreshContext input standard: the stable shape for candidate context before it is ranked, wrapped, stored, judged by `evaluate_context`, or passed to an agent workflow.
|
|
82
89
|
|
|
83
90
|
Public exports:
|
|
84
91
|
|
|
@@ -92,6 +99,8 @@ Public exports:
|
|
|
92
99
|
|
|
93
100
|
`published_at` is the canonical signal timestamp. `content_date` is accepted as an adapter/envelope compatibility alias. Normalization clears invalid or meaningfully future-dated timestamps, marks failed/error-looking content as `status: "failed"`, clamps `semantic_score` into `0..1`, and records normalization reasons.
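A rough sketch of the normalization steps named above, under stated assumptions. It is not the Core `normalizeSignal` implementation: the `normalizeSketch` name, the `"ok"` status value, the reason strings, the one-day future tolerance, and the failed-content regex are all hypothetical stand-ins.

```ts
// Hypothetical shapes; field names follow the text, everything else is assumed.
interface RawSignal {
  content: string;
  published_at?: string | null;
  content_date?: string | null; // compatibility alias for published_at
  semantic_score?: number;
}

interface NormalizedSignal {
  content: string;
  published_at: string | null;
  semantic_score: number;
  status: "ok" | "failed";
  reasons: string[];
}

// Assumption: "meaningfully future-dated" means more than one day ahead of `now`.
const FUTURE_TOLERANCE_MS = 24 * 60 * 60 * 1000;

function normalizeSketch(raw: RawSignal, now: Date = new Date()): NormalizedSignal {
  const reasons: string[] = [];

  // Prefer the canonical field, fall back to the alias.
  let published: string | null = raw.published_at ?? raw.content_date ?? null;
  if (published !== null) {
    const parsed = Date.parse(published);
    if (Number.isNaN(parsed)) {
      published = null;
      reasons.push("invalid_timestamp_cleared");
    } else if (parsed - now.getTime() > FUTURE_TOLERANCE_MS) {
      published = null;
      reasons.push("future_timestamp_cleared");
    }
  }

  // Clamp the score into 0..1; non-finite or missing scores become 0.
  const given = raw.semantic_score;
  const rawScore = typeof given === "number" && Number.isFinite(given) ? given : 0;
  const score = Math.min(1, Math.max(0, rawScore));
  if (score !== rawScore) reasons.push("semantic_score_clamped");

  // Assumption: a crude stand-in for failed/error-looking content detection.
  const failed = /^\s*(error|failed)\b/i.test(raw.content);
  if (failed) reasons.push("content_looks_failed");

  return {
    content: raw.content,
    published_at: published,
    semantic_score: score,
    status: failed ? "failed" : "ok",
    reasons,
  };
}

const normalized = normalizeSketch(
  { content: "Error: upstream timeout", content_date: "not-a-date", semantic_score: 1.4 },
  new Date("2026-01-01T00:00:00.000Z"),
);
```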
|
|
94
101
|
|
|
102
|
+
Future context signals and control signals are optional future metadata layers, not replacements for Signal Contract v1 and not required public input fields today.
|
|
103
|
+
|
|
95
104
|
See [Signal Contract v1](./SIGNAL_CONTRACT.md).
|
|
96
105
|
|
|
97
106
|
## Source Profiles
|
|
@@ -152,6 +161,51 @@ FreshContext decisions judge citation readiness, context usefulness, freshness,
|
|
|
152
161
|
|
|
153
162
|
Demo output will be updated separately so presentation stays separate from Core decision logic.
|
|
154
163
|
|
|
164
|
+
## Human-Readable Output
|
|
165
|
+
|
|
166
|
+
The human-readable output helper adds a small reader-facing layer on top of existing Core evaluations and decisions.
|
|
167
|
+
|
|
168
|
+
Public exports:
|
|
169
|
+
|
|
170
|
+
- `toReadableContextResult(evaluation, decision)`
|
|
171
|
+
- `HumanReadableContextResult`
|
|
172
|
+
|
|
173
|
+
The helper returns a bounded object with:
|
|
174
|
+
|
|
175
|
+
- `label`
|
|
176
|
+
- `summary`
|
|
177
|
+
- `why`
|
|
178
|
+
- `action`
|
|
179
|
+
- `warnings`
|
|
180
|
+
|
|
181
|
+
This is additive. It does not change machine decisions, ranking, freshness math, utility math, Source Profiles, envelopes, provenance, or host behavior. Utility may appear in `why` when it is already part of the decision reasons, but utility still does not control default decision labels.
|
|
182
|
+
|
|
183
|
+
Example structured result fragment:
|
|
184
|
+
|
|
185
|
+
```json
|
|
186
|
+
{
|
|
187
|
+
"decision": "cite_as_primary",
|
|
188
|
+
"label": "Cite as primary",
|
|
189
|
+
"readable": {
|
|
190
|
+
"label": "Primary source",
|
|
191
|
+
"summary": "This source is strong enough to use as main evidence.",
|
|
192
|
+
"why": [
|
|
193
|
+
"Strong semantic match and current freshness for arxiv.",
|
|
194
|
+
"source profile academic_research uses lenient date policy",
|
|
195
|
+
"intent profile citation_check selected"
|
|
196
|
+
],
|
|
197
|
+
"action": "Use this as main evidence while preserving citation and provenance.",
|
|
198
|
+
"warnings": [
|
|
199
|
+
"FreshContext judges citation readiness and context usefulness; it does not certify truth."
|
|
200
|
+
]
|
|
201
|
+
}
|
|
202
|
+
}
|
|
203
|
+
```
|
|
204
|
+
|
|
205
|
+
FreshContext does not certify truth. It records why context was used, supported, questioned, refreshed, watched, or excluded before it reaches a model.
|
|
206
|
+
|
|
207
|
+
See [Human-Readable Output Contract](./HUMAN_READABLE_OUTPUT_CONTRACT.md).
|
|
208
|
+
|
|
155
209
|
## Public Ranking Primitives
|
|
156
210
|
|
|
157
211
|
The ranking primitives are public, but consumers should treat their score scales carefully:
|
|
@@ -173,7 +227,7 @@ Ranking combines semantic relevance and freshness into a deterministic order. It
|
|
|
173
227
|
|
|
174
228
|
## Experimental Utility Primitive
|
|
175
229
|
|
|
176
|
-
The context-conditioned utility primitive is pure and tested
|
|
230
|
+
The context-conditioned utility primitive is pure and tested. It is surfaced in Core evaluation output and decision reasons, but it does not control default ranking or decision labels.
|
|
177
231
|
|
|
178
232
|
Experimental exports:
|
|
179
233
|
|
|
@@ -182,7 +236,7 @@ Experimental exports:
|
|
|
182
236
|
- `ContextUtilityInput`
|
|
183
237
|
- `ContextUtilityResult`
|
|
184
238
|
|
|
185
|
-
These are pure Core math. They are now connected inside `evaluateSignal` as sidecar utility output, but they are not production-wired into MCP ranking, Worker feeds, Store scoring, or
|
|
239
|
+
These are pure Core math. They are now connected inside `evaluateSignal` as sidecar utility output and explanatory material, but they are not production-wired into MCP ranking, Worker feeds, Store scoring, or default decision thresholds.
|
|
186
240
|
|
|
187
241
|
## Provenance Helpers
|
|
188
242
|
|
|
@@ -198,6 +252,16 @@ Ha-Pri v2 is available as pure Core helper functionality:
|
|
|
198
252
|
|
|
199
253
|
`evaluateSignal` can optionally prepare Ha-Pri v2 material when `includeProvenance` is set and required input material is present. Core does not persist provenance, add D1 columns, verify rows on read, reject rows, or replace Worker Ha-Pri v1 behavior.
|
|
200
254
|
|
|
255
|
+
Provenance readiness is available as a pure Core sidecar:
|
|
256
|
+
|
|
257
|
+
- `prepareProvenanceReadiness(input, options?)`
|
|
258
|
+
- `ProvenanceReadinessState`
|
|
259
|
+
- `ProvenanceReadinessResult`
|
|
260
|
+
|
|
261
|
+
It classifies caller-provided signal provenance as `complete`, `partial`, `incomplete`, `unknown`, or `derived`. The result exposes source identity completeness, timing completeness, normalized source and timestamp fields, canonical content hash material, optional semantic fingerprint hash material, optional Ha-Pri v2 identity material when the caller supplies enough inputs, warnings, and reasons.
|
|
262
|
+
|
|
263
|
+
`provenance_readiness` is included additively on `evaluateSignal` results and in the structured `evaluate_context` JSON result. It does not fetch, crawl, scrape, read folders, call adapters, change ranking, change decisions, certify truth, or enforce rejection policy.
|
|
264
|
+
|
|
201
265
|
## Internal, Policy, and Compatibility Exports
|
|
202
266
|
|
|
203
267
|
- `clampScore` is an internal ranking helper. It is currently exported for tests and utility use, but it should not be presented as a primary buyer-facing API.
|
package/docs/CORE_MCP_BOUNDARY.md
CHANGED
|
@@ -50,18 +50,27 @@ Worker/site surfaces own deployment concerns:
|
|
|
50
50
|
|
|
51
51
|
Live today:
|
|
52
52
|
|
|
53
|
-
- npm package: `freshcontext-mcp@0.3.
|
|
53
|
+
- npm package: `freshcontext-mcp@0.3.21`
|
|
54
54
|
- MCP stdio server and published binary: `freshcontext-mcp`
|
|
55
|
+
- Core subpath export: `freshcontext-mcp/core`
|
|
55
56
|
- `evaluate_context` MCP tool for caller-provided candidate context
|
|
56
57
|
- 21 named read-only reference adapters
|
|
57
58
|
- Core signal evaluation
|
|
58
59
|
- Source Profiles
|
|
59
|
-
- Decision Helper
|
|
60
|
-
-
|
|
60
|
+
- Decision Helper
|
|
61
|
+
- human-readable `readable` output on structured `evaluate_context` results
|
|
62
|
+
- provenance readiness and readable handoff safety
|
|
63
|
+
- adapter registry metadata
|
|
61
64
|
- arXiv signal-to-decision proof
|
|
62
65
|
- bring-your-own-context local demos
|
|
63
66
|
- Trust Scanner release gate
|
|
64
67
|
|
|
68
|
+
Network boundary:
|
|
69
|
+
|
|
70
|
+
- `evaluate_context` does not fetch, crawl, scrape, browse, read folders, or call adapters.
|
|
71
|
+
- The 21 named reference adapters are optional read-only network tools and use network access only when invoked.
|
|
72
|
+
- FreshContext Core remains the no-network judgment layer.
|
|
73
|
+
|
|
65
74
|
Not live today:
|
|
66
75
|
|
|
67
76
|
- standalone Core npm package
|
|
@@ -78,7 +87,7 @@ Not live today:
|
|
|
78
87
|
The safe split path is staged:
|
|
79
88
|
|
|
80
89
|
1. Keep `freshcontext-mcp` stable for current users.
|
|
81
|
-
2. Maintain Core as a pure
|
|
90
|
+
2. Maintain Core as a pure package subpath export surface.
|
|
82
91
|
3. Audit Core dependencies, Node/browser compatibility, and API stability.
|
|
83
92
|
4. Publish a standalone Core package only after compatibility tests exist.
|
|
84
93
|
5. Make `freshcontext-mcp` depend on the standalone Core package.
|
package/docs/FUTURE_LANES.md
CHANGED
|
@@ -10,9 +10,10 @@ The current package boundary is documented in [Core / MCP Boundary](./CORE_MCP_B
|
|
|
10
10
|
|
|
11
11
|
Live today:
|
|
12
12
|
|
|
13
|
-
- npm package: `freshcontext-mcp@0.3.
|
|
13
|
+
- npm package: `freshcontext-mcp@0.3.21`
|
|
14
14
|
- MCP stdio server
|
|
15
15
|
- `evaluate_context` MCP tool for caller-provided candidate context
|
|
16
|
+
- Signal Contract v1 as the stable candidate-context input shape
|
|
16
17
|
- 21 read-only reference adapters
|
|
17
18
|
- Core signal evaluation
|
|
18
19
|
- Source Profiles
|
|
@@ -32,6 +33,26 @@ Not live today:
|
|
|
32
33
|
- standalone Core SDK package
|
|
33
34
|
- full adapter ingestion
|
|
34
35
|
|
|
36
|
+
## Phase 0: Stabilize The Signal Contract
|
|
37
|
+
|
|
38
|
+
Goal:
|
|
39
|
+
|
|
40
|
+
```text
|
|
41
|
+
Treat Signal Contract v1 as the stable input boundary for FreshContext.
|
|
42
|
+
```
|
|
43
|
+
|
|
44
|
+
Current contract:
|
|
45
|
+
|
|
46
|
+
```text
|
|
47
|
+
title + content + source + source_type + published_at + retrieved_at + semantic_score
|
|
48
|
+
```
|
|
49
|
+
|
|
50
|
+
This is live today. It is not the same thing as future context signals or control signals.
|
|
51
|
+
|
|
52
|
+
Tasks in this lane should document examples, invalid-input behavior, and normalization expectations. Do not expand required fields unless tests prove the new metadata improves decisions.
|
|
53
|
+
|
|
54
|
+
Future context signals, control signals, ingestion quality signals, structure preservation signals, and provenance confidence signals belong to later Decision Layer upgrades. They should remain optional metadata, not public required fields.
|
|
55
|
+
|
|
35
56
|
## Lane 1: Client Setup Reliability
|
|
36
57
|
|
|
37
58
|
Goal:
|
|
@@ -121,9 +142,11 @@ Goal:
|
|
|
121
142
|
Make decisions more useful without silently changing ranking.
|
|
122
143
|
```
|
|
123
144
|
|
|
124
|
-
Possible inputs include context utility, control signal, future context signal, confidence tiers, and source-profile-specific thresholds.
|
|
145
|
+
Possible inputs include context utility, control signal, future context signal, ingestion quality, structure preservation, provenance confidence, confidence tiers, and source-profile-specific thresholds.
|
|
146
|
+
|
|
147
|
+
These are optional future metadata upgrades on top of Signal Contract v1. They should only be exposed when they make decisions clearer without making the caller-facing contract harder to use.
|
|
125
148
|
|
|
126
|
-
Do not make `utility.score` affect ranking by default without a dedicated
|
|
149
|
+
Do not make `utility.score` affect ranking or decision labels by default without a dedicated policy pass.
|
|
127
150
|
|
|
128
151
|
## Lane 7: Ha-Pri v2 Production Path
|
|
129
152
|
|
package/docs/HA_PRI_V2_DESIGN.md
CHANGED
|
@@ -10,7 +10,13 @@ Ha-Pri v2 is an additive provenance-hardening model for FreshContext Store/Ledge
|
|
|
10
10
|
|
|
11
11
|
The goal is to keep Ha-Pri v1 readable while designing a stronger future signature that binds a row to canonical content, semantic identity, source metadata, timestamps, and engine version.
|
|
12
12
|
|
|
13
|
-
Phase 3-B adds pure Core helper functions and deterministic tests for the v2 model. Phase 3-C adds `examples/ha-pri-v2-example.ts`, a deterministic developer fixture showing `calculateHaPriV2` and `verifyHaPriV2` returning valid, invalid, and unknown verification states. Production Store wiring remains future work. This document does not change the D1 schema, change Worker write paths, migrate old rows, add HMAC secrets, or alter production scoring.
|
|
13
|
+
Phase 3-B adds pure Core helper functions and deterministic tests for the v2 model. Phase 3-C adds `examples/ha-pri-v2-example.ts`, a deterministic developer fixture showing `calculateHaPriV2` and `verifyHaPriV2` returning valid, invalid, and unknown verification states. Production Store wiring remains future work. This document does not change the D1 schema, change Worker write paths, migrate old rows, add HMAC secrets, or alter production scoring.
|
|
14
|
+
|
|
15
|
+
Pass 11-J adds golden test vectors for the pure Core helpers. Ha-Pri v2 golden vectors prove deterministic Core provenance behavior: canonicalization, SHA-256 hashes, signing payload construction, signature generation, and verification status are stable and repeatable. They do not mean Ha-Pri v2 is production-enforced on Worker/D1 reads.
|
|
16
|
+
|
|
17
|
+
Plain SHA-256 provides deterministic integrity and audit checks. HMAC or private-key signing would be needed later for stronger origin-authentication guarantees.
|
|
18
|
+
|
|
19
|
+
Pass 11-K adds a design-only production enforcement plan in `docs/HA_PRI_V2_PRODUCTION_ENFORCEMENT_PLAN.md`. That plan covers the future D1/storage, write-path, read/debug verification, compatibility, backfill, threat model, and rollout path. It does not implement Worker/D1 enforcement.
|
|
14
20
|
|
|
15
21
|
## Current Ha-Pri v1 Audit
|
|
16
22
|
|
package/docs/HA_PRI_V2_PRODUCTION_ENFORCEMENT_PLAN.md
|
@@ -0,0 +1,414 @@
|
|
|
1
|
+
# Ha-Pri v2 Production Enforcement Plan
|
|
2
|
+
|
|
3
|
+
Status: design only
|
|
4
|
+
Phase: Pass 11-K
|
|
5
|
+
Runtime impact: none
|
|
6
|
+
|
|
7
|
+
Ha-Pri v2 production enforcement is a future rollout path. Current FreshContext releases only include the Core helper and deterministic golden vectors unless a later implementation pass explicitly wires Worker/D1 enforcement.
|
|
8
|
+
|
|
9
|
+
This plan describes how Ha-Pri v2 could move from pure Core provenance helper to production Store/Worker verification without overclaiming current behavior.
|
|
10
|
+
|
|
11
|
+
## 1. Current State
|
|
12
|
+
|
|
13
|
+
Current FreshContext behavior:
|
|
14
|
+
|
|
15
|
+
- Ha-Pri v1 is the current Worker/feed audit stamp.
|
|
16
|
+
- Ha-Pri v1 is stored as `scrape_results.ha_pri_sig`.
|
|
17
|
+
- Ha-Pri v1 is returned in Worker feed `intelligence_stamps`.
|
|
18
|
+
- Ha-Pri v1 is a provenance stamp and audit reference, not hard row rejection.
|
|
19
|
+
- Ha-Pri v2 exists as a pure Core helper in `src/core/provenance.ts`.
|
|
20
|
+
- Ha-Pri v2 has deterministic golden vectors in `tests/fixtures/ha-pri-v2-golden-vectors.json`.
|
|
21
|
+
- Ha-Pri v2 is not production-enforced on Worker/D1 reads.
|
|
22
|
+
- Ha-Pri v2 does not currently reject rows.
|
|
23
|
+
- Ha-Pri v2 does not currently provide private-key origin authentication.
|
|
24
|
+
|
|
25
|
+
The current v2 helper provides deterministic canonicalization, SHA-256 hashing, signing payload construction, signature calculation, and verification status:
|
|
26
|
+
|
|
27
|
+
```txt
|
|
28
|
+
valid
|
|
29
|
+
invalid
|
|
30
|
+
unknown
|
|
31
|
+
```
|
|
32
|
+
|
|
33
|
+
That is real Core behavior. It is not yet Worker/D1 enforcement.
|
|
34
|
+
|
|
35
|
+
## 2. Target Future State
|
|
36
|
+
|
|
37
|
+
Future production enforcement should make stored context rows independently reviewable after write.
|
|
38
|
+
|
|
39
|
+
Target behavior:
|
|
40
|
+
|
|
41
|
+
- Worker write path stores Ha-Pri v2 provenance material for new rows.
|
|
42
|
+
- Rows can later be verified as `valid`, `invalid`, or `unknown`.
|
|
43
|
+
- Debug/internal read paths can report verification status.
|
|
44
|
+
- Safe public read paths can expose limited verification status without leaking internals.
|
|
45
|
+
- Invalid rows are not silently treated as trusted.
|
|
46
|
+
- Old rows remain compatible through `unknown` and optional backfill behavior.
|
|
47
|
+
- Strict rejection remains optional and staged, not the first rollout.
|
|
48
|
+
|
|
49
|
+
The target is not "FreshContext proves truth." The target is "FreshContext can detect whether stored provenance material still matches the stored row material under the documented v2 contract."
|
|
50
|
+
|
|
51
|
+
## 3. Proposed D1 / Storage Fields
|
|
52
|
+
|
|
53
|
+
Essential fields:
|
|
54
|
+
|
|
55
|
+
```txt
|
|
56
|
+
ha_pri_sig_v2 TEXT
|
|
57
|
+
ha_pri_canonical_content_sha256 TEXT
|
|
58
|
+
ha_pri_semantic_fingerprint_sha256 TEXT
|
|
59
|
+
ha_pri_signing_payload_version TEXT
|
|
60
|
+
ha_pri_engine_version TEXT
|
|
61
|
+
```
|
|
62
|
+
|
|
63
|
+
Recommended operational fields:
|
|
64
|
+
|
|
65
|
+
```txt
|
|
66
|
+
ha_pri_verification_status TEXT
|
|
67
|
+
ha_pri_verified_at TEXT
|
|
68
|
+
ha_pri_backfill_status TEXT
|
|
69
|
+
```
|
|
70
|
+
|
|
71
|
+
Likely unnecessary as separate stored fields:
|
|
72
|
+
|
|
73
|
+
```txt
|
|
74
|
+
ha_pri_adapter
|
|
75
|
+
ha_pri_published_at
|
|
76
|
+
ha_pri_retrieved_at
|
|
77
|
+
```
|
|
78
|
+
|
|
79
|
+
Reason: adapter, published timestamp, and retrieved/scraped timestamp already exist or should exist as first-class row fields. Duplicating them inside Ha-Pri-specific columns risks drift. The signing payload should read those canonical row fields directly during verification.
|
|
80
|
+
|
|
81
|
+
Minimum practical schema:
|
|
82
|
+
|
|
83
|
+
```sql
|
|
84
|
+
ALTER TABLE scrape_results ADD COLUMN ha_pri_sig_v2 TEXT;
|
|
85
|
+
ALTER TABLE scrape_results ADD COLUMN ha_pri_canonical_content_sha256 TEXT;
|
|
86
|
+
ALTER TABLE scrape_results ADD COLUMN ha_pri_semantic_fingerprint_sha256 TEXT;
|
|
87
|
+
ALTER TABLE scrape_results ADD COLUMN ha_pri_signing_payload_version TEXT;
|
|
88
|
+
ALTER TABLE scrape_results ADD COLUMN ha_pri_engine_version TEXT;
|
|
89
|
+
ALTER TABLE scrape_results ADD COLUMN ha_pri_verification_status TEXT;
|
|
90
|
+
ALTER TABLE scrape_results ADD COLUMN ha_pri_verified_at TEXT;
|
|
91
|
+
ALTER TABLE scrape_results ADD COLUMN ha_pri_backfill_status TEXT;
|
|
92
|
+
```
|
|
93
|
+
|
|
94
|
+
Storage status values should be boring and explicit:
|
|
95
|
+
|
|
96
|
+
```txt
|
|
97
|
+
valid
|
|
98
|
+
invalid
|
|
99
|
+
unknown
|
|
100
|
+
not_checked
|
|
101
|
+
```
|
|
102
|
+
|
|
103
|
+
Backfill status values should avoid pretending old rows were originally v2-stamped:
|
|
104
|
+
|
|
105
|
+
```txt
|
|
106
|
+
none
|
|
107
|
+
backfilled
|
|
108
|
+
unknown_origin
|
|
109
|
+
failed
|
|
110
|
+
```
|
|
111
|
+
|
|
112
|
+
## 4. Write-Path Design
|
|
113
|
+
|
|
114
|
+
The future Worker write path should dual-stamp v1 and v2 for new rows.
|
|
115
|
+
|
|
116
|
+
Recommended write sequence:
|
|
117
|
+
|
|
118
|
+
1. Adapter returns raw candidate content.
|
|
119
|
+
2. Existing Worker scoring computes current DAR fields and Ha-Pri v1.
|
|
120
|
+
3. Write path prepares Ha-Pri v2 input:
|
|
121
|
+
- `resultId`: row id that will be stored
|
|
122
|
+
- `rawContent`: canonical row raw content
|
|
123
|
+
- `semanticFingerprint`: semantic fingerprint material or stored fingerprint
|
|
124
|
+
- `adapter`: adapter id
|
|
125
|
+
- `publishedAt`: normalized source publication timestamp or `null`
|
|
126
|
+
- `retrievedAt`: normalized scrape/retrieval timestamp
|
|
127
|
+
- `engineVersion`: FreshContext engine/package version or explicit Worker engine version
|
|
128
|
+
4. If required v2 material is complete, calculate:
|
|
129
|
+
- canonical content SHA-256
|
|
130
|
+
- semantic fingerprint SHA-256
|
|
131
|
+
- signing payload version
|
|
132
|
+
- Ha-Pri v2 signature
|
|
133
|
+
5. Store v1 fields as today.
|
|
134
|
+
6. Store v2 fields alongside v1 fields.
|
|
135
|
+
7. If material is incomplete, store unknown-compatible metadata and do not pretend the row is valid.
|
|
136
|
+
|
|
137
|
+
Recommended incomplete-material behavior:
|
|
138
|
+
|
|
139
|
+
```txt
|
|
140
|
+
ha_pri_sig_v2 = null
|
|
141
|
+
ha_pri_verification_status = "unknown"
|
|
142
|
+
ha_pri_backfill_status = "none"
|
|
143
|
+
```
|
|
144
|
+
|
|
145
|
+
Canonical content should be produced by the same pure helper behavior used in Core golden vectors. Do not invent a second Worker-only canonicalization contract.
|
|
146
|
+
|
|
147
|
+
Semantic fingerprint should be produced before signing and should be stable across retries for the same underlying source item. If the fingerprint is missing, v2 signing should fall back to `unknown`, not a fake valid signature.
|
|
148
|
+
|
|
149
|
+
Engine version should be explicit. The safest initial choice is the package/server version used by the running Worker build.
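The write sequence above might look roughly like this. It is a sketch under stated assumptions, not the Worker or Core code: `prepareV2Columns`, `canonicalizeSketch`, the JSON-array signing payload, and the `not_checked` status for freshly stamped rows are hypothetical. A real write path would reuse the Core canonicalization and signing helpers rather than these stand-ins.

```ts
import { createHash } from "node:crypto";

// Hypothetical input shape mirroring the fields listed in the write sequence.
interface HaPriV2Input {
  resultId: string;
  rawContent: string | null;
  semanticFingerprint: string | null;
  adapter: string;
  publishedAt: string | null; // may legitimately be null
  retrievedAt: string | null;
  engineVersion: string | null;
}

// Subset of the proposed columns, enough to show both outcomes.
interface V2Columns {
  ha_pri_sig_v2: string | null;
  ha_pri_canonical_content_sha256: string | null;
  ha_pri_semantic_fingerprint_sha256: string | null;
  ha_pri_signing_payload_version: string | null;
  ha_pri_verification_status: "not_checked" | "unknown";
  ha_pri_backfill_status: "none";
}

const sha256Hex = (text: string): string =>
  createHash("sha256").update(text, "utf8").digest("hex");

// Assumption: whitespace collapsing stands in for the real Core canonicalization.
const canonicalizeSketch = (text: string): string => text.replace(/\s+/g, " ").trim();

function prepareV2Columns(input: HaPriV2Input): V2Columns {
  const { rawContent, semanticFingerprint, retrievedAt, engineVersion } = input;
  if (!rawContent || !semanticFingerprint || !retrievedAt || !engineVersion) {
    // Incomplete material: store unknown-compatible metadata, never a fake signature.
    return {
      ha_pri_sig_v2: null,
      ha_pri_canonical_content_sha256: null,
      ha_pri_semantic_fingerprint_sha256: null,
      ha_pri_signing_payload_version: null,
      ha_pri_verification_status: "unknown",
      ha_pri_backfill_status: "none",
    };
  }
  const payloadVersion = "FRESHCONTEXT_HA_PRI_V2";
  const contentHash = sha256Hex(canonicalizeSketch(rawContent));
  const fingerprintHash = sha256Hex(semanticFingerprint);
  // Assumption: a JSON array of the bound fields stands in for the real signing payload.
  const payload = JSON.stringify([
    payloadVersion, input.resultId, contentHash, fingerprintHash,
    input.adapter, input.publishedAt, retrievedAt, engineVersion,
  ]);
  return {
    ha_pri_sig_v2: sha256Hex(payload),
    ha_pri_canonical_content_sha256: contentHash,
    ha_pri_semantic_fingerprint_sha256: fingerprintHash,
    ha_pri_signing_payload_version: payloadVersion,
    ha_pri_verification_status: "not_checked",
    ha_pri_backfill_status: "none",
  };
}

const completeInput: HaPriV2Input = {
  resultId: "row-1",
  rawContent: "Example   content\n",
  semanticFingerprint: "fp-abc",
  adapter: "arxiv",
  publishedAt: null,
  retrievedAt: "2026-01-01T00:00:00.000Z",
  engineVersion: "0.3.21",
};
const stamped = prepareV2Columns(completeInput);
const unstamped = prepareV2Columns({ ...completeInput, semanticFingerprint: null });
```

Here a missing semantic fingerprint yields a null signature and an `unknown` status, matching the fallback the section recommends.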
|
|
150
|
+
|
|
151
|
+
## 5. Read / Debug Verification Design
|
|
152
|
+
|
|
153
|
+
Future read verification should be a pure recomputation:
|
|
154
|
+
|
|
155
|
+
```txt
|
|
156
|
+
verifyHaPriV2(row) -> valid | invalid | unknown
|
|
157
|
+
```
|
|
158
|
+
|
|
159
|
+
Suggested internal verification input:
|
|
160
|
+
|
|
161
|
+
```ts
|
|
162
|
+
{
|
|
163
|
+
resultId: row.id,
|
|
164
|
+
rawContent: row.raw_content,
|
|
165
|
+
semanticFingerprint: row.semantic_fingerprint,
|
|
166
|
+
adapter: row.adapter,
|
|
167
|
+
publishedAt: row.published_at,
|
|
168
|
+
retrievedAt: row.scraped_at,
|
|
169
|
+
engineVersion: row.ha_pri_engine_version
|
|
170
|
+
}
|
|
171
|
+
```
|
|
172
|
+
|
|
173
|
+
Debug output may include:
|
|
174
|
+
|
|
175
|
+
```json
|
|
176
|
+
{
|
|
177
|
+
"ha_pri_v2": {
|
|
178
|
+
"status": "valid",
|
|
179
|
+
"checked_at": "2026-06-11T12:00:00.000Z",
|
|
180
|
+
"payload_version": "FRESHCONTEXT_HA_PRI_V2",
|
|
181
|
+
"canonical_content_sha256": "sha256...",
|
|
182
|
+
"semantic_fingerprint_sha256": "sha256..."
|
|
183
|
+
}
|
|
184
|
+
}
|
|
185
|
+
```
|
|
186
|
+
|
|
187
|
+
Safe public output should be smaller:
|
|
188
|
+
|
|
189
|
+
```json
|
|
190
|
+
{
|
|
191
|
+
"provenance": {
|
|
192
|
+
"ha_pri_v2_status": "valid"
|
|
193
|
+
}
|
|
194
|
+
}
|
|
195
|
+
```
|
|
196
|
+
|
|
197
|
+
Do not expose signing payloads or debug hashes in broad public outputs unless there is a clear user need.
|
|
198
|
+
|
|
199
|
+
Suggested staged behavior:
|
|
200
|
+
|
|
201
|
+
- Phase 1: report-only verification in internal/debug output.
|
|
202
|
+
- Phase 2: warning in read/debug path when invalid.
|
|
203
|
+
- Phase 3: optional strict mode for private deployments.
|
|
204
|
+
- Phase 4: possible reject/block policy only after replay data and operational evidence.
|
|
205
|
+
|
|
206
|
+
Invalid should not become automatic rejection first. A migration bug, canonicalization mismatch, or schema rollout issue could otherwise hide useful rows during rollout.
|
|
207
|
+
|
|
208
|
+
## 6. Compatibility And Backfill
|
|
209
|
+
|
|
210
|
+
Old rows must remain readable.
|
|
211
|
+
|
|
212
|
+
Compatibility rules:
|
|
213
|
+
|
|
214
|
+
- v1-only rows verify as `unknown` for v2.
|
|
215
|
+
- Rows with missing v2 fields verify as `unknown`.
|
|
216
|
+
- Rows with malformed v2 signatures verify as `invalid`.
|
|
217
|
+
- Rows with present but mismatched v2 signatures verify as `invalid`.
|
|
218
|
+
- Missing `ha_pri_sig_v2` is not the same as tampering.
|
|
219
|
+
- Existing `ha_pri_sig` remains readable for historical continuity.
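The compatibility rules above can be sketched as a pure status function. This is illustrative only and is not `verifyHaPriV2`: the `StoredRow` shape, the stand-in signature, the choice to treat a blank signature like a missing one, and the order of the checks are all assumptions.

```ts
import { createHash } from "node:crypto";

type V2Status = "valid" | "invalid" | "unknown";

// Hypothetical row shape; column names follow the plan's verification input.
interface StoredRow {
  id: string;
  raw_content: string | null;
  semantic_fingerprint: string | null;
  adapter: string;
  published_at: string | null;
  scraped_at: string | null;
  ha_pri_engine_version: string | null;
  ha_pri_sig_v2: string | null;
}

const sha256Hex = (text: string): string =>
  createHash("sha256").update(text, "utf8").digest("hex");

// Assumption: stand-in signature over a JSON array of bound fields, not the real v2 payload.
function expectedSignature(row: StoredRow): string | null {
  if (!row.raw_content || !row.semantic_fingerprint || !row.scraped_at || !row.ha_pri_engine_version) {
    return null; // required material missing
  }
  return sha256Hex(JSON.stringify([
    row.id, sha256Hex(row.raw_content), sha256Hex(row.semantic_fingerprint),
    row.adapter, row.published_at, row.scraped_at, row.ha_pri_engine_version,
  ]));
}

function verifyRowSketch(row: StoredRow): V2Status {
  // v1-only rows: a missing (or, by assumption, blank) signature is not tampering.
  if (!row.ha_pri_sig_v2) return "unknown";
  const expected = expectedSignature(row);
  if (expected === null) return "unknown"; // missing v2 fields
  if (!/^[0-9a-f]{64}$/.test(row.ha_pri_sig_v2)) return "invalid"; // malformed
  return row.ha_pri_sig_v2 === expected ? "valid" : "invalid"; // mismatch is invalid
}

const baseRow: StoredRow = {
  id: "row-1",
  raw_content: "Example content",
  semantic_fingerprint: "fp-abc",
  adapter: "arxiv",
  published_at: "2025-12-31T00:00:00.000Z",
  scraped_at: "2026-01-01T00:00:00.000Z",
  ha_pri_engine_version: "0.3.21",
  ha_pri_sig_v2: null,
};
const signedRow: StoredRow = { ...baseRow, ha_pri_sig_v2: expectedSignature(baseRow) };

const statuses = {
  v1Only: verifyRowSketch(baseRow),
  intact: verifyRowSketch(signedRow),
  tampered: verifyRowSketch({ ...signedRow, raw_content: "Edited content" }),
  malformed: verifyRowSketch({ ...signedRow, ha_pri_sig_v2: "zz" }),
};
```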
|
|
220
|
+
|
|
221
|
+
Backfill rules:
|
|
222
|
+
|
|
223
|
+
- Backfilled provenance must be marked as `backfilled` or `unknown_origin`.
|
|
224
|
+
- Backfill must not imply the row was v2-stamped at original write time.
|
|
225
|
+
- Backfill should preserve original row timestamps.
|
|
226
|
+
- Backfill should record when verification/backfill happened.
|
|
227
|
+
- Backfill should be reversible or repeatable where practical.
|
|
228
|
+
|
|
229
|
+
Possible backfill process:
|
|
230
|
+
|
|
231
|
+
1. Select rows missing `ha_pri_sig_v2`.
|
|
232
|
+
2. Reconstruct v2 input from stored row fields.
|
|
233
|
+
3. If required material is complete, calculate v2 fields.
|
|
234
|
+
4. Store v2 fields with `ha_pri_backfill_status = "backfilled"`.
|
|
235
|
+
5. If required material is incomplete, store `ha_pri_verification_status = "unknown"` and `ha_pri_backfill_status = "unknown_origin"`.
|
|
236
|
+
6. Report counts for backfilled, unknown, invalid, and failed rows.
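A minimal in-memory sketch of the backfill steps above, assuming the v2 calculation is injected as a callback. `backfillSketch`, `BackfillRow`, and the use of `ha_pri_verified_at` to record when the pass ran are hypothetical. The sketch does not touch D1, and it omits the `invalid` count because it only stamps rows and never re-verifies existing signatures.

```ts
// Hypothetical row shape; column names follow the proposed storage fields.
interface BackfillRow {
  id: string;
  ha_pri_sig_v2: string | null;
  ha_pri_verification_status?: string;
  ha_pri_backfill_status?: string;
  ha_pri_verified_at?: string;
}

interface BackfillReport {
  backfilled: number;
  unknown: number;
  failed: number;
}

// `calculate` stands in for the real v2 calculation: it returns a signature,
// or null when the required material is incomplete.
function backfillSketch(
  rows: BackfillRow[],
  calculate: (row: BackfillRow) => string | null,
  now: string,
): BackfillReport {
  const report: BackfillReport = { backfilled: 0, unknown: 0, failed: 0 };
  for (const row of rows) {
    if (row.ha_pri_sig_v2 !== null) continue; // already stamped: repeat runs skip it
    try {
      const signature = calculate(row);
      if (signature === null) {
        row.ha_pri_verification_status = "unknown";
        row.ha_pri_backfill_status = "unknown_origin";
        report.unknown += 1;
      } else {
        row.ha_pri_sig_v2 = signature;
        // Marked as backfilled so the row never claims an original v2 stamp.
        row.ha_pri_backfill_status = "backfilled";
        report.backfilled += 1;
      }
      row.ha_pri_verified_at = now; // when the backfill pass touched the row
    } catch {
      row.ha_pri_backfill_status = "failed";
      report.failed += 1;
    }
  }
  return report;
}

const backfillRows: BackfillRow[] = [
  { id: "stamped", ha_pri_sig_v2: "existing-signature" },
  { id: "complete", ha_pri_sig_v2: null },
  { id: "incomplete", ha_pri_sig_v2: null },
  { id: "broken", ha_pri_sig_v2: null },
];

const backfillReport = backfillSketch(
  backfillRows,
  (row) => {
    if (row.id === "broken") throw new Error("simulated calculation failure");
    return row.id === "complete" ? "sketch-signature" : null;
  },
  "2026-01-01T00:00:00.000Z",
);
```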
|
|
237
|
+
|
|
238
|
+
## 7. Security Boundary
|
|
239
|
+
|
|
240
|
+
Plain SHA-256 gives deterministic integrity and audit checks.
|
|
241
|
+
|
|
242
|
+
Plain SHA-256 does not prove private origin authentication. Anyone with all payload fields can recompute a plain SHA-256 signature.
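The difference can be illustrated with Node's built-in `crypto` module. This is a small demonstration, not part of the package: the payload contents and both key strings are placeholders, and a real deployment would keep any secret outside the repo.

```ts
import { createHash, createHmac } from "node:crypto";

// Placeholder payload; the real signing payload is defined by the v2 contract.
const demoPayload = JSON.stringify([
  "FRESHCONTEXT_HA_PRI_V2", "row-1", "arxiv", "2026-01-01T00:00:00.000Z",
]);

// Plain SHA-256: anyone holding the payload fields derives the same digest.
const writerDigest = createHash("sha256").update(demoPayload).digest("hex");
const outsiderDigest = createHash("sha256").update(demoPayload).digest("hex");

// HMAC-SHA-256: the tag also depends on a secret key, so a party without
// the key cannot reproduce it from the payload alone.
const trustedTag = createHmac("sha256", "placeholder-deployment-secret")
  .update(demoPayload)
  .digest("hex");
const outsiderTag = createHmac("sha256", "outsider-guess")
  .update(demoPayload)
  .digest("hex");

const plainIsRecomputable = writerDigest === outsiderDigest;
const hmacDependsOnKey = trustedTag !== outsiderTag;
```

Both comparisons hold: the plain digests match because nothing secret enters the hash, while the HMAC tags differ because the keys differ.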
|
|
243
|
+
|
|
244
|
+
Ha-Pri v2 helps detect:
|
|
245
|
+
|
|
246
|
+
- accidental row corruption
|
|
247
|
+
- changed content after write
|
|
248
|
+
- changed semantic fingerprint material
|
|
249
|
+
- changed adapter/timestamp/version fields included in the signing payload
|
|
250
|
+
- malformed stored signatures
|
|
251
|
+
|
|
252
|
+
Ha-Pri v2 does not solve:
|
|
253
|
+
|
|
254
|
+
- truth certification
|
|
255
|
+
- legal, medical, tax, employment, academic, or investment correctness
|
|
256
|
+
- private origin authentication without a secret or private key
|
|
257
|
+
- compromise of the write path before signing
|
|
258
|
+
- compromise of all row fields plus signature under plain SHA-256
|
|
259
|
+
|
|
260
|
+
Recommendation:
|
|
261
|
+
|
|
262
|
+
Do not add HMAC/private signing immediately to the open package. Keep the open package deterministic and stateless.
|
|
263
|
+
|
|
264
|
+
Consider HMAC or private-key signing later for:
|
|
265
|
+
|
|
266
|
+
- hosted FreshContext endpoints
|
|
267
|
+
- private production deployments
|
|
268
|
+
- paid/tenant-specific infrastructure
|
|
269
|
+
- environments where the verifier must know the row was stamped by a trusted FreshContext deployment
|
|
270
|
+
|
|
271
|
+
If HMAC/private signing is added later, it requires:
|
|
272
|
+
|
|
273
|
+
- secret storage outside the repo
|
|
274
|
+
- key ids
|
|
275
|
+
- key rotation
|
|
276
|
+
- old-key verification policy
|
|
277
|
+
- signer/verifier boundary documentation
|
|
278
|
+
- tests proving secrets are never logged or returned
|
|
279
|
+
|
|
280
|
+
## 8. Threat Model
|
|
281
|
+
|
|
282
|
+
Threats considered:
|
|
283
|
+
|
|
284
|
+
### Accidental row corruption
|
|
285
|
+
|
|
286
|
+
Ha-Pri v2 helps by recomputing the expected signature and surfacing `invalid`.
|
|
287
|
+
|
|
288
|
+
### Stale or partial provenance
|
|
289
|
+
|
|
290
|
+
Ha-Pri v2 helps by returning `unknown` when required material is missing. The system should not pretend such rows are valid.
|
|
291
|
+
|
|
292
|
+
### Tampered D1 rows
|
|
293
|
+
|
|
294
|
+
Ha-Pri v2 helps if an attacker changes stored content or bound fields without also updating all matching v2 fields.
|
|
295
|
+
|
|
296
|
+
Plain SHA-256 does not help if an attacker can rewrite all row fields and recompute the public signature.
|
|
297
|
+
|
|
298
|
+
### Malformed signatures
|
|
299
|
+
|
|
300
|
+
Malformed, blank, or nonmatching signatures should produce `invalid` or `unknown` according to current helper behavior. They should not crash reads.
|
|
301
|
+
|
|
302
|
+
### Recomputed public SHA-256 signatures
|
|
303
|
+
|
|
304
|
+
Because v2 currently uses plain SHA-256, a party with all fields can recompute a matching signature. HMAC/private signing is the later answer if origin authentication becomes necessary.
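The difference can be illustrated with Node's `node:crypto`. This is a sketch only: the payload string and both secrets are made-up values, and no HMAC signing exists in the package today.

```typescript
import { createHash, createHmac } from "node:crypto";

// Plain SHA-256: anyone who holds the payload can produce the same digest.
function publicSignature(payload: string): string {
  return createHash("sha256").update(payload).digest("hex");
}

// HMAC-SHA-256: the digest also depends on a secret held by the signer.
function keyedSignature(payload: string, secret: string): string {
  return createHmac("sha256", secret).update(payload).digest("hex");
}

// A party that rewrites every field can simply recompute the public signature,
// so the verifier's recomputation matches the forged value.
const forgedPayload = "rewritten row fields"; // made-up example value
const forgedPublic = publicSignature(forgedPayload);
const publicCheckPasses = forgedPublic === publicSignature(forgedPayload);

// Without the deployment secret, the same party cannot match the keyed signature.
const deploymentSecret = "example-only-secret"; // made-up; real secrets live in deployment config
const attackerAttempt = keyedSignature(forgedPayload, "attacker-guess");
const keyedCheckPasses = attackerAttempt === keyedSignature(forgedPayload, deploymentSecret);
```

Here `publicCheckPasses` is `true` and `keyedCheckPasses` is `false`, which is the gap between integrity checking and origin authentication.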
|
|
305
|
+
|
|
306
|
+
### Debug endpoint leakage
|
|
307
|
+
|
|
308
|
+
Debug routes should remain authenticated. Public outputs should avoid exposing full signing payloads or internal row material unless deliberately needed.
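One way to keep the two surfaces apart is to derive both views from a single internal result, as in the sketch below. Every name here (`InternalVerification`, `toPublicView`, `toDebugView`) and the idea of exposing only short signature prefixes on the debug route are assumptions for illustration, not existing package behavior.

```typescript
type VerifyStatus = "valid" | "invalid" | "unknown";

// Hypothetical internal verification result; field names are illustrative.
interface InternalVerification {
  rowId: string;
  status: VerifyStatus;
  signingPayload: string; // full canonical payload, internal only
  expectedSignature: string; // recomputed digest, internal only
  storedSignature: string | null;
}

interface PublicVerification {
  status: VerifyStatus;
}

interface DebugVerification extends PublicVerification {
  rowId: string;
  // Short prefixes help correlate rows without echoing full digests.
  expectedPrefix: string;
  storedPrefix: string | null;
}

// Public outputs carry the status and nothing else.
function toPublicView(v: InternalVerification): PublicVerification {
  return { status: v.status };
}

// Debug outputs require an authenticated caller and still omit the payload.
function toDebugView(v: InternalVerification, authenticated: boolean): DebugVerification {
  if (!authenticated) throw new Error("debug view requires authentication");
  return {
    status: v.status,
    rowId: v.rowId,
    expectedPrefix: v.expectedSignature.slice(0, 8),
    storedPrefix: v.storedSignature === null ? null : v.storedSignature.slice(0, 8),
  };
}
```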
|
|
309
|
+
|
|
310
|
+
### Secret exposure if HMAC is added later
|
|
311
|
+
|
|
312
|
+
HMAC/private signing introduces secret-management risk. Secrets must live in deployment configuration, never in docs, fixtures, npm package output, or client-visible responses.
|
|
313
|
+
|
|
314
|
+
## 9. Tests Needed Before Implementation
|
|
315
|
+
|
|
316
|
+
Future implementation should add tests before production rollout:
|
|
317
|
+
|
|
318
|
+
- D1 migration tests if the migration harness supports them.
|
|
319
|
+
- Write-path stamping tests for new rows.
|
|
320
|
+
- Dual-stamp tests proving v1 remains unchanged.
|
|
321
|
+
- Read-path verification tests for `valid`, `invalid`, and `unknown`.
|
|
322
|
+
- Old-row tests proving missing v2 fields are `unknown`, not `invalid`.
|
|
323
|
+
- Tampered-row tests for changed content, semantic fingerprint, adapter, timestamps, and engine version.
|
|
324
|
+
- Malformed signature tests.
|
|
325
|
+
- Debug output safety tests.
|
|
326
|
+
- Public output minimization tests.
|
|
327
|
+
- Backfill tests with complete and incomplete rows.
|
|
328
|
+
- Worker dry-run validation.
|
|
329
|
+
- HMAC/private signing tests, if that work lands later.
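The tampered-row item in the list above lends itself to a table-driven check. The sketch below is one possible shape, with the verifier passed in as a parameter; `tamperCases`, `undetectedTampering`, and the flat `Row` type are hypothetical names, and the real bound-field list would come from the Core helper.

```typescript
type VerifyStatus = "valid" | "invalid" | "unknown";
type Row = Record<string, string>;

// Build one mutated copy per bound field, leaving the stored signature untouched.
function tamperCases(row: Row, boundFields: string[]): Array<{ field: string; row: Row }> {
  return boundFields.map((field) => ({
    field,
    row: { ...row, [field]: `${row[field]}-tampered` },
  }));
}

// Returns the fields whose tampering was NOT reported as "invalid".
// An empty result means every bound field is covered by the signature.
function undetectedTampering(
  row: Row,
  boundFields: string[],
  verify: (row: Row) => VerifyStatus,
): string[] {
  return tamperCases(row, boundFields)
    .filter((testCase) => verify(testCase.row) !== "invalid")
    .map((testCase) => testCase.field);
}
```

A test would stamp a known-good row, call `undetectedTampering` with the fields that are supposed to be bound, and assert that the result is empty. A non-empty result names exactly the fields the signature fails to protect.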
|
|
330
|
+
|
|
331
|
+
Do not add fake production-enforcement tests before implementation exists.
|
|
332
|
+
|
|
333
|
+
## 10. Rollout Phases
|
|
334
|
+
|
|
335
|
+
### Phase 0: Core helper and golden vectors
|
|
336
|
+
|
|
337
|
+
Done.
|
|
338
|
+
|
|
339
|
+
Includes:
|
|
340
|
+
|
|
341
|
+
- pure Core Ha-Pri v2 helper
|
|
342
|
+
- deterministic golden vectors
|
|
343
|
+
- valid / invalid / unknown verification behavior
|
|
344
|
+
|
|
345
|
+
### Phase 1: Storage schema design
|
|
346
|
+
|
|
347
|
+
This document.
|
|
348
|
+
|
|
349
|
+
No migration yet.
|
|
350
|
+
|
|
351
|
+
### Phase 2: Write-path dual-stamp v1 + v2
|
|
352
|
+
|
|
353
|
+
Add D1 columns and write v2 fields for new rows only. Keep v1 intact.
|
|
354
|
+
|
|
355
|
+
### Phase 3: Report-only read/debug verification
|
|
356
|
+
|
|
357
|
+
Recompute v2 on read/debug paths and report status. Do not reject rows yet.
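A report-only pass might look like the sketch below, which annotates rows with a status and tallies the results but never drops a row. `reportOnly` and `VerificationReport` are hypothetical names, and treating a verifier exception as `unknown` is an assumption.

```typescript
type VerifyStatus = "valid" | "invalid" | "unknown";

interface VerificationReport<T> {
  rows: Array<{ row: T; status: VerifyStatus }>;
  counts: Record<VerifyStatus, number>;
}

// Report-only: every input row is returned, whatever its status.
function reportOnly<T>(rows: T[], verify: (row: T) => VerifyStatus): VerificationReport<T> {
  const counts: Record<VerifyStatus, number> = { valid: 0, invalid: 0, unknown: 0 };
  const annotated = rows.map((row) => {
    let status: VerifyStatus;
    try {
      status = verify(row);
    } catch {
      status = "unknown"; // a verifier failure must not fail the read
    }
    counts[status] += 1;
    return { row, status };
  });
  return { rows: annotated, counts };
}
```

The `counts` object is the kind of evidence a later strict mode could rely on: it shows how many rows would have been affected without changing what callers receive.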
|
|
358
|
+
|
|
359
|
+
### Phase 4: Backfill tooling
|
|
360
|
+
|
|
361
|
+
Backfill historical rows only with explicit `backfilled` or `unknown_origin` markers.
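How the two markers are assigned is not specified here. The sketch below assumes that rows with complete bound material receive a computed signature and the `backfilled` marker, while rows with missing material receive `unknown_origin` and no signature. `planBackfill` and its callback parameters are hypothetical.

```typescript
type BackfillMarker = "backfilled" | "unknown_origin";

interface BackfillDecision {
  rowId: string;
  marker: BackfillMarker;
  signature: string | null; // null when no signature can be computed
}

// Plans the backfill without writing anything, so the plan can be reviewed first.
function planBackfill<T extends { id: string }>(
  rows: T[],
  buildPayload: (row: T) => string | null, // returns null when bound material is missing
  signPayload: (payload: string) => string,
): BackfillDecision[] {
  return rows.map((row): BackfillDecision => {
    const payload = buildPayload(row);
    if (payload === null) {
      // Assumed policy: never invent a signature for an incomplete row.
      return { rowId: row.id, marker: "unknown_origin", signature: null };
    }
    return { rowId: row.id, marker: "backfilled", signature: signPayload(payload) };
  });
}
```

Either way, a backfilled stamp only attests to the row as it existed at backfill time, which is why the marker has to stay explicit rather than looking like a write-time stamp.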
|
|
362
|
+
|
|
363
|
+
### Phase 5: Optional strict mode
|
|
364
|
+
|
|
365
|
+
Private deployments may opt into warnings or blocking for invalid rows once report-only verification has produced enough evidence.
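As a sketch, strict mode could reduce to a small policy function. The mode names (`report`, `warn`, `block`) and action names are assumptions; the one behavior carried over from this document is that only `invalid` rows are ever warned about or blocked, so old rows that verify as `unknown` keep flowing.

```typescript
type VerifyStatus = "valid" | "invalid" | "unknown";
type EnforcementMode = "report" | "warn" | "block";
type ReadAction = "allow" | "allow_with_warning" | "reject";

// Only "invalid" rows are affected, and only when the deployment has opted in.
function decideReadAction(status: VerifyStatus, mode: EnforcementMode): ReadAction {
  if (status !== "invalid" || mode === "report") return "allow";
  return mode === "warn" ? "allow_with_warning" : "reject";
}
```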
|
|
366
|
+
|
|
367
|
+
### Phase 6: Hosted/private HMAC signing
|
|
368
|
+
|
|
369
|
+
Add origin-authenticated signing only if hosted/private use cases require it.
|
|
370
|
+
|
|
371
|
+
## 11. Non-Goals
|
|
372
|
+
|
|
373
|
+
This pass does not implement:
|
|
374
|
+
|
|
375
|
+
- D1 migration
|
|
376
|
+
- Worker enforcement
|
|
377
|
+
- read-path rejection
|
|
378
|
+
- debug endpoint changes
|
|
379
|
+
- HMAC/private signing
|
|
380
|
+
- backfill script
|
|
381
|
+
- npm publish
|
|
382
|
+
- version bump
|
|
383
|
+
- Cloudflare deploy
|
|
384
|
+
- public security claim upgrade
|
|
385
|
+
- new MCP tools
|
|
386
|
+
- ranking/scoring changes
|
|
387
|
+
- Operator/retrieve behavior
|
|
388
|
+
|
|
389
|
+
## Release Gates For Future Implementation
|
|
390
|
+
|
|
391
|
+
Before any implementation pass can claim production enforcement:
|
|
392
|
+
|
|
393
|
+
- migrations must be reviewed and dry-run
|
|
394
|
+
- new rows must dual-stamp v1 and v2
|
|
395
|
+
- v1 compatibility tests must pass
|
|
396
|
+
- v2 read verification tests must pass
|
|
397
|
+
- old-row `unknown` behavior must be proven
|
|
398
|
+
- tampered-row `invalid` behavior must be proven
|
|
399
|
+
- debug output must avoid leaking secrets or excessive internals
|
|
400
|
+
- Worker dry-run must pass
|
|
401
|
+
- Trust gate must pass with an effective fail count of 0
|
|
402
|
+
- public docs must say exactly what is implemented
|
|
403
|
+
|
|
404
|
+
## Product Interpretation
|
|
405
|
+
|
|
406
|
+
Ha-Pri v2 is a provenance hardening lane for stored FreshContext rows. It supports the larger product story only when it stays honest:
|
|
407
|
+
|
|
408
|
+
```txt
|
|
409
|
+
candidate context in
|
|
410
|
+
decision-ready context out
|
|
411
|
+
optional provenance verification for stored rows
|
|
412
|
+
```
|
|
413
|
+
|
|
414
|
+
It is neither a truth engine nor a substitute for authentication, authorization, or hosted tenant isolation.
|