PyPI - proofbundle - Versions diffs - 0.8.0__tar.gz → 0.9.0__tar.gz - Mend

proofbundle 0.8.0__tar.gz → 0.9.0__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (52)

{proofbundle-0.8.0/src/proofbundle.egg-info → proofbundle-0.9.0}/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: proofbundle
-Version: 0.8.0
+Version: 0.9.0
 Summary: Emit and verify portable cryptographic evidence bundles, offline: Ed25519 + RFC 6962 Merkle + optional SD-JWT.
 Author: Konrad Gruszka
 License: MIT
@@ -50,9 +50,10 @@ Dynamic: license-file
 <h1>proofbundle</h1>
-**Emit and verify, fully offline, portable evidence that a piece of data was
-signed and anchored in a tamper-evident log — and optionally carries a
-selectively disclosable credential. Pure Python, no server, no daemon, one JSON file.**
+**An offline verifier for AI eval receipts. Standards-native: Ed25519 signature,
+RFC 6962 transparency-log Merkle anchoring, optional SD-JWT (RFC 9901) selective
+disclosure, aligned to the in-toto test-result predicate. One portable JSON file,
+no server, no network.**
 [![CI](https://github.com/b7n0de/proofbundle/actions/workflows/ci.yml/badge.svg)](https://github.com/b7n0de/proofbundle/actions/workflows/ci.yml)
 [![PyPI](https://img.shields.io/pypi/v/proofbundle.svg?color=D6248A&cacheSeconds=3600)](https://pypi.org/project/proofbundle/)
@@ -70,7 +71,7 @@ selectively disclosable credential. Pure Python, no server, no daemon, one JSON
 **At a glance:** `proofbundle emit` signs and anchors a payload; `proofbundle
 verify` checks one self-contained `bundle.json` with three offline cryptographic
-checks → `OK` or `FAILED`. No network, no daemon, no own crypto. 74 tests.
+checks → `OK` or `FAILED`. No network, no daemon, no own crypto. 96 tests.
 ## Contents
@@ -325,24 +326,46 @@ SD-JWT selective disclosure over one portable file, offline.
 The maintainers of inspect_evals (Arcadia Impact, funded by the UK AI Safety Institute) name an open
 gap ([arXiv:2507.06893](https://arxiv.org/abs/2507.06893)):
 a database of trustworthy evaluation results with proper provenance tracking. proofbundle is the
-missing **signature + selective-disclosure layer** for exactly that — complementary to metadata
-aggregation (Every Eval Ever) and documentation taxonomies (Eval Factsheets), not a competitor.
-See [INTEROP.md](INTEROP.md) for how it maps to OpenSSF Model Signing, CycloneDX ML-BOM, and in-toto.
-- **Two framework adapters** — `pip install "proofbundle[inspect]"` reads a UK AISI
+missing **signature + selective-disclosure layer** for exactly that.
+**How it fits — standards-native, and honest about the neighbours.** proofbundle attests that a *claimed*
+evaluation result is authentic, tamper-evident, and selectively disclosable. It does **not** attest that
+the evaluation was computed correctly or that results were not cherry-picked — proving faithful
+computation is the domain of TEE approaches such as
+[Attestable Audits](https://arxiv.org/abs/2506.23706). It is complementary to its neighbours, named
+fairly: [Every Eval Ever](https://github.com/evaleval/every_eval_ever) standardizes eval *metadata* but
+adds no cryptography (proofbundle ships an EEE→receipt converter);
+[OpenSSF Model Signing](https://github.com/ossf/model-signing-spec) signs *model weights*, not eval
+results; [ValiChord](https://github.com/topeuph-ai/ValiChord) provides blind peer consensus and an
+attested log on a Holochain network (its v1 attestation library uses a simple SHA-256 Merkle tree, no
+signature, no SD-JWT, no in-toto). proofbundle is the lightweight, **standards-native** piece between them:
+a portable receipt a third party verifies offline, with selective disclosure so an auditor can prove a
+threshold was met without revealing the model or the data. See [INTEROP.md](INTEROP.md).
+- **Three framework bridges** — `pip install "proofbundle[inspect]"` reads a UK AISI
   [inspect_ai](https://github.com/UKGovernmentBEIS/inspect_ai) eval log via the stable `read_eval_log`
   API (lazy import). `proofbundle.adapters.from_lm_eval_results` reads a real EleutherAI
   [lm-evaluation-harness](https://github.com/EleutherAI/lm-evaluation-harness) `results_*.json` (the
-  genuine `acc,none` filter-suffix format) and captures run provenance — no framework import either way.
-- **in-toto Statement v1** — `proofbundle.intoto.to_intoto_statement(claim, root_b64=…)`
-  emits the receipt as an in-toto statement with a self-hosted predicate type. The subject
-  digest is an *honest salted commitment* under a custom key, never `sha256` (see
-  [PREDICATE.md](PREDICATE.md)).
-- **SD-JWT issuance** (RFC 9901) — `proofbundle.sdjwt_issue.issue_sd_jwt(claim, signer,
+  genuine `acc,none` filter-suffix format). **`proofbundle.adapters.from_eee_dataset`** (v0.9) reads an
+  Every Eval Ever v0.2.2 aggregate JSON and builds a signed receipt — validated against the vendored EEE
+  schema, with **no runtime import** of `every_eval_ever` (it needs Python 3.12; proofbundle stays 3.9+).
+- **in-toto test-result export, DSSE-signed** (v0.9) — `proofbundle.intoto.export_intoto_dsse(claim,
+  signer)` emits the receipt as a DSSE-signed in-toto Statement v1 with the **generic
+  `test-result/v0.1` predicate** (result PASSED/FAILED, `configuration` ResourceDescriptors), so a generic
+  in-toto verifier understands it. Alongside the self-hosted-predicate `to_intoto_statement` (see
+  [PREDICATE.md](PREDICATE.md)). Metric details live in `annotations` (test-result has no native metric
+  field); the model/dataset stay salted commitments, never `sha256`.
+- **C2SP tlog-checkpoint** (v0.9) — `proofbundle.checkpoint.sign_checkpoint(origin, tree_size, root, …)`
+  emits a valid [C2SP](https://github.com/C2SP/C2SP/blob/main/tlog-checkpoint.md) signed note over the
+  RFC 6962 Merkle root, making a receipt witness-network / transparency-log compatible. Pure serialization
+  over the Ed25519 key already in use — no new crypto.
+- **SD-JWT issuance** (RFC 9901, verified Nov 2025) — `proofbundle.sdjwt_issue.issue_sd_jwt(claim, signer,
   root_b64=…, exact_score=…)` issues the receipt so a holder can disclose `passed` +
-  `threshold` while **withholding the exact score** and the identifier openings. The signed
-  bundle payload is the source of truth; the SD-JWT is a derived, bundle-bound view, verified
-  by proofbundle's own verifier **and** the `sd-jwt-python` reference.
+  `threshold` while **withholding the exact score** and the identifier openings. The digest mechanic is
+  RFC 9901 §4.2.3 (base64url of SHA-256 over the base64url-encoded Disclosure), cross-checked against the
+  `sd-jwt-python` reference.
+The signed bundle payload is always the source of truth; the SD-JWT and the in-toto export are derived,
+bundle-bound views.
 Every release ships **PEP 740 attestations** (Trusted Publishing) + an SLSA build-provenance
 attestation — see [SECURITY.md](SECURITY.md).
@@ -359,8 +382,10 @@ attestation — see [SECURITY.md](SECURITY.md).
   CITATION.cff, PEP 740 attestations documented.
 - **v0.7** — citability polish (ORCID, Zenodo DOI placeholder, in-toto proposal draft); v0.7.1 hardened
   verifier robustness + CI on Python 3.9 after a holistic review.
-- **v0.8 (current release)** — an offline `make demo` (real eval log -> signed receipt -> verified),
+- **v0.8** — an offline `make demo` (real eval log -> signed receipt -> verified),
   a sharpened honesty guardrail (authenticity/integrity, not computation-correctness), and outreach drafts.
+- **v0.9 (current release)** — the standards moat: a DSSE-signed in-toto `test-result` export, a C2SP
+  tlog-checkpoint over the RFC 6962 root, an Every Eval Ever converter, and standards-native repositioning.
 - **Deferred** (explicitly not yet built) — SD-JWT VC conformance + `vct` metadata,
   Key-Binding JWT, status lists / revocation, an official in-toto PR, DSSE / a full in-toto client.
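
The hunk above mentions a DSSE-signed in-toto export. The internals of `export_intoto_dsse` are not part of this diff, so the following is only a minimal sketch of the DSSE pre-authentication encoding (PAE) that a DSSE signature is computed over, written from the public DSSE protocol description. The function name `pae` is illustrative and is not claimed to exist in proofbundle.

```python
def pae(payload_type: str, payload: bytes) -> bytes:
    """DSSE pre-authentication encoding.

    PAE(type, body) = "DSSEv1" SP LEN(type) SP type SP LEN(body) SP body,
    where LEN is the byte length written as ASCII decimal.
    """
    type_bytes = payload_type.encode("utf-8")
    return b" ".join([
        b"DSSEv1",
        str(len(type_bytes)).encode("ascii"),
        type_bytes,
        str(len(payload)).encode("ascii"),
        payload,
    ])


# An in-toto Statement travels in a DSSE envelope with this payload type.
# The JSON body here is a stand-in, not a real statement.
to_sign = pae("application/vnd.in-toto+json", b"{}")
print(to_sign)
```

The signer signs `to_sign`, not the bare payload. This is the wrapping that the C2SP checkpoint format in the same release explicitly does not use.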

{proofbundle-0.8.0 → proofbundle-0.9.0}/README.md RENAMED

@@ -7,9 +7,10 @@
 <h1>proofbundle</h1>
-**Emit and verify, fully offline, portable evidence that a piece of data was
-signed and anchored in a tamper-evident log — and optionally carries a
-selectively disclosable credential. Pure Python, no server, no daemon, one JSON file.**
+**An offline verifier for AI eval receipts. Standards-native: Ed25519 signature,
+RFC 6962 transparency-log Merkle anchoring, optional SD-JWT (RFC 9901) selective
+disclosure, aligned to the in-toto test-result predicate. One portable JSON file,
+no server, no network.**
 [![CI](https://github.com/b7n0de/proofbundle/actions/workflows/ci.yml/badge.svg)](https://github.com/b7n0de/proofbundle/actions/workflows/ci.yml)
 [![PyPI](https://img.shields.io/pypi/v/proofbundle.svg?color=D6248A&cacheSeconds=3600)](https://pypi.org/project/proofbundle/)
@@ -27,7 +28,7 @@ selectively disclosable credential. Pure Python, no server, no daemon, one JSON
 **At a glance:** `proofbundle emit` signs and anchors a payload; `proofbundle
 verify` checks one self-contained `bundle.json` with three offline cryptographic
-checks → `OK` or `FAILED`. No network, no daemon, no own crypto. 74 tests.
+checks → `OK` or `FAILED`. No network, no daemon, no own crypto. 96 tests.
 ## Contents
@@ -282,24 +283,46 @@ SD-JWT selective disclosure over one portable file, offline.
 The maintainers of inspect_evals (Arcadia Impact, funded by the UK AI Safety Institute) name an open
 gap ([arXiv:2507.06893](https://arxiv.org/abs/2507.06893)):
 a database of trustworthy evaluation results with proper provenance tracking. proofbundle is the
-missing **signature + selective-disclosure layer** for exactly that — complementary to metadata
-aggregation (Every Eval Ever) and documentation taxonomies (Eval Factsheets), not a competitor.
-See [INTEROP.md](INTEROP.md) for how it maps to OpenSSF Model Signing, CycloneDX ML-BOM, and in-toto.
-- **Two framework adapters** — `pip install "proofbundle[inspect]"` reads a UK AISI
+missing **signature + selective-disclosure layer** for exactly that.
+**How it fits — standards-native, and honest about the neighbours.** proofbundle attests that a *claimed*
+evaluation result is authentic, tamper-evident, and selectively disclosable. It does **not** attest that
+the evaluation was computed correctly or that results were not cherry-picked — proving faithful
+computation is the domain of TEE approaches such as
+[Attestable Audits](https://arxiv.org/abs/2506.23706). It is complementary to its neighbours, named
+fairly: [Every Eval Ever](https://github.com/evaleval/every_eval_ever) standardizes eval *metadata* but
+adds no cryptography (proofbundle ships an EEE→receipt converter);
+[OpenSSF Model Signing](https://github.com/ossf/model-signing-spec) signs *model weights*, not eval
+results; [ValiChord](https://github.com/topeuph-ai/ValiChord) provides blind peer consensus and an
+attested log on a Holochain network (its v1 attestation library uses a simple SHA-256 Merkle tree, no
+signature, no SD-JWT, no in-toto). proofbundle is the lightweight, **standards-native** piece between them:
+a portable receipt a third party verifies offline, with selective disclosure so an auditor can prove a
+threshold was met without revealing the model or the data. See [INTEROP.md](INTEROP.md).
+- **Three framework bridges** — `pip install "proofbundle[inspect]"` reads a UK AISI
   [inspect_ai](https://github.com/UKGovernmentBEIS/inspect_ai) eval log via the stable `read_eval_log`
   API (lazy import). `proofbundle.adapters.from_lm_eval_results` reads a real EleutherAI
   [lm-evaluation-harness](https://github.com/EleutherAI/lm-evaluation-harness) `results_*.json` (the
-  genuine `acc,none` filter-suffix format) and captures run provenance — no framework import either way.
-- **in-toto Statement v1** — `proofbundle.intoto.to_intoto_statement(claim, root_b64=…)`
-  emits the receipt as an in-toto statement with a self-hosted predicate type. The subject
-  digest is an *honest salted commitment* under a custom key, never `sha256` (see
-  [PREDICATE.md](PREDICATE.md)).
-- **SD-JWT issuance** (RFC 9901) — `proofbundle.sdjwt_issue.issue_sd_jwt(claim, signer,
+  genuine `acc,none` filter-suffix format). **`proofbundle.adapters.from_eee_dataset`** (v0.9) reads an
+  Every Eval Ever v0.2.2 aggregate JSON and builds a signed receipt — validated against the vendored EEE
+  schema, with **no runtime import** of `every_eval_ever` (it needs Python 3.12; proofbundle stays 3.9+).
+- **in-toto test-result export, DSSE-signed** (v0.9) — `proofbundle.intoto.export_intoto_dsse(claim,
+  signer)` emits the receipt as a DSSE-signed in-toto Statement v1 with the **generic
+  `test-result/v0.1` predicate** (result PASSED/FAILED, `configuration` ResourceDescriptors), so a generic
+  in-toto verifier understands it. Alongside the self-hosted-predicate `to_intoto_statement` (see
+  [PREDICATE.md](PREDICATE.md)). Metric details live in `annotations` (test-result has no native metric
+  field); the model/dataset stay salted commitments, never `sha256`.
+- **C2SP tlog-checkpoint** (v0.9) — `proofbundle.checkpoint.sign_checkpoint(origin, tree_size, root, …)`
+  emits a valid [C2SP](https://github.com/C2SP/C2SP/blob/main/tlog-checkpoint.md) signed note over the
+  RFC 6962 Merkle root, making a receipt witness-network / transparency-log compatible. Pure serialization
+  over the Ed25519 key already in use — no new crypto.
+- **SD-JWT issuance** (RFC 9901, verified Nov 2025) — `proofbundle.sdjwt_issue.issue_sd_jwt(claim, signer,
   root_b64=…, exact_score=…)` issues the receipt so a holder can disclose `passed` +
-  `threshold` while **withholding the exact score** and the identifier openings. The signed
-  bundle payload is the source of truth; the SD-JWT is a derived, bundle-bound view, verified
-  by proofbundle's own verifier **and** the `sd-jwt-python` reference.
+  `threshold` while **withholding the exact score** and the identifier openings. The digest mechanic is
+  RFC 9901 §4.2.3 (base64url of SHA-256 over the base64url-encoded Disclosure), cross-checked against the
+  `sd-jwt-python` reference.
+The signed bundle payload is always the source of truth; the SD-JWT and the in-toto export are derived,
+bundle-bound views.
 Every release ships **PEP 740 attestations** (Trusted Publishing) + an SLSA build-provenance
 attestation — see [SECURITY.md](SECURITY.md).
@@ -316,8 +339,10 @@ attestation — see [SECURITY.md](SECURITY.md).
   CITATION.cff, PEP 740 attestations documented.
 - **v0.7** — citability polish (ORCID, Zenodo DOI placeholder, in-toto proposal draft); v0.7.1 hardened
   verifier robustness + CI on Python 3.9 after a holistic review.
-- **v0.8 (current release)** — an offline `make demo` (real eval log -> signed receipt -> verified),
+- **v0.8** — an offline `make demo` (real eval log -> signed receipt -> verified),
   a sharpened honesty guardrail (authenticity/integrity, not computation-correctness), and outreach drafts.
+- **v0.9 (current release)** — the standards moat: a DSSE-signed in-toto `test-result` export, a C2SP
+  tlog-checkpoint over the RFC 6962 root, an Every Eval Ever converter, and standards-native repositioning.
 - **Deferred** (explicitly not yet built) — SD-JWT VC conformance + `vct` metadata,
   Key-Binding JWT, status lists / revocation, an official in-toto PR, DSSE / a full in-toto client.
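
The README hunk above describes the SD-JWT digest mechanic as RFC 9901 §4.2.3: base64url of SHA-256 over the base64url-encoded Disclosure. A minimal standard-library sketch of that rule follows. It is independent of proofbundle's `sdjwt_issue` module, whose code is not shown in this diff. The helper names, the salt, and the claim are illustrative assumptions.

```python
import base64
import hashlib
import json


def b64url(data: bytes) -> str:
    """base64url without padding, the encoding SD-JWT uses throughout."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def make_disclosure(salt: str, claim_name: str, claim_value) -> str:
    """A Disclosure is base64url(JSON array [salt, claim name, claim value])."""
    array = json.dumps([salt, claim_name, claim_value])
    return b64url(array.encode("utf-8"))


def disclosure_digest(disclosure: str) -> str:
    """Digest = base64url(SHA-256(ASCII bytes of the encoded Disclosure))."""
    return b64url(hashlib.sha256(disclosure.encode("ascii")).digest())


# Illustrative values only: a real salt must be fresh random data per claim.
disclosure = make_disclosure("illustrative-salt", "score", "0.87")
digest = disclosure_digest(disclosure)
print(disclosure)
print(digest)
```

The digest is taken over the encoded Disclosure string exactly as it is transmitted, so a verifier never has to re-serialize the JSON array and risk a byte-level mismatch.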

{proofbundle-0.8.0 → proofbundle-0.9.0}/pyproject.toml RENAMED

@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"
 [project]
 name = "proofbundle"
-version = "0.8.0"
+version = "0.9.0"
 description = "Emit and verify portable cryptographic evidence bundles, offline: Ed25519 + RFC 6962 Merkle + optional SD-JWT."
 readme = "README.md"
 requires-python = ">=3.9"
@@ -67,7 +67,7 @@ proofbundle = "proofbundle.cli:main"
 where = ["src"]
 [tool.setuptools.package-data]
-proofbundle = ["py.typed"]
+proofbundle = ["py.typed", "eee_eval_schema.json"]
 [tool.ruff]
 line-length = 100

{proofbundle-0.8.0 → proofbundle-0.9.0}/src/proofbundle/__init__.py RENAMED

@@ -13,7 +13,7 @@ from .emit import emit_bundle, generate_signer
 from .errors import Check, ProofBundleError, VerificationResult
 from .merkle import verify_consistency, verify_inclusion
-__version__ = "0.8.0"
+__version__ = "0.9.0"
 __all__ = [
     "__version__",

{proofbundle-0.8.0 → proofbundle-0.9.0}/src/proofbundle/adapters/__init__.py RENAMED

@@ -5,6 +5,7 @@ add no runtime dependency. The output-format mapping is bound to a framework ver
 each fixture in tests/fixtures documents its source + version.
 """
 from .inspect_ai import from_inspect_ai_log
+from .eee import from_eee_dataset
 from .lm_eval import from_lm_eval_results
-__all__ = ["from_lm_eval_results", "from_inspect_ai_log"]
+__all__ = ["from_lm_eval_results", "from_inspect_ai_log", "from_eee_dataset"]

proofbundle-0.9.0/src/proofbundle/adapters/eee.py ADDED

@@ -0,0 +1,175 @@
+"""Adapter: an Every Eval Ever (EEE) dataset record → a signed proofbundle eval receipt (v0.9).
+Every Eval Ever (evaleval/every_eval_ever, MIT) is the community aggregation schema for eval metadata —
+it has no cryptography. This converter is strictly additive: it reads an EEE aggregate JSON and builds a
+signed, selectively-disclosable proofbundle receipt from it.
+IMPORTANT: `every_eval_ever` is NOT imported at runtime — it requires Python 3.12+ (pydantic/numpy/pandas/
+duckdb), while proofbundle stays 3.9+. We parse the EEE JSON directly and OPTIONALLY validate it against the
+vendored `eee_eval_schema.json` (schema version 0.2.2, MIT) using `jsonschema` if available.
+Field mapping (verified 2026-07 against schemas/eval.schema.json v0.2.2):
+  - model_info.id                                            → model_id
+  - evaluation_results[i].evaluation_name                    → suite / task
+  - evaluation_results[i].source_data.dataset_name           → dataset_id (required in every source variant)
+  - metric_config.metric_name | metric_id | metric_kind      → metric (all optional; fallback chain)
+  - score_details.score                                      → score
+  - score_details.uncertainty.standard_error.value           → provenance.stderr
+  - eval_library.{name,version}                              → provenance.harness / harness_version
+Gotcha handled: metric_config with score_type == "levels" is an integer level index; -1 with
+has_unknown_level == true means Unknown and is rejected (not silently mapped to 0).
+"""
+from __future__ import annotations
+import json
+from pathlib import Path
+from typing import Optional, Union
+from ..evalclaim import build_eval_claim
+_SCHEMA_PATH = Path(__file__).resolve().parent.parent / "eee_eval_schema.json"
+_SCHEMA_VERSION = "0.2.2"
+class EEEAdapterError(ValueError):
+    """Raised when the EEE record is missing the expected structure — a clear error, not a bare KeyError."""
+def _load(source: Union[str, Path, dict]) -> dict:
+    if isinstance(source, dict):
+        return source
+    try:
+        return json.loads(Path(source).read_text(encoding="utf-8"))
+    except (OSError, ValueError) as e:
+        raise EEEAdapterError(f"could not read EEE dataset {source!r}: {e}") from e
+def _validate(record: dict) -> None:
+    """Best-effort schema validation against the vendored EEE schema (skipped if jsonschema/schema absent)."""
+    try:
+        import jsonschema  # noqa: PLC0415
+    except ImportError:
+        return
+    if not _SCHEMA_PATH.is_file():
+        return
+    schema = json.loads(_SCHEMA_PATH.read_text(encoding="utf-8"))
+    try:
+        jsonschema.validate(record, schema)
+    except jsonschema.ValidationError as e:
+        raise EEEAdapterError(f"EEE record does not validate against schema {_SCHEMA_VERSION}: {e.message}") from e
+def _num_to_decimal_str(x) -> str:
+    """Format a JSON number as a plain decimal string (no exponent) for build_eval_claim's pattern."""
+    if isinstance(x, bool) or not isinstance(x, (int, float)):
+        raise EEEAdapterError(f"score must be a number, got {type(x).__name__}")
+    if isinstance(x, int):
+        return str(x)
+    if x != x or x in (float("inf"), float("-inf")):   # NaN/Inf
+        raise EEEAdapterError("score must be finite")
+    s = repr(x)
+    if "e" in s or "E" in s:                             # avoid exponent form (build_eval_claim rejects it)
+        s = f"{x:.12f}".rstrip("0").rstrip(".")
+    return s
+def _pick_metric(metric_config: dict) -> str:
+    for key in ("metric_name", "metric_id", "metric_kind"):
+        v = metric_config.get(key)
+        if isinstance(v, str) and v:
+            return v
+    return "score"
+def _extract_score(score_details: dict, metric_config: dict) -> str:
+    if "score" not in score_details:
+        raise EEEAdapterError("evaluation_results[].score_details.score is required")
+    raw = score_details["score"]
+    if metric_config.get("score_type") == "levels":
+        if not isinstance(raw, (int, float)) or isinstance(raw, bool):
+            raise EEEAdapterError("levels score must be an integer level index")
+        idx = int(raw)
+        if idx == -1 and metric_config.get("has_unknown_level"):
+            raise EEEAdapterError("levels score is -1 (Unknown) — cannot build a threshold claim")
+        return str(idx)
+    return _num_to_decimal_str(raw)
+def from_eee_dataset(source: Union[str, Path, dict], *, comparator: str, threshold: str,
+                     timestamp: Optional[str] = None, eval_index: int = 0, metric_name: Optional[str] = None,
+                     model_salt: Optional[bytes] = None, dataset_salt: Optional[bytes] = None,
+                     validate: bool = True):
+    """Read an EEE dataset record and build a proofbundle eval claim for one evaluation result.
+    `comparator`/`threshold` set the pass/fail assertion (EEE stores the raw score, not a threshold verdict).
+    `eval_index` selects which of `evaluation_results` to use; `metric_name` instead selects the first result
+    whose metric matches. Returns (claim, salts). Raises EEEAdapterError on a malformed record.
+    """
+    record = _load(source)
+    if not isinstance(record, dict):
+        raise EEEAdapterError("EEE dataset must be a JSON object")
+    if validate:
+        _validate(record)
+    model_info = record.get("model_info") or {}
+    model_id = model_info.get("id")
+    if not model_id:
+        raise EEEAdapterError("EEE record missing model_info.id")
+    results = record.get("evaluation_results")
+    if not isinstance(results, list) or not results:
+        raise EEEAdapterError("EEE record has no evaluation_results")
+    if metric_name is not None:
+        chosen = next((r for r in results if isinstance(r, dict)
+                       and _pick_metric(r.get("metric_config") or {}) == metric_name), None)
+        if chosen is None:
+            raise EEEAdapterError(f"no evaluation_result with metric {metric_name!r}")
+    else:
+        if eval_index < 0 or eval_index >= len(results):
+            raise EEEAdapterError(f"eval_index {eval_index} out of range (0..{len(results) - 1})")
+        chosen = results[eval_index]
+    if not isinstance(chosen, dict):
+        raise EEEAdapterError("evaluation_results item is not an object")
+    metric_config = chosen.get("metric_config") or {}
+    score_details = chosen.get("score_details") or {}
+    source_data = chosen.get("source_data") or {}
+    suite = chosen.get("evaluation_name")
+    if not suite:
+        raise EEEAdapterError("evaluation_results[].evaluation_name is required")
+    dataset_id = source_data.get("dataset_name") or str(suite)   # dataset_name is required in EEE; defensive fallback
+    metric = _pick_metric(metric_config)
+    score = _extract_score(score_details, metric_config)
+    eval_library = record.get("eval_library") or {}
+    ts = timestamp or chosen.get("evaluation_timestamp") or record.get("retrieved_timestamp")
+    if not ts:
+        raise EEEAdapterError("no timestamp: pass timestamp= or set retrieved_timestamp/evaluation_timestamp")
+    provenance = {"source": "every_eval_ever", "eee_schema_version": record.get("schema_version") or _SCHEMA_VERSION}
+    if eval_library.get("name"):
+        provenance["harness"] = str(eval_library["name"])
+    if eval_library.get("version"):
+        provenance["harness_version"] = str(eval_library["version"])
+    # NOTE: the EEE `evaluation_id` (format eval_name/model_id/timestamp) embeds the model id in cleartext,
+    # which would defeat proofbundle's salted model commitment (a receipt is meant to hide the model). So it
+    # is deliberately NOT copied into provenance — the receipt keeps the model private by design.
+    if metric_config.get("metric_id"):
+        provenance["metric_id"] = str(metric_config["metric_id"])
+    if metric_config.get("score_type"):
+        provenance["score_type"] = str(metric_config["score_type"])
+    se = ((score_details.get("uncertainty") or {}).get("standard_error") or {}).get("value")
+    if isinstance(se, (int, float)) and not isinstance(se, bool):
+        provenance["stderr"] = str(se)
+    rel = (record.get("source_metadata") or {}).get("evaluator_relationship")
+    if rel:
+        provenance["evaluator_relationship"] = str(rel)
+    return build_eval_claim(
+        suite=str(suite), suite_version=str(eval_library.get("version") or "unknown"),
+        metric=metric, comparator=comparator, threshold=threshold, score=score,
+        n=int((score_details.get("uncertainty") or {}).get("num_samples") or 0),
+        model_id=str(model_id), dataset_id=str(dataset_id), issuer="", timestamp=str(ts),
+        provenance=provenance, model_salt=model_salt, dataset_salt=dataset_salt)
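
The score formatting in `_num_to_decimal_str` above can be exercised in isolation. The sketch below is a standalone copy of the same logic for experimentation, with two labeled changes: it raises a plain `ValueError` instead of `EEEAdapterError`, and it uses `math.isfinite` in place of the NaN/Inf comparison. It does not import proofbundle.

```python
import math


def num_to_decimal_str(x) -> str:
    """Plain decimal string for an int or finite float, never exponent form."""
    if isinstance(x, bool) or not isinstance(x, (int, float)):
        raise ValueError(f"score must be a number, got {type(x).__name__}")
    if isinstance(x, int):
        return str(x)
    if not math.isfinite(x):
        raise ValueError("score must be finite")
    s = repr(x)
    if "e" in s or "E" in s:
        # repr() chose exponent form: fall back to 12 fixed decimals, then trim
        # trailing zeros and a dangling decimal point.
        s = f"{x:.12f}".rstrip("0").rstrip(".")
    return s


for value in (3, 0.87, 1e-05, 1e16, 1e-13):
    print(repr(value), "->", num_to_decimal_str(value))
```

Worth noting when choosing thresholds: the fallback keeps 12 decimal places, so a nonzero float smaller than about 5e-13 in magnitude, such as `1e-13`, is formatted as `"0"`.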

proofbundle-0.9.0/src/proofbundle/checkpoint.py ADDED

@@ -0,0 +1,157 @@
+"""C2SP tlog-checkpoint output — a signed note over the RFC 6962 Merkle root (v0.9).
+proofbundle already has an RFC 6962 Merkle root and Ed25519, so it can emit a valid C2SP tlog-checkpoint:
+a signed note that makes a receipt witness-network / transparency-log compatible. Pure serialization and
+framing, no new crypto. Spec verified 2026-07 against C2SP/C2SP tlog-checkpoint.md + signed-note.md.
+Byte-exact rules (the ones that bite):
+  - Note text = at least three non-empty lines separated by U+000A: line 1 `origin` (a schemeless log
+    identity, no unicode spaces, no '+'), line 2 the tree size as ASCII decimal with no leading zeros
+    (empty tree = "0"), line 3 the Merkle root in STANDARD RFC 4648 §4 base64 (with padding) — NOT
+    base64url. The note text ends with a final U+000A.
+  - The signed note = note text (ending in U+000A) + one empty line + one-or-more signature lines.
+  - A signature line is:  U+2014 (EM DASH, not a hyphen) SP keyname SP base64(keyID ‖ signature) U+000A
+    where keyID is 4 bytes big-endian and, for Ed25519, signature is 64 raw bytes → 68 bytes total.
+  - What is signed: the note text bytes INCLUDING the final U+000A, EXCLUDING the separating empty line.
+    Raw bytes — NO DSSE/PAE wrapping.
+  - keyID = SHA-256(keyname_bytes ‖ 0x0A ‖ 0x01 ‖ pubkey[32])[:4]   (0x01 = Ed25519 signature type).
+  - vkey (to distribute the key) = keyname + "+" + hex8(keyID) + "+" + base64(0x01 ‖ pubkey[32]).
+"""
+from __future__ import annotations
+import base64
+import hashlib
+from typing import Optional
+from cryptography.hazmat.primitives.serialization import Encoding, PublicFormat
+from .errors import BundleFormatError
+from .signature import verify_ed25519
+__all__ = ["checkpoint_note", "key_id", "vkey", "sign_checkpoint", "verify_checkpoint", "root_bytes_from_b64"]
+EM_DASH = "—"
+_ED25519_SIG_TYPE = 0x01
+def _root_std_b64(root: bytes) -> str:
+    """Standard RFC 4648 §4 base64 (with padding) of the raw Merkle root — NOT base64url."""
+    return base64.b64encode(root).decode("ascii")
+def checkpoint_note(origin: str, tree_size: int, root: bytes) -> str:
+    """Build the C2SP checkpoint note text (3 lines + trailing newline). ``root`` is the raw RFC 6962
+    Merkle root bytes at ``tree_size``. ``origin`` must be non-empty with no spaces/'+' (a schemeless URL)."""
+    if not origin or " " in origin or "+" in origin or "\n" in origin:
+        raise BundleFormatError("checkpoint origin must be a non-empty schemeless id without spaces or '+'")
+    if isinstance(tree_size, bool) or not isinstance(tree_size, int) or tree_size < 0:
+        raise BundleFormatError("checkpoint tree_size must be a non-negative integer")
+    return f"{origin}\n{tree_size}\n{_root_std_b64(root)}\n"
+def key_id(keyname: str, pubkey: bytes) -> bytes:
+    """C2SP note key ID = first 4 bytes of SHA-256(keyname ‖ 0x0A ‖ 0x01 ‖ 32-byte-Ed25519-pubkey)."""
+    if len(pubkey) != 32:
+        raise BundleFormatError("Ed25519 public key must be 32 raw bytes")
+    h = hashlib.sha256(keyname.encode("utf-8") + b"\n" + bytes([_ED25519_SIG_TYPE]) + pubkey).digest()
+    return h[:4]
+def vkey(keyname: str, pubkey: bytes) -> str:
+    """C2SP verifier key encoding: name + '+' + hex8(keyID) + '+' + base64(0x01 ‖ pubkey)."""
+    kid = key_id(keyname, pubkey)
+    kid_hex = f"{int.from_bytes(kid, 'big'):08x}"
+    keymat = base64.b64encode(bytes([_ED25519_SIG_TYPE]) + pubkey).decode("ascii")
+    return f"{keyname}+{kid_hex}+{keymat}"
+def sign_checkpoint(origin: str, tree_size: int, root: bytes, signer, keyname: str) -> str:
+    """Produce a signed C2SP checkpoint note. ``signer`` is an Ed25519 private key whose public key must
+    correspond to ``keyname``. The signature is over the RAW note-text bytes (including the trailing
+    newline), never over base64 and never PAE-wrapped."""
+    note = checkpoint_note(origin, tree_size, root)
+    pubkey = signer.public_key().public_bytes(Encoding.Raw, PublicFormat.Raw)
+    sig = signer.sign(note.encode("utf-8"))
+    kid = key_id(keyname, pubkey)
+    sig_b64 = base64.b64encode(kid + sig).decode("ascii")
+    sig_line = f"{EM_DASH} {keyname} {sig_b64}\n"
+    return note + "\n" + sig_line
+def _parse_vkey(vkey_str: str) -> tuple[str, bytes, bytes]:
+    # The key material is standard base64, which can itself contain '+'. Since the name has no '+' (a
+    # schemeless origin) and the hex keyID has none, the FIRST TWO '+' are the separators and everything
+    # after is the base64 — so split with maxsplit=2, never a plain split (that would over-split the b64).
+    parts = vkey_str.split("+", 2)
+    if len(parts) != 3:
+        raise BundleFormatError("vkey must have 3 '+'-separated parts (name+hexKeyID+base64KeyMaterial)")
+    name, kid_hex, keymat_b64 = parts
+    try:
+        keymat = base64.b64decode(keymat_b64, validate=True)
+    except (ValueError, TypeError) as exc:
+        raise BundleFormatError("vkey key material is not valid base64") from exc
+    if len(keymat) != 33 or keymat[0] != _ED25519_SIG_TYPE:
+        raise BundleFormatError("vkey key material must be 0x01 followed by a 32-byte Ed25519 key")
+    pubkey = keymat[1:]
+    try:
+        kid = bytes.fromhex(kid_hex)
+    except ValueError as exc:
+        raise BundleFormatError("vkey keyID is not valid hex") from exc
+    return name, kid, pubkey
+def verify_checkpoint(signed_note: str, vkey_str: str) -> dict:
+    """Verify a signed C2SP checkpoint against a vkey. Returns {ok, origin, tree_size, root}. ``ok`` is
+    True iff a signature line whose keyID matches the vkey verifies (Ed25519) over the exact note-text
+    bytes. Reconstructs the note text from the parsed bytes — never re-derives it."""
+    name, kid_v, pubkey = _parse_vkey(vkey_str)
+    # note text = everything up to (and including the \n before) the separating empty line
+    if "\n\n" not in signed_note:
+        raise BundleFormatError("signed note has no empty-line separator between text and signatures")
+    note_text, sig_block = signed_note.split("\n\n", 1)
+    note_text += "\n"                       # restore the trailing newline that belongs to the note text
+    note_bytes = note_text.encode("utf-8")
+    lines = note_text.split("\n")
+    if len(lines) < 4 or not lines[0] or not lines[1] or not lines[2]:
+        raise BundleFormatError("checkpoint note must have at least 3 non-empty lines")
+    origin, size_s, root_b64 = lines[0], lines[1], lines[2]
+    if size_s != "0" and (size_s.startswith("0") or not size_s.isdigit()):
+        raise BundleFormatError("checkpoint tree size must be ASCII decimal with no leading zeros")
+    try:
+        root = base64.b64decode(root_b64, validate=True)
+    except (ValueError, TypeError) as exc:
+        raise BundleFormatError("checkpoint root is not valid standard base64") from exc
+    ok = False
+    kid_expected = key_id(name, pubkey)
+    for line in sig_block.split("\n"):
+        if not line.startswith(EM_DASH + " "):
+            continue
+        rest = line[len(EM_DASH) + 1:]
+        try:
+            lname, payload_b64 = rest.split(" ", 1)
+        except ValueError:
+            continue
+        if lname != name:
+            continue
+        try:
+            payload = base64.b64decode(payload_b64, validate=True)
+        except (ValueError, TypeError):
+            continue
+        if len(payload) < 4:
+            continue
+        kid, sig = payload[:4], payload[4:]
+        if kid != kid_v or kid != kid_expected:   # keyID must match both the vkey and the recomputed id
+            continue
+        if verify_ed25519(pubkey, sig, note_bytes):
+            ok = True
+            break
+    return {"ok": ok, "origin": origin, "tree_size": int(size_s), "root": root}
+def root_bytes_from_b64(root_b64: str) -> Optional[bytes]:
+    """Decode a bundle's standard-base64 Merkle root to raw bytes (for feeding into checkpoint_note)."""
+    try:
+        return base64.b64decode(root_b64, validate=True)
+    except (ValueError, TypeError):
+        return None
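
The byte-exact framing rules from the module docstring above can be tried with the standard library alone. The sketch below re-implements the note text, key ID, vkey, and signature-line framing as described there. It performs no signing: the public key and the signature are placeholder bytes used only to show lengths and layout, and `signature_line` is an illustrative helper name, not a function in the module.

```python
import base64
import hashlib

ED25519_SIG_TYPE = 0x01


def checkpoint_note(origin: str, tree_size: int, root: bytes) -> str:
    """Three lines (origin, decimal size, standard base64 root) plus a final newline."""
    return f"{origin}\n{tree_size}\n{base64.b64encode(root).decode('ascii')}\n"


def key_id(keyname: str, pubkey: bytes) -> bytes:
    """First 4 bytes of SHA-256(keyname, 0x0A, 0x01, 32-byte public key)."""
    data = keyname.encode("utf-8") + b"\n" + bytes([ED25519_SIG_TYPE]) + pubkey
    return hashlib.sha256(data).digest()[:4]


def vkey(keyname: str, pubkey: bytes) -> str:
    """name + '+' + 8 hex digits of the key ID + '+' + base64(0x01, public key)."""
    keymat = base64.b64encode(bytes([ED25519_SIG_TYPE]) + pubkey).decode("ascii")
    return f"{keyname}+{key_id(keyname, pubkey).hex()}+{keymat}"


def signature_line(keyname: str, pubkey: bytes, signature: bytes) -> str:
    """EM DASH, space, key name, space, base64(key ID followed by the signature)."""
    payload = key_id(keyname, pubkey) + signature
    return f"\u2014 {keyname} {base64.b64encode(payload).decode('ascii')}\n"


# Placeholder inputs: NOT a real key and NOT a real signature.
origin = "example.com/log"
root = hashlib.sha256(b"").digest()      # RFC 6962 hash of the empty tree
fake_pubkey = bytes(range(32))
fake_signature = bytes(64)

note = checkpoint_note(origin, 0, root)
signed_note = note + "\n" + signature_line(origin, fake_pubkey, fake_signature)
print(signed_note)
print(vkey(origin, fake_pubkey))
```

Because both the root and the key material use standard base64, they may contain `+` and `/`. That is why the vkey parser above splits on the first two `+` characters only.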
