sum-engine 0.7.0.tar.gz → 0.8.0.tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (145)

{sum_engine-0.7.0 → sum_engine-0.8.0}/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: sum-engine
-Version: 0.7.0
+Version: 0.8.0
 Summary: SUM — bidirectional knowledge distillation with optional cryptographic attestation. Pipe prose, get a CanonicalBundle (HMAC / Ed25519 / W3C VC 2.0), verify anywhere.
 Author: ototao
 License: Apache-2.0
@@ -24,7 +24,8 @@ License-File: LICENSE
 Requires-Dist: cryptography>=41.0.0
 Requires-Dist: sympy>=1.12
 Provides-Extra: sieve
-Requires-Dist: spacy>=3.7.0; extra == "sieve"
+Requires-Dist: spacy>=3.8.0; extra == "sieve"
+Requires-Dist: click>=8.0; extra == "sieve"
 Provides-Extra: openai
 Requires-Dist: openai<3.0.0,>=1.40.0; extra == "openai"
 Requires-Dist: pydantic>=2.0.0; extra == "openai"
@@ -34,12 +35,18 @@ Provides-Extra: anthropic
 Requires-Dist: anthropic>=0.97.0; extra == "anthropic"
 Requires-Dist: pydantic>=2.0.0; extra == "anthropic"
 Provides-Extra: receipt-verify
-Requires-Dist: joserfc>=1.0.0; extra == "receipt-verify"
+Requires-Dist: joserfc<2.0.0,>=1.0.0; extra == "receipt-verify"
+Provides-Extra: verify
+Requires-Dist: joserfc<2.0.0,>=1.0.0; extra == "verify"
 Provides-Extra: mcp
 Requires-Dist: mcp>=1.0.0; extra == "mcp"
 Provides-Extra: research
 Requires-Dist: numpy>=1.24.0; extra == "research"
 Requires-Dist: scipy>=1.10.0; extra == "research"
+Provides-Extra: judge
+Requires-Dist: transformers>=4.30.0; extra == "judge"
+Requires-Dist: torch>=2.0.0; extra == "judge"
+Requires-Dist: sentencepiece>=0.1.99; extra == "judge"
 Provides-Extra: omni-format
 Requires-Dist: markitdown==0.1.5; extra == "omni-format"
 Provides-Extra: dev
@@ -56,12 +63,13 @@ Requires-Dist: sum-engine[sieve]; extra == "all"
 Requires-Dist: sum-engine[openai]; extra == "all"
 Requires-Dist: sum-engine[anthropic]; extra == "all"
 Requires-Dist: sum-engine[receipt-verify]; extra == "all"
+Requires-Dist: sum-engine[verify]; extra == "all"
 Requires-Dist: sum-engine[mcp]; extra == "all"
 Requires-Dist: sum-engine[omni-format]; extra == "all"
 Requires-Dist: sum-engine[dev]; extra == "all"
 Dynamic: license-file
-# SUM — verifiable bidirectional knowledge distillation
+# SUM — chain of custody for AI-transformed text
 [![CI](https://github.com/OtotaO/SUM/actions/workflows/quantum-ci.yml/badge.svg)](https://github.com/OtotaO/SUM/actions/workflows/quantum-ci.yml)
 [![PyPI — sum-engine](https://img.shields.io/pypi/v/sum-engine.svg?label=PyPI%20sum-engine)](https://pypi.org/project/sum-engine/)
@@ -85,13 +93,21 @@ Headline supporting numbers (each links to its source of truth):
 | Three-runtime byte-symmetric Ed25519 over JCS bytes | provable; locked by `make xruntime` (K1–K4) + `make xruntime-adversarial` (A1–A6) | [`docs/PROOF_BOUNDARY.md`](docs/PROOF_BOUNDARY.md) §1.2, §1.3.1 |
 | Canonical round-trip `reconstruct(parse(canonical_tome(S))) == S` | provable; 0.00% drift on every CI run | [`docs/PROOF_BOUNDARY.md`](docs/PROOF_BOUNDARY.md) §1.1 |
 | Render receipt — `sum.render_receipt.v1`, Ed25519 / JCS / detached JWS | shipped; verifier in three runtimes | [`docs/RENDER_RECEIPT_FORMAT.md`](docs/RENDER_RECEIPT_FORMAT.md) |
-| Slider fact preservation: median 1.000, p10 0.769 (long n=16) / 0.818 (short n=8) | empirical-benchmark | [`docs/SLIDER_CONTRACT.md`](docs/SLIDER_CONTRACT.md) |
+| Slider fact preservation: median 1.000, p10 0.769 (long n=16) / 0.818 (short n=8) | empirical-benchmark — measured; same-commit replay receipt still pending (bench-hardening T2/T3) | [`docs/SLIDER_CONTRACT.md`](docs/SLIDER_CONTRACT.md) |
 | Extraction F1 = 1.000 (`seed_v1`), 0.762 with precision 1.000 (`seed_v2`) | empirical-benchmark | [`docs/PROOF_BOUNDARY.md`](docs/PROOF_BOUNDARY.md) §2.1 |
 A render receipt verifies the *render attestation* (issuer signed this tome, these triples, this slider position, this model, at this time). It does not verify the truth of the tome's content — that is what the slider bench measures separately. See [`docs/RENDER_RECEIPT_FORMAT.md`](docs/RENDER_RECEIPT_FORMAT.md) §5 for the explicit trust scope.
 ---
+## Why it matters
+More of what people read is now produced or reshaped by AI — summarised, translated, distilled, rewritten. As that grows, the ability to check *what changed, what was preserved, and what was lost* stops being a nicety and becomes shared infrastructure for a trustworthy information commons.
+SUM is built to be that layer **in the open**: Apache-2.0, offline-verifiable by anyone, and aligned with open standards (C2PA `digital_source_type`, W3C VC 2.0, JOSE / JWS / JWKS) rather than a proprietary trust silo. It does not ask you to trust *SUM* — any third party verifies the receipt themselves, in three independent runtimes, and the project states plainly where proof ends and measurement begins. The aim is a checkable **chain of custody for knowledge in motion**, not another walled garden.
+---
 ## Verify it yourself in 60 seconds
 The trust loop: hit the live Worker, get back a tome plus a detached Ed25519 JWS over the JCS-canonicalised receipt payload, fetch the issuer JWKS, verify.
@@ -123,19 +139,21 @@ A minimal Node verifier using `jose` + `canonicalize` is in [`docs/RENDER_RECEIP
 | Cross-runtime trust triangle | locked by CI (`make xruntime`) | K1 / K1-mw / K2 / K3 / K4 — Python ↔ Node ↔ Browser agree byte-for-byte on valid bundles. `make xruntime-adversarial` adds A1–A6 rejection-class equivalence. |
 | 5-axis slider rendering surface | density actioned deterministically; length / formality / audience / perspective LLM-conditioned. Two dispatch paths: Worker `/api/render` (Anthropic + Cloudflare AI Gateway optional) producing `sum.render_receipt.v1`, OR Python `sum transform apply slider` (OpenAI via `OPENAI_API_KEY`) producing `sum.transform_receipt.v1` | bench: median LLM-axis fact preservation 1.000, p10 0.769 (long, n=16) / 0.818 (short, n=8), order preservation 1.000 wherever measurable. Tightening worktrail at [`docs/BENCH_HARDENING_FROM_QCVV.md`](docs/BENCH_HARDENING_FROM_QCVV.md) adds iteration-stability + DKW worst-case bounds + capability-region headlines |
 | MCP server (`sum-mcp` console script) | shipped | five tools (`extract` / `attest` / `verify` / `inspect` / `schema`) exposed over stdio; bundles attested via MCP verify byte-identically through the CLI / Node / browser verifiers |
-| Transform substrate (`sum.transform_receipt.v1` + registry) | shipped (CLI in repo HEAD; PyPI catch-up tag pending) | `sum transform list` / `sum transform apply <name>` — three registered transforms (`slider` / `extract` / `compose`); receipts via Ed25519 / JCS / detached JWS just like render-receipts; 20-fixture cross-runtime K-matrix locks accept + reject across Python ↔ Node ↔ browser; T4 `source_chain_hash` binds receipts to source byte ranges; T5 `ShareableRender` round-trips signed renders for offline verification; T6 multi-school extract runs two extractors in tandem for adversarial-divergence detection. Wire spec at [`docs/TRANSFORM_RECEIPT_FORMAT.md`](docs/TRANSFORM_RECEIPT_FORMAT.md); design at [`docs/TRANSFORM_REGISTRY.md`](docs/TRANSFORM_REGISTRY.md). |
+| Transform substrate (`sum.transform_receipt.v1` + registry) | shipped on PyPI ≥ 0.7.0 | `sum transform list` / `sum transform apply <name>` — three registered transforms (`slider` / `extract` / `compose`); receipts via Ed25519 / JCS / detached JWS just like render-receipts; 20-fixture cross-runtime K-matrix locks accept + reject across Python ↔ Node ↔ browser; T4 `source_chain_hash` binds receipts to source byte ranges; T5 `ShareableRender` round-trips signed renders for offline verification; T6 multi-school extract runs two extractors in tandem for adversarial-divergence detection. Wire spec at [`docs/TRANSFORM_RECEIPT_FORMAT.md`](docs/TRANSFORM_RECEIPT_FORMAT.md); design at [`docs/TRANSFORM_REGISTRY.md`](docs/TRANSFORM_REGISTRY.md). |
 | Replay-defense window (`signed_at_out_of_window`) | shipped | opt-in `max_age_seconds` parameter across all four verifier surfaces (Python render / Python transform / JS render / JS transform). Default-off preserves archival use; receivers opt in per use-case (agent-swarm 60s, real-time 600s, newsletter 1d, legal-discovery no window). |
 | `sum verify --explain` layered output | shipped | Per-dimension report (`sum.verify_explained.v1`): cryptographic integrity / canonical reconstruction / axiom consistency / extraction provenance / source evidence coverage / semantic preservation / truth of content. Each carries `epistemic_status` (`provable` / `certified` / `empirical-benchmark` / `not-asserted`). Truth of content is ALWAYS `not_asserted` — locked by test. |
+| Meaning-loss receipts + `sum_verify` SDK | shipped on PyPI ≥ 0.8.0 | `sum.meaning_risk_receipt.v1` — a signed, replayable, distribution-free bound on a *named meaning-loss proxy* (`pip install 'sum-engine[verify]'` → `import sum_verify` / `python -m sum_verify`, dependency-light: no numpy/scipy/torch). Plus `sum meaning-diff` (per-document "what was kept / dropped / added"), `sum drift-budget` (compose meaning-loss across a transform chain), and `sum exchangeability` (advisory: is a bound applicable to *your* text?). Research-flagged; the affirmative contribution behind arXiv Paper-1. |
 | Negative-control corpus (T5 of bench-hardening) | shipped | 20 hand-authored documents across 5 failure modes (ambiguous coref / predicate-alias / contradictions / entity-resolution-adversarial / non-extractable). Runner exits 1 if observed failures don't match annotations. Baseline at [`fixtures/bench_receipts/negative_control_2026-05-17.json`](fixtures/bench_receipts/negative_control_2026-05-17.json). |
 | Compliance validators (six regimes) | shipped | `sum compliance check --regime <id> --audit-log <path>` — EU AI Act Article 12, GDPR Article 30, HIPAA § 164.312(b), ISO/IEC 27001 A.8.15, SOC 2 CC 7.2, PCI DSS v4.0 Req 10. All six produce the same `sum.compliance_report.v1` schema; per-regime docs at `docs/COMPLIANCE_*.md`. |
-The slider's product claim — *axis changes do not lose facts* — is the load-bearing empirical result. It is verified by NLI audit on every embedding-flagged "loss" cell; full attribution in [`docs/SLIDER_CONTRACT.md`](docs/SLIDER_CONTRACT.md).
+The slider's product claim — *axis changes do not lose facts* — is the load-bearing empirical result. It is verified by NLI audit on every embedding-flagged "loss" cell; full attribution in [`docs/SLIDER_CONTRACT.md`](docs/SLIDER_CONTRACT.md). In keeping with the "what remains unproven" half of the promise above: these headline numbers are **measured observations**, not yet same-commit-replayable — the bench harness (`Tests/benchmarks/slider_drift_bench.py`) is scaffold-state and no `sum.slider_drift_bench.v1` receipt is committed. Closing that to a replayable receipt is bench-hardening tasks T2 / T3 ([`docs/BENCH_HARDENING_FROM_QCVV.md`](docs/BENCH_HARDENING_FROM_QCVV.md)); see the reproducibility-status note in [`docs/SLIDER_CONTRACT.md`](docs/SLIDER_CONTRACT.md).
 ## Strategic context
 The operational compass — read in this order if you want the project's intent + how it operates + where it's going:
 - [`docs/CHARTER_2026-05-17.md`](docs/CHARTER_2026-05-17.md) — intent, the Why, strategy, objectives, success criteria, constraints, and the operational loop. The compass every other doc resolves to.
+- [`docs/PRODUCT_VISION.md`](docs/PRODUCT_VISION.md) — the product vision (the slider workbench: drop text → render it from a tag to a tome, with a signed receipt of what was preserved) and the **positioning**: SUM is the chain-of-custody *standard* for AI-transformed text — **provenance-first, attest-don't-detect** (a cryptographic guarantee robust to rewriting; an "is this AI?" answer ships only as an honest advisory signal, never a "99 %").
 - [`docs/PRODUCT_DELIBERATION_2026-05-14.md`](docs/PRODUCT_DELIBERATION_2026-05-14.md) — three-option strategic analysis + grant-outcome decision tree.
 - [`docs/ZENITH_FRAMING_2026-05-16.md`](docs/ZENITH_FRAMING_2026-05-16.md) — destination framing (SUM as chain-of-custody for AI-transformed knowledge) plus three new concepts (Perspective Receipts, Trust Profiles, Epistemic Nutrition Label) on the design queue.
 - [`docs/BENCH_HARDENING_FROM_QCVV.md`](docs/BENCH_HARDENING_FROM_QCVV.md) — five-task empirical-benchmark hardening plan (T1–T5; T5 shipped, T1–T4 queued).
@@ -251,7 +269,7 @@ Below the slider sits the substrate that earlier phases shipped and verified. Po
 - **Bundle public-key attestation (provable).** Ed25519-signed CanonicalBundles are tamper-detectable by any third party in any of the three runtimes. [`docs/PROOF_BOUNDARY.md`](docs/PROOF_BOUNDARY.md) §1.3.1.
 - **Merkle hash-chain integrity (provable, including under concurrent writers).** [`docs/PROOF_BOUNDARY.md`](docs/PROOF_BOUNDARY.md) §1.7.
 - **Extraction F1 (empirical-benchmark).** 1.000 on `seed_v1` (50 simple-SVO docs); 0.762 with precision 1.000 on `seed_v2` (20-doc difficulty corpus). Every remaining `seed_v2` failure is a recall miss, not a truth inversion. [`docs/PROOF_BOUNDARY.md`](docs/PROOF_BOUNDARY.md) §2.1.
-- **168 numbered features**, each with a reproducible verification command, in [`docs/FEATURE_CATALOG.md`](docs/FEATURE_CATALOG.md).
+- **170 numbered features**, each with a reproducible verification command, in [`docs/FEATURE_CATALOG.md`](docs/FEATURE_CATALOG.md).
 ### Research substrate (under `sum_engine_internal/research/`)
@@ -275,6 +293,10 @@ Less-surfaced but shipped:
 - **Audit log format** — every CLI operation can emit `sum.audit_log.v1` events; see [`docs/AUDIT_LOG_FORMAT.md`](docs/AUDIT_LOG_FORMAT.md).
 - **Agent surface** (`sum_engine_internal/agent_surface/`) — see [`docs/AGENT_SURFACE_FINDINGS.md`](docs/AGENT_SURFACE_FINDINGS.md).
+### Internal research surfaces (NOT shipped, present in repo)
+- **`api/quantum_router.py` + `quantum_main.py`** — FastAPI surface with 26+ endpoints (branchable knowledge graph, ZK semantic proofs, federated KG sync, JWT-tenant knowledge OS). 1,684 LOC; 58/58 tests pass; runs locally via `uvicorn quantum_main:app`. **NOT in the PyPI wheel** (`pyproject.toml` excludes `api*`), **NOT in the live Worker**, **NOT in the dogfood quickstart**. The substrate it composes is load-bearing for the shipping surfaces above; only the FastAPI HTTP layer is internal-research. Promote to a shipping `[api]` extra only if a named buyer or grant deliverable explicitly references one of the endpoint clusters. See top-of-file banner in `api/quantum_router.py` for the full triage rationale.
 ---
 ## Reproduce the bench
@@ -332,7 +354,7 @@ CI runs the full suite on every push (`.github/workflows/quantum-ci.yml`); the `
 ---
-## Truthfulness contract
+## Epistemic contract
 Every claim in this repo carries an explicit epistemic status — `provable`, `certified`, `empirical-benchmark`, or `expert-opinion`. The arbiter is [`docs/PROOF_BOUNDARY.md`](docs/PROOF_BOUNDARY.md). A summary surface that quotes an empirical-benchmark number alongside language like "mathematically guaranteed" is a policy violation per §5 and must be corrected.
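
The dependency changes in the PKG-INFO hunks above (the `spacy` floor bump, the new `click` requirement, the `joserfc` upper bound, and the new `verify` extra) can be recovered mechanically from the two metadata files. The following is a minimal sketch, assuming only that PKG-INFO uses RFC 822-style headers readable by the standard-library `email` parser. The helper names `requirements_by_extra` and `diff_requirements` are hypothetical and are not part of `sum-engine`; the sample metadata is a trimmed copy of lines from the diff above. The marker parsing is deliberately simplistic and only recognises the `extra == "name"` form shown in this file.

```python
from email.parser import Parser


def requirements_by_extra(pkg_info_text):
    """Group Requires-Dist entries by the extra named in their marker.

    Requirements without an extra marker are filed under the empty string.
    """
    msg = Parser().parsestr(pkg_info_text, headersonly=True)
    grouped = {}
    for entry in msg.get_all("Requires-Dist") or []:
        requirement, _, marker = entry.partition(";")
        extra = ""
        if "extra ==" in marker:
            # Simplistic marker handling: take the quoted name after `extra ==`.
            extra = marker.split("extra ==", 1)[1].strip().strip("\"'")
        grouped.setdefault(extra, set()).add(requirement.strip())
    return grouped


def diff_requirements(old_text, new_text):
    """Return {extra: {"added": [...], "removed": [...]}} for changed extras only."""
    old = requirements_by_extra(old_text)
    new = requirements_by_extra(new_text)
    changes = {}
    for extra in sorted(set(old) | set(new)):
        added = sorted(new.get(extra, set()) - old.get(extra, set()))
        removed = sorted(old.get(extra, set()) - new.get(extra, set()))
        if added or removed:
            changes[extra] = {"added": added, "removed": removed}
    return changes


# Trimmed samples copied from the 0.7.0 and 0.8.0 metadata shown in the diff.
OLD_PKG_INFO = """\
Metadata-Version: 2.4
Name: sum-engine
Version: 0.7.0
Requires-Dist: cryptography>=41.0.0
Provides-Extra: sieve
Requires-Dist: spacy>=3.7.0; extra == "sieve"
Provides-Extra: receipt-verify
Requires-Dist: joserfc>=1.0.0; extra == "receipt-verify"
"""

NEW_PKG_INFO = """\
Metadata-Version: 2.4
Name: sum-engine
Version: 0.8.0
Requires-Dist: cryptography>=41.0.0
Provides-Extra: sieve
Requires-Dist: spacy>=3.8.0; extra == "sieve"
Requires-Dist: click>=8.0; extra == "sieve"
Provides-Extra: receipt-verify
Requires-Dist: joserfc<2.0.0,>=1.0.0; extra == "receipt-verify"
Provides-Extra: verify
Requires-Dist: joserfc<2.0.0,>=1.0.0; extra == "verify"
"""

changes = diff_requirements(OLD_PKG_INFO, NEW_PKG_INFO)
for extra, delta in changes.items():
    print(extra or "(core)", delta)
```

Unchanged requirements such as `cryptography>=41.0.0` do not appear in the result, so the output lists only the extras touched between the two releases.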

sum_engine-0.7.0/sum_engine.egg-info/PKG-INFO → sum_engine-0.8.0/README.md RENAMED

@@ -1,67 +1,4 @@
-Metadata-Version: 2.4
-Name: sum-engine
-Version: 0.7.0
-Summary: SUM — bidirectional knowledge distillation with optional cryptographic attestation. Pipe prose, get a CanonicalBundle (HMAC / Ed25519 / W3C VC 2.0), verify anywhere.
-Author: ototao
-License: Apache-2.0
-Project-URL: Homepage, https://github.com/OtotaO/SUM
-Project-URL: Repository, https://github.com/OtotaO/SUM
-Project-URL: Proof Boundary, https://github.com/OtotaO/SUM/blob/main/docs/PROOF_BOUNDARY.md
-Project-URL: Feature Catalog, https://github.com/OtotaO/SUM/blob/main/docs/FEATURE_CATALOG.md
-Keywords: knowledge-graph,verifiable-credentials,attestation,godel-encoding,semantic-web,agent-cli
-Classifier: Development Status :: 4 - Beta
-Classifier: Intended Audience :: Developers
-Classifier: License :: OSI Approved :: Apache Software License
-Classifier: Programming Language :: Python :: 3.10
-Classifier: Programming Language :: Python :: 3.11
-Classifier: Programming Language :: Python :: 3.12
-Classifier: Topic :: Scientific/Engineering :: Information Analysis
-Classifier: Topic :: Security :: Cryptography
-Classifier: Environment :: Console
-Requires-Python: >=3.10
-Description-Content-Type: text/markdown
-License-File: LICENSE
-Requires-Dist: cryptography>=41.0.0
-Requires-Dist: sympy>=1.12
-Provides-Extra: sieve
-Requires-Dist: spacy>=3.7.0; extra == "sieve"
-Provides-Extra: openai
-Requires-Dist: openai<3.0.0,>=1.40.0; extra == "openai"
-Requires-Dist: pydantic>=2.0.0; extra == "openai"
-Provides-Extra: llm
-Requires-Dist: sum-engine[openai]; extra == "llm"
-Provides-Extra: anthropic
-Requires-Dist: anthropic>=0.97.0; extra == "anthropic"
-Requires-Dist: pydantic>=2.0.0; extra == "anthropic"
-Provides-Extra: receipt-verify
-Requires-Dist: joserfc>=1.0.0; extra == "receipt-verify"
-Provides-Extra: mcp
-Requires-Dist: mcp>=1.0.0; extra == "mcp"
-Provides-Extra: research
-Requires-Dist: numpy>=1.24.0; extra == "research"
-Requires-Dist: scipy>=1.10.0; extra == "research"
-Provides-Extra: omni-format
-Requires-Dist: markitdown==0.1.5; extra == "omni-format"
-Provides-Extra: dev
-Requires-Dist: pytest>=7.0.0; extra == "dev"
-Requires-Dist: pytest-asyncio>=0.21.0; extra == "dev"
-Requires-Dist: mypy>=1.0.0; extra == "dev"
-Requires-Dist: ruff>=0.1.0; extra == "dev"
-Requires-Dist: types-setuptools; extra == "dev"
-Requires-Dist: PyJWT>=2.8.0; extra == "dev"
-Requires-Dist: build>=1.0.0; extra == "dev"
-Requires-Dist: hypothesis>=6.0.0; extra == "dev"
-Provides-Extra: all
-Requires-Dist: sum-engine[sieve]; extra == "all"
-Requires-Dist: sum-engine[openai]; extra == "all"
-Requires-Dist: sum-engine[anthropic]; extra == "all"
-Requires-Dist: sum-engine[receipt-verify]; extra == "all"
-Requires-Dist: sum-engine[mcp]; extra == "all"
-Requires-Dist: sum-engine[omni-format]; extra == "all"
-Requires-Dist: sum-engine[dev]; extra == "all"
-Dynamic: license-file
-# SUM — verifiable bidirectional knowledge distillation
+# SUM — chain of custody for AI-transformed text
 [![CI](https://github.com/OtotaO/SUM/actions/workflows/quantum-ci.yml/badge.svg)](https://github.com/OtotaO/SUM/actions/workflows/quantum-ci.yml)
 [![PyPI — sum-engine](https://img.shields.io/pypi/v/sum-engine.svg?label=PyPI%20sum-engine)](https://pypi.org/project/sum-engine/)
@@ -85,13 +22,21 @@ Headline supporting numbers (each links to its source of truth):
 | Three-runtime byte-symmetric Ed25519 over JCS bytes | provable; locked by `make xruntime` (K1–K4) + `make xruntime-adversarial` (A1–A6) | [`docs/PROOF_BOUNDARY.md`](docs/PROOF_BOUNDARY.md) §1.2, §1.3.1 |
 | Canonical round-trip `reconstruct(parse(canonical_tome(S))) == S` | provable; 0.00% drift on every CI run | [`docs/PROOF_BOUNDARY.md`](docs/PROOF_BOUNDARY.md) §1.1 |
 | Render receipt — `sum.render_receipt.v1`, Ed25519 / JCS / detached JWS | shipped; verifier in three runtimes | [`docs/RENDER_RECEIPT_FORMAT.md`](docs/RENDER_RECEIPT_FORMAT.md) |
-| Slider fact preservation: median 1.000, p10 0.769 (long n=16) / 0.818 (short n=8) | empirical-benchmark | [`docs/SLIDER_CONTRACT.md`](docs/SLIDER_CONTRACT.md) |
+| Slider fact preservation: median 1.000, p10 0.769 (long n=16) / 0.818 (short n=8) | empirical-benchmark — measured; same-commit replay receipt still pending (bench-hardening T2/T3) | [`docs/SLIDER_CONTRACT.md`](docs/SLIDER_CONTRACT.md) |
 | Extraction F1 = 1.000 (`seed_v1`), 0.762 with precision 1.000 (`seed_v2`) | empirical-benchmark | [`docs/PROOF_BOUNDARY.md`](docs/PROOF_BOUNDARY.md) §2.1 |
 A render receipt verifies the *render attestation* (issuer signed this tome, these triples, this slider position, this model, at this time). It does not verify the truth of the tome's content — that is what the slider bench measures separately. See [`docs/RENDER_RECEIPT_FORMAT.md`](docs/RENDER_RECEIPT_FORMAT.md) §5 for the explicit trust scope.
 ---
+## Why it matters
+More of what people read is now produced or reshaped by AI — summarised, translated, distilled, rewritten. As that grows, the ability to check *what changed, what was preserved, and what was lost* stops being a nicety and becomes shared infrastructure for a trustworthy information commons.
+SUM is built to be that layer **in the open**: Apache-2.0, offline-verifiable by anyone, and aligned with open standards (C2PA `digital_source_type`, W3C VC 2.0, JOSE / JWS / JWKS) rather than a proprietary trust silo. It does not ask you to trust *SUM* — any third party verifies the receipt themselves, in three independent runtimes, and the project states plainly where proof ends and measurement begins. The aim is a checkable **chain of custody for knowledge in motion**, not another walled garden.
+---
 ## Verify it yourself in 60 seconds
 The trust loop: hit the live Worker, get back a tome plus a detached Ed25519 JWS over the JCS-canonicalised receipt payload, fetch the issuer JWKS, verify.
@@ -123,19 +68,21 @@ A minimal Node verifier using `jose` + `canonicalize` is in [`docs/RENDER_RECEIP
 | Cross-runtime trust triangle | locked by CI (`make xruntime`) | K1 / K1-mw / K2 / K3 / K4 — Python ↔ Node ↔ Browser agree byte-for-byte on valid bundles. `make xruntime-adversarial` adds A1–A6 rejection-class equivalence. |
 | 5-axis slider rendering surface | density actioned deterministically; length / formality / audience / perspective LLM-conditioned. Two dispatch paths: Worker `/api/render` (Anthropic + Cloudflare AI Gateway optional) producing `sum.render_receipt.v1`, OR Python `sum transform apply slider` (OpenAI via `OPENAI_API_KEY`) producing `sum.transform_receipt.v1` | bench: median LLM-axis fact preservation 1.000, p10 0.769 (long, n=16) / 0.818 (short, n=8), order preservation 1.000 wherever measurable. Tightening worktrail at [`docs/BENCH_HARDENING_FROM_QCVV.md`](docs/BENCH_HARDENING_FROM_QCVV.md) adds iteration-stability + DKW worst-case bounds + capability-region headlines |
 | MCP server (`sum-mcp` console script) | shipped | five tools (`extract` / `attest` / `verify` / `inspect` / `schema`) exposed over stdio; bundles attested via MCP verify byte-identically through the CLI / Node / browser verifiers |
-| Transform substrate (`sum.transform_receipt.v1` + registry) | shipped (CLI in repo HEAD; PyPI catch-up tag pending) | `sum transform list` / `sum transform apply <name>` — three registered transforms (`slider` / `extract` / `compose`); receipts via Ed25519 / JCS / detached JWS just like render-receipts; 20-fixture cross-runtime K-matrix locks accept + reject across Python ↔ Node ↔ browser; T4 `source_chain_hash` binds receipts to source byte ranges; T5 `ShareableRender` round-trips signed renders for offline verification; T6 multi-school extract runs two extractors in tandem for adversarial-divergence detection. Wire spec at [`docs/TRANSFORM_RECEIPT_FORMAT.md`](docs/TRANSFORM_RECEIPT_FORMAT.md); design at [`docs/TRANSFORM_REGISTRY.md`](docs/TRANSFORM_REGISTRY.md). |
+| Transform substrate (`sum.transform_receipt.v1` + registry) | shipped on PyPI ≥ 0.7.0 | `sum transform list` / `sum transform apply <name>` — three registered transforms (`slider` / `extract` / `compose`); receipts via Ed25519 / JCS / detached JWS just like render-receipts; 20-fixture cross-runtime K-matrix locks accept + reject across Python ↔ Node ↔ browser; T4 `source_chain_hash` binds receipts to source byte ranges; T5 `ShareableRender` round-trips signed renders for offline verification; T6 multi-school extract runs two extractors in tandem for adversarial-divergence detection. Wire spec at [`docs/TRANSFORM_RECEIPT_FORMAT.md`](docs/TRANSFORM_RECEIPT_FORMAT.md); design at [`docs/TRANSFORM_REGISTRY.md`](docs/TRANSFORM_REGISTRY.md). |
 | Replay-defense window (`signed_at_out_of_window`) | shipped | opt-in `max_age_seconds` parameter across all four verifier surfaces (Python render / Python transform / JS render / JS transform). Default-off preserves archival use; receivers opt in per use-case (agent-swarm 60s, real-time 600s, newsletter 1d, legal-discovery no window). |
 | `sum verify --explain` layered output | shipped | Per-dimension report (`sum.verify_explained.v1`): cryptographic integrity / canonical reconstruction / axiom consistency / extraction provenance / source evidence coverage / semantic preservation / truth of content. Each carries `epistemic_status` (`provable` / `certified` / `empirical-benchmark` / `not-asserted`). Truth of content is ALWAYS `not_asserted` — locked by test. |
+| Meaning-loss receipts + `sum_verify` SDK | shipped on PyPI ≥ 0.8.0 | `sum.meaning_risk_receipt.v1` — a signed, replayable, distribution-free bound on a *named meaning-loss proxy* (`pip install 'sum-engine[verify]'` → `import sum_verify` / `python -m sum_verify`, dependency-light: no numpy/scipy/torch). Plus `sum meaning-diff` (per-document "what was kept / dropped / added"), `sum drift-budget` (compose meaning-loss across a transform chain), and `sum exchangeability` (advisory: is a bound applicable to *your* text?). Research-flagged; the affirmative contribution behind arXiv Paper-1. |
 | Negative-control corpus (T5 of bench-hardening) | shipped | 20 hand-authored documents across 5 failure modes (ambiguous coref / predicate-alias / contradictions / entity-resolution-adversarial / non-extractable). Runner exits 1 if observed failures don't match annotations. Baseline at [`fixtures/bench_receipts/negative_control_2026-05-17.json`](fixtures/bench_receipts/negative_control_2026-05-17.json). |
 | Compliance validators (six regimes) | shipped | `sum compliance check --regime <id> --audit-log <path>` — EU AI Act Article 12, GDPR Article 30, HIPAA § 164.312(b), ISO/IEC 27001 A.8.15, SOC 2 CC 7.2, PCI DSS v4.0 Req 10. All six produce the same `sum.compliance_report.v1` schema; per-regime docs at `docs/COMPLIANCE_*.md`. |
-The slider's product claim — *axis changes do not lose facts* — is the load-bearing empirical result. It is verified by NLI audit on every embedding-flagged "loss" cell; full attribution in [`docs/SLIDER_CONTRACT.md`](docs/SLIDER_CONTRACT.md).
+The slider's product claim — *axis changes do not lose facts* — is the load-bearing empirical result. It is verified by NLI audit on every embedding-flagged "loss" cell; full attribution in [`docs/SLIDER_CONTRACT.md`](docs/SLIDER_CONTRACT.md). In keeping with the "what remains unproven" half of the promise above: these headline numbers are **measured observations**, not yet same-commit-replayable — the bench harness (`Tests/benchmarks/slider_drift_bench.py`) is scaffold-state and no `sum.slider_drift_bench.v1` receipt is committed. Closing that to a replayable receipt is bench-hardening tasks T2 / T3 ([`docs/BENCH_HARDENING_FROM_QCVV.md`](docs/BENCH_HARDENING_FROM_QCVV.md)); see the reproducibility-status note in [`docs/SLIDER_CONTRACT.md`](docs/SLIDER_CONTRACT.md).
 ## Strategic context
 The operational compass — read in this order if you want the project's intent + how it operates + where it's going:
 - [`docs/CHARTER_2026-05-17.md`](docs/CHARTER_2026-05-17.md) — intent, the Why, strategy, objectives, success criteria, constraints, and the operational loop. The compass every other doc resolves to.
+- [`docs/PRODUCT_VISION.md`](docs/PRODUCT_VISION.md) — the product vision (the slider workbench: drop text → render it from a tag to a tome, with a signed receipt of what was preserved) and the **positioning**: SUM is the chain-of-custody *standard* for AI-transformed text — **provenance-first, attest-don't-detect** (a cryptographic guarantee robust to rewriting; an "is this AI?" answer ships only as an honest advisory signal, never a "99 %").
 - [`docs/PRODUCT_DELIBERATION_2026-05-14.md`](docs/PRODUCT_DELIBERATION_2026-05-14.md) — three-option strategic analysis + grant-outcome decision tree.
 - [`docs/ZENITH_FRAMING_2026-05-16.md`](docs/ZENITH_FRAMING_2026-05-16.md) — destination framing (SUM as chain-of-custody for AI-transformed knowledge) plus three new concepts (Perspective Receipts, Trust Profiles, Epistemic Nutrition Label) on the design queue.
 - [`docs/BENCH_HARDENING_FROM_QCVV.md`](docs/BENCH_HARDENING_FROM_QCVV.md) — five-task empirical-benchmark hardening plan (T1–T5; T5 shipped, T1–T4 queued).
@@ -251,7 +198,7 @@ Below the slider sits the substrate that earlier phases shipped and verified. Po
 - **Bundle public-key attestation (provable).** Ed25519-signed CanonicalBundles are tamper-detectable by any third party in any of the three runtimes. [`docs/PROOF_BOUNDARY.md`](docs/PROOF_BOUNDARY.md) §1.3.1.
 - **Merkle hash-chain integrity (provable, including under concurrent writers).** [`docs/PROOF_BOUNDARY.md`](docs/PROOF_BOUNDARY.md) §1.7.
 - **Extraction F1 (empirical-benchmark).** 1.000 on `seed_v1` (50 simple-SVO docs); 0.762 with precision 1.000 on `seed_v2` (20-doc difficulty corpus). Every remaining `seed_v2` failure is a recall miss, not a truth inversion. [`docs/PROOF_BOUNDARY.md`](docs/PROOF_BOUNDARY.md) §2.1.
-- **168 numbered features**, each with a reproducible verification command, in [`docs/FEATURE_CATALOG.md`](docs/FEATURE_CATALOG.md).
+- **170 numbered features**, each with a reproducible verification command, in [`docs/FEATURE_CATALOG.md`](docs/FEATURE_CATALOG.md).
 ### Research substrate (under `sum_engine_internal/research/`)
@@ -275,6 +222,10 @@ Less-surfaced but shipped:
 - **Audit log format** — every CLI operation can emit `sum.audit_log.v1` events; see [`docs/AUDIT_LOG_FORMAT.md`](docs/AUDIT_LOG_FORMAT.md).
 - **Agent surface** (`sum_engine_internal/agent_surface/`) — see [`docs/AGENT_SURFACE_FINDINGS.md`](docs/AGENT_SURFACE_FINDINGS.md).
+### Internal research surfaces (NOT shipped, present in repo)
+- **`api/quantum_router.py` + `quantum_main.py`** — FastAPI surface with 26+ endpoints (branchable knowledge graph, ZK semantic proofs, federated KG sync, JWT-tenant knowledge OS). 1,684 LOC; 58/58 tests pass; runs locally via `uvicorn quantum_main:app`. **NOT in the PyPI wheel** (`pyproject.toml` excludes `api*`), **NOT in the live Worker**, **NOT in the dogfood quickstart**. The substrate it composes is load-bearing for the shipping surfaces above; only the FastAPI HTTP layer is internal-research. Promote to a shipping `[api]` extra only if a named buyer or grant deliverable explicitly references one of the endpoint clusters. See top-of-file banner in `api/quantum_router.py` for the full triage rationale.
 ---
 ## Reproduce the bench
@@ -332,7 +283,7 @@ CI runs the full suite on every push (`.github/workflows/quantum-ci.yml`); the `
 ---
-## Truthfulness contract
+## Epistemic contract
 Every claim in this repo carries an explicit epistemic status — `provable`, `certified`, `empirical-benchmark`, or `expert-opinion`. The arbiter is [`docs/PROOF_BOUNDARY.md`](docs/PROOF_BOUNDARY.md). A summary surface that quotes an empirical-benchmark number alongside language like "mathematically guaranteed" is a policy violation per §5 and must be corrected.
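
The replay-defense row in the README table above describes an opt-in `max_age_seconds` parameter that is off by default and a rejection class named `signed_at_out_of_window`. The sketch below illustrates that idea only; it is not the package's verifier. The function name `check_signed_at_window`, the ISO 8601 form of the `signed_at` value, and the decision to reject future-dated timestamps are all assumptions made for illustration.

```python
from datetime import datetime, timezone


def check_signed_at_window(signed_at, max_age_seconds=None, now=None):
    """Return None if the timestamp is acceptable, else the rejection class.

    Hypothetical helper: `signed_at` is assumed to be an ISO 8601 string.
    """
    if max_age_seconds is None:
        # Default-off: no window is enforced, which preserves archival use.
        return None
    now = now or datetime.now(timezone.utc)
    signed = datetime.fromisoformat(signed_at.replace("Z", "+00:00"))
    if signed.tzinfo is None:
        # Assumption: naive timestamps are interpreted as UTC.
        signed = signed.replace(tzinfo=timezone.utc)
    age = (now - signed).total_seconds()
    # Assumption: a timestamp in the future is also treated as out of window.
    if age < 0 or age > max_age_seconds:
        return "signed_at_out_of_window"
    return None


# Fixed clock so the example is deterministic.
fixed_now = datetime(2026, 5, 17, 12, 0, 0, tzinfo=timezone.utc)
fresh = check_signed_at_window("2026-05-17T11:59:30Z", 60, fixed_now)
stale = check_signed_at_window("2026-05-17T11:58:00Z", 60, fixed_now)
archival = check_signed_at_window("2020-01-01T00:00:00Z", None, fixed_now)
print(fresh, stale, archival)
```

A receiver would pick the window per use case, for example 60 seconds for the agent-swarm case or no window at all for legal discovery, as the table row lists.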

{sum_engine-0.7.0 → sum_engine-0.8.0}/pyproject.toml RENAMED

@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"
 [project]
 name = "sum-engine"
-version = "0.7.0"
+version = "0.8.0"
 description = "SUM — bidirectional knowledge distillation with optional cryptographic attestation. Pipe prose, get a CanonicalBundle (HMAC / Ed25519 / W3C VC 2.0), verify anywhere."
 readme = "README.md"
 license = { text = "Apache-2.0" }
@@ -42,7 +42,26 @@ dependencies = [
 #   pip install sum-engine[openai]  # OpenAI structured-output path
 #   pip install sum-engine[llm]     # alias for [openai] (legacy name)
 #   pip install sum-engine[all]     # everything, plus dev tooling
-sieve = ["spacy>=3.7.0"]
+sieve = [
+  # Floor bumped 3.7.0 → 3.8.0 on 2026-05-29 (F14). At spacy 3.7.0
+  # the auto-downloaded en_core_web_sm now resolves to a 3.8-series
+  # model the older runtime cannot load, and the fallback download
+  # builds a malformed URL (`download/-en_core_web_sm/-…`) because
+  # spacy.io's compatibility table no longer serves 3.7-compatible
+  # entries. Bumping the floor to the empirically-operable version
+  # keeps the declared floor honest. CI: new `pip install sum-engine
+  # (floor venv smoke)` job pins to floor and runs the full smoke,
+  # so the next time the floor decays we catch it before users do.
+  # See `docs/DOGFOOD_FINDINGS_2026-05-29.md` F14.
+  "spacy>=3.8.0",
+  # spacy ≥ 3.8 imports `from click import NoSuchOption` at module
+  # load (spacy/cli/_util.py); typer ≥ 0.13 stopped pulling click
+  # transitively. Pin click explicitly so a fresh
+  # `pip install sum-engine[sieve]` does not ImportError on first
+  # spacy import. CI: `pip install sum-engine (fresh venv smoke)`
+  # caught this 2026-05-28. See F13.
+  "click>=8.0",
+]
 # `[openai]` is the canonical, vendor-named extra; `[llm]` is kept as a
 # back-compat alias because it predates the multi-provider dispatcher
 # (Anthropic and OpenAI now have their own named extras). Both install
@@ -62,7 +81,21 @@ anthropic = ["anthropic>=0.97.0", "pydantic>=2.0.0"]
 # detached-JWS / RFC 7797 b64=false machinery; the existing pure-Python
 # JCS module at sum_engine_internal/infrastructure/jcs.py handles
 # canonicalization. Cryptography is already a hard dep above.
-receipt-verify = ["joserfc>=1.0.0"]
+# Upper bound: joserfc>=1.x warns that the "EdDSA" JWS alg is deprecated
+# (RFC 9864 favours explicit Ed25519/Ed448 alg identifiers). The whole
+# render-receipt trust loop signs with "EdDSA", so we pin below 2.0.0
+# until we confirm a major release does not drop the "EdDSA" alias.
+receipt-verify = ["joserfc>=1.0.0,<2.0.0"]
+# The stable, dependency-light verify SDK (`import sum_verify`). The
+# package an integrator pins to CHECK SUM receipts — meaning-risk /
+# render / transform — without the CLI or the research numeric stack.
+# Deliberately tiny: joserfc for the detached-JWS machinery, cryptography
+# (already a core dep) for Ed25519. The conformal bound replay is
+# re-derived in pure Python (sum_verify/_conformal.py), so verifying a
+# meaning-risk receipt offline pulls NO numpy / scipy / torch — the
+# property `Tests/test_sum_verify_sdk.py` pins in a clean subprocess.
+# Same joserfc pin/rationale as [receipt-verify] above.
+verify = ["joserfc>=1.0.0,<2.0.0"]
 # MCP (Model Context Protocol) server. Exposes SUM verbs as MCP
 # tools so any MCP-aware LLM client (Claude Desktop, Claude Code,
 # Cursor, Continue, custom agents) can call SUM directly. The
@@ -79,6 +112,12 @@ mcp = ["mcp>=1.0.0"]
 # verified blindspots (predicate-flip, off-graph fact-fabrication,
 # empty-render false negative).
 research = ["numpy>=1.24.0", "scipy>=1.10.0"]
+# Local, deterministic, zero-$ entailment judge for the meaning-loss
+# EntailmentScorer (sum_engine_internal/research/meaning/local_judge.py).
+# A local sentence-embedding model run offline in eval mode — fixes the
+# F18 lexical-scorer paraphrase misranking without any paid API. Heavy
+# (torch); strictly optional, lazy-imported.
+judge = ["transformers>=4.30.0", "torch>=2.0.0", "sentencepiece>=0.1.99"]
 # Omni-format adapter. Markdown is the canonical pivot for the
 # attest pipeline: any input format -> markdown -> existing
 # extract/state/bundle path. Source URI anchors to the original
@@ -119,6 +158,7 @@ all = [
     "sum-engine[openai]",
     "sum-engine[anthropic]",
     "sum-engine[receipt-verify]",
+    "sum-engine[verify]",
     "sum-engine[mcp]",
     "sum-engine[omni-format]",
     "sum-engine[dev]",
@@ -140,21 +180,24 @@ Repository = "https://github.com/OtotaO/SUM"
 "Feature Catalog" = "https://github.com/OtotaO/SUM/blob/main/docs/FEATURE_CATALOG.md"
 [tool.check-wheel-contents]
-# The wheel intentionally ships TWO top-level packages:
+# The wheel intentionally ships THREE top-level packages:
 #   sum_cli                — the CLI entry point (provides the `sum`
 #                            console script via [project.scripts]).
 #   sum_engine_internal    — the implementation modules consumers
 #                            `import` from (the public Python API).
+#   sum_verify             — the small, stable, dependency-light verify
+#                            SDK (`import sum_verify`); pinnable independent
+#                            of the CLI / research stack.
 # Without this whitelist, check-wheel-contents fires W009 on every
 # build. Whitelisting by name (rather than ignoring W009 globally)
 # preserves the gate: if a future build accidentally adds an
-# unexpected third top-level package, this config still catches it.
-toplevel = ["sum_cli", "sum_engine_internal"]
+# unexpected fourth top-level package, this config still catches it.
+toplevel = ["sum_cli", "sum_engine_internal", "sum_verify"]
 [tool.setuptools.packages.find]
 # Include the CLI and the core engine modules it depends on. Tests and
 # scripts are dev-time only and excluded from the distribution.
-include = ["sum_cli*", "sum_engine_internal*"]
+include = ["sum_cli*", "sum_engine_internal*", "sum_verify*"]
 exclude = ["Tests*", "Tests.*", "scripts*", "api*", "single_file_demo*"]
 [tool.setuptools.package-data]
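
Note on the `joserfc>=1.0.0,<2.0.0` range above: the constraint has an inclusive floor and an exclusive ceiling, so every 1.x release is admitted and 2.0.0 is the first release rejected. The sketch below illustrates that behaviour with the standard library only. It is an illustration, not how pip resolves versions: real resolvers follow PEP 440 (pre-releases, post-releases, epochs), whereas `parse_release` and `in_range` are hypothetical helpers that only understand plain dotted numeric releases, and the candidate version strings are made up for the example.

```python
def parse_release(version: str, width: int = 3) -> tuple[int, ...]:
    """Turn a plain dotted release such as '1.4' into a padded tuple (1, 4, 0)."""
    parts = [int(part) for part in version.split(".")]
    parts += [0] * (width - len(parts))  # pad so '1.0' compares equal to '1.0.0'
    return tuple(parts)


def in_range(version: str, floor: str, ceiling: str) -> bool:
    """True when floor <= version < ceiling (inclusive floor, exclusive ceiling)."""
    return parse_release(floor) <= parse_release(version) < parse_release(ceiling)


# Hypothetical candidate versions checked against the range quoted in the diff.
candidates = ["0.12.0", "1.0.0", "1.6.3", "2.0.0"]
accepted = [v for v in candidates if in_range(v, "1.0.0", "2.0.0")]
print(accepted)
```

The exclusive ceiling is what implements the stated intent of staying below the next major release until the "EdDSA" alias is confirmed to survive it.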

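Note on the `[verify]` comment above: it says a test pins, in a clean subprocess, the property that verifying pulls no numpy / scipy / torch. The diff does not show `Tests/test_sum_verify_sdk.py`, so the following is only a minimal sketch of how such a check could look, not the project's actual test. `heavy_modules_loaded_by` is a hypothetical helper, and the standard-library module `json` stands in for the module under test so that the example runs without `sum-engine` installed.

```python
import subprocess
import sys

HEAVY_MODULES = ("numpy", "scipy", "torch")


def heavy_modules_loaded_by(module_name: str) -> list[str]:
    """Import module_name in a fresh interpreter and report heavy modules it pulled in."""
    # The probe runs in a child process, so modules already imported by the
    # parent (for example by the test runner) cannot mask a leak.
    probe = (
        "import importlib, sys\n"
        f"importlib.import_module({module_name!r})\n"
        f"print(','.join(m for m in {HEAVY_MODULES!r} if m in sys.modules))\n"
    )
    result = subprocess.run(
        [sys.executable, "-c", probe],
        capture_output=True,
        text=True,
        check=True,
    )
    names = result.stdout.strip()
    return names.split(",") if names else []


# `json` is a stand-in (assumption); a real check would name the SDK module.
leaked = heavy_modules_loaded_by("json")
print(leaked)
```

Checking `sys.modules` in a child interpreter, rather than in the test process itself, is the design choice that makes the result independent of whatever the surrounding test session has already imported.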
sum_engine-0.8.0/sum_cli/__main__.py ADDED

@@ -0,0 +1,11 @@
+"""Enable ``python -m sum_cli`` as an alias for the ``sum`` console script.
+Several prospective adopters in the 30-guest adoption simulation (2026-06-09)
+reached for ``python3 -m sum_cli`` and hit "package cannot be directly
+executed" — the only working forms were the ``sum`` entry point or
+``python3 -m sum_cli.main``. This makes the obvious on-ramp work.
+"""
+from sum_cli.main import main
+if __name__ == "__main__":
+    raise SystemExit(main())

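Note on the new `__main__.py`: `python -m <package>` only works when the package contains a `__main__` module, which is why the added file is enough to fix the "cannot be directly executed" error its docstring describes. The sketch below reproduces the before and after behaviour in a temporary directory. The package name `demo_cli` and its `main()` function are hypothetical stand-ins, not part of `sum-engine`.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    pkg = Path(tmp) / "demo_cli"  # hypothetical package, mirrors the sum_cli layout
    pkg.mkdir()
    (pkg / "__init__.py").write_text("")
    (pkg / "main.py").write_text(
        "def main():\n"
        "    print('hello from demo_cli')\n"
        "    return 0\n"
    )

    def run_as_module() -> subprocess.CompletedProcess:
        # cwd=tmp puts the temporary directory on sys.path for `-m`.
        return subprocess.run(
            [sys.executable, "-m", "demo_cli"],
            cwd=tmp,
            capture_output=True,
            text=True,
        )

    before = run_as_module()  # no __main__.py yet, so the interpreter refuses

    # Same shape as the file added in the diff: delegate to main() and
    # propagate its return value as the process exit code.
    (pkg / "__main__.py").write_text(
        "from demo_cli.main import main\n"
        "if __name__ == '__main__':\n"
        "    raise SystemExit(main())\n"
    )
    after = run_as_module()

print(before.returncode != 0, after.returncode, after.stdout.strip())
```

Raising `SystemExit(main())` rather than calling `main()` bare means a non-zero return value from `main` becomes the exit status seen by the shell, matching what a console-script entry point does.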