PyPI - agentversion - Versions diffs - 0.1.0__tar.gz → 0.2.0__tar.gz - Mend

agentversion 0.1.0.tar.gz → 0.2.0.tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (129)

agentversion-0.2.0/.github/dependabot.yml ADDED

@@ -0,0 +1,19 @@
+version: 2
+updates:
+  - package-ecosystem: "pip"
+    directory: "/"
+    schedule:
+      interval: "weekly"
+    open-pull-requests-limit: 3
+  - package-ecosystem: "github-actions"
+    directory: "/"
+    schedule:
+      interval: "weekly"
+    open-pull-requests-limit: 3
+    ignore:
+      # CI audit 2026-06-21: skip MAJOR github-actions bumps — they churn + break
+      # (actions/checkout@v7 broke docs CI). Take majors deliberately + tested;
+      # minor/patch (incl. security) still auto-PR.
+      - dependency-name: "*"
+        update-types: ["version-update:semver-major"]

{agentversion-0.1.0 → agentversion-0.2.0}/.github/workflows/ci.yml RENAMED

@@ -18,9 +18,9 @@ jobs:
         python-version: ["3.10", "3.11", "3.12"]
     steps:
-      - uses: actions/checkout@v4
+      - uses: actions/checkout@v6
-      - uses: actions/setup-python@v5
+      - uses: actions/setup-python@v6
         with:
           python-version: ${{ matrix.python-version }}
           cache: pip
@@ -41,9 +41,9 @@ jobs:
     name: Lint
     runs-on: ubuntu-latest
     steps:
-      - uses: actions/checkout@v4
+      - uses: actions/checkout@v6
-      - uses: actions/setup-python@v5
+      - uses: actions/setup-python@v6
         with:
           python-version: "3.12"
           cache: pip

{agentversion-0.1.0 → agentversion-0.2.0}/.github/workflows/publish.yml RENAMED

@@ -27,9 +27,9 @@ jobs:
       matrix:
         python-version: ["3.10", "3.11", "3.12"]
     steps:
-      - uses: actions/checkout@v4
+      - uses: actions/checkout@v6
-      - uses: actions/setup-python@v5
+      - uses: actions/setup-python@v6
         with:
           python-version: ${{ matrix.python-version }}
@@ -46,9 +46,9 @@ jobs:
     environment: pypi
     steps:
-      - uses: actions/checkout@v4
+      - uses: actions/checkout@v6
-      - uses: actions/setup-python@v5
+      - uses: actions/setup-python@v6
         with:
           python-version: "3.12"
@@ -69,4 +69,7 @@ jobs:
           echo "Version match: $PKG_VERSION"
       - name: Publish to PyPI
-        uses: pypa/gh-action-pypi-publish@release/v1
+        # REL-3 (2026-06-20 audit): SHA-pinned (was the mutable @release/v1 branch ref).
+        # This job holds OIDC id-token:write → PyPI publish authority; a mutable ref would
+        # run whatever HEAD of that branch is at publish time. Dependabot maintains the pin.
+        uses: pypa/gh-action-pypi-publish@cef221092ed1bacb1cc03d23a2d87d1d172e277b  # release/v1
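
The REL-3 comment above explains why the publish step is pinned to a commit SHA rather than a branch ref. A minimal sketch of how such a pin could be checked mechanically follows. It is not part of this package: the function name `unpinned_actions` and both regular expressions are hypothetical, and the line-based matching is a deliberate simplification that does not parse YAML.

```python
import re

# Hypothetical helper, not part of agentversion: a line-based scan for
# `uses:` steps. Captures the action name and the ref after "@".
USES_RE = re.compile(r"^\s*(?:-\s*)?uses:\s*([^\s@#]+)@([^\s#]+)")
# A full commit SHA is exactly 40 lowercase hex characters.
FULL_SHA_RE = re.compile(r"^[0-9a-f]{40}$")


def unpinned_actions(workflow_text: str) -> list[tuple[str, str]]:
    """Return (action, ref) pairs whose ref is not a full commit SHA."""
    found = []
    for line in workflow_text.splitlines():
        match = USES_RE.match(line)
        if match and not FULL_SHA_RE.match(match.group(2)):
            found.append((match.group(1), match.group(2)))
    return found


sample = """\
steps:
  - uses: actions/checkout@v6
  - name: Publish to PyPI
    uses: pypa/gh-action-pypi-publish@cef221092ed1bacb1cc03d23a2d87d1d172e277b  # release/v1
"""

# Only the tag-pinned checkout step is reported; the SHA-pinned publish step passes.
print(unpinned_actions(sample))
```

In this diff the `actions/checkout` and `actions/setup-python` steps remain on mutable major tags; only the job that holds publish authority is SHA-pinned, which is the trade-off the comment describes.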

{agentversion-0.1.0 → agentversion-0.2.0}/CHANGELOG.md RENAMED

@@ -7,6 +7,99 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 > **Package version ≠ spec version.** This file tracks the **package** version. The on-the-wire `spec_version` is independent and frozen at `1.0.0`; a pre-1.0 package can implement a stable 1.0 spec, which is exactly the situation today.
+## [0.2.0] - 2026-06-24
+### Added
+- **`behavioral_policy` contract surface** — a first-class surface for a multi-turn agent's behavioral
+  policy (the rules it holds across turns: a refund/escalation policy, etc.), bound to skillevaluation's
+  conversation-mode `policy_check`. A change to the RULES diffs as **breaking** → `replay`/`drop`, where
+  previously a policy flip lived only in the prompt-stack hash and read `non_breaking` → `keep`, silently
+  retaining a now-invalid conversation eval set. Optional + omitted by default, so existing manifests'
+  `overall_hash` is **unchanged**. Reason code `behavioral_policy_changed`; spec `spec/behavioral-policy.md`.
+- **A2A Agent Card mapping** (`a2a.manifest_to_agent_card`, exported): project a manifest onto an
+  [A2A (Agent2Agent)](https://a2a-protocol.org/) Agent Card, stamping the manifest's identity
+  (`manifest_id` + `overall_hash`) under an `x-agentversion` provenance block so a card consumer can pin
+  the exact versioned contract it describes. Positions AgentVersion as the version/diff/provenance layer
+  **on top of** the A2A interop standard rather than a competing descriptor. Spec: `spec/a2a-mapping.md`.
+- **Extension hatch** (`AgentContract` is now `extra="allow"`): a custom / emerging contract surface
+  (RAG corpus, MCP server registry, memory policy, vendor extension) is hashed by the hasher and diffed
+  by the engine, and now also **survives `model_validate`** — previously it was silently dropped, so a
+  validate→re-serialize→re-hash round-trip changed `overall_hash` (a moat-breaking non-determinism). The
+  surface set can grow without forking the model. ASCII/known-surface hashes are unchanged.
+- **`COMPONENT_TYPE_TO_SURFACE` + `surface_key_for_component()`** — the canonical routing from a
+  producer's flat `component_type` to the contract surface key it lands in (incl. the singular→plural
+  `guardrail`→`guardrails` rename). Exported so a producer (the SDK exporter) and a consumer (a diff
+  translator) share one source of truth instead of hand-copying the map — closing a cross-stack drift
+  class where the rename had to be applied independently in every translator copy.
+- **`contract.contract_from_components()`** — the single source of truth for assembling a contract block
+  (every surface, in canonical shape) from a producer's flat component list. Both the DecimalAI SDK
+  exporter and the platform's hash path route through it, so they compute the *same* `jcs-sha256`
+  identity hash for the same agent — the platform can now make the canonical hash authoritative and a
+  customer reproduces the stored hash with the OSS CLI. Closes audit X-2 (two hand-written translators
+  with no shared code/test).
+### Fixed (hash-determinism + trust)
+- **Canonical-hash domain** (`hasher.py`): NFC-normalize every string (keys + values) and reject
+  non-finite floats (`NaN`/`±Infinity`) **before** JCS canonicalization. JCS canonicalizes bytes but
+  does not Unicode-normalize, so a composed `"café"` and a decomposed `"café"` previously produced a
+  different `overall_hash` for the same agent (a cross-language reproduction break in the moat); and a
+  `NaN` raised an opaque `jcs` error instead of a clear domain error. NFC of ASCII is a no-op, so
+  **existing manifest hashes are unchanged** (frozen vectors still pass). Documented in `spec/hashing.md`.
+- **Attestation integrity** (`validator.py`): an attestation's `signed_payload_hash` must equal the
+  manifest's declared `overall_hash` (error on mismatch) — the no-crypto linkage that proves the
+  attestation covers *this* manifest. Previously the envelope was parsed but never checked, so a
+  copy-pasted/tampered attestation rode along inertly. Cryptographic signature verification remains
+  explicitly delegated to a verifier (out of scope for the validator).
+### Fixed (correctness audit)
+- **Schema ↔ code drift on `behavioral_policy`** — the bundled `manifest-diff.schema.json` surface enum
+  was missing `behavioral_policy`, and neither `decision.REASON_CODES` nor `compatibility-decision.schema.json`
+  listed `behavioral_policy_changed`, even though the diff/compat code **emits** both. A diff or decision
+  touching the policy surface therefore failed its own bundled schema. Both schemas + `REASON_CODES` now
+  include them. (`spec_version` stays `1.0.0` — these are compatible additions to an already-declared surface.)
+- **`model_runtime` reason code** — a model swap mapped to `prompt_policy_changed` (there was no model
+  reason code at all). Added `model_runtime_changed` to `REASON_CODES` + the decision schema, and remapped
+  the `model_runtime` surface to it.
+- **`output_contract` severity tiers corrected to match the normative spec** (`spec/compatibility-policy.md`):
+  was format→moderate, schema→major, strict→(no bump); now **format→minor, schema→moderate, strict→major**
+  (a strict-mode flip newly *rejects* previously-valid outputs, so it is the most consumer-breaking). A
+  `strict` change is now also `breaking`. The `output-schema-change` conformance fixture's expected severity
+  was corrected `major`→`moderate` accordingly: the fixture encoded the buggy output, so this is a defect
+  correction, not a change to a stable conformance scenario.
+- **Asymmetric add/remove diffs** — a surface that *appeared* or *vanished* bypassed its dedicated
+  classifier and got a flat generic verdict (add→moderate/non_breaking, remove→breaking/major), so
+  `diff(A,B)` was not the inverse of `diff(B,A)` (e.g. an added `output_contract{strict:true}` read as a
+  bland "moderate" instead of major). Add/remove now route through the dedicated classifier against an
+  empty sentinel, so a surface appearing/vanishing is severitied by the same logic as an in-place change.
+  Introducing a `behavioral_policy` stays additive (non_breaking); removing one stays breaking.
+- **Model-family extraction missed compact date stamps** — `_extract_model_family` stripped dashed dates
+  (`gpt-4o-2024-08-06`) but not compact 8-digit Anthropic dates (`claude-3-5-sonnet-20241022`), so a routine
+  model date-rev read as a *different family* → spurious `major`/`replay`. Now strips `\d{8}` (and applies
+  iteratively); genuine family changes (`gpt-4` vs `gpt-4o`) are preserved.
+- **Validator hardening** — `CompatibilityDecision.reason_codes` are now validated against `REASON_CODES`
+  (the schema enforced this on the wire; the model didn't); `validate_manifest(..., check_schema=True)` is a
+  new opt-in pass that validates against `agent-manifest.schema.json` (catches unknown top-level keys the
+  Pydantic models silently drop); a manifest whose contract can't be canonically hashed (e.g. a non-finite
+  float) is now an **error** (`hash_uncomputable`), not a soft warning that still validated.
+- **Lower-severity:** unnamed subagents are keyed by content hash rather than list position (removing one of
+  several no longer reads as a positional rename); `prompt_severity` boundary at exactly 5.0% is now `minor`
+  to match the spec's `≤5%` (was `moderate`).
+### Documentation & packaging
+- **The wheel now bundles `spec/`, `examples/`, and `compatibility-tests/`** (previously only `schemas/`), so a
+  `pip install`ed user actually has the files the README points at. Dropped the false `pytest --pyargs agentversion`
+  claim (tests aren't bundled) in favor of a clone-and-test snippet.
+- **New `examples/integrations/decimalai_bridge.py`** and a README **"From the DecimalAI SDK"** section showing
+  the `decimalai.export_manifest(snap)` → `agentversion diff` round-trip — the seam that makes agentversion the
+  open core of the paid platform was previously undocumented. The `decimalai-python` README now references it too.
+- **Fixed the README** `evaluation.gates[]` example (was missing the required `ran_at` → failed validation on
+  copy-paste) and regenerated the "See it in action" diff table to match current output (the `environment` row now
+  shows real field-level changes instead of a bland "environment added").
+- **`langgraph_example.py`** rewritten to compute real content hashes (was emitting placeholder `sha256:extract_from_*`
+  strings) and to actually validate; **`examples/scenarios/walkthrough.py`** is a new runnable, test-covered version
+  of the tool-rename drift scenario. A new smoke test runs every example so they can't bit-rot.
+- Stale-terminology fixes: `compatibility-batch.md` example id `rcb_` → `cbt_`.
 ## [0.1.0] - 2026-05-29
 **First published release** — the first `agentversion` release on PyPI.
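
The "Canonical-hash domain" entry above describes NFC-normalizing every string and rejecting non-finite floats before canonicalization. The sketch below illustrates that idea under stated assumptions and is not the code in `hasher.py`. The names `normalize` and `content_hash` are hypothetical, and `json.dumps` with sorted keys stands in for a real JCS (RFC 8785) canonicalizer, which it only approximates for simple inputs.

```python
import hashlib
import json
import math
import unicodedata


def normalize(value):
    """NFC-normalize every string (keys and values); reject non-finite floats."""
    if isinstance(value, str):
        return unicodedata.normalize("NFC", value)
    if isinstance(value, float) and not math.isfinite(value):
        raise ValueError(f"non-finite float is outside the hash domain: {value!r}")
    if isinstance(value, dict):
        return {normalize(k): normalize(v) for k, v in value.items()}
    if isinstance(value, (list, tuple)):
        return [normalize(v) for v in value]
    return value


def content_hash(contract) -> str:
    # Stand-in canonicalization (assumption): sorted keys, compact separators,
    # UTF-8 output. This is NOT a full RFC 8785 implementation.
    canonical = json.dumps(
        normalize(contract),
        sort_keys=True,
        separators=(",", ":"),
        ensure_ascii=False,
        allow_nan=False,
    )
    return "sha256:" + hashlib.sha256(canonical.encode("utf-8")).hexdigest()


composed = {"name": "caf\u00e9"}      # "é" as one code point
decomposed = {"name": "cafe\u0301"}   # "e" followed by a combining acute accent

# After normalization both spellings hash to the same id.
print(content_hash(composed) == content_hash(decomposed))
```

Because NFC leaves pure-ASCII strings untouched, a step like this would not change the hash of an ASCII-only contract, which matches the changelog's note that existing manifest hashes are unchanged.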

{agentversion-0.1.0 → agentversion-0.2.0}/CONTRIBUTING.md RENAMED

@@ -57,11 +57,9 @@ The conformance scenarios (`tests/test_conformance.py`) are non-negotiable. If y
 ## Releases
-The maintainer cuts releases. The flow is:
+The maintainer cuts releases — see [`RELEASING.md`](./RELEASING.md) for the full runbook.
-1. PR with the version bump in `pyproject.toml` + a `CHANGELOG.md` entry.
-2. Tag `vX.Y.Z` on `main`.
-3. CI publishes to PyPI via trusted publishing.
+Today releases are published manually with `./scripts/release.sh` (local build + `twine` upload to PyPI). Once the repository is public, the target is tag-triggered CI via [trusted publishing](https://docs.pypi.org/trusted-publishers/), so no token is stored.
 ## License

{agentversion-0.1.0 → agentversion-0.2.0}/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: agentversion
-Version: 0.1.0
+Version: 0.2.0
 Summary: An open specification for versioning agent runtimes and keeping datasets valid.
 Project-URL: Homepage, https://github.com/decimal-labs/agentversion
 Project-URL: Documentation, https://github.com/decimal-labs/agentversion/tree/main/spec
@@ -22,30 +22,38 @@ Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
 Classifier: Topic :: Software Development :: Libraries :: Python Modules
 Requires-Python: >=3.10
 Requires-Dist: click>=8.0
-Requires-Dist: jcs>=0.2
+Requires-Dist: jcs<1,>=0.2
 Requires-Dist: pydantic>=2.0
 Requires-Dist: rich>=13.0
 Provides-Extra: dev
 Requires-Dist: jsonschema>=4.0; extra == 'dev'
-Requires-Dist: mypy>=1.0; extra == 'dev'
+Requires-Dist: mypy<2.2,>=1.0; extra == 'dev'
 Requires-Dist: pytest-cov>=4.0; extra == 'dev'
 Requires-Dist: pytest>=7.0; extra == 'dev'
-Requires-Dist: ruff>=0.4; extra == 'dev'
+Requires-Dist: ruff<0.16,>=0.15; extra == 'dev'
 Description-Content-Type: text/markdown
 # AgentVersion
 **Your agent changed. Is your saved data still valid?**
-`agentversion` turns an agent version into a diffable, hashable contract — so when prompts, tools, models, or graphs change, you know exactly what broke and which traces, eval sets, and training data survived.
-[![CI](https://github.com/decimal-labs/agentversion/actions/workflows/ci.yml/badge.svg)](https://github.com/decimal-labs/agentversion/actions/workflows/ci.yml)
 [![PyPI](https://img.shields.io/pypi/v/agentversion)](https://pypi.org/project/agentversion/)
 [![Python](https://img.shields.io/badge/python-3.10%2B-blue)](https://pypi.org/project/agentversion/)
-[![Spec](https://img.shields.io/badge/spec-v1.0-success)](https://github.com/decimal-labs/agentversion/tree/main/spec)
-[![License](https://img.shields.io/badge/License-Apache_2.0-blue.svg)](https://github.com/decimal-labs/agentversion/blob/main/LICENSE)
+[![Spec](https://img.shields.io/badge/spec-v1.0-success)](https://pypi.org/project/agentversion/)
+[![License](https://img.shields.io/badge/License-Apache_2.0-blue.svg)](https://pypi.org/project/agentversion/)
+When you ship a new version of an agent, everything you collected against the old one — production traces, eval datasets, SFT (supervised fine-tuning) examples — quietly drifts out of date. There's no `package.json` to pin an agent's contract, and no `git diff` to tell you what changed.
-When you ship a new version of an agent, everything you collected against the old one — production traces, eval datasets, SFT examples — quietly drifts out of date. There's no `package.json` to pin an agent's contract, and no `git diff` to tell you what changed. `agentversion` is that missing format: a JSON **manifest** describing an agent version, a **diff** that classifies every change as breaking or non-breaking, and a **compatibility decision** that tells you whether to keep, repair, replay, or drop your old data.
+`agentversion` is that missing format. Three steps, one per noun:
+```
+manifest   →   diff   →   compatibility decision
+(what an       (what         (what to do with the data
+ agent          changed,      you already collected:
+ version is)    per surface)   keep / repair / replay / drop)
+```
+A **surface** is one independently-versioned part of the agent — its prompts, its tools, its model, its graph, its output format — each hashed on its own, so any change can be pinned to exactly one of them. A **diff** classifies each changed surface as breaking or non-breaking; a **compatibility decision** turns that into a per-data verdict.
 It's a dependency-light Python package with a CLI — and an open spec any tool can implement.
@@ -64,14 +72,20 @@ $ agentversion diff finance-agent-v1.json finance-agent-v2.json --compat
 ┏━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
 ┃ Surface         ┃ Change Type  ┃ Details                                             ┃
 ┡━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
-│ environment     │ non_breaking │ environment added                                   │
+│ environment     │ non_breaking │ deployment_id: None → 'prod-east-1'                 │
+│                 │              │ region: None → 'us-east-1'                          │
+│                 │              │ infra_image_hash: None →                            │
+│                 │              │ 'sha256:img2img2img2img2img2img2img2img2img2img2im… │
+│                 │              │ runtime_versions added=app-runtime,python           │
+│                 │              │ external_service_pins changed                       │
+│                 │              │ resource_limits changed                             │
 │ model_runtime   │ breaking     │ provider: 'google' → 'openai'                       │
 │                 │              │ runtime_version: 'app-runtime@1.5.0' →              │
 │                 │              │ 'app-runtime@1.8.2'                                 │
 │                 │              │ envelope changed                                    │
 │ output_contract │ breaking     │ format: 'text' → 'json'                             │
-│                 │              │ strict: False → True                                │
 │                 │              │ output schema changed                               │
+│                 │              │ strict: False → True                                │
 │ prompt_stack    │ non_breaking │ system_prompt hash changed                          │
 │                 │              │ developer_prompt hash changed                       │
 │ subagents       │ breaking     │ subagents added: ['finance_subagent',               │
@@ -96,7 +110,45 @@ $ agentversion diff finance-agent-v1.json finance-agent-v2.json --compat
 Between v1 and v2 the team swapped the model (Google → OpenAI), renamed a tool, added two subagents, and switched to strict JSON output. `agentversion` caught all five breaking surfaces and told you the old traces need a replay — not a guess, a classification you can gate CI on.
-> Try it yourself — both manifests live in [`examples/manifest/`](https://github.com/decimal-labs/agentversion/tree/main/examples/manifest).
+The recommendation is one of **four verdicts** — what to do with each piece of data you collected against the old version:
+| Verdict | What it means | Typical trigger |
+|---|---|---|
+| `keep`   | Still valid as-is. | Only non-breaking surfaces changed. |
+| `repair` | Salvageable with a transform — patch it, don't re-run the agent. | A recoverable output-contract change (the bundled default rules emit `repair` only for output-contract-only breaks). |
+| `replay` | Re-run it through the new version for fresh outputs. | A breaking surface (tool, model, workflow) makes old *outputs* untrustworthy but the *inputs* still apply. |
+| `drop`   | No longer usable — discard it. | The inputs themselves no longer apply. (`drop` comes from a custom policy, not the default `diff --compat` rules.) |
+In the demo above, five breaking surfaces (model swap, tool rename, new subagents, strict-JSON output, new graph) make the old *outputs* stale — but the old *inputs* still apply — so the verdict is `replay`.
+### What a manifest looks like
+A manifest is plain JSON. The top says *which* version this is; `contract` holds one entry per **surface** — exactly the rows you saw in the diff above:
+```jsonc
+{
+  "agent_name": "finance-agent",
+  "version_label": "2026-03-01.prod.1",
+  "identity": {
+    "overall_hash": "sha256:47301b25...",   // stable id for this whole version
+    "hash_algorithm": "jcs-sha256"
+  },
+  "contract": {
+    "prompt_stack":    { "system_prompt": { "version": "8", "hash": "sha256:aaa1..." }, "...": "..." },
+    "model_runtime":   { "provider": "google", "model": "gemini-2.0-flash", "...": "..." },
+    "tool_registry":   { "registry_version": "5", "tools": [ /* get_market_cap, search_population */ ] },
+    "workflow":        { "graph_name": "finance-simple-graph", "graph_version": "3", "...": "..." },
+    "subagents":       [],
+    "output_contract": { "format": "text", "strict": false, "...": "..." },
+    "guardrails":      { "bundle_version": "3", "...": "..." },
+    "context_config":  { "retrieval_config_version": "5", "...": "..." }
+  }
+}
+```
+Each surface is hashed on its own, so the diff can say *"`tool_registry` changed, `prompt_stack` didn't"* instead of just *"the manifest changed."*
+> Try it yourself — both [`examples/manifest/`](https://pypi.org/project/agentversion/) manifests ship inside the `agentversion` wheel.
 ---
@@ -110,7 +162,7 @@ You probably already have observability and a trace store. None of them answer *
 | A2A / ACP agent cards | runtime discovery + I/O types | version identity or data-compatibility |
 | OpenAI JSONL / SFT files | a training format | provenance — *which agent version* produced each row |
-**Isn't this A2A?** No — and they compose. A2A and ACP answer *"how does Agent A discover and talk to Agent B?"*. `agentversion` answers *"what changed in this agent, and what does that mean for my data?"*. An A2A Agent Card can carry an `agentversion` manifest hash so you know both at once.
+**Isn't this A2A?** No — and they compose. A2A and ACP (the Agent-to-Agent and Agent Communication protocols) answer *"how does Agent A discover and talk to Agent B?"*. `agentversion` answers *"what changed in this agent, and what does that mean for my data?"*. An A2A Agent Card can carry an `agentversion` manifest hash so you know both at once.
 ---
@@ -120,30 +172,48 @@ You probably already have observability and a trace store. None of them answer *
 pip install agentversion
 ```
-Apache-2.0, no config — just needs Python 3.10+. It implements the frozen **v1.0 spec**, but the Python package itself is early: `0.1.0`, pre-1.0, with the API still settling.
+Apache-2.0, no config — just needs Python 3.10+.
+There are **two version numbers**, deliberately different:
+- the **wire spec** is frozen at **v1.0** (stable format + conformance suite — safe to build against);
+- this **Python package** is **0.1.0** — pre-1.0, so its API may still shift.
+(The `spec-v1.0` and PyPI badges above show each one.)
 ## Quickstart
-**Diff two versions** (table by default; add `--json` for machine output, `--compat` for a keep/repair/replay/drop recommendation):
+First five minutes: **init → hash → validate → diff → gate in CI**.
+**1. Scaffold a manifest** for your agent (interactive):
 ```bash
-agentversion diff old-manifest.json new-manifest.json --compat
+agentversion init
 ```
-**Gate breaking changes in CI** — `--fail-on-breaking` exits non-zero when any surface is breaking:
+**2. Get its stable id and check it's valid:**
-```yaml
-# .github/workflows/agent.yml
-- name: Block breaking agent changes
-  run: agentversion diff baseline-manifest.json current-manifest.json --fail-on-breaking
+```bash
+agentversion hash manifest.json       # a content hash that ignores key order and
+                                      # whitespace, so the same agent always hashes the
+                                      # same id (JCS-SHA256 = JSON Canonicalization
+                                      # Scheme + SHA-256)
+agentversion validate manifest.json   # check it against the spec
 ```
-**Scaffold, hash, and validate** a manifest:
+**3. Diff two versions** — runnable right now against the bundled examples (`--compat` adds the keep/repair/replay/drop recommendation; `--json` for machine output):
 ```bash
-agentversion init                     # interactively create a manifest
-agentversion hash manifest.json       # canonical JCS-SHA256 identity hash
-agentversion validate manifest.json   # check it against the spec
+agentversion diff examples/manifest/finance-agent-v1.json \
+                  examples/manifest/finance-agent-v2.json --compat
+```
+**4. Gate breaking changes in CI** — `--fail-on-breaking` exits non-zero when any surface is breaking:
+```yaml
+# .github/workflows/agent.yml
+- name: Block breaking agent changes
+  run: agentversion diff baseline-manifest.json current-manifest.json --fail-on-breaking
 ```
 **Use it from Python** — every line below is exercised by the test suite:
@@ -197,7 +267,7 @@ The protocol is fully useful standalone:
 1. **Track versions locally** — `init` to scaffold, `hash` for a stable id, `diff` between any two. No account, fully offline.
 2. **Gate CI/CD** — `diff --fail-on-breaking` stops a breaking agent change from reaching production.
-3. **Annotate traces** — stamp `identity.overall_hash` onto your OpenTelemetry spans as `agentversion.manifest_hash` for version-scoped filtering. See [`examples/integrations/otel_mapping.md`](https://github.com/decimal-labs/agentversion/blob/main/examples/integrations/otel_mapping.md).
+3. **Annotate traces** — stamp `identity.overall_hash` onto your OpenTelemetry spans as `agentversion.manifest_hash` for version-scoped filtering. See [`examples/integrations/otel_mapping.md`](https://pypi.org/project/agentversion/), bundled in the package.
 4. **Classify data compatibility** — `diff --compat` (or `decision generate`) gives a per-episode keep / repair / replay / drop verdict you can act on.
 It interoperates with LangSmith, Langfuse, Phoenix, and W&B — annotate their traces/datasets with a manifest hash, or read/write compatibility decisions alongside your eval pipeline.
@@ -208,13 +278,13 @@ It interoperates with LangSmith, Langfuse, Phoenix, and W&B — annotate their t
 `agentversion` is an open spec so any tool, in any language, can produce interoperable manifests and diffs:
-- [`spec/manifest.md`](https://github.com/decimal-labs/agentversion/blob/main/spec/manifest.md) — the agent manifest
-- [`spec/diff.md`](https://github.com/decimal-labs/agentversion/blob/main/spec/diff.md) — surface diffs, breaking vs non-breaking
-- [`spec/compatibility-decision.md`](https://github.com/decimal-labs/agentversion/blob/main/spec/compatibility-decision.md) — keep / repair / replay / drop
-- [`spec/replay.md`](https://github.com/decimal-labs/agentversion/blob/main/spec/replay.md) · [`spec/dataset.md`](https://github.com/decimal-labs/agentversion/blob/main/spec/dataset.md) — replay jobs and dataset objects with provenance
-- [`spec/reference.md`](https://github.com/decimal-labs/agentversion/blob/main/spec/reference.md) — full schemas and validation rules · [`schemas/`](https://github.com/decimal-labs/agentversion/tree/main/schemas) — JSON Schemas
+- [`spec/manifest.md`](https://pypi.org/project/agentversion/) — the agent manifest
+- [`spec/diff.md`](https://pypi.org/project/agentversion/) — surface diffs, breaking vs non-breaking
+- [`spec/compatibility-decision.md`](https://pypi.org/project/agentversion/) — keep / repair / replay / drop
+- [`spec/replay.md`](https://pypi.org/project/agentversion/) · [`spec/dataset.md`](https://pypi.org/project/agentversion/) — replay jobs and dataset objects with provenance
+- [`spec/reference.md`](https://pypi.org/project/agentversion/) — full schemas and validation rules · [`schemas/`](https://pypi.org/project/agentversion/) — JSON Schemas
-[`CONFORMANCE.md`](https://github.com/decimal-labs/agentversion/blob/main/CONFORMANCE.md) + [`compatibility-tests/`](https://github.com/decimal-labs/agentversion/tree/main/compatibility-tests) are golden in/out pairs that any implementation must reproduce to claim conformance.
+The full spec and JSON Schemas ship inside the `agentversion` wheel. [`CONFORMANCE.md`](https://pypi.org/project/agentversion/) + [`compatibility-tests/`](https://pypi.org/project/agentversion/) are golden in/out pairs that any implementation must reproduce to claim conformance.
 ---
@@ -226,27 +296,56 @@ A manifest can carry the eval results that gated its release in `evaluation.gate
 {
   "evaluation": {
     "gates": [
-      { "name": "regression-suite", "threshold": 0.95, "actual_score": 0.972, "passed": true }
+      { "name": "regression-suite", "threshold": 0.95, "actual_score": 0.972,
+        "passed": true, "ran_at": "2026-03-05T14:00:00Z" }
     ]
   }
 }
 ```
-Those scores come from [`skillevaluation`](https://github.com/decimal-labs/skillevaluation), the sibling open spec for A/B benchmarking skills. `agentversion` records *what an agent version is*; `skillevaluation` measures *whether it's better*.
+Those scores come from [`skillevaluation`](https://pypi.org/project/skillevaluation/), the sibling open spec for A/B benchmarking skills. `agentversion` records *what an agent version is*; `skillevaluation` measures *whether it's better*.
-The [`decimalai`](https://github.com/decimal-labs/decimalai-python) Python SDK builds on `agentversion` to add framework adapters (capture a manifest straight from your LangGraph/CrewAI app), trace capture, and managed replay — but you never need it to use the spec.
+The [`decimalai`](https://pypi.org/project/decimalai/) Python SDK builds on `agentversion` to add framework adapters (capture a manifest straight from your LangGraph/CrewAI app), trace capture, and managed replay — but you never need it to use the spec.
+---
+## From the DecimalAI SDK
+If you use the [`decimalai`](https://pypi.org/project/decimalai/) SDK you don't hand-write manifests — it captures one straight from your running agent, and `export_manifest` hands it to the OSS tooling here:
+```python
+import decimalai
+from decimalai.schema.manifest import extract_from_config
+from agentversion.diff import diff_manifests
+from agentversion.compatibility import classify_compatibility
+# Capture a manifest from your agent's config (or a framework adapter)…
+snap = extract_from_config(
+    agent_name="support-agent",
+    prompts={"system": "You are a helpful support assistant."},
+    models={"default": {"provider": "openai", "model": "gpt-4o"}},
+)
+manifest = decimalai.export_manifest(snap)        # → an agentversion manifest dict
+# …then this package takes over: diff vs your last prod manifest, gate in CI.
+diff = diff_manifests(last_prod_manifest, manifest)
+print(classify_compatibility(diff).recommended_decision)
+```
+This is the seam that makes `agentversion` the **open core** of the paid platform: the manifest the SDK captures *is* the format `agentversion diff` consumes, so you can reproduce the platform's diffs and verdicts entirely outside DecimalAI. A runnable version is in [`examples/integrations/decimalai_bridge.py`](https://pypi.org/project/agentversion/).
 ---
 ## Project
-The **spec** is stable at v1.0 — frozen wire format and conformance suite. The **package** is `0.1.0`: pre-1.0 under semantic versioning, so the Python API may still shift before it catches up. Design decisions are logged in [`adrs/`](https://github.com/decimal-labs/agentversion/tree/main/adrs), releases in [`CHANGELOG.md`](https://github.com/decimal-labs/agentversion/blob/main/CHANGELOG.md). Contributions — especially new conformance cases — are genuinely welcome; see [`CONTRIBUTING.md`](https://github.com/decimal-labs/agentversion/blob/main/CONTRIBUTING.md):
+The spec is frozen at v1.0; the package is pre-1.0 (see [Install](#install)). Design decisions are logged in [`adrs/`](https://pypi.org/project/agentversion/), releases in [`CHANGELOG.md`](https://pypi.org/project/agentversion/). Contributions — especially new conformance cases — are genuinely welcome; see [`CONTRIBUTING.md`](https://pypi.org/project/agentversion/):
 ```bash
-git clone https://github.com/decimal-labs/agentversion
-cd agentversion
-pip install -e ".[dev]"
-pytest
+pip install agentversion
+agentversion --help
+# run the conformance + unit suite from a clone:
+git clone https://github.com/decimal-labs/agentversion && cd agentversion
+pip install -e ".[dev]" && pytest
 ```
-Licensed under [Apache 2.0](https://github.com/decimal-labs/agentversion/blob/main/LICENSE).
+Licensed under [Apache 2.0](https://pypi.org/project/agentversion/).
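
The README text in this diff says each surface is hashed on its own, so a diff can name the surface that changed. The sketch below shows that idea in isolation. It is not the package's `diff_manifests`: the helper names `surface_hashes` and `changed_surfaces` are hypothetical, the sample manifests are trimmed illustrations, and sorted-key `json.dumps` is an assumed stand-in for JCS canonicalization.

```python
import hashlib
import json


def surface_hashes(manifest: dict) -> dict[str, str]:
    """Hash each entry under `contract` independently (hypothetical helper)."""
    hashes = {}
    for name, surface in manifest.get("contract", {}).items():
        # Stand-in canonical form (assumption): sorted keys, compact separators.
        canonical = json.dumps(
            surface, sort_keys=True, separators=(",", ":"), ensure_ascii=False
        )
        hashes[name] = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return hashes


def changed_surfaces(old: dict, new: dict) -> dict[str, str]:
    """Label each surface that was added, removed, or changed between versions."""
    before, after = surface_hashes(old), surface_hashes(new)
    result = {}
    for name in sorted(before.keys() | after.keys()):
        if name not in before:
            result[name] = "added"
        elif name not in after:
            result[name] = "removed"
        elif before[name] != after[name]:
            result[name] = "changed"
    return result


v1 = {"contract": {
    "prompt_stack": {"system_prompt": {"version": "8"}},
    "model_runtime": {"provider": "google", "model": "gemini-2.0-flash"},
}}
v2 = {"contract": {
    "prompt_stack": {"system_prompt": {"version": "8"}},
    "model_runtime": {"provider": "openai", "model": "gpt-4o"},
    "output_contract": {"format": "json", "strict": True},
}}

# prompt_stack is identical in both versions, so it does not appear.
print(changed_surfaces(v1, v2))
```

This sketch only detects that a surface differs. Deciding whether a difference is breaking, and which verdict follows, is the job of the package's per-surface classifiers described in the changelog.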
