agentversion-0.1.0.tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (107)
  1. agentversion-0.1.0/.github/dependabot.yml +13 -0
  2. agentversion-0.1.0/.github/workflows/ci.yml +64 -0
  3. agentversion-0.1.0/.github/workflows/publish.yml +72 -0
  4. agentversion-0.1.0/.gitignore +32 -0
  5. agentversion-0.1.0/CHANGELOG.md +242 -0
  6. agentversion-0.1.0/CONFORMANCE.md +66 -0
  7. agentversion-0.1.0/CONTRIBUTING.md +68 -0
  8. agentversion-0.1.0/LICENSE +190 -0
  9. agentversion-0.1.0/PKG-INFO +252 -0
  10. agentversion-0.1.0/README.md +217 -0
  11. agentversion-0.1.0/adrs/0000-template.md +43 -0
  12. agentversion-0.1.0/adrs/0001-version-spec-core.md +64 -0
  13. agentversion-0.1.0/agentversion/__init__.py +31 -0
  14. agentversion-0.1.0/agentversion/_shared.py +23 -0
  15. agentversion-0.1.0/agentversion/cli.py +407 -0
  16. agentversion-0.1.0/agentversion/compatibility.py +258 -0
  17. agentversion-0.1.0/agentversion/constants.py +8 -0
  18. agentversion-0.1.0/agentversion/dataset.py +248 -0
  19. agentversion-0.1.0/agentversion/decision.py +249 -0
  20. agentversion-0.1.0/agentversion/diff.py +740 -0
  21. agentversion-0.1.0/agentversion/hasher.py +162 -0
  22. agentversion-0.1.0/agentversion/ids.py +324 -0
  23. agentversion-0.1.0/agentversion/manifest.py +405 -0
  24. agentversion-0.1.0/agentversion/py.typed +0 -0
  25. agentversion-0.1.0/agentversion/refs.py +128 -0
  26. agentversion-0.1.0/agentversion/replay.py +166 -0
  27. agentversion-0.1.0/agentversion/validator.py +346 -0
  28. agentversion-0.1.0/compatibility-tests/environment-region-change/after.json +73 -0
  29. agentversion-0.1.0/compatibility-tests/environment-region-change/before.json +73 -0
  30. agentversion-0.1.0/compatibility-tests/environment-region-change/expected-diff.json +21 -0
  31. agentversion-0.1.0/compatibility-tests/model-runtime-provider-change/after.json +73 -0
  32. agentversion-0.1.0/compatibility-tests/model-runtime-provider-change/before.json +73 -0
  33. agentversion-0.1.0/compatibility-tests/model-runtime-provider-change/expected-diff.json +21 -0
  34. agentversion-0.1.0/compatibility-tests/output-schema-change/after.json +46 -0
  35. agentversion-0.1.0/compatibility-tests/output-schema-change/before.json +46 -0
  36. agentversion-0.1.0/compatibility-tests/output-schema-change/expected-diff.json +22 -0
  37. agentversion-0.1.0/compatibility-tests/prompt-stack-edit/after.json +73 -0
  38. agentversion-0.1.0/compatibility-tests/prompt-stack-edit/before.json +73 -0
  39. agentversion-0.1.0/compatibility-tests/prompt-stack-edit/expected-diff.json +21 -0
  40. agentversion-0.1.0/compatibility-tests/skill-registry-skill-removed/after.json +68 -0
  41. agentversion-0.1.0/compatibility-tests/skill-registry-skill-removed/before.json +73 -0
  42. agentversion-0.1.0/compatibility-tests/skill-registry-skill-removed/expected-diff.json +21 -0
  43. agentversion-0.1.0/compatibility-tests/subagent-handoff-change/after.json +53 -0
  44. agentversion-0.1.0/compatibility-tests/subagent-handoff-change/before.json +53 -0
  45. agentversion-0.1.0/compatibility-tests/subagent-handoff-change/expected-diff.json +21 -0
  46. agentversion-0.1.0/compatibility-tests/tool-rename/after.json +46 -0
  47. agentversion-0.1.0/compatibility-tests/tool-rename/before.json +46 -0
  48. agentversion-0.1.0/compatibility-tests/tool-rename/expected-diff.json +22 -0
  49. agentversion-0.1.0/compatibility-tests/workflow-graph-change/after.json +73 -0
  50. agentversion-0.1.0/compatibility-tests/workflow-graph-change/before.json +73 -0
  51. agentversion-0.1.0/compatibility-tests/workflow-graph-change/expected-diff.json +22 -0
  52. agentversion-0.1.0/examples/.gitkeep +1 -0
  53. agentversion-0.1.0/examples/integrations/langgraph_example.py +187 -0
  54. agentversion-0.1.0/examples/integrations/otel_mapping.md +67 -0
  55. agentversion-0.1.0/examples/manifest/finance-agent-v1.json +117 -0
  56. agentversion-0.1.0/examples/manifest/finance-agent-v2.json +236 -0
  57. agentversion-0.1.0/examples/scenarios/tool-rename-drift.md +90 -0
  58. agentversion-0.1.0/pyproject.toml +80 -0
  59. agentversion-0.1.0/pyrightconfig.json +13 -0
  60. agentversion-0.1.0/schemas/.gitkeep +1 -0
  61. agentversion-0.1.0/schemas/agent-manifest.schema.json +464 -0
  62. agentversion-0.1.0/schemas/compatibility-batch.schema.json +86 -0
  63. agentversion-0.1.0/schemas/compatibility-decision.schema.json +128 -0
  64. agentversion-0.1.0/schemas/compatibility-policy.schema.json +56 -0
  65. agentversion-0.1.0/schemas/compatibility-report.schema.json +55 -0
  66. agentversion-0.1.0/schemas/dataset-snapshot.schema.json +113 -0
  67. agentversion-0.1.0/schemas/episode.schema.json +80 -0
  68. agentversion-0.1.0/schemas/manifest-diff.schema.json +91 -0
  69. agentversion-0.1.0/schemas/replay-job.schema.json +141 -0
  70. agentversion-0.1.0/schemas/replay-result.schema.json +77 -0
  71. agentversion-0.1.0/schemas/step.schema.json +120 -0
  72. agentversion-0.1.0/schemas/task.schema.json +47 -0
  73. agentversion-0.1.0/spec/attestation.md +75 -0
  74. agentversion-0.1.0/spec/compatibility-batch.md +104 -0
  75. agentversion-0.1.0/spec/compatibility-decision.md +65 -0
  76. agentversion-0.1.0/spec/compatibility-policy.md +90 -0
  77. agentversion-0.1.0/spec/data-classification.md +61 -0
  78. agentversion-0.1.0/spec/dataset.md +200 -0
  79. agentversion-0.1.0/spec/diff.md +51 -0
  80. agentversion-0.1.0/spec/environment.md +121 -0
  81. agentversion-0.1.0/spec/evaluation.md +139 -0
  82. agentversion-0.1.0/spec/hashing.md +64 -0
  83. agentversion-0.1.0/spec/ids.md +104 -0
  84. agentversion-0.1.0/spec/lifecycle.md +110 -0
  85. agentversion-0.1.0/spec/manifest.md +211 -0
  86. agentversion-0.1.0/spec/otel-mapping.md +66 -0
  87. agentversion-0.1.0/spec/reference.md +238 -0
  88. agentversion-0.1.0/spec/refs.md +95 -0
  89. agentversion-0.1.0/spec/replay-determinism.md +93 -0
  90. agentversion-0.1.0/spec/replay.md +94 -0
  91. agentversion-0.1.0/spec/versioning-policy.md +62 -0
  92. agentversion-0.1.0/tests/test_audit_v020.py +463 -0
  93. agentversion-0.1.0/tests/test_cli.py +275 -0
  94. agentversion-0.1.0/tests/test_conformance.py +66 -0
  95. agentversion-0.1.0/tests/test_dataset.py +161 -0
  96. agentversion-0.1.0/tests/test_decision_replay.py +219 -0
  97. agentversion-0.1.0/tests/test_diff.py +525 -0
  98. agentversion-0.1.0/tests/test_environment.py +294 -0
  99. agentversion-0.1.0/tests/test_evaluation.py +172 -0
  100. agentversion-0.1.0/tests/test_hasher.py +164 -0
  101. agentversion-0.1.0/tests/test_ids.py +304 -0
  102. agentversion-0.1.0/tests/test_lifecycle.py +212 -0
  103. agentversion-0.1.0/tests/test_manifest.py +375 -0
  104. agentversion-0.1.0/tests/test_refs.py +159 -0
  105. agentversion-0.1.0/tests/test_reproducible_replay.py +308 -0
  106. agentversion-0.1.0/tests/test_trust_observability.py +293 -0
  107. agentversion-0.1.0/tests/test_validator.py +152 -0
@@ -0,0 +1,13 @@
1
+ version: 2
2
+ updates:
3
+   - package-ecosystem: "pip"
4
+     directory: "/"
5
+     schedule:
6
+       interval: "weekly"
7
+     open-pull-requests-limit: 3
8
+
9
+   - package-ecosystem: "github-actions"
10
+     directory: "/"
11
+     schedule:
12
+       interval: "weekly"
13
+     open-pull-requests-limit: 3
@@ -0,0 +1,64 @@
1
+ name: CI
2
+
3
+ on:
4
+   pull_request:
5
+     branches: [main]
6
+
7
+ concurrency:
8
+   group: ci-${{ github.ref }}
9
+   cancel-in-progress: true
10
+
11
+ jobs:
12
+   test:
13
+     name: Test (Python ${{ matrix.python-version }})
14
+     runs-on: ubuntu-latest
15
+     strategy:
16
+       fail-fast: false
17
+       matrix:
18
+         python-version: ["3.10", "3.11", "3.12"]
19
+
20
+     steps:
21
+       - uses: actions/checkout@v4
22
+
23
+       - uses: actions/setup-python@v5
24
+         with:
25
+           python-version: ${{ matrix.python-version }}
26
+           cache: pip
27
+           cache-dependency-path: pyproject.toml
28
+
29
+       - name: Install with dev extras
30
+         run: |
31
+           python -m pip install --upgrade pip
32
+           pip install -e ".[dev]"
33
+
34
+       # test_conformance.py exercises the JSON scenarios under
35
+       # compatibility-tests/ (tool-rename, output-schema-change,
36
+       # subagent-handoff-change) so no separate job needed.
37
+       - name: Run pytest
38
+         run: python -m pytest tests/ -q
39
+
40
+   lint:
41
+     name: Lint
42
+     runs-on: ubuntu-latest
43
+     steps:
44
+       - uses: actions/checkout@v4
45
+
46
+       - uses: actions/setup-python@v5
47
+         with:
48
+           python-version: "3.12"
49
+           cache: pip
50
+           cache-dependency-path: pyproject.toml
51
+
52
+       - name: Install
53
+         run: |
54
+           python -m pip install --upgrade pip
55
+           pip install -e ".[dev]"
56
+
57
+       - name: Ruff
58
+         # Config lives in pyproject.toml ([tool.ruff.lint]) so local and CI
59
+         # lint stay in lock-step. Lints the whole repo, tests included.
60
+         run: ruff check .
61
+
62
+       - name: Mypy
63
+         # Strict type-check the package (config in pyproject [tool.mypy]).
64
+         run: mypy agentversion/
@@ -0,0 +1,72 @@
1
+ name: Publish to PyPI
2
+
3
+ # Triggers when a release is published in GitHub. Tag conventions:
4
+ # v1.0.0 → publishes agentversion 1.0.0 to PyPI
5
+ #
6
+ # Uses Trusted Publisher (OIDC) — no API token needed.
7
+ # Configure once at https://pypi.org/manage/account/publishing/ with:
8
+ # project: agentversion
9
+ # owner: decimal-labs
10
+ # repo: agentversion
11
+ # workflow: publish.yml
12
+ # environment: pypi
13
+
14
+ on:
15
+   release:
16
+     types: [published]
17
+
18
+ permissions:
19
+   id-token: write
20
+   contents: read
21
+
22
+ jobs:
23
+   test:
24
+     name: Run Tests
25
+     runs-on: ubuntu-latest
26
+     strategy:
27
+       matrix:
28
+         python-version: ["3.10", "3.11", "3.12"]
29
+     steps:
30
+       - uses: actions/checkout@v4
31
+
32
+       - uses: actions/setup-python@v5
33
+         with:
34
+           python-version: ${{ matrix.python-version }}
35
+
36
+       - name: Install
37
+         run: pip install -e ".[dev]"
38
+
39
+       - name: Run tests
40
+         run: pytest tests/ -v
41
+
42
+   publish:
43
+     name: Publish to PyPI
44
+     needs: test
45
+     runs-on: ubuntu-latest
46
+     environment: pypi
47
+
48
+     steps:
49
+       - uses: actions/checkout@v4
50
+
51
+       - uses: actions/setup-python@v5
52
+         with:
53
+           python-version: "3.12"
54
+
55
+       - name: Install build tools
56
+         run: pip install build
57
+
58
+       - name: Build package
59
+         run: python -m build
60
+
61
+       - name: Verify package version matches release tag
62
+         run: |
63
+           PKG_VERSION=$(python -c "import tomllib; print(tomllib.load(open('pyproject.toml','rb'))['project']['version'])")
64
+           TAG_VERSION="${GITHUB_REF_NAME#v}"
65
+           if [ "$PKG_VERSION" != "$TAG_VERSION" ]; then
66
+             echo "Version mismatch: pyproject.toml=$PKG_VERSION, tag=$TAG_VERSION"
67
+             exit 1
68
+           fi
69
+           echo "Version match: $PKG_VERSION"
70
+
71
+       - name: Publish to PyPI
72
+         uses: pypa/gh-action-pypi-publish@release/v1
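The tag check in the publish job above compares the `pyproject.toml` version with the release tag after removing one leading `v`. A minimal Python sketch of that comparison follows, for trying the rule locally. The function name `tag_matches_version` is a hypothetical name used for illustration and is not part of the package or the workflow.

```python
def tag_matches_version(ref_name: str, pkg_version: str) -> bool:
    """Hypothetical helper mirroring the workflow's shell check.

    The shell expansion ${GITHUB_REF_NAME#v} removes a single leading "v"
    when present and leaves the tag untouched otherwise.
    """
    tag_version = ref_name[1:] if ref_name.startswith("v") else ref_name
    return tag_version == pkg_version


if __name__ == "__main__":
    # A tag "v0.1.0" is expected to match package version "0.1.0".
    print(tag_matches_version("v0.1.0", "0.1.0"))
```

Under this rule a release tagged `v0.1.1` against a package still at `0.1.0` would fail the check, which is the case the workflow step guards against.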
@@ -0,0 +1,32 @@
1
+ # Python
2
+ __pycache__/
3
+ *.py[cod]
4
+ *$py.class
5
+ *.so
6
+ *.egg-info/
7
+ *.egg
8
+ dist/
9
+ build/
10
+ .eggs/
11
+
12
+ # Virtual environments
13
+ .venv/
14
+ venv/
15
+ env/
16
+
17
+ # IDE
18
+ .idea/
19
+ .vscode/
20
+ *.swp
21
+ *.swo
22
+ *~
23
+
24
+ # Testing
25
+ .pytest_cache/
26
+ .coverage
27
+ htmlcov/
28
+ .mypy_cache/
29
+
30
+ # OS
31
+ .DS_Store
32
+ Thumbs.db
@@ -0,0 +1,242 @@
1
+ # Changelog
2
+
3
+ All notable changes to this project will be documented in this file.
4
+
5
+ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
6
+ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
7
+
8
+ > **Package version ≠ spec version.** This file tracks the **package** version. The on-the-wire `spec_version` is independent and frozen at `1.0.0`; a pre-1.0 package can implement a stable 1.0 spec, which is exactly the situation today.
9
+
10
+ ## [0.1.0] - 2026-05-29
11
+
12
+ **First published release** — the first `agentversion` release on PyPI.
13
+
14
+ The package ships pre-1.0 on purpose. The spec it implements is stable (`spec_version 1.0.0`, with a frozen wire format and conformance suite), but the Python package API hasn't earned its own 1.0 promise yet — so it enters at `0.1.0` (`Development Status :: 4 - Beta`). Feature-wise it's complete against the original audit roadmap; that work landed across the internal milestones listed further below, none of which were ever published.
15
+
16
+ ### Changed — Project renamed to **AgentVersion**
17
+
18
+ The project was renamed from "Agent Version Spec (AVS)" to **AgentVersion** before its first public release. The on-the-wire `spec_version` is unchanged (still `1.0.0`); only names and identifiers changed. Because nothing was published prior to this, there is no migration path — the rename is a pre-release change.
19
+
20
+ - **PyPI distribution**: `agent-version-spec` → `agentversion`.
21
+ - **Python import / package**: `agent_version_spec` → `agentversion`.
22
+ - **CLI command**: `avs` → `agentversion`.
23
+ - **Manifest-ref URI scheme**: `avs:manifest:<id>` / `avs:hash:<algo>:<hex>` → `agentversion:manifest:<id>` / `agentversion:hash:<algo>:<hex>`. Manifests that carry subagent `manifest_ref`s must update those values and recompute `identity.overall_hash` (refs are inside the hashed `subagents` surface). The example `examples/manifest/finance-agent-v2.json` was updated accordingly.
24
+ - **OpenTelemetry attribute key**: `agent_version_spec.manifest_hash` → `agentversion.manifest_hash`.
25
+ - **GitHub repository**: `decimal-labs/agent-version-spec` → `decimal-labs/agentversion`.
26
+
27
+ Unchanged: object ID prefixes (`amf`, `tsk`, `ep`, `dss`, `cdc`, `rpj`, `rpr`, `mdf`), the `spec_version` value (`1.0.0`), the `jcs-sha256` hash algorithm, and the schema file names (which are object-named, not project-named).
28
+
29
+ ---
30
+
31
+ ## Pre-release development (internal milestones — never published)
32
+
33
+ The entries below were development milestones tracked in-repo on the way to feature-completeness. None were published to PyPI or any other index, so there is no migration path between them — they're kept as a record of how the spec took shape. Their version numbers are the old internal numbering and overlap the published `0.1.0` above only by coincidence.
34
+
35
+ ### 1.0.0 - 2026-05-12 (internal milestone)
36
+
37
+ **Feature-complete milestone** (never published). The audit roadmap reached feature-completeness at this internal version; the stability promise it anticipated now lives on the spec, which is frozen at `1.0.0`.
38
+
39
+ ### What v1.0 looks like
40
+
41
+ - **Canonical IDs only.** Every ID matches `^[a-z][a-z0-9]*_[0-9A-HJKMNP-TV-Z]{26}$` (kind-prefixed ULID). The JSON Schema, Pydantic models, and semantic validator all enforce this. `malformed_id` and `wrong_id_prefix` are errors; there is no permissive mode.
42
+ - **Typed manifest references only.** `subagents[].manifest_ref` accepts `avs:manifest:<canonical-id>`, `avs:hash:<algo>:<hex>`, `https://...`, or `file:///...`. Bare IDs are rejected. `malformed_manifest_ref` is an error.
43
+ - **`Development Status` classifier**: `5 - Production/Stable`.
44
+ - **Conformance suite frozen.** `compatibility-tests/` scenarios in v1.0.0 stay stable through v1.x. New scenarios may be added in minors; existing ones don't change.
45
+ - **Semver locked.** `1.x.0` minors are always backward-compatible (additive only). Anything that removes/renames/tightens requires a major bump.
46
+
47
+ ### Migration
48
+
49
+ There is no migration path from pre-v1.0 — nothing pre-v1.0 was released. Build new manifests at canonical form.
50
+
51
+ ### What's next
52
+
53
+ - v1.x minors: federated registry resolution, well-known `extensions` namespace registry, streaming manifest support.
54
+ - v2.0 (no timeline): drop legacy `status` field in favor of `lifecycle.current_stage` only.
55
+
56
+ ### 0.9.0 - 2026-05-12
57
+
58
+ Trust + observability + governance batch. The last three §3 items.
59
+
60
+ ### Added — Attestation (§3d)
61
+ - **`Attestation` model** with `signer`, `algorithm`, `signature`, `signed_payload_hash`, `signed_at`, optional `key_id` + `expires_at`.
62
+ - **`IdentityBlock.attestations: List[Attestation]`** — multiple attestations supported (typical: CI provenance + release-manager approval + security-scan).
63
+ - **Hash isolation**: attestations live on `identity` (not `contract`), so adding or rotating signatures does NOT change `overall_hash`. A manifest's identity is its contract; signatures are evidence about it.
64
+ - **Verification is out of spec**: format-only. Implementations bring their own crypto (Sigstore, cosign, GPG, internal PKI). The validator only enforces well-formedness; it does not check signatures.
65
+ - Spec doc [`spec/attestation.md`](./spec/attestation.md).
66
+
67
+ ### Added — Richer `ComparisonSummary` (§3m)
68
+ - **New fields on `ReplayResult.comparison_summary`**: `final_output_diff_pct` (0-100), `tool_path_diff: ToolPathDiff` (`steps_added`, `steps_removed`, `first_divergence_step_index`), `step_count_delta`, `latency_delta_ms`, `cost_delta_usd`, `eval_score_delta`. All optional — back-compat preserved.
69
+ - **`ToolPathDiff` model** for structural diff of the tool-call sequence.
70
+ - **Use case**: sort divergent replays by severity. Pre-v0.9, you knew which replays diverged; now you know *how much* and *where* they first diverged.
71
+
72
+ ### Added — Data Classification (§3n)
73
+ - **`DataClassification` model** for compliance labels: `pii_state` (`raw|redacted|synthetic|none`), `retention_days`, `residency[]`, `redaction_policy_ref`, `consent_basis` (GDPR Article 6 enum).
74
+ - **`DatasetSnapshot.data_classification`** — optional; defaults to `pii_state="none"` when present without overrides.
75
+ - **`SelectionPolicy.pii_states`** — filter so a snapshot can declare "only include episodes whose data is redacted or synthetic".
76
+ - Spec doc [`spec/data-classification.md`](./spec/data-classification.md).
77
+
78
+ ### Tests
79
+ - 17 new tests in `tests/test_trust_observability.py`.
80
+ - avs total: **295 passing** (was 278, net +17).
81
+
82
+ ### Phase 3 complete
83
+ All 14 missing-capability items from the original audit §3 are now shipped (or, in the case of §3l, were folded into Phase 2). The spec is feature-complete for the audit roadmap. Next milestone: **v1.0** — tighten enforcement (drop permissive ID pattern, drop bare-ID manifest_refs), publish to PyPI.
84
+
85
+ ### 0.8.0 - 2026-05-12
86
+
87
+ Reproducible-replay batch. Four audit items in one bump because they collectively make most agents bit-reproducibly replayable — adding one without the others leaves replay still flaky.
88
+
89
+ ### Added — Tool semantic_version (§3i)
90
+ - **`ToolDescriptor.semantic_version`** — SemVer string catching *behavioral* drift that schema hashes miss (e.g. "we swapped the upstream Census API from 2019 to 2024; same schema, different numbers").
91
+ - **`ToolDescriptor.implementation_ref`** — opaque pointer to the implementation (git commit, image hash, etc.).
92
+ - **Diff classifier extension**: when schemas are unchanged but a tool's `semantic_version` bumps, the diff now flags the bump kind — major → breaking moderate, minor → non-breaking minor, patch → non-breaking minor.
93
+ - **Validator code**: `malformed_semver` (WARNING).
94
+
95
+ ### Added — Tool schema embedding (§3g)
96
+ - **`ToolDescriptor.input_schema_inline`** and **`output_schema_inline`** — optional inline JSON Schemas alongside the existing hashes.
97
+ - **Validator code**: `schema_hash_mismatch` (ERROR) when `JCS-SHA256(inline) != declared hash`.
98
+ - Enables fully-offline replay: archived agents don't need a live registry to verify tool I/O.
99
+
100
+ ### Added — Model cost & limits envelope (§3h)
101
+ - **`ModelRuntime.envelope`** — new sub-object with `context_window_tokens`, `expected_latency_ms_p50` / `p99`, `cost.{input,output,cached_input}_per_1k_tokens_usd`, `rate_limit.{rpm,tpm}`.
102
+ - Anchors `ReplayConstraints.max_cost_usd` budgeting and lets the diff classifier flag price-tier swaps.
103
+ - Envelope is part of `contract.model_runtime` → participates in `overall_hash`. Provider price changes warrant a new manifest version.
104
+
105
+ ### Added — Replay determinism hints (§3f)
106
+ - **`ReplayInput.determinism`** — new optional sub-object with `random_seed`, `clock_freeze_at`, `tool_response_pinning_ref` (the last a `ManifestRef`-style URI; `avs:hash:` is the typical scheme since you want tamper detection).
107
+ - Spec doc [`spec/replay-determinism.md`](./spec/replay-determinism.md) covers all four §3f/§3g items together and explains why they ship as a set.
108
+
109
+ ### Examples
110
+ - `finance-agent-v2.json` gains a populated `envelope` on `model_runtime` and a `semantic_version` + `implementation_ref` on `get_market_cap`. `overall_hash` updated because both fields are in-contract.
111
+
112
+ ### Tests
113
+ - 14 new tests in `tests/test_reproducible_replay.py`.
114
+ - avs total: **278 passing** (was 264, net +14).
115
+
116
+ ### 0.7.0 - 2026-05-12
117
+
118
+ ### Added — Lifecycle (§3e)
119
+ - **`Lifecycle` model** as an optional top-level field on `AgentManifest` (siblings: `lifecycle`, `evaluation` — both outside `contract`, so they do NOT participate in `identity.overall_hash`).
120
+ - Six stages: `draft → candidate → staging → production → deprecated → archived`.
121
+ - `LifecycleTransition` records each promotion: `stage`, `transitioned_at`, `by` (actor convention: `user:<id>`, `system:<id>`), optional `eval_ref`, `approved_by[]`, `notes`.
122
+ - `supersedes[]` and `superseded_by` for the version-chain bookkeeping. `sunset_at` for scheduled removal.
123
+ - Validator: `lifecycle_history_unsorted` (ERROR), `lifecycle_stage_mismatch` (ERROR), `lifecycle_status_mismatch` (WARNING — when the simple `status` field and `lifecycle.current_stage` disagree under the simple-to-rich mapping).
124
+ - Spec doc [`spec/lifecycle.md`](./spec/lifecycle.md).
125
+ - 13 new tests in `tests/test_lifecycle.py`.
126
+
127
+ ### Added — Evaluation Gates (§3k)
128
+ - **`Evaluation` model** as an optional top-level field carrying `gates[]`. Like lifecycle, NOT in contract — re-running an eval against the same agent produces the same `overall_hash` but updated evaluation data.
129
+ - `EvalGate` records: `name`, optional `dataset_ref`, `threshold`, `actual_score`, `threshold_direction` (`"min"` higher-is-better / `"max"` lower-is-better), `passed`, `ran_at`, optional `evaluator_ref`, `notes`.
130
+ - Validator: `eval_gate_inconsistent` (WARNING) when `passed` disagrees with `actual_score` vs `threshold` under the declared direction.
131
+ - Spec doc [`spec/evaluation.md`](./spec/evaluation.md).
132
+ - 9 new tests in `tests/test_evaluation.py`.
133
+
134
+ ### Added — Manifest Tombstone (§3j, folded in)
135
+ - `IdentityBlock.yanked_at` and `IdentityBlock.yanked_reason` — optional fields for marking a published manifest as no-longer-recommended without rewriting history (PyPI-yank semantics).
136
+ - Identity block is NOT part of contract, so yanking a manifest does NOT change its `overall_hash`.
137
+
138
+ ### Examples
139
+ - `examples/manifest/finance-agent-v2.json` gains populated `lifecycle` and `evaluation` blocks demonstrating a 4-transition path to production with three eval gates (regression, safety, latency).
140
+ - `overall_hash` of the example is **unchanged** — confirming lifecycle + evaluation correctly sit outside `contract`.
141
+
142
+ ### 0.6.0 - 2026-05-12
143
+
144
+ ### Added — Environment Fingerprint Surface (§3a)
145
+ - **New contract surface** `environment` on `AgentContract` with fields: `deployment_id`, `region`, `infra_image_hash`, `runtime_versions`, `secret_refs`, `external_service_pins`, `feature_flags`, `resource_limits`. All optional — older v0.5 manifests still validate.
146
+ - **`ResourceLimits` model** with `memory_mb`, `cpu_cores`, `timeout_seconds`, `max_concurrent_calls`.
147
+ - **JSON Schema** for the new block under `contract.environment`.
148
+ - **Diff classifier** `environment_severity()` with field-level severity rules:
149
+   - `deployment_id`, `secret_refs`, `feature_flags`, `resource_limits` → minor
150
+   - `region`, `infra_image_hash`, `runtime_versions`, `external_service_pins` → moderate
151
+ - Environment changes are always classified `non_breaking` (they affect replayability, not validity of past traces).
152
+ - **New reason codes** in `compatibility_decision.reason_codes` enum: `region_changed`, `infra_image_changed`, `external_service_pin_changed`, `runtime_version_changed`. Plus the existing `environment_unreplayable` as a catch-all.
153
+ - **Condition tokens** `environment_surface_unchanged` / `environment_surface_changed` for `ClassificationRule.condition`.
154
+ - **`CompatibilityPolicy.environment`** for user-configurable rules on the new surface.
155
+ - **Spec doc** [`spec/environment.md`](./spec/environment.md) — full surface spec, field reference, severity rules, security notes, hash participation.
156
+ - **Example** `examples/manifest/finance-agent-v2.json` gains a populated `environment` block.
157
+ - 19 new tests in `tests/test_environment.py`.
158
+
159
+ ### Security note
160
+ `environment.secret_refs` holds **names** (identifiers), not values. Implementations that put plaintext secrets there leak credentials into the manifest hash.
161
+
162
+ ### 0.5.0 - 2026-05-12
163
+
164
+ ### Added — Manifest References (§3c)
165
+ - **`agent_version_spec.refs` module** with `ManifestRef`, `parse_manifest_ref(s)`, `try_parse_manifest_ref(s)`, `is_bare_id_ref(s)`.
166
+ - **URI scheme** for `SubagentDescriptor.manifest_ref`:
167
+   - `avs:manifest:<id>` — by-ID reference (registry resolution).
168
+   - `avs:hash:<algo>:<hex>` — content-addressed (immutable).
169
+   - `https://...` / `http://...` — fetchable URL.
170
+   - `file:///path/manifest.json` — local file.
171
+   - Bare `<id>` — implicit `avs:manifest:` (deprecated in v0.x; removed in v1.0).
172
+ - **JSON Schema** pattern on `subagents[].manifest_ref` accepts all five forms.
173
+ - **Validator** semantic rules: `malformed_manifest_ref` (ERROR), `bare_manifest_ref` (WARNING; ERROR under `--strict-ids`). Embedded IDs in `avs:manifest:` URIs run through the same ID checks as `manifest_id`.
174
+ - **Spec doc** [`spec/refs.md`](./spec/refs.md) — full URI scheme, resolution semantics, JSON Schema pattern, v0.x → v1.0 promise.
175
+ - Example `examples/manifest/finance-agent-v2.json` updated: bare-ID subagent refs (`amf_finance_subagent_v3`) → canonical URIs (`avs:manifest:amf_01KREPJH26…`); fixed `manifest_id` from a not-actually-Crockford-base32 placeholder to a real ULID; recomputed `identity.overall_hash`.
176
+ - 25 new tests in `tests/test_refs.py`.
177
+
178
+ ### Added — Generalized ID Enforcement (§3b follow-up)
179
+ - **`check_object_ids(data, kind, strict)`** in `ids.py` validates every known ID field across **all** spec kinds (manifest, task, episode, step, dataset_snapshot, compatibility_decision, compatibility_batch, compatibility_report, replay_job, replay_result, manifest_diff).
180
+ - Walks dotted paths with `[]` array notation; handles `subject.id` specially (its expected prefix depends on `subject.type`).
181
+ - **CLI subcommands** `avs decision validate`, `avs replay validate`, `avs dataset validate` all gained `--strict-ids` and emit the same warning/error vocabulary as `avs validate`.
182
+ - 9 new tests covering non-manifest objects.
183
+
184
+ ### Changed
185
+ - `validate_manifest()` now delegates its ID checks to `check_object_ids()` — single source of truth for ID rules.
186
+
187
+ ### 0.4.0 - 2026-05-12
188
+
189
+ ### Added — Canonical IDs (§3b)
190
+ - **`agent_version_spec.ids` module** with `mint_id(kind)`, `parse_id(s)`, `validate_id(s, expected_kind=None, strict=False)`, `is_canonical_id(s)`, `is_permissive_id(s)`, and the `ID_PREFIXES` map (12 known kinds).
191
+ - **Canonical ID form**: `<kind-prefix>_<26-char Crockford base32 ULID>` (e.g. `amf_01HZK1A2B3C4D5E6F7G8H9J0K1`). Sortable by mint time; one less character than UUID; type-prefixed for at-a-glance kind identification.
192
+ - **Permissive form** (v0.x back-compat): JSON Schema `pattern` accepts both canonical ULID and semantic-slug IDs (e.g. `amf_finance_v3`). The validator emits a `non_canonical_id` WARNING for slug IDs through the v0.x line.
193
+ - **Semantic validator rules** (`validator.py`):
194
+   - `malformed_id` — ERROR when an ID matches neither canonical nor permissive form.
195
+   - `wrong_id_prefix` — ERROR (or escalated WARNING) when an ID's prefix doesn't match the object's kind.
196
+   - `non_canonical_id` — WARNING (or ERROR under `--strict-ids`) for slug IDs.
197
+ - **CLI**: `avs validate --strict-ids` escalates `non_canonical_id` warnings to errors. Matches the v1.0 behavior.
198
+ - **Spec doc**: [`spec/ids.md`](./spec/ids.md) documents the format, prefix table, rationale, API, and the v0.x → v1.0 tightening.
199
+ - 23 new tests in `tests/test_ids.py`.
200
+
201
+ ### Changed
202
+ - `validate_manifest()` and `validate_manifest_file()` accept a `strict_ids: bool = False` keyword.
203
+
204
+ ### Not yet enforced
205
+ - v1.0 will drop the permissive pattern. `non_canonical_id` becomes an error by default. Plan accordingly: tools that mint new IDs should produce canonical ULID form starting now.
206
+
207
+ ### 0.3.0 - 2026-05-12
208
+
209
+ ### Changed (breaking — nothing shipped publicly yet)
210
+ - **Renamed `rescue_decision` → `compatibility_decision`** (`RescueDecision` → `CompatibilityDecision`, schema file, kind, spec doc, CLI group). Aligns with the rest of the compatibility family.
211
+ - **Renamed `rescue_batch` → `compatibility_batch`** (`RescueBatch` → `CompatibilityBatch`, summary class, schema file, kind, spec doc).
212
+ - **Renamed `validators` surface → `guardrails`** (`ValidatorBundle` → `GuardrailBundle`). Removes naming collision with Pydantic and JSON-Schema validators.
213
+ - **Renamed schema file** `agent-version-spec.schema.json` → `agent-manifest.schema.json` to match the kind it defines.
214
+ - **Renamed module** `agent_version_spec/rescue.py` → `agent_version_spec/decision.py`.
215
+ - **CLI:** `avs rescue ...` → `avs decision ...` for both `validate` and `generate`.
216
+ - **Dropped** `validators.requires_confirmation_for_destructive_actions` — promote to per-tool `annotations.requires_confirmation` instead.
217
+ - **Renamed** `GuardrailBundle` fields: `validator_bundle_version` → `bundle_version`, `validator_bundle_hash` → `bundle_hash`.
218
+ - **Reason code** `validator_policy_changed` → `guardrail_policy_changed`; added `skill_missing`, `skill_content_changed`.
219
+ - **Condition tokens** `validator_surface_*` → `guardrail_surface_*`; added `skill_surface_unchanged` / `skill_surface_changed`.
220
+ - **Tool annotations** standardized to snake_case: `requiresConfirmation` → `requires_confirmation`, `readOnlyHint` → `read_only_hint`.
221
+
222
+ ### Added
223
+ - **`skill_registry` contract surface** is now first-class: `SkillRegistry` + `SkillDescriptor` are in the JSON schema (`agent-manifest.schema.json`), reference spec, diff surface enum, compatibility-policy schema, and condition DSL. Previously code-only.
224
+ - **`compatibility-report.schema.json`** — JSON Schema for the `CompatibilityReport` output of `classify_compatibility()`. Closes the gap where the class existed but had no schema.
225
+ - `__version__` is now read from package metadata via `importlib.metadata`, eliminating the package-version / `__version__` drift bug.
226
+
227
+ ### 0.2.0 - 2026-03-18
228
+
229
+ ### Added
230
+ - `skill_registry` Pydantic model + diff classifier (informally; not yet in schemas — see 0.3.0).
231
+ - Quantized float hashing for `generation_config` (temperature step 0.1, top_p step 0.05) so micro-tweaks don't churn manifest hashes.
232
+ - New manifest fields: `status`, `capabilities`, `description`, tool-level `description` + `annotations`.
233
+ - `OutputContract.modalities`.
234
+ - `compatibility-policy.schema.json` — user-configurable rules mapping change severity to actions per surface.
235
+ - Formalized condition DSL for `ClassificationRule.condition` with `SURFACE_STATE_TOKENS` / `PARAMETERIZED_TOKENS` and a `validate_condition()` enforcer.
236
+
237
+ ### 0.1.0 - 2026-03-11
238
+
239
+ ### Added
240
+ - Initial public scaffolding of the Agent Version Spec.
241
+ - Spec documents, JSON Schemas, Pydantic models, JCS-SHA256 hasher, surface-level diff engine, compatibility classifier.
242
+ - CLI entry point (`avs`) with the `validate`, `diff`, `hash`, `init`, and `upgrade` commands, plus subcommand groups.
@@ -0,0 +1,66 @@
1
+ # Conformance
2
+
3
+ How an implementation proves it conforms to the AgentVersion spec.
4
+
5
+ ## Why this exists
6
+
7
+ The spec is a multi-language target. The Python reference implementation lives in this repo, but an implementation in TypeScript, Rust, Go, or any other language is conforming as long as it produces the same outputs for the same inputs. This document defines "same outputs."
8
+
9
+ ## What an implementation must do
10
+
11
+ A conforming implementation must support, at minimum:
12
+
13
+ 1. **Manifest validation** — accept a manifest JSON, validate it against `schemas/agent-manifest.schema.json`, and enforce the semantic rules in `spec/manifest.md` § "Required fields" and § "Semantic Validation Rules" (`reference.md` §13).
14
+ 2. **Canonical hashing** — given a manifest, produce the same `identity.overall_hash` as the Python reference for any input. The algorithm is JCS-SHA256 (RFC 8785) applied to the `contract` block as documented in [`spec/hashing.md`](spec/hashing.md). Quantization of `generation_config` floats is part of the spec.
15
+ 3. **Diff** — given two manifests, produce a `manifest_diff` that matches the expected output of the conformance suite (described below).
16
+ 4. **Compatibility classification** — given a `manifest_diff`, produce a `compatibility_report` whose `recommended_decision` matches the reference implementation's output for the same input.
17
+
18
+ Implementations may add additional capabilities (e.g. signing, registry resolution), but those are extensions and do not affect conformance.
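To make the canonical hashing requirement concrete, here is a minimal Python sketch, not the reference implementation. Several things in it are assumptions: that `generation_config` sits directly under `contract`, that quantization rounds to the nearest step with ties to even, the step values (0.1 for `temperature`, 0.05 for `top_p`, taken from the changelog), and that the hash is a bare hex digest. The `json.dumps` call only approximates RFC 8785: JCS prescribes ECMAScript number formatting and UTF-16 code-unit key ordering, which Python's encoder does not guarantee, so `spec/hashing.md` remains the authority.

```python
import copy
import hashlib
import json
from decimal import ROUND_HALF_EVEN, Decimal

# Assumed quantization steps (from the changelog); the rounding mode is a guess.
QUANT_STEPS = {"temperature": Decimal("0.1"), "top_p": Decimal("0.05")}


def quantize(value, step):
    """Snap value to the nearest multiple of step, using exact decimal math."""
    units = (Decimal(str(value)) / step).quantize(Decimal("1"), rounding=ROUND_HALF_EVEN)
    return float(units * step)


def canonical_json(obj):
    """Approximate JCS: sorted keys, no insignificant whitespace, UTF-8 bytes."""
    text = json.dumps(obj, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")


def contract_hash(manifest):
    """SHA-256 hex digest over the quantized, canonicalized `contract` block."""
    contract = copy.deepcopy(manifest["contract"])  # never mutate the caller's data
    gen = contract.get("generation_config")  # hypothetical location of the block
    if isinstance(gen, dict):
        for key, step in QUANT_STEPS.items():
            if isinstance(gen.get(key), (int, float)):
                gen[key] = quantize(gen[key], step)
    return hashlib.sha256(canonical_json(contract)).hexdigest()
```

Under these assumptions, two manifests that differ only in key order or in a sub-step float tweak (for example `temperature` 0.70 versus 0.72) hash identically, which is the point of quantizing before hashing.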
19
+
20
+ ## The conformance suite
21
+
22
+ Located under [`compatibility-tests/`](./compatibility-tests/). Each subdirectory is a scenario:
23
+
24
+ ```
25
+ compatibility-tests/
26
+ tool-rename/
27
+ before.json # input manifest A
28
+ after.json # input manifest B
29
+ expected-diff.json # ManifestDiff produced by a conforming implementation
30
+ output-schema-change/
31
+ before.json
32
+ after.json
33
+ expected-diff.json
34
+ subagent-handoff-change/
35
+ before.json
36
+ after.json
37
+ expected-diff.json
38
+ ```
39
+
40
+ The Python reference verifies conformance via `tests/test_conformance.py`. To verify an implementation in another language:
41
+
42
+ 1. For each scenario, load `before.json` and `after.json`.
43
+ 2. Run your implementation's diff function.
44
+ 3. Compare your output against `expected-diff.json`.
45
+ 4. The comparison must tolerate any ordering of items inside `changed_surfaces` (compare them as a set keyed on `(surface, change_type, severity)`), but the counts in `summary` and the set of surfaces with their `change_type`/`severity` must match exactly.
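The four steps above can be sketched as a small harness. This is an illustrative outline, not the contents of `tests/test_conformance.py`: `iter_scenarios` and `run_suite` are hypothetical names, `diff_fn` stands in for your implementation's diff function, and `matches` is whatever comparison you plug in. It is written in Python for brevity; the same loop carries over to any language.

```python
import json
from pathlib import Path
from typing import Callable, Iterator

FIXTURES = ("before.json", "after.json", "expected-diff.json")


def iter_scenarios(root: Path) -> Iterator[tuple[str, dict, dict, dict]]:
    """Yield (name, before, after, expected) for each complete scenario directory."""
    for scenario in sorted(p for p in root.iterdir() if p.is_dir()):
        paths = [scenario / name for name in FIXTURES]
        if not all(path.is_file() for path in paths):
            continue  # skip directories that lack one of the three fixtures
        before, after, expected = (
            json.loads(path.read_text(encoding="utf-8")) for path in paths
        )
        yield scenario.name, before, after, expected


def run_suite(
    root: Path,
    diff_fn: Callable[[dict, dict], dict],
    matches: Callable[[dict, dict], bool],
) -> list[str]:
    """Return the names of scenarios whose computed diff does not match."""
    return [
        name
        for name, before, after, expected in iter_scenarios(root)
        if not matches(diff_fn(before, after), expected)
    ]
```

An empty list from `run_suite(Path("compatibility-tests"), my_diff, my_matches)` would mean every scenario passed under the comparison you supplied.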
46
+
47
+ ## Adding scenarios
48
+
49
+ When the spec gains new behavior, add a new scenario directory with a `before.json`, `after.json`, and `expected-diff.json` produced by the reference implementation. Open a PR that includes both the new scenario and any code changes required to pass it.
50
+
51
+ When existing semantics change, update the expected diffs in the same PR that changes the implementation. Both the implementation change and the fixture change must be reviewed together.
52
+
53
+ ## What "matches" means
54
+
55
+ The reference comparison (see `tests/test_conformance.py`):
56
+
57
+ - `kind == "manifest_diff"`
58
+ - `old_manifest_id` and `new_manifest_id` match
59
+ - The set of `(surface, change_type, severity)` tuples in `changed_surfaces` is identical
60
+ - `summary.breaking_surfaces` and `summary.non_breaking_surfaces` counts match exactly
61
+
62
+ Things explicitly **not** part of conformance (intentionally tolerant):
63
+
64
+ - Order of items within `changed_surfaces` or `details` arrays
65
+ - Exact wording of human-readable strings in `details` (these are advisory, not contractual)
66
+ - `max_severity` field — derived; an implementation may omit or include it freely
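As a rough illustration of these rules, the comparison could look like the Python sketch below. It is not the code in `tests/test_conformance.py`; `surface_keys` and `diffs_match` are hypothetical names, and the sketch assumes each entry in `changed_surfaces` is an object carrying `surface`, `change_type`, and `severity` keys.

```python
def surface_keys(diff: dict) -> set[tuple[str, str, str]]:
    """Order-insensitive view of changed_surfaces; `details` is ignored on purpose."""
    return {
        (s["surface"], s["change_type"], s["severity"])
        for s in diff.get("changed_surfaces", [])
    }


def diffs_match(actual: dict, expected: dict) -> bool:
    """Apply the four contractual checks; everything else is tolerated."""
    if actual.get("kind") != "manifest_diff":
        return False
    for field in ("old_manifest_id", "new_manifest_id"):
        if actual.get(field) != expected.get(field):
            return False
    if surface_keys(actual) != surface_keys(expected):
        return False
    actual_summary = actual.get("summary", {})
    expected_summary = expected.get("summary", {})
    for counter in ("breaking_surfaces", "non_breaking_surfaces"):
        if actual_summary.get(counter) != expected_summary.get(counter):
            return False
    return True  # max_severity and wording in details are never inspected
```

Because only the listed fields are read, an implementation that reorders `changed_surfaces`, rewords `details`, or omits `max_severity` still passes.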
@@ -0,0 +1,68 @@
1
+ # Contributing to AgentVersion
2
+
3
+ Thanks for your interest. AgentVersion is intended to be a stable, infrastructure-grade specification, so changes follow a slower and more deliberate process than typical libraries.
4
+
5
+ ## How to propose a change
6
+
7
+ 1. **Open an issue first** describing the problem. Don't open a PR before there's agreement that the change is desirable — spec evolution requires consensus.
8
+ 2. **For non-trivial changes, write an ADR** under `adrs/NNNN-<slug>.md` using `adrs/0000-template.md`. ADRs capture the *why*. The spec docs in `spec/` capture the *what*.
9
+ 3. **Reference the ADR from the PR.** ADRs may be amended or superseded but never deleted; they are the design log.
10
+
11
+ ## Spec evolution rules
12
+
13
+ The `spec_version` follows [Semantic Versioning 2.0.0](https://semver.org/). Pre-1.0 (where we are now) is unstable; after 1.0 the rules below apply.
14
+
15
+ **Allowed in a minor bump (1.x.0):**
16
+
17
+ - Adding new optional fields to any object
18
+ - Adding new values to enums (`step_type`, `reason_code`, `decision` verbs, etc.)
19
+ - Adding new `kind` values (introducing new spec objects)
20
+ - Adding new `$defs` to JSON Schemas
21
+
22
+ **Requires a major bump (x.0.0):**
23
+
24
+ - Removing or renaming any field
25
+ - Making an optional field required
26
+ - Changing field types or value semantics
27
+ - Changing the canonical hashing algorithm
28
+ - Removing enum values
29
+ - Changing the `overall_hash` derivation
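The field-level rules in the two lists above can be approximated mechanically. The Python sketch below is a hypothetical helper, not part of the project: it assumes each object is described as a mapping from field name to a small spec such as `{"type": "string", "required": False}`. Two behaviours are assumptions beyond the lists: a newly added *required* field is treated as major, and any change to an existing field's spec is conservatively treated as major. Renames appear as a removal plus an addition, and hashing or semantic changes cannot be detected this way.

```python
def classify_bump(old: dict, new: dict) -> str:
    """Return 'major', 'minor', or 'none' for a change between two field maps."""
    if set(old) - set(new):
        return "major"  # a field was removed (or renamed)
    for name, spec in old.items():
        if new[name] != spec:
            return "major"  # type changed, optional became required, etc.
    added = set(new) - set(old)
    if any(new[name].get("required", False) for name in added):
        return "major"  # assumption: new required fields break old manifests
    return "minor" if added else "none"
```

For example, adding an optional `description` field would come back as `"minor"`, while flipping an existing field from optional to required would come back as `"major"`.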
30
+
31
+ When you propose a breaking change, your PR must include:
32
+
33
+ - The ADR explaining the motivation
34
+ - An entry in `CHANGELOG.md` under the next major version
35
+ - A migration note in `spec/versioning-policy.md`
36
+ - Updates to the conformance fixtures (`compatibility-tests/`) so existing implementations can verify their migrations
37
+
38
+ ## Code conventions
39
+
40
+ - Python ≥ 3.10, strict typing (`mypy --strict`).
41
+ - Pydantic v2 models are the source of truth for serialization; JSON Schemas mirror them.
42
+ - Tests live under `tests/`. Conformance fixtures live under `compatibility-tests/`.
43
+ - One canonical example per concept under `examples/`. Recompute hashes (`agentversion hash <file>`) whenever you change a manifest's `contract` block.
44
+
45
+ ## What to test
46
+
47
+ Every PR should run:
48
+
49
+ ```bash
50
+ pip install -e ".[dev]"
51
+ pytest # full suite, including conformance scenarios
52
+ ruff check .
53
+ mypy agentversion
54
+ ```
55
+
56
+ The conformance scenarios (`tests/test_conformance.py`) are non-negotiable. If your change breaks them, either the scenario is stale (update it) or your change breaks compatibility (in which case it is a major bump, not a minor one).
57
+
58
+ ## Releases
59
+
60
+ The maintainer cuts releases. The flow is:
61
+
62
+ 1. PR with the version bump in `pyproject.toml` + a `CHANGELOG.md` entry.
63
+ 2. Tag `vX.Y.Z` on `main`.
64
+ 3. CI publishes to PyPI via trusted publishing.
65
+
66
+ ## License
67
+
68
+ By contributing, you agree your contribution is licensed under [Apache 2.0](./LICENSE), the same as the project.