@intentsolutions/audit-harness 1.2.1 → 1.2.3

package/CHANGELOG.md CHANGED
@@ -1,553 +1,369 @@
1
1
  # Changelog
2
2
 
3
- All notable changes are recorded here. Format follows [Keep a Changelog](https://keepachangelog.com/en/1.1.0/) and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
3
+ All notable changes to `@intentsolutions/audit-harness` are documented here. The
4
+ format follows [Keep a Changelog](https://keepachangelog.com/en/1.1.0/) and this
5
+ project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
4
6
 
5
7
  ## [Unreleased]
6
8
 
7
- _Nothing yet._
9
+ > **Riding a future v2.1 routine release (descoped from 1.2.0):** OTel event-name
10
+ > polish (iah-E07b/c). The `agent.rollout.gate.evaluated` and `gate.decision.emitted`
11
+ > event names are already locked + tested on main (PRs #78, #81 per NORMATIVE
12
+ > `intent-eval-lab/000-docs/067-AT-SPEC`). Any further attribute-schema polish on
13
+ > those events is deferred to a routine v2.1 release rather than headlined here — it
14
+ > is additive telemetry refinement, not a 1.2.0 capability boundary.
8
15
 
9
- ### Riding a future v2.1 routine release (descoped from 1.2.0)
16
+ ## [1.2.3] - 2026-06-20
10
17
 
11
- - **OTel event-name polish (iah-E07b/c).** The `agent.rollout.gate.evaluated` and `gate.decision.emitted` event names are already locked + tested on main (PRs #78, #81 per NORMATIVE `intent-eval-lab/000-docs/067-AT-SPEC`). Any further attribute-schema polish on those events is deferred to a routine v2.1 release rather than headlined here — it is additive telemetry refinement, not a 1.2.0 capability boundary.
18
+ A patch release shipping a correctness fix to the CLI `emit-evidence` command. No
19
+ CLI surface changes, no new commands — the evidence emitter now produces kernel-valid
20
+ output where it previously did not.
12
21
 
13
- ## [1.2.1] - 2026-06-16
14
-
15
- A patch release: release-pipeline supply-chain hardening (polyglot signing) plus
16
- dev-dependency bumps. No CLI surface, runtime behavior, or API boundary changes —
17
- the published artifacts are byte-identical in behavior to 1.2.0; only the release
18
- machinery and dev tooling moved.
19
-
20
- ### Changed — polyglot release signing wired into the publish pipeline (#90)
21
-
22
- - **crates.io build-provenance attestation.** The `publish-crates` leg now emits a
23
- GitHub build-provenance attestation for the published crate artifact, extending the
24
- signed-supply-chain guarantee to the Rust distribution.
25
- - **sigstore-python wheel + sdist signing.** The `publish-pypi` leg now signs the built
26
- wheel and sdist with `sigstore-python` (keyless Fulcio OIDC + Rekor), so the PyPI
27
- distribution carries verifiable provenance alongside the existing npm sigstore path.
28
- - **crates.io publish is now active.** With `CARGO_REGISTRY_TOKEN` provisioned as a
29
- repository secret, the `publish-crates` leg goes live on this tag — closing the
30
- polyglot publish loop (npm + PyPI + crates.io all publish + sign from one tag).
31
-
32
- ### Changed — dev-dependency bumps
33
-
34
- - Bump `eslint` from 9.39.4 to 10.5.0 (#71).
35
- - Bump `jeremylongshore/intent-rollout-gate` GitHub Action pin (#86).
36
- - Bump `crate-ci/typos` from 1.29.4 to 1.47.2 (#87).
37
-
38
- ## [1.2.0] - 2026-06-15
39
-
40
- A minor release: the read-only "comprehensive audit, on any repo" brain (`classify` → `conform` → `audit` → `scan` → `currency`), the kernel-emitting evidence path (`emit-evidence` Evidence Bundle, E04), the provider credential gate (`cred-gate`, E08), shared vendorable lint configs (#85), and a golden-master fitness function — all additive, with the zero-runtime-dependency guarantee preserved.
41
-
42
- ### Release narrative (what shipped since 1.1.8)
43
-
44
- - **`emit-evidence` Evidence Bundle emitter (E04).** The CI-only signed-evidence path emits the harness's own deterministic self-gate as a kernel `gate-result/v1` row inside an `EvidenceBundle`, cosign-signs the canonical bytes (Fulcio OIDC + Rekor), and publishes a `report-manifest.json` the dashboard re-verifies at ingest. Detail under "CI-only signed evidence emit" below.
45
- - **Provider credential gate (`cred-gate`, E08).** A new gate that asserts provider credentials PASS/FAIL with full redaction + spillover coverage (`scripts/cred-gate.sh`, fixtures via PR #80).
46
- - **Shared, vendorable lint configs (#85).** `.audit-harness-configs/` (markdownlint / yamllint / ruff / shellcheck) is the canonical config set the IEP repos vendor + extend; `install.sh` now vendors both `scripts/` and `configs/`.
47
- - **Dogfood AAR (iah-E10d).** First-downstream-adopter run captured at `000-docs/013-AA-AACR-rollout-gate-dogfood-iah-E10-2026-06-15.md`.
48
-
49
- ### Apache-2.0 §4(d) NOTICE obligation — satisfied
50
-
51
- `NOTICE` is present at the repo root, listed in `package.json#files` (ships in the npm tarball), included in the Python sdist + Rust crate distributions, AND vendored into `.audit-harness/` by `install.sh` (see "`install.sh` vendors NOTICE" below). The §4(d) attribution-travels-with-distribution obligation holds across npm, PyPI, crates.io, and the vendored-install path.
52
-
53
- ### Why minor, not patch
54
-
55
- Multiple new CLI verbs (`classify`, `conform`, `audit`, `scan`, `currency`, `cred-gate`) and new authored feature surfaces (shared lint configs, golden-master suite, the CI-only evidence emit). Per SemVer this is a minor bump. No CLI command was renamed or removed; the change is purely additive and the published tarball stays zero-runtime-dependency.
56
-
57
- ### Added — golden-master suite for gherkin-lint + crap-score stdout shapes (iah-golden-master)
58
-
59
- A fitness function that pins the raw stdout of the two scorers whose output is a downstream contract.
60
-
61
- - **`tests/golden/run-golden.sh`** captures `gherkin-lint.sh` (text rubric) and `crap-score.py --json` (gate-result envelope) stdout against a `tests/fixtures/deliberate-failure/` corpus and diffs each against a checked-in golden, failing on any drift. Environment-volatile bytes are normalized out (gherkin-lint's installed-vs-awk-fallback first line; crap-score's absolute `summary_path`) so the golden is byte-stable across machines. CI installs no complexity provider, so the crap golden captures the deterministic no-provider envelope shape.
62
- - **Why this and not the per-row schema gate:** the schema gate validates the *augmented* predicate that `emit-evidence` produces, not the raw scorer stdout. A silent reshape of the scorer stdout — a renamed field, a dropped WARN line, changed summary wording — is a backward-compat break the schema gate cannot see. This suite is that missing guard.
63
- - Regenerate intentional changes with `bash tests/golden/run-golden.sh --update` and review the golden diff in the PR. Wired into `.github/workflows/ci.yml` as the `golden` job.
64
-
65
- ### Changed — `install.sh` vendors NOTICE + the Node dispatcher (iah-install-sh-completeness)
66
-
67
- The vendored-install path (non-Node repos) now ships a complete, traceable copy.
68
-
69
- - **`NOTICE`** is copied into `.audit-harness/` — Apache-2.0 §4(d) requires the NOTICE file to travel with any distribution, and vendoring is a distribution.
70
- - **`bin/audit-harness.js`** (the Node CLI dispatcher) and **`package.json`** are copied into `.audit-harness/bin/` + `.audit-harness/` so the canonical dispatcher surface is present and its `--version` (which reads `../package.json`) resolves in the vendored tree.
71
- - A **`PROVENANCE`** file records the source repo, version, tarball URL, and install timestamp so a vendored tree is traceable back to the exact release it came from.
72
-
73
- ### Added — CI-only signed evidence emit for the intent-eval-dashboard (nr75.12)
74
-
75
- The dashboard reports hub (labs.intentsolutions.io) ingests a signed `report-manifest.json` of kernel `gate-result/v1` rows per repo. This adds audit-harness's own emit, lighting up its row.
76
-
77
- - **`ci/emit-evidence.ts` + `ci/assemble-manifest.ts`** — run the real deterministic self-gate (`harness-hash --verify`), shape it into a kernel `gate-result/v1` + `EvidenceBundle` (fail-closed against `@intentsolutions/core`), cosign-sign the canonical bytes (Fulcio OIDC + Rekor), and assemble the manifest the dashboard re-verifies at ingest.
78
- - **Zero-dep guarantee preserved.** The emitter lives in `ci/` (excluded from `package.json#files`) and the kernel is installed CI-only via `npm i --no-save` — `dependencies` + `devDependencies` stay empty and the published tarball is unchanged (verified via `npm pack --dry-run`).
79
- - **`.github/workflows/release.yml`** — adds a GitHub Release on tag push + an `emit-evidence` job (tag-only) that publishes the manifest as a Release asset.
80
-
81
- ### Added — `currency` advisory upstream-currency report (PP-PLAN-040 Phase 5 / E7)
82
-
83
- The fifth verb, and deliberately the weakest: an advisory report with no exit-code authority.
84
-
85
- - **`audit-harness currency`** (`scripts/currency.py`, stdlib): reads the per-upstream-identity pin relation (`schemas/currency/pins.v1.json`) and reports which pins are themselves **stale** — `checked_at` older than the pin's staleness window. Each upstream (mcp-spec, skill-md-schema, claude-code, gate-result-predicate, anthropic-sdk, agentskills-spec) carries its own `pinned_version` + `checked_at` + window, so the *pin's own staleness* is detectable (not one opaque scalar).
86
- - **No exit-code authority (always exit 0), no live-fetch, no auto-fix.** Currency depends on upstream state — non-deterministic and network-bound — so it only reports. `/sync-testing-harness` consumes the report to open advisory bump PRs; it never reddens a build. `--today YYYY-MM-DD` makes reports reproducible.
87
- - **`tests/currency/`**: golden suite (3 checks) — stale/current/unknown classification, the no-exit-authority guarantee (exit 0 even when all pins are stale), and the shipped relation reporting.
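
The staleness rule described for `currency` can be sketched roughly as below. This is an illustration only, not the code in `scripts/currency.py`: the `window_days` field name and the `classify_pin` helper are assumptions made for the example (the changelog names only `pinned_version`, `checked_at`, and "a window"), and the `today` argument stands in for the `--today YYYY-MM-DD` flag.

```python
from datetime import date


def classify_pin(pin: dict, today: date) -> str:
    """Classify one upstream pin as 'stale', 'current', or 'unknown'.

    `window_days` is a hypothetical field name for the pin's staleness window.
    """
    checked_at = pin.get("checked_at")
    window_days = pin.get("window_days")
    if checked_at is None or window_days is None:
        # Not enough data to judge the pin; report it rather than guess.
        return "unknown"
    age_days = (today - date.fromisoformat(checked_at)).days
    # "Stale" means checked_at is older than the window allows.
    return "stale" if age_days > window_days else "current"


def report(pins: dict, today: date) -> dict:
    """Advisory report: maps each upstream name to its classification."""
    return {name: classify_pin(pin, today) for name, pin in pins.items()}
```

Because the report only classifies and never raises on a stale pin, a caller can always exit 0, which matches the no-exit-code-authority guarantee described above.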
88
-
89
- ### Added — `scan` security/hygiene/skill-quality gate-runner (PP-PLAN-040 Phase 4 / E6)
90
-
91
- The fourth read-only verb: security + hygiene + skill-quality, by orchestrating standard tools (never reimplementing them).
92
-
93
- - **`audit-harness scan [repo]`** (`scripts/scan.py`, stdlib): for every `dimension: security | hygiene | skill-quality` gate in the profile, emits a `gate-result/v1` row. Three strategies: **local** (`hygiene-readme` README presence — deterministic), **shell-out** (every gate carrying a `tool` — gitleaks / osv-scanner / semgrep / syft / markdownlint / lychee — clean exit → PASS, findings → ADVISORY(error), tool absent → ADVISORY indeterminate), **consume** (`skill-behavioral` ingests a j-rig Evidence Bundle verdict via `--jrig-verdict`; the harness never runs behavioral judgment itself — no verdict → indeterminate).
94
- - Advisory-first; `--strict` (or a blocking gate) turns a finding/gap into `FAIL`. Kill-switch → `[]`. Each row records `metadata.method` (`local-presence` / `shell-out` / `consume-j-rig`).
95
- - **`tests/scan/`**: golden suite (10 checks) with pinned-profile isolation so shell-out tool availability never makes the suite flaky.
96
-
97
- **Security note:** on first run this gate caught — and this release redacts from HEAD — a PyPI publish token that had been pasted as a literal value in `python/PUBLISH.md`. The value remains in git history; it must be rotated at the registry (tracked separately). The doc now carries a placeholder.
98
-
99
- ### Added — `audit` testing-depth gate-runner (PP-PLAN-040 Phase 3 / E5)
100
-
101
- The third read-only verb: the "finish the pyramid" testing-depth diagnostic.
102
-
103
- - **`audit-harness audit [repo]`** (`scripts/audit.py`, stdlib): for every `dimension: testing-depth` gate in the profile, assesses the gate and emits a `gate-result/v1` row. Two read-only strategies: `crap-score` runs the bundled `crap` scorer (static complexity×coverage); every pyramid layer (unit/integration/e2e/smoke/perf/a11y/contract/migration/property-based/fuzz/sanitizers) gets a per-layer **presence heuristic** (test dirs, framework configs, dependency markers). Layer present → `PASS`; absent → `ADVISORY(warn)` testing-depth gap; not statically assessable → `ADVISORY` indeterminate.
104
- - **`--fast` (default)** presence heuristics only (<10s); **`--deep`** adds `crap-score`; **`--strict`** turns a gap on a blocking gate into `FAIL`. Kill-switch → `[]`. Each row records `metadata.method` (`crap-static` / `presence-heuristic` / `delegated`) for provenance.
105
- - **Deliberately does NOT execute the repo's test suite.** Running arbitrary untrusted suites is the repo's own CI's job; the harness reports coverage *presence* and the repo's CI test step produces the execution verdict. `audit` is the diagnostic, not the test runner.
106
- - **`tests/audit/`**: golden suite (7 checks) + `has-tests`/`no-tests` fixtures — asserts unit→PASS / gap→ADVISORY(default) / gap→FAIL(`--strict`), crap deep-only-in-fast, kill-switch, and gate-result/v1 validity. CI `audit` job.
107
-
108
- ### Added — registry projection + FP-rate harness (PP-PLAN-040 Phase 0 completion: c2b + c2e)
109
-
110
- Closes the data/safety-spine epic (E2): the registry becomes the single canonical datum and gate promotion gets a measured bar.
111
-
112
- - **`audit-harness gen-layer-applicability`** (`scripts/gen-layer-applicability.py`): projects `schemas/audit-profile/registry.v1.json` into `schemas/audit-profile/layer-applicability.md`. `--write` regenerates; `--check` fails on drift. The doc is now a **projection** of the registry datum, not a hand-maintained parallel source — CI gate `layer-applicability-drift` enforces it (c2b).
113
- - **`audit-harness fp-rate`** (`scripts/fp-rate.py`): measures each gate's false-positive / false-negative rate over a labeled corpus (`tests/fixtures/conform/{valid,malformed}/`). This is the metric that gates advisory→blocking promotion. `--max-fp-rate X` exits 1 if any gate exceeds the bar; CI runs it advisory at the 5% default bar (c2e).
114
- - **`docs/gate-promotion.md`**: the dedicated advisory→blocking promotion rule — FP-rate ≤ 5% bar, engineer-pinned in `tests/TESTING.md`, re-pinned manifest. Documents *why* FP-rate (not FN-rate) is the gate and how demotion/kill-switch works. `docs/` now ships in the npm package (`files`).
115
-
116
- ### Added — `conform` verb + bundled content-addressed schemas (PP-PLAN-040 Phase 2)
117
-
118
- The second piece of the read-only brain: deterministic conformance, emitting Evidence Bundle rows.
119
-
120
- - **`audit-harness conform [repo]`** (`scripts/conform.py`, stdlib + PyYAML): read-only conformance gate-runner. For every `dimension: conformance` gate in the repo's `audit-profile/v1`, it locates the artifact(s) and emits a `gate-result/v1` row (JSON array, stdout). **Never writes, never live-fetches.**
121
- - **Bundled content-addressed schemas** (`schemas/conform/v1/`): `skillmd-frontmatter`, `mcp-config`, `plugin-manifest`, `agent-frontmatter` — the deterministic *structural floor* (parses + required keys + types), distinct from the IS 100-point rubric / SAK authoring kernel (judgment, stays in `/validate-*`). conform records each schema's sha256 in the row's `policy_hash`, so a row re-verifies against the exact schema version that produced it.
122
- - **Reproducible-by-design engine.** Bundled JSON-Schemas are checked by an embedded subset validator (complete for the closed bundled schemas) rather than ajv — deliberately, because ajv's availability/version varies per machine and would make signed evidence non-reproducible. Same commit + same harness version produce an identical verdict.
123
- - **Genuinely-external formats shell out**: OpenAPI to `spectral`, GitHub Action to `yamllint`. Missing tool produces an `ADVISORY` indeterminate (never a false `FAIL`).
124
- - **Advisory-first.** A conformance violation on an `enforcement: advisory` gate is `ADVISORY` (severity `error`), exit 0 — logged, not blocking. `--strict` (or an engineer-promoted `enforcement: blocking` gate) turns a violation into `FAIL` (exit 1). Missing artifact produces `NOT_APPLICABLE`. Kill-switch (`AUDIT_HARNESS_DISABLE=1` / `.audit-harness.yml`) produces an empty `[]`, exit 0.
125
- - **`tests/conform/`**: golden suite (31 checks) + pass/fail fixtures (valid + malformed SKILL.md, .mcp.json, plugin manifest, agent) — asserts valid to PASS, malformed to ADVISORY (default) / FAIL (`--strict`), every row validates against `gate-result/v1`, the NOT_APPLICABLE + indeterminate paths, and `policy_hash` == bundled-schema sha256 + reproducible. Wired into CI (`conform` job).
126
-
127
- Scope boundary: conformance kinds without a bundled schema (`marketplace`, `hook`) resolve to `ADVISORY` indeterminate — drop a schema into `schemas/conform/v1/` to light them up, no code change. No gate *execution* for testing-depth/security yet (Phase 3+).
128
-
129
- ### Added — `classify` verb + `audit-profile/v1` (PP-PLAN-040 Phase 0 + Phase 1)
130
-
131
- The first piece of the "comprehensive audit, on any repo" build: the read-only brain.
132
-
133
- - **`audit-profile/v1` schema** (`schemas/audit-profile/v1.schema.json`): closed, versioned, hash-bearing value mirroring `gate-result/v1`. Four invariants: classifications are a UNION (not a winner), `unresolved[]` is the only Claude-refinable surface, `waived ⇒ disabled` (allOf-enforced), `registry_hash` makes a profile reproducible.
134
- - **Canonical dimension→gate registry** (`schemas/audit-profile/registry.v1.json`): the single datum that answers "which gates apply to repo-type X, in which dimension, at what applicability" — `layer-applicability.md` and `TESTING.md` become projections of it.
135
- - **`audit-harness classify [repo]`** (`scripts/classify.py`, stdlib-only): read-only repository classifier. Detects the UNION of repo-type + Claude-artifact classifications, resolves the gate set against the registry, records `registry_hash`, and emits an `audit-profile/v1` value to stdout. **Never writes to the repo.**
136
- - **Safety levers**: `INDETERMINATE` result class (infra failure ≠ policy failure); dispatcher per-command supervision via `AUDIT_HARNESS_TIMEOUT` (kill a hung gate, exit 124); `AUDIT_HARNESS_DISABLE=1` kill-switch (gate commands no-op; classify emits an all-disabled profile); engineer-owned `.audit-harness.yml` override (`classify_pins`, `advisory`, `disable_gates`, `disable`) — see `.audit-harness.example.yml`.
137
- - **`tests/classify/`**: golden fixture corpus (6 fixtures, authored before the classifier) + suite — golden-matches classifications, schema-validates every profile, exercises the kill-switch, the unknown/unresolved path, and override honoring. Wired into CI (`classify` job).
138
- - **`schemas/` now ships in the npm package** (`files`) so the registry + schema are available to consumers on any repo.
139
-
140
- Scope boundary: no `conform` verb, no gate execution yet (Phase 2+). `classify` is read-only and emits a profile only.
141
-
142
- ## [1.1.8] - 2026-06-18
143
-
144
- Ships the iah-E06 production-signing pre-flight gate to downstream consumers.
145
-
146
- ### Added — DNSSEC + CAA production-signing pre-flight (iah-E06)
147
-
148
- Before a production-mode `emit-evidence` run signs canonical bytes, two deterministic pre-flight scripts assert the signing domain is cryptographically sound. Both fail closed: any error, missing record, or unreachable resolver blocks the signing path rather than emitting an unverifiable attestation.
149
-
150
- - **`scripts/dnssec-check.sh`** — verifies the signing domain's DNSSEC chain is present and validates.
151
- - **`scripts/caa-check.sh`** — verifies the domain's CAA records authorize the signing certificate authority.
152
- - The `emit-evidence` production path gates on both before signing; staging/draft emit is unaffected.
153
-
154
- ### Fixed — query a trusted validating resolver in the DNSSEC + CAA pre-flight (PR #75)
155
-
156
- The pre-flight previously trusted the ambient resolver, which may not validate DNSSEC. Both scripts now query known validating resolvers (`1.1.1.1`, `8.8.8.8`) and require the authenticated-data (AD) flag plus an `RRSIG` on the answer. A resolver that does not set AD, or an answer with no RRSIG, is treated as a validation failure (fail-closed) rather than a pass.
157
-
158
- ### Changed — Version bumped to 1.1.8 across all manifests
159
-
160
- Per the `version-canonical-check` CI gate. `package.json` (canonical), `version.txt`, `python/pyproject.toml`, `python/src/intent_audit_harness/__init__.py`, and `rust/Cargo.toml` all report `1.1.8`.
161
-
162
- ### Why patch, not minor
163
-
164
- The pre-flight scripts shipped to the repo in earlier PRs (#70, #75); this patch propagates them to npm consumers via a version bump. No new public CLI commands or flag changes in this release boundary.
165
-
166
- ## [v1.1.5] - 2026-06-03
22
+ ### Fixed
167
23
 
168
- ### Added npm release pipeline (closes the publish-pipeline gap)
24
+ - **`emit-evidence` now emits kernel-valid `gate-result/v1` predicate bodies (#103).**
25
+ The CLI `emit-evidence` wrapped gate rows in an in-toto Statement declaring
26
+ `predicateType: https://evals.intentsolutions.io/gate-result/v1`, but the predicate
27
+ body carried the legacy draft envelope (`result`/`timestamp`), which fails
28
+ `@intentsolutions/core`'s `GateResultV1Schema` (it forbids additional properties) —
29
+ so a downstream `intent-rollout-gate` rejected the bundle. The emitter now builds the
30
+ canonical body (`gate_decision`, `gate_name`, `gate_version`, `gate_reasons`,
31
+ `coverage`, `policy_ref`, `evaluated_at`), bringing the general-purpose CLI path to
32
+ parity with the internal `ci/emit-evidence.ts` self-gate (which already emitted
33
+ kernel-valid rows). The post-emit predicate is now validated against a full-kernel
34
+ fixture (`tests/fixtures/gate-result-v1.schema.json`); the partial input-envelope
35
+ fixture stays for the gate emitters' raw rows. Surfaced by the first external-adopter
36
+ convergence run; verified `conform | emit-evidence` → 9/9 kernel-valid →
37
+ `intent-rollout-gate` decision `block → allow`.
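
To illustrate why the legacy envelope failed a schema that forbids additional properties, here is a minimal key-level check. It is a sketch under assumptions, not `GateResultV1Schema` itself: it treats all seven canonical field names listed above as required (the changelog does not say which are optional) and it ignores value types entirely. The `check_predicate_keys` helper is hypothetical.

```python
# Canonical field names as listed in the 1.2.3 entry above.
CANONICAL_KEYS = frozenset({
    "gate_decision", "gate_name", "gate_version", "gate_reasons",
    "coverage", "policy_ref", "evaluated_at",
})


def check_predicate_keys(body: dict) -> list:
    """Return a list of key-level problems; an empty list means the keys are acceptable.

    Assumption: every canonical key is required and no other key is allowed.
    """
    present = set(body)
    errors = []
    for key in sorted(CANONICAL_KEYS - present):
        errors.append(f"missing required property: {key}")
    for key in sorted(present - CANONICAL_KEYS):
        errors.append(f"additional property not allowed: {key}")
    return errors
```

Under this check a legacy body such as `{"result": ..., "timestamp": ...}` is rejected twice over: both of its keys count as additional properties, and every canonical key is missing.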
169
38
 
170
- This is the first release published to npm via CI with Sigstore provenance. Until now the repo had **no release workflow** — npm was stuck at `0.1.0` while the code (and every other manifest) had advanced through `1.0.0` → `1.1.4`, four minors of CHANGELOG-documented work that never reached consumers. `npm install @intentsolutions/audit-harness` resolved to the stale `0.1.0` tarball.
39
+ ## [1.2.2] - 2026-06-16
171
40
 
172
- - **`.github/workflows/release.yml`** (NEW): mirrors the provenance approach of `intent-eval-core`'s release workflow, adapted for this zero-dependency polyglot CLI (no pnpm, no lockfile, no TS build, no coverage). Triggers on `push` of a `v*.*.*` tag and on `workflow_dispatch`. Sets `id-token: write` for npm/Sigstore OIDC. Verifies the pushed tag matches `package.json#version` (skipped on manual dispatch since there's no tag), runs the `node bin/audit-harness.js --version` self-check + the repo's `escape-scan.sh --staged` test script (non-blocking on no-staged-diff), then `npm publish --provenance --access public`. The `NPM_TOKEN` repo secret is already configured.
41
+ A patch release closing the polyglot publish loop. No CLI surface, runtime behavior,
42
+ or API boundary changes — only the release machinery moved. v1.2.1 published to npm
43
+ but failed PyPI (a twine bug) and crates.io (an account email-verification gate);
44
+ this release publishes all three registries cleanly.
173
45
 
174
- ### Fixed — package metadata + install.sh URLs for the `intent-audit-harness` repo rename
46
+ ### Fixed
175
47
 
176
- The GitHub repo was renamed `audit-harness` → `intent-audit-harness`, but the metadata still pointed at the old path.
48
+ - **twine now uploads only built distributions, not the `.sigstore.json` bundles (#92).** The `publish-pypi` leg's `twine upload` call is scoped to `dist/*.whl dist/*.tar.gz`, so the sigstore signature bundles emitted alongside the wheel + sdist are no longer passed to twine (which rejected them and failed the v1.2.1 PyPI publish).
49
+ - **crates.io publish goes live.** The account email-verification gate that blocked the v1.2.1 crates.io publish is now resolved, so the `publish-crates` leg publishes on this tag — closing the npm + PyPI + crates polyglot publish loop.
177
50
 
178
- - **`package.json`**: `homepage`, `repository.url`, and `bugs.url` repointed from `jeremylongshore/audit-harness` → `jeremylongshore/intent-audit-harness` (these render on npmjs.com).
179
- - **`python/pyproject.toml` + `rust/Cargo.toml`**: project-URL fields (Homepage / Repository / Issues / Changelog / documentation) repointed to the renamed repo — these render on PyPI and crates.io.
180
- - **`python/src/intent_audit_harness/__init__.py`**: docstring source-link repointed.
181
- - **`README.md`**: the `curl … install.sh` line + the two "Related" skill links repointed to the renamed repo.
182
- - **`install.sh`**: the `REPO=` variable, the usage-comment URLs at the top, and the re-run hint repointed; the default `VERSION` bumped from the stale `v0.1.0` → `v1.1.5`.
183
-
184
- ### Fixed — install.sh tarball-path glob broke after the rename
185
-
186
- The GitHub archive tarball unpacks as `<repo>-<version>/`, which became `intent-audit-harness-1.1.5/` after the rename. The unpack-dir detection used `find … -name 'audit-harness-*'`, and `-name` matches the basename with no implicit leading wildcard, so it matched **nothing** under the new prefix — every vendored install would have failed at "could not find unpacked dir". Changed the glob to `-name '*audit-harness-*'` (leading wildcard), which matches both the current `intent-audit-harness-*` name and legacy `audit-harness-*` tags. Verified against both directory names.
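
The basename-matching behavior behind this fix can be reproduced with Python's `fnmatch` module, used here as a stand-in for `find -name` (both apply shell-style glob matching to the whole name). The `matches` helper is illustrative only.

```python
from fnmatch import fnmatchcase

OLD_PATTERN = "audit-harness-*"    # glob used before the fix
NEW_PATTERN = "*audit-harness-*"   # glob with the leading wildcard


def matches(basename: str, pattern: str) -> bool:
    """Case-sensitive glob match against the full basename, no implicit wildcards."""
    return fnmatchcase(basename, pattern)


renamed = "intent-audit-harness-1.1.5"
legacy = "audit-harness-0.1.0"

# The old pattern must match from the first character, so the new prefix defeats it.
old_hits = [name for name in (renamed, legacy) if matches(name, OLD_PATTERN)]
# The leading wildcard absorbs the "intent-" prefix and also matches an empty prefix.
new_hits = [name for name in (renamed, legacy) if matches(name, NEW_PATTERN)]
```

Here `old_hits` contains only the legacy directory name, while `new_hits` contains both.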
187
-
188
- ### Added — README badge row
189
-
190
- npm-version, License Apache-2.0, and Sigstore-provenance shields under the H1 (mirrors the `intent-eval-core` badge row). The "Part of the Intent Eval Platform" cross-link line is preserved.
191
-
192
- ### Changed — Version bumped to v1.1.5 across all manifests
193
-
194
- Per the `version-canonical-check` CI gate (v1.0.2 PR #35). `package.json` (canonical), `version.txt`, `python/pyproject.toml`, `python/src/intent_audit_harness/__init__.py`, and `rust/Cargo.toml` all report `1.1.5`. (`rust/Cargo.lock` is gitignored; its working-tree entry is aligned for local cargo builds.)
195
-
196
- ### Why patch, not minor
197
-
198
- No new CLI commands, no new flags, no API change, no script behavior change. This is release-engineering + metadata: the publish pipeline that ships the existing `1.1.x` code, plus URL corrections for the repo rename, plus the install.sh glob fix. The pinned policy scripts (`.harness-hash`) are untouched.
199
-
200
- ### Verification
201
-
202
- - `npm pack --dry-run` → tarball contains `bin/`, `scripts/`, `README.md`, `LICENSE`, `NOTICE`, `CHANGELOG.md` per `package.json#files`
203
- - `node bin/audit-harness.js --version` → `1.1.5`
204
- - `bash -n install.sh` → exit 0; unpack-dir glob matches `intent-audit-harness-1.1.5` (and legacy `audit-harness-*`)
205
- - `bash scripts/harness-hash.sh --verify` → OK (no pinned files changed)
206
-
207
- ## [v1.1.4] - 2026-05-25
208
-
209
- ### Fixed — gherkin-lint.sh prev_blank print-every-line noise (IEP P3, Gemini #71 review chain)
210
-
211
- Closes `iah-gherkin-prev-blank-noise` (`bd_000-projects-o9q1`, P2). The third awk block in `scripts/gherkin-lint.sh` (the And-at-scenario-start checker) opened with a bare `prev_blank = 1` expression that awk interpreted as an always-true pattern with implicit `{ print }` default action — flooding stdout with every line of every feature file alongside the intentional ERROR printf. `prev_blank` was never USED anywhere in the awk script (verified via grep). Removed both touches: the top-level expression AND the assignment in the blank-line pattern (which was also unreachable for anything that mattered, since no downstream pattern read `prev_blank`). The third awk block now produces ONLY the targeted ERROR line when triggered. Verified via the same deliberate-failure test from v1.1.2 AAR — output before: full feature file printed interleaved with ERROR. Output after: just the ERROR line.
212
-
213
- ### Changed — gherkin-lint.sh process_awk_output() collapsed to single awk pass (Gemini #38 follow-up)
214
-
215
- Closes `iah-gherkin-single-awk-opt` (`bd_000-projects-vawm`, P3). v1.1.2 introduced `process_awk_output()` with two awk subprocesses per call (one counting WARN, one counting ERROR). v1.1.4 collapses to a single awk pass via `read -r w e < <(awk '/^WARN /{w++} /^ERROR /{e++} END {print w+0, e+0}' <<< "$out")` per Gemini PR #39 verbatim suggestion. Halves the awk fork count (4 callsites × 2 subprocesses = 8 awk processes/feature → 4). Verified with mixed WARN+ERROR test: 2 WARNs + 1 ERROR in one feature file produces summary `2 warning(s), 1 error(s)` and exit 1.
216
-
217
- ### Fixed — crap-score.py exclusion sets deduplicated via EXCLUDED_DIRS constant (Gemini #71 review)
218
-
219
- Closes `iah-crap-score-exclusion-dedup` (`bd_000-projects-niv8`, P2). Pre-v1.1.4, `scripts/crap-score.py` had TWO separate sets with overlapping intent but divergent contents:
220
-
221
- - `ignore` set in `score_python()` (line 85): had `"reports"` but lacked `.next`, `.nuxt`, `.cache`
222
- - `prune` set in `main()` (line 394, added v1.1.1 for `--json` input-hash walk): had `.next`, `.nuxt`, `.cache` but lacked `"reports"`
223
-
224
- Asymmetry was a real bug: a repo with `reports/` would skip score_python's candidate scan but its `.py` files DID get hashed by the input-hash walk; opposite for `.next/.nuxt/.cache`. Fixed by extracting a single module-level constant `EXCLUDED_DIRS` (union of both prior sets) referenced by both call sites. Set contents: `.git`, `.venv`, `venv`, `node_modules`, `__pycache__`, `dist`, `build`, `target`, `.tox`, `.mypy_cache`, `.pytest_cache`, `.next`, `.nuxt`, `.cache`, `reports`.
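
A minimal sketch of the single-constant pattern described above: one module-level set that every directory walk prunes against, so two call sites cannot drift apart. The set contents are taken from the entry; the `iter_python_files` helper is hypothetical and is not the actual `scripts/crap-score.py` code.

```python
import os

# Single source of truth for pruned directories (contents as listed in the entry above).
EXCLUDED_DIRS = frozenset({
    ".git", ".venv", "venv", "node_modules", "__pycache__", "dist", "build",
    "target", ".tox", ".mypy_cache", ".pytest_cache", ".next", ".nuxt",
    ".cache", "reports",
})


def iter_python_files(root: str):
    """Yield .py files under root, skipping every directory in EXCLUDED_DIRS."""
    for dirpath, dirnames, filenames in os.walk(root):
        # Assigning to the slice prunes in place, so os.walk never descends
        # into an excluded directory.
        dirnames[:] = sorted(d for d in dirnames if d not in EXCLUDED_DIRS)
        for name in sorted(filenames):
            if name.endswith(".py"):
                yield os.path.join(dirpath, name)
```

If both the candidate scan and the input-hash walk go through a helper like this, a file is either scored and hashed or neither, which removes the asymmetry the entry describes.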
225
-
226
- ### Changed — Shellcheck CI job version-pinned (parity with ruff v1.1.3)
227
-
228
- Closes `iah-shellcheck-version-pin` (`bd_000-projects-v1ds`, P3). v1.1.2 (Phase A1) installed shellcheck via `apt-get install -y shellcheck` which pulls whatever Ubuntu's runner-image version happens to ship (currently 0.9.0). When the runner image upgrades shellcheck to 0.10.x or later, new rules activate silently and could surface findings in already-merged code. v1.1.4 pins to `v0.10.0` via download from the koalaman/shellcheck GitHub releases. CI step prints `shellcheck --version` for audit trail. To bump: edit `SHELLCHECK_VERSION` env in the workflow + run `shellcheck scripts/*.sh` locally + commit as explicit PR. Matches the ruff version-pin pattern from v1.1.3.
229
-
230
- ### Changed — Version bumped to v1.1.4 across all 5 manifests
231
-
232
- Per the version-canonical-check CI gate (v1.0.2 PR #35). All 5 manifest locations now report `1.1.4`.
233
-
234
- ### Changed — `.harness-hash` regenerated
235
-
236
- `scripts/gherkin-lint.sh` + `scripts/crap-score.py` modified; both are pinned. 2 of 9 pinned-file hashes change.
237
-
238
- ### Why patch, not minor
239
-
240
- Pure cleanup release: dead-code removal, perf microoptimization, bug fixes for cross-call inconsistencies, CI version pin. No new CLI commands, no new flags, no API change. Consumers re-vendor / `pnpm up` and get the cleaner scripts + tighter CI transparently.
51
+ ### Changed
241
52
 
242
- ### Verification
53
+ - Release-preparation chore for v1.2.2 (#93).
243
54
 
244
- - `shellcheck scripts/*.sh` → exit 0 (local 0.9.0; CI will run pinned 0.10.0)
245
- - `ruff check` → `All checks passed!`
246
- - `bash -n scripts/*.sh` → all pass
247
- - `python3 -m py_compile scripts/crap-score.py + cli.py` → exit 0
248
- - `bash scripts/harness-hash.sh --verify` → OK after `--init`
249
- - gherkin-lint deliberate-failure test (And-at-start): exit 1, summary correct
250
- - gherkin-lint mixed test (2 WARN + 1 ERROR): summary `2 warning(s), 1 error(s)`, exit 1
251
- - Output noise gone: feature-file lines no longer printed alongside ERRORs
55
+ ## [1.2.1] - 2026-06-16
252
56
 
253
- AAR: `000-docs/009-AA-AACR-v1.1.4-cleanup-bundle-2026-05-25.md`.
57
+ A patch release: release-pipeline supply-chain hardening (polyglot signing) plus
58
+ dev-dependency bumps. No CLI surface, runtime behavior, or API boundary changes —
59
+ the published artifacts are byte-identical in behavior to 1.2.0; only the release
60
+ machinery and dev tooling moved.
254
61
 
255
- ### Not bundled (separate scope)
62
+ ### Added
256
63
 
257
- `iah-python-wrapper-scripts-sync` (`bd_000-projects-65k4`) remains open. The Python wrapper's `python/src/intent_audit_harness/scripts/crap-score.py` (and the Rust wrapper's mirror) are stale by design — install.sh sources from canonical `scripts/` but wrapper packaging hasn't grown a build-time sync mechanism. Implementation requires choosing between hatch build-hook, Cargo build.rs, symlinks, or CI-enforced manual sync. Deferred to its own focused PR.
64
+ - **sigstore-python wheel + sdist signing (#90).** The `publish-pypi` leg now signs the built wheel and sdist with `sigstore-python` (keyless Fulcio OIDC + Rekor), so the PyPI distribution carries verifiable provenance alongside the existing npm sigstore path.
65
+ - **crates.io build-provenance attestation (#90).** The `publish-crates` leg now emits a GitHub build-provenance attestation for the published crate artifact, extending the signed-supply-chain guarantee to the Rust distribution.
258
66
 
259
- ## [v1.1.3] - 2026-05-24
67
+ ### Changed
260
68
 
261
- ### Added — Ruff CI gate against own-code Python (IEP Convergence Debt Plan Priority 6 Phase A2)
69
+ - **crates.io publish is now active (#90).** With `CARGO_REGISTRY_TOKEN` provisioned as a repository secret, the `publish-crates` leg goes live on this tag — closing the polyglot publish loop (npm + PyPI + crates.io all publish + sign from one tag).
70
+ - Bump `eslint` from 9.39.4 to 10.5.0 (#71).
71
+ - Bump `jeremylongshore/intent-rollout-gate` GitHub Action pin from 0.1.0 to 0.2.0 (#86).
72
+ - Bump `crate-ci/typos` from 1.29.4 to 1.47.2 (#87).
73
+ - Release-preparation chore for v1.2.1 (#91).
262
74
 
263
- Closes `iah-ruff` (`bd_000-projects-x9bs`, P1). New `.github/workflows/ci.yml` job `ruff (Python lint)` runs `ruff check` (version-pinned to 0.15.4 per the iah-shellcheck-version-pin lesson) against the own-code Python surface. Ruleset `select = ["B", "E", "F"]` — pyflakes (F) for dead imports + unused variables; pycodestyle errors (E) for syntax-level issues; **flake8-bugbear (B) for Python-specific bugs** (mutable default args, unreliable exception handling — added per Gemini PR #39 review after empirical confirmation that zero new findings fire on our codebase). Line length set to 120 (modern Python convention). Further ratchet (I import-order, UP pyupgrade, etc.) deferred to a future ratchet bead.
75
+ ## [1.2.0] - 2026-06-15
264
76
 
265
- - New `ruff.toml` at repo root: lint scope = `scripts/*.py` + `python/src/intent_audit_harness/{__init__,__main__,cli}.py`; excludes `python/.venv/` + `python/src/intent_audit_harness/scripts/` + `rust/scripts/` (the last two are bundled-content mirrors of `scripts/*` — stale-sync tracked separately, see below).
266
- - Version pinned via `pip install 'ruff==0.15.4'`; CI prints `ruff --version` for audit trail.
77
+ A minor release: the provider credential gate (`cred-gate`, iah-E08), the locked
78
+ OTel runtime-event surface (`agent.rollout.gate.evaluated` + `gate.decision.emitted`,
79
+ iah-E07), shared vendorable lint configs, wrapper-mirror drift-guard CI, and tailnet
80
+ CI-failure alerting — all additive, with the zero-runtime-dependency guarantee
81
+ preserved.
267
82
 
268
- ### Removed — 3 ruff-surfaced dead-code findings
83
+ > **Why minor, not patch:** A new CLI-adjacent gate surface (`cred-gate`) and new authored feature surfaces (shared lint configs, the locked OTel event taxonomy, the wrapper drift-guard lane). Per SemVer this is a minor bump. No CLI command was renamed or removed; the change is purely additive and the published tarball stays zero-runtime-dependency.
269
84
 
270
- - **`scripts/crap-score.py`**: redundant local `import hashlib, os` inside the `if args.json:` block was shadowing the module-level `import os`, causing ruff F401 against the top-level (which IS used by the same block). **Per Gemini PR #39 review (PEP 8 alignment)**, moved `hashlib` to module-level imports alongside the other stdlib imports; removed the local re-import entirely. The bandaid-comment explaining the local import is also gone.
271
- - **`scripts/crap-score.py`**: dead local variable `metrics = rec.get("metrics", {}).get("cyclomatic", {})` in `score_rust()` (line 266; F841). Assigned but never read. The actual cyclomatic value is fetched freshly inside the loop on line 268.
272
- - **`python/src/intent_audit_harness/cli.py`**: dead `import os` at line 12 (F401). Zero `os.*` usages in the file.
85
+ ### Added
273
86
 
274
- ### Changed — Long-line reformat in scripts/crap-score.py
87
+ - **Provider credential gate (`cred-gate`, iah-E08) (#77).** A new gate that asserts provider credentials PASS/FAIL with full redaction + spillover coverage (`scripts/cred-gate.sh`).
88
+ - **Credential-leak fixtures + failure-mode docs (#80).** Full-catalog fixture coverage for the cred-gate's redaction + spillover behavior (iah-E08a/E08b).
89
+ - **OTel runtime events on `emit-evidence` (iah-E07) (#81).** Emits `agent.rollout.gate.evaluated` (the per-gate evaluation event, name + attributes locked + tested, iah-E07a) and `gate.decision.emitted` (the gate-decision event, iah-E07b) per the NORMATIVE `intent-eval-lab/000-docs/067-AT-SPEC` runtime-event taxonomy.
90
+ - **Shared, vendorable lint configs (#85).** `.audit-harness-configs/` (markdownlint / yamllint / ruff / shellcheck) is the canonical config set the IEP repos vendor + extend; `install.sh` now vendors both `scripts/` and `configs/`. CLAUDE.md cross-references the lab specs.
91
+ - **Advisory `typos` spell-check CI lane (#83)** and **advisory `actionlint` CI lane (#84).**
92
+ - **ntfy CI-failure alert over the tailnet (#79).** CI failures fan out a notification to the private tailnet ntfy topic.
275
93
 
276
- - Line 84 `ignore` set literal (155 chars) reformatted into a multi-line set literal that fits 120-char limit. Cosmetic; no behavior change.
94
+ ### Changed
277
95
 
278
- ### Changed — Version bumped to v1.1.3 across all 5 manifests
96
+ - **Provider credential gate + OTel head landed first (#77).** The `cred-gate` head and the OTel `gate.decision.emitted` decision event landed together; PR #78 then renamed the gate-decision event to `gate.decision.emitted` to align with the 067-AT-SPEC runtime-event taxonomy.
97
+ - **Dogfood AAR (iah-E10d) (#88).** First-downstream-adopter run captured at `000-docs/013-AA-AACR-rollout-gate-dogfood-iah-E10-2026-06-15.md`.
98
+ - Release-preparation chore for v1.2.0 (#89).
279
99
 
280
- Per the version-canonical-check CI gate (v1.0.2 PR #35). All 5 manifest locations now report `1.1.3`.
100
+ ### Fixed
281
101
 
282
- ### Changed — `.harness-hash` regenerated
102
+ - **Bundled wrapper mirrors resynced to canonical + drift-guard CI lane (iah-65k4) (#82).** The Python (`python/src/intent_audit_harness/scripts/`) and Rust (`rust/scripts/`) bundled copies of `crap-score.py` were stale mirrors of canonical `scripts/`; this resyncs them and adds a CI lane that fails on any future drift between canonical and the bundled mirrors.
283
103
 
284
- `scripts/crap-score.py` is pinned by `.harness-hash-extra-patterns`; the dead-code removal + long-line reformat changes its hash. 1 of 9 pinned-file hashes change.
104
+ ## [1.1.8] - 2026-06-13
285
105
 
286
- ### Why patch, not minor
106
+ Ships the iah-E06 production-signing pre-flight gate to downstream consumers, plus
107
+ the comprehensive PP-PLAN-040 supply-chain + hygiene wave, crap-score backend
108
+ repairs, and a SemVer contract-pin test suite.
287
109
 
288
- Pure lint-gate addition + dead-code removal. No new CLI commands, no new flags, no API change. Consumers re-vendor / `pnpm up` and get the cleaner scripts + the (new for them) ruff config transparently.
110
+ > **Why patch, not minor:** The pre-flight scripts shipped to the repo in earlier PRs (#70, #75); this patch propagates them to npm consumers via a version bump. No new public CLI commands or flag changes in this release boundary.
289
111
 
290
- ### Verification
112
+ ### Added
291
113
 
292
- - `ruff check` → `All checks passed!` on clean checkout
293
- - `python3 -m py_compile scripts/crap-score.py` → exit 0
294
- - `python3 -m py_compile python/src/intent_audit_harness/cli.py` → exit 0
295
- - `shellcheck scripts/*.sh` → exit 0 (no regression on Phase A1)
296
- - `bash scripts/harness-hash.sh --verify` → OK after `--init`
297
- - CI ruff job will block any future PR that introduces a Python lint finding (F401, F841, E*, etc.)
114
+ - **DNSSEC + CAA production-signing pre-flight (iah-E06) (#70).** Before a production-mode `emit-evidence` run signs canonical bytes, two deterministic pre-flight scripts assert the signing domain (`evals.intentsolutions.io`) is cryptographically sound — `scripts/dnssec-check.sh` verifies the DNSSEC chain is present and validates; `scripts/caa-check.sh` verifies the CAA records authorize the signing certificate authority. Both fail closed: any error, missing record, or unreachable resolver blocks the signing path rather than emitting an unverifiable attestation. Staging/draft emit is unaffected.
115
+ - **Supply-chain + hygiene + kernel-shadow detector (#69).** PyPI/crates publish wiring, dependabot polyglot coverage, lefthook, eslint, a bash-version floor, a kernel-shadow detector, and a crap-score dot-dir fix landed as one supply-chain wave.
116
+ - **`install.sh` completeness + per-repo blueprint + golden-master stdout suite (#63).** The vendored-install path now ships a complete traceable copy, plus a golden-master fitness function pinning the raw stdout of the scorers whose output is a downstream contract.
117
+ - **SemVer CLI/output-contract pin test (#65).** A test that pins the CLI + output contract so a MAJOR-worthy change fails CI rather than slipping out as a patch.
298
118
 
299
- ### Follow-up bead filed
119
+ ### Changed
300
120
 
301
- `iah-python-wrapper-scripts-sync` (new) — `python/src/intent_audit_harness/scripts/crap-score.py` is a stale mirror of `scripts/crap-score.py`, ~1 month behind canonical source. Missing the v1.1.1 `--json` envelope emission, the `which_or_none("go")` PATH guard, and the rglob-walk pruning. Same pattern likely in `rust/scripts/`. Either (a) build-time copy in the Python/Rust wrapper packaging, (b) symlink, or (c) hand-sync discipline with CI check. Currently excluded from ruff scope; exclusion drops once the sync mechanism ships.
121
+ - **`currency`: one pin per upstream surface + advisory poll-freshness SLA rename (#68).** Each tracked upstream (mcp-spec, skill-md-schema, claude-code, gate-result-predicate, anthropic-sdk, agentskills-spec) carries its own pin relation so the pin's own staleness is detectable per-upstream rather than as one opaque scalar.
122
+ - **Version bumped to 1.1.8 across all manifests (#76).** Per the `version-canonical-check` CI gate: `package.json` (canonical), `version.txt`, `python/pyproject.toml`, `python/src/intent_audit_harness/__init__.py`, and `rust/Cargo.toml` all report `1.1.8`.
123
+ - **audit-harness self-adopts the intent-rollout-gate Action (#74).** CI dogfoods the downstream rollout-gate Action — graduation criterion 5 / M6 first downstream adopter.
124
+ - Bump `DavidAnson/markdownlint-cli2-action` from 17 to 23 (#49); bump `actions/setup-node` from 4 to 6 (#61); record the public gist id for sweep/release tooling (#67).
302
125
 
303
- AAR: `000-docs/008-AA-AACR-ruff-iep-P6-2026-05-24.md`.
126
+ ### Fixed
304
127
 
305
- ### What unblocks next
128
+ - **Query a trusted validating resolver in the DNSSEC + CAA pre-flight (#75).** The pre-flight previously trusted the ambient resolver, which may not validate DNSSEC. Both scripts now query known validating resolvers (`1.1.1.1`, `8.8.8.8`) and require the authenticated-data (AD) flag plus an `RRSIG` on the answer. A resolver that does not set AD, or an answer with no RRSIG, is treated as a validation failure (fail-closed) rather than a pass.
129
+ - **crap-score Go/JS scoring backends repaired + 3 bash defects from the umbrella review (#66).**
130
+ - **Evidence-integrity bugs + SHA256 portability + kernel schema URL (#64).**
306
131
 
307
- P6 Phase A2 complete. Next-ready P6 work:
132
+ ## [1.1.7] - 2026-06-08
308
133
 
309
- - A3: `iah-eslint-dispatcher` (`bd_000-projects-rnpy`) — eslint coverage for `bin/audit-harness.js`
310
- - B1: `iep-shared-lint-configs` — `.audit-harness-configs/` for vendoring lint configs to consumer repos
311
- - Plus 2 bundleable Gemini-found fixes from v1.1.2 review: `iah-gherkin-prev-blank-noise` + `iah-gherkin-single-awk-opt`
134
+ A CI-only patch keeping the dashboard evidence-emit job runnable.
312
135
 
313
- ## [v1.1.2] - 2026-05-24
136
+ ### Fixed
314
137
 
315
- ### Changed — Shellcheck CI gate flipped from tolerant to hard-fail (IEP Convergence Debt Plan Priority 6 Phase A1)
138
+ - **`emit-evidence` job needs Node 22 for `--experimental-strip-types` (nr75.12) (#60).** The CI-only `emit-evidence` TypeScript runner uses Node's experimental type-stripping, which requires Node 22; the job's Node version is bumped accordingly. No published-artifact change — the `ci/` emitter is excluded from the npm tarball.
316
139
 
317
- Closes `iah-shellcheck-hard-fail` (`bd_000-projects-4asc`, P1). The shellcheck job in `.github/workflows/ci.yml` previously ran `shellcheck scripts/*.sh || true` — warnings and errors were logged but never blocked the PR. As of this release the `|| true` suffix is removed: any shellcheck finding (warning or error) blocks the build. The locked precondition was v1.1.1 (PR #37) which addressed the 6 Gemini-flagged robustness findings — the surface was already clean enough that flipping the gate exposed exactly 3 residual dead-code findings, all fixed below.
140
+ ## [1.1.6] - 2026-06-08
318
141
 
319
- ### Removed — 3 pieces of dead code surfaced by the harder shellcheck gate
142
+ A minor release: the read-only "comprehensive audit, on any repo" brain
143
+ (`classify` → `conform` → `audit` → `scan` → `currency`), the registry-projection +
144
+ FP-rate safety spine, and the CI-only kernel-emitting evidence path for the
145
+ dashboard (nr75.12) — all additive, with the zero-runtime-dependency guarantee
146
+ preserved. (Note: an earlier CHANGELOG draft attributed this PP-PLAN-040 verb set
147
+ to 1.2.0; it actually shipped here in 1.1.6 via PRs #52–#59.)
320
148
 
321
- - **`scripts/bias-count.sh`**: `declare -A PATTERN_COUNTS` plus the per-call `PATTERN_COUNTS["$label"]=$count` assignment in `count_pattern()`. SC2034: the associative array was populated but never read. Per-pattern counts are still printed inline (line 61) and are aggregated into `TOTAL_BIAS` for the JSON output `bias_total` metadata field; the per-pattern breakdown was apparently intended for a richer JSON shape that was never wired. Restoring it would be a feature, not a fix; filed as deferred scope if a consumer asks.
322
- - **`scripts/emit-evidence.sh`**: `INPUT_HASH_HEX="$(echo "$STATEMENT" | python3 -c ...)"` (formerly line 238). SC2034: computed but never read. Vestige from an earlier cosign integration; the surrounding `BLOB_FILE` construction relies on `ARTIFACT_NAME` only.
323
- - **`scripts/gherkin-lint.sh`**: `err()` helper function. SC2317: zero call sites in the file (verified via `grep -n "\berr\b"` — only the definition matches). The helper was defined symmetrically with `warn()` but never wired up to the awk rubric or the subprocess-fallback path. Replaced with `process_awk_output()` helper (see Fixed section below).
149
+ > **Why minor, not patch:** Multiple new read-only CLI verbs (`classify`, `conform`, `audit`, `scan`, `currency`) and new authored feature surfaces (the audit-profile data spec, the registry datum, the CI-only evidence emit). Per SemVer this is a minor bump. No CLI command was renamed or removed; the change is purely additive and the published tarball stays zero-runtime-dependency.
324
150
 
325
- ### Fixed — gherkin-lint.sh awk subprocess undercount (silent-failure class bug; Gemini PR #38 review)
151
+ ### Added
326
152
 
327
- While processing the SC2317 cleanup above, Gemini's PR #38 review surfaced a deeper bug: the gherkin-lint.sh awk-fallback path printed `WARN`/`ERROR` lines via `awk printf` but those subprocesses never incremented the parent shell's `WARN_COUNT`/`ERROR_COUNT` counters. The summary line said "0 warnings, 0 errors" while errors were actively being printed; the exit code stayed 0 regardless. Exactly the silent-failure class the linter exists to surface in OTHER projects.
153
+ - **`classify` verb + `audit-profile/v1` data-spec (PP-PLAN-040 Phase 0+1) (#53).** `audit-harness classify [repo]` (`scripts/classify.py`, stdlib-only) is a read-only repository classifier: it detects the UNION of repo-type + Claude-artifact classifications, resolves the gate set against the canonical `schemas/audit-profile/registry.v1.json` datum, records `registry_hash`, and emits an `audit-profile/v1` value to stdout — **never writes to the repo**. The `audit-profile/v1` schema is closed, versioned, and hash-bearing, mirroring `gate-result/v1`; its four invariants: classifications are a UNION (not a winner), `unresolved[]` is the only Claude-refinable surface, `waived ⇒ disabled` (allOf-enforced), `registry_hash` makes a profile reproducible. Safety levers: an `INDETERMINATE` result class (infra failure ≠ policy failure), per-command timeout supervision via `AUDIT_HARNESS_TIMEOUT`, the `AUDIT_HARNESS_DISABLE=1` kill-switch, and an engineer-owned `.audit-harness.yml` override. `schemas/` now ships in the npm package (`files`).
154
+ - **`conform` verb + bundled content-addressed schemas (PP-PLAN-040 Phase 2) (#54).** `audit-harness conform [repo]` (`scripts/conform.py`, stdlib + PyYAML): for every `dimension: conformance` gate in the repo's `audit-profile/v1`, locates the artifact(s) and emits a `gate-result/v1` row — never writes, never live-fetches. Bundled content-addressed schemas (`schemas/conform/v1/`: `skillmd-frontmatter`, `mcp-config`, `plugin-manifest`, `agent-frontmatter`) form the deterministic structural floor, checked by an embedded subset validator (not ajv) for reproducible signed evidence; each schema's sha256 is recorded in the row's `policy_hash`. Genuinely-external formats shell out (OpenAPI → `spectral`, GitHub Action → `yamllint`); a missing tool produces ADVISORY indeterminate, never a false FAIL. Advisory-first; `--strict` (or an engineer-promoted blocking gate) turns a violation into FAIL.
155
+ - **`audit` testing-depth gate-runner (PP-PLAN-040 Phase 3 / E5) (#56).** `audit-harness audit [repo]` (`scripts/audit.py`, stdlib): for every `dimension: testing-depth` gate, runs the bundled `crap` scorer and per-pyramid-layer presence heuristics (unit/integration/e2e/smoke/perf/a11y/contract/migration/property-based/fuzz/sanitizers). Layer present → PASS; absent → ADVISORY(warn); not statically assessable → ADVISORY indeterminate. `--fast` (default, presence heuristics only) / `--deep` (adds crap-score) / `--strict` (gap on a blocking gate → FAIL). Deliberately does NOT execute the repo's test suite — running untrusted suites is the repo's own CI's job.
156
+ - **`scan` security/hygiene/skill-quality gate-runner (PP-PLAN-040 Phase 4 / E6) (#57).** `audit-harness scan [repo]` (`scripts/scan.py`, stdlib): for every `dimension: security | hygiene | skill-quality` gate, emits a `gate-result/v1` row via three strategies — local (deterministic README presence), shell-out (gitleaks / osv-scanner / semgrep / syft / markdownlint / lychee; clean → PASS, findings → ADVISORY(error), absent → ADVISORY indeterminate), and consume (`skill-behavioral` ingests a j-rig Evidence Bundle verdict via `--jrig-verdict`). Advisory-first; `--strict` turns a finding/gap into FAIL. **Security note:** on first run this gate caught — and this release redacts from HEAD — a PyPI publish token pasted as a literal value in `python/PUBLISH.md`. The value remains in git history and must be rotated at the registry (tracked separately); the doc now carries a placeholder.
157
+ - **`currency` advisory upstream-currency report (PP-PLAN-040 Phase 5 / E7) (#58).** `audit-harness currency` (`scripts/currency.py`, stdlib): reads the per-upstream-identity pin relation (`schemas/currency/pins.v1.json`) and reports which pins are themselves stale (`checked_at` older than the pin's staleness window). No exit-code authority (always exit 0), no live-fetch, no auto-fix — `/sync-testing-harness` consumes the report to open advisory bump PRs; it never reddens a build. `--today YYYY-MM-DD` makes reports reproducible.
158
+ - **Registry projection + FP-rate harness (PP-PLAN-040 E2: c2b + c2e) (#55).** `audit-harness gen-layer-applicability` projects `schemas/audit-profile/registry.v1.json` into `schemas/audit-profile/layer-applicability.md` (the doc is now a projection of the registry datum, not a hand-maintained parallel source — CI gate `layer-applicability-drift` enforces it). `audit-harness fp-rate` measures each gate's false-positive / false-negative rate over a labeled corpus — the metric that gates advisory→blocking promotion. `docs/gate-promotion.md` documents the FP-rate ≤ 5% promotion bar.
159
+ - **CI-only signed evidence emit for the intent-eval-dashboard (nr75.12) (#59).** `ci/emit-evidence.ts` + `ci/assemble-manifest.ts` run the real deterministic self-gate (`harness-hash --verify`), shape it into a kernel `gate-result/v1` + `EvidenceBundle` (fail-closed against `@intentsolutions/core`), cosign-sign the canonical bytes (Fulcio OIDC + Rekor), and assemble the `report-manifest.json` the dashboard reports hub (labs.intentsolutions.io) re-verifies at ingest. Zero-dep guarantee preserved: the emitter lives in `ci/` (excluded from `package.json#files`) and the kernel is installed CI-only via `npm i --no-save`.
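The staleness rule behind the `currency` report above can be sketched in a few lines. Only `checked_at` comes from the entry itself; the `upstream` and `staleness_days` field names, and the function `stale_pins`, are assumptions made for illustration and may not match `schemas/currency/pins.v1.json`.

```python
from datetime import date, timedelta


def stale_pins(pins, today):
    """Return the upstream names whose pin was last checked outside its window.

    `pins` is a list of dicts; `upstream` and `staleness_days` are
    hypothetical field names, `checked_at` is an ISO date string.
    """
    stale = []
    for pin in pins:
        checked = date.fromisoformat(pin["checked_at"])
        window = timedelta(days=pin["staleness_days"])
        if today - checked > window:
            stale.append(pin["upstream"])
    return stale
```

Taking `today` as a parameter, rather than reading the clock inside the function, is what a flag like `--today YYYY-MM-DD` relies on to make a report reproducible.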
328
160
 
329
- - **New `process_awk_output()` helper**: wraps each awk subprocess, captures its output, counts `WARN` / `ERROR` lines via inline awk (`'/^WARN /{c++} END{print c+0}'` — set-euo-pipefail safe, no `|| true` needed), increments the bash counters, then re-prints. 4 awk blocks now feed through it.
330
- - **Verification**: deliberate-failure test against a feature with `Scenario: ... \n And ...` produces exit code 1 + summary `0 warning(s), 1 error(s)` (was: exit 0 + `0 warning(s), 0 error(s)` while still printing the ERROR line). Clean feature still exits 0.
331
- - **Separate-scope finding**: the third awk script contains a stray top-level `prev_blank = 1` that awk treats as an always-true pattern, triggering its default print-every-line action. That's a pre-existing cosmetic issue (extra noise in script output) but not a counter bug — filed as deferred scope.
161
+ ### Changed
332
162
 
333
- ### Changed — Version bumped to v1.1.2 across all 5 manifests
163
+ - **Finished the `intent-audit-harness` rename in public contributor docs (#52).**
334
164
 
335
- Per the version-canonical-check CI gate (v1.0.2 PR #35). All 5 committed manifest locations now report `1.1.2`:
165
+ ## [1.1.5] - 2026-06-03
336
166
 
337
- - `package.json`
338
- - `version.txt`
339
- - `python/pyproject.toml`
340
- - `python/src/intent_audit_harness/__init__.py`
341
- - `rust/Cargo.toml`
167
+ > **Why patch, not minor:** No new CLI commands, no new flags, no API change, no script behavior change. This is release-engineering + metadata: the publish pipeline that ships the existing `1.1.x` code, plus URL corrections for the repo rename, plus the install.sh glob fix. The pinned policy scripts (`.harness-hash`) are untouched.
342
168
 
343
- ### Changed — `.harness-hash` regenerated
169
+ ### Added
344
170
 
345
- The self-pinning manifest is regenerated to capture the new script hashes (per `iep-P3 iah-self-pin` v1.1.0 mechanism). 3 of 9 pinned-file hashes change (the 3 modified scripts); 6 unchanged.
171
+ - **npm release pipeline (closes the publish-pipeline gap).** This is the first release published to npm via CI with Sigstore provenance. Until now the repo had **no release workflow** — npm was stuck at `0.1.0` while the code (and every other manifest) had advanced through `1.0.0` → `1.1.4`, four minors of CHANGELOG-documented work that never reached consumers. `npm install @intentsolutions/audit-harness` resolved to the stale `0.1.0` tarball. New `.github/workflows/release.yml` mirrors the provenance approach of `intent-eval-core`'s release workflow, adapted for this zero-dependency polyglot CLI (no pnpm, no lockfile, no TS build). Triggers on `push` of a `v*.*.*` tag and on `workflow_dispatch`, sets `id-token: write` for npm/Sigstore OIDC, verifies the pushed tag matches `package.json#version`, runs the `--version` self-check + `escape-scan.sh --staged`, then `npm publish --provenance --access public`.
172
+ - **README badge row.** npm-version, License Apache-2.0, and Sigstore-provenance shields under the H1 (mirrors the `intent-eval-core` badge row). The "Part of the Intent Eval Platform" cross-link line is preserved.
346
173
 
347
- ### Why patch, not minor
174
+ ### Changed
348
175
 
349
- Pure dead-code removal + a CI policy tightening. No new CLI commands, no new flags, no API change, no behavioral change for any consumer. Downstream consumers re-vendor (or `pnpm up`) and get the cleaner scripts transparently.
176
+ - **Version bumped to v1.1.5 across all 5 manifests.** Per the `version-canonical-check` CI gate (v1.0.2 PR #35). `package.json` (canonical), `version.txt`, `python/pyproject.toml`, `python/src/intent_audit_harness/__init__.py`, and `rust/Cargo.toml` all report `1.1.5`.
350
177
 
351
- ### Verification
178
+ ### Fixed
352
179
 
353
- - `shellcheck scripts/*.sh` → exit 0 on a clean checkout (verified locally before push)
354
- - `bash -n scripts/*.sh` → all pass
355
- - `python3 -m py_compile scripts/crap-score.py` → exit 0
356
- - `bash scripts/harness-hash.sh --verify` → harness-hash: OK after `--init`
357
- - CI shellcheck job will now block on any future warning — try staging `cmd $var` (unquoted expansion) to verify the gate fires
180
+ - **Package metadata + `install.sh` URLs for the `intent-audit-harness` repo rename.** The GitHub repo was renamed `audit-harness` → `intent-audit-harness`, but the metadata still pointed at the old path. `package.json` (`homepage`, `repository.url`, `bugs.url`), `python/pyproject.toml` + `rust/Cargo.toml` project-URL fields, `python/src/intent_audit_harness/__init__.py` docstring source-link, `README.md` (the `curl … install.sh` line + two "Related" skill links), and `install.sh` (the `REPO=` variable, usage-comment URLs, re-run hint, and the default `VERSION` bumped `v0.1.0` → `v1.1.5`) were all repointed to the renamed repo.
181
+ - **`install.sh` tarball-path glob broke after the rename.** The GitHub archive tarball unpacks as `<repo>-<version>/`, which became `intent-audit-harness-1.1.5/` after the rename. The unpack-dir detection used `find … -name 'audit-harness-*'`, and `-name` matches the basename with no implicit leading wildcard, so it matched **nothing** under the new prefix — every vendored install would have failed. Changed the glob to `-name '*audit-harness-*'` (leading wildcard), matching both the current `intent-audit-harness-*` name and legacy `audit-harness-*` tags.
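The glob behavior described in the `install.sh` fix above can be reproduced with Python's `fnmatch`, which for these simple patterns behaves like the basename matching of `find -name`. This is an illustrative sketch; the helper name `matches_find_name` is hypothetical and `install.sh` itself is a shell script.

```python
from fnmatch import fnmatchcase

OLD_PATTERN = "audit-harness-*"   # glob before the fix
NEW_PATTERN = "*audit-harness-*"  # glob after the fix (leading wildcard)


def matches_find_name(basename: str, pattern: str) -> bool:
    """Approximate `find -name PATTERN`: the whole basename must match."""
    return fnmatchcase(basename, pattern)


for name in ("audit-harness-0.1.0", "intent-audit-harness-1.1.5"):
    print(name, matches_find_name(name, OLD_PATTERN), matches_find_name(name, NEW_PATTERN))
```

The old pattern is anchored at the start of the basename, so the `intent-` prefix defeats it. A leading `*` can also match the empty string, which is why the new pattern still accepts legacy `audit-harness-*` directories.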
358
182
 
359
- AAR: `000-docs/007-AA-AACR-shellcheck-hard-fail-iep-P6-2026-05-24.md`.
183
+ ## [1.1.4] - 2026-05-25
360
184
 
361
- ### What this unblocks in the IEP Convergence Debt Plan
185
+ > **Why patch, not minor:** Pure cleanup release: dead-code removal, perf microoptimization, bug fixes for cross-call inconsistencies, CI version pin. No new CLI commands, no new flags, no API change. AAR: `000-docs/009-AA-AACR-v1.1.4-cleanup-bundle-2026-05-25.md`.
362
186
 
363
- P6 Phase A1 closed. Next-ready P6 work:
187
+ ### Changed
364
188
 
365
- - A2: `iah-ruff` — add Python ruff CI gate
366
- - A3: `iah-eslint-dispatcher` — add eslint coverage for `bin/audit-harness.js`
367
- - A4: `iah-script-robustness-upstream` (already shipped in v1.1.1; nothing more to do)
189
+ - **`gherkin-lint.sh process_awk_output()` collapsed to a single awk pass (Gemini #38 follow-up).** Closes `iah-gherkin-single-awk-opt` (P3). v1.1.2 introduced `process_awk_output()` with two awk subprocesses per call; v1.1.4 collapses to a single awk pass, halving the awk fork count (4 call sites × 2 subprocesses = 8 forks → 4). Verified with a mixed WARN+ERROR test.
190
+ - **Shellcheck CI job version-pinned (parity with ruff v1.1.3).** Closes `iah-shellcheck-version-pin` (P3). v1.1.2 installed shellcheck via `apt-get` which pulls whatever Ubuntu's runner image ships; v1.1.4 pins to `v0.10.0` downloaded from the koalaman/shellcheck GitHub releases so runner-image upgrades can't silently activate new rules. CI prints `shellcheck --version` for the audit trail.
191
+ - **Version bumped to v1.1.4 across all 5 manifests** and **`.harness-hash` regenerated** (2 of 9 pinned-file hashes change: `gherkin-lint.sh` + `crap-score.py`).
368
192
 
369
- ## [v1.1.1] - 2026-05-23
193
+ ### Fixed
370
194
 
371
- ### Fixed — 6 script robustness + portability fixes (IEP Convergence Debt Plan Priority 3)
195
+ - **`gherkin-lint.sh` `prev_blank` print-every-line noise (Gemini #71 review chain).** Closes `iah-gherkin-prev-blank-noise` (P2). The third awk block (the And-at-scenario-start checker) opened with a bare `prev_blank = 1` expression that awk interpreted as an always-true pattern with implicit `{ print }` — flooding stdout with every line of every feature file alongside the intentional ERROR printf. `prev_blank` was never read anywhere; both touches were removed so the block produces ONLY the targeted ERROR line.
196
+ - **`crap-score.py` exclusion sets deduplicated via an `EXCLUDED_DIRS` constant (Gemini #71 review).** Closes `iah-crap-score-exclusion-dedup` (P2). Two separate sets with overlapping intent but divergent contents — `ignore` in `score_python()` (had `reports`, lacked `.next`/`.nuxt`/`.cache`) and `prune` in `main()` (had `.next`/`.nuxt`/`.cache`, lacked `reports`) — caused real asymmetric skips. Extracted to a single module-level `EXCLUDED_DIRS` union referenced by both call sites.
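The `EXCLUDED_DIRS` deduplication above amounts to two walkers sharing one constant. The sketch below is not the code in `scripts/crap-score.py`: the set contents are the ones listed for this fix, but the helper names `iter_files`, `python_candidates`, and `hash_inputs` are hypothetical.

```python
import os

# Single module-level source of truth, referenced by both call sites.
EXCLUDED_DIRS = frozenset({
    ".git", ".venv", "venv", "node_modules", "__pycache__", "dist", "build",
    "target", ".tox", ".mypy_cache", ".pytest_cache", ".next", ".nuxt",
    ".cache", "reports",
})


def iter_files(root, suffix=""):
    """Yield file paths under `root`, never descending into EXCLUDED_DIRS."""
    for dirpath, dirnames, filenames in os.walk(root):
        # In-place slice assignment prunes the walk before it descends.
        dirnames[:] = sorted(d for d in dirnames if d not in EXCLUDED_DIRS)
        for name in sorted(filenames):
            if name.endswith(suffix):
                yield os.path.join(dirpath, name)


def python_candidates(root):
    """Hypothetical stand-in for the scorer's candidate scan."""
    return list(iter_files(root, ".py"))


def hash_inputs(root):
    """Hypothetical stand-in for the `--json` input-hash walk."""
    return list(iter_files(root))
```

Because both functions go through the same pruning step, a directory such as `reports/` or `.next/` is either skipped by both or seen by both, which removes the asymmetry the fix describes.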
372
197
 
373
- Closes `iah-script-robustness-upstream` (`bd_000-projects-qqkq`, P2). Addresses the 6 medium-severity Gemini findings surfaced when audit-harness scripts were vendored into `intent-eval-lab` via `iep-harness-hash-platform-rollout` (PR #67). All fixes are upstream-only: zero CLI surface change, zero runtime-dep change, zero policy change.
198
+ ## [1.1.3] - 2026-05-25
374
199
 
375
- - **`scripts/escape-scan.sh`** (mktemp leak): `--staged` and `--range` modes allocate a temp file via `mktemp` to capture the diff but never clean it up. Adds `trap 'rm -f "$DIFF_SRC"' EXIT` immediately after each `mktemp` so the temp file is removed on every exit path (clean exit, REFUSE, CHALLENGE, signal). Matters most when escape-scan runs as a local git hook where temp accumulation is silent.
376
- - **`scripts/crap-score.py`** (missing `go` PATH guard): `score_go()` called `run(["go", "test", "-coverprofile=...", ...])` without first checking that `go` is on PATH, so on systems without Go installed the subprocess raised `FileNotFoundError` and aborted the whole CRAP pass. Wraps the call in the existing `which_or_none("go")` pattern already used for `radon`, `gocyclo`, and the downstream `go tool cover` invocation.
377
- - **`scripts/crap-score.py`** (rglob walk pruning): the `--json` input-hash computation walked every file under `root` via `rglob("*")`, only filtering `node_modules` / `.venv` after the directory had been traversed. Replaces with `os.walk` + `dirs[:] = [...]` in-place pruning, skipping `.git`, `node_modules`, `.venv`/`venv`, `__pycache__`, `dist`, `build`, `target`, `.tox`, `.mypy_cache`, `.pytest_cache`, `.next`, `.nuxt`, `.cache`. Major perf win on large repos; no behavioral change to the resulting hash for repos without pruned-extension files under those directories.
378
- - **`scripts/emit-evidence.sh`** (shell→Python path injection): `python3 -c "import json, sys; print(json.load(open('$PKG_JSON'))['version'])"` interpolated the shell variable directly into the Python source. Paths containing single quotes (or arbitrary characters in adversarial cases) broke the parse. Now passes `$PKG_JSON` via `sys.argv[1]` — `python3 -c "import json, sys; print(json.load(open(sys.argv[1]))['version'])" "$PKG_JSON"` — moving the path through the safe argv channel.
379
- - **`scripts/bias-count.sh`** (per-file sha256sum fork): `find ... -exec sha256sum {} \;` spawned one `sha256sum` process per matched file. Changes the terminator to `+` so `find` batches arguments into one (or few) sha256sum invocations. Perf win on test suites with many files; output identical because the downstream `sort | sha256sum` step normalizes.
380
- - **`scripts/harness-hash.sh`** (cross-platform sha256sum): GNU coreutils ships `sha256sum`, macOS ships `shasum -a 256`. Adds detection at script top selecting whichever is available into a `SHA256_CMD` bash array, falling back with a clear error if neither is on PATH. Both produce identical `<hash> <file>` output, so the manifest format and downstream `awk` parsing are byte-equivalent. Enables engineer-local runs on macOS without forcing every contributor to install coreutils.
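The `emit-evidence.sh` fix above relies on passing data through argv instead of splicing it into source code. The sketch below reproduces that idea from Python for illustration, assuming a child interpreter is acceptable; `read_version` and `demo` are hypothetical names, and only the one-liner itself is taken from the entry.

```python
import json
import os
import subprocess
import sys
import tempfile

# The Python source is a constant string; the path never becomes part of it.
READ_VERSION = "import json, sys; print(json.load(open(sys.argv[1]))['version'])"


def read_version(pkg_json: str) -> str:
    """Run the one-liner in a child interpreter, passing the path via argv."""
    result = subprocess.run(
        [sys.executable, "-c", READ_VERSION, pkg_json],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


def demo() -> str:
    # A directory name containing a single quote, which would have broken
    # the interpolated form open('$PKG_JSON').
    with tempfile.TemporaryDirectory() as tmp:
        odd_dir = os.path.join(tmp, "it's-a-repo")
        os.mkdir(odd_dir)
        pkg = os.path.join(odd_dir, "package.json")
        with open(pkg, "w", encoding="utf-8") as fh:
            json.dump({"version": "1.1.1"}, fh)
        return read_version(pkg)
```

The path is treated purely as data here, so quoting characters in it cannot alter the program that runs.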
200
+ > **Why patch, not minor:** Pure lint-gate addition + dead-code removal. No new CLI commands, no new flags, no API change. AAR: `000-docs/008-AA-AACR-ruff-iep-P6-2026-05-24.md`.
381
201
 
382
- ### Changed — Version bumped to v1.1.1 across all 5 manifests
202
+ ### Added
383
203
 
384
- Per the version-canonical-check CI gate (added in v1.0.2 PR #35). All 5 committed manifest locations now report `1.1.1`:
204
+ - **Ruff CI gate against own-code Python (IEP Convergence Debt Plan Priority 6 Phase A2).** Closes `iah-ruff` (P1). New `ci.yml` job `ruff (Python lint)` runs `ruff check` (version-pinned to 0.15.4 per the shellcheck-version-pin lesson) against the own-code Python surface. Ruleset `select = ["B", "E", "F"]` — pyflakes (F), pycodestyle errors (E), and flake8-bugbear (B) per Gemini PR #39 review. Line length 120. New `ruff.toml` at repo root scopes lint to `scripts/*.py` + the CLI files and excludes the bundled-content mirrors (stale-sync tracked separately).
385
205
 
386
- - `package.json`
387
- - `version.txt`
388
- - `python/pyproject.toml`
389
- - `python/src/intent_audit_harness/__init__.py`
390
- - `rust/Cargo.toml`
206
+ ### Changed
391
207
 
392
- ### Changed: `.harness-hash` regenerated
208
+ - **Long-line reformat in `scripts/crap-score.py`.** The 155-char `ignore` set literal reformatted into a multi-line set literal under the 120-char limit. Cosmetic; no behavior change.
209
+ - **Version bumped to v1.1.3 across all 5 manifests** and **`.harness-hash` regenerated** (1 of 9 pinned-file hashes change: `crap-score.py`).
393
210
 
394
- The self-pinning manifest is regenerated to capture the new script hashes (per `iep-P3 iah-self-pin` v1.1.0 mechanism). The 6 script edits change 4 of the 9 pinned-file hashes; `--init` rewrites the manifest.
211
+ ### Removed
395
212
 
396
- ### Why patch, not minor
213
+ - **3 ruff-surfaced dead-code findings.** `crap-score.py`: a redundant local `import hashlib, os` inside the `if args.json:` block (shadowing the used module-level `import os`, F401) was removed and `hashlib` moved to module-level imports per Gemini PR #39; and a dead local `metrics = …` in `score_rust()` (F841). `cli.py`: a dead `import os` (F401, zero `os.*` usages).
397
214
 
398
- Pure bug + portability fixes. No new flags, no new commands, no policy change, no breaking change to the manifest format. Downstream consumers re-vendor (or re-install via the polyglot installers) and get the improvements transparently.
215
+ ## [1.1.2] - 2026-05-24
399
216
 
400
- ### Why this matters for the platform
217
+ > **Why patch, not minor:** Pure dead-code removal + a CI policy tightening. No new CLI commands, no new flags, no API change, no behavioral change for any consumer. AAR: `000-docs/007-AA-AACR-shellcheck-hard-fail-iep-P6-2026-05-24.md`.
401
218
 
402
- The scripts in this release are now vendored into `intent-eval-lab` (per `iep-harness-hash-platform-rollout` rollout 1, lab PR #67) and will land in `j-rig-binary-eval` next. Bug-fix patches travel via re-vendor — `AUDIT_HARNESS_VERSION=v1.1.1 curl -sSL https://raw.githubusercontent.com/jeremylongshore/audit-harness/main/install.sh | bash` for vendored consumers, `pnpm up @intentsolutions/audit-harness` for node consumers. Landing the fixes before the rollout reaches more repos avoids re-publishing buggy vendored copies that immediately need replacement.
219
+ ### Changed
403
220
 
404
- AAR: `000-docs/006-AA-AACR-script-robustness-upstream-iep-P3-2026-05-23.md`.
221
+ - **Shellcheck CI gate flipped from tolerant to hard-fail (IEP Convergence Debt Plan Priority 6 Phase A1).** Closes `iah-shellcheck-hard-fail` (P1). The shellcheck job previously ran `shellcheck scripts/*.sh || true` — findings were logged but never blocked the PR. The `|| true` suffix is removed: any shellcheck finding (warning or error) now blocks the build. The locked precondition was v1.1.1 (PR #37), which addressed the 6 Gemini-flagged robustness findings.
222
+ - **Version bumped to v1.1.2 across all 5 manifests** and **`.harness-hash` regenerated** (3 of 9 pinned-file hashes change).
405
223
 
406
- ### Sequencing impact on Priority 6 Phase A1
224
+ ### Removed
407
225
 
408
- Priority 6 Phase A1 (`iah-shellcheck-hard-fail`) flips `.github/workflows/ci.yml:89` from `shellcheck scripts/*.sh || true` to hard-fail. Per the IEP Convergence Debt Plan risk-mitigation table ("Flipping shellcheck to hard-fail breaks existing audit-harness CI; mitigation: land fixes for Gemini's 6 findings FIRST, THEN flip the gate"), this release is the explicit precondition for the shellcheck flip. Phase A1 PR opens after v1.1.1 lands on main.
226
+ - **3 pieces of dead code surfaced by the harder shellcheck gate.** `bias-count.sh`: `declare -A PATTERN_COUNTS` + its per-call assignment (SC2034: populated, never read). `emit-evidence.sh`: `INPUT_HASH_HEX=$(…)` (SC2034: computed, never read; vestige of an earlier cosign integration). `gherkin-lint.sh`: the `err()` helper (SC2317: zero call sites), replaced with `process_awk_output()`.
409
227
 
410
- ## [v1.1.0] - 2026-05-22
228
+ ### Fixed
411
229
 
412
- ### Added: Per-repo `.harness-hash-extra-patterns` mechanism + audit-harness self-pin (IEP Convergence Debt Plan Priority 3)
230
+ - **`gherkin-lint.sh` awk subprocess undercount (silent-failure class bug; Gemini PR #38 review).** The awk-fallback path printed `WARN`/`ERROR` lines via `awk printf`, but those subprocesses never incremented the parent shell's `WARN_COUNT`/`ERROR_COUNT` — the summary said "0 warnings, 0 errors" while errors were actively printed and the exit code stayed 0. Exactly the silent-failure class the linter exists to surface elsewhere. The new `process_awk_output()` helper wraps each awk subprocess, counts `WARN`/`ERROR` lines via inline awk, increments the bash counters, then re-prints. Verified: a deliberate failure now exits 1 with `0 warning(s), 1 error(s)`.
413
231
 
414
- Closes `iah-self-pin` (`bd_000-projects-itpl`, P1). The harness's own policy enforcement surface (scripts/*.sh + scripts/*.py + bin/audit-harness.js) is now hash-pinned at the audit-harness repo root. CI's `audit-harness list` + `harness-hash --verify` self-check steps are flipped from `|| true` exit-3 tolerance to hard-fail: any byte change to a pinned policy file without a fresh `--init` + commit of the regenerated `.harness-hash` exits 2 (HARNESS_TAMPERED) and blocks the PR.
232
+ ## [1.1.1] - 2026-05-23
415
233
 
416
- - **`scripts/harness-hash.sh`**: NEW: reads an optional `.harness-hash-extra-patterns` file at the repo root and appends its lines to the default PATTERNS array. Comments (`#`) + blank lines ignored. Backward-compatible: repos without the file get exactly the previous behavior, so consumer repos are not affected.
417
- - **`.harness-hash-extra-patterns`** (NEW, audit-harness repo root): pins `scripts/*.sh`, `scripts/*.py`, `bin/audit-harness.js`, and the extras file itself (preventing silent edits to the self-pinning scope).
418
- - **`.harness-hash`** (NEW, audit-harness repo root): 9-file manifest produced by `bash scripts/harness-hash.sh --init`. Committed to main.
419
- - **`.github/workflows/ci.yml`**: `audit-harness list` + `harness-hash --verify` self-check steps drop `|| true` suffixes. Hard-fail in place. Comment block updated.
234
+ > **Why patch, not minor:** Pure bug + portability fixes. No new flags, no new commands, no policy change, no breaking change to the manifest format. These scripts are now vendored into `intent-eval-lab` (PR #67); landing the fixes before the rollout reaches more repos avoids re-publishing buggy vendored copies.
420
235
 
421
- ### Why minor not patch
236
+ ### Fixed
422
237
 
423
- The `.harness-hash-extra-patterns` mechanism is a new authored feature surface: repos that opt in get a new capability. Per SemVer, minor bump. Existing repos (zero adopters today; this is the first one) are unaffected.
238
+ - **6 script robustness + portability fixes (IEP Convergence Debt Plan Priority 3).** Closes `iah-script-robustness-upstream` (P2). Addresses the 6 medium-severity Gemini findings surfaced when the scripts were vendored into `intent-eval-lab` (PR #67). All fixes are upstream-only, with zero CLI surface, runtime-dep, or policy change:
239
+ - **`escape-scan.sh`** (mktemp leak): adds `trap 'rm -f "$DIFF_SRC"' EXIT` after each `mktemp` so the temp file is removed on every exit path (matters most when escape-scan runs as a local git hook).
240
+ - **`crap-score.py`** (missing `go` PATH guard): `score_go()` now wraps the `go test` call in the existing `which_or_none("go")` pattern, so a system without Go no longer raises `FileNotFoundError` and aborts the whole CRAP pass.
241
+ - **`crap-score.py`** (rglob walk pruning): the `--json` input-hash walk now uses `os.walk` + in-place `dirs[:]` pruning (skipping `.git`, `node_modules`, `.venv`/`venv`, `__pycache__`, `dist`, `build`, `target`, `.tox`, `.mypy_cache`, `.pytest_cache`, `.next`, `.nuxt`, `.cache`) — a major perf win on large repos with no hash change for clean repos.
242
+ - **`emit-evidence.sh`** (shell→Python path injection): the package-version read now passes `$PKG_JSON` via `sys.argv[1]` instead of interpolating the shell variable into the Python source, so paths containing single quotes no longer break the parse.
243
+ - **`bias-count.sh`** (per-file sha256sum fork): `find … -exec sha256sum {} \;` changed to `… +` so `find` batches arguments into one (or few) invocations — output identical (the downstream `sort | sha256sum` normalizes).
244
+ - **`harness-hash.sh`** (cross-platform sha256sum): adds detection selecting `sha256sum` (GNU) or `shasum -a 256` (macOS) into a `SHA256_CMD` array, enabling engineer-local runs on macOS without coreutils.
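The `os.walk` pruning fix in the list above can be sketched as follows. The pruned directory names are taken from the entry; everything else is an assumption for illustration, including the helper names (`walk_pruned`, `input_hash`, `demo`) and the way the digest combines paths and contents, which the changelog does not specify.

```python
import hashlib
import os
import tempfile

# Directory names listed in the changelog entry.
PRUNED = {".git", "node_modules", ".venv", "venv", "__pycache__", "dist",
          "build", "target", ".tox", ".mypy_cache", ".pytest_cache",
          ".next", ".nuxt", ".cache"}


def walk_pruned(root: str) -> list[str]:
    """Return relative file paths, never descending into pruned directories."""
    files = []
    for dirpath, dirs, filenames in os.walk(root):
        # Slice assignment mutates the list os.walk itself iterates over,
        # so pruned directories are skipped before they are traversed.
        dirs[:] = sorted(d for d in dirs if d not in PRUNED)
        for name in sorted(filenames):
            files.append(os.path.relpath(os.path.join(dirpath, name), root))
    return files


def input_hash(root: str) -> str:
    """Assumed hashing scheme: fold each path and its bytes into one digest."""
    digest = hashlib.sha256()
    for rel in walk_pruned(root):
        digest.update(rel.encode("utf-8"))
        with open(os.path.join(root, rel), "rb") as fh:
            digest.update(fh.read())
    return digest.hexdigest()


def demo() -> list[str]:
    """Build a tiny tree and show that only the unpruned file is visited."""
    with tempfile.TemporaryDirectory() as root:
        for rel in ("src/a.py", "node_modules/x.js", ".git/config"):
            path = os.path.join(root, *rel.split("/"))
            os.makedirs(os.path.dirname(path), exist_ok=True)
            with open(path, "w", encoding="utf-8") as fh:
                fh.write("x")
        return walk_pruned(root)
```

With `rglob("*")` the filter runs only after every file under `node_modules` has already been visited; pruning `dirs` in place avoids that traversal entirely.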
424
245
 
425
- ### Why this matters
246
+ ### Changed
426
247
 
427
- Before this release, the audit-harness CI workflow could not enforce its own policy. The "harness tests itself" design rule (CLAUDE.md rule 5) was aspirational — `audit-harness list` and `harness-hash --verify` both exited 0 when no manifest existed (intentional tolerance to avoid false-failing every PR). A silent edit to `scripts/escape-scan.sh` (the gate that REFUSES threshold-lowering changes) would pass CI. That's the failure mode this release closes.
248
+ - **Version bumped to v1.1.1 across all 5 manifests** and **`.harness-hash` regenerated** (4 of 9 pinned-file hashes change). AAR: `000-docs/006-AA-AACR-script-robustness-upstream-iep-P3-2026-05-23.md`.
428
249
 
429
- ### Cross-platform-rollout note
250
+ ## [1.1.0] - 2026-05-22
430
251
 
431
- `iep-harness-hash-platform-rollout` (`bd_000-projects-g6zu`) unblocks on this release. The remaining 4 IEP repos (intent-eval-lab, j-rig-binary-eval, intent-rollout-gate kernel already pinned) can now copy this pattern using their own `.harness-hash-extra-patterns` to pin per-repo policy files (CI workflow definitions, governance docs, vendored harness wrappers).
252
+ > **Why minor, not patch:** The `.harness-hash-extra-patterns` mechanism is a new authored feature surface: repos that opt in get a new capability. Before this release the audit-harness CI workflow could not enforce its own policy; a silent edit to `escape-scan.sh` (the gate that REFUSES threshold-lowering changes) would pass CI. That is the failure mode this release closes.
432
253
 
433
- ### Changed — Version bumped to v1.1.0 across all 5 manifests
254
+ ### Added
434
255
 
435
- Per the version-canonical-check CI gate landed in v1.0.2 (PR #35). All 5 committed manifest locations now report `1.1.0`.
256
+ - **Per-repo `.harness-hash-extra-patterns` mechanism + audit-harness self-pin (IEP Convergence Debt Plan Priority 3).** Closes `iah-self-pin` (P1). The harness's own policy-enforcement surface (`scripts/*.sh` + `scripts/*.py` + `bin/audit-harness.js`) is now hash-pinned at the repo root. CI's `audit-harness list` + `harness-hash --verify` self-check steps flip from `|| true` exit-3 tolerance to hard-fail: any byte change to a pinned policy file without a fresh `--init` + commit of the regenerated `.harness-hash` exits 2 (HARNESS_TAMPERED) and blocks the PR.
257
+ - **`scripts/harness-hash.sh`** (new): reads an optional `.harness-hash-extra-patterns` file at the repo root and appends its lines to the default PATTERNS array. Backward-compatible — repos without the file get exactly the previous behavior.
258
+ - **`.harness-hash-extra-patterns`** (new): pins `scripts/*.sh`, `scripts/*.py`, `bin/audit-harness.js`, and the extras file itself.
259
+ - **`.harness-hash`** (new): 9-file manifest produced by `bash scripts/harness-hash.sh --init`, committed to main.
260
+ - **`.github/workflows/ci.yml`**: the self-check steps drop their `|| true` suffixes.
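How the extras file extends the default patterns can be sketched as below. The real logic lives in `scripts/harness-hash.sh` (bash); this Python version is only an illustration of the parsing rule, which skips `#` comments and blank lines. `load_patterns` is a hypothetical name and the default pattern shown is a placeholder, since the actual defaults are not listed here.

```python
from typing import Optional

# Placeholder default; the real PATTERNS array is not shown in this changelog.
DEFAULT_PATTERNS = ["tests/TESTING.md"]


def load_patterns(defaults: list[str], extra_text: Optional[str]) -> list[str]:
    """Append non-comment, non-blank lines of the extras file to the defaults."""
    patterns = list(defaults)
    if extra_text is None:
        # No extras file present: behave exactly as before the feature existed.
        return patterns
    for raw in extra_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        patterns.append(line)
    return patterns


# Contents mirroring what the audit-harness repo is described as pinning.
EXTRAS = """\
# self-pin the policy-enforcement surface
scripts/*.sh
scripts/*.py

bin/audit-harness.js
.harness-hash-extra-patterns
"""
```

Listing the extras file among its own patterns means an edit that narrows the pinned scope changes a pinned hash and fails verification.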
436
261
 
437
- AAR: `000-docs/005-AA-AACR-iah-self-pin-iep-P3-2026-05-22.md`.
262
+ ### Changed
438
263
 
439
- ## [v1.0.2] - 2026-05-21
264
+ - **Version bumped to v1.1.0 across all 5 manifests.** Per the `version-canonical-check` CI gate landed in v1.0.2 (PR #35). AAR: `000-docs/005-AA-AACR-iah-self-pin-iep-P3-2026-05-22.md`.
440
265
 
441
- ### Chore — Polyglot manifest alignment + Apache-2.0 NOTICE inclusion in distributions (IEP Convergence Debt Plan Priority 3)
266
+ ## [1.0.2] - 2026-05-21
442
267
 
443
- Aligned all polyglot manifests (`package.json` + `version.txt` + `python/pyproject.toml` + `python/src/intent_audit_harness/__init__.py` + `rust/Cargo.toml` + `rust/Cargo.lock`) at version `1.0.2`. Bumped from npm `v1.0.1` → `v1.0.2` (rather than aligning the PyPI/crates.io wrappers to npm's `v1.0.1`) so all four registries publish lockstep from this release forward — preserves the immutability of the already-shipped npm `v1.0.1` tarball. Added a CI gate that fails any future drift. Folded NOTICE file inclusion into Python sdist + Rust crate distributions per Apache-2.0 § 4. No CLI surface or runtime behavior changes — pure metadata + packaging alignment.
268
+ ### Changed
444
269
 
445
- - `package.json`: version `1.0.1` → `1.0.2`
446
- - `version.txt`: `0.2.0` → `1.0.2`
447
- - `python/pyproject.toml`: version `0.1.0` → `1.0.2`; license `MIT` → `Apache-2.0`; PyPI classifier updated to "License :: OSI Approved :: Apache Software License"; `[tool.hatch.build.targets.sdist].include` adds `/LICENSE` + `/NOTICE` per Apache-2.0 § 4
448
- - `python/src/intent_audit_harness/__init__.py`: `__version__` `0.1.0` → `1.0.2`
449
- - `rust/Cargo.toml`: version `0.1.0` → `1.0.2`; license `MIT` → `Apache-2.0`; `include` adds `NOTICE` per Apache-2.0 § 4
450
- - `rust/Cargo.lock`: package entry version `1.0.1` → `1.0.2` (file is gitignored but the working-tree state is consistent for cargo builds)
451
- - `.github/workflows/ci.yml`: NEW `version-canonical-check` job — fails if any of the 5 tracked version locations diverge from `package.json`, or if any non-npm manifest carries a non-`Apache-2.0` license. The gate also includes a robustness check for `rust/Cargo.lock` (currently gitignored; no-ops gracefully when the file isn't present in CI checkout).
270
+ - **Polyglot manifest alignment + Apache-2.0 NOTICE inclusion in distributions (IEP Convergence Debt Plan Priority 3).** Aligned all polyglot manifests at version `1.0.2`, bumping from npm `v1.0.1` → `v1.0.2` (rather than aligning the PyPI/crates wrappers to npm's `v1.0.1`) so all four registries publish lockstep from this release forward — preserving the immutability of the already-shipped npm `v1.0.1` tarball. Per-file: `package.json` `1.0.1` → `1.0.2`; `version.txt` `0.2.0` → `1.0.2`; `python/pyproject.toml` `0.1.0` → `1.0.2` (license `MIT` → `Apache-2.0`, classifier updated, sdist `include` adds `/LICENSE` + `/NOTICE`); `python/src/intent_audit_harness/__init__.py` `__version__` → `1.0.2`; `rust/Cargo.toml` `0.1.0` → `1.0.2` (license `MIT` → `Apache-2.0`, `include` adds `NOTICE`); `rust/Cargo.lock` package entry `1.0.1` → `1.0.2`.
271
+ - Folded NOTICE-file inclusion into the Python sdist + Rust crate distributions per Apache-2.0 § 4. No CLI surface or runtime behavior changes — pure metadata + packaging alignment.
452
272
 
453
- Closes beads (pending PR merge): `iah-version-drift` (bd_000-projects-uoz3), `iah-license-drift` (bd_000-projects-ck2e), `iah-version-canonical-check` (bd_000-projects-hd5y). AAR at `000-docs/004-AA-AACR-polyglot-version-license-alignment-2026-05-21.md`.
273
+ ### Added
454
274
 
455
- Notes for downstream consumers:
275
+ - **`version-canonical-check` CI job (#35).** Fails if any of the 5 tracked version locations diverge from `package.json`, or if any non-npm manifest carries a non-`Apache-2.0` license. Includes a robustness check for the gitignored `rust/Cargo.lock`. Closes `iah-version-drift`, `iah-license-drift`, `iah-version-canonical-check`. AAR: `000-docs/004-AA-AACR-polyglot-version-license-alignment-2026-05-21.md`.
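The core comparison behind a gate like `version-canonical-check` can be sketched as follows. This is not the CI job itself (which is a workflow step); `find_drift` is a hypothetical helper, and the version values are the pre-alignment ones quoted in the entry above.

```python
def find_drift(versions: dict[str, str], canonical: str = "package.json") -> dict[str, str]:
    """Return every location whose version differs from the canonical one."""
    expected = versions[canonical]
    return {loc: ver for loc, ver in versions.items() if ver != expected}


# State before the 1.0.2 alignment, per the per-file list above.
BEFORE = {
    "package.json": "1.0.1",
    "version.txt": "0.2.0",
    "python/pyproject.toml": "0.1.0",
    "python/src/intent_audit_harness/__init__.py": "0.1.0",
    "rust/Cargo.toml": "0.1.0",
}

# State after alignment: every location reports the same version.
AFTER = {loc: "1.0.2" for loc in BEFORE}
```

A gate built on this would fail the build whenever `find_drift` returns a non-empty mapping.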
456
276
 
457
- - **npm** users: `v1.0.2` is purely metadata + packaging — no observable behavior change vs. `v1.0.1`. Upgrade at your convenience.
458
- - **PyPI + crates.io** users: this is the first published `v1.0.2` and the first published Apache-2.0 release on these registries. The prior published `0.1.0` artifacts pre-date the `v1.0.0` Apache-2.0 relicense and remain available under their original MIT terms (registry tarballs are immutable). From `v1.0.2` forward all four registries publish lockstep at the same SemVer.
277
+ ## [1.0.1] - 2026-05-20
459
278
 
460
- ## [v1.0.1] - 2026-05-20
279
+ ### Fixed
461
280
 
462
- ### Fixed: NOTICE in published tarball
281
+ - **NOTICE in the published tarball.** Added `NOTICE` to `package.json#files` so the file ships in the npm tarball alongside `LICENSE`. Per Apache 2.0 § 4, derivatives must carry the NOTICE file's attribution text if one exists in the source. `v1.0.0` shipped the relicense to Apache 2.0 but the tarball only carried `LICENSE` — this corrects that omission. No code, behavior, CLI, or dependency changes — packaging-only patch.
463
282
 
464
- - Added `NOTICE` to `package.json#files` so the file ships in the npm tarball alongside `LICENSE`. Per Apache 2.0 § 4, derivatives must carry the NOTICE file's attribution text if one exists in the source. `v1.0.0` shipped the relicense to Apache 2.0 but the tarball only carried `LICENSE` — this corrects that omission.
283
+ ## [1.0.0] - 2026-05-19
465
284
 
466
- No code, behavior, CLI, or dependency changes — packaging-only patch.
285
+ ### Changed
467
286
 
468
- ## [v1.0.0] - 2026-05-19
287
+ - **Relicensed from MIT to Apache 2.0 (BREAKING) (#32).** Deliberate alignment with the rest of the Intent Eval Platform ecosystem (`intent-eval-lab`, `intent-eval-core`) so every repo ships under a single OSI-approved license with explicit patent-grant language. Existing `0.x` releases on npm remain available under their original MIT terms (npm tarballs are immutable); all releases `>= 1.0.0` are Apache 2.0. README license section updated with a backward-compat note. No code, CLI surface, behavior, or runtime-dependency changes — license-only bump cut as MAJOR for legal clarity and consumer-review signaling.
288
+ - **Terminology: matcher-map → Intentional Mapping (per ISEDC v2).**
469
289
 
470
- ### Changed — License (BREAKING)
290
+ ### Added
471
291
 
472
- - **Relicensed from MIT to Apache 2.0.** Deliberate alignment with the rest of the Intent Eval Platform ecosystem (`intent-eval-lab`, `intent-eval-core`) so every repo ships under a single OSI-approved license with explicit patent-grant language.
473
- - Existing `0.x` releases on npm remain available under their original MIT terms (npm tarballs are immutable). All releases `>= 1.0.0` are Apache 2.0.
474
- - Added `NOTICE` file per Apache 2.0 best practice with copyright attribution and license summary.
475
- - README license section updated to reflect the change with a backward-compat note.
292
+ - **`NOTICE` file** per Apache 2.0 best practice with copyright attribution and license summary.
476
293
 
477
- No code, CLI surface, behavior, or runtime dependency changes in this release — license-only bump cut as MAJOR for legal clarity and consumer review signaling.
294
+ ## [0.3.0] - 2026-05-12
478
295
 
479
- ## [v0.3.0] - 2026-05-12
296
+ > Documented for completeness — the `--json` + `emit-evidence` work landed in the
297
+ > source tree as the v0.3.0 milestone but a `v0.3.0` git tag was never cut; the next
298
+ > published tag was `v1.0.0`. Kept here so the Milestone-2 capability set is not lost.
299
+ >
300
+ > **Notes:**
301
+ >
302
+ > - **No breaking changes.** Pre-v0.3.0 callers see identical text-mode output and exit codes; `--json` is purely additive.
303
+ > - **CISO gate (per ISEDC v1 Q1, 2026-05-10):** pushing a signed Statement to Rekor against `evals.intentsolutions.io/gate-result/v1` is BLOCKED until DNSSEC + CAA records are verified on the namespace.
480
304
 
481
- ### Added — Evidence Bundle emission (Milestone 2 of the build journey)
305
+ ### Added
482
306
 
483
- - `--json` flag on every gate (`escape-scan`, `harness-hash --verify`, `arch`, `bias`,
484
- `gherkin-lint`, `crap`). Emits a machine-readable gate-result envelope to stdout while
485
- preserving the existing human-readable text on stderr. Exit codes unchanged.
486
- - `emit-evidence` subcommand. Reads a gate-result envelope from stdin (or `--input`),
487
- augments it with `timestamp`, `runner`, `commit_sha`, and emits a complete
488
- [in-toto Statement v1](https://github.com/in-toto/attestation/blob/main/spec/v1/statement.md)
489
- with `predicateType` `https://evals.intentsolutions.io/gate-result/v1` per
490
- [`evidence-bundle/v0.1.0-draft/SPEC.md`](https://github.com/jeremylongshore/intent-eval-lab/blob/main/specs/evidence-bundle/v0.1.0-draft/SPEC.md).
491
- Optional `--sign` (cosign keyless or `--key`), `--rekor-url` for transparency-log push.
492
- OTel `agent.rollout.gate.evaluated` event when `AUDIT_HARNESS_OTEL=1` or
493
- `OTEL_EXPORTER_OTLP_ENDPOINT` set (best-effort no-op otherwise).
494
- - `SEMVER.md` — explicit SemVer commitment doc covering exit codes, stream contracts,
495
- and the predicate URI freeze.
496
- - `tests/regression/run-regression.sh` — backward-compat regression suite. 11 checks
497
- across text-mode parity, `--json` stream separation, schema validation, and the
498
- `emit-evidence` pipeline.
499
- - CI: `regression` job in `.github/workflows/ci.yml` runs the regression suite on every PR.
307
+ - **Evidence Bundle emission (Milestone 2 of the build journey).** A `--json` flag on every gate (`escape-scan`, `harness-hash --verify`, `arch`, `bias`, `gherkin-lint`, `crap`) emits a machine-readable gate-result envelope to stdout while preserving the existing human-readable text on stderr; exit codes unchanged.
308
+ - **`emit-evidence` subcommand.** Reads a gate-result envelope from stdin (or `--input`), augments it with `timestamp`, `runner`, `commit_sha`, and emits a complete [in-toto Statement v1](https://github.com/in-toto/attestation/blob/main/spec/v1/statement.md) with `predicateType` `https://evals.intentsolutions.io/gate-result/v1`. Optional `--sign` (cosign keyless or `--key`) + `--rekor-url`. OTel `agent.rollout.gate.evaluated` event when `AUDIT_HARNESS_OTEL=1` or `OTEL_EXPORTER_OTLP_ENDPOINT` is set.
309
+ - **`SEMVER.md`**: explicit SemVer commitment doc covering exit codes, stream contracts, and the predicate-URI freeze.
310
+ - **`tests/regression/run-regression.sh`**: backward-compat regression suite (11 checks across text-mode parity, `--json` stream separation, schema validation, and the `emit-evidence` pipeline), wired into a `regression` CI job.
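The shape of the Statement that `emit-evidence` is described as producing can be sketched as below. The `_type` URI and the `subject` / `predicateType` / `predicate` layout follow the in-toto Statement v1 spec, and the predicate type is the one named above. Everything else is assumed for illustration: `build_statement` is a hypothetical function, the envelope keys (`gate`, `status`) are invented, and the choice of subject is a guess.

```python
from datetime import datetime, timezone
from typing import Optional

PREDICATE_TYPE = "https://evals.intentsolutions.io/gate-result/v1"


def build_statement(envelope: dict, runner: str, commit_sha: str,
                    subject_name: str, subject_sha256: str,
                    now: Optional[datetime] = None) -> dict:
    """Augment a gate-result envelope and wrap it in an in-toto Statement v1."""
    predicate = dict(envelope)  # copy so the caller's envelope is untouched
    predicate["timestamp"] = (now or datetime.now(timezone.utc)).isoformat()
    predicate["runner"] = runner
    predicate["commit_sha"] = commit_sha
    return {
        "_type": "https://in-toto.io/Statement/v1",
        "subject": [{"name": subject_name, "digest": {"sha256": subject_sha256}}],
        "predicateType": PREDICATE_TYPE,
        "predicate": predicate,
    }


# Hypothetical envelope; real field names are defined by the SPEC, not here.
EXAMPLE = build_statement(
    {"gate": "escape-scan", "status": "pass"},
    runner="local",
    commit_sha="0" * 40,
    subject_name="repo",
    subject_sha256="a" * 64,
    now=datetime(2026, 5, 12, tzinfo=timezone.utc),
)
```

Signing and the Rekor push are left out of this sketch, in line with the CISO gate noted in the 0.3.0 notes.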
500
311
 
501
312
  ### Changed
502
313
 
503
- - `bin/audit-harness.js` dispatcher exposes the new `emit-evidence` subcommand.
504
- - `scripts/arch-check.sh` `--json` output reshaped to the gate-result envelope shape
505
- (the prior single-line `{"tool","status","violations","log"}` was internal — no
506
- documented adopter parsed it).
507
-
508
- ### Notes
314
+ - **`bin/audit-harness.js`** dispatcher exposes the new `emit-evidence` subcommand.
315
+ - **`scripts/arch-check.sh`** `--json` output reshaped to the gate-result envelope shape.
509
316
 
510
- - **No breaking changes.** Pre-v0.3.0 callers see identical text-mode output and exit
511
- codes. The `--json` flag is purely additive.
512
- - **CISO gate (per ISEDC v1 Q1, 2026-05-10):** pushing a signed Statement to Rekor
513
- against `evals.intentsolutions.io/gate-result/v1` is BLOCKED until DNSSEC + CAA
514
- records are verified on the namespace. The script supports unsigned envelope
515
- emission until that gate clears (tracked in `intent-eval-lab/.beads/` as `iel-4zr`).
516
- - **Plan reference:** `~/.claude/plans/se-the-council-bubbly-frog.md` Milestone 2.
317
+ ## [0.2.0] - 2026-05-10
517
318
 
518
- ## [v0.2.0] - 2026-05-10
319
+ ### Added
519
320
 
520
- - docs: add release.yml, complete /repo-dress 21-file canon (c0298ef)
521
- - docs: fill baseline OSS governance gaps via /repo-dress (closes #10) (29a8520)
522
- - docs: Part 2 Workstream A upgrade landscape (c967f3e)
523
- - docs(CLAUDE.md): add three-repo convergence section (b8255a3)
524
- - infra: convergence Phase A.0 + A — bd init, GH templates, CI workflow, design notes (8f30db4)
525
- - bd init: initialize beads issue tracking (ffc7597)
526
- - feat: add PyPI and crates.io wrappers for audit-harness (9b97217)
321
+ - **PyPI and crates.io wrappers for audit-harness** (9b97217). The polyglot trifecta (npm + PyPI + crates) begins here.
527
322
 
528
- All notable changes to `@intentsolutions/audit-harness` are documented here.
323
+ ### Changed
529
324
 
530
- The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
531
- and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
325
+ - **Filled baseline OSS governance gaps via `/repo-dress` (#11).** Completed the `/repo-dress` 21-file canon, including the `release.yml` workflow (#15).
326
+ - **Convergence Phase A.0 + A scaffolding** (8f30db4) — bd issue-tracking init, GitHub issue templates, CI workflow, and the three-repo convergence design notes / CLAUDE.md section (b8255a3, ffc7597).
327
+ - **Part 2 Workstream A upgrade-landscape docs (#9).**
532
328
 
533
- ## [0.1.0] 2026-04-21
329
+ ## [0.1.0] - 2026-04-21
534
330
 
535
331
  Initial release. Extracted from the `audit-tests` Claude Code skill v7.0.0 to enable in-repo enforcement without global skill installation.
536
332
 
333
+ > **Key design decisions:**
334
+ >
335
+ > - **Scripts stay as shell/python** — not a TypeScript port; battle-tested, language-portable, minimal dependencies.
336
+ > - **Thin Node CLI** — `bin/audit-harness.js` is a dispatcher only; all logic lives in `scripts/`.
337
+ > - **Policy-driven thresholds** — `escape-scan.sh` reads floors from `tests/TESTING.md` in the target repo, not from the script source.
338
+ > - **Zero runtime dependencies** beyond Node 18+, bash, and Python 3 (only if using `crap`).
339
+
537
340
  ### Added
538
341
 
539
- - `audit-harness verify` — SHA-256 hash verification for pinned policy files
540
- - `audit-harness init` — initialize/re-init the `.harness-hash` manifest
541
- - `audit-harness list` — list pinned files
542
- - `audit-harness escape-scan` — detect AI escape patterns in a diff (coverage threshold lowering, test deletion, architecture bypasses, test skip markers)
543
- - `audit-harness arch` — dispatch language-appropriate architecture checker (dependency-cruiser / import-linter / ArchUnit / deptrac / arch-go)
544
- - `audit-harness bias` — count common test-bias patterns
545
- - `audit-harness gherkin-lint` — advisory Gherkin quality check
546
- - `audit-harness crap` — CRAP (Complexity × Coverage) scorer for Python, JS/TS, Go, Rust
547
-
548
- ### Key design decisions
549
-
550
- - **Scripts stay as shell/python.** Not a TypeScript port — battle-tested implementations, language-portable, minimal dependencies.
551
- - **Thin Node CLI.** `bin/audit-harness.js` is a dispatcher only; all logic lives in `scripts/`.
552
- - **Policy-driven thresholds.** `escape-scan.sh` reads floors from `tests/TESTING.md` in the target repo, not from the script source.
553
- - **Zero runtime dependencies** beyond Node 18+, bash, and Python 3 (only if using `crap` command).
342
+ - **`audit-harness verify`** — SHA-256 hash verification for pinned policy files.
343
+ - **`audit-harness init`** — initialize / re-init the `.harness-hash` manifest.
344
+ - **`audit-harness list`** — list pinned files.
345
+ - **`audit-harness escape-scan`** — detect AI escape patterns in a diff (coverage-threshold lowering, test deletion, architecture bypasses, test-skip markers).
346
+ - **`audit-harness arch`** — dispatch the language-appropriate architecture checker (dependency-cruiser / import-linter / ArchUnit / deptrac / arch-go).
347
+ - **`audit-harness bias`** — count common test-bias patterns.
348
+ - **`audit-harness gherkin-lint`** — advisory Gherkin quality check.
349
+ - **`audit-harness crap`** — CRAP (Complexity × Coverage) scorer for Python, JS/TS, Go, Rust.
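For readers unfamiliar with the metric, the commonly cited CRAP formula is sketched below. This changelog does not state whether `crap-score.py` uses exactly this form, so treat it as background under that assumption; `crap` is a hypothetical function name.

```python
def crap(complexity: int, coverage: float) -> float:
    """Commonly cited CRAP formula: comp^2 * (1 - cov)^3 + comp.

    `complexity` is cyclomatic complexity; `coverage` is a fraction in [0, 1].
    """
    if not 0.0 <= coverage <= 1.0:
        raise ValueError("coverage must be a fraction between 0 and 1")
    return complexity ** 2 * (1.0 - coverage) ** 3 + complexity
```

Under this formula, full coverage reduces the score to the complexity itself, while zero coverage yields complexity squared plus complexity.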
350
+
351
+ [Unreleased]: https://github.com/jeremylongshore/intent-audit-harness/compare/v1.2.2...HEAD
352
+ [1.2.2]: https://github.com/jeremylongshore/intent-audit-harness/compare/v1.2.1...v1.2.2
353
+ [1.2.1]: https://github.com/jeremylongshore/intent-audit-harness/compare/v1.2.0...v1.2.1
354
+ [1.2.0]: https://github.com/jeremylongshore/intent-audit-harness/compare/v1.1.8...v1.2.0
355
+ [1.1.8]: https://github.com/jeremylongshore/intent-audit-harness/compare/v1.1.7...v1.1.8
356
+ [1.1.7]: https://github.com/jeremylongshore/intent-audit-harness/compare/v1.1.6...v1.1.7
357
+ [1.1.6]: https://github.com/jeremylongshore/intent-audit-harness/compare/v1.1.5...v1.1.6
358
+ [1.1.5]: https://github.com/jeremylongshore/intent-audit-harness/compare/v1.1.4...v1.1.5
359
+ [1.1.4]: https://github.com/jeremylongshore/intent-audit-harness/compare/v1.1.3...v1.1.4
360
+ [1.1.3]: https://github.com/jeremylongshore/intent-audit-harness/compare/v1.1.2...v1.1.3
361
+ [1.1.2]: https://github.com/jeremylongshore/intent-audit-harness/compare/v1.1.1...v1.1.2
362
+ [1.1.1]: https://github.com/jeremylongshore/intent-audit-harness/compare/v1.1.0...v1.1.1
363
+ [1.1.0]: https://github.com/jeremylongshore/intent-audit-harness/compare/v1.0.2...v1.1.0
364
+ [1.0.2]: https://github.com/jeremylongshore/intent-audit-harness/compare/v1.0.1...v1.0.2
365
+ [1.0.1]: https://github.com/jeremylongshore/intent-audit-harness/compare/v1.0.0...v1.0.1
366
+ [1.0.0]: https://github.com/jeremylongshore/intent-audit-harness/compare/v0.2.0...v1.0.0
367
+ [0.3.0]: https://github.com/jeremylongshore/intent-audit-harness/compare/v0.2.0...v1.0.0
368
+ [0.2.0]: https://github.com/jeremylongshore/intent-audit-harness/compare/v0.1.0...v0.2.0
369
+ [0.1.0]: https://github.com/jeremylongshore/intent-audit-harness/releases/tag/v0.1.0
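
The `audit-harness crap` entry above names a CRAP scorer without stating the formula. As a rough illustration only, the sketch below uses the commonly cited Change Risk Anti-Patterns definition, `comp^2 * (1 - cov)^3 + comp`, where `comp` is cyclomatic complexity and `cov` is the covered fraction. Whether the package's own scorer uses exactly this form is an assumption, and `crap_score` is a hypothetical name, not part of the package.

```python
def crap_score(complexity: int, coverage: float) -> float:
    """Return the CRAP score for one function.

    Hypothetical helper, not the package's API. `coverage` is a fraction
    in [0.0, 1.0], not a percentage.
    """
    if complexity < 1:
        raise ValueError("cyclomatic complexity is at least 1")
    if not 0.0 <= coverage <= 1.0:
        raise ValueError("coverage must be a fraction between 0 and 1")
    # The uncovered share is cubed and scaled by complexity squared, so
    # complex untested code scores high; at full coverage the score
    # collapses to the complexity itself.
    return complexity ** 2 * (1.0 - coverage) ** 3 + complexity


if __name__ == "__main__":
    for comp, cov in [(5, 1.0), (5, 0.0), (10, 0.5)]:
        print(f"complexity={comp} coverage={cov:.0%} -> {crap_score(comp, cov)}")
```

Under this definition a fully covered function scores its own complexity, so a threshold on the score mostly bites on code that is both branchy and untested.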
package/README.md CHANGED
@@ -10,7 +10,7 @@ Deterministic test-enforcement toolkit. Companion to the `audit-tests` and `impl

  ## What it is

- A small CLI wrapping 6 deterministic scripts:
+ A small CLI dispatching 17 deterministic commands (shell + stdlib-Python scripts):

  | Command | Purpose |
  |---|---|
@@ -18,10 +18,19 @@ A small CLI wrapping 6 deterministic scripts:
  | `audit-harness init` | Pin the current state of engineer-owned policy files |
  | `audit-harness list` | Show pinned files |
  | `audit-harness escape-scan --staged` | Detect AI attempts to lower test thresholds, delete tests, bypass architecture rules |
+ | `audit-harness cred-gate` | Provider-credential PASS/FAIL gate — FAIL if a declared secret, provider-key shape, or serialized env leaks into the artifact about to be signed |
  | `audit-harness arch` | Run language-appropriate architecture-rule checker (dependency-cruiser / import-linter / ArchUnit / deptrac / arch-go) |
  | `audit-harness bias` | Count common test-bias patterns |
  | `audit-harness gherkin-lint` | Advisory Gherkin quality check |
  | `audit-harness crap` | CRAP (Complexity × Coverage) scorer — Python, Go, JS/TS, Rust |
+ | `audit-harness emit-evidence` | Wrap a gate-result JSON envelope in an in-toto Statement v1 (predicate `gate-result/v1`) |
+ | `audit-harness classify` | Read-only repo classifier → an `audit-profile/v1` value (never writes) |
+ | `audit-harness conform` | Read-only conformance gate-runner → `gate-result/v1` rows against bundled content-addressed schemas |
+ | `audit-harness audit` | Read-only testing-depth gate-runner → coverage presence per pyramid layer + crap-score |
+ | `audit-harness scan` | Read-only security/hygiene/skill-quality gate-runner (gitleaks / osv-scanner / Semgrep / syft / markdownlint / lychee) |
+ | `audit-harness fp-rate` | Measure each gate's false-positive / false-negative rate over a labeled corpus |
+ | `audit-harness currency` | Advisory poll-freshness report over the per-upstream pin relation |
+ | `audit-harness gen-layer-applicability` | Project the canonical audit-profile registry into `layer-applicability.md` |

  ## Install

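
The `emit-evidence` row above describes wrapping a gate-result envelope in an in-toto Statement v1. A minimal sketch of that wrapping follows, assuming the standard Statement v1 layout (`_type`, `subject`, `predicateType`, `predicate`). The `wrap_statement` name, the placeholder predicate URI, and the choice to digest the serialized predicate are illustrative assumptions, not the package's actual behaviour.

```python
import hashlib
import json

STATEMENT_TYPE = "https://in-toto.io/Statement/v1"
# Placeholder URI (assumption): the real predicate type is not shown in this diff.
PREDICATE_URI = "https://example.invalid/gate-result/v1"


def wrap_statement(predicate: dict) -> dict:
    """Wrap a gate-result predicate in an in-toto Statement v1 (illustrative)."""
    # Assumption: the subject digest covers the sorted-key, compact JSON
    # serialization of the predicate.
    canonical = json.dumps(predicate, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return {
        "_type": STATEMENT_TYPE,
        # The subject name mirrors the predicate's gate_id.
        "subject": [{"name": predicate["gate_id"], "digest": {"sha256": digest}}],
        "predicateType": PREDICATE_URI,
        "predicate": predicate,
    }


if __name__ == "__main__":
    example = {"gate_id": "example:escape-scan", "gate_decision": "pass"}
    print(json.dumps(wrap_statement(example), indent=2))
```

Serializing with sorted keys keeps the digest stable across runs that build the same predicate in a different key order.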
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
    "name": "@intentsolutions/audit-harness",
-   "version": "1.2.1",
+   "version": "1.2.3",
    "description": "Deterministic test-enforcement harness — escape-scan, hash-pinning, CRAP, architecture checks, bias detection, Gherkin lint. Companion to the audit-tests and implement-tests Claude Code skills.",
    "license": "Apache-2.0",
    "author": "Jeremy Longshore <jeremy@intentsolutions.io>",
@@ -138,30 +138,95 @@ TIMESTAMP=$(date -u +"%Y-%m-%dT%H:%M:%SZ")
  STATEMENT=$(GATE_JSON="$GATE_JSON" PREDICATE_URI="$PREDICATE_URI" STATEMENT_TYPE="$STATEMENT_TYPE" \
  RUNNER="$RUNNER" COMMIT_SHA="$COMMIT_SHA" TIMESTAMP="$TIMESTAMP" \
  python3 - <<'PY'
- import json, os, sys
+ import json, os, re, sys

  gate = json.loads(os.environ["GATE_JSON"])

+ # Kernel _common.schema.json#/$defs/semver
+ _SEMVER_RE = re.compile(r"^[0-9]+\.[0-9]+\.[0-9]+(-[A-Za-z0-9.-]+)?(\+[A-Za-z0-9.-]+)?$")
+
  required = ["gate_id", "result", "input_hash", "policy_hash"]
  missing = [k for k in required if k not in gate]
  if missing:
      sys.stderr.write(f"emit-evidence: gate-result missing required keys: {missing}\n")
      sys.exit(1)

- # Augment predicate with runner-supplied fields
+ # Build the canonical gate-result/v1 predicate body (Blueprint B § 7.4 / kernel
+ # GateResultV1Schema). The inbound gate JSON is the legacy/draft envelope
+ # (gate_id/result/policy_hash/input_hash[/metadata]); map + synthesize the
+ # canonical fields. The kernel schema FORBIDS additionalProperties, so the legacy
+ # `result`/`timestamp` keys are REPLACED, not augmented. Mirrors the kernel-valid
+ # self-gate emitter ci/emit-evidence.ts:buildGateResult.
+ metadata = gate.get("metadata") or {}
+
+ # result (legacy UPPERCASE) / gate_decision (canonical) -> closed enum.
+ _DECISION_MAP = {"pass": "pass", "fail": "fail", "advisory": "advisory", "error": "error"}
+ decision_raw = str(gate.get("gate_decision", gate.get("result", ""))).strip().lower()
+ gate_decision = _DECISION_MAP.get(decision_raw, "error")
+
+ # gate_name: kebab-case short name; fall back to the last ':' segment of gate_id.
+ gate_name = gate.get("gate_name") or gate["gate_id"].rsplit(":", 1)[-1]
+
+ # gate_version: SemVer; fall back to the runner's semver (<tool>@X.Y.Z). The
+ # kernel pattern is strict, so a non-SemVer runner suffix (e.g. '@unknown')
+ # degrades to 0.0.0 rather than emitting a row that fails kernel validation.
+ gate_version = gate.get("gate_version")
+ if not gate_version:
+     _runner = os.environ["RUNNER"]
+     gate_version = _runner.split("@", 1)[1] if "@" in _runner else ""
+ if not _SEMVER_RE.match(str(gate_version)):
+     gate_version = "0.0.0"
+
+ # gate_reasons: empty array permitted ONLY for unconditional pass; otherwise >=1.
+ reasons = gate.get("gate_reasons")
+ if not reasons:
+     if gate_decision == "pass":
+         reasons = []
+     else:
+         reasons = [str(metadata.get("reason") or gate.get("failure_mode")
+                        or f"{gate_name}: {gate_decision}")]
+
+ # coverage: BOTH arrays REQUIRED. Pass an inbound coverage through only when both
+ # keys are present AND lists (a half-populated dict would fail kernel validation);
+ # otherwise synthesize. An indeterminate row records the dimension as skipped.
+ _cov = gate.get("coverage")
+ if (isinstance(_cov, dict)
+         and isinstance(_cov.get("dimensions_evaluated"), list)
+         and isinstance(_cov.get("dimensions_skipped"), list)):
+     coverage = {"dimensions_evaluated": _cov["dimensions_evaluated"],
+                 "dimensions_skipped": _cov["dimensions_skipped"]}
+ else:
+     _dim = str(metadata.get("kind") or gate_name)
+     if metadata.get("indeterminate"):
+         coverage = {"dimensions_evaluated": [], "dimensions_skipped": [_dim]}
+     else:
+         coverage = {"dimensions_evaluated": [_dim], "dimensions_skipped": []}
+
+ # policy_ref: `sha256:<hex>:<path>` — append an artifact/schema path to policy_hash.
+ policy_ref = gate.get("policy_ref")
+ if not policy_ref:
+     _path = metadata.get("artifact_path") or metadata.get("schema_id") or ".harness-hash"
+     policy_ref = f'{gate["policy_hash"]}:{_path}'
+
  predicate = {
-     "gate_id": gate["gate_id"],
-     "result": gate["result"],
-     "policy_hash": gate["policy_hash"],
-     "input_hash": gate["input_hash"],
-     "timestamp": os.environ["TIMESTAMP"],
-     "runner": os.environ["RUNNER"],
-     "commit_sha": os.environ["COMMIT_SHA"],
+     "gate_id":       gate["gate_id"],
+     "gate_name":     gate_name,
+     "gate_version":  gate_version,
+     "gate_decision": gate_decision,
+     "gate_reasons":  reasons,
+     "coverage":      coverage,
+     "policy_ref":    policy_ref,
+     "policy_hash":   gate["policy_hash"],
+     "input_hash":    gate["input_hash"],
+     "evaluated_at":  os.environ["TIMESTAMP"],
+     "runner":        os.environ["RUNNER"],
+     "commit_sha":    os.environ["COMMIT_SHA"],
  }

- # Carry forward optional fields if present
- for opt in ("metadata", "failure_mode", "advisory_severity"):
-     if opt in gate:
+ # Carry forward optional canonical fields only (schema forbids unknown keys).
+ for opt in ("metadata", "failure_mode", "advisory_severity", "cost_record_ref",
+             "replay_fidelity_level", "coverage_detail"):
+     if gate.get(opt) is not None:
          predicate[opt] = gate[opt]

  # Subject naming: subject.name MUST equal predicate.gate_id (SPEC § 6 R8)