@intentsolutions/audit-harness 1.1.7 → 1.2.0
- package/CHANGELOG.md +65 -0
- package/bin/audit-harness.js +17 -6
- package/docs/cred-gate.md +131 -0
- package/package.json +8 -1
- package/schemas/currency/pins.v1.json +164 -22
- package/scripts/arch-check.sh +20 -2
- package/scripts/bias-count.sh +18 -1
- package/scripts/caa-check.sh +143 -0
- package/scripts/check-wrapper-sync.sh +120 -0
- package/scripts/crap-score.py +57 -6
- package/scripts/cred-gate.sh +238 -0
- package/scripts/currency.py +70 -25
- package/scripts/dnssec-check.sh +158 -0
- package/scripts/emit-evidence.sh +186 -14
- package/scripts/escape-scan.sh +28 -3
- package/scripts/gherkin-lint.sh +5 -0
- package/scripts/harness-hash.sh +5 -0
- package/scripts/kernel-shadow-check.sh +132 -0
package/CHANGELOG.md
CHANGED
|
@@ -4,6 +4,47 @@ All notable changes are recorded here. Format follows [Keep a Changelog](https:/
|
|
|
4
4
|
|
|
5
5
|
## [Unreleased]
|
|
6
6
|
|
|
7
|
+
_Nothing yet._
|
|
8
|
+
|
|
9
|
+
### Riding a future v2.1 routine release (descoped from 1.2.0)
|
|
10
|
+
|
|
11
|
+
- **OTel event-name polish (iah-E07b/c).** The `agent.rollout.gate.evaluated` and `gate.decision.emitted` event names are already locked + tested on main (PRs #78, #81 per NORMATIVE `intent-eval-lab/000-docs/067-AT-SPEC`). Any further attribute-schema polish on those events is deferred to a routine v2.1 release rather than headlined here — it is additive telemetry refinement, not a 1.2.0 capability boundary.
|
|
12
|
+
|
|
13
|
+
## [1.2.0] - 2026-06-15
|
|
14
|
+
|
|
15
|
+
A minor release: the read-only "comprehensive audit, on any repo" brain (`classify` → `conform` → `audit` → `scan` → `currency`), the kernel-emitting evidence path (`emit-evidence` Evidence Bundle, E04), the provider credential gate (`cred-gate`, E08), shared vendorable lint configs (#85), and a golden-master fitness function — all additive, with the zero-runtime-dependency guarantee preserved.
|
|
16
|
+
|
|
17
|
+
### Release narrative (what shipped since 1.1.8)
|
|
18
|
+
|
|
19
|
+
- **`emit-evidence` Evidence Bundle emitter (E04).** The CI-only signed-evidence path emits the harness's own deterministic self-gate as a kernel `gate-result/v1` row inside an `EvidenceBundle`, cosign-signs the canonical bytes (Fulcio OIDC + Rekor), and publishes a `report-manifest.json` the dashboard re-verifies at ingest. Detail under "CI-only signed evidence emit" below.
|
|
20
|
+
- **Provider credential gate (`cred-gate`, E08).** A new gate that asserts provider credentials PASS/FAIL with full redaction + spillover coverage (`scripts/cred-gate.sh`, fixtures via PR #80).
|
|
21
|
+
- **Shared, vendorable lint configs (#85).** `.audit-harness-configs/` (markdownlint / yamllint / ruff / shellcheck) is the canonical config set the IEP repos vendor + extend; `install.sh` now vendors both `scripts/` and `configs/`.
|
|
22
|
+
- **Dogfood AAR (iah-E10d).** First-downstream-adopter run captured at `000-docs/013-AA-AACR-rollout-gate-dogfood-iah-E10-2026-06-15.md`.
|
|
23
|
+
|
|
24
|
+
### Apache-2.0 §4(d) NOTICE obligation — satisfied
|
|
25
|
+
|
|
26
|
+
`NOTICE` is present at the repo root, listed in `package.json#files` (ships in the npm tarball), included in the Python sdist + Rust crate distributions, AND vendored into `.audit-harness/` by `install.sh` (see "`install.sh` vendors NOTICE" below). The §4(d) attribution-travels-with-distribution obligation holds across npm, PyPI, crates.io, and the vendored-install path.
|
|
27
|
+
|
|
28
|
+
### Why minor, not patch
|
|
29
|
+
|
|
30
|
+
Multiple new CLI verbs (`classify`, `conform`, `audit`, `scan`, `currency`, `cred-gate`) and new authored feature surfaces (shared lint configs, golden-master suite, the CI-only evidence emit). Per SemVer this is a minor bump. No CLI command was renamed or removed; the change is purely additive and the published tarball stays zero-runtime-dependency.
|
|
31
|
+
|
|
32
|
+
### Added — golden-master suite for gherkin-lint + crap-score stdout shapes (iah-golden-master)
|
|
33
|
+
|
|
34
|
+
A fitness function that pins the raw stdout of the two scorers whose output is a downstream contract.
|
|
35
|
+
|
|
36
|
+
- **`tests/golden/run-golden.sh`** captures `gherkin-lint.sh` (text rubric) and `crap-score.py --json` (gate-result envelope) stdout against a `tests/fixtures/deliberate-failure/` corpus and diffs each against a checked-in golden, failing on any drift. Environment-volatile bytes are normalized out (gherkin-lint's installed-vs-awk-fallback first line; crap-score's absolute `summary_path`) so the golden is byte-stable across machines. CI installs no complexity provider, so the crap golden captures the deterministic no-provider envelope shape.
|
|
37
|
+
- **Why this and not the per-row schema gate:** the schema gate validates the *augmented* predicate that `emit-evidence` produces, not the raw scorer stdout. A silent reshape of the scorer stdout — a renamed field, a dropped WARN line, changed summary wording — is a backward-compat break the schema gate cannot see. This suite is that missing guard.
|
|
38
|
+
- Regenerate intentional changes with `bash tests/golden/run-golden.sh --update` and review the golden diff in the PR. Wired into `.github/workflows/ci.yml` as the `golden` job.
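The normalize-then-diff idea behind the golden-master suite can be sketched in a few lines. This is a minimal illustration, not the contents of `tests/golden/run-golden.sh`: the function names, the `<SUMMARY_PATH>` placeholder, and the choice to drop the first gherkin-lint line outright are assumptions made for the example. Only the `summary_path` field name and the "volatile first line" behaviour come from the changelog entry.

```python
import difflib
import re

# Matches a JSON "summary_path" value so the machine-specific path can be masked.
SUMMARY_PATH = re.compile(r'("summary_path"\s*:\s*)"[^"]*"')


def normalize_crap_json(stdout):
    """Replace the absolute summary_path with a fixed placeholder (hypothetical)."""
    return SUMMARY_PATH.sub(r'\1"<SUMMARY_PATH>"', stdout)


def normalize_gherkin_text(stdout):
    """Drop the first line, assumed here to be the installed-vs-fallback banner."""
    return "\n".join(stdout.splitlines()[1:])


def drift(actual, golden, normalize):
    """Return unified-diff lines between golden and actual; empty means no drift."""
    want = normalize(golden).splitlines()
    got = normalize(actual).splitlines()
    return list(difflib.unified_diff(want, got, "golden", "actual", lineterm=""))
```

A runner built this way would exit non-zero whenever `drift(...)` returns a non-empty list, and an `--update` mode would overwrite the golden with the normalized actual output.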
|
|
39
|
+
|
|
40
|
+
### Changed — `install.sh` vendors NOTICE + the Node dispatcher (iah-install-sh-completeness)
|
|
41
|
+
|
|
42
|
+
The vendored-install path (non-Node repos) now ships a complete, traceable copy.
|
|
43
|
+
|
|
44
|
+
- **`NOTICE`** is copied into `.audit-harness/` — Apache-2.0 §4(d) requires the NOTICE file to travel with any distribution, and vendoring is a distribution.
|
|
45
|
+
- **`bin/audit-harness.js`** (the Node CLI dispatcher) and **`package.json`** are copied into `.audit-harness/bin/` + `.audit-harness/` so the canonical dispatcher surface is present and its `--version` (which reads `../package.json`) resolves in the vendored tree.
|
|
46
|
+
- A **`PROVENANCE`** file records the source repo, version, tarball URL, and install timestamp so a vendored tree is traceable back to the exact release it came from.
|
|
47
|
+
|
|
7
48
|
### Added — CI-only signed evidence emit for the intent-eval-dashboard (nr75.12)
|
|
8
49
|
|
|
9
50
|
The dashboard reports hub (labs.intentsolutions.io) ingests a signed `report-manifest.json` of kernel `gate-result/v1` rows per repo. This adds audit-harness's own emit, lighting up its row.
|
|
@@ -73,6 +114,30 @@ The first piece of the "comprehensive audit, on any repo" build: the read-only b
|
|
|
73
114
|
|
|
74
115
|
Scope boundary: no `conform` verb, no gate execution yet (Phase 2+). `classify` is read-only and emits a profile only.
|
|
75
116
|
|
|
117
|
+
## [1.1.8] - 2026-06-18
|
|
118
|
+
|
|
119
|
+
Ships the iah-E06 production-signing pre-flight gate to downstream consumers.
|
|
120
|
+
|
|
121
|
+
### Added — DNSSEC + CAA production-signing pre-flight (iah-E06)
|
|
122
|
+
|
|
123
|
+
Before a production-mode `emit-evidence` run signs canonical bytes, two deterministic pre-flight scripts assert the signing domain is cryptographically sound. Both fail closed: any error, missing record, or unreachable resolver blocks the signing path rather than emitting an unverifiable attestation.
|
|
124
|
+
|
|
125
|
+
- **`scripts/dnssec-check.sh`** — verifies the signing domain's DNSSEC chain is present and validates.
|
|
126
|
+
- **`scripts/caa-check.sh`** — verifies the domain's CAA records authorize the signing certificate authority.
|
|
127
|
+
- The `emit-evidence` production path gates on both before signing; staging/draft emit is unaffected.
|
|
128
|
+
|
|
129
|
+
### Fixed — query a trusted validating resolver in the DNSSEC + CAA pre-flight (PR #75)
|
|
130
|
+
|
|
131
|
+
The pre-flight previously trusted the ambient resolver, which may not validate DNSSEC. Both scripts now query known validating resolvers (`1.1.1.1`, `8.8.8.8`) and require the authenticated-data (AD) flag plus an `RRSIG` on the answer. A resolver that does not set AD, or an answer with no RRSIG, is treated as a validation failure (fail-closed) rather than a pass.
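The fail-closed rule described here (AD flag required, RRSIG required, anything else is a failure) can be sketched as a pure function over resolver output. How `dnssec-check.sh` and `caa-check.sh` actually inspect the response is not shown in this diff; the sketch assumes `dig +dnssec`-style text output, and the function name is hypothetical.

```python
import re

# The header line in dig output looks like: ";; flags: qr rd ra ad; QUERY: 1, ..."
FLAGS_LINE = re.compile(r"^;; flags:([^;]*);", re.MULTILINE)


def answer_is_validated(dig_output):
    """Return True only if the AD flag is set AND the answer carries an RRSIG.

    Every other outcome (no flags line, no AD, no RRSIG, empty input) is
    treated as a validation failure, mirroring the fail-closed posture.
    """
    match = FLAGS_LINE.search(dig_output)
    if match is None:
        return False
    if "ad" not in match.group(1).split():
        return False

    in_answer = False
    for line in dig_output.splitlines():
        if line.startswith(";; ANSWER SECTION:"):
            in_answer = True
            continue
        if in_answer:
            # A blank line or a new ";;" header ends the answer section.
            if not line.strip() or line.startswith(";;"):
                break
            fields = line.split()  # name, ttl, class, type, rdata...
            if len(fields) >= 4 and fields[3] == "RRSIG":
                return True
    return False
```

The default return value is `False`, so a parsing surprise blocks the signing path instead of letting it through.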
|
|
132
|
+
|
|
133
|
+
### Changed — Version bumped to 1.1.8 across all manifests
|
|
134
|
+
|
|
135
|
+
Per the `version-canonical-check` CI gate. `package.json` (canonical), `version.txt`, `python/pyproject.toml`, `python/src/intent_audit_harness/__init__.py`, and `rust/Cargo.toml` all report `1.1.8`.
|
|
136
|
+
|
|
137
|
+
### Why patch, not minor
|
|
138
|
+
|
|
139
|
+
The pre-flight scripts shipped to the repo in earlier PRs (#70, #75); this patch propagates them to npm consumers via a version bump. No new public CLI commands or flag changes in this release boundary.
|
|
140
|
+
|
|
76
141
|
## [v1.1.5] - 2026-06-03
|
|
77
142
|
|
|
78
143
|
### Added — npm release pipeline (closes the publish-pipeline gap)
|
package/bin/audit-harness.js
CHANGED
|
@@ -7,7 +7,7 @@
|
|
|
7
7
|
* and language-portable. The CLI just adds discoverability + cross-platform-ish shell resolution.
|
|
8
8
|
*/
|
|
9
9
|
const { spawn } = require('node:child_process');
|
|
10
|
-
const { resolve
|
|
10
|
+
const { resolve } = require('node:path');
|
|
11
11
|
const { existsSync } = require('node:fs');
|
|
12
12
|
|
|
13
13
|
const SCRIPTS = resolve(__dirname, '..', 'scripts');
|
|
@@ -17,6 +17,7 @@ const COMMANDS = {
|
|
|
17
17
|
'init': { script: 'harness-hash.sh', args: ['--init'] },
|
|
18
18
|
'list': { script: 'harness-hash.sh', args: ['--list'] },
|
|
19
19
|
'escape-scan': { script: 'escape-scan.sh', args: [] },
|
|
20
|
+
'cred-gate': { script: 'cred-gate.sh', args: [] },
|
|
20
21
|
'arch': { script: 'arch-check.sh', args: [] },
|
|
21
22
|
'bias': { script: 'bias-count.sh', args: [] },
|
|
22
23
|
'gherkin-lint': { script: 'gherkin-lint.sh', args: [] },
|
|
@@ -35,7 +36,7 @@ const COMMANDS = {
|
|
|
35
36
|
// classify is intentionally NOT here: it emits a meaningful kill-switched profile
|
|
36
37
|
// itself (every gate enforcement=disabled). verify/init/list always run.
|
|
37
38
|
const KILLABLE_GATES = new Set([
|
|
38
|
-
'escape-scan', 'arch', 'bias', 'gherkin-lint', 'crap', 'emit-evidence',
|
|
39
|
+
'escape-scan', 'cred-gate', 'arch', 'bias', 'gherkin-lint', 'crap', 'emit-evidence',
|
|
39
40
|
]);
|
|
40
41
|
|
|
41
42
|
function usage() {
|
|
@@ -50,6 +51,15 @@ Commands:
|
|
|
50
51
|
list List currently pinned files
|
|
51
52
|
escape-scan <source> Scan a diff for escape attempts
|
|
52
53
|
source: --staged | --range A..B | - (stdin) | path.patch
|
|
54
|
+
cred-gate [args...] Provider credential PASS/FAIL gate (iah-E08, CISO
|
|
55
|
+
binding DR-010 S1Q5). Reads a candidate artifact (the
|
|
56
|
+
JSON about to be signed/emitted) on stdin or --input and
|
|
57
|
+
FAILs (exit 1) if a declared secret value leaks verbatim,
|
|
58
|
+
a known provider-key shape is embedded, or the artifact
|
|
59
|
+
serializes the process environment (env-var spillover).
|
|
60
|
+
Offline + read-only. --secret-env NAME (repeatable)
|
|
61
|
+
declares a secret by env-var name; --json emits a
|
|
62
|
+
gate-result/v1 envelope. See docs/cred-gate.md.
|
|
53
63
|
arch Run architecture-rule checks (Wall 7)
|
|
54
64
|
bias Count test-bias patterns (tautology, smoke-only, etc.)
|
|
55
65
|
gherkin-lint Advisory Gherkin quality check
|
|
@@ -85,11 +95,12 @@ Commands:
|
|
|
85
95
|
should flag). The metric that gates advisory->blocking
|
|
86
96
|
promotion. --max-fp-rate X exits 1 if any gate exceeds X.
|
|
87
97
|
See docs/gate-promotion.md.
|
|
88
|
-
currency Advisory
|
|
98
|
+
currency Advisory poll-freshness report. Reads the per-upstream
|
|
89
99
|
pin relation (schemas/currency/pins.v1.json) and flags
|
|
90
|
-
pins whose checked_at is past their
|
|
91
|
-
|
|
92
|
-
|
|
100
|
+
pins whose checked_at is past their poll-freshness SLA
|
|
101
|
+
(the SLA gates nothing but human attention). NO exit-code
|
|
102
|
+
authority (always exit 0), no live-fetch, no auto-fix —
|
|
103
|
+
it reports; /sync-testing-harness acts.
|
|
93
104
|
gen-layer-applicability Project schemas/audit-profile/registry.v1.json into
|
|
94
105
|
schemas/audit-profile/layer-applicability.md. --write to
|
|
95
106
|
regenerate, --check to fail on drift (CI gate). The doc
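The `currency` help text in this hunk, together with the window-resolution order stated in the `$comment` of `schemas/currency/pins.v1.json` (explicit `staleness_window_days` > class SLA > `default_staleness_window_days`), can be sketched as follows. This is an illustrative reading of those two descriptions, not the code in `scripts/currency.py`: the function names are hypothetical, and treating "past their SLA" as strictly more days than the window is an assumption.

```python
from datetime import date


def resolve_window(pin, classes, default_days):
    """Resolve a pin's SLA: explicit value, then class SLA, then the file default."""
    explicit = pin.get("staleness_window_days")
    if explicit is not None:
        return explicit
    class_days = classes.get(pin.get("class"), {}).get("staleness_window_days")
    if class_days is not None:
        return class_days
    return default_days


def stale_pins(doc, today):
    """List (identity, age_days, window_days) for pins checked longer ago than their SLA.

    Purely a report: nothing is fetched, nothing is fixed, and the caller is
    expected to exit 0 regardless of what is returned.
    """
    classes = doc.get("staleness_classes", {})
    default_days = doc["default_staleness_window_days"]
    flagged = []
    for pin in doc.get("pins", []):
        window = resolve_window(pin, classes, default_days)
        age = (today - date.fromisoformat(pin["checked_at"])).days
        if age > window:  # assumption: strictly older than the window
            flagged.append((pin["identity"], age, window))
    return flagged
```

Because the class SLA for `internal-contract` is `null`, a pin of that class without its own `staleness_window_days` falls through to the file-level default under this reading.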
|
package/docs/cred-gate.md
|
@@ -0,0 +1,131 @@
|
|
|
1
|
+
# `cred-gate` — provider credential PASS/FAIL gate (iah-E08)
|
|
2
|
+
|
|
3
|
+
CISO non-negotiable per DR-010 S1Q5. Before any provider abstraction is allowed
|
|
4
|
+
to flow data into an Evidence Bundle / OTel signal / gate-result envelope, the
|
|
5
|
+
`cred-gate` gate proves — deterministically and offline — that:
|
|
6
|
+
|
|
7
|
+
1. **Credential redaction** — no provider secret VALUE appears verbatim in the
|
|
8
|
+
candidate artifact (the JSON the runner is about to sign, the OTel line it is
|
|
9
|
+
about to emit, any log it captures). A leaked API key in a signed,
|
|
10
|
+
Rekor-anchored in-toto Statement is irreversible.
|
|
11
|
+
2. **No env-var spillover** — the candidate artifact does not blindly serialize
|
|
12
|
+
the process environment. A provider key need not be named to leak: a wholesale
|
|
13
|
+
`env` dump spills every secret at once.
|
|
14
|
+
|
|
15
|
+
## Usage
|
|
16
|
+
|
|
17
|
+
```bash
|
|
18
|
+
# Candidate on stdin (the artifact about to be emitted/signed):
|
|
19
|
+
producer | audit-harness cred-gate
|
|
20
|
+
|
|
21
|
+
# Candidate from a file:
|
|
22
|
+
audit-harness cred-gate --input candidate.json
|
|
23
|
+
|
|
24
|
+
# Declare secrets by env-var NAME (the VALUE is read from the environment and
|
|
25
|
+
# never appears on the command line / in `ps`):
|
|
26
|
+
audit-harness cred-gate --secret-env ANTHROPIC_API_KEY --secret-env OPENAI_API_KEY < cand.json
|
|
27
|
+
|
|
28
|
+
# Emit a gate-result/v1 envelope, pipe-ready for emit-evidence:
|
|
29
|
+
audit-harness cred-gate --json < candidate.json | audit-harness emit-evidence
|
|
30
|
+
```
|
|
31
|
+
|
|
32
|
+
## Exit codes
|
|
33
|
+
|
|
34
|
+
| Code | Meaning |
|
|
35
|
+
| ---- | ------- |
|
|
36
|
+
| `0` | **PASS** — no secret value present, no provider-key shape, no env-var spillover |
|
|
37
|
+
| `1` | **FAIL** — a secret value leaked OR a provider-key shape matched OR env-var spillover detected |
|
|
38
|
+
| `2` | usage / input error (no candidate, unreadable `--input`) |
|
|
39
|
+
|
|
40
|
+
## What it detects
|
|
41
|
+
|
|
42
|
+
### Detected provider-key shapes (value-agnostic catalog)
|
|
43
|
+
|
|
44
|
+
These match the on-the-wire SHAPE of a known provider key, so a raw key is caught
|
|
45
|
+
even when it was not declared via `--secret-env`. Patterns are intentionally
|
|
46
|
+
specific to keep the false-positive rate low.
|
|
47
|
+
|
|
48
|
+
| Name | Shape (regex fragment) |
|
|
49
|
+
| ---- | ---------------------- |
|
|
50
|
+
| `anthropic-key` | `sk-ant-…` |
|
|
51
|
+
| `openai-key` | `sk-…` / `sk-proj-…` (excludes `sk-ant-`) |
|
|
52
|
+
| `groq-key` | `gsk_…` |
|
|
53
|
+
| `nvidia-key` | `nvapi-…` |
|
|
54
|
+
| `aws-access-key-id` | `AKIA…` |
|
|
55
|
+
| `google-api-key` | `AIza…` |
|
|
56
|
+
| `github-token` | `ghp_` / `gho_` / `ghs_` / `ghr_` / `ghu_…` |
|
|
57
|
+
| `slack-token` | `xoxb-` / `xoxa-` / `xoxp-` / `xoxr-` / `xoxs-…` |
|
|
58
|
+
| `private-key-block` | `-----BEGIN … PRIVATE KEY-----` |
|
|
59
|
+
|
|
60
|
+
### Env-var spillover heuristics
|
|
61
|
+
|
|
62
|
+
| Name | What it catches |
|
|
63
|
+
| ---- | --------------- |
|
|
64
|
+
| `process-env-spread` | `...process.env` (JS object spread of the whole environment) |
|
|
65
|
+
| `os-environ-dump` | `dict(os.environ)` / a bare `os.environ` serialized into JSON |
|
|
66
|
+
| `env-block-key` | an `"env"` / `"environ"` / `"environment"` object key whose value is a `{…}` block |
|
|
67
|
+
| `printenv-capture` | a `printenv` / `/usr/bin/env` invocation captured into the artifact |
|
|
68
|
+
|
|
69
|
+
A spillover match is a hard **FAIL**: an environment dump inside a to-be-signed
|
|
70
|
+
artifact is exactly the irreversible leak this gate exists to stop.
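As a rough illustration of how structural spillover heuristics like the four above might be expressed, here is a minimal sketch. The regexes are assumptions written for this example and are not the package's `SPILLOVER_PATTERNS`; only the four heuristic names are taken from the table.

```python
import re

# Illustrative patterns only; the real catalog lives in the package and may differ.
SPILLOVER_SKETCH = {
    # JS object spread of the whole environment: {...process.env}
    "process-env-spread": re.compile(r"\.\.\.\s*process\.env\b"),
    # A bare os.environ (not os.environ["X"] or os.environ.get(...))
    "os-environ-dump": re.compile(r"\bos\.environ\b(?!\s*[\[.])"),
    # An "env" / "environ" / "environment" key whose value opens an object
    "env-block-key": re.compile(r'"(?:env|environ|environment)"\s*:\s*\{'),
    # A captured printenv or /usr/bin/env invocation (deliberately coarse)
    "printenv-capture": re.compile(r"\bprintenv\b|/usr/bin/env\b"),
}


def spillover_findings(candidate):
    """Return the names of every heuristic that matches the candidate text."""
    return [name for name, pattern in SPILLOVER_SKETCH.items()
            if pattern.search(candidate)]
```

Any non-empty result would map to a FAIL. Prose that merely mentions the word "environment" matches none of these patterns, because each one keys on a structural shape instead of the word itself.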
|
|
71
|
+
|
|
72
|
+
## False-positive posture
|
|
73
|
+
|
|
74
|
+
- **Declared secrets shorter than 8 chars are ignored** — a 1-char "secret"
|
|
75
|
+
would false-positive on virtually any artifact and is not a real credential.
|
|
76
|
+
- **The word "environment" in prose is NOT a spillover** — only the structural
|
|
77
|
+
`"env"/"environment": { … }` block shape, the `...process.env` spread, the
|
|
78
|
+
`os.environ` dump, or a `printenv` capture flag. (See the `tests/cred-gate`
|
|
79
|
+
FP-guard assertion.)
|
|
80
|
+
- The shape catalog is conservative by design; promotion from advisory to
|
|
81
|
+
blocking elsewhere in the harness follows `docs/gate-promotion.md`.
|
|
82
|
+
|
|
83
|
+
## No re-leak guarantee
|
|
84
|
+
|
|
85
|
+
When a declared secret leaks, the FAIL finding **never echoes the secret value
|
|
86
|
+
back**. It reports only the value's length and a non-reversible SHA-256
|
|
87
|
+
fingerprint prefix, so the finding is actionable without re-leaking. The
|
|
88
|
+
`tests/cred-gate` suite asserts this explicitly.
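The no-re-leak behaviour, combined with the 8-character floor from the false-positive posture, can be sketched like this. It is a minimal illustration and not the gate's analyzer: the function name, the finding field names, the inclusion of the env-var name, and the 12-hex-character prefix length are all assumptions. The `secret-value-leak` kind and the "length plus SHA-256 prefix" rule come from this document.

```python
import hashlib

MIN_SECRET_LEN = 8          # shorter declared secrets are ignored (per the doc)
FINGERPRINT_HEX_CHARS = 12  # assumption: the doc does not state the prefix length


def find_secret_leaks(candidate, secrets):
    """Report declared secrets that appear verbatim in the candidate text.

    `secrets` maps env-var NAME to value. A finding carries only the name,
    the value's length, and a SHA-256 prefix; the value itself is never
    copied into the finding.
    """
    findings = []
    for name, value in secrets.items():
        if len(value) < MIN_SECRET_LEN:
            continue
        if value in candidate:
            digest = hashlib.sha256(value.encode("utf-8")).hexdigest()
            findings.append({
                "kind": "secret-value-leak",
                "env_var": name,  # hypothetical field
                "length": len(value),
                "sha256_prefix": digest[:FINGERPRINT_HEX_CHARS],
            })
    return findings
```

An operator who holds the real secret can recompute the prefix locally to confirm which credential leaked, while the finding alone does not disclose the value.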
|
|
89
|
+
|
|
90
|
+
## Remediation when the gate FAILs
|
|
91
|
+
|
|
92
|
+
| Finding kind | Fix |
|
|
93
|
+
| ------------ | --- |
|
|
94
|
+
| `secret-value-leak` | Remove the literal secret from the artifact. Pass an opaque reference (key NAME, a hash, or a vault path) instead of the value. |
|
|
95
|
+
| `secret-shape-match` | A raw provider key is embedded. Strip it; if it is a real credential, treat it as compromised and rotate. |
|
|
96
|
+
| `env-spillover` | Stop serializing the whole environment. Allowlist the specific non-secret fields you actually need (`os.getenv("X")` per key), never `dict(os.environ)` / `{...process.env}`. |
|
|
97
|
+
|
|
98
|
+
## Safety + scope
|
|
99
|
+
|
|
100
|
+
- **Offline + read-only**: never contacts a provider, never reads a real key
|
|
101
|
+
from disk, never writes.
|
|
102
|
+
- **Secret values via env-var NAME only**: `--secret-env NAME` reads `$NAME`
|
|
103
|
+
through indirect expansion; the value never appears on `argv` (so it is not
|
|
104
|
+
visible to `ps`), and the candidate + secret blob are passed to the python
|
|
105
|
+
analyzer through the environment, not the command line.
|
|
106
|
+
- **Kill-switch aware**: `cred-gate` is in `KILLABLE_GATES`, so
|
|
107
|
+
`AUDIT_HARNESS_DISABLE=1` no-ops it (exit 0, banner) like the other gates.
|
|
108
|
+
- **Timeout aware**: `AUDIT_HARNESS_TIMEOUT=N` supervises it like every gate.
|
|
109
|
+
|
|
110
|
+
## CI (iah-E08c)
|
|
111
|
+
|
|
112
|
+
The `cred-gate` CI lane in `.github/workflows/ci.yml` runs
|
|
113
|
+
`tests/cred-gate/run-cred-gate-tests.sh`, which proves the credential-redaction
|
|
114
|
+
fixtures (E08a), the env-var spillover fixtures (E08b), the `--json` envelope
|
|
115
|
+
round-trip, and — because the same suite also exercises `emit-evidence.sh` — the
|
|
116
|
+
`gate.decision.emitted` OTel event (iah-E07b), which fires per the NORMATIVE
|
|
117
|
+
runtime event taxonomy (intent-eval-lab `067-AT-SPEC` § 2.2) with the
|
|
118
|
+
`gate.decision` enum `{pass, fail, advisory, error}` and the kernel-pinned
|
|
119
|
+
attribute spelling. Both the redaction group AND the spillover group must pass
|
|
120
|
+
for the lane to be green (iah-E08c "both must pass").
|
|
121
|
+
|
|
122
|
+
The fixture suite covers the **full catalog**: every provider-key shape in
|
|
123
|
+
`SHAPE_PATTERNS` (anthropic, openai, groq, nvidia, AWS, Google, GitHub, Slack,
|
|
124
|
+
private-key block) has a FAILing fixture built from a **synthetic, non-real**
|
|
125
|
+
value, and every `SPILLOVER_PATTERNS` heuristic (`process-env-spread`,
|
|
126
|
+
`os-environ-dump`, `env-block-key`, `printenv-capture`) has its own fixture — so
|
|
127
|
+
a regression in any single regex cannot ship silently green. Two PASS guards
|
|
128
|
+
(a non-matching value, and benign "environment" prose) pin the false-positive
|
|
129
|
+
posture. All fixtures are inline in the runner: synthetic secret values are
|
|
130
|
+
injected into the local environment for the duration of one assertion and never
|
|
131
|
+
touch `argv` (passed by env-var NAME via `--secret-env`).
|
package/package.json
CHANGED
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
{
|
|
2
2
|
"name": "@intentsolutions/audit-harness",
|
|
3
|
-
"version": "1.
|
|
3
|
+
"version": "1.2.0",
|
|
4
4
|
"description": "Deterministic test-enforcement harness — escape-scan, hash-pinning, CRAP, architecture checks, bias detection, Gherkin lint. Companion to the audit-tests and implement-tests Claude Code skills.",
|
|
5
5
|
"license": "Apache-2.0",
|
|
6
6
|
"author": "Jeremy Longshore <jeremy@intentsolutions.io>",
|
|
@@ -40,8 +40,15 @@
|
|
|
40
40
|
],
|
|
41
41
|
"scripts": {
|
|
42
42
|
"test": "bash scripts/escape-scan.sh --staged || true",
|
|
43
|
+
"lint": "eslint \"bin/**/*.js\"",
|
|
44
|
+
"lint:fix": "eslint --fix \"bin/**/*.js\"",
|
|
43
45
|
"prepublishOnly": "node bin/audit-harness.js --version"
|
|
44
46
|
},
|
|
47
|
+
"devDependencies": {
|
|
48
|
+
"@eslint/js": "^9.39.4",
|
|
49
|
+
"eslint": "^9.39.4",
|
|
50
|
+
"lefthook": "^1.13.6"
|
|
51
|
+
},
|
|
45
52
|
"publishConfig": {
|
|
46
53
|
"access": "public"
|
|
47
54
|
},
|
package/schemas/currency/pins.v1.json
CHANGED
|
@@ -1,34 +1,183 @@
|
|
|
1
1
|
{
|
|
2
2
|
"pins_version": "currency-pins/v1",
|
|
3
|
-
"description": "Per-upstream-identity pin relation. Each upstream the harness/skills depend on carries ITS OWN pinned version + the date it was last verified against upstream (checked_at) +
|
|
3
|
+
"description": "Per-upstream-identity pin relation. Each upstream the harness/skills depend on carries ITS OWN pinned version + the date it was last verified against upstream (checked_at) + an advisory poll-freshness SLA (staleness_window_days, resolvable from the pin's class). The `currency` advisory report reads this datum and flags pins whose checked_at is older than their SLA — i.e. it makes the PIN'S OWN STALENESS detectable, without ever live-fetching. The SLA gates NOTHING except human attention: currency is advisory-only — it reports + (in /sync-testing-harness) opens PRs; it has no exit-code authority and never auto-fixes. Updating a pin (after a human re-verifies against upstream) is an engineer edit to this file + a fresh checked_at. Pins whose identity matches a surface in the intent-eval-lab upstream-surface registry (specs/upstream-surface-registry.v1.json, 16 monitored surfaces) use the registry surface id as their identity; where the registry declares no version, pinned_version is the sha256 prefix of the lab's vendored snapshot baseline (intent-eval-lab specs/snapshots/.sha/<surface>.sha).",
|
|
4
|
+
"$comment": "2026-06-12 [9k5h.10]: terminology — the former 'bounded-staleness window' framing is now the advisory 'poll-freshness SLA' (same datum, honest name: it is a polling-attention SLA, not a consistency bound, and it gates nothing). Renamed pins to lab-registry surface ids: 'mcp-spec' -> 'mcp-spec-docs', 'claude-code' -> 'claude-code-changelog' ('agentskills-spec' already matched). Added one pin per remaining registry surface (13 new; 16 registry pins total) + kept the 3 internal-contract pins (skill-md-schema, gate-result-predicate, anthropic-sdk). The per-pin 'class' field is an additive, backward-compatible v1 extension (window resolution: explicit staleness_window_days > class SLA > default_staleness_window_days), so no pins.v2.json bump is needed.",
|
|
4
5
|
"default_staleness_window_days": 90,
|
|
6
|
+
"staleness_classes": {
|
|
7
|
+
"spec-page": {
|
|
8
|
+
"staleness_window_days": 7,
|
|
9
|
+
"description": "Human-readable spec / reference doc pages (agentskills.io, modelcontextprotocol.io, code.claude.com + platform.claude.com .md shims). Re-verify weekly."
|
|
10
|
+
},
|
|
11
|
+
"schema-file": {
|
|
12
|
+
"staleness_window_days": 7,
|
|
13
|
+
"description": "Machine-readable schema files (e.g. the MCP schema.ts) — exact field-level diff possible upstream. Re-verify weekly."
|
|
14
|
+
},
|
|
15
|
+
"release-feed": {
|
|
16
|
+
"staleness_window_days": 3,
|
|
17
|
+
"description": "Release/version signals (GH releases.atom, commits.atom, npm version probes, changelogs, engineering-blog index) — earliest material-change signal, so the tightest SLA. Re-verify every 3 days."
|
|
18
|
+
},
|
|
19
|
+
"internal-contract": {
|
|
20
|
+
"staleness_window_days": null,
|
|
21
|
+
"description": "Intent Solutions internal contracts (not lab-registry surfaces). No shared class SLA; each pin carries its own staleness_window_days."
|
|
22
|
+
}
|
|
23
|
+
},
|
|
5
24
|
"pins": [
|
|
6
25
|
{
|
|
7
|
-
"identity": "
|
|
26
|
+
"identity": "agentskills-spec",
|
|
27
|
+
"class": "spec-page",
|
|
28
|
+
"pinned_version": "1.0.0",
|
|
29
|
+
"source": "https://agentskills.io/specification.md",
|
|
30
|
+
"checked_at": "2026-06-12",
|
|
31
|
+
"staleness_window_days": 7,
|
|
32
|
+
"notes": "Open SKILL.md standard (compatibility/metadata/license fields). Lab-registry surface (wave 0, official-spec)."
|
|
33
|
+
},
|
|
34
|
+
{
|
|
35
|
+
"identity": "platform-skills-overview",
|
|
36
|
+
"class": "spec-page",
|
|
37
|
+
"pinned_version": "sha256:0bd9758afca5",
|
|
38
|
+
"source": "https://platform.claude.com/docs/en/agents-and-tools/agent-skills/overview.md",
|
|
39
|
+
"checked_at": "2026-06-12",
|
|
40
|
+
"staleness_window_days": 7,
|
|
41
|
+
"notes": "Anthropic doc page about agent skills. Lab-registry surface (wave 0, anthropic-doc); version = vendored snapshot sha prefix."
|
|
42
|
+
},
|
|
43
|
+
{
|
|
44
|
+
"identity": "mcp-spec-docs",
|
|
45
|
+
"class": "spec-page",
|
|
8
46
|
"pinned_version": "2025-06-18",
|
|
9
|
-
"source": "https://
|
|
10
|
-
"checked_at": "2026-06-
|
|
11
|
-
"staleness_window_days":
|
|
12
|
-
"notes": "MCP protocol spec revision the .mcp.json conform schema targets."
|
|
47
|
+
"source": "https://modelcontextprotocol.io/specification/draft",
|
|
48
|
+
"checked_at": "2026-06-12",
|
|
49
|
+
"staleness_window_days": 7,
|
|
50
|
+
"notes": "MCP protocol spec revision the .mcp.json conform schema targets. Lab-registry surface (wave 1, official-spec); renamed from 'mcp-spec' 2026-06-12."
|
|
51
|
+
},
|
|
52
|
+
{
|
|
53
|
+
"identity": "claude-hooks",
|
|
54
|
+
"class": "spec-page",
|
|
55
|
+
"pinned_version": "sha256:0b644e2208f8",
|
|
56
|
+
"source": "https://code.claude.com/docs/en/hooks.md",
|
|
57
|
+
"checked_at": "2026-06-12",
|
|
58
|
+
"staleness_window_days": 7,
|
|
59
|
+
"notes": "Claude Code hooks reference (hook-config contract). Lab-registry surface (wave 1, reference); version = vendored snapshot sha prefix."
|
|
60
|
+
},
|
|
61
|
+
{
|
|
62
|
+
"identity": "claude-settings",
|
|
63
|
+
"class": "spec-page",
|
|
64
|
+
"pinned_version": "sha256:491b623ae274",
|
|
65
|
+
"source": "https://code.claude.com/docs/en/settings.md",
|
|
66
|
+
"checked_at": "2026-06-12",
|
|
67
|
+
"staleness_window_days": 7,
|
|
68
|
+
"notes": "Claude Code settings reference (hook-config contract). Lab-registry surface (wave 1, reference); version = vendored snapshot sha prefix."
|
|
69
|
+
},
|
|
70
|
+
{
|
|
71
|
+
"identity": "claude-slash-commands",
|
|
72
|
+
"class": "spec-page",
|
|
73
|
+
"pinned_version": "sha256:d7d367c7d004",
|
|
74
|
+
"source": "https://code.claude.com/docs/en/slash-commands.md",
|
|
75
|
+
"checked_at": "2026-06-12",
|
|
76
|
+
"staleness_window_days": 7,
|
|
77
|
+
"notes": "Claude Code slash-commands reference. Lab-registry surface (wave 1, reference); version = vendored snapshot sha prefix."
|
|
78
|
+
},
|
|
79
|
+
{
|
|
80
|
+
"identity": "plugins-reference",
|
|
81
|
+
"class": "spec-page",
|
|
82
|
+
"pinned_version": "sha256:bbb4618ec8b1",
|
|
83
|
+
"source": "https://code.claude.com/docs/en/plugins-reference.md",
|
|
84
|
+
"checked_at": "2026-06-12",
|
|
85
|
+
"staleness_window_days": 7,
|
|
86
|
+
"notes": "Claude Code plugin-manifest reference. Lab-registry surface (wave 2, reference); version = vendored snapshot sha prefix."
|
|
87
|
+
},
|
|
88
|
+
{
|
|
89
|
+
"identity": "sub-agents",
|
|
90
|
+
"class": "spec-page",
|
|
91
|
+
"pinned_version": "sha256:824162201ae4",
|
|
92
|
+
"source": "https://code.claude.com/docs/en/sub-agents.md",
|
|
93
|
+
"checked_at": "2026-06-12",
|
|
94
|
+
"staleness_window_days": 7,
|
|
95
|
+
"notes": "Claude Code sub-agents reference (agent-definition contract). Lab-registry surface (wave 2, reference); version = vendored snapshot sha prefix."
|
|
96
|
+
},
|
|
97
|
+
{
|
|
98
|
+
"identity": "plugin-marketplaces",
|
|
99
|
+
"class": "spec-page",
|
|
100
|
+
"pinned_version": "sha256:1f37e87ff344",
|
|
101
|
+
"source": "https://code.claude.com/docs/en/plugin-marketplaces.md",
|
|
102
|
+
"checked_at": "2026-06-12",
|
|
103
|
+
"staleness_window_days": 7,
|
|
104
|
+
"notes": "Claude Code marketplace-catalog reference. Lab-registry surface (wave 2, reference); version = vendored snapshot sha prefix."
|
|
105
|
+
},
|
|
106
|
+
{
|
|
107
|
+
"identity": "mcp-schema-ts",
|
|
108
|
+
"class": "schema-file",
|
|
109
|
+
"pinned_version": "sha256:1bf94a601817",
|
|
110
|
+
"source": "https://raw.githubusercontent.com/modelcontextprotocol/modelcontextprotocol/main/schema/draft/schema.ts",
|
|
111
|
+
"checked_at": "2026-06-12",
|
|
112
|
+
"staleness_window_days": 7,
|
|
113
|
+
"notes": "MCP machine-readable schema (mcp-config contract; exact field-level diff possible). Lab-registry surface (wave 1, official-spec-machine-readable); version = vendored snapshot sha prefix."
|
|
114
|
+
},
|
|
115
|
+
{
|
|
116
|
+
"identity": "skills-releases",
|
|
117
|
+
"class": "release-feed",
|
|
118
|
+
"pinned_version": "sha256:8ab0fc2a54fa",
|
|
119
|
+
"source": "https://github.com/anthropics/skills/releases.atom + commits/main.atom",
|
|
120
|
+
"checked_at": "2026-06-12",
+"staleness_window_days": 3,
+"notes": "anthropics/skills release + commit feeds (skill-frontmatter contract). Lab-registry surface (wave 0, release-feed); version = vendored snapshot sha prefix."
+},
+{
+"identity": "mcp-releases",
+"class": "release-feed",
+"pinned_version": "sha256:1b180712c47f",
+"source": "https://github.com/modelcontextprotocol/modelcontextprotocol/releases.atom",
+"checked_at": "2026-06-12",
+"staleness_window_days": 3,
+"notes": "MCP spec-repo release feed (mcp-config contract). Lab-registry surface (wave 2, release-feed); version = vendored snapshot sha prefix."
+},
+{
+"identity": "claude-code-changelog",
+"class": "release-feed",
+"pinned_version": "2.1.152",
+"source": "https://raw.githubusercontent.com/anthropics/claude-code/main/CHANGELOG.md",
+"checked_at": "2026-06-12",
+"staleness_window_days": 3,
+"notes": "Claude Code release changelog (version-signal contract; 2.1.152 added disallowed-tools frontmatter). Lab-registry surface (wave 0, changelog); renamed from 'claude-code' 2026-06-12."
+},
+{
+"identity": "claude-code-npm",
+"class": "release-feed",
+"pinned_version": "2.1.152",
+"source": "npm view @anthropic-ai/claude-code version",
+"checked_at": "2026-06-12",
+"staleness_window_days": 3,
+"notes": "Claude Code npm version probe (version-signal contract). Lab-registry surface (wave 0, changelog); version mirrors the last verified npm version."
+},
+{
+"identity": "claude-code-releases",
+"class": "release-feed",
+"pinned_version": "sha256:556b4faba702",
+"source": "https://github.com/anthropics/claude-code/releases.atom",
+"checked_at": "2026-06-12",
+"staleness_window_days": 3,
+"notes": "Claude Code GH release feed (version-signal contract). Lab-registry surface (wave 2, release-feed); version = vendored snapshot sha prefix."
+},
+{
+"identity": "anthropic-engineering",
+"class": "release-feed",
+"pinned_version": "sha256:f99507064aeb",
+"source": "https://www.anthropic.com/engineering",
+"checked_at": "2026-06-12",
+"staleness_window_days": 3,
+"notes": "Anthropic engineering-blog index (cross-cutting-signal contract). Lab-registry surface (wave 0, release-feed); version = vendored snapshot sha prefix."
 },
 {
 "identity": "skill-md-schema",
+"class": "internal-contract",
 "pinned_version": "3.7.0",
 "source": "claude-code-plugins 000-docs/SCHEMA_CHANGELOG.md",
 "checked_at": "2026-06-06",
 "staleness_window_days": 90,
 "notes": "IS SKILL.md schema the conform skillmd-frontmatter floor tracks (full rubric stays in /validate-skillmd)."
 },
-{
-"identity": "claude-code",
-"pinned_version": "2.1.152",
-"source": "https://code.claude.com/docs/en/changelog",
-"checked_at": "2026-06-06",
-"staleness_window_days": 60,
-"notes": "Claude Code release (added disallowed-tools frontmatter at 2.1.152)."
-},
 {
 "identity": "gate-result-predicate",
+"class": "internal-contract",
 "pinned_version": "v1",
 "source": "@intentsolutions/core gate-result/v1 (https://evals.intentsolutions.io/gate-result/v1)",
 "checked_at": "2026-06-06",
@@ -37,19 +186,12 @@
 },
 {
 "identity": "anthropic-sdk",
+"class": "internal-contract",
 "pinned_version": "unverified",
 "source": "https://github.com/anthropics/anthropic-sdk-python (+ -typescript)",
 "checked_at": "2026-06-06",
 "staleness_window_days": 90,
 "notes": "Anthropic SDK surface referenced by downstream skills; pinned_version=unverified until first deliberate verification."
-},
-{
-"identity": "agentskills-spec",
-"pinned_version": "1.0.0",
-"source": "https://agentskills.io/specification",
-"checked_at": "2026-06-06",
-"staleness_window_days": 90,
-"notes": "Open SKILL.md standard (compatibility/metadata/license fields)."
 }
 ]
 }
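The `checked_at` and `staleness_window_days` fields in the pins above suggest a freshness check along the following lines. This is a minimal sketch, assuming a pin counts as stale once more than `staleness_window_days` days have passed since `checked_at`; that rule and the helper names `days_from_civil` and `pin_is_stale` are illustrative assumptions, not the behaviour of `scripts/currency.py`. The date arithmetic is done in bash itself so the sketch does not depend on GNU or BSD `date` flags.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: days since 1970-01-01 for a Gregorian date (year month day).
days_from_civil() {
  # 10# forces base 10 so zero-padded fields such as "08" are not read as octal.
  local y=$((10#$1)) m=$((10#$2)) d=$((10#$3))
  # Treat January and February as months 11 and 12 of the previous year.
  if (( m <= 2 )); then y=$((y - 1)); fi
  local era=$(( (y >= 0 ? y : y - 399) / 400 ))
  local yoe=$(( y - era * 400 ))
  local doy=$(( (153 * (m > 2 ? m - 3 : m + 9) + 2) / 5 + d - 1 ))
  local doe=$(( yoe * 365 + yoe / 4 - yoe / 100 + doy ))
  echo $(( era * 146097 + doe - 719468 ))
}

# Hypothetical helper: succeeds when the pin is older than its window.
# Assumed rule: stale means age in days > staleness_window_days.
pin_is_stale() {
  local checked_at=$1 window_days=$2 today=$3
  local cyear cmonth cday tyear tmonth tday
  IFS=- read -r cyear cmonth cday <<<"$checked_at"
  IFS=- read -r tyear tmonth tday <<<"$today"
  local age=$(( $(days_from_civil "$tyear" "$tmonth" "$tday") - $(days_from_civil "$cyear" "$cmonth" "$cday") ))
  (( age > window_days ))
}

# checked_at and window values are taken from two pins above; "today" is
# passed explicitly so the result is reproducible.
if pin_is_stale 2026-06-12 3 2026-06-16; then echo "mcp-releases: stale"; else echo "mcp-releases: fresh"; fi
# → mcp-releases: stale
if pin_is_stale 2026-06-06 90 2026-06-16; then echo "skill-md-schema: stale"; else echo "skill-md-schema: fresh"; fi
# → skill-md-schema: fresh
```

Under this assumed rule, the 3-day window on the `release-feed` pins makes them stale within days, while the 90-day `internal-contract` pins stay fresh much longer.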
package/scripts/arch-check.sh
CHANGED
@@ -17,6 +17,24 @@

 set -euo pipefail

+# Bash version floor: these gates rely on bash 4+ features. Refuse early with a
+# clear message on bash 3.x (e.g. macOS system bash) instead of failing later
+# with a cryptic syntax error (jcgw).
+[ "${BASH_VERSINFO:-0}" -ge 4 ] || { echo 'audit-harness requires bash >= 4' >&2; exit 3; }
+
+# Cross-platform SHA-256: `sha256sum` ships with GNU coreutils (Linux);
+# macOS only has `shasum -a 256`. Both produce identical `<hash> <file>`
+# output, so downstream awk parsing is unchanged. Same pattern as
+# harness-hash.sh / escape-scan.sh / bias-count.sh.
+if command -v sha256sum >/dev/null 2>&1; then
+SHA256_CMD=(sha256sum)
+elif command -v shasum >/dev/null 2>&1; then
+SHA256_CMD=(shasum -a 256)
+else
+echo "arch-check: neither sha256sum nor shasum found in PATH" >&2
+exit 2
+fi
+
 ROOT="${ROOT:-$(pwd)}"
 JSON_OUT=0
 REPORT_DIR="${ROOT}/reports/arch"
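The hasher detection added in this hunk can be exercised on its own. The sketch below repeats the same `command -v` probe and array assignment, then pipes a known input through it; the `sha256_tagged` helper name is hypothetical and only wraps the `awk '{print "sha256:"$1}'` step that the script applies later.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Pick whichever SHA-256 tool is on PATH. Storing the command in an array
# keeps "shasum -a 256" as three separate words when it is expanded.
if command -v sha256sum >/dev/null 2>&1; then
  SHA256_CMD=(sha256sum)
elif command -v shasum >/dev/null 2>&1; then
  SHA256_CMD=(shasum -a 256)
else
  echo "sketch: neither sha256sum nor shasum found in PATH" >&2
  exit 2
fi

# Hypothetical helper: hash stdin and print it in the "sha256:<hex>" form.
sha256_tagged() {
  "${SHA256_CMD[@]}" | awk '{print "sha256:"$1}'
}

printf 'abc' | sha256_tagged
# → sha256:ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad
```

Both tools print the digest as the first whitespace-separated field, which is why the same `awk` expression works for either one.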
@@ -51,12 +69,12 @@ emit_result() {
 local policy_hash="sha256:0000000000000000000000000000000000000000000000000000000000000000"
 # Best-effort: input_hash is the source tree fingerprint when running against ROOT/src
 if [[ -d "${ROOT}/src" ]]; then
-input_hash=$(find "${ROOT}/src" -type f \( -name "*.ts" -o -name "*.tsx" -o -name "*.js" -o -name "*.py" -o -name "*.go" -o -name "*.rs" -o -name "*.java" -o -name "*.kt" -o -name "*.cs" -o -name "*.php" \) -exec
+input_hash=$(find "${ROOT}/src" -type f \( -name "*.ts" -o -name "*.tsx" -o -name "*.js" -o -name "*.py" -o -name "*.go" -o -name "*.rs" -o -name "*.java" -o -name "*.kt" -o -name "*.cs" -o -name "*.php" \) -exec "${SHA256_CMD[@]}" {} \; 2>/dev/null | sort | "${SHA256_CMD[@]}" | awk '{print "sha256:"$1}')
 fi
 # Hash the architecture rule config (whichever tool's config was used)
 for cfg in .dependency-cruiser.js .dependency-cruiser.cjs .importlinter deptrac.yaml arch-go.yml; do
 if [[ -f "${ROOT}/${cfg}" ]]; then
-policy_hash=$(
+policy_hash=$("${SHA256_CMD[@]}" "${ROOT}/${cfg}" | awk '{print "sha256:"$1}')
 break
 fi
 done
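The `input_hash` pipeline in this hunk hashes each matching file, sorts the resulting lines, and hashes that list to get a single tree fingerprint. A minimal sketch of the same idea follows. The `tree_fingerprint` function name, the reduced `*.py`/`*.js` filter, and the `LC_ALL=C` on `sort` are assumptions made for illustration; the script itself calls plain `sort`.

```shell
#!/usr/bin/env bash
set -euo pipefail

if command -v sha256sum >/dev/null 2>&1; then
  SHA256_CMD=(sha256sum)
elif command -v shasum >/dev/null 2>&1; then
  SHA256_CMD=(shasum -a 256)
else
  echo "sketch: neither sha256sum nor shasum found in PATH" >&2
  exit 2
fi

# Hypothetical helper: fingerprint the *.py and *.js files under a directory.
tree_fingerprint() {
  local dir=$1
  # 1) hash every matching file, 2) sort so find's traversal order does not
  # matter, 3) hash the sorted list, 4) keep only the digest field.
  find "$dir" -type f \( -name "*.py" -o -name "*.js" \) -exec "${SHA256_CMD[@]}" {} + 2>/dev/null \
    | LC_ALL=C sort \
    | "${SHA256_CMD[@]}" \
    | awk '{print "sha256:"$1}'
}

# Demo on a throwaway directory: editing one file changes the fingerprint.
demo=$(mktemp -d)
printf 'print("a")\n' > "$demo/a.py"
printf 'console.log("b")\n' > "$demo/b.js"
before=$(tree_fingerprint "$demo")
printf 'print("changed")\n' > "$demo/a.py"
after=$(tree_fingerprint "$demo")
echo "before: $before"
echo "after:  $after"
rm -rf "$demo"
```

Each per-file line contains the path as well as the digest, so the fingerprint also depends on how the directory is addressed. The diff uses `{} \;` in `arch-check.sh` and `{} +` in `bias-count.sh`; the `+` form batches files into fewer hasher invocations and yields the same per-file lines.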
package/scripts/bias-count.sh
CHANGED
@@ -12,6 +12,23 @@

 set -euo pipefail

+# Bash version floor: these gates rely on bash 4+ features. Refuse early with a
+# clear message on bash 3.x (e.g. macOS system bash) instead of failing later
+# with a cryptic syntax error (jcgw).
+[ "${BASH_VERSINFO:-0}" -ge 4 ] || { echo 'audit-harness requires bash >= 4' >&2; exit 3; }
+
+# Cross-platform SHA-256: `sha256sum` ships with GNU coreutils (Linux);
+# macOS only has `shasum -a 256`. Both produce identical `<hash> <file>`
+# output, so downstream awk parsing is unchanged. Mirrors harness-hash.sh.
+if command -v sha256sum >/dev/null 2>&1; then
+SHA256_CMD=(sha256sum)
+elif command -v shasum >/dev/null 2>&1; then
+SHA256_CMD=(shasum -a 256)
+else
+echo "bias-count: neither sha256sum nor shasum found in PATH" >&2
+exit 2
+fi
+
 JSON_OUT=0
 TEST_DIR="tests"

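The version-floor guard at the top of this hunk compares the major version in `BASH_VERSINFO` against 4. The sketch below isolates that comparison in a hypothetical `meets_bash_floor` function so it can be tried with arbitrary version numbers. Unlike the guard in the script, it only reports the result and does not exit.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: true when a major version meets the floor (default 4).
meets_bash_floor() {
  local major=${1:-0} floor=${2:-4}
  [ "$major" -ge "$floor" ]
}

# ${BASH_VERSINFO:-0} expands to element 0 of the array (the major version),
# or to 0 when the variable is unset, e.g. under a non-bash shell.
if meets_bash_floor "${BASH_VERSINFO:-0}"; then
  echo "bash major ${BASH_VERSINFO:-0}: ok"
else
  echo 'audit-harness requires bash >= 4' >&2
fi
```

In the scripts shown in the diff, a failed floor check exits with status 3 while a missing hasher exits with status 2, so a caller can tell the two failures apart.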
@@ -36,7 +53,7 @@ if [ ! -d "$TEST_DIR" ]; then
 fi

 # Hash the test directory tree as the "input"
-INPUT_HASH=$(find "$TEST_DIR" -type f \( -name "*.py" -o -name "*.ts" -o -name "*.js" -o -name "*.tsx" -o -name "*.jsx" -o -name "*.go" -o -name "*.rs" -o -name "*.java" -o -name "*.kt" -o -name "*.cs" -o -name "*.php" -o -name "*.rb" \) -exec
+INPUT_HASH=$(find "$TEST_DIR" -type f \( -name "*.py" -o -name "*.ts" -o -name "*.js" -o -name "*.tsx" -o -name "*.jsx" -o -name "*.go" -o -name "*.rs" -o -name "*.java" -o -name "*.kt" -o -name "*.cs" -o -name "*.php" -o -name "*.rb" \) -exec "${SHA256_CMD[@]}" {} + 2>/dev/null | sort | "${SHA256_CMD[@]}" | awk '{print "sha256:"$1}')

 if [[ "$JSON_OUT" -eq 1 ]]; then
 exec 3>&1 # save stdout for the JSON object