wicked-vault 0.3.0 → 0.4.0

package/README.md CHANGED
@@ -26,15 +26,16 @@ It checks on two tiers (ADR-0002):
26
26
  hashes, re-run the pure verifier. CI-gate-safe. *Never trust a cached status.*
27
27
  - **Judgment tier** — an **independent** evaluator (≠ the agent that did the
28
28
  work) judges the frozen evidence against the frozen criteria; the opinion is
29
- recorded as a tamper-evident, append-only `opinion_attestation`. *Never trust
30
- a self-graded "done".*
29
+ recorded as a hash-bound, append-only `opinion_attestation` (mutation-
30
+ detecting in the same sense as the envelope — see "Tamper detection"). *Never
31
+ trust a self-graded "done".*
31
32
 
32
33
  ## Boundary
33
34
 
34
35
  | Owns (the primitive) | Refuses (lives in a consumer) |
35
36
  |---|---|
36
37
  | `record` · `verify` · `inspect` · `attest` · `cross-check` · `supersede` | "is the work *done*?" (gate logic) |
37
- | criteria-binding + tamper-evidence (envelope hash; git as audit chain) | scenario/flake history; claim authoring; work-shape |
38
+ | criteria-binding + mutation detection (envelope hash detects naive tamper; committed git history is the audit chain) | scenario/flake history; claim authoring; work-shape |
38
39
  | the deterministic verifier family **and** the append-only attestation ledger | running the judge — the model lives in the `analyze-evidence` *skill*, never in the CLI |
39
40
 
40
41
  The consumer authors the contract; the vault evaluates it mechanically (G9) and
@@ -68,7 +69,7 @@ binaries support `--help`.
68
69
  ## CLI
69
70
 
70
71
  ```bash
71
- wicked-vault init
72
+ wicked-vault init # optional — record / declare-contract / supersede create .wicked-vault/ automatically
72
73
  # record: --criteria is MANDATORY (the bar this evidence claims to clear); --verifier is optional
73
74
  wicked-vault record --scope S --phase build --claim tests-pass --kind test-run \
74
75
  --source "npm test" --criteria "all unit tests pass (exit 0)" \
@@ -108,11 +109,21 @@ skill orchestrates an **independent** judge:
108
109
  2. a model **distinct from the worker** judges criteria-vs-evidence (criteria
109
110
  and evidence are passed as escaped *data*, never as instructions).
110
111
  3. `attest` records the `{opinion, rationale, evaluator, model, …}` to an
111
- append-only, tamper-evident log.
112
+ append-only, hash-bound log (mutation-detecting; the committed git history is
113
+ the durable tamper-evidence — see "Tamper detection").
112
114
 
113
115
  Guarantees that hold: criteria are frozen to the evidence (anti-downgrade);
114
116
  `attest` is **fail-closed** on a tampered artifact and **rejects a self-grade**
115
- (`evaluator == created_by`). What's traded: a judgment is **not reproducible** —
117
+ (`evaluator == created_by`, compared trimmed + case-folded). The independence
118
+ check is hardened: the worker should record with an explicit `--actor` (or
119
+ `WICKED_VAULT_ACTOR`) — when the artifact carries only an *ambient* identity
120
+ (bare `$USER` / anonymous), `attest` **fails closed** and requires
121
+ `--allow-weak-worker-identity` (which stamps the weakness on the attestation for
122
+ audit), and the **evaluator** identity must itself be an explicit assertion.
123
+ This is a stronger mechanical baseline + audit trail, **not** cryptographic
124
+ independence — a determined human can still assert two distinct strings locally;
125
+ real independence comes from a separate evaluator process/credential and the
126
+ committed git trail. What's traded: a judgment is **not reproducible** —
116
127
  it's re-evaluated, not re-derived. The default CI gate stays on the
117
128
  deterministic `--integrity-only` path; the judgment tier is opt-in. Threat model
118
129
  (prompt injection, lax-bar self-grade) and the council 5–0 review:
@@ -149,15 +160,48 @@ signal, *not* a deterministic verdict). `wicked.evidence.tampered` is the
149
160
  high-value alarm: a payload, criteria, or envelope diverged from what was
150
161
  recorded (G2).
151
162
 
163
+ ## Tamper detection — what it does, and what it does NOT do
164
+
165
+ Be precise about the word "tamper-evident", because the mechanism is easy to
166
+ overstate:
167
+
168
+ - **What the envelope hash catches:** *naive or accidental* mutation. The
169
+ envelope is an **unkeyed SHA-256** over the artifact's public fields (scope,
170
+ phase, claim, kind, source, verifier, `criteria_sha256`, `payload_sha256`).
171
+ `verify` re-derives every hash from the bytes on disk and re-runs the pure
172
+ verifier — so a hand-edit to a payload, the criteria, or a cached status is
173
+ detected (`hash_ok: false`), and a stale "pass" is never trusted (G3). This
174
+ defeats the common failure modes: a fat-fingered edit, a tool that rewrites a
175
+ file, an agent that flips `status_at_record`.
176
+ - **What it does NOT do:** it is **not** cryptographically tamper-*resistant*
177
+ against a *determined local writer*. Because the hashes are unkeyed and over
178
+ public fields, anyone who can edit `entries/` can also recompute every hash to
179
+ match — `verify` would then return `hash_ok: true` on a forged entry. There is
180
+ no secret key, no signature, no HMAC. **Do not rely on the envelope hash alone
181
+ as a security boundary.**
182
+ - **Where real tamper-EVIDENCE comes from:** the **committed, branch-protected
183
+ git history** of `.wicked-vault/`. Evidence is committed by default; the PR
184
+ diff shows exactly what was recorded, and branch protection prevents silent
185
+ rewrites. This is **audit-trail-grade** tamper-evidence (you can see, in a
186
+ reviewable history, what changed and who changed it) — **not** cryptographic
187
+ immutability (a force-push by a privileged actor can still rewrite history; CI
188
+ branch protection is the backstop). This matches CONTRACTS.md §6 and ADR-0002.
189
+
190
+ In one line: **the envelope hash detects mutation; committed, branch-protected
191
+ git history is what makes that mutation *evident and accountable*.**
192
+
152
193
  ## Guarantees
153
194
 
154
- G1 server-minted ids · G2 envelope-hash tamper-evidence (**binds the criteria
155
- too**) · **G3 re-derivation (never trust a cached status)** · G4 honest
156
- recording (not sandboxed; harness owns isolation) · G5 fail-closed · G6
157
- append-only · G7 verifier purity (CLI never calls a model) · G8 contract pinning
158
- · G9 mechanical evaluation · **G10 attestation-chain trust** (independent
159
- judgments are recorded, not re-derived; distinct from deterministic results).
160
- Full text + threat model: [`docs/CONTRACTS.md`](docs/CONTRACTS.md). Founding
195
+ G1 server-minted ids · **G2 envelope hash detects naive/accidental payload,
196
+ criteria, or envelope mutation (unkeyed SHA-256 over public fields; binds the
197
+ criteria too). NOT a defense against a determined local writer; see "Tamper
198
+ detection" above.** · **G3 re-derivation (never trust a cached status)** · G4
199
+ honest recording (not sandboxed — harness owns isolation) · G5 fail-closed · G6
200
+ append-only (git history is the audit chain) · G7 verifier purity (CLI never
201
+ calls a model) · G8 contract pinning · G9 mechanical evaluation · **G10
202
+ attestation-chain trust** (independent judgments are recorded, not re-derived;
203
+ distinct from deterministic results). Full text + threat model:
204
+ [`docs/CONTRACTS.md`](docs/CONTRACTS.md). Founding
161
205
  decisions + council reviews:
162
206
  [`docs/adr/0001`](docs/adr/0001-standalone-and-council-revisions.md) ·
163
207
  [`docs/adr/0002`](docs/adr/0002-independent-evaluation-and-criteria-binding.md).
@@ -174,16 +218,18 @@ evaluator may cite — not the whole story. Nondeterministic observation verifie
174
218
  ## Proof
175
219
 
176
220
  ```bash
177
- npm run prove # record -> tamper -> verify-rejects on a real repo
221
+ npm test # the full gating suite (cli-baseline + attestation + bus + verifiers)
222
+ npm run prove # record -> tamper -> verify-rejects on a real repo (needs a sibling repo)
178
223
  bash test/verifiers.sh # the 5 verifiers, pass + fail cases
179
- bash test/attestation.sh # criteria-binding, attest fail-closed/independence, require_attestation
224
+ bash test/attestation.sh # criteria-binding, attest fail-closed/independence (incl. weak-identity), payload limit, require_attestation
180
225
  bash test/bus-integration.sh # graceful no-op + schema validity + real-bus emission (init/record/attest/cross-check)
181
226
  ```
182
227
 
183
- `attestation.sh` and `bus-integration.sh` are the gating proofs and run in CI
228
+ `npm test` runs the gating proofs (`cli-baseline.sh`, `attestation.sh`,
229
+ `bus-integration.sh`, `verifiers.sh`) and is what CI invokes
184
230
  (`.github/workflows/ci.yml`) on ubuntu + macos, with a Windows CLI smoke.
185
231
 
186
- Status: v0.3.0 — deterministic core proven on real repos; criteria-binding +
232
+ Status: v0.3.1 — deterministic core proven on real repos; criteria-binding +
187
233
  independent judgment tier (ADR-0002, council 5–0) implemented and proven;
188
234
  wicked-bus integration **proven end-to-end against a real bus** (emit → store →
189
235
  poll), optional and fire-and-forget; `--help` on both binaries + a
@@ -42,16 +42,22 @@ USAGE
42
42
  wicked-vault <command> [options]
43
43
 
44
44
  COMMANDS
45
- init Create .wicked-vault/ in the current repo
45
+ init Create .wicked-vault/ in the current repo (optional —
46
+ record / declare-contract / supersede auto-create it)
46
47
  record Capture evidence + the criteria it must clear
47
48
  --scope S --phase P --claim C --kind K --source "<cmd|file>"
48
49
  --criteria "<text|@file>" (--run | --artifact <file>) [--verifier "kind:arg"]
50
+ [--actor ID] (the asserted worker identity; strengthens the
51
+ independence check — falls back to WICKED_VAULT_ACTOR then $USER)
49
52
  verify <artifact-id> Integrity tier: re-derive hashes + verifier (deterministic,
50
53
  model-free). Exit 0 iff intact AND pass. Surfaces latest opinion.
51
54
  inspect <artifact-id> Frozen criteria + evidence + integrity (what a judge evaluates)
52
55
  attest <artifact-id> Record an INDEPENDENT judgment (fail-closed; evaluator != creator)
53
56
  --opinion <pass|reject|unclear> --rationale "..." --evaluator ID
54
57
  [--model prov/ver] [--prompt-hash H] [--sampling '<json>']
58
+ [--allow-weak-worker-identity] (attest anyway when the artifact
59
+ was recorded under an ambient $USER/anonymous identity; the
60
+ weakness is stamped on the attestation for audit)
55
61
  attestations <artifact-id> Show the append-only opinion log
56
62
  cross-check Mechanical contract verdict; exit 0 iff PASS
57
63
  --scope S --phase P [--integrity-only (default) | --with-attestations]
@@ -62,9 +68,12 @@ COMMANDS
62
68
  GLOBAL
63
69
  --cwd <dir> Operate on a vault rooted at <dir> (default: walk up from cwd)
64
70
  --help, -h Show this help
71
+ --version, -v Print the wicked-vault version
65
72
 
66
73
  OUTPUT JSON on stdout; exit code is the gate signal (0 = PASS / success).
67
74
  ENV WICKED_VAULT_NO_BUS=1 Disable optional wicked-bus event emission
75
+ WICKED_VAULT_ACTOR=ID Assert the worker identity for record/supersede
76
+ (used by the G10/D4 independence check)
68
77
 
69
78
  Skills (AI CLIs): wicked-vault:{init,record-evidence,verify-evidence,analyze-evidence,cross-check-evidence,update}
70
79
  Install skills: npx wicked-vault-install (run with --help for options)
@@ -80,6 +89,14 @@ if (cmd === undefined || cmd === '--help' || cmd === '-h' || cmd === 'help') {
80
89
  process.exit(0);
81
90
  }
82
91
 
92
+ // Version: like --help, must work outside any repo — resolved from this
93
+ // package's own manifest, never from a vault.
94
+ if (cmd === '--version' || cmd === '-v' || cmd === 'version') {
95
+ const pkg = JSON.parse(readFileSync(new URL('../package.json', import.meta.url), 'utf8'));
96
+ process.stdout.write(pkg.version + '\n');
97
+ process.exit(0);
98
+ }
99
+
83
100
  const args = parseArgs(rest);
84
101
  const cwd = (typeof args.cwd === 'string' && args.cwd) || process.cwd();
85
102
 
@@ -94,7 +111,22 @@ try {
94
111
  }
95
112
 
96
113
  const root = findRoot(cwd, { create: cmd === 'record' || cmd === 'declare-contract' || cmd === 'supersede' });
97
- if (!root) emit({ error: 'no .wicked-vault/ found; run `wicked-vault init`' }, false);
114
+ if (!root) {
115
+ // "What evidence exists?" in a repo with no vault is a question with a
116
+ // truthful answer — none — not an infrastructure error.
117
+ if (cmd === 'list') emit([], true);
118
+ const notFound = {
119
+ error: `no .wicked-vault/ found in or above ${cwd}`,
120
+ code: 'VAULT_NOT_FOUND',
121
+ hint: 'no evidence has been recorded here — `wicked-vault record` creates the vault automatically; `wicked-vault init` is optional scaffolding',
122
+ };
123
+ // Gate consumers (wicked-loom) read `overall` from cross-check JSON:
124
+ // report a truthful FAIL (no evidence exists) instead of a generic error.
125
+ if (cmd === 'cross-check') {
126
+ emit({ scope: args.scope, phase: args.phase, overall: 'FAIL', ...notFound }, false);
127
+ }
128
+ emit(notFound, false);
129
+ }
98
130
 
99
131
  switch (cmd) {
100
132
  case 'record': {
@@ -102,6 +134,7 @@ try {
102
134
  scope: args.scope, phase: args.phase, claim: args.claim, kind: args.kind,
103
135
  source: args.source, verifier: args.verifier, criteria: resolveCriteria(args.criteria),
104
136
  run: !!args.run, artifact: typeof args.artifact === 'string' ? args.artifact : undefined,
137
+ actor: typeof args.actor === 'string' ? args.actor : undefined,
105
138
  cwd,
106
139
  });
107
140
  publish('wicked.evidence.recorded', 'vault.record', {
@@ -120,6 +153,7 @@ try {
120
153
  opinion: args.opinion, rationale: args.rationale, evaluator: args.evaluator,
121
154
  model: args.model, prompt_hash: args['prompt-hash'],
122
155
  sampling: typeof args.sampling === 'string' ? JSON.parse(args.sampling) : undefined,
156
+ allowWeakWorkerIdentity: args['allow-weak-worker-identity'] === true,
123
157
  });
124
158
  publish('wicked.evidence.attested', 'vault.attest', {
125
159
  artifact_id: args._[0] || args.id, attestation_id: res.attestation_id,
@@ -190,6 +224,7 @@ try {
190
224
  scope: args.scope, phase: args.phase, claim: args.claim, kind: args.kind,
191
225
  source: args.source, verifier: args.verifier, criteria: resolveCriteria(args.criteria),
192
226
  run: !!args.run, artifact: typeof args.artifact === 'string' ? args.artifact : undefined,
227
+ actor: typeof args.actor === 'string' ? args.actor : undefined,
193
228
  cwd,
194
229
  });
195
230
  publish('wicked.evidence.superseded', 'vault.supersede', {
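A gate consumer can branch on the structured fields that the `!root` branch in this file's diff emits. The following is a minimal sketch, not part of the package: `interpretCrossCheck` is a hypothetical helper, and it assumes the caller has already captured the CLI's stdout as a string and its exit code as a number. The `overall: 'FAIL'`, `code`, and `hint` fields come from the diff; treating `overall: 'PASS'` as the success value is an assumption based on the help text ("exit 0 iff PASS").

```javascript
// Hypothetical consumer-side helper (illustration only, not wicked-vault code).
// `stdout` is the JSON text printed by `wicked-vault cross-check`;
// `exitCode` is the process exit code (0 = PASS / success per the help text).
function interpretCrossCheck(stdout, exitCode) {
  let report;
  try {
    report = JSON.parse(stdout);
  } catch {
    // Output that cannot be parsed is treated as a failure (fail-closed).
    return { pass: false, reason: 'unparseable output' };
  }
  if (report.code === 'VAULT_NOT_FOUND') {
    // No vault means no evidence: a truthful FAIL, not an infrastructure error.
    return { pass: false, reason: 'no evidence recorded', hint: report.hint };
  }
  // Assumption: a passing report carries overall === 'PASS'.
  const pass = exitCode === 0 && report.overall === 'PASS';
  return { pass, reason: pass ? 'contract satisfied' : 'contract not satisfied' };
}
```

Checking both the exit code and `overall` keeps the gate closed if either signal is missing or disagrees.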
package/docs/CONTRACTS.md CHANGED
@@ -97,6 +97,7 @@ artifact still verify?* and *is this scope+phase's contract satisfied?*
97
97
  | `supersedes` | string? | prior artifact id |
98
98
  | `contract_version` | string? | the contract hash in force at record |
99
99
  | `created_at` / `created_by` | ts / string | actor provenance |
100
+ | `created_by_source` | enum | how `created_by` was resolved: `explicit` (`--actor`) · `env-actor` (`WICKED_VAULT_ACTOR`) · `env-user` (ambient `$USER`, weak) · `anonymous` (none, weak). Governs the G10/D4 independence check — a weak source makes `evaluator != created_by` untrustworthy, so `attest` fails closed unless explicitly overridden. |
100
101
 
101
102
  ### 3.2 Contract (exit-criteria — what evidence a scope+phase requires)
102
103
 
@@ -136,7 +137,9 @@ Its trust is G10 (attestation-chain), not G3 (re-derivation).
136
137
  | `artifact_id` | string | the evidence it judges |
137
138
  | `opinion` | enum | `pass` · `reject` · `unclear` — deliberately NOT named `verdict`/`status` |
138
139
  | `rationale` | string | the judge's reasoning (structured output, not free-form prose injection) |
139
- | `evaluator` | string | the judging identity — **MUST differ from the artifact's `created_by`** (G10/D4) |
140
+ | `evaluator` | string | the judging identity — **MUST differ from the artifact's `created_by`** (G10/D4), compared trimmed + case-folded; **MUST be an explicit assertion** (an ambient `$USER` evaluator is refused) |
141
+ | `evaluator_source` | enum | provenance of the evaluator identity (`explicit` · `env-actor`); ambient sources are refused at `attest` |
142
+ | `worker_identity_weak` | bool | true if the judged artifact was recorded under a weak/ambient/legacy `created_by_source`; `attest` fails closed in that case unless `--allow-weak-worker-identity` is passed, which stamps this flag for audit |
140
143
  | `model` | string | provider/version, e.g. `gemini/2.5-pro` |
141
144
  | `prompt_hash` | string? | hash of the prompt template used |
142
145
  | `sampling` | object? | `{temperature, …}` — provenance for disagreement analysis |
@@ -203,8 +206,16 @@ reproducible.**
203
206
  (a) acceptance criteria are mandatory and bound into the envelope, frozen to
204
207
  the evidence (anti-downgrade); (b) the model runs only in the orchestration
205
208
  layer (`analyze-evidence` skill) — the CLI never calls a model, so G7 holds;
206
- (c) `attest` is fail-closed if the frozen inputs no longer hash-match, and
207
- rejects when `evaluator == created_by`; (d) judgments are non-reproducible by
209
+ (c) `attest` is fail-closed if the frozen inputs no longer hash-match, and the
210
+ independence check is hardened: it rejects when `evaluator == created_by`
211
+ (trimmed + case-folded), **requires an explicit (non-ambient) evaluator
212
+ identity**, and **fails closed when the worker identity is ambient/weak**
213
+ (`created_by_source` of `env-user`/`anonymous`/legacy) unless explicitly
214
+ overridden with `--allow-weak-worker-identity` (which records the weakness for
215
+ audit). This is a mechanical baseline + audit trail, **not** cryptographic
216
+ independence — a determined local actor can still assert two strings; real
217
+ independence is a separate evaluator process/credential + the committed git
218
+ trail; (d) judgments are non-reproducible by
208
219
  design — "never trust the cached verdict" here means *re-evaluate
209
220
  independently*, complementary to G3's *re-derive deterministically*. Threat
210
221
  model in §5a.
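The hardened independence rule in (c) can be sketched as a pure function. This is an illustration under stated assumptions, not the package's implementation: `checkIndependence`, its parameter names, and the order of the checks are hypothetical; case-folding is approximated with `toLowerCase()`; and a missing `createdBySource` stands in for a legacy artifact.

```javascript
// Hypothetical sketch of the G10/D4 independence rule (not wicked-vault code).
const WEAK_SOURCES = new Set(['env-user', 'anonymous']);

// Normalize an identity for comparison: trim, then case-fold (approximation).
const fold = (id) => String(id).trim().toLowerCase();

function checkIndependence({
  createdBy, createdBySource, evaluator, evaluatorSource,
  allowWeakWorkerIdentity = false,
}) {
  // The evaluator identity must be a deliberate assertion, never ambient.
  if (WEAK_SOURCES.has(evaluatorSource)) {
    return { ok: false, reason: 'evaluator identity must be explicit' };
  }
  // Self-grade check on the normalized identities.
  if (fold(evaluator) === fold(createdBy)) {
    return { ok: false, reason: 'self-grade: evaluator equals created_by' };
  }
  // Assumption: an artifact with no recorded source is legacy, hence weak.
  const workerWeak = !createdBySource || WEAK_SOURCES.has(createdBySource);
  if (workerWeak && !allowWeakWorkerIdentity) {
    return { ok: false, reason: 'weak worker identity' };
  }
  // When overridden, the weakness is carried forward for audit.
  return { ok: true, worker_identity_weak: workerWeak };
}
```

As the text notes, this only compares asserted strings. It raises the mechanical floor but does not prove that two identities belong to different parties.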
@@ -298,8 +309,12 @@ In-repo, committed (Decision D1) — **one file per artifact** (council Q2):
298
309
  audit-trail-grade tamper-evidence, not cryptographic immutability. G2's
299
310
  envelope hash detects payload/verdict mutation; it does not prevent a force-push
300
311
  that rewrites both. CI branch protection is the backstop.
301
- - **Large payloads:** `payload_max_bytes` guard; over-size payloads externalize
302
- (hash recorded in the entry, blob stored out-of-tree) to keep the repo lean.
312
+ - **Large payloads:** `payload_max_bytes` (default 1 MiB) is **enforced at
313
+ `record` time**: an over-size payload is rejected fail-closed (G5): no entry
314
+ and no blob are written, keeping the committed repo lean. Set it to `0` to
315
+ disable the guard. (Externalizing over-size blobs out-of-tree with the hash
316
+ recorded in the entry is future hardening, not yet implemented — today the
317
+ contract is "reject", not "externalize".)
303
318
 
304
319
  ---
305
320
 
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "wicked-vault",
3
- "version": "0.3.0",
3
+ "version": "0.4.0",
4
4
  "description": "Local-first evidence primitive — record evidence with its acceptance criteria, re-derive integrity deterministically, and record independent third-party judgments. Never trusts a stored verdict, never lets work self-grade its own \"done\".",
5
5
  "type": "module",
6
6
  "bin": {
@@ -8,7 +8,7 @@
8
8
  "wicked-vault-install": "install.mjs"
9
9
  },
10
10
  "engines": {
11
- "node": ">=18"
11
+ "node": ">=20.0.0"
12
12
  },
13
13
  "license": "MIT",
14
14
  "author": "Mike Parcewski",
@@ -43,6 +43,7 @@
43
43
  "install.mjs"
44
44
  ],
45
45
  "scripts": {
46
+ "test": "bash test/cli-baseline.sh && bash test/attestation.sh && bash test/bus-integration.sh && bash test/verifiers.sh",
46
47
  "prove": "bash test/prove-on-memos.sh",
47
48
  "prove:verifiers": "bash test/verifiers.sh",
48
49
  "prove:attestation": "bash test/attestation.sh",
@@ -1,6 +1,6 @@
1
1
  ---
2
2
  name: wicked-vault:analyze-evidence
3
- description: Have an INDEPENDENT party analyze whether recorded evidence actually meets its frozen acceptance criteria, and record the judgment as a tamper-evident attestation. Use when judging free-form criteria a deterministic check can't express ("does this adequately address the failure modes"), or producing a third-party sign-off that defeats self-graded "done". Runs a model (non-reproducible, costs a call). For the cheap deterministic integrity check, use wicked-vault:verify-evidence instead.
3
+ description: Have an INDEPENDENT party analyze whether recorded evidence actually meets its frozen acceptance criteria, and record the judgment as a hash-bound, append-only attestation (mutation-detecting; durable tamper-evidence is the committed git history). Use when judging free-form criteria a deterministic check can't express ("does this adequately address the failure modes"), or producing a third-party sign-off that defeats self-graded "done". Runs a model (non-reproducible, costs a call). For the cheap deterministic integrity check, use wicked-vault:verify-evidence instead.
4
4
  ---
5
5
 
6
6
  # wicked-vault:analyze-evidence
@@ -8,7 +8,9 @@ description: Have an INDEPENDENT party analyze whether recorded evidence actuall
8
8
  This is the vault's **independent referee** — the judgment tier (G10). The agent
9
9
  that produced the work cannot grade its own "done"; this flow has a *different*
10
10
  evaluator analyze the frozen evidence against its frozen acceptance criteria,
11
- then records that analysis as a tamper-evident, append-only `opinion_attestation`.
11
+ then records that analysis as a hash-bound, append-only `opinion_attestation`
12
+ (mutation-detecting; the durable tamper-evidence is the committed git history —
13
+ see the README "Tamper detection" section).
12
14
 
13
15
  **Know what you're invoking.** This skill:
14
16
  - **runs a model** (an independent evaluator), so it costs a call and is
@@ -26,9 +28,23 @@ satisfy the acceptance criteria?"* and the criteria need judgment.
26
28
 
27
29
  The evaluator **MUST be distinct from the agent that produced the evidence.**
28
30
  Use a separate model CLI (e.g. `gemini`, `codex`) or an isolated subagent — not
29
- the same context that did the work. The CLI enforces the floor: `attest`
30
- **rejects** when `--evaluator` equals the artifact's `created_by`. Spoofable, so
31
- treat the rule as real, not as a checkbox.
31
+ the same context that did the work. The CLI enforces a hardened floor:
32
+
33
+ - `attest` **rejects** when `--evaluator` equals the artifact's `created_by`
34
+ (compared trimmed + case-folded, so `Alice`/`alice ` can't sidestep it).
35
+ - For the independence claim to be meaningful, the **worker** should record with
36
+ an explicit `--actor "<id>"` (or set `WICKED_VAULT_ACTOR`). If the artifact was
37
+ recorded under only an *ambient* identity (bare `$USER` / anonymous), `attest`
38
+ **fails closed** — pass `--allow-weak-worker-identity` to proceed anyway, which
39
+ stamps `worker_identity_weak: true` on the attestation for audit.
40
+ - The **evaluator** identity must itself be an explicit assertion; a bare ambient
41
+ evaluator id is refused (that is the silent self-grade).
42
+
43
+ This is a stronger mechanical baseline + audit trail, **not** cryptographic
44
+ independence — a determined human can still assert two distinct strings for the
45
+ same person locally. Real independence comes from a genuinely separate evaluator
46
+ process/credential and the committed, branch-protected git trail. Treat the rule
47
+ as real, not as a checkbox.
32
48
 
33
49
  ## Orchestration
34
50
 
@@ -1,19 +1,26 @@
1
1
  ---
2
2
  name: wicked-vault:init
3
- description: Initialize a wicked-vault in a repository so claims can be backed by re-derivable evidence. Use when setting up the vault for the first time, when a vault command reports "no .wicked-vault/ found", or before the first record/cross-check in a project.
3
+ description: Initialize a wicked-vault in a repository so claims can be backed by re-derivable evidence. OPTIONAL ceremony: record/declare-contract/supersede create the vault automatically; use init only to scaffold explicitly. A read command reporting code VAULT_NOT_FOUND means no evidence exists yet, which record (not init) fixes.
4
4
  ---
5
5
 
6
6
  # wicked-vault:init
7
7
 
8
8
  Set up the local-first **evidence primitive** in the current repository. The
9
- vault records claim-backing artifacts, hashes them tamper-evidently, and
10
- *re-derives* their verdict on demand — it never trusts a stored status.
9
+ vault records claim-backing artifacts, hashes them so naive/accidental mutation
10
+ is detected on re-derivation, and *re-derives* their verdict on demand — it
11
+ never trusts a stored status. (The hash detects mutation; the committed,
12
+ branch-protected git history is the durable tamper-evidence — see below and the
13
+ README "Tamper detection" section.)
11
14
 
12
15
  ## When to use
13
16
 
14
- - First time using the vault in a repo.
15
- - A command failed with `no .wicked-vault/ found; run \`wicked-vault init\``.
16
- - Before the first `record` or `declare-contract` in a project.
17
+ - You want the `.wicked-vault/` scaffold to exist before any evidence is
18
+ recorded (e.g. committing the directory layout, pre-provisioning CI).
19
+ - Otherwise init is **optional**: `record`, `declare-contract`, and
20
+ `supersede` create the vault automatically on first use.
21
+ - A read command failing with `code: VAULT_NOT_FOUND` means no evidence has
22
+ been recorded in that repo — recording evidence fixes it; bare init alone
23
+ does not produce evidence.
17
24
 
18
25
  ## Initialize
19
26
 
@@ -25,12 +32,17 @@ This creates `.wicked-vault/` at the repo root with:
25
32
 
26
33
  ```
27
34
  .wicked-vault/
28
- vault.json # schema_version, store_mode: in-repo, payload_max_bytes
35
+ vault.json # schema_version, store_mode: in-repo, payload_max_bytes (enforced on record)
29
36
  entries/ # one JSON envelope per recorded artifact (append-only)
30
37
  payloads/ # content-addressed payload blobs (sha256-named, deduped)
31
38
  contracts/ # consumer-authored contracts, per scope/phase
39
+ attestations/ # append-only independent opinion log, per artifact
32
40
  ```
33
41
 
42
+ `payload_max_bytes` (default 1 MiB) is enforced at `record` time: an over-size
43
+ payload is rejected fail-closed (no entry, no blob written) so the committed
44
+ audit chain stays lean. Set it to `0` to disable the guard.
45
+
34
46
  `record`, `declare-contract`, and `supersede` auto-create the vault if one
35
47
  isn't found, so explicit `init` is mostly for clarity. `verify`, `cross-check`,
36
48
  and `list` do **not** auto-create — they fail-closed when no vault exists.
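The `payload_max_bytes` rule described in this hunk (default 1 MiB, reject when over the limit, `0` disables the guard) can be tried in isolation. A minimal sketch, assuming the payload is a `Buffer` and the config is a plain object; `assertPayloadWithinLimit` is a hypothetical name, not an export of the package.

```javascript
// Hypothetical standalone version of the ceiling rule (illustration only).
const DEFAULT_MAX_BYTES = 1048576; // 1 MiB, the default named in the docs

function assertPayloadWithinLimit(blob, cfg = {}) {
  // Fall back to the default when the config has no numeric limit.
  const maxBytes = typeof cfg.payload_max_bytes === 'number'
    ? cfg.payload_max_bytes
    : DEFAULT_MAX_BYTES;
  // A limit of 0 (or less) disables the guard entirely.
  if (maxBytes > 0 && blob.length > maxBytes) {
    throw new Error(`payload exceeds payload_max_bytes: ${blob.length} > ${maxBytes}`);
  }
  return blob.length; // size in bytes of the accepted payload
}
```

Running the check before anything is hashed or written is what makes the rejection fail-closed: a rejected payload leaves no entry and no blob behind.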
@@ -38,13 +50,23 @@ and `list` do **not** auto-create — they fail-closed when no vault exists.
38
50
  The vault is discovered by walking up from the current directory, so any
39
51
  subdirectory of the repo can run vault commands.
40
52
 
41
- ## Should this be committed?
42
-
43
- `store_mode` defaults to `in-repo` the vault lives inside the working tree
44
- and git becomes the audit chain. Decide per project whether `.wicked-vault/` is
45
- committed (shared, auditable evidence) or git-ignored (local-only scratch). The
46
- repo's own `.gitignore` ignores `.wicked-vault/` by default; remove that line to
47
- commit evidence.
53
+ ## Commit the vault — it is the real tamper-evidence backstop
54
+
55
+ `store_mode` defaults to `in-repo`, and **`.wicked-vault/` should be committed.**
56
+ This is not incidental: the envelope hash only catches *naive/accidental*
57
+ mutation: a determined local writer can recompute every hash after editing
58
+ (the hashes are unkeyed SHA-256 over public fields). The protection that
59
+ actually survives a determined editor is the **committed, branch-protected git
60
+ history**: it is what makes after-the-fact tampering visible in a diff and
61
+ preventable with branch protection. Audit-trail-grade, not cryptographic — see
62
+ the README "Tamper detection" section and CONTRACTS.md §6.
63
+
64
+ So **do not git-ignore the vault.** Commit `entries/`, `payloads/`,
65
+ `contracts/`, and `attestations/`; only the derived `index.sqlite` query cache
66
+ is ignored (it is rebuilt from the source of truth). If you have a deliberate
67
+ reason to keep evidence local-only (throwaway scratch, never reviewed), that is
68
+ an explicit opt-out — add `.wicked-vault/` to your `.gitignore` knowing you have
69
+ forfeited the only durable tamper-evidence the vault offers.
48
70
 
49
71
  ## Next steps
50
72
 
@@ -5,9 +5,12 @@ description: Record a claim-backing artifact in the vault and attach a determini
5
5
 
6
6
  # wicked-vault:record-evidence
7
7
 
8
- Capture an artifact, hash it tamper-evidently, and attach a verifier that can
9
- **re-derive** its verdict later. The vault does the capture itself — it never
10
- trusts a claimed status (G4).
8
+ Capture an artifact, hash it (so naive/accidental mutation is detected on
9
+ re-derivation), and attach a verifier that can **re-derive** its verdict later.
10
+ The vault does the capture itself — it never trusts a claimed status (G4). The
11
+ hash detects mutation; the committed git history is the durable tamper-evidence
12
+ (see the README "Tamper detection" section — the envelope hash is unkeyed, so it
13
+ is not a defense against a determined local writer).
11
14
 
12
15
  ## When to use
13
16
 
package/src/vault.mjs CHANGED
@@ -74,6 +74,45 @@ function loadContract(root, scope, phase) {
74
74
  return existsSync(p) ? JSON.parse(readFileSync(p, 'utf8')) : null;
75
75
  }
76
76
 
77
+ // Resolve the acting identity for provenance + the G10/D4 independence check.
78
+ // Precedence (strongest first):
79
+ // 1. explicit value (CLI --actor / --evaluator) -> source 'explicit'
80
+ // 2. WICKED_VAULT_ACTOR env (harness-asserted identity) -> source 'env-actor'
81
+ // 3. $USER env (the OS login — easily spoofed) -> source 'env-user'
82
+ // 4. nothing -> source 'anonymous'
83
+ // The *source* matters: 'explicit' and 'env-actor' are deliberate assertions;
84
+ // 'env-user'/'anonymous' are weak and must not silently satisfy independence.
85
+ function resolveActor(explicit) {
86
+ if (typeof explicit === 'string' && explicit.trim() !== '') {
87
+ return { id: explicit.trim(), source: 'explicit' };
88
+ }
89
+ const envActor = process.env.WICKED_VAULT_ACTOR;
90
+ if (typeof envActor === 'string' && envActor.trim() !== '') {
91
+ return { id: envActor.trim(), source: 'env-actor' };
92
+ }
93
+ const user = process.env.USER || process.env.USERNAME; // USERNAME = Windows
94
+ if (typeof user === 'string' && user.trim() !== '') {
95
+ return { id: user.trim(), source: 'env-user' };
96
+ }
97
+ return { id: 'unknown', source: 'anonymous' };
98
+ }
99
+ // Weak identity provenance — derived from ambient env, not deliberately asserted.
100
+ const WEAK_IDENTITY_SOURCES = new Set(['env-user', 'anonymous']);
101
+
102
+ // Read the vault config (vault.json). Falls back to defaults if the file is
103
+ // absent or unreadable — record auto-creates the vault, so a config always
104
+ // exists by the time a payload is captured, but be defensive.
105
+ const DEFAULT_PAYLOAD_MAX_BYTES = 1048576;
106
+ function loadConfig(root) {
107
+ const cfg = join(root, DIR, 'vault.json');
108
+ if (!existsSync(cfg)) return { payload_max_bytes: DEFAULT_PAYLOAD_MAX_BYTES };
109
+ try {
110
+ return JSON.parse(readFileSync(cfg, 'utf8'));
111
+ } catch {
112
+ return { payload_max_bytes: DEFAULT_PAYLOAD_MAX_BYTES };
113
+ }
114
+ }
115
+
77
116
  export function record(root, opts) {
78
117
  const P = paths(root);
79
118
 
@@ -99,6 +138,18 @@ export function record(root, opts) {
99
138
  throw new Error('record requires --run or --artifact');
100
139
  }
101
140
 
141
+ // Enforce the configured payload ceiling (CONTRACTS.md §6). Oversize payloads
142
+ // are rejected here — before hashing or writing the blob — so a too-large
143
+ // capture can never bloat the committed audit chain. Fail-closed (G5): a
144
+ // rejected record produces NO entry and NO payload blob. `payload_max_bytes`
145
+ // <= 0 disables the guard (escape hatch for an explicitly unbounded vault).
146
+ const cfg = loadConfig(root);
147
+ const maxBytes = typeof cfg.payload_max_bytes === 'number'
148
+ ? cfg.payload_max_bytes : DEFAULT_PAYLOAD_MAX_BYTES;
149
+ if (maxBytes > 0 && blob.length > maxBytes) {
150
+ throw new Error(`payload exceeds payload_max_bytes: ${blob.length} > ${maxBytes} (set payload_max_bytes in .wicked-vault/vault.json to raise the limit, or 0 to disable)`);
151
+ }
152
+
102
153
  const payload_sha256 = sha256(blob);
103
154
 
104
155
  // G10/D1 — acceptance criteria are mandatory and frozen to the evidence.
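The payload-ceiling hunk above can be exercised in isolation. The following is a minimal sketch, not the package's code: `effectiveMaxBytes` and `assertWithinLimit` are hypothetical helper names, and the config object is passed in rather than read from `.wicked-vault/vault.json`. The `cfg &&` guard is an addition of this sketch, covering a config file that parses to `null`.

```javascript
// Hypothetical stand-alone sketch of the payload ceiling; not the package's code.
const DEFAULT_PAYLOAD_MAX_BYTES = 1048576; // same default value as in the diff

// Resolve the effective limit from an already-parsed config (may be null or malformed).
function effectiveMaxBytes(cfg) {
  return cfg && typeof cfg.payload_max_bytes === 'number'
    ? cfg.payload_max_bytes
    : DEFAULT_PAYLOAD_MAX_BYTES;
}

// Reject an oversize payload before anything is hashed or written.
// A limit <= 0 disables the guard, matching the escape hatch described in the diff.
function assertWithinLimit(blob, cfg) {
  const maxBytes = effectiveMaxBytes(cfg);
  if (maxBytes > 0 && blob.length > maxBytes) {
    throw new Error(`payload exceeds payload_max_bytes: ${blob.length} > ${maxBytes}`);
  }
  return maxBytes;
}

// Usage: an 11-byte payload against a 10-byte ceiling is refused.
try {
  assertWithinLimit(new Uint8Array(11), { payload_max_bytes: 10 });
} catch (err) {
  console.log(err.message);
}
```

Because the check runs before hashing and before any write, a refused record leaves no entry and no blob behind, which is the fail-closed behavior the comment in the hunk describes.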
@@ -147,6 +198,11 @@ export function record(root, opts) {
147
198
  ? runVerifier(verifier, payloadView(blob), { repoRoot: opts.cwd || root })
148
199
  : { status: 'n/a', detail: 'no deterministic verifier (judgment-tier claim)' };
149
200
 
201
+ // Actor provenance for the G10/D4 independence assertion. An explicit
202
+ // --actor (or WICKED_VAULT_ACTOR) is a deliberate identity claim; a bare
203
+ // $USER is ambient and weak. attest() uses created_by_source to refuse a
204
+ // silent self-grade where both worker and judge are unasserted (see attest).
205
+ const actor = resolveActor(opts.actor);
150
206
  const entry = {
151
207
  id, ...fields,
152
208
  acceptance_criteria, criteria_authored_by,
@@ -157,7 +213,8 @@ export function record(root, opts) {
157
213
  supersedes: null,
158
214
  contract_version: contract ? contract.contract_version : null,
159
215
  created_at: new Date().toISOString(),
160
- created_by: process.env.USER || 'unknown',
216
+ created_by: actor.id,
217
+ created_by_source: actor.source,
161
218
  };
162
219
  writeFileSync(join(P.entries, `${id}.json`), JSON.stringify(entry, null, 2));
163
220
  return { id, envelope_hash, criteria_authored_by, status_at_record: sr.status, status_detail: sr.detail };
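The precedence that feeds `created_by` and `created_by_source` can be sketched as a pure function. This is an illustration under assumptions, not the shipped `resolveActor`: the name `resolveActorFrom` and the `env` parameter are hypothetical, introduced so the rules can be tried without touching `process.env`.

```javascript
// Sketch of the identity precedence from the diff. `env` is a plain object
// standing in for process.env (hypothetical parameter, for testability).
function resolveActorFrom(explicit, env) {
  const nonEmpty = (v) => typeof v === 'string' && v.trim() !== '';

  // 1. An explicit value (e.g. from --actor / --evaluator) wins.
  if (nonEmpty(explicit)) return { id: explicit.trim(), source: 'explicit' };

  // 2. A harness-asserted identity comes next.
  if (nonEmpty(env.WICKED_VAULT_ACTOR)) {
    return { id: env.WICKED_VAULT_ACTOR.trim(), source: 'env-actor' };
  }

  // 3. The OS login is ambient and therefore weak.
  const user = env.USER || env.USERNAME;
  if (nonEmpty(user)) return { id: user.trim(), source: 'env-user' };

  // 4. Nothing usable was found.
  return { id: 'unknown', source: 'anonymous' };
}

console.log(resolveActorFrom(' ci-bot ', { USER: 'alice' }).source);      // explicit
console.log(resolveActorFrom(undefined, { WICKED_VAULT_ACTOR: 'h7' }).source); // env-actor
console.log(resolveActorFrom('', { USER: 'alice' }).source);              // env-user
console.log(resolveActorFrom(undefined, {}).source);                      // anonymous
```

Returning the source alongside the id is what lets a later step treat `'env-user'` and `'anonymous'` differently from a deliberate assertion.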
@@ -267,6 +324,7 @@ export function inspect(root, id) {
267
324
  acceptance_criteria: entry.acceptance_criteria,
268
325
  criteria_authored_by: entry.criteria_authored_by,
269
326
  created_by: entry.created_by,
327
+ created_by_source: entry.created_by_source || null,
270
328
  evidence: { text: view.text, json: view.json },
271
329
  hash_ok: v.hash_ok,
272
330
  integrity_status: v.status,
@@ -287,11 +345,47 @@ export function attest(root, id, opts) {
287
345
  const entry = JSON.parse(readFileSync(entryPath, 'utf8'));
288
346
 
289
347
  if (!OPINIONS.has(opts.opinion)) throw new Error(`attest: --opinion must be one of pass|reject|unclear (got '${opts.opinion}')`);
290
- if (typeof opts.evaluator !== 'string' || !opts.evaluator) throw new Error('attest requires --evaluator');
348
+ if (typeof opts.evaluator !== 'string' || opts.evaluator.trim() === '') throw new Error('attest requires --evaluator');
349
+
350
+ // G10/D4 — mechanical independence, hardened. The judge must be a DELIBERATELY
351
+ // ASSERTED identity that differs from the worker. Three failure modes are
352
+ // closed here (all on top of the existing equality check):
353
+ //
354
+ // (a) trivial-equality bypass — compare trimmed + case-folded so 'Alice',
355
+ // 'alice', and 'Alice ' can't sidestep the self-grade rejection.
356
+ // (b) ambiguous worker identity — if the artifact was recorded under an
357
+ // ambient identity ($USER / anonymous, created_by_source weak), the
358
+ // independence claim cannot be trusted from a string compare alone.
359
+ // We FAIL CLOSED unless the caller acknowledges it explicitly
360
+ // (--allow-weak-worker-identity / opts.allowWeakWorkerIdentity), and we
361
+ // stamp the weakness onto the attestation so audit can see it.
362
+ // (c) ambiguous evaluator identity — the evaluator must be an explicit
363
+ // assertion. A bare ambient identity for the JUDGE is refused: that is
364
+ // exactly the silent self-grade the env var would otherwise enable.
365
+ //
366
+ // This is a stronger mechanical baseline + audit trail, NOT cryptographic
367
+ // independence. A determined human can still assert two distinct strings for
368
+ // the same person locally; real independence comes from a separate evaluator
369
+ // process/credential (see analyze-evidence skill) and the committed git trail.
370
+ const evaluator = resolveActor(opts.evaluator);
371
+ const norm = (s) => (typeof s === 'string' ? s.trim().toLowerCase() : '');
372
+
373
+ if (WEAK_IDENTITY_SOURCES.has(evaluator.source)) {
374
+ throw new Error(`attest refused (G10/D4): evaluator identity is ambient (${evaluator.source}='${evaluator.id}'), not a deliberate assertion. Pass an explicit --evaluator naming the independent judge (e.g. a model CLI or reviewer id) so a self-grade can't slip through silently.`);
375
+ }
376
+
377
+ if (entry.created_by && norm(evaluator.id) === norm(entry.created_by)) {
378
+ throw new Error(`attest refused (G10/D4): evaluator '${evaluator.id}' equals the artifact creator '${entry.created_by}' — a judgment must be independent of the worker`);
379
+ }
291
380
 
292
- // G10/D4 mechanical independence: the judge must differ from the worker.
293
- if (entry.created_by && opts.evaluator === entry.created_by) {
294
- throw new Error(`attest refused (G10/D4): evaluator '${opts.evaluator}' equals the artifact creator a judgment must be independent of the worker`);
381
+ // The worker's identity provenance governs how much the independence claim is
382
+ // worth. A weak (ambient) worker identity means "different string" proves
383
+ // little. Fail closed unless the caller explicitly accepts that risk.
384
+ const workerSource = entry.created_by_source
385
+ || (entry.created_by && entry.created_by !== 'unknown' ? 'legacy' : 'anonymous');
386
+ const workerIdentityWeak = WEAK_IDENTITY_SOURCES.has(workerSource) || workerSource === 'legacy';
387
+ if (workerIdentityWeak && !opts.allowWeakWorkerIdentity) {
388
+ throw new Error(`attest refused (G10/D4): the artifact was recorded under a weak/ambient worker identity (created_by_source='${workerSource}'), so 'evaluator != created_by' is not a trustworthy independence signal. Re-record with an explicit --actor for the worker, or pass --allow-weak-worker-identity to attest anyway (the weakness is stamped on the attestation for audit).`);
295
389
  }
296
390
 
297
391
  // Fail-closed (G5/G10): never attest against a tampered artifact.
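The independence rules in this hunk reduce to two checks on the recorded entry: a normalized equality test, then a provenance test on the worker identity. The sketch below is a simplified illustration, assuming the evaluator id has already been validated as a non-empty explicit string. `checkIndependence` is a hypothetical name, and it returns a result object instead of throwing, which is a choice of this sketch rather than the package's behavior.

```javascript
// Sources treated as ambient (weak) identity, as in the diff.
const WEAK_IDENTITY_SOURCES = new Set(['env-user', 'anonymous']);

// Trim and case-fold so 'Alice', 'alice' and 'Alice ' compare equal.
const norm = (s) => (typeof s === 'string' ? s.trim().toLowerCase() : '');

// Hypothetical helper: reports why an attestation would be refused.
function checkIndependence(entry, evaluatorId, allowWeakWorkerIdentity = false) {
  // Self-grade: judge and worker are the same identity after normalization.
  if (entry.created_by && norm(evaluatorId) === norm(entry.created_by)) {
    return { ok: false, reason: 'self-grade', workerIdentityWeak: null };
  }

  // Entries written before created_by_source existed are classed as 'legacy'.
  const workerSource = entry.created_by_source
    || (entry.created_by && entry.created_by !== 'unknown' ? 'legacy' : 'anonymous');
  const workerIdentityWeak =
    WEAK_IDENTITY_SOURCES.has(workerSource) || workerSource === 'legacy';

  // Fail closed unless the caller explicitly accepts the weak worker identity.
  if (workerIdentityWeak && !allowWeakWorkerIdentity) {
    return { ok: false, reason: 'weak-worker-identity', workerIdentityWeak };
  }
  return { ok: true, reason: null, workerIdentityWeak };
}

const entry = { created_by: 'alice', created_by_source: 'env-user' };
console.log(checkIndependence(entry, 'Alice ').reason);        // self-grade
console.log(checkIndependence(entry, 'reviewer-1').reason);    // weak-worker-identity
console.log(checkIndependence(entry, 'reviewer-1', true).ok);  // true
```

In the third call the attestation is allowed, but `workerIdentityWeak` stays `true` in the result, mirroring how the diff stamps `worker_identity_weak` onto the attestation for later audit.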
@@ -303,12 +397,17 @@ export function attest(root, id, opts) {
303
397
  artifact_id: id,
304
398
  opinion: opts.opinion,
305
399
  rationale: opts.rationale || '',
306
- evaluator: opts.evaluator,
400
+ evaluator: evaluator.id,
401
+ evaluator_source: evaluator.source, // provenance of the judge identity (G10/D4)
307
402
  model: opts.model || null,
308
403
  prompt_hash: opts.prompt_hash || null,
309
404
  sampling: opts.sampling || null,
310
405
  evidence_sha256: entry.payload_sha256,
311
406
  criteria_sha256: entry.criteria_sha256,
407
+ // Audit flag: the worker identity this independence claim rests on was weak
408
+ // (ambient $USER / anonymous / legacy). The attestation was allowed via an
409
+ // explicit acknowledgement; a downstream gate may choose to discount it.
410
+ worker_identity_weak: workerIdentityWeak,
312
411
  created_at: new Date().toISOString(),
313
412
  };
314
413
  // tamper-evident binding over the attestation tuple (G2-style, G10)