npm - wicked-vault - Versions diffs - 0.3.0 → 0.4.0 - Mend

wicked-vault 0.3.0 → 0.4.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (8)

package/README.md +63 -17
package/bin/wicked-vault.mjs +37 -2
package/docs/CONTRACTS.md +20 -5
package/package.json +3 -2
package/skills/wicked-vault/analyze-evidence/SKILL.md +21 -5
package/skills/wicked-vault/init/SKILL.md +36 -14
package/skills/wicked-vault/record-evidence/SKILL.md +6 -3
package/src/vault.mjs +105 -6

package/README.md CHANGED

@@ -26,15 +26,16 @@ It checks on two tiers (ADR-0002):
   hashes, re-run the pure verifier. CI-gate-safe. *Never trust a cached status.*
 - **Judgment tier** — an **independent** evaluator (≠ the agent that did the
   work) judges the frozen evidence against the frozen criteria; the opinion is
-  recorded as a tamper-evident, append-only `opinion_attestation`. *Never trust
-  a self-graded "done".*
+  recorded as a hash-bound, append-only `opinion_attestation` (mutation-
+  detecting in the same sense as the envelope — see "Tamper detection"). *Never
+  trust a self-graded "done".*
 ## Boundary
 | Owns (the primitive) | Refuses (lives in a consumer) |
 |---|---|
 | `record` · `verify` · `inspect` · `attest` · `cross-check` · `supersede` | "is the work *done*?" (gate logic) |
-| criteria-binding + tamper-evidence (envelope hash; git as audit chain) | scenario/flake history; claim authoring; work-shape |
+| criteria-binding + mutation detection (envelope hash detects naive tamper; committed git history is the audit chain) | scenario/flake history; claim authoring; work-shape |
 | the deterministic verifier family **and** the append-only attestation ledger | running the judge — the model lives in the `analyze-evidence` *skill*, never in the CLI |
 The consumer authors the contract; the vault evaluates it mechanically (G9) and
@@ -68,7 +69,7 @@ binaries support `--help`.
 ## CLI
 ```bash
-wicked-vault init
+wicked-vault init   # optional — record / declare-contract / supersede create .wicked-vault/ automatically
 # record: --criteria is MANDATORY (the bar this evidence claims to clear); --verifier is optional
 wicked-vault record  --scope S --phase build --claim tests-pass --kind test-run \
                      --source "npm test" --criteria "all unit tests pass (exit 0)" \
@@ -108,11 +109,21 @@ skill orchestrates an **independent** judge:
 2. a model **distinct from the worker** judges criteria-vs-evidence (criteria
    and evidence are passed as escaped *data*, never as instructions).
 3. `attest` records the `{opinion, rationale, evaluator, model, …}` to an
-   append-only, tamper-evident log.
+   append-only, hash-bound log (mutation-detecting; the committed git history is
+   the durable tamper-evidence — see "Tamper detection").
 Guarantees that hold: criteria are frozen to the evidence (anti-downgrade);
 `attest` is **fail-closed** on a tampered artifact and **rejects a self-grade**
-(`evaluator == created_by`). What's traded: a judgment is **not reproducible** —
+(`evaluator == created_by`, compared trimmed + case-folded). The independence
+check is hardened: the worker should record with an explicit `--actor` (or
+`WICKED_VAULT_ACTOR`) — when the artifact carries only an *ambient* identity
+(bare `$USER` / anonymous), `attest` **fails closed** and requires
+`--allow-weak-worker-identity` (which stamps the weakness on the attestation for
+audit), and the **evaluator** identity must itself be an explicit assertion.
+This is a stronger mechanical baseline + audit trail, **not** cryptographic
+independence — a determined human can still assert two distinct strings locally;
+real independence comes from a separate evaluator process/credential and the
+committed git trail. What's traded: a judgment is **not reproducible** —
 it's re-evaluated, not re-derived. The default CI gate stays on the
 deterministic `--integrity-only` path; the judgment tier is opt-in. Threat model
 (prompt injection, lax-bar self-grade) and the council 5–0 review:
@@ -149,15 +160,48 @@ signal, *not* a deterministic verdict). `wicked.evidence.tampered` is the
 high-value alarm: a payload, criteria, or envelope diverged from what was
 recorded (G2).
+## Tamper detection — what it does, and what it does NOT do
+Be precise about the word "tamper-evident", because the mechanism is easy to
+overstate:
+- **What the envelope hash catches:** *naive or accidental* mutation. The
+  envelope is an **unkeyed SHA-256** over the artifact's public fields (scope,
+  phase, claim, kind, source, verifier, `criteria_sha256`, `payload_sha256`).
+  `verify` re-derives every hash from the bytes on disk and re-runs the pure
+  verifier — so a hand-edit to a payload, the criteria, or a cached status is
+  detected (`hash_ok: false`), and a stale "pass" is never trusted (G3). This
+  defeats the common failure modes: a fat-fingered edit, a tool that rewrites a
+  file, an agent that flips `status_at_record`.
+- **What it does NOT do:** it is **not** cryptographically tamper-*resistant*
+  against a *determined local writer*. Because the hashes are unkeyed and over
+  public fields, anyone who can edit `entries/` can also recompute every hash to
+  match — `verify` would then return `hash_ok: true` on a forged entry. There is
+  no secret key, no signature, no HMAC. **Do not rely on the envelope hash alone
+  as a security boundary.**
+- **Where real tamper-EVIDENCE comes from:** the **committed, branch-protected
+  git history** of `.wicked-vault/`. Evidence is committed by default; the PR
+  diff shows exactly what was recorded, and branch protection prevents silent
+  rewrites. This is **audit-trail-grade** tamper-evidence (you can see, in a
+  reviewable history, what changed and who changed it) — **not** cryptographic
+  immutability (a force-push by a privileged actor can still rewrite history; CI
+  branch protection is the backstop). This matches CONTRACTS.md §6 and ADR-0002.
+In one line: **the envelope hash detects mutation; committed, branch-protected
+git history is what makes that mutation *evident and accountable*.**
 ## Guarantees
-G1 server-minted ids · G2 envelope-hash tamper-evidence (**binds the criteria
-too**) · **G3 re-derivation (never trust a cached status)** · G4 honest
-recording (not sandboxed — harness owns isolation) · G5 fail-closed · G6
-append-only · G7 verifier purity (CLI never calls a model) · G8 contract pinning
-· G9 mechanical evaluation · **G10 attestation-chain trust** (independent
-judgments are recorded, not re-derived; distinct from deterministic results).
-Full text + threat model: [`docs/CONTRACTS.md`](docs/CONTRACTS.md). Founding
+G1 server-minted ids · **G2 envelope hash — detects naive/accidental payload,
+criteria, or envelope mutation (unkeyed SHA-256 over public fields; binds the
+criteria too). NOT a defense against a determined local writer — see "Tamper
+detection" above.** · **G3 re-derivation (never trust a cached status)** · G4
+honest recording (not sandboxed — harness owns isolation) · G5 fail-closed · G6
+append-only (git history is the audit chain) · G7 verifier purity (CLI never
+calls a model) · G8 contract pinning · G9 mechanical evaluation · **G10
+attestation-chain trust** (independent judgments are recorded, not re-derived;
+distinct from deterministic results). Full text + threat model:
+[`docs/CONTRACTS.md`](docs/CONTRACTS.md). Founding
 decisions + council reviews:
 [`docs/adr/0001`](docs/adr/0001-standalone-and-council-revisions.md) ·
 [`docs/adr/0002`](docs/adr/0002-independent-evaluation-and-criteria-binding.md).
@@ -174,16 +218,18 @@ evaluator may cite — not the whole story. Nondeterministic observation verifie
 ## Proof
 ```bash
-npm run prove                 # record -> tamper -> verify-rejects on a real repo
+npm test                      # the full gating suite (cli-baseline + attestation + bus + verifiers)
+npm run prove                 # record -> tamper -> verify-rejects on a real repo (needs a sibling repo)
 bash test/verifiers.sh        # the 5 verifiers, pass + fail cases
-bash test/attestation.sh      # criteria-binding, attest fail-closed/independence, require_attestation
+bash test/attestation.sh      # criteria-binding, attest fail-closed/independence (incl. weak-identity), payload limit, require_attestation
 bash test/bus-integration.sh  # graceful no-op + schema validity + real-bus emission (init/record/attest/cross-check)
 ```
-`attestation.sh` and `bus-integration.sh` are the gating proofs and run in CI
+`npm test` runs the gating proofs (`cli-baseline.sh`, `attestation.sh`,
+`bus-integration.sh`, `verifiers.sh`) and is what CI invokes
 (`.github/workflows/ci.yml`) on ubuntu + macos, with a Windows CLI smoke.
-Status: v0.3.0 — deterministic core proven on real repos; criteria-binding +
+Status: v0.3.1 — deterministic core proven on real repos; criteria-binding +
 independent judgment tier (ADR-0002, council 5–0) implemented and proven;
 wicked-bus integration **proven end-to-end against a real bus** (emit → store →
 poll), optional and fire-and-forget; `--help` on both binaries + a
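
Review note on the README's new "Tamper detection" section: the gap between catching a naive edit and resisting a determined local writer can be shown in a few lines. This is an illustrative sketch, not the package's code. `envelopeHash`, `hashOk`, and the JSON canonicalization are hypothetical; the diff names the hashed fields but does not show how the vault serializes them.

```javascript
import { createHash } from 'node:crypto';

const sha256 = (s) => createHash('sha256').update(s).digest('hex');

// Field list taken from the README text; the ordering and encoding here are assumptions.
const FIELDS = ['scope', 'phase', 'claim', 'kind', 'source', 'verifier',
  'criteria_sha256', 'payload_sha256'];

// Hypothetical canonicalization: an ordered list of [field, value] pairs.
function envelopeHash(entry) {
  return sha256(JSON.stringify(FIELDS.map((k) => [k, entry[k] ?? null])));
}

// Re-derive both hashes from the bytes at hand, as `verify` is described to do.
function hashOk(entry, payload) {
  return entry.payload_sha256 === sha256(payload)
    && entry.envelope_hash === envelopeHash(entry);
}

const payload = 'exit 0';
const entry = {
  scope: 'S', phase: 'build', claim: 'tests-pass', kind: 'test-run',
  source: 'npm test', verifier: null,
  criteria_sha256: sha256('all unit tests pass (exit 0)'),
  payload_sha256: sha256(payload),
};
entry.envelope_hash = envelopeHash(entry);

// Naive tamper: the payload changes but the stored hashes do not, so it is caught.
const naiveOk = hashOk(entry, 'exit 1');

// Determined local writer: edit the payload, then recompute every unkeyed hash.
const forgedPayload = 'exit 1 (edited)';
const forged = { ...entry, payload_sha256: sha256(forgedPayload) };
forged.envelope_hash = envelopeHash(forged);
const forgedOk = hashOk(forged, forgedPayload);

console.log({ naiveOk, forgedOk });
```

With no secret in the computation, the forged entry verifies exactly like an honest one, which is why the README points to committed, branch-protected git history as the durable record.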

package/bin/wicked-vault.mjs CHANGED

@@ -42,16 +42,22 @@ USAGE
   wicked-vault <command> [options]
 COMMANDS
-  init                         Create .wicked-vault/ in the current repo
+  init                         Create .wicked-vault/ in the current repo (optional —
+                               record / declare-contract / supersede auto-create it)
   record                       Capture evidence + the criteria it must clear
                                --scope S --phase P --claim C --kind K --source "<cmd|file>"
                                --criteria "<text|@file>" (--run | --artifact <file>) [--verifier "kind:arg"]
+                               [--actor ID]  (the asserted worker identity; strengthens the
+                               independence check — falls back to WICKED_VAULT_ACTOR then $USER)
   verify   <artifact-id>       Integrity tier: re-derive hashes + verifier (deterministic,
                                model-free). Exit 0 iff intact AND pass. Surfaces latest opinion.
   inspect  <artifact-id>       Frozen criteria + evidence + integrity (what a judge evaluates)
   attest   <artifact-id>       Record an INDEPENDENT judgment (fail-closed; evaluator != creator)
                                --opinion <pass|reject|unclear> --rationale "..." --evaluator ID
                                [--model prov/ver] [--prompt-hash H] [--sampling '<json>']
+                               [--allow-weak-worker-identity]  (attest anyway when the artifact
+                               was recorded under an ambient $USER/anonymous identity; the
+                               weakness is stamped on the attestation for audit)
   attestations <artifact-id>   Show the append-only opinion log
   cross-check                  Mechanical contract verdict; exit 0 iff PASS
                                --scope S --phase P [--integrity-only (default) | --with-attestations]
@@ -62,9 +68,12 @@ COMMANDS
 GLOBAL
   --cwd <dir>     Operate on a vault rooted at <dir> (default: walk up from cwd)
   --help, -h      Show this help
+  --version, -v   Print the wicked-vault version
 OUTPUT   JSON on stdout; exit code is the gate signal (0 = PASS / success).
 ENV      WICKED_VAULT_NO_BUS=1   Disable optional wicked-bus event emission
+         WICKED_VAULT_ACTOR=ID   Assert the worker identity for record/supersede
+                                 (used by the G10/D4 independence check)
 Skills (AI CLIs):  wicked-vault:{init,record-evidence,verify-evidence,analyze-evidence,cross-check-evidence,update}
 Install skills:    npx wicked-vault-install        (run with --help for options)
@@ -80,6 +89,14 @@ if (cmd === undefined || cmd === '--help' || cmd === '-h' || cmd === 'help') {
   process.exit(0);
 }
+// Version: like --help, must work outside any repo — resolved from this
+// package's own manifest, never from a vault.
+if (cmd === '--version' || cmd === '-v' || cmd === 'version') {
+  const pkg = JSON.parse(readFileSync(new URL('../package.json', import.meta.url), 'utf8'));
+  process.stdout.write(pkg.version + '\n');
+  process.exit(0);
+}
 const args = parseArgs(rest);
 const cwd = (typeof args.cwd === 'string' && args.cwd) || process.cwd();
@@ -94,7 +111,22 @@ try {
   }
   const root = findRoot(cwd, { create: cmd === 'record' || cmd === 'declare-contract' || cmd === 'supersede' });
-  if (!root) emit({ error: 'no .wicked-vault/ found; run `wicked-vault init`' }, false);
+  if (!root) {
+    // "What evidence exists?" in a repo with no vault is a question with a
+    // truthful answer — none — not an infrastructure error.
+    if (cmd === 'list') emit([], true);
+    const notFound = {
+      error: `no .wicked-vault/ found in or above ${cwd}`,
+      code: 'VAULT_NOT_FOUND',
+      hint: 'no evidence has been recorded here — `wicked-vault record` creates the vault automatically; `wicked-vault init` is optional scaffolding',
+    };
+    // Gate consumers (wicked-loom) read `overall` from cross-check JSON:
+    // report a truthful FAIL (no evidence exists) instead of a generic error.
+    if (cmd === 'cross-check') {
+      emit({ scope: args.scope, phase: args.phase, overall: 'FAIL', ...notFound }, false);
+    }
+    emit(notFound, false);
+  }
   switch (cmd) {
     case 'record': {
@@ -102,6 +134,7 @@ try {
         scope: args.scope, phase: args.phase, claim: args.claim, kind: args.kind,
         source: args.source, verifier: args.verifier, criteria: resolveCriteria(args.criteria),
         run: !!args.run, artifact: typeof args.artifact === 'string' ? args.artifact : undefined,
+        actor: typeof args.actor === 'string' ? args.actor : undefined,
         cwd,
       });
       publish('wicked.evidence.recorded', 'vault.record', {
@@ -120,6 +153,7 @@ try {
         opinion: args.opinion, rationale: args.rationale, evaluator: args.evaluator,
         model: args.model, prompt_hash: args['prompt-hash'],
         sampling: typeof args.sampling === 'string' ? JSON.parse(args.sampling) : undefined,
+        allowWeakWorkerIdentity: args['allow-weak-worker-identity'] === true,
       });
       publish('wicked.evidence.attested', 'vault.attest', {
         artifact_id: args._[0] || args.id, attestation_id: res.attestation_id,
@@ -190,6 +224,7 @@ try {
         scope: args.scope, phase: args.phase, claim: args.claim, kind: args.kind,
         source: args.source, verifier: args.verifier, criteria: resolveCriteria(args.criteria),
         run: !!args.run, artifact: typeof args.artifact === 'string' ? args.artifact : undefined,
+        actor: typeof args.actor === 'string' ? args.actor : undefined,
         cwd,
       });
       publish('wicked.evidence.superseded', 'vault.supersede', {
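
Review note on the new no-vault branch in the `@@ -94,7 +111,22 @@` hunk: the branch order matters, and it may be easier to check as a pure function. This is a sketch, assuming `emit(body, ok)` prints JSON and terminates the process (the hunk calls it without `return`, which suggests this, but the diff does not show `emit` itself). `noVaultResponse` is a hypothetical name, and the `hint` field is left out for brevity.

```javascript
// Hypothetical pure mirror of the `if (!root)` branch: returns what would be
// emitted instead of writing to stdout and exiting.
function noVaultResponse(cmd, args, cwd) {
  // `list` in a repo with no vault has a truthful answer: no evidence.
  if (cmd === 'list') return { body: [], ok: true };

  const notFound = {
    error: `no .wicked-vault/ found in or above ${cwd}`,
    code: 'VAULT_NOT_FOUND',
  };

  // Gate consumers read `overall`, so cross-check reports an explicit FAIL.
  if (cmd === 'cross-check') {
    return {
      body: { scope: args.scope, phase: args.phase, overall: 'FAIL', ...notFound },
      ok: false,
    };
  }

  // Every other read command gets the structured not-found error.
  return { body: notFound, ok: false };
}

const listRes = noVaultResponse('list', {}, '/repo');
const gateRes = noVaultResponse('cross-check', { scope: 'S', phase: 'build' }, '/repo');
const verifyRes = noVaultResponse('verify', {}, '/repo');
console.log(JSON.stringify({ listRes, gateRes, verifyRes }, null, 2));
```

A consumer can then branch on `code === 'VAULT_NOT_FOUND'` rather than matching the error string, which changed wording in this release.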

package/docs/CONTRACTS.md CHANGED

@@ -97,6 +97,7 @@ artifact still verify?* and *is this scope+phase's contract satisfied?*
 | `supersedes` | string? | prior artifact id |
 | `contract_version` | string? | the contract hash in force at record |
 | `created_at` / `created_by` | ts / string | actor provenance |
+| `created_by_source` | enum | how `created_by` was resolved: `explicit` (`--actor`) · `env-actor` (`WICKED_VAULT_ACTOR`) · `env-user` (ambient `$USER`, weak) · `anonymous` (none, weak). Governs the G10/D4 independence check — a weak source makes `evaluator != created_by` untrustworthy, so `attest` fails closed unless explicitly overridden. |
 ### 3.2 Contract (exit-criteria — what evidence a scope+phase requires)
@@ -136,7 +137,9 @@ Its trust is G10 (attestation-chain), not G3 (re-derivation).
 | `artifact_id` | string | the evidence it judges |
 | `opinion` | enum | `pass` · `reject` · `unclear` — deliberately NOT named `verdict`/`status` |
 | `rationale` | string | the judge's reasoning (structured output, not free-form prose injection) |
-| `evaluator` | string | the judging identity — **MUST differ from the artifact's `created_by`** (G10/D4) |
+| `evaluator` | string | the judging identity — **MUST differ from the artifact's `created_by`** (G10/D4), compared trimmed + case-folded; **MUST be an explicit assertion** (an ambient `$USER` evaluator is refused) |
+| `evaluator_source` | enum | provenance of the evaluator identity (`explicit` · `env-actor`); ambient sources are refused at `attest` |
+| `worker_identity_weak` | bool | true if the judged artifact was recorded under a weak/ambient/legacy `created_by_source`; `attest` fails closed in that case unless `--allow-weak-worker-identity` is passed, which stamps this flag for audit |
 | `model` | string | provider/version, e.g. `gemini/2.5-pro` |
 | `prompt_hash` | string? | hash of the prompt template used |
 | `sampling` | object? | `{temperature, …}` — provenance for disagreement analysis |
@@ -203,8 +206,16 @@ reproducible.**
   (a) acceptance criteria are mandatory and bound into the envelope, frozen to
   the evidence (anti-downgrade); (b) the model runs only in the orchestration
   layer (`analyze-evidence` skill) — the CLI never calls a model, so G7 holds;
-  (c) `attest` is fail-closed if the frozen inputs no longer hash-match, and
-  rejects when `evaluator == created_by`; (d) judgments are non-reproducible by
+  (c) `attest` is fail-closed if the frozen inputs no longer hash-match, and the
+  independence check is hardened: it rejects when `evaluator == created_by`
+  (trimmed + case-folded), **requires an explicit (non-ambient) evaluator
+  identity**, and **fails closed when the worker identity is ambient/weak**
+  (`created_by_source` of `env-user`/`anonymous`/legacy) unless explicitly
+  overridden with `--allow-weak-worker-identity` (which records the weakness for
+  audit). This is a mechanical baseline + audit trail, **not** cryptographic
+  independence — a determined local actor can still assert two strings; real
+  independence is a separate evaluator process/credential + the committed git
+  trail; (d) judgments are non-reproducible by
   design — "never trust the cached verdict" here means *re-evaluate
   independently*, complementary to G3's *re-derive deterministically*. Threat
   model in §5a.
@@ -298,8 +309,12 @@ In-repo, committed (Decision D1) — **one file per artifact** (council Q2):
   audit-trail-grade tamper-evidence, not cryptographic immutability. G2's
   envelope hash detects payload/verdict mutation; it does not prevent a force-push
   that rewrites both. CI branch protection is the backstop.
-- **Large payloads:** `payload_max_bytes` guard; over-size payloads externalize
-  (hash recorded in the entry, blob stored out-of-tree) to keep the repo lean.
+- **Large payloads:** `payload_max_bytes` (default 1 MiB) is **enforced at
+  `record` time** — an over-size payload is rejected fail-closed (G5): no entry
+  and no blob are written, keeping the committed repo lean. Set it to `0` to
+  disable the guard. (Externalizing over-size blobs out-of-tree with the hash
+  recorded in the entry is future hardening, not yet implemented — today the
+  contract is "reject", not "externalize".)
 ---
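
Review note on the hardened independence rule (G10/D4) described in the CONTRACTS.md changes above: the three conditions can be read as one small decision function. This is a sketch only; `attest`'s real body is not part of this diff, so `checkIndependence`, its parameter names, the order of the checks, and the use of `toLowerCase()` for "case-folded" are all assumptions.

```javascript
// Sources the contract calls weak: ambient, not deliberately asserted.
const WEAK_IDENTITY_SOURCES = new Set(['env-user', 'anonymous']);

// "Trimmed + case-folded" comparison; toLowerCase() is an approximation.
const fold = (s) => String(s).trim().toLowerCase();

// Hypothetical decision function for the rule stated in the contract text.
function checkIndependence({
  createdBy, createdBySource, evaluator, evaluatorSource,
  allowWeakWorkerIdentity = false,
}) {
  // The evaluator must be an explicit assertion (explicit or env-actor).
  if (!evaluatorSource || WEAK_IDENTITY_SOURCES.has(evaluatorSource)) {
    return { ok: false, reason: 'evaluator identity must be explicit' };
  }
  // Self-grade: same identity after trimming and case-folding.
  if (fold(evaluator) === fold(createdBy)) {
    return { ok: false, reason: 'evaluator equals created_by' };
  }
  // Legacy entries carry no created_by_source and are treated as weak.
  const workerWeak = !createdBySource || WEAK_IDENTITY_SOURCES.has(createdBySource);
  if (workerWeak && !allowWeakWorkerIdentity) {
    return { ok: false, reason: 'weak worker identity' };
  }
  // The weakness is stamped on the attestation for audit.
  return { ok: true, worker_identity_weak: workerWeak };
}

const base = {
  createdBy: 'alice', createdBySource: 'explicit',
  evaluator: 'judge-1', evaluatorSource: 'explicit',
};
console.log(checkIndependence(base));
console.log(checkIndependence({ ...base, evaluator: ' Alice ' }));
console.log(checkIndependence({ ...base, createdBySource: 'env-user' }));
console.log(checkIndependence({
  ...base, createdBySource: 'env-user', allowWeakWorkerIdentity: true,
}));
```

As the contract text says, this is a mechanical baseline: nothing here stops one person from asserting two different strings.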

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "wicked-vault",
-  "version": "0.3.0",
+  "version": "0.4.0",
   "description": "Local-first evidence primitive — record evidence with its acceptance criteria, re-derive integrity deterministically, and record independent third-party judgments. Never trusts a stored verdict, never lets work self-grade its own \"done\".",
   "type": "module",
   "bin": {
@@ -8,7 +8,7 @@
     "wicked-vault-install": "install.mjs"
   },
   "engines": {
-    "node": ">=18"
+    "node": ">=20.0.0"
   },
   "license": "MIT",
   "author": "Mike Parcewski",
@@ -43,6 +43,7 @@
     "install.mjs"
   ],
   "scripts": {
+    "test": "bash test/cli-baseline.sh && bash test/attestation.sh && bash test/bus-integration.sh && bash test/verifiers.sh",
     "prove": "bash test/prove-on-memos.sh",
     "prove:verifiers": "bash test/verifiers.sh",
     "prove:attestation": "bash test/attestation.sh",

package/skills/wicked-vault/analyze-evidence/SKILL.md CHANGED

@@ -1,6 +1,6 @@
 ---
 name: wicked-vault:analyze-evidence
-description: Have an INDEPENDENT party analyze whether recorded evidence actually meets its frozen acceptance criteria, and record the judgment as a tamper-evident attestation. Use when judging free-form criteria a deterministic check can't express ("does this adequately address the failure modes"), or producing a third-party sign-off that defeats self-graded "done". Runs a model (non-reproducible, costs a call). For the cheap deterministic integrity check, use wicked-vault:verify-evidence instead.
+description: Have an INDEPENDENT party analyze whether recorded evidence actually meets its frozen acceptance criteria, and record the judgment as a hash-bound, append-only attestation (mutation-detecting; durable tamper-evidence is the committed git history). Use when judging free-form criteria a deterministic check can't express ("does this adequately address the failure modes"), or producing a third-party sign-off that defeats self-graded "done". Runs a model (non-reproducible, costs a call). For the cheap deterministic integrity check, use wicked-vault:verify-evidence instead.
 ---
 # wicked-vault:analyze-evidence
@@ -8,7 +8,9 @@ description: Have an INDEPENDENT party analyze whether recorded evidence actuall
 This is the vault's **independent referee** — the judgment tier (G10). The agent
 that produced the work cannot grade its own "done"; this flow has a *different*
 evaluator analyze the frozen evidence against its frozen acceptance criteria,
-then records that analysis as a tamper-evident, append-only `opinion_attestation`.
+then records that analysis as a hash-bound, append-only `opinion_attestation`
+(mutation-detecting; the durable tamper-evidence is the committed git history —
+see the README "Tamper detection" section).
 **Know what you're invoking.** This skill:
 - **runs a model** (an independent evaluator), so it costs a call and is
@@ -26,9 +28,23 @@ satisfy the acceptance criteria?"* and the criteria need judgment.
 The evaluator **MUST be distinct from the agent that produced the evidence.**
 Use a separate model CLI (e.g. `gemini`, `codex`) or an isolated subagent — not
-the same context that did the work. The CLI enforces the floor: `attest`
-**rejects** when `--evaluator` equals the artifact's `created_by`. Spoofable, so
-treat the rule as real, not as a checkbox.
+the same context that did the work. The CLI enforces a hardened floor:
+- `attest` **rejects** when `--evaluator` equals the artifact's `created_by`
+  (compared trimmed + case-folded, so `Alice`/`alice ` can't sidestep it).
+- For the independence claim to be meaningful, the **worker** should record with
+  an explicit `--actor "<id>"` (or set `WICKED_VAULT_ACTOR`). If the artifact was
+  recorded under only an *ambient* identity (bare `$USER` / anonymous), `attest`
+  **fails closed** — pass `--allow-weak-worker-identity` to proceed anyway, which
+  stamps `worker_identity_weak: true` on the attestation for audit.
+- The **evaluator** identity must itself be an explicit assertion; a bare ambient
+  evaluator id is refused (that is the silent self-grade).
+This is a stronger mechanical baseline + audit trail, **not** cryptographic
+independence — a determined human can still assert two distinct strings for the
+same person locally. Real independence comes from a genuinely separate evaluator
+process/credential and the committed, branch-protected git trail. Treat the rule
+as real, not as a checkbox.
 ## Orchestration

package/skills/wicked-vault/init/SKILL.md CHANGED

@@ -1,19 +1,26 @@
 ---
 name: wicked-vault:init
-description: Initialize a wicked-vault in a repository so claims can be backed by re-derivable evidence. Use when setting up the vault for the first time, when a vault command reports "no .wicked-vault/ found", or before the first record/cross-check in a project.
+description: Initialize a wicked-vault in a repository so claims can be backed by re-derivable evidence. OPTIONAL ceremony — record/declare-contract/supersede create the vault automatically; use init only to scaffold explicitly. A read command reporting code VAULT_NOT_FOUND means no evidence exists yet, which record (not init) fixes.
 ---
 # wicked-vault:init
 Set up the local-first **evidence primitive** in the current repository. The
-vault records claim-backing artifacts, hashes them tamper-evidently, and
-*re-derives* their verdict on demand — it never trusts a stored status.
+vault records claim-backing artifacts, hashes them so naive/accidental mutation
+is detected on re-derivation, and *re-derives* their verdict on demand — it
+never trusts a stored status. (The hash detects mutation; the committed,
+branch-protected git history is the durable tamper-evidence — see below and the
+README "Tamper detection" section.)
 ## When to use
-- First time using the vault in a repo.
-- A command failed with `no .wicked-vault/ found; run \`wicked-vault init\``.
-- Before the first `record` or `declare-contract` in a project.
+- You want the `.wicked-vault/` scaffold to exist before any evidence is
+  recorded (e.g. committing the directory layout, pre-provisioning CI).
+- Otherwise init is **optional**: `record`, `declare-contract`, and
+  `supersede` create the vault automatically on first use.
+- A read command failing with `code: VAULT_NOT_FOUND` means no evidence has
+  been recorded in that repo — recording evidence fixes it; bare init alone
+  does not produce evidence.
 ## Initialize
@@ -25,12 +32,17 @@ This creates `.wicked-vault/` at the repo root with:
 ```
 .wicked-vault/
-  vault.json     # schema_version, store_mode: in-repo, payload_max_bytes
+  vault.json     # schema_version, store_mode: in-repo, payload_max_bytes (enforced on record)
   entries/       # one JSON envelope per recorded artifact (append-only)
   payloads/      # content-addressed payload blobs (sha256-named, deduped)
   contracts/     # consumer-authored contracts, per scope/phase
+  attestations/  # append-only independent opinion log, per artifact
 ```
+`payload_max_bytes` (default 1 MiB) is enforced at `record` time: an over-size
+payload is rejected fail-closed (no entry, no blob written) so the committed
+audit chain stays lean. Set it to `0` to disable the guard.
 `record`, `declare-contract`, and `supersede` auto-create the vault if one
 isn't found, so explicit `init` is mostly for clarity. `verify`, `cross-check`,
 and `list` do **not** auto-create — they fail-closed when no vault exists.
@@ -38,13 +50,23 @@ and `list` do **not** auto-create — they fail-closed when no vault exists.
 The vault is discovered by walking up from the current directory, so any
 subdirectory of the repo can run vault commands.
-## Should this be committed?
-`store_mode` defaults to `in-repo` — the vault lives inside the working tree
-and git becomes the audit chain. Decide per project whether `.wicked-vault/` is
-committed (shared, auditable evidence) or git-ignored (local-only scratch). The
-repo's own `.gitignore` ignores `.wicked-vault/` by default; remove that line to
-commit evidence.
+## Commit the vault — it is the real tamper-evidence backstop
+`store_mode` defaults to `in-repo`, and **`.wicked-vault/` should be committed.**
+This is not incidental: the envelope hash only catches *naive/accidental*
+mutation — a determined local writer can recompute every hash after editing
+(the hashes are unkeyed SHA-256 over public fields). The protection that
+actually survives a determined editor is the **committed, branch-protected git
+history**: it is what makes after-the-fact tampering visible in a diff and
+preventable with branch protection. Audit-trail-grade, not cryptographic — see
+the README "Tamper detection" section and CONTRACTS.md §6.
+So **do not git-ignore the vault.** Commit `entries/`, `payloads/`,
+`contracts/`, and `attestations/`; only the derived `index.sqlite` query cache
+is ignored (it is rebuilt from the source of truth). If you have a deliberate
+reason to keep evidence local-only (throwaway scratch, never reviewed), that is
+an explicit opt-out — add `.wicked-vault/` to your `.gitignore` knowing you have
+forfeited the only durable tamper-evidence the vault offers.
 ## Next steps
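
Review note on the `payload_max_bytes` rule this file now documents (default 1 MiB, enforced at `record`, `0` disables): the decision reduces to one comparison. This is a sketch, assuming the limit counts bytes rather than characters (the vault.mjs hunk compares `blob.length`, and treating `blob` as a byte buffer is an assumption). `effectiveLimit` and `payloadAllowed` are hypothetical helper names.

```javascript
const DEFAULT_PAYLOAD_MAX_BYTES = 1048576; // 1 MiB, the documented default

// Fall back to the default when vault.json has no numeric limit.
function effectiveLimit(cfg) {
  return typeof cfg?.payload_max_bytes === 'number'
    ? cfg.payload_max_bytes
    : DEFAULT_PAYLOAD_MAX_BYTES;
}

// A limit <= 0 disables the guard; otherwise the payload must fit in bytes.
function payloadAllowed(cfg, payload) {
  const max = effectiveLimit(cfg);
  const size = Buffer.byteLength(payload); // UTF-8 bytes for a string
  return max <= 0 || size <= max;
}

console.log(payloadAllowed({ payload_max_bytes: 10 }, 'x'.repeat(10)));
console.log(payloadAllowed({ payload_max_bytes: 10 }, 'x'.repeat(11)));
console.log(payloadAllowed({ payload_max_bytes: 0 }, 'x'.repeat(5000)));
```

Counting bytes instead of characters matters for non-ASCII payloads: a string of three characters can exceed a four-byte limit once encoded as UTF-8.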

package/skills/wicked-vault/record-evidence/SKILL.md CHANGED

@@ -5,9 +5,12 @@ description: Record a claim-backing artifact in the vault and attach a determini
 # wicked-vault:record-evidence
-Capture an artifact, hash it tamper-evidently, and attach a verifier that can
-**re-derive** its verdict later. The vault does the capture itself — it never
-trusts a claimed status (G4).
+Capture an artifact, hash it (so naive/accidental mutation is detected on
+re-derivation), and attach a verifier that can **re-derive** its verdict later.
+The vault does the capture itself — it never trusts a claimed status (G4). The
+hash detects mutation; the committed git history is the durable tamper-evidence
+(see the README "Tamper detection" section — the envelope hash is unkeyed, so it
+is not a defense against a determined local writer).
 ## When to use

package/src/vault.mjs CHANGED

@@ -74,6 +74,45 @@ function loadContract(root, scope, phase) {
   return existsSync(p) ? JSON.parse(readFileSync(p, 'utf8')) : null;
 }
+// Resolve the acting identity for provenance + the G10/D4 independence check.
+// Precedence (strongest first):
+//   1. explicit value (CLI --actor / --evaluator)        -> source 'explicit'
+//   2. WICKED_VAULT_ACTOR env (harness-asserted identity) -> source 'env-actor'
+//   3. $USER env (the OS login — easily spoofed)          -> source 'env-user'
+//   4. nothing                                            -> source 'anonymous'
+// The *source* matters: 'explicit' and 'env-actor' are deliberate assertions;
+// 'env-user'/'anonymous' are weak and must not silently satisfy independence.
+function resolveActor(explicit) {
+  if (typeof explicit === 'string' && explicit.trim() !== '') {
+    return { id: explicit.trim(), source: 'explicit' };
+  }
+  const envActor = process.env.WICKED_VAULT_ACTOR;
+  if (typeof envActor === 'string' && envActor.trim() !== '') {
+    return { id: envActor.trim(), source: 'env-actor' };
+  }
+  const user = process.env.USER || process.env.USERNAME; // USERNAME = Windows
+  if (typeof user === 'string' && user.trim() !== '') {
+    return { id: user.trim(), source: 'env-user' };
+  }
+  return { id: 'unknown', source: 'anonymous' };
+}
+// Weak identity provenance — derived from ambient env, not deliberately asserted.
+const WEAK_IDENTITY_SOURCES = new Set(['env-user', 'anonymous']);
+// Read the vault config (vault.json). Falls back to defaults if the file is
+// absent or unreadable — record auto-creates the vault, so a config always
+// exists by the time a payload is captured, but be defensive.
+const DEFAULT_PAYLOAD_MAX_BYTES = 1048576;
+function loadConfig(root) {
+  const cfg = join(root, DIR, 'vault.json');
+  if (!existsSync(cfg)) return { payload_max_bytes: DEFAULT_PAYLOAD_MAX_BYTES };
+  try {
+    return JSON.parse(readFileSync(cfg, 'utf8'));
+  } catch {
+    return { payload_max_bytes: DEFAULT_PAYLOAD_MAX_BYTES };
+  }
+}
 export function record(root, opts) {
   const P = paths(root);
@@ -99,6 +138,18 @@ export function record(root, opts) {
     throw new Error('record requires --run or --artifact');
   }
+  // Enforce the configured payload ceiling (CONTRACTS.md §6). Oversize payloads
+  // are rejected here — before hashing or writing the blob — so a too-large
+  // capture can never bloat the committed audit chain. Fail-closed (G5): a
+  // rejected record produces NO entry and NO payload blob. `payload_max_bytes`
+  // <= 0 disables the guard (escape hatch for an explicitly unbounded vault).
+  const cfg = loadConfig(root);
+  const maxBytes = typeof cfg.payload_max_bytes === 'number'
+    ? cfg.payload_max_bytes : DEFAULT_PAYLOAD_MAX_BYTES;
+  if (maxBytes > 0 && blob.length > maxBytes) {
+    throw new Error(`payload exceeds payload_max_bytes: ${blob.length} > ${maxBytes} (set payload_max_bytes in .wicked-vault/vault.json to raise the limit, or 0 to disable)`);
+  }
   const payload_sha256 = sha256(blob);
   // G10/D1 — acceptance criteria are mandatory and frozen to the evidence.
@@ -147,6 +198,11 @@ export function record(root, opts) {
     ? runVerifier(verifier, payloadView(blob), { repoRoot: opts.cwd || root })
     : { status: 'n/a', detail: 'no deterministic verifier (judgment-tier claim)' };
+  // Actor provenance for the G10/D4 independence assertion. An explicit
+  // --actor (or WICKED_VAULT_ACTOR) is a deliberate identity claim; a bare
+  // $USER is ambient and weak. attest() uses created_by_source to refuse a
+  // silent self-grade where both worker and judge are unasserted (see attest).
+  const actor = resolveActor(opts.actor);
   const entry = {
     id, ...fields,
     acceptance_criteria, criteria_authored_by,
@@ -157,7 +213,8 @@ export function record(root, opts) {
     supersedes: null,
     contract_version: contract ? contract.contract_version : null,
     created_at: new Date().toISOString(),
-    created_by: process.env.USER || 'unknown',
+    created_by: actor.id,
+    created_by_source: actor.source,
   };
   writeFileSync(join(P.entries, `${id}.json`), JSON.stringify(entry, null, 2));
   return { id, envelope_hash, criteria_authored_by, status_at_record: sr.status, status_detail: sr.detail };
@@ -267,6 +324,7 @@ export function inspect(root, id) {
     acceptance_criteria: entry.acceptance_criteria,
     criteria_authored_by: entry.criteria_authored_by,
     created_by: entry.created_by,
+    created_by_source: entry.created_by_source || null,
     evidence: { text: view.text, json: view.json },
     hash_ok: v.hash_ok,
     integrity_status: v.status,
@@ -287,11 +345,47 @@ export function attest(root, id, opts) {
   const entry = JSON.parse(readFileSync(entryPath, 'utf8'));
   if (!OPINIONS.has(opts.opinion)) throw new Error(`attest: --opinion must be one of pass|reject|unclear (got '${opts.opinion}')`);
-  if (typeof opts.evaluator !== 'string' || !opts.evaluator) throw new Error('attest requires --evaluator');
+  if (typeof opts.evaluator !== 'string' || opts.evaluator.trim() === '') throw new Error('attest requires --evaluator');
+  // G10/D4 — mechanical independence, hardened. The judge must be a DELIBERATELY
+  // ASSERTED identity that differs from the worker. Three failure modes are
+  // closed here (all on top of the existing equality check):
+  //
+  //  (a) trivial-equality bypass — compare trimmed + case-folded so 'Alice',
+  //      'alice', and 'Alice ' can't sidestep the self-grade rejection.
+  //  (b) ambiguous worker identity — if the artifact was recorded under an
+  //      ambient identity ($USER / anonymous, created_by_source weak), the
+  //      independence claim cannot be trusted from a string compare alone.
+  //      We FAIL CLOSED unless the caller acknowledges it explicitly
+  //      (--allow-weak-worker-identity / opts.allowWeakWorkerIdentity), and we
+  //      stamp the weakness onto the attestation so audit can see it.
+  //  (c) ambiguous evaluator identity — the evaluator must be an explicit
+  //      assertion. A bare ambient identity for the JUDGE is refused: that is
+  //      exactly the silent self-grade the env var would otherwise enable.
+  //
+  // This is a stronger mechanical baseline + audit trail, NOT cryptographic
+  // independence. A determined human can still assert two distinct strings for
+  // the same person locally; real independence comes from a separate evaluator
+  // process/credential (see analyze-evidence skill) and the committed git trail.
+  const evaluator = resolveActor(opts.evaluator);
+  const norm = (s) => (typeof s === 'string' ? s.trim().toLowerCase() : '');
+  if (WEAK_IDENTITY_SOURCES.has(evaluator.source)) {
+    throw new Error(`attest refused (G10/D4): evaluator identity is ambient (${evaluator.source}='${evaluator.id}'), not a deliberate assertion. Pass an explicit --evaluator naming the independent judge (e.g. a model CLI or reviewer id) so a self-grade can't slip through silently.`);
+  }
+  if (entry.created_by && norm(evaluator.id) === norm(entry.created_by)) {
+    throw new Error(`attest refused (G10/D4): evaluator '${evaluator.id}' equals the artifact creator '${entry.created_by}' — a judgment must be independent of the worker`);
+  }
-  // G10/D4 — mechanical independence: the judge must differ from the worker.
-  if (entry.created_by && opts.evaluator === entry.created_by) {
-    throw new Error(`attest refused (G10/D4): evaluator '${opts.evaluator}' equals the artifact creator — a judgment must be independent of the worker`);
+  // The worker's identity provenance governs how much the independence claim is
+  // worth. A weak (ambient) worker identity means "different string" proves
+  // little. Fail closed unless the caller explicitly accepts that risk.
+  const workerSource = entry.created_by_source
+    || (entry.created_by && entry.created_by !== 'unknown' ? 'legacy' : 'anonymous');
+  const workerIdentityWeak = WEAK_IDENTITY_SOURCES.has(workerSource) || workerSource === 'legacy';
+  if (workerIdentityWeak && !opts.allowWeakWorkerIdentity) {
+    throw new Error(`attest refused (G10/D4): the artifact was recorded under a weak/ambient worker identity (created_by_source='${workerSource}'), so 'evaluator != created_by' is not a trustworthy independence signal. Re-record with an explicit --actor for the worker, or pass --allow-weak-worker-identity to attest anyway (the weakness is stamped on the attestation for audit).`);
   }
   // Fail-closed (G5/G10): never attest against a tampered artifact.
@@ -303,12 +397,17 @@ export function attest(root, id, opts) {
     artifact_id: id,
     opinion: opts.opinion,
     rationale: opts.rationale || '',
-    evaluator: opts.evaluator,
+    evaluator: evaluator.id,
+    evaluator_source: evaluator.source, // provenance of the judge identity (G10/D4)
     model: opts.model || null,
     prompt_hash: opts.prompt_hash || null,
     sampling: opts.sampling || null,
     evidence_sha256: entry.payload_sha256,
     criteria_sha256: entry.criteria_sha256,
+    // Audit flag: the worker identity this independence claim rests on was weak
+    // (ambient $USER / anonymous / legacy). The attestation was allowed via an
+    // explicit acknowledgement; a downstream gate may choose to discount it.
+    worker_identity_weak: workerIdentityWeak,
     created_at: new Date().toISOString(),
   };
   // tamper-evident binding over the attestation tuple (G2-style, G10)