agent-cli-runtime 0.1.0-alpha.1 → 0.1.0-alpha.3

@@ -1,25 +1,26 @@
1
1
  # Agent CLI Compatibility Matrix
2
2
 
3
- Status: P3-7 API / CLI Schema Freeze
4
- Last updated: 2026-06-22
3
+ Status: P6 offline compatibility gate integrated; `0.1.0-alpha.3` is the corrective pre-alpha release
4
+ Last updated: 2026-06-26
5
5
 
6
- This matrix records the CLI versions and behaviors that have been verified with the current runtime. Real agent CLIs change quickly; treat this file as dated compatibility evidence, not a permanent guarantee. P3-6 added a reviewable opt-in real smoke evidence path while keeping default release gates on detection/profile certification only. P3-7 freezes the API / CLI schema inventory and versioning policy in [docs/api-schema-contract.md](./api-schema-contract.md). It does not publish npm, configure trusted publishing, implement a daemon/API server, or add authenticated real agent runs to CI, dogfood, prepublish, or release-candidate gates. Raw CLI output, tokens, full prompts, auth env values, and private paths are not committed.
6
+ This matrix records the CLI versions and behaviors that have been verified with the current runtime. Real agent CLIs change quickly; treat this file as dated compatibility evidence, not a permanent guarantee. P3-6 added a reviewable opt-in real smoke evidence path while keeping default release gates on detection/profile certification only. P3-7 freezes the API / CLI schema inventory and versioning policy in [docs/api-schema-contract.md](./api-schema-contract.md). P6 integrates the offline real compatibility evidence verifier into prepublish and release-candidate evidence; it does not refresh real CLI evidence during normal release gates. P7-5 marks `0.1.0-alpha.3` as the corrective pre-alpha release after the published `0.1.0-alpha.2` tarball shipped stale package docs from the pre-publish state. npm registry metadata and GitHub Releases are the source of truth for available versions and dist-tags. Raw CLI output, tokens, full prompts, auth env values, private paths, local temporary paths, artifact ids, and artifact digests are not committed to packaged docs.
7
7
 
8
8
  ## Evidence policy
9
9
 
10
- Current status is P3-6 pre-alpha real CLI opt-in smoke evidence, which is intended to be the default interpretation for this matrix.
10
+ Current status is P6 offline compatibility gate integration carried into the alpha.3 corrective path. P6-1 keeps the P3-6 real-smoke safety boundary, adds repo-only summarized evidence under `.release-evidence/p6-1-real-cli-compatibility.json`, and audits every built-in adapter `needsVerification` item against current local CLI preflight and opt-in smoke results. P6-2 adds `npm run compat:real:evidence:verify` as an offline drift gate for that file; it does not launch real CLI runs. P6-3 wires that verifier into `prepublish:check` and `release:candidate` evidence without running `npm run compat:real:evidence` or passing `--allow-real-run`. P6 remote run and artifact details are recorded under `.release-evidence/`, outside npm package contents. P7-5 adds packaged-docs verification so local packed tarball docs and npm registry tarball docs are checked directly. The release gate confirms `compat:real:evidence:verify` emits `agent-cli-runtime.realCompatibilityEvidenceVerification.v1`, verifies `agent-cli-runtime.realCompatibilityEvidence.v1`, and keeps diagnostics summarized as count/codes only.
11
11
 
12
12
  - Current behavior is what is validated by `npm test` / typecheck / lint / build plus the current `npm pack`, package boundary, CLI JSON contract, and single-Node TypeScript consumer install-smoke checks.
13
13
  - CI behavior is matrixed for Node.js 20/22/24 except dogfood, which runs once on Node.js 22 to avoid duplicating the slower install smoke.
14
14
  - `npm test` uses Vitest's verbose reporter for contract coverage; slower installed-package gates and install smokes stay out of the Node.js matrix and run through single-Node release gates or explicit opt-in checks.
15
- - `npm run prepublish:check` is the local guard that combines typecheck, lint, tests, build, `daemon:verify`, `runtime:safety`, dogfood, production audit, package boundary checks, and pack dry-run.
16
- - `npm run release:candidate` creates local release-candidate artifacts including `gate-evidence.json`, and `npm run release:verify -- --dir <path>` validates local or downloaded artifacts with stable redacted JSON.
15
+ - `npm run prepublish:check` is the local guard that combines typecheck, lint, tests, build, `daemon:verify`, `runtime:safety`, offline real compatibility evidence verification, dogfood, production audit, package boundary checks, packaged-docs verification, and pack dry-run.
16
+ - `npm run release:candidate` creates local release-candidate artifacts including `gate-evidence.json`, and `npm run release:verify -- --dir <path>` validates local or downloaded artifacts with stable redacted JSON. `gate-evidence.json` records the compatibility verification gate as a redacted summary only: command, ok, verifier schema, verified evidence schema, and diagnostic count/codes.
17
17
  - `npm publish --dry-run --ignore-scripts --tag alpha` is a documented manual local dry-run check; it is not a remote CI gate.
18
18
  - `docs/release-publish-runbook.md` documents the future human alpha publish path, dist-tag verification, rollback/deprecation/unpublish boundary, 2FA, trusted publishing, provenance, and token strategy; no real publish is performed in P2-13.
19
19
  - `docs/daemon-ready-contract.md` documents embedding semantics for daemon/product shell callers without adding a hosted daemon surface.
20
20
  - `npm run dogfood` installs the tarball into a temporary consumer project, runs `tsc --noEmit`, then executes fake-CLI library run/goal/replay/diagnostics smoke through the installed package.
21
- - CI runs `daemon:verify`, `runtime:safety`, and dogfood once in a single Node.js 22 release-gates job; the Node.js 20/22/24 matrix does not repeat installed-package gates.
22
- - Remote GitHub Actions release-candidate evidence is run `27932628093` on workflow head SHA `8d7bc2a19c626caa1ad5223acbcd35df34aff18e`; historical run `27869580048` only proves commit `2f8832119b4ebdb8393077052560589a398ebf56`.
21
+ - `npm run published:adapters:verify` installs the already published npm package from the npm registry into a temporary consumer and verifies built-in Codex, Claude, and OpenCode adapter detection, argv shape, stdin prompt transport, parser behavior, redaction, and per-adapter failure isolation with fake CLIs only.
22
+ - CI runs `daemon:verify`, `runtime:safety`, and dogfood once in a single Node.js 22 release-gates job; the Node.js 20/22/24 matrix does not repeat installed-package gates. CI does not run `compat:real:evidence:verify` because that verifier depends on repo-only `.release-evidence/`, while dogfood remains an installed-package consumer gate.
23
+ - Remote GitHub Actions release-candidate evidence is commit-specific and recorded outside the package under `.release-evidence/`; historical runs only prove their own `headSha` and must not be reused for alpha.3 corrective release evidence.
23
24
  - Evidence modes are intentionally separate:
24
25
  - `fixtures`: offline parser contract fixtures; no real or fake CLI process is launched.
25
26
  - `fake`: temporary local fake CLIs through the real adapter argv/stdin/parser path; no network or real account is used.
@@ -29,9 +30,58 @@ Current status is P3-6 pre-alpha real CLI opt-in smoke evidence, which is intend
29
30
  - When using this file as runtime contract input, prioritize the `Status` section, explicit "Runtime notes" in each adapter, and the most recent command evidence.
30
31
  - For changed behavior, add a new evidence row at the top of the section rather than keeping the old row as authoritative.
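
Two rules from the evidence policy above lend themselves to a short sketch: the gate summary keeps diagnostics as count/codes only, and remote evidence only proves the commit it was recorded for. The following is a minimal illustration under stated assumptions, not the package's implementation. The helper names `summarizeDiagnostics` and `evidenceAppliesTo`, the diagnostic codes, and the `code` / `message` / `headSha` field shapes are hypothetical.

```javascript
// Hypothetical helpers; names and field shapes are assumptions, not the package API.

// Reduce full diagnostics to a count plus sorted unique codes. Messages are
// dropped so no raw text, paths, or token-looking values reach the summary.
function summarizeDiagnostics(diagnostics) {
  const codes = [...new Set(diagnostics.map((d) => d.code))].sort();
  return { count: diagnostics.length, codes };
}

// Evidence only proves the commit it was recorded for; anything else is historical.
function evidenceAppliesTo(evidence, currentHeadSha) {
  return typeof evidence.headSha === "string" && evidence.headSha === currentHeadSha;
}

const summary = summarizeDiagnostics([
  { code: "missing_git_head", message: "omitted from the summary" },
  { code: "raw_output_field", message: "omitted from the summary" },
  { code: "missing_git_head", message: "omitted from the summary" },
]);
console.log(JSON.stringify(summary));
```

Keeping only codes makes the summary stable across machines, which is presumably why the gate avoids free-form messages.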
31
32
 
33
+ ## P6-1 Current Real CLI Evidence
34
+
35
+ P6-1 local evidence was generated on 2026-06-23 with:
36
+
37
+ ```bash
38
+ npm run compat:real:evidence -- --allow-real-run \
39
+ --agent codex --expect-text "agent-runtime codex smoke ok" \
40
+ --agent opencode --expect-text "agent-runtime opencode smoke ok"
41
+ ```
42
+
43
+ The generated repo-only file is `.release-evidence/p6-1-real-cli-compatibility.json` with `schemaVersion: "agent-cli-runtime.realCompatibilityEvidence.v1"`. It stores redacted summaries only: no raw stdout/stderr, no prompt text, no full observed text tail, no private paths, no token values, no Bearer values, and no auth environment assignment values. It records `gitHeadSha` plus `gitDirty` and before/after dirty summaries so a dirty-tree evidence file is not mistaken for clean-commit evidence. The command runs safe preflight by default; authenticated real runs are added only when `--allow-real-run`, `--agent <id>`, and `--expect-text <text>` are all explicit.
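
The opt-in rule in the paragraph above (safe preflight unless all three flags are explicit) can be sketched as a small planning function. This is an illustration only, not the package's argument parser. `planRealRuns` is a hypothetical name, and the assumption that each `--agent` is paired with the `--expect-text` that follows it is inferred from the command shown earlier.

```javascript
// Hypothetical sketch of the opt-in rule; not the package's actual argument parser.
// Returns the authenticated real runs to perform. An empty array means safe preflight only.
function planRealRuns(argv) {
  let allowRealRun = false;
  let pendingAgent = null;
  const runs = [];
  for (let i = 0; i < argv.length; i += 1) {
    if (argv[i] === "--allow-real-run") {
      allowRealRun = true;
    } else if (argv[i] === "--agent" && i + 1 < argv.length) {
      pendingAgent = argv[i + 1];
      i += 1; // skip the consumed value
    } else if (argv[i] === "--expect-text" && i + 1 < argv.length) {
      // Expected text only counts when it follows an explicit agent id.
      if (pendingAgent !== null) {
        runs.push({ agent: pendingAgent, expectText: argv[i + 1] });
        pendingAgent = null;
      }
      i += 1; // skip the consumed value
    }
  }
  // Without the explicit opt-in flag nothing is launched, whatever else was passed.
  return allowRealRun ? runs : [];
}

console.log(planRealRuns(["--agent", "codex", "--expect-text", "smoke ok"]).length);
```

An agent named without expected text is dropped from the plan, so a run cannot be recorded as successful without something to match against.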
44
+
45
+ P6-2 verifies the same repo-only evidence with:
46
+
47
+ ```bash
48
+ npm run compat:real:evidence:verify
49
+ npm run compat:real:evidence:verify -- --self-test
50
+ ```
51
+
52
+ The verifier emits `schemaVersion: "agent-cli-runtime.realCompatibilityEvidenceVerification.v1"` and stable diagnostics. It rejects raw stdout/stderr fields, private paths, token/Bearer/auth env values, missing `gitHeadSha` / `gitDirty` / before-after dirty summaries, safe preflight skipped states claimed as success, authenticated success without expected-text and cwd-mutation evidence, missing Codex/Claude/OpenCode `needsVerification` audit items, and evidence that claims `.release-evidence/` belongs in the package boundary.
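
A few of the rejection rules listed above can be sketched as an offline, recursive check over the evidence object. This is a minimal sketch under stated assumptions rather than the verifier itself. The function name `verifyEvidence`, the forbidden key names, the Bearer pattern, and the diagnostic codes are all hypothetical; only `gitHeadSha` and `gitDirty` come from the text.

```javascript
// Hypothetical offline check; key names and diagnostic codes are assumptions.
const FORBIDDEN_KEYS = new Set(["stdout", "stderr", "rawStdout", "rawStderr"]);
const BEARER_PATTERN = /\bBearer\s+\S+/i;

// Walk the evidence object recursively and collect diagnostic codes.
// No process is launched; the input is already-parsed JSON.
function verifyEvidence(evidence) {
  const codes = new Set();
  if (typeof evidence.gitHeadSha !== "string") codes.add("missing_git_head_sha");
  if (typeof evidence.gitDirty !== "boolean") codes.add("missing_git_dirty");

  const visit = (value) => {
    if (typeof value === "string") {
      if (BEARER_PATTERN.test(value)) codes.add("bearer_value");
    } else if (Array.isArray(value)) {
      value.forEach(visit);
    } else if (value !== null && typeof value === "object") {
      for (const [key, child] of Object.entries(value)) {
        if (FORBIDDEN_KEYS.has(key)) codes.add("raw_output_field");
        visit(child);
      }
    }
  };
  visit(evidence);

  const sorted = [...codes].sort();
  return { ok: sorted.length === 0, diagnostics: { count: sorted.length, codes: sorted } };
}

const result = verifyEvidence({ gitHeadSha: "0".repeat(40), gitDirty: false, agents: [] });
console.log(result.ok);
```

The result reports codes but never echoes the offending value, in keeping with the count/codes-only summaries described for the release gate.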
53
+
54
+ P6-3 does not regenerate this evidence. It only requires the existing repo-only evidence to pass the offline verifier before local prepublish and while creating release-candidate artifacts. `dogfood` does not run the verifier, so installed-package consumers never depend on `.release-evidence/`.
55
+
56
+ Current safe preflight command results:
57
+
58
+ | Command | Result | Notes |
59
+ | --- | --- | --- |
60
+ | `node ./dist/cli/main.js agents --json` | parsed | Codex, Claude Code, and OpenCode detected; paths redacted. |
61
+ | `node ./dist/cli/main.js doctor --json` | `ok: true` | Overall local adapter catalog is usable. |
62
+ | `node ./dist/cli/main.js conformance --mode real --agent all --json` | `ok: true` | Safe detection/profile certification only; no authenticated real run launched. |
63
+ | `node ./dist/cli/main.js smoke --mode real --agent codex --json` | `real_run_skipped` | Safe preflight only; `skippedReason: "real_run_not_allowed"`. |
64
+ | `node ./dist/cli/main.js smoke --mode real --agent claude --json` | `auth_missing` | Local Claude Code auth missing; no run launched. |
65
+ | `node ./dist/cli/main.js smoke --mode real --agent opencode --json` | `real_run_skipped` | Safe preflight only; `skippedReason: "real_run_not_allowed"`. |
66
+
67
+ Current adapter evidence:
68
+
69
+ | Adapter | CLI version | Auth/model source | Safe runClassification | Authenticated smoke | Current `needsVerification` decision |
70
+ | --- | --- | --- | --- | --- | --- |
71
+ | Codex CLI | `codex-cli 0.142.0` | auth `unknown`; models `live` | `real_run_skipped` | `success`; expected text matched; cwd not mutated | Keep `session` and `authProbe` unpromoted. The successful run verifies the current prompt/stdin/parser/cwd-mutation path, not a stable session/resume or non-mutating auth probe. |
72
+ | Claude Code | `2.1.178 (Claude Code)` | auth `missing`; models `fallback` | `auth_missing` | not attempted | Keep `session.id` and `reasoning` unpromoted. Local auth is missing, and provider-dependent reasoning behavior is still not a stable mapped flag. |
73
+ | OpenCode | `1.15.6` | auth `unknown`; models `live` | `real_run_skipped` | `success`; expected text matched; cwd not mutated | Keep `extraAllowedDirs`, `session`, and `permissionPolicy.read-only` unpromoted. The successful run verifies stdin/parser/cwd-mutation behavior, not explicit extra-dir/session/read-only/workspace-write flags. |
74
+
75
+ Drift analysis:
76
+
77
+ - Codex version changed from the previously documented `codex-cli 0.142.0-alpha.6` to `codex-cli 0.142.0`; the current model probe still returns live models and produces no unsupported flag diagnostic.
78
+ - Claude Code remains executable at `2.1.178`, but auth is still `missing`; `auth_missing` is evidence, not success.
79
+ - OpenCode remains `1.15.6`; live model probe still works and no unsupported flag diagnostic appeared.
80
+ - No `unsupported_flag` or `needs_verification` diagnostic was produced by current safe preflight. Existing `needsVerification` entries remain because they are unproven capabilities, not because current CLI preflight failed.
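
The drift notes above combine two separate checks: whether the CLI version changed, and whether a result counts as success. A small sketch of that comparison follows, assuming a hypothetical snapshot shape (`adapter`, `version`, `runClassification`) and a hypothetical helper name `describeDrift`. The version strings and classifications are the ones recorded in this section.

```javascript
// Hypothetical drift comparison; the snapshot shape is an assumption for illustration.
function describeDrift(previous, current) {
  return {
    adapter: current.adapter,
    versionChanged: previous.version !== current.version,
    // Skipped or auth-blocked preflight is recorded evidence, never success.
    countsAsSuccess: current.runClassification === "success",
  };
}

const codex = describeDrift(
  { adapter: "codex", version: "codex-cli 0.142.0-alpha.6" },
  { adapter: "codex", version: "codex-cli 0.142.0", runClassification: "real_run_skipped" }
);
const claude = describeDrift(
  { adapter: "claude", version: "2.1.178 (Claude Code)" },
  { adapter: "claude", version: "2.1.178 (Claude Code)", runClassification: "auth_missing" }
);
console.log(codex.versionChanged, claude.countsAsSuccess);
```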
81
+
32
82
  ## P3-6 Real CLI Opt-In Smoke Evidence
33
83
 
34
- P3-6 changes the evidence path, not the adapter invocation profiles:
84
+ P3-6 is historical after P6-1. It changed the evidence path, not the adapter invocation profiles:
35
85
 
36
86
  - `smoke --mode real --agent <id> --json` performs detection/profile certification and reports `runClassification: "real_run_skipped"` unless `--allow-real-run` is explicit.
37
87
  - `smoke --mode real --agent <id> --allow-real-run --expect-text <safe_text> --json` is the recommended authenticated real-run evidence command.
@@ -60,8 +110,8 @@ P2-12 remote audit evidence on 2026-06-20:
60
110
  - Run status/conclusion: `completed` / `success`.
61
111
  - Run created/updated: `2026-06-20T11:19:33Z` / `2026-06-20T11:20:40Z`.
62
112
  - Uploaded artifacts: `agent-cli-runtime-tarball`, `agent-cli-runtime-pack-metadata`, `agent-cli-runtime-package-files`, `agent-cli-runtime-release-verification`.
63
- - Downloaded artifact re-verification: `npm run release:verify -- --dir /tmp/agent-runtime-p2-12-remote-5P5MSc/normalized`.
64
- - Verification result: `schemaVersion: "agent-cli-runtime.releaseVerification.v1"`, `ok: true`, package file count `145`, tarball `agent-cli-runtime-0.1.0-alpha.0.tgz`, tarball size `187378` bytes, tarball sha256 `3701bd6355651bbc200d5c017a9b01c3dd7136140b64dee0781e6eb601a7a657`, empty diagnostics.
113
+ - Downloaded artifact re-verification: `npm run release:verify -- --dir <normalized-artifact-dir>`.
114
+ - Verification result: `schemaVersion: "agent-cli-runtime.releaseVerification.v1"`, `ok: true`, package file count recorded in package-out evidence, and empty diagnostics.
65
115
 
66
116
  The GitHub download layout used one directory per artifact name; the downloaded files were copied into a temporary normalized review directory before local verification.
67
117
 
@@ -82,7 +132,7 @@ P3-5 does not change adapter invocation compatibility. It closes workflow-head r
82
132
 
83
133
  - `.github/workflows/release-candidate.yml` run `27932628093` completed successfully on workflow head SHA `8d7bc2a19c626caa1ad5223acbcd35df34aff18e`.
84
134
  - The run uploaded `agent-cli-runtime-tarball`, `agent-cli-runtime-pack-metadata`, `agent-cli-runtime-package-files`, `agent-cli-runtime-gate-evidence`, and `agent-cli-runtime-release-verification`.
85
- - Downloaded artifacts were normalized into `/tmp/agent-runtime-p3-5-remote-7rkBqm/normalized` and passed `npm run release:verify -- --dir /tmp/agent-runtime-p3-5-remote-7rkBqm/normalized`.
135
+ - Downloaded artifacts were normalized into a local review directory and passed `npm run release:verify -- --dir <normalized-artifact-dir>`.
86
136
  - Verification result: `schemaVersion: "agent-cli-runtime.releaseVerification.v1"`, `ok: true`, package file count `147`, empty diagnostics, and gate evidence for `daemon:verify` plus `runtime:safety` with `packageSource: "installed-tarball"`.
87
137
  - Local real conformance after the remote run still did not launch authenticated real agent runs: Codex and OpenCode reported `real_run_skipped`; Claude Code reported `auth_missing`.
88
138
 
@@ -188,9 +238,11 @@ Historical local real-CLI detection/preflight evidence from `node ./dist/cli/mai
188
238
 
189
239
  | Adapter | CLI path | CLI version tested | Detection | Run smoke | Goal smoke | Notes |
190
240
  | --- | --- | --- | --- | --- | --- | --- |
191
- | Codex CLI | redacted local app path | `codex-cli 0.142.0-alpha.6` | Pass | Skipped in P3-1 default real conformance; prior opt-in Codex smoke evidence remains historical. | Not run in P3-1 | Uses `codex exec --json --skip-git-repo-check` with stdin prompt and `-C <cwd>`. Live model probe passed. P3-1 reports `real_run_skipped` without `--allow-real-run`; session and auth probe remain `needsVerification`. |
192
- | Claude Code | redacted local app path | `2.1.178 (Claude Code)` | Pass with `auth_missing` diagnostic | Blocked by local auth | Not run in P2-9 | `claude auth status` returned auth missing in the local P2-9 certification. Conformance skips before launching Claude. |
193
- | OpenCode | redacted local app path | `1.15.6` | Pass | Skipped in P3-1 default real conformance; prior opt-in OpenCode smoke evidence remains historical. | Not run in P3-1 | P3-1 reports `real_run_skipped` without `--allow-real-run` and live model source is available. Explicit read-only/workspace-write flags, extra dirs, and session remain unverified. |
241
+ | Codex CLI | redacted local app path | `codex-cli 0.142.0` | Pass | P6-1 opt-in smoke passed; safe preflight reports `real_run_skipped` without `--allow-real-run`. | Not run in P6-1 | Uses `codex exec --json --skip-git-repo-check` with stdin prompt and `-C <cwd>`. Live model probe passed. Session and auth probe remain `needsVerification`. |
242
+ | Claude Code | redacted local app path | `2.1.178 (Claude Code)` | Pass with `auth_missing` diagnostic | Blocked by local auth | Not run in P6-1 | `claude auth status` returned auth missing in the local P6-1 certification. Conformance skips before launching Claude. |
243
+ | OpenCode | redacted local app path | `1.15.6` | Pass | P6-1 opt-in smoke passed; safe preflight reports `real_run_skipped` without `--allow-real-run`. | Not run in P6-1 | Live model source is available. Explicit read-only/workspace-write flags, extra dirs, and session remain unverified. |
244
+
245
+ P5-2 published adapter evidence uses fake CLIs only. It verifies that the published package's built-in adapter invocation profiles still match the documented shapes and that prompts stay on stdin, but it is not authenticated real CLI compatibility success evidence.
194
246
 
195
247
  ## Verified Invocation Shapes
196
248
 
@@ -211,8 +263,8 @@ Runtime notes:
211
263
  - auth probe: no stable non-mutating auth probe is enabled; auth status is `unknown`
212
264
  - model probe: `codex debug models`; parser keeps only model `slug`/`display_name` and ignores hidden models
213
265
  - parser note: transient `Reconnecting... n/5` structured error frames are normalized to `status: reconnecting`; they are not fatal if the run later emits text/usage and exits `0`
214
- - 2026-06-20 P2-9 local certification: executable/version/model preflight passed for `codex-cli 0.142.0-alpha.1`; no real run was launched because `--allow-real-run` was not supplied.
215
- - Historical opt-in Codex real smoke evidence remains useful but is not the latest local status.
266
+ - 2026-06-23 P6-1 local certification: executable/version/model preflight passed for `codex-cli 0.142.0`; safe preflight reports `real_run_skipped` without `--allow-real-run`.
267
+ - 2026-06-23 P6-1 opt-in Codex real smoke passed with expected text matched and no isolated-cwd mutation. This verifies the current prompt/stdin/parser/cwd-mutation path only; session/resume and auth probe remain `needsVerification`.
216
268
 
217
269
  ### Claude Code
218
270
 
@@ -229,7 +281,7 @@ Runtime notes:
229
281
  - capability probe: `claude -p --help`; current local output includes the tracked capability flags and produced no capability diagnostics
230
282
  - model probe: no live model probe; fallback aliases are `default`, `sonnet`, `opus`, `haiku`
231
283
  - `--resume` is the verified resume path in fixtures; `--session-id` is represented in the profile as `needsVerification` and is not emitted by `buildArgs()`
232
- - 2026-06-20 P2-9 local certification: executable/version/auth preflight passed for `2.1.178 (Claude Code)`, but auth was `missing`; no real run was launched.
284
+ - 2026-06-23 P6-1 local certification: executable/version/auth preflight passed for `2.1.178 (Claude Code)`, but auth was `missing`; no authenticated real run was launched.
233
285
  - DeepSeek or another Anthropic-compatible provider can be supplied through environment variables. Keep this as names and placeholders only; do not commit real token values, account-specific URLs, or private model aliases:
234
286
 
235
287
  ```bash
@@ -259,8 +311,8 @@ Runtime notes:
259
311
  - model probe: `opencode models`
260
312
  - read-only and workspace-write are left to OpenCode defaults until stable permission flags are verified
261
313
  - extra dirs and session/resume are not mapped; profile marks them as `needsVerification`
262
- - 2026-06-20 P2-9 local certification: executable/version/model preflight passed for `opencode` 1.15.6; no real run was launched because `--allow-real-run` was not supplied.
263
- - Historical opt-in OpenCode real smoke evidence verifies stdin prompt support for local `opencode` 1.15.6. Keep prompt out of argv; do not switch to positional argv prompt. The runtime requested read-only behavior, but OpenCode explicit read-only/workspace-write flags remain unverified.
314
+ - 2026-06-23 P6-1 local certification: executable/version/model preflight passed for `opencode` 1.15.6; safe preflight reports `real_run_skipped` without `--allow-real-run`.
315
+ - 2026-06-23 P6-1 opt-in OpenCode real smoke passed with expected text matched and no isolated-cwd mutation. Keep prompt out of argv; do not switch to positional argv prompt. The runtime requested read-only behavior, but OpenCode explicit read-only/workspace-write flags, extra dirs, and session remain unverified.
264
316
 
265
317
  ## Smoke Commands
266
318
 
@@ -329,7 +381,7 @@ All conformance and real-smoke output is redacted recursively. Do not commit rea
329
381
  Equivalent lower-level run command:
330
382
 
331
383
  ```bash
332
- tmp="$(mktemp -d /tmp/agent-runtime-run-smoke.XXXXXX)"
384
+ tmp="$(mktemp -d)"
333
385
  node ./dist/cli/main.js run \
334
386
  --agent codex \
335
387
  --cwd "$tmp" \
@@ -355,7 +407,7 @@ node ./dist/cli/main.js smoke \
355
407
  Equivalent OpenCode smoke:
356
408
 
357
409
  ```bash
358
- tmp="$(mktemp -d /tmp/agent-runtime-run-smoke.XXXXXX)"
410
+ tmp="$(mktemp -d)"
359
411
  node ./dist/cli/main.js run \
360
412
  --agent opencode \
361
413
  --cwd "$tmp" \
@@ -369,7 +421,7 @@ node ./dist/cli/main.js run \
369
421
  Run smoke in an isolated temp directory:
370
422
 
371
423
  ```bash
372
- tmp="$(mktemp -d /tmp/agent-runtime-smoke.XXXXXX)"
424
+ tmp="$(mktemp -d)"
373
425
  node ./dist/cli/main.js run \
374
426
  --agent codex \
375
427
  --cwd "$tmp" \
@@ -381,7 +433,7 @@ node ./dist/cli/main.js run \
381
433
  Goal smoke:
382
434
 
383
435
  ```bash
384
- tmp="$(mktemp -d /tmp/agent-runtime-goal.XXXXXX)"
436
+ tmp="$(mktemp -d)"
385
437
  node ./dist/cli/main.js goal \
386
438
  --agent codex \
387
439
  --cwd "$tmp" \
@@ -775,18 +827,18 @@ Release-preflight workflow:
775
827
 
776
828
  ```bash
777
829
  repo_root="${GITHUB_WORKSPACE:-$(pwd -P)}"
778
- tmp_dir="$(mktemp -d /tmp/agent-runtime-release-XXXXXX)"
830
+ tmp_dir="$(mktemp -d)"
779
831
  pushd "$tmp_dir"
780
832
  pack_info="$(cd "$repo_root" && npm pack --json --ignore-scripts --pack-destination "$tmp_dir")"
781
833
  package_file="$(printf '%s' "$pack_info" | node -e "const data = JSON.parse(require('node:fs').readFileSync(0, 'utf8')); process.stdout.write(data[0].filename);")"
782
834
  npm init -y >/dev/null
783
- npm install "$tmp_dir/$package_file" --no-save --ignore-scripts --no-audit --no-fund >/tmp/agent-runtime-release-smoke-install.log
835
+ npm install "$tmp_dir/$package_file" --no-save --ignore-scripts --no-audit --no-fund >"$tmp_dir/install.log"
784
836
  node -e "(async()=>{ const m = await import('agent-cli-runtime'); if (typeof m.createAgentRuntime !== 'function') process.exit(1); console.log(typeof m.createAgentRuntime); })()"
785
- node ./node_modules/.bin/agent-runtime agents --json > /tmp/agent-runtime-release-smoke-agents.json
786
- node ./node_modules/.bin/agent-runtime doctor --json > /tmp/agent-runtime-release-smoke-doctor.json
787
- node ./node_modules/.bin/agent-runtime smoke --mode fixtures --json > /tmp/agent-runtime-release-smoke-fixtures.json
837
+ node ./node_modules/.bin/agent-runtime agents --json >"$tmp_dir/agents.json"
838
+ node ./node_modules/.bin/agent-runtime doctor --json >"$tmp_dir/doctor.json"
839
+ node ./node_modules/.bin/agent-runtime smoke --mode fixtures --json >"$tmp_dir/fixtures-smoke.json"
788
840
  popd
789
- node -e "const fs = require('node:fs'); JSON.parse(fs.readFileSync('/tmp/agent-runtime-release-smoke-agents.json','utf8')); JSON.parse(fs.readFileSync('/tmp/agent-runtime-release-smoke-doctor.json','utf8')); JSON.parse(fs.readFileSync('/tmp/agent-runtime-release-smoke-fixtures.json','utf8'));"
841
+ node -e "const fs = require('node:fs'); for (const file of process.argv.slice(1)) JSON.parse(fs.readFileSync(file, 'utf8'));" "$tmp_dir/agents.json" "$tmp_dir/doctor.json" "$tmp_dir/fixtures-smoke.json"
790
842
  ```
791
843
 
792
844
  Release-candidate notes:
@@ -82,6 +82,54 @@ The gate emits `schemaVersion: "agent-runtime.runtimeSafety.v1"` JSON with redac
82
82
 
83
83
  P3-3 is intentionally still local-kernel hardening. It does not implement HTTP, IPC, RPC, auth, users, tenants, queue admission, remote workers, Docker/SSH, telemetry, database, WAL, compaction, UI/artifact layers, or OpenDesign daemon parity.
84
84
 
85
+ ## P5-1 Published Package Consumer Gate
86
+
87
+ `npm run published:daemon:verify` is the P5-1 published-package daemon consumer harness. It installs `agent-cli-runtime@0.1.0-alpha.1` from the npm registry into a temporary consumer project, imports only the package root, and runs a long-lived daemon-style process with fake Codex, Claude, and OpenCode binaries on `PATH`.
88
+
89
+ The gate exercises the published package path, not the local checkout, local `dist/`, or a freshly packed tarball:
90
+
91
+ 1. create a runtime with an isolated `storageDir`;
92
+ 2. detect the fake Codex adapter;
93
+ 3. run a successful fake task;
94
+ 4. create a successful fake goal through the planner/task path;
95
+ 5. cancel a running run;
96
+ 6. time out a running run;
97
+ 7. replay run and goal events;
98
+ 8. run read-only store inspection while a writer runtime is active;
99
+ 9. verify a second writer for the same `storageDir` is refused without mutating live records;
100
+ 10. shut down and reopen terminal records;
101
+ 11. recover stale active run/goal records after simulated crash ownership.
102
+
103
+ The gate emits `schemaVersion: "agent-runtime.publishedDaemonConsumer.v1"` JSON with `packageSource: "npm-registry"`, `version`, `checks`, `diagnostics`, and `noAuthenticatedRealRun`. The output is a redacted summary only: it must not include temp paths, private user paths, tokens, raw secrets, or full prompts. It uses fake CLIs only and does not launch authenticated real Codex, Claude Code, or OpenCode runs.
104
+
105
+ P5-1 is evidence for embeddability of the already published npm package. It does not publish a new npm version, expand package-root value exports, add a daemon/API server, database, WAL, remote worker, queue service, UI, telemetry, or hosted control plane.
106
+
107
+ ## P5-2 Published Package Built-In Adapter Gate
108
+
109
+ `npm run published:adapters:verify` is the P5-2 published-package built-in adapter compatibility gate. It installs `agent-cli-runtime@0.1.0-alpha.1` from the npm registry into a temporary consumer project, creates fake Codex, Claude, and OpenCode executables, and exercises the already published package's built-in adapter definitions through package-root `createAgentRuntime` and the installed `agent-runtime` CLI.
110
+
111
+ The gate verifies the published package path, not the local checkout, local `dist/`, or a freshly packed tarball:
112
+
113
+ 1. `agent-runtime agents --json` detects the three fake built-in adapters;
114
+ 2. `agent-runtime conformance --mode fake --json` emits `agent-runtime.conformance.v1`;
115
+ 3. each built-in adapter runs through its real argv builder, stdin prompt transport, and parser path;
116
+ 4. long prompts stay out of argv for Codex, Claude, and OpenCode;
117
+ 5. Claude stdin uses stream-json JSONL, while Codex and OpenCode use stdin text;
118
+ 6. Claude stream-json partial/unknown events and non-JSON noise do not fail parsing;
119
+ 7. Codex and OpenCode non-JSON noise is ignored by parsers;
120
+ 8. a forced single-adapter failure still leaves summaries for the other adapters;
121
+ 9. token-looking diagnostics, Bearer values, auth env assignments, full prompts, temp paths, private paths, and raw stdout/stderr are excluded from the emitted summary.
122
+
123
+ The gate emits `schemaVersion: "agent-runtime.publishedAdapters.v1"` with `packageSource: "npm-registry"`, `agents`, `checks`, `diagnostics`, and `noAuthenticatedRealRun`. Stable classification fields include `checks.failureIsolation` and `agents[].terminalStatus`. This is fake-CLI compatibility evidence for the published package's built-in adapter contract. It is not authenticated real Codex, Claude Code, or OpenCode run evidence, and it does not publish to npm, expand package-root value exports, or add a daemon/API server, database, WAL, remote worker, queue service, UI, telemetry, or hosted control plane.
124
+
125
+ ## P5-3 Published Package Remote Verification Evidence
126
+
127
+ `npm run published:verify -- --out-dir published-verification` is the repo-only P5-3 aggregation gate for post-publish evidence. It runs the published package smoke, the P5-1 daemon consumer gate, the P5-2 adapter gate, post-alpha npm/GitHub Release verification, and an npm registry metadata lookup for the current `package.json` version.
128
+
129
+ The summary is written as `published-verification/published-verification.json` with `schemaVersion: "agent-cli-runtime.publishedVerification.v1"`, `packageSource: "npm-registry"`, gate summaries, registry metadata, `gitSha`, `checkedAt`, and explicit `noAuthenticatedRealRun`, `noNpmPublish`, and `noNpmToken` flags. It records gate commands, pass/fail state, output schema versions, durations, selected summary fields, and redacted diagnostics only. It does not store raw stdout/stderr, temp paths, private paths, full prompts, token values, Bearer values, or auth environment assignments.
130
+
131
+ `.github/workflows/published-package-verification.yml` is manual `workflow_dispatch` only. It runs on Node.js 22, executes `npm ci`, creates and verifies the published verification evidence, and uploads `agent-cli-runtime-published-verification` with the same 14-day retention window as release-candidate artifacts. It is evidence for the already published package on a clean runner; it is not a publish workflow and does not configure registry credentials, provenance, authenticated real agent runs, or hosted daemon behavior.
132
+
85
133
  ## Writer Lease And Store Ownership
86
134
 
87
135
  The local lease is a best-effort same-machine writer guard. It is not a distributed lock, daemon consensus protocol, WAL, database transaction, or multi-host scheduler.
@@ -176,6 +224,9 @@ Stable daemon-facing schemas:
176
224
  | Store health | `agent-runtime.storeHealth.v1` | `schemaVersion`, `ok`, `storageDir`, `checkedAt`, `lock`, `totals`, `corruptManifests`, `corruptEventLogs`, `partialTails`, `activeRecords`, `activeInterrupted`, `warnings`, `storageDiagnostics`, `diagnostics` |
177
225
  | Store repair | `agent-runtime.storeRepair.v1` | `schemaVersion`, `storageDir`, `checkedAt`, `dryRun`, `applied`, `ok`, optional `blockedReason`, `actions`, `diagnostics` |
178
226
  | CLI JSON error | `agent-runtime.cliError.v1` | `schemaVersion`, `ok`, `error` |
227
+ | Published daemon consumer | `agent-runtime.publishedDaemonConsumer.v1` | `schemaVersion`, `ok`, `packageName`, `version`, `packageSource`, `checks`, `diagnostics`, `noAuthenticatedRealRun` |
228
+ | Published built-in adapter gate | `agent-runtime.publishedAdapters.v1` | `schemaVersion`, `ok`, `packageName`, `version`, `packageSource`, `checks`, `agents`, `diagnostics`, `noAuthenticatedRealRun` |
229
+ | Published verification evidence | `agent-cli-runtime.publishedVerification.v1` | `schemaVersion`, `ok`, `packageName`, `version`, `gitSha`, `checkedAt`, `packageSource`, `gates`, `registry`, `diagnostics`, `noAuthenticatedRealRun`, `noNpmPublish`, `noNpmToken` |
179
230
  | Release verification | `agent-cli-runtime.releaseVerification.v1` | `schemaVersion`, `ok`, `checkedFiles`, `tarball`, `diagnostics`, `artifactNames`, `gateEvidence`, `packageName`, `version` |
180
231
  | Release gate evidence | `agent-cli-runtime.releaseGateEvidence.v1` | `schemaVersion`, `generatedAt`, `gates`, `noAuthenticatedRealRun`, `noNpmPublish`, `noNpmToken` |
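
The identifiers in the table appear to follow a `<namespace>.<name>.v<major>` pattern. The sketch below splits such an identifier under the assumption that this pattern holds for every entry; `parseSchemaVersion` and `ParsedSchemaVersion` are hypothetical names and not part of the package.

```typescript
interface ParsedSchemaVersion {
  namespace: string; // e.g. "agent-runtime" or "agent-cli-runtime"
  name: string; // e.g. "storeHealth"
  major: number; // e.g. 1
}

// Assumed pattern: "<namespace>.<name>.v<major>", inferred from the table only.
function parseSchemaVersion(id: string): ParsedSchemaVersion | null {
  const match = /^([a-z][a-z-]*)\.([A-Za-z][A-Za-z0-9]*)\.v(\d+)$/.exec(id);
  if (match === null) {
    return null; // not in the assumed shape
  }
  return { namespace: match[1], name: match[2], major: Number(match[3]) };
}
```

A consumer could use the parsed `major` value to reject evidence written under a schema generation it does not understand, instead of comparing whole strings.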
181
232
 
@@ -1,11 +1,13 @@
1
1
  # Production Readiness
2
2
 
3
- Status: 0.1.0-alpha.1 corrective alpha candidate; human publish gate required
4
- Last updated: 2026-06-23
3
+ Status: `0.1.0-alpha.3` corrective pre-alpha release
4
+ Last updated: 2026-06-26
5
5
 
6
- This project is still **pre-alpha / developer preview**. P2-11 through P2-13 established release-candidate artifact verification, remote evidence closure, and alpha publish-readiness docs. Version `0.1.0-alpha.0` has since been published to npm and GitHub pre-release `v0.1.0-alpha.0`, but that immutable tarball contains stale pre-publish status text; `0.1.0-alpha.1` is the corrective alpha candidate. P3-1 froze daemon-ready execution-kernel contracts for embedders in [docs/daemon-ready-contract.md](./daemon-ready-contract.md); P3-2 added an executable daemon embedding stability gate for the installed-package fake-CLI path; P3-3 added an installed-package long-lived runtime resource safety gate; P3-4 aligned CI and release-candidate artifacts so those gates are represented in remote release artifacts; P3-5 verified its workflow head SHA through a successful remote release-candidate workflow and downloaded artifact re-verification; P3-6 added a redacted opt-in real smoke evidence format for Codex, Claude Code, and OpenCode while keeping default release gates on detection/profile certification only; P3-7 freezes the API / CLI schema inventory and versioning policy in [docs/api-schema-contract.md](./api-schema-contract.md); P3-8 refreshed remote release-candidate evidence for target SHA `eb8de0f9b1edfa3f94c35a50b31005c5d3c105d4`; P3-9 locked evidence-target release-candidate evidence for target SHA `65fac505ca3eb830a06d8656068cf4ed5f6dd46a`.
6
+ This project is still **pre-alpha / developer preview**. Version `0.1.0-alpha.3` is the corrective pre-alpha release for package consumers.
7
7
 
8
- P3-11 keeps volatile current-head release-candidate evidence out of the npm package. Fresh run ids, artifact ids, artifact digests, tarball shasums, and pack shasums belong under `.release-evidence/` or durable GitHub Release assets, while packaged docs keep stable release rules and the human-gated publish packet. The corrective alpha path still does not configure trusted publishing, claim provenance, or add daemon/API server/database/WAL/remote-worker/UI/telemetry/artifact layers.
8
+ Version `0.1.0-alpha.2` was published with fresh main release-candidate evidence, real publish evidence, registry verification, installed-package CLI smoke, and GitHub Release verification, but its immutable npm tarball contains stale pre-publish package docs. Version `0.1.0-alpha.1` remains published as an earlier alpha and has GitHub pre-release `v0.1.0-alpha.1`. Version `0.1.0-alpha.0` is deprecated because that immutable tarball contains stale pre-publish status text. npm registry metadata and GitHub Releases are the source of truth for available versions and dist-tags. P3-1 froze daemon-ready execution-kernel contracts for embedders in [docs/daemon-ready-contract.md](./daemon-ready-contract.md); P3-7 freezes the API / CLI schema inventory and versioning policy in [docs/api-schema-contract.md](./api-schema-contract.md); P6 integrates the offline real compatibility evidence verifier into local prepublish and release-candidate evidence without launching authenticated real agent runs. Detailed run and artifact evidence for P6-4 through P7-5 is recorded outside the npm package under `.release-evidence/`.
9
+
10
+ Volatile current-head evidence stays out of the npm package. Fresh run ids, artifact ids, artifact digests, tarball hashes, pack hashes, downloaded verification paths, raw logs, raw CLI output, full prompts, and token-looking values belong under `.release-evidence/` or durable GitHub Release assets. Packaged docs keep stable release rules, the alpha.2 stale package-docs incident, the alpha.3 corrective release boundary, package-docs verification, and the human-gated boundary for any registry mutation. P5-1 adds a published-package daemon consumer harness for the already published `agent-cli-runtime@0.1.0-alpha.1`: it installs from the npm registry, uses fake CLIs only, and verifies daemon-style lifecycle coverage without touching local `dist/` or publishing a new version. The post-alpha path does not configure trusted publishing, claim provenance, or add daemon/API server/database/WAL/remote-worker/UI/telemetry/artifact layers.
9
11
 
10
12
  ## Local-First Production Definition
11
13
 
@@ -26,18 +28,26 @@ For this repository, "production-ready local runtime" means:
26
28
  - daemon/product shell embedding semantics are documented without adding a hosted daemon surface;
27
29
  - `npm run daemon:verify` packs and installs the package into a temporary consumer, then verifies fake run, fake goal, replay, diagnostics, store inspection, shutdown, and reopen using temp storage and fake CLIs;
28
30
  - `npm run runtime:safety` packs and installs the package into a temporary consumer, then verifies repeated run/goal execution, slow event consumption, cancel/timeout churn, bounded redacted diagnostics, repeated shutdown, lease close, and reopen behavior using fake CLIs only;
31
+ - `npm run published:daemon:verify` installs `agent-cli-runtime@0.1.0-alpha.1` from the npm registry into a temporary daemon-style consumer and verifies detect, run, goal, cancel, timeout, replay, read-only inspection during active writer ownership, second-writer refusal, shutdown/reopen, and stale owner recovery with schema `agent-runtime.publishedDaemonConsumer.v1`;
29
32
  - real CLI conformance and smoke default to detection/profile certification only; authenticated real agent runs require explicit `--allow-real-run`;
33
+ - `npm run compat:real:evidence` creates repo-only redacted real CLI compatibility evidence under `.release-evidence/`; it defaults to safe preflight and requires explicit `--allow-real-run --agent <id> --expect-text <text>` pairs before launching authenticated real smoke;
34
+ - `npm run compat:real:evidence:verify` is the offline P6-3 evidence gate for `.release-evidence/p6-1-real-cli-compatibility.json`; it does not launch authenticated real runs and rejects unsafe content, evidence drift, skip-as-success claims, incomplete authenticated success evidence, missing `needsVerification` audits, and invalid repo-only package-boundary claims;
30
35
  - real smoke evidence uses `schemaVersion: "agent-runtime.realSmoke.v1"`, requires expected text for success, checks cwd mutation, and omits prompts, raw stdout/stderr, private cwd, tokens, and final run records;
31
- - release artifact verification uses `agent-cli-runtime.releaseVerification.v1`, release gate evidence uses `agent-cli-runtime.releaseGateEvidence.v1`, and both are covered by the schema versioning policy in [docs/api-schema-contract.md](./api-schema-contract.md);
36
+ - release artifact verification uses `agent-cli-runtime.releaseVerification.v1`, release gate evidence uses `agent-cli-runtime.releaseGateEvidence.v1`, and both are covered by the schema versioning policy in [docs/api-schema-contract.md](./api-schema-contract.md); release gate evidence records compatibility verification with command, ok, verifier schema, verified evidence schema, and diagnostic count/codes only;
32
37
  - `npm run dogfood` is the default release-candidate gate and does not launch authenticated real agent runs;
33
38
  - `npm run dogfood` also installs the packed tarball into a temporary TypeScript consumer, runs `tsc --noEmit`, and executes fake-CLI library run/goal/replay/diagnostics smoke;
34
- - `npm run prepublish:check` is the local prepublish guard, includes `npm run daemon:verify` and `npm run runtime:safety`, and also avoids authenticated real agent runs;
35
- - `npm run release:candidate` creates local release-candidate artifacts without publishing npm;
39
+ - `npm run prepublish:check` is the local prepublish guard, includes `npm run daemon:verify`, `npm run runtime:safety`, and `npm run compat:real:evidence:verify`, and also avoids authenticated real agent runs;
40
+ - `npm run release:candidate` creates local release-candidate artifacts without publishing npm and records the offline compatibility verifier in `gate-evidence.json`;
36
41
  - `npm run release:verify` validates local or downloaded release artifacts and emits stable redacted JSON;
37
- - `docs/release-publish-runbook.md` records the future alpha publish command path, 2FA/trusted publishing/provenance decisions, dist-tag checks, and rollback boundaries without configuring real publishing;
42
+ - `npm run release:post-alpha:verify` compares npm registry and GitHub Release tarballs, allowing raw gzip hash differences only when unpacked package content is identical;
43
+ - `npm run smoke:published` installs the published npm package and verifies package-root ESM import plus `agent-runtime agents --json` parsing without authenticated real runs;
44
+ - `npm run published:daemon:verify` is the published-package daemon lifecycle proof and emits redacted JSON with `packageSource: "npm-registry"` and `noAuthenticatedRealRun: true`;
45
+ - `npm run published:adapters:verify` installs the published npm package from the npm registry and verifies built-in Codex, Claude, and OpenCode adapters with fake binaries, real adapter argv/stdin/parser paths, redacted JSON schema `agent-runtime.publishedAdapters.v1`, and no authenticated real runs;
46
+ - `npm run published:verify -- --out-dir published-verification` aggregates `smoke:published`, `published:daemon:verify`, `published:adapters:verify`, `release:post-alpha:verify`, and npm registry metadata into `agent-cli-runtime.publishedVerification.v1` evidence without raw stdout/stderr, token values, temp paths, or publish credentials;
47
+ - `docs/release-publish-runbook.md` records current post-alpha registry state, the future alpha publish command path, 2FA/trusted publishing/provenance decisions, dist-tag checks, and rollback boundaries without configuring real publishing;
38
48
  - CLI JSON success and error contracts are parseable, redacted, and covered for core release-facing commands;
39
49
  - `npm test` uses Vitest verbose output for default contract coverage; slower installed-package gates and install smokes run through single-Node release gates or explicit opt-in checks rather than every Node matrix entry;
40
- - GitHub Actions CI runs Node.js 20/22/24 matrix checks plus one single-Node release-gates job for `npm run daemon:verify`, `npm run runtime:safety`, and `npm run dogfood`;
50
+ - GitHub Actions CI runs Node.js 20/22/24 matrix checks plus one single-Node release-gates job for `npm run daemon:verify`, `npm run runtime:safety`, and `npm run dogfood`; CI does not run the repo-only compatibility verifier because it is evidence-bound to `.release-evidence/`, while prepublish and release-candidate do;
41
51
  - the manual release-candidate workflow is configured to upload the packed tarball, pack metadata, package file list, gate evidence JSON, and verification JSON with explicit artifact retention;
42
52
  - the release report records local commands, remote workflow evidence, downloaded artifact verification, package boundary, real CLI evidence boundaries, known risks, and non-goals;
43
53
  - validation evidence is replayable through goal manifests and diagnostics export.
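
The `release:post-alpha:verify` rule in the list above (raw gzip hashes may differ as long as the unpacked package content is identical) can be sketched as a per-file digest comparison. This is an illustration, not the script's actual implementation: `contentDigest`, `sameUnpackedContent`, and the in-memory `Map` of path to bytes are assumptions, and real tarballs would first have to be unpacked by some other means.

```typescript
import { createHash } from "node:crypto";

// Hypothetical in-memory view of an unpacked tarball: file path -> file bytes.
type UnpackedPackage = Map<string, Uint8Array>;

// Digest over sorted paths and per-file SHA-256 values, so gzip-level details
// (compression settings, header timestamps) cannot influence the result.
function contentDigest(pkg: UnpackedPackage): string {
  const outer = createHash("sha256");
  for (const path of Array.from(pkg.keys()).sort()) {
    const bytes = pkg.get(path) ?? new Uint8Array();
    const fileHash = createHash("sha256").update(bytes).digest("hex");
    outer.update(`${path}\0${fileHash}\n`);
  }
  return outer.digest("hex");
}

// Two packages count as identical when every path and every file body matches.
function sameUnpackedContent(a: UnpackedPackage, b: UnpackedPackage): boolean {
  return contentDigest(a) === contentDigest(b);
}
```

Sorting the paths before hashing keeps the comparison independent of the order in which entries appear in each archive.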
@@ -53,12 +63,21 @@ npm run lint
53
63
  npm run build
54
64
  npm run daemon:verify
55
65
  npm run runtime:safety
66
+ npm run compat:real:evidence:verify
67
+ npm run published:daemon:verify
68
+ npm run published:adapters:verify
56
69
  npm run ci
57
70
  npm run dogfood
58
71
  npm run prepublish:check
59
72
  npm run package:check
60
73
  npm run release:candidate -- --out-dir release-candidate
61
74
  npm run release:verify -- --dir release-candidate
75
+ npm run release:post-alpha:verify
76
+ npm run smoke:published
77
+ npm run published:daemon:verify
78
+ npm run published:adapters:verify
79
+ npm run published:verify -- --out-dir <tmp-dir>
80
+ npm run published:verify:evidence -- --dir <tmp-dir>
62
81
  node ./dist/cli/main.js conformance --mode fixtures --json
63
82
  node ./dist/cli/main.js conformance --mode fake --json
64
83
  node ./dist/cli/main.js conformance --mode real --agent all --json
@@ -75,15 +94,16 @@ npm publish --dry-run --ignore-scripts --tag alpha
75
94
 
76
95
  Remote CI gates:
77
96
 
78
- - `.github/workflows/ci.yml`: Node.js 20/22/24 matrix for typecheck, lint, tests, build, production dependency audit, package boundary checks, and pack dry-run; single Node.js 22 release-gates job for `npm run daemon:verify`, `npm run runtime:safety`, and `npm run dogfood`.
79
- - `.github/workflows/release-candidate.yml`: manual `workflow_dispatch` gate that runs `npm ci`, `npm run ci`, `npm run dogfood`, and `npm run release:candidate -- --out-dir release-candidate`; it uploads `agent-cli-runtime-tarball`, `agent-cli-runtime-pack-metadata`, `agent-cli-runtime-package-files`, `agent-cli-runtime-gate-evidence`, and `agent-cli-runtime-release-verification`. For P3-11 and later, the fresh workflow head SHA must match the commit being considered, the downloaded artifacts must pass `npm run release:verify -- --dir <normalized-artifact-dir>`, and volatile evidence must be recorded under `.release-evidence/` instead of package docs. Historical runs only prove their own head SHAs. The workflow does not publish and does not require an npm token.
97
+ - `.github/workflows/ci.yml`: Node.js 20/22/24 matrix for typecheck, lint, tests, build, production dependency audit, package boundary checks, and pack dry-run; single Node.js 22 release-gates job for `npm run daemon:verify`, `npm run runtime:safety`, and `npm run dogfood`. The CI path intentionally leaves repo-only compatibility evidence verification to `prepublish:check` and `release:candidate`.
98
+ - `.github/workflows/release-candidate.yml`: manual `workflow_dispatch` gate that runs `npm ci`, `npm run ci`, `npm run dogfood`, and `npm run release:candidate -- --out-dir release-candidate`; it uploads `agent-cli-runtime-tarball`, `agent-cli-runtime-pack-metadata`, `agent-cli-runtime-package-files`, `agent-cli-runtime-gate-evidence`, and `agent-cli-runtime-release-verification`. `gate-evidence.json` must include `daemon:verify`, `runtime:safety`, and `compat:real:evidence:verify`. The fresh workflow head SHA must match the commit being considered, the downloaded artifacts must pass `npm run release:verify -- --dir <normalized-artifact-dir>`, and volatile evidence must be recorded under `.release-evidence/` instead of package docs. Historical runs only prove their own head SHAs. The workflow does not publish and does not require an npm token.
99
+ - `.github/workflows/published-package-verification.yml`: manual `workflow_dispatch` post-publish gate on Node.js 22 that runs `npm ci`, `npm run published:verify -- --out-dir published-verification`, verifies the summary, and uploads `agent-cli-runtime-published-verification` with 14-day retention. P5-4 records a successful downloaded-artifact re-verification under `.release-evidence/`; each run proves only its own `headSha`. This is not a publish workflow and does not require npm registry credentials.
80
100
 
81
101
  `npm publish --dry-run --ignore-scripts --tag alpha` is a manual local dry-run check. The explicit `--tag alpha` keeps dry-run output aligned with the pre-alpha release intent even when npm does not apply `publishConfig.tag` in dry-run output. It is intentionally documented but not required as a remote CI gate because npm dry-run output can vary by npm version and registry context.
82
102
 
83
103
  Package install smoke:
84
104
 
85
105
  ```bash
86
- tmp_dir="$(mktemp -d /tmp/agent-runtime-release-XXXXXX)"
106
+ tmp_dir="$(mktemp -d)"
87
107
  pack_info="$(npm pack --json --ignore-scripts --pack-destination "$tmp_dir")"
88
108
  package_file="$(printf '%s' "$pack_info" | node -e "const data = JSON.parse(require('node:fs').readFileSync(0, 'utf8')); process.stdout.write(data[0].filename);")"
89
109
  (
@@ -99,7 +119,7 @@ package_file="$(printf '%s' "$pack_info" | node -e "const data = JSON.parse(requ
99
119
  )
100
120
  ```
101
121
 
102
- The checked-in automated version of this smoke is `npm run dogfood`; it creates the temporary `consumer.ts`, `consumer.mjs`, fake adapter binary, and fake CLI environment itself. `daemon:verify` and `runtime:safety` run in the single-Node CI release-gates job and in `release:candidate`, not in every Node matrix entry.
122
+ The checked-in automated version of this smoke is `npm run dogfood`; it creates the temporary `consumer.ts`, `consumer.mjs`, fake adapter binary, and fake CLI environment itself. `daemon:verify` and `runtime:safety` run in the single-Node CI release-gates job and in `release:candidate`, not in every Node matrix entry. `compat:real:evidence:verify` runs in `prepublish:check` and `release:candidate`, not in dogfood, so installed-package consumers never depend on repo-only `.release-evidence/`.
103
123
 
104
124
  Manual real CLI run gate, only on a machine where the selected CLI is installed, authorized, and safe to run:
105
125
 
@@ -168,6 +188,12 @@ Repository-only daemon embedding gates:
168
188
 
169
189
  - `scripts/verify-daemon-ready.mjs`
170
190
  - `scripts/verify-runtime-safety.mjs`
191
+ - `scripts/verify-published-daemon-consumer.mjs`
192
+ - `scripts/verify-published-adapters.mjs`
193
+ - `scripts/create-published-verification-evidence.mjs`
194
+ - `scripts/verify-published-verification-evidence.mjs`
195
+ - `scripts/create-real-compatibility-evidence.mjs`
196
+ - `scripts/verify-real-compatibility-evidence.mjs`
171
197
 
172
198
  Repository-only prepublish artifacts:
173
199
 
@@ -176,6 +202,7 @@ Repository-only prepublish artifacts:
176
202
  Excluded artifacts:
177
203
 
178
204
  - `.reference/`
205
+ - `published-verification/`
179
206
  - `tests/`
180
207
  - `tests/fixtures/`
181
208
  - fault fixtures
@@ -190,11 +217,13 @@ Excluded artifacts:
190
217
  ## Known Risks
191
218
 
192
219
  - Real CLI behavior can drift after this release candidate. Treat `docs/compatibility.md` as dated evidence, not a permanent guarantee.
193
- - P3-10 verifies one remote release-candidate run and downloaded artifact re-verification for pre-documentation SHA `fdba3ebccb2e57a0ad295101028a2a3937a92204`. Because release docs are packaged, final publish evidence must come from a fresh workflow run after this evidence packet is committed. Historical P3-9 run `27943672095` only proves target SHA `65fac505ca3eb830a06d8656068cf4ed5f6dd46a`; historical P3-9 interim run `27942743285` only proves target SHA `a0299a7d81bb614661922bebc8c75496cf0a3d11` before the strict `fixtures?` package-boundary lock; historical P3-8 run `27940814340` only proves target SHA `eb8de0f9b1edfa3f94c35a50b31005c5d3c105d4`; historical P3-5 run `27932628093` only proves workflow head SHA `8d7bc2a19c626caa1ad5223acbcd35df34aff18e`; historical P2-12 run `27869580048` only proves commit `2f8832119b4ebdb8393077052560589a398ebf56`. Internal files under `dist/` may exist in the tarball for declarations and CLI execution, but importing internal subpaths is not a documented contract.
220
+ - P6-1 local evidence on 2026-06-23 refreshed Codex, Claude Code, and OpenCode preflight plus opt-in Codex/OpenCode smoke. It is machine-local compatibility evidence, not a CI gate and not a future-version guarantee.
221
+ - P6 release-candidate evidence is commit-specific and recorded outside the package. It is not evidence for future commits, future publishes, or authenticated real Codex/Claude/OpenCode success. Because release docs are packaged, final publish evidence must come from a fresh workflow run after candidate documentation is committed. Internal files under `dist/` may exist in the tarball for declarations and CLI execution, but importing internal subpaths is not a documented contract.
194
222
  - `status-only real smoke exit 0`, wrong expected text, or a custom prompt without `--expect-text` remain intentionally non-passing: classification is `unexpected_output`.
195
223
  - Real conformance preflight can classify a local CLI as unavailable/auth-missing because of machine-specific executable, auth, network, or proxy state. That skip is useful compatibility evidence but is not a successful real run.
224
+ - Codex session/resume and a stable non-mutating auth probe remain in `needsVerification`.
196
225
  - OpenCode explicit read-only/workspace-write flags, extra dirs, and session/resume mappings remain in `needsVerification`.
197
- - Claude Code authenticated run smoke remains dependent on local auth or a correctly configured Anthropic-compatible provider environment.
226
+ - Claude Code authenticated run smoke remains dependent on local auth or a correctly configured Anthropic-compatible provider environment; current P6-1 local evidence is `auth_missing`.
198
227
  - P3-6 adds opt-in real smoke evidence, but does not add authenticated real runs to CI, dogfood, prepublish, or release-candidate gates and does not implement scheduler expansion, daemon/API server, database, WAL, remote workers, web UI, telemetry, npm publish, trusted publishing configuration, provenance publishing, or guaranteed authenticated real-run success certification. Repair and fault-injection hardening remains local JSONL-only within the existing store layout.
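
The `unexpected_output` rule in the list above can be sketched for the exit-0 case as follows. Only the `unexpected_output` label comes from the text; the `passed` label, the `classifyExitZero` helper, and the substring match are assumptions made for illustration and do not describe the runtime's real classifier or its other outcomes such as `auth_missing`.

```typescript
// "unexpected_output" is the documented classification; "passed" is a
// hypothetical label used here only to name the success branch.
type ExitZeroClassification = "passed" | "unexpected_output";

// Models a real smoke run that already exited with status 0.
// Assumption: success means the output contains the expected text.
function classifyExitZero(outputText: string, expectText?: string): ExitZeroClassification {
  if (expectText === undefined || expectText === "") {
    // Status-only success: exit 0 without expected text is not a pass.
    return "unexpected_output";
  }
  return outputText.includes(expectText) ? "passed" : "unexpected_output";
}
```

The point of the sketch is that a zero exit code alone is never treated as sufficient evidence of success.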
199
228
 
200
229
  ## Durable Supervisor Contract