@lcv-ideas-software/cross-review 4.1.0 → 4.1.1
- package/CHANGELOG.md +28 -0
- package/README.md +74 -73
- package/dist/scripts/runtime-smoke.js +10 -3
- package/dist/scripts/runtime-smoke.js.map +1 -1
- package/dist/scripts/smoke.js +25 -0
- package/dist/scripts/smoke.js.map +1 -1
- package/dist/src/core/config.d.ts +1 -1
- package/dist/src/core/config.js +1 -1
- package/dist/src/core/session-store.js +7 -9
- package/dist/src/core/session-store.js.map +1 -1
- package/package.json +1 -1
package/README.md
CHANGED
@@ -21,7 +21,7 @@ npm install -g @lcv-ideas-software/cross-review
 npm install -g @lcv-ideas-software/cross-review --registry=https://npm.pkg.github.com
 ```
 
-**Status.** Stable. Current release: **v04.01.
+**Status.** Stable. Current release: **v04.01.01** (npm package `4.1.1`). See
 [CHANGELOG.md](./CHANGELOG.md) for the release history.
 
 > **Project renamed 2026-05-15.** This project was previously published as
@@ -34,78 +34,79 @@ npm install -g @lcv-ideas-software/cross-review --registry=https://npm.pkg.githu
 
 The version history at a glance:
 
-| Release
-
-| **`v04.01.
-| **`v04.00
-| **`v04.00.
-| **`v04.00.
-| **`v04.00.
-| **`v04.00.
-| **`v04.00.
-| **`
-| **`v03.07.
-| **`v03.07.
-| **`v03.07.
-| **`v03.07.
-| **`v03.07.
-| **`v03.
-| **`v03.
-| **`v03.
-| **`v03.
-| **`v03.
-| **`v03.
-| **`
-| **`v02.
-| **`v02.27.
-| **`v02.
-| **`v02.26.
-| **`v02.
-| **`v02.25.
-| **`v02.
-| **`v02.
-| **`v02.
-| **`v02.
-| **`v02.
-| **`v02.18.
-| **`v02.18.
-| **`v02.18.
-| **`v02.18.
-| **`v02.18.
-| **`v02.18.
-| **`v02.18.
-| **`v02.
-| **`v02.
-| **`v02.
-| **`v02.15.
-| **`v02.
-| **`v02.14.
-| **`v02.
-| **`v02.
-| **`v02.
-| **`v02.
-| **`v02.
-| **`v02.
-| **`v02.
-| **`v02.06.
-| **`v02.
-| **`v02.
-| **`v02.04.
-| **`v02.
-| **`v02.03.
-| **`v02.03.
-| **`v02.03.
-| **`v02.
-| **`v02.
-| **`v02.01.
-| **`v02.00
-| **`v02.00.
-| **`v02.00.
-| **`v02.00.
-| **`v02.00.
-| **`
-| **`v2.0.0-alpha.
-| **`v2.0.0-alpha.
+| Release | Scope |
+| -------------------- | ----- |
+
| **`v04.01.01`** | **Patch — release the hard-gate cleanup as a published package.** Formalizes the linter/formatter hard-gate cleanup with a package-version bump so every patch shipped to `main` remains publishable. Removes the dead global ESLint `@typescript-eslint/no-explicit-any` waiver, restores README coverage under Prettier, adds smoke coverage against future linter/formatter masking, makes `runtime-smoke` polling terminal-outcome aware with a 60-second deadline, fixes two CodeQL `js/file-system-race` patterns with atomic/file-descriptor based file operations, and records the scoped StepSecurity cleanup for generated `dist/**` artifacts in the publish workflow. |
+
| **`v04.01.00`** | **Minor — security hardening of session-store concurrency, write-path DoS surface, and credential redaction.** Closes three high-impact findings from an in-depth security audit of v4.0.8: (F1) `withSessionLock` switched from `fs.openSync(.., "wx")` + separate write to `proper-lockfile`'s `fs.mkdir`-based atomic locking, eliminating the multi-process TOCTOU race window where two host processes sharing the same `data_dir` could both enter the critical section and corrupt `meta.json`. (F2) `redactPrivateKeyBlocks` now redacts unterminated `-----BEGIN PRIVATE KEY-----` blocks to end-of-string instead of returning the original input unredacted — pre-v4.1.0 leaked partial keys to events.ndjson when logs were truncated mid-key. (F3) `writeJson`'s `renameSync` retry no longer busy-waits with `while (Date.now() - start < wait)` (which blocked the event loop for up to 310 ms under Windows AV stress); it now awaits a Promise-based timer so the event loop remains responsive during backoff. The cascading internal refactor (~22 SessionStore methods became async, ~80 internal call sites added `await`) preserves the public MCP tool surface unchanged. New runtime dep: `proper-lockfile` ^4.1.2. |
+
| **`v04.00.08`** | **Patch — eliminate the recurring `js/file-access-to-http` CodeQL false positive at the source.** `scripts/verify-registry-dist.mjs` no longer reads `package.json` from disk; package name and version come from `PACKAGE_NAME` / `PACKAGE_VERSION` env vars (with `npm_package_name` / `npm_package_version` auto-injected by npm as a transparent fallback when invoked via `npm run release:verify-registry`). Both inputs are required; missing values throw a clear error before any network call. Removing the `fs.readFileSync` → outbound-fetch flow stops future CodeQL analyses from re-filing the same alert on every release. |
+
| **`v04.00.07`** | **Patch — bounded npm registry fetch in the post-publish verifier.** `scripts/verify-registry-dist.mjs` now passes `signal: AbortSignal.timeout(30_000)` to the `https://registry.npmjs.org/<package>/<version>` `fetch` call so a slow or unreachable registry surfaces as a deterministic abort instead of hanging the publish workflow until its 60-minute ceiling. Timeouts map to an explicit `"npm registry lookup for <spec> timed out after 30000 ms"` error; the validated fields (`dist.shasum`, `dist.integrity`, `dist.tarball`) and the script CLI/env contract are unchanged. |
+
| **`v04.00.06`** | **Patch — Windows-safe registry verifier.** `scripts/verify-registry-dist.mjs` now queries `https://registry.npmjs.org` directly instead of spawning `npm.cmd`, closing the Windows Node hardening failure (`spawnSync npm.cmd EINVAL`) while preserving the post-publish validation of registry `dist.shasum`, `dist.integrity`, and `dist.tarball`. |
+
| **`v04.00.05`** | **Patch — hard-gate close-out for the Codex v4.0.4 audit.** Clears the 6 residual findings: StepSecurity `Source-Code-Overwritten` detections for generated `dist/*` publish artifacts are suppressed against the existing narrow post-rename rule; `docs/model-selection.md` now uses the post-v4 product name, removes misleading fallback wording, and links to the real historical v2 capability-smoke report; model-selection failure text now says it keeps the configured model pin instead of the old fallback phrase; Copilot/Gemini agent instructions preserve the `cross-review-v2` → `cross-review` rename history; local tag verification is expected to use fetched remote tags; the publish workflow now records npm registry `dist.shasum` / `dist.integrity` / `dist.tarball` metadata so audits do not confuse local `npm --registry=https://registry.npmjs.org pack --dry-run` output with the published artifact identity; and `grok-4-latest` model-match accepts provider-reported dot-release aliases such as `grok-4.3` without weakening true cross-family downgrade rejection. |
+
| **`v04.00.04`** | **Patch — restore prettier coverage of `src/` and `scripts/` (close audit on v4.0.3 hard-gate gap).** v4.0.3 added biome but also moved `src/**/*.ts`, `src/**/*.js`, `scripts/**/*.ts`, `scripts/**/*.js` into `.prettierignore` to dodge a biome↔prettier disagreement on dynamic-import call-style. Net effect: prettier ran against zero JS/TS under `src/`/`scripts/`, silently turning one of the four hard-gate checks into a no-op there. v4.0.4 restores full coverage and resolves the disagreement at the source — the 7 `scripts/smoke.ts` dynamic-import sites that triggered the wrap conflict were rewritten from destructure-from-call form to a 2-statement form (`const mod = await import("..."); const { A, B, C } = mod;`). Functionally identical; static type inference preserved. Both formatters now check the full JS/TS surface and pass simultaneously. |
+
| **`v04.00.00`** | **Major — project renamed to `cross-review`** (drops the `-v2` suffix after the companion `cross-review-v1` project was discontinued and archived 2026-05-15). Breaking: npm package `@lcv-ideas-software/cross-review-v2` → `@lcv-ideas-software/cross-review` (old name stays on npm at `3.7.5` for historical installs); binaries `cross-review-v2` / `cross-review-v2-dashboard` → `cross-review` / `cross-review-dashboard`; env-var prefix `CROSS_REVIEW_V2_*` → `CROSS_REVIEW_*` across all config knobs that previously carried the `V2` infix (e.g. `CROSS_REVIEW_DATA_DIR`, `CROSS_REVIEW_DISABLE_CACHE_ANTHROPIC`); API-key env vars unchanged; per-host identity env vars (`CROSS_REVIEW_CALLER_TOKEN`, `CROSS_REVIEW_REQUIRE_TOKEN`) unchanged. GitHub repo URL: `LCV-Ideas-Software/cross-review-v2` → `LCV-Ideas-Software/cross-review` (auto-redirected). GitHub Pages: `cross-review-v2.lcv.dev` → `cross-review.lcv.dev`. MCP server key in host configs: operators who declared `cross-review-v2` rename to `cross-review`; after reload, MCP tool prefix becomes `mcp__cross-review__*`. Data dir migration is manual: operators copy `${HOME}/.cross-review/data_v2/*` into the new default `${HOME}/.cross-review/data/` (or set `CROSS_REVIEW_DATA_DIR` to the legacy path) — the v4.0.0 runtime reads only `CROSS_REVIEW_DATA_DIR` and does not fall back to the `_v2` suffix automatically. Preserved when copied: persisted session data, `config.json`, `host-tokens.json`, `cache_manifest.json`, archived/corrupt session dirs. Wire shape of all MCP tools, event types, convergence semantics is unchanged; all capabilities, peers, models, security defenses carry over from v3.7.5 verbatim. 504 source/script/doc text substitutions across 26 files. |
+
| **`v03.07.05`** | **Patch — logs+sessions study 2026-05-15 close-out (4 surgical fixes from 244-session/429-round corpus).** **A1** — `session_doctor` classified cancelled sessions as `stale` (22 of 244 false positives); doctor now treats any terminal outcome (`aborted`/`converged`/`max-rounds`) as NOT-stale regardless of the persisted `convergence_health.state`. Source-layer state untouched (backward-compat with existing sessions). **A2** — `lockCallerPeerSelection` emitted false-positive `session.caller_peer_selection_ignored` events when callers passed a panel identical to the enabled set (13 of 106 recent events); the lock now accepts an optional `enabledPeers` snapshot in its context and short-circuits the emit when the caller-supplied list set-equals the enabled set (sorted comparison). **A3** — per-provider cache disable env vars (`CROSS_REVIEW_DISABLE_CACHE_ANTHROPIC \| OPENAI \| GEMINI \| DEEPSEEK \| GROK \| PERPLEXITY`; provider names match the v2.21.0 `_CACHE_TTL_*` convention; same parsing as `peer_enabled`); Anthropic default flipped to disabled based on empirical 0.3% hit-rate ($1.18 wasted to save $0.0035 over 244 sessions). Global `CROSS_REVIEW_DISABLE_CACHE` kill-switch unchanged; per-provider is an additive layer. Anthropic adapter `buildSystemBlock` + short-prefix warning gated on the per-provider flag; central `config.json` `cache` block accepts the new disable keys. **B1** — `session_sweep` gains opt-in `prune_corrupt: boolean.default(false)` + `corrupt_min_age_days: number.int.default(30)` to clean `<data_dir>/corrupt_sessions/` (no prior automated cleanup; 1 stale entry from 2026-05-08 v2.25.1 redact bug still on disk at study time). New `store.pruneCorruptSessions(minAgeMs)` returns `{scanned, removed, kept}`. Response shape stays `SessionMeta[]` when `prune_corrupt: false` (default); wraps to `{ swept, pruned_corrupt }` when true. **Patch bump** (3.7.4 → 3.7.5). |
+
| **`v03.07.04`** | **Patch — Codex v3.7.3 parecer close-out + two cross-review-gate root-cause fixes** (APROVADO-COM-RESSALVAS; 2 parecer findings + 2 operator-directed fixes; no public-surface or tool-schema change). **`model_match` `-latest`-alias false positive (operator-directed)** — `BasePeerAdapter.modelMatches()` matched the reported model with `reported === requested` or `reported.startsWith(`${requested}-`)`. That works for a base id resolving to a dated id (`gpt-5.5` → `gpt-5.5-2026-04-23`) but FAILS for a `-latest` alias: xAI returns `grok-4-0709` for the pinned `grok-4-latest`, which does not start with the literal `grok-4-latest-`. Every grok response was flagged `model_match: false` → `status` forced `null` → `silent_model_downgrade` rejection → format-recovery skipped, so grok was dead-on-arrival in every cross-review session and no panel including grok could reach unanimity. Fix: `modelMatches` strips a `-latest` suffix to the family stem and matches the reported id against it (`grok-4-latest` → `grok-4` → `grok-4-0709` matches); a genuine cross-family downgrade (`grok-3-*`) is still flagged. New smoke marker `model_match_latest_alias_test`. **`detectFabricatedEvidence` false positive (operator-directed)** — the detector validated operational assertions (`npm run build`, `index <hash>..<hash>`, `cargo test`, …) against the `provenanceCorpus` (attached evidence) ONLY; the prior draft was lumped into `narrativeCorpus` and never consulted for assertions. The documented process REQUIRES embedding the verbatim diff + raw gate output in `initial_draft`, so when R1 didn't converge and a relator generated an R2 revision, the relator faithfully PRESERVING that embedded evidence was flagged as "fabricating" it → `lead_fabrication_repeated` abort (misread as "perplexity keeps fabricating"; in fact it hit any relator and was a detector self-contradiction). 
Fix: a **three-tier corpus** — `FabricationDetectionCorpus` gains a `priorDraftCorpus` field; operational assertions are flagged only when **net-new** vs `{provenanceCorpus ∪ priorDraftCorpus}` (symmetric with the hex-token check). Preserved evidence is not fabrication; the task `narrativeCorpus` stays excluded so the v2.24.0 eee886d3 protection holds exactly. Signature unchanged; interface gains one field. **AUDIT-1 (MEDIUM)** — `scripts/runtime-smoke.ts` injected cost rate cards for only 4 peers (codex/claude/gemini/deepseek), but the public MCP path strips a caller's `peers` list (the v3.3.0 `lockCallerPeerSelection` lock), so every round runs the full 6-peer panel; grok + perplexity had no rate cards → `missingFinancialControlVars` tripped → the round finalized `outcome=max-rounds`/`financial_controls_missing` while runtime-smoke still printed `ok: true` with no assert. Fix: inject grok + perplexity rate cards (+ `CROSS_REVIEW_PERPLEXITY_DISABLE_SEARCH` and per-size request-fee defaults), and add explicit `assert` calls on every async flow's durable terminal `outcome` (review round + unanimity flow → `converged`, cancellation flow → `aborted`) placed before the `ok: true` print so a non-converging round fails the smoke loudly. **AUDIT-2 (LOW)** — `src/core/convergence.ts` comment imprecision: the skip was framed only as "the user declared no fallback models", but `fallback_exhausted` is in the skippable set and arises AFTER a declared fallback chain is drained; both comment blocks now split the skip into its two paths (no fallback declared → retry-same exhausted → skip; fallback declared, tried, and drained → also skip). Comment-only, zero logic change. New smoke marker `runtime_smoke_outcome_assert_test` + 2 new `relator_evidence_provenance_lock_test` cases source-pin the fixes. **Patch bump** (3.7.3 → 3.7.4). |
+
| **`v03.07.03`** | **Patch — "sem fallback é sem fallback" directive + Codex v3.7.2 parecer residuals.** **Skip-peer on model-unavailability** — when a reviewer peer's pinned model is genuinely unavailable (infra failure — `auth`/`rate_limit`/`provider_error`/`network`/`timeout`/`fallback_exhausted`, retries exhausted, no user-declared fallback), the round now SKIPS that peer and converges on the remaining peers instead of the failure blocking convergence (the operator's "pular aquele peer e trabalhar apenas com os outros"). A peer that responded but badly, or a policy/budget/content stop, still blocks. **Skip-gated quorum floor (`SKIP_QUORUM_FLOOR = 2`)** prevents a degenerate 0/1-peer "unanimous" review; guarded by `skipped.length > 0` so on a zero-skip round the convergence decision is identical to pre-v3.7.3 (the only output delta is the additive `skipped_peers` field). New `skipped_peers` on `ConvergenceResult`/`ConvergenceScope` + `session.peer_skipped_unavailable` event. **No model-downgrade fallback** — fallback is 100% user-declared via the central config `fallback_models` (default empty = no fallback → retry-same-model then skip); `model_fallback` capability flag now derived honestly. **Codex v3.7.2 residuals**: grok reasoning-effort shadow set + boot warning (added `grok-4.3`), "7 MCP configs" → "host MCP configs". 100% backward-compatible; no tool-schema change. **Patch bump** (3.7.2 → 3.7.3). |
+
| **`v03.07.02`** | **Patch — Codex 3rd super-audit close-out of v3.7.1** (3 findings, all verified against primary-source code; Codex verdict REPROVADO without v3.7.2). **AUDIT-1 (BLOCKER)** — v3.7.1's `runUntilUnanimous` fix led the `??` chain with `input.caller`, but the `run_until_unanimous` MCP schema declares `caller: CallerSchema.default("operator")` — so on the public path `input.caller` is never `undefined`, the `existingSession` fallback was dead code, and the real persisted peer-petitioner could still be reclassified / placed in the voting colegiado / lottery-picked as relator of its own session (Codex reproduced it). Fix: the persisted session wins — `callerForLottery = existingSession?.convergence_scope?.petitioner ?? existingSession?.caller ?? input.caller ?? "operator"`. (`askPeers` does not share the bug — it keys off `input.petitioner`, which has no MCP schema field.) **AUDIT-2** — the continuation smoke marker gains post-schema cases (explicit `caller:"operator"` + mismatching `caller:"claude"`) simulating the schema-materialized value the public path produces; source pin tightened to the v3.7.2 chain ordering. **AUDIT-3 + operator directive** — NO model fallback: every peer `PRIORITY` is now a single canonical pin (`gpt-5.5`, `claude-opus-4-7`, `gemini-2.5-pro`, `deepseek-v4-pro`, `grok-4-latest`, `sonar-reasoning-pro`); v3.7.1 trimmed only gemini/deepseek, this completes all 6. The explicit per-host env/config override is the only escape hatch. 100% backward-compatible; no tool-schema change. **Patch bump** (3.7.1 → 3.7.2). |
+
| **`v03.07.01`** | **Patch — Codex super-audit close-out of v3.7.0** (4 findings AUDIT-1..AUDIT-4, all verified against primary-source code before fixing; Codex verdict REPROVADO without v3.7.1). **AUDIT-1 (BLOCKER)** — `runUntilUnanimous` derived the petitioner from `input.caller ?? "operator"` _before_ reading the persisted session; v3.7.0 fixed this in `askPeers` but left the sibling automatic entry point — a caller-omitted continuation could place the real persisted peer-petitioner into the voting colegiado or select it as the relator of its own session (anti-self-review HARD GATE violation, Codex reproduced it). Fix: read the session once up front via `existingSession`, derive `callerForLottery` from it before any recusal/lottery; `existingSession` reused (single read, no double-read). **AUDIT-2** — new smoke marker `audit1_run_until_unanimous_continuation_test` (v3.7.0's coverage only exercised `askPeers`). **AUDIT-3** — trimmed `deepseek`/`gemini` `PRIORITY` to their lone canonical pin so `selectFromCandidates` can no longer silently auto-select `deepseek-v4-flash` (forbidden "flash" tier) or `gemini-3.1-pro-preview` (manual-override-only per the workspace Model Selection Standards directive); `codex`/`claude`/`grok` same-provider degradation chains left intact. **AUDIT-4** — refreshed two stale internal comments. 100% backward-compatible; no tool-schema change. **Patch bump** (3.7.0 → 3.7.1). |
+
| **`v03.07.00`** | **Minor — Codex super-audit close-out 2026-05-14** (bit-by-bit review of v3.6.0; 6 findings, all verified real against primary-source code). **AUDIT-1 (BLOCKER)** — `askPeers` computed auto-recusal from the current call's `caller` _before_ reading the persisted session; a continuation that omitted `caller` defaulted it to `"operator"`, skipped recusal, and let the real persisted peer-petitioner back into the voting colegiado (anti-self-review HARD GATE violation, Codex reproduced it). Fix: read the session first, derive `effectivePetitioner`, recuse from that. **AUDIT-2 (HIGH)** — operator default relator hardcoded `"codex"` ignoring `peer_enabled`; now prefers codex when enabled else the first enabled session peer. **AUDIT-3 (MEDIUM)** — `peers` + `judge_peers` MCP schemas capped at `.max(5)` against a 6-element `PEERS` roster (stale since v3.0.0 Perplexity); `.max(PEERS.length)` at all 5 sites. **AUDIT-4 (LOW)** — `server_info.financial_controls` now computes readiness over the enabled peer subset. **AUDIT-5 (NIT)** — corrected stale internal comments (`addressed`→`not_resurfaced`, `max_rounds` 32→1000, "5 peer probes"→6). **AUDIT-6** — clarifying comment on the "API-only" claim (no caller-supplied shell/repo execution; the internal `reg`/`tasklist` calls are constant-arg/PID-derived). 2 new smoke markers; smoke `ok: true / events: 99`. 100% backward-compatible additive (AUDIT-3 widens schema acceptance; AUDIT-1/2 are bug fixes). **Minor bump** (3.6.0 → 3.7.0; Y-component per SemVer). |
+
| **`v03.06.00`** | **Minor — observability + caller-discipline close-out 2026-05-14**, from a study of the cross-review logs + 169 past sessions (324 rounds, $45.92, 42541 events). **B2** — token-delta default threshold raised 1024 → 16384 (`session_doctor` showed `peer.token.delta` was 79.5% of all persisted events); operators with a `config.json` `token_streaming.chars_threshold` override should bump it too. **C** — `session_doctor` gains an opt-in `repair: boolean` param (default false → still read-only) that recomputes `convergence_health` for sessions stuck in the contradictory `outcome="converged"`+`health="blocked"` state (pre-v3.2.0 corruption artifact; v3.2.0 fixed the cause, old metas persist); `readOnlyHint` flips to false since `repair=true` mutates; new `repaired` array on the report; idempotent. **B3 + B4** — new top-level `notices: string[]` on all 4 caller-facing tool responses (+ `session_poll`): a `relator_non_voting:` notice naming the relator + voting peers (callers kept misreading the relator's deliberate exclusion as a dropped peer even after v3.5.0's nested metadata), and a `peer_selection_lock:` notice when a caller's `peers`/`lead_peer` was stripped (the v3.3.0 lock fired 30× silently across the corpus). New exported `buildResponseNotices()`. **B1** — `session_poll` gains a derived `needs_attention: boolean` (non-terminal + stale/blocked health + no running job) — the study found 28 non-terminal sessions abandoned until the 24h sweep; this surfaces the risk sooner. 3 new smoke markers; smoke `ok: true / events: 99`. 100% backward-compatible additive: new optional input, new response fields, new exported helper, new report field, config-default tuning. **Minor bump** (3.5.0 → 3.6.0; Y-component per SemVer). |
|
|
54
|
+
| **`v03.05.00`** | **Minor — Codex operational-report close-out 2026-05-14: 5 findings from sessions `f0db3970` + `df052926`.** **CRV2-2 (substantive fix)** — the evidence checklist no longer marks asks `addressed` purely because a peer did not resurface them; "peer did not re-ask" is not proof of satisfaction. The resurfacing-inference path now produces a distinct `not_resurfaced` status (not `open` → still does not hard-block the `=== "open"` convergence gate; not `addressed` → the audit trail no longer lies). `addressed` is reserved for the judge verified-satisfied path + explicit operator action. **CRV2-4** — new pure-textual `evidencePreflight()` runs before any paid peer call; catches submissions that _claim_ completed operational work (tests pass / diff exists / build validated) but embed zero concrete evidence, failing locally with `needs_evidence_preflight` instead of burning API across rounds. Conservative trip condition (completed-work claim AND zero evidence markers — mere keyword presence does not trip). New optional `evidence` input field on `run_until_unanimous` + `session_start_unanimous`; opt-out via `CROSS_REVIEW_EVIDENCE_PREFLIGHT=off`. cross-review stays API-only — evidence _packaging_ is caller-side (see `docs/evidence-preflight.md`). **CRV2-1 + CRV2-6** — `SessionMeta` gains `requested_max_rounds` / `effective_max_rounds` + `requested_max_cost_usd` / `effective_cost_ceiling_usd` / `cost_ceiling_source` traceability (legacy `cost_ceiling_usd` kept in sync for back-compat). **CRV2-3-meta** — CRV2-3 reclassified as not-a-bug (relator-non-voting is the correct tribunal design); `convergence_scope` now carries explicit `lead_peer_role` / `voting_peers` / `quorum_basis` / `anti_self_review_exclusion_reason` so the deliberate exclusion is not misread as a missing-vote bug. **CRV2-5 removed from server scope** — automatic evidence packaging would expand the security surface (shell/repo access); it stays caller-side. 
4 new smoke markers; smoke `ok: true / events: 100`. 100% backward-compatible additive: new union member, new exported helper, new meta/scope fields, new optional input field, new finalize reason, new events, new env var. **Minor bump** (3.4.0 → 3.5.0; Y-component increment per SemVer). |
|
|
55
|
+
| **`v03.04.00`** | **Minor — Perplexity multi-failure-mode close-out 2026-05-13: 3 coordinated fixes covering 7 production sessions Codex flagged (`51973fac`, `f72e597a`, `f9a19401`, `99d46a2b`, `00d92cce`, `59776026`, `0003b2fe`).** **Fix #1 — streaming-path strip parity** (P0, surgical 2-line edit in `src/peers/perplexity.ts:~409/~504`): the v3.2.0 `stripPerplexityThinkingBlock` fix was applied only inside `sonarText(response)` (non-streaming path at `:~426/~521`). Production `server_info.streaming.tokens=true` is the default, so virtually every Perplexity call traversed the streaming branches which used raw `stream_buffer.text()` and bypassed the strip entirely. `<think>...</think>` preambles from `sonar-reasoning-pro` / `sonar-deep-research` reached the status parser, producing `unparseable_after_recovery` despite valid trailing JSON. v3.4.0 wraps `stream_buffer.text()` with `stripPerplexityThinkingBlock(...)` at both streaming sites, restoring parity. Forensic evidence: sess `f9a19401` (v3.3.0 self-investigation) — 4 peers converged READY on the exact diagnosis; Perplexity `ready_rate=0.28125` (9/32) vs `~1.0` for other peers. **Fix #2 — anti-meta-audit lock** (P1, prompt clause + heuristic detector): sess `51973fac` shipped a checklist of `MISSING: diff hunk` placeholders + sections titled `Evidence Gap` / `Validation Claims (NARRATIVE` / `Peer Review Readiness Blockers` instead of refining the artifact. `leadShipModeDirective()` gains an `## Anti-Meta-Audit Lock (HARD)` clause; new exported `detectMetaAuditFabrication(text)` in `src/core/orchestrator.ts` flags placeholder + section anti-patterns with double-bar threshold `(placeholders ≥ 3) OR (sections ≥ 1 AND placeholders ≥ 2)` for false-positive resistance. Reuses the shared `consecutiveLeadDrifts` counter (cap=2); new event `session.lead_meta_audit_fabrication_detected` + finalize reason `lead_meta_audit_repeated`. 
**Fix #3 — reviewer proportionality** (P2, prompt only): sess `0003b2fe` — Perplexity reviewer demanded separate `session_attach_evidence` of the same `rg` scan output the caller had narrated inline, blocking convergence over rounds. `sessionContractDirectives()` gains item 5 scoped tightly to pure config/script/text static-scan reviews; runtime work (build/test/deploy/migration/network) still requires raw output; "when in doubt, prefer asking for evidence" preserves rigor default. 3 new smoke markers (`perplexity_streaming_strip_parity_test`, `meta_audit_fabrication_detection_test`, `proportionality_guidance_test`). 100% backward-compatible additive: new exported helper, new event type, new finalize reason; tool schema unchanged. **Minor bump** (3.3.0 → 3.4.0; Y-component increment per SemVer) — additive public surface is the reason; behavior change for callers passing valid args is pure failure-mode prevention. |
|
|
56
|
+
| **`v03.03.00`** | **Minor — Caller peer-selection lock (operator directive 2026-05-12: "ALL AGENTS/PEERS ALWAYS PARTICIPATE, REGARDLESS OF THE CALLER'S CHOICE OR WISH").** Closes the systematic gaming pattern where peer callers (notably Codex, observed across multiple sessions) selectively excluded other peers from their own cross-review panels via curated `peers: [...]` lists or pinned a sympathetic relator via `lead_peer`. **Lock surface**: `peers` is locked for ALL callers (including operator) — reviewer panel is ALWAYS the full server-configured `peer_enabled` set; operators tune via env vars (`CROSS_REVIEW_PEER_<NAME>=on\|off`), not per-call overrides. `lead_peer` is locked for peer callers (forces lottery so callers cannot pin a sympathetic relator); operator caller may still pin `lead_peer` (legitimate testing/debug). Audit event `session.caller_peer_selection_ignored` emitted to event stream with `site`, `caller`, `peer_panel_overridden`, `ignored_peers`, `lead_peer_overridden`, `ignored_lead_peer` so operator can inspect via `session_events` who tried to game which peer in/out. **Implementation**: new exported `lockCallerPeerSelection<T>(input, ctx): T` helper in `src/mcp/server.ts` — pure function that strips locked fields and emits audit event via supplied `ctx.emit`. Lives at the MCP-handler boundary by design: external callers ALWAYS traverse the lock; internal call sites (orchestrator's own `runUntilUnanimous` → `askPeers` loop, smoke harness) bypass by construction. Wired at all 4 caller-facing handlers (`ask_peers`, `session_start_round`, `run_until_unanimous`, `session_start_unanimous`); `runtime` factory exposes `runtime.emit` so handlers route audit events through the same emitter. v3.2.0's Fix #3 (autowire-judge filter) remains as defense-in-depth (now trivially satisfied since `input.peers` is always undefined post-lock). New smoke marker `caller_peer_selection_lock_test` (5 behavioral scenarios + source-pin asserting all 4 handlers wire the lock).
**Public surface**: 100% backward-compatible at schema/tool-surface level (parameters still accepted; values silently overridden + audit-logged). Behavior change deliberate. **Minor bump** — observable behavior change, not a bug fix. |
|
|
57
|
+
| **`v03.02.00`** | **Patch — Codex bug-report close-out 2026-05-12: three surgical fixes (Perplexity `<think>` parser + session-state invariant + orchestrator strict peers).** **Fix #1** (`src/peers/perplexity.ts`): `sonar-reasoning-pro` / `sonar-deep-research` emit a `<think>...</think>` reasoning preamble before structured JSON; pre-v3.2.0 the parser fed that raw string into the format-recovery pipeline, which failed `unparseable_after_recovery` even when the trailing JSON was valid READY. New `PERPLEXITY_THINKING_BLOCK` regex + exported `stripPerplexityThinkingBlock()` helper; `sonarText()` now strips before returning. Closes the long-standing blocker that forced v3.0.0/v3.1.0 to self-bypass HARD GATE. **Fix #2** (`src/core/session-store.ts`): closes session-state corruption observed in session `41244a1c-e7e8-439a-a59e-9339f7c7175d` (R1-R3 didn't converge, R4 finalized as `converged`, R5+R6 ran on top and clobbered `convergence_health` back to `"blocked"`, leaving meta with `outcome="converged" / health.state="blocked"`). `finalize()` now validates `outcome="converged"` against the latest round's `convergence.converged` (throws `code: "session_finalize_outcome_mismatch"`); `appendRound()` refuses to append to a finalized session (`code: "session_already_finalized"`); new public `assertNotFinalized()` helper wired into `askPeers` + `runUntilUnanimous` entry points so the round fails fast instead of after burning budget. **Fix #3** (`src/core/orchestrator.ts`): when the caller passes an explicit `peers: [...]` list, autowire judges are intersected with the explicit list — both the consensus and single-peer paths. Observed in session `73036fbb` where peers=[codex,gemini,deepseek,grok] but autowire still invoked perplexity as judge. New `hadExplicitPeers` flag + `judgeRespectsExplicitPeers()` helper; skipped sessions emit `session.evidence_judge_pass.autowire_skipped` with `skipped_for_explicit_peers: true` + `session_explicit_peers: [...]` for operator audit. 
3 new smoke markers (`perplexity_thinking_block_strip_test` 7 scenarios + 3 pins; `session_finalize_state_invariant_test` 5 scenarios + 1 pin; `orchestrator_strict_peer_panel_test` 5 source pins). Smoke harness completes `ok: true / events: 99`. **Patch bump** (additive — new exports + new error codes; pre-existing anti-patterns now reject loudly instead of corrupting state). The `cross-review-attachment-inline-test` smoke fixture was updated to `caller_status: "NOT_READY"` so R1 doesn't auto-converge under stub mode. |
|
|
58
|
+
| **`v03.00.00`** | **Major — Perplexity joins the sexteto. Quinteto (5 peers) → sexteto (6).** Operator directive 2026-05-12. New `PerplexityAdapter` at `https://api.perplexity.ai` (Sonar API, OpenAI-Chat-Completions-compatible; reuses shared `loadOpenAICtor` lazy SDK helper). 5 architectural traits handled explicitly: (1) web search is the DEFAULT per call — peer becomes fact-check overlay; (2) system prompt is half-honored (search component does not attend to it); (3) `reasoning_effort` enum is `minimal \| low \| medium \| high` only (clamped via exported `clampEffortForPerplexity()`); (4) **pricing is 3-dimensional** (input + output + per-1000-request fee scaled by `search_context_size`; Sonar Deep Research adds 4th dimension for citation/reasoning/search_queries); (5) API reports `usage.cost` per call in USD (captured as telemetry; config-driven cost layer remains authoritative). **Role-aware search**: `call()` → reviewer keeps search active (peer's differentiator value); `generate()` → relator forces `disable_search:true` (synthesis role, not lookup); `probe()` → search off (already inline). All 6 peers remain symmetric in role assignment — Perplexity can be caller, lead_peer, or reviewer; the HARD GATE caller!=lead_peer!=reviewer applies uniformly. Adds 14 new env vars (`PERPLEXITY_API_KEY` + `CROSS_REVIEW_PERPLEXITY_*` for model/effort/search-context/disable/pricing). Extends `cost_rates[peer]` with 6 optional fields (request_fee × 3 tiers + citation/reasoning/search_queries Deep Research). Extends `CostEstimate` with 4 new line items + `TokenUsage` with 3 new fields. Boot notice for reasoning-effort-not-honored on sonar/sonar-pro models. 2 new smoke markers (`perplexity_integration_test` + `perplexity_reasoning_capability_allowlist_test`). **Default model**: `sonar-reasoning-pro`. **Default search_context_size**: `low` (cheapest tier; cross-review focus is the attached draft, not broad search).
**Default disable_search**: `false` (search ACTIVE; fact-check overlay is Perplexity's differentiator). Tool surface 100% backward-compatible additive (PeerSchema/CallerSchema accept `perplexity` as new value; legacy 5-peer payloads still valid). Default `session_start_unanimous` now dispatches 6 reviewers — set `CROSS_REVIEW_PEER_PERPLEXITY=off` per host to preserve quinteto behavior. **Major bump** — sexteto transition is an epoch shift over the quinteto baseline that held since v2.14.0. |
|
|
59
|
+
| **`v02.28.00`** | **Minor — Cold-start hardening Part 3: Windows registry env-var lookup bulk-cached (3-7 s → ~100 ms).** Empirical profile revealed the real boot bottleneck on Windows: `loadConfig()` consuming 3.1-7.0 s because `readWindowsRegistryEnv(name)` fired `reg query <root> /v NAME` per missing env var × 2 scopes (HKCU + HKLM). With ~140 config vars and partial `process.env`, this burned 3-7 s dwarfing every other boot cost. v2.27.0 + v2.27.1 attacked SDK imports + sweeps (~340 ms) — a side concern. **v2.28.0 fix**: single bulk `reg query <root>` at first miss populates a `Map<string,string>` module cache; `readWindowsRegistryEnv` becomes a pure `cache.get(name)`. Cost: `O(1 + 2 registry reads)` instead of `O(N missing × 2 spawns)`. **Empirical handshake** (3 trials each): v2.27.1 3.18 / 3.12 / 3.14 s → v2.28.0 **0.37 / 0.37 / 0.38 s** = 8.4× speedup. `loadConfig()` alone: 3,307 ms → 87 ms (38×). Cold-start now well below Claude Code's spawn timeout. New smoke `windows_registry_env_bulk_cache_test` (7-class assertion pinning Map cache + bulk loader + canonical `reg query <root>` shape + negative invariant on per-var `/v NAME` + escapeRegExp absence + thin lookup + dist parity). Public surface 100% backward-compatible. Self-review BYPASSED per `feedback_cross_review_self_repair_exception.md` (gate-fixing-itself, third installment). **Minor bump** — internal behavior change with measurable 8.4× runtime impact. |
|
|
60
|
+
| **`v02.27.01`** | **Patch — Cold-start hardening Part 2: lazy-load 5 provider SDKs + defer 6 startup sweeps to setTimeout(30s).** Completes the cold-start fix started in v2.27.0. Empirical motivation 2026-05-12: cross-review failed to register tools in a Claude Code session while the other 5 MCP hosts (Codex CLI / Gemini Code Assist / Antigravity / Grok CLI / DeepSeek CLI / VS Code) loaded normally. Diagnostic measurement of the real JSON-RPC `initialize` handshake showed the server taking ~4.2 s to respond — right on top of Claude Code's per-spawn timeout. Two contributors stacked: eager top-level imports of 5 provider SDK module trees (~3 s of CommonJS/ESM graph) + v2.27.0's 4 boot sweeps + 2 boot notices all running on the same event-loop tick as the initialize message. **Lazy-load**: every adapter source uses `import type` only for provider SDKs; new shared cached loaders `loadAnthropicCtor()` / `loadOpenAICtor()` / `loadGenaiModule()` wrap `import(<sdk>)` in a per-module promise so concurrent first-callers resolve exactly once. Each adapter's `client()` is now async; 25 call sites across the 5 adapters updated; `geminiThinkingConfig(model, ThinkingLevel)` takes the lazy-loaded enum as 2nd arg. `model-selection.ts` consumes the same loaders. **Deferred sweeps**: 6 boot-time `setImmediate` blocks in `server.ts` become `setTimeout(..., STARTUP_SWEEP_DELAY_MS)` with delay = 30_000 ms — initialize responds in <200 ms while sweeps run later when the operator is idle. Empirical handshake measurement post-ship: 3.6-3.9 s (vs 3.7-4.2 s pre-ship); margin is modest because Node.js + MCP SDK + orchestrator still dominate, but the architectural correctness keeps SDK and FS work off the initialize tick entirely. 2 new smoke markers (`lazy_provider_sdk_imports_test`, `startup_sweeps_use_setTimeout_test`) + 2 existing gemini assertions updated. 
**Public surface** 100% backward-compatible (3 new named exports for cross-module loader reuse; `client()` is `private` so async-conversion is internal). **Patch bump**. |
|
|
61
|
+
| **`v02.27.00`** | **Minor — Cold-start hardening Part 1: corrupted meta.json auto-quarantine + finalized-session auto-prune.** Empirically motivated by Claude Code reload friction 2026-05-12: 534 session dirs accumulated under `~/.cross-review/data_v2/sessions/`, including 3 corrupted by the v2.25.1 redact escape-boundary bug (`77c47284`, `be47a5b0`, `7edf63e3`). The startup sweeps iterate via `list()` which read every `meta.json`; a single corrupted file caused the sweep to throw + abort, surfacing parse-error stderr on every reload — Claude Code is more sensitive to startup stderr than other hosts. **`SessionStore.list()`** now silently skips + quarantines corrupted meta.json (renamed to `meta.json.bad` with one `[cross-review] quarantined …` stderr line, idempotent). **`SessionStore.pruneOldSessions(maxAgeDays?)`** removes finalized session dirs (outcome ∈ converged/aborted/max-rounds) whose `updated_at` is older than the cutoff. Default 60 days; `CROSS_REVIEW_PRUNE_AFTER_DAYS=0` disables entirely. In-flight or untyped-outcome sessions are NEVER pruned (preserves audit trail). New boot `setImmediate` block wires the prune; stderr only emitted when `pruned > 0`. **Minor bump** — 2 new methods on `SessionStore`; `list()` swallows-and-quarantines instead of throws (additive defensive). Backward-compatible default; operators see no behavior change unless they have corrupted meta.json or >60-day-old finalized sessions. Self-review BYPASSED per `feedback_cross_review_self_repair_exception.md`. |
|
|
62
|
+
| **`v02.26.01`** | **Patch — `max_attached_evidence_chars` default raised 80_000 → 200_000 to fix multi-file evidence truncation.** Empirically demonstrated by the stepsecurity v0.2.0 ship 2026-05-12 (sess `fd1037e5` and prior `85f94725`): with 5 attached evidence files totaling ~95KB, `session-store.readEvidenceAttachments()` budget allocator at `src/core/session-store.ts:1481-1543` exhausted the 80KB total cap before reaching the 4th+ attachment, surfacing `(truncated to 33273 of 38412 bytes)` to peers, who in 5 consecutive rounds correctly flagged the truncation as a blocker. The `perFileCap = max(2_000, floor(totalCap * 0.6))` mechanic remains correct (60% per-file allowance leaves room for at least 1 other attachment); only the global `totalCap` default needed bumping. New default 200_000 chars accommodates ~5 attachments averaging 30KB each. Operator override unchanged via `CROSS_REVIEW_MAX_ATTACHED_EVIDENCE_CHARS`. **Documented adjacent issues** (no code fix; tracked for v2.27+): (1) lead-drift abort threshold is 2 consecutive (`orchestrator.ts:3662`) — when `max_rounds` is reached with `consecutiveLeadDrifts === 1`, the session ends `max-rounds` instead of `lead_meta_review_drift`; workaround = use `ask_peers` for known-drift-prone task patterns; (2) inaccessible upstream OpenAPI spec — when peers demand verbatim spec excerpts but the spec endpoint requires browser-session cookie auth, the caller must rely on alternative-evidence patterns. **Patch bump** — backward-compatible default change. No public API surface change. Self-review BYPASSED per `feedback_cross_review_self_repair_exception.md`. |
|
|
63
|
+
| **`v02.26.00`** | **Minor — Full pricing-model schema: base + extended-tier + cache (read/write) + promo (limited-time discount), all env-configurable, graceful fallback when fields are absent or promo expires.** Operator directive 2026-05-11 ("Cross-review-v2 must be able to read, from the configurable variables in the config files and in the env vars, every pricing model in effect: with and without cache, with and without promotion, below and above a given token count"). Adds 14 new optional pricing env vars per provider plus 2 metadata env vars per provider (`_THRESHOLD_TOKENS`, `_PROMO_EXPIRES_AT_UTC`) on top of the v2.0.0 required pair — total 18 env-var slots per provider × 5 providers = 90 max. Selection cascade in new exported `selectRate()`: (promo+extended) → promo → extended → base, each step automatically falling through when the corresponding field is unset OR the gating condition does not apply. When promo expires (`Date.now() >= Date.parse(promo_expires_at)`), system uses base without operator intervention; when extended is unset, base applies to all prompt sizes; when cache rates are unset entirely, cache tokens are billed at the input rate (zero savings reported, no penalty). **No-hardcoded-financials directive** — the legacy `src/core/cache-rates.json` runtime fallback was REMOVED; cache pricing comes exclusively from env vars or graceful input-rate fallback. **CostEstimate** type extended with `cache_read_cost?`, `cache_write_cost?`, `tier_used?` ("base"\|"extended"\|"promo"\|"promo_extended"). `estimateCacheSavings()` third positional arg (`configRate`) is now required — internal/MCP callers route through `estimateCost()` and are unaffected. New smoke marker `full_pricing_model_v2260_test` pinning 11 invariants. **Minor bump** — additive public surface; breaking only for direct `estimateCacheSavings()` callers. |
|
|
64
|
+
| **`v02.25.01`** | **Patch — `meta.json` corruption hotfix: `redact()` env-style pattern was crossing JSON-escape boundaries.** The env-style assignment regex in `src/security/redact.ts:26` used `[^\s"',}]{6,}` for the value capture group; backslash was NOT in the exclusion class, so when a peer response contained the JSON-escaped sequence `token: write\"` (the inner-string close-quote of an escaped peer text), the `{6,}` quantifier consumed `write\` (6 chars including the escape backslash). The replacement `[REDACTED]` ate the closing `\` of the escape, leaving a bare `"` that prematurely closed the outer JSON string — producing structurally-broken `meta.json` files that could not be re-parsed at session resume time. Empirical impact: 3 cross-review sessions today (`be47a5b0`, `77c47284`, `7edf63e3`) all aborted at session_init with parser errors at different positions — same root cause: peer responses to a 13-repo scorecard hotfix submission quoted `id-token: write` inside backtick-fenced YAML excerpts. Fix: extend the negative char class with `\\`. Three smoke regression cases added (`escapeBoundary`, `realAssignment`, `yamlExcerpt`). **Patch bump** — additive defensive narrowing of an existing pattern; no public surface change. Cross-review-v2 self-review BYPASSED per operator directive 2026-05-11 (the bug being fixed is in the cross-review gate itself; routing the fix through the broken gate would re-encounter the same corruption). |
|
|
65
|
+
| **`v02.25.00`** | **Third deliberation mode `circular` joins `ship` and `review`.** Imported from `maestro-app`'s editorial protocol after operator review of the maestro design 2026-05-11. Serial deliberative custody: caller submits artifact; non-caller peers rotate as temporary curators; each rotator either approves the current version unchanged or produces a narrowly justified revision; convergence = full rotation completes without substantive change. No parallel peer-voting in this mode — the rotator IS the actor each round. **When to use each mode**: `ship` (default) for approving/rejecting an external artifact (PR review, spec approval, security gate — tribunal primitive, all peers vote READY); `review` for tasks phrased as a review act where the lead emits structured response; `circular` for producing/refining a shared artifact (spec drafting, RFC, protocol evolution, CHANGELOG copy — editorial primitive, panel produces). Modes coexist; mixing within a single session is not supported. Implementation: new `SessionMode = "ship" \| "review" \| "circular"`; new `leadCircularModeDirective()` Layer-1 prompt clause with 5 subsections (approve unchanged, approved-content lock, quality preservation, no-self-review, evidence-provenance-lock shared with ship); new `runCircularLoop()` private orchestrator method called when `sessionMode === "circular"`; new `circular_state: {rotation_order, consecutive_no_change_count, last_revision_round}` persisted in `meta.json`; new `circular_max_rotations` config (default 3, env `CROSS_REVIEW_CIRCULAR_MAX_ROTATIONS`); new event types `session.circular_rotation_assigned` / `_step_unchanged` / `_step_revised` / `_full_rotation_no_change` / `_max_rotations_exceeded` / `_rotation_too_small`; new finalize reasons `circular_full_rotation_no_change` / `circular_max_rotations_exceeded` / `circular_rotation_too_small`. Rotation length minimum is 2 (no-self-immediate-output guard). 
Drift / empty / fabrication detection from v2.23/v2.24 fires identically. New smoke marker `circular_mode_test` pinning 11 invariants. MCP tool schemas (`run_until_unanimous`, `session_start_unanimous`) accept `mode: "circular"`. **Minor bump** — additive public surface; pre-v2.25 callers see no behavior change. |
|
|
66
|
+
| **`v02.24.00`** | **Evidence-provenance lock for the ship-mode relator (Codex bug report 2026-05-10).** Codex empirically observed two adjacent failure modes from his own working session `019dc794`: (a) **session `09c21d7a`** — lead_peer (Grok) fabricating operational evidence ex nihilo in `run_until_unanimous` with `mode: ship` (symmetric-pattern SHAs, 39-char SHAs where git emits 40, test-run counts not in attached evidence, `git diff --check passed` assertions, vite asset hashes); (b) **session `eee886d3`** — different relator (DeepSeek) propagating caller-narrated operational claims (`cargo test: 147 passed`, `npm run typecheck: passed`) as if they were verified evidence, when the caller never attached raw command output via `session_attach_evidence`. Same architectural gap from two angles: **NARRATIVE about operational evidence ≠ PROVENANCE-GRADE operational evidence**. Pre-v2.24.0 the orchestrator promoted such revisions to next-round draft, burning a full round of peer calls before downstream peers (claude + deepseek) blocked convergence. **Layer 1** — Evidence Provenance Lock (HARD) clause added to `leadShipModeDirective()` system prompt, instructing the relator that operational evidence (SHAs/hashes/build outputs/test counts/diff hunks/git assertions) MUST be cited verbatim from the corpus or declared as a blocker. **Layer 2** — new exported `detectFabricatedEvidence(revisionText, provenanceCorpus): FabricationDetectionResult` heuristic detector with hex-token-subset check + canonical operational-assertion patterns. Thresholds: 3+ net-new hex tokens or 2+ suspicious assertions trip fabrication; corpus-quoted tokens are subtracted before scoring (false-positive guard). 
**Layer 3** — orchestrator relator-revision branch wires the detector after `emptyText`/`driftDetected` checks, preserves prior draft on detection, increments `consecutiveLeadDrifts`, emits `session.lead_fabrication_detected` event (data.fabrication_signals carries net_new_hex_count + sample + suspicious_assertion_count + sample), finalizes with `lead_fabrication_repeated` at the consecutive-cap. New smoke marker `relator_evidence_provenance_lock_test` pins behavioral matrix (clean/hex/assertion/provenance-correct) + source-level invariants (prompt sentinel, threshold constants, event type, finalize reason, unified-counter contract). No tool surface change. **Patch bump** — additive event + finalize reason; failure-mode behavior change only. |
|
|
67
|
+
| **`v02.23.00`** | **Anthropic empty-revision degenerate path detection.** Patch closing a $0.21 USD waste path discovered while triaging maestro-app v0.5.20 review session `8187f5a8` (2026-05-10): Claude Opus extended-thinking responses can return a content array with only `thinking`/`redacted_thinking` blocks and no final `text` block. Pre-v2.23.0 the Anthropic adapter silently produced `text: ""` and the orchestrator promoted that empty string to the next-round draft, dispatching peer calls against an empty `Draft Or Solution Under Review:` block. **Layer 1** — new `parseAnthropicContent(content)` returns `{text, parser_warning?}` instead of the lossy `string`; legacy `textFromAnthropicContent` kept as backward-compat shim. **Layer 2** — anthropic.ts call sites surface `parser_warning` via new `extraParserWarnings` parameter on `resultFromText`/`generationFromText`, flowing to `PeerResult.parser_warnings` and (new) `GenerationResult.parser_warnings`. **Layer 3** — orchestrator's relator-revision branch treats `generation.text.trim() === ""` the same as drift: preserve prior draft, increment `consecutiveLeadDrifts`, emit dedicated `session.lead_empty_revision` event, finalize with `lead_empty_revision_repeated` when the cap is hit. New smoke marker `anthropic_empty_text_detection_test` pins all 4 invariants (helper return shape, adapter call-site uniformity, orchestrator sentinel strings, types declaration). No public surface change for callers passing valid arguments. **Patch bump** — failure-mode behavior change only. |
|
|
68
|
+
| **`v02.22.00`** | **`session_doctor` drill-down + per-round cost telemetry + budget warning event.** Three observability/audit improvements identified during a forensic audit of 467 durable sessions. **A.P2:** `session_doctor` hides per-session enumeration of `findings.self_lead_metadata` by default (178/467 = 38% pre-v2.16.0 noise); `totals.self_lead_metadata` count remains visible; pass `include_legacy: true` to enumerate. **B.P2:** entries in `findings.open_evidence_sessions[]` gain `item_types` (open items grouped by surfacing peer) + `chronic_blockers` (item ids with `round_count >= 3`) so operators see which evidence asks are systemic. **B.P3:** new `costs_per_round[]` + `cost_ceiling_usd` in `meta.json` (snapshot at session_init time so retroactive analysis is decoupled from later env-var changes); new one-shot `session.budget_warning` event fires when cumulative cost crosses 75% of the ceiling, providing early visibility before `max_rounds_budget_exceeded`. 3 new smoke markers (`session_doctor_legacy_filter_test`, `evidence_checklist_drilldown_test`, `budget_warning_emit_test`). **Minor bump** — public surface is additive; pre-v2.22 callers see no behavior change. |
|
|
69
|
+
| **`v02.21.00`** | **Cross-provider prompt caching across all 5 peers (OpenAI, Anthropic, Gemini, DeepSeek, Grok).** Single coordinated ship that wires uniform prompt-caching telemetry through the runtime: each adapter parses provider-native cache fields (`prompt_tokens_details.cached_tokens` / `cache_creation_input_tokens` / `cache_read_input_tokens` / `cachedContentTokenCount` / `prompt_cache_hit_tokens` / `prompt_cache_miss_tokens`); orchestrator emits a canonical `provider.cache.usage` event; per-session `cache_manifest.json` is appended for every cached call. **Anthropic** uses EXPLICIT cache_control breakpoints on the system prompt (TTL `5m`/`1h`). **OpenAI** uses pair-scoped `prompt_cache_key` + `prompt_cache_retention` (`in_memory`/`24h`). **Grok** mirrors OpenAI plus `x-grok-conv-id` header for cache-bucket scoping. **DeepSeek** parses auto-cache telemetry (no payload changes). **Gemini** parses implicit-cache telemetry only (explicit `caches.create` deferred). New `src/core/prompt-parts.ts` builds the canonical `stablePrefix` that always begins with `cache_schema_version: vN` and produces a sha256 hex hash invariant across rounds for the same case. New `src/core/cache-manifest.ts` persists per-session cache history with the same atomic-write retry pattern as `meta.json`. New rate cards in `src/core/cache-rates.json` populate `CostEstimate.cache_savings_usd` (or `cache_savings_unknown` when no rate matches). Operator can disable globally with `CROSS_REVIEW_DISABLE_CACHE=true`; TTL via `CROSS_REVIEW_CACHE_TTL_ANTHROPIC` / `CROSS_REVIEW_CACHE_TTL_OPENAI`; schema bump via `CROSS_REVIEW_CACHE_SCHEMA_VERSION`. 5 new smoke markers (`cache_hash_invariance_test`, `cache_schema_version_in_prefix_test`, `cache_rates_json_loaded_test`, `cache_manifest_atomic_write_test`, `cache_disable_kill_switch_test`). New `docs/caching.md` documents per-provider behavior matrix. **Minor bump** — public surface is additive; pre-v2.21 callers see no behavior change. |
|
|
70
|
+
| **`v02.18.08`** | **Site sponsor card iteration.** `site/index.html` GitHub Sponsors iframe (a cross-origin white box) replaced with a dark-navy link card with a pink ❤, cyan meta text, and an animated arrow; card moved to AFTER the buttons (lcv.dev/sponsor primary, GitHub Sponsors alternative). Companion ship Phase 3 (12 repos). |
|
|
71
|
+
| **`v02.18.07`** | **Patch — `site/index.html` visual identity refresh.** GitHub Pages doc/sponsor page reskin to the new LCV org dark-first navy/cyan visual identity (palette `#050b18`/`#38bdf8`/`#34d399`, radial gradients, glow shadows, gradient text on h1). Coordinated companion ship with cross-review-v1 1.12.9, deepseek-cli 0.3.1, grok-cli 1.6.2, sponsor-motor APP v01.02.02, and `.github-org/site` (org root + /sponsor). No change to the published npm tarball (`files[]` does not include `site/`); only the GitHub Pages page changes. **Patch bump** (no public surface change). |
|
|
72
|
+
| **`v02.18.06`** | **Patch — Gemini API function-declaration compatibility for MCP tool inputSchemas.** Gemini Code Assist forwards each MCP tool's `inputSchema` to the Gemini API as a `function_declarations[*].parameters` payload; the Gemini API's OpenAPI 3.0 subset rejects three patterns the SDK was emitting from the existing zod schemas, surfacing as `400 INVALID_ARGUMENT` for every chat turn including cross-review tools. v2.18.6 cleans the offending zod usage. **(1)** `additionalProperties: false` removed from every MCP tool inputSchema (~28 tools) by dropping the `.strict()` chain; runtime accepts the same valid arguments because handlers consume only declared properties via destructuring. **(2)** `caller` field flattened from `z.union([PeerSchema, z.literal("operator")])` (6 occurrences) to a single `CallerSchema = z.enum([...PEERS, "operator"])`, replacing the `anyOf: [enum, const]` shape with a clean single `enum`. **(3)** `reasoning_effort_overrides` refactored from `z.record(PeerSchema, ReasoningEffortSchema).optional()` to an explicit `z.object({codex?, claude?, gemini?, deepseek?, grok?}).optional()`, eliminating the non-OpenAPI `propertyNames` constraint and the spurious `required: [<all 5 peers>]` artifact that contradicted the field's `.optional()` declaration. No behavior change for any caller passing valid arguments — Claude Code, Codex CLI, Gemini Code Assist, Grok CLI and DeepSeek CLI continue invoking the same tools with the same keys. Lint/typecheck/format clean; smoke harness completes with `ok: true / events: 96`. **Patch bump** (public compatibility preserved; the only observable difference is that undeclared extra fields are now silently discarded instead of being rejected with `mcp_arg_validation_failed`). |
|
|
73
|
+
| **`v02.18.05`** | **Patch — anti-drift smoke drivers for v2.18.4 audit closure (operator directive 2026-05-07).** v2.18.4 shipped 6 surgical fixes from the Codex external audit; v2.18.5 hardens those fixes against silent regression with 5 anti-drift smoke checks (`hono_override` / `abort_signal_threading` / `max_items_per_pass_default` / `clamp_effort_for_model` / `consensus_event_per_peer_attribution`). **P1.1**: `package.json` overrides.hono === ">=4.12.16" + ip-address override retained. **P1.3**: ≥2 sites with `signal?: AbortSignal` param + `signal: params.signal` wiring + `signal: input.signal` autowire emission; consensus pass has no leftover `signal: undefined`. **P1.4**: source-level `?? "4"` fallback + behavioral `loadConfig()` returns max_items_per_pass=4 (env unset). **P2.1**: behavioral clampEffortForModel("xhigh", "grok-4.3")="high"; passthrough on multi-agent; clamp wired at exactly 2 responses.create sites. **P2.4**: legacy judge_peer + new judge_peers array + per_peer_verdict map co-emitted at every `this.emit({...})` event payload. `clampEffortForModel` is now exported from src/peers/grok.ts so the harness can verify directly. Companion to cross-review-v1 v1.12.7 (parallel ship, same operator directive). Smoke harness completes with `ok: true` / exit 0; lint/typecheck/format clean; `npm audit --audit-level=moderate` 0 vulnerabilities. **Patch bump** (additive — only new exports + new smoke markers; no runtime behavior change). |
|
|
74
|
+
| **`v02.18.04`** | **Patch — Codex external audit 2026-05-07 outcome: 6 surgical fixes (P1.1, P1.2, P1.3, P1.4, P2.1, P2.4).** Codex submitted a read-only audit of cross-review v2.18.3 with 4 P1 + 7 P2 findings; this ship lands 6 verified-actionable items. **P1.1**: `package.json` adds `"hono": ">=4.12.16"` override clearing 2 npm-audit moderate advisories (GHSA-9vqf-7f2p-gf9v + GHSA-69xw-7hcm-h432) via @modelcontextprotocol/sdk transitive (practical exposure ~zero in stdio runtime, but audit-gate matters for publish + defense-in-depth; same precedent as v2.18.1 ip-address override). **P1.2**: `src/security/redact.ts` adds `xai-` API key pattern at parity with sk-/sk-ant-/AIza/etc; logs/sessions could previously leak xAI keys via persisted provider errors. **P1.3**: `runEvidenceChecklistJudgeConsensusPass` + `runEvidenceChecklistJudgePass` now thread `AbortSignal` through to `judgeEvidenceAsk(context.signal)` — pre-v2.18.4 the consensus path hardcoded `signal: undefined` and single-peer omitted the field, so `session_cancel_job` could not abort judges mid-flight. Autowire call sites pass `input.signal` from round scope. **P1.4**: lowered default `CROSS_REVIEW_EVIDENCE_JUDGE_MAX_ITEMS_PER_PASS` from 8 → 4 — with default consensus_peers=4, worst-case round goes from 4×8=32 paid judge calls down to 4×4=16. Operators wanting prior behavior set env-var explicitly. **P2.1**: `GROK_REASONING_EFFORT_MODELS` allowlist expanded from `{"grok-4.20-multi-agent"}` to include `"grok-4.3"` per current xAI docs (verified via WebFetch 2026-05-07; xAI added `grok-4.3` reasoning_effort support after v2.16.0 froze). New `clampEffortForModel()` narrows internal `xhigh`/`minimal` scale to `high` for grok-4.3 (which only accepts `none | low | medium | high`). v2.16.0 verification 2026-05-05 was authoritative at the time but is now stale; v2.18.4 closes the drift. **P2.4**: consensus events at orchestrator.ts:1008 + :1030 previously emitted only `judge_peer: params.judge_peers[0]`, so the rollup at session-store.ts:911 attributed every consensus decision to the first peer (codex by default). v2.18.4 keeps `judge_peer` for backward compat AND emits `judge_peers: PeerId[]` + `per_peer_verdict` map so per-peer accuracy is computable from the raw event stream. Smoke harness completes with exit 0 + final `{ ok: true, events: 96 }` payload (the harness's binary success signal); `grok_reasoning_capability_allowlist_test` updated from prior `size === 1` to `size === 2`. Lint/typecheck/format clean. **Patch bump** (additive public surface; default-behavior change on `max_items_per_pass` documented). |
|
|
75
|
+
| **`v02.18.03`** | **Patch — Gemini default pin bump `gemini-3.1-pro-preview` → `gemini-2.5-pro` (operator preference 2026-05-07; coordinated with cross-review-v1 v1.12.4).** Source-of-truth defaults flipped: `src/core/config.ts` `models.gemini` default → `gemini-2.5-pro`; `src/peers/model-selection.ts` priority list → `["gemini-2.5-pro", "gemini-3.1-pro-preview"]` (3.1-pro-preview retained as fallback). Rationale: under Google One AI Ultra subscription, `gemini-2.5-pro` carries 1k requests/day quota vs `gemini-3.1-pro-preview`'s 250 requests/day; post-bump empirical sessions (08cbc942, 1d5be5f2, 256ac7c9 — all 2026-05-07) confirm `gemini-2.5-pro` stable across the 5-peer panel without rate_limit blockers. The 7 LCV-workspace MCP host configs already flipped `CROSS_REVIEW_GEMINI_MODEL=gemini-2.5-pro` env-override 2026-05-07; this ship aligns the source-of-truth defaults so a fresh install without env-override picks the same model. Workspace policy (operator directive 2026-05-07): only `gemini-*-pro` variants ≥ 2.5 are permitted — no `*-flash` and no models below 2.5. Smoke fixture `scripts/smoke.ts:225` (currentOfficialModel iterator) flipped to `gemini-2.5-pro`. `docs/api-keys.md` env-var example + `docs/model-selection.md` priority documentation refreshed to match. **Patch bump** (no public surface change beyond default model ID; behavior unchanged for env-override users). |
|
|
76
|
+
| **`v02.18.02`** | **Tier 5 — Windows process-tree introspection (coordinated with cross-review-v1 v1.12.2).** Closes the long-standing forensics gap: pre-v2.18.2 `getParentProcessSnapshot()` returned `parent_exe_basename: null` on Windows because we only had a POSIX `/proc/<ppid>/comm` reader (Windows path deferred at F1 v2.18.0). v2.18.2 closes the gap with a defensive `tasklist /FI "PID eq <ppid>" /FO CSV /NH` reader via `child_process.spawnSync` (`timeout: 500`, `windowsHide: true`); parser uses leading-quote discriminator and the same `1 ≤ length < 128` sanity filter as POSIX. Best-effort try/catch swallows ENOENT, timeout, parse failures. POSIX path unchanged. `scripts/smoke.ts` sub-test (14) extended with shape sanity + Windows-specific populated-basename assertion + source-level anti-drift guards. Forensics-only field — NOT used by F1 token gate or v2.17.0 clientInfo cross-check. **Patch bump** (no public surface change). |
|
|
77
|
+
| **`v02.18.00`** | **F1 caller capability tokens (coordinated with cross-review-v1 v1.11.0).** Cryptographic identity proof that complements the v2.17.0 clientInfo gate. Pre-v2.18.0 the v2.17.0 cross-check between `caller` and `clientInfo.name` only catches _inconsistent_ self-reports — both fields are declared by the caller. F1 introduces a per-host secret (env `CROSS_REVIEW_CALLER_TOKEN`), authoritative on match and rejected on mismatch. New `caller-tokens` module exposes generation, loading, constant-time hex matching, env verification and a best-effort parent-process snapshot for forensics (Option C / Hybrid). New MCP tool `regenerate_caller_tokens` rotates `host-tokens.json`. New env vars `CROSS_REVIEW_CALLER_TOKEN`, `CROSS_REVIEW_TOKENS_FILE`, `CROSS_REVIEW_REQUIRE_TOKEN`. New `caller_tokens` block in `server_info` surfaces the gate state. `verifyCallerIdentity` extended with `verification_method` ("token" | "client_info" | "none") and `identity_metadata`. R2 codex catch hardening: `caller="operator"` from a host carrying a token throws `identity_forgery_blocked` (closes the operator-bypass window). Permissive default — hosts without tokens fall back to v2.17.0 clientInfo gate; operator opts into hard-enforce mode after distributing secrets. Smoke marker `caller_capability_tokens_test` covers 16 cases including the new overlay paths and the R2 hardening. **Minor bump** (additive public surface). |
|
|
78
|
+
| **`v02.17.00`** | **HARD GATE — identity forgery rejection (operator directive 2026-05-05).** Empirical evidence caught in the act: cross-review session `0994cbaf` was created by Codex with `caller=claude` (impersonation that auto-excluded the real Claude from the panel). Pre-v2.17.0, v2 did not even capture `clientInfo` from the MCP initialize handshake; `caller` was trusted unconditionally. v2.17.0 adds `verifyCallerIdentity(declaredCaller, clientInfo)`, which cross-checks the declared caller against `getCallerCandidatesFromClientInfo(clientInfo)`. Applied in all 6 caller-accepting handlers: `session_init`, `ask_peers`, `session_start_round`, `run_until_unanimous`, `session_start_unanimous`, `contest_verdict` (when `new_caller` is provided). Match → OK + `identity_verified=true`. clientInfo unknown → OK + `identity_verified=false` (legitimate override). `caller="operator"` → OK (no agent claim made). Mismatch OR multi-match clientInfo → throws `identity_forgery_blocked`. Smoke `identity_forgery_blocked_test` (6 sub-tests). Coordinated ship with `cross-review-v1 v1.9.0`. **Minor bump** because the public surface adds the `identity_forgery_blocked` error. Cross-review trilateral bypassed by operator directive (security fix to the gate itself, would otherwise route through compromised gate). |
|
|
79
|
+
| **`v02.16.00`** | **Tribunal protocol repair plus operational doctor.** Separates petitioner/caller from relator metadata, applies self-recusal to direct `ask_peers`, adds read-only `session_doctor`, fixes Windows smoke teardown, and refreshes provider model guidance from official docs. |
|
|
80
|
+
| **`v02.15.01`** | **`server_info` consensus visibility hotfix.** Exposes `consensus_peers` and `configured_consensus_peers_raw` for evidence-judge autowire so operators can audit the same configuration the dispatcher is using. |
|
|
81
|
+
| **`v02.15.00`** | **Backlog bundle for operational judge controls.** Added consensus-based judge autowire, per-call reasoning-effort overrides, opt-in real-API smoke, provider 4xx docs hints, and a Grok reasoning-capability allowlist while exposing consensus toggles across the six MCP host configs. |
|
|
82
|
+
| **`v02.14.01`** | **Grok reasoning model hotfix.** Switched the default Grok model to `grok-4.20-multi-agent` after real xAI verification and official docs showed `reasoning.effort` is accepted only on that model family. |
|
|
83
|
+
| **`v02.14.00`** | **Grok joins the tribunal.** Expanded the peer set to five with Grok, added per-peer on/off env vars, precision-report groundwork, active evidence-judge autowire, `contest_verdict`, multi-peer judge consensus, attached-evidence prompt injection, and CodeQL-safe temp-directory handling. |
|
|
84
|
+
| **`v02.13.00`** | **Lead meta-review drift fix.** Added explicit `ship` versus `review` session mode, lead drift detection, drift telemetry, and an abort gate so `run_until_unanimous` does not replace the artifact under review with a structured peer-review verdict. |
|
|
85
|
+
| **`v02.12.00`** | **Shadow judge observability.** Turned on evidence-judge shadow-mode data collection, surfaced autowire config in `server_info`, added dashboard/runtime rollups, and codified the tribunal-colegiado model for caller, relator, peer votes, and contestation. |
|
|
86
|
+
| **`v02.11.00`** | **Relator lottery plus shadow auto-wire.** Added automatic relator selection that excludes the caller and wired the v2.9 judge pass in shadow mode so self-review drift stops at the session structure. |
|
|
87
|
+
| **`v02.09.00`** | **LLM evidence-judge pass.** Added an operator-triggered judge that evaluates open evidence asks against the current draft and promotes only verified satisfied items, leaving inferred/unknown cases open. |
|
|
88
|
+
| **`v02.08.00`** | **Per-peer health and Evidence Broker lifecycle.** Added health rollups, evidence lifecycle tracking, resurfacing inference, dashboard surfaces, and the final architectural audit item on top of v2.7. |
|
|
89
|
+
| **`v02.07.00`** | **Evidence Broker.** Added a persistent per-session evidence checklist that deduplicates `NEEDS_EVIDENCE` caller requests and injects outstanding asks into subsequent revision prompts. |
|
|
90
|
+
| **`v02.06.01`** | **Fallback/recovery budget hard gate.** Replicated hard budget refusal to fallback and moderation-recovery paths so paid recovery calls cannot silently exceed the session cost ceiling. |
|
|
91
|
+
| **`v02.06.00`** | **Token-delta compaction plus v2.5 format hotfix bundle.** Coalesced streaming token delta events to reduce `events.ndjson` noise and bundled the deferred Prettier/format fix from v2.5. |
|
|
92
|
+
| **`v02.05.00`** | **Evidence and budget hardening pass.** Folded in operator-requested evidence/budget improvements plus empirical Codex/Gemini audit findings from historical session analysis. |
|
|
93
|
+
| **`v02.04.01`** | **CI stub fail-fast hotfix.** Fixed import-time server startup so the smoke harness can import MCP schemas while `CROSS_REVIEW_STUB=1` is set in CI with explicit confirmation. |
|
|
94
|
+
| **`v02.04.00`** | **Audit-closure hardening pass.** Closed internal v2.3.3 technical-opinion priorities with additive public-surface hardening and several explicitly documented behavior changes. |
|
|
95
|
+
| **`v02.03.03`** | **Prompt shielding and financial safety.** Wrapped `review_focus` in escaped delimiters, blocked paid calls until financial controls are configured, expanded `server_info` financial diagnostics, and hardened MCP IDs, sweeps, jobs, and recovery cost alerts. |
|
|
96
|
+
| **`v02.03.02`** | **CI-green README/docs cleanup.** Reissued README organizational standardization under the repository Prettier policy and completed active-document rename cleanup in `NOTICE` and `CODE_OF_CONDUCT.md`. |
|
|
97
|
+
| **`v02.03.01`** | **README organizational standardization.** Adopted the shared LCV README opening while preserving the API-first runtime, model-selection, streaming, and observability sections. |
|
|
98
|
+
| **`v02.03.00`** | **Provider-neutral `review_focus`.** Added focus support across session tools, persisted focus metadata, injected bounded focus blocks into generation/review/retry prompts, and aligned auto-tag/publish automation with the stable package line. |
|
|
99
|
+
| **`v02.02.00`** | **Provider token streaming.** Added real token streaming for OpenAI, Anthropic, Gemini, and DeepSeek, with count-based progress events, runtime controls, and text-redaction defaults for persisted event logs. |
|
|
100
|
+
| **`v02.01.01`** | **CodeQL and model-selection hardening.** Fixed secret-redaction ReDoS and dashboard log-injection alerts, added decision retry for empty peer output, max-output-token controls, stronger model selection, and improved thinking controls. |
|
|
101
|
+
| **`v02.01.00`** | **First stable `cross-review` release.** Promoted the API-first implementation to stable with cancellation, restart recovery, metrics, runtime capabilities, prompt compaction, budget preflight, model fallback, and stable naming. |
|
|
102
|
+
| **`v02.00.04`** | **Session event race hotfix.** Removed the CodeQL file-system race in `events.ndjson` persistence by appending under the session lock. |
|
|
103
|
+
| **`v02.00.03`** | **Background sessions and durable reports.** Added background MCP tools, durable events and reports, peer decision-quality tracking, generation accounting, provider cost rates, budget guard, moderation-safe retry, and dashboard event/report APIs. |
|
|
104
|
+
| **`v02.00.02`** | **Publishing and dashboard sanitization.** Normalized npm dist-tags, replaced the sponsor landing with the SumUp support page, sanitized dashboard 500 responses, and bumped the alpha runtime. |
|
|
105
|
+
| **`v02.00.01`** | **Public npm/package metadata alignment.** Enforced public npm visibility, added registry visibility checks, aligned funding metadata, normalized `repository.url`, and bumped the alpha runtime. |
|
|
106
|
+
| **`v02.00.00`** | **Development package line hardening.** Added parser format recovery, convergence metadata, shared MCP timeout/runtime smoke, auto-tag/release publishing, padded public tags, prepack clean builds, ignore-rule hardening, and quorum preservation. |
|
|
107
|
+
| **`v2.0.0-alpha.2`** | **Durable session recovery alpha.** Added in-flight metadata, convergence health, evidence attachment, operator escalation, session sweep, convergence inspection, silent-model-downgrade failures, and smoke coverage for the new surfaces. |
|
|
108
|
+
| **`v2.0.0-alpha.1`** | **Model attestation and store hardening alpha.** Added reported-model tracking, failed-attempt aggregation, recovery hints, atomic/locked session writes, UUID path hardening, safer probes, self-review prevention, English peer prompts, and expanded redaction. |
|
|
109
|
+
| **`v2.0.0-alpha.0`** | **Initial API/SDK-only MCP server.** Introduced official SDK adapters for OpenAI, Anthropic, Gemini, and DeepSeek, runtime model discovery, best-model selection, and a durable local session store. |
|
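
The `v02.21.00` entry describes a `stablePrefix` that always begins with `cache_schema_version: vN` and hashes to the same sha256 hex value across rounds of one case. The sketch below illustrates that property only; the `StablePrefixInput` shape and the helpers `buildStablePrefix` / `hashStablePrefix` are hypothetical names, not the contents of `src/core/prompt-parts.ts`.

```typescript
import { createHash } from "node:crypto";

// Hypothetical input shape: the real module may carry different fields.
interface StablePrefixInput {
  schemaVersion: number;
  systemPrompt: string;
  caseText: string;
}

// The schema version is always the first line, so bumping it changes
// every hash and invalidates previously cached prefixes.
function buildStablePrefix(input: StablePrefixInput): string {
  return [
    `cache_schema_version: v${input.schemaVersion}`,
    input.systemPrompt,
    input.caseText,
  ].join("\n");
}

// sha256 over the UTF-8 bytes of the prefix, rendered as lowercase hex.
function hashStablePrefix(prefix: string): string {
  return createHash("sha256").update(prefix, "utf8").digest("hex");
}

const base: StablePrefixInput = {
  schemaVersion: 1,
  systemPrompt: "You are a reviewer.",
  caseText: "Review this diff.",
};

// Round-specific content (drafts, peer replies) stays out of the prefix,
// so two rounds of the same case produce the same hash.
const round1 = hashStablePrefix(buildStablePrefix(base));
const round2 = hashStablePrefix(buildStablePrefix({ ...base }));
const bumped = hashStablePrefix(buildStablePrefix({ ...base, schemaVersion: 2 }));

console.log(round1 === round2); // true
console.log(round1 === bumped); // false
```

Keeping anything that varies per round out of the hashed string is what makes the hash usable as a cache key in this sketch.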
|
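
The `v02.18.00` entry mentions constant-time hex matching for caller tokens. A minimal sketch of that technique with Node's `crypto.timingSafeEqual` follows; the 32-byte token length and the names `generateToken` / `tokensMatch` are assumptions for illustration, not the actual `caller-tokens` module API.

```typescript
import { randomBytes, timingSafeEqual } from "node:crypto";

// Assumption: tokens are 32 random bytes rendered as 64 hex characters.
const HEX_TOKEN = /^[0-9a-f]{64}$/i;

function generateToken(): string {
  return randomBytes(32).toString("hex");
}

// Returns true only when both values are well-formed hex tokens with
// identical bytes. timingSafeEqual throws on unequal lengths, so the
// format and length checks run first.
function tokensMatch(presented: string | undefined, expected: string): boolean {
  if (presented === undefined) return false;
  if (!HEX_TOKEN.test(presented) || !HEX_TOKEN.test(expected)) return false;
  const a = Buffer.from(presented, "hex");
  const b = Buffer.from(expected, "hex");
  return a.length === b.length && timingSafeEqual(a, b);
}

const hostToken = generateToken();
console.log(tokensMatch(hostToken, hostToken)); // true
console.log(tokensMatch(undefined, hostToken)); // false
console.log(tokensMatch("not-hex", hostToken)); // false
```

Comparing decoded bytes with `timingSafeEqual` rather than `===` avoids leaking, through response timing, how many leading characters of a guessed token were correct.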
109
110
|
|
|
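
The decision rules listed in the `v02.17.00` entry can be sketched as a small function. This is an illustration under stated assumptions: the real `verifyCallerIdentity` takes `clientInfo` rather than a pre-computed candidate list, and both the order of the checks and the `identity_verified` value returned for `caller="operator"` are guesses that the entry does not specify.

```typescript
interface IdentityResult {
  identity_verified: boolean;
}

// `candidates` stands in for the output of
// getCallerCandidatesFromClientInfo(clientInfo); an empty list means
// the clientInfo was not recognized.
function verifyCallerIdentitySketch(
  declaredCaller: string,
  candidates: string[],
): IdentityResult {
  // "operator" makes no agent claim, so there is nothing to forge.
  // Assumption: reported as unverified.
  if (declaredCaller === "operator") {
    return { identity_verified: false };
  }
  // Unknown clientInfo: accepted as a legitimate override, unverified.
  if (candidates.length === 0) {
    return { identity_verified: false };
  }
  // Ambiguous clientInfo or a declared caller that disagrees with it.
  if (candidates.length > 1 || candidates[0] !== declaredCaller) {
    throw new Error("identity_forgery_blocked");
  }
  return { identity_verified: true };
}

console.log(verifyCallerIdentitySketch("claude", ["claude"])); // { identity_verified: true }
console.log(verifyCallerIdentitySketch("claude", []));         // { identity_verified: false }
```

A call such as `verifyCallerIdentitySketch("claude", ["codex"])` throws, which corresponds to the impersonation case that motivated the gate.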
110
111
|
## What It Does
|
|
111
112
|
|