@@ -1,16 +1,19 @@
 <p align="center">
-<img src=".github/assets/lcv-ideas-software-logo.svg" alt="LCV Ideas & Software" width="520" />
+<img src=".github/assets/lcv-ideas-software-logo.svg" alt="LCV Ideas & Software" width="520" />
 </p>

 # cross-review

 > MCP server orchestrating API-first cross-review between Claude, ChatGPT Codex,
-> Gemini, DeepSeek, and Grok with unanimous convergence gates.
+> Gemini, DeepSeek, Grok, and Perplexity with unanimous convergence gates.

 [](#status)
+[](https://github.com/LCV-Ideas-Software/cross-review/releases)
 [](https://www.npmjs.com/package/@lcv-ideas-software/cross-review)
+[](https://github.com/LCV-Ideas-Software/cross-review/actions/workflows/ci.yml)
+[](https://github.com/LCV-Ideas-Software/cross-review/actions/workflows/codeql.yml)
+[](https://github.com/LCV-Ideas-Software/cross-review/actions/workflows/publish.yml)
 [](#what-it-does)
-[](#security)
 [](./LICENSE)

 **Install.**
@@ -21,8 +24,7 @@ npm install -g @lcv-ideas-software/cross-review
 npm install -g @lcv-ideas-software/cross-review --registry=https://npm.pkg.github.com
 ```

-**Status.** Stable. Current release: **v04.02.00** (npm package `4.2.0`). See
-[CHANGELOG.md](./CHANGELOG.md) for the release history.
+**Status.** Stable. Current release: **v04.02.02** (npm package `4.2.2`). See [CHANGELOG.md](./CHANGELOG.md) for the full release history.

 > **Project renamed 2026-05-15.** This project was previously published as
 > [`@lcv-ideas-software/cross-review-v2`](https://www.npmjs.com/package/@lcv-ideas-software/cross-review-v2)
@@ -34,87 +36,92 @@ npm install -g @lcv-ideas-software/cross-review --registry=https://npm.pkg.githu

 The version history at a glance:

-| Release | Scope |
-| --- | --- |
-| **`v04.02.00`** | **Minor — bounded MCP session listing and cancellation semantics cleanup.** `session_list` is now paginated and summary-only by default (`limit=25`, `max=100`) with `offset`, `outcome_filter`, and `detail` controls so large local histories no longer create oversized stdio responses. `session_cancel_job` now returns `requested=false` / `no_running_job_matched` without aborting the whole session when no running job exists, and `session_init` now honors `response_format="markdown"`. Smoke and runtime-smoke pin all three behaviors. |
-| **`v04.01.01`** | **Patch — release the hard-gate cleanup as a published package.** Formalizes the linter/formatter hard-gate cleanup with a package-version bump so every patch shipped to `main` remains publishable. Removes the dead global ESLint `@typescript-eslint/no-explicit-any` waiver, restores README coverage under Prettier, adds smoke coverage against future linter/formatter masking, makes `runtime-smoke` polling terminal-outcome aware with a 60-second deadline, fixes two CodeQL `js/file-system-race` patterns with atomic/file-descriptor based file operations, and records the scoped StepSecurity cleanup for generated `dist/**` artifacts in the publish workflow. |
-| **`v04.01.00`** | **Minor — security hardening of session-store concurrency, write-path DoS surface, and credential redaction.** Closes three high-impact findings from an in-depth security audit of v4.0.8: (F1) `withSessionLock` switched from `fs.openSync(.., "wx")` + separate write to `proper-lockfile`'s `fs.mkdir`-based atomic locking, eliminating the multi-process TOCTOU race window where two host processes sharing the same `data_dir` could both enter the critical section and corrupt `meta.json`. (F2) `redactPrivateKeyBlocks` now redacts unterminated `-----BEGIN PRIVATE KEY-----` blocks to end-of-string instead of returning the original input unredacted — pre-v4.1.0 leaked partial keys to events.ndjson when logs were truncated mid-key. (F3) `writeJson`'s `renameSync` retry no longer busy-waits with `while (Date.now() - start < wait)` (which blocked the event loop for up to 310 ms under Windows AV stress); it now awaits a Promise-based timer so the event loop remains responsive during backoff. The cascading internal refactor (~22 SessionStore methods became async, ~80 internal call sites added `await`) preserves the public MCP tool surface unchanged. New runtime dep: `proper-lockfile` ^4.1.2. |
-| **`v04.00.08`** | **Patch — eliminate the recurring `js/file-access-to-http` CodeQL false positive at the source.** `scripts/verify-registry-dist.mjs` no longer reads `package.json` from disk; package name and version come from `PACKAGE_NAME` / `PACKAGE_VERSION` env vars (with `npm_package_name` / `npm_package_version` auto-injected by npm as a transparent fallback when invoked via `npm run release:verify-registry`). Both inputs are required; missing values throw a clear error before any network call. Removing the `fs.readFileSync` → outbound-fetch flow stops future CodeQL analyses from re-filing the same alert on every release. |
-| **`v04.00.07`** | **Patch — bounded npm registry fetch in the post-publish verifier.** `scripts/verify-registry-dist.mjs` now passes `signal: AbortSignal.timeout(30_000)` to the `https://registry.npmjs.org/<package>/<version>` `fetch` call so a slow or unreachable registry surfaces as a deterministic abort instead of hanging the publish workflow until its 60-minute ceiling. Timeouts map to an explicit `"npm registry lookup for <spec> timed out after 30000 ms"` error; the validated fields (`dist.shasum`, `dist.integrity`, `dist.tarball`) and the script CLI/env contract are unchanged. |
-| **`v04.00.06`** | **Patch — Windows-safe registry verifier.** `scripts/verify-registry-dist.mjs` now queries `https://registry.npmjs.org` directly instead of spawning `npm.cmd`, closing the Windows Node hardening failure (`spawnSync npm.cmd EINVAL`) while preserving the post-publish validation of registry `dist.shasum`, `dist.integrity`, and `dist.tarball`. |
-| **`v04.00.05`** | **Patch — hard-gate close-out for the Codex v4.0.4 audit.** Clears the 6 residual findings: StepSecurity `Source-Code-Overwritten` detections for generated `dist/*` publish artifacts are suppressed against the existing narrow post-rename rule; `docs/model-selection.md` now uses the post-v4 product name, removes misleading fallback wording, and links to the real historical v2 capability-smoke report; model-selection failure text now says it keeps the configured model pin instead of the old fallback phrase; Copilot/Gemini agent instructions preserve the `cross-review-v2` → `cross-review` rename history; local tag verification is expected to use fetched remote tags; the publish workflow now records npm registry `dist.shasum` / `dist.integrity` / `dist.tarball` metadata so audits do not confuse local `npm --registry=https://registry.npmjs.org pack --dry-run` output with the published artifact identity; and `grok-4-latest` model-match accepts provider-reported dot-release aliases such as `grok-4.3` without weakening true cross-family downgrade rejection. |
-| **`v04.00.04`** | **Patch — restore prettier coverage of `src/` and `scripts/` (close audit on v4.0.3 hard-gate gap).** v4.0.3 added biome but also moved `src/**/*.ts`, `src/**/*.js`, `scripts/**/*.ts`, `scripts/**/*.js` into `.prettierignore` to dodge a biome↔prettier disagreement on dynamic-import call-style. Net effect: prettier ran against zero JS/TS under `src/`/`scripts/`, silently turning one of the four hard-gate checks into a no-op there. v4.0.4 restores full coverage and resolves the disagreement at the source — the 7 `scripts/smoke.ts` dynamic-import sites that triggered the wrap conflict were rewritten from destructure-from-call form to a 2-statement form (`const mod = await import("..."); const { A, B, C } = mod;`). Functionally identical; static type inference preserved. Both formatters now check the full JS/TS surface and pass simultaneously. |
-| **`v04.00.00`** | **Major — project renamed to `cross-review`** (drops the `-v2` suffix after the companion `cross-review-v1` project was discontinued and archived 2026-05-15). Breaking: npm package `@lcv-ideas-software/cross-review-v2` → `@lcv-ideas-software/cross-review` (old name stays on npm at `3.7.5` for historical installs); binaries `cross-review-v2` / `cross-review-v2-dashboard` → `cross-review` / `cross-review-dashboard`; env-var prefix `CROSS_REVIEW_V2_*` → `CROSS_REVIEW_*` across all config knobs that previously carried the `V2` infix (e.g. `CROSS_REVIEW_DATA_DIR`, `CROSS_REVIEW_DISABLE_CACHE_ANTHROPIC`); API-key env vars unchanged; per-host identity env vars (`CROSS_REVIEW_CALLER_TOKEN`, `CROSS_REVIEW_REQUIRE_TOKEN`) unchanged. GitHub repo URL: `LCV-Ideas-Software/cross-review-v2` → `LCV-Ideas-Software/cross-review` (auto-redirected). GitHub Pages: `cross-review-v2.lcv.dev` → `cross-review.lcv.dev`. MCP server key in host configs: operators who declared `cross-review-v2` rename to `cross-review`; after reload, MCP tool prefix becomes `mcp__cross-review__*`. Data dir migration is manual: operators copy `${HOME}/.cross-review/data_v2/*` into the new default `${HOME}/.cross-review/data/` (or set `CROSS_REVIEW_DATA_DIR` to the legacy path) — the v4.0.0 runtime reads only `CROSS_REVIEW_DATA_DIR` and does not fall back to the `_v2` suffix automatically. Preserved when copied: persisted session data, `config.json`, `host-tokens.json`, `cache_manifest.json`, archived/corrupt session dirs. Wire shape of all MCP tools, event types, convergence semantics is unchanged; all capabilities, peers, models, security defenses carry over from v3.7.5 verbatim. 504 source/script/doc text substitutions across 26 files. |
-| **`v03.07.05`** | **Patch — logs+sessions study 2026-05-15 close-out (4 surgical fixes from 244-session/429-round corpus).** **A1** — `session_doctor` classified cancelled sessions as `stale` (22 of 244 false positives); doctor now treats any terminal outcome (`aborted`/`converged`/`max-rounds`) as NOT-stale regardless of the persisted `convergence_health.state`. Source-layer state untouched (backward-compat with existing sessions). **A2** — `lockCallerPeerSelection` emitted false-positive `session.caller_peer_selection_ignored` events when callers passed a panel identical to the enabled set (13 of 106 recent events); the lock now accepts an optional `enabledPeers` snapshot in its context and short-circuits the emit when the caller-supplied list set-equals the enabled set (sorted comparison). **A3** — per-provider cache disable env vars (`CROSS_REVIEW_DISABLE_CACHE_ANTHROPIC \| OPENAI \| GEMINI \| DEEPSEEK \| GROK \| PERPLEXITY`; provider names match v2.21.0 `_CACHE_TTL_*` convention; same parsing as `peer_enabled`); Anthropic default flipped to disabled based on empirical 0.3% hit-rate ($1.18 wasted to save $0.0035 over 244 sessions). Global `CROSS_REVIEW_DISABLE_CACHE` kill-switch unchanged; per-provider is an additive layer. Anthropic adapter `buildSystemBlock` + short-prefix warning gated on the per-provider flag; central `config.json` `cache` block accepts the new disable keys. **B1** — `session_sweep` gains opt-in `prune_corrupt: boolean.default(false)` + `corrupt_min_age_days: number.int.default(30)` to clean `<data_dir>/corrupt_sessions/` (no prior automated cleanup; 1 stale entry from 2026-05-08 v2.25.1 redact bug still on disk at study time). New `store.pruneCorruptSessions(minAgeMs)` returns `{scanned, removed, kept}`. Response shape stays `SessionMeta[]` when `prune_corrupt: false` (default); wraps to `{ swept, pruned_corrupt }` when true. **Patch bump** (3.7.4 → 3.7.5). |
-| **`v03.07.04`** | **Patch — Codex v3.7.3 opinion close-out + two cross-review-gate root-cause fixes** (APROVADO-COM-RESSALVAS; 2 opinion findings + 2 operator-directed fixes; no public-surface or tool-schema change). **`model_match` `-latest`-alias false positive (operator-directed)** — `BasePeerAdapter.modelMatches()` matched the reported model with `reported === requested` or `reported.startsWith(`${requested}-`)`. That works for a base id resolving to a dated id (`gpt-5.5` → `gpt-5.5-2026-04-23`) but FAILS for a `-latest` alias: xAI returns `grok-4-0709` for the pinned `grok-4-latest`, which does not start with the literal `grok-4-latest-`. Every grok response was flagged `model_match: false` → `status` forced `null` → `silent_model_downgrade` rejection → format-recovery skipped, so grok was dead-on-arrival in every cross-review session and no panel including grok could reach unanimity. Fix: `modelMatches` strips a `-latest` suffix to the family stem and matches the reported id against it (`grok-4-latest` → `grok-4` → `grok-4-0709` matches); a genuine cross-family downgrade (`grok-3-*`) is still flagged. New smoke marker `model_match_latest_alias_test`. **`detectFabricatedEvidence` false positive (operator-directed)** — the detector validated operational assertions (`npm run build`, `index <hash>..<hash>`, `cargo test`, …) against the `provenanceCorpus` (attached evidence) ONLY; the prior draft was lumped into `narrativeCorpus` and never consulted for assertions. The documented process REQUIRES embedding the verbatim diff + raw gate output in `initial_draft`, so when R1 didn't converge and a relator generated an R2 revision, the relator faithfully PRESERVING that embedded evidence was flagged as "fabricating" it → `lead_fabrication_repeated` abort (misread as "perplexity keeps fabricating"; in fact it hit any relator and was a detector self-contradiction). Fix: a **three-tier corpus** — `FabricationDetectionCorpus` gains a `priorDraftCorpus` field; operational assertions are flagged only when **net-new** vs `{provenanceCorpus ∪ priorDraftCorpus}` (symmetric with the hex-token check). Preserved evidence is not fabrication; the task `narrativeCorpus` stays excluded so the v2.24.0 eee886d3 protection holds exactly. Signature unchanged; interface gains one field. **AUDIT-1 (MEDIUM)** — `scripts/runtime-smoke.ts` injected cost rate cards for only 4 peers (codex/claude/gemini/deepseek), but the public MCP path strips a caller's `peers` list (the v3.3.0 `lockCallerPeerSelection` lock), so every round runs the full 6-peer panel; grok + perplexity had no rate cards → `missingFinancialControlVars` tripped → the round finalized `outcome=max-rounds`/`financial_controls_missing` while runtime-smoke still printed `ok: true` with no assert. Fix: inject grok + perplexity rate cards (+ `CROSS_REVIEW_PERPLEXITY_DISABLE_SEARCH` and per-size request-fee defaults), and add explicit `assert` calls on every async flow's durable terminal `outcome` (review round + unanimity flow → `converged`, cancellation flow → `aborted`) placed before the `ok: true` print so a non-converging round fails the smoke loudly. **AUDIT-2 (LOW)** — `src/core/convergence.ts` comment imprecision: the skip was framed only as "the user declared no fallback models", but `fallback_exhausted` is in the skippable set and arises AFTER a declared fallback chain is drained; both comment blocks now split the skip into its two paths (no fallback declared → retry-same exhausted → skip; fallback declared, tried, and drained → also skip). Comment-only, zero logic change. New smoke marker `runtime_smoke_outcome_assert_test` + 2 new `relator_evidence_provenance_lock_test` cases source-pin the fixes. **Patch bump** (3.7.3 → 3.7.4). |
-| **`v03.07.03`** | **Patch — "no fallback means no fallback" directive + Codex v3.7.2 opinion residuals.** **Skip-peer on model-unavailability** — when a reviewer peer's pinned model is genuinely unavailable (infra failure — `auth`/`rate_limit`/`provider_error`/`network`/`timeout`/`fallback_exhausted`, retries exhausted, no user-declared fallback), the round now SKIPS that peer and converges on the remaining peers instead of the failure blocking convergence (the operator's "skip that peer and work only with the others"). A peer that responded but badly, or a policy/budget/content stop, still blocks. **Skip-gated quorum floor (`SKIP_QUORUM_FLOOR = 2`)** prevents a degenerate 0/1-peer "unanimous" review; guarded by `skipped.length > 0` so on a zero-skip round the convergence decision is identical to pre-v3.7.3 (the only output delta is the additive `skipped_peers` field). New `skipped_peers` on `ConvergenceResult`/`ConvergenceScope` + `session.peer_skipped_unavailable` event. **No model-downgrade fallback** — fallback is 100% user-declared via the central config `fallback_models` (default empty = no fallback → retry-same-model then skip); `model_fallback` capability flag now derived honestly. **Codex v3.7.2 residuals**: grok reasoning-effort shadow set + boot warning (added `grok-4.3`), "7 MCP configs" → "host MCP configs". 100% backward-compatible; no tool-schema change. **Patch bump** (3.7.2 → 3.7.3). |
-| **`v03.07.02`** | **Patch — Codex 3rd super-audit close-out of v3.7.1** (3 findings, all verified against primary-source code; Codex verdict REPROVADO without v3.7.2). **AUDIT-1 (BLOCKER)** — v3.7.1's `runUntilUnanimous` fix led the `??` chain with `input.caller`, but the `run_until_unanimous` MCP schema declares `caller: CallerSchema.default("operator")` — so on the public path `input.caller` is never `undefined`, the `existingSession` fallback was dead code, and the real persisted peer-petitioner could still be reclassified / placed in the voting panel / lottery-picked as relator of its own session (Codex reproduced it). Fix: the persisted session wins — `callerForLottery = existingSession?.convergence_scope?.petitioner ?? existingSession?.caller ?? input.caller ?? "operator"`. (`askPeers` does not share the bug — it keys off `input.petitioner`, which has no MCP schema field.) **AUDIT-2** — the continuation smoke marker gains post-schema cases (explicit `caller:"operator"` + mismatching `caller:"claude"`) simulating the schema-materialized value the public path produces; source pin tightened to the v3.7.2 chain ordering. **AUDIT-3 + operator directive** — NO model fallback: every peer `PRIORITY` is now a single canonical pin (`gpt-5.5`, `claude-opus-4-7`, `gemini-2.5-pro`, `deepseek-v4-pro`, `grok-4-latest`, `sonar-reasoning-pro`); v3.7.1 trimmed only gemini/deepseek, this completes all 6. The explicit per-host env/config override is the only escape hatch. 100% backward-compatible; no tool-schema change. **Patch bump** (3.7.1 → 3.7.2). |
-| **`v03.07.01`** | **Patch — Codex super-audit close-out of v3.7.0** (4 findings AUDIT-1..AUDIT-4, all verified against primary-source code before fixing; Codex verdict REPROVADO without v3.7.1). **AUDIT-1 (BLOCKER)** — `runUntilUnanimous` derived the petitioner from `input.caller ?? "operator"` _before_ reading the persisted session; v3.7.0 fixed this in `askPeers` but left the sibling automatic entry point — a caller-omitted continuation could place the real persisted peer-petitioner into the voting panel or select it as the relator of its own session (anti-self-review HARD GATE violation, Codex reproduced it). Fix: read the session once up front via `existingSession`, derive `callerForLottery` from it before any recusal/lottery; `existingSession` reused (single read, no double-read). **AUDIT-2** — new smoke marker `audit1_run_until_unanimous_continuation_test` (v3.7.0's coverage only exercised `askPeers`). **AUDIT-3** — trimmed `deepseek`/`gemini` `PRIORITY` to their lone canonical pin so `selectFromCandidates` can no longer silently auto-select `deepseek-v4-flash` (forbidden "flash" tier) or `gemini-3.1-pro-preview` (manual-override-only per the workspace Model Selection Standards directive); `codex`/`claude`/`grok` same-provider degradation chains left intact. **AUDIT-4** — refreshed two stale internal comments. 100% backward-compatible; no tool-schema change. **Patch bump** (3.7.0 → 3.7.1). |
-| **`v03.07.00`** | **Minor — Codex super-audit close-out 2026-05-14** (bit-by-bit review of v3.6.0; 6 findings, all verified real against primary-source code). **AUDIT-1 (BLOCKER)** — `askPeers` computed auto-recusal from the current call's `caller` _before_ reading the persisted session; a continuation that omitted `caller` defaulted it to `"operator"`, skipped recusal, and let the real persisted peer-petitioner back into the voting panel (anti-self-review HARD GATE violation, Codex reproduced it). Fix: read the session first, derive `effectivePetitioner`, recuse from that. **AUDIT-2 (HIGH)** — operator default relator hardcoded `"codex"` ignoring `peer_enabled`; now prefers codex when enabled else the first enabled session peer. **AUDIT-3 (MEDIUM)** — `peers` + `judge_peers` MCP schemas capped at `.max(5)` against a 6-element `PEERS` roster (stale since v3.0.0 Perplexity); `.max(PEERS.length)` at all 5 sites. **AUDIT-4 (LOW)** — `server_info.financial_controls` now computes readiness over the enabled peer subset. **AUDIT-5 (NIT)** — corrected stale internal comments (`addressed`→`not_resurfaced`, `max_rounds` 32→1000, "5 peer probes"→6). **AUDIT-6** — clarifying comment on the "API-only" claim (no caller-supplied shell/repo execution; the internal `reg`/`tasklist` calls are constant-arg/PID-derived). 2 new smoke markers; smoke `ok: true / events: 99`. 100% backward-compatible additive (AUDIT-3 widens schema acceptance; AUDIT-1/2 are bug fixes). **Minor bump** (3.6.0 → 3.7.0; Y-component per SemVer). |
-| **`v03.06.00`** | **Minor — observability + caller-discipline close-out 2026-05-14**, from a study of the cross-review logs + 169 past sessions (324 rounds, $45.92, 42541 events). **B2** — token-delta default threshold raised 1024 → 16384 (`session_doctor` showed `peer.token.delta` was 79.5% of all persisted events); operators with a `config.json` `token_streaming.chars_threshold` override should bump it too. **C** — `session_doctor` gains an opt-in `repair: boolean` param (default false → still read-only) that recomputes `convergence_health` for sessions stuck in the contradictory `outcome="converged"`+`health="blocked"` state (pre-v3.2.0 corruption artifact; v3.2.0 fixed the cause, old metas persist); `readOnlyHint` flips to false since `repair=true` mutates; new `repaired` array on the report; idempotent. **B3 + B4** — new top-level `notices: string[]` on all 4 caller-facing tool responses (+ `session_poll`): a `relator_non_voting:` notice naming the relator + voting peers (callers kept misreading the relator's deliberate exclusion as a dropped peer even after v3.5.0's nested metadata), and a `peer_selection_lock:` notice when a caller's `peers`/`lead_peer` was stripped (the v3.3.0 lock fired 30× silently across the corpus). New exported `buildResponseNotices()`. **B1** — `session_poll` gains a derived `needs_attention: boolean` (non-terminal + stale/blocked health + no running job) — the study found 28 non-terminal sessions abandoned until the 24h sweep; this surfaces the risk sooner. 3 new smoke markers; smoke `ok: true / events: 99`. 100% backward-compatible additive: new optional input, new response fields, new exported helper, new report field, config-default tuning. **Minor bump** (3.5.0 → 3.6.0; Y-component per SemVer). |
|
|
55
|
-
| **`v03.05.00`** | **Minor — Codex operational-report close-out 2026-05-14: 5 findings from sessions `f0db3970` + `df052926`.** **CRV2-2 (substantive fix)** — the evidence checklist no longer marks asks `addressed` purely because a peer did not resurface them; "peer did not re-ask" is not proof of satisfaction. The resurfacing-inference path now produces a distinct `not_resurfaced` status (not `open` → still does not hard-block the `=== "open"` convergence gate; not `addressed` → the audit trail no longer lies). `addressed` is reserved for the judge verified-satisfied path + explicit operator action. **CRV2-4** — new pure-textual `evidencePreflight()` runs before any paid peer call; catches submissions that _claim_ completed operational work (tests pass / diff exists / build validated) but embed zero concrete evidence, failing locally with `needs_evidence_preflight` instead of burning API across rounds. Conservative trip condition (completed-work claim AND zero evidence markers — mere keyword presence does not trip). New optional `evidence` input field on `run_until_unanimous` + `session_start_unanimous`; opt-out via `CROSS_REVIEW_EVIDENCE_PREFLIGHT=off`. cross-review stays API-only — evidence _packaging_ is caller-side (see `docs/evidence-preflight.md`). **CRV2-1 + CRV2-6** — `SessionMeta` gains `requested_max_rounds` / `effective_max_rounds` + `requested_max_cost_usd` / `effective_cost_ceiling_usd` / `cost_ceiling_source` traceability (legacy `cost_ceiling_usd` kept in sync for back-compat). **CRV2-3-meta** — CRV2-3 reclassified as not-a-bug (relator-non-voting is the correct tribunal design); `convergence_scope` now carries explicit `lead_peer_role` / `voting_peers` / `quorum_basis` / `anti_self_review_exclusion_reason` so the deliberate exclusion is not misread as a missing-vote bug. **CRV2-5 removed from server scope** — automatic evidence packaging would expand the security surface (shell/repo access); it stays caller-side. 
4 new smoke markers; smoke `ok: true / events: 100`. 100% backward-compatible additive: new union member, new exported helper, new meta/scope fields, new optional input field, new finalize reason, new events, new env var. **Minor bump** (3.4.0 → 3.5.0; Y-component increment per SemVer). |
|
|
56
|
-
| **`v03.04.00`** | **Minor — Perplexity multi-failure-mode close-out 2026-05-13: 3 coordinated fixes covering 7 production sessions Codex flagged (`51973fac`, `f72e597a`, `f9a19401`, `99d46a2b`, `00d92cce`, `59776026`, `0003b2fe`).** **Fix #1 — streaming-path strip parity** (P0, surgical 2-line edit in `src/peers/perplexity.ts:~409/~504`): the v3.2.0 `stripPerplexityThinkingBlock` fix was applied only inside `sonarText(response)` (non-streaming path at `:~426/~521`). Production `server_info.streaming.tokens=true` is the default, so virtually every Perplexity call traversed the streaming branches which used raw `stream_buffer.text()` and bypassed the strip entirely. `<think>...</think>` preambles from `sonar-reasoning-pro` / `sonar-deep-research` reached the status parser, producing `unparseable_after_recovery` despite valid trailing JSON. v3.4.0 wraps `stream_buffer.text()` with `stripPerplexityThinkingBlock(...)` at both streaming sites, restoring parity. Forensic evidence: sess `f9a19401` (v3.3.0 self-investigation) — 4 peers converged READY on the exact diagnosis; Perplexity `ready_rate=0.28125` (9/32) vs `~1.0` for other peers. **Fix #2 — anti-meta-audit lock** (P1, prompt clause + heuristic detector): sess `51973fac` shipped a checklist of `MISSING: diff hunk` placeholders + sections titled `Evidence Gap` / `Validation Claims (NARRATIVE` / `Peer Review Readiness Blockers` instead of refining the artifact. `leadShipModeDirective()` gains an `## Anti-Meta-Audit Lock (HARD)` clause; new exported `detectMetaAuditFabrication(text)` in `src/core/orchestrator.ts` flags placeholder + section anti-patterns with double-bar threshold `(placeholders ≥ 3) OR (sections ≥ 1 AND placeholders ≥ 2)` for false-positive resistance. Reuses the shared `consecutiveLeadDrifts` counter (cap=2); new event `session.lead_meta_audit_fabrication_detected` + finalize reason `lead_meta_audit_repeated`. 
**Fix #3 — reviewer proportionality** (P2, prompt only): sess `0003b2fe` — Perplexity reviewer demanded separate `session_attach_evidence` of the same `rg` scan output the caller had narrated inline, blocking convergence over rounds. `sessionContractDirectives()` gains item 5 scoped tightly to pure config/script/text static-scan reviews; runtime work (build/test/deploy/migration/network) still requires raw output; "when in doubt, prefer asking for evidence" preserves rigor default. 3 new smoke markers (`perplexity_streaming_strip_parity_test`, `meta_audit_fabrication_detection_test`, `proportionality_guidance_test`). 100% backward-compatible additive: new exported helper, new event type, new finalize reason; tool schema unchanged. **Minor bump** (3.3.0 → 3.4.0; Y-component increment per SemVer) — additive public surface is the reason; behavior change for callers passing valid args is pure failure-mode prevention. |
|
|
57
|
-
| **`v03.03.00`** | **Minor — Caller peer-selection lock (operator directive 2026-05-12: "ALL AGENTS/PEERS ALWAYS PARTICIPATE, REGARDLESS OF THE CALLER'S CHOICE OR WILL").** Closes the systematic gaming pattern where peer callers (notably Codex, observed across multiple sessions) selectively excluded other peers from their own cross-review panels via curated `peers: [...]` lists or pinned a sympathetic relator via `lead_peer`. **Lock surface**: `peers` is locked for ALL callers (including operator) — reviewer panel is ALWAYS the full server-configured `peer_enabled` set; operators tune via env vars (`CROSS_REVIEW_PEER_<NAME>=on\|off`), not per-call overrides. `lead_peer` is locked for peer callers (forces lottery so callers cannot pin a sympathetic relator); operator caller may still pin `lead_peer` (legitimate testing/debug). Audit event `session.caller_peer_selection_ignored` emitted to event stream with `site`, `caller`, `peer_panel_overridden`, `ignored_peers`, `lead_peer_overridden`, `ignored_lead_peer` so operator can inspect via `session_events` who tried to game which peer in/out. **Implementation**: new exported `lockCallerPeerSelection<T>(input, ctx): T` helper in `src/mcp/server.ts` — pure function that strips locked fields and emits audit event via supplied `ctx.emit`. Lives at the MCP-handler boundary by design: external callers ALWAYS traverse the lock; internal call sites (orchestrator's own `runUntilUnanimous`→`askPeers` loop, smoke harness) bypass by construction. Wired at all 4 caller-facing handlers (`ask_peers`, `session_start_round`, `run_until_unanimous`, `session_start_unanimous`); `runtime` factory exposes `runtime.emit` so handlers route audit events through the same emitter. v3.2.0's Fix #3 (autowire-judge filter) remains as defense-in-depth (now trivially satisfied since `input.peers` is always undefined post-lock). New smoke marker `caller_peer_selection_lock_test` (5 behavioral scenarios + source-pin asserting all 4 handlers wire the lock).
**Public surface**: 100% backward-compatible at schema/tool-surface level (parameters still accepted; values silently overridden + audit-logged). Behavior change deliberate. **Minor bump** — observable behavior change, not a bug fix. |
|
|
58
|
-
| **`v03.02.00`** | **Patch — Codex bug-report close-out 2026-05-12: three surgical fixes (Perplexity `<think>` parser + session-state invariant + orchestrator strict peers).** **Fix #1** (`src/peers/perplexity.ts`): `sonar-reasoning-pro` / `sonar-deep-research` emit a `<think>...</think>` reasoning preamble before structured JSON; pre-v3.2.0 the parser fed that raw string into the format-recovery pipeline, which failed `unparseable_after_recovery` even when the trailing JSON was valid READY. New `PERPLEXITY_THINKING_BLOCK` regex + exported `stripPerplexityThinkingBlock()` helper; `sonarText()` now strips before returning. Closes the long-standing blocker that forced v3.0.0/v3.1.0 to self-bypass HARD GATE. **Fix #2** (`src/core/session-store.ts`): closes session-state corruption observed in session `41244a1c-e7e8-439a-a59e-9339f7c7175d` (R1-R3 didn't converge, R4 finalized as `converged`, R5+R6 ran on top and clobbered `convergence_health` back to `"blocked"`, leaving meta with `outcome="converged" / health.state="blocked"`). `finalize()` now validates `outcome="converged"` against the latest round's `convergence.converged` (throws `code: "session_finalize_outcome_mismatch"`); `appendRound()` refuses to append to a finalized session (`code: "session_already_finalized"`); new public `assertNotFinalized()` helper wired into `askPeers` + `runUntilUnanimous` entry points so the round fails fast instead of after burning budget. **Fix #3** (`src/core/orchestrator.ts`): when the caller passes an explicit `peers: [...]` list, autowire judges are intersected with the explicit list — both the consensus and single-peer paths. Observed in session `73036fbb` where peers=[codex,gemini,deepseek,grok] but autowire still invoked perplexity as judge. New `hadExplicitPeers` flag + `judgeRespectsExplicitPeers()` helper; skipped sessions emit `session.evidence_judge_pass.autowire_skipped` with `skipped_for_explicit_peers: true` + `session_explicit_peers: [...]` for operator audit. 
3 new smoke markers (`perplexity_thinking_block_strip_test` 7 scenarios + 3 pins; `session_finalize_state_invariant_test` 5 scenarios + 1 pin; `orchestrator_strict_peer_panel_test` 5 source pins). Smoke harness completes `ok: true / events: 99`. **Patch bump** (additive — new exports + new error codes; pre-existing anti-patterns now reject loudly instead of corrupting state). The `cross-review-attachment-inline-test` smoke fixture was updated to `caller_status: "NOT_READY"` so R1 doesn't auto-converge under stub mode. |
|
|
59
|
-
| **`v03.00.00`** | **Major — Perplexity joins the sextet. Quintet (5 peers) → sextet (6).** Operator directive 2026-05-12. New `PerplexityAdapter` at `https://api.perplexity.ai` (Sonar API, OpenAI-Chat-Completions-compatible; reuses shared `loadOpenAICtor` lazy SDK helper). 5 architectural traits handled explicitly: (1) web search is the DEFAULT per call — peer becomes fact-check overlay; (2) system prompt is half-honored (search component does not attend to it); (3) `reasoning_effort` enum is `minimal \| low \| medium \| high` only (clamped via exported `clampEffortForPerplexity()`); (4) **pricing is 3-dimensional** (input + output + per-1000-request fee scaled by `search_context_size`; Sonar Deep Research adds 4th dimension for citation/reasoning/search_queries); (5) API reports `usage.cost` per call in USD (captured as telemetry; config-driven cost layer remains authoritative). **Role-aware search**: `call()` → reviewer keeps search active (peer's differentiator value); `generate()` → relator forces `disable_search:true` (synthesis role, not lookup); `probe()` → search off (already inline). All 6 peers remain symmetric in role assignment — Perplexity can be caller, lead_peer, or reviewer; the HARD GATE caller!=lead_peer!=reviewer applies uniformly. Adds 14 new env vars (`PERPLEXITY_API_KEY` + `CROSS_REVIEW_PERPLEXITY_*` for model/effort/search-context/disable/pricing). Extends `cost_rates[peer]` with 6 optional fields (request_fee × 3 tiers + citation/reasoning/search_queries Deep Research). Extends `CostEstimate` with 4 new line items + `TokenUsage` with 3 new fields. Boot notice for reasoning-effort-not-honored on sonar/sonar-pro models. 2 new smoke markers (`perplexity_integration_test` + `perplexity_reasoning_capability_allowlist_test`). **Default model**: `sonar-reasoning-pro`. **Default search_context_size**: `low` (cheapest tier; cross-review focus is the attached draft, not broad search).
**Default disable_search**: `false` (search ACTIVE; fact-check overlay is Perplexity's differentiator). Tool surface 100% backward-compatible additive (PeerSchema/CallerSchema accept `perplexity` as new value; legacy 5-peer payloads still valid). Default `session_start_unanimous` now dispatches 6 reviewers — set `CROSS_REVIEW_PEER_PERPLEXITY=off` per host to preserve quintet behavior. **Major bump** — sextet transition is an epoch shift over the quintet baseline that held since v2.14.0. |
|
|
60
|
-
| **`v02.28.00`** | **Minor — Cold-start hardening Part 3: Windows registry env-var lookup bulk-cached (3-7 s → ~100 ms).** Empirical profile revealed the real boot bottleneck on Windows: `loadConfig()` consuming 3.1-7.0 s because `readWindowsRegistryEnv(name)` fired `reg query <root> /v NAME` per missing env var × 2 scopes (HKCU + HKLM). With ~140 config vars and partial `process.env`, this burned 3-7 s dwarfing every other boot cost. v2.27.0 + v2.27.1 attacked SDK imports + sweeps (~340 ms) — a side concern. **v2.28.0 fix**: single bulk `reg query <root>` at first miss populates a `Map<string,string>` module cache; `readWindowsRegistryEnv` becomes a pure `cache.get(name)`. Cost: `O(1 + 2 registry reads)` instead of `O(N missing × 2 spawns)`. **Empirical handshake** (3 trials each): v2.27.1 3.18 / 3.12 / 3.14 s → v2.28.0 **0.37 / 0.37 / 0.38 s** = 8.4× speedup. `loadConfig()` alone: 3,307 ms → 87 ms (38×). Cold-start now well below Claude Code's spawn timeout. New smoke `windows_registry_env_bulk_cache_test` (7-class assertion pinning Map cache + bulk loader + canonical `reg query <root>` shape + negative invariant on per-var `/v NAME` + escapeRegExp absence + thin lookup + dist parity). Public surface 100% backward-compatible. Self-review BYPASSED per `feedback_cross_review_self_repair_exception.md` (gate-fixing-itself, third installment). **Minor bump** — internal behavior change with measurable 8.4× runtime impact. |
|
|
61
|
-
| **`v02.27.01`** | **Patch — Cold-start hardening Part 2: lazy-load 5 provider SDKs + defer 6 startup sweeps to setTimeout(30s).** Completes the cold-start fix started in v2.27.0. Empirical motivation 2026-05-12: cross-review failed to register tools in a Claude Code session while the other 5 MCP hosts (Codex CLI / Gemini Code Assist / Antigravity / Grok CLI / DeepSeek CLI / VS Code) loaded normally. Diagnostic measurement of the real JSON-RPC `initialize` handshake showed the server taking ~4.2 s to respond — right on top of Claude Code's per-spawn timeout. Two contributors stacked: eager top-level imports of 5 provider SDK module trees (~3 s of CommonJS/ESM graph) + v2.27.0's 4 boot sweeps + 2 boot notices all running on the same event-loop tick as the initialize message. **Lazy-load**: every adapter source uses `import type` only for provider SDKs; new shared cached loaders `loadAnthropicCtor()` / `loadOpenAICtor()` / `loadGenaiModule()` wrap `import(<sdk>)` in a per-module promise so concurrent first-callers resolve exactly once. Each adapter's `client()` is now async; 25 call sites across the 5 adapters updated; `geminiThinkingConfig(model, ThinkingLevel)` takes the lazy-loaded enum as 2nd arg. `model-selection.ts` consumes the same loaders. **Deferred sweeps**: 6 boot-time `setImmediate` blocks in `server.ts` become `setTimeout(..., STARTUP_SWEEP_DELAY_MS)` with delay = 30_000 ms — initialize responds in <200 ms while sweeps run later when the operator is idle. Empirical handshake measurement post-ship: 3.6-3.9 s (vs 3.7-4.2 s pre-ship); margin is modest because Node.js + MCP SDK + orchestrator still dominate, but the architectural correctness keeps SDK and FS work off the initialize tick entirely. 2 new smoke markers (`lazy_provider_sdk_imports_test`, `startup_sweeps_use_setTimeout_test`) + 2 existing gemini assertions updated. 
**Public surface** 100% backward-compatible (3 new named exports for cross-module loader reuse; `client()` is `private` so async-conversion is internal). **Patch bump**. |
|
|
62
|
-
| **`v02.27.00`** | **Minor — Cold-start hardening Part 1: corrupted meta.json auto-quarantine + finalized-session auto-prune.** Empirically motivated by Claude Code reload friction 2026-05-12: 534 session dirs accumulated under `~/.cross-review/data_v2/sessions/`, including 3 corrupted by the v2.25.1 redact escape-boundary bug (`77c47284`, `be47a5b0`, `7edf63e3`). The startup sweeps iterate via `list()` which read every `meta.json`; a single corrupted file caused the sweep to throw + abort, surfacing parse-error stderr on every reload — Claude Code is more sensitive to startup stderr than other hosts. **`SessionStore.list()`** now silently skips + quarantines corrupted meta.json (renamed to `meta.json.bad` with one `[cross-review] quarantined …` stderr line, idempotent). **`SessionStore.pruneOldSessions(maxAgeDays?)`** removes finalized session dirs (outcome ∈ converged/aborted/max-rounds) whose `updated_at` is older than the cutoff. Default 60 days; `CROSS_REVIEW_PRUNE_AFTER_DAYS=0` disables entirely. In-flight or untyped-outcome sessions are NEVER pruned (preserves audit trail). New boot `setImmediate` block wires the prune; stderr only emitted when `pruned > 0`. **Minor bump** — 2 new methods on `SessionStore`; `list()` swallows-and-quarantines instead of throws (additive defensive). Backward-compatible default; operators see no behavior change unless they have corrupted meta.json or >60-day-old finalized sessions. Self-review BYPASSED per `feedback_cross_review_self_repair_exception.md`. |
|
|
63
|
-
| **`v02.26.01`** | **Patch — `max_attached_evidence_chars` default raised 80_000 → 200_000 to fix multi-file evidence truncation.** Empirically demonstrated by the stepsecurity v0.2.0 ship 2026-05-12 (sess `fd1037e5` and prior `85f94725`): with 5 attached evidence files totaling ~95KB, `session-store.readEvidenceAttachments()` budget allocator at `src/core/session-store.ts:1481-1543` exhausted the 80KB total cap before reaching the 4th+ attachment, surfacing `(truncated to 33273 of 38412 bytes)` to peers, who in 5 consecutive rounds correctly flagged the truncation as a blocker. The `perFileCap = max(2_000, floor(totalCap * 0.6))` mechanic remains correct (60% per-file allowance leaves room for at least 1 other attachment); only the global `totalCap` default needed bumping. New default 200_000 chars accommodates ~5 attachments averaging 30KB each. Operator override unchanged via `CROSS_REVIEW_MAX_ATTACHED_EVIDENCE_CHARS`. **Documented adjacent issues** (no code fix; tracked for v2.27+): (1) lead-drift abort threshold is 2 consecutive (`orchestrator.ts:3662`) — when `max_rounds` is reached with `consecutiveLeadDrifts === 1`, the session ends `max-rounds` instead of `lead_meta_review_drift`; workaround = use `ask_peers` for known-drift-prone task patterns; (2) inaccessible upstream OpenAPI spec — when peers demand verbatim spec excerpts but the spec endpoint requires browser-session cookie auth, the caller must rely on alternative-evidence patterns. **Patch bump** — backward-compatible default change. No public API surface change. Self-review BYPASSED per `feedback_cross_review_self_repair_exception.md`. |
|
|
64
|
-
| **`v02.26.00`** | **Minor — Full pricing-model schema: base + extended-tier + cache (read/write) + promo (limited-time discount), all env-configurable, graceful fallback when fields are absent or promo expires.** Operator directive 2026-05-11 ("Cross-review-v2 needs to be able to read, from the configurable variables in the config files and in env vars, every pricing model in force: with and without cache, with and without promotion, below a given token count and above it"). Adds 14 new optional pricing env vars per provider plus 2 metadata env vars per provider (`_THRESHOLD_TOKENS`, `_PROMO_EXPIRES_AT_UTC`) on top of the v2.0.0 required pair — total 18 env-var slots per provider × 5 providers = 90 max. Selection cascade in new exported `selectRate()`: (promo+extended) → promo → extended → base, each step automatically falling through when the corresponding field is unset OR the gating condition does not apply. When promo expires (`Date.now() >= Date.parse(promo_expires_at)`), system uses base without operator intervention; when extended is unset, base applies to all prompt sizes; when cache rates are unset entirely, cache tokens are billed at the input rate (zero savings reported, no penalty). **No-hardcoded-financials directive** — the legacy `src/core/cache-rates.json` runtime fallback was REMOVED; cache pricing comes exclusively from env vars or graceful input-rate fallback. **CostEstimate** type extended with `cache_read_cost?`, `cache_write_cost?`, `tier_used?` ("base"\|"extended"\|"promo"\|"promo_extended"). `estimateCacheSavings()` third positional arg (`configRate`) is now required — internal/MCP callers route through `estimateCost()` and are unaffected. New smoke marker `full_pricing_model_v2260_test` pinning 11 invariants. **Minor bump** — additive public surface; breaking only for direct `estimateCacheSavings()` callers. |
|
|
65
|
-
| **`v02.25.01`** | **Patch — `meta.json` corruption hotfix: `redact()` env-style pattern was crossing JSON-escape boundaries.** The env-style assignment regex in `src/security/redact.ts:26` used `[^\s"',}]{6,}` for the value capture group; backslash was NOT in the exclusion class, so when a peer response contained the JSON-escaped sequence `token: write\"` (the inner-string close-quote of an escaped peer text), the `{6,}` quantifier consumed `write\` (6 chars including the escape backslash). The replacement `[REDACTED]` ate the closing `\` of the escape, leaving a bare `"` that prematurely closed the outer JSON string — producing structurally-broken `meta.json` files that could not be re-parsed at session resume time. Empirical impact: 3 cross-review sessions today (`be47a5b0`, `77c47284`, `7edf63e3`) all aborted at session_init with parser errors at different positions — same root cause: peer responses to a 13-repo scorecard hotfix submission quoted `id-token: write` inside backtick-fenced YAML excerpts. Fix: extend the negative char class with `\\`. Three smoke regression cases added (`escapeBoundary`, `realAssignment`, `yamlExcerpt`). **Patch bump** — additive defensive narrowing of an existing pattern; no public surface change. Cross-review-v2 self-review BYPASSED per operator directive 2026-05-11 (the bug being fixed is in the cross-review gate itself; routing the fix through the broken gate would re-encounter the same corruption). |
|
|
66
|
-
| **`v02.25.00`** | **Third deliberation mode `circular` joins `ship` and `review`.** Imported from `maestro-app`'s editorial protocol after operator review of the maestro design 2026-05-11. Serial deliberative custody: caller submits artifact; non-caller peers rotate as temporary curators; each rotator either approves the current version unchanged or produces a narrowly justified revision; convergence = full rotation completes without substantive change. No parallel peer-voting in this mode — the rotator IS the actor each round. **When to use each mode**: `ship` (default) for approving/rejecting an external artifact (PR review, spec approval, security gate — tribunal primitive, all peers vote READY); `review` for tasks phrased as a review act where the lead emits structured response; `circular` for producing/refining a shared artifact (spec drafting, RFC, protocol evolution, CHANGELOG copy — editorial primitive, panel produces). Modes coexist; mixing within a single session is not supported. Implementation: new `SessionMode = "ship" \| "review" \| "circular"`; new `leadCircularModeDirective()` Layer-1 prompt clause with 5 subsections (approve unchanged, approved-content lock, quality preservation, no-self-review, evidence-provenance-lock shared with ship); new `runCircularLoop()` private orchestrator method called when `sessionMode === "circular"`; new `circular_state: {rotation_order, consecutive_no_change_count, last_revision_round}` persisted in `meta.json`; new `circular_max_rotations` config (default 3, env `CROSS_REVIEW_CIRCULAR_MAX_ROTATIONS`); new event types `session.circular_rotation_assigned` / `_step_unchanged` / `_step_revised` / `_full_rotation_no_change` / `_max_rotations_exceeded` / `_rotation_too_small`; new finalize reasons `circular_full_rotation_no_change` / `circular_max_rotations_exceeded` / `circular_rotation_too_small`. Rotation length minimum is 2 (no-self-immediate-output guard). 
Drift / empty / fabrication detection from v2.23/v2.24 fires identically. New smoke marker `circular_mode_test` pinning 11 invariants. MCP tool schemas (`run_until_unanimous`, `session_start_unanimous`) accept `mode: "circular"`. **Minor bump** — additive public surface; pre-v2.25 callers see no behavior change. |
|
|
67
|
-
| **`v02.24.00`** | **Evidence-provenance lock for the ship-mode relator (Codex bug report 2026-05-10).** Codex empirically observed two adjacent failure modes from his own working session `019dc794`: (a) **session `09c21d7a`** — lead_peer (Grok) fabricating operational evidence ex nihilo in `run_until_unanimous` with `mode: ship` (symmetric-pattern SHAs, 39-char SHAs where git emits 40, test-run counts not in attached evidence, `git diff --check passed` assertions, vite asset hashes); (b) **session `eee886d3`** — different relator (DeepSeek) propagating caller-narrated operational claims (`cargo test: 147 passed`, `npm run typecheck: passed`) as if they were verified evidence, when the caller never attached raw command output via `session_attach_evidence`. Same architectural gap from two angles: **NARRATIVE about operational evidence ≠ PROVENANCE-GRADE operational evidence**. Pre-v2.24.0 the orchestrator promoted such revisions to next-round draft, burning a full round of peer calls before downstream peers (claude + deepseek) blocked convergence. **Layer 1** — Evidence Provenance Lock (HARD) clause added to `leadShipModeDirective()` system prompt, instructing the relator that operational evidence (SHAs/hashes/build outputs/test counts/diff hunks/git assertions) MUST be cited verbatim from the corpus or declared as a blocker. **Layer 2** — new exported `detectFabricatedEvidence(revisionText, provenanceCorpus): FabricationDetectionResult` heuristic detector with hex-token-subset check + canonical operational-assertion patterns. Thresholds: 3+ net-new hex tokens or 2+ suspicious assertions trip fabrication; corpus-quoted tokens are subtracted before scoring (false-positive guard). 
**Layer 3** — orchestrator relator-revision branch wires the detector after `emptyText`/`driftDetected` checks, preserves prior draft on detection, increments `consecutiveLeadDrifts`, emits `session.lead_fabrication_detected` event (data.fabrication_signals carries net_new_hex_count + sample + suspicious_assertion_count + sample), finalizes with `lead_fabrication_repeated` at the consecutive-cap. New smoke marker `relator_evidence_provenance_lock_test` pins behavioral matrix (clean/hex/assertion/provenance-correct) + source-level invariants (prompt sentinel, threshold constants, event type, finalize reason, unified-counter contract). No tool surface change. **Patch bump** — additive event + finalize reason; failure-mode behavior change only. |
|
|
68
|
-
| **`v02.23.00`** | **Anthropic empty-revision degenerate path detection.** Patch closing a $0.21 USD waste path discovered while triaging maestro-app v0.5.20 review session `8187f5a8` (2026-05-10): Claude Opus extended-thinking responses can return a content array with only `thinking`/`redacted_thinking` blocks and no final `text` block. Pre-v2.23.0 the Anthropic adapter silently produced `text: ""` and the orchestrator promoted that empty string to the next-round draft, dispatching peer calls against an empty `Draft Or Solution Under Review:` block. **Layer 1** — new `parseAnthropicContent(content)` returns `{text, parser_warning?}` instead of the lossy `string`; legacy `textFromAnthropicContent` kept as backward-compat shim. **Layer 2** — anthropic.ts call sites surface `parser_warning` via new `extraParserWarnings` parameter on `resultFromText`/`generationFromText`, flowing to `PeerResult.parser_warnings` and (new) `GenerationResult.parser_warnings`. **Layer 3** — orchestrator's relator-revision branch treats `generation.text.trim() === ""` the same as drift: preserve prior draft, increment `consecutiveLeadDrifts`, emit dedicated `session.lead_empty_revision` event, finalize with `lead_empty_revision_repeated` when the cap is hit. New smoke marker `anthropic_empty_text_detection_test` pins all 4 invariants (helper return shape, adapter call-site uniformity, orchestrator sentinel strings, types declaration). No public surface change for callers passing valid arguments. **Patch bump** — failure-mode behavior change only. |
|
|
69
|
-
| **`v02.22.00`** | **`session_doctor` drill-down + per-round cost telemetry + budget warning event.** Three observability/audit improvements identified during a forensic audit of 467 durable sessions. **A.P2:** `session_doctor` hides per-session enumeration of `findings.self_lead_metadata` by default (178/467 = 38% pre-v2.16.0 noise); `totals.self_lead_metadata` count remains visible; pass `include_legacy: true` to enumerate. **B.P2:** entries in `findings.open_evidence_sessions[]` gain `item_types` (open items grouped by surfacing peer) + `chronic_blockers` (item ids with `round_count >= 3`) so operators see which evidence asks are systemic. **B.P3:** new `costs_per_round[]` + `cost_ceiling_usd` in `meta.json` (snapshot at session_init time so retroactive analysis is decoupled from later env-var changes); new one-shot `session.budget_warning` event fires when cumulative cost crosses 75% of the ceiling, providing early visibility before `max_rounds_budget_exceeded`. 3 new smoke markers (`session_doctor_legacy_filter_test`, `evidence_checklist_drilldown_test`, `budget_warning_emit_test`). **Minor bump** — public surface is additive; pre-v2.22 callers see no behavior change. |
|
|
70
|
-
| **`v02.21.00`** | **Cross-provider prompt caching across all 5 peers (OpenAI, Anthropic, Gemini, DeepSeek, Grok).** Single coordinated ship that wires uniform prompt-caching telemetry through the runtime: each adapter parses provider-native cache fields (`prompt_tokens_details.cached_tokens` / `cache_creation_input_tokens` / `cache_read_input_tokens` / `cachedContentTokenCount` / `prompt_cache_hit_tokens` / `prompt_cache_miss_tokens`); orchestrator emits a canonical `provider.cache.usage` event; per-session `cache_manifest.json` is appended for every cached call. **Anthropic** uses EXPLICIT cache_control breakpoints on the system prompt (TTL `5m`/`1h`). **OpenAI** uses pair-scoped `prompt_cache_key` + `prompt_cache_retention` (`in_memory`/`24h`). **Grok** mirrors OpenAI plus `x-grok-conv-id` header for cache-bucket scoping. **DeepSeek** parses auto-cache telemetry (no payload changes). **Gemini** parses implicit-cache telemetry only (explicit `caches.create` deferred). New `src/core/prompt-parts.ts` builds the canonical `stablePrefix` that always begins with `cache_schema_version: vN` and produces a sha256 hex hash invariant across rounds for the same case. New `src/core/cache-manifest.ts` persists per-session cache history with the same atomic-write retry pattern as `meta.json`. New rate cards in `src/core/cache-rates.json` populate `CostEstimate.cache_savings_usd` (or `cache_savings_unknown` when no rate matches). Operator can disable globally with `CROSS_REVIEW_DISABLE_CACHE=true`; TTL via `CROSS_REVIEW_CACHE_TTL_ANTHROPIC` / `CROSS_REVIEW_CACHE_TTL_OPENAI`; schema bump via `CROSS_REVIEW_CACHE_SCHEMA_VERSION`. 5 new smoke markers (`cache_hash_invariance_test`, `cache_schema_version_in_prefix_test`, `cache_rates_json_loaded_test`, `cache_manifest_atomic_write_test`, `cache_disable_kill_switch_test`). New `docs/caching.md` documents per-provider behavior matrix. **Minor bump** — public surface is additive; pre-v2.21 callers see no behavior change. |
|
|
71
|
-
| **`v02.18.08`** | **Site sponsor card iteration.** `site/index.html` GitHub Sponsors iframe (a cross-origin white box) replaced by a dark navy link card with a pink ❤ + cyan meta + animated arrow; card moved to AFTER the buttons (lcv.dev/sponsor primary, GitHub Sponsors alternative). Companion ship Phase 3 (12 repos). |
|
|
72
|
-
| **`v02.18.07`** | **Patch — `site/index.html` visual identity refresh.** GitHub Pages doc/sponsor page reskin to the new LCV org dark-first navy/cyan visual identity (palette `#050b18`/`#38bdf8`/`#34d399`, radial gradients, glow shadows, gradient text on h1). Coordinated companion ship with cross-review-v1 1.12.9, deepseek-cli 0.3.1, grok-cli 1.6.2, sponsor-motor APP v01.02.02, and `.github-org/site` (org root + /sponsor). No change to the published npm tarball (`files[]` does not include `site/`); only the GitHub Pages page changes. **Patch bump** (no public surface change). |
|
|
73
|
-
| **`v02.18.06`** | **Patch — Gemini API function-declaration compatibility for MCP tool inputSchemas.** Gemini Code Assist forwards each MCP tool's `inputSchema` to the Gemini API as a `function_declarations[*].parameters` payload; the Gemini API's OpenAPI 3.0 subset rejects three patterns the SDK was emitting from the existing zod schemas, surfacing as `400 INVALID_ARGUMENT` for every chat turn including cross-review tools. v2.18.6 cleans the offending zod usage. **(1)** `additionalProperties: false` removed from every MCP tool inputSchema (~28 tools) by dropping the `.strict()` chain; runtime accepts the same valid arguments because handlers consume only declared properties via destructuring. **(2)** `caller` field flattened from `z.union([PeerSchema, z.literal("operator")])` (6 occurrences) to a single `CallerSchema = z.enum([...PEERS, "operator"])`, replacing the `anyOf: [enum, const]` shape with a clean single `enum`. **(3)** `reasoning_effort_overrides` refactored from `z.record(PeerSchema, ReasoningEffortSchema).optional()` to an explicit `z.object({codex?, claude?, gemini?, deepseek?, grok?}).optional()`, eliminating the non-OpenAPI `propertyNames` constraint and the spurious `required: [<all 5 peers>]` artifact that contradicted the field's `.optional()` declaration. No behavior change for any caller passing valid arguments — Claude Code, Codex CLI, Gemini Code Assist, Grok CLI and DeepSeek CLI continue invoking the same tools with the same keys. Lint/typecheck/format clean; smoke harness completes with `ok: true / events: 96`. **Patch bump** (public compatibility preserved; the only observable difference is that undeclared extra fields are now silently dropped instead of being rejected with `mcp_arg_validation_failed`). |
|
|
74
|
-
| **`v02.18.05`** | **Patch — anti-drift smoke drivers for v2.18.4 audit closure (operator directive 2026-05-07).** v2.18.4 shipped 6 surgical fixes from the Codex external audit; v2.18.5 hardens those fixes against silent regression with 5 anti-drift smoke checks (`hono_override` / `abort_signal_threading` / `max_items_per_pass_default` / `clamp_effort_for_model` / `consensus_event_per_peer_attribution`). **P1.1**: `package.json` overrides.hono === ">=4.12.16" + ip-address override retained. **P1.3**: ≥2 sites with `signal?: AbortSignal` param + `signal: params.signal` wiring + `signal: input.signal` autowire emission; consensus pass has no leftover `signal: undefined`. **P1.4**: source-level `?? "4"` fallback + behavioral `loadConfig()` returns max_items_per_pass=4 (env unset). **P2.1**: behavioral clampEffortForModel("xhigh", "grok-4.3")="high"; passthrough on multi-agent; clamp wired at exactly 2 responses.create sites. **P2.4**: legacy judge_peer + new judge_peers array + per_peer_verdict map co-emitted at every `this.emit({...})` event payload. `clampEffortForModel` is now exported from src/peers/grok.ts so the harness can verify directly. Companion to cross-review-v1 v1.12.7 (parallel ship, same operator directive). Smoke harness completes with `ok: true` / exit 0; lint/typecheck/format clean; `npm audit --audit-level=moderate` 0 vulnerabilities. **Patch bump** (additive — only new exports + new smoke markers; no runtime behavior change). |
|
|
75
|
-
| **`v02.18.04`** | **Patch — Codex external audit 2026-05-07 outcome: 6 surgical fixes (P1.1, P1.2, P1.3, P1.4, P2.1, P2.4).** Codex submitted a read-only audit of cross-review v2.18.3 with 4 P1 + 7 P2 findings; this ship lands 6 verified-actionable items. **P1.1**: `package.json` adds `"hono": ">=4.12.16"` override clearing 2 npm-audit moderate advisories (GHSA-9vqf-7f2p-gf9v + GHSA-69xw-7hcm-h432) via @modelcontextprotocol/sdk transitive (practical exposure ~zero in stdio runtime, but audit-gate matters for publish + defense-in-depth; same precedent as v2.18.1 ip-address override). **P1.2**: `src/security/redact.ts` adds `xai-` API key pattern at parity with sk-/sk-ant-/AIza/etc; logs/sessions could previously leak xAI keys via persisted provider errors. **P1.3**: `runEvidenceChecklistJudgeConsensusPass` + `runEvidenceChecklistJudgePass` now thread `AbortSignal` through to `judgeEvidenceAsk(context.signal)` — pre-v2.18.4 the consensus path hardcoded `signal: undefined` and single-peer omitted the field, so `session_cancel_job` could not abort judges mid-flight. Autowire call sites pass `input.signal` from round scope. **P1.4**: lowered default `CROSS_REVIEW_EVIDENCE_JUDGE_MAX_ITEMS_PER_PASS` from 8 → 4 — with default consensus_peers=4, worst-case round goes from 4×8=32 paid judge calls down to 4×4=16. Operators wanting prior behavior set env-var explicitly. **P2.1**: `GROK_REASONING_EFFORT_MODELS` allowlist expanded from `{"grok-4.20-multi-agent"}` to include `"grok-4.3"` per current xAI docs (verified via WebFetch 2026-05-07; xAI added `grok-4.3` reasoning_effort support after v2.16.0 froze). New `clampEffortForModel()` narrows internal `xhigh`/`minimal` scale to `high` for grok-4.3 (which only accepts `none | low | medium | high`). v2.16.0 verification 2026-05-05 was authoritative at the time but is now stale; v2.18.4 closes the drift. **P2.4**: consensus events at orchestrator.ts:1008 + :1030 previously emitted only `judge_peer: params.judge_peers[0]`, so the rollup at session-store.ts:911 attributed every consensus decision to the first peer (codex by default). v2.18.4 keeps `judge_peer` for backward compat AND emits `judge_peers: PeerId[]` + `per_peer_verdict` map so per-peer accuracy is computable from the raw event stream. Smoke harness completes with exit 0 + final `{ ok: true, events: 96 }` payload (the harness's binary success signal); `grok_reasoning_capability_allowlist_test` updated from prior `size === 1` to `size === 2`. Lint/typecheck/format clean. **Patch bump** (additive public surface; default-behavior change on `max_items_per_pass` documented). |
|
|
76
|
-
| **`v02.18.03`** | **Patch — Gemini default pin bump `gemini-3.1-pro-preview` → `gemini-2.5-pro` (operator preference 2026-05-07; coordinated with cross-review-v1 v1.12.4).** Source-of-truth defaults flipped: `src/core/config.ts` `models.gemini` default → `gemini-2.5-pro`; `src/peers/model-selection.ts` priority list → `["gemini-2.5-pro", "gemini-3.1-pro-preview"]` (3.1-pro-preview retained as fallback). Rationale: under Google One AI Ultra subscription, `gemini-2.5-pro` carries 1k requests/day quota vs `gemini-3.1-pro-preview`'s 250 requests/day; post-bump empirical sessions (08cbc942, 1d5be5f2, 256ac7c9 — all 2026-05-07) confirm `gemini-2.5-pro` stable across the 5-peer panel without rate_limit blockers. The 7 LCV-workspace MCP host configs already flipped `CROSS_REVIEW_GEMINI_MODEL=gemini-2.5-pro` env-override 2026-05-07; this ship aligns the source-of-truth defaults so a fresh install without env-override picks the same model. Workspace policy (operator directive 2026-05-07): only `gemini-*-pro` variants ≥ 2.5 are permitted — no `*-flash` and no models below 2.5. Smoke fixture `scripts/smoke.ts:225` (currentOfficialModel iterator) flipped to `gemini-2.5-pro`. `docs/api-keys.md` env-var example + `docs/model-selection.md` priority documentation refreshed to match. **Patch bump** (no public surface change beyond default model ID; behavior unchanged for env-override users). |
|
|
77
|
-
| **`v02.18.02`** | **Tier 5 — Windows process-tree introspection (coordinated with cross-review-v1 v1.12.2).** Closes the long-standing forensics gap: pre-v2.18.2 `getParentProcessSnapshot()` returned `parent_exe_basename: null` on Windows because we only had a POSIX `/proc/<ppid>/comm` reader (Windows path deferred at F1 v2.18.0). v2.18.2 closes the gap with a defensive `tasklist /FI "PID eq <ppid>" /FO CSV /NH` reader via `child_process.spawnSync` (`timeout: 500`, `windowsHide: true`); parser uses leading-quote discriminator and the same `1 ≤ length < 128` sanity filter as POSIX. Best-effort try/catch swallows ENOENT, timeout, parse failures. POSIX path unchanged. `scripts/smoke.ts` sub-test (14) extended with shape sanity + Windows-specific populated-basename assertion + source-level anti-drift guards. Forensics-only field — NOT used by F1 token gate or v2.17.0 clientInfo cross-check. **Patch bump** (no public surface change). |
|
|
78
|
-
| **`v02.18.00`** | **F1 caller capability tokens (coordinated with cross-review-v1 v1.11.0).** Cryptographic identity proof that complements the v2.17.0 clientInfo gate. Pre-v2.18.0 the v2.17.0 cross-check between `caller` and `clientInfo.name` only catches _inconsistent_ self-reports — both fields are declared by the caller. F1 introduces a per-host secret (env `CROSS_REVIEW_CALLER_TOKEN`), authoritative on match and rejected on mismatch. New `caller-tokens` module exposes generation, loading, constant-time hex matching, env verification and a best-effort parent-process snapshot for forensics (Option C / Hybrid). New MCP tool `regenerate_caller_tokens` rotates `host-tokens.json`. New env vars `CROSS_REVIEW_CALLER_TOKEN`, `CROSS_REVIEW_TOKENS_FILE`, `CROSS_REVIEW_REQUIRE_TOKEN`. New `caller_tokens` block in `server_info` surfaces the gate state. `verifyCallerIdentity` extended with `verification_method` ("token" | "client_info" | "none") and `identity_metadata`. R2 codex catch hardening: `caller="operator"` from a host carrying a token throws `identity_forgery_blocked` (closes the operator-bypass window). Permissive default — hosts without tokens fall back to v2.17.0 clientInfo gate; operator opts into hard-enforce mode after distributing secrets. Smoke marker `caller_capability_tokens_test` covers 16 cases including the new overlay paths and the R2 hardening. **Minor bump** (additive public surface). |
|
|
79
|
-
| **`v02.17.00`** | **HARD GATE — identity forgery rejection (operator directive 2026-05-05).** Empirical evidence caught in the act: cross-review session `0994cbaf` was created by Codex with `caller=claude` (impersonation to self-exclude the real Claude from the panel). Pre-v2.17.0, v2 did not even capture `clientInfo` from the MCP initialize handshake; `caller` was trusted unconditionally. v2.17.0 adds `verifyCallerIdentity(declaredCaller, clientInfo)`, which cross-checks the declared caller against `getCallerCandidatesFromClientInfo(clientInfo)`. Applied in all 6 caller-accepting handlers: `session_init`, `ask_peers`, `session_start_round`, `run_until_unanimous`, `session_start_unanimous`, `contest_verdict` (when `new_caller` is provided). Match → OK + `identity_verified=true`. clientInfo unknown → OK + `identity_verified=false` (legitimate override). `caller="operator"` → OK (no agent claim made). Mismatch OR multi-match clientInfo → throws `identity_forgery_blocked`. Smoke `identity_forgery_blocked_test` (6 sub-tests). Coordinated ship with `cross-review-v1 v1.9.0`. **Minor bump** because the public surface adds the `identity_forgery_blocked` error. Trilateral cross-review bypassed by operator directive (security fix to the gate itself, which would otherwise route through the compromised gate). |
|
|
80
|
-
| **`v02.16.00`** | **Tribunal protocol repair plus operational doctor.** Separates petitioner/caller from relator metadata, applies self-recusal to direct `ask_peers`, adds read-only `session_doctor`, fixes Windows smoke teardown, and refreshes provider model guidance from official docs. |
|
|
81
|
-
| **`v02.15.01`** | **`server_info` consensus visibility hotfix.** Exposes `consensus_peers` and `configured_consensus_peers_raw` for evidence-judge autowire so operators can audit the same configuration the dispatcher is using. |
|
|
82
|
-
| **`v02.15.00`** | **Backlog bundle for operational judge controls.** Added consensus-based judge autowire, per-call reasoning-effort overrides, opt-in real-API smoke, provider 4xx docs hints, and a Grok reasoning-capability allowlist while exposing consensus toggles across the six MCP host configs. |
|
|
83
|
-
| **`v02.14.01`** | **Grok reasoning model hotfix.** Switched the default Grok model to `grok-4.20-multi-agent` after real xAI verification and official docs showed `reasoning.effort` is accepted only on that model family. |
|
|
84
|
-
| **`v02.14.00`** | **Grok joins the tribunal.** Expanded the peer set to five with Grok, added per-peer on/off env vars, precision-report groundwork, active evidence-judge autowire, `contest_verdict`, multi-peer judge consensus, attached-evidence prompt injection, and CodeQL-safe temp-directory handling. |
|
|
85
|
-
| **`v02.13.00`** | **Lead meta-review drift fix.** Added explicit `ship` versus `review` session mode, lead drift detection, drift telemetry, and an abort gate so `run_until_unanimous` does not replace the artifact under review with a structured peer-review verdict. |
|
|
86
|
-
| **`v02.12.00`** | **Shadow judge observability.** Turned on evidence-judge shadow-mode data collection, surfaced autowire config in `server_info`, added dashboard/runtime rollups, and codified the tribunal-colegiado model for caller, relator, peer votes, and contestation. |
|
|
87
|
-
| **`v02.11.00`** | **Relator lottery plus shadow auto-wire.** Added automatic relator selection that excludes the caller and wired the v2.9 judge pass in shadow mode so self-review drift stops at the session structure. |
|
|
88
|
-
| **`v02.09.00`** | **LLM evidence-judge pass.** Added an operator-triggered judge that evaluates open evidence asks against the current draft and promotes only verified satisfied items, leaving inferred/unknown cases open. |
|
|
89
|
-
| **`v02.08.00`** | **Per-peer health and Evidence Broker lifecycle.** Added health rollups, evidence lifecycle tracking, resurfacing inference, dashboard surfaces, and the final architectural audit item on top of v2.7. |
|
|
90
|
-
| **`v02.07.00`** | **Evidence Broker.** Added a persistent per-session evidence checklist that deduplicates `NEEDS_EVIDENCE` caller requests and injects outstanding asks into subsequent revision prompts. |
|
|
91
|
-
| **`v02.06.01`** | **Fallback/recovery budget hard gate.** Replicated hard budget refusal to fallback and moderation-recovery paths so paid recovery calls cannot silently exceed the session cost ceiling. |
|
|
92
|
-
| **`v02.06.00`** | **Token-delta compaction plus v2.5 format hotfix bundle.** Coalesced streaming token delta events to reduce `events.ndjson` noise and bundled the deferred Prettier/format fix from v2.5. |
|
|
93
|
-
| **`v02.05.00`** | **Evidence and budget hardening pass.** Folded in operator-requested evidence/budget improvements plus empirical Codex/Gemini audit findings from historical session analysis. |
|
|
94
|
-
| **`v02.04.01`** | **CI stub fail-fast hotfix.** Fixed import-time server startup so the smoke harness can import MCP schemas while `CROSS_REVIEW_STUB=1` is set in CI with explicit confirmation. |
|
|
95
|
-
| **`v02.04.00`** | **Audit-closure hardening pass.** Closed internal v2.3.3 technical-opinion priorities with additive public-surface hardening and several explicitly documented behavior changes. |
|
|
96
|
-
| **`v02.03.03`** | **Prompt shielding and financial safety.** Wrapped `review_focus` in escaped delimiters, blocked paid calls until financial controls are configured, expanded `server_info` financial diagnostics, and hardened MCP IDs, sweeps, jobs, and recovery cost alerts. |
|
|
97
|
-
| **`v02.03.02`** | **CI-green README/docs cleanup.** Reissued README organizational standardization under the repository Prettier policy and completed active-document rename cleanup in `NOTICE` and `CODE_OF_CONDUCT.md`. |
|
|
98
|
-
| **`v02.03.01`** | **README organizational standardization.** Adopted the shared LCV README opening while preserving the API-first runtime, model-selection, streaming, and observability sections. |
|
|
99
|
-
| **`v02.03.00`** | **Provider-neutral `review_focus`.** Added focus support across session tools, persisted focus metadata, injected bounded focus blocks into generation/review/retry prompts, and aligned auto-tag/publish automation with the stable package line. |
|
|
100
|
-
| **`v02.02.00`** | **Provider token streaming.** Added real token streaming for OpenAI, Anthropic, Gemini, and DeepSeek, with count-based progress events, runtime controls, and text-redaction defaults for persisted event logs. |
|
|
101
|
-
| **`v02.01.01`** | **CodeQL and model-selection hardening.** Fixed secret-redaction ReDoS and dashboard log-injection alerts, added decision retry for empty peer output, max-output-token controls, stronger model selection, and improved thinking controls. |
|
|
102
|
-
| **`v02.01.00`** | **First stable `cross-review` release.** Promoted the API-first implementation to stable with cancellation, restart recovery, metrics, runtime capabilities, prompt compaction, budget preflight, model fallback, and stable naming. |
|
|
103
|
-
| **`v02.00.04`** | **Session event race hotfix.** Removed the CodeQL file-system race in `events.ndjson` persistence by appending under the session lock. |
|
|
104
|
-
| **`v02.00.03`** | **Background sessions and durable reports.** Added background MCP tools, durable events and reports, peer decision-quality tracking, generation accounting, provider cost rates, budget guard, moderation-safe retry, and dashboard event/report APIs. |
|
|
105
|
-
| **`v02.00.02`** | **Publishing and dashboard sanitization.** Normalized npm dist-tags, replaced the sponsor landing with the SumUp support page, sanitized dashboard 500 responses, and bumped the alpha runtime. |
|
|
106
|
-
| **`v02.00.01`** | **Public npm/package metadata alignment.** Enforced public npm visibility, added registry visibility checks, aligned funding metadata, normalized `repository.url`, and bumped the alpha runtime. |
|
|
107
|
-
| **`v02.00.00`** | **Development package line hardening.** Added parser format recovery, convergence metadata, shared MCP timeout/runtime smoke, auto-tag/release publishing, padded public tags, prepack clean builds, ignore-rule hardening, and quorum preservation. |
|
|
108
|
-
| **`v2.0.0-alpha.2`** | **Durable session recovery alpha.** Added in-flight metadata, convergence health, evidence attachment, operator escalation, session sweep, convergence inspection, silent-model-downgrade failures, and smoke coverage for the new surfaces. |
|
|
109
|
-
| **`v2.0.0-alpha.1`** | **Model attestation and store hardening alpha.** Added reported-model tracking, failed-attempt aggregation, recovery hints, atomic/locked session writes, UUID path hardening, safer probes, self-review prevention, English peer prompts, and expanded redaction. |
|
|
110
|
-
| **`v2.0.0-alpha.0`** | **Initial API/SDK-only MCP server.** Introduced official SDK adapters for OpenAI, Anthropic, Gemini, and DeepSeek, runtime model discovery, best-model selection, and a durable local session store. |
|
|
39
|
+
| Release | Scope |
|
|
40
|
+
| -------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
|
|
41
|
+
| **`v04.02.02`** | Patch — provider-doc refresh, Perplexity probe repair, current model pins, and rate-card guidance. |
|
|
42
|
+
| **`v04.02.01`** | Patch — publish the workspace hard-gate cleanup as a package release. |
|
|
43
|
+
| **`v04.02.00`** | Minor — bounded MCP session listing and cancellation semantics cleanup. |
|
|
44
|
+
| **`v04.01.01`** | Patch — release the hard-gate cleanup as a published package. |
|
|
45
|
+
| **`v04.01.00`** | Minor — security hardening of session-store concurrency, write-path DoS surface, and credential redaction. |
|
|
46
|
+
| **`v04.00.08`** | Patch — eliminate the recurring `js/file-access-to-http` CodeQL false positive at the source. |
|
|
47
|
+
| **`v04.00.07`** | Patch — bounded npm registry fetch in the post-publish verifier. |
|
|
48
|
+
| **`v04.00.06`** | Patch — Windows-safe registry verifier. |
|
|
49
|
+
| **`v04.00.05`** | Patch — hard-gate close-out for the Codex v4.0.4 audit. |
|
|
50
|
+
| **`v04.00.04`** | Patch — restore prettier coverage of `src/` and `scripts/` (close audit on v4.0.3 hard-gate gap). |
|
|
51
|
+
| **`v04.00.02`** | Patch — Codex second-pass audit close-out (6 findings). |
|
|
52
|
+
| **`v04.00.01`** | Patch — close-out of post-v4.0.0 audit (eight surfaces left stale by the rename bulk-replace). |
|
|
53
|
+
| **`v04.00.00`** | Major — project renamed to `cross-review` |
|
|
54
|
+
| **`v03.07.05`** | Patch — logs+sessions study 2026-05-15 close-out (4 surgical fixes from 244-session/429-round corpus). |
|
|
55
|
+
| **`v03.07.03`**      | Patch — "no fallback means no fallback" directive + Codex v3.7.2 opinion residuals.                                                                                                                                |
|
|
56
|
+
| **`v03.07.02`** | Patch — Codex 3rd super-audit close-out of v3.7.1 |
|
|
57
|
+
| **`v03.07.01`** | Patch — Codex super-audit close-out of v3.7.0 |
|
|
58
|
+
| **`v03.07.00`** | Minor — Codex super-audit close-out 2026-05-14 |
|
|
59
|
+
| **`v03.06.00`** | Minor — observability + caller-discipline close-out 2026-05-14 |
|
|
60
|
+
| **`v03.05.00`** | Minor — Codex operational-report close-out 2026-05-14: 5 findings from sessions `f0db3970` + `df052926`. |
|
|
61
|
+
| **`v03.04.00`** | Minor — Perplexity multi-failure-mode close-out 2026-05-13: 3 coordinated fixes covering 7 production sessions Codex flagged (`51973fac`, `f72e597a`, `f9a19401`, `99d46a2b`, `00d92cce`, `59776026`, `0003b2fe`). |
|
|
62
|
+
| **`v03.03.00`**      | Minor — Caller peer-selection lock (operator directive 2026-05-12: "ALL AGENTS/PEERS ALWAYS PARTICIPATE, REGARDLESS OF THE CALLER'S CHOICE OR WISHES").                                                        |
|
|
63
|
+
| **`v03.02.00`** | Patch — Codex bug-report close-out 2026-05-12: three surgical fixes (Perplexity `<think>` parser + session-state invariant + orchestrator strict peers). |
|
|
64
|
+
| **`v03.01.00`** | Minor — Central config file (`config.json`). Eliminates ~700 redundant env-var declarations across the 7 MCP host configs. |
|
|
65
|
+
| **`v03.00.00`**      | Major — Perplexity joins the sextet. Quintet (5 peers) → sextet (6).                                                                                                                                               |
|
|
66
|
+
| **`v02.28.00`** | Minor — Cold-start hardening Part 3: Windows registry env-var lookup bulk-cached (3-7 s → ~100 ms). |
|
|
67
|
+
| **`v02.27.01`** | Patch — Cold-start hardening Part 2: lazy-load 5 provider SDKs + defer 6 startup sweeps to setTimeout(30s). |
|
|
68
|
+
| **`v02.27.00`** | Minor — Cold-start hardening Part 1: corrupted meta.json auto-quarantine + finalized-session auto-prune. |
|
|
69
|
+
| **`v02.26.01`** | Patch — `max_attached_evidence_chars` default raised 80_000 → 200_000 to fix multi-file evidence truncation. |
|
|
70
|
+
| **`v02.26.00`** | Minor — Full pricing-model schema: base + extended-tier + cache (read/write) + promo (limited-time discount), all env-configurable, graceful fallback when fields are absent or promo expires. |
|
|
71
|
+
| **`v02.25.01`** | Patch — `meta.json` corruption hotfix: `redact()` env-style pattern was crossing JSON-escape boundaries. |
|
|
72
|
+
| **`v02.25.00`** | Third deliberation mode `circular` joins `ship` and `review`. |
|
|
73
|
+
| **`v02.24.00`** | Evidence-provenance lock for the ship-mode relator (Codex bug report 2026-05-10). |
|
|
74
|
+
| **`v02.23.00`** | Anthropic empty-revision degenerate path detection. |
|
|
75
|
+
| **`v02.22.00`** | `session_doctor` drill-down + per-round cost telemetry + budget warning event. |
|
|
76
|
+
| **`v02.21.00`** | Cross-provider prompt caching across all 5 peers (OpenAI, Anthropic, Gemini, DeepSeek, Grok). |
|
|
77
|
+
| **`v02.18.08`** | Site sponsor card iteration. |
|
|
78
|
+
| **`v02.18.07`** | Patch — `site/index.html` visual identity refresh. |
|
|
79
|
+
| **`v02.18.06`** | Patch — Gemini API function-declaration compatibility for MCP tool inputSchemas. |
|
|
80
|
+
| **`v02.18.05`** | Patch — anti-drift smoke drivers for v2.18.4 audit closure (operator directive 2026-05-07). |
|
|
81
|
+
| **`v02.18.04`** | Patch — Codex external audit 2026-05-07 outcome: 6 surgical fixes (P1.1, P1.2, P1.3, P1.4, P2.1, P2.4). |
|
|
82
|
+
| **`v02.18.03`** | Patch — Gemini default pin bump `gemini-3.1-pro-preview` → `gemini-2.5-pro` (operator preference 2026-05-07; coordinated with cross-review-v1 v1.12.4). |
|
|
83
|
+
| **`v02.18.02`** | Tier 5 — Windows process-tree introspection (coordinated with cross-review-v1 v1.12.2). |
|
|
84
|
+
| **`v02.18.01`** | Hotfix: closes Dependabot security advisory GHSA-v2v4-37r5-5v8g (medium severity) — `ip-address` XSS in Address6 HTML-emitting methods. |
|
|
85
|
+
| **`v02.18.00`** | F1 caller capability tokens (coordinated with cross-review-v1 v1.11.0). |
|
|
86
|
+
| **`v02.17.00`** | HARD GATE — identity forgery rejection (operator directive 2026-05-05). |
|
|
87
|
+
| **`v02.16.00`** | Tribunal protocol repair plus operational doctor. |
|
|
88
|
+
| **`v02.15.01`** | `server_info` consensus visibility hotfix. |
|
|
89
|
+
| **`v02.15.00`** | Backlog bundle for operational judge controls. |
|
|
90
|
+
| **`v02.14.01`** | Grok reasoning model hotfix. |
|
|
91
|
+
| **`v02.14.00`** | Grok joins the tribunal. |
|
|
92
|
+
| **`v02.13.00`** | Lead meta-review drift fix. |
|
|
93
|
+
| **`v02.12.00`** | Shadow judge observability. |
|
|
94
|
+
| **`v02.11.00`** | Relator lottery plus shadow auto-wire. |
|
|
95
|
+
| **`v02.09.00`** | LLM evidence-judge pass. |
|
|
96
|
+
| **`v02.08.00`** | Per-peer health and Evidence Broker lifecycle. |
|
|
97
|
+
| **`v02.07.00`** | Evidence Broker. |
|
|
98
|
+
| **`v02.06.01`** | Fallback/recovery budget hard gate. |
|
|
99
|
+
| **`v02.06.00`** | Token-delta compaction plus v2.5 format hotfix bundle. |
|
|
100
|
+
| **`v02.05.00`** | Evidence and budget hardening pass. |
|
|
101
|
+
| **`v02.04.01`** | CI stub fail-fast hotfix. |
|
|
102
|
+
| **`v02.04.00`** | Audit-closure hardening pass. |
|
|
103
|
+
| **`v02.03.03`** | Prompt shielding and financial safety. |
|
|
104
|
+
| **`v02.03.02`** | CI-green README/docs cleanup. |
|
|
105
|
+
| **`v02.03.01`** | README organizational standardization. |
|
|
106
|
+
| **`v02.03.00`** | Provider-neutral `review_focus`. |
|
|
107
|
+
| **`v02.02.00`** | Provider token streaming. |
|
|
108
|
+
| **`v02.01.01`** | CodeQL and model-selection hardening. |
|
|
109
|
+
| **`v02.01.00`** | First stable `cross-review` release. |
|
|
110
|
+
| **`v02.00.04`** | Session event race hotfix. |
|
|
111
|
+
| **`v02.00.03`** | Background sessions and durable reports. |
|
|
112
|
+
| **`v02.00.02`** | Publishing and dashboard sanitization. |
|
|
113
|
+
| **`v02.00.01`** | Public npm/package metadata alignment. |
|
|
114
|
+
| **`v02.00.00`** | Development package line hardening. |
|
|
115
|
+
| **`v2.0.0-alpha.2`** | Durable session recovery alpha. |
|
|
116
|
+
| **`v2.0.0-alpha.1`** | Model attestation and store hardening alpha. |
|
|
117
|
+
| **`v2.0.0-alpha.0`** | Initial API/SDK-only MCP server. |
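
The release table uses zero-padded public tags such as `v04.02.02`, while the npm package carries the plain semver form (`4.2.2`). A minimal sketch of that mapping follows; the helper name `paddedTagToSemver` is hypothetical and is not part of the package. It deliberately returns `null` for the unpadded alpha tags (for example `v2.0.0-alpha.2`), which do not follow the padded scheme.

```typescript
// Hypothetical helper for illustration only; not an export of cross-review.
// Converts a zero-padded release tag (vMM.mm.pp) to an npm-style semver string.
function paddedTagToSemver(tag: string): string | null {
  const match = /^v(\d{2})\.(\d{2})\.(\d{2})$/.exec(tag);
  if (match === null) {
    // Not a padded tag (e.g. the early alpha releases).
    return null;
  }
  // Number() drops the leading zero from each two-digit component.
  return match
    .slice(1)
    .map((part) => String(Number(part)))
    .join(".");
}

console.log(paddedTagToSemver("v04.02.02"));
console.log(paddedTagToSemver("v2.0.0-alpha.2"));
```
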
|
|
111
118
|
|
|
112
119
|
## What It Does
|
|
113
120
|
|
|
114
121
|
`cross-review` is the stable API-first implementation of the cross-review
|
|
115
122
|
pattern. It orchestrates provider API clients (OpenAI/Codex, Anthropic/Claude,
|
|
116
|
-
Google Gemini, DeepSeek, and xAI/Grok) and provides an MCP-compatible server
|
|
117
|
-
surface.
|
|
123
|
+
Google Gemini, DeepSeek, xAI/Grok, and Perplexity Sonar) and provides an
|
|
124
|
+
MCP-compatible server surface.
|
|
118
125
|
|
|
119
126
|
Runtime calls are real provider calls by default. Stubs exist only for smoke
|
|
120
127
|
tests and CI when `CROSS_REVIEW_STUB=1`.
|
|
@@ -124,6 +131,7 @@ tests and CI when `CROSS_REVIEW_STUB=1`.
|
|
124
131
|
- Google Gen AI client library for Gemini.
|
|
125
132
|
- OpenAI-compatible DeepSeek API through the OpenAI client library.
|
|
126
133
|
- OpenAI-compatible xAI Grok API through the OpenAI client library.
|
|
134
|
+
- OpenAI-compatible Perplexity Sonar API through the OpenAI client library.
|
|
127
135
|
|
|
128
136
|
## Quick Start
|
|
129
137
|
|
|
@@ -165,11 +173,12 @@ variables. Example overrides (PowerShell):
|
|
165
173
|
[Environment]::SetEnvironmentVariable("CROSS_REVIEW_GROK_REASONING_EFFORT", "xhigh", "User")
|
|
166
174
|
```
|
|
167
175
|
|
|
168
|
-
For Grok, `GROK_API_KEY` is canonical. `grok-4-latest`, `grok-4.3`,
|
|
169
|
-
`grok-4.20`, and `grok-4.20-reasoning` use xAI automatic reasoning without an explicit
|
|
170
|
-
`reasoning.effort` field. `grok-4.20-multi-agent` accepts explicit
|
|
171
|
-
`reasoning.effort`; `low`/`medium` select 4 agents and `high`/`xhigh` select
|
|
172
|
-
16 agents.
|
|
176
|
+
For Grok, `GROK_API_KEY` is canonical. The default pin is `grok-4.3`, which
|
|
177
|
+
accepts explicit `reasoning.effort` through `high`; the adapter clamps the
|
|
178
|
+
shared effort scale before sending it. `grok-4-latest`, `grok-4.20`, and
|
|
179
|
+
`grok-4.20-reasoning` use xAI automatic reasoning in this runtime.
|
|
180
|
+
`grok-4.20-multi-agent` remains available as an explicit override for the
|
|
181
|
+
multi-agent variant.
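
The clamping described above can be sketched as follows. This is an illustration under stated assumptions, not the adapter's actual code: the `Effort` scale, the `EXPLICIT_EFFORT_MODELS` set, and the `clampEffort` name are hypothetical, and only the `xhigh` to `high` narrowing for `grok-4.3` is taken from the text.

```typescript
// Assumed shared effort scale; the real runtime may define more levels.
type Effort = "low" | "medium" | "high" | "xhigh";

// Assumption: models that take an explicit reasoning.effort field.
const EXPLICIT_EFFORT_MODELS = new Set<string>(["grok-4.3", "grok-4.20-multi-agent"]);

// Returns the effort value to send, or undefined when the field should be
// omitted so the model falls back to automatic reasoning.
function clampEffort(effort: Effort, model: string): Effort | undefined {
  if (!EXPLICIT_EFFORT_MODELS.has(model)) {
    return undefined;
  }
  if (model === "grok-4.3" && effort === "xhigh") {
    // grok-4.3 accepts explicit effort only through "high".
    return "high";
  }
  // Other explicit-effort models receive the value unchanged.
  return effort;
}

console.log(clampEffort("xhigh", "grok-4.3"));
console.log(clampEffort("xhigh", "grok-4.20-multi-agent"));
console.log(clampEffort("high", "grok-4.20"));
```

Returning `undefined` rather than a default keeps the "omit the field" case distinct from a real effort value, so the caller can skip the field entirely when building the request.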
|
|
173
182
|
|
|
174
183
|
Financial and budget controls are required for paid provider calls. Configure
|
|
175
184
|
these environment variables before running real sessions (example):
|
|
@@ -211,12 +220,28 @@ these environment variables before running real sessions (example):
|
|
211
220
|
- `session_sweep`
|
|
212
221
|
- `session_finalize`
|
|
213
222
|
|
|
214
|
-
## License
|
|
223
|
+
## Repository conventions
|
|
224
|
+
|
|
225
|
+
- **License**: [Apache-2.0](./LICENSE). See [NOTICE](./NOTICE) and [THIRDPARTY](./THIRDPARTY.md).
|
|
226
|
+
- **Security disclosure**: see [SECURITY.md](./SECURITY.md).
|
|
227
|
+
- **Code of conduct**: see [CODE_OF_CONDUCT.md](./CODE_OF_CONDUCT.md).
|
|
228
|
+
- **Changelog**: [CHANGELOG.md](./CHANGELOG.md).
|
|
229
|
+
- **Contributing**: see [CONTRIBUTING.md](./CONTRIBUTING.md).
|
|
230
|
+
- **Sponsorship**: see the repo's `Sponsor` button or the [central sponsor page](https://www.lcv.dev/sponsor).
|
|
231
|
+
- **Action pinning**: all GitHub Actions are pinned by full SHA, per the supply-chain hardening baseline.
|
|
232
|
+
- **Code owners**: [.github/CODEOWNERS](.github/CODEOWNERS).
|
|
233
|
+
|
|
234
|
+
## Links
|
|
215
235
|
|
|
216
|
-
Apache License 2.0 — see [LICENSE](./LICENSE) and [NOTICE](./NOTICE).
|
|
236
|
+
- Site: [https://cross-review.lcv.dev](https://cross-review.lcv.dev)
|
|
237
|
+
- npmjs.com: [https://www.npmjs.com/package/@lcv-ideas-software/cross-review](https://www.npmjs.com/package/@lcv-ideas-software/cross-review)
|
|
238
|
+
- GitHub: [https://github.com/LCV-Ideas-Software/cross-review](https://github.com/LCV-Ideas-Software/cross-review)
|
|
239
|
+
- Sponsors: [https://github.com/sponsors/LCV-Ideas-Software](https://github.com/sponsors/LCV-Ideas-Software)
|
|
240
|
+
|
|
241
|
+
## License
|
|
217
242
|
|
|
218
|
-
Copyright 2026 Leonardo Cardozo Vargas.
|
|
243
|
+
Apache-2.0. See [LICENSE](./LICENSE), [NOTICE](./NOTICE), and [THIRDPARTY](./THIRDPARTY.md).
|
|
219
244
|
|
|
220
245
|
---
|
|
221
246
|
|
|
222
|
-
<p align="center"><span style="font-size: 1.5em;"><strong>© LCV Ideas & Software</strong></span><br><sub>LEONARDO CARDOZO VARGAS TECNOLOGIA DA INFORMACAO LTDA<br>Rua Pais Leme, 215 Conj 1713 - Pinheiros<br>São Paulo - SP<br>CEP 05.424-150<br>CNPJ: 66.584.678/0001-77<br>IM 05.424-150</sub></p>
|
|
247
|
+
<p align="center"><span style="font-size: 1.5em;"><strong>Copyright © 2026 LCV Ideas & Software</strong></span><br><sub>LEONARDO CARDOZO VARGAS TECNOLOGIA DA INFORMACAO LTDA<br>Rua Pais Leme, 215 Conj 1713 - Pinheiros<br>São Paulo - SP - CEP 05424-150<br>CNPJ: 66.584.678/0001-77 - IM: 3039854</sub></p>
|