npm - @verica-app/cli - Versions diffs - 0.1.4 → 0.1.5 - Mend

@verica-app/cli 0.1.4 → 0.1.5

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (3)

package/README.md CHANGED

@@ -152,8 +152,12 @@ also answers `200` instead of `202`).
 - `--reuse-same-ref` — only reuse a run on the **same git ref**. Off by default: an
   identical config produces the same output distribution regardless of branch.
 - Only **completed** runs are reused (never a partial/failed one).
-- Incompatible with `--threshold` / `--baseline-ref` / `--baseline-run` — a reused
-  verdict was frozen under its own gate, so it can't honor a new one.
+- Incompatible with `--threshold` / `--baseline-ref` / `--baseline-run`. Reuse hands
+  back a _prior_ run's verdict, frozen under the gate that applied when it ran, so a
+  new `--threshold` can't be recomputed against it. `--baseline-ref` is worse than
+  stale: no-regression compares against the _last run on the ref_ — a moving target —
+  so a cached verdict can never be a fresh no-regression check. Gate on either → run
+  fresh (omit reuse).
 Omit `--reuse-if-unchanged` (the default) any time you want a guaranteed fresh run.
@@ -176,8 +180,8 @@ During `0.x` the **minor** version is the breaking lever, so pin accordingly:
 We bump the **minor** for any breaking change (flags, output shapes, push behavior) and
 the **patch** for additive features and fixes. **1.0** will freeze the commands, flags,
-exit codes, and output shapes under standard semver. See
-[CHANGELOG.md](./CHANGELOG.md) for what changed in each release.
+exit codes, and output shapes under standard semver. See the bundled `CHANGELOG.md`
+for what changed in each release.
 MIT licensed. There's no IP in the client — the engine, graders, gate, and crypto all
 run server-side behind the token API.

package/dist/cli.js CHANGED

@@ -4110,7 +4110,9 @@ var runRequestSchema = external_exports.object({
    * default — an eval's output isn't a pure function of its config (generation +
    * judge are non-deterministic, the model endpoint drifts), so reuse is always
    * the caller's explicit choice and is bounded by `maxAgeHours`. Incompatible
-   * with `gate` (the cached verdict was frozen under the old gate).
+   * with `gate`: a reused verdict is frozen under its own gate (a new threshold
+   * can't be recomputed), and no-regression compares against a moving baseline
+   * (the last run on the ref), so a cache can never be a fresh gated check.
    */
   reuse: external_exports.object({
     /** Turn reuse on. The trigger — everything else is just tuning. */
@@ -4718,7 +4720,7 @@ async function main() {
   }
   if (values["reuse-if-unchanged"] && (threshold !== void 0 || values["baseline-ref"] !== void 0 || values["baseline-run"] !== void 0)) {
     throw new Error(
-      "--reuse-if-unchanged cannot be combined with --threshold / --baseline-ref / --baseline-run (a reused verdict was frozen under the prior gate)."
+      "--reuse-if-unchanged cannot be combined with --threshold / --baseline-ref / --baseline-run: a reused verdict is frozen under its own gate, and no-regression compares against a moving baseline \u2014 neither can be recomputed. Gate on these? Run fresh (drop --reuse-if-unchanged)."
     );
   }
   const opts = {
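
In the hunk above, the guard condition is unchanged context; only the error message differs between 0.1.4 and 0.1.5. As a rough illustration of that guard in isolation, here is a minimal sketch. The helper name `assertReuseCompatible` and the plain `flags` object are hypothetical, introduced only for this example; the real CLI reads a separately parsed `threshold` variable and a `values` object, as the hunk shows.

```javascript
// Hypothetical standalone version of the flag-conflict guard shown in the hunk.
// `assertReuseCompatible` and the `flags` shape are assumptions for illustration,
// not part of the @verica-app/cli API.
function assertReuseCompatible(flags) {
  // Gate-related flags that the diff lists as incompatible with reuse.
  const gateFlags = ["threshold", "baseline-ref", "baseline-run"];

  // Collect whichever gate flags the caller actually set.
  const conflicting = gateFlags.filter((name) => flags[name] !== undefined);

  // Reuse plus any gate flag is rejected: a cached verdict cannot be
  // re-evaluated under a new gate.
  if (flags["reuse-if-unchanged"] && conflicting.length > 0) {
    const listed = conflicting.map((name) => `--${name}`).join(" / ");
    throw new Error(
      `--reuse-if-unchanged cannot be combined with ${listed}. ` +
        "Run fresh (drop --reuse-if-unchanged) to gate on these."
    );
  }
}
```

Unlike the shipped message, this sketch names only the flags that were actually passed, which is a design choice of the example rather than something the package does.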

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "@verica-app/cli",
-  "version": "0.1.4",
+  "version": "0.1.5",
   "private": false,
   "description": "Run a Verica eval from CI and block the merge on the result.",
   "license": "MIT",