@openthink/stamp 1.3.0 → 1.4.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (5)

package/README.md +47 -0
package/dist/hooks/pre-receive.cjs.map +1 -1
package/dist/index.js +134 -28
package/dist/index.js.map +1 -1
package/package.json +1 -1

package/README.md CHANGED

@@ -292,6 +292,53 @@ models pinned, each operator records their own verdict in their own
 state.db (same as today's reviewer-prompt model). Stamp does not assume
 verdicts are model-portable.
+### Reviewer execution budgets
+Each reviewer subprocess runs under bounds that can be set in three
+places, narrowest-wins: per-reviewer fields in `.stamp/config.yml`
+(committed policy, hashed into the attestation), operator env vars on
+the calling shell (per-shell, not committed), or the built-in default.
+| Knob | Env var (default) | `.stamp/config.yml` field | What it caps |
+|---|---|---|---|
+| Turn cap | `STAMP_REVIEWER_MAX_TURNS` (`8`) | `reviewers.<name>.max_turns` | Model/tool round-trips. Hitting it surfaces as `reviewer "<name>" run failed (subtype=error_max_turns) — turn trace at <path>; raise STAMP_REVIEWER_MAX_TURNS or set reviewers.<name>.max_turns to extend it`. |
+| Wall-clock | `STAMP_REVIEWER_TIMEOUT_MS` (`300000`) | `reviewers.<name>.timeout_ms` | Time per reviewer. Hitting it aborts the SDK call and writes a turn trace. |
+| Diff size | `STAMP_REVIEW_DIFF_CAP_BYTES` (`204800`) | — (operator-side only) | Per-reviewer diff size; bypass per-invocation with `--allow-large`. Lives here because diff size is operator-bounded input rather than per-reviewer execution policy. |
+The defaults are tight enough that a pathological reviewer gives up in
+single-digit minutes rather than racking up Anthropic spend silently.
+Reach for the committed `.stamp/config.yml` form when one reviewer
+legitimately needs headroom (e.g. a `product` reviewer that does Linear
+ticket reconciliation) but raising the global env would over-budget the
+others; reach for the env vars for ad-hoc operator overrides.
+```yaml
+# .stamp/config.yml — example: heavy product reviewer
+reviewers:
+  security:  { prompt: .stamp/reviewers/security.md }
+  standards: { prompt: .stamp/reviewers/standards.md }
+  product:
+    prompt: .stamp/reviewers/product.md
+    max_turns: 20
+    timeout_ms: 600000
+```
+```sh
+# Operator-side global override for a one-off ad-hoc run
+STAMP_REVIEWER_MAX_TURNS=20 STAMP_REVIEWER_TIMEOUT_MS=600000 \
+  stamp review --diff main..HEAD
+```
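The narrowest-wins precedence described in the README text can be sketched as a small resolver. This is a minimal illustration, not stamp's actual implementation: `resolveBudget`, `envInt`, `ReviewerConfig`, and `Budget` are hypothetical names, and it assumes that a per-reviewer field beats the env var, which beats the built-in default. Treating an unparsable env value as unset is also an assumption.

```typescript
// Hypothetical sketch of narrowest-wins budget resolution.
// None of these names come from the stamp codebase.
type Env = Record<string, string | undefined>;

interface ReviewerConfig {
  max_turns?: number;   // reviewers.<name>.max_turns
  timeout_ms?: number;  // reviewers.<name>.timeout_ms
}

interface Budget {
  maxTurns: number;
  timeoutMs: number;
}

// Built-in defaults as listed in the table above.
const DEFAULTS: Budget = { maxTurns: 8, timeoutMs: 300000 };

// Read a positive integer from an env map; undefined if unset or invalid.
// (Ignoring invalid values is an assumption made for this sketch.)
function envInt(env: Env, name: string): number | undefined {
  const raw = env[name];
  if (raw === undefined) return undefined;
  const n = Number(raw);
  return Number.isInteger(n) && n > 0 ? n : undefined;
}

// Per-reviewer config field first, then env var, then default.
function resolveBudget(reviewer: ReviewerConfig, env: Env): Budget {
  return {
    maxTurns:
      reviewer.max_turns ??
      envInt(env, "STAMP_REVIEWER_MAX_TURNS") ??
      DEFAULTS.maxTurns,
    timeoutMs:
      reviewer.timeout_ms ??
      envInt(env, "STAMP_REVIEWER_TIMEOUT_MS") ??
      DEFAULTS.timeoutMs,
  };
}
```

Under these assumptions, the `product` reviewer from the YAML example keeps its 20 turns even when the operator exports a lower `STAMP_REVIEWER_MAX_TURNS`, while `security` and `standards` follow the env var or the default.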
+When a reviewer trips the turn cap or the wall-clock timeout, a
+structured turn trace is written to
+`<repoRoot>/.git/stamp/failed-runs/<unix-ms>-<reviewer>.log` (JSON, mode
+`0600`, parent directory `0700`). It lists the tool-call sequence the
+reviewer made before failing, with input hashes, never raw model prose or
+unhashed inputs. Use it to distinguish a looping prompt from a legitimately
+under-budgeted reviewer. `stamp prune --older-than <dur>` walks both
+`failed-runs/` and `failed-parses/`. See
+[`docs/troubleshooting.md`](./docs/troubleshooting.md) for the full runbook.
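The trace filename pattern `<unix-ms>-<reviewer>.log` is enough to tally failures per reviewer without opening the files. The sketch below is illustrative only: `parseTraceName` and `countByReviewer` are hypothetical helpers, not part of stamp, and it assumes the timestamp is all digits and that everything after the first hyphen is the reviewer name. It does not read the JSON body, whose schema the README does not spell out.

```typescript
// Hypothetical helpers for working with failed-run trace filenames.
interface TraceName {
  timestampMs: number; // the <unix-ms> prefix
  reviewer: string;    // the <reviewer> suffix, without ".log"
}

// Split "<unix-ms>-<reviewer>.log" into its parts; null if it does not match.
function parseTraceName(fileName: string): TraceName | null {
  if (!fileName.endsWith(".log")) return null;
  const stem = fileName.slice(0, -".log".length);
  const dash = stem.indexOf("-");
  // Need a non-empty timestamp before the dash and a non-empty name after it.
  if (dash <= 0 || dash === stem.length - 1) return null;
  const ts = stem.slice(0, dash);
  if (!/^\d+$/.test(ts)) return null;
  return { timestampMs: Number(ts), reviewer: stem.slice(dash + 1) };
}

// Count how many trace files each reviewer has left behind.
function countByReviewer(fileNames: string[]): Map<string, number> {
  const counts = new Map<string, number>();
  for (const name of fileNames) {
    const parsed = parseTraceName(name);
    if (parsed === null) continue; // skip files that do not follow the pattern
    counts.set(parsed.reviewer, (counts.get(parsed.reviewer) ?? 0) + 1);
  }
  return counts;
}
```

Feeding it a directory listing of `failed-runs/` gives a quick view of which reviewer fails most often, which is a reasonable starting point before reading individual traces.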
 ## Deployment shapes
 Three ways to run stamp-cli in a real setting, trading setup cost for