@openthink/stamp 1.3.1 → 1.4.0

package/README.md CHANGED
@@ -294,30 +294,50 @@ verdicts are model-portable.
 
  ### Reviewer execution budgets
 
- Each reviewer subprocess runs under two bounds, set on the operator's
- machine via env vars (they are operator infrastructure, not committed
- policy; different operators on the same repo can pick different values):
+ Each reviewer subprocess runs under bounds that can be set in three
+ places, narrowest-wins: per-reviewer fields in `.stamp/config.yml`
+ (committed policy, hashed into the attestation), operator env vars on
+ the calling shell (per-shell, not committed), or the built-in default.
 
- | Env var | Default | What it caps |
- |---|---|---|
- | `STAMP_REVIEWER_MAX_TURNS` | `8` | Model/tool round-trips per reviewer call. Hitting it surfaces as `reviewer "<name>" run failed (subtype=error_max_turns)`. |
- | `STAMP_REVIEWER_TIMEOUT_MS` | `300000` (5 min) | Wall-clock budget per reviewer. Hitting it surfaces as `reviewer "<name>" exceeded <N>ms wall-clock budget raise STAMP_REVIEWER_TIMEOUT_MS to extend it`. |
+ | Knob | Env var (default) | `.stamp/config.yml` field | What it caps |
+ |---|---|---|---|
+ | Turn cap | `STAMP_REVIEWER_MAX_TURNS` (`8`) | `reviewers.<name>.max_turns` | Model/tool round-trips. Hitting it surfaces as `reviewer "<name>" run failed (subtype=error_max_turns) — turn trace at <path>; raise STAMP_REVIEWER_MAX_TURNS or set reviewers.<name>.max_turns to extend it`. |
+ | Wall-clock | `STAMP_REVIEWER_TIMEOUT_MS` (`300000`) | `reviewers.<name>.timeout_ms` | Time per reviewer. Hitting it aborts the SDK call and writes a turn trace. |
+ | Diff size | `STAMP_REVIEW_DIFF_CAP_BYTES` (`204800`) | — (operator-side only) | Per-reviewer diff size; bypass per-invocation with `--allow-large`. Lives here because diff size is operator-bounded input rather than per-reviewer execution policy. |
 
  The defaults are tight enough that a pathological reviewer gives up in
  single-digit minutes rather than racking up Anthropic spend silently.
- Raise them when a reviewer with legitimately heavy lookup tools (Linear
- MCP, multi-file `Read`, ticket reconciliation) repeatedly trips the cap
- on a non-trivial diff. Example:
+ Reach for the committed `.stamp/config.yml` form when one reviewer
+ legitimately needs headroom (e.g. a `product` reviewer that does Linear
+ ticket reconciliation) but raising the global env would over-budget the
+ others; reach for the env vars for ad-hoc operator overrides.
+
+ ```yaml
+ # .stamp/config.yml — example: heavy product reviewer
+ reviewers:
+   security: { prompt: .stamp/reviewers/security.md }
+   standards: { prompt: .stamp/reviewers/standards.md }
+   product:
+     prompt: .stamp/reviewers/product.md
+     max_turns: 20
+     timeout_ms: 600000
+ ```
 
  ```sh
+ # Operator-side global override for a one-off ad-hoc run
  STAMP_REVIEWER_MAX_TURNS=20 STAMP_REVIEWER_TIMEOUT_MS=600000 \
    stamp review --diff main..HEAD
  ```
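
Editorial sketch (not part of the package diff): the "narrowest-wins" order described in the new README text could be resolved roughly as below. This is a minimal illustration, not stamp's actual implementation. It assumes "narrowest-wins" means a per-reviewer field in `.stamp/config.yml` beats the env var, which beats the built-in default. The names `resolveBudget`, `ReviewerConfig`, `Budget`, and `parsePositiveInt` are hypothetical, and ignoring malformed env values is an assumption; only the env var names, config field names, and default values come from the README.

```typescript
// Hypothetical sketch of "narrowest-wins" budget resolution; not stamp's real code.
interface ReviewerConfig {
  max_turns?: number; // reviewers.<name>.max_turns in .stamp/config.yml
  timeout_ms?: number; // reviewers.<name>.timeout_ms in .stamp/config.yml
}

interface Budget {
  maxTurns: number;
  timeoutMs: number;
}

// Built-in defaults as documented in the README table.
const DEFAULTS: Budget = { maxTurns: 8, timeoutMs: 300000 };

// Parse a positive integer from an env value; undefined when unset or malformed.
// Assumption: malformed values are ignored rather than treated as errors.
function parsePositiveInt(raw: string | undefined): number | undefined {
  if (raw === undefined) return undefined;
  const trimmed = raw.trim();
  if (!/^\d+$/.test(trimmed)) return undefined;
  const n = Number(trimmed);
  return n > 0 ? n : undefined;
}

// Per-reviewer config field first, then operator env var, then default.
function resolveBudget(
  reviewer: ReviewerConfig,
  env: Record<string, string | undefined>,
): Budget {
  return {
    maxTurns:
      reviewer.max_turns ??
      parsePositiveInt(env["STAMP_REVIEWER_MAX_TURNS"]) ??
      DEFAULTS.maxTurns,
    timeoutMs:
      reviewer.timeout_ms ??
      parsePositiveInt(env["STAMP_REVIEWER_TIMEOUT_MS"]) ??
      DEFAULTS.timeoutMs,
  };
}
```

Under these assumptions, the `product` reviewer from the YAML example would resolve to 20 turns and 600000 ms regardless of the shell environment, while `security` and `standards` would follow the env vars when set and the defaults otherwise.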
 
- If a reviewer trips the cap consistently on small diffs too, the prompt
- is probably looping rather than working — diagnose before raising the
- budget. See [`docs/troubleshooting.md`](./docs/troubleshooting.md) for
- the runbook.
+ When a reviewer trips the cap, a structured turn trace is written to
+ `<repoRoot>/.git/stamp/failed-runs/<unix-ms>-<reviewer>.log` (mode
+ `0600`, parent `0700`, JSON; lists the tool-call sequence and input
+ hashes that the reviewer made before failure — never raw model prose
+ or unhashed inputs). Use it to distinguish a looping prompt from a
+ legitimately under-budgeted reviewer. `stamp prune --older-than <dur>`
+ walks both `failed-runs/` and `failed-parses/`. See
+ [`docs/troubleshooting.md`](./docs/troubleshooting.md) for the full
+ runbook.
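
Editorial sketch (not part of the package diff): the documented trace file name, `<unix-ms>-<reviewer>.log`, is enough to see which reviewers trip their caps and how often, before opening any trace. The helpers below are hypothetical and are not part of stamp. They assume the timestamp is a run of digits and that a reviewer name may itself contain hyphens; they deliberately do not read the JSON body, whose schema is not described here.

```typescript
// Hypothetical helpers for triaging failed-run traces by file name only.
interface FailedRun {
  timestampMs: number; // the <unix-ms> prefix
  reviewer: string; // the <reviewer> part, hyphens allowed
}

// Split "<unix-ms>-<reviewer>.log"; returns undefined for anything else.
function parseFailedRunName(fileName: string): FailedRun | undefined {
  const match = /^(\d+)-(.+)\.log$/.exec(fileName);
  if (!match) return undefined;
  return { timestampMs: Number(match[1]), reviewer: match[2] };
}

// Group trace timestamps by reviewer, newest first.
function groupByReviewer(fileNames: string[]): Map<string, number[]> {
  const byReviewer = new Map<string, number[]>();
  for (const name of fileNames) {
    const run = parseFailedRunName(name);
    if (!run) continue; // skip files that do not match the pattern
    const list = byReviewer.get(run.reviewer) ?? [];
    list.push(run.timestampMs);
    byReviewer.set(run.reviewer, list);
  }
  byReviewer.forEach((list) => list.sort((a, b) => b - a));
  return byReviewer;
}
```

A reviewer that appears many times in the result, including on small diffs, is a candidate for the looping-prompt diagnosis; a reviewer that appears only on large changes is more likely under-budgeted.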
 
  ## Deployment shapes