@bookedsolid/rea 0.9.0 → 0.9.1

package/README.md CHANGED
@@ -2,14 +2,15 @@
2
2
 
3
3
  **Agentic governance layer for Claude Code — policy enforcement, hook-based safety gates, audit logging, and Codex-integrated adversarial review.**
4
4
 
5
- [![npm version](https://img.shields.io/badge/npm-pending-lightgrey)](https://www.npmjs.com/package/@bookedsolid/rea)
6
- [![CI](https://img.shields.io/badge/ci-pending-lightgrey)](https://github.com/bookedsolidtech/rea/actions)
7
- [![provenance](https://img.shields.io/badge/npm%20provenance-pending-lightgrey)](https://docs.npmjs.com/generating-provenance-statements)
5
+ [![npm version](https://img.shields.io/npm/v/%40bookedsolid%2Frea?color=cb3837&label=npm)](https://www.npmjs.com/package/@bookedsolid/rea)
6
+ [![CI](https://github.com/bookedsolidtech/rea/actions/workflows/ci.yml/badge.svg?branch=main)](https://github.com/bookedsolidtech/rea/actions/workflows/ci.yml)
7
+ [![npm provenance](https://img.shields.io/badge/npm%20provenance-attested-blue?logo=npm)](https://docs.npmjs.com/generating-provenance-statements)
8
8
  [![license](https://img.shields.io/badge/license-MIT-blue)](./LICENSE)
9
9
  [![DCO](https://img.shields.io/badge/DCO-required-green)](https://developercertificate.org/)
10
10
  [![Node](https://img.shields.io/badge/node-%3E%3D22-brightgreen)](https://nodejs.org/)
11
11
 
12
- > Status: 0.0.x, pre-release. Badges are placeholders until the first publish.
12
+ > Status: `0.9.x` published to npm with provenance. See
13
+ > [CHANGELOG.md](./CHANGELOG.md) for the per-release history.
13
14
 
14
15
  ---
15
16
 
@@ -31,18 +32,38 @@ Node 22+ and pnpm 9+ required.
31
32
  REA is a governance layer for Claude Code. It is a single npm package that
32
33
  ships four things:
33
34
 
34
- 1. A **hook layer** — 11 shell scripts wired into Claude Code's `PreToolUse`
35
- and `PostToolUse` events. Hooks enforce secret scanning, dangerous-command
36
- interception, blocked-path protection, settings protection, attribution
37
- rejection, and commit/push review gates.
35
+ 1. A **hook layer** — 14 shell scripts total. 12 are registered in the
36
+ shipped `.claude/settings.json` and fire on Claude Code's `PreToolUse`
37
+ / `PostToolUse` events (secret scanning, dangerous-command
38
+ interception, blocked-path protection, settings protection,
39
+ attribution rejection, env-file protection, disclosure-policy
40
+ routing, dependency audit, changeset security, architecture advisory,
41
+ PR-issue-link advisory, and the Claude-Code push-review adapter).
42
+ One more shipped hook, `commit-review-gate.sh`, is a Claude
43
+ `PreToolUse: Bash` hook that matches `git commit` — it is shipped
44
+ ready-to-wire but intentionally NOT registered in the default
45
+ `.claude/settings.json`, so operators who want commit-time review can
46
+ opt in by adding a rule. The final script,
47
+ `push-review-gate-git.sh`, is a thin native-git adapter that sources
48
+ `hooks/_lib/push-review-core.sh` (the same shared core used by the
49
+ Claude-Code push-review adapter), so a fix to the push-review logic
50
+ lands in one place. It ships for consumers who manually configure
51
+ a wrapper-based `.husky/pre-push` (and as scaffolding for a future
52
+ installer revision). The default `rea init` installer emits a
53
+ standalone inline `.husky/pre-push` body instead of wiring the
54
+ adapter — see the Hooks section for details.
38
55
  2. A **gateway layer** — an MCP server (`rea serve`) that proxies downstream
39
56
  MCP servers through a middleware chain. Every tool call — native or
40
57
  proxied — is classified, policy-checked, redacted, audited, and
41
- size-capped before it executes.
42
- 3. A **policy runtime** `.rea/policy.yaml` with strict zod-validated
58
+ size-capped before it executes. The gateway also supervises downstream
59
+ child processes: unexpected deaths are detected eagerly, the circuit
60
+ breaker never reuses a zombie client, and a `SESSION_BLOCKER` audit
61
+ event fires when a downstream crosses the per-session failure threshold.
62
+ 3. A **policy runtime** — `.rea/policy.yaml` with a strict zod-validated
43
63
  schema. Defines autonomy level, a hard ceiling (`max_autonomy_level`),
44
- blocked paths, attribution rules, context protection, and optional
45
- Discord notification webhook.
64
+ blocked paths, attribution rules, context protection, redaction and
65
+ injection tuning, review/cache knobs, and an optional Discord
66
+ notification webhook.
46
67
  4. A **kill switch** — `.rea/HALT` is a single file. If it exists, every
47
68
  tool call is denied at the middleware and hook layers. Use
48
69
  `rea freeze --reason "..."` to create it and `rea unfreeze --reason "..."`
@@ -69,7 +90,8 @@ to build a separate package that composes with REA.
69
90
  no `rea stop`, no systemd unit. A short-lived `.rea/serve.pid`
70
91
  breadcrumb is written at startup so `rea status` can detect a live
71
92
  gateway — it is removed on graceful shutdown and never used for
72
- locking or lifecycle management.
93
+ locking or lifecycle management. A per-session `.rea/serve.state.json`
94
+ snapshot accompanies it for live per-downstream introspection.
73
95
  - **Not a hosted service.** There is no REA Cloud, no SaaS tier, no
74
96
  multi-token workstreams, no workload isolation platform.
75
97
  - **Not a 70-agent roster.** 10 curated agents ship in the package. Four
@@ -130,10 +152,16 @@ its own.
130
152
  rea doctor
131
153
  ```
132
154
 
133
- `rea doctor` checks hook coverage, policy parse, husky commit-msg hook
134
- install, `.mcp.json` gateway wiring, Codex plugin availability, and the
135
- integrity of the audit hash chain. It returns a pass/fail summary with
136
- specific remediation hints.
155
+ `rea doctor` checks `.rea/` directory presence, policy parse, registry
156
+ parse, curated-agent presence, hook coverage, `.claude/settings.json`
157
+ wiring, commit-msg / pre-push git hooks, Codex CLI + agent availability
158
+ (when `codex_required: true`), and the TOFU fingerprint store. It
159
+ returns a pass/fail summary with specific remediation hints. In non-git
160
+ directories (knowledge repos, docs-only projects) the commit-msg and
161
+ pre-push checks are skipped cleanly — REA governs policy and injection
162
+ detection there, not pushes. Audit hash-chain integrity is verified by
163
+ a separate command — `rea check` (on-disk tail) or the full replay
164
+ verifier — not by `rea doctor`.
137
165
 
138
166
  ### 4. Watch the running gateway
139
167
 
@@ -144,10 +172,31 @@ rea status --json # JSON — pipe to jq
144
172
 
145
173
  `rea status` is the live-process view. It reads the pidfile written by
146
174
  `rea serve`, verifies the pid is alive, and surfaces the session id,
147
- policy summary (profile, autonomy, HALT state), and audit stats (lines,
148
- last timestamp, whether the tail record's hash looks well-formed). Use
149
- `rea check` when you want the pure on-disk view without probing for a
150
- live process.
175
+ policy summary (profile, autonomy, HALT state), audit stats (lines,
176
+ last timestamp, whether the tail record's hash looks well-formed), and
177
+ as of 0.9.0 a **per-downstream live block** sourced from
178
+ `.rea/serve.state.json`. Each downstream entry includes:
179
+
180
+ | Field | Type | Meaning |
181
+ | --------------------------- | ------------------------------------ | --------------------------------------------------------------- |
182
+ | `name` | string | Registry server name |
183
+ | `connected` | boolean | MCP client currently holds an open stdio transport |
184
+ | `healthy` | boolean | Gateway considers the server safe to route calls to |
185
+ | `circuit_state` | `closed` \| `open` \| `half-open` | Current breaker position |
186
+ | `retry_at` | ISO timestamp \| `null` | Next allowed half-open probe, when `open` |
187
+ | `last_error` | string \| `null` | Bounded, redacted diagnostic from the most recent failure |
188
+ | `tools_count` | integer \| `null` | Tool count from the last successful `tools/list` |
189
+ | `open_transitions` | integer | Cumulative circuit-open events in this session |
190
+ | `session_blocker_emitted` | boolean | Whether `SESSION_BLOCKER` has fired for this server yet |
191
+
192
+ `.rea/serve.state.json` is the authoritative live source — it is written
193
+ atomically (temp+rename) on every circuit transition and supervisor
194
+ event, debounced through a 250 ms trailing timer so a flap storm can't
195
+ spam disk. State files written by a pre-0.9.0 gateway degrade gracefully:
196
+ `downstreams` surfaces as `null` with a hint to upgrade.
197
+
198
+ Use `rea check` when you want the pure on-disk view (policy + HALT +
199
+ tail audit) without probing for a live process.
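The write discipline described above for `.rea/serve.state.json` (temp file plus rename, behind a 250 ms trailing debounce) can be illustrated with a minimal sketch. This is not the gateway's actual code: `writeJsonAtomic` and `DebouncedSnapshotWriter` are hypothetical names, and the temp-file naming scheme is an assumption.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical helper: write the JSON to a temp file in the same directory,
// then rename it over the target so readers never observe a partial file.
function writeJsonAtomic(target: string, value: unknown): void {
  const tmp = `${target}.${process.pid}.tmp`; // assumed naming scheme
  fs.writeFileSync(tmp, JSON.stringify(value, null, 2));
  fs.renameSync(tmp, target);
}

// Hypothetical trailing debounce: each schedule() call restarts the timer,
// and only the most recent snapshot is written when the timer fires.
class DebouncedSnapshotWriter<T> {
  private timer: ReturnType<typeof setTimeout> | null = null;
  private pending: T | null = null;
  private readonly target: string;
  private readonly delayMs: number;
  public writes = 0;

  constructor(target: string, delayMs = 250) {
    this.target = target;
    this.delayMs = delayMs;
  }

  schedule(snapshot: T): void {
    this.pending = snapshot;
    if (this.timer !== null) clearTimeout(this.timer);
    this.timer = setTimeout(() => this.flush(), this.delayMs);
  }

  // Write the pending snapshot immediately (also used on shutdown in this sketch).
  flush(): void {
    if (this.timer !== null) {
      clearTimeout(this.timer);
      this.timer = null;
    }
    if (this.pending === null) return;
    writeJsonAtomic(this.target, this.pending);
    this.pending = null;
    this.writes += 1;
  }
}
```

In this sketch a burst of `schedule()` calls inside the debounce window collapses into a single rename, which is the property that keeps a flap storm from spamming disk.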
151
200
 
152
201
  ### 5. Optional Prometheus `/metrics` endpoint
153
202
 
@@ -172,6 +221,39 @@ Set `REA_LOG_LEVEL=debug` for verbose gateway logs; the default is
172
221
  `info`. Records are JSON lines on a non-TTY stderr and pretty-printed
173
222
  on an interactive terminal.
174
223
 
224
+ ### 6. Ask the gateway how it's doing — `__rea__health`
225
+
226
+ The gateway advertises a single built-in tool, `__rea__health`, in
227
+ every `listTools` response regardless of downstream state. Calling it
228
+ returns a snapshot of gateway version, uptime, HALT state, policy
229
+ summary, and per-downstream health. The handler **short-circuits the
230
+ middleware chain** — it is callable under HALT and at any autonomy
231
+ level — because it is the tool an operator reaches for when everything
232
+ else is frozen. Every invocation still writes an audit record.
233
+
234
+ The wire response is **sanitized by default**: `halt_reason` and
235
+ `downstreams[].last_error` surface as `null`. Full diagnostic detail
236
+ lives in the audit record's metadata (`halt_reason`,
237
+ `downstream_errors[]`) — local disk, hash-chained, not
238
+ LLM-reachable — which is the right sink for trusted-operator text.
239
+
240
+ Operators who genuinely need error strings on the MCP wire can opt in:
241
+
242
+ ```yaml
243
+ # .rea/policy.yaml
244
+ gateway:
245
+ health:
246
+ expose_diagnostics: true
247
+ ```
248
+
249
+ Opt-in mode still runs the full sanitizer pass: `redactSecrets` replaces
250
+ known secret patterns with `[REDACTED:*]`, `classifyInjection` replaces
251
+ any non-`clean` diagnostic string (verdicts `suspicious` or
252
+ `likely_injection`) with the exported `INJECTION_REDACTED_PLACEHOLDER`
253
+ token — the literal string `<redacted: suspected injection>` — and
254
+ oversize values are bounded before scanning so an adversarial downstream
255
+ can't DoS the tool with a multi-megabyte error.
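As a rough illustration of the opt-in sanitizer pass described above, the sketch below bounds the value first, then redacts, then classifies. Only the placeholder string and the three verdict names come from the text; the size limit, the single secret pattern, the heuristics, and the `*Sketch` function names are assumptions standing in for the real `redactSecrets` and `classifyInjection`.

```typescript
// Placeholder literal as documented in the README text.
const INJECTION_REDACTED_PLACEHOLDER = "<redacted: suspected injection>";

// Assumed bound; the real limit is not stated in the source.
const MAX_DIAGNOSTIC_CHARS = 512;

type Verdict = "clean" | "suspicious" | "likely_injection";

// Stand-in for redactSecrets: one illustrative pattern only.
function redactSecretsSketch(text: string): string {
  return text.replace(/\bAKIA[0-9A-Z]{16}\b/g, "[REDACTED:aws-access-key]");
}

// Stand-in for classifyInjection: toy heuristics, not the shipped classifier.
function classifyInjectionSketch(text: string): Verdict {
  if (/ignore (all )?previous instructions/i.test(text)) return "likely_injection";
  if (/\byou must now\b/i.test(text)) return "suspicious";
  return "clean";
}

function sanitizeDiagnostic(raw: string | null): string | null {
  if (raw === null) return null;
  // Bound before scanning so an oversized error cannot make the scan expensive.
  const bounded = raw.length > MAX_DIAGNOSTIC_CHARS ? raw.slice(0, MAX_DIAGNOSTIC_CHARS) : raw;
  const redacted = redactSecretsSketch(bounded);
  // Any non-clean verdict replaces the whole string with the placeholder.
  return classifyInjectionSketch(redacted) === "clean" ? redacted : INJECTION_REDACTED_PLACEHOLDER;
}
```

The ordering of redaction and classification here is a guess. The one ordering the text does state is that bounding happens before scanning.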
256
+
175
257
  ## Architecture
176
258
 
177
259
  ### Middleware chain
@@ -192,12 +274,12 @@ tool call
192
274
  │ rate-limit — token bucket per server │
193
275
  │ circuit-breaker — trip on downstream failure │
194
276
  │ redact (args) — secrets in arguments │
195
- │ injection — prompt-injection heuristics │
196
277
  │ │
197
278
  │ ==== EXECUTE ==== │
198
279
  │ │
199
- │ redact (result) — secrets in result │
200
280
  │ result-size-cap — bounded response │
281
+ │ redact (result) — secrets in result │
282
+ │ injection — prompt-injection in result │
201
283
  │ audit.exit — hash-chained record close │
202
284
  └───────────────────────────────────────────────────┘
203
285
 
@@ -209,14 +291,104 @@ result
209
291
  from policy. Policy is re-read on every invocation — any edit to
210
292
  `policy.yaml` takes effect on the next tool call.
211
293
 
212
- ### Hook layer
294
+ The `__rea__health` meta-tool is the one documented exception: it
295
+ short-circuits the chain (see §6 above) and writes an audit record from
296
+ the short-circuit handler itself.
297
+
298
+ ### Gateway supervisor
299
+
300
+ Downstream MCP servers run as child processes over stdio. The
301
+ `DownstreamConnection` wrapper wires the SDK `StdioClientTransport`'s
302
+ `onclose` + `onerror` callbacks, so an unexpected child death — OS
303
+ OOM-kill, unhandled exception in the child, stdio pipe error outside a
304
+ caller-initiated close — is detected **eagerly**: the client and
305
+ transport are nulled before the next `callTool` tries to use them. The
306
+ following call forces a genuine reconnect rather than invoking through a
307
+ stale handle.
308
+
309
+ "Not connected" errors from the SDK (the in-flight fallback) are
310
+ promoted to the same respawn path with the same eager invalidation.
311
+ A 30-second flapping guard refuses a second reconnect that lands too
312
+ quickly after the previous one — the child is clearly unhealthy and the
313
+ circuit breaker is a better place to handle it.
314
+
315
+ `SessionBlockerTracker` subscribes to circuit-breaker
316
+ `onStateChange` events and counts circuit-open transitions per
317
+ `(session_id, server_name)`. Once the threshold (default: 3) is
318
+ crossed, exactly one `SESSION_BLOCKER` audit record is appended and a
319
+ LOUD structured log line is emitted — subsequent opens do not re-fire
320
+ until recovery (a transition to `closed`) re-arms the emit. A new
321
+ session (new `rea serve` process) drops every counter and starts fresh.
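The emit-once and re-arm behaviour of `SessionBlockerTracker` can be sketched as follows. This is a simplified model, not the shipped class: `SessionBlockerSketch` is a hypothetical name, one instance stands for one session, "crossed" is assumed to mean "count reaches the threshold", and the open count is assumed to stay cumulative across recoveries.

```typescript
type CircuitState = "closed" | "open" | "half-open";

// One instance per session in this sketch, so a new process starts with fresh counters.
class SessionBlockerSketch {
  private readonly opens = new Map<string, number>();
  private readonly emitted = new Set<string>();
  private readonly emit: (server: string, opens: number) => void;
  private readonly threshold: number;

  constructor(emit: (server: string, opens: number) => void, threshold = 3) {
    this.emit = emit;
    this.threshold = threshold;
  }

  // Intended to be called from the circuit breaker's state-change callback.
  onStateChange(server: string, next: CircuitState): void {
    if (next === "open") {
      const count = (this.opens.get(server) ?? 0) + 1;
      this.opens.set(server, count);
      // Fire once when the threshold is reached; stay quiet until re-armed.
      if (count >= this.threshold && !this.emitted.has(server)) {
        this.emitted.add(server);
        this.emit(server, count);
      }
    } else if (next === "closed") {
      // Recovery re-arms the emit; the cumulative count is kept (assumption).
      this.emitted.delete(server);
    }
  }
}
```

A caller would pass an `emit` callback that appends the `SESSION_BLOCKER` audit record and writes the log line. Whether the real tracker resets its counter on recovery is not stated in the text, so that part of the sketch is an assumption.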
322
+
323
+ ### Live state
324
+
325
+ `.rea/serve.state.json` is the on-disk live snapshot. It is written
326
+ once at boot and again on every circuit transition or supervisor event,
327
+ debounced through a 250 ms trailing timer and flushed atomically via
328
+ temp-file + rename. The snapshot carries a `session_id` (boot-time
329
+ ownership key) and `owner_pid`; a newly-started `rea serve` whose
330
+ predecessor crashed without cleanup can detect the abandoned file and
331
+ take over ownership rather than stalling forever. `rea status` is a
332
+ read-only consumer of this file.
333
+
334
+ ### Downstream environment safety
335
+
336
+ `rea serve` does **not** forward `process.env` wholesale to downstream
337
+ children. Each child gets:
338
+
339
+ 1. A fixed allowlist of neutral OS vars (`PATH`, `HOME`, `TZ`,
340
+ `NODE_OPTIONS`, …).
341
+ 2. Any names opted into via `registry.yaml#servers[].env_passthrough` —
342
+ the schema refuses secret-looking names (`*_TOKEN`, `*_KEY`,
343
+ `*_SECRET`, …), so secrets must be named explicitly.
344
+ 3. Values from the registry's `env:` mapping, which may contain
345
+ `${VAR}` placeholders resolved against the host environment
346
+ (0.3.0). Secret-looking values are redacted in logs by default.
347
+ A `${VAR}` whose host variable is unset is treated as fatal — the
348
+ downstream is marked unhealthy rather than handed an unresolved
349
+ placeholder.
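The three-step environment construction above can be sketched roughly as below. The function name `buildChildEnv`, the `ServerEnvSpec` shape, and the exact secret-name pattern are assumptions for illustration. The allowlist contains only the four names the text spells out, since the rest of the real list is elided.

```typescript
// Only the names listed in the text; the remainder of the real allowlist is not shown there.
const NEUTRAL_ALLOWLIST = ["PATH", "HOME", "TZ", "NODE_OPTIONS"];

// Assumed approximation of "secret-looking" names (*_TOKEN, *_KEY, *_SECRET).
const SECRET_LOOKING = /(_TOKEN|_KEY|_SECRET)$/;

interface ServerEnvSpec {
  env_passthrough?: string[];
  env?: Record<string, string>;
}

function buildChildEnv(
  hostEnv: Record<string, string | undefined>,
  spec: ServerEnvSpec,
): Record<string, string> {
  const out: Record<string, string> = {};

  // 1. Fixed allowlist of neutral OS variables.
  for (const name of NEUTRAL_ALLOWLIST) {
    const value = hostEnv[name];
    if (value !== undefined) out[name] = value;
  }

  // 2. Explicit passthrough names; secret-looking names are refused.
  for (const name of spec.env_passthrough ?? []) {
    if (SECRET_LOOKING.test(name)) {
      throw new Error(`env_passthrough refuses secret-looking name: ${name}`);
    }
    const value = hostEnv[name];
    if (value !== undefined) out[name] = value;
  }

  // 3. env mapping with ${VAR} placeholders; an unset host variable is fatal.
  for (const [key, template] of Object.entries(spec.env ?? {})) {
    out[key] = template.replace(/\$\{([A-Za-z_][A-Za-z0-9_]*)\}/g, (_match, varName: string) => {
      const value = hostEnv[varName];
      if (value === undefined) throw new Error(`unresolved placeholder \${${varName}}`);
      return value;
    });
  }
  return out;
}
```

In this sketch both failure cases throw. The text says the real schema rejects secret-looking passthrough names when it loads, and that an unresolved placeholder marks the downstream unhealthy, so a caller would catch the error and record that state.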
213
350
 
214
- Hooks are shell scripts wired into `.claude/settings.json`. They run at
215
- Claude Code tool-invocation time, independently of the gateway. Both
216
- layers fail closed. Bypassing one does not disable the other.
351
+ ### Hook layer
217
352
 
218
- Every hook sources `hooks/_lib/halt-check.sh` and `hooks/_lib/policy-read.sh`
219
- at the top of the script. Every hook uses `set -euo pipefail`.
353
+ Hooks are shell scripts. 14 ship in the package; 12 are wired into
354
+ the default `.claude/settings.json` and run at Claude Code
355
+ tool-invocation time, independently of the gateway. The remaining
356
+ two (`commit-review-gate.sh` and `push-review-gate-git.sh`) ship
357
+ ready-to-wire but are not registered by default — see "What REA is"
358
+ above and the inventory table at the end of this section for the full
359
+ picture. Both layers (hooks and the gateway middleware) fail closed.
360
+ Bypassing one does not disable the other.
361
+
362
+ Every hook uses `set -euo pipefail` (or `set -uo pipefail` for the
363
+ ones that process stdin JSON) and performs a HALT check near the top.
364
+ The review-gate hooks (`push-review-gate.sh`, `push-review-gate-git.sh`,
365
+ `commit-review-gate.sh`) additionally anchor `REA_ROOT` to their own
366
+ on-disk location (BUG-012 fix, 0.6.2) — for those hooks,
367
+ `CLAUDE_PROJECT_DIR` is accepted only as an advisory signal because it
368
+ is caller-controlled. The remaining hooks (e.g. `secret-scanner.sh`,
369
+ `settings-protection.sh`, `blocked-paths-enforcer.sh`,
370
+ `dangerous-bash-interceptor.sh`) still derive `REA_ROOT` from
371
+ `${CLAUDE_PROJECT_DIR:-$(pwd)}`; extending the script-anchor idiom to
372
+ those hooks is tracked as an open hardening item. Cross-repo
373
+ invocations (running a review-gate hook from a consumer project that
374
+ is not the rea install) short-circuit cleanly using
375
+ `git --git-common-dir` comparison (0.6.1).
376
+
377
+ The two push-review adapters that ship in `hooks/` share a single
378
+ implementation core at `hooks/_lib/push-review-core.sh` (0.7.0 BUG-008
379
+ cleanup) so a fix lands in one place: `push-review-gate.sh` consumes
380
+ Claude-Code PreToolUse JSON and is what `rea init` copies to
381
+ `.claude/hooks/`; `push-review-gate-git.sh` consumes git's native
382
+ `.husky/pre-push` refspec lines and is shipped for consumers who wire
383
+ a wrapper-based `.husky/pre-push` that execs it directly. The default
384
+ `rea init` installer does NOT currently emit that wrapper — it writes
385
+ a standalone inline gate body as `.husky/pre-push` (source of truth:
386
+ `src/cli/install/pre-push.ts`). The native-git adapter and the
387
+ inline installer currently implement the same protected-path logic
388
+ separately; unifying the husky installer on the adapter is tracked as
389
+ follow-up hardening. `commit-review-gate.sh` is a standalone Claude
390
+ `PreToolUse: Bash` hook that matches `git commit`; it does not source
391
+ the push-review core.
220
392
 
221
393
  ### Slash commands
222
394
 
@@ -228,9 +400,13 @@ during `rea init`.
228
400
  Ten curated agents ship in the package: `rea-orchestrator`, `code-reviewer`,
229
401
  `codex-adversarial`, `security-engineer`, `accessibility-engineer`,
230
402
  `typescript-specialist`, `frontend-specialist`, `backend-engineer`,
231
- `qa-engineer`, `technical-writer`. Four profiles
232
- (`client-engagement`, `bst-internal`, `lit-wc`, `open-source`) layer
233
- additional specialists on top.
403
+ `qa-engineer`, `technical-writer`. Profiles
404
+ (`client-engagement`, `bst-internal`, `bst-internal-no-codex`,
405
+ `lit-wc`, `open-source`, `open-source-no-codex`, `minimal`) layer
406
+ additional specialists on top. The `-no-codex` variants match their
407
+ parents but default `review.codex_required: false` so teams without a
408
+ Codex CLI on the bench get a first-class opt-out rather than relying on
409
+ `REA_SKIP_CODEX_REVIEW`.
234
410
 
235
411
  The orchestrator is the single entry point for non-trivial tasks. The
236
412
  CLAUDE.md template installed by `rea init` instructs the host agent:
@@ -259,9 +435,38 @@ Three things make this work:
259
435
  2. The **`/codex-review` slash command** is one of the five shipped
260
436
  commands. It produces an audit entry including the request summary,
261
437
  response summary, and pass/fail signal.
262
- 3. The **`push-review-gate.sh` hook** checks for a recent `/codex-review`
263
- audit entry on the current branch and warns (does not block) if none
264
- is present.
438
+ 3. The **`push-review-gate.sh` hook** blocks (exit 2) every protected-path
439
+ push that does not carry a matching `codex.review` audit entry for the
440
+ pushed `head_sha` with a `verdict` of `pass` or `concerns`. The only
441
+ other way through the protected-path branch is an active Codex-only
442
+ waiver (`REA_SKIP_CODEX_REVIEW=<reason>`, 0.8.0 narrowing). For
443
+ **non-protected-path** pushes the gate runs a separate review-cache
444
+ lookup — this is where the cache predicate and pushed-ref key
445
+ hardening live. The cache-hit predicate requires
446
+ `.hit == true and .result == "pass"` (0.8.0 hardening — a cached
447
+ `fail` verdict no longer satisfies the gate), and the cache key is
448
+ derived from the **pushed source ref** (from pre-push stdin) rather
449
+ than the checkout branch, so `git push origin hotfix:main` from a
450
+ `feature` checkout correctly looks up the `hotfix` cache entry.
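To make the pushed-ref and cache-predicate points in item 3 concrete, here is a small sketch. It relies on git's documented pre-push stdin format (`<local ref> <local sha> <remote ref> <remote sha>` per line). The helper names and the `CacheLookup` shape are hypothetical, and how the real gate turns the source ref into a cache key is not shown in the text.

```typescript
interface PushedRef {
  localRef: string;
  localSha: string;
  remoteRef: string;
  remoteSha: string;
}

// Parse the lines git feeds to a pre-push hook on stdin.
function parsePrePushStdin(input: string): PushedRef[] {
  const refs: PushedRef[] = [];
  for (const line of input.split("\n")) {
    const parts = line.trim().split(/\s+/);
    if (parts.length !== 4) continue; // skip blank or malformed lines
    const [localRef, localSha, remoteRef, remoteSha] = parts;
    refs.push({ localRef, localSha, remoteRef, remoteSha });
  }
  return refs;
}

// The pushed source branch comes from the local ref, not from the current checkout.
function sourceBranch(ref: PushedRef): string | null {
  const prefix = "refs/heads/";
  return ref.localRef.startsWith(prefix) ? ref.localRef.slice(prefix.length) : null;
}

// Assumed shape of a cache lookup result, mirroring the jq predicate in the text.
interface CacheLookup {
  hit: boolean;
  result?: string;
}

// Equivalent of `.hit == true and .result == "pass"`.
function cacheSatisfiesGate(lookup: CacheLookup): boolean {
  return lookup.hit === true && lookup.result === "pass";
}
```

For `git push origin hotfix:main`, git reports `refs/heads/hotfix` as the local ref, so `sourceBranch` yields `hotfix` whichever branch is checked out, and a cached `fail` never satisfies `cacheSatisfiesGate`.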
451
+
452
+ ### Codex-only waiver semantics (0.8.0)
453
+
454
+ Through 0.7.0, `REA_SKIP_CODEX_REVIEW=<reason>` short-circuited the
455
+ **entire** push-review gate — operators reached for it to silence a
456
+ transient Codex outage and accidentally bypassed HALT, the cross-repo
457
+ guard, and the general push-review gate. 0.8.0 narrows it to what the
458
+ name implies: the waiver satisfies **only** the protected-path Codex
459
+ audit requirement. HALT, cross-repo guard, ref-resolution failures, and
460
+ push-review-cache misses still block. The skip audit record is still
461
+ named `codex.review.skipped` and still fails the `codex.review` jq
462
+ predicate — skipping a review is not a review.
463
+
464
+ For the previous whole-gate bypass, use `REA_SKIP_PUSH_REVIEW=<reason>`
465
+ (unchanged, 0.5.0). It writes `push.review.skipped` with an
466
+ `os_identity` sub-object (uid, whoami, hostname, pid, ppid, tty, ci)
467
+ so auditors can distinguish a real operator from a forged git-config
468
+ actor, and refuses on CI runners unless the policy opts in via
469
+ `review.allow_skip_in_ci: true`.
265
470
 
266
471
  Codex responses are treated as untrusted input. They flow through the
267
472
  `redact` and `injection` middleware on return — same treatment as any
@@ -269,30 +474,32 @@ other downstream tool result. Codex never receives `.rea/policy.yaml`
269
474
  content in its prompts; Codex reviews diffs, not policy.
270
475
 
271
476
  If Codex is not installed, `rea doctor` warns with a one-line install
272
- hint. REA does not require Codex to function, but the default workflow
273
- assumes it.
477
+ hint. REA does not require Codex to function: the `bst-internal-no-codex`
478
+ and `open-source-no-codex` profiles disable the requirement entirely,
479
+ and `ClaudeSelfReviewer` is the in-process fallback (tagged
480
+ `degraded: true` in the audit record so self-review is visible and
481
+ countable).
274
482
 
275
483
  ## Hooks
276
484
 
277
- Eleven hooks, down from reagent's 26. Each does one thing.
485
+ Fourteen hooks. Each does one thing.
278
486
 
279
487
  | Hook | Event | One-line purpose |
280
488
  | --- | --- | --- |
281
489
  | `dangerous-bash-interceptor` | PreToolUse: Bash | Block categories of destructive shell commands |
282
490
  | `env-file-protection` | PreToolUse: Bash | Block reads of `.env*` files |
283
- | `dependency-audit-gate` | PreToolUse: Bash | Run `npm audit`; block on high/critical |
491
+ | `dependency-audit-gate` | PreToolUse: Bash | Verify packages exist on the registry before install |
284
492
  | `commit-review-gate` | PreToolUse: Bash | Intercept `git commit`; require review on non-trivial diffs |
285
- | `push-review-gate` | PreToolUse: Bash | Intercept `git push`; warn if no recent `/codex-review` |
286
- | `attribution-advisory` | PreToolUse: Bash | Block commits containing AI attribution markers |
493
+ | `push-review-gate` | PreToolUse: Bash | Intercept `git push` (Claude-Code-JSON adapter); protected-path + Codex audit |
494
+ | `push-review-gate-git` | `.husky/pre-push` | Native git adapter around the same core |
495
+ | `attribution-advisory` | PreToolUse: Bash | Block commits / PRs containing AI attribution markers |
496
+ | `pr-issue-link-gate` | PreToolUse: Bash | Advisory warn when `gh pr create` has no linked issue |
497
+ | `security-disclosure-gate` | PreToolUse: Bash | Route security-keyword `gh issue create` to private disclosure |
287
498
  | `secret-scanner` | PreToolUse: Write\|Edit | Scan file writes for credential patterns |
288
- | `settings-protection` | PreToolUse: Write\|Edit | Block agent writes to `.claude/settings.json` |
499
+ | `settings-protection` | PreToolUse: Write\|Edit | Block agent writes to `.claude/settings.json`, hook dirs, policy |
289
500
  | `blocked-paths-enforcer` | PreToolUse: Write\|Edit | Enforce `blocked_paths` from policy |
290
- | `changeset-security-gate` | PreToolUse: Write\|Edit | Require changeset entry on security-relevant changes |
291
- | `architecture-review-gate` | PostToolUse: Write\|Edit | Flag edits crossing architectural boundaries |
292
-
293
- A twelfth hook, `security-disclosure-gate`, intercepts `gh issue create`
294
- commands containing security-sensitive keywords and redirects to private
295
- disclosure. It is installed as part of the Bash PreToolUse set.
501
+ | `changeset-security-gate` | PreToolUse: Write\|Edit | Guard changesets against GHSA leaks and malformed frontmatter |
502
+ | `architecture-review-gate` | PostToolUse: Write\|Edit | Flag edits crossing architectural boundaries (advisory) |
296
503
 
297
504
  ## Slash commands
298
505
 
@@ -311,7 +518,7 @@ rejected, not ignored.
311
518
 
312
519
  | Field | Type | Purpose |
313
520
  | --- | --- | --- |
314
- | `version` | string, `"1"` | Schema version; only `"1"` accepted in 0.1.x |
521
+ | `version` | string, `"1"` | Schema version; only `"1"` accepted in the current major |
315
522
  | `profile` | string | Profile name from `profiles/` (e.g. `bst-internal`) |
316
523
  | `autonomy_level` | `L0`\|`L1`\|`L2`\|`L3` | Current autonomy. `L0` = read-only; `L3` = full tool access |
317
524
  | `max_autonomy_level` | `L0`\|`L1`\|`L2`\|`L3` | Hard ceiling. `autonomy_level` cannot exceed this |
@@ -321,6 +528,13 @@ rejected, not ignored.
321
528
  | `context_protection.delegate_to_subagent` | string[] | Commands that must run in a subagent context to preserve the parent's context window |
322
529
  | `context_protection.max_bash_output_lines` | number | Truncate long bash output at this line count |
323
530
  | `notification_channel` | string | Optional Discord webhook URL. Empty string = no notifications |
531
+ | `review.codex_required` | boolean | When `false`, protected-path pushes don't require a Codex audit (first-class no-Codex mode). Default `true` |
532
+ | `review.cache_max_age_seconds` | number | TTL for entries in `.rea/review-cache.jsonl`. Default 3600 |
533
+ | `review.allow_skip_in_ci` | boolean | When `true`, `REA_SKIP_PUSH_REVIEW` is accepted on CI runners. Default `false` |
534
+ | `injection.suspicious_blocks_writes` | boolean | `bst-internal` posture — `suspicious` verdict on a write/destructive tool denies instead of warning. Default `false` |
535
+ | `redact.patterns[]` | string[] | User-supplied secret patterns; vetted via `safe-regex` at load |
536
+ | `redact.match_timeout_ms` | number | Per-call regex budget. Default 100 |
537
+ | `gateway.health.expose_diagnostics` | boolean | When `true`, `__rea__health` emits redacted+classified diagnostic strings on the wire. Default `false` (null) |
324
538
 
325
539
  `autonomy_level > max_autonomy_level` is rejected at parse time. Setting
326
540
  `promotion_requires_human_approval: false` requires the CLI flag
@@ -345,8 +559,11 @@ npx @bookedsolid/rea init --from-reagent
345
559
  - Leaves `.reagent/` in place; you delete it manually after verifying
346
560
  `rea doctor` passes and a dogfood run completes.
347
561
 
348
- Reagent will be deprecated via `npm deprecate` within seven days of
349
- REA 0.1.0. The deprecation notice points users here.
562
+ See [MIGRATION-0.5.0.md](./MIGRATION-0.5.0.md) for the BUG-008 / BUG-009
563
+ / BUG-010 coordinated fix window. Between 0.5.0 and 0.9.0, the breaking
564
+ semantic change worth calling out is 0.8.0's narrowing of
565
+ `REA_SKIP_CODEX_REVIEW` to a Codex-only waiver — see the CHANGELOG
566
+ entry for the migration steps.
350
567
 
351
568
  ## Security
352
569
 
package/SECURITY.md CHANGED
@@ -2,10 +2,16 @@
2
2
 
3
3
  ## Supported Versions
4
4
 
5
- | Version | Supported |
6
- | ------- | --------- |
7
- | 0.1.x | Yes |
8
- | < 0.1 | No (pre-release) |
5
+ Security fixes land on the latest minor line. Older minors receive fixes only
6
+ when the issue is critical and a backport is tractable.
7
+
8
+ | Version | Supported |
9
+ | ------- | ------------------------------------------- |
10
+ | 0.9.x | Yes — active line |
11
+ | 0.8.x | Critical fixes only, 30 days from 0.9.0 |
12
+ | 0.7.x | No — superseded; upgrade recommended |
13
+ | ≤ 0.6.x | No — superseded; upgrade recommended |
14
+ | < 0.1 | No (pre-release) |
9
15
 
10
16
  ## Reporting a Vulnerability
11
17
 
@@ -85,10 +91,21 @@ REA's security model is defense-in-depth across two independent layers:
85
91
 
86
92
  **Hook layer** (development-time, Claude Code hooks):
87
93
 
88
- - 11 Claude Code hooks enforce security at the point of tool invocation
89
- - `security-disclosure-gate` blocks public issue creation for security topics
94
+ - 14 shell scripts ship in the hook layer. 12 are wired into Claude Code's
95
+ `PreToolUse` / `PostToolUse` events via the default `.claude/settings.json`.
96
+ Two are shipped but NOT registered by default: `commit-review-gate.sh`
97
+ is a `PreToolUse: Bash` hook that matches `git commit` for operators who
98
+ opt into commit-time review by adding a rule, and `push-review-gate-git.sh`
99
+ is a native-git adapter that sources `hooks/_lib/push-review-core.sh`
100
+ (the same shared core used by the Claude-Code push-review adapter),
101
+ shipped for consumers who wire a wrapper-based `.husky/pre-push` that
102
+ execs it directly. `rea init`'s default installer emits a standalone
103
+ inline `.husky/pre-push` body rather than a wrapper; unifying the
104
+ husky installer on the adapter is tracked as a follow-up
105
+ - `security-disclosure-gate` routes public security-keyword issue creation to private disclosure
90
106
  - `settings-protection` prevents agents from modifying their own safety rails
91
107
  - `dangerous-bash-interceptor` blocks categories of destructive shell commands
108
+ - `push-review-gate` and the shared-core adapter (`push-review-gate-git.sh` sourcing `hooks/_lib/push-review-core.sh`) anchor trust on the hook's own on-disk location via `BASH_SOURCE` rather than caller-controlled env vars; see `THREAT_MODEL.md §5.18`. The shipped inline `.husky/pre-push` body uses `git rev-parse --show-toplevel` to locate `REA_ROOT` — extending the script-anchor idiom to the inline path is tracked follow-up hardening
92
109
 
93
110
  Both layers operate independently — compromising one does not disable the other.
94
111
 
@@ -99,6 +116,6 @@ Both layers operate independently — compromising one does not disable the othe
99
116
  - Policy parsing is strict zod schema — unknown fields rejected, not ignored
100
117
  - Path traversal protection on profile loading (regex + path containment check)
101
118
  - CI publish pipeline includes gitleaks secret scanning, npm provenance attestation via OIDC, SBOM generation, and payload validation
102
- - All shell hooks use `set -euo pipefail` with explicit variable quoting
119
+ - All shell hooks set fail-fast flags with explicit variable quoting (`set -euo pipefail`, or `set -uo pipefail` for hooks that consume stdin JSON where a single `jq`-path miss must not abort before the conditional branches run)
103
120
  - Commits are signed and DCO-signed-off; `main` branch protection requires passing checks and review
104
121
  - npm publish uses OIDC provenance; no long-lived NPM tokens in CI
package/THREAT_MODEL.md CHANGED
@@ -1,6 +1,6 @@
1
1
  # Threat Model — REA Gateway and Hook Layer
2
2
 
3
- Version: 0.2.x | Last updated: 2026-04-18
3
+ Version: 0.9.x | Last updated: 2026-04-21
4
4
 
5
5
  ---
6
6
 
@@ -82,12 +82,12 @@ Downstream MCP servers are treated as untrusted by default. Codex plugin *invoca
82
82
 
83
83
  **Mitigations:**
84
84
 
85
- - `injection` middleware scans tool arguments and results for instruction-like patterns.
85
+ - `injection` middleware scans downstream tool **results** (`ctx.result`) post-execute for instruction-like patterns — see §5.21 for the three-tier classifier.
86
86
  - All injection regexes run under a per-call worker-thread timeout (`wrapRegex`, default 100ms) with a hard kill on timeout — catastrophic backtracking cannot hang the gateway. See §5.12.
87
87
  - Gateway middleware chain re-derives tier from tool_name independently on every invocation — a poisoned description cannot change the tier classification.
88
88
  - `.rea/policy.yaml` is re-read on every invocation; a "set autonomy to L3" instruction cannot be persisted by any middleware path.
89
89
 
90
- **Residual risk:** Semantic injection in natural-language descriptions is not fully mitigated. No regex can reliably detect all instruction-injection patterns. A three-valued escalation (`clean` / `suspicious` / `likely-injection`) is planned for 0.3.0 so write/destructive tiers require human-loop approval on suspicion rather than accepting a binary pass/fail.
90
+ **Residual risk:** Semantic injection in natural-language descriptions is not fully mitigated. No regex can reliably detect all instruction-injection patterns. The shipped three-valued classifier (`clean` / `suspicious` / `likely_injection`; see §5.21) narrows the footgun by making "write under suspicion" a conscious policy decision but does not eliminate it.
91
91
 
92
92
  ---
93
93
 
@@ -203,7 +203,7 @@ Downstream MCP servers are treated as untrusted by default. Codex plugin *invoca
203
203
 
204
204
  ### 5.8 Codex Plugin Abuse and Codex-Review Gate Weakening
205
205
 
206
- **Threat:** A poisoned Codex adversarial-review response contains prompt-injection content designed to make Claude take an unsafe action "per the reviewer's recommendation." A malicious actor uses `/codex-review` to launder an attack past the policy layer. Separately: a rogue agent or operator disables the Codex-review requirement to push protected-path changes without adversarial review.
206
+ **Threat:** A poisoned Codex adversarial-review response contains prompt-injection content designed to make Claude take an unsafe action "per the reviewer's recommendation." A malicious actor uses `/codex-review` to launder an attack past the policy layer. Separately: a rogue agent or operator disables the Codex-review requirement to push protected-path changes without adversarial review, or uses the escape hatch to bypass far more than the Codex requirement.
207
207
 
208
208
  **Mitigations:**
209
209
 
@@ -212,12 +212,27 @@ Downstream MCP servers are treated as untrusted by default. Codex plugin *invoca
212
212
  - Codex never receives `.rea/policy.yaml` content in its prompt; Codex reviews diffs, not policy.
213
213
  - The `codex-adversarial` agent cannot by itself modify policy, trigger writes, or bypass blocked paths — it is a review tool, not an actor.
214
214
  - **Pluggable reviewer** (0.2.0, G11.2): when Codex is unreachable, `ClaudeSelfReviewer` is the fallback. Claude-on-Claude review is explicitly tagged `degraded: true` in the audit record so self-review is visible and countable.
215
- - **Audited escape hatch** (0.2.0, G11.1): `REA_SKIP_CODEX_REVIEW=<reason>` bypasses the protected-path Codex requirement but writes a `codex.review.skipped` audit record carrying the verbatim reason, the operator's git identity, the head_sha, and the files-changed count. Fail-closed on missing `dist/audit/append.js` or missing git identity — the gate never silently disables. Skip records use `tool_name: "codex.review.skipped"` so a skip cannot satisfy a future Codex-review requirement on the same HEAD.
216
- - **First-class no-Codex mode** (0.2.0, G11.4): `policy.review.codex_required: false` skips the protected-path Codex requirement entirely. In that mode `REA_SKIP_CODEX_REVIEW` becomes a no-op (skipping a review that isn't required has no meaning), and no skip record is emitted. Both `.claude/hooks/push-review-gate.sh` (Claude Code path) and `.husky/pre-push` (terminal path) honor this knob.
215
+ - **First-class no-Codex mode** (0.2.0, G11.4): `policy.review.codex_required: false` skips the protected-path Codex requirement entirely. In that mode `REA_SKIP_CODEX_REVIEW` becomes a no-op (skipping a review that isn't required has no meaning), and no skip record is emitted. Both the Claude-Code adapter (`.claude/hooks/push-review-gate.sh`) and the native git adapter (`.claude/hooks/push-review-gate-git.sh`, sharing `hooks/_lib/push-review-core.sh`) honor this knob.
217
216
  - **Availability probe** (0.2.0, G11.3): `rea serve` runs an initial `codex --version` probe on startup when `codex_required` ≠ false. A failed probe emits a single stderr warn — startup never fail-closes on a Codex miss.
218
217
  - **Reviewer telemetry** (0.2.0, G11.5): `ClaudeSelfReviewer.review()` writes a row to `.rea/metrics.jsonl` with invocation counts, estimated tokens (chars/4), latency, and a `rate_limited` signal parsed from stderr. Payloads are NEVER stored; a unit test asserts that marker strings in inputs never appear in the metrics file.
219
218
 
220
- **Residual risk:** Semantic injection in Codex responses (e.g., reviewer recommends a specific code change that is itself malicious) cannot be fully detected. Mitigation is defense-in-depth: the middleware still runs on any subsequent write that Claude attempts based on the review. A `rea doctor` abuse signal on escape-hatch frequency (≥3 invocations per rolling 7 days) is proposed for 0.3.0.
219
+ **`REA_SKIP_CODEX_REVIEW` Codex-only waiver (0.8.0, #85).** Through 0.7.0 this env var short-circuited the **entire** push-review gate after writing its skip audit record, which made it equivalent in scope to `REA_SKIP_PUSH_REVIEW`. Operators reached for it to silence a transient Codex unavailability and accidentally bypassed HALT, the cross-repo guard, ref-resolution, and the push-review cache. 0.8.0 narrows it to what the name implies: the waiver satisfies **only** the protected-path Codex-audit requirement. Every other gate still runs:
220
+
221
+ - **HALT** (`.rea/HALT`) — still blocks.
222
+ - **Cross-repo guard** — still blocks.
223
+ - **Ref-resolution failures** (missing remote object, unresolvable source ref) — still block, but the skip audit record is written first so the operator's commitment to waive is durable.
224
+ - **Push-review cache** — a miss still falls through to the general "Review required" block.
225
+
226
+ The skip audit record is still named `codex.review.skipped` and still fails the `codex.review` jq predicate. Banner text changed from `CODEX REVIEW SKIPPED` to `CODEX REVIEW WAIVER active` to reflect the narrower scope. Fail-closed contract preserved: missing `dist/audit/append.js` (rea unbuilt) or missing git identity → exit 2.
227
+
228
+ **Cache gate hardening (0.8.0, same release).** The review cache is a separate, later check in the core (`hooks/_lib/push-review-core.sh` §8) — it governs the general push-review gate for non-protected-path pushes, not the protected-path Codex audit itself. Two composition bugs in that cache layer became load-bearing once the Codex waiver no longer papered over cache behavior, so they were fixed in the same release:
229
+
230
+ - The cache-hit predicate now requires `.hit == true and .result == "pass"`. Previously `.hit == true` alone was sufficient, which meant a cached `fail` verdict would silently satisfy the gate. The permissive predicate was a real exposure once the Codex-only waiver stopped short-circuiting subsequent checks.
231
+ - The cache key is derived from the PUSHED source ref (from pre-push stdin), not from the checkout branch. `git push origin hotfix:main` from a `feature` checkout now correctly looks up the `hotfix` cache entry.
232
+
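The tightened cache-hit predicate described above can be sketched in TypeScript. This is a minimal illustration under stated assumptions, not the shipped code: the text describes the real check as a jq-style predicate in the shell core, and the `CacheLookup` type and `cacheSatisfiesGate` name are hypothetical.

```typescript
// Hypothetical shape of a cache lookup result. Only the `hit` and
// `result` fields are named in the text; everything else is assumed.
interface CacheLookup {
  hit: boolean;
  result?: string;
}

// Mirrors `.hit == true and .result == "pass"`: a miss, or a cached
// verdict other than "pass", must not satisfy the gate.
function cacheSatisfiesGate(entry: CacheLookup): boolean {
  return entry.hit === true && entry.result === "pass";
}
```

Under the earlier, permissive predicate a `{ hit: true, result: "fail" }` entry would have been accepted; with the conjunction it falls through to the general "Review required" block.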
233
+ **`REA_SKIP_PUSH_REVIEW` — whole-gate bypass (0.5.0).** The recovery path for consumers deadlocked on a broken rea install. Writes `tool_name: "push.review.skipped"` with an `os_identity` sub-object (uid, whoami, hostname, pid, ppid, ppid_cmd, tty, ci) so auditors can distinguish a real operator from a forged git-config actor. Refuses with exit 2 on CI runners (`CI` env var set) unless `review.allow_skip_in_ci: true` is opted in via policy — closes the ambient-env-var bypass surface on shared build agents. HALT check runs before the skip branch: `.rea/HALT` cannot be bypassed by either hatch.
234
+
235
+ **Residual risk:** Semantic injection in Codex responses (e.g., reviewer recommends a specific code change that is itself malicious) cannot be fully detected. Mitigation is defense-in-depth: the middleware still runs on any subsequent write that Claude attempts based on the review. A `rea doctor` abuse signal on escape-hatch frequency (≥3 invocations per rolling 7 days) remains tracked.
221
236
 
222
237
  ---
223
238
 
@@ -300,22 +315,185 @@ Downstream MCP servers are treated as untrusted by default. Codex plugin *invoca
300
315
 
301
316
  ---
302
317
 
318
+ ### 5.14 Supervisor Trust Boundary (0.9.0, BUG-002..003)
319
+
320
+ **Threat:** A downstream MCP child process crashes unexpectedly — OS OOM-kill, unhandled exception in the child, stdio pipe error outside a caller-initiated close — and the gateway keeps a stale `Client` handle around. Every subsequent `callTool` hits the zombie, receives `Not connected`, the circuit breaker flaps open → half-open → open against the same dead handle, and the child is never respawned. From the operator's perspective the gateway is "up" but nothing works.
321
+
322
+ **Mitigations:**
323
+
324
+ - `DownstreamConnection` wires the MCP SDK `StdioClientTransport`'s `onclose` and `onerror` callbacks on a **per-transport** basis (never global) and treats an unexpected close as "child is dead": the client and transport fields are nulled before the next call. The next `callTool` takes the `connect()` branch and actually respawns the child.
325
+ - Intentional `close()` sets a local flag before calling into the SDK, so the same `onclose` callback does not double-count a graceful shutdown as an unexpected death.
326
+ - "Not connected" errors from the SDK (the in-flight fallback path) are promoted to the respawn path with the same eager invalidation — a stale client is invalidated before the one-shot reconnect fires, so we spawn fresh rather than retrying with the same dead handle.
327
+ - A 30-second flapping guard (`RECONNECT_FLAP_WINDOW_MS`) refuses a second reconnect that lands too quickly after the previous successful one — the child is clearly unhealthy and the circuit breaker is a better place to handle it.
328
+ - `DownstreamConnection.lastError` is bounded **at write** via `boundedDiagnosticString` on a true ES-private `#lastErrorMessage` setter (0.7.0, BUG-014). The invariant is structural: every write produces a bounded stored value regardless of assignment-site count. Non-string inputs raise `TypeError` instead of silently corrupting the field.
329
+ - Error strings published to `serve.state.json` flow through the same `buildRegexRedactor` the gateway logger uses (policy `redact.patterns` + built-in `SECRET_PATTERNS`) via the `lastErrorRedactor` option on the live-state publisher — a credential that leaked into a downstream error message is scrubbed before it lands on disk or on an operator's terminal via `rea status`.
330
+
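The flapping guard in the mitigations above can be sketched roughly as follows. Only the `RECONNECT_FLAP_WINDOW_MS` name and its 30-second value come from the text; the `ReconnectFlapGuard` class, its method names, and the choice to allow a reconnect exactly at the window boundary are assumptions made for illustration.

```typescript
// 30-second window, as stated in the text.
const RECONNECT_FLAP_WINDOW_MS = 30 * 1000;

// Hypothetical helper: tracks the last successful reconnect and refuses
// another one that lands inside the flap window.
class ReconnectFlapGuard {
  private lastSuccessAtMs: number | null = null;

  // Returns true when a reconnect attempt at `nowMs` may proceed.
  mayReconnect(nowMs: number): boolean {
    if (this.lastSuccessAtMs === null) return true; // nothing to flap against yet
    return nowMs - this.lastSuccessAtMs >= RECONNECT_FLAP_WINDOW_MS;
  }

  // Call after a reconnect has actually succeeded.
  recordSuccess(nowMs: number): void {
    this.lastSuccessAtMs = nowMs;
  }
}
```

Passing the clock in as `nowMs` keeps the sketch deterministic to test; a refused attempt is left for the circuit breaker to handle, as the bullet describes.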
331
+ **Residual risk:** A child that advertises tools but then returns malicious responses on every call is not a supervisor-layer concern — it is handled by the standard middleware chain (injection, redact, result-size-cap). A child that alternates between healthy and malicious responses more slowly than the circuit breaker can trip is a limitation of any breaker-based approach; detection depends on `.rea/metrics.jsonl` anomalies.
332
+
333
+ Ref: `src/gateway/downstream.ts`, `src/gateway/downstream.test.ts`.
334
+
335
+ ---
336
+
337
+ ### 5.15 SESSION_BLOCKER Audit Semantics (0.9.0, BUG-004)
338
+
339
+ **Threat:** A persistently failing downstream produces a log stream full of identical circuit-open records. Operators miss the signal because it looks like normal circuit-breaker churn, or alert-fatigue kicks in and they tune it out entirely.
340
+
341
+ **Mitigations:**
342
+
343
+ - `SessionBlockerTracker` subscribes to circuit-breaker `onStateChange` events and counts circuit-open transitions per `(session_id, server_name)`. It tracks **open-level** failures per session, not wire-hot call-level failures — every circuit-open transition counts as one, so a downstream that flaps `open→closed→open` three times in ten minutes crosses the threshold once.
344
+ - On threshold crossing (default: 3), exactly **one** `SESSION_BLOCKER` event fires: a LOUD structured log record plus an audit append via `appendAuditRecord`. The counter keeps incrementing but subsequent opens do **not** re-fire.
345
+ - Recovery (transition to `closed`) resets the counter and re-arms the emit flag — a later threshold crossing fires a fresh record.
346
+ - A new session (new `rea serve` process / new `session_id`) drops every counter and starts fresh.
347
+ - Audit append is best-effort; log-side emission happens first and unconditionally. A broken audit pipeline must never break state tracking.
348
+ - `SESSION_BLOCKER` is an **audit event**, not a gateway exception. The gateway keeps serving traffic; the event is the forensic signal an operator can search for in `audit.jsonl`.
349
+
350
+ **Residual risk:** A downstream that flaps fast enough to hit the threshold on every session but recovers quickly in between can still generate a record per session. This is the intended behavior — the operator should see it every session and fix the downstream.
351
+
352
+ Ref: `src/gateway/session-blocker.ts`, `src/gateway/session-blocker.test.ts`.
353
+
354
+ ---
355
+
356
+ ### 5.16 `.rea/serve.state.json` Lock / Ownership Handoff (0.9.0, BUG-005)
357
+
358
+ **Threat:** A crashed `rea serve` leaves `serve.state.json` and `serve.pid` behind. A new `rea serve` instance either (a) refuses to start because ownership-by-session-id locks the file forever, or (b) silently takes over without verifying the predecessor is dead — letting two live gateways race on writes.
359
+
360
+ **Mitigations:**
361
+
362
+ - Writes use atomic temp-file + rename (`writeFileAtomic`) with a `.<filename>.<randomUUID>.tmp` suffix, so a reader never sees a torn intermediate.
363
+ - The snapshot carries both `session_id` (boot-time ownership key) and `owner_pid` (0.9.0 pass-4). A newly-started `rea serve` whose predecessor crashed can detect the abandoned file — `kill(owner_pid, 0)` returns ESRCH — and take over ownership rather than stalling.
364
+ - The session-id check runs first; `owner_pid` is a secondary lock-guarded field used only to distinguish "abandoned" from "actively owned by a different session." The combination preserves the safety invariant (no silent takeover of a live gateway's file) while avoiding the pass-2 strict-one-directional lock.
365
+ - Consumers (`rea status`, `rea check`) read the file **only**. They never write and never clean up stale state — the serve process is the only writer. Pre-0.9.0 snapshots without `downstreams` or `owner_pid` degrade to `null` fields with a hint to upgrade.
366
+ - Writes are debounced through a 250 ms trailing timer so a flap storm (open → half-open → open → half-open in rapid succession) doesn't spam the filesystem.
367
+
368
+ **Residual risk:** A pathological PID reuse (the OS recycled the crashed gateway's PID onto an unrelated process before the new gateway boots) would cause the `kill(pid, 0)` probe to report "alive" spuriously. The session-id check catches this — a live process with a different session_id is distinguishable from an abandoned file — but the first-boot window where session_id is also unchanged is a theoretical corner. Operators who suspect this should `rm .rea/serve.pid .rea/serve.state.json` and restart.
369
+
370
+ Ref: `src/gateway/live-state.ts`, `src/gateway/server.ts`.
371
+
372
+ ---
373
+
374
+ ### 5.17 Health Payload Sanitization (0.6.2, BUG-011)
375
+
376
+ **Threat:** The `__rea__health` meta-tool short-circuits the middleware chain (intentionally — so it stays callable under HALT) and previously serialized `halt_reason` plus every `downstreams[].last_error` verbatim. Error strings from upstream MCPs can contain secrets (API keys, tokens) or prompt-injection payloads, neither of which the `redact` or `injection` middleware would see because the short-circuit response skips the chain entirely. Net effect: a redact + injection-sanitizer bypass, callable precisely when HALT should be holding the line.
377
+
378
+ **Mitigations:**
379
+
380
+ - `sanitizeHealthSnapshot()` runs on the short-circuit response before it reaches the MCP wire. Default behavior: `halt_reason = null` and every `downstreams[].last_error = null`. The operator-visible wire response carries no downstream-controlled diagnostic strings.
381
+ - Full diagnostic detail still flows into the meta-tool audit record. The record written for `__rea__health` carries the unsanitized `metadata.halt_reason` and `metadata.downstream_errors[]` (sourced pre-sanitize from `pool.healthSnapshot()` inside `server.ts`) alongside the existing counts. Audit is on local disk, hash-chained, append-only, and not LLM-reachable — the correct sink for trusted-operator diagnostic text.
382
+ - Operators who need error strings on the MCP wire can opt in via `.rea/policy.yaml`:
383
+
384
+ ```yaml
385
+ gateway:
386
+ health:
387
+ expose_diagnostics: true
388
+ ```
389
+
390
+ Opt-in mode still runs the full sanitizer pass: `redactSecrets` replaces known secret patterns with `[REDACTED:*]`, `classifyInjection` replaces any non-`clean` diagnostic string (verdicts `suspicious` or `likely_injection`) with the exported `INJECTION_REDACTED_PLACEHOLDER` token (`<redacted: suspected injection>`), and the redact-timeout sentinel `[REDACTED: pattern timeout]` is filtered from the wire so a caller cannot distinguish "pattern timed out" from "pattern matched."
391
+
392
+ - Diagnostic strings are bounded at 4096 UTF-16 code units before any scanning runs, via a UTF-8-safe truncate that drops trailing lone surrogates — an adversarial downstream cannot DoS the tool by throwing oversize errors.
393
+ - `meta.health.audit_failed` log level was elevated from `warn` to `error` and `summary.audit_fail_count` is exposed in the snapshot so operators can detect an audit-sink failure without parsing stderr.
394
+
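The bounding step in the mitigations above (cap at a fixed number of UTF-16 code units, then drop a trailing lone surrogate) can be sketched as below. The function name `truncateCodeUnits` is hypothetical, and the shipped helper may differ in detail; the 4096 limit is taken from the text.

```typescript
// Limit from the text: diagnostic strings are bounded at 4096 UTF-16 code units.
const MAX_DIAGNOSTIC_CODE_UNITS = 4096;

// Hypothetical helper: truncates to at most `max` UTF-16 code units and
// removes a trailing high surrogate left behind when the cut lands in the
// middle of a surrogate pair.
function truncateCodeUnits(input: string, max: number = MAX_DIAGNOSTIC_CODE_UNITS): string {
  if (input.length <= max) return input;
  let cut = input.slice(0, max);
  const last = cut.charCodeAt(cut.length - 1);
  // 0xD800..0xDBFF is the high-surrogate range; charCodeAt on an empty
  // string yields NaN, which fails both comparisons.
  if (last >= 0xd800 && last <= 0xdbff) {
    cut = cut.slice(0, -1);
  }
  return cut;
}
```

Truncating before any pattern scan keeps the scan cost bounded regardless of how large an error string a downstream produces.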
395
+ **Residual risk:** `expose_diagnostics: true` is still operator-controlled text on an LLM-reachable surface. The sanitizer is best-effort defense-in-depth — a secret pattern not in the catalog, or an injection pattern that `classifyInjection` rates `clean`, will pass through unchanged.
396
+
397
+ Ref: `src/gateway/meta/health.ts`, `src/gateway/meta/health-sanitize.test.ts`.
398
+
399
+ ---
400
+
401
+ ### 5.18 Script-Anchor Hook Trust Boundary (0.6.2, BUG-012)
402
+
403
+ **Threat:** The `push-review-gate.sh` and `commit-review-gate.sh` hooks need to know the rea repository root for (a) the cross-repo short-circuit when invoked from a consumer repository, and (b) HALT / policy enforcement against the correct policy file. Prior to 0.6.2, `REA_ROOT=${CLAUDE_PROJECT_DIR:-$(pwd)}`. `CLAUDE_PROJECT_DIR` is caller-controlled — any process invoking the hook can set it to a foreign path, which the guard would treat as rea. Result: HALT silently bypassed, cross-repo short-circuit fires on the wrong comparison, policy read from a directory the caller chose.
404
+
405
+ **Mitigations:**
406
+
407
+ - Hooks derive `REA_ROOT` from their own on-disk location using `BASH_SOURCE[0]` + `pwd -P`, then walk up to 4 parent directories looking for `.rea/policy.yaml` as the authoritative install marker. Install topology is fixed (`<root>/.claude/hooks/<name>.sh`), so the anchor is forge-resistant — a caller cannot relocate the hook without filesystem write access to the rea install, which is already protected by `settings-protection.sh` and `blocked-paths` enforcement.
408
+ - `CLAUDE_PROJECT_DIR` is retained only as an advisory signal. When set and the realpath differs from the script-derived `REA_ROOT`, the hook emits a stderr advisory and continues using the script-derived value. It is never compared for short-circuit, never used to select the policy file, and never used to locate HALT.
409
+ - The cross-repo guard (0.6.1) compares `git rev-parse --git-common-dir` on both sides (not path prefixes). Mixed state (one side git, one non-git) fails **closed** — the gate runs — rather than falling through to path-prefix. Only the both-non-git case uses path-prefix, matching the documented 0.5.1 non-git escape hatch.
410
+ - The 0.7.0 BUG-008 cleanup extracted the shared logic into `hooks/_lib/push-review-core.sh` so both the Claude-Code PreToolUse adapter (`push-review-gate.sh`) and the native git adapter (`push-review-gate-git.sh`) share a single anchor-walk implementation — a fix lands in one place.
411
+
412
+ **Residual risk:** If a local attacker has write access to the rea install directory they can move or replace the hook file, which would change `SCRIPT_DIR` and therefore `REA_ROOT`. This is equivalent to tampering with any other hook contents (`settings-protection.sh` already addresses it) and lies outside the `CLAUDE_PROJECT_DIR` threat class.
413
+
414
+ Ref: `hooks/_lib/push-review-core.sh`, `__tests__/hooks/push-review-gate-cross-repo.test.ts` "BUG-012: foreign CLAUDE_PROJECT_DIR does NOT bypass HALT".
415
+
416
+ ---
417
+
418
+ ### 5.19 Tarball-Smoke Security-Claim Gate (0.6.2, BUG-013)
419
+
420
+ **Threat:** A changeset file claims a security fix (`[security]` marker), the release workflow merges and publishes, but the shipping `dist/` is byte-identical to the previous release — the claimed fix never made it into the compiled output. The 0.6.0 → 0.6.1 regression is the canonical example: `src/` changed, `dist/` did not. Without a pipeline gate that rebuilds `dist/` from the shipping commit and verifies the published tarball contents, no future security changeset can be trusted.
421
+
422
+ **Mitigations (shipped across 0.6.2 + 0.7.0):**
423
+
424
+ - `scripts/tarball-smoke.sh` (0.6.2) enforces a **content-based security-claim gate**. When any `.changeset/*.md` contains the `[security]` marker, the smoke requires that at least one `src/**/*(sanitize|security)*.test.ts` file exists **and** that every named-import symbol it pulls from a relative path is present in the compiled `dist/` tree. The gate fails loudly (exit 2) if the marker is present but no testable security symbols are extractable.
425
+ - `.github/workflows/release.yml` (0.7.0) rebuilds `dist/` from the shipping HEAD immediately before `changesets/action`, records the SHA-256 tree hash to `$RUNNER_TEMP/rea-dist-hash` (CI scratch space — cannot be accidentally committed by `changesets/action`'s `git add .`), and post-publish re-packs the just-published tarball from npm and fails the release if the published `dist/` tree hash doesn't match.
426
+ - `scripts/dist-regression-gate.sh` (0.7.0) + the `dist-regression` CI job run on every PR and every push-to-main. If `src/` has changed vs the last published tag but the rebuilt `dist/` tree hashes identically to the published tarball, CI fails — the "src changed, dist didn't" regression class is caught **before** the release branch, not only at publish time.
427
+ - Husky e2e regression guard (`__tests__/hooks/husky-e2e.test.ts`, 0.7.0) invokes a REAL `git push` against a bare remote via `core.hooksPath=.husky` with the SHIPPED `.husky/pre-push` in place (the standalone inline body emitted by `src/cli/install/pre-push.ts`). The ten-test matrix covers: nine cases that exercise the inline body's HALT, protected-path, Codex-waiver, `review.codex_required: false`, and bootstrap-push branches, plus one case that swaps in a wrapper around `hooks/push-review-gate-git.sh` as a shape-guard for the future installer path. The kind of BUG-008 silent-exit-0 regression that slipped past synthesized-stdin unit tests through 0.4.0 would now fail loudly.
428
+
429
+ **Residual risk:** A security claim whose fix is purely a deletion (no new symbols, no new test file) cannot be validated by the symbol-extraction gate. The `dist-regression` job catches this as a byte-identity failure, but the gate has no positive evidence of the fix's presence. Manual maintainer review on `[security]`-labeled PRs remains the compensating control.
430
+
431
+ Ref: `scripts/tarball-smoke.sh`, `scripts/dist-regression-gate.sh`, `.github/workflows/release.yml`.
432
+
433
+ ---
434
+
435
+ ### 5.20 Registry TOFU Pinning (0.3.0, G7)
436
+
437
+ **Threat:** An attacker who lands a malicious template via `rea init`, or who patches `.rea/registry.yaml` out-of-band (compromised dependency postinstall, CI-bot misconfig, editor plugin writing through stale buffers), can silently swap a downstream server's `command`, `args`, or `env` keys. The gateway would spawn the new child at next startup and proxy it without challenge.
438
+
439
+ **Mitigations:**
440
+
441
+ - On first successful connect, the gateway records a SHA-256 fingerprint of each downstream's **canonicalized registry config path** — `name`, `command`, `args`, the sorted KEY SET of `env` (values excluded so secret rotation doesn't trip drift), `env_passthrough`, and `tier_overrides` — to `.rea/fingerprints.json`. Trust-On-First-Use (TOFU) by config-path hash, not by tool-surface or binary hash.
442
+ - Subsequent connects re-compute the fingerprint and compare. A mismatch is a **hard fail**: the downstream is marked unhealthy, a structured log + audit record names the drift, and the gateway refuses to route calls to it. The operator must inspect the registry delta and either clear the fingerprint entry (re-pin) or acknowledge the drift via one-shot `REA_ACCEPT_DRIFT=<name>`.
443
+ - `fingerprints.json` is gitignored by default via the `.rea/` managed block so a local re-pin does not pollute history.
444
+ - Scope is explicitly **path-only, not binary, and not tool-surface**. Binary hashing would turn TOFU into a slow-boot tax and would trip false-positive drift on every legitimate MCP server upgrade. Tool-surface hashing was considered and deferred — see residual risk below.
445
+
446
+ **Residual risk:** Two classes remain uncovered by G7:
447
+
448
+ 1. **Catalog drift from a legitimately-configured downstream.** A downstream whose registry config is unchanged but whose `tools/list` response changes between connects (new tool, renamed tool, modified description, modified input schema) is **not** detected by the config-path fingerprint. An attacker who compromises the downstream binary at `config.command` without changing the registry entry, or a legitimate upstream MCP server that silently expands its tool catalog in a patch release, both fall through this gate. See §6 "Catalog drift by downstream not detected on reconnect" — this is an active, tracked residual risk, not a mitigated one. The redact + injection middleware running on every proxied result is the compensating control, not a substitute.
449
+ 2. **Host compromise with config-matching binary substitution.** An attacker who swaps the on-disk binary at `config.command` but leaves `.rea/registry.yaml` untouched is outside the G7 threat model — that is a host-integrity / supply-chain class, not a registry-tampering class.
450
+
451
+ Ref: `src/registry/fingerprint.ts` (`canonicalize()`, `fingerprintServer()`), `src/gateway/downstream-pool.ts` fingerprint-probe path.
452
+
453
+ ---
454
+
455
+ ### 5.21 G9 Three-Tier Injection Classifier (0.3.0)
456
+
457
+ **Threat:** A binary pass/fail injection detector is either too permissive (known instruction patterns slip through) or too strict (every tool description flags and the gateway becomes unusable). Either failure mode eventually trains operators to ignore the signal.
458
+
459
+ **Mitigations:**
460
+
461
+ - `classifyInjection()` returns one of three verdicts: `clean`, `suspicious`, or `likely_injection`. The verdict is derived from weighted matches against the shipped pattern catalog, tuned so legitimate tool descriptions rate `clean` by default.
462
+ - Escalation rules (first match wins, per `src/gateway/middleware/injection.ts:450-527`):
463
+ 1. No literal and no base64-decoded match → `clean`.
464
+ 2. Any base64-decoded match, regardless of tier → `likely_injection`.
465
+ 3. ≥2 distinct literal matches, regardless of tier → `likely_injection`.
466
+ 4. Any match at read-tier (or unknown tier — fail closed) → `likely_injection`.
467
+ 5. Exactly one literal match at write/destructive tier → `suspicious`.
468
+ - `likely_injection` → always deny. No opt-out at policy level. (Note: because of rule 4, ANY injection match at read-tier is denied — the "warn but permit" path only exists for single-literal matches at write/destructive tier.)
469
+ - `suspicious` on a write/destructive tier → **policy-controlled**. `injection.suspicious_blocks_writes: true` (shipped in `bst-internal` and `bst-internal-no-codex` profiles — internal posture) denies. The schema default is `false` — external profiles (`open-source`, `client-engagement`, `minimal`, `lit-wc`) inherit the looser behavior so upgrading 0.2.x consumers are not silently tightened.
470
+ - **Regex timeout / oversize-result `error` verdict is mode-dependent** (`src/gateway/middleware/injection.ts:654-728`). Under `injection_detection: block` (all profiles except `warn`), any scan timeout or oversize input denies unconditionally — the partial scan cannot prove the unscanned suffix is safe, so block mode fails closed. Under `injection_detection: warn`, a timeout on an otherwise-clean partial scan is recorded as `metadata.injection.verdict = 'error'` and let through — this matches the 0.2.x `warn` semantics (fail-open by design) and operators opting into `warn` must accept this trade-off. Operators who want fail-closed everywhere should stay on `block`.
471
+ - The opt-in strict flag is honored at both the middleware layer (write/destructive deny) and the sanitizer layer (health payload replacement — the `<redacted: suspected injection>` placeholder collapses **any** non-`clean` diagnostic, so `suspicious` and `likely_injection` strings are both replaced on the `__rea__health` wire under `expose_diagnostics: true`).
472
+ - Every non-`clean` invocation records a nested `ctx.metadata.injection = { verdict, matched_patterns, base64_decoded }` object on the audit row (`src/gateway/middleware/injection.ts:733-740`). Consumers must read the nested shape — there is no top-level `injection_verdict` / `injection_match_count` field. The matched-patterns array contains the distinct phrase names only; the original input text is never exported.
473
+
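The five escalation rules listed above can be restated as a small decision function. This is a sketch of the rule ordering only, not the code in `src/gateway/middleware/injection.ts`: the `sketchVerdict` name, its parameters, and the `Tier` type are hypothetical, and pattern matching itself is assumed to have happened already.

```typescript
type Verdict = "clean" | "suspicious" | "likely_injection";
type Tier = "read" | "write" | "destructive";

// Hypothetical inputs: names of literal patterns that matched, a count of
// base64-decoded matches, and the tool's tier (undefined when unknown).
function sketchVerdict(
  literalMatches: string[],
  base64Matches: number,
  tier: Tier | undefined,
): Verdict {
  const distinctLiterals = new Set(literalMatches).size;

  // Rule 1: nothing matched.
  if (distinctLiterals === 0 && base64Matches === 0) return "clean";
  // Rule 2: any base64-decoded match, regardless of tier.
  if (base64Matches > 0) return "likely_injection";
  // Rule 3: two or more distinct literal matches, regardless of tier.
  if (distinctLiterals >= 2) return "likely_injection";
  // Rule 4: any match at read tier, or unknown tier (fail closed).
  if (tier !== "write" && tier !== "destructive") return "likely_injection";
  // Rule 5: exactly one literal match at write/destructive tier.
  return "suspicious";
}
```

Because the rules are evaluated first-match-wins, `suspicious` is reachable only for a single literal match on a write or destructive tool, which is the one case the `injection.suspicious_blocks_writes` policy knob governs.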
474
+ **Residual risk:** Semantic injection in natural-language descriptions — a well-phrased instruction that no pattern catalog will catch — is not mitigated by pattern matching. This is the general limitation acknowledged in §5.1; the three-tier classifier narrows the footgun (by making "write under suspicion" a conscious policy decision) but does not eliminate it.
475
+
476
+ Ref: `src/gateway/middleware/injection.ts`, `src/gateway/middleware/injection.test.ts`.
477
+
478
+ ---
479
+
303
480
  ## 6. Residual Risks and Open Issues
304
481
 
305
- | Risk | Severity | Tracking |
482
+ | Risk | Severity | Status / Tracking |
306
483
  | ------------------------------------------------------------- | -------- | ------------------------------ |
307
- | Semantic prompt injection via tool descriptions | High | 0.3.0 G9 (tier escalation) |
484
+ | Semantic prompt injection via tool descriptions | High | Partially mitigated — G9 three-tier classifier (§5.21) narrows the footgun via pattern matching, but semantic/natural-language injection that no catalog entry will catch is still unmitigated by design |
308
485
  | Semantic injection via Codex adversarial-review responses | High | No issue filed (defense in depth via middleware) |
309
- | Double-URL-encoding bypass for blocked paths | Medium | Planned fix |
310
- | No real-time alert on audit hash chain break | Medium | 0.3.0 G1 + G5 |
311
- | Concurrent audit writers can race at fsync | Medium | 0.3.0 G1 (proper-lockfile) |
486
+ | Concurrent audit writers can race at fsync | Medium | Mitigated — proper-lockfile shipped 0.3.0 (G1) |
487
+ | Catalog drift by downstream not detected on reconnect | Medium | Active — G7 TOFU (§5.20) pins registry CONFIG (name/command/args/env keys), not the `tools/list` response. A downstream that silently expands or alters its tool catalog without a registry edit is not caught by the fingerprint; compensating control is the per-result redact + injection middleware. Tool-surface TOFU is a planned follow-up. |
488
+ | Post-publish tarball smoke not in CI | Medium | Mitigated — tarball-smoke shipped 0.3.0, security-claim gate 0.6.2 (§5.19) |
489
+ | No real-time alert on audit hash chain break | Medium | Mitigated — audit-rotation + verify-on-append shipped 0.3.0 (G1 + G5) |
490
+ | OIDC trusted publisher not yet migrated (`NODE_AUTH_TOKEN` still in use) | Medium | Deferred past 0.5.0 per MIGRATION-0.5.0.md; current path is `--provenance` with `NODE_AUTH_TOKEN` |
491
+ | Double-URL-encoding bypass for blocked paths | Medium | Planned fix (iterative decode to fixed-point) |
312
492
  | SBOM not automated in publish pipeline | Medium | Planned |
313
493
  | Secret pattern gaps (custom token formats, encoding variants) | Medium | No issue filed |
314
- | Post-publish tarball smoke not in CI | Medium | 0.3.0 CI hardening |
315
- | Escape-hatch abuse signal not surfaced in `rea doctor` | Low | 0.3.0 (threshold: ≥3 / 7d) |
316
- | Catalog drift by downstream not detected on reconnect | Medium | 0.3.0 G7 (fingerprint + drift) |
317
- | OIDC trusted publisher not yet migrated (`NODE_AUTH_TOKEN` still in use) | Medium | 0.3.0 G8 |
494
+ | Escape-hatch abuse signal not surfaced in `rea doctor` | Low | Tracked (threshold: ≥3 / 7d) |
318
495
  | Local user can escalate policy.yaml outside gateway | Low | By design (trusted actor) |
496
+ | Registry pin mismatch → hard fail (no rollback) on TOFU | Low | By design — operator clears `.rea/fingerprints.json` to re-pin |
319
497
 
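The "iterative decode to fixed-point" fix planned for the double-URL-encoding row is not shown in this diff. One way such a decoder could look is sketched below. The function name `decodeToFixedPoint` and the `maxRounds` bound are assumptions for illustration and are not taken from the package.

```typescript
// Decode percent-encoding repeatedly until the string stops changing.
// `maxRounds` is an assumed safety bound, not a value from the source.
function decodeToFixedPoint(input: string, maxRounds = 8): string | null {
  let current = input;
  for (let i = 0; i < maxRounds; i++) {
    let next: string;
    try {
      next = decodeURIComponent(current);
    } catch {
      // Malformed escape sequence: signal the caller to fail closed.
      return null;
    }
    if (next === current) return current; // fixed point reached
    current = next;
  }
  // Still changing after maxRounds: treat as hostile rather than guess.
  return null;
}
```

A blocked-path check would then compare the returned value, not the raw input, and deny on `null`. Under this sketch a path containing a literal `%` that is not a valid escape is also rejected, which errs on the side of denial.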
320
498
  ---
321
499
 
@@ -323,8 +501,8 @@ Downstream MCP servers are treated as untrusted by default. Codex plugin *invoca
323
501
 
324
502
  REA operates two independent layers. Bypassing one does not disable the other.
325
503
 
326
- **Hook layer** (development-time): 13 Claude Code hooks intercept tool calls before execution at the Claude Code level. Hooks enforce: secret scanning, dangerous command interception, blocked path enforcement, settings protection, attribution advisory, dependency audit, commit/push review gates, PR issue linking, architecture review, env file protection, changeset security gates, and security-disclosure gates.
504
+ **Hook layer** (development-time): 14 shell scripts ship. 12 are wired into Claude Code's `PreToolUse` / `PostToolUse` events via the default `.claude/settings.json`. Two are shipped but NOT registered by default: `commit-review-gate.sh` is a `PreToolUse: Bash` hook that matches `git commit` for operators who opt into commit-time review by adding a rule, and `push-review-gate-git.sh` is a native-git adapter that sources `hooks/_lib/push-review-core.sh` (the same shared core the Claude-Code `push-review-gate.sh` sources), shipped for consumers who wire a wrapper-based `.husky/pre-push` that execs it directly. `rea init` currently emits a standalone inline `.husky/pre-push` body (`src/cli/install/pre-push.ts`) rather than a wrapper; unifying the husky installer on the shared-core adapter is tracked as follow-up hardening. Hooks enforce: secret scanning, dangerous command interception, blocked path enforcement, settings protection, attribution advisory, dependency audit, push review gate (Claude-Code-JSON adapter registered; native `.husky/pre-push` adapter opt-in), PR issue linking, architecture review, env file protection, changeset security, and security-disclosure routing. The review-gate hooks (`push-review-gate.sh`, `push-review-gate-git.sh`, `commit-review-gate.sh`) anchor their trust decision on their own on-disk script location (BUG-012, §5.18), not on caller-controlled env vars. The remaining hooks still derive `REA_ROOT` from `${CLAUDE_PROJECT_DIR:-$(pwd)}`; extending the script-anchor idiom across the full hook set is a tracked hardening follow-up.
327
505
 
328
- **Gateway layer** (runtime, `rea serve`): A middleware chain processes every proxied MCP tool call. Middleware enforces: audit, kill switch, policy/autonomy level, tier classification, blocked paths, rate limit, circuit breaker, prompt injection detection, secret redaction (pre and post), and result size cap.
506
+ **Gateway layer** (runtime, `rea serve`): A middleware chain processes every proxied MCP tool call. Middleware enforces: audit, kill switch, policy/autonomy level, tier classification, blocked paths, rate limit, circuit breaker, prompt-injection classification (§5.21), secret redaction (pre and post), and result size cap. The gateway also supervises downstream child processes (§5.14), emits a `SESSION_BLOCKER` audit event on persistent failure (§5.15), and publishes a live per-downstream state snapshot to `.rea/serve.state.json` (§5.16), which `rea status` only reads and never writes. The `__rea__health` meta-tool short-circuits the chain so that it stays callable under HALT, and runs a dedicated sanitizer on its response (§5.17).
329
507
 
330
508
  Both layers fail closed: on read failure, parse error, unknown errno on HALT, regex timeout, or any unexpected condition, the default action is deny (or for redaction specifically: replace with a sentinel — the content never escapes unscanned).
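The fail-closed rule above can be illustrated with two small wrappers. This is a sketch under stated assumptions, not code from the package: `decideFailClosed`, `redactFailClosed`, and the sentinel string are hypothetical names chosen for the example.

```typescript
type Decision = 'allow' | 'deny';

// Hypothetical sentinel; the real placeholder text is not given in this section.
const REDACTION_SENTINEL = '[REDACTED: scan failed]';

// Any thrown error or any result other than `true` maps to deny.
function decideFailClosed(check: () => boolean): Decision {
  try {
    return check() === true ? 'allow' : 'deny';
  } catch {
    return 'deny';
  }
}

// If the scanner throws, the whole payload is replaced with the sentinel,
// so unscanned content is never returned to the caller.
function redactFailClosed(text: string, scan: (t: string) => string): string {
  try {
    return scan(text);
  } catch {
    return REDACTION_SENTINEL;
  }
}
```

The point of the split is that a gate can simply deny, while a redactor has to return something, so its failure mode is a replacement value rather than the original text.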
@@ -6,7 +6,7 @@
6
6
  *
7
7
  * `rea status` is the LIVE view: is a gateway running for this cwd? What is
8
8
  * its session id? What does the audit chain look like right now? Is HALT
9
- * active?
9
+ * active? Which downstreams are connected / healthy / tripped?
10
10
  *
11
11
  * Detection strategy for "is serve running":
12
12
  * 1. Read `.rea/serve.pid`.
@@ -14,6 +14,15 @@
14
14
  * 3. If kill throws ESRCH or EPERM, the pid is stale — treat as not-running
15
15
  * and surface that nuance in the output.
16
16
  *
17
+ * 0.9.0 — per-downstream live block. `readServeState` parses the
18
+ * `downstreams: [...]` array from `.rea/serve.state.json` (written by the
19
+ * live-state publisher on every circuit transition + supervisor event).
20
+ * Each entry carries `name`, `connected`, `healthy`, `circuit_state`,
21
+ * `retry_at`, `last_error` (redacted by the publisher), `tools_count`,
22
+ * `open_transitions`, and `session_blocker_emitted`. State files written
23
+ * by a pre-0.9.0 gateway degrade gracefully: `downstreams` surfaces as
24
+ * `null` with a hint to upgrade.
25
+ *
17
26
  * Output modes:
18
27
  * - Default: human-pretty, matching the spacing used by `rea check`.
19
28
  * - `--json`: canonical JSON object, composable with jq and future tooling.
@@ -23,6 +32,11 @@
23
32
  * `rea audit verify` is the authoritative check and is expensive on large
24
33
  * chains; here we just report line count, last timestamp, and a cheap "last
25
34
  * record's stored hash is non-empty" heuristic as an integrity smoke signal.
35
+ *
36
+ * Every disk-sourced string field flows through `sanitizeForTerminal` on the
37
+ * pretty-print path — JSON mode relies on `JSON.stringify` to escape control
38
+ * chars safely — so a malicious `halt_reason` or `last_error` cannot inject
39
+ * ANSI/OSC escapes into the operator's terminal.
26
40
  */
27
41
  /**
28
42
  * Strip every ASCII control code (C0 plus DEL) from a string. Defense
@@ -6,7 +6,7 @@
6
6
  *
7
7
  * `rea status` is the LIVE view: is a gateway running for this cwd? What is
8
8
  * its session id? What does the audit chain look like right now? Is HALT
9
- * active?
9
+ * active? Which downstreams are connected / healthy / tripped?
10
10
  *
11
11
  * Detection strategy for "is serve running":
12
12
  * 1. Read `.rea/serve.pid`.
@@ -14,6 +14,15 @@
14
14
  * 3. If kill throws ESRCH or EPERM, the pid is stale — treat as not-running
15
15
  * and surface that nuance in the output.
16
16
  *
17
+ * 0.9.0 — per-downstream live block. `readServeState` parses the
18
+ * `downstreams: [...]` array from `.rea/serve.state.json` (written by the
19
+ * live-state publisher on every circuit transition + supervisor event).
20
+ * Each entry carries `name`, `connected`, `healthy`, `circuit_state`,
21
+ * `retry_at`, `last_error` (redacted by the publisher), `tools_count`,
22
+ * `open_transitions`, and `session_blocker_emitted`. State files written
23
+ * by a pre-0.9.0 gateway degrade gracefully: `downstreams` surfaces as
24
+ * `null` with a hint to upgrade.
25
+ *
17
26
  * Output modes:
18
27
  * - Default: human-pretty, matching the spacing used by `rea check`.
19
28
  * - `--json`: canonical JSON object, composable with jq and future tooling.
@@ -23,6 +32,11 @@
23
32
  * `rea audit verify` is the authoritative check and is expensive on large
24
33
  * chains; here we just report line count, last timestamp, and a cheap "last
25
34
  * record's stored hash is non-empty" heuristic as an integrity smoke signal.
35
+ *
36
+ * Every disk-sourced string field flows through `sanitizeForTerminal` on the
37
+ * pretty-print path — JSON mode relies on `JSON.stringify` to escape control
38
+ * chars safely — so a malicious `halt_reason` or `last_error` cannot inject
39
+ * ANSI/OSC escapes into the operator's terminal.
26
40
  */
27
41
  import fs from 'node:fs';
28
42
  import { loadPolicy } from '../policy/loader.js';
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "@bookedsolid/rea",
3
- "version": "0.9.0",
3
+ "version": "0.9.1",
4
4
  "description": "Agentic governance layer for Claude Code — policy enforcement, hook-based safety gates, audit logging, and Codex-integrated adversarial review for AI-assisted projects",
5
5
  "license": "MIT",
6
6
  "author": "Booked Solid Technology <oss@bookedsolid.tech> (https://bookedsolid.tech)",