@bookedsolid/rea 0.22.0 → 0.23.1

Files changed (55)
  1. package/README.md +15 -0
  2. package/THREAT_MODEL.md +753 -0
  3. package/dist/audit/append.js +1 -1
  4. package/dist/cli/doctor.js +11 -12
  5. package/dist/cli/hook.d.ts +37 -3
  6. package/dist/cli/hook.js +167 -5
  7. package/dist/cli/init.js +14 -26
  8. package/dist/cli/install/canonical.js +18 -3
  9. package/dist/cli/install/commit-msg.js +1 -2
  10. package/dist/cli/install/copy.js +4 -13
  11. package/dist/cli/install/fs-safe.js +5 -16
  12. package/dist/cli/install/gitignore.js +1 -5
  13. package/dist/cli/install/pre-push.js +3 -8
  14. package/dist/cli/install/settings-merge.js +79 -16
  15. package/dist/cli/upgrade.js +14 -10
  16. package/dist/gateway/downstream.js +1 -2
  17. package/dist/gateway/live-state.js +3 -1
  18. package/dist/gateway/log.js +1 -3
  19. package/dist/gateway/middleware/audit.js +1 -1
  20. package/dist/gateway/middleware/injection.js +3 -9
  21. package/dist/gateway/middleware/policy.js +3 -1
  22. package/dist/gateway/middleware/redact.js +1 -1
  23. package/dist/gateway/observability/codex-telemetry.js +1 -2
  24. package/dist/gateway/reviewers/claude-self.js +10 -6
  25. package/dist/hooks/bash-scanner/blocked-scan.d.ts +26 -0
  26. package/dist/hooks/bash-scanner/blocked-scan.js +467 -0
  27. package/dist/hooks/bash-scanner/index.d.ts +41 -0
  28. package/dist/hooks/bash-scanner/index.js +62 -0
  29. package/dist/hooks/bash-scanner/parse-fail-closed.d.ts +31 -0
  30. package/dist/hooks/bash-scanner/parse-fail-closed.js +27 -0
  31. package/dist/hooks/bash-scanner/parser.d.ts +42 -0
  32. package/dist/hooks/bash-scanner/parser.js +92 -0
  33. package/dist/hooks/bash-scanner/protected-scan.d.ts +76 -0
  34. package/dist/hooks/bash-scanner/protected-scan.js +868 -0
  35. package/dist/hooks/bash-scanner/verdict.d.ts +80 -0
  36. package/dist/hooks/bash-scanner/verdict.js +49 -0
  37. package/dist/hooks/bash-scanner/walker.d.ts +165 -0
  38. package/dist/hooks/bash-scanner/walker.js +9087 -0
  39. package/dist/hooks/push-gate/base.js +2 -6
  40. package/dist/hooks/push-gate/codex-runner.js +3 -1
  41. package/dist/hooks/push-gate/index.js +9 -10
  42. package/dist/policy/loader.js +4 -1
  43. package/dist/registry/tofu-gate.js +2 -2
  44. package/hooks/blocked-paths-bash-gate.sh +142 -272
  45. package/hooks/protected-paths-bash-gate.sh +227 -511
  46. package/package.json +3 -2
  47. package/profiles/bst-internal-no-codex.yaml +1 -1
  48. package/profiles/bst-internal.yaml +1 -1
  49. package/profiles/client-engagement.yaml +1 -1
  50. package/profiles/lit-wc.yaml +1 -1
  51. package/profiles/minimal.yaml +1 -1
  52. package/profiles/open-source-no-codex.yaml +1 -1
  53. package/profiles/open-source.yaml +1 -1
  54. package/scripts/postinstall.mjs +1 -2
  55. package/scripts/run-vitest.mjs +117 -0
package/THREAT_MODEL.md CHANGED
@@ -524,3 +524,756 @@ REA operates two independent layers. Bypassing one does not disable the other.
524
524
  **Gateway layer** (runtime, `rea serve`): A middleware chain processes every proxied MCP tool call. Middleware enforces: audit, kill switch, policy/autonomy level, tier classification, blocked paths, rate limit, circuit breaker, prompt-injection classification (§5.21), secret redaction (pre and post), and result size cap. The gateway also supervises downstream child processes (§5.14), emits a `SESSION_BLOCKER` audit event on persistent failure (§5.15), and publishes a live per-downstream state snapshot to `.rea/serve.state.json` (§5.16) that `rea status` reads read-only. The `__rea__health` meta-tool short-circuits the chain for callability under HALT and runs a dedicated sanitizer on its response (§5.17).
525
525
 
526
526
  Both layers fail closed: on read failure, parse error, unknown errno on HALT, regex timeout, or any unexpected condition, the default action is deny (or for redaction specifically: replace with a sentinel — the content never escapes unscanned).
527
+
528
+ ---
529
+
530
+ ## 8. Bash-tier scanner (parser-backed, 0.23.0+)
531
+
532
+ Two of the shipped hooks — `protected-paths-bash-gate.sh` and
533
+ `blocked-paths-bash-gate.sh` — are thin shims that forward stdin to
534
+ the `rea hook scan-bash` CLI subcommand. The CLI parses the Bash
535
+ command via `mvdan-sh@0.10.1`, walks the AST in
536
+ `src/hooks/bash-scanner/walker.ts`, and applies per-utility detectors
537
+ that produce a `DetectedWrite[]`. The scanner then matches each
538
+ detection's path against the protected-paths or blocked_paths policy
539
+ and emits a verdict JSON. The shim re-verifies the verdict shape via
540
+ `node -e` before honoring the exit code (defense against a tampered
541
+ `REA_NODE_CLI` that returns exit 0 with empty stdout).
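
The exit-code cross-check described above can be sketched as follows. This is a minimal illustration, not the shipped shim logic: the `checkVerdict` name and the `Decision` type are hypothetical, and the only verdict shape taken from the text is `{"verdict":"allow"}` paired with exit 0. Any other combination is treated as a deny.

```typescript
type Decision = "allow" | "deny";

// Fail-closed cross-check (hypothetical helper): allow only when stdout
// parses as a verdict object AND the exit code agrees with that verdict.
function checkVerdict(stdout: string, exitCode: number): Decision {
  let parsed: unknown;
  try {
    parsed = JSON.parse(stdout);
  } catch {
    // Empty or malformed stdout never counts as "allow", even with exit 0.
    return "deny";
  }
  if (typeof parsed !== "object" || parsed === null) return "deny";
  const verdict = (parsed as { verdict?: unknown }).verdict;
  // Assumption: exit 0 is the only exit code that can accompany "allow".
  if (verdict === "allow" && exitCode === 0) return "allow";
  return "deny";
}
```

The point of the design is that a CLI which exits 0 with empty stdout, the tampered case named above, lands in the `catch` branch and is refused.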
542
+
543
+ ### 8.1 Trust assumptions
544
+
545
+ The scanner trusts the following components. Each row names what we
546
+ trust, what would happen if the trust were violated, and what pins
547
+ the trust.
548
+
549
+ | Component | What we trust | If violated | Pinned by |
550
+ | ---------------------------------- | -------------------------------------------------------------------------------------------------------- | ----------------------------------------------------------------- | -------------------------------------------------------------------------------------- |
551
+ | `mvdan-sh@0.10.1` | Produces a faithful AST from any input bash accepts | Detector misses target on mis-shaped node | Pinned version, RedirOperator op-code snapshot tests, Class O exhaustiveness contract |
552
+ | Walker dispatch table | Enumerates every shape that produces a write | Novel utility silently allowed | Per-PR corpus fixture requirement; 18 corpus classes; convergence ladder |
553
+ | `fs.lstatSync` / `fs.readlinkSync` | Identify symlinks including dangling links | Symlink-out bypass class reopens | Codex round 1 F-2 closure + symlink corpus |
554
+ | `node` on PATH in the shim | Verdict-JSON verifier runs | Shim refuses on uncertainty (fail-closed) | Shim test for missing-node branch |
555
+ | Project-root realpath | `realpath(cli).startsWith(realpath(CLAUDE_PROJECT_DIR))` defends symlink-out | Forged CLI path inside `node_modules` defeats CLI-resolver | Codex round 5 F2 closure + corpus |
556
+ | OS realpath semantics | `node:fs.realpathSync` resolves symlinks consistently with the kernel; macOS `/var` ↔ `/private/var` aliasing handled via the `rea_resolved_relative_form` helper | Path-traversal escape | helix-021 closure + corpus |
557
+ | `@bookedsolid/rea` package install | `node_modules/@bookedsolid/rea/package.json#name === "@bookedsolid/rea"` AND realpath stays in project | Supply-chain compromise — see §8.3 | npm provenance; opt-in `policy.review.cli_sha256` (deferred to future minor) |
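
The project-root containment check in the table can be illustrated with a small pure function. This is a sketch under stated assumptions: `isInsideProject` is a hypothetical name, both arguments are assumed to be already realpath-resolved, and the separator defaults to `/`. It shows why §8.2 writes the check with a trailing separator (`+ sep`): a bare prefix test would also accept a sibling directory whose name merely starts with the project name.

```typescript
// Hypothetical helper. Both paths are ASSUMED to be realpath-resolved already;
// this function does no filesystem access of its own.
function isInsideProject(
  resolvedCli: string,
  resolvedRoot: string,
  sep: string = "/",
): boolean {
  // Append the separator unless the root already ends with it, so that
  // "/work/app" does not match "/work/app-evil/...".
  const root = resolvedRoot.endsWith(sep) ? resolvedRoot : resolvedRoot + sep;
  return resolvedCli.startsWith(root);
}

// Inside the project: accepted.
const ok = isInsideProject(
  "/work/app/node_modules/@bookedsolid/rea/dist/cli/index.js",
  "/work/app",
);

// Symlink resolved to an attacker directory outside the project: refused.
const escaped = isInsideProject("/tmp/sym-attacker/dist/cli/index.js", "/work/app");

// Sibling directory sharing the prefix: refused only because of the separator.
const sibling = isInsideProject("/work/app-evil/dist/cli/index.js", "/work/app");
```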
558
+
559
+ The scanner does NOT trust:
560
+
561
+ - The Bash command string itself — every input is parsed with a
562
+ hostile-input-tolerant parser and walked with a deny-by-default
563
+ visitor.
564
+ - `REA_NODE_CLI` / any environment variable nominating the CLI path
565
+ (codex round 1 F-3 + round 2 R2-3 — env-var hijack class dropped).
566
+ - The CLI exit code alone — the shim re-parses verdict JSON via
567
+ `node -e` and cross-checks exit code matches verdict (round 1 F-3).
568
+ - The visitor's per-`Cmd`-kind enumeration — replaced with `syntax.Walk`
569
+ in round-6 (round 6 closure: walker-dispatch field-omission is
570
+ structurally impossible).
571
+ - mvdan-sh's `syntax.Walk` to visit every Word-bearing field — Class O
572
+ contract test pins reach (round-7 closure).
573
+ - `unshellEscape` to handle every DQ-escape sequence — bash spec
574
+ enumerated and pinned by Class P corpus (round-8 closure).
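
The DQ-escape collapse that the last item refers to can be sketched as below. This is not the package's `unshellEscape`; `collapseDqEscapes` is a hypothetical stand-in that applies the bash rule for double-quoted strings, where a backslash is an escape only before `$`, a backtick, `"`, `\`, or a newline.

```typescript
// Hypothetical stand-in for the escape collapse described in the text.
// Inside double quotes, bash removes a backslash only when it precedes
// `$`, a backtick, `"`, `\`, or a newline; elsewhere the backslash stays.
function collapseDqEscapes(body: string): string {
  return body.replace(/\\([$`"\\\n])/g, (_match: string, ch: string) =>
    // Backslash-newline is a line continuation: both characters disappear.
    ch === "\n" ? "" : ch
  );
}

// The body of a double-quoted `bash -c` argument as written in source form.
const dqBody = 'echo \\"\\$(rm .rea/HALT)\\"';

// After the collapse the command substitution is visible to a re-parse.
const collapsed = collapseDqEscapes(dqBody);
```

Collapsing only `\"` would leave `\$` in place, so a re-parse of the inner string would see an escaped dollar sign rather than a command substitution, while the real shell would execute it.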
575
+
576
+ ### 8.2 Bypass classes structurally impossible
577
+
578
+ - **walker dispatch field-omission is structurally impossible**, and
579
+ **mvdan-sh `syntax.Walk` field gaps are pinned by a contract test**
580
+ (0.23.0 round-6 + round-7 layered closure).
581
+
582
+ Round 6 — our own dispatch. Pre-refactor `walkForWrites` dispatched
583
+ on AST `Cmd` kinds via an explicit `case` ladder
584
+ (`case 'WhileClause':`, `case 'ForClause':`, …) and manually
585
+ enumerated which fields each kind traversed. Any field NOT
586
+ enumerated in a case branch was silently dropped — that pattern
587
+ produced six rounds of P0 bypasses (rounds 1-5 patched detection
588
+ gaps; round 6 closed the structural class). Round 6 found two P0s
589
+ in the same class as round 5 (`WhileClause.Cond`,
590
+ `ForClause.CStyleLoop.{Init,Cond,Post}`) and the convergence ladder
591
+ 34→14→9→8→5→2 demonstrated the walker would never reach 0 with
592
+ patches alone — it was structurally a denylist over AST shapes.
593
+ The new walker uses `mvdan-sh`'s built-in `syntax.Walk(node, visit)`
594
+ which traverses every Cmd-kind dispatch exhaustively from OUR side.
595
+ Our dispatch is preserved (per-utility cp/mv/sed/find/etc.), but the
596
+ TRAVERSAL is no longer a denylist of OUR shapes. A new `Cmd` type
597
+ added to mvdan-sh, or a new field on an existing type, reaches OUR
598
+ dispatcher when Walk descends into its inner Stmts / CallExprs /
599
+ BinaryCmds.
600
+
601
+ Round 7 — Walk's own field gaps. Codex round 7 (P0) flagged that
602
+ the round-6 framing "Walk visits every field" was an overclaim:
603
+ mvdan-sh@0.10.1's `syntax.Walk` itself has empirically-verified
604
+ field gaps. Specifically, `ParamExp.Slice.Offset` and
605
+ `ParamExp.Slice.Length` (Word nodes that can hold CmdSubst payloads)
606
+ are NOT visited. Pre-fix this defeated 17 round-7 PoCs including
607
+ `${X:$(rm)}`, `${X:0:$(rm)}`, `${arr[@]:$(rm)}`, `${@:$(rm)}` —
608
+ every paramexp-slice form bypassed every detector. That made
609
+ 0.23.0 a regression vs 0.22.0 (whose bash regex caught
610
+ `${X:$(rm)}` directly). Round-7 closure layers on round-6:
611
+
612
+ 1. Tactical fix: `walkForWrites` declares its visitor up front and
613
+ manually re-enters `syntax.Walk` on `ParamExp.Slice.Offset` /
614
+ `Slice.Length` whenever the visitor sees a ParamExp. The
615
+ re-entry uses the SAME visitor, so nested ParamExp.Slice forms
616
+ (e.g. `${X:${Y:$(rm)}}`) recurse correctly.
617
+ 2. Structural pin: the **Class O exhaustiveness contract test**
618
+ (`__tests__/hooks/bash-scanner/walker-exhaustiveness.contract.test.ts`)
619
+ enumerates every named (node-type, field) Word-bearing position
620
+ mvdan-sh's parser populates and asserts the walker reaches each
621
+ one. If mvdan-sh@0.11.0+ adds a new node-with-Word-field that
622
+ Walk skips, the contract fails CI before runtime. The fix in
623
+ that case is always a one-line manual recursion in the visit
624
+ callback (same pattern as `recurseParamExpSlice`).
625
+
626
+ Combined: walker-dispatch field-omission bugs in OUR code are
627
+ structurally impossible (Walk-based traversal). Walk's own field
628
+ gaps are pinned by Class O. New mvdan-sh versions cannot silently
629
+ introduce new bypass classes — they fail the contract first.
630
+ - segment-splitter mis-detection (helix-014/015/016) — there is no
631
+ segmenter to bypass.
632
+ - shell-redirect ordering vs argv ordering ambiguity — the parser
633
+ attaches `Redirs` to the right `Stmt`, including FuncDecl Body
634
+ Stmts (codex round 1 F-1).
635
+ - nested-shell payload bypass (helix-017 #2 / helix-022 #3) —
636
+ `bash -c PAYLOAD` re-parses the payload up to a depth cap of 8.
637
+ - `find -exec bash -c` re-wrap (codex round 1 F-4) — inner argv
638
+ routes through `detectNestedShell` for re-parse.
639
+ - combined flag-cluster bypass (`bash -ic`, `node -pe`, `perl -E`,
640
+ codex round 1 F-13/F-8) — the eval-flag scanner accepts any
641
+ short-flag cluster containing the eval char.
642
+ - backslash-escape bypass (codex round 1 F-15) — normalize strips
643
+ bash's runtime backslash collapse.
644
+ - dangling-symlink bypass (codex round 1 F-2) — `lstat`+`readlink`
645
+ resolves the link manually.
646
+ - REA_NODE_CLI hijack (codex round 1 F-3, codex round 2 R2-3) — env
647
+ var dropped entirely. The 4-tier resolver was further narrowed in
648
+ codex round 4 Finding 2: tiers 1 (`command -v rea` PATH lookup) and
649
+ 2 (`node_modules/.bin/rea` symlink) were DROPPED because both are
650
+ workspace-attacker-controllable. The shim now uses a 2-tier sandboxed
651
+ resolver (`node_modules/@bookedsolid/rea/dist/cli/index.js` or
652
+ `dist/cli/index.js`). Workspace-bin attacker hijack via fake `rea`
653
+ binary at `node_modules/.bin/rea` is closed. The structural-impossibility
654
+ claim is bounded — see §8.3 for the supply-chain residual.
655
+ - symlink-out-of-project sandbox bypass (codex round 5 F2 — P0) — pre-fix
656
+ the realpath check only walked up looking for a `package.json` with
657
+ `name === "@bookedsolid/rea"`. An attacker who could write a symlink
658
+ at `node_modules/@bookedsolid/rea` pointing to `/tmp/sym-attacker/`
659
+ containing a forged `package.json` + a forged `dist/cli/index.js` that
660
+ exits 0 with `{"verdict":"allow"}` defeated the sandbox. Fix: PRIMARY
661
+ check is `realpath(cli).startsWith(realpath(CLAUDE_PROJECT_DIR) + sep)`;
662
+ any escape-out-of-project realpath refuses. The pkg.json walk-up
663
+ remains as the SECONDARY check guarding intra-project hijacks.
664
+ - assignment-side cmdsubst silent bypass (codex round 5 F1 — P0) —
665
+ `FOO=$(rm -rf .rea)`, `X=\`rm -rf .rea\``, `export FOO=$(rm)`,
666
+ `readonly X=$(rm)`, `local`/`declare`/`typeset`, `ARR=( $(rm) )`,
667
+ `[[ -n $(rm) ]]`, `case $(rm) in *) ;; esac`, `cat <<< $(rm)`,
668
+ `read X < <(rm)`, `(( $(rm | wc -l) ))`, `for x in $(rm)`. Pre-fix the
669
+ walker short-circuited at `args.length === 0` and ignored
670
+ `CallExpr.Assigns`; the AST cases `DeclClause`, `TestClause`,
671
+ `ArithmCmd`, `LetClause`, `SelectClause` and `CaseClause.Word` were
672
+ dropped at the walkCmd default. Stmt.Redirs's Word also wasn't walked
673
+ for embedded CmdSubst on read ops (here-string `<<<` 0x3f, procsubst-
674
+ on-stdin `< <(...)` 0x38). Fix: new `walkAssignsForSubstNodes` walks
675
+ every Assign.Value / Assign.Array.Elems[*].Value / Assign.Index for
676
+ embedded CmdSubst/ProcSubst/ArithmExp; new `walkTestExpr` recurses
677
+ through UnaryTest/BinaryTest/ParenTest leaves; walkCmd cases added for
678
+ every dropped clause type; extractStmtRedirects walks the Word for
679
+ cmdsubst regardless of operator.
680
+ - mixed-quote interpreter shell-out (codex round 5 F3 — P1) — pre-fix
681
+ the per-language `*_SHELL_OUT_RE` patterns used `["']([^"']+)["']` for
682
+ the inner-cmd capture, which truncated bodies whose source contained
683
+ the alternate quote (e.g. `os.system('rm "x"')` captured `rm `). Fix:
684
+ quote-aware variants `(["'])((?:(?!\1)[^\\]|\\.)+)\1` for python /
685
+ ruby / node / perl, plus a fail-closed shell-out fallback that emits a
686
+ dynamic detection when the payload contains a shell-out API token but
687
+ no shell-out regex extracted a clean payload.
688
+ - chained-interpreter multi-level escape bypass (codex round 5 F4 — P1)
689
+ — pre-fix `python -c "import os; os.system('node -e \"require(\\\"fs
690
+ \\\").rmSync(\\\".rea\\\", ...)\"')"` allowed because each layer
691
+ accumulates a `\\\"` shell-escape level and the per-language path-
692
+ quote regex rejects `(\\"` after the call paren. Fix: when a shell-
693
+ out body itself contains a known interpreter binary head followed by
694
+ an eval flag (`-c`/`-e`/`--eval`/`-pe`/`-ic`), emit a dynamic
695
+ detection (`looksLikeChainedInterpreter`). Refuse on uncertainty.
696
+ - non-string `tool_input.command` (codex round 1 F-31) — refused at
697
+ CLI input parse.
698
+ - absolute / relative-path command head (codex round 2 R2-14) —
699
+ basename-normalized before dispatcher switch; `/usr/bin/bash`,
700
+ `./sed`, `/usr/bin/env bash` all dispatch identically to the
701
+ bare-name form.
702
+ - decoupled-variable interpreter writes (codex round 2 R2-1) —
703
+ flat-scan over the payload: write API + any string-construction
704
+ primitive → dynamic detection.
705
+ - symlink cycles / deep chains (codex round 2 R2-2) — visited-set +
706
+ depth cap (32); cycle/cap returns sentinel that maps to "refuse on
707
+ uncertainty".
708
+ - joined `-t<DIR>` form (codex round 2 R2-4) — cp/mv/install/ln all
709
+ recognize the no-space form.
710
+ - tar `-C DIR`, rsync DEST, curl `-o`, wget `-O`, shred FILE,
711
+ eval payload, git checkout/restore/reset path (codex round 2 R2-7
712
+ through R2-13) — each utility now has a dedicated dispatcher.
713
+ - heredoc-into-shell (codex round 2 R2-12) — `bash <<EOF\n…\nEOF`
714
+ re-parses the heredoc body and walks the inner AST.
715
+ - eval re-parse (codex round 2 R2-13) — argv concat → re-parse →
716
+ walk; refuse on dynamic argv or parse failure.
717
+ - eval ordering with cmdsubst (codex round 3 Finding 1 — P0) —
718
+ `eval $(cmd)` no longer slips through the empty-inner short-circuit;
719
+ any-dynamic-argv emits a dynamic detection.
720
+ - pipe-into-bare-shell (codex round 3 Finding 2 — P1) — `cmd | bash`
721
+ with no `-c` is refuse-on-uncertainty.
722
+ - tar cluster `-xzfC` (codex round 3 Finding 3 — P1) — value-bearing
723
+ cluster chars consume subsequent argv tokens correctly.
724
+ - git top-level value-bearing flags (codex round 3 Finding 4 — P1) —
725
+ `-C`, `-c`, `--git-dir`, `--work-tree`, etc. are walked past before
726
+ identifying the subcommand.
727
+ - python shell-out shapes (codex round 3 Finding 5 — P1) —
728
+ `subprocess.* shell=True` and `subprocess.run(..., stdout=open())`
729
+ re-parse the inner shell.
730
+ - recursive directory delete bypass (codex round 4 Finding 1 — P0) —
731
+ `rm -rf .rea`, `rmdir .rea`, `find .rea -delete`, `shutil.rmtree(...)`,
732
+ `fs.rmSync(..., {recursive:true})`, `FileUtils.rm_rf(...)` etc. all
733
+ flag isDestructive on emit; the matcher's protected-ancestry path
734
+ treats writes against an ancestor directory as hits on every protected
735
+ pattern under it. Structurally closed: `PROTECTED_DIR_ANCESTORS` was
736
+ added to the corpus generator so the cross-product produces directory-
737
+ shaped destructive fixtures, eliminating the structural gap that
738
+ prevented detection.
739
+ - mv source-side bypass (codex round 4 Finding 3 — P1) — `mv` source
740
+ positionals are emitted as destructive writes too (mv removes content
741
+ at the source).
742
+ - find -delete unmodeled (codex round 4 Finding 4 — P1) — seed paths
743
+ are emitted as destructive write targets; with `-name PREDICATE`
744
+ present, the seed is emitted as dynamic+destructive (refuse on
745
+ uncertainty).
746
+ - interpreter shell-out breadth (codex round 4 Finding 5 — P1) — perl
747
+ exec/open-pipe, ruby Open3 / IO.popen, node spawn-with-bash-c, python
748
+ pty.spawn, opaque-spawn APIs (os.spawnv* / os.execv*) all detected
749
+ with re-parse-or-refuse.
750
+ - pathlib & File-class destructive APIs (codex round 4 Finding 6 — P1)
751
+ — Path(...).touch / .unlink / .rmdir / .rename, File.delete /
752
+ File.unlink / File.rename, ruby `open(F, 'w')` (bare), perl unlink /
753
+ rename all caught with isDestructive plumbed through where the API
754
+ semantic is removal.
755
+ - misc utility writes (codex round 4 Finding 7 — P1) — `patch`, `sort
756
+ -o`, `shuf -o`, `gpg --output`, `split <PREFIX>`, `trap "cmd" SIGNAL`
757
+ (re-parse trap command), `git config --file` all detected.
758
+ - procsubst-feeding-bash (codex round 4 Finding 7 — P1) — `bash <(cmd)`,
759
+ `bash 0< <(cmd)`, `bash <<< "cmd"` all emit dynamic detections; the
760
+ shell reads the FIFO/here-string as a script we cannot statically
761
+ resolve.
762
+ - nested-shell DQ-escape parity (codex round 8 P0) — `unshellEscape`
763
+ pre-fix collapsed only `\"` and `\'` from DQ payloads. Bash spec
764
+ collapses 5 DQ-significant escape sequences (`\$`, `` \` ``, `\"`,
765
+ `\\`, `\<newline>`). Pre-fix `bash -c "echo \"\$(rm .rea/HALT)\""`
766
+ was a verified end-to-end exploit: the real shell collapsed `\$` →
767
+ `$`, evaluated the CmdSubst, and deleted HALT, while the walker's
768
+ re-parse saw `\$` as a literal `$` (no CmdSubst child) and ALLOWED.
769
+ Fix: widened `unshellEscape`'s replace class to all five DQ-significant
770
+ escape characters. Class P corpus (≥560 fixtures across 5 shells × 16
771
+ DQ-escape shapes × 7 protected targets, plus 8 negatives) pins the
772
+ closure. Class O contract (round-7) was simultaneously tightened —
773
+ the lenient `|| w.dynamic` acceptance was replaced with opt-in
774
+ `acceptDynamic` per row, so contract-test passes can no longer mask
775
+ walker gaps via unrelated `nested_shell_inner` dynamic emits.
776
+ - wrapper-shell-exec class (codex round 9 F1 + round 10 P1) —
777
+ `<wrapper> <shell> -c PAYLOAD` shape where the wrapper transparently
778
+ forks/execs the next argv as the "real" command (`nice`, `timeout`,
779
+ `chronic`, `parallel`, `watch`, `dbus-launch`, ... and unbounded
780
+ future similar wrappers). Pre-round-9 `stripEnvAndModifiers` ignored
781
+ these wrappers, so the head-dispatch saw the wrapper name and missed
782
+ the inner `<shell> -c PAYLOAD`. Round 9 enumerated 21 wrappers; round
783
+ 10 surfaced 5 more — clear evidence the enumeration approach was
784
+ unbounded. Round-10 closure is **structural**: a new
785
+ `detectWrappedNestedShell` pass runs in `walkCallExpr`'s `default:`
786
+ case (head not in dispatcher's allow-list) and detects the bypass
787
+ shape `<UNRECOGNIZED-HEAD> [...flags...] <KNOWN-SHELL> -c PAYLOAD`
788
+ REGARDLESS of wrapper identity. Synthesizes a `[shell, -c, PAYLOAD,
789
+ ...]` slice and re-dispatches through `detectNestedShell`. False-
790
+ positive guards (a) skip when head is an introspection / output
791
+ utility (`echo`, `printf`, `man`, `which`, ...) and (b) skip when
792
+ argv[1] is itself an introspection head — covers
793
+ `<wrapper> echo bash` shapes. Three-token lookahead window between
794
+ shell positional and `-c` flag bounds false-positive risk. Bare-
795
+ shell-without-`-c` form refuses on uncertainty (stdin read).
796
+ Closes the bug class — every future unknown wrapper that
797
+ fork/execs a shell is caught without enumeration. Round 10 also
798
+ added explicit enumerations for `chronic`/`dbus-launch`/`watch`/
799
+ `script -c`/`parallel ::: ` for clean dispatch (no
800
+ refuse-on-uncertainty banner). Class S (233 wrapper-extension
801
+ positives + 38 negatives) and Class T (314 synthetic-wrapper
802
+ structural-guard positives + 29 false-positive-guard negatives)
803
+ pin the closure.
804
+ - find-exec placeholder, git history-rewrite seams, archive
805
+ extraction, parallel-stdin, more wrappers, php (codex round 11
806
+ F11-1..F11-7) — seven INDEPENDENT classes against the round-10
807
+ wrapper closure. None were variants of the wrapper family;
808
+ each landed in a different parser seam. (a) `find . -name HALT
809
+ -exec rm {} \;` — `{}` is a placeholder substituted at runtime
810
+ by find against the live filesystem; static analysis cannot
811
+ resolve which paths it expands to. Round-11 fix: synthetic
812
+ `find_exec_placeholder_unresolvable` dynamic detection emitted
813
+ whenever inner argv has `{}` AND the inner head is not in a
814
+ small read-only allow-list (`cat`, `grep`, `head`, `wc`, etc.).
815
+ (b) `git rm -f .rea/HALT` and `git mv .rea/HALT /tmp/x` were
816
+ not in the `TRACKED` subcommand set; round-11 added explicit
817
+ branches with `--cached` carve-out for `git rm`. (c) `git
818
+ filter-branch --tree-filter PAYLOAD` and `git rebase --exec`
819
+ / `-x` / `git bisect run` / `git commit --template=PATH` were
820
+ re-parse seams where git feeds PAYLOAD through `/bin/sh -c` at
821
+ runtime; round-11 added per-subcommand handlers feeding PAYLOAD
822
+ through `recurseShellPayload` → `parseBashCommand` →
823
+ `walkForWrites` (full top-level walker re-dispatch). (d)
824
+ archive extraction: `tar -xf x.tar -C . .rea/HALT` extracts the
825
+ protected member; `tar -xzf x.tgz` (no `-C`, no member list)
826
+ extracts every member — archive may contain `.rea/HALT`.
827
+ Round-11 fix: `detectTar` extended with extract-mode positional
828
+ harvesting, plus new `detectUnzip`/`detect7z`/`detectGzip`/
829
+ `detectPax` dispatchers. `bsdtar` aliases to `tar`. When -x is
830
+ set with no `-C` AND no explicit members, emit
831
+ `archive_extract_unresolvable`. (e) `echo .rea/HALT | parallel
832
+ rm` reads input from stdin (no `:::`); round-11 added pre-strip
833
+ detection in `walkCallExpr` that emits
834
+ `parallel_stdin_unresolvable` when parallel head has positional
835
+ template tokens AND no `:::`/`::::`/`:::+`/`::::+` separator.
836
+ (f) `fakeroot`/`flock`/`gtimeout`/`unshare`/`env --chdir=`/
837
+ `sudo -s --` were not stripped by `stripEnvAndModifiers`;
838
+ round-11 added each wrapper with appropriate flag arity
839
+ handling, plus shell-mode `sudo -s --` synthesis to `sh -c
840
+ PAYLOAD`. (g) `php -r CODE` was missing from interpreter
841
+ dispatch; round-11 added `PHP_EVAL_FLAGS` (-r, --run) and
842
+ `PHP_WRITE_PATTERNS` covering `unlink`/`file_put_contents`/
843
+ `rename`/`copy`/`fopen` (write modes)/`mkdir`/`rmdir`/`touch`/
844
+ `chmod`/`chown`/`chgrp`/`symlink`/`link`/`move_uploaded_file`.
845
+ Class U (round-11 find/git/archive/parallel — 360 positives +
846
+ 17 negatives) and Class V (round-11 wrappers + PHP — 124
847
+ positives + 8 negatives) pin the closure.
848
+ - adjacent-utility / cumulative-parity gaps (codex round 12
849
+ F12-1..F12-9 — nine INDEPENDENT findings against the round-11
850
+ surface). Not variants of any prior round; each landed in
851
+ PHP / archive-create / cmake / mkfifo+mknod / find-write-
852
+ predicate space where round-11 had not applied the cumulative
853
+ discipline established by earlier rounds. (a) F12-1 P0:
854
+ PHP `rename(SRC, DEST)` SOURCE-side blindspot — round-4 F3
855
+ established mv-shape source IS destructive; round-11 bundled
856
+ PHP rename with the destination-only group, so SRC slipped
857
+ past. Round-12 fix: split rename into TWO patterns + add
858
+ `rename(` to DESTRUCTIVE_API_TOKENS. (b) F12-2 P0: PHP
859
+ `rmdir(PATH)` not flagged destructive — bundled with mkdir/
860
+ touch (creates), so protected-ancestry never matched. Round-12
861
+ fix: split rmdir + add `rmdir(` to DESTRUCTIVE_API_TOKENS.
862
+ (c) F12-3 P0: PHP shell-out missing entirely —
863
+ `pickShellOutPatternsFor` had no php_r_path case. Round-12
864
+ fix: new PHP_SHELL_OUT_RE with quote-aware backref body
865
+ extraction covering system / exec / shell_exec / passthru /
866
+ popen / proc_open / backtick. (d) F12-4 P0: PHP -B/-E /
867
+ --process-begin / --process-end accept CODE same as -r;
868
+ round-11 PHP_EVAL_FLAGS only had -r/--run. Round-12 fix:
869
+ extend exactLong + shortChars (case-sensitive uppercase).
870
+ (e) F12-5 P0: archive CREATE direction missing — only EXTRACT
871
+ was checked. `tar -cf .rea/policy.yaml docs/`, `zip
872
+ .rea/policy.yaml docs/file`, `7z a .rea/policy.yaml docs/`
873
+ all silently overwrote the OUTPUT archive at the protected
874
+ path. Round-12 fix: detectTar gains isCreateOrAppend pass +
875
+ -f/-cf/--file emit; detect7z gains a/u/d compress branch;
876
+ new detectZip dispatcher (zip OUTPUT.zip [files...]).
877
+ (f) F12-6 P1: cmake -E utility surface — rm/remove/rename/
878
+ copy/copy_if_different/copy_directory/touch/remove_directory/
879
+ create_symlink/create_hardlink/make_directory all slipped past
880
+ pre-fix. Round-12 fix: new detectCmake with per-subcommand
881
+ argv shapes (cp-shape, mv-shape, variadic, second-positional).
882
+ (g) F12-7 P1: mkfifo / mknod create special files at protected
883
+ paths; no dispatchers existed. Round-12 fix: new detectMkfifo
884
+ (variadic) and detectMknod (NAME is first bare positional).
885
+ (h) F12-8 P1: find write-predicates -fls / -fprint / -fprintf
886
+ not in detectFind. Round-12 fix: scan for these predicates
887
+ and emit FILE as destructive write target (-fprintf consumes
888
+ TWO args). (i) F12-9 P2 false-positive regression: detectUnzip
889
+ emitted dynamic unresolvable for read-only flags `-p` (extract
890
+ to stdout), `-l/-Z/-t/-v/-z` (list/test/verbose/comment).
891
+ Round-12 fix: early-return ALLOW from detectUnzip when any
892
+ read-only flag (or cluster char) present. Class W (round-12
893
+ closures — 173 positives + 18 negatives) pins the closure.
894
+ - **cwd-relative-write kill-switch defeat** (helix-024 F1 — P1
895
+ closed in 0.23.1). `cd .rea && echo > HALT`,
896
+ `cd .husky && echo > pre-push`, `(cd .rea && echo > HALT)`,
897
+ `pushd .rea && echo > HALT`, `echo x | (cd .rea && tee HALT)`,
898
+ `p=.rea; cd $p && echo > HALT` all defeated 0.23.0. Pre-fix the
899
+ walker emitted only the relative redirect target (`HALT`); the
900
+ scanner normalized `HALT` against REA_ROOT and got `HALT`,
901
+ which doesn't match `.rea/HALT`. The cd was structurally
902
+ invisible — the walker source explicitly documented this limit.
903
+ Closure: new `detectCwdChangeIntoProtected` post-walker pass
904
+ scans the AST a second time for `cd`/`pushd` CallExprs and
905
+ emits a synthetic `cwd_protected_unresolvable` (literal target
906
+ — scanner runs the protected-prefix test with
907
+ `forceDirSemantics: true`) or `cwd_dynamic_with_writes_unresolvable`
908
+ (dynamic target — refuse on uncertainty) under a four-rule
909
+ refined predicate (round-14 codex P1 over-correction fix,
910
+ tightened in round-15 closure below):
911
+ (1) writes must be in-scope of the cd (sequential successors
912
+ in the same StmtList, or BinaryCmd.Y subtree of cd's BinaryCmd.X,
913
+ or nested compound stmts of those — NOT unrelated parallel
914
+ stmts); (2) writes must be bare-relative path-shape (absolute /
915
+ tilde / outside-root sentinels are unaffected by cwd); (3)
916
+ dynamic cd with a known-safe source is treated as ALLOW. The
917
+ known-safe set is intentionally narrow: NO env-var name
918
+ qualifies (round-15 P1 closure — `$HOME`/`$PWD`/`$OLDPWD` are
919
+ rebindable via inline assignment-prefix or parent-shell export,
920
+ and `$OLDPWD` tracks any previous cd including into protected
921
+ dirs); the only ParamExp source that is known-safe is a for-iter
922
+ variable bound to all-literal-non-protected Items. Known-safe
923
+ cmdsubst sources are `$(pwd)` and `$(git rev-parse <flag>)`
924
+ with flag in `{--show-toplevel, --show-cdup,
925
+ --show-superproject-working-tree}` — flags that resolve to
926
+ absolute paths or paths stepping OUT of cwd. `$(git rev-parse
927
+ --show-prefix)` is NOT known-safe (round-15 P1 closure — it
928
+ returns the cwd-relative path INSIDE the toplevel, so when the
929
+ agent is already in `.rea/` it returns `.rea/`). (4) dynamic cd
930
+ without bare-relative writes in scope emits nothing. Caught:
931
+ literal protected cd + bare-relative writes in any nested scope,
932
+ dynamic cd with unknown / env-var / show-prefix source +
933
+ bare-relative write in scope. Accepted false-negatives
934
+ (out of scope for hotfix, tracked for 0.24.0):
935
+ `cd $(echo .rea)` cmdsubst-resolved literals,
936
+ `alias evil="..."; evil` alias-then-invoke, for-iter loops
937
+ whose Items list is a cmdsubst.
938
+ - **doubly/N-nested eval bypass** (helix-024 F2 — P1 closed in
939
+ 0.23.1). `eval "eval \"echo > .rea/HALT\""` defeated 0.23.0.
940
+ Pre-fix `detectEval` re-parsed exactly one level. The outer DQ-
941
+ significant escapes (`\"`) survived as literal backslash-quote
942
+ pairs into the joined inner string, corrupting the second-
943
+ level redirect target (`.rea/HALT\"`) so it didn't match the
944
+ protected list. Same shape as the helix-022 #3 nested-bash-c
945
+ DQ-escape bypass — `unshellEscape` collapses one level of DQ-
946
+ escapes before re-parse. Closure: two-part fix. (1)
947
+ unshellEscape the inner string before re-parsing in detectEval
948
+ (parity with detectNestedShell). (2) New `EVAL_DEPTH_CAP = 8`
949
+ counter mirroring NESTED_SHELL_DEPTH_CAP; past the cap the
950
+ walker emits a synthetic dynamic detection. Eval and bash -c
951
+ use SEPARATE counters so a heavy mix doesn't exhaust one
952
+ budget through the other. Single-level
953
+ `eval "echo > .rea/HALT"` continues to BLOCK (no regression).
954
+ - **symlink-alias-write kill-switch defeat** (helix-024 F3 — P1
955
+ closed in 0.23.1).
956
+ `ln -sf .rea/HALT /tmp/_x && echo y > /tmp/_x` exited 0 in 0.23.0. The static walker doesn't track
957
+ filesystem state across statements; the first statement stages
958
+ a symlink whose target is protected, the second writes through
959
+ the link. Neither token in isolation references a protected
960
+ write target — `ln`'s SRC was just a positional, DEST was
961
+ unprotected. Closure: detectLn emits a synthetic
962
+ `ln_to_protected_unresolvable` detection on every SRC positional
963
+ in both `ln SRC DEST` and `ln -t DEST_DIR SRC1 ...` forms. The
964
+ detection carries `isDestructive: true` so protected-ancestry
965
+ matches bare-dir SRCs (`ln -s .rea /tmp/_x` aliases the .rea
966
+ directory). Dynamic SRC emits the dynamic form. Class X
967
+ (helix-024 closures — 234 positives + 16 negatives) pins all
968
+ three closures. Related class deferred to 0.24.0:
969
+ alias-then-invoke (`alias evil="echo > .rea/HALT"; evil`)
970
+ requires modeling shell aliases at AST time.
971
+ - **env-var rebind + `--show-prefix` cwd-eval bypass**
972
+ (helix-024 round-15 P1 — closed in 0.23.1). Round-14's
973
+ `detectCwdChangeIntoProtected` refined predicate ALLOWed dynamic
974
+ cd whose source was an env-var ParamExp from
975
+ `{HOME, PWD, OLDPWD}` or a cmdsubst from `git rev-parse
976
+ --show-prefix`. Both classes are attacker-controllable. (1) Env
977
+ vars rebind via inline assignment-prefix on the same simple
978
+ command (`HOME=.rea cd "$HOME" && echo > HALT`) and via
979
+ parent-shell exports across commands (`export HOME=.rea; cd
980
+ "$HOME" && echo > HALT`); `OLDPWD` automatically tracks any
981
+ previous cd, so a prior cd into a protected dir poisons later
982
+ `cd "$OLDPWD"`. (2) `git rev-parse --show-prefix` returns the
983
+ cwd-relative path INSIDE the toplevel; when the agent's cwd is
984
+ already `.rea/`, `cd "$(git rev-parse --show-prefix)" && echo
985
+ > HALT` lands inside `.rea/` with a `HALT` write target that
986
+ resolves against `.rea/HALT`. Closure: empty `KNOWN_SAFE_VARS`
987
+ in `isParamExpKnownSafe` (no env-var name is statically safe;
988
+ the for-iter carve-out remains because Items literals are
989
+ statically checked); drop `--show-prefix` from the
990
+ `isCmdSubstKnownSafe` FLAGS allow-list (the remaining flags
991
+ `--show-toplevel`, `--show-cdup`,
992
+ `--show-superproject-working-tree` resolve to absolute paths or
993
+ paths stepping OUT of cwd — never INTO it). Class X corpus
994
+ rehomes: 3 fixtures moved from R14_ALLOW to R14_BLOCK
995
+ (`cd "$HOME"` / `cd "$OLDPWD"` / `pushd "$HOME"` with bare
996
+ writes), 4 new BLOCK fixtures pin the round-15 PoCs
997
+ (`HOME=.rea cd "$HOME"`, `PWD=.rea cd "$PWD"`,
998
+ `cd "$(git rev-parse --show-prefix)"`,
999
+ `export HOME=.rea; cd "$HOME"`). Single-level eval, ln-source-
1000
+ protected, and the literal `cd .rea` path remain unchanged. As
1001
+ a side improvement under round-15 P3, `.github/workflows/` is
1002
+ added to the historical default protected list so consumers
1003
+ without an explicit `policy.blocked_paths` entry still refuse
1004
+ Bash-tier writes to CI workflows; the path is intentionally NOT
1005
+ a kill-switch invariant — operators may relax it via
1006
+ `policy.protected_paths_relax`. Round-16 closure (helix-024
1007
+ hotfix continued, sibling threat class to round-15 F1) extends
1008
+ the refuse-on-uncertainty path to bare `cd` (defaults cwd to
1009
+ `$HOME`), `cd -L` / `cd -P` (flag-only, also default to
1010
+ `$HOME`), `cd -` (reverts to `$OLDPWD`), and `popd` (reverts
1011
+ to dir-stack head): all four forms leave no positional after
1012
+ flag-skip and previously fell through with no detection — they
1013
+ now run the same in-scope bare-relative-write check as the
1014
+ dynamic-target branch and emit
1015
+ `cwd_dynamic_with_writes_unresolvable` if a bare-relative write
1016
+ is in scope. 5 new R16_BLOCK fixtures + 4 R16-shape negatives
1017
+ added to Class X corpus.
1018
+ Round-17 closure (helix-024 hotfix continued, P1 + P2 + P3 +
1019
+ P3-doc — control-flow walker gap, NOT a predicate weakness): the
1020
+ round-14/15/16 walker visited a conditional's Cond and Body as
1021
+ separate scopes via `walkScopeForCwd`. A `cd` inside the Cond
1022
+ therefore had a single-command scope with no successors, never
1023
+ collected the body's writes as downstream, and never emitted —
1024
+ even though bash semantics keep the cwd change in the current
1025
+ shell so it persists into the Body when the cond is truthy AND
1026
+ past the conditional into post-stmt siblings. Closure: thread an
1027
+ `extraDownstream` parameter through `walkScopeForCwd` →
1028
+ `classifyCdInStmt` → `collectCdSitesInStmt` /
1029
+ `collectCdSitesInBinaryX`. When `descendCmdScopes` enters an
1030
+ IfClause/WhileClause/UntilClause, the Cond walk receives
1031
+ `[...body, ...post-stmt-siblings]` as carriers; the Body walk receives
1032
+ `[...post-stmt-siblings]`. Subshell stays cwd-isolated (forks a
1033
+ child shell) so its inner walk does NOT inherit parent siblings.
1034
+ The same closure adds explicit `TimeClause` / `CoprocClause`
1035
+ cases to `descendCmdScopes` (descend into the wrapped Stmt with
1036
+ carriers) and a TimeClause/CoprocClause unwrap in
1037
+ `collectCdSitesInBinaryX` so `time cd .rea && echo > HALT`
1038
+ reaches the cd site. `pushd` no-positional / `pushd -N` /
1039
+ `pushd +N` already BLOCK incidentally via the round-16 fallback
1040
+ (runtime-determined dir-stack manipulation refused on uncertainty),
1041
+ but R17 P3 pins the verdict with three explicit fixtures so a
1042
+ future predicate relaxation cannot silently re-open the bypass.
1043
+ 12 new R17_BLOCK fixtures + 3 R17_ALLOW negatives added to Class
1044
+ X corpus, including the pragmatic-bound ALLOW for `pushd && cat
1045
+ README.md` (no bare-relative WRITE in scope), `if cd /tmp; then
1046
+ echo > log; fi` (literal non-protected cd target — protected-
1047
+ prefix test ALLOWS), and `if cd .rea; then cat HALT; fi` (read-
1048
+ only body — predicate requires a WRITE).
1049
+
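The two-part eval closure described above (collapse one level of DQ escapes, then re-parse under a depth cap) can be sketched roughly as follows. This is a string-level illustration, not the package's AST walker: `unwrapEval`, `collapseDqEscapes`, and the `Unwrapped` type are hypothetical names, the regex stands in for a real parse, and only `EVAL_DEPTH_CAP = 8` is taken from the text.

```typescript
// Hypothetical sketch; none of these names are the package's actual API.
const EVAL_DEPTH_CAP = 8; // value taken from the text above

// Collapse one level of double-quote escapes. Inside "...", bash treats a
// backslash as special only before $, `, ", \ and newline.
function collapseDqEscapes(s: string): string {
  return s.replace(/\\([$`"\\]|\n)/g, (_m, ch: string) => (ch === "\n" ? "" : ch));
}

type Unwrapped =
  | { kind: "resolved"; command: string; depth: number }
  | { kind: "dynamic"; reason: string };

// Peel `eval "<payload>"` layers until the payload is no longer an eval.
// Past the cap, give up and report a dynamic (refuse-on-uncertainty) result.
function unwrapEval(command: string): Unwrapped {
  let current = command.trim();
  for (let depth = 0; depth <= EVAL_DEPTH_CAP; depth++) {
    const m = /^eval\s+"([\s\S]*)"$/.exec(current);
    if (m === null) return { kind: "resolved", command: current, depth };
    current = collapseDqEscapes(m[1] ?? "").trim();
  }
  return { kind: "dynamic", reason: "eval depth cap exceeded" };
}

// The doubly-nested PoC from the text resolves to its innermost command.
console.log(unwrapEval('eval "eval \\"echo > .rea/HALT\\""'));
```

Without the escape-collapsing step, the second layer would see `.rea/HALT\"` as the redirect target, which is the mismatch the text describes. A real implementation would hand each unwrapped payload back to the shell parser instead of matching with a regex.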
1050
+ ### 8.3 Bypass classes still possible
1051
+
1052
+ - **`mvdan-sh@0.10.1` deprecation advisory** (helix-024 F4 — P2
1053
+ acknowledged residual, surfaced 2026-05-04). The 0.23.0 upgrade
1054
+ introduced `mvdan-sh@0.10.1` as a transitive runtime dependency
1055
+ at the security boundary. The package is the JavaScript port of
1056
+ mvdan's Go shell parser and is upstream-deprecated per
1057
+ https://github.com/mvdan/sh/issues/1145 (the Go original is
1058
+ actively maintained; the JS port is on hold). The deprecation
1059
+ is a code-freeze, not a removal. Mitigations already in place:
1060
+ (1) integrity hash pinned in pnpm-lock.yaml, (2) the project
1061
+ fails closed on parser anomalies (parse errors → refuse on
1062
+ uncertainty), (3) Class O exhaustiveness contract pins the
1063
+ walker against any latent field-gap. A future mvdan-sh
1064
+ migration / replacement is out of scope for the helix-024
1065
+ hotfix; tracked for 0.24.0 evaluation. Listed as still-possible
1066
+ rather than structurally-impossible because the security model
1067
+ binds rea to a deprecated parser at the AST boundary.
1068
+ - **`@bookedsolid/rea` package-tier supply-chain compromise** (codex
1069
+ round 5 F5 — P1/P3 acknowledged residual). The bash-tier shim's
1070
+ CLI-resolution sandbox check (codex round 4 #2 + round 5 F2)
1071
+ defeats node_modules-symlink-out and workspace-bin hijack. It does
1072
+ NOT defeat an attacker who can write a forged
1073
+ `node_modules/@bookedsolid/rea/dist/cli/index.js` *and* a matching
1074
+ `node_modules/@bookedsolid/rea/package.json` with `name ===
1075
+ "@bookedsolid/rea"` directly inside the project's `node_modules/`.
1076
+ Such an attacker has already compromised the package install
1077
+ pipeline (e.g. via a compromised lockfile / dependency-confusion
1078
+ attack / npm registry compromise). At that level the attacker can
1079
+ also forge any other dependency the agent uses, so a hook-tier defense
1080
+ is moot: the trust boundary is the `npm install` provenance check
1081
+ (npm provenance + manifest verification) rather than the bash gate.
1082
+ Hardening (opt-in): operators may set `policy.review.cli_sha256: <hex>`
1083
+ in `.rea/policy.yaml` and the shim will refuse if the resolved CLI's
1084
+ SHA-256 does not match. Defaults to unset; documented as an opt-in
1085
+ belt-and-braces measure rather than a structural defense (see
1086
+ `docs/architecture/bash-scanner.md` for the full rationale).
1087
+ - **`git checkout REVISION PATH` and `git restore --source=REVISION
1088
+ PATH` without the POSIX `--` argv separator** (codex round 9 F3 —
1089
+ accepted false negative since 0.22.0). When an attacker invokes
1090
+ `git checkout main .rea/HALT` (or `git restore --source=HEAD~1
1091
+ .rea/policy.yaml`), git's pre-`--` argv shape is structurally
1092
+ ambiguous between "REVISION PATH" and "PATH...". The `detectGit`
1093
+ walker conservatively treats positionals after the subcommand as
1094
+ destructive-overwrite targets ONLY when `--` is present, because
1095
+ disambiguation requires a runtime ref-existence check that the
1096
+ static walker cannot perform without filesystem I/O. Workaround:
1097
+ the kill-switch invariants for protected files (`.rea/HALT`,
1098
+ `.rea/policy.yaml`, `.claude/settings.json`, etc.) are still caught
1099
+ by the symlink-resolution layer in `protected-paths-bash-gate.sh`
1100
+ at file-write time when git actually opens the destination for
1101
+ write — the bash-scanner's static layer is one of multiple
1102
+ enforcement points. Pin: 0.24.0 milestone for a comprehensive fix
1103
+ (likely a conservative refusal whenever the first positional
1104
+ matches a known git-revspec shape AND following positionals exist).
1105
+ - glob expansion in argv-based commands (`chmod +x bin/*.sh` on a
1106
+ filesystem where `bin/.rea` exists). Glob detection is scoped to
1107
+ redirect-form targets only because argv-globs in legitimate code
1108
+ are common. Future: enumerate filesystem-level glob matches.
1109
+ - awk `-f script-file` body. We currently emit a dynamic detection
1110
+ (refuse on uncertainty). Future: read + scan the file.
1111
+ - semantic obfuscation via `${!ind}` indirect expansion, `read -p`
1112
+ prompts, computed-property attacks in interpreter payloads that
1113
+ evade the flat-scan heuristic. These are parse-correct and the
1114
+ detector is best-effort.
1115
+ - WASM `sh-syntax` migration risk: parser bugs in the new library
1116
+ could reopen bypass classes. Mitigated by the corpus fixture
1117
+ suite (every closed bypass replays as a positive regression
1118
+ test).
1119
+ - **Round-13 deferred to 0.24.0** (utility-enumeration completeness
1120
+ — same shape as round 12; convergence at this tier is asymptotic
1121
+ per codex's explicit assessment):
1122
+ 1. **PHP indirect-callable shell-out** —
1123
+ `array_map("system", [...])`, `call_user_func("system", ...)`,
1124
+ `eval("system(...);")`, variable-bound callable
1125
+ `$f = "system"; $f(...)`. Round 12 closed direct calls;
1126
+ indirect callable forms remain.
1127
+ 2. **vim/emacs editor exec re-parse seam** — `vim -c "!cmd"`,
1128
+ `vim -c "w PATH"`, `vim -c "source PATH"`, `emacs --eval
1129
+ "(shell-command \"...\")"`, `emacs --eval "(delete-file
1130
+ \"...\")"`. No vim/emacs dispatcher in 0.23.0.
1131
+ 3. **Standalone compression utilities** — `xz -o FILE`,
1132
+ `zstd -o FILE`, `lz4 INPUT FILE`, `lzma -o FILE`. Round 12
1133
+ closed the tar/zip/7z/cpio/pax create direction; the adjacent
1134
+ standalone compression family remains open.
1135
+ 4. **Image-builder utilities** — `mksquashfs INPUT OUTPUT`,
1136
+ `genisoimage -o OUTPUT`, `mkisofs -o OUTPUT`, `xorriso
1137
+ -outdev OUTPUT`. CI/build pipelines use these.
1138
+ 5. **`7zz` Linux-canonical 7zip alias** — `detect7z` only fires
1139
+ on `7z`; the official Linux package binary is `7zz`.
1140
+ 6. **Python argv-form subprocess** — `subprocess.Popen(['rm',
1141
+ '...'])`, `subprocess.run(['rm', '...'])` etc. without
1142
+ `shell=True`. Round 3 F5 closed the `shell=True` form;
1143
+ the argv-list form is still allowed.
1144
+ 7. **vim ex-mode literal-path write** (P2) — `vim -c "w PATH"`
1145
+ writes literally with PATH in argv, regex-level miss
1146
+ independent of the broader `-c` re-parse seam.
1147
+ - **Denylist scanner is structurally limited** (acknowledged by
1148
+ codex round 4 and reaffirmed by codex round 13: "convergence on
1149
+ enumeration completeness is asymptotic — every round will probably
1150
+ find more"). A denylist enumerates the destructive shapes the
1151
+ scanner knows. Novel shapes (interpreters or utilities not yet
1152
+ modeled, language constructs we haven't seen) can in principle
1153
+ bypass until added. Defense in depth:
1154
+ 1. mvdan-sh AST parsing eliminates an entire class of regex/
1155
+ segmenter mistakes.
1156
+ 2. Comprehensive walker dispatchers per known destructive utility +
1157
+ per known shell-out + per known interpreter API.
1158
+ 3. Adversarial corpus generators span the parameter cross-product
1159
+ so generators produce shapes Codex hasn't visited.
1160
+ 4. Per-round Codex review surfaces gaps before release; the
1161
+ convergence ladder
1162
+ (round 1 → round 2 → round 3 → round 4 → round 5 ...) is the
1163
+ audit trail.
1164
+ 5. Fail-closed defaults: dynamic targets always block.
1165
+ An allowlist scanner ("only known-safe commands pass") would close
1166
+ this class structurally but is incompatible with the rea use case
1167
+ (agentic workflows need arbitrary bash access).
1168
+
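The opt-in `policy.review.cli_sha256` hardening mentioned in the supply-chain bullet above amounts to hashing the resolved CLI file and refusing on mismatch. A minimal sketch of that comparison, assuming the pin is a hex string and that an unset pin disables the check (as the text says it defaults to unset); `sha256Hex` and `cliMatchesPin` are hypothetical helper names, and the real check lives in the shim rather than in TypeScript:

```typescript
import { createHash } from "node:crypto";
import { mkdtempSync, readFileSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hypothetical helper: hex-encoded SHA-256 of a file's bytes.
function sha256Hex(filePath: string): string {
  return createHash("sha256").update(readFileSync(filePath)).digest("hex");
}

// Hypothetical check: an undefined pin means the opt-in check is disabled.
function cliMatchesPin(cliPath: string, pinnedHex?: string): boolean {
  if (pinnedHex === undefined) return true;
  return sha256Hex(cliPath) === pinnedHex.trim().toLowerCase();
}

// Demo against a throwaway file standing in for the resolved CLI.
const shaDemoDir = mkdtempSync(join(tmpdir(), "rea-sha-"));
const fakeCli = join(shaDemoDir, "index.js");
writeFileSync(fakeCli, "console.log('rea');\n");
const pinnedDigest = sha256Hex(fakeCli);
console.log("pristine matches pin:", cliMatchesPin(fakeCli, pinnedDigest));

// Overwriting the file changes the digest, so the pinned check now fails.
writeFileSync(fakeCli, "console.log('forged');\n");
console.log("forged matches pin:", cliMatchesPin(fakeCli, pinnedDigest));
```

As the text notes, this is belt-and-braces only: an attacker who can rewrite the installed package may also be able to rewrite the policy file that holds the pin.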
1169
+ ### 8.4 Test surface
1170
+
1171
+ The fixture corpus at
1172
+ `__tests__/hooks/bash-tier-corpus.test.ts` (≥185 entries) and
1173
+ `__tests__/hooks/bash-tier-corpus-round2.test.ts` (≥186 entries,
1174
+ codex round 2 bypass-class fixtures) locks every documented bypass
1175
+ class as a regression-positive test.
1176
+ The walker unit tests at `__tests__/hooks/bash-scanner/walker.test.ts`
1177
+ pin the parser-emitted RedirOperator codes (codex round 1 F-33) so a
1178
+ parser-library bump that re-numbers them fails LOUDLY. The verdict-
1179
+ shape snapshot at `__tests__/hooks/bash-scanner/verdict-shape.test.ts`
1180
+ locks the wire format for the bash shim consumers.
1181
+
1182
+ The cross-product corpus at
1183
+ `__tests__/hooks/bash-scanner/adversarial-corpus.test.ts` runs ≥7700
1184
+ fixtures across 18 classes (A–P plus extensions). Coverage assertion:
1185
+ ≥3000 positive (must-block) and ≥1000 negative (must-allow) fixtures.
1186
+
1187
+ The Class O exhaustiveness contract test at
1188
+ `__tests__/hooks/bash-scanner/walker-exhaustiveness.contract.test.ts`
1189
+ pins the walker reach across every Word-bearing AST position
1190
+ mvdan-sh's parser populates. Round-8 tightened the acceptance to
1191
+ path-explicit-by-default; opt-in `acceptDynamic` per row is the only
1192
+ way to accept a `dynamic: true` write as proof-of-reach.
1193
+
1194
+ The bash-shim subprocess sampling at
1195
+ `adversarial-corpus.test.ts > "bash shim subprocess sampling"`
1196
+ spawns the actual hook script under a clean env across 100
1197
+ deterministically-sampled fixtures, parses verdict JSON, and
1198
+ cross-checks against in-process scan. Catches drift between the
1199
+ in-process verdict and what the shim's JSON verifier + 2-tier
1200
+ resolver chain actually returns.
1201
+
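The "100 deterministically-sampled fixtures" step above relies on a sample that is stable from run to run. The text does not say how the sample is drawn, so the following is one common approach offered as an assumption: a small seeded PRNG (mulberry32) driving a partial Fisher-Yates shuffle. `mulberry32` and `sampleDeterministic` are illustrative names, not the test suite's helpers.

```typescript
// Seeded 32-bit PRNG (mulberry32). Same seed, same sequence.
function mulberry32(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Partial Fisher-Yates: pick `count` distinct items, reproducibly per seed.
function sampleDeterministic<T>(items: readonly T[], count: number, seed: number): T[] {
  const pool = items.slice();
  const rand = mulberry32(seed);
  const n = Math.min(count, pool.length);
  for (let i = 0; i < n; i++) {
    const j = i + Math.floor(rand() * (pool.length - i));
    const tmp = pool[i] as T;
    pool[i] = pool[j] as T;
    pool[j] = tmp;
  }
  return pool.slice(0, n);
}

// Stand-in fixture ids; the real corpus entries are not reproduced here.
const fixtureIds = Array.from({ length: 7700 }, (_, i) => `fixture-${i}`);
const sampled = sampleDeterministic(fixtureIds, 100, 42);
console.log(sampled.length, sampled.slice(0, 3));
```

A fixed seed keeps a failing subprocess run reproducible: the same 100 fixtures are spawned every time, so a drift report points at a stable fixture rather than a moving target.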
1202
+ ### 8.5 Defense in depth
1203
+
1204
+ The bash gate is one layer. The full defensive stack:
1205
+
1206
+ 1. **Parser AST** (`mvdan-sh@0.10.1`) — eliminates regex/segmenter
1207
+ tokenization mistakes.
1208
+ 2. **Walker** (`syntax.Walk`-based deny-by-default traversal +
1209
+ `recurseParamExpSlice` for Walk gaps) — visits every node type;
1210
+ no Cmd-kind branch can silently drop a field.
1211
+ 3. **Per-utility dispatchers** — comprehensive coverage of cp, mv,
1212
+ sed, dd, tee, install, ln, awk, ed, ex, find, xargs, node,
1213
+ python, ruby, perl, tar, rsync, curl, wget, shred, eval, git,
1214
+ patch, sort, shuf, gpg, split, trap, bash/sh/zsh/dash/ksh.
1215
+ 4. **Interpreter scanners** — write-API tokens for node fs, python
1216
+ os/shutil/subprocess, ruby Pathname/FileUtils, perl unlink/rename;
1217
+ shell-out re-parse for `system`/`subprocess.run shell=True`/`qx`/
1218
+ backticks.
1219
+ 5. **DQ-escape parity** — `unshellEscape` collapses all 5 DQ-significant
1220
+ escapes (round-8) so the re-parser sees the same syntax tree as bash.
1221
+ 6. **Symlink resolver** — visited-set + depth cap (32); refuses on
1222
+ cycle/cap; macOS `/var` ↔ `/private/var` aliasing handled by
1223
+ `rea_resolved_relative_form` (helix-021).
1224
+ 7. **2-tier sandboxed CLI resolver** — only
1225
+ `node_modules/@bookedsolid/rea/dist/cli/index.js` and
1226
+ `dist/cli/index.js` accepted; `realpath` containment check
1227
+ refuses any escape from `CLAUDE_PROJECT_DIR`.
1228
+ 8. **Verdict JSON verifier** — shim re-parses CLI output via
1229
+ `node -e` and cross-checks exit code ↔ verdict (round-1 F-3).
1230
+ 9. **Cross-product corpus** (Classes A–P) — ≥7700 fixtures span the
1231
+ parameter space so generators produce shapes the round-by-round
1232
+ manual review hadn't visited.
1233
+ 10. **Class O exhaustiveness contract** — pins every Word-bearing
1234
+ AST position so mvdan-sh upgrades cannot silently introduce
1235
+ Walk-skip bypasses.
1236
+ 11. **Codex adversarial review** — every release goes through
1237
+ `/codex-review` before merge; convergence ladder is the audit
1238
+ trail. 0.23.0 round count: 8 (and counting).
1239
+ 12. **Middleware audit log** — every tool invocation is hash-chained
1240
+ in `.rea/audit.jsonl` (append-only, tamper-evident).
1241
+ 13. **Codex push-gate** (0.11.0+) — pre-push stateless review by
1242
+ GPT-5.4 (codex-auto-review) catches semantic concerns the static
1243
+ scanner cannot reason about.
1244
+ 14. **Husky 9 hook chain** — `commit-msg`, `pre-push`, `pre-commit`
1245
+ register every hook in the package; consumers can extend via
1246
+ `.husky/{commit-msg,pre-push,pre-commit}.d/*` (helix-018 Option B).
1247
+
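Layer 6 in the list above (visited-set plus a depth cap of 32, refusing on cycle or cap) can be illustrated with a small resolver. This is a sketch under stated assumptions: it follows only the final path component, assumes a POSIX filesystem, and uses hypothetical names (`resolveFinalSymlinks`, `Resolution`). The real resolver is part of the bash shim and also handles the macOS `/var` aliasing case, which is not modelled here.

```typescript
import { lstatSync, mkdtempSync, readlinkSync, symlinkSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { dirname, join, resolve } from "node:path";

const SYMLINK_DEPTH_CAP = 32; // cap value taken from the text

type Resolution =
  | { ok: true; path: string }
  | { ok: false; reason: "cycle" | "depth-cap" };

// Follow symlinks on the final path component, refusing on a revisit or
// once more than SYMLINK_DEPTH_CAP links have been followed.
function resolveFinalSymlinks(start: string): Resolution {
  const visited = new Set<string>();
  let current = resolve(start);
  for (let hops = 0; hops <= SYMLINK_DEPTH_CAP; hops++) {
    if (visited.has(current)) return { ok: false, reason: "cycle" };
    visited.add(current);
    let isLink = false;
    try {
      isLink = lstatSync(current).isSymbolicLink();
    } catch {
      // Missing path: nothing left to follow, treat it as terminal.
    }
    if (!isLink) return { ok: true, path: current };
    current = resolve(dirname(current), readlinkSync(current));
  }
  return { ok: false, reason: "depth-cap" };
}

// Demo: one ordinary alias and one two-link cycle in a temp directory.
const demoRoot = mkdtempSync(join(tmpdir(), "rea-symlink-"));
writeFileSync(join(demoRoot, "target.txt"), "x");
symlinkSync("target.txt", join(demoRoot, "alias"));
symlinkSync("loop-b", join(demoRoot, "loop-a"));
symlinkSync("loop-a", join(demoRoot, "loop-b"));
console.log(resolveFinalSymlinks(join(demoRoot, "alias")));
console.log(resolveFinalSymlinks(join(demoRoot, "loop-a")));
```

Returning a refusal instead of throwing keeps the caller's fail-closed decision explicit: anything other than `ok: true` is treated as unresolvable.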
1248
+ A bypass requires defeating multiple layers simultaneously. The
1249
+ trust boundary between this stack and the rest of the system is
1250
+ package-tier integrity (npm provenance + manifest verification).
1251
+
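Layer 12 above describes the audit log as hash-chained and tamper-evident. The record layout is not given in this section, so the sketch below assumes a simple one: each record stores the previous record's hash, and its own hash covers that value plus the serialized event. `AuditRecord`, `appendRecord`, `verifyChain`, and the all-zero genesis value are hypothetical.

```typescript
import { createHash } from "node:crypto";

// Hypothetical record shape; the real audit.jsonl schema may differ.
interface AuditRecord {
  prevHash: string;
  payload: string; // JSON-serialized event
  hash: string;
}

const GENESIS_HASH = "0".repeat(64); // assumed placeholder for the first record

function recordHash(prevHash: string, payload: string): string {
  return createHash("sha256").update(prevHash).update("\n").update(payload).digest("hex");
}

// Returns a new chain with one more record linked to the current tail.
function appendRecord(chain: readonly AuditRecord[], event: unknown): AuditRecord[] {
  const last = chain[chain.length - 1];
  const prevHash = last === undefined ? GENESIS_HASH : last.hash;
  const payload = JSON.stringify(event);
  return [...chain, { prevHash, payload, hash: recordHash(prevHash, payload) }];
}

// Recompute every link; any edited, dropped, or reordered record breaks it.
function verifyChain(chain: readonly AuditRecord[]): boolean {
  let expectedPrev = GENESIS_HASH;
  for (const rec of chain) {
    if (rec.prevHash !== expectedPrev) return false;
    if (rec.hash !== recordHash(rec.prevHash, rec.payload)) return false;
    expectedPrev = rec.hash;
  }
  return true;
}

let auditChain: AuditRecord[] = [];
auditChain = appendRecord(auditChain, { tool: "Bash", verdict: "allow" });
auditChain = appendRecord(auditChain, { tool: "Bash", verdict: "block" });
console.log("chain verifies:", verifyChain(auditChain));
```

Tamper evidence here means detection, not prevention: an attacker who can rewrite the whole file can also recompute every hash, so the chain is only as strong as whatever anchors its tail.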
1252
+ ### 8.6 Out of scope
1253
+
1254
+ The bash gate explicitly does NOT defend against:
1255
+
1256
+ - **Kernel-level / LD_PRELOAD / ptrace attacks.** The OS kernel and
1257
+ loader are trusted. An attacker with code execution at that tier
1258
+ bypasses every userland defense.
1259
+ - **Network-tier MITM during package install.** rea trusts the npm
1260
+ registry (with provenance verification) and the system's TLS root
1261
+ store. A compromised TLS chain at `npm install` time gives the
1262
+ attacker the same authority as the package itself.
1263
+ - **Supply-chain compromise of `@bookedsolid/rea` on npm.** A signed
1264
+ release with malicious code defeats the gate because the gate IS
1265
+ the malicious code. Mitigation is npm provenance + manifest
1266
+ verification + (opt-in) `policy.review.cli_sha256`. See §8.3.
1267
+ - **Out-of-band file modifications.** rea gates Bash tool calls and
1268
+ Write/Edit/MultiEdit Write-tier hooks. Filesystem changes initiated
1269
+ outside the harness (user editing files directly, language server
1270
+ edits, other processes) are not gated.
1271
+ - **Read-side policy leaks.** The bash gate concerns WRITES. Reading
1272
+ `.rea/policy.yaml` is allowed by default — the policy is checked-in
1273
+ and visible. `env-file-protection.sh` handles `.env*` reads at the
1274
+ Write tier; bash-tier coverage of `.env*` reads is via
1275
+ `dependency-audit-gate.sh` and the segmenter for those forms.
1276
+ - **Attacker-controlled PATH at scanner runtime.** If `rea` resolves
1277
+ to an attacker binary on PATH, the gate is defeated. Production
1278
+ deployments pin PATH via the harness; `rea doctor` verifies PATH
1279
+ integrity at install time but does not enforce it at runtime.