@bookedsolid/rea 0.22.0 → 0.23.0
- package/README.md +15 -0
- package/THREAT_MODEL.md +582 -0
- package/dist/audit/append.js +1 -1
- package/dist/cli/doctor.js +11 -12
- package/dist/cli/hook.d.ts +37 -3
- package/dist/cli/hook.js +167 -5
- package/dist/cli/init.js +14 -26
- package/dist/cli/install/canonical.js +18 -3
- package/dist/cli/install/commit-msg.js +1 -2
- package/dist/cli/install/copy.js +4 -13
- package/dist/cli/install/fs-safe.js +5 -16
- package/dist/cli/install/gitignore.js +1 -5
- package/dist/cli/install/pre-push.js +3 -8
- package/dist/cli/install/settings-merge.js +79 -16
- package/dist/cli/upgrade.js +14 -10
- package/dist/gateway/downstream.js +1 -2
- package/dist/gateway/live-state.js +3 -1
- package/dist/gateway/log.js +1 -3
- package/dist/gateway/middleware/audit.js +1 -1
- package/dist/gateway/middleware/injection.js +3 -9
- package/dist/gateway/middleware/policy.js +3 -1
- package/dist/gateway/middleware/redact.js +1 -1
- package/dist/gateway/observability/codex-telemetry.js +1 -2
- package/dist/gateway/reviewers/claude-self.js +10 -6
- package/dist/hooks/bash-scanner/blocked-scan.d.ts +26 -0
- package/dist/hooks/bash-scanner/blocked-scan.js +467 -0
- package/dist/hooks/bash-scanner/index.d.ts +41 -0
- package/dist/hooks/bash-scanner/index.js +62 -0
- package/dist/hooks/bash-scanner/parse-fail-closed.d.ts +31 -0
- package/dist/hooks/bash-scanner/parse-fail-closed.js +27 -0
- package/dist/hooks/bash-scanner/parser.d.ts +42 -0
- package/dist/hooks/bash-scanner/parser.js +92 -0
- package/dist/hooks/bash-scanner/protected-scan.d.ts +76 -0
- package/dist/hooks/bash-scanner/protected-scan.js +815 -0
- package/dist/hooks/bash-scanner/verdict.d.ts +80 -0
- package/dist/hooks/bash-scanner/verdict.js +49 -0
- package/dist/hooks/bash-scanner/walker.d.ts +165 -0
- package/dist/hooks/bash-scanner/walker.js +7954 -0
- package/dist/hooks/push-gate/base.js +2 -6
- package/dist/hooks/push-gate/codex-runner.js +3 -1
- package/dist/hooks/push-gate/index.js +9 -10
- package/dist/policy/loader.js +4 -1
- package/dist/registry/tofu-gate.js +2 -2
- package/hooks/blocked-paths-bash-gate.sh +142 -272
- package/hooks/protected-paths-bash-gate.sh +227 -511
- package/package.json +3 -2
- package/profiles/bst-internal-no-codex.yaml +1 -1
- package/profiles/bst-internal.yaml +1 -1
- package/profiles/client-engagement.yaml +1 -1
- package/profiles/lit-wc.yaml +1 -1
- package/profiles/minimal.yaml +1 -1
- package/profiles/open-source-no-codex.yaml +1 -1
- package/profiles/open-source.yaml +1 -1
- package/scripts/postinstall.mjs +1 -2
- package/scripts/run-vitest.mjs +117 -0
package/README.md
CHANGED
|
@@ -152,6 +152,21 @@ PR-issue-link advisory, architecture advisory). Each hook uses
|
|
|
152
152
|
runs a HALT check near the top. See [Hooks shipped](#hooks-shipped) for
|
|
153
153
|
the full inventory.
|
|
154
154
|
|
|
155
|
+
**Bash-tier scanner (parser-backed since 0.23.0).** Two hooks —
|
|
156
|
+
`protected-paths-bash-gate.sh` and `blocked-paths-bash-gate.sh` — are
|
|
157
|
+
shims that forward stdin to `rea hook scan-bash`, a CLI subcommand
|
|
158
|
+
that parses the Bash command via `mvdan-sh@0.10.1`, walks the AST,
|
|
159
|
+
and emits a verdict JSON. Pre-0.23.0 these were 500-line bash regex
|
|
160
|
+
pipelines; the rewrite closes 24 known-bypass classes
|
|
161
|
+
(helix-021..023 + discord-ops Round 13 + codex round 1) by replacing
|
|
162
|
+
re-tokenization heuristics with structural matches against the parsed
|
|
163
|
+
argv tree. The other nine hooks remain regex-based bash. The shim
|
|
164
|
+
re-verifies the verdict JSON shape on return so a tampered
|
|
165
|
+
`REA_NODE_CLI` env var cannot bypass. See
|
|
166
|
+
[`docs/architecture/bash-scanner.md`](docs/architecture/bash-scanner.md)
|
|
167
|
+
for the AST-walker design and [`docs/migration/0.23.0.md`](docs/migration/0.23.0.md)
|
|
168
|
+
for consumer migration notes.
|
|
169
|
+
|
|
155
170
|
The hook layer runs independently of the MCP gateway — bypassing one does
|
|
156
171
|
not disable the other. That redundancy is intentional.
|
|
157
172
|
|
package/THREAT_MODEL.md
CHANGED
|
@@ -524,3 +524,585 @@ REA operates two independent layers. Bypassing one does not disable the other.
|
|
|
524
524
|
**Gateway layer** (runtime, `rea serve`): A middleware chain processes every proxied MCP tool call. Middleware enforces: audit, kill switch, policy/autonomy level, tier classification, blocked paths, rate limit, circuit breaker, prompt-injection classification (§5.21), secret redaction (pre and post), and result size cap. The gateway also supervises downstream child processes (§5.14), emits a `SESSION_BLOCKER` audit event on persistent failure (§5.15), and publishes a live per-downstream state snapshot to `.rea/serve.state.json` (§5.16) that `rea status` reads read-only. The `__rea__health` meta-tool short-circuits the chain for callability under HALT and runs a dedicated sanitizer on its response (§5.17).
|
|
525
525
|
|
|
526
526
|
Both layers fail closed: on read failure, parse error, unknown errno on HALT, regex timeout, or any unexpected condition, the default action is deny (or for redaction specifically: replace with a sentinel — the content never escapes unscanned).
|
|
527
|
+
|
|
528
|
+
---
|
|
529
|
+
|
|
530
|
+
## 8. Bash-tier scanner (parser-backed, 0.23.0+)
|
|
531
|
+
|
|
532
|
+
Two of the shipped hooks — `protected-paths-bash-gate.sh` and
|
|
533
|
+
`blocked-paths-bash-gate.sh` — are thin shims that forward stdin to
|
|
534
|
+
the `rea hook scan-bash` CLI subcommand. The CLI parses the Bash
|
|
535
|
+
command via `mvdan-sh@0.10.1`, walks the AST in
|
|
536
|
+
`src/hooks/bash-scanner/walker.ts`, and applies per-utility detectors
|
|
537
|
+
that produce a `DetectedWrite[]`. The scanner then matches each
|
|
538
|
+
detection's path against the protected-paths or blocked_paths policy
|
|
539
|
+
and emits a verdict JSON. The shim re-verifies the verdict shape via
|
|
540
|
+
`node -e` before honoring the exit code (defense against a tampered
|
|
541
|
+
`REA_NODE_CLI` that returns exit 0 with empty stdout).
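A minimal TypeScript sketch of the fail-closed cross-check described above, for illustration only. The real shim performs this step through `node -e`, and its full verdict schema is not shown in this diff. Only the `{"verdict":"allow"}` shape appears elsewhere in the document, so the helper names `parseVerdict` and `shouldAllow` are hypothetical, as is the rule of treating every other shape as a refusal.

```typescript
// Hypothetical helper names; only the {"verdict":"allow"} shape comes from the diff.
interface Verdict {
  verdict: string;
}

// Returns the parsed verdict, or null when stdout is not a well-formed verdict object.
function parseVerdict(stdout: string): Verdict | null {
  let parsed: unknown;
  try {
    parsed = JSON.parse(stdout);
  } catch {
    return null; // empty or non-JSON stdout
  }
  if (typeof parsed !== "object" || parsed === null) return null;
  const candidate = (parsed as Record<string, unknown>).verdict;
  return typeof candidate === "string" ? { verdict: candidate } : null;
}

// Fail closed: allow only when the JSON says "allow" AND the exit code agrees.
function shouldAllow(stdout: string, exitCode: number): boolean {
  const v = parseVerdict(stdout);
  return v !== null && v.verdict === "allow" && exitCode === 0;
}
```

Under these assumptions a CLI that exits 0 with empty stdout is refused, because the missing JSON is treated the same as a deny.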
|
|
542
|
+
|
|
543
|
+
### 8.1 Trust assumptions
|
|
544
|
+
|
|
545
|
+
The scanner trusts the following components. Each row names what we
|
|
546
|
+
trust, what would happen if the trust were violated, and what pins
|
|
547
|
+
the trust.
|
|
548
|
+
|
|
549
|
+
| Component | What we trust | If violated | Pinned by |
|
|
550
|
+
| ---------------------------------- | -------------------------------------------------------------------------------------------------------- | ----------------------------------------------------------------- | -------------------------------------------------------------------------------------- |
|
|
551
|
+
| `mvdan-sh@0.10.1` | Produces a faithful AST from any input bash accepts | Detector misses target on mis-shaped node | Pinned version, RedirOperator op-code snapshot tests, Class O exhaustiveness contract |
|
|
552
|
+
| Walker dispatch table | Enumerates every shape that produces a write | Novel utility silently allowed | Per-PR corpus fixture requirement; 18 corpus classes; convergence ladder |
|
|
553
|
+
| `fs.lstatSync` / `fs.readlinkSync` | Identify symlinks including dangling links | Symlink-out bypass class reopens | Codex round 1 F-2 closure + symlink corpus |
|
|
554
|
+
| `node` on PATH in the shim | Verdict-JSON verifier runs | Shim refuses on uncertainty (fail-closed) | Shim test for missing-node branch |
|
|
555
|
+
| Project-root realpath | `realpath(cli).startsWith(realpath(CLAUDE_PROJECT_DIR))` defends symlink-out | Forged CLI path inside `node_modules` defeats CLI-resolver | Codex round 5 F2 closure + corpus |
|
|
556
|
+
| OS realpath semantics              | `node:fs.realpathSync` resolves symlinks consistently with the kernel; macOS `/var` ↔ `/private/var` aliasing handled via the `rea_resolved_relative_form` helper | Path-traversal escape                                             | helix-021 closure + corpus                                                             |
|
|
557
|
+
| `@bookedsolid/rea` package install | `node_modules/@bookedsolid/rea/package.json#name === "@bookedsolid/rea"` AND realpath stays in project | Supply-chain compromise — see §8.3 | npm provenance; opt-in `policy.review.cli_sha256` (deferred to future minor) |
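The project-root realpath row can be illustrated with a small sketch. It assumes both paths have already been resolved (for example with `fs.realpathSync`), and the function name `isInsideProject` is hypothetical. The sketch appends the path separator before the prefix test, matching the `+ sep` form quoted later in §8.2. Without it, a sibling directory whose name merely starts with the project name would pass.

```typescript
// Hypothetical helper: both arguments are assumed to be realpath-resolved already.
function isInsideProject(
  cliRealPath: string,
  projectRealPath: string,
  sep: string = "/",
): boolean {
  // Normalize the root so it always ends with exactly one separator.
  const root = projectRealPath.endsWith(sep)
    ? projectRealPath
    : projectRealPath + sep;
  // A plain prefix test is only safe once the separator is part of the prefix.
  return cliRealPath.startsWith(root);
}
```

With a project root of `/work/app`, a CLI resolved to `/tmp/sym-attacker/dist/cli/index.js` is rejected, and so is `/work/app-evil/dist/cli/index.js`, which a bare `startsWith("/work/app")` would accept.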
|
|
558
|
+
|
|
559
|
+
The scanner does NOT trust:
|
|
560
|
+
|
|
561
|
+
- The Bash command string itself — every input is parsed with a
|
|
562
|
+
hostile-input-tolerant parser and walked with a deny-by-default
|
|
563
|
+
visitor.
|
|
564
|
+
- `REA_NODE_CLI` / any environment variable nominating the CLI path
|
|
565
|
+
(codex round 1 F-3 + round 2 R2-3 — env-var hijack class dropped).
|
|
566
|
+
- The CLI exit code alone — the shim re-parses verdict JSON via
|
|
567
|
+
`node -e` and cross-checks exit code matches verdict (round 1 F-3).
|
|
568
|
+
- The visitor's per-`Cmd`-kind enumeration — replaced with `syntax.Walk`
|
|
569
|
+
in round-6 (round 6 closure: walker-dispatch field-omission is
|
|
570
|
+
structurally impossible).
|
|
571
|
+
- mvdan-sh's `syntax.Walk` to visit every Word-bearing field — Class O
|
|
572
|
+
contract test pins reach (round-7 closure).
|
|
573
|
+
- `unshellEscape` to handle every DQ-escape sequence — bash spec
|
|
574
|
+
enumerated and pinned by Class P corpus (round-8 closure).
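To make the DQ-escape item concrete, here is a hedged sketch of collapsing the five escape sequences that bash treats as significant inside double quotes (`\$`, `` \` ``, `\"`, `\\`, and backslash-newline, as listed in the round-8 entry of §8.2). The name `collapseDqEscapes` is hypothetical; this is not the package's `unshellEscape`, only an illustration of the same rule.

```typescript
// Hypothetical illustration of bash's double-quote escape rule.
// Inside "...", a backslash is special only before $, `, ", \ or a newline.
function collapseDqEscapes(body: string): string {
  return body.replace(/\\([$`"\\\n])/g, (_match: string, ch: string) =>
    // Backslash-newline is a line continuation: both characters disappear.
    ch === "\n" ? "" : ch,
  );
}
```

Applied to the payload `echo \"\$(rm .rea/HALT)\"`, the sketch yields `echo "$(rm .rea/HALT)"`, so a re-parse of the collapsed text would see the command substitution that a narrower collapse hides.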
|
|
575
|
+
|
|
576
|
+
### 8.2 Bypass classes structurally impossible
|
|
577
|
+
|
|
578
|
+
- **walker dispatch field-omission is structurally impossible**, and
|
|
579
|
+
**mvdan-sh `syntax.Walk` field gaps are pinned by a contract test**
|
|
580
|
+
(0.23.0 round-6 + round-7 layered closure).
|
|
581
|
+
|
|
582
|
+
Round 6 — our own dispatch. Pre-refactor `walkForWrites` dispatched
|
|
583
|
+
on AST `Cmd` kinds via an explicit `case` ladder
|
|
584
|
+
(`case 'WhileClause':`, `case 'ForClause':`, …) and manually
|
|
585
|
+
enumerated which fields each kind traversed. Any field NOT
|
|
586
|
+
enumerated in a case branch was silently dropped — that pattern
|
|
587
|
+
produced six rounds of P0 bypasses (rounds 1-5 patched detection
|
|
588
|
+
gaps; round 6 closed the structural class). Round 6 found two P0s
|
|
589
|
+
in the same class as round 5 (`WhileClause.Cond`,
|
|
590
|
+
`ForClause.CStyleLoop.{Init,Cond,Post}`) and the convergence ladder
|
|
591
|
+
34→14→9→8→5→2 demonstrated the walker would never reach 0 with
|
|
592
|
+
patches alone — it was structurally a denylist over AST shapes.
|
|
593
|
+
The new walker uses `mvdan-sh`'s built-in `syntax.Walk(node, visit)`
|
|
594
|
+
which traverses every Cmd-kind dispatch exhaustively from OUR side.
|
|
595
|
+
Our dispatch is preserved (per-utility cp/mv/sed/find/etc.), but the
|
|
596
|
+
TRAVERSAL is no longer a denylist of OUR shapes. A new `Cmd` type
|
|
597
|
+
added to mvdan-sh, or a new field on an existing type, reaches OUR
|
|
598
|
+
dispatcher when Walk descends into its inner Stmts / CallExprs /
|
|
599
|
+
BinaryCmds.
|
|
600
|
+
|
|
601
|
+
Round 7 — Walk's own field gaps. Codex round 7 (P0) flagged that
|
|
602
|
+
the round-6 framing "Walk visits every field" was an overclaim:
|
|
603
|
+
mvdan-sh@0.10.1's `syntax.Walk` itself has empirically-verified
|
|
604
|
+
field gaps. Specifically, `ParamExp.Slice.Offset` and
|
|
605
|
+
`ParamExp.Slice.Length` (Word nodes that can hold CmdSubst payloads)
|
|
606
|
+
are NOT visited. Pre-fix this defeated 17 round-7 PoCs including
|
|
607
|
+
`${X:$(rm)}`, `${X:0:$(rm)}`, `${arr[@]:$(rm)}`, `${@:$(rm)}` —
|
|
608
|
+
every paramexp-slice form bypassed every detector. That made
|
|
609
|
+
0.23.0 a regression vs 0.22.0 (whose bash regex caught
|
|
610
|
+
`${X:$(rm)}` directly). Round-7 closure layers on round-6:
|
|
611
|
+
|
|
612
|
+
1. Tactical fix: `walkForWrites` declares its visitor up front and
|
|
613
|
+
manually re-enters `syntax.Walk` on `ParamExp.Slice.Offset` /
|
|
614
|
+
`Slice.Length` whenever the visitor sees a ParamExp. The
|
|
615
|
+
re-entry uses the SAME visitor, so nested ParamExp.Slice forms
|
|
616
|
+
(e.g. `${X:${Y:$(rm)}}`) recurse correctly.
|
|
617
|
+
2. Structural pin: the **Class O exhaustiveness contract test**
|
|
618
|
+
(`__tests__/hooks/bash-scanner/walker-exhaustiveness.contract.test.ts`)
|
|
619
|
+
enumerates every named (node-type, field) Word-bearing position
|
|
620
|
+
mvdan-sh's parser populates and asserts the walker reaches each
|
|
621
|
+
one. If mvdan-sh@0.11.0+ adds a new node-with-Word-field that
|
|
622
|
+
Walk skips, the contract fails CI before runtime. The fix in
|
|
623
|
+
that case is always a one-line manual recursion in the visit
|
|
624
|
+
callback (same pattern as `recurseParamExpSlice`).
|
|
625
|
+
|
|
626
|
+
Combined: walker-dispatch field-omission bugs in OUR code are
|
|
627
|
+
structurally impossible (Walk-based traversal). Walk's own field
|
|
628
|
+
gaps are pinned by Class O. New mvdan-sh versions cannot silently
|
|
629
|
+
introduce new bypass classes — they fail the contract first.
|
|
630
|
+
- segment-splitter mis-detection (helix-014/015/016) — there is no
|
|
631
|
+
segmenter to bypass.
|
|
632
|
+
- shell-redirect ordering vs argv ordering ambiguity — the parser
|
|
633
|
+
attaches `Redirs` to the right `Stmt`, including FuncDecl Body
|
|
634
|
+
Stmts (codex round 1 F-1).
|
|
635
|
+
- nested-shell payload bypass (helix-017 #2 / helix-022 #3) —
|
|
636
|
+
`bash -c PAYLOAD` re-parses the payload up to a depth cap of 8.
|
|
637
|
+
- `find -exec bash -c` re-wrap (codex round 1 F-4) — inner argv
|
|
638
|
+
routes through `detectNestedShell` for re-parse.
|
|
639
|
+
- combined flag-cluster bypass (`bash -ic`, `node -pe`, `perl -E`,
|
|
640
|
+
codex round 1 F-13/F-8) — the eval-flag scanner accepts any
|
|
641
|
+
short-flag cluster containing the eval char.
|
|
642
|
+
- backslash-escape bypass (codex round 1 F-15) — normalize strips
|
|
643
|
+
bash's runtime backslash collapse.
|
|
644
|
+
- dangling-symlink bypass (codex round 1 F-2) — `lstat`+`readlink`
|
|
645
|
+
resolves the link manually.
|
|
646
|
+
- REA_NODE_CLI hijack (codex round 1 F-3, codex round 2 R2-3) — env
|
|
647
|
+
var dropped entirely. The 4-tier resolver was further narrowed in
|
|
648
|
+
codex round 4 Finding 2: tiers 1 (`command -v rea` PATH lookup) and
|
|
649
|
+
2 (`node_modules/.bin/rea` symlink) were DROPPED because both are
|
|
650
|
+
workspace-attacker-controllable. The shim now uses a 2-tier sandboxed
|
|
651
|
+
resolver (`node_modules/@bookedsolid/rea/dist/cli/index.js` or
|
|
652
|
+
`dist/cli/index.js`). Workspace-bin attacker hijack via fake `rea`
|
|
653
|
+
binary at `node_modules/.bin/rea` is closed. The structural-impossibility
|
|
654
|
+
claim is bounded — see §8.3 for the supply-chain residual.
|
|
655
|
+
- symlink-out-of-project sandbox bypass (codex round 5 F2 — P0) — pre-fix
|
|
656
|
+
the realpath check only walked up looking for a `package.json` with
|
|
657
|
+
`name === "@bookedsolid/rea"`. An attacker who could write a symlink
|
|
658
|
+
at `node_modules/@bookedsolid/rea` pointing to `/tmp/sym-attacker/`
|
|
659
|
+
containing a forged `package.json` + a forged `dist/cli/index.js` that
|
|
660
|
+
exits 0 with `{"verdict":"allow"}` defeated the sandbox. Fix: PRIMARY
|
|
661
|
+
check is `realpath(cli).startsWith(realpath(CLAUDE_PROJECT_DIR) + sep)`;
|
|
662
|
+
any escape-out-of-project realpath refuses. The pkg.json walk-up
|
|
663
|
+
remains as the SECONDARY check guarding intra-project hijacks.
|
|
664
|
+
- assignment-side cmdsubst silent bypass (codex round 5 F1 — P0) —
|
|
665
|
+
`FOO=$(rm -rf .rea)`, `X=\`rm -rf .rea\``, `export FOO=$(rm)`,
|
|
666
|
+
`readonly X=$(rm)`, `local`/`declare`/`typeset`, `ARR=( $(rm) )`,
|
|
667
|
+
`[[ -n $(rm) ]]`, `case $(rm) in *) ;; esac`, `cat <<< $(rm)`,
|
|
668
|
+
`read X < <(rm)`, `(( $(rm | wc -l) ))`, `for x in $(rm)`. Pre-fix the
|
|
669
|
+
walker short-circuited at `args.length === 0` and ignored
|
|
670
|
+
`CallExpr.Assigns`; the AST cases `DeclClause`, `TestClause`,
|
|
671
|
+
`ArithmCmd`, `LetClause`, `SelectClause` and `CaseClause.Word` were
|
|
672
|
+
dropped at the walkCmd default. Stmt.Redirs's Word also wasn't walked
|
|
673
|
+
for embedded CmdSubst on read ops (here-string `<<<` 0x3f, procsubst-
|
|
674
|
+
on-stdin `< <(...)` 0x38). Fix: new `walkAssignsForSubstNodes` walks
|
|
675
|
+
every Assign.Value / Assign.Array.Elems[*].Value / Assign.Index for
|
|
676
|
+
embedded CmdSubst/ProcSubst/ArithmExp; new `walkTestExpr` recurses
|
|
677
|
+
through UnaryTest/BinaryTest/ParenTest leaves; walkCmd cases added for
|
|
678
|
+
every dropped clause type; extractStmtRedirects walks the Word for
|
|
679
|
+
cmdsubst regardless of operator.
|
|
680
|
+
- mixed-quote interpreter shell-out (codex round 5 F3 — P1) — pre-fix
|
|
681
|
+
the per-language `*_SHELL_OUT_RE` patterns used `["']([^"']+)["']` for
|
|
682
|
+
the inner-cmd capture, which truncated bodies whose source contained
|
|
683
|
+
the alternate quote (e.g. `os.system('rm "x"')` captured `rm `). Fix:
|
|
684
|
+
quote-aware variants `(["'])((?:(?!\1)[^\\]|\\.)+)\1` for python /
|
|
685
|
+
ruby / node / perl, plus a fail-closed shell-out fallback that emits a
|
|
686
|
+
dynamic detection when the payload contains a shell-out API token but
|
|
687
|
+
no shell-out regex extracted a clean payload.
|
|
688
|
+
- chained-interpreter multi-level escape bypass (codex round 5 F4 — P1)
|
|
689
|
+
— pre-fix `python -c "import os; os.system('node -e \"require(\\\"fs
|
|
690
|
+
\\\").rmSync(\\\".rea\\\", ...)\"')"` allowed because each layer
|
|
691
|
+
accumulates a `\\\"` shell-escape level and the per-language path-
|
|
692
|
+
quote regex rejects `(\\"` after the call paren. Fix: when a shell-
|
|
693
|
+
out body itself contains a known interpreter binary head followed by
|
|
694
|
+
an eval flag (`-c`/`-e`/`--eval`/`-pe`/`-ic`), emit a dynamic
|
|
695
|
+
detection (`looksLikeChainedInterpreter`). Refuse on uncertainty.
|
|
696
|
+
- non-string `tool_input.command` (codex round 1 F-31) — refused at
|
|
697
|
+
CLI input parse.
|
|
698
|
+
- absolute / relative-path command head (codex round 2 R2-14) —
|
|
699
|
+
basename-normalized before dispatcher switch; `/usr/bin/bash`,
|
|
700
|
+
`./sed`, `/usr/bin/env bash` all dispatch identically to the
|
|
701
|
+
bare-name form.
|
|
702
|
+
- decoupled-variable interpreter writes (codex round 2 R2-1) —
|
|
703
|
+
flat-scan over the payload: write API + any string-construction
|
|
704
|
+
primitive → dynamic detection.
|
|
705
|
+
- symlink cycles / deep chains (codex round 2 R2-2) — visited-set +
|
|
706
|
+
depth cap (32); cycle/cap returns sentinel that maps to "refuse on
|
|
707
|
+
uncertainty".
|
|
708
|
+
- joined `-t<DIR>` form (codex round 2 R2-4) — cp/mv/install/ln all
|
|
709
|
+
recognize the no-space form.
|
|
710
|
+
- tar `-C DIR`, rsync DEST, curl `-o`, wget `-O`, shred FILE,
|
|
711
|
+
eval payload, git checkout/restore/reset path (codex round 2 R2-7
|
|
712
|
+
through R2-13) — each utility now has a dedicated dispatcher.
|
|
713
|
+
- heredoc-into-shell (codex round 2 R2-12) — `bash <<EOF\n…\nEOF`
|
|
714
|
+
re-parses the heredoc body and walks the inner AST.
|
|
715
|
+
- eval re-parse (codex round 2 R2-13) — argv concat → re-parse →
|
|
716
|
+
walk; refuse on dynamic argv or parse failure.
|
|
717
|
+
- eval ordering with cmdsubst (codex round 3 Finding 1 — P0) —
|
|
718
|
+
`eval $(cmd)` no longer slips through the empty-inner short-circuit;
|
|
719
|
+
any-dynamic-argv emits a dynamic detection.
|
|
720
|
+
- pipe-into-bare-shell (codex round 3 Finding 2 — P1) — `cmd | bash`
|
|
721
|
+
with no `-c` is refuse-on-uncertainty.
|
|
722
|
+
- tar cluster `-xzfC` (codex round 3 Finding 3 — P1) — value-bearing
|
|
723
|
+
cluster chars consume subsequent argv tokens correctly.
|
|
724
|
+
- git top-level value-bearing flags (codex round 3 Finding 4 — P1) —
|
|
725
|
+
`-C`, `-c`, `--git-dir`, `--work-tree`, etc. are walked past before
|
|
726
|
+
identifying the subcommand.
|
|
727
|
+
- python shell-out shapes (codex round 3 Finding 5 — P1) —
|
|
728
|
+
`subprocess.* shell=True` and `subprocess.run(..., stdout=open())`
|
|
729
|
+
re-parse the inner shell.
|
|
730
|
+
- recursive directory delete bypass (codex round 4 Finding 1 — P0) —
|
|
731
|
+
`rm -rf .rea`, `rmdir .rea`, `find .rea -delete`, `shutil.rmtree(...)`,
|
|
732
|
+
`fs.rmSync(..., {recursive:true})`, `FileUtils.rm_rf(...)` etc. all
|
|
733
|
+
flag isDestructive on emit; the matcher's protected-ancestry path
|
|
734
|
+
treats writes against an ancestor directory as hits on every protected
|
|
735
|
+
pattern under it. Structurally closed: `PROTECTED_DIR_ANCESTORS` was
|
|
736
|
+
added to the corpus generator so the cross-product produces directory-
|
|
737
|
+
shaped destructive fixtures, eliminating the structural gap that
|
|
738
|
+
prevented detection.
|
|
739
|
+
- mv source-side bypass (codex round 4 Finding 3 — P1) — `mv` source
|
|
740
|
+
positionals are emitted as destructive writes too (mv removes content
|
|
741
|
+
at the source).
|
|
742
|
+
- find -delete unmodeled (codex round 4 Finding 4 — P1) — seed paths
|
|
743
|
+
are emitted as destructive write targets; with `-name PREDICATE`
|
|
744
|
+
present, the seed is emitted as dynamic+destructive (refuse on
|
|
745
|
+
uncertainty).
|
|
746
|
+
- interpreter shell-out breadth (codex round 4 Finding 5 — P1) — perl
|
|
747
|
+
exec/open-pipe, ruby Open3 / IO.popen, node spawn-with-bash-c, python
|
|
748
|
+
pty.spawn, opaque-spawn APIs (os.spawnv* / os.execv*) all detected
|
|
749
|
+
with re-parse-or-refuse.
|
|
750
|
+
- pathlib & File-class destructive APIs (codex round 4 Finding 6 — P1)
|
|
751
|
+
— Path(...).touch / .unlink / .rmdir / .rename, File.delete /
|
|
752
|
+
File.unlink / File.rename, ruby `open(F, 'w')` (bare), perl unlink /
|
|
753
|
+
rename all caught with isDestructive plumbed through where the API
|
|
754
|
+
semantic is removal.
|
|
755
|
+
- misc utility writes (codex round 4 Finding 7 — P1) — `patch`, `sort
|
|
756
|
+
-o`, `shuf -o`, `gpg --output`, `split <PREFIX>`, `trap "cmd" SIGNAL`
|
|
757
|
+
(re-parse trap command), `git config --file` all detected.
|
|
758
|
+
- procsubst-feeding-bash (codex round 4 Finding 7 — P1) — `bash <(cmd)`,
|
|
759
|
+
`bash 0< <(cmd)`, `bash <<< "cmd"` all emit dynamic detections; the
|
|
760
|
+
shell reads the FIFO/here-string as a script we cannot statically
|
|
761
|
+
resolve.
|
|
762
|
+
- nested-shell DQ-escape parity (codex round 8 P0) — `unshellEscape`
|
|
763
|
+
pre-fix collapsed only `\"` and `\'` from DQ payloads. Bash spec
|
|
764
|
+
collapses 5 DQ-significant escape sequences (`\$`, `` \` ``, `\"`,
|
|
765
|
+
`\\`, `\<newline>`). Pre-fix `bash -c "echo \"\$(rm .rea/HALT)\""`
|
|
766
|
+
was a verified end-to-end exploit: the real shell collapsed `\$` →
|
|
767
|
+
`$`, evaluated the CmdSubst, and deleted HALT, while the walker's
|
|
768
|
+
re-parse saw `\$` as a literal `$` (no CmdSubst child) and ALLOWED.
|
|
769
|
+
Fix: widened `unshellEscape`'s replace class to all five DQ-significant
|
|
770
|
+
escape characters. Class P corpus (≥560 fixtures across 5 shells × 16
|
|
771
|
+
DQ-escape shapes × 7 protected targets, plus 8 negatives) pins the
|
|
772
|
+
closure. Class O contract (round-7) was simultaneously tightened —
|
|
773
|
+
the lenient `|| w.dynamic` acceptance was replaced with opt-in
|
|
774
|
+
`acceptDynamic` per row, so contract-test passes can no longer mask
|
|
775
|
+
walker gaps via unrelated `nested_shell_inner` dynamic emits.
|
|
776
|
+
- wrapper-shell-exec class (codex round 9 F1 + round 10 P1) —
|
|
777
|
+
`<wrapper> <shell> -c PAYLOAD` shape where the wrapper transparently
|
|
778
|
+
forks/execs the next argv as the "real" command (`nice`, `timeout`,
|
|
779
|
+
`chronic`, `parallel`, `watch`, `dbus-launch`, ... and unbounded
|
|
780
|
+
future similar wrappers). Pre-round-9 `stripEnvAndModifiers` ignored
|
|
781
|
+
these wrappers, so the head-dispatch saw the wrapper name and missed
|
|
782
|
+
the inner `<shell> -c PAYLOAD`. Round 9 enumerated 21 wrappers; round
|
|
783
|
+
10 surfaced 5 more — clear evidence the enumeration approach was
|
|
784
|
+
unbounded. Round-10 closure is **structural**: a new
|
|
785
|
+
`detectWrappedNestedShell` pass runs in `walkCallExpr`'s `default:`
|
|
786
|
+
case (head not in dispatcher's allow-list) and detects the bypass
|
|
787
|
+
shape `<UNRECOGNIZED-HEAD> [...flags...] <KNOWN-SHELL> -c PAYLOAD`
|
|
788
|
+
REGARDLESS of wrapper identity. Synthesizes a `[shell, -c, PAYLOAD,
|
|
789
|
+
...]` slice and re-dispatches through `detectNestedShell`. False-
|
|
790
|
+
positive guards (a) skip when head is an introspection / output
|
|
791
|
+
utility (`echo`, `printf`, `man`, `which`, ...) and (b) skip when
|
|
792
|
+
argv[1] is itself an introspection head — covers
|
|
793
|
+
`<wrapper> echo bash` shapes. Three-token lookahead window between
|
|
794
|
+
shell positional and `-c` flag bounds false-positive risk. Bare-
|
|
795
|
+
shell-without-`-c` form refuses on uncertainty (stdin read).
|
|
796
|
+
Closes the bug class — every future unknown wrapper that
|
|
797
|
+
fork/execs a shell is caught without enumeration. Round 10 also
|
|
798
|
+
added explicit enumerations for `chronic`/`dbus-launch`/`watch`/
|
|
799
|
+
`script -c`/`parallel ::: ` for clean dispatch (no
|
|
800
|
+
refuse-on-uncertainty banner). Class S (233 wrapper-extension
|
|
801
|
+
positives + 38 negatives) and Class T (314 synthetic-wrapper
|
|
802
|
+
structural-guard positives + 29 false-positive-guard negatives)
|
|
803
|
+
pin the closure.
|
|
804
|
+
- find-exec placeholder, git history-rewrite seams, archive
|
|
805
|
+
extraction, parallel-stdin, more wrappers, php (codex round 11
|
|
806
|
+
F11-1..F11-7) — seven INDEPENDENT classes against the round-10
|
|
807
|
+
wrapper closure. None were variants of the wrapper family;
|
|
808
|
+
each landed in a different parser seam. (a) `find . -name HALT
|
|
809
|
+
-exec rm {} \;` — `{}` is a placeholder substituted at runtime
|
|
810
|
+
by find against the live filesystem; static analysis cannot
|
|
811
|
+
resolve which paths it expands to. Round-11 fix: synthetic
|
|
812
|
+
`find_exec_placeholder_unresolvable` dynamic detection emitted
|
|
813
|
+
whenever inner argv has `{}` AND the inner head is not in a
|
|
814
|
+
small read-only allow-list (`cat`, `grep`, `head`, `wc`, etc.).
|
|
815
|
+
(b) `git rm -f .rea/HALT` and `git mv .rea/HALT /tmp/x` were
|
|
816
|
+
not in the `TRACKED` subcommand set; round-11 added explicit
|
|
817
|
+
branches with `--cached` carve-out for `git rm`. (c) `git
|
|
818
|
+
filter-branch --tree-filter PAYLOAD` and `git rebase --exec`
|
|
819
|
+
/ `-x` / `git bisect run` / `git commit --template=PATH` were
|
|
820
|
+
re-parse seams where git feeds PAYLOAD through `/bin/sh -c` at
|
|
821
|
+
runtime; round-11 added per-subcommand handlers feeding PAYLOAD
|
|
822
|
+
through `recurseShellPayload` → `parseBashCommand` →
|
|
823
|
+
`walkForWrites` (full top-level walker re-dispatch). (d)
|
|
824
|
+
archive extraction: `tar -xf x.tar -C . .rea/HALT` extracts the
|
|
825
|
+
protected member; `tar -xzf x.tgz` (no `-C`, no member list)
|
|
826
|
+
extracts every member — archive may contain `.rea/HALT`.
|
|
827
|
+
Round-11 fix: `detectTar` extended with extract-mode positional
|
|
828
|
+
harvesting, plus new `detectUnzip`/`detect7z`/`detectGzip`/
|
|
829
|
+
`detectPax` dispatchers. `bsdtar` aliases to `tar`. When -x is
|
|
830
|
+
set with no `-C` AND no explicit members, emit
|
|
831
|
+
`archive_extract_unresolvable`. (e) `echo .rea/HALT | parallel
|
|
832
|
+
rm` reads input from stdin (no `:::`); round-11 added pre-strip
|
|
833
|
+
detection in `walkCallExpr` that emits
|
|
834
|
+
`parallel_stdin_unresolvable` when parallel head has positional
|
|
835
|
+
template tokens AND no `:::`/`::::`/`:::+`/`::::+` separator.
|
|
836
|
+
(f) `fakeroot`/`flock`/`gtimeout`/`unshare`/`env --chdir=`/
|
|
837
|
+
`sudo -s --` were not stripped by `stripEnvAndModifiers`;
|
|
838
|
+
round-11 added each wrapper with appropriate flag arity
|
|
839
|
+
handling, plus shell-mode `sudo -s --` synthesis to `sh -c
|
|
840
|
+
PAYLOAD`. (g) `php -r CODE` was missing from interpreter
|
|
841
|
+
dispatch; round-11 added `PHP_EVAL_FLAGS` (-r, --run) and
|
|
842
|
+
`PHP_WRITE_PATTERNS` covering `unlink`/`file_put_contents`/
|
|
843
|
+
`rename`/`copy`/`fopen` (write modes)/`mkdir`/`rmdir`/`touch`/
|
|
844
|
+
`chmod`/`chown`/`chgrp`/`symlink`/`link`/`move_uploaded_file`.
|
|
845
|
+
Class U (round-11 find/git/archive/parallel — 360 positives +
|
|
846
|
+
17 negatives) and Class V (round-11 wrappers + PHP — 124
|
|
847
|
+
positives + 8 negatives) pin the closure.
|
|
848
|
+
- adjacent-utility / cumulative-parity gaps (codex round 12
|
|
849
|
+
F12-1..F12-9 — nine INDEPENDENT findings against the round-11
|
|
850
|
+
surface). Not variants of any prior round; each landed in
|
|
851
|
+
PHP / archive-create / cmake / mkfifo+mknod / find-write-
|
|
852
|
+
predicate space where round-11 had not applied the cumulative
|
|
853
|
+
discipline established by earlier rounds. (a) F12-1 P0:
|
|
854
|
+
PHP `rename(SRC, DEST)` SOURCE-side blindspot — round-4 F3
|
|
855
|
+
established mv-shape source IS destructive; round-11 bundled
|
|
856
|
+
PHP rename with the destination-only group, so SRC slipped
|
|
857
|
+
past. Round-12 fix: split rename into TWO patterns + add
|
|
858
|
+
`rename(` to DESTRUCTIVE_API_TOKENS. (b) F12-2 P0: PHP
|
|
859
|
+
`rmdir(PATH)` not flagged destructive — bundled with mkdir/
|
|
860
|
+
touch (creates), so protected-ancestry never matched. Round-12
|
|
861
|
+
fix: split rmdir + add `rmdir(` to DESTRUCTIVE_API_TOKENS.
|
|
862
|
+
(c) F12-3 P0: PHP shell-out missing entirely —
|
|
863
|
+
`pickShellOutPatternsFor` had no php_r_path case. Round-12
|
|
864
|
+
fix: new PHP_SHELL_OUT_RE with quote-aware backref body
|
|
865
|
+
extraction covering system / exec / shell_exec / passthru /
|
|
866
|
+
popen / proc_open / backtick. (d) F12-4 P0: PHP -B/-E /
|
|
867
|
+
--process-begin / --process-end accept CODE same as -r;
|
|
868
|
+
round-11 PHP_EVAL_FLAGS only had -r/--run. Round-12 fix:
|
|
869
|
+
extend exactLong + shortChars (case-sensitive uppercase).
|
|
870
|
+
(e) F12-5 P0: archive CREATE direction missing — only EXTRACT
|
|
871
|
+
was checked. `tar -cf .rea/policy.yaml docs/`, `zip
|
|
872
|
+
.rea/policy.yaml docs/file`, `7z a .rea/policy.yaml docs/`
|
|
873
|
+
all silently overwrote the OUTPUT archive at the protected
|
|
874
|
+
path. Round-12 fix: detectTar gains isCreateOrAppend pass +
|
|
875
|
+
-f/-cf/--file emit; detect7z gains a/u/d compress branch;
|
|
876
|
+
new detectZip dispatcher (zip OUTPUT.zip [files...]).
|
|
877
|
+
(f) F12-6 P1: cmake -E utility surface — rm/remove/rename/
|
|
878
|
+
copy/copy_if_different/copy_directory/touch/remove_directory/
|
|
879
|
+
create_symlink/create_hardlink/make_directory all slipped past
|
|
880
|
+
pre-fix. Round-12 fix: new detectCmake with per-subcommand
|
|
881
|
+
argv shapes (cp-shape, mv-shape, variadic, second-positional).
|
|
882
|
+
(g) F12-7 P1: mkfifo / mknod create special files at protected
|
|
883
|
+
paths; no dispatchers existed. Round-12 fix: new detectMkfifo
|
|
884
|
+
(variadic) and detectMknod (NAME is first bare positional).
|
|
885
|
+
(h) F12-8 P1: find write-predicates -fls / -fprint / -fprintf
|
|
886
|
+
not in detectFind. Round-12 fix: scan for these predicates
|
|
887
|
+
and emit FILE as destructive write target (-fprintf consumes
|
|
888
|
+
TWO args). (i) F12-9 P2 false-positive regression: detectUnzip
|
|
889
|
+
emitted dynamic unresolvable for read-only flags `-p` (extract
|
|
890
|
+
to stdout), `-l/-Z/-t/-v/-z` (list/test/verbose/comment).
|
|
891
|
+
Round-12 fix: early-return ALLOW from detectUnzip when any
|
|
892
|
+
read-only flag (or cluster char) present. Class W (round-12
|
|
893
|
+
closures — 173 positives + 18 negatives) pins the closure.
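
To illustrate item (h) above, the arity handling for the `find` write-predicates can be sketched as follows. This is a minimal sketch, not the package's `detectFind`: the names `FIND_WRITE_PREDICATE_ARITY` and `findWriteTargets` are hypothetical, and it only assumes what the text states, namely that `-fls` and `-fprint` take one argument (FILE) while `-fprintf` takes two (FILE, then FORMAT).

```typescript
// Hypothetical sketch; names are illustrative, not the package's API.
// Number of arguments each write-predicate consumes. FILE is always the first.
const FIND_WRITE_PREDICATE_ARITY = new Map<string, number>([
  ["-fls", 1],
  ["-fprint", 1],
  ["-fprintf", 2], // FILE followed by FORMAT
]);

// Returns every FILE operand that a find write-predicate would write to.
function findWriteTargets(argv: readonly string[]): string[] {
  const targets: string[] = [];
  for (let i = 0; i < argv.length; i++) {
    const token = argv[i] ?? "";
    const arity = FIND_WRITE_PREDICATE_ARITY.get(token);
    if (arity === undefined) continue;
    const file = argv[i + 1];
    if (file !== undefined) targets.push(file);
    // Skip the consumed operands so a FORMAT string is never
    // mistaken for another predicate.
    i += arity;
  }
  return targets;
}
```

Skipping by arity matters for `-fprintf`: without it, a format operand that happens to look like a predicate would be re-scanned as one.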
|
|
894
|
+
|
|
895
|
+
### 8.3 Bypass classes still possible
|
|
896
|
+
|
|
897
|
+
- **`@bookedsolid/rea` package-tier supply-chain compromise** (codex
|
|
898
|
+
round 5 F5 — P1/P3 acknowledged residual). The bash-tier shim's
|
|
899
|
+
CLI-resolution sandbox check (codex round 4 #2 + round 5 F2)
|
|
900
|
+
defeats node_modules-symlink-out and workspace-bin hijack. It does
|
|
901
|
+
NOT defeat an attacker who can write a forged
|
|
902
|
+
`node_modules/@bookedsolid/rea/dist/cli/index.js` *and* a matching
|
|
903
|
+
`node_modules/@bookedsolid/rea/package.json` with `name ===
|
|
904
|
+
"@bookedsolid/rea"` directly inside the project's `node_modules/`.
|
|
905
|
+
Such an attacker has already compromised the package install
|
|
906
|
+
pipeline (e.g. via a compromised lockfile / dependency-confusion
|
|
907
|
+
attack / npm registry compromise). At that level the attacker can
|
|
908
|
+
also forge any other dependency the agent uses, so hook-tier defense
|
|
909
|
+
is moot; the trust boundary is the `npm install` provenance check
|
|
910
|
+
(npm provenance + manifest verification) rather than the bash gate.
|
|
911
|
+
Hardening (opt-in): operators may set `policy.review.cli_sha256: <hex>`
|
|
912
|
+
in `.rea/policy.yaml` and the shim will refuse if the resolved CLI's
|
|
913
|
+
SHA-256 does not match. Defaults to unset; documented as an opt-in
|
|
914
|
+
belt-and-braces measure rather than a structural defense (see
|
|
915
|
+
`docs/architecture/bash-scanner.md` for the full rationale).
|
|
916
|
+
- **`git checkout REVISION PATH` and `git restore --source=REVISION
|
|
917
|
+
PATH` without the POSIX `--` argv separator** (codex round 9 F3 —
|
|
918
|
+
accepted false negative since 0.22.0). When an attacker invokes
|
|
919
|
+
`git checkout main .rea/HALT` (or `git restore --source=HEAD~1
|
|
920
|
+
.rea/policy.yaml`), git's pre-`--` argv shape is structurally
|
|
921
|
+
ambiguous between "REVISION PATH" and "PATH...". The `detectGit`
|
|
922
|
+
walker conservatively treats positionals after the subcommand as
|
|
923
|
+
destructive-overwrite targets ONLY when `--` is present, because
|
|
924
|
+
disambiguation requires a runtime ref-existence check that the
|
|
925
|
+
static walker cannot perform without filesystem I/O. Mitigation:
|
|
926
|
+
the kill-switch invariants for protected files (`.rea/HALT`,
|
|
927
|
+
`.rea/policy.yaml`, `.claude/settings.json`, etc.) are still caught
|
|
928
|
+
by the symlink-resolution layer in `protected-paths-bash-gate.sh`
|
|
929
|
+
at file-write time when git actually opens the destination for
|
|
930
|
+
write — the bash-scanner's static layer is one of multiple
|
|
931
|
+
enforcement points. Pin: 0.24.0 milestone for a comprehensive fix
|
|
932
|
+
(likely a conservative refusal whenever the first positional
|
|
933
|
+
matches a known git-revspec shape AND following positionals exist).
|
|
934
|
+
- glob expansion in argv-based commands (`chmod +x bin/*.sh` on a
|
|
935
|
+
filesystem where `bin/.rea` exists). Glob detection is scoped to
|
|
936
|
+
redirect-form targets only because argv-globs in legitimate code
|
|
937
|
+
are common. Future: enumerate filesystem-level glob matches.
|
|
938
|
+
- awk `-f script-file` body. We currently emit a dynamic detection
|
|
939
|
+
(refuse on uncertainty). Future: read + scan the file.
|
|
940
|
+
- semantic obfuscation via `${!ind}` indirect expansion, `read -p`
|
|
941
|
+
prompts, computed-property attacks in interpreter payloads that
|
|
942
|
+
evade the flat-scan heuristic. These are parse-correct and the
|
|
943
|
+
detector is best-effort.
|
|
944
|
+
- WASM `sh-syntax` migration risk: parser bugs in the new library
|
|
945
|
+
could reopen bypass classes. Mitigated by the corpus fixture
|
|
946
|
+
suite (every closed bypass replays as a positive regression
|
|
947
|
+
test).
|
|
948
|
+
- **Round-13 deferred to 0.24.0** (utility-enumeration completeness
|
|
949
|
+
— same shape as round 12; convergence at this tier is asymptotic
|
|
950
|
+
per codex's explicit assessment):
|
|
951
|
+
1. **PHP indirect-callable shell-out** —
|
|
952
|
+
`array_map("system", [...])`, `call_user_func("system", ...)`,
|
|
953
|
+
`eval("system(...);")`, variable-bound callable
|
|
954
|
+
`$f = "system"; $f(...)`. Round 12 closed direct calls;
|
|
955
|
+
indirect callable forms remain.
|
|
956
|
+
2. **vim/emacs editor exec re-parse seam** — `vim -c "!cmd"`,
|
|
957
|
+
`vim -c "w PATH"`, `vim -c "source PATH"`, `emacs --eval
|
|
958
|
+
"(shell-command \"...\")"`, `emacs --eval "(delete-file
|
|
959
|
+
\"...\")"`. No vim/emacs dispatcher in 0.23.0.
|
|
960
|
+
3. **Standalone compression utilities** — `xz -o FILE`,
|
|
961
|
+
`zstd -o FILE`, `lz4 INPUT FILE`, `lzma -o FILE`. Round 12
|
|
962
|
+
closed tar/zip/7z/cpio/pax create-direction; standalone
|
|
963
|
+
compression family is adjacent and not yet covered.
|
|
964
|
+
4. **Image-builder utilities** — `mksquashfs INPUT OUTPUT`,
|
|
965
|
+
`genisoimage -o OUTPUT`, `mkisofs -o OUTPUT`, `xorriso
|
|
966
|
+
-outdev OUTPUT`. CI/build pipelines use these.
|
|
967
|
+
5. **`7zz` Linux-canonical 7zip alias** — `detect7z` only fires
|
|
968
|
+
on `7z`; the official Linux package binary is `7zz`.
|
|
969
|
+
6. **Python argv-form subprocess** — `subprocess.Popen(['rm',
|
|
970
|
+
'...'])`, `subprocess.run(['rm', '...'])` etc. without
|
|
971
|
+
`shell=True`. Round 3 F5 closed `shell=True` form;
|
|
972
|
+
argv-list form is still allowed.
|
|
973
|
+
7. **vim ex-mode literal-path write** (P2) — `vim -c "w PATH"`
|
|
974
|
+
writes to the literal PATH given in argv; a regex-level miss
|
|
975
|
+
independent of the broader `-c` re-parse seam.
|
|
976
|
+
- **Denylist scanner is structurally limited** (acknowledged by
|
|
977
|
+
codex round 4 and reaffirmed by codex round 13: "convergence on
|
|
978
|
+
enumeration completeness is asymptotic — every round will probably
|
|
979
|
+
find more"). A denylist enumerates the destructive shapes the
|
|
980
|
+
scanner knows. Novel shapes (interpreters or utilities not yet
|
|
981
|
+
modeled, language constructs we haven't seen) can in principle
|
|
982
|
+
bypass until added. Defense in depth:
|
|
983
|
+
1. mvdan-sh AST parsing eliminates an entire class of regex/
|
|
984
|
+
segmenter mistakes.
|
|
985
|
+
2. Comprehensive walker dispatchers per known destructive utility +
|
|
986
|
+
per known shell-out + per known interpreter API.
|
|
987
|
+
3. Adversarial corpus generators span the parameter cross-product
|
|
988
|
+
so generators produce shapes Codex hasn't visited.
|
|
989
|
+
4. Per-round Codex review surfaces gaps before release; the
|
|
990
|
+
convergence ladder
|
|
991
|
+
(round 1 → round 2 → round 3 → round 4 → round 5 ...) is the
|
|
992
|
+
audit trail.
|
|
993
|
+
5. Fail-closed defaults: dynamic targets always block.
|
|
994
|
+
An allowlist scanner ("only known-safe commands pass") would close
|
|
995
|
+
this class structurally but is incompatible with the rea use case
|
|
996
|
+
(agentic workflows need arbitrary bash access).
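
The opt-in `policy.review.cli_sha256` hardening from the first bullet above amounts to a digest comparison. The following is a minimal sketch in TypeScript under stated assumptions: the real check lives in the shim, the function names here are hypothetical, and treating the pin as case-insensitive hex is an assumption rather than documented behavior.

```typescript
import { createHash } from "node:crypto";
import { readFileSync } from "node:fs";

// SHA-256 of the given bytes as lowercase hex.
function sha256Hex(data: string | Uint8Array): string {
  return createHash("sha256").update(data).digest("hex");
}

// Hypothetical helper: true when the CLI bytes match the pinned digest.
// An unset pin skips the check, mirroring the "defaults to unset" behavior
// described above.
function cliMatchesPin(cliBytes: string | Uint8Array, pinnedHex: string | undefined): boolean {
  if (pinnedHex === undefined) return true;
  // Assumption: the pin is hex and compared case-insensitively.
  return sha256Hex(cliBytes) === pinnedHex.trim().toLowerCase();
}

// Hypothetical usage: read the resolved CLI from disk and compare.
function verifyCliFile(cliPath: string, pinnedHex: string | undefined): boolean {
  return cliMatchesPin(readFileSync(cliPath), pinnedHex);
}
```

As the bullet notes, this only helps when the pin itself is out of the attacker's reach; an attacker who can rewrite `.rea/policy.yaml` can rewrite the pin too.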
|
|
997
|
+
|
|
998
|
+
### 8.4 Test surface
|
|
999
|
+
|
|
1000
|
+
The fixture corpus at
|
|
1001
|
+
`__tests__/hooks/bash-tier-corpus.test.ts` (≥185 entries) and
|
|
1002
|
+
`__tests__/hooks/bash-tier-corpus-round2.test.ts` (≥186 entries,
|
|
1003
|
+
codex round 2 bypass-class fixtures) locks every documented bypass
|
|
1004
|
+
class as a regression-positive test.
|
|
1005
|
+
The walker unit tests at `__tests__/hooks/bash-scanner/walker.test.ts`
|
|
1006
|
+
pin the parser-emitted RedirOperator codes (codex round 1 F-33) so a
|
|
1007
|
+
parser-library bump that re-numbers them fails LOUDLY. The verdict-
|
|
1008
|
+
shape snapshot at `__tests__/hooks/bash-scanner/verdict-shape.test.ts`
|
|
1009
|
+
locks the wire format for the bash shim consumers.
|
|
1010
|
+
|
|
1011
|
+
The cross-product corpus at
|
|
1012
|
+
`__tests__/hooks/bash-scanner/adversarial-corpus.test.ts` runs ≥7700
|
|
1013
|
+
fixtures across 18 classes (A–P plus extensions). Coverage assertion:
|
|
1014
|
+
≥3000 positive (must-block) and ≥1000 negative (must-allow) fixtures.
|
|
1015
|
+
|
|
1016
|
+
The Class O exhaustiveness contract test at
|
|
1017
|
+
`__tests__/hooks/bash-scanner/walker-exhaustiveness.contract.test.ts`
|
|
1018
|
+
pins the walker reach across every Word-bearing AST position
|
|
1019
|
+
mvdan-sh's parser populates. Round-8 tightened the acceptance to
|
|
1020
|
+
path-explicit-by-default; opt-in `acceptDynamic` per row is the only
|
|
1021
|
+
way to accept a `dynamic: true` write as proof-of-reach.
|
|
1022
|
+
|
|
1023
|
+
The bash-shim subprocess sampling at
|
|
1024
|
+
`adversarial-corpus.test.ts > "bash shim subprocess sampling"`
|
|
1025
|
+
spawns the actual hook script under a clean env across 100
|
|
1026
|
+
deterministically-sampled fixtures, parses verdict JSON, and
|
|
1027
|
+
cross-checks against in-process scan. Catches drift between the
|
|
1028
|
+
in-process verdict and what the shim's JSON verifier + 4-tier
|
|
1029
|
+
resolver chain actually returns.
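
The "deterministically-sampled" property can be sketched with a seeded PRNG and a partial Fisher-Yates shuffle. This is an illustration only: the source does not say how the test achieves determinism, so the `mulberry32` generator, the seed parameter, and the function name `sampleDeterministic` are all assumptions.

```typescript
// mulberry32: a small seeded PRNG returning floats in [0, 1).
// Chosen here for brevity; the real test may use something else.
function mulberry32(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Picks `count` distinct items; the same seed always yields the same sample.
function sampleDeterministic<T>(items: readonly T[], count: number, seed: number): T[] {
  const pool = items.slice();
  const rand = mulberry32(seed);
  const n = Math.min(count, pool.length);
  for (let i = 0; i < n; i++) {
    // Partial Fisher-Yates: swap position i with a random later position.
    const j = i + Math.floor(rand() * (pool.length - i));
    const tmp = pool[i] as T;
    pool[i] = pool[j] as T;
    pool[j] = tmp;
  }
  return pool.slice(0, n);
}
```

A fixed seed keeps a failing sample reproducible across runs, which is what makes a drift report from the subprocess cross-check actionable.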
|
|
1030
|
+
|
|
1031
|
+
### 8.5 Defense in depth
|
|
1032
|
+
|
|
1033
|
+
The bash gate is one layer. The full defensive stack:
|
|
1034
|
+
|
|
1035
|
+
1. **Parser AST** (`mvdan-sh@0.10.1`) — eliminates regex/segmenter
|
|
1036
|
+
tokenization mistakes.
|
|
1037
|
+
2. **Walker** (`syntax.Walk`-based deny-by-default traversal +
|
|
1038
|
+
`recurseParamExpSlice` for Walk gaps) — visits every node type;
|
|
1039
|
+
no Cmd-kind branch can silently drop a field.
|
|
1040
|
+
3. **Per-utility dispatchers** — comprehensive coverage of cp, mv,
|
|
1041
|
+
sed, dd, tee, install, ln, awk, ed, ex, find, xargs, node,
|
|
1042
|
+
python, ruby, perl, tar, rsync, curl, wget, shred, eval, git,
|
|
1043
|
+
patch, sort, shuf, gpg, split, trap, bash/sh/zsh/dash/ksh.
|
|
1044
|
+
4. **Interpreter scanners** — write-API tokens for node fs, python
|
|
1045
|
+
os/shutil/subprocess, ruby Pathname/FileUtils, perl unlink/rename;
|
|
1046
|
+
shell-out re-parse for `system`/`subprocess.run shell=True`/`qx`/
|
|
1047
|
+
backticks.
|
|
1048
|
+
5. **DQ-escape parity** — `unshellEscape` collapses all 5 DQ-significant
|
|
1049
|
+
escapes (round-8) so the re-parser sees the same syntax tree as bash.
|
|
1050
|
+
6. **Symlink resolver** — visited-set + depth cap (32); refuses on
|
|
1051
|
+
cycle/cap; macOS `/var` ↔ `/private/var` aliasing handled by
|
|
1052
|
+
`rea_resolved_relative_form` (helix-021).
|
|
1053
|
+
7. **2-tier sandboxed CLI resolver** — only
|
|
1054
|
+
`node_modules/@bookedsolid/rea/dist/cli/index.js` and
|
|
1055
|
+
`dist/cli/index.js` accepted; `realpath` containment check
|
|
1056
|
+
refuses any escape from `CLAUDE_PROJECT_DIR`.
|
|
1057
|
+
8. **Verdict JSON verifier** — shim re-parses CLI output via
|
|
1058
|
+
`node -e` and cross-checks exit code ↔ verdict (round-1 F-3).
|
|
1059
|
+
9. **Cross-product corpus** (Classes A–P) — ≥7700 fixtures span the
|
|
1060
|
+
parameter space so generators produce shapes the round-by-round
|
|
1061
|
+
manual review hadn't visited.
|
|
1062
|
+
10. **Class O exhaustiveness contract** — pins every Word-bearing
|
|
1063
|
+
AST position so mvdan-sh upgrades cannot silently introduce
|
|
1064
|
+
Walk-skip bypasses.
|
|
1065
|
+
11. **Codex adversarial review** — every release goes through
|
|
1066
|
+
`/codex-review` before merge; convergence ladder is the audit
|
|
1067
|
+
trail. 0.23.0 round count: 8 (and counting).
|
|
1068
|
+
12. **Middleware audit log** — every tool invocation is hash-chained
|
|
1069
|
+
in `.rea/audit.jsonl` (append-only, tamper-evident).
|
|
1070
|
+
13. **Codex push-gate** (0.11.0+) — pre-push stateless review by
|
|
1071
|
+
GPT-5.4 (codex-auto-review) catches semantic concerns the static
|
|
1072
|
+
scanner cannot reason about.
|
|
1073
|
+
14. **Husky 9 hook chain** — `commit-msg`, `pre-push`, `pre-commit`
|
|
1074
|
+
register every hook in the package; consumers can extend via
|
|
1075
|
+
`.husky/{commit-msg,pre-push,pre-commit}.d/*` (helix-018 Option B).
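
The symlink resolver in item 6 combines a visited-set with a depth cap of 32. A minimal sketch of that refusal logic follows; the real resolver is part of the bash gate, so this TypeScript version, its names, and the injected `readLink` callback (which stands in for the filesystem) are illustrative assumptions.

```typescript
const MAX_SYMLINK_DEPTH = 32;

type ResolveResult =
  | { ok: true; path: string }
  | { ok: false; reason: "cycle" | "depth-cap" };

// Follows links until a non-link is reached. `readLink` returns the link
// target, or undefined when the path is not a symlink.
function resolveLinks(
  start: string,
  readLink: (p: string) => string | undefined,
): ResolveResult {
  const visited = new Set<string>();
  let current = start;
  for (let depth = 0; depth <= MAX_SYMLINK_DEPTH; depth++) {
    // Seeing the same path twice means the chain loops: refuse.
    if (visited.has(current)) return { ok: false, reason: "cycle" };
    visited.add(current);
    const next = readLink(current);
    if (next === undefined) return { ok: true, path: current };
    current = next;
  }
  // More than MAX_SYMLINK_DEPTH hops without terminating: refuse.
  return { ok: false, reason: "depth-cap" };
}
```

Both failure modes return a refusal rather than a best-guess path, which matches the fail-closed posture described elsewhere in this section.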
|
|
1076
|
+
|
|
1077
|
+
A bypass requires defeating multiple layers simultaneously. The
|
|
1078
|
+
trust boundary between this stack and the rest of the system is
|
|
1079
|
+
package-tier integrity (npm provenance + manifest verification).
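
The tamper-evidence claim in item 12 rests on hash chaining: each record's hash covers the previous record's hash, so editing any record invalidates it and everything after it. The sketch below shows the idea only; the record shape, the all-zero genesis value, and the function names are hypothetical and are not taken from the package's audit module.

```typescript
import { createHash } from "node:crypto";

// Hypothetical record shape, for illustration only.
interface ChainedRecord {
  prevHash: string;
  payload: string;
  hash: string;
}

// Assumed genesis value for the first record's prevHash.
const GENESIS = "0".repeat(64);

function recordHash(prevHash: string, payload: string): string {
  return createHash("sha256").update(prevHash).update("\n").update(payload).digest("hex");
}

// Returns a new log with one record appended and linked to the previous one.
function appendRecord(log: readonly ChainedRecord[], payload: string): ChainedRecord[] {
  const last = log[log.length - 1];
  const prevHash = last ? last.hash : GENESIS;
  return [...log, { prevHash, payload, hash: recordHash(prevHash, payload) }];
}

// Returns the index of the first record that breaks the chain, or -1 if intact.
function verifyChain(log: readonly ChainedRecord[]): number {
  let expectedPrev = GENESIS;
  for (let i = 0; i < log.length; i++) {
    const rec = log[i] as ChainedRecord;
    if (rec.prevHash !== expectedPrev) return i;
    if (rec.hash !== recordHash(rec.prevHash, rec.payload)) return i;
    expectedPrev = rec.hash;
  }
  return -1;
}
```

Chaining makes edits detectable, not impossible: an attacker who can rewrite the whole file can recompute every hash, so the check is only as strong as the protection on the latest hash.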
|
|
1080
|
+
|
|
1081
|
+
### 8.6 Out of scope
|
|
1082
|
+
|
|
1083
|
+
The bash gate explicitly does NOT defend against:
|
|
1084
|
+
|
|
1085
|
+
- **Kernel-level / LD_PRELOAD / ptrace attacks.** The OS kernel and
|
|
1086
|
+
loader are trusted. An attacker with code execution at that tier
|
|
1087
|
+
bypasses every userland defense.
|
|
1088
|
+
- **Network-tier MITM during package install.** rea trusts the npm
|
|
1089
|
+
registry (with provenance verification) and the system's TLS root
|
|
1090
|
+
store. A compromised TLS chain at `npm install` time gives the
|
|
1091
|
+
attacker the same authority as the package itself.
|
|
1092
|
+
- **Supply-chain compromise of `@bookedsolid/rea` on npm.** A signed
|
|
1093
|
+
release with malicious code defeats the gate because the gate IS
|
|
1094
|
+
the malicious code. Mitigation is npm provenance + manifest
|
|
1095
|
+
verification + (opt-in) `policy.review.cli_sha256`. See §8.3.
|
|
1096
|
+
- **Out-of-band file modifications.** rea gates Bash tool calls and
|
|
1097
|
+
Write/Edit/MultiEdit tool calls (Write-tier hooks). Filesystem changes initiated
|
|
1098
|
+
outside the harness (user editing files directly, language server
|
|
1099
|
+
edits, other processes) are not gated.
|
|
1100
|
+
- **Read-side policy leaks.** The bash gate concerns WRITES. Reading
|
|
1101
|
+
`.rea/policy.yaml` is allowed by default — the policy is checked-in
|
|
1102
|
+
and visible. `env-file-protection.sh` handles `.env*` reads at the
|
|
1103
|
+
Write tier; bash-tier coverage of `.env*` reads is via
|
|
1104
|
+
`dependency-audit-gate.sh` and the segmenter for those forms.
|
|
1105
|
+
- **Attacker-controlled PATH at scanner runtime.** If `rea` resolves
|
|
1106
|
+
to an attacker binary on PATH, the gate is defeated. Production
|
|
1107
|
+
deployments pin PATH via the harness; `rea doctor` verifies PATH
|
|
1108
|
+
integrity at install time but does not enforce it at runtime.
|
package/dist/audit/append.js
CHANGED
|
@@ -35,7 +35,7 @@
|
|
|
35
35
|
import fs from 'node:fs/promises';
|
|
36
36
|
import path from 'node:path';
|
|
37
37
|
import { Tier, InvocationStatus } from '../policy/types.js';
|
|
38
|
-
import { GENESIS_HASH, computeHash, fsyncFile, readLastRecord, withAuditLock
|
|
38
|
+
import { GENESIS_HASH, computeHash, fsyncFile, readLastRecord, withAuditLock } from './fs.js';
|
|
39
39
|
import { maybeRotate } from '../gateway/audit/rotator.js';
|
|
40
40
|
const REA_DIR = '.rea';
|
|
41
41
|
const AUDIT_FILE = 'audit.jsonl';
|