@bookedsolid/rea 0.23.0 → 0.24.0

@@ -234,8 +234,1036 @@ export function walkForWrites(file) {
          // means no detections, which the compositor pairs with parse-fail
          // sentinels at a higher tier).
      }
+     // helix-024 F1: cd-into-protected + relative-write bypass closure.
+     // The static AST scanner doesn't model process-cwd. A command like
+     // `cd .rea && echo > HALT` emits ONLY a `redirect: HALT` detection
+     // from extractStmtRedirects — the scanner normalizes `HALT` against
+     // REA_ROOT and gets `HALT` (project-root-relative), which doesn't
+     // match `.rea/HALT` in the protected list. The kill switch is
+     // bypassed because the cd is invisible to the scanner.
+     //
+     // Closure: post-walker pass emits a synthetic detection at each
+     // `cd <DIR>` / `pushd <DIR>` / `cd $VAR` site WHEN a bare-relative
+     // write is reachable from that site (see the refined predicate on
+     // detectCwdChangeIntoProtected). The scanner runs the protected-prefix
+     // test on the cd target with `forceDirSemantics: true` — if DIR
+     // matches a protected dir-prefix (`.rea`, `.husky`, `.claude`,
+     // `.github/workflows`), refuse on uncertainty regardless of whether
+     // the relative write tokens themselves match.
+     //
+     // This pass must run AFTER the main walk so it can observe the full
+     // detection set. The helper detectCwdChangeIntoProtected scans the
+     // AST again for cd/pushd CallExprs and decides per-site whether to
+     // emit (the path-shape decision is the scanner's, not ours — we
+     // emit conservatively).
+     detectCwdChangeIntoProtected(file, out);
      return out;
  }
+ /**
+  * helix-024 F1 closure — detect `cd <DIR>` / `pushd <DIR>` (and `cd
+  * $VAR` / `pushd $VAR` dynamic forms) and emit synthetic detections
+  * ONLY when a bare-relative-path write is reachable in execution-order
+  * from the cd within the same lexical scope. The scanner consumes these
+  * as a refuse-on-uncertainty signal.
+  *
+  * Round-14 codex P1 refinement (over-correction fix). The first
+  * iteration of this pass used a coarse global predicate: "if AST has
+  * any write, emit on every cd". That over-blocked common idioms:
+  *
+  *   - `cd "$HOME" && echo > log` (known-safe source)
+  *   - `cd "$REPO_ROOT" && echo > /tmp/log` (write is absolute)
+  *   - `cd "$(pwd)" && echo > log` (cmdsubst safe source)
+  *   - `cd "$(git rev-parse --show-toplevel)" && pnpm test > out.log`
+  *   - `for d in src test; do cd "$d" && echo x > out; done`
+  *   - `cd /tmp && echo > log; cd .rea && cat HALT` (cross-scope read-only)
+  *
+  * The refined predicate is:
+  *   1. **In-scope**: writes must be reachable from the cd in execution
+  *      order within the same lexical scope. Specifically:
+  *        - Sequential successor Stmts in the same StmtList container.
+  *        - For a cd in BinaryCmd.X, the entire BinaryCmd.Y subtree.
+  *        - For a cd inside a Subshell/Block/Function-body, all
+  *          subsequent stmts within that container.
+  *        - Recurses into nested compound stmts of in-scope successors
+  *          (a write inside an `if` after a cd in the same scope counts).
+  *      Writes from sibling stmts BEFORE the cd, or in unrelated scopes
+  *      (e.g., a separate top-level Stmt earlier or later that itself
+  *      has its own cd), do NOT taint this cd's emit decision.
+  *   2. **Bare-relative target**: only writes whose static path-shape is
+  *      bare-relative (no leading `/`, no `~/`, not the outside-root
+  *      sentinel) actually move when cwd changes. Absolute writes
+  *      (`/tmp/log`, `/dev/null`, `~/file`) are unaffected by the cd, so
+  *      they do not bootstrap an emit.
+  *   3. **Known-safe dynamic source**: a dynamic cd whose entire target
+  *      Word is a known-safe expansion (`$HOME`, `$PWD`, `$OLDPWD`,
+  *      `$(pwd)`, `$(git rev-parse --show-toplevel)`,
+  *      `$(git rev-parse --show-cdup)`, or a for-loop iter variable whose
+  *      Items list is all literal non-protected paths) is treated as ALLOW
+  *      even with bare-relative writes in scope. The runtime cwd cannot
+  *      land in .rea/.husky/.claude/.github/workflows/ from those sources.
+  *   4. **Conservative dynamic fallback**: any other dynamic cd target
+  *      (`$P`, `$REPO_ROOT`, arbitrary `$(...)`) WITH bare-relative
+  *      writes in scope emits `cwd_dynamic_with_writes_unresolvable` so
+  *      the scanner refuses on uncertainty. Without bare-relative writes
+  *      in scope the dynamic cd is a no-op for kill-switch purposes —
+  *      it cannot exfiltrate via writes that don't exist.
+  *
+  * Two emit shapes (preserved from round 1):
+  *   - `cwd_protected_unresolvable` — cd target is a literal path.
+  *     Scanner runs the protected-prefix test (with dir semantics).
+  *     Match → BLOCK; non-match (or outside-root) → ALLOW.
+  *   - `cwd_dynamic_with_writes_unresolvable` — dynamic cd target whose
+  *     source is not known-safe AND a bare-relative write in scope.
+  *     Scanner refuses on uncertainty.
+  *
+  * Pre-existing detections from the first walker pass are unaffected —
+  * the cd synthetic is additive. A redirect to a protected path is
+  * still flagged on its own merits even if its enclosing cd is safe.
+  *
+  * Accepted false-negatives (deferred to 0.24.0+):
+  *   - `cd $(echo .rea) && echo > HALT` — cmdsubst-resolved literal.
+  *     The cd source isn't in the known-safe set so the dynamic path
+  *     fires only if there's a bare-relative write in scope; this case
+  *     would fire correctly. But `cd $(printf %s .rea) && cat HALT`
+  *     (read-only) does NOT fire because no bare-relative write exists.
+  *   - `alias evil="cd .rea && echo > HALT"; evil` — alias-then-invoke
+  *     requires modeling shell aliases at AST time, out of scope here.
+  *
+  * Performance: O(N) over AST nodes per scope walk, allocation-light.
+  * The recursion bound is the parser's own AST depth (mvdan-sh enforces
+  * its own depth caps upstream).
+  */
+ function detectCwdChangeIntoProtected(file, out) {
+     // Nothing-to-do shortcut: if there is literally no Stmt in the
+     // file, there is no cd to find and no write to taint.
+     const topStmts = asArray(file['Stmts']);
+     if (topStmts.length === 0)
+         return;
+     // Recursively walk the AST visiting each lexical scope (StmtList).
+     // For each cd/pushd CallExpr, decide whether to emit a synthetic
+     // detection by:
+     //   - Collecting in-scope successor writes (per the predicate above).
+     //   - Filtering by bare-relative path shape.
+     //   - Classifying the cd target source.
+     //
+     // `safeForIterVars` carries the scope's set of for-iter variable
+     // names whose Items list is all literal non-protected paths. The set
+     // is inherited into nested scopes (closure-capture style) and
+     // extended on entry to a qualifying ForClause body.
+     walkScopeForCwd(topStmts, new Set(), out);
+ }
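Rule 2 of the predicate documented above (bare-relative targets) comes down to a path-shape check. The following is a minimal sketch, assuming the write target is available as a plain string. `isBareRelativePath` is a hypothetical name, not the package's `isBareRelativeWrite` (whose body is not part of this hunk), and the outside-root sentinel mentioned in the comment is left out because its representation is not shown here.

```javascript
// Hypothetical stand-in for the path-shape test described in rule 2.
// Returns true only for paths that would resolve differently after a cd.
function isBareRelativePath(p) {
    if (typeof p !== 'string' || p.length === 0)
        return false;
    // Absolute paths (`/tmp/log`, `/dev/null`) ignore the current directory.
    if (p.startsWith('/'))
        return false;
    // Home-anchored paths (`~`, `~/file`) are resolved by tilde expansion.
    if (p === '~' || p.startsWith('~/'))
        return false;
    return true;
}
```

Under this sketch `HALT` and `sub/out.log` count as bare-relative, while `/tmp/log` and `~/file` do not, which mirrors the examples given in the doc comment.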
+ /**
+  * Walk a list of sibling Stmts (one lexical scope), emitting cd
+  * synthetic detections per the refined predicate.
+  *
+  * Round-17 P1: `extraDownstream` carries additional AST regions that
+  * are reachable in execution order from this scope's cd-sites *beyond*
+  * later siblings. It is appended to every cd-site's downstream when
+  * descending into the Cond / Body of If/While/Until clauses, so that:
+  *   - `if cd .rea; then echo > HALT; fi` — the cond's cd sees the
+  *     then-body as downstream (cwd persists into the body).
+  *   - `if cd .rea; then :; fi; echo > HALT` — the cond's cd sees the
+  *     post-conditional sibling as downstream (cwd persists past the
+  *     conditional in bash semantics).
+  * For nested-scope descent from `descendIntoNestedScopes`, we carry the
+  * containing Stmt's own post-stmt siblings as extraDownstream so cds
+  * found inside blocks/if/while/etc. see them too. Subshell bodies are
+  * the exception: descendCmdScopes drops the carriers there because a
+  * subshell's cwd change does not escape.
+  */
+ function walkScopeForCwd(stmts, safeForIterVars, out, extraDownstream = []) {
+     for (let i = 0; i < stmts.length; i += 1) {
+         const stmt = stmts[i];
+         if (!stmt || typeof stmt !== 'object')
+             continue;
+         // For each Stmt in this scope, consider every cd/pushd CallExpr
+         // that lives inside it (including via BinaryCmd.X). The "in-scope
+         // successor" set is:
+         //   - For a cd in this Stmt's BinaryCmd.X: that BinaryCmd's Y subtree.
+         //   - PLUS: every later sibling Stmt (i+1..end) in this scope.
+         //   - PLUS: extraDownstream (R17 cwd-persistence + post-conditional
+         //     siblings inherited from a parent scope).
+         classifyCdInStmt(stmt, stmts, i, safeForIterVars, out, extraDownstream);
+         // Recurse into nested scopes inside this Stmt — each opens its own
+         // sequential scope. The parent scope's safeForIterVars are inherited.
+         // R17 P1: this Stmt's own post-stmt siblings (stmts[i+1..]) PLUS the
+         // inherited extraDownstream become the extraDownstream for nested
+         // scopes — cwd changes inside an if/while/until/subshell/block on
+         // this Stmt persist into and past those constructs.
+         const postSiblings = [];
+         for (let j = i + 1; j < stmts.length; j += 1) {
+             const s = stmts[j];
+             if (s)
+                 postSiblings.push(s);
+         }
+         for (const d of extraDownstream)
+             postSiblings.push(d);
+         descendIntoNestedScopes(stmt, safeForIterVars, out, postSiblings);
+     }
+ }
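The way `walkScopeForCwd` assembles the region a cd can affect (later siblings first, then inherited carriers) can be sketched on toy data. This is an illustration only: `downstreamFor` is a hypothetical helper, and plain strings stand in for the mvdan-sh Stmt nodes the real code handles.

```javascript
// Hypothetical helper: the downstream region for the statement at index i
// is every later non-empty sibling, followed by the carriers inherited
// from the enclosing scope (extraDownstream in the hunk above).
function downstreamFor(stmts, i, extraDownstream = []) {
    const later = stmts.slice(i + 1).filter(Boolean);
    return [...later, ...extraDownstream];
}

// Toy scope: a cd, then two writes separated by an empty slot,
// plus one sibling inherited from the parent scope.
const scope = ['cd .rea', 'echo > HALT', null, 'echo > log'];
const region = downstreamFor(scope, 0, ['post-conditional stmt']);
```

Statements before index `i` never enter the result, which is the "writes BEFORE the cd do not taint" rule from the predicate.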
+ /**
+  * Find every cd/pushd CallExpr in `stmt` (top-level or LHS of a
+  * BinaryCmd chain) and emit per the predicate. The
+  * "downstream-in-scope" subtree for emission is computed per-cd-site.
+  */
+ function classifyCdInStmt(stmt, scopeStmts, stmtIndex, safeForIterVars, out, extraDownstream = []) {
+     const cmd = stmt['Cmd'];
+     if (!cmd || typeof cmd !== 'object')
+         return;
+     // Each cd-site finder yields the `cd` CallExpr and the "downstream
+     // subtree" — the AST region whose writes are in scope of THIS cd.
+     // For sequential successors, downstream extends across scopeStmts[i+1..].
+     // R17 P1: extraDownstream is appended to every cd-site's downstream so
+     // cwd-persistence into the body of an if/while/until and past it is
+     // observable to the predicate.
+     const sites = [];
+     collectCdSitesInStmt(stmt, scopeStmts, stmtIndex, sites, extraDownstream);
+     for (const site of sites) {
+         emitCdDecisionIfAny(site.callExpr, site.downstream, safeForIterVars, out);
+     }
+ }
+ /**
+  * Recursively locate cd/pushd CallExprs within `stmt`. Each site's
+  * downstream subtree is the AST region containing in-scope successor
+  * writes. The traversal stays in the SAME lexical scope as the cd —
+  * it does NOT walk into Subshells / Blocks / Function bodies (those
+  * open new scopes that are handled by `walkScopeForCwd` on the
+  * containing-scope iteration).
+  */
+ function collectCdSitesInStmt(stmt, scopeStmts, stmtIndex, sites, extraDownstream = []) {
+     const cmd = stmt['Cmd'];
+     if (!cmd || typeof cmd !== 'object')
+         return;
+     const cmdNode = cmd;
+     const t = nodeType(cmdNode);
+     if (t === 'CallExpr') {
+         if (isCdOrPushd(cmdNode)) {
+             // Downstream: every sibling Stmt after this one in the same
+             // scope. Stmt-level redirects on later siblings are reachable
+             // because they execute sequentially after the cd.
+             const downstream = [];
+             for (let j = stmtIndex + 1; j < scopeStmts.length; j += 1) {
+                 const s = scopeStmts[j];
+                 if (s)
+                     downstream.push(s);
+             }
+             // R17 P1: append cwd-persistence carriers (then/else/do bodies of
+             // an enclosing IfClause/WhileClause/UntilClause whose Cond holds
+             // this cd, plus the conditional's own post-stmt siblings).
+             for (const d of extraDownstream)
+                 downstream.push(d);
+             sites.push({ callExpr: cmdNode, downstream });
+         }
+         return;
+     }
+     if (t === 'BinaryCmd') {
+         // mvdan-sh BinaryCmd ops we care about for "executes after":
+         //   0xa &&    0xb ||    0xc |    0xd |&
+         // For all four, X executes (or starts piping) before Y. A cd in X
+         // is followed in execution order by Y. Pipes (0xc/0xd) DO have a
+         // separate-process boundary — `cd .rea | echo > HALT` does NOT
+         // affect the second process's cwd. We conservatively still treat
+         // pipes as "downstream" because the kill-switch is symmetric: in a
+         // subshell child the cd applies, but the redirect happens in a
+         // parallel pipe stage. The corner case yields a small over-emit
+         // for pipe-only chains; in practice `&&`/`||`/`;` are the attack
+         // shapes (per helix-024 PoCs F1-1..F1-8), so this is acceptable.
+         const x = cmdNode['X'];
+         const y = cmdNode['Y'];
+         // Recurse into X looking for cd; its downstream is Y plus the
+         // outer-scope siblings plus any inherited extraDownstream (R17 P1
+         // cwd-persistence carriers).
+         const yDownstream = [];
+         if (y && typeof y === 'object')
+             yDownstream.push(y);
+         for (let j = stmtIndex + 1; j < scopeStmts.length; j += 1) {
+             const s = scopeStmts[j];
+             if (s)
+                 yDownstream.push(s);
+         }
+         for (const d of extraDownstream)
+             yDownstream.push(d);
+         if (x && typeof x === 'object') {
+             collectCdSitesInBinaryX(x, yDownstream, sites);
+         }
+         // ALSO recurse into Y for cd sites whose own downstream is the
+         // outer scope only. This covers `(non-cd) && cd .rea && echo >
+         // HALT` shapes — Y is itself a BinaryCmd with cd in its X.
+         if (y && typeof y === 'object') {
+             const yStmt = y;
+             if (nodeType(yStmt) === 'Stmt') {
+                 // Y has its own (X, Y) inner BinaryCmd if the operator chain
+                 // continues. We let the outer-scope iteration via scopeStmts
+                 // pick up siblings for Y's own cd; here we only need to find
+                 // cd sites in Y itself with the outer downstream.
+                 const outerDownstream = [];
+                 for (let j = stmtIndex + 1; j < scopeStmts.length; j += 1) {
+                     const s = scopeStmts[j];
+                     if (s)
+                         outerDownstream.push(s);
+                 }
+                 for (const d of extraDownstream)
+                     outerDownstream.push(d);
+                 // Inner BinaryCmd in Y: walk it as if it were the Stmt at
+                 // index 0 of a synthetic scope [Y, ...outerDownstream].
+                 // We pass no extraDownstream because outerDownstream
+                 // already includes the inherited carriers.
+                 const synthScope = [yStmt, ...outerDownstream];
+                 collectCdSitesInStmt(yStmt, synthScope, 0, sites);
+             }
+         }
+         return;
+     }
+     // Other Cmd shapes (FuncDecl, Subshell, Block, IfClause, ForClause,
+     // WhileClause, CaseClause, etc.) open new scopes — the descent into
+     // their body lists is `descendIntoNestedScopes`, not here.
+ }
+ /** Walk into a BinaryCmd.X subtree to find cd sites. */
+ function collectCdSitesInBinaryX(x, downstream, sites) {
+     if (nodeType(x) !== 'Stmt')
+         return;
+     const cmd = x['Cmd'];
+     if (!cmd || typeof cmd !== 'object')
+         return;
+     const cmdNode = cmd;
+     const t = nodeType(cmdNode);
+     if (t === 'CallExpr') {
+         if (isCdOrPushd(cmdNode)) {
+             sites.push({ callExpr: cmdNode, downstream: downstream.slice() });
+         }
+         return;
+     }
+     if (t === 'BinaryCmd') {
+         const innerX = cmdNode['X'];
+         const innerY = cmdNode['Y'];
+         const innerDownstream = [];
+         if (innerY && typeof innerY === 'object')
+             innerDownstream.push(innerY);
+         for (const d of downstream)
+             innerDownstream.push(d);
+         if (innerX && typeof innerX === 'object') {
+             collectCdSitesInBinaryX(innerX, innerDownstream, sites);
+         }
+         if (innerY && typeof innerY === 'object') {
+             collectCdSitesInBinaryX(innerY, downstream, sites);
+         }
+         return;
+     }
+     // R17 P2: TimeClause / CoprocClause wrap a single Stmt without
+     // opening a new scope. `time cd .rea && echo > HALT` parses as
+     // BinaryCmd(X=Stmt[TimeClause[Stmt[cd .rea]]], Y=Stmt[echo > HALT])
+     // so the cd lives one wrap-level deeper than CallExpr/BinaryCmd
+     // expects. Unwrap and recurse with the same downstream so the cd
+     // site sees the && Y as its downstream carrier.
+     if (t === 'TimeClause' || t === 'CoprocClause') {
+         const innerStmt = cmdNode['Stmt'];
+         if (innerStmt && typeof innerStmt === 'object') {
+             collectCdSitesInBinaryX(innerStmt, downstream, sites);
+         }
+         return;
+     }
+ }
+ /**
+  * Descend into nested-scope-opening constructs inside `stmt`. For each
+  * new scope (Subshell.Stmts, Block.Stmts, ForClause.Do, WhileClause.Do,
+  * IfClause.Then/Else, FuncDecl body, CaseClause item bodies), call
+  * `walkScopeForCwd` to repeat the analysis with that scope's Stmts.
+  */
+ function descendIntoNestedScopes(stmt, safeForIterVars, out, extraDownstream = []) {
+     const cmd = stmt['Cmd'];
+     if (!cmd || typeof cmd !== 'object')
+         return;
+     descendCmdScopes(cmd, safeForIterVars, out, extraDownstream);
+     // Stmt-level redirects can contain process substitutions whose inner
+     // Stmts open their own scope. Walk via syntax.Walk for those — the
+     // ProcSubst.Stmts will itself be visited as a nested scope handled
+     // by the walker's main pass; for this F1 analysis, ProcSubst is rare
+     // enough on the cd path that we accept missing it as a deferred
+     // false-negative (documented).
+ }
+ function descendCmdScopes(cmd, safeForIterVars, out, extraDownstream = []) {
+     const t = nodeType(cmd);
+     switch (t) {
+         case 'BinaryCmd': {
+             const x = cmd['X'];
+             const y = cmd['Y'];
+             if (x && typeof x === 'object')
+                 descendStmtScopes(x, safeForIterVars, out, extraDownstream);
+             if (y && typeof y === 'object')
+                 descendStmtScopes(y, safeForIterVars, out, extraDownstream);
+             break;
+         }
+         case 'Subshell':
+         case 'Block': {
+             const inner = asArray(cmd['Stmts']);
+             // R17 P1: cwd changes inside a Subshell DO NOT escape — bash forks
+             // a child shell. So Subshell's inner cd-sites should NOT inherit
+             // the parent's post-stmt siblings. Block ({...}) DOES persist cwd
+             // to the parent shell — inherit extraDownstream there.
+             const subExtra = t === 'Subshell' ? [] : extraDownstream;
+             walkScopeForCwd(inner, safeForIterVars, out, subExtra);
+             break;
+         }
+         case 'ForClause': {
+             // Extend safeForIterVars if the WordIter has
+             // all-literal-non-protected Items — the iter variable is
+             // provably bound to a safe value at runtime.
+             const loop = cmd['Loop'];
+             let nextSafe = safeForIterVars;
+             if (loop && typeof loop === 'object') {
+                 const loopNode = loop;
+                 const loopType = nodeType(loopNode);
+                 if (loopType === 'WordIter') {
+                     const nameNode = loopNode['Name'];
+                     let iterName = '';
+                     if (nameNode && typeof nameNode === 'object') {
+                         iterName = stringifyField(nameNode['Value']);
+                     }
+                     const items = asArray(loopNode['Items']);
+                     let allSafe = items.length > 0;
+                     for (const item of items) {
+                         if (typeof item !== 'object' || item === null) {
+                             allSafe = false;
+                             break;
+                         }
+                         const v = wordToString(item);
+                         if (v === null || v.dynamic) {
+                             allSafe = false;
+                             break;
+                         }
+                         if (isPathPotentiallyProtected(v.value)) {
+                             allSafe = false;
+                             break;
+                         }
+                     }
+                     if (allSafe && iterName.length > 0) {
+                         const ext = new Set(safeForIterVars);
+                         ext.add(iterName);
+                         nextSafe = ext;
+                     }
+                 }
+                 // CStyleLoop: counter variables are arithmetic, never paths;
+                 // we don't try to mark them safe but we also don't need to —
+                 // the body's cd would have to use arithmetic-as-path, which
+                 // is degenerate and not a known attack shape.
+             }
+             const doStmts = asArray(cmd['Do']);
+             // R17 P1: a cd in a ForClause body persists cwd to the next iter
+             // and past the loop. Inherit extraDownstream into the body.
+             walkScopeForCwd(doStmts, nextSafe, out, extraDownstream);
+             break;
+         }
+         case 'WhileClause':
+         case 'UntilClause': {
+             const doStmts = asArray(cmd['Do']);
+             const cond = asArray(cmd['Cond']);
+             // R17 P1 closure: a cd in the Cond persists into the Do body
+             // (cwd applies to body when cond evaluates true) AND past the
+             // loop (cwd persists when the loop exits). Pass the body PLUS
+             // the inherited extraDownstream as carriers when walking Cond.
+             const condCarriers = [];
+             for (const s of doStmts)
+                 condCarriers.push(s);
+             for (const d of extraDownstream)
+                 condCarriers.push(d);
+             walkScopeForCwd(cond, safeForIterVars, out, condCarriers);
+             // Body cd-sites inherit only the post-loop extraDownstream
+             // (cwd persists past the loop).
+             walkScopeForCwd(doStmts, safeForIterVars, out, extraDownstream);
+             break;
+         }
+         case 'IfClause': {
+             const cond = asArray(cmd['Cond']);
+             const thenStmts = asArray(cmd['Then']);
+             const elseEntry = cmd['Else'];
+             // R17 P1 closure: a cd in the Cond persists into the then-body
+             // (when cond succeeds) AND into the else-body (when cond fails,
+             // e.g. a later command in the cond returns non-zero after the
+             // cd has already run). The cwd change happens regardless of
+             // which branch is taken; bash semantics: side-effects from cond
+             // persist into BOTH branches and past the conditional. So the
+             // carriers for the cond walk are: then-body + else-body +
+             // inherited extraDownstream (post-conditional siblings).
+             const condCarriers = [];
+             for (const s of thenStmts)
+                 condCarriers.push(s);
+             // Else can be a Cmd whose body is its own IfClause (elif chain)
+             // or a Block — flatten its Stmts into carriers via a synthetic
+             // collection. We can't easily walk its inner stmt-list without
+             // recursing into a structure that may itself contain control
+             // flow; passing the Else-Cmd's containing Stmt as a single
+             // BashNode in carriers is sufficient for the predicate (any
+             // bare-relative write inside that subtree counts).
+             if (elseEntry && typeof elseEntry === 'object') {
+                 condCarriers.push(elseEntry);
+             }
+             for (const d of extraDownstream)
+                 condCarriers.push(d);
+             walkScopeForCwd(cond, safeForIterVars, out, condCarriers);
+             // then-body cd-sites inherit only the post-conditional
+             // extraDownstream.
+             walkScopeForCwd(thenStmts, safeForIterVars, out, extraDownstream);
+             if (elseEntry && typeof elseEntry === 'object') {
+                 // Else is itself an IfClause-or-StmtList in mvdan-sh; descend
+                 // through its Cmd-like structure carrying extraDownstream.
+                 descendCmdScopes(elseEntry, safeForIterVars, out, extraDownstream);
+             }
+             break;
+         }
+         case 'CaseClause': {
+             const items = asArray(cmd['Items']);
+             for (const it of items) {
+                 if (!it || typeof it !== 'object')
+                     continue;
+                 const itStmts = asArray(it['Stmts']);
+                 // CaseClause Word is evaluated once; cwd-side-effects in
+                 // an item body persist past the case (control returns to
+                 // parent scope). Inherit extraDownstream.
+                 walkScopeForCwd(itStmts, safeForIterVars, out, extraDownstream);
+             }
+             break;
+         }
+         case 'FuncDecl': {
+             const body = cmd['Body'];
+             if (body && typeof body === 'object') {
+                 // FuncDecl is a definition only; the body's cwd side-effects
+                 // apply to the caller's scope WHEN INVOKED, not at definition
+                 // time. Static analysis cannot know callers, so we
+                 // conservatively walk the body with no inherited
+                 // extraDownstream.
+                 descendStmtScopes(body, safeForIterVars, out, []);
+             }
+             break;
+         }
+         case 'TimeClause':
+         case 'CoprocClause': {
+             // R17 P2 closure: `time <stmt>` and `coproc <stmt>` wrap a single
+             // Stmt without opening a new lexical scope. A cd inside a
+             // TimeClause behaves like the unwrapped form because `time`
+             // runs its command in the current shell. A coproc runs
+             // asynchronously in a subshell, so its cd does not reach the
+             // parent shell; treating it the same way is a deliberate
+             // over-approximation.
+             // Conservative: descend into the inner Stmt carrying the
+             // inherited extraDownstream.
+             const stmt = cmd['Stmt'];
+             if (stmt && typeof stmt === 'object') {
+                 descendStmtScopes(stmt, safeForIterVars, out, extraDownstream);
+             }
+             // CoprocClause may also use a CoprocCmd field in some mvdan-sh
+             // versions; handle both shapes defensively.
+             const coprocCmd = cmd['CoprocCmd'];
+             if (coprocCmd && typeof coprocCmd === 'object') {
+                 descendCmdScopes(coprocCmd, safeForIterVars, out, extraDownstream);
+             }
+             break;
+         }
+         default:
+             // Other Cmd kinds (CallExpr, ArithmCmd, DeclClause, TestClause,
+             // LetClause, etc.) don't open new sequential scopes for our
+             // purposes. Their inner expressions can contain CmdSubst payloads
+             // that themselves are full BashFiles re-parsed elsewhere; those
+             // are handled by the main walker pass.
+             break;
+     }
+ }
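The carrier policy spread across the `switch` above can be condensed into one small function. This is a sketch of that policy as read from the hunk, not code from the package: `carriersForBody` is a hypothetical name, and it only covers the body walks (the Cond walks of If/While/Until additionally prepend the then/else/do bodies).

```javascript
// Hypothetical summary: which inherited carriers a nested body receives.
// `kind` is the node type of the construct that opens the body.
function carriersForBody(kind, extraDownstream) {
    switch (kind) {
        // A subshell runs in a child process, so its cd never reaches
        // the parent's later statements.
        case 'Subshell':
        // A function definition has no statically known caller.
        case 'FuncDecl':
            return [];
        // Block, ForClause, WhileClause, UntilClause, IfClause and
        // CaseClause bodies run in the current shell, so cwd persists.
        default:
            return extraDownstream;
    }
}
```

For example, `(cd .rea); echo > HALT` yields no carriers for the subshell body, while `{ cd .rea; }; echo > HALT` passes the trailing `echo` through as downstream of the cd.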
+ function descendStmtScopes(stmt, safeForIterVars, out, extraDownstream = []) {
+     if (nodeType(stmt) !== 'Stmt') {
+         // Some Cmd containers (FuncDecl.Body) wrap a CallExpr / Block
+         // directly. Descend on Cmd shape.
+         descendCmdScopes(stmt, safeForIterVars, out, extraDownstream);
+         return;
+     }
+     const cmd = stmt['Cmd'];
+     if (!cmd || typeof cmd !== 'object')
+         return;
+     descendCmdScopes(cmd, safeForIterVars, out, extraDownstream);
+ }
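The ForClause branch of `descendCmdScopes` marks a loop variable as safe only when every item is a literal, non-protected path. A minimal sketch of that rule follows, with loudly labeled assumptions: `looksProtected` is a simplified stand-in for the package's `isPathPotentiallyProtected` (whose real logic is not shown in this hunk), items are modeled as `{ value, dynamic }` objects, and the prefix list is taken from the comment near the top of the diff.

```javascript
// Assumption: protected dir-prefixes as listed in the helix-024 F1 comment.
const PROTECTED_PREFIXES = ['.rea', '.husky', '.claude', '.github/workflows'];

// Simplified stand-in for isPathPotentiallyProtected (hypothetical).
function looksProtected(p) {
    const norm = p.replace(/^\.\//, '').replace(/\/+$/, '');
    return PROTECTED_PREFIXES.some((pre) => norm === pre || norm.startsWith(pre + '/'));
}

// Returns a new Set including iterName when the loop is provably safe,
// otherwise returns the inherited set unchanged.
function extendSafeIterVars(safe, iterName, items) {
    const allSafe = items.length > 0 &&
        items.every((it) => !it.dynamic && !looksProtected(it.value));
    if (!allSafe || iterName.length === 0)
        return safe;
    const ext = new Set(safe);
    ext.add(iterName);
    return ext;
}
```

Copying into a fresh `Set` keeps the parent scope's set untouched, so a variable that is safe inside one loop body does not leak into sibling scopes.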
+ /** Test whether a CallExpr is `cd`, `pushd`, or `popd`. */
+ function isCdOrPushd(callExpr) {
+     const args = asArray(callExpr['Args']);
+     if (args.length === 0)
+         return false;
+     const argv = [];
+     for (const arg of args) {
+         if (typeof arg !== 'object' || arg === null)
+             continue;
+         const v = wordToString(arg);
+         argv.push(v ?? { value: '', dynamic: true, position: { line: 0, col: 0 } });
+     }
+     if (argv.length === 0 || argv[0] === undefined)
+         return false;
+     const stripped = stripEnvAndModifiers(argv);
+     if (stripped.length === 0 || stripped[0] === undefined)
+         return false;
+     const head = normalizeCmdHead(stripped[0].value);
+     return head === 'cd' || head === 'pushd' || head === 'popd';
+ }
+ /**
+  * Decide whether a cd-site emits a synthetic detection. Per the
+  * refined predicate:
+  *   - Find the cd target Word.
+  *   - Search the downstream subtree for bare-relative writes.
+  *   - For literal cd targets: emit cwd_protected_unresolvable IFF a
+  *     bare-relative write exists in scope. Scanner will then run the
+  *     protected-prefix test on the literal cd target — non-protected
+  *     literals (e.g., `/tmp`) ALLOW; protected (e.g., `.rea`) BLOCK.
+  *   - For dynamic cd targets: classify the source. If known-safe →
+  *     no emit. If not known-safe AND a bare-relative write exists in
+  *     scope → emit cwd_dynamic_with_writes_unresolvable. Else → no emit.
+  */
+ function emitCdDecisionIfAny(callExpr, downstream, safeForIterVars, out) {
+     const args = asArray(callExpr['Args']);
+     if (args.length === 0)
+         return;
+     // Re-extract argv (we did this in isCdOrPushd; the caller already
+     // confirmed the head is cd/pushd/popd).
+     const argv = [];
+     const argWords = [];
+     for (const arg of args) {
+         if (typeof arg !== 'object' || arg === null)
+             continue;
+         const v = wordToString(arg);
+         argv.push(v ?? { value: '', dynamic: true, position: { line: 0, col: 0 } });
+         argWords.push(arg);
+     }
+     if (argv.length === 0 || argv[0] === undefined)
+         return;
+     const stripped = stripEnvAndModifiers(argv);
+     if (stripped.length === 0 || stripped[0] === undefined)
+         return;
+     const head = normalizeCmdHead(stripped[0].value);
+     if (head !== 'cd' && head !== 'pushd' && head !== 'popd')
+         return;
+     // Find the first non-flag positional. Recover both the WordValue and
+     // the underlying Word AST node so the dynamic-source classifier can
+     // inspect the Parts directly.
+     let target = null;
+     let targetWord = null;
+     // Note: stripEnvAndModifiers may synthesize argv (env-prefix strip),
+     // so we cannot trust positional indices in `stripped` to align with
+     // argWords. Re-walk argv (which DOES align with argWords) and skip
+     // the leading Assigns / wrapper modifiers using the same flag-skip.
+     // Simpler: we accept that position-info (col/line) for the cd target
+     // comes from `stripped`'s WordValue.position; for AST classification
+     // we walk the original argv looking for the first non-flag positional
+     // whose value matches `target.value`.
+     for (let i = 1; i < stripped.length; i += 1) {
+         const tok = stripped[i];
+         if (tok === undefined)
+             continue;
+         const v = tok.value;
+         if (v === '--')
+             continue;
+         if (v === '-' || v === '+')
+             continue;
+         if (v.startsWith('-') && v.length > 1)
+             continue;
+         if (v.startsWith('+') && v.length > 1 && /^\+\d+$/.test(v))
+             continue;
+         target = tok;
+         break;
+     }
+     if (target === null) {
+         // No positional after flag-skip — bash defaults cwd to $HOME
+         // (`cd`, `cd -L`, `cd -P`) or to OLDPWD (`cd -`). popd defaults
+         // to dir-stack head. Same threat class as `cd "$HOME"` /
+         // `cd "$OLDPWD"` (R15 F1): env-var rebindable, refuse on
+         // uncertainty when bare-relative writes are in scope.
+         const downstreamWrites = collectWritesInSubtrees(downstream);
+         let hasBareRelativeWrite = false;
+         for (const w of downstreamWrites) {
+             if (isBareRelativeWrite(w)) {
+                 hasBareRelativeWrite = true;
+                 break;
+             }
+         }
+         if (!hasBareRelativeWrite)
+             return;
+         out.push({
+             path: '',
+             form: 'cwd_dynamic_with_writes_unresolvable',
+             position: stripped[0]?.position ?? { line: 0, col: 0 },
+             dynamic: true,
+             originSrc: `${head} (no positional / hyphen-only / popd) defaults to $HOME or $OLDPWD or dir-stack — refusing on uncertainty`,
+         });
+         return;
+     }
+     // Try to recover the underlying Word for the cd target. If
+     // stripEnvAndModifiers inserted synthetic wrappers, this lookup may
+     // fail — we fall back to treating the target as opaque-dynamic.
+     for (let i = 0; i < argv.length; i += 1) {
+         if (argv[i] === target) {
+             targetWord = argWords[i] ?? null;
+             break;
+         }
+         // value-equality fallback when argv has been rewritten
+         if (argv[i] !== undefined &&
+             argv[i]?.value === target.value &&
+             argv[i]?.dynamic === target.dynamic) {
+             targetWord = argWords[i] ?? null;
+         }
+     }
+     // Compute in-scope bare-relative writes. We collect ALL writes
+     // reachable in the downstream subtree and filter by bare-relative
+     // path-shape. We must include writes from the cd's BinaryCmd.Y too,
+     // which is part of `downstream`. We reuse `walkForWrites` semantics
+     // by running the same dispatcher over the downstream subtree.
+     const downstreamWrites = collectWritesInSubtrees(downstream);
+     let hasBareRelativeWrite = false;
+     for (const w of downstreamWrites) {
+         if (isBareRelativeWrite(w)) {
+             hasBareRelativeWrite = true;
+             break;
+         }
+     }
+     if (!hasBareRelativeWrite)
+         return;
+     if (target.dynamic) {
+         // Classify source. Known-safe → no emit.
+         if (targetWord !== null && isKnownSafeCdSource(targetWord, safeForIterVars)) {
+             return;
+         }
+         out.push({
+             path: '',
+             form: 'cwd_dynamic_with_writes_unresolvable',
+             position: target.position,
+             dynamic: true,
+             originSrc: `${head} <dynamic> with bare-relative writes in scope — refusing on uncertainty`,
+         });
+         return;
+     }
+     // Literal target — emit and let the scanner decide protected vs
+     // non-protected via checkPathProtected with forceDirSemantics.
+     out.push({
+         path: target.value,
+         form: 'cwd_protected_unresolvable',
+         position: target.position,
+         dynamic: false,
+         isDirTarget: true,
+         originSrc: `${head} ${target.value} (bare-relative writes in scope — protected-prefix test required)`,
+     });
+ }
943
+ /**
944
+ * Collect detected writes in a list of subtrees by re-running the
945
+ * walker's dispatcher on each. This is safe because the dispatcher is
946
+ * pure (no global state) and produces the same DetectedWrite shapes as
947
+ * the main pass — we do NOT propagate these into the final `out` array
948
+ * (the main pass already covers them); we only use the path/dynamic
949
+ * attributes for the in-scope-bare-relative predicate.
950
+ */
951
+ function collectWritesInSubtrees(subtrees) {
952
+ const writes = [];
953
+ for (const root of subtrees) {
954
+ if (!root || typeof root !== 'object')
955
+ continue;
956
+ try {
957
+ syntax.Walk(root, (node) => {
958
+ if (node === null || node === undefined)
959
+ return true;
960
+ const t = nodeType(node);
961
+ if (t === 'Stmt') {
962
+ extractStmtRedirects(node, writes);
963
+ }
964
+ else if (t === 'CallExpr') {
965
+ // Filter cd/pushd themselves out of the downstream-write set
966
+ // (a cd's own argv shouldn't bootstrap an emit on a sibling
967
+ // cd). All other call-expr-level write detections reach
968
+ // walkCallExpr.
969
+ if (!isCdOrPushd(node)) {
970
+ walkCallExpr(node, writes);
971
+ }
972
+ }
973
+ return true;
974
+ });
975
+ }
976
+ catch {
977
+ // pathological subtree — fall through with whatever writes we
978
+ // collected so far. Fail-OPEN here is fine: missing a write means
979
+ // the cd won't emit, but the main walker pass already emitted
980
+ // the same write through its own normal traversal — the kill
981
+ // switch decision still rides on the main pass's verdict.
982
+ }
983
+ }
984
+ return writes;
985
+ }
986
+ /**
987
+ * A write target is "bare-relative" iff it would resolve relative to
988
+ * cwd at runtime — i.e., a cd before it changes which file is hit:
989
+ * - Not absolute (no leading `/`).
990
+ * - Not tilde-expanded (no leading `~/` or `~`).
991
+ * - Has a non-empty path (empty paths are dynamic-only emits we
992
+ * skip — they don't pin a cwd-relative shape).
993
+ * - Dynamic targets with a non-empty path also count: we can't know
994
+ * how they resolve, so we conservatively treat them as
995
+ * bare-relative and let the cd-source classifier decide.
996
+ */
997
+ function isBareRelativeWrite(w) {
998
+ // Skip detections that don't carry a path. Dynamic emits with empty
999
+ // path (xargs-stdin, find-exec-placeholder, nested-shell-inner) are
1000
+ // refuse-on-uncertainty regardless of cwd — they're handled on
1001
+ // their own merits by the scanner.
1002
+ if (w.path.length === 0)
1003
+ return false;
1004
+ // Skip the cd-class synthetic emits themselves (defensive — the
1005
+ // walker filters cd/pushd CallExprs from the downstream set).
1006
+ if (w.form === 'cwd_protected_unresolvable' ||
1007
+ w.form === 'cwd_dynamic_with_writes_unresolvable' ||
1008
+ w.form === 'ln_to_protected_unresolvable') {
1009
+ return false;
1010
+ }
1011
+ if (w.path.startsWith('/'))
1012
+ return false;
1013
+ if (w.path.startsWith('~/'))
1014
+ return false;
1015
+ if (w.path === '~')
1016
+ return false;
1017
+ // Outside-root sentinel emitted by the dynamic-target normalizer.
1018
+ // Defensive — the walker emits raw paths here but if a future change
1019
+ // surfaces sentinels into DetectedWrite.path, treat them as not
1020
+ // bare-relative (already-normalized to cwd-independent shape).
1021
+ if (w.path.startsWith('__rea_outside_root__'))
1022
+ return false;
1023
+ if (w.path.startsWith('__rea_unresolved_expansion__'))
1024
+ return false;
1025
+ // Dynamic `$VAR` paths arrive here with `dynamic: true` and a
1026
+ // non-empty string value that may be partial. Conservatively treat
1027
+ // them as bare-relative; the cd-source classifier runs next and decides.
1028
+ return true;
1029
+ }
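To make the predicate concrete, here is a minimal standalone sketch of the same path-shape checks applied to plain strings rather than DetectedWrite objects. The helper name `looksBareRelative` is hypothetical and not part of the package; the two sentinel prefixes are copied from the function above, and the `form`-based filtering is left out for brevity.

```javascript
// Hypothetical mirror of the path-shape checks in isBareRelativeWrite,
// operating on a plain path string instead of a DetectedWrite.
function looksBareRelative(path) {
  if (path.length === 0) return false; // pathless dynamic emit, judged elsewhere
  if (path.startsWith('/')) return false; // absolute: cwd is irrelevant
  if (path === '~' || path.startsWith('~/')) return false; // tilde-expanded
  if (path.startsWith('__rea_outside_root__')) return false;
  if (path.startsWith('__rea_unresolved_expansion__')) return false;
  return true; // resolves against cwd, so a preceding cd changes the target
}

for (const p of ['HALT', './HALT', '/tmp/x', '~/notes', '']) {
  console.log(JSON.stringify(p), looksBareRelative(p));
}
```

Only the first two sample paths come back `true`: they are the shapes for which a preceding `cd .rea` would redirect the write.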
1030
+ /**
1031
+ * Test whether a path string is "potentially protected" — used for
1032
+ * for-loop iter classification. A safe iter Items list contains ONLY
1033
+ * non-protected literal paths. The check is conservative: any path
1034
+ * starting with `.rea`, `.husky`, `.claude`, `.github`, or otherwise
1035
+ * matching the dot-anchored protected list disqualifies the iter from
1036
+ * being marked safe. We keep this in sync with the protected-list
1037
+ * shape rather than hard-coding patterns; the walker only needs a
1038
+ * coarse over-approximation here because the result only EXPANDS the
1039
+ * known-safe set (false-positive = miss the optimization, fall back
1040
+ * to the dynamic emit).
1041
+ */
1042
+ function isPathPotentiallyProtected(p) {
1043
+ if (p.length === 0)
1044
+ return false;
1045
+ // Strip a leading ./ for normalization parity with the scanner.
1046
+ let s = p;
1047
+ if (s.startsWith('./'))
1048
+ s = s.slice(2);
1049
+ // Any path under one of the known protected dir prefixes — keep this
1050
+ // in sync with the scanner's effective patterns (best-effort).
1051
+ const PROTECTED_DIR_PREFIXES = ['.rea', '.husky', '.claude', '.github'];
1052
+ for (const prefix of PROTECTED_DIR_PREFIXES) {
1053
+ if (s === prefix)
1054
+ return true;
1055
+ if (s.startsWith(`${prefix}/`))
1056
+ return true;
1057
+ }
1058
+ return false;
1059
+ }
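The prefix test above matches on whole path segments, which is easy to get wrong with a bare `startsWith`. The sketch below restates it as a standalone function (the name `underProtectedPrefix` is hypothetical) to show the segment boundary and the deliberate over-approximation on `.github`.

```javascript
// Hypothetical restatement of the prefix test, for illustration only.
const PREFIXES = ['.rea', '.husky', '.claude', '.github'];

function underProtectedPrefix(p) {
  // Strip one leading "./" so "./.rea" and ".rea" are treated alike.
  const s = p.startsWith('./') ? p.slice(2) : p;
  // Match the prefix itself or anything below it; ".reactrc" must NOT match.
  return PREFIXES.some((prefix) => s === prefix || s.startsWith(`${prefix}/`));
}

console.log(underProtectedPrefix('./.rea/HALT')); // true
console.log(underProtectedPrefix('.reactrc')); // false
console.log(underProtectedPrefix('.github/ISSUE_TEMPLATE')); // true (coarse on purpose)
```

The last case is stricter than the `.github/workflows` prefix named in the F1 closure comment. That is acceptable here because a false positive only means a for-loop variable is not marked safe and the walker falls back to the dynamic emit.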
1060
+ /**
1061
+ * Classify a dynamic cd target's source. Returns true if the entire
1062
+ * Word's expansion is provably non-protected at runtime:
1063
+ * - `$(pwd)`, `$(git rev-parse --show-toplevel)`, `$(git rev-parse
1064
+ * --show-cdup)`: cmdsubst that resolves to the current cwd, the
1065
+ * absolute project root, or a run of `../` segments.
1066
+ * - For-iter variables whose Items list is all literal non-
1067
+ * protected paths (`for d in src test` → `$d` is safe).
1068
+ *
1069
+ * Anything else is NOT known-safe and the scanner refuses on
1070
+ * uncertainty: `$P`, `$REPO_ROOT`, `${VAR:-default}`, mixed
1071
+ * literal+expansion like `prefix/$VAR`, and also `$HOME`, `$PWD`,
1072
+ * `$OLDPWD`, which are runtime-rebindable (see isParamExpKnownSafe,
1073
+ * round-15 P1).
1074
+ */
1075
+ function isKnownSafeCdSource(word, safeForIterVars) {
1076
+ const parts = asArray(word['Parts']);
1077
+ if (parts.length === 0)
1078
+ return false;
1079
+ // The Word must be a SINGLE Part for classification; mixed shapes
1080
+ // like `prefix/$VAR` have multiple Parts and are not provably safe.
1081
+ if (parts.length !== 1)
1082
+ return false;
1083
+ const partRaw = parts[0];
1084
+ if (!partRaw || typeof partRaw !== 'object')
1085
+ return false;
1086
+ let part = partRaw;
1087
+ // Unwrap a single-Part DblQuoted — `cd "$HOME"` parses as Word ⊃
1088
+ // DblQuoted ⊃ ParamExp{HOME}. The DblQuoted wrapper is just shell
1089
+ // quoting and doesn't change the expansion's semantics.
1090
+ if (nodeType(part) === 'DblQuoted') {
1091
+ const innerParts = asArray(part['Parts']);
1092
+ if (innerParts.length !== 1)
1093
+ return false;
1094
+ const ip = innerParts[0];
1095
+ if (!ip || typeof ip !== 'object')
1096
+ return false;
1097
+ part = ip;
1098
+ }
1099
+ const t = nodeType(part);
1100
+ if (t === 'ParamExp') {
1101
+ return isParamExpKnownSafe(part, safeForIterVars);
1102
+ }
1103
+ if (t === 'CmdSubst') {
1104
+ return isCmdSubstKnownSafe(part);
1105
+ }
1106
+ return false;
1107
+ }
1108
+ /**
1109
+ * A ParamExp is known-safe iff it's a plain reference to a safe var name.
1110
+ *
1111
+ * **Round-15 P1 closure — no env var name is statically safe.** Earlier
1112
+ * iterations included `HOME`/`PWD`/`OLDPWD` in a known-safe allow-list
1113
+ * on the assumption that the shell sets them to absolute paths outside
1114
+ * the project root. That assumption is wrong:
1115
+ *
1116
+ * - **Inline assignment-prefix on the same simple command rebinds them**
1117
+ * (`HOME=.rea cd "$HOME"` → cwd lands in `.rea/`).
1118
+ * - **Parent-shell exports rebind them across commands**
1119
+ * (`export HOME=.rea; cd "$HOME" && echo > HALT`).
1120
+ * - **`OLDPWD` automatically tracks any previous cd**, so a previous cd
1121
+ * into a protected dir poisons later `cd "$OLDPWD"` references even
1122
+ * without explicit assignment.
1123
+ *
1124
+ * Static analysis cannot prove a name's runtime value, so the only safe
1125
+ * answer is to refuse on any ParamExp expansion. The `safeForIterVars`
1126
+ * carve-out remains because its safety property is structural — a for-
1127
+ * loop body's iterator variable is bounded to the literal Items list,
1128
+ * which we already statically check for protected paths.
1129
+ */
1130
+ function isParamExpKnownSafe(paramExp, safeForIterVars) {
1131
+ // Reject any expansion modifier — `${HOME:-junk}`, `${HOME#x}` etc.
1132
+ // Those can introduce attacker-controlled defaults / suffixes.
1133
+ const exp = paramExp['Exp'];
1134
+ if (exp !== null && exp !== undefined)
1135
+ return false;
1136
+ const slice = paramExp['Slice'];
1137
+ if (slice !== null && slice !== undefined)
1138
+ return false;
1139
+ const repl = paramExp['Repl'];
1140
+ if (repl !== null && repl !== undefined)
1141
+ return false;
1142
+ const index = paramExp['Index'];
1143
+ if (index !== null && index !== undefined)
1144
+ return false;
1145
+ if (paramExp['Length'] === true)
1146
+ return false;
1147
+ if (paramExp['Width'] === true)
1148
+ return false;
1149
+ if (paramExp['Excl'] === true)
1150
+ return false;
1151
+ // Name lookup.
1152
+ const paramNode = paramExp['Param'];
1153
+ if (!paramNode || typeof paramNode !== 'object')
1154
+ return false;
1155
+ const name = stringifyField(paramNode['Value']);
1156
+ if (name.length === 0)
1157
+ return false;
1158
+ // Round-15 P1: no env-var name is safe at static analysis; all are
1159
+ // runtime-rebindable via inline assignment-prefix or parent-shell
1160
+ // export. The set is intentionally empty — see the function docstring.
1161
+ const KNOWN_SAFE_VARS = new Set();
1162
+ if (KNOWN_SAFE_VARS.has(name))
1163
+ return true;
1164
+ if (safeForIterVars.has(name))
1165
+ return true;
1166
+ return false;
1167
+ }
1168
+ /**
1169
+ * A CmdSubst is known-safe iff its inner command is one of:
1170
+ * - `pwd` (no args, or only `-L` / `-P`).
1171
+ * - `git rev-parse` with one or more flags from a tight allow-list
1172
+ * that resolves to either an ABSOLUTE path or a path that steps
1173
+ * OUT of cwd; never INTO it.
1174
+ *
1175
+ * Anything else — even other read-only commands — is refused.
1176
+ * "Known-safe" here means *the printed value cannot land inside
1177
+ * `.rea/` / `.husky/` / `.claude/` / `.github/workflows/` at runtime
1178
+ * regardless of where cwd happens to be*. Misclassifying a command as
1179
+ * safe would create a real bypass; the allow-list stays small.
1180
+ *
1181
+ * **Round-15 P1 — `--show-prefix` removed.** `git rev-parse --show-prefix`
1182
+ * returns the cwd-relative path INSIDE the toplevel. When the agent's
1183
+ * cwd is `.rea/`, it returns `.rea/` (or `.husky/`, `.claude/`, etc.),
1184
+ * landing the cd inside a protected dir. The remaining flags are safe
1185
+ * because:
1186
+ * - `--show-toplevel` returns an absolute path to the repo root.
1187
+ * - `--show-cdup` returns a string of `../` segments that can only
1188
+ * step OUT of the current dir, never IN.
1189
+ * - `--show-superproject-working-tree` returns an absolute path to
1190
+ * the parent superproject (or empty when not in a submodule).
1191
+ * All three resolve to a location that does not depend on how deep cwd
1192
+ * is inside the worktree, so they cannot be steered into protected dirs.
1193
+ */
1194
+ function isCmdSubstKnownSafe(cmdSubst) {
1195
+ const stmts = asArray(cmdSubst['Stmts']);
1196
+ if (stmts.length !== 1)
1197
+ return false;
1198
+ const stmt = stmts[0];
1199
+ if (!stmt || typeof stmt !== 'object')
1200
+ return false;
1201
+ const cmd = stmt['Cmd'];
1202
+ if (!cmd || typeof cmd !== 'object')
1203
+ return false;
1204
+ if (nodeType(cmd) !== 'CallExpr')
1205
+ return false;
1206
+ const args = asArray(cmd['Args']);
1207
+ if (args.length === 0)
1208
+ return false;
1209
+ const argv = [];
1210
+ for (const arg of args) {
1211
+ if (typeof arg !== 'object' || arg === null)
1212
+ continue;
1213
+ const v = wordToString(arg);
1214
+ argv.push(v ?? { value: '', dynamic: true, position: { line: 0, col: 0 } });
1215
+ }
1216
+ // No env-prefix strip needed here — known-safe commands don't tolerate
1217
+ // foreign env (PWD=/etc pwd is technically still pwd, but adding env
1218
+ // is suspicious and we refuse it).
1219
+ if (argv[0] === undefined || argv[0].dynamic)
1220
+ return false;
1221
+ const head = normalizeCmdHead(argv[0].value);
1222
+ if (head === 'pwd') {
1223
+ // `pwd` with no args / only `-L` / `-P`. Reject anything else.
1224
+ for (let i = 1; i < argv.length; i += 1) {
1225
+ const a = argv[i];
1226
+ if (a === undefined)
1227
+ continue;
1228
+ if (a.dynamic)
1229
+ return false;
1230
+ if (a.value === '-L' || a.value === '-P')
1231
+ continue;
1232
+ return false;
1233
+ }
1234
+ return true;
1235
+ }
1236
+ if (head === 'git') {
1237
+ // `git rev-parse <flag>` — only flags that resolve to an absolute
1238
+ // path, or a path stepping OUT of cwd, qualify. `--show-prefix` is
1239
+ // intentionally NOT in the set: it returns cwd-relative-to-toplevel
1240
+ // and can be `.rea/` / `.husky/` / `.claude/` when the agent is
1241
+ // already inside a protected dir (round-15 P1 closure).
1242
+ // No other git subcommands qualify here because many are
1243
+ // attacker-influenceable through `.git/config` settings.
1244
+ if (argv.length < 3)
1245
+ return false;
1246
+ if (argv[1] === undefined || argv[1].dynamic || argv[1].value !== 'rev-parse')
1247
+ return false;
1248
+ const FLAGS = new Set([
1249
+ '--show-toplevel',
1250
+ '--show-cdup',
1251
+ '--show-superproject-working-tree',
1252
+ ]);
1253
+ // All remaining args must be in FLAGS (no arbitrary refspecs).
1254
+ for (let i = 2; i < argv.length; i += 1) {
1255
+ const a = argv[i];
1256
+ if (a === undefined)
1257
+ continue;
1258
+ if (a.dynamic)
1259
+ return false;
1260
+ if (!FLAGS.has(a.value))
1261
+ return false;
1262
+ }
1263
+ return true;
1264
+ }
1265
+ return false;
1266
+ }
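The allow-list logic can be sketched on a plain array of static strings, leaving out the AST unwrapping and the `normalizeCmdHead` step. The function name `isSafeCwdCommand` is hypothetical; the three flags are the ones listed in `FLAGS` above.

```javascript
// Hypothetical simplification: argv is already an array of static strings.
const SAFE_REV_PARSE_FLAGS = new Set([
  '--show-toplevel',
  '--show-cdup',
  '--show-superproject-working-tree',
]);

function isSafeCwdCommand(argv) {
  const [head, ...rest] = argv;
  if (head === 'pwd') {
    // pwd with no args, or only -L / -P.
    return rest.every((a) => a === '-L' || a === '-P');
  }
  if (head === 'git') {
    // Needs `rev-parse` plus at least one flag, all from the allow-list.
    if (argv.length < 3 || argv[1] !== 'rev-parse') return false;
    return argv.slice(2).every((a) => SAFE_REV_PARSE_FLAGS.has(a));
  }
  return false;
}

console.log(isSafeCwdCommand(['git', 'rev-parse', '--show-toplevel'])); // true
console.log(isSafeCwdCommand(['git', 'rev-parse', '--show-prefix'])); // false
```

Everything outside the two heads is rejected by default, which keeps the failure mode on the refusing side when a new command shape appears.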
239
1267
  /**
240
1268
  * Round 7 P0 — manual descent into ParamExp.Slice subtrees.
241
1269
  *
@@ -2590,7 +3618,41 @@ function detectEval(argv, out) {
2590
3618
  const inner = parts.join(' ');
2591
3619
  if (inner.length === 0)
2592
3620
  return;
2593
- const parsed = parseBashCommand(inner);
3621
+ // helix-024 F2: doubly-nested eval bypass. Pre-fix the inner parse
3622
+ // received the raw concatenation of argv tokens — for
3623
+ // `eval "eval \"echo x > .rea/HALT\""` the outer DQ-significant
3624
+ // escapes (`\"`) were preserved through wordToString as literal
3625
+ // backslash-quote pairs in the joined string. The inner parser then
3626
+ // treated them as literal characters glued onto the redirect target,
3627
+ // producing a corrupted path (`.rea/HALT\"`) that didn't match the
3628
+ // protected list. Mirrors the helix-022 #3 nested-shell fix: bash
3629
+ // collapses one level of DQ-escapes BEFORE the inner shell sees the
3630
+ // payload, so the static analyzer must do the same. detectNestedShell
3631
+ // applies unshellEscape; detectEval pre-fix did not — that asymmetry
3632
+ // is the bypass.
3633
+ //
3634
+ // After unshellEscape the inner parses correctly and walkForWrites
3635
+ // re-dispatches CallExpr → detectEval recursively. The recursion
3636
+ // bottoms out via the nested-shell depth model: walkForWrites on the
3637
+ // re-parsed file inherits CURRENT_NESTED_DEPTH (set non-zero only
3638
+ // when the parent dispatch came from detectNestedShell). For pure-
3639
+ // eval recursion we add a dedicated CURRENT_EVAL_DEPTH counter
3640
+ // mirroring NESTED_SHELL_DEPTH_CAP=8: every eval re-parse increments
3641
+ // it, and detectEval checks it on entry, emitting a refuse-on-
3642
+ // uncertainty detection once the cap is reached. This closes Finding 2
3643
+ // of helix-024 even for arbitrary eval-depth chains.
3644
+ if (CURRENT_EVAL_DEPTH >= EVAL_DEPTH_CAP) {
3645
+ out.push({
3646
+ path: '',
3647
+ form: 'redirect',
3648
+ position: argv[0]?.position ?? { line: 0, col: 0 },
3649
+ dynamic: true,
3650
+ originSrc: 'eval recursion exceeded depth cap — refusing on uncertainty',
3651
+ });
3652
+ return;
3653
+ }
3654
+ const innerSource = unshellEscape(inner);
3655
+ const parsed = parseBashCommand(innerSource);
2594
3656
  if (!parsed.ok) {
2595
3657
  out.push({
2596
3658
  path: '',
@@ -2601,7 +3663,14 @@ function detectEval(argv, out) {
2601
3663
  });
2602
3664
  return;
2603
3665
  }
2604
- const innerWrites = walkForWrites(parsed.file);
3666
+ CURRENT_EVAL_DEPTH += 1;
3667
+ let innerWrites;
3668
+ try {
3669
+ innerWrites = walkForWrites(parsed.file);
3670
+ }
3671
+ finally {
3672
+ CURRENT_EVAL_DEPTH -= 1;
3673
+ }
2605
3674
  for (const d of innerWrites) {
2606
3675
  out.push({
2607
3676
  ...d,
@@ -2610,6 +3679,21 @@ function detectEval(argv, out) {
2610
3679
  });
2611
3680
  }
2612
3681
  }
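The F2 comment in `detectEval` depends on collapsing one level of double-quote escapes before the inner parse. The package's real `unshellEscape` is not shown in this diff, so the function below is only an assumed stand-in (hypothetical name `collapseDqEscapes`). It follows bash's rule that, inside double quotes, a backslash is an escape only before `$`, a backtick, `"`, `\`, or a newline.

```javascript
// Assumed stand-in for one level of double-quote unescaping; the real
// unshellEscape may differ in detail.
function collapseDqEscapes(s) {
  // Single pass so that an escaped backslash is never re-read as the start
  // of a second escape. A backslash-newline pair is dropped entirely.
  return s.replace(/\\([$`"\\\n])/g, (_, c) => (c === '\n' ? '' : c));
}

// Payload as the outer eval's argument looks before unescaping:
const raw = 'eval \\"echo x > .rea/HALT\\"';
console.log(raw); // eval \"echo x > .rea/HALT\"
console.log(collapseDqEscapes(raw)); // eval "echo x > .rea/HALT"
```

After the collapse the redirect target parses as `.rea/HALT` rather than `.rea/HALT\"`, which is the shape the protected list can match.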
3682
+ /**
3683
+ * Eval recursion depth cap. helix-024 F2: pre-fix detectEval re-parsed
3684
+ * exactly one level — `eval "eval \"echo > .rea/HALT\""` bypassed
3685
+ * because the inner re-parsed string carried the outer DQ-escape (`\"`)
3686
+ * as literal characters in the redirect target, corrupting the path so
3687
+ * it didn't match the protected list. The fix is two-part:
3688
+ * 1. unshellEscape() the inner string before re-parsing (mirrors the
3689
+ * helix-022 #3 nested-shell fix).
3690
+ * 2. Cap eval recursion at 8 levels. Past the cap, emit a synthetic
3691
+ * dynamic detection so the scanner refuses on uncertainty. Mirrors
3692
+ * NESTED_SHELL_DEPTH_CAP — same shape, separate counter so eval
3693
+ * and bash -c don't exhaust each other's budget.
3694
+ */
3695
+ const EVAL_DEPTH_CAP = 8;
3696
+ let CURRENT_EVAL_DEPTH = 0;
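The counter-plus-`try`/`finally` pattern used around the recursive `walkForWrites` call can be shown in isolation. This is a toy sketch: `scan` is a hypothetical stand-in that peels one leading `eval ` per level instead of parsing bash, and `DEPTH_CAP` simply reuses the value 8 from `EVAL_DEPTH_CAP`.

```javascript
const DEPTH_CAP = 8; // same value as EVAL_DEPTH_CAP above
let depth = 0;

// Toy stand-in for detectEval: peels one leading "eval " per level.
function scan(command, out) {
  if (!command.startsWith('eval ')) {
    out.push({ kind: 'leaf', command });
    return;
  }
  if (depth >= DEPTH_CAP) {
    // At the cap: stop descending and record a refusal instead.
    out.push({ kind: 'refuse', reason: 'eval depth cap reached' });
    return;
  }
  depth += 1;
  try {
    scan(command.slice('eval '.length), out);
  } finally {
    depth -= 1; // restored even if the inner scan throws
  }
}

const nested = (levels) => 'eval '.repeat(levels) + 'echo hi';

const shallow = [];
scan(nested(8), shallow);
console.log(shallow[0].kind); // leaf

const deep = [];
scan(nested(9), deep);
console.log(deep[0].kind); // refuse
```

Decrementing in `finally` matters because the counter is module-level state: without it, one throwing re-parse would leave the depth elevated for every later command.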
2613
3697
  // ─────────────────────────────────────────────────────────────────────
2614
3698
  // Codex round 4 Finding 7: misc utilities (patch, sort, shuf, gpg,
2615
3699
  // split/csplit, trap)
@@ -5468,6 +6552,26 @@ function detectLn(argv, out) {
5468
6552
  // Codex round 1 F-7: -t target is a directory.
5469
6553
  isDirTarget: true,
5470
6554
  });
6555
+ // helix-024 F3: when -t is set, every positional is a SOURCE that
6556
+ // gets linked into the target dir. Each source whose path matches
6557
+ // a protected pattern is a write-through-symlink risk — emit the
6558
+ // synthetic detection on every positional so the scanner refuses
6559
+ // on protected sources.
6560
+ for (const src of positionals) {
6561
+ out.push({
6562
+ path: src.value,
6563
+ form: 'ln_to_protected_unresolvable',
6564
+ position: src.position,
6565
+ dynamic: src.dynamic,
6566
+ // Treat ln source as destructive-ancestry for the scanner
6567
+ // match — `ln -s .rea /tmp/x` aliases the .rea directory, and
6568
+ // a subsequent write through /tmp/x/HALT would hit .rea/HALT.
6569
+ // Without isDestructive the scanner would match `.rea` against
6570
+ // `.rea/HALT` only with dir-shape input; isDestructive enables
6571
+ // the protected-ancestry path so a bare-dir source still hits.
6572
+ isDestructive: true,
6573
+ });
6574
+ }
5471
6575
  return;
5472
6576
  }
5473
6577
  if (positionals.length >= 2) {
@@ -5480,6 +6584,35 @@ function detectLn(argv, out) {
5480
6584
  dynamic: dest.dynamic,
5481
6585
  });
5482
6586
  }
6587
+ // helix-024 F3: ln SRC DEST and ln SRC1 SRC2 ... DEST_DIR. Every
6588
+ // source argv (all positionals except the last) may be staged as
6589
+ // a link whose write goes through SRC. If SRC is a protected path
6590
+ // (e.g. `.rea/HALT`), the subsequent `echo > DEST` writes through
6591
+ // the link and the kill switch is bypassed. Walker emits a
6592
+ // synthetic `ln_to_protected_unresolvable` detection on every
6593
+ // source; the scanner refuses on uncertainty when SRC matches the
6594
+ // protected list. Non-protected sources fall through (NEG-5 /
6595
+ // `ln -s /tmp/a /tmp/b` and NEG-6 / `ln -s docs/file.md /tmp/link`
6596
+ // both ALLOW because the source side resolves outside-root or
6597
+ // doesn't match a protected pattern).
6598
+ for (let s = 0; s < positionals.length - 1; s += 1) {
6599
+ const src = positionals[s];
6600
+ if (src === undefined)
6601
+ continue;
6602
+ out.push({
6603
+ path: src.value,
6604
+ form: 'ln_to_protected_unresolvable',
6605
+ position: src.position,
6606
+ dynamic: src.dynamic,
6607
+ // Treat ln source as destructive-ancestry for the scanner
6608
+ // match — `ln -s .rea /tmp/x` aliases the .rea directory, and
6609
+ // a subsequent write through /tmp/x/HALT would hit .rea/HALT.
6610
+ // Without isDestructive the scanner would match `.rea` against
6611
+ // `.rea/HALT` only with dir-shape input; isDestructive enables
6612
+ // the protected-ancestry path so a bare-dir source still hits.
6613
+ isDestructive: true,
6614
+ });
6615
+ }
5483
6616
  }
5484
6617
  }
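Which positionals count as link sources differs between the two `ln` shapes handled above. A minimal sketch, assuming the positionals are already separated from flags (the helper name `lnSources` is hypothetical):

```javascript
// Hypothetical helper: pick the SOURCE operands of an ln invocation.
function lnSources(positionals, hasTargetDirFlag) {
  // ln -t DIR SRC...   : every positional is a source.
  if (hasTargetDirFlag) return positionals.slice();
  // ln SRC... DEST     : everything except the last positional is a source.
  if (positionals.length >= 2) return positionals.slice(0, -1);
  return [];
}

console.log(lnSources(['.rea/HALT', '/tmp/x'], false)); // [ '.rea/HALT' ]
console.log(lnSources(['.rea', '.husky'], true)); // [ '.rea', '.husky' ]
```

Each returned source would then get its own `ln_to_protected_unresolvable` detection, leaving the protected-versus-not decision to the scanner.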
5485
6618
  /**